STT-tensorflow/tensorflow/core/tpu
Tayo Oguntebi 6983bacea1 Enables per-host dummy args for TPUExecute (TF1) and adds XLA options.
Enabling this logic removes cross-worker send/recv dependencies required for TPUExecuteOp nodes to access a model's variables. This decreases overhead at the start of a training loop.

The approach used is to replace remote variable reads with zero tensors on each worker, except for the primary worker. The zero tensors feed TPUExecute nodes that are local to that worker. For large distributed systems with large variables, this removes the need for the initial Send/Recv variable broadcast, which can be expensive.

PiperOrigin-RevId: 351904109
Change-Id: I9f1ed63c2401f227646010a94a70c04f1c96cb7e
2021-01-14 17:03:51 -08:00
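
The idea in the commit message above can be illustrated with a toy model. This is only a sketch of the concept, not the code from the graph_rewrite pass: `Variable`, `ExecuteArgs`, `BuildExecuteArgs`, and the `use_dummy_args` flag are hypothetical names made up for illustration, and the sketch assumes all variables live on the primary worker.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a model variable: a name plus flat float data.
struct Variable {
  std::string name;
  std::vector<float> data;
};

// Hypothetical result: the tensors fed to a worker's local execute node,
// plus a count of how many of them required a cross-worker read.
struct ExecuteArgs {
  std::vector<std::vector<float>> tensors;
  std::size_t remote_reads = 0;
};

// Builds the inputs for one worker. Assumption: variables are stored on
// `primary_worker`, so any other worker would normally need a remote read.
ExecuteArgs BuildExecuteArgs(const std::vector<Variable>& vars, int worker,
                             int primary_worker, bool use_dummy_args) {
  assert(worker >= 0 && primary_worker >= 0);
  ExecuteArgs args;
  for (const Variable& v : vars) {
    if (worker == primary_worker) {
      // The primary worker reads the real value locally.
      args.tensors.push_back(v.data);
    } else if (use_dummy_args) {
      // Non-primary worker: feed a zero tensor of the same size instead
      // of fetching the variable from the primary worker.
      args.tensors.emplace_back(v.data.size(), 0.0f);
    } else {
      // Baseline: the value is copied from the primary worker, which
      // models the Send/Recv pair the commit wants to avoid.
      args.tensors.push_back(v.data);
      ++args.remote_reads;
    }
  }
  return args;
}
```

In this toy model, calling `BuildExecuteArgs(vars, 1, 0, true)` for a non-primary worker yields zero-filled tensors and no remote reads, while the same call with `use_dummy_args` set to `false` counts one remote read per variable.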
graph_rewrite Enables per-host dummy args for TPUExecute (TF1) and adds XLA options. 2021-01-14 17:03:51 -08:00
kernels Internal dependency cleanup. 2021-01-11 17:21:12 -08:00
ops Make sure TPUPartitionedInput shape inference doesn't crash if input handle shapes and types are not available. 2020-11-11 00:41:46 -08:00
BUILD Open source the XRTTpuDeviceAccessor and register XRTStateOps on TPU on the open source XRT side if it is a 1vm use case. 2vm code remain unchanged. 2021-01-13 14:00:49 -08:00
libtftpu.h Add argc and argv arguments to pass to tpu library on startup 2020-12-04 16:20:33 -08:00
tpu_api_dlsym_initializer_windows.cc Rename tpu_load_library to tpu_api_dlsym_initializer. 2020-06-22 17:45:58 -07:00
tpu_api_dlsym_initializer.cc Fix API initializer bug: request TPU library initialize itself during loading 2020-12-11 18:27:16 -08:00
tpu_api_dlsym_initializer.h [TPU 1VM] Consolidate all TPU ops related APIs into a single file 2020-10-23 18:09:32 -07:00
tpu_api_dlsym_set_fn.h Add a new parameter to turn on and off library initialization in TPU APIs 2020-09-28 13:56:40 -07:00
tpu_api.cc [TPU 1VM] Consolidate all TPU ops related APIs into a single file 2020-10-23 18:09:32 -07:00
tpu_api.h [TPU 1VM] Consolidate all TPU ops related APIs into a single file 2020-10-23 18:09:32 -07:00
tpu_compilation_device.cc Add some missing dependencies so that the TPU version of TensorFlow builds 2020-07-27 17:43:48 -07:00
tpu_compile_interface.cc TPU rewrite pass refactoring. 2020-07-14 14:45:58 -07:00
tpu_compile_interface.h Add error payload in status. 2021-01-07 00:09:29 -08:00
tpu_configuration.cc Add a global resource manager for TPU specific operations. 2020-05-19 17:38:56 -07:00
tpu_configuration.h Add a global resource manager for TPU specific operations. 2020-05-19 17:38:56 -07:00
tpu_defs.cc Introduce additional TPU infeed and outfeed ops 2020-08-10 01:05:31 -07:00
tpu_defs.h Introduce additional TPU infeed and outfeed ops 2020-08-10 01:05:31 -07:00
tpu_embedding_optimization_parameters_utils.cc Add a set of dynamic embedding optimizers directly taking an HloModule. 2020-09-29 18:51:01 -07:00
tpu_embedding_optimization_parameters_utils.h Frequency estimator implementation for TPU embedding. 2020-09-25 19:43:49 -07:00
tpu_embedding_output_layout_utils.cc Deprecated and removed uses of TPUEmbeddingOutputLayout proto and output_layout field in TPUEmbeddingConfiguration. 2020-09-30 14:37:01 -07:00
tpu_embedding_output_layout_utils.h Deprecated and removed uses of TPUEmbeddingOutputLayout proto and output_layout field in TPUEmbeddingConfiguration. 2020-09-30 14:37:01 -07:00
tpu_execute.cc Update Configure TPU, Wait For TPU, and Initialize TPU APIs to backward compatible API style 2020-11-13 11:34:22 -08:00
tpu_execute.h [TPU 1VM] Consolidate all TPU ops related APIs into a single file 2020-10-23 18:09:32 -07:00
tpu_executor_api.cc Don't attempt to register "TPU" platform if the underlying C API isn't initialized. 2020-09-21 18:04:07 -07:00
tpu_executor_api.h Don't attempt to register "TPU" platform if the underlying C API isn't initialized. 2020-09-21 18:04:07 -07:00
tpu_executor_dlsym_initializer_windows.cc Merge pull request #44731 from cloudhan:jax_winbuild 2020-11-10 19:44:10 -08:00
tpu_executor_dlsym_initializer.cc Initialize the TPU library when loading through tpu_executor_dlsym_initializer 2020-12-07 11:14:44 -08:00
tpu_executor_init_fns.inc Add TPU runtime version. 2021-01-14 13:29:47 -08:00
tpu_init_mode.cc Introduce some common constants for TPU. 2020-05-08 19:53:44 -07:00
tpu_init_mode.h Introduce some common constants for TPU. 2020-05-08 19:53:44 -07:00
tpu_library_init_fns.inc Add initial TpuTracer. 2020-12-14 10:11:39 -08:00
tpu_node_device_util.cc Add a few more utility functions for TPUs 2020-05-13 17:07:28 -07:00
tpu_node_device_util.h Add a few more utility functions for TPUs 2020-05-13 17:07:28 -07:00
tpu_node_device.cc Fix missing status check. 2020-07-21 18:07:46 -07:00
tpu_node_device.h Introduce tpu_node_device to represent the individual TPU cores 2020-07-09 18:15:58 -07:00
tpu_on_demand_compiler.cc Add TpuExecutable_FreeXlaShapeIndexArray and TpuExecutable_FreeMaybeOwningDeviceMemoryArray to free appropriate memory 2021-01-11 14:24:22 -08:00
tpu_ops_c_api.h Add initial TpuTracer. 2020-12-14 10:11:39 -08:00
tpu_system_device.cc Put a bunch of TPU classes in the tensorflow::tpu namespace. 2020-09-21 13:59:05 -07:00
tpu_system_device.h Put a bunch of TPU classes in the tensorflow::tpu namespace. 2020-09-21 13:59:05 -07:00
virtual_device.cc Add virtual_device to TPU library 2020-06-26 15:51:39 -07:00
virtual_device.h Add virtual_device to TPU library 2020-06-26 15:51:39 -07:00