cluster_resolver
Remove deprecated tfrt_enabled test target flag.
2020-10-22 12:55:06 -07:00
coordinator
PSv2: Privatize ClusterCoordinator.cluster attribute since Cluster is not meant for public use.
2020-12-18 02:56:32 -08:00
experimental
integration_test
Raise meaningful error message when loading a ShardedVariable.
2020-12-21 15:59:59 -08:00
parallel_device
Parallel device: make tf.cond work when executing eagerly
2020-12-04 13:19:43 -08:00
v1
tf.distribute: Move old/unused all_reduce util to v1/
2020-11-23 19:46:50 -08:00
BUILD
Raise meaningful error message when loading a ShardedVariable.
2020-12-21 15:59:59 -08:00
central_storage_strategy.py
Merge pull request #38968 from kushanam:distribute_dali_ctl
2020-10-19 09:25:22 -07:00
checkpoint_utils_test.py
Small adjustments on import spacing.
2019-12-18 20:32:12 -08:00
checkpointing_test.py
Add callable wrapper to CheckpointValueInitializer so that we can delay the variable restore until after variable creation scopes have been called.
2020-09-01 15:42:47 -07:00
collective_all_reduce_strategy_test.py
Remove enable collective ops tests
2020-11-24 14:27:20 -08:00
collective_all_reduce_strategy.py
Set a timeout to check health RPC
2020-10-21 13:02:25 -07:00
collective_util_test.py
Fix constructor of CommunicationOptions
2020-11-11 13:17:33 -08:00
collective_util.py
Fix constructor of CommunicationOptions
2020-11-11 13:17:33 -08:00
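The CommunicationOptions constructor fixed above is part of the public collective-communication API; a minimal sketch of constructing it and handing it to a strategy, assuming the TF 2.4+ exports (the specific pack size and timeout values here are illustrative only):

```python
import tensorflow as tf

# Illustrative settings: pack small tensors into ~32 MB buckets, use NCCL,
# and fail a collective that hangs for more than five minutes.
options = tf.distribute.experimental.CommunicationOptions(
    bytes_per_pack=32 * 1024 * 1024,
    timeout_seconds=300.0,
    implementation=tf.distribute.experimental.CommunicationImplementation.NCCL)

strategy = tf.distribute.MultiWorkerMirroredStrategy(
    communication_options=options)
```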
combinations_test.py
Use the same worker pool for tests that require the same number of workers
2020-10-26 14:31:49 -07:00
combinations.py
Use the same worker pool for tests that require the same number of workers
2020-10-26 14:31:49 -07:00
cross_device_ops_test.py
Enable NCCL for all all-reduces
2020-11-24 14:55:12 -08:00
cross_device_ops.py
Condition whether to use NCCL for all collectives on the launcher
2020-11-23 17:31:58 -08:00
cross_device_utils_test.py
Refactor collective utils to be of one replica
2020-10-13 20:02:22 -07:00
cross_device_utils.py
Enable NCCL for all all-reduces
2020-11-24 14:55:12 -08:00
custom_training_loop_gradient_test.py
Support Google-internal TPU resolution in strategy combinations.
2020-05-27 14:29:14 -07:00
custom_training_loop_input_test.py
Rename "experimental_distribute_datasets_from_function" to "distribute_datasets_from_function".
2020-09-23 18:15:32 -07:00
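The rename noted above concerns the per-worker dataset factory entry point; a short sketch of the renamed API, assuming the TF 2.4+ spelling strategy.distribute_datasets_from_function (the dataset contents are placeholders):

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
GLOBAL_BATCH_SIZE = 64

def dataset_fn(input_context):
  # Each input pipeline gets its own shard and per-replica batch size.
  batch_size = input_context.get_per_replica_batch_size(GLOBAL_BATCH_SIZE)
  ds = tf.data.Dataset.range(1024).batch(batch_size)
  return ds.shard(input_context.num_input_pipelines,
                  input_context.input_pipeline_id)

dist_dataset = strategy.distribute_datasets_from_function(dataset_fn)
```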
device_util_test.py
Try to deduce job, replica and task from config.list_logical_devices() again
2020-06-16 15:22:24 -07:00
device_util.py
Use __slots__ for small classes
2020-06-28 18:41:22 +02:00
distribute_config.py
distribute_coordinator_context.py
Use distribution strategy to configure distribute coordinator.
2018-08-16 12:28:46 -07:00
distribute_coordinator_test.py
minor spelling tweaks
2020-02-11 15:09:21 +09:00
distribute_coordinator.py
Distribute Coordinator currently assumes TF_CONFIG is the only way to configure a strategy. We now allow cluster resolvers to be passed as arguments when instantiating the strategy; if the user sets one, it is used instead of TF_CONFIG.
2020-03-16 12:03:17 -07:00
distribute_lib_test.py
Graduate experimental_hints to options in all_reduce/reduce/batch_reduce
2020-10-16 11:54:24 -07:00
distribute_lib.py
Revise docstring for strategy.scope to hit a few key points:
2020-12-02 05:21:09 -08:00
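strategy.scope is the API the docstring revision above targets; a minimal sketch of the usual pattern, in which variables created under the scope become distributed variables (the layer and optimizer are just examples):

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
  # Variables created here (e.g. layer weights and the optimizer's slot
  # variables) are mirrored across the strategy's devices.
  dense = tf.keras.layers.Dense(units=1)
  optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
```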
distribute_utils_test.py
Get namedtuple _make method from instance instead of class.
2020-08-10 09:10:33 -07:00
distribute_utils.py
Install _distributed_container only at variable creation
2020-09-16 00:17:33 -07:00
distribution_strategy_context.py
Generate replica_id tensor at call time
2020-07-27 19:21:33 -07:00
estimator_training.py
minor spelling tweaks
2020-02-11 15:09:21 +09:00
input_lib_test.py
Always enable get_next_as_optional unless the dataset is finite.
2020-11-11 17:48:40 -08:00
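The get_next_as_optional behaviour mentioned above matters when looping over a distributed iterator whose per-worker data may run out at different steps; a hedged sketch of that loop, assuming the TF 2.4 DistributedIterator.get_next_as_optional method:

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
dist_dataset = strategy.experimental_distribute_dataset(
    tf.data.Dataset.range(10).batch(2))
iterator = iter(dist_dataset)

@tf.function
def train_step(batch):
  return tf.reduce_sum(batch)

# Keep stepping until the distributed iterator reports no more data.
optional = iterator.get_next_as_optional()
while optional.has_value():
  strategy.run(train_step, args=(optional.get_value(),))
  optional = iterator.get_next_as_optional()
```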
input_lib_type_spec_test.py
Merge pull request #44632 from kushanam:keras_distribute_lib
2020-11-12 16:52:16 -08:00
input_lib.py
Merge pull request #44632 from kushanam:keras_distribute_lib
2020-11-12 16:52:16 -08:00
input_ops_test.py
input_ops.py
[tf.data + tf.distribute] Use RebatchDataset instead of LegacyRebatchDataset in distribution strategies when global batch size can be statically determined.
2020-09-30 12:18:30 -07:00
metrics_v1_test.py
Move to using 'initializer' from 'initialize' to be more consistent with the tf.data APIs.
2020-01-15 13:20:27 -08:00
mirrored_run.py
Return the correct replica id within a sync group for MWMS. Currently we return the local replica id within a worker as opposed to within a sync group.
2020-10-09 13:20:06 -07:00
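The fix above concerns replica_id_in_sync_group, which should identify a replica globally across the sync group rather than locally within a worker; a minimal sketch of reading it from the replica context:

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()

@tf.function
def replica_fn():
  ctx = tf.distribute.get_replica_context()
  # With multi-worker strategies this id is global across the sync group,
  # not just local to the current worker.
  return ctx.replica_id_in_sync_group

per_replica_ids = strategy.run(replica_fn)
```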
mirrored_strategy_test.py
Retire MultiWorkerAllReduce
2020-08-27 00:12:37 -07:00
mirrored_strategy.py
Turn on VariablePolicy for MirroredStrategy.
2020-10-29 14:41:21 -07:00
mirrored_variable_test.py
Use utility to identify OnWrite and OnRead synchronized variables.
2020-07-27 14:14:19 -07:00
moving_averages_test.py
Set 2 virtual cpus and 2 virtual gpus by default for test cases.
2020-11-03 16:57:08 -08:00
multi_process_lib.py
Update multi_process_lib to handle file path for OSS keras build/test.
2020-12-07 15:19:40 -08:00
multi_process_runner_no_init_test.py
TF Internal API: tf_export a few distribute-related symbols:
2020-10-07 14:38:53 -07:00
multi_process_runner_test.py
Re-enable multi process pool runner tests
2020-10-26 11:58:41 -07:00
multi_process_runner.py
MultiProcessPoolRunner: Comment correction as we're no longer using atexit. Upon testing it seems we don't need _shutdown_all_pool_runners at the end of _pool_runner_worker either now.
2020-10-26 14:02:51 -07:00
multi_worker_continuous_run_test.py
MultiProcessRunner: symbol replacement: barrier->get_barrier
2020-10-07 10:51:25 -07:00
multi_worker_test_base_test.py
Use MPR for fault tolerance test
2020-08-21 00:08:42 -07:00
multi_worker_test_base.py
Enable cluster_coordinator_test on OSS.
2020-11-20 12:22:45 -08:00
multi_worker_util_test.py
Move away from deprecated asserts
2020-06-30 16:10:22 -07:00
multi_worker_util.py
PSv2: Check that there is no more than one chief, and at least one ps/worker. Combine the validation logic with multi_worker_util.
2020-11-10 18:37:30 -08:00
numpy_dataset_test.py
Add tf.distribute.Strategy.experimental_make_numpy_iterator() function.
2019-01-09 14:10:49 -08:00
numpy_dataset.py
one_device_strategy_test.py
Add InputOption support to all remaining strategies.
2020-06-24 16:20:39 -07:00
one_device_strategy.py
Merge pull request #38968 from kushanam:distribute_dali_ctl
2020-10-19 09:25:22 -07:00
packed_distributed_variable_test.py
Support packed variable in DistributedVariable. Add an option to enable packed variable in TPUStrategy.
2020-06-18 20:12:02 -07:00
packed_distributed_variable.py
Return the primary handle when it's in graph mode and not under a tpu context.
2020-12-15 22:10:51 -08:00
parameter_server_strategy_test.py
fix typos in python directory
2020-10-29 16:21:24 +03:00
parameter_server_strategy_v2_test.py
PSv2: Add checks that ParameterServerStrategy's run, reduce, experimental_distribute_dataset, and distribute_datasets_from_function are used with a ClusterCoordinator, and that run and reduce need to be used within a function that is used with schedule.
2020-11-25 12:30:24 -08:00
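The checks described above pair ParameterServerStrategy's run/reduce with a ClusterCoordinator and with functions passed to schedule; a hedged sketch of that pairing, assuming the TF 2.4 experimental exports (the cluster setup is elided and a real job would define TF_CONFIG):

```python
import tensorflow as tf

# Illustrative cluster setup; a real deployment would provide TF_CONFIG.
resolver = tf.distribute.cluster_resolver.TFConfigClusterResolver()
strategy = tf.distribute.experimental.ParameterServerStrategy(resolver)
coordinator = tf.distribute.experimental.coordinator.ClusterCoordinator(strategy)

per_worker_dataset = coordinator.create_per_worker_dataset(
    lambda: tf.data.Dataset.range(64).batch(8))
per_worker_iter = iter(per_worker_dataset)

@tf.function
def step_fn(iterator):
  def replica_fn(batch):
    return tf.reduce_sum(batch)
  # strategy.run and strategy.reduce must live inside a scheduled function.
  per_replica = strategy.run(replica_fn, args=(next(iterator),))
  return strategy.reduce(tf.distribute.ReduceOp.SUM, per_replica, axis=None)

result = coordinator.schedule(step_fn, args=(per_worker_iter,))
coordinator.join()
```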
parameter_server_strategy_v2.py
Raise meaningful error message when loading a ShardedVariable.
2020-12-21 15:59:59 -08:00
parameter_server_strategy.py
PSv2: Dedup the legacy ParameterServerStrategy class (as the estimator usage of it uses ParameterServerStrategyV1).
2020-10-21 12:16:22 -07:00
ps_values_test.py
Replace usages of Tensorflow DistributionStrategy method experimental_run_v2 with run.
2020-06-29 11:22:53 -07:00
ps_values.py
[TF DistStrat] Add support for deepcopy on AggregatingVariable (PS)
2020-08-19 08:57:16 -07:00
README.md
Graduate TPUStrategy from experimental.
2020-06-20 13:10:50 -07:00
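The graduation noted above moves TPUStrategy from tf.distribute.experimental.TPUStrategy to tf.distribute.TPUStrategy; a minimal connection sketch (the TPU address is a placeholder discovered from the environment):

```python
import tensorflow as tf

# '' asks the resolver to discover the TPU from the environment; a real job
# may pass an explicit name or grpc address instead.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='')
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)

strategy = tf.distribute.TPUStrategy(resolver)
```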
reduce_util.py
remote_mirrored_strategy_eager_test.py
Support LogicalDevice in MirroredStrategy config
2019-11-13 15:19:24 -08:00
sharded_variable_test.py
Raise meaningful error message when loading a ShardedVariable.
2020-12-21 15:59:59 -08:00
sharded_variable.py
Raise meaningful error message when loading a ShardedVariable.
2020-12-21 15:59:59 -08:00
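ShardedVariables arise when ParameterServerStrategy is given a variable_partitioner; a hedged sketch of how one is created (the partitioner export path under tf.distribute.experimental.partitioners is an assumption and may differ by TF version):

```python
import tensorflow as tf

# Split each eligible variable into two shards; the partitioner class path
# here is assumed and may vary across TF releases.
partitioner = tf.distribute.experimental.partitioners.FixedShardsPartitioner(
    num_shards=2)

resolver = tf.distribute.cluster_resolver.TFConfigClusterResolver()
strategy = tf.distribute.experimental.ParameterServerStrategy(
    resolver, variable_partitioner=partitioner)

with strategy.scope():
  # Large variables created in scope become ShardedVariables, with one
  # component variable per shard placed on the parameter servers.
  embeddings = tf.Variable(tf.random.normal([1024, 64]))
```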
shared_variable_creator_test.py
Move away from deprecated asserts
2020-06-30 16:10:22 -07:00
shared_variable_creator.py
single_loss_example.py
Update minimize_loss_test to not rely on Keras.
2020-07-07 21:39:06 -07:00
step_fn.py
strategy_combinations_test.py
Create different strategy based on TF1/2 in strategy_combinations
2020-10-09 17:02:10 -07:00
strategy_combinations.py
Only call initialize_tpu_system once per process.
2020-10-28 17:31:50 -07:00
strategy_common_test.py
Split strategy_common_test into two pieces as this test is currently timing out.
2020-10-13 10:11:48 -07:00
strategy_gather_test.py
Fix and test all_gather gradient.
2020-10-21 03:47:11 -07:00
strategy_test_lib.py
Remove numpy_datasets from V2 strategies
2020-10-12 14:30:17 -07:00
summary_op_util.py
test_util_test.py
Order NCCL all-reduce with ordering token
2020-11-11 11:18:30 -08:00
test_util.py
Order NCCL all-reduce with ordering token
2020-11-11 11:18:30 -08:00
tf_function_test.py
Always retrace in tf.saved_model.save
2020-10-10 12:18:19 -07:00
tpu_strategy_compilation_test.py
Pass a non-empty serialized MLIR module string when constructing TpuCompilationCacheKey.
2020-07-24 16:40:48 -07:00
tpu_strategy_test.py
Return the primary handle when it's in graph mode and not under a tpu context.
2020-12-15 22:10:51 -08:00
tpu_strategy.py
fix typos in python directory
2020-10-29 16:21:24 +03:00
tpu_values.py
Return the primary handle when it's in graph mode and not under a tpu context.
2020-12-15 22:10:51 -08:00
values_test.py
Disallow saving if the function cannot be used for inference
2020-10-15 21:08:51 -07:00
values_util.py
Disallow saving if the function cannot be used for inference
2020-10-15 21:08:51 -07:00
values.py
Turn on VariablePolicy for MirroredStrategy.
2020-10-29 14:41:21 -07:00
vars_test.py
Add test_util.main() and test_util.set_logical_devices_to_at_least()
2020-10-06 16:30:51 -07:00
warm_starting_util_test.py
Small adjustments on import spacing.
2019-12-18 20:32:12 -08:00
zero_batch_test.py
Fix input size used for batch normalization.
2020-04-09 22:01:21 -07:00