STT-tensorflow/tensorflow/python/distribute/cluster_resolver
Christian Sigg 4838793e12 Comment out a number of google-internal targets when copybara-exporting instead of removing them.
PiperOrigin-RevId: 353848826
Change-Id: I0801c0e713a0c63597deb5aed31c8bdb37999c6a
2021-01-26 05:47:31 -08:00
tpu PY2 removal cleanup 2021-01-15 16:48:57 -08:00
__init__.py
BUILD Comment out a number of google-internal targets when copybara-exporting instead of removing them. 2021-01-26 05:47:31 -08:00
cluster_resolver_test.py
cluster_resolver.py
gce_cluster_resolver_test.py Move away from deprecated asserts 2020-06-30 16:10:22 -07:00
gce_cluster_resolver.py
kubernetes_cluster_resolver_test.py Move away from deprecated asserts 2020-06-30 16:10:22 -07:00
kubernetes_cluster_resolver.py
README_Slurm.md
README.md
sagemaker_cluster_resolver_test.py Merge pull request from sboshin:sagemaker_resolver 2020-09-17 20:22:03 -07:00
sagemaker_cluster_resolver.py Merge pull request from sboshin:sagemaker_resolver 2020-09-17 20:22:03 -07:00
slurm_cluster_resolver_test.py
slurm_cluster_resolver.py
tfconfig_cluster_resolver_test.py
tfconfig_cluster_resolver.py
tpu_cluster_resolver.py

Cluster Resolvers

Cluster Resolvers are a new way of specifying cluster information for distributed execution. Built on top of the existing ClusterSpec framework, they let users specify a configuration and a cluster management service; a ClusterResolver then automatically fetches the relevant information from that service and populates ClusterSpecs.
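The resolver pattern described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the TensorFlow implementation: the class name `EnvClusterResolver` and its `task()` method are hypothetical, the host names are made up, and the "service" is simply a `TF_CONFIG`-style JSON string in an environment mapping. The real resolvers live in the files listed above and return `ClusterSpec` objects rather than plain dicts.

```python
import json
import os


class EnvClusterResolver:
    """Hypothetical resolver: reads cluster info from a TF_CONFIG-style JSON string."""

    def __init__(self, environ=None):
        # Accept an explicit mapping so the sketch can run without touching os.environ.
        self._environ = os.environ if environ is None else environ

    def _config(self):
        # An absent variable is treated as an empty configuration.
        return json.loads(self._environ.get("TF_CONFIG", "{}"))

    def cluster_spec(self):
        # Plain dict standing in for a ClusterSpec: job name -> list of "host:port".
        return self._config().get("cluster", {})

    def task(self):
        # (job name, index) of the current process, or (None, None) if unset.
        task = self._config().get("task", {})
        return task.get("type"), task.get("index")


# Made-up cluster description, for illustration only.
example_env = {
    "TF_CONFIG": json.dumps({
        "cluster": {
            "worker": ["host1:2222", "host2:2222"],
            "ps": ["host3:2222"],
        },
        "task": {"type": "worker", "index": 1},
    })
}

resolver = EnvClusterResolver(example_env)
spec = resolver.cluster_spec()
task_type, task_index = resolver.task()
own_address = spec[task_type][task_index]
```

The point of the design is that calling code only asks the resolver for a cluster description; where that description comes from (an environment variable here, a cluster management service in the real resolvers) stays hidden behind the same interface.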

ClusterResolvers are designed to work well with MonitoredTrainingSession and ClusterSpec propagation so that distributed training sessions remain robust in the face of node and network failures.