eager | Fix an issue of out of order execution. For a multi-device function, don't send a packed input to the function device until all underlying remote handles are ready on remote devices. Otherwise, on a remote worker, a remote component function execution request could be enqueued before a request for producing a function input. | 2020-12-11 14:56:03 -08:00
rpc | Optimize calls to std::string::find() and friends for a single char. | 2020-12-17 17:48:51 -08:00
base_rendezvous_mgr.cc | Fix cancellation race condition in BaseRendezvousMgr::RegisterCall | 2020-06-19 13:04:53 -07:00
base_rendezvous_mgr.h | Fix cancellation race condition in BaseRendezvousMgr::RegisterCall | 2020-06-19 13:04:53 -07:00
BUILD | Support aborting RING communication in multi worker collectives | 2020-10-21 17:03:09 -07:00
call_options_test.cc | |
call_options.cc | A series of changes to significantly reduce the number of allocations | 2016-06-27 13:32:57 -07:00
call_options.h | Prefixing TensorFlow thread annotation macros with TF_. | 2020-03-05 08:42:01 -08:00
cancellable_call.cc | Support aborting RING communication in multi worker collectives | 2020-10-21 17:03:09 -07:00
cancellable_call.h | Support aborting RING communication in multi worker collectives | 2020-10-21 17:03:09 -07:00
cluster_function_library_runtime_test.cc | Pass in GrpcWorkerEnv when creating GrpcWorkerCache. | 2020-06-04 11:46:09 -07:00
cluster_function_library_runtime.cc | Change the function output type, either a Tensor for a local output or a TensorShape for a remote output, preparing for the support of function outputs placed on remote workers. | 2020-08-04 19:13:03 -07:00
cluster_function_library_runtime.h | Use the original output indices when adding a component function output to RemoteMgr. | 2020-08-19 14:41:05 -07:00
collective_param_resolver_distributed_test.cc | Set a timeout to check health RPC | 2020-10-21 13:02:25 -07:00
collective_param_resolver_distributed.cc | Support aborting param resolution in multi worker collectives | 2020-10-21 18:40:58 -07:00
collective_param_resolver_distributed.h | Support aborting param resolution in multi worker collectives | 2020-10-21 18:40:58 -07:00
collective_rma_distributed_test.cc | Set a timeout to check health RPC | 2020-10-21 13:02:25 -07:00
collective_rma_distributed.cc | Support aborting RING communication in multi worker collectives | 2020-10-21 17:03:09 -07:00
collective_rma_distributed.h | Support aborting RING communication in multi worker collectives | 2020-10-21 17:03:09 -07:00
device_resolver_distributed_test.cc | Use device attributes from group resolution | 2020-09-09 10:53:49 -07:00
device_resolver_distributed.cc | Use device attributes from group resolution | 2020-09-09 10:53:49 -07:00
device_resolver_distributed.h | Use device attributes from group resolution | 2020-09-09 10:53:49 -07:00
graph_mgr.cc | [TF2XLA] Remove the serialization of CustomKernelCreator, since there is only one, and we won't add new ones | 2020-09-22 10:59:59 -07:00
graph_mgr.h | When calling connect_to_cluser, if the options are identical and there is no renaming of local device, reuse existing local DeviceManager, otherwise we keep the old DeviceManager around to allow the old Tensor created to be usable. | 2020-05-20 08:53:52 -07:00
local_master.cc | Add remote session support for the MakeCallable API. | 2018-04-06 18:18:06 -07:00
local_master.h | Address compiler warnings in tensorflow/core/distributed_runtime. | 2018-06-05 08:23:35 -07:00
master_env.h | Fix two memory leaks and enable asan for C API remote tests. | 2020-07-17 10:17:20 -07:00
master_interface.h | |
master_session.cc | fix typos in core directory | 2020-10-29 02:52:55 +03:00
master_session.h | Prefixing TensorFlow thread annotation macros with TF_. | 2020-03-05 08:42:01 -08:00
master_test.cc | Fix the call to NewHostPortGrpcChannel in distributed_runtime/master_test | 2019-09-23 15:03:29 -07:00
master.cc | Prefixing TensorFlow thread annotation macros with TF_. | 2020-03-05 08:42:01 -08:00
master.h | Prefixing TensorFlow thread annotation macros with TF_. | 2020-03-05 08:42:01 -08:00
message_wrappers_test.cc | |
message_wrappers.cc | Support TensorProtos as Operation inputs, in order to support remote inputs passed as Tensors to EagerClusterFunctionLibraryRuntime::Run. | 2020-03-09 17:10:38 -07:00
message_wrappers.h | Support TensorProtos as Operation inputs, in order to support remote inputs passed as Tensors to EagerClusterFunctionLibraryRuntime::Run. | 2020-03-09 17:10:38 -07:00
partial_run_mgr_test.cc | |
partial_run_mgr.cc | |
partial_run_mgr.h | Prefixing TensorFlow thread annotation macros with TF_. | 2020-03-05 08:42:01 -08:00
README.md | Fix how-to reference in distributed runtime README (#9772) | 2017-05-12 06:35:31 -07:00
recent_request_ids_test.cc | Internal tests cleanup. | 2020-10-27 13:24:35 -07:00
recent_request_ids.cc | |
recent_request_ids.h | Prefixing TensorFlow thread annotation macros with TF_. | 2020-03-05 08:42:01 -08:00
remote_device_test.cc | Pass in GrpcWorkerEnv when creating GrpcWorkerCache. | 2020-06-04 11:46:09 -07:00
remote_device.cc | replace PFLR DeviceGetContext hardcode with Device::IsRemoteCallAllowed | 2020-10-22 20:49:03 +02:00
remote_device.h | The remote device manager in WorkerSession contains only RemoteDevice instance which has device->IsLocal() == false even if the device is on the local host. This patch ensures that device->IsLocal() should return true if and only if this device is on the local host. | 2019-08-20 13:50:02 -07:00
rendezvous_mgr_interface.h | [Cleanup] Remove unused method RendezvousMgrInterface::CleanupAll(). | 2020-04-13 11:26:06 -07:00
request_id_test.cc | Reject retried RecvTensor requests. | 2018-01-22 17:30:59 -08:00
request_id.cc | |
request_id.h | |
rpc_collective_executor_mgr_test.cc | Make NcclManager part of CollectiveExecutorMgr | 2020-09-17 14:35:16 -07:00
rpc_collective_executor_mgr.cc | Make NcclManager part of CollectiveExecutorMgr | 2020-09-17 14:35:16 -07:00
rpc_collective_executor_mgr.h | Make NcclManager part of CollectiveExecutorMgr | 2020-09-17 14:35:16 -07:00
rpcbench_test.cc | Internal tests cleanup. | 2020-10-27 13:24:35 -07:00
scheduler.cc | |
scheduler.h | |
server_lib_test.cc | When calling connect_to_cluser, if the options are identical and there is no renaming of local device, reuse existing local DeviceManager, otherwise we keep the old DeviceManager around to allow the old Tensor created to be usable. | 2020-05-20 08:53:52 -07:00
server_lib.cc | When calling connect_to_cluser, if the options are identical and there is no renaming of local device, reuse existing local DeviceManager, otherwise we keep the old DeviceManager around to allow the old Tensor created to be usable. | 2020-05-20 08:53:52 -07:00
server_lib.h | When calling connect_to_cluser, if the options are identical and there is no renaming of local device, reuse existing local DeviceManager, otherwise we keep the old DeviceManager around to allow the old Tensor created to be usable. | 2020-05-20 08:53:52 -07:00
session_mgr_test.cc | Garbage collect old WorkerSession when the restarted master task create new one. | 2020-08-03 11:31:26 -07:00
session_mgr.cc | Garbage collect old WorkerSession when the restarted master task create new one. | 2020-08-03 11:31:26 -07:00
session_mgr.h | Garbage collect old WorkerSession when the restarted master task create new one. | 2020-08-03 11:31:26 -07:00
tensor_coding_test.cc | Internal tests cleanup. | 2020-10-27 13:24:35 -07:00
tensor_coding.cc | |
tensor_coding.h | New Timestamped BFCAllocator and GPUKernelTracker. | 2019-02-06 11:01:38 -08:00
test_utils.h | Set a timeout to check health RPC | 2020-10-21 13:02:25 -07:00
worker_cache_logger.cc | |
worker_cache_logger.h | Prefixing TensorFlow thread annotation macros with TF_. | 2020-03-05 08:42:01 -08:00
worker_cache_partial.cc | |
worker_cache_partial.h | Prefixing TensorFlow thread annotation macros with TF_. | 2020-03-05 08:42:01 -08:00
worker_cache_wrapper.h | Fix compiler warnings in worker_cache_wrapper.h. | 2019-07-31 09:15:51 -07:00
worker_cache.h | |
worker_env.h | Fix two memory leaks and enable asan for C API remote tests. | 2020-07-17 10:17:20 -07:00
worker_interface.h | Set a timeout to check health RPC | 2020-10-21 13:02:25 -07:00
worker_session.cc | When calling connect_to_cluser, if the options are identical and there is no renaming of local device, reuse existing local DeviceManager, otherwise we keep the old DeviceManager around to allow the old Tensor created to be usable. | 2020-05-20 08:53:52 -07:00
worker_session.h | When calling connect_to_cluser, if the options are identical and there is no renaming of local device, reuse existing local DeviceManager, otherwise we keep the old DeviceManager around to allow the old Tensor created to be usable. | 2020-05-20 08:53:52 -07:00
worker.cc | Set a timeout to check health RPC | 2020-10-21 13:02:25 -07:00
worker.h | Set a timeout to check health RPC | 2020-10-21 13:02:25 -07:00