Use a reader lock when reading the TfToPlatformGpuIdMap.

We perform a look up in this map every time a kernel is executed on a GPU. A contention profile shows non-trivial contention on the current exclusive lock, which can be avoided by switching to a reader lock.

PiperOrigin-RevId: 260820700
Author: Derek Murray <2019-07-30 16:32:09 -07:00>
Committed by: TensorFlower Gardener
Parent: 890f21cb3f
Commit: 25a15dda8e

@@ -59,7 +59,9 @@ class TfToPlatformGpuIdMap {
   bool Find(TfGpuId tf_gpu_id, PlatformGpuId* platform_gpu_id) const
       LOCKS_EXCLUDED(mu_) {
-    mutex_lock lock(mu_);
+    // TODO(mrry): Consider replacing this with an atomic `is_initialized` bit,
+    // to avoid writing to a shared cache line in the tf_shared_lock.
+    tf_shared_lock lock(mu_);
     auto result = id_map_.find(tf_gpu_id.value());
     if (result == id_map_.end()) return false;
     *platform_gpu_id = result->second;