address comments on commit d628b34f78f291692c8b4ef0da54f1b91f24291d

Add XLA-only merge that can merge all types. This prevents insertion of H2D and D2H copies when XLA-GPU clusters have int32 outputs. This merge is only used to merge the outputs from the _XlaRun and the PartitionedCall nodes.
2019-10-09 09:19:42 -07:00 · b3ced407a8
commit b3ced407a8
parent d628b34f78
1 changed file with 5 additions and 4 deletions
--- a/tensorflow/compiler/jit/ops/xla_ops.cc
+++ b/tensorflow/compiler/jit/ops/xla_ops.cc
@@ -107,10 +107,11 @@ REGISTER_OP("_XlaMerge")
    .Doc(R"(XLA Merge Op. For use by the XLA JIT only.

 Merges the outputs from the PartitionedCall node and the _XlaRun node.
-Unlike the TensorFlow merge op, _XlaMerge supports merging inputs of all types.
-This prevents the need for copy operations, in particluar when an XLA cluster
-has int32 outputs. The _XlaMerge up does not have a value_index output that
-identifies the chosen input.
+Unlike the TensorFlow Merge op, which requires inputs of some types to be
+placed on the host, the _XlaMerge op can merge inputs of all types when
+placed on the device. This prevents the need for copy operations, in
+particluar when an XLA cluster has int32 outputs. The _XlaMerge up does not
+have a value_index output that identifies the chosen input.
 )");

 }  // namespace tensorflow
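
The doc string above contrasts the two ops by their outputs: the TensorFlow Merge op forwards an available input and reports its position as value_index, while _XlaMerge forwards the value only. A minimal, standalone sketch of that difference follows. It is a toy model, not TensorFlow code: `Input`, `MergeWithIndex`, and `XlaMergeSketch` are hypothetical names, a dead input is modelled as `std::nullopt`, and preferring the first input when both are live is an arbitrary choice made for the sketch. It needs C++17 for `std::optional`.

```cpp
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Toy model only: a "live" input holds a value, a "dead" input is std::nullopt.
// These names are hypothetical and are not TensorFlow APIs.
template <typename T>
using Input = std::optional<T>;

// Models the TensorFlow Merge op: forwards the first available input and
// also reports its position in `inputs` as value_index.
template <typename T>
std::optional<std::pair<T, std::size_t>> MergeWithIndex(
    const std::vector<Input<T>>& inputs) {
  for (std::size_t i = 0; i < inputs.size(); ++i) {
    if (inputs[i].has_value()) return std::make_pair(*inputs[i], i);
  }
  return std::nullopt;  // No input was available.
}

// Models _XlaMerge as the doc string describes it: one input from the
// PartitionedCall node, one from the _XlaRun node, and a single output
// with no value_index.
template <typename T>
std::optional<T> XlaMergeSketch(const Input<T>& partitioned_call_output,
                                const Input<T>& xla_run_output) {
  // Assumption of this sketch: prefer the first input if both are live.
  if (partitioned_call_output.has_value()) return partitioned_call_output;
  return xla_run_output;
}
```

With inputs `{std::nullopt, 7}`, `MergeWithIndex<int>` yields the value 7 together with index 1, whereas `XlaMergeSketch<int>(std::nullopt, 7)` yields only the value 7. The sketch says nothing about device placement, which is the part of the change that avoids the H2D and D2H copies.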