address comments on commit d628b34f78f291692c8b4ef0da54f1b91f24291d

Add XLA-only merge that can merge all types.

This prevents insertion of H2D and D2H copies when XLA-GPU clusters
have int32 outputs. This merge is only used to merge the outputs
from the XlaRun and the PartitionedCall nodes.
Bas Aarts 2019-10-09 09:19:42 -07:00
parent d628b34f78
commit b3ced407a8


@@ -107,10 +107,11 @@ REGISTER_OP("_XlaMerge")
 .Doc(R"(XLA Merge Op. For use by the XLA JIT only.
 Merges the outputs from the PartitionedCall node and the _XlaRun node.
-Unlike the TensorFlow merge op, _XlaMerge supports merging inputs of all types.
-This prevents the need for copy operations, in particular when an XLA cluster
-has int32 outputs. The _XlaMerge op does not have a value_index output that
-identifies the chosen input.
+Unlike the TensorFlow Merge op, which requires inputs of some types to be
+placed on the host, the _XlaMerge op can merge inputs of all types when
+placed on the device. This prevents the need for copy operations, in
+particular when an XLA cluster has int32 outputs. The _XlaMerge op does not
+have a value_index output that identifies the chosen input.
 )");
 } // namespace tensorflow
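
The merge behaviour described in the op doc can be illustrated with a small standalone sketch. It is not the TensorFlow kernel and uses no TensorFlow APIs. The name XlaMergeSketch and the use of a null pointer to model an input whose branch did not run are assumptions made purely for illustration, as is the precondition that exactly one of the two inputs is live.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model, not TensorFlow code: a null pointer stands in for a
// "dead" input, i.e. the branch (PartitionedCall or _XlaRun) that did not run.
// The template parameter reflects that the op accepts any element type,
// including int32, without special-casing.
template <typename T>
const T& XlaMergeSketch(const T* partitioned_call_output,
                        const T* xla_run_output) {
  // Assumption: exactly one of the two producers executed.
  assert((partitioned_call_output != nullptr) != (xla_run_output != nullptr));

  // Forward the live input. Unlike the regular Merge op, nothing is returned
  // to indicate which input was chosen (there is no value_index output).
  return partitioned_call_output != nullptr ? *partitioned_call_output
                                            : *xla_run_output;
}
```

A call such as XlaMergeSketch<std::int32_t>(nullptr, &value) forwards value unchanged. The template argument is given explicitly because a bare nullptr argument does not let the compiler deduce T.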