address comments on commit d628b34f78f291692c8b4ef0da54f1b91f24291d
Add XLA-only merge that can merge all types. This prevents insertion of H2D and D2H copies when XLA-GPU clusters have int32 outputs. This merge is only used to merge the outputs from the XlaRun and the PartitionedCall node.
parent d628b34f78
commit b3ced407a8
@@ -107,10 +107,11 @@ REGISTER_OP("_XlaMerge")
 .Doc(R"(XLA Merge Op. For use by the XLA JIT only.

 Merges the outputs from the PartitionedCall node and the _XlaRun node.
-Unlike the TensorFlow merge op, _XlaMerge supports merging inputs of all types.
-This prevents the need for copy operations, in particluar when an XLA cluster
-has int32 outputs. The _XlaMerge up does not have a value_index output that
-identifies the chosen input.
+Unlike the TensorFlow Merge op, which requires inputs of some types to be
+placed on the host, the _XlaMerge op can merge inputs of all types when
+placed on the device. This prevents the need for copy operations, in
+particluar when an XLA cluster has int32 outputs. The _XlaMerge up does not
+have a value_index output that identifies the chosen input.
 )");

 } // namespace tensorflow
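The behavioural difference described in the doc string (forward whichever input is available, without reporting a value_index) can be illustrated with a toy model. This is only a sketch and not the TensorFlow kernel: the names MergeWithIndexResult, MergeWithIndex and XlaStyleMerge are hypothetical, std::optional stands in for "this input was or was not produced", and device placement (the reason the op avoids H2D and D2H copies) is not modelled at all.

```cpp
#include <optional>
#include <stdexcept>

// Toy model only. All names here are hypothetical and do not exist in
// TensorFlow; std::optional stands in for "this input was produced".

// Result shape of a regular Merge: the forwarded value plus the index of
// the input that supplied it.
template <typename T>
struct MergeWithIndexResult {
  T value;
  int value_index;
};

// Models the regular Merge op for two inputs: forward the first available
// input and report which one was chosen.
template <typename T>
MergeWithIndexResult<T> MergeWithIndex(const std::optional<T>& a,
                                       const std::optional<T>& b) {
  if (a.has_value()) return {*a, 0};
  if (b.has_value()) return {*b, 1};
  throw std::logic_error("no input available");
}

// Models the _XlaMerge behaviour from the doc string: the same forwarding,
// but without a value_index output.
template <typename T>
T XlaStyleMerge(const std::optional<T>& a, const std::optional<T>& b) {
  return MergeWithIndex(a, b).value;
}
```

In this model the first argument would correspond to the PartitionedCall output and the second to the _XlaRun output; only one of the two is expected to be present for a given execution.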