Op documentation update.

update of g3doc/includes/tf_passes.md

PiperOrigin-RevId: 347668072
Change-Id: I0d9cb91a18c56347844c8d3c699fcbd8f878bda4
This commit is contained in:
A. Unique TensorFlower 2020-12-15 12:22:48 -08:00 committed by TensorFlower Gardener
parent e54e7a08c5
commit ce477dd2bb

View File

@ -39,3 +39,79 @@ func @my_fn(%arg0: tensor<i32>, %arg1: tensor<i32>) -> (tensor<i32>, tensor<i32>
```
-max-iterations : Maximum shape inference iterations
```
### `-tf-tpu-cluster-formation`: Form clusters from operations assigned to the same TPU computation
TPU computations from the frontend are composed of a `tf.TPUReplicateMetadata`
op, a subgraph of ops (TensorFlow Dialect) each with a matching `_tpu_replicate`
attribute relative to the associated `tf.TPUReplicateMetadata` op, and
optionally `tf.TPUReplicatedInput` and `tf.TPUReplicatedOutput` ops feeding in
inputs and outputs to and from a replicated TPU computation. The number of times
a TPU computation is replicated is defined in the `tf.TPUReplicateMetadata` op
(`num_replicas` attribute) and operand and result sizes of
`tf.TPUReplicatedInput` and `tf.TPUReplicatedOutput` respectively must match,
excluding packed tensors. It is also assumed ops of the same TPU computation do
not have ops outside of the TPU computation that are both inputs and outputs to
the same TPU computation.
This pass takes the TPU computation subgraph, moves them into a
`tf_device.cluster`, and copies over attributes from the associated
`tf.TPUReplicateMetadata` op to the newly created `tf_device.cluster`. If the
computation is replicated (`num_replicas` > 1), the `num_replicas` attribute is
not copied over but instead the `tf_device.cluster` is further wrapped with a
`tf_device.replicate`, and associated `tf.TPUReplicatedInput` and
`tf.TPUReplicatedOutput` ops are replaced as the `tf_device.replicate` operands
and results. Otherwise, the single operands and results of the associated
`tf.TPUReplicatedInput` and `tf.TPUReplicatedOutput` ops are simply forwarded to
the `tf_device.cluster`.
For example, the following non replicated computation:
```mlir
func @tpu_computation(%arg0: tensor<i32>) -> tensor<i32> {
// Metadata op for cluster `cluster` with 1 replica, 1 core per replica and
// with topology `<topology>`.
"tf.TPUReplicateMetadata"() {_tpu_replicate = "cluster", num_relicas = 1, num_cores_per_replica = 1, topology = "<topology>", device_assignment = [], padding_map = []} : () -> ()
%replicated_input = "tf.TPUReplicatedInput"(%arg0) : (tensor<i32>) -> tensor<i32>
%identity = "tf.Identity"(%replicated_input) {_tpu_replicate = "cluster"} : (tensor<i32>) -> tensor<i32>
%replicated_output = "tf.TPUReplicatedOutput(%identity) : (tensor<i32>) -> tensor<i32>
return %replicated_output : tensor<i32>
}
```
will be transformed into:
```mlir
func @tpu_computation(%arg0: tensor<i32>) -> tensor<i32> {
%cluster = "tf_device.cluster"() ( {
%identity = "tf.Identity"(%arg0) : (tensor<i32>) -> tensor<i32>
tf_device.return %identity : tensor<i32>
}) {_tpu_replicate = "cluster", num_cores_per_replica = 1, topology = "topology", device_assignment = [], padding_map = []} : () -> (tensor<i32>)
return %cluster : tensor<i32>
}
```
The following replicated computation:
```mlir
func @tpu_computation(%arg0: tensor<i32>, %arg1: tensor<i32>) -> (tensor<i32>, tensor<i32>) {
"tf.TPUReplicateMetadata"() {_tpu_replicate = "cluster", num_relicas = 2, num_cores_per_replica = 1, topology = "topology", device_assignment = [], padding_map = []} : () -> ()
%replicated_input = "tf.TPUReplicatedInput"(%arg0, %arg1) : (tensor<i32>, tensor<i32>) -> tensor<i32>
%identity = "tf.Identity"(%replicated_input) {_tpu_replicate = "cluster"} : (tensor<i32>) -> tensor<i32>
%replicated_output:2 = "tf.TPUReplicatedOutput(%identity) : (tensor<i32>) -> (tensor<i32>, tensor<i32>)
return %replicated_output#0, %replicated_output#1 : tensor<i32>, tensor<i32>
}
```
will be transformed into:
```mlir
func @tpu_computation(%arg0: tensor<i32>, %arg1: tensor<i32>) -> (tensor<i32>, tensor<i32>) {
%replicate:2 = tf_device.replicate([%arg0, %arg1] as %replicated_input) {n = 2 : i32} {
%cluster = "tf_device.cluster"() ( {
%identity = "tf.Identity"(%replicated_input) : (tensor<i32>) -> tensor<i32>
tf_device.return %identity : tensor<i32>
}) {_tpu_replicate = "cluster", num_cores_per_replica = 1, topology = "topology", device_assignment = [], padding_map = []} : () -> (tensor<i32>)
tf_device.return %cluster : tensor<i32>
}
return %replicate#0, %replicate#1 : tensor<i32>, tensor<i32>
}
```