[XLA:GPU] [NFC] Clarify the precondition for the fast reduction emitter

PiperOrigin-RevId: 317266013
Change-Id: I384acac279f0db53f195d5b43318c38c87a1739c
George Karpenkov 2020-06-19 01:13:01 -07:00 committed by TensorFlower Gardener
parent e972c55726
commit 9f20b156bc
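
The three shapes named in the comment this change adds can be illustrated with a small standalone sketch. It assumes that K stands for a run of kept dimensions and R for a run of reduced dimensions, listed from major to minor in physical layout order; that reading is inferred from `dims_to_keep` and is not stated in the commit. `CollapseRuns`, `Classify`, and `ReductionKind` are hypothetical names that do not exist in XLA, and the sketch ignores size-1 dimensions and any size thresholds the real emitter may apply.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of XLA. Takes one flag per dimension, ordered
// from major to minor in the physical layout; true means "reduced".
// Adjacent dimensions with the same flag are merged into one letter,
// e.g. {false, true, true} becomes "KR".
std::string CollapseRuns(const std::vector<bool>& reduced_major_to_minor) {
  std::string pattern;
  for (bool reduced : reduced_major_to_minor) {
    const char tag = reduced ? 'R' : 'K';
    if (pattern.empty() || pattern.back() != tag) pattern.push_back(tag);
  }
  return pattern;
}

// Hypothetical labels for the three cases listed in the comment.
enum class ReductionKind { kRow, kColumn, kBatchedRow, kOther };

// Maps the collapsed pattern onto the three cases. Anything else, including
// degenerate patterns such as "R" or "RK", is reported as kOther because this
// sketch does not model how the real check treats them.
ReductionKind Classify(const std::vector<bool>& reduced_major_to_minor) {
  const std::string pattern = CollapseRuns(reduced_major_to_minor);
  if (pattern == "KR") return ReductionKind::kRow;          // (K, R)
  if (pattern == "KRK") return ReductionKind::kColumn;      // (K, R, K)
  if (pattern == "RKR") return ReductionKind::kBatchedRow;  // (R, K, R)
  return ReductionKind::kOther;
}
```

For example, under these assumptions a rank-3 input whose middle dimension is reduced collapses to "KRK" and is labeled a column reduction, while a four-run pattern such as "RKRK" matches none of the three cases.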

@@ -226,6 +226,11 @@ bool IsReductionFromOrToContiguousDimensions(const HloInstruction& reduce) {
dims_to_keep.push_back(dim);
}
}
// We support fast codegen for three cases:
// 1) Row reduction: (K, R)
// 2) Column reduction: (K, R, K)
// 3) "Batched" row reduction: (R, K, R)
if (!LayoutUtil::AreDimensionsConsecutive(input->shape().layout(),
dims_to_keep) &&
!LayoutUtil::AreDimensionsConsecutive(input->shape().layout(),