Fix for TF build failure with ROCm 3.9 (error: call to 'min' is ambiguous)

When building TF with ROCm 3.9, we run into the following compile error:

```
In file included from tensorflow/core/kernels/reduction_ops_half_mean_sum.cu.cc:20:
./tensorflow/core/kernels/reduction_gpu_kernels.cu.h:430:9: error: call to 'min' is ambiguous
        min(blockDim.y, num_rows - blockIdx.y * blockDim.y);
        ^~~
/opt/rocm-3.9.0-3805/llvm/lib/clang/12.0.0/include/__clang_hip_math.h:1183:23: note: candidate function
__DEVICE__ inline int min(int __arg1, int __arg2) {
                      ^
/opt/rocm-3.9.0-3805/llvm/lib/clang/12.0.0/include/__clang_hip_math.h:1197:14: note: candidate function
inline float min(float __x, float __y) { return fminf(__x, __y); }
             ^
/opt/rocm-3.9.0-3805/llvm/lib/clang/12.0.0/include/__clang_hip_math.h:1200:15: note: candidate function
inline double min(double __x, double __y) { return fmin(__x, __y); }
              ^
1 error generated when compiling for gfx803.
```

The build error appears to be caused by ROCm 3.9 using LLVM header files from `llvm/lib/clang/12.0.0/include` (ROCm 3.8 uses the `11.0.0` version). `12.0.0` has a new `__clang_hip_math.h` file, which is not present in `11.0.0`. This file overloads the `min` function for the `float` and `double` types.

The first argument in the call to `min` (which leads to the error) is `blockDim.y`, which has a `uint` type. None of the candidate overloads takes an unsigned argument, so the compiler cannot decide which overload to resolve to. Previously (i.e. ROCm 3.8 and before) there was only one option (`int`); with ROCm 3.9 there are three (`int`, `float`, and `double`), hence the error.

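The "fix" is to explicitly cast the first argument to `int`, which makes the `int` overload an exact match for that argument and removes the ambiguity. The second argument, `num_rows - blockIdx.y * blockDim.y`, still needs a conversion, but an exact match on one argument is enough for the `int` overload to be preferred over the `float` and `double` ones.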
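
The overload behaviour described above can be reproduced in plain host C++ without the ROCm headers. The following is a minimal sketch, not the HIP implementation: `demo::pick_min`, `resolves`, and `rows_this_block` are hypothetical names introduced for illustration, and the three `pick_min` overloads only mirror the signatures listed in the compiler notes. It assumes `num_rows` is an `int` and that the block index and dimension are `unsigned`, in which case the second argument expression is itself `unsigned`.

```cpp
#include <type_traits>
#include <utility>

namespace demo {
// Hypothetical stand-ins for the three `min` candidates in the compiler notes.
inline int pick_min(int a, int b) { return a < b ? a : b; }
inline float pick_min(float a, float b) { return a < b ? a : b; }
inline double pick_min(double a, double b) { return a < b ? a : b; }
}  // namespace demo

// C++11-compatible void_t, used by the detection idiom below.
template <class...>
struct make_void { using type = void; };
template <class... Ts>
using void_t = typename make_void<Ts...>::type;

// resolves<A, B>::value is true only when demo::pick_min(A, B) selects
// exactly one overload; an ambiguous call is a substitution failure here.
template <class A, class B, class = void>
struct resolves : std::false_type {};
template <class A, class B>
struct resolves<A, B,
                void_t<decltype(demo::pick_min(std::declval<A>(),
                                               std::declval<B>()))>>
    : std::true_type {};

// Type of `num_rows - blockIdx.y * blockDim.y` under the assumption that
// num_rows is int and the block index/dimension are unsigned.
using SecondArg = decltype(std::declval<int>() -
                           std::declval<unsigned>() * std::declval<unsigned>());

static_assert(std::is_same<SecondArg, unsigned>::value,
              "int - unsigned yields unsigned");
static_assert(!resolves<unsigned, SecondArg>::value,
              "original call: no overload is a better match, so it is ambiguous");
static_assert(resolves<int, SecondArg>::value,
              "after the cast: the int overload is an exact match on argument 1");

// Same shape as the fixed line, with plain parameters standing in for
// blockDim.y, blockIdx.y and num_rows.
int rows_this_block(unsigned block_dim_y, unsigned block_idx_y, int num_rows) {
  return demo::pick_min(int(block_dim_y), num_rows - block_idx_y * block_dim_y);
}
```

For example, with 40 rows and a block height of 32, `rows_this_block(32u, 0u, 40)` covers a full block and `rows_this_block(32u, 1u, 40)` covers the remaining 8 rows.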
Deven Desai 2020-09-21 17:03:00 +00:00
parent 868395d7d0
commit 8f4f90992c

@@ -387,7 +387,7 @@ __global__ __launch_bounds__(1024) void ColumnReduceKernel(
     // - =
     // =
     const int numRowsThisBlock =
-        min(blockDim.y, num_rows - blockIdx.y * blockDim.y);
+        min(int(blockDim.y), num_rows - blockIdx.y * blockDim.y);
     for (int row = 1; row < numRowsThisBlock; ++row) {
       value_type t = partial_sums[threadIdx.x * (TF_RED_WARPSIZE + 1) + row];