Fix for TF build failure with ROCm 3.9 (error: call to 'min' is ambiguous)
When building TF with ROCm 3.9, we run into the following compile error:

```
In file included from tensorflow/core/kernels/reduction_ops_half_mean_sum.cu.cc:20:
./tensorflow/core/kernels/reduction_gpu_kernels.cu.h:430:9: error: call to 'min' is ambiguous
        min(blockDim.y, num_rows - blockIdx.y * blockDim.y);
        ^~~
/opt/rocm-3.9.0-3805/llvm/lib/clang/12.0.0/include/__clang_hip_math.h:1183:23: note: candidate function
__DEVICE__ inline int min(int __arg1, int __arg2) {
                      ^
/opt/rocm-3.9.0-3805/llvm/lib/clang/12.0.0/include/__clang_hip_math.h:1197:14: note: candidate function
inline float min(float __x, float __y) { return fminf(__x, __y); }
             ^
/opt/rocm-3.9.0-3805/llvm/lib/clang/12.0.0/include/__clang_hip_math.h:1200:15: note: candidate function
inline double min(double __x, double __y) { return fmin(__x, __y); }
              ^
1 error generated when compiling for gfx803.
```

The build error seems to be because ROCm 3.9 uses the LLVM header files from `llvm/lib/clang/12.0.0/include` (ROCm 3.8 uses the `11.0.0` version). `12.0.0` has a new `__clang_hip_math.h` file, which is not present in `11.0.0`. This file overloads the `min` function for the `float` and `double` types. The first argument in the call to `min` that triggers the error is `blockDim.y`, which has a `uint` type, so the compiler cannot decide which overload to resolve to. Previously (i.e. ROCm 3.8 and before) there was only one option (`int`); with ROCm 3.9 there are three (`int`, `float`, and `double`), hence the error.

The "fix" is to explicitly cast the first argument to `int` to remove the ambiguity (with an exact `int` match on the first argument, the `int` overload becomes the best candidate for the call).
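The overload problem can be reproduced on the host without ROCm. The following is a minimal sketch, not the actual header: the `hip_like` namespace and the `rows_this_block` helper are hypothetical names, and the three overloads only mirror the candidates listed in the error output. It assumes `num_rows` is an `int` and that the block index and dimension are `unsigned`, so the original call passes two unsigned values and every candidate needs a conversion on both arguments.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// Hypothetical stand-in for the three candidates named in the error output.
namespace hip_like {
inline int min(int a, int b) { return a < b ? a : b; }
inline float min(float x, float y) { return std::fmin(x, y); }
inline double min(double x, double y) { return std::fmin(x, y); }
}  // namespace hip_like

// Hypothetical helper mirroring the expression in ColumnReduceKernel.
// block_dim_y and block_idx_y play the role of blockDim.y and blockIdx.y.
inline int rows_this_block(unsigned block_dim_y, int num_rows,
                           unsigned block_idx_y) {
  // The uncast form does not compile: both arguments are unsigned, so no
  // overload is a better match than the others.
  //   hip_like::min(block_dim_y, num_rows - block_idx_y * block_dim_y);

  // With the cast, the first argument matches `int` exactly, so the `int`
  // overload is the unique best candidate.
  return hip_like::min(int(block_dim_y),
                       num_rows - block_idx_y * block_dim_y);
}

// The cast call resolves to the int overload, not float or double.
static_assert(
    std::is_same<decltype(hip_like::min(int(1u), 2u)), int>::value,
    "cast call should select the int overload");
```

Calling `rows_this_block(32, 100, 3)` models the last, partially filled block of a 100-row input with 32 rows per block. Uncommenting the uncast call should make the compiler report an ambiguity similar to the one in the error output.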
parent 868395d7d0
commit 8f4f90992c
@@ -387,7 +387,7 @@ __global__ __launch_bounds__(1024) void ColumnReduceKernel(
     // - =
     // =
     const int numRowsThisBlock =
-        min(blockDim.y, num_rows - blockIdx.y * blockDim.y);
+        min(int(blockDim.y), num_rows - blockIdx.y * blockDim.y);
 
     for (int row = 1; row < numRowsThisBlock; ++row) {
       value_type t = partial_sums[threadIdx.x * (TF_RED_WARPSIZE + 1) + row];