STT-tensorflow/third_party/gpus/crosstool
Deven Desai 312e6bacca PR : [ROCm] Update to use ROCm 3.9 (when building TF with --config=rocm)
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/44471

PR https://github.com/tensorflow/tensorflow/pull/43636 is a pre-requisite for this PR.

For the time being, this PR includes commits from it's pre-req as well.  Once the pre-req PR is merged, I will rebase this PR to remove those commits.

--------------------------------------

/cc @cheshire @chsigg @nvining-work

Copybara import of the project:

--
3f0d378c14f55ac850ace17ac154e2333169329b by Deven Desai <deven.desai.amd@gmail.com>:

Adding #defines for ROCm / MIOpen / HIP Runtime version numbers

This PR/commit introduces the following #defines in the `rocm/rocm_config.h` file

```
#define TF_ROCM_VERSION <Version Number of ROCm install>
#define TF_MIOPEN_VERSION <Verion Number of MIOpen in ROCm install>
#define TF_HIPRUNTIME_VERSION <Version Number of HIP Runtinme in ROCm install>
```

These #defines should be used within TF code to add ROCm/MIOpen/HIp Runtime version specific code.

Details on how we go about determining these version numbers can found on the following wiki-page

https://github.com/ROCmSoftwarePlatform/tensorflow-internal/wiki/How-to-add-ROCm-version-specific-code-changes-in-the-TensorFlow-code%3F

A new script `find_rocm_config.py` is being added by this commit. This script does all the work of determining the version number information and it is pretty to extend it to query more information about the ROCM install.

The information collected by the script is available to `rocm_configure.bzl` and hence can be used to add version specific code in `rocm_configure.bzl` as well.

--
922e0e556c4f31f7ff8da1053f014964d01c0859 by Deven Desai <deven.desai.amd@gmail.com>:

Updating Dockerfile.rocm to use ROCm 3.9

--
cc0b4ae28218a83b3cc262ac83d0b2cf476939c8 by Deven Desai <deven.desai.amd@gmail.com>:

Changing CI scripts to use ROCm 3.9

--
fbfdb64c3375f79674a4f56433f944e1e4fd6b6e by Deven Desai <deven.desai.amd@gmail.com>:

Updating rocm_config.py to account for the new location of the rocblas version header file (in ROCm 3.8)

--
3f191faf8b8f2a0111bc386f41316079cad4aaaa by Deven Desai <deven.desai.amd@gmail.com>:

Removing references to TENSORFLOW_COMPILER_IS_HIP_CLANG

Now that we are way past the switch to use ROCm 3.5 and above (i.e. hip-clang), the codes within `#ifdef TENSORFLOW_COMPILER_IS_HIP_CLANG` are always enabled, and the codes within the corresponding `#else` blocks are deadcodes.

This commit removes the references to `#ifdef TENSORFLOW_COMPILER_IS_HIP_CLANG` and their corresponding `#else` blocks

--
9a4841c9bb8117e8228946be1f3752bdaea4a359 by Deven Desai <deven.desai.amd@gmail.com>:

Removing -DTENSORFLOW_COMPILER_IS_HIP_CLANG from the list of compile flags

--
745e2ad6db4282f5efcfef3155d9a46d9235dbf6 by Deven Desai <deven.desai.amd@gmail.com>:

Removing deadcode for the ROCm platform within the third_party/gpus dir

--
c96dc03986636badce7dbd87fb85cf26dff7a43b by Deven Desai <deven.desai.amd@gmail.com>:

Updating XLA code to account for the device lib files location change in ROCm 3.9

The location of the ROCm device lib files is changing in ROCm 3.9

Current (ROCm 3.8 and before) location is $ROCM_PATH/lib

```
root@ixt-rack-04:/opt/rocm-3.8.0# find . -name *.bc
./lib/oclc_isa_version_701.amdgcn.bc
./lib/ocml.amdgcn.bc
./lib/oclc_daz_opt_on.amdgcn.bc
./lib/oclc_isa_version_700.amdgcn.bc
./lib/oclc_isa_version_810.amdgcn.bc
./lib/oclc_unsafe_math_off.amdgcn.bc
./lib/oclc_wavefrontsize64_off.amdgcn.bc
./lib/oclc_isa_version_803.amdgcn.bc
./lib/oclc_isa_version_1011.amdgcn.bc
./lib/oclc_isa_version_1012.amdgcn.bc
./lib/opencl.amdgcn.bc
./lib/oclc_unsafe_math_on.amdgcn.bc
./lib/oclc_isa_version_1010.amdgcn.bc
./lib/oclc_finite_only_off.amdgcn.bc
./lib/oclc_correctly_rounded_sqrt_on.amdgcn.bc
./lib/oclc_daz_opt_off.amdgcn.bc
./lib/oclc_isa_version_802.amdgcn.bc
./lib/ockl.amdgcn.bc
./lib/oclc_isa_version_906.amdgcn.bc
./lib/oclc_isa_version_1030.amdgcn.bc
./lib/oclc_correctly_rounded_sqrt_off.amdgcn.bc
./lib/hip.amdgcn.bc
./lib/oclc_isa_version_908.amdgcn.bc
./lib/oclc_isa_version_900.amdgcn.bc
./lib/oclc_isa_version_702.amdgcn.bc
./lib/oclc_wavefrontsize64_on.amdgcn.bc
./lib/hc.amdgcn.bc
./lib/oclc_isa_version_902.amdgcn.bc
./lib/oclc_isa_version_801.amdgcn.bc
./lib/oclc_finite_only_on.amdgcn.bc
./lib/oclc_isa_version_904.amdgcn.bc
```

New (ROCm 3.9 and above) location is $ROCM_PATH/amdgcn/bitcode
```
root@ixt-hq-99:/opt/rocm-3.9.0-3703# find -name *.bc
./amdgcn/bitcode/oclc_isa_version_700.bc
./amdgcn/bitcode/ocml.bc
./amdgcn/bitcode/oclc_isa_version_1030.bc
./amdgcn/bitcode/oclc_isa_version_1010.bc
./amdgcn/bitcode/oclc_isa_version_904.bc
./amdgcn/bitcode/hip.bc
./amdgcn/bitcode/hc.bc
./amdgcn/bitcode/oclc_daz_opt_off.bc
./amdgcn/bitcode/oclc_wavefrontsize64_off.bc
./amdgcn/bitcode/oclc_wavefrontsize64_on.bc
./amdgcn/bitcode/oclc_isa_version_900.bc
./amdgcn/bitcode/oclc_isa_version_1012.bc
./amdgcn/bitcode/oclc_isa_version_702.bc
./amdgcn/bitcode/oclc_daz_opt_on.bc
./amdgcn/bitcode/oclc_unsafe_math_off.bc
./amdgcn/bitcode/ockl.bc
./amdgcn/bitcode/oclc_isa_version_803.bc
./amdgcn/bitcode/oclc_isa_version_908.bc
./amdgcn/bitcode/oclc_isa_version_802.bc
./amdgcn/bitcode/oclc_correctly_rounded_sqrt_off.bc
./amdgcn/bitcode/oclc_finite_only_on.bc
./amdgcn/bitcode/oclc_isa_version_701.bc
./amdgcn/bitcode/oclc_unsafe_math_on.bc
./amdgcn/bitcode/oclc_isa_version_902.bc
./amdgcn/bitcode/oclc_finite_only_off.bc
./amdgcn/bitcode/opencl.bc
./amdgcn/bitcode/oclc_isa_version_906.bc
./amdgcn/bitcode/oclc_isa_version_810.bc
./amdgcn/bitcode/oclc_isa_version_801.bc
./amdgcn/bitcode/oclc_correctly_rounded_sqrt_on.bc
./amdgcn/bitcode/oclc_isa_version_1011.bc
```

Also not the change in the filename(s)

This commit updates the XLA code, that has the device lib path + filename(s) hardcoded, to account for the change in location / filename

--
6f981a91c8d8a349c88b450c2191df9c62b2b38b by Deven Desai <deven.desai.amd@gmail.com>:

Adding "-fcuda-flush-denormals-to-zero" as a default hipcc option

Prior to ROCm 3.8, hipcc (hipclang) flushed denormal values to zero by default. Starting with ROCm 3.8 that is no longer true, denormal values are kept as is.

TF expects denormals to be flushed to zero. This is enforced on the CUDA side by explicitly passing the "-fcuda-flush-denormals-to-zero" (see tensorflow.bzl). This commit does the same for the ROCm side.

Also removing the no_rocm tag from the corresponding unit test - //tensorflow/python/kernel_tests:denormal_test_gpu

--
74810439720e0692f81ffb0cc3b97dc6ed50876d by Deven Desai <deven.desai.amd@gmail.com>:

Fix for TF build failure with ROCm 3.9 (error: call to 'min' is ambiguous)

When building TF with ROCm 3.9, we are running into the following compile error

```
In file included from tensorflow/core/kernels/reduction_ops_half_mean_sum.cu.cc:20:
./tensorflow/core/kernels/reduction_gpu_kernels.cu.h:430:9: error: call to 'min' is ambiguous
        min(blockDim.y, num_rows - blockIdx.y * blockDim.y);
        ^~~
/opt/rocm-3.9.0-3805/llvm/lib/clang/12.0.0/include/__clang_hip_math.h:1183:23: note: candidate function
__DEVICE__ inline int min(int __arg1, int __arg2) {
                      ^
/opt/rocm-3.9.0-3805/llvm/lib/clang/12.0.0/include/__clang_hip_math.h:1197:14: note: candidate function
inline float min(float __x, float __y) { return fminf(__x, __y); }
             ^
/opt/rocm-3.9.0-3805/llvm/lib/clang/12.0.0/include/__clang_hip_math.h:1200:15: note: candidate function
inline double min(double __x, double __y) { return fmin(__x, __y); }
              ^
1 error generated when compiling for gfx803.
```

The build error seems to be because ROCm 3.9 uses llvm header files from `llvm/lib/clang/12.0.0/include` (ROCm 3.8 uses the `11.0.0` version). `12.0.0` has a new `__clang_hip_math.h` file, which is not present in `11.0.0`. This file has the `min` function overloaded for the `float` and `double` types.

The first argument in the call to `min` (which leads to the error) is `blockDim.y` which has a `uint` type, and hence the compiler gets confused as to which overloaded type to resole to. Previously (i.e. ROCm 3.8 and before) there was only one option (`int`), with ROCm 3.9 there are three (`int`, `float`, and `double`) and hence the error.

The "fix" is to explicitly cast the first argument to `int` to remove the ambiguity (the second argument is already an `int` type).

COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/tensorflow/pull/44471 from ROCmSoftwarePlatform:google_upstream_rocm_switch_to_rocm39 74810439720e0692f81ffb0cc3b97dc6ed50876d
PiperOrigin-RevId: 341569721
Change-Id: Ia614893881bf8db1ef8901034c35cc585a82dba8
2020-11-10 00:57:26 -08:00
..
clang/bin PR : [ROCm] Update to use ROCm 3.9 (when building TF with --config=rocm) 2020-11-10 00:57:26 -08:00
windows Revert PR : Make fast builds work with MSVC 2020-10-05 10:30:22 -07:00
BUILD Add cuda_configure repository rule to autodetect cuda. 2016-08-20 02:45:45 -07:00
BUILD.rocm.tpl Migrate toolchains for --incompatible_use_specific_tool_files 2019-11-12 05:29:32 -08:00
BUILD.tpl Allow adding compiler specific flags to the GPU crosstool. 2020-03-23 07:35:44 -07:00
cc_toolchain_config.bzl.tpl Remove stray quotes 2020-10-18 21:56:50 +02:00
hipcc_cc_toolchain_config.bzl.tpl Update ROCm toolchain configuration 2019-11-06 06:28:22 -08:00
LICENSE TensorFlow: Initial commit of TensorFlow library. 2015-11-06 16:27:58 -08:00