[ROCm] Fix for ROCm CSB breakage - 200527
The following commit introduces a new unit-test which fails on ROCm.
dbef0933eb
I think that this unit-test is for checking the reduced memory usage of the gradient checkpointing method.
The sub-test `test_does_not_raise_oom_exception` fails on ROCm, because on the ROCm platform the scratch space required for doing backward convolution pushes the total memory allocation just beyond the 1GB limit imposed by the testcase.
This fix raises the threshold by 128 MB (from 1024 MB to 1152 MB). This still preserves the intent of the unit-test, i.e. the `test_raises_oom_exception` sub-test continues to raise the exception, while also allowing the `test_does_not_raise_oom_exception` sub-test to pass on the ROCm platform.
parent b847ff9b30
commit 1c2527fd11
@@ -75,7 +75,7 @@ def _limit_gpu_memory():
   if gpus:
     tf.config.experimental.set_virtual_device_configuration(
         gpus[0],
-        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
+        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1152)])
     return True
   return False
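For readers without the full file at hand, here is a minimal sketch of what a helper like `_limit_gpu_memory` might look like after this change. Only the lines in the hunk come from the commit. The way `gpus` is obtained, the `limit_mb` parameter, and the constant names are assumptions made for illustration. TensorFlow is imported inside the function so that the constants can be inspected without it installed.

```python
# Illustrative constants; the names are hypothetical, the values come from the commit message.
OLD_LIMIT_MB = 1024        # limit before this commit
ROCM_HEADROOM_MB = 128     # extra room for ROCm backward-convolution scratch space
NEW_LIMIT_MB = OLD_LIMIT_MB + ROCM_HEADROOM_MB  # 1152, the limit after this commit


def limit_gpu_memory(limit_mb=NEW_LIMIT_MB):
    """Cap the first GPU at `limit_mb` megabytes; return True if a GPU was found."""
    import tensorflow as tf  # lazy import: only needed when the cap is actually applied

    # Assumption: the original helper lists GPUs in a similar way (not shown in the hunk).
    gpus = tf.config.experimental.list_physical_devices("GPU")
    if gpus:
        # memory_limit is expressed in megabytes.
        tf.config.experimental.set_virtual_device_configuration(
            gpus[0],
            [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=limit_mb)])
        return True
    return False
```

The virtual device configuration has to be set before the GPU is initialized, so a helper of this kind would typically be called at the very start of the test.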