[ROCm] Populating memory bandwidth information in the DeviceDescription
Currently, when running on the ROCm platform, the `deviceMemoryBandwidth` value reported for the device is incorrect. For example:

```
2020-12-04 14:13:18.503661: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1738] Found device 0 with properties:
pciBusID: 0000:1c:00.0 name: Vega 10 [Radeon Instinct MI25] ROCm AMD GPU ISA: gfx900
coreClock: 1.5GHz coreCount: 64 deviceMemorySize: 15.98GiB deviceMemoryBandwidth: -1B/s
```

This commit fixes that by querying the GPU for that information and populating it correctly in TF:

```
2020-12-04 19:41:12.864439: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1745] Found device 0 with properties:
pciBusID: 0000:63:00.0 name: Vega 10 [Radeon Instinct MI25] ROCm AMD GPU ISA: gfx900
coreClock: 1.5GHz coreCount: 64 deviceMemorySize: 15.98GiB deviceMemoryBandwidth: 450.61GiB/s
```
parent ef9f660b77
commit e1962fce81
@@ -856,6 +856,11 @@ GpuExecutor::CreateDeviceDescription(int device_ordinal) {

    float clock_rate_ghz = static_cast<float>(prop.clockRate) / 1e6;
    builder.set_clock_rate_ghz(clock_rate_ghz);

    // mem_bandwidth = 2 * mem_bus_width_in_bytes * mem_clock_rate_in_hz
    int64 memory_bandwidth = 2 * (int64(prop.memoryBusWidth) / 8) *
                             (int64(prop.memoryClockRate) * 1000);
    builder.set_memory_bandwidth(memory_bandwidth);
  }

  {
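As a sanity check on the formula in the hunk above, here is a minimal standalone sketch. It assumes the bus width is reported in bits and the memory clock in kHz, which is what the `/ 8` and `* 1000` in the diff imply. The helper names `MemoryBandwidthBytesPerSec` and `ToGiBPerSec` are hypothetical and not part of TensorFlow. The MI25 inputs (2048-bit bus, 945000 kHz memory clock) do not appear in the commit; they are assumed hardware values, chosen because they reproduce the 450.61GiB/s shown in the fixed log.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the diff's formula:
//   bandwidth = 2 * bus_width_in_bytes * mem_clock_in_hz
// The factor of 2 comes from the formula in the diff (two transfers per clock).
// Assumes bus_width_bits is in bits and mem_clock_khz is in kHz.
constexpr std::int64_t MemoryBandwidthBytesPerSec(int bus_width_bits,
                                                  int mem_clock_khz) {
  return 2 * (static_cast<std::int64_t>(bus_width_bits) / 8) *
         (static_cast<std::int64_t>(mem_clock_khz) * 1000);
}

// Converts bytes per second to GiB per second (1 GiB = 2^30 bytes).
constexpr double ToGiBPerSec(std::int64_t bytes_per_sec) {
  return static_cast<double>(bytes_per_sec) / (1024.0 * 1024.0 * 1024.0);
}

// Assumed MI25 values (not taken from the commit): 2048-bit bus, 945000 kHz.
// 2 * 256 bytes * 945,000,000 Hz = 483,840,000,000 B/s.
static_assert(MemoryBandwidthBytesPerSec(2048, 945000) == 483840000000LL,
              "assumed MI25 inputs should give 483.84e9 B/s");
```

Under those assumed inputs, `ToGiBPerSec(483840000000)` is roughly 450.61, which is consistent with the `deviceMemoryBandwidth: 450.61GiB/s` line in the commit message. Widening to 64 bits before multiplying matters here, since the product exceeds the range of a 32-bit `int`.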