STT-tensorflow


Bixia Zheng 14d708ab72 [TF:TRT] Handle out of GPU memory when creating TensorRT execution context.

Previously, we used ICudaEngine::createExecutionContext to create a TensorRT
execution context along with the GPU memory needed to execute the CUDA engine.
This API doesn't handle running out of GPU memory properly and instead
propagates an exception. This change uses
ICudaEngine::createExecutionContextWithoutDeviceMemory to create a TensorRT
execution context without any GPU memory and lets TF-TRT allocate the needed
GPU memory. To keep track of that GPU memory, we wrap the TensorRT execution
context and the associated GPU memory in a new class called ExecutionContext.

PiperOrigin-RevId: 351895192
Change-Id: Ie01f0241578fadba8fad25bd110f937fd47082c8

2021-01-14 16:08:51 -08:00
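
The ownership pattern described in the commit message above can be sketched in plain C++. This is a minimal illustration, not the TF-TRT code: FakeTrtContext, FakeGpuAllocator, and ExecutionContextSketch are hypothetical stand-ins invented here so the example compiles without TensorRT or CUDA, and host malloc stands in for GPU allocation. In a real build the context would presumably come from ICudaEngine::createExecutionContextWithoutDeviceMemory and the buffer would be attached with IExecutionContext::setDeviceMemory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical stand-in for a TensorRT execution context (not the real API).
struct FakeTrtContext {
  void* device_memory = nullptr;
  void SetDeviceMemory(void* mem) { device_memory = mem; }
};

// Hypothetical allocator standing in for a GPU allocator. It reports
// failure by returning nullptr instead of throwing an exception.
struct FakeGpuAllocator {
  explicit FakeGpuAllocator(std::size_t cap) : capacity(cap) {}

  void* Allocate(std::size_t bytes) {
    if (bytes > capacity) return nullptr;  // simulated out-of-memory
    void* p = std::malloc(bytes == 0 ? 1 : bytes);
    if (p != nullptr) ++live_allocations;
    return p;
  }

  void Deallocate(void* p) {
    if (p == nullptr) return;
    std::free(p);
    --live_allocations;
  }

  std::size_t capacity;
  std::size_t live_allocations = 0;
};

// Owns a context together with the memory it runs on, so the memory is
// released exactly when the context goes away.
class ExecutionContextSketch {
 public:
  // Returns nullptr when the allocation fails, so the caller can recover
  // (for example by falling back to another execution path).
  static std::unique_ptr<ExecutionContextSketch> Create(
      FakeGpuAllocator* allocator, std::size_t bytes_needed) {
    void* mem = allocator->Allocate(bytes_needed);
    if (mem == nullptr) return nullptr;
    return std::unique_ptr<ExecutionContextSketch>(
        new ExecutionContextSketch(allocator, mem));
  }

  ~ExecutionContextSketch() { allocator_->Deallocate(memory_); }

  ExecutionContextSketch(const ExecutionContextSketch&) = delete;
  ExecutionContextSketch& operator=(const ExecutionContextSketch&) = delete;

  FakeTrtContext* context() { return &context_; }

 private:
  ExecutionContextSketch(FakeGpuAllocator* allocator, void* mem)
      : allocator_(allocator), memory_(mem) {
    context_.SetDeviceMemory(mem);
  }

  FakeGpuAllocator* allocator_;
  void* memory_;
  FakeTrtContext context_;
};
```

A caller would check the result of Create for nullptr and treat that as a recoverable out-of-memory condition. Because the wrapper is the only owner of the buffer, destroying the wrapper frees the memory without any separate bookkeeping.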

common

[TF:TRT] Initialize TensorRT plugin registry before deserializing cuda engines.

2020-08-20 12:49:47 -07:00

convert

PR #46382 : TF-TRT Test ConvertConcat in dynamic shape mode

2021-01-14 13:37:37 -08:00

kernels

[TF:TRT] Handle out of GPU memory when creating TensorRT execution context.

2021-01-14 16:08:51 -08:00

ops

[TF:TRT] Add support for per cluster maximum batch size.