Previously, we used ICudaEngine::createExecutionContext to create a TensorRT
execution context along with the GPU memory needed to execute the CUDA engine.
This API doesn't handle running out of GPU memory properly; instead, it
propagates an exception. This change uses
ICudaEngine::createExecutionContextWithoutDeviceMemory to create a TensorRT
execution context without any GPU memory and lets TF-TRT allocate the needed
GPU memory itself. To keep track of this GPU memory, we wrap the TensorRT
execution context and the associated GPU memory in a new class called
ExecutionContext.
PiperOrigin-RevId: 351895192
Change-Id: Ie01f0241578fadba8fad25bd110f937fd47082c8