STT-tensorflow/tensorflow/python/eager
Daniel Ellis c29e9f25e7 Handle garbage collection race condition.
An exception is thrown when objects that use `CapturableResourceDeleter` are garbage collected at the end of a program's life. This can happen in ordinary circumstances, such as when using `saved_model_cli` to inspect a model.

The cause of the exception appears to be a race condition with garbage collection between `CapturableResourceDeleter` and `ScopedTFFunction`. Both define a custom finalizer (`__del__`); `CapturableResourceDeleter`'s finalizer ultimately calls a concrete function, which calls an `_EagerDefinedFunction`, which in turn attempts to load and execute a `ScopedTFFunction`.

When multiple objects in a reference cycle all become unreachable during the same garbage collection cycle, there is no guaranteed order in which they are collected. In the failing case, `ScopedTFFunction` is collected first and its underlying function is deleted. Later, `CapturableResourceDeleter`'s finalizer runs and fails, since the function it is trying to call is gone.
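
The ordering hazard described above can be reproduced in plain Python. The sketch below is illustrative only and is not the change made in `function.py`: `ScopedFunction`, `ResourceDeleter`, and `run_demo` are hypothetical stand-ins, and the `None` check is one possible way to tolerate either finalization order, assumed here for demonstration.

```python
import gc


class ScopedFunction:
    """Hypothetical stand-in for an object that owns a function and drops it in __del__."""

    def __init__(self, log):
        self.func = lambda: "resource destroyed"
        self.deleter = None  # set later to form a reference cycle
        self._log = log

    def __del__(self):
        self._log.append("ScopedFunction finalized")
        self.func = None  # the underlying function is gone after this point


class ResourceDeleter:
    """Hypothetical stand-in for an object whose finalizer needs the function above."""

    def __init__(self, scoped, log):
        self._scoped = scoped
        self._log = log

    def __del__(self):
        func = self._scoped.func
        if func is None:
            # The other finalizer ran first; skip instead of raising.
            self._log.append("ResourceDeleter: function already gone, skipping")
            return
        self._log.append("ResourceDeleter: " + func())


def run_demo():
    log = []
    scoped = ScopedFunction(log)
    deleter = ResourceDeleter(scoped, log)
    scoped.deleter = deleter  # scoped <-> deleter now form a cycle
    del scoped, deleter       # both become unreachable together
    gc.collect()              # finalizer order within the cycle is unspecified
    return log


if __name__ == "__main__":
    for entry in run_demo():
        print(entry)
```

Without the `None` check, the second finalizer would try to call a missing function whenever it happens to run last, and Python would report the resulting exception as ignored in `__del__`.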

PiperOrigin-RevId: 358292164
Change-Id: I9162d5de622f5c1ec9b2954647b9958a7d3d87b6
2021-02-18 17:00:03 -08:00
benchmarks PY2 removal cleanup 2021-01-15 16:48:57 -08:00
memory_tests PY2 removal cleanup 2021-01-15 16:48:57 -08:00
backprop_test.py GradientTape.jacobian/batch_jacobian: allow calls inside the GradientTape's scope 2021-02-12 13:06:37 -08:00
backprop_util.py Internal symbol name change. 2021-01-05 14:15:45 -08:00
backprop.py GradientTape.jacobian/batch_jacobian: allow calls inside the GradientTape's scope 2021-02-12 13:06:37 -08:00
benchmarks_test_base.py add benchmarks_test option to capture profiles 2020-07-20 17:27:49 -07:00
benchmarks_test.py First phase of nest.py migration to cc: move flatten_dict_items implementation from python to C++ 2021-02-16 15:20:09 -08:00
BUILD Update check for multiple devices found to check whether there is a single device colocated with the default_device. 2021-02-17 11:48:37 -08:00
cancellation_test.py
cancellation.py Let pybind handle deletion of cancellation managers 2021-02-03 08:42:08 -08:00
context_test.py Fix leaking registered functions when destroying a loaded model. 2021-01-21 14:14:11 -08:00
context.py Add function tf.config.experimental.get_memory_info. 2021-01-22 14:59:04 -08:00
core_test.py Clean up a few disable_tfrt annotations. 2020-09-28 13:28:19 -07:00
core.py
custom_device_test.py Move away from deprecated asserts 2020-06-30 16:10:22 -07:00
custom_device_testutil.cc
def_function_test_cpu_only.py Rationale: 2020-11-03 12:55:59 -08:00
def_function_test.py Internal change 2021-01-07 23:26:21 -08:00
def_function_xla_jit_test.py Switch legalization status type to match old bridge 2021-02-08 09:36:17 -08:00
def_function_xla_test.py
def_function.py Don't run restored functions eagerly. 2021-02-04 10:24:06 -08:00
device_placement_test.py Run placer before importing function to mlir. 2020-11-24 17:18:21 -08:00
execute.py Internal change 2020-07-17 20:18:01 -07:00
executor.py Use __slots__ for small classes 2020-06-28 18:41:22 +02:00
forwardprop_test.py Merge pull request from ROCmSoftwarePlatform:google_upstream_rocm_misc_update_201203 2020-12-09 07:38:17 -08:00
forwardprop_util.py
forwardprop.py Remove ndarray wrapper from TF Numpy. We return tensors directly. 2021-02-04 19:20:27 -08:00
function_argument_naming_test.py Clean up a few disable_tfrt annotations. 2020-09-28 13:28:19 -07:00
function_defun_collection_test.py Move away from deprecated asserts 2020-06-30 16:10:22 -07:00
function_gradients_test.py Remove unnecessary eval() calls 2020-07-10 22:35:43 -07:00
function_test.py Update check for multiple devices found to check whether there is a single device colocated with the default_device. 2021-02-17 11:48:37 -08:00
function.cc resolve name conflict for now 2020-08-25 17:44:10 -07:00
function.py Handle garbage collection race condition. 2021-02-18 17:00:03 -08:00
gradient_input_output_exclusions_test.py
gradient_input_output_exclusions.py Upgrade to gast 0.4. 2021-01-06 10:17:47 -08:00
graph_only_ops_test.py remove v1 decorator 2020-07-29 11:53:37 -07:00
graph_only_ops.py
imperative_grad.py
lift_to_graph_test.py Fix a bug in lift_to_graph.py. 2020-06-11 02:32:40 -07:00
lift_to_graph.py Export lift_to_graph as a tf.__internal__ API. 2021-01-25 14:07:39 -08:00
monitoring_test.py Make eager/monitoring_test.py less flaky. 2020-06-08 14:44:00 -07:00
monitoring.py Export BoolGauge as a tf.__internal__ API. 2021-02-03 15:02:41 -08:00
ops_test.py - Handle error metadata properly in c_api_tfrt. 2020-10-16 11:59:33 -07:00
profiler_client_test.py
profiler_client.py
profiler_test.py Use trace.Trace instead of traceme.TraceMe in python 2020-06-12 19:48:10 -07:00
profiler.py
pywrap_gradient_exclusions.cc Create a V2 Op to stop the gradient when the input is out of range. 2020-10-12 10:39:36 -07:00
pywrap_gradient_exclusions.h
pywrap_tensor_conversion.cc Make TFE_TensorHandleCache aware of TFE_Context. 2020-10-01 12:31:59 -07:00
pywrap_tensor_conversion.h Make TFE_TensorHandleCache aware of TFE_Context. 2020-10-01 12:31:59 -07:00
pywrap_tensor_test_util.cc Export TFE_TensorHandleToNumpy in pywrap_tensor.h so that in case be used in python binding for unified API. 2020-09-08 20:09:11 -07:00
pywrap_tensor_test.py Export TFE_TensorHandleToNumpy in pywrap_tensor.h so that in case be used in python binding for unified API. 2020-09-08 20:09:11 -07:00
pywrap_tensor.cc Update TFE_Py_TensorShapeSlice to use abstract interface APIs instead of TF_* APIs. 2020-11-17 09:29:24 -08:00
pywrap_tensor.h 1. Add helper for casting EagerTensor* to AbstractTensorHandle*. This allows us to directly work with EagerTensors and will also allow using tf.data with unified APIs. 2020-09-14 17:40:33 -07:00
pywrap_tfe_src.cc Fix --config=asan leak reports. 2021-01-21 12:40:02 -08:00
pywrap_tfe_test.py Fix bug where int attributes >= 2**31 caused on exception on some platforms. 2020-11-20 12:50:28 -08:00
pywrap_tfe.h Internal refactoring change. 2020-10-26 07:12:49 -07:00
remote_benchmarks_test.py Remove experimental mirroring policy APIs. 2020-08-25 16:00:49 -07:00
remote_cloud_tpu_test.py [TF2XLA] Update the test to match removed XLA:CPU device 2020-08-11 17:56:10 -07:00
remote_cluster_test.py Use Unavailable error for non-existing server context since this indicates the remote server has restarted. 2020-10-21 17:32:30 -07:00
remote_execution_test.py Remove the internal flag of lazy_remote_inputs_copy. 2021-01-07 10:42:36 -08:00
remote_test.py Update check for multiple devices found to check whether there is a single device colocated with the default_device. 2021-02-17 11:48:37 -08:00
remote.py Infer local ip in connect_to_cluster. 2020-10-28 17:52:06 -07:00
tape_test.py
tape.py Use __slots__ for small classes 2020-06-28 18:41:22 +02:00
tensor_test.py Update TFE_Py_TensorShapeSlice to use abstract interface APIs instead of TF_* APIs. 2020-11-17 09:29:24 -08:00
test.py
wrap_function_device_test.py Add wrap_function tests with explicit device placement. 2020-09-28 12:38:47 -07:00
wrap_function_test.py
wrap_function.py Internal symbol name change. 2021-01-05 14:15:45 -08:00