STT-tensorflow/tensorflow/python/saved_model
Daniel Ellis c29e9f25e7 Handle garbage collection race condition.
An exception is being thrown when objects that use `CapturableResourceDeleter` are garbage collected at the end of a program's life.  This can happen in very normal circumstances, such as when using `saved_model_cli` to inspect a model.

The cause of the exception appears to be a race condition with garbage collection between `CapturableResourceDeleter` and `ScopedTFFunction`. Both define a custom finalizer (`__del__`); `CapturableResourceDeleter`'s finalizer ultimately calls a concrete function which calls an `_EagerDefinedFunction` which attempts to load and execute a `ScopedTFFunction`.

In the case of multiple objects in a reference cycle all going unreachable during the same garbage collection cycle, we get no guaranteed ordering for which of the objects will be collected first. In the case of the exception, `ScopedTFFunction` is collected first and its underlying function is deleted. Later, `CapturableResourceDeleter` is called, which fails, since the function it's trying to call is gone.

PiperOrigin-RevId: 358292164
Change-Id: I9162d5de622f5c1ec9b2954647b9958a7d3d87b6
2021-02-18 17:00:03 -08:00
model_utils PY2 removal cleanup 2021-01-20 17:11:44 -08:00
BUILD PY2 removal cleanup 2021-01-20 17:11:44 -08:00
builder_impl.py Use a ternary if for checking main_op is None 2020-08-28 04:22:14 +02:00
builder.py
constants.py
function_deserialization.py Don't run restored functions eagerly. 2021-02-04 10:24:06 -08:00
function_serialization.py Rationale: 2020-11-03 12:55:59 -08:00
load_options.py Add load option for loading SavedModel from specific io_device for distributed training. 2020-06-15 15:34:47 -07:00
load_test.py Handle garbage collection race condition. 2021-02-18 17:00:03 -08:00
load_v1_in_v2_test.py Internal change on model loading. 2021-02-08 22:28:22 -08:00
load_v1_in_v2.py Internal change on model loading. 2021-02-08 22:28:22 -08:00
load.py Stop running variable restoration twice when loading SavedModels 2021-01-11 14:11:06 -08:00
loader_impl.py pylint fix 2020-07-14 15:31:24 +00:00
loader_test.py Remove v1 test decorators from saved_model:loader_test. 2020-07-30 15:15:57 -07:00
loader.py
main_op_impl.py
main_op.py
method_name_updater_test.py
method_name_updater.py Internal Cleanup for docstring. 2020-09-15 08:56:28 -07:00
nested_structure_coder_test.py Remove ndarray wrapper from TF Numpy. We return tensors directly. 2021-02-04 19:20:27 -08:00
nested_structure_coder.py Remove ndarray wrapper from TF Numpy. We return tensors directly. 2021-02-04 19:20:27 -08:00
README.md Update README.md for python/saved_model, see tensorflow.org for most up-to-date documentation re SavedModel. 2020-11-19 15:34:04 -08:00
revived_types_test.py
revived_types.py
save_context_test.py Allow access SaveOptions through SaveContext 2020-07-17 11:17:43 -07:00
save_context.py Allow access SaveOptions through SaveContext 2020-07-17 11:17:43 -07:00
save_options.py Update save_options.function_aliases docstring. 2020-11-09 15:57:58 -08:00
save_test.py Decrease frequency of logged warning for untraced functions while saving. Remove warning if attempting to restore an untraced function while loading. 2020-10-22 12:04:42 -07:00
save.py Format an AssertionError string with space between sentences. 2021-02-17 12:42:47 -08:00
saved_model_test.py Remove v1 decorators from saved_model:saved_model_test. 2020-07-31 15:22:49 -07:00
saved_model.py
signature_constants.py
signature_def_utils_impl.py
signature_def_utils_test.py Remove run_deprecated_v1 decorators from signature_def_utilts_test. 2020-07-29 09:55:23 -07:00
signature_def_utils.py
signature_serialization.py Log info message if input name in function signature changes in SavedModel, which get converted here: 7e3a0d6be0/tensorflow/core/framework/graph_to_functiondef.cc (L82-L93) 2020-11-04 10:43:32 -08:00
simple_save_test.py Remove run_deprecated_v1 qualifier from saved_model:simple_save_test. 2020-07-28 11:26:41 -07:00
simple_save.py
tag_constants.py
utils_impl.py Add option to construct tf.train.Checkpoint with a root object. 2020-08-14 12:19:56 -07:00
utils_test.py Move away from deprecated asserts 2020-06-30 16:10:22 -07:00
utils.py

TensorFlow SavedModel


Overview

SavedModel is the universal serialization format for TensorFlow models.

SavedModel provides a language-neutral format to save machine-learning models that is recoverable and hermetic. It enables higher-level systems and tools to produce, consume and transform TensorFlow models.

Guides

Public API

The SavedModel Format

A SavedModel directory has the following structure:

assets/
assets.extra/
variables/
    variables.data-?????-of-?????
    variables.index
saved_model.pb
  • SavedModel protocol buffer
    • saved_model.pb or saved_model.pbtxt
    • Includes the graph definitions as MetaGraphDef protocol buffers.
  • Assets
    • Subfolder called assets.
    • Contains auxiliary files such as vocabularies.
  • Extra assets
    • Subfolder where higher-level libraries and users can add their own assets that co-exist with the model, but are not loaded by the graph.
    • This subfolder is not managed by the SavedModel libraries.
  • Variables
    • Subfolder called variables.
      • variables.data-?????-of-?????
      • variables.index
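
The layout above can be checked with a few lines of standard-library Python. This is a minimal sketch, not part of the SavedModel API: the helper name `describe_saved_model_dir` is hypothetical, and treating every entry as optional is an assumption of the sketch (the helper only reports what it finds).

```python
from pathlib import Path

# Names taken from the directory layout described above.
PROTO_NAMES = ("saved_model.pb", "saved_model.pbtxt")
SUBFOLDERS = ("assets", "assets.extra", "variables")


def describe_saved_model_dir(export_dir):
    """Report which parts of the SavedModel layout exist under export_dir.

    Hypothetical helper for illustration; it inspects file names only and
    never parses the protocol buffer or the variable files.
    """
    root = Path(export_dir)
    report = {
        # First protocol buffer file found, or None if neither is present.
        "proto": next((n for n in PROTO_NAMES if (root / n).is_file()), None),
    }
    for name in SUBFOLDERS:
        report[name] = (root / name).is_dir()

    variables = root / "variables"
    if variables.is_dir():
        # Shards follow the variables.data-?????-of-????? naming pattern.
        report["variable_shards"] = sorted(
            p.name for p in variables.glob("variables.data-*-of-*")
        )
    else:
        report["variable_shards"] = []
    report["has_variables_index"] = (variables / "variables.index").is_file()
    return report
```

Calling `describe_saved_model_dir(export_dir)` on an export directory returns a plain dictionary, which makes it easy to log or to assert on in a test before handing the directory to a loader.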

Stripping Default-Valued Attributes

The SavedModelBuilder class lets users control whether default-valued attributes are stripped from the NodeDefs when a meta graph is added to the SavedModel bundle. Both SavedModelBuilder.add_meta_graph_and_variables and SavedModelBuilder.add_meta_graph accept a Boolean flag, strip_default_attrs, that controls this behavior.

If strip_default_attrs is False, the exported MetaGraphDef will have the default-valued attributes in all its NodeDef instances. This can break forward compatibility with a sequence of events such as the following:

  • An existing Op (Foo) is updated to include a new attribute (T) with a default (bool) at version 101.
  • A model producer (such as a Trainer) binary picks up this change (version 101) to the OpDef and re-exports an existing model that uses Op Foo.
  • A model consumer (such as TensorFlow Serving) running an older binary (version 100) doesn't have attribute T for Op Foo, but tries to import this model. The model consumer doesn't recognize attribute T in a NodeDef that uses Op Foo and therefore fails to load the model.

By setting strip_default_attrs to True, model producers can strip away any default-valued attributes in the NodeDefs. This helps ensure that newly added attributes with defaults don't cause older model consumers to fail when loading models regenerated with newer training binaries.
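
The scenario can be replayed with a toy model in plain Python. This is a sketch under stated assumptions, not TensorFlow's implementation: the dictionaries stand in for OpDefs and NodeDefs, and `export_node` and `import_node` are hypothetical helpers that only mimic the behavior described above.

```python
# Toy stand-ins for op registries: op name -> {attribute name: default value}.
# These are NOT TensorFlow data structures.
PRODUCER_OPS = {"Foo": {"T": "bool"}}  # version 101: Foo gained attribute T
CONSUMER_OPS = {"Foo": {}}             # version 100: Foo has no attribute T


def export_node(op, attrs, op_registry, strip_default_attrs):
    """Build a toy NodeDef, filling in defaults and optionally stripping them."""
    defaults = op_registry[op]
    full = {**defaults, **attrs}
    if strip_default_attrs:
        # Keep only attributes whose value differs from the op's default.
        full = {
            name: value
            for name, value in full.items()
            if name not in defaults or defaults[name] != value
        }
    return {"op": op, "attr": full}


def import_node(node, op_registry):
    """Reject a toy NodeDef that carries attributes the consumer doesn't know."""
    known = op_registry[node["op"]]
    unknown = sorted(set(node["attr"]) - set(known))
    if unknown:
        raise ValueError(f"Op {node['op']} has unknown attributes: {unknown}")
    return node


kept = export_node("Foo", {}, PRODUCER_OPS, strip_default_attrs=False)
stripped = export_node("Foo", {}, PRODUCER_OPS, strip_default_attrs=True)

import_node(stripped, CONSUMER_OPS)  # loads: attribute T was stripped
try:
    import_node(kept, CONSUMER_OPS)  # fails: older consumer doesn't know T
except ValueError as err:
    print("older consumer rejected the node:", err)
```

Note that stripping only removes attributes that still hold their default value. If the producer sets T to a non-default value, the attribute stays in the node, and an older consumer would reject it in this toy model regardless of the flag.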

TIP: If you care about forward compatibility, then set strip_default_attrs to True while using SavedModelBuilder.add_meta_graph_and_variables and SavedModelBuilder.add_meta_graph.