190 Commits

Author SHA1 Message Date
TensorFlower Gardener
eacf534690 Merge pull request from rahul003:s3_skip_temp
PiperOrigin-RevId: 297455352
Change-Id: I41411282776981e9cf4e347b25d238557151f9e6
2020-02-26 14:49:31 -08:00
Rahul Huilgol
e65e99c433 Not use temp files when writing to S3
Address feedback

Add test for the python method has_atomic_move

Removed old comment, and fixed indentation

Remove unnecessary imports

Remove the test that checks for reference cycles when saving. The file system check introduces a conditional op, which in turn introduces a reference cycle, so the test no longer applies.

Fighting lint

Fix lint errors

Use returned status of hasAtomicMove
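
A minimal sketch of the idea behind this change, under stated assumptions: `write_file` and its boolean `has_atomic_move` argument are hypothetical stand-ins, not the TensorFlow API (the log names a `has_atomic_move` Python method but does not show its signature). It uses the local file system only to illustrate the branch between "temp file plus rename" and "write directly".

```python
import os
import tempfile


def write_file(path, data, has_atomic_move):
    """Write `data` to `path`; use a temp file only if rename is atomic.

    `has_atomic_move` is a plain bool in this sketch. In the commit it
    comes from a file system query whose signature is not shown here.
    """
    if has_atomic_move:
        # Write next to the target, then rename, so readers never
        # observe a partially written file.
        tmp_path = path + ".tmp"
        with open(tmp_path, "wb") as f:
            f.write(data)
        os.replace(tmp_path, path)
        return "temp+rename"
    # On object stores without a native rename (such as S3), a "move"
    # is a copy followed by a delete, so the temp file adds I/O without
    # adding atomicity. Write to the final path directly.
    with open(path, "wb") as f:
        f.write(data)
    return "direct"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(write_file(os.path.join(d, "ckpt-1"), b"weights", True))
        print(write_file(os.path.join(d, "ckpt-2"), b"weights", False))
```

Either branch leaves the same bytes at `path`; only the intermediate steps differ.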
2020-02-20 20:00:53 +00:00
TensorFlower Gardener
1bece066de Merge pull request from kiszk:spelling_tweaks_python
PiperOrigin-RevId: 294744506
Change-Id: Ib74df8ca3aacaab65adbfc663c3f3ada43594942
2020-02-12 13:41:49 -08:00
A. Unique TensorFlower
34f38875b1 Fix some tensorflow errors triggered when tests are run with python3 -bb.
PiperOrigin-RevId: 294716915
Change-Id: I725b900eb0da80ea8693f23491b0171dc11affe6
2020-02-12 11:59:05 -08:00
Kazuaki Ishizaki
bd8e308b4c minor spelling tweaks
2020-02-11 15:09:21 +09:00
Rohan Jain
c19c8167c2 We can't rely on the tensor.device attribute being set in all cases, so it's better to get the device for a SaveSpec from the device passed in. This was an issue when saving iterators: the resource usually has a device specification, but the serialized tensor derived from it might not have one set. As a result, when saving iterators in a sharded fashion, all iterators ended up on the '' device, which is not what is intended.
Also adding support for saving iterators in a sharded fashion to avoid unnecessary copying during checkpointing.
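
The effect described above can be illustrated with a small sketch in plain Python. Everything here is hypothetical: the dict fields `passed_device` and `tensor_device` and the helper `shard_by_device` are illustration-only names, not TensorFlow structures. The point is only that grouping by an unset device string collapses every iterator into one shard.

```python
from collections import defaultdict


def shard_by_device(saveables, device_of):
    """Group saveable names into shards keyed by device string."""
    shards = defaultdict(list)
    for s in saveables:
        shards[device_of(s)].append(s["name"])
    return dict(shards)


# Hypothetical iterator saveables: a device is passed in for each one,
# but the serialized tensor derived from the resource has no device set.
ITERATORS = [
    {"name": "iter_a", "passed_device": "/device:CPU:0", "tensor_device": ""},
    {"name": "iter_b", "passed_device": "/device:CPU:1", "tensor_device": ""},
]

# Keying on the tensor's device puts every iterator in the '' shard.
by_tensor = shard_by_device(ITERATORS, lambda s: s["tensor_device"])

# Keying on the device passed in keeps one shard per device.
by_passed = shard_by_device(ITERATORS, lambda s: s["passed_device"])

if __name__ == "__main__":
    print(by_tensor)
    print(by_passed)
```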

PiperOrigin-RevId: 286310419
Change-Id: I1a957af783f7f69753992ce220b59eb43df2c02f
2019-12-18 19:11:04 -08:00
Amit Patankar
b66e4e833c Export the checkpoint reader classes and functions from C++ to Python with pybind11 instead of swig. This is part of a larger effort to deprecate swig and eventually with modularization break pywrap_tensorflow into smaller components. It will also make exporting C++ ops to Python significantly easier. XLA is using the pybind11 macros already. Please refer to https://github.com/tensorflow/community/blob/master/rfcs/20190208-pybind11.md for more information.
PiperOrigin-RevId: 279101529
Change-Id: I25502ed3d3718499abca41f5614681f41e4c7199
2019-11-07 09:52:50 -08:00
Abdullah Selek
8cc20365f7 Fix sanity errors.
2019-07-08 17:56:46 +01:00
Abdullah Selek
2d9851f9b0 Create an internal undeprecated function to check checkpoint exists.
2019-07-08 17:56:46 +01:00
Billy Lamberta
84bc93d748 Fix spelling in saver.py
PiperOrigin-RevId: 256414539
2019-07-03 16:53:55 -07:00
TensorFlower Gardener
17fd6c8585 Merge pull request from gehring:patch-3
PiperOrigin-RevId: 254078978
2019-06-19 15:41:14 -07:00
gehring
cbf76901b8 Added a warning about AutoTrackable
2019-06-10 15:52:13 -04:00
Vishnuvardhan Janapati
b45715074f Update saver.py
2019-06-05 14:18:56 -07:00
Vishnuvardhan Janapati
1730032d2f Update saver.py
2019-05-06 10:50:19 -07:00
Vishnuvardhan Janapati
f6aac99d18 Update saver.py
2019-05-03 15:42:47 -07:00
Goutham Bhat
a8cd934cde Gracefully handle missing checkpoints in recover_last_checkpoints
If some checkpoints present in CheckpointState are absent on disk, recover_last_checkpoints incorrectly initializes Saver internal state.
In this example:
(1) CheckpointState.all_model_checkpoint_paths = ['ckpt-1', 'ckpt-2', 'ckpt-3']
(2) Actual checkpoints on disk: ['ckpt-2', 'ckpt-3']
last_checkpoints gets incorrectly initialized to ['ckpt-1', 'ckpt-2']. This is because get_checkpoint_mtimes silently ignores any absent checkpoints and returns a list of length 2 corresponding to checkpoints on disk, which then gets zipped with (1). After the fix, last_checkpoints would be ['ckpt-2', 'ckpt-3'].
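
The failure mode in this example can be reproduced in a few lines of plain Python. This is a sketch under stated assumptions: `recover_buggy` and `recover_fixed` are hypothetical helpers that mimic the zip mismatch described above, and the mtime values are made up. The actual fix in saver.py may be implemented differently.

```python
def recover_buggy(all_paths, mtimes_on_disk):
    # mtimes_on_disk has one entry per checkpoint that exists, in order.
    # zip() stops at the shorter list, so the mtimes get paired with the
    # leading paths even when those are the ones missing from disk.
    return [path for path, _ in zip(all_paths, mtimes_on_disk)]


def recover_fixed(all_paths, mtime_by_path):
    # Keep only the paths that actually have an mtime, i.e. exist on disk.
    return [path for path in all_paths if path in mtime_by_path]


ALL_PATHS = ["ckpt-1", "ckpt-2", "ckpt-3"]    # (1) from CheckpointState
ON_DISK = {"ckpt-2": 20.0, "ckpt-3": 30.0}    # (2) hypothetical mtimes

if __name__ == "__main__":
    print(recover_buggy(ALL_PATHS, list(ON_DISK.values())))
    print(recover_fixed(ALL_PATHS, ON_DISK))
```

With these inputs the first call yields `['ckpt-1', 'ckpt-2']` and the second yields `['ckpt-2', 'ckpt-3']`, matching the before and after states in the commit message.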

PiperOrigin-RevId: 245983586
2019-04-30 11:28:50 -07:00
Mark Daoust
18b680216e Apply tf1->tf2 name replaces to doc-strings and comments in tensorflow.
No code changes, only doc-strings and comments.

PiperOrigin-RevId: 244275767
2019-04-18 17:19:27 -07:00
Allen Lavoie
bd36b48c55 Rename Checkpointable -> Trackable and AutoCheckpointable -> AutoTrackable
No API changes in this CL. Just more refactoring for a future API change.

PiperOrigin-RevId: 234242335
2019-02-15 17:38:56 -08:00
Nupur Garg
9e10fc685f Makes exporting MetaGraphDefs possible in TensorFlow 2.0.
PiperOrigin-RevId: 232980970
2019-02-07 18:00:39 -08:00
A. Unique TensorFlower
187bbff3bc Dump graph debug information when saving the meta_graph.
PiperOrigin-RevId: 231320436
2019-01-28 17:45:27 -08:00
TensorFlower Gardener
3377ad3dd4 Merge pull request from vidakDK:master
PiperOrigin-RevId: 224576050
2018-12-07 14:23:46 -08:00
Allen Lavoie
66ca3cd10d Add a functional saver, use it for object-based checkpointing
Pulls some utilities out of saver.py which are necessary to actually use it. The functional saver takes only SaveableObjects, so these are utilities for taking a list of whatever users pass in and converting them to those.

One other code move for object-based checkpointing to avoid circular imports.

Applications which need a SaverDef still use the old Saver. Serialization to SaverDef will be added to this saver in a followup.

Does not actually wrap the new Saver's methods in @tf.function yet, since there are memory issues which need to be fixed first.

PiperOrigin-RevId: 224561069
2018-12-07 12:52:23 -08:00
Vidak Kazic
b6a6296de5 Add example to import_meta_graph docstring
2018-12-05 15:01:56 +01:00
Allen Lavoie
4f3d18dc2f Add a deprecation warning to Saver when executing eagerly
This is probably long overdue. Saver has lots of subtle bugs which make it unpleasant.

PiperOrigin-RevId: 223598061
2018-11-30 17:10:22 -08:00
Allen Lavoie
6b52883918 A more useful error message for incompatible object-based checkpoints loaded using variable names
PiperOrigin-RevId: 223587442
2018-11-30 16:02:45 -08:00
A. Unique TensorFlower
45622121ac Change filename from constant to placeholder with default so that grappler can run on graph restoration.
PiperOrigin-RevId: 221205392
2018-11-12 20:44:57 -08:00
Allen Lavoie
2822f1351e Remove from the 2.x API tf.train.Saver and related symbols which rely on sessions/collections
Saver will be replaced by tf.train.Checkpoint (and tf.contrib.checkpoint.CheckpointManager) for training checkpoints, and by a simple Python representation of a SaverDef (which may not be a public symbol).

tf.train.Checkpoint does not write/merge sharded checkpoints at the moment, so v2 will want a solution for that (tf.train.ShardedCheckpoint?).

MetaGraph import and export will be replaced by object-based tf.saved_model.import/tf.saved_model.export.

PiperOrigin-RevId: 218262301
2018-10-22 17:25:42 -07:00
Chris Jones
ead4fda065 Fixes a bug in tf.train.Saver(), where classes using the VARIABLE_VALUE_KEY used different naming in the checkpoint file when var_list was a dict.
PiperOrigin-RevId: 217182136
2018-10-15 12:18:32 -07:00
A. Unique TensorFlower
cb926e1ed7 Fixes a bug in tf.train.Saver() where Checkpointable objects couldn't be used if var_list was a dict.

Includes the logic used for list in the dict code path.

PiperOrigin-RevId: 214324913
2018-09-24 13:58:54 -07:00
Allen Lavoie
b23df6e502 Automated rollback of commit 91fd2cd6c3
PiperOrigin-RevId: 209433774
2018-08-20 09:56:23 -07:00
Allen Lavoie
91fd2cd6c3 Automated rollback of commit 45aad1a422
PiperOrigin-RevId: 209168291
2018-08-17 10:31:05 -07:00
Allen Lavoie
45aad1a422 Fix for PartitionedVariables in collections
Makes sure there are read ops in the graph, to avoid errors on MetaGraph restore.

PiperOrigin-RevId: 209158129
2018-08-17 09:23:05 -07:00
Mark Daoust
e4371880b1 Remove magic-doc-links from code.
This change contains no code changes. Only doc-strings.

We can't use relative links in code files, so we don't have much choice but to link to tensorflow.org/

The deleted links were to docs that no longer exist.

PiperOrigin-RevId: 209019572
2018-08-16 12:10:03 -07:00
Katherine Wu
f6be7aadb4 Pull out code that reads an object graph and a saver with remapped variables into separate functions.
PiperOrigin-RevId: 207981685
2018-08-08 18:50:48 -07:00
Allen Lavoie
200fa71857 Add some symbols back to saver.py temporarily to unbreak some users of non-public TF APIs
PiperOrigin-RevId: 207197647
2018-08-02 17:57:38 -07:00
Allen Lavoie
1bf206bc82 Split checkpoint management utility functions out of saver.py
Pure refactor, in preparation for adding a higher level checkpoint management utility. This utility will also need to work with the Checkpoint proto, and globbing it on to saver.py seems dirty.

PiperOrigin-RevId: 207179646
2018-08-02 15:51:17 -07:00
Katherine Wu
9671996af3 Add estimator in contrib that loads its model function from a SavedModel.
PiperOrigin-RevId: 206048542
2018-07-25 13:54:23 -07:00
Jacker
0c11bcb5f3 Update saver.py
Fix device placement of save_op for ResourceVariable.
2018-07-20 10:09:16 +08:00
A. Unique TensorFlower
8f130ff5b0 Fix ResourceVariable placement during checkpointing to correctly colocate the copy of the variable on the same machine. Addresses Issue .

PiperOrigin-RevId: 205317119
2018-07-19 16:07:36 -07:00
Karmel Allison
0979821324 Add more helpful error messages when restoring from checkpoint fails.
PiperOrigin-RevId: 202668227
2018-06-29 10:32:46 -07:00
A. Unique TensorFlower
f596bcc786 Remove dead code from bulk_restore() but keep dead function parameter for backward-compatibility.
PiperOrigin-RevId: 200587926
2018-06-14 11:21:58 -07:00
Goutham Bhat
52af244989 Factor out tf.train.remove_checkpoint utility function.
PiperOrigin-RevId: 200276735
2018-06-12 14:11:18 -07:00
A. Unique TensorFlower
dc7821ccf4 Apply import_scope to asset and variable tensors during tf.saved_model.loader.load
This change explicitly declares import_scope as a kwarg for tf.saved_model.loader.load. Previously, tf.saved_model.loader.load implicitly accepted import_scope and passed it through to import_meta_graph through **saver_kwargs.

PiperOrigin-RevId: 200249417
2018-06-12 11:28:58 -07:00
Yifei Feng
b59833c3fd Merge changes from github.
Revert . Too many internal test failures due to the name scope change caused by this change.
Revert . Cannot use re2::StringPiece internally. Need alternative for set call. Will pull and clean this up in a separate change.

PiperOrigin-RevId: 197991247
2018-05-24 19:15:01 -07:00
Allen Lavoie
b28938c367 Remove object-based checkpointing probes from Python 3 tf.train.Saver "name not found" stack traces
PiperOrigin-RevId: 197473101
2018-05-21 15:45:42 -07:00
Allen Lavoie
da600975c1 Checkpointable: move python/training/checkpointable_* to python/training/checkpointable/
Need to add some new checkpointable files in core (specifically I had some checkpointable data structures in mind), and prefixing more files with "checkpointable_" in python/training/ seems dirty.

No functional changes, just some branching and build/import fiddling.

PiperOrigin-RevId: 196883136
2018-05-16 13:55:34 -07:00
Allen Lavoie
ef58a46b73 Support saving Python state with object-based checkpoints
Allows SaveableObjects to specify feed dict addition callbacks for object-based saving.

For now just saves get_config() with Layers. Doesn't do any loading, and there isn't quite enough information to reconstruct a Model yet (needs topology).

My plan is to get Models to the point where they can be reconstructed from object-based checkpoints (probably one more change), add in SavedModel export (assuming no dynamic control flow for now), then add this "SavedModel+Python" format to Model.save / load_model.

PiperOrigin-RevId: 196043183
2018-05-09 15:59:21 -07:00
Allen Lavoie
236120d32d Split out SaveableObjects into their own file
Pulls a couple build rules out of tensorflow/python:training. I'd like to use a SaveableObject in :checkpointable (for saving some Python state by default), which means the file with SaveableObject has to be essentially dependency-free.

PiperOrigin-RevId: 194473987
2018-04-26 16:42:50 -07:00
Allen Lavoie
5ec3b021fd Add tf.train.Checkpoint for reading and writing object-based checkpoints.
Previously exposed as tf.contrib.eager.Checkpoint / tfe.Checkpoint.

Spiffies up the documentation a bit, but otherwise just adds the export decorator.

Compatible in both directions with tf.train.Saver (object-based checkpoints can be fed to tf.train.Saver, and name-based checkpoints can be fed to tf.train.Checkpoint).

PiperOrigin-RevId: 193439442
2018-04-18 16:51:33 -07:00
Allen Lavoie
8600d918a6 Allow tf.train.Saver to load object-based checkpoints (using names)
This is the second part of the compatibility story. Object-based checkpointing APIs can already read name-based checkpoints, and now the name-based APIs can read object-based checkpoints by looking up the modified keys in the object graph proto.
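
A rough sketch of the lookup described above, using plain dicts in place of the object graph proto. The layout (`nodes`, `attributes`, `full_name`, `checkpoint_key`) and the example key strings are assumptions made for illustration; the helper `names_to_keys` is hypothetical and is not the function in saver.py.

```python
def names_to_keys(object_graph):
    """Map each variable's name-based key to its object-based key."""
    mapping = {}
    for node in object_graph["nodes"]:
        for attr in node.get("attributes", []):
            mapping[attr["full_name"]] = attr["checkpoint_key"]
    return mapping


# Hypothetical object graph with two saved variables and a root node
# that carries no attributes of its own.
OBJECT_GRAPH = {
    "nodes": [
        {"attributes": []},
        {"attributes": [{
            "full_name": "dense/kernel",
            "checkpoint_key": "layer/kernel/.ATTRIBUTES/VARIABLE_VALUE",
        }]},
        {"attributes": [{
            "full_name": "dense/bias",
            "checkpoint_key": "layer/bias/.ATTRIBUTES/VARIABLE_VALUE",
        }]},
    ]
}

if __name__ == "__main__":
    lookup = names_to_keys(OBJECT_GRAPH)
    # A name-based reader asks for "dense/kernel" and is redirected to
    # the modified key under which the tensor was actually stored.
    print(lookup["dense/kernel"])
```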

PiperOrigin-RevId: 192824907
2018-04-13 14:35:26 -07:00