To ensure that it is safe to replace the copy with a const reference, the following heuristic is employed:
- The loop variable is const qualified.
- The loop variable is not const, but only const methods or operators are invoked on it, or it is used as a const reference or value argument in constructors or function calls.
PiperOrigin-RevId: 305169937
Change-Id: I682f40e98a074f074332e6e4d0d47575c9909286
Need to special-case variant dtypes, since they require using zeros_like.
Fixes forwardprop of functions containing control flow.
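A minimal sketch of the case this targets, assuming the tf.autodiff.ForwardAccumulator name from current releases (the API at the time of this change may differ):

    import tensorflow as tf

    @tf.function
    def f(x):
      # Control flow inside the traced function.
      return tf.cond(x > 0., lambda: x * x, lambda: -x)

    x = tf.constant(2.)
    with tf.autodiff.ForwardAccumulator(
        primals=x, tangents=tf.constant(1.)) as acc:
      y = f(x)
    print(acc.jvp(y))  # d(x*x)/dx at x=2 -> 4.0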
PiperOrigin-RevId: 272938367
We were trying to use the EagerTensor variable handle inside the function. This change stops doing that, and also makes it not segfault if it ever does.
A couple of miscellaneous fixes to the test infrastructure to support the new tests, e.g. the gradient checker didn't like unconnected gradients.
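For reference, this is what unconnected gradients look like at the tape API level (a sketch using the public tf.UnconnectedGradients enum; the gradient-checker fix itself is internal to the test infrastructure):

    import tensorflow as tf

    x = tf.constant(1.)
    y = tf.constant(2.)
    with tf.GradientTape(persistent=True) as tape:
      tape.watch([x, y])
      z = x * x  # z is not connected to y
    # An unconnected source yields None by default, or zeros on request.
    print(tape.gradient(z, [x, y]))  # [2.0, None]
    print(tape.gradient(z, [x, y],
                        unconnected_gradients=tf.UnconnectedGradients.ZERO))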
PiperOrigin-RevId: 266958407
The backward functions were returning [] instead of a list of Nones, and the tape was giving an unhelpful error message.
(Akshay tracked this down and came up with the test, which was certainly most of the effort.)
PiperOrigin-RevId: 266273603
More work toward adding a function special case. An accumulator triggers function-building, then needs to work on symbolic tensors captured by the function before returning to its original task.
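A sketch of the capture case described above, assuming the public tf.autodiff.ForwardAccumulator spelling:

    import tensorflow as tf

    x = tf.constant(3.)
    with tf.autodiff.ForwardAccumulator(x, tf.constant(1.)) as acc:

      @tf.function
      def g(y):
        return x * y  # x is captured as a symbolic tensor during tracing

      z = g(tf.constant(2.))
    print(acc.jvp(z))  # dz/dx = y = 2.0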
PiperOrigin-RevId: 265120491
Allows the unused backward computation to be pruned out.
Does not change custom_gradient or function forward-mode computations.
Some fiddling with the memory checking on the unit tests, since tf.function creates persistent symbolic Tensors the first time it's called. This means we need to do warmup runs and ignore Tensors allocated there.
Forward gradients still need some followups after this:
- Functions should have a special-cased forward function so that they're efficient when executing eagerly.
- Watching variables on an accumulator should be possible.
After those, the remaining case is custom gradients, which are probably fine to leave as they are for now (they work; they're just a bit less efficient than they could be if the user provided a jvp or told us the code was safe to wrap in a tf.function).
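For illustration, this is what custom gradients under forward accumulation look like today; the jvp falls out of differentiating the registered backward function, which is why a user-supplied jvp would be cheaper (sketch assumes the tf.autodiff.ForwardAccumulator spelling):

    import tensorflow as tf

    @tf.custom_gradient
    def log1pexp(x):
      e = tf.exp(x)
      def grad(upstream):
        return upstream * (1 - 1 / (1 + e))
      return tf.math.log(1 + e), grad

    x = tf.constant(1.)
    with tf.autodiff.ForwardAccumulator(x, tf.constant(1.)) as acc:
      y = log1pexp(x)
    print(acc.jvp(y))  # sigmoid(1.) ~= 0.731, via the backward function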
From //tensorflow/python/eager:benchmarks_test:
benchmark_forwardprop_in_defun_matmul_256_by_2096_CPU 487 examples/second no change
benchmark_forwardprop_matmul_256_by_2096_CPU 406 examples/second 1.6x speedup
benchmark_forwardprop_of_defun_matmul_256_by_2096_CPU 176 examples/second no change
benchmark_forwardprop_in_defun_matmul_100_by_784_CPU 2872 examples/second no change
benchmark_forwardprop_matmul_100_by_784_CPU 1766 examples/second 1.4x speedup
benchmark_forwardprop_of_defun_matmul_100_by_784_CPU 909 examples/second no change
PiperOrigin-RevId: 257832992
One fix for non-trainable target handling when using a tape. Arrays passed to the tape could previously be misaligned.
Also fixes the order of op recording so that "outer" accumulators record ops first, and so can always trace inner accumulators' jvp computations if necessary.
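The ordering matters when accumulators are nested for higher-order forward-mode derivatives, e.g. (assuming the public tf.autodiff.ForwardAccumulator spelling):

    import tensorflow as tf

    x = tf.constant(2.)
    with tf.autodiff.ForwardAccumulator(x, tf.constant(1.)) as outer:
      with tf.autodiff.ForwardAccumulator(x, tf.constant(1.)) as inner:
        y = x ** 3.
      # The ops that produce the inner jvp run while `outer` is recording.
      dy_dx = inner.jvp(y)      # 3 * x**2 == 12
    d2y_dx2 = outer.jvp(dy_dx)  # 6 * x    == 12
    print(dy_dx.numpy(), d2y_dx2.numpy())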
PiperOrigin-RevId: 257497741
Adds a ForwardGradientAccumulator, which hooks into similar places as GradientTape. This keeps track of JVPs as ops are executed and cleans up its JVPs when the Tensors are deleted.
Implemented using the double-gradient trick, which is fine when the forward accumulation is wrapped in a function (since it's a double symbolic gradient) but is not great when executing eagerly.
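A rough sketch of the double-gradient trick with nested GradientTapes (illustrative only, not the accumulator's actual code): run one backward pass against a dummy cotangent u, then differentiate that backward pass with respect to u to recover the jvp.

    import tensorflow as tf

    def jvp_via_double_grad(f, x, v):
      with tf.GradientTape() as outer:
        with tf.GradientTape() as inner:
          inner.watch(x)
          y = f(x)
        u = tf.ones_like(y)  # dummy cotangent; its value cancels out
        outer.watch(u)
        vjp = inner.gradient(y, x, output_gradients=u)   # J^T u
      return outer.gradient(vjp, u, output_gradients=v)  # J v

    x = tf.constant([1., 2.])
    print(jvp_via_double_grad(tf.sin, x, tf.constant([1., 0.])))  # cos(x) * v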
Some things that are not working yet:
- Watching variables (Tensors only at the moment)
- When executing eagerly, we currently waste computation executing the gradient function. This is unavoidable with our current custom gradient API, but functions and our ops can get special cases.
- Functions in particular really need a special case, since currently their memory use is the same as backprop when executed eagerly (I believe we end up running the function's forward-with-outputs-for-gradients version).
- Re-using forwardprop objects / adding them back to the active set
PiperOrigin-RevId: 253841138
Gradient functions can selectively perform less work when this happens. This is implemented for Add, MatMul, Maximum and Minimum.
Also update the tape code to not generate zeros tensors until they are required.
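The lazy-zeros idea, as a hypothetical Python sketch (the actual change lives in the tape internals; LazyZeros and materialize are illustrative names):

    import tensorflow as tf

    class LazyZeros:
      """Placeholder for an all-zeros incoming gradient (illustrative only)."""

      def __init__(self, shape, dtype):
        self._shape = shape
        self._dtype = dtype

      def materialize(self):
        # Only pay for the allocation if a gradient function reads the value.
        return tf.zeros(self._shape, self._dtype)

    incoming = LazyZeros((1024, 1024), tf.float32)
    # Gradient functions that ignore this incoming gradient never allocate it;
    # the ones that need it call incoming.materialize().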
PiperOrigin-RevId: 240172300
This seems to perform no worse on small benchmarks, and considerably better on some larger ones.
Ideally we'd use absl::flat_hash_{map,set}, but those don't compile with the Python 3.4 Python.h header. This bug is tracked in https://github.com/abseil/abseil-cpp/issues/235
PiperOrigin-RevId: 239280970
Before this change, it was easy to forget the [] around the source
tensor. This mistake led to GradientTape.gradient() returning a list
of Nones. Nones normally tell the user that the source and the target
are not connected via differentiable operations, which is not the
cause of the error in this case.
Instead of adding a check that `sources` is a list of tensors, this CL
adds the ability to handle a structured `sources` argument (which
includes a lone tensor), similarly to many existing TensorFlow APIs.
Also, with Alex's help, it fixes a bug where repeated tensors in
`sources` were not handled correctly.
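Roughly, with the modern tf.GradientTape spelling assumed:

    import tensorflow as tf

    x = tf.Variable(3.)
    with tf.GradientTape(persistent=True) as tape:
      y = x * x
    print(tape.gradient(y, x))    # a lone source -> a lone gradient (6.0)
    print(tape.gradient(y, [x]))  # a list source -> a list of gradients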
PiperOrigin-RevId: 190878583
1. Using _shape_tuple
2. Bypassing the overloaded * operator in favor of math_ops.mul etc.
3. Flat maps in the tape code
4. A cache for ones, similar to the existing one for zeros (sketched after this list)
5. Fast path for _SubGrad
6. Fast global_step += 1 for resource variables
7. Bypassing deprecated args decorator in eager mode
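A minimal sketch of the ones cache from item 4 (illustrative only; cached_ones is a hypothetical helper, not the real implementation):

    import functools
    import tensorflow as tf

    @functools.lru_cache(maxsize=None)
    def cached_ones(shape, dtype):
      # Memoize constant ones tensors by (shape, dtype), mirroring the
      # existing zeros cache.
      return tf.ones(shape, dtype=dtype)

    a = cached_ones((2, 3), tf.float32)
    assert cached_ones((2, 3), tf.float32) is a  # served from the cache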
PiperOrigin-RevId: 183446593
Added two simple tests for persistent tapes, and did a manual test
that calling "del" on a gradient tape releases all tensors.
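The persistent-tape behaviour covered by those tests, roughly (modern tf.GradientTape spelling assumed):

    import tensorflow as tf

    x = tf.constant(3.)
    with tf.GradientTape(persistent=True) as tape:
      tape.watch(x)
      y = x * x
      z = y * y
    # A persistent tape can compute several gradients before it is released.
    print(tape.gradient(z, x))  # 4 * x**3 == 108
    print(tape.gradient(y, x))  # 2 * x    == 6
    del tape  # releases the tensors the tape is holding on to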
Also:
- Add missing Py_DECREF to error case in MakeTensorIDList
- Make a couple error messages more descriptive
PiperOrigin-RevId: 176718477