Commit Graph

45 Commits

Author SHA1 Message Date
A. Unique TensorFlower
ae0a9a4461 This CL optimizes C++11 range-based for loops where the variable is copied in each iteration but it would suffice to obtain it by const reference. This is only applied to loop variables of types that are expensive to copy which means they are not trivially copyable or have a non-trivial copy constructor or destructor.
To ensure that it is safe to replace the copy with a const reference the following heuristic is employed:
  The loop variable is const qualified.
  The loop variable is not const, but only const methods or operators are invoked on it, or it is used as const reference or value argument in constructors or function calls.

PiperOrigin-RevId: 305169937
Change-Id: I682f40e98a074f074332e6e4d0d47575c9909286
2020-04-06 19:54:28 -07:00
Kazuaki Ishizaki
27643b326c minor spelling tweaks 2020-01-16 14:36:52 +09:00
Sergei Lebedev
7b094e93b9 Added missing unordered_{map,set} includes to tape.h
PiperOrigin-RevId: 280218429
Change-Id: I51024998ef3f85e1f99fface6f6f36a1443ef031
2019-11-13 10:46:19 -08:00
Allen Lavoie
6077ea44e3 Fix tape/accumulator variant handling
Need to special-case variant dtypes since they require using zeros_like

Fixes forwardprop of functions containing control flow

PiperOrigin-RevId: 272938367
2019-10-04 19:03:57 -07:00
Allen Lavoie
7403920848 Forwardprop: fix variables inside functions, test with layers
We were trying to use the EagerTensor variable handle inside the function. This change stops doing that, and also makes it not segfault if it were to do that.

A couple of misc. fixes to test infrastructure to support the new tests, e.g. the gradient checker didn't like unconnected gradients.
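As a hedged illustration using the accumulator API that later became public as tf.autodiff.ForwardAccumulator (the model and values here are illustrative, not the tests from this change):

import tensorflow as tf

w = tf.Variable([[1.0], [4.0]])
x = tf.constant([[2.0, 3.0]])

@tf.function
def model(inputs):
  return tf.matmul(inputs, w)  # the variable handle is captured by the traced function

# Forward-mode accumulation through a function that captures a variable.
with tf.autodiff.ForwardAccumulator(primals=w, tangents=tf.ones_like(w)) as acc:
  y = model(x)
print(acc.jvp(y))  # x @ ones_like(w) = [[5.0]]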

PiperOrigin-RevId: 266958407
2019-09-03 10:23:42 -07:00
Allen Lavoie
ea5477fe59 Fix recording of functions with no connected gradients
The backward functions were returning [] instead of a list of Nones, and the tape was giving an unhelpful error message.

(Akshay tracked this down and came up with the test, which for sure was most of the effort)
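Roughly the situation being fixed, sketched with today's public API (the function and values are illustrative):

import tensorflow as tf

@tf.function
def f(unused, y):
  return y * y  # `unused` has no connected gradient

x = tf.constant(1.0)
y = tf.constant(2.0)
with tf.GradientTape() as tape:
  tape.watch([x, y])
  z = f(x, y)

# The recorded backward function must report None for the unconnected
# input rather than returning an empty list.
print(tape.gradient(z, [x, y]))  # [None, 4.0]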

PiperOrigin-RevId: 266273603
2019-08-29 18:05:15 -07:00
Allen Lavoie
62d8504982 Forwardprop: Switch to a deque-backed stack to avoid element invalidation on reallocation
Fixes an asan use-after-free issue

PiperOrigin-RevId: 265487652
2019-08-26 11:19:12 -07:00
Allen Lavoie
e9ff15d98a Forwardprop: Add utilities for temporarily pushing forward accumulator state
More work toward adding a function special case. An accumulator triggers function-building, then needs to work on symbolic tensors captured by the function before returning to its original task.

PiperOrigin-RevId: 265120491
2019-08-23 13:43:32 -07:00
Allen Lavoie
012a1167d2 Forwardprop: Ensure that inner nested accumulators don't see outer accumulators' jvps
Just for consistency; apparently this was a difference between function-wrapped and non-function-wrapped accumulation.
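With the accumulator API that later became public, nesting looks like this: the outer accumulator can take the JVP of the inner accumulator's JVP (forward-over-forward), while the inner one never records the outer one's JVP computations (a sketch; values are illustrative):

import tensorflow as tf

x = tf.constant(3.0)
with tf.autodiff.ForwardAccumulator(x, tf.constant(1.0)) as outer:
  with tf.autodiff.ForwardAccumulator(x, tf.constant(1.0)) as inner:
    y = x * x * x
    dy = inner.jvp(y)   # 3 * x**2 = 27.0
  d2y = outer.jvp(dy)   # 6 * x = 18.0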

PiperOrigin-RevId: 260979468
2019-07-31 13:19:51 -07:00
Allen Lavoie
68e2db3b04 Automated rollback of commit acab6a2051
PiperOrigin-RevId: 258246018
2019-07-15 15:19:21 -07:00
Brian Zhao
4f0cfd0b18 Automated rollback of commit acab6a2051
PiperOrigin-RevId: 257896409
2019-07-12 17:24:17 -07:00
Allen Lavoie
acab6a2051 Use a tf.function to more efficiently compute op jvps
Allows the unused backward computation to be pruned out.

Does not change custom_gradient or function forward-mode computations.

Some fiddling with the memory checking on the unit tests, since tf.function creates persistent symbolic Tensors the first time it's called. This means we need to do warmup runs and ignore Tensors allocated there.

Forward gradients still need some followups after this:
  - Functions should have a special-cased forward function so that they're efficient when executing eagerly.
  - Watching variables on an accumulator should be possible

After those the remaining case is custom gradients, which are probably fine to leave as they are for now (they work, they're just a bit less efficient than they could be if the user provided a jvp or told us the code was safe to wrap in a tf.function).

From //tensorflow/python/eager:benchmarks_test:

benchmark_forwardprop_in_defun_matmul_256_by_2096_CPU 487 examples/second no change
benchmark_forwardprop_matmul_256_by_2096_CPU          406 examples/second 1.6x speedup
benchmark_forwardprop_of_defun_matmul_256_by_2096_CPU 176 examples/second no change

benchmark_forwardprop_in_defun_matmul_100_by_784_CPU 2872 examples/second no change
benchmark_forwardprop_matmul_100_by_784_CPU          1766 examples/second 1.4x speedup
benchmark_forwardprop_of_defun_matmul_100_by_784_CPU  909 examples/second no change
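One reading of the three benchmark variants, sketched with the later-public tf.autodiff.ForwardAccumulator (the mapping of benchmark names to usage patterns is an assumption):

import tensorflow as tf

m = tf.random.normal([100, 784])
tangent = tf.ones_like(m)

@tf.function
def matmul_fn(a):
  return tf.matmul(a, a, transpose_b=True)

# forwardprop_matmul: accumulator and op both execute eagerly.
with tf.autodiff.ForwardAccumulator(m, tangent) as acc:
  eager_jvp = acc.jvp(tf.matmul(m, m, transpose_b=True))

# forwardprop_of_defun: eager accumulator around a tf.function call.
with tf.autodiff.ForwardAccumulator(m, tangent) as acc:
  of_defun_jvp = acc.jvp(matmul_fn(m))

# forwardprop_in_defun: the accumulation itself is traced, so the unused
# reverse-mode branch can be pruned from the graph.
@tf.function
def in_defun_jvp(a, t):
  with tf.autodiff.ForwardAccumulator(a, t) as acc:
    return acc.jvp(matmul_fn(a))

in_defun_result = in_defun_jvp(m, tangent)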

PiperOrigin-RevId: 257832992
2019-07-12 11:14:19 -07:00
Allen Lavoie
166662af04 Forward gradients: fixes for higher-order forward-only gradients
One fix for non-trainable target handling when using a tape. Arrays passed to the tape could previously be misaligned.

Also fixes the order of op recording so that "outer" accumulators record ops first and so can always trace inner accumulators' jvp computation if necessary.

PiperOrigin-RevId: 257497741
2019-07-10 16:13:51 -07:00
Allen Lavoie
29b284e539 Another attempt to placate the Windows build
PiperOrigin-RevId: 254215759
2019-06-20 09:59:59 -07:00
Allen Lavoie
835c658a0a Forward gradients: try to placate the Windows compiler
PiperOrigin-RevId: 254076778
2019-06-19 15:10:55 -07:00
Allen Lavoie
e985e958b3 Keep a constant reference to VSpace in ForwardAccumulator instead of a constant pointer
Also checks that one was already registered instead of segfaulting.

PiperOrigin-RevId: 253879165
2019-06-18 15:29:02 -07:00
Allen Lavoie
8e19a3f3df A start on eager-friendly forward-mode autodiff. No public symbols yet.
Adds a ForwardGradientAccumulator, which hooks into similar places as GradientTape. This keeps track of JVPs as ops are executed and cleans up its JVPs when the Tensors are deleted.

Implemented using the double-gradient trick, which is fine when the forward accumulation is wrapped in a function (since it's double-symbolic-gradient) but is not great executing eagerly.
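A minimal sketch of the trick as I read it: the JVP is obtained from two reverse-mode passes, with a dummy cotangent whose value never matters because the result is linear in it.

import tensorflow as tf

def jvp_via_double_vjp(f, x, tangent):
  # Forward-mode JVP built from two reverse-mode passes.
  with tf.GradientTape() as outer:
    with tf.GradientTape() as inner:
      inner.watch(x)
      y = f(x)
    u = tf.zeros_like(y)  # dummy cotangent
    outer.watch(u)
    # vjp = J^T u, recorded on the outer tape as a function of u.
    vjp = inner.gradient(y, x, output_gradients=u)
  # Differentiating J^T u with respect to u and contracting with the
  # tangent yields J @ tangent.
  return outer.gradient(vjp, u, output_gradients=tangent)

x = tf.constant(2.0)
print(jvp_via_double_vjp(lambda t: t ** 3, x, tf.constant(1.0)))  # 3 * x**2 = 12.0

Executed eagerly, the ops of the unused reverse branch still run; traced into a function they can be pruned, which is the motivation for the tf.function change further up this log.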

Some things that are not working yet:
 - Watching variables (Tensors only at the moment)
 - Executing eagerly we currently waste computation executing the gradient function. This is unavoidable with our current custom gradient API, but functions and our ops can get special cases.
 - Functions in particular really need a special case, since currently their memory use is the same as backprop when executed eagerly (since I believe we end up running the function's forward-with-outputs-for-gradients version).
 - Re-using forwardprop objects / adding them back to the active set

PiperOrigin-RevId: 253841138
2019-06-18 12:36:23 -07:00
Akshay Modi
27caaad8ee Include a tuple specifying if any gradient is unnecessary.
Gradient functions can selectively perform less work when this happens. This is implemented for Add, MatMul, Maximum and Minimum.

Also update the tape code to not generate zeros tensors until they are required.

PiperOrigin-RevId: 240172300
2019-03-25 10:49:29 -07:00
Akshay Modi
70d112d353 std::unordered_map, unordered_set instead of gtl::FlatMap, FlatSet
This seems to perform no worse on small benchmarks, and considerably better on some larger ones.

Ideally we'd use absl::flat_hash_{map,set} but those don't compile with the python3.4 Python.h header. This bug is tracked in https://github.com/abseil/abseil-cpp/issues/235

PiperOrigin-RevId: 239280970
2019-03-19 16:17:50 -07:00
Alexandre Passos
710b322a8b Fixes , a crash with gradient tape and invalid states.
PiperOrigin-RevId: 236858702
2019-03-05 09:11:05 -08:00
A. Unique TensorFlower
20df63d2f3 This CL fixes a bug where GradientTapes returned an unconnected gradient for no-ops on watched tensors instead of returning a ones tensor.
PiperOrigin-RevId: 221852847
2018-11-16 14:35:53 -08:00
Akshay Modi
c7237e6070 Don't generate the backward function and delete it when it's not necessary
PiperOrigin-RevId: 215288224
2018-10-01 15:16:07 -07:00
Akshay Modi
c3014ec19e Allow the tape tensor to have unknown shapes.
This is done by making the TapeTensor a template rather than a concrete struct.

PiperOrigin-RevId: 213700425
2018-09-19 14:57:27 -07:00
Akshay Modi
1ede512f8c Remove some dead code after migration from python to C.
PiperOrigin-RevId: 213372027
2018-09-17 18:02:52 -07:00
Akshay Modi
862d753d4e Skip zeros call if unrequired in backprop for SparseSoftmaxCrossEntropyWithLogits
See 065f9b833f/tensorflow/python/ops/nn_grad.py (L482)

Should help with 

PiperOrigin-RevId: 210933185
2018-08-30 10:27:38 -07:00
Akshay Modi
6732ec3dff Skip calling back into python if only 1 gradient to aggregate
PiperOrigin-RevId: 203786896
2018-07-09 10:26:44 -07:00
Akshay Modi
c0dd400f43 Remove _get_backward_fn and depend on _gradient_function directly.
(_magic_gradient_function was renamed to _gradient_function)

Before:
entry {
  name: "MicroBenchmarks.benchmark_tf_gradient_forward_identity"
  iters: 30000
  wall_time: 5.88456789653
  extras {
    key: "examples_per_sec"
    value {
      double_value: 169936.011885
    }
  }
}

After:
entry {
  name: "MicroBenchmarks.benchmark_tf_gradient_forward_identity"
  iters: 30000
  wall_time: 5.04853725433
  extras {
    key: "examples_per_sec"
    value {
      double_value: 198077.175551
    }
  }
}
PiperOrigin-RevId: 197972668
2018-05-24 16:23:23 -07:00
Alexandre Passos
753cc5b3f7 Fixes issue with gradient tape when asking for the gradient of an intermediate tensor.
PiperOrigin-RevId: 197481473
2018-05-21 16:39:58 -07:00
Alexandre Passos
8a8dddf8bd Do not differentiate integers in the eager backprop API.
(with bugfix)
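A small illustration with today's GradientTape (an assumption about the user-visible behaviour; watching a non-floating tensor warns rather than erroring):

import tensorflow as tf

x = tf.constant(3)  # integer tensor
with tf.GradientTape() as tape:
  tape.watch(x)     # integer dtypes are not trainable
  y = tf.cast(x, tf.float32) ** 2
print(tape.gradient(y, x))  # None: integers are not differentiated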

PiperOrigin-RevId: 196184587
2018-05-10 15:56:58 -07:00
Asim Shankar
e696dc1bd0 Automated g4 rollback of changelist 195878952
PiperOrigin-RevId: 196127751
2018-05-10 09:41:04 -07:00
Alexandre Passos
f58effe44d Do not differentiate integers in the eager API.
This is similar to the change made in:
f637506458
for backpropagation during graph construction via tf.gradients()

PiperOrigin-RevId: 195878952
2018-05-08 15:52:59 -07:00
Alexandre Passos
aa2405ee79 Fixes to tape gradient for providing outputs and having multiple targets.
PiperOrigin-RevId: 194796304
2018-04-30 09:32:36 -07:00
Igor Ganichev
3f7adc7104 Support structured source in GradientTape.gradient
Before this change, it was easy to forget the [] around the source tensor.
This mistake led to GradientTape.gradient() returning a list of Nones.
Nones normally tell the user that the source and the target are
not connected via differentiable operations, which is not the cause
of the error in this case.

Instead of adding a check that `sources` is a list of tensors, this CL
adds the ability to handle a structured source (which includes a lone tensor),
similarly to many existing TensorFlow APIs.

Also, with Alex's help, it fixes a bug where repeated tensors in
`sources` were not handled correctly.
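With today's API the difference looks roughly like this (persistent=True is only there to allow two gradient calls; the dict is one example of a nested structure):

import tensorflow as tf

x = tf.constant(3.0)
with tf.GradientTape(persistent=True) as tape:
  tape.watch(x)
  y = x * x

# A lone tensor as `sources` now works; forgetting the brackets used to
# silently return [None] instead of the gradient.
print(tape.gradient(y, x))                  # 6.0
# Nested structures are mirrored in the result, and repeated tensors
# are handled correctly.
print(tape.gradient(y, {'a': x, 'b': x}))   # {'a': 6.0, 'b': 6.0}
del tape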

PiperOrigin-RevId: 190878583
2018-03-28 20:53:48 -07:00
Alexandre Passos
72a0b0efbc Clarifying when it is possible to use a tape while it is still active.
PiperOrigin-RevId: 189260773
2018-03-15 16:10:15 -07:00
Alexandre Passos
b45c09ce71 Improvements to the eager linear regression benchmark:
  1. Using _shape_tuple
  2. Bypassing * over math_ops.mul etc.
  3. FlatMaps in the tape code
  4. Cache for ones, similar to the one for zeros
  5. Fast path for _SubGrad
  6. Fast global_step += 1 for resource variables
  7. Bypassing the deprecated-args decorator in eager mode

PiperOrigin-RevId: 183446593
2018-01-26 14:41:18 -08:00
A. Unique TensorFlower
e4532d2097 Merge changes from github.
PiperOrigin-RevId: 179953488
2017-12-22 12:46:28 -08:00
Allen Lavoie
8d3690c564 Plug an eager memory leak, add tests for reference counts.
There are still some slightly less serious leaks. Will follow up with a fix once I track those down.

PiperOrigin-RevId: 179220052
2017-12-15 11:43:21 -08:00
Alexandre Passos
508abd1ea3 None gradients should trigger stopping traversal of the backward graph in tape gradients.
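The user-visible behaviour being preserved, sketched with tf.stop_gradient (one way a None gradient appears on the backward path):

import tensorflow as tf

x = tf.constant(2.0)
with tf.GradientTape() as tape:
  tape.watch(x)
  # The stop_gradient branch contributes a None gradient, so backprop
  # does not continue past it into x along that path.
  y = tf.stop_gradient(x * x) * x
print(tape.gradient(y, x))  # 4.0: only the direct x factor contributes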
PiperOrigin-RevId: 177662732
2017-12-01 17:31:40 -08:00
Igor Ganichev
4c2ca8b0cb Fix typo in GradientTape.persistent_ comment.
PiperOrigin-RevId: 177268420
2017-11-28 22:52:09 -08:00
Igor Ganichev
58d0e5b6c3 Add persistent GradientTape support
Added two simple tests for persistent tapes and did a manual test
that calling "del" on gradient tape releases all tensors.

Also:
 - Add missing Py_DECREF to error case in MakeTensorIDList
 - Make a couple error messages more descriptive
PiperOrigin-RevId: 176718477
2017-11-22 17:23:07 -08:00
Alexandre Passos
775b496167 Do not swallow exceptions in gradient functions in eager.
PiperOrigin-RevId: 176422128
2017-11-20 15:02:51 -08:00
Alexandre Passos
d93d985a73 The Tensor template argument to GradientTape was unnecessary.
PiperOrigin-RevId: 175225805
2017-11-10 16:14:40 -08:00
Alexandre Passos
c5a7366bfe Removes void*s from the tape gradient code, replacing with templates.
PiperOrigin-RevId: 175155685
2017-11-10 16:14:38 -08:00
Alexandre Passos
2545c4e93b Moves imperative_grad to C
Neutral-to-positive on all benchmarks. Also reduces overhead of should_record.

PiperOrigin-RevId: 175057104
2017-11-10 16:14:37 -08:00
Alexandre Passos
0bbdeaf45e Ports the eager gradient tape to C.
The tape stack is still in python as is the backprop code.

PiperOrigin-RevId: 172151189
2017-10-13 15:07:38 -07:00