Commit Graph

105 Commits

Russell Power
30749f263e Add minor optimization for graph copies.
Reserve input/output edgeset sizes when copying graphs.

PiperOrigin-RevId: 355055758
Change-Id: Id78260cda6f8bf9ed30663ecc819b5936fff26a8
2021-02-01 16:55:28 -08:00
Kibeom Kim
63b7178b4a Annotate FunctionDefs registered from the eager runtime to correctly annotate tensorflow::Graph's construction context.
PiperOrigin-RevId: 347077468
Change-Id: I77e928287b063ade8a5dfebfeebb5db391af79a9
2020-12-11 14:41:33 -08:00
George Karpenkov
1943e58d29 Rollback of rollback of "Move the ownership of Python stack traces to Graph object, make them accessible from C++ API"
Move the ownership of Python stack traces to Graph object, make them accessible from C++ API

Expose stack printing options, implement common prefix filtering.

PiperOrigin-RevId: 345579757
Change-Id: I88673891e893b1f71a5b039e44f0bc30f190c18a
2020-12-03 18:39:36 -08:00
A. Unique TensorFlower
90881b041f Move the ownership of Python stack traces to Graph object, make them accessible from C++ API
Expose stack printing options, implement common prefix filtering.

PiperOrigin-RevId: 345153254
Change-Id: Ifc2eb8b5a4208358787db346a06837c6907f409c
2020-12-01 20:13:04 -08:00
George Karpenkov
22220649d3 Move the ownership of Python stack traces to Graph object, make them accessible from C++ API
Expose stack printing options, implement common prefix filtering.

PiperOrigin-RevId: 345147201
Change-Id: Iafb94afc07a8bada1e1f5978a66f692b4a06668e
2020-12-01 19:17:34 -08:00
Kibeom Kim
1aab9cfeb0 Add a tag to tensorflow::Graph that indicates where it originated from.
- `ConstructionContext::kDirectSession`: From `tensorflow::DirectSession`, the TF1 session API.
- `ConstructionContext::kFunctionDef`: From `FunctionDef`, @tf.function.
- `ConstructionContext::kUnknown`: Not tracked.

It can be accessed via `Graph::GetConstructionContext()`.

PiperOrigin-RevId: 343109880
Change-Id: I7b3488648855c9d86d4fb4a202bd66cf7182191a
2020-11-18 16:01:06 -08:00
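The commit above adds a provenance tag to tensorflow::Graph. Below is a minimal, hedged sketch of how such a tag might be consumed; `DescribeGraphOrigin` is a hypothetical helper, the enum values are the ones listed in the commit message, and the accessor follows `Graph::GetConstructionContext()` (names may have changed in later TF versions).

```cpp
#include "tensorflow/core/graph/graph.h"

// Hypothetical helper (not part of TF): map the construction-context tag to a
// human-readable label. Enum values are those listed in the commit message;
// later versions may use different names.
const char* DescribeGraphOrigin(const tensorflow::Graph& graph) {
  switch (graph.GetConstructionContext()) {
    case tensorflow::ConstructionContext::kDirectSession:
      return "TF1 session API (tensorflow::DirectSession)";
    case tensorflow::ConstructionContext::kFunctionDef:
      return "FunctionDef (@tf.function)";
    default:
      return "not tracked";
  }
}
```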
zmx
486c1e62f2 Update graph.h
Change the comment on the Graph::FindEdgeId method.
2020-09-29 05:54:07 +08:00
A. Unique TensorFlower
04112a1910 Qualify uses of std::string
PiperOrigin-RevId: 324235054
Change-Id: Ia0f0279b70bac000ff67334a5ca871bd4ec6ef5e
2020-07-31 10:38:08 -07:00
Yanhua Sun
b5a2876c65 Generate a stateless_case op if all ops in all branches are stateless.
This avoids unnecessary automatic control dependencies caused by stateful ops.

PiperOrigin-RevId: 324105758
Change-Id: Icf7979cca19f283ce7fd4cc338b4182f5e2e3b51
2020-07-30 16:23:47 -07:00
Derek Murray
aec85065ff [Graph] Avoid calling Node::GetNodeClassForOp() when a node is copied.
Instead, copy the class from the original node. This change also modifies the `kNodeClassTable` to use `absl::flat_hash_map<>`.

PiperOrigin-RevId: 306945263
Change-Id: I8eb1c80b57fdf204fbc7072a55615dd688025e87
2020-04-16 16:38:09 -07:00
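A standalone sketch of the optimization in the commit above, using mock types rather than TF's real Node internals: the node class is cached on the node and copied directly, instead of being re-derived from the op name via a hash-table lookup on every copy.

```cpp
#include <string>
#include "absl/container/flat_hash_map.h"

// Mock stand-ins for Node internals; illustrative only.
enum class NodeClass { kDefault, kConstant, kSwitch };

struct MockNode {
  std::string op_type;
  NodeClass node_class = NodeClass::kDefault;
};

// Class lookup keyed by op type; the commit switches the real kNodeClassTable
// to absl::flat_hash_map<> for the cases where a lookup is still needed.
NodeClass GetNodeClassForOp(const std::string& op) {
  static const auto* table = new absl::flat_hash_map<std::string, NodeClass>{
      {"Const", NodeClass::kConstant}, {"Switch", NodeClass::kSwitch}};
  auto it = table->find(op);
  return it == table->end() ? NodeClass::kDefault : it->second;
}

// Copying a node reuses the cached class instead of calling
// GetNodeClassForOp() again.
MockNode CopyNode(const MockNode& src) {
  return MockNode{src.op_type, src.node_class};
}
```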
George Karpenkov
f3dcd9dc11 Support interprocedural constant meta-information propagation for compilation
This CL does two things:

1) Supports inter-procedural constant information propagation across
PartitionedCall and StatefulPartitionedCall.

2) Addresses the resulting performance issue: done naively, (1) leads to an
exponential number of calls, as each function will be re-inlined for each
(indirect) caller. To address this, we cache the argument indices which need
to be constant and attach that information to the Graph object.

This might require some clarification:

a) Caching in a passed map would not work, as duplication of constant
propagation for each top-level caller is still prohibitively expensive.

b) Caching in a global object would not work, as graphs are created and
destroyed during transformations.

c) Caching this meta-information on a `Graph` object has the added benefit that
we no longer perform the same constant propagation many times (many
compilation passes call BackwardsConstAnalysis, and previously all of this
work had to be repeated).

PiperOrigin-RevId: 303860413
Change-Id: I78f92ca1487fc952044e5ac6526dcaa5b50d5f21
2020-03-30 17:51:05 -07:00
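A standalone sketch of the caching idea from point (c), with mock types standing in for the real TF interfaces: the set of argument indices that must be compile-time constants is computed once, stored on the graph object, and then reused.

```cpp
#include <optional>
#include <vector>

// Mock graph that carries the cached analysis result, mirroring how the commit
// attaches const-argument indices to the Graph object; illustrative only.
struct MockGraph {
  std::optional<std::vector<int>> const_arg_indices;  // filled lazily
};

// Stands in for the expensive analysis (e.g. a backwards const analysis over
// the function body).
std::vector<int> ComputeConstArgIndices(const MockGraph&) {
  return {0, 2};  // e.g. arguments 0 and 2 must be compile-time constants
}

// Each (indirect) caller of a PartitionedCall reuses the cached result instead
// of re-running the analysis, avoiding the exponential blowup described above.
const std::vector<int>& GetConstArgIndices(MockGraph& graph) {
  if (!graph.const_arg_indices.has_value()) {
    graph.const_arg_indices = ComputeConstArgIndices(graph);
  }
  return *graph.const_arg_indices;
}
```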
A. Unique TensorFlower
5f3a3019ba Replace NodeDef with std::shared_ptr<NodeProperties> in the kernel creation code paths and try to avoid as many copies of NodeDefs as possible. This will in most cases allow sharing the NodeDef between the OpKernel and the graph Node from which it is created.
This reduces the number of allocations in the executor benchmark by about 8%:

name                                                 old time/op             new time/op             delta
BM_executor/16/1k       [Nodes = 9824  ]              911µs ± 3%              911µs ± 1%    ~     (p=0.548 n=5+5)
BM_executor/32/8k       [Nodes = 141991]             17.1ms ± 2%             16.8ms ± 1%  -2.17%  (p=0.016 n=5+5)
BM_executor/1k/16       [Nodes = 6781  ]             1.21ms ± 1%             1.25ms ± 7%    ~     (p=0.095 n=5+5)
BM_executor/8k/32       [Nodes = 130875]              4.35s ± 0%              4.34s ± 0%    ~     (p=0.841 n=5+5)
BM_executor/1k/1k       [Nodes = 526256]              3.33s ± 1%              3.31s ± 1%    ~     (p=0.095 n=5+5)
BM_FeedInputFetchOutput                              54.0µs ± 7%             56.9µs ±13%    ~     (p=0.222 n=5+5)

name                                                 old allocs/op           new allocs/op           delta
BM_executor/16/1k       [Nodes = 9824  ]              15.4k ± 0%              14.1k ± 0%  -7.95%  (p=0.008 n=5+5)
BM_executor/32/8k       [Nodes = 141991]               226k ± 0%               208k ± 0%  -7.86%  (p=0.008 n=5+5)
BM_executor/1k/16       [Nodes = 6781  ]              10.2k ± 0%               9.3k ± 0%  -8.36%  (p=0.008 n=5+5)
BM_executor/8k/32       [Nodes = 130875]               197k ± 0%               180k ± 0%  -8.31%  (p=0.016 n=4+5)
BM_executor/1k/1k       [Nodes = 526256]               771k ± 0%               706k ± 0%  -8.53%  (p=0.008 n=5+5)
BM_FeedInputFetchOutput                                58.0 ± 0%               57.0 ± 0%  -1.72%  (p=0.008 n=5+5)

PiperOrigin-RevId: 295803318
Change-Id: I0d262c6082822023f449f9817dc943d20bd302d5
2020-02-18 13:20:06 -08:00
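A standalone sketch of the sharing pattern described above, with mock types: the graph Node and the OpKernel created from it hold the same refcounted properties object, so the node definition is not copied during kernel creation.

```cpp
#include <memory>
#include <string>

// Mock stand-ins; in TF, NodeProperties bundles the NodeDef with op
// registration data and type information. Illustrative only.
struct MockNodeProperties {
  std::string node_def_text;  // stands in for a potentially large NodeDef
};

struct MockNode {
  std::shared_ptr<MockNodeProperties> props;
};

struct MockOpKernel {
  std::shared_ptr<MockNodeProperties> props;  // shared with the Node, not copied
};

// Kernel creation hands over the Node's properties by shared_ptr, so the
// kernel and the graph Node reference the same underlying definition.
MockOpKernel CreateKernel(const MockNode& node) {
  return MockOpKernel{node.props};
}
```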
Derek Murray
fc30d76c55 Automated rollback of commit ce05dd80cc
PiperOrigin-RevId: 273515236
2019-10-08 07:37:58 -07:00
Derek Murray
ce05dd80cc Automated rollback of commit 3d830370a8
PiperOrigin-RevId: 272959765
2019-10-04 23:40:25 -07:00
Derek Murray
3d830370a8 Cache whether a Node is a function op in the Node's class.
This enables testing whether a node calls a function by checking the NodeClass enum, rather than looking up the type string in a FunctionLibraryDefinition map.

We call IsFunctionCall() for every node in several graph rewrites (including the control-flow lowering pass), so this should reduce the cost of rewrites on large graphs.

PiperOrigin-RevId: 272940579
2019-10-04 19:12:39 -07:00
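A standalone sketch of the caching described above, with mock types: the function-call test becomes a cheap enum comparison on the node instead of a string lookup in a function library.

```cpp
#include <string>
#include <unordered_set>

// Mock stand-ins; illustrative only.
enum class NodeClass { kDefault, kFunctionOp };

struct MockNode {
  std::string type_string;
  NodeClass node_class = NodeClass::kDefault;  // assigned once at construction
};

// Before: look up the type string in the function library on every call.
bool IsFunctionCallSlow(const std::unordered_set<std::string>& function_library,
                        const MockNode& n) {
  return function_library.count(n.type_string) > 0;
}

// After (per the commit): an enum check, cheap enough for hot graph rewrites
// such as the control-flow lowering pass.
bool IsFunctionCallFast(const MockNode& n) {
  return n.node_class == NodeClass::kFunctionOp;
}
```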
Derek Murray
6a42e239dc Use GetNodeAttrSimple() when it is possible that the attr is not present.
In the Status-returning GetNodeAttr(), constructing an `errors::NotFound()` when the attr is not present involves expensive string concatenation.

Additionally, change GetNodeAttr() to GetNodeAttrString() on hot codepaths (e.g. `Executor::PropagateOutputs()`) to avoid copying a string on each call, and add overloads of GetNodeAttrSimple() that enable accessing const-pointers to non-POD types in the AttrValue proto without copying them.

PiperOrigin-RevId: 261141528
2019-08-01 10:21:48 -07:00
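A hedged usage sketch of the pattern the commit describes, assuming the TF-internal headers of that era; the attribute name and the exact overload of GetNodeAttrSimple() are illustrative and may differ between versions (in later versions this helper was renamed, e.g. to TryGetNodeAttr).

```cpp
#include <string>

#include "tensorflow/core/framework/node_def_util.h"
#include "tensorflow/core/graph/graph.h"

// Sketch only: prefer the bool-returning lookup when the attribute may be
// absent, so no errors::NotFound status (and its string concatenation) is
// built on the miss path. Attribute name and overload are illustrative.
std::string GetFrameNameOrEmpty(const tensorflow::Node& node) {
  std::string frame_name;
  if (!tensorflow::GetNodeAttrSimple(node.attrs(), "frame_name", &frame_name)) {
    return "";  // attribute not present; no Status object was constructed
  }
  return frame_name;
}
```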
Yanhua Sun
170a95de67 In while_v2 emit a StatelessWhile op if the body is stateless.
PiperOrigin-RevId: 260927755
2019-07-31 08:12:52 -07:00
Saurabh Saxena
22beb4c7d5 In cond_v2 emit a StatelessIf op if the body is stateless.
PiperOrigin-RevId: 256009375
2019-07-01 14:16:11 -07:00
Derek Murray
a101d48091 Change Graph::AddNode() to take node_def by value.
This method currently copies the given `const NodeDef&` proto into the returned `Node`. In many cases, the argument can be moved into the call, and we can elide a potentially large copy.

PiperOrigin-RevId: 254008134
2019-06-19 09:18:27 -07:00
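A hedged sketch of the call pattern the commit enables, assuming the `Node* AddNode(NodeDef, Status*)` signature of that era (newer versions differ): a caller that no longer needs its NodeDef can move it into the call and avoid copying a potentially large proto. `AddMovedNode` is a hypothetical helper.

```cpp
#include <utility>

#include "tensorflow/core/framework/node_def.pb.h"
#include "tensorflow/core/graph/graph.h"
#include "tensorflow/core/lib/core/status.h"

// Sketch only: move the NodeDef into AddNode() instead of copying it.
tensorflow::Node* AddMovedNode(tensorflow::Graph* graph,
                               tensorflow::NodeDef node_def) {
  tensorflow::Status status;
  tensorflow::Node* node = graph->AddNode(std::move(node_def), &status);
  return status.ok() ? node : nullptr;
}
```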
Eugene Zhulenev
a1366af7f6 Make NodeIter and NeighborIter more like real C++ iterators.
Reference: https://en.cppreference.com/w/cpp/iterator/iterator
PiperOrigin-RevId: 247125416
2019-05-07 18:41:13 -07:00
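Because NodeIter and NeighborIter now satisfy the standard iterator requirements, the usual range-based loops over `Graph::nodes()` and `Node::in_nodes()`/`out_nodes()` work as expected. A small sketch (the counting helper is hypothetical):

```cpp
#include "tensorflow/core/graph/graph.h"

// Sketch only: Graph::nodes() is backed by NodeIter, and
// Node::in_nodes()/out_nodes() by NeighborIter; both support range-for.
int CountControlFlowNodes(const tensorflow::Graph& graph) {
  int count = 0;
  for (const tensorflow::Node* node : graph.nodes()) {
    if (node->IsSwitch() || node->IsMerge()) ++count;
  }
  return count;
}
```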
Andy Ly
1a40c07e83 Expose FindKernelDef with NodeDef components (name, op, device, etc.). Update Grappler util's IsKernelRegisteredForNode to use the lower-level FindKernelDef.
PiperOrigin-RevId: 247104392
2019-05-07 16:33:58 -07:00
A. Unique TensorFlower
919b38007e Speed up removal of nodes from Graph by not removing edges one by one from the node's own EdgeSet. Only remove each edge from the neighboring nodes' EdgeSets, then clear the node's own in_edges_ and out_edges_ in one operation.
PiperOrigin-RevId: 240449165
2019-03-26 16:18:58 -07:00
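A standalone sketch of the removal strategy above, with mock types: detach the node's edges from the neighboring nodes' edge sets, then drop the node's own edge sets wholesale instead of erasing entries one at a time.

```cpp
#include <set>

// Mock stand-ins; illustrative only.
struct MockEdge;
struct MockNode {
  std::set<MockEdge*> in_edges, out_edges;
};
struct MockEdge {
  MockNode* src;
  MockNode* dst;
};

// Erase the node's edges from the *neighbors'* edge sets, then clear the
// node's own in_edges/out_edges in one operation each, as the commit does.
// (The owning graph would still recycle the edge objects themselves.)
void DetachNode(MockNode* node) {
  for (MockEdge* e : node->out_edges) e->dst->in_edges.erase(e);
  for (MockEdge* e : node->in_edges) e->src->out_edges.erase(e);
  node->in_edges.clear();
  node->out_edges.clear();
}
```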
Igor Ganichev
ea294844b0 Add Node::IsArg() and Node::IsRetval() methods and use them
As functions have become more prominent, we have been checking op types
directly in many places. This is slow and buggy because not all places were
updated for the Device* versions of _Arg and _Retval.

PiperOrigin-RevId: 238044205
2019-03-12 10:34:44 -07:00
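A brief usage sketch: the new predicates replace direct comparisons against op-type strings such as "_Arg"/"_Retval" (which missed the Device* variants). `IsFunctionBoundaryNode` is a hypothetical helper.

```cpp
#include "tensorflow/core/graph/graph.h"

// Sketch only: classify function-boundary nodes via the new predicates rather
// than string-comparing node->type_string() against "_Arg"/"_Retval".
bool IsFunctionBoundaryNode(const tensorflow::Node* node) {
  return node->IsArg() || node->IsRetval();
}
```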
Igor Ganichev
ed2b195990 Support function calls through PartitionedCall in tf2xla
Also, make the eager runtime always emit PartitionedCall and remove the
special handling of XLA compilation.

Because this change makes XLA look inside PartitionedCalls, it also had to
update or disable some tests that include PartitionedCalls with uncompilable
ops inside.

PiperOrigin-RevId: 237486703
2019-03-08 11:32:38 -08:00
Jiri Simsa
5702b86c96 [tf.data] Marking dataset ops that consume a dataset without an iterator as stateful to make sure they are not pruned from the graph when their output is not used.
This is a conservative approach to guarantee that any side effects of the op are carried out.

This CL also reverts a previous (incomplete) solution to the same problem.

PiperOrigin-RevId: 233663631
2019-02-12 19:11:25 -08:00
Jiri Simsa
eadd8aca53 Making sure dataset "output" ops are not pruned from function graphs as they might have side effects even though the ops are not marked as stateful.
PiperOrigin-RevId: 229425577
2019-01-15 13:35:35 -08:00
A. Unique TensorFlower
f9699b8d40 Fixing build due to ambiguous vector constructor.
PiperOrigin-RevId: 225859201
2018-12-17 11:30:46 -08:00
A. Unique TensorFlower
e8d6281e7e [Error improvement] We now add an attribute to keep track of the original source nodes. We have also changed many optimizers to correctly propagate the original node values.
PiperOrigin-RevId: 225457141
2018-12-13 16:36:57 -08:00
Skye Wanderman-Milne
62db4a3ccf Introduce Operation._add_while_inputs to allow adding inputs to a While op.
This is in preparation for changing while_v2 to rewrite the forward
pass to output intermediates needed by the gradient, instead of
outputting all intermediates. Since While ops always have the same
input and output types, we need to be able to add inputs in addition
to adding outputs.

PiperOrigin-RevId: 223812986
2018-12-03 10:11:42 -08:00
Peter Hawkins
c1e50881f3 [XLA] Avoid undefined behavior in absl::bit_cast with Eigen::half.
[XLA] [TF] Fix more compile warnings in Mac OS OSS build.

PiperOrigin-RevId: 223035708
2018-11-27 12:29:49 -08:00
Peter Hawkins
ee20d8c029 [TF] [XLA] Fix a number of compiler warnings on Mac OS X.
* Mostly these are unused/dead code.
* Fix an ignored Status in shape_util.h

PiperOrigin-RevId: 222126279
2018-11-19 13:19:32 -08:00
Skye Wanderman-Milne
199ead85e8 Fix bug in If lowering and make {Input,Output}Tensor.node non-const.
Prior to this change, the If lowering pass would always use the first
output of the predicate op as the predicate. This change makes it use
the correct output. In addition, this adds more OutputTensor plumbing
which required making the node field non-const.

PiperOrigin-RevId: 221308751
2018-11-13 12:09:26 -08:00
A. Unique TensorFlower
4e4fc3b889 Fix a couple of linter complaints.
PiperOrigin-RevId: 220464726
2018-11-07 08:16:04 -08:00
Skye Wanderman-Milne
d1b2537f33 Don't constant fold FakeParam ops.
FakeParam claims to have a different output shape than it actually
produces (since the output is never meant to be accessed). Prior to
this change, ConstantFold() would call ReplaceTensorWithConstant() with
the invalid FakeParam output, which would cause a
use-before-initialization error.

PiperOrigin-RevId: 217929903
2018-10-19 14:24:54 -07:00
Peter Hawkins
3f23f4ddea Automated rollback of commit 6fa6bd045c
PiperOrigin-RevId: 217173355
2018-10-15 11:22:32 -07:00
Peter Hawkins
6fa6bd045c Replace references to tensorflow::StringPiece with absl::string_view. No functional changes.
PiperOrigin-RevId: 217170781
2018-10-15 11:01:32 -07:00
Tong Shen
72bf28cd1f Add a utility function to build a node-name-to-node-index mapping.
PiperOrigin-RevId: 216853788
2018-10-12 06:39:26 -07:00
Alexandre Passos
eec9ca8f0b Partially support tfe.defun in tf.gradients.
This doesn't attempt to deal with cases where we might have already generated
the FunctionDef for the parent function, as in that case we cannot easily
modify the forward pass.

PiperOrigin-RevId: 216243224
2018-10-08 13:58:40 -07:00
Rachel Lim
47eafbaf43 [tf.data] Add utility to deduplicate graph node names (after vectorization)
PiperOrigin-RevId: 215595078
2018-10-03 11:29:40 -07:00
Sanjoy Das
9884cb3629 Check that IsValid{Input|Output}Tensor is only given non-control edges
PiperOrigin-RevId: 215338658
2018-10-01 23:12:22 -07:00
Asim Shankar
7f52de1a2b Make Graph::UpdateEdge() be O(e) instead of O(E)
where:
- E = number of edges in the graph
- e = number of edges on the node of interest

e is necessarily <= E and is typically really small
(# of inputs to an operation + control edges)

PiperOrigin-RevId: 210624296
2018-08-28 16:07:05 -07:00
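A standalone sketch of the complexity change, with mock types: the edge to update is found by scanning only the destination node's input edges (size e) rather than the graph-wide edge list (size E).

```cpp
#include <vector>

// Mock stand-ins; illustrative only.
struct MockNode;
struct MockEdge {
  MockNode* dst;
  int dst_input;
};
struct MockNode {
  std::vector<MockEdge*> in_edges;  // small: data inputs + control edges
};
struct MockGraph {
  std::vector<MockEdge*> edges;  // all E edges in the graph
};

// O(E): scan the graph-wide edge list for the edge feeding dst:dst_input.
MockEdge* FindEdgeSlow(const MockGraph& g, const MockNode* dst, int dst_input) {
  for (MockEdge* e : g.edges)
    if (e->dst == dst && e->dst_input == dst_input) return e;
  return nullptr;
}

// O(e): scan only the destination node's own input edges, as the commit does.
MockEdge* FindEdgeFast(const MockNode* dst, int dst_input) {
  for (MockEdge* e : dst->in_edges)
    if (e->dst_input == dst_input) return e;
  return nullptr;
}
```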
Peter Hawkins
642a043de4 [TF:XLA] Replace bespoke NodeSlot class in subgraph encapsulation code with InputTensor and OutputTensor classes from TF core.
Add equality and hash methods to InputTensor and OutputTensor.

No functional changes intended.

PiperOrigin-RevId: 200440015
2018-06-13 13:08:14 -07:00
A. Unique TensorFlower
9d2c6ff2a5 Collective Ops Part 7
Complete just enough of the core implementation to run
multi-device collectives locally within a single process.
Interfaces are still private and not available for general use.

PiperOrigin-RevId: 197617132
2018-05-22 13:51:22 -07:00
Ayush Dubey
09e529ff5a Prepare nodes that will be allocated using ScopedAllocator.
This includes changes to Executor that (1) set scope_id on nodes that are
decorated with the _scoped_allocator attribute, and (2) mark such nodes to
never forward inputs.

PiperOrigin-RevId: 194807086
2018-04-30 10:39:01 -07:00
Benjamin Kramer
5d624aa437 Clarify that in_nodes and in_edges includes control edges.
PiperOrigin-RevId: 189225717
2018-03-15 12:19:25 -07:00
Mingsheng Hong
b7b4fe66ee Added const to Node* in various parts of the code base.
PiperOrigin-RevId: 187050526
2018-02-26 11:24:57 -08:00
Mingsheng Hong
3590c452ea Enabled XLA for TF C API.
Summary of changes:

1. Set the MarkForCompilationPassFlags::tf_xla_cpu_global_jit default to true in
the C_API unit test env when XLA execution is intended. Together with setting the
session config's config.graph_options.optimizer_options.global_jit_level to > 0,
this turns on XLA for the entire graph (eligible nodes only, with _Arg and
_RetVal nodes excluded).

We decided against defaulting MarkForCompilationPassFlags::tf_xla_cpu_global_jit
to true, due to performance concerns with the single-threaded nature of the XLA
CPU backend (see
https://www.tensorflow.org/performance/xla/jit#turning_on_jit_compilation).

2. In FindCompilationCandidates() during MarkForCompilationPass, skip compiling
any '_Arg'-typed nodes. This is necessary to avoid hitting an "Invalid argument
number" error during MarkForCompilationPass.

3. Extended C API based build rules to link in XLA libraries, and added unit
test "CAPI.Session_Min_XLA_CPU".

Also added some misc improvements and debugging aids.

PiperOrigin-RevId: 185193314
2018-02-09 14:35:39 -08:00
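A hedged sketch of the session-config setting the commit refers to (config.graph_options.optimizer_options.global_jit_level > 0), expressed with the ConfigProto C++ API; field and enum names follow the ConfigProto schema and may differ across versions.

```cpp
#include "tensorflow/core/protobuf/config.pb.h"

// Sketch only: build a session ConfigProto with global JIT enabled, mirroring
// the global_jit_level setting described above.
tensorflow::ConfigProto MakeGlobalJitConfig() {
  tensorflow::ConfigProto config;
  config.mutable_graph_options()
      ->mutable_optimizer_options()
      ->set_global_jit_level(tensorflow::OptimizerOptions::ON_1);
  return config;
}
```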
A. Unique TensorFlower
7149a2e2e2 Cleanup: Ran clang-format on files in tensorflow/core/.../*.{cc,h}.
PiperOrigin-RevId: 183848459
2018-01-30 12:27:47 -08:00
Sourabh Bajaj
b2db981a67 Merge changes from github.
PiperOrigin-RevId: 177526301
2017-11-30 16:41:01 -08:00
Yifei Feng
b1d8c59e9b Merge changes from github.
PiperOrigin-RevId: 176695926
2017-11-22 13:50:02 -08:00