STT-tensorflow

Author	SHA1	Message	Date
Russell Power	30749f263e	Add minor optimization for graph copies. Reserve input/output edgeset sizes when copying graphs. PiperOrigin-RevId: 355055758 Change-Id: Id78260cda6f8bf9ed30663ecc819b5936fff26a8	2021-02-01 16:55:28 -08:00
Kibeom Kim	63b7178b4a	Annotate FunctionDef registered from eager runtime to correctly annotate `tensorflow::Graph`'s construction context. PiperOrigin-RevId: 347077468 Change-Id: I77e928287b063ade8a5dfebfeebb5db391af79a9	2020-12-11 14:41:33 -08:00
George Karpenkov	1943e58d29	Rollback of rollback of "Move the ownership of Python stack traces to Graph object, make them accessible from C++ API" Move the ownership of Python stack traces to Graph object, make them accessible from C++ API Expose stack printing options, implement common prefix filtering. PiperOrigin-RevId: 345579757 Change-Id: I88673891e893b1f71a5b039e44f0bc30f190c18a	2020-12-03 18:39:36 -08:00
A. Unique TensorFlower	90881b041f	Move the ownership of Python stack traces to Graph object, make them accessible from C++ API Expose stack printing options, implement common prefix filtering. PiperOrigin-RevId: 345153254 Change-Id: Ifc2eb8b5a4208358787db346a06837c6907f409c	2020-12-01 20:13:04 -08:00
George Karpenkov	22220649d3	Move the ownership of Python stack traces to Graph object, make them accessible from C++ API Expose stack printing options, implement common prefix filtering. PiperOrigin-RevId: 345147201 Change-Id: Iafb94afc07a8bada1e1f5978a66f692b4a06668e	2020-12-01 19:17:34 -08:00
Kibeom Kim	1aab9cfeb0	Add a tag to `tensorflow::Graph` that indicates where it's originated from. - `ConstructionContext::kDirectSession` From `tensorflow::DirectSession`, tf1 session API. - `ConstructionContext::kFunctionDef`: From `FunctionDef`, @tf.function. - `ConstructionContext::kUnknown`: Not tracked. It can be accessed via `Graph::GetConstructionContext()` PiperOrigin-RevId: 343109880 Change-Id: I7b3488648855c9d86d4fb4a202bd66cf7182191a	2020-11-18 16:01:06 -08:00
zmx	486c1e62f2	Update graph.h change comment for Graph::FindEdgeId method.	2020-09-29 05:54:07 +08:00
A. Unique TensorFlower	04112a1910	Qualify uses of std::string PiperOrigin-RevId: 324235054 Change-Id: Ia0f0279b70bac000ff67334a5ca871bd4ec6ef5e	2020-07-31 10:38:08 -07:00
Yanhua Sun	b5a2876c65	generate stateless_case op if all ops in all branches are stateless This avoids unnecessary auto dependency due to stateful ops PiperOrigin-RevId: 324105758 Change-Id: Icf7979cca19f283ce7fd4cc338b4182f5e2e3b51	2020-07-30 16:23:47 -07:00
Derek Murray	aec85065ff	[Graph] Avoid calling `Node::GetNodeClassForOp()` when a node is copied. Instead, copy the class from the original node. This change also modifies the `kNodeClassTable` to use `absl::flat_hash_map<>`. PiperOrigin-RevId: 306945263 Change-Id: I8eb1c80b57fdf204fbc7072a55615dd688025e87	2020-04-16 16:38:09 -07:00
George Karpenkov	f3dcd9dc11	Support interprocedural constant meta-information propagation for compilation This CL does two things: 1) Supports inter-procedural constant information propagation, across PartitionedCall and StatefulPartitionedCall. 2) Done naively, (1) leads to exponential number of calls, as each function will be reinlined for each (indirect) caller. In order to address this performance issue, we cache the argument indices which need to be constant, and attach that information to the Graph object. This might require some clarification: a) Caching in a passed map would not work, as duplication of constant propagation for each top-level caller is still prohibitively expensive. b) Caching in a global object would not work, as graphs are created and destroyed during transformations. c) Caching this meta-information on a `Graph` object has an added benefit that we no longer perform the same constant propagation many times (a lot of compilation passes call BackwardsConstAnalysis, and previously all this work had to be repeated). PiperOrigin-RevId: 303860413 Change-Id: I78f92ca1487fc952044e5ac6526dcaa5b50d5f21	2020-03-30 17:51:05 -07:00
A. Unique TensorFlower	5f3a3019ba	Replace NodeDef with std::shared_ptr<NodeProperties> in the kernel creation code paths and try to avoid as many copies of NodeDefs as possible. This will in most cases allow sharing the NodeDef between the OpKernel and the graph Node from which it is created. This reduces the number of allocations in the executor benchmark by about 8%: name old time/op new time/op delta BM_executor/16/1k [Nodes = 9824 ] 911µs ± 3% 911µs ± 1% ~ (p=0.548 n=5+5) BM_executor/32/8k [Nodes = 141991] 17.1ms ± 2% 16.8ms ± 1% -2.17% (p=0.016 n=5+5) BM_executor/1k/16 [Nodes = 6781 ] 1.21ms ± 1% 1.25ms ± 7% ~ (p=0.095 n=5+5) BM_executor/8k/32 [Nodes = 130875] 4.35s ± 0% 4.34s ± 0% ~ (p=0.841 n=5+5) BM_executor/1k/1k [Nodes = 526256] 3.33s ± 1% 3.31s ± 1% ~ (p=0.095 n=5+5) BM_FeedInputFetchOutput 54.0µs ± 7% 56.9µs ±13% ~ (p=0.222 n=5+5) name old allocs/op new allocs/op delta BM_executor/16/1k [Nodes = 9824 ] 15.4k ± 0% 14.1k ± 0% -7.95% (p=0.008 n=5+5) BM_executor/32/8k [Nodes = 141991] 226k ± 0% 208k ± 0% -7.86% (p=0.008 n=5+5) BM_executor/1k/16 [Nodes = 6781 ] 10.2k ± 0% 9.3k ± 0% -8.36% (p=0.008 n=5+5) BM_executor/8k/32 [Nodes = 130875] 197k ± 0% 180k ± 0% -8.31% (p=0.016 n=4+5) BM_executor/1k/1k [Nodes = 526256] 771k ± 0% 706k ± 0% -8.53% (p=0.008 n=5+5) BM_FeedInputFetchOutput 58.0 ± 0% 57.0 ± 0% -1.72% (p=0.008 n=5+5) PiperOrigin-RevId: 295803318 Change-Id: I0d262c6082822023f449f9817dc943d20bd302d5	2020-02-18 13:20:06 -08:00
Derek Murray	fc30d76c55	Automated rollback of commit `ce05dd80cc` PiperOrigin-RevId: 273515236	2019-10-08 07:37:58 -07:00
Derek Murray	ce05dd80cc	Automated rollback of commit `3d830370a8` PiperOrigin-RevId: 272959765	2019-10-04 23:40:25 -07:00
Derek Murray	3d830370a8	Cache whether a Node is a function op in the Node's class. This enables the testing for whether a node calls a function by checking the NodeClass enum, rather than looking up the type string in a FunctionLibraryDefinition map. We call IsFunctionCall() for every node in several graph rewrites (including the control-flow lowering pass), so this should reduce the cost of rewrites on large graphs. PiperOrigin-RevId: 272940579	2019-10-04 19:12:39 -07:00
Derek Murray	6a42e239dc	Use GetNodeAttrSimple() when it is possible that the attr is not present. In the Status-returning GetNodeAttr(), constructing an `errors::NotFound()` when the attr is not present involves expensive string concatenation. Additionally, change GetNodeAttr() to GetNodeAttrString() on hot codepaths (e.g. `Executor::PropagateOutputs()`) to avoid copying a string on each call, and add overloads of GetNodeAttrSimple() that enable accessing const-pointers to non-POD types in the AttrValue proto without copying them. PiperOrigin-RevId: 261141528	2019-08-01 10:21:48 -07:00
Yanhua Sun	170a95de67	In while_v2 emit a StatelessIf op if the body is stateless. PiperOrigin-RevId: 260927755	2019-07-31 08:12:52 -07:00
Saurabh Saxena	22beb4c7d5	In cond_v2 emit a StatelessIf op if the body is stateless. PiperOrigin-RevId: 256009375	2019-07-01 14:16:11 -07:00
Derek Murray	a101d48091	Change `Graph::AddNode()` to take `node_def` by value. This method currently copies the given `const NodeDef&` proto into the returned `Node`. In many cases, the argument can be moved into the call, and we can elide a potentially large copy. PiperOrigin-RevId: 254008134	2019-06-19 09:18:27 -07:00
Eugene Zhulenev	a1366af7f6	Make NodeIter and NeighborIter more like real c++ iterators. Reference: https://en.cppreference.com/w/cpp/iterator/iterator PiperOrigin-RevId: 247125416	2019-05-07 18:41:13 -07:00
Andy Ly	1a40c07e83	Expose FindKernelDef with NodeDef components (name, op, device, etc.). Update Grappler util's IsKernelRegisteredForNode to use lower FindKernelDef. PiperOrigin-RevId: 247104392	2019-05-07 16:33:58 -07:00
A. Unique TensorFlower	919b38007e	Speedup removal of nodes from Graph by not removing edges one by one from the node's own EdgeSet. Only remove it from the neighboring nodes' EdgeSets, then clear the node's own in_edges_ and out_edges_ in one operation. PiperOrigin-RevId: 240449165	2019-03-26 16:18:58 -07:00
Igor Ganichev	ea294844b0	Add Node::IsArg() and Node::IsRetval() methods and use them As functions become more prominent, we have been directly checking for op types in many places. This is slow and buggy because not all places were updated for Device* versions of _Arg and _Retval. PiperOrigin-RevId: 238044205	2019-03-12 10:34:44 -07:00
Igor Ganichev	ed2b195990	Support function calls through PartitionedCall in tf2xla Also, make eager runtime always emit PartitionedCall and remove special handling of xla compilation. Because this change makes XLA look inside PartitionedCalls, this change had to update/disable some tests that include PartitionedCalls with some uncompilable ops inside. PiperOrigin-RevId: 237486703	2019-03-08 11:32:38 -08:00
Jiri Simsa	5702b86c96	[tf.data] Marking dataset ops that consume a dataset without an iterator as stateful to make sure they are not pruned from the graph in case their output is not used. This is a conservative approach to guarantee that any side-effects of the op are carried out. This CL also reverts a previous (incomplete) solution to the same problem. PiperOrigin-RevId: 233663631	2019-02-12 19:11:25 -08:00
Jiri Simsa	eadd8aca53	Making sure dataset "output" ops are not pruned from function graphs as they might have side effects even though the ops are not marked as stateful. PiperOrigin-RevId: 229425577	2019-01-15 13:35:35 -08:00
A. Unique TensorFlower	f9699b8d40	Fixing build due to ambiguous vector constructor. PiperOrigin-RevId: 225859201	2018-12-17 11:30:46 -08:00
A. Unique TensorFlower	e8d6281e7e	[Error improvement] We now put an attribute for keeping track of the original source nodes. We have also changed many optimizers to correctly transmit the original node values. PiperOrigin-RevId: 225457141	2018-12-13 16:36:57 -08:00
Skye Wanderman-Milne	62db4a3ccf	Introduce Operation._add_while_inputs to allow adding inputs to a While op. This is in preparation for changing while_v2 to rewrite the forward pass to output intermediates needed by the gradient, instead of outputting all intermediates. Since While ops always have the same inputs and output types, we need to be able to add inputs in addition to adding outputs. PiperOrigin-RevId: 223812986	2018-12-03 10:11:42 -08:00
Peter Hawkins	c1e50881f3	[XLA] Avoid undefined behavior in absl::bit_cast with Eigen::half. [XLA] [TF] Fix more compile warnings in Mac OS OSS build. PiperOrigin-RevId: 223035708	2018-11-27 12:29:49 -08:00
Peter Hawkins	ee20d8c029	[TF] [XLA] Fix a number of compiler warnings on Mac OS X. * Mostly these are unused/dead code. * Fix an ignored Status in shape_util.h PiperOrigin-RevId: 222126279	2018-11-19 13:19:32 -08:00
Skye Wanderman-Milne	199ead85e8	Fix bug in If lowering and make {Input,Output}Tensor.node non-const. Prior to this change, the If lowering pass would always use the first output of the predicate op as the predicate. This change makes it use the correct output. In addition, this adds more OutputTensor plumbing which required making the node field non-const. PiperOrigin-RevId: 221308751	2018-11-13 12:09:26 -08:00
A. Unique TensorFlower	4e4fc3b889	Fix a couple of linter complaints. PiperOrigin-RevId: 220464726	2018-11-07 08:16:04 -08:00
Skye Wanderman-Milne	d1b2537f33	Don't constant fold FakeParam ops. FakeParam claims to have a different output shape than it actually outputs (since the output is not meant to ever be accessed). Prior to this change, ConstantFold() would call ReplaceTensorWithConstant() with the invalid FakeParam output, which would cause a use-before-initialization error. PiperOrigin-RevId: 217929903	2018-10-19 14:24:54 -07:00
Peter Hawkins	3f23f4ddea	Automated rollback of commit `6fa6bd045c` PiperOrigin-RevId: 217173355	2018-10-15 11:22:32 -07:00
Peter Hawkins	6fa6bd045c	Replace references to tensorflow::StringPiece with absl::string_view. No functional changes. PiperOrigin-RevId: 217170781	2018-10-15 11:01:32 -07:00
Tong Shen	72bf28cd1f	Add a utility function to build node name to node index. PiperOrigin-RevId: 216853788	2018-10-12 06:39:26 -07:00
Alexandre Passos	eec9ca8f0b	Partial support tfe.defun in tf.gradients. Doesn't attempt to deal with cases where we might have already generated the functiondef for the parent function as in that case we cannot easily modify the forward pass. PiperOrigin-RevId: 216243224	2018-10-08 13:58:40 -07:00
Rachel Lim	47eafbaf43	[tf.data] Add utility to deduplicate graph node names (after vectorization) PiperOrigin-RevId: 215595078	2018-10-03 11:29:40 -07:00
Sanjoy Das	9884cb3629	Check that IsValid{Input\|Output}Tensor is only given non-control edges PiperOrigin-RevId: 215338658	2018-10-01 23:12:22 -07:00
Asim Shankar	7f52de1a2b	Make Graph::UpdateEdge() be O(e) instead of O(E) where: - E = number of edges in the graph - e = number of edges on the node of interest e is necessarily <= E and is typically really small (# of inputs to an operation + control edges) PiperOrigin-RevId: 210624296	2018-08-28 16:07:05 -07:00
Peter Hawkins	642a043de4	[TF:XLA] Replace bespoke NodeSlot class in subgraph encapsulation code with InputTensor and OutputTensor classes from TF core. Add equality and hash methods to InputTensor and OutputTensor. No functional changes intended. PiperOrigin-RevId: 200440015	2018-06-13 13:08:14 -07:00
A. Unique TensorFlower	9d2c6ff2a5	Collective Ops Part 7 Complete just enough of the core implementation to run multi-device collectives locally within a single process. Interfaces are still private and not available for general use. PiperOrigin-RevId: 197617132	2018-05-22 13:51:22 -07:00
Ayush Dubey	09e529ff5a	Prepare nodes that will be allocated using ScopedAllocator. This includes changes to Executor that (1) set scope_id on nodes that are decorated with _scoped_allocator attribute, (2) mark such nodes to never forward input. PiperOrigin-RevId: 194807086	2018-04-30 10:39:01 -07:00
Benjamin Kramer	5d624aa437	Clarify that in_nodes and in_edges includes control edges. PiperOrigin-RevId: 189225717	2018-03-15 12:19:25 -07:00
Mingsheng Hong	b7b4fe66ee	Added const to Node* in various parts of the code base. PiperOrigin-RevId: 187050526	2018-02-26 11:24:57 -08:00
Mingsheng Hong	3590c452ea	Enabled XLA for TF C API. Summary of changes: 1. Set MarkForCompilationPassFlags::tf_xla_cpu_global_jit default to true in C_API unit test env when XLA-execute is intended. Together with setting session config config.graph_options.optimizer_options.global_jit_level to > 0, this turns on XLA for the entire graph (eligible nodes only, with _Arg and _RetVal nodes excluded). We decided against defaulting MarkForCompilationPassFlags::tf_xla_cpu_global_jit to true, due to performance concerns with the single-threaded nature of the XLA CPU backend (see https://www.tensorflow.org/performance/xla/jit#turning_on_jit_compilation). 2. In FindCompilationCandidates() during MarkForCompilationPass, skip compiling any '_Arg'-typed nodes. This is necessary to avoid hitting a "Invalid argument number" error during MarkForCompilationPass. 3. Extended C API based build rules to link in XLA libraries, and added unit test "CAPI.Session_Min_XLA_CPU". Also added some misc improvements and debugging aids. PiperOrigin-RevId: 185193314	2018-02-09 14:35:39 -08:00
A. Unique TensorFlower	7149a2e2e2	Cleanup: Ran clang-format on files in tensorflow/core/.../*.{cc,h}. PiperOrigin-RevId: 183848459	2018-01-30 12:27:47 -08:00
Sourabh Bajaj	b2db981a67	Merge changes from github. PiperOrigin-RevId: 177526301	2017-11-30 16:41:01 -08:00
Yifei Feng	b1d8c59e9b	Merge changes from github. PiperOrigin-RevId: 176695926	2017-11-22 13:50:02 -08:00


105 Commits