STT-tensorflow

Author	SHA1	Message	Date
Derek Murray	f7bae45c69	[Graph] Avoid copying the NodeDef where possible in `Node::UpdateProperties()`. 1. We currently only call `UpdateProperties()` from `AddAttr()`, which ensures that we have unique ownership of the shared `NodeProperties`. This allows us to modify the `NodeProperties` in place. 2. We only need to update the `NodeProperties` when the input and output types change as a result of adding the new attr. Most calls to `AddAttr()` add inferred shape attributes, and do not modify the input or output types. PiperOrigin-RevId: 307054556 Change-Id: I01fbc6ba832020bcd2d89822a5d3a2e3beba02ce	2020-04-17 09:24:37 -07:00
Derek Murray	aec85065ff	[Graph] Avoid calling `Node::GetNodeClassForOp()` when a node is copied. Instead, copy the class from the original node. This change also modifies the `kNodeClassTable` to use `absl::flat_hash_map<>`. PiperOrigin-RevId: 306945263 Change-Id: I8eb1c80b57fdf204fbc7072a55615dd688025e87	2020-04-16 16:38:09 -07:00
A. Unique TensorFlower	5f3a3019ba	Replace NodeDef with std::shared_ptr<NodeProperties> in the kernel creation code paths and try to avoid as many copies of NodeDefs as possible. This will in most cases allow sharing the NodeDef between the OpKernel and the graph Node from which it is created. This reduces the number of allocations in the executor benchmark by about 8%: name old time/op new time/op delta BM_executor/16/1k [Nodes = 9824 ] 911µs ± 3% 911µs ± 1% ~ (p=0.548 n=5+5) BM_executor/32/8k [Nodes = 141991] 17.1ms ± 2% 16.8ms ± 1% -2.17% (p=0.016 n=5+5) BM_executor/1k/16 [Nodes = 6781 ] 1.21ms ± 1% 1.25ms ± 7% ~ (p=0.095 n=5+5) BM_executor/8k/32 [Nodes = 130875] 4.35s ± 0% 4.34s ± 0% ~ (p=0.841 n=5+5) BM_executor/1k/1k [Nodes = 526256] 3.33s ± 1% 3.31s ± 1% ~ (p=0.095 n=5+5) BM_FeedInputFetchOutput 54.0µs ± 7% 56.9µs ±13% ~ (p=0.222 n=5+5) name old allocs/op new allocs/op delta BM_executor/16/1k [Nodes = 9824 ] 15.4k ± 0% 14.1k ± 0% -7.95% (p=0.008 n=5+5) BM_executor/32/8k [Nodes = 141991] 226k ± 0% 208k ± 0% -7.86% (p=0.008 n=5+5) BM_executor/1k/16 [Nodes = 6781 ] 10.2k ± 0% 9.3k ± 0% -8.36% (p=0.008 n=5+5) BM_executor/8k/32 [Nodes = 130875] 197k ± 0% 180k ± 0% -8.31% (p=0.016 n=4+5) BM_executor/1k/1k [Nodes = 526256] 771k ± 0% 706k ± 0% -8.53% (p=0.008 n=5+5) BM_FeedInputFetchOutput 58.0 ± 0% 57.0 ± 0% -1.72% (p=0.008 n=5+5) PiperOrigin-RevId: 295803318 Change-Id: I0d262c6082822023f449f9817dc943d20bd302d5	2020-02-18 13:20:06 -08:00
Anna R	31c3789692	Split out node_def_util, op_def_builder, op_def_util and attr_value_util targets in tensorflow/core/framework/BUILD. Split out node_def_util.cc/.h into node_def_util.cc/.h and graph_node_util.cc/.h, where only the latter depends on graph.h. PiperOrigin-RevId: 288340739 Change-Id: I66932bab042bda4bd707f866514b18b80efa805b	2020-01-06 11:46:01 -08:00
TensorFlower Gardener	c87a16e17a	Merge pull request #33063 from bas-aarts:xla-merge PiperOrigin-RevId: 276286841 Change-Id: I4f9cbc4d82cc963676b0b55ea023d4792ee1b0c7	2019-10-23 09:14:40 -07:00
Derek Murray	1d0aff41ac	In Graph constructor, use `FunctionLibraryDefinition::num_functions()` to get number of functions. Previously, we were converting the FunctionLibraryDefinition to a FunctionDefLibrary proto to read out its number of functions, which incurs heavy allocation and deallocation costs when there are many functions. PiperOrigin-RevId: 274255340	2019-10-11 19:26:30 -07:00
Derek Murray	fc30d76c55	Automated rollback of commit `ce05dd80cc` PiperOrigin-RevId: 273515236	2019-10-08 07:37:58 -07:00
Derek Murray	ce05dd80cc	Automated rollback of commit `3d830370a8` PiperOrigin-RevId: 272959765	2019-10-04 23:40:25 -07:00
Derek Murray	3d830370a8	Cache whether a Node is a function op in the Node's class. This enables the testing for whether a node calls a function by checking the NodeClass enum, rather than looking up the type string in a FunctionLibraryDefinition map. We call IsFunctionCall() for every node in several graph rewrites (including the control-flow lowering pass), so this should reduce the cost of rewrites on large graphs. PiperOrigin-RevId: 272940579	2019-10-04 19:12:39 -07:00
Bas Aarts	791bf78c29	Add XLA-only merge that can merge all types. This prevents insertion of H2D and D2H copies when XLA-GPU clusters have int32 outputs. This merge is only used to merge the outputs from the XlaRun and the PartitionedCall node.	2019-10-04 15:29:07 -07:00
Zhuoran Liu	c45537230d	Make debug message more human-readable. PiperOrigin-RevId: 268257613	2019-09-10 11:53:34 -07:00
Mehdi Amini	0197a2d8a3	Add a check to catch out-of-bound access on invalid Graphs. The existing check for malformed graphs is not robust when an op is registered with an expected number of inputs but has data edges beyond this. PiperOrigin-RevId: 266826557	2019-09-02 16:57:38 -07:00
Ayush Dubey	39e7715eb0	Configure `NcclGather` in collective param resolution. Also add python tests that cover NCCL implementations of broadcast and all-gather. PiperOrigin-RevId: 261408514	2019-08-02 17:19:53 -07:00
Yanhua Sun	170a95de67	In while_v2 emit a StatelessIf op if the body is stateless. PiperOrigin-RevId: 260927755	2019-07-31 08:12:52 -07:00
Saurabh Saxena	22beb4c7d5	In cond_v2 emit a StatelessIf op if the body is stateless. PiperOrigin-RevId: 256009375	2019-07-01 14:16:11 -07:00
Derek Murray	a101d48091	Change `Graph::AddNode()` to take `node_def` by value. This method currently copies the given `const NodeDef&` proto into the returned `Node`. In many cases, the argument can be moved into the call, and we can elide a potentially large copy. PiperOrigin-RevId: 254008134	2019-06-19 09:18:27 -07:00
Brian Patton	10ed2f7bb5	Adds a lowering from Case to _SwitchN+Merge. Introduces a new n-way _SwitchN op+kernel. I audited usages of the grappler and graph variants of IsSwitch and IsMerge, and believe the corrections in this CL are correct. PiperOrigin-RevId: 250803634	2019-05-30 18:28:08 -07:00
Andy Ly	1a40c07e83	Expose FindKernelDef with NodeDef components (name, op, device, etc.). Update Grappler util's IsKernelRegisteredForNode to use lower FindKernelDef. PiperOrigin-RevId: 247104392	2019-05-07 16:33:58 -07:00
A. Unique TensorFlower	7867001d88	[TF core] Remove accidentally quadratic behaviour when repeatedly calling Graph::RemoveNode. We were resizing the edge free list on every call, which meant repeated calls would have a 100% probability of resizing the edge free list and making node removal O(n^2) in the number of nodes to remove. Instead, never call reserve. Use std::vector's default resizing heuristics to amortize this away. PiperOrigin-RevId: 246459207	2019-05-03 00:27:00 -07:00
A. Unique TensorFlower	c42d974e7a	Optimize tf graph manipulation: * Don't copy all nodes in PruneForReverseReachability. * Using std::vector<bool> instead of std::unordered_set<Node> for marking visited nodes makes PruneForReverseReachability about 2X faster. * std::move target nodes when calling PruneForReverseReachability. * Don't clear content of nodes when moving them to the free list: They are immediately overwritten when re-used, so no need to touch the memory where they reside when recycling them. * Add benchmarks for RemoveNode and PruneForReverseReachability. PiperOrigin-RevId: 242508299	2019-04-08 12:27:12 -07:00
A. Unique TensorFlower	4591bad9f9	Automated rollback of commit `fe30579d67` PiperOrigin-RevId: 241998820	2019-04-04 13:57:10 -07:00
A. Unique TensorFlower	fd32a7f773	Automated rollback of commit `fe30579d67` PiperOrigin-RevId: 241944783	2019-04-04 09:36:17 -07:00
A. Unique TensorFlower	98c3cfbf74	Automated rollback of commit `fe30579d67` PiperOrigin-RevId: 241863985	2019-04-03 21:26:55 -07:00
A. Unique TensorFlower	fe30579d67	Optimize tf graph manipulation: * Don't copy all nodes in PruneForReverseReachability. * std::move target nodes when calling PruneForReverseReachability. * Don't clear content of nodes when moving them to the free list: They are immediately overwritten when re-used, so no need to touch the memory where they reside when recycling them. PiperOrigin-RevId: 241841395	2019-04-03 17:57:18 -07:00
A. Unique TensorFlower	919b38007e	Speedup removal of nodes from Graph by not removing edges one by one from the node's own EdgeSet. Only remove it from the neighboring nodes' EdgeSets, then clear the node's own in_edges_ and out_edges_ in one operation. PiperOrigin-RevId: 240449165	2019-03-26 16:18:58 -07:00
Igor Ganichev	ea294844b0	Add Node::IsArg() and Node::IsRetval() methods and use them. As functions become more prominent, we have been directly checking for op types in many places. This is slow and buggy because not all places were updated for Device* versions of _Arg and _Retval. PiperOrigin-RevId: 238044205	2019-03-12 10:34:44 -07:00
Igor Ganichev	ed2b195990	Support function calls through PartitionedCall in tf2xla. Also, make eager runtime always emit PartitionedCall and remove special handling of xla compilation. Because this change makes XLA look inside PartitionedCalls, this change had to update/disable some tests that include PartitionedCalls with some uncompilable ops inside. PiperOrigin-RevId: 237486703	2019-03-08 11:32:38 -08:00
Jiri Simsa	5702b86c96	[tf.data] Marking dataset ops that consume a dataset without an iterator as stateful to make sure they are not pruned from the graph in case their output is not used. This is a conservative approach to guarantee that any side-effects of the op are carried out. This CL also reverts a previous (incomplete) solution to the same problem. PiperOrigin-RevId: 233663631	2019-02-12 19:11:25 -08:00
Jiri Simsa	eadd8aca53	Making sure dataset "output" ops are not pruned from function graphs as they might have side effects even though the ops are not marked as stateful. PiperOrigin-RevId: 229425577	2019-01-15 13:35:35 -08:00
A. Unique TensorFlower	f9699b8d40	Fixing build due to ambiguous vector constructor. PiperOrigin-RevId: 225859201	2018-12-17 11:30:46 -08:00
A. Unique TensorFlower	e8d6281e7e	[Error improvement] We now put an attribute for keeping track of the original source nodes. We have also changed many optimizers to correctly transmit the original node values. PiperOrigin-RevId: 225457141	2018-12-13 16:36:57 -08:00
Skye Wanderman-Milne	b4c2856141	Make AddWhileInputHack handle control inputs correctly. PiperOrigin-RevId: 225131361	2018-12-11 23:19:14 -08:00
Skye Wanderman-Milne	62db4a3ccf	Introduce Operation._add_while_inputs to allow adding inputs to a While op. This is in preparation for changing while_v2 to rewrite the forward pass to output intermediates needed by the gradient, instead of outputting all intermediates. Since While ops always have the same inputs and output types, we need to be able to add inputs in addition to adding outputs. PiperOrigin-RevId: 223812986	2018-12-03 10:11:42 -08:00
Skye Wanderman-Milne	199ead85e8	Fix bug in If lowering and make {Input,Output}Tensor.node non-const. Prior to this change, the If lowering pass would always use the first output of the predicate op as the predicate. This change makes it use the correct output. In addition, this adds more OutputTensor plumbing which required making the node field non-const. PiperOrigin-RevId: 221308751	2018-11-13 12:09:26 -08:00
Skye Wanderman-Milne	d1b2537f33	Don't constant fold FakeParam ops. FakeParam claims to have a different output shape than it actually outputs (since the output is not meant to ever be accessed). Prior to this change, ConstantFold() would call ReplaceTensorWithConstant() with the invalid FakeParam output, which would cause a use-before-initialization error. PiperOrigin-RevId: 217929903	2018-10-19 14:24:54 -07:00
A. Unique TensorFlower	be409cac81	During error conditions, we were putting the entire NodeDef in the user-facing error message. This makes the error messages very hard to read and understand. We now just put the name in the message, and put the NodeDef information in the logs. PiperOrigin-RevId: 217616833	2018-10-17 17:13:58 -07:00
Peter Hawkins	3f23f4ddea	Automated rollback of commit `6fa6bd045c` PiperOrigin-RevId: 217173355	2018-10-15 11:22:32 -07:00
Peter Hawkins	6fa6bd045c	Replace references to tensorflow::StringPiece with absl::string_view. No functional changes. PiperOrigin-RevId: 217170781	2018-10-15 11:01:32 -07:00
Tong Shen	72bf28cd1f	Add a utility function to build node name to node index. PiperOrigin-RevId: 216853788	2018-10-12 06:39:26 -07:00
Rachel Lim	09e098e505	Automated rollback of commit `d6a3d6a829` PiperOrigin-RevId: 216617037	2018-10-10 16:55:28 -07:00
A. Unique TensorFlower	d6a3d6a829	Automated rollback of commit `950cf87104` PiperOrigin-RevId: 216500702	2018-10-10 02:47:15 -07:00
Rachel Lim	950cf87104	[tf.data vectorization] Add vectorizer for `Add` op PiperOrigin-RevId: 216424512	2018-10-09 14:46:11 -07:00
Alexandre Passos	eec9ca8f0b	Partial support tfe.defun in tf.gradients. Doesn't attempt to deal with cases where we might have already generated the functiondef for the parent function as in that case we cannot easily modify the forward pass. PiperOrigin-RevId: 216243224	2018-10-08 13:58:40 -07:00
Rachel Lim	47eafbaf43	[tf.data] Add utility to deduplicate graph node names (after vectorization) PiperOrigin-RevId: 215595078	2018-10-03 11:29:40 -07:00
Sanjoy Das	9884cb3629	Check that IsValid{Input|Output}Tensor is only given non-control edges PiperOrigin-RevId: 215338658	2018-10-01 23:12:22 -07:00
Asim Shankar	7f52de1a2b	Make Graph::UpdateEdge() be O(e) instead of O(E) where: - E = number of edges in the graph - e = number of edges on the node of interest e is necessarily <= E and is typically really small (# of inputs to an operation + control edges) PiperOrigin-RevId: 210624296	2018-08-28 16:07:05 -07:00
A. Unique TensorFlower	4f4e1b4886	Removed redundant std::string -> string conversions. PiperOrigin-RevId: 210565027	2018-08-28 10:43:43 -07:00
Peter Hawkins	642a043de4	[TF:XLA] Replace bespoke NodeSlot class in subgraph encapsulation code with InputTensor and OutputTensor classes from TF core. Add equality and hash methods to InputTensor and OutputTensor. No functional changes intended. PiperOrigin-RevId: 200440015	2018-06-13 13:08:14 -07:00
A. Unique TensorFlower	9d2c6ff2a5	Collective Ops Part 7. Complete just enough of the core implementation to run multi-device collectives locally within a single process. Interfaces are still private and not available for general use. PiperOrigin-RevId: 197617132	2018-05-22 13:51:22 -07:00
A. Unique TensorFlower	170634d5a1	Replaced calls to tensorflow::StringPiece::ToString with std::string conversions. That is, instances of sp.ToString() are replaced with std::string(sp). This will allow tensorflow::StringPiece::ToString to be removed, which is necessary before it can be replaced with absl::string_view. PiperOrigin-RevId: 195689392	2018-05-07 16:39:29 -07:00

110 Commits