The current depthwise_conv is very inefficient: it calls slice() on each
input channel of both the input and the filters, runs a separate conv() per
channel, and then concat()s the results.
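For illustration, a minimal Python sketch of the pattern being replaced
(the function name and shapes here are hypothetical, not the actual kernel):

  import tensorflow as tf

  def depthwise_conv_naive(inputs, filters):
    # inputs:  [batch, height, width, in_channels]
    # filters: [fh, fw, in_channels, channel_multiplier]
    outputs = []
    for c in range(inputs.shape[-1]):
      x_c = inputs[:, :, :, c:c + 1]        # slice() one input channel
      f_c = filters[:, :, c:c + 1, :]       # slice() the matching filter
      outputs.append(tf.nn.conv2d(x_c, f_c, strides=1, padding="SAME"))
    return tf.concat(outputs, axis=-1)      # concat() per-channel results

Each iteration launches its own conv(), so the per-channel slicing and the
final concat() dominate the cost; a fused kernel avoids that round trip.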
Change: 115601904
The current depthwise_conv is very inefficient: it calls slice() on each
input channel of both the input and the filters, runs a separate conv() per
channel, and then concat()s the results.
Change: 115583330
These tools allow recording structured benchmark and unit-test output to
pbtxt files in a directory, but only when the environment variable
TEST_REPORT_FILE_PREFIX is set. For now, only saving of C++ microbenchmark
output is supported.
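A minimal sketch of the gating, with a hypothetical recording helper (only
the TEST_REPORT_FILE_PREFIX variable comes from the actual change):

  import os

  def maybe_record_benchmark(name, metrics):
    prefix = os.environ.get("TEST_REPORT_FILE_PREFIX")
    if prefix is None:
      return  # recording stays off unless the variable is set
    path = "%s%s.pbtxt" % (prefix, name.replace("/", "__"))
    with open(path, "w") as f:
      for key, value in sorted(metrics.items()):
        # Illustrative pbtxt-style serialization, not the real schema.
        f.write('entry { name: "%s" value: %s }\n' % (key, value))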
Change: 115518303
Helps with: https://github.com/tensorflow/tensorflow/issues/917
Also fixes https://github.com/tensorflow/tensorflow/issues/1162
The main benefit is that the computation of the sufficient statistics is now decoupled from the aggregation of the moments. This means that if you want to perform the accumulation incrementally, you don't have to keep all the inputs around, and can instead keep the much more compact sum and sum-of-squares. Accumulation could also be performed locally if you aggregate across multiple devices.
Computing sum and sum-of-squares can also theoretically be performed in parallel now.
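A minimal numpy sketch of the idea (names are illustrative): accumulate the
compact sufficient statistics per batch, then derive the moments once at the
end.

  import numpy as np

  def accumulate(stats, batch):
    count, s, ss = stats
    # Only count, sum, and sum-of-squares are kept, never the inputs.
    return (count + batch.size, s + batch.sum(), ss + np.square(batch).sum())

  def moments_from_stats(stats):
    count, s, ss = stats
    mean = s / count
    variance = ss / count - mean * mean   # E[x^2] - E[x]^2
    return mean, variance

  stats = (0, 0.0, 0.0)
  for batch in (np.random.randn(32), np.random.randn(64)):
    stats = accumulate(stats, batch)      # incremental; also device-local
  mean, var = moments_from_stats(stats)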
Tested running inception: same performance, same step time.
The batch normalization benchmark is a bit faster on CPU and a bit slower on GPU:
Before:
cpu shape:4/3 #layers:10 mode:py scale:True train:False - 1.139310 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.021970 secs
cpu shape:4/3 #layers:10 mode:py scale:True train:True - 2.767147 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:True - 0.074531 secs
cpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.742835 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.013473 secs
cpu shape:4/3 #layers:10 mode:py scale:True train:True - 1.738806 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:True - 0.052777 secs
cpu shape:2/1 #layers:10 mode:py scale:True train:False - 0.119180 secs
gpu shape:2/1 #layers:10 mode:py scale:True train:False - 0.011201 secs
cpu shape:2/1 #layers:10 mode:py scale:True train:True - 0.218297 secs
gpu shape:2/1 #layers:10 mode:py scale:True train:True - 0.048526 secs
After:
cpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.998944 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.025828 secs
cpu shape:4/3 #layers:10 mode:py scale:True train:True - 2.657428 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:True - 0.086614 secs
cpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.603137 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:False - 0.017668 secs
cpu shape:4/3 #layers:10 mode:py scale:True train:True - 1.519533 secs
gpu shape:4/3 #layers:10 mode:py scale:True train:True - 0.055214 secs
cpu shape:2/1 #layers:10 mode:py scale:True train:False - 0.071344 secs
gpu shape:2/1 #layers:10 mode:py scale:True train:False - 0.016440 secs
cpu shape:2/1 #layers:10 mode:py scale:True train:True - 0.222093 secs
gpu shape:2/1 #layers:10 mode:py scale:True train:True - 0.039967 secs
Change: 115507032
Both gather and scatter now unconditionally validate indices in the inner loop,
which prevents crashes if indices are changed asynchronously while the ops are
running.
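A small Python model of the fused-loop idea (illustrative only, not the C++
kernel): the bounds check happens at the point of use, so an index mutated
concurrently cannot slip past an earlier, separate validation pass.

  import numpy as np

  def gather_validated(params, indices):
    out = np.empty((len(indices),) + params.shape[1:], dtype=params.dtype)
    limit = params.shape[0]
    for i, idx in enumerate(indices):
      if not 0 <= idx < limit:            # validation fused into the copy loop
        raise ValueError("index %d out of range [0, %d)" % (idx, limit))
      out[i] = params[idx]
    return out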
For gather when validate_indices = true, the new code is within the noise of
the old code's speed, and possibly slightly faster (unsurprising, since the
new code fuses two loops). Specifically, the geometric mean of the int32
gather benchmarks goes from 4.05 GB/s to 4.04-4.07 GB/s.
For gather when validate_indices = false, the new code is about 1.5% slower
than both the old code and a variant of the old code that supported
validate_indices = false. Xiaoqiang and I deem this difference insufficient
to preserve the unsafe code path, so poof: it's gone.
For scatter (which always validates), the new code is slightly faster than the
old code: the geometric mean goes from 546-559M items/s to 573M items/s.
Change: 115467091
to be used to colocate based on attributes rather than either
names of ops or devices (op names and devices aren't portable).
A follow-up change will add an ops.colocate_with() to Python that adds
this attribute to nodes; it will be used to replace calls to 'with
tf.device(foo.device)' in TF library code, which assume that devices
have already been specified.
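A sketch of the intended usage once ops.colocate_with() lands, written
against today's tf.compat.v1 API (the exact call shape is an assumption):

  import tensorflow as tf

  g = tf.Graph()
  with g.as_default():
    v = tf.compat.v1.get_variable("v", initializer=[1.0, 2.0])
    # Old style: hard-codes a device string, and only works once a
    # device has actually been assigned to v.
    #   with tf.device(v.device):
    #     update = v.assign_add([1.0, 1.0])
    # New style: record a colocation constraint as a node attribute,
    # keeping the graph portable across devices.
    with tf.compat.v1.colocate_with(v):
      update = v.assign_add([1.0, 1.0])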
Change: 115463464