Commit Graph

1777 Commits

Frank Chen
23e387975e Rename tpu_load_library to tpu_api_dlsym_initializer.
This better reflects what the load library module does.

PiperOrigin-RevId: 317763973
Change-Id: I66169394c2f1d77b260323cb81d76f0f4e167579
2020-06-22 17:45:58 -07:00
Yujing Zhang
38d95ad2d8 [Cleanup] Remove allowed_devices of ResourceHandle since it's no longer used.
PiperOrigin-RevId: 317710941
Change-Id: Ib1920c5ee25d405290f852b725d693ee5ea09766
2020-06-22 13:18:43 -07:00
Gaurav Jain
ae423bd3bf Rollback: Add uint32 & uint64 to TF_CALL_INTEGRAL_TYPES
PiperOrigin-RevId: 317488920
Change-Id: I65736e7f4a1004ff634194343dc4ec237a227a19
2020-06-20 15:17:39 -07:00
TensorFlower Gardener
e264e71a44 Merge pull request from yongtang:40471-equal-output-shapes-autograph
PiperOrigin-RevId: 317396929
Change-Id: Id5ce5a139055d367e97ff39fc65cbe68af3146b6
2020-06-19 16:13:47 -07:00
A. Unique TensorFlower
1823f87735 Fix issues with TypeIndex on MacOS, i.e. hash on the type name where available since this otherwise causes problems when loading different shared libraries with RTLD_LOCAL.
PiperOrigin-RevId: 317395983
Change-Id: I14b3add5fa19725b2150b68813364d16b8320130
2020-06-19 15:59:55 -07:00
Mihai Maruseac
c3cc3c40b0 Move fuzzers for TF ops to own subdir. Trim some dependencies.
This duplicates some of the BUILD dependency tree to go around the need to link huge bottleneck dependencies (such as `//tensorflow/core:framework`). Until TF can use `cc_shared_library` in a stable way (and all support in Bazel exists), we will need to use the duplicated tree for fuzzing.

PiperOrigin-RevId: 317326319
Change-Id: I1493e3ae7340298971fe15bd3702b63657f9bf9f
2020-06-19 09:59:12 -07:00
Gaurav Jain
e972c55726 Add uint32 & uint64 to TF_CALL_INTEGRAL_TYPES
Both uint32 & uint64 had been omitted from TF_CALL_INTEGRAL_TYPES due to
concerns about size bloat. In reality it seems that the size
increase is only around 2MB. Further, this fixes an issue since we are no
longer inadvertently using the XLA_CPU device to perform tf.reduce_mean.

PiperOrigin-RevId: 317259372
Change-Id: Iacf75eaedce198fbef4bd9fd59b6fefa584cbf34
2020-06-19 00:12:21 -07:00
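A minimal sketch of the effect described above, assuming a TensorFlow build that includes this change (values are illustrative): with uint32/uint64 kernels registered via TF_CALL_INTEGRAL_TYPES, the reduction runs on a regular CPU kernel rather than being routed to the XLA_CPU device.

```python
import tensorflow as tf

# Sketch: uint32 reduction now handled by the standard CPU Mean kernel.
x = tf.constant([2, 4, 6], dtype=tf.uint32)
print(tf.reduce_mean(x))  # expected: tf.Tensor(4, shape=(), dtype=uint32)
```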
Yong Tang
83f19c6a9e Fix unknown output shape issue in autograph for tf.equal
This PR tries to address the issue raised in 40471 where
the output shape of an autograph-converted function consisting of tf.equal
could not be inferred correctly. Specifically,
`x.shape == [None, 10, 1]` and `y.shape == [None, 1, 4]`
only yield `shape == None` (should be `shape == [None, 10, 4]`).

The reason was that the shape inference function for equal
didn't capture the cases where both x's and y's dims are None.

This PR fixes the issue.

This PR fixes 40471.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2020-06-16 09:27:41 +00:00
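A minimal sketch of the behavior this commit targets, using the shapes from the message above (function and variable names are illustrative only):

```python
import tensorflow as tf

@tf.function(input_signature=[
    tf.TensorSpec([None, 10, 1], tf.float32),
    tf.TensorSpec([None, 1, 4], tf.float32),
])
def broadcast_equal(x, y):
    return tf.equal(x, y)

# With the fixed shape function, the broadcast output shape is inferred even
# though the leading dimension of both inputs is unknown.
concrete = broadcast_equal.get_concrete_function()
print(concrete.output_shapes)  # expected: (None, 10, 4); previously unknown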
Yujing Zhang
67487368bb Preserve FunctionDef.arg_attr in GrapplerFunctionItem.
PiperOrigin-RevId: 316498288
Change-Id: I6c3288c725bb281cca17256146c9ec3fd8cec5f0
2020-06-15 11:06:19 -07:00
A. Unique TensorFlower
615cedf9d2 More descriptive error message for Concat.
PiperOrigin-RevId: 316125402
Change-Id: I4f86104f4197fc1d2387d1914d5a9a54e965f31c
2020-06-12 10:15:05 -07:00
Jiri Simsa
40704b8933 [tf.data] Calculate the average input time for the root node of the data input pipeline.
PiperOrigin-RevId: 315904116
Change-Id: Ica73e017ab796a8ff97b92552c8b2aa698edebb3
2020-06-11 08:37:37 -07:00
Jay Shi
57c09eb4d8 [tf.data] Calculate the average input time for the root node of the data input pipeline.
PiperOrigin-RevId: 315740720
Change-Id: I55967b18f1f039847049656ccfd849714e83a4cf
2020-06-10 12:18:06 -07:00
A. Unique TensorFlower
9c236222b3 [tf.data] Add comments for the input time computation in the InterleaveMany node.
PiperOrigin-RevId: 315535430
Change-Id: I0587175a9288c465676225b573d9ac136fbec361
2020-06-09 12:51:32 -07:00
A. Unique TensorFlower
05355c404a [tf.data] Update the input time computation in the InterleaveMany node.
PiperOrigin-RevId: 315331988
Change-Id: I105dd74bb87092f2d56781cdb8e28d2c39360100
2020-06-08 13:03:55 -07:00
Gaurav Jain
b0b763203e Add ability for functions to share rendezvous
The private `_shared_rendezvous` property allows the function to use the
rendezvous of the parent. This is only needed in order to support code
where raw send/recv operations are inserted and when functions are run
in graph mode where they may not be inlined.

PiperOrigin-RevId: 315319264
Change-Id: Ieb6b3924c51ccfd201b4693f3a499f883c7c0b71
2020-06-08 11:47:46 -07:00
Gaurav Jain
4e7ce793d9 Merge various kernel registrations with macros
We add the TF_CALL_COMPLEX_TYPES macro and update related kernel
registrations with more compact macros rather than the individual dtype
listings.

This should be a no-op and should give better visibility into what is
the dtype coverage for many of our kernels.

PiperOrigin-RevId: 315224662
Change-Id: I14aad07711a407fa632a94d891238a48ae89bcab
2020-06-08 00:42:40 -07:00
TensorFlower Gardener
5b70031ebf Merge pull request from tg-at-google:master
PiperOrigin-RevId: 315036791
Change-Id: I2a345bf39bf6eda7af7f4356781f3d83ff1ca502
2020-06-05 19:37:43 -07:00
Gaurav Jain
a565c473c1 Improve error message when attributes are invalid
If we are unable to find any valid devices for a node, we can do a quick
check to see if the node is even valid as per the op definition. This
greatly improves the eager error message since there is no point in
listing all the available kernels across all devices if we know none of
them can match.

Previous:
NotFoundError: Could not find device for node: {{node GatherV2}} = GatherV2[Taxis=DT_INT32, Tindices=DT_FLOAT, Tparams=DT_INT32, batch_dims=0]
All kernels registered for op GatherV2:
  device='CPU'; Tparams in [DT_INT64]; Tindices in [DT_INT32]
  device='CPU'; Tparams in [DT_INT64]; Tindices in [DT_INT64]
  device='CPU'; Tparams in [DT_INT32]; Tindices in [DT_INT32]

... Many more registrations ...

New:
InvalidArgumentError: Value for attr 'Tindices' of float is not in the list of allowed values: int32, int64
	; NodeDef: {{node GatherV2}}; ...
PiperOrigin-RevId: 314963092
Change-Id: I8072e7ba9e6d316570a536780d78992691e620f1
2020-06-05 11:35:08 -07:00
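A hedged repro sketch of the case quoted above: calling GatherV2 with float indices in eager mode (tensor values are illustrative; which exception is raised depends on whether this change is present).

```python
import tensorflow as tf

params = tf.constant([10, 20, 30], dtype=tf.int32)
indices = tf.constant([0.0, 2.0], dtype=tf.float32)  # Tindices must be int32/int64

try:
    print(tf.gather(params, indices))
except tf.errors.InvalidArgumentError as e:
    # New behavior: attr validation fails fast with a short message.
    print(e)
except tf.errors.NotFoundError as e:
    # Previous behavior: a long listing of every registered GatherV2 kernel.
    print(e)
```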
tg-at-google
c799bd996e Update tensor_shape.cc
2020-06-05 08:56:29 -04:00
Frank Chen
75c40f6bff Add a rudimentary library loader implementation for some TPU related functions
PiperOrigin-RevId: 314863001
Change-Id: Iafe056ab3fcf592cd28873e6fd740121f17d1a91
2020-06-04 21:49:12 -07:00
Tare Gaskin
49cf3e2341 in resolution of [Wsign-compare] warning ids: [22, 23, 24, 25, 26, 27] 2020-06-04 20:27:56 +00:00
Zhenyu Tan
ff307c36d6 Add a rudimentary library loader implementation for some TPU related functions
PiperOrigin-RevId: 314575034
Change-Id: I14c32fe73063e903eccacd212320cd253c941531
2020-06-03 12:17:14 -07:00
Jiri Simsa
33b3e6cd06 [tf.data] Improvements to the performance modeling framework.
This CL switches from using iterator prefix for identifying the parent node in the model tree when a node is constructed to directly passing a parent pointer to the constructor. In addition, this CL makes the `IteratorBase::InitializeBase` method public, which makes it possible to fix `cache` and `snapshot` implementations to reflect their use of nested iterators in the model tree.

PiperOrigin-RevId: 314570434
Change-Id: Ide0b37f404077938ad8dc4fbbd91489b7197c6e1
2020-06-03 12:02:15 -07:00
TensorFlower Gardener
62826a2d37 Merge pull request from tg-at-google:patch-11
PiperOrigin-RevId: 314567994
Change-Id: Ic31b06095620dd9e21d3acdf980ec7c2c683da66
2020-06-03 11:36:31 -07:00
TensorFlower Gardener
68744f8673 Merge pull request from tg-at-google:patch-12
PiperOrigin-RevId: 314562794
Change-Id: I4ee281e9f5d4fee17ba33fbebe64f50472fd18c2
2020-06-03 11:18:37 -07:00
Frank Chen
2cbb43e1fc Add a rudimentary library loader implementation for some TPU related functions
PiperOrigin-RevId: 314496921
Change-Id: I9c499f37525dee7b2cf138089b052e020766d54f
2020-06-03 02:56:17 -07:00
tg-at-google
251db72cfc in resolution of [Wsign-compare] warning id 9
Static-cast `int index` for the sake of the single `size_t`-to-`int` comparison on line 116.
2020-06-02 11:07:21 -04:00
tg-at-google
1499585ea6 in resolution of [Wsign-compare] warning id 10
LargeAllocationWarningBytes() is implied to be a quantity (it calls AvailableRam(), which, unless improperly named, is a quantity, i.e. non-negative).

Given that LargeAllocationWarningBytes() is a quantity, it can be cast to size_t with behavior constrained to be as intended.
2020-06-02 10:59:06 -04:00
Anna R
6f22fa9376 Removing device-related code that doesn't seem to be used. Usage was removed by the following commits: 223c8bdf89, 04c23099c2 (diff-3780f0ef44936240abc76c4c42541532).
PiperOrigin-RevId: 314178530
Change-Id: I7a2502d691610a6cd44a9752e9f48e4798071f13
2020-06-01 12:34:36 -07:00
A. Unique TensorFlower
66529c35a7 Add timeout to collective ops to detect deadlocks.
The timeout is set as an argument to a collective op. When a non-zero value is given, a completion timeout is set to detect staleness. If the timeout fires, the execution is aborted with a DEADLINE_EXCEEDED error.

PiperOrigin-RevId: 313861868
Change-Id: I7fee45736608ad7fbcc9dd980db2fd302c9cb4df
2020-05-29 15:37:42 -07:00
Haoyu Zhang
356121e563 Server-side cancellation support for distributed function execution.
1. Thread the RPC cancel signal through the eager service RunComponentFunction calls;
2. Always pass the cancellation manager to the underlying executor (instead of only passing it when `is_eager` is true, i.e., for pure eager ops). With this we do not need to cancel the rendezvous from the process FLR; instead the ExecutorState takes care of it when an op fails.
3. Do not mark all statuses as derived when aborting rendezvous or triggering cancellation, since doing so usually results in the original errors being buried among the derived errors.

PiperOrigin-RevId: 313814162
Change-Id: Ia866f5f522a0b1aa54e9dce7b9cc0bcf7682136a
2020-05-29 13:11:45 -07:00
George Karpenkov
90b80fba1a [TF/XLA] On compilation failure, do not overflow the max size of the bad status by a huge list of function inputs
PiperOrigin-RevId: 313433935
Change-Id: Iaff5c61ce01c6eac7894bed4edd76a396f846151
2020-05-27 11:59:06 -07:00
A. Unique TensorFlower
a67ee929f5 Add a tensorflow::batch_util::CopyContiguousSlices utility function for
slicing out contiguous pieces of tensors along the batch dimension and
copying them to another tensor.

PiperOrigin-RevId: 313414257
Change-Id: I2530c58ed53ad8e92e5f976f2dd1728296d12185
2020-05-27 10:26:55 -07:00
Andrew Audibert
f24faa153a Add dataset element compression ops.
These allow us to implement tf.data service compression/decompression as a part of the tf.data pipeline.

PiperOrigin-RevId: 312605093
Change-Id: I4a833bc89e602c8fd78abc4c1a0026c2a397449f
2020-05-20 20:08:06 -07:00
Frank Chen
e56cf87b54 Adds necessary hooks to load a TPU-specific shared library.
PiperOrigin-RevId: 312601701
Change-Id: I1ae43d253d1734c30ffefe4d4062c82639d7a4d1
2020-05-20 19:34:16 -07:00
A. Unique TensorFlower
e234c0a44e [tf.data] Update output time functions to solve the stack overflow problem. Also update some mathematics computation in the code.
PiperOrigin-RevId: 311858942
Change-Id: Iafa345b5d235c60a455671c924af594396a361ad
2020-05-15 23:04:02 -07:00
Anna R
28229ffdbf Delete the Tensor constructor that takes a pointer. Otherwise, say, std::make_unique<Tensor>(GetTensorSomewhereThatActuallyReturnsAPointer()) would construct a boolean tensor without a compile-time error.
PiperOrigin-RevId: 311778946
Change-Id: Ibdb69ff7c4a9697028ed30ac40ffb0797b4493f9
2020-05-15 12:31:38 -07:00
A. Unique TensorFlower
f84726697e [tf.data] Update some maths formulas in the ComputeWaitTime function.
PiperOrigin-RevId: 311436667
Change-Id: Ie3537625e9daac73caba5f790b90b65507f999f7
2020-05-13 17:28:26 -07:00
A. Unique TensorFlower
1afe51a60c [tf.data] Update the node destructor to solve the stack overflow problem.
PiperOrigin-RevId: 311220597
Change-Id: I7efaa889a27e52c0d05bec9778a7f40976a5e90e
2020-05-12 16:08:07 -07:00
Hye Soo Yang
1638fe218d Fix for adhering to latest clang style guide.
PiperOrigin-RevId: 311197936
Change-Id: I014b041ff03f656587651da9a4977688d501d330
2020-05-12 14:08:36 -07:00
A. Unique TensorFlower
fedc6d951f Remove all uses of TENSORFLOW_LITE_PROTOS.
PiperOrigin-RevId: 310116123
Change-Id: I5fa28cc61644efc2fd202e80997a5c4c0a227572
2020-05-06 02:43:41 -07:00
Anna R
1a434fba5e Reorganizing kernel fallback into its own (kernel/) directory. Right now this looks a bit awkward, but the long-term organization will look as follows:
tfrt_fallback/
   runtime/
   kernel/

Files renamed:
tfrt_forwarding.* --> tfrt_op_kernel.* (to indicate similarity with op_kernel.* in TF)
tfrt_forwarding_kernels.cc --> kernel_fallback_kernels.cc
tfrt_forwarding_attr_util.* --> attr_util.*

PiperOrigin-RevId: 310016286
Change-Id: Ia863afbcab136ded40c9aa2f04a3806044244fac
2020-05-05 14:05:19 -07:00
Derek Murray
c8d6fce87c [CallFrameInterface] Add methods ConsumeArg() and CanConsumeArg().
At present, the `CallFrameInterface` (and, by extension, all TensorFlow functions that pass arguments via `ArgOp`) retains ownership of one reference on each argument tensor for the lifetime of the frame. This prevents buffer forwarding optimizations from being performed on arguments, which can lead to performance issues. However, in some cases (e.g. the loop variables in a `WhileOp`, the arguments to the map function in `Dataset.map()`) we know enough about the caller to be able to "move" the arguments into the function call.

This change adds the following:
* Methods called `CanConsumeArg(int index)` and `ConsumeArg(int index, Tensor* val)` to `CallFrameInterface` that allow the runtime to query the call frame for consumable arguments. The default implementations do not permit any arguments to be consumed, for backwards compatibility.
* An implementation of these methods in tf.data's "captured_function.cc", allowing arguments to `Dataset.map()` functions (and similar) to be consumed by the function call.
* A specialized `CallFrameInterface` implementation for the synchronous pass in `WhileOp` that allows arguments to be consumed.
* Modifications to `SingleThreadedExecutorImpl` to query the `CanConsumeArg()` method and consume arguments wherever possible.

Potential future additions:
* Add `ConsumeArg()` support to the default executor.
* Use custom call frames with `ConsumeArg()` support in more functional ops.

PiperOrigin-RevId: 309981157
Change-Id: I02f7b9f5611a27b087ee0058540ab7e38e70c21d
2020-05-05 11:09:16 -07:00
Yujing Zhang
8fcb130e92 Support packed tensor inputs in ProcessFunctionLibraryRuntime.
- Expand the packed _Arg nodes when the graph is ready for graph partition.
- Introduce an optional sub-index to function Arg nodes, in order to distinguish between two arguments with the same "index". It happens after replacing a packed _Arg node which is assigned to a CompositeDevice with multiple replica nodes (one per device).

The "index" of an _Arg node is unique before expanding it. It's also unique within each subgraph after graph partition.

PiperOrigin-RevId: 309781835
Change-Id: Ic6e351f45b7523288b5dae30997ddf0dae86660b
2020-05-04 11:18:39 -07:00
Derek Murray
00cac2ef52 [OpKernelContext] Enable plumbing the executor type through to kernels.
This change adds an `OpKernelContext::executor_type()` method, which (by analogy with `OpKernelContext::runner()` and `OpKernelContext::run_all_kernels_inline()`) enables function calls within a kernel to inherit that option from the calling context. As a concrete example, this enables a WhileOp or IfOp running with the SINGLE_THREADED_EXECUTOR executor_type to use the same optimizations in the branch/cond/body functions of those ops.

PiperOrigin-RevId: 309515120
Change-Id: I11b0b3ee458dd8ea1cdc9284f8acdb293c5cb770
2020-05-01 19:39:38 -07:00
Andrew Audibert
655d6d3cca [tf.data service] Support __iter__ for tf.data service datasets.
Now we can iterate through tf.data service datasets with `for elem in dataset`, and `distributed_dataset.repeat()` will work correctly.

This CL removes the previous method of iteration via CreateJob/CreateDataServiceIterator. It wasn't yet made public, so it is OK to remove the old ops.

PiperOrigin-RevId: 309281077
Change-Id: I9531f7d2834ce6669f15896d8c830d23d8277b13
2020-04-30 12:52:12 -07:00
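A hedged usage sketch of what this enables; the dispatcher address is hypothetical, and the API path shown is the one that later shipped as tf.data.experimental.service (treat it as an assumption for the code at this revision).

```python
import tensorflow as tf

# Assumes a tf.data service dispatcher is running at this (hypothetical) address.
dataset = tf.data.Dataset.range(10)
dataset = dataset.apply(tf.data.experimental.service.distribute(
    processing_mode="parallel_epochs", service="grpc://localhost:5000"))

for elem in dataset:  # plain Python iteration over the distributed dataset
    print(elem.numpy())
```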
Gaurav Jain
cccfd47023 Clearly indicate when to disable optimizations
Many ops mark themselves as stateful even though they primarily want to
disable optimizations such as constant folding and CSE. We add a new
method to clearly indicate this intent even though we are currently not
adding a new flag.

PiperOrigin-RevId: 309253555
Change-Id: I8cae8bbc4c3b71819ee869b1870fce1e39e061be
2020-04-30 10:37:25 -07:00
Derek Murray
4a935b268e Add FunctionLibraryRuntime::RunSync() methods that avoid the need for creating a callback.
In many cases, we invoke a TensorFlow function synchronously, and have to create a callback and notification to block on the result. With some executor types (e.g. SINGLE_THREADED_EXECUTOR), this causes unnecessary atomic operations to be executed (e.g. setting and destroying the Notification), which can account for ~100ns.

PiperOrigin-RevId: 309105760
Change-Id: Ie5b3aef5c4dcd3529e6a3a2701ac152b9630cc01
2020-04-29 15:14:43 -07:00
Derek Murray
000c8f09ea [Build cleanup] Update #includes of moved header "graph/graph_constructor.h".
This change modifies these includes to point to
"tensorflow/core/common_runtime/graph_constructor.h" instead. This change will enable us to remove the accidental dependency from //tensorflow/core/graph to //tensorflow/core/common_runtime.

PiperOrigin-RevId: 309035649
Change-Id: I2af0fdd6a6ccc4ae8d351a9117a69b6fc80c22e9
2020-04-29 09:20:48 -07:00
Derek Murray
bb1e5c0d09 [RunHandler] Minor optimizations to environment variable handling.
1. Avoid re-evaluating the "TF_RUN_HANDLER_USE_SUB_THREAD_POOL" environment variable each time a task is enqueued, by caching the result in a static local variable.
2. A similar optimization in `ChooseRequestsWithExponentialDistribution()`.
3. Since all the arguments are string literals, we can pass `const char*` to the `ParamFromEnv*WithDefault()` methods, to avoid creating a temporary `std::string` on each call.

PiperOrigin-RevId: 308915732
Change-Id: I1642b5d924477eb006497acf95ac7cfc17956feb
2020-04-28 16:07:48 -07:00