Compare commits


624 Commits

Author SHA1 Message Date
Mihai Maruseac
cdf2c541c3
Merge pull request #45767 from tensorflow-jenkins/relnotes-2.0.4-9907
Update release notes for TensorFlow 2.0.4
2021-01-04 12:19:09 -08:00
Mihai Maruseac
7041a615ec
Update RELEASE.md 2021-01-04 11:57:46 -08:00
Mihai Maruseac
47c3737ee5
Merge pull request #46153 from tensorflow/mm-cherry-pick-sqlite-bump-on-r2.0
Update SQLite to the latest sqlite-amalgamation-3340000
2021-01-04 11:34:45 -08:00
Yong Tang
610e7edd3d Update SQLite to the latest sqlite-amalgamation-3340000
This PR updates SQLite to the latest sqlite-amalgamation-3340000

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2021-01-04 11:23:01 -08:00
Mihai Maruseac
ad6e7edf68
Merge pull request #45884 from tensorflow/mm-fix-broken-cherrypick-on-r2.0
Disable failing test
2020-12-19 19:03:41 -08:00
Mihai Maruseac
f3dcdbbc53 Disable failing test 2020-12-19 19:01:58 -08:00
Mihai Maruseac
33f107fa6f
Merge pull request #45810 from tensorflow/fix-cp-2.0
Fix bad cherrypick
2020-12-17 14:30:28 -08:00
Geeta Chavan
e006acfa2e Fix bad cherrypick 2020-12-17 13:56:59 -08:00
Mihai Maruseac
ef46b8fc9d
Merge pull request #45803 from tensorflow/update_import_path
Fix import path
2020-12-17 12:55:17 -08:00
Geeta Chavan
550dd04e20 Fix import path 2020-12-17 12:41:13 -08:00
Mihai Maruseac
412d527a54
Merge pull request #45525 from tensorflow/mm-cherrypick-6d7da36623b-on-r2.0
Prevent uninitialized memory access in `GraphConstructor::MakeEdge`
2020-12-17 09:36:05 -08:00
Mihai Maruseac
46b39ddd5b
Merge pull request #45524 from tensorflow/mm-cherrypick-b95ccc06e04-on-r2.0
Prevent CHECK-fail in LSTM/GRU with zero-length input.
2020-12-17 09:35:40 -08:00
Mihai Maruseac
3aa68a19d9
Merge pull request #45523 from tensorflow/mm-cherrypick-c1e1fc899ad5f8c725dcbb6470069890b5060bc7-on-r2.0
Mark `MemmappedTensorAllocator` as returning opaque handle.
2020-12-17 09:34:55 -08:00
Mihai Maruseac
906099567b
Merge pull request #45522 from tensorflow/mm-cherrypick-faf7af8ef8a-on-r2.0
Validate that `DataFormat*` attributes form a permutation.
2020-12-17 09:34:42 -08:00
Mihai Maruseac
1e93fea67d
Merge pull request #45521 from tensorflow/mm-cherrypick-ace0c15a22f7f054abcc1f53eabbcb0a1239a9e2-on-r2.0
Default initialize fixed point Eigen types.
2020-12-17 09:34:10 -08:00
Mihai Maruseac
a0bb7cae4c
Merge pull request #45772 from tensorflow-jenkins/version-numbers-2.0.4-3081
Update version numbers for TensorFlow 2.0.4
2020-12-17 09:20:00 -08:00
TensorFlow Release Automation
07b946f108 Update version numbers to 2.0.4 2020-12-16 16:48:46 -08:00
TensorFlow Release Automation
3475c0f2c2 Insert release notes place-fill 2020-12-16 16:31:31 -08:00
Mihai Maruseac
9f3305ea15
Merge pull request #45549 from tensorflow/mm-cherry-pick-pcre-fixes-on-r2.0
Update PCRE library from 8.42 to 8.44
2020-12-16 09:53:00 -08:00
Mihai Maruseac
27b086e0df
Merge pull request #43988 from tensorflow/mm-cherry-pick-java-fixes-on-r2.0
Bump junit from 4.11 to 4.13.1.
2020-12-16 09:48:22 -08:00
Mihai Maruseac
b86785cdd3
Merge pull request #44847 from tensorflow/mm-cherry-pick-libjpeg-turbo-on-r2.0
Bump libjpeg-turbo from 2.0.4 to 2.0.5
2020-12-16 09:45:48 -08:00
Mihai Maruseac
cf5067a6a7
Merge pull request #45711 from tensorflow/mm-pin-h5py-on-r2.0
Add upper bound to `h5py`.
2020-12-15 15:52:25 -08:00
Mihai Maruseac
3fe8742f0f
Merge pull request #45722 from tensorflow/mm-disable-segfault-tests-on-r20
Disable a few tests.
2020-12-15 15:46:11 -08:00
Mihai Maruseac
1dbb3af0da Disable a few tests.
These tests now segfault after a dependency beneath us was updated.
2020-12-15 15:23:10 -08:00
Mihai Maruseac
0094ab4ecb Add upper bound to h5py.
Newer versions of `h5py` would cause errors in Keras tests due to a
difference between `unicode` and `str`. Since `h5py` comes from `keras`
as an unbounded dependency, we have to pin it manually this way.
2020-12-15 12:13:06 -08:00
Yong Tang
c7667263be Update PCRE library from 8.42 to 8.44
This PR updates PCRE library from 8.42 to 8.44.

Note there is a CVE related to the old 8.42 (https://nvd.nist.gov/vuln/detail/CVE-2019-20838#VulnChangeHistorySection)

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2020-12-09 11:22:23 -08:00
Mihai Maruseac
e24dbaf797 Prevent uninitialized memory access in GraphConstructor::MakeEdge
The `MakeEdge` implementation assumes that there exists an output at `output_index` of the `src` node and an input at `input_index` of the `dst` node. However, if this is not the case, this results in accessing data out of bounds. Because we are accessing an array that is a private member of a class, and only in read-only mode, this usually results only in uninitialized memory access. However, it is reasonable to think that malicious users could manipulate these indexes to actually read data outside the class, thus resulting in information leakage and further exploits.

PiperOrigin-RevId: 346343288
Change-Id: I2127da27c2023d27f26efd39afa6c853385cab6f
2020-12-08 18:30:07 -08:00
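The bounds problem this commit describes can be illustrated with a minimal Python sketch (names and signature hypothetical; the actual fix lives in the C++ `GraphConstructor`): before wiring an edge, validate both indices against the nodes' actual output/input counts instead of trusting the caller.

```python
def make_edge(src_num_outputs, output_index, dst_num_inputs, input_index):
    """Hypothetical sketch of the validation described above.

    Reject the edge unless the requested output of `src` and input of
    `dst` actually exist; otherwise the lookup would read out of bounds.
    """
    if not 0 <= output_index < src_num_outputs:
        raise ValueError(
            f"output_index {output_index} not in [0, {src_num_outputs})")
    if not 0 <= input_index < dst_num_inputs:
        raise ValueError(
            f"input_index {input_index} not in [0, {dst_num_inputs})")
    return (output_index, input_index)
```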
Mihai Maruseac
84d7ddf6d6 Prevent CHECK-fail in LSTM/GRU with zero-length input.
PiperOrigin-RevId: 346239181
Change-Id: I5f233dbc076aab7bb4e31ba24f5abd4eaf99ea4f
2020-12-08 18:29:54 -08:00
Mihai Maruseac
07d2f0766c Mark MemmappedTensorAllocator as returning opaque handle.
This allocator is used for `ImmutableConstantOp` and it returns a handle to the contents of a memory mapped file which is supposed to represent a tensor.

For tensors of complex types (resources, variables and strings), allocators which are not marked as returning opaque handles will call placement new to initialize each element. This means writing to the buffer. However, in our case, the buffer is immutable and already contains the tensor data. Hence, writing to it is both destructive and causes a crash.

PiperOrigin-RevId: 345786451
Change-Id: I46369c50fa60b3431709ffe068a728d3061f49c4
2020-12-08 18:29:42 -08:00
Mihai Maruseac
54461b130d Validate that DataFormat* attributes form a permutation.
The `src_format` and `dst_format` attributes for the `DataFormatDimMap` and `DataFormatVecPermute` raw ops are supposed to determine a permutation. However, this was not validated and could result in uninitialized memory accesses, out-of-bounds writes, and potential crashes.

While here, we also test that the format attributes have the needed length, add tests for all validation failure cases, remove unnecessary calls to `strings::StrCat`, and fix a few grammar errors.

This will be cherry-picked on the supported release branches.

PiperOrigin-RevId: 346135579
Change-Id: I1c76392382c89ad8f072d5bc93d70669851eb404
2020-12-08 18:29:26 -08:00
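The permutation check this commit adds can be sketched in a few lines of Python (a sketch under assumed semantics; the real validation is in the C++ kernels): the two format strings must have the same length, the same characters, and no duplicates.

```python
def forms_permutation(src_format, dst_format):
    """Return True iff dst_format rearranges exactly the characters of
    src_format, with no duplicates (e.g. "NHWC" -> "NCHW")."""
    return (len(src_format) == len(dst_format)
            and len(set(src_format)) == len(src_format)
            and sorted(src_format) == sorted(dst_format))
```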
Mihai Maruseac
a3b91595bb Default initialize fixed point Eigen types.
In certain cases, tensors are filled with default values of the type. But, for these fixed point types, these values were uninitialized. Thus, we would have uninitialized memory access bugs, some of which were caught by MSAN.

PiperOrigin-RevId: 344101137
Change-Id: I14555fda74dca3b5f1582da9008901937e3f14e2
2020-12-08 18:29:17 -08:00
Yong Tang
e57caee599 Bump libjpeg-turbo from 2.0.4 to 2.0.5
It looks like the latest libjpeg-turbo is 2.0.5, so this PR
bumps the version (currently 2.0.4).

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2020-11-13 09:54:47 -08:00
dependabot[bot]
0804acb639 Bump junit in /tensorflow/java/maven/spark-tensorflow-connector
Bumps [junit](https://github.com/junit-team/junit4) from 4.11 to 4.13.1.
- [Release notes](https://github.com/junit-team/junit4/releases)
- [Changelog](https://github.com/junit-team/junit4/blob/main/doc/ReleaseNotes4.11.md)
- [Commits](https://github.com/junit-team/junit4/compare/r4.11...r4.13.1)

Signed-off-by: dependabot[bot] <support@github.com>
2020-10-13 16:11:44 -07:00
dependabot[bot]
58b28de21d Bump junit in /tensorflow/java/maven/tensorflow-hadoop
Bumps [junit](https://github.com/junit-team/junit4) from 4.11 to 4.13.1.
- [Release notes](https://github.com/junit-team/junit4/releases)
- [Changelog](https://github.com/junit-team/junit4/blob/main/doc/ReleaseNotes4.11.md)
- [Commits](https://github.com/junit-team/junit4/compare/r4.11...r4.13.1)

Signed-off-by: dependabot[bot] <support@github.com>
2020-10-13 16:11:44 -07:00
Mihai Maruseac
295ad27816
Merge pull request #43443 from tensorflow-jenkins/version-numbers-2.0.3-23981
Update version numbers for TensorFlow 2.0.3
2020-09-21 18:54:47 -07:00
TensorFlow Release Automation
c574e641ff Update version numbers to 2.0.3 2020-09-21 18:49:30 -07:00
Mihai Maruseac
1bdd956e63
Merge pull request #43438 from tensorflow-jenkins/relnotes-2.0.3-26591
Update release notes for TensorFlow 2.0.3
2020-09-21 18:46:10 -07:00
Mihai Maruseac
60f0b7b1db
Update RELEASE.md 2020-09-21 18:45:55 -07:00
TensorFlow Release Automation
e081967f1d Insert release notes place-fill 2020-09-21 17:09:55 -07:00
Mihai Maruseac
975a52309c
Merge pull request #43411 from tensorflow/mm-fix-2.0
Fix tests broken by merge conflicts
2020-09-20 20:26:16 -07:00
Mihai Maruseac
0ea1b845d8 Add missing import 2020-09-20 20:24:33 -07:00
Mihai Maruseac
b98dc690f1 No disable_tfrt present on this branch 2020-09-20 20:24:33 -07:00
Mihai Maruseac
4ec48a161f
Merge pull request #43407 from tensorflow/mihaimaruseac-patch-2
Remove import that is not needed
2020-09-20 17:49:39 -07:00
Mihai Maruseac
c55d0c3008
Remove import that is not needed 2020-09-20 17:48:57 -07:00
Mihai Maruseac
a6752f0ceb
Merge pull request #43404 from tensorflow/mihaimaruseac-patch-1
Fix import path
2020-09-20 17:04:04 -07:00
Mihai Maruseac
e47dd65e72
Fix import path 2020-09-20 17:03:52 -07:00
Mihai Maruseac
63e6ed8802
Merge pull request #43402 from tensorflow/mihaimaruseac-patch-2-1
Cast away a const in intermediate API
2020-09-20 16:57:28 -07:00
Mihai Maruseac
baa1b690f7
Cast away a const in intermediate API 2020-09-20 16:57:05 -07:00
Mihai Maruseac
4573558d1c
Merge pull request #43399 from tensorflow/mihaimaruseac-patch-1
Fix typo in macro
2020-09-20 16:35:45 -07:00
Mihai Maruseac
67e53f87ad
Fix typo in macro 2020-09-20 16:35:35 -07:00
Mihai Maruseac
43903c16d0
Merge pull request #43395 from tensorflow/mihaimaruseac-patch-2
Fix import path
2020-09-20 15:58:37 -07:00
Mihai Maruseac
5502bd7d05
Fix import path 2020-09-20 15:58:23 -07:00
Mihai Maruseac
b71eb54495
Merge pull request #43389 from tensorflow/mihaimaruseac-patch-2
Solve leftover from merge conflict
2020-09-20 15:25:53 -07:00
Mihai Maruseac
3f13b7ef99
Solve leftover from merge conflict 2020-09-20 15:24:29 -07:00
Mihai Maruseac
7732945f62
Merge pull request #43355 from tensorflow/mm-patch-r2.0
Patch for TF 2.0.3
2020-09-20 12:30:09 -07:00
Mihai Maruseac
ec14e1b429 Fix undefined behavior in tf.raw_ops.Switch in eager mode.
PiperOrigin-RevId: 332578058
Change-Id: I9727571d2f21476b10d8aa27c1b7176564b76ac9
2020-09-20 10:35:38 -07:00
Mihai Maruseac
0615b26093 Fix heap buffer overflow in tf.raw_ops.SparseFillEmptyRowsGrad.
Also add tests as they were lacking

PiperOrigin-RevId: 332566071
Change-Id: I44277578e26ff5fb3fdb0dcbba6e91b2ec3e7859
2020-09-20 09:50:00 -07:00
Mihai Maruseac
446b0ead53 Prevent integer truncation from 64 to 32 bits.
The last argument of the `tensorflow::Shard` function must be a two-argument function where both arguments are `int64` (`long long`, 64 bits). However, there are usages where code passes in a function whose arguments are `int` or `int32` (32 bits). In these cases, it is possible that the integer truncation would later cause a segfault or other unexpected behavior.

PiperOrigin-RevId: 332560414
Change-Id: Ief649406babc8d4f60b3e7a9d573cbcc5ce5b767
2020-09-19 19:57:30 -07:00
Mihai Maruseac
3a322cb0f4 Prevent int64 to int truncation in Shard API usage.
The function argument in `Shard` must be a function of two `int64` arguments. However, we are passing in a function with two `int` arguments. Thus, for large workloads, these arguments get truncated from positive `int64` values to negative `int` ones, resulting in a buffer out of bounds write.

PiperOrigin-RevId: 332557334
Change-Id: I236c9a2e7f53580e520571da8ba941a3aa9fa0b5
2020-09-19 19:41:42 -07:00
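The truncation both `Shard` commits describe is easy to reproduce. This is a demonstration, not TensorFlow code: Python's `ctypes` is used to emulate C++'s implicit narrowing from a 64-bit to a 32-bit integer, showing how a valid positive workload size wraps to a negative value.

```python
import ctypes

def as_cpp_int32(x):
    # Emulate the implicit int64 -> int32 narrowing that happens in C++
    # when a 64-bit value is passed to a parameter of 32-bit int type.
    return ctypes.c_int32(x).value

workload = 3_000_000_000       # a valid, positive int64 element count
wrapped = as_cpp_int32(workload)
print(wrapped)                 # a negative 32-bit value: buffer index gone bad
```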
Mihai Maruseac
e432a7a775 Prevent format string vulnerability in tf.strings.as_string.
The `printf` format specifier only allows `#`, `0`, `-`, `+` and space as flag characters. Others are interpreted as width/precision/length modifiers or conversion specifiers. If a character does not fit into any of these sets, `printf` just displays it.

Also add a test suite for `tf.strings.as_string`. Also fix the issue where the flag character was used only if width was specified.

PiperOrigin-RevId: 332553548
Change-Id: Ie57cf2a7c14d1a36097642794c14329db669bbba
2020-09-19 19:30:19 -07:00
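A minimal sketch of the kind of validation this commit describes (hypothetical names; the real fix is in the C++ kernel for `tf.strings.as_string`): accept only the characters `printf` defines as flags, and reject anything else before it ever reaches a format string.

```python
ALLOWED_PRINTF_FLAGS = set("#0-+ ")  # the only flag characters printf permits

def validate_fill_char(fill):
    """Reject fill/flag characters that printf would instead interpret
    as a width, precision, length modifier, or conversion specifier."""
    if fill and fill not in ALLOWED_PRINTF_FLAGS:
        raise ValueError(f"invalid fill character {fill!r}")
    return fill
```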
Mihai Maruseac
8cdee3c8fe Prevent segfault in GetSessionHandle{,V2}.
In eager mode, session state is null.

PiperOrigin-RevId: 332548597
Change-Id: If094812c2e094044220b9ba28f7d7601be042f38
2020-09-19 19:27:52 -07:00
Mihai Maruseac
f95fa5faf4 Validate data_splits for tf.StringNGrams.
Without validation, we can cause a heap buffer overflow which results in data leakage and/or segfaults.

PiperOrigin-RevId: 332543478
Change-Id: Iee5bda24497a195d09d122355502480830b1b317
2020-09-19 19:16:12 -07:00
Mihai Maruseac
b902fac120 Fix bad import 2020-09-19 18:50:15 -07:00
Mihai Maruseac
4c1c846f60 Validate NodeDefs from FunctionDefLibrary of a GraphDef.
We already validated `NodeDef`s from a `GraphDef` but missed validating those from the `FunctionDefLibrary`. Thus, some maliciously crafted models could evade detection and cause denial of service due to a `CHECK`-fail.

PiperOrigin-RevId: 332536309
Change-Id: I052efe919ff1fe2f90815e286a1aa4c54c7b94ff
2020-09-19 18:44:48 -07:00
Mihai Maruseac
e4d36adab4 Prevent loading saved models where constant nodes have no tensor value.
Also reorder fuzz generated test cases following f760f88b42

PiperOrigin-RevId: 308339007
Change-Id: I11d825203964cf3397846c57fd4a6f458e8536f3
2020-09-19 18:44:48 -07:00
Mihai Maruseac
db6e19099e Properly handle negative shape dimensions from improper saved models.
PiperOrigin-RevId: 308283636
Change-Id: Ib10849425de7d541d8dacfe4d0c709fbac9180b6
2020-09-19 18:44:48 -07:00
Mihai Maruseac
ce945d5b0e [tflite] Ensure ResolveAxis properly handles negative inputs.
In Python, a list `l` of length `n` allows indexing with negative indices, `l[i]`. The only constraint is that `n + i` must be nonnegative. Code in `ResolveAxis` assumes this constraint and only checks it using a `DCHECK`. But the macro is a no-op in non-debug builds, which can result in reading from negative offsets (buffer underflows).

PiperOrigin-RevId: 332530683
Change-Id: I464e073fee618054ae3719a3679739007bb3f3bc
2020-09-19 18:19:28 -07:00
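In Python terms, the hardened check this commit describes looks roughly like this (a sketch only; the real code is C++ in TFLite's kernel utilities): always map negative axes into range and reject the result if it is still out of bounds, rather than relying on a debug-only `DCHECK`.

```python
def resolve_axis(axis, num_dims):
    """Map a possibly negative axis into [0, num_dims), rejecting
    out-of-range values in all builds, not just debug builds."""
    if axis < 0:
        axis += num_dims
    if not 0 <= axis < num_dims:
        raise ValueError(f"axis out of range for rank {num_dims}")
    return axis
```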
Mihai Maruseac
79deaeb06c [tflite] Ensure MatchingDim does not allow buffer overflow.
We check in `MatchingDim` that both arguments have the same dimensionality; however, that is a `DCHECK`, only enabled in debug builds. Hence, it could be possible to cause buffer overflows by passing in a tensor with larger dimensions as the second argument. To fix, we now make `MatchingDim` return the minimum of the two sizes.

A much better fix would be to return a status object but that requires refactoring a large part of the codebase for minor benefits.

PiperOrigin-RevId: 332526127
Change-Id: If627d0d2c80a685217b6e0d1e64b0872dbf1c5e4
2020-09-19 18:09:52 -07:00
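The min-of-the-two-sizes behavior this commit settles on can be sketched as follows (hypothetical Python stand-in for the C++ helper): rather than trusting a debug-only equality check, clamp to the smaller dimension so a mismatched second argument can never drive a loop out of bounds.

```python
def matching_dim(shape_a, index_a, shape_b, index_b):
    """Sketch of the hardened helper: return the smaller of the two
    dimension sizes instead of assuming (via a debug-only check) that
    they are equal."""
    return min(shape_a[index_a], shape_b[index_b])
```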
Mihai Maruseac
7bb92eeb9f [tflite] Ensure input tensors don't have nullptr buffers.
A crafted TFLite model can force a node to have as input a tensor backed by a `nullptr` buffer. That is, by carefully changing the buffer index in the flatbuffer serialization, we can force the TFLite interpreter to consider a read-only tensor to be a read-write one and assume that there is an operator that has this tensor as output, writing to it and allocating memory before the tensor is used as input. If this does not happen, we get memory corruption.

PiperOrigin-RevId: 332524692
Change-Id: I57ef175152a29020af9ab041dc959e5631dce40f
2020-09-19 18:00:34 -07:00
Mihai Maruseac
8c2092e9f9 [tflite] Ensure inputs and outputs don't overlap.
If a model uses the same tensor for both an input and an output then this can result in data loss and memory corruption. This should not happen.

PiperOrigin-RevId: 332522916
Change-Id: If0905b142415a9dfceaf2d181872f2a8fb88f48a
2020-09-18 19:23:05 -07:00
Mihai Maruseac
0b5be2717a [tflite] Make GetOptionalInputTensor the same as GetInput.
With the previous change, there is no longer a need for two separate APIs. We will deprecate `GetOptionalInputTensor` in the future.

PiperOrigin-RevId: 332513386
Change-Id: Id7110271c25ebd6126ad8c82a493e37e0e0756b3
2020-09-18 18:42:20 -07:00
Mihai Maruseac
d8f8236c29 [tflite] Test for kTfLiteOptionalTensor in GetInput.
`GetInput`, `GetVariableInput` and `GetOutput` all fail to check for the case where `node->inputs->data[index]` is the special `kTfLiteOptionalTensor` value (-1), which then causes `context->tensors[node->inputs->data[index]]` to read from an invalid memory location.

This fix makes `GetInput` and related functions return `nullptr` in those cases, asking the caller to check for `nullptr`. This is better than having `GetOptionalInputTensor` and `GetOptionalOutputTensor` (which does not exist but could be added), as using the patched `GetInput` in error would be caught by a sanitizer test in the default optimized build (due to the `-fsanitize=null` option).

PiperOrigin-RevId: 332512190
Change-Id: Iabca54da2f2de02b6ece3c38b54f76d4277d689e
2020-09-18 18:22:41 -07:00
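The patched lookup this commit describes can be sketched in Python (names hypothetical; the real code is the C++ `GetInput` in TFLite): when the stored index is the optional-tensor sentinel, return a null value instead of indexing the tensor array with -1, and let the caller check.

```python
K_TFLITE_OPTIONAL_TENSOR = -1  # sentinel meaning "this input was omitted"

def get_input(tensors, node_input_indices, index):
    """Sketch of the hardened lookup: return None for the sentinel
    instead of reading tensors[-1] (a wrong, but in C++ also unsafe,
    location)."""
    tensor_index = node_input_indices[index]
    if tensor_index == K_TFLITE_OPTIONAL_TENSOR:
        return None
    return tensors[tensor_index]
```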
Mihai Maruseac
015b91f752 [tflite] Don't check for buffers on every subgraph.
Buffers in the model are allocated globally, hence it makes sense to check for
their presence only once (O(1)) instead of on every subgraph (O(n)).

PiperOrigin-RevId: 323677724
Change-Id: I2da0c381093006828cc4c80f03dec8a917782861
2020-09-18 16:28:34 -07:00
Mihai Maruseac
f19bc1bcf5
Merge pull request #40731 from tensorflow/mm-cherry-pick-sqlite-fix-r2.0
Cherry-pick sqlite version bump
2020-09-18 11:16:58 -07:00
Mihai Maruseac
fe9e707e4f Bump sqlite to 3.33.0
This should handle CVE-2020-15358.

PiperOrigin-RevId: 332484006
Change-Id: Id2e7c4e877fcfaa53184fd21139a00f3234a5e3d
2020-09-18 11:13:35 -07:00
Mihai Maruseac
5fd6a2d0a8
Merge pull request #43305 from tensorflow/mihaimaruseac-patch-1
Disable test which fails on mac pip
2020-09-17 10:16:51 -07:00
Mihai Maruseac
d80787c241
Disable test which fails on mac pip 2020-09-17 10:15:59 -07:00
Mihai Maruseac
f3fece3d60
Merge pull request #43283 from tensorflow/mihaimaruseac-patch-1
Fix missing comma
2020-09-16 17:06:07 -07:00
Mihai Maruseac
0703cc53ae
Fix missing comma 2020-09-16 17:05:48 -07:00
Mihai Maruseac
ba7ae97c01
Merge pull request #43281 from tensorflow/patch_release
Fixing build files for broken test/builds
2020-09-16 15:22:15 -07:00
Geeta Chavan
cb10f638c4 Fixing build files for broken test/builds 2020-09-16 15:15:00 -07:00
Mihai Maruseac
3ecef3cbc6
Merge pull request #42536 from tensorflow/update_v_2.0
Pin numpy version to 1.19
2020-08-21 01:00:49 +00:00
Geeta Chavan
0da03ad0c9 Pin numpy version to 1.19 2020-08-20 16:25:47 -07:00
Mihai Maruseac
5e8d4012b4
Merge pull request #41706 from angerson/r2.0
Backport new rel/ structure to r2.0 branch
2020-07-24 17:15:49 +00:00
Austin Anderson
06e7f8aebc Backport new rel/ structure to r2.0 branch
These scripts are from the branch, not cloned from master.
2020-07-24 08:42:03 -07:00
TensorFlower Gardener
67746f8bd9 Merge pull request #40705 from Intel-tensorflow:chuanqiw/upgrade_sqlite
PiperOrigin-RevId: 317934381
Change-Id: I95cdf789f7f5a89d75d45a1b6d67f1ad993cafab
2020-06-23 14:23:05 -07:00
Mihai Maruseac
2c2fdd3205
Merge pull request #39355 from tensorflow-jenkins/relnotes-2.0.2-25649
Update release notes for TensorFlow 2.0.2
2020-05-12 19:46:23 +00:00
Mihai Maruseac
d853307857
Update RELEASE.md 2020-05-12 12:45:52 -07:00
Mihai Maruseac
2cab83adf3
Merge pull request #37063 from tensorflow/mm-cherrypick-36960-cve-fixes-on-r2.0
[Intel Mkl] Upgrade Sqlite3 to fix CVE-2019-19880 CVE-2019-19244 and …
2020-05-10 17:23:04 +00:00
Mihai Maruseac
b64d5ed6ea
Merge pull request #38407 from tensorflow/mm-cherrypick-libjpeg-turbo-on-r2.0
Cherrypick libjpeg-turbo version update
2020-05-10 17:03:40 +00:00
Mihai Maruseac
f09139cab4
Merge pull request #38275 from tensorflow/mm-cherrypick-curl-cve-fix-on-r2.0
Cherry-pick curl CVE fix
2020-05-10 15:44:26 +00:00
Mihai Maruseac
4dc855b72e
Merge pull request #39086 from ShaneSmiskol/r2.0-cherry-pick-pycharm-dynamic-display
Fix dynamic display for PyCharm (r2.0)
2020-05-10 15:17:02 +00:00
Mihai Maruseac
4d1d786c91
Merge pull request #39191 from tensorflow/mm-cherry-pick-apache-switches-on-r2.0
Increase Apache Spark version to 2.4.5 to handle GitHub Security Alert
2020-05-10 13:27:44 +00:00
Mihai Maruseac
e6f531b523
Merge pull request #39356 from tensorflow-jenkins/version-numbers-2.0.2-6911
Update version numbers for TensorFlow 2.0.2
2020-05-09 17:03:20 +00:00
TensorFlow Release Automation
ac3518ed2d Update version numbers to 2.0.2 2020-05-09 10:02:34 -07:00
TensorFlow Release Automation
7fdf7f2565 Insert release notes place-fill 2020-05-09 09:54:42 -07:00
Mihai Maruseac
2ab9864cd9 Increase Apache Spark version to 2.4.5 to handle GitHub Security Alert
Handles CVE-2019-10099, CVE-2018-17190, CVE-2018-11770.

To be cherrypicked on r1.15, r2.0, r2.1 and r2.2 branches

PiperOrigin-RevId: 309955549
Change-Id: I5ee68fdd3270534066487be67232c1abc687f968
2020-05-05 09:14:28 -07:00
Shane Smiskol
30d8248a2f Fix dynamic display for PyCharm 2020-05-01 15:09:15 -05:00
TensorFlower Gardener
0ecb0e0005 Merge pull request #38401 from yongtang:libjpeg-turbo
PiperOrigin-RevId: 305742301
Change-Id: I50968c0868f70a5009f018d3132cba77c0900158
2020-04-09 13:23:39 -07:00
Mihai Maruseac
24ccca49fa Add socketpair.c to curl buildable files to fix Windows builds.
Follow-up from bfb0e49d58

PiperOrigin-RevId: 305351839
Change-Id: Ic7a8b4942394d6d030e93b3ad9179e0bffdc434c
2020-04-07 15:33:55 -07:00
TensorFlower Gardener
cf6831f6da Merge pull request #38200 from Intel-tensorflow:chuanqiw/curl_upgrade
PiperOrigin-RevId: 304938718
Change-Id: I408e3b1d9ce1badfb08666ddac6400bae2c97936
2020-04-06 09:36:33 -07:00
Clayne Robison
b5dd7bf139 [Intel Mkl] Upgrade Sqlite3 to fix CVE-2019-19880 CVE-2019-19244 and CVE-2019-19645 2020-02-25 12:48:58 -08:00
Mihai Maruseac
a641fa1c5c
Merge pull request #36260 from angerson/r2.0
Remove 2.0.x Python 2 docker images
2020-01-27 20:54:22 +00:00
Austin Anderson
791c6faf17 Remove 2.0.x Python 2 docker images 2020-01-27 12:46:35 -08:00
Mihai Maruseac
765ac8d16e
Merge pull request #35913 from tensorflow-jenkins/relnotes-2.0.1-6767
Update release notes for TensorFlow 2.0.1
2020-01-22 23:43:57 +00:00
Mihai Maruseac
0bcb99b375
Add CVE number for main patch 2020-01-21 17:12:22 -08:00
Mihai Maruseac
a093c7ebd4
Merge pull request #36085 from tensorflow/mm-r2.0-fix-release-builds-pt4
Attempt 4 at fixing release builds
2020-01-20 20:52:46 -08:00
Mihai Maruseac
63aedd7d84 Disable test that times out on mac non pip builds 2020-01-20 20:09:15 -08:00
Mihai Maruseac
619c578581 Disable the gpu on cpu tests as they were added for 2.1 2020-01-20 19:58:47 -08:00
Mihai Maruseac
1a617d66fe
Merge pull request #36047 from tensorflow/mm-r2.0-fix-release-builds-pt3
Third attempt at fixing the release builds
2020-01-19 19:03:55 -08:00
Mihai Maruseac
32d9138b2e Cleanup the windows builds 2020-01-19 18:32:35 -08:00
Mihai Maruseac
dd1ebd7542 Cleanup macos builds 2020-01-19 18:32:18 -08:00
Mihai Maruseac
3b9305981c Remove py2 macos scripts 2020-01-19 18:32:02 -08:00
Mihai Maruseac
606596f080 Remove builds which are not needed for the release 2020-01-19 18:31:43 -08:00
Mihai Maruseac
39283d4b86 Pin estimator and tensorboard in common.sh instead of in build files 2020-01-19 17:25:52 -08:00
Mihai Maruseac
fe523f06c7
Merge pull request #36041 from tensorflow/mm-r2.0-fix-release-builds-pt2
Fix release builds pt2
2020-01-19 10:46:57 -08:00
Mihai Maruseac
3d77cdd7dc Remove the tests that depend on tf.contrib 2020-01-19 08:41:47 -08:00
Mihai Maruseac
932e052347 Remove -y on pip install as it does not exist 2020-01-19 08:41:42 -08:00
Mihai Maruseac
3b432d69b0
Merge pull request #36026 from tensorflow/mm-r2.0-patch
Fix segfault when attempting to convert string to float16.
2020-01-19 08:09:28 -08:00
Mihai Maruseac
b8c41d43e6 Fix copy-paste error 2020-01-18 15:21:39 -08:00
Mihai Maruseac
328255abee Fix BUILD file 2020-01-18 14:57:44 -08:00
Mihai Maruseac
bedfb3999f
Update RELEASE.md 2020-01-18 14:52:35 -08:00
Mihai Maruseac
620c7705fb Fix cherry-pick 2020-01-18 14:16:45 -08:00
Mihai Maruseac
3ccda1c38d Changes with no issues 2020-01-18 14:02:00 -08:00
Mihai Maruseac
d2b2ee94f2 Revert "Fix segfault when attempting to convert string to float16."
This reverts commit 7dc97c2704.
2020-01-18 14:01:00 -08:00
Mihai Maruseac
cc01e36c53
Merge pull request #36022 from tensorflow/mm-r2.0-fix-release-builds
Fixes to the release builds, based on first run attempt
2020-01-18 13:44:41 -08:00
Mihai Maruseac
7dc97c2704 Fix segfault when attempting to convert string to float16.
To make sure this gets fixed, add test for converting string to any numeric type.

PiperOrigin-RevId: 286650886
Change-Id: I81f770ec2bbd33a863e8057ce198c679912fa8e0
2020-01-18 13:42:51 -08:00
Mihai Maruseac
89a24acee1 Pin estimator and tensorboard at their version during final release 2020-01-18 12:18:06 -08:00
Mihai Maruseac
5aa3a3b23f Fix linux/gpu_on_cpu build based on changes to master 2020-01-18 11:54:10 -08:00
Mihai Maruseac
b356642918 Remove non-existent build in TF2.0 2020-01-18 10:56:54 -08:00
Mihai Maruseac
c847bdea29 Use xcode 10.3 for pip builds too. See #32998 2020-01-18 10:51:39 -08:00
Mihai Maruseac
719bd67640 Update xcode to 10.3 2020-01-18 10:39:38 -08:00
Mihai Maruseac
c31671d519
Merge pull request #36012 from tensorflow/mm-r2.0-deps-upgrade
Upgrade dependencies
2020-01-18 07:22:47 -08:00
Mihai Maruseac
0645aeb94e Add missing quic.h header file 2020-01-17 21:42:55 -08:00
Mihai Maruseac
bf94600681 Update curl 2020-01-17 17:04:29 -08:00
Mihai Maruseac
06865f81f7 Update URL to giflib to https 2020-01-17 16:52:09 -08:00
Mihai Maruseac
2774d80642 Update sqlite amalgamation 2020-01-17 16:51:59 -08:00
Mihai Maruseac
340387d8b4
Merge pull request #34995 from sisp/pep508-environment-markers-r2.0
Use PEP508 environment markers (r2.0)
2020-01-16 07:03:52 -08:00
Mihai Maruseac
0ee3001040
Merge pull request #35916 from tensorflow-jenkins/version-numbers-2.0.1-8555
Update version numbers for TensorFlow 2.0.1
2020-01-16 07:03:30 -08:00
TensorFlow Release Automation
86d38ef153 Update version numbers to 2.0.1 2020-01-15 14:17:37 -08:00
TensorFlow Release Automation
809fa9f8d4 Insert release notes place-fill 2020-01-15 13:59:20 -08:00
Mihai Maruseac
b65ea7395a
Merge pull request #33982 from tensorflow/cherrypick-release-build-2pt0
Cherrypick release builds.
2020-01-15 10:44:18 -08:00
Mihai Maruseac
f4b25218cd --incompatible_list_based_execution_strategy_selection is not a startup option 2020-01-14 08:51:43 -08:00
Mihai Maruseac
758b71bce6 Fix sanity reporting 2020-01-14 08:26:58 -08:00
Mihai Maruseac
918f82abbb Fix buildifier (2/2) 2020-01-14 07:59:31 -08:00
Mihai Maruseac
cd6a90f1c0 Fix buildifier 2020-01-14 07:49:11 -08:00
Mihai Maruseac
37658d220f Disable //tensorflow/python/eager:backprop_test on mac 2020-01-10 13:37:41 -08:00
Mihai Maruseac
33238693f0 Add Python 3.7 testing on macOS as we drop support for Python 2.
To be cherry-picked on `r1.15`, `r2.0`, and `r2.1` branches.

PiperOrigin-RevId: 287871757
Change-Id: Ic530e884de421a39a82c686f1f0d086b6400d75c
2020-01-03 08:25:40 -08:00
A. Unique TensorFlower
01d790fa09 Reduced tolerance of ExponentialOpTest.
PiperOrigin-RevId: 281156604
Change-Id: I57fae6b19444a5e4ccf4a731ffe6722269fda4c4
2019-12-26 14:26:10 -08:00
Mihai Maruseac
0d7a1b96e0 Proper bazel flag usage for android build 2019-12-26 13:13:42 -08:00
Mihai Maruseac
b26dddd05f Add --incompatible_list_based_execution_strategy_selection to fix more presubmits 2019-12-26 11:15:10 -08:00
Mihai Maruseac
fa4a5f2c1b Fix android build error due to spacing issues.
BUILD rules should start on the first column in the file, not indented.

PiperOrigin-RevId: 286949644
Change-Id: I4e1b11d7e8889d63193725270ccfb1bac522e15d
2019-12-26 08:07:41 -08:00
Mihai Maruseac
74f7f99cd1 Add open source build scripts for the android presubmit.
PiperOrigin-RevId: 286931649
Change-Id: I68c08a14cf12f0d33b2ddeb182751128ab8dac6e
2019-12-23 12:59:51 -08:00
Mihai Maruseac
e755c3428f Add open source build scripts for the macos presubmit.
PiperOrigin-RevId: 286931214
Change-Id: I0a85a1e92835aad78acab729e07795ca103dce8c
2019-12-23 12:59:46 -08:00
Mihai Maruseac
efa1d86ec7 Add open source build scripts for the windows presubmits.
PiperOrigin-RevId: 286931071
Change-Id: If6ef44a6a677f9ecfbf40a00173b46684b926da9
2019-12-23 12:59:37 -08:00
Mihai Maruseac
6fa9a5f415 Add open source build scripts for the ubuntu presubmits.
PiperOrigin-RevId: 286911782
Change-Id: I599073962631f632fae4d6e79c7347982428cecb
2019-12-23 10:33:11 -08:00
Mihai Maruseac
af1ffe7c9c Publish sanity presubmit script.
PiperOrigin-RevId: 286669248
Change-Id: I687e242e69784a804e477fce909b9a091c8f43ad
2019-12-20 18:29:38 -08:00
Mihai Maruseac
8da3a7ba71 Add RBE to .bazelrc 2019-12-20 14:43:53 -08:00
Sigurd Spieckermann
c739b7f878 Use PEP508 environment markers 2019-12-10 11:31:34 +01:00
Hye Soo Yang
4726095fc7 Reverting bazel changes. 2019-11-07 09:54:47 -08:00
A. Unique TensorFlower
bcab47312f Upgrade Bazel to 0.29.1
Besides upgrading the Bazel version, we also refactored all build scripts to use rbe options in .bazelrc file.
In order to migrate for https://github.com/bazelbuild/bazel/issues/7480, we have to specify the complete strategies list in .bazelrc file.

PiperOrigin-RevId: 275459466
Change-Id: Iaec997da7862245955a36ebb1018d901f61c591d
2019-11-07 09:37:07 -08:00
A. Unique TensorFlower
9acc00baac Update bazel in the build scripts.
PiperOrigin-RevId: 278921759
Change-Id: Ib25e275eb64478cea9ad9d4e3298e195a0920aac
2019-11-06 16:06:49 -08:00
Hye Soo Yang
3d7a9c895a Downgrade expected bazel version. 2019-11-04 10:26:25 -08:00
Hye Soo Yang
e630ac31af Cherrypick release builds. 2019-11-04 09:56:29 -08:00
Goldie Gadde
1cf0898dd4
Merge pull request #32474 from ROCmSoftwarePlatform/r2.0-rocm-upstream-squashed
[ROCM] Patch to enable rocm for r2.0 release branch
2019-10-08 09:40:53 -07:00
sunway513
c4604a3265 Update python/keras/optimizer_v2/BUILD to pass buildifier test 2019-10-05 13:40:57 -05:00
sunway513
3f80be7894 Fix conv_ops_3d and pooling_ops_test scripts 2019-10-03 15:39:18 -05:00
Goldie Gadde
64c3d382ca Update RELEASE.md 2019-09-27 14:56:33 -07:00
Goldie Gadde
2845767d91 Update RELEASE.md 2019-09-27 14:56:33 -07:00
Nathan Luehr
3d230aaa1f Update release notes for tensorrt and mixed precision 2019-09-27 14:56:33 -07:00
Goldie Gadde
b1c53619cf Update RELEASE.md 2019-09-27 14:56:33 -07:00
Goldie Gadde
51054374ea Update RELEASE.md 2019-09-27 14:56:33 -07:00
Goldie Gadde
cf6180b841 Update RELEASE.md 2019-09-27 14:56:33 -07:00
Goldie Gadde
ec8d660892 Release Notes for 2.0.0-rc0 2019-09-27 14:56:33 -07:00
Goldie Gadde
ac24e9eb3a
Merge pull request #32861 from guptapriya/cherrypicks_5NZHH
Mark tf.keras.utils.multi_gpu_model as deprecated.
2019-09-27 08:35:23 -07:00
Priya Gupta
23a94133f5 Mark tf.keras.utils.multi_gpu_model as deprecated.
PiperOrigin-RevId: 271495434
2019-09-26 22:16:49 -07:00
Goldie Gadde
1f372a0968
Merge pull request #32742 from rmlarsen/cherrypicks_BX1WK
[r2.0-CherryPick]: [Grappler] Fix bug in layout optimizer introduced by cl/247704284. Re…
2019-09-26 18:46:21 -07:00
Goldie Gadde
5f13570706
Merge pull request #32852 from gunan/r2.0
[r2.0-CherryPick]: Fix licenses in C, java and python packages.
2019-09-26 18:45:58 -07:00
Gunhan Gulsoy
0a492b89f4 Include the license of TensorFlow in the pip package.
PiperOrigin-RevId: 271441009
2019-09-26 16:09:44 -07:00
Gunhan Gulsoy
03156c145d Make sure TF license is included in C/Java
TF package license was overwritten with a file with the same name.

Now, in C, the licenses are at the root as:
LICENSE: TensorFlow license file
THIRD_PARTY_TF_C_LICENSES: licenses for all TF dependencies.

Java:
LICENSE: TensorFlow license file
THIRD_PARTY_TF_JNI_LICENSES: licenses for all TF dependencies.
PiperOrigin-RevId: 271376706
2019-09-26 16:09:41 -07:00
Goldie Gadde
a2766d279f
Merge pull request #32823 from pooyadavoodi/enable_trt6
[r2.0-CherryPick]:Update TRT6 headerfiles
2019-09-25 21:21:25 -07:00
Goldie Gadde
99655ee92e
Merge pull request #32824 from tensorflow/ggadde-2-0-final-version
[r2.0-CherryPick]: Update TF version to 2.0.0.
2019-09-25 18:57:18 -07:00
Goldie Gadde
5d5bca28e7
Merge pull request #32826 from tensorflow/cherrypick_fc_v2
[r2.0-CherryPick]: Fix deprecation warnings in feature_column_v2.
2019-09-25 18:56:54 -07:00
Yanhui Liang
30af3984ff Fix deprecation warnings in feature_column_v2.
PiperOrigin-RevId: 271211187
2019-09-25 16:13:47 -07:00
Goldie Gadde
a4180fc5c6 Update tf version to 2.0.0. 2019-09-25 15:37:43 -07:00
Pooya Davoodi
a88fef676c Add new TensorRT6 functions to SimpleITensor 2019-09-25 14:53:20 -07:00
Pooya Davoodi
6ce88bc983 Update TRT6 headerfiles 2019-09-25 14:38:27 -07:00
Goldie Gadde
bb52379685
Merge pull request #32820 from tanzhenyu/cherrypicks_KK9GH
[r2.0-CherryPick]:remove array_ops.where deprecation message.
2019-09-25 13:52:44 -07:00
Zhenyu Tan
550d3a428d remove array_ops.where deprecation message.
PiperOrigin-RevId: 271168043
2019-09-25 13:03:48 -07:00
Amit Patankar
75a91fac16 Fixing presubmits for 2.0 by removing contrib as a testing target. 2019-09-24 09:43:33 -07:00
Amit Patankar
2e914feef3 Adding a default target set for presubmits.
PiperOrigin-RevId: 270791565
2019-09-24 09:43:33 -07:00
Mihai Maruseac
3aded2100b Pin Estimator version to newly released 2.0.0 2019-09-24 09:31:06 -07:00
A. Unique TensorFlower
0e2545be93 [Grappler] Fix bug in layout optimizer introduced by cl/247704284. Restore the original graph if Tune() fails.
PiperOrigin-RevId: 270737190
2019-09-23 14:02:33 -07:00
Goldie Gadde
2646d23074
Merge pull request #32669 from tensorflow/ggadde-cp-19
[r2.0-CherryPick]:[tf.data] Avoid double conversion to a tensor during input normalizat…
2019-09-19 14:51:25 -07:00
Goldie Gadde
3b88a7d990
Merge pull request #32644 from tensorflow/nfelt-tb-2.0-cherrypick
TF 2.0 Cherrypick for TensorBoard 2.0 dependency
2019-09-19 14:50:49 -07:00
Goldie Gadde
a54776c1a7
Merge pull request #32631 from kkimdev/cherrypicks_E6E8H
[r2.0-CherryPick]:forward_compatible env variable caching perf optimization.
2019-09-19 14:46:26 -07:00
Jiri Simsa
4f07c49468 [tf.data] Avoid double conversion to a tensor during input normalization.
PiperOrigin-RevId: 270046393
2019-09-19 12:31:36 -07:00
Nick Felt
42be2d86b6
TF 2.0 Cherrypick for TensorBoard 2.0 dependency
Update tensorboard dependency to 2.0.x - TensorBoard release: https://pypi.org/project/tensorboard/2.0.0/
2019-09-18 22:57:32 -07:00
Goldie Gadde
41474120da
Merge pull request #32626 from tensorflow/r2.0-cherry-picks-for-estimator
Cherrypicks for estimator 2.0 release
2019-09-18 15:17:59 -07:00
Kibeom Kim
ed5e2b1506 forward_compatible env variable caching perf optimization.
PiperOrigin-RevId: 269832823
2019-09-18 12:38:15 -07:00
Zhenyu Tan
ef2bf71478 Add serialization/deserialization for WideDeep model.
PiperOrigin-RevId: 263859331
2019-09-18 10:02:29 -07:00
Zhenyu Tan
b7ee3a8ead Add serial/deserial for Linear Model and test in model_to_estimator.
PiperOrigin-RevId: 263647067
2019-09-18 10:02:21 -07:00
Goldie Gadde
7226dfb2d7
Merge pull request #32441 from tensorflow/cherrypick_batch_dot
[r2.0-CherryPick]:Cherrypick batch_dot behavior change
2019-09-17 14:39:39 -07:00
Goldie Gadde
f136f5467b
Merge pull request #32565 from kkimdev/cherrypicks_I3HYM
[r2.0-CherryPick]: Autograph: Remove tf.autograph.experimental.set_loop_options doc
2019-09-17 14:32:22 -07:00
Goldie Gadde
1d908742b2
Merge pull request #32596 from tomerk/cherrypicks_YOUHM
Fix support for custom op namespacing in Python
2019-09-17 14:31:45 -07:00
A. Unique TensorFlower
95e3c9cade Yet another gen_file fix to support '>' namespace separators in op names. (In this case when an op has multiple named outputs)
PiperOrigin-RevId: 269455111
2019-09-17 11:14:32 -07:00
A. Unique TensorFlower
acb27f3f05 (Hopefully) remaining work to get custom ops working w/ namespaces. Also updates some example tests about adding ops so they create & run namespaced ops.
PiperOrigin-RevId: 268944869
2019-09-17 11:03:39 -07:00
A. Unique TensorFlower
b35ee613eb Fixes a bug in the custom op namespacing support where node_def 'name' was made to support '>' for namespaces instead of op_def name
PiperOrigin-RevId: 268570785
2019-09-17 11:03:38 -07:00
Mihai Maruseac
1c83755e5a
Merge pull request #32560 from akshaym/cherrypicks_QGUWI
Stop caching inf/nan floats
2019-09-16 15:20:28 -07:00
Kibeom Kim
cdbf2d7119 Autograph: Remove tf.autograph.experimental.set_loop_options doc
It's not implemented yet

PiperOrigin-RevId: 269387821
2019-09-16 14:58:15 -07:00
A. Unique TensorFlower
d109b8812a Split segment_reduction_ops.cc to reduce compile time.
PiperOrigin-RevId: 266052197
2019-09-16 12:38:16 -07:00
Peng Sun
a70de4d877 add 'no_rocm' tag to //tensorflow/python/keras/optimizer_v2:optimizer_v2_test_gpu
A recent change broke this test for the ROCm platform.
We are looking into fixing it, but need to disable it in the meantime
because it runs as part of the ROCm Community Supported Build.
2019-09-16 14:38:06 -05:00
sunway513
2118059a0f Revert the reduction ops changes for NV path 2019-09-16 14:34:05 -05:00
sunway513
41041a9c79 Revert "improve concurrency between compute and nccl streams"
This reverts commit 7dbb5dd1c4.
2019-09-16 14:34:05 -05:00
Peng Sun
9cebda7419 Remove the non-applicable patches for conv_grad_filter_ops and conv_grad_input_ops 2019-09-16 14:34:04 -05:00
Peng Sun
1c95052fac Fix format issues reported by pylint 2019-09-16 14:34:04 -05:00
Peng Sun
4da1ccb9ff Patch to enable r2.0 ROCm non-xla support
The goal of this PR is to patch the TensorFlow r2.0 release so it fully enables the ROCm
non-XLA path.

Most of the PRs cherry-picked in this patch have already been merged into the
upstream master branch.

The following are all the related commits that were cherry-picked:

Commits on Aug 20, 2019
deven-amd and sunway513
adding/updating ROCm support in the ci_build scripts
d5a0eee
deven-amd and sunway513
updating Dockerfile.rocm to pick a specific version of the rocm libra… …
e335575
deven-amd and sunway513
adding a script for testing the ROCm Community Supported Build
ae83a20

Commits on Aug 22, 2019
deven-amd and sunway513
Resolve merge conflicts for PR #31393
73ff708
deven-amd and sunway513
The following PR/commit breaks the --config=rocm build …
614bdb5
deven-amd and sunway513
updating testcases to work correctly with ROCm
1685240
jeffdaily and sunway513
improve concurrency between compute and nccl streams …
3fbb049
whchung and sunway513
[ROCm] enable roll op on ROCm.
1d5f440
whchung and sunway513
[ROCm] enable InTopK op on ROCm.
941f713
deven-amd and sunway513
updating README.md with information on ROCm Community Supported Builds
73ce64e

Commits on Aug 25, 2019
houtoms and sunway513
fixed potential rocm breaks from use_padded_io
0832b33
deven-amd and sunway513
adding no_rocm tag on unit-tests that check features that are current… …
7aed626
deven-amd and sunway513
Adding ROCm support for reduction ops
82bd216
sunway513
Fix ROCm path build error in rocm_dnn.h
5dba305

Commits on Aug 27, 2019
deven-amd
fixing test failures by skipping parts that functionality not yet sup… …
be6378c
sunway513
Merge pull request #616 from ROCmSoftwarePlatform/r2.0-rocm-upstream-… …
d98a943
sunway513
Add no_rocm tag to //tensorflow/python:stateful_random_ops_test_gpu
d05a47f

Commits on Sep 04, 2019
sunway513
Merge branch 'r2.0-rocm-upstream' of https://github.com/ROCmSoftwareP… …
b1148e4

Commits on Sep 06, 2019
deven-amd and sunway513
adding ROCm support in the build_pip_package script
b908324
2019-09-16 14:34:04 -05:00
Akshay Modi
8996e40f93 Use std::isfinite instead of Py_IS_FINITE
PiperOrigin-RevId: 269010942
2019-09-16 12:23:10 -07:00
Akshay Modi
8a385c775d Don't cache inf/nan floats
PiperOrigin-RevId: 268941838
2019-09-16 12:23:06 -07:00
Goldie Gadde
4096702326
Merge pull request #32399 from jaingaurav/cherry-2.0
[r2.0 CherryPick]: Use experimental_ref() in moving_averages
2019-09-15 11:35:15 -07:00
Goldie Gadde
a4b5755b30
Merge pull request #32511 from k-w-w/cherrypicks_9FILR
[r2.0-CherryPick]:Fix bug when cloning functional models that use Tensor keyword argume…
2019-09-15 11:34:41 -07:00
Mihai Maruseac
017abe402d
Merge pull request #32538 from perfinion/r2.0-patches
Systemlibs cherry-picks for r2.0
2019-09-15 10:34:28 -07:00
Goldie Gadde
80147fb509 Update the version to 2.0.0-rc2 2019-09-14 17:18:36 -07:00
Goldie Gadde
451681fa38
Merge pull request #32527 from jaingaurav/cherry-2.0-2
[r2.0-CherryPick]:Disallow comparing ObjectIdentityWrapper to others
2019-09-14 14:56:36 -07:00
Gaurav Jain
97b6a54c41 Disallow comparing ObjectIdentityWrapper to others
When using the experimental_ref() API on Tensors & Variables, a common
bug I hit was incorrectly comparing a wrapped object with an unwrapped
object instead of first calling deref(). To avoid this we now raise an
exception instead of returning False. This implies that if Tensors
and Variables are kept in the same set or dictionary as other objects,
an exception can be raised if there is a hash collision.

PiperOrigin-RevId: 268837575
(cherry picked from commit 57e8769bc4)
2019-09-14 10:13:58 -07:00
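The identity-comparison pitfall described above can be illustrated with a minimal stand-alone sketch (a hypothetical simplification, not TensorFlow's actual `ObjectIdentityWrapper` implementation):

```python
class IdentityWrapper:
    """Toy stand-in for an identity-based wrapper (hypothetical, simplified)."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def deref(self):
        # Recover the original object.
        return self._wrapped

    def __hash__(self):
        # Hash by identity, not by value.
        return id(self._wrapped)

    def __eq__(self, other):
        if not isinstance(other, IdentityWrapper):
            # Comparing a wrapped object with an unwrapped one is almost
            # always a bug (the caller forgot deref()), so raise instead of
            # silently returning False.
            raise TypeError("Cannot compare wrapped and unwrapped objects; "
                            "call deref() first.")
        return self._wrapped is other._wrapped


x = object()
w = IdentityWrapper(x)
assert w == IdentityWrapper(x)   # same underlying object: equal
assert w.deref() is x            # deref() returns the original
```

As the commit notes, raising on the mixed comparison also means hash collisions in mixed sets/dicts can surface as exceptions rather than silent `False` results.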
Gaurav Jain
d159647866 Use experimental_ref() in moving_averages
In addition, fix zero_debias_true to use experimental_ref.

PiperOrigin-RevId: 268230742
(cherry picked from commit a0e85fd379)
(cherry picked from commit c069581e47eec6ff05156e81898ce3e7bc350401)
2019-09-14 10:03:19 -07:00
Jason Zaman
cd5e6f3f36 systemlibs: unbundle functools32
Signed-off-by: Jason Zaman <jason@perfinion.com>
2019-09-14 17:44:14 +08:00
Jason Zaman
b60cc3a082 systemlibs: jsoncpp: update header symlinks for jsoncpp 1.9
Signed-off-by: Jason Zaman <jason@perfinion.com>
2019-09-14 17:41:47 +08:00
Jason Zaman
cc72c6f4dc systemlibs: unbundle enum34
Signed-off-by: Jason Zaman <jason@perfinion.com>
2019-09-14 17:41:47 +08:00
Jason Zaman
93c797678c install_headers: fix paths of generated headers
The generated headers moved from bazel-genfiles to bazel-bin, so change
the match to remove both. Also adjust the external library header files
so they have the right paths to work with the base include.

All the TensorFlow header files should compile cleanly on their own
(excluding the windows ones etc). To verify, run the target then install
to /usr/include/tensorflow and run the following:

for i in $(find /usr/include/tensorflow -iname "*.h"); do \
g++ -o/dev/null -E -I/usr/include/tensorflow -I/opt/cuda/include $i \
|| echo $i; done

Signed-off-by: Jason Zaman <jason@perfinion.com>
2019-09-14 17:41:47 +08:00
Jason Zaman
1e685a716f pkgconfig: generate tensorflow_cc pkg-config entry
Signed-off-by: Jason Zaman <jason@perfinion.com>
2019-09-14 17:41:47 +08:00
Penporn Koanantakool
396daca0e0 Reorder Bazel rule load order (sort alphabetically). 2019-09-13 23:40:38 -07:00
srinivasan.narayanamoorthy
fa24f39434 Hardest single line change ever. 2019-09-13 23:40:38 -07:00
srinivasan.narayanamoorthy
f3912e8e58 minor fix 2019-09-13 23:40:38 -07:00
srinivasan.narayanamoorthy
4d992c74eb minor formatting fix. 2019-09-13 23:40:38 -07:00
srinivasan.narayanamoorthy
25b3792547 minor fix. 2019-09-13 23:40:38 -07:00
srinivasan.narayanamoorthy
7c3a1cca86 Fixing spurious omp thread spawning. 2019-09-13 23:40:38 -07:00
Katherine Wu
2f43a9e48c Fix bug when cloning functional models that use Tensor keyword arguments.
Aligned the cloning implementation to be similar to Model.from_config(model.get_config()), with a few minor differences.

PiperOrigin-RevId: 268969706
2019-09-13 15:00:36 -07:00
Mihai Maruseac
bf7d835d03
Merge pull request #32507 from annarev/cherrypicks_T5PFH
2.0.0-rc2 cherry-pick request: Use relative imports only in TensorFlow.
2019-09-13 12:32:41 -07:00
Anna R
11cab5dc63 Use relative imports only in TensorFlow. Estimator's autocomplete works without
relative imports.

PiperOrigin-RevId: 268940330
2019-09-13 11:35:48 -07:00
Goldie Gadde
25d8dda8bb
Merge pull request #32482 from reedwm/cherrypick_lso_saving
[r2.0-CherryPick]: Have LossScaleOptimizer better emulate the OptimizerV2 interface.
2019-09-13 10:59:43 -07:00
Goldie Gadde
c69c2b9df2
Merge pull request #32504 from tensorflow/ggadde-cp17
Fix performance regression issue by reusing metrics property.
2019-09-13 10:58:11 -07:00
Pavithra Vijay
6c70a16656 Fix performance regression issue by reusing metrics property.
PiperOrigin-RevId: 268827158
2019-09-13 10:04:24 -07:00
Goldie Gadde
426862d01a
Merge pull request #32440 from tensorflow/cherrypick_cast_to_floatx
[r2.0 Cherrypick]: Cherrypick `cast_to_floatx` new behavior
2019-09-13 08:55:33 -07:00
Reed Wanderman-Milne
e95b85ac57 Have LossScaleOptimizer better emulate the OptimizerV2 interface.
This allows LossScaleOptimizers to be saved in a SavedModel with Model.save().

The main addition is implementing LossScaleOptimizer.get_config(). This required adding serialization support to LossScales.

Also rename the "opt" argument in LossScaleOptimizer.__init__ to "optimizer".

PiperOrigin-RevId: 268776796
2019-09-12 15:23:36 -07:00
Mihai Maruseac
c6babdd8aa
Merge pull request #27546 from angersson/cherrypick-87ea41d023a985c5ecee0370c7f7381c8a8cee52
Fix configure.py to properly compare "X.Y.Z" with "X.Y"
2019-09-12 08:40:58 -07:00
ag.ramesh
c31017a68f Use tf_opts when compiling MkL related eager files. 2019-09-12 01:40:38 -07:00
François Chollet
1e57145558 Sync behavior of batch_dot with external Keras 2019-09-11 12:34:42 -07:00
François Chollet
7853c50239 Add tensor/variable support to cast_to_floatx 2019-09-11 12:29:53 -07:00
tanzhenyu
f03fe1bf79
Merge pull request #32433 from tanzhenyu/cherrypicks_WIQA6
Fix major Adamax gpu bug.
2019-09-11 10:43:51 -07:00
Zhenyu Tan
67c17f221a Fix major Adamax gpu bug.
PiperOrigin-RevId: 268469299
2019-09-11 10:26:10 -07:00
tanzhenyu
ffb00d7fe6
Merge pull request #32431 from tanzhenyu/cherrypicks_L1RT5
[r2.0-CherryPick]:Add Ftrl cuda kernels.
2019-09-11 10:19:26 -07:00
Zhenyu Tan
f117d547a1 Add Ftrl cuda kernels.
PiperOrigin-RevId: 268335477
2019-09-11 09:23:09 -07:00
Mihai Maruseac
59bf3351a4
Merge pull request #32308 from saxenasaurabh/cherrypicks_PE4RX
Fix shape inference for default gradients of resources
2019-09-10 14:06:37 -07:00
Mihai Maruseac
ca3ad1a527
Merge pull request #32179 from tensorflow/jvishnuvardhan-patch-3
Corrected typo in `ones_like`
2019-09-10 13:49:17 -07:00
Mihai Maruseac
355c22c6d9
Merge pull request #32391 from tensorflow/mm-cherrypick-gast-pin-on-2.0
Freeze gast==0.2.2 (#32319) in setup.py requirements.
2019-09-10 13:27:46 -07:00
Mihai Maruseac
16cad8c1f6 Freeze gast==0.2.2 (#32319) in setup.py requirements.
This fixes breakage caused by a minor gast update that broke TensorFlow.

PiperOrigin-RevId: 268261146
2019-09-10 12:40:26 -07:00
Goldie Gadde
d2d2566eef
Merge pull request #32367 from aaroey/r2.0
[r2.0-CherryPick]: Fix segfault error in trt_convert_test
2019-09-09 15:56:00 -07:00
Guangda Lai
26095fd3b1 Fix segfault error in trt_convert_test 2019-09-09 13:55:03 -07:00
Subin M
208e4e4fcd Adding full name
2019-09-09 09:46:28 -07:00
Goldie Gadde
12985ec6f7
Merge pull request #32339 from tensorflow/ggadde-cp16
[r2.0Cherrypick]: Install gast at known version and not the latest.
2019-09-08 22:06:07 -07:00
A. Unique TensorFlower
32f4c71f1d Install gast at known version and not the latest.
PiperOrigin-RevId: 267873409
2019-09-08 21:20:31 -07:00
Goldie Gadde
2e19c76352
Merge pull request #32305 from annarev/cherrypicks_JN7Y8
2.0.0-rc1 cherry-pick request: Import submodules using relative imports.
2019-09-08 16:42:16 -07:00
Anna R
77f5617249 Disable tf.summary test if tensorboard is not installed 2019-09-08 15:34:44 -07:00
Goldie Gadde
f896d2dc8f
Merge pull request #32324 from tensorflow/ggadde-cp-version
[r2.0 CherryPick]:Update version to 2.0.0-rc1
2019-09-08 15:08:46 -07:00
Goldie Gadde
e84e7a5f41
Merge pull request #32334 from annarev/cherrypicks_8035R
2.0.0-rc1 cherry-pick request: Don't exclude tensorboard symbols from goldens if tensorboard pip is installed.
2019-09-08 14:18:54 -07:00
Anna R
3729551d4b Don't exclude tensorboard symbols from goldens if tensorboard pip is installed.
PiperOrigin-RevId: 267883131
2019-09-08 13:25:58 -07:00
Goldie Gadde
2d397d9fc5
Merge pull request #32303 from tensorflow/ggadde-cp15
[r2.0-CherryPick]:Add tf.saved_model.Asset public symbol.
2019-09-08 11:34:59 -07:00
André Susano Pinto
40047b6bf1 Fix SavedModel text embedding example to use TF-2 public APIs.
PiperOrigin-RevId: 267392885
2019-09-08 09:58:54 -07:00
Goldie Gadde
803f868824 Update version to 2.0.0-rc1 2019-09-07 22:01:34 -07:00
Saurabh Saxena
0c47dd5b13 Fix shape inference for default gradients of resources which are usually of the form:
`tf.zeros(gen_resource_variable_ops.variable_shape(resource), dtype)`
This adds support for VariableShape to ShapeRefiner::ConstantPartialShape.
This change should provide better graph building shape inference but should not affect graph-building/runtime behavior.

PiperOrigin-RevId: 267681703
2019-09-06 23:09:55 -04:00
Goldie Gadde
dc363393f6
Merge pull request #32306 from k-w-w/cherrypicks_UR7H3
[r2.0-CherryPick]:Keras SavedModel important bug fixes and refactor
2019-09-06 18:52:21 -07:00
Goldie Gadde
592d128e50
Merge pull request #32301 from tomerk/cherrypicks_6GCFP
[r2.0-CherryPick]: Adds support for generator inputs w/ varying batch sizes & shapes n t…
2019-09-06 18:51:10 -07:00
Katherine Wu
f9494f24d8
Remove dedupe_weights @property
Accidentally left it in the commit.
2019-09-06 17:06:58 -07:00
Katherine Wu
960c3ba8e9 If layer's call function is wrapped in tf.function, then trace the original method instead of the wrapped function.
This avoids the issue where canonicalize_function_inputs added additional defaults to the `args` list.

PiperOrigin-RevId: 267650429
2019-09-06 16:51:01 -07:00
Katherine Wu
cc9422d373 Convert tensors to constants when saving Keras functional and graph models to SavedModel.
This includes changes to backend.get_value() to ensure that the correct graph context is created when getting the value of a tensor.

PiperOrigin-RevId: 267498198
2019-09-06 16:50:59 -07:00
Katherine Wu
91658fade1 Remove SavedModel-specific methods from base Layer and Model, and add documentation for updating the serialization implementation.
PiperOrigin-RevId: 266978650
2019-09-06 16:50:53 -07:00
Anna R
23c3928d31 Automated rollback of commit 1d1f7dfcbd
PiperOrigin-RevId: 266040561
2019-09-06 16:40:38 -07:00
Goldie Gadde
aed84aeded
Merge pull request #32300 from tensorflow/tomerk-patch-1
[r2.0-CherryPick]:Implementing RFC#126: Allow Op names of the form RepoName>OpName
2019-09-06 16:24:39 -07:00
André Susano Pinto
f185080c09 Add tf.saved_model.Asset public symbol.
`Asset` is the mechanism that makes it possible to build hermetic SavedModels
that depend on files. It replaces functionality that in TF 1.x was
typically provided by the ASSET_FILEPATHS collection.

PiperOrigin-RevId: 267534289
2019-09-06 16:17:56 -07:00
Goldie Gadde
14f20c9ff6
Merge pull request #32259 from geetachavan1/cherrypicks_YQPDR
[r2.0 Cherrypick]:Implementing RFC#126: Allow Op names of the form RepoName>OpName.
2019-09-06 16:02:50 -07:00
Goldie Gadde
fd99c2cf93
Merge pull request #32184 from annarev/cherrypicks_32WQL
[2.0.0-rc0 CherryPick]: Only add ModuleWrapper when lazy loading is requested or when using TF 1.x (to print deprecation warnings).
2019-09-06 16:01:54 -07:00
A. Unique TensorFlower
735179ad00 Adds support for generator inputs w/ varying batch sizes & shapes in the Keras unified execution path
PiperOrigin-RevId: 267686850
2019-09-06 15:51:55 -07:00
Tomer Kaftan
2e35d26a48
Implementing RFC#126: Allow Op names of the form RepoName>OpName
This change maps '>' in the op names to underscores in the generated python op function names and export names.

Getting '.' in the names instead is theoretically doable but would add too much complexity to the whole codegen loop to be worthwhile.
(It would require analyzing the op names to group ops by their respective nested namespaces, code-gen'ing nested python classes that match the namespaces, then indenting the codegen'd python ops inside of the nested classes w/ the names set correctly).
2019-09-06 15:09:31 -07:00
Anna R
0733efc770 Fixing builds after removing module wrapper in 2.0. 2019-09-06 14:30:44 -07:00
Goldie Gadde
b6e4c89267
Merge pull request #32297 from penpornk/cherrypicks_IF3LM
[r2.0 CherryPick]: Upgrading giflib to fix CVE-2019-15133
2019-09-06 14:18:20 -07:00
Goldie Gadde
d2e5f5a49e
Merge pull request #32272 from reedwm/docstring_cherrypicks
[r2.0-rc1 CherryPick]: Improve Layer docstrings in regards to autocasting.
2019-09-06 14:17:49 -07:00
Goldie Gadde
8ea0a418a6
Merge pull request #32269 from reedwm/mp_cherrypicks
[r2.0-rc1 CherryPick]: Several tf.keras mixed precision API changes
2019-09-06 14:15:19 -07:00
Goldie Gadde
5a580681ad
Merge pull request #32257 from omalleyt12/cherrypicks_20IEW
[r2.0-CherryPick] Deduplicate Keras weights
2019-09-06 14:14:22 -07:00
Goldie Gadde
687a4dbbaf
Merge pull request #32267 from penpornk/cherrypicks_AV4NQ
[r2.0 CherryPick]: [INTEL MKL] Add support for Addv2
2019-09-06 13:13:52 -07:00
Goldie Gadde
31d4ac9744
Merge pull request #32292 from tanzhenyu/cherrypicks_6APVN
[r2.0-CherryPick]:Fix loss computation when y_true and y_pred are not the same shape.
2019-09-06 12:58:51 -07:00
Goldie Gadde
a14d091723
Merge pull request #32065 from k-w-w/cherrypicks_RSDJN
Add SaveOptions object with option to whitelist op namespaces.
2019-09-06 12:58:30 -07:00
Penporn Koanantakool
a8d963daa2 Rollforward of PR #32169: Upgrading giflib to fix CVE-2019-15133
Add a patch file to fix giflib's compilation issue on Windows (replace a call to strtok_r with strtok_s).

**NVD**: 2019/08/17 - CVSS v2.0 Base Score: 4.3 - CVSS v3.0 Base Score: 6.5
In GIFLIB before 2019-02-16, a malformed GIF file triggers a divide-by-zero exception in the decoder function DGifSlurp in dgif_lib.c if the height field of the ImageSize data structure is equal to zero.

Source | Link | Type
---- | ---- | ----
MISC | bugs.chromium.org | Mailing List, Third Party Advisory
UBUNTU | usn.ubuntu.com | Third Party Advisory

PiperOrigin-RevId: 267533902
2019-09-06 12:42:44 -07:00
Alexandre Passos
4af47547fe
Merge pull request #32282 from jaingaurav/cherry-2.0
Add incompatible_shape_error attribute to equal op
2019-09-06 11:19:06 -07:00
Goldie Gadde
bb85212ce9
Merge pull request #32252 from jdduke/cherrypicks_C49XH
[r2.0-Cherrypick]NNAPI TransposeConv op takes tensor inputs from TFLite node
2019-09-06 09:35:53 -07:00
Gaurav Jain
ff750af214 Add fill_functor.h to cwise_lib to fix ROCm build
PiperOrigin-RevId: 267531307
(cherry picked from commit dc29ecea18)
2019-09-06 09:03:05 -07:00
Gaurav Jain
3e08639766 Add incompatible_shape_error attribute to equal op
When tensor equality is enabled, if there is an incompatible shape we
currently throw an exception. Ideally we'd like to return False when
calling __eq__ and True when calling __ne__. We thus modify the Equal
and NotEqual ops to return a boolean upon a shape incompatibility. Due
to this change the shape inference logic needs to be changed to either
return a scalar bool if the shapes are incompatible, or else return an
unknown shape to allow for either a boolean Tensor or scalar to be
returned.

Note the behavior of tf.math.equal & tf.math.not_equal is unchanged as
they both use optimistic shape inference logic when dealing with unknown
dimensions which allows for more efficient graphs rather than inserting
Rank operations.

This distinction between __eq__ & tf.math.equal is also found in numpy
and as a result the tf.debugging.assert_equal and
tf.debugging.assert_none_equal APIs needed to be changed to utilize the
numpy operations.

PiperOrigin-RevId: 267466043
(cherry picked from commit e0e1efbe08)
2019-09-06 09:03:05 -07:00
Zhenyu Tan
e69e5cc79a Fix loss computation when y_true and y_pred are not the same shape.
PiperOrigin-RevId: 267595602
2019-09-06 08:49:15 -07:00
Reed Wanderman-Milne
43c6071ead Improve Layer docstrings in regards to autocasting.
I fixed the examples so they can actually run now. And I mentioned that currently, only the first argument to call() is cast.

PiperOrigin-RevId: 262603558
2019-09-05 18:19:49 -07:00
AG Ramesh
c8319800f9 Update tensorflow/core/graph/mkl_layout_pass_test.cc
updating based on review comments

Co-Authored-By: Penporn Koanantakool <38085909+penpornk@users.noreply.github.com>
2019-09-05 17:05:23 -07:00
AG Ramesh
199d79d32e Add support for Addv2 2019-09-05 17:05:15 -07:00
Alexandre Passos
f67991359e
Merge pull request #32258 from kkimdev/cherrypicks_5XVFU
[r2.0 CherryPick]: @tf.function: Show a warning message when tracing happens too frequently
2019-09-05 15:42:16 -07:00
Reed Wanderman-Milne
821b2fc9f1 Implement __repr__ for LossScale subclasses.
PiperOrigin-RevId: 262603680
2019-09-05 14:12:53 -07:00
Martin Wicke
166b763967 Implementing RFC#126: Allow Op names of the form RepoName>OpName.
PiperOrigin-RevId: 264491560
2019-09-05 13:54:01 -07:00
Kibeom Kim
827e1edc1e @tf.function: Show a warning message when tracing happens too frequently.
PiperOrigin-RevId: 267414002
2019-09-05 13:53:21 -07:00
Goldie Gadde
52f22d913d
Merge pull request #32256 from saxenasaurabh/cherrypicks_JUPKK
[r2.0 Cherrypick]:Make calls to `tf.function(f)()`, `tf.function(f).get_concrete_function` and `tf.function(f).get_initialization_function` thread-safe.
2019-09-05 13:50:33 -07:00
Thomas O'Malley
873bfb7b91 Fix merge conflicts 2019-09-05 12:56:13 -07:00
Saurabh Saxena
739b9c5423 Make calls to `tf.function(f)()`, `tf.function(f).get_concrete_function` and `tf.function(f).get_initialization_function` thread-safe. 2019-09-05 15:52:54 -04:00
Reed Wanderman-Milne
6b9a66d543 Deprecate "_with_float32_vars" policies.
These policies will be removed in TensorFlow 2.1. I plan on removing them very shortly.

PiperOrigin-RevId: 267429724
2019-09-05 12:52:05 -07:00
Reed Wanderman-Milne
a3a9fd34e9 Do not cast inputs to AddLoss layers
This means tensors passed to Model.add_loss will no longer be cast to floatx.

PiperOrigin-RevId: 264287945
2019-09-05 12:51:57 -07:00
Reed Wanderman-Milne
75a9d99941 Add mixed_float16 and mixed_bfloat16 dtype policies.
These policies will be the recommended way of using mixed precision in tf.keras. So far, the only difference from [b]float16_with_float32_vars is that mixed_float16 enables dynamic loss scaling by default (and mixed_bfloat16 has no difference). In the future, the *_with_float32_vars policies will be removed.

PiperOrigin-RevId: 263206151
2019-09-05 12:51:57 -07:00
Reed Wanderman-Milne
0fe04a9a52 Add loss_scale field to Policy.
A Keras Model will wrap its optimizer with a LossScaleOptimizer, if its policy has a loss scale. This way, users do not have to manually wrap their optimizers with a LossScaleOptimizer.

PiperOrigin-RevId: 263186511
2019-09-05 12:51:57 -07:00
T.J. Alumbaugh
cdb35502d8 NNAPI TransposeConv op takes tensor inputs from TFLite node
PiperOrigin-RevId: 267384015
2019-09-05 12:24:09 -07:00
Goldie Gadde
705083378c
Merge pull request #32189 from tensorflow/cherrypicks_U0F4M
[r2.0:Cherrypick]Negating cosine similarity loss so that the value gets minimized during training without needing a wrapper function.
2019-09-05 10:35:12 -07:00
Mihai Maruseac
3891e4f092
Merge pull request #32080 from guillaumekln/r2.0-cherry-pick-functions-input-shape
[r2.0:Cherrypick] Remove tensor input shape from function signature.
2019-09-05 09:47:15 -07:00
Goldie Gadde
f6c418757e
Merge pull request #32220 from tensorflow/cherrypicks_2BM2H
[r2.0:Cherrypick]:Fix add_metric name ordering issue.
2019-09-05 09:09:51 -07:00
Goldie Gadde
1bd1f8428e
Merge pull request #32181 from mrry/cancellation_cherrypick
[r2.0 Cherrypick] Fix use-after-free of CancellationManager in LocalRendezvousImpl.
2019-09-05 09:09:00 -07:00
Goldie Gadde
4932df55e3
Merge pull request #31937 from tomerk/cherrypicks_LRWAT
r2.0-Cherrypick:Makes `nest` able to flatten dictionary views (produced by dict.items…
2019-09-04 16:39:00 -07:00
Pavithra Vijay
cf3386a93c Fix add_metric name ordering issue.
PiperOrigin-RevId: 267226711
2019-09-04 16:29:12 -07:00
Goldie Gadde
b74c58f43d
Merge pull request #32094 from penpornk/cherrypicks_YQKBS
r2.0 cherry-pick request: [Intel MKL] Upgrade MKL-DNN to 0.20.3
2019-09-04 10:57:11 -07:00
Goldie Gadde
f079372183
Merge pull request #32123 from k-w-w/cherrypicks_9TURB
Track trackables in graph networks
2019-09-03 14:31:31 -07:00
Pavithra Vijay
80aa5a25a7 Automated rollback of commit c3fb862245
PiperOrigin-RevId: 266977539
2019-09-03 13:41:48 -07:00
Anna R
0a702eb670 Only add ModuleWrapper when lazy loading is requested or when using TF 1.x (to
print deprecation warnings).

PiperOrigin-RevId: 266944067
2019-09-03 10:16:23 -07:00
Derek Murray
e8ee6b8d67 Fix use-after-free of CancellationManager in LocalRendezvousImpl.
Previously, we were invoking the CancellationManager in ~Item, which runs after the done callback. However, the CancellationManager was borrowed from the calling RecvOp, and it will tend to be deleted synchronously when the done callback executes.

PiperOrigin-RevId: 266567112
2019-09-03 09:58:46 -07:00
Vishnuvardhan Janapati
e300cc261e
Corrected typo in ones_like
TF Website shows `zero` in place of `one`. 
This fixes the issue https://github.com/tensorflow/tensorflow/issues/32129
2019-09-03 09:03:13 -07:00
Katherine Wu
4d64a47c3e Track trackables in graph networks, and remove automatic tracking of Keras-internal attributes.
Also resolves loading bug from #31893.

PiperOrigin-RevId: 266440936
2019-08-30 17:27:49 -07:00
rxsang
25006be096
Merge pull request #32061 from tensorflow/rxsang-patch-1
Cherrypick: add an `enter_master_device` flag in tf.config.experimental_connect_to_cluster API.
2019-08-29 14:15:26 -07:00
AG Ramesh
6c6c81fa37 Upgrade MKL DNN 2019-08-29 13:43:49 -07:00
Tong Shen
b2e64f8e94 Remove tensor input shape from function signature.
PiperOrigin-RevId: 265973257
2019-08-29 11:35:45 +02:00
Katherine Wu
2c772511d9 Add SaveOptions object with option to whitelist op namespaces. Added options argument to all functions that save out a SavedModel.
PiperOrigin-RevId: 266021878
2019-08-28 17:54:41 -07:00
Tomer Kaftan
f6489a2d5a
Update nest.py
Another fix to the cherrypick
2019-08-28 14:54:56 -07:00
rxsang
0ff7bcc59a
Update v1 golden file. 2019-08-28 14:13:50 -07:00
rxsang
d0c2bf3226
Update v2 golden file. 2019-08-28 14:12:37 -07:00
rxsang
733002ffed
Add enter_master_device flag. 2019-08-28 14:08:35 -07:00
Goldie Gadde
69b1feac62
Merge pull request #32057 from tensorflow/backend_api_cherrypick
[r2.0:Cherrypick] `is_keras_tensor` in the Keras backend API.
2019-08-28 12:32:01 -07:00
Tomer Kaftan
4d19860414
Update nest.py
Added in missing part of the cherrypick
2019-08-28 10:58:07 -07:00
François Chollet
ca2f3de8da Add is_keras_tensor to the public API. 2019-08-28 10:52:11 -07:00
Goldie Gadde
ad9ed12ce4
Merge pull request #31941 from haozha111/cherrypicks_X1T3X
r2.0-CherryPick
2019-08-28 10:40:55 -07:00
Goldie Gadde
2f8cd45afb
Merge pull request #31811 from akshaym/cherrypicks_9Y70M
r2.0-CherryPick:Limit py_func on gpu to only take numeric types
2019-08-28 09:36:28 -07:00
Goldie Gadde
aac18a3ffb
Merge pull request #31812 from saxenasaurabh/cherrypicks_HM5OH
r2.0-CherryPick:Create zeros grad of the correct shape for ResourceVariables.
2019-08-27 13:31:59 -07:00
Goldie Gadde
b22eb94079
Merge pull request #31773 from aaroey/r2.0
2.0-CherryPick:TF-TRT API changes and fix regressions
2019-08-27 13:31:34 -07:00
Haoliang Zhang
9033d7a2f1 Add an explanation about using FlatbufferModel::GetMinimumRuntime().
PiperOrigin-RevId: 265133207
2019-08-23 17:09:53 -07:00
Haoliang Zhang
c58a667812 [Fix] fix the logic when comparing the runtime version strings.
PiperOrigin-RevId: 264948915
2019-08-23 17:09:46 -07:00
A. Unique TensorFlower
2ca19947d3 Makes nest able to flatten dictionary views (produced by dict.items(), dict.values(), and dict.keys() in Python 3).
This is done for all mapping views rather than just OrderedDict views because:
1. In Python 2 these all returned lists anyway, so nest worked on them even when dictionary ordering wasn't guaranteed.
2. In Python 3.6 dictionaries became insertion-ordered as an implementation detail, and as of Python 3.7 this became a language feature: https://stackoverflow.com/questions/39980323/are-dictionaries-ordered-in-python-3-6

So, this only poses a not-already-present randomization risk with nest.flatten for:
- people using Python 3 with custom mappings or built-in mappings that don't have order guarantees (it is unclear whether any built-in mappings still lack order guarantees)
- people using dict views with Python 3 versions older than 3.6

Note: This CL makes nest.pack_sequence_as with views as structures return a list rather than a mapping view, because built-in mapping views cannot be instantiated directly.
PiperOrigin-RevId: 264433983
2019-08-23 15:34:14 -07:00
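The dict-view flattening behavior described in the commit above can be sketched in plain Python. This is a minimal illustration, not TensorFlow's actual `nest` implementation; `flatten` here handles only mappings, mapping views, lists/tuples, and leaves:

```python
import collections.abc

def flatten(structure):
    """Minimal sketch of nest-style flattening that accepts mapping views."""
    # Mapping views (dict.keys()/values()/items()) are flattened in their
    # iteration order, which is insertion order in Python 3.7+.
    if isinstance(structure, (collections.abc.KeysView,
                              collections.abc.ValuesView,
                              collections.abc.ItemsView)):
        return [leaf for item in structure for leaf in flatten(item)]
    if isinstance(structure, collections.abc.Mapping):
        # The real nest sorts dict keys; mimic that here.
        return [leaf for key in sorted(structure)
                for leaf in flatten(structure[key])]
    if isinstance(structure, (list, tuple)):
        return [leaf for item in structure for leaf in flatten(item)]
    return [structure]

d = {"b": 2, "a": 1}
print(flatten(d.values()))  # insertion order: [2, 1]
print(flatten(d))           # sorted keys:     [1, 2]
```

Note the asymmetry the commit accepts: a plain dict is flattened in sorted-key order, while a view is flattened in iteration (insertion) order.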
Alexandre Passos
c75bb66a99
Merge pull request #31898 from mrry/cherrypicks_CPD6K
[r2.0 cherrypick] Fix tf.gradients() performance regression
2019-08-22 10:40:23 -07:00
Derek Murray
c99513c0ba In _GradientsHelper() compute the ObjectIdentitySet(xs) once and reuse it.
This avoids a potentially quadratic execution time in building the gradient graph, because we were previously creating the set multiple times for each op in the graph.

PiperOrigin-RevId: 264826531
2019-08-22 09:24:07 -07:00
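The performance fix described above — building the identity set of `xs` once rather than once per op — can be sketched as follows. This is a hypothetical simplification: `ObjectIdentitySet` here is a stand-in for TensorFlow's internal helper, and ops are modeled as tuples of inputs:

```python
class ObjectIdentitySet:
    """A set keyed on object identity rather than __eq__/__hash__."""
    def __init__(self, items=()):
        self._by_id = {id(x): x for x in items}

    def __contains__(self, item):
        return id(item) in self._by_id

def gradients_helper(ops, xs):
    # Before the fix, ObjectIdentitySet(xs) was rebuilt for each op in the
    # graph, making gradient-graph construction quadratic. Building it once
    # and reusing it restores linear behavior.
    xs_set = ObjectIdentitySet(xs)
    return [op for op in ops if any(inp in xs_set for inp in op)]
```

Identity-based membership matters because real Tensors cannot be placed in an ordinary `set` once `==` returns a Tensor instead of a bool.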
Goldie Gadde
b8b60ae09a
Merge pull request #31742 from tensorflow/ggadde-cp12
r2.0-Cherrypick: Move to cuDNN 7.6.2 and TensorRT 5.1.5
2019-08-21 22:12:03 -07:00
Goldie Gadde
305ecc6ed5
Merge pull request #31878 from tensorflow/ggadde-cp14
Revert skipping EIGEN_FORCE_INLINE change & potential fix for windows flaky build
2019-08-21 22:11:41 -07:00
A. Unique TensorFlower
12af54e67b Provide a unique dir for each compiling action to avoid conflicts.
PiperOrigin-RevId: 264635106
2019-08-21 21:16:28 -07:00
A. Unique TensorFlower
6a4957735c Automated rollback of commit 92b7212e54
PiperOrigin-RevId: 264744295
2019-08-21 21:13:05 -07:00
Goldie Gadde
b12998c174
Merge pull request #31862 from jdduke/cherrypicks_DVR9A
Fix regression in memory consumption on arm64 devices
2019-08-21 21:03:14 -07:00
Goldie Gadde
217315e22e
Merge pull request #31869 from jaingaurav/cherry-2.0
Fix v2 compatibility with moving average
2019-08-21 21:02:32 -07:00
Goldie Gadde
eb85412f3c
Merge pull request #31864 from robieta/cherrypicks_USYUF
r2.0-CherryPick: Refactor the keras TensorLikeDataAdapter (numpy array, EagerTensor, etc) to use tf.shuffle rather than np.shuffle
2019-08-21 18:07:15 -07:00
Goldie Gadde
32b18bcb0e
Merge pull request #31856 from tensorflow/cherrypicks_UNEFG
1. Fix incorrect steps inference when validation_split is provided in…
2019-08-21 17:43:19 -07:00
Pavithra Vijay
c458b1cca4 Updating regex with inline dotall flag. 2019-08-21 15:53:57 -07:00
Goldie Gadde
66876a263f
Merge pull request #31863 from rachellim/cherrypicks_9AS72
Fix incompatibility between tf.data rebatching fallback & unknown batch dim (from partial batch)
2019-08-21 15:13:52 -07:00
Goldie Gadde
b42e5925e6
Merge pull request #31857 from rohan100jain/cherrypicks_5PW8E
Fixing a couple of issues with DenseFeatures
2019-08-21 15:11:56 -07:00
Gaurav Jain
adf96a1258 Fix v2 compatibility with moving average
PiperOrigin-RevId: 264688574
(cherry picked from commit dc3534c6a5)
2019-08-21 15:00:22 -07:00
Pavithra Vijay
b3916ef9c6 Escaping / for regex. 2019-08-21 14:57:16 -07:00
Taylor Robie
df96b0acdb Refactor the keras TensorLikeDataAdapter (numpy array, EagerTensor, etc) to use tf.shuffle rather than np.shuffle. This allows us to use more of tf.data's pipelining machinery which both improves multi-epoch performance and decreases memory consumption.
PiperOrigin-RevId: 264681581
2019-08-21 14:15:35 -07:00
Rachel Lim
7107f907aa [tf.data] Make rebatching fallback work for datasets with unknown shapes.
PiperOrigin-RevId: 264501770
2019-08-21 13:43:48 -07:00
Rachel Lim
6208021e3d [tf.data] s/workers/replicas in all rebatching related files for consistency with distribution strategy naming conventions (https://github.com/tensorflow/community/blob/master/rfcs/20181016-replicator.md).
PiperOrigin-RevId: 261958155
2019-08-21 13:43:34 -07:00
Benoit Jacob
1790e093de Don't round the allocator's storage size to the next power of two. This is typically a huge buffer. We're going to reach a steady state where we have only a few such buffers and they won't get frequently reallocated, anyway.
PiperOrigin-RevId: 264669851
2019-08-21 13:07:51 -07:00
Benoit Jacob
ee25c2a3a6 Fix allocator in cases of sizes overflowing 32bit integer arithmetic
in size_util.

Part of it was AllocateFast not checking if ptr_ is null
before using it (null deref with offset, so didn't look like a null deref).

Part of it was using round_up_pot with a large size_t value that got implicitly cast to int, as round_up_pot took an int argument. This showed it's safer to just templatize the helpers in size_util.h and make them accept either int32 or int64 (guarded in floor_log2, which is the only one of these functions that cares).

I just changed the allocator to use only signed types (std::size_t --> std::ptrdiff_t) because I didn't want to deal with the extra complexity of handling both signed and unsigned in size_util.

PiperOrigin-RevId: 264668215
2019-08-21 13:07:37 -07:00
Rohan Jain
c8f3235a03 Fixing a couple of issues with DenseFeatures
1. Fixing variable_scope to not pass in the partition_info argument if it's in V2 mode. This ensures compatibility with V2 initializers.
2. Adding a tracking_name argument to add_variable which allows for passing in a different name for tracking purposes without affecting the variable name.

PiperOrigin-RevId: 264658202
2019-08-21 13:02:14 -07:00
Pavithra Vijay
947f2c4ad4 1. Fix incorrect steps inference when validation_split is provided in fit in v2 single path execution.
2. Infer validation_steps/steps for a dataset when validation_steps/steps is not provided in fit/(evaluate and predict).

PiperOrigin-RevId: 264508014
2019-08-21 11:46:45 -07:00
Goldie Gadde
71d73e56a2
Merge pull request #31822 from tensorflow/ggadde-cp13
r2.0-CherryPick:test fixes for the windows release builds.
2019-08-20 22:30:20 -07:00
Goldie Gadde
28a383988d
Merge pull request #31815 from kkimdev/cherrypicks_Q43SS
r2.0 cherry-pick request: Autograph: zip() with tf.data fix
2019-08-20 21:41:53 -07:00
Saurabh Saxena
4a416ef792 Internal change
PiperOrigin-RevId: 264475263
2019-08-20 21:36:04 -07:00
A. Unique TensorFlower
cdf62283bf Skip test on Windows platform.
PiperOrigin-RevId: 264463634
2019-08-20 21:29:58 -07:00
Ilham Firdausi Putra
34eb0b58b9 Fix argument naming and tuple on zip 2019-08-20 15:47:44 -07:00
Ilham Firdausi Putra
c49ae107aa Override zip on Autograph 2019-08-20 15:47:34 -07:00
Ilham Firdausi Putra
7e74596547 Add test cases for zip 2019-08-20 15:47:18 -07:00
Saurabh Saxena
0e772b8c96 Create zeros grad of the correct shape for ResourceVariables.
Fixes #31297

PiperOrigin-RevId: 264312145
2019-08-20 12:29:31 -07:00
Akshay Modi
24d2531449 Fix typos in EagerPyFunc GPU kernel registration
PiperOrigin-RevId: 264429933
2019-08-20 12:19:00 -07:00
Akshay Modi
7382c4b45f Limit py_func on gpu to only take numeric types
PiperOrigin-RevId: 264188322
2019-08-20 12:19:00 -07:00
Tomer Kaftan
3f3c728bf8
Merge pull request #31769 from tomerk/cherrypicks_FGHW0
Cherrypick Request: Updates the tf.print docstring for TF 2.0.
2019-08-19 20:23:36 -07:00
Goldie Gadde
5b3f39156f
Merge pull request #31780 from penpornk/cherrypicks_VO02I
r2.0 cherry-pick request: [Intel MKL] Fixing a member variable initialization issue
2019-08-19 19:48:55 -07:00
Goldie Gadde
e307f25d1c
Merge pull request #31772 from jaingaurav/cherry-2.0
Cherry-pick Tensor equality fixes
2019-08-19 19:46:50 -07:00
Allen Lavoie
a6efea89d1 tf.train.Checkpoint: Fix for tensor equality incompatibility in name-based restore logic
PiperOrigin-RevId: 263601820
(cherry picked from commit 84ddc218bb)
2019-08-19 18:34:12 -07:00
Clayne Robison
dc4f698238 [Intel MKL] Fixing a member variable initialization issue detected by static code scans. 2019-08-19 17:02:41 -07:00
Guangda Lai
d90e836507 Use generator function for calibration and build() and remove redundant num_runs/num_calibration_runs arguments. 2019-08-19 15:56:09 -07:00
Guangda Lai
cb6c0c8c20 Raise error if is_dynamic_op is set to False in 2.0, and fix corresponding test 2019-08-19 15:56:03 -07:00
Guangda Lai
1fbdcea362 Fix pylint errors 2019-08-19 15:55:56 -07:00
Guangda Lai
55e408c2e9 Change build() to accept num_runs and input_fn, to make API consistent. 2019-08-19 15:55:49 -07:00
Guangda Lai
0423e0f30c Deprecate static mode and fix corresponding test 2019-08-19 15:55:25 -07:00
Guangda Lai
fee3d86071 Terminate calibration in TrtGraphConverterV2.convert() and improve the test to cover that. 2019-08-19 15:55:17 -07:00
Guangda Lai
61de424923 Simplify trt_convert_test 2019-08-19 15:55:08 -07:00
Guangda Lai
4961524653 Remove redundant variable _calibration_data_collected; inline _calibrate() function; fix python formatting issues. 2019-08-19 15:54:57 -07:00
Pooya Davoodi
005de1a79c Add back tests for converted_func 2019-08-19 15:54:50 -07:00
Pooya Davoodi
ec5c02724a Add calibration to TrtGraphConverterV2.convert
Add TrtGraphConverterV2.build

Also do not return function from convert.

Convert dict_values to list for python3

Fix tests as well

Fix pylint errors
2019-08-19 15:54:42 -07:00
Guangda Lai
163f9df4c7 Save calibration table after calibration, so it can support multiple engines in
int8 mode.

PiperOrigin-RevId: 263857264
2019-08-19 15:43:35 -07:00
Guangda Lai
ed47e70590 Support static mode in TF 2.0. Compared to building engines by calling the
converted function returned from convert(), this has the advantage that it
doesn't require input data from the user.

PiperOrigin-RevId: 263592310
2019-08-19 15:40:38 -07:00
Guangda Lai
d5e9bcc00f Hide helper classes from public API, and update documentation about how to use
V2 converter with INT8 mode.

PiperOrigin-RevId: 263200301
2019-08-19 15:37:12 -07:00
Alexandre Passos
01d28693dd Make padded_batch_test robust to the eq changes.
PiperOrigin-RevId: 263840727
(cherry picked from commit 76221c94dc)
2019-08-19 15:11:31 -07:00
Saurabh Saxena
c55b77eb70 Do not use tensors as keys in _EagerTensorCache since tensors are no longer hashable.
PiperOrigin-RevId: 263666059
(cherry picked from commit 63155ea68e)
2019-08-19 15:11:03 -07:00
Saurabh Saxena
de02c794d6 Prepare feature_columns_test and dense_features_test for tensor equality.
PiperOrigin-RevId: 263450143
(cherry picked from commit 9a0ef32d7b)
2019-08-19 15:10:37 -07:00
Saurabh Saxena
48fa7d9881 Prepare cudnn_recurrent_test for Tensor equality.
PiperOrigin-RevId: 263449689
(cherry picked from commit ff61cee968)
2019-08-19 15:10:25 -07:00
Saurabh Saxena
97071c535c Prepare keras/backend_test for Tensor equality changes.
PiperOrigin-RevId: 263453765
(cherry picked from commit 9503dc4440)
2019-08-19 15:09:36 -07:00
Kibeom Kim
e492005f67 Prepare tensorflow/python/eager:ops_test for Tensor equality.
PiperOrigin-RevId: 263589781
(cherry picked from commit 74942377c6)
2019-08-19 15:08:54 -07:00
Kibeom Kim
58b9d190a8 Prepare //tensorflow/python/data/kernel_tests:map_test for Tensor equality.
PiperOrigin-RevId: 263589206
(cherry picked from commit fc27466628)
2019-08-19 15:08:41 -07:00
Kibeom Kim
070e519450 Prepare //tensorflow/python:function_def_to_graph_test for Tensor equality.
PiperOrigin-RevId: 263563061
(cherry picked from commit c84f6e2b65)
2019-08-19 15:08:22 -07:00
Kibeom Kim
5b27c05c38 Prepare //tensorflow/python/eager:function_test for Tensor equality.
PiperOrigin-RevId: 263562894
(cherry picked from commit 913f565e0f)
2019-08-19 15:08:09 -07:00
Kibeom Kim
beff50ca87 Prepare //tensorflow/python:function_test for Tensor equality.
PiperOrigin-RevId: 263500156
(cherry picked from commit 8d67ec5fa3)
2019-08-19 15:07:49 -07:00
Kibeom Kim
2dbb74261f tf.function: tf.gather Graph mode axis argument fix
PiperOrigin-RevId: 263245888
(cherry picked from commit a03e530512)
2019-08-19 15:06:49 -07:00
Goldie Gadde
b2a68a4907
Merge pull request #31715 from jsimsa/cherrypicks_UZ9Z2
2.0 cherry-pick request: tf.data fix
2019-08-19 15:02:00 -07:00
Kibeom Kim
5417befa7f Prepare //tensorflow/python/keras:metrics_correctness_test for Tensor equality.
PiperOrigin-RevId: 263671686
(cherry picked from commit c02cc7e858)
2019-08-19 14:55:37 -07:00
Kibeom Kim
5c9a16c77d Prepare //tensorflow/python/training/tracking:util_with_v1_optimizers_test for Tensor equality.
PiperOrigin-RevId: 263718652
(cherry picked from commit a24e8f4da1)
2019-08-19 14:55:13 -07:00
Yanhua Sun
10111d8bd5 Fix assertIn for eq change in save_model
PiperOrigin-RevId: 263610354
(cherry picked from commit ccb5bcc69a)
2019-08-19 14:53:39 -07:00
Yanhua Sun
b7ec637af1 Automated rollback of commit eb478151c2
PiperOrigin-RevId: 264184909
(cherry picked from commit 3d880adb04)
2019-08-19 14:51:34 -07:00
Yanhua Sun
85e08754e1 Automated rollback of commit 4089730950
PiperOrigin-RevId: 263396507
(cherry picked from commit eb478151c2)
2019-08-19 14:51:25 -07:00
Gaurav Jain
48378e5237 Enable Tensor equality for 2.0
Fixes #9359

PiperOrigin-RevId: 262948811
(cherry picked from commit 4089730950)
2019-08-19 14:51:14 -07:00
Gaurav Jain
c9e4bdd566 Add equality tests with broadcasting
PiperOrigin-RevId: 264224683
(cherry picked from commit fc56e08e1b)
2019-08-19 14:50:19 -07:00
Yanhua Sun
84a8c340bd Fix tests for eq change
PiperOrigin-RevId: 263784883
(cherry picked from commit e16348a81a)
2019-08-19 14:43:12 -07:00
Yanhua Sun
e3e60d26f8 Disable eq for tensors running in graph mode
PiperOrigin-RevId: 263780887
(cherry picked from commit 2861da0bf6)
2019-08-19 14:42:55 -07:00
Yanhua Sun
e2342aa7ee Use is for Tensor identity comparison.
PiperOrigin-RevId: 263702957
(cherry picked from commit e03a290d8a)
2019-08-19 14:42:43 -07:00
Yanhua Sun
6b5f02024a Fix set in training for eq change
PiperOrigin-RevId: 263702884
(cherry picked from commit 46b9f950d5)
2019-08-19 14:42:33 -07:00
Yanhua Sun
7cc6b3a9e3 Replace dict with list for eq change
PiperOrigin-RevId: 263647316
(cherry picked from commit 923c55a659)
2019-08-19 14:42:21 -07:00
Yanhua Sun
833432df2a Replace map with objectIdentityDict to fix math_grad
PiperOrigin-RevId: 263632918
(cherry picked from commit c9443f07d2)
2019-08-19 14:42:02 -07:00
Yanhua Sun
58969d36aa Fix training_test by using AssertIs
PiperOrigin-RevId: 263631974
(cherry picked from commit 68e332dc3c)
2019-08-19 14:41:56 -07:00
Martin Wicke
13f5ffcc56 Use "is" instead of ==, and Reference for dict keys {} in Bijector.
PiperOrigin-RevId: 263629253
(cherry picked from commit f7f6f8655a)
2019-08-19 14:41:44 -07:00
Martin Wicke
6c2ec4e39f Use id() for Tensor identity comparison.
PiperOrigin-RevId: 263627848
(cherry picked from commit 053f39e766)
2019-08-19 14:41:30 -07:00
Martin Wicke
89ba20d03c Replace == with is in test comparing variable identity.
PiperOrigin-RevId: 263627688
(cherry picked from commit 5f1f68499c)
2019-08-19 14:41:18 -07:00
Gaurav Jain
b363a41e62 Fix issues when tensor equality is enabled
PiperOrigin-RevId: 263589274
(cherry picked from commit 8651257d36)
2019-08-19 14:41:04 -07:00
Martin Wicke
800701404b Avoid tensor == in list membership test
PiperOrigin-RevId: 263505153
2019-08-19 14:40:51 -07:00
Yanhua Sun
77f168f69d Replace set(tensors) with set(id(tensors)) for eq change
PiperOrigin-RevId: 263473977
2019-08-19 14:40:32 -07:00
Yanhua Sun
d812281027 Fix assert a in b in lstm with assertTrue(any(a is b))
PiperOrigin-RevId: 263473895
2019-08-19 14:40:18 -07:00
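The pattern behind this fix — replacing an `assertIn`-style `in` check, which falls back to `==`, with an explicit identity check — can be illustrated in plain Python. `FakeTensor` is a hypothetical stand-in for a Tensor whose `==` returns an object rather than a bool:

```python
class FakeTensor:
    """Stand-in for a Tensor: `==` returns another object, not a bool."""
    def __eq__(self, other):
        return FakeTensor()  # truthy elementwise-style result, not a bool
    __hash__ = None  # unhashable, like TF 2.0 tensors

def contains_by_identity(item, seq):
    # `item in seq` may call __eq__ and misbehave once == is overloaded;
    # comparing identities sidesteps that entirely.
    return any(item is element for element in seq)

t = FakeTensor()
print(contains_by_identity(t, [FakeTensor(), t]))  # True
print(contains_by_identity(FakeTensor(), [t]))     # False
```

Note that `FakeTensor() in [t]` would wrongly evaluate truthy here, because the `__eq__` result is a truthy object; the identity check returns the correct False.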
Gaurav Jain
82cc8f6391 Tensor equality fixes for function_test
PiperOrigin-RevId: 263462725
2019-08-19 14:35:51 -07:00
Yanhua Sun
3ea3e38ec1 Replace map with ObjectIdentityDict for eq change
PiperOrigin-RevId: 263458266
2019-08-19 14:35:32 -07:00
Yanhua Sun
a7425bab38 Replace set(tensor) with id(tensor) for eq change
PiperOrigin-RevId: 263454392
2019-08-19 14:35:22 -07:00
Yanhua Sun
d4a92a6d5e Replace None in [] with is None for eq change
PiperOrigin-RevId: 263453410
2019-08-19 14:35:02 -07:00
Yanhua Sun
85301ece74 Fix rmsprop_test for eq change
PiperOrigin-RevId: 263439386
2019-08-19 14:34:39 -07:00
Yanhua Sun
f4aa98af47 Fix while_v2 for eq change
PiperOrigin-RevId: 263434775
2019-08-19 14:34:27 -07:00
Yanhua Sun
853ff441b7 Fix template for eq change
PiperOrigin-RevId: 263432053
2019-08-19 14:34:14 -07:00
Yanhua Sun
690e3fa140 Fix core_test due to eq change
PiperOrigin-RevId: 263428851
2019-08-19 14:33:57 -07:00
Yanhua Sun
a17fd9dc33 Use _ref to fix unhashable tensor error for eq change
PiperOrigin-RevId: 263428471
2019-08-19 14:33:43 -07:00
A. Unique TensorFlower
c609e0ab8d Ensure to use tensor references when dealing with sets in TF 2.
PiperOrigin-RevId: 263401414
2019-08-19 14:33:27 -07:00
Yanhua Sun
9699aaac38 Replace map with ObjectIdentityDict for eq change
PiperOrigin-RevId: 263353699
2019-08-19 14:32:04 -07:00
Alexandre Passos
530a522769 Make custom_gradient.py eq-safe.
PiperOrigin-RevId: 263218366
2019-08-19 14:31:13 -07:00
Gaurav Jain
942df35882 Ensure function CacheKey is equality-safe
When comparing CacheKeys we cannot simply compare the namedtuple fields
since that might involve comparing 2 Variables whose equality comparison
returns a Tensor rather than a bool. We thus compare Variables using
their class name, dtype & shape, similar to the processing we do for the
hash.

PiperOrigin-RevId: 262935453
2019-08-19 14:29:12 -07:00
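The approach the message describes — summarizing Variables by class name, dtype, and shape so cache keys compare with plain bools — might be sketched like this. This is a hypothetical simplification of the real CacheKey logic, with `FakeVariable` standing in for `tf.Variable`:

```python
class FakeVariable:
    """Stand-in for tf.Variable: `==` returns a tensor-like object, not a bool."""
    def __init__(self, dtype, shape):
        self.dtype, self.shape = dtype, shape

    def __eq__(self, other):
        return object()  # elementwise-comparison result, unusable as a bool

def encode_arg(arg):
    # Replace variables with a hashable, equality-safe summary, similar in
    # spirit to the commit: class name + dtype + shape.
    if isinstance(arg, FakeVariable):
        return (type(arg).__name__, arg.dtype, tuple(arg.shape))
    return arg

def make_cache_key(args):
    return tuple(encode_arg(a) for a in args)

k1 = make_cache_key([FakeVariable("float32", [2, 3]), 7])
k2 = make_cache_key([FakeVariable("float32", [2, 3]), 7])
print(k1 == k2)  # True: comparison stays a plain bool
```

Comparing the raw namedtuple fields would instead invoke `FakeVariable.__eq__` and yield a non-bool, which is exactly the failure mode the commit avoids.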
A. Unique TensorFlower
6811e43160 Updates the tf.print docstring for TF 2.0.
PiperOrigin-RevId: 264181110
2019-08-19 13:07:56 -07:00
Penporn Koanantakool
961eaa691a Fix curl build rules. 2019-08-19 12:37:48 -07:00
Penporn Koanantakool
9a9d5fa42d Fix the mirror URL. 2019-08-18 23:53:13 -07:00
Clayne Robison
f0955efedc [Intel MKL] Upgrading curl to 7.65.3 to fix CVE-2019-5443 2019-08-18 23:53:13 -07:00
Toby Boyd
ebb92c7b8c Move to cuDNN 7.6.2 and TensorRT 5.1.5 2019-08-18 17:46:59 -07:00
Goldie Gadde
0e41a9125f
Merge pull request #31710 from tensorflow/fix_optimizer_warning
r2.0-CherryPick:Fix unwanted deprecation warning in optimizers v2
2019-08-18 17:36:04 -07:00
Goldie Gadde
0a4fbccbd7
Merge pull request #31704 from jhseu/lstm
r2.0-CherryPick:Fix LSTMs in TPUStrategy.
2019-08-18 17:33:40 -07:00
Jiri Simsa
9d8571fd48 [tf.data] Add non-deterministic seed code path for RandomSeedGenerator to match TF 1.X behavior.
Fixes: #31706
PiperOrigin-RevId: 263878374
2019-08-16 19:15:52 -07:00
Goldie Gadde
b0fee96b1d
Merge pull request #31708 from tensorflow/ggadde-cp11
r2.0-CherryPick: More test and bug fixes.
2019-08-16 18:17:10 -07:00
François Chollet
61f9c06b35 Fix unwanted deprecation warning in optimizers v2 2019-08-16 15:17:27 -07:00
Goldie Gadde
bdbaf055f1
Merge pull request #31699 from jdduke/cherrypicks_5UP6F
Ensure native libs are loaded when using NnApiDelegate
2019-08-16 14:19:53 -07:00
Goldie Gadde
4756cfbbec
Merge pull request #31673 from saxenasaurabh/cherrypicks_VWQL1
Make `maybe_set_static_shape` a no-op when `shape` is a python constant.
2019-08-16 14:19:11 -07:00
Goldie Gadde
b8c634ea60
Merge pull request #31563 from jdduke/cherrypicks_8JAWG
Correctly convert const int8 weights to uint8 for NNAPI
2019-08-16 14:16:11 -07:00
Zhenyu Tan
393e4bc06c Also capture variable during hyper creation.
PiperOrigin-RevId: 263071623
2019-08-16 14:02:39 -07:00
Guangda Lai
32510d7e91 Convert dict_values to list to support indexing in python3.
PiperOrigin-RevId: 263825828
2019-08-16 14:02:28 -07:00
Jonathan Hseu
412938d821 Fix LSTMs in TPUStrategy. We need to check the outer graph for the control flow context to find out whether we're in a tpu.replicate().
PiperOrigin-RevId: 263821933
2019-08-16 12:40:21 -07:00
Jared Duke
ccdc0a9b7a Use the NnApiDelegate directly with Interpreter.Options.setUseNNAPI.
The dynamic useNNAPI method is deprecated, so avoid it when using the
blessed NNAPI path in Interpreter.Options.

PiperOrigin-RevId: 263580821
2019-08-16 10:57:27 -07:00
Jared Duke
657516bc21 Reland "Ensure native libs are loaded when using NnApiDelegate"
PiperOrigin-RevId: 263384929
2019-08-16 10:57:24 -07:00
Alexandre Passos
a671a14cfe
Merge pull request #31672 from penpornk/cherrypicks_DI6UL
r2.0-rc0 cherry-pick request: [INTEL MKL] Fix for Batchmatmul regression
2019-08-16 09:02:53 -07:00
Saurabh Saxena
cd2e770305 Make maybe_set_static_shape a no-op when shape is a python constant.
`maybe_set_static_shape` is only meant to handle cases that C++ shape inference cannot, i.e. when shape is a tensor that has a path to a captured placeholder inside a FuncGraph. So this change does not break any use cases we care about.
This fixes an issue with creating spurious constants in the Graph which are unused after shape inference.

PiperOrigin-RevId: 263666943
2019-08-15 23:00:12 -07:00
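The guard described above — skip the work entirely when `shape` is a Python constant, since C++ shape inference already covers that case — might look like this in outline. This is a hypothetical sketch, not the real implementation; `set_shape` is an injected callback standing in for the actual shape-setting logic:

```python
def maybe_set_static_shape(result, shape, set_shape):
    """Sketch: only act when `shape` is tensor-like (here: anything that is
    not a plain Python int/list/tuple), mirroring the no-op described above."""
    if isinstance(shape, (int, list, tuple)):
        return  # Python constant: C++ shape inference already handles this.
    set_shape(result, shape)

calls = []
maybe_set_static_shape("t", [2, 3], lambda r, s: calls.append((r, s)))
print(calls)  # []: no-op for a Python constant, no spurious constant created
maybe_set_static_shape("t", object(), lambda r, s: calls.append((r, s)))
print(len(calls))  # 1: tensor-like shapes still go through
```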
Goldie Gadde
d96ab53c6e
Merge pull request #31667 from tensorflow/ggadde-cp10
Improve NumPy to Dataset performance with vectorized shuffling.
2019-08-15 22:26:13 -07:00
AG Ramesh
c7e34625d6 Changes based on review comments 2019-08-15 21:53:23 -07:00
AG Ramesh
05605c8775 Update tensorflow/core/kernels/mkl_batch_matmul_op.cc
Co-Authored-By: Penporn Koanantakool <38085909+penpornk@users.noreply.github.com>
2019-08-15 21:53:05 -07:00
AG Ramesh
26beb07074 Add support for MklBatchMatMulv2 2019-08-15 21:52:42 -07:00
Thomas O'Malley
9c57a63096 Fix recursion error from NumPy->DS change.
PiperOrigin-RevId: 263703207
2019-08-15 21:40:35 -07:00
Thomas O'Malley
8235fb5642 Improve NumPy to Dataset performance with vectorized shuffling.
PiperOrigin-RevId: 263611727
2019-08-15 17:14:49 -07:00
Goldie Gadde
1ad1f90069
Merge pull request #31663 from tensorflow/ggadde-cp9
r2.0 Cherrypick: bugs fixes and test fixes.
2019-08-15 17:06:13 -07:00
Zhenyu Tan
03429911fa Decrease loss threshold for premade test.
PiperOrigin-RevId: 262280488
2019-08-15 16:12:15 -07:00
Guangda Lai
4850ef3125 Fix TRT tests in OSS build by reducing the GPU memory consumption.
PiperOrigin-RevId: 263579969
2019-08-15 14:44:06 -07:00
Pavithra Vijay
676ff6bf31 Fixing validation callback configuration in fit in single execution path.
- Added 'Train on # steps' message during training
- Set verbose to 0 in the fit validation loop to prevent progbar errors in v1

PiperOrigin-RevId: 263462848
2019-08-15 14:43:25 -07:00
Scott Zhu
c61bedb2b7 Update training v2 execution function to return batch size.
This lets the training function return the correct number of examples to the callbacks.

Also add casting for batch_size to int64, which was broken for multi-worker all reduce.

PiperOrigin-RevId: 263356430
2019-08-15 14:42:33 -07:00
Pavithra Vijay
66edf93f01 1. Do not raise the "steps unsupported with numpy arrays" warning message in the single execution path.
2. Raise an error if the batch_size argument is used when the input is a dataset/generator/Keras sequence.

PiperOrigin-RevId: 263272222
2019-08-15 14:38:04 -07:00
Goldie Gadde
845502ac5e
Merge pull request #31657 from k-w-w/cherrypicks_XSBZI
[2.0 Cherrypick] Bug fixes for Keras SavedModel
2019-08-15 14:23:44 -07:00
Katherine Wu
9045dfb8bc Track lookup tables created in FeatureColumn, and add FeatureColumn saving/loading test.
PiperOrigin-RevId: 263399690
2019-08-15 13:15:07 -07:00
Katherine Wu
88c1b25380 Fix bug where input signature is added for call functions with kwargs.
Issue reported in: #30808
Similar issue: #29545

PiperOrigin-RevId: 263390265
2019-08-15 12:47:25 -07:00
Katherine Wu
d211b76700 Add specific error for functions capturing Keras learning phase, and fix keras saving tests.
PiperOrigin-RevId: 262630016
2019-08-15 12:47:16 -07:00
Goldie Gadde
65e6355ad9
Merge pull request #31632 from tensorflow/ggadde-cp8
r2.0-CherryPick: Fixes for failing tests on windows builds
2019-08-14 22:38:50 -07:00
Rick Chao
aa884ae1ee Skip MultiWorkerTrainingStateTest for a Windows py35 path-not-found error. The root cause is unclear, but skipping the test to unblock the release.
PiperOrigin-RevId: 263432533
2019-08-14 21:03:37 -07:00
Allen Lavoie
6c803cb870 Use an older protobuf API for compatibility with a Windows nightly
PiperOrigin-RevId: 263468822
2019-08-14 21:01:54 -07:00
Goldie Gadde
091f65f95e
Merge pull request #31624 from tensorflow/ggadde-cp7
r2.0 CherryPick: Fix arg typo.
2019-08-14 16:20:01 -07:00
Yanhui Liang
83f14a9fbe Fix arg typo.
PiperOrigin-RevId: 262395899
2019-08-14 14:40:10 -07:00
Alexandre Passos
cf8189bb88
Merge pull request #31620 from tensorflow/ggadde-cp6
r2.0-Cherrypick: TensorEquality related changes
2019-08-14 09:25:41 -07:00
Saurabh Saxena
643d109b3f Prepare convert_to_constants.py for tensor equality changes.
PiperOrigin-RevId: 262655851
2019-08-14 08:21:46 -07:00
Yanhua Sun
360e0db035 Do not apply the new eq change to graph function building mode
PiperOrigin-RevId: 262649875
2019-08-14 08:20:58 -07:00
Yanhua Sun
19ec4ccd82 Export enable_tensor_equality and disable_tensor_equality so that users have a way to opt in and opt out explicitly
PiperOrigin-RevId: 262637047
2019-08-14 08:16:53 -07:00
Kibeom Kim
6492762d59 Disallow dictionary argument for tf.case
Tensors will be unhashable starting in TF 2.0, so disallow
a dictionary argument with a Tensor as a key for tf.case

PiperOrigin-RevId: 262483688
2019-08-14 08:15:03 -07:00
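The rationale above — an unhashable object cannot serve as a dict key, so a predicate-to-function mapping has to become a list of pairs — can be illustrated in plain Python. `FakeTensor` is a hypothetical stand-in for a TF 2.0 tensor:

```python
class FakeTensor:
    """Stand-in for a TF 2.0 tensor: __eq__ overloaded, __hash__ removed."""
    def __eq__(self, other):
        return object()  # elementwise-style result, not a bool
    __hash__ = None      # makes instances unhashable

pred = FakeTensor()
try:
    branches = {pred: lambda: 1}  # dict form: rejected once tensors are unhashable
except TypeError as e:
    print("unhashable:", e)

# The supported alternative: an ordered list of (predicate, fn) pairs.
branches = [(pred, lambda: 1), (FakeTensor(), lambda: 2)]
print(len(branches))  # 2
```

The list-of-pairs form also makes branch ordering explicit, which dict ordering historically did not guarantee.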
Yanhua Sun
c577f4633b Add suggestions in unhashable tensor error message.
PiperOrigin-RevId: 262443293
2019-08-14 08:14:26 -07:00
Yanhua Sun
27f0c213c3 Replace set with ObjectIdentitySet to prepare for eq change in TF
PiperOrigin-RevId: 262423827
2019-08-14 08:08:02 -07:00
Kibeom Kim
cc7b696b00 Fix _ObjectIdentityWrapper __eq__, to be symmetric.
PiperOrigin-RevId: 262421488
2019-08-14 08:06:59 -07:00
Kibeom Kim
0ee3820d07 Implement tensor.experimental_ref() that returns a reference object.
tf.Tensor and tf.Variable will be unhashable in 2.0, so users
can't use them in sets and dictionaries.

This experimental API returns a reference object to the tensor,
and users can use this instead for sets and dictionaries.
It also has a .deref() function that returns the original object.

PiperOrigin-RevId: 262407223
2019-08-14 07:55:41 -07:00
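The reference-object mechanism described above can be sketched in plain Python. This is a simplified illustration, not the actual TF implementation; `FakeTensor` stands in for a real tensor:

```python
class Ref:
    """Hashable reference wrapper keyed on the identity of the wrapped object."""
    def __init__(self, obj):
        self._obj = obj

    def __hash__(self):
        return id(self._obj)

    def __eq__(self, other):
        return isinstance(other, Ref) and self._obj is other._obj

    def deref(self):
        return self._obj  # recover the original object

class FakeTensor:
    __hash__ = None  # unhashable, like tensors in 2.0
    def __eq__(self, other):
        return object()
    def experimental_ref(self):
        return Ref(self)

t = FakeTensor()
seen = {t.experimental_ref()}
print(t.experimental_ref() in seen)       # True: refs to the same tensor match
print(t.experimental_ref().deref() is t)  # True: deref() returns the original
```

Two refs to the same object hash and compare equal, so sets and dict keys work; refs to distinct objects never collide even if the underlying `==` is overloaded.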
Gaurav Jain
5917907e42 Do not compare Tensors to _ACCEPTABLE_CSV_TYPES
We cannot directly compare tensors to dtypes. In fact the previous check
would have always returned false since the comparison was based on id
rather than contents. With tensor equality enabled we see that this test
was invalid and avoid doing the comparison altogether.

PiperOrigin-RevId: 262406666
2019-08-14 07:54:29 -07:00
Yanhua Sun
6f416ed0a9 add id to MirroredVariable, and modify capture key to use tensor_id instead of tensor
PiperOrigin-RevId: 262376081
2019-08-14 07:53:27 -07:00
Gaurav Jain
37a8e557fd Compare functions by id instead of equality
PiperOrigin-RevId: 262341582
2019-08-14 07:52:18 -07:00
Gaurav Jain
f75e87a6e1 Fix unhashable Variable class in input signature
We would like to make Variables unhashable when Tensor equality is
enabled. Unfortunately, they are needed as part of the function cache
key. We can resolve this by instead using the dtype & shape in the cache
key. Note this is a temporary fix until we make Variable a
CompositeTensor.

PiperOrigin-RevId: 262167751
2019-08-14 07:51:17 -07:00
Yanhua Sun
2c1e190ca1 Replace set with ObjectIdentitySet to prepare for eq change in TF
PiperOrigin-RevId: 262060827
2019-08-14 07:49:25 -07:00
Goldie Gadde
3e6a702c88
Merge pull request #31581 from tensorflow/ggadde-cp5
r2.0-cp: Override eigen strong inline to reduce windows build times for debugg…
2019-08-13 06:21:07 -07:00
A. Unique TensorFlower
689d40f473 Override Eigen strong inline to reduce Windows build times for debugging the
failures.

PiperOrigin-RevId: 262857756
2019-08-13 06:08:28 -07:00
Goldie Gadde
eca3413b03
Merge pull request #31552 from dubey/cherrypicks_706UC
r2.0-rc0 cherry-pick request: use NCCL only for all-reduce.
2019-08-12 13:46:49 -07:00
Goldie Gadde
81cef82199
Merge pull request #31553 from guptapriya/cherrypicks_3NEIM
r2.0-rc0 cherry-pick request: Rollback gradients change which broke convergence for NCF
2019-08-12 13:45:17 -07:00
A. Unique TensorFlower
a7103376b0 Correctly convert const int8 weights to uint8 for NNAPI.
PiperOrigin-RevId: 262747813
2019-08-12 13:18:41 -07:00
Stefano Galarraga
3ee2cb4b28 Extracts NNAPIDelegateKernel from nnapi_delegate.cc
PiperOrigin-RevId: 262571387
2019-08-12 13:18:38 -07:00
Goldie Gadde
1959992f4c
Merge pull request #31497 from zhangyujing/zhangyujing-patch-2
Roll back experimental_compile from 2.0
2019-08-12 12:50:19 -07:00
Goldie Gadde
78b028c2d5
Merge pull request #31490 from k-w-w/cherrypicks_HH6F5
Set default training value to `False` when exporting to SavedModel
2019-08-12 10:46:06 -07:00
Goldie Gadde
d77d36a141
Merge pull request #31456 from akshaym/cherrypicks_95FOF
Use "correct" op in while_v2 gradient
2019-08-12 10:41:29 -07:00
Goldie Gadde
6cc66d77c9
Merge pull request #31551 from tensorflow/ggadde-cp4
CherryPick: Only enable graph rewrite for RNN layer in v2 mode (outermost eager con…
2019-08-12 10:40:00 -07:00
Ayush Dubey
ff25a63ab9 Automated rollback of commit c9552455ab
PiperOrigin-RevId: 262647780
2019-08-12 10:28:42 -07:00
Priya Gupta
d4922a3c3a Automated rollback of commit cadb128334. Revert #22231.
PiperOrigin-RevId: 262947974
2019-08-12 10:28:21 -07:00
Scott Zhu
098e2c5b8e Only enable graph rewrite for RNN layer in v2 mode (outermost eager context).
The tf.function approach does not work well in v1 with sessions since it might try to update/mutate the graph between sessions. This change disables the tf.function path in v1 session mode. This prevents users from using the cuDNN kernel either with compat.v2 or tf.disable_eager_execution().

Note the estimator in v2 should still have the graph rewrite support (have cudnn kernel on GPU).

The graph rewrite tests are now run in v2 only since the rewrite in v1 has been disabled.

PiperOrigin-RevId: 262577530
2019-08-12 09:28:44 -07:00
Goldie Gadde
ccaf553d14
Merge pull request #31487 from saxenasaurabh/cherrypicks_PHYVE
Cherrypicks to fix shape inference of some common ops inside functional ops
2019-08-11 21:35:18 -07:00
Yifei Feng
926e66c254
Merge pull request #31511 from tensorflow/ggadde-cp3
Cherrypicks to fix the disconnected graph issue, and missing CUDA compute capabilities.
2019-08-10 09:14:44 -07:00
Amit Patankar
e0b6dd6c7c Add the supported CUDA compute capabilities into the toolchain created for manylinux2010 compatibility.
PiperOrigin-RevId: 262473631
2019-08-10 08:56:59 -07:00
Thomas O'Malley
8b21ae572e Add support for TF op whitelisting in Keras Layers with Keras Tensors as
positional args, kwargs.

PiperOrigin-RevId: 262417758
2019-08-10 08:53:23 -07:00
zhangyujing
e0779db1c3
Update tensorflow.pbtxt 2019-08-09 15:52:53 -07:00
zhangyujing
0b4dc2a14c
Update tensorflow.pbtxt 2019-08-09 15:52:07 -07:00
zhangyujing
b9a7a9e4bb
Update eager_test.py 2019-08-09 15:49:37 -07:00
zhangyujing
f714aec3bf
Update def_function.py 2019-08-09 15:42:16 -07:00
Katherine Wu
93d83d5398 Set default training value to False when exporting layer/model calls to SavedModel.
PiperOrigin-RevId: 262481484
2019-08-09 10:36:14 -07:00
Saurabh Saxena
60e87a3fa7 Fix shape inference of random.uniform with non-scalar alpha/beta, random.poisson with non-scalar lam.
PiperOrigin-RevId: 262268329
2019-08-09 09:47:36 -07:00
Saurabh Saxena
370f4470bd Fix shape inference of some more common ops inside functional ops.
PiperOrigin-RevId: 262184638
2019-08-09 09:47:26 -07:00
Akshay Modi
d32cc881cf Don't return an uninitialized value from TFE_OpNameGetAttrType
PiperOrigin-RevId: 262251730
2019-08-08 11:08:58 -07:00
Goldie Gadde
ff98617eb0
Merge pull request #31437 from saxenasaurabh/cherrypicks_01FLT
Do not accumulate Const nodes created in forward pass in while_v2.
2019-08-08 10:49:33 -07:00
Saurabh Saxena
9c3422db3b Do not accumulate Const nodes created in forward pass in while_v2.
PiperOrigin-RevId: 261958798
2019-08-07 21:07:28 -07:00
Goldie Gadde
f76633ead7
Merge pull request #31386 from annarev/build_v2_by_default
Build v2 by default
2019-08-07 20:25:12 -07:00
Goldie Gadde
0271554289
Merge pull request #31430 from tensorflow/ggadde-cp2
cherrypick to fix the Linux CPU py2 pip builds
2019-08-07 20:24:37 -07:00
Anna R
3de2e5a636 Disable tensorflow/python/debug/examples:examples_test in v2 builds. 2019-08-07 17:21:51 -07:00
Amit Patankar
141a66f9de Install the new future module directly in the virtualenv when building and testing TensorFlow using pip_new.sh.
PiperOrigin-RevId: 262165125
2019-08-07 16:57:35 -07:00
Anna R
7bebfe6cdc Fix / disable a few more tests that don't work with v2 2019-08-07 16:04:24 -07:00
Goldie Gadde
82317e84d7
Merge pull request #31422 from michaelwunder/cherrypicks_97NZL
r2.0-rc0 cherry-pick request: Fix FlushQuantileSummaries Op
2019-08-07 14:48:08 -07:00
Anna R
a880649989 Remove redundant dependencies on contrib. Add tensorflow/python/tpu:tpu_lib to pip dependencies 2019-08-07 14:43:20 -07:00
Goldie Gadde
53af10c85a
Merge pull request #31418 from tensorflow/ggadde-cp-1
Cherrypick for fixing the tests timing out.
2019-08-07 12:00:03 -07:00
Anna R
ba94d6f782 Remove reference to tensorboard_targets in pip_smoke_test 2019-08-07 11:54:01 -07:00
A. Unique TensorFlower
0314c381d2 Fix FlushQuantileSummaries Op so we can repeatedly use resource.
PiperOrigin-RevId: 262043164
2019-08-07 14:39:10 -04:00
Anna R
6eed92c3e2 Revert a few temporary changes that aren't supposed to be submitted 2019-08-07 11:26:15 -07:00
Anna R
4de05d7afd Fixing / disabling a few tests failing when we run with TF v2. 2019-08-07 11:23:28 -07:00
Goldie Gadde
5c9f008d8c Merge branch 'r2.0' of github.com:tensorflow/tensorflow into r2.0 2019-08-07 11:05:30 -07:00
Shining Sun
a4117b5033 Increase shard counts for some tests in hope of solving the timeout issue.
PiperOrigin-RevId: 262001676
2019-08-07 11:04:44 -07:00
Goldie Gadde
856917ba24
Merge pull request #31382 from tensorflow/mm-r2.0-security-cherry-pick
TF2.0 cherry-pick request: Don't copy more variant elements...
2019-08-07 09:09:20 -07:00
Goldie Gadde
1e3af1b4c9
Merge pull request #31383 from tensorflow/ggadde-version-update
Update TF version 2.0.0-rc0 and update estimator and tensorboard nightly versions.
2019-08-06 18:13:27 -07:00
Goldie Gadde
2ab7a3374a
Merge pull request #31385 from penpornk/cherrypicks_XYAFB
r2.0-rc0 cherry-pick request: Enabling eager rewriting for MKL matmul
2019-08-06 18:08:25 -07:00
Anna R
0a345bf124 Remove change added by mistake in previous commit: max bazel version shouldn't be changed 2019-08-06 16:33:11 -07:00
Anna R
54734900bd Build TensorFlow 2.0 by default. 2019-08-06 16:30:41 -07:00
AG Ramesh
b505fde34f Enabling eager rewriting for MKL matmul 2019-08-06 16:09:41 -07:00
Goldie Gadde
fbc17c1c1c Update TF version to 2.0.0-rc0 and estimator/tb nightly version pins. 2019-08-06 15:12:15 -07:00
Mihai Maruseac
100f54cd9a Don't copy more variant elements than allowed by tensor shape
PiperOrigin-RevId: 261962798
2019-08-06 15:09:07 -07:00
Yifei Feng
135639d075
Merge pull request #31379 from tensorflow/mm-2.0
Update r2.0 branch to prepare for TF2.0.0 release
2019-08-06 14:38:47 -07:00
Goldie Gadde
6e5da334f1 Removing merge conflict marker. 2019-08-06 13:51:02 -07:00
Goldie Gadde
118ac87006 Update the release notes to be in sync with master. 2019-08-06 13:48:29 -07:00
Goldie Gadde
c650305e6c Update estimator nightly version to pick the checkpoint converter tool
changes.
2019-08-06 13:21:51 -07:00
Goldie Gadde
efe4ebd038 Update release notes. 2019-08-06 13:16:40 -07:00
Goldie Gadde
9c9a6658a5 Release Notes for 2.0 Beta. Address the comments.
Add release notes for the cherrypicks.
Add a few more lines to the other fixes.
2019-08-06 13:16:13 -07:00
Goldie Gadde
8e2740f26f Update TF 2.0.0-alpha-0 release notes. (#26369)
Release Notes for 2.0.0-alpha0
2019-08-06 13:13:00 -07:00
Goldie Gadde
d36dc6ca87 Update version to 2.0.0-beta1. 2019-08-06 13:09:52 -07:00
Yifei Feng
d1c32c20aa Update the version to 2.0.0 in tensorflow.bzl
Fix https://github.com/tensorflow/tensorflow/issues/29540
2019-08-06 13:09:26 -07:00
Goldie Gadde
46b5bd8cfe Version update. 2019-08-06 13:07:15 -07:00
Goldie Gadde
5859e87317 Update the version of TF to 2.0.0-alpha0 2019-08-06 13:00:00 -07:00
Goldie Gadde
2d561c3431 Merge branch 'master-where-we-want-it' into r2.0 2019-08-06 12:40:51 -07:00
Goldie Gadde
8e423e3d56 Update release notes. 2019-06-13 10:41:13 -07:00
Goldie Gadde
1d91213fe7
Merge pull request #29741 from goldiegadde/ggadde-version2
Update version to 2.0.0-beta1.
2019-06-13 10:25:23 -07:00
Goldie Gadde
b43f012dec Update version to 2.0.0-beta1. 2019-06-13 04:49:26 -07:00
Goldie Gadde
0c59cc94fd
Merge pull request #29722 from k-w-w/cherrypicks_WPCYV
Cherrypicks for Keras SavedModel
2019-06-12 19:44:07 -07:00
Katherine Wu
974ff69e6e Automated rollback of commit 65b507e8a1
PiperOrigin-RevId: 252924374
2019-06-12 17:19:40 -07:00
Katherine Wu
319e32730a Allow SavedModel serialization to accept None InputSpec values.
PiperOrigin-RevId: 252916721
2019-06-12 17:19:40 -07:00
Goldie Gadde
d08e899087
Merge pull request #29670 from goldiegadde/ggadde-cp5
Cherrypick important fixes to r2.0 branch.
2019-06-12 16:37:56 -07:00
Goldie Gadde
5c07d01e92
Merge pull request #29547 from tensorflow/yifeif-patch-1
Update the version to 2.0.0 in tensorflow.bzl
2019-06-12 16:18:06 -07:00
Edward Loper
a93f2d0465 Add TypeSpec subclasses
PiperOrigin-RevId: 251330524
2019-06-12 14:04:38 -07:00
Goldie Gadde
1640a1f717 Revert "Replace training tensor argument with python boolean. Required for TFLite, which does not yet support control flow ops."
This reverts commit 43e36e609c.
2019-06-12 13:30:02 -07:00
Gunhan Gulsoy
48e26561ec Address review comments. 2019-06-11 15:36:31 -07:00
Goldie Gadde
976d0b63c7 Release Notes for 2.0 Beta. Address the comments.
Add release notes for the cherrypicks.
Add a few more lines to the other fixes.
2019-06-11 15:36:31 -07:00
Katherine Wu
43e36e609c Replace training tensor argument with python boolean. Required for TFLite, which does not yet support control flow ops.
This also adds a class that ensures that all layer call functions are traced with the same inputs, and with training set to both True&False.

PiperOrigin-RevId: 252682485
2019-06-11 15:15:44 -07:00
Nupur Garg
dc3be2fcec Internal change.
PiperOrigin-RevId: 252080249
2019-06-11 15:15:32 -07:00
Scott Zhu
f10351e8b4 Fix the RNN backend swapping issue, and adding new unit test for model.eval.
PiperOrigin-RevId: 252492125
2019-06-11 15:08:45 -07:00
Scott Zhu
c6b4bd510f Partially fix the function inlining and performance regression for LSTM/GRU.
1. Force the defun graph to not inline, so that grappler can properly
do the rewrite. This will fix the codelab performance issue, eg #29506 and #29549.

2. Disable the code path for CuDNN backend with masked input in LSTM/GRU.
Due to the issue of graph rewrite in Grappler for the generated under
tf.cond, this change will fallback to use normal kernel in v2 func graph
model when mask is present. It will be a bit slower compared to the CuDNN kernel,
but will give the correct numerical result. This issue will be addressed
in future change before 2.0 formal release.

PiperOrigin-RevId: 252428433
2019-06-11 15:08:23 -07:00
Yifei Feng
00f6608a0a
Update the version to 2.0.0 in tensorflow.bzl
Fix https://github.com/tensorflow/tensorflow/issues/29540
2019-06-07 13:50:56 -07:00
Goldie Gadde
f59745a381
Merge pull request #29521 from goldiegadde/ggadde-cp2
Revert "Fix an important performance regression for LSTM and GRU in t…
2019-06-07 00:00:18 -07:00
Goldie Gadde
e3bd1efbba Revert "Fix an important performance regression for LSTM and GRU in tf 2.0"
This reverts commit 4f39bd9ce8.
2019-06-06 22:46:38 -07:00
Goldie Gadde
5a25489b70
Merge pull request #29517 from goldiegadde/ggadde-cp1
Fix an important performance regression for LSTM and GRU in tf 2.0
2019-06-06 17:03:22 -07:00
Scott Zhu
4f39bd9ce8 Fix an important performance regression for LSTM and GRU in tf 2.0
The issue was caused by auto-inlining the tf.function in eager context,
which prevented grappler from performing the swap optimization.

PiperOrigin-RevId: 251945251
2019-06-06 16:39:06 -07:00
Goldie Gadde
93eee337a1
Merge pull request #29499 from goldiegadde/ggadde-3
[XLA] Seed each convolution with the same rng state, so that the conv…
2019-06-06 09:30:25 -07:00
Tim Shen
18ea3d398a [XLA] Seed each convolution with the same rng state, so that the conv autotuning input is consistent even when run individually.
PiperOrigin-RevId: 251332367
2019-06-06 08:59:27 -07:00
Goldie Gadde
aea1a7549c
Merge pull request #29473 from goldiegadde/ggadde-cherrypick4
Fix the missing numpy import
2019-06-05 20:33:22 -07:00
Goldie Gadde
e7433829c4 Fix the numpy import. 2019-06-05 20:20:09 -07:00
Katherine Wu
9f7f717179 Set SavedModel as default format for model.save in TF2, and compile loaded model.
Additional change:
- Sequential models are now revived as Sequential.

PiperOrigin-RevId: 251723025
2019-06-05 16:33:56 -07:00
Pavithra Vijay
b9c7a8c6e9 Deserializing loss class in hdf5 format.
PiperOrigin-RevId: 251579891
2019-06-05 16:33:56 -07:00
Shanqing Cai
141727be8c [tf.keras] Fix a breakage in which string attrs in HDF5 file get quotes
- Details of the breakage:
  - The "keras_version" attr of a saved HDF5 (.h5) file of tf.keras model
    started to have quotes around it today. For instance, it ought to be
    2.2.4-tf, but instead becomes "2.2.4-tf" (with the quotes)
  - The root cause CL appears to be CL/251386039
  - This was discovered during TensorFlow.js nightly benchmark
- This CL fixes the breakage and adds a unit test to prevent regression.

PiperOrigin-RevId: 251544402
2019-06-05 16:33:56 -07:00
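One plausible way the quoting breakage described in the commit above arises is double-encoding: serializing a string attribute with `json.dumps` (or `repr`) before writing it, which bakes literal quotes into the stored value. A minimal reproduction in plain Python (illustrative only; this is not the actual Keras/HDF5 saving code):

```python
import json

keras_version = "2.2.4-tf"
# Buggy path: JSON-encoding a bare string adds literal quotes around it,
# so the stored attr reads "2.2.4-tf" instead of 2.2.4-tf.
buggy_attr = json.dumps(keras_version)
# Fixed path: store the raw string unchanged.
fixed_attr = keras_version
print(buggy_attr, fixed_attr)
```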
Anna R
dc7514029a Use self.get_temp_dir() instead of self.create_tempdir to fix Windows build.
PiperOrigin-RevId: 251374355
2019-06-05 16:33:56 -07:00
Francois Chollet
0030e1dbdd Add integration test for Sequential pop workflow.
PiperOrigin-RevId: 251323411
2019-06-05 16:33:56 -07:00
Goldie Gadde
d69b8192fb
Merge pull request #29456 from jaingaurav/cherry-2.0
Marks Keras set_session as compat.v1 only. Also moves some renames to…
2019-06-05 15:00:13 -07:00
Goldie Gadde
79d7110436
Merge pull request #29455 from jaingaurav/cherrypicks-2.0
Make default Keras ConfigProto use tf.config
2019-06-05 14:49:41 -07:00
A. Unique TensorFlower
64bfb8cc30 Marks Keras set_session as compat.v1 only. Also moves some renames to the manual renames that had been incorrectly placed in the auto-generated symbol mappings.
PiperOrigin-RevId: 251708447
2019-06-05 14:36:33 -07:00
Gaurav Jain
9ed3e9423a Make default Keras ConfigProto use tf.config
PiperOrigin-RevId: 251659257
2019-06-05 12:09:21 -07:00
Goldie Gadde
fba5ab6ad0 Update estimator nightly version to pick the checkpoint converter tool
changes.
2019-06-05 10:52:37 -07:00
Yifei Feng
eed4d9c4c4
Merge pull request #29415 from goldiegadde/ggadde-2
Cleaning up this file, so that only Kathy's changes are in.
2019-06-04 22:55:54 +02:00
Goldie Gadde
9ec031ca18 The cherrypick of this file had some issues, as some changes were rolled
back. Cleaning up this file so that only Kathy's changes are in.
2019-06-04 13:42:16 -07:00
A. Unique TensorFlower
9a1e7c8a50 Disable tests that are failing in TFv2 Windows CPU/GPU builds.
PiperOrigin-RevId: 251441123
2019-06-04 10:06:15 -07:00
Goldie Gadde
2becc5bc82 Version update. 2019-06-04 09:28:31 -07:00
Goldie Gadde
8bd937b371 fix the merge conflicts. 2019-06-04 09:28:02 -07:00
A. Unique TensorFlower
71a03a68e7 Disable tests that are failing in TFv2 Windows CPU/GPU builds.
PiperOrigin-RevId: 251343281
2019-06-04 09:28:02 -07:00
Gaurav Jain
742eb3f22a Add get_ prefix to parallelism_threads APIs
PiperOrigin-RevId: 251391606
2019-06-04 09:28:02 -07:00
Katherine Wu
e425b84524 Keras models and layers saving and reviving code. Implements go/tf-model-serialization.
To save and revive a model:
1. Save the model using tf.saved_model.save
2. call load_from_save_model_v2

This restores various metadata about Keras models and layers, as well as their call and loss functions.

Changes to object serialization:
- Adds private fields for tracking object's identifier and metadata.
- Added _list_extra_dependencies_for_serialization, which allows objects to save extra
  dependencies when serialized to SavedModel.
- Object graph view maintains a serialization cache object that is passed to each object when serializing functions/extra dependencies.

PiperOrigin-RevId: 251386039
2019-06-04 09:28:02 -07:00
Gunhan Gulsoy
3149d0377e
Merge pull request #29370 from goldiegadde/r2.0
R2.0 fastforward branch.
2019-06-03 14:55:27 -07:00
Goldie Gadde
bc41b48267 Merge branch 'r2.0' of github.com:tensorflow/tensorflow into r2.0 2019-06-03 14:39:18 -07:00
Yifei Feng
2c2d508aa2
Merge pull request #27515 from tensorflow/cherrypick-87ea41d023a985c5ecee0370c7f7381c8a8cee52
Fix configure.py to properly compare "X.Y.Z" with "X.Y"
2019-04-05 10:23:52 -07:00
Austin Anderson
3b09dccd1c Fix configure.py to properly compare "X.Y.Z" with "X.Y"
Right now, "0.24" is treated as lower than "*.*.*" because of the odd comparison method that adds digits to each existing section. This change converts "0.24" to "0.24.0" to fix that.

This will probably need to be pulled into r2.0.

PiperOrigin-RevId: 241950768
2019-04-04 12:44:52 -07:00
Austin Anderson
966cd0db3a Fix configure.py to properly compare "X.Y.Z" with "X.Y"
Right now, "0.24" is treated as lower than "*.*.*" because of the odd comparison method that adds digits to each existing section. This change converts "0.24" to "0.24.0" to fix that.

This will probably need to be pulled into r2.0.

PiperOrigin-RevId: 241950768
2019-04-04 11:16:58 -07:00
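The version-comparison fix described in the commit above can be sketched in plain Python. This is a hedged illustration only; the function names (`pad_version`, `at_least`) are hypothetical and do not match configure.py's actual code.

```python
# Sketch of the fix: zero-pad "X.Y" to "X.Y.Z" before comparing, so that
# "0.24" is no longer treated as lower than a three-section version.
def pad_version(version, sections=3):
    """Split a dotted version string and zero-pad it to `sections` parts."""
    parts = version.split(".")
    parts += ["0"] * (sections - len(parts))
    return tuple(int(p) for p in parts)

def at_least(installed, minimum):
    """Return True if `installed` >= `minimum` after padding both sides."""
    return pad_version(installed) >= pad_version(minimum)

# "0.24" becomes (0, 24, 0) and compares correctly against (0, 19, 0).
print(at_least("0.24", "0.19.0"))
```

Comparing zero-padded integer tuples also avoids the pitfalls of comparing raw strings ("0.9" sorts above "0.24" lexicographically).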
Yifei Feng
17548db070
Merge pull request #27399 from tensorflow/angersson-r2.0-bazel-version
Pull support for latest Bazel version
2019-04-02 18:12:19 -07:00
Austin Anderson
516939197b
Pull support for latest Bazel version
This makes the 2.0 branch match the current maximum Bazel version settings on `master`. It should help resolve https://github.com/tensorflow/tensorflow/issues/26553, where the latest `devel` images don't support `r2.0`.
2019-04-01 17:33:46 -07:00
Yifei Feng
2c319fb415
Merge pull request #26373 from goldiegadde/cherrypicks_MQMPF
Disable bincount_op_test on windows gpu.
2019-03-05 13:41:53 -08:00
A. Unique TensorFlower
421340c397 Disable bincount_op_test on windows gpu.
PiperOrigin-RevId: 236907827
2019-03-05 13:39:05 -08:00
Goldie Gadde
74a8ab67fa Update TF 2.0.0-alpha-0 release notes. (#26369)
Release Notes for 2.0.0-alpha0
2019-03-05 13:24:28 -08:00
Gunhan Gulsoy
685ade7422
Merge pull request #26370 from tensorflow/yifeif-patch-2
Switch to use pip3
2019-03-05 11:33:58 -08:00
Yifei Feng
68afc63e93
Switch to use pip3 2019-03-05 11:26:40 -08:00
Gunhan Gulsoy
86af76e6f6
Merge pull request #26367 from goldiegadde/cherrypicks_AEHRG
Cherrypick build and test fixes for fixing the release builds for TF 2.0 alpha-0
2019-03-05 10:00:28 -08:00
A. Unique TensorFlower
1f73987d59 disable failing virtual_gpu_test on gpu.
PiperOrigin-RevId: 236780392
2019-03-05 09:36:51 -08:00
A. Unique TensorFlower
fc5e2edd0d Add ops_history.v2.pbtxt to the build dependency of
backwards_compatibility_test.

PiperOrigin-RevId: 236782576
2019-03-05 09:36:44 -08:00
A. Unique TensorFlower
92c7f48de1 Add ops_history.v2.pbtxt for TF 2.0 alpha release.
PiperOrigin-RevId: 236732665
2019-03-05 09:36:39 -08:00
Mihai Maruseac
0bece764ea Disable some v2 py36 pip flaky tests
PiperOrigin-RevId: 236747310
2019-03-05 09:36:32 -08:00
Martin Wicke
e907242ae3 Use eager-friendly evaluation for bincount_op_test.
PiperOrigin-RevId: 236690668
2019-03-05 09:36:26 -08:00
A. Unique TensorFlower
c6850fa156 Internal Change
PiperOrigin-RevId: 236754902
2019-03-05 09:36:19 -08:00
A. Unique TensorFlower
0d18c7cbad Internal change
PiperOrigin-RevId: 236729117
2019-03-05 09:36:11 -08:00
A. Unique TensorFlower
f147da65a6 Work around crash in nvcc by using the same #include files and order as max_pooling, which seems to not trigger the crash.
PiperOrigin-RevId: 236832623
2019-03-05 09:36:04 -08:00
Yifei Feng
c808841ce7 Install auditwheel 1.5.0.
PiperOrigin-RevId: 236681136
2019-03-05 09:35:56 -08:00
Gunhan Gulsoy
45534731dd
Merge pull request #26272 from tensorflow/version-update-ggadde
Update the version of TF to 2.0.0-alpha0
2019-03-04 11:16:57 -08:00
Gunhan Gulsoy
bdecee4c43
Merge pull request #26326 from tensorflow/2.0-ff
2.0 ff, Move r2.0 branch ahead to pick up test and build fixes.
2019-03-04 10:26:28 -08:00
Goldie Gadde
d1d77dd8c8 Update the version of TF to 2.0.0-alpha0 2019-03-01 16:57:47 -08:00
552 changed files with 18028 additions and 6369 deletions

.bazelrc

@ -1,3 +1,80 @@
# TensorFlow Bazel configuration file.
# This file tries to group and simplify build options for TensorFlow
#
# ----CONFIG OPTIONS----
# Android options:
# android:
# android_arm:
# android_x86:
# android_x86_64:
#
# iOS options:
# ios:
# ios_armv7:
# ios_arm64:
# ios_x86_64:
# ios_fat:
#
# Compiler options:
# cuda_clang: Use clang when building CUDA code.
# c++17: Build with C++17 options
# C++1z: Build with C++17 options
# avx_linux: Build with avx instruction set on linux.
# avx2_linux: Build with avx2 instruction set on linux.
# arch_native_linux: Build with instruction sets available to the host machine on linux
# avx_win: Build with avx instruction set on windows
# avx2_win: Build with avx2 instruction set on windows
#
# Other build options:
# short_logs: Only log errors during build, skip warnings.
# monolithic: Build all TF C++ code into a single shared object.
# dynamic_kernels: Try to link all kernels dynamically (experimental).
#
#
# TF version options;
# v1: Build TF V1 (without contrib)
# v2: Build TF v2
#
# Feature and Third party library support options:
# xla: Build TF with XLA
# using_cuda: CUDA is available to build system.
# cuda: Build with full cuda support.
# rocm: Build with AMD GPU support (rocm).
# sycl: Build with SYCL support.
# sycl_nodouble:
# sycl_asan:
# sycl_trisycl:
# mkl: Enable full mkl support.
# mkl_open_source_only: Enable MKL support only using open source MKL libraries.
# tensorrt: Enable Tensorrt support.
# ngraph: Enable ngraph support.
# numa: Enable numa using hwloc.
# noaws: Disable AWS S3 storage support
# nogcp: Disable GCS support.
# nohdfs: Disable hadoop hdfs support.
# nonccl: Disable nccl support.
#
#
# Remote build execution options (only configured to work with TF team projects for now.)
# rbe: General RBE options shared by all flavors.
# rbe_linux: General RBE options used on all linux builds.
# rbe_win: General RBE options used on all windows builds.
#
# rbe_cpu_linux: RBE options to build with only CPU support.
# rbe_linux_cuda_nvcc: RBE options to build with GPU support using nvcc.
# rbe_gpu_linux: An alias for rbe_linux_cuda_nvcc
#
# rbe_linux_py2: Linux Python 2 RBE config.
# rbe_linux_py3: Linux Python 3 RBE config
#
# rbe_win_py37: Windows Python 3.7 RBE config
#
# tensorflow_testing_rbe_linux: RBE options to use RBE with tensorflow-testing project on linux
# tensorflow_testing_rbe_win: RBE options to use RBE with tensorflow-testing project on windows
#
# Android configs. Bazel needs to have --cpu and --fat_apk_cpu both set to the
# target CPU to build transient dependencies correctly. See
# https://docs.bazel.build/versions/master/user-manual.html#flag--fat_apk_cpu
@ -48,15 +125,6 @@ build:mkl_open_source_only --define=build_with_mkl_dnn_v1_only=true
build:mkl_open_source_only --define=build_with_mkl=true --define=enable_mkl=true
build:mkl_open_source_only --define=tensorflow_mkldnn_contraction_kernel=0
build:download_clang --crosstool_top=@local_config_download_clang//:toolchain
build:download_clang --define=using_clang=true
build:download_clang --action_env TF_DOWNLOAD_CLANG=1
# Instruct clang to use LLD for linking.
# This only works with GPU builds currently, since Bazel sets -B/usr/bin in
# auto-generated CPU crosstool, forcing /usr/bin/ld.lld to be preferred over
# the downloaded one.
build:download_clang_use_lld --linkopt='-fuse-ld=lld'
# This config refers to building with CUDA available. It does not necessarily
# mean that we build CUDA op kernels.
build:using_cuda --define=using_cuda=true
@ -109,7 +177,6 @@ build --define=use_fast_cpp_protos=true
build --define=allow_oversize_protos=true
build --spawn_strategy=standalone
build --strategy=Genrule=standalone
build -c opt
# Make Bazel print out all options from rc files.
@ -132,29 +199,147 @@ build --define=PREFIX=/usr
build --define=LIBDIR=$(PREFIX)/lib
build --define=INCLUDEDIR=$(PREFIX)/include
# Suppress C++ compiler warnings, otherwise build logs become 10s of MBs.
build --copt=-w
# Suppress all warning messages.
build:short_logs --output_filter=DONT_MATCH_ANYTHING
# Instruction set optimizations
# TODO(gunan): Create a feature in toolchains for avx/avx2 to
# avoid having to define linux/win separately.
build:avx_linux --copt=-mavx
build:avx2_linux --copt=-mavx2
build:native_arch_linux --copt=-march=native
build:avx_win --copt=/arch=AVX
build:avx2_win --copt=/arch=AVX2
# Options to build TensorFlow 1.x or 2.x.
build:v1 --define=tf_api_version=1
build:v2 --define=tf_api_version=2
build:v1 --action_env=TF2_BEHAVIOR=0
build:v2 --action_env=TF2_BEHAVIOR=1
build --config=v2
test --config=v2
# Enable XLA
build:xla --action_env=TF_ENABLE_XLA=1
build:xla --define=with_xla_support=true
# BEGIN TF REMOTE BUILD EXECUTION OPTIONS
# Options when using remote execution
# WARNING: THESE OPTIONS WONT WORK IF YOU DO NOT HAVE PROPER AUTHENTICATION AND PERMISSIONS
build:rbe --action_env=BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN=1
build:rbe --auth_enabled=true
build:rbe --auth_scope=https://www.googleapis.com/auth/cloud-source-tools
build:rbe --bes_backend=buildeventservice.googleapis.com
build:rbe --bes_best_effort=false
build:rbe --bes_results_url="https://source.cloud.google.com/results/invocations"
build:rbe --bes_timeout=600s
build:rbe --define=EXECUTOR=remote
build:rbe --flaky_test_attempts=3
build:rbe --jobs=200
build:rbe --remote_accept_cached=true
build:rbe --remote_cache=remotebuildexecution.googleapis.com
build:rbe --remote_executor=remotebuildexecution.googleapis.com
build:rbe --remote_local_fallback=false
build:rbe --remote_timeout=600
build:rbe --remote_executor=grpcs://remotebuildexecution.googleapis.com
build:rbe --remote_timeout=3600
build:rbe --spawn_strategy=remote
build:rbe --strategy=Genrule=remote
build:rbe --strategy=Closure=remote
build:rbe --strategy=Javac=remote
build:rbe --strategy=TestRunner=remote
build:rbe --tls_enabled
test:rbe --test_env=USER=anon
build:rbe --distinct_host_configuration=false
build:rbe_linux --config=rbe
build:rbe_linux --action_env=PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/go/bin"
build:rbe_linux --host_javabase=@bazel_toolchains//configs/ubuntu16_04_clang/1.1:jdk8
build:rbe_linux --javabase=@bazel_toolchains//configs/ubuntu16_04_clang/1.1:jdk8
build:rbe_linux --host_java_toolchain=@bazel_tools//tools/jdk:toolchain_hostjdk8
build:rbe_linux --java_toolchain=@bazel_tools//tools/jdk:toolchain_hostjdk8
# Non-rbe settings we should include because we do not run configure
build:rbe_linux --config=xla
build:rbe_linux --config=avx_linux
build:rbe_linux --config=short_logs
# TODO(gunan): Check why we need this specified in rbe, but not in other builds.
build:rbe_linux --linkopt=-lrt
build:rbe_cpu_linux --config=rbe_linux
build:rbe_cpu_linux --crosstool_top="//third_party/toolchains/preconfig/ubuntu16.04/gcc7_manylinux2010:toolchain"
build:rbe_cpu_linux --extra_toolchains="//third_party/toolchains/preconfig/ubuntu16.04/gcc7_manylinux2010:cc-toolchain-k8"
build:rbe_cpu_linux --extra_execution_platforms="@org_tensorflow//third_party/toolchains:rbe_ubuntu16.04-manylinux2010"
build:rbe_cpu_linux --host_platform="@org_tensorflow//third_party/toolchains:rbe_ubuntu16.04-manylinux2010"
build:rbe_cpu_linux --platforms="@org_tensorflow//third_party/toolchains:rbe_ubuntu16.04-manylinux2010"
build:rbe_linux_cuda_nvcc --config=rbe_linux
build:rbe_linux_cuda_nvcc --crosstool_top="//third_party/toolchains/preconfig/ubuntu16.04/gcc7_manylinux2010-nvcc-cuda10.1:toolchain"
build:rbe_linux_cuda_nvcc --extra_toolchains="//third_party/toolchains/preconfig/ubuntu16.04/gcc7_manylinux2010-nvcc-cuda10.1:toolchain-linux-x86_64"
build:rbe_linux_cuda_nvcc --extra_execution_platforms="@org_tensorflow//third_party/toolchains:rbe_cuda10.1-cudnn7-ubuntu16.04-manylinux2010,@org_tensorflow//third_party/toolchains:rbe_cuda10.1-cudnn7-ubuntu16.04-manylinux2010-gpu"
build:rbe_linux_cuda_nvcc --host_platform="@org_tensorflow//third_party/toolchains:rbe_cuda10.1-cudnn7-ubuntu16.04-manylinux2010"
build:rbe_linux_cuda_nvcc --platforms="@org_tensorflow//third_party/toolchains:rbe_cuda10.1-cudnn7-ubuntu16.04-manylinux2010"
build:rbe_linux_cuda_nvcc --repo_env=TF_CUDA_CONFIG_REPO="@org_tensorflow//third_party/toolchains/preconfig/ubuntu16.04/cuda10.1-cudnn7"
build:rbe_linux_cuda_nvcc --repo_env=TF_TENSORRT_CONFIG_REPO="@org_tensorflow//third_party/toolchains/preconfig/ubuntu16.04/tensorrt6.0"
build:rbe_linux_cuda_nvcc --repo_env=TF_NEED_TENSORRT=1
build:rbe_linux_cuda_nvcc --repo_env=TF_CUDA_VERSION=10
build:rbe_linux_cuda_nvcc --repo_env=TF_CUDNN_VERSION=7
build:rbe_linux_cuda_nvcc --repo_env=REMOTE_GPU_TESTING=1
build:rbe_linux_cuda_nvcc --repo_env=TF_NEED_CUDA=1
build:rbe_linux_cuda_nvcc --define=using_cuda_nvcc=true
test:rbe_linux_cuda_nvcc --test_env=LD_LIBRARY_PATH="/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
common:rbe_gpu_linux --config=rbe_linux_cuda_nvcc
build:rbe_linux_py2 --config=rbe_linux
build:rbe_linux_py2 --repo_env=PYTHON_BIN_PATH="/usr/bin/python2"
build:rbe_linux_py2 --python_path="/usr/bin/python2"
build:rbe_linux_py2 --repo_env=TF_PYTHON_CONFIG_REPO="@org_tensorflow//third_party/toolchains/preconfig/ubuntu16.04/py"
build:rbe_linux_py3 --config=rbe_linux
build:rbe_linux_py3 --repo_env=PYTHON_BIN_PATH="/usr/bin/python3"
build:rbe_linux_py3 --python_path="/usr/bin/python3"
build:rbe_linux_py3 --repo_env=TF_PYTHON_CONFIG_REPO="@org_tensorflow//third_party/toolchains/preconfig/ubuntu16.04/py3"
build:rbe_win --config=rbe
build:rbe_win --crosstool_top="@org_tensorflow//third_party/toolchains/preconfig/win_1803/bazel_026:toolchain"
build:rbe_win --extra_execution_platforms="@org_tensorflow//third_party/toolchains/preconfig/win_1803:rbe_windows_1803"
build:rbe_win --extra_toolchains="@org_tensorflow//third_party/toolchains/preconfig/win_1803/bazel_026:cc-toolchain-x64_windows"
build:rbe_win --host_javabase="@org_tensorflow//third_party/toolchains/preconfig/win_1803:windows_jdk8"
build:rbe_win --host_platform="@org_tensorflow//third_party/toolchains/preconfig/win_1803:rbe_windows_1803"
build:rbe_win --javabase="@org_tensorflow//third_party/toolchains/preconfig/win_1803:windows_jdk8"
build:rbe_win --platforms="@org_tensorflow//third_party/toolchains/preconfig/win_1803:rbe_windows_1803"
build:rbe_win --shell_executable=C:\\tools\\msys64\\usr\\bin\\bash.exe
# Misc build options we need for windows
build:rbe_win --copt=-DWIN32_LEAN_AND_MEAN
build:rbe_win --host_copt=-DWIN32_LEAN_AND_MEAN
build:rbe_win --copt=-DNOGDI
build:rbe_win --host_copt=-DNOGDI
build:rbe_win --linkopt=/DEBUG
build:rbe_win --host_linkopt=/DEBUG
build:rbe_win --linkopt=/OPT:REF
build:rbe_win --host_linkopt=/OPT:REF
build:rbe_win --linkopt=/OPT:ICF
build:rbe_win --host_linkopt=/OPT:ICF
build:rbe_win --config=monolithic
build:rbe_win --experimental_strict_action_env=true
build:rbe_win --incompatible_windows_native_test_wrapper
# TODO(gunan): Remove once we use MSVC 2019 with latest patches.
build:rbe_win --define=override_eigen_strong_inline=true
build:rbe_win_py37 --config=rbe
build:rbe_win_py37 --repo_env=PYTHON_BIN_PATH=C:\\Python37\\python.exe
build:rbe_win_py37 --repo_env=PYTHON_LIB_PATH=C:\\Python37\\lib\\site-packages
build:rbe_win_py37 --repo_env=TF_PYTHON_CONFIG_REPO=@org_tensorflow//third_party/toolchains/preconfig/win_1803/py37
build:rbe_win_py37 --python_path=C:\\Python37\\python.exe
# These you may need to change for your own GCP project.
build:tensorflow_testing_rbe --project_id=tensorflow-testing
common:tensorflow_testing_rbe_linux --remote_instance_name=tensorflow-testing/instances/default_instance
build:tensorflow_testing_rbe_linux --config=tensorflow_testing_rbe
build:tensorflow_testing_rbe_linux --config=rbe
build:tensorflow_testing_rbe_linux --config=rbe_linux
common:tensorflow_testing_rbe_win --remote_instance_name=projects/tensorflow-testing/instances/windows
build:tensorflow_testing_rbe_win --config=tensorflow_testing_rbe
build:tensorflow_testing_rbe_win --config=rbe_win
# END TF REMOTE BUILD EXECUTION OPTIONS
# Default options should come above this line
# Options from ./configure


@ -116,6 +116,8 @@ The TensorFlow project strives to abide by generally accepted best practices in
Build Type | Status | Artifacts
--------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------
**Linux AMD ROCm GPU** Nightly | [![Build Status](http://ml-ci.amd.com:21096/job/tensorflow-rocm-nightly/badge/icon)](http://ml-ci.amd.com:21096/job/tensorflow-rocm-nightly) | [Nightly](http://ml-ci.amd.com:21096/job/tensorflow-rocm-nightly/lastSuccessfulBuild/)
**Linux AMD ROCm GPU** Stable Release | [![Build Status](http://ml-ci.amd.com:21096/job/tensorflow-rocm-release/badge/icon)](http://ml-ci.amd.com:21096/job/tensorflow-rocm-release/) | [Release](http://ml-ci.amd.com:21096/job/tensorflow-rocm-release/lastSuccessfulBuild/)
**Linux s390x** Nightly | [![Build Status](http://ibmz-ci.osuosl.org/job/TensorFlow_IBMZ_CI/badge/icon)](http://ibmz-ci.osuosl.org/job/TensorFlow_IBMZ_CI/) | [Nightly](http://ibmz-ci.osuosl.org/job/TensorFlow_IBMZ_CI/)
**Linux s390x CPU** Stable Release | [![Build Status](http://ibmz-ci.osuosl.org/job/TensorFlow_IBMZ_Release_Build/badge/icon)](https://ibmz-ci.osuosl.org/job/TensorFlow_IBMZ_Release_Build/) | [Release](https://ibmz-ci.osuosl.org/job/TensorFlow_IBMZ_Release_Build/)
**Linux ppc64le CPU** Nightly | [![Build Status](https://powerci.osuosl.org/job/TensorFlow_PPC64LE_CPU_Build/badge/icon)](https://powerci.osuosl.org/job/TensorFlow_PPC64LE_CPU_Build/) | [Nightly](https://powerci.osuosl.org/job/TensorFlow_PPC64LE_CPU_Nightly_Artifact/)

File diff suppressed because one or more lines are too long


@ -1559,9 +1559,6 @@ def main():
if is_windows():
set_windows_build_flags(environ_cp)
# Add a config option to build TensorFlow 2.0 API.
write_to_bazelrc('build:v2 --define=tf_api_version=2')
if get_var(environ_cp, 'TF_SET_ANDROID_WORKSPACE', 'android workspace', False,
('Would you like to interactively configure ./WORKSPACE for '
'Android builds?'), 'Searching for NDK and SDK installations.',


@ -777,8 +777,8 @@ genrule(
mkdir $@
for f in $(SRCS); do
d="$${f%/*}"
d="$${d#bazel-out*genfiles/}"
d="$${d#*external/eigen_archive/}"
d="$${d#bazel-out/*/genfiles/}"
d="$${d#bazel-out/*/bin/}"
if [[ $${d} == *local_config_* ]]; then
continue
@ -790,6 +790,9 @@ genrule(
if [[ $${TF_SYSTEM_LIBS:-} == *$${extname}* ]]; then
continue
fi
d="$${d#*external/farmhash_archive/src}"
d="$${d#*external/$${extname}/}"
fi
mkdir -p "$@/$${d}"
@ -808,8 +811,8 @@ genrule(
}),
outs = ["__init__.py"],
cmd = select({
"api_version_2": "cp $(@D)/_api/v2/v2.py $(OUTS)",
"//conditions:default": "cp $(@D)/_api/v1/v1.py $(OUTS)",
"api_version_2": "cp $(@D)/_api/v2/v2.py $(OUTS) && sed -i'.original' 's:from . import:from . _api.v2 import:g' $(OUTS)",
"//conditions:default": "cp $(@D)/_api/v1/v1.py $(OUTS) && sed -i'.original' 's:from . import:from ._api.v1 import:g' $(OUTS)",
}),
)
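The sed rewrite in the genrule above turns the generated top-level `__init__.py` imports into relative `_api` imports. Its effect can be mimicked in Python with `re.sub` (an illustration only; the build itself runs sed in a shell):

```python
import re

line = "from . import estimator"
# Equivalent of: sed 's:from . import:from ._api.v1 import:g'
# (sed's unescaped "." matches any character; here we escape it.)
rewritten = re.sub(r"from \. import", "from ._api.v1 import", line)
print(rewritten)  # from ._api.v1 import estimator
```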


@ -56,10 +56,10 @@ elif _tf_api_dir not in __path__:
__path__.append(_tf_api_dir)
# Hook external TensorFlow modules.
# Import compat before trying to import summary from tensorboard, so that
# reexport_tf_summary can get compat from sys.modules
_current_module.compat.v2.compat.v1 = _current_module.compat.v1
# reexport_tf_summary can get compat from sys.modules. Only needed if using
# lazy loading.
_current_module.compat.v2 # pylint: disable=pointless-statement
try:
from tensorboard.summary._tf import summary
_current_module.__path__ = (
@@ -78,7 +78,7 @@ except ImportError:
pass
try:
from tensorflow.python.keras.api._v2 import keras
from .python.keras.api._v2 import keras
_current_module.__path__ = (
[_module_util.get_parent_dir(keras)] + _current_module.__path__)
setattr(_current_module, "keras", keras)
@@ -125,25 +125,6 @@ if _running_from_pip_package():
if _fi.file_exists(plugin_dir):
_ll.load_library(plugin_dir)
# These symbols appear because we import the python package which
# in turn imports from tensorflow.core and tensorflow.python. They
# must come from this module. So python adds these symbols for the
# resolution to succeed.
# pylint: disable=undefined-variable
try:
del python
except NameError:
pass
try:
del core
except NameError:
pass
try:
del compiler
except NameError:
pass
# pylint: enable=undefined-variable
# Add module aliases
if hasattr(_current_module, 'keras'):
losses = keras.losses


@@ -60,6 +60,10 @@ elif _tf_api_dir not in __path__:
__path__.append(_tf_api_dir)
# Hook external TensorFlow modules.
# Import compat before trying to import summary from tensorboard, so that
# reexport_tf_summary can get compat from sys.modules. Only needed if using
# lazy loading.
_current_module.compat.v2 # pylint: disable=pointless-statement
try:
from tensorflow_estimator.python.estimator.api._v1 import estimator
_current_module.__path__ = (
@@ -69,7 +73,7 @@ except ImportError:
pass
try:
from tensorflow.python.keras.api._v1 import keras
from .python.keras.api._v1 import keras
_current_module.__path__ = (
[_module_util.get_parent_dir(keras)] + _current_module.__path__)
setattr(_current_module, "keras", keras)
@@ -134,6 +138,10 @@ if _running_from_pip_package():
if _fi.file_exists(plugin_dir):
_ll.load_library(plugin_dir)
# Disable TF2 behavior
from tensorflow.python.compat import v2_compat as _compat # pylint: disable=g-import-not-at-top
_compat.disable_v2_behavior()
# These symbols appear because we import the python package which
# in turn imports from tensorflow.core and tensorflow.python. They
# must come from this module. So python adds these symbols for the
@@ -152,5 +160,4 @@ try:
except NameError:
pass
_current_module.compat.v2.compat.v1 = _current_module.compat.v1
# pylint: enable=undefined-variable


@@ -671,7 +671,7 @@ void TFE_OpAddInputList(TFE_Op* op, TFE_TensorHandle** inputs, int num_inputs,
TF_AttrType TFE_OpGetAttrType(TFE_Op* op, const char* attr_name,
unsigned char* is_list, TF_Status* status) {
TF_AttrType ret;
TF_AttrType ret = TF_ATTR_INT;
status->status = tensorflow::AttrTypeByName(*op->operation.AttrTypes(),
attr_name, &ret, is_list);
return ret;


@@ -83,7 +83,10 @@ void ExecuteWithProfiling(bool async) {
if (!gpu_device_name.empty()) {
EXPECT_TRUE(HasSubstr(profile_proto_str, "/device:GPU:0"));
// device name with "stream:all" is collected by Device Tracer.
#ifndef TENSORFLOW_USE_ROCM
// ROCm platform does not yet support stream level tracing
EXPECT_TRUE(HasSubstr(profile_proto_str, "stream:all"));
#endif
}
// "/host:CPU" is collected by TraceMe
EXPECT_TRUE(HasSubstr(profile_proto_str, "/host:CPU"));


@@ -63,12 +63,26 @@ cat << EOF > tensorflow.pc
prefix=${TF_PREFIX}
exec_prefix=\${prefix}
libdir=\${exec_prefix}/${LIBDIR}
includedir=\${prefix}/include
includedir=\${prefix}/include/tensorflow
Name: TensorFlow
Version: ${TF_VERSION}
Description: Library for computation using data flow graphs for scalable machine learning
Requires:
Libs: -L\${libdir} -ltensorflow
Libs: -L\${libdir} -ltensorflow -ltensorflow_framework
Cflags: -I\${includedir}
EOF
cat << EOF > tensorflow_cc.pc
prefix=${TF_PREFIX}
exec_prefix=\${prefix}
libdir=\${exec_prefix}/${LIBDIR}
includedir=\${prefix}/include/tensorflow
Name: TensorFlow
Version: ${TF_VERSION}
Description: Library for computation using data flow graphs for scalable machine learning
Requires:
Libs: -L\${libdir} -ltensorflow_cc -ltensorflow_framework
Cflags: -I\${includedir}
EOF


@@ -1,8 +1,8 @@
load(
"//tensorflow:tensorflow.bzl",
"tf_cc_test",
"tf_kernel_library",
"tf_gen_op_libs",
"tf_kernel_library",
)
package(


@@ -293,7 +293,9 @@ string ToCamelCase(const string& str) {
bool cap = true;
while (i < str.size()) {
const char c = str[i++];
if (c == joiner) {
if (c == '>') {
cap = true;
} else if (c == joiner) {
cap = true;
} else if (cap) {
result += toupper(c);
@@ -305,6 +307,21 @@ string ToCamelCase(const string& str) {
return result;
}
string SeparateNamespaces(const string& str) {
string result;
const char joiner = '_';
size_t i = 0;
while (i < str.size()) {
const char c = str[i++];
if (c == '>') {
result += joiner;
} else {
result += c;
}
}
return result;
}
// Returns a <string, bool> pair. The string is the C++ type name to be used for
// attr_type when defining an object of that type. The bool is a flag to
// indicate whether to treat the type as const when accepting the C++ type as an
@@ -550,7 +567,7 @@ struct OpInfo {
OpInfo::OpInfo(const OpDef& graph_op_def, const ApiDef& api_def,
const std::vector<string>& aliases)
: graph_op_def(graph_op_def), api_def(api_def), aliases(aliases) {
op_name = api_def.endpoint(0).name();
op_name = SeparateNamespaces(api_def.endpoint(0).name());
InferOpAttributes(graph_op_def, &inferred_input_attrs);
has_optional_attrs = HasOptionalAttrs(api_def, inferred_input_attrs);
arg_types.push_back("const ::tensorflow::Scope&");


@@ -9,6 +9,7 @@ tf_cuda_cc_test(
name = "profiler_test",
srcs = ["profiler_test.cc"],
tags = [
"no_rocm", # stream level tracing not supported on ROCm
"nogpu", # b/77649654
],
deps = [


@@ -19,6 +19,11 @@ limitations under the License.
#include "tensorflow/cc/saved_model/constants.h"
#include "tensorflow/cc/saved_model/reader.h"
#include "tensorflow/core/framework/attr_value.pb.h"
#include "tensorflow/core/framework/function.pb.h"
#include "tensorflow/core/framework/node_def.pb.h"
#include "tensorflow/core/framework/tensor.pb.h"
#include "tensorflow/core/lib/core/errors.h"
#include "tensorflow/core/lib/io/path.h"
#include "tensorflow/core/lib/monitoring/counter.h"
#include "tensorflow/core/lib/monitoring/sampler.h"
@@ -64,12 +69,54 @@ uint64 GetLatencyMicroseconds(const uint64 start_microseconds) {
return end_microseconds - start_microseconds;
}
// Ensure that constant tensors loaded from the saved model have valid shape.
// Also ensure that constant nodes have a value assigned to them.
// TODO(b/154763635): this is temporary and will be replaced with a better audit
static Status ValidateNode(const NodeDef& node) {
const auto node_iterator = node.attr().find("value");
if (node_iterator != node.attr().end()) {
AttrValue node_value = node_iterator->second;
if (node_value.has_tensor()) {
const PartialTensorShape node_shape(node_value.tensor().tensor_shape());
if (node_shape.num_elements() < 0) {
return errors::FailedPrecondition(
"Saved model contains node \"", node.name(), "\" (op \"", node.op(),
"\") which initializes from a tensor with ",
node_shape.num_elements(), " elements");
}
}
} else if (node.op() == "Const") {
return errors::FailedPrecondition(
"Saved model contains node \"", node.name(),
"\" which is a constant tensor but no value has been provided");
}
return Status::OK();
}
static Status ValidateSavedTensors(const GraphDef& graph_def) {
for (const auto& node : graph_def.node()) {
TF_RETURN_IF_ERROR(ValidateNode(node));
}
if (graph_def.has_library()) {
const FunctionDefLibrary& library = graph_def.library();
for (const auto& function : library.function()) {
for (const auto& node : function.node_def()) {
TF_RETURN_IF_ERROR(ValidateNode(node));
}
}
}
return Status::OK();
}
Status LoadMetaGraphIntoSession(const MetaGraphDef& meta_graph_def,
const SessionOptions& session_options,
std::unique_ptr<Session>* session) {
Session* session_p = nullptr;
TF_RETURN_IF_ERROR(NewSession(session_options, &session_p));
session->reset(session_p);
TF_RETURN_IF_ERROR(ValidateSavedTensors(meta_graph_def.graph_def()));
return (*session)->Create(meta_graph_def.graph_def());
}


@@ -1,4 +1,4 @@
load("//tensorflow:tensorflow.bzl", "tf_cc_test", "cc_header_only_library")
load("//tensorflow:tensorflow.bzl", "cc_header_only_library", "tf_cc_test")
load("@local_config_cuda//cuda:build_defs.bzl", "if_cuda")
load("//tensorflow:tensorflow.bzl", "tf_custom_op_py_library", "tf_jit_compilation_passes_extra_deps")
load("//tensorflow/core/platform:default/build_config.bzl", "tf_additional_all_protos", "tf_proto_library")


@@ -693,7 +693,8 @@ class EagerFunctionTest(xla_test.XLATestCase):
return x, y
wholly_compiled_f = def_function.function(f)
op_by_op_f = def_function.function(f, experimental_compile=False)
op_by_op_f = function.defun_with_attributes(
f, attributes={'_XlaCompile': False})
x = constant_op.constant([0.0, 2.0], name='data')


@@ -776,7 +776,9 @@ class TRT_TensorOrWeights::SimpleITensor : public nvinfer1::ITensor {
nvinfer1::TensorFormats getAllowedFormats() const override { return 1; }
bool isShape() const override { return false; }
bool isShapeTensor() const override { return false; }
bool isExecutionTensor() const override { return true; }
#endif
private:
@@ -5191,7 +5193,11 @@ Status ConvertGraphDefToEngine(
}
// Build the network
VLOG(1) << "Starting engine conversion ";
if (VLOG_IS_ON(1)) {
string mode_str;
TF_RETURN_IF_ERROR(TrtPrecisionModeToName(precision_mode, &mode_str));
VLOG(1) << "Starting engine conversion, precision mode: " << mode_str;
}
Converter converter(trt_network.get(), precision_mode, use_calibration);
std::vector<Converter::EngineOutputInfo> output_tensors;
// Graph nodes are already topologically sorted during construction


@@ -48,12 +48,10 @@ class GetCalibrationDataOp : public OpKernel {
&resource));
core::ScopedUnref sc(resource);
auto* calib_ctx = resource->calib_ctx_.get();
// Serialize the resource as output.
string serialized_resource;
OP_REQUIRES_OK(context, calib_ctx->SerializeToString(&serialized_resource));
resource->calib_ctx_.reset();
string serialized_resource = resource->calib_ctx_->TerminateCalibration();
OP_REQUIRES(context, !serialized_resource.empty(),
errors::Unknown("Calibration table is empty."));
Tensor* output = nullptr;
OP_REQUIRES_OK(context,


@@ -392,9 +392,8 @@ Status TRTEngineOp::VerifyInputShapes(const std::vector<TensorShape>& shapes) {
return Status::OK();
}
Status TRTEngineOp::GetEngineInputShapes(
const CacheType& cache, const std::vector<TensorShape>& actual_input_shapes,
std::vector<TensorShape>* engine_input_shapes) {
bool AreShapesCompatible(const std::vector<TensorShape>& actual_shapes,
const std::vector<TensorShape>& cached_shapes) {
auto match_shape = [](const TensorShape& actual_shape,
const TensorShape& cached_shape) {
// Match the rank.
@@ -407,16 +406,17 @@ Status TRTEngineOp::GetEngineInputShapes(
}
return true;
};
auto match_shapes = [&](const std::vector<TensorShape>& actual_shapes,
const std::vector<TensorShape>& cached_shapes) {
for (int i = 0; i < actual_shapes.size(); ++i) {
if (!match_shape(actual_shapes[i], cached_shapes[i])) {
return false;
}
for (int i = 0; i < actual_shapes.size(); ++i) {
if (!match_shape(actual_shapes[i], cached_shapes[i])) {
return false;
}
return true;
};
}
return true;
}
Status TRTEngineOp::GetEngineInputShapes(
const CacheType& cache, const std::vector<TensorShape>& actual_input_shapes,
std::vector<TensorShape>* engine_input_shapes) {
// VerifyInputShapes() already ensured that all input shapes have same
// batch size, and are not scalars.
*engine_input_shapes = actual_input_shapes;
@@ -430,7 +430,7 @@ Status TRTEngineOp::GetEngineInputShapes(
", cached size: ", cached_input_shapes.size(),
" vs. actual size: ", actual_input_shapes.size());
}
if (match_shapes(actual_input_shapes, cached_input_shapes)) {
if (AreShapesCompatible(actual_input_shapes, cached_input_shapes)) {
const int cached_batch_size = cached_input_shapes[0].dim_size(0);
if (min_matched_batch_size > cached_batch_size) {
min_matched_batch_size = cached_batch_size;
@@ -668,7 +668,8 @@ StatusOr<EngineContext*> TRTEngineOp::GetEngine(
static EngineContext empty_context;
mutex_lock lock(engine_mutex_);
// TODO(tmorris): using first input to get batch size - is this reliable?
// Using first input to get batch size is reliable - VerifyInputShapes() has
// verified that.
const int batch_size = input_shapes[0].dim_size(0);
auto& cache = cache_res->cache_;
auto allocator = cache_res->allocator_.get();
@@ -678,14 +679,9 @@ StatusOr<EngineContext*> TRTEngineOp::GetEngine(
// Handle the static engine case. For static engines, the cache will have a
// single element containing the only engine.
//
// TODO(laigd): This is legacy mode for TF v1.x, need to remove when all known
// users switch to 2.0.
if (static_engine_) {
if (cache.size()) {
// Batch size of engine must be >= the input batch size
// TODO(tmorris): use match compatible function?
if (cache.begin()->first[0].dim_size(0) >= batch_size) {
if (AreShapesCompatible(input_shapes, cache.begin()->first)) {
return cache.begin()->second.get();
}
return &empty_context;
@@ -724,9 +720,7 @@ StatusOr<EngineContext*> TRTEngineOp::GetEngine(
return cache.at(engine_input_shapes).get();
} // static_engine_
// Handle the dynamic engine case.
// See if there is a compatible engine cached. The batch size should be <= the
// cached batch size.
// Handle the dynamic engine case. See if there is a compatible engine cached.
std::vector<TensorShape> engine_input_shapes;
TF_RETURN_IF_ERROR(
GetEngineInputShapes(cache, input_shapes, &engine_input_shapes));
@@ -843,19 +837,18 @@ Status TRTEngineOp::AllocateCalibrationResources(
if (!s.ok()) {
LOG(ERROR) << "Calibration failed: " << s;
cres->calibrator_->setDone(); // Ignore further pushes
} else {
// Transfer the ownership of the engine to the engine cache, so we can
// dump it out during conversion for TF 2.0.
mutex_lock lock(this->engine_mutex_);
this->calibrator_ = std::move(cres->calibrator_);
TrtUniquePtrType<nvinfer1::IExecutionContext> exec_context(
cres->engine_->createExecutionContext());
cache_res->cache_.emplace(
shapes, absl::make_unique<EngineContext>(std::move(cres->engine_),
std::move(exec_context)));
}
// Transfer the ownership of the engine to the engine cache, so we can
// dump it out during conversion for TF 2.0.
mutex_lock lock(this->engine_mutex_);
cres->SetCalibrationTable();
this->calibrator_ = std::move(cres->calibrator_);
TrtUniquePtrType<nvinfer1::IExecutionContext> exec_context(
cres->engine_->createExecutionContext());
cache_res->cache_.emplace(
shapes, absl::make_unique<EngineContext>(std::move(cres->engine_),
std::move(exec_context)));
VLOG(1) << "Calibration loop terminated " << this->name();
}));
VLOG(1) << "initialized calibrator resource";


@@ -184,13 +184,7 @@ class SerializeTRTResource : public OpKernel {
core::ScopedUnref unref_me(resource);
// Terminate the calibration if any.
if (resource->calib_ctx_) {
// We don't save the calibration_table for TF 2.0 at the moment, it's used
// in 1.x environment.
string calibration_table;
OP_REQUIRES_OK(
ctx, resource->calib_ctx_->SerializeToString(&calibration_table));
}
if (resource->calib_ctx_) resource->calib_ctx_->TerminateCalibration();
// Serialize the engines and write them to file.
std::unique_ptr<WritableFile> file;


@@ -30,6 +30,26 @@ limitations under the License.
namespace tensorflow {
namespace tensorrt {
string CalibrationContext::TerminateCalibration() {
mutex_lock l(mu_);
if (terminated_) return calibration_table_;
TRTInt8Calibrator* raw_calibrator = calibrator_.get();
raw_calibrator->waitAndSetDone();
terminated_ = true;
// At this point the calibration thread `thr_` is woken up and can
// transfer the ownership of `calibrator_` and `engine_` at any time, so
// it's not safe to use `calibrator_` below, but we can still access it
// using raw pointer.
// TODO(laigd): make TRTEngineOp::AllocateCalibrationResources() a member
// function of this class instead.
thr_->join();
calibration_table_ = raw_calibrator->getCalibrationTableAsString();
return calibration_table_;
}
const absl::string_view kTfTrtContainerName = "TF-TRT";
Logger& TRTEngineCacheResource::GetLogger() {


@@ -142,19 +142,7 @@ struct EngineContext {
// Contains the context required to build the calibration data.
class CalibrationContext {
public:
void SetCalibrationTable() {
calibration_table_ = calibrator_->getCalibrationTableAsString();
}
Status SerializeToString(string* serialized) {
calibrator_->waitAndSetDone();
thr_->join();
*serialized = calibration_table_;
if (serialized->empty()) {
return errors::Unknown("Calibration table is empty.");
}
return Status::OK();
}
string TerminateCalibration();
// Lookup table for temporary staging areas of input tensors for calibration.
std::unordered_map<string, std::pair<void*, size_t>> device_buffers_;
@@ -162,12 +150,16 @@ class CalibrationContext {
// Temporary staging areas for calibration inputs.
std::vector<PersistentTensor> device_tensors_;
string calibration_table_;
std::unique_ptr<TRTInt8Calibrator> calibrator_;
TrtUniquePtrType<nvinfer1::IBuilder> builder_;
TrtUniquePtrType<nvinfer1::ICudaEngine> engine_;
// TODO(sami): Use threadpool threads!
std::unique_ptr<std::thread> thr_;
private:
mutex mu_;
bool terminated_ GUARDED_BY(mu_) = false;
std::string calibration_table_ GUARDED_BY(mu_);
};
ABSL_CONST_INIT extern const absl::string_view kTfTrtContainerName;


@@ -1,4 +1,4 @@
load("//tensorflow:tensorflow.bzl", "tf_cc_test", "cc_header_only_library")
load("//tensorflow:tensorflow.bzl", "cc_header_only_library", "tf_cc_test")
load("//tensorflow/compiler/xla:xla.bzl", "xla_proto_library")
load(
"//tensorflow/core/platform:default/build_config.bzl",


@@ -6,7 +6,7 @@ load(
"//third_party/mkl:build_defs.bzl",
"mkl_deps",
)
load("//tensorflow:tensorflow.bzl", "tf_cc_binary", "tf_cc_test")
load("//tensorflow:tensorflow.bzl", "tf_cc_binary", "tf_cc_test", "tf_openmp_copts")
load(":build_defs.bzl", "runtime_copts")
package(
@@ -560,7 +560,7 @@ cc_library(
"runtime_conv2d_mkl.cc",
],
hdrs = ["runtime_conv2d_mkl.h"],
copts = runtime_copts(),
copts = runtime_copts() + tf_openmp_copts(),
visibility = ["//visibility:public"],
deps = [
":runtime_conv2d",


@@ -232,8 +232,11 @@ tensorflow/core/kernels/scatter_nd_op_cpu_impl_5.cc
tensorflow/core/kernels/scatter_nd_op_cpu_impl_6.cc
tensorflow/core/kernels/scatter_nd_op_cpu_impl_7.cc
tensorflow/core/kernels/scatter_op.cc
tensorflow/core/kernels/segment_reduction_ops.cc
tensorflow/core/kernels/segment_reduction_ops.cc
tensorflow/core/kernels/segment_reduction_ops_impl_1.cc
tensorflow/core/kernels/segment_reduction_ops_impl_2.cc
tensorflow/core/kernels/segment_reduction_ops_impl_3.cc
tensorflow/core/kernels/segment_reduction_ops_impl_4.cc
tensorflow/core/kernels/segment_reduction_ops_impl_5.cc
tensorflow/core/kernels/sendrecv_ops.cc
tensorflow/core/kernels/sequence_ops.cc
tensorflow/core/kernels/session_ops.cc


@@ -443,11 +443,9 @@ class Image(ItemHandler):
"""Decodes a raw image."""
return parsing_ops.decode_raw(image_buffer, out_type=self._dtype)
pred_fn_pairs = {
math_ops.logical_or(
math_ops.equal(image_format, 'raw'),
math_ops.equal(image_format, 'RAW')): decode_raw,
}
pred_fn_pairs = [(math_ops.logical_or(
math_ops.equal(image_format, 'raw'),
math_ops.equal(image_format, 'RAW')), decode_raw)]
image = control_flow_ops.case(
pred_fn_pairs, default=check_jpeg, exclusive=True)


@@ -86,6 +86,7 @@ load(
"tf_gen_op_libs",
"tf_generate_proto_text_sources",
"tf_genrule_cmd_append_to_srcs",
"tf_openmp_copts",
"tf_opts_nortti_if_android",
"tf_opts_nortti_if_emscripten",
"transitive_hdrs",
@@ -3263,7 +3264,7 @@ tf_cuda_library(
"public/version.h",
],
hdrs = CORE_CPU_LIB_HEADERS,
copts = tf_copts(),
copts = tf_copts() + tf_openmp_copts(),
deps = [
":bfc_allocator",
":graph",
@@ -4604,7 +4605,7 @@ tf_cc_test(
size = "small",
srcs = ["common_runtime/constant_folding_test.cc"],
linkstatic = tf_kernel_tests_linkstatic(),
tags = tf_cuda_tests_tags(),
tags = tf_cuda_tests_tags() + ["no_rocm"],
deps = [
":core",
":core_cpu",
@@ -4670,6 +4671,7 @@ tf_cuda_cc_test(
size = "small",
srcs = ["common_runtime/process_function_library_runtime_test.cc"],
linkstatic = tf_kernel_tests_linkstatic(),
tags = ["no_rocm"],
deps = [
":core_cpu",
":core_cpu_internal",


@@ -8,9 +8,9 @@ A variant tensor representing the input dataset.
END
}
in_arg {
name: "num_workers"
name: "num_replicas"
description: <<END
A scalar representing the number of workers to distribute this batch across. As
A scalar representing the number of replicas to distribute this batch across. As
a result of this transformation the current batch size would end up being
divided by this parameter.
END
@@ -18,6 +18,6 @@
summary: "Creates a dataset that changes the batch size."
description: <<END
Creates a dataset that changes the batch size of the dataset to current batch
size // num_workers.
size // num_replicas.
END
}


@@ -8,9 +8,9 @@ A variant tensor representing the input dataset.
END
}
in_arg {
name: "num_workers"
name: "num_replicas"
description: <<END
A scalar representing the number of workers to distribute this batch across. As
A scalar representing the number of replicas to distribute this batch across. As
a result of this transformation the current batch size would end up being
divided by this parameter.
END


@@ -1,9 +1,4 @@
op {
graph_op_name: "Equal"
endpoint {
name: "math.equal"
}
endpoint {
name: "equal"
}
visibility: HIDDEN
}


@@ -0,0 +1,4 @@
op {
graph_op_name: "Fill"
visibility: HIDDEN
}


@@ -1,9 +1,4 @@
op {
graph_op_name: "NotEqual"
endpoint {
name: "math.not_equal"
}
endpoint {
name: "not_equal"
}
visibility: HIDDEN
}


@@ -59,13 +59,13 @@ namespace {
const char* GetCollectiveName(const CollectiveParams* cp, bool nccl) {
switch (cp->instance.type) {
case BROADCAST_COLLECTIVE:
return nccl ? "NcclBroadcast" : "HierarchicalTreeBroadcast";
return "HierarchicalTreeBroadcast";
case REDUCTION_COLLECTIVE:
return nccl ? "NcclReduce" : "RingReduce";
case GATHER_COLLECTIVE:
return nccl ? "NcclGather" : "RingGather";
return "RingGather";
default:
return "undef";
@@ -91,8 +91,16 @@ void CollectiveParamResolverLocal::CompleteGroupLocal(
// Initialize group runtime details.
CollectiveImplementationInterface* col_impl;
status = CollectiveRegistry::LookupParamResolverInstance(
GetCollectiveName(cp, nccl_), &col_impl);
// Try to lookup a NCCL collective kernel. This will return error status
// if `NcclReduce` kernel is not present in the registry, e.g. on an
// environment that does not support NCCL.
status = CollectiveRegistry::LookupParamResolverInstance("NcclReduce",
&col_impl);
if (!status.ok()) {
// Fallback to non-NCCL collective.
status = CollectiveRegistry::LookupParamResolverInstance(
GetCollectiveName(cp, /*nccl=*/false), &col_impl);
}
if (status.ok()) {
status = col_impl->InitializeCollectiveGroupRuntimeDetails(
&gr->group.runtime_details);


@@ -51,9 +51,11 @@ limitations under the License.
#include "tensorflow/core/public/session_options.h"
#include "tensorflow/core/util/device_name_utils.h"
#ifdef GOOGLE_CUDA
#if GOOGLE_CUDA
#include "third_party/gpus/cuda/include/cuda.h"
#include "third_party/gpus/cuda/include/cuda_runtime_api.h"
#elif TENSORFLOW_USE_ROCM
#include "rocm/include/hip/hip_runtime.h"
#endif // GOOGLE_CUDA
namespace tensorflow {
@@ -2089,6 +2091,12 @@ bool IsCUDATensor(const Tensor& t) {
if (err == cudaErrorInvalidValue) return false;
CHECK_EQ(cudaSuccess, err) << cudaGetErrorString(err);
return (attributes.memoryType == cudaMemoryTypeDevice);
#elif TENSORFLOW_USE_ROCM
hipPointerAttribute_t attributes;
hipError_t err = hipPointerGetAttributes(&attributes, t.tensor_data().data());
if (err == hipErrorInvalidValue) return false;
CHECK_EQ(hipSuccess, err) << hipGetErrorString(err);
return (attributes.memoryType == hipMemoryTypeDevice);
#else
return false;
#endif


@@ -1,6 +1,7 @@
load(
"//tensorflow:tensorflow.bzl",
"tf_cc_test",
"tf_copts",
"tf_cuda_library",
)
load(
@@ -276,6 +277,8 @@ cc_library(
cc_library(
name = "mkl_eager_op_rewrite",
srcs = ["mkl_eager_op_rewrite.cc"],
copts = tf_copts(),
nocopts = "-fno-exceptions",
deps = [
":eager_op_rewrite_registry",
"//tensorflow/core:framework",


@@ -486,10 +486,6 @@ Status EagerLocalExecute(EagerOperation* op, TensorHandle** retvals,
IsMultiDevice(ctx->FindFunctionDef(op->Name()));
std::vector<Device*> input_dev_ptrs;
// `input_tensor_shapes` contains (potentially a subset of) non DT_RESOURCE
// arguments, and `input_resource_variable_dtypes_and_shapes` contains shapes
// and underlying types for (potentially a subset) of DT_RESOURCE arguments.
std::unordered_map<int, TensorShape> input_tensor_shapes;
std::unordered_map<int, DtypeAndPartialTensorShape>
input_resource_variable_dtypes_and_shapes;
if (is_multi_device_function) {
@@ -524,19 +520,9 @@ Status EagerLocalExecute(EagerOperation* op, TensorHandle** retvals,
cache_key =
FingerprintCat128(cache_key, Fingerprint128(input_device->name()));
// If input is normal tensor, get its shape and add it to 'cache_key';
// If input is a ResourceHandle, get its resource handle dtypes and shapes
// and add them to 'cache_key'.
if (input->dtype != DT_RESOURCE) {
TensorShape shape;
TF_RETURN_IF_ERROR(input->Shape(&shape));
input_tensor_shapes[i] = shape;
// Add both _Arg index and shape to "cache_key".
cache_key = FingerprintCat128(cache_key, i);
AppendTensorShapeToFingerprint(shape, &cache_key);
} else {
if (input->dtype == DT_RESOURCE) {
// We only care about data type and shape for resource variable inputs.
// But we have no way to tell if input is resource variable (other than
// looking it up in ResourceMgr, which is slow). So we just get
@@ -616,7 +602,6 @@ Status EagerLocalExecute(EagerOperation* op, TensorHandle** retvals,
<< ". Full node_def=" << ndef.DebugString();
kernel.reset(new KernelAndDeviceFunc(
flr, ctx->pflr(), std::move(input_dev_ptrs),
std::move(input_tensor_shapes),
std::move(input_resource_variable_dtypes_and_shapes), runner,
ctx->GetCollectiveExecutorHandle(), ctx->HostCPU(), op->Name(),
[ctx](const int64 step_id) {


@@ -112,7 +112,6 @@ Status KernelAndDeviceFunc::Init(const NodeDef& ndef,
for (const Device* device : input_devices_) {
options.input_devices.push_back(device->name());
}
options.input_tensor_shapes = input_tensor_shapes_;
options.input_resource_dtypes_and_shapes = input_resource_dtypes_and_shapes_;
const auto& it = ndef.attr().find("executor_type");
@@ -337,7 +336,12 @@ Status KernelAndDeviceOp::Run(ScopedStepContainer* step_container,
if (outputs != nullptr) {
outputs->clear();
for (int i = 0; i < context.num_outputs(); ++i) {
outputs->push_back(Tensor(*context.mutable_output(i)));
const auto* output_tensor = context.mutable_output(i);
if (output_tensor != nullptr) {
outputs->push_back(Tensor(*output_tensor));
} else {
outputs->push_back(Tensor());
}
}
}
if (stats != nullptr) {


@@ -185,7 +185,6 @@ class KernelAndDeviceFunc final : public KernelAndDevice {
KernelAndDeviceFunc(
FunctionLibraryRuntime* flr, ProcessFunctionLibraryRuntime* pflr,
std::vector<Device*> input_devices,
std::unordered_map<int, TensorShape> input_tensor_shapes,
std::unordered_map<int, DtypeAndPartialTensorShape>
input_resource_dtypes_and_shapes,
std::function<void(std::function<void()>)>* runner,
@@ -197,7 +196,6 @@ class KernelAndDeviceFunc final : public KernelAndDevice {
pflr_(pflr),
handle_(kInvalidHandle),
input_devices_(std::move(input_devices)),
input_tensor_shapes_(std::move(input_tensor_shapes)),
input_resource_dtypes_and_shapes_(
std::move(input_resource_dtypes_and_shapes)),
name_(name),
@@ -240,7 +238,6 @@ class KernelAndDeviceFunc final : public KernelAndDevice {
// CPU devices are not null. Resource handles' devices are actual backing
// devices.
std::vector<Device*> input_devices_;
std::unordered_map<int, TensorShape> input_tensor_shapes_;
std::unordered_map<int, DtypeAndPartialTensorShape>
input_resource_dtypes_and_shapes_;


@@ -45,6 +45,11 @@ class MklEagerOpRewrite : public EagerOpRewrite {
static Status SetupNewOp(EagerOperation* orig_op, const string mkl_op_name,
std::unique_ptr<EagerOperation>* new_mkl_op);
// Generic rewrite that can be used for any mkl op that doesn't need
// special processing.
static Status CreateGenericMklOp(EagerOperation* orig_op,
std::unique_ptr<EagerOperation>* mkl_op);
// Creates new MKL op for Conv2D, Conv2DBackpropInput and
// Conv2DBackpropFilter.
static Status CreateMklConv2DOp(
@@ -60,6 +65,10 @@ class MklEagerOpRewrite : public EagerOpRewrite {
// Checks whether we can rewrite the op to MKL one or not.
bool ShouldRewriteOp(EagerOperation* op, int* op_idx);
// Default rewrite rule to be used when rewrite should happen without any
// restriction.
static bool AlwaysRewrite(EagerOperation* op) { return true; }
};
REGISTER_REWRITE(EagerOpRewriteRegistry::PRE_EXECUTION, MklEagerOpRewrite);
@@ -67,11 +76,15 @@ REGISTER_REWRITE(EagerOpRewriteRegistry::PRE_EXECUTION, MklEagerOpRewrite);
// Constructor
MklEagerOpRewrite::MklEagerOpRewrite(string name, string file, string line)
: EagerOpRewrite(name, file, line) {
mkl_eager_ops_.push_back({"BatchMatMul", AlwaysRewrite, CreateGenericMklOp});
mkl_eager_ops_.push_back(
{"BatchMatMulV2", AlwaysRewrite, CreateGenericMklOp});
mkl_eager_ops_.push_back({"Conv2D", RewriteConv2D, CreateMklConv2DOp});
mkl_eager_ops_.push_back(
{"Conv2DBackpropInput", RewriteConv2D, CreateMklConv2DOp});
mkl_eager_ops_.push_back(
{"Conv2DBackpropFilter", RewriteConv2D, CreateMklConv2DOp});
mkl_eager_ops_.push_back({"MatMul", AlwaysRewrite, CreateGenericMklOp});
}
Status MklEagerOpRewrite::Run(
@@ -124,6 +137,13 @@ Status MklEagerOpRewrite::SetupNewOp(
return Status::OK();
}
Status MklEagerOpRewrite::CreateGenericMklOp(
EagerOperation* orig_op, std::unique_ptr<EagerOperation>* mkl_op) {
const string mkl_op_name = mkl_op_registry::GetMklOpName(orig_op->Name());
TF_CHECK_OK(SetupNewOp(orig_op, mkl_op_name, mkl_op));
return Status::OK();
}
Status MklEagerOpRewrite::CreateMklConv2DOp(
EagerOperation* orig_op, std::unique_ptr<EagerOperation>* mkl_conv2d_op) {
const string mkl_op_name =


@@ -311,7 +311,6 @@ const string* AssignedOrRequestedDeviceName(const Node& node) {
}
Status SetArgShape(
const std::unordered_map<int, TensorShape>& input_tensor_shapes,
const std::unordered_map<int, DtypeAndPartialTensorShape>&
input_resource_dtypes_and_shapes,
const std::vector<Node*>& arg_nodes) {
@@ -320,16 +319,7 @@ Status SetArgShape(
TF_RETURN_IF_ERROR(GetNodeAttr(n->def(), "index", &index));
DataType dtype;
TF_RETURN_IF_ERROR(GetNodeAttr(n->def(), "T", &dtype));
if (dtype != DT_RESOURCE) {
auto shape_iter = input_tensor_shapes.find(index);
if (shape_iter != input_tensor_shapes.end()) {
TensorShapeProto shape_proto;
shape_iter->second.AsProto(&shape_proto);
AttrValue attr_value;
*attr_value.mutable_list()->add_shape() = shape_proto;
n->AddAttr("_output_shapes", attr_value);
}
} else {
if (dtype == DT_RESOURCE) {
auto dtype_and_shape_iter = input_resource_dtypes_and_shapes.find(index);
if (dtype_and_shape_iter != input_resource_dtypes_and_shapes.end()) {
AttrValue dtype_attr_value;
@@ -620,9 +610,8 @@ Status ProcessFunctionLibraryRuntime::InstantiateMultiDevice(
options.graph_collector->CollectRawGraph(def);
}
TF_RETURN_IF_ERROR(SetArgShape(options.input_tensor_shapes,
options.input_resource_dtypes_and_shapes,
arg_nodes));
TF_RETURN_IF_ERROR(
SetArgShape(options.input_resource_dtypes_and_shapes, arg_nodes));
TF_RETURN_IF_ERROR(PinArgsAndRets(options.input_devices,
options.output_devices, device_set_,
arg_nodes, ret_nodes));


@@ -33,9 +33,11 @@ limitations under the License.
#include "tensorflow/core/public/session_options.h"
#include "tensorflow/core/public/version.h"
#ifdef GOOGLE_CUDA
#if GOOGLE_CUDA
#include "third_party/gpus/cuda/include/cuda.h"
#include "third_party/gpus/cuda/include/cuda_runtime_api.h"
#elif TENSORFLOW_USE_ROCM
#include "rocm/include/hip/hip_runtime.h"
#endif // GOOGLE_CUDA
namespace tensorflow {
@ -122,7 +124,7 @@ class ProcessFunctionLibraryRuntimeTest : public ::testing::Test {
}
Tensor GPUToCPU(const Tensor& device_tensor) {
#ifdef GOOGLE_CUDA
#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
CHECK(gpu_device_);
CHECK(gpu_device_->tensorflow_gpu_device_info() != nullptr);
DeviceContext* device_context =
@ -146,7 +148,7 @@ class ProcessFunctionLibraryRuntimeTest : public ::testing::Test {
}
Tensor CPUToGPU(const Tensor& cpu_tensor) {
#ifdef GOOGLE_CUDA
#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
CHECK(gpu_device_);
CHECK(gpu_device_->tensorflow_gpu_device_info() != nullptr);
DeviceContext* device_context =
@ -461,6 +463,12 @@ bool IsCUDATensor(const Tensor& t) {
if (err == cudaErrorInvalidValue) return false;
CHECK_EQ(cudaSuccess, err) << cudaGetErrorString(err);
return (attributes.memoryType == cudaMemoryTypeDevice);
#elif TENSORFLOW_USE_ROCM
hipPointerAttribute_t attributes;
hipError_t err = hipPointerGetAttributes(&attributes, t.tensor_data().data());
if (err == hipErrorInvalidValue) return false;
CHECK_EQ(hipSuccess, err) << hipGetErrorString(err);
return (attributes.memoryType == hipMemoryTypeDevice);
#else
CHECK(false)
<< "IsCUDATensor should not be called when CUDA is not available";


@ -561,6 +561,13 @@ Status ShapeRefiner::ConstantPartialShape(InferenceContext* target_context,
} else if (src_op == "StridedSlice") {
TF_RETURN_IF_ERROR(
PartialStridedSliceShape(input_edge->src(), src_context, result));
} else if (src_op == "VariableShape") {
auto* handle_data = src_context->input_handle_shapes_and_types(0);
if (handle_data != nullptr && !handle_data->empty()) {
*result = handle_data->at(0).shape;
} else {
*result = target_context->UnknownShape();
}
} else {
Tensor t;
bool evaluated = false;


@ -366,7 +366,8 @@ Status EinsumShape(shape_inference::InferenceContext* c) {
output_bcast_shape = input_bcast_shapes[0];
} else if (input_bcast_shapes.size() == 2) {
TF_RETURN_IF_ERROR(BroadcastBinaryOpOutputShapeFnHelper(
c, input_bcast_shapes[0], input_bcast_shapes[1], &output_bcast_shape));
c, input_bcast_shapes[0], input_bcast_shapes[1], true,
&output_bcast_shape));
}
bool output_has_ellipsis = false;
@ -441,7 +442,7 @@ Status BatchMatMulV2Shape(shape_inference::InferenceContext* c) {
TF_RETURN_IF_ERROR(c->Subshape(b_shape, 0, -2, &b_batch_shape));
TF_RETURN_IF_ERROR(BroadcastBinaryOpOutputShapeFnHelper(
c, a_batch_shape, b_batch_shape, &output_batch_shape));
c, a_batch_shape, b_batch_shape, true, &output_batch_shape));
ShapeHandle output_shape;
TF_RETURN_IF_ERROR(c->Concatenate(
@ -1613,6 +1614,7 @@ Status QuantizedConcatV2Shape(InferenceContext* c, int num_inputs_to_concat) {
Status BroadcastBinaryOpOutputShapeFnHelper(InferenceContext* c,
ShapeHandle shape_x,
ShapeHandle shape_y,
bool incompatible_shape_error,
ShapeHandle* out) {
CHECK_NOTNULL(out);
if (!c->RankKnown(shape_x) || !c->RankKnown(shape_y)) {
@ -1646,8 +1648,16 @@ Status BroadcastBinaryOpOutputShapeFnHelper(InferenceContext* c,
// or the same as the known dim.
// - If either dimension is 1, the other dimension is the output.
if (c->Value(dim_x) > 1) {
if (!incompatible_shape_error) {
*out = c->UnknownShape();
return Status::OK();
}
dims.push_back(dim_x);
} else if (c->Value(dim_y) > 1) {
if (!incompatible_shape_error) {
*out = c->UnknownShape();
return Status::OK();
}
dims.push_back(dim_y);
} else if (c->Value(dim_x) == 1) {
dims.push_back(dim_y);
@ -1656,6 +1666,10 @@ Status BroadcastBinaryOpOutputShapeFnHelper(InferenceContext* c,
} else if (dim_y.SameHandle(dim_x)) {
dims.push_back(dim_x);
} else {
if (!incompatible_shape_error) {
*out = c->UnknownShape();
return Status::OK();
}
dims.push_back(c->UnknownDim());
}
} else if (c->Value(dim_x) == 1 || c->Value(dim_y) == 1) {
@ -1669,7 +1683,14 @@ Status BroadcastBinaryOpOutputShapeFnHelper(InferenceContext* c,
}
} else {
DimensionHandle dim;
TF_RETURN_IF_ERROR(c->Merge(dim_x, dim_y, &dim));
Status s = c->Merge(dim_x, dim_y, &dim);
if (!s.ok()) {
if (!incompatible_shape_error) {
*out = c->MakeShape({});
return Status::OK();
}
return s;
}
dims.push_back(dim);
}
}
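The hunk above threads a new `incompatible_shape_error` flag through the broadcast shape helper: when the flag is false, a dimension mismatch yields an unknown output shape instead of an error. A minimal standalone sketch of that behavior, with `-1` standing in for an unknown dimension and equal ranks assumed (the real helper also pads the shorter shape and handles unknown dims more finely):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical simplification of BroadcastBinaryOpOutputShapeFnHelper.
// Returns false on a hard mismatch when incompatible_shape_error is true;
// when it is false, a mismatch produces an all-unknown shape and succeeds.
bool BroadcastDims(const std::vector<long long>& x,
                   const std::vector<long long>& y,
                   bool incompatible_shape_error,
                   std::vector<long long>* out) {
  out->clear();
  for (std::size_t i = 0; i < x.size(); ++i) {
    long long dx = x[i], dy = y[i];
    if (dx == dy || dy == 1) { out->push_back(dx); continue; }
    if (dx == 1) { out->push_back(dy); continue; }
    if (dx == -1 || dy == -1) { out->push_back(-1); continue; }
    // Both dims known, unequal, and neither is 1: incompatible.
    if (!incompatible_shape_error) {
      out->assign(x.size(), -1);  // Report "unknown shape", not an error.
      return true;
    }
    return false;
  }
  return true;
}
```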


@ -306,6 +306,7 @@ Status QuantizedConcatV2Shape(InferenceContext* c, int num_inputs_to_concat);
Status BroadcastBinaryOpOutputShapeFnHelper(InferenceContext* c,
ShapeHandle shape_x,
ShapeHandle shape_y,
bool incompatible_shape_error,
ShapeHandle* out);
// Shape function for binary operators that broadcast their inputs
@ -313,8 +314,8 @@ Status BroadcastBinaryOpOutputShapeFnHelper(InferenceContext* c,
inline Status BroadcastBinaryOpOutputShapeFn(InferenceContext* c,
int output_index) {
ShapeHandle out;
TF_RETURN_IF_ERROR(
BroadcastBinaryOpOutputShapeFnHelper(c, c->input(0), c->input(1), &out));
TF_RETURN_IF_ERROR(BroadcastBinaryOpOutputShapeFnHelper(
c, c->input(0), c->input(1), true, &out));
c->set_output(output_index, out);
return Status::OK();
}


@ -921,11 +921,6 @@ string Canonicalize(const string& funcname, AttrSlice attrs,
entries.push_back(strings::StrCat(
"_output_dev", i, "=", absl::CEscape(options.output_devices[i])));
}
for (const auto& iter : options.input_tensor_shapes) {
entries.push_back(
strings::StrCat("_input_tensor_shape", iter.first, "=",
absl::CEscape(iter.second.DebugString())));
}
for (const auto& iter : options.input_resource_dtypes_and_shapes) {
entries.push_back(strings::StrCat("_input_resource_dtype", iter.first, "=",
DataTypeString(iter.second.dtype)));


@ -563,14 +563,6 @@ class FunctionLibraryRuntime {
// infer correct device.
std::vector<string> output_devices;
// This interface is EXPERIMENTAL and subject to change.
//
// For multi-device functions, a mapping from _Arg node index to input
// tensor shape.
// REQUIRES: if input_tensor_shapes.count(i) > 0 then i-th argument type
// must not be DT_RESOURCE.
std::unordered_map<int, TensorShape> input_tensor_shapes;
// This interface is EXPERIMENTAL and subject to change.
//
// For multi-device functions, a mapping from _Arg node index to type and


@ -35,6 +35,17 @@ Status FinalizeOpDef(const OpDefBuilder& b, OpDef* op_def) {
return s;
}
// We can create a Graph containing a namespaced Op
TEST(AddToGraphTest, MakeGraphDefWithNamespacedOpName) {
OpList op_list;
TF_ASSERT_OK(FinalizeOpDef(OpDefBuilder("Project>SomeOp"), op_list.add_op()));
OpListOpRegistry registry(&op_list);
GraphDef graph_def;
TF_ASSERT_OK(NodeDefBuilder("node", "Project>SomeOp", &registry)
.Finalize(graph_def.add_node()));
}
// Producer and consumer have default for an attr -> graph unchanged.
TEST(RemoveNewDefaultAttrsFromGraphDefTest, NoChangeWithDefault) {
OpList op_list;


@ -11,7 +11,7 @@ import "tensorflow/core/framework/attr_value.proto";
message NodeDef {
// The name given to this operator. Used for naming inputs,
// logging, visualization, etc. Unique within a single GraphDef.
// Must match the regexp "[A-Za-z0-9.][A-Za-z0-9_./]*".
// Must match the regexp "[A-Za-z0-9.][A-Za-z0-9_>./]*".
string name = 1;
// The operation name. There may be custom parameters in attrs.


@ -742,12 +742,22 @@ namespace {
using ::tensorflow::strings::Scanner;
bool IsValidOpName(StringPiece sp) {
return Scanner(sp)
.One(Scanner::LETTER_DIGIT_DOT)
.Any(Scanner::LETTER_DIGIT_DASH_DOT_SLASH_UNDERSCORE)
.Eos()
.GetResult();
bool IsValidNodeName(StringPiece sp) {
Scanner scanner(sp);
scanner.One(Scanner::LETTER_DIGIT_DOT)
.Any(Scanner::LETTER_DIGIT_DASH_DOT_SLASH_UNDERSCORE);
while (true) {
if (!scanner.GetResult()) // Some error in previous iteration.
return false;
if (scanner.empty()) // No error, but nothing left, good.
return true;
// Absorb another piece, starting with a '>'
scanner.One(Scanner::RANGLE)
.One(Scanner::LETTER_DIGIT_DOT)
.Any(Scanner::LETTER_DIGIT_DASH_DOT_SLASH_UNDERSCORE);
}
}
bool IsValidDataInputName(StringPiece sp) {
@ -791,16 +801,16 @@ Status ValidateOpInput(const string& input_name, bool* is_control_input) {
}
}
Status ValidateOpName(const string& op_name) {
if (IsValidOpName(op_name)) {
Status ValidateNodeName(const string& node_name) {
if (IsValidNodeName(node_name)) {
return Status::OK();
} else {
return errors::InvalidArgument("Illegal op name '", op_name, "'");
return errors::InvalidArgument("Illegal op name '", node_name, "'");
}
}
Status ValidateExternalNodeDefSyntax(const NodeDef& node_def) {
Status s = ValidateOpName(node_def.name());
Status s = ValidateNodeName(node_def.name());
if (!s.ok()) {
return AttachDef(s, node_def);
}
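The scanner loop introduced above accepts node names made of one or more `>`-separated pieces, so namespaced names like `Project>SomeOp` validate while `>OpName`, `OpName>`, and `A>>B` are rejected. A standalone sketch of the same grammar without the TF `Scanner` class (the function name is ours, not TensorFlow's):

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Sketch of the new IsValidNodeName logic: each piece starts with a letter,
// digit, or '.', continues with letters, digits, '-', '.', '/', '_', and
// pieces are joined by single '>' characters.
bool IsValidNodeNameSketch(const std::string& name) {
  std::size_t i = 0;
  while (true) {
    if (i >= name.size()) return false;  // Empty piece (e.g. trailing '>').
    char c = name[i];
    if (!(std::isalnum(static_cast<unsigned char>(c)) || c == '.')) return false;
    ++i;
    while (i < name.size()) {  // Absorb the rest of this piece.
      c = name[i];
      if (std::isalnum(static_cast<unsigned char>(c)) || c == '-' ||
          c == '.' || c == '/' || c == '_') {
        ++i;
      } else {
        break;
      }
    }
    if (i == name.size()) return true;  // Consumed everything: valid.
    if (name[i] != '>') return false;   // Only '>' may separate pieces.
    ++i;                                // Start the next piece after '>'.
  }
}
```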


@ -282,10 +282,28 @@ TEST(NodeDefUtilTest, ValidSyntax) {
)proto");
ExpectValidSyntax(node_def);
const NodeDef node_def_namespace = ToNodeDef(R"proto(
name: 'n'
op: 'Project>AnyIn'
input: 'a'
input: 'b'
attr {
key: 'T'
value { list { type: [ DT_INT32, DT_STRING ] } }
}
)proto");
ExpectValidSyntax(node_def_namespace);
const NodeDef node_def_explicit_inputs = ToNodeDef(R"proto(
name:'n' op:'AnyIn' input:'a:0' input:'b:123'
attr { key:'T' value { list { type: [DT_INT32, DT_STRING] } } }
)proto");
name: 'n'
op: 'AnyIn'
input: 'a:0'
input: 'b:123'
attr {
key: 'T'
value { list { type: [ DT_INT32, DT_STRING ] } }
}
)proto");
ExpectValidSyntax(node_def_explicit_inputs);
EXPECT_EQ("{{node n}} = AnyIn[T=[DT_INT32, DT_STRING]](a:0, b:123)",


@ -14,7 +14,7 @@ import "tensorflow/core/framework/types.proto";
// LINT.IfChange
message OpDef {
// Op names starting with an underscore are reserved for internal use.
// Names should be CamelCase and match the regexp "[A-Z][a-zA-Z0-9_]*".
// Names should be CamelCase and match the regexp "[A-Z][a-zA-Z0-9>_]*".
string name = 1;
// For describing inputs and outputs.


@ -248,16 +248,29 @@ static Status ValidateArg(const OpDef::ArgDef& arg, const OpDef& op_def,
return Status::OK();
}
Status ValidateOpDef(const OpDef& op_def) {
bool IsValidOpName(StringPiece sp) {
using ::tensorflow::strings::Scanner;
Scanner scanner(sp);
scanner.One(Scanner::UPPERLETTER).Any(Scanner::LETTER_DIGIT_UNDERSCORE);
while (true) {
if (!scanner.GetResult()) // Some error in previous iteration.
return false;
if (scanner.empty()) // No error, but nothing left, good.
return true;
// Absorb another name/namespace, starting with a '>'
scanner.One(Scanner::RANGLE)
.One(Scanner::UPPERLETTER)
.Any(Scanner::LETTER_DIGIT_UNDERSCORE);
}
}
Status ValidateOpDef(const OpDef& op_def) {
if (!absl::StartsWith(op_def.name(), "_")) {
VALIDATE(Scanner(op_def.name())
.One(Scanner::UPPERLETTER)
.Any(Scanner::LETTER_DIGIT_UNDERSCORE)
.Eos()
.GetResult(),
"Invalid name: ", op_def.name(), " (Did you use CamelCase?)");
VALIDATE(IsValidOpName(op_def.name()), "Invalid name: ", op_def.name(),
" (Did you use CamelCase?)");
}
std::set<string> names; // for detecting duplicate names


@ -74,12 +74,26 @@ TEST_F(ValidateOpDefTest, OpDefValid) {
TF_EXPECT_OK(TestBuilder(OpDefBuilder("X").Attr("a: int >= -5 = 3")));
TF_EXPECT_OK(TestBuilder(OpDefBuilder("X").Attr("a: numbertype")));
TF_EXPECT_OK(TestBuilder(OpDefBuilder("Uppercase")));
TF_EXPECT_OK(TestBuilder(OpDefBuilder("Namespace>X").Attr("a: int")));
TF_EXPECT_OK(TestBuilder(OpDefBuilder("Namespace>X>Y").Attr("a: int")));
}
TEST_F(ValidateOpDefTest, InvalidName) {
ExpectFailure(TestBuilder(OpDefBuilder("lower").Attr("a: int")),
"Invalid name");
ExpectFailure(TestBuilder(OpDefBuilder("BadSuffix 7%")), "Invalid name");
ExpectFailure(TestBuilder(OpDefBuilder(">OpName").Attr("a: int")),
"Invalid name");
// Can't have a dangling empty namespace
ExpectFailure(TestBuilder(OpDefBuilder("OpName>").Attr("a: int")),
"Invalid name");
// Each namespace section must be Camelcased
ExpectFailure(TestBuilder(OpDefBuilder("OpName>b").Attr("a: int")),
"Invalid name");
// Can't have empty namespaces
ExpectFailure(TestBuilder(OpDefBuilder("OpName>A>>B").Attr("a: int")),
"Invalid name");
}
TEST_F(ValidateOpDefTest, DuplicateName) {


@ -264,7 +264,24 @@ class LocalRendezvousImpl : public Rendezvous {
VLOG(2) << "Enqueue Recv Item (key:" << key.FullKey() << "). ";
Item* item = new Item;
item->waiter = std::move(done);
if (cm != nullptr) {
auto wrapped_done = std::bind(
[cm, token](const DoneCallback& done,
// Begin unbound arguments.
const Status& s, const Args& send_args,
const Args& recv_args, const Tensor& v, bool dead) {
cm->TryDeregisterCallback(token);
done(s, send_args, recv_args, v, dead);
},
std::move(done), std::placeholders::_1, std::placeholders::_2,
std::placeholders::_3, std::placeholders::_4,
std::placeholders::_5);
item->waiter = std::move(wrapped_done);
} else {
item->waiter = std::move(done);
}
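The change above wraps the receive-done callback so the cancellation token is deregistered at the moment the callback fires, rather than at item destruction (the deregistration in the destructor is deleted further down). A small sketch of the wrapping pattern with a lambda in place of `std::bind`; the callback signatures here are simplified stand-ins, not the real rendezvous types:

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Sketch: produce a callback that first deregisters the cancellation hook,
// then forwards to the original completion callback.
std::function<void(int)> WrapDone(std::function<void(int)> done,
                                  std::function<void()> deregister) {
  return [done = std::move(done),
          deregister = std::move(deregister)](int status) {
    deregister();  // cm->TryDeregisterCallback(token) in the real code.
    done(status);  // Forward to the original waiter.
  };
}
```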
item->recv_args = recv_args;
item->cancellation_token = token;
if (item->recv_args.device_context) {
@ -332,11 +349,6 @@ class LocalRendezvousImpl : public Rendezvous {
if (recv_args.device_context) {
recv_args.device_context->Unref();
}
auto* cm = recv_args.cancellation_manager;
if (cancellation_token != CancellationManager::kInvalidToken &&
cm != nullptr) {
cm->TryDeregisterCallback(cancellation_token);
}
}
// Returns true iff this item represents a value being sent.


@ -515,7 +515,12 @@ TensorBuffer* FromProtoField<Variant>(Allocator* a, const TensorProto& in,
if (in_n <= 0) {
std::fill_n(data, n, Variant());
} else {
for (int64 i = 0; i < in_n; ++i) {
// If tensor shape says we have n < in_n elements in the output tensor
// then make sure to only decode the first n out of the in_n elements in the
// in tensors. In all other cases, we decode all in_n elements of in and set
// the remaining elements up to n to be the default Variant() value.
const int64 real_n = n < in_n ? n : in_n;
for (int64 i = 0; i < real_n; ++i) {
data[i] = in.variant_val(i);
if (!DecodeUnaryVariant(&data[i])) {
LOG(ERROR) << "Could not decode variant with type_name: \""
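The clamping fix above can be sketched in isolation: when filling an output buffer of `n` elements from a proto carrying `in_n` values, decode only the first `min(n, in_n)` and leave the rest default-initialized, so a proto with more values than the tensor shape allows cannot write past the buffer. A minimal stand-in (plain `int` in place of `Variant`, names ours):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sketch of the real_n clamp: copy at most n values out of proto_vals,
// defaulting the remainder (0 stands in for the default Variant()).
std::vector<int> FillFromProto(long long n, const std::vector<int>& proto_vals) {
  std::vector<int> data(static_cast<std::size_t>(n), 0);
  const long long in_n = static_cast<long long>(proto_vals.size());
  const long long real_n = std::min(n, in_n);
  for (long long i = 0; i < real_n; ++i) {
    data[static_cast<std::size_t>(i)] = proto_vals[static_cast<std::size_t>(i)];
  }
  return data;
}
```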


@ -66,12 +66,23 @@ inline bool IsNextIteration(const NodeDef& node_def) {
bool IsValidNodeName(StringPiece s, bool allow_internal_ops) {
using ::tensorflow::strings::Scanner;
return Scanner(s)
Scanner scanner(s);
scanner
.One(allow_internal_ops ? Scanner::LETTER_DIGIT_DOT_UNDERSCORE
: Scanner::LETTER_DIGIT_DOT)
.Any(Scanner::LETTER_DIGIT_DASH_DOT_SLASH_UNDERSCORE)
.Eos()
.GetResult();
.Any(Scanner::LETTER_DIGIT_DASH_DOT_SLASH_UNDERSCORE);
while (true) {
if (!scanner.GetResult()) // Some error in previous iteration.
return false;
if (scanner.empty()) // No error, but nothing left, good.
return true;
// Absorb another piece, starting with a '>'
scanner.One(Scanner::RANGLE)
.One(Scanner::LETTER_DIGIT_DOT)
.Any(Scanner::LETTER_DIGIT_DASH_DOT_SLASH_UNDERSCORE);
}
}
class GraphConstructor {
@ -1399,6 +1410,17 @@ void GraphConstructor::Undo() {
Status GraphConstructor::MakeEdge(Node* src, int output_index, Node* dst,
int input_index) {
if (output_index >= src->num_outputs()) {
return errors::InvalidArgument(
"Output ", output_index, " of node ", src->name(),
" does not exist. Node only has ", src->num_outputs(), " outputs.");
}
if (input_index >= dst->num_inputs()) {
return errors::InvalidArgument(
"Input ", input_index, " of node ", dst->name(),
" does not exist. Node only has ", dst->num_inputs(), " inputs.");
}
DataType src_out = src->output_type(output_index);
DataType dst_in = dst->input_type(input_index);
if (!TypesCompatible(dst_in, src_out)) {


@ -206,6 +206,7 @@ static inline bool IsMklElementWiseOp(const string& op_name, DataType T) {
return false;
}
bool result = (0 == op_name.compare(GetMklOpName("Add")) ||
0 == op_name.compare(GetMklOpName("AddV2")) ||
0 == op_name.compare(GetMklOpName("Sub")) ||
0 == op_name.compare(GetMklOpName("Mul")) ||
0 == op_name.compare(GetMklOpName("Maximum")) ||


@ -246,6 +246,7 @@ class MklLayoutRewritePass : public GraphOptimizationPass {
csinfo_.avg_pool3d = "AvgPool3D";
csinfo_.avg_pool3d_grad = "AvgPool3DGrad";
csinfo_.batch_matmul = "BatchMatMul";
csinfo_.batch_matmul_v2 = "BatchMatMulV2";
csinfo_.bias_add = "BiasAdd";
csinfo_.bias_add_grad = "BiasAddGrad";
csinfo_.concat = "Concat";
@ -349,6 +350,7 @@ class MklLayoutRewritePass : public GraphOptimizationPass {
// in the MklUtil.h (IsMklElementWiseOp method) to ensure that the
// MklInputConversion op is added before it.
csinfo_.add = "Add";
csinfo_.add_v2 = "AddV2";
csinfo_.maximum = "Maximum";
csinfo_.mul = "Mul";
csinfo_.squared_difference = "SquaredDifference";
@ -363,6 +365,10 @@ class MklLayoutRewritePass : public GraphOptimizationPass {
rinfo_.push_back({csinfo_.add, mkl_op_registry::GetMklOpName(csinfo_.add),
CopyAttrsAll, RewriteIfAtleastOneMklInput,
kRewriteForLayoutPropagation});
rinfo_.push_back({csinfo_.add_v2,
mkl_op_registry::GetMklOpName(csinfo_.add_v2),
CopyAttrsAll, RewriteIfAtleastOneMklInput,
kRewriteForLayoutPropagation});
rinfo_.push_back(
{csinfo_.avg_pool, mkl_op_registry::GetMklOpName(csinfo_.avg_pool),
CopyAttrsAll, AlwaysRewrite, kRewriteForLayoutPropagation});
@ -380,6 +386,9 @@ class MklLayoutRewritePass : public GraphOptimizationPass {
rinfo_.push_back({csinfo_.batch_matmul,
mkl_op_registry::GetMklOpName(csinfo_.batch_matmul),
CopyAttrsAll, AlwaysRewrite, kRewriteForOpNameChange});
rinfo_.push_back({csinfo_.batch_matmul_v2,
mkl_op_registry::GetMklOpName(csinfo_.batch_matmul_v2),
CopyAttrsAll, AlwaysRewrite, kRewriteForOpNameChange});
rinfo_.push_back(
{csinfo_.concat, mkl_op_registry::GetMklOpName(csinfo_.concat),
CopyAttrsAll, AlwaysRewrite, kRewriteForLayoutPropagation});
@ -863,11 +872,13 @@ rinfo_.push_back({csinfo_.tanh_grad,
typedef struct {
string addn;
string add;
string add_v2;
string avg_pool;
string avg_pool_grad;
string avg_pool3d;
string avg_pool3d_grad;
string batch_matmul;
string batch_matmul_v2;
string bias_add;
string bias_add_grad;
string concat;


@ -3776,6 +3776,65 @@ TEST_F(MklLayoutPassTest, NodeRewrite_Slice_DeviceTest) {
"B->D:1;C->D:2;D->E:1");
}
// The following positive and negative tests test the rewrite of Add and AddV2
// to MKL versions. The operators will be rewritten only if one of the inputs
// comes from another MKL operator.
TEST_F(MklLayoutPassTest, PositiveRewriteAdd) {
InitGraph(
"node { name: 'A' op: 'Input'}"
"node { name: 'B' op: 'Input'}"
"node { name: 'M' op: 'Relu'"
" attr { key: 'T' value { type: DT_FLOAT } }"
" input: ['A']}"
"node { name: 'N' op: 'Add'"
" attr { key: 'T' value { type: DT_FLOAT } }"
" input: ['M', 'B']}");
EXPECT_EQ(
DoMklLayoutOptimizationPass(),
"A(Input);B(Input);DMT/_0(Const);DMT/_1(Const);M(_MklRelu);N(_MklAdd)"
"|A->M;A:control->DMT/_0:control;B->N:1;DMT/_0->M:1;DMT/_1->N:3;M->N;"
"M:1->N:2;M:control->DMT/_1:control");
}
TEST_F(MklLayoutPassTest, NegativeRewriteAdd) {
InitGraph(
"node { name: 'A' op: 'Input'}"
"node { name: 'B' op: 'Input'}"
"node { name: 'N' op: 'Add'"
" attr { key: 'T' value { type: DT_FLOAT } }"
" input: ['A', 'B']}");
EXPECT_EQ(DoMklLayoutOptimizationPass(),
"A(Input);B(Input);N(Add)|A->N;B->N:1");
}
TEST_F(MklLayoutPassTest, PositiveRewriteAddV2) {
InitGraph(
"node { name: 'A' op: 'Input'}"
"node { name: 'B' op: 'Input'}"
"node { name: 'M' op: 'Relu'"
" attr { key: 'T' value { type: DT_FLOAT } }"
" input: ['A']}"
"node { name: 'N' op: 'AddV2'"
" attr { key: 'T' value { type: DT_FLOAT } }"
" input: ['M', 'B']}");
EXPECT_EQ(
DoMklLayoutOptimizationPass(),
"A(Input);B(Input);DMT/_0(Const);DMT/_1(Const);M(_MklRelu);N(_MklAddV2)"
"|A->M;A:control->DMT/_0:control;B->N:1;DMT/_0->M:1;DMT/_1->N:3;M->N;"
"M:1->N:2;M:control->DMT/_1:control");
}
TEST_F(MklLayoutPassTest, NegativeRewriteAddV2) {
InitGraph(
"node { name: 'A' op: 'Input'}"
"node { name: 'B' op: 'Input'}"
"node { name: 'N' op: 'AddV2'"
" attr { key: 'T' value { type: DT_FLOAT } }"
" input: ['A', 'B']}");
EXPECT_EQ(DoMklLayoutOptimizationPass(),
"A(Input);B(Input);N(AddV2)|A->N;B->N:1");
}
/////////////////////////////////////////////////////////////////////
// Post-rewrite fixup pass test
/////////////////////////////////////////////////////////////////////
@ -4307,6 +4366,39 @@ TEST_F(MklLayoutPassTest,
"H->K:7;I->K:8;J->L:1;K->L");
}
TEST_F(MklLayoutPassTest, MatMul_Positive) {
InitGraph(
"node { name: 'A' op: 'Input'}"
"node { name: 'B' op: 'Input'}"
"node { name: 'C' op: 'MatMul'"
" attr { key: 'T' value { type: DT_FLOAT } }"
" input: ['A', 'B']}");
EXPECT_EQ(DoMklLayoutOptimizationPass(),
"A(Input);B(Input);C(_MklMatMul)|A->C;B->C:1");
}
TEST_F(MklLayoutPassTest, BatchMatMul_Positive) {
InitGraph(
"node { name: 'A' op: 'Input'}"
"node { name: 'B' op: 'Input'}"
"node { name: 'C' op: 'BatchMatMul'"
" attr { key: 'T' value { type: DT_FLOAT } }"
" input: ['A', 'B']}");
EXPECT_EQ(DoMklLayoutOptimizationPass(),
"A(Input);B(Input);C(_MklBatchMatMul)|A->C;B->C:1");
}
TEST_F(MklLayoutPassTest, BatchMatMulV2_Positive) {
InitGraph(
"node { name: 'A' op: 'Input'}"
"node { name: 'B' op: 'Input'}"
"node { name: 'C' op: 'BatchMatMulV2'"
" attr { key: 'T' value { type: DT_FLOAT } }"
" input: ['A', 'B']}");
EXPECT_EQ(DoMklLayoutOptimizationPass(),
"A(Input);B(Input);C(_MklBatchMatMulV2)|A->C;B->C:1");
}
static void BM_MklLayoutRewritePass(int iters, int op_nodes) {
testing::StopTiming();
string s;


@ -40,6 +40,18 @@ TEST(UtilsTest, GetLocalGPUInfo) {
properties = GetLocalGPUInfo(PlatformGpuId(0));
EXPECT_EQ("GPU", properties.type());
EXPECT_EQ("NVIDIA", properties.vendor());
#elif TENSORFLOW_USE_ROCM
LOG(INFO) << "ROCm is enabled.";
DeviceProperties properties;
// Invalid platform GPU ID.
properties = GetLocalGPUInfo(PlatformGpuId(100));
EXPECT_EQ("UNKNOWN", properties.type());
// Succeed when a valid platform GPU id was inserted.
properties = GetLocalGPUInfo(PlatformGpuId(0));
EXPECT_EQ("GPU", properties.type());
EXPECT_EQ("Advanced Micro Devices, Inc", properties.vendor());
#else
LOG(INFO) << "CUDA is not enabled.";
DeviceProperties properties;
@ -73,6 +85,8 @@ TEST(UtilsTest, GetDeviceInfo) {
EXPECT_EQ("GPU", properties.type());
#if GOOGLE_CUDA
EXPECT_EQ("NVIDIA", properties.vendor());
#elif TENSORFLOW_USE_ROCM
EXPECT_EQ("Advanced Micro Devices, Inc", properties.vendor());
#endif
// TF to platform GPU id mapping entry doesn't exist.
@ -81,7 +95,7 @@ TEST(UtilsTest, GetDeviceInfo) {
properties = GetDeviceInfo(device);
EXPECT_EQ("UNKNOWN", properties.type());
#if GOOGLE_CUDA
#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
// Invalid platform GPU id.
TF_ASSERT_OK(
GpuIdManager::InsertTfPlatformGpuIdPair(TfGpuId(0), PlatformGpuId(100)));
@ -94,7 +108,11 @@ TEST(UtilsTest, GetDeviceInfo) {
device.id = 1;
properties = GetDeviceInfo(device);
EXPECT_EQ("GPU", properties.type());
#if GOOGLE_CUDA
EXPECT_EQ("NVIDIA", properties.vendor());
#elif TENSORFLOW_USE_ROCM
EXPECT_EQ("Advanced Micro Devices, Inc", properties.vendor());
#endif
#endif
}


@ -39,7 +39,7 @@ Status RebatchOptimizer::Init(
return errors::InvalidArgument(
"Cannot initialize RebatchOptimizer without config.");
num_workers_ = config->parameter_map().at("num_workers").i();
num_replicas_ = config->parameter_map().at("num_replicas").i();
use_fallback_ = config->parameter_map().at("use_fallback").b();
return Status::OK();
}
@ -200,11 +200,13 @@ Status AddConstBoolNode(bool value, FunctionDef* fdef, NodeDef** result) {
return Status::OK();
}
Status AddShapeNode(const NodeDefBuilder::NodeOut& input, FunctionDef* fdef,
NodeDef** result) {
Status AddShapeNode(const NodeDefBuilder::NodeOut& input, DataType out_type,
FunctionDef* fdef, NodeDef** result) {
*result = fdef->add_node_def();
TF_RETURN_IF_ERROR(
NodeDefBuilder("", "Shape").Input(input).Finalize(*result));
TF_RETURN_IF_ERROR(NodeDefBuilder("", "Shape")
.Input(input)
.Attr("out_type", out_type)
.Finalize(*result));
function_utils::SetUniqueFunctionNodeName("rebatch/shape", fdef, *result);
return Status::OK();
}
@ -276,45 +278,60 @@ void SetUnknownShapes(int num_components, AttrValue* output_shapes) {
}
}
Status GetBatchDim(AttrValue output_shapes, int* batch_dim) {
const auto& shape_0 = output_shapes.list().shape(0);
if (shape_0.unknown_rank() || shape_0.dim(0).size() == -1) {
// If the batch dimension is known and divisible by num_replicas, we set
// result = batch_dim / num_replicas. If the batch dimension is unknown,
// result = -1. If the dataset node is missing an output shapes attr,
// or the batch dimensions of its components don't match, we return an error
// status.
Status GetMinibatchDimForReshape(const NodeDef& dataset_node,
int64 num_replicas, int64* result) {
AttrValue output_shapes;
if (!dataset_node.attr().contains(kOutputShapesAttr)) {
return errors::InvalidArgument(
"Cannot use rebatching fallback when 0th dimensions of dataset "
"components are not fully known. Component 0 has shape: ",
shape_0.ShortDebugString());
"Cannot use rebatching fallback when the final dataset node does not "
"have an `output_shapes` attr. Node: ",
dataset_node.name(), " Op: ", dataset_node.op());
}
output_shapes = dataset_node.attr().at(kOutputShapesAttr);
*batch_dim = output_shapes.list().shape(0).dim(0).size();
for (int i = 1; i < output_shapes.list().shape_size(); ++i) {
// Get the batch dimension by checking the 0th dimension of all the inputs.
int batch_dim = -1;
for (int i = 0; i < output_shapes.list().shape_size(); ++i) {
const auto& shape_i = output_shapes.list().shape(i);
if (shape_i.unknown_rank() || shape_i.dim(0).size() == -1) {
// If unknown, ignore.
if (shape_i.unknown_rank()) continue;
int batch_dim_i = shape_i.dim(0).size();
if (batch_dim_i == -1) continue;
// Update batch_dim with known dimension.
if (batch_dim_i != batch_dim && batch_dim != -1) {
return errors::InvalidArgument(
"Cannot use rebatching fallback when 0th dimensions of dataset "
"components are not fully known. Component ",
i, " has shape: ", shape_i.ShortDebugString());
}
if (shape_i.dim(0).size() != *batch_dim) {
return errors::InvalidArgument(
"Cannot use rebatching fallback when 0th dimensions of dataset "
"Cannot use rebatching fallback: 0th dimensions of dataset "
"components don't match. Component ",
i, " has batch dimension: ", shape_i.dim(0).size(),
" while previous components have batch dimension: ", *batch_dim);
i, " has batch dimension: ", batch_dim_i,
" while previous components have batch dimension: ", batch_dim);
}
batch_dim = batch_dim_i;
}
if (batch_dim == -1 || batch_dim % num_replicas != 0) {
*result = -1;
} else {
*result = batch_dim / num_replicas;
}
return Status::OK();
}
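The decision `GetMinibatchDimForReshape` makes above can be sketched without the graph plumbing: scan the known 0th dimensions of the components, require the known ones to agree, then return `batch_dim / num_replicas` only when the batch dimension is known and divisible, and `-1` (unknown) otherwise. A standalone stand-in, with `-1` marking an unknown dimension and the function name ours:

```cpp
#include <cassert>
#include <vector>

// Returns false on mismatched known batch dims (an error in the real code);
// otherwise writes the per-replica minibatch dim, or -1 if it is unknown
// or the batch dim does not divide evenly by num_replicas.
bool MinibatchDim(const std::vector<long long>& batch_dims,
                  long long num_replicas, long long* result) {
  long long batch_dim = -1;
  for (long long d : batch_dims) {
    if (d == -1) continue;  // Unknown component dims are ignored.
    if (batch_dim != -1 && d != batch_dim) return false;  // Mismatch.
    batch_dim = d;
  }
  *result = (batch_dim != -1 && batch_dim % num_replicas == 0)
                ? batch_dim / num_replicas
                : -1;
  return true;
}
```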
Status UpdateOutputShapes(const string& node_name, int64 num_workers,
Status UpdateOutputShapes(const string& node_name, int64 num_replicas,
MutableGraphView* graph) {
NodeDef* node = graph->GetNode(node_name);
if (node->attr().contains(kOutputShapesAttr)) {
AttrValue output_shapes = node->attr().at(kOutputShapesAttr);
for (auto& shape : *output_shapes.mutable_list()->mutable_shape()) {
if (!shape.unknown_rank() && shape.dim(0).size() != -1) {
shape.mutable_dim(0)->set_size(shape.dim(0).size() / num_workers);
shape.mutable_dim(0)->set_size(shape.dim(0).size() / num_replicas);
}
}
(*node->mutable_attr())[kOutputShapesAttr] = output_shapes;
@ -335,16 +352,16 @@ int64 GetBatchSizeArgIndex(const NodeDef& batch_node) {
}
Status MakeNewBatchSizeNode(const string& global_batch_size_name,
int64 num_workers, FunctionDef* fdef,
int64 num_replicas, FunctionDef* fdef,
NodeDef** result) {
NodeDef* one_node;
TF_RETURN_IF_ERROR(AddConstInt64Node(1, fdef, &one_node));
NodeDef* num_workers_node;
TF_RETURN_IF_ERROR(AddConstInt64Node(num_workers, fdef, &num_workers_node));
NodeDef* num_replicas_node;
TF_RETURN_IF_ERROR(AddConstInt64Node(num_replicas, fdef, &num_replicas_node));
NodeDef* numerator_node =
AddBinaryNode(global_batch_size_name,
strings::StrCat(num_workers_node->name(), ":output:0"),
strings::StrCat(num_replicas_node->name(), ":output:0"),
kAddOp, DT_INT64, fdef);
numerator_node = AddBinaryNode(
strings::StrCat(numerator_node->name(), ":z:0"),
@ -352,14 +369,14 @@ Status MakeNewBatchSizeNode(const string& global_batch_size_name,
*result =
AddBinaryNode(strings::StrCat(numerator_node->name(), ":z:0"),
strings::StrCat(num_workers_node->name(), ":output:0"),
strings::StrCat(num_replicas_node->name(), ":output:0"),
kTruncateDivOp, DT_INT64, fdef);
return Status::OK();
}
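`MakeNewBatchSizeNode` above builds, out of Add and TruncateDiv nodes, the ceiling division `(global_batch_size + num_replicas - 1) / num_replicas`, so the last replica's minibatch may be smaller but the per-step total still sums to the global batch. The arithmetic in one line:

```cpp
#include <cassert>

// Ceiling division as constructed by the Add/TruncateDiv node chain:
// minibatch_size = ceil(global_batch_size / num_replicas).
long long MinibatchSize(long long global_batch_size, long long num_replicas) {
  return (global_batch_size + num_replicas - 1) / num_replicas;
}
```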
// Given a "batch" dataset node, we replace the `batch_size` input with a new
// input that corresponds to the original input divided by `num_workers`.
Status MutateBatchSize(const NodeDef& node, int64 num_workers,
// input that corresponds to the original input divided by `num_replicas`.
Status MutateBatchSize(const NodeDef& node, int64 num_replicas,
MutableGraphView* graph) {
// For all the batching datasets the batch_size is input number 1 except for
// MapAndBatchDataset.
@ -369,8 +386,8 @@ Status MutateBatchSize(const NodeDef& node, int64 num_workers,
int64 batch_size;
TF_RETURN_IF_ERROR(
graph_utils::GetScalarConstNodeValue(*batch_size_node, &batch_size));
DCHECK_EQ(batch_size % num_workers, 0);
batch_size = batch_size / num_workers;
DCHECK_EQ(batch_size % num_replicas, 0);
batch_size = batch_size / num_replicas;
NodeDef* new_batch_size_node =
graph_utils::AddScalarConstNode<int64>(batch_size, graph);
// We don't call UpdateFanouts here because CSE elimination might lead to
@ -411,10 +428,12 @@ Status AddFlatMapNode(const string& input_dataset,
}
// def flat_map_fn(*batched_components):
// batch_size = tf.shape(batched_components[0])[0]
// minibatch_size = (batch_size + num_replicas - 1) // num_replicas
// ds = tf.data.Dataset.from_tensor_slices(batched_components)
// return ds.batch(minibatch_size, drop_remainder=False)
Status CreateFlatMapFnWithBatch(const DataTypeVector& dtypes, int64 num_workers,
FunctionDef* result) {
Status CreateFlatMapFnWithBatch(const DataTypeVector& dtypes,
int64 num_replicas, FunctionDef* result) {
NodeDef* tensor_slice_node = result->add_node_def();
tensor_slice_node->set_op("TensorSliceDataset");
for (int i = 0; i < dtypes.size(); ++i) {
@ -439,13 +458,32 @@ Status CreateFlatMapFnWithBatch(const DataTypeVector& dtypes, int64 num_workers,
batch_node->add_input(
strings::StrCat(tensor_slice_node->name(), ":handle:0"));
// `batch_size` input
// Here, we capture the original batch size from outside the flat map fn.
auto* original_batch_size =
function_utils::AddFunctionInput("captured_batch_size", result, DT_INT64);
// `batch_size` is tf.shape(arg)[0]
NodeDef* shape;
TF_RETURN_IF_ERROR(AddShapeNode({tensor_slice_node->input(0), 0, dtypes[0]},
DT_INT64, result, &shape));
// Const with value [0]
NodeDef* const_vec_0;
TF_RETURN_IF_ERROR(AddConstIntNode({0}, {1}, result, &const_vec_0));
// Const with value [1]
NodeDef* const_vec_1;
TF_RETURN_IF_ERROR(AddConstIntNode({1}, {1}, result, &const_vec_1));
// Extracts the 0th dimension from the shape node.
NodeDef* original_batch_size;
TF_RETURN_IF_ERROR(AddStridedSliceNode(
{strings::StrCat(shape->name(), ":output"), 0, DT_INT64},
{strings::StrCat(const_vec_0->name(), ":output"), 0, DT_INT32},
{strings::StrCat(const_vec_1->name(), ":output"), 0, DT_INT32},
{strings::StrCat(const_vec_1->name(), ":output"), 0, DT_INT32}, DT_INT32,
0, 0, 0, 0, 1, result, &original_batch_size));
NodeDef* new_batch_size;
TF_RETURN_IF_ERROR(MakeNewBatchSizeNode(
original_batch_size->name(), num_workers, result, &new_batch_size));
strings::StrCat(original_batch_size->name(), ":output:0"), num_replicas,
result, &new_batch_size));
batch_node->add_input(strings::StrCat(new_batch_size->name(), ":z:0"));
// `drop_remainder` input
@ -470,9 +508,9 @@ Status CreateFlatMapFnWithBatch(const DataTypeVector& dtypes, int64 num_workers,
// in a step adds up to the global batch size. However, since this adds
// additional data copies (both from_tensor_slices and batch), we only use
// this approach when necessary, i.e. when we need to drop remainder on the
// global batch, or when the global batch size does not divide num_workers
// global batch, or when the global batch size does not divide num_replicas
// evenly.
Status AppendFlatMap(const NodeDef& batch_node, int64 num_workers,
Status AppendFlatMap(const NodeDef& batch_node, int64 num_replicas,
FunctionLibraryDefinition* flib, MutableGraphView* graph) {
// `.flat_map(lambda x: tf.data.Dataset.from_tensor_slices(x).
// batch(minibatch_size, drop_remainder=False))`
@ -484,9 +522,7 @@ Status AppendFlatMap(const NodeDef& batch_node, int64 num_workers,
TF_RETURN_IF_ERROR(
graph_utils::GetDatasetOutputTypesAttr(batch_node, &dtypes));
TF_RETURN_IF_ERROR(
CreateFlatMapFnWithBatch(dtypes, num_workers, &flat_map_fn));
int64 batch_size_index = GetBatchSizeArgIndex(batch_node);
CreateFlatMapFnWithBatch(dtypes, num_replicas, &flat_map_fn));
NodeDef* flat_map_node;
@ -496,15 +532,14 @@ Status AppendFlatMap(const NodeDef& batch_node, int64 num_workers,
// Because the flat map function uses drop_remainder = False,
// the shape might be unknown.
auto old_dim = shape.dim(0).size();
auto new_dim = old_dim % num_workers == 0 ? old_dim / num_workers : -1;
auto new_dim = old_dim % num_replicas == 0 ? old_dim / num_replicas : -1;
shape.mutable_dim(0)->set_size(new_dim);
}
}
TF_RETURN_IF_ERROR(AddFlatMapNode(strings::StrCat(batch_node.name(), ":0"),
{batch_node.input(batch_size_index)},
{DT_INT64}, flat_map_fn, output_shapes,
dtypes, flib, graph, &flat_map_node));
{}, {}, flat_map_fn, output_shapes, dtypes,
flib, graph, &flat_map_node));
TF_RETURN_IF_ERROR(
graph->UpdateFanouts(batch_node.name(), flat_map_node->name()));
@ -514,12 +549,13 @@ Status AppendFlatMap(const NodeDef& batch_node, int64 num_workers,
// There are several things we do here, depending on the values of
// batch_size and drop_remainder.
// (1) If batch size is known and divisible by num_workers, and drop_remainder
// (1) If batch size is known and divisible by num_replicas, and drop_remainder
// is known to be False, we mutate the batch size directly.
// .batch(global_batch_size) -> .batch(global_batch_size // num_workers)
// .batch(global_batch_size) -> .batch(global_batch_size // num_replicas)
// (2) Otherwise, we add a flat_map transformation to preserve the global batch
// size across the workers and to preserve the drop remainder behavior.
bool ShouldMutateBatchSizeDirectly(const NodeDef& batch_node, int64 num_workers,
// size across the replicas and to preserve the drop remainder behavior.
bool ShouldMutateBatchSizeDirectly(const NodeDef& batch_node,
int64 num_replicas,
MutableGraphView* graph) {
int64 batch_size_arg_index = GetBatchSizeArgIndex(batch_node);
NodeDef* batch_size_node =
@ -528,9 +564,9 @@ bool ShouldMutateBatchSizeDirectly(const NodeDef& batch_node, int64 num_workers,
int64 batch_size;
Status s =
graph_utils::GetScalarConstNodeValue(*batch_size_node, &batch_size);
// If batch size is unknown or indivisible by num workers, we don't
// If batch size is unknown or indivisible by num replicas, we don't
// mutate it directly
if (!s.ok() || batch_size % num_workers != 0) return false;
if (!s.ok() || batch_size % num_replicas != 0) return false;
if (batch_node.op() == kBatchOp || batch_node.op() == kPaddedBatchOp) {
// These ops don't have a `drop_remainder` input, and behave like
@ -547,16 +583,16 @@ bool ShouldMutateBatchSizeDirectly(const NodeDef& batch_node, int64 num_workers,
return s.ok() && !drop_remainder;
}
Status RewriteBatchNode(const NodeDef& batch_node, int64 num_workers,
Status RewriteBatchNode(const NodeDef& batch_node, int64 num_replicas,
FunctionLibraryDefinition* flib,
MutableGraphView* graph) {
if (ShouldMutateBatchSizeDirectly(batch_node, num_workers, graph)) {
return MutateBatchSize(batch_node, num_workers, graph);
if (ShouldMutateBatchSizeDirectly(batch_node, num_replicas, graph)) {
return MutateBatchSize(batch_node, num_replicas, graph);
}
return AppendFlatMap(batch_node, num_workers, flib, graph);
return AppendFlatMap(batch_node, num_replicas, flib, graph);
}
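The rewrite rule implemented by `ShouldMutateBatchSizeDirectly` and `RewriteBatchNode` above can be summarized in a small pure-Python model. This is an illustrative sketch only — the real optimizer manipulates `GraphDef` nodes, and the function name and arguments here are hypothetical:

```python
def per_replica_batch_size(batch_size, num_replicas, drop_remainder):
    """Return the mutated batch size, or None when the flat_map path is needed.

    Mirrors the decision above: mutate the batch size directly only when the
    global batch size is statically known, divides num_replicas evenly, and
    drop_remainder is known to be False.
    """
    if (batch_size is not None
            and batch_size % num_replicas == 0
            and not drop_remainder):
        return batch_size // num_replicas
    # Otherwise AppendFlatMap preserves the global batch size and the
    # drop_remainder behavior via from_tensor_slices(...).batch(...).
    return None

assert per_replica_batch_size(64, 8, drop_remainder=False) == 8
assert per_replica_batch_size(64, 8, drop_remainder=True) is None
assert per_replica_batch_size(None, 8, drop_remainder=False) is None
```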
Status OptimizeGraph(const GrapplerItem& item, int64 num_workers,
Status OptimizeGraph(const GrapplerItem& item, int64 num_replicas,
bool use_fallback, GraphDef* output);
// Helper function that starts from a node in the graph and recurses into its
@ -567,16 +603,16 @@ Status OptimizeGraph(const GrapplerItem& item, int64 num_workers,
// as they are datasets themselves.
// 3. Core dataset ops + Identity op: Recurses into first input parameter.
// 4. FlatMap type mapping dataset ops: Recurses into the function definition.
Status RecursivelyHandleOp(const NodeDef& node, int64 num_workers,
Status RecursivelyHandleOp(const NodeDef& node, int64 num_replicas,
bool use_fallback, FunctionLibraryDefinition* flib,
MutableGraphView* graph) {
if (IsDatasetNodeOfType(node, kBatchDatasetOps)) {
TF_RETURN_IF_ERROR(RewriteBatchNode(node, num_workers, flib, graph));
TF_RETURN_IF_ERROR(RewriteBatchNode(node, num_replicas, flib, graph));
} else if (IsDatasetNodeOfType(node, kMultipleInputsDatasetOps)) {
// For all multiple input datasets, all inputs are datasets themselves.
for (int i = 0; i < node.input_size(); ++i) {
NodeDef* input_node = graph_utils::GetInputNode(node, *graph, i);
TF_RETURN_IF_ERROR(RecursivelyHandleOp(*input_node, num_workers,
TF_RETURN_IF_ERROR(RecursivelyHandleOp(*input_node, num_replicas,
use_fallback, flib, graph));
}
} else if (IsDatasetNodeOfType(node, kPassThroughOps) || IsRetval(node)) {
@ -584,7 +620,7 @@ Status RecursivelyHandleOp(const NodeDef& node, int64 num_workers,
// function body graph in place of function outputs, the input dataset is
// input 0.
NodeDef* input_node = graph_utils::GetInputNode(node, *graph, 0);
TF_RETURN_IF_ERROR(RecursivelyHandleOp(*input_node, num_workers,
TF_RETURN_IF_ERROR(RecursivelyHandleOp(*input_node, num_replicas,
use_fallback, flib, graph));
} else if (IsDatasetNodeOfType(node, kFuncDatasetOps)) {
const string func_name =
@ -594,7 +630,7 @@ Status RecursivelyHandleOp(const NodeDef& node, int64 num_workers,
TF_RETURN_IF_ERROR(MakeGrapplerFunctionItem(
*fdef, *flib, graph->graph()->versions().producer(), &f_item));
GraphDef optimized_func_graph;
TF_RETURN_IF_ERROR(OptimizeGraph(f_item, num_workers, use_fallback,
TF_RETURN_IF_ERROR(OptimizeGraph(f_item, num_replicas, use_fallback,
&optimized_func_graph));
// Function body optimization might have created new specialized
@ -623,7 +659,7 @@ Status RecursivelyHandleOp(const NodeDef& node, int64 num_workers,
}
// If we've successfully updated the batch size of this node or any nodes
// in the dataset tree rooted in this node, we update the output_shapes attr.
TF_RETURN_IF_ERROR(UpdateOutputShapes(node.name(), num_workers, graph));
TF_RETURN_IF_ERROR(UpdateOutputShapes(node.name(), num_replicas, graph));
return Status::OK();
}
@ -649,7 +685,7 @@ Status ReshapeComponent(int new_batch_dim, const string& arg, DataType dtype,
// shape = tf.shape(arg)
NodeDef* shape;
TF_RETURN_IF_ERROR(AddShapeNode({arg, 0, dtype}, fdef, &shape));
TF_RETURN_IF_ERROR(AddShapeNode({arg, 0, dtype}, DT_INT32, fdef, &shape));
// later_dimensions = tf.shape(arg)[1:]
NodeDef* later_dimensions;
@ -689,7 +725,7 @@ Status CreateFlatMapFnWithReshape(int new_batch_dim,
// For each component of the dataset, we reshape it from shape
// (old_batch_size, ...) to (-1, new_batch_size, ...)
// where new_batch_size = (old_batch_size + num_workers - 1) // num_workers
// where new_batch_size = (old_batch_size + num_replicas - 1) // num_replicas
for (int i = 0; i < types.size(); ++i) {
auto* input_arg = function_utils::AddFunctionInput(
strings::StrCat("args_", i), result, types.at(i));
@ -733,13 +769,13 @@ Status CreateFlatMapFnWithReshape(int new_batch_dim,
// return tf.data.Dataset.from_tensor_slices(
// tf.reshape(
// x,
// tf.concat([[-1, old_batch_dim / num_workers], tf.shape(x)[1:]], 0)
// tf.concat([[-1, old_batch_dim / num_replicas], tf.shape(x)[1:]], 0)
// )
// )
//
// dataset = dataset.flat_map(fn)
// ```
Status RebatchWithFallback(const NodeDef* fetch_node, int64 num_workers,
Status RebatchWithFallback(const NodeDef* fetch_node, int64 num_replicas,
FunctionLibraryDefinition* flib,
MutableGraphView* graph) {
if (IsRetval(*fetch_node) || fetch_node->op() == kIdentityOp) {
@ -747,26 +783,6 @@ Status RebatchWithFallback(const NodeDef* fetch_node, int64 num_workers,
fetch_node = graph_utils::GetInputNode(*fetch_node, *graph, 0);
}
// Note: Here, we are conservative with only using the fallback when
// the output_shapes attr has the 0th dimension defined for every component.
// This is because the flat_map_fn will fail if the batch does not divide evenly
// because of the use of the "Reshape" op. This ensures that the error is
// surfaced correctly.
AttrValue output_shapes;
if (!fetch_node->attr().contains(kOutputShapesAttr)) {
return errors::InvalidArgument(
"Cannot use rebatching fallback without output_shapes attr. Node: ",
fetch_node->name(), " Op: ", fetch_node->op());
} else {
output_shapes = fetch_node->attr().at(kOutputShapesAttr);
}
int batch_dim;
TF_RETURN_IF_ERROR(GetBatchDim(output_shapes, &batch_dim));
if (batch_dim % num_workers != 0) {
return errors::InvalidArgument(
"Cannot use rebatching fallback when batch dimension doesn't divide "
"num_workers evenly.");
}
// Create the flat map fn
FunctionDef flat_map_fn;
@ -778,15 +794,32 @@ Status RebatchWithFallback(const NodeDef* fetch_node, int64 num_workers,
DataTypeVector output_types;
TF_RETURN_IF_ERROR(
graph_utils::GetDatasetOutputTypesAttr(*fetch_node, &output_types));
TF_RETURN_IF_ERROR(CreateFlatMapFnWithReshape(batch_dim / num_workers,
output_types, &flat_map_fn));
int64 minibatch_dim;
// If the batch dimension is known and perfectly divisible by num_replicas,
// we use a fallback with `tf.reshape` for better performance.
TF_RETURN_IF_ERROR(
GetMinibatchDimForReshape(*fetch_node, num_replicas, &minibatch_dim));
if (minibatch_dim != -1) {
TF_RETURN_IF_ERROR(
CreateFlatMapFnWithReshape(minibatch_dim, output_types, &flat_map_fn));
} else {
TF_RETURN_IF_ERROR(
CreateFlatMapFnWithBatch(output_types, num_replicas, &flat_map_fn));
}
AttrValue output_shapes;
if (fetch_node->attr().contains(kOutputShapesAttr)) {
output_shapes = fetch_node->attr().at(kOutputShapesAttr);
} else {
SetUnknownShapes(output_types.size(), &output_shapes);
}
NodeDef* flat_map_node;
TF_RETURN_IF_ERROR(AddFlatMapNode(strings::StrCat(fetch_node->name(), ":0"),
{}, {}, flat_map_fn, output_shapes,
output_types, flib, graph, &flat_map_node));
TF_RETURN_IF_ERROR(
UpdateOutputShapes(flat_map_node->name(), num_workers, graph));
UpdateOutputShapes(flat_map_node->name(), num_replicas, graph));
TF_RETURN_IF_ERROR(
graph->UpdateFanouts(fetch_node->name(), flat_map_node->name()));
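The reshape fallback described in the pseudocode comment above can be sketched with plain Python lists. This is a hedged illustration of the semantics only — the real code emits Reshape/FlatMap graph nodes and uses `GetMinibatchDimForReshape` to decide whether the reshape path applies:

```python
def reshape_rebatch(batch, num_replicas):
    """Split one global batch into per-replica minibatches, reshape-style."""
    old = len(batch)
    # new_batch_size = (old_batch_size + num_replicas - 1) // num_replicas,
    # matching the comment in CreateFlatMapFnWithReshape.
    new = (old + num_replicas - 1) // num_replicas
    # The reshape trick only works when the batch divides evenly; the real
    # fallback checks this and otherwise re-batches element by element.
    assert old % new == 0
    return [batch[i:i + new] for i in range(0, old, new)]

assert reshape_rebatch(list(range(8)), 4) == [[0, 1], [2, 3], [4, 5], [6, 7]]
assert reshape_rebatch(list(range(6)), 3) == [[0, 1], [2, 3], [4, 5]]
```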
@ -797,7 +830,7 @@ Status RebatchWithFallback(const NodeDef* fetch_node, int64 num_workers,
// Helper function that given a GrapplerItem generates a mutated graph def
// with the batch size changed. The GrapplerItem could be generated from the
// main graph or could be a function graph.
Status OptimizeGraph(const GrapplerItem& item, int64 num_workers,
Status OptimizeGraph(const GrapplerItem& item, int64 num_replicas,
bool use_fallback, GraphDef* output) {
*output = item.graph;
MutableGraphView graph(output);
@ -807,8 +840,8 @@ Status OptimizeGraph(const GrapplerItem& item, int64 num_workers,
NodeDef* sink_node;
TF_RETURN_IF_ERROR(graph_utils::GetFetchNode(graph, item, &sink_node));
Status s =
RecursivelyHandleOp(*sink_node, num_workers, use_fallback, &flib, &graph);
Status s = RecursivelyHandleOp(*sink_node, num_replicas, use_fallback, &flib,
&graph);
if (!s.ok()) {
if (use_fallback) {
VLOG(1) << "Failed to rebatch by rewriting the batch transformation ("
@ -818,7 +851,7 @@ Status OptimizeGraph(const GrapplerItem& item, int64 num_workers,
*output = item.graph;
graph = MutableGraphView(output);
TF_RETURN_IF_ERROR(
RebatchWithFallback(sink_node, num_workers, &flib, &graph));
RebatchWithFallback(sink_node, num_replicas, &flib, &graph));
} else {
// Return the error
return s;
@ -837,7 +870,7 @@ Status RebatchOptimizer::OptimizeAndCollectStats(Cluster* cluster,
*output = item.graph;
MutableGraphView graph(output);
TF_RETURN_IF_ERROR(OptimizeGraph(item, num_workers_, use_fallback_, output));
TF_RETURN_IF_ERROR(OptimizeGraph(item, num_replicas_, use_fallback_, output));
stats->num_changes++;
return Status::OK();
}


@ -23,7 +23,7 @@ namespace tensorflow {
namespace grappler {
// This optimizer changes the batch size of the output dataset by dividing the
// current batch size by parameter `num_workers`. Currently, this works only
// current batch size by parameter `num_replicas`. Currently, this works only
// for very simple pipelines with a single BatchDatasetV2 transformation.
class RebatchOptimizer : public TFDataOptimizerBase {
public:
@ -43,7 +43,7 @@ class RebatchOptimizer : public TFDataOptimizerBase {
const GraphDef& optimize_output, double result) override;
private:
int64 num_workers_;
int64 num_replicas_;
bool use_fallback_;
};


@ -2265,7 +2265,11 @@ Status LayoutOptimizer::Optimize(Cluster* cluster, const GrapplerItem& item,
config.no_gemm = true;
// TODO(yaozhang): Enable tuning with various TuningConfig choices with
// the measurement-based estimator.
return Tune(item, graph_properties, config, output);
Status status = Tune(item, graph_properties, config, output);
if (!status.ok()) {
*output = item.graph;
}
return status;
}
void LayoutOptimizer::Feedback(Cluster* cluster, const GrapplerItem& item,


@ -203,7 +203,7 @@ TEST_F(PinToHostOptimizerTest, Identity) {
// If CUDA, then there is a GPU kernel registration that is pinned to Host
// memory. Consequently, `b` will be mapped to Host correctly if there is
// a GPU kernel registered.
#if GOOGLE_CUDA
#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
EXPECT_EQ(node.device(), "/device:CPU:0");
#else
EXPECT_TRUE(node.device().empty());


@ -5533,6 +5533,24 @@ tf_kernel_library(
deps = STRING_DEPS,
)
tf_cc_test(
name = "as_string_op_test",
size = "small",
srcs = ["as_string_op_test.cc"],
deps = [
":as_string_op",
":ops_testutil",
":ops_util",
"//tensorflow/core:core_cpu",
"//tensorflow/core:framework",
"//tensorflow/core:lib",
"//tensorflow/core:protos_all_cc",
"//tensorflow/core:test",
"//tensorflow/core:test_main",
"//tensorflow/core:testlib",
],
)
tf_kernel_library(
name = "unicode_ops",
prefix = "unicode_ops",
@ -6174,6 +6192,7 @@ filegroup(
"scatter_nd_op.h",
"scatter_nd_op_cpu_impl.h",
"segment_reduction_ops.h",
"segment_reduction_ops_impl.h",
"softplus_op.h",
"softsign_op.h",
"spacetobatch_functor.h",
@ -6370,7 +6389,11 @@ filegroup(
"scatter_nd_op_cpu_impl_5.cc",
"scatter_nd_op_cpu_impl_6.cc",
"scatter_nd_op_cpu_impl_7.cc",
"segment_reduction_ops.cc",
"segment_reduction_ops_impl_1.cc",
"segment_reduction_ops_impl_2.cc",
"segment_reduction_ops_impl_3.cc",
"segment_reduction_ops_impl_4.cc",
"segment_reduction_ops_impl_5.cc",
"session_ops.cc",
"softplus_op.cc",
"softsign_op.cc",
@ -7944,6 +7967,7 @@ cc_library(
"cwise_ops_gpu_common.cu.h",
"cwise_ops_gpu_gradients.cu.h",
"cwise_ops_gradients.h",
"fill_functor.h",
"meta_support.h",
],
deps = [


@ -65,9 +65,26 @@ class AsStringOp : public OpKernel {
OP_REQUIRES(ctx, !(scientific && shortest),
errors::InvalidArgument(
"Cannot select both scientific and shortest notation"));
format_ = "%";
if (!fill_string.empty()) {
switch (fill_string[0]) {
case ' ':
case '+':
case '-':
case '0':
case '#':
strings::Appendf(&format_, "%s", fill_string.c_str());
break;
default:
bool fill_not_supported = true;
OP_REQUIRES(ctx, !fill_not_supported,
errors::InvalidArgument("Fill argument not supported: \"",
fill_string, "\""));
}
}
if (width > -1) {
strings::Appendf(&format_, "%s%d", fill_string.c_str(), width);
strings::Appendf(&format_, "%d", width);
}
if (precision > -1) {
strings::Appendf(&format_, ".%d", precision);
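The format string this kernel assembles is a printf-style conversion spec. A rough Python model of the new order of operations — validate the fill character first, then append width and precision — follows; the function and its `conv` parameter are illustrative, not part of the kernel:

```python
def build_format(fill="", width=-1, precision=-1, conv="d"):
    """Model of the AsString format-string construction after the change."""
    allowed = {" ", "+", "-", "0", "#"}
    fmt = "%"
    if fill:
        # Only printf flag characters are accepted as fill; anything else
        # is rejected up front instead of being spliced into the format.
        if fill[0] not in allowed:
            raise ValueError('Fill argument not supported: "%s"' % fill)
        fmt += fill[0]
    if width > -1:
        fmt += str(width)
    if precision > -1:
        fmt += ".%d" % precision
    return fmt + conv

assert build_format(fill="0", width=4) == "%04d"
assert build_format(width=5, precision=2, conv="f") == "%5.2f"
# Matches the FillWithZero test below: width 4, zero fill.
assert ("%04d" % -42) == "-042"
```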


@ -0,0 +1,245 @@
/* Copyright 2020 The TensorFlow Authors. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
==============================================================================*/
#include "tensorflow/core/framework/fake_input.h"
#include "tensorflow/core/framework/node_def_builder.h"
#include "tensorflow/core/framework/tensor.h"
#include "tensorflow/core/framework/tensor_testutil.h"
#include "tensorflow/core/framework/types.h"
#include "tensorflow/core/kernels/ops_testutil.h"
#include "tensorflow/core/kernels/ops_util.h"
#include "tensorflow/core/lib/core/status_test_util.h"
namespace tensorflow {
namespace {
class AsStringGraphTest : public OpsTestBase {
protected:
Status Init(DataType input_type, const string& fill = "", int width = -1,
int precision = -1, bool scientific = false,
bool shortest = false) {
TF_CHECK_OK(NodeDefBuilder("op", "AsString")
.Input(FakeInput(input_type))
.Attr("fill", fill)
.Attr("precision", precision)
.Attr("scientific", scientific)
.Attr("shortest", shortest)
.Attr("width", width)
.Finalize(node_def()));
return InitOp();
}
};
TEST_F(AsStringGraphTest, Int8) {
TF_ASSERT_OK(Init(DT_INT8));
AddInputFromArray<int8>(TensorShape({3}), {-42, 0, 42});
TF_ASSERT_OK(RunOpKernel());
Tensor expected(allocator(), DT_STRING, TensorShape({3}));
test::FillValues<tstring>(&expected, {"-42", "0", "42"});
test::ExpectTensorEqual<tstring>(expected, *GetOutput(0));
}
TEST_F(AsStringGraphTest, Int64) {
TF_ASSERT_OK(Init(DT_INT64));
AddInputFromArray<int64>(TensorShape({3}), {-42, 0, 42});
TF_ASSERT_OK(RunOpKernel());
Tensor expected(allocator(), DT_STRING, TensorShape({3}));
test::FillValues<tstring>(&expected, {"-42", "0", "42"});
test::ExpectTensorEqual<tstring>(expected, *GetOutput(0));
}
TEST_F(AsStringGraphTest, FloatDefault) {
TF_ASSERT_OK(Init(DT_FLOAT));
AddInputFromArray<float>(TensorShape({4}), {-42, 0, 3.14159, 42});
TF_ASSERT_OK(RunOpKernel());
Tensor expected(allocator(), DT_STRING, TensorShape({4}));
test::FillValues<tstring>(
&expected, {"-42.000000", "0.000000", "3.141590", "42.000000"});
test::ExpectTensorEqual<tstring>(expected, *GetOutput(0));
}
TEST_F(AsStringGraphTest, FloatScientific) {
TF_ASSERT_OK(Init(DT_FLOAT, /*fill=*/"", /*width=*/-1, /*precision=*/-1,
/*scientific=*/true));
AddInputFromArray<float>(TensorShape({4}), {-42, 0, 3.14159, 42});
TF_ASSERT_OK(RunOpKernel());
Tensor expected(allocator(), DT_STRING, TensorShape({4}));
test::FillValues<tstring>(&expected, {"-4.200000e+01", "0.000000e+00",
"3.141590e+00", "4.200000e+01"});
test::ExpectTensorEqual<tstring>(expected, *GetOutput(0));
}
TEST_F(AsStringGraphTest, FloatShortest) {
TF_ASSERT_OK(Init(DT_FLOAT, /*fill=*/"", /*width=*/-1, /*precision=*/-1,
/*scientific=*/false, /*shortest=*/true));
AddInputFromArray<float>(TensorShape({4}), {-42, 0, 3.14159, 42});
TF_ASSERT_OK(RunOpKernel());
Tensor expected(allocator(), DT_STRING, TensorShape({4}));
test::FillValues<tstring>(&expected, {"-42", "0", "3.14159", "42"});
test::ExpectTensorEqual<tstring>(expected, *GetOutput(0));
}
TEST_F(AsStringGraphTest, FloatPrecisionOnly) {
TF_ASSERT_OK(Init(DT_FLOAT, /*fill=*/"", /*width=*/-1, /*precision=*/2));
AddInputFromArray<float>(TensorShape({4}), {-42, 0, 3.14159, 42});
TF_ASSERT_OK(RunOpKernel());
Tensor expected(allocator(), DT_STRING, TensorShape({4}));
test::FillValues<tstring>(&expected, {"-42.00", "0.00", "3.14", "42.00"});
test::ExpectTensorEqual<tstring>(expected, *GetOutput(0));
}
TEST_F(AsStringGraphTest, FloatWidthOnly) {
TF_ASSERT_OK(Init(DT_FLOAT, /*fill=*/"", /*width=*/5));
AddInputFromArray<float>(TensorShape({4}), {-42, 0, 3.14159, 42});
TF_ASSERT_OK(RunOpKernel());
Tensor expected(allocator(), DT_STRING, TensorShape({4}));
test::FillValues<tstring>(
&expected, {"-42.000000", "0.000000", "3.141590", "42.000000"});
test::ExpectTensorEqual<tstring>(expected, *GetOutput(0));
}
TEST_F(AsStringGraphTest, Float_5_2_Format) {
TF_ASSERT_OK(Init(DT_FLOAT, /*fill=*/"", /*width=*/5, /*precision=*/2));
AddInputFromArray<float>(TensorShape({4}), {-42, 0, 3.14159, 42});
TF_ASSERT_OK(RunOpKernel());
Tensor expected(allocator(), DT_STRING, TensorShape({4}));
test::FillValues<tstring>(&expected, {"-42.00", " 0.00", " 3.14", "42.00"});
test::ExpectTensorEqual<tstring>(expected, *GetOutput(0));
}
TEST_F(AsStringGraphTest, Complex) {
TF_ASSERT_OK(Init(DT_COMPLEX64, /*fill=*/"", /*width=*/5, /*precision=*/2));
AddInputFromArray<complex64>(TensorShape({3}), {{-4, 2}, {0}, {3.14159, -1}});
TF_ASSERT_OK(RunOpKernel());
Tensor expected(allocator(), DT_STRING, TensorShape({3}));
test::FillValues<tstring>(
&expected, {"(-4.00, 2.00)", "( 0.00, 0.00)", "( 3.14,-1.00)"});
test::ExpectTensorEqual<tstring>(expected, *GetOutput(0));
}
TEST_F(AsStringGraphTest, Bool) {
TF_ASSERT_OK(Init(DT_BOOL));
AddInputFromArray<bool>(TensorShape({2}), {true, false});
TF_ASSERT_OK(RunOpKernel());
Tensor expected(allocator(), DT_STRING, TensorShape({2}));
test::FillValues<tstring>(&expected, {"true", "false"});
test::ExpectTensorEqual<tstring>(expected, *GetOutput(0));
}
TEST_F(AsStringGraphTest, String) {
Status s = Init(DT_STRING);
ASSERT_EQ(error::INVALID_ARGUMENT, s.code());
ASSERT_TRUE(absl::StrContains(
s.error_message(),
"Value for attr 'T' of string is not in the list of allowed values"));
}
TEST_F(AsStringGraphTest, OnlyOneOfScientificAndShortest) {
Status s = Init(DT_FLOAT, /*fill=*/"", /*width=*/-1, /*precision=*/-1,
/*scientific=*/true, /*shortest=*/true);
ASSERT_EQ(error::INVALID_ARGUMENT, s.code());
ASSERT_TRUE(
absl::StrContains(s.error_message(),
"Cannot select both scientific and shortest notation"));
}
TEST_F(AsStringGraphTest, NoShortestForNonFloat) {
Status s = Init(DT_INT32, /*fill=*/"", /*width=*/-1, /*precision=*/-1,
/*scientific=*/false, /*shortest=*/true);
ASSERT_EQ(error::INVALID_ARGUMENT, s.code());
ASSERT_TRUE(absl::StrContains(
s.error_message(),
"scientific and shortest format not supported for datatype"));
}
TEST_F(AsStringGraphTest, NoScientificForNonFloat) {
Status s = Init(DT_INT32, /*fill=*/"", /*width=*/-1, /*precision=*/-1,
/*scientific=*/true);
ASSERT_EQ(error::INVALID_ARGUMENT, s.code());
ASSERT_TRUE(absl::StrContains(
s.error_message(),
"scientific and shortest format not supported for datatype"));
}
TEST_F(AsStringGraphTest, NoPrecisionForNonFloat) {
Status s = Init(DT_INT32, /*fill=*/"", /*width=*/-1, /*precision=*/5);
ASSERT_EQ(error::INVALID_ARGUMENT, s.code());
ASSERT_TRUE(absl::StrContains(s.error_message(),
"precision not supported for datatype"));
}
TEST_F(AsStringGraphTest, LongFill) {
Status s = Init(DT_INT32, /*fill=*/"asdf");
ASSERT_EQ(error::INVALID_ARGUMENT, s.code());
ASSERT_TRUE(absl::StrContains(s.error_message(),
"Fill string must be one or fewer characters"));
}
TEST_F(AsStringGraphTest, FillWithZero) {
TF_ASSERT_OK(Init(DT_INT64, /*fill=*/"0", /*width=*/4));
AddInputFromArray<int64>(TensorShape({3}), {-42, 0, 42});
TF_ASSERT_OK(RunOpKernel());
Tensor expected(allocator(), DT_STRING, TensorShape({3}));
test::FillValues<tstring>(&expected, {"-042", "0000", "0042"});
test::ExpectTensorEqual<tstring>(expected, *GetOutput(0));
}
TEST_F(AsStringGraphTest, FillWithSpace) {
TF_ASSERT_OK(Init(DT_INT64, /*fill=*/" ", /*width=*/4));
AddInputFromArray<int64>(TensorShape({3}), {-42, 0, 42});
TF_ASSERT_OK(RunOpKernel());
Tensor expected(allocator(), DT_STRING, TensorShape({3}));
test::FillValues<tstring>(&expected, {" -42", " 0", " 42"});
test::ExpectTensorEqual<tstring>(expected, *GetOutput(0));
}
TEST_F(AsStringGraphTest, FillWithChar1) {
TF_ASSERT_OK(Init(DT_INT64, /*fill=*/"-", /*width=*/4));
AddInputFromArray<int64>(TensorShape({3}), {-42, 0, 42});
TF_ASSERT_OK(RunOpKernel());
Tensor expected(allocator(), DT_STRING, TensorShape({3}));
test::FillValues<tstring>(&expected, {"-42 ", "0 ", "42 "});
test::ExpectTensorEqual<tstring>(expected, *GetOutput(0));
}
TEST_F(AsStringGraphTest, FillWithChar3) {
Status s = Init(DT_INT32, /*fill=*/"s");
ASSERT_EQ(error::INVALID_ARGUMENT, s.code());
ASSERT_TRUE(
absl::StrContains(s.error_message(), "Fill argument not supported"));
}
TEST_F(AsStringGraphTest, FillWithChar4) {
Status s = Init(DT_INT32, /*fill=*/"n");
ASSERT_EQ(error::INVALID_ARGUMENT, s.code());
ASSERT_TRUE(
absl::StrContains(s.error_message(), "Fill argument not supported"));
}
} // end namespace
} // end namespace tensorflow


@ -109,7 +109,7 @@ class BoostedTreesTrainingPredictOp : public OpKernel {
auto do_work = [&resource, &batch_bucketized_features, &cached_tree_ids,
&cached_node_ids, &output_partial_logits,
&output_node_ids, latest_tree,
this](int32 start, int32 end) {
this](int64 start, int64 end) {
for (int32 i = start; i < end; ++i) {
int32 tree_id = cached_tree_ids(i);
int32 node_id = cached_node_ids(i);
@ -227,7 +227,7 @@ class BoostedTreesPredictOp : public OpKernel {
const int32 last_tree = resource->num_trees() - 1;
auto do_work = [&resource, &batch_bucketized_features, &output_logits,
last_tree, this](int32 start, int32 end) {
last_tree, this](int64 start, int64 end) {
for (int32 i = start; i < end; ++i) {
std::vector<float> tree_logits(logits_dimension_, 0.0);
int32 tree_id = 0;
@ -332,7 +332,7 @@ class BoostedTreesExampleDebugOutputsOp : public OpKernel {
// path. Note: feature_ids has one less value than logits_path because the
// first value of each logit path will be the bias.
auto do_work = [&resource, &batch_bucketized_features, &output_debug_info,
last_tree](int32 start, int32 end) {
last_tree](int64 start, int64 end) {
for (int32 i = start; i < end; ++i) {
// Proto to store debug outputs, per example.
boosted_trees::DebugOutput example_debug_info;


@ -264,6 +264,7 @@ class BoostedTreesFlushQuantileSummariesOp : public OpKernel {
*context->device()->tensorflow_cpu_worker_threads();
Shard(worker_threads.num_threads, worker_threads.workers, num_features_,
kCostPerUnit, do_quantile_summary_gen);
stream_resource->ResetStreams();
}
private:
@ -424,6 +425,7 @@ class BoostedTreesQuantileStreamResourceFlushOp : public OpKernel {
Shard(worker_threads.num_threads, worker_threads.workers, num_streams,
kCostPerUnit, do_quantile_flush);
stream_resource->ResetStreams();
stream_resource->set_buckets_ready(true);
}
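The `ResetStreams()` calls added after each `Shard(...)` flush can be modeled with a toy resource class. This is a hypothetical sketch of the intent — after flushing, the quantile streams are recreated so later additions don't push into an already-finalized stream:

```python
class QuantileStreamResource:
    """Toy model of BoostedTreesQuantileStreamResource's reset-on-flush fix."""

    def __init__(self, num_streams):
        self.num_streams = num_streams
        self.reset_streams()

    def reset_streams(self):
        # Fresh, unfinalized streams (modeled here as empty lists).
        self.streams = [[] for _ in range(self.num_streams)]

    def flush(self):
        summaries = [list(s) for s in self.streams]
        # The added call: streams are usable again after a flush.
        self.reset_streams()
        return summaries

r = QuantileStreamResource(2)
r.streams[0].append(1.0)
assert r.flush() == [[1.0], []]
assert r.streams == [[], []]  # streams were reset, ready for reuse
```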


@ -67,6 +67,14 @@ class BoostedTreesQuantileStreamResource : public ResourceBase {
are_buckets_ready_ = are_buckets_ready;
}
void ResetStreams() {
streams_.clear();
streams_.reserve(num_streams_);
for (int64 idx = 0; idx < num_streams_; ++idx) {
streams_.push_back(QuantileStream(epsilon_, max_elements_));
}
}
private:
~BoostedTreesQuantileStreamResource() override {}


@ -1001,6 +1001,10 @@ class FusedConv2DWithBatchNormOpTest : public FusedConv2DOpTest<T> {};
TYPED_TEST_SUITE_P(FusedConv2DWithBiasOpTest);
TYPED_TEST_SUITE_P(FusedConv2DWithBatchNormOpTest);
// ROCm does not yet support the _FusedConv2D op; therefore, disable tests
// that check _FusedConv2D when building with ROCm.
#ifndef TENSORFLOW_USE_ROCM
// -------------------------------------------------------------------------- //
// Conv2D + BiasAdd + {Activation} //
// -------------------------------------------------------------------------- //
@ -1165,4 +1169,5 @@ using FusedBatchNormDataTypes = ::testing::Types<float>;
INSTANTIATE_TYPED_TEST_SUITE_P(Test, FusedConv2DWithBatchNormOpTest,
FusedBatchNormDataTypes);
#endif // TENSORFLOW_USE_ROCM
} // namespace tensorflow


@ -57,11 +57,23 @@ BinaryOpShared::BinaryOpState::BinaryOpState(OpKernelContext* ctx)
in1(ctx->input(1)),
bcast(BCast::FromShape(in0.shape()), BCast::FromShape(in1.shape())) {
if (!bcast.IsValid()) {
bool incompatible_shape_error;
bool has_attr =
GetNodeAttrSimple(ctx->op_kernel().def(), "incompatible_shape_error",
&(incompatible_shape_error));
if (has_attr && !incompatible_shape_error) {
const string& op = ctx->op_kernel().type_string();
OP_REQUIRES_OK(ctx, ctx->allocate_output(0, TensorShape({}), &out));
result = (op == "NotEqual");
return;
}
ctx->SetStatus(errors::InvalidArgument(
"Incompatible shapes: ", in0.shape().DebugString(), " vs. ",
in1.shape().DebugString()));
return;
}
const TensorShape output_shape = BCast::ToShape(bcast.output_shape());
out_num_elements = output_shape.num_elements();
in0_num_elements = in0.NumElements();
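The new early-return path in `BinaryOpState` can be summarized in a few lines of Python. This is a hedged model of the behavior, not the kernel itself: when broadcasting fails and the op was built with `incompatible_shape_error=False`, a scalar result is produced instead of an `InvalidArgument` error — `True` for `NotEqual`, `False` otherwise:

```python
def broadcastless_compare(op, broadcast_ok, incompatible_shape_error=True):
    """Model of the shape-mismatch fallback for Equal/NotEqual kernels."""
    if not broadcast_ok:
        if not incompatible_shape_error:
            # Incompatible shapes: the tensors are trivially "not equal",
            # so NotEqual yields True and Equal yields False, as scalars.
            return op == "NotEqual"
        raise ValueError("Incompatible shapes")
    return None  # normal broadcast path, not modeled here

assert broadcastless_compare("NotEqual", False, incompatible_shape_error=False) is True
assert broadcastless_compare("Equal", False, incompatible_shape_error=False) is False
```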


@ -26,13 +26,13 @@ limitations under the License.
#include "tensorflow/core/kernels/cwise_ops_sycl_common.h"
#endif
#include "tensorflow/core/kernels/cwise_ops.h"
#include "tensorflow/core/kernels/cwise_ops_gradients.h"
#include "tensorflow/core/framework/op.h"
#include "tensorflow/core/framework/op_kernel.h"
#include "tensorflow/core/framework/tensor_types.h"
#include "tensorflow/core/framework/variant_op_registry.h"
#include "tensorflow/core/kernels/cwise_ops.h"
#include "tensorflow/core/kernels/cwise_ops_gradients.h"
#include "tensorflow/core/kernels/fill_functor.h"
#include "tensorflow/core/platform/logging.h"
#include "tensorflow/core/util/bcast.h"
@ -56,7 +56,7 @@ class BinaryOpShared : public OpKernel {
// in-place computation.
// Caller must check ctx->status() upon return for non-ok status.
// If ctx->status().ok() is true, then out is guaranteed to be allocated.
BinaryOpState(OpKernelContext* ctx);
explicit BinaryOpState(OpKernelContext* ctx);
const Tensor& in0;
const Tensor& in1;
@ -69,6 +69,7 @@ class BinaryOpShared : public OpKernel {
int64 in1_num_elements;
int ndims;
bool result;
};
void SetUnimplementedError(OpKernelContext* ctx);
@ -91,16 +92,29 @@ class BinaryOp : public BinaryOpShared {
void Compute(OpKernelContext* ctx) override {
// 'state': Shared helper not dependent on T to reduce code size
BinaryOpState state(ctx);
if (!ctx->status().ok()) return;
auto& bcast = state.bcast;
const Device& eigen_device = ctx->eigen_device<Device>();
Tensor* out = state.out;
BCast* bcast = &state.bcast;
if (!bcast.IsValid()) {
if (ctx->status().ok()) {
if (state.result) {
functor::SetOneFunctor<Device, bool>()(eigen_device,
out->flat<bool>());
} else {
functor::SetZeroFunctor<Device, bool>()(eigen_device,
out->flat<bool>());
}
}
return;
}
auto& in0 = state.in0;
auto& in1 = state.in1;
if (state.out_num_elements == 0) {
return;
}
const int ndims = state.ndims;
const Device& eigen_device = ctx->eigen_device<Device>();
bool error = false;
     bool* const error_ptr = Functor::has_errors ? &error : nullptr;
     if (ndims <= 1) {
@@ -122,32 +136,32 @@ class BinaryOp : public BinaryOpShared {
       }
     } else if (ndims == 2) {
       functor::BinaryFunctor<Device, Functor, 2>().BCast(
-          eigen_device, out->shaped<Tout, 2>(bcast->result_shape()),
-          in0.template shaped<Tin, 2>(bcast->x_reshape()),
-          BCast::ToIndexArray<2>(bcast->x_bcast()),
-          in1.template shaped<Tin, 2>(bcast->y_reshape()),
-          BCast::ToIndexArray<2>(bcast->y_bcast()), error_ptr);
+          eigen_device, out->shaped<Tout, 2>(bcast.result_shape()),
+          in0.template shaped<Tin, 2>(bcast.x_reshape()),
+          BCast::ToIndexArray<2>(bcast.x_bcast()),
+          in1.template shaped<Tin, 2>(bcast.y_reshape()),
+          BCast::ToIndexArray<2>(bcast.y_bcast()), error_ptr);
     } else if (ndims == 3) {
       functor::BinaryFunctor<Device, Functor, 3>().BCast(
-          eigen_device, out->shaped<Tout, 3>(bcast->result_shape()),
-          in0.template shaped<Tin, 3>(bcast->x_reshape()),
-          BCast::ToIndexArray<3>(bcast->x_bcast()),
-          in1.template shaped<Tin, 3>(bcast->y_reshape()),
-          BCast::ToIndexArray<3>(bcast->y_bcast()), error_ptr);
+          eigen_device, out->shaped<Tout, 3>(bcast.result_shape()),
+          in0.template shaped<Tin, 3>(bcast.x_reshape()),
+          BCast::ToIndexArray<3>(bcast.x_bcast()),
+          in1.template shaped<Tin, 3>(bcast.y_reshape()),
+          BCast::ToIndexArray<3>(bcast.y_bcast()), error_ptr);
     } else if (ndims == 4) {
       functor::BinaryFunctor<Device, Functor, 4>().BCast(
-          eigen_device, out->shaped<Tout, 4>(bcast->result_shape()),
-          in0.template shaped<Tin, 4>(bcast->x_reshape()),
-          BCast::ToIndexArray<4>(bcast->x_bcast()),
-          in1.template shaped<Tin, 4>(bcast->y_reshape()),
-          BCast::ToIndexArray<4>(bcast->y_bcast()), error_ptr);
+          eigen_device, out->shaped<Tout, 4>(bcast.result_shape()),
+          in0.template shaped<Tin, 4>(bcast.x_reshape()),
+          BCast::ToIndexArray<4>(bcast.x_bcast()),
+          in1.template shaped<Tin, 4>(bcast.y_reshape()),
+          BCast::ToIndexArray<4>(bcast.y_bcast()), error_ptr);
     } else if (ndims == 5) {
       functor::BinaryFunctor<Device, Functor, 5>().BCast(
-          eigen_device, out->shaped<Tout, 5>(bcast->result_shape()),
-          in0.template shaped<Tin, 5>(bcast->x_reshape()),
-          BCast::ToIndexArray<5>(bcast->x_bcast()),
-          in1.template shaped<Tin, 5>(bcast->y_reshape()),
-          BCast::ToIndexArray<5>(bcast->y_bcast()), error_ptr);
+          eigen_device, out->shaped<Tout, 5>(bcast.result_shape()),
+          in0.template shaped<Tin, 5>(bcast.x_reshape()),
+          BCast::ToIndexArray<5>(bcast.x_bcast()),
+          in1.template shaped<Tin, 5>(bcast.y_reshape()),
+          BCast::ToIndexArray<5>(bcast.y_bcast()), error_ptr);
     } else {
       SetUnimplementedError(ctx);
     }


@@ -36,14 +36,15 @@ class RebatchDatasetOp : public UnaryDatasetOpKernel {
  protected:
   void MakeDataset(OpKernelContext* ctx, DatasetBase* input,
                    DatasetBase** output) override {
-    int64 num_workers;
-    OP_REQUIRES_OK(ctx, ParseScalarArgument(ctx, "num_workers", &num_workers));
+    int64 num_replicas;
+    OP_REQUIRES_OK(ctx,
+                   ParseScalarArgument(ctx, "num_replicas", &num_replicas));
     OP_REQUIRES(
-        ctx, num_workers > 0,
-        errors::InvalidArgument("num_workers must be greater than zero."));
+        ctx, num_replicas > 0,
+        errors::InvalidArgument("num_replicas must be greater than zero."));
-    auto config_factory = [num_workers, this]() {
-      return CreateConfig(num_workers, this->use_fallback_);
+    auto config_factory = [num_replicas, this]() {
+      return CreateConfig(num_replicas, this->use_fallback_);
     };
     // We only want to optimize functions for some particular datasets like
@@ -56,17 +57,17 @@ class RebatchDatasetOp : public UnaryDatasetOpKernel {
   }
  private:
-  static RewriterConfig CreateConfig(int64 num_workers, bool use_fallback) {
+  static RewriterConfig CreateConfig(int64 num_replicas, bool use_fallback) {
     RewriterConfig rewriter_config;
     rewriter_config.set_fail_on_optimizer_errors(true);
     rewriter_config.add_optimizers(kOptimizerName);
     rewriter_config.set_meta_optimizer_iterations(RewriterConfig::ONE);
     auto custom_optimizer = rewriter_config.add_custom_optimizers();
     custom_optimizer->set_name(kOptimizerName);
-    AttrValue num_workers_attr;
-    num_workers_attr.set_i(num_workers);
-    (*custom_optimizer->mutable_parameter_map())["num_workers"] =
-        num_workers_attr;
+    AttrValue num_replicas_attr;
+    num_replicas_attr.set_i(num_replicas);
+    (*custom_optimizer->mutable_parameter_map())["num_replicas"] =
+        num_replicas_attr;
     AttrValue use_fallback_attr;
     use_fallback_attr.set_b(use_fallback);
     (*custom_optimizer->mutable_parameter_map())["use_fallback"] =


@@ -81,8 +81,16 @@ AnonymousRandomSeedGeneratorHandleOp::AnonymousRandomSeedGeneratorHandleOp(
     : AnonymousResourceOp<RandomSeedGenerator>(ctx) {}
 void AnonymousRandomSeedGeneratorHandleOp::Compute(OpKernelContext* ctx) {
-  OP_REQUIRES_OK(ctx, ParseScalarArgument<int64>(ctx, kSeed, &seed_));
-  OP_REQUIRES_OK(ctx, ParseScalarArgument<int64>(ctx, kSeed2, &seed2_));
+  int64 seed;
+  OP_REQUIRES_OK(ctx, ParseScalarArgument<int64>(ctx, kSeed, &seed));
+  int64 seed2;
+  OP_REQUIRES_OK(ctx, ParseScalarArgument<int64>(ctx, kSeed2, &seed2));
+  if (seed == 0 && seed2 == 0) {
+    seed = random::New64();
+    seed2 = random::New64();
+  }
+  seed_ = seed;
+  seed2_ = seed2;
   AnonymousResourceOp<RandomSeedGenerator>::Compute(ctx);
 }
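The hunk above changes the handle op so that a `(0, 0)` seed pair means "no seed given" and fresh nondeterministic seeds are drawn instead of seeding every generator identically. A minimal standalone sketch of that selection rule, where `NewRandom64` and `ChooseSeeds` are hypothetical stand-ins (TensorFlow uses `random::New64`, which is not reproduced here):

```cpp
#include <cstdint>
#include <random>
#include <utility>

// Stand-in for TensorFlow's random::New64: a nondeterministic 64-bit value.
static int64_t NewRandom64() {
  static std::random_device rd;
  return (static_cast<int64_t>(rd()) << 32) ^ static_cast<int64_t>(rd());
}

// Mirrors the rule in the hunk: keep user-provided seeds, but if both are
// zero, substitute fresh random seeds.
static std::pair<int64_t, int64_t> ChooseSeeds(int64_t seed, int64_t seed2) {
  if (seed == 0 && seed2 == 0) {
    seed = NewRandom64();
    seed2 = NewRandom64();
  }
  return {seed, seed2};
}
```

Note that a single zero seed (e.g. `(0, 5)`) is still honored verbatim; only the all-zero pair triggers the fallback.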


@@ -18,16 +18,52 @@ limitations under the License.
 #define EIGEN_USE_THREADS
 #include "tensorflow/core/kernels/data_format_ops.h"
+#include <map>
 #include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"
 #include "tensorflow/core/framework/op_kernel.h"
 #include "tensorflow/core/framework/register_types.h"
 #include "tensorflow/core/framework/tensor.h"
 #include "tensorflow/core/lib/core/errors.h"
 namespace tensorflow {
 typedef Eigen::ThreadPoolDevice CPUDevice;
 typedef Eigen::GpuDevice GPUDevice;
+// Ensure that `src` and `dst` define a valid permutation.
+// Ops defined in this file assume that user specifies a permutation via two
+// string attributes. This check validates that these attributes properly define
+// it to prevent security vulnerabilities.
+static bool IsValidPermutation(const std::string& src, const std::string& dst) {
+  if (src.size() != dst.size()) {
+    return false;
+  }
+  std::map<char, bool> characters;
+  // Every character in `src` must be present only once
+  for (const auto c : src) {
+    if (characters[c]) {
+      return false;
+    }
+    characters[c] = true;
+  }
+  // Every character in `dst` must show up in `src` exactly once
+  for (const auto c : dst) {
+    if (!characters[c]) {
+      return false;
+    }
+    characters[c] = false;
+  }
+  // At this point, characters[] has been switched to true and false exactly
+  // once for all character in `src` (and `dst`) so we have a valid permutation
+  return true;
+}
 template <typename Device, typename T>
 class DataFormatDimMapOp : public OpKernel {
  public:
@@ -37,15 +73,20 @@ class DataFormatDimMapOp : public OpKernel {
     OP_REQUIRES_OK(context, context->GetAttr("src_format", &src_format));
     string dst_format;
     OP_REQUIRES_OK(context, context->GetAttr("dst_format", &dst_format));
-    OP_REQUIRES(context, src_format.size() == 4,
-                errors::InvalidArgument(strings::StrCat(
-                    "Source format must of length 4, received src_format = ",
-                    src_format)));
+    OP_REQUIRES(context, src_format.size() == 4 || src_format.size() == 5,
+                errors::InvalidArgument(
+                    "Source format must be of length 4 or 5, received "
+                    "src_format = ",
+                    src_format));
+    OP_REQUIRES(context, dst_format.size() == 4 || dst_format.size() == 5,
+                errors::InvalidArgument("Destination format must be of length "
+                                        "4 or 5, received dst_format = ",
+                                        dst_format));
     OP_REQUIRES(
-        context, dst_format.size() == 4,
-        errors::InvalidArgument(strings::StrCat(
-            "Destination format must of length 4, received dst_format = ",
-            dst_format)));
+        context, IsValidPermutation(src_format, dst_format),
+        errors::InvalidArgument(
+            "Destination and source format must determine a permutation, got ",
+            src_format, " and ", dst_format));
     dst_idx_ = Tensor(DT_INT32, {static_cast<int64>(src_format.size())});
     for (int i = 0; i < src_format.size(); ++i) {
       for (int j = 0; j < dst_format.size(); ++j) {
@@ -77,8 +118,22 @@ class DataFormatVecPermuteOp : public OpKernel {
       : OpKernel(context) {
     string src_format;
     OP_REQUIRES_OK(context, context->GetAttr("src_format", &src_format));
+    OP_REQUIRES(context, src_format.size() == 4 || src_format.size() == 5,
+                errors::InvalidArgument(
+                    "Source format must be of length 4 or 5, received "
+                    "src_format = ",
+                    src_format));
     string dst_format;
     OP_REQUIRES_OK(context, context->GetAttr("dst_format", &dst_format));
+    OP_REQUIRES(context, dst_format.size() == 4 || dst_format.size() == 5,
+                errors::InvalidArgument("Destination format must be of length "
+                                        "4 or 5, received dst_format = ",
+                                        dst_format));
+    OP_REQUIRES(
+        context, IsValidPermutation(src_format, dst_format),
+        errors::InvalidArgument(
+            "Destination and source format must determine a permutation, got ",
+            src_format, " and ", dst_format));
     src_format_ = src_format;
     dst_format_ = dst_format;
   }
@@ -112,6 +167,24 @@ class DataFormatVecPermuteOp : public OpKernel {
                    context->allocate_output(0, input.shape(), &output));
     // Support 1D and 2D cases.
     Eigen::DSizes<Eigen::DenseIndex, 8> dst_idx;
+    string src_format_str = src_format_;
+    string dst_format_str = dst_format_;
+    if (input.dim_size(0) == 2) {
+      // If the input is a vector of size 2, treat the two elements as spatial
+      // dimensions.
+      auto keep_only_spatial_dimensions = [](string* format_str) -> void {
+        auto new_end = std::remove_if(
+            format_str->begin(), format_str->end(),
+            [](const char dim) { return dim != 'H' && dim != 'W'; });
+        format_str->erase(new_end, format_str->end());
+      };
+      keep_only_spatial_dimensions(&src_format_str);
+      keep_only_spatial_dimensions(&dst_format_str);
+      OP_REQUIRES(context,
+                  src_format_str.size() == 2 && dst_format_str.size() == 2,
+                  errors::InvalidArgument(
+                      "Format specifier must contain H and W for 2D case"));
+    }
     ComputeDstIndex(input.dims(), &dst_idx);
     functor::DataFormatVecPermute<Device, T>()(context->eigen_device<Device>(),
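The `IsValidPermutation` helper introduced above is self-contained and can be exercised outside TensorFlow. The copy below mirrors the hunk, with `std::string` in place of TF's `string` alias; nothing else is changed:

```cpp
#include <map>
#include <string>

// `dst` must be a rearrangement of `src` with no duplicate characters.
static bool IsValidPermutation(const std::string& src, const std::string& dst) {
  if (src.size() != dst.size()) {
    return false;
  }
  std::map<char, bool> characters;
  // Every character in `src` must be present only once.
  for (const auto c : src) {
    if (characters[c]) {
      return false;
    }
    characters[c] = true;
  }
  // Every character in `dst` must show up in `src` exactly once.
  for (const auto c : dst) {
    if (!characters[c]) {
      return false;
    }
    characters[c] = false;
  }
  return true;
}
```

For example, `IsValidPermutation("NHWC", "NCHW")` holds, while a repeated axis such as `"NHHC"` or a character absent from the source is rejected, which is exactly what blocks the malformed-format-string vulnerability.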


@@ -62,6 +62,12 @@ class MemmappedTensorAllocator : public Allocator {
   void set_delete_on_deallocate() { delete_on_deallocate_ = true; }
+  // Make sure tensors or complex types (strings, variants, resources) don't
+  // get their constructor called via a placement new since that would require
+  // writing to immutable data.
+  // See also: tensorflow/core/framework/typed_allocator.h
+  bool AllocatesOpaqueHandle() const override { return true; }
  private:
   std::unique_ptr<ReadOnlyMemoryRegion> memory_region_;
   // If there is an error during allocation we keep it in this status.


@@ -116,7 +116,7 @@ REGISTER_KERNEL_BUILDER(Name("InTopKV2")
                             .TypeConstraint<int64>("T"),
                         InTopK<CPUDevice, float, int64>);
-#if GOOGLE_CUDA
+#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
 // Forward declarations of the functor specializations for GPU.
 namespace functor {
@@ -142,6 +142,6 @@ REGISTER_KERNEL_BUILDER(
     Name("InTopKV2").Device(DEVICE_GPU).TypeConstraint<int64>("T"),
     InTopK<GPUDevice, float, int64>);
-#endif  // GOOGLE_CUDA
+#endif  // GOOGLE_CUDA || TENSORFLOW_USE_ROCM
 }  // namespace tensorflow


@@ -16,9 +16,9 @@ limitations under the License.
 #ifndef TENSORFLOW_CORE_KERNELS_IN_TOPK_OP_H_
 #define TENSORFLOW_CORE_KERNELS_IN_TOPK_OP_H_
-#if GOOGLE_CUDA
+#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
 #define EIGEN_USE_GPU
-#endif  // GOOGLE_CUDA
+#endif  // GOOGLE_CUDA || TENSORFLOW_USE_ROCM
 #include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"
 #include "tensorflow/core/framework/bounds_check.h"


@@ -13,7 +13,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ==============================================================================*/
-#if (defined(GOOGLE_CUDA) && GOOGLE_CUDA)
+#if (defined(GOOGLE_CUDA) && GOOGLE_CUDA) || TENSORFLOW_USE_ROCM
 #define EIGEN_USE_GPU
@@ -41,7 +41,7 @@ __global__ void ComputePredictionMaskKernel(
     const TargetT* targets,  // dims: [ num_targets ]
     int64* mask,             // dims: [ num_targets x num_classes ]
     int num_targets, int num_classes) {
-  CUDA_1D_KERNEL_LOOP(i, num_targets * num_classes) {
+  GPU_1D_KERNEL_LOOP(i, num_targets * num_classes) {
     const int batch_index = i / num_classes;
     TargetT target_idx = ldg(targets + batch_index);
@@ -118,7 +118,7 @@ struct InTopKFunctor<GPUDevice, T, TargetT> {
     const auto& d = context->eigen_device<GPUDevice>();
     // Compute a mask for all predictions.
-    CudaLaunchConfig config = GetGpuLaunchConfig(num_targets * num_classes, d);
+    GpuLaunchConfig config = GetGpuLaunchConfig(num_targets * num_classes, d);
     OP_REQUIRES_OK(
         context, GpuLaunchKernel(ComputePredictionMaskKernel<T, TargetT>,
                                  config.block_count, config.thread_per_block, 0,
@@ -173,4 +173,4 @@ DEFINE_GPU_KERNELS(float, int64);
 }  // end namespace tensorflow
-#endif  // GOOGLE_CUDA
+#endif  // GOOGLE_CUDA || TENSORFLOW_USE_ROCM


@@ -29,7 +29,6 @@ limitations under the License.
 #include <vector>
 #include "mkl_cblas.h"
-#include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"
 #include "tensorflow/core/framework/op.h"
 #include "tensorflow/core/framework/op_kernel.h"
 #include "tensorflow/core/framework/register_types.h"
@@ -41,13 +40,17 @@ limitations under the License.
 #include "tensorflow/core/kernels/fill_functor.h"
 #include "tensorflow/core/platform/logging.h"
 #include "tensorflow/core/platform/types.h"
+#include "tensorflow/core/util/matmul_bcast.h"
 #include "tensorflow/core/util/mkl_util.h"
+#include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"
 namespace tensorflow {
 typedef Eigen::ThreadPoolDevice CPUDevice;
-template <typename Device, typename Scalar>
+// The third parameter v2_bcast is set to true if we are using V2 otherwise
+// we set it to false.
+template <typename Device, typename Scalar, bool v2_bcast>
 class BatchMatMulMkl : public OpKernel {
  public:
  explicit BatchMatMulMkl(OpKernelConstruction *context) : OpKernel(context) {
@@ -60,28 +63,54 @@ class BatchMatMulMkl : public OpKernel {
   void Compute(OpKernelContext *ctx) override {
     const Tensor &lhs = ctx->input(0);
     const Tensor &rhs = ctx->input(1);
-    OP_REQUIRES(ctx, lhs.dims() == rhs.dims(),
-                errors::InvalidArgument("lhs and rhs has different ndims: ",
-                                        lhs.shape().DebugString(), " vs. ",
-                                        rhs.shape().DebugString()));
-    const int ndims = lhs.dims();
-    OP_REQUIRES(
-        ctx, ndims >= 2,
-        errors::InvalidArgument("lhs and rhs ndims must be >= 2: ", ndims));
-    TensorShape out_shape;
-    for (int i = 0; i < ndims - 2; ++i) {
-      OP_REQUIRES(ctx, lhs.dim_size(i) == rhs.dim_size(i),
-                  errors::InvalidArgument(
-                      "lhs.dim(", i, ") and rhs.dim(", i,
-                      ") must be the same: ", lhs.shape().DebugString(), " vs ",
-                      rhs.shape().DebugString()));
-      out_shape.AddDim(lhs.dim_size(i));
+    if (!v2_bcast) {
+      // Using V1, so check to make sure lhs and rhs dimensions are correct and
+      // no broadcasting is needed.
+      OP_REQUIRES(ctx, lhs.dims() == rhs.dims(),
+                  errors::InvalidArgument("lhs and rhs has different ndims: ",
                                          lhs.shape().DebugString(), " vs. ",
+                                          rhs.shape().DebugString()));
+      const int ndims = lhs.dims();
+      OP_REQUIRES(
+          ctx, ndims >= 2,
+          errors::InvalidArgument("lhs and rhs ndims must be >= 2: ", ndims));
+      for (int i = 0; i < ndims - 2; ++i) {
+        OP_REQUIRES(ctx, lhs.dim_size(i) == rhs.dim_size(i),
+                    errors::InvalidArgument("lhs.dim(", i, ") and rhs.dim(", i,
+                                            ") must be the same: ",
+                                            lhs.shape().DebugString(), " vs ",
+                                            rhs.shape().DebugString()));
+      }
+    } else {
+      OP_REQUIRES(
+          ctx, lhs.dims() >= 2,
+          errors::InvalidArgument("In[0] ndims must be >= 2: ", lhs.dims()));
+      OP_REQUIRES(
+          ctx, rhs.dims() >= 2,
+          errors::InvalidArgument("In[1] ndims must be >= 2: ", rhs.dims()));
     }
-    auto batch_size = (ndims == 2) ? 1 : out_shape.num_elements();
-    auto lhs_rows = lhs.dim_size(ndims - 2);
-    auto lhs_cols = lhs.dim_size(ndims - 1);
-    auto rhs_rows = rhs.dim_size(ndims - 2);
-    auto rhs_cols = rhs.dim_size(ndims - 1);
+    // lhs and rhs can have different dimensions
+    const int ndims_lhs = lhs.dims();
+    const int ndims_rhs = rhs.dims();
+    // Get broadcast info
+    MatMulBCast bcast(lhs.shape().dim_sizes(), rhs.shape().dim_sizes());
+    OP_REQUIRES(
+        ctx, bcast.IsValid(),
+        errors::InvalidArgument(
+            "In[0] and In[1] must have compatible batch dimensions: ",
+            lhs.shape().DebugString(), " vs. ", rhs.shape().DebugString()));
+    TensorShape out_shape = bcast.output_batch_shape();
+    auto batch_size = bcast.output_batch_size();
+    auto lhs_rows = lhs.dim_size(ndims_lhs - 2);
+    auto lhs_cols = lhs.dim_size(ndims_lhs - 1);
+    auto rhs_rows = rhs.dim_size(ndims_rhs - 2);
+    auto rhs_cols = rhs.dim_size(ndims_rhs - 1);
     if (adj_x_) std::swap(lhs_rows, lhs_cols);
     if (adj_y_) std::swap(rhs_rows, rhs_cols);
     OP_REQUIRES(ctx, lhs_cols == rhs_rows,
@@ -89,8 +118,10 @@ class BatchMatMulMkl : public OpKernel {
                     "lhs mismatch rhs shape: ", lhs_cols, " vs. ", rhs_rows,
                     ": ", lhs.shape().DebugString(), " ",
                     rhs.shape().DebugString(), " ", adj_x_, " ", adj_y_));
     out_shape.AddDim(lhs_rows);
     out_shape.AddDim(rhs_cols);
     Tensor *out = nullptr;
     OP_REQUIRES_OK(ctx, ctx->allocate_output(0, out_shape, &out));
     if (out->NumElements() == 0) {
@@ -122,10 +153,24 @@ class BatchMatMulMkl : public OpKernel {
     a_array.reserve(batch_size);
     b_array.reserve(batch_size);
     c_array.reserve(batch_size);
-    for (int64 i = 0; i < batch_size; i++) {
-      a_array.push_back(&lhs_reshaped(i, 0, 0));
-      b_array.push_back(&rhs_reshaped(i, 0, 0));
-      c_array.push_back(&out_reshaped(i, 0, 0));
+    if (!bcast.IsBroadcastingRequired()) {
+      for (int64 i = 0; i < batch_size; i++) {
+        a_array.push_back(&lhs_reshaped(i, 0, 0));
+        b_array.push_back(&rhs_reshaped(i, 0, 0));
+        c_array.push_back(&out_reshaped(i, 0, 0));
+      }
+    } else {
+      // Broadcasting is needed, so get the mapping from flattened output batch
+      // indices to x's and y's flattened batch indices.
+      const std::vector<int64> &a_batch_indices = bcast.x_batch_indices();
+      const std::vector<int64> &b_batch_indices = bcast.y_batch_indices();
+      for (int64 i = 0; i < batch_size; i++) {
+        a_array.push_back(&lhs_reshaped(a_batch_indices[i], 0, 0));
+        b_array.push_back(&rhs_reshaped(b_batch_indices[i], 0, 0));
+        c_array.push_back(&out_reshaped(i, 0, 0));
+      }
     }
     MklCblasGemmBatch(CblasRowMajor, adj_x_, adj_y_, &m_array[0], &n_array[0],
@@ -226,13 +271,25 @@ class BatchMatMulMkl : public OpKernel {
                               .Device(DEVICE_CPU)                             \
                               .TypeConstraint<TYPE>("T")                      \
                               .Label(mkl_op_registry::kMklNameChangeOpLabel), \
-                          BatchMatMulMkl<CPUDevice, TYPE>)
+                          BatchMatMulMkl<CPUDevice, TYPE, false>)
+#define REGISTER_BATCH_MATMUL_MKL_V2(TYPE)                                    \
+  REGISTER_KERNEL_BUILDER(Name("_MklBatchMatMulV2")                           \
+                              .Device(DEVICE_CPU)                             \
+                              .TypeConstraint<TYPE>("T")                      \
+                              .Label(mkl_op_registry::kMklNameChangeOpLabel), \
+                          BatchMatMulMkl<CPUDevice, TYPE, true>)
 #ifdef ENABLE_MKL
 TF_CALL_float(REGISTER_BATCH_MATMUL_MKL);
 TF_CALL_double(REGISTER_BATCH_MATMUL_MKL);
 TF_CALL_complex64(REGISTER_BATCH_MATMUL_MKL);
 TF_CALL_complex128(REGISTER_BATCH_MATMUL_MKL);
+TF_CALL_float(REGISTER_BATCH_MATMUL_MKL_V2);
+TF_CALL_double(REGISTER_BATCH_MATMUL_MKL_V2);
+TF_CALL_complex64(REGISTER_BATCH_MATMUL_MKL_V2);
+TF_CALL_complex128(REGISTER_BATCH_MATMUL_MKL_V2);
 #endif  // ENABLE_MKL
 }  // end namespace tensorflow


@@ -70,6 +70,8 @@ class MklBinaryOp : public BinaryOp<Device, Functor> {
 REGISTER6(MklBinaryOp, CPU, "_MklAdd", functor::add, float, Eigen::half, double,
           int32, int64, bfloat16);
+REGISTER6(MklBinaryOp, CPU, "_MklAddV2", functor::add, float, Eigen::half,
+          double, int32, int64, bfloat16);
 REGISTER8(MklBinaryOp, CPU, "_MklSub", functor::sub, float, Eigen::half, double,
           int32, int64, complex64, complex128, bfloat16);
 REGISTER6(MklBinaryOp, CPU, "_MklMul", functor::mul, float, Eigen::half, double,


@@ -95,7 +95,8 @@ struct NthElementFunctor<CPUDevice, T> {
     const int last_dim = input_tensor.dim_size(input_tensor.dims() - 1);
     // Allocate each row to different shard.
-    auto SubNthElement = [&, input, output, last_dim, n](int start, int limit) {
+    auto SubNthElement = [&, input, output, last_dim, n](int64 start,
+                                                         int64 limit) {
       // std::nth_element would rearrange the array, so we need a new buffer.
       std::vector<T> buf(last_dim);
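This hunk, and the several that follow, widen shard-lambda parameters from `int` to `int64`. The sharding machinery invokes these workers through a `std::function` taking 64-bit bounds; a lambda declaring `int` parameters still converts, but every call silently narrows the range. A hedged illustration (`SeenLimit` is a hypothetical helper written for this sketch, not TF code):

```cpp
#include <cstdint>
#include <functional>

// Returns the `limit` value the worker actually observes when invoked with a
// bound larger than INT_MAX, for both the pre-fix (int) and post-fix (int64)
// parameter signatures.
static int64_t SeenLimit(bool use_int64_params) {
  const int64_t big = (int64_t{1} << 32) + 5;  // exceeds INT_MAX
  int64_t seen = 0;
  std::function<void(int64_t, int64_t)> work;
  if (use_int64_params) {
    // Post-fix signature: the 64-bit bound arrives intact.
    work = [&seen](int64_t /*start*/, int64_t limit) { seen = limit; };
  } else {
    // Pre-fix signature: each invocation narrows the 64-bit bound to int.
    work = [&seen](int /*start*/, int limit) { seen = limit; };
  }
  work(0, big);
  return seen;
}
```

With `int` parameters the observed limit cannot equal the true 64-bit bound, so shards covering more than 2^31 - 1 elements process the wrong range; widening to `int64` removes the truncation.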


@@ -69,8 +69,8 @@ struct TruncatedNormalFunctor<CPUDevice, T> {
     auto DoWork = [samples_per_batch, num_elements, &ctx, &means, &stddevs,
                    &minvals, &maxvals, &gen, &output,
-                   kStdDevsInsideBoundsToUseRandnSampler](int start_batch,
-                                                          int limit_batch) {
+                   kStdDevsInsideBoundsToUseRandnSampler](int64 start_batch,
+                                                          int64 limit_batch) {
       // Capturing "gen" by-value would only make a copy for the _shared_
       // lambda. Since we want to let each worker have its own copy, we pass
       // "gen" by reference and explicitly do a copy assignment here.


@@ -176,7 +176,7 @@ struct RandomBinomialFunctor<CPUDevice, T, U> {
     auto worker_threads = *(ctx->device()->tensorflow_cpu_worker_threads());
     auto DoWork = [samples_per_batch, num_elements, &counts, &probs, &gen,
-                   &output](int start_batch, int limit_batch) {
+                   &output](int64 start_batch, int64 limit_batch) {
       // Capturing "gen" by-value would only make a copy for the _shared_
       // lambda. Since we want to let each worker have its own copy, we pass
       // "gen" by reference and explicitly do a copy assignment here.


@@ -204,7 +204,7 @@ class RandomGammaOp : public OpKernel {
     // avoid a couple flops which can be done on a per-alpha basis.
     auto DoWork = [num_samples, num_alphas, &rng, samples_flat, alpha_flat](
-                      int start_output, int limit_output) {
+                      int64 start_output, int64 limit_output) {
       using Eigen::numext::exp;
       using Eigen::numext::log;
       using Eigen::numext::pow;


@@ -103,7 +103,7 @@ struct PoissonFunctor<CPUDevice, T, U> {
   typedef random::UniformDistribution<random::PhiloxRandom, CT> Uniform;
   auto DoWork = [num_samples, num_rate, &rng, samples_flat, rate_flat](
-                    int start_output, int limit_output) {
+                    int64 start_output, int64 limit_output) {
     // Capturing "rng" by value would only make a copy for the _shared_
     // lambda. Since we want to let each worker have its own copy, we pass
     // "rng" by reference and explicitly do a copy assignment.


@@ -30,7 +30,7 @@ REGISTER_KERNEL_BUILDER(
         .HostMemory("reduction_indices"),
     ReductionOp<CPUDevice, bool, int64, Eigen::internal::AndReducer>);
-#if GOOGLE_CUDA
+#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
 REGISTER_KERNEL_BUILDER(
     Name("All")
         .TypeConstraint<int32>("Tidx")


@@ -30,7 +30,7 @@ REGISTER_KERNEL_BUILDER(
         .HostMemory("reduction_indices"),
     ReductionOp<CPUDevice, bool, int64, Eigen::internal::OrReducer>);
-#if GOOGLE_CUDA
+#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
 REGISTER_KERNEL_BUILDER(
     Name("Any")
         .TypeConstraint<int32>("Tidx")


@@ -15,8 +15,8 @@ limitations under the License.
 #ifndef TENSORFLOW_CORE_KERNELS_REDUCTION_OPS_COMMON_GPU_H_
 #define TENSORFLOW_CORE_KERNELS_REDUCTION_OPS_COMMON_GPU_H_
-#if !GOOGLE_CUDA
-#error This file must only be included when building with Cuda support
+#if !GOOGLE_CUDA && !TENSORFLOW_USE_ROCM
+#error This file must only be included when building with GPU support
 #endif
 #include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"


@@ -33,7 +33,7 @@ namespace tensorflow {
 TF_CALL_NUMBER_TYPES(REGISTER_CPU_KERNELS);
 #undef REGISTER_CPU_KERNELS
-#if GOOGLE_CUDA
+#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
 #define REGISTER_GPU_KERNELS(type)                    \
   REGISTER_KERNEL_BUILDER(Name("EuclideanNorm")       \
@@ -51,8 +51,10 @@ TF_CALL_NUMBER_TYPES(REGISTER_CPU_KERNELS);
           ReductionOp<GPUDevice, type, int64,         \
                       functor::EuclideanNormReducer<type>>);
 TF_CALL_GPU_NUMBER_TYPES(REGISTER_GPU_KERNELS);
+#if GOOGLE_CUDA
 TF_CALL_complex64(REGISTER_GPU_KERNELS);
 TF_CALL_complex128(REGISTER_GPU_KERNELS);
+#endif
 #undef REGISTER_GPU_KERNELS
 #endif


@@ -13,7 +13,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ==============================================================================*/
-#if GOOGLE_CUDA
+#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
 #define EIGEN_USE_GPU
@@ -59,4 +59,4 @@ DEFINE_FOR_TYPE_AND_R(bool, Eigen::internal::OrReducer);
 }  // end namespace functor
 }  // end namespace tensorflow
-#endif  // GOOGLE_CUDA
+#endif  // GOOGLE_CUDA || TENSORFLOW_USE_ROCM


@@ -13,7 +13,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ==============================================================================*/
-#if GOOGLE_CUDA
+#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
 #define EIGEN_USE_GPU
@@ -67,4 +67,4 @@ DEFINE_FOR_ALL_REDUCERS(double);
 }  // end namespace functor
 }  // end namespace tensorflow
-#endif  // GOOGLE_CUDA
+#endif  // GOOGLE_CUDA || TENSORFLOW_USE_ROCM


@@ -13,7 +13,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ==============================================================================*/
-#if GOOGLE_CUDA
+#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
 #define EIGEN_USE_GPU
@@ -67,4 +67,4 @@ DEFINE_FOR_ALL_REDUCERS(float);
 }  // end namespace functor
 }  // end namespace tensorflow
-#endif  // GOOGLE_CUDA
+#endif  // GOOGLE_CUDA || TENSORFLOW_USE_ROCM


@@ -13,7 +13,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ==============================================================================*/
-#if GOOGLE_CUDA
+#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
 #define EIGEN_USE_GPU
@@ -68,4 +68,4 @@ DEFINE_FOR_ALL_REDUCERS(int64);
 }  // end namespace functor
 }  // end namespace tensorflow
-#endif  // GOOGLE_CUDA
+#endif  // GOOGLE_CUDA || TENSORFLOW_USE_ROCM


@@ -13,7 +13,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ==============================================================================*/
-#if GOOGLE_CUDA
+#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
 #define EIGEN_USE_GPU
@@ -64,4 +64,4 @@ DEFINE_FOR_ALL_REDUCERS(Eigen::half);
 }  // end namespace functor
 }  // end namespace tensorflow
-#endif  // GOOGLE_CUDA
+#endif  // GOOGLE_CUDA || TENSORFLOW_USE_ROCM


@@ -13,7 +13,7 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ==============================================================================*/
-#if GOOGLE_CUDA
+#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
 #define EIGEN_USE_GPU
@@ -64,4 +64,4 @@ DEFINE_FOR_ALL_REDUCERS(Eigen::half);
 }  // end namespace functor
 }  // end namespace tensorflow
-#endif  // GOOGLE_CUDA
+#endif  // GOOGLE_CUDA || TENSORFLOW_USE_ROCM


@@ -33,7 +33,7 @@ namespace tensorflow {
 TF_CALL_REAL_NUMBER_TYPES(REGISTER_CPU_KERNELS);
 #undef REGISTER_CPU_KERNELS
-#if GOOGLE_CUDA
+#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
 #define REGISTER_GPU_KERNELS(type)          \
   REGISTER_KERNEL_BUILDER(                  \


@@ -33,7 +33,7 @@ namespace tensorflow {
 TF_CALL_NUMBER_TYPES(REGISTER_CPU_KERNELS);
 #undef REGISTER_CPU_KERNELS
-#if GOOGLE_CUDA
+#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
 #define REGISTER_GPU_KERNELS(type)          \
   REGISTER_KERNEL_BUILDER(                  \
@@ -51,8 +51,10 @@ TF_CALL_NUMBER_TYPES(REGISTER_CPU_KERNELS);
           .HostMemory("reduction_indices"),                             \
       ReductionOp<GPUDevice, type, int64, functor::MeanReducer<type>>);
 TF_CALL_GPU_NUMBER_TYPES(REGISTER_GPU_KERNELS);
+#if GOOGLE_CUDA
 TF_CALL_complex64(REGISTER_GPU_KERNELS);
 TF_CALL_complex128(REGISTER_GPU_KERNELS);
+#endif
 #undef REGISTER_GPU_KERNELS
 #endif


@@ -33,7 +33,7 @@ namespace tensorflow {
 TF_CALL_REAL_NUMBER_TYPES(REGISTER_CPU_KERNELS);
 #undef REGISTER_CPU_KERNELS
-#if GOOGLE_CUDA
+#if GOOGLE_CUDA || TENSORFLOW_USE_ROCM
 #define REGISTER_GPU_KERNELS(type)          \
   REGISTER_KERNEL_BUILDER(                  \

Some files were not shown because too many files have changed in this diff.