Merge changes from github.
PiperOrigin-RevId: 203037623
parent eacdfdf6c0
commit 73e38c29c7
@ -96,6 +96,8 @@ The TensorFlow project strives to abide by generally accepted best practices in
|
|||||||
| --- | --- | --- |
|
| --- | --- | --- |
|
||||||
| **IBM s390x** | [](http://ibmz-ci.osuosl.org/job/TensorFlow_IBMZ_CI/) | TBA |
|
| **IBM s390x** | [](http://ibmz-ci.osuosl.org/job/TensorFlow_IBMZ_CI/) | TBA |
|
||||||
| **IBM ppc64le CPU** | [](http://powerci.osuosl.org/job/TensorFlow_Ubuntu_16.04_CPU/) | TBA |
|
| **IBM ppc64le CPU** | [](http://powerci.osuosl.org/job/TensorFlow_Ubuntu_16.04_CPU/) | TBA |
|
||||||
|
| **IBM ppc64le GPU** | [](http://powerci.osuosl.org/job/TensorFlow_Ubuntu_16.04_PPC64LE_GPU/) | TBA |
|
||||||
|
| **Linux CPU with Intel® MKL-DNN®** | [](https://tensorflow-ci.intel.com/job/tensorflow-mkl-linux-cpu/) | TBA |
|
||||||
|
|
||||||
|
|
||||||
## For more information
|
## For more information
|
||||||
|
40
RELEASE.md
@ -1,18 +1,38 @@
|
|||||||
# Release 1.9.0
|
# Release 1.9.0
|
||||||
|
|
||||||
## Major Features And Improvements
|
## Major Features And Improvements
|
||||||
* Update tf.keras to the Keras 2.1.6 API.
|
* Updated docs for `tf.keras`: New Keras-based [get started](http://tensorflow.org/versions/r1.9/get_started),
|
||||||
* `tfe.Network` is deprecated. Please inherit from `tf.keras.Model`.
|
and [programmers guide page](http://tensorflow.org/versions/r1.9/programmers_guide/keras).
|
||||||
* Adding support of core feature columns and losses to gradient boosted trees estimators.
|
* Update `tf.keras` to the Keras 2.1.6 API.
|
||||||
* The distributions.Bijector API supports broadcasting for Bijectors with new API changes. See [here](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/distributions/bijectors/Bijector) for more details.
|
* Added [`tf.keras.layers.CuDNNGRU`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/keras/layers/CuDNNGRU) and [`tf.keras.layers.CuDNNLSTM`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/keras/layers/CuDNNLSTM) layers. [Try it](https://colab.sandbox.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/nmt_with_attention/nmt_with_attention.ipynb?linkId=53292082).
|
||||||
* Layered variable names have changed in the following conditions:
|
* Added support for core [feature columns](https://www.tensorflow.org/get_started/feature_columns) and [losses](https://www.tensorflow.org/api_docs/python/tf/losses) in [gradient boosted trees estimators](https://github.com/tensorflow/models/tree/master/official/boosted_trees).
|
||||||
* Using `tf.keras.layers` with custom variable scopes.
|
* The [python interface](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/contrib/lite)
|
||||||
* Using `tf.layers` in a subclassed `tf.keras.Model` class. See [here](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/layers) for more details
|
for the [TFLite Optimizing Converter](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/toco/README.md)
|
||||||
|
has been expanded, and the command line interface (AKA: `toco`, `tflite_convert`) is once again
|
||||||
## Breaking Changes
|
included in the standard `pip` installation.
|
||||||
* If you're opening empty variable scopes; replace `variable_scope`('', ...) by `variable_scope`(`tf.get_variable_scope()`, ...).
|
* Improved data-loading and text processing with:
|
||||||
|
* [`tf.decode_compressed`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/decode_compressed)
|
||||||
|
* [`tf.string_strip`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/string_strip)
|
||||||
|
* [`tf.strings.regex_full_match`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/strings/regex_full_match)
|
||||||
|
* Added experimental support for new pre-made Estimators:
|
||||||
|
* [`tf.contrib.estimator.BaselineEstimator`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/contrib/estimator/BaselineEstimator)
|
||||||
|
* [`tf.contrib.estimator.RNNClassifier`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/contrib/estimator/RNNClassifier)
|
||||||
|
* [`tf.contrib.estimator.RNNEstimator`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/contrib/estimator/RNNEstimator)
|
||||||
|
* The [distributions.Bijector](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/contrib/distributions/bijectors/Bijector)
|
||||||
|
API supports broadcasting for Bijectors with new API changes.
|
||||||
|
|
||||||
|
## Breaking Changes
|
||||||
|
* If you're opening empty variable scopes, replace `variable_scope('', ...)` with
|
||||||
|
`variable_scope(tf.get_variable_scope(), ...)`.
|
||||||
|
* Headers used for building custom ops have been moved from site-packages/external into site-packages/tensorflow/include/external.
|
||||||
|
|
||||||
## Bug Fixes and Other Changes
|
## Bug Fixes and Other Changes
|
||||||
|
|
||||||
|
* `tfe.Network` is deprecated. Please inherit from `tf.keras.Model`.
|
||||||
|
* Layered variable names have changed in the following conditions:
|
||||||
|
* Using `tf.keras.layers` with custom variable scopes.
|
||||||
|
* Using `tf.layers` in a subclassed `tf.keras.Model` class. See
|
||||||
|
[here](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/layers) for more details.
|
||||||
* `tf.data`:
|
* `tf.data`:
|
||||||
* The `DatasetBase::DebugString()` method is now `const`.
|
* The `DatasetBase::DebugString()` method is now `const`.
|
||||||
* Added the `tf.contrib.data.sample_from_datasets()` API for randomly sampling from multiple datasets.
|
* Added the `tf.contrib.data.sample_from_datasets()` API for randomly sampling from multiple datasets.
|
||||||
|
@ -438,6 +438,22 @@ filegroup(
|
|||||||
data = glob(["docs_src/**/*.md"]),
|
data = glob(["docs_src/**/*.md"]),
|
||||||
)
|
)
|
||||||
|
|
||||||
|
cc_library(
|
||||||
|
name = "grpc",
|
||||||
|
deps = select({
|
||||||
|
":linux_s390x": ["@grpc//:grpc_unsecure"],
|
||||||
|
"//conditions:default": ["@grpc"],
|
||||||
|
}),
|
||||||
|
)
|
||||||
|
|
||||||
|
cc_library(
|
||||||
|
name = "grpc++",
|
||||||
|
deps = select({
|
||||||
|
":linux_s390x": ["@grpc//:grpc++_unsecure"],
|
||||||
|
"//conditions:default": ["@grpc//:grpc++"],
|
||||||
|
}),
|
||||||
|
)
|
||||||
|
|
||||||
# A shared object which includes registration mechanisms for ops and
|
# A shared object which includes registration mechanisms for ops and
|
||||||
# kernels. Does not include the implementations of any ops or kernels. Instead,
|
# kernels. Does not include the implementations of any ops or kernels. Instead,
|
||||||
# the library which loads libtensorflow_framework.so
|
# the library which loads libtensorflow_framework.so
|
||||||
@ -587,19 +603,3 @@ py_library(
|
|||||||
visibility = ["//visibility:public"],
|
visibility = ["//visibility:public"],
|
||||||
deps = ["//tensorflow/python:no_contrib"],
|
deps = ["//tensorflow/python:no_contrib"],
|
||||||
)
|
)
|
||||||
|
|
||||||
cc_library(
|
|
||||||
name = "grpc",
|
|
||||||
deps = select({
|
|
||||||
":linux_s390x": ["@grpc//:grpc_unsecure"],
|
|
||||||
"//conditions:default": ["@grpc"],
|
|
||||||
}),
|
|
||||||
)
|
|
||||||
|
|
||||||
cc_library(
|
|
||||||
name = "grpc++",
|
|
||||||
deps = select({
|
|
||||||
":linux_s390x": ["@grpc//:grpc++_unsecure"],
|
|
||||||
"//conditions:default": ["@grpc//:grpc++"],
|
|
||||||
}),
|
|
||||||
)
|
|
||||||
|
@ -421,6 +421,58 @@ Status StridedSliceGradHelper(const Scope& scope, const Operation& op,
|
|||||||
}
|
}
|
||||||
REGISTER_GRADIENT_OP("StridedSlice", StridedSliceGradHelper);
|
REGISTER_GRADIENT_OP("StridedSlice", StridedSliceGradHelper);
|
||||||
|
|
||||||
|
Status SliceGrad(const Scope& scope, const Operation& op,
|
||||||
|
const std::vector<Output>& grad_inputs,
|
||||||
|
std::vector<Output>* grad_outputs) {
|
||||||
|
// Propagate the incoming gradient along all the selected values,
|
||||||
|
// and zero everywhere else. Use the Pad operator for this.
|
||||||
|
//
|
||||||
|
// First create an Nx2 padding where N is the number of input
|
||||||
|
// dimensions. The first column is the number of prepended zeros
|
||||||
|
// for each dimension, and the second column is the number of
|
||||||
|
// appended zeros.
|
||||||
|
//
|
||||||
|
// The first column is just the begin vector.
|
||||||
|
// The second column is the shape of the input element-wise
|
||||||
|
// subtracted by begin+size
|
||||||
|
|
||||||
|
// Running example:
|
||||||
|
// input.shape = [3, 5, 3]
|
||||||
|
// begin = [1, 2, 1], size = [1, 3, 2]
|
||||||
|
Input input = op.input(0);
|
||||||
|
Input begin = op.input(1);
|
||||||
|
// input_rank = 3
|
||||||
|
auto input_rank = Rank(scope, input);
|
||||||
|
// slice_size = [1, 3, 2]
|
||||||
|
auto slice_size = Shape(scope, op.output(0));
|
||||||
|
// padding_shape = [3, 1]
|
||||||
|
auto padding_shape = Stack(scope, {input_rank, 1});
|
||||||
|
// before_padding = [[1]
|
||||||
|
// [2]
|
||||||
|
// [1]]
|
||||||
|
Input before_padding = Reshape(scope, begin, padding_shape);
|
||||||
|
// after_padding_sizes = shape(input) - slice_size - begin
|
||||||
|
// = [3, 5, 3] - [1, 3, 2] - [1, 2, 1]
|
||||||
|
// = [1, 0, 0]
|
||||||
|
auto after_padding_sizes =
|
||||||
|
Sub(scope, Sub(scope, Shape(scope, input), slice_size), begin);
|
||||||
|
// after_padding = [[1]
|
||||||
|
// [0]
|
||||||
|
// [0]]
|
||||||
|
Input after_padding = Reshape(scope, after_padding_sizes, padding_shape);
|
||||||
|
// paddings = [[1 1]
|
||||||
|
// [2 0]
|
||||||
|
// [1 0]]
|
||||||
|
auto paddings =
|
||||||
|
Concat(scope, {before_padding, after_padding}, Const(scope, 1));
|
||||||
|
grad_outputs->push_back(Pad(scope, grad_inputs[0], paddings));
|
||||||
|
// Nothing propagated for "begin" and "size" inputs
|
||||||
|
grad_outputs->push_back(NoGradient());
|
||||||
|
grad_outputs->push_back(NoGradient());
|
||||||
|
return scope.status();
|
||||||
|
}
|
||||||
|
REGISTER_GRADIENT_OP("Slice", SliceGrad);
|
||||||
|
|
||||||
} // anonymous namespace
|
} // anonymous namespace
|
||||||
} // namespace ops
|
} // namespace ops
|
||||||
} // namespace tensorflow
|
} // namespace tensorflow
|
||||||
|
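The padding construction described in the `SliceGrad` comments can be checked outside TensorFlow. The following is a minimal NumPy sketch, not the C++ implementation: `slice_grad` is a hypothetical helper name, and it assumes `size` contains no `-1` entries (the C++ code sidesteps this by reading the shape of the op output instead). It reuses the running example from the comments.

```python
import numpy as np


def slice_grad(input_shape, begin, size, grad):
    """Scatter `grad` back into a zero tensor of shape `input_shape`.

    Hypothetical helper mirroring the SliceGrad comments: the first padding
    column is `begin`, the second is input_shape - size - begin.
    """
    input_shape = np.asarray(input_shape)
    begin = np.asarray(begin)
    size = np.asarray(size)
    after = input_shape - size - begin
    paddings = np.stack([begin, after], axis=1)  # shape [rank, 2]
    return np.pad(grad, paddings, mode="constant"), paddings


# Running example: input.shape = [3, 5, 3], begin = [1, 2, 1], size = [1, 3, 2]
grad = np.ones((1, 3, 2), dtype=np.float32)
dx, paddings = slice_grad([3, 5, 3], [1, 2, 1], [1, 3, 2], grad)
print(paddings.tolist())  # [[1, 1], [2, 0], [1, 0]]
print(dx.shape)           # (3, 5, 3)
```

The incoming gradient lands in the sliced region `dx[1:2, 2:5, 1:3]`, and every other entry is zero, which is the behaviour the `Pad` call is meant to produce.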
@ -378,5 +378,12 @@ TEST_F(ArrayGradTest, StridedSliceGrad) {
|
|||||||
RunTest(x, x_shape, y, {1, 2, 2, 2});
|
RunTest(x, x_shape, y, {1, 2, 2, 2});
|
||||||
}
|
}
|
||||||
|
|
||||||
|
TEST_F(ArrayGradTest, SliceGrad) {
|
||||||
|
TensorShape x_shape({3, 5, 3});
|
||||||
|
auto x = Placeholder(scope_, DT_FLOAT, Placeholder::Shape(x_shape));
|
||||||
|
auto y = Slice(scope_, x, {1, 2, 1}, {1, 3, 2});
|
||||||
|
RunTest(x, x_shape, y, {1, 3, 2});
|
||||||
|
}
|
||||||
|
|
||||||
} // namespace
|
} // namespace
|
||||||
} // namespace tensorflow
|
} // namespace tensorflow
|
||||||
|
@ -128,7 +128,14 @@ cc_library(
|
|||||||
"@llvm//:target", # fixdeps: keep
|
"@llvm//:target", # fixdeps: keep
|
||||||
"@llvm//:x86_code_gen", # fixdeps: keep
|
"@llvm//:x86_code_gen", # fixdeps: keep
|
||||||
"@llvm//:x86_disassembler", # fixdeps: keep
|
"@llvm//:x86_disassembler", # fixdeps: keep
|
||||||
],
|
] + select({
|
||||||
|
"//tensorflow:linux_ppc64le": [
|
||||||
|
"@llvm//:powerpc_disassembler",
|
||||||
|
"@llvm//:powerpc_code_gen",
|
||||||
|
],
|
||||||
|
"//conditions:default": [
|
||||||
|
],
|
||||||
|
}),
|
||||||
alwayslink = True, # Contains compiler registration
|
alwayslink = True, # Contains compiler registration
|
||||||
)
|
)
|
||||||
|
|
||||||
|
@ -125,9 +125,9 @@ py_library(
|
|||||||
}) + if_not_windows_cuda([
|
}) + if_not_windows_cuda([
|
||||||
"//tensorflow/contrib/fused_conv:fused_conv_py", # unresolved symbols, need to export more symbols
|
"//tensorflow/contrib/fused_conv:fused_conv_py", # unresolved symbols, need to export more symbols
|
||||||
]) + if_not_windows([
|
]) + if_not_windows([
|
||||||
"//tensorflow/contrib/ffmpeg:ffmpeg_ops_py",
|
|
||||||
"//tensorflow/contrib/cloud:cloud_py", # depends on bigtable
|
"//tensorflow/contrib/cloud:cloud_py", # depends on bigtable
|
||||||
"//tensorflow/contrib/bigtable", # doesn't compile on Windows
|
"//tensorflow/contrib/bigtable", # doesn't compile on Windows
|
||||||
|
"//tensorflow/contrib/ffmpeg:ffmpeg_ops_py",
|
||||||
"//tensorflow/contrib/lite/python:lite", # unix dependency, need to fix code
|
"//tensorflow/contrib/lite/python:lite", # unix dependency, need to fix code
|
||||||
]),
|
]),
|
||||||
)
|
)
|
||||||
|
@ -47,7 +47,6 @@ class SymbolNamer(object):
|
|||||||
|
|
||||||
class ControlFlowTransformer(converter.Base):
|
class ControlFlowTransformer(converter.Base):
|
||||||
"""Transforms control flow structures like loops and conditionals."""
|
"""Transforms control flow structures like loops and conditionals."""
|
||||||
|
|
||||||
def _create_cond_branch(self, body_name, aliased_orig_names,
|
def _create_cond_branch(self, body_name, aliased_orig_names,
|
||||||
aliased_new_names, body, returns):
|
aliased_new_names, body, returns):
|
||||||
if aliased_orig_names:
|
if aliased_orig_names:
|
||||||
|
@ -299,17 +299,20 @@ include_directories(
|
|||||||
${double_conversion_INCLUDE_DIR}
|
${double_conversion_INCLUDE_DIR}
|
||||||
)
|
)
|
||||||
|
|
||||||
if(tensorflow_ENABLE_SSL_SUPPORT)
|
|
||||||
include(boringssl)
|
|
||||||
list(APPEND tensorflow_EXTERNAL_LIBRARIES ${boringssl_STATIC_LIBRARIES})
|
|
||||||
list(APPEND tensorflow_EXTERNAL_DEPENDENCIES boringssl)
|
|
||||||
include_directories(${boringssl_INCLUDE_DIR})
|
|
||||||
endif()
|
|
||||||
if(tensorflow_ENABLE_GRPC_SUPPORT)
|
if(tensorflow_ENABLE_GRPC_SUPPORT)
|
||||||
|
if(tensorflow_ENABLE_SSL_SUPPORT)
|
||||||
|
include(boringssl)
|
||||||
|
include_directories(${boringssl_INCLUDE_DIR})
|
||||||
|
endif()
|
||||||
include(grpc)
|
include(grpc)
|
||||||
|
include_directories(${GRPC_INCLUDE_DIRS})
|
||||||
|
# Place boringssl after grpc as grpc depends on boringssl.
|
||||||
list(APPEND tensorflow_EXTERNAL_LIBRARIES ${grpc_STATIC_LIBRARIES})
|
list(APPEND tensorflow_EXTERNAL_LIBRARIES ${grpc_STATIC_LIBRARIES})
|
||||||
list(APPEND tensorflow_EXTERNAL_DEPENDENCIES grpc)
|
list(APPEND tensorflow_EXTERNAL_DEPENDENCIES grpc)
|
||||||
include_directories(${GRPC_INCLUDE_DIRS})
|
if(tensorflow_ENABLE_SSL_SUPPORT)
|
||||||
|
list(APPEND tensorflow_EXTERNAL_LIBRARIES ${boringssl_STATIC_LIBRARIES})
|
||||||
|
list(APPEND tensorflow_EXTERNAL_DEPENDENCIES boringssl)
|
||||||
|
endif()
|
||||||
endif()
|
endif()
|
||||||
if(tensorflow_ENABLE_JEMALLOC_SUPPORT)
|
if(tensorflow_ENABLE_JEMALLOC_SUPPORT)
|
||||||
include(jemalloc)
|
include(jemalloc)
|
||||||
|
@ -17,7 +17,7 @@ include (ExternalProject)
|
|||||||
set(boringssl_INCLUDE_DIR ${CMAKE_CURRENT_BINARY_DIR}/boringssl/src/boringssl/include)
|
set(boringssl_INCLUDE_DIR ${CMAKE_CURRENT_BINARY_DIR}/boringssl/src/boringssl/include)
|
||||||
#set(boringssl_EXTRA_INCLUDE_DIR ${CMAKE_CURRENT_BINARY_DIR}/boringssl/src)
|
#set(boringssl_EXTRA_INCLUDE_DIR ${CMAKE_CURRENT_BINARY_DIR}/boringssl/src)
|
||||||
set(boringssl_URL https://boringssl.googlesource.com/boringssl)
|
set(boringssl_URL https://boringssl.googlesource.com/boringssl)
|
||||||
set(boringssl_TAG ee7aa02)
|
set(boringssl_TAG 7f8c553d7f4db0a6ce727f2986d41bf8fe8ec4bf)
|
||||||
set(boringssl_BUILD ${CMAKE_BINARY_DIR}/boringssl/src/boringssl-build)
|
set(boringssl_BUILD ${CMAKE_BINARY_DIR}/boringssl/src/boringssl-build)
|
||||||
#set(boringssl_LIBRARIES ${boringssl_BUILD}/obj/so/libboringssl.so)
|
#set(boringssl_LIBRARIES ${boringssl_BUILD}/obj/so/libboringssl.so)
|
||||||
set(boringssl_STATIC_LIBRARIES
|
set(boringssl_STATIC_LIBRARIES
|
||||||
|
@ -236,15 +236,6 @@ if(WIN32)
|
|||||||
list(APPEND tf_core_lib_srcs ${tf_core_platform_windows_srcs})
|
list(APPEND tf_core_lib_srcs ${tf_core_platform_windows_srcs})
|
||||||
endif(WIN32)
|
endif(WIN32)
|
||||||
|
|
||||||
if(tensorflow_ENABLE_SSL_SUPPORT)
|
|
||||||
# Cloud libraries require boringssl.
|
|
||||||
file(GLOB tf_core_platform_cloud_srcs
|
|
||||||
"${tensorflow_source_dir}/tensorflow/core/platform/cloud/*.h"
|
|
||||||
"${tensorflow_source_dir}/tensorflow/core/platform/cloud/*.cc"
|
|
||||||
)
|
|
||||||
list(APPEND tf_core_lib_srcs ${tf_core_platform_cloud_srcs})
|
|
||||||
endif()
|
|
||||||
|
|
||||||
if (tensorflow_ENABLE_HDFS_SUPPORT)
|
if (tensorflow_ENABLE_HDFS_SUPPORT)
|
||||||
list(APPEND tf_core_platform_hdfs_srcs
|
list(APPEND tf_core_platform_hdfs_srcs
|
||||||
"${tensorflow_source_dir}/tensorflow/core/platform/hadoop/hadoop_file_system.cc"
|
"${tensorflow_source_dir}/tensorflow/core/platform/hadoop/hadoop_file_system.cc"
|
||||||
|
@ -134,14 +134,13 @@ if(tensorflow_BUILD_CONTRIB_KERNELS)
|
|||||||
list(APPEND tf_core_kernels_srcs ${tf_contrib_kernels_srcs})
|
list(APPEND tf_core_kernels_srcs ${tf_contrib_kernels_srcs})
|
||||||
endif(tensorflow_BUILD_CONTRIB_KERNELS)
|
endif(tensorflow_BUILD_CONTRIB_KERNELS)
|
||||||
|
|
||||||
if(NOT tensorflow_ENABLE_SSL_SUPPORT)
|
# Cloud libraries require curl and boringssl.
|
||||||
# Cloud libraries require boringssl.
|
# Curl is not supported yet anyway so we remove for now.
|
||||||
file(GLOB tf_core_kernels_cloud_srcs
|
file(GLOB tf_core_kernels_cloud_srcs
|
||||||
"${tensorflow_source_dir}/tensorflow/contrib/cloud/kernels/*.h"
|
"${tensorflow_source_dir}/tensorflow/contrib/cloud/kernels/*.h"
|
||||||
"${tensorflow_source_dir}/tensorflow/contrib/cloud/kernels/*.cc"
|
"${tensorflow_source_dir}/tensorflow/contrib/cloud/kernels/*.cc"
|
||||||
)
|
)
|
||||||
list(REMOVE_ITEM tf_core_kernels_srcs ${tf_core_kernels_cloud_srcs})
|
list(REMOVE_ITEM tf_core_kernels_srcs ${tf_core_kernels_cloud_srcs})
|
||||||
endif()
|
|
||||||
|
|
||||||
file(GLOB_RECURSE tf_core_kernels_exclude_srcs
|
file(GLOB_RECURSE tf_core_kernels_exclude_srcs
|
||||||
"${tensorflow_source_dir}/tensorflow/core/kernels/*test*.h"
|
"${tensorflow_source_dir}/tensorflow/core/kernels/*test*.h"
|
||||||
|
@ -64,8 +64,6 @@ file(GLOB tf_stream_executor_srcs
|
|||||||
if (tensorflow_ENABLE_GPU)
|
if (tensorflow_ENABLE_GPU)
|
||||||
file(GLOB tf_stream_executor_gpu_srcs
|
file(GLOB tf_stream_executor_gpu_srcs
|
||||||
"${tensorflow_source_dir}/tensorflow/stream_executor/cuda/*.cc"
|
"${tensorflow_source_dir}/tensorflow/stream_executor/cuda/*.cc"
|
||||||
"${tensorflow_source_dir}/tensorflow/compiler/xla/statusor.h"
|
|
||||||
"${tensorflow_source_dir}/tensorflow/compiler/xla/statusor.cc"
|
|
||||||
)
|
)
|
||||||
if (NOT tensorflow_BUILD_CC_TESTS)
|
if (NOT tensorflow_BUILD_CC_TESTS)
|
||||||
file(GLOB tf_stream_executor_gpu_tests
|
file(GLOB tf_stream_executor_gpu_tests
|
||||||
|
@ -534,7 +534,8 @@ def multi_label_head(n_classes,
|
|||||||
* An integer `SparseTensor` of class indices. The `dense_shape` must be
|
* An integer `SparseTensor` of class indices. The `dense_shape` must be
|
||||||
`[D0, D1, ... DN, ?]` and the values within `[0, n_classes)`.
|
`[D0, D1, ... DN, ?]` and the values within `[0, n_classes)`.
|
||||||
* If `label_vocabulary` is given, a string `SparseTensor`. The `dense_shape`
|
* If `label_vocabulary` is given, a string `SparseTensor`. The `dense_shape`
|
||||||
must be `[D0, D1, ... DN, ?]` and the values within `label_vocabulary`.
|
must be `[D0, D1, ... DN, ?]` and the values within `label_vocabulary` or a
|
||||||
|
multi-hot tensor of shape `[D0, D1, ... DN, n_classes]`.
|
||||||
|
|
||||||
If `weight_column` is specified, weights must be of shape
|
If `weight_column` is specified, weights must be of shape
|
||||||
`[D0, D1, ... DN]`, or `[D0, D1, ... DN, 1]`.
|
`[D0, D1, ... DN]`, or `[D0, D1, ... DN, 1]`.
|
||||||
|
@ -568,6 +568,33 @@ class MultiLabelHead(test.TestCase):
|
|||||||
expected_loss=expected_loss,
|
expected_loss=expected_loss,
|
||||||
expected_metrics=expected_metrics)
|
expected_metrics=expected_metrics)
|
||||||
|
|
||||||
|
def test_eval_with_label_vocabulary_with_multi_hot_input(self):
|
||||||
|
n_classes = 2
|
||||||
|
head = head_lib.multi_label_head(
|
||||||
|
n_classes, label_vocabulary=['class0', 'class1'])
|
||||||
|
logits = np.array([[-1., 1.], [-1.5, 1.5]], dtype=np.float32)
|
||||||
|
labels_multi_hot = np.array([[1, 0], [1, 1]], dtype=np.int64)
|
||||||
|
# loss = labels * -log(sigmoid(logits)) +
|
||||||
|
# (1 - labels) * -log(1 - sigmoid(logits))
|
||||||
|
# Sum over examples, divide by batch_size.
|
||||||
|
expected_loss = 0.5 * np.sum(
|
||||||
|
_sigmoid_cross_entropy(labels=labels_multi_hot, logits=logits))
|
||||||
|
keys = metric_keys.MetricKeys
|
||||||
|
expected_metrics = {
|
||||||
|
# Average loss over examples.
|
||||||
|
keys.LOSS_MEAN: expected_loss,
|
||||||
|
# auc and auc_pr cannot be reliably calculated for only 4 samples, but
|
||||||
|
# this assert tests that the algorithm remains consistent.
|
||||||
|
keys.AUC: 0.3333,
|
||||||
|
keys.AUC_PR: 0.7639,
|
||||||
|
}
|
||||||
|
self._test_eval(
|
||||||
|
head=head,
|
||||||
|
logits=logits,
|
||||||
|
labels=labels_multi_hot,
|
||||||
|
expected_loss=expected_loss,
|
||||||
|
expected_metrics=expected_metrics)
|
||||||
|
|
||||||
def test_eval_with_thresholds(self):
|
def test_eval_with_thresholds(self):
|
||||||
n_classes = 2
|
n_classes = 2
|
||||||
thresholds = [0.25, 0.5, 0.75]
|
thresholds = [0.25, 0.5, 0.75]
|
||||||
|
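The expected loss in `test_eval_with_label_vocabulary_with_multi_hot_input` can be reproduced with plain NumPy. This is a sketch under one assumption: `sigmoid_cross_entropy` below is a stand-in for the test file's `_sigmoid_cross_entropy` helper, whose body is not shown in this diff, written with the usual numerically stable form of the formula quoted in the test comments.

```python
import numpy as np


def sigmoid_cross_entropy(labels, logits):
    """Element-wise sigmoid cross entropy in a numerically stable form.

    Equivalent to labels * -log(sigmoid(x)) + (1 - labels) * -log(1 - sigmoid(x)).
    """
    labels = labels.astype(np.float64)
    logits = logits.astype(np.float64)
    return (np.maximum(logits, 0) - logits * labels
            + np.log1p(np.exp(-np.abs(logits))))


logits = np.array([[-1., 1.], [-1.5, 1.5]], dtype=np.float32)
labels_multi_hot = np.array([[1, 0], [1, 1]], dtype=np.int64)

# Sum over classes and examples, then divide by batch_size (2).
expected_loss = 0.5 * np.sum(sigmoid_cross_entropy(labels_multi_hot, logits))
print(float(expected_loss))
```

The multi-hot labels are fed to the loss directly, with no vocabulary lookup, which is the case the new docstring text for `multi_label_head` describes.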
@ -103,9 +103,20 @@ class GANHead(head._Head): # pylint: disable=protected-access
|
|||||||
name: name of the head. If provided, summary and metrics keys will be
|
name: name of the head. If provided, summary and metrics keys will be
|
||||||
suffixed by `"/" + name`.
|
suffixed by `"/" + name`.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
|
if not callable(generator_loss_fn):
|
||||||
|
raise TypeError('generator_loss_fn must be callable.')
|
||||||
|
if not callable(discriminator_loss_fn):
|
||||||
|
raise TypeError('discriminator_loss_fn must be callable.')
|
||||||
|
if use_loss_summaries not in [True, False, None]:
|
||||||
|
raise ValueError('use_loss_summaries must be True, False or None.')
|
||||||
|
if get_hooks_fn is not None and not callable(get_hooks_fn):
|
||||||
|
raise TypeError('get_hooks_fn must be callable.')
|
||||||
|
if name is not None and not isinstance(name, str):
|
||||||
|
raise TypeError('name must be string.')
|
||||||
|
|
||||||
if get_hooks_fn is None:
|
if get_hooks_fn is None:
|
||||||
get_hooks_fn = tfgan_train.get_sequential_train_hooks()
|
get_hooks_fn = tfgan_train.get_sequential_train_hooks()
|
||||||
# TODO(joelshor): Validate inputs.
|
|
||||||
|
|
||||||
if use_loss_summaries in [True, False]:
|
if use_loss_summaries in [True, False]:
|
||||||
generator_loss_fn = functools.partial(
|
generator_loss_fn = functools.partial(
|
||||||
|
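The argument checks added to `GANHead.__init__` can be exercised in isolation. The sketch below copies the conditions into a hypothetical free function, `validate_gan_head_args`, which is not part of TFGAN; it exists only to show which inputs raise which exception.

```python
def validate_gan_head_args(generator_loss_fn, discriminator_loss_fn,
                           use_loss_summaries=True, get_hooks_fn=None,
                           name=None):
    """Hypothetical stand-alone copy of the checks added to GANHead."""
    if not callable(generator_loss_fn):
        raise TypeError('generator_loss_fn must be callable.')
    if not callable(discriminator_loss_fn):
        raise TypeError('discriminator_loss_fn must be callable.')
    if use_loss_summaries not in [True, False, None]:
        raise ValueError('use_loss_summaries must be True, False or None.')
    if get_hooks_fn is not None and not callable(get_hooks_fn):
        raise TypeError('get_hooks_fn must be callable.')
    if name is not None and not isinstance(name, str):
        raise TypeError('name must be string.')


def dummy_loss(*args, **kwargs):
    """Placeholder loss function used only for this illustration."""
    return 0.0


validate_gan_head_args(dummy_loss, dummy_loss, name='gan')  # passes silently
try:
    validate_gan_head_args(dummy_loss, 'not callable')
except TypeError as err:
    print(err)  # discriminator_loss_fn must be callable.
```

Because the membership test uses equality, the integers `1` and `0` also pass the `use_loss_summaries` check, since `1 == True` and `0 == False` in Python.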
@ -570,7 +570,7 @@ class MutualInformationPenaltyTest(test.TestCase, _PenaltyTest):
|
|||||||
'predicted_distributions': self._predicted_distributions,
|
'predicted_distributions': self._predicted_distributions,
|
||||||
}
|
}
|
||||||
self._expected_loss = 1.61610
|
self._expected_loss = 1.61610
|
||||||
self._expected_op_name = 'mutual_information_loss/mul'
|
self._expected_op_name = 'mutual_information_loss/mul_1'
|
||||||
self._batch_size = 2
|
self._batch_size = 2
|
||||||
|
|
||||||
|
|
||||||
|
@ -35,6 +35,7 @@ typedef Eigen::ThreadPoolDevice CPUDevice;
|
|||||||
template struct FillProjectiveTransform<CPUDevice, uint8>;
|
template struct FillProjectiveTransform<CPUDevice, uint8>;
|
||||||
template struct FillProjectiveTransform<CPUDevice, int32>;
|
template struct FillProjectiveTransform<CPUDevice, int32>;
|
||||||
template struct FillProjectiveTransform<CPUDevice, int64>;
|
template struct FillProjectiveTransform<CPUDevice, int64>;
|
||||||
|
template struct FillProjectiveTransform<CPUDevice, Eigen::half>;
|
||||||
template struct FillProjectiveTransform<CPUDevice, float>;
|
template struct FillProjectiveTransform<CPUDevice, float>;
|
||||||
template struct FillProjectiveTransform<CPUDevice, double>;
|
template struct FillProjectiveTransform<CPUDevice, double>;
|
||||||
|
|
||||||
@ -99,6 +100,7 @@ class ImageProjectiveTransform : public OpKernel {
|
|||||||
TF_CALL_uint8(REGISTER);
|
TF_CALL_uint8(REGISTER);
|
||||||
TF_CALL_int32(REGISTER);
|
TF_CALL_int32(REGISTER);
|
||||||
TF_CALL_int64(REGISTER);
|
TF_CALL_int64(REGISTER);
|
||||||
|
TF_CALL_half(REGISTER);
|
||||||
TF_CALL_float(REGISTER);
|
TF_CALL_float(REGISTER);
|
||||||
TF_CALL_double(REGISTER);
|
TF_CALL_double(REGISTER);
|
||||||
|
|
||||||
|
@ -21,6 +21,7 @@ limitations under the License.
|
|||||||
#define EIGEN_USE_THREADS
|
#define EIGEN_USE_THREADS
|
||||||
|
|
||||||
#include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"
|
#include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"
|
||||||
|
|
||||||
#include "tensorflow/core/framework/tensor_types.h"
|
#include "tensorflow/core/framework/tensor_types.h"
|
||||||
#include "tensorflow/core/platform/types.h"
|
#include "tensorflow/core/platform/types.h"
|
||||||
|
|
||||||
@ -110,21 +111,21 @@ class ProjectiveGenerator {
|
|||||||
// f(x, y_floor) = (x_ceil - x) / (x_ceil - x_floor) * f(x_floor, y_floor)
|
// f(x, y_floor) = (x_ceil - x) / (x_ceil - x_floor) * f(x_floor, y_floor)
|
||||||
// + (x - x_floor) / (x_ceil - x_floor) * f(x_ceil, y_floor)
|
// + (x - x_floor) / (x_ceil - x_floor) * f(x_ceil, y_floor)
|
||||||
const float value_yfloor =
|
const float value_yfloor =
|
||||||
(x_ceil - x) * read_with_fill_value(batch, DenseIndex(y_floor),
|
(x_ceil - x) * static_cast<float>(read_with_fill_value(
|
||||||
DenseIndex(x_floor), channel,
|
batch, DenseIndex(y_floor), DenseIndex(x_floor),
|
||||||
fill_value) +
|
channel, fill_value)) +
|
||||||
(x - x_floor) * read_with_fill_value(batch, DenseIndex(y_floor),
|
(x - x_floor) * static_cast<float>(read_with_fill_value(
|
||||||
DenseIndex(x_ceil), channel,
|
batch, DenseIndex(y_floor), DenseIndex(x_ceil),
|
||||||
fill_value);
|
channel, fill_value));
|
||||||
// f(x, y_ceil) = (x_ceil - x) / (x_ceil - x_floor) * f(x_floor, y_ceil)
|
// f(x, y_ceil) = (x_ceil - x) / (x_ceil - x_floor) * f(x_floor, y_ceil)
|
||||||
// + (x - x_floor) / (x_ceil - x_floor) * f(x_ceil, y_ceil)
|
// + (x - x_floor) / (x_ceil - x_floor) * f(x_ceil, y_ceil)
|
||||||
const float value_yceil =
|
const float value_yceil =
|
||||||
(x_ceil - x) * read_with_fill_value(batch, DenseIndex(y_ceil),
|
(x_ceil - x) * static_cast<float>(read_with_fill_value(
|
||||||
DenseIndex(x_floor), channel,
|
batch, DenseIndex(y_ceil), DenseIndex(x_floor),
|
||||||
fill_value) +
|
channel, fill_value)) +
|
||||||
(x - x_floor) * read_with_fill_value(batch, DenseIndex(y_ceil),
|
(x - x_floor) * static_cast<float>(read_with_fill_value(
|
||||||
DenseIndex(x_ceil), channel,
|
batch, DenseIndex(y_ceil), DenseIndex(x_ceil),
|
||||||
fill_value);
|
channel, fill_value));
|
||||||
// f(x, y) = (y_ceil - y) / (y_ceil - y_floor) * f(x, y_floor)
|
// f(x, y) = (y_ceil - y) / (y_ceil - y_floor) * f(x, y_floor)
|
||||||
// + (y - y_floor) / (y_ceil - y_floor) * f(x, y_ceil)
|
// + (y - y_floor) / (y_ceil - y_floor) * f(x, y_ceil)
|
||||||
return T((y_ceil - y) * value_yfloor + (y - y_floor) * value_yceil);
|
return T((y_ceil - y) * value_yfloor + (y - y_floor) * value_yceil);
|
||||||
|
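The interpolation that `ProjectiveGenerator` performs, including the cast to `float` that this hunk adds, can be sketched in NumPy. This is an illustration rather than the kernel: it handles a single 2-D image with no batch or channel dimension, and it assumes the ceil neighbour is `floor + 1`, so the denominators in the comments equal 1. The function names mirror the C++ ones, but the Python signatures are hypothetical.

```python
import math

import numpy as np


def read_with_fill_value(image, y, x, fill_value):
    """Return image[y, x], or fill_value when (y, x) lies outside the image."""
    height, width = image.shape
    if 0 <= y < height and 0 <= x < width:
        return image[y, x]
    return fill_value


def bilinear_sample(image, y, x, fill_value=0.0):
    """Bilinearly interpolate a 2-D image at the fractional position (y, x)."""
    y_floor, x_floor = math.floor(y), math.floor(x)
    # Assumption: the ceil neighbour is floor + 1, so each denominator is 1.
    y_ceil, x_ceil = y_floor + 1, x_floor + 1

    def sample(yy, xx):
        # Cast to float32 before weighting, like static_cast<float> in the diff.
        return np.float32(read_with_fill_value(image, yy, xx, fill_value))

    wx_floor, wx_ceil = np.float32(x_ceil - x), np.float32(x - x_floor)
    wy_floor, wy_ceil = np.float32(y_ceil - y), np.float32(y - y_floor)
    value_yfloor = (wx_floor * sample(y_floor, x_floor)
                    + wx_ceil * sample(y_floor, x_ceil))
    value_yceil = (wx_floor * sample(y_ceil, x_floor)
                   + wx_ceil * sample(y_ceil, x_ceil))
    # Convert back to the image dtype only once, at the end.
    return image.dtype.type(wy_floor * value_yfloor + wy_ceil * value_yceil)


image = np.array([[0, 10], [20, 30]], dtype=np.float16)
center = bilinear_sample(image, 0.5, 0.5)
print(center, center.dtype)  # 15.0 float16
```

Doing the weighted sum in `float32` and converting to the image dtype at the end is what lets the same code path serve half-precision inputs.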
@ -29,7 +29,7 @@ using shape_inference::ShapeHandle;
|
|||||||
REGISTER_OP("ImageProjectiveTransform")
|
REGISTER_OP("ImageProjectiveTransform")
|
||||||
.Input("images: dtype")
|
.Input("images: dtype")
|
||||||
.Input("transforms: float32")
|
.Input("transforms: float32")
|
||||||
.Attr("dtype: {uint8, int32, int64, float32, float64}")
|
.Attr("dtype: {uint8, int32, int64, float16, float32, float64}")
|
||||||
.Attr("interpolation: string")
|
.Attr("interpolation: string")
|
||||||
.Output("transformed_images: dtype")
|
.Output("transformed_images: dtype")
|
||||||
.SetShapeFn([](InferenceContext* c) {
|
.SetShapeFn([](InferenceContext* c) {
|
||||||
|
@ -30,7 +30,8 @@ from tensorflow.python.ops import math_ops
|
|||||||
from tensorflow.python.platform import googletest
|
from tensorflow.python.platform import googletest
|
||||||
|
|
||||||
_DTYPES = set(
|
_DTYPES = set(
|
||||||
[dtypes.uint8, dtypes.int32, dtypes.int64, dtypes.float32, dtypes.float64])
|
[dtypes.uint8, dtypes.int32, dtypes.int64,
|
||||||
|
dtypes.float16, dtypes.float32, dtypes.float64])
|
||||||
|
|
||||||
|
|
||||||
class ImageOpsTest(test_util.TensorFlowTestCase):
|
class ImageOpsTest(test_util.TensorFlowTestCase):
|
||||||
|
@ -33,7 +33,8 @@ _image_ops_so = loader.load_op_library(
|
|||||||
resource_loader.get_path_to_datafile("_image_ops.so"))
|
resource_loader.get_path_to_datafile("_image_ops.so"))
|
||||||
|
|
||||||
_IMAGE_DTYPES = set(
|
_IMAGE_DTYPES = set(
|
||||||
[dtypes.uint8, dtypes.int32, dtypes.int64, dtypes.float32, dtypes.float64])
|
[dtypes.uint8, dtypes.int32, dtypes.int64,
|
||||||
|
dtypes.float16, dtypes.float32, dtypes.float64])
|
||||||
|
|
||||||
ops.RegisterShape("ImageConnectedComponents")(common_shapes.call_cpp_shape_fn)
|
ops.RegisterShape("ImageConnectedComponents")(common_shapes.call_cpp_shape_fn)
|
||||||
ops.RegisterShape("ImageProjectiveTransform")(common_shapes.call_cpp_shape_fn)
|
ops.RegisterShape("ImageProjectiveTransform")(common_shapes.call_cpp_shape_fn)
|
||||||
|
@ -1356,7 +1356,7 @@ class DropoutTest(test.TestCase):
|
|||||||
with self.test_session():
|
with self.test_session():
|
||||||
images = np.random.uniform(size=(5, height, width, 3))
|
images = np.random.uniform(size=(5, height, width, 3))
|
||||||
output = _layers.dropout(images)
|
output = _layers.dropout(images)
|
||||||
self.assertEqual(output.op.name, 'Dropout/dropout/mul')
|
self.assertEqual(output.op.name, 'Dropout/dropout_1/mul')
|
||||||
output.get_shape().assert_is_compatible_with(
|
output.get_shape().assert_is_compatible_with(
|
||||||
ops.convert_to_tensor(images).get_shape())
|
ops.convert_to_tensor(images).get_shape())
|
||||||
|
|
||||||
|
@ -57,3 +57,39 @@ dependencies {
|
|||||||
|
|
||||||
testCompile 'junit:junit:4.12'
|
testCompile 'junit:junit:4.12'
|
||||||
}
|
}
|
||||||
|
|
||||||
|
def modelDownloadUrl = "https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_224_android_quant_2017_11_08.zip"
|
||||||
|
def localCache = "build/intermediates/mobilenet_v1_224_android_quant_2017_11_08.zip"
|
||||||
|
def targetFolder = "src/main/assets"
|
||||||
|
|
||||||
|
task downloadModel(type: DownloadUrlTask) {
|
||||||
|
doFirst {
|
||||||
|
println "Downloading ${modelDownloadUrl}"
|
||||||
|
}
|
||||||
|
sourceUrl = "${modelDownloadUrl}"
|
||||||
|
target = file("${localCache}")
|
||||||
|
}
|
||||||
|
|
||||||
|
task unzipModel(type: Copy, dependsOn: 'downloadModel') {
|
||||||
|
doFirst {
|
||||||
|
println "Unzipping ${localCache}"
|
||||||
|
}
|
||||||
|
from zipTree("${localCache}")
|
||||||
|
into "${targetFolder}"
|
||||||
|
}
|
||||||
|
|
||||||
|
// Ensure the model file is downloaded and extracted before every build
|
||||||
|
preBuild.dependsOn unzipModel
|
||||||
|
|
||||||
|
class DownloadUrlTask extends DefaultTask {
|
||||||
|
@Input
|
||||||
|
String sourceUrl
|
||||||
|
|
||||||
|
@OutputFile
|
||||||
|
File target
|
||||||
|
|
||||||
|
@TaskAction
|
||||||
|
void download() {
|
||||||
|
ant.get(src: sourceUrl, dest: target)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
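The `downloadModel` and `unzipModel` tasks amount to "fetch an archive into a local cache, then extract it into the assets folder". A rough Python equivalent is sketched below for readers who want to run the same step by hand. `download_and_unzip` is a hypothetical helper, and skipping the download when the cache file already exists is an assumption that only approximates Gradle's up-to-date check on the `@OutputFile`.

```python
import pathlib
import urllib.request
import zipfile


def download_and_unzip(source_url, local_cache, target_folder):
    """Download source_url to local_cache if missing, then extract it."""
    local_cache = pathlib.Path(local_cache)
    target_folder = pathlib.Path(target_folder)
    if not local_cache.exists():
        local_cache.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(source_url, str(local_cache))
    target_folder.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(local_cache) as archive:
        archive.extractall(target_folder)
    # Report what ended up in the target folder.
    return sorted(p.name for p in target_folder.iterdir())
```

Calling it with the `modelDownloadUrl`, `localCache` and `targetFolder` values from the build script would mirror what `preBuild.dependsOn unzipModel` triggers before each build.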
@ -39,7 +39,7 @@ class ExpandDimsOpModel : public SingleOpModel {
|
|||||||
void SetInputFloat(std::initializer_list<float> data) {
|
void SetInputFloat(std::initializer_list<float> data) {
|
||||||
PopulateTensor<float>(input_, data);
|
PopulateTensor<float>(input_, data);
|
||||||
}
|
}
|
||||||
void SetAxis(int axis) { PopulateTensor<int32>(axis_, {axis}); }
|
void SetAxis(int axis) { PopulateTensor<int32_t>(axis_, {axis}); }
|
||||||
std::vector<float> GetValuesFloat() { return ExtractVector<float>(output_); }
|
std::vector<float> GetValuesFloat() { return ExtractVector<float>(output_); }
|
||||||
std::vector<int> GetOutputShape() { return GetTensorShape(output_); }
|
std::vector<int> GetOutputShape() { return GetTensorShape(output_); }
|
||||||
|
|
||||||
@ -51,7 +51,7 @@ class ExpandDimsOpModel : public SingleOpModel {
|
|||||||
|
|
||||||
TEST(ExpandDimsOpTest, DifferentAxis) {
|
TEST(ExpandDimsOpTest, DifferentAxis) {
|
||||||
ExpandDimsOpModel m({2, 2}, TensorType_FLOAT32);
|
ExpandDimsOpModel m({2, 2}, TensorType_FLOAT32);
|
||||||
const auto values = {-1.f, 1.f, -2.f, 2.f};
|
std::initializer_list<float> values = {-1.f, 1.f, -2.f, 2.f};
|
||||||
m.SetInputFloat(values);
|
m.SetInputFloat(values);
|
||||||
m.SetAxis(0);
|
m.SetAxis(0);
|
||||||
m.Invoke();
|
m.Invoke();
|
||||||
|
@ -126,10 +126,10 @@ TEST(MaximumOpTest, FloatWithBroadcastTest) {
|
|||||||
TEST(MaximumOpTest, Int32WithBroadcastTest) {
|
TEST(MaximumOpTest, Int32WithBroadcastTest) {
|
||||||
std::initializer_list<int32_t> data1 = {1, 0, -1, -2, 3, 11};
|
std::initializer_list<int32_t> data1 = {1, 0, -1, -2, 3, 11};
|
||||||
std::initializer_list<int32_t> data2 = {2};
|
std::initializer_list<int32_t> data2 = {2};
|
||||||
TestModel<int32>(BuiltinOperator_MAXIMUM, {TensorType_INT32, {3, 1, 2}},
|
TestModel<int32_t>(BuiltinOperator_MAXIMUM, {TensorType_INT32, {3, 1, 2}},
|
||||||
{TensorType_INT32, {1}}, {TensorType_INT32, {3, 1, 2}},
|
{TensorType_INT32, {1}}, {TensorType_INT32, {3, 1, 2}},
|
||||||
data1, data2, {2, 2, 2, 2, 3, 11});
|
data1, data2, {2, 2, 2, 2, 3, 11});
|
||||||
TestModel<int32>(BuiltinOperator_MINIMUM, {TensorType_INT32, {3, 1, 2}},
|
TestModel<int32_t>(BuiltinOperator_MINIMUM, {TensorType_INT32, {3, 1, 2}},
|
||||||
{TensorType_INT32, {1}}, {TensorType_INT32, {3, 1, 2}},
|
{TensorType_INT32, {1}}, {TensorType_INT32, {3, 1, 2}},
|
||||||
data1, data2, {1, 0, -1, -2, 2, 2});
|
data1, data2, {1, 0, -1, -2, 2, 2});
|
||||||
}
|
}
|
||||||
|
@ -58,9 +58,9 @@ TEST(NegOpModel, NegFloat) {
|
|||||||
|
|
||||||
TEST(NegOpModel, NegInt32) {
|
TEST(NegOpModel, NegInt32) {
|
||||||
NegOpModel m({TensorType_INT32, {2, 3}}, {TensorType_INT32, {2, 3}});
|
NegOpModel m({TensorType_INT32, {2, 3}}, {TensorType_INT32, {2, 3}});
|
||||||
m.SetInput<int32>({-2, -1, 0, 1, 2, 3});
|
m.SetInput<int32_t>({-2, -1, 0, 1, 2, 3});
|
||||||
m.Invoke();
|
m.Invoke();
|
||||||
EXPECT_THAT(m.GetOutput<int32>(), ElementsAreArray({2, 1, 0, -1, -2, -3}));
|
EXPECT_THAT(m.GetOutput<int32_t>(), ElementsAreArray({2, 1, 0, -1, -2, -3}));
|
||||||
}
|
}
|
||||||
|
|
||||||
TEST(NegOpModel, NegInt64) {
|
TEST(NegOpModel, NegInt64) {
|
||||||
|
@ -88,11 +88,11 @@ TEST(SelectOpTest, SelectUInt8) {
|
|||||||
TensorType_UINT8);
|
TensorType_UINT8);
|
||||||
|
|
||||||
model.PopulateTensor<bool>(model.input1(), {false, true, false, false});
|
model.PopulateTensor<bool>(model.input1(), {false, true, false, false});
|
||||||
model.PopulateTensor<uint8>(model.input2(), {1, 2, 3, 4});
|
model.PopulateTensor<uint8_t>(model.input2(), {1, 2, 3, 4});
|
||||||
model.PopulateTensor<uint8>(model.input3(), {5, 6, 7, 8});
|
model.PopulateTensor<uint8_t>(model.input3(), {5, 6, 7, 8});
|
||||||
model.Invoke();
|
model.Invoke();
|
||||||
|
|
||||||
EXPECT_THAT(model.GetOutput<uint8>(), ElementsAreArray({5, 2, 7, 8}));
|
EXPECT_THAT(model.GetOutput<uint8_t>(), ElementsAreArray({5, 2, 7, 8}));
|
||||||
EXPECT_THAT(model.GetOutputShape(), ElementsAreArray({1, 1, 1, 4}));
|
EXPECT_THAT(model.GetOutputShape(), ElementsAreArray({1, 1, 1, 4}));
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -101,11 +101,11 @@ TEST(SelectOpTest, SelectInt32) {
|
|||||||
TensorType_INT32);
|
TensorType_INT32);
|
||||||
|
|
||||||
model.PopulateTensor<bool>(model.input1(), {false, true, false, false});
|
model.PopulateTensor<bool>(model.input1(), {false, true, false, false});
|
||||||
model.PopulateTensor<int32>(model.input2(), {1, 2, 3, 4});
|
model.PopulateTensor<int32_t>(model.input2(), {1, 2, 3, 4});
|
||||||
model.PopulateTensor<int32>(model.input3(), {5, 6, 7, 8});
|
model.PopulateTensor<int32_t>(model.input3(), {5, 6, 7, 8});
|
||||||
model.Invoke();
|
model.Invoke();
|
||||||
|
|
||||||
EXPECT_THAT(model.GetOutput<int32>(), ElementsAreArray({5, 2, 7, 8}));
|
EXPECT_THAT(model.GetOutput<int32_t>(), ElementsAreArray({5, 2, 7, 8}));
|
||||||
EXPECT_THAT(model.GetOutputShape(), ElementsAreArray({1, 1, 1, 4}));
|
EXPECT_THAT(model.GetOutputShape(), ElementsAreArray({1, 1, 1, 4}));
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -113,11 +113,11 @@ TEST(SelectOpTest, RankOneSelectInt32) {
|
|||||||
SelectOpModel model({2}, {2, 1, 2, 1}, {2, 1, 2, 1}, TensorType_INT32);
|
SelectOpModel model({2}, {2, 1, 2, 1}, {2, 1, 2, 1}, TensorType_INT32);
|
||||||
|
|
||||||
model.PopulateTensor<bool>(model.input1(), {false, true});
|
model.PopulateTensor<bool>(model.input1(), {false, true});
|
||||||
model.PopulateTensor<int32>(model.input2(), {1, 2, 3, 4});
|
model.PopulateTensor<int32_t>(model.input2(), {1, 2, 3, 4});
|
||||||
model.PopulateTensor<int32>(model.input3(), {5, 6, 7, 8});
|
model.PopulateTensor<int32_t>(model.input3(), {5, 6, 7, 8});
|
||||||
model.Invoke();
|
model.Invoke();
|
||||||
|
|
||||||
EXPECT_THAT(model.GetOutput<int32>(), ElementsAreArray({5, 6, 3, 4}));
|
EXPECT_THAT(model.GetOutput<int32_t>(), ElementsAreArray({5, 6, 3, 4}));
|
||||||
EXPECT_THAT(model.GetOutputShape(), ElementsAreArray({2, 1, 2, 1}));
|
EXPECT_THAT(model.GetOutputShape(), ElementsAreArray({2, 1, 2, 1}));
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -125,11 +125,11 @@ TEST(SelectOpTest, RankZeroSelectInt32) {
|
|||||||
SelectOpModel model({1}, {1, 2, 2, 1}, {1, 2, 2, 1}, TensorType_INT32);
|
SelectOpModel model({1}, {1, 2, 2, 1}, {1, 2, 2, 1}, TensorType_INT32);
|
||||||
|
|
||||||
model.PopulateTensor<bool>(model.input1(), {false});
|
model.PopulateTensor<bool>(model.input1(), {false});
|
||||||
model.PopulateTensor<int32>(model.input2(), {1, 2, 3, 4});
|
model.PopulateTensor<int32_t>(model.input2(), {1, 2, 3, 4});
|
||||||
model.PopulateTensor<int32>(model.input3(), {5, 6, 7, 8});
|
model.PopulateTensor<int32_t>(model.input3(), {5, 6, 7, 8});
|
||||||
model.Invoke();
|
model.Invoke();
|
||||||
|
|
||||||
EXPECT_THAT(model.GetOutput<int32>(), ElementsAreArray({5, 6, 7, 8}));
|
EXPECT_THAT(model.GetOutput<int32_t>(), ElementsAreArray({5, 6, 7, 8}));
|
||||||
EXPECT_THAT(model.GetOutputShape(), ElementsAreArray({1, 2, 2, 1}));
|
EXPECT_THAT(model.GetOutputShape(), ElementsAreArray({1, 2, 2, 1}));
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -21,7 +21,6 @@ limitations under the License.
|
|||||||
namespace tflite {
|
namespace tflite {
|
||||||
namespace {
|
namespace {
|
||||||
|
|
||||||
using ::int32;
|
|
||||||
using ::testing::ElementsAreArray;
|
using ::testing::ElementsAreArray;
|
||||||
|
|
||||||
template <typename input_type = float,
|
template <typename input_type = float,
|
||||||
@ -50,14 +49,14 @@ class StridedSliceOpModel : public SingleOpModel {
|
|||||||
void SetInput(std::initializer_list<input_type> data) {
|
void SetInput(std::initializer_list<input_type> data) {
|
||||||
PopulateTensor<input_type>(input_, data);
|
PopulateTensor<input_type>(input_, data);
|
||||||
}
|
}
|
||||||
void SetBegin(std::initializer_list<int32> data) {
|
void SetBegin(std::initializer_list<int32_t> data) {
|
||||||
PopulateTensor<int32>(begin_, data);
|
PopulateTensor<int32_t>(begin_, data);
|
||||||
}
|
}
|
||||||
void SetEnd(std::initializer_list<int32> data) {
|
void SetEnd(std::initializer_list<int32_t> data) {
|
||||||
PopulateTensor<int32>(end_, data);
|
PopulateTensor<int32_t>(end_, data);
|
||||||
}
|
}
|
||||||
void SetStrides(std::initializer_list<int32> data) {
|
void SetStrides(std::initializer_list<int32_t> data) {
|
||||||
PopulateTensor<int32>(strides_, data);
|
PopulateTensor<int32_t>(strides_, data);
|
||||||
}
|
}
|
||||||
|
|
||||||
std::vector<input_type> GetOutput() {
|
std::vector<input_type> GetOutput() {
|
||||||
@ -566,7 +565,7 @@ TEST(StridedSliceOpTest, RunTwice) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
TEST(StridedSliceOpTest, In3D_IdentityShrinkAxis1Uint8) {
|
TEST(StridedSliceOpTest, In3D_IdentityShrinkAxis1Uint8) {
|
||||||
StridedSliceOpModel<uint8, TensorType_UINT8> m({2, 3, 2}, {3}, {3}, {3}, 0, 0,
|
StridedSliceOpModel<uint8_t, TensorType_UINT8> m({2, 3, 2}, {3}, {3}, {3}, 0, 0,
|
||||||
0, 0, 1);
|
0, 0, 1);
|
||||||
m.SetInput({1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12});
|
m.SetInput({1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12});
|
||||||
m.SetBegin({0, 0, 0});
|
m.SetBegin({0, 0, 0});
|
||||||
|
@ -22,22 +22,22 @@ using ::testing::ElementsAreArray;
|
|||||||
|
|
||||||
TEST(TestUtilTest, QuantizeVector) {
|
TEST(TestUtilTest, QuantizeVector) {
|
||||||
std::vector<float> data = {-1.0, -0.5, 0.0, 0.5, 1.0, 1000.0};
|
std::vector<float> data = {-1.0, -0.5, 0.0, 0.5, 1.0, 1000.0};
|
||||||
auto q_data = Quantize<uint8>(data, /*scale=*/1.0, /*zero_point=*/0);
|
auto q_data = Quantize<uint8_t>(data, /*scale=*/1.0, /*zero_point=*/0);
|
||||||
std::vector<uint8> expected = {0, 0, 0, 1, 1, 255};
|
std::vector<uint8_t> expected = {0, 0, 0, 1, 1, 255};
|
||||||
EXPECT_THAT(q_data, ElementsAreArray(expected));
|
EXPECT_THAT(q_data, ElementsAreArray(expected));
|
||||||
}
|
}
|
||||||
|
|
||||||
TEST(TestUtilTest, QuantizeVectorScalingDown) {
|
TEST(TestUtilTest, QuantizeVectorScalingDown) {
|
||||||
std::vector<float> data = {-1.0, -0.5, 0.0, 0.5, 1.0, 1000.0};
|
std::vector<float> data = {-1.0, -0.5, 0.0, 0.5, 1.0, 1000.0};
|
||||||
auto q_data = Quantize<uint8>(data, /*scale=*/10.0, /*zero_point=*/0);
|
auto q_data = Quantize<uint8_t>(data, /*scale=*/10.0, /*zero_point=*/0);
|
||||||
std::vector<uint8> expected = {0, 0, 0, 0, 0, 100};
|
std::vector<uint8_t> expected = {0, 0, 0, 0, 0, 100};
|
||||||
EXPECT_THAT(q_data, ElementsAreArray(expected));
|
EXPECT_THAT(q_data, ElementsAreArray(expected));
|
||||||
}
|
}
|
||||||
|
|
||||||
TEST(TestUtilTest, QuantizeVectorScalingUp) {
|
TEST(TestUtilTest, QuantizeVectorScalingUp) {
|
||||||
std::vector<float> data = {-1.0, -0.5, 0.0, 0.5, 1.0, 1000.0};
|
std::vector<float> data = {-1.0, -0.5, 0.0, 0.5, 1.0, 1000.0};
|
||||||
auto q_data = Quantize<uint8>(data, /*scale=*/0.1, /*zero_point=*/0);
|
auto q_data = Quantize<uint8_t>(data, /*scale=*/0.1, /*zero_point=*/0);
|
||||||
std::vector<uint8> expected = {0, 0, 0, 5, 10, 255};
|
std::vector<uint8_t> expected = {0, 0, 0, 5, 10, 255};
|
||||||
EXPECT_THAT(q_data, ElementsAreArray(expected));
|
EXPECT_THAT(q_data, ElementsAreArray(expected));
|
||||||
}
|
}
|
||||||
|
|
||||||
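The expected vectors in the three `QuantizeVector` tests above are consistent with an affine quantizer that divides each value by `scale`, adds `zero_point`, rounds to the nearest integer, and clamps to the `uint8_t` range. A minimal Python sketch of that arithmetic follows. `quantize_uint8` is a hypothetical helper written for illustration, not the `Quantize` template from the test utilities, and the round-half-away-from-zero rule is an assumption chosen because it reproduces the expected values.

```python
import math


def quantize_uint8(values, scale, zero_point):
    """Affine-quantize floats to the uint8 range [0, 255] (illustrative only)."""
    quantized = []
    for v in values:
        scaled = zero_point + v / scale
        # Assumed rounding rule: round half away from zero, so 0.5 becomes 1.
        rounded = int(math.copysign(math.floor(abs(scaled) + 0.5), scaled))
        # Clamp to the representable uint8 range.
        quantized.append(max(0, min(255, rounded)))
    return quantized


data = [-1.0, -0.5, 0.0, 0.5, 1.0, 1000.0]
print(quantize_uint8(data, scale=1.0, zero_point=0))
print(quantize_uint8(data, scale=10.0, zero_point=0))
print(quantize_uint8(data, scale=0.1, zero_point=0))
```

Negative inputs clamp to 0 and 1000.0 saturates at 255 whenever the scaled value exceeds the range, which is why only the scale of 10.0 lets the last element through as 100.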
|
@ -38,27 +38,27 @@ class TileOpModel : public SingleOpModel {
|
|||||||
PopulateTensor<float>(input_, data);
|
PopulateTensor<float>(input_, data);
|
||||||
}
|
}
|
||||||
|
|
||||||
void SetInputUInt8(std::initializer_list<uint8> data) {
|
void SetInputUInt8(std::initializer_list<uint8_t> data) {
|
||||||
PopulateTensor<uint8>(input_, data);
|
PopulateTensor<uint8_t>(input_, data);
|
||||||
}
|
}
|
||||||
|
|
||||||
void SetInputInt32(std::initializer_list<int32> data) {
|
void SetInputInt32(std::initializer_list<int32_t> data) {
|
||||||
PopulateTensor<int32>(input_, data);
|
PopulateTensor<int32_t>(input_, data);
|
||||||
}
|
}
|
||||||
|
|
||||||
void SetInputInt64(std::initializer_list<int64_t> data) {
|
void SetInputInt64(std::initializer_list<int64_t> data) {
|
||||||
PopulateTensor<int64_t>(input_, data);
|
PopulateTensor<int64_t>(input_, data);
|
||||||
}
|
}
|
||||||
|
|
||||||
void SetMultipliers(std::initializer_list<int32> data) {
|
void SetMultipliers(std::initializer_list<int32_t> data) {
|
||||||
PopulateTensor<int32>(multipliers_, data);
|
PopulateTensor<int32_t>(multipliers_, data);
|
||||||
}
|
}
|
||||||
|
|
||||||
std::vector<float> GetOutputFloat() { return ExtractVector<float>(output_); }
|
std::vector<float> GetOutputFloat() { return ExtractVector<float>(output_); }
|
||||||
|
|
||||||
std::vector<uint8> GetOutputUInt8() { return ExtractVector<uint8>(output_); }
|
std::vector<uint8_t> GetOutputUInt8() { return ExtractVector<uint8_t>(output_); }
|
||||||
|
|
||||||
std::vector<int32> GetOutputInt32() { return ExtractVector<int32>(output_); }
|
std::vector<int32_t> GetOutputInt32() { return ExtractVector<int32_t>(output_); }
|
||||||
|
|
||||||
std::vector<int64_t> GetOutputInt64() {
|
std::vector<int64_t> GetOutputInt64() {
|
||||||
return ExtractVector<int64_t>(output_);
|
return ExtractVector<int64_t>(output_);
|
||||||
|
@ -42,32 +42,32 @@ class TopKV2OpModel : public SingleOpModel {
|
|||||||
PopulateTensor<float>(input_, data);
|
PopulateTensor<float>(input_, data);
|
||||||
}
|
}
|
||||||
|
|
||||||
void SetInputUInt8(std::initializer_list<uint8> data) {
|
void SetInputUInt8(std::initializer_list<uint8_t> data) {
|
||||||
PopulateTensor<uint8>(input_, data);
|
PopulateTensor<uint8_t>(input_, data);
|
||||||
}
|
}
|
||||||
|
|
||||||
void SetInputInt32(std::initializer_list<int32> data) {
|
void SetInputInt32(std::initializer_list<int32_t> data) {
|
||||||
PopulateTensor<int32>(input_, data);
|
PopulateTensor<int32_t>(input_, data);
|
||||||
}
|
}
|
||||||
|
|
||||||
void SetInputInt64(std::initializer_list<int64_t> data) {
|
void SetInputInt64(std::initializer_list<int64_t> data) {
|
||||||
PopulateTensor<int64_t>(input_, data);
|
PopulateTensor<int64_t>(input_, data);
|
||||||
}
|
}
|
||||||
|
|
||||||
std::vector<int32> GetIndexes() {
|
std::vector<int32_t> GetIndexes() {
|
||||||
return ExtractVector<int32>(output_indexes_);
|
return ExtractVector<int32_t>(output_indexes_);
|
||||||
}
|
}
|
||||||
|
|
||||||
std::vector<float> GetValuesFloat() {
|
std::vector<float> GetValuesFloat() {
|
||||||
return ExtractVector<float>(output_values_);
|
return ExtractVector<float>(output_values_);
|
||||||
}
|
}
|
||||||
|
|
||||||
std::vector<uint8> GetValuesUInt8() {
|
std::vector<uint8_t> GetValuesUInt8() {
|
||||||
return ExtractVector<uint8>(output_values_);
|
return ExtractVector<uint8_t>(output_values_);
|
||||||
}
|
}
|
||||||
|
|
||||||
std::vector<int32> GetValuesInt32() {
|
std::vector<int32_t> GetValuesInt32() {
|
||||||
return ExtractVector<int32>(output_values_);
|
return ExtractVector<int32_t>(output_values_);
|
||||||
}
|
}
|
||||||
|
|
||||||
std::vector<int64_t> GetValuesInt64() {
|
std::vector<int64_t> GetValuesInt64() {
|
||||||
@ -119,7 +119,7 @@ TEST(TopKV2OpTest, VectorFloat) {
|
|||||||
EXPECT_THAT(m.GetValuesFloat(), ElementsAreArray(ArrayFloatNear({0.8, 0.2})));
|
EXPECT_THAT(m.GetValuesFloat(), ElementsAreArray(ArrayFloatNear({0.8, 0.2})));
|
||||||
}
|
}
|
||||||
|
|
||||||
// Check that uint8 works.
|
// Check that uint8_t works.
|
||||||
TEST(TopKV2OpTest, TypeUint8) {
|
TEST(TopKV2OpTest, TypeUint8) {
|
||||||
TopKV2OpModel m({2, 3}, TensorType_UINT8, 2);
|
TopKV2OpModel m({2, 3}, TensorType_UINT8, 2);
|
||||||
m.SetInputUInt8({1, 2, 3, 251, 250, 249});
|
m.SetInputUInt8({1, 2, 3, 251, 250, 249});
|
||||||
@ -128,7 +128,7 @@ TEST(TopKV2OpTest, TypeUint8) {
|
|||||||
EXPECT_THAT(m.GetValuesUInt8(), ElementsAreArray({3, 2, 251, 250}));
|
EXPECT_THAT(m.GetValuesUInt8(), ElementsAreArray({3, 2, 251, 250}));
|
||||||
}
|
}
|
||||||
|
|
||||||
// Check that int32 works.
|
// Check that int32_t works.
|
||||||
TEST(TopKV2OpTest, TypeInt32) {
|
TEST(TopKV2OpTest, TypeInt32) {
|
||||||
TopKV2OpModel m({2, 3}, TensorType_INT32, 2);
|
TopKV2OpModel m({2, 3}, TensorType_INT32, 2);
|
||||||
m.SetInputInt32({1, 2, 3, 10251, 10250, 10249});
|
m.SetInputInt32({1, 2, 3, 10251, 10250, 10249});
|
||||||
|
@ -105,7 +105,7 @@ def _convert_model(flags):
|
|||||||
input_arrays = converter.get_input_arrays()
|
input_arrays = converter.get_input_arrays()
|
||||||
std_dev_values = _parse_array(flags.std_dev_values, type_fn=int)
|
std_dev_values = _parse_array(flags.std_dev_values, type_fn=int)
|
||||||
mean_values = _parse_array(flags.mean_values, type_fn=int)
|
mean_values = _parse_array(flags.mean_values, type_fn=int)
|
||||||
quant_stats = zip(mean_values, std_dev_values)
|
quant_stats = list(zip(mean_values, std_dev_values))
|
||||||
if ((not flags.input_arrays and len(input_arrays) > 1) or
|
if ((not flags.input_arrays and len(input_arrays) > 1) or
|
||||||
(len(input_arrays) != len(quant_stats))):
|
(len(input_arrays) != len(quant_stats))):
|
||||||
raise ValueError("Mismatching --input_arrays, --std_dev_values, and "
|
raise ValueError("Mismatching --input_arrays, --std_dev_values, and "
|
||||||
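The `list(zip(...))` change in the hunk above matters because the very next statement calls `len(quant_stats)`. A short sketch of the Python 3 behaviour, using made-up mean and standard-deviation values rather than anything from the converter flags:

```python
# Placeholder values; the real ones come from --mean_values / --std_dev_values.
mean_values = [128, 127]
std_dev_values = [127, 128]

# In Python 3, zip() returns a one-shot iterator that has no len().
lazy_stats = zip(mean_values, std_dev_values)
try:
    len(lazy_stats)
    has_len = True
except TypeError:
    has_len = False

# Materializing the pairs gives a list that supports len() and re-iteration.
quant_stats = list(zip(mean_values, std_dev_values))
print(has_len, len(quant_stats), quant_stats)
```

Under Python 2 `zip` already returned a list, so the wrapped form behaves the same on both interpreters.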
|
@ -52,6 +52,7 @@ tf_custom_op_library(
|
|||||||
deps = [
|
deps = [
|
||||||
":mpi_defines",
|
":mpi_defines",
|
||||||
":mpi_message_proto_cc",
|
":mpi_message_proto_cc",
|
||||||
|
"//tensorflow/stream_executor:stream_executor_headers_lib",
|
||||||
"//third_party/mpi",
|
"//third_party/mpi",
|
||||||
],
|
],
|
||||||
)
|
)
|
||||||
|
@ -73,7 +73,7 @@ limitations under the License.
|
|||||||
*/
|
*/
|
||||||
|
|
||||||
template <class T>
|
template <class T>
|
||||||
using StatusOr = se::port::StatusOr<T>;
|
using StatusOr = stream_executor::port::StatusOr<T>;
|
||||||
|
|
||||||
using CPUDevice = Eigen::ThreadPoolDevice;
|
using CPUDevice = Eigen::ThreadPoolDevice;
|
||||||
using GPUDevice = Eigen::GpuDevice;
|
using GPUDevice = Eigen::GpuDevice;
|
||||||
|
@ -30,6 +30,7 @@ from tensorflow.contrib.opt.python.training.model_average_optimizer import *
|
|||||||
from tensorflow.contrib.opt.python.training.moving_average_optimizer import *
|
from tensorflow.contrib.opt.python.training.moving_average_optimizer import *
|
||||||
from tensorflow.contrib.opt.python.training.multitask_optimizer_wrapper import *
|
from tensorflow.contrib.opt.python.training.multitask_optimizer_wrapper import *
|
||||||
from tensorflow.contrib.opt.python.training.nadam_optimizer import *
|
from tensorflow.contrib.opt.python.training.nadam_optimizer import *
|
||||||
|
from tensorflow.contrib.opt.python.training.weight_decay_optimizers import *
|
||||||
from tensorflow.contrib.opt.python.training.powersign import *
|
from tensorflow.contrib.opt.python.training.powersign import *
|
||||||
from tensorflow.contrib.opt.python.training.variable_clipping_optimizer import *
|
from tensorflow.contrib.opt.python.training.variable_clipping_optimizer import *
|
||||||
from tensorflow.contrib.opt.python.training.weight_decay_optimizers import *
|
from tensorflow.contrib.opt.python.training.weight_decay_optimizers import *
|
||||||
|
@ -506,7 +506,7 @@ def _FoldUnfusedBatchNorms(graph, is_training, freeze_batch_norm_delay):
|
|||||||
def _IsValidUnfusedBatchNorm(graph, context):
|
def _IsValidUnfusedBatchNorm(graph, context):
|
||||||
"""Checks that the output of the unfused batch norm has consumers."""
|
"""Checks that the output of the unfused batch norm has consumers."""
|
||||||
add_shift = graph.get_operation_by_name(
|
add_shift = graph.get_operation_by_name(
|
||||||
context + '/BatchNorm/batchnorm/add_1')
|
context + '/BatchNorm/batchnorm_1/add_1')
|
||||||
# Ensure that the output tensor of batch norm has consumers, otherwise this
|
# Ensure that the output tensor of batch norm has consumers, otherwise this
|
||||||
# is a dangling node and not a match.
|
# is a dangling node and not a match.
|
||||||
return bool(add_shift.outputs[0].consumers())
|
return bool(add_shift.outputs[0].consumers())
|
||||||
@ -599,7 +599,7 @@ def _GetBatchNormParams(graph, context, has_scaling):
|
|||||||
|
|
||||||
op_suffix_mean = '/BatchNorm/moments/Squeeze'
|
op_suffix_mean = '/BatchNorm/moments/Squeeze'
|
||||||
op_suffix_variance = '/BatchNorm/moments/Squeeze_1'
|
op_suffix_variance = '/BatchNorm/moments/Squeeze_1'
|
||||||
op_suffix_epsilon = '/BatchNorm/batchnorm/add/y'
|
op_suffix_epsilon = '/BatchNorm/batchnorm_1/add/y'
|
||||||
op_suffix_bn_decay_mean = '/BatchNorm/AssignMovingAvg/decay'
|
op_suffix_bn_decay_mean = '/BatchNorm/AssignMovingAvg/decay'
|
||||||
op_suffix_bn_decay_var = '/BatchNorm/AssignMovingAvg_1/decay'
|
op_suffix_bn_decay_var = '/BatchNorm/AssignMovingAvg_1/decay'
|
||||||
|
|
||||||
@ -675,12 +675,12 @@ def _CreateFoldedOp(graph, context, has_scaling, freeze_batch_norm_delay,
|
|||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
A pair of Operations, the first is the original consumer node of the batch
|
A pair of Operations, the first is the original consumer node of the batch
|
||||||
norm (../BatchNorm/batchnorm/add_1), the second is the consumer node of
|
norm (../BatchNorm/batchnorm_1/add_1), the second is the consumer node of
|
||||||
the folded graph (add_fold).
|
the folded graph (add_fold).
|
||||||
"""
|
"""
|
||||||
mul_scale_name = 'mul_1' if has_scaling else 'mul'
|
mul_scale_name = 'mul_1' if has_scaling else 'mul'
|
||||||
mul_scale = graph.get_operation_by_name(context +
|
mul_scale = graph.get_operation_by_name(context +
|
||||||
'/BatchNorm/batchnorm/' +
|
'/BatchNorm/batchnorm_1/' +
|
||||||
mul_scale_name)
|
mul_scale_name)
|
||||||
op_below = mul_scale.inputs[0].op
|
op_below = mul_scale.inputs[0].op
|
||||||
# Skip over the BatchToSpace operation in the case of atrous convolutions.
|
# Skip over the BatchToSpace operation in the case of atrous convolutions.
|
||||||
@ -707,7 +707,7 @@ def _CreateFoldedOp(graph, context, has_scaling, freeze_batch_norm_delay,
|
|||||||
]
|
]
|
||||||
scale_name = 'mul' if has_scaling else 'Rsqrt'
|
scale_name = 'mul' if has_scaling else 'Rsqrt'
|
||||||
scale = graph.get_operation_by_name(
|
scale = graph.get_operation_by_name(
|
||||||
context + '/BatchNorm/batchnorm/' + scale_name)
|
context + '/BatchNorm/batchnorm_1/' + scale_name)
|
||||||
scale = array_ops.reshape(scale.outputs[0], new_shape,
|
scale = array_ops.reshape(scale.outputs[0], new_shape,
|
||||||
context + '/scale_reshape')
|
context + '/scale_reshape')
|
||||||
|
|
||||||
@ -735,7 +735,7 @@ def _CreateFoldedOp(graph, context, has_scaling, freeze_batch_norm_delay,
|
|||||||
[(1, mul_fold.outputs[0])])
|
[(1, mul_fold.outputs[0])])
|
||||||
|
|
||||||
add_shift = graph.get_operation_by_name(
|
add_shift = graph.get_operation_by_name(
|
||||||
context + '/BatchNorm/batchnorm/add_1')
|
context + '/BatchNorm/batchnorm_1/add_1')
|
||||||
|
|
||||||
corrected_output = conv_or_fc_folded.outputs[0]
|
corrected_output = conv_or_fc_folded.outputs[0]
|
||||||
# Copy the batch to space operation if we have an atrous convolution.
|
# Copy the batch to space operation if we have an atrous convolution.
|
||||||
@ -930,7 +930,7 @@ def _HasScaling(graph, input_to_ops_map, bn):
|
|||||||
Returns:
|
Returns:
|
||||||
A boolean indicating whether this batch norm layer has scaling enabled.
|
A boolean indicating whether this batch norm layer has scaling enabled.
|
||||||
"""
|
"""
|
||||||
rsqrt_op = graph.get_operation_by_name(bn + '/BatchNorm/batchnorm/Rsqrt')
|
rsqrt_op = graph.get_operation_by_name(bn + '/BatchNorm/batchnorm_1/Rsqrt')
|
||||||
rsqrt_consumers = input_to_ops_map.ConsumerOperations(rsqrt_op)
|
rsqrt_consumers = input_to_ops_map.ConsumerOperations(rsqrt_op)
|
||||||
|
|
||||||
return sum(1 for op in rsqrt_consumers if op.type == 'Mul') == 1
|
return sum(1 for op in rsqrt_consumers if op.type == 'Mul') == 1
|
||||||
|
@ -600,13 +600,13 @@ class FoldBatchNormsTest(test_util.TensorFlowTestCase):
|
|||||||
if has_scaling:
|
if has_scaling:
|
||||||
if fused:
|
if fused:
|
||||||
return scope + '/BatchNorm_Fold/mul'
|
return scope + '/BatchNorm_Fold/mul'
|
||||||
return scope + '/BatchNorm/batchnorm/mul'
|
return scope + '/BatchNorm/batchnorm_1/mul'
|
||||||
return scope + '/BatchNorm/batchnorm/Rsqrt'
|
return scope + '/BatchNorm/batchnorm_1/Rsqrt'
|
||||||
|
|
||||||
def _BathNormBiasName(self, scope, fused):
|
def _BathNormBiasName(self, scope, fused):
|
||||||
if fused:
|
if fused:
|
||||||
return scope + '/BatchNorm_Fold/bias'
|
return scope + '/BatchNorm_Fold/bias'
|
||||||
return scope + '/BatchNorm/batchnorm/sub'
|
return scope + '/BatchNorm/batchnorm_1/sub'
|
||||||
|
|
||||||
def _WeightInit(self, stddev):
|
def _WeightInit(self, stddev):
|
||||||
"""Returns a truncated normal variable initializer.
|
"""Returns a truncated normal variable initializer.
|
||||||
|
@ -385,7 +385,7 @@ class ReceptiveFieldTest(test.TestCase):
|
|||||||
effective_stride_y, effective_padding_x, effective_padding_y) = (
|
effective_stride_y, effective_padding_x, effective_padding_y) = (
|
||||||
receptive_field.compute_receptive_field_from_graph_def(
|
receptive_field.compute_receptive_field_from_graph_def(
|
||||||
graph_def, input_node, output_node,
|
graph_def, input_node, output_node,
|
||||||
['Dropout/dropout/random_uniform']))
|
['Dropout/dropout_1/random_uniform']))
|
||||||
self.assertEqual(receptive_field_x, 3)
|
self.assertEqual(receptive_field_x, 3)
|
||||||
self.assertEqual(receptive_field_y, 3)
|
self.assertEqual(receptive_field_y, 3)
|
||||||
self.assertEqual(effective_stride_x, 4)
|
self.assertEqual(effective_stride_x, 4)
|
||||||
|
@ -18,131 +18,330 @@ from __future__ import absolute_import
|
|||||||
from __future__ import division
|
from __future__ import division
|
||||||
from __future__ import print_function
|
from __future__ import print_function
|
||||||
|
|
||||||
|
from collections import namedtuple
|
||||||
|
import itertools
|
||||||
import warnings
|
import warnings
|
||||||
import numpy as np
|
import numpy as np
|
||||||
|
import six
|
||||||
|
|
||||||
from tensorflow.contrib import tensorrt as trt
|
from tensorflow.contrib import tensorrt as trt
|
||||||
from tensorflow.core.protobuf import config_pb2 as cpb2
|
from tensorflow.core.protobuf import config_pb2
|
||||||
from tensorflow.python.framework import constant_op as cop
|
from tensorflow.core.protobuf import rewriter_config_pb2
|
||||||
from tensorflow.python.framework import dtypes as dtypes
|
from tensorflow.python.framework import constant_op
|
||||||
from tensorflow.python.framework import importer as importer
|
from tensorflow.python.framework import dtypes
|
||||||
from tensorflow.python.framework import ops as ops
|
from tensorflow.python.framework import importer
|
||||||
|
from tensorflow.python.framework import ops
|
||||||
from tensorflow.python.framework import test_util
|
from tensorflow.python.framework import test_util
|
||||||
from tensorflow.python.ops import array_ops as aops
|
from tensorflow.python.ops import array_ops
|
||||||
from tensorflow.python.ops import nn as nn
|
from tensorflow.python.ops import math_ops
|
||||||
from tensorflow.python.ops import nn_ops as nn_ops
|
from tensorflow.python.ops import nn
|
||||||
from tensorflow.python.platform import googletest
|
from tensorflow.python.ops import nn_ops
|
||||||
|
from tensorflow.python.platform import test
|
||||||
|
|
||||||
|
INPUT_NAME = "input"
|
||||||
|
OUTPUT_NAME = "output"
|
||||||
|
INPUT_DIMS = [100, 24, 24, 2]
|
||||||
|
MODE_FP32 = "FP32"
|
||||||
|
MODE_FP16 = "FP16"
|
||||||
|
MODE_INT8 = "INT8"
|
||||||
|
|
||||||
|
if six.PY2:
|
||||||
|
to_bytes = lambda s: s
|
||||||
|
to_string = lambda s: s
|
||||||
|
else:
|
||||||
|
to_bytes = lambda s: s.encode("utf-8", errors="surrogateescape")
|
||||||
|
to_string = lambda s: s.decode("utf-8")
|
||||||
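The `to_bytes` / `to_string` helpers above exist because a Python 3 `str` is not a byte string, and the test later assigns `to_bytes(precision_mode)` to a protobuf field. The sketch below mirrors the helpers without the `six` dependency (checking `sys.version_info` instead is a substitution made for this example) and shows the round trip for one of the mode constants:

```python
import sys

# Same shape as the helpers in the diff, minus the `six` import.
if sys.version_info[0] == 2:
    to_bytes = lambda s: s
    to_string = lambda s: s
else:
    to_bytes = lambda s: s.encode("utf-8", errors="surrogateescape")
    to_string = lambda s: s.decode("utf-8")

precision_mode = "FP16"
encoded = to_bytes(precision_mode)
print(type(encoded).__name__, to_string(encoded))
```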
|
|
||||||
|
|
||||||
class IntegrationTest(test_util.TensorFlowTestCase):
|
# TODO(aaroey): test graph with different dtypes.
|
||||||
|
def GetSingleEngineGraphDef(dtype=dtypes.float32):
|
||||||
|
"""Create a graph containing a single segment."""
|
||||||
|
g = ops.Graph()
|
||||||
|
with g.as_default():
|
||||||
|
inp = array_ops.placeholder(
|
||||||
|
dtype=dtype, shape=[None] + INPUT_DIMS[1:], name=INPUT_NAME)
|
||||||
|
with g.device("/GPU:0"):
|
||||||
|
conv_filter = constant_op.constant(
|
||||||
|
[[[[1., 0.5, 4., 6., 0.5, 1.], [1., 0.5, 1., 1., 0.5, 1.]]]],
|
||||||
|
name="weights",
|
||||||
|
dtype=dtype)
|
||||||
|
conv = nn.conv2d(
|
||||||
|
input=inp,
|
||||||
|
filter=conv_filter,
|
||||||
|
strides=[1, 2, 2, 1],
|
||||||
|
padding="SAME",
|
||||||
|
name="conv")
|
||||||
|
bias = constant_op.constant(
|
||||||
|
[4., 1.5, 2., 3., 5., 7.], name="bias", dtype=dtype)
|
||||||
|
added = nn.bias_add(conv, bias, name="bias_add")
|
||||||
|
relu = nn.relu(added, "relu")
|
||||||
|
identity = array_ops.identity(relu, "identity")
|
||||||
|
pool = nn_ops.max_pool(
|
||||||
|
identity, [1, 2, 2, 1], [1, 2, 2, 1], "VALID", name="max_pool")
|
||||||
|
array_ops.squeeze(pool, name=OUTPUT_NAME)
|
||||||
|
return g.as_graph_def()
|
||||||
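The output shapes that the test later expects for these graphs follow from the standard padding arithmetic. As a rough check, assuming the usual rules (`SAME` gives `ceil(size / stride)`, `VALID` gives `floor((size - window) / stride) + 1`), the shapes after the stride-2 convolution and after the 2x2 max pool can be worked out as below. `same_out` and `valid_out` are hypothetical helper names used only for this sketch.

```python
import math


def same_out(size, stride):
    """Output length for padding="SAME": ceil(size / stride)."""
    return int(math.ceil(size / float(stride)))


def valid_out(size, window, stride):
    """Output length for padding="VALID": floor((size - window) / stride) + 1."""
    return (size - window) // stride + 1


batch, height, width = 100, 24, 24  # from INPUT_DIMS
out_channels = 6  # last dimension of the 1x1x2x6 filter constant

conv_h, conv_w = same_out(height, 2), same_out(width, 2)
pool_h, pool_w = valid_out(conv_h, 2, 2), valid_out(conv_w, 2, 2)
print((batch, conv_h, conv_w, out_channels))
print((batch, pool_h, pool_w, out_channels))
```

The final `squeeze` leaves both shapes unchanged because neither contains a dimension of size 1.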
|
|
||||||
|
|
||||||
|
# TODO(aaroey): test graph with different dtypes.
|
||||||
|
def GetMultiEngineGraphDef(dtype=dtypes.float32):
|
||||||
|
"""Create a graph containing multiple segments."""
|
||||||
|
g = ops.Graph()
|
||||||
|
with g.as_default():
|
||||||
|
inp = array_ops.placeholder(
|
||||||
|
dtype=dtype, shape=[None] + INPUT_DIMS[1:], name=INPUT_NAME)
|
||||||
|
with g.device("/GPU:0"):
|
||||||
|
conv_filter = constant_op.constant(
|
||||||
|
[[[[1., 0.5, 4., 6., 0.5, 1.], [1., 0.5, 1., 1., 0.5, 1.]]]],
|
||||||
|
name="weights",
|
||||||
|
dtype=dtype)
|
||||||
|
conv = nn.conv2d(
|
||||||
|
input=inp,
|
||||||
|
filter=conv_filter,
|
||||||
|
strides=[1, 2, 2, 1],
|
||||||
|
padding="SAME",
|
||||||
|
name="conv")
|
||||||
|
c1 = constant_op.constant(
|
||||||
|
np.random.randn(INPUT_DIMS[0], 12, 12, 6), dtype=dtype)
|
||||||
|
p = conv * c1
|
||||||
|
c2 = constant_op.constant(
|
||||||
|
np.random.randn(INPUT_DIMS[0], 12, 12, 6), dtype=dtype)
|
||||||
|
q = conv / c2
|
||||||
|
|
||||||
|
edge = math_ops.sin(q)
|
||||||
|
edge /= edge
|
||||||
|
r = edge + edge
|
||||||
|
|
||||||
|
p -= edge
|
||||||
|
q *= edge
|
||||||
|
s = p + q
|
||||||
|
s -= r
|
||||||
|
array_ops.squeeze(s, name=OUTPUT_NAME)
|
||||||
|
return g.as_graph_def()
|
||||||
|
|
||||||
|
|
||||||
|
TestGraph = namedtuple("TestGraph",
|
||||||
|
["gdef", "num_expected_engines", "expected_output_dims"])
|
||||||
|
|
||||||
|
TEST_GRAPHS = {
|
||||||
|
"SingleEngineGraph":
|
||||||
|
TestGraph(
|
||||||
|
gdef=GetSingleEngineGraphDef(),
|
||||||
|
num_expected_engines=1,
|
||||||
|
expected_output_dims=(100, 6, 6, 6)),
|
||||||
|
"MultiEngineGraph":
|
||||||
|
TestGraph(
|
||||||
|
gdef=GetMultiEngineGraphDef(),
|
||||||
|
num_expected_engines=2,
|
||||||
|
expected_output_dims=(100, 12, 12, 6)),
|
||||||
|
# TODO(aaroey): add a large complex graph to test.
|
||||||
|
}
|
||||||
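The `TEST_GRAPHS` table above pairs each graph with the values the test asserts against, keyed by name. A self-contained sketch of the same pattern, with `None` standing in for the serialized graph definitions (the placeholder `gdef` values and the `EXAMPLE_GRAPHS` name are assumptions for illustration):

```python
from collections import namedtuple

# Field names copied from the diff; the entries below use placeholder graphs.
TestGraph = namedtuple("TestGraph",
                       ["gdef", "num_expected_engines", "expected_output_dims"])

EXAMPLE_GRAPHS = {
    "SingleEngineGraph": TestGraph(
        gdef=None,  # placeholder for a serialized GraphDef
        num_expected_engines=1,
        expected_output_dims=(100, 6, 6, 6)),
    "MultiEngineGraph": TestGraph(
        gdef=None,  # placeholder for a serialized GraphDef
        num_expected_engines=2,
        expected_output_dims=(100, 12, 12, 6)),
}

# A test can loop over the table and read expectations by field name.
for key, graph in sorted(EXAMPLE_GRAPHS.items()):
    print(key, graph.num_expected_engines, graph.expected_output_dims)
```

Keeping expectations next to the graph builder means adding a new case is a single dictionary entry rather than a new test method.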
|
|
||||||
|
|
||||||
|
class TfTrtIntegrationTest(test_util.TensorFlowTestCase):
|
||||||
"""Class to test Tensorflow-TensorRT integration."""
|
"""Class to test Tensorflow-TensorRT integration."""
|
||||||
|
|
||||||
def setUp(self):
|
def setUp(self):
|
||||||
"""Setup method."""
|
"""Setup method."""
|
||||||
super(IntegrationTest, self).setUp()
|
super(TfTrtIntegrationTest, self).setUp()
|
||||||
warnings.simplefilter("always")
|
warnings.simplefilter("always")
|
||||||
inp_dims = (100, 24, 24, 2)
|
self._input = np.random.random_sample(INPUT_DIMS)
|
||||||
self._input = np.random.random_sample(inp_dims)
|
|
||||||
self._original_graph = self.get_simple_graph_def()
|
|
||||||
self._gpu_options = cpb2.GPUOptions(per_process_gpu_memory_fraction=0.50)
|
|
||||||
self._config = cpb2.ConfigProto(gpu_options=self._gpu_options)
|
|
||||||
self._reference = self.run_graph(self._original_graph, self._input)
|
|
||||||
|
|
||||||
def get_simple_graph_def(self):
|
def _GetConfigProto(self,
|
||||||
"""Create a simple graph and return its graph_def."""
|
use_optimizer,
|
||||||
g = ops.Graph()
|
precision_mode=None,
|
||||||
with g.as_default():
|
is_dynamic_op=None):
|
||||||
a = aops.placeholder(
|
if use_optimizer:
|
||||||
dtype=dtypes.float32, shape=(None, 24, 24, 2), name="input")
|
rewriter_cfg = rewriter_config_pb2.RewriterConfig()
|
||||||
e = cop.constant(
|
rewriter_cfg.optimizers.extend(["constfold", "layout"])
|
||||||
[[[[1., 0.5, 4., 6., 0.5, 1.], [1., 0.5, 1., 1., 0.5, 1.]]]],
|
custom_op = rewriter_cfg.custom_optimizers.add()
|
||||||
name="weights",
|
custom_op.name = "TensorRTOptimizer"
|
||||||
dtype=dtypes.float32)
|
custom_op.parameter_map["minimum_segment_size"].i = 3
|
||||||
conv = nn.conv2d(
|
custom_op.parameter_map["max_batch_size"].i = self._input.shape[0]
|
||||||
input=a, filter=e, strides=[1, 2, 2, 1], padding="SAME", name="conv")
|
custom_op.parameter_map["is_dynamic_op"].b = is_dynamic_op
|
||||||
b = cop.constant(
|
custom_op.parameter_map["max_workspace_size_bytes"].i = 1 << 25
|
||||||
[4., 1.5, 2., 3., 5., 7.], name="bias", dtype=dtypes.float32)
|
custom_op.parameter_map["precision_mode"].s = to_bytes(precision_mode)
|
||||||
t = nn.bias_add(conv, b, name="biasAdd")
|
graph_options = config_pb2.GraphOptions(rewrite_options=rewriter_cfg)
|
||||||
relu = nn.relu(t, "relu")
|
else:
|
||||||
idty = aops.identity(relu, "ID")
|
graph_options = config_pb2.GraphOptions()
|
||||||
v = nn_ops.max_pool(
|
|
||||||
idty, [1, 2, 2, 1], [1, 2, 2, 1], "VALID", name="max_pool")
|
|
||||||
aops.squeeze(v, name="output")
|
|
||||||
return g.as_graph_def()
|
|
||||||
|
|
||||||
def run_graph(self, gdef, dumm_inp):
|
gpu_options = config_pb2.GPUOptions()
|
||||||
"""Run given graphdef once."""
|
if trt.trt_convert.get_linked_tensorrt_version()[0] == 3:
|
||||||
ops.reset_default_graph()
|
gpu_options.per_process_gpu_memory_fraction = 0.50
|
||||||
|
|
||||||
|
config = config_pb2.ConfigProto(
|
||||||
|
gpu_options=gpu_options, graph_options=graph_options)
|
||||||
|
return config
|
||||||
|
|
||||||
|
def _RunGraph(self, graph_key, gdef, input_data, config, num_runs=2):
|
||||||
|
"""Run given graphdef multiple times."""
|
||||||
g = ops.Graph()
|
g = ops.Graph()
|
||||||
with g.as_default():
|
with g.as_default():
|
||||||
inp, out = importer.import_graph_def(
|
inp, out = importer.import_graph_def(
|
||||||
graph_def=gdef, return_elements=["input", "output"])
|
graph_def=gdef, return_elements=[INPUT_NAME, OUTPUT_NAME], name="")
|
||||||
inp = inp.outputs[0]
|
inp = inp.outputs[0]
|
||||||
out = out.outputs[0]
|
out = out.outputs[0]
|
||||||
with self.test_session(
|
with self.test_session(
|
||||||
graph=g, config=self._config, use_gpu=True, force_gpu=True) as sess:
|
graph=g, config=config, use_gpu=True, force_gpu=True) as sess:
|
||||||
val = sess.run(out, {inp: dumm_inp})
|
val = None
|
||||||
|
# Defaults to 2 runs to verify that the result is the same across multiple runs.
|
||||||
|
for _ in range(num_runs):
|
||||||
|
new_val = sess.run(out, {inp: input_data})
|
||||||
|
self.assertEquals(TEST_GRAPHS[graph_key].expected_output_dims,
|
||||||
|
new_val.shape)
|
||||||
|
if val is not None:
|
||||||
|
self.assertAllEqual(new_val, val)
|
||||||
|
val = new_val
|
||||||
return val
|
return val
|
||||||
|
|
||||||
# Use real data that is representative of the inference dataset
|
# Use real data that is representative of the inference dataset
|
||||||
# for calibration. For this test script it is random data.
|
# for calibration. For this test script it is random data.
|
||||||
def run_calibration(self, gdef, dumm_inp):
|
def _RunCalibration(self, graph_key, gdef, input_data, config):
|
||||||
"""Run given calibration graph multiple times."""
|
"""Run calibration on given graph."""
|
||||||
ops.reset_default_graph()
|
return self._RunGraph(graph_key, gdef, input_data, config, 30)
|
||||||
g = ops.Graph()
|
|
||||||
with g.as_default():
|
|
||||||
inp, out = importer.import_graph_def(
|
|
||||||
graph_def=gdef, return_elements=["input", "output"])
|
|
||||||
inp = inp.outputs[0]
|
|
||||||
out = out.outputs[0]
|
|
||||||
# run over real calibration data here, we are mimicking a calibration
|
|
||||||
# set of 30 different batches. Use as much calibration data as you want
|
|
||||||
with self.test_session(
|
|
||||||
graph=g, config=self._config, use_gpu=True, force_gpu=True) as sess:
|
|
||||||
for _ in range(30):
|
|
||||||
val = sess.run(out, {inp: dumm_inp})
|
|
||||||
return val
|
|
||||||
|
|
||||||
def get_trt_graph(self, mode):
|
def _GetTrtGraph(self, gdef, precision_mode, is_dynamic_op):
|
||||||
"""Return trt converted graph."""
|
"""Return trt converted graph."""
|
||||||
if mode in ["FP32", "FP16", "INT8"]:
|
return trt.create_inference_graph(
|
||||||
return trt.create_inference_graph(
|
input_graph_def=gdef,
|
||||||
input_graph_def=self._original_graph,
|
outputs=[OUTPUT_NAME],
|
||||||
outputs=["output"],
|
max_batch_size=self._input.shape[0],
|
||||||
max_batch_size=self._input.shape[0],
|
max_workspace_size_bytes=1 << 25,
|
||||||
max_workspace_size_bytes=1 << 25,
|
precision_mode=precision_mode,
|
||||||
precision_mode=mode, # TRT Engine precision "FP32","FP16" or "INT8"
|
minimum_segment_size=2,
|
||||||
minimum_segment_size=2 # minimum number of nodes in an engine
|
is_dynamic_op=is_dynamic_op)
|
||||||
)
|
|
||||||
return None
|
|
||||||
|
|
||||||
def testFP32(self):
|
def _VerifyGraphDef(self,
|
||||||
"""Test FP32 conversion. Results should be identical to native case."""
|
graph_key,
|
||||||
trt_graph = self.get_trt_graph("FP32")
|
gdef,
|
||||||
result = self.run_graph(trt_graph, self._input)
|
precision_mode=None,
|
||||||
self.assertAllEqual(self._reference, result)
|
is_calibrated=None,
|
||||||
result1 = self.run_graph(trt_graph, self._input)
|
dynamic_engine=None):
|
||||||
self.assertAllEqual(result1, result)
|
num_engines = 0
|
||||||
|
for n in gdef.node:
|
||||||
|
if n.op == "TRTEngineOp":
|
||||||
|
num_engines += 1
|
||||||
|
self.assertNotEqual("", n.attr["serialized_segment"].s)
|
||||||
|
self.assertNotEqual("", n.attr["segment_funcdef_name"].s)
|
||||||
|
self.assertEquals(n.attr["precision_mode"].s, precision_mode)
|
||||||
|
self.assertEquals(n.attr["static_engine"].b, not dynamic_engine)
|
||||||
|
if precision_mode == MODE_INT8 and is_calibrated:
|
||||||
|
self.assertNotEqual("", n.attr["calibration_data"].s)
|
||||||
|
else:
|
||||||
|
self.assertEquals("", n.attr["calibration_data"].s)
|
||||||
|
if precision_mode is None:
|
||||||
|
self.assertEquals(num_engines, 0)
|
||||||
|
else:
|
||||||
|
self.assertEquals(num_engines,
|
||||||
|
TEST_GRAPHS[graph_key].num_expected_engines)
|
||||||
|
|
||||||
def testFP16(self):
|
def _RunTest(self, graph_key, use_optimizer, precision_mode,
|
||||||
"""Test FP16 conversion. Results may be different from native case."""
|
dynamic_infer_engine, dynamic_calib_engine):
|
||||||
trt_graph = self.get_trt_graph("FP16")
|
assert precision_mode in [MODE_FP32, MODE_FP16, MODE_INT8]
|
||||||
result = self.run_graph(trt_graph, self._input)
|
input_gdef = TEST_GRAPHS[graph_key].gdef
|
||||||
self.assertAllClose(self._reference, result, rtol=1.e-03)
|
self._VerifyGraphDef(graph_key, input_gdef)
|
||||||
result1 = self.run_graph(trt_graph, self._input)
|
|
||||||
self.assertAllEqual(result1, result)
|
|
||||||
|
|
||||||
def testINT8(self):
|
# Get reference result without running trt.
|
||||||
"""Test INT8 conversion. Results may be different from native case."""
|
config_no_trt = self._GetConfigProto(False)
|
||||||
calib_graph = self.get_trt_graph("INT8")
|
print("Running original graph w/o trt, config:\n%s" % str(config_no_trt))
|
||||||
result = self.run_calibration(calib_graph, self._input)
|
ref_result = self._RunGraph(graph_key, input_gdef, self._input,
|
||||||
self.assertAllEqual(self._reference, result)
|
config_no_trt)
|
||||||
int8_graph = trt.calib_graph_to_infer_graph(calib_graph)
|
|
||||||
result = self.run_graph(int8_graph, self._input)
|
# Run calibration if necessary.
|
||||||
self.assertAllClose(self._reference, result, rtol=1.e-03)
|
if precision_mode == MODE_INT8:
|
||||||
result1 = self.run_graph(int8_graph, self._input)
|
|
||||||
self.assertAllEqual(result1, result)
|
calib_config = self._GetConfigProto(use_optimizer, precision_mode,
|
||||||
|
dynamic_calib_engine)
|
||||||
|
print("Running calibration graph, config:\n%s" % str(calib_config))
|
||||||
|
if use_optimizer:
|
||||||
|
self.assertTrue(False)
|
||||||
|
# TODO(aaroey): uncomment this and get infer_gdef when this mode is
|
||||||
|
# supported.
|
||||||
|
# result = self._RunCalibration(graph_key, input_gdef, self._input,
|
||||||
|
# calib_config)
|
||||||
|
else:
|
||||||
|
calib_gdef = self._GetTrtGraph(input_gdef, precision_mode,
|
||||||
|
dynamic_calib_engine)
|
||||||
|
self._VerifyGraphDef(graph_key, calib_gdef, precision_mode, False,
|
||||||
|
dynamic_calib_engine)
|
||||||
|
result = self._RunCalibration(graph_key, calib_gdef, self._input,
|
||||||
|
calib_config)
|
||||||
|
infer_gdef = trt.calib_graph_to_infer_graph(calib_gdef)
|
||||||
|
self._VerifyGraphDef(graph_key, infer_gdef, precision_mode, True,
|
||||||
|
dynamic_calib_engine)
|
||||||
|
self.assertAllClose(ref_result, result, rtol=1.e-03)
|
||||||
|
else:
|
||||||
|
infer_gdef = input_gdef
|
||||||
|
|
||||||
|
# Run inference.
|
||||||
|
infer_config = self._GetConfigProto(use_optimizer, precision_mode,
|
||||||
|
dynamic_infer_engine)
|
||||||
|
print("Running final inference graph, config:\n%s" % str(infer_config))
|
||||||
|
if use_optimizer:
|
||||||
|
result = self._RunGraph(graph_key, infer_gdef, self._input, infer_config)
|
||||||
|
else:
|
||||||
|
trt_infer_gdef = self._GetTrtGraph(infer_gdef, precision_mode,
|
||||||
|
dynamic_infer_engine)
|
||||||
|
self._VerifyGraphDef(graph_key, trt_infer_gdef, precision_mode, True,
|
||||||
|
dynamic_infer_engine)
|
||||||
|
result = self._RunGraph(graph_key, trt_infer_gdef, self._input,
|
||||||
|
infer_config)
|
||||||
|
self.assertAllClose(ref_result, result, rtol=1.e-03)
|
||||||
|
|
||||||
|
def testIdempotence(self):
|
||||||
|
# Test that applying tensorrt optimizer or offline conversion tools multiple
|
||||||
|
# times to the same graph will result in same graph.
|
||||||
|
# TODO(aaroey): implement this.
|
||||||
|
pass
|
||||||
|
|
||||||
|
|
||||||
|
def GetTests():
|
||||||
|
|
||||||
|
def _GetTest(g, u, p, i, c):
|
||||||
|
|
||||||
|
def _Test(self):
|
||||||
|
print("Running test with parameters: graph_key=%s, use_optimizer=%s, "
|
||||||
|
"precision_mode=%s, dynamic_infer_engine=%s, "
|
||||||
|
"dynamic_calib_engine=%s" % (g, u, p, i, c))
|
||||||
|
self._RunTest(g, u, p, i, c)
|
||||||
|
|
||||||
|
return _Test
|
||||||
|
|
||||||
|
use_optimizer_options = [False, True]
|
||||||
|
precision_mode_options = [MODE_FP32, MODE_FP16, MODE_INT8]
|
||||||
|
dynamic_infer_engine_options = [False, True]
|
||||||
|
dynamic_calib_engine_options = [False, True]
|
||||||
|
for (graph_key, use_optimizer, precision_mode,
|
||||||
|
dynamic_infer_engine, dynamic_calib_engine) in itertools.product(
|
||||||
|
TEST_GRAPHS, use_optimizer_options, precision_mode_options,
|
||||||
|
dynamic_infer_engine_options, dynamic_calib_engine_options):
|
||||||
|
if precision_mode == MODE_INT8:
|
||||||
|
if not dynamic_calib_engine and dynamic_infer_engine:
|
||||||
|
# TODO(aaroey): test this case, the conversion from static calibration
|
||||||
|
# engine to dynamic inference engine should be a noop.
|
||||||
|
continue
|
||||||
|
if use_optimizer:
|
||||||
|
# TODO(aaroey): if use_optimizer is True we need to get the inference
|
||||||
|
# graphdef using custom python wrapper class, which is not currently
|
||||||
|
# supported yet.
|
||||||
|
continue
|
||||||
|
if not dynamic_calib_engine:
|
||||||
|
# TODO(aaroey): construction of static calibration engine is not
|
||||||
|
# supported yet.
|
||||||
|
continue
|
||||||
|
if dynamic_calib_engine and not dynamic_infer_engine:
|
||||||
|
# TODO(aaroey): construction of static inference engine using dynamic
|
||||||
|
# calibration engine is not supported yet.
|
||||||
|
continue
|
||||||
|
else: # In non int8 mode.
|
||||||
|
if dynamic_calib_engine:
|
||||||
|
# dynamic_calib_engine doesn't affect non-int8 modes, so just let
|
||||||
|
# related tests run once on dynamic_calib_engine=False.
|
||||||
|
continue
|
||||||
|
yield _GetTest(graph_key, use_optimizer, precision_mode,
|
||||||
|
dynamic_infer_engine, dynamic_calib_engine)
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
googletest.main()
|
for index, t in enumerate(GetTests()):
|
||||||
|
setattr(TfTrtIntegrationTest, "testTfTRT_" + str(index), t)
|
||||||
|
test.main()
|
||||||
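The test-generation pattern used by `GetTests()` above (build one closure per parameter combination with `itertools.product`, then attach each to the test class with `setattr`) can be sketched in isolation. This is a minimal illustration using only the standard library; `DemoTest`, `_run_case`, `get_tests`, `attach_tests`, and the skipped combination are hypothetical names and choices made for the example, not part of this commit.

```python
import itertools
import unittest


class DemoTest(unittest.TestCase):
  """Hypothetical test case; test methods are attached dynamically below."""

  def _run_case(self, use_optimizer, precision_mode):
    # Stand-in for the real per-configuration check.
    self.assertIn(precision_mode, ("FP32", "FP16", "INT8"))
    self.assertIsInstance(use_optimizer, bool)


def get_tests():
  """Yield one test function per supported parameter combination."""

  def _get_test(use_optimizer, precision_mode):
    # A factory binds the parameters so that each closure keeps its own
    # values instead of sharing the loop variables.
    def _test(self):
      self._run_case(use_optimizer, precision_mode)

    return _test

  for use_optimizer, precision_mode in itertools.product(
      [False, True], ["FP32", "FP16", "INT8"]):
    if precision_mode == "INT8" and use_optimizer:
      # Assumed-unsupported combination, skipped for illustration only.
      continue
    yield _get_test(use_optimizer, precision_mode)


def attach_tests(cls):
  """Attach the generated tests to cls and return their method names."""
  names = []
  for index, t in enumerate(get_tests()):
    name = "testDemo_" + str(index)
    setattr(cls, name, t)
    names.append(name)
  return names
```

Routing the parameters through `_get_test` matters: defining `_test` directly inside the loop would make every generated test see the values from the last iteration.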
|
@ -25,7 +25,7 @@ END
|
|||||||
(K-1)-dimensional tensor of indices into `params`, where each element defines a
|
(K-1)-dimensional tensor of indices into `params`, where each element defines a
|
||||||
slice of `params`:
|
slice of `params`:
|
||||||
|
|
||||||
output[i_0, ..., i_{K-2}] = params[indices[i0, ..., i_{K-2}]]
|
output[\\(i_0, ..., i_{K-2}\\)] = params[indices[\\(i_0, ..., i_{K-2}\\)]]
|
||||||
|
|
||||||
Whereas in @{tf.gather} `indices` defines slices into the first
|
Whereas in @{tf.gather} `indices` defines slices into the first
|
||||||
dimension of `params`, in `tf.gather_nd`, `indices` defines slices into the
|
dimension of `params`, in `tf.gather_nd`, `indices` defines slices into the
|
||||||
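The indexing rule in this hunk, `output[i_0, ..., i_{K-2}] = params[indices[i_0, ..., i_{K-2}]]`, can be illustrated with a small pure-Python sketch over nested lists. The function below is a hypothetical reimplementation for illustration only; it is not the TensorFlow op and does no shape or bounds checking.

```python
def gather_nd(params, indices):
  """Pure-Python sketch of gather_nd on nested lists (not the TensorFlow op)."""
  # A flat list of integers is a single index tuple: walk into params along it.
  if indices and all(isinstance(i, int) for i in indices):
    result = params
    for i in indices:
      result = result[i]
    return result
  # Otherwise recurse over the outer dimensions of indices.
  return [gather_nd(params, sub) for sub in indices]
```

When the innermost index tuple is shorter than the rank of `params`, each entry selects a slice rather than a single element.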
|
@ -3,19 +3,19 @@ op {
|
|||||||
in_arg {
|
in_arg {
|
||||||
name: "start"
|
name: "start"
|
||||||
description: <<END
|
description: <<END
|
||||||
First entry in the range.
|
0-D tensor. First entry in the range.
|
||||||
END
|
END
|
||||||
}
|
}
|
||||||
in_arg {
|
in_arg {
|
||||||
name: "stop"
|
name: "stop"
|
||||||
description: <<END
|
description: <<END
|
||||||
Last entry in the range.
|
0-D tensor. Last entry in the range.
|
||||||
END
|
END
|
||||||
}
|
}
|
||||||
in_arg {
|
in_arg {
|
||||||
name: "num"
|
name: "num"
|
||||||
description: <<END
|
description: <<END
|
||||||
Number of values to generate.
|
0-D tensor. Number of values to generate.
|
||||||
END
|
END
|
||||||
}
|
}
|
||||||
out_arg {
|
out_arg {
|
||||||
|
@ -18,7 +18,7 @@ END
|
|||||||
}
|
}
|
||||||
summary: "Computes the matrix exponential of one or more square matrices:"
|
summary: "Computes the matrix exponential of one or more square matrices:"
|
||||||
description: <<END
|
description: <<END
|
||||||
exp(A) = \sum_{n=0}^\infty A^n/n!
|
\\(exp(A) = \sum_{n=0}^\infty A^n/n!\\)
|
||||||
|
|
||||||
The exponential is computed using a combination of the scaling and squaring
|
The exponential is computed using a combination of the scaling and squaring
|
||||||
method and the Pade approximation. Details can be found in:
|
method and the Pade approximation. Details can be found in:
|
||||||
|
@ -20,7 +20,7 @@ END
|
|||||||
summary: "Computes the matrix logarithm of one or more square matrices:"
|
summary: "Computes the matrix logarithm of one or more square matrices:"
|
||||||
description: <<END
|
description: <<END
|
||||||
|
|
||||||
log(exp(A)) = A
|
\\(log(exp(A)) = A\\)
|
||||||
|
|
||||||
This op is only defined for complex matrices. If A is positive-definite and
|
This op is only defined for complex matrices. If A is positive-definite and
|
||||||
real, then casting to a complex matrix, taking the logarithm and casting back
|
real, then casting to a complex matrix, taking the logarithm and casting back
|
||||||
|
@ -36,7 +36,7 @@ END
|
|||||||
summary: "Joins a string Tensor across the given dimensions."
|
summary: "Joins a string Tensor across the given dimensions."
|
||||||
description: <<END
|
description: <<END
|
||||||
Computes the string join across dimensions in the given string Tensor of shape
|
Computes the string join across dimensions in the given string Tensor of shape
|
||||||
`[d_0, d_1, ..., d_n-1]`. Returns a new Tensor created by joining the input
|
`[\\(d_0, d_1, ..., d_{n-1}\\)]`. Returns a new Tensor created by joining the input
|
||||||
strings with the given separator (default: empty string). Negative indices are
|
strings with the given separator (default: empty string). Negative indices are
|
||||||
counted backwards from the end, with `-1` being equivalent to `n - 1`. If
|
counted backwards from the end, with `-1` being equivalent to `n - 1`. If
|
||||||
indices are not specified, joins across all dimensions beginning from `n - 1`
|
indices are not specified, joins across all dimensions beginning from `n - 1`
|
||||||
|
@ -42,7 +42,7 @@ within a given variable according to `indices`.
|
|||||||
`ref` is a `Tensor` with rank `P` and `indices` is a `Tensor` of rank `Q`.
|
`ref` is a `Tensor` with rank `P` and `indices` is a `Tensor` of rank `Q`.
|
||||||
|
|
||||||
`indices` must be integer tensor, containing indices into `ref`.
|
`indices` must be integer tensor, containing indices into `ref`.
|
||||||
It must be shape `[d_0, ..., d_{Q-2}, K]` where `0 < K <= P`.
|
It must be shape \\([d_0, ..., d_{Q-2}, K]\\) where `0 < K <= P`.
|
||||||
|
|
||||||
The innermost dimension of `indices` (with length `K`) corresponds to
|
The innermost dimension of `indices` (with length `K`) corresponds to
|
||||||
indices into elements (if `K = P`) or slices (if `K < P`) along the `K`th
|
indices into elements (if `K = P`) or slices (if `K < P`) along the `K`th
|
||||||
@ -50,9 +50,7 @@ dimension of `ref`.
|
|||||||
|
|
||||||
`updates` is `Tensor` of rank `Q-1+P-K` with shape:
|
`updates` is `Tensor` of rank `Q-1+P-K` with shape:
|
||||||
|
|
||||||
```
|
$$[d_0, ..., d_{Q-2}, ref.shape[K], ..., ref.shape[P-1]].$$
|
||||||
[d_0, ..., d_{Q-2}, ref.shape[K], ..., ref.shape[P-1]].
|
|
||||||
```
|
|
||||||
|
|
||||||
For example, say we want to add 4 scattered elements to a rank-1 tensor to 8
|
For example, say we want to add 4 scattered elements to a rank-1 tensor to 8
|
||||||
elements. In Python, that addition would look like this:
|
elements. In Python, that addition would look like this:
|
||||||
|
@ -37,7 +37,7 @@ respect to both `input` and `updates`.
|
|||||||
`input` is a `Tensor` with rank `P` and `indices` is a `Tensor` of rank `Q`.
|
`input` is a `Tensor` with rank `P` and `indices` is a `Tensor` of rank `Q`.
|
||||||
|
|
||||||
`indices` must be integer tensor, containing indices into `input`.
|
`indices` must be integer tensor, containing indices into `input`.
|
||||||
It must be shape `[d_0, ..., d_{Q-2}, K]` where `0 < K <= P`.
|
It must be shape \\([d_0, ..., d_{Q-2}, K]\\) where `0 < K <= P`.
|
||||||
|
|
||||||
The innermost dimension of `indices` (with length `K`) corresponds to
|
The innermost dimension of `indices` (with length `K`) corresponds to
|
||||||
indices into elements (if `K = P`) or `(P-K)`-dimensional slices
|
indices into elements (if `K = P`) or `(P-K)`-dimensional slices
|
||||||
@ -45,9 +45,7 @@ indices into elements (if `K = P`) or `(P-K)`-dimensional slices
|
|||||||
|
|
||||||
`updates` is `Tensor` of rank `Q-1+P-K` with shape:
|
`updates` is `Tensor` of rank `Q-1+P-K` with shape:
|
||||||
|
|
||||||
```
|
$$[d_0, ..., d_{Q-2}, input.shape[K], ..., input.shape[P-1]].$$
|
||||||
[d_0, ..., d_{Q-2}, input.shape[K], ..., input.shape[P-1]].
|
|
||||||
```
|
|
||||||
|
|
||||||
For example, say we want to add 4 scattered elements to a rank-1 tensor to 8
|
For example, say we want to add 4 scattered elements to a rank-1 tensor to 8
|
||||||
elements. In Python, that addition would look like this:
|
elements. In Python, that addition would look like this:
|
||||||
|
@ -42,7 +42,7 @@ within a given variable according to `indices`.
|
|||||||
`ref` is a `Tensor` with rank `P` and `indices` is a `Tensor` of rank `Q`.
|
`ref` is a `Tensor` with rank `P` and `indices` is a `Tensor` of rank `Q`.
|
||||||
|
|
||||||
`indices` must be integer tensor, containing indices into `ref`.
|
`indices` must be integer tensor, containing indices into `ref`.
|
||||||
It must be shape `[d_0, ..., d_{Q-2}, K]` where `0 < K <= P`.
|
It must be shape \\([d_0, ..., d_{Q-2}, K]\\) where `0 < K <= P`.
|
||||||
|
|
||||||
The innermost dimension of `indices` (with length `K`) corresponds to
|
The innermost dimension of `indices` (with length `K`) corresponds to
|
||||||
indices into elements (if `K = P`) or slices (if `K < P`) along the `K`th
|
indices into elements (if `K = P`) or slices (if `K < P`) along the `K`th
|
||||||
@ -50,9 +50,7 @@ dimension of `ref`.
|
|||||||
|
|
||||||
`updates` is `Tensor` of rank `Q-1+P-K` with shape:
|
`updates` is `Tensor` of rank `Q-1+P-K` with shape:
|
||||||
|
|
||||||
```
|
$$[d_0, ..., d_{Q-2}, ref.shape[K], ..., ref.shape[P-1]].$$
|
||||||
[d_0, ..., d_{Q-2}, ref.shape[K], ..., ref.shape[P-1]].
|
|
||||||
```
|
|
||||||
|
|
||||||
For example, say we want to subtract 4 scattered elements from a rank-1 tensor
|
For example, say we want to subtract 4 scattered elements from a rank-1 tensor
|
||||||
with 8 elements. In Python, that subtraction would look like this:
|
with 8 elements. In Python, that subtraction would look like this:
|
||||||
|
@ -42,7 +42,7 @@ variable according to `indices`.
|
|||||||
`ref` is a `Tensor` with rank `P` and `indices` is a `Tensor` of rank `Q`.
|
`ref` is a `Tensor` with rank `P` and `indices` is a `Tensor` of rank `Q`.
|
||||||
|
|
||||||
`indices` must be integer tensor, containing indices into `ref`.
|
`indices` must be integer tensor, containing indices into `ref`.
|
||||||
It must be shape `[d_0, ..., d_{Q-2}, K]` where `0 < K <= P`.
|
It must be shape \\([d_0, ..., d_{Q-2}, K]\\) where `0 < K <= P`.
|
||||||
|
|
||||||
The innermost dimension of `indices` (with length `K`) corresponds to
|
The innermost dimension of `indices` (with length `K`) corresponds to
|
||||||
indices into elements (if `K = P`) or slices (if `K < P`) along the `K`th
|
indices into elements (if `K = P`) or slices (if `K < P`) along the `K`th
|
||||||
@ -50,9 +50,7 @@ dimension of `ref`.
|
|||||||
|
|
||||||
`updates` is `Tensor` of rank `Q-1+P-K` with shape:
|
`updates` is `Tensor` of rank `Q-1+P-K` with shape:
|
||||||
|
|
||||||
```
|
$$[d_0, ..., d_{Q-2}, ref.shape[K], ..., ref.shape[P-1]].$$
|
||||||
[d_0, ..., d_{Q-2}, ref.shape[K], ..., ref.shape[P-1]].
|
|
||||||
```
|
|
||||||
|
|
||||||
For example, say we want to update 4 scattered elements to a rank-1 tensor to
|
For example, say we want to update 4 scattered elements to a rank-1 tensor to
|
||||||
8 elements. In Python, that update would look like this:
|
8 elements. In Python, that update would look like this:
|
||||||
|
@ -16,6 +16,6 @@ END
|
|||||||
description: <<END
|
description: <<END
|
||||||
For each batch `i` and class `j` we have
|
For each batch `i` and class `j` we have
|
||||||
|
|
||||||
softmax[i, j] = exp(logits[i, j]) / sum_j(exp(logits[i, j]))
|
$$softmax[i, j] = exp(logits[i, j]) / sum_j(exp(logits[i, j]))$$
|
||||||
END
|
END
|
||||||
}
|
}
|
||||||
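The softmax formula above can be checked with a short standard-library sketch. Subtracting the row maximum before exponentiating is an addition made here for numerical stability; it is not stated in the op description and does not change the mathematical result.

```python
import math


def softmax(logits):
  """Row-wise softmax for a list of rows of floats."""
  result = []
  for row in logits:
    # Shifting by the row maximum avoids overflow in exp() and cancels out
    # in the ratio, so the output matches the formula in the description.
    shift = max(row)
    exps = [math.exp(x - shift) for x in row]
    total = sum(exps)
    result.append([e / total for e in exps])
  return result
```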
|
@ -47,7 +47,7 @@ END
|
|||||||
summary: "Update relevant entries in \'*var\' and \'*accum\' according to the adagrad scheme."
|
summary: "Update relevant entries in \'*var\' and \'*accum\' according to the adagrad scheme."
|
||||||
description: <<END
|
description: <<END
|
||||||
That is for rows we have grad for, we update var and accum as follows:
|
That is for rows we have grad for, we update var and accum as follows:
|
||||||
accum += grad * grad
|
$$accum += grad * grad$$
|
||||||
var -= lr * grad * (1 / sqrt(accum))
|
$$var -= lr * grad * (1 / sqrt(accum))$$
|
||||||
END
|
END
|
||||||
}
|
}
|
||||||
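As a rough illustration of the sparse Adagrad update described above, the sketch below applies the two formulas to only the rows named in `indices`. It uses plain nested lists; the function name and argument order are assumptions for the example and do not mirror the kernel's signature.

```python
import math


def sparse_apply_adagrad(var, accum, lr, grad, indices):
  """Update only the rows of var and accum listed in indices, in place."""
  for row, g_row in zip(indices, grad):
    for j, g in enumerate(g_row):
      # accum += grad * grad
      accum[row][j] += g * g
      # var -= lr * grad * (1 / sqrt(accum))
      var[row][j] -= lr * g * (1.0 / math.sqrt(accum[row][j]))
```

Rows that do not appear in `indices` are left untouched, which is the point of the sparse variant.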
|
@ -83,8 +83,8 @@ mean_square = decay * mean_square + (1-decay) * gradient ** 2
|
|||||||
mean_grad = decay * mean_grad + (1-decay) * gradient
|
mean_grad = decay * mean_grad + (1-decay) * gradient
|
||||||
Delta = learning_rate * gradient / sqrt(mean_square + epsilon - mean_grad ** 2)
|
Delta = learning_rate * gradient / sqrt(mean_square + epsilon - mean_grad ** 2)
|
||||||
|
|
||||||
ms <- rho * ms_{t-1} + (1-rho) * grad * grad
|
$$ms <- rho * ms_{t-1} + (1-rho) * grad * grad$$
|
||||||
mom <- momentum * mom_{t-1} + lr * grad / sqrt(ms + epsilon)
|
$$mom <- momentum * mom_{t-1} + lr * grad / sqrt(ms + epsilon)$$
|
||||||
var <- var - mom
|
$$var <- var - mom$$
|
||||||
END
|
END
|
||||||
}
|
}
|
||||||
|
@ -71,10 +71,10 @@ END
|
|||||||
summary: "Update relevant entries in \'*var\' according to the Ftrl-proximal scheme."
|
summary: "Update relevant entries in \'*var\' according to the Ftrl-proximal scheme."
|
||||||
description: <<END
|
description: <<END
|
||||||
That is for rows we have grad for, we update var, accum and linear as follows:
|
That is for rows we have grad for, we update var, accum and linear as follows:
|
||||||
accum_new = accum + grad * grad
|
$$accum_new = accum + grad * grad$$
|
||||||
linear += grad + (accum_new^(-lr_power) - accum^(-lr_power)) / lr * var
|
$$linear += grad + (accum_{new}^{-lr_{power}} - accum^{-lr_{power}}) / lr * var$$
|
||||||
quadratic = 1.0 / (accum_new^(lr_power) * lr) + 2 * l2
|
$$quadratic = 1.0 / (accum_{new}^{lr_{power}} * lr) + 2 * l2$$
|
||||||
var = (sign(linear) * l1 - linear) / quadratic if |linear| > l1 else 0.0
|
$$var = (sign(linear) * l1 - linear) / quadratic\ if\ |linear| > l1\ else\ 0.0$$
|
||||||
accum = accum_new
|
$$accum = accum_{new}$$
|
||||||
END
|
END
|
||||||
}
|
}
|
||||||
|
@ -64,7 +64,7 @@ Set use_nesterov = True if you want to use Nesterov momentum.
|
|||||||
|
|
||||||
That is for rows we have grad for, we update var and accum as follows:
|
That is for rows we have grad for, we update var and accum as follows:
|
||||||
|
|
||||||
accum = accum * momentum + grad
|
$$accum = accum * momentum + grad$$
|
||||||
var -= lr * accum
|
$$var -= lr * accum$$
|
||||||
END
|
END
|
||||||
}
|
}
|
||||||
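The two momentum formulas above can be written out as a minimal sketch over flat lists. It covers only the plain momentum case as stated in the description; the `use_nesterov` option mentioned in the hunk header is deliberately not modeled, and the function name is hypothetical.

```python
def apply_momentum(var, accum, lr, grad, momentum):
  """Apply one plain-momentum step to var and accum, in place."""
  for i, g in enumerate(grad):
    # accum = accum * momentum + grad
    accum[i] = accum[i] * momentum + g
    # var -= lr * accum
    var[i] -= lr * accum[i]
```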
|
@ -58,9 +58,9 @@ END
|
|||||||
summary: "Sparse update entries in \'*var\' and \'*accum\' according to FOBOS algorithm."
|
summary: "Sparse update entries in \'*var\' and \'*accum\' according to FOBOS algorithm."
|
||||||
description: <<END
|
description: <<END
|
||||||
That is for rows we have grad for, we update var and accum as follows:
|
That is for rows we have grad for, we update var and accum as follows:
|
||||||
accum += grad * grad
|
$$accum += grad * grad$$
|
||||||
prox_v = var
|
$$prox_v = var$$
|
||||||
prox_v -= lr * grad * (1 / sqrt(accum))
|
$$prox_v -= lr * grad * (1 / sqrt(accum))$$
|
||||||
var = sign(prox_v)/(1+lr*l2) * max{|prox_v|-lr*l1,0}
|
$$var = sign(prox_v)/(1+lr*l2) * max{|prox_v|-lr*l1,0}$$
|
||||||
END
|
END
|
||||||
}
|
}
|
||||||
|
@ -52,7 +52,7 @@ END
|
|||||||
summary: "Sparse update \'*var\' as FOBOS algorithm with fixed learning rate."
|
summary: "Sparse update \'*var\' as FOBOS algorithm with fixed learning rate."
|
||||||
description: <<END
|
description: <<END
|
||||||
That is for rows we have grad for, we update var as follows:
|
That is for rows we have grad for, we update var as follows:
|
||||||
prox_v = var - alpha * grad
|
$$prox_v = var - alpha * grad$$
|
||||||
var = sign(prox_v)/(1+alpha*l2) * max{|prox_v|-alpha*l1,0}
|
$$var = sign(prox_v)/(1+alpha*l2) * max{|prox_v|-alpha*l1,0}$$
|
||||||
END
|
END
|
||||||
}
|
}
|
||||||
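The fixed-learning-rate FOBOS update above amounts to a gradient step followed by soft thresholding (from `l1`) and shrinkage (from `l2`). A scalar sketch, with a hypothetical function name, assuming the formulas are applied elementwise:

```python
def proximal_gradient_step(var, grad, alpha, l1, l2):
  """Return the updated value of one scalar entry."""
  # prox_v = var - alpha * grad
  prox_v = var - alpha * grad
  # sign(prox_v) as -1, 0 or 1.
  sign = (prox_v > 0) - (prox_v < 0)
  # var = sign(prox_v) / (1 + alpha * l2) * max{|prox_v| - alpha * l1, 0}
  return sign / (1.0 + alpha * l2) * max(abs(prox_v) - alpha * l1, 0.0)
```

With `l1 = l2 = 0` this reduces to an ordinary gradient descent step; a large enough `l1` clamps small entries to exactly zero.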
|
@ -71,8 +71,8 @@ and mom will not update in iterations during which the grad is zero.
|
|||||||
mean_square = decay * mean_square + (1-decay) * gradient ** 2
|
mean_square = decay * mean_square + (1-decay) * gradient ** 2
|
||||||
Delta = learning_rate * gradient / sqrt(mean_square + epsilon)
|
Delta = learning_rate * gradient / sqrt(mean_square + epsilon)
|
||||||
|
|
||||||
ms <- rho * ms_{t-1} + (1-rho) * grad * grad
|
$$ms <- rho * ms_{t-1} + (1-rho) * grad * grad$$
|
||||||
mom <- momentum * mom_{t-1} + lr * grad / sqrt(ms + epsilon)
|
$$mom <- momentum * mom_{t-1} + lr * grad / sqrt(ms + epsilon)$$
|
||||||
var <- var - mom
|
$$var <- var - mom$$
|
||||||
END
|
END
|
||||||
}
|
}
|
||||||
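The three RMSProp formulas above can be traced with a scalar sketch. The function below is an illustrative transcription with a hypothetical name and argument order; it returns the new values instead of updating in place.

```python
import math


def apply_rms_prop(var, ms, mom, lr, rho, momentum, epsilon, grad):
  """Return (var, ms, mom) after one RMSProp step on a scalar."""
  # ms <- rho * ms_{t-1} + (1 - rho) * grad * grad
  ms_new = rho * ms + (1.0 - rho) * grad * grad
  # mom <- momentum * mom_{t-1} + lr * grad / sqrt(ms + epsilon)
  mom_new = momentum * mom + lr * grad / math.sqrt(ms_new + epsilon)
  # var <- var - mom
  return var - mom_new, ms_new, mom_new
```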
|
@ -0,0 +1,40 @@
|
|||||||
|
op {
|
||||||
|
graph_op_name: "SparseSliceGrad"
|
||||||
|
in_arg {
|
||||||
|
name: "backprop_val_grad"
|
||||||
|
description: <<END
|
||||||
|
1-D. The gradient with respect to
|
||||||
|
the non-empty values of the sliced `SparseTensor`.
|
||||||
|
END
|
||||||
|
}
|
||||||
|
in_arg {
|
||||||
|
name: "input_indices"
|
||||||
|
description: <<END
|
||||||
|
2-D. The `indices` of the input `SparseTensor`.
|
||||||
|
END
|
||||||
|
}
|
||||||
|
in_arg {
|
||||||
|
name: "input_start"
|
||||||
|
description: <<END
|
||||||
|
1-D. Tensor representing the start of the slice.
|
||||||
|
END
|
||||||
|
}
|
||||||
|
in_arg {
|
||||||
|
name: "output_indices"
|
||||||
|
description: <<END
|
||||||
|
2-D. The `indices` of the sliced `SparseTensor`.
|
||||||
|
END
|
||||||
|
}
|
||||||
|
out_arg {
|
||||||
|
name: "val_grad"
|
||||||
|
description: <<END
|
||||||
|
1-D. The gradient with respect to the non-empty values of input `SparseTensor`.
|
||||||
|
END
|
||||||
|
}
|
||||||
|
summary: "The gradient operator for the SparseSlice op."
|
||||||
|
description: <<END
|
||||||
|
This op takes in the upstream gradient w.r.t. non-empty values of
|
||||||
|
the sliced `SparseTensor`, and outputs the gradients w.r.t.
|
||||||
|
the non-empty values of input `SparseTensor`.
|
||||||
|
END
|
||||||
|
}
|
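To make the `SparseSliceGrad` description concrete, here is a pure-Python sketch of the mapping it describes. It assumes that the sliced tensor's indices are expressed relative to `input_start` (so adding the start recovers the position in the input) and that input entries outside the slice get a zero gradient; both are assumptions of this example rather than statements from the api_def text, and the function is not the kernel's implementation.

```python
def sparse_slice_grad(backprop_val_grad, input_indices, input_start,
                      output_indices):
  """Return gradients w.r.t. the non-empty values of the input SparseTensor."""
  # Map each sliced entry back to its position in the input tensor.
  grad_by_position = {}
  for grad, out_idx in zip(backprop_val_grad, output_indices):
    position = tuple(o + s for o, s in zip(out_idx, input_start))
    grad_by_position[position] = grad
  # Input entries that were not part of the slice receive zero.
  return [grad_by_position.get(tuple(idx), 0) for idx in input_indices]
```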
@ -20,7 +20,7 @@ Read @{$math_ops#Segmentation$the section on segmentation} for an explanation of
|
|||||||
segments.
|
segments.
|
||||||
|
|
||||||
Computes a tensor such that
|
Computes a tensor such that
|
||||||
`(output[i] = sum_{j...} data[j...]` where the sum is over tuples `j...` such
|
\\(output[i] = sum_{j...} data[j...]\\) where the sum is over tuples `j...` such
|
||||||
that `segment_ids[j...] == i`. Unlike `SegmentSum`, `segment_ids`
|
that `segment_ids[j...] == i`. Unlike `SegmentSum`, `segment_ids`
|
||||||
need not be sorted and need not cover all values in the full
|
need not be sorted and need not cover all values in the full
|
||||||
range of valid values.
|
range of valid values.
|
||||||
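The unsorted segment sum described above can be sketched for the 1-D case in a few lines. The function is an illustration with a hypothetical name; it handles only flat lists, whereas the op accepts higher-rank `data` and `segment_ids`.

```python
def unsorted_segment_sum(data, segment_ids, num_segments):
  """Sum data entries that share a segment id (1-D inputs only)."""
  # Segments that receive no entries stay at zero.
  output = [0] * num_segments
  for value, segment in zip(data, segment_ids):
    output[segment] += value
  return output
```

The ids need not be sorted or contiguous, which is what distinguishes this from `SegmentSum`.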
|
@ -1,4 +0,0 @@
|
|||||||
op {
|
|
||||||
graph_op_name: "BroadcastTo"
|
|
||||||
visibility: HIDDEN
|
|
||||||
}
|
|
@ -0,0 +1,4 @@
|
|||||||
|
op {
|
||||||
|
graph_op_name: "SparseSliceGrad"
|
||||||
|
visibility: HIDDEN
|
||||||
|
}
|
@ -3941,6 +3941,7 @@ cc_library(
|
|||||||
":sparse_reduce_op",
|
":sparse_reduce_op",
|
||||||
":sparse_reorder_op",
|
":sparse_reorder_op",
|
||||||
":sparse_reshape_op",
|
":sparse_reshape_op",
|
||||||
|
":sparse_slice_grad_op",
|
||||||
":sparse_slice_op",
|
":sparse_slice_op",
|
||||||
":sparse_softmax",
|
":sparse_softmax",
|
||||||
":sparse_sparse_binary_op_shared",
|
":sparse_sparse_binary_op_shared",
|
||||||
@ -4026,6 +4027,12 @@ tf_kernel_library(
|
|||||||
],
|
],
|
||||||
)
|
)
|
||||||
|
|
||||||
|
tf_kernel_library(
|
||||||
|
name = "sparse_slice_grad_op",
|
||||||
|
prefix = "sparse_slice_grad_op",
|
||||||
|
deps = SPARSE_DEPS,
|
||||||
|
)
|
||||||
|
|
||||||
tf_kernel_library(
|
tf_kernel_library(
|
||||||
name = "sparse_slice_op",
|
name = "sparse_slice_op",
|
||||||
prefix = "sparse_slice_op",
|
prefix = "sparse_slice_op",
|
||||||
|
@ -221,7 +221,7 @@ class FusedResizePadConvOpTest : public OpsTestBase {
|
|||||||
std::vector<Tensor> fused_tensors;
|
std::vector<Tensor> fused_tensors;
|
||||||
TF_ASSERT_OK(session->Run({}, {"fused_conv"}, {}, &fused_tensors));
|
TF_ASSERT_OK(session->Run({}, {"fused_conv"}, {}, &fused_tensors));
|
||||||
|
|
||||||
test::ExpectTensorNear<float>(unfused_tensors[0], fused_tensors[0], 1e-5);
|
test::ExpectClose(unfused_tensors[0], fused_tensors[0]);
|
||||||
}
|
}
|
||||||
|
|
||||||
void CompareFusedPadOnlyAndSeparate(int input_width, int input_height,
|
void CompareFusedPadOnlyAndSeparate(int input_width, int input_height,
|
||||||
@ -269,7 +269,7 @@ class FusedResizePadConvOpTest : public OpsTestBase {
|
|||||||
std::vector<Tensor> fused_tensors;
|
std::vector<Tensor> fused_tensors;
|
||||||
TF_ASSERT_OK(session->Run({}, {"fused_conv"}, {}, &fused_tensors));
|
TF_ASSERT_OK(session->Run({}, {"fused_conv"}, {}, &fused_tensors));
|
||||||
|
|
||||||
test::ExpectTensorNear<float>(unfused_tensors[0], fused_tensors[0], 1e-5);
|
test::ExpectClose(unfused_tensors[0], fused_tensors[0]);
|
||||||
}
|
}
|
||||||
};
|
};
|
||||||
|
|
||||||
|
@ -704,14 +704,14 @@ class MklConcatOp : public OpKernel {
|
|||||||
if (input_tensors[k].NumElements() == 0)
|
if (input_tensors[k].NumElements() == 0)
|
||||||
continue;
|
continue;
|
||||||
|
|
||||||
auto src_dims = TFShapeToMklDnnDims(
|
|
||||||
mkl_input_shapes[k].GetTfShape());
|
|
||||||
auto src_md = mkl_input_shapes[k].GetMklLayout();
|
auto src_md = mkl_input_shapes[k].GetMklLayout();
|
||||||
srcs[k].SetUsrMem(src_md, &input_tensors[k]);
|
srcs[k].SetUsrMem(src_md, &input_tensors[k]);
|
||||||
|
|
||||||
if (src_md.data.format != mkl_common_format)
|
if (src_md.data.format != mkl_common_format) {
|
||||||
|
memory::dims src_dims(src_md.data.dims, &src_md.data.dims[src_md.data.ndims]);
|
||||||
src_md = memory::desc(src_dims, MklDnnType<T>(),
|
src_md = memory::desc(src_dims, MklDnnType<T>(),
|
||||||
mkl_common_format);
|
mkl_common_format);
|
||||||
|
}
|
||||||
|
|
||||||
srcs_pd.push_back(memory::primitive_desc(src_md, cpu_engine));
|
srcs_pd.push_back(memory::primitive_desc(src_md, cpu_engine));
|
||||||
}
|
}
|
||||||
|
126
tensorflow/core/kernels/sparse_slice_grad_op.cc
Normal file
@ -0,0 +1,126 @@
|
|||||||
|
/* Copyright 2018 The TensorFlow Authors. All Rights Reserved.
|
||||||
|
|
||||||
|
Licensed under the Apache License, Version 2.0 (the "License");
|
||||||
|
you may not use this file except in compliance with the License.
|
||||||
|
You may obtain a copy of the License at
|
||||||
|
|
||||||
|
http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
|
||||||
|
Unless required by applicable law or agreed to in writing, software
|
||||||
|
distributed under the License is distributed on an "AS IS" BASIS,
|
||||||
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||||
|
See the License for the specific language governing permissions and
|
||||||
|
limitations under the License.
|
||||||
|
==============================================================================*/
|
||||||
|
|
||||||
|
#include "tensorflow/core/framework/op_kernel.h"
|
||||||
|
#include "tensorflow/core/framework/register_types.h"
|
||||||
|
#include "tensorflow/core/framework/tensor.h"
|
||||||
|
#include "tensorflow/core/framework/tensor_util.h"
|
||||||
|
#include "tensorflow/core/framework/types.h"
|
||||||
|
#include "tensorflow/core/util/sparse/sparse_tensor.h"
|
||||||
|
|
||||||
|
namespace tensorflow {
|
||||||
|
|
||||||
|
template <typename T>
|
||||||
|
class SparseSliceGradOp : public OpKernel {
|
||||||
|
public:
|
||||||
|
explicit SparseSliceGradOp(OpKernelConstruction *ctx) : OpKernel(ctx) {}
|
||||||
|
|
||||||
|
void Compute(OpKernelContext *ctx) override {
|
||||||
|
const Tensor *backprop_val_grad, *input_indices, *output_indices, *input_start;
|
||||||
|
OP_REQUIRES_OK(ctx, ctx->input("backprop_val_grad", &backprop_val_grad));
|
||||||
|
OP_REQUIRES_OK(ctx, ctx->input("input_indices", &input_indices));
|
||||||
|
OP_REQUIRES_OK(ctx, ctx->input("input_start", &input_start));
|
||||||
|
OP_REQUIRES_OK(ctx, ctx->input("output_indices", &output_indices));
|
||||||
|
|
||||||
|
OP_REQUIRES(ctx,
|
||||||
|
TensorShapeUtils::IsMatrix(input_indices->shape()) &&
|
||||||
|
TensorShapeUtils::IsMatrix(output_indices->shape()),
|
||||||
|
errors::InvalidArgument(
|
||||||
|
"Input and output indices should be matrices "
|
||||||
|
"but received shapes: ",
|
||||||
|
input_indices->shape().DebugString(), " and ",
|
||||||
|
output_indices->shape().DebugString()));
|
||||||
|
OP_REQUIRES(
|
||||||
|
ctx, TensorShapeUtils::IsVector(backprop_val_grad->shape()),
|
||||||
|
errors::InvalidArgument(
|
||||||
|
"Input backprop_val_grad should be a vector but received shape: ",
|
||||||
|
backprop_val_grad->shape().DebugString()));
|
||||||
|
OP_REQUIRES(
|
||||||
|
ctx,
|
||||||
|
input_indices->dim_size(1) == output_indices->dim_size(1),
|
||||||
|
errors::InvalidArgument("The input and output should have the same "
|
||||||
|
"ndims: got: ", input_indices->dim_size(1), " and ",
|
||||||
|
output_indices->dim_size(1)));
|
||||||
|
OP_REQUIRES(
|
||||||
|
ctx, output_indices->dim_size(0) <= input_indices->dim_size(0),
|
||||||
|
errors::InvalidArgument("# rows of output_indices should be not greater "
|
||||||
|
"than of input_indices, got ",
|
||||||
|
output_indices->dim_size(0), " and ",
|
||||||
|
input_indices->dim_size(0)));
|
||||||
|
OP_REQUIRES(
|
||||||
|
ctx, backprop_val_grad->NumElements() == output_indices->dim_size(0),
|
||||||
|
errors::InvalidArgument("# elements of backprop_val_grad and # rows of "
|
||||||
|
"output_indices should match (#nnz of sum): got ",
|
||||||
|
backprop_val_grad->NumElements(), " and ",
|
||||||
|
output_indices->dim_size(0)));
|
||||||
|
OP_REQUIRES(ctx, TensorShapeUtils::IsVector(input_start->shape()),
|
||||||
|
errors::InvalidArgument(
|
||||||
|
"The input_start should be a vector but received shape ",
|
||||||
|
input_start->shape().DebugString()));
|
||||||
|
|
||||||
|
const int num_dims = input_indices->dim_size(1);
|
||||||
|
OP_REQUIRES(ctx, num_dims == input_start->NumElements(),
|
||||||
|
errors::InvalidArgument(
|
||||||
|
"Expected input_start to be a vector of length ", num_dims,
|
||||||
|
" but got length ", input_start->NumElements()));
|
||||||
|
|
||||||
|
const int64 input_nnz = input_indices->dim_size(0);
|
||||||
|
|
||||||
|
Tensor *val_grad;
|
||||||
|
OP_REQUIRES_OK(ctx,
|
||||||
|
ctx->allocate_output(0, TensorShape({input_nnz}), &val_grad));
|
||||||
|
|
||||||
|
T *val_grad_flat = val_grad->flat<T>().data();
|
||||||
|
const T *backprop_val_grad_flat = backprop_val_grad->flat<T>().data();
|
||||||
|
memset(val_grad_flat, 0, sizeof(T) * input_nnz);
|
||||||
|
|
||||||
|
// Fill gradients for position where indices of input and output are same.
|
||||||
|
const auto input_indices_mat = input_indices->matrix<int64>();
|
||||||
|
const auto output_indices_mat = output_indices->matrix<int64>();
|
||||||
|
const auto input_start_flat = input_start->flat<int64>();
|
||||||
|
int64 j = 0;
|
||||||
|
for (int64 i = 0; i < input_nnz && j < backprop_val_grad->NumElements();
|
||||||
|
++i) {
|
||||||
|
bool is_same = true;
|
||||||
|
for (int d = 0; d < num_dims; ++d) {
|
||||||
|
const int64 a = input_indices_mat(i, d);
|
||||||
|
const int64 b = output_indices_mat(j, d);
|
||||||
|
const int64 offset = input_start_flat(d);
|
||||||
|
if (a != b + offset) {
|
||||||
|
is_same = false;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
if (is_same) {
|
||||||
|
val_grad_flat[i] = backprop_val_grad_flat[j];
|
||||||
|
++j;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
OP_REQUIRES(
|
||||||
|
ctx, backprop_val_grad->NumElements() == j,
|
||||||
|
errors::Internal("Elements of backprop_val_grad aren't all propagated. "
|
||||||
|
"Num elements:", backprop_val_grad->NumElements(),
|
||||||
|
", used: ", j));
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
#define REGISTER_KERNELS(type) \
|
||||||
|
REGISTER_KERNEL_BUILDER( \
|
||||||
|
Name("SparseSliceGrad").Device(DEVICE_CPU).TypeConstraint<type>("T"), \
|
||||||
|
SparseSliceGradOp<type>)
|
||||||
|
|
||||||
|
TF_CALL_NUMBER_TYPES(REGISTER_KERNELS);
|
||||||
|
#undef REGISTER_KERNELS
|
||||||
|
} // namespace tensorflow
|
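The matching loop in `SparseSliceGradOp::Compute` can be sketched in plain Python. This is only an illustrative reference under stated assumptions, not the kernel itself: the function name `sparse_slice_grad` and the use of Python lists instead of tensors are hypothetical, and it assumes (as the kernel does) that both index lists are in the same row-major order.

```python
def sparse_slice_grad(backprop_val_grad, input_indices, input_start, output_indices):
    """Route gradients of a sparse slice back to the input's non-zero values.

    Hypothetical pure-Python sketch of the two-pointer walk in the C++ kernel.
    """
    if len(backprop_val_grad) != len(output_indices):
        raise ValueError("backprop_val_grad and output_indices must have the same length")
    if len(output_indices) > len(input_indices):
        raise ValueError("output_indices cannot have more rows than input_indices")

    # Every input value starts with a zero gradient.
    val_grad = [0.0] * len(input_indices)
    j = 0  # position in the slice's (output) indices
    for i, in_row in enumerate(input_indices):
        if j >= len(output_indices):
            break
        out_row = output_indices[j]
        # An input entry belongs to the slice when it equals the output
        # index shifted by the slice start in every dimension.
        if all(a == b + off for a, b, off in zip(in_row, out_row, input_start)):
            val_grad[i] = backprop_val_grad[j]
            j += 1

    if j != len(backprop_val_grad):
        raise RuntimeError("not all elements of backprop_val_grad were propagated")
    return val_grad


grads = sparse_slice_grad(
    backprop_val_grad=[5.0, 7.0],
    input_indices=[[0, 0], [0, 2], [1, 1], [2, 0]],
    input_start=[0, 1],
    output_indices=[[0, 1], [1, 0]],
)
print(grads)
```

Input entries that fall outside the slice keep a zero gradient, which mirrors the `memset` that precedes the loop in the kernel.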
@ -73,6 +73,21 @@ TEST_F(SqliteTest, InsertAndSelectDouble) {
|
|||||||
EXPECT_EQ(1, stmt.ColumnInt(1));
|
EXPECT_EQ(1, stmt.ColumnInt(1));
|
||||||
}
|
}
|
||||||
|
|
||||||
|
#ifdef DSQLITE_ENABLE_JSON1
|
||||||
|
TEST_F(SqliteTest, Json1Extension) {
|
||||||
|
string s1 = "{\"key\": 42}";
|
||||||
|
string s2 = "{\"key\": \"value\"}";
|
||||||
|
auto stmt = db_->PrepareOrDie("INSERT INTO T (a, b) VALUES (?, ?)");
|
||||||
|
stmt.BindText(1, s1);
|
||||||
|
stmt.BindText(2, s2);
|
||||||
|
TF_ASSERT_OK(stmt.StepAndReset());
|
||||||
|
stmt = db_->PrepareOrDie("SELECT json_extract(a, '$.key'), json_extract(b, '$.key') FROM T");
|
||||||
|
TF_ASSERT_OK(stmt.Step(&is_done_));
|
||||||
|
EXPECT_EQ(42, stmt.ColumnInt(0));
|
||||||
|
EXPECT_EQ("value", stmt.ColumnString(1));
|
||||||
|
}
|
||||||
|
#endif //DSQLITE_ENABLE_JSON1
|
||||||
|
|
||||||
TEST_F(SqliteTest, NulCharsInString) {
|
TEST_F(SqliteTest, NulCharsInString) {
|
||||||
string s; // XXX: Want to write {2, '\0'} but not sure why not.
|
string s; // XXX: Want to write {2, '\0'} but not sure why not.
|
||||||
s.append(static_cast<size_t>(2), '\0');
|
s.append(static_cast<size_t>(2), '\0');
|
||||||
|
@ -302,6 +302,20 @@ REGISTER_OP("SparseSplit")
|
|||||||
return Status::OK();
|
return Status::OK();
|
||||||
});
|
});
|
||||||
|
|
||||||
|
REGISTER_OP("SparseSliceGrad")
|
||||||
|
.Input("backprop_val_grad: T")
|
||||||
|
.Input("input_indices: int64")
|
||||||
|
.Input("input_start: int64")
|
||||||
|
.Input("output_indices: int64")
|
||||||
|
.Output("val_grad: T")
|
||||||
|
.Attr("T: numbertype")
|
||||||
|
.SetShapeFn([](InferenceContext* c) {
|
||||||
|
ShapeHandle indices;
|
||||||
|
TF_RETURN_IF_ERROR(c->WithRank(c->input(1), 2, &indices));
|
||||||
|
c->set_output(0, c->Vector(c->Dim(indices, 0)));
|
||||||
|
return Status::OK();
|
||||||
|
});
|
||||||
|
|
||||||
REGISTER_OP("SparseSlice")
|
REGISTER_OP("SparseSlice")
|
||||||
.Input("indices: int64")
|
.Input("indices: int64")
|
||||||
.Input("values: T")
|
.Input("values: T")
|
||||||
|
@ -52,6 +52,18 @@ TEST(SparseOpsTest, SparseAddGrad_ShapeFn) {
|
|||||||
INFER_OK(op, "?;[?,?];[?,?];?", "[d1_0];[d2_0]");
|
INFER_OK(op, "?;[?,?];[?,?];?", "[d1_0];[d2_0]");
|
||||||
}
|
}
|
||||||
|
|
||||||
|
TEST(SparseOpsTest, SparseSliceGrad_ShapeFn) {
|
||||||
|
ShapeInferenceTestOp op("SparseSliceGrad");
|
||||||
|
|
||||||
|
// Rank checks.
|
||||||
|
INFER_ERROR("must be rank 2", op, "?;[1];?;?");
|
||||||
|
|
||||||
|
INFER_OK(op, "?;?;?;?", "[?]");
|
||||||
|
|
||||||
|
// input[1].dim(0) determines output.
|
||||||
|
INFER_OK(op, "?;[?,?];?;?", "[d1_0]");
|
||||||
|
}
|
||||||
|
|
||||||
TEST(SparseOpsTest, SparseReorder_ShapeFn) {
|
TEST(SparseOpsTest, SparseReorder_ShapeFn) {
|
||||||
ShapeInferenceTestOp op("SparseReorder");
|
ShapeInferenceTestOp op("SparseReorder");
|
||||||
|
|
||||||
|
@ -66,9 +66,7 @@ landing_page:
|
|||||||
}
|
}
|
||||||
</style>
|
</style>
|
||||||
<div class="devsite-landing-row-item-description">
|
<div class="devsite-landing-row-item-description">
|
||||||
<a href="#">
|
<h3 class="hide-from-toc">Learn and use ML</h3>
|
||||||
<h3 class="hide-from-toc">Learn and use ML</h3>
|
|
||||||
</a>
|
|
||||||
<div class="devsite-landing-row-item-description-content">
|
<div class="devsite-landing-row-item-description-content">
|
||||||
<p>
|
<p>
|
||||||
The high-level Keras API provides building blocks to create and
|
The high-level Keras API provides building blocks to create and
|
||||||
@ -117,9 +115,7 @@ landing_page:
|
|||||||
- items:
|
- items:
|
||||||
- custom_html: >
|
- custom_html: >
|
||||||
<div class="devsite-landing-row-item-description" style="border-right: 2px solid #eee;">
|
<div class="devsite-landing-row-item-description" style="border-right: 2px solid #eee;">
|
||||||
<a href="https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/eager/python/examples/notebooks">
|
<h3 class="hide-from-toc">Research and experimentation</h3>
|
||||||
<h3 class="hide-from-toc">Research and experimentation</h3>
|
|
||||||
</a>
|
|
||||||
<div class="devsite-landing-row-item-description-content">
|
<div class="devsite-landing-row-item-description-content">
|
||||||
<p>
|
<p>
|
||||||
Eager execution provides an imperative, define-by-run interface for advanced operations. Write custom layers, forward passes, and training loops with auto‑differentiation. Start with
|
Eager execution provides an imperative, define-by-run interface for advanced operations. Write custom layers, forward passes, and training loops with auto‑differentiation. Start with
|
||||||
@ -170,9 +166,7 @@ landing_page:
|
|||||||
</div>
|
</div>
|
||||||
- custom_html: >
|
- custom_html: >
|
||||||
<div class="devsite-landing-row-item-description">
|
<div class="devsite-landing-row-item-description">
|
||||||
<a href="#">
|
<h3 class="hide-from-toc">ML at production scale</h3>
|
||||||
<h3 class="hide-from-toc">ML at production scale</h3>
|
|
||||||
</a>
|
|
||||||
<div class="devsite-landing-row-item-description-content">
|
<div class="devsite-landing-row-item-description-content">
|
||||||
<p>
|
<p>
|
||||||
Estimators can train large models on multiple machines in a
|
Estimators can train large models on multiple machines in a
|
||||||
|
@ -1,7 +1,7 @@
|
|||||||
### Learn and use ML
|
### Learn and use ML
|
||||||
basic_classification.md
|
basic_classification.md: Basic classification
|
||||||
basic_text_classification.md
|
basic_text_classification.md: Text classification
|
||||||
basic_regression.md
|
basic_regression.md: Regression
|
||||||
overfit_and_underfit.md
|
overfit_and_underfit.md
|
||||||
save_and_restore_models.md
|
save_and_restore_models.md
|
||||||
next_steps.md
|
next_steps.md
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
# Next Steps
|
# Next steps
|
||||||
|
|
||||||
## Learn more about TensorFlow
|
## Learn more about TensorFlow
|
||||||
|
|
||||||
|
@ -362,10 +362,10 @@ model's loss. This is the
|
|||||||
that will be optimized.
|
that will be optimized.
|
||||||
|
|
||||||
We can calculate the loss by calling @{tf.losses.sparse_softmax_cross_entropy}.
|
We can calculate the loss by calling @{tf.losses.sparse_softmax_cross_entropy}.
|
||||||
The value returned by this function will be lowest, approximately 0,
|
The value returned by this function will be approximately 0 at lowest,
|
||||||
probability of the correct class (at index `label`) is near 1.0. The loss value
|
when the probability of the correct class (at index `label`) is near 1.0.
|
||||||
returned is progressively larger as the probability of the correct class
|
The loss value returned is progressively larger as the probability of the
|
||||||
decreases.
|
correct class decreases.
|
||||||
|
|
||||||
This function returns the average over the whole batch.
|
This function returns the average over the whole batch.
|
||||||
|
|
||||||
|
@ -35,7 +35,7 @@ from tensorflow import keras
|
|||||||
* The `tf.keras` version in the latest TensorFlow release might not be the same
|
* The `tf.keras` version in the latest TensorFlow release might not be the same
|
||||||
as the latest `keras` version from PyPI. Check `tf.keras.__version__`.
|
as the latest `keras` version from PyPI. Check `tf.keras.__version__`.
|
||||||
* When [saving a model's weights](#weights_only), `tf.keras` defaults to the
|
* When [saving a model's weights](#weights_only), `tf.keras` defaults to the
|
||||||
[checkpoint format](../get_started/checkpoints.md). Pass `save_format='h5'` to
|
[checkpoint format](./checkpoints.md). Pass `save_format='h5'` to
|
||||||
use HDF5.
|
use HDF5.
|
||||||
|
|
||||||
## Build a simple model
|
## Build a simple model
|
||||||
@ -221,7 +221,7 @@ To *evaluate* the inference-mode loss and metrics for the data provided:
|
|||||||
```python
|
```python
|
||||||
model.evaluate(x, y, batch_size=32)
|
model.evaluate(x, y, batch_size=32)
|
||||||
|
|
||||||
model.evaluate(dataset, steps=30
|
model.evaluate(dataset, steps=30)
|
||||||
```
|
```
|
||||||
|
|
||||||
And to *predict* the output of the last layer in inference for the data provided,
|
And to *predict* the output of the last layer in inference for the data provided,
|
||||||
@ -442,7 +442,7 @@ model.load_weights('my_model')
|
|||||||
```
|
```
|
||||||
|
|
||||||
By default, this saves the model's weights in the
|
By default, this saves the model's weights in the
|
||||||
[TensorFlow checkpoint](../get_started/checkpoints.md) file format. Weights can
|
[TensorFlow checkpoint](./checkpoints.md) file format. Weights can
|
||||||
also be saved to the Keras HDF5 format (the default for the multi-backend
|
also be saved to the Keras HDF5 format (the default for the multi-backend
|
||||||
implementation of Keras):
|
implementation of Keras):
|
||||||
|
|
||||||
@ -581,15 +581,6 @@ model.compile(loss='binary_crossentropy', optimizer=optimizer)
|
|||||||
model.summary()
|
model.summary()
|
||||||
```
|
```
|
||||||
|
|
||||||
Convert the Keras model to a `tf.estimator.Estimator` instance:
|
|
||||||
|
|
||||||
```python
|
|
||||||
keras_estimator = keras.estimator.model_to_estimator(
|
|
||||||
keras_model=model,
|
|
||||||
config=config,
|
|
||||||
model_dir='/tmp/model_dir')
|
|
||||||
```
|
|
||||||
|
|
||||||
Define an *input pipeline*. The `input_fn` returns a `tf.data.Dataset` object
|
Define an *input pipeline*. The `input_fn` returns a `tf.data.Dataset` object
|
||||||
used to distribute the data across multiple devices—with each device processing
|
used to distribute the data across multiple devices—with each device processing
|
||||||
a slice of the input batch.
|
a slice of the input batch.
|
||||||
@ -615,6 +606,15 @@ strategy = tf.contrib.distribute.MirroredStrategy()
|
|||||||
config = tf.estimator.RunConfig(train_distribute=strategy)
|
config = tf.estimator.RunConfig(train_distribute=strategy)
|
||||||
```
|
```
|
||||||
|
|
||||||
|
Convert the Keras model to a `tf.estimator.Estimator` instance:
|
||||||
|
|
||||||
|
```python
|
||||||
|
keras_estimator = keras.estimator.model_to_estimator(
|
||||||
|
keras_model=model,
|
||||||
|
config=config,
|
||||||
|
model_dir='/tmp/model_dir')
|
||||||
|
```
|
||||||
|
|
||||||
Finally, train the `Estimator` instance by providing the `input_fn` and `steps`
|
Finally, train the `Estimator` instance by providing the `input_fn` and `steps`
|
||||||
arguments:
|
arguments:
|
||||||
|
|
||||||
|
@ -289,17 +289,27 @@ Note: If you're only interested in building the libraries for the TensorFlow C
|
|||||||
or Java APIs, see [Build the C or Java libraries](#BuildCorJava), you do not
|
or Java APIs, see [Build the C or Java libraries](#BuildCorJava), you do not
|
||||||
need to build the pip package in that case.
|
need to build the pip package in that case.
|
||||||
|
|
||||||
To build a pip package for TensorFlow with CPU-only support,
|
### CPU-only support
|
||||||
you would typically invoke the following command:
|
|
||||||
|
To build a pip package for TensorFlow with CPU-only support:
|
||||||
|
|
||||||
<pre>
|
<pre>
|
||||||
$ <b>bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package</b>
|
$ bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
|
||||||
</pre>
|
</pre>
|
||||||
|
|
||||||
To build a pip package for TensorFlow with GPU support,
|
To build a pip package for TensorFlow with CPU-only support for the Intel® MKL-DNN:
|
||||||
invoke the following command:
|
|
||||||
|
|
||||||
<pre>$ <b>bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package</b> </pre>
|
<pre>
|
||||||
|
$ bazel build --config=mkl --config=opt //tensorflow/tools/pip_package:build_pip_package
|
||||||
|
</pre>
|
||||||
|
|
||||||
|
### GPU support
|
||||||
|
|
||||||
|
To build a pip package for TensorFlow with GPU support:
|
||||||
|
|
||||||
|
<pre>
|
||||||
|
$ bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
|
||||||
|
</pre>
|
||||||
|
|
||||||
**NOTE on gcc 5 or later:** the binary pip packages available on the
|
**NOTE on gcc 5 or later:** the binary pip packages available on the
|
||||||
TensorFlow website are built with gcc 4, which uses the older ABI. To
|
TensorFlow website are built with gcc 4, which uses the older ABI. To
|
||||||
|
@ -44,23 +44,22 @@ app:
|
|||||||
Android Studio project.
|
Android Studio project.
|
||||||
* Install all the Gradle extensions it requests.
|
* Install all the Gradle extensions it requests.
|
||||||
|
|
||||||
To get a model, either:
|
Now you can build and run the demo app.
|
||||||
|
|
||||||
* Download the quantized [Mobilenet TensorFlow Lite model](https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_224_android_quant_2017_11_08.zip)
|
The build process downloads the quantized [Mobilenet TensorFlow Lite model](https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_224_android_quant_2017_11_08.zip), and unzips it into the assets directory: `tensorflow/contrib/lite/java/demo/app/src/main/assets/`.
|
||||||
and unzip and copy `mobilenet_quant_v1_224.tflite` to the assets directory:
|
|
||||||
`tensorflow/contrib/lite/java/demo/app/src/main/assets/`.
|
|
||||||
* Or, download the floating point [Inception-v3 model](https://storage.googleapis.com/download.tensorflow.org/models/tflite/inception_v3_slim_2016_android_2017_11_10.zip)
|
|
||||||
and unzip and copy `inceptionv3_non_slim_2015.tflite` to the assets
|
|
||||||
directory. Change the chosen classifier in
|
|
||||||
[Camera2BasicFragment.java](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/java/demo/app/src/main/java/com/example/android/tflitecamerademo/Camera2BasicFragment.java)<br>
|
|
||||||
from: `classifier = new ImageClassifierQuantizedMobileNet(getActivity());`<br>
|
|
||||||
to: `classifier = new ImageClassifierFloatInception(getActivity());`.
|
|
||||||
|
|
||||||
Now you can build and run the demo app.
|
|
||||||
|
|
||||||
Some additional details are available on the
|
Some additional details are available on the
|
||||||
[TF Lite Android App page](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/java/demo/README.md).
|
[TF Lite Android App page](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/java/demo/README.md).
|
||||||
|
|
||||||
|
### Using other models
|
||||||
|
|
||||||
|
To use a different model:
|
||||||
|
* Download the floating point [Inception-v3 model](https://storage.googleapis.com/download.tensorflow.org/models/tflite/inception_v3_slim_2016_android_2017_11_10.zip).
|
||||||
|
* Unzip and copy `inceptionv3_non_slim_2015.tflite` to the assets directory.
|
||||||
|
* Change the chosen classifier in [Camera2BasicFragment.java](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/java/demo/app/src/main/java/com/example/android/tflitecamerademo/Camera2BasicFragment.java)<br>
|
||||||
|
from: `classifier = new ImageClassifierQuantizedMobileNet(getActivity());`<br>
|
||||||
|
to: `classifier = new ImageClassifierFloatInception(getActivity());`.
|
||||||
|
|
||||||
|
|
||||||
## Build TensorFlow Lite and the demo app from source
|
## Build TensorFlow Lite and the demo app from source
|
||||||
|
|
||||||
|
@ -470,51 +470,18 @@ as the loss metric. The following code calculates cross entropy when the model
|
|||||||
runs in either `TRAIN` or `EVAL` mode:
|
runs in either `TRAIN` or `EVAL` mode:
|
||||||
|
|
||||||
```python
|
```python
|
||||||
onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=10)
|
loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
|
||||||
loss = tf.losses.softmax_cross_entropy(
|
|
||||||
onehot_labels=onehot_labels, logits=logits)
|
|
||||||
```
|
```
|
||||||
|
|
||||||
Let's take a closer look at what's happening above.
|
Let's take a closer look at what's happening above.
|
||||||
|
|
||||||
Our `labels` tensor contains a list of predictions for our examples, e.g. `[1,
|
Our `labels` tensor contains a list of prediction indices for our examples, e.g. `[1,
|
||||||
9, ...]`. In order to calculate cross-entropy, first we need to convert `labels`
|
9, ...]`. `logits` contains the linear outputs of our last layer.
|
||||||
to the corresponding
|
|
||||||
[one-hot encoding](https://www.quora.com/What-is-one-hot-encoding-and-when-is-it-used-in-data-science):
|
|
||||||
|
|
||||||
```none
|
`tf.losses.sparse_softmax_cross_entropy` calculates the softmax crossentropy
|
||||||
[[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
|
(aka: categorical crossentropy, negative log-likelihood) from these two inputs
|
||||||
[0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
|
in an efficient, numerically stable way.
|
||||||
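To make "efficient, numerically stable" concrete, here is a minimal pure-Python sketch of the computation, assuming integer class labels and one list of logits per example. The function name and the plain-list inputs are illustrative assumptions, not the TensorFlow implementation; the per-example losses are averaged over the batch as a simplification.

```python
import math


def sparse_softmax_cross_entropy(labels, logits):
    """Mean of -log(softmax(logits)[label]) over the batch (illustrative only)."""
    losses = []
    for label, row in zip(labels, logits):
        # Subtracting the row maximum before exponentiating keeps exp() from
        # overflowing; this is the log-sum-exp trick.
        row_max = max(row)
        log_sum_exp = row_max + math.log(sum(math.exp(v - row_max) for v in row))
        # -log(softmax[label]) == log_sum_exp - logit[label]
        losses.append(log_sum_exp - row[label])
    return sum(losses) / len(losses)


uniform_loss = sparse_softmax_cross_entropy([3], [[0.0] * 10])
confident_loss = sparse_softmax_cross_entropy([0], [[1000.0, 0.0]])
print(uniform_loss, confident_loss)
```

Working with the label index directly avoids building a one-hot matrix, and the loss approaches 0 as the probability assigned to the correct class approaches 1.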
...]
|
|
||||||
```
|
|
||||||
|
|
||||||
We use the @{tf.one_hot} function
|
|
||||||
to perform this conversion. `tf.one_hot()` has two required arguments:
|
|
||||||
|
|
||||||
* `indices`. The locations in the one-hot tensor that will have "on
|
|
||||||
values"—i.e., the locations of `1` values in the tensor shown above.
|
|
||||||
* `depth`. The depth of the one-hot tensor—i.e., the number of target classes.
|
|
||||||
Here, the depth is `10`.
|
|
||||||
|
|
||||||
The following code creates the one-hot tensor for our labels, `onehot_labels`:
|
|
||||||
|
|
||||||
```python
|
|
||||||
onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=10)
|
|
||||||
```
|
|
||||||
|
|
||||||
Because `labels` contains a series of values from 0–9, `indices` is just our
|
|
||||||
`labels` tensor, with values cast to integers. The `depth` is `10` because we
|
|
||||||
have 10 possible target classes, one for each digit.
|
|
||||||
|
|
||||||
Next, we compute cross-entropy of `onehot_labels` and the softmax of the
|
|
||||||
predictions from our logits layer. `tf.losses.softmax_cross_entropy()` takes
|
|
||||||
`onehot_labels` and `logits` as arguments, performs softmax activation on
|
|
||||||
`logits`, calculates cross-entropy, and returns our `loss` as a scalar `Tensor`:
|
|
||||||
|
|
||||||
```python
|
|
||||||
loss = tf.losses.softmax_cross_entropy(
|
|
||||||
onehot_labels=onehot_labels, logits=logits)
|
|
||||||
```
|
|
||||||
|
|
||||||
### Configure the Training Op
|
### Configure the Training Op
|
||||||
|
|
||||||
|
@ -11210,7 +11210,7 @@ func SampleDistortedBoundingBoxAspectRatioRange(value []float32) SampleDistorted
|
|||||||
// SampleDistortedBoundingBoxAreaRange sets the optional area_range attribute to value.
|
// SampleDistortedBoundingBoxAreaRange sets the optional area_range attribute to value.
|
||||||
//
|
//
|
||||||
// value: The cropped area of the image must contain a fraction of the
|
// value: The cropped area of the image must contain a fraction of the
|
||||||
// supplied image within in this range.
|
// supplied image within this range.
|
||||||
// If not specified, defaults to <f:0.05 f:1 >
|
// If not specified, defaults to <f:0.05 f:1 >
|
||||||
func SampleDistortedBoundingBoxAreaRange(value []float32) SampleDistortedBoundingBoxAttr {
|
func SampleDistortedBoundingBoxAreaRange(value []float32) SampleDistortedBoundingBoxAttr {
|
||||||
return func(m optionalAttr) {
|
return func(m optionalAttr) {
|
||||||
@ -17969,9 +17969,10 @@ func SparseFillEmptyRowsGrad(scope *Scope, reverse_index_map tf.Output, grad_val
|
|||||||
}
|
}
|
||||||
|
|
||||||
// Computes scaled exponential linear: `scale * alpha * (exp(features) - 1)`
|
// Computes scaled exponential linear: `scale * alpha * (exp(features) - 1)`
|
||||||
//
|
|
||||||
// if < 0, `scale * features` otherwise.
|
// if < 0, `scale * features` otherwise.
|
||||||
//
|
//
|
||||||
|
// Assumes weights to have zero mean and variance 1.0 / fan_in.
|
||||||
|
//
|
||||||
// See [Self-Normalizing Neural Networks](https://arxiv.org/abs/1706.02515)
|
// See [Self-Normalizing Neural Networks](https://arxiv.org/abs/1706.02515)
|
||||||
func Selu(scope *Scope, features tf.Output) (activations tf.Output) {
|
func Selu(scope *Scope, features tf.Output) (activations tf.Output) {
|
||||||
if scope.Err() != nil {
|
if scope.Err() != nil {
|
||||||
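The piecewise formula in the `Selu` comment can be written out as a scalar Python sketch. The constant values below are the ones given in the Self-Normalizing Neural Networks paper; the names `SELU_ALPHA`, `SELU_SCALE`, and `selu` are chosen here for illustration and are not part of the Go wrapper.

```python
import math

# Constants from the Self-Normalizing Neural Networks paper.
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x):
    """scale * alpha * (exp(x) - 1) if x < 0, scale * x otherwise."""
    if x < 0:
        return SELU_SCALE * SELU_ALPHA * (math.exp(x) - 1.0)
    return SELU_SCALE * x


print(selu(-1.0), selu(0.0), selu(1.0))
```

For large negative inputs the output saturates near `-scale * alpha`, while positive inputs are simply scaled.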
@ -21655,7 +21656,7 @@ func ImageSummaryBadColor(value tf.Tensor) ImageSummaryAttr {
|
|||||||
// generated sequentially as '*tag*/image/0', '*tag*/image/1', etc.
|
// generated sequentially as '*tag*/image/0', '*tag*/image/1', etc.
|
||||||
//
|
//
|
||||||
// The `bad_color` argument is the color to use in the generated images for
|
// The `bad_color` argument is the color to use in the generated images for
|
||||||
// non-finite input values. It is a `unit8` 1-D tensor of length `channels`.
|
// non-finite input values. It is a `uint8` 1-D tensor of length `channels`.
|
||||||
// Each element must be in the range `[0, 255]` (It represents the value of a
|
// Each element must be in the range `[0, 255]` (It represents the value of a
|
||||||
// pixel in the output image). Non-finite values in the input tensor are
|
// pixel in the output image). Non-finite values in the input tensor are
|
||||||
// replaced by this tensor in the output image. The default value is the color
|
// replaced by this tensor in the output image. The default value is the color
|
||||||
@ -24048,7 +24049,7 @@ func SampleDistortedBoundingBoxV2AspectRatioRange(value []float32) SampleDistort
|
|||||||
// SampleDistortedBoundingBoxV2AreaRange sets the optional area_range attribute to value.
|
// SampleDistortedBoundingBoxV2AreaRange sets the optional area_range attribute to value.
|
||||||
//
|
//
|
||||||
// value: The cropped area of the image must contain a fraction of the
|
// value: The cropped area of the image must contain a fraction of the
|
||||||
// supplied image within in this range.
|
// supplied image within this range.
|
||||||
// If not specified, defaults to <f:0.05 f:1 >
|
// If not specified, defaults to <f:0.05 f:1 >
|
||||||
func SampleDistortedBoundingBoxV2AreaRange(value []float32) SampleDistortedBoundingBoxV2Attr {
|
func SampleDistortedBoundingBoxV2AreaRange(value []float32) SampleDistortedBoundingBoxV2Attr {
|
||||||
return func(m optionalAttr) {
|
return func(m optionalAttr) {
|
||||||
@ -24744,8 +24745,7 @@ type DecodeProtoV2Attr func(optionalAttr)
|
|||||||
// If not specified, defaults to "local://"
|
// If not specified, defaults to "local://"
|
||||||
func DecodeProtoV2DescriptorSource(value string) DecodeProtoV2Attr {
|
func DecodeProtoV2DescriptorSource(value string) DecodeProtoV2Attr {
|
||||||
return func(m optionalAttr) {
|
return func(m optionalAttr) {
|
||||||
m["descriptor_source"] = value
|
m["descriptor_source"] = value }
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
// DecodeProtoV2MessageFormat sets the optional message_format attribute to value.
|
// DecodeProtoV2MessageFormat sets the optional message_format attribute to value.
|
||||||
|
@ -13,6 +13,7 @@ See the License for the specific language governing permissions and
|
|||||||
limitations under the License.
|
limitations under the License.
|
||||||
==============================================================================*/
|
==============================================================================*/
|
||||||
|
|
||||||
|
#include <string>
|
||||||
#include <algorithm>
|
#include <algorithm>
|
||||||
#include <list>
|
#include <list>
|
||||||
#include <string>
|
#include <string>
|
||||||
|
@ -143,6 +143,82 @@ public final class Graph implements AutoCloseable {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Adds operations to compute the partial derivatives of sum of {@code y}s w.r.t {@code x}s,
|
||||||
|
* i.e., {@code d(y_1 + y_2 + ...)/dx_1, d(y_1 + y_2 + ...)/dx_2...}
|
||||||
|
* <p>
|
||||||
|
* {@code dx} are used as initial gradients (which represent the symbolic partial derivatives of some loss function
|
||||||
|
* {@code L} w.r.t. {@code y}). {@code dx} must be null or have size of {@code y}.
|
||||||
|
* <p>
|
||||||
|
* If {@code dx} is null, the implementation will use dx of {@link org.tensorflow.op.core.OnesLike OnesLike} for all
|
||||||
|
* shapes in {@code y}.
|
||||||
|
*
|
||||||
|
* @param y output of the function to derive
|
||||||
|
* @param x inputs of the function for which partial derivatives are computed
|
||||||
|
* @param dx if not null, the partial derivatives of some loss function {@code L} w.r.t. {@code y}
|
||||||
|
* @return the partial derivatives {@code dy} with the size of {@code x}
|
||||||
|
*/
|
||||||
|
public Output<?>[] addGradients(Output<?>[] y, Output<?>[] x, Output<?>[] dx) {
|
||||||
|
Output<?>[] dy = new Output<?>[x.length];
|
||||||
|
final long[] yHandles = new long[y.length];
|
||||||
|
final int[] yIndices = new int[y.length];
|
||||||
|
final long[] xHandles = new long[x.length];
|
||||||
|
final int[] xIndices = new int[x.length];
|
||||||
|
long[] dxHandles = null;
|
||||||
|
int[] dxIndices = null;
|
||||||
|
|
||||||
|
try (Reference ref = ref()) {
|
||||||
|
for (int i = 0; i < y.length; ++i) {
|
||||||
|
yHandles[i] = y[i].op().getUnsafeNativeHandle();
|
||||||
|
yIndices[i] = y[i].index();
|
||||||
|
}
|
||||||
|
for (int i = 0; i < x.length; ++i) {
|
||||||
|
xHandles[i] = x[i].op().getUnsafeNativeHandle();
|
||||||
|
xIndices[i] = x[i].index();
|
||||||
|
}
|
||||||
|
if (dx != null && dx.length > 0) {
|
||||||
|
dxHandles = new long[dx.length];
|
||||||
|
dxIndices = new int[dx.length];
|
||||||
|
|
||||||
|
for (int i = 0; i < dx.length; ++i) {
|
||||||
|
dxHandles[i] = dx[i].op().getUnsafeNativeHandle();
|
||||||
|
dxIndices[i] = dx[i].index();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
// Gradient outputs are returned in two continuous arrays concatenated into one. The first holds the native handles
|
||||||
|
// of the gradient operations while the second holds the index of their output
|
||||||
|
// e.g. given xHandles = [x0Handle, x1Handle, ...] and xIndices = [x0Index, x1Index, ..], we obtain
|
||||||
|
// dy = [dy0Handle, dy1Handle, ..., dy0Index, dy1Index, ...]
|
||||||
|
long[] dyHandlesAndIndices =
|
||||||
|
addGradients(ref.nativeHandle(), yHandles, yIndices, xHandles, xIndices, dxHandles, dxIndices);
|
||||||
|
int ndy = dyHandlesAndIndices.length >> 1;
|
||||||
|
if (ndy != dy.length) {
|
||||||
|
throw new IllegalStateException(String.valueOf(ndy) + " gradients were added to the graph when " + dy.length
|
||||||
|
+ " were expected");
|
||||||
|
}
|
||||||
|
for (int i = 0, j = ndy; i < ndy; ++i, ++j) {
|
||||||
|
Operation op = new Operation(this, dyHandlesAndIndices[i]);
|
||||||
|
dy[i] = new Output<>(op, (int) dyHandlesAndIndices[j]);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return dy;
|
||||||
|
}
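The packed return layout described in the comments above (all handles first, then all indices) can be sketched in plain Java without the native library. This is only an illustration under stated assumptions: `PackedOutputs`, `pack` and `unpack` are hypothetical names introduced here, not part of the TensorFlow API, and the handles are arbitrary numbers rather than real native pointers.

```java
/**
 * Illustrative only: mirrors the [handles..., indices...] layout described above.
 * PackedOutputs and its methods are hypothetical names, not TensorFlow APIs.
 */
final class PackedOutputs {
  final long handle;
  final int index;

  PackedOutputs(long handle, int index) {
    this.handle = handle;
    this.index = index;
  }

  /** Concatenates handles and indices into one array: [h0, h1, ..., i0, i1, ...]. */
  static long[] pack(long[] handles, int[] indices) {
    if (handles.length != indices.length) {
      throw new IllegalArgumentException("handles and indices must have the same length");
    }
    int n = handles.length;
    long[] packed = new long[n << 1];
    for (int i = 0, j = n; i < n; ++i, ++j) {
      packed[i] = handles[i];
      packed[j] = indices[i];
    }
    return packed;
  }

  /** Splits a packed array back into (handle, index) pairs, checking the expected count. */
  static PackedOutputs[] unpack(long[] packed, int expected) {
    int n = packed.length >> 1;
    if (n != expected) {
      throw new IllegalStateException(n + " outputs were returned when " + expected + " were expected");
    }
    PackedOutputs[] outputs = new PackedOutputs[n];
    for (int i = 0, j = n; i < n; ++i, ++j) {
      outputs[i] = new PackedOutputs(packed[i], (int) packed[j]);
    }
    return outputs;
  }

  public static void main(String[] args) {
    long[] packed = pack(new long[] {10L, 20L}, new int[] {0, 1});
    for (PackedOutputs o : unpack(packed, 2)) {
      System.out.println("handle=" + o.handle + " index=" + o.index);
    }
  }
}
```

Returning a single `long[]` keeps the JNI boundary to one array allocation and one copy, at the cost of a cast back to `int` for the index half.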
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Adds operations to compute the partial derivatives of sum of {@code y}s w.r.t {@code x}s,
|
||||||
|
* i.e., {@code dy/dx_1, dy/dx_2...}
|
||||||
|
* <p>
|
||||||
|
* This is a simplified version of {@link #addGradients(Output[], Output[], Output[])} where {@code y} is
|
||||||
|
* a single output and {@code dx} is null.
|
||||||
|
*
|
||||||
|
* @param y output of the function to derive
|
||||||
|
* @param x inputs of the function for which partial derivatives are computed
|
||||||
|
* @return the partial derivatives {@code dy} with the size of {@code x}
|
||||||
|
*/
|
||||||
|
public Output<?>[] addGradients(Output<?> y, Output<?>[] x) {
|
||||||
|
return addGradients(new Output<?>[]{y}, x, null);
|
||||||
|
}
|
||||||
|
|
||||||
private final Object nativeHandleLock = new Object();
|
private final Object nativeHandleLock = new Object();
|
||||||
private long nativeHandle;
|
private long nativeHandle;
|
||||||
private int refcount = 0;
|
private int refcount = 0;
|
||||||
@ -254,6 +330,9 @@ public final class Graph implements AutoCloseable {
|
|||||||
|
|
||||||
private static native byte[] toGraphDef(long handle);
|
private static native byte[] toGraphDef(long handle);
|
||||||
|
|
||||||
|
private static native long[] addGradients(long handle, long[] inputHandles, int[] inputIndices,
|
||||||
|
long[] outputHandles, int[] outputIndices, long[] gradInputHandles, int[] gradInputIndices);
|
||||||
|
|
||||||
static {
|
static {
|
||||||
TensorFlow.init();
|
TensorFlow.init();
|
||||||
}
|
}
|
||||||
|
@ -0,0 +1,153 @@
|
|||||||
|
/* Copyright 2018 The TensorFlow Authors. All Rights Reserved.
|
||||||
|
|
||||||
|
Licensed under the Apache License, Version 2.0 (the "License");
|
||||||
|
you may not use this file except in compliance with the License.
|
||||||
|
You may obtain a copy of the License at
|
||||||
|
|
||||||
|
http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
|
||||||
|
Unless required by applicable law or agreed to in writing, software
|
||||||
|
distributed under the License is distributed on an "AS IS" BASIS,
|
||||||
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||||
|
See the License for the specific language governing permissions and
|
||||||
|
limitations under the License.
|
||||||
|
==============================================================================*/
|
||||||
|
|
||||||
|
package org.tensorflow.op.core;
|
||||||
|
|
||||||
|
import java.util.Arrays;
|
||||||
|
import java.util.Iterator;
|
||||||
|
import java.util.List;
|
||||||
|
|
||||||
|
import org.tensorflow.Operand;
|
||||||
|
import org.tensorflow.Output;
|
||||||
|
import org.tensorflow.op.Op;
|
||||||
|
import org.tensorflow.op.Operands;
|
||||||
|
import org.tensorflow.op.Scope;
|
||||||
|
import org.tensorflow.op.annotation.Operator;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Adds operations to compute the partial derivatives of sum of {@code y}s w.r.t {@code x}s,
|
||||||
|
* i.e., {@code d(y_1 + y_2 + ...)/dx_1, d(y_1 + y_2 + ...)/dx_2...}
|
||||||
|
* <p>
|
||||||
|
* If {@code Options.dx()} values are set, they are used as the initial symbolic partial derivatives of some loss
|
||||||
|
* function {@code L} w.r.t. {@code y}. {@code Options.dx()} must have the size of {@code y}.
|
||||||
|
* <p>
|
||||||
|
* If {@code Options.dx()} is not set, the implementation will use dx of {@code OnesLike} for all
|
||||||
|
* shapes in {@code y}.
|
||||||
|
* <p>
|
||||||
|
* The partial derivatives are returned in output {@code dy}, with the size of {@code x}.
|
||||||
|
* <p>
|
||||||
|
* Example of usage:
|
||||||
|
* <pre>{@code
|
||||||
|
* Gradients gradients = Gradients.create(scope, Arrays.asList(loss), Arrays.asList(w, b));
|
||||||
|
*
|
||||||
|
* Constant<Float> alpha = ops.constant(1.0f, Float.class);
|
||||||
|
* ApplyGradientDescent.create(scope, w, alpha, gradients.<Float>dy(0));
|
||||||
|
* ApplyGradientDescent.create(scope, b, alpha, gradients.<Float>dy(1));
|
||||||
|
* }</pre>
|
||||||
|
*/
|
||||||
|
@Operator
|
||||||
|
public class Gradients implements Op, Iterable<Operand<?>> {
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Optional attributes for {@link Gradients}
|
||||||
|
*/
|
||||||
|
public static class Options {
|
||||||
|
|
||||||
|
/**
|
||||||
|
* @param dx partial derivatives of some loss function {@code L} w.r.t. {@code y}
|
||||||
|
* @return this option builder
|
||||||
|
*/
|
||||||
|
public Options dx(Iterable<Operand<?>> dx) {
|
||||||
|
this.dx = dx;
|
||||||
|
return this;
|
||||||
|
}
|
||||||
|
|
||||||
|
private Iterable<Operand<?>> dx;
|
||||||
|
|
||||||
|
private Options() {
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Adds gradients computation ops to the graph according to scope.
|
||||||
|
*
|
||||||
|
* @param scope current graph scope
|
||||||
|
* @param y outputs of the function to derive
|
||||||
|
* @param x inputs of the function for which partial derivatives are computed
|
||||||
|
* @param options carries optional attributes values
|
||||||
|
* @return a new instance of {@code Gradients}
|
||||||
|
*/
|
||||||
|
public static Gradients create(Scope scope, Iterable<Operand<?>> y, Iterable<Operand<?>> x, Options... options) {
|
||||||
|
Output<?>[] dx = null;
|
||||||
|
if (options != null) {
|
||||||
|
for (Options opts : options) {
|
||||||
|
if (opts.dx != null) {
|
||||||
|
dx = Operands.asOutputs(opts.dx);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
Output<?>[] gradOutputs = scope.graph().addGradients(Operands.asOutputs(y), Operands.asOutputs(x), dx);
|
||||||
|
return new Gradients(Arrays.asList(gradOutputs));
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Adds gradients computation ops to the graph according to scope.
|
||||||
|
*
|
||||||
|
* This is a simplified version of {@link #create(Scope, Iterable, Iterable, Options...)} where {@code y} is
|
||||||
|
* a single output.
|
||||||
|
*
|
||||||
|
* @param scope current graph scope
|
||||||
|
* @param y output of the function to derive
|
||||||
|
* @param x inputs of the function for which partial derivatives are computed
|
||||||
|
* @param options carries optional attributes values
|
||||||
|
* @return a new instance of {@code Gradients}
|
||||||
|
*/
|
||||||
|
@SuppressWarnings({"unchecked", "rawtypes"})
|
||||||
|
public static Gradients create(Scope scope, Operand<?> y, Iterable<Operand<?>> x, Options... options) {
|
||||||
|
return create(scope, (Iterable) Arrays.asList(y), x, options);
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* @param dx partial derivatives of some loss function {@code L} w.r.t. {@code y}
|
||||||
|
* @return builder to add more options to this operation
|
||||||
|
*/
|
||||||
|
public static Options dx(Iterable<Operand<?>> dx) {
|
||||||
|
return new Options().dx(dx);
|
||||||
|
}
|
||||||
|
|
||||||
|
@Override
|
||||||
|
@SuppressWarnings({"rawtypes", "unchecked"})
|
||||||
|
public Iterator<Operand<?>> iterator() {
|
||||||
|
return (Iterator) dy.iterator();
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Partial derivatives of {@code y}s w.r.t. {@code x}s, with the size of {@code x}
|
||||||
|
*/
|
||||||
|
public List<Output<?>> dy() {
|
||||||
|
return dy;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Returns a symbolic handle to one of the gradient operation outputs
|
||||||
|
* <p>
|
||||||
|
* Warning: Does not check that the type of the tensor matches T. It is recommended to call
|
||||||
|
* this method with an explicit type parameter rather than letting it be inferred, e.g. {@code
|
||||||
|
* gradients.<Integer>dy(0)}
|
||||||
|
*
|
||||||
|
* @param <T> The expected element type of the tensors produced by this output.
|
||||||
|
* @param index The index of the output among the gradients added by this operation
|
||||||
|
*/
|
||||||
|
@SuppressWarnings("unchecked")
|
||||||
|
public <T> Output<T> dy(int index) {
|
||||||
|
return (Output<T>) dy.get(index);
|
||||||
|
}
|
||||||
|
|
||||||
|
private List<Output<?>> dy;
|
||||||
|
|
||||||
|
private Gradients(List<Output<?>> dy) {
|
||||||
|
this.dy = dy;
|
||||||
|
}
|
||||||
|
}
|
@ -16,7 +16,9 @@ limitations under the License.
|
|||||||
#include "tensorflow/java/src/main/native/graph_jni.h"
|
#include "tensorflow/java/src/main/native/graph_jni.h"
|
||||||
|
|
||||||
#include <limits>
|
#include <limits>
|
||||||
|
#include <memory>
|
||||||
#include "tensorflow/c/c_api.h"
|
#include "tensorflow/c/c_api.h"
|
||||||
|
#include "tensorflow/java/src/main/native/utils_jni.h"
|
||||||
#include "tensorflow/java/src/main/native/exception_jni.h"
|
#include "tensorflow/java/src/main/native/exception_jni.h"
|
||||||
|
|
||||||
namespace {
|
namespace {
|
||||||
@ -130,3 +132,55 @@ Java_org_tensorflow_Graph_toGraphDef(JNIEnv* env, jclass clazz, jlong handle) {
|
|||||||
TF_DeleteBuffer(buf);
|
TF_DeleteBuffer(buf);
|
||||||
return ret;
|
return ret;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
JNIEXPORT jlongArray JNICALL
|
||||||
|
Java_org_tensorflow_Graph_addGradients(JNIEnv* env, jclass clazz, jlong handle,
|
||||||
|
jlongArray y_handles, jintArray y_indices,
|
||||||
|
jlongArray x_handles, jintArray x_indices,
|
||||||
|
jlongArray dx_handles, jintArray dx_indices) {
|
||||||
|
|
||||||
|
TF_Graph* g = requireHandle(env, handle);
|
||||||
|
if (g == nullptr) return nullptr;
|
||||||
|
|
||||||
|
const jint ny = env->GetArrayLength(y_handles);
|
||||||
|
const jint nx = env->GetArrayLength(x_handles);
|
||||||
|
|
||||||
|
std::unique_ptr<TF_Output[]> y(new TF_Output[ny]);
|
||||||
|
std::unique_ptr<TF_Output[]> x(new TF_Output[nx]);
|
||||||
|
std::unique_ptr<TF_Output[]> dx(nullptr);
|
||||||
|
std::unique_ptr<TF_Output[]> dy(new TF_Output[nx]);
|
||||||
|
|
||||||
|
resolveOutputs(env, "y", y_handles, y_indices, y.get(), ny);
|
||||||
|
resolveOutputs(env, "x", x_handles, x_indices, x.get(), nx);
|
||||||
|
if (dx_handles != nullptr) {
|
||||||
|
if (env->GetArrayLength(dx_handles) != ny) {
|
||||||
|
throwException(env, kIllegalArgumentException,
|
||||||
|
"expected %d, got %d dx handles", ny,
|
||||||
|
env->GetArrayLength(dx_handles));
|
||||||
|
}
|
||||||
|
dx.reset(new TF_Output[ny]);
|
||||||
|
resolveOutputs(env, "dx", dx_handles, dx_indices, dx.get(), ny);
|
||||||
|
}
|
||||||
|
if (env->ExceptionCheck()) return nullptr;
|
||||||
|
|
||||||
|
TF_Status* status = TF_NewStatus();
|
||||||
|
TF_AddGradients(g, y.get(), ny, x.get(), nx, dx.get(), status, dy.get());
|
||||||
|
|
||||||
|
if (!throwExceptionIfNotOK(env, status)) {
|
||||||
|
TF_DeleteStatus(status);
|
||||||
|
return nullptr;
|
||||||
|
}
|
||||||
|
TF_DeleteStatus(status);
|
||||||
|
|
||||||
|
// returned array holds the op handles in its first half and the output indices in its second half
|
||||||
|
jlongArray dy_handles_and_indices = env->NewLongArray(nx << 1);
|
||||||
|
jlong* dy_elems = env->GetLongArrayElements(dy_handles_and_indices, nullptr);
|
||||||
|
for (int i = 0, j = nx; i < nx; ++i, ++j) {
|
||||||
|
TF_Output dy_output = dy.get()[i];
|
||||||
|
dy_elems[i] = reinterpret_cast<jlong>(dy_output.oper);
|
||||||
|
dy_elems[j] = static_cast<jlong>(dy_output.index);
|
||||||
|
}
|
||||||
|
env->ReleaseLongArrayElements(dy_handles_and_indices, dy_elems, 0);
|
||||||
|
|
||||||
|
return dy_handles_and_indices;
|
||||||
|
}
|
||||||
|
@ -73,6 +73,15 @@ JNIEXPORT jbyteArray JNICALL Java_org_tensorflow_Graph_toGraphDef(JNIEnv *,
|
|||||||
jclass,
|
jclass,
|
||||||
jlong);
|
jlong);
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Class: org_tensorflow_Graph
|
||||||
|
* Method: addGradients
|
||||||
|
* Signature: (J[J[I[J[I[J[I)[J
|
||||||
|
*/
|
||||||
|
JNIEXPORT jlongArray JNICALL Java_org_tensorflow_Graph_addGradients(JNIEnv *,
|
||||||
|
jclass, jlong, jlongArray, jintArray, jlongArray, jintArray, jlongArray,
|
||||||
|
jintArray);
|
||||||
|
|
||||||
#ifdef __cplusplus
|
#ifdef __cplusplus
|
||||||
} // extern "C"
|
} // extern "C"
|
||||||
#endif // __cplusplus
|
#endif // __cplusplus
|
||||||
|
@ -17,6 +17,7 @@ limitations under the License.
|
|||||||
#include <memory>
|
#include <memory>
|
||||||
|
|
||||||
#include "tensorflow/c/c_api.h"
|
#include "tensorflow/c/c_api.h"
|
||||||
|
#include "tensorflow/java/src/main/native/utils_jni.h"
|
||||||
#include "tensorflow/java/src/main/native/exception_jni.h"
|
#include "tensorflow/java/src/main/native/exception_jni.h"
|
||||||
#include "tensorflow/java/src/main/native/session_jni.h"
|
#include "tensorflow/java/src/main/native/session_jni.h"
|
||||||
|
|
||||||
@ -55,37 +56,6 @@ void resolveHandles(JNIEnv* env, const char* type, jlongArray src_array,
|
|||||||
env->ReleaseLongArrayElements(src_array, src_start, JNI_ABORT);
|
env->ReleaseLongArrayElements(src_array, src_start, JNI_ABORT);
|
||||||
}
|
}
|
||||||
|
|
||||||
void resolveOutputs(JNIEnv* env, const char* type, jlongArray src_op,
|
|
||||||
jintArray src_index, TF_Output* dst, jint n) {
|
|
||||||
if (env->ExceptionCheck()) return;
|
|
||||||
jint len = env->GetArrayLength(src_op);
|
|
||||||
if (len != n) {
|
|
||||||
throwException(env, kIllegalArgumentException,
|
|
||||||
"expected %d, got %d %s Operations", n, len, type);
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
len = env->GetArrayLength(src_index);
|
|
||||||
if (len != n) {
|
|
||||||
throwException(env, kIllegalArgumentException,
|
|
||||||
"expected %d, got %d %s Operation output indices", n, len,
|
|
||||||
type);
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
jlong* op_handles = env->GetLongArrayElements(src_op, nullptr);
|
|
||||||
jint* indices = env->GetIntArrayElements(src_index, nullptr);
|
|
||||||
for (int i = 0; i < n; ++i) {
|
|
||||||
if (op_handles[i] == 0) {
|
|
||||||
throwException(env, kNullPointerException, "invalid %s (#%d of %d)", type,
|
|
||||||
i, n);
|
|
||||||
break;
|
|
||||||
}
|
|
||||||
dst[i] = TF_Output{reinterpret_cast<TF_Operation*>(op_handles[i]),
|
|
||||||
static_cast<int>(indices[i])};
|
|
||||||
}
|
|
||||||
env->ReleaseIntArrayElements(src_index, indices, JNI_ABORT);
|
|
||||||
env->ReleaseLongArrayElements(src_op, op_handles, JNI_ABORT);
|
|
||||||
}
|
|
||||||
|
|
||||||
void TF_MaybeDeleteBuffer(TF_Buffer* buf) {
|
void TF_MaybeDeleteBuffer(TF_Buffer* buf) {
|
||||||
if (buf == nullptr) return;
|
if (buf == nullptr) return;
|
||||||
TF_DeleteBuffer(buf);
|
TF_DeleteBuffer(buf);
|
||||||
|
53
tensorflow/java/src/main/native/utils_jni.cc
Normal file
@ -0,0 +1,53 @@
|
|||||||
|
/* Copyright 2018 The TensorFlow Authors. All Rights Reserved.
|
||||||
|
|
||||||
|
Licensed under the Apache License, Version 2.0 (the "License");
|
||||||
|
you may not use this file except in compliance with the License.
|
||||||
|
You may obtain a copy of the License at
|
||||||
|
|
||||||
|
http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
|
||||||
|
Unless required by applicable law or agreed to in writing, software
|
||||||
|
distributed under the License is distributed on an "AS IS" BASIS,
|
||||||
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||||
|
See the License for the specific language governing permissions and
|
||||||
|
limitations under the License.
|
||||||
|
==============================================================================*/
|
||||||
|
|
||||||
|
#include "tensorflow/java/src/main/native/utils_jni.h"
|
||||||
|
|
||||||
|
#include "tensorflow/java/src/main/native/exception_jni.h"
|
||||||
|
|
||||||
|
void resolveOutputs(JNIEnv* env, const char* type, jlongArray src_op,
|
||||||
|
jintArray src_index, TF_Output* dst, jint n) {
|
||||||
|
if (env->ExceptionCheck()) return;
|
||||||
|
jint len = env->GetArrayLength(src_op);
|
||||||
|
if (len != n) {
|
||||||
|
throwException(env, kIllegalArgumentException,
|
||||||
|
"expected %d, got %d %s Operations", n, len, type);
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
len = env->GetArrayLength(src_index);
|
||||||
|
if (len != n) {
|
||||||
|
throwException(env, kIllegalArgumentException,
|
||||||
|
"expected %d, got %d %s Operation output indices", n, len,
|
||||||
|
type);
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
jlong* op_handles = env->GetLongArrayElements(src_op, nullptr);
|
||||||
|
jint* indices = env->GetIntArrayElements(src_index, nullptr);
|
||||||
|
for (int i = 0; i < n; ++i) {
|
||||||
|
if (op_handles[i] == 0) {
|
||||||
|
throwException(env, kNullPointerException, "invalid %s (#%d of %d)", type,
|
||||||
|
i, n);
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
dst[i] = TF_Output{reinterpret_cast<TF_Operation*>(op_handles[i]),
|
||||||
|
static_cast<int>(indices[i])};
|
||||||
|
}
|
||||||
|
env->ReleaseIntArrayElements(src_index, indices, JNI_ABORT);
|
||||||
|
env->ReleaseLongArrayElements(src_op, op_handles, JNI_ABORT);
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
33
tensorflow/java/src/main/native/utils_jni.h
Normal file
@ -0,0 +1,33 @@
|
|||||||
|
/* Copyright 2018 The TensorFlow Authors. All Rights Reserved.
|
||||||
|
|
||||||
|
Licensed under the Apache License, Version 2.0 (the "License");
|
||||||
|
you may not use this file except in compliance with the License.
|
||||||
|
You may obtain a copy of the License at
|
||||||
|
|
||||||
|
http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
|
||||||
|
Unless required by applicable law or agreed to in writing, software
|
||||||
|
distributed under the License is distributed on an "AS IS" BASIS,
|
||||||
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||||
|
See the License for the specific language governing permissions and
|
||||||
|
limitations under the License.
|
||||||
|
==============================================================================*/
|
||||||
|
|
||||||
|
#ifndef TENSORFLOW_JAVA_UTILS_JNI_H_
|
||||||
|
#define TENSORFLOW_JAVA_UTILS_JNI_H_
|
||||||
|
|
||||||
|
#include <jni.h>
|
||||||
|
|
||||||
|
#include "tensorflow/c/c_api.h"
|
||||||
|
|
||||||
|
#ifdef __cplusplus
|
||||||
|
extern "C" {
|
||||||
|
#endif // __cplusplus
|
||||||
|
|
||||||
|
void resolveOutputs(JNIEnv* env, const char* type, jlongArray src_op,
|
||||||
|
jintArray src_index, TF_Output* dst, jint n);
|
||||||
|
|
||||||
|
#ifdef __cplusplus
|
||||||
|
} // extern "C"
|
||||||
|
#endif // __cplusplus
|
||||||
|
#endif /* TENSORFLOW_JAVA_UTILS_JNI_H_ */
|
@ -22,6 +22,7 @@ import static org.junit.Assert.assertTrue;
|
|||||||
|
|
||||||
import java.util.HashSet;
|
import java.util.HashSet;
|
||||||
import java.util.Iterator;
|
import java.util.Iterator;
|
||||||
|
|
||||||
import org.junit.Test;
|
import org.junit.Test;
|
||||||
import org.junit.runner.RunWith;
|
import org.junit.runner.RunWith;
|
||||||
import org.junit.runners.JUnit4;
|
import org.junit.runners.JUnit4;
|
||||||
@ -129,4 +130,106 @@ public class GraphTest {
|
|||||||
// expected exception.
|
// expected exception.
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@Test
|
||||||
|
public void addGradientsToGraph() {
|
||||||
|
try (Graph g = new Graph();
|
||||||
|
Session s = new Session(g)) {
|
||||||
|
|
||||||
|
Output<Float> x1 = TestUtil.placeholder(g, "x1", Float.class);
|
||||||
|
Output<Float> x2 = TestUtil.placeholder(g, "x2", Float.class);
|
||||||
|
Output<Float> y0 = TestUtil.square(g, "y0", x1);
|
||||||
|
Output<Float> y1 = TestUtil.square(g, "y1", y0);
|
||||||
|
Output<Float> y2 = TestUtil.addN(g, y0, x2);
|
||||||
|
|
||||||
|
Output<?>[] grads0 = g.addGradients(y1, toArray(x1));
|
||||||
|
assertNotNull(grads0);
|
||||||
|
assertEquals(1, grads0.length);
|
||||||
|
assertEquals(DataType.FLOAT, grads0[0].dataType());
|
||||||
|
|
||||||
|
Output<?>[] grads1 = g.addGradients(y2, toArray(x1, x2));
|
||||||
|
assertNotNull(grads1);
|
||||||
|
assertEquals(2, grads1.length);
|
||||||
|
assertEquals(DataType.FLOAT, grads1[0].dataType());
|
||||||
|
assertEquals(DataType.FLOAT, grads1[1].dataType());
|
||||||
|
|
||||||
|
try (Tensor<Float> c1 = Tensors.create(3.0f);
|
||||||
|
Tensor<Float> c2 = Tensors.create(2.0f);
|
||||||
|
TestUtil.AutoCloseableList<Tensor<?>> outputs = new TestUtil.AutoCloseableList<>(
|
||||||
|
s.runner()
|
||||||
|
.feed(x1, c1)
|
||||||
|
.feed(x2, c2)
|
||||||
|
.fetch(grads0[0])
|
||||||
|
.fetch(grads1[0])
|
||||||
|
.fetch(grads1[1])
|
||||||
|
.run())) {
|
||||||
|
|
||||||
|
assertEquals(3, outputs.size());
|
||||||
|
assertEquals(108.0f, outputs.get(0).floatValue(), 0.0f);
|
||||||
|
assertEquals(6.0f, outputs.get(1).floatValue(), 0.0f);
|
||||||
|
assertEquals(1.0f, outputs.get(2).floatValue(), 0.0f);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
@Test
|
||||||
|
public void addGradientSumsToGraph() {
|
||||||
|
try (Graph g = new Graph();
|
||||||
|
Session s = new Session(g)) {
|
||||||
|
|
||||||
|
Output<Float> x = TestUtil.placeholder(g, "x", Float.class);
|
||||||
|
Output<Float> y0 = TestUtil.square(g, "y0", x);
|
||||||
|
Output<Float> y1 = TestUtil.square(g, "y1", y0);
|
||||||
|
|
||||||
|
Output<?>[] grad = g.addGradients(toArray(y0, y1), toArray(x), null);
|
||||||
|
assertNotNull(grad);
|
||||||
|
assertEquals(1, grad.length);
|
||||||
|
assertEquals(DataType.FLOAT, grad[0].dataType());
|
||||||
|
|
||||||
|
try (Tensor<Float> c = Tensors.create(3.0f);
|
||||||
|
Tensor<?> output = s.runner()
|
||||||
|
.feed(x, c)
|
||||||
|
.fetch(grad[0])
|
||||||
|
.run()
|
||||||
|
.get(0)) {
|
||||||
|
|
||||||
|
assertEquals(114.0f, output.floatValue(), 0.0f);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
@Test
|
||||||
|
public void addGradientsWithInitialValuesToGraph() {
|
||||||
|
try (Graph g = new Graph();
|
||||||
|
Session s = new Session(g)) {
|
||||||
|
|
||||||
|
Output<Float> x = TestUtil.placeholder(g, "x", Float.class);
|
||||||
|
Output<Float> y0 = TestUtil.square(g, "y0", x);
|
||||||
|
Output<Float> y1 = TestUtil.square(g, "y1", y0);
|
||||||
|
|
||||||
|
Output<?>[] grad0 = g.addGradients(y1, toArray(y0));
|
||||||
|
assertNotNull(grad0);
|
||||||
|
assertEquals(1, grad0.length);
|
||||||
|
assertEquals(DataType.FLOAT, grad0[0].dataType());
|
||||||
|
|
||||||
|
Output<?>[] grad1 = g.addGradients(toArray(y0), toArray(x), toArray(grad0[0]));
|
||||||
|
assertNotNull(grad1);
|
||||||
|
assertEquals(1, grad1.length);
|
||||||
|
assertEquals(DataType.FLOAT, grad1[0].dataType());
|
||||||
|
|
||||||
|
try (Tensor<Float> c = Tensors.create(3.0f);
|
||||||
|
Tensor<?> output = s.runner()
|
||||||
|
.feed(x, c)
|
||||||
|
.fetch(grad1[0])
|
||||||
|
.run()
|
||||||
|
.get(0)) {
|
||||||
|
|
||||||
|
assertEquals(108.0f, output.floatValue(), 0.0f);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
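The expected values asserted in the three gradient tests above (108, 6, 1, 114 and 108 again) follow from the chain rule applied to `Square`. As a sanity check, they can be reproduced with plain arithmetic, independent of TensorFlow. `GradientValueCheck` and its methods are hypothetical names used only for this sketch.

```java
/** Hand-computed derivatives for the graphs built in the tests; not TensorFlow code. */
final class GradientValueCheck {
  static float square(float v) {
    return v * v;
  }

  /** Derivative of square(v) with respect to v. */
  static float dSquare(float v) {
    return 2.0f * v;
  }

  /** d(y1)/dx where y0 = x^2 and y1 = y0^2, by the chain rule: 2*y0 * 2*x. */
  static float chainedSquareGradient(float x) {
    return dSquare(square(x)) * dSquare(x);
  }

  /** d(y0 + y1)/dx, the gradient of the sum of both outputs. */
  static float summedGradient(float x) {
    return dSquare(x) + chainedSquareGradient(x);
  }

  /** d(y0)/dx scaled by an initial gradient (the dx argument of addGradients). */
  static float seededGradient(float x, float seed) {
    return seed * dSquare(x);
  }

  public static void main(String[] args) {
    float x = 3.0f;
    System.out.println(chainedSquareGradient(x));                      // addGradientsToGraph, grads0[0]
    System.out.println(dSquare(x));                                    // addGradientsToGraph, grads1[0]
    System.out.println(summedGradient(x));                             // addGradientSumsToGraph
    System.out.println(seededGradient(x, dSquare(square(x))));         // addGradientsWithInitialValuesToGraph
  }
}
```

The last case shows why seeding `dx` with `dy1/dy0` reproduces the value of the first test: the seed supplies the outer factor of the chain rule.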
|
||||||
|
|
||||||
|
private static Output<?>[] toArray(Output<?>... outputs) {
|
||||||
|
return outputs;
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
@ -20,8 +20,6 @@ import static org.junit.Assert.assertEquals;
|
|||||||
import static org.junit.Assert.assertTrue;
|
import static org.junit.Assert.assertTrue;
|
||||||
import static org.junit.Assert.fail;
|
import static org.junit.Assert.fail;
|
||||||
|
|
||||||
import java.util.ArrayList;
|
|
||||||
import java.util.Collection;
|
|
||||||
import org.junit.Test;
|
import org.junit.Test;
|
||||||
import org.junit.runner.RunWith;
|
import org.junit.runner.RunWith;
|
||||||
import org.junit.runners.JUnit4;
|
import org.junit.runners.JUnit4;
|
||||||
@ -36,8 +34,8 @@ public class SessionTest {
|
|||||||
Session s = new Session(g)) {
|
Session s = new Session(g)) {
|
||||||
TestUtil.transpose_A_times_X(g, new int[][] {{2}, {3}});
|
TestUtil.transpose_A_times_X(g, new int[][] {{2}, {3}});
|
||||||
try (Tensor<Integer> x = Tensors.create(new int[][] {{5}, {7}});
|
try (Tensor<Integer> x = Tensors.create(new int[][] {{5}, {7}});
|
||||||
AutoCloseableList<Tensor<?>> outputs =
|
TestUtil.AutoCloseableList<Tensor<?>> outputs =
|
||||||
new AutoCloseableList<Tensor<?>>(s.runner().feed("X", x).fetch("Y").run())) {
|
new TestUtil.AutoCloseableList<Tensor<?>>(s.runner().feed("X", x).fetch("Y").run())) {
|
||||||
assertEquals(1, outputs.size());
|
assertEquals(1, outputs.size());
|
||||||
final int[][] expected = {{31}};
|
final int[][] expected = {{31}};
|
||||||
assertArrayEquals(expected, outputs.get(0).copyTo(new int[1][1]));
|
assertArrayEquals(expected, outputs.get(0).copyTo(new int[1][1]));
|
||||||
@ -53,8 +51,8 @@ public class SessionTest {
|
|||||||
Output<Integer> feed = g.operation("X").output(0);
|
Output<Integer> feed = g.operation("X").output(0);
|
||||||
Output<Integer> fetch = g.operation("Y").output(0);
|
Output<Integer> fetch = g.operation("Y").output(0);
|
||||||
try (Tensor<Integer> x = Tensors.create(new int[][] {{5}, {7}});
|
try (Tensor<Integer> x = Tensors.create(new int[][] {{5}, {7}});
|
||||||
AutoCloseableList<Tensor<?>> outputs =
|
TestUtil.AutoCloseableList<Tensor<?>> outputs =
|
||||||
new AutoCloseableList<Tensor<?>>(s.runner().feed(feed, x).fetch(fetch).run())) {
|
new TestUtil.AutoCloseableList<Tensor<?>>(s.runner().feed(feed, x).fetch(fetch).run())) {
|
||||||
assertEquals(1, outputs.size());
|
assertEquals(1, outputs.size());
|
||||||
final int[][] expected = {{31}};
|
final int[][] expected = {{31}};
|
||||||
assertArrayEquals(expected, outputs.get(0).copyTo(new int[1][1]));
|
assertArrayEquals(expected, outputs.get(0).copyTo(new int[1][1]));
|
||||||
@ -112,7 +110,7 @@ public class SessionTest {
|
|||||||
.setOptions(fullTraceRunOptions())
|
.setOptions(fullTraceRunOptions())
|
||||||
.runAndFetchMetadata();
|
.runAndFetchMetadata();
|
||||||
// Sanity check on outputs.
|
// Sanity check on outputs.
|
||||||
AutoCloseableList<Tensor<?>> outputs = new AutoCloseableList<Tensor<?>>(result.outputs);
|
TestUtil.AutoCloseableList<Tensor<?>> outputs = new TestUtil.AutoCloseableList<Tensor<?>>(result.outputs);
|
||||||
assertEquals(1, outputs.size());
|
assertEquals(1, outputs.size());
|
||||||
final int[][] expected = {{31}};
|
final int[][] expected = {{31}};
|
||||||
assertArrayEquals(expected, outputs.get(0).copyTo(new int[1][1]));
|
assertArrayEquals(expected, outputs.get(0).copyTo(new int[1][1]));
|
||||||
@ -135,8 +133,8 @@ public class SessionTest {
|
|||||||
Session s = new Session(g)) {
|
Session s = new Session(g)) {
|
||||||
TestUtil.constant(g, "c1", 2718);
|
TestUtil.constant(g, "c1", 2718);
|
||||||
TestUtil.constant(g, "c2", 31415);
|
TestUtil.constant(g, "c2", 31415);
|
||||||
AutoCloseableList<Tensor<?>> outputs =
|
TestUtil.AutoCloseableList<Tensor<?>> outputs =
|
||||||
new AutoCloseableList<Tensor<?>>(s.runner().fetch("c2").fetch("c1").run());
|
new TestUtil.AutoCloseableList<Tensor<?>>(s.runner().fetch("c2").fetch("c1").run());
|
||||||
assertEquals(2, outputs.size());
|
assertEquals(2, outputs.size());
|
||||||
assertEquals(31415, outputs.get(0).intValue());
|
assertEquals(31415, outputs.get(0).intValue());
|
||||||
assertEquals(2718, outputs.get(1).intValue());
|
assertEquals(2718, outputs.get(1).intValue());
|
||||||
@ -164,28 +162,6 @@ public class SessionTest {
|
|||||||
Session s = new Session(g, singleThreadConfigProto())) {}
|
Session s = new Session(g, singleThreadConfigProto())) {}
|
||||||
}
|
}
|
||||||
|
|
||||||
private static final class AutoCloseableList<E extends AutoCloseable> extends ArrayList<E>
|
|
||||||
implements AutoCloseable {
|
|
||||||
AutoCloseableList(Collection<? extends E> c) {
|
|
||||||
super(c);
|
|
||||||
}
|
|
||||||
|
|
||||||
@Override
|
|
||||||
public void close() {
|
|
||||||
Exception toThrow = null;
|
|
||||||
for (AutoCloseable c : this) {
|
|
||||||
try {
|
|
||||||
c.close();
|
|
||||||
} catch (Exception e) {
|
|
||||||
toThrow = e;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
if (toThrow != null) {
|
|
||||||
throw new RuntimeException(toThrow);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
private static byte[] fullTraceRunOptions() {
|
private static byte[] fullTraceRunOptions() {
|
||||||
// Ideally this would use the generated Java sources for protocol buffers
|
// Ideally this would use the generated Java sources for protocol buffers
|
||||||
// and end up with something like the snippet below. However, generating
|
// and end up with something like the snippet below. However, generating
|
||||||
|
@ -16,9 +16,34 @@ limitations under the License.
|
|||||||
package org.tensorflow;
|
package org.tensorflow;
|
||||||
|
|
||||||
import java.lang.reflect.Array;
|
import java.lang.reflect.Array;
|
||||||
|
import java.util.ArrayList;
|
||||||
|
import java.util.Collection;
|
||||||
|
|
||||||
/** Static utility functions. */
|
/** Static utility functions. */
|
||||||
public class TestUtil {
|
public class TestUtil {
|
||||||
|
|
||||||
|
public static final class AutoCloseableList<E extends AutoCloseable> extends ArrayList<E>
|
||||||
|
implements AutoCloseable {
|
||||||
|
AutoCloseableList(Collection<? extends E> c) {
|
||||||
|
super(c);
|
||||||
|
}
|
||||||
|
|
||||||
|
@Override
|
||||||
|
public void close() {
|
||||||
|
Exception toThrow = null;
|
||||||
|
for (AutoCloseable c : this) {
|
||||||
|
try {
|
||||||
|
c.close();
|
||||||
|
} catch (Exception e) {
|
||||||
|
toThrow = e;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
if (toThrow != null) {
|
||||||
|
throw new RuntimeException(toThrow);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
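The behaviour of `AutoCloseableList.close()` is easy to miss when reading the diff: every element is closed even if an earlier one throws, and only the last failure is rethrown, wrapped in a `RuntimeException`. A minimal standalone sketch of that behaviour follows. The list class is copied from the diff; `CloseAllSketch`, `Resource` and `demo` are hypothetical names added for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

final class CloseAllSketch {
  /** Same logic as the helper in the diff: close everything, rethrow the last failure. */
  static final class AutoCloseableList<E extends AutoCloseable> extends ArrayList<E>
      implements AutoCloseable {
    AutoCloseableList(Collection<? extends E> c) {
      super(c);
    }

    @Override
    public void close() {
      Exception toThrow = null;
      for (AutoCloseable c : this) {
        try {
          c.close();
        } catch (Exception e) {
          toThrow = e; // keep going so the remaining elements are still closed
        }
      }
      if (toThrow != null) {
        throw new RuntimeException(toThrow);
      }
    }
  }

  /** Hypothetical resource that records when it is closed and can be told to fail. */
  static final class Resource implements AutoCloseable {
    private final String name;
    private final boolean failOnClose;
    private final List<String> log;

    Resource(String name, boolean failOnClose, List<String> log) {
      this.name = name;
      this.failOnClose = failOnClose;
      this.log = log;
    }

    @Override
    public void close() throws Exception {
      log.add(name);
      if (failOnClose) {
        throw new Exception(name + " failed");
      }
    }
  }

  /** Closes two resources, the first of which fails, and returns what happened. */
  static List<String> demo() {
    List<String> log = new ArrayList<>();
    try (AutoCloseableList<Resource> list =
        new AutoCloseableList<>(
            Arrays.asList(new Resource("a", true, log), new Resource("b", false, log)))) {
      // The resources would be used here.
    } catch (RuntimeException e) {
      log.add("caught: " + e.getCause().getMessage());
    }
    return log;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

If several elements fail, earlier exceptions are dropped by this design; `Throwable.addSuppressed` would be one way to retain them, though the helper in the diff does not do so.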
|
||||||
|
|
||||||
public static <T> Output<T> constant(Graph g, String name, Object value) {
|
public static <T> Output<T> constant(Graph g, String name, Object value) {
|
||||||
try (Tensor<?> t = Tensor.create(value)) {
|
try (Tensor<?> t = Tensor.create(value)) {
|
||||||
return g.opBuilder("Const", name)
|
return g.opBuilder("Const", name)
|
||||||
@ -36,7 +61,7 @@ public class TestUtil {
|
|||||||
.<T>output(0);
|
.<T>output(0);
|
||||||
}
|
}
|
||||||
|
|
||||||
public static Output<?> addN(Graph g, Output<?>... inputs) {
|
public static <T> Output<T> addN(Graph g, Output<?>... inputs) {
|
||||||
return g.opBuilder("AddN", "AddN").addInputList(inputs).build().output(0);
|
return g.opBuilder("AddN", "AddN").addInputList(inputs).build().output(0);
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -58,6 +83,13 @@ public class TestUtil {
|
|||||||
.setAttr("num_split", numSplit)
|
.setAttr("num_split", numSplit)
|
||||||
.build();
|
.build();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
public static <T> Output<T> square(Graph g, String name, Output<T> value) {
|
||||||
|
return g.opBuilder("Square", name)
|
||||||
|
.addInput(value)
|
||||||
|
.build()
|
||||||
|
.<T>output(0);
|
||||||
|
}
|
||||||
|
|
||||||
public static void transpose_A_times_X(Graph g, int[][] a) {
|
public static void transpose_A_times_X(Graph g, int[][] a) {
|
||||||
Output<Integer> aa = constant(g, "A", a);
|
Output<Integer> aa = constant(g, "A", a);
|
||||||
|
@ -99,7 +99,7 @@ class EstimatorSpec(
|
|||||||
ignored in eval and infer modes. Example:
|
ignored in eval and infer modes. Example:
|
||||||
|
|
||||||
```python
|
```python
|
||||||
def my_model_fn(mode, features, labels):
|
def my_model_fn(features, labels, mode):
|
||||||
predictions = ...
|
predictions = ...
|
||||||
loss = ...
|
loss = ...
|
||||||
train_op = ...
|
train_op = ...
|
||||||
@ -114,7 +114,7 @@ class EstimatorSpec(
|
|||||||
given mode. Example:
|
given mode. Example:
|
||||||
|
|
||||||
```python
|
```python
|
||||||
def my_model_fn(mode, features, labels):
|
def my_model_fn(features, labels, mode):
|
||||||
if (mode == tf.estimator.ModeKeys.TRAIN or
|
if (mode == tf.estimator.ModeKeys.TRAIN or
|
||||||
mode == tf.estimator.ModeKeys.EVAL):
|
mode == tf.estimator.ModeKeys.EVAL):
|
||||||
loss = ...
|
loss = ...
|
||||||
|
@ -3239,8 +3239,9 @@ class Graph(object):
|
|||||||
# the name will still appear in _names_in_use even though the name hasn't
|
# the name will still appear in _names_in_use even though the name hasn't
|
||||||
# been used. This is ok, just leave _names_in_use as-is in this case.
|
# been used. This is ok, just leave _names_in_use as-is in this case.
|
||||||
# TODO(skyewm): make the C API guarantee no name conflicts.
|
# TODO(skyewm): make the C API guarantee no name conflicts.
|
||||||
if ret.name not in self._names_in_use:
|
name_key = ret.name.lower()
|
||||||
self._names_in_use[ret.name] = 1
|
if name_key not in self._names_in_use:
|
||||||
|
self._names_in_use[name_key] = 1
|
||||||
self._create_op_helper(ret, compute_device=compute_device)
|
self._create_op_helper(ret, compute_device=compute_device)
|
||||||
return ret
|
return ret
|
||||||
|
|
||||||
@ -3949,20 +3950,27 @@ class Graph(object):
|
|||||||
"""
|
"""
|
||||||
if self._name_stack:
|
if self._name_stack:
|
||||||
name = self._name_stack + "/" + name
|
name = self._name_stack + "/" + name
|
||||||
i = self._names_in_use.get(name, 0)
|
|
||||||
# Increment the number for "name".
|
# For the sake of checking for names in use, we treat names as case
|
||||||
|
# insensitive (e.g. foo = Foo).
|
||||||
|
name_key = name.lower()
|
||||||
|
i = self._names_in_use.get(name_key, 0)
|
||||||
|
# Increment the number for "name_key".
|
||||||
if mark_as_used:
|
if mark_as_used:
|
||||||
self._names_in_use[name] = i + 1
|
self._names_in_use[name_key] = i + 1
|
||||||
if i > 0:
|
if i > 0:
|
||||||
base_name = name
|
base_name_key = name_key
|
||||||
# Make sure the composed name is not already used.
|
# Make sure the composed name key is not already used.
|
||||||
while name in self._names_in_use:
|
while name_key in self._names_in_use:
|
||||||
name = "%s_%d" % (base_name, i)
|
name_key = "%s_%d" % (base_name_key, i)
|
||||||
i += 1
|
i += 1
|
||||||
# Mark the composed name as used in case someone wants
|
# Mark the composed name_key as used in case someone wants
|
||||||
# to call unique_name("name_1").
|
# to call unique_name("name_1").
|
||||||
if mark_as_used:
|
if mark_as_used:
|
||||||
self._names_in_use[name] = 1
|
self._names_in_use[name_key] = 1
|
||||||
|
|
||||||
|
# Return the new name with the original capitalization of the given name.
|
||||||
|
name = "%s_%d" % (name, i-1)
|
||||||
return name
|
return name
|
||||||
|
|
||||||
def get_name_scope(self):
|
def get_name_scope(self):
|
||||||
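The case-insensitive bookkeeping that this hunk adds to `unique_name` can be sketched outside TensorFlow as follows. This is a simplified illustration rather than the library's implementation: `NameRegistry` is a hypothetical class, it ignores name scopes, and it always marks names as used.

```python
class NameRegistry(object):
  """Hypothetical stand-in for the name bookkeeping in Graph.unique_name."""

  def __init__(self):
    # Maps lower-cased names to the number of times they were requested.
    self._names_in_use = {}

  def unique_name(self, name):
    # Names are compared case-insensitively, so "foo" and "Foo" collide.
    name_key = name.lower()
    i = self._names_in_use.get(name_key, 0)
    self._names_in_use[name_key] = i + 1
    if i > 0:
      base_name_key = name_key
      # Probe suffixed keys until one is free.
      while name_key in self._names_in_use:
        name_key = "%s_%d" % (base_name_key, i)
        i += 1
      self._names_in_use[name_key] = 1
      # Keep the caller's capitalization in the returned name.
      name = "%s_%d" % (name, i - 1)
    return name


registry = NameRegistry()
first = registry.unique_name("foo")
second = registry.unique_name("Foo")
third = registry.unique_name("FOO")
```

Under these assumptions the calls return `foo`, `Foo_1`, and `FOO_2`. Only the lower-cased key is stored, while the returned name keeps the capitalization the caller supplied.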
|
@ -965,6 +965,15 @@ class NameStackTest(test_util.TensorFlowTestCase):
|
|||||||
self.assertEqual("foo_1", g.unique_name("foo"))
|
self.assertEqual("foo_1", g.unique_name("foo"))
|
||||||
self.assertEqual("foo_3", g.unique_name("foo"))
|
self.assertEqual("foo_3", g.unique_name("foo"))
|
||||||
|
|
||||||
|
def testUniqueNameCaseInsensitivity(self):
|
||||||
|
g = ops.Graph()
|
||||||
|
self.assertEqual("foo", g.unique_name("foo"))
|
||||||
|
self.assertEqual("Foo_1", g.unique_name("Foo"))
|
||||||
|
with g.name_scope("bar"):
|
||||||
|
self.assertEqual("bar/foo", g.unique_name("foo"))
|
||||||
|
with g.name_scope("Bar"):
|
||||||
|
self.assertEqual("Bar_1/foo", g.unique_name("foo"))
|
||||||
|
|
||||||
def testInvalidNameRaisesError(self):
|
def testInvalidNameRaisesError(self):
|
||||||
g = ops.Graph()
|
g = ops.Graph()
|
||||||
with g.name_scope(""): # Should not raise
|
with g.name_scope(""): # Should not raise
|
||||||
|
@ -1390,7 +1390,7 @@ class LayoutOptimizerTest(test.TestCase):
|
|||||||
expected_num_transposes = 3
|
expected_num_transposes = 3
|
||||||
self.assertEqual(expected_num_transposes, num_transposes)
|
self.assertEqual(expected_num_transposes, num_transposes)
|
||||||
self._assert_trans_nhwc_to_nchw('map/while/Conv2D-0', nodes)
|
self._assert_trans_nhwc_to_nchw('map/while/Conv2D-0', nodes)
|
||||||
self._assert_trans_nchw_to_nhwc('map/while/Add-0-2', nodes)
|
self._assert_trans_nchw_to_nhwc('map/while/Add_1-0-2', nodes)
|
||||||
self.assertAllClose(output_val_ref, output_val, atol=1e-3)
|
self.assertAllClose(output_val_ref, output_val, atol=1e-3)
|
||||||
|
|
||||||
def testLoopWithVecAnd4D(self):
|
def testLoopWithVecAnd4D(self):
|
||||||
@ -1414,7 +1414,7 @@ class LayoutOptimizerTest(test.TestCase):
|
|||||||
expected_num_transposes = 2
|
expected_num_transposes = 2
|
||||||
self.assertEqual(expected_num_transposes, num_transposes)
|
self.assertEqual(expected_num_transposes, num_transposes)
|
||||||
self._assert_trans_nhwc_to_nchw('map/while/Conv2D-0', nodes)
|
self._assert_trans_nhwc_to_nchw('map/while/Conv2D-0', nodes)
|
||||||
self._assert_trans_nchw_to_nhwc('map/while/Add-0-2', nodes)
|
self._assert_trans_nchw_to_nhwc('map/while/Add_1-0-2', nodes)
|
||||||
self.assertAllClose(output_val_ref, output_val, atol=1e-3)
|
self.assertAllClose(output_val_ref, output_val, atol=1e-3)
|
||||||
|
|
||||||
def testBinaryOpSecondPort(self):
|
def testBinaryOpSecondPort(self):
|
||||||
|
@ -893,6 +893,7 @@ tf_py_test(
|
|||||||
"//third_party/py/numpy",
|
"//third_party/py/numpy",
|
||||||
"//tensorflow/python:client_testlib",
|
"//tensorflow/python:client_testlib",
|
||||||
"//tensorflow/python:framework",
|
"//tensorflow/python:framework",
|
||||||
|
"//tensorflow/python:sparse_grad",
|
||||||
"//tensorflow/python:sparse_ops",
|
"//tensorflow/python:sparse_ops",
|
||||||
],
|
],
|
||||||
)
|
)
|
||||||
|
@ -364,14 +364,52 @@ class UniformUnitScalingInitializationTest(test.TestCase):
|
|||||||
|
|
||||||
class VarianceScalingInitializationTest(test.TestCase):
|
class VarianceScalingInitializationTest(test.TestCase):
|
||||||
|
|
||||||
|
def testTruncatedNormalDistribution(self):
|
||||||
|
shape = [100, 100]
|
||||||
|
expect_mean = 0.
|
||||||
|
expect_var = 1. / shape[0]
|
||||||
|
init = init_ops.variance_scaling_initializer(
|
||||||
|
distribution='truncated_normal')
|
||||||
|
|
||||||
|
with self.test_session(use_gpu=True), \
|
||||||
|
test.mock.patch.object(
|
||||||
|
random_ops, 'truncated_normal', wraps=random_ops.truncated_normal) \
|
||||||
|
as mock_truncated_normal:
|
||||||
|
x = init(shape).eval()
|
||||||
|
self.assertTrue(mock_truncated_normal.called)
|
||||||
|
|
||||||
|
self.assertNear(np.mean(x), expect_mean, err=1e-2)
|
||||||
|
self.assertNear(np.var(x), expect_var, err=1e-2)
|
||||||
|
|
||||||
def testNormalDistribution(self):
|
def testNormalDistribution(self):
|
||||||
shape = [100, 100]
|
shape = [100, 100]
|
||||||
expect_mean = 0.
|
expect_mean = 0.
|
||||||
expect_var = 1. / shape[0]
|
expect_var = 1. / shape[0]
|
||||||
init = init_ops.variance_scaling_initializer(distribution='normal')
|
init = init_ops.variance_scaling_initializer(distribution='normal')
|
||||||
|
|
||||||
with self.test_session(use_gpu=True):
|
with self.test_session(use_gpu=True), \
|
||||||
|
test.mock.patch.object(
|
||||||
|
random_ops, 'truncated_normal', wraps=random_ops.truncated_normal) \
|
||||||
|
as mock_truncated_normal:
|
||||||
x = init(shape).eval()
|
x = init(shape).eval()
|
||||||
|
self.assertTrue(mock_truncated_normal.called)
|
||||||
|
|
||||||
|
self.assertNear(np.mean(x), expect_mean, err=1e-2)
|
||||||
|
self.assertNear(np.var(x), expect_var, err=1e-2)
|
||||||
|
|
||||||
|
def testUntruncatedNormalDistribution(self):
|
||||||
|
shape = [100, 100]
|
||||||
|
expect_mean = 0.
|
||||||
|
expect_var = 1. / shape[0]
|
||||||
|
init = init_ops.variance_scaling_initializer(
|
||||||
|
distribution='untruncated_normal')
|
||||||
|
|
||||||
|
with self.test_session(use_gpu=True), \
|
||||||
|
test.mock.patch.object(
|
||||||
|
random_ops, 'random_normal', wraps=random_ops.random_normal) \
|
||||||
|
as mock_random_normal:
|
||||||
|
x = init(shape).eval()
|
||||||
|
self.assertTrue(mock_random_normal.called)
|
||||||
|
|
||||||
self.assertNear(np.mean(x), expect_mean, err=1e-2)
|
self.assertNear(np.mean(x), expect_mean, err=1e-2)
|
||||||
self.assertNear(np.var(x), expect_var, err=1e-2)
|
self.assertNear(np.var(x), expect_var, err=1e-2)
|
||||||
|
@ -642,6 +642,29 @@ class TileTest(test.TestCase):
|
|||||||
err = gradient_checker.compute_gradient_error(a, [4, 2], tiled, [4, 4])
|
err = gradient_checker.compute_gradient_error(a, [4, 2], tiled, [4, 4])
|
||||||
self.assertLess(err, 1e-3)
|
self.assertLess(err, 1e-3)
|
||||||
|
|
||||||
|
def testGradientWithSparseGradWithRank1(self):
|
||||||
|
inputs = constant_op.constant([1.0, 2.0, 3.0, 4.0],
|
||||||
|
dtype=dtypes.float32)
|
||||||
|
outputs = array_ops.gather(array_ops.tile(inputs, [3]),
|
||||||
|
[1, 5, 9, 3, 7, 2, 2, 2])
|
||||||
|
with self.test_session():
|
||||||
|
error = gradient_checker.compute_gradient_error(
|
||||||
|
inputs, inputs.get_shape().as_list(),
|
||||||
|
outputs, outputs.get_shape().as_list())
|
||||||
|
self.assertLess(error, 1e-4)
|
||||||
|
|
||||||
|
def testGradientWithSparseGradWithRank3(self):
|
||||||
|
inputs = constant_op.constant([1.0, 2.0, 3.0, 4.0],
|
||||||
|
dtype=dtypes.float32)
|
||||||
|
inputs = array_ops.reshape(inputs, [-1, 1, 1])
|
||||||
|
outputs = array_ops.gather(array_ops.tile(inputs, [3, 4, 2]),
|
||||||
|
[1, 5, 9, 3, 7, 2, 2, 2])
|
||||||
|
with self.test_session():
|
||||||
|
error = gradient_checker.compute_gradient_error(
|
||||||
|
inputs, inputs.get_shape().as_list(),
|
||||||
|
outputs, outputs.get_shape().as_list())
|
||||||
|
self.assertLess(error, 1e-4)
|
||||||
|
|
||||||
def testShapeFunctionEdgeCases(self):
|
def testShapeFunctionEdgeCases(self):
|
||||||
# Unknown multiples shape.
|
# Unknown multiples shape.
|
||||||
inp = constant_op.constant(0.0, shape=[4, 4, 4, 4])
|
inp = constant_op.constant(0.0, shape=[4, 4, 4, 4])
|
||||||
|
@ -21,13 +21,15 @@ from __future__ import print_function
|
|||||||
import numpy as np
|
import numpy as np
|
||||||
|
|
||||||
from tensorflow.python.framework import sparse_tensor
|
from tensorflow.python.framework import sparse_tensor
|
||||||
|
from tensorflow.python.ops import gradient_checker
|
||||||
from tensorflow.python.ops import sparse_ops
|
from tensorflow.python.ops import sparse_ops
|
||||||
|
import tensorflow.python.ops.sparse_grad # pylint: disable=unused-import
|
||||||
from tensorflow.python.platform import test
|
from tensorflow.python.platform import test
|
||||||
|
|
||||||
|
|
||||||
class SparseSliceOpTest(test.TestCase):
|
class SparseSliceOpTest(test.TestCase):
|
||||||
|
|
||||||
def _SparseTensor_4x6(self):
|
def _SparseTensor_4x6(self, val_dtype=np.int64):
|
||||||
# [0 | |2 | |4 |5 ]
|
# [0 | |2 | |4 |5 ]
|
||||||
# [ |11| |13|14| ]
|
# [ |11| |13|14| ]
|
||||||
# [20| | |23| |25]
|
# [20| | |23| |25]
|
||||||
@ -37,7 +39,7 @@ class SparseSliceOpTest(test.TestCase):
|
|||||||
[2, 3], [2, 5], [3, 0], [3, 2], [3, 3], [3, 5]]).astype(
|
[2, 3], [2, 5], [3, 0], [3, 2], [3, 3], [3, 5]]).astype(
|
||||||
np.int64)
|
np.int64)
|
||||||
val = np.array([0, 2, 4, 5, 11, 13, 14, 20, 23, 25, 30, 32, 33, 35]).astype(
|
val = np.array([0, 2, 4, 5, 11, 13, 14, 20, 23, 25, 30, 32, 33, 35]).astype(
|
||||||
np.int64)
|
val_dtype)
|
||||||
shape = np.array([4, 6]).astype(np.int64)
|
shape = np.array([4, 6]).astype(np.int64)
|
||||||
return sparse_tensor.SparseTensor(ind, val, shape)
|
return sparse_tensor.SparseTensor(ind, val, shape)
|
||||||
|
|
||||||
@ -244,6 +246,22 @@ class SparseSliceOpTest(test.TestCase):
|
|||||||
self.assertAllEqual(sparse_tensor5.values.eval(), [5, 25, 35])
|
self.assertAllEqual(sparse_tensor5.values.eval(), [5, 25, 35])
|
||||||
self.assertAllEqual(sparse_tensor5.dense_shape.eval(), [4, 1])
|
self.assertAllEqual(sparse_tensor5.dense_shape.eval(), [4, 1])
|
||||||
|
|
||||||
|
def testGradients(self):
|
||||||
|
sp_input = self._SparseTensor_4x6(val_dtype=np.float32)
|
||||||
|
start_and_size = [([0, 0], [4, 2]),
|
||||||
|
([0, 2], [5, 2]),
|
||||||
|
([0, 4], [5, 3])]
|
||||||
|
|
||||||
|
with self.test_session(use_gpu=False):
|
||||||
|
for start, size in start_and_size:
|
||||||
|
sp_output = sparse_ops.sparse_slice(sp_input, start, size)
|
||||||
|
nnz_in = len(sp_input.values.eval())
|
||||||
|
nnz_out = len(sp_output.values.eval())
|
||||||
|
|
||||||
|
err = gradient_checker.compute_gradient_error(
|
||||||
|
[sp_input.values], [(nnz_in,)], sp_output.values, (nnz_out,))
|
||||||
|
self.assertLess(err, 1e-3)
|
||||||
|
|
||||||
|
|
||||||
if __name__ == '__main__':
|
if __name__ == '__main__':
|
||||||
test.main()
|
test.main()
|
||||||
|
@ -568,7 +568,6 @@ ops.NotDifferentiable("Size")
|
|||||||
@ops.RegisterGradient("Tile")
|
@ops.RegisterGradient("Tile")
|
||||||
def _TileGrad(op, grad):
|
def _TileGrad(op, grad):
|
||||||
"""Sum reduces grad along the tiled dimensions."""
|
"""Sum reduces grad along the tiled dimensions."""
|
||||||
assert isinstance(grad, ops.Tensor)
|
|
||||||
input_shape = array_ops.shape(op.inputs[0])
|
input_shape = array_ops.shape(op.inputs[0])
|
||||||
# We interleave multiples and input_shape to get split_shape,
|
# We interleave multiples and input_shape to get split_shape,
|
||||||
# reshape grad to split_shape, and reduce along all even
|
# reshape grad to split_shape, and reduce along all even
|
||||||
@ -581,6 +580,13 @@ def _TileGrad(op, grad):
|
|||||||
split_shape = array_ops.reshape(
|
split_shape = array_ops.reshape(
|
||||||
array_ops.transpose(array_ops.stack([op.inputs[1], input_shape])), [-1])
|
array_ops.transpose(array_ops.stack([op.inputs[1], input_shape])), [-1])
|
||||||
axes = math_ops.range(0, array_ops.size(split_shape), 2)
|
axes = math_ops.range(0, array_ops.size(split_shape), 2)
|
||||||
|
# Sum reduces grad along the first dimension for IndexedSlices
|
||||||
|
if isinstance(grad, ops.IndexedSlices):
|
||||||
|
grad = math_ops.unsorted_segment_sum(
|
||||||
|
grad.values,
|
||||||
|
math_ops.mod(grad.indices, input_shape[0]),
|
||||||
|
input_shape[0])
|
||||||
|
split_shape = array_ops.concat([[1], split_shape[1:]], axis=0)
|
||||||
input_grad = math_ops.reduce_sum(array_ops.reshape(grad, split_shape), axes)
|
input_grad = math_ops.reduce_sum(array_ops.reshape(grad, split_shape), axes)
|
||||||
# Fix shape inference
|
# Fix shape inference
|
||||||
if not context.executing_eagerly():
|
if not context.executing_eagerly():
|
||||||
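The `IndexedSlices` branch added to `_TileGrad` folds sparse gradient rows back onto the untiled input by taking each index modulo the input's first dimension and summing. A minimal pure-Python sketch of that idea for the rank-1 case is shown here; `fold_tiled_gradient` is a hypothetical helper, not a TensorFlow function, and it stands in for the `unsorted_segment_sum` call in the hunk.

```python
def fold_tiled_gradient(indices, values, input_len):
  """Sums sparse gradient rows of a tiled rank-1 tensor back onto the input."""
  input_grad = [0.0] * input_len
  for index, value in zip(indices, values):
    # Row `index` of the tiled tensor is a copy of input row `index % input_len`.
    input_grad[index % input_len] += value
  return input_grad


# A length-4 input tiled 3 times, then gathered at these positions, with an
# assumed upstream gradient of 1.0 per gathered element.
gather_indices = [1, 5, 9, 3, 7, 2, 2, 2]
upstream = [1.0] * len(gather_indices)
input_grad = fold_tiled_gradient(gather_indices, upstream, 4)
```

The gather indices are the ones used in `testGradientWithSparseGradWithRank1`. Indices 1, 5, and 9 all map to input element 1, which is why the gradients must be summed rather than overwritten.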
|
@ -3135,6 +3135,7 @@ def while_loop(cond,
|
|||||||
happen is that the thread updating `x` can never get ahead of the
|
happen is that the thread updating `x` can never get ahead of the
|
||||||
counter thread because the thread incrementing `x` depends on the value
|
counter thread because the thread incrementing `x` depends on the value
|
||||||
of the counter.
|
of the counter.
|
||||||
|
|
||||||
```python
|
```python
|
||||||
import tensorflow as tf
|
import tensorflow as tf
|
||||||
|
|
||||||
|
@ -43,7 +43,8 @@ from tensorflow.python.ops import linalg_ops_impl
|
|||||||
from tensorflow.python.ops import gen_linalg_ops
|
from tensorflow.python.ops import gen_linalg_ops
|
||||||
from tensorflow.python.ops import math_ops
|
from tensorflow.python.ops import math_ops
|
||||||
from tensorflow.python.ops import random_ops
|
from tensorflow.python.ops import random_ops
|
||||||
from tensorflow.python.util.deprecation import deprecated
|
from tensorflow.python.util.deprecation import (
|
||||||
|
deprecated, deprecated_arg_values)
|
||||||
from tensorflow.python.util.tf_export import tf_export
|
from tensorflow.python.util.tf_export import tf_export
|
||||||
|
|
||||||
|
|
||||||
@ -409,8 +410,10 @@ class UniformUnitScaling(Initializer):
|
|||||||
class VarianceScaling(Initializer):
|
class VarianceScaling(Initializer):
|
||||||
"""Initializer capable of adapting its scale to the shape of weights tensors.
|
"""Initializer capable of adapting its scale to the shape of weights tensors.
|
||||||
|
|
||||||
With `distribution="normal"`, samples are drawn from a truncated normal
|
With `distribution="truncated_normal" or "untruncated_normal"`,
|
||||||
distribution centered on zero, with `stddev = sqrt(scale / n)`
|
samples are drawn from a truncated/untruncated normal
|
||||||
|
distribution with a mean of zero and a standard deviation (after truncation,
|
||||||
|
if used) `stddev = sqrt(scale / n)`
|
||||||
where n is:
|
where n is:
|
||||||
- number of input units in the weight tensor, if mode = "fan_in"
|
- number of input units in the weight tensor, if mode = "fan_in"
|
||||||
- number of output units, if mode = "fan_out"
|
- number of output units, if mode = "fan_out"
|
||||||
@ -433,10 +436,14 @@ class VarianceScaling(Initializer):
|
|||||||
"distribution" arguments.
|
"distribution" arguments.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
|
@deprecated_arg_values(
|
||||||
|
None,
|
||||||
|
"`normal` is a deprecated alias for `truncated_normal`",
|
||||||
|
distribution="normal")
|
||||||
def __init__(self,
|
def __init__(self,
|
||||||
scale=1.0,
|
scale=1.0,
|
||||||
mode="fan_in",
|
mode="fan_in",
|
||||||
distribution="normal",
|
distribution="truncated_normal",
|
||||||
seed=None,
|
seed=None,
|
||||||
dtype=dtypes.float32):
|
dtype=dtypes.float32):
|
||||||
if scale <= 0.:
|
if scale <= 0.:
|
||||||
@ -444,7 +451,8 @@ class VarianceScaling(Initializer):
|
|||||||
if mode not in {"fan_in", "fan_out", "fan_avg"}:
|
if mode not in {"fan_in", "fan_out", "fan_avg"}:
|
||||||
raise ValueError("Invalid `mode` argument:", mode)
|
raise ValueError("Invalid `mode` argument:", mode)
|
||||||
distribution = distribution.lower()
|
distribution = distribution.lower()
|
||||||
if distribution not in {"normal", "uniform"}:
|
if distribution not in {"normal", "uniform",
|
||||||
|
"truncated_normal", "untruncated_normal"}:
|
||||||
raise ValueError("Invalid `distribution` argument:", distribution)
|
raise ValueError("Invalid `distribution` argument:", distribution)
|
||||||
self.scale = scale
|
self.scale = scale
|
||||||
self.mode = mode
|
self.mode = mode
|
||||||
@ -466,11 +474,15 @@ class VarianceScaling(Initializer):
|
|||||||
scale /= max(1., fan_out)
|
scale /= max(1., fan_out)
|
||||||
else:
|
else:
|
||||||
scale /= max(1., (fan_in + fan_out) / 2.)
|
scale /= max(1., (fan_in + fan_out) / 2.)
|
||||||
if self.distribution == "normal":
|
if self.distribution == "normal" or self.distribution == "truncated_normal":
|
||||||
# constant taken from scipy.stats.truncnorm.std(a=-2, b=2, loc=0., scale=1.)
|
# constant taken from scipy.stats.truncnorm.std(a=-2, b=2, loc=0., scale=1.)
|
||||||
stddev = math.sqrt(scale) / .87962566103423978
|
stddev = math.sqrt(scale) / .87962566103423978
|
||||||
return random_ops.truncated_normal(
|
return random_ops.truncated_normal(
|
||||||
shape, 0.0, stddev, dtype, seed=self.seed)
|
shape, 0.0, stddev, dtype, seed=self.seed)
|
||||||
|
elif self.distribution == "untruncated_normal":
|
||||||
|
stddev = math.sqrt(scale)
|
||||||
|
return random_ops.random_normal(
|
||||||
|
shape, 0.0, stddev, dtype, seed=self.seed)
|
||||||
else:
|
else:
|
||||||
limit = math.sqrt(3.0 * scale)
|
limit = math.sqrt(3.0 * scale)
|
||||||
return random_ops.random_uniform(
|
return random_ops.random_uniform(
|
||||||
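The constant `.87962566103423978` in the truncated-normal branch is the standard deviation of a standard normal truncated to two standard deviations. Dividing by it makes the samples have the intended spread after truncation. The sketch below recomputes that constant with the standard library only; `truncated_normal_std` and `corrected_stddev` are hypothetical helper names, and the `max(1., n)` guard from the hunk is omitted for brevity.

```python
import math


def truncated_normal_std(limit=2.0):
  """Standard deviation of a standard normal truncated to [-limit, limit]."""
  # Density of the standard normal at the truncation point.
  pdf_at_limit = math.exp(-0.5 * limit * limit) / math.sqrt(2.0 * math.pi)
  # Probability mass that survives truncation.
  mass = math.erf(limit / math.sqrt(2.0))
  # Variance of a symmetrically truncated standard normal.
  variance = 1.0 - 2.0 * limit * pdf_at_limit / mass
  return math.sqrt(variance)


def corrected_stddev(scale, n):
  """Stddev to request so truncated samples end up with std sqrt(scale / n)."""
  return math.sqrt(scale / n) / truncated_normal_std()


correction = truncated_normal_std()
stddev = corrected_stddev(1.0, 100)
```

The `untruncated_normal` branch needs no such correction, which is why it passes `math.sqrt(scale)` to `random_normal` directly.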
|
@ -878,7 +878,8 @@ def sparse_softmax_cross_entropy(
|
|||||||
exception when this op is run on CPU, and return `NaN` for corresponding
|
exception when this op is run on CPU, and return `NaN` for corresponding
|
||||||
loss and gradient rows on GPU.
|
loss and gradient rows on GPU.
|
||||||
logits: Unscaled log probabilities of shape
|
logits: Unscaled log probabilities of shape
|
||||||
`[d_0, d_1, ..., d_{r-1}, num_classes]` and dtype `float32` or `float64`.
|
`[d_0, d_1, ..., d_{r-1}, num_classes]` and dtype `float16`, `float32` or
|
||||||
|
`float64`.
|
||||||
weights: Coefficients for the loss. This must be scalar or broadcastable to
|
weights: Coefficients for the loss. This must be scalar or broadcastable to
|
||||||
`labels` (i.e. same rank and each dimension is either 1 or the same).
|
`labels` (i.e. same rank and each dimension is either 1 or the same).
|
||||||
scope: the scope for the operations performed in computing the loss.
|
scope: the scope for the operations performed in computing the loss.
|
||||||
|
Some files were not shown because too many files have changed in this diff.