Merge changes from github.
PiperOrigin-RevId: 203037623
Parent: eacdfdf6c0
Commit: 73e38c29c7
@@ -96,6 +96,8 @@ The TensorFlow project strives to abide by generally accepted best practices in

| Build Type | Status | Artifacts |
| --- | --- | --- |
| **IBM s390x** | [](http://ibmz-ci.osuosl.org/job/TensorFlow_IBMZ_CI/) | TBA |
| **IBM ppc64le CPU** | [](http://powerci.osuosl.org/job/TensorFlow_Ubuntu_16.04_CPU/) | TBA |
| **IBM ppc64le GPU** | [](http://powerci.osuosl.org/job/TensorFlow_Ubuntu_16.04_PPC64LE_GPU/) | TBA |
| **Linux CPU with Intel® MKL-DNN®** | [](https://tensorflow-ci.intel.com/job/tensorflow-mkl-linux-cpu/) | TBA |

## For more information
RELEASE.md (40 changed lines)
@@ -1,18 +1,38 @@
# Release 1.9.0

## Major Features And Improvements
* Update tf.keras to the Keras 2.1.6 API.
* `tfe.Network` is deprecated. Please inherit from `tf.keras.Model`.
* Added support for core feature columns and losses to gradient boosted trees estimators.
* The distributions.Bijector API supports broadcasting for Bijectors with new API changes. See [here](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/distributions/bijectors/Bijector) for more details.
* Layered variable names have changed in the following conditions:
  * Using `tf.keras.layers` with custom variable scopes.
  * Using `tf.layers` in a subclassed `tf.keras.Model` class. See [here](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/layers) for more details.

## Breaking Changes
* If you're opening empty variable scopes, replace `variable_scope('', ...)` with `variable_scope(tf.get_variable_scope(), ...)`.
* Updated docs for `tf.keras`: new Keras-based [get started guide](http://tensorflow.org/versions/r1.9/get_started) and [programmers guide page](http://tensorflow.org/versions/r1.9/programmers_guide/keras).
* Update `tf.keras` to the Keras 2.1.6 API.
* Added [`tf.keras.layers.CuDNNGRU`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/keras/layers/CuDNNGRU) and [`tf.keras.layers.CuDNNLSTM`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/keras/layers/CuDNNLSTM) layers. [Try it](https://colab.sandbox.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/nmt_with_attention/nmt_with_attention.ipynb?linkId=53292082).
* Added support for core [feature columns](https://www.tensorflow.org/get_started/feature_columns) and [losses](https://www.tensorflow.org/api_docs/python/tf/losses) to [gradient boosted trees estimators](https://github.com/tensorflow/models/tree/master/official/boosted_trees).
* The [Python interface](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/contrib/lite) for the [TFLite Optimizing Converter](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/toco/README.md) has been expanded, and the command-line interface (a.k.a. `toco`, `tflite_convert`) is once again included in the standard `pip` installation.
* Improved data loading and text processing with:
  * [`tf.decode_compressed`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/decode_compressed)
  * [`tf.string_strip`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/string_strip)
  * [`tf.strings.regex_full_match`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/strings/regex_full_match)
* Added experimental support for new pre-made Estimators:
  * [`tf.contrib.estimator.BaselineEstimator`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/contrib/estimator/BaselineEstimator)
  * [`tf.contrib.estimator.RNNClassifier`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/contrib/estimator/RNNClassifier)
  * [`tf.contrib.estimator.RNNEstimator`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/contrib/estimator/RNNEstimator)
* The [distributions.Bijector](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/contrib/distributions/bijectors/Bijector) API supports broadcasting for Bijectors with new API changes.

## Breaking Changes
* If you're opening empty variable scopes, replace `variable_scope('', ...)` with `variable_scope(tf.get_variable_scope(), ...)`, as shown in the sketch below.
* Headers used for building custom ops have been moved from `site-packages/external` into `site-packages/tensorflow/include/external`.
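The migration is mechanical; here is a minimal sketch in TF 1.x graph code (the scope name `outer` and variable `v` are illustrative, not from the release notes):

```python
import tensorflow as tf  # TensorFlow 1.x graph-mode API assumed

with tf.variable_scope("outer"):
    v = tf.get_variable("v", shape=[1])  # creates outer/v

with tf.variable_scope("outer"):
    # Before (no longer supported): re-entering the current scope via an
    # empty name, e.g. tf.variable_scope("", reuse=True).
    # After: pass the scope object returned by tf.get_variable_scope().
    with tf.variable_scope(tf.get_variable_scope(), reuse=True):
        v_again = tf.get_variable("v", shape=[1])  # reuses outer/v
```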
## Bug Fixes and Other Changes

* `tfe.Network` is deprecated. Please inherit from `tf.keras.Model`.
* Layered variable names have changed in the following conditions:
  * Using `tf.keras.layers` with custom variable scopes.
  * Using `tf.layers` in a subclassed `tf.keras.Model` class. See [here](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/layers) for more details.
* `tf.data`:
  * The `DatasetBase::DebugString()` method is now `const`.
  * Added the `tf.contrib.data.sample_from_datasets()` API for randomly sampling from multiple datasets (see the sketch below).
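A short sketch of the new API under the TF 1.9 `contrib` namespace (dataset contents and weights here are illustrative):

```python
import tensorflow as tf

# Two illustrative datasets; weights give the sampling probabilities.
ds_a = tf.data.Dataset.from_tensor_slices([0, 0, 0, 0])
ds_b = tf.data.Dataset.from_tensor_slices([1, 1, 1, 1])
mixed = tf.contrib.data.sample_from_datasets([ds_a, ds_b], weights=[0.5, 0.5])

iterator = mixed.make_one_shot_iterator()
next_element = iterator.get_next()
with tf.Session() as sess:
    for _ in range(4):
        print(sess.run(next_element))  # randomly interleaves 0s and 1s
```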
@@ -438,6 +438,22 @@ filegroup(
    data = glob(["docs_src/**/*.md"]),
)

cc_library(
    name = "grpc",
    deps = select({
        ":linux_s390x": ["@grpc//:grpc_unsecure"],
        "//conditions:default": ["@grpc"],
    }),
)

cc_library(
    name = "grpc++",
    deps = select({
        ":linux_s390x": ["@grpc//:grpc++_unsecure"],
        "//conditions:default": ["@grpc//:grpc++"],
    }),
)

# A shared object which includes registration mechanisms for ops and
# kernels. Does not include the implementations of any ops or kernels. Instead,
# the library which loads libtensorflow_framework.so

@@ -587,19 +603,3 @@ py_library(
    visibility = ["//visibility:public"],
    deps = ["//tensorflow/python:no_contrib"],
)

cc_library(
    name = "grpc",
    deps = select({
        ":linux_s390x": ["@grpc//:grpc_unsecure"],
        "//conditions:default": ["@grpc"],
    }),
)

cc_library(
    name = "grpc++",
    deps = select({
        ":linux_s390x": ["@grpc//:grpc++_unsecure"],
        "//conditions:default": ["@grpc//:grpc++"],
    }),
)
@@ -421,6 +421,58 @@ Status StridedSliceGradHelper(const Scope& scope, const Operation& op,
}
REGISTER_GRADIENT_OP("StridedSlice", StridedSliceGradHelper);

Status SliceGrad(const Scope& scope, const Operation& op,
                 const std::vector<Output>& grad_inputs,
                 std::vector<Output>* grad_outputs) {
  // Propagate the incoming gradient along all the selected values,
  // and zero everywhere else. Use the Pad operator for this.
  //
  // First create an Nx2 padding where N is the number of input
  // dimensions. The first column is the number of prepended zeros
  // for each dimension, and the second column is the number of
  // appended zeros.
  //
  // The first column is just the begin vector.
  // The second column is the shape of the input element-wise
  // subtracted by begin+size.

  // Running example:
  // input.shape = [3, 5, 3]
  // begin = [1, 2, 1], size = [1, 3, 2]
  Input input = op.input(0);
  Input begin = op.input(1);
  // input_rank = 3
  auto input_rank = Rank(scope, input);
  // slice_size = [1, 3, 2]
  auto slice_size = Shape(scope, op.output(0));
  // padding_shape = [3, 1]
  auto padding_shape = Stack(scope, {input_rank, 1});
  // before_padding = [[1]
  //                   [2]
  //                   [1]]
  Input before_padding = Reshape(scope, begin, padding_shape);
  // after_padding_sizes = shape(input) - slice_size - begin
  //                     = [3, 5, 3] - [1, 3, 2] - [1, 2, 1]
  //                     = [1, 0, 0]
  auto after_padding_sizes =
      Sub(scope, Sub(scope, Shape(scope, input), slice_size), begin);
  // after_padding = [[1]
  //                  [0]
  //                  [0]]
  Input after_padding = Reshape(scope, after_padding_sizes, padding_shape);
  // paddings = [[1 1]
  //             [2 0]
  //             [1 0]]
  auto paddings =
      Concat(scope, {before_padding, after_padding}, Const(scope, 1));
  grad_outputs->push_back(Pad(scope, grad_inputs[0], paddings));
  // Nothing is propagated for the "begin" and "size" inputs.
  grad_outputs->push_back(NoGradient());
  grad_outputs->push_back(NoGradient());
  return scope.status();
}
REGISTER_GRADIENT_OP("Slice", SliceGrad);

}  // anonymous namespace
}  // namespace ops
}  // namespace tensorflow
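A quick NumPy check of the padding arithmetic above, using the same running example (illustrative only, not TensorFlow code):

```python
import numpy as np

input_shape = np.array([3, 5, 3])
begin = np.array([1, 2, 1])
slice_size = np.array([1, 3, 2])

before_padding = begin                            # zeros prepended per dim
after_padding = input_shape - slice_size - begin  # zeros appended per dim
paddings = np.stack([before_padding, after_padding], axis=1)
print(paddings)  # [[1 1]
                 #  [2 0]
                 #  [1 0]]

# Padding an upstream gradient of shape slice_size with `paddings`
# restores the full input shape:
grad = np.ones(slice_size)
padded = np.pad(grad, paddings, mode="constant")
assert padded.shape == tuple(input_shape)
```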
@@ -378,5 +378,12 @@ TEST_F(ArrayGradTest, StridedSliceGrad) {
  RunTest(x, x_shape, y, {1, 2, 2, 2});
}

TEST_F(ArrayGradTest, SliceGrad) {
  TensorShape x_shape({3, 5, 3});
  auto x = Placeholder(scope_, DT_FLOAT, Placeholder::Shape(x_shape));
  auto y = Slice(scope_, x, {1, 2, 1}, {1, 3, 2});
  RunTest(x, x_shape, y, {1, 3, 2});
}

}  // namespace
}  // namespace tensorflow
@@ -128,7 +128,14 @@ cc_library(
        "@llvm//:target",  # fixdeps: keep
        "@llvm//:x86_code_gen",  # fixdeps: keep
        "@llvm//:x86_disassembler",  # fixdeps: keep
    ],
    ] + select({
        "//tensorflow:linux_ppc64le": [
            "@llvm//:powerpc_disassembler",
            "@llvm//:powerpc_code_gen",
        ],
        "//conditions:default": [
        ],
    }),
    alwayslink = True,  # Contains compiler registration
)
@@ -125,9 +125,9 @@ py_library(
    }) + if_not_windows_cuda([
        "//tensorflow/contrib/fused_conv:fused_conv_py",  # unresolved symbols, need to export more symbols
    ]) + if_not_windows([
        "//tensorflow/contrib/ffmpeg:ffmpeg_ops_py",
        "//tensorflow/contrib/cloud:cloud_py",  # depends on bigtable
        "//tensorflow/contrib/bigtable",  # doesn't compile on Windows
        "//tensorflow/contrib/ffmpeg:ffmpeg_ops_py",
        "//tensorflow/contrib/lite/python:lite",  # unix dependency, need to fix code
    ]),
)
@@ -47,7 +47,6 @@ class SymbolNamer(object):


class ControlFlowTransformer(converter.Base):
  """Transforms control flow structures like loops and conditionals."""

  def _create_cond_branch(self, body_name, aliased_orig_names,
                          aliased_new_names, body, returns):
    if aliased_orig_names:
@@ -299,17 +299,20 @@ include_directories(
    ${double_conversion_INCLUDE_DIR}
)

if(tensorflow_ENABLE_SSL_SUPPORT)
  include(boringssl)
  list(APPEND tensorflow_EXTERNAL_LIBRARIES ${boringssl_STATIC_LIBRARIES})
  list(APPEND tensorflow_EXTERNAL_DEPENDENCIES boringssl)
  include_directories(${boringssl_INCLUDE_DIR})
endif()
if(tensorflow_ENABLE_GRPC_SUPPORT)
  if(tensorflow_ENABLE_SSL_SUPPORT)
    include(boringssl)
    include_directories(${boringssl_INCLUDE_DIR})
  endif()
  include(grpc)
  include_directories(${GRPC_INCLUDE_DIRS})
  # Place boringssl after grpc as grpc depends on boringssl.
  list(APPEND tensorflow_EXTERNAL_LIBRARIES ${grpc_STATIC_LIBRARIES})
  list(APPEND tensorflow_EXTERNAL_DEPENDENCIES grpc)
  include_directories(${GRPC_INCLUDE_DIRS})
  if(tensorflow_ENABLE_SSL_SUPPORT)
    list(APPEND tensorflow_EXTERNAL_LIBRARIES ${boringssl_STATIC_LIBRARIES})
    list(APPEND tensorflow_EXTERNAL_DEPENDENCIES boringssl)
  endif()
endif()
if(tensorflow_ENABLE_JEMALLOC_SUPPORT)
  include(jemalloc)
@@ -17,7 +17,7 @@ include (ExternalProject)
set(boringssl_INCLUDE_DIR ${CMAKE_CURRENT_BINARY_DIR}/boringssl/src/boringssl/include)
#set(boringssl_EXTRA_INCLUDE_DIR ${CMAKE_CURRENT_BINARY_DIR}/boringssl/src)
set(boringssl_URL https://boringssl.googlesource.com/boringssl)
set(boringssl_TAG ee7aa02)
set(boringssl_TAG 7f8c553d7f4db0a6ce727f2986d41bf8fe8ec4bf)
set(boringssl_BUILD ${CMAKE_BINARY_DIR}/boringssl/src/boringssl-build)
#set(boringssl_LIBRARIES ${boringssl_BUILD}/obj/so/libboringssl.so)
set(boringssl_STATIC_LIBRARIES
@@ -236,15 +236,6 @@ if(WIN32)
  list(APPEND tf_core_lib_srcs ${tf_core_platform_windows_srcs})
endif(WIN32)

if(tensorflow_ENABLE_SSL_SUPPORT)
  # Cloud libraries require boringssl.
  file(GLOB tf_core_platform_cloud_srcs
      "${tensorflow_source_dir}/tensorflow/core/platform/cloud/*.h"
      "${tensorflow_source_dir}/tensorflow/core/platform/cloud/*.cc"
  )
  list(APPEND tf_core_lib_srcs ${tf_core_platform_cloud_srcs})
endif()

if (tensorflow_ENABLE_HDFS_SUPPORT)
  list(APPEND tf_core_platform_hdfs_srcs
      "${tensorflow_source_dir}/tensorflow/core/platform/hadoop/hadoop_file_system.cc"
@@ -134,14 +134,13 @@ if(tensorflow_BUILD_CONTRIB_KERNELS)
  list(APPEND tf_core_kernels_srcs ${tf_contrib_kernels_srcs})
endif(tensorflow_BUILD_CONTRIB_KERNELS)

if(NOT tensorflow_ENABLE_SSL_SUPPORT)
  # Cloud libraries require boringssl.
  file(GLOB tf_core_kernels_cloud_srcs
      "${tensorflow_source_dir}/tensorflow/contrib/cloud/kernels/*.h"
      "${tensorflow_source_dir}/tensorflow/contrib/cloud/kernels/*.cc"
  )
  # Cloud libraries require curl and boringssl.
  # Curl is not supported yet anyway so we remove for now.
  file(GLOB tf_core_kernels_cloud_srcs
      "${tensorflow_source_dir}/tensorflow/contrib/cloud/kernels/*.h"
      "${tensorflow_source_dir}/tensorflow/contrib/cloud/kernels/*.cc"
  )
  list(REMOVE_ITEM tf_core_kernels_srcs ${tf_core_kernels_cloud_srcs})
endif()

file(GLOB_RECURSE tf_core_kernels_exclude_srcs
    "${tensorflow_source_dir}/tensorflow/core/kernels/*test*.h"
@@ -64,8 +64,6 @@ file(GLOB tf_stream_executor_srcs
if (tensorflow_ENABLE_GPU)
  file(GLOB tf_stream_executor_gpu_srcs
      "${tensorflow_source_dir}/tensorflow/stream_executor/cuda/*.cc"
      "${tensorflow_source_dir}/tensorflow/compiler/xla/statusor.h"
      "${tensorflow_source_dir}/tensorflow/compiler/xla/statusor.cc"
  )
  if (NOT tensorflow_BUILD_CC_TESTS)
    file(GLOB tf_stream_executor_gpu_tests
@@ -534,7 +534,8 @@ def multi_label_head(n_classes,
  * An integer `SparseTensor` of class indices. The `dense_shape` must be
    `[D0, D1, ... DN, ?]` and the values within `[0, n_classes)`.
  * If `label_vocabulary` is given, a string `SparseTensor`. The `dense_shape`
    must be `[D0, D1, ... DN, ?]` and the values within `label_vocabulary`.
    must be `[D0, D1, ... DN, ?]` and the values within `label_vocabulary`, or a
    multi-hot tensor of shape `[D0, D1, ... DN, n_classes]`.

  If `weight_column` is specified, weights must be of shape
  `[D0, D1, ... DN]`, or `[D0, D1, ... DN, 1]`.
@@ -568,6 +568,33 @@ class MultiLabelHead(test.TestCase):
        expected_loss=expected_loss,
        expected_metrics=expected_metrics)

  def test_eval_with_label_vocabulary_with_multi_hot_input(self):
    n_classes = 2
    head = head_lib.multi_label_head(
        n_classes, label_vocabulary=['class0', 'class1'])
    logits = np.array([[-1., 1.], [-1.5, 1.5]], dtype=np.float32)
    labels_multi_hot = np.array([[1, 0], [1, 1]], dtype=np.int64)
    # loss = labels * -log(sigmoid(logits)) +
    #        (1 - labels) * -log(1 - sigmoid(logits))
    # Sum over examples, divide by batch_size.
    expected_loss = 0.5 * np.sum(
        _sigmoid_cross_entropy(labels=labels_multi_hot, logits=logits))
    keys = metric_keys.MetricKeys
    expected_metrics = {
        # Average loss over examples.
        keys.LOSS_MEAN: expected_loss,
        # auc and auc_pr cannot be reliably calculated for only 4 samples, but
        # this assert tests that the algorithm remains consistent.
        keys.AUC: 0.3333,
        keys.AUC_PR: 0.7639,
    }
    self._test_eval(
        head=head,
        logits=logits,
        labels=labels_multi_hot,
        expected_loss=expected_loss,
        expected_metrics=expected_metrics)

  def test_eval_with_thresholds(self):
    n_classes = 2
    thresholds = [0.25, 0.5, 0.75]
@@ -103,9 +103,20 @@ class GANHead(head._Head):  # pylint: disable=protected-access
      name: name of the head. If provided, summary and metrics keys will be
        suffixed by `"/" + name`.
    """

    if not callable(generator_loss_fn):
      raise TypeError('generator_loss_fn must be callable.')
    if not callable(discriminator_loss_fn):
      raise TypeError('discriminator_loss_fn must be callable.')
    if use_loss_summaries not in [True, False, None]:
      raise ValueError('use_loss_summaries must be True, False or None.')
    if get_hooks_fn is not None and not callable(get_hooks_fn):
      raise TypeError('get_hooks_fn must be callable.')
    if name is not None and not isinstance(name, str):
      raise TypeError('name must be string.')

    if get_hooks_fn is None:
      get_hooks_fn = tfgan_train.get_sequential_train_hooks()
    # TODO(joelshor): Validate inputs.

    if use_loss_summaries in [True, False]:
      generator_loss_fn = functools.partial(
@@ -570,7 +570,7 @@ class MutualInformationPenaltyTest(test.TestCase, _PenaltyTest):
        'predicted_distributions': self._predicted_distributions,
    }
    self._expected_loss = 1.61610
    self._expected_op_name = 'mutual_information_loss/mul'
    self._expected_op_name = 'mutual_information_loss/mul_1'
    self._batch_size = 2
@@ -35,6 +35,7 @@ typedef Eigen::ThreadPoolDevice CPUDevice;
template struct FillProjectiveTransform<CPUDevice, uint8>;
template struct FillProjectiveTransform<CPUDevice, int32>;
template struct FillProjectiveTransform<CPUDevice, int64>;
template struct FillProjectiveTransform<CPUDevice, Eigen::half>;
template struct FillProjectiveTransform<CPUDevice, float>;
template struct FillProjectiveTransform<CPUDevice, double>;

@@ -99,6 +100,7 @@ class ImageProjectiveTransform : public OpKernel {
TF_CALL_uint8(REGISTER);
TF_CALL_int32(REGISTER);
TF_CALL_int64(REGISTER);
TF_CALL_half(REGISTER);
TF_CALL_float(REGISTER);
TF_CALL_double(REGISTER);

@@ -21,6 +21,7 @@ limitations under the License.
#define EIGEN_USE_THREADS

#include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"

#include "tensorflow/core/framework/tensor_types.h"
#include "tensorflow/core/platform/types.h"
@@ -110,21 +111,21 @@ class ProjectiveGenerator {
    // f(x, y_floor) = (x_ceil - x) / (x_ceil - x_floor) * f(x_floor, y_floor)
    //               + (x - x_floor) / (x_ceil - x_floor) * f(x_ceil, y_floor)
    const float value_yfloor =
        (x_ceil - x) * read_with_fill_value(batch, DenseIndex(y_floor),
                                            DenseIndex(x_floor), channel,
                                            fill_value) +
        (x - x_floor) * read_with_fill_value(batch, DenseIndex(y_floor),
                                             DenseIndex(x_ceil), channel,
                                             fill_value);
        (x_ceil - x) * static_cast<float>(read_with_fill_value(
                           batch, DenseIndex(y_floor), DenseIndex(x_floor),
                           channel, fill_value)) +
        (x - x_floor) * static_cast<float>(read_with_fill_value(
                            batch, DenseIndex(y_floor), DenseIndex(x_ceil),
                            channel, fill_value));
    // f(x, y_ceil) = (x_ceil - x) / (x_ceil - x_floor) * f(x_floor, y_ceil)
    //              + (x - x_floor) / (x_ceil - x_floor) * f(x_ceil, y_ceil)
    const float value_yceil =
        (x_ceil - x) * read_with_fill_value(batch, DenseIndex(y_ceil),
                                            DenseIndex(x_floor), channel,
                                            fill_value) +
        (x - x_floor) * read_with_fill_value(batch, DenseIndex(y_ceil),
                                             DenseIndex(x_ceil), channel,
                                             fill_value);
        (x_ceil - x) * static_cast<float>(read_with_fill_value(
                           batch, DenseIndex(y_ceil), DenseIndex(x_floor),
                           channel, fill_value)) +
        (x - x_floor) * static_cast<float>(read_with_fill_value(
                            batch, DenseIndex(y_ceil), DenseIndex(x_ceil),
                            channel, fill_value));
    // f(x, y) = (y_ceil - y) / (y_ceil - y_floor) * f(x, y_floor)
    //         + (y - y_floor) / (y_ceil - y_floor) * f(x, y_ceil)
    return T((y_ceil - y) * value_yfloor + (y - y_floor) * value_yceil);
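For reference, a small NumPy sketch of the bilinear interpolation the kernel computes, assuming unit spacing between the floor and ceil coordinates as above (illustrative only):

```python
import numpy as np

def bilinear(f, x, y):
    """Bilinearly interpolate image f (H, W) at fractional coords (x, y)."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = x0 + 1, y0 + 1
    # Interpolate along x at the two surrounding rows...
    value_yfloor = (x1 - x) * f[y0, x0] + (x - x0) * f[y0, x1]
    value_yceil = (x1 - x) * f[y1, x0] + (x - x0) * f[y1, x1]
    # ...then along y between the two row values.
    return (y1 - y) * value_yfloor + (y - y0) * value_yceil

img = np.arange(16, dtype=np.float32).reshape(4, 4)
print(bilinear(img, 1.5, 2.5))  # 11.5: midpoint of the 2x2 neighborhood
```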
@@ -29,7 +29,7 @@ using shape_inference::ShapeHandle;
REGISTER_OP("ImageProjectiveTransform")
    .Input("images: dtype")
    .Input("transforms: float32")
    .Attr("dtype: {uint8, int32, int64, float32, float64}")
    .Attr("dtype: {uint8, int32, int64, float16, float32, float64}")
    .Attr("interpolation: string")
    .Output("transformed_images: dtype")
    .SetShapeFn([](InferenceContext* c) {
@@ -30,7 +30,8 @@ from tensorflow.python.ops import math_ops
from tensorflow.python.platform import googletest

_DTYPES = set(
    [dtypes.uint8, dtypes.int32, dtypes.int64, dtypes.float32, dtypes.float64])
    [dtypes.uint8, dtypes.int32, dtypes.int64,
     dtypes.float16, dtypes.float32, dtypes.float64])


class ImageOpsTest(test_util.TensorFlowTestCase):

@@ -33,7 +33,8 @@ _image_ops_so = loader.load_op_library(
    resource_loader.get_path_to_datafile("_image_ops.so"))

_IMAGE_DTYPES = set(
    [dtypes.uint8, dtypes.int32, dtypes.int64, dtypes.float32, dtypes.float64])
    [dtypes.uint8, dtypes.int32, dtypes.int64,
     dtypes.float16, dtypes.float32, dtypes.float64])

ops.RegisterShape("ImageConnectedComponents")(common_shapes.call_cpp_shape_fn)
ops.RegisterShape("ImageProjectiveTransform")(common_shapes.call_cpp_shape_fn)
@@ -1356,7 +1356,7 @@ class DropoutTest(test.TestCase):
    with self.test_session():
      images = np.random.uniform(size=(5, height, width, 3))
      output = _layers.dropout(images)
      self.assertEqual(output.op.name, 'Dropout/dropout/mul')
      self.assertEqual(output.op.name, 'Dropout/dropout_1/mul')
      output.get_shape().assert_is_compatible_with(
          ops.convert_to_tensor(images).get_shape())
@@ -57,3 +57,39 @@ dependencies {

    testCompile 'junit:junit:4.12'
}

def modelDownloadUrl = "https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_224_android_quant_2017_11_08.zip"
def localCache = "build/intermediates/mobilenet_v1_224_android_quant_2017_11_08.zip"
def targetFolder = "src/main/assets"

task downloadModel(type: DownloadUrlTask) {
    doFirst {
        println "Downloading ${modelDownloadUrl}"
    }
    sourceUrl = "${modelDownloadUrl}"
    target = file("${localCache}")
}

task unzipModel(type: Copy, dependsOn: 'downloadModel') {
    doFirst {
        println "Unzipping ${localCache}"
    }
    from zipTree("${localCache}")
    into "${targetFolder}"
}

// Ensure the model file is downloaded and extracted before every build.
preBuild.dependsOn unzipModel

class DownloadUrlTask extends DefaultTask {
    @Input
    String sourceUrl

    @OutputFile
    File target

    @TaskAction
    void download() {
        ant.get(src: sourceUrl, dest: target)
    }
}
@@ -39,7 +39,7 @@ class ExpandDimsOpModel : public SingleOpModel {
  void SetInputFloat(std::initializer_list<float> data) {
    PopulateTensor<float>(input_, data);
  }
  void SetAxis(int axis) { PopulateTensor<int32>(axis_, {axis}); }
  void SetAxis(int axis) { PopulateTensor<int32_t>(axis_, {axis}); }
  std::vector<float> GetValuesFloat() { return ExtractVector<float>(output_); }
  std::vector<int> GetOutputShape() { return GetTensorShape(output_); }

@@ -51,7 +51,7 @@ class ExpandDimsOpModel : public SingleOpModel {

TEST(ExpandDimsOpTest, DifferentAxis) {
  ExpandDimsOpModel m({2, 2}, TensorType_FLOAT32);
  const auto values = {-1.f, 1.f, -2.f, 2.f};
  std::initializer_list<float> values = {-1.f, 1.f, -2.f, 2.f};
  m.SetInputFloat(values);
  m.SetAxis(0);
  m.Invoke();
@@ -126,10 +126,10 @@ TEST(MaximumOpTest, FloatWithBroadcastTest) {
TEST(MaximumOpTest, Int32WithBroadcastTest) {
  std::initializer_list<int32_t> data1 = {1, 0, -1, -2, 3, 11};
  std::initializer_list<int32_t> data2 = {2};
  TestModel<int32>(BuiltinOperator_MAXIMUM, {TensorType_INT32, {3, 1, 2}},
  TestModel<int32_t>(BuiltinOperator_MAXIMUM, {TensorType_INT32, {3, 1, 2}},
                     {TensorType_INT32, {1}}, {TensorType_INT32, {3, 1, 2}},
                     data1, data2, {2, 2, 2, 2, 3, 11});
  TestModel<int32>(BuiltinOperator_MINIMUM, {TensorType_INT32, {3, 1, 2}},
  TestModel<int32_t>(BuiltinOperator_MINIMUM, {TensorType_INT32, {3, 1, 2}},
                     {TensorType_INT32, {1}}, {TensorType_INT32, {3, 1, 2}},
                     data1, data2, {1, 0, -1, -2, 2, 2});
}
@@ -58,9 +58,9 @@ TEST(NegOpModel, NegFloat) {

TEST(NegOpModel, NegInt32) {
  NegOpModel m({TensorType_INT32, {2, 3}}, {TensorType_INT32, {2, 3}});
  m.SetInput<int32>({-2, -1, 0, 1, 2, 3});
  m.SetInput<int32_t>({-2, -1, 0, 1, 2, 3});
  m.Invoke();
  EXPECT_THAT(m.GetOutput<int32>(), ElementsAreArray({2, 1, 0, -1, -2, -3}));
  EXPECT_THAT(m.GetOutput<int32_t>(), ElementsAreArray({2, 1, 0, -1, -2, -3}));
}

TEST(NegOpModel, NegInt64) {
@@ -88,11 +88,11 @@ TEST(SelectOpTest, SelectUInt8) {
                      TensorType_UINT8);

  model.PopulateTensor<bool>(model.input1(), {false, true, false, false});
  model.PopulateTensor<uint8>(model.input2(), {1, 2, 3, 4});
  model.PopulateTensor<uint8>(model.input3(), {5, 6, 7, 8});
  model.PopulateTensor<uint8_t>(model.input2(), {1, 2, 3, 4});
  model.PopulateTensor<uint8_t>(model.input3(), {5, 6, 7, 8});
  model.Invoke();

  EXPECT_THAT(model.GetOutput<uint8>(), ElementsAreArray({5, 2, 7, 8}));
  EXPECT_THAT(model.GetOutput<uint8_t>(), ElementsAreArray({5, 2, 7, 8}));
  EXPECT_THAT(model.GetOutputShape(), ElementsAreArray({1, 1, 1, 4}));
}

@@ -101,11 +101,11 @@ TEST(SelectOpTest, SelectInt32) {
                      TensorType_INT32);

  model.PopulateTensor<bool>(model.input1(), {false, true, false, false});
  model.PopulateTensor<int32>(model.input2(), {1, 2, 3, 4});
  model.PopulateTensor<int32>(model.input3(), {5, 6, 7, 8});
  model.PopulateTensor<int32_t>(model.input2(), {1, 2, 3, 4});
  model.PopulateTensor<int32_t>(model.input3(), {5, 6, 7, 8});
  model.Invoke();

  EXPECT_THAT(model.GetOutput<int32>(), ElementsAreArray({5, 2, 7, 8}));
  EXPECT_THAT(model.GetOutput<int32_t>(), ElementsAreArray({5, 2, 7, 8}));
  EXPECT_THAT(model.GetOutputShape(), ElementsAreArray({1, 1, 1, 4}));
}

@@ -113,11 +113,11 @@ TEST(SelectOpTest, RankOneSelectInt32) {
  SelectOpModel model({2}, {2, 1, 2, 1}, {2, 1, 2, 1}, TensorType_INT32);

  model.PopulateTensor<bool>(model.input1(), {false, true});
  model.PopulateTensor<int32>(model.input2(), {1, 2, 3, 4});
  model.PopulateTensor<int32>(model.input3(), {5, 6, 7, 8});
  model.PopulateTensor<int32_t>(model.input2(), {1, 2, 3, 4});
  model.PopulateTensor<int32_t>(model.input3(), {5, 6, 7, 8});
  model.Invoke();

  EXPECT_THAT(model.GetOutput<int32>(), ElementsAreArray({5, 6, 3, 4}));
  EXPECT_THAT(model.GetOutput<int32_t>(), ElementsAreArray({5, 6, 3, 4}));
  EXPECT_THAT(model.GetOutputShape(), ElementsAreArray({2, 1, 2, 1}));
}

@@ -125,11 +125,11 @@ TEST(SelectOpTest, RankZeroSelectInt32) {
  SelectOpModel model({1}, {1, 2, 2, 1}, {1, 2, 2, 1}, TensorType_INT32);

  model.PopulateTensor<bool>(model.input1(), {false});
  model.PopulateTensor<int32>(model.input2(), {1, 2, 3, 4});
  model.PopulateTensor<int32>(model.input3(), {5, 6, 7, 8});
  model.PopulateTensor<int32_t>(model.input2(), {1, 2, 3, 4});
  model.PopulateTensor<int32_t>(model.input3(), {5, 6, 7, 8});
  model.Invoke();

  EXPECT_THAT(model.GetOutput<int32>(), ElementsAreArray({5, 6, 7, 8}));
  EXPECT_THAT(model.GetOutput<int32_t>(), ElementsAreArray({5, 6, 7, 8}));
  EXPECT_THAT(model.GetOutputShape(), ElementsAreArray({1, 2, 2, 1}));
}
@@ -21,7 +21,6 @@ limitations under the License.
namespace tflite {
namespace {

using ::int32;
using ::testing::ElementsAreArray;

template <typename input_type = float,

@@ -50,14 +49,14 @@ class StridedSliceOpModel : public SingleOpModel {
  void SetInput(std::initializer_list<input_type> data) {
    PopulateTensor<input_type>(input_, data);
  }
  void SetBegin(std::initializer_list<int32> data) {
    PopulateTensor<int32>(begin_, data);
  void SetBegin(std::initializer_list<int32_t> data) {
    PopulateTensor<int32_t>(begin_, data);
  }
  void SetEnd(std::initializer_list<int32> data) {
    PopulateTensor<int32>(end_, data);
  void SetEnd(std::initializer_list<int32_t> data) {
    PopulateTensor<int32_t>(end_, data);
  }
  void SetStrides(std::initializer_list<int32> data) {
    PopulateTensor<int32>(strides_, data);
  void SetStrides(std::initializer_list<int32_t> data) {
    PopulateTensor<int32_t>(strides_, data);
  }

  std::vector<input_type> GetOutput() {

@@ -566,7 +565,7 @@ TEST(StridedSliceOpTest, RunTwice) {
}

TEST(StridedSliceOpTest, In3D_IdentityShrinkAxis1Uint8) {
  StridedSliceOpModel<uint8, TensorType_UINT8> m({2, 3, 2}, {3}, {3}, {3}, 0, 0,
  StridedSliceOpModel<uint8_t, TensorType_UINT8> m({2, 3, 2}, {3}, {3}, {3}, 0, 0,
                                                   0, 0, 1);
  m.SetInput({1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12});
  m.SetBegin({0, 0, 0});
@@ -22,22 +22,22 @@ using ::testing::ElementsAreArray;

TEST(TestUtilTest, QuantizeVector) {
  std::vector<float> data = {-1.0, -0.5, 0.0, 0.5, 1.0, 1000.0};
  auto q_data = Quantize<uint8>(data, /*scale=*/1.0, /*zero_point=*/0);
  std::vector<uint8> expected = {0, 0, 0, 1, 1, 255};
  auto q_data = Quantize<uint8_t>(data, /*scale=*/1.0, /*zero_point=*/0);
  std::vector<uint8_t> expected = {0, 0, 0, 1, 1, 255};
  EXPECT_THAT(q_data, ElementsAreArray(expected));
}

TEST(TestUtilTest, QuantizeVectorScalingDown) {
  std::vector<float> data = {-1.0, -0.5, 0.0, 0.5, 1.0, 1000.0};
  auto q_data = Quantize<uint8>(data, /*scale=*/10.0, /*zero_point=*/0);
  std::vector<uint8> expected = {0, 0, 0, 0, 0, 100};
  auto q_data = Quantize<uint8_t>(data, /*scale=*/10.0, /*zero_point=*/0);
  std::vector<uint8_t> expected = {0, 0, 0, 0, 0, 100};
  EXPECT_THAT(q_data, ElementsAreArray(expected));
}

TEST(TestUtilTest, QuantizeVectorScalingUp) {
  std::vector<float> data = {-1.0, -0.5, 0.0, 0.5, 1.0, 1000.0};
  auto q_data = Quantize<uint8>(data, /*scale=*/0.1, /*zero_point=*/0);
  std::vector<uint8> expected = {0, 0, 0, 5, 10, 255};
  auto q_data = Quantize<uint8_t>(data, /*scale=*/0.1, /*zero_point=*/0);
  std::vector<uint8_t> expected = {0, 0, 0, 5, 10, 255};
  EXPECT_THAT(q_data, ElementsAreArray(expected));
}
@@ -38,27 +38,27 @@ class TileOpModel : public SingleOpModel {
    PopulateTensor<float>(input_, data);
  }

  void SetInputUInt8(std::initializer_list<uint8> data) {
    PopulateTensor<uint8>(input_, data);
  void SetInputUInt8(std::initializer_list<uint8_t> data) {
    PopulateTensor<uint8_t>(input_, data);
  }

  void SetInputInt32(std::initializer_list<int32> data) {
    PopulateTensor<int32>(input_, data);
  void SetInputInt32(std::initializer_list<int32_t> data) {
    PopulateTensor<int32_t>(input_, data);
  }

  void SetInputInt64(std::initializer_list<int64_t> data) {
    PopulateTensor<int64_t>(input_, data);
  }

  void SetMultipliers(std::initializer_list<int32> data) {
    PopulateTensor<int32>(multipliers_, data);
  void SetMultipliers(std::initializer_list<int32_t> data) {
    PopulateTensor<int32_t>(multipliers_, data);
  }

  std::vector<float> GetOutputFloat() { return ExtractVector<float>(output_); }

  std::vector<uint8> GetOutputUInt8() { return ExtractVector<uint8>(output_); }
  std::vector<uint8_t> GetOutputUInt8() { return ExtractVector<uint8_t>(output_); }

  std::vector<int32> GetOutputInt32() { return ExtractVector<int32>(output_); }
  std::vector<int32_t> GetOutputInt32() { return ExtractVector<int32_t>(output_); }

  std::vector<int64_t> GetOutputInt64() {
    return ExtractVector<int64_t>(output_);
@@ -42,32 +42,32 @@ class TopKV2OpModel : public SingleOpModel {
    PopulateTensor<float>(input_, data);
  }

  void SetInputUInt8(std::initializer_list<uint8> data) {
    PopulateTensor<uint8>(input_, data);
  void SetInputUInt8(std::initializer_list<uint8_t> data) {
    PopulateTensor<uint8_t>(input_, data);
  }

  void SetInputInt32(std::initializer_list<int32> data) {
    PopulateTensor<int32>(input_, data);
  void SetInputInt32(std::initializer_list<int32_t> data) {
    PopulateTensor<int32_t>(input_, data);
  }

  void SetInputInt64(std::initializer_list<int64_t> data) {
    PopulateTensor<int64_t>(input_, data);
  }

  std::vector<int32> GetIndexes() {
    return ExtractVector<int32>(output_indexes_);
  std::vector<int32_t> GetIndexes() {
    return ExtractVector<int32_t>(output_indexes_);
  }

  std::vector<float> GetValuesFloat() {
    return ExtractVector<float>(output_values_);
  }

  std::vector<uint8> GetValuesUInt8() {
    return ExtractVector<uint8>(output_values_);
  std::vector<uint8_t> GetValuesUInt8() {
    return ExtractVector<uint8_t>(output_values_);
  }

  std::vector<int32> GetValuesInt32() {
    return ExtractVector<int32>(output_values_);
  std::vector<int32_t> GetValuesInt32() {
    return ExtractVector<int32_t>(output_values_);
  }

  std::vector<int64_t> GetValuesInt64() {

@@ -119,7 +119,7 @@ TEST(TopKV2OpTest, VectorFloat) {
  EXPECT_THAT(m.GetValuesFloat(), ElementsAreArray(ArrayFloatNear({0.8, 0.2})));
}

// Check that uint8 works.
// Check that uint8_t works.
TEST(TopKV2OpTest, TypeUint8) {
  TopKV2OpModel m({2, 3}, TensorType_UINT8, 2);
  m.SetInputUInt8({1, 2, 3, 251, 250, 249});

@@ -128,7 +128,7 @@ TEST(TopKV2OpTest, TypeUint8) {
  EXPECT_THAT(m.GetValuesUInt8(), ElementsAreArray({3, 2, 251, 250}));
}

// Check that int32 works.
// Check that int32_t works.
TEST(TopKV2OpTest, TypeInt32) {
  TopKV2OpModel m({2, 3}, TensorType_INT32, 2);
  m.SetInputInt32({1, 2, 3, 10251, 10250, 10249});
@@ -105,7 +105,7 @@ def _convert_model(flags):
    input_arrays = converter.get_input_arrays()
    std_dev_values = _parse_array(flags.std_dev_values, type_fn=int)
    mean_values = _parse_array(flags.mean_values, type_fn=int)
    quant_stats = zip(mean_values, std_dev_values)
    quant_stats = list(zip(mean_values, std_dev_values))
    if ((not flags.input_arrays and len(input_arrays) > 1) or
        (len(input_arrays) != len(quant_stats))):
      raise ValueError("Mismatching --input_arrays, --std_dev_values, and "
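The `list(...)` wrapper matters because Python 3's `zip` returns a lazy iterator that has no `len()` and is exhausted after one pass; a quick illustration:

```python
mean_values = [128, 128]
std_dev_values = [127, 127]

quant_stats = zip(mean_values, std_dev_values)
# len(quant_stats)  # TypeError on Python 3: zip objects have no len()

quant_stats = list(zip(mean_values, std_dev_values))
print(len(quant_stats))  # 2, and the pairs can be iterated repeatedly
```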
@@ -52,6 +52,7 @@ tf_custom_op_library(
    deps = [
        ":mpi_defines",
        ":mpi_message_proto_cc",
        "//tensorflow/stream_executor:stream_executor_headers_lib",
        "//third_party/mpi",
    ],
)
@@ -73,7 +73,7 @@ limitations under the License.
*/

template <class T>
using StatusOr = se::port::StatusOr<T>;
using StatusOr = stream_executor::port::StatusOr<T>;

using CPUDevice = Eigen::ThreadPoolDevice;
using GPUDevice = Eigen::GpuDevice;
@@ -30,6 +30,7 @@ from tensorflow.contrib.opt.python.training.model_average_optimizer import *
from tensorflow.contrib.opt.python.training.moving_average_optimizer import *
from tensorflow.contrib.opt.python.training.multitask_optimizer_wrapper import *
from tensorflow.contrib.opt.python.training.nadam_optimizer import *
from tensorflow.contrib.opt.python.training.weight_decay_optimizers import *
from tensorflow.contrib.opt.python.training.powersign import *
from tensorflow.contrib.opt.python.training.variable_clipping_optimizer import *
from tensorflow.contrib.opt.python.training.weight_decay_optimizers import *
@@ -506,7 +506,7 @@ def _FoldUnfusedBatchNorms(graph, is_training, freeze_batch_norm_delay):
def _IsValidUnfusedBatchNorm(graph, context):
  """Checks that the output of the unfused batch norm has consumers."""
  add_shift = graph.get_operation_by_name(
      context + '/BatchNorm/batchnorm/add_1')
      context + '/BatchNorm/batchnorm_1/add_1')
  # Ensure that the output tensor of batch norm has consumers, otherwise this
  # is a dangling node and not a match.
  return bool(add_shift.outputs[0].consumers())

@@ -599,7 +599,7 @@ def _GetBatchNormParams(graph, context, has_scaling):

  op_suffix_mean = '/BatchNorm/moments/Squeeze'
  op_suffix_variance = '/BatchNorm/moments/Squeeze_1'
  op_suffix_epsilon = '/BatchNorm/batchnorm/add/y'
  op_suffix_epsilon = '/BatchNorm/batchnorm_1/add/y'
  op_suffix_bn_decay_mean = '/BatchNorm/AssignMovingAvg/decay'
  op_suffix_bn_decay_var = '/BatchNorm/AssignMovingAvg_1/decay'

@@ -675,12 +675,12 @@ def _CreateFoldedOp(graph, context, has_scaling, freeze_batch_norm_delay,

  Returns:
    A pair of Operations, the first is the original consumer node of the batch
    norm (../BatchNorm/batchnorm/add_1), the second is the consumer node of
    norm (../BatchNorm/batchnorm_1/add_1), the second is the consumer node of
    the folded graph (add_fold).
  """
  mul_scale_name = 'mul_1' if has_scaling else 'mul'
  mul_scale = graph.get_operation_by_name(context +
                                          '/BatchNorm/batchnorm/' +
                                          '/BatchNorm/batchnorm_1/' +
                                          mul_scale_name)
  op_below = mul_scale.inputs[0].op
  # Skip over the BatchToSpace operation in the case of atrous convolutions.

@@ -707,7 +707,7 @@ def _CreateFoldedOp(graph, context, has_scaling, freeze_batch_norm_delay,
  ]
  scale_name = 'mul' if has_scaling else 'Rsqrt'
  scale = graph.get_operation_by_name(
      context + '/BatchNorm/batchnorm/' + scale_name)
      context + '/BatchNorm/batchnorm_1/' + scale_name)
  scale = array_ops.reshape(scale.outputs[0], new_shape,
                            context + '/scale_reshape')

@@ -735,7 +735,7 @@ def _CreateFoldedOp(graph, context, has_scaling, freeze_batch_norm_delay,
                                 [(1, mul_fold.outputs[0])])

  add_shift = graph.get_operation_by_name(
      context + '/BatchNorm/batchnorm/add_1')
      context + '/BatchNorm/batchnorm_1/add_1')

  corrected_output = conv_or_fc_folded.outputs[0]
  # Copy the batch-to-space operation if we have an atrous convolution.

@@ -930,7 +930,7 @@ def _HasScaling(graph, input_to_ops_map, bn):
  Returns:
    A boolean indicating whether this batch norm layer has scaling enabled.
  """
  rsqrt_op = graph.get_operation_by_name(bn + '/BatchNorm/batchnorm/Rsqrt')
  rsqrt_op = graph.get_operation_by_name(bn + '/BatchNorm/batchnorm_1/Rsqrt')
  rsqrt_consumers = input_to_ops_map.ConsumerOperations(rsqrt_op)

  return sum(1 for op in rsqrt_consumers if op.type == 'Mul') == 1
@@ -600,13 +600,13 @@ class FoldBatchNormsTest(test_util.TensorFlowTestCase):
    if has_scaling:
      if fused:
        return scope + '/BatchNorm_Fold/mul'
      return scope + '/BatchNorm/batchnorm/mul'
    return scope + '/BatchNorm/batchnorm/Rsqrt'
      return scope + '/BatchNorm/batchnorm_1/mul'
    return scope + '/BatchNorm/batchnorm_1/Rsqrt'

  def _BathNormBiasName(self, scope, fused):
    if fused:
      return scope + '/BatchNorm_Fold/bias'
    return scope + '/BatchNorm/batchnorm/sub'
    return scope + '/BatchNorm/batchnorm_1/sub'

  def _WeightInit(self, stddev):
    """Returns a truncated normal variable initializer.
@@ -385,7 +385,7 @@ class ReceptiveFieldTest(test.TestCase):
     effective_stride_y, effective_padding_x, effective_padding_y) = (
         receptive_field.compute_receptive_field_from_graph_def(
             graph_def, input_node, output_node,
             ['Dropout/dropout/random_uniform']))
             ['Dropout/dropout_1/random_uniform']))
    self.assertEqual(receptive_field_x, 3)
    self.assertEqual(receptive_field_y, 3)
    self.assertEqual(effective_stride_x, 4)
@ -18,131 +18,330 @@ from __future__ import absolute_import
|
||||
from __future__ import division
|
||||
from __future__ import print_function
|
||||
|
||||
from collections import namedtuple
|
||||
import itertools
|
||||
import warnings
|
||||
import numpy as np
|
||||
import six
|
||||
|
||||
from tensorflow.contrib import tensorrt as trt
|
||||
from tensorflow.core.protobuf import config_pb2 as cpb2
|
||||
from tensorflow.python.framework import constant_op as cop
|
||||
from tensorflow.python.framework import dtypes as dtypes
|
||||
from tensorflow.python.framework import importer as importer
|
||||
from tensorflow.python.framework import ops as ops
|
||||
from tensorflow.core.protobuf import config_pb2
|
||||
from tensorflow.core.protobuf import rewriter_config_pb2
|
||||
from tensorflow.python.framework import constant_op
|
||||
from tensorflow.python.framework import dtypes
|
||||
from tensorflow.python.framework import importer
|
||||
from tensorflow.python.framework import ops
|
||||
from tensorflow.python.framework import test_util
|
||||
from tensorflow.python.ops import array_ops as aops
|
||||
from tensorflow.python.ops import nn as nn
|
||||
from tensorflow.python.ops import nn_ops as nn_ops
|
||||
from tensorflow.python.platform import googletest
|
||||
from tensorflow.python.ops import array_ops
|
||||
from tensorflow.python.ops import math_ops
|
||||
from tensorflow.python.ops import nn
|
||||
from tensorflow.python.ops import nn_ops
|
||||
from tensorflow.python.platform import test
|
||||
|
||||
INPUT_NAME = "input"
|
||||
OUTPUT_NAME = "output"
|
||||
INPUT_DIMS = [100, 24, 24, 2]
|
||||
MODE_FP32 = "FP32"
|
||||
MODE_FP16 = "FP16"
|
||||
MODE_INT8 = "INT8"
|
||||
|
||||
if six.PY2:
|
||||
to_bytes = lambda s: s
|
||||
to_string = lambda s: s
|
||||
else:
|
||||
to_bytes = lambda s: s.encode("utf-8", errors="surrogateescape")
|
||||
to_string = lambda s: s.decode("utf-8")
|
||||
|
||||
|
||||
class IntegrationTest(test_util.TensorFlowTestCase):
|
||||
# TODO(aaroey): test graph with different dtypes.
|
||||
def GetSingleEngineGraphDef(dtype=dtypes.float32):
|
||||
"""Create a graph containing single segment."""
|
||||
g = ops.Graph()
|
||||
with g.as_default():
|
||||
inp = array_ops.placeholder(
|
||||
dtype=dtype, shape=[None] + INPUT_DIMS[1:], name=INPUT_NAME)
|
||||
with g.device("/GPU:0"):
|
||||
conv_filter = constant_op.constant(
|
||||
[[[[1., 0.5, 4., 6., 0.5, 1.], [1., 0.5, 1., 1., 0.5, 1.]]]],
|
||||
name="weights",
|
||||
dtype=dtype)
|
||||
conv = nn.conv2d(
|
||||
input=inp,
|
||||
filter=conv_filter,
|
||||
strides=[1, 2, 2, 1],
|
||||
padding="SAME",
|
||||
name="conv")
|
||||
bias = constant_op.constant(
|
||||
[4., 1.5, 2., 3., 5., 7.], name="bias", dtype=dtype)
|
||||
added = nn.bias_add(conv, bias, name="bias_add")
|
||||
relu = nn.relu(added, "relu")
|
||||
identity = array_ops.identity(relu, "identity")
|
||||
pool = nn_ops.max_pool(
|
||||
identity, [1, 2, 2, 1], [1, 2, 2, 1], "VALID", name="max_pool")
|
||||
array_ops.squeeze(pool, name=OUTPUT_NAME)
|
||||
return g.as_graph_def()
|
||||
|
||||
|
||||
# TODO(aaroey): test graph with different dtypes.
|
||||
def GetMultiEngineGraphDef(dtype=dtypes.float32):
|
||||
"""Create a graph containing multiple segment."""
|
||||
g = ops.Graph()
|
||||
with g.as_default():
|
||||
inp = array_ops.placeholder(
|
||||
dtype=dtype, shape=[None] + INPUT_DIMS[1:], name=INPUT_NAME)
|
||||
with g.device("/GPU:0"):
|
||||
conv_filter = constant_op.constant(
|
||||
[[[[1., 0.5, 4., 6., 0.5, 1.], [1., 0.5, 1., 1., 0.5, 1.]]]],
|
||||
name="weights",
|
||||
dtype=dtype)
|
||||
conv = nn.conv2d(
|
||||
input=inp,
|
||||
filter=conv_filter,
|
||||
strides=[1, 2, 2, 1],
|
||||
padding="SAME",
|
||||
name="conv")
|
||||
c1 = constant_op.constant(
|
||||
np.random.randn(INPUT_DIMS[0], 12, 12, 6), dtype=dtype)
|
||||
p = conv * c1
|
||||
c2 = constant_op.constant(
|
||||
np.random.randn(INPUT_DIMS[0], 12, 12, 6), dtype=dtype)
|
||||
q = conv / c2
|
||||
|
||||
edge = math_ops.sin(q)
|
||||
edge /= edge
|
||||
r = edge + edge
|
||||
|
||||
p -= edge
|
||||
q *= edge
|
||||
s = p + q
|
||||
s -= r
|
||||
array_ops.squeeze(s, name=OUTPUT_NAME)
|
||||
return g.as_graph_def()
|
||||
|
||||
|
||||
TestGraph = namedtuple("TestGraph",
|
||||
["gdef", "num_expected_engines", "expected_output_dims"])
|
||||
|
||||
TEST_GRAPHS = {
|
||||
"SingleEngineGraph":
|
||||
TestGraph(
|
||||
gdef=GetSingleEngineGraphDef(),
|
||||
num_expected_engines=1,
|
||||
expected_output_dims=(100, 6, 6, 6)),
|
||||
"MultiEngineGraph":
|
||||
TestGraph(
|
||||
gdef=GetMultiEngineGraphDef(),
|
||||
num_expected_engines=2,
|
||||
expected_output_dims=(100, 12, 12, 6)),
|
||||
# TODO(aaroey): add a large complex graph to test.
|
||||
}
|
||||
|
||||
|
||||
class TfTrtIntegrationTest(test_util.TensorFlowTestCase):
|
||||
"""Class to test Tensorflow-TensorRT integration."""
|
||||
|
||||
def setUp(self):
|
||||
"""Setup method."""
|
||||
super(IntegrationTest, self).setUp()
|
||||
super(TfTrtIntegrationTest, self).setUp()
|
||||
warnings.simplefilter("always")
|
||||
inp_dims = (100, 24, 24, 2)
|
||||
self._input = np.random.random_sample(inp_dims)
|
||||
self._original_graph = self.get_simple_graph_def()
|
||||
self._gpu_options = cpb2.GPUOptions(per_process_gpu_memory_fraction=0.50)
|
||||
self._config = cpb2.ConfigProto(gpu_options=self._gpu_options)
|
||||
self._reference = self.run_graph(self._original_graph, self._input)
|
||||
self._input = np.random.random_sample(INPUT_DIMS)
|
||||
|
||||
def get_simple_graph_def(self):
|
||||
"""Create a simple graph and return its graph_def."""
|
||||
g = ops.Graph()
|
||||
with g.as_default():
|
||||
a = aops.placeholder(
|
||||
dtype=dtypes.float32, shape=(None, 24, 24, 2), name="input")
|
||||
e = cop.constant(
|
||||
[[[[1., 0.5, 4., 6., 0.5, 1.], [1., 0.5, 1., 1., 0.5, 1.]]]],
|
||||
name="weights",
|
||||
dtype=dtypes.float32)
|
||||
conv = nn.conv2d(
|
||||
input=a, filter=e, strides=[1, 2, 2, 1], padding="SAME", name="conv")
|
||||
b = cop.constant(
|
||||
[4., 1.5, 2., 3., 5., 7.], name="bias", dtype=dtypes.float32)
|
||||
t = nn.bias_add(conv, b, name="biasAdd")
|
||||
relu = nn.relu(t, "relu")
|
||||
idty = aops.identity(relu, "ID")
|
||||
v = nn_ops.max_pool(
|
||||
idty, [1, 2, 2, 1], [1, 2, 2, 1], "VALID", name="max_pool")
|
||||
aops.squeeze(v, name="output")
|
||||
return g.as_graph_def()
|
||||
def _GetConfigProto(self,
|
||||
use_optimizer,
|
||||
precision_mode=None,
|
||||
is_dynamic_op=None):
|
||||
if use_optimizer:
|
||||
rewriter_cfg = rewriter_config_pb2.RewriterConfig()
|
||||
rewriter_cfg.optimizers.extend(["constfold", "layout"])
|
||||
custom_op = rewriter_cfg.custom_optimizers.add()
|
||||
custom_op.name = "TensorRTOptimizer"
|
||||
custom_op.parameter_map["minimum_segment_size"].i = 3
|
||||
custom_op.parameter_map["max_batch_size"].i = self._input.shape[0]
|
||||
custom_op.parameter_map["is_dynamic_op"].b = is_dynamic_op
|
||||
custom_op.parameter_map["max_workspace_size_bytes"].i = 1 << 25
|
||||
custom_op.parameter_map["precision_mode"].s = to_bytes(precision_mode)
|
||||
graph_options = config_pb2.GraphOptions(rewrite_options=rewriter_cfg)
|
||||
else:
|
||||
graph_options = config_pb2.GraphOptions()
|
||||
|
||||
def run_graph(self, gdef, dumm_inp):
|
||||
"""Run given graphdef once."""
|
||||
ops.reset_default_graph()
|
||||
gpu_options = config_pb2.GPUOptions()
|
||||
if trt.trt_convert.get_linked_tensorrt_version()[0] == 3:
|
||||
gpu_options.per_process_gpu_memory_fraction = 0.50
|
||||
|
||||
config = config_pb2.ConfigProto(
|
||||
gpu_options=gpu_options, graph_options=graph_options)
|
||||
return config
|
||||
|
||||
def _RunGraph(self, graph_key, gdef, input_data, config, num_runs=2):
|
||||
"""Run given graphdef multiple times."""
|
||||
g = ops.Graph()
|
||||
with g.as_default():
|
||||
inp, out = importer.import_graph_def(
|
||||
graph_def=gdef, return_elements=["input", "output"])
|
||||
graph_def=gdef, return_elements=[INPUT_NAME, OUTPUT_NAME], name="")
|
||||
inp = inp.outputs[0]
|
||||
out = out.outputs[0]
|
||||
with self.test_session(
|
||||
graph=g, config=self._config, use_gpu=True, force_gpu=True) as sess:
|
||||
val = sess.run(out, {inp: dumm_inp})
|
||||
graph=g, config=config, use_gpu=True, force_gpu=True) as sess:
|
||||
val = None
|
||||
# Defaults to 2 runs to verify result across multiple runs is same.
|
||||
for _ in range(num_runs):
|
||||
new_val = sess.run(out, {inp: input_data})
|
||||
self.assertEquals(TEST_GRAPHS[graph_key].expected_output_dims,
|
||||
new_val.shape)
|
||||
if val is not None:
|
||||
self.assertAllEqual(new_val, val)
|
||||
val = new_val
|
||||
return val
|
||||
|
||||
# Use real data that is representative of the inference dataset
|
||||
# for calibration. For this test script it is random data.
|
||||
def run_calibration(self, gdef, dumm_inp):
|
||||
"""Run given calibration graph multiple times."""
|
||||
ops.reset_default_graph()
|
||||
g = ops.Graph()
|
||||
with g.as_default():
|
||||
inp, out = importer.import_graph_def(
|
||||
graph_def=gdef, return_elements=["input", "output"])
|
||||
inp = inp.outputs[0]
|
||||
out = out.outputs[0]
|
||||
# run over real calibration data here, we are mimicking a calibration
|
||||
# set of 30 different batches. Use as much calibration data as you want
|
||||
with self.test_session(
|
||||
graph=g, config=self._config, use_gpu=True, force_gpu=True) as sess:
|
||||
for _ in range(30):
|
||||
val = sess.run(out, {inp: dumm_inp})
|
||||
return val
|
||||
def _RunCalibration(self, graph_key, gdef, input_data, config):
|
||||
"""Run calibration on given graph."""
|
||||
return self._RunGraph(graph_key, gdef, input_data, config, 30)
|
||||
|
||||
def get_trt_graph(self, mode):
|
||||
def _GetTrtGraph(self, gdef, precision_mode, is_dynamic_op):
|
||||
"""Return trt converted graph."""
|
||||
if mode in ["FP32", "FP16", "INT8"]:
|
||||
return trt.create_inference_graph(
|
||||
input_graph_def=self._original_graph,
|
||||
outputs=["output"],
|
||||
max_batch_size=self._input.shape[0],
|
||||
max_workspace_size_bytes=1 << 25,
|
||||
precision_mode=mode, # TRT Engine precision "FP32","FP16" or "INT8"
|
||||
minimum_segment_size=2 # minimum number of nodes in an engine
|
||||
)
|
||||
return None
|
||||
return trt.create_inference_graph(
|
||||
input_graph_def=gdef,
|
||||
outputs=[OUTPUT_NAME],
|
||||
max_batch_size=self._input.shape[0],
|
||||
max_workspace_size_bytes=1 << 25,
|
||||
precision_mode=precision_mode,
|
||||
minimum_segment_size=2,
|
||||
is_dynamic_op=is_dynamic_op)
|
||||
|
||||
def testFP32(self):
|
||||
"""Test FP32 conversion. Results should be identical to native case."""
|
||||
trt_graph = self.get_trt_graph("FP32")
|
||||
result = self.run_graph(trt_graph, self._input)
|
||||
self.assertAllEqual(self._reference, result)
|
||||
result1 = self.run_graph(trt_graph, self._input)
|
||||
self.assertAllEqual(result1, result)
|
||||
def _VerifyGraphDef(self,
|
||||
graph_key,
|
||||
gdef,
|
||||
precision_mode=None,
|
||||
is_calibrated=None,
|
||||
dynamic_engine=None):
|
||||
num_engines = 0
|
||||
for n in gdef.node:
|
||||
if n.op == "TRTEngineOp":
|
||||
num_engines += 1
|
||||
self.assertNotEqual("", n.attr["serialized_segment"].s)
|
||||
self.assertNotEqual("", n.attr["segment_funcdef_name"].s)
|
||||
self.assertEquals(n.attr["precision_mode"].s, precision_mode)
|
||||
self.assertEquals(n.attr["static_engine"].b, not dynamic_engine)
|
||||
if precision_mode == MODE_INT8 and is_calibrated:
|
||||
self.assertNotEqual("", n.attr["calibration_data"].s)
|
||||
else:
|
||||
self.assertEquals("", n.attr["calibration_data"].s)
|
||||
if precision_mode is None:
|
||||
self.assertEquals(num_engines, 0)
|
||||
else:
|
||||
self.assertEquals(num_engines,
|
||||
TEST_GRAPHS[graph_key].num_expected_engines)
|
||||
|
||||
def testFP16(self):
|
||||
"""Test FP16 conversion. Results may be different from native case."""
|
||||
trt_graph = self.get_trt_graph("FP16")
|
||||
result = self.run_graph(trt_graph, self._input)
|
||||
self.assertAllClose(self._reference, result, rtol=1.e-03)
|
||||
result1 = self.run_graph(trt_graph, self._input)
|
||||
self.assertAllEqual(result1, result)
|
||||
def _RunTest(self, graph_key, use_optimizer, precision_mode,
|
||||
dynamic_infer_engine, dynamic_calib_engine):
|
||||
assert precision_mode in [MODE_FP32, MODE_FP16, MODE_INT8]
|
||||
input_gdef = TEST_GRAPHS[graph_key].gdef
|
||||
self._VerifyGraphDef(graph_key, input_gdef)
|
||||
|
||||
def testINT8(self):
|
||||
"""Test INT8 conversion. Results may be different from native case."""
|
||||
calib_graph = self.get_trt_graph("INT8")
|
||||
result = self.run_calibration(calib_graph, self._input)
|
||||
self.assertAllEqual(self._reference, result)
|
||||
int8_graph = trt.calib_graph_to_infer_graph(calib_graph)
|
||||
result = self.run_graph(int8_graph, self._input)
|
||||
self.assertAllClose(self._reference, result, rtol=1.e-03)
|
||||
result1 = self.run_graph(int8_graph, self._input)
|
||||
self.assertAllEqual(result1, result)
|
||||
# Get reference result without running trt.
|
||||
config_no_trt = self._GetConfigProto(False)
|
||||
print("Running original graph w/o trt, config:\n%s" % str(config_no_trt))
|
||||
ref_result = self._RunGraph(graph_key, input_gdef, self._input,
|
||||
config_no_trt)
|
||||
|
||||
# Run calibration if necessary.
|
||||
if precision_mode == MODE_INT8:
|
||||
|
||||
calib_config = self._GetConfigProto(use_optimizer, precision_mode,
|
||||
dynamic_calib_engine)
|
||||
print("Running calibration graph, config:\n%s" % str(calib_config))
|
||||
if use_optimizer:
|
||||
self.assertTrue(False)
|
||||
# TODO(aaroey): uncomment this and get infer_gdef when this mode is
|
||||
# supported.
|
||||
# result = self._RunCalibration(graph_key, input_gdef, self._input,
|
||||
# calib_config)
|
||||
else:
|
||||
calib_gdef = self._GetTrtGraph(input_gdef, precision_mode,
|
||||
dynamic_calib_engine)
|
||||
self._VerifyGraphDef(graph_key, calib_gdef, precision_mode, False,
|
||||
dynamic_calib_engine)
|
||||
result = self._RunCalibration(graph_key, calib_gdef, self._input,
|
||||
calib_config)
|
||||
infer_gdef = trt.calib_graph_to_infer_graph(calib_gdef)
|
||||
self._VerifyGraphDef(graph_key, infer_gdef, precision_mode, True,
|
||||
dynamic_calib_engine)
|
||||
self.assertAllClose(ref_result, result, rtol=1.e-03)
|
||||
else:
|
||||
infer_gdef = input_gdef
|
||||
|
||||
# Run inference.
|
||||
infer_config = self._GetConfigProto(use_optimizer, precision_mode,
|
||||
dynamic_infer_engine)
|
||||
print("Running final inference graph, config:\n%s" % str(infer_config))
|
||||
if use_optimizer:
|
||||
result = self._RunGraph(graph_key, infer_gdef, self._input, infer_config)
|
||||
else:
|
||||
trt_infer_gdef = self._GetTrtGraph(infer_gdef, precision_mode,
|
||||
dynamic_infer_engine)
|
||||
self._VerifyGraphDef(graph_key, trt_infer_gdef, precision_mode, True,
|
||||
dynamic_infer_engine)
|
||||
result = self._RunGraph(graph_key, trt_infer_gdef, self._input,
|
||||
infer_config)
|
||||
self.assertAllClose(ref_result, result, rtol=1.e-03)
|
||||
|
||||
def testIdempotence(self):
|
||||
# Test that applying the tensorrt optimizer or offline conversion tools
# multiple times to the same graph results in the same graph.
|
||||
# TODO(aaroey): implement this.
|
||||
pass
|
||||
|
||||
|
||||
def GetTests():
|
||||
|
||||
def _GetTest(g, u, p, i, c):
|
||||
|
||||
def _Test(self):
|
||||
print("Running test with parameters: graph_key=%s, use_optimizer=%s, "
|
||||
"precision_mode=%s, dynamic_infer_engine=%s, "
|
||||
"dynamic_calib_engine=%s" % (g, u, p, i, c))
|
||||
self._RunTest(g, u, p, i, c)
|
||||
|
||||
return _Test
|
||||
|
||||
use_optimizer_options = [False, True]
|
||||
precision_mode_options = [MODE_FP32, MODE_FP16, MODE_INT8]
|
||||
dynamic_infer_engine_options = [False, True]
|
||||
dynamic_calib_engine_options = [False, True]
|
||||
for (graph_key, use_optimizer, precision_mode,
|
||||
dynamic_infer_engine, dynamic_calib_engine) in itertools.product(
|
||||
TEST_GRAPHS, use_optimizer_options, precision_mode_options,
|
||||
dynamic_infer_engine_options, dynamic_calib_engine_options):
|
||||
if precision_mode == MODE_INT8:
|
||||
if not dynamic_calib_engine and dynamic_infer_engine:
|
||||
# TODO(aaroey): test this case, the conversion from static calibration
|
||||
# engine to dynamic inference engine should be a noop.
|
||||
continue
|
||||
if use_optimizer:
|
||||
# TODO(aaroey): if use_optimizer is True we need to get the inference
|
||||
# graphdef using custom python wrapper class, which is not currently
|
||||
# supported yet.
|
||||
continue
|
||||
if not dynamic_calib_engine:
|
||||
# TODO(aaroey): construction of static calibration engine is not
|
||||
# supported yet.
|
||||
continue
|
||||
if dynamic_calib_engine and not dynamic_infer_engine:
|
||||
# TODO(aaroey): construction of static inference engine using dynamic
|
||||
# calibration engine is not supported yet.
|
||||
continue
|
||||
else: # In non int8 mode.
|
||||
if dynamic_calib_engine:
|
||||
# dynamic_calib_engine doesn't affect non-int8 modes, so just let
|
||||
# related tests run once on dynamic_calib_engine=False.
|
||||
continue
|
||||
yield _GetTest(graph_key, use_optimizer, precision_mode,
|
||||
dynamic_infer_engine, dynamic_calib_engine)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
googletest.main()
|
||||
for index, t in enumerate(GetTests()):
|
||||
setattr(TfTrtIntegrationTest, "testTfTRT_" + str(index), t)
|
||||
test.main()
|
||||
|
@ -25,7 +25,7 @@ END
|
||||
(K-1)-dimensional tensor of indices into `params`, where each element defines a
|
||||
slice of `params`:
|
||||
|
||||
output[i_0, ..., i_{K-2}] = params[indices[i0, ..., i_{K-2}]]
|
||||
output[\\(i_0, ..., i_{K-2}\\)] = params[indices[\\(i_0, ..., i_{K-2}\\)]]
|
||||
|
||||
Whereas in @{tf.gather} `indices` defines slices into the first
|
||||
dimension of `params`, in `tf.gather_nd`, `indices` defines slices into the
|
||||
|
@ -3,19 +3,19 @@ op {
|
||||
in_arg {
|
||||
name: "start"
|
||||
description: <<END
|
||||
First entry in the range.
|
||||
0-D tensor. First entry in the range.
|
||||
END
|
||||
}
|
||||
in_arg {
|
||||
name: "stop"
|
||||
description: <<END
|
||||
Last entry in the range.
|
||||
0-D tensor. Last entry in the range.
|
||||
END
|
||||
}
|
||||
in_arg {
|
||||
name: "num"
|
||||
description: <<END
|
||||
Number of values to generate.
|
||||
0-D tensor. Number of values to generate.
|
||||
END
|
||||
}
|
||||
out_arg {
|
||||
|
@ -18,7 +18,7 @@ END
|
||||
}
|
||||
summary: "Computes the matrix exponential of one or more square matrices:"
|
||||
description: <<END
|
||||
exp(A) = \sum_{n=0}^\infty A^n/n!
|
||||
\\(exp(A) = \sum_{n=0}^\infty A^n/n!\\)
|
||||
|
||||
The exponential is computed using a combination of the scaling and squaring
|
||||
method and the Pade approximation. Details can be found in:
|
||||
|
@ -20,7 +20,7 @@ END
|
||||
summary: "Computes the matrix logarithm of one or more square matrices:"
|
||||
description: <<END
|
||||
|
||||
log(exp(A)) = A
|
||||
\\(log(exp(A)) = A\\)
|
||||
|
||||
This op is only defined for complex matrices. If A is positive-definite and
|
||||
real, then casting to a complex matrix, taking the logarithm and casting back
|
||||
|
@ -36,7 +36,7 @@ END
|
||||
summary: "Joins a string Tensor across the given dimensions."
|
||||
description: <<END
|
||||
Computes the string join across dimensions in the given string Tensor of shape
|
||||
`[d_0, d_1, ..., d_n-1]`. Returns a new Tensor created by joining the input
|
||||
`[\\(d_0, d_1, ..., d_{n-1}\\)]`. Returns a new Tensor created by joining the input
|
||||
strings with the given separator (default: empty string). Negative indices are
|
||||
counted backwards from the end, with `-1` being equivalent to `n - 1`. If
|
||||
indices are not specified, joins across all dimensions beginning from `n - 1`
|
||||
|
@ -42,7 +42,7 @@ within a given variable according to `indices`.
|
||||
`ref` is a `Tensor` with rank `P` and `indices` is a `Tensor` of rank `Q`.
|
||||
|
||||
`indices` must be integer tensor, containing indices into `ref`.
|
||||
It must be shape `[d_0, ..., d_{Q-2}, K]` where `0 < K <= P`.
|
||||
It must be shape `\\([d_0, ..., d_{Q-2}, K]\\)` where `0 < K <= P`.
|
||||
|
||||
The innermost dimension of `indices` (with length `K`) corresponds to
|
||||
indices into elements (if `K = P`) or slices (if `K < P`) along the `K`th
|
||||
@ -50,9 +50,7 @@ dimension of `ref`.
|
||||
|
||||
`updates` is `Tensor` of rank `Q-1+P-K` with shape:
|
||||
|
||||
```
|
||||
[d_0, ..., d_{Q-2}, ref.shape[K], ..., ref.shape[P-1]].
|
||||
```
|
||||
$$[d_0, ..., d_{Q-2}, ref.shape[K], ..., ref.shape[P-1]].$$
|
||||
|
||||
For example, say we want to add 4 scattered elements to a rank-1 tensor with
8 elements. In Python, that addition would look like this:
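The hunk cuts off before the example itself; a minimal sketch of such an addition, with assumed values and graph-mode execution (mirroring canonical `tf.scatter_nd_add` usage):

```python
import tensorflow as tf

ref = tf.Variable([1, 2, 3, 4, 5, 6, 7, 8])
indices = tf.constant([[4], [3], [1], [7]])
updates = tf.constant([9, 10, 11, 12])
add_op = tf.scatter_nd_add(ref, indices, updates)
with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  print(sess.run(add_op))  # => [1, 13, 3, 14, 14, 6, 7, 20]
```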
|
||||
|
@ -37,7 +37,7 @@ respect to both `input` and `updates`.
|
||||
`input` is a `Tensor` with rank `P` and `indices` is a `Tensor` of rank `Q`.
|
||||
|
||||
`indices` must be integer tensor, containing indices into `input`.
|
||||
It must be shape `[d_0, ..., d_{Q-2}, K]` where `0 < K <= P`.
|
||||
It must be shape \\([d_0, ..., d_{Q-2}, K]\\) where `0 < K <= P`.
|
||||
|
||||
The innermost dimension of `indices` (with length `K`) corresponds to
|
||||
indices into elements (if `K = P`) or `(P-K)`-dimensional slices
|
||||
@ -45,9 +45,7 @@ indices into elements (if `K = P`) or `(P-K)`-dimensional slices
|
||||
|
||||
`updates` is `Tensor` of rank `Q-1+P-K` with shape:
|
||||
|
||||
```
|
||||
[d_0, ..., d_{Q-2}, input.shape[K], ..., input.shape[P-1]].
|
||||
```
|
||||
$$[d_0, ..., d_{Q-2}, input.shape[K], ..., input.shape[P-1]].$$
|
||||
|
||||
For example, say we want to add 4 scattered elements to a rank-1 tensor with
8 elements. In Python, that addition would look like this:
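The elided example mirrors the `tf.scatter_nd_add` sketch above; for this non-aliasing, plain-tensor variant the call would presumably go through `tf.scatter_nd_non_aliasing_add`, which returns a new tensor rather than mutating a variable:

```python
input_t = tf.constant([1, 2, 3, 4, 5, 6, 7, 8])
indices = tf.constant([[4], [3], [1], [7]])
updates = tf.constant([9, 10, 11, 12])
output = tf.scatter_nd_non_aliasing_add(input_t, indices, updates)
# => [1, 13, 3, 14, 14, 6, 7, 20]
```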
|
||||
|
@ -42,7 +42,7 @@ within a given variable according to `indices`.
|
||||
`ref` is a `Tensor` with rank `P` and `indices` is a `Tensor` of rank `Q`.
|
||||
|
||||
`indices` must be integer tensor, containing indices into `ref`.
|
||||
It must be shape `[d_0, ..., d_{Q-2}, K]` where `0 < K <= P`.
|
||||
It must be shape \\([d_0, ..., d_{Q-2}, K]\\) where `0 < K <= P`.
|
||||
|
||||
The innermost dimension of `indices` (with length `K`) corresponds to
|
||||
indices into elements (if `K = P`) or slices (if `K < P`) along the `K`th
|
||||
@ -50,9 +50,7 @@ dimension of `ref`.
|
||||
|
||||
`updates` is `Tensor` of rank `Q-1+P-K` with shape:
|
||||
|
||||
```
|
||||
[d_0, ..., d_{Q-2}, ref.shape[K], ..., ref.shape[P-1]].
|
||||
```
|
||||
$$[d_0, ..., d_{Q-2}, ref.shape[K], ..., ref.shape[P-1]].$$
|
||||
|
||||
For example, say we want to subtract 4 scattered elements from a rank-1 tensor
|
||||
with 8 elements. In Python, that subtraction would look like this:
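Again the hunk ends before the example; a sketch with the same assumed values as the addition example above:

```python
ref = tf.Variable([1, 2, 3, 4, 5, 6, 7, 8])
indices = tf.constant([[4], [3], [1], [7]])
updates = tf.constant([9, 10, 11, 12])
sub_op = tf.scatter_nd_sub(ref, indices, updates)
# After variable initialization, sess.run(sub_op)
# => [1, -9, 3, -6, -4, 6, 7, -4]
```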
|
||||
|
@ -42,7 +42,7 @@ variable according to `indices`.
|
||||
`ref` is a `Tensor` with rank `P` and `indices` is a `Tensor` of rank `Q`.
|
||||
|
||||
`indices` must be integer tensor, containing indices into `ref`.
|
||||
It must be shape `[d_0, ..., d_{Q-2}, K]` where `0 < K <= P`.
|
||||
It must be shape \\([d_0, ..., d_{Q-2}, K]\\) where `0 < K <= P`.
|
||||
|
||||
The innermost dimension of `indices` (with length `K`) corresponds to
|
||||
indices into elements (if `K = P`) or slices (if `K < P`) along the `K`th
|
||||
@ -50,9 +50,7 @@ dimension of `ref`.
|
||||
|
||||
`updates` is `Tensor` of rank `Q-1+P-K` with shape:
|
||||
|
||||
```
|
||||
[d_0, ..., d_{Q-2}, ref.shape[K], ..., ref.shape[P-1]].
|
||||
```
|
||||
$$[d_0, ..., d_{Q-2}, ref.shape[K], ..., ref.shape[P-1]].$$
|
||||
|
||||
For example, say we want to update 4 scattered elements in a rank-1 tensor
with 8 elements. In Python, that update would look like this:
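A sketch of the update, again with assumed values:

```python
ref = tf.Variable([1, 2, 3, 4, 5, 6, 7, 8])
indices = tf.constant([[4], [3], [1], [7]])
updates = tf.constant([9, 10, 11, 12])
update_op = tf.scatter_nd_update(ref, indices, updates)
# After variable initialization, sess.run(update_op)
# => [1, 11, 3, 10, 9, 6, 7, 12]
```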
|
||||
|
@ -16,6 +16,6 @@ END
|
||||
description: <<END
|
||||
For each batch `i` and class `j` we have
|
||||
|
||||
softmax[i, j] = exp(logits[i, j]) / sum_j(exp(logits[i, j]))
|
||||
$$softmax[i, j] = exp(logits[i, j]) / sum_j(exp(logits[i, j]))$$
|
||||
END
|
||||
}
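A quick numeric check of that formula on a single batch row (values assumed for illustration):

```python
import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1]])
softmax = tf.nn.softmax(logits)
# exp(2.0), exp(1.0), exp(0.1), each normalized by their sum:
# => [[0.659, 0.242, 0.099]]
```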
|
||||
|
@ -47,7 +47,7 @@ END
|
||||
summary: "Update relevant entries in \'*var\' and \'*accum\' according to the adagrad scheme."
|
||||
description: <<END
|
||||
That is for rows we have grad for, we update var and accum as follows:
|
||||
accum += grad * grad
|
||||
var -= lr * grad * (1 / sqrt(accum))
|
||||
$$accum += grad * grad$$
|
||||
$$var -= lr * grad * (1 / sqrt(accum))$$
|
||||
END
|
||||
}
|
||||
|
@ -83,8 +83,8 @@ mean_square = decay * mean_square + (1-decay) * gradient ** 2
|
||||
mean_grad = decay * mean_grad + (1-decay) * gradient
|
||||
Delta = learning_rate * gradient / sqrt(mean_square + epsilon - mean_grad ** 2)
|
||||
|
||||
ms <- rho * ms_{t-1} + (1-rho) * grad * grad
|
||||
mom <- momentum * mom_{t-1} + lr * grad / sqrt(ms + epsilon)
|
||||
var <- var - mom
|
||||
$$ms <- rho * ms_{t-1} + (1-rho) * grad * grad$$
|
||||
$$mom <- momentum * mom_{t-1} + lr * grad / sqrt(ms + epsilon)$$
|
||||
$$var <- var - mom$$
|
||||
END
|
||||
}
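The update rules above transcribe directly to scalar arithmetic; a plain-Python sketch with assumed initial values:

```python
# Assumed values, for illustration only.
grad, ms, mom, var = 0.5, 0.0, 0.0, 1.0
rho, momentum, lr, epsilon = 0.9, 0.9, 0.001, 1e-10

ms = rho * ms + (1 - rho) * grad * grad                    # 0.025
mom = momentum * mom + lr * grad / (ms + epsilon) ** 0.5   # ~0.00316
var = var - mom                                            # ~0.99684
```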
|
||||
|
@ -71,10 +71,10 @@ END
|
||||
summary: "Update relevant entries in \'*var\' according to the Ftrl-proximal scheme."
|
||||
description: <<END
|
||||
That is for rows we have grad for, we update var, accum and linear as follows:
|
||||
accum_new = accum + grad * grad
|
||||
linear += grad + (accum_new^(-lr_power) - accum^(-lr_power)) / lr * var
|
||||
quadratic = 1.0 / (accum_new^(lr_power) * lr) + 2 * l2
|
||||
var = (sign(linear) * l1 - linear) / quadratic if |linear| > l1 else 0.0
|
||||
accum = accum_new
|
||||
$$accum_new = accum + grad * grad$$
|
||||
$$linear += grad + (accum_{new}^{-lr_{power}} - accum^{-lr_{power}}) / lr * var$$
|
||||
$$quadratic = 1.0 / (accum_{new}^{lr_{power}} * lr) + 2 * l2$$
|
||||
$$var = (sign(linear) * l1 - linear) / quadratic\ if\ |linear| > l1\ else\ 0.0$$
|
||||
$$accum = accum_{new}$$
|
||||
END
|
||||
}
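A direct NumPy transcription of the Ftrl-proximal rules above, with hypothetical scalar hyperparameters (a sketch of the math, not the kernel implementation):

```python
import numpy as np

def ftrl_step(var, accum, linear, grad, lr=0.1, l1=0.01, l2=0.01,
              lr_power=-0.5):
  accum_new = accum + grad * grad
  linear += grad + (accum_new ** -lr_power - accum ** -lr_power) / lr * var
  quadratic = 1.0 / (accum_new ** lr_power * lr) + 2 * l2
  var = (np.sign(linear) * l1 - linear) / quadratic if abs(linear) > l1 else 0.0
  return var, accum_new, linear
```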
|
||||
|
@ -64,7 +64,7 @@ Set use_nesterov = True if you want to use Nesterov momentum.
|
||||
|
||||
That is for rows we have grad for, we update var and accum as follows:
|
||||
|
||||
accum = accum * momentum + grad
|
||||
var -= lr * accum
|
||||
$$accum = accum * momentum + grad$$
|
||||
$$var -= lr * accum$$
|
||||
END
|
||||
}
|
||||
|
@ -58,9 +58,9 @@ END
|
||||
summary: "Sparse update entries in \'*var\' and \'*accum\' according to FOBOS algorithm."
|
||||
description: <<END
|
||||
That is for rows we have grad for, we update var and accum as follows:
|
||||
accum += grad * grad
|
||||
prox_v = var
|
||||
prox_v -= lr * grad * (1 / sqrt(accum))
|
||||
var = sign(prox_v)/(1+lr*l2) * max{|prox_v|-lr*l1,0}
|
||||
$$accum += grad * grad$$
|
||||
$$prox_v = var$$
|
||||
$$prox_v -= lr * grad * (1 / sqrt(accum))$$
|
||||
$$var = sign(prox_v)/(1+lr*l2) * max{|prox_v|-lr*l1,0}$$
|
||||
END
|
||||
}
|
||||
|
@ -52,7 +52,7 @@ END
|
||||
summary: "Sparse update \'*var\' as FOBOS algorithm with fixed learning rate."
|
||||
description: <<END
|
||||
That is for rows we have grad for, we update var as follows:
|
||||
prox_v = var - alpha * grad
|
||||
var = sign(prox_v)/(1+alpha*l2) * max{|prox_v|-alpha*l1,0}
|
||||
$$prox_v = var - alpha * grad$$
|
||||
$$var = sign(prox_v)/(1+alpha*l2) * max{|prox_v|-alpha*l1,0}$$
|
||||
END
|
||||
}
|
||||
|
@ -71,8 +71,8 @@ and mom will not update in iterations during which the grad is zero.
|
||||
mean_square = decay * mean_square + (1-decay) * gradient ** 2
|
||||
Delta = learning_rate * gradient / sqrt(mean_square + epsilon)
|
||||
|
||||
ms <- rho * ms_{t-1} + (1-rho) * grad * grad
|
||||
mom <- momentum * mom_{t-1} + lr * grad / sqrt(ms + epsilon)
|
||||
var <- var - mom
|
||||
$$ms <- rho * ms_{t-1} + (1-rho) * grad * grad$$
|
||||
$$mom <- momentum * mom_{t-1} + lr * grad / sqrt(ms + epsilon)$$
|
||||
$$var <- var - mom$$
|
||||
END
|
||||
}
|
||||
|
@ -0,0 +1,40 @@
|
||||
op {
|
||||
graph_op_name: "SparseSliceGrad"
|
||||
in_arg {
|
||||
name: "backprop_val_grad"
|
||||
description: <<END
|
||||
1-D. The gradient with respect to
|
||||
the non-empty values of the sliced `SparseTensor`.
|
||||
END
|
||||
}
|
||||
in_arg {
|
||||
name: "input_indices"
|
||||
description: <<END
|
||||
2-D. The `indices` of the input `SparseTensor`.
|
||||
END
|
||||
}
|
||||
in_arg {
|
||||
name: "input_start"
|
||||
description: <<END
|
||||
1-D. Tensor representing the start of the slice.
|
||||
END
|
||||
}
|
||||
in_arg {
|
||||
name: "output_indices"
|
||||
description: <<END
|
||||
2-D. The `indices` of the sliced `SparseTensor`.
|
||||
END
|
||||
}
|
||||
out_arg {
|
||||
name: "val_grad"
|
||||
description: <<END
|
||||
1-D. The gradient with respect to the non-empty values of input `SparseTensor`.
|
||||
END
|
||||
}
|
||||
summary: "The gradient operator for the SparseSlice op."
|
||||
description: <<END
|
||||
This op takes in the upstream gradient w.r.t. non-empty values of
|
||||
the sliced `SparseTensor`, and outputs the gradients w.r.t.
|
||||
the non-empty values of input `SparseTensor`.
|
||||
END
|
||||
}
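As a sketch of what this gradient enables at the Python level, assuming the `tf.sparse_slice` wrapper that this op backs:

```python
import tensorflow as tf

sp = tf.SparseTensor(indices=[[0, 0], [1, 1], [2, 2]],
                     values=[1.0, 2.0, 3.0], dense_shape=[3, 3])
sliced = tf.sparse_slice(sp, start=[1, 0], size=[2, 3])
# Input entries that fall outside the slice receive zero gradient.
grad = tf.gradients(tf.reduce_sum(sliced.values), sp.values)[0]
# => [0.0, 1.0, 1.0]
```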
|
@ -20,7 +20,7 @@ Read @{$math_ops#Segmentation$the section on segmentation} for an explanation of
|
||||
segments.
|
||||
|
||||
Computes a tensor such that
|
||||
`(output[i] = sum_{j...} data[j...]` where the sum is over tuples `j...` such
|
||||
\\(output[i] = sum_{j...} data[j...]\\) where the sum is over tuples `j...` such
|
||||
that `segment_ids[j...] == i`. Unlike `SegmentSum`, `segment_ids`
|
||||
need not be sorted and need not cover all values in the full
|
||||
range of valid values.
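A small example of that contract (segment `1` is never referenced, so its output is zero):

```python
import tensorflow as tf

data = tf.constant([1.0, 2.0, 3.0, 4.0])
segment_ids = tf.constant([0, 2, 0, 2])
result = tf.unsorted_segment_sum(data, segment_ids, num_segments=3)
# => [4.0, 0.0, 6.0]
```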
|
||||
|
@ -1,4 +0,0 @@
|
||||
op {
|
||||
graph_op_name: "BroadcastTo"
|
||||
visibility: HIDDEN
|
||||
}
|
@ -0,0 +1,4 @@
|
||||
op {
|
||||
graph_op_name: "SparseSliceGrad"
|
||||
visibility: HIDDEN
|
||||
}
|
@ -3941,6 +3941,7 @@ cc_library(
|
||||
":sparse_reduce_op",
|
||||
":sparse_reorder_op",
|
||||
":sparse_reshape_op",
|
||||
":sparse_slice_grad_op",
|
||||
":sparse_slice_op",
|
||||
":sparse_softmax",
|
||||
":sparse_sparse_binary_op_shared",
|
||||
@ -4026,6 +4027,12 @@ tf_kernel_library(
|
||||
],
|
||||
)
|
||||
|
||||
tf_kernel_library(
|
||||
name = "sparse_slice_grad_op",
|
||||
prefix = "sparse_slice_grad_op",
|
||||
deps = SPARSE_DEPS,
|
||||
)
|
||||
|
||||
tf_kernel_library(
|
||||
name = "sparse_slice_op",
|
||||
prefix = "sparse_slice_op",
|
||||
|
@ -221,7 +221,7 @@ class FusedResizePadConvOpTest : public OpsTestBase {
|
||||
std::vector<Tensor> fused_tensors;
|
||||
TF_ASSERT_OK(session->Run({}, {"fused_conv"}, {}, &fused_tensors));
|
||||
|
||||
test::ExpectTensorNear<float>(unfused_tensors[0], fused_tensors[0], 1e-5);
|
||||
test::ExpectClose(unfused_tensors[0], fused_tensors[0]);
|
||||
}
|
||||
|
||||
void CompareFusedPadOnlyAndSeparate(int input_width, int input_height,
|
||||
@ -269,7 +269,7 @@ class FusedResizePadConvOpTest : public OpsTestBase {
|
||||
std::vector<Tensor> fused_tensors;
|
||||
TF_ASSERT_OK(session->Run({}, {"fused_conv"}, {}, &fused_tensors));
|
||||
|
||||
test::ExpectTensorNear<float>(unfused_tensors[0], fused_tensors[0], 1e-5);
|
||||
test::ExpectClose(unfused_tensors[0], fused_tensors[0]);
|
||||
}
|
||||
};
|
||||
|
||||
|
@ -704,14 +704,14 @@ class MklConcatOp : public OpKernel {
|
||||
if (input_tensors[k].NumElements() == 0)
|
||||
continue;
|
||||
|
||||
auto src_dims = TFShapeToMklDnnDims(
|
||||
mkl_input_shapes[k].GetTfShape());
|
||||
auto src_md = mkl_input_shapes[k].GetMklLayout();
|
||||
srcs[k].SetUsrMem(src_md, &input_tensors[k]);
|
||||
|
||||
if (src_md.data.format != mkl_common_format)
|
||||
if (src_md.data.format != mkl_common_format) {
|
||||
memory::dims src_dims(src_md.data.dims, &src_md.data.dims[src_md.data.ndims]);
|
||||
src_md = memory::desc(src_dims, MklDnnType<T>(),
|
||||
mkl_common_format);
|
||||
}
|
||||
|
||||
srcs_pd.push_back(memory::primitive_desc(src_md, cpu_engine));
|
||||
}
|
||||
|
tensorflow/core/kernels/sparse_slice_grad_op.cc (new file, 126 lines)
@ -0,0 +1,126 @@
|
||||
/* Copyright 2018 The TensorFlow Authors. All Rights Reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License");
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
==============================================================================*/
|
||||
|
||||
#include "tensorflow/core/framework/op_kernel.h"
|
||||
#include "tensorflow/core/framework/register_types.h"
|
||||
#include "tensorflow/core/framework/tensor.h"
|
||||
#include "tensorflow/core/framework/tensor_util.h"
|
||||
#include "tensorflow/core/framework/types.h"
|
||||
#include "tensorflow/core/util/sparse/sparse_tensor.h"
|
||||
|
||||
namespace tensorflow {
|
||||
|
||||
template <typename T>
|
||||
class SparseSliceGradOp : public OpKernel {
|
||||
public:
|
||||
explicit SparseSliceGradOp(OpKernelConstruction *ctx) : OpKernel(ctx) {}
|
||||
|
||||
void Compute(OpKernelContext *ctx) override {
|
||||
const Tensor *backprop_val_grad, *input_indices, *output_indices, *input_start;
|
||||
OP_REQUIRES_OK(ctx, ctx->input("backprop_val_grad", &backprop_val_grad));
|
||||
OP_REQUIRES_OK(ctx, ctx->input("input_indices", &input_indices));
|
||||
OP_REQUIRES_OK(ctx, ctx->input("input_start", &input_start));
|
||||
OP_REQUIRES_OK(ctx, ctx->input("output_indices", &output_indices));
|
||||
|
||||
OP_REQUIRES(ctx,
|
||||
TensorShapeUtils::IsMatrix(input_indices->shape()) &&
|
||||
TensorShapeUtils::IsMatrix(output_indices->shape()),
|
||||
errors::InvalidArgument(
|
||||
"Input and output indices should be matrices "
|
||||
"but received shapes: ",
|
||||
input_indices->shape().DebugString(), " and ",
|
||||
output_indices->shape().DebugString()));
|
||||
OP_REQUIRES(
|
||||
ctx, TensorShapeUtils::IsVector(backprop_val_grad->shape()),
|
||||
errors::InvalidArgument(
|
||||
"Input backprop_val_grad should be a vector but received shape: ",
|
||||
backprop_val_grad->shape().DebugString()));
|
||||
OP_REQUIRES(
|
||||
ctx,
|
||||
input_indices->dim_size(1) == output_indices->dim_size(1),
|
||||
errors::InvalidArgument("The input and output should have the same "
|
||||
"ndims: got: ", input_indices->dim_size(1), " and ",
|
||||
output_indices->dim_size(1)));
|
||||
OP_REQUIRES(
|
||||
ctx, output_indices->dim_size(0) <= input_indices->dim_size(0),
|
||||
errors::InvalidArgument("# rows of output_indices should be not greater "
|
||||
"than of input_indices, got ",
|
||||
output_indices->dim_size(0), " and ",
|
||||
input_indices->dim_size(0)));
|
||||
OP_REQUIRES(
|
||||
ctx, backprop_val_grad->NumElements() == output_indices->dim_size(0),
|
||||
errors::InvalidArgument("# elements of backprop_val_grad and # rows of "
|
||||
"output_indices should match (#nnz of sum): got ",
|
||||
backprop_val_grad->NumElements(), " and ",
|
||||
output_indices->dim_size(0)));
|
||||
OP_REQUIRES(ctx, TensorShapeUtils::IsVector(input_start->shape()),
|
||||
errors::InvalidArgument(
|
||||
"The input_start should be a vector but received shape ",
|
||||
input_start->shape().DebugString()));
|
||||
|
||||
const int num_dims = input_indices->dim_size(1);
|
||||
OP_REQUIRES(ctx, num_dims == input_start->NumElements(),
|
||||
errors::InvalidArgument(
|
||||
"Expected input_start to be a vector of length ", num_dims,
|
||||
" but got length ", input_start->NumElements()));
|
||||
|
||||
const int64 input_nnz = input_indices->dim_size(0);
|
||||
|
||||
Tensor *val_grad;
|
||||
OP_REQUIRES_OK(ctx,
|
||||
ctx->allocate_output(0, TensorShape({input_nnz}), &val_grad));
|
||||
|
||||
T *val_grad_flat = val_grad->flat<T>().data();
|
||||
const T *backprop_val_grad_flat = backprop_val_grad->flat<T>().data();
|
||||
memset(val_grad_flat, 0, sizeof(T) * input_nnz);
|
||||
|
||||
// Fill gradients for positions where the input and output indices coincide.
|
||||
const auto input_indices_mat = input_indices->matrix<int64>();
|
||||
const auto output_indices_mat = output_indices->matrix<int64>();
|
||||
const auto input_start_flat = input_start->flat<int64>();
|
||||
int64 j = 0;
|
||||
for (int64 i = 0; i < input_nnz && j < backprop_val_grad->NumElements();
|
||||
++i) {
|
||||
bool is_same = true;
|
||||
for (int d = 0; d < num_dims; ++d) {
|
||||
const int64 a = input_indices_mat(i, d);
|
||||
const int64 b = output_indices_mat(j, d);
|
||||
const int64 offset = input_start_flat(d);
|
||||
if (a != b + offset) {
|
||||
is_same = false;
|
||||
break;
|
||||
}
|
||||
}
|
||||
if (is_same) {
|
||||
val_grad_flat[i] = backprop_val_grad_flat[j];
|
||||
++j;
|
||||
}
|
||||
}
|
||||
OP_REQUIRES(
|
||||
ctx, backprop_val_grad->NumElements() == j,
|
||||
errors::Internal("Elements of backprop_val_grad aren't all propagated. "
|
||||
"Num elements:", backprop_val_grad->NumElements(),
|
||||
", used: ", j));
|
||||
}
|
||||
};
|
||||
|
||||
#define REGISTER_KERNELS(type) \
|
||||
REGISTER_KERNEL_BUILDER( \
|
||||
Name("SparseSliceGrad").Device(DEVICE_CPU).TypeConstraint<type>("T"), \
|
||||
SparseSliceGradOp<type>)
|
||||
|
||||
TF_CALL_NUMBER_TYPES(REGISTER_KERNELS);
|
||||
#undef REGISTER_KERNELS
|
||||
} // namespace tensorflow
|
@ -73,6 +73,21 @@ TEST_F(SqliteTest, InsertAndSelectDouble) {
|
||||
EXPECT_EQ(1, stmt.ColumnInt(1));
|
||||
}
|
||||
|
||||
#ifdef SQLITE_ENABLE_JSON1
|
||||
TEST_F(SqliteTest, Json1Extension) {
|
||||
string s1 = "{\"key\": 42}";
|
||||
string s2 = "{\"key\": \"value\"}";
|
||||
auto stmt = db_->PrepareOrDie("INSERT INTO T (a, b) VALUES (?, ?)");
|
||||
stmt.BindText(1, s1);
|
||||
stmt.BindText(2, s2);
|
||||
TF_ASSERT_OK(stmt.StepAndReset());
|
||||
stmt = db_->PrepareOrDie("SELECT json_extract(a, '$.key'), json_extract(b, '$.key') FROM T");
|
||||
TF_ASSERT_OK(stmt.Step(&is_done_));
|
||||
EXPECT_EQ(42, stmt.ColumnInt(0));
|
||||
EXPECT_EQ("value", stmt.ColumnString(1));
|
||||
}
|
||||
#endif  // SQLITE_ENABLE_JSON1
|
||||
|
||||
TEST_F(SqliteTest, NulCharsInString) {
|
||||
string s; // XXX: Want to write {2, '\0'} but not sure why not.
|
||||
s.append(static_cast<size_t>(2), '\0');
|
||||
|
@ -302,6 +302,20 @@ REGISTER_OP("SparseSplit")
|
||||
return Status::OK();
|
||||
});
|
||||
|
||||
REGISTER_OP("SparseSliceGrad")
|
||||
.Input("backprop_val_grad: T")
|
||||
.Input("input_indices: int64")
|
||||
.Input("input_start: int64")
|
||||
.Input("output_indices: int64")
|
||||
.Output("val_grad: T")
|
||||
.Attr("T: numbertype")
|
||||
.SetShapeFn([](InferenceContext* c) {
|
||||
ShapeHandle indices;
|
||||
TF_RETURN_IF_ERROR(c->WithRank(c->input(1), 2, &indices));
|
||||
c->set_output(0, c->Vector(c->Dim(indices, 0)));
|
||||
return Status::OK();
|
||||
});
|
||||
|
||||
REGISTER_OP("SparseSlice")
|
||||
.Input("indices: int64")
|
||||
.Input("values: T")
|
||||
|
@ -52,6 +52,18 @@ TEST(SparseOpsTest, SparseAddGrad_ShapeFn) {
|
||||
INFER_OK(op, "?;[?,?];[?,?];?", "[d1_0];[d2_0]");
|
||||
}
|
||||
|
||||
TEST(SparseOpsTest, SparseSliceGrad_ShapeFn) {
|
||||
ShapeInferenceTestOp op("SparseSliceGrad");
|
||||
|
||||
// Rank checks.
|
||||
INFER_ERROR("must be rank 2", op, "?;[1];?;?");
|
||||
|
||||
INFER_OK(op, "?;?;?;?", "[?]");
|
||||
|
||||
// input[1].dim(0) determines the output.
|
||||
INFER_OK(op, "?;[?,?];?;?", "[d1_0]");
|
||||
}
|
||||
|
||||
TEST(SparseOpsTest, SparseReorder_ShapeFn) {
|
||||
ShapeInferenceTestOp op("SparseReorder");
|
||||
|
||||
|
@ -66,9 +66,7 @@ landing_page:
|
||||
}
|
||||
</style>
|
||||
<div class="devsite-landing-row-item-description">
|
||||
<a href="#">
|
||||
<h3 class="hide-from-toc">Learn and use ML</h3>
|
||||
</a>
|
||||
<h3 class="hide-from-toc">Learn and use ML</h3>
|
||||
<div class="devsite-landing-row-item-description-content">
|
||||
<p>
|
||||
The high-level Keras API provides building blocks to create and
|
||||
@ -117,9 +115,7 @@ landing_page:
|
||||
- items:
|
||||
- custom_html: >
|
||||
<div class="devsite-landing-row-item-description" style="border-right: 2px solid #eee;">
|
||||
<a href="https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/eager/python/examples/notebooks">
|
||||
<h3 class="hide-from-toc">Research and experimentation</h3>
|
||||
</a>
|
||||
<h3 class="hide-from-toc">Research and experimentation</h3>
|
||||
<div class="devsite-landing-row-item-description-content">
|
||||
<p>
|
||||
Eager execution provides an imperative, define-by-run interface for advanced operations. Write custom layers, forward passes, and training loops with auto‑differentiation. Start with
|
||||
@ -170,9 +166,7 @@ landing_page:
|
||||
</div>
|
||||
- custom_html: >
|
||||
<div class="devsite-landing-row-item-description">
|
||||
<a href="#">
|
||||
<h3 class="hide-from-toc">ML at production scale</h3>
|
||||
</a>
|
||||
<h3 class="hide-from-toc">ML at production scale</h3>
|
||||
<div class="devsite-landing-row-item-description-content">
|
||||
<p>
|
||||
Estimators can train large models on multiple machines in a
|
||||
|
@ -1,7 +1,7 @@
|
||||
### Learn and use ML
|
||||
basic_classification.md
|
||||
basic_text_classification.md
|
||||
basic_regression.md
|
||||
basic_classification.md: Basic classification
|
||||
basic_text_classification.md: Text classification
|
||||
basic_regression.md: Regression
|
||||
overfit_and_underfit.md
|
||||
save_and_restore_models.md
|
||||
next_steps.md
|
||||
|
@ -1,4 +1,4 @@
|
||||
# Next Steps
|
||||
# Next steps
|
||||
|
||||
## Learn more about TensorFlow
|
||||
|
||||
|
@ -362,10 +362,10 @@ model's loss. This is the
|
||||
that will be optimized.
|
||||
|
||||
We can calculate the loss by calling @{tf.losses.sparse_softmax_cross_entropy}.
|
||||
The value returned by this function will be lowest, approximately 0,
|
||||
probability of the correct class (at index `label`) is near 1.0. The loss value
|
||||
returned is progressively larger as the probability of the correct class
|
||||
decreases.
|
||||
The value returned by this function will be approximately 0 at lowest,
|
||||
when the probability of the correct class (at index `label`) is near 1.0.
|
||||
The loss value returned is progressively larger as the probability of the
|
||||
correct class decreases.
|
||||
|
||||
This function returns the average over the whole batch.
|
||||
|
||||
|
@ -35,7 +35,7 @@ from tensorflow import keras
|
||||
* The `tf.keras` version in the latest TensorFlow release might not be the same
|
||||
as the latest `keras` version from PyPI. Check `tf.keras.__version__`.
|
||||
* When [saving a model's weights](#weights_only), `tf.keras` defaults to the
|
||||
[checkpoint format](../get_started/checkpoints.md). Pass `save_format='h5'` to
|
||||
[checkpoint format](./checkpoints.md). Pass `save_format='h5'` to
|
||||
use HDF5.
|
||||
|
||||
## Build a simple model
|
||||
@ -221,7 +221,7 @@ To *evaluate* the inference-mode loss and metrics for the data provided:
|
||||
```python
|
||||
model.evaluate(x, y, batch_size=32)
|
||||
|
||||
model.evaluate(dataset, steps=30
|
||||
model.evaluate(dataset, steps=30)
|
||||
```
|
||||
|
||||
And to *predict* the output of the last layer in inference for the data provided,
|
||||
@ -442,7 +442,7 @@ model.load_weights('my_model')
|
||||
```
|
||||
|
||||
By default, this saves the model's weights in the
|
||||
[TensorFlow checkpoint](../get_started/checkpoints.md) file format. Weights can
|
||||
[TensorFlow checkpoint](./checkpoints.md) file format. Weights can
|
||||
also be saved to the Keras HDF5 format (the default for the multi-backend
|
||||
implementation of Keras):
|
||||
|
||||
@ -581,15 +581,6 @@ model.compile(loss='binary_crossentropy', optimizer=optimizer)
|
||||
model.summary()
|
||||
```
|
||||
|
||||
Convert the Keras model to a `tf.estimator.Estimator` instance:
|
||||
|
||||
```python
|
||||
keras_estimator = keras.estimator.model_to_estimator(
|
||||
keras_model=model,
|
||||
config=config,
|
||||
model_dir='/tmp/model_dir')
|
||||
```
|
||||
|
||||
Define an *input pipeline*. The `input_fn` returns a `tf.data.Dataset` object
|
||||
used to distribute the data across multiple devices—with each device processing
|
||||
a slice of the input batch.
|
||||
@ -615,6 +606,15 @@ strategy = tf.contrib.distribute.MirroredStrategy()
|
||||
config = tf.estimator.RunConfig(train_distribute=strategy)
|
||||
```
|
||||
|
||||
Convert the Keras model to a `tf.estimator.Estimator` instance:
|
||||
|
||||
```python
|
||||
keras_estimator = keras.estimator.model_to_estimator(
|
||||
keras_model=model,
|
||||
config=config,
|
||||
model_dir='/tmp/model_dir')
|
||||
```
|
||||
|
||||
Finally, train the `Estimator` instance by providing the `input_fn` and `steps`
|
||||
arguments:
|
||||
|
||||
|
@ -289,17 +289,27 @@ Note: If you're only interested in building the libraries for the TensorFlow C
|
||||
or Java APIs, see [Build the C or Java libraries](#BuildCorJava), you do not
|
||||
need to build the pip package in that case.
|
||||
|
||||
To build a pip package for TensorFlow with CPU-only support,
|
||||
you would typically invoke the following command:
|
||||
### CPU-only support
|
||||
|
||||
To build a pip package for TensorFlow with CPU-only support:
|
||||
|
||||
<pre>
|
||||
$ <b>bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package</b>
|
||||
$ bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
|
||||
</pre>
|
||||
|
||||
To build a pip package for TensorFlow with GPU support,
|
||||
invoke the following command:
|
||||
To build a pip package for TensorFlow with CPU-only support using Intel® MKL-DNN:
|
||||
|
||||
<pre>$ <b>bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package</b> </pre>
|
||||
<pre>
|
||||
$ bazel build --config=mkl --config=opt //tensorflow/tools/pip_package:build_pip_package
|
||||
</pre>
|
||||
|
||||
### GPU support
|
||||
|
||||
To build a pip package for TensorFlow with GPU support:
|
||||
|
||||
<pre>
|
||||
$ bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
|
||||
</pre>
|
||||
|
||||
**NOTE on gcc 5 or later:** the binary pip packages available on the
|
||||
TensorFlow website are built with gcc 4, which uses the older ABI. To
|
||||
|
@ -44,23 +44,22 @@ app:
|
||||
Android Studio project.
|
||||
* Install all the Gradle extensions it requests.
|
||||
|
||||
To get a model, either:
|
||||
Now you can build and run the demo app.
|
||||
|
||||
* Download the quantized [Mobilenet TensorFlow Lite model](https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_224_android_quant_2017_11_08.zip)
|
||||
and unzip and copy `mobilenet_quant_v1_224.tflite` to the assets directory:
|
||||
`tensorflow/contrib/lite/java/demo/app/src/main/assets/`.
|
||||
* Or, download the floating point [Inception-v3 model](https://storage.googleapis.com/download.tensorflow.org/models/tflite/inception_v3_slim_2016_android_2017_11_10.zip)
|
||||
and unzip and copy `inceptionv3_non_slim_2015.tflite` to the assets
|
||||
directory. Change the chosen classifier in
|
||||
[Camera2BasicFragment.java](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/java/demo/app/src/main/java/com/example/android/tflitecamerademo/Camera2BasicFragment.java)<br>
|
||||
from: `classifier = new ImageClassifierQuantizedMobileNet(getActivity());`<br>
|
||||
to: `classifier = new ImageClassifierFloatInception(getActivity());`.
|
||||
|
||||
Now you can build and run the demo app.
|
||||
The build process downloads the quantized [Mobilenet TensorFlow Lite model](https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_224_android_quant_2017_11_08.zip), and unzips it into the assets directory: `tensorflow/contrib/lite/java/demo/app/src/main/assets/`.
|
||||
|
||||
Some additional details are available on the
|
||||
[TF Lite Android App page](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/java/demo/README.md).
|
||||
|
||||
### Using other models
|
||||
|
||||
To use a different model:
|
||||
* Download the floating point [Inception-v3 model](https://storage.googleapis.com/download.tensorflow.org/models/tflite/inception_v3_slim_2016_android_2017_11_10.zip).
|
||||
* Unzip and copy `inceptionv3_non_slim_2015.tflite` to the assets directory.
|
||||
* Change the chosen classifier in [Camera2BasicFragment.java](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/java/demo/app/src/main/java/com/example/android/tflitecamerademo/Camera2BasicFragment.java)<br>
|
||||
from: `classifier = new ImageClassifierQuantizedMobileNet(getActivity());`<br>
|
||||
to: `classifier = new ImageClassifierFloatInception(getActivity());`.
|
||||
|
||||
|
||||
## Build TensorFlow Lite and the demo app from source
|
||||
|
||||
|
@ -470,51 +470,18 @@ as the loss metric. The following code calculates cross entropy when the model
|
||||
runs in either `TRAIN` or `EVAL` mode:
|
||||
|
||||
```python
|
||||
onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=10)
|
||||
loss = tf.losses.softmax_cross_entropy(
|
||||
onehot_labels=onehot_labels, logits=logits)
|
||||
loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
|
||||
```
|
||||
|
||||
Let's take a closer look at what's happening above.
|
||||
|
||||
Our `labels` tensor contains a list of predictions for our examples, e.g. `[1,
|
||||
9, ...]`. In order to calculate cross-entropy, first we need to convert `labels`
|
||||
to the corresponding
|
||||
[one-hot encoding](https://www.quora.com/What-is-one-hot-encoding-and-when-is-it-used-in-data-science):
|
||||
Our `labels` tensor contains a list of prediction indices for our examples, e.g. `[1,
|
||||
9, ...]`. `logits` contains the linear outputs of our last layer.
|
||||
|
||||
```none
|
||||
[[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
|
||||
[0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
|
||||
...]
|
||||
```
|
||||
`tf.losses.sparse_softmax_cross_entropy` calculates the softmax cross-entropy
(a.k.a. categorical cross-entropy, or negative log-likelihood) from these two
inputs in an efficient, numerically stable way.
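For comparison, both of the following compute the same loss; the sparse variant simply skips the explicit one-hot step:

```python
loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

# Equivalent, via an explicit one-hot encoding:
onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=10)
loss = tf.losses.softmax_cross_entropy(
    onehot_labels=onehot_labels, logits=logits)
```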
|
||||
|
||||
We use the @{tf.one_hot} function
|
||||
to perform this conversion. `tf.one_hot()` has two required arguments:
|
||||
|
||||
* `indices`. The locations in the one-hot tensor that will have "on
|
||||
values"—i.e., the locations of `1` values in the tensor shown above.
|
||||
* `depth`. The depth of the one-hot tensor—i.e., the number of target classes.
|
||||
Here, the depth is `10`.
|
||||
|
||||
The following code creates the one-hot tensor for our labels, `onehot_labels`:
|
||||
|
||||
```python
|
||||
onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=10)
|
||||
```
|
||||
|
||||
Because `labels` contains a series of values from 0–9, `indices` is just our
|
||||
`labels` tensor, with values cast to integers. The `depth` is `10` because we
|
||||
have 10 possible target classes, one for each digit.
|
||||
|
||||
Next, we compute cross-entropy of `onehot_labels` and the softmax of the
|
||||
predictions from our logits layer. `tf.losses.softmax_cross_entropy()` takes
|
||||
`onehot_labels` and `logits` as arguments, performs softmax activation on
|
||||
`logits`, calculates cross-entropy, and returns our `loss` as a scalar `Tensor`:
|
||||
|
||||
```python
|
||||
loss = tf.losses.softmax_cross_entropy(
|
||||
onehot_labels=onehot_labels, logits=logits)
|
||||
```
|
||||
|
||||
### Configure the Training Op
|
||||
|
||||
|
@ -11210,7 +11210,7 @@ func SampleDistortedBoundingBoxAspectRatioRange(value []float32) SampleDistorted
|
||||
// SampleDistortedBoundingBoxAreaRange sets the optional area_range attribute to value.
|
||||
//
|
||||
// value: The cropped area of the image must contain a fraction of the
|
||||
// supplied image within in this range.
|
||||
// supplied image within this range.
|
||||
// If not specified, defaults to <f:0.05 f:1 >
|
||||
func SampleDistortedBoundingBoxAreaRange(value []float32) SampleDistortedBoundingBoxAttr {
|
||||
return func(m optionalAttr) {
|
||||
@ -17969,9 +17969,10 @@ func SparseFillEmptyRowsGrad(scope *Scope, reverse_index_map tf.Output, grad_val
|
||||
}
|
||||
|
||||
// Computes scaled exponential linear: `scale * alpha * (exp(features) - 1)`
|
||||
//
|
||||
// if < 0, `scale * features` otherwise.
|
||||
//
|
||||
// Assumes weights to have zero mean and variance 1.0 / fan_in.
|
||||
//
|
||||
// See [Self-Normalizing Neural Networks](https://arxiv.org/abs/1706.02515)
|
||||
func Selu(scope *Scope, features tf.Output) (activations tf.Output) {
|
||||
if scope.Err() != nil {
|
||||
@ -21655,7 +21656,7 @@ func ImageSummaryBadColor(value tf.Tensor) ImageSummaryAttr {
|
||||
// generated sequentially as '*tag*/image/0', '*tag*/image/1', etc.
|
||||
//
|
||||
// The `bad_color` argument is the color to use in the generated images for
|
||||
// non-finite input values. It is a `unit8` 1-D tensor of length `channels`.
|
||||
// non-finite input values. It is a `uint8` 1-D tensor of length `channels`.
|
||||
// Each element must be in the range `[0, 255]` (It represents the value of a
|
||||
// pixel in the output image). Non-finite values in the input tensor are
|
||||
// replaced by this tensor in the output image. The default value is the color
|
||||
@ -24048,7 +24049,7 @@ func SampleDistortedBoundingBoxV2AspectRatioRange(value []float32) SampleDistort
|
||||
// SampleDistortedBoundingBoxV2AreaRange sets the optional area_range attribute to value.
|
||||
//
|
||||
// value: The cropped area of the image must contain a fraction of the
|
||||
// supplied image within in this range.
|
||||
// supplied image within this range.
|
||||
// If not specified, defaults to <f:0.05 f:1 >
|
||||
func SampleDistortedBoundingBoxV2AreaRange(value []float32) SampleDistortedBoundingBoxV2Attr {
|
||||
return func(m optionalAttr) {
|
||||
@ -24744,8 +24745,7 @@ type DecodeProtoV2Attr func(optionalAttr)
|
||||
// If not specified, defaults to "local://"
|
||||
func DecodeProtoV2DescriptorSource(value string) DecodeProtoV2Attr {
|
||||
return func(m optionalAttr) {
|
||||
m["descriptor_source"] = value
|
||||
}
|
||||
m["descriptor_source"] = value }
|
||||
}
|
||||
|
||||
// DecodeProtoV2MessageFormat sets the optional message_format attribute to value.
|
||||
|
@ -13,6 +13,7 @@ See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
==============================================================================*/
|
||||
|
||||
#include <string>
|
||||
#include <algorithm>
|
||||
#include <list>
|
||||
#include <string>
|
||||
|
@ -143,6 +143,82 @@ public final class Graph implements AutoCloseable {
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Adds operations to compute the partial derivatives of sum of {@code y}s w.r.t {@code x}s,
|
||||
* i.e., {@code d(y_1 + y_2 + ...)/dx_1, d(y_1 + y_2 + ...)/dx_2...}
|
||||
* <p>
|
||||
* {@code dx} are used as initial gradients (which represent the symbolic partial derivatives of some loss function
|
||||
* {@code L} w.r.t. {@code y}). {@code dx} must be null or have size of {@code y}.
|
||||
* <p>
|
||||
* If {@code dx} is null, the implementation will use dx of {@link org.tensorflow.op.core.OnesLike OnesLike} for all
|
||||
* shapes in {@code y}.
|
||||
*
|
||||
* @param y output of the function to derive
|
||||
* @param x inputs of the function for which partial derivatives are computed
|
||||
* @param dx if not null, the partial derivatives of some loss function {@code L} w.r.t. {@code y}
|
||||
* @return the partial derivatives {@code dy} with the size of {@code x}
|
||||
*/
|
||||
public Output<?>[] addGradients(Output<?>[] y, Output<?>[] x, Output<?>[] dx) {
|
||||
Output<?>[] dy = new Output<?>[x.length];
|
||||
final long[] yHandles = new long[y.length];
|
||||
final int[] yIndices = new int[y.length];
|
||||
final long[] xHandles = new long[x.length];
|
||||
final int[] xIndices = new int[x.length];
|
||||
long[] dxHandles = null;
|
||||
int[] dxIndices = null;
|
||||
|
||||
try (Reference ref = ref()) {
|
||||
for (int i = 0; i < y.length; ++i) {
|
||||
yHandles[i] = y[i].op().getUnsafeNativeHandle();
|
||||
yIndices[i] = y[i].index();
|
||||
}
|
||||
for (int i = 0; i < x.length; ++i) {
|
||||
xHandles[i] = x[i].op().getUnsafeNativeHandle();
|
||||
xIndices[i] = x[i].index();
|
||||
}
|
||||
if (dx != null && dx.length > 0) {
|
||||
dxHandles = new long[dx.length];
|
||||
dxIndices = new int[dx.length];
|
||||
|
||||
for (int i = 0; i < dx.length; ++i) {
|
||||
dxHandles[i] = dx[i].op().getUnsafeNativeHandle();
|
||||
dxIndices[i] = dx[i].index();
|
||||
}
|
||||
}
|
||||
// Gradient outputs are returned in two continuous arrays concatenated into one. The first holds the native handles
|
||||
// of the gradient operations while the second holds the index of their output
|
||||
// e.g. given xHandles = [x0Handle, x1Handle, ...] and xIndices = [x0Index, x1Index, ..], we obtain
|
||||
// dy = [dy0Handle, dy1Handle, ..., dy0Index, dy1Index, ...]
|
||||
long[] dyHandlesAndIndices =
|
||||
addGradients(ref.nativeHandle(), yHandles, yIndices, xHandles, xIndices, dxHandles, dxIndices);
|
||||
int ndy = dyHandlesAndIndices.length >> 1;
|
||||
if (ndy != dy.length) {
|
||||
throw new IllegalStateException(String.valueOf(ndy) + " gradients were added to the graph when " + dy.length
|
||||
+ " were expected");
|
||||
}
|
||||
for (int i = 0, j = ndy; i < ndy; ++i, ++j) {
|
||||
Operation op = new Operation(this, dyHandlesAndIndices[i]);
|
||||
dy[i] = new Output<>(op, (int) dyHandlesAndIndices[j]);
|
||||
}
|
||||
}
|
||||
return dy;
|
||||
}
|
||||
|
||||
/**
|
||||
* Adds operations to compute the partial derivatives of sum of {@code y}s w.r.t {@code x}s,
|
||||
* i.e., {@code dy/dx_1, dy/dx_2...}
|
||||
* <p>
|
||||
* This is a simplified version of {@link #addGradients(Output[], Output[], Output[])} where {@code y} is
|
||||
* a single output and {@code dx} is null.
|
||||
*
|
||||
* @param y output of the function to derive
|
||||
* @param x inputs of the function for which partial derivatives are computed
|
||||
* @return the partial derivatives {@code dy} with the size of {@code x}
|
||||
*/
|
||||
public Output<?>[] addGradients(Output<?> y, Output<?>[] x) {
|
||||
return addGradients(new Output<?>[]{y}, x, null);
|
||||
}
|
||||
|
||||
private final Object nativeHandleLock = new Object();
|
||||
private long nativeHandle;
|
||||
private int refcount = 0;
|
||||
@ -254,6 +330,9 @@ public final class Graph implements AutoCloseable {
|
||||
|
||||
private static native byte[] toGraphDef(long handle);
|
||||
|
||||
private static native long[] addGradients(long handle, long[] inputHandles, int[] inputIndices,
|
||||
long[] outputHandles, int[] outputIndices, long[] gradInputHandles, int[] gradInputIndices);
|
||||
|
||||
static {
|
||||
TensorFlow.init();
|
||||
}
|
||||
|
@ -0,0 +1,153 @@
|
||||
/* Copyright 2018 The TensorFlow Authors. All Rights Reserved.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License");
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
==============================================================================*/
|
||||
|
||||
package org.tensorflow.op.core;
|
||||
|
||||
import java.util.Arrays;
|
||||
import java.util.Iterator;
|
||||
import java.util.List;
|
||||
|
||||
import org.tensorflow.Operand;
|
||||
import org.tensorflow.Output;
|
||||
import org.tensorflow.op.Op;
|
||||
import org.tensorflow.op.Operands;
|
||||
import org.tensorflow.op.Scope;
|
||||
import org.tensorflow.op.annotation.Operator;
|
||||
|
||||
/**
|
||||
* Adds operations to compute the partial derivatives of sum of {@code y}s w.r.t {@code x}s,
|
||||
* i.e., {@code d(y_1 + y_2 + ...)/dx_1, d(y_1 + y_2 + ...)/dx_2...}
|
||||
* <p>
|
||||
* If {@code Options.dx()} values are set, they are used as the initial symbolic partial derivatives of some loss
|
||||
* function {@code L} w.r.t. {@code y}. {@code Options.dx()} must have the size of {@code y}.
|
||||
* <p>
|
||||
* If {@code Options.dx()} is not set, the implementation will use dx of {@code OnesLike} for all
|
||||
* shapes in {@code y}.
|
||||
* <p>
|
||||
* The partial derivatives are returned in output {@code dy}, with the size of {@code x}.
|
||||
* <p>
|
||||
* Example of usage:
|
||||
* <pre>{@code
|
||||
* Gradients gradients = Gradients.create(scope, Arrays.asList(loss), Arrays.asList(w, b));
|
||||
*
|
||||
* Constant<Float> alpha = ops.constant(1.0f, Float.class);
|
||||
* ApplyGradientDescent.create(scope, w, alpha, gradients.<Float>dy(0));
|
||||
* ApplyGradientDescent.create(scope, b, alpha, gradients.<Float>dy(1));
|
||||
* }</pre>
|
||||
*/
|
||||
@Operator
|
||||
public class Gradients implements Op, Iterable<Operand<?>> {
|
||||
|
||||
/**
|
||||
* Optional attributes for {@link Gradients}
|
||||
*/
|
||||
public static class Options {
|
||||
|
||||
/**
|
||||
* @param dx partial derivatives of some loss function {@code L} w.r.t. {@code y}
|
||||
* @return this option builder
|
||||
*/
|
||||
public Options dx(Iterable<Operand<?>> dx) {
|
||||
this.dx = dx;
|
||||
return this;
|
||||
}
|
||||
|
||||
private Iterable<Operand<?>> dx;
|
||||
|
||||
private Options() {
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Adds gradients computation ops to the graph according to scope.
|
||||
*
|
||||
* @param scope current graph scope
|
||||
* @param y outputs of the function to derive
|
||||
* @param x inputs of the function for which partial derivatives are computed
|
||||
* @param options carries optional attributes values
|
||||
* @return a new instance of {@code Gradients}
|
||||
*/
|
||||
public static Gradients create(Scope scope, Iterable<Operand<?>> y, Iterable<Operand<?>> x, Options... options) {
|
||||
Output<?>[] dx = null;
|
||||
if (options != null) {
|
||||
for (Options opts : options) {
|
||||
if (opts.dx != null) {
|
||||
dx = Operands.asOutputs(opts.dx);
|
||||
}
|
||||
}
|
||||
}
|
||||
Output<?>[] gradOutputs = scope.graph().addGradients(Operands.asOutputs(y), Operands.asOutputs(x), dx);
|
||||
return new Gradients(Arrays.asList(gradOutputs));
|
||||
}
|
||||
|
||||
/**
|
||||
* Adds gradients computation ops to the graph according to scope.
|
||||
*
|
||||
* This is a simplified version of {@link #create(Scope, Iterable, Iterable, Options...)} where {@code y} is
|
||||
* a single output.
|
||||
*
|
||||
* @param scope current graph scope
|
||||
* @param y output of the function to derive
|
||||
* @param x inputs of the function for which partial derivatives are computed
|
||||
* @param options carries optional attributes values
|
||||
* @return a new instance of {@code Gradients}
|
||||
*/
|
||||
@SuppressWarnings({"unchecked", "rawtypes"})
|
||||
public static Gradients create(Scope scope, Operand<?> y, Iterable<Operand<?>> x, Options... options) {
|
||||
return create(scope, (Iterable) Arrays.asList(y), x, options);
|
||||
}
|
||||
|
||||
/**
|
||||
* @param dx partial derivatives of some loss function {@code L} w.r.t. {@code y}
|
||||
* @return builder to add more options to this operation
|
||||
*/
|
||||
public Options dx(Iterable<Operand<?>> dx) {
|
||||
return new Options().dx(dx);
|
||||
}
|
||||
|
||||
@Override
|
||||
@SuppressWarnings({"rawtypes", "unchecked"})
|
||||
public Iterator<Operand<?>> iterator() {
|
||||
return (Iterator) dy.iterator();
|
||||
}
|
||||
|
||||
/**
|
||||
* Partial derivatives of {@code y}s w.r.t. {@code x}s, with the size of {@code x}
|
||||
*/
|
||||
public List<Output<?>> dy() {
|
||||
return dy;
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns a symbolic handle to one of the gradient operation output
|
||||
* <p>
|
||||
* Warning: Does not check that the type of the tensor matches T. It is recommended to call
|
||||
* this method with an explicit type parameter rather than letting it be inferred, e.g. {@code
|
||||
* gradients.<Integer>dy(0)}
|
||||
*
|
||||
* @param <T> The expected element type of the tensors produced by this output.
|
||||
* @param index The index of the output among the gradients added by this operation
|
||||
*/
|
||||
@SuppressWarnings("unchecked")
|
||||
public <T> Output<T> dy(int index) {
|
||||
return (Output<T>) dy.get(index);
|
||||
}
|
||||
|
||||
private List<Output<?>> dy;
|
||||
|
||||
private Gradients(List<Output<?>> dy) {
|
||||
this.dy = dy;
|
||||
}
|
||||
}
|
@ -16,7 +16,9 @@ limitations under the License.
|
||||
#include "tensorflow/java/src/main/native/graph_jni.h"
|
||||
|
||||
#include <limits>
|
||||
#include <memory>
|
||||
#include "tensorflow/c/c_api.h"
|
||||
#include "tensorflow/java/src/main/native/utils_jni.h"
|
||||
#include "tensorflow/java/src/main/native/exception_jni.h"
|
||||
|
||||
namespace {
|
||||
@ -130,3 +132,55 @@ Java_org_tensorflow_Graph_toGraphDef(JNIEnv* env, jclass clazz, jlong handle) {
|
||||
TF_DeleteBuffer(buf);
|
||||
return ret;
|
||||
}
|
||||
|
||||
JNIEXPORT jlongArray JNICALL
|
||||
Java_org_tensorflow_Graph_addGradients(JNIEnv* env, jclass clazz, jlong handle,
|
||||
jlongArray y_handles, jintArray y_indices,
|
||||
jlongArray x_handles, jintArray x_indices,
|
||||
jlongArray dx_handles, jintArray dx_indices) {
|
||||
|
||||
TF_Graph* g = requireHandle(env, handle);
|
||||
if (g == nullptr) return nullptr;
|
||||
|
||||
const jint ny = env->GetArrayLength(y_handles);
|
||||
const jint nx = env->GetArrayLength(x_handles);
|
||||
|
||||
std::unique_ptr<TF_Output[]> y(new TF_Output[ny]);
|
||||
std::unique_ptr<TF_Output[]> x(new TF_Output[nx]);
|
||||
std::unique_ptr<TF_Output[]> dx(nullptr);
|
||||
std::unique_ptr<TF_Output[]> dy(new TF_Output[nx]);
|
||||
|
||||
resolveOutputs(env, "y", y_handles, y_indices, y.get(), ny);
|
||||
resolveOutputs(env, "x", x_handles, x_indices, x.get(), nx);
|
||||
if (dx_handles != nullptr) {
|
||||
if (env->GetArrayLength(dx_handles) != ny) {
|
||||
throwException(env, kIllegalArgumentException,
|
||||
"expected %d, got %d dx handles", ny,
|
||||
env->GetArrayLength(dx_handles));
|
||||
}
|
||||
dx.reset(new TF_Output[ny]);
|
||||
resolveOutputs(env, "dx", dx_handles, dx_indices, dx.get(), ny);
|
||||
}
|
||||
if (env->ExceptionCheck()) return nullptr;
|
||||
|
||||
TF_Status* status = TF_NewStatus();
|
||||
TF_AddGradients(g, y.get(), ny, x.get(), nx, dx.get(), status, dy.get());
|
||||
|
||||
if (!throwExceptionIfNotOK(env, status)) {
|
||||
TF_DeleteStatus(status);
|
||||
return nullptr;
|
||||
}
|
||||
TF_DeleteStatus(status);
|
||||
|
||||
// returned array contains both op handles and output indices, in pair
|
||||
jlongArray dy_handles_and_indices = env->NewLongArray(nx << 1);
|
||||
jlong* dy_elems = env->GetLongArrayElements(dy_handles_and_indices, nullptr);
|
||||
for (int i = 0, j = nx; i < nx; ++i, ++j) {
|
||||
TF_Output dy_output = dy.get()[i];
|
||||
dy_elems[i] = reinterpret_cast<jlong>(dy_output.oper);
|
||||
dy_elems[j] = static_cast<jlong>(dy_output.index);
|
||||
}
|
||||
env->ReleaseLongArrayElements(dy_handles_and_indices, dy_elems, 0);
|
||||
|
||||
return dy_handles_and_indices;
|
||||
}
|
||||
|
@ -73,6 +73,15 @@ JNIEXPORT jbyteArray JNICALL Java_org_tensorflow_Graph_toGraphDef(JNIEnv *,
|
||||
jclass,
|
||||
jlong);
|
||||
|
||||
/*
|
||||
* Class: org_tensorflow_Graph
|
||||
* Method:    addGradients
|
||||
* Signature: (J[J[I[J[I[J[I)[J
|
||||
*/
|
||||
JNIEXPORT jlongArray JNICALL Java_org_tensorflow_Graph_addGradients(JNIEnv *,
|
||||
jclass, jlong, jlongArray, jintArray, jlongArray, jintArray, jlongArray,
|
||||
jintArray);
|
||||
|
||||
#ifdef __cplusplus
|
||||
} // extern "C"
|
||||
#endif // __cplusplus
|
||||
|
@ -17,6 +17,7 @@ limitations under the License.
|
||||
#include <memory>
|
||||
|
||||
#include "tensorflow/c/c_api.h"
|
||||
#include "tensorflow/java/src/main/native/utils_jni.h"
|
||||
#include "tensorflow/java/src/main/native/exception_jni.h"
|
||||
#include "tensorflow/java/src/main/native/session_jni.h"
|
||||
|
||||
@ -55,37 +56,6 @@ void resolveHandles(JNIEnv* env, const char* type, jlongArray src_array,
|
||||
env->ReleaseLongArrayElements(src_array, src_start, JNI_ABORT);
|
||||
}
|
||||
|
||||
void resolveOutputs(JNIEnv* env, const char* type, jlongArray src_op,
|
||||
jintArray src_index, TF_Output* dst, jint n) {
|
||||
if (env->ExceptionCheck()) return;
|
||||
jint len = env->GetArrayLength(src_op);
|
||||
if (len != n) {
|
||||
throwException(env, kIllegalArgumentException,
|
||||
"expected %d, got %d %s Operations", n, len, type);
|
||||
return;
|
||||
}
|
||||
len = env->GetArrayLength(src_index);
|
||||
if (len != n) {
|
||||
throwException(env, kIllegalArgumentException,
|
||||
"expected %d, got %d %s Operation output indices", n, len,
|
||||
type);
|
||||
return;
|
||||
}
|
||||
jlong* op_handles = env->GetLongArrayElements(src_op, nullptr);
|
||||
jint* indices = env->GetIntArrayElements(src_index, nullptr);
|
||||
for (int i = 0; i < n; ++i) {
|
||||
if (op_handles[i] == 0) {
|
||||
throwException(env, kNullPointerException, "invalid %s (#%d of %d)", type,
|
||||
i, n);
|
||||
break;
|
||||
}
|
||||
dst[i] = TF_Output{reinterpret_cast<TF_Operation*>(op_handles[i]),
|
||||
static_cast<int>(indices[i])};
|
||||
}
|
||||
env->ReleaseIntArrayElements(src_index, indices, JNI_ABORT);
|
||||
env->ReleaseLongArrayElements(src_op, op_handles, JNI_ABORT);
|
||||
}
|
||||
|
||||
void TF_MaybeDeleteBuffer(TF_Buffer* buf) {
|
||||
if (buf == nullptr) return;
|
||||
TF_DeleteBuffer(buf);
|
||||
|
tensorflow/java/src/main/native/utils_jni.cc (new file, 53 lines)
@ -0,0 +1,53 @@
|
||||
/* Copyright 2018 The TensorFlow Authors. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
==============================================================================*/

#include "tensorflow/java/src/main/native/utils_jni.h"

#include "tensorflow/java/src/main/native/exception_jni.h"

void resolveOutputs(JNIEnv* env, const char* type, jlongArray src_op,
                    jintArray src_index, TF_Output* dst, jint n) {
  if (env->ExceptionCheck()) return;
  jint len = env->GetArrayLength(src_op);
  if (len != n) {
    throwException(env, kIllegalArgumentException,
                   "expected %d, got %d %s Operations", n, len, type);
    return;
  }
  len = env->GetArrayLength(src_index);
  if (len != n) {
    throwException(env, kIllegalArgumentException,
                   "expected %d, got %d %s Operation output indices", n, len,
                   type);
    return;
  }
  jlong* op_handles = env->GetLongArrayElements(src_op, nullptr);
  jint* indices = env->GetIntArrayElements(src_index, nullptr);
  for (int i = 0; i < n; ++i) {
    if (op_handles[i] == 0) {
      throwException(env, kNullPointerException, "invalid %s (#%d of %d)", type,
                     i, n);
      break;
    }
    dst[i] = TF_Output{reinterpret_cast<TF_Operation*>(op_handles[i]),
                       static_cast<int>(indices[i])};
  }
  env->ReleaseIntArrayElements(src_index, indices, JNI_ABORT);
  env->ReleaseLongArrayElements(src_op, op_handles, JNI_ABORT);
}
tensorflow/java/src/main/native/utils_jni.h (new file, 33 lines)
@@ -0,0 +1,33 @@
/* Copyright 2018 The TensorFlow Authors. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
==============================================================================*/

#ifndef TENSORFLOW_JAVA_UTILS_JNI_H_
#define TENSORFLOW_JAVA_UTILS_JNI_H_

#include <jni.h>

#include "tensorflow/c/c_api.h"

#ifdef __cplusplus
extern "C" {
#endif  // __cplusplus

void resolveOutputs(JNIEnv* env, const char* type, jlongArray src_op,
                    jintArray src_index, TF_Output* dst, jint n);

#ifdef __cplusplus
}  // extern "C"
#endif  // __cplusplus
#endif /* TENSORFLOW_JAVA_UTILS_JNI_H_ */
@@ -22,6 +22,7 @@ import static org.junit.Assert.assertTrue;

import java.util.HashSet;
import java.util.Iterator;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.JUnit4;
@@ -129,4 +130,106 @@ public class GraphTest {
      // expected exception.
    }
  }

  @Test
  public void addGradientsToGraph() {
    try (Graph g = new Graph();
        Session s = new Session(g)) {

      Output<Float> x1 = TestUtil.placeholder(g, "x1", Float.class);
      Output<Float> x2 = TestUtil.placeholder(g, "x2", Float.class);
      Output<Float> y0 = TestUtil.square(g, "y0", x1);
      Output<Float> y1 = TestUtil.square(g, "y1", y0);
      Output<Float> y2 = TestUtil.addN(g, y0, x2);

      Output<?>[] grads0 = g.addGradients(y1, toArray(x1));
      assertNotNull(grads0);
      assertEquals(1, grads0.length);
      assertEquals(DataType.FLOAT, grads0[0].dataType());

      Output<?>[] grads1 = g.addGradients(y2, toArray(x1, x2));
      assertNotNull(grads1);
      assertEquals(2, grads1.length);
      assertEquals(DataType.FLOAT, grads1[0].dataType());
      assertEquals(DataType.FLOAT, grads1[1].dataType());

      try (Tensor<Float> c1 = Tensors.create(3.0f);
          Tensor<Float> c2 = Tensors.create(2.0f);
          TestUtil.AutoCloseableList<Tensor<?>> outputs = new TestUtil.AutoCloseableList<>(
              s.runner()
                  .feed(x1, c1)
                  .feed(x2, c2)
                  .fetch(grads0[0])
                  .fetch(grads1[0])
                  .fetch(grads1[1])
                  .run())) {

        assertEquals(3, outputs.size());
        assertEquals(108.0f, outputs.get(0).floatValue(), 0.0f);
        assertEquals(6.0f, outputs.get(1).floatValue(), 0.0f);
        assertEquals(1.0f, outputs.get(2).floatValue(), 0.0f);
      }
    }
  }

  @Test
  public void addGradientSumsToGraph() {
    try (Graph g = new Graph();
        Session s = new Session(g)) {

      Output<Float> x = TestUtil.placeholder(g, "x", Float.class);
      Output<Float> y0 = TestUtil.square(g, "y0", x);
      Output<Float> y1 = TestUtil.square(g, "y1", y0);

      Output<?>[] grad = g.addGradients(toArray(y0, y1), toArray(x), null);
      assertNotNull(grad);
      assertEquals(1, grad.length);
      assertEquals(DataType.FLOAT, grad[0].dataType());

      try (Tensor<Float> c = Tensors.create(3.0f);
          Tensor<?> output = s.runner()
              .feed(x, c)
              .fetch(grad[0])
              .run()
              .get(0)) {

        assertEquals(114.0f, output.floatValue(), 0.0f);
      }
    }
  }

  @Test
  public void addGradientsWithInitialValuesToGraph() {
    try (Graph g = new Graph();
        Session s = new Session(g)) {

      Output<Float> x = TestUtil.placeholder(g, "x", Float.class);
      Output<Float> y0 = TestUtil.square(g, "y0", x);
      Output<Float> y1 = TestUtil.square(g, "y1", y0);

      Output<?>[] grad0 = g.addGradients(y1, toArray(y0));
      assertNotNull(grad0);
      assertEquals(1, grad0.length);
      assertEquals(DataType.FLOAT, grad0[0].dataType());

      Output<?>[] grad1 = g.addGradients(toArray(y0), toArray(x), toArray(grad0[0]));
      assertNotNull(grad1);
      assertEquals(1, grad1.length);
      assertEquals(DataType.FLOAT, grad1[0].dataType());

      try (Tensor<Float> c = Tensors.create(3.0f);
          Tensor<?> output = s.runner()
              .feed(x, c)
              .fetch(grad1[0])
              .run()
              .get(0)) {

        assertEquals(108.0f, output.floatValue(), 0.0f);
      }
    }
  }

  private static Output<?>[] toArray(Output<?>... outputs) {
    return outputs;
  }
}
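For readers more familiar with the Python API: the new Java `Graph.addGradients` mirrors `tf.gradients`, with the optional `dx` argument playing the role of `grad_ys`. A minimal Python sketch under TF 1.x (illustrative, not code from this diff) reproducing `addGradientSumsToGraph` above:

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, name="x")
y0 = tf.square(x, name="y0")
y1 = tf.square(y0, name="y1")

# d(y0 + y1)/dx = 2x + 4x^3, i.e. 114.0 at x = 3, matching the Java test.
grad = tf.gradients([y0, y1], [x])
with tf.Session() as sess:
  print(sess.run(grad, feed_dict={x: 3.0}))  # [114.0]
```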
@ -20,8 +20,6 @@ import static org.junit.Assert.assertEquals;
|
||||
import static org.junit.Assert.assertTrue;
|
||||
import static org.junit.Assert.fail;
|
||||
|
||||
import java.util.ArrayList;
|
||||
import java.util.Collection;
|
||||
import org.junit.Test;
|
||||
import org.junit.runner.RunWith;
|
||||
import org.junit.runners.JUnit4;
|
||||
@@ -36,8 +34,8 @@ public class SessionTest {
        Session s = new Session(g)) {
      TestUtil.transpose_A_times_X(g, new int[][] {{2}, {3}});
      try (Tensor<Integer> x = Tensors.create(new int[][] {{5}, {7}});
-         AutoCloseableList<Tensor<?>> outputs =
-             new AutoCloseableList<Tensor<?>>(s.runner().feed("X", x).fetch("Y").run())) {
+         TestUtil.AutoCloseableList<Tensor<?>> outputs =
+             new TestUtil.AutoCloseableList<Tensor<?>>(s.runner().feed("X", x).fetch("Y").run())) {
        assertEquals(1, outputs.size());
        final int[][] expected = {{31}};
        assertArrayEquals(expected, outputs.get(0).copyTo(new int[1][1]));
@@ -53,8 +51,8 @@ public class SessionTest {
      Output<Integer> feed = g.operation("X").output(0);
      Output<Integer> fetch = g.operation("Y").output(0);
      try (Tensor<Integer> x = Tensors.create(new int[][] {{5}, {7}});
-         AutoCloseableList<Tensor<?>> outputs =
-             new AutoCloseableList<Tensor<?>>(s.runner().feed(feed, x).fetch(fetch).run())) {
+         TestUtil.AutoCloseableList<Tensor<?>> outputs =
+             new TestUtil.AutoCloseableList<Tensor<?>>(s.runner().feed(feed, x).fetch(fetch).run())) {
        assertEquals(1, outputs.size());
        final int[][] expected = {{31}};
        assertArrayEquals(expected, outputs.get(0).copyTo(new int[1][1]));
@@ -112,7 +110,7 @@ public class SessionTest {
              .setOptions(fullTraceRunOptions())
              .runAndFetchMetadata();
      // Sanity check on outputs.
-     AutoCloseableList<Tensor<?>> outputs = new AutoCloseableList<Tensor<?>>(result.outputs);
+     TestUtil.AutoCloseableList<Tensor<?>> outputs = new TestUtil.AutoCloseableList<Tensor<?>>(result.outputs);
      assertEquals(1, outputs.size());
      final int[][] expected = {{31}};
      assertArrayEquals(expected, outputs.get(0).copyTo(new int[1][1]));
@@ -135,8 +133,8 @@ public class SessionTest {
        Session s = new Session(g)) {
      TestUtil.constant(g, "c1", 2718);
      TestUtil.constant(g, "c2", 31415);
-     AutoCloseableList<Tensor<?>> outputs =
-         new AutoCloseableList<Tensor<?>>(s.runner().fetch("c2").fetch("c1").run());
+     TestUtil.AutoCloseableList<Tensor<?>> outputs =
+         new TestUtil.AutoCloseableList<Tensor<?>>(s.runner().fetch("c2").fetch("c1").run());
      assertEquals(2, outputs.size());
      assertEquals(31415, outputs.get(0).intValue());
      assertEquals(2718, outputs.get(1).intValue());
@ -164,28 +162,6 @@ public class SessionTest {
|
||||
Session s = new Session(g, singleThreadConfigProto())) {}
|
||||
}
|
||||
|
||||
private static final class AutoCloseableList<E extends AutoCloseable> extends ArrayList<E>
|
||||
implements AutoCloseable {
|
||||
AutoCloseableList(Collection<? extends E> c) {
|
||||
super(c);
|
||||
}
|
||||
|
||||
@Override
|
||||
public void close() {
|
||||
Exception toThrow = null;
|
||||
for (AutoCloseable c : this) {
|
||||
try {
|
||||
c.close();
|
||||
} catch (Exception e) {
|
||||
toThrow = e;
|
||||
}
|
||||
}
|
||||
if (toThrow != null) {
|
||||
throw new RuntimeException(toThrow);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
private static byte[] fullTraceRunOptions() {
|
||||
// Ideally this would use the generated Java sources for protocol buffers
|
||||
// and end up with something like the snippet below. However, generating
|
||||
|
@@ -16,9 +16,34 @@ limitations under the License.
package org.tensorflow;

import java.lang.reflect.Array;
+import java.util.ArrayList;
+import java.util.Collection;

/** Static utility functions. */
public class TestUtil {

+ public static final class AutoCloseableList<E extends AutoCloseable> extends ArrayList<E>
+     implements AutoCloseable {
+   AutoCloseableList(Collection<? extends E> c) {
+     super(c);
+   }
+
+   @Override
+   public void close() {
+     Exception toThrow = null;
+     for (AutoCloseable c : this) {
+       try {
+         c.close();
+       } catch (Exception e) {
+         toThrow = e;
+       }
+     }
+     if (toThrow != null) {
+       throw new RuntimeException(toThrow);
+     }
+   }
+ }
+
  public static <T> Output<T> constant(Graph g, String name, Object value) {
    try (Tensor<?> t = Tensor.create(value)) {
      return g.opBuilder("Const", name)

@@ -36,7 +61,7 @@ public class TestUtil {
            .<T>output(0);
  }

- public static Output<?> addN(Graph g, Output<?>... inputs) {
+ public static <T> Output<T> addN(Graph g, Output<?>... inputs) {
    return g.opBuilder("AddN", "AddN").addInputList(inputs).build().output(0);
  }

@@ -58,6 +83,13 @@ public class TestUtil {
            .setAttr("num_split", numSplit)
            .build();
  }

+ public static <T> Output<T> square(Graph g, String name, Output<T> value) {
+   return g.opBuilder("Square", name)
+       .addInput(value)
+       .build()
+       .<T>output(0);
+ }
+
  public static void transpose_A_times_X(Graph g, int[][] a) {
    Output<Integer> aa = constant(g, "A", a);
@@ -99,7 +99,7 @@ class EstimatorSpec(
      ignored in eval and infer modes. Example:

      ```python
-     def my_model_fn(mode, features, labels):
+     def my_model_fn(features, labels, mode):
        predictions = ...
        loss = ...
        train_op = ...

@@ -114,7 +114,7 @@ class EstimatorSpec(
      given mode. Example:

      ```python
-     def my_model_fn(mode, features, labels):
+     def my_model_fn(features, labels, mode):
        if (mode == tf.estimator.ModeKeys.TRAIN or
            mode == tf.estimator.ModeKeys.EVAL):
          loss = ...
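As a sanity check of the corrected signature (`features, labels, mode`, not `mode, features, labels`), a minimal complete `model_fn` in the documented argument order; this is an illustrative sketch, not code from this diff:

```python
import tensorflow as tf

def my_model_fn(features, labels, mode):
  # mode comes third, as the corrected docstring shows.
  logits = tf.layers.dense(features["x"], units=2)
  if mode == tf.estimator.ModeKeys.PREDICT:
    return tf.estimator.EstimatorSpec(mode, predictions=tf.argmax(logits, 1))
  loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
  train_op = tf.train.GradientDescentOptimizer(0.01).minimize(
      loss, global_step=tf.train.get_global_step())
  return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
```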
@@ -3239,8 +3239,9 @@ class Graph(object):
      # the name will still appear in _names_in_use even though the name hasn't
      # been used. This is ok, just leave _names_in_use as-is in this case.
      # TODO(skyewm): make the C API guarantee no name conflicts.
-     if ret.name not in self._names_in_use:
-       self._names_in_use[ret.name] = 1
+     name_key = ret.name.lower()
+     if name_key not in self._names_in_use:
+       self._names_in_use[name_key] = 1
      self._create_op_helper(ret, compute_device=compute_device)
    return ret
@@ -3949,20 +3950,27 @@ class Graph(object):
    """
    if self._name_stack:
      name = self._name_stack + "/" + name
-   i = self._names_in_use.get(name, 0)
-   # Increment the number for "name".
+
+   # For the sake of checking for names in use, we treat names as case
+   # insensitive (e.g. foo = Foo).
+   name_key = name.lower()
+   i = self._names_in_use.get(name_key, 0)
+   # Increment the number for "name_key".
    if mark_as_used:
-     self._names_in_use[name] = i + 1
+     self._names_in_use[name_key] = i + 1
    if i > 0:
-     base_name = name
-     # Make sure the composed name is not already used.
-     while name in self._names_in_use:
-       name = "%s_%d" % (base_name, i)
+     base_name_key = name_key
+     # Make sure the composed name key is not already used.
+     while name_key in self._names_in_use:
+       name_key = "%s_%d" % (base_name_key, i)
        i += 1
-     # Mark the composed name as used in case someone wants
+     # Mark the composed name_key as used in case someone wants
      # to call unique_name("name_1").
      if mark_as_used:
-       self._names_in_use[name] = 1
+       self._names_in_use[name_key] = 1
+
+     # Return the new name with the original capitalization of the given name.
+     name = "%s_%d" % (name, i-1)
    return name

  def get_name_scope(self):
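The user-visible effect of the now case-insensitive uniquing, sketched with ordinary op construction (illustrative; it mirrors the new test below):

```python
import tensorflow as tf

g = tf.Graph()
with g.as_default():
  a = tf.constant(1, name="foo")
  b = tf.constant(2, name="Foo")  # collides with "foo" case-insensitively
  print(a.op.name, b.op.name)     # foo Foo_1 (capitalization is preserved)
```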
@@ -965,6 +965,15 @@ class NameStackTest(test_util.TensorFlowTestCase):
      self.assertEqual("foo_1", g.unique_name("foo"))
      self.assertEqual("foo_3", g.unique_name("foo"))

  def testUniqueNameCaseInsensitivity(self):
    g = ops.Graph()
    self.assertEqual("foo", g.unique_name("foo"))
    self.assertEqual("Foo_1", g.unique_name("Foo"))
    with g.name_scope("bar"):
      self.assertEqual("bar/foo", g.unique_name("foo"))
    with g.name_scope("Bar"):
      self.assertEqual("Bar_1/foo", g.unique_name("foo"))

  def testInvalidNameRaisesError(self):
    g = ops.Graph()
    with g.name_scope(""):  # Should not raise
@@ -1390,7 +1390,7 @@ class LayoutOptimizerTest(test.TestCase):
    expected_num_transposes = 3
    self.assertEqual(expected_num_transposes, num_transposes)
    self._assert_trans_nhwc_to_nchw('map/while/Conv2D-0', nodes)
-   self._assert_trans_nchw_to_nhwc('map/while/Add-0-2', nodes)
+   self._assert_trans_nchw_to_nhwc('map/while/Add_1-0-2', nodes)
    self.assertAllClose(output_val_ref, output_val, atol=1e-3)

  def testLoopWithVecAnd4D(self):

@@ -1414,7 +1414,7 @@ class LayoutOptimizerTest(test.TestCase):
    expected_num_transposes = 2
    self.assertEqual(expected_num_transposes, num_transposes)
    self._assert_trans_nhwc_to_nchw('map/while/Conv2D-0', nodes)
-   self._assert_trans_nchw_to_nhwc('map/while/Add-0-2', nodes)
+   self._assert_trans_nchw_to_nhwc('map/while/Add_1-0-2', nodes)
    self.assertAllClose(output_val_ref, output_val, atol=1e-3)

  def testBinaryOpSecondPort(self):
@@ -893,6 +893,7 @@ tf_py_test(
        "//third_party/py/numpy",
        "//tensorflow/python:client_testlib",
        "//tensorflow/python:framework",
+       "//tensorflow/python:sparse_grad",
        "//tensorflow/python:sparse_ops",
    ],
)
@@ -364,14 +364,52 @@ class UniformUnitScalingInitializationTest(test.TestCase):

class VarianceScalingInitializationTest(test.TestCase):

  def testTruncatedNormalDistribution(self):
    shape = [100, 100]
    expect_mean = 0.
    expect_var = 1. / shape[0]
    init = init_ops.variance_scaling_initializer(
        distribution='truncated_normal')

    with self.test_session(use_gpu=True), \
        test.mock.patch.object(
            random_ops, 'truncated_normal', wraps=random_ops.truncated_normal) \
        as mock_truncated_normal:
      x = init(shape).eval()
      self.assertTrue(mock_truncated_normal.called)

    self.assertNear(np.mean(x), expect_mean, err=1e-2)
    self.assertNear(np.var(x), expect_var, err=1e-2)

  def testNormalDistribution(self):
    shape = [100, 100]
    expect_mean = 0.
    expect_var = 1. / shape[0]
    init = init_ops.variance_scaling_initializer(distribution='normal')

-   with self.test_session(use_gpu=True):
+   with self.test_session(use_gpu=True), \
+       test.mock.patch.object(
+           random_ops, 'truncated_normal', wraps=random_ops.truncated_normal) \
+       as mock_truncated_normal:
      x = init(shape).eval()
+     self.assertTrue(mock_truncated_normal.called)

    self.assertNear(np.mean(x), expect_mean, err=1e-2)
    self.assertNear(np.var(x), expect_var, err=1e-2)

  def testUntruncatedNormalDistribution(self):
    shape = [100, 100]
    expect_mean = 0.
    expect_var = 1. / shape[0]
    init = init_ops.variance_scaling_initializer(
        distribution='untruncated_normal')

    with self.test_session(use_gpu=True), \
        test.mock.patch.object(
            random_ops, 'random_normal', wraps=random_ops.random_normal) \
        as mock_random_normal:
      x = init(shape).eval()
      self.assertTrue(mock_random_normal.called)

    self.assertNear(np.mean(x), expect_mean, err=1e-2)
    self.assertNear(np.var(x), expect_var, err=1e-2)
@@ -642,6 +642,29 @@ class TileTest(test.TestCase):
      err = gradient_checker.compute_gradient_error(a, [4, 2], tiled, [4, 4])
      self.assertLess(err, 1e-3)

  def testGradientWithSparseGradWithRank1(self):
    inputs = constant_op.constant([1.0, 2.0, 3.0, 4.0],
                                  dtype=dtypes.float32)
    outputs = array_ops.gather(array_ops.tile(inputs, [3]),
                               [1, 5, 9, 3, 7, 2, 2, 2])
    with self.test_session():
      error = gradient_checker.compute_gradient_error(
          inputs, inputs.get_shape().as_list(),
          outputs, outputs.get_shape().as_list())
      self.assertLess(error, 1e-4)

  def testGradientWithSparseGradWithRank3(self):
    inputs = constant_op.constant([1.0, 2.0, 3.0, 4.0],
                                  dtype=dtypes.float32)
    inputs = array_ops.reshape(inputs, [-1, 1, 1])
    outputs = array_ops.gather(array_ops.tile(inputs, [3, 4, 2]),
                               [1, 5, 9, 3, 7, 2, 2, 2])
    with self.test_session():
      error = gradient_checker.compute_gradient_error(
          inputs, inputs.get_shape().as_list(),
          outputs, outputs.get_shape().as_list())
      self.assertLess(error, 1e-4)

  def testShapeFunctionEdgeCases(self):
    # Unknown multiples shape.
    inp = constant_op.constant(0.0, shape=[4, 4, 4, 4])
@@ -21,13 +21,15 @@ from __future__ import print_function
import numpy as np

from tensorflow.python.framework import sparse_tensor
+from tensorflow.python.ops import gradient_checker
from tensorflow.python.ops import sparse_ops
+import tensorflow.python.ops.sparse_grad  # pylint: disable=unused-import
from tensorflow.python.platform import test


class SparseSliceOpTest(test.TestCase):

- def _SparseTensor_4x6(self):
+ def _SparseTensor_4x6(self, val_dtype=np.int64):
    # [0 |  |2 |  |4 |5 ]
    # [  |11|  |13|14|  ]
    # [20|  |  |23|  |25]

@@ -37,7 +39,7 @@ class SparseSliceOpTest(test.TestCase):
                    [2, 3], [2, 5], [3, 0], [3, 2], [3, 3], [3, 5]]).astype(
                        np.int64)
    val = np.array([0, 2, 4, 5, 11, 13, 14, 20, 23, 25, 30, 32, 33, 35]).astype(
-       np.int64)
+       val_dtype)
    shape = np.array([4, 6]).astype(np.int64)
    return sparse_tensor.SparseTensor(ind, val, shape)

@@ -244,6 +246,22 @@ class SparseSliceOpTest(test.TestCase):
    self.assertAllEqual(sparse_tensor5.values.eval(), [5, 25, 35])
    self.assertAllEqual(sparse_tensor5.dense_shape.eval(), [4, 1])

  def testGradients(self):
    sp_input = self._SparseTensor_4x6(val_dtype=np.float32)
    start_and_size = [([0, 0], [4, 2]),
                      ([0, 2], [5, 2]),
                      ([0, 4], [5, 3])]

    with self.test_session(use_gpu=False):
      for start, size in start_and_size:
        sp_output = sparse_ops.sparse_slice(sp_input, start, size)
        nnz_in = len(sp_input.values.eval())
        nnz_out = len(sp_output.values.eval())

        err = gradient_checker.compute_gradient_error(
            [sp_input.values], [(nnz_in,)], sp_output.values, (nnz_out,))
        self.assertLess(err, 1e-3)


if __name__ == '__main__':
  test.main()
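For context, a minimal use of the op whose gradient the new test checks (an illustrative sketch under TF 1.x; the values are made up):

```python
import tensorflow as tf

sp = tf.SparseTensor(indices=[[0, 0], [0, 2], [1, 1]],
                     values=[1.0, 2.0, 3.0],
                     dense_shape=[2, 3])
# Keep the first two columns; the entries at (0, 0) and (1, 1) survive.
left = tf.sparse_slice(sp, start=[0, 0], size=[2, 2])
with tf.Session() as sess:
  print(sess.run(left.values))  # [1. 3.]
```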
@ -568,7 +568,6 @@ ops.NotDifferentiable("Size")
|
||||
@ops.RegisterGradient("Tile")
|
||||
def _TileGrad(op, grad):
|
||||
"""Sum reduces grad along the tiled dimensions."""
|
||||
assert isinstance(grad, ops.Tensor)
|
||||
input_shape = array_ops.shape(op.inputs[0])
|
||||
# We interleave multiples and input_shape to get split_shape,
|
||||
# reshape grad to split_shape, and reduce along all even
|
||||
@ -581,6 +580,13 @@ def _TileGrad(op, grad):
|
||||
split_shape = array_ops.reshape(
|
||||
array_ops.transpose(array_ops.stack([op.inputs[1], input_shape])), [-1])
|
||||
axes = math_ops.range(0, array_ops.size(split_shape), 2)
|
||||
# Sum reduces grad along the first dimension for IndexedSlices
|
||||
if isinstance(grad, ops.IndexedSlices):
|
||||
grad = math_ops.unsorted_segment_sum(
|
||||
grad.values,
|
||||
math_ops.mod(grad.indices, input_shape[0]),
|
||||
input_shape[0])
|
||||
split_shape = array_ops.concat([[1], split_shape[1:]], axis=0)
|
||||
input_grad = math_ops.reduce_sum(array_ops.reshape(grad, split_shape), axes)
|
||||
# Fix shape inference
|
||||
if not context.executing_eagerly():
|
||||
|
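The case the new `IndexedSlices` branch handles arises when a `tf.gather` follows a `tf.tile`, as in the tests above: the gather's gradient is an `IndexedSlices`, which is now segment-summed modulo the untiled length before the usual reshape-and-reduce. A small sketch with illustrative numbers:

```python
import tensorflow as tf

inputs = tf.constant([1.0, 2.0, 3.0, 4.0])
tiled = tf.tile(inputs, [3])          # length 12: three copies of inputs
out = tf.gather(tiled, [1, 5, 9, 3])  # gather yields an IndexedSlices grad
grad = tf.gradients(tf.reduce_sum(out), inputs)[0]
with tf.Session() as sess:
  # Index 1 is hit once in each of the three tiles; index 3 once in total.
  print(sess.run(grad))  # [0. 3. 0. 1.]
```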
@@ -3135,6 +3135,7 @@ def while_loop(cond,
      happen is that the thread updating `x` can never get ahead of the
      counter thread because the thread incrementing `x` depends on the value
      of the counter.

      ```python
+     import tensorflow as tf
@@ -43,7 +43,8 @@ from tensorflow.python.ops import linalg_ops_impl
from tensorflow.python.ops import gen_linalg_ops
from tensorflow.python.ops import math_ops
from tensorflow.python.ops import random_ops
-from tensorflow.python.util.deprecation import deprecated
+from tensorflow.python.util.deprecation import (
+    deprecated, deprecated_arg_values)
from tensorflow.python.util.tf_export import tf_export


@@ -409,8 +410,10 @@ class UniformUnitScaling(Initializer):
class VarianceScaling(Initializer):
  """Initializer capable of adapting its scale to the shape of weights tensors.

- With `distribution="normal"`, samples are drawn from a truncated normal
- distribution centered on zero, with `stddev = sqrt(scale / n)`
+ With `distribution="truncated_normal" or "untruncated_normal"`,
+ samples are drawn from a truncated/untruncated normal
+ distribution with a mean of zero and a standard deviation (after truncation,
+ if used) `stddev = sqrt(scale / n)`
  where n is:
    - number of input units in the weight tensor, if mode = "fan_in"
    - number of output units, if mode = "fan_out"

@@ -433,10 +436,14 @@ class VarianceScaling(Initializer):
      "distribution" arguments.
  """

+ @deprecated_arg_values(
+     None,
+     "`normal` is a deprecated alias for `truncated_normal`",
+     distribution="normal")
  def __init__(self,
               scale=1.0,
               mode="fan_in",
-              distribution="normal",
+              distribution="truncated_normal",
               seed=None,
               dtype=dtypes.float32):
    if scale <= 0.:

@@ -444,7 +451,8 @@ class VarianceScaling(Initializer):
    if mode not in {"fan_in", "fan_out", "fan_avg"}:
      raise ValueError("Invalid `mode` argument:", mode)
    distribution = distribution.lower()
-   if distribution not in {"normal", "uniform"}:
+   if distribution not in {"normal", "uniform",
+                           "truncated_normal", "untruncated_normal"}:
      raise ValueError("Invalid `distribution` argument:", distribution)
    self.scale = scale
    self.mode = mode

@@ -466,11 +474,15 @@ class VarianceScaling(Initializer):
      scale /= max(1., fan_out)
    else:
      scale /= max(1., (fan_in + fan_out) / 2.)
-   if self.distribution == "normal":
+   if self.distribution == "normal" or self.distribution == "truncated_normal":
+     # constant taken from scipy.stats.truncnorm.std(a=-2, b=2, loc=0., scale=1.)
+     stddev = math.sqrt(scale) / .87962566103423978
      return random_ops.truncated_normal(
          shape, 0.0, stddev, dtype, seed=self.seed)
+   elif self.distribution == "untruncated_normal":
+     stddev = math.sqrt(scale)
+     return random_ops.random_normal(
+         shape, 0.0, stddev, dtype, seed=self.seed)
    else:
      limit = math.sqrt(3.0 * scale)
      return random_ops.random_uniform(
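Typical use of the widened `distribution` argument (a sketch; `"normal"` keeps working but now warns as a deprecated alias for `"truncated_normal"`):

```python
import tensorflow as tf

# Explicitly truncated (the old "normal" behavior, without the deprecation
# warning) versus the new untruncated variant.
trunc = tf.variance_scaling_initializer(distribution="truncated_normal")
untrunc = tf.variance_scaling_initializer(distribution="untruncated_normal")

w = tf.get_variable("w", shape=[100, 100], initializer=untrunc)
```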
@@ -878,7 +878,8 @@ def sparse_softmax_cross_entropy(
      exception when this op is run on CPU, and return `NaN` for corresponding
      loss and gradient rows on GPU.
    logits: Unscaled log probabilities of shape
-     `[d_0, d_1, ..., d_{r-1}, num_classes]` and dtype `float32` or `float64`.
+     `[d_0, d_1, ..., d_{r-1}, num_classes]` and dtype `float16`, `float32` or
+     `float64`.
    weights: Coefficients for the loss. This must be scalar or broadcastable to
      `labels` (i.e. same rank and each dimension is either 1 or the same).
    scope: the scope for the operations performed in computing the loss.
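A minimal sketch exercising the newly documented `float16` support (illustrative values):

```python
import tensorflow as tf

labels = tf.constant([0, 1])
logits = tf.constant([[2.0, 0.5], [0.3, 1.7]], dtype=tf.float16)
loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
with tf.Session() as sess:
  print(sess.run(loss))
```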
Some files were not shown because too many files have changed in this diff.