Merge changes from github.

PiperOrigin-RevId: 184897758
This commit is contained in:
Michael Case 2018-02-07 14:36:00 -08:00 committed by TensorFlower Gardener
parent 8461760f9f
commit d90054e7c0
256 changed files with 4471 additions and 891 deletions

View File

@ -4,7 +4,7 @@ https://stackoverflow.com/questions/tagged/tensorflow
If you open a GitHub issue, here is our policy:
1. It must be a bug or a feature request.
1. It must be a bug, a feature request, or a significant problem with documentation (for small docs fixes please send a PR instead).
2. The form below must be filled out.
3. It shouldn't be a TensorBoard issue. Those go [here](https://github.com/tensorflow/tensorboard/issues).

View File

@ -6,7 +6,7 @@
| **`Linux CPU`** | **`Linux GPU`** | **`Mac OS CPU`** | **`Windows CPU`** | **`Android`** |
|-----------------|---------------------|------------------|-------------------|---------------|
| [![Build Status](https://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-cpu)](https://ci.tensorflow.org/job/tensorflow-master-cpu) | [![Build Status](https://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-linux-gpu)](https://ci.tensorflow.org/job/tensorflow-master-linux-gpu) | [![Build Status](https://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-mac)](https://ci.tensorflow.org/job/tensorflow-master-mac) | [![Build Status](https://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-win-cmake-py)](https://ci.tensorflow.org/job/tensorflow-master-win-cmake-py) | [![Build Status](https://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-android)](https://ci.tensorflow.org/job/tensorflow-master-android) |
| [![Build Status](https://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-cpu)](https://ci.tensorflow.org/job/tensorflow-master-cpu) | [![Build Status](https://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-linux-gpu)](https://ci.tensorflow.org/job/tensorflow-master-linux-gpu) | [![Build Status](https://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-mac)](https://ci.tensorflow.org/job/tensorflow-master-mac) | [![Build Status](https://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-win-cmake-py)](https://ci.tensorflow.org/job/tensorflow-master-win-cmake-py) | [![Build Status](https://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-android)](https://ci.tensorflow.org/job/tensorflow-master-android) [ ![Download](https://api.bintray.com/packages/google/tensorflow/tensorflow/images/download.svg) ](https://bintray.com/google/tensorflow/tensorflow/_latestVersion) |
**TensorFlow** is an open source software library for numerical computation using
data flow graphs. The graph nodes represent mathematical operations, while
@ -27,7 +27,7 @@ guidelines](CONTRIBUTING.md). This project adheres to TensorFlow's
uphold this code.**
**We use [GitHub issues](https://github.com/tensorflow/tensorflow/issues) for
tracking requests and bugs. So please see
[TensorFlow Discuss](https://groups.google.com/a/tensorflow.org/forum/#!forum/discuss) for general questions
and discussion, and please direct specific questions to [Stack Overflow](https://stackoverflow.com/questions/tagged/tensorflow).**

View File

@ -1,18 +1,39 @@
# Release 1.5.0
## Breaking Changes
* Prebuilt binaries are now built against CUDA 9 and cuDNN 7.
* Prebuilt binaries are now built against CUDA 9.0 and cuDNN 7.
* Our Linux binaries are built using ubuntu 16 containers, potentially
introducing glibc incompatibility issues with ubuntu 14.
* Starting from 1.6 release, our prebuilt binaries will use AVX instructions.
This may break TF on older CPUs.
## Known Bugs
* Using XLA:GPU with CUDA 9 and CUDA 9.1 results in garbage results and/or
`CUDA_ILLEGAL_ADDRESS` failures.
Google discovered in mid-December 2017 that the PTX-to-SASS compiler in CUDA 9
and CUDA 9.1 sometimes does not properly compute the carry bit when
decomposing 64-bit address calculations with large offsets (e.g. `load [x +
large_constant]`) into 32-bit arithmetic in SASS.
As a result, these versions of `ptxas` miscompile most XLA programs which use
more than 4GB of temp memory. This results in garbage results and/or
`CUDA_ERROR_ILLEGAL_ADDRESS` failures.
A fix in CUDA 9.1.121 is expected in late February 2018. We do not expect a
fix for CUDA 9.0.x. Until the fix is available, the only workaround is to
[downgrade](https://developer.nvidia.com/cuda-toolkit-archive) to CUDA 8.0.x
or disable XLA:GPU.
TensorFlow will print a warning if you use XLA:GPU with a known-bad version of
CUDA; see e00ba24c4038e7644da417ddc639169b6ea59122.
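A minimal sketch of the miscompile described above, with made-up example values: splitting a 64-bit address computation into 32-bit halves is only correct if the carry out of the low half is propagated into the high half.

```python
# Hypothetical illustration of the carry-bit hazard (example values only,
# not actual ptxas output): decomposing `load [x + large_constant]`.
x = 0x1FFFFFFF0                  # 64-bit base address
large_constant = 0x20            # offset that overflows the low 32 bits

lo = (x & 0xFFFFFFFF) + large_constant
carry = lo >> 32                 # 1 here; the buggy ptxas sometimes drops it
hi = (x >> 32) + carry
addr = (hi << 32) | (lo & 0xFFFFFFFF)

assert addr == x + large_constant            # correct with the carry
bad = ((x >> 32) << 32) | (lo & 0xFFFFFFFF)  # carry dropped
assert bad != addr                           # -> garbage address
```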
## Major Features And Improvements
* [Eager execution](https://github.com/tensorflow/tensorflow/tree/r1.5/tensorflow/contrib/eager)
preview version is now available.
* [TensorFlow Lite](https://github.com/tensorflow/tensorflow/tree/r1.5/tensorflow/contrib/lite)
dev preview is now available.
* CUDA 9 and cuDNN 7 support.
* CUDA 9.0 and cuDNN 7 support.
* Accelerated Linear Algebra (XLA):
* Add `complex64` support to XLA compiler.
* `bfloat` support is now added to XLA infrastructure.
@ -523,7 +544,7 @@ answered questions, and were part of inspiring discussions.
* Fixed LIBXSMM integration.
* Make decode_jpeg/decode_png/decode_gif handle all formats, since users frequently try to decode an image as the wrong type.
* Improve implicit broadcasting lowering.
* Improving stability of GCS/Bigquery clients by a faster retrying of stale transmissions.
* Improving stability of GCS/BigQuery clients by a faster retrying of stale transmissions.
* Remove OpKernelConstruction::op_def() as part of minimizing proto dependencies.
* VectorLaplaceDiag distribution added.
* Android demo no longer requires libtensorflow_demo.so to run (libtensorflow_inference.so still required)

View File

@ -41,12 +41,12 @@ load("//tensorflow:workspace.bzl", "tf_workspace")
tf_workspace()
new_http_archive(
name = "inception5h",
name = "inception_v1",
build_file = "models.BUILD",
sha256 = "d13569f6a98159de37e92e9c8ec4dae8f674fbf475f69fe6199b514f756d4364",
sha256 = "7efe12a8363f09bc24d7b7a450304a15655a57a7751929b2c1593a71183bb105",
urls = [
"http://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip",
"http://download.tensorflow.org/models/inception5h.zip",
"http://storage.googleapis.com/download.tensorflow.org/models/inception_v1.zip",
"http://download.tensorflow.org/models/inception_v1.zip",
],
)

View File

@ -298,7 +298,7 @@ def get_var(environ_cp,
System".
enabled_by_default: boolean for default behavior.
question: optional string for how to ask for user input.
yes_reply: optionanl string for reply when feature is enabled.
yes_reply: optional string for reply when feature is enabled.
no_reply: optional string for reply when feature is disabled.
Returns:
@ -411,7 +411,7 @@ def set_action_env_var(environ_cp,
System".
enabled_by_default: boolean for default behavior.
question: optional string for how to ask for user input.
yes_reply: optionanl string for reply when feature is enabled.
yes_reply: optional string for reply when feature is enabled.
no_reply: optional string for reply when feature is disabled.
"""
var = int(
@ -1354,6 +1354,7 @@ def main():
environ_cp['TF_NEED_GCP'] = '0'
environ_cp['TF_NEED_HDFS'] = '0'
environ_cp['TF_NEED_JEMALLOC'] = '0'
environ_cp['TF_NEED_KAFKA'] = '0'
environ_cp['TF_NEED_OPENCL_SYCL'] = '0'
environ_cp['TF_NEED_COMPUTECPP'] = '0'
environ_cp['TF_NEED_OPENCL'] = '0'
@ -1372,6 +1373,8 @@ def main():
'with_hdfs_support', True, 'hdfs')
set_build_var(environ_cp, 'TF_NEED_S3', 'Amazon S3 File System',
'with_s3_support', True, 's3')
set_build_var(environ_cp, 'TF_NEED_KAFKA', 'Apache Kafka Platform',
'with_kafka_support', False, 'kafka')
set_build_var(environ_cp, 'TF_ENABLE_XLA', 'XLA JIT', 'with_xla_support',
False, 'xla')
set_build_var(environ_cp, 'TF_NEED_GDR', 'GDR', 'with_gdr_support',

View File

@ -211,6 +211,12 @@ config_setting(
visibility = ["//visibility:public"],
)
config_setting(
name = "with_kafka_support",
define_values = {"with_kafka_support": "true"},
visibility = ["//visibility:public"],
)
# Crosses between platforms and file system libraries not supported on those
# platforms due to limitations in nested select() statements.
config_setting(

View File

@ -433,6 +433,7 @@ tf_gen_op_wrappers_cc(
"linalg_ops",
"logging_ops",
"lookup_ops",
"manip_ops",
"math_ops",
"nn_ops",
"no_op",

View File

@ -71,7 +71,7 @@ class FreezeTest : public ::testing::Test {
return Status::OK();
}
// Adds `graph_def` to `saved_model_bundle` and intializes a session with
// Adds `graph_def` to `saved_model_bundle` and initializes a session with
// `init_node`.
Status AddGraphDefToSavedModelBundle(const GraphDef& graph_def,
const string& init_node,

View File

@ -132,7 +132,10 @@ tf_library(
config = "test_graph_tfadd.config.pbtxt",
cpp_class = "AddComp",
graph = "test_graph_tfadd.pbtxt",
tags = ["manual"],
tags = [
"manual",
"notap",
],
)
# A test of tf_library that includes a graph with an unknown op, but where
@ -143,7 +146,10 @@ tf_library(
config = "test_graph_tfunknownop.config.pbtxt",
cpp_class = "UnknownOpAddComp",
graph = "test_graph_tfunknownop.pbtxt",
tags = ["manual"],
tags = [
"manual",
"notap",
],
)
# A test of tf_library that includes a graph with an unknown op, but where
@ -155,7 +161,10 @@ tf_library(
config = "test_graph_tfunknownop2.config.pbtxt",
cpp_class = "UnknownOpAddComp",
graph = "test_graph_tfunknownop.pbtxt",
tags = ["manual"],
tags = [
"manual",
"notap",
],
)
# A test of tf_library that includes a graph with an unknown op, but where
@ -166,7 +175,10 @@ tf_library(
config = "test_graph_tfunknownop3.config.pbtxt",
cpp_class = "UnknownOpAddComp",
graph = "test_graph_tfunknownop.pbtxt",
tags = ["manual"],
tags = [
"manual",
"notap",
],
)
# Utility library for benchmark binaries, used by the *_benchmark rules that are

View File

@ -74,7 +74,10 @@ tf_library(
# compile but the others in this directory succeed, you may need to
# expand the "required by all tf_library targets" list in tfcompile.bzl.
include_standard_runtime_deps = False,
tags = ["manual"],
tags = [
"manual",
"notap",
],
)
tf_library(
@ -84,7 +87,10 @@ tf_library(
cpp_class = "AddWithCkptComp",
freeze_checkpoint = "test_graph_tfadd_with_ckpt.ckpt",
graph = "test_graph_tfadd_with_ckpt.pb",
tags = ["manual"],
tags = [
"manual",
"notap",
],
)
tf_library(
@ -95,7 +101,10 @@ tf_library(
freeze_checkpoint = "test_graph_tfadd_with_ckpt_saver.ckpt",
freeze_saver = "test_graph_tfadd_with_ckpt_saver.saver",
graph = "test_graph_tfadd_with_ckpt_saver.pb",
tags = ["manual"],
tags = [
"manual",
"notap",
],
)
tf_library(
@ -104,7 +113,10 @@ tf_library(
config = "test_graph_tffunction.config.pbtxt",
cpp_class = "FunctionComp",
graph = "test_graph_tffunction.pb",
tags = ["manual"],
tags = [
"manual",
"notap",
],
)
tf_library(
@ -113,7 +125,10 @@ tf_library(
config = "test_graph_tfgather.config.pbtxt",
cpp_class = "GatherComp",
graph = "test_graph_tfgather.pb",
tags = ["manual"],
tags = [
"manual",
"notap",
],
)
tf_library(
@ -122,7 +137,10 @@ tf_library(
config = "test_graph_tfmatmul.config.pbtxt",
cpp_class = "foo::bar::MatMulComp",
graph = "test_graph_tfmatmul.pb",
tags = ["manual"],
tags = [
"manual",
"notap",
],
)
tf_library(
@ -131,7 +149,10 @@ tf_library(
config = "test_graph_tfmatmulandadd.config.pbtxt",
cpp_class = "MatMulAndAddComp",
graph = "test_graph_tfmatmulandadd.pb",
tags = ["manual"],
tags = [
"manual",
"notap",
],
tfcompile_flags = "--gen_name_to_index --gen_program_shape",
)
@ -141,13 +162,19 @@ tf_library(
config = "test_graph_tfsplits.config.pbtxt",
cpp_class = "SplitsComp",
graph = "test_graph_tfsplits.pb",
tags = ["manual"],
tags = [
"manual",
"notap",
],
)
tf_cc_test(
name = "tfcompile_test",
srcs = ["tfcompile_test.cc"],
tags = ["manual"],
tags = [
"manual",
"notap",
],
deps = [
":test_graph_tfadd",
":test_graph_tfadd_with_ckpt",

View File

@ -774,15 +774,15 @@ class BinaryOpsTest(XLATestCase):
def DISABLED_testSparseMatMul(self):
# Binary wrappers for sparse_matmul with different hints
def SparseMatmulWrapperTF(a, b):
return tf.sparse_matmul(a, b, a_is_sparse=True)
return math_ops.sparse_matmul(a, b, a_is_sparse=True)
def SparseMatmulWrapperFT(a, b):
return tf.sparse_matmul(a, b, b_is_sparse=True)
return math_ops.sparse_matmul(a, b, b_is_sparse=True)
def SparseMatmulWrapperTT(a, b):
return tf.sparse_matmul(a, b, a_is_sparse=True, b_is_sparse=True)
return math_ops.sparse_matmul(a, b, a_is_sparse=True, b_is_sparse=True)
self._testMatMul(tf.sparse_matmul)
self._testMatMul(math_ops.sparse_matmul)
self._testMatMul(SparseMatmulWrapperTF)
self._testMatMul(SparseMatmulWrapperFT)
self._testMatMul(SparseMatmulWrapperTT)

View File

@ -38,8 +38,22 @@ class PoolingOp : public XlaOpKernel {
PoolingOp(OpKernelConstruction* ctx, int num_spatial_dims)
: XlaOpKernel(ctx), num_spatial_dims_(num_spatial_dims) {
if (ctx->num_inputs() == 1) {
OP_REQUIRES_OK(ctx, ctx->GetAttr("ksize", &ksize_));
OP_REQUIRES_OK(ctx, ctx->GetAttr("strides", &stride_));
std::vector<int32> ksize_int;
std::vector<int32> stride_int;
OP_REQUIRES_OK(ctx, ctx->GetAttr("ksize", &ksize_int));
OP_REQUIRES(ctx, ksize_int.size() == num_dims(),
errors::InvalidArgument("Sliding window ksize field must "
"specify ",
num_dims(), " dimensions"));
OP_REQUIRES_OK(ctx, ctx->GetAttr("strides", &stride_int));
OP_REQUIRES(ctx, stride_int.size() == num_dims(),
errors::InvalidArgument("Sliding window stride field must "
"specify ",
num_dims(), " dimensions"));
for (int i = 0; i < num_dims(); ++i) {
ksize_.push_back(ksize_int[i]);
stride_.push_back(stride_int[i]);
}
}
Padding padding;
OP_REQUIRES_OK(ctx, ctx->GetAttr("padding", &padding));
@ -65,28 +79,33 @@ class PoolingOp : public XlaOpKernel {
xla::ComputationDataHandle input = ctx->Input(0);
const TensorShape input_shape = ctx->InputShape(0);
std::vector<int64> ksize = ksize_;
std::vector<int64> stride = stride_;
if (ctx->num_inputs() != 1) {
const TensorShape ksize_shape = ctx->InputShape(1);
// Validate input sizes.
OP_REQUIRES(ctx, TensorShapeUtils::IsVector(ksize_shape),
errors::InvalidArgument("ksize must be a vector, not shape ",
ksize_shape.DebugString()));
OP_REQUIRES_OK(ctx, ctx->ConstantInputAsIntVector(1, &ksize_));
OP_REQUIRES(ctx, ksize_shape.num_elements() == num_dims(),
errors::InvalidArgument("Sliding window ksize field must "
"specify ",
num_dims(), " dimensions"));
ksize.clear();
OP_REQUIRES_OK(ctx, ctx->ConstantInputAsIntVector(1, &ksize));
const TensorShape stride_shape = ctx->InputShape(2);
// Validate input sizes.
OP_REQUIRES(ctx, TensorShapeUtils::IsVector(stride_shape),
errors::InvalidArgument("stride must be a vector, not shape ",
stride_shape.DebugString()));
OP_REQUIRES_OK(ctx, ctx->ConstantInputAsIntVector(2, &stride_));
OP_REQUIRES(ctx, stride_shape.num_elements() == num_dims(),
errors::InvalidArgument("Sliding window stride field must "
"specify ",
num_dims(), " dimensions"));
stride.clear();
OP_REQUIRES_OK(ctx, ctx->ConstantInputAsIntVector(2, &stride));
}
OP_REQUIRES(ctx, ksize_.size() == num_dims(),
errors::InvalidArgument("Sliding window ksize field must "
"specify ",
num_dims(), " dimensions"));
OP_REQUIRES(ctx, stride_.size() == num_dims(),
errors::InvalidArgument("Sliding window stride field must "
"specify ",
num_dims(), " dimensions"));
OP_REQUIRES(ctx, input_shape.dims() == num_dims(),
errors::InvalidArgument("Input to ", type_string(),
" operator must have ", num_dims(),
@ -94,8 +113,8 @@ class PoolingOp : public XlaOpKernel {
const DataType type = input_type(0);
xla::ComputationDataHandle pooled = ctx->builder()->ReduceWindow(
input, InitValue(ctx->builder(), type), *Reduction(ctx, type), ksize_,
stride_, padding_);
input, InitValue(ctx->builder(), type), *Reduction(ctx, type), ksize,
stride, padding_);
ctx->SetOutput(0, PostProcessOutput(ctx, pooled, type, input_shape));
}

View File

@ -67,7 +67,7 @@ class ComputationBuilder {
// OpMetadata is often applied to a series of XLA HLO instructions. As a
// result, OpMetadata is set on the Computation Builder. All subsequent
// instructions generated via this Computation Builder will have the same
// OpMetadata attached until a call to ClearOpMetdata.
// OpMetadata attached until a call to ClearOpMetadata.
void SetOpMetadata(const OpMetadata& metadata) { metadata_ = metadata; }
// Clears the HloMetadata state.

View File

@ -2173,7 +2173,7 @@ bool HloParser::ParseConvolutionDimensionNumbers(
//
// {[2:3:4], [5:6:7], [8:9]}
//
// The the parsed result will be:
// The parsed result will be:
//
// {/*starts=*/{2, 5, 8}, /*limits=*/{3, 6, 9}, /*strides=*/{4, 7, 1}}
//

View File

@ -50,6 +50,7 @@ py_library(
"//tensorflow/contrib/image:single_image_random_dot_stereograms_py",
"//tensorflow/contrib/input_pipeline:input_pipeline_py",
"//tensorflow/contrib/integrate:integrate_py",
"//tensorflow/contrib/kafka",
"//tensorflow/contrib/keras",
"//tensorflow/contrib/kernel_methods",
"//tensorflow/contrib/kfac",
@ -142,6 +143,7 @@ cc_library(
"//tensorflow/contrib/factorization:all_ops",
"//tensorflow/contrib/framework:all_ops",
"//tensorflow/contrib/input_pipeline:input_pipeline_ops_op_lib",
"//tensorflow/contrib/kafka:kafka_ops_op_lib",
"//tensorflow/contrib/layers:sparse_feature_cross_op_op_lib",
"//tensorflow/contrib/nccl:nccl_ops_op_lib",
"//tensorflow/contrib/nearest_neighbor:nearest_neighbor_ops_op_lib",

View File

@ -194,6 +194,11 @@ public class TensorFlowInferenceInterface {
* @param outputNames A list of output nodes which should be filled by the inference pass.
*/
public void run(String[] outputNames, boolean enableStats) {
run(outputNames, enableStats, new String[] {});
}
/** An overloaded version of runInference that allows supplying targetNodeNames as well */
public void run(String[] outputNames, boolean enableStats, String[] targetNodeNames) {
// Release any Tensors from the previous run calls.
closeFetches();
@ -204,6 +209,11 @@ public class TensorFlowInferenceInterface {
runner.fetch(tid.name, tid.outputIndex);
}
// Add targets.
for (String t : targetNodeNames) {
runner.addTarget(t);
}
// Run the session.
try {
if (enableStats) {

View File

@ -6,6 +6,7 @@ tensorflow/core/example
tensorflow/core/framework
tensorflow/core/lib
tensorflow/core/lib/core
tensorflow/core/profiler
tensorflow/core/protobuf
tensorflow/core/util
tensorflow/examples
@ -219,6 +220,8 @@ tensorflow/contrib/input_pipeline/python/ops
tensorflow/contrib/integrate
tensorflow/contrib/integrate/python
tensorflow/contrib/integrate/python/ops
tensorflow/contrib/kafka/python
tensorflow/contrib/kafka/python/ops
tensorflow/contrib/keras
tensorflow/contrib/keras/api
tensorflow/contrib/keras/api/keras

View File

@ -30,6 +30,7 @@ set(tf_op_lib_names
"list_ops"
"lookup_ops"
"logging_ops"
"manip_ops"
"math_ops"
"nn_ops"
"no_op"

View File

@ -335,6 +335,7 @@ GENERATE_PYTHON_OP_LIB("list_ops")
GENERATE_PYTHON_OP_LIB("logging_ops")
GENERATE_PYTHON_OP_LIB("lookup_ops")
GENERATE_PYTHON_OP_LIB("nn_ops")
GENERATE_PYTHON_OP_LIB("manip_ops")
GENERATE_PYTHON_OP_LIB("parsing_ops")
GENERATE_PYTHON_OP_LIB("random_ops")
GENERATE_PYTHON_OP_LIB("remote_fused_graph_ops"

View File

@ -31,7 +31,7 @@ from __future__ import division
from __future__ import print_function
import argparse
import io
import codecs
import os
import re
import subprocess
@ -103,7 +103,7 @@ def main():
for lib_path in args.input:
proc = subprocess.Popen([DUMPBIN, "/nologo", "/linkermember:1", lib_path],
stdout=subprocess.PIPE)
for line in io.TextIOWrapper(proc.stdout, encoding="utf-8"):
for line in codecs.getreader("utf-8")(proc.stdout):
cols = line.split()
if len(cols) < 2:
continue
@ -131,7 +131,7 @@ def main():
# We compare on undname but use the decorated name from candidates.
dupes = 0
proc = subprocess.Popen([UNDNAME, tmpfile.name], stdout=subprocess.PIPE)
for idx, line in enumerate(io.TextIOWrapper(proc.stdout, encoding="utf-8")):
for idx, line in enumerate(codecs.getreader("utf-8")(proc.stdout)):
decorated = candidates[idx]
if decorated in taken:
# Symbol is already in output, done.
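For context, a standalone sketch of the replacement pattern: `io.TextIOWrapper` requires a buffered binary stream, while a `codecs.StreamReader` wraps any file-like object, which is presumably why the script switched (the Python 2 compatibility motive is an assumption; this diff does not state it).

```python
import codecs
import subprocess

# codecs.getreader("utf-8") returns a StreamReader class; instantiating it
# with a pipe yields an iterable of decoded unicode lines on Python 2 and 3.
proc = subprocess.Popen(["echo", "hello"], stdout=subprocess.PIPE)
for line in codecs.getreader("utf-8")(proc.stdout):
    print(line.rstrip())
```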

View File

@ -30,7 +30,7 @@ following sense:
around,
- The number of CDF axes does not extend, i.e., `CDF.ndim == data.ndim + 1`.
In the previous example where data has shape (10, 10), the followings are
In the previous example where data has shape (10, 10), the following are
acceptable CDF shapes:
- (10, 10, 65)

View File

@ -276,7 +276,7 @@ void RangeEncoder::Finalize(string* sink) {
}
} else if (base_ != 0) {
// If base == 0, then pick 0 from [base, base + size) and no zeros are
// explcitly written.
// explicitly written.
//
// Otherwise, pick (base + (2^16 - base[16:0])), i.e., round up base to the
// next multiple of 2^16. As 2^16 < size, this value should be in the
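A worked example of the formula in the comment above, with a hypothetical `base`: `base + (2^16 - base[16:0])` rounds `base` up to the next multiple of 2^16.

```python
base = 0x12345                      # example value with nonzero low 16 bits
low16 = base & 0xFFFF               # base[16:0] in the comment's notation
rounded = base + (0x10000 - low16)  # -> 0x20000, next multiple of 2^16
assert rounded % 0x10000 == 0 and rounded > base
```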

View File

@ -20,6 +20,7 @@ from __future__ import print_function
import time
from six.moves import xrange # pylint: disable=redefined-builtin
from tensorflow.contrib import rnn as contrib_rnn
from tensorflow.contrib.cudnn_rnn.python.ops import cudnn_rnn_ops
from tensorflow.contrib.rnn.python.ops import lstm_ops

View File

@ -178,7 +178,7 @@ class Evaluator(object):
call_op: An op that updates evaluation state on a mini-batch of examples.
Must generate a tf.errors.OutOfRangeError when done.
results_op: A dictionary of tensors that compute the final evaluation
results from the evaulation state.
results from the evaluation state.
sess: The Session to run the evaluation in. Defaults to the default
Session.

View File

@ -34,7 +34,7 @@ bazel run -c opt --config=cuda :resnet50_graph_test -- --benchmarks=.
(Or remove the `--config=cuda` flag for running on CPU instead of GPU).
On October 31, 2017, the benchmarks demostrated comparable performance
On October 31, 2017, the benchmarks demonstrated comparable performance
for eager and graph execution of this particular model when using
a single NVIDIA Titan X (Pascal) GPU on a host with an
Intel Xeon E5-1650 CPU @ 3.50GHz and a batch size of 32.

View File

@ -97,7 +97,7 @@ class _ConvBlock(tfe.Network):
Args:
kernel_size: the kernel size of middle conv layer at main path
filters: list of integers, the filterss of 3 conv layer at main path
filters: list of integers, the filters of 3 conv layer at main path
stage: integer, current stage label, used for generating layer names
block: 'a','b'..., current block label, used for generating layer names
data_format: data_format for the input ('channels_first' or

View File

@ -22,6 +22,7 @@ import gc
import tempfile
import time
from six.moves import xrange # pylint: disable=redefined-builtin
import tensorflow as tf
import tensorflow.contrib.eager as tfe

View File

@ -40,7 +40,7 @@ bazel run -c opt --config=cuda :rnn_ptb_graph_test -- --benchmarks=.
(Or remove the `--config=cuda` flag for running on CPU instead of GPU).
On October 31, 2017, the benchmarks demostrated slightly better performance
On October 31, 2017, the benchmarks demonstrated slightly better performance
(3-6%) for graph execution over eager execution for this particular model when
using a single NVIDIA Titan X (Pascal) GPU on a host with an Intel Xeon E5-1650
CPU @ 3.50GHz and a batch size of 32.

View File

@ -88,7 +88,7 @@ class Embedding(tf.layers.Layer):
class PTBModel(tfe.Network):
"""LSTM for word language modelling.
"""LSTM for word language modeling.
Model described in:
(Zaremba, et. al.) Recurrent Neural Network Regularization
@ -339,8 +339,7 @@ if __name__ == "__main__":
"http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz")
parser.add_argument(
"--logdir", type=str, default="", help="Directory for checkpoint.")
parser.add_argument(
"--epoch", type=int, default=20, help="Number of epoches.")
parser.add_argument("--epoch", type=int, default=20, help="Number of epochs.")
parser.add_argument("--batch-size", type=int, default=20, help="Batch size.")
parser.add_argument(
"--seq-len", type=int, default=35, help="Sequence length.")

View File

@ -51,11 +51,11 @@ def get_non_parenthesis_words(items):
"""Get the non-parenthesis items from a SNLI parsed sentence.
Args:
items: Data items from a parsed SNLI setence, with parentheses. E.g.,
items: Data items from a parsed SNLI sentence, with parentheses. E.g.,
["(", "Man", "(", "(", "(", "(", "(", "wearing", "pass", ")", ...
Returns:
A list of non-parenthis word items, all converted to lower case. E.g.,
A list of non-parentheses word items, all converted to lower case. E.g.,
["man", "wearing", "pass", ...
"""
return [x.lower() for x in items if x not in PARENTHESES and x]
@ -201,7 +201,7 @@ def load_word_vectors(data_root, vocab):
def calculate_bins(length2count, min_bin_size):
"""Cacluate bin boundaries given a histogram of lengths and mininum bin size.
"""Calculate bin boundaries given a histogram of lengths and minimum bin size.
Args:
length2count: A `dict` mapping length to sentence count.
@ -335,9 +335,9 @@ class SnliData(object):
# The sorting above and the batching here makes sure that sentences of
# similar max lengths are batched together, minimizing the inefficiency
# due to uneven max lengths. The sentences are batched differently in
# each call to get_generator() due to the shuffling before sotring
# each call to get_generator() due to the shuffling before sorting
# above. The pad_and_reverse_word_ids() and pad_transitions() functions
# take care of any remaning unevenness of the max sentence lengths.
# take care of any remaining unevenness of the max sentence lengths.
end = min(begin + batch_size, len(labels))
# Transpose, because the SPINN model requires time-major, instead of
# batch-major.

View File

@ -26,6 +26,7 @@ import tempfile
import time
import numpy as np
from six.moves import xrange # pylint: disable=redefined-builtin
import tensorflow as tf
# pylint: disable=g-bad-import-order

View File

@ -539,7 +539,7 @@ class NetworkTest(test.TestCase):
# No issue here since the name is unique within its scope.
name_conflict3 = MyNetwork(name="name_conflict")
net2 = MyNetwork() # name=outside_scope/my_network_2 to avoid the
# variable_scope my_network_1 below.
vs_name_conflict = MyNetwork(name="vs_name_conflict") # conflict below
with variable_scope.variable_scope("intervening_scope"):
with variable_scope.variable_scope(captured_scope):
@ -688,7 +688,7 @@ class NetworkTest(test.TestCase):
net2(one)
# Layer names typically are globally unique rather than being unique within
# the scope of their first use. However, within a Network they must be named
# locally so that previous Layer consutrciton does not interfere with
# locally so that previous Layer construction does not interfere with
# variable naming (e.g. add a Layer construction before the Network,
# suddenly your previously saved checkpoint is incompatible).
self.assertEqual("dense", net1.l1.name)

View File

@ -82,7 +82,7 @@ def restore_variables_on_create(save_path, map_func=None):
map_func_wrapper = lambda self, x: x
else:
if not callable(map_func):
raise ValueError("map_func must be callaled.")
raise ValueError("map_func must be callable.")
map_func_wrapper = lambda self, x: map_func(x)
ckpt_var_cache = dict()

View File

@ -102,16 +102,12 @@ REGISTER_OP("DecodeVideo")
return Status::OK();
})
.Doc(R"doc(
Processes the contents of an audio file into a tensor using FFmpeg to decode
Processes the contents of a video file into a tensor using FFmpeg to decode
the file.
One row of the tensor is created for each channel in the audio file. Each
channel contains audio samples starting at the beginning of the audio and
having `1/samples_per_second` time between them. If the `channel_count` is
different from the contents of the file, channels will be merged or created.
contents: The binary audio file contents, as a string or rank-0 string
tensor.
contents: The binary contents of the video file to decode. This is a
scalar.
output: A rank-4 `Tensor` that has `[frames, height, width, 3]` RGB as output.
)doc");
} // namespace ffmpeg
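A hedged sketch of driving this op from Python; the `decode_video` wrapper name and its location in `tensorflow.contrib.ffmpeg` are assumptions, not confirmed by this diff.

```python
import tensorflow as tf
from tensorflow.contrib import ffmpeg  # wrapper location assumed

video_bytes = tf.read_file("input.mp4")    # scalar string: raw file contents
frames = ffmpeg.decode_video(video_bytes)  # assumed wrapper for DecodeVideo
# frames: rank-4 tensor, [frames, height, width, 3] RGB, per the doc above
```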

View File

@ -25,6 +25,7 @@ import re
from tensorflow.contrib.framework.python.ops import add_arg_scope as contrib_add_arg_scope
from tensorflow.contrib.framework.python.ops import gen_variable_ops
from tensorflow.contrib.util import loader
from tensorflow.core.protobuf import saver_pb2
from tensorflow.python import pywrap_tensorflow
from tensorflow.python.framework import device as tf_device
from tensorflow.python.framework import dtypes
@ -684,7 +685,8 @@ def assign_from_checkpoint_fn(model_path, var_list, ignore_missing_vars=False,
'Variable %s missing in checkpoint %s', var, model_path)
var_list = available_vars
if var_list:
saver = tf_saver.Saver(var_list, reshape=reshape_variables)
saver = tf_saver.Saver(var_list, reshape=reshape_variables,
write_version=saver_pb2.SaverDef.V1)
def callback(session):
saver.restore(session, model_path)
return callback

View File

@ -28,6 +28,7 @@ from __future__ import division
from __future__ import print_function
import functools
import os
import sys
import tarfile
@ -189,20 +190,34 @@ def get_graph_def_from_resource(filename):
return graph_pb2.GraphDef.FromString(resource_loader.load_resource(filename))
def get_graph_def_from_url_tarball(url, filename):
"""Get a GraphDef proto from a tarball on the web."""
def _progress(count, block_size, total_size):
sys.stdout.write('\r>> Downloading %s %.1f%%' % (
url, float(count * block_size) / float(total_size) * 100.0))
sys.stdout.flush()
tar_filename, _ = urllib.request.urlretrieve(url, reporthook=_progress)
def get_graph_def_from_url_tarball(url, filename, tar_filename=None):
"""Get a GraphDef proto from a tarball on the web.
Args:
url: Web address of tarball
filename: Filename of graph definition within tarball
tar_filename: Temporary download filename (None = always download)
Returns:
A GraphDef loaded from a file in the downloaded tarball.
"""
if not (tar_filename and os.path.exists(tar_filename)):
def _progress(count, block_size, total_size):
sys.stdout.write('\r>> Downloading %s %.1f%%' %
(url,
float(count * block_size) / float(total_size) * 100.0))
sys.stdout.flush()
tar_filename, _ = urllib.request.urlretrieve(url, tar_filename, _progress)
with tarfile.open(tar_filename, 'r:gz') as tar:
proto_str = tar.extractfile(filename).read()
return graph_pb2.GraphDef.FromString(proto_str)
def _default_graph_def_fn():
return get_graph_def_from_url_tarball(INCEPTION_URL, INCEPTION_FROZEN_GRAPH)
return get_graph_def_from_url_tarball(INCEPTION_URL, INCEPTION_FROZEN_GRAPH,
os.path.basename(INCEPTION_URL))
def run_inception(images,

View File

@ -620,7 +620,7 @@ class CombineAdversarialLossTest(test.TestCase):
with self.test_session(use_gpu=True) as sess:
for _ in range(10): # spot check closeness on more than one sample.
gnorm_np, precond_gnorm_np = sess.run([gnorm, precond_gnorm])
self.assertNear(gnorm_np, precond_gnorm_np, 1e-5)
self.assertNear(gnorm_np, precond_gnorm_np, 1e-4)
class CycleConsistencyLossTest(test.TestCase):

View File

@ -1,60 +1,67 @@
# TensorFlow Runtime with HVX Acceleration
## Description
This README explains how to build and use the TensorFlow runtime with HVX Acceleration. HVX is an extension of Hexagon, a DSP provided by Qualcomm, which can compute vector calculations faster using less energy than ARM processors.
This README explains how to build and use the TensorFlow Runtime with HVX Acceleration. HVX is an extension of Hexagon, a DSP provided by Qualcomm, which can compute vector calculations faster using lower energy than ARM processors.
## Dependencies
* [Android SDK](https://developer.android.com/studio/index.html).
* [Android NDK](https://developer.android.com/ndk/index.html). Save the path in `${NDK_ROOT}`.
* A rooted Qualcomm-based Android device connected to the computer (preferably, a [Snapdragon Development Board](https://developer.qualcomm.com/hardware/additional-snapdragon), but it could be a rooted phone with a Qualcomm SoC, albeit this guide may not work with it). The device needs to be rooted for development and testing purposes, and shouldn't be needed in production. See [Behold, The Snapdragon MDP](https://developer.qualcomm.com/blog/behold-snapdragon-mdp) for more information.
* [Hexagon SDK v3.0](https://developer.qualcomm.com/software/hexagon-dsp-sdk/tools). Save the path in `${QUALCOMM_SDK}`.
* The current directory should be TensorFlow source code (`git clone https://github.com/tensorflow/tensorflow.git && cd tensorflow`), and saved into `${TF_ROOT_DIR}`.
You may also need to add a test signature in the device to run HVX-based binaries. Follow the instructions in `${QUALCOMM_SDK}/docs/Tools_Signing.html`, using Python 2.
Note that if the device is not rooted, you may not be able to get the serial number, push the test signature and/or run binary files that call HVX libraries.
## Quick Start Guide
We provides several tools to build and run inference with this runtime quickly.
We provide several tools to build and run inference with this runtime quickly.
#### All-in-one script to run inception model with prebuild hexagon library
If you don't need to build your own implementation of hexagon HVX, we provide a shortcut to execute graphs by using pre-compiled binaries.
```
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
NDK_ROOT="/path/to/ndk" ./tensorflow/contrib/makefile/build_all_android.sh -X
```
(-X downloads dependencies to hexagon HVX and graphs, and copy all dependencies to android and execute a test)
### Run inception model with a prebuilt Hexagon library
If you don't need to build your own implementation of Hexagon HVX, we provide a shortcut to execute graphs by using pre-compiled binaries.
```shell
./tensorflow/contrib/makefile/samples/build_and_run_inception_hexagon.sh -p
```
The `-p` option makes the script download dependencies (i.e., Hexagon HVX binaries and graph models), copy them to the Android device and execute a test.
#### All-in-one script to run inception model by building entire libraries from source code
If you want to build your own implementation of hexagon HVX, we provide a sample all-in-one script to execute graphs which downloads source and build everything for hexagon.
```
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
QUALCOMM_SDK="/path/to/qualcomm/sdk" NDK_ROOT="/path/to/ndk" ./tensorflow/contrib/makefile/samples/build_and_run_inception_hexagon.sh
```
### Run inception model by building all from the source code
If you want to build your own implementation of Hexagon HVX, we provide a sample all-in-one script to execute graphs which downloads the source and builds everything that's necessary.
```shell
./tensorflow/contrib/makefile/samples/build_and_run_inception_hexagon.sh
```
## Building libraries
If you've finished walking through the quick start guide, you may want to try building each binary manually.
#### Build libhexagon_nn_skel.so
Download hexagon nn library from codeaurora.org and build it.
### Build libhexagon\_nn\_skel.so
Download Hexagon NN library from codeaurora.org and build it.
```shell
git clone https://source.codeaurora.org/quic/hexagon_nn/nnlib
cd nnlib
```
(Just follow instructions in README.HOW_TO_BUILD. You can find libhexagon_nn_skel.so in hexagon_Release_dynamic_toolv72_v60/ship)
Then copy the generated binary to GEN_LIBS_DIR
Just follow the instructions in `README.HOW_TO_BUILD`. You can find the file `libhexagon_nn_skel.so` in `hexagon_Release_dynamic_toolv72_v60/ship`.
Then copy the generated binary to `${GEN_LIBS_DIR}`.
```shell
GEN_LIBS_DIR="/path/to/a/dir/to/store/hexagon/libraries"
cp -v "hexagon_Release_dynamic_toolv72_v60/ship/libhexagon_nn_skel.so" "${GEN_LIBS_DIR}"
```
#### Build libhexagon_controller.so
### Build libhexagon\_controller.so
Download tensorflow and build hexagon controller.
```
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
TF_ROOT_DIR="$(pwd)"
QUALCOMM_SDK="/path/to/qualcomm/sdk"
```
```shell
GENERATED_NNLIB_DIRECTORY="/path/to/nnlib"
GENERATED_HEXAGON_CONTROLLER_DIRECTORY="${QUALCOMM_SDK}/examples/common/generated_hexagon_controller"
rm -rf "${GENERATED_HEXAGON_CONTROLLER_DIRECTORY}"
@ -70,12 +77,12 @@ make tree VERBOSE=1 V=android_Release
cp -v "${GENERATED_HEXAGON_CONTROLLER_DIRECTORY}/android_Release/ship/libhexagon_controller.so" "${GEN_LIBS_DIR}"
```
#### Build tensorflow linking hexagon library
Build tensorflow with the build_all_android.sh with specifying -x option.
### Build TensorFlow linking Hexagon library
Build TensorFlow with `build_all_android.sh` specifying the `-x` option.
```shell
BUILD_ALL_ANDROID_PATH="${TF_ROOT_DIR}/tensorflow/contrib/makefile/build_all_android.sh"
NDK_ROOT="/path/to/ndk/root"
CC_PREFIX=${CC_PREFIX} NDK_ROOT=${NDK_ROOT} "${BUILD_ALL_ANDROID_PATH}" \
-x "${GEN_LIBS_DIR}" \
@ -83,11 +90,11 @@ CC_PREFIX=${CC_PREFIX} NDK_ROOT=${NDK_ROOT} "${BUILD_ALL_ANDROID_PATH}" \
-t hexagon_graph_execution
```
#### Push binaries to your Android device
### Push binaries to your Android device
Before running tests on your Android device, you need to push several binaries to it.
```shell
adb push "${GEN_LIBS_DIR}/libhexagon_controller.so" "/data/local/tmp"
adb push "${GEN_LIBS_DIR}/libhexagon_nn_skel.so" "/vendor/lib/rfsa/adsp"
adb push -p \
@ -100,40 +107,54 @@ adb shell chmod "${ANDROID_EXEC_FILE_MODE}" \
adb wait-for-device
```
#### Run tests on the device
### Run tests on the device
Finally, you can run the inference tests on your device.
```shell
adb shell 'LD_LIBRARY_PATH=/data/local/tmp:$LD_LIBRARY_PATH' \
"/data/local/tmp/hexagon_graph_execution"
```
#### Troubleshooting
If you're using the Open-Q 820 Snapdragon development kit, you may run into an issue with running the executable due to a missing testsig library. From the Hexagon SDK documentation: *Dynamic shared objects are required to be digitally signed and then authenticated at runtime before they are allowed to be loaded and executed.* Generating a testsig library is necessary to run the unsigned sample library built from this project.
### Troubleshooting
If the lack of a testsig library is your problem, you will see errors of the type:
#### Testsig issue
If you're using the Open-Q 820 Snapdragon Development Kit, you may run into an issue with running the executable due to a missing `testsig` library. From the Hexagon SDK documentation: *Dynamic shared objects are required to be digitally signed and then authenticated at runtime before they are allowed to be loaded and executed.* Generating a testsig library is necessary to run the unsigned sample library built from this project.
If the lack of a `testsig` library is your problem, you will see errors of the type:
`vendor/qcom/proprietary/adsprpc/src/fastrpc_apps_user.c:169::error: -1: 0 == (nErr = remotectl_open(name, (int*)ph, dlerrstr, sizeof(dlerrstr), &dlerr))`
appearing in adb logcat.
appearing in `adb logcat` or ["Expected: (version) >= (1), actual: 0 vs 1" while running a binary from adb](https://github.com/tensorflow/tensorflow/issues/11210).
You need to add a test signature, as described at the beginning of this README. After rebooting your device, you should be able to run the sample application.
There are several ways to create the testsig library; the only prerequisites are Python and the correct version of the Hexagon SDK. The following steps are one way to create this library:
1. Run adb as root: `adb root`
2. Run the command `adb shell cat /sys/devices/soc0/serial_number`
3. Convert the decimal number you get as output to hex (see the sketch after these steps)
4. Run the python script: `python ${QUALCOMM_SDK}/tools/elfsigner/elfsigner.py -t $(SERIAL_NUMBER_HEX_VALUE)`
5. The output of the python script is a shared library stored in ${QUALCOMM_SDK}/tools/elfsigner/output/testsig-$(SERIAL_NUMBER_HEX_VALUE).so
6. Push the shared library to your device:
```
adb root
adb wait-for-device
adb remount
adb wait-for-device
adb shell mkdir /system/lib/rfsa
adb shell mkdir /system/lib/rfsa/adsp
adb push ${QUALCOMM_SDK}/tools/elfsigner/output/testsig-$(SERIAL_NUMBER_HEX_VALUE).so /system/lib/rfsa/adsp/
```
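For step 3, one way to do the decimal-to-hex conversion (the serial number below is a made-up example value, not a real one):

```python
serial_number = 1234567890          # decimal output of step 2 (example value)
print(format(serial_number, "x"))   # -> 499602d2, the SERIAL_NUMBER_HEX_VALUE
```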
After rebooting your device, you should be able to run the sample application.
#### Qualcomm SDK Linux installation fails with "Malformed \uxxxx encoding"
The installation file is based on LaunchAnywhere, which fails in Linux if the `PS1` env variable contains non-common Unicode chars:
```
Preparing to install...
Extracting the JRE from the installer archive...
Unpacking the JRE...
Extracting the installation resources from the installer archive...
Configuring the installer for this system's environment...
Launching installer...
An internal LaunchAnywhere application error has occurred and this application cannot proceed. (LAX)
Stack Trace:
java.lang.IllegalArgumentException: Malformed \uxxxx encoding.
at java.util.Properties.loadConvert(Properties.java:574)
at java.util.Properties.load0(Properties.java:391)
at java.util.Properties.load(Properties.java:317)
at com.zerog.common.java.util.PropertiesUtil.loadProperties(Unknown Source)
at com.zerog.lax.LAX.<init>(Unknown Source)
at com.zerog.lax.LAX.main(Unknown Source)
```
It can be solved by temporarily assigning the `PS1` environment variable to something simple, such as '$'.
Maintainers:
- Satoshi Kataoka (satok@google.com, github.com/satok16)
## Maintainers
* Satoshi Kataoka (satok@google.com, github.com/satok16)

View File

@ -0,0 +1,105 @@
package(
default_visibility = ["//visibility:private"],
)
licenses(["notice"]) # Apache 2.0
exports_files(["LICENSE"])
load("//tensorflow:tensorflow.bzl", "tf_gen_op_libs")
load("//tensorflow:tensorflow.bzl", "tf_gen_op_wrapper_py")
load("//tensorflow:tensorflow.bzl", "tf_kernel_library")
load("//tensorflow:tensorflow.bzl", "tf_py_test")
tf_kernel_library(
name = "kafka_kernels",
srcs = ["kernels/kafka_dataset_ops.cc"],
visibility = ["//visibility:public"],
deps = [
"//tensorflow/core:framework",
"//tensorflow/core:lib",
"//tensorflow/core:lib_internal",
"//tensorflow/core/kernels:bounds_check_lib",
"//tensorflow/core/kernels:dataset",
"//third_party/eigen3",
"@kafka",
],
)
tf_gen_op_libs(
op_lib_names = ["kafka_ops"],
deps = [
"//tensorflow/core:lib",
],
)
tf_gen_op_wrapper_py(
name = "gen_kafka_ops",
out = "python/ops/gen_kafka_ops.py",
require_shape_functions = True,
deps = [":kafka_ops_op_lib"],
)
py_library(
name = "kafka",
srcs = [
"__init__.py",
"python/ops/kafka_dataset_ops.py",
],
srcs_version = "PY2AND3",
visibility = ["//visibility:public"],
deps = [
":gen_kafka_ops",
"//tensorflow/contrib/util:util_py",
"//tensorflow/python:array_ops",
"//tensorflow/python:control_flow_ops",
"//tensorflow/python:framework",
"//tensorflow/python:framework_for_generated_wrappers",
"//tensorflow/python:platform",
"//tensorflow/python:state_ops",
"//tensorflow/python:training",
"//tensorflow/python/data/ops:dataset_ops",
"//tensorflow/python/data/ops:iterator_ops",
"//tensorflow/python/data/ops:readers",
],
)
# The Kafka server has to be set up before running the test.
# The Kafka server is set up through Docker, so the Docker engine
# has to be installed.
#
# Once the Docker engine is ready:
# To set up the Kafka server:
# $ bash tensorflow/contrib/kafka/python/kernel_tests/kafka_test.sh start kafka
#
# After the test is complete:
# To tear down the Kafka server:
# $ bash tensorflow/contrib/kafka/python/kernel_tests/kafka_test.sh stop kafka
tf_py_test(
name = "kafka_test",
srcs = ["python/kernel_tests/kafka_test.py"],
additional_deps = [
":kafka",
"//third_party/py/numpy",
"//tensorflow/python:client_testlib",
"//tensorflow/python:framework",
"//tensorflow/python:framework_test_lib",
"//tensorflow/python:platform_test",
],
tags = [
"manual",
"notap",
],
)
filegroup(
name = "all_files",
srcs = glob(
["**/*"],
exclude = [
"**/METADATA",
"**/OWNERS",
],
),
visibility = ["//tensorflow:__subpackages__"],
)

View File

@ -0,0 +1,32 @@
# Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Kafka Dataset.
@@KafkaDataset
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from tensorflow.contrib.kafka.python.ops.kafka_dataset_ops import KafkaDataset
from tensorflow.python.util.all_util import remove_undocumented
_allowed_symbols = [
"KafkaDataset",
]
remove_undocumented(__name__)

View File

@ -0,0 +1,321 @@
/* Copyright 2017 The TensorFlow Authors. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
==============================================================================*/
#include "tensorflow/core/kernels/dataset.h"
#include "tensorflow/core/framework/tensor.h"
#include "src-cpp/rdkafkacpp.h"
namespace tensorflow {
class KafkaDatasetOp : public DatasetOpKernel {
public:
using DatasetOpKernel::DatasetOpKernel;
void MakeDataset(OpKernelContext* ctx, DatasetBase** output) override {
const Tensor* topics_tensor;
OP_REQUIRES_OK(ctx, ctx->input("topics", &topics_tensor));
OP_REQUIRES(
ctx, topics_tensor->dims() <= 1,
errors::InvalidArgument("`topics` must be a scalar or a vector."));
std::vector<string> topics;
topics.reserve(topics_tensor->NumElements());
for (int i = 0; i < topics_tensor->NumElements(); ++i) {
topics.push_back(topics_tensor->flat<string>()(i));
}
std::string servers = "";
OP_REQUIRES_OK(ctx,
ParseScalarArgument<std::string>(ctx, "servers", &servers));
std::string group = "";
OP_REQUIRES_OK(ctx, ParseScalarArgument<std::string>(ctx, "group", &group));
bool eof = false;
OP_REQUIRES_OK(ctx, ParseScalarArgument<bool>(ctx, "eof", &eof));
int64 timeout = -1;
OP_REQUIRES_OK(ctx, ParseScalarArgument<int64>(ctx, "timeout", &timeout));
OP_REQUIRES(ctx, (timeout > 0),
errors::InvalidArgument(
"Timeout value should be large than 0, got ", timeout));
*output = new Dataset(ctx, std::move(topics), servers, group, eof, timeout);
}
private:
class Dataset : public GraphDatasetBase {
public:
Dataset(OpKernelContext* ctx, std::vector<string> topics,
const string& servers, const string& group, const bool eof,
const int64 timeout)
: GraphDatasetBase(ctx),
topics_(std::move(topics)),
servers_(servers),
group_(group),
eof_(eof),
timeout_(timeout) {}
std::unique_ptr<IteratorBase> MakeIterator(
const string& prefix) const override {
return std::unique_ptr<IteratorBase>(
new Iterator({this, strings::StrCat(prefix, "::Kafka")}));
}
const DataTypeVector& output_dtypes() const override {
static DataTypeVector* dtypes = new DataTypeVector({DT_STRING});
return *dtypes;
}
const std::vector<PartialTensorShape>& output_shapes() const override {
static std::vector<PartialTensorShape>* shapes =
new std::vector<PartialTensorShape>({{}});
return *shapes;
}
string DebugString() override { return "KafkaDatasetOp::Dataset"; }
protected:
Status AsGraphDefInternal(DatasetGraphDefBuilder* b,
Node** output) const override {
Node* topics = nullptr;
TF_RETURN_IF_ERROR(b->AddVector(topics_, &topics));
Node* servers = nullptr;
TF_RETURN_IF_ERROR(b->AddScalar(servers_, &servers));
Node* group = nullptr;
TF_RETURN_IF_ERROR(b->AddScalar(group_, &group));
Node* eof = nullptr;
TF_RETURN_IF_ERROR(b->AddScalar(eof_, &eof));
Node* timeout = nullptr;
TF_RETURN_IF_ERROR(b->AddScalar(timeout_, &timeout));
TF_RETURN_IF_ERROR(
b->AddDataset(this, {topics, servers, group, eof, timeout}, output));
return Status::OK();
}
private:
class Iterator : public DatasetIterator<Dataset> {
public:
explicit Iterator(const Params& params)
: DatasetIterator<Dataset>(params) {}
Status GetNextInternal(IteratorContext* ctx,
std::vector<Tensor>* out_tensors,
bool* end_of_sequence) override {
mutex_lock l(mu_);
do {
// We are currently processing a topic, so try to read the next line.
if (consumer_.get()) {
while (true) {
if (limit_ >= 0 &&
(topic_partition_->offset() >= limit_ || offset_ >= limit_)) {
// EOF current topic
break;
}
std::unique_ptr<RdKafka::Message> message(
consumer_->consume(dataset()->timeout_));
if (message->err() == RdKafka::ERR_NO_ERROR) {
// Produce the line as output.
Tensor line_tensor(cpu_allocator(), DT_STRING, {});
line_tensor.scalar<string>()() =
std::string(static_cast<const char*>(message->payload()),
message->len());
out_tensors->emplace_back(std::move(line_tensor));
*end_of_sequence = false;
// Sync offset
offset_ = message->offset();
return Status::OK();
}
if (message->err() == RdKafka::ERR__PARTITION_EOF &&
dataset()->eof_) {
// EOF current topic
break;
}
if (message->err() != RdKafka::ERR__TIMED_OUT) {
return errors::Internal("Failed to consume:",
message->errstr());
}
message.reset(nullptr);
consumer_->poll(0);
}
// We have reached the end of the current topic, so maybe
// move on to next topic.
ResetStreamsLocked();
++current_topic_index_;
}
// Iteration ends when there are no more topics to process.
if (current_topic_index_ == dataset()->topics_.size()) {
*end_of_sequence = true;
return Status::OK();
}
TF_RETURN_IF_ERROR(SetupStreamsLocked(ctx->env()));
} while (true);
}
protected:
Status SaveInternal(IteratorStateWriter* writer) override {
mutex_lock l(mu_);
TF_RETURN_IF_ERROR(writer->WriteScalar(full_name("current_topic_index"),
current_topic_index_));
// `consumer_` is empty if
// 1. GetNext has not been called even once.
// 2. All topics have been read and iterator has been exhausted.
if (consumer_.get()) {
TF_RETURN_IF_ERROR(
writer->WriteScalar(full_name("current_pos"), offset_));
}
return Status::OK();
}
Status RestoreInternal(IteratorContext* ctx,
IteratorStateReader* reader) override {
mutex_lock l(mu_);
ResetStreamsLocked();
int64 current_topic_index;
TF_RETURN_IF_ERROR(reader->ReadScalar(full_name("current_topic_index"),
&current_topic_index));
current_topic_index_ = size_t(current_topic_index);
// The key "current_pos" is written only if the iterator was saved
// with an open topic.
if (reader->Contains(full_name("current_pos"))) {
int64 current_pos;
TF_RETURN_IF_ERROR(
reader->ReadScalar(full_name("current_pos"), &current_pos));
TF_RETURN_IF_ERROR(SetupStreamsLocked(ctx->env()));
topic_partition_->set_offset(current_pos);
if (topic_partition_->offset() != current_pos) {
return errors::Internal("Failed to restore to offset ",
current_pos);
}
offset_ = current_pos;
}
return Status::OK();
}
private:
// Sets up Kafka streams to read from the topic at
// `current_topic_index_`.
Status SetupStreamsLocked(Env* env) EXCLUSIVE_LOCKS_REQUIRED(mu_) {
if (current_topic_index_ >= dataset()->topics_.size()) {
return errors::InvalidArgument(
"current_topic_index_:", current_topic_index_,
" >= topics_.size():", dataset()->topics_.size());
}
// Actually move on to next topic.
string entry = dataset()->topics_[current_topic_index_];
std::vector<string> parts = str_util::Split(entry, ":");
if (parts.size() < 1) {
return errors::InvalidArgument("Invalid parameters: ", entry);
}
string topic = parts[0];
int32 partition = 0;
if (parts.size() > 1) {
if (!strings::safe_strto32(parts[1], &partition)) {
return errors::InvalidArgument("Invalid parameters: ", entry);
}
}
int64 offset = 0;
if (parts.size() > 2) {
if (!strings::safe_strto64(parts[2], &offset)) {
return errors::InvalidArgument("Invalid parameters: ", entry);
}
}
topic_partition_.reset(
RdKafka::TopicPartition::create(topic, partition, offset));
offset_ = topic_partition_->offset();
limit_ = -1;
if (parts.size() > 3) {
if (!strings::safe_strto64(parts[3], &limit_)) {
return errors::InvalidArgument("Invalid parameters: ", entry);
}
}
std::unique_ptr<RdKafka::Conf> conf(
RdKafka::Conf::create(RdKafka::Conf::CONF_GLOBAL));
std::unique_ptr<RdKafka::Conf> topic_conf(
RdKafka::Conf::create(RdKafka::Conf::CONF_TOPIC));
std::string errstr;
RdKafka::Conf::ConfResult result =
conf->set("default_topic_conf", topic_conf.get(), errstr);
if (result != RdKafka::Conf::CONF_OK) {
return errors::Internal("Failed to set default_topic_conf:", errstr);
}
result = conf->set("bootstrap.servers", dataset()->servers_, errstr);
if (result != RdKafka::Conf::CONF_OK) {
return errors::Internal("Failed to set bootstrap.servers ",
dataset()->servers_, ":", errstr);
}
result = conf->set("group.id", dataset()->group_, errstr);
if (result != RdKafka::Conf::CONF_OK) {
return errors::Internal("Failed to set group.id ", dataset()->group_,
":", errstr);
}
consumer_.reset(RdKafka::KafkaConsumer::create(conf.get(), errstr));
if (!consumer_.get()) {
return errors::Internal("Failed to create consumer:", errstr);
}
std::vector<RdKafka::TopicPartition*> partitions;
partitions.emplace_back(topic_partition_.get());
RdKafka::ErrorCode err = consumer_->assign(partitions);
if (err != RdKafka::ERR_NO_ERROR) {
return errors::Internal(
"Failed to assign partition [", topic_partition_->topic(), ", ",
topic_partition_->partition(), ", ", topic_partition_->offset(),
"]:", RdKafka::err2str(err));
}
return Status::OK();
}
// Resets all Kafka streams.
void ResetStreamsLocked() EXCLUSIVE_LOCKS_REQUIRED(mu_) {
consumer_->unassign();
consumer_->close();
consumer_.reset(nullptr);
}
mutex mu_;
size_t current_topic_index_ GUARDED_BY(mu_) = 0;
int64 offset_ GUARDED_BY(mu_) = 0;
int64 limit_ GUARDED_BY(mu_) = -1;
std::unique_ptr<RdKafka::TopicPartition> topic_partition_ GUARDED_BY(mu_);
std::unique_ptr<RdKafka::KafkaConsumer> consumer_ GUARDED_BY(mu_);
};
const std::vector<string> topics_;
const std::string servers_;
const std::string group_;
const bool eof_;
const int64 timeout_;
};
};
REGISTER_KERNEL_BUILDER(Name("KafkaDataset").Device(DEVICE_CPU),
KafkaDatasetOp);
} // namespace tensorflow

View File

@ -0,0 +1,44 @@
/* Copyright 2017 The TensorFlow Authors. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
==============================================================================*/
#include "tensorflow/core/framework/common_shape_fns.h"
#include "tensorflow/core/framework/op.h"
#include "tensorflow/core/framework/shape_inference.h"
namespace tensorflow {
REGISTER_OP("KafkaDataset")
.Input("topics: string")
.Input("servers: string")
.Input("group: string")
.Input("eof: bool")
.Input("timeout: int64")
.Output("handle: variant")
.SetIsStateful()
.SetShapeFn(shape_inference::ScalarShape)
.Doc(R"doc(
Creates a dataset that emits the messages of one or more Kafka topics.
topics: A `tf.string` tensor containing one or more subscriptions,
in the format of [topic:partition:offset:length],
by default length is -1 for unlimited.
servers: A list of bootstrap servers.
group: The consumer group id.
eof: If True, the kafka reader will stop on EOF.
timeout: The timeout value for the Kafka Consumer to wait
(in milliseconds).
)doc");
} // namespace tensorflow
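As a usage illustration of the subscription format documented above, a minimal sketch that mirrors the kafka_test.py added in this change; it assumes a local broker with a populated `test` topic and that the Python wrapper keeps the op's defaults.

```python
import tensorflow as tf
from tensorflow.contrib.kafka.python.ops import kafka_dataset_ops

# Subscribe to partition 0 of topic "test", starting at offset 0, length 4
# (the topic:partition:offset:length format documented above).
dataset = kafka_dataset_ops.KafkaDataset(
    ["test:0:0:4"], servers="localhost", group="test", eof=True)
iterator = dataset.make_initializable_iterator()
next_message = iterator.get_next()

with tf.Session() as sess:
  sess.run(iterator.initializer)
  try:
    while True:
      print(sess.run(next_message))
  except tf.errors.OutOfRangeError:
    pass
```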

View File

@ -0,0 +1,115 @@
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may not
# use this file except in compliance with the License. You may obtain a copy of
# the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations under
# the License.
# ==============================================================================
"""Tests for KafkaDataset."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from tensorflow.contrib.kafka.python.ops import kafka_dataset_ops
from tensorflow.python.data.ops import iterator_ops
from tensorflow.python.framework import dtypes
from tensorflow.python.framework import errors
from tensorflow.python.ops import array_ops
from tensorflow.python.platform import test
class KafkaDatasetTest(test.TestCase):
def setUp(self):
# The Kafka server has to be set up before the test
# and torn down after the test manually.
# The docker engine has to be installed.
#
# To set up the Kafka server:
# $ bash kafka_test.sh start kafka
#
# To tear down the Kafka server:
# $ bash kafka_test.sh stop kafka
pass
def testKafkaDataset(self):
topics = array_ops.placeholder(dtypes.string, shape=[None])
num_epochs = array_ops.placeholder(dtypes.int64, shape=[])
batch_size = array_ops.placeholder(dtypes.int64, shape=[])
repeat_dataset = kafka_dataset_ops.KafkaDataset(
topics, group="test", eof=True).repeat(num_epochs)
batch_dataset = repeat_dataset.batch(batch_size)
iterator = iterator_ops.Iterator.from_structure(batch_dataset.output_types)
init_op = iterator.make_initializer(repeat_dataset)
init_batch_op = iterator.make_initializer(batch_dataset)
get_next = iterator.get_next()
with self.test_session() as sess:
# Basic test: read messages 0-4 from the test topic.
sess.run(init_op, feed_dict={topics: ["test:0:0:4"], num_epochs: 1})
for i in range(5):
self.assertEqual("D" + str(i), sess.run(get_next))
with self.assertRaises(errors.OutOfRangeError):
sess.run(get_next)
# Basic test: read messages 5-9 from the test topic.
sess.run(init_op, feed_dict={topics: ["test:0:5:-1"], num_epochs: 1})
for i in range(5):
self.assertEqual("D" + str(i + 5), sess.run(get_next))
with self.assertRaises(errors.OutOfRangeError):
sess.run(get_next)
# Basic test: read from both subscription ranges.
sess.run(
init_op,
feed_dict={
topics: ["test:0:0:4", "test:0:5:-1"],
num_epochs: 1
})
for j in range(2):
for i in range(5):
self.assertEqual("D" + str(i + j * 5), sess.run(get_next))
with self.assertRaises(errors.OutOfRangeError):
sess.run(get_next)
# Test repeated iteration through both ranges.
sess.run(
init_op,
feed_dict={
topics: ["test:0:0:4", "test:0:5:-1"],
num_epochs: 10
})
for _ in range(10):
for j in range(2):
for i in range(5):
self.assertEqual("D" + str(i + j * 5), sess.run(get_next))
with self.assertRaises(errors.OutOfRangeError):
sess.run(get_next)
# Test batched and repeated iteration through both ranges.
sess.run(
init_batch_op,
feed_dict={
topics: ["test:0:0:4", "test:0:5:-1"],
num_epochs: 10,
batch_size: 5
})
for _ in range(10):
self.assertAllEqual(["D" + str(i) for i in range(5)],
sess.run(get_next))
self.assertAllEqual(["D" + str(i + 5) for i in range(5)],
sess.run(get_next))
if __name__ == "__main__":
test.main()


@ -0,0 +1,48 @@
#!/usr/bin/env bash
# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
set -e
set -o pipefail
if [ "$#" -ne 2 ]; then
echo "Usage: $0 start|stop <kafka container name>" >&2
exit 1
fi
container=$2
if [ "$1" == "start" ]; then
docker run -d --rm --net=host --name=$container spotify/kafka
echo Wait 5 secs until Kafka is up and running
sleep 5
echo Create test topic
docker exec $container bash -c '/opt/kafka_2.11-0.10.1.0/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test'
echo Create test messages
docker exec $container bash -c 'echo -e "D0\nD1\nD2\nD3\nD4\nD5\nD6\nD7\nD8\nD9" > /test'
echo Produce test messages
docker exec $container bash -c '/opt/kafka_2.11-0.10.1.0/bin/kafka-console-producer.sh --topic test --broker-list 127.0.0.1:9092 < /test'
echo Container $container started successfully
elif [ "$1" == "stop" ]; then
docker rm -f $container
echo Container $container stopped successfully
else
echo "Usage: $0 start|stop <kafka container name>" >&2
exit 1
fi


@ -0,0 +1,74 @@
# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Kafka Dataset."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from tensorflow.contrib.kafka.python.ops import gen_kafka_ops
from tensorflow.python.data.ops.readers import Dataset
from tensorflow.python.framework import dtypes
from tensorflow.python.framework import ops
from tensorflow.python.framework import tensor_shape
class KafkaDataset(Dataset):
"""A Kafka Dataset that consumes the message.
"""
def __init__(self,
topics,
servers="localhost",
group="",
eof=False,
timeout=1000):
"""Create a KafkaReader.
Args:
topics: A `tf.string` tensor containing one or more subscriptions,
each in the format [topic:partition:offset:length];
by default, length is -1 (unlimited).
servers: A list of bootstrap servers.
group: The consumer group id.
eof: If True, the kafka reader will stop on EOF.
timeout: The timeout value for the Kafka Consumer to wait
(in milliseconds).
"""
super(KafkaDataset, self).__init__()
self._topics = ops.convert_to_tensor(
topics, dtype=dtypes.string, name="topics")
self._servers = ops.convert_to_tensor(
servers, dtype=dtypes.string, name="servers")
self._group = ops.convert_to_tensor(
group, dtype=dtypes.string, name="group")
self._eof = ops.convert_to_tensor(eof, dtype=dtypes.bool, name="eof")
self._timeout = ops.convert_to_tensor(
timeout, dtype=dtypes.int64, name="timeout")
def _as_variant_tensor(self):
return gen_kafka_ops.kafka_dataset(self._topics, self._servers, self._group,
self._eof, self._timeout)
@property
def output_classes(self):
return ops.Tensor
@property
def output_shapes(self):
return tensor_shape.scalar()
@property
def output_types(self):
return dtypes.string
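As a usage sketch (assuming a broker reachable at localhost:9092 with a populated `test` topic, as in the accompanying test; names and addresses are assumptions):
```
# Minimal TF1-style sketch: stream every message from partition 0 of
# topic "test", stopping at EOF.
import tensorflow as tf
from tensorflow.contrib.kafka.python.ops import kafka_dataset_ops

dataset = kafka_dataset_ops.KafkaDataset(
    topics=["test:0:0:-1"], servers="localhost:9092", group="demo", eof=True)
get_next = dataset.make_one_shot_iterator().get_next()
with tf.Session() as sess:
  try:
    while True:
      print(sess.run(get_next))  # each element is a scalar tf.string message
  except tf.errors.OutOfRangeError:
    pass
```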


@ -27,6 +27,7 @@ See the @{$python/contrib.layers} guide.
@@convolution2d_transpose
@@conv3d_transpose
@@convolution3d_transpose
@@dense_to_sparse
@@dropout
@@elu
@@embedding_lookup_unique


@ -29,6 +29,7 @@ from tensorflow.contrib.framework.python.ops import variables
from tensorflow.contrib.layers.python.layers import initializers
from tensorflow.contrib.layers.python.layers import utils
from tensorflow.python.eager import context
from tensorflow.python.framework import constant_op
from tensorflow.python.framework import dtypes
from tensorflow.python.framework import function
from tensorflow.python.framework import ops
@ -58,12 +59,12 @@ __all__ = [
'avg_pool2d', 'avg_pool3d', 'batch_norm', 'bias_add', 'conv2d', 'conv3d',
'conv2d_in_plane', 'conv2d_transpose', 'conv3d_transpose', 'convolution',
'convolution2d', 'convolution2d_in_plane', 'convolution2d_transpose',
'convolution3d', 'convolution3d_transpose', 'dropout', 'elu', 'flatten',
'fully_connected', 'GDN', 'gdn', 'layer_norm', 'linear', 'pool',
'max_pool2d', 'max_pool3d', 'one_hot_encoding', 'relu', 'relu6', 'repeat',
'scale_gradient', 'separable_conv2d', 'separable_convolution2d', 'softmax',
'spatial_softmax', 'stack', 'unit_norm', 'legacy_fully_connected',
'legacy_linear', 'legacy_relu', 'maxout'
'convolution3d', 'convolution3d_transpose', 'dense_to_sparse', 'dropout',
'elu', 'flatten', 'fully_connected', 'GDN', 'gdn', 'layer_norm', 'linear',
'pool', 'max_pool2d', 'max_pool3d', 'one_hot_encoding', 'relu', 'relu6',
'repeat', 'scale_gradient', 'separable_conv2d', 'separable_convolution2d',
'softmax', 'spatial_softmax', 'stack', 'unit_norm',
'legacy_fully_connected', 'legacy_linear', 'legacy_relu', 'maxout'
]
DATA_FORMAT_NCHW = 'NCHW'
@ -1400,6 +1401,30 @@ def convolution3d_transpose(
return utils.collect_named_outputs(outputs_collections, sc.name, outputs)
@add_arg_scope
def dense_to_sparse(tensor, eos_token=0, outputs_collections=None, scope=None):
"""Converts a dense tensor into a sparse tensor.
An example use would be to convert dense labels to sparse ones
so that they can be fed to the ctc_loss.
Args:
tensor: An `int` `Tensor` to be converted to a `SparseTensor`.
eos_token: An integer. It is part of the target label that signifies the
end of a sentence.
outputs_collections: Collection to add the outputs.
scope: Optional scope for name_scope.
"""
with variable_scope.variable_scope(scope, 'dense_to_sparse', [tensor]) as sc:
tensor = ops.convert_to_tensor(tensor)
indices = array_ops.where(
math_ops.not_equal(tensor, constant_op.constant(eos_token,
tensor.dtype)))
values = array_ops.gather_nd(tensor, indices)
shape = array_ops.shape(tensor, out_type=dtypes.int64)
outputs = sparse_tensor.SparseTensor(indices, values, shape)
return utils.collect_named_outputs(outputs_collections, sc.name, outputs)
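A small sketch of the intended use of `dense_to_sparse` (graph-mode TF1, reusing this module's own imports; 0 pads each row as the end-of-sequence token):
```
# Hypothetical example: dense padded labels -> SparseTensor, e.g. for ctc_loss.
labels = constant_op.constant([[3, 1, 0], [2, 0, 0]])
sparse_labels = dense_to_sparse(labels, eos_token=0)
# sparse_labels.indices     -> [[0, 0], [0, 1], [1, 0]]
# sparse_labels.values      -> [3, 1, 2]
# sparse_labels.dense_shape -> [2, 3]
```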
@add_arg_scope
def dropout(inputs,
keep_prob=0.5,


@ -44,6 +44,7 @@ from tensorflow.python.ops import math_ops
from tensorflow.python.ops import nn_ops
from tensorflow.python.ops import partitioned_variables
from tensorflow.python.ops import random_ops
from tensorflow.python.ops import sparse_ops
from tensorflow.python.ops import state_ops
from tensorflow.python.ops import template
from tensorflow.python.ops import variable_scope
@ -1301,6 +1302,19 @@ class ConvolutionInPlaneTest(test.TestCase):
self.assertAllClose(result, expected, rtol=1e-5, atol=1e-5)
class DenseToSparseTest(test.TestCase):
def testDenseFromConstantToSparse(self):
expected_constant = np.reshape(np.arange(24, dtype=np.int64), (3, 4, 2))
tensor = constant_op.constant(expected_constant)
sparse = _layers.dense_to_sparse(tensor)
dense = sparse_ops.sparse_to_dense(sparse.indices, sparse.dense_shape,
sparse.values)
with self.test_session() as sess:
constant = sess.run(dense)
self.assertAllEqual(expected_constant, constant)
class DropoutTest(test.TestCase):
def testCreateDropout(self):


@ -151,7 +151,7 @@ def spirals(n_samples=100,
# Add more points if n_samples is not divisible by n_classes (unbalanced!)
extras = n_samples % n_classes
if extras > 0:
x_exrta, y_extra = _modes[mode](np.random.rand(extras) * 2 * np.pi, *args,
x_extra, y_extra = _modes[mode](np.random.rand(extras) * 2 * np.pi, *args,
**kwargs)
spir_x = np.append(spir_x, x_extra)
spir_y = np.append(spir_y, y_extra)


@ -136,6 +136,9 @@ class SyntheticTest(test.TestCase):
self.assertRaises(AssertionError, np.testing.assert_array_equal,
spir0.data, spir1.data)
def test_spirals_synthetic(self):
synthetic.spirals(3)
if __name__ == '__main__':
test.main()


@ -1224,7 +1224,7 @@ class DNNRegressorTest(test.TestCase):
self, predictions, expected_shape):
predictions_nparray = np.array(predictions)
self.assertAllEqual(expected_shape, predictions_nparray.shape)
self.assertTrue(np.issubdtype(predictions_nparray.dtype, np.float))
self.assertTrue(np.issubdtype(predictions_nparray.dtype, np.floating))
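Context for this one-liner: `np.float` is merely an alias for the builtin `float`, whereas `np.floating` is the abstract parent of all NumPy floating scalar types, so it also matches `float32` predictions. A quick sketch:
```
import numpy as np

# np.floating covers float16/float32/float64 alike.
a = np.zeros(3, dtype=np.float32)
assert np.issubdtype(a.dtype, np.floating)
```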
def testPredict_AsIterableFalse(self):
"""Tests predict method with as_iterable=False."""


@ -5,25 +5,25 @@ def tflite_copts():
copts = [
"-DFARMHASH_NO_CXX_STRING",
] + select({
"//tensorflow:android_arm64": [
str(Label("//tensorflow:android_arm64")): [
"-std=c++11",
"-O3",
],
"//tensorflow:android_arm": [
str(Label("//tensorflow:android_arm")): [
"-mfpu=neon",
"-mfloat-abi=softfp",
"-std=c++11",
"-O3",
],
"//tensorflow:android_x86": [
str(Label("//tensorflow:android_x86")): [
"-DGEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK",
],
"//tensorflow:ios_x86_64": [
str(Label("//tensorflow:ios_x86_64")): [
"-msse4.1",
],
"//conditions:default": [],
}) + select({
"//tensorflow:with_default_optimizations": [],
str(Label("//tensorflow:with_default_optimizations")): [],
"//conditions:default": ["-DGEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK"],
})


@ -42,7 +42,15 @@ cc_library(
"bitmap_helpers_impl.h",
"label_image.h",
],
deps = ["//tensorflow/contrib/lite:string"],
deps = [
"//tensorflow/contrib/lite:builtin_op_data",
"//tensorflow/contrib/lite:framework",
"//tensorflow/contrib/lite:schema_fbs_version",
"//tensorflow/contrib/lite:string",
"//tensorflow/contrib/lite:string_util",
"//tensorflow/contrib/lite/kernels:builtin_ops",
"//tensorflow/contrib/lite/schema:schema_fbs",
],
)
# TODO(ahentz): Test disabled as it has a memory leak from read_bmp


@ -13,8 +13,8 @@ See the License for the specific language governing permissions and
limitations under the License.
==============================================================================*/
#ifndef TENSORFLOW_CONTRIB_LITE_EXAMPLES_LABEL_IMAGE_BITMAP_HELPERS_H
#define TENSORFLOW_CONTRIB_LITE_EXAMPLES_LABEL_IMAGE_BITMAP_HELPERS_H
#ifndef TENSORFLOW_CONTRIB_LITE_EXAMPLES_LABEL_IMAGE_BITMAP_HELPERS_H_
#define TENSORFLOW_CONTRIB_LITE_EXAMPLES_LABEL_IMAGE_BITMAP_HELPERS_H_
#include "tensorflow/contrib/lite/examples/label_image/bitmap_helpers_impl.h"
#include "tensorflow/contrib/lite/examples/label_image/label_image.h"
@ -26,15 +26,15 @@ uint8_t* read_bmp(const std::string& input_bmp_name, int* width, int* height,
int* channels, Settings* s);
template <class T>
void downsize(T* out, uint8_t* in, int image_height, int image_width,
int image_channels, int wanted_height, int wanted_width,
int wanted_channels, Settings* s);
void resize(T* out, uint8_t* in, int image_height, int image_width,
int image_channels, int wanted_height, int wanted_width,
int wanted_channels, Settings* s);
// explicit instantiation
template void downsize<uint8_t>(uint8_t*, unsigned char*, int, int, int, int,
int, int, Settings*);
template void downsize<float>(float*, unsigned char*, int, int, int, int, int,
template void resize<uint8_t>(uint8_t*, unsigned char*, int, int, int, int, int,
int, Settings*);
template void resize<float>(float*, unsigned char*, int, int, int, int, int,
int, Settings*);
} // namespace label_image
} // namespace tflite


@ -13,8 +13,14 @@ See the License for the specific language governing permissions and
limitations under the License.
==============================================================================*/
#ifndef TENSORFLOW_CONTRIB_LITE_EXAMPLES_LABEL_IMAGE_BITMAP_HELPERS_IMPL_H
#define TENSORFLOW_CONTRIB_LITE_EXAMPLES_LABEL_IMAGE_BITMAP_HELPERS_IMPL_H
#ifndef TENSORFLOW_CONTRIB_LITE_EXAMPLES_LABEL_IMAGE_BITMAP_HELPERS_IMPL_H_
#define TENSORFLOW_CONTRIB_LITE_EXAMPLES_LABEL_IMAGE_BITMAP_HELPERS_IMPL_H_
#include "tensorflow/contrib/lite/builtin_op_data.h"
#include "tensorflow/contrib/lite/interpreter.h"
#include "tensorflow/contrib/lite/kernels/register.h"
#include "tensorflow/contrib/lite/string_util.h"
#include "tensorflow/contrib/lite/version.h"
#include "tensorflow/contrib/lite/examples/label_image/label_image.h"
@ -22,28 +28,67 @@ namespace tflite {
namespace label_image {
template <class T>
void downsize(T* out, uint8_t* in, int image_height, int image_width,
int image_channels, int wanted_height, int wanted_width,
int wanted_channels, Settings* s) {
for (int y = 0; y < wanted_height; ++y) {
const int in_y = (y * image_height) / wanted_height;
uint8_t* in_row = in + (in_y * image_width * image_channels);
T* out_row = out + (y * wanted_width * wanted_channels);
for (int x = 0; x < wanted_width; ++x) {
const int in_x = (x * image_width) / wanted_width;
uint8_t* in_pixel = in_row + (in_x * image_channels);
T* out_pixel = out_row + (x * wanted_channels);
for (int c = 0; c < wanted_channels; ++c) {
if (s->input_floating)
out_pixel[c] = (in_pixel[c] - s->input_mean) / s->input_std;
else
out_pixel[c] = in_pixel[c];
}
}
void resize(T* out, uint8_t* in, int image_height, int image_width,
int image_channels, int wanted_height, int wanted_width,
int wanted_channels, Settings* s) {
int number_of_pixels = image_height * image_width * image_channels;
std::unique_ptr<Interpreter> interpreter(new Interpreter);
int base_index = 0;
// two inputs: input and new_sizes
interpreter->AddTensors(2, &base_index);
// one output
interpreter->AddTensors(1, &base_index);
// set input and output tensors
interpreter->SetInputs({0, 1});
interpreter->SetOutputs({2});
// set parameters of tensors
TfLiteQuantizationParams quant;
interpreter->SetTensorParametersReadWrite(
0, kTfLiteFloat32, "input",
{1, image_height, image_width, image_channels}, quant);
interpreter->SetTensorParametersReadWrite(1, kTfLiteInt32, "new_size", {2},
quant);
interpreter->SetTensorParametersReadWrite(
2, kTfLiteFloat32, "output",
{1, wanted_height, wanted_width, wanted_channels}, quant);
ops::builtin::BuiltinOpResolver resolver;
TfLiteRegistration* resize_op =
resolver.FindOp(BuiltinOperator_RESIZE_BILINEAR);
interpreter->AddNodeWithParameters({0, 1}, {2}, nullptr, 0, nullptr,
resize_op, nullptr);
interpreter->AllocateTensors();
// fill input image
// in[] are integers, cannot do memcpy() directly
auto input = interpreter->typed_tensor<float>(0);
for (int i = 0; i < number_of_pixels; i++) {
input[i] = in[i];
}
// fill new_sizes
interpreter->typed_tensor<int>(1)[0] = wanted_height;
interpreter->typed_tensor<int>(1)[1] = wanted_width;
interpreter->Invoke();
auto output = interpreter->typed_tensor<float>(2);
auto output_number_of_pixels =
wanted_height * wanted_width * wanted_channels;
for (int i = 0; i < output_number_of_pixels; i++) {
if (s->input_floating)
out[i] = (output[i] - s->input_mean) / s->input_std;
else
out[i] = (uint8_t)output[i];
}
}
} // namespace label_image
} // namespace tflite
#endif // TENSORFLOW_CONTRIB_LITE_EXAMPLES_LABEL_IMAGE_BITMAP_HELPERS_IMPL_H
#endif // TENSORFLOW_CONTRIB_LITE_EXAMPLES_LABEL_IMAGE_BITMAP_HELPERS_IMPL_H_
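For intuition, the hand-built interpreter graph above computes a single bilinear resize; a rough Python counterpart (a sketch assuming TF 1.x APIs and a NumPy HxWxC uint8 image) is:
```
import numpy as np
import tensorflow as tf

def resize_like_label_image(image_u8, wanted_h, wanted_w):
  """Sketch of the same computation: float-cast, then bilinear resize."""
  x = tf.constant(image_u8[np.newaxis, ...], dtype=tf.float32)  # batch of 1
  resized = tf.image.resize_bilinear(x, [wanted_h, wanted_w])
  with tf.Session() as sess:
    return sess.run(resized)[0]  # float32; caller normalizes or quantizes
```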


@ -148,14 +148,22 @@ void RunInference(Settings* s) {
int wanted_width = dims->data[2];
int wanted_channels = dims->data[3];
if (s->input_floating) {
downsize<float>(interpreter->typed_tensor<float>(input), in, image_height,
switch (interpreter->tensor(input)->type) {
case kTfLiteFloat32:
s->input_floating = true;
resize<float>(interpreter->typed_tensor<float>(input), in, image_height,
image_width, image_channels, wanted_height, wanted_width,
wanted_channels, s);
} else {
downsize<uint8_t>(interpreter->typed_tensor<uint8_t>(input), in,
break;
case kTfLiteUInt8:
resize<uint8_t>(interpreter->typed_tensor<uint8_t>(input), in,
image_height, image_width, image_channels, wanted_height,
wanted_width, wanted_channels, s);
break;
default:
LOG(FATAL) << "cannot handle input type "
<< interpreter->tensor(input)->type << " yet";
exit(-1);
}
struct timeval start_time, stop_time;
@ -177,13 +185,21 @@ void RunInference(Settings* s) {
std::vector<std::pair<float, int>> top_results;
if (s->input_floating) {
get_top_n<float>(interpreter->typed_output_tensor<float>(0), output_size,
num_results, threshold, &top_results, s->input_floating);
} else {
get_top_n<uint8_t>(interpreter->typed_output_tensor<uint8_t>(0),
output_size, num_results, threshold, &top_results,
s->input_floating);
int output = interpreter->outputs()[0];
switch (interpreter->tensor(output)->type) {
case kTfLiteFloat32:
get_top_n<float>(interpreter->typed_output_tensor<float>(0), output_size,
num_results, threshold, &top_results, true);
break;
case kTfLiteUInt8:
get_top_n<uint8_t>(interpreter->typed_output_tensor<uint8_t>(0),
output_size, num_results, threshold, &top_results,
false);
break;
default:
LOG(FATAL) << "cannot handle output type "
<< interpreter->tensor(output)->type << " yet";
exit(-1);
}
std::vector<string> labels;
@ -203,13 +219,11 @@ void display_usage() {
LOG(INFO) << "label_image\n"
<< "--accelerated, -a: [0|1], use Android NNAPI or note\n"
<< "--count, -c: loop interpreter->Invoke() for certain times\n"
<< "--input_floating, -f: [0|1] type of input layer is floating "
"point numbers\n"
<< "--input_mean, -b: input mean\n"
<< "--input_std, -s: input standard deviation\n"
<< "--image, -i: image_name.bmp\n"
<< "--labels, -l: labels for the model\n"
<< "--tflite_mode, -m: model_name.tflite\n"
<< "--tflite_model, -m: model_name.tflite\n"
<< "--threads, -t: number of threads\n"
<< "--verbose, -v: [0|1] print more information\n"
<< "\n";
@ -223,7 +237,6 @@ int Main(int argc, char** argv) {
static struct option long_options[] = {
{"accelerated", required_argument, 0, 'a'},
{"count", required_argument, 0, 'c'},
{"input_floating", required_argument, 0, 'f'},
{"verbose", required_argument, 0, 'v'},
{"image", required_argument, 0, 'i'},
{"labels", required_argument, 0, 'l'},
@ -254,11 +267,6 @@ int Main(int argc, char** argv) {
s.loop_count = strtol( // NOLINT(runtime/deprecated_fn)
optarg, (char**)NULL, 10);
break;
case 'f':
s.input_floating = strtol( // NOLINT(runtime/deprecated_fn)
optarg, (char**)NULL, 10);
s.input_layer_type = "float";
break;
case 'i':
s.input_bmp_name = optarg;
break;


@ -16,9 +16,11 @@ limitations under the License.
#ifndef TENSORFLOW_CONTRIB_LITE_EXAMPLES_LABEL_IMAGE_LABEL_IMAGE_H
#define TENSORFLOW_CONTRIB_LITE_EXAMPLES_LABEL_IMAGE_LABEL_IMAGE_H
#include <string>
#include "tensorflow/contrib/lite/string.h"
namespace tflite {
namespace label_image {
struct Settings {
bool verbose = false;
bool accel = false;
@ -33,4 +35,7 @@ struct Settings {
int number_of_threads = 4;
};
} // namespace label_image
} // namespace tflite
#endif // TENSORFLOW_CONTRIB_LITE_EXAMPLES_LABEL_IMAGE_LABEL_IMAGE_H


@ -1,8 +1,12 @@
label_image for TensorFlow Lite inspired by TensorFlow's label_image.
To build label_image for Android, run $TENSORFLOW_ROOT/configure
and set the Android NDK path, or configure the NDK settings in
$TENSORFLOW_ROOT/WORKSPACE, first.
To build it for android ARMv8:
```
> bazel build --cxxopt=-std=c++11 \
> bazel build --config monolithic --cxxopt=-std=c++11 \
--crosstool_top=//external:android/crosstool \
--host_crosstool_top=@bazel_tools//tools/cpp:toolchain \
--cpu=arm64-v8a \
@ -10,13 +14,13 @@ To build it for android ARMv8:
```
or
```
> bazel build --config android_arm64 --cxxopt=-std=c++11 \
> bazel build --config android_arm64 --config monolithic --cxxopt=-std=c++11 \
//tensorflow/contrib/lite/examples/label_image:label_image
```
To build it for android arm-v7a:
```
> bazel build --cxxopt=-std=c++11 \
> bazel build --config monolithic --cxxopt=-std=c++11 \
--crosstool_top=//external:android/crosstool \
--host_crosstool_top=@bazel_tools//tools/cpp:toolchain \
--cpu=armeabi-v7a \
@ -24,7 +28,7 @@ To build it for android arm-v7a:
```
or
```
> bazel build --config android_arm --cxxopt=-std=c++11 \
> bazel build --config android_arm --config monolithic --cxxopt=-std=c++11 \
//tensorflow/contrib/lite/examples/label_image:label_image
```


@ -278,6 +278,8 @@ cc_library(
"optimized/neon_tensor_utils.cc",
],
hdrs = [
"common.h",
"optimized/cpu_check.h",
"optimized/neon_tensor_utils.h",
"optimized/tensor_utils_impl.h",
],
@ -285,8 +287,11 @@ cc_library(
deps = [
":cpu_check",
":portable_tensor_utils",
":types",
"//tensorflow/contrib/lite:builtin_op_data",
"//tensorflow/contrib/lite/kernels:activation_functor",
"@arm_neon_2_x86_sse",
"@gemmlowp",
],
)
@ -306,14 +311,21 @@ cc_library(
"tensor_utils.cc",
],
hdrs = [
"common.h",
"compatibility.h",
"optimized/cpu_check.h",
"optimized/neon_tensor_utils.h",
"optimized/tensor_utils_impl.h",
"reference/portable_tensor_utils.h",
"tensor_utils.h",
"types.h",
],
copts = NEON_FLAGS_IF_APPLICABLE,
deps = [
"//tensorflow/contrib/lite/kernels:activation_functor",
"//tensorflow/contrib/lite:builtin_op_data",
"@arm_neon_2_x86_sse",
"@gemmlowp",
] + select({
":arm": [
":neon_tensor_utils",
@ -333,6 +345,18 @@ cc_library(
":ios_arm64": [
":neon_tensor_utils",
],
":x86_64": [
":neon_tensor_utils",
],
":x86": [
":neon_tensor_utils",
],
":k8": [
":neon_tensor_utils",
],
":darwin": [
":neon_tensor_utils",
],
"//conditions:default": [
":portable_tensor_utils",
],


@ -34,7 +34,7 @@ inline bool TestCPUFeatureNeon() {
#endif // __aarch64__
}
#elif __ARM_NEON
#elif defined USE_NEON || defined __ARM_NEON
inline bool TestCPUFeatureNeon() { return true; }


@ -16,11 +16,11 @@ limitations under the License.
#include "tensorflow/contrib/lite/builtin_op_data.h"
#include "tensorflow/contrib/lite/kernels/activation_functor.h"
#include "tensorflow/contrib/lite/kernels/internal/common.h"
#include "tensorflow/contrib/lite/kernels/internal/optimized/tensor_utils_impl.h"
#ifdef USE_NEON
#include <arm_neon.h>
#define kFloatWeightsPerNeonLane 4
namespace tflite {


@ -13,6 +13,7 @@ See the License for the specific language governing permissions and
limitations under the License.
==============================================================================*/
#include "tensorflow/contrib/lite/kernels/internal/tensor_utils.h"
#include "tensorflow/contrib/lite/kernels/internal/common.h"
#ifndef USE_NEON
#if defined(__ARM_NEON__) || defined(__ARM_NEON)


@ -1571,7 +1571,7 @@ inline int ANeuralNetworksModel_addOperation(ANeuralNetworksModel* model,
}
/**
* Specfifies which operands will be the model's inputs and outputs.
* Specifies which operands will be the model's inputs and outputs.
*
* An operand cannot be used for both input and output. Doing so will
* return an error.


@ -132,6 +132,7 @@ bool GraphTransformationsPass(int increment, Model* model,
CHECK(increment == 1 || increment == -1);
bool changed = false;
if (model->operators.empty()) {
LOG(INFO) << "Model is empty!!!";
return false;
}
int op_index = increment == 1 ? 0 : model->operators.size() - 1;


@ -189,7 +189,10 @@ bool ResolveConstantConcatenation::Run(Model* model, std::size_t op_index) {
// Remove all the resolved arrays.
for (const string& input_name : concat_op->inputs) {
model->EraseArray(input_name);
// Check to prevent removal of shared tensors
if (CountOpsWithInput(*model, input_name) == 1) {
model->EraseArray(input_name);
}
}
// Remove concatenate operator
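In sketch form, the guard added above is a use-count check before deletion (hypothetical Python rendering; `erase_array` and `operators` mirror, but are not, TOCO's C++ API):
```
# Only erase an input array when this concat op is its sole consumer,
# so constant tensors shared with other ops survive the rewrite.
def erase_unshared_inputs(model, concat_op):
    for name in concat_op.inputs:
        consumers = sum(name in op.inputs for op in model.operators)
        if consumers == 1:
            model.erase_array(name)
```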


@ -15,6 +15,7 @@ limitations under the License.
#ifndef TENSORFLOW_CONTRIB_LITE_TOCO_MODEL_H_
#define TENSORFLOW_CONTRIB_LITE_TOCO_MODEL_H_
#include <functional>
#include <initializer_list>
#include <memory>
#include <string>


@ -698,10 +698,11 @@ void CheckNonExistentIOArrays(const Model& model) {
void CheckNoMissingArray(const Model& model) {
for (const auto& op : model.operators) {
for (const auto& input : op->inputs) {
CHECK(model.HasArray(input) || model.optional_arrays.count(input));
CHECK(model.HasArray(input) || model.optional_arrays.count(input))
<< "Input: " << input << " missing for op: " << op->outputs[0] << ".";
}
for (const auto& output : op->outputs) {
CHECK(model.HasArray(output));
CHECK(model.HasArray(output)) << "Output: " << output << " missing.";
}
}
CheckNonExistentIOArrays(model);


@ -377,10 +377,10 @@ $(MARCH_OPTION) \
ifeq ($(BUILD_FOR_TEGRA),1)
NVCC := $(JETPACK)/cuda/bin/nvcc
NVCCFLAGS := -x=cu -D__CUDACC__ -DNVCC -DNVIDIA_TEGRA -ccbin $(NDK_ROOT)/toolchains/$(TOOLCHAIN)/prebuilt/$(ANDROID_HOST_OS_ARCH)/bin/$(BIN_PREFIX)-g++ --std c++11 --expt-relaxed-constexpr -m64 -gencode arch=compute_53,\"code=sm_53\" -gencode arch=compute_62,\"code=sm_62\" -DEIGEN_AVOID_STL_ARRAY -DTENSORFLOW_USE_EIGEN_THREADPOOL -DLANG_CXX11 -DEIGEN_HAS_C99_MATH -DGOOGLE_CUDA=1 -DTF_EXTRA_CUDA_CAPABILITIES=5.3
NVCCFLAGS := -x=cu -D__CUDACC__ -DNVCC -DANDROID_TEGRA -ccbin $(NDK_ROOT)/toolchains/$(TOOLCHAIN)/prebuilt/$(ANDROID_HOST_OS_ARCH)/bin/$(BIN_PREFIX)-g++ --std c++11 --expt-relaxed-constexpr -m64 -gencode arch=compute_53,\"code=sm_53\" -gencode arch=compute_62,\"code=sm_62\" -DEIGEN_AVOID_STL_ARRAY -DTENSORFLOW_USE_EIGEN_THREADPOOL -DLANG_CXX11 -DEIGEN_HAS_C99_MATH -DGOOGLE_CUDA=1 -DTF_EXTRA_CUDA_CAPABILITIES=5.3
CXXFLAGS4NVCC =\
-DIS_SLIM_BUILD \
-DNVIDIA_TEGRA \
-DANDROID_TEGRA \
-fno-exceptions \
-DNDEBUG $(OPTFLAGS) \
-march=armv8-a \
@ -391,7 +391,7 @@ $(MARCH_OPTION) \
CXXFLAGS +=\
-DGOOGLE_CUDA=1 \
-D__ANDROID_TYPES_FULL__ \
-DNVIDIA_TEGRA \
-DANDROID_TEGRA \
-DEIGEN_AVOID_STL_ARRAY \
-DEIGEN_HAS_C99_MATH \
-DLANG_CXX11 -DTENSORFLOW_USE_EIGEN_THREADPOOL -DTF_EXTRA_CUDA_CAPABILITIES=5.3
@ -407,7 +407,7 @@ $(MARCH_OPTION) \
-I$(JETPACK)/cuda/extras/CUPTI/include
LIBS += \
CUDA_LIBS := \
-ltfcuda \
-lcudart_static \
-lcudnn \
@ -420,10 +420,10 @@ $(MARCH_OPTION) \
-lculibos \
-lcurand_static
OBJDIR := $(OBJDIR)Tegra/
LIBDIR := $(LIBDIR)Tegra/
BINDIR := $(BINDIR)Tegra/
DEPDIR := $(DEPDIR)Tegra/
OBJDIR := $(OBJDIR)android_arm64-v8a/
LIBDIR := $(LIBDIR)android_arm64-v8a/
BINDIR := $(BINDIR)android_arm64-v8a/
DEPDIR := $(DEPDIR)android_arm64-v8a/
TEGRA_LIBS := \
-L$(JETPACK)/cuda/targets/aarch64-linux-androideabi/lib \
@ -606,7 +606,8 @@ $(wildcard tensorflow/core/util/*/*.cc) \
tensorflow/core/util/version_info.cc
# Remove duplicates (for version_info.cc)
CORE_CC_ALL_SRCS := $(sort $(CORE_CC_ALL_SRCS))
CORE_CC_EXCLUDE_SRCS := \
CORE_CC_EXCLUDE_SRCS_NON_GPU := \
$(wildcard tensorflow/core/*/*test.cc) \
$(wildcard tensorflow/core/*/*testutil*) \
$(wildcard tensorflow/core/*/*testlib*) \
@ -626,49 +627,31 @@ $(wildcard tensorflow/core/lib/jpeg/*) \
$(wildcard tensorflow/core/lib/png/*) \
$(wildcard tensorflow/core/util/events_writer.*) \
$(wildcard tensorflow/core/util/reporter.*) \
$(wildcard tensorflow/core/platform/default/cuda_libdevice_path.*) \
$(wildcard tensorflow/core/platform/default/stream_executor.*) \
$(wildcard tensorflow/core/platform/default/test_benchmark.*) \
$(wildcard tensorflow/core/platform/cuda.h) \
$(wildcard tensorflow/core/platform/cuda_libdevice_path.*) \
$(wildcard tensorflow/core/platform/cloud/*) \
$(wildcard tensorflow/core/platform/google/*) \
$(wildcard tensorflow/core/platform/google/*/*) \
$(wildcard tensorflow/core/platform/jpeg.*) \
$(wildcard tensorflow/core/platform/png.*) \
$(wildcard tensorflow/core/platform/s3/*) \
$(wildcard tensorflow/core/platform/stream_executor.*) \
$(wildcard tensorflow/core/platform/windows/*) \
$(wildcard tensorflow/core/user_ops/*.cu.cc) \
$(wildcard tensorflow/core/common_runtime/gpu/*) \
$(wildcard tensorflow/core/common_runtime/gpu_device_factory.*) \
$(wildcard tensorflow/core/grappler/inputs/trivial_test_graph_input_yielder.*) \
$(wildcard tensorflow/core/grappler/inputs/file_input_yielder.*) \
$(wildcard tensorflow/core/grappler/clusters/single_machine.*)
$(wildcard tensorflow/core/grappler/clusters/single_machine.*) \
tensorflow/core/util/cuda_kernel_helper_test.cu.cc
CORE_CC_EXCLUDE_SRCS := \
$(CORE_CC_EXCLUDE_SRCS_NON_GPU) \
$(wildcard tensorflow/core/platform/stream_executor.*) \
$(wildcard tensorflow/core/platform/default/cuda_libdevice_path.*) \
$(wildcard tensorflow/core/platform/cuda.h) \
$(wildcard tensorflow/core/platform/cuda_libdevice_path.*) \
$(wildcard tensorflow/core/user_ops/*.cu.cc) \
$(wildcard tensorflow/core/common_runtime/gpu/*) \
$(wildcard tensorflow/core/common_runtime/gpu_device_factory.*)
ifeq ($(BUILD_FOR_TEGRA),1)
CORE_CC_ALL_SRCS := \
$(wildcard tensorflow/core/*.cc) \
$(wildcard tensorflow/core/common_runtime/*.cc) \
$(wildcard tensorflow/core/common_runtime/gpu/*.cc) \
$(wildcard tensorflow/core/framework/*.cc) \
$(wildcard tensorflow/core/graph/*.cc) \
$(wildcard tensorflow/core/platform/*.cc) \
$(wildcard tensorflow/core/platform/*/*.cc) \
$(wildcard tensorflow/core/platform/*/*/*.cc) \
$(wildcard tensorflow/core/util/*.cc) \
$(wildcard tensorflow/core/util/*/*.cc) \
$(wildcard tensorflow/cc/training/*.cc) \
$(wildcard tensorflow/stream_executor/*.cc) \
$(wildcard tensorflow/stream_executor/*/*.cc) \
$(wildcard tensorflow/core/grappler/optimizers/*.cc) \
$(wildcard tensorflow/core/grappler/*.cc) \
$(wildcard tensorflow/core/grappler/costs/*.cc) \
$(wildcard tensorflow/core/grappler/clusters/*.cc) \
$(wildcard tensorflow/core/grappler/utils/*.cc) \
$(wildcard tensorflow/core/lib/core/*.cc) \
$(wildcard tensorflow/core/lib/*/*.cc) \
tensorflow/core/grappler/inputs/utils.cc \
CORE_CC_ALL_SRCS := $(CORE_CC_ALL_SRCS) \
tensorflow/core/kernels/concat_lib_gpu.cc \
tensorflow/core/kernels/cuda_solvers.cc \
tensorflow/core/kernels/cudnn_pooling_gpu.cc \
@ -677,28 +660,14 @@ tensorflow/core/kernels/fractional_avg_pool_op.cc \
tensorflow/core/kernels/fractional_max_pool_op.cc \
tensorflow/core/kernels/fractional_pool_common.cc \
tensorflow/core/kernels/pooling_ops_3d.cc \
tensorflow/core/kernels/sparse_fill_empty_rows_op.cc
tensorflow/core/kernels/sparse_fill_empty_rows_op.cc \
tensorflow/core/kernels/list_kernels.cc \
$(wildcard tensorflow/core/common_runtime/gpu/*.cc) \
$(wildcard tensorflow/stream_executor/*.cc) \
$(wildcard tensorflow/stream_executor/*/*.cc)
CORE_CC_EXCLUDE_SRCS := \
$(wildcard tensorflow/core/*/*test.cc) \
$(wildcard tensorflow/core/*/*testutil*) \
$(wildcard tensorflow/core/*/*testlib*) \
$(wildcard tensorflow/core/*/*/*test.cc) \
$(wildcard tensorflow/core/*/*/*testutil*) \
$(wildcard tensorflow/core/framework/op_gen_lib.cc) \
$(wildcard tensorflow/core/lib/gif/*) \
$(wildcard tensorflow/core/lib/jpeg/*) \
$(wildcard tensorflow/core/lib/png/*) \
$(wildcard tensorflow/core/lib/db/*) \
$(wildcard tensorflow/core/platform/jpeg.*) \
$(wildcard tensorflow/core/platform/png.*) \
$(wildcard tensorflow/core/platform/cloud/*) \
$(wildcard tensorflow/core/platform/s3/*) \
$(wildcard tensorflow/core/platform/windows/*) \
$(wildcard tensorflow/core/*/*/*testlib*) \
$(wildcard tensorflow/cc/training/*test.cc) \
tensorflow/core/lib/io/record_reader.cc \
tensorflow/core/util/cuda_kernel_helper_test.cu.cc
$(CORE_CC_EXCLUDE_SRCS_NON_GPU)
CUDA_CC_SRCS := $(wildcard tensorflow/core/kernels/*.cu.cc)
CUDA_CC_OBJS := $(addprefix $(OBJDIR), $(CUDA_CC_SRCS:.cc=.o))
@ -760,7 +729,7 @@ $(BENCHMARK_NAME): $(BENCHMARK_OBJS) $(LIB_PATH) $(CUDA_LIB_DEPS)
@mkdir -p $(dir $@)
$(CXX) $(CXXFLAGS) $(INCLUDES) \
-o $(BENCHMARK_NAME) $(BENCHMARK_OBJS) \
$(LIBFLAGS) $(TEGRA_LIBS) $(LIB_PATH) $(LDFLAGS) $(LIBS)
$(LIBFLAGS) $(TEGRA_LIBS) $(LIB_PATH) $(LDFLAGS) $(LIBS) $(CUDA_LIBS)
# NVCC compilation rules for Tegra
ifeq ($(BUILD_FOR_TEGRA),1)


@ -18,7 +18,7 @@
set -e
usage() {
echo "Usage: NDK_ROOT=<path to ndk root> $(basename "$0") [-Es:t:Tx:a:X]"
echo "Usage: NDK_ROOT=<path to ndk root> $(basename "$0") [-Es:t:Tx:a]"
echo "-E enable experimental hexnn ops"
echo "-s [sub_makefiles] sub makefiles separated by white space"
echo "-t [build_target] build target for Android makefile [default=all]"


@ -96,7 +96,7 @@ if [[ "${ONLY_MAKE_TENSORFLOW}" != "true" ]]; then
if [[ -z "${BUILD_ARCH}" ]]; then
# Compile protobuf for the target iOS device architectures.
tensorflow/contrib/makefile/compile_ios_protobuf.sh -a ${DEFAULT_ARCH}
tensorflow/contrib/makefile/compile_ios_protobuf.sh
else
# Compile protobuf for the target iOS device architectures.
tensorflow/contrib/makefile/compile_ios_protobuf.sh -a ${BUILD_ARCH}


@ -76,6 +76,8 @@ GEN_LIBS_DIR="${GEN_DIR}/libs"
GEN_DOWNLOAD_DIR="${GEN_DIR}/downloads"
URL_BASE="https://storage.googleapis.com/download.tensorflow.org"
ARCH="armeabi-v7a"
source "${SCRIPT_DIR}/../build_helper.subr"
rm -rf "${GEN_DIR}"
@ -219,7 +221,7 @@ if [[ "${BUILD_ONLY}" != "true" ]]; then
adb push "${GEN_LIBS_DIR}/libhexagon_nn_skel.so" "/vendor/lib/rfsa/adsp"
adb push -p \
"${TF_ROOT_DIR}/tensorflow/contrib/makefile/gen/bin/hexagon_graph_execution" \
"${TF_ROOT_DIR}/tensorflow/contrib/makefile/gen/bin/android_${ARCH}/hexagon_graph_execution" \
"/data/local/tmp/"
adb wait-for-device
adb shell chmod "${ANDROID_EXEC_FILE_MODE}" \


@ -54,7 +54,7 @@ $(INFERENCE_SO_PATH): $(LIB_OBJS) $(INFERENCE_OBJS) $(CUDA_LIB_DEPS)
-o $@ $(INFERENCE_OBJS) $(LIB_OBJS) $(TEGRA_LIBS) \
$(LIBFLAGS) $(LDFLAGS) \
-shared -Wl,-soname,$(INFERENCE_SO_NAME) \
$(LIBS)
$(LIBS) $(CUDA_LIBS)
$(INFERENCE_SO_NAME): $(INFERENCE_SO_PATH)


@ -91,6 +91,7 @@ tensorflow/core/kernels/reduction_ops_max.cc
tensorflow/core/kernels/reduction_ops_common.cc
tensorflow/core/kernels/reduction_ops_any.cc
tensorflow/core/kernels/reduction_ops_all.cc
tensorflow/core/kernels/roll_op.cc
tensorflow/core/kernels/queue_ops.cc
tensorflow/core/kernels/queue_base.cc
tensorflow/core/kernels/pooling_ops_common.cc
@ -270,6 +271,7 @@ tensorflow/core/ops/parsing_ops.cc
tensorflow/core/ops/no_op.cc
tensorflow/core/ops/nn_ops.cc
tensorflow/core/ops/nn_grad.cc
tensorflow/core/ops/manip_ops.cc
tensorflow/core/ops/math_ops.cc
tensorflow/core/ops/math_grad.cc
tensorflow/core/ops/logging_ops.cc
@ -291,3 +293,4 @@ tensorflow/core/kernels/batchtospace_op.cc
tensorflow/core/kernels/warn_about_ints.cc
tensorflow/core/kernels/segment_reduction_ops.cc
tensorflow/core/kernels/batch_util.cc
tensorflow/core/ops/audio_ops.cc


@ -151,7 +151,7 @@ MPIRemoteRendezvous::~MPIRemoteRendezvous() {}
void MPIRendezvousMgr::AddRequest(RecvTensorRequest request,
const int mpi_dst) {
TF_CHECK_OK(recv_tensor_recent_request_ids_.TrackUnique(
req.request_id(), "RecvTensor (MPIRendezvousMgr)", req));
request.request_id(), "RecvTensor (MPIRendezvousMgr)", request));
const int64 step_id = request.step_id();
const std::string& key = request.rendezvous_key();
Rendezvous::ParsedKey parsed;


@ -33,6 +33,7 @@ limitations under the License.
#include "tensorflow/contrib/mpi/mpi_msg.pb.h"
#include "tensorflow/contrib/mpi/mpi_utils.h"
#include "tensorflow/core/distributed_runtime/base_rendezvous_mgr.h"
#include "tensorflow/core/distributed_runtime/recent_request_ids.h"
#include "tensorflow/core/distributed_runtime/request_id.h"
#include "tensorflow/core/distributed_runtime/worker_env.h"
#include "tensorflow/core/protobuf/worker.pb.h"


@ -12,7 +12,11 @@
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Library of multidimensional LSTM models and related code."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from tensorflow.contrib.ndlstm.python import lstm1d
from tensorflow.contrib.ndlstm.python import lstm2d


@ -22,7 +22,6 @@ from six.moves import xrange # pylint: disable=redefined-builtin
from tensorflow.contrib.framework.python.ops import variables
from tensorflow.python.framework import constant_op
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import math_ops
from tensorflow.python.ops import nn_ops
from tensorflow.python.ops import random_ops
from tensorflow.python.ops import rnn
@ -85,18 +84,11 @@ def ndlstm_base_dynamic(inputs, noutput, scope=None, reverse=False):
Output sequence (length, batch_size, noutput)
"""
with variable_scope.variable_scope(scope, "SeqLstm", [inputs]):
# TODO(tmb) make batch size, sequence_length dynamic
# example: sequence_length = tf.shape(inputs)[0]
_, batch_size, _ = _shape(inputs)
lstm_cell = rnn_cell.BasicLSTMCell(noutput, state_is_tuple=False)
state = array_ops.zeros([batch_size, lstm_cell.state_size])
sequence_length = int(inputs.get_shape()[0])
sequence_lengths = math_ops.to_int64(
array_ops.fill([batch_size], sequence_length))
lstm_cell = rnn_cell.BasicLSTMCell(noutput)
if reverse:
inputs = array_ops.reverse_v2(inputs, [0])
outputs, _ = rnn.dynamic_rnn(
lstm_cell, inputs, sequence_lengths, state, time_major=True)
lstm_cell, inputs, time_major=True, dtype=inputs.dtype)
if reverse:
outputs = array_ops.reverse_v2(outputs, [0])
return outputs


@ -397,10 +397,6 @@ class ScipyOptimizerInterface(ExternalOptimizerInterface):
'automatically and cannot be injected manually'.format(kwarg))
minimize_kwargs.update(optimizer_kwargs)
if method == 'SLSQP':
# SLSQP doesn't support step callbacks. Obviate associated warning
# message.
del minimize_kwargs['callback']
import scipy.optimize # pylint: disable=g-import-not-at-top
result = scipy.optimize.minimize(*minimize_args, **minimize_kwargs)


@ -299,6 +299,45 @@ class ScipyOptimizerInterfaceTest(TestCase):
method = optimizer.optimizer_kwargs.get('method')
self.assertEqual('SLSQP', method)
def test_callbacks(self):
vector_val = np.array([7., -2.], dtype=np.float32)
vector = variables.Variable(vector_val, 'vector')
minimum_location_val = np.arange(2)
minimum_location = constant_op.constant(
minimum_location_val, dtype=dtypes.float32)
loss = math_ops.reduce_sum(math_ops.square(vector - minimum_location)) / 2.
loss_val_first = ((vector_val - minimum_location_val)**2).sum() / 2.
optimizer = external_optimizer.ScipyOptimizerInterface(loss, method='SLSQP')
with self.test_session() as sess:
sess.run(variables.global_variables_initializer())
initial_vector_val = sess.run(vector)
extra_fetches = [loss]
step_callback = test.mock.Mock()
loss_callback = test.mock.Mock()
optimizer.minimize(
sess,
fetches=extra_fetches,
loss_callback=loss_callback,
step_callback=step_callback)
loss_val_last = sess.run(loss)
call_first = test.mock.call(loss_val_first)
call_last = test.mock.call(loss_val_last)
loss_calls = [call_first, call_last]
loss_callback.assert_has_calls(loss_calls, any_order=True)
args, _ = step_callback.call_args
self.assertAllClose(minimum_location_val, args[0])
if __name__ == '__main__':
test.main()


@ -86,8 +86,8 @@ def convert_inline(f, *args, **kwargs):
def convert(recursive=False, arg_types=None):
"""Decorator that compiles a function to graph mode.
The decorator is dynamic - invoking compilation whenever the decorated fuction
is called. This means the parameter values are known at compilation.
The decorator is dynamic - invoking compilation whenever the decorated
function is called. This means the parameter values are known at compilation.
Args:
recursive: Whether to recursively convert any functions that the decorator


@ -61,7 +61,7 @@ def _compute_output_resolution(input_spatial_resolution, kernel_size, stride,
stride: Stride (int).
total_padding: Total padding to be applied (int).
Returns:
output_resolution: Ouput dimension (int) or None.
output_resolution: Output dimension (int) or None.
"""
if (input_spatial_resolution is None) or (kernel_size is None) or (
stride is None) or (total_padding is None):


@ -87,9 +87,9 @@ and 'indices' is [[0,1]
[1,1]
[0,2]],
the the output will be [[ 1, 2, 3]
[ 0, 0, 0]
[41,52,63]].
the output will be [[ 1, 2, 3]
[ 0, 0, 0]
[41,52,63]].
```
The data must be at least rank 1. The indices must be of shape (?,2) where the
@ -132,9 +132,9 @@ and 'indices' is [[0,1]
[1,1]
[0,2]],
the the output will be [[ 1, 2, 3]
[ 1, 1, 1]
[40,100,180]].
the output will be [[ 1, 2, 3]
[ 1, 1, 1]
[40,100,180]].
```
The data must be at least rank 1. The indices can be of shape (?,2) where the
@ -189,9 +189,9 @@ and 'indices' is [[0,1]
[1,1]
[0,2]],
the the output will be [[ 1, 20, 3]
[ -BIG_VALUE, -BIG_VALUE, -BIG_VALUE]
[ 400, 20, 60]].
the output will be [[ 1, 20, 3]
[ -BIG_VALUE, -BIG_VALUE, -BIG_VALUE]
[ 400, 20, 60]].
```
The data must be at least rank 1. The indices can be of shape (?,2) where the
@ -246,9 +246,9 @@ and 'indices' is [[0,1]
[1,1]
[0,2]],
the the output will be [[ 1, 20, 3]
[ +BIG_VALUE, +BIG_VALUE, +BIG_VALUE]
[ 1, 5, 3]].
the output will be [[ 1, 20, 3]
[ +BIG_VALUE, +BIG_VALUE, +BIG_VALUE]
[ 1, 5, 3]].
```
The data must be at least rank 1. The indices can be of shape (?,2) where the


@ -157,6 +157,21 @@ class RNNCellTest(test.TestCase):
# Smoke test
self.assertAllClose(res[0], [[0.509682, 0.509682]])
def testSRUCellWithDiffSize(self):
with self.test_session() as sess:
with variable_scope.variable_scope(
"root", initializer=init_ops.constant_initializer(0.5)):
x = array_ops.zeros([1, 3])
m = array_ops.zeros([1, 2])
g, _ = contrib_rnn_cell.SRUCell(2)(x, m)
sess.run([variables_lib.global_variables_initializer()])
res = sess.run([g], {
x.name: np.array([[1., 1., 1.]]),
m.name: np.array([[0.1, 0.1]])
})
# Smoke test
self.assertAllClose(res[0], [[0.55255556, 0.55255556]])
def testBasicLSTMCell(self):
for dtype in [dtypes.float16, dtypes.float32]:
np_dtype = dtype.as_numpy_dtype


@ -1635,6 +1635,5 @@ class WeightNormLSTMCellTest(test.TestCase):
self.assertAllClose(expected_c, actual_c, 1e-5)
self.assertAllClose(expected_h, actual_h, 1e-5)
if __name__ == "__main__":
test.main()


@ -2731,25 +2731,9 @@ class SRUCell(rnn_cell_impl._LayerRNNCell):
input_depth = inputs_shape[1].value
# Here the contributor believes that the following constraints
# are implied. The reasoning is explained here with reference to
# the paper https://arxiv.org/pdf/1709.02755.pdf upon which this
# implementation is based.
# In section 2.1 Equation 5, specifically:
# h_t = r_t \odot g(c_t) + (1 - r_t) \odot x_t
# the pointwise operation between r_t and x_t means they have
# the same shape (since we are implementing an RNN cell, broadcasting
# does not happen to input of a single timestep); by the same
# reasons, x_t has the same shape as h_t, essentially mandating that
# input_depth = unit_num.
if input_depth != self._num_units:
raise ValueError("SRU requires input_depth == num_units, got "
"input_depth = %s, num_units = %s" % (input_depth,
self._num_units))
self._kernel = self.add_variable(
rnn_cell_impl._WEIGHTS_VARIABLE_NAME,
shape=[input_depth, 3 * self._num_units])
shape=[input_depth, 4 * self._num_units])
self._bias = self.add_variable(
rnn_cell_impl._BIAS_VARIABLE_NAME,
@ -2762,8 +2746,8 @@ class SRUCell(rnn_cell_impl._LayerRNNCell):
"""Simple recurrent unit (SRU) with num_units cells."""
U = math_ops.matmul(inputs, self._kernel)
x_bar, f_intermediate, r_intermediate = array_ops.split(
value=U, num_or_size_splits=3, axis=1)
x_bar, f_intermediate, r_intermediate, x_tx = array_ops.split(
value=U, num_or_size_splits=4, axis=1)
f_r = math_ops.sigmoid(
nn_ops.bias_add(
@ -2771,7 +2755,7 @@ class SRUCell(rnn_cell_impl._LayerRNNCell):
f, r = array_ops.split(value=f_r, num_or_size_splits=2, axis=1)
c = f * state + (1.0 - f) * x_bar
h = r * self._activation(c) + (1.0 - r) * inputs
h = r * self._activation(c) + (1.0 - r) * x_tx
return h, c
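For reference, the recurrence implied by the four-way split above (a paraphrase of the paper's equations; \(x'_t\) names the new fourth projection):

\[
c_t = f_t \odot c_{t-1} + (1 - f_t) \odot \tilde{x}_t,
\qquad
h_t = r_t \odot g(c_t) + (1 - r_t) \odot x'_t
\]

Previously the highway term used the raw input \(x_t\), which forced input_depth == num_units; with \(x'_t\) taken from the fourth slice of \(U\), that constraint (and the long comment removed above) is no longer needed.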


@ -924,8 +924,7 @@ class LuongMonotonicAttention(_BaseMonotonicAttentionMechanism):
_monotonic_probability_fn, sigmoid_noise=sigmoid_noise, mode=mode,
seed=sigmoid_noise_seed)
super(LuongMonotonicAttention, self).__init__(
query_layer=layers_core.Dense(
num_units, name="query_layer", use_bias=False, dtype=dtype),
query_layer=None,
memory_layer=layers_core.Dense(
num_units, name="memory_layer", use_bias=False, dtype=dtype),
memory=memory,


@ -82,7 +82,8 @@ def _convert_default_signature_to_signature_def(signatures):
"""
default_signature = signatures.default_signature
signature_def = meta_graph_pb2.SignatureDef()
if default_signature.WhichOneof("type") == "regression_signature":
if (default_signature.WhichOneof("type") ==
legacy_constants.REGRESSION_SIGNATURE):
regression_signature = default_signature.regression_signature
signature_def.method_name = signature_constants.REGRESS_METHOD_NAME
_add_input_to_signature_def(regression_signature.input.tensor_name,
@ -91,7 +92,8 @@ def _convert_default_signature_to_signature_def(signatures):
_add_output_to_signature_def(regression_signature.output.tensor_name,
signature_constants.REGRESS_OUTPUTS,
signature_def)
elif default_signature.WhichOneof("type") == "classification_signature":
elif (default_signature.WhichOneof("type") ==
legacy_constants.CLASSIFICATION_SIGNATURE):
classification_signature = default_signature.classification_signature
signature_def.method_name = signature_constants.CLASSIFY_METHOD_NAME
_add_input_to_signature_def(classification_signature.input.tensor_name,
@ -132,8 +134,9 @@ def _convert_named_signatures_to_signature_def(signatures):
signature_constants.PREDICT_OUTPUTS]
# TODO(pdudnik): what if there are other signatures? Mimic cr/140900781 once
# it is submitted.
if (input_signature.WhichOneof("type") != "generic_signature" or
output_signature.WhichOneof("type") != "generic_signature"):
if (input_signature.WhichOneof("type") != legacy_constants.GENERIC_SIGNATURE
or output_signature.WhichOneof("type") !=
legacy_constants.GENERIC_SIGNATURE):
raise RuntimeError("Named input and output signatures can only be "
"up-converted if they are generic signature. "
"Input signature type is %s, output signature type is "


@ -32,3 +32,6 @@ INIT_OP_KEY = "serving_init_op"
SIGNATURES_KEY = "serving_signatures"
ASSETS_KEY = "serving_assets"
GRAPH_KEY = "serving_graph"
REGRESSION_SIGNATURE = "regression_signature"
CLASSIFICATION_SIGNATURE = "classification_signature"
GENERIC_SIGNATURE = "generic_signature"


@ -29,7 +29,6 @@ from tensorflow.contrib.framework.python.ops import variables as variables_lib
from tensorflow.contrib.metrics.python.ops import metric_ops
from tensorflow.contrib.slim.python.slim import evaluation
from tensorflow.contrib.training.python.training import evaluation as evaluation_lib
from tensorflow.core.protobuf import saver_pb2
from tensorflow.python.debug.lib import debug_data
from tensorflow.python.debug.wrappers import hooks
from tensorflow.python.framework import constant_op
@ -236,7 +235,7 @@ class SingleEvaluationTest(test.TestCase):
def _prepareCheckpoint(self, checkpoint_path):
init_op = control_flow_ops.group(variables.global_variables_initializer(),
variables.local_variables_initializer())
saver = saver_lib.Saver(write_version=saver_pb2.SaverDef.V1)
saver = saver_lib.Saver()
with self.test_session() as sess:
sess.run(init_op)
saver.save(sess, checkpoint_path)


@ -45,32 +45,67 @@ def _get_linear_equations_tests(dtype_, use_static_shape_, shape_):
low=-1.0, high=1.0, size=np.prod(shape_)).reshape(shape_).astype(dtype_)
# Make it self-adjoint and positive definite.
a_np = np.dot(a_np.T, a_np)
# jacobi preconditioner
jacobi_np = np.zeros_like(a_np)
jacobi_np[range(a_np.shape[0]), range(a_np.shape[1])] = (
1.0 / a_np.diagonal())
rhs_np = np.random.uniform(
low=-1.0, high=1.0, size=shape_[0]).astype(dtype_)
x_np = np.zeros_like(rhs_np)
tol = 1e-6 if dtype_ == np.float64 else 1e-3
max_iter = 20
with self.test_session() as sess:
if use_static_shape_:
a = constant_op.constant(a_np)
rhs = constant_op.constant(rhs_np)
x = constant_op.constant(x_np)
jacobi = constant_op.constant(jacobi_np)
else:
a = array_ops.placeholder(dtype_)
rhs = array_ops.placeholder(dtype_)
x = array_ops.placeholder(dtype_)
jacobi = array_ops.placeholder(dtype_)
operator = util.create_operator(a)
cg_graph = linear_equations.conjugate_gradient(
operator, rhs, tol=tol, max_iter=max_iter)
if use_static_shape_:
cg_val = sess.run(cg_graph)
else:
cg_val = sess.run(cg_graph, feed_dict={a: a_np, rhs: rhs_np})
norm_r0 = np.linalg.norm(rhs_np)
norm_r = np.sqrt(cg_val.gamma)
self.assertLessEqual(norm_r, tol * norm_r0)
# Validate that we get an equally small residual norm with numpy
# using the computed solution.
r_np = rhs_np - np.dot(a_np, cg_val.x)
norm_r_np = np.linalg.norm(r_np)
self.assertLessEqual(norm_r_np, tol * norm_r0)
preconditioners = [
None, util.identity_operator(a),
util.create_operator(jacobi)
]
cg_results = []
for preconditioner in preconditioners:
cg_graph = linear_equations.conjugate_gradient(
operator,
rhs,
preconditioner=preconditioner,
x=x,
tol=tol,
max_iter=max_iter)
if use_static_shape_:
cg_val = sess.run(cg_graph)
else:
cg_val = sess.run(
cg_graph,
feed_dict={
a: a_np,
rhs: rhs_np,
x: x_np,
jacobi: jacobi_np
})
norm_r0 = np.linalg.norm(rhs_np)
norm_r = np.linalg.norm(cg_val.r)
self.assertLessEqual(norm_r, tol * norm_r0)
# Validate that we get an equally small residual norm with numpy
# using the computed solution.
r_np = rhs_np - np.dot(a_np, cg_val.x)
norm_r_np = np.linalg.norm(r_np)
self.assertLessEqual(norm_r_np, tol * norm_r0)
cg_results.append(cg_val)
# Validate that we get same results using identity_preconditioner
# and None
self.assertEqual(cg_results[0].i, cg_results[1].i)
self.assertAlmostEqual(cg_results[0].gamma, cg_results[1].gamma)
self.assertAllClose(cg_results[0].r, cg_results[1].r, rtol=tol)
self.assertAllClose(cg_results[0].x, cg_results[1].x, rtol=tol)
self.assertAllClose(cg_results[0].p, cg_results[1].p, rtol=tol)
return [test_conjugate_gradient]


@ -63,6 +63,43 @@ class UtilTest(test.TestCase):
def testCreateOperatorUnknownShape(self):
self._testCreateOperator(False)
def _testIdentityOperator(self, use_static_shape_):
for dtype in np.float32, np.float64:
a_np = np.array([[1., 2.], [3., 4.], [5., 6.]], dtype=dtype)
x_np = np.array([[2.], [-3.]], dtype=dtype)
y_np = np.array([[2], [-3.], [5.]], dtype=dtype)
with self.test_session() as sess:
if use_static_shape_:
a = constant_op.constant(a_np, dtype=dtype)
x = constant_op.constant(x_np, dtype=dtype)
y = constant_op.constant(y_np, dtype=dtype)
else:
a = array_ops.placeholder(dtype)
x = array_ops.placeholder(dtype)
y = array_ops.placeholder(dtype)
id_op = util.identity_operator(a)
ax = id_op.apply(x)
aty = id_op.apply_adjoint(y)
op_shape = ops.convert_to_tensor(id_op.shape)
if use_static_shape_:
op_shape_val, ax_val, aty_val = sess.run([op_shape, ax, aty])
else:
op_shape_val, ax_val, aty_val = sess.run(
[op_shape, ax, aty], feed_dict={
a: a_np,
x: x_np,
y: y_np
})
self.assertAllEqual(op_shape_val, [3, 2])
self.assertAllClose(ax_val, x_np)
self.assertAllClose(aty_val, y_np)
def testIdentityOperator(self):
self._testIdentityOperator(True)
def testIdentityOperatorUnknownShape(self):
self._testIdentityOperator(False)
def testL2Norm(self):
with self.test_session():
x_np = np.array([[2], [-3.], [5.]])


@ -26,11 +26,14 @@ from tensorflow.python.framework import dtypes
from tensorflow.python.framework import ops
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import control_flow_ops
from tensorflow.python.ops import linalg_ops
from tensorflow.python.ops import math_ops
def conjugate_gradient(operator,
rhs,
preconditioner=None,
x=None,
tol=1e-4,
max_iter=20,
name="conjugate_gradient"):
@ -55,6 +58,15 @@ def conjugate_gradient(operator,
vector with the result of applying the operator to `x`, i.e. if
`operator` represents matrix `A`, `apply` should return `A * x`.
rhs: A rank-1 `Tensor` of shape `[N]` containing the right-hand size vector.
preconditioner: An object representing a linear operator, see `operator`
for details. The preconditioner should approximate the inverse of `A`.
An efficient preconditioner could dramatically improve the rate of
convergence. If `preconditioner` represents matrix `M` (`M` approximates
`A^{-1}`), the algorithm uses `preconditioner.apply(x)` to estimate
`A^{-1}x`. For this to be useful, the cost of applying `M` should be
much lower than computing `A^{-1}` directly.
x: A rank-1 `Tensor` of shape `[N]` containing the initial guess for the
solution.
tol: A float scalar convergence tolerance.
max_iter: An integer giving the maximum number of iterations.
name: A name scope for the operation.
@ -65,35 +77,49 @@ def conjugate_gradient(operator,
- x: A rank-1 `Tensor` of shape `[N]` containing the computed solution.
- r: A rank-1 `Tensor` of shape `[M]` containing the residual vector.
- p: A rank-1 `Tensor` of shape `[N]`. `A`-conjugate basis vector.
- gamma: \\(||r||_2^2\\)
- gamma: \\(r \cdot (M r)\\), equivalent to \\(||r||_2^2\\) when
`preconditioner=None`.
"""
# ephemeral class holding CG state.
cg_state = collections.namedtuple("CGState", ["i", "x", "r", "p", "gamma"])
def stopping_criterion(i, state):
return math_ops.logical_and(i < max_iter, state.gamma > tol)
return math_ops.logical_and(i < max_iter, linalg_ops.norm(state.r) > tol)
# TODO(rmlarsen): add preconditioning
def cg_step(i, state):
def cg_step(i, state): # pylint: disable=missing-docstring
z = operator.apply(state.p)
alpha = state.gamma / util.dot(state.p, z)
x = state.x + alpha * state.p
r = state.r - alpha * z
gamma = util.l2norm_squared(r)
beta = gamma / state.gamma
p = r + beta * state.p
if preconditioner is None:
gamma = util.dot(r, r)
beta = gamma / state.gamma
p = r + beta * state.p
else:
q = preconditioner.apply(r)
gamma = util.dot(r, q)
beta = gamma / state.gamma
p = q + beta * state.p
return i + 1, cg_state(i + 1, x, r, p, gamma)
with ops.name_scope(name):
n = operator.shape[1:]
rhs = array_ops.expand_dims(rhs, -1)
gamma0 = util.l2norm_squared(rhs)
tol = tol * tol * gamma0
x = array_ops.expand_dims(
array_ops.zeros(
n, dtype=rhs.dtype.base_dtype), -1)
if x is None:
x = array_ops.expand_dims(
array_ops.zeros(n, dtype=rhs.dtype.base_dtype), -1)
r0 = rhs
else:
x = array_ops.expand_dims(x, -1)
r0 = rhs - operator.apply(x)
if preconditioner is None:
p0 = r0
else:
p0 = preconditioner.apply(r0)
gamma0 = util.dot(r0, p0)
tol *= linalg_ops.norm(r0)
i = constant_op.constant(0, dtype=dtypes.int32)
state = cg_state(i=i, x=x, r=rhs, p=rhs, gamma=gamma0)
state = cg_state(i=i, x=x, r=r0, p=p0, gamma=gamma0)
_, state = control_flow_ops.while_loop(stopping_criterion, cg_step,
[i, state])
return cg_state(
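For reference, a compact NumPy sketch of the same preconditioned CG recursion (assuming a symmetric positive-definite matrix `A` and an optional preconditioner `M` approximating `A^{-1}`):
```
import numpy as np

def pcg(A, b, M=None, x=None, tol=1e-6, max_iter=20):
  """Preconditioned conjugate gradient, mirroring cg_step above."""
  x = np.zeros_like(b) if x is None else x
  r = b - A @ x                      # initial residual
  p = r if M is None else M @ r     # preconditioned search direction
  gamma = r @ p
  for _ in range(max_iter):
    if np.linalg.norm(r) <= tol * np.linalg.norm(b):
      break
    z = A @ p
    alpha = gamma / (p @ z)
    x, r = x + alpha * p, r - alpha * z
    q = r if M is None else M @ r
    gamma_new = r @ q
    p = q + (gamma_new / gamma) * p  # beta = gamma_new / gamma
    gamma = gamma_new
  return x
```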


@ -45,6 +45,23 @@ def create_operator(matrix):
apply_adjoint=lambda v: math_ops.matmul(matrix, v, adjoint_a=True))
def identity_operator(matrix):
"""Creates a linear operator from a rank-2 identity tensor."""
linear_operator = collections.namedtuple(
"LinearOperator", ["shape", "dtype", "apply", "apply_adjoint"])
shape = matrix.get_shape()
if shape.is_fully_defined():
shape = shape.as_list()
else:
shape = array_ops.shape(matrix)
return linear_operator(
shape=shape,
dtype=matrix.dtype,
apply=lambda v: v,
apply_adjoint=lambda v: v)
# TODO(rmlarsen): Measure if we should just call matmul.
def dot(x, y):
return math_ops.reduce_sum(math_ops.conj(x) * y)
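
For reference, `dot` as defined above is the conjugate inner product, so it is also correct for complex dtypes; a quick NumPy sanity check of the same identity:

```
import numpy as np

# dot(x, y) = sum(conj(x) * y), i.e. the standard complex inner product.
x = np.array([1 + 2j, 3 + 0j])
y = np.array([2 + 0j, 1 + 1j])
print(np.sum(np.conj(x) * y))  # equals np.vdot(x, y)
print(np.vdot(x, y))
```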

View File

@ -17,6 +17,7 @@
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from absl import flags
import os
import subprocess
@ -24,13 +25,21 @@ import sys
import tensorflow as tf
tf.flags.DEFINE_string('service_addr', '',
'Address of TPU profiler service e.g. localhost:8466')
tf.flags.DEFINE_string('logdir', '',
'Path of TensorBoard log directory e.g. /tmp/tb_log')
tf.flags.DEFINE_integer('duration_ms', 2000, 'Duration of tracing in ms.')
flags.DEFINE_string(
'service_addr', None, 'Address of TPU profiler service e.g. '
'localhost:8466')
flags.DEFINE_string(
'logdir', None, 'Path of TensorBoard log directory e.g. /tmp/tb_log, '
'gs://tb_bucket')
flags.DEFINE_integer('duration_ms', 2000, 'Duration of tracing in ms.')
flags.DEFINE_integer(
'num_tracing_attempts', 3, 'Automatically retry N times when no trace '
'event is collected.')
flags.DEFINE_boolean(
'include_dataset_ops', True, 'Set to false to profile longer TPU '
'device traces.')
FLAGS = tf.flags.FLAGS
FLAGS = flags.FLAGS
EXECUTABLE = 'data/capture_tpu_profile'
@ -42,10 +51,13 @@ def main(unused_argv=None):
if not FLAGS.service_addr or not FLAGS.logdir:
sys.exit('service_addr and logdir must be provided.')
executable_path = os.path.join(os.path.dirname(__file__), EXECUTABLE)
logdir = os.path.expandvars(os.path.expanduser(FLAGS.logdir))
cmd = [executable_path]
cmd.append('--logdir='+FLAGS.logdir)
cmd.append('--logdir='+logdir)
cmd.append('--service_addr='+FLAGS.service_addr)
cmd.append('--duration_ms='+str(FLAGS.duration_ms))
cmd.append('--num_tracing_attempts='+str(FLAGS.num_tracing_attempts))
cmd.append('--include_dataset_ops='+str(FLAGS.include_dataset_ops).lower())
subprocess.call(cmd)
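
One effect of the new `expandvars`/`expanduser` call: home-directory and environment references in `--logdir` are resolved in Python before the path reaches the `capture_tpu_profile` binary, while GCS URLs pass through untouched. An illustrative check (paths are examples only):

```
import os

for raw in ['~/tb_log', '$HOME/tb_log', 'gs://tb_bucket/run1']:
    print(raw, '->', os.path.expandvars(os.path.expanduser(raw)))
# '~' and '$HOME' expand to the user's home directory; 'gs://...' is unchanged.
```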

View File

@ -20,16 +20,12 @@ from __future__ import print_function
from setuptools import setup
_VERSION = '1.3.0-a1'
_VERSION = '1.5.0-rc1'
CONSOLE_SCRIPTS = [
'capture_tpu_profile=cloud_tpu_profiler.main:run_main',
]
REQUIRED_PACKAGES = [
'tensorflow >= 1.2.0',
]
setup(
name='cloud_tpu_profiler',
version=_VERSION.replace('-', ''),
@ -45,27 +41,22 @@ setup(
entry_points={
'console_scripts': CONSOLE_SCRIPTS,
},
install_requires=REQUIRED_PACKAGES,
classifiers=[
# How mature is this project? Common values are
# 3 - Alpha
# 4 - Beta
# 5 - Production/Stable
'Development Status :: 3 - Alpha',
'Development Status :: 4 - Beta',
'Intended Audience :: Developers',
'Intended Audience :: Education',
'Intended Audience :: Science/Research',
'License :: OSI Approved :: Apache Software License',
'Programming Language :: Python :: 2',
'Programming Language :: Python :: 2.7',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.4',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
'Topic :: Scientific/Engineering',
'Topic :: Scientific/Engineering :: Mathematics',
'Topic :: Scientific/Engineering :: Artificial Intelligence',
@ -74,4 +65,5 @@ setup(
'Topic :: Software Development :: Libraries :: Python Modules',
],
license='Apache 2.0',
keywords='tensorflow performance tpu',)
keywords='tensorflow performance tpu',
)

View File

@ -454,6 +454,7 @@ tf_cuda_library(
"framework/reader_interface.h",
"framework/reader_op_kernel.h",
"framework/register_types.h",
"framework/register_types_traits.h",
"framework/resource_mgr.h",
"framework/resource_op_kernel.h",
"framework/selective_registration.h",
@ -611,6 +612,7 @@ tf_gen_op_libs(
"list_ops",
"lookup_ops",
"logging_ops",
"manip_ops",
"math_ops",
"nn_ops",
"no_op",
@ -693,6 +695,7 @@ cc_library(
":list_ops_op_lib",
":logging_ops_op_lib",
":lookup_ops_op_lib",
":manip_ops_op_lib",
":math_ops_op_lib",
":nn_ops_op_lib",
":no_op_op_lib",
@ -831,6 +834,7 @@ cc_library(
"//tensorflow/core/kernels:list_kernels",
"//tensorflow/core/kernels:lookup",
"//tensorflow/core/kernels:logging",
"//tensorflow/core/kernels:manip",
"//tensorflow/core/kernels:math",
"//tensorflow/core/kernels:multinomial_op",
"//tensorflow/core/kernels:nn",
@ -1153,6 +1157,7 @@ cc_library(
deps = [
":protos_all_cc_impl",
"//third_party/eigen3",
"@nsync//:nsync_cpp",
"@protobuf_archive//:protobuf",
],
alwayslink = 1,

View File

@ -16,5 +16,6 @@ END
description: <<END
Note that this routine only supports wildcard characters in the
basename portion of the pattern, not in the directory portion.
Note also that the order of filenames returned can be non-deterministic.
END
}
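
A short sketch of the restriction noted above, assuming the op's Python endpoint `tf.matching_files` (paths are illustrative):

```
import tensorflow as tf

# Wildcards are honored only in the basename component of the pattern.
ok = tf.matching_files('/tmp/data/*.txt')   # supported: wildcard in basename
# tf.matching_files('/tmp/*/data.txt')      # unsupported: wildcard in directory
# Note: the order of the returned filenames may be non-deterministic.
```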

View File

@ -0,0 +1,52 @@
op {
graph_op_name: "Roll"
in_arg {
name: "shift"
description: <<END
Dimension must be 0-D or 1-D. `shift[i]` specifies the number of places by which
elements are shifted positively (towards larger indices) along the dimension
specified by `axis[i]`. Negative shifts will roll the elements in the opposite
direction.
END
}
in_arg {
name: "axis"
description: <<END
Dimension must be 0-D or 1-D. `axis[i]` specifies the dimension in which the
shift `shift[i]` should occur. If the same axis is referenced more than once, the
total shift for that axis will be the sum of all the shifts that belong to that
axis.
END
}
out_arg {
name: "output"
description: <<END
Has the same shape and size as the input. The elements are shifted
positively (towards larger indices) by the offsets of `shift` along the
dimensions of `axis`.
END
}
summary: "Rolls the elements of a tensor along an axis."
description: <<END
The elements are shifted positively (towards larger indices) by the offset of
`shift` along the dimension of `axis`. Negative `shift` values will shift
elements in the opposite direction. Elements that roll past the last position
will wrap around to the first and vice versa. Multiple shifts along multiple
axes may be specified.
For example:
```
# 't' is [0, 1, 2, 3, 4]
roll(t, shift=2, axis=0) ==> [3, 4, 0, 1, 2]
# shifting along multiple dimensions
# 't' is [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]
roll(t, shift=[1, -2], axis=[0, 1]) ==> [[7, 8, 9, 5, 6], [2, 3, 4, 0, 1]]
# shifting along the same axis multiple times
# 't' is [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]
roll(t, shift=[2, -3], axis=[1, 1]) ==> [[1, 2, 3, 4, 0], [6, 7, 8, 9, 5]]
```
END
}
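
A hedged usage sketch of the examples above; it assumes the new op is exposed to Python as `tf.manip.roll` (the `manip_ops` target registered elsewhere in this change):

```
import tensorflow as tf

t = tf.constant([0, 1, 2, 3, 4])
rolled = tf.manip.roll(t, shift=2, axis=0)  # assumed endpoint for the Roll op
with tf.Session() as sess:
    print(sess.run(rolled))  # [3 4 0 1 2]
```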

View File

@ -0,0 +1,32 @@
op {
graph_op_name: "UnravelIndex"
in_arg {
name: "indices"
description: <<END
A 0-D or 1-D `int` Tensor whose elements are indices into the
flattened version of an array of dimensions `dims`.
END
}
in_arg {
name: "dims"
description: <<END
A 1-D `int` Tensor. The shape of the array to use for unraveling
indices.
END
}
out_arg {
name: "output"
description: <<END
A 2-D (or 1-D if `indices` is 0-D) tensor where each row has the
same shape as the `indices` array.
END
}
summary: "Converts a flat index or array of flat indices into a tuple of"
description: <<END
coordinate arrays.
@compatibility(numpy)
Equivalent to np.unravel_index
@end_compatibility
END
}
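
Since the op is documented as equivalent to `np.unravel_index`, a small NumPy sketch shows the expected layout; the TF op stacks the per-dimension coordinate arrays into the rows of one tensor:

```
import numpy as np

# Flat indices into a 7x6 array, unraveled into (row, col) coordinates.
coords = np.unravel_index([22, 41, 37], (7, 6))
print(np.stack(coords))
# [[3 6 6]
#  [4 5 1]]  <- row i holds the coordinates along dimension i
```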

View File

@ -762,7 +762,8 @@ int64 MinSystemMemory(int64 available_memory) {
// is necessary.
min_system_memory *= 2;
#endif
#if defined(NVIDIA_TEGRA)
#if defined(ANDROID_TEGRA)
// Reserve 1GB of system memory for NVIDIA Tegra devices, since they share
// the same physical memory for both RAM and video RAM
min_system_memory = 1 << 30;

Some files were not shown because too many files have changed in this diff