Fix dependency bugs

Change: 116925769
Parent: 64dd5b58d5
Commit: 56f1d64998
Changed files:

* ISSUE_TEMPLATE.md
* README.md
* RELEASE.md
* configure
* tensorflow/BUILD
* tensorflow/contrib/cmake/: CMakeLists.txt, README.md, external/ (eigen.cmake, jpeg.cmake, png.cmake, re2.cmake), install.cmake, patches/jpeg/CMakeLists.txt, tests.cmake, tf_cc_ops.cmake, tf_core_cpu.cmake, tf_core_direct_session.cmake, tf_core_framework.cmake, tf_core_kernels.cmake, tf_models.cmake, tf_stream_executor.cmake, tf_tutorials.cmake
* tensorflow/contrib/layers/python/ops
* tensorflow/contrib/linear_optimizer
* tensorflow/core/: distributed_runtime, kernels (BUILD, bounds_check.h, conv_grad_ops.cc, conv_ops_gpu_3.cu.cc, diag_op.cc, matrix_solve_ls_op.cc, reduction_ops_gpu.cu.cc, reduction_ops_max.cc, reduction_ops_min.cc, reduction_ops_prod.cc, reduction_ops_sum.cc, resize_nearest_neighbor_op.cc, resize_nearest_neighbor_op_benchmark_test.cc, resize_nearest_neighbor_op_gpu.cu.cc, resize_nearest_neighbor_op_gpu.h, sparse_matmul_op.cc, tensor_array_ops.cc, transpose_functor.h), ops, public, util
* tensorflow/examples/: how_tos/reading_data, image_retraining, tutorials, udacity
* tensorflow/g3doc/: api_docs/python, get_started, how_tos, resources, tutorials
* tensorflow/models
* tensorflow/python/: framework, kernel_tests (constant_op_test.py, control_flow_ops_py_test.py, depthtospace_op_test.py, diag_op_test.py, init_ops_test.py, matmul_op_test.py, reduction_ops_test.py, rnn_test.py, seq2seq_test.py, trace_op_test.py), ops, platform/default, training
ISSUE_TEMPLATE.md

@@ -1,5 +1,11 @@
-For bugs/issues, please fill in the following. The more information you
-provide, the more likely we can help you.
+GitHub issues are for bugs / installation problems / feature requests.
+For general support from the community, see [StackOverflow](https://stackoverflow.com/questions/tagged/tensorflow).
+To make bugs and feature requests easier to find and organize, we close issues that are deemed
+out of scope for GitHub Issues and point people to StackOverflow.
+
+For bugs or installation issues, please provide the following information.
+The more information you provide, the more easily we will be able to offer
+help and advice.
 
 ### Environment info
 Operating System:
README.md

@@ -5,7 +5,7 @@
 
 | **`Linux CPU`** | **`Linux GPU PIP`** | **`Mac OS CPU`** | **`Android`** |
 |-------------------|----------------------|------------------|----------------|
-| [](http://ci.tensorflow.org/job/tensorflow-master) | [](http://ci.tensorflow.org/job/tensorflow-master-gpu_pip) | [](http://ci.tensorflow.org/job/tensorflow-master-mac) | [](http://ci.tensorflow.org/job/tensorflow-master-android) |
+| [](http://ci.tensorflow.org/job/tensorflow-master-cpu) | [](http://ci.tensorflow.org/job/tensorflow-master-gpu_pip) | [](http://ci.tensorflow.org/job/tensorflow-master-mac) | [](http://ci.tensorflow.org/job/tensorflow-master-android) |
 
 **TensorFlow** is an open source software library for numerical computation using
 data flow graphs. Nodes in the graph represent mathematical operations, while
@@ -27,7 +27,14 @@ tracking requests and bugs, but please see
 and discussion.**
 
 ## Installation
-*See [Download and Setup](tensorflow/g3doc/get_started/os_setup.md).*
+*See [Download and Setup](tensorflow/g3doc/get_started/os_setup.md) for instructions on how to install our release binaries or how to build from source.*
+
+People who are a little bit adventurous can also try our nightly binaries:
+
+* Linux CPU only: [Python 2](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.7.1-cp27-none-linux_x86_64.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/)) / [Python 3](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.7.1-py3-none-any.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=cpu-slave/))
+* Linux GPU: [Python 2](http://ci.tensorflow.org/view/Nightly/job/nigntly-matrix-linux-gpu/TF_BUILD_CONTAINER_TYPE=GPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=gpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.7.1-py2-none-any.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nigntly-matrix-linux-gpu/TF_BUILD_CONTAINER_TYPE=GPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=gpu-slave/)) / [Python 3](http://ci.tensorflow.org/view/Nightly/job/nigntly-matrix-linux-gpu/TF_BUILD_CONTAINER_TYPE=GPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=gpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.7.1-py3-none-any.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nigntly-matrix-linux-gpu/TF_BUILD_CONTAINER_TYPE=GPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=gpu-slave/))
+* Mac CPU only: [Python 2](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=mac-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.7.1-py2-none-any.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=mac-slave/)) / [Python 3](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=mac-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.7.1-py3-none-any.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=mac-slave/))
+* [Android](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-android/TF_BUILD_CONTAINER_TYPE=ANDROID,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=NO_PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=android-slave/lastSuccessfulBuild/artifact/bazel-out/local_linux/bin/tensorflow/examples/android/tensorflow_demo.apk) ([build history](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-android/TF_BUILD_CONTAINER_TYPE=ANDROID,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=NO_PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=android-slave/))
 
 #### *Try your first TensorFlow program*
 ```python
@@ -46,6 +53,9 @@ Hello, TensorFlow!
 ```
 
 ##For more information
 
 * [TensorFlow website](http://tensorflow.org)
 * [TensorFlow whitepaper](http://download.tensorflow.org/paper/whitepaper2015.pdf)
-* [Tensorflow MOOC on Udacity] (https://www.udacity.com/course/deep-learning--ud730)
+* [TensorFlow MOOC on Udacity] (https://www.udacity.com/course/deep-learning--ud730)
 
 The TensorFlow community has created amazing things with TensorFlow, please see the [resources section of tensorflow.org](https://www.tensorflow.org/versions/master/resources#community) for an incomplete list.
RELEASE.md

@@ -1,3 +1,20 @@
+# Release 0.7.1
+
+## Bug Fixes and Other Changes
+
+* Added gfile.Open and gfile.Copy, used by input_data.py.
+* Fixed Saver bug when MakeDirs tried to create empty directory.
+* GPU Pip wheels are built with cuda 7.5 and cudnn-v4, making them
+  required for the binary releases. Lower versions of cuda/cudnn can
+  be supported by installing from sources and setting the options
+  during ./configure
+* Fix dataset encoding example for Python3 (@danijar)
+* Fix PIP installation by not packaging protobuf as part of wheel,
+  require protobuf 3.0.0b2.
+* Fix Mac pip installation of numpy by requiring pip >= 1.10.1.
+* Improvements and fixes to Docker image.
+
+
 # Release 0.7.0
 
 ## Major Features and Improvements
configure (vendored)

@@ -99,12 +99,18 @@ while true; do
     else
       TF_CUDNN_EXT=".$TF_CUDNN_VERSION"
     fi
-    if [ -e "$CUDNN_INSTALL_PATH/libcudnn.so${CUDNNEXT}" -o -e "$CUDNN_INSTALL_PATH/lib64/libcudnn.so${TF_CUDNN_EXT}" ]; then
+    if [ -e "$CUDNN_INSTALL_PATH/libcudnn.so${TF_CUDNN_EXT}" -o -e "$CUDNN_INSTALL_PATH/lib64/libcudnn.so${TF_CUDNN_EXT}" ]; then
       break
     fi
+    CUDNN_PATH_FROM_LDCONFIG="$(ldconfig -p | sed -n 's/.*libcudnn.so .* => \(.*\)/\1/p')"
+    if [ -e "${CUDNN_PATH_FROM_LDCONFIG}${TF_CUDNN_EXT}" ]; then
+      CUDNN_INSTALL_PATH="$(dirname ${CUDNN_PATH_FROM_LDCONFIG})"
+      break
+    fi
     echo "Invalid path to cuDNN ${TF_CUDNN_VERSION} toolkit. Neither of the following two files can be found:"
     echo "$CUDNN_INSTALL_PATH/lib64/libcudnn.so${TF_CUDNN_EXT}"
     echo "$CUDNN_INSTALL_PATH/libcudnn.so${TF_CUDNN_EXT}"
+    echo "${CUDNN_PATH_FROM_LDCONFIG}${TF_CUDNN_EXT}"
     if [ -z "$fromuser" ]; then
       exit 1
     fi
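The ldconfig fallback added in this hunk can be sanity-checked in isolation. A minimal sketch, feeding simulated `ldconfig -p` output (the library path below is illustrative, not from the commit) through the same sed expression:

```shell
# Extract the libcudnn.so target path from (simulated) `ldconfig -p` output,
# using the sed expression the configure script uses for its fallback lookup.
printf '\tlibcudnn.so (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcudnn.so\n' \
  | sed -n 's/.*libcudnn.so .* => \(.*\)/\1/p'
# prints: /usr/lib/x86_64-linux-gnu/libcudnn.so
```

The `-n` flag plus the `p` command prints only lines that match, so non-cuDNN entries in the real `ldconfig -p` listing are silently skipped.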
tensorflow/BUILD

@@ -54,6 +54,15 @@ cc_binary(
     ],
 )
 
+cc_binary(
+    name = "libtensorflow_cc.so",
+    linkshared = 1,
+    deps = [
+        "//tensorflow/cc:cc_ops",
+        "//tensorflow/core:tensorflow",
+    ],
+)
+
 py_library(
     name = "tensorflow_py",
     srcs = ["__init__.py"],
tensorflow/contrib/cmake/CMakeLists.txt (new file, 62 lines)

# Minimum CMake required
cmake_minimum_required(VERSION 2.8)

# Project
project(tensorflow C CXX)

# Actual source is the ../../.. directory
get_filename_component(tf_contrib_source_dir ${tensorflow_SOURCE_DIR} PATH)
get_filename_component(tf_tf_source_dir ${tf_contrib_source_dir} PATH)
get_filename_component(tensorflow_source_dir ${tf_tf_source_dir} PATH)

# [CLEANUP] Not sure if this is needed (copied from Protobuf)
# CMake policies
cmake_policy(SET CMP0022 NEW)

# Options
option(tensorflow_VERBOSE "Enable for verbose output" OFF)
option(tensorflow_BUILD_TESTS "Build tests" ON)

# Threads: defines CMAKE_THREAD_LIBS_INIT and adds -pthread compile option for
# targets that link ${CMAKE_THREAD_LIBS_INIT}.
find_package (Threads)

# [CLEANUP] Remove when done
# For debugging
function(SHOW_VARIABLES)
  get_cmake_property(_variableNames VARIABLES)
  foreach (_variableName ${_variableNames})
    message(STATUS "${_variableName}=${${_variableName}}")
  endforeach()
endfunction()

# External dependencies
set(CMAKE_MODULE_PATH ${PROJECT_SOURCE_DIR}/external)

# Location where external projects will be downloaded
set (DOWNLOAD_LOCATION "${CMAKE_CURRENT_BINARY_DIR}/downloads"
     CACHE PATH "Location where external projects will be downloaded.")
mark_as_advanced(DOWNLOAD_LOCATION)

# External dependencies
include(png)
include(jpeg)
include(re2)
include(eigen)

# Let's get to work!
include(tf_core_framework.cmake)
include(tf_stream_executor.cmake)
include(tf_core_cpu.cmake)
include(tf_models.cmake)
include(tf_core_ops.cmake)
include(tf_core_direct_session.cmake)
include(tf_core_kernels.cmake)
include(tf_cc_ops.cmake)
include(tf_tutorials.cmake)

if (tensorflow_BUILD_TESTS)
  include(tests.cmake)
endif (tensorflow_BUILD_TESTS)

include(install.cmake)
tensorflow/contrib/cmake/README.md (new file, 257 lines)

This directory contains *CMake* files that can be used to build the TensorFlow
core library.

You need to have [CMake](http://www.cmake.org) and [Git](http://git-scm.com)
installed on your computer before proceeding.

Most of the instructions will be given using the *Command Prompt*, but the same
actions can be performed using appropriate GUI tools.

Environment Setup
=================

Open the appropriate *Command Prompt* from the *Start* menu.

For example *VS2013 x64 Native Tools Command Prompt*:

    C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\amd64>

Change to your working directory:

    C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\amd64>cd C:\Path\to
    C:\Path\to>

Where *C:\Path\to* is the path to your real working directory.

Create a folder where TensorFlow headers/libraries/binaries will be installed
after they are built:

    C:\Path\to>mkdir install

If the *cmake* command is not available from the *Command Prompt*, add it to the
system *PATH* variable:

    C:\Path\to>set PATH=%PATH%;C:\Program Files (x86)\CMake\bin

If the *git* command is not available from the *Command Prompt*, add it to the
system *PATH* variable:

    C:\Path\to>set PATH=%PATH%;C:\Program Files\Git\cmd

Good. Now you are ready to continue.

Getting Sources
===============

You can get the latest stable source packages from the
[releases](https://github.com/tensorflow/tensorflow/releases) page.
Or you can type:

    C:\Path\to> git clone --recursive -b [release_tag] https://github.com/tensorflow/tensorflow.git

Where *[release_tag]* is a git tag like *v0.6.0*, or a branch name like *master*
if you want to get the latest code.

Go to the project folder:

    C:\Path\to>cd tensorflow
    C:\Path\to\tensorflow>

Now go to the *tensorflow\contrib\cmake* folder in TensorFlow's contrib sources:

    C:\Path\to\tensorflow>cd tensorflow\contrib\cmake
    C:\Path\to\tensorflow\tensorflow\contrib\cmake>

Good. Now you are ready to configure *CMake*.

CMake Configuration
===================

*CMake* supports a lot of different
[generators](http://www.cmake.org/cmake/help/latest/manual/cmake-generators.7.html)
for various native build systems. We are only interested in the
[Makefile](http://www.cmake.org/cmake/help/latest/manual/cmake-generators.7.html#makefile-generators)
and
[Visual Studio](http://www.cmake.org/cmake/help/latest/manual/cmake-generators.7.html#visual-studio-generators)
generators.

We will use shadow building to separate the temporary files from the TensorFlow
source code.

Create a temporary *build* folder and change your working directory to it:

    C:\Path\to\tensorflow\tensorflow\contrib\cmake>mkdir build & cd build
    C:\Path\to\tensorflow\tensorflow\contrib\cmake\build>

The *Makefile* generator can build the project in only one configuration, so
you need to create a separate folder for each configuration.

To start using a *Release* configuration:

    [...]\contrib\cmake\build>mkdir release & cd release
    [...]\contrib\cmake\build\release>cmake -G "NMake Makefiles" ^
    -DCMAKE_BUILD_TYPE=Release ^
    -DCMAKE_INSTALL_PREFIX=../../../../../../install ^
    ../..

This will generate an *nmake* *Makefile* in the current directory.

To use the *Debug* configuration:

    [...]\contrib\cmake\build>mkdir debug & cd debug
    [...]\contrib\cmake\build\debug>cmake -G "NMake Makefiles" ^
    -DCMAKE_BUILD_TYPE=Debug ^
    -DCMAKE_INSTALL_PREFIX=../../../../../../install ^
    ../..

This will generate an *nmake* *Makefile* in the current directory.

To create a *Visual Studio* solution file:

    [...]\contrib\cmake\build>mkdir solution & cd solution
    [...]\contrib\cmake\build\solution>cmake -G "Visual Studio 12 2013 Win64" ^
    -DCMAKE_INSTALL_PREFIX=../../../../../../install ^
    ../..

This will generate the *Visual Studio* solution file *tensorflow.sln* in the
current directory.

If the *gmock* directory does not exist, and/or you do not want to build
TensorFlow unit tests, you need to add the *cmake* command argument
`-Dtensorflow_BUILD_TESTS=OFF` to disable testing.

Compiling
=========

To compile tensorflow:

    [...]\contrib\cmake\build\release>nmake

or

    [...]\contrib\cmake\build\debug>nmake

And wait for the compilation to finish.

If you prefer to use the IDE:

  * Open the generated tensorflow.sln file in Microsoft Visual Studio.
  * Choose "Debug" or "Release" configuration as desired.
  * From the Build menu, choose "Build Solution".

And wait for the compilation to finish.

Testing
=======

To run unit-tests:

    [...]\contrib\cmake\build\release>nmake check

or

    [...]\contrib\cmake\build\debug>nmake check

You can also build the project *check* from the Visual Studio solution.
Yes, it may sound strange, but it works.

You should see output similar to:

    Running main() from gmock_main.cc
    [==========] Running 1546 tests from 165 test cases.

    ...

    [==========] 1546 tests from 165 test cases ran. (2529 ms total)
    [  PASSED  ] 1546 tests.

To run specific tests:

    C:\Path\to\tensorflow>tensorflow\contrib\cmake\build\release\tests.exe ^
    --gtest_filter=AnyTest*
    Running main() from gmock_main.cc
    Note: Google Test filter = AnyTest*
    [==========] Running 3 tests from 1 test case.
    [----------] Global test environment set-up.
    [----------] 3 tests from AnyTest
    [ RUN      ] AnyTest.TestPackAndUnpack
    [       OK ] AnyTest.TestPackAndUnpack (0 ms)
    [ RUN      ] AnyTest.TestPackAndUnpackAny
    [       OK ] AnyTest.TestPackAndUnpackAny (0 ms)
    [ RUN      ] AnyTest.TestIs
    [       OK ] AnyTest.TestIs (0 ms)
    [----------] 3 tests from AnyTest (1 ms total)

    [----------] Global test environment tear-down
    [==========] 3 tests from 1 test case ran. (2 ms total)
    [  PASSED  ] 3 tests.

Note that the tests must be run from the source folder.

If all tests pass, you can safely continue.

Installing
==========

To install TensorFlow to the specified *install* folder:

    [...]\contrib\cmake\build\release>nmake install

or

    [...]\contrib\cmake\build\debug>nmake install

You can also build the project *INSTALL* from the Visual Studio solution.
This one sounds less strange, and it works too.

This will create the following folders under the *install* location:

  * bin - contains the tensorflow binaries;
  * include - contains the C++ headers and TensorFlow *.proto files;
  * lib - contains the linking libraries and *CMake* configuration files for
    the *tensorflow* package.

Now, if needed, you can:

  * Copy the contents of the include directory to wherever you want to put
    headers.
  * Copy binaries to wherever you put build tools (probably somewhere in your
    PATH).
  * Copy the linking libraries libtensorflow[d].lib to wherever you put
    libraries.

To avoid conflicts between the MSVC debug and release runtime libraries, when
compiling a debug build of your application, you may need to link against a
debug build of libtensorflowd.lib with the "d" postfix. Similarly, release
builds should link against the release libtensorflow.lib library.

DLLs vs. static linking
=======================

Static linking is now the default for the TensorFlow libraries. Due to
issues with Win32's use of a separate heap for each DLL, as well as binary
compatibility issues between different versions of MSVC's STL library, it is
recommended that you use static linkage only. However, it is possible to
build libtensorflow as DLLs if you really want. To do this, do the following:

  * Add the additional flag `-Dtensorflow_BUILD_SHARED_LIBS=ON` when invoking
    cmake.
  * Follow the same steps as described in the above section.
  * When compiling your project, make sure to `#define TENSORFLOW_USE_DLLS`.

When distributing your software to end users, we strongly recommend that you
do NOT install libtensorflow.dll to any shared location.
Instead, keep these libraries next to your binaries, in your application's
own install directory. C++ makes it very difficult to maintain binary
compatibility between releases, so it is likely that future versions of these
libraries will *not* be usable as drop-in replacements.

If your project is itself a DLL intended for use by third-party software, we
recommend that you do NOT expose TensorFlow objects in your library's
public interface, and that you statically link them into your library.

Notes on Compiler Warnings
==========================

The following warnings have been disabled while building the tensorflow
libraries and binaries. You may have to disable some of them in your own
project as well, or live with them.

* [TODO]
tensorflow/contrib/cmake/external/eigen.cmake (new file, 34 lines)

#new_http_archive(
#  name = "eigen_archive",
#  url = "https://bitbucket.org/eigen/eigen/get/...",
#  sha256 = "...",
#  build_file = "eigen.BUILD",
#)

include (ExternalProject)

set(eigen_archive_hash "ed4c9730b545")

set(eigen_INCLUDE_DIRS
    ${CMAKE_CURRENT_BINARY_DIR}
    ${CMAKE_CURRENT_BINARY_DIR}/external/eigen_archive
    ${CMAKE_CURRENT_BINARY_DIR}/external/eigen_archive/eigen-eigen-${eigen_archive_hash}
    ${tensorflow_source_dir}/third_party/eigen3
)
set(eigen_URL https://bitbucket.org/eigen/eigen/get/${eigen_archive_hash}.tar.gz)
set(eigen_HASH SHA256=3d9eceb8a2add299e37b1f32759157cc2574f7684936c151552a5ae3f33aebd5)
set(eigen_BUILD ${CMAKE_CURRENT_BINARY_DIR}/eigen/src/eigen)
set(eigen_INSTALL ${CMAKE_CURRENT_BINARY_DIR}/eigen/install)

ExternalProject_Add(eigen
    PREFIX eigen
    URL ${eigen_URL}
    URL_HASH ${eigen_HASH}
    DOWNLOAD_DIR "${DOWNLOAD_LOCATION}"
    INSTALL_DIR "${eigen_INSTALL}"
    CMAKE_CACHE_ARGS
        -DCMAKE_BUILD_TYPE:STRING=Release
        -DCMAKE_VERBOSE_MAKEFILE:BOOL=OFF
        -DCMAKE_INSTALL_PREFIX:STRING=${eigen_INSTALL}
        -DINCLUDE_INSTALL_DIR:STRING=${CMAKE_CURRENT_BINARY_DIR}/external/eigen_archive/eigen-eigen-${eigen_archive_hash}
)
tensorflow/contrib/cmake/external/jpeg.cmake (new file, 75 lines)

include (ExternalProject)

set(jpeg_INCLUDE_DIR ${CMAKE_CURRENT_BINARY_DIR}/external/jpeg_archive)
set(jpeg_URL http://www.ijg.org/files/jpegsrc.v9a.tar.gz)
set(jpeg_HASH SHA256=3a753ea48d917945dd54a2d97de388aa06ca2eb1066cbfdc6652036349fe05a7)
set(jpeg_BUILD ${CMAKE_BINARY_DIR}/jpeg/src/jpeg)
set(jpeg_INSTALL ${CMAKE_BINARY_DIR}/jpeg/install)
set(jpeg_STATIC_LIBRARIES ${jpeg_INSTALL}/lib/libjpeg.a)

set(jpeg_HEADERS
    "${jpeg_INSTALL}/include/jconfig.h"
    "${jpeg_INSTALL}/include/jerror.h"
    "${jpeg_INSTALL}/include/jmorecfg.h"
    "${jpeg_INSTALL}/include/jpeglib.h"
    "${jpeg_BUILD}/cderror.h"
    "${jpeg_BUILD}/cdjpeg.h"
    "${jpeg_BUILD}/jdct.h"
    "${jpeg_BUILD}/jinclude.h"
    "${jpeg_BUILD}/jmemsys.h"
    "${jpeg_BUILD}/jpegint.h"
    "${jpeg_BUILD}/jversion.h"
    "${jpeg_BUILD}/transupp.h"
)

if (WIN32)
    ExternalProject_Add(jpeg
        PREFIX jpeg
        URL ${jpeg_URL}
        URL_HASH ${jpeg_HASH}
        PATCH_COMMAND ${CMAKE_COMMAND} -E copy ${CMAKE_SOURCE_DIR}/patches/jpeg/CMakeLists.txt ${jpeg_BUILD}
        INSTALL_DIR ${jpeg_INSTALL}
        DOWNLOAD_DIR "${DOWNLOAD_LOCATION}"
        CMAKE_CACHE_ARGS
            -DCMAKE_BUILD_TYPE:STRING=Release
            -DCMAKE_VERBOSE_MAKEFILE:BOOL=OFF
            -DCMAKE_INSTALL_PREFIX:STRING=${jpeg_INSTALL}
    )

    ExternalProject_Add_Step(jpeg copy_jconfig
        COMMAND ${CMAKE_COMMAND} -E copy
            ${jpeg_BUILD}/jconfig.vc ${jpeg_BUILD}/jconfig.h
        DEPENDEES patch
        DEPENDERS build
    )

else()

    ExternalProject_Add(jpeg
        PREFIX jpeg
        URL ${jpeg_URL}
        URL_HASH ${jpeg_HASH}
        INSTALL_DIR ${jpeg_INSTALL}
        DOWNLOAD_DIR "${DOWNLOAD_LOCATION}"
        BUILD_COMMAND $(MAKE)
        INSTALL_COMMAND $(MAKE) install
        CONFIGURE_COMMAND
            ${jpeg_BUILD}/configure
            --prefix=${jpeg_INSTALL}
            --enable-shared=yes
    )

endif()

# put jpeg includes in the directory where they are expected
add_custom_target(jpeg_create_destination_dir
    COMMAND ${CMAKE_COMMAND} -E make_directory ${jpeg_INCLUDE_DIR}/jpeg-9a
    DEPENDS jpeg)

add_custom_target(jpeg_copy_headers_to_destination
    DEPENDS jpeg_create_destination_dir)

foreach(header_file ${jpeg_HEADERS})
    add_custom_command(TARGET jpeg_copy_headers_to_destination PRE_BUILD
        COMMAND ${CMAKE_COMMAND} -E copy ${header_file} ${jpeg_INCLUDE_DIR}/jpeg-9a)
endforeach()
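The header-staging pattern used in this file (create a versioned destination directory, then copy each header into it so includes resolve at the path the build expects) can be mimicked with plain shell commands. A minimal sketch; the `/tmp/jpeg_demo` paths stand in for `${jpeg_BUILD}` and `${jpeg_INCLUDE_DIR}` and are purely illustrative:

```shell
# Stage a header into the include layout the build expects (illustrative paths).
mkdir -p /tmp/jpeg_demo/build               # stands in for ${jpeg_BUILD}
mkdir -p /tmp/jpeg_demo/include/jpeg-9a     # stands in for ${jpeg_INCLUDE_DIR}/jpeg-9a
printf '/* demo header */\n' > /tmp/jpeg_demo/build/jpeglib.h
cp /tmp/jpeg_demo/build/jpeglib.h /tmp/jpeg_demo/include/jpeg-9a/
ls /tmp/jpeg_demo/include/jpeg-9a
# prints: jpeglib.h
```

In the CMake file the same two steps run as custom targets (`jpeg_create_destination_dir`, then per-header `cmake -E copy` commands) so they execute portably on Windows as well.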
tensorflow/contrib/cmake/external/png.cmake (new file, 38 lines)

include (ExternalProject)

set(png_INCLUDE_DIR ${CMAKE_CURRENT_BINARY_DIR}/external/png_archive)
set(png_URL https://storage.googleapis.com/libpng-public-archive/libpng-1.2.53.tar.gz)
set(png_HASH SHA256=e05c9056d7f323088fd7824d8c6acc03a4a758c4b4916715924edc5dd3223a72)
set(png_BUILD ${CMAKE_BINARY_DIR}/png/src/png)
set(png_INSTALL ${CMAKE_BINARY_DIR}/png/install)
set(png_STATIC_LIBRARIES ${CMAKE_BINARY_DIR}/png/install/lib/libpng12.a)

set(png_HEADERS
    "${png_INSTALL}/include/libpng12/png.h"
    "${png_INSTALL}/include/libpng12/pngconf.h"
)

ExternalProject_Add(png
    PREFIX png
    URL ${png_URL}
    URL_HASH ${png_HASH}
    INSTALL_DIR ${png_INSTALL}
    DOWNLOAD_DIR "${DOWNLOAD_LOCATION}"
    CMAKE_CACHE_ARGS
        -DCMAKE_BUILD_TYPE:STRING=Release
        -DCMAKE_VERBOSE_MAKEFILE:BOOL=OFF
        -DCMAKE_INSTALL_PREFIX:STRING=${png_INSTALL}
)

## put png includes in the directory where they are expected
add_custom_target(png_create_destination_dir
    COMMAND ${CMAKE_COMMAND} -E make_directory ${png_INCLUDE_DIR}/libpng-1.2.53
    DEPENDS png)

add_custom_target(png_copy_headers_to_destination
    DEPENDS png_create_destination_dir)

foreach(header_file ${png_HEADERS})
    add_custom_command(TARGET png_copy_headers_to_destination PRE_BUILD
        COMMAND ${CMAKE_COMMAND} -E copy ${header_file} ${png_INCLUDE_DIR}/libpng-1.2.53)
endforeach()
tensorflow/contrib/cmake/external/re2.cmake (new file, 46 lines)

include (ExternalProject)

set(re2_INCLUDE_DIR ${CMAKE_CURRENT_BINARY_DIR}/external/re2/re2)
set(re2_EXTRA_INCLUDE_DIR ${CMAKE_CURRENT_BINARY_DIR}/re2/src)
set(re2_URL https://github.com/google/re2.git)
set(re2_TAG 791beff)
set(re2_BUILD ${CMAKE_BINARY_DIR}/re2/src/re2)
set(re2_LIBRARIES ${re2_BUILD}/obj/so/libre2.so)
get_filename_component(re2_STATIC_LIBRARIES ${re2_BUILD}/libre2.a ABSOLUTE)
set(re2_INCLUDES ${re2_BUILD})

# We only need re2.h in external/re2/re2/re2.h
# For the rest, we'll just add the build dir as an include dir.
set(re2_HEADERS
    "${re2_BUILD}/re2/re2.h"
)

ExternalProject_Add(re2
    PREFIX re2
    GIT_REPOSITORY ${re2_URL}
    GIT_TAG ${re2_TAG}
    DOWNLOAD_DIR "${DOWNLOAD_LOCATION}"
    BUILD_IN_SOURCE 1
    INSTALL_COMMAND ""
    CMAKE_CACHE_ARGS
        -DCMAKE_BUILD_TYPE:STRING=Release
        -DCMAKE_VERBOSE_MAKEFILE:BOOL=OFF
)

## put re2 includes in the directory where they are expected
add_custom_target(re2_create_destination_dir
    COMMAND ${CMAKE_COMMAND} -E make_directory ${re2_INCLUDE_DIR}
    DEPENDS re2)

add_custom_target(re2_copy_headers_to_destination
    DEPENDS re2_create_destination_dir)

foreach(header_file ${re2_HEADERS})
    add_custom_command(TARGET re2_copy_headers_to_destination PRE_BUILD
        COMMAND ${CMAKE_COMMAND} -E copy ${header_file} ${re2_INCLUDE_DIR})
endforeach()

ADD_LIBRARY(re2_lib STATIC IMPORTED
    DEPENDS re2)
SET_TARGET_PROPERTIES(re2_lib PROPERTIES
    IMPORTED_LOCATION ${re2_STATIC_LIBRARIES})
tensorflow/contrib/cmake/install.cmake (new file, 1 line)

# [TODO]
tensorflow/contrib/cmake/patches/jpeg/CMakeLists.txt (new file, 76 lines)

cmake_minimum_required(VERSION 2.8.3)

project(libjpeg)

set(LIBJPEG_SRCS
    "jaricom.c"
    "jcapimin.c"
    "jcapistd.c"
    "jcarith.c"
    "jccoefct.c"
    "jccolor.c"
    "jcdctmgr.c"
    "jchuff.c"
    "jcinit.c"
    "jcmainct.c"
    "jcmarker.c"
    "jcmaster.c"
    "jcomapi.c"
    "jcparam.c"
    "jcprepct.c"
    "jcsample.c"
    "jctrans.c"
    "jdapimin.c"
    "jdapistd.c"
    "jdarith.c"
    "jdatadst.c"
    "jdatasrc.c"
    "jdcoefct.c"
    "jdcolor.c"
    "jddctmgr.c"
    "jdhuff.c"
    "jdinput.c"
    "jdmainct.c"
    "jdmarker.c"
    "jdmaster.c"
    "jdmerge.c"
    "jdpostct.c"
    "jdsample.c"
    "jdtrans.c"
    "jerror.c"
    "jfdctflt.c"
    "jfdctfst.c"
    "jfdctint.c"
    "jidctflt.c"
    "jidctfst.c"
    "jidctint.c"
    "jmemmgr.c"
    "jmemnobs.c"
    "jquant1.c"
    "jquant2.c"
    "jutils.c"
)
set(LIBJPEG_INCLUDES
    "jconfig.h"
    "jdct.h"
    "jerror.h"
    "jinclude.h"
    "jmemsys.h"
    "jmorecfg.h"
    "jpegint.h"
    "jpeglib.h"
    "jversion.h"
)

include_directories("${CMAKE_CURRENT_SOURCE_DIR}")

add_library(libjpeg ${LIBJPEG_SRCS})

install(TARGETS libjpeg
    RUNTIME DESTINATION bin COMPONENT RuntimeLibraries
    LIBRARY DESTINATION lib COMPONENT RuntimeLibraries
    ARCHIVE DESTINATION lib COMPONENT Development)

foreach(LIBJPEG_INCLUDE ${LIBJPEG_INCLUDES})
    install(FILES ${LIBJPEG_INCLUDE} DESTINATION include COMPONENT Development)
endforeach()

tensorflow/contrib/cmake/tests.cmake (new file, 1 line)
@@ -0,0 +1 @@
# [TODO]

tensorflow/contrib/cmake/tf_cc_ops.cmake (new file, 204 lines)
@@ -0,0 +1,204 @@
########################################################
# tf_cc_op_gen_main library
########################################################
set(tf_cc_op_gen_main_srcs
    "${tensorflow_source_dir}/tensorflow/cc/ops/cc_op_gen.cc"
    "${tensorflow_source_dir}/tensorflow/cc/ops/cc_op_gen_main.cc"
    "${tensorflow_source_dir}/tensorflow/cc/ops/cc_op_gen.h"
)

add_library(tf_cc_op_gen_main OBJECT ${tf_cc_op_gen_main_srcs})

add_dependencies(tf_cc_op_gen_main tf_core_framework)

target_include_directories(tf_cc_op_gen_main PRIVATE
    ${tensorflow_source_dir}
    ${eigen_INCLUDE_DIRS}
)

#target_link_libraries(tf_cc_op_gen_main
#    ${CMAKE_THREAD_LIBS_INIT}
#    ${PROTOBUF_LIBRARIES}
#    tf_protos_cc
#    tf_core_lib
#    tf_core_framework
#)

target_compile_options(tf_cc_op_gen_main PRIVATE
    -fno-exceptions
    -DEIGEN_AVOID_STL_ARRAY
)

# C++11
target_compile_features(tf_cc_op_gen_main PRIVATE
    cxx_rvalue_references
)

########################################################
# tf_gen_op_wrapper_cc executables
########################################################

#    # Run the op generator.
#    if name == "sendrecv_ops":
#        include_internal = "1"
#    else:
#        include_internal = "0"
#    native.genrule(
#        name=name + "_genrule",
#        outs=[out_ops_file + ".h", out_ops_file + ".cc"],
#        tools=[":" + tool],
#        cmd=("$(location :" + tool + ") $(location :" + out_ops_file + ".h) " +
#             "$(location :" + out_ops_file + ".cc) " + include_internal))

#def tf_gen_op_wrappers_cc(name,
#                          op_lib_names=[],
#                          other_srcs=[],
#                          other_hdrs=[],
#                          pkg=""):
#    subsrcs = other_srcs
#    subhdrs = other_hdrs
#    for n in op_lib_names:
#        tf_gen_op_wrapper_cc(n, "ops/" + n, pkg=pkg)
#        subsrcs += ["ops/" + n + ".cc"]
#        subhdrs += ["ops/" + n + ".h"]
#
#    native.cc_library(name=name,
#                      srcs=subsrcs,
#                      hdrs=subhdrs,
#                      deps=["//tensorflow/core:core_cpu"],
#                      copts=tf_copts(),
#                      alwayslink=1,)

# create directory for ops generated files
set(cc_ops_target_dir ${CMAKE_CURRENT_BINARY_DIR}/tensorflow/cc/ops)

add_custom_target(create_cc_ops_header_dir
    COMMAND ${CMAKE_COMMAND} -E make_directory ${cc_ops_target_dir}
)

set(tf_cc_ops_generated_files)

set(tf_cc_op_lib_names
    ${tf_op_lib_names}
    "user_ops"
)
foreach(tf_cc_op_lib_name ${tf_cc_op_lib_names})
    #tf_gen_op_wrapper_cc(name, out_ops_file, pkg=""):
    #    # Construct an op generator binary for these ops.
    #    tool = out_ops_file + "_gen_cc"  # example: ops/array_ops_gen_cc
    #    native.cc_binary(
    #        name = tool,
    #        copts = tf_copts(),
    #        linkopts = ["-lm"],
    #        linkstatic = 1,  # Faster to link this one-time-use binary dynamically
    #        deps = (["//tensorflow/cc:cc_op_gen_main",
    #                 pkg + ":" + name + "_op_lib"])
    #    )

    # Using <TARGET_OBJECTS:...> to work around an issue where no ops were
    # registered (static initializers dropped by the linker because the ops
    # are not used explicitly in the *_gen_cc executables).
    add_executable(${tf_cc_op_lib_name}_gen_cc
        $<TARGET_OBJECTS:tf_cc_op_gen_main>
        $<TARGET_OBJECTS:tf_${tf_cc_op_lib_name}>
        $<TARGET_OBJECTS:tf_core_lib>
        $<TARGET_OBJECTS:tf_core_framework>
    )

    target_include_directories(${tf_cc_op_lib_name}_gen_cc PRIVATE
        ${tensorflow_source_dir}
        ${eigen_INCLUDE_DIRS}
    )

    find_package(ZLIB REQUIRED)

    target_link_libraries(${tf_cc_op_lib_name}_gen_cc PRIVATE
        ${CMAKE_THREAD_LIBS_INIT}
        ${PROTOBUF_LIBRARIES}
        tf_protos_cc
        re2_lib
        ${jpeg_STATIC_LIBRARIES}
        ${png_STATIC_LIBRARIES}
        ${ZLIB_LIBRARIES}
    )

    target_compile_options(${tf_cc_op_lib_name}_gen_cc PRIVATE
        -fno-exceptions
        -DEIGEN_AVOID_STL_ARRAY
        -lm
    )

    # C++11
    target_compile_features(${tf_cc_op_lib_name}_gen_cc PRIVATE
        cxx_rvalue_references
    )

    set(cc_ops_include_internal 0)
    if(${tf_cc_op_lib_name} STREQUAL "sendrecv_ops")
        set(cc_ops_include_internal 1)
    endif()

    add_custom_command(
        OUTPUT ${cc_ops_target_dir}/${tf_cc_op_lib_name}.h
               ${cc_ops_target_dir}/${tf_cc_op_lib_name}.cc
        COMMAND ${tf_cc_op_lib_name}_gen_cc ${cc_ops_target_dir}/${tf_cc_op_lib_name}.h ${cc_ops_target_dir}/${tf_cc_op_lib_name}.cc ${cc_ops_include_internal}
        DEPENDS ${tf_cc_op_lib_name}_gen_cc create_cc_ops_header_dir
    )

    list(APPEND tf_cc_ops_generated_files ${cc_ops_target_dir}/${tf_cc_op_lib_name}.h)
    list(APPEND tf_cc_ops_generated_files ${cc_ops_target_dir}/${tf_cc_op_lib_name}.cc)
endforeach()


########################################################
# tf_cc_ops library
########################################################
add_library(tf_cc_ops OBJECT
    ${tf_cc_ops_generated_files}
    "${tensorflow_source_dir}/tensorflow/cc/ops/const_op.h"
    "${tensorflow_source_dir}/tensorflow/cc/ops/const_op.cc"
    "${tensorflow_source_dir}/tensorflow/cc/ops/standard_ops.h"
)

target_include_directories(tf_cc_ops PRIVATE
    ${tensorflow_source_dir}
    ${eigen_INCLUDE_DIRS}
)

#target_link_libraries(tf_cc_ops
#    ${CMAKE_THREAD_LIBS_INIT}
#    ${PROTOBUF_LIBRARIES}
#    tf_protos_cc
#    tf_core_lib
#    tf_core_cpu
#    tf_models_word2vec_ops
#)

target_compile_options(tf_cc_ops PRIVATE
    -fno-exceptions
    -DEIGEN_AVOID_STL_ARRAY
)

# C++11
target_compile_features(tf_cc_ops PRIVATE
    cxx_rvalue_references
)


#tf_gen_op_wrappers_cc(
#    name = "cc_ops",
#    op_lib_names = [
#        ...
#    ],
#    other_hdrs = [
#        "ops/const_op.h",
#        "ops/standard_ops.h",
#    ],
#    other_srcs = [
#        "ops/const_op.cc",
#    ] + glob(["ops/*_grad.cc"]),
#    pkg = "//tensorflow/core",
#)

tensorflow/contrib/cmake/tf_core_cpu.cmake (new file, 53 lines)
@@ -0,0 +1,53 @@
########################################################
# tf_core_cpu library
########################################################
file(GLOB_RECURSE tf_core_cpu_srcs
    "${tensorflow_source_dir}/tensorflow/core/common_runtime/*.h"
    "${tensorflow_source_dir}/tensorflow/core/common_runtime/*.cc"
    "${tensorflow_source_dir}/tensorflow/core/client/*.cc"
    "${tensorflow_source_dir}/tensorflow/core/graph/*.h"
    "${tensorflow_source_dir}/tensorflow/core/graph/*.cc"
    "${tensorflow_source_dir}/tensorflow/core/public/*.h"
)

file(GLOB_RECURSE tf_core_cpu_exclude_srcs
    "${tensorflow_source_dir}/tensorflow/core/*test*.h"
    "${tensorflow_source_dir}/tensorflow/core/*test*.cc"
    "${tensorflow_source_dir}/tensorflow/core/*main.cc"
    "${tensorflow_source_dir}/tensorflow/core/common_runtime/gpu/*.cc"
    "${tensorflow_source_dir}/tensorflow/core/common_runtime/gpu_device_factory.cc"
    "${tensorflow_source_dir}/tensorflow/core/common_runtime/direct_session.cc"
    "${tensorflow_source_dir}/tensorflow/core/common_runtime/direct_session.h"
)

list(REMOVE_ITEM tf_core_cpu_srcs ${tf_core_cpu_exclude_srcs})

add_library(tf_core_cpu OBJECT ${tf_core_cpu_srcs})

target_include_directories(tf_core_cpu PRIVATE
    ${tensorflow_source_dir}
    ${eigen_INCLUDE_DIRS}
    ${re2_INCLUDES}
)

add_dependencies(tf_core_cpu
    tf_core_framework
)
#target_link_libraries(tf_core_cpu
#    ${CMAKE_THREAD_LIBS_INIT}
#    ${PROTOBUF_LIBRARIES}
#    tf_core_framework
#    tf_core_lib
#    tf_protos_cc
#)

target_compile_options(tf_core_cpu PRIVATE
    -fno-exceptions
    -DEIGEN_AVOID_STL_ARRAY
)

# C++11
target_compile_features(tf_core_cpu PRIVATE
    cxx_rvalue_references
)

tensorflow/contrib/cmake/tf_core_direct_session.cmake (new file, 35 lines)
@@ -0,0 +1,35 @@
########################################################
# tf_core_direct_session library
########################################################
file(GLOB tf_core_direct_session_srcs
    "${tensorflow_source_dir}/tensorflow/core/common_runtime/direct_session.cc"
    "${tensorflow_source_dir}/tensorflow/core/common_runtime/direct_session.h"
)

add_library(tf_core_direct_session OBJECT ${tf_core_direct_session_srcs})

add_dependencies(tf_core_direct_session tf_core_cpu)

target_include_directories(tf_core_direct_session PRIVATE
    ${tensorflow_source_dir}
    ${eigen_INCLUDE_DIRS}
)

#target_link_libraries(tf_core_direct_session
#    ${CMAKE_THREAD_LIBS_INIT}
#    ${PROTOBUF_LIBRARIES}
#    tf_core_cpu
#    tf_core_framework
#    tf_core_lib
#    tf_protos_cc
#)

target_compile_options(tf_core_direct_session PRIVATE
    -fno-exceptions
    -DEIGEN_AVOID_STL_ARRAY
)

# C++11
target_compile_features(tf_core_direct_session PRIVATE
    cxx_rvalue_references
)

tensorflow/contrib/cmake/tf_core_framework.cmake (new file, 165 lines)
@@ -0,0 +1,165 @@
########################################################
# RELATIVE_PROTOBUF_GENERATE_CPP function
########################################################
# A variant of PROTOBUF_GENERATE_CPP that keeps the directory hierarchy.
# ROOT_DIR must be absolute, and proto paths must be relative to ROOT_DIR.
function(RELATIVE_PROTOBUF_GENERATE_CPP SRCS HDRS ROOT_DIR)
    if(NOT ARGN)
        message(SEND_ERROR "Error: RELATIVE_PROTOBUF_GENERATE_CPP() called without any proto files")
        return()
    endif()

    set(${SRCS})
    set(${HDRS})
    foreach(FIL ${ARGN})
        set(ABS_FIL ${ROOT_DIR}/${FIL})
        get_filename_component(FIL_WE ${FIL} NAME_WE)
        get_filename_component(FIL_DIR ${ABS_FIL} PATH)
        file(RELATIVE_PATH REL_DIR ${ROOT_DIR} ${FIL_DIR})

        list(APPEND ${SRCS} "${CMAKE_CURRENT_BINARY_DIR}/${REL_DIR}/${FIL_WE}.pb.cc")
        list(APPEND ${HDRS} "${CMAKE_CURRENT_BINARY_DIR}/${REL_DIR}/${FIL_WE}.pb.h")

        add_custom_command(
            OUTPUT "${CMAKE_CURRENT_BINARY_DIR}/${REL_DIR}/${FIL_WE}.pb.cc"
                   "${CMAKE_CURRENT_BINARY_DIR}/${REL_DIR}/${FIL_WE}.pb.h"
            COMMAND ${PROTOBUF_PROTOC_EXECUTABLE}
            ARGS --cpp_out ${CMAKE_CURRENT_BINARY_DIR} -I ${ROOT_DIR} ${ABS_FIL}
            DEPENDS ${ABS_FIL} ${PROTOBUF_PROTOC_EXECUTABLE}
            COMMENT "Running C++ protocol buffer compiler on ${FIL}"
            VERBATIM)
    endforeach()

    set_source_files_properties(${${SRCS}} ${${HDRS}} PROPERTIES GENERATED TRUE)
    set(${SRCS} ${${SRCS}} PARENT_SCOPE)
    set(${HDRS} ${${HDRS}} PARENT_SCOPE)
endfunction()


########################################################
# tf_protos_cc library
########################################################

# Build proto library
include(FindProtobuf)
find_package(Protobuf REQUIRED)
include_directories(${PROTOBUF_INCLUDE_DIRS})
include_directories(${CMAKE_CURRENT_BINARY_DIR})
file(GLOB_RECURSE tf_protos_cc_srcs RELATIVE ${tensorflow_source_dir}
    "${tensorflow_source_dir}/tensorflow/*.proto"
)
RELATIVE_PROTOBUF_GENERATE_CPP(PROTO_SRCS PROTO_HDRS
    ${tensorflow_source_dir} ${tf_protos_cc_srcs}
)

add_library(tf_protos_cc ${PROTO_SRCS} ${PROTO_HDRS})
target_include_directories(tf_protos_cc PUBLIC
    ${CMAKE_CURRENT_BINARY_DIR}
)
target_link_libraries(tf_protos_cc PUBLIC
    ${PROTOBUF_LIBRARIES}
)


########################################################
# tf_core_lib library
########################################################
file(GLOB_RECURSE tf_core_lib_srcs
    "${tensorflow_source_dir}/tensorflow/core/lib/*.h"
    "${tensorflow_source_dir}/tensorflow/core/lib/*.cc"
    "${tensorflow_source_dir}/tensorflow/core/platform/*.h"
    "${tensorflow_source_dir}/tensorflow/core/platform/*.cc"
    "${tensorflow_source_dir}/tensorflow/core/public/*.h"
)

file(GLOB_RECURSE tf_core_lib_test_srcs
    "${tensorflow_source_dir}/tensorflow/core/lib/*test*.h"
    "${tensorflow_source_dir}/tensorflow/core/lib/*test*.cc"
    "${tensorflow_source_dir}/tensorflow/core/platform/*test*.h"
    "${tensorflow_source_dir}/tensorflow/core/platform/*test*.cc"
    "${tensorflow_source_dir}/tensorflow/core/public/*test*.h"
)

list(REMOVE_ITEM tf_core_lib_srcs ${tf_core_lib_test_srcs})

add_library(tf_core_lib OBJECT ${tf_core_lib_srcs})
target_include_directories(tf_core_lib PUBLIC
    ${tensorflow_source_dir}
    ${jpeg_INCLUDE_DIR}
    ${png_INCLUDE_DIR}
)
#target_link_libraries(tf_core_lib
#    ${CMAKE_THREAD_LIBS_INIT}
#    ${PROTOBUF_LIBRARIES}
#    tf_protos_cc
#)
target_compile_options(tf_core_lib PRIVATE
    -fno-exceptions
    -DEIGEN_AVOID_STL_ARRAY
)

# C++11
target_compile_features(tf_core_lib PRIVATE
    cxx_rvalue_references
)

add_dependencies(tf_core_lib
    jpeg_copy_headers_to_destination
    png_copy_headers_to_destination
    re2_copy_headers_to_destination
    eigen
    tf_protos_cc
)


########################################################
# tf_core_framework library
########################################################
file(GLOB_RECURSE tf_core_framework_srcs
    "${tensorflow_source_dir}/tensorflow/core/framework/*.h"
    "${tensorflow_source_dir}/tensorflow/core/framework/*.cc"
    "${tensorflow_source_dir}/tensorflow/core/util/*.h"
    "${tensorflow_source_dir}/tensorflow/core/util/*.cc"
    "${tensorflow_source_dir}/public/*.h"
)

file(GLOB_RECURSE tf_core_framework_test_srcs
    "${tensorflow_source_dir}/tensorflow/core/framework/*test*.h"
    "${tensorflow_source_dir}/tensorflow/core/framework/*test*.cc"
    "${tensorflow_source_dir}/tensorflow/core/framework/*testutil.h"
    "${tensorflow_source_dir}/tensorflow/core/framework/*testutil.cc"
    "${tensorflow_source_dir}/tensorflow/core/framework/*main.cc"
    "${tensorflow_source_dir}/tensorflow/core/util/*test*.h"
    "${tensorflow_source_dir}/tensorflow/core/util/*test*.cc"
    "${tensorflow_source_dir}/tensorflow/core/util/*main.cc"
)

list(REMOVE_ITEM tf_core_framework_srcs ${tf_core_framework_test_srcs})

add_library(tf_core_framework OBJECT ${tf_core_framework_srcs})
target_include_directories(tf_core_framework PUBLIC
    ${tensorflow_source_dir}
    ${eigen_INCLUDE_DIRS}
    ${re2_INCLUDES}
)
#target_link_libraries(tf_core_framework
#    ${CMAKE_THREAD_LIBS_INIT}
#    ${PROTOBUF_LIBRARIES}
#    #${re2_STATIC_LIBRARIES}
#    re2_lib
#    ${jpeg_STATIC_LIBRARIES}
#    ${png_STATIC_LIBRARIES}
#    tf_protos_cc
#    tf_core_lib
#)
add_dependencies(tf_core_framework
    tf_core_lib
)
target_compile_options(tf_core_framework PRIVATE
    -fno-exceptions
    -DEIGEN_AVOID_STL_ARRAY
)
# C++11
target_compile_features(tf_core_framework PRIVATE
    cxx_rvalue_references
)

tensorflow/contrib/cmake/tf_core_kernels.cmake (new file, 53 lines)
@@ -0,0 +1,53 @@
########################################################
# tf_core_kernels library
########################################################
file(GLOB_RECURSE tf_core_kernels_srcs
    "${tensorflow_source_dir}/tensorflow/core/kernels/*.h"
    "${tensorflow_source_dir}/tensorflow/core/kernels/*.cc"
)

file(GLOB_RECURSE tf_core_kernels_exclude_srcs
    "${tensorflow_source_dir}/tensorflow/core/kernels/*test*.h"
    "${tensorflow_source_dir}/tensorflow/core/kernels/*test*.cc"
    "${tensorflow_source_dir}/tensorflow/core/kernels/*testutil.h"
    "${tensorflow_source_dir}/tensorflow/core/kernels/*testutil.cc"
    "${tensorflow_source_dir}/tensorflow/core/kernels/*main.cc"
    "${tensorflow_source_dir}/tensorflow/core/kernels/*.cu.cc"
)

list(REMOVE_ITEM tf_core_kernels_srcs ${tf_core_kernels_exclude_srcs})

add_library(tf_core_kernels OBJECT ${tf_core_kernels_srcs})

add_dependencies(tf_core_kernels tf_core_cpu)

target_include_directories(tf_core_kernels PRIVATE
    ${tensorflow_source_dir}
    ${png_INCLUDE_DIR}
    ${eigen_INCLUDE_DIRS}
)

#target_link_libraries(tf_core_kernels
#    ${CMAKE_THREAD_LIBS_INIT}
#    ${PROTOBUF_LIBRARIES}
#    tf_core_cpu
#    tf_core_framework
#    tf_core_lib
#    tf_protos_cc
#    tf_models_word2vec_kernels
#    tf_stream_executor
#    tf_core_ops
#    tf_core_cpu
#)

# "@gemmlowp//:eight_bit_int_gemm",

target_compile_options(tf_core_kernels PRIVATE
    -fno-exceptions
    -DEIGEN_AVOID_STL_ARRAY
)

# C++11
target_compile_features(tf_core_kernels PRIVATE
    cxx_rvalue_references
)

tensorflow/contrib/cmake/tf_core_ops.cmake (new file, 181 lines)
@@ -0,0 +1,181 @@
#def tf_gen_op_libs(op_lib_names):
#    # Make library out of each op so it can also be used to generate wrappers
#    # for various languages.
#    for n in op_lib_names:
#        native.cc_library(name=n + "_op_lib"
#                          copts=tf_copts(),
#                          srcs=["ops/" + n + ".cc"],
#                          deps=(["//tensorflow/core:framework"]),
#                          visibility=["//visibility:public"],
#                          alwayslink=1,
#                          linkstatic=1,)


set(tf_op_lib_names
    "array_ops"
    "attention_ops"
    "candidate_sampling_ops"
    "control_flow_ops"
    "data_flow_ops"
    "image_ops"
    "io_ops"
    "linalg_ops"
    "logging_ops"
    "functional_ops"
    "math_ops"
    "nn_ops"
    "no_op"
    "parsing_ops"
    "random_ops"
    "script_ops"
    "sendrecv_ops"
    "sparse_ops"
    "state_ops"
    "string_ops"
    "summary_ops"
    "training_ops"
)

foreach(tf_op_lib_name ${tf_op_lib_names})
    ########################################################
    # tf_${tf_op_lib_name} library
    ########################################################
    file(GLOB tf_${tf_op_lib_name}_srcs
        "${tensorflow_source_dir}/tensorflow/core/ops/${tf_op_lib_name}.cc"
    )

    add_library(tf_${tf_op_lib_name} OBJECT ${tf_${tf_op_lib_name}_srcs})

    add_dependencies(tf_${tf_op_lib_name} tf_core_framework)

    target_include_directories(tf_${tf_op_lib_name} PRIVATE
        ${tensorflow_source_dir}
        ${eigen_INCLUDE_DIRS}
    )

    target_compile_options(tf_${tf_op_lib_name} PRIVATE
        -fno-exceptions
        -DEIGEN_AVOID_STL_ARRAY
    )

    # C++11
    target_compile_features(tf_${tf_op_lib_name} PRIVATE
        cxx_rvalue_references
    )
endforeach()

#cc_library(
#    name = "user_ops_op_lib"
#    srcs = glob(["user_ops/**/*.cc"]),
#    copts = tf_copts(),
#    linkstatic = 1,
#    visibility = ["//visibility:public"],
#    deps = [":framework"],
#    alwayslink = 1,
#)
########################################################
# tf_user_ops library
########################################################
file(GLOB_RECURSE tf_user_ops_srcs
    "${tensorflow_source_dir}/tensorflow/core/user_ops/*.cc"
)

add_library(tf_user_ops OBJECT ${tf_user_ops_srcs})

add_dependencies(tf_user_ops tf_core_framework)

target_include_directories(tf_user_ops PRIVATE
    ${tensorflow_source_dir}
    ${eigen_INCLUDE_DIRS}
)

target_compile_options(tf_user_ops PRIVATE
    -fno-exceptions
    -DEIGEN_AVOID_STL_ARRAY
)

# C++11
target_compile_features(tf_user_ops PRIVATE
    cxx_rvalue_references
)


#tf_cuda_library(
#    name = "ops"
#    srcs = glob(
#        [
#            "ops/**/*.h"
#            "ops/**/*.cc"
#            "user_ops/**/*.h"
#            "user_ops/**/*.cc"
#        ],
#        exclude = [
#            "**/*test*"
#            "**/*main.cc"
#            "user_ops/**/*.cu.cc"
#        ],
#    ),
#    copts = tf_copts(),
#    linkstatic = 1,
#    visibility = ["//visibility:public"],
#    deps = [
#        ":core"
#        ":lib"
#        ":protos_cc"
#        "//tensorflow/models/embedding:word2vec_ops"
#        "//third_party/eigen3"
#    ],
#    alwayslink = 1,
#)

########################################################
# tf_core_ops library
########################################################
file(GLOB_RECURSE tf_core_ops_srcs
    "${tensorflow_source_dir}/tensorflow/core/ops/*.h"
    "${tensorflow_source_dir}/tensorflow/core/ops/*.cc"
    "${tensorflow_source_dir}/tensorflow/core/user_ops/*.h"
    "${tensorflow_source_dir}/tensorflow/core/user_ops/*.cc"
)

file(GLOB_RECURSE tf_core_ops_exclude_srcs
    "${tensorflow_source_dir}/tensorflow/core/ops/*test*.h"
    "${tensorflow_source_dir}/tensorflow/core/ops/*test*.cc"
    "${tensorflow_source_dir}/tensorflow/core/ops/*main.cc"
    "${tensorflow_source_dir}/tensorflow/core/user_ops/*test*.h"
    "${tensorflow_source_dir}/tensorflow/core/user_ops/*test*.cc"
    "${tensorflow_source_dir}/tensorflow/core/user_ops/*main.cc"
    "${tensorflow_source_dir}/tensorflow/core/user_ops/*.cu.cc"
)

list(REMOVE_ITEM tf_core_ops_srcs ${tf_core_ops_exclude_srcs})

add_library(tf_core_ops OBJECT ${tf_core_ops_srcs})

add_dependencies(tf_core_ops tf_core_cpu)

target_include_directories(tf_core_ops PRIVATE
    ${tensorflow_source_dir}
    ${eigen_INCLUDE_DIRS}
)

#target_link_libraries(tf_core_ops
#    ${CMAKE_THREAD_LIBS_INIT}
#    ${PROTOBUF_LIBRARIES}
#    tf_protos_cc
#    tf_core_lib
#    tf_core_cpu
#    tf_models_word2vec_ops
#)

target_compile_options(tf_core_ops PRIVATE
    -fno-exceptions
    -DEIGEN_AVOID_STL_ARRAY
)

# C++11
target_compile_features(tf_core_ops PRIVATE
    cxx_rvalue_references
)

tensorflow/contrib/cmake/tf_models.cmake (new file, 95 lines)
@@ -0,0 +1,95 @@
#cc_library(
#    name = "word2vec_ops",
#    srcs = [
#        "word2vec_ops.cc",
#    ],
#    visibility = ["//tensorflow:internal"],
#    deps = [
#        "//tensorflow/core:framework",
#    ],
#    alwayslink = 1,
#)

########################################################
# tf_models_word2vec_ops library
########################################################
file(GLOB tf_models_word2vec_ops_srcs
    "${tensorflow_source_dir}/tensorflow/models/embedding/word2vec_ops.cc"
)

add_library(tf_models_word2vec_ops OBJECT ${tf_models_word2vec_ops_srcs})

target_include_directories(tf_models_word2vec_ops PRIVATE
    ${tensorflow_source_dir}
    ${eigen_INCLUDE_DIRS}
)

add_dependencies(tf_models_word2vec_ops
    tf_core_framework
)
#target_link_libraries(tf_models_word2vec_ops
#    ${CMAKE_THREAD_LIBS_INIT}
#    ${PROTOBUF_LIBRARIES}
#    tf_core_framework
#    tf_core_lib
#    tf_protos_cc
#)

target_compile_options(tf_models_word2vec_ops PRIVATE
    -fno-exceptions
    -DEIGEN_AVOID_STL_ARRAY
)

# C++11
target_compile_features(tf_models_word2vec_ops PRIVATE
    cxx_rvalue_references
)

#cc_library(
#    name = "word2vec_kernels",
#    srcs = [
#        "word2vec_kernels.cc",
#    ],
#    visibility = ["//tensorflow:internal"],
#    deps = [
#        "//tensorflow/core",
#    ],
#    alwayslink = 1,
#)
########################################################
# tf_models_word2vec_kernels library
########################################################
file(GLOB tf_models_word2vec_kernels_srcs
    "${tensorflow_source_dir}/tensorflow/models/embedding/word2vec_kernels.cc"
)

add_library(tf_models_word2vec_kernels OBJECT ${tf_models_word2vec_kernels_srcs})

target_include_directories(tf_models_word2vec_kernels PRIVATE
    ${tensorflow_source_dir}
    ${eigen_INCLUDE_DIRS}
    ${re2_INCLUDES}
)

add_dependencies(tf_models_word2vec_ops
    tf_core_cpu
)

#target_link_libraries(tf_models_word2vec_kernels
#    ${CMAKE_THREAD_LIBS_INIT}
#    ${PROTOBUF_LIBRARIES}
#    tf_core_framework
#    tf_core_lib
#    tf_protos_cc
#    tf_core_cpu
#)

target_compile_options(tf_models_word2vec_kernels PRIVATE
    -fno-exceptions
    -DEIGEN_AVOID_STL_ARRAY
)

# C++11
target_compile_features(tf_models_word2vec_kernels PRIVATE
    cxx_rvalue_references
)

tensorflow/contrib/cmake/tf_stream_executor.cmake (new file, 81 lines)
@@ -0,0 +1,81 @@
#cc_library(
#    name = "stream_executor",
#    srcs = glob(
#        [
#XX         "*.cc",
#            "lib/*.cc",
#        ],
#        exclude = [
#            "**/*_test.cc",
#        ],
#    ) + if_cuda(
#        glob([
#            "cuda/*.cc",
#        ]),
#    ),
#    hdrs = glob([
#        "*.h",
#        "cuda/*.h",
#        "lib/*.h",
#        "platform/**/*.h",
#    ]),
#    data = [
#        "//tensorflow/core:cuda",
#        "//third_party/gpus/cuda:cublas",
#        "//third_party/gpus/cuda:cudnn",
#    ],
#    linkopts = [
#        "-ldl",
#    ],
#    visibility = ["//visibility:public"],
#    deps = [
#        "//tensorflow/core:lib",
#        "//third_party/gpus/cuda:cuda_headers",
#    ],
#    alwayslink = 1,
#)

########################################################
# tf_stream_executor library
########################################################
file(GLOB tf_stream_executor_srcs
    "${tensorflow_source_dir}/tensorflow/stream_executor/*.cc"
    "${tensorflow_source_dir}/tensorflow/stream_executor/*.h"
    "${tensorflow_source_dir}/tensorflow/stream_executor/lib/*.cc"
    "${tensorflow_source_dir}/tensorflow/stream_executor/lib/*.h"
    "${tensorflow_source_dir}/tensorflow/stream_executor/platform/*.h"
    "${tensorflow_source_dir}/tensorflow/stream_executor/platform/default/*.h"
)

#file(GLOB_RECURSE tf_stream_executor_test_srcs
#    "${tensorflow_source_dir}/tensorflow/stream_executor/*_test.cc"
#    "${tensorflow_source_dir}/tensorflow/stream_executor/*_test.h"
#)
#
#list(REMOVE_ITEM tf_stream_executor_srcs ${tf_stream_executor_test_srcs})

add_library(tf_stream_executor OBJECT ${tf_stream_executor_srcs})

target_include_directories(tf_stream_executor PRIVATE
    ${tensorflow_source_dir}
)
add_dependencies(tf_stream_executor
    tf_core_lib
)
#target_link_libraries(tf_stream_executor
#    ${CMAKE_THREAD_LIBS_INIT}
#    ${PROTOBUF_LIBRARIES}
#    tf_protos_cc
#    tf_core_lib
#)

target_compile_options(tf_stream_executor PRIVATE
    -fno-exceptions
    -DEIGEN_AVOID_STL_ARRAY
)

# C++11
target_compile_features(tf_stream_executor PRIVATE
    cxx_rvalue_references
)

tensorflow/contrib/cmake/tf_tutorials.cmake (new file, 54 lines)
@@ -0,0 +1,54 @@
#cc_binary(
#    name = "tutorials_example_trainer",
#    srcs = ["tutorials/example_trainer.cc"],
#    copts = tf_copts(),
#    linkopts = [
#        "-lpthread",
#        "-lm",
#    ],
#    deps = [
#        ":cc_ops",
#        "//tensorflow/core:kernels",
#        "//tensorflow/core:tensorflow",
#    ],
#)

set(tf_tutorials_example_trainer_srcs
    "${tensorflow_source_dir}/tensorflow/cc/tutorials/example_trainer.cc"
)

add_executable(tf_tutorials_example_trainer
    ${tf_tutorials_example_trainer_srcs}
    $<TARGET_OBJECTS:tf_core_lib>
    $<TARGET_OBJECTS:tf_core_cpu>
    $<TARGET_OBJECTS:tf_core_framework>
    $<TARGET_OBJECTS:tf_core_kernels>
    $<TARGET_OBJECTS:tf_cc_ops>
    $<TARGET_OBJECTS:tf_core_ops>
    $<TARGET_OBJECTS:tf_core_direct_session>
)

target_include_directories(tf_tutorials_example_trainer PUBLIC
    ${tensorflow_source_dir}
    ${eigen_INCLUDE_DIRS}
)

target_link_libraries(tf_tutorials_example_trainer PUBLIC
    ${CMAKE_THREAD_LIBS_INIT}
    ${PROTOBUF_LIBRARIES}
    tf_protos_cc
    re2_lib
    ${jpeg_STATIC_LIBRARIES}
    ${png_STATIC_LIBRARIES}
    ${ZLIB_LIBRARIES}
)

target_compile_options(tf_tutorials_example_trainer PRIVATE
    -fno-exceptions
    -DEIGEN_AVOID_STL_ARRAY
)

# C++11
target_compile_features(tf_tutorials_example_trainer PRIVATE
    cxx_rvalue_references
)

@@ -79,7 +79,7 @@ def _reduce_batch(x, reduce_fn, name=None):
   elif ndims == 1:
     return x  # Don't include a useless reduction.
   elif ndims:
-    reduction_indices = range(1, ndims)
+    reduction_indices = list(range(1, ndims))
     shape = [x.get_shape().dims[0]]
   else:
     reduction_indices = math_ops.range(1, array_ops.size(array_ops.shape(x)))
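The one-line change in the hunk above is the standard Python 3 compatibility fix: `range` returns a lazy range object rather than a list, so code that needs an actual list of reduction indices must materialize it. A minimal sketch of the language behavior (plain Python, not TensorFlow code):

```python
# In Python 3, range() returns a lazy range object, not a list.
ndims = 4
lazy = range(1, ndims)
print(isinstance(lazy, list))  # False

# Wrapping it in list() materializes the indices, restoring the
# Python 2 behavior the original code relied on.
reduction_indices = list(range(1, ndims))
print(reduction_indices)  # [1, 2, 3]
```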

@@ -73,11 +73,6 @@ struct Regularizations {
   float symmetric_l2 = 0;
 };
 
-struct RegularizationLoss {
-  double l1_loss = 0;
-  double l2_loss = 0;
-};
-
 struct PerExampleData {
   double wx = 0;
   double norm = 0;
@@ -102,7 +97,7 @@ using DenseFeaturesByGroup = std::vector<TTypes<const float>::Vec>;
// indicates that the contents of sparse_examples_by_group cannot be trusted or
// used.
Status FillSparseExamplesByGroup(
    const int64 num_sparse_features, const int64 num_examples,
    const int64 num_sparse_features, const int num_examples,
    const OpInputList& sparse_features_indices_inputs,
    const OpInputList& sparse_features_values_inputs,
    const WeightsByGroup& sparse_weights_by_group,
@@ -127,7 +122,10 @@ Status FillSparseExamplesByGroup(
    static const int64 kIndicesDims = 2;
    gtl::InlinedVector<int64, 8> order(kIndicesDims);
    std::iota(order.begin(), order.end(), 0);
    for (int64 i = begin; i < end; ++i) {

    // The static_cast here is safe since begin and end can be at most
    // num_examples which is an int.
    for (int i = static_cast<int>(begin); i < end; ++i) {
      if (sparse_features_indices_inputs[i].shape().dims() != kIndicesDims) {
        mutex_lock l(mu);
        result = errors::InvalidArgument(strings::Printf(
@@ -147,7 +145,7 @@ Status FillSparseExamplesByGroup(
      if (example_index < 0 || example_index >= num_examples) {
        mutex_lock l(mu);
        result = errors::Internal(strings::Printf(
            "Example indices should be in [0, %lld). Encountered: %lld",
            "Example indices should be in [0, %d). Encountered: %lld",
            num_examples, example_index));
        return;
      }
@@ -203,35 +201,6 @@ inline double Shrink(const double weight, const double shrink_by) {
  return 0.0;
}

// Compute L1 and L2 regularization loss.
inline RegularizationLoss ComputeRegularizationLoss(
    const WeightsByGroup& sparse_weights_by_group,
    const WeightsByGroup& dense_weights_by_group,
    const Regularizations& regularizations) {
  RegularizationLoss result;

  const double shrink_by = ShrinkageFactor(regularizations);
  auto accumulate_regularization_loss = [&](const double w) {
    const double sw = std::abs(Shrink(w, shrink_by));
    result.l1_loss += sw;
    result.l2_loss += sw * sw;
  };

  for (const TTypes<float>::Vec weights : sparse_weights_by_group) {
    for (int64 i = 0; i < weights.size(); ++i) {
      accumulate_regularization_loss(weights(i));
    }
  }

  for (const TTypes<float>::Vec weights : dense_weights_by_group) {
    accumulate_regularization_loss(weights(0));
  }

  result.l1_loss *= regularizations.symmetric_l1;
  result.l2_loss *= regularizations.symmetric_l2;
  return result;
}

// Compute PerExampleData which contains the logits, and weighted example norm
// for a given example_id. Norm is weighted by 1/(lambda*N).
inline PerExampleData ComputeWxAndWeightedExampleNorm(
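The removed `ComputeRegularizationLoss` above shrinks each weight toward zero and then accumulates `|w|` and `w^2` before scaling by the l1/l2 strengths. A minimal Python sketch of that arithmetic, assuming `Shrink` is a soft-threshold (the function names here are illustrative, not the kernel's actual API):

```python
def shrink(weight, shrink_by):
    # Soft-threshold: move weight toward zero by shrink_by, clamping at zero.
    if weight > shrink_by:
        return weight - shrink_by
    if weight < -shrink_by:
        return weight + shrink_by
    return 0.0


def regularization_loss(weight_groups, l1, l2, shrink_by):
    # Accumulate |shrunk w| and (shrunk w)^2 over every weight, then scale.
    l1_loss = 0.0
    l2_loss = 0.0
    for weights in weight_groups:
        for w in weights:
            sw = abs(shrink(w, shrink_by))
            l1_loss += sw
            l2_loss += sw * sw
    return l1 * l1_loss, l2 * l2_loss
```

For example, with a single group `[2.0, -0.5]` and `shrink_by=1.0`, only the first weight survives the threshold, contributing 1.0 to both losses.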
@@ -380,7 +349,7 @@ WeightsByGroup MakeDeltaWeightsFrom(std::vector<Tensor>* const tensors) {
}

Status RunTrainStepsForMiniBatch(
    const int64 num_examples, const TTypes<const string>::Vec example_ids,
    const int num_examples, const TTypes<const string>::Vec example_ids,
    const TTypes<const float>::Vec example_labels,
    const TTypes<const float>::Vec example_weights,
    const DeviceBase::CpuWorkerThreads& worker_threads,
@@ -459,6 +428,13 @@ Status RunTrainStepsForMiniBatch(
  return train_step_status;
}

Status FillRegularizations(OpKernelConstruction* const context,
                           Regularizations* const regularizations) {
  TF_RETURN_IF_ERROR(context->GetAttr("l1", &regularizations->symmetric_l1));
  TF_RETURN_IF_ERROR(context->GetAttr("l2", &regularizations->symmetric_l2));
  return Status::OK();
}

}  // namespace

class SdcaSolver : public OpKernel {
@@ -484,25 +460,9 @@ class SdcaSolver : public OpKernel {
    OP_REQUIRES(
        context, num_sparse_features_ + num_dense_features_ > 0,
        errors::InvalidArgument("Requires at least one feature to train."));

    OP_REQUIRES_OK(context,
                   context->GetAttr("l1", &regularizations_.symmetric_l1));
    OP_REQUIRES_OK(context,
                   context->GetAttr("l2", &regularizations_.symmetric_l2));
    // We enforce a minimal l2, required by the algorithm.
    regularizations_.symmetric_l2 =
        std::max(regularizations_.symmetric_l2, 1.0f);

    OP_REQUIRES_OK(context, FillRegularizations(context, &regularizations_));
    OP_REQUIRES_OK(context, context->GetAttr("num_inner_iterations",
                                             &num_inner_iterations_));

    // TODO(rohananil): Provide emperical evidence for this. It is better to run
    // more than one iteration on single mini-batch as we want to spend more
    // time in compute. SDCA works better with larger mini batches and there
    // is also recent work that shows its better to reuse old samples than train
    // on new samples. See: http://arxiv.org/abs/1602.02136.
    num_inner_iterations_ =
        std::max(num_inner_iterations_, static_cast<int64>(2));
    OP_REQUIRES_OK(context, context->GetAttr("container", &container_));
    OP_REQUIRES_OK(context, context->GetAttr("solver_uuid", &solver_uuid_));
  }
@@ -533,21 +493,16 @@ class SdcaSolver : public OpKernel {
    OP_REQUIRES(context, TensorShapeUtils::IsVector(example_weights_t->shape()),
                errors::InvalidArgument("example_weights should be a vector."));
    const auto example_weights = example_weights_t->vec<float>();

    Eigen::Tensor<float, 0, Eigen::RowMajor> example_weights_sum;
    example_weights_sum.device(context->eigen_cpu_device()) =
        example_weights.sum();
    const float weighted_examples = example_weights_sum();
    const int64 num_examples = example_weights.size();

    OP_REQUIRES(context, weighted_examples > 0,
                errors::InvalidArgument("No weighted examples in ",
                                        num_examples, " training examples"));
    OP_REQUIRES(context,
                example_weights.size() <= std::numeric_limits<int>::max(),
                errors::InvalidArgument(strings::Printf(
                    "Too many examples in a mini-batch: %ld > %d",
                    example_weights.size(), std::numeric_limits<int>::max())));
    const int num_examples = static_cast<int>(example_weights.size());

    OpInputList dense_features_inputs;
    OP_REQUIRES_OK(
        context, context->input_list("dense_features", &dense_features_inputs));

    DenseFeaturesByGroup dense_features_by_group;
    for (const auto& dense_feature : dense_features_inputs) {
      dense_features_by_group.emplace_back(dense_feature.vec<float>());
@@ -562,7 +517,7 @@ class SdcaSolver : public OpKernel {
    OP_REQUIRES(context, example_labels.size() == num_examples,
                errors::InvalidArgument(strings::Printf(
                    "The number of example labels (%ld) should match the "
                    "number of example weights (%lld).",
                    "number of example weights (%d).",
                    example_labels.size(), num_examples)));

    const Tensor* example_ids_t;
@@ -573,7 +528,7 @@ class SdcaSolver : public OpKernel {
    OP_REQUIRES(context, example_labels.size() == num_examples,
                errors::InvalidArgument(strings::Printf(
                    "The number of example ids (%ld) should match the number "
                    "of example weights (%lld).",
                    "of example weights (%d).",
                    example_ids.size(), num_examples)));
    const int64 num_duplicate_example_ids = [&] {
      // TODO(katsiapis): Benchmark and/or optimize.
@@ -632,12 +587,7 @@ class SdcaSolver : public OpKernel {
    SetZeroDeltaWeights(&sparse_delta_weights_by_group,
                        &dense_delta_weights_by_group);

    // TODO(rohananil): Provide emperical evidence for this. It is better to run
    // more than one iteration on single mini-batch as we want to spend more
    // time in compute. SDCA works better with larger mini batches and there
    // is also recent work that shows its better to reuse old samples than train
    // on new samples. See: http://arxiv.org/abs/1602.02136.
    for (int64 i = 0; i < num_inner_iterations_; ++i) {
    for (int i = 0; i < num_inner_iterations_; ++i) {
      OP_REQUIRES_OK(
          context,
          RunTrainStepsForMiniBatch(
@@ -669,7 +619,7 @@ class SdcaSolver : public OpKernel {
  int64 num_sparse_features_;
  int64 num_dense_features_;
  Regularizations regularizations_;
  int64 num_inner_iterations_;
  int num_inner_iterations_;
  string container_;
  string solver_uuid_;
};
@@ -678,13 +628,7 @@ REGISTER_KERNEL_BUILDER(Name("SdcaSolver").Device(DEVICE_CPU), SdcaSolver);
class SdcaShrinkL1 : public OpKernel {
 public:
  explicit SdcaShrinkL1(OpKernelConstruction* context) : OpKernel(context) {
    OP_REQUIRES_OK(context,
                   context->GetAttr("l1", &regularizations_.symmetric_l1));
    OP_REQUIRES_OK(context,
                   context->GetAttr("l2", &regularizations_.symmetric_l2));
    // We enforce a minimal l2, required by the algorithm.
    regularizations_.symmetric_l2 =
        std::max(regularizations_.symmetric_l2, 1.0f);
    OP_REQUIRES_OK(context, FillRegularizations(context, &regularizations_));
  }

  void Compute(OpKernelContext* context) override {
@@ -709,19 +653,10 @@ class SdcaShrinkL1 : public OpKernel {
};
REGISTER_KERNEL_BUILDER(Name("SdcaShrinkL1").Device(DEVICE_CPU), SdcaShrinkL1);

class ComputeDualityGap : public OpKernel {
class SdcaTrainingStats : public OpKernel {
 public:
  explicit ComputeDualityGap(OpKernelConstruction* context)
  explicit SdcaTrainingStats(OpKernelConstruction* context)
      : OpKernel(context) {
    // TODO(rohananil): Refactor grabbing common attributes across ops related
    // to sdca.
    OP_REQUIRES_OK(context,
                   context->GetAttr("l1", &regularizations_.symmetric_l1));
    OP_REQUIRES_OK(context,
                   context->GetAttr("l2", &regularizations_.symmetric_l2));
    // We enforce a minimal l2, required by the algorithm.
    regularizations_.symmetric_l2 =
        std::max(regularizations_.symmetric_l2, 1.0f);
    OP_REQUIRES_OK(context, context->GetAttr("container", &container_));
    OP_REQUIRES_OK(context, context->GetAttr("solver_uuid", &solver_uuid_));
  }
@@ -734,45 +669,56 @@ class ComputeDualityGap : public OpKernel {
        context, !data_by_example->RefCountIsOne(),
        errors::Internal("Expected shared-ownership of data_by_example."));

    OpMutableInputList sparse_weights_inputs;
    OP_REQUIRES_OK(context, context->mutable_input_list(
                                "sparse_weights", &sparse_weights_inputs));
    WeightsByGroup sparse_weights_by_group =
        MakeWeightsFrom(&sparse_weights_inputs);

    OpMutableInputList dense_weights_inputs;
    OP_REQUIRES_OK(context, context->mutable_input_list("dense_weights",
                                                        &dense_weights_inputs));
    WeightsByGroup dense_weights_by_group =
        MakeWeightsFrom(&dense_weights_inputs);

    double example_weight_sum = 0;
    double total_duality_gap = 0;
    double total_primal_loss = 0;
    double total_dual_loss = 0;
    double total_example_weight = 0;
    OP_REQUIRES_OK(context,
                   data_by_example->Visit([&](const DataByExample::Data& data) {
                     example_weight_sum += data.example_weight;
                     total_duality_gap += data.primal_loss + data.dual_loss;
                     total_primal_loss += data.primal_loss;
                     total_dual_loss += data.dual_loss;
                     total_example_weight += data.example_weight;
                   }));

    const RegularizationLoss regularization_loss = ComputeRegularizationLoss(
        sparse_weights_by_group, dense_weights_by_group, regularizations_);
    total_duality_gap +=
        regularization_loss.l2_loss + regularization_loss.l1_loss;
    // TODO(katsiapis): Think about the most arithmetically stable way of
    // computing (dual + primal) loss (if it matters).

    Tensor* duality_gap_t = nullptr;
    OP_REQUIRES_OK(context,
                   context->allocate_output("duality_gap", {}, &duality_gap_t));
    duality_gap_t->scalar<float>()() = total_duality_gap / example_weight_sum;
    {
      Tensor* tensor = nullptr;
      OP_REQUIRES_OK(context,
                     context->allocate_output("primal_loss", {}, &tensor));
      tensor->scalar<double>()() = total_primal_loss;
    }

    {
      Tensor* tensor = nullptr;
      OP_REQUIRES_OK(context,
                     context->allocate_output("dual_loss", {}, &tensor));
      tensor->scalar<double>()() = total_dual_loss;
    }

    {
      OP_REQUIRES(
          context, total_example_weight > 0,
          errors::FailedPrecondition(
              "No examples found or all examples have zero weight. Either the "
              "optimizer was trained with no instances or perhaps there is a "
              "bug in the training data."));

      Tensor* tensor = nullptr;
      OP_REQUIRES_OK(context,
                     context->allocate_output("example_weights", {}, &tensor));
      tensor->scalar<double>()() = total_example_weight;
    }

    // TODO(katsiapis): Use core::ScopedUnref once it's moved out of internal.
    data_by_example->Unref();
  }

 private:
  Regularizations regularizations_;
  string container_;
  string solver_uuid_;
};
REGISTER_KERNEL_BUILDER(Name("ComputeDualityGap").Device(DEVICE_CPU),
                        ComputeDualityGap);
REGISTER_KERNEL_BUILDER(Name("SdcaTrainingStats").Device(DEVICE_CPU),
                        SdcaTrainingStats);

}  // namespace tensorflow
@@ -24,7 +24,7 @@ REGISTER_OP("SdcaSolver")
    .Attr("num_dense_features: int >= 0")
    .Attr("l1: float >= 0")
    .Attr("l2: float >= 1")
    .Attr("num_inner_iterations: int >= 2")
    .Attr("num_inner_iterations: int >= 1")
    .Attr("container: string")
    .Attr("solver_uuid: string")
    .Input("sparse_features_indices: num_sparse_features * int64")
@@ -69,7 +69,7 @@ example_labels: a vector which contains the label/target associated with each
example_ids: a vector which contains the unique identifier associated with each
  example.
sparse_weights: a list of vectors where each value is the weight associated with
  a feature index.
  a feature group.
dense_weights: a list of vectors where the value is the weight associated with
  a dense feature group.
)doc");
@@ -89,38 +89,28 @@ num_dense_features: Number of dense feature groups to train on.
l1: Symmetric l1 regularization strength.
l2: Symmetric l2 regularization strength.
sparse_weights: a list of vectors where each value is the weight associated with
  a feature index.
  a feature group.
dense_weights: a list of vectors where the value is the weight associated with
  a dense feature group.
)doc");

// TODO(katsiapis): We should expand this scope of this op to compute other
// statistics about the data.
REGISTER_OP("ComputeDualityGap")
    .Attr("num_sparse_features: int >= 0")
    .Attr("num_dense_features: int >= 0")
    .Attr("l1: float >= 0")
    .Attr("l2: float >= 1")
REGISTER_OP("SdcaTrainingStats")
    .Attr("container: string")
    .Attr("solver_uuid: string")
    .Input("sparse_weights: Ref(num_sparse_features * float)")
    .Input("dense_weights: Ref(num_dense_features * float)")
    .Output("duality_gap: float")
    .Output("primal_loss: float64")
    .Output("dual_loss: float64")
    .Output("example_weights: float64")
    .Doc(R"doc(
Computes duality gap over all examples seen by the optimizer.
Computes statistics over all examples seen by the optimizer.

num_sparse_features: Number of sparse feature groups to train on.
num_dense_features: Number of dense feature groups to train on.
l1: Symmetric l1 regularization strength.
l2: Symmetric l2 regularization strength.
container: Name of the Container that stores data across invocations of this
  Kernel. Together with SolverUUID form an isolation unit for this solver.
solver_uuid: Universally Unique Identifier for this solver.
sparse_weights: a list of vectors where each value is the weight associated with
  a feature index.
dense_weights: a list of vectors where the value is the weight associated with
  a dense feature group.
duality_gap: duality gap over all examples seen by the optimizer.
primal_loss: total primal loss of all examples seen by the optimizer.
dual_loss: total dual loss of all examples seen by the optimizer.
example_weights: total example weights of all examples seen by the optimizer
  (guaranteed to be positive; otherwise returns FAILED_PRECONDITION as it
  probably indicates a bug in the training data).
)doc");

}  // namespace tensorflow
@@ -92,6 +92,7 @@ def make_variable_dict(max_age, max_gender):
  return dict(sparse_features_weights=[age_weights, gender_weights],
              dense_features_weights=[])


def make_dense_variable_dict(num_dense_features, num_examples):
  feature_weights = ([
      tf.Variable(tf.zeros([1],
@@ -130,6 +131,7 @@ def tearDown():
  pass


# TODO(katsiapis): Add tests that exercise L1 and Shrinking.
class SdcaOptimizerTest(TensorFlowTestCase):

  def _single_threaded_test_session(self):
@@ -180,6 +182,44 @@ class SdcaOptimizerTest(TensorFlowTestCase):
                          rtol=1e-2,
                          atol=1e-2)

  def testSimpleLogisticNoL2(self):
    # Same as test above (so comments from above apply) but without an L2.
    # The algorithm should behave as if we have an L2 of 1 in optimization but
    # 0 in regularized_loss.
    example_protos = [
        make_example_proto(
            {'age': [0],
             'gender': [0]}, 0),
        make_example_proto(
            {'age': [1],
             'gender': [1]}, 1),
    ]
    example_weights = [1.0, 1.0]
    with self._single_threaded_test_session():
      examples = make_example_dict(example_protos, example_weights)
      variables = make_variable_dict(1, 1)
      options = dict(symmetric_l2_regularization=0,
                     symmetric_l1_regularization=0,
                     loss_type='logistic_loss')

      lr = SdcaModel(CONTAINER, examples, variables, options)
      tf.initialize_all_variables().run()
      unregularized_loss = lr.unregularized_loss(examples)
      loss = lr.regularized_loss(examples)
      predictions = lr.predictions(examples)
      self.assertAllClose(0.693147, unregularized_loss.eval())
      self.assertAllClose(0.693147, loss.eval())
      for _ in xrange(5):
        lr.minimize().run()
      self.assertAllClose(0.411608, unregularized_loss.eval(), rtol=0.11)
      self.assertAllClose(0.371705, loss.eval(), atol=0.01)
      predicted_labels = get_binary_predictions_for_logistic(predictions)
      self.assertAllEqual([0, 1], predicted_labels.eval())
      self.assertAllClose(0.01,
                          lr.approximate_duality_gap().eval(),
                          rtol=1e-2,
                          atol=1e-2)

  def testSomeUnweightedExamples(self):
    # Setup test data with 4 examples, but should produce the same
    # results as testSimple.
@@ -272,10 +312,11 @@ class SdcaOptimizerTest(TensorFlowTestCase):
      lr = SdcaModel(CONTAINER, examples, variables, options)
      tf.initialize_all_variables().run()
      self.assertAllClose([0.5, 0.5], lr.predictions(examples).eval())
      with self.assertRaisesOpError(
          'No weighted examples in 2 training examples'):
        lr.minimize().run()
      lr.minimize().run()
      self.assertAllClose([0.5, 0.5], lr.predictions(examples).eval())
      with self.assertRaisesOpError(
          'No examples found or all examples have zero weight.'):
        lr.approximate_duality_gap().eval()

  def testDuplicateExampleIds(self):
    # Setup test data with 1 positive, and 1 negative example.
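The 0.693147 constants asserted in the new test are simply ln 2: with all weights initialized to zero, every logit is zero, so each example's logistic loss is log(2) regardless of its label. A quick standalone check of that arithmetic:

```python
import math

def logistic_loss(wx, y):
    # Binary logistic loss for logit wx and label y in {0, 1}.
    if y == 1:
        return math.log(1.0 + math.exp(-wx))
    return math.log(1.0 + math.exp(wx))

# At initialization all weights are zero, so wx == 0 for every example.
losses = [logistic_loss(0.0, y) for y in (0, 1)]
print(sum(losses) / len(losses))  # ~0.693147, i.e. ln 2
```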
@@ -28,7 +28,6 @@ from tensorflow.python.framework.ops import name_scope
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import control_flow_ops
from tensorflow.python.ops import math_ops
from tensorflow.python.ops import state_ops
from tensorflow.python.ops import variables as var_ops
from tensorflow.python.ops.nn import sigmoid_cross_entropy_with_logits
from tensorflow.python.platform import resource_loader
@@ -139,30 +138,35 @@ class SdcaModel(object):
        ['loss_type', 'symmetric_l2_regularization',
         'symmetric_l1_regularization'], options)

    for name in ['symmetric_l1_regularization', 'symmetric_l2_regularization']:
      value = options[name]
      if value < 0.0:
        raise ValueError('%s should be non-negative. Found (%f)' %
                         (name, value))

    self._container = container
    self._examples = examples
    self._variables = variables
    self._options = options
    self._solver_uuid = uuid.uuid4().hex
    self._create_slots(variables)
    self._create_slots()

  # TODO(rohananil): Use optimizer interface to make use of slot creation
  # logic
  def _create_slots(self, variables):
    self._slots = {}
    # TODO(rohananil): Rename the slot keys to "unshrinked" weights.
    self._slots['sparse_features_weights'] = []
    self._slots['dense_features_weights'] = []
    self._assign_ops = []
    # Make an internal variable which has the updates before applying L1
  def _symmetric_l2_regularization(self):
    # Algorithmic requirement (for now) is to have minimal l2 of 1.0
    return max(self._options['symmetric_l2_regularization'], 1.0)

  # TODO(rohananil): Use optimizer interface to make use of slot creation logic.
  def _create_slots(self):
    # Make internal variables which have the updates before applying L1
    # regularization.
    for var_type in ['sparse_features_weights', 'dense_features_weights']:
      for var in variables[var_type]:
        if var is not None:
          self._slots[var_type].append(var_ops.Variable(array_ops.zeros_like(
              var.initialized_value(), dtypes.float32)))
          self._assign_ops.append(state_ops.assign(var, self._slots[var_type][
              -1]))
    self._slots = {
        'unshrinked_sparse_features_weights': [],
        'unshrinked_dense_features_weights': [],
    }
    for name in ['sparse_features_weights', 'dense_features_weights']:
      for var in self._variables[name]:
        self._slots['unshrinked_' + name].append(var_ops.Variable(
            array_ops.zeros_like(var.initialized_value(), dtypes.float32)))

  def _assertSpecified(self, items, check_in):
    for x in items:
@@ -177,33 +181,22 @@ class SdcaModel(object):
  def _l1_loss(self):
    """Computes the l1 loss of the model."""
    with name_scope('l1_loss'):
      sparse_weights = self._convert_n_to_tensor(self._variables[
          'sparse_features_weights'])
      dense_weights = self._convert_n_to_tensor(self._variables[
          'dense_features_weights'])
      l1 = self._options['symmetric_l1_regularization']
      loss = 0.0
      for w in sparse_weights:
        loss += l1 * math_ops.reduce_sum(abs(w))
      for w in dense_weights:
        loss += l1 * math_ops.reduce_sum(abs(w))
      return loss
      sum = 0.0
      for name in ['sparse_features_weights', 'dense_features_weights']:
        for weights in self._convert_n_to_tensor(self._variables[name]):
          sum += math_ops.reduce_sum(math_ops.abs(weights))
      # SDCA L1 regularization cost is: l1 * sum(|weights|)
      return self._options['symmetric_l1_regularization'] * sum

  def _l2_loss(self):
  def _l2_loss(self, l2):
    """Computes the l2 loss of the model."""
    with name_scope('l2_loss'):
      sparse_weights = self._convert_n_to_tensor(self._variables[
          'sparse_features_weights'])
      dense_weights = self._convert_n_to_tensor(self._variables[
          'dense_features_weights'])
      l2 = self._options['symmetric_l2_regularization']
      loss = 0.0
      for w in sparse_weights:
        loss += l2 * math_ops.reduce_sum(math_ops.square(w))
      for w in dense_weights:
        loss += l2 * math_ops.reduce_sum(math_ops.square(w))
      # SDCA L2 regularization cost is 1/2 * l2 * sum(weights^2)
      return loss / 2.0
      sum = 0.0
      for name in ['sparse_features_weights', 'dense_features_weights']:
        for weights in self._convert_n_to_tensor(self._variables[name]):
          sum += math_ops.reduce_sum(math_ops.square(weights))
      # SDCA L2 regularization cost is: l2 * sum(weights^2) / 2
      return l2 * sum / 2

  def _convert_n_to_tensor(self, input_list, as_ref=False):
    """Converts input list to a set of tensors."""
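The refactored `_l1_loss` and `_l2_loss` above reduce to simple sums over all weight groups. A minimal plain-Python sketch of the same formulas (operating on lists of floats rather than tensors, purely for illustration):

```python
def l1_loss(weight_groups, l1):
    # SDCA L1 regularization cost: l1 * sum(|w|) over all groups.
    return l1 * sum(abs(w) for weights in weight_groups for w in weights)


def l2_loss(weight_groups, l2):
    # SDCA L2 regularization cost: l2 * sum(w^2) / 2 over all groups.
    return l2 * sum(w * w for weights in weight_groups for w in weights) / 2.0


groups = [[1.0, -2.0], [3.0]]
print(l1_loss(groups, 0.5))  # 0.5 * (1 + 2 + 3) = 3.0
print(l2_loss(groups, 2.0))  # 2.0 * (1 + 4 + 9) / 2 = 14.0
```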
@@ -265,31 +258,44 @@ class SdcaModel(object):
    """
    with name_scope('sdca/minimize'):
      sparse_features_indices = []
      sparse_features_weights = []
      sparse_features_values = []
      for sf in self._examples['sparse_features']:
        sparse_features_indices.append(convert_to_tensor(sf.indices))
        sparse_features_weights.append(convert_to_tensor(sf.values))
        sparse_features_values.append(convert_to_tensor(sf.values))

      step_op = _sdca_ops.sdca_solver(
          sparse_features_indices,
          sparse_features_weights,
          sparse_features_values,
          self._convert_n_to_tensor(self._examples['dense_features']),
          convert_to_tensor(self._examples['example_weights']),
          convert_to_tensor(self._examples['example_labels']),
          convert_to_tensor(self._examples['example_ids']),
          self._convert_n_to_tensor(self._slots['sparse_features_weights'],
                                    as_ref=True),
          self._convert_n_to_tensor(self._slots['dense_features_weights'],
                                    as_ref=True),
          self._convert_n_to_tensor(
              self._slots['unshrinked_sparse_features_weights'],
              as_ref=True),
          self._convert_n_to_tensor(
              self._slots['unshrinked_dense_features_weights'],
              as_ref=True),
          l1=self._options['symmetric_l1_regularization'],
          l2=self._options['symmetric_l2_regularization'],
          l2=self._symmetric_l2_regularization(),
          # TODO(rohananil): Provide empirical evidence for this. It is better
          # to run more than one iteration on single mini-batch as we want to
          # spend more time in compute. SDCA works better with larger
          # mini-batches and there is also recent work that shows its better to
          # reuse old samples than train on new samples.
          # See: http://arxiv.org/abs/1602.02136.
          num_inner_iterations=2,
          loss_type=self._options['loss_type'],
          container=self._container,
          solver_uuid=self._solver_uuid)
      with ops.control_dependencies([step_op]):
        assign_ops = control_flow_ops.group(*self._assign_ops)
        with ops.control_dependencies([assign_ops]):
        assign_ops = []
        for name in ['sparse_features_weights', 'dense_features_weights']:
          for var, slot_var in zip(self._variables[name],
                                   self._slots['unshrinked_' + name]):
            assign_ops.append(var.assign(slot_var))
        assign_group = control_flow_ops.group(*assign_ops)
        with ops.control_dependencies([assign_group]):
          return _sdca_ops.sdca_shrink_l1(
              self._convert_n_to_tensor(
                  self._variables['sparse_features_weights'],
@@ -298,7 +304,7 @@ class SdcaModel(object):
              self._convert_n_to_tensor(
                  self._variables['dense_features_weights'],
                  as_ref=True),
              l1=self._options['symmetric_l1_regularization'],
              l2=self._options['symmetric_l2_regularization'])
              l2=self._symmetric_l2_regularization())

  def approximate_duality_gap(self):
    """Add operations to compute the approximate duality gap.
@@ -307,15 +313,14 @@ class SdcaModel(object):
      An Operation that computes the approximate duality gap over all
      examples.
    """
    return _sdca_ops.compute_duality_gap(
        self._convert_n_to_tensor(self._slots['sparse_features_weights'],
                                  as_ref=True),
        self._convert_n_to_tensor(self._slots['dense_features_weights'],
                                  as_ref=True),
        l1=self._options['symmetric_l1_regularization'],
        l2=self._options['symmetric_l2_regularization'],
    (primal_loss, dual_loss, example_weights) = _sdca_ops.sdca_training_stats(
        container=self._container,
        solver_uuid=self._solver_uuid)
    # Note that example_weights is guaranteed to be positive by
    # sdca_training_stats so dividing by it is safe.
    return (primal_loss + dual_loss + math_ops.to_double(self._l1_loss()) +
            (2.0 * math_ops.to_double(self._l2_loss(
                self._symmetric_l2_regularization())))) / example_weights

  def unregularized_loss(self, examples):
    """Add operations to compute the loss (without the regularization loss).
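After this refactor the gap is assembled in Python from the raw statistics the op returns. A plain-number sketch of the same arithmetic, with scalar floats standing in for the tensors (names here are illustrative only):

```python
def approximate_duality_gap(primal_loss, dual_loss, l1_loss, l2_loss,
                            example_weights):
    # Mirrors the expression built from sdca_training_stats outputs:
    # gap = (primal + dual + l1 + 2 * l2) / total example weight.
    # The factor of 2 compensates for _l2_loss already halving l2 * sum(w^2).
    return (primal_loss + dual_loss + l1_loss + 2.0 * l2_loss) / example_weights


print(approximate_duality_gap(3.0, 1.0, 0.5, 0.25, 2.0))  # 2.5
```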
@@ -384,6 +389,11 @@ class SdcaModel(object):
    self._assertList(['sparse_features', 'dense_features'], examples)
    with name_scope('sdca/regularized_loss'):
      weights = convert_to_tensor(examples['example_weights'])
      return ((
          (self._l1_loss() + self._l2_loss()) / math_ops.reduce_sum(weights)) +
      return (((
          self._l1_loss() +
          # Note that here we are using the raw regularization
          # (as specified by the user) and *not*
          # self._symmetric_l2_regularization().
          self._l2_loss(self._options['symmetric_l2_regularization'])) /
              math_ops.reduce_sum(weights)) +
              self.unregularized_loss(examples))
@@ -127,7 +127,7 @@ replicated model. Possible approaches include:

* As above, but where the gradients from all workers are averaged. See the
  [CIFAR-10 multi-GPU trainer](https://www.tensorflow.org/code/tensorflow/models/image/cifar10/cifar10_multi_gpu_train.py)
  for an example of this form of replication. The implements *synchronous* training
  for an example of this form of replication. This implements *synchronous* training

* The "distributed trainer" approach uses multiple graphs—one per
  worker—where each graph contains one set of parameters (pinned to
@@ -1089,6 +1089,7 @@ filegroup(
        "avgpooling_op.cc",
        "batch_norm_op.cc",
        "bcast_ops.cc",
        "check_numerics_op.cc",
        "control_flow_ops.cc",
        "conv_2d.h",
        "conv_ops.cc",
@@ -26,26 +26,15 @@ namespace tensorflow {
// Check that 0 <= index < limit using a single comparison, assuming
// that 0 <= limit if Index is signed. Intended for use in performance
// critical contexts where 0 <= index < limit is almost always true.
template <class Index>
EIGEN_ALWAYS_INLINE bool FastBoundsCheck(Index index, Index limit) {
  typedef typename std::make_unsigned<Index>::type UIndex;
template <typename Ta, typename Tb>
EIGEN_ALWAYS_INLINE bool FastBoundsCheck(const Ta index, const Tb limit) {
  static_assert(std::is_integral<Ta>::value && std::is_integral<Tb>::value,
                "FastBoundsCheck can only be used on integer types.");
  typedef typename std::make_unsigned<decltype(index + limit)>::type UIndex;
  return TF_PREDICT_TRUE(static_cast<UIndex>(index) <
                         static_cast<UIndex>(limit));
}

// Upcasting specializations when the index and bounds do not match;
// always move to the larger type.

EIGEN_ALWAYS_INLINE bool FastBoundsCheck(int64 index, int32 limit) {
  return TF_PREDICT_TRUE(static_cast<uint64>(index) <
                         static_cast<uint64>(limit));
}

EIGEN_ALWAYS_INLINE bool FastBoundsCheck(int32 index, int64 limit) {
  return TF_PREDICT_TRUE(static_cast<uint64>(index) <
                         static_cast<uint64>(limit));
}

namespace internal {
// Ensure that the compiler cannot elide a copy into a local, for
// bounds checking on source tensors that might be updated asynchronously.
@@ -1398,7 +1398,7 @@ class Conv2DSlowBackpropFilterOp : public OpKernel {
     // [filter_rows, filter_cols, in_depth, out_depth];
     // And we need to reverse the filter backprops
     // So we need to allocated (sigh) yet another piece of memory to hold the
-    // ouptut.
+    // output.
     TensorShape filter_shuffle_shape(
         {out_depth, filter_rows, filter_cols, in_depth});
     Tensor filter_shuffle;
@@ -246,7 +246,7 @@ __global__ void SwapDimension1And2InTensor3UsingTiles(const T* input,
   }
 }
 
-// A Cuda custom kernel that converst input to output, given proper padding on
+// A Cuda custom kernel that convert input to output, given proper padding on
 // the left and the top. The padded value is zero.
 template <typename T>
 __global__ void PadInputCustomKernelNHWC(int nthreads, const T* input,
@@ -45,6 +45,28 @@ class DiagonalGenerator {
  private:
   Tensor diagonal_;
 };
 
+template <typename T, size_t NumDims>
+class DiagonalExtractor {
+ public:
+  explicit DiagonalExtractor(const Tensor& tensor) : tensor_(tensor) {
+    CHECK_EQ(tensor.dims(), 2 * NumDims);
+  }
+  T operator()(const Eigen::array<Eigen::Index, NumDims>& coordinates) const {
+    Eigen::array<Eigen::Index, 2 * NumDims> index;
+    for (size_t j = 0; j < NumDims; ++j){
+      index[j] = coordinates[j];
+    }
+    for (size_t j = NumDims; j < 2 * NumDims; ++j){
+      index[j] = index[j - NumDims];
+    }
+    return tensor_.tensor<T, 2 * NumDims>()(index);
+  }
+
+ private:
+  Tensor tensor_;
+};
+
 }  // namespace
 
 // Generate the diagonal tensor with the diagonal set to the input tensor.
@@ -58,12 +80,9 @@ class DiagOp : public OpKernel {
   void Compute(OpKernelContext* context) override {
     const Tensor& diagonal = context->input(0);
     const int num_dims = diagonal.dims();
-    OP_REQUIRES(context, 1 <= num_dims,
-                errors::InvalidArgument(
-                    "The rank of the diagonal should be between 1 and 3."));
-    OP_REQUIRES(context, 3 >= num_dims,
-                errors::InvalidArgument(
-                    "The rank of the diagonal should be between 1 and 3."));
+    OP_REQUIRES(context, 1 <= num_dims && num_dims <= 3,
+                errors::InvalidArgument("Expected 1 <= dims <= 3, got shape ",
+                                        diagonal.shape().DebugString()));
     TensorShape out_shape;
     for (int i = 0; i < num_dims; ++i) {
       out_shape.AddDim(diagonal.dim_size(i));
@@ -105,4 +124,71 @@ REGISTER_DIAGOP(int32);
 REGISTER_DIAGOP(int64);
 
 #undef REGISTER_DIAGOP
+
+// Generate the diagonal tensor with the diagonal set to the input tensor.
+// It only allows rank 2, 4, or 6 input tensor, so the output tensor is
+// rank 1, 2, or 3.
+template <typename T>
+class DiagPartOp : public OpKernel {
+ public:
+  explicit DiagPartOp(OpKernelConstruction* context) : OpKernel(context) {}
+
+  void Compute(OpKernelContext* context) override {
+    const Tensor& tensor = context->input(0);
+    const int num_dims = tensor.dims();
+    const int out_dims = num_dims / 2;
+    OP_REQUIRES(context, 2 == num_dims || 4 == num_dims || 6 == num_dims,
+                errors::InvalidArgument("The rank of the tensor should be 2, \
+                                         4, or 6, got shape ",
+                                        tensor.shape().DebugString()));
+    for (int i = 0; i < out_dims; i++){
+      OP_REQUIRES(context, tensor.dim_size(i) == tensor.dim_size(i + out_dims),
+                  errors::InvalidArgument(
+                    "Invalid shape ", tensor.shape().DebugString(),
+                    ": dimensions ", i, " and ", i + out_dims, " do not match.")
+                  );
+    }
+
+    TensorShape out_shape;
+    for (int i = 0; i < out_dims; ++i) {
+      out_shape.AddDim(tensor.dim_size(i));
+    }
+
+    Tensor* output = nullptr;
+    OP_REQUIRES_OK(context,
+                   context->allocate_output(0, out_shape, &output));
+
+    switch (num_dims) {
+      case 2:
+        output->tensor<T, 1>() = output->tensor<T, 1>().generate(
+          DiagonalExtractor<T, 1>(tensor));
+        break;
+      case 4:
+        output->tensor<T, 2>() = output->tensor<T, 2>().generate(
+          DiagonalExtractor<T, 2>(tensor));
+        break;
+      case 6:
+        output->tensor<T, 3>() = output->tensor<T, 3>().generate(
+          DiagonalExtractor<T, 3>(tensor));
+        break;
+      default:
+        context->SetStatus(errors::Unimplemented(
+          "Diagonal of rank ", num_dims, " tensor is not supported yet."));
+        return;
+    }
+  }
+};
+
+#define REGISTER_DIAGPARTOP(T)                                                 \
+  REGISTER_KERNEL_BUILDER(                                                     \
+      Name("DiagPart").Device(DEVICE_CPU).TypeConstraint<T>("T"), DiagPartOp<T>)
+
+REGISTER_DIAGPARTOP(double);
+REGISTER_DIAGPARTOP(float);
+REGISTER_DIAGPARTOP(int32);
+REGISTER_DIAGPARTOP(int64);
+
+#undef REGISTER_DIAGPARTOP
+
 }  // namespace tensorflow
@@ -94,7 +94,7 @@ class MatrixSolveLsOp
     }
     if (fast_) {
       // The fast branch assumes that matrix is not rank deficient and
-      // not too ill-conditioned. Specifically, the reciprobal condition number
+      // not too ill-conditioned. Specifically, the reciprocal condition number
       // should be greater than the square root of the machine precision, i.e.
       // 1 / cond(matrix) > sqrt(std::numeric_limits<Scalar>::epsilon()).
       // This branch solves over- or underdetermined least-squares problems
@@ -84,6 +84,7 @@ struct ReduceFunctor<GPUDevice, Eigen::internal::MeanReducer<T> > {
   DEFINE_FOR_TYPE_AND_R(T, Eigen::internal::ProdReducer<T>)
 
 DEFINE_FOR_ALL_REDUCERS(float);
+DEFINE_FOR_ALL_REDUCERS(double);
 #undef DEFINE_FOR_ALL_REDUCERS
 
 DEFINE_FOR_TYPE_AND_R(complex64, Eigen::internal::SumReducer<complex64>);
@@ -34,6 +34,7 @@ TF_CALL_REAL_NUMBER_TYPES(REGISTER_CPU_KERNELS);
           .HostMemory("reduction_indices"),                            \
       ReductionOp<GPUDevice, type, Eigen::internal::MaxReducer<type>>);
 REGISTER_GPU_KERNELS(float);
+REGISTER_GPU_KERNELS(double);
 #undef REGISTER_GPU_KERNELS
 
 #endif
@@ -34,6 +34,7 @@ TF_CALL_REAL_NUMBER_TYPES(REGISTER_CPU_KERNELS);
           .HostMemory("reduction_indices"),                            \
       ReductionOp<GPUDevice, type, Eigen::internal::MinReducer<type>>);
 REGISTER_GPU_KERNELS(float);
+REGISTER_GPU_KERNELS(double);
 #undef REGISTER_GPU_KERNELS
 
 #endif
@@ -34,6 +34,7 @@ TF_CALL_REAL_NUMBER_TYPES(REGISTER_CPU_KERNELS);
           .HostMemory("reduction_indices"),                             \
       ReductionOp<GPUDevice, type, Eigen::internal::ProdReducer<type>>);
 REGISTER_GPU_KERNELS(float);
+REGISTER_GPU_KERNELS(double);
 #undef REGISTER_GPU_KERNELS
 
 #endif
@@ -41,6 +41,7 @@ REGISTER_KERNEL_BUILDER(
           .HostMemory("reduction_indices"),                            \
       ReductionOp<GPUDevice, type, Eigen::internal::SumReducer<type>>);
 REGISTER_GPU_KERNELS(float);
+REGISTER_GPU_KERNELS(double);
 #undef REGISTER_GPU_KERNELS
 
 REGISTER_KERNEL_BUILDER(
@@ -26,6 +26,10 @@ limitations under the License.
 #include "tensorflow/core/lib/core/status.h"
 #include "tensorflow/core/platform/logging.h"
 
+#if GOOGLE_CUDA
+#include "tensorflow/core/kernels/resize_nearest_neighbor_op_gpu.h"
+#endif  // GOOGLE_CUDA
+
 namespace tensorflow {
 
 typedef Eigen::ThreadPoolDevice CPUDevice;
@@ -58,10 +62,10 @@ class ResizeNearestNeighborOp : public OpKernel {
     // Initialize shape to the batch size of the input, then add
     // the rest of the dimensions
     Tensor* output = nullptr;
-    OP_REQUIRES_OK(context, context->allocate_output(
-                                0, TensorShape({input.dim_size(0), sizes(0),
-                                                sizes(1), input.dim_size(3)}),
-                                &output));
+    OP_REQUIRES_OK(
+        context, context->allocate_output(0, TensorShape({input.dim_size(0), sizes(0),
+                                                          sizes(1), input.dim_size(3)}),
+                                          &output));
 
     const int64 batch_size = input.dim_size(0);
     const int64 in_height = input.dim_size(1);
@@ -132,10 +136,10 @@ class ResizeNearestNeighborOpGrad : public OpKernel {
     // Initialize shape to the batch size of the input, then add
     // the rest of the dimensions
     Tensor* output = nullptr;
-    OP_REQUIRES_OK(context, context->allocate_output(
-                                0, TensorShape({input.dim_size(0), sizes(0),
-                                                sizes(1), input.dim_size(3)}),
-                                &output));
+    OP_REQUIRES_OK(
+        context, context->allocate_output(0, TensorShape({input.dim_size(0), sizes(0),
+                                                          sizes(1), input.dim_size(3)}),
+                                          &output));
 
     const int64 batch_size = input.dim_size(0);
     const int64 in_height = input.dim_size(1);
@@ -204,4 +208,83 @@ TF_CALL_REAL_NUMBER_TYPES(REGISTER_KERNEL);
 
 #undef REGISTER_KERNEL
 
+#if GOOGLE_CUDA
+
+template <typename T>
+class ResizeNearestNeighborGPUOp : public OpKernel {
+ public:
+  explicit ResizeNearestNeighborGPUOp(OpKernelConstruction* context)
+      : OpKernel(context) {
+    OP_REQUIRES_OK(context, context->GetAttr("align_corners", &align_corners_));
+  }
+
+  void Compute(OpKernelContext* context) override {
+    const Tensor& input = context->input(0);
+    OP_REQUIRES(context, input.dims() == 4,
+                errors::InvalidArgument("input must be 4-dimensional",
+                                        input.shape().DebugString()));
+    const Tensor& shape_t = context->input(1);
+    OP_REQUIRES(context, shape_t.dims() == 1,
+                errors::InvalidArgument("shape_t must be 1-dimensional",
+                                        shape_t.shape().DebugString()));
+    OP_REQUIRES(context, shape_t.NumElements() == 2,
+                errors::InvalidArgument("shape_t must have two elements",
+                                        shape_t.shape().DebugString()));
+
+    auto sizes = shape_t.vec<int32>();
+    OP_REQUIRES(context, sizes(0) > 0 && sizes(1) > 0,
+                errors::InvalidArgument("shape_t's elements must be positive"));
+
+    // Initialize shape to the batch size of the input, then add
+    // the rest of the dimensions
+    Tensor* output = nullptr;
+    OP_REQUIRES_OK(
+        context, context->allocate_output(0, TensorShape({input.dim_size(0), sizes(0),
+                                                          sizes(1), input.dim_size(3)}),
+                                          &output));
+
+    const int64 batch_size = input.dim_size(0);
+    const int64 in_height = input.dim_size(1);
+    const int64 in_width = input.dim_size(2);
+    const int64 channels = input.dim_size(3);
+    const int64 out_height = output->dim_size(1);
+    const int64 out_width = output->dim_size(2);
+
+    const float height_scale =
+        (align_corners_ && out_height > 1)
+            ? (in_height - 1) / static_cast<float>(out_height - 1)
+            : in_height / static_cast<float>(out_height);
+    const float width_scale =
+        (align_corners_ && out_width > 1)
+            ? (in_width - 1) / static_cast<float>(out_width - 1)
+            : in_width / static_cast<float>(out_width);
+
+    bool status = ResizeNearestNeighbor<T>(
+        input.flat<T>().data(), batch_size, in_height,
+        in_width, channels, out_height, out_width,
+        height_scale, width_scale, output->flat<T>().data(),
+        context->eigen_gpu_device());
+
+    if (!status) {
+      context->SetStatus(
+          errors::Internal("Failed launching ResizeNearestNeighbor"));
+    }
+  }
+
+ private:
+  bool align_corners_;
+};
+
+#define REGISTER_KERNEL(T)                              \
+  REGISTER_KERNEL_BUILDER(Name("ResizeNearestNeighbor") \
+                              .Device(DEVICE_GPU)       \
+                              .TypeConstraint<T>("T")   \
+                              .HostMemory("size"),      \
+                          ResizeNearestNeighborGPUOp<T>);
+
+TF_CALL_GPU_NUMBER_TYPES(REGISTER_KERNEL);
+
+#undef REGISTER_KERNEL
+
+#endif  // GOOGLE_CUDA
+
 }  // namespace tensorflow
@@ -0,0 +1,52 @@
+/* Copyright 2015 Google Inc. All Rights Reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+==============================================================================*/
+
+#include "tensorflow/core/common_runtime/kernel_benchmark_testlib.h"
+#include "tensorflow/core/framework/tensor.h"
+#include "tensorflow/core/graph/node_builder.h"
+#include "tensorflow/core/platform/test.h"
+#include "tensorflow/core/platform/test_benchmark.h"
+
+namespace tensorflow {
+
+static Graph* BM_ResizeNearestNeighbor(int batches, int width, int height) {
+  Graph* g = new Graph(OpRegistry::Global());
+  Tensor in(DT_FLOAT, TensorShape({batches, width, height, 3}));
+  in.flat<float>().setRandom();
+
+  Tensor out_size(DT_INT32, TensorShape({2}));
+  auto out_size_flat = out_size.flat<int32>();
+  out_size_flat(0) = width * 2;
+  out_size_flat(1) = height * 2;
+
+  Node* ret;
+  NodeBuilder(g->NewName("n"), "ResizeNearestNeighbor")
+      .Input(test::graph::Constant(g, in))
+      .Input(test::graph::Constant(g, out_size))
+      .Finalize(g, &ret);
+  return g;
+}
+
+#define BM_ResizeNearestNeighborDev(DEVICE, B, W, H)                           \
+  static void BM_ResizeNearestNeighbor_##DEVICE##_##B##_##W##_##H(int iters) { \
+    testing::ItemsProcessed(iters* B* W* H * 3);                               \
+    test::Benchmark(#DEVICE, BM_ResizeNearestNeighbor(B, W, H)).Run(iters);    \
+  }                                                                            \
+  BENCHMARK(BM_ResizeNearestNeighbor_##DEVICE##_##B##_##W##_##H)
+
+BM_ResizeNearestNeighborDev(cpu, 1, 499, 499);
+BM_ResizeNearestNeighborDev(gpu, 1, 499, 499);
+
+}  // namespace tensorflow
tensorflow/core/kernels/resize_nearest_neighbor_op_gpu.cu.cc (new file, 86 lines)
@@ -0,0 +1,86 @@
+/* Copyright 2015 Google Inc. All Rights Reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+==============================================================================*/
+
+#if GOOGLE_CUDA
+
+#define EIGEN_USE_GPU
+
+#include <stdio.h>
+
+#include "tensorflow/core/kernels/resize_nearest_neighbor_op_gpu.h"
+
+#include "tensorflow/core/framework/register_types.h"
+#include "tensorflow/core/framework/tensor_types.h"
+#include "tensorflow/core/util/cuda_kernel_helper.h"
+
+namespace tensorflow {
+namespace {
+
+template <typename T>
+__global__ void ResizeNearestNeighborNHWC(const int nthreads, const T* bottom_data,
+                                          const int in_height, const int in_width,
+                                          const int channels, const int out_height,
+                                          const int out_width, const float height_scale,
+                                          const float width_scale, T* top_data) {
+  CUDA_1D_KERNEL_LOOP(index, nthreads) {
+    int n = index;
+    int c = n % channels;
+    n /= channels;
+    int out_x = n % out_width;
+    n /= out_width;
+    int out_y = n % out_height;
+    n /= out_height;
+
+    const T* bottom_data_n = bottom_data + n * channels * in_height * in_width;
+    const int in_x = min(static_cast<int>(floorf(out_x * width_scale)), in_width - 1);
+    const int in_y = min(static_cast<int>(floorf(out_y * height_scale)), in_height - 1);
+    const int idx = (in_y * in_width + in_x) * channels + c;
+    top_data[index] = ldg(bottom_data_n + idx);
+  }
+}
+
+}  // namespace
+
+template <typename T>
+bool ResizeNearestNeighbor(const T* bottom_data, const int batch,
+                           const int in_height, const int in_width,
+                           const int channels, const int out_height,
+                           const int out_width, const float height_scale,
+                           const float width_scale, T* top_data,
+                           const Eigen::GpuDevice& d) {
+  const int output_size = batch * channels * out_height * out_width;
+  CudaLaunchConfig config = GetCudaLaunchConfig(output_size, d);
+
+  ResizeNearestNeighborNHWC<T>
+      <<<config.block_count, config.thread_per_block, 0, d.stream()>>>(
+          output_size, bottom_data, in_height, in_width, channels, out_height,
+          out_width, height_scale, width_scale, top_data);
+  return d.ok();
+}
+
+#define DECLARE_GPU_SPEC(T)                                                          \
+  template bool ResizeNearestNeighbor(const T* bottom_data, const int batch,         \
+                                      const int in_height, const int in_width,       \
+                                      const int channels, const int out_height,      \
+                                      const int out_width, const float height_scale, \
+                                      const float width_scale, T* top_data,          \
+                                      const Eigen::GpuDevice& d);
+
+TF_CALL_GPU_NUMBER_TYPES(DECLARE_GPU_SPEC);
+
+#undef DECLARE_GPU_SPEC
+}  // end namespace tensorflow
+
+#endif  // GOOGLE_CUDA
tensorflow/core/kernels/resize_nearest_neighbor_op_gpu.h (new file, 37 lines)
@@ -0,0 +1,37 @@
+/* Copyright 2015 Google Inc. All Rights Reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+==============================================================================*/
+
+#if !GOOGLE_CUDA
+#error This file must only be included when building with Cuda support
+#endif
+
+#ifndef TENSORFLOW_CORE_KERNELS_RESIZE_NEAREST_NEIGHBOR_OP_GPU_H_
+#define TENSORFLOW_CORE_KERNELS_RESIZE_NEAREST_NEIGHBOR_OP_GPU_H_
+
+#include "third_party/eigen3/unsupported/Eigen/CXX11/NeuralNetworks"
+#include "tensorflow/core/framework/tensor_types.h"
+#include "tensorflow/core/platform/types.h"
+
+namespace tensorflow {
+
+template <typename T>
+bool ResizeNearestNeighbor(const T* bottom_data, const int batch, const int in_height,
+                           const int in_width, const int channels, const int out_height,
+                           const int out_width, const float height_scale, const float width_scale,
+                           T* top_data, const Eigen::GpuDevice& d);
+
+}  // namespace tensorflow
+
+#endif  // TENSORFLOW_CORE_KERNELS_RESIZE_NEAREST_NEIGHBOR_OP_GPU_H_
@@ -524,7 +524,7 @@ class SparseMatMulOp : public OpKernel {
 
  private:
   // Perform matrix multiplication of "left" and "right", and store the result
-  // in *"ouptut".
+  // in *"output".
   static inline void SparseMatMul(
       const ConstMatrixMap& left, const ConstMatrixMap& right,
       bool transpose_left, const DeviceBase::CpuWorkerThreads* thread_pool,
@@ -858,7 +858,7 @@ inline void SparseMatMulOp::SparseMatMul(
   const int right_dim0 = right.dimension(0);
   const int right_dim1 = right.dimension(1);
   // Allocate buffer for storing slices of right matrix.
-  // Note buffer needs enough space to hold atmost a KR * NR matrix since that
+  // Note buffer needs enough space to hold at most a KR * NR matrix since that
   // is the block size per iteration.
   const int buffer_num_rows =
       std::min(KR, right_dim0) * (std::min(NR, right_dim1) + N - 1) / N;
@@ -577,7 +577,7 @@ class TensorArrayConcatOp : public OpKernel {
     ConstMatrixVector input_tensors_flat;
     input_tensors_flat.reserve(values.size());
 
-    for (int i = 0; i < values.size(); ++i) {
+    for (size_t i = 0; i < values.size(); ++i) {
       const Tensor* value_t = value_tensors[i];
       if (value_t->NumElements() > 0) {
         input_tensors_flat.emplace_back(new ConstMatrix(
@@ -47,7 +47,7 @@ void ComputeStride(const TensorShape& shape, Index* strides) {
   }
 }
 
-// Device-specific naive implementation for tranpose.
+// Device-specific naive implementation for transpose.
 template <typename Device, typename T>
 void TransposeSimple(const Device& d, const Tensor& in,
                      const gtl::ArraySlice<int32> perm, Tensor* out);
@@ -172,6 +172,38 @@ tf.diag(diagonal) ==> [[1, 0, 0, 0]
 diagonal: Rank k tensor where k is at most 3.
 )doc");
 
+// --------------------------------------------------------------------------
+REGISTER_OP("DiagPart")
+    .Input("input: T")
+    .Output("diagonal: T")
+    .Attr("T: {float, double, int32, int64}")
+    .Doc(R"doc(
+Returns the diagonal part of the tensor.
+
+This operation returns a tensor with the `diagonal` part
+of the `input`. The `diagonal` part is computed as follows:
+
+Assume `input` has dimensions `[D1,..., Dk, D1,..., Dk]`, then the output is a
+tensor of rank `k` with dimensions `[D1,..., Dk]` where:
+
+`diagonal[i1,..., ik] = input[i1, ..., ik, i1,..., ik]`.
+
+For example:
+
+```prettyprint
+# 'input' is [[1, 0, 0, 0]
+              [0, 2, 0, 0]
+              [0, 0, 3, 0]
+              [0, 0, 0, 4]]
+
+tf.diag_part(input) ==> [1, 2, 3, 4]
+```
+
+input: Rank k tensor where k is 2, 4, or 6.
+diagonal: The extracted diagonal.
+
+)doc");
+
 // --------------------------------------------------------------------------
 REGISTER_OP("Reverse")
     .Input("tensor: T")
@@ -3482,6 +3482,29 @@ op {
     }
   }
 }
+op {
+  name: "DiagPart"
+  input_arg {
+    name: "input"
+    type_attr: "T"
+  }
+  output_arg {
+    name: "diagonal"
+    type_attr: "T"
+  }
+  attr {
+    name: "T"
+    type: "type"
+    allowed_values {
+      list {
+        type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+      }
+    }
+  }
+}
 op {
   name: "Digamma"
   input_arg {
@@ -2858,6 +2858,33 @@ op {
   summary: "Returns a diagonal tensor with a given diagonal values."
   description: "Given a `diagonal`, this operation returns a tensor with the `diagonal` and\neverything else padded with zeros. The diagonal is computed as follows:\n\nAssume `diagonal` has dimensions [D1,..., Dk], then the output is a tensor of\nrank 2k with dimensions [D1,..., Dk, D1,..., Dk] where:\n\n`output[i1,..., ik, i1,..., ik] = diagonal[i1, ..., ik]` and 0 everywhere else.\n\nFor example:\n\n```prettyprint\n# \'diagonal\' is [1, 2, 3, 4]\ntf.diag(diagonal) ==> [[1, 0, 0, 0]\n                       [0, 2, 0, 0]\n                       [0, 0, 3, 0]\n                       [0, 0, 0, 4]]\n```"
 }
+op {
+  name: "DiagPart"
+  input_arg {
+    name: "input"
+    description: "Rank k tensor where k is 2, 4, or 6."
+    type_attr: "T"
+  }
+  output_arg {
+    name: "diagonal"
+    description: "The extracted diagonal."
+    type_attr: "T"
+  }
+  attr {
+    name: "T"
+    type: "type"
+    allowed_values {
+      list {
+        type: DT_FLOAT
+        type: DT_DOUBLE
+        type: DT_INT32
+        type: DT_INT64
+      }
+    }
+  }
+  summary: "Returns the diagonal part of the tensor."
+  description: "This operation returns a tensor with the `diagonal` part\nof the `input`. The `diagonal` part is computed as follows:\n\nAssume `input` has dimensions `[D1,..., Dk, D1,..., Dk]`, then the output is a\ntensor of rank `k` with dimensions `[D1,..., Dk]` where:\n\n`diagonal[i1,..., ik] = input[i1, ..., ik, i1,..., ik]`.\n\nFor example:\n\n```prettyprint\n# \'input\' is [[1, 0, 0, 0]\n              [0, 2, 0, 0]\n              [0, 0, 3, 0]\n              [0, 0, 0, 4]]\n\ntf.diag_part(input) ==> [1, 2, 3, 4]\n```"
+}
 op {
   name: "Digamma"
   input_arg {
@@ -20,7 +20,7 @@ limitations under the License.
 
 #define TF_MAJOR_VERSION 0
 #define TF_MINOR_VERSION 7
-#define TF_PATCH_VERSION 0
+#define TF_PATCH_VERSION 1
 
 // TF_VERSION_SUFFIX is non-empty for pre-releases (e.g. "-alpha", "-alpha.1",
 // "-beta", "-rc", "-rc.1")
@@ -63,34 +63,50 @@ message CommitId {
 };
 
 message CPUInfo {
+  int64 num_cores = 1;
+
+  int64 num_cores_allowed = 2;
+
   // How fast are these cpus?
-  double mhz_per_cpu = 1;
+  double mhz_per_cpu = 3;
 
   // Additional cpu information. For example,
   // Intel Ivybridge with HyperThreading (24 cores) dL1:32KB dL2:256KB dL3:30MB
-  string cpu_info = 2;
+  string cpu_info = 4;
 
   // What kind of cpu scaling is enabled on the host.
   // Examples include "performance", "ondemand", "conservative", "mixed".
-  string cpu_governor = 3;
+  string cpu_governor = 5;
 
   // Cache sizes (in bytes), e.g. "L2": 262144 (for 256KB)
-  map<string, int64> cache_size = 4;
+  map<string, int64> cache_size = 6;
 };
 
+message MemoryInfo {
+  int64 total = 1;      // Total virtual memory in bytes
+  int64 available = 2;  // Immediately available memory in bytes
+}
+
 message GPUInfo {
   string model = 1;   // e.g. "Tesla K40c"
   string uuid = 2;    // Final entry in output of "nvidia-smi -L"
   string bus_id = 3;  // e.g. "0000:04:00.0"
 };
 
 message PlatformInfo {
   string bits = 1;     // e.g. '64bit'
   string linkage = 2;  // e.g. 'ELF'
   string machine = 3;  // e.g. 'i386'
-  string processor = 4;  // e.g. 'amdk6' (the real processor name)
-  string release = 5;  // e.g. '3.13.0-76-generic'
-  string system = 6;   // e.g. 'Linux'
-  string version = 7;  // e.g. '#120-Ubuntu SMP Mon Jan 18 15:59:10 UTC 2016'
+  string release = 4;  // e.g. '3.13.0-76-generic'
+  string system = 5;   // e.g. 'Linux'
+  string version = 6;  // e.g. '#120-Ubuntu SMP Mon Jan 18 15:59:10 UTC 2016'
 };
 
 message AvailableDeviceInfo {  // Matches DeviceAttributes
   string name = 1;     // Device name.
   string type = 2;     // Device type, e.g. 'CPU' or 'GPU'.
   int64 memory_limit = 3;  // Memory capacity in bytes.
   string physical_description = 4;  // The physical description of this device.
 };
 
 message MachineConfiguration {
@@ -105,6 +121,11 @@ message MachineConfiguration {
 
   // Other devices that are attached and relevant (e.g. GPUInfo).
   repeated google.protobuf.Any device_info = 4;
+
+  // Devices accessible to the test (e.g. as given by list_local_devices).
+  repeated AvailableDeviceInfo available_device_info = 5;
+
+  MemoryInfo memory_info = 6;
 };
 
 // Run-specific items such as arguments to the test / benchmark.
@@ -68,6 +68,7 @@ def convert_to(images, labels, name):
         'label': _int64_feature(int(labels[index])),
         'image_raw': _bytes_feature(image_raw)}))
     writer.write(example.SerializeToString())
+  writer.close()
 
 
 def main(argv):
@ -219,8 +219,8 @@ def create_image_lists(image_dir, testing_percentage, validation_percentage):
|
||||
# To do that, we need a stable way of deciding based on just the file name
|
||||
# itself, so we do a hash of that and then use that to generate a
|
||||
# probability value that we use to assign it.
|
||||
percentage_hash = (int(
|
||||
hashlib.sha1(hash_name).hexdigest(), 16) % (65536)) * (100 / 65535.0)
|
||||
hash_name_hashed = hashlib.sha1(hash_name.encode('utf-8')).hexdigest()
|
||||
percentage_hash = (int(hash_name_hashed, 16) % (65536)) * (100 / 65535.0)
|
||||
if percentage_hash < validation_percentage:
|
||||
validation_images.append(base_name)
|
||||
elif percentage_hash < (testing_percentage + validation_percentage):
|
||||
@ -295,8 +295,9 @@ def create_inception_graph():
|
||||
Graph holding the trained Inception network.
|
||||
"""
|
||||
with tf.Session() as sess:
|
||||
with gfile.FastGFile(
|
||||
os.path.join(FLAGS.model_dir, 'classify_image_graph_def.pb'), 'r') as f:
|
||||
model_filename = os.path.join(
|
||||
FLAGS.model_dir, 'classify_image_graph_def.pb')
with gfile.FastGFile(model_filename, 'rb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
_ = tf.import_graph_def(graph_def, name='')

@ -395,7 +396,7 @@ def get_or_create_bottleneck(sess, image_lists, label_name, index, image_dir,
category)
if not gfile.Exists(image_path):
tf.logging.fatal('File does not exist %s', image_path)
image_data = gfile.FastGFile(image_path, 'r').read()
image_data = gfile.FastGFile(image_path, 'rb').read()
bottleneck_values = run_bottleneck_on_image(sess, image_data,
JPEG_DATA_TENSOR_NAME)
bottleneck_string = ','.join(str(x) for x in bottleneck_values)

@ -430,7 +431,7 @@ def cache_bottlenecks(sess, image_lists, image_dir, bottleneck_dir):
"""
how_many_bottlenecks = 0
ensure_dir_exists(bottleneck_dir)
for label_name, label_lists in image_lists.iteritems():
for label_name, label_lists in image_lists.items():
for category in ['training', 'testing', 'validation']:
category_list = label_lists[category]
for index, unused_base_name in enumerate(category_list):

@ -467,7 +468,7 @@ def get_random_cached_bottlenecks(sess, image_lists, how_many, category,
ground_truthes = []
for unused_i in range(how_many):
label_index = random.randrange(class_count)
label_name = image_lists.keys()[label_index]
label_name = list(image_lists.keys())[label_index]
image_index = random.randrange(65536)
bottleneck = get_or_create_bottleneck(sess, image_lists, label_name,
image_index, image_dir, category,

@ -818,7 +819,7 @@ def main(_):
# Write out the trained graph and labels with the weights stored as constants.
output_graph_def = graph_util.convert_variables_to_constants(
sess, graph.as_graph_def(), [FLAGS.final_tensor_name])
with gfile.FastGFile(FLAGS.output_graph, 'w') as f:
with gfile.FastGFile(FLAGS.output_graph, 'wb') as f:
f.write(output_graph_def.SerializeToString())
with gfile.FastGFile(FLAGS.output_labels, 'w') as f:
f.write('\n'.join(image_lists.keys()) + '\n')
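The `list(image_lists.keys())[label_index]` change above is a Python 3 compatibility fix: `dict.keys()` returns a non-indexable view object in Python 3. A minimal standalone sketch (the label map here is a made-up stand-in for the script's real `image_lists`):

```python
# Hypothetical stand-in for the retraining script's image_lists dict.
image_lists = {'daisy': [], 'roses': [], 'tulips': []}

label_index = 1
# In Python 2, dict.keys() returned a plain list and could be indexed.
# In Python 3 it is a view object, so subscripting it raises TypeError;
# wrapping it in list() works under both versions.
label_name = list(image_lists.keys())[label_index]
print(label_name)  # 'roses' (dicts preserve insertion order in Python 3.7+)
```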
@ -54,7 +54,7 @@ def _read32(bytestream):
def extract_images(filename):
"""Extract the images into a 4D uint8 numpy array [index, y, x, depth]."""
print('Extracting', filename)
with tf.gfile.Open(filename) as f, gzip.GzipFile(fileobj=f) as bytestream:
with tf.gfile.Open(filename, 'rb') as f, gzip.GzipFile(fileobj=f) as bytestream:
magic = _read32(bytestream)
if magic != 2051:
raise ValueError(

@ -81,7 +81,7 @@ def dense_to_one_hot(labels_dense, num_classes):
def extract_labels(filename, one_hot=False, num_classes=10):
"""Extract the labels into a 1D uint8 numpy array [index]."""
print('Extracting', filename)
with tf.gfile.Open(filename) as f, gzip.GzipFile(fileobj=f) as bytestream:
with tf.gfile.Open(filename, 'rb') as f, gzip.GzipFile(fileobj=f) as bytestream:
magic = _read32(bytestream)
if magic != 2049:
raise ValueError(

@ -143,7 +143,7 @@ def evaluation(logits, labels):
"""
# For a classifier model, we can use the in_top_k Op.
# It returns a bool tensor with shape [batch_size] that is true for
# the examples where the label's is was in the top k (here k=1)
# the examples where the label is in the top k (here k=1)
# of all logits for that example.
correct = tf.nn.in_top_k(logits, labels, 1)
# Return the number of true entries.
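For readers unfamiliar with `tf.nn.in_top_k`, the corrected comment can be checked against a small NumPy sketch of the same computation (a simplification that ignores how TensorFlow breaks ties between equal logits):

```python
import numpy as np

# Emulate tf.nn.in_top_k(logits, labels, k): for each row, is the true
# label's class among the k largest logits?
def in_top_k(logits, labels, k=1):
    # Indices of the k largest logits per row (sorted descending).
    topk = np.argsort(-logits, axis=1)[:, :k]
    return np.array([labels[i] in topk[i] for i in range(len(labels))])

logits = np.array([[0.1, 2.0, 0.3],
                   [5.0, 1.0, 0.2]])
labels = np.array([1, 2])           # second example's true class is not top-1
correct = in_top_k(logits, labels, k=1)
print(correct)       # first entry True, second False
print(correct.sum()) # number of correctly classified examples
```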
@ -54,23 +54,23 @@ def main(_):
# Create the model
x = tf.placeholder(tf.float32, [None, 784], name='x-input')
W = tf.Variable(tf.zeros([784, 10]), name='weights')
b = tf.Variable(tf.zeros([10], name='bias'))
b = tf.Variable(tf.zeros([10]), name='bias')

# Use a name scope to organize nodes in the graph visualizer
with tf.name_scope('Wx_b'):
y = tf.nn.softmax(tf.matmul(x, W) + b)

# Add summary ops to collect data
_ = tf.histogram_summary('weights', W)
_ = tf.histogram_summary('biases', b)
_ = tf.histogram_summary('y', y)
tf.histogram_summary('weights', W)
tf.histogram_summary('biases', b)
tf.histogram_summary('y', y)

# Define loss and optimizer
y_ = tf.placeholder(tf.float32, [None, 10], name='y-input')
# More name scopes will clean up the graph representation
with tf.name_scope('xent'):
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
_ = tf.scalar_summary('cross entropy', cross_entropy)
tf.scalar_summary('cross entropy', cross_entropy)
with tf.name_scope('train'):
train_step = tf.train.GradientDescentOptimizer(
FLAGS.learning_rate).minimize(cross_entropy)

@ -78,7 +78,7 @@ def main(_):
with tf.name_scope('test'):
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
_ = tf.scalar_summary('accuracy', accuracy)
tf.scalar_summary('accuracy', accuracy)

# Merge all the summaries and write them out to /tmp/mnist_logs (by default)
merged = tf.merge_all_summaries()
@ -128,7 +128,7 @@ num_skips = 2 # How many times to reuse an input to generate a label.
# construction are also the most frequent.
valid_size = 16 # Random set of words to evaluate similarity on.
valid_window = 100 # Only pick dev samples in the head of the distribution.
valid_examples = np.array(random.sample(np.arange(valid_window), valid_size))
valid_examples = np.random.choice(valid_window, valid_size, replace=False)
num_sampled = 64 # Number of negative examples to sample.

graph = tf.Graph()
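The replacement line swaps `random.sample` over a NumPy array for `np.random.choice(..., replace=False)`, which draws `valid_size` distinct integers below `valid_window` directly. A quick check of that behavior:

```python
import numpy as np

# np.random.choice with an integer first argument samples from range(n);
# replace=False guarantees the sampled word ids are all distinct.
valid_window, valid_size = 100, 16
valid_examples = np.random.choice(valid_window, valid_size, replace=False)

print(valid_examples.shape)               # 16 sampled ids
print(len(set(valid_examples.tolist())))  # all distinct, each below 100
```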
@ -290,11 +290,11 @@
"Another one is to use learning rate decay:\n",
"\n",
" global_step = tf.Variable(0) # count the number of steps taken.\n",
" learning_rate = tf.train.exponential_decay(0.5, step, ...)\n",
" learning_rate = tf.train.exponential_decay(0.5, global_step, ...)\n",
" optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)\n",
" \n",
" ---\n"
]
}
]
}
}

@ -421,7 +421,7 @@
"\n",
"graph = tf.Graph()\n",
"\n",
"with graph.as_default():\n",
"with graph.as_default(), tf.device('/cpu:0'):\n",
"\n",
" # Input data.\n",
" train_dataset = tf.placeholder(tf.int32, shape=[batch_size])\n",
@ -1,6 +1,7 @@
FROM b.gcr.io/tensorflow/tensorflow:latest
MAINTAINER Vincent Vanhoucke <vanhoucke@google.com>
RUN pip install scikit-learn
RUN rm -rf /notebooks/*
ADD *.ipynb /notebooks/
WORKDIR /notebooks
CMD ["/run_jupyter.sh"]
@ -820,7 +820,7 @@ classes are mutually exclusive (each entry is in exactly one class). For
example, each CIFAR-10 image is labeled with one and only one label: an image
can be a dog or a truck, but not both.

**NOTE:**: While the classes are mutually exclusive, their probabilities
**NOTE:** While the classes are mutually exclusive, their probabilities
need not be. All that is required is that each row of `labels` is
a valid probability distribution. If using exclusive `labels`
(wherein one and only one class is true at a time), see

@ -857,7 +857,7 @@ classes are mutually exclusive (each entry is in exactly one class). For
example, each CIFAR-10 image is labeled with one and only one label: an image
can be a dog or a truck, but not both.

**NOTE:**: For this operation, the probability of a given label is considered
**NOTE:** For this operation, the probability of a given label is considered
exclusive. That is, soft classes are not allowed, and the `labels` vector
must provide a single specific index for the true class for each row of
`logits` (each minibatch entry). For soft softmax classification with
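The distinction drawn in the two notes above, rows of `labels` as probability distributions versus a single exclusive class index, can be illustrated with a plain-NumPy cross-entropy (a sketch, not TensorFlow's fused implementation):

```python
import numpy as np

def softmax(z):
    # Subtract the row max for numerical stability before exponentiating.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def softmax_xent(labels, logits):
    # Well-defined for any rows of `labels` that sum to 1.
    return -(labels * np.log(softmax(logits))).sum(axis=1)

logits = np.array([[2.0, 1.0, 0.1]])
hard = np.array([[1.0, 0.0, 0.0]])  # one-hot: the exclusive-label case
soft = np.array([[0.7, 0.2, 0.1]])  # soft labels: still a distribution

print(softmax_xent(hard, logits))  # reduces to -log p(true class)
print(softmax_xent(soft, logits))  # equally well-defined for soft labels
```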
@ -794,9 +794,11 @@ global_step = tf.Variable(0, trainable=False)
starter_learning_rate = 0.1
learning_rate = tf.train.exponential_decay(starter_learning_rate, global_step,
100000, 0.96, staircase=True)
optimizer = tf.GradientDescentOptimizer(learning_rate)
# Passing global_step to minimize() will increment it at each step.
optimizer.minimize(...my loss..., global_step=global_step)
learning_step = (
tf.GradientDescentOptimizer(learning_rate)
.minimize(...my loss..., global_step=global_step)
)
```
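As an aside, the schedule in the snippet follows `decayed_rate = starter_rate * decay_rate ^ (global_step / decay_steps)`, with the exponent truncated to an integer when `staircase=True`. A small sketch of that arithmetic:

```python
# Plain-Python model of the exponential_decay schedule used above
# (0.1 starter rate, multiplied by 0.96 every 100000 steps).
def exponential_decay(starter, global_step, decay_steps, decay_rate,
                      staircase=False):
    p = global_step / decay_steps
    if staircase:
        p = global_step // decay_steps  # integer division: stepwise drops
    return starter * decay_rate ** p

print(exponential_decay(0.1, 100000, 100000, 0.96, staircase=True))
# one full decay period -> 0.1 * 0.96
print(exponential_decay(0.1, 150000, 100000, 0.96, staircase=True))
# staircase holds the rate until the next full period completes
```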
##### Args:

@ -2280,5 +2282,3 @@ device assignments have not changed.

##### Returns:

A saver constructed from `saver_def` in `MetaGraphDef`.
@ -53,28 +53,28 @@ Install TensorFlow:

```bash
# Ubuntu/Linux 64-bit, CPU only:
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.0-py2-none-linux_x86_64.whl
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.1-cp27-none-linux_x86_64.whl

# Ubuntu/Linux 64-bit, GPU enabled:
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.7.0-py2-none-linux_x86_64.whl
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.7.1-cp27-none-linux_x86_64.whl

# Mac OS X, CPU only:
$ sudo easy_install --upgrade six
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.7.0-py2-none-any.whl
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.7.1-cp27-none-any.whl
```

For python3:

```bash
# Ubuntu/Linux 64-bit, CPU only:
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.0-py3-none-linux_x86_64.whl
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.1-cp34-none-linux_x86_64.whl

# Ubuntu/Linux 64-bit, GPU enabled:
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.7.0-py3-none-linux_x86_64.whl
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.7.1-cp34-none-linux_x86_64.whl

# Mac OS X, CPU only:
$ sudo easy_install --upgrade six
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.7.0-py3-none-any.whl
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.7.1-cp35-none-any.whl
```

NOTE: If you are upgrading from a previous installation of TensorFlow < 0.7.1,

@ -126,13 +126,13 @@ $ source ~/tensorflow/bin/activate.csh # If using csh
(tensorflow)$ # Your prompt should change

# Ubuntu/Linux 64-bit, CPU only:
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.0-py2-none-linux_x86_64.whl
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.1-cp27-none-linux_x86_64.whl

# Ubuntu/Linux 64-bit, GPU enabled:
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.7.0-py2-none-linux_x86_64.whl
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.7.1-cp27-none-linux_x86_64.whl

# Mac OS X, CPU only:
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.7.0-py2-none-any.whl
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.7.1-cp27-none-any.whl
```

and again for python3:

@ -143,13 +143,13 @@ $ source ~/tensorflow/bin/activate.csh # If using csh
(tensorflow)$ # Your prompt should change

# Ubuntu/Linux 64-bit, CPU only:
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.0-py3-none-linux_x86_64.whl
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.1-cp34-none-linux_x86_64.whl

# Ubuntu/Linux 64-bit, GPU enabled:
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.7.0-py3-none-linux_x86_64.whl
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.7.1-cp34-none-linux_x86_64.whl

# Mac OS X, CPU only:
(tensorflow)$ pip3 install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.7.0-py3-none-any.whl
(tensorflow)$ pip3 install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.7.1-cp35-none-any.whl
```

With the Virtualenv environment activated, you can now

@ -191,7 +191,7 @@ code.
* `b.gcr.io/tensorflow/tensorflow:latest-devel-gpu`: GPU Binary image plus source
code.

We also have tags with `latest` replaced by a released version (e.g., `0.7.0-gpu`).
We also have tags with `latest` replaced by a released version (e.g., `0.7.1-gpu`).

With Docker the installation is as follows:

@ -464,7 +464,7 @@ We recommend using [homebrew](http://brew.sh) to install the bazel and SWIG
dependencies, and installing python dependencies using easy_install or pip.

Of course you can also install Swig from source without using homebrew. In that
case, be sure to install its dependency [PCRE](from www.pcre.org) and not PCRE2.
case, be sure to install its dependency [PCRE](http://www.pcre.org) and not PCRE2.

#### Dependencies

@ -517,7 +517,7 @@ $ bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_pack
$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

# The name of the .whl file will depend on your platform.
$ pip install /tmp/tensorflow_pkg/tensorflow-0.7.0-py2-none-linux_x86_64.whl
$ pip install /tmp/tensorflow_pkg/tensorflow-0.7.1-py2-none-linux_x86_64.whl
```

## Setting up TensorFlow for Development
@ -74,7 +74,7 @@ and compact summary of the images, since it has to contain enough information
for the classifier to make a good choice in a very small set of values. The
reason our final layer retraining can work on new classes is that it turns out
the kind of information needed to distinguish between all the 1,000 classes in
ImageNet is often also useful to chose between new kinds of objects.
ImageNet is often also useful to distinguish between new kinds of objects.

Because every image is reused multiple times during training and calculating
each bottleneck takes a significant amount of time, it speeds things up to

@ -88,20 +88,20 @@ part again.
Once the bottlenecks are complete, the actual training of the top layer of the
network begins. You'll see a series of step outputs, each one showing training
accuracy, validation accuracy, and the cross entropy. The training accuracy
shows how many of the images used in the current training batch were labeled
with the correct class. The validation accuracy is the precision on a
shows what percent of the images used in the current training batch were
labeled with the correct class. The validation accuracy is the precision on a
randomly-selected group of images from a different set. The key difference is
that the training accuracy is based on images that the network has been able
to learn from so the network can overfit to the noise in the training data. A
true measure of the performance of the network is to measure its performance on
a data set not contained in the training data -- this is measured by the
validation accuracy. If the training accuracy is high but the validation remains
low, that means the network is overfitting and memorizing particular features
in the training images that aren't helpful more generally. Cross entropy is a
loss function which gives a glimpse into how well the learning process is
progressing. The training's objective is to make the loss as small as possible,
so you can tell if the learning is working by keeping an eye on whether the loss
keeps trending downwards, ignoring the short-term noise.
validation accuracy. If the train accuracy is high but the validation accuracy
remains low, that means the network is overfitting and memorizing particular
features in the training images that aren't helpful more generally. Cross
entropy is a loss function which gives a glimpse into how well the learning
process is progressing. The training's objective is to make the loss as small as
possible, so you can tell if the learning is working by keeping an eye on
whether the loss keeps trending downwards, ignoring the short-term noise.

By default this script will run 4,000 training steps. Each step chooses ten
images at random from the training set, finds their bottlenecks from the cache,

@ -114,8 +114,8 @@ and validation pictures. This test evaluation is the best estimate of how the
trained model will perform on the classification task. You should see an
accuracy value of between 90% and 95%, though the exact value will vary from run
to run since there's randomness in the training process. This number is based on
how many of the images in the test set are given the correct label after the
model is fully trained.
the percent of the images in the test set that are given the correct label
after the model is fully trained.

## Using the Retrained Model

@ -266,7 +266,7 @@ memorized unimportant details of the training images.

This problem is known as overfitting, and to avoid it we keep some of our data
out of the training process, so that the model can't memorize them. We then use
those images as a check to make sure that overfitting isn't occuring, since if
those images as a check to make sure that overfitting isn't occurring, since if
we see good accuracy on them it's a good sign the network isn't overfitting. The
usual split is to put 80% of the images into the main training set, keep 10%
aside to run as validation frequently during training, and then have a final 10%
|
||||
y = tf.nn.softmax(tf.matmul(x,W) + b)
|
||||
|
||||
# Add summary ops to collect data
|
||||
w_hist = tf.histogram_summary("weights", W)
|
||||
b_hist = tf.histogram_summary("biases", b)
|
||||
y_hist = tf.histogram_summary("y", y)
|
||||
tf.histogram_summary("weights", W)
|
||||
tf.histogram_summary("biases", b)
|
||||
tf.histogram_summary("y", y)
|
||||
|
||||
# Define loss and optimizer
|
||||
y_ = tf.placeholder(tf.float32, [None,10], name="y-input")
|
||||
# More name scopes will clean up the graph representation
|
||||
with tf.name_scope("xent") as scope:
|
||||
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
|
||||
ce_summ = tf.scalar_summary("cross entropy", cross_entropy)
|
||||
tf.scalar_summary("cross entropy", cross_entropy)
|
||||
with tf.name_scope("train") as scope:
|
||||
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
|
||||
|
||||
with tf.name_scope("test") as scope:
|
||||
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
|
||||
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
|
||||
accuracy_summary = tf.scalar_summary("accuracy", accuracy)
|
||||
tf.scalar_summary("accuracy", accuracy)
|
||||
|
||||
# Merge all the summaries and write them out to /tmp/mnist_logs
|
||||
merged = tf.merge_all_summaries()
|
||||
|
@ -28,8 +28,7 @@ by calling `as_graph_def()`, which returns a `GraphDef` object.

The GraphDef class is an object created by the ProtoBuf library from the
definition in
[tensorflow/core/framework/graph.proto](https://github.com/tensorflow/tensorflow
/blob/master/tensorflow/core/framework/graph.proto). The protobuf tools parse
[tensorflow/core/framework/graph.proto](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/graph.proto). The protobuf tools parse
this text file, and generate the code to load, store, and manipulate graph
definitions. If you see a standalone TensorFlow file representing a model, it's
likely to contain a serialized version of one of these `GraphDef` objects

@ -37,8 +36,7 @@ saved out by the protobuf code.

This generated code is used to save and load the GraphDef files from disk. A
good example to look at as we dig into this is
[graph_metrics.py](https://github.com/tensorflow/tensorflow/blob/master/tensorfl
ow/python/tools/graph_metrics.py). This Python script takes a saved graph
[graph_metrics.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/graph_metrics.py). This Python script takes a saved graph
definition, and analyzes the model to estimate performance and resource
statistics. The code that actually loads the model looks like this:

@ -69,16 +67,14 @@ There are actually two different formats that a ProtoBuf can be saved in.
TextFormat is a human-readable form, which makes it nice for debugging and
editing, but can get large when there's numerical data like weights stored in
it. You can see a small example of that in
[poly5-graph.pbtxt](https://github.com/tensorflow/tensorflow/blob/master/tensorf
low/tensorboard/components/tf-tensorboard/demo/data/poly5-graph.pbtxt).
[poly5-graph.pbtxt](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tensorboard/components/tf-tensorboard/demo/data/poly5-graph.pbtxt).

Binary format files are a lot smaller than their text equivalents, even though
they're not as readable for us. In this script, we ask the user to supply a
flag indicating whether the input file is binary or text, so we know the right
function to call. You can find an example of a large binary file inside the
[inception_dec_2015.zip
archive](https://storage.googleapis.com/download.tensorflow.org/models/inception
_dec_2015.zip), as `tensorflow_inception_graph.pb`.
archive](https://storage.googleapis.com/download.tensorflow.org/models/inception_dec_2015.zip), as `tensorflow_inception_graph.pb`.

The API itself can be a bit confusing - the binary call is actually
`ParseFromString()`, whereas you use a utility function from the `text_format`

@ -104,7 +100,7 @@ single operation along with its input connections. Here are the members of a

Every node should have a unique identifier that's not used by any other nodes
in the graph. If you don't specify one as you're building a graph using the
Python API, an arbitrary one will be picked for you. The name is used when
Python API, one reflecting the name of operation, such as "MatMul",
concatenated with a monotonically increasing number, such as "5", will be
picked for you. The name is used when
defining the connections between nodes, and when setting inputs and outputs for
the whole graph when it's run.

@ -115,8 +111,7 @@ This defines what operation to run, for example `"Add"`, `"MatMul"`, or
`"Conv2D"`. When a graph is run, this op name is looked up in a registry to
find an implementation. The registry is populated by calls to the
`REGISTER_OP()` macro, like those in
[tensorflow/core/ops/nn_ops.cc](https://github.com/tensorflow/tensorflow/blob/ma
ster/tensorflow/core/ops/nn_ops.cc).
[tensorflow/core/ops/nn_ops.cc](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/ops/nn_ops.cc).

### `input`

@ -142,8 +137,7 @@ size of filters for convolutions, or the values of constant ops. Because there
can be so many different types of attribute values, from strings, to ints, to
arrays of tensor values, there's a separate protobuf file defining the data
structure that holds them, in
[tensorflow/core/framework/attr_value.proto](https://github.com/tensorflow/tenso
rflow/blob/master/tensorflow/core/framework/attr_value.proto).
[tensorflow/core/framework/attr_value.proto](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/attr_value.proto).

Each attribute has a unique name string, and the expected attributes are listed
when the operation is defined. If an attribute isn't present in a node, but it

@ -161,8 +155,7 @@ the file format during training. Instead, they're held in separate checkpoint
files, and there are `Variable` ops in the graph that load the latest values
when they're initialized. It's often not very convenient to have separate files
when you're deploying to production, so there's the
[freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflo
w/python/tools/freeze_graph.py) script that takes a graph definition and a set
[freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py) script that takes a graph definition and a set
of checkpoints and freezes them together into a single file.

What this does is load the `GraphDef`, pull in the values for all the variables

@ -178,10 +171,9 @@ the most common problems is extracting and interpreting the weight values. A
common way to store them, for example in graphs created by the freeze_graph
script, is as `Const` ops containing the weights as `Tensors`. These are
defined in
[tensorflow/core/framework/tensor.proto](https://github.com/tensorflow/tensorflo
w/blob/master/tensorflow/core/framework/tensor.proto), and contain information
[tensorflow/core/framework/tensor.proto](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/tensor.proto), and contain information
about the size and type of the data, as well as the values themselves. In
Python, you get a `TensorProto` object from a `NodeDef` representing a `Const`
op by calling something like `some_node_def.attr['value'].tensor`.

This will give you an object representing the weights data. The data itself
@ -16,7 +16,7 @@ Python list) has a rank of 2:
t = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

A rank two tensor is what we typically think of as a matrix, a rank one tensor
is a vector. For a rank two tensor you can acccess any element with the syntax
is a vector. For a rank two tensor you can access any element with the syntax
`t[i, j]`. For a rank three tensor you would need to address an element with
`t[i, j, k]`.
@ -31,6 +31,11 @@ something amazing with TensorFlow, we'd like to hear about it!

## Community

The TensorFlow community has created many great projects around TensorFlow, including:

* [TensorFlow tutorials](https://github.com/pkmital/tensorflow_tutorials)
* [Scikit Flow - Simplified Interface for TensorFlow](https://github.com/tensorflow/skflow)

### Development

The source code for TensorFlow is hosted on GitHub:
|
||||
problem is to classify RGB 32x32 pixel images across 10 categories:
|
||||
```airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.```
|
||||
|
||||

|
||||
|
||||
For more details refer to the [CIFAR-10 page](http://www.cs.toronto.edu/~kriz/cifar.html)
|
||||
and a [Tech Report](http://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf)
|
||||
by Alex Krizhevsky.
|
||||
@ -117,7 +115,7 @@ learn more about how the `Reader` class works.
|
||||
The images are processed as follows:
|
||||
|
||||
* They are cropped to 24 x 24 pixels, centrally for evaluation or
|
||||
[randomly](../../api_docs/python/image.md#random_crop) for training.
|
||||
[randomly](../../api_docs/python/constant_op.md#random_crop) for training.
|
||||
* They are [approximately whitened](../../api_docs/python/image.md#per_image_whitening)
|
||||
to make the model insensitive to dynamic range.
|
||||
|
||||
@ -168,7 +166,7 @@ Here is a graph generated from TensorBoard describing the inference operation:
|
||||
</div>
|
||||
|
||||
> **EXERCISE**: The output of `inference` are un-normalized logits. Try editing
|
||||
the network architecture to return normalized predictions using [`tf.softmax()`]
|
||||
the network architecture to return normalized predictions using [`tf.nn.softmax()`]
|
||||
(../../api_docs/python/nn.md#softmax).
|
||||
|
||||
The `inputs()` and `inference()` functions provide all the components
|
||||
|
@ -50,7 +50,7 @@ unpacked (following the instructions available at the website) by the

The image data is extracted into a 2d tensor of: `[image index, pixel index]`
where each entry is the intensity value of a specific pixel in a specific
image, rescaled from `[0, 255]` to `[-0.5, 0.5]`. The "image index" corresponds
image, rescaled from `[0, 255]` to `[0, 1]`. The "image index" corresponds
to an image in the dataset, counting up from zero to the size of the dataset.
And the "pixel index" corresponds to a specific pixel in that image, ranging
from zero to the number of pixels in the image.
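The corrected range is produced by a single division by 255; a one-line check with NumPy:

```python
import numpy as np

# Rescale uint8 pixel intensities from [0, 255] into [0, 1].
pixels = np.array([0, 128, 255], dtype=np.uint8)
scaled = pixels.astype(np.float32) / 255.0
print(scaled.min(), scaled.max())  # endpoints map to 0.0 and 1.0
```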
@ -92,7 +92,7 @@ lstm = rnn_cell.BasicLSTMCell(lstm_size)
# Initial state of the LSTM memory.
initial_state = state = tf.zeros([batch_size, lstm.state_size])

for i in range(len(num_steps)):
for i in range(num_steps):
# The value of state is updated after processing each batch of words.
output, state = lstm(words[:, i], state)

@ -159,7 +159,7 @@ lstm = rnn_cell.BasicLSTMCell(lstm_size)
stacked_lstm = rnn_cell.MultiRNNCell([lstm] * number_of_layers)

initial_state = state = stacked_lstm.zero_state(batch_size, tf.float32)
for i in range(len(num_steps)):
for i in range(num_steps):
# The value of state is updated after processing each batch of words.
output, state = stacked_lstm(words[:, i], state)
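Both corrected loops iterate `num_steps` times (`num_steps` is an int, so the old `range(len(num_steps))` would raise a TypeError). The shape bookkeeping can be sketched without TensorFlow by substituting a hypothetical plain-tanh cell for the LSTM:

```python
import numpy as np

# Hypothetical dimensions; the weights stand in for a trained RNN cell.
batch_size, num_steps, embed_dim, state_size = 4, 3, 8, 5
rng = np.random.default_rng(0)
W_x = rng.normal(size=(embed_dim, state_size))
W_h = rng.normal(size=(state_size, state_size))
words = rng.normal(size=(batch_size, num_steps, embed_dim))

state = np.zeros((batch_size, state_size))
for i in range(num_steps):
    # The value of state is updated after processing each batch of words.
    state = np.tanh(words[:, i] @ W_x + state @ W_h)

print(state.shape)  # one state vector per batch element
```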
@ -58,7 +58,7 @@ translation [Sutskever et al., 2014](http://arxiv.org/abs/1409.3215)
In the basic model depicted above, every input has to be encoded into
a fixed-size state vector, as that is the only thing passed to the decoder.
To allow the decoder more direct access to the input, an *attention* mechanism
was introduced in [Bahdanu et al., 2014](http://arxiv.org/abs/1409.0473)
was introduced in [Bahdanau et al., 2014](http://arxiv.org/abs/1409.0473)
([pdf](http://arxiv.org/pdf/1409.0473.pdf)).
We will not go into the details of the attention mechanism (see the paper),
suffice it to say that it allows the decoder to peek into the input at every

@ -176,8 +176,8 @@ projections are constructed by the following code in `seq2seq_model.py`.
```

First, note that we only construct a sampled softmax if the number of samples
(512 by default) is smaller that the target vocabulary size. For vocabularies
smaller than 512 it might be a better idea to just use a standard softmax loss.
(512 by default) is smaller than the target vocabulary size. For vocabularies
smaller than 512, it might be a better idea to just use a standard softmax loss.

Then, as you can see, we construct an output projection. It is a pair,
consisting of a weight matrix and a bias vector. If used, the rnn cell
@ -17,7 +17,7 @@

This should achieve a test error of 0.7%. Please keep this model as simple and
linear as possible, it is meant as a tutorial for simple convolutional models.
Run with --self_test on the command line to exectute a short self-test.
Run with --self_test on the command line to execute a short self-test.
"""
from __future__ import absolute_import
from __future__ import division
@ -276,7 +276,7 @@ def get_config():
raise ValueError("Invalid model: %s", FLAGS.model)


def main(unused_args):
def main(_):
if not FLAGS.data_path:
raise ValueError("Must set --data_path to PTB data directory")
@@ -66,7 +66,7 @@ def gunzip_file(gz_path, new_path):
  """Unzips from gz_path into new_path."""
  print("Unpacking %s to %s" % (gz_path, new_path))
  with gzip.open(gz_path, "rb") as gz_file:
    with open(new_path, "w") as new_file:
    with open(new_path, "wb") as new_file:
      for line in gz_file:
        new_file.write(line)

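The `"w"` to `"wb"` change above matters because `gzip.open(..., "rb")` yields `bytes`, and writing bytes to a text-mode file raises `TypeError` on Python 3. A self-contained sketch of the fixed pattern:

```python
import gzip
import os
import tempfile

# Create a small gzip file to unpack.
tmp = tempfile.mkdtemp()
gz_path = os.path.join(tmp, "data.gz")
new_path = os.path.join(tmp, "data.txt")

with gzip.open(gz_path, "wb") as f:
    f.write(b"hello\nworld\n")

# gzip yields bytes, so the destination must be opened in binary mode;
# open(new_path, "w") would fail here on Python 3.
with gzip.open(gz_path, "rb") as gz_file:
    with open(new_path, "wb") as new_file:
        for line in gz_file:
            new_file.write(line)

content = open(new_path, "rb").read()
print(content)  # b'hello\nworld\n'
```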
@@ -251,8 +251,8 @@ def import_graph_def(graph_def, input_map=None, return_elements=None,
      class_values = value.list
      new_class_values = []
      for class_value in class_values.s:
        if class_value.startswith('loc:@'):
          op_to_bind_to = class_value[5:]
        if class_value.startswith(b'loc:@'):
          op_to_bind_to = class_value[5:].decode()
          # Find the op by its original name.
          if op_to_bind_to not in name_to_op:
            raise ValueError('Specified colocation to an op that '
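The `b'loc:@'` change reflects that attribute values arrive from the serialized proto as `bytes` on Python 3, where a bytes object cannot be matched against a `str` prefix. A minimal sketch of the fixed logic:

```python
# Attr values from a GraphDef proto are bytes in Python 3; matching must use
# a bytes prefix, and the result must be decoded before use as an op name.
class_value = b"loc:@my_op"

if class_value.startswith(b"loc:@"):
    op_to_bind_to = class_value[5:].decode()

print(op_to_bind_to)  # my_op
```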
@@ -1041,7 +1041,7 @@ class Operation(object):
      raise TypeError("node_def needs to be a NodeDef: %s" % node_def)
    if node_def.ByteSize() >= (1 << 31) or node_def.ByteSize() < 0:
      raise ValueError(
          "Cannot create an Operation with a NodeDef larger than 2GB.")
          "Cannot create a tensor proto whose content is larger than 2GB.")
    if not _VALID_OP_NAME_REGEX.match(node_def.name):
      raise ValueError("'%s' is not a valid node name" % node_def.name)
    if not isinstance(g, Graph):
@@ -1228,8 +1228,8 @@ class ColocationGroupTest(test_util.TensorFlowTestCase):
      with ops.colocate_with(a.op):
        b = constant_op.constant(3.0)
      c = constant_op.constant(4.0)
      self.assertEqual(["loc:@a"], a.op.colocation_groups())
      self.assertEqual(["loc:@a"], b.op.colocation_groups())
      self.assertEqual([b"loc:@a"], a.op.colocation_groups())
      self.assertEqual([b"loc:@a"], b.op.colocation_groups())
      with self.assertRaises(ValueError):
        c.op.get_attr("_class")

@@ -1242,7 +1242,7 @@ class ColocationGroupTest(test_util.TensorFlowTestCase):
        # colocated with 'a', which is on '/gpu:0'. colocate_with
        # overrides devices because it is a stronger constraint.
        b = constant_op.constant(3.0)
    self.assertEqual(["loc:@a"], b.op.colocation_groups())
    self.assertEqual([b"loc:@a"], b.op.colocation_groups())
    self.assertEqual(a.op.device, b.op.device)

  def testLocationOverrides(self):
@@ -1258,7 +1258,7 @@ class ColocationGroupTest(test_util.TensorFlowTestCase):
        c = constant_op.constant(4.0)
      d = constant_op.constant(5.0)

    self.assertEqual(["loc:@a"], b.op.colocation_groups())
    self.assertEqual([b"loc:@a"], b.op.colocation_groups())
    self.assertEqual("/device:GPU:0", a.op.device)
    self.assertEqual(a.op.device, b.op.device)

@@ -1272,8 +1272,8 @@ class ColocationGroupTest(test_util.TensorFlowTestCase):
    b = constant_op.constant(3.0)
    with ops.colocate_with(b.op):
      c = constant_op.constant(4.0)
    self.assertEqual(["loc:@a"], b.op.colocation_groups())
    self.assertEqual(["loc:@a"], c.op.colocation_groups())
    self.assertEqual([b"loc:@a"], b.op.colocation_groups())
    self.assertEqual([b"loc:@a"], c.op.colocation_groups())

  def testMultiColocationGroups(self):
    a = constant_op.constant([2.0], name="a")
@@ -1281,7 +1281,7 @@ class ColocationGroupTest(test_util.TensorFlowTestCase):
    with ops.colocate_with(a.op):
      with ops.colocate_with(b.op):
        c = constant_op.constant(4.0)
    self.assertEqual(set(["loc:@a", "loc:@b"]), set(c.op.colocation_groups()))
    self.assertEqual(set([b"loc:@a", b"loc:@b"]), set(c.op.colocation_groups()))

  def testColocationIgnoreStack(self):
    a = constant_op.constant([2.0], name="a")
@@ -1295,7 +1295,7 @@ class ColocationGroupTest(test_util.TensorFlowTestCase):
    a = variables.Variable([2.0], name="a")
    with ops.colocate_with(a.op):
      b = variables.Variable([3.0], name="b")
    self.assertEqual(["loc:@a"], b.op.colocation_groups())
    self.assertEqual([b"loc:@a"], b.op.colocation_groups())

  def testInconsistentDeviceWithinColocate(self):
    with ops.device("/gpu:0"):
@@ -361,6 +361,9 @@ def make_tensor_proto(values, dtype=None, shape=None):
        tensor_shape=tensor_shape.as_shape(shape).as_proto())

  if is_same_size and numpy_dtype in _TENSOR_CONTENT_TYPES and shape_size > 1:
    if nparray.size * nparray.itemsize >= (1 << 31):
      raise ValueError(
          "Cannot create a tensor proto whose content is larger than 2GB.")
    tensor_proto.tensor_content = nparray.tostring()
    return tensor_proto

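The guard added above exists because a protobuf message cannot exceed 2GB, so the byte size (element count times bytes per element) is checked before the array is serialized into `tensor_content`. A sketch of the check in isolation:

```python
# Sketch of the 2GB guard: reject any array whose serialized content would
# reach 2GB (the protobuf message size limit).
def check_tensor_content_size(size, itemsize):
    if size * itemsize >= (1 << 31):
        raise ValueError(
            "Cannot create a tensor proto whose content is larger than 2GB.")

check_tensor_content_size(1024 * 1024, 4)  # 4 MB of float32: accepted

# 512 * 1024 * 1024 float32 elements is exactly 2GB and is rejected, the
# case exercised by constant_op_test.py below.
try:
    check_tensor_content_size(512 * 1024 * 1024, 4)
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```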
@@ -155,7 +155,7 @@ class ConstantTest(tf.test.TestCase):
    large_array = np.zeros((512, 1024, 1024), dtype=np.float32)
    with self.assertRaisesRegexp(
        ValueError,
        "Cannot create an Operation with a NodeDef larger than 2GB."):
        "Cannot create a tensor proto whose content is larger than 2GB."):
      c = tf.constant(large_array)

  def testTooLargeGraph(self):
@@ -1397,7 +1397,7 @@ class ControlFlowTest(tf.test.TestCase):
                       vdef)
      # The device is empty, but the colocation constraint is set.
      self.assertDeviceEqual("", with_vdef_dep.device)
      self.assertEqual(["loc:@vdef"],
      self.assertEqual([b"loc:@vdef"],
                       with_vdef_dep.op.colocation_groups())

  def testGroup(self):
@@ -156,7 +156,7 @@ class DepthToSpaceTest(tf.test.TestCase):
        out_tf.eval()

  def testBlockSizeNotDivisibleDepth(self):
    # The the depth is not divisible by the square of the block size.
    # The depth is not divisible by the square of the block size.
    x_np = [[[[1, 1, 1, 1],
              [2, 2, 2, 2]],
             [[3, 3, 3, 3],
@@ -23,18 +23,21 @@ import tensorflow as tf

class GenerateIdentityTensorTest(tf.test.TestCase):

  def _testDiagOp(self, diag, dtype, expected_ans, use_gpu=False,
                  expected_err_re=None):
  def diagOp(self, diag, dtype, expected_ans, use_gpu=False):
    with self.test_session(use_gpu=use_gpu):
      tf_ans = tf.diag(tf.convert_to_tensor(diag.astype(dtype)))
      out = tf_ans.eval()
      tf_ans_inv = tf.diag_part(expected_ans)
      inv_out = tf_ans_inv.eval()
    self.assertAllClose(out, expected_ans)
    self.assertAllClose(inv_out, diag)
    self.assertShapeEqual(expected_ans, tf_ans)
    self.assertShapeEqual(diag, tf_ans_inv)

  def testEmptyTensor(self):
    x = numpy.array([])
    expected_ans = numpy.empty([0, 0])
    self._testDiagOp(x, numpy.int32, expected_ans)
    self.diagOp(x, numpy.int32, expected_ans)

  def testRankOneIntTensor(self):
    x = numpy.array([1, 2, 3])
@@ -42,8 +45,8 @@ class GenerateIdentityTensorTest(tf.test.TestCase):
        [[1, 0, 0],
         [0, 2, 0],
         [0, 0, 3]])
    self._testDiagOp(x, numpy.int32, expected_ans)
    self._testDiagOp(x, numpy.int64, expected_ans)
    self.diagOp(x, numpy.int32, expected_ans)
    self.diagOp(x, numpy.int64, expected_ans)

  def testRankOneFloatTensor(self):
    x = numpy.array([1.1, 2.2, 3.3])
@@ -51,8 +54,8 @@ class GenerateIdentityTensorTest(tf.test.TestCase):
        [[1.1, 0, 0],
         [0, 2.2, 0],
         [0, 0, 3.3]])
    self._testDiagOp(x, numpy.float32, expected_ans)
    self._testDiagOp(x, numpy.float64, expected_ans)
    self.diagOp(x, numpy.float32, expected_ans)
    self.diagOp(x, numpy.float64, expected_ans)

  def testRankTwoIntTensor(self):
    x = numpy.array([[1, 2, 3], [4, 5, 6]])
@@ -63,8 +66,8 @@ class GenerateIdentityTensorTest(tf.test.TestCase):
        [[[0, 0, 0], [4, 0, 0]],
         [[0, 0, 0], [0, 5, 0]],
         [[0, 0, 0], [0, 0, 6]]]])
    self._testDiagOp(x, numpy.int32, expected_ans)
    self._testDiagOp(x, numpy.int64, expected_ans)
    self.diagOp(x, numpy.int32, expected_ans)
    self.diagOp(x, numpy.int64, expected_ans)

  def testRankTwoFloatTensor(self):
    x = numpy.array([[1.1, 2.2, 3.3], [4.4, 5.5, 6.6]])
@@ -75,8 +78,8 @@ class GenerateIdentityTensorTest(tf.test.TestCase):
        [[[0, 0, 0], [4.4, 0, 0]],
         [[0, 0, 0], [0, 5.5, 0]],
         [[0, 0, 0], [0, 0, 6.6]]]])
    self._testDiagOp(x, numpy.float32, expected_ans)
    self._testDiagOp(x, numpy.float64, expected_ans)
    self.diagOp(x, numpy.float32, expected_ans)
    self.diagOp(x, numpy.float64, expected_ans)

  def testRankThreeFloatTensor(self):
    x = numpy.array([[[1.1, 2.2], [3.3, 4.4]],
@@ -90,8 +93,64 @@ class GenerateIdentityTensorTest(tf.test.TestCase):
          [[[0, 0], [0, 0]], [[0, 6.6], [0, 0]]]],
         [[[[0, 0], [0, 0]], [[0, 0], [7.7, 0]]],
          [[[0, 0], [0, 0]], [[0, 0], [0, 8.8]]]]]])
    self._testDiagOp(x, numpy.float32, expected_ans)
    self._testDiagOp(x, numpy.float64, expected_ans)
    self.diagOp(x, numpy.float32, expected_ans)
    self.diagOp(x, numpy.float64, expected_ans)

class DiagPartOpTest(tf.test.TestCase):

  def setUp(self):
    numpy.random.seed(0)

  def diagPartOp(self, tensor, dtype, expected_ans, use_gpu=False):
    with self.test_session(use_gpu=use_gpu):
      tf_ans_inv = tf.diag_part(tensor)
      inv_out = tf_ans_inv.eval()
    self.assertAllClose(inv_out, expected_ans)
    self.assertShapeEqual(expected_ans, tf_ans_inv)

  def testRankTwoFloatTensor(self):
    x = numpy.random.rand(3, 3)
    i = numpy.arange(3)
    expected_ans = x[i, i]
    self.diagPartOp(x, numpy.float32, expected_ans)
    self.diagPartOp(x, numpy.float64, expected_ans)

  def testRankFourFloatTensor(self):
    x = numpy.random.rand(2, 3, 2, 3)
    i = numpy.arange(2)[:, None]
    j = numpy.arange(3)
    expected_ans = x[i, j, i, j]
    self.diagPartOp(x, numpy.float32, expected_ans)
    self.diagPartOp(x, numpy.float64, expected_ans)

  def testRankSixFloatTensor(self):
    x = numpy.random.rand(2, 2, 2, 2, 2, 2)
    i = numpy.arange(2)[:, None, None]
    j = numpy.arange(2)[:, None]
    k = numpy.arange(2)
    expected_ans = x[i, j, k, i, j, k]
    self.diagPartOp(x, numpy.float32, expected_ans)
    self.diagPartOp(x, numpy.float64, expected_ans)

  def testOddRank(self):
    w = numpy.random.rand(2)
    x = numpy.random.rand(2, 2, 2)
    y = numpy.random.rand(2, 2, 2, 2, 2)
    z = numpy.random.rand(2, 2, 2, 2, 2, 2, 2)
    self.assertRaises(ValueError, self.diagPartOp, w, numpy.float32, 0)
    self.assertRaises(ValueError, self.diagPartOp, x, numpy.float32, 0)
    self.assertRaises(ValueError, self.diagPartOp, y, numpy.float32, 0)
    self.assertRaises(ValueError, self.diagPartOp, z, numpy.float32, 0)

  def testUnevenDimensions(self):
    w = numpy.random.rand(2, 5)
    x = numpy.random.rand(2, 1, 2, 3)
    y = numpy.random.rand(2, 1, 2, 1, 2, 5)
    z = numpy.random.rand(2, 2, 2, 2, 2, 2, 2, 2)
    self.assertRaises(ValueError, self.diagPartOp, w, numpy.float32, 0)
    self.assertRaises(ValueError, self.diagPartOp, x, numpy.float32, 0)
    self.assertRaises(ValueError, self.diagPartOp, y, numpy.float32, 0)
    self.assertRaises(ValueError, self.diagPartOp, z, numpy.float32, 0)

if __name__ == "__main__":
  tf.test.main()

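For intuition, the round trip these diag tests exercise can be sketched in plain NumPy: `tf.diag` embeds a rank-k tensor on the diagonal of a rank-2k tensor, and `tf.diag_part` recovers it via repeated indices:

```python
import numpy as np

# Rank-1 case: matches numpy.diag both ways.
x = np.array([1.0, 2.0, 3.0])
d = np.diag(x)                      # rank-2 result, shape (3, 3)
assert np.array_equal(np.diag(d), x)

# Higher ranks: diag_part picks entries at repeated indices, exactly what
# the DiagPartOpTest cases compute with fancy indexing.
y = np.arange(36.0).reshape(2, 3, 2, 3)
i = np.arange(2)[:, None]
j = np.arange(3)
part = y[i, j, i, j]                # shape (2, 3)
print(part.shape)  # (2, 3)
```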
@@ -25,7 +25,7 @@ from tensorflow.python.framework import random_seed
from tensorflow.python.ops import init_ops


# Returns true iff the two initalizers produce the same tensor to
# Returns true iff the two initializers produce the same tensor to
# within a tiny tolerance.
def identicaltest(tc, init1, init2, use_gpu):
  """Tests if two initializations are identical to within tiny tolerances.

@@ -120,7 +120,7 @@ class MatMulTest(tf.test.TestCase):
    self._testCpuMatmul(x, y, True, True)
    self._testGpuMatmul(x, y, True, True)

  def testDoubleRandomTranposeBoth(self):
  def testDoubleRandomTransposeBoth(self):
    for _ in range(10):
      n, k, m = np.random.randint(1, 100, size=3)
      x = self._randMatrix(k, n, np.float64)
@@ -116,8 +116,8 @@ class SumReductionTest(tf.test.TestCase):
  # Simple tests for various types.
  def testDoubleReduce1D(self):
    np_arr = np.arange(1, 6).reshape([5]).astype(np.float64)
    self._compare(np_arr, [], False)
    self._compare(np_arr, [0], False)
    self._compareAll(np_arr, [])
    self._compareAll(np_arr, [0])

  def testInt32Reduce1D(self):
    np_arr = np.arange(1, 6).reshape([5]).astype(np.int32)
@@ -230,6 +230,19 @@ class MeanReductionTest(tf.test.TestCase):
    self._compareAll(np_arr, [0, 2])
    self._compareAll(np_arr, [0, 1, 2])

  def testDoubleReduce3D(self):
    # Create a 3D array of doubles and reduce across all possible
    # dimensions
    np_arr = np.arange(0, 30).reshape([2, 3, 5]).astype(np.float64)
    self._compareAll(np_arr, [])
    self._compareAll(np_arr, [0])
    self._compareAll(np_arr, [1])
    self._compareAll(np_arr, [2])
    self._compareAll(np_arr, [0, 1])
    self._compareAll(np_arr, [1, 2])
    self._compareAll(np_arr, [0, 2])
    self._compareAll(np_arr, [0, 1, 2])

  def testGradient(self):
    s = [2, 3, 4, 2]
    x = np.arange(1.0, 49.0).reshape(s).astype(np.float32)
@@ -383,6 +396,19 @@ class MinReductionTest(tf.test.TestCase):
    self._compareAll(np_arr, [0, 2])
    self._compareAll(np_arr, [0, 1, 2])

  def testDoubleReduce3D(self):
    # Create a 3D array of doubles and reduce across all possible
    # dimensions
    np_arr = np.arange(0, 30).reshape([2, 3, 5]).astype(np.float64)
    self._compareAll(np_arr, [])
    self._compareAll(np_arr, [0])
    self._compareAll(np_arr, [1])
    self._compareAll(np_arr, [2])
    self._compareAll(np_arr, [0, 1])
    self._compareAll(np_arr, [1, 2])
    self._compareAll(np_arr, [0, 2])
    self._compareAll(np_arr, [0, 1, 2])

  def testGradient(self):
    s = [2, 3, 4, 2]
    x = np.arange(1.0, 49.0).reshape(s).astype(np.float64)
@@ -477,6 +503,20 @@ class MaxReductionTest(tf.test.TestCase):
    self._compareAll(np_arr, [0, 2])
    self._compareAll(np_arr, [0, 1, 2])

  def testDoubleReduce3D(self):
    # Create a 3D array of doubles and reduce across all possible
    # dimensions
    np_arr = np.arange(0, 30).reshape([2, 3, 5]).astype(np.float64)
    self._compareAll(np_arr, None)
    self._compareAll(np_arr, [])
    self._compareAll(np_arr, [0])
    self._compareAll(np_arr, [1])
    self._compareAll(np_arr, [2])
    self._compareAll(np_arr, [0, 1])
    self._compareAll(np_arr, [1, 2])
    self._compareAll(np_arr, [0, 2])
    self._compareAll(np_arr, [0, 1, 2])

  def testGradient(self):
    s = [2, 3, 4, 2]
    x = np.arange(1.0, 49.0).reshape(s).astype(np.float64)
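The NumPy side of the reduction comparisons these tests perform looks like the following sketch: a `[2, 3, 5]` array of doubles reduced over various axis subsets.

```python
import numpy as np

# Same array the new testDoubleReduce3D cases build.
np_arr = np.arange(0, 30).reshape([2, 3, 5]).astype(np.float64)

# Reducing over all axes collapses to a scalar; partial axis tuples keep
# the remaining dimensions.
assert np_arr.sum() == 435.0                      # 0 + 1 + ... + 29
assert np.sum(np_arr, axis=(0, 1)).shape == (5,)
assert np.max(np_arr, axis=(1, 2)).shape == (2,)
assert np.min(np_arr, axis=2).shape == (2, 3)
print(np_arr.sum())  # 435.0
```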
@@ -782,11 +782,11 @@ class BidirectionalRNNTest(tf.test.TestCase):
            tf.float32,
            shape=(batch_size, input_size) if use_shape else (None, input_size))
    ]
    outputs = tf.nn.bidirectional_rnn(cell_fw,
                                      cell_bw,
                                      inputs,
                                      dtype=tf.float32,
                                      sequence_length=sequence_length)
    outputs, state_fw, state_bw = tf.nn.bidirectional_rnn(cell_fw,
                                                          cell_bw,
                                                          inputs,
                                                          dtype=tf.float32,
                                                          sequence_length=sequence_length)
    self.assertEqual(len(outputs), len(inputs))
    for out in outputs:
      self.assertEqual(
@@ -794,17 +794,19 @@ class BidirectionalRNNTest(tf.test.TestCase):
          [batch_size if use_shape else None, 2 * num_units])

    input_value = np.random.randn(batch_size, input_size)
    outputs = tf.pack(outputs)

    return input_value, inputs, outputs, sequence_length
    return input_value, inputs, outputs, state_fw, state_bw, sequence_length

  def _testBidirectionalRNN(self, use_gpu, use_shape):
    with self.test_session(use_gpu=use_gpu, graph=tf.Graph()) as sess:
      input_value, inputs, outputs, sequence_length = (
      input_value, inputs, outputs, state_fw, state_bw, sequence_length = (
          self._createBidirectionalRNN(use_gpu, use_shape, True))
      tf.initialize_all_variables().run()
      # Run with pre-specified sequence length of 2, 3
      out = sess.run(outputs, feed_dict={inputs[0]: input_value,
                                         sequence_length: [2, 3]})
      out, s_fw, s_bw = sess.run([outputs, state_fw, state_bw],
                                 feed_dict={inputs[0]: input_value,
                                            sequence_length: [2, 3]})

    # Since the forward and backward LSTM cells were initialized with the
    # same parameters, the forward and backward output has to be the same,
@@ -836,13 +838,17 @@ class BidirectionalRNNTest(tf.test.TestCase):
    self.assertEqual(out[2][1][0], out[0][1][3])
    self.assertEqual(out[2][1][1], out[0][1][4])
    self.assertEqual(out[2][1][2], out[0][1][5])
    # Via the reasoning above, the forward and backward final state should be
    # exactly the same
    self.assertAllClose(s_fw, s_bw)

  def _testBidirectionalRNNWithoutSequenceLength(self, use_gpu, use_shape):
    with self.test_session(use_gpu=use_gpu, graph=tf.Graph()) as sess:
      input_value, inputs, outputs, _ = self._createBidirectionalRNN(
          use_gpu, use_shape, False)
      input_value, inputs, outputs, state_fw, state_bw, _ = self._createBidirectionalRNN(
          use_gpu, use_shape, False)
      tf.initialize_all_variables().run()
      out = sess.run(outputs, feed_dict={inputs[0]: input_value})
      out, s_fw, s_bw = sess.run([outputs, state_fw, state_bw],
                                 feed_dict={inputs[0]: input_value})

    # Since the forward and backward LSTM cells were initialized with the
    # same parameters, the forward and backward output has to be the same,
@@ -861,6 +867,9 @@ class BidirectionalRNNTest(tf.test.TestCase):
      self.assertEqual(out[i][1][0], out[8 - 1 - i][1][3])
      self.assertEqual(out[i][1][1], out[8 - 1 - i][1][4])
      self.assertEqual(out[i][1][2], out[8 - 1 - i][1][5])
    # Via the reasoning above, the forward and backward final state should be
    # exactly the same
    self.assertAllClose(s_fw, s_bw)

  def testBidirectionalRNN(self):
    self._testBidirectionalRNN(use_gpu=False, use_shape=False)
@@ -495,6 +495,105 @@ class Seq2SeqTest(tf.test.TestCase):
        if len(perplexities[bucket]) > 1:  # Assert that perplexity went down.
          self.assertLess(perplexities[bucket][-1], perplexities[bucket][0])

  def testModelWithBooleanFeedPrevious(self):
    """Test the model behavior when feed_previous is True.

    For example, the following two cases have the same effect:
      - Train `embedding_rnn_seq2seq` with `feed_previous=True`, which contains
        a `embedding_rnn_decoder` with `feed_previous=True` and
        `update_embedding_for_previous=True`. The decoder is fed with "<Go>"
        and outputs "A, B, C".
      - Train `embedding_rnn_seq2seq` with `feed_previous=False`. The decoder
        is fed with "<Go>, A, B".
    """
    num_encoder_symbols = 3
    num_decoder_symbols = 5
    batch_size = 2
    num_enc_timesteps = 2
    num_dec_timesteps = 3

    def TestModel(seq2seq):
      with self.test_session(graph=tf.Graph()) as sess:
        tf.set_random_seed(111)
        random.seed(111)
        np.random.seed(111)

        enc_inp = [tf.constant(i + 1, tf.int32, shape=[batch_size])
                   for i in range(num_enc_timesteps)]
        dec_inp_fp_true = [tf.constant(i, tf.int32, shape=[batch_size])
                           for i in range(num_dec_timesteps)]
        dec_inp_holder_fp_false = [tf.placeholder(tf.int32, shape=[batch_size])
                                   for _ in range(num_dec_timesteps)]
        targets = [tf.constant(i + 1, tf.int32, shape=[batch_size])
                   for i in range(num_dec_timesteps)]
        weights = [tf.constant(1.0, shape=[batch_size])
                   for i in range(num_dec_timesteps)]

        def ForwardBackward(enc_inp, dec_inp, feed_previous):
          scope_name = "fp_{}".format(feed_previous)
          with tf.variable_scope(scope_name):
            dec_op, _ = seq2seq(enc_inp, dec_inp, feed_previous=feed_previous)
            net_variables = tf.get_collection(tf.GraphKeys.VARIABLES,
                                              scope_name)
          optimizer = tf.train.AdamOptimizer(0.03, epsilon=1e-5)
          update_op = optimizer.minimize(
              tf.nn.seq2seq.sequence_loss(dec_op, targets, weights),
              var_list=net_variables)
          return dec_op, update_op, net_variables

        dec_op_fp_true, update_fp_true, variables_fp_true = ForwardBackward(
            enc_inp, dec_inp_fp_true, feed_previous=True)
        dec_op_fp_false, update_fp_false, variables_fp_false = ForwardBackward(
            enc_inp, dec_inp_holder_fp_false, feed_previous=False)

        sess.run(tf.initialize_all_variables())

        # We only check consistencies between the variables existing in both
        # the models with True and False feed_previous. Variables created by
        # the loop_function in the model with True feed_previous are ignored.
        v_false_name_dict = {v.name.split('/', 1)[-1]: v
                             for v in variables_fp_false}
        matched_variables = [(v, v_false_name_dict[v.name.split('/', 1)[-1]])
                             for v in variables_fp_true]
        for v_true, v_false in matched_variables:
          sess.run(tf.assign(v_false, v_true))

        # Take the symbols generated by the decoder with feed_previous=True as
        # the true input symbols for the decoder with feed_previous=False.
        dec_fp_true = sess.run(dec_op_fp_true)
        output_symbols_fp_true = np.argmax(dec_fp_true, axis=2)
        dec_inp_fp_false = np.vstack((dec_inp_fp_true[0].eval(),
                                      output_symbols_fp_true[:-1]))
        sess.run(update_fp_true)
        sess.run(update_fp_false,
                 {holder: inp for holder, inp in zip(dec_inp_holder_fp_false,
                                                     dec_inp_fp_false)})

        for v_true, v_false in matched_variables:
          self.assertAllClose(v_true.eval(), v_false.eval())

    def EmbeddingRNNSeq2SeqF(enc_inp, dec_inp, feed_previous):
      cell = tf.nn.rnn_cell.BasicLSTMCell(2)
      return tf.nn.seq2seq.embedding_rnn_seq2seq(
          enc_inp, dec_inp, cell, num_encoder_symbols,
          num_decoder_symbols, feed_previous=feed_previous)

    def EmbeddingTiedRNNSeq2Seq(enc_inp, dec_inp, feed_previous):
      cell = tf.nn.rnn_cell.BasicLSTMCell(2)
      return tf.nn.seq2seq.embedding_tied_rnn_seq2seq(
          enc_inp, dec_inp, cell, num_decoder_symbols,
          feed_previous=feed_previous)

    def EmbeddingAttentionSeq2Seq(enc_inp, dec_inp, feed_previous):
      cell = tf.nn.rnn_cell.BasicLSTMCell(2)
      return tf.nn.seq2seq.embedding_attention_seq2seq(
          enc_inp, dec_inp, cell, num_encoder_symbols,
          num_decoder_symbols, feed_previous=feed_previous)

    for model in (EmbeddingRNNSeq2SeqF, EmbeddingTiedRNNSeq2Seq,
                  EmbeddingAttentionSeq2Seq):
      TestModel(model)


if __name__ == "__main__":
  tf.test.main()

71  tensorflow/python/kernel_tests/trace_op_test.py  Normal file
@@ -0,0 +1,71 @@
# Copyright 2015 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy
import tensorflow as tf


class TraceTest(tf.test.TestCase):

  def setUp(self):
    numpy.random.seed(0)

  def traceOp(self, x, dtype, expected_ans, use_gpu=False):
    with self.test_session(use_gpu=use_gpu):
      tf_ans = tf.trace(x.astype(dtype))
      out = tf_ans.eval()
    self.assertAllClose(out, expected_ans)

  def testEmptyTensor(self):
    x = numpy.array([])
    self.assertRaises(ValueError, self.traceOp, x, numpy.float32, 0)

  def testRankOneTensor(self):
    x = numpy.array([1, 2, 3])
    self.assertRaises(ValueError, self.traceOp, x, numpy.float32, 0)

  def testRankTwoIntTensor(self):
    x = numpy.array(
        [[1, 0, 0],
         [0, 2, 0],
         [0, 0, 3]])
    expected_ans = 6
    self.traceOp(x, numpy.int32, expected_ans)
    self.traceOp(x, numpy.int64, expected_ans)

  def testRankTwoFloatTensor(self):
    x = numpy.array(
        [[1.1, 0, 0],
         [0, 2.2, 0],
         [0, 0, 3.3]])
    expected_ans = 6.6
    self.traceOp(x, numpy.float32, expected_ans)
    self.traceOp(x, numpy.float64, expected_ans)

  def testRankThreeFloatTensor(self):
    x = numpy.random.rand(2, 2, 2)
    self.assertRaises(ValueError, self.traceOp, x, numpy.float32, 0)

  def testRankFourFloatTensor(self):
    x = numpy.random.rand(2, 2, 2, 2)
    self.assertRaises(ValueError, self.traceOp, x, numpy.float32, 0)


if __name__ == "__main__":
  tf.test.main()

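The new `tf.trace` op tested above behaves like `numpy.trace`: it sums the diagonal of a rank-2 tensor and rejects non-matrix inputs. A minimal NumPy sketch of the same cases:

```python
import numpy as np

# Trace of a rank-2 tensor: the sum of its diagonal entries.
x = np.array([[1.1, 0.0, 0.0],
              [0.0, 2.2, 0.0],
              [0.0, 0.0, 3.3]])
t = np.trace(x)
print(abs(t - 6.6) < 1e-9)  # True (equal up to float rounding)
```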
@@ -846,6 +846,35 @@ def _DiagShape(op):
  input_shape = op.inputs[0].get_shape().with_rank_at_most(3)
  return [input_shape.concatenate(input_shape)]

@ops.RegisterShape("DiagPart")
def _DiagPartShape(op):
  """Shape function for array_ops.diag_part.

  This op has one input (of rank k = 2, 4, or 6), and one output (of rank k/2),
  where the shape of the output is the diagonal of the input shape.

  Args:
    op: A DiagPart Operation.

  Returns:
    A single-element list containing the shape of the output.

  Raises:
    ValueError: If input has odd rank or rank greater than 6.
  """
  shape = op.inputs[0].get_shape()
  rank = len(shape)
  mid = rank // 2
  if rank % 2 or rank > 6:
    raise ValueError("Input must have even rank <= 6, input rank is " +
                     str(rank) + ".")
  if shape[:mid] != shape[mid:]:
    raise ValueError("Invalid shape, shape[:mid] " + str(shape[:mid]) +
                     " and shape[mid:] " + str(shape[mid:]) +
                     " do not match.")
  input_shape = shape.with_rank_at_most(6)
  return [input_shape[:len(input_shape) // 2]]

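The shape rule implemented by `_DiagPartShape` above can be sketched in pure Python: the input rank must be even and at most 6, the two halves of the shape must match, and the output shape is the first half.

```python
# Pure-Python sketch of the DiagPart shape rule (a stand-in for the
# TensorShape-based implementation above).
def diag_part_shape(shape):
    rank = len(shape)
    mid = rank // 2
    if rank % 2 or rank > 6:
        raise ValueError("Input must have even rank <= 6, input rank is %d."
                         % rank)
    if shape[:mid] != shape[mid:]:
        raise ValueError("Invalid shape: halves %s and %s do not match."
                         % (shape[:mid], shape[mid:]))
    return shape[:mid]

print(diag_part_shape((2, 3, 2, 3)))  # (2, 3)
```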
@ops.RegisterShape("ExpandDims")
def _ExpandDimsShape(op):
@@ -1360,7 +1389,7 @@ def _SpaceToDepthShape(op):
  * input: a tensor of shape like that [B, H, W, D]
  * block_size: an int.

  Its output is the the same-rank tensor but with changed
  Its output is the same-rank tensor but with changed
  dimensions like that: [B, H/block_size, W/block_size, D*block_size*block_size]

  Args:
@@ -1408,7 +1437,7 @@ def _DepthToSpaceShape(op):
  * input: a tensor of shape like that [B, H, W, D]
  * block_size: an int.

  Its output is the the same-rank tensor but with changed
  Its output is the same-rank tensor but with changed
  dimensions like that:
  [B, H*block_size, W*block_size, D/(block_size*block_size)]

@@ -308,6 +308,7 @@ def flip_left_right(image):
  Raises:
    ValueError: if the shape of `image` is not supported.
  """
  image = ops.convert_to_tensor(image, name='image')
  _Check3DImage(image, require_static=False)
  return array_ops.reverse(image, [False, True, False])

@@ -329,6 +330,7 @@ def flip_up_down(image):
  Raises:
    ValueError: if the shape of `image` is not supported.
  """
  image = ops.convert_to_tensor(image, name='image')
  _Check3DImage(image, require_static=False)
  return array_ops.reverse(image, [True, False, False])

@ -741,7 +741,14 @@ class ResizeImagesTest(test_util.TensorFlowTestCase):
|
||||
image_ops.ResizeMethod.AREA]
|
||||
|
||||
TYPES = [np.uint8, np.int8, np.int16, np.int32, np.int64,
|
||||
np.float, np.double]
|
||||
np.float32, np.float64]
|
||||
|
||||
def availableGPUModes(self, opt, nptype):
|
||||
if opt == image_ops.ResizeMethod.NEAREST_NEIGHBOR \
|
||||
and nptype in [np.float32, np.float64]:
|
||||
return [True, False]
|
||||
else:
|
||||
return [False]
|
||||
|
||||
def testNoOp(self):
|
||||
img_shape = [1, 6, 4, 1]
|
||||
@ -761,13 +768,14 @@ class ResizeImagesTest(test_util.TensorFlowTestCase):
|
||||
img_np = np.array(data, dtype=nptype).reshape(img_shape)
|
||||
|
||||
for opt in self.OPTIONS:
|
||||
with self.test_session() as sess:
|
||||
image = constant_op.constant(img_np, shape=img_shape)
|
||||
y = image_ops.resize_images(image, target_height, target_width, opt)
|
||||
yshape = array_ops.shape(y)
|
||||
resized, newshape = sess.run([y, yshape])
|
||||
self.assertAllEqual(img_shape, newshape)
|
||||
self.assertAllClose(resized, img_np, atol=1e-5)
|
||||
for use_gpu in self.availableGPUModes(opt, nptype):
|
||||
with self.test_session(use_gpu=use_gpu) as sess:
|
||||
image = constant_op.constant(img_np, shape=img_shape)
|
||||
y = image_ops.resize_images(image, target_height, target_width, opt)
|
||||
yshape = array_ops.shape(y)
|
||||
resized, newshape = sess.run([y, yshape])
|
||||
self.assertAllEqual(img_shape, newshape)
self.assertAllClose(resized, img_np, atol=1e-5)

# Resizing with a single image must leave the shape unchanged also.

with self.test_session():
@@ -857,12 +865,13 @@ class ResizeImagesTest(test_util.TensorFlowTestCase):
img_np = np.array(data, dtype=nptype).reshape(img_shape)

for opt in self.OPTIONS:
with self.test_session():
image = constant_op.constant(img_np, shape=img_shape)
y = image_ops.resize_images(image, target_height, target_width, opt)
expected = np.array(expected_data).reshape(target_shape)
resized = y.eval()
self.assertAllClose(resized, expected, atol=1e-5)
for use_gpu in self.availableGPUModes(opt, nptype):
with self.test_session(use_gpu=use_gpu):
image = constant_op.constant(img_np, shape=img_shape)
y = image_ops.resize_images(image, target_height, target_width, opt)
expected = np.array(expected_data).reshape(target_shape)
resized = y.eval()
self.assertAllClose(resized, expected, atol=1e-5)

def testResizeUp(self):
img_shape = [1, 3, 2, 1]
@@ -899,14 +908,15 @@ class ResizeImagesTest(test_util.TensorFlowTestCase):
image_ops.ResizeMethod.BILINEAR,
image_ops.ResizeMethod.NEAREST_NEIGHBOR,
image_ops.ResizeMethod.AREA]:
with self.test_session():
img_np = np.array(data, dtype=nptype).reshape(img_shape)
image = constant_op.constant(img_np, shape=img_shape)
y = image_ops.resize_images(image, target_height, target_width, opt)
resized = y.eval()
expected = np.array(expected_data[opt]).reshape(
[1, target_height, target_width, 1])
self.assertAllClose(resized, expected, atol=1e-05)
for use_gpu in self.availableGPUModes(opt, nptype):
with self.test_session(use_gpu=use_gpu):
img_np = np.array(data, dtype=nptype).reshape(img_shape)
image = constant_op.constant(img_np, shape=img_shape)
y = image_ops.resize_images(image, target_height, target_width, opt)
resized = y.eval()
expected = np.array(expected_data[opt]).reshape(
[1, target_height, target_width, 1])
self.assertAllClose(resized, expected, atol=1e-05)

def testResizeUpBicubic(self):
img_shape = [1, 6, 6, 1]
@@ -964,6 +974,28 @@ class ResizeImagesTest(test_util.TensorFlowTestCase):
self.assertAllClose(resized, expected, atol=1)


def testCompareNearestNeighbor(self):
input_shape = [1, 5, 6, 3]
target_height = 8
target_width = 12
for nptype in [np.float32, np.float64]:
for align_corners in [True, False]:
img_np = np.arange(0, np.prod(input_shape), dtype=nptype).reshape(input_shape)
with self.test_session(use_gpu=True):
image = constant_op.constant(img_np, shape=input_shape)
out_op = image_ops.resize_images(image, target_height, target_width,
image_ops.ResizeMethod.NEAREST_NEIGHBOR,
align_corners=align_corners)
gpu_val = out_op.eval()
with self.test_session(use_gpu=False):
image = constant_op.constant(img_np, shape=input_shape)
out_op = image_ops.resize_images(image, target_height, target_width,
image_ops.ResizeMethod.NEAREST_NEIGHBOR,
align_corners=align_corners)
cpu_val = out_op.eval()
self.assertAllClose(cpu_val, gpu_val, rtol=1e-5, atol=1e-5)


class ResizeImageWithCropOrPadTest(test_util.TensorFlowTestCase):

def _ResizeImageWithCropOrPad(self, original, original_shape,
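The new `testCompareNearestNeighbor` above checks that the CPU and GPU nearest-neighbor kernels produce matching results. As a minimal NumPy sketch of nearest-neighbor sampling semantics, including an `align_corners`-style variant (the exact rounding rule of the TF kernels is an assumption here, and `resize_nearest_neighbor` is an illustrative helper, not the op itself):

```python
import numpy as np

def resize_nearest_neighbor(img, out_h, out_w, align_corners=False):
    """Nearest-neighbor resize of a [H, W] or [H, W, C] array (sketch)."""
    in_h, in_w = img.shape[:2]
    if align_corners and out_h > 1 and out_w > 1:
        # Map the corner pixels of input and output onto each other exactly.
        rows = np.rint(np.arange(out_h) * (in_h - 1) / (out_h - 1)).astype(int)
        cols = np.rint(np.arange(out_w) * (in_w - 1) / (out_w - 1)).astype(int)
    else:
        # Plain scale-and-floor sampling.
        rows = np.floor(np.arange(out_h) * in_h / out_h).astype(int)
        cols = np.floor(np.arange(out_w) * in_w / out_w).astype(int)
    rows = np.clip(rows, 0, in_h - 1)
    cols = np.clip(cols, 0, in_w - 1)
    return img[rows[:, None], cols[None, :]]
```

With this, the corner pixels of a 2x2 input survive an `align_corners=True` upscale to 4x4 unchanged, which is the property the GPU/CPU comparison relies on.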
@@ -63,6 +63,8 @@ TensorFlow provides several operations that you can use to add basic
mathematical functions for matrices to your graph.

@@diag
@@diag_part
@@trace
@@transpose

@@matmul
@@ -921,6 +923,39 @@ def reduce_any(input_tensor, reduction_indices=None, keep_dims=False,
keep_dims, name=name)


def trace(x, name=None):
"""Compute the trace of a tensor `x`.

`trace(x)` returns the sum of the elements along the main diagonal.

For example:

```python
# 'x' is [[1, 1],
#         [1, 1]]
tf.trace(x) ==> 2

# 'x' is [[1, 2, 3],
#         [4, 5, 6],
#         [7, 8, 9]]
tf.trace(x) ==> 15
```

Args:
x: 2-D tensor.
name: A name for the operation (optional).

Returns:
The trace of the input tensor.
"""
with ops.op_scope([x], name, "Trace") as name:
x = ops.convert_to_tensor(x, name="x")
if len(x.get_shape()) != 2:
raise ValueError("Expected a tensor with rank 2, rank %d tensor received"
% len(x.get_shape()))
return reduce_sum(array_ops.diag_part(x), name=name)


def matmul(a, b,
transpose_a=False, transpose_b=False,
a_is_sparse=False, b_is_sparse=False,
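The `trace` op added above is just a reduce-sum over the extracted diagonal. A NumPy sketch of the same semantics, including the rank-2 check:

```python
import numpy as np

def trace(x):
    """Sum of the main-diagonal entries of a rank-2 array (NumPy sketch)."""
    x = np.asarray(x)
    if x.ndim != 2:
        raise ValueError("Expected a tensor with rank 2, rank %d tensor received"
                         % x.ndim)
    # Equivalent to diag_part followed by reduce_sum in the diff above.
    return np.diagonal(x).sum()
```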
@@ -194,7 +194,7 @@ def softmax_cross_entropy_with_logits(logits, labels, name=None):
example, each CIFAR-10 image is labeled with one and only one label: an image
can be a dog or a truck, but not both.

**NOTE:**: While the classes are mutually exclusive, their probabilities
**NOTE:** While the classes are mutually exclusive, their probabilities
need not be. All that is required is that each row of `labels` is
a valid probability distribution. If using exclusive `labels`
(wherein one and only one class is true at a time), see
@@ -231,7 +231,7 @@ def sparse_softmax_cross_entropy_with_logits(logits, labels, name=None):
example, each CIFAR-10 image is labeled with one and only one label: an image
can be a dog or a truck, but not both.

**NOTE:**: For this operation, the probability of a given label is considered
**NOTE:** For this operation, the probability of a given label is considered
exclusive. That is, soft classes are not allowed, and the `labels` vector
must provide a single specific index for the true class for each row of
`logits` (each minibatch entry). For soft softmax classification with
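The two NOTE fixes above hinge on the distinction between soft (full-distribution) labels and exclusive index labels. A NumPy sketch of the two variants (illustrative helpers, not the fused TF kernels):

```python
import numpy as np

def softmax_xent(logits, labels):
    """Cross entropy per row against a full (possibly soft) label distribution."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(labels * log_probs).sum(axis=1)

def sparse_softmax_xent(logits, label_ids):
    """Cross entropy when each row has exactly one true class index."""
    one_hot = np.eye(logits.shape[1])[label_ids]  # expand index to one-hot row
    return softmax_xent(logits, one_hot)
```

The sparse variant is just the dense one restricted to one-hot rows, which is why the docstrings point users with exclusive labels at the sparse op.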
@@ -312,9 +312,11 @@ def bidirectional_rnn(cell_fw, cell_bw, inputs,
scope: VariableScope for the created subgraph; defaults to "BiRNN"

Returns:
A set of output `Tensors` where:
A tuple (outputs, output_state_fw, output_state_bw) where:
outputs is a length T list of outputs (one for each input), which
are depth-concatenated forward and backward outputs
output_state_fw is the final state of the forward rnn
output_state_bw is the final state of the backward rnn

Raises:
TypeError: If "cell_fw" or "cell_bw" is not an instance of RNNCell.
@@ -333,19 +335,19 @@ def bidirectional_rnn(cell_fw, cell_bw, inputs,
name = scope or "BiRNN"
# Forward direction
with vs.variable_scope(name + "_FW") as fw_scope:
output_fw, _ = rnn(cell_fw, inputs, initial_state_fw, dtype,
output_fw, output_state_fw = rnn(cell_fw, inputs, initial_state_fw, dtype,
sequence_length, scope=fw_scope)

# Backward direction
with vs.variable_scope(name + "_BW") as bw_scope:
tmp, _ = rnn(cell_bw, _reverse_seq(inputs, sequence_length),
tmp, output_state_bw = rnn(cell_bw, _reverse_seq(inputs, sequence_length),
initial_state_bw, dtype, sequence_length, scope=bw_scope)
output_bw = _reverse_seq(tmp, sequence_length)
# Concat each of the forward/backward outputs
outputs = [array_ops.concat(1, [fw, bw])
for fw, bw in zip(output_fw, output_bw)]

return outputs
return (outputs, output_state_fw, output_state_bw)


def dynamic_rnn(cell, inputs, sequence_length=None, initial_state=None,
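The change above makes `bidirectional_rnn` return the final states alongside the per-step outputs. The new (outputs, output_state_fw, output_state_bw) contract can be sketched with a NumPy stand-in RNN (the cumulative-sum "cell" is purely an assumption for shape demonstration; only the reverse/run/reverse-back and depth-concat structure mirrors the diff):

```python
import numpy as np

def fake_rnn(inputs):
    """Stand-in RNN: state is a running sum over time steps (demo only)."""
    outputs, state = [], np.zeros_like(inputs[0])
    for x in inputs:
        state = state + x
        outputs.append(state.copy())
    return outputs, state

def bidirectional(inputs):
    # Forward direction.
    output_fw, output_state_fw = fake_rnn(inputs)
    # Backward direction: reverse inputs, run, then reverse outputs back.
    tmp, output_state_bw = fake_rnn(inputs[::-1])
    output_bw = tmp[::-1]
    # Depth-concatenate each step's forward and backward outputs.
    outputs = [np.concatenate([fw, bw], axis=-1)
               for fw, bw in zip(output_fw, output_bw)]
    return outputs, output_state_fw, output_state_bw
```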
@@ -73,6 +73,34 @@ from tensorflow.python.ops import rnn_cell
from tensorflow.python.ops import variable_scope


def _extract_argmax_and_embed(embedding, output_projection=None,
update_embedding=True):
"""Get a loop_function that extracts the previous symbol and embeds it.

Args:
embedding: embedding tensor for symbols.
output_projection: None or a pair (W, B). If provided, each fed previous
output will first be multiplied by W and have B added.
update_embedding: Boolean; if False, the gradients will not propagate
through the embeddings.

Returns:
A loop function.
"""
def loop_function(prev, _):
if output_projection is not None:
prev = nn_ops.xw_plus_b(
prev, output_projection[0], output_projection[1])
prev_symbol = math_ops.argmax(prev, 1)
# Note that gradients will not propagate through the second parameter of
# embedding_lookup.
emb_prev = embedding_ops.embedding_lookup(embedding, prev_symbol)
if not update_embedding:
emb_prev = array_ops.stop_gradient(emb_prev)
return emb_prev
return loop_function


def rnn_decoder(decoder_inputs, initial_state, cell, loop_function=None,
scope=None):
"""RNN decoder for the sequence-to-sequence model.
@@ -107,14 +135,13 @@ def rnn_decoder(decoder_inputs, initial_state, cell, loop_function=None,
for i, inp in enumerate(decoder_inputs):
if loop_function is not None and prev is not None:
with variable_scope.variable_scope("loop_function", reuse=True):
# We do not propagate gradients over the loop function.
inp = array_ops.stop_gradient(loop_function(prev, i))
inp = loop_function(prev, i)
if i > 0:
variable_scope.get_variable_scope().reuse_variables()
output, state = cell(inp, state)
outputs.append(output)
if loop_function is not None:
prev = array_ops.stop_gradient(output)
prev = output
return outputs, state


@@ -182,7 +209,7 @@ def tied_rnn_seq2seq(encoder_inputs, decoder_inputs, cell,

def embedding_rnn_decoder(decoder_inputs, initial_state, cell, num_symbols,
output_projection=None, feed_previous=False,
scope=None):
update_embedding_for_previous=True, scope=None):
"""RNN decoder with embedding and a pure-decoding option.

Args:
@@ -200,6 +227,11 @@ def embedding_rnn_decoder(decoder_inputs, initial_state, cell, num_symbols,
In effect, this implements a greedy decoder. It can also be used
during training to emulate http://arxiv.org/abs/1506.03099.
If False, decoder_inputs are used as given (the standard decoder case).
update_embedding_for_previous: Boolean; if False and feed_previous=True,
only the embedding for the first symbol of decoder_inputs (the "GO"
symbol) will be updated by back propagation. Embeddings for the symbols
generated from the decoder itself remain unchanged. This parameter has
no effect if feed_previous=False.
scope: VariableScope for the created subgraph; defaults to
"embedding_rnn_decoder".

@@ -227,16 +259,9 @@ def embedding_rnn_decoder(decoder_inputs, initial_state, cell, num_symbols,
with ops.device("/cpu:0"):
embedding = variable_scope.get_variable("embedding",
[num_symbols, cell.input_size])

def extract_argmax_and_embed(prev, _):
"""Loop_function that extracts the symbol from prev and embeds it."""
if output_projection is not None:
prev = nn_ops.xw_plus_b(
prev, output_projection[0], output_projection[1])
prev_symbol = array_ops.stop_gradient(math_ops.argmax(prev, 1))
return embedding_ops.embedding_lookup(embedding, prev_symbol)

loop_function = extract_argmax_and_embed if feed_previous else None
loop_function = _extract_argmax_and_embed(
embedding, output_projection,
update_embedding_for_previous) if feed_previous else None
emb_inp = (
embedding_ops.embedding_lookup(embedding, i) for i in decoder_inputs)
return rnn_decoder(emb_inp, initial_state, cell,
@@ -306,7 +331,8 @@ def embedding_rnn_seq2seq(encoder_inputs, decoder_inputs, cell,
outputs, state = embedding_rnn_decoder(
decoder_inputs, encoder_state, cell, num_decoder_symbols,
output_projection=output_projection,
feed_previous=feed_previous_bool)
feed_previous=feed_previous_bool,
update_embedding_for_previous=False)
return outputs + [state]

outputs_and_state = control_flow_ops.cond(feed_previous,
@@ -372,25 +398,19 @@ def embedding_tied_rnn_seq2seq(encoder_inputs, decoder_inputs, cell,
emb_decoder_inputs = [embedding_ops.embedding_lookup(embedding, x)
for x in decoder_inputs]

def extract_argmax_and_embed(prev, _):
"""Loop_function that extracts the symbol from prev and embeds it."""
if output_projection is not None:
prev = nn_ops.xw_plus_b(
prev, output_projection[0], output_projection[1])
prev_symbol = array_ops.stop_gradient(math_ops.argmax(prev, 1))
return embedding_ops.embedding_lookup(embedding, prev_symbol)

if output_projection is None:
cell = rnn_cell.OutputProjectionWrapper(cell, num_symbols)

if isinstance(feed_previous, bool):
loop_function = extract_argmax_and_embed if feed_previous else None
loop_function = _extract_argmax_and_embed(
embedding, output_projection, True) if feed_previous else None
return tied_rnn_seq2seq(emb_encoder_inputs, emb_decoder_inputs, cell,
loop_function=loop_function, dtype=dtype)

# If feed_previous is a Tensor, we construct 2 graphs and use cond.
def decoder(feed_previous_bool):
loop_function = extract_argmax_and_embed if feed_previous_bool else None
loop_function = _extract_argmax_and_embed(
embedding, output_projection, False) if feed_previous_bool else None
reuse = None if feed_previous_bool else True
with variable_scope.variable_scope(variable_scope.get_variable_scope(),
reuse=reuse):
@@ -523,7 +543,7 @@ def attention_decoder(decoder_inputs, initial_state, attention_states, cell,
# If loop_function is set, we use it instead of decoder_inputs.
if loop_function is not None and prev is not None:
with variable_scope.variable_scope("loop_function", reuse=True):
inp = array_ops.stop_gradient(loop_function(prev, i))
inp = loop_function(prev, i)
# Merge input and previous attentions into one vector of the right size.
x = rnn_cell.linear([inp] + attns, cell.input_size, True)
# Run the RNN.
@@ -539,8 +559,7 @@ def attention_decoder(decoder_inputs, initial_state, attention_states, cell,
with variable_scope.variable_scope("AttnOutputProjection"):
output = rnn_cell.linear([cell_output] + attns, output_size, True)
if loop_function is not None:
# We do not propagate gradients over the loop function.
prev = array_ops.stop_gradient(output)
prev = output
outputs.append(output)

return outputs, state
@@ -549,8 +568,10 @@ def attention_decoder(decoder_inputs, initial_state, attention_states, cell,

def embedding_attention_decoder(decoder_inputs, initial_state, attention_states,
cell, num_symbols, num_heads=1,
output_size=None, output_projection=None,
feed_previous=False, dtype=dtypes.float32,
scope=None, initial_state_attention=False):
feed_previous=False,
update_embedding_for_previous=True,
dtype=dtypes.float32, scope=None,
initial_state_attention=False):
"""RNN decoder with embedding and attention and a pure-decoding option.

Args:
@@ -571,6 +592,11 @@ def embedding_attention_decoder(decoder_inputs, initial_state, attention_states,
In effect, this implements a greedy decoder. It can also be used
during training to emulate http://arxiv.org/abs/1506.03099.
If False, decoder_inputs are used as given (the standard decoder case).
update_embedding_for_previous: Boolean; if False and feed_previous=True,
only the embedding for the first symbol of decoder_inputs (the "GO"
symbol) will be updated by back propagation. Embeddings for the symbols
generated from the decoder itself remain unchanged. This parameter has
no effect if feed_previous=False.
dtype: The dtype to use for the RNN initial states (default: tf.float32).
scope: VariableScope for the created subgraph; defaults to
"embedding_attention_decoder".
@@ -602,17 +628,9 @@ def embedding_attention_decoder(decoder_inputs, initial_state, attention_states,
with ops.device("/cpu:0"):
embedding = variable_scope.get_variable("embedding",
[num_symbols, cell.input_size])

def extract_argmax_and_embed(prev, _):
"""Loop_function that extracts the symbol from prev and embeds it."""
if output_projection is not None:
prev = nn_ops.xw_plus_b(
prev, output_projection[0], output_projection[1])
prev_symbol = array_ops.stop_gradient(math_ops.argmax(prev, 1))
emb_prev = embedding_ops.embedding_lookup(embedding, prev_symbol)
return emb_prev

loop_function = extract_argmax_and_embed if feed_previous else None
loop_function = _extract_argmax_and_embed(
embedding, output_projection,
update_embedding_for_previous) if feed_previous else None
emb_inp = [
embedding_ops.embedding_lookup(embedding, i) for i in decoder_inputs]
return attention_decoder(
@@ -700,6 +718,7 @@ def embedding_attention_seq2seq(encoder_inputs, decoder_inputs, cell,
num_decoder_symbols, num_heads=num_heads, output_size=output_size,
output_projection=output_projection,
feed_previous=feed_previous_bool,
update_embedding_for_previous=False,
initial_state_attention=initial_state_attention)
return outputs + [state]
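The refactor above replaces three local copies of `extract_argmax_and_embed` with one shared `_extract_argmax_and_embed`. A NumPy sketch of the greedy loop function it returns (the `stop_gradient` and variable-scope behavior are TF-specific and omitted; the names here are illustrative):

```python
import numpy as np

def extract_argmax_and_embed(embedding, output_projection=None):
    """Return a loop function: project prev output, take argmax, embed it."""
    def loop_function(prev, _i):
        if output_projection is not None:
            W, b = output_projection          # prev is multiplied by W, then b added
            prev = prev @ W + b
        prev_symbol = np.argmax(prev, axis=1)  # greedy symbol choice per row
        return embedding[prev_symbol]          # embedding lookup
    return loop_function
```

Feeding each step's embedded argmax back in as the next input is what makes this a greedy decoder when `feed_previous=True`.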
@@ -248,7 +248,7 @@ class _Nulllocker(object):


def Exists(path):  # pylint: disable=invalid-name
"""Retruns True iff "path" exists (as a dir, file, non-broken symlink)."""
"""Returns True iff "path" exists (as a dir, file, non-broken symlink)."""
return os.path.exists(path)
@@ -50,9 +50,11 @@ def exponential_decay(learning_rate, global_step, decay_steps, decay_rate,
starter_learning_rate = 0.1
learning_rate = tf.train.exponential_decay(starter_learning_rate, global_step,
100000, 0.96, staircase=True)
optimizer = tf.GradientDescentOptimizer(learning_rate)
# Passing global_step to minimize() will increment it at each step.
optimizer.minimize(...my loss..., global_step=global_step)
learning_step = (
tf.GradientDescentOptimizer(learning_rate)
.minimize(...my loss..., global_step=global_step)
)
```

Args:
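The docstring example above wires the decayed rate into the optimizer; the schedule itself is simple enough to state in plain Python. A sketch of the documented formula, where `staircase=True` means the exponent uses integer division so the rate decays in discrete intervals:

```python
def exponential_decay(learning_rate, global_step, decay_steps, decay_rate,
                      staircase=False):
    """decayed_lr = learning_rate * decay_rate ** (global_step / decay_steps)."""
    p = global_step / decay_steps
    if staircase:
        p = int(p)  # truncate: the rate drops only at decay_steps boundaries
    return learning_rate * decay_rate ** p
```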
@@ -218,7 +218,7 @@ class ExponentialMovingAverageTest(tf.test.TestCase):
self.assertDeviceEqual("/job:dev_v0", ema.average(v0).device)
self.assertDeviceEqual("/job:dev_v1", ema.average(v1).device)
# However, the colocation property is maintained.
self.assertEqual(["loc:@v1"],
self.assertEqual([b"loc:@v1"],
ema.average(v1).op.colocation_groups())
self.assertDeviceEqual("/job:default", ema.average(tensor2).device)