Fix dependency bugs

Change: 116925769
This commit is contained in:
Eugene Brevdo 2016-03-10 17:18:30 -08:00 committed by TensorFlower Gardener
parent 64dd5b58d5
commit 56f1d64998
143 changed files with 4131 additions and 801 deletions
Changed paths:
ISSUE_TEMPLATE.md, README.md, RELEASE.md, configure
tensorflow/
    BUILD
    contrib
    core
    examples
    g3doc/
        api_docs/python
        get_started
        how_tos/
            image_retraining
            summaries_and_tensorboard
            tool_developers
        resources
        tutorials/
            deep_cnn
            mnist/download
            recurrent
            seq2seq
    models
    python

View File: ISSUE_TEMPLATE.md

@ -1,5 +1,11 @@
For bugs/issues, please fill in the following. The more information you
provide, the more likely we can help you.
GitHub issues are for bugs / installation problems / feature requests.
For general support from the community, see [StackOverflow](https://stackoverflow.com/questions/tagged/tensorflow).
To make bugs and feature requests easier to find and organize, we close issues
that are deemed out of scope for GitHub Issues and point people to StackOverflow.
For bugs or installation issues, please provide the following information.
The more information you provide, the more easily we will be able to offer
help and advice.
### Environment info
Operating System:

View File: README.md

@ -5,7 +5,7 @@
| **`Linux CPU`** | **`Linux GPU PIP`** | **`Mac OS CPU`** | **`Android`** |
|-------------------|----------------------|------------------|----------------|
| [![Build Status](http://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master)](http://ci.tensorflow.org/job/tensorflow-master) | [![Build Status](http://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-gpu_pip)](http://ci.tensorflow.org/job/tensorflow-master-gpu_pip) | [![Build Status](http://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-mac)](http://ci.tensorflow.org/job/tensorflow-master-mac) | [![Build Status](http://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-android)](http://ci.tensorflow.org/job/tensorflow-master-android) |
| [![Build Status](http://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-cpu)](http://ci.tensorflow.org/job/tensorflow-master-cpu) | [![Build Status](http://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-gpu_pip)](http://ci.tensorflow.org/job/tensorflow-master-gpu_pip) | [![Build Status](http://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-mac)](http://ci.tensorflow.org/job/tensorflow-master-mac) | [![Build Status](http://ci.tensorflow.org/buildStatus/icon?job=tensorflow-master-android)](http://ci.tensorflow.org/job/tensorflow-master-android) |
**TensorFlow** is an open source software library for numerical computation using
data flow graphs. Nodes in the graph represent mathematical operations, while
@ -27,7 +27,14 @@ tracking requests and bugs, but please see
and discussion.**
## Installation
*See [Download and Setup](tensorflow/g3doc/get_started/os_setup.md).*
*See [Download and Setup](tensorflow/g3doc/get_started/os_setup.md) for instructions on how to install our release binaries or how to build from source.*
People who are a little bit adventurous can also try our nightly binaries (a quick install note follows this list):
* Linux CPU only: [Python 2](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.7.1-cp27-none-linux_x86_64.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/)) / [Python 3](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.7.1-py3-none-any.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=cpu-slave/))
* Linux GPU: [Python 2](http://ci.tensorflow.org/view/Nightly/job/nigntly-matrix-linux-gpu/TF_BUILD_CONTAINER_TYPE=GPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=gpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.7.1-py2-none-any.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nigntly-matrix-linux-gpu/TF_BUILD_CONTAINER_TYPE=GPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=gpu-slave/)) / [Python 3](http://ci.tensorflow.org/view/Nightly/job/nigntly-matrix-linux-gpu/TF_BUILD_CONTAINER_TYPE=GPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=gpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.7.1-py3-none-any.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nigntly-matrix-linux-gpu/TF_BUILD_CONTAINER_TYPE=GPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=gpu-slave/))
* Mac CPU only: [Python 2](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=mac-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.7.1-py2-none-any.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=mac-slave/)) / [Python 3](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=mac-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.7.1-py3-none-any.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=mac-slave/))
* [Android](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-android/TF_BUILD_CONTAINER_TYPE=ANDROID,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=NO_PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=android-slave/lastSuccessfulBuild/artifact/bazel-out/local_linux/bin/tensorflow/examples/android/tensorflow_demo.apk) ([build history](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-android/TF_BUILD_CONTAINER_TYPE=ANDROID,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=NO_PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=android-slave/))
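Any of these nightly wheels can be installed directly from its URL (a quick sketch, assuming a working `pip`; pick the file that matches your platform and Python version): `pip install --upgrade <wheel URL>`.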
#### *Try your first TensorFlow program*
```python
@ -46,6 +53,9 @@ Hello, TensorFlow!
```
## For more information
* [TensorFlow website](http://tensorflow.org)
* [TensorFlow whitepaper](http://download.tensorflow.org/paper/whitepaper2015.pdf)
* [Tensorflow MOOC on Udacity] (https://www.udacity.com/course/deep-learning--ud730)
* [TensorFlow MOOC on Udacity](https://www.udacity.com/course/deep-learning--ud730)
The TensorFlow community has created amazing things with TensorFlow, please see the [resources section of tensorflow.org](https://www.tensorflow.org/versions/master/resources#community) for an incomplete list.

View File: RELEASE.md

@ -1,3 +1,20 @@
# Release 0.7.1
## Bug Fixes and Other Changes
* Added gfile.Open and gfile.Copy, used by input_data.py.
* Fixed Saver bug when MakeDirs tried to create empty directory.
* GPU pip wheels are built with CUDA 7.5 and cuDNN v4, making these versions
  required for the binary releases. Lower versions of CUDA/cuDNN can be
  supported by installing from source and setting the options during
  ./configure.
* Fix dataset encoding example for Python3 (@danijar)
* Fix pip installation by not packaging protobuf as part of the wheel;
  require protobuf 3.0.0b2.
* Fix Mac pip installation of numpy by requiring pip >= 1.10.1.
* Improvements and fixes to Docker image.
# Release 0.7.0
## Major Features and Improvements

View File: configure (vendored)

@ -99,12 +99,18 @@ while true; do
else
TF_CUDNN_EXT=".$TF_CUDNN_VERSION"
fi
if [ -e "$CUDNN_INSTALL_PATH/libcudnn.so${CUDNNEXT}" -o -e "$CUDNN_INSTALL_PATH/lib64/libcudnn.so${TF_CUDNN_EXT}" ]; then
if [ -e "$CUDNN_INSTALL_PATH/libcudnn.so${TF_CUDNN_EXT}" -o -e "$CUDNN_INSTALL_PATH/lib64/libcudnn.so${TF_CUDNN_EXT}" ]; then
break
fi
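# Fallback added below: query the dynamic linker cache for libcudnn.so and,
# if the versioned library is found there, retry with that directory as the
# cuDNN install path.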
CUDNN_PATH_FROM_LDCONFIG="$(ldconfig -p | sed -n 's/.*libcudnn.so .* => \(.*\)/\1/p')"
if [ -e "${CUDNN_PATH_FROM_LDCONFIG}${TF_CUDNN_EXT}" ]; then
CUDNN_INSTALL_PATH="$(dirname ${CUDNN_PATH_FROM_LDCONFIG})"
break
fi
echo "Invalid path to cuDNN ${TF_CUDNN_VERSION} toolkit. Neither of the following two files can be found:"
echo "$CUDNN_INSTALL_PATH/lib64/libcudnn.so${TF_CUDNN_EXT}"
echo "$CUDNN_INSTALL_PATH/libcudnn.so${TF_CUDNN_EXT}"
echo "${CUDNN_PATH_FROM_LDCONFIG}${TF_CUDNN_EXT}"
if [ -z "$fromuser" ]; then
exit 1
fi

View File: tensorflow/BUILD

@ -54,6 +54,15 @@ cc_binary(
],
)
cc_binary(
name = "libtensorflow_cc.so",
linkshared = 1,
deps = [
"//tensorflow/cc:cc_ops",
"//tensorflow/core:tensorflow",
],
)
py_library(
name = "tensorflow_py",
srcs = ["__init__.py"],

View File: tensorflow/contrib/cmake/CMakeLists.txt (new file)

@ -0,0 +1,62 @@
# Minimum CMake required
cmake_minimum_required(VERSION 2.8)
# Project
project(tensorflow C CXX)
# Actual source is the ../../.. directory
get_filename_component(tf_contrib_source_dir ${tensorflow_SOURCE_DIR} PATH)
get_filename_component(tf_tf_source_dir ${tf_contrib_source_dir} PATH)
get_filename_component(tensorflow_source_dir ${tf_tf_source_dir} PATH)
# [CLEANUP] Not sure if this is needed (copied from Protobuf)
# CMake policies
cmake_policy(SET CMP0022 NEW)
# Options
option(tensorflow_VERBOSE "Enable for verbose output" OFF)
option(tensorflow_BUILD_TESTS "Build tests" ON)
#Threads: defines CMAKE_THREAD_LIBS_INIT and adds -pthread compile option for
# targets that link ${CMAKE_THREAD_LIBS_INIT}.
find_package (Threads)
# [CLEANUP] Remove when done
# For debugging
function(SHOW_VARIABLES)
get_cmake_property(_variableNames VARIABLES)
foreach (_variableName ${_variableNames})
message(STATUS "${_variableName}=${${_variableName}}")
endforeach()
endfunction()
# External dependencies
set(CMAKE_MODULE_PATH ${PROJECT_SOURCE_DIR}/external)
# Location where external projects will be downloaded
set (DOWNLOAD_LOCATION "${CMAKE_CURRENT_BINARY_DIR}/downloads"
CACHE PATH "Location where external projects will be downloaded.")
mark_as_advanced(DOWNLOAD_LOCATION)
# External dependencies
include(png)
include(jpeg)
include(re2)
include(eigen)
# Let's get to work!
include(tf_core_framework.cmake)
include(tf_stream_executor.cmake)
include(tf_core_cpu.cmake)
include(tf_models.cmake)
include(tf_core_ops.cmake)
include(tf_core_direct_session.cmake)
include(tf_core_kernels.cmake)
include(tf_cc_ops.cmake)
include(tf_tutorials.cmake)
if (tensorflow_BUILD_TESTS)
include(tests.cmake)
endif (tensorflow_BUILD_TESTS)
include(install.cmake)
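As a rough sketch of how this file is consumed (paths illustrative; the README below walks through the full Windows workflow), configuration is done out of source:

    mkdir build & cd build
    cmake <path-to>\tensorflow\contrib\cmake
    cmake --build .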

View File: tensorflow/contrib/cmake/README.md (new file)

@ -0,0 +1,257 @@
This directory contains *CMake* files that can be used to build the TensorFlow
core library.
You need to have [CMake](http://www.cmake.org) and [Git](http://git-scm.com)
installed on your computer before proceeding.
Most of the instructions below are given for the *Command Prompt*, but the same
actions can be performed using the appropriate GUI tools.
Environment Setup
=================
Open the appropriate *Command Prompt* from the *Start* menu.
For example, the *VS2013 x64 Native Tools Command Prompt*:
C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\amd64>
Change to your working directory:
C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\amd64>cd C:\Path\to
C:\Path\to>
Where *C:\Path\to* is the path to your real working directory.
Create a folder where the TensorFlow headers, libraries, and binaries will be
installed after they are built:
C:\Path\to>mkdir install
If the *cmake* command is not available from the *Command Prompt*, add it to
the system *PATH* variable:
C:\Path\to>set PATH=%PATH%;C:\Program Files (x86)\CMake\bin
If the *git* command is not available from the *Command Prompt*, add it to
the system *PATH* variable:
C:\Path\to>set PATH=%PATH%;C:\Program Files\Git\cmd
Good. Now you are ready to continue.
Getting Sources
===============
You can get the latest stable source packages from the
[releases](https://github.com/tensorflow/tensorflow/releases) page.
Or you can type:
C:\Path\to> git clone --recursive -b [release_tag] https://github.com/tensorflow/tensorflow.git
Where *[release_tag]* is a git tag like *v0.6.0* or a branch name like *master*
if you want to get the latest code.
Go to the project folder:
C:\Path\to>cd tensorflow
C:\Path\to\tensorflow>
Now go to the *tensorflow\contrib\cmake* folder inside the TensorFlow sources:
C:\Path\to\tensorflow>cd tensorflow\contrib\cmake
C:\Path\to\tensorflow\tensorflow\contrib\cmake>
Good. Now you are ready to configure *CMake*.
CMake Configuration
===================
*CMake* supports many different
[generators](http://www.cmake.org/cmake/help/latest/manual/cmake-generators.7.html)
for various native build systems. Here we are only interested in
[Makefile](http://www.cmake.org/cmake/help/latest/manual/cmake-generators.7.html#makefile-generators)
and
[Visual Studio](http://www.cmake.org/cmake/help/latest/manual/cmake-generators.7.html#visual-studio-generators)
generators.
We will use shadow building to keep the temporary files separate from the
TensorFlow source code.
Create a temporary *build* folder and change your working directory to it:
C:\Path\to\tensorflow\tensorflow\contrib\cmake>mkdir build & cd build
C:\Path\to\tensorflow\tensorflow\contrib\cmake\build>
The *Makefile* generator can build the project in only one configuration, so
you need a separate build folder for each configuration.
To use the *Release* configuration:
[...]\contrib\cmake\build>mkdir release & cd release
[...]\contrib\cmake\build\release>cmake -G "NMake Makefiles" ^
-DCMAKE_BUILD_TYPE=Release ^
-DCMAKE_INSTALL_PREFIX=../../../../../../install ^
../..
This will generate an *nmake* *Makefile* in the current directory.
To use *Debug* configuration:
[...]\contrib\cmake\build>mkdir debug & cd debug
[...]\contrib\cmake\build\debug>cmake -G "NMake Makefiles" ^
-DCMAKE_BUILD_TYPE=Debug ^
-DCMAKE_INSTALL_PREFIX=../../../../../../install ^
../..
This will generate an *nmake* *Makefile* in the current directory.
To create *Visual Studio* solution file:
[...]\contrib\cmake\build>mkdir solution & cd solution
[...]\contrib\cmake\build\solution>cmake -G "Visual Studio 12 2013 Win64" ^
-DCMAKE_INSTALL_PREFIX=../../../../../../install ^
../..
This will generate the *Visual Studio* solution file *tensorflow.sln* in the
current directory.
If the *gmock* directory does not exist, or you do not want to build the
TensorFlow unit tests, add the *cmake* argument
`-Dtensorflow_BUILD_TESTS=OFF` to disable testing, as in the sketch below.
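For example, the *Release* configuration from above with tests disabled (a sketch combining the commands already shown):

    [...]\contrib\cmake\build\release>cmake -G "NMake Makefiles" ^
    -DCMAKE_BUILD_TYPE=Release ^
    -DCMAKE_INSTALL_PREFIX=../../../../../../install ^
    -Dtensorflow_BUILD_TESTS=OFF ^
    ../..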
Compiling
=========
To compile TensorFlow:
[...]\contrib\cmake\build\release>nmake
or
[...]\contrib\cmake\build\debug>nmake
And wait for the compilation to finish.
If you prefer to use the IDE:
* Open the generated tensorflow.sln file in Microsoft Visual Studio.
* Choose "Debug" or "Release" configuration as desired.
* From the Build menu, choose "Build Solution".
And wait for the compilation to finish.
Testing
=======
To run the unit tests:
[...]\contrib\cmake\build\release>nmake check
or
[...]\contrib\cmake\build\debug>nmake check
You can also build the *check* project from the Visual Studio solution.
Yes, it may sound strange, but it works.
You should see output similar to:
Running main() from gmock_main.cc
[==========] Running 1546 tests from 165 test cases.
...
[==========] 1546 tests from 165 test cases ran. (2529 ms total)
[ PASSED ] 1546 tests.
To run specific tests:
C:\Path\to\tensorflow>tensorflow\contrib\cmake\build\release\tests.exe ^
--gtest_filter=AnyTest*
Running main() from gmock_main.cc
Note: Google Test filter = AnyTest*
[==========] Running 3 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 3 tests from AnyTest
[ RUN ] AnyTest.TestPackAndUnpack
[ OK ] AnyTest.TestPackAndUnpack (0 ms)
[ RUN ] AnyTest.TestPackAndUnpackAny
[ OK ] AnyTest.TestPackAndUnpackAny (0 ms)
[ RUN ] AnyTest.TestIs
[ OK ] AnyTest.TestIs (0 ms)
[----------] 3 tests from AnyTest (1 ms total)
[----------] Global test environment tear-down
[==========] 3 tests from 1 test case ran. (2 ms total)
[ PASSED ] 3 tests.
Note that the tests must be run from the source folder.
If all tests pass, you can safely continue.
Installing
==========
To install TensorFlow to the specified *install* folder:
[...]\contrib\cmake\build\release>nmake install
or
[...]\contrib\cmake\build\debug>nmake install
You can also build the *INSTALL* project from the Visual Studio solution.
This one sounds less strange, and it works too.
This will create the following folders under the *install* location:
* bin - contains the TensorFlow binaries;
* include - contains the C++ headers and TensorFlow *.proto files;
* lib - contains the linking libraries and *CMake* configuration files for the
  *tensorflow* package.
Now, if needed, you can:
* Copy the contents of the include directory to wherever you want to put
headers.
* Copy binaries wherever you put build tools (probably somewhere in your
PATH).
* Copy linking libraries libtensorflow[d].lib wherever you put libraries.
To avoid conflicts between the MSVC debug and release runtime libraries, when
compiling a debug build of your application you may need to link against the
debug build, libtensorflowd.lib (note the "d" suffix). Similarly, release
builds should link against the release libtensorflow.lib library.
DLLs vs. static linking
=======================
Static linking is now the default for the TensorFlow libraries. Due to
issues with Win32's use of a separate heap for each DLL, as well as binary
compatibility issues between different versions of MSVC's STL library, it is
recommended that you use static linkage only. However, it is possible to
build libtensorflow as DLLs if you really want. To do this, do the following:
* Add an additional flag `-Dtensorflow_BUILD_SHARED_LIBS=ON` when invoking
cmake
* Follow the same steps as described in the above section.
* When compiling your project, make sure to `#define TENSORFLOW_USE_DLLS`.
When distributing your software to end users, we strongly recommend that you
do NOT install libtensorflow.dll to any shared location.
Instead, keep these libraries next to your binaries, in your application's
own install directory. C++ makes it very difficult to maintain binary
compatibility between releases, so it is likely that future versions of these
libraries will *not* be usable as drop-in replacements.
If your project is itself a DLL intended for use by third-party software, we
recommend that you do NOT expose Tensorflow objects in your library's
public interface, and that you statically link them into your library.
Notes on Compiler Warnings
==========================
The following warnings have been disabled while building the TensorFlow
libraries and binaries. You may have to disable some of them in your own
project as well, or live with them.
* [TODO]

View File: tensorflow/contrib/cmake/external/eigen.cmake (new file)

@ -0,0 +1,34 @@
#new_http_archive(
# name = "eigen_archive",
# url = "https://bitbucket.org/eigen/eigen/get/...",
# sha256 = "...",
# build_file = "eigen.BUILD",
#)
include (ExternalProject)
set(eigen_archive_hash "ed4c9730b545")
set(eigen_INCLUDE_DIRS
${CMAKE_CURRENT_BINARY_DIR}
${CMAKE_CURRENT_BINARY_DIR}/external/eigen_archive
${CMAKE_CURRENT_BINARY_DIR}/external/eigen_archive/eigen-eigen-${eigen_archive_hash}
${tensorflow_source_dir}/third_party/eigen3
)
set(eigen_URL https://bitbucket.org/eigen/eigen/get/${eigen_archive_hash}.tar.gz)
set(eigen_HASH SHA256=3d9eceb8a2add299e37b1f32759157cc2574f7684936c151552a5ae3f33aebd5)
set(eigen_BUILD ${CMAKE_CURRENT_BINARY_DIR}/eigen/src/eigen)
set(eigen_INSTALL ${CMAKE_CURRENT_BINARY_DIR}/eigen/install)
ExternalProject_Add(eigen
PREFIX eigen
URL ${eigen_URL}
URL_HASH ${eigen_HASH}
DOWNLOAD_DIR "${DOWNLOAD_LOCATION}"
INSTALL_DIR "${eigen_INSTALL}"
CMAKE_CACHE_ARGS
-DCMAKE_BUILD_TYPE:STRING=Release
-DCMAKE_VERBOSE_MAKEFILE:BOOL=OFF
-DCMAKE_INSTALL_PREFIX:STRING=${eigen_INSTALL}
-DINCLUDE_INSTALL_DIR:STRING=${CMAKE_CURRENT_BINARY_DIR}/external/eigen_archive/eigen-eigen-${eigen_archive_hash}
)

View File: tensorflow/contrib/cmake/external/jpeg.cmake (new file)

@ -0,0 +1,75 @@
include (ExternalProject)
set(jpeg_INCLUDE_DIR ${CMAKE_CURRENT_BINARY_DIR}/external/jpeg_archive)
set(jpeg_URL http://www.ijg.org/files/jpegsrc.v9a.tar.gz)
set(jpeg_HASH SHA256=3a753ea48d917945dd54a2d97de388aa06ca2eb1066cbfdc6652036349fe05a7)
set(jpeg_BUILD ${CMAKE_BINARY_DIR}/jpeg/src/jpeg)
set(jpeg_INSTALL ${CMAKE_BINARY_DIR}/jpeg/install)
set(jpeg_STATIC_LIBRARIES ${jpeg_INSTALL}/lib/libjpeg.a)
set(jpeg_HEADERS
"${jpeg_INSTALL}/include/jconfig.h"
"${jpeg_INSTALL}/include/jerror.h"
"${jpeg_INSTALL}/include/jmorecfg.h"
"${jpeg_INSTALL}/include/jpeglib.h"
"${jpeg_BUILD}/cderror.h"
"${jpeg_BUILD}/cdjpeg.h"
"${jpeg_BUILD}/jdct.h"
"${jpeg_BUILD}/jinclude.h"
"${jpeg_BUILD}/jmemsys.h"
"${jpeg_BUILD}/jpegint.h"
"${jpeg_BUILD}/jversion.h"
"${jpeg_BUILD}/transupp.h"
)
if (WIN32)
ExternalProject_Add(jpeg
PREFIX jpeg
URL ${jpeg_URL}
URL_HASH ${jpeg_HASH}
PATCH_COMMAND ${CMAKE_COMMAND} -E copy ${CMAKE_SOURCE_DIR}/patches/jpeg/CMakeLists.txt ${jpeg_BUILD}
INSTALL_DIR ${jpeg_INSTALL}
DOWNLOAD_DIR "${DOWNLOAD_LOCATION}"
CMAKE_CACHE_ARGS
-DCMAKE_BUILD_TYPE:STRING=Release
-DCMAKE_VERBOSE_MAKEFILE:BOOL=OFF
-DCMAKE_INSTALL_PREFIX:STRING=${jpeg_INSTALL}
)
ExternalProject_Add_Step(jpeg copy_jconfig
COMMAND ${CMAKE_COMMAND} -E copy
${jpeg_BUILD}/jconfig.vc ${jpeg_BUILD}/jconfig.h
DEPENDEES patch
DEPENDERS build
)
else()
ExternalProject_Add(jpeg
PREFIX jpeg
URL ${jpeg_URL}
URL_HASH ${jpeg_HASH}
INSTALL_DIR ${jpeg_INSTALL}
DOWNLOAD_DIR "${DOWNLOAD_LOCATION}"
BUILD_COMMAND $(MAKE)
INSTALL_COMMAND $(MAKE) install
CONFIGURE_COMMAND
${jpeg_BUILD}/configure
--prefix=${jpeg_INSTALL}
--enable-shared=yes
)
endif()
# put jpeg includes in the directory where they are expected
add_custom_target(jpeg_create_destination_dir
COMMAND ${CMAKE_COMMAND} -E make_directory ${jpeg_INCLUDE_DIR}/jpeg-9a
DEPENDS jpeg)
add_custom_target(jpeg_copy_headers_to_destination
DEPENDS jpeg_create_destination_dir)
foreach(header_file ${jpeg_HEADERS})
add_custom_command(TARGET jpeg_copy_headers_to_destination PRE_BUILD
COMMAND ${CMAKE_COMMAND} -E copy ${header_file} ${jpeg_INCLUDE_DIR}/jpeg-9a)
endforeach()

View File: tensorflow/contrib/cmake/external/png.cmake (new file)

@ -0,0 +1,38 @@
include (ExternalProject)
set(png_INCLUDE_DIR ${CMAKE_CURRENT_BINARY_DIR}/external/png_archive)
set(png_URL https://storage.googleapis.com/libpng-public-archive/libpng-1.2.53.tar.gz)
set(png_HASH SHA256=e05c9056d7f323088fd7824d8c6acc03a4a758c4b4916715924edc5dd3223a72)
set(png_BUILD ${CMAKE_BINARY_DIR}/png/src/png)
set(png_INSTALL ${CMAKE_BINARY_DIR}/png/install)
set(png_STATIC_LIBRARIES ${CMAKE_BINARY_DIR}/png/install/lib/libpng12.a)
set(png_HEADERS
"${png_INSTALL}/include/libpng12/png.h"
"${png_INSTALL}/include/libpng12/pngconf.h"
)
ExternalProject_Add(png
PREFIX png
URL ${png_URL}
URL_HASH ${png_HASH}
INSTALL_DIR ${png_INSTALL}
DOWNLOAD_DIR "${DOWNLOAD_LOCATION}"
CMAKE_CACHE_ARGS
-DCMAKE_BUILD_TYPE:STRING=Release
-DCMAKE_VERBOSE_MAKEFILE:BOOL=OFF
-DCMAKE_INSTALL_PREFIX:STRING=${png_INSTALL}
)
## put png includes in the directory where they are expected
add_custom_target(png_create_destination_dir
COMMAND ${CMAKE_COMMAND} -E make_directory ${png_INCLUDE_DIR}/libpng-1.2.53
DEPENDS png)
add_custom_target(png_copy_headers_to_destination
DEPENDS png_create_destination_dir)
foreach(header_file ${png_HEADERS})
add_custom_command(TARGET png_copy_headers_to_destination PRE_BUILD
COMMAND ${CMAKE_COMMAND} -E copy ${header_file} ${png_INCLUDE_DIR}/libpng-1.2.53)
endforeach()

View File: tensorflow/contrib/cmake/external/re2.cmake (new file)

@ -0,0 +1,46 @@
include (ExternalProject)
set(re2_INCLUDE_DIR ${CMAKE_CURRENT_BINARY_DIR}/external/re2/re2)
set(re2_EXTRA_INCLUDE_DIR ${CMAKE_CURRENT_BINARY_DIR}/re2/src)
set(re2_URL https://github.com/google/re2.git)
set(re2_TAG 791beff)
set(re2_BUILD ${CMAKE_BINARY_DIR}/re2/src/re2)
set(re2_LIBRARIES ${re2_BUILD}/obj/so/libre2.so)
get_filename_component(re2_STATIC_LIBRARIES ${re2_BUILD}/libre2.a ABSOLUTE)
set(re2_INCLUDES ${re2_BUILD})
# We only need re2.h in external/re2/re2/re2.h
# For the rest, we'll just add the build dir as an include dir.
set(re2_HEADERS
"${re2_BUILD}/re2/re2.h"
)
ExternalProject_Add(re2
PREFIX re2
GIT_REPOSITORY ${re2_URL}
GIT_TAG ${re2_TAG}
DOWNLOAD_DIR "${DOWNLOAD_LOCATION}"
BUILD_IN_SOURCE 1
INSTALL_COMMAND ""
CMAKE_CACHE_ARGS
-DCMAKE_BUILD_TYPE:STRING=Release
-DCMAKE_VERBOSE_MAKEFILE:BOOL=OFF
)
## put re2 includes in the directory where they are expected
add_custom_target(re2_create_destination_dir
COMMAND ${CMAKE_COMMAND} -E make_directory ${re2_INCLUDE_DIR}
DEPENDS re2)
add_custom_target(re2_copy_headers_to_destination
DEPENDS re2_create_destination_dir)
foreach(header_file ${re2_HEADERS})
add_custom_command(TARGET re2_copy_headers_to_destination PRE_BUILD
COMMAND ${CMAKE_COMMAND} -E copy ${header_file} ${re2_INCLUDE_DIR})
endforeach()
ADD_LIBRARY(re2_lib STATIC IMPORTED
DEPENDS re2)
SET_TARGET_PROPERTIES(re2_lib PROPERTIES
IMPORTED_LOCATION ${re2_STATIC_LIBRARIES})

View File: tensorflow/contrib/cmake/install.cmake (new file)

@ -0,0 +1 @@
# [TODO]

View File: tensorflow/contrib/cmake/patches/jpeg/CMakeLists.txt (new file)

@ -0,0 +1,76 @@
cmake_minimum_required(VERSION 2.8.3)
project(libjpeg)
set(LIBJPEG_SRCS
"jaricom.c"
"jcapimin.c"
"jcapistd.c"
"jcarith.c"
"jccoefct.c"
"jccolor.c"
"jcdctmgr.c"
"jchuff.c"
"jcinit.c"
"jcmainct.c"
"jcmarker.c"
"jcmaster.c"
"jcomapi.c"
"jcparam.c"
"jcprepct.c"
"jcsample.c"
"jctrans.c"
"jdapimin.c"
"jdapistd.c"
"jdarith.c"
"jdatadst.c"
"jdatasrc.c"
"jdcoefct.c"
"jdcolor.c"
"jddctmgr.c"
"jdhuff.c"
"jdinput.c"
"jdmainct.c"
"jdmarker.c"
"jdmaster.c"
"jdmerge.c"
"jdpostct.c"
"jdsample.c"
"jdtrans.c"
"jerror.c"
"jfdctflt.c"
"jfdctfst.c"
"jfdctint.c"
"jidctflt.c"
"jidctfst.c"
"jidctint.c"
"jmemmgr.c"
"jmemnobs.c"
"jquant1.c"
"jquant2.c"
"jutils.c"
)
set(LIBJPEG_INCLUDES
"jconfig.h"
"jdct.h"
"jerror.h"
"jinclude.h"
"jmemsys.h"
"jmorecfg.h"
"jpegint.h"
"jpeglib.h"
"jversion.h"
)
include_directories("${CMAKE_CURRENT_SOURCE_DIR}")
add_library(libjpeg ${LIBJPEG_SRCS})
install(TARGETS libjpeg
RUNTIME DESTINATION bin COMPONENT RuntimeLibraries
LIBRARY DESTINATION lib COMPONENT RuntimeLibraries
ARCHIVE DESTINATION lib COMPONENT Development)
foreach(LIBJPEG_INCLUDE ${LIBJPEG_INCLUDES})
install(FILES ${LIBJPEG_INCLUDE} DESTINATION include COMPONENT Development)
endforeach()

View File: tensorflow/contrib/cmake/tests.cmake (new file)

@ -0,0 +1 @@
# [TODO]

View File: tensorflow/contrib/cmake/tf_cc_ops.cmake (new file)

@ -0,0 +1,204 @@
########################################################
# tf_cc_op_gen_main library
########################################################
set(tf_cc_op_gen_main_srcs
"${tensorflow_source_dir}/tensorflow/cc/ops/cc_op_gen.cc"
"${tensorflow_source_dir}/tensorflow/cc/ops/cc_op_gen_main.cc"
"${tensorflow_source_dir}/tensorflow/cc/ops/cc_op_gen.h"
)
add_library(tf_cc_op_gen_main OBJECT ${tf_cc_op_gen_main_srcs})
add_dependencies(tf_cc_op_gen_main tf_core_framework)
target_include_directories(tf_cc_op_gen_main PRIVATE
${tensorflow_source_dir}
${eigen_INCLUDE_DIRS}
)
#target_link_libraries(tf_cc_op_gen_main
# ${CMAKE_THREAD_LIBS_INIT}
# ${PROTOBUF_LIBRARIES}
# tf_protos_cc
# tf_core_lib
# tf_core_framework
#)
target_compile_options(tf_cc_op_gen_main PRIVATE
-fno-exceptions
-DEIGEN_AVOID_STL_ARRAY
)
# C++11
target_compile_features(tf_cc_op_gen_main PRIVATE
cxx_rvalue_references
)
########################################################
# tf_gen_op_wrapper_cc executables
########################################################
#
# # Run the op generator.
# if name == "sendrecv_ops":
# include_internal = "1"
# else:
# include_internal = "0"
# native.genrule(
# name=name + "_genrule",
# outs=[out_ops_file + ".h", out_ops_file + ".cc"],
# tools=[":" + tool],
# cmd=("$(location :" + tool + ") $(location :" + out_ops_file + ".h) " +
# "$(location :" + out_ops_file + ".cc) " + include_internal))
#def tf_gen_op_wrappers_cc(name,
# op_lib_names=[],
# other_srcs=[],
# other_hdrs=[],
# pkg=""):
# subsrcs = other_srcs
# subhdrs = other_hdrs
# for n in op_lib_names:
# tf_gen_op_wrapper_cc(n, "ops/" + n, pkg=pkg)
# subsrcs += ["ops/" + n + ".cc"]
# subhdrs += ["ops/" + n + ".h"]
#
# native.cc_library(name=name,
# srcs=subsrcs,
# hdrs=subhdrs,
# deps=["//tensorflow/core:core_cpu"],
# copts=tf_copts(),
# alwayslink=1,)
# create directory for ops generated files
set(cc_ops_target_dir ${CMAKE_CURRENT_BINARY_DIR}/tensorflow/cc/ops)
add_custom_target(create_cc_ops_header_dir
COMMAND ${CMAKE_COMMAND} -E make_directory ${cc_ops_target_dir}
)
set(tf_cc_ops_generated_files)
set(tf_cc_op_lib_names
${tf_op_lib_names}
"user_ops"
)
foreach(tf_cc_op_lib_name ${tf_cc_op_lib_names})
#tf_gen_op_wrapper_cc(name, out_ops_file, pkg=""):
# # Construct an op generator binary for these ops.
# tool = out_ops_file + "_gen_cc" #example ops/array_ops_gen_cc
# native.cc_binary(
# name = tool,
# copts = tf_copts(),
# linkopts = ["-lm"],
# linkstatic = 1, # Faster to link this one-time-use binary dynamically
# deps = (["//tensorflow/cc:cc_op_gen_main",
# pkg + ":" + name + "_op_lib"])
# )
# Using <TARGET_OBJECTS:...> to work around an issue where no ops were
# registered (static initializers dropped by the linker because the ops
# are not used explicitly in the *_gen_cc executables).
add_executable(${tf_cc_op_lib_name}_gen_cc
$<TARGET_OBJECTS:tf_cc_op_gen_main>
$<TARGET_OBJECTS:tf_${tf_cc_op_lib_name}>
$<TARGET_OBJECTS:tf_core_lib>
$<TARGET_OBJECTS:tf_core_framework>
)
target_include_directories(${tf_cc_op_lib_name}_gen_cc PRIVATE
${tensorflow_source_dir}
${eigen_INCLUDE_DIRS}
)
find_package(ZLIB REQUIRED)
target_link_libraries(${tf_cc_op_lib_name}_gen_cc PRIVATE
${CMAKE_THREAD_LIBS_INIT}
${PROTOBUF_LIBRARIES}
tf_protos_cc
re2_lib
${jpeg_STATIC_LIBRARIES}
${png_STATIC_LIBRARIES}
${ZLIB_LIBRARIES}
)
target_compile_options(${tf_cc_op_lib_name}_gen_cc PRIVATE
-fno-exceptions
-DEIGEN_AVOID_STL_ARRAY
-lm
)
# C++11
target_compile_features(${tf_cc_op_lib_name}_gen_cc PRIVATE
cxx_rvalue_references
)
set(cc_ops_include_internal 0)
if(${tf_cc_op_lib_name} STREQUAL "sendrecv_ops")
set(cc_ops_include_internal 1)
endif()
add_custom_command(
OUTPUT ${cc_ops_target_dir}/${tf_cc_op_lib_name}.h
${cc_ops_target_dir}/${tf_cc_op_lib_name}.cc
COMMAND ${tf_cc_op_lib_name}_gen_cc ${cc_ops_target_dir}/${tf_cc_op_lib_name}.h ${cc_ops_target_dir}/${tf_cc_op_lib_name}.cc ${cc_ops_include_internal}
DEPENDS ${tf_cc_op_lib_name}_gen_cc create_cc_ops_header_dir
)
list(APPEND tf_cc_ops_generated_files ${cc_ops_target_dir}/${tf_cc_op_lib_name}.h)
list(APPEND tf_cc_ops_generated_files ${cc_ops_target_dir}/${tf_cc_op_lib_name}.cc)
endforeach()
########################################################
# tf_cc_ops library
########################################################
add_library(tf_cc_ops OBJECT
${tf_cc_ops_generated_files}
"${tensorflow_source_dir}/tensorflow/cc/ops/const_op.h"
"${tensorflow_source_dir}/tensorflow/cc/ops/const_op.cc"
"${tensorflow_source_dir}/tensorflow/cc/ops/standard_ops.h"
)
target_include_directories(tf_cc_ops PRIVATE
${tensorflow_source_dir}
${eigen_INCLUDE_DIRS}
)
#target_link_libraries(tf_cc_ops
# ${CMAKE_THREAD_LIBS_INIT}
# ${PROTOBUF_LIBRARIES}
# tf_protos_cc
# tf_core_lib
# tf_core_cpu
# tf_models_word2vec_ops
#)
target_compile_options(tf_cc_ops PRIVATE
-fno-exceptions
-DEIGEN_AVOID_STL_ARRAY
)
# C++11
target_compile_features(tf_cc_ops PRIVATE
cxx_rvalue_references
)
#tf_gen_op_wrappers_cc(
# name = "cc_ops",
# op_lib_names = [
# ...
# ],
# other_hdrs = [
# "ops/const_op.h",
# "ops/standard_ops.h",
# ],
# other_srcs = [
# "ops/const_op.cc",
# ] + glob(["ops/*_grad.cc"]),
# pkg = "//tensorflow/core",
#)

View File: tensorflow/contrib/cmake/tf_core_cpu.cmake (new file)

@ -0,0 +1,53 @@
########################################################
# tf_core_cpu library
########################################################
file(GLOB_RECURSE tf_core_cpu_srcs
"${tensorflow_source_dir}/tensorflow/core/common_runtime/*.h"
"${tensorflow_source_dir}/tensorflow/core/common_runtime/*.cc"
"${tensorflow_source_dir}/tensorflow/core/client/*.cc"
"${tensorflow_source_dir}/tensorflow/core/graph/*.h"
"${tensorflow_source_dir}/tensorflow/core/graph/*.cc"
"${tensorflow_source_dir}/tensorflow/core/public/*.h"
)
file(GLOB_RECURSE tf_core_cpu_exclude_srcs
"${tensorflow_source_dir}/tensorflow/core/*test*.h"
"${tensorflow_source_dir}/tensorflow/core/*test*.cc"
"${tensorflow_source_dir}/tensorflow/core/*main.cc"
"${tensorflow_source_dir}/tensorflow/core/common_runtime/gpu/*.cc"
"${tensorflow_source_dir}/tensorflow/core/common_runtime/gpu_device_factory.cc"
"${tensorflow_source_dir}/tensorflow/core/common_runtime/direct_session.cc"
"${tensorflow_source_dir}/tensorflow/core/common_runtime/direct_session.h"
)
list(REMOVE_ITEM tf_core_cpu_srcs ${tf_core_cpu_exclude_srcs})
add_library(tf_core_cpu OBJECT ${tf_core_cpu_srcs})
target_include_directories(tf_core_cpu PRIVATE
${tensorflow_source_dir}
${eigen_INCLUDE_DIRS}
${re2_INCLUDES}
)
add_dependencies(tf_core_cpu
tf_core_framework
)
#target_link_libraries(tf_core_cpu
# ${CMAKE_THREAD_LIBS_INIT}
# ${PROTOBUF_LIBRARIES}
# tf_core_framework
# tf_core_lib
# tf_protos_cc
#)
target_compile_options(tf_core_cpu PRIVATE
-fno-exceptions
-DEIGEN_AVOID_STL_ARRAY
)
# C++11
target_compile_features(tf_core_cpu PRIVATE
cxx_rvalue_references
)

View File: tensorflow/contrib/cmake/tf_core_direct_session.cmake (new file)

@ -0,0 +1,35 @@
########################################################
# tf_core_direct_session library
########################################################
file(GLOB tf_core_direct_session_srcs
"${tensorflow_source_dir}/tensorflow/core/common_runtime/direct_session.cc"
"${tensorflow_source_dir}/tensorflow/core/common_runtime/direct_session.h"
)
add_library(tf_core_direct_session OBJECT ${tf_core_direct_session_srcs})
add_dependencies(tf_core_direct_session tf_core_cpu)
target_include_directories(tf_core_direct_session PRIVATE
${tensorflow_source_dir}
${eigen_INCLUDE_DIRS}
)
#target_link_libraries(tf_core_direct_session
# ${CMAKE_THREAD_LIBS_INIT}
# ${PROTOBUF_LIBRARIES}
# tf_core_cpu
# tf_core_framework
# tf_core_lib
# tf_protos_cc
#)
target_compile_options(tf_core_direct_session PRIVATE
-fno-exceptions
-DEIGEN_AVOID_STL_ARRAY
)
# C++11
target_compile_features(tf_core_direct_session PRIVATE
cxx_rvalue_references
)

View File: tensorflow/contrib/cmake/tf_core_framework.cmake (new file)

@ -0,0 +1,165 @@
########################################################
# RELATIVE_PROTOBUF_GENERATE_CPP function
########################################################
# A variant of PROTOBUF_GENERATE_CPP that keeps the directory hierarchy.
# ROOT_DIR must be absolute, and proto paths must be relative to ROOT_DIR.
function(RELATIVE_PROTOBUF_GENERATE_CPP SRCS HDRS ROOT_DIR)
if(NOT ARGN)
message(SEND_ERROR "Error: RELATIVE_PROTOBUF_GENERATE_CPP() called without any proto files")
return()
endif()
set(${SRCS})
set(${HDRS})
foreach(FIL ${ARGN})
set(ABS_FIL ${ROOT_DIR}/${FIL})
get_filename_component(FIL_WE ${FIL} NAME_WE)
get_filename_component(FIL_DIR ${ABS_FIL} PATH)
file(RELATIVE_PATH REL_DIR ${ROOT_DIR} ${FIL_DIR})
list(APPEND ${SRCS} "${CMAKE_CURRENT_BINARY_DIR}/${REL_DIR}/${FIL_WE}.pb.cc")
list(APPEND ${HDRS} "${CMAKE_CURRENT_BINARY_DIR}/${REL_DIR}/${FIL_WE}.pb.h")
add_custom_command(
OUTPUT "${CMAKE_CURRENT_BINARY_DIR}/${REL_DIR}/${FIL_WE}.pb.cc"
"${CMAKE_CURRENT_BINARY_DIR}/${REL_DIR}/${FIL_WE}.pb.h"
COMMAND ${PROTOBUF_PROTOC_EXECUTABLE}
ARGS --cpp_out ${CMAKE_CURRENT_BINARY_DIR} -I ${ROOT_DIR} ${ABS_FIL}
DEPENDS ${ABS_FIL} ${PROTOBUF_PROTOC_EXECUTABLE}
COMMENT "Running C++ protocol buffer compiler on ${FIL}"
VERBATIM )
endforeach()
set_source_files_properties(${${SRCS}} ${${HDRS}} PROPERTIES GENERATED TRUE)
set(${SRCS} ${${SRCS}} PARENT_SCOPE)
set(${HDRS} ${${HDRS}} PARENT_SCOPE)
endfunction()
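# Usage sketch (file name illustrative): with ROOT_DIR set to
# ${tensorflow_source_dir}, a proto such as
# tensorflow/core/framework/graph.proto is compiled into
# ${CMAKE_CURRENT_BINARY_DIR}/tensorflow/core/framework/graph.pb.cc and
# .pb.h, preserving the relative directory hierarchy.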
########################################################
# tf_protos_cc library
########################################################
# Build proto library
include(FindProtobuf)
find_package(Protobuf REQUIRED)
include_directories(${PROTOBUF_INCLUDE_DIRS})
include_directories(${CMAKE_CURRENT_BINARY_DIR})
file(GLOB_RECURSE tf_protos_cc_srcs RELATIVE ${tensorflow_source_dir}
"${tensorflow_source_dir}/tensorflow/*.proto"
)
RELATIVE_PROTOBUF_GENERATE_CPP(PROTO_SRCS PROTO_HDRS
${tensorflow_source_dir} ${tf_protos_cc_srcs}
)
add_library(tf_protos_cc ${PROTO_SRCS} ${PROTO_HDRS})
target_include_directories(tf_protos_cc PUBLIC
${CMAKE_CURRENT_BINARY_DIR}
)
target_link_libraries(tf_protos_cc PUBLIC
${PROTOBUF_LIBRARIES}
)
########################################################
# tf_core_lib library
########################################################
file(GLOB_RECURSE tf_core_lib_srcs
"${tensorflow_source_dir}/tensorflow/core/lib/*.h"
"${tensorflow_source_dir}/tensorflow/core/lib/*.cc"
"${tensorflow_source_dir}/tensorflow/core/platform/*.h"
"${tensorflow_source_dir}/tensorflow/core/platform/*.cc"
"${tensorflow_source_dir}/tensorflow/core/public/*.h"
)
file(GLOB_RECURSE tf_core_lib_test_srcs
"${tensorflow_source_dir}/tensorflow/core/lib/*test*.h"
"${tensorflow_source_dir}/tensorflow/core/lib/*test*.cc"
"${tensorflow_source_dir}/tensorflow/core/platform/*test*.h"
"${tensorflow_source_dir}/tensorflow/core/platform/*test*.cc"
"${tensorflow_source_dir}/tensorflow/core/public/*test*.h"
)
list(REMOVE_ITEM tf_core_lib_srcs ${tf_core_lib_test_srcs})
add_library(tf_core_lib OBJECT ${tf_core_lib_srcs})
target_include_directories(tf_core_lib PUBLIC
${tensorflow_source_dir}
${jpeg_INCLUDE_DIR}
${png_INCLUDE_DIR}
)
#target_link_libraries(tf_core_lib
# ${CMAKE_THREAD_LIBS_INIT}
# ${PROTOBUF_LIBRARIES}
# tf_protos_cc
#)
target_compile_options(tf_core_lib PRIVATE
-fno-exceptions
-DEIGEN_AVOID_STL_ARRAY
)
# C++11
target_compile_features(tf_core_lib PRIVATE
cxx_rvalue_references
)
add_dependencies(tf_core_lib
jpeg_copy_headers_to_destination
png_copy_headers_to_destination
re2_copy_headers_to_destination
eigen
tf_protos_cc
)
########################################################
# tf_core_framework library
########################################################
file(GLOB_RECURSE tf_core_framework_srcs
"${tensorflow_source_dir}/tensorflow/core/framework/*.h"
"${tensorflow_source_dir}/tensorflow/core/framework/*.cc"
"${tensorflow_source_dir}/tensorflow/core/util/*.h"
"${tensorflow_source_dir}/tensorflow/core/util/*.cc"
"${tensorflow_source_dir}/public/*.h"
)
file(GLOB_RECURSE tf_core_framework_test_srcs
"${tensorflow_source_dir}/tensorflow/core/framework/*test*.h"
"${tensorflow_source_dir}/tensorflow/core/framework/*test*.cc"
"${tensorflow_source_dir}/tensorflow/core/framework/*testutil.h"
"${tensorflow_source_dir}/tensorflow/core/framework/*testutil.cc"
"${tensorflow_source_dir}/tensorflow/core/framework/*main.cc"
"${tensorflow_source_dir}/tensorflow/core/util/*test*.h"
"${tensorflow_source_dir}/tensorflow/core/util/*test*.cc"
"${tensorflow_source_dir}/tensorflow/core/util/*main.cc"
)
list(REMOVE_ITEM tf_core_framework_srcs ${tf_core_framework_test_srcs})
add_library(tf_core_framework OBJECT ${tf_core_framework_srcs})
target_include_directories(tf_core_framework PUBLIC
${tensorflow_source_dir}
${eigen_INCLUDE_DIRS}
${re2_INCLUDES}
)
#target_link_libraries(tf_core_framework
# ${CMAKE_THREAD_LIBS_INIT}
# ${PROTOBUF_LIBRARIES}
# #${re2_STATIC_LIBRARIES}
# re2_lib
# ${jpeg_STATIC_LIBRARIES}
# ${png_STATIC_LIBRARIES}
# tf_protos_cc
# tf_core_lib
#)
add_dependencies(tf_core_framework
tf_core_lib
)
target_compile_options(tf_core_framework PRIVATE
-fno-exceptions
-DEIGEN_AVOID_STL_ARRAY
)
# C++11
target_compile_features(tf_core_framework PRIVATE
cxx_rvalue_references
)

View File: tensorflow/contrib/cmake/tf_core_kernels.cmake (new file)

@ -0,0 +1,53 @@
########################################################
# tf_core_kernels library
########################################################
file(GLOB_RECURSE tf_core_kernels_srcs
"${tensorflow_source_dir}/tensorflow/core/kernels/*.h"
"${tensorflow_source_dir}/tensorflow/core/kernels/*.cc"
)
file(GLOB_RECURSE tf_core_kernels_exclude_srcs
"${tensorflow_source_dir}/tensorflow/core/kernels/*test*.h"
"${tensorflow_source_dir}/tensorflow/core/kernels/*test*.cc"
"${tensorflow_source_dir}/tensorflow/core/kernels/*testutil.h"
"${tensorflow_source_dir}/tensorflow/core/kernels/*testutil.cc"
"${tensorflow_source_dir}/tensorflow/core/kernels/*main.cc"
"${tensorflow_source_dir}/tensorflow/core/kernels/*.cu.cc"
)
list(REMOVE_ITEM tf_core_kernels_srcs ${tf_core_kernels_exclude_srcs})
add_library(tf_core_kernels OBJECT ${tf_core_kernels_srcs})
add_dependencies(tf_core_kernels tf_core_cpu)
target_include_directories(tf_core_kernels PRIVATE
${tensorflow_source_dir}
${png_INCLUDE_DIR}
${eigen_INCLUDE_DIRS}
)
#target_link_libraries(tf_core_kernels
# ${CMAKE_THREAD_LIBS_INIT}
# ${PROTOBUF_LIBRARIES}
# tf_core_cpu
# tf_core_framework
# tf_core_lib
# tf_protos_cc
# tf_models_word2vec_kernels
# tf_stream_executor
# tf_core_ops
# tf_core_cpu
#)
# "@gemmlowp//:eight_bit_int_gemm",
target_compile_options(tf_core_kernels PRIVATE
-fno-exceptions
-DEIGEN_AVOID_STL_ARRAY
)
# C++11
target_compile_features(tf_core_kernels PRIVATE
cxx_rvalue_references
)

View File: tensorflow/contrib/cmake/tf_core_ops.cmake (new file)

@ -0,0 +1,181 @@
#def tf_gen_op_libs(op_lib_names):
# # Make library out of each op so it can also be used to generate wrappers
# # for various languages.
# for n in op_lib_names:
# native.cc_library(name=n + "_op_lib"
# copts=tf_copts(),
# srcs=["ops/" + n + ".cc"],
# deps=(["//tensorflow/core:framework"]),
# visibility=["//visibility:public"],
# alwayslink=1,
# linkstatic=1,)
set(tf_op_lib_names
"array_ops"
"attention_ops"
"candidate_sampling_ops"
"control_flow_ops"
"data_flow_ops"
"image_ops"
"io_ops"
"linalg_ops"
"logging_ops"
"functional_ops"
"math_ops"
"nn_ops"
"no_op"
"parsing_ops"
"random_ops"
"script_ops"
"sendrecv_ops"
"sparse_ops"
"state_ops"
"string_ops"
"summary_ops"
"training_ops"
)
foreach(tf_op_lib_name ${tf_op_lib_names})
########################################################
# tf_${tf_op_lib_name} library
########################################################
file(GLOB tf_${tf_op_lib_name}_srcs
"${tensorflow_source_dir}/tensorflow/core/ops/${tf_op_lib_name}.cc"
)
add_library(tf_${tf_op_lib_name} OBJECT ${tf_${tf_op_lib_name}_srcs})
add_dependencies(tf_${tf_op_lib_name} tf_core_framework)
target_include_directories(tf_${tf_op_lib_name} PRIVATE
${tensorflow_source_dir}
${eigen_INCLUDE_DIRS}
)
target_compile_options(tf_${tf_op_lib_name} PRIVATE
-fno-exceptions
-DEIGEN_AVOID_STL_ARRAY
)
# C++11
target_compile_features(tf_${tf_op_lib_name} PRIVATE
cxx_rvalue_references
)
endforeach()
#cc_library(
# name = "user_ops_op_lib"
# srcs = glob(["user_ops/**/*.cc"]),
# copts = tf_copts(),
# linkstatic = 1,
# visibility = ["//visibility:public"],
# deps = [":framework"],
# alwayslink = 1,
#)
########################################################
# tf_user_ops library
########################################################
file(GLOB_RECURSE tf_user_ops_srcs
"${tensorflow_source_dir}/tensorflow/core/user_ops/*.cc"
)
add_library(tf_user_ops OBJECT ${tf_user_ops_srcs})
add_dependencies(tf_user_ops tf_core_framework)
target_include_directories(tf_user_ops PRIVATE
${tensorflow_source_dir}
${eigen_INCLUDE_DIRS}
)
target_compile_options(tf_user_ops PRIVATE
-fno-exceptions
-DEIGEN_AVOID_STL_ARRAY
)
# C++11
target_compile_features(tf_user_ops PRIVATE
cxx_rvalue_references
)
#tf_cuda_library(
# name = "ops"
# srcs = glob(
# [
# "ops/**/*.h"
# "ops/**/*.cc"
# "user_ops/**/*.h"
# "user_ops/**/*.cc"
# ],
# exclude = [
# "**/*test*"
# "**/*main.cc"
# "user_ops/**/*.cu.cc"
# ],
# ),
# copts = tf_copts(),
# linkstatic = 1,
# visibility = ["//visibility:public"],
# deps = [
# ":core"
# ":lib"
# ":protos_cc"
# "//tensorflow/models/embedding:word2vec_ops"
# "//third_party/eigen3"
# ],
# alwayslink = 1,
#)
########################################################
# tf_core_ops library
########################################################
file(GLOB_RECURSE tf_core_ops_srcs
"${tensorflow_source_dir}/tensorflow/core/ops/*.h"
"${tensorflow_source_dir}/tensorflow/core/ops/*.cc"
"${tensorflow_source_dir}/tensorflow/core/user_ops/*.h"
"${tensorflow_source_dir}/tensorflow/core/user_ops/*.cc"
)
file(GLOB_RECURSE tf_core_ops_exclude_srcs
"${tensorflow_source_dir}/tensorflow/core/ops/*test*.h"
"${tensorflow_source_dir}/tensorflow/core/ops/*test*.cc"
"${tensorflow_source_dir}/tensorflow/core/ops/*main.cc"
"${tensorflow_source_dir}/tensorflow/core/user_ops/*test*.h"
"${tensorflow_source_dir}/tensorflow/core/user_ops/*test*.cc"
"${tensorflow_source_dir}/tensorflow/core/user_ops/*main.cc"
"${tensorflow_source_dir}/tensorflow/core/user_ops/*.cu.cc"
)
list(REMOVE_ITEM tf_core_ops_srcs ${tf_core_ops_exclude_srcs})
add_library(tf_core_ops OBJECT ${tf_core_ops_srcs})
add_dependencies(tf_core_ops tf_core_cpu)
target_include_directories(tf_core_ops PRIVATE
${tensorflow_source_dir}
${eigen_INCLUDE_DIRS}
)
#target_link_libraries(tf_core_ops
# ${CMAKE_THREAD_LIBS_INIT}
# ${PROTOBUF_LIBRARIES}
# tf_protos_cc
# tf_core_lib
# tf_core_cpu
# tf_models_word2vec_ops
#)
target_compile_options(tf_core_ops PRIVATE
-fno-exceptions
-DEIGEN_AVOID_STL_ARRAY
)
# C++11
target_compile_features(tf_core_ops PRIVATE
cxx_rvalue_references
)

View File: tensorflow/contrib/cmake/tf_models.cmake (new file)

@ -0,0 +1,95 @@
#cc_library(
# name = "word2vec_ops",
# srcs = [
# "word2vec_ops.cc",
# ],
# visibility = ["//tensorflow:internal"],
# deps = [
# "//tensorflow/core:framework",
# ],
# alwayslink = 1,
#)
########################################################
# tf_models_word2vec_ops library
########################################################
file(GLOB tf_models_word2vec_ops_srcs
"${tensorflow_source_dir}/tensorflow/models/embedding/word2vec_ops.cc"
)
add_library(tf_models_word2vec_ops OBJECT ${tf_models_word2vec_ops_srcs})
target_include_directories(tf_models_word2vec_ops PRIVATE
${tensorflow_source_dir}
${eigen_INCLUDE_DIRS}
)
add_dependencies(tf_models_word2vec_ops
tf_core_framework
)
#target_link_libraries(tf_models_word2vec_ops
# ${CMAKE_THREAD_LIBS_INIT}
# ${PROTOBUF_LIBRARIES}
# tf_core_framework
# tf_core_lib
# tf_protos_cc
#)
target_compile_options(tf_models_word2vec_ops PRIVATE
-fno-exceptions
-DEIGEN_AVOID_STL_ARRAY
)
# C++11
target_compile_features(tf_models_word2vec_ops PRIVATE
cxx_rvalue_references
)
#cc_library(
# name = "word2vec_kernels",
# srcs = [
# "word2vec_kernels.cc",
# ],
# visibility = ["//tensorflow:internal"],
# deps = [
# "//tensorflow/core",
# ],
# alwayslink = 1,
#)
########################################################
# tf_models_word2vec_kernels library
########################################################
file(GLOB tf_models_word2vec_kernels_srcs
"${tensorflow_source_dir}/tensorflow/models/embedding/word2vec_kernels.cc"
)
add_library(tf_models_word2vec_kernels OBJECT ${tf_models_word2vec_kernels_srcs})
target_include_directories(tf_models_word2vec_kernels PRIVATE
${tensorflow_source_dir}
${eigen_INCLUDE_DIRS}
${re2_INCLUDES}
)
add_dependencies(tf_models_word2vec_ops
tf_core_cpu
)
#target_link_libraries(tf_models_word2vec_kernels
# ${CMAKE_THREAD_LIBS_INIT}
# ${PROTOBUF_LIBRARIES}
# tf_core_framework
# tf_core_lib
# tf_protos_cc
# tf_core_cpu
#)
target_compile_options(tf_models_word2vec_kernels PRIVATE
-fno-exceptions
-DEIGEN_AVOID_STL_ARRAY
)
# C++11
target_compile_features(tf_models_word2vec_kernels PRIVATE
cxx_rvalue_references
)

View File: tensorflow/contrib/cmake/tf_stream_executor.cmake (new file)

@ -0,0 +1,81 @@
#cc_library(
# name = "stream_executor",
# srcs = glob(
# [
#XX "*.cc",
# "lib/*.cc",
# ],
# exclude = [
# "**/*_test.cc",
# ],
# ) + if_cuda(
# glob([
# "cuda/*.cc",
# ]),
# ),
# hdrs = glob([
# "*.h",
# "cuda/*.h",
# "lib/*.h",
# "platform/**/*.h",
# ]),
# data = [
# "//tensorflow/core:cuda",
# "//third_party/gpus/cuda:cublas",
# "//third_party/gpus/cuda:cudnn",
# ],
# linkopts = [
# "-ldl",
# ],
# visibility = ["//visibility:public"],
# deps = [
# "//tensorflow/core:lib",
# "//third_party/gpus/cuda:cuda_headers",
# ],
# alwayslink = 1,
#)
########################################################
# tf_stream_executor library
########################################################
file(GLOB tf_stream_executor_srcs
"${tensorflow_source_dir}/tensorflow/stream_executor/*.cc"
"${tensorflow_source_dir}/tensorflow/stream_executor/*.h"
"${tensorflow_source_dir}/tensorflow/stream_executor/lib/*.cc"
"${tensorflow_source_dir}/tensorflow/stream_executor/lib/*.h"
"${tensorflow_source_dir}/tensorflow/stream_executor/platform/*.h"
"${tensorflow_source_dir}/tensorflow/stream_executor/platform/default/*.h"
)
#file(GLOB_RECURSE tf_stream_executor_test_srcs
# "${tensorflow_source_dir}/tensorflow/stream_executor/*_test.cc"
# "${tensorflow_source_dir}/tensorflow/stream_executor/*_test.h"
#)
#
#list(REMOVE_ITEM tf_stream_executor_srcs ${tf_stream_executor_test_srcs})
add_library(tf_stream_executor OBJECT ${tf_stream_executor_srcs})
target_include_directories(tf_stream_executor PRIVATE
${tensorflow_source_dir}
)
add_dependencies(tf_stream_executor
tf_core_lib
)
#target_link_libraries(tf_stream_executor
# ${CMAKE_THREAD_LIBS_INIT}
# ${PROTOBUF_LIBRARIES}
# tf_protos_cc
# tf_core_lib
#)
target_compile_options(tf_stream_executor PRIVATE
-fno-exceptions
-DEIGEN_AVOID_STL_ARRAY
)
# C++11
target_compile_features(tf_stream_executor PRIVATE
cxx_rvalue_references
)

View File: tensorflow/contrib/cmake/tf_tutorials.cmake (new file)

@ -0,0 +1,54 @@
#cc_binary(
# name = "tutorials_example_trainer",
# srcs = ["tutorials/example_trainer.cc"],
# copts = tf_copts(),
# linkopts = [
# "-lpthread",
# "-lm",
# ],
# deps = [
# ":cc_ops",
# "//tensorflow/core:kernels",
# "//tensorflow/core:tensorflow",
# ],
#)
set(tf_tutorials_example_trainer_srcs
"${tensorflow_source_dir}/tensorflow/cc/tutorials/example_trainer.cc"
)
add_executable(tf_tutorials_example_trainer
${tf_tutorials_example_trainer_srcs}
$<TARGET_OBJECTS:tf_core_lib>
$<TARGET_OBJECTS:tf_core_cpu>
$<TARGET_OBJECTS:tf_core_framework>
$<TARGET_OBJECTS:tf_core_kernels>
$<TARGET_OBJECTS:tf_cc_ops>
$<TARGET_OBJECTS:tf_core_ops>
$<TARGET_OBJECTS:tf_core_direct_session>
)
target_include_directories(tf_tutorials_example_trainer PUBLIC
${tensorflow_source_dir}
${eigen_INCLUDE_DIRS}
)
target_link_libraries(tf_tutorials_example_trainer PUBLIC
${CMAKE_THREAD_LIBS_INIT}
${PROTOBUF_LIBRARIES}
tf_protos_cc
re2_lib
${jpeg_STATIC_LIBRARIES}
${png_STATIC_LIBRARIES}
${ZLIB_LIBRARIES}
)
target_compile_options(tf_tutorials_example_trainer PRIVATE
-fno-exceptions
-DEIGEN_AVOID_STL_ARRAY
)
# C++11
target_compile_features(tf_tutorials_example_trainer PRIVATE
cxx_rvalue_references
)
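Once the build succeeds, the example trainer can be launched straight from the matching build directory (a sketch; the binary name follows the target defined above):

    [...]\contrib\cmake\build\release>tf_tutorials_example_trainer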

View File

@ -79,7 +79,7 @@ def _reduce_batch(x, reduce_fn, name=None):
elif ndims == 1:
return x # Don't include a useless reduction.
elif ndims:
reduction_indices = range(1, ndims)
reduction_indices = list(range(1, ndims))
shape = [x.get_shape().dims[0]]
else:
reduction_indices = math_ops.range(1, array_ops.size(array_ops.shape(x)))
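The one-line change above is a Python 3 compatibility fix: on Python 3, `range` no longer returns a list, and downstream code that expects a concrete list of reduction indices breaks. A minimal sketch of the difference (values illustrative):

```python
ndims = 3

# Python 2: range() returned a list. Python 3: it returns a lazy range object.
reduction_indices = range(1, ndims)
print(isinstance(reduction_indices, list))  # False on Python 3

# Wrapping it in list() materializes the indices on both versions.
reduction_indices = list(range(1, ndims))
print(reduction_indices)  # [1, 2]
```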

View File: tensorflow/contrib/linear_optimizer/kernels/sdca_ops.cc

@ -73,11 +73,6 @@ struct Regularizations {
float symmetric_l2 = 0;
};
struct RegularizationLoss {
double l1_loss = 0;
double l2_loss = 0;
};
struct PerExampleData {
double wx = 0;
double norm = 0;
@ -102,7 +97,7 @@ using DenseFeaturesByGroup = std::vector<TTypes<const float>::Vec>;
// indicates that the contents of sparse_examples_by_group cannot be trusted or
// used.
Status FillSparseExamplesByGroup(
const int64 num_sparse_features, const int64 num_examples,
const int64 num_sparse_features, const int num_examples,
const OpInputList& sparse_features_indices_inputs,
const OpInputList& sparse_features_values_inputs,
const WeightsByGroup& sparse_weights_by_group,
@ -127,7 +122,10 @@ Status FillSparseExamplesByGroup(
static const int64 kIndicesDims = 2;
gtl::InlinedVector<int64, 8> order(kIndicesDims);
std::iota(order.begin(), order.end(), 0);
for (int64 i = begin; i < end; ++i) {
// The static_cast here is safe since begin and end can be at most
// num_examples, which is an int.
for (int i = static_cast<int>(begin); i < end; ++i) {
if (sparse_features_indices_inputs[i].shape().dims() != kIndicesDims) {
mutex_lock l(mu);
result = errors::InvalidArgument(strings::Printf(
@ -147,7 +145,7 @@ Status FillSparseExamplesByGroup(
if (example_index < 0 || example_index >= num_examples) {
mutex_lock l(mu);
result = errors::Internal(strings::Printf(
"Example indices should be in [0, %lld). Encountered: %lld",
"Example indices should be in [0, %d). Encountered: %lld",
num_examples, example_index));
return;
}
@ -203,35 +201,6 @@ inline double Shrink(const double weight, const double shrink_by) {
return 0.0;
}
// Compute L1 and L2 regularization loss.
inline RegularizationLoss ComputeRegularizationLoss(
const WeightsByGroup& sparse_weights_by_group,
const WeightsByGroup& dense_weights_by_group,
const Regularizations& regularizations) {
RegularizationLoss result;
const double shrink_by = ShrinkageFactor(regularizations);
auto accumulate_regularization_loss = [&](const double w) {
const double sw = std::abs(Shrink(w, shrink_by));
result.l1_loss += sw;
result.l2_loss += sw * sw;
};
for (const TTypes<float>::Vec weights : sparse_weights_by_group) {
for (int64 i = 0; i < weights.size(); ++i) {
accumulate_regularization_loss(weights(i));
}
}
for (const TTypes<float>::Vec weights : dense_weights_by_group) {
accumulate_regularization_loss(weights(0));
}
result.l1_loss *= regularizations.symmetric_l1;
result.l2_loss *= regularizations.symmetric_l2;
return result;
}
// Compute PerExampleData which contains the logits, and weighted example norm
// for a given example_id. Norm is weighted by 1/(lambda*N).
inline PerExampleData ComputeWxAndWeightedExampleNorm(
@ -380,7 +349,7 @@ WeightsByGroup MakeDeltaWeightsFrom(std::vector<Tensor>* const tensors) {
}
Status RunTrainStepsForMiniBatch(
const int64 num_examples, const TTypes<const string>::Vec example_ids,
const int num_examples, const TTypes<const string>::Vec example_ids,
const TTypes<const float>::Vec example_labels,
const TTypes<const float>::Vec example_weights,
const DeviceBase::CpuWorkerThreads& worker_threads,
@ -459,6 +428,13 @@ Status RunTrainStepsForMiniBatch(
return train_step_status;
}
Status FillRegularizations(OpKernelConstruction* const context,
Regularizations* const regularizations) {
TF_RETURN_IF_ERROR(context->GetAttr("l1", &regularizations->symmetric_l1));
TF_RETURN_IF_ERROR(context->GetAttr("l2", &regularizations->symmetric_l2));
return Status::OK();
}
} // namespace
class SdcaSolver : public OpKernel {
@ -484,25 +460,9 @@ class SdcaSolver : public OpKernel {
OP_REQUIRES(
context, num_sparse_features_ + num_dense_features_ > 0,
errors::InvalidArgument("Requires at least one feature to train."));
OP_REQUIRES_OK(context,
context->GetAttr("l1", &regularizations_.symmetric_l1));
OP_REQUIRES_OK(context,
context->GetAttr("l2", &regularizations_.symmetric_l2));
// We enforce a minimal l2, required by the algorithm.
regularizations_.symmetric_l2 =
std::max(regularizations_.symmetric_l2, 1.0f);
OP_REQUIRES_OK(context, FillRegularizations(context, &regularizations_));
OP_REQUIRES_OK(context, context->GetAttr("num_inner_iterations",
&num_inner_iterations_));
// TODO(rohananil): Provide empirical evidence for this. It is better to run
// more than one iteration on a single mini-batch, as we want to spend more
// time in compute. SDCA works better with larger mini-batches, and there
// is also recent work that shows it's better to reuse old samples than train
// on new samples. See: http://arxiv.org/abs/1602.02136.
num_inner_iterations_ =
std::max(num_inner_iterations_, static_cast<int64>(2));
OP_REQUIRES_OK(context, context->GetAttr("container", &container_));
OP_REQUIRES_OK(context, context->GetAttr("solver_uuid", &solver_uuid_));
}
@ -533,21 +493,16 @@ class SdcaSolver : public OpKernel {
OP_REQUIRES(context, TensorShapeUtils::IsVector(example_weights_t->shape()),
errors::InvalidArgument("example_weights should be a vector."));
const auto example_weights = example_weights_t->vec<float>();
Eigen::Tensor<float, 0, Eigen::RowMajor> example_weights_sum;
example_weights_sum.device(context->eigen_cpu_device()) =
example_weights.sum();
const float weighted_examples = example_weights_sum();
const int64 num_examples = example_weights.size();
OP_REQUIRES(context, weighted_examples > 0,
errors::InvalidArgument("No weighted examples in ",
num_examples, " training examples"));
OP_REQUIRES(context,
example_weights.size() <= std::numeric_limits<int>::max(),
errors::InvalidArgument(strings::Printf(
"Too many examples in a mini-batch: %ld > %d",
example_weights.size(), std::numeric_limits<int>::max())));
const int num_examples = static_cast<int>(example_weights.size());
OpInputList dense_features_inputs;
OP_REQUIRES_OK(
context, context->input_list("dense_features", &dense_features_inputs));
DenseFeaturesByGroup dense_features_by_group;
for (const auto& dense_feature : dense_features_inputs) {
dense_features_by_group.emplace_back(dense_feature.vec<float>());
@ -562,7 +517,7 @@ class SdcaSolver : public OpKernel {
OP_REQUIRES(context, example_labels.size() == num_examples,
errors::InvalidArgument(strings::Printf(
"The number of example labels (%ld) should match the "
"number of example weights (%lld).",
"number of example weights (%d).",
example_labels.size(), num_examples)));
const Tensor* example_ids_t;
@ -573,7 +528,7 @@ class SdcaSolver : public OpKernel {
OP_REQUIRES(context, example_ids.size() == num_examples,
errors::InvalidArgument(strings::Printf(
"The number of example ids (%ld) should match the number "
"of example weights (%lld).",
"of example weights (%d).",
example_ids.size(), num_examples)));
const int64 num_duplicate_example_ids = [&] {
// TODO(katsiapis): Benchmark and/or optimize.
@ -632,12 +587,7 @@ class SdcaSolver : public OpKernel {
SetZeroDeltaWeights(&sparse_delta_weights_by_group,
&dense_delta_weights_by_group);
// TODO(rohananil): Provide emperical evidence for this. It is better to run
// more than one iteration on single mini-batch as we want to spend more
// time in compute. SDCA works better with larger mini batches and there
// is also recent work that shows its better to reuse old samples than train
// on new samples. See: http://arxiv.org/abs/1602.02136.
for (int64 i = 0; i < num_inner_iterations_; ++i) {
for (int i = 0; i < num_inner_iterations_; ++i) {
OP_REQUIRES_OK(
context,
RunTrainStepsForMiniBatch(
@ -669,7 +619,7 @@ class SdcaSolver : public OpKernel {
int64 num_sparse_features_;
int64 num_dense_features_;
Regularizations regularizations_;
int64 num_inner_iterations_;
int num_inner_iterations_;
string container_;
string solver_uuid_;
};
@ -678,13 +628,7 @@ REGISTER_KERNEL_BUILDER(Name("SdcaSolver").Device(DEVICE_CPU), SdcaSolver);
class SdcaShrinkL1 : public OpKernel {
public:
explicit SdcaShrinkL1(OpKernelConstruction* context) : OpKernel(context) {
OP_REQUIRES_OK(context,
context->GetAttr("l1", &regularizations_.symmetric_l1));
OP_REQUIRES_OK(context,
context->GetAttr("l2", &regularizations_.symmetric_l2));
// We enforce a minimal l2, required by the algorithm.
regularizations_.symmetric_l2 =
std::max(regularizations_.symmetric_l2, 1.0f);
OP_REQUIRES_OK(context, FillRegularizations(context, &regularizations_));
}
void Compute(OpKernelContext* context) override {
@ -709,19 +653,10 @@ class SdcaShrinkL1 : public OpKernel {
};
REGISTER_KERNEL_BUILDER(Name("SdcaShrinkL1").Device(DEVICE_CPU), SdcaShrinkL1);
class ComputeDualityGap : public OpKernel {
class SdcaTrainingStats : public OpKernel {
public:
explicit ComputeDualityGap(OpKernelConstruction* context)
explicit SdcaTrainingStats(OpKernelConstruction* context)
: OpKernel(context) {
// TODO(rohananil): Refactor grabbing common attributes across ops related
// to sdca.
OP_REQUIRES_OK(context,
context->GetAttr("l1", &regularizations_.symmetric_l1));
OP_REQUIRES_OK(context,
context->GetAttr("l2", &regularizations_.symmetric_l2));
// We enforce a minimal l2, required by the algorithm.
regularizations_.symmetric_l2 =
std::max(regularizations_.symmetric_l2, 1.0f);
OP_REQUIRES_OK(context, context->GetAttr("container", &container_));
OP_REQUIRES_OK(context, context->GetAttr("solver_uuid", &solver_uuid_));
}
@ -734,45 +669,56 @@ class ComputeDualityGap : public OpKernel {
context, !data_by_example->RefCountIsOne(),
errors::Internal("Expected shared-ownership of data_by_example."));
OpMutableInputList sparse_weights_inputs;
OP_REQUIRES_OK(context, context->mutable_input_list(
"sparse_weights", &sparse_weights_inputs));
WeightsByGroup sparse_weights_by_group =
MakeWeightsFrom(&sparse_weights_inputs);
OpMutableInputList dense_weights_inputs;
OP_REQUIRES_OK(context, context->mutable_input_list("dense_weights",
&dense_weights_inputs));
WeightsByGroup dense_weights_by_group =
MakeWeightsFrom(&dense_weights_inputs);
double example_weight_sum = 0;
double total_duality_gap = 0;
double total_primal_loss = 0;
double total_dual_loss = 0;
double total_example_weight = 0;
OP_REQUIRES_OK(context,
data_by_example->Visit([&](const DataByExample::Data& data) {
example_weight_sum += data.example_weight;
total_duality_gap += data.primal_loss + data.dual_loss;
total_primal_loss += data.primal_loss;
total_dual_loss += data.dual_loss;
total_example_weight += data.example_weight;
}));
const RegularizationLoss regularization_loss = ComputeRegularizationLoss(
sparse_weights_by_group, dense_weights_by_group, regularizations_);
total_duality_gap +=
regularization_loss.l2_loss + regularization_loss.l1_loss;
// TODO(katsiapis): Think about the most arithmetically stable way of
// computing (dual + primal) loss (if it matters).
Tensor* duality_gap_t = nullptr;
OP_REQUIRES_OK(context,
context->allocate_output("duality_gap", {}, &duality_gap_t));
duality_gap_t->scalar<float>()() = total_duality_gap / example_weight_sum;
{
Tensor* tensor = nullptr;
OP_REQUIRES_OK(context,
context->allocate_output("primal_loss", {}, &tensor));
tensor->scalar<double>()() = total_primal_loss;
}
{
Tensor* tensor = nullptr;
OP_REQUIRES_OK(context,
context->allocate_output("dual_loss", {}, &tensor));
tensor->scalar<double>()() = total_dual_loss;
}
{
OP_REQUIRES(
context, total_example_weight > 0,
errors::FailedPrecondition(
"No examples found or all examples have zero weight. Either the "
"optimizer was trained with no instances or perhaps there is a "
"bug in the training data."));
Tensor* tensor = nullptr;
OP_REQUIRES_OK(context,
context->allocate_output("example_weights", {}, &tensor));
tensor->scalar<double>()() = total_example_weight;
}
// TODO(katsiapis): Use core::ScopedUnref once it's moved out of internal.
data_by_example->Unref();
}
private:
Regularizations regularizations_;
string container_;
string solver_uuid_;
};
REGISTER_KERNEL_BUILDER(Name("ComputeDualityGap").Device(DEVICE_CPU),
ComputeDualityGap);
REGISTER_KERNEL_BUILDER(Name("SdcaTrainingStats").Device(DEVICE_CPU),
SdcaTrainingStats);
} // namespace tensorflow
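The refactor above consolidates the shared `l1`/`l2` attribute handling into `FillRegularizations` and moves the minimum-l2 clamp out of the kernels: the ops now declare `l2: float >= 1`, and the Python wrapper clamps before setting the attr. A minimal plain-Python sketch of that contract (illustrative only; `symmetric_l2_for_optimizer` is a hypothetical name):

```python
def symmetric_l2_for_optimizer(user_l2):
    # Mirrors SdcaModel._symmetric_l2_regularization: the solver requires an
    # effective l2 of at least 1.0, while the user-facing regularized loss
    # keeps the raw value the caller supplied.
    return max(user_l2, 1.0)

assert symmetric_l2_for_optimizer(0.0) == 1.0  # clamped up for the solve
assert symmetric_l2_for_optimizer(2.5) == 2.5  # larger values pass through
```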

View File

@ -24,7 +24,7 @@ REGISTER_OP("SdcaSolver")
.Attr("num_dense_features: int >= 0")
.Attr("l1: float >= 0")
.Attr("l2: float >= 1")
.Attr("num_inner_iterations: int >= 2")
.Attr("num_inner_iterations: int >= 1")
.Attr("container: string")
.Attr("solver_uuid: string")
.Input("sparse_features_indices: num_sparse_features * int64")
@ -69,7 +69,7 @@ example_labels: a vector which contains the label/target associated with each
example_ids: a vector which contains the unique identifier associated with each
example.
sparse_weights: a list of vectors where each value is the weight associated with
a feature index.
a feature group.
dense_weights: a list of vectors where the value is the weight associated with
a dense feature group.
)doc");
@ -89,38 +89,28 @@ num_dense_features: Number of dense feature groups to train on.
l1: Symmetric l1 regularization strength.
l2: Symmetric l2 regularization strength.
sparse_weights: a list of vectors where each value is the weight associated with
a feature index.
a feature group.
dense_weights: a list of vectors where the value is the weight associated with
a dense feature group.
)doc");
// TODO(katsiapis): We should expand the scope of this op to compute other
// statistics about the data.
REGISTER_OP("ComputeDualityGap")
.Attr("num_sparse_features: int >= 0")
.Attr("num_dense_features: int >= 0")
.Attr("l1: float >= 0")
.Attr("l2: float >= 1")
REGISTER_OP("SdcaTrainingStats")
.Attr("container: string")
.Attr("solver_uuid: string")
.Input("sparse_weights: Ref(num_sparse_features * float)")
.Input("dense_weights: Ref(num_dense_features * float)")
.Output("duality_gap: float")
.Output("primal_loss: float64")
.Output("dual_loss: float64")
.Output("example_weights: float64")
.Doc(R"doc(
Computes duality gap over all examples seen by the optimizer.
Computes statistics over all examples seen by the optimizer.
num_sparse_features: Number of sparse feature groups to train on.
num_dense_features: Number of dense feature groups to train on.
l1: Symmetric l1 regularization strength.
l2: Symmetric l2 regularization strength.
container: Name of the Container that stores data across invocations of this
Kernel. Together with SolverUUID, this forms an isolation unit for this solver.
solver_uuid: Universally Unique Identifier for this solver.
sparse_weights: a list of vectors where each value is the weight associated with
a feature index.
dense_weights: a list of vectors where the value is the weight associated with
a dense feature group.
duality_gap: duality gap over all examples seen by the optimizer.
primal_loss: total primal loss of all examples seen by the optimizer.
dual_loss: total dual loss of all examples seen by the optimizer.
example_weights: total example weights of all examples seen by the optimizer
(guaranteed to be positive; otherwise returns FAILED_PRECONDITION as it
probably indicates a bug in the training data).
)doc");
} // namespace tensorflow

View File

@ -92,6 +92,7 @@ def make_variable_dict(max_age, max_gender):
return dict(sparse_features_weights=[age_weights, gender_weights],
dense_features_weights=[])
def make_dense_variable_dict(num_dense_features, num_examples):
feature_weights = ([
tf.Variable(tf.zeros([1],
@ -130,6 +131,7 @@ def tearDown():
pass
# TODO(katsiapis): Add tests that exercise L1 and Shrinking.
class SdcaOptimizerTest(TensorFlowTestCase):
def _single_threaded_test_session(self):
@ -180,6 +182,44 @@ class SdcaOptimizerTest(TensorFlowTestCase):
rtol=1e-2,
atol=1e-2)
def testSimpleLogisticNoL2(self):
# Same as test above (so comments from above apply) but without an L2.
# The algorithm should behave as if we have an L2 of 1 in optimization but
# 0 in regularized_loss.
example_protos = [
make_example_proto(
{'age': [0],
'gender': [0]}, 0),
make_example_proto(
{'age': [1],
'gender': [1]}, 1),
]
example_weights = [1.0, 1.0]
with self._single_threaded_test_session():
examples = make_example_dict(example_protos, example_weights)
variables = make_variable_dict(1, 1)
options = dict(symmetric_l2_regularization=0,
symmetric_l1_regularization=0,
loss_type='logistic_loss')
lr = SdcaModel(CONTAINER, examples, variables, options)
tf.initialize_all_variables().run()
unregularized_loss = lr.unregularized_loss(examples)
loss = lr.regularized_loss(examples)
predictions = lr.predictions(examples)
self.assertAllClose(0.693147, unregularized_loss.eval())
self.assertAllClose(0.693147, loss.eval())
for _ in xrange(5):
lr.minimize().run()
self.assertAllClose(0.411608, unregularized_loss.eval(), rtol=0.11)
self.assertAllClose(0.371705, loss.eval(), atol=0.01)
predicted_labels = get_binary_predictions_for_logistic(predictions)
self.assertAllEqual([0, 1], predicted_labels.eval())
self.assertAllClose(0.01,
lr.approximate_duality_gap().eval(),
rtol=1e-2,
atol=1e-2)
def testSomeUnweightedExamples(self):
# Setup test data with 4 examples, but should produce the same
# results as testSimple.
@ -272,10 +312,11 @@ class SdcaOptimizerTest(TensorFlowTestCase):
lr = SdcaModel(CONTAINER, examples, variables, options)
tf.initialize_all_variables().run()
self.assertAllClose([0.5, 0.5], lr.predictions(examples).eval())
with self.assertRaisesOpError(
'No weighted examples in 2 training examples'):
lr.minimize().run()
lr.minimize().run()
self.assertAllClose([0.5, 0.5], lr.predictions(examples).eval())
with self.assertRaisesOpError(
'No examples found or all examples have zero weight.'):
lr.approximate_duality_gap().eval()
def testDuplicateExampleIds(self):
# Setup test data with 1 positive, and 1 negative example.

View File

@ -28,7 +28,6 @@ from tensorflow.python.framework.ops import name_scope
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import control_flow_ops
from tensorflow.python.ops import math_ops
from tensorflow.python.ops import state_ops
from tensorflow.python.ops import variables as var_ops
from tensorflow.python.ops.nn import sigmoid_cross_entropy_with_logits
from tensorflow.python.platform import resource_loader
@ -139,30 +138,35 @@ class SdcaModel(object):
['loss_type', 'symmetric_l2_regularization',
'symmetric_l1_regularization'], options)
for name in ['symmetric_l1_regularization', 'symmetric_l2_regularization']:
value = options[name]
if value < 0.0:
raise ValueError('%s should be non-negative. Found (%f)' %
(name, value))
self._container = container
self._examples = examples
self._variables = variables
self._options = options
self._solver_uuid = uuid.uuid4().hex
self._create_slots(variables)
self._create_slots()
# TODO(rohananil): Use optimizer interface to make use of slot creation
# logic
def _create_slots(self, variables):
self._slots = {}
# TODO(rohananil): Rename the slot keys to "unshrinked" weights.
self._slots['sparse_features_weights'] = []
self._slots['dense_features_weights'] = []
self._assign_ops = []
# Make an internal variable which has the updates before applying L1
def _symmetric_l2_regularization(self):
# Algorithmic requirement (for now) is to have minimal l2 of 1.0
return max(self._options['symmetric_l2_regularization'], 1.0)
# TODO(rohananil): Use optimizer interface to make use of slot creation logic.
def _create_slots(self):
# Make internal variables which have the updates before applying L1
# regularization.
for var_type in ['sparse_features_weights', 'dense_features_weights']:
for var in variables[var_type]:
if var is not None:
self._slots[var_type].append(var_ops.Variable(array_ops.zeros_like(
var.initialized_value(), dtypes.float32)))
self._assign_ops.append(state_ops.assign(var, self._slots[var_type][
-1]))
self._slots = {
'unshrinked_sparse_features_weights': [],
'unshrinked_dense_features_weights': [],
}
for name in ['sparse_features_weights', 'dense_features_weights']:
for var in self._variables[name]:
self._slots['unshrinked_' + name].append(var_ops.Variable(
array_ops.zeros_like(var.initialized_value(), dtypes.float32)))
def _assertSpecified(self, items, check_in):
for x in items:
@ -177,33 +181,22 @@ class SdcaModel(object):
def _l1_loss(self):
"""Computes the l1 loss of the model."""
with name_scope('l1_loss'):
sparse_weights = self._convert_n_to_tensor(self._variables[
'sparse_features_weights'])
dense_weights = self._convert_n_to_tensor(self._variables[
'dense_features_weights'])
l1 = self._options['symmetric_l1_regularization']
loss = 0.0
for w in sparse_weights:
loss += l1 * math_ops.reduce_sum(abs(w))
for w in dense_weights:
loss += l1 * math_ops.reduce_sum(abs(w))
return loss
sums = 0.0
for name in ['sparse_features_weights', 'dense_features_weights']:
  for weights in self._convert_n_to_tensor(self._variables[name]):
    sums += math_ops.reduce_sum(math_ops.abs(weights))
# SDCA L1 regularization cost is: l1 * sum(|weights|)
return self._options['symmetric_l1_regularization'] * sums
def _l2_loss(self):
def _l2_loss(self, l2):
"""Computes the l2 loss of the model."""
with name_scope('l2_loss'):
sparse_weights = self._convert_n_to_tensor(self._variables[
'sparse_features_weights'])
dense_weights = self._convert_n_to_tensor(self._variables[
'dense_features_weights'])
l2 = self._options['symmetric_l2_regularization']
loss = 0.0
for w in sparse_weights:
loss += l2 * math_ops.reduce_sum(math_ops.square(w))
for w in dense_weights:
loss += l2 * math_ops.reduce_sum(math_ops.square(w))
# SDCA L2 regularization cost is 1/2 * l2 * sum(weights^2)
return loss / 2.0
sums = 0.0
for name in ['sparse_features_weights', 'dense_features_weights']:
  for weights in self._convert_n_to_tensor(self._variables[name]):
    sums += math_ops.reduce_sum(math_ops.square(weights))
# SDCA L2 regularization cost is: l2 * sum(weights^2) / 2
return l2 * sums / 2
def _convert_n_to_tensor(self, input_list, as_ref=False):
"""Converts input list to a set of tensors."""
@ -265,31 +258,44 @@ class SdcaModel(object):
"""
with name_scope('sdca/minimize'):
sparse_features_indices = []
sparse_features_weights = []
sparse_features_values = []
for sf in self._examples['sparse_features']:
sparse_features_indices.append(convert_to_tensor(sf.indices))
sparse_features_weights.append(convert_to_tensor(sf.values))
sparse_features_values.append(convert_to_tensor(sf.values))
step_op = _sdca_ops.sdca_solver(
sparse_features_indices,
sparse_features_weights,
sparse_features_values,
self._convert_n_to_tensor(self._examples['dense_features']),
convert_to_tensor(self._examples['example_weights']),
convert_to_tensor(self._examples['example_labels']),
convert_to_tensor(self._examples['example_ids']),
self._convert_n_to_tensor(self._slots['sparse_features_weights'],
as_ref=True),
self._convert_n_to_tensor(self._slots['dense_features_weights'],
as_ref=True),
self._convert_n_to_tensor(
self._slots['unshrinked_sparse_features_weights'],
as_ref=True),
self._convert_n_to_tensor(
self._slots['unshrinked_dense_features_weights'],
as_ref=True),
l1=self._options['symmetric_l1_regularization'],
l2=self._options['symmetric_l2_regularization'],
l2=self._symmetric_l2_regularization(),
# TODO(rohananil): Provide empirical evidence for this. It is better
# to run more than one iteration on a single mini-batch as we want to
# spend more time in compute. SDCA works better with larger
# mini-batches, and there is also recent work that shows it's better to
# reuse old samples than train on new samples.
# See: http://arxiv.org/abs/1602.02136.
num_inner_iterations=2,
loss_type=self._options['loss_type'],
container=self._container,
solver_uuid=self._solver_uuid)
with ops.control_dependencies([step_op]):
assign_ops = control_flow_ops.group(*self._assign_ops)
with ops.control_dependencies([assign_ops]):
assign_ops = []
for name in ['sparse_features_weights', 'dense_features_weights']:
for var, slot_var in zip(self._variables[name],
self._slots['unshrinked_' + name]):
assign_ops.append(var.assign(slot_var))
assign_group = control_flow_ops.group(*assign_ops)
with ops.control_dependencies([assign_group]):
return _sdca_ops.sdca_shrink_l1(
self._convert_n_to_tensor(
self._variables['sparse_features_weights'],
@ -298,7 +304,7 @@ class SdcaModel(object):
self._variables['dense_features_weights'],
as_ref=True),
l1=self._options['symmetric_l1_regularization'],
l2=self._options['symmetric_l2_regularization'])
l2=self._symmetric_l2_regularization())
def approximate_duality_gap(self):
"""Add operations to compute the approximate duality gap.
@ -307,15 +313,14 @@ class SdcaModel(object):
An Operation that computes the approximate duality gap over all
examples.
"""
return _sdca_ops.compute_duality_gap(
self._convert_n_to_tensor(self._slots['sparse_features_weights'],
as_ref=True),
self._convert_n_to_tensor(self._slots['dense_features_weights'],
as_ref=True),
l1=self._options['symmetric_l1_regularization'],
l2=self._options['symmetric_l2_regularization'],
(primal_loss, dual_loss, example_weights) = _sdca_ops.sdca_training_stats(
container=self._container,
solver_uuid=self._solver_uuid)
# Note that example_weights is guaranteed to be positive by
# sdca_training_stats so dividing by it is safe.
return (primal_loss + dual_loss + math_ops.to_double(self._l1_loss()) +
(2.0 * math_ops.to_double(self._l2_loss(
self._symmetric_l2_regularization())))) / example_weights
def unregularized_loss(self, examples):
"""Add operations to compute the loss (without the regularization loss).
@ -384,6 +389,11 @@ class SdcaModel(object):
self._assertList(['sparse_features', 'dense_features'], examples)
with name_scope('sdca/regularized_loss'):
weights = convert_to_tensor(examples['example_weights'])
return ((
(self._l1_loss() + self._l2_loss()) / math_ops.reduce_sum(weights)) +
return (((
self._l1_loss() +
# Note that here we are using the raw regularization
# (as specified by the user) and *not*
# self._symmetric_l2_regularization().
self._l2_loss(self._options['symmetric_l2_regularization'])) /
math_ops.reduce_sum(weights)) +
self.unregularized_loss(examples))
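To make the gap computation above concrete, here is a toy calculation with made-up numbers; the names mirror the outputs of `sdca_training_stats` and the two loss helpers, and nothing here calls TensorFlow:

```python
# Hypothetical statistics, as SdcaTrainingStats would report them.
primal_loss = 1.40     # total primal loss over all examples seen
dual_loss = -1.25      # total dual loss over all examples seen
example_weights = 2.0  # total example weight (the op guarantees > 0)

# Regularization terms recomputed from the current weights.
l1_loss = 0.05         # l1 * sum(|w|)
l2_loss = 0.10         # l2 * sum(w^2) / 2, with the clamped l2 >= 1.0

# Same arithmetic as approximate_duality_gap above.
gap = (primal_loss + dual_loss + l1_loss + 2.0 * l2_loss) / example_weights
print(gap)  # 0.2
```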

View File

@ -127,7 +127,7 @@ replicated model. Possible approaches include:
* As above, but where the gradients from all workers are averaged. See the
[CIFAR-10 multi-GPU trainer](https://www.tensorflow.org/code/tensorflow/models/image/cifar10/cifar10_multi_gpu_train.py)
for an example of this form of replication. The implements *synchronous* training
for an example of this form of replication. This implements *synchronous* training.
* The "distributed trainer" approach uses multiple graphs&mdash;one per
worker&mdash;where each graph contains one set of parameters (pinned to

View File

@ -1089,6 +1089,7 @@ filegroup(
"avgpooling_op.cc",
"batch_norm_op.cc",
"bcast_ops.cc",
"check_numerics_op.cc",
"control_flow_ops.cc",
"conv_2d.h",
"conv_ops.cc",

View File

@ -26,26 +26,15 @@ namespace tensorflow {
// Check that 0 <= index < limit using a single comparison, assuming
// that 0 <= limit if Index is signed. Intended for use in performance
// critical contexts where 0 <= index < limit is almost always true.
template <class Index>
EIGEN_ALWAYS_INLINE bool FastBoundsCheck(Index index, Index limit) {
typedef typename std::make_unsigned<Index>::type UIndex;
template <typename Ta, typename Tb>
EIGEN_ALWAYS_INLINE bool FastBoundsCheck(const Ta index, const Tb limit) {
static_assert(std::is_integral<Ta>::value && std::is_integral<Tb>::value,
"FastBoundsCheck can only be used on integer types.");
typedef typename std::make_unsigned<decltype(index + limit)>::type UIndex;
return TF_PREDICT_TRUE(static_cast<UIndex>(index) <
static_cast<UIndex>(limit));
}
// Upcasting specializations when the index and bounds do not match;
// always move to the larger type.
EIGEN_ALWAYS_INLINE bool FastBoundsCheck(int64 index, int32 limit) {
return TF_PREDICT_TRUE(static_cast<uint64>(index) <
static_cast<uint64>(limit));
}
EIGEN_ALWAYS_INLINE bool FastBoundsCheck(int32 index, int64 limit) {
return TF_PREDICT_TRUE(static_cast<uint64>(index) <
static_cast<uint64>(limit));
}
namespace internal {
// Ensure that the compiler cannot elide a copy into a local, for
// bounds checking on source tensors that might be updated asynchronously.
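The single-comparison bounds check above relies on unsigned wraparound: a negative index, reinterpreted as unsigned, becomes a huge value and fails `index < limit` exactly like an index that is too large. A quick plain-Python emulation of the idea (a 64-bit width is assumed for illustration):

```python
def fast_bounds_check(index, limit, bits=64):
    # Reinterpret both operands as unsigned integers of the promoted width,
    # as the static_cast<UIndex> in the C++ template does.
    mask = (1 << bits) - 1
    return (index & mask) < (limit & mask)

assert fast_bounds_check(3, 10)       # 0 <= 3 < 10
assert not fast_bounds_check(10, 10)  # index == limit fails
assert not fast_bounds_check(-1, 10)  # -1 wraps to 2**64 - 1, also fails
```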

View File

@ -1398,7 +1398,7 @@ class Conv2DSlowBackpropFilterOp : public OpKernel {
// [filter_rows, filter_cols, in_depth, out_depth];
// And we need to reverse the filter backprops
// So we need to allocate (sigh) yet another piece of memory to hold the
// ouptut.
// output.
TensorShape filter_shuffle_shape(
{out_depth, filter_rows, filter_cols, in_depth});
Tensor filter_shuffle;

View File

@ -246,7 +246,7 @@ __global__ void SwapDimension1And2InTensor3UsingTiles(const T* input,
}
}
// A Cuda custom kernel that converst input to output, given proper padding on
// A Cuda custom kernel that converts input to output, given proper padding on
// the left and the top. The padded value is zero.
template <typename T>
__global__ void PadInputCustomKernelNHWC(int nthreads, const T* input,

View File

@ -45,6 +45,28 @@ class DiagonalGenerator {
private:
Tensor diagonal_;
};
template <typename T, size_t NumDims>
class DiagonalExtractor {
public:
explicit DiagonalExtractor(const Tensor& tensor) : tensor_(tensor) {
CHECK_EQ(tensor.dims(), 2 * NumDims);
}
T operator()(const Eigen::array<Eigen::Index, NumDims>& coordinates) const {
Eigen::array<Eigen::Index, 2 * NumDims> index;
for (size_t j = 0; j < NumDims; ++j) {
  index[j] = coordinates[j];
}
for (size_t j = NumDims; j < 2 * NumDims; ++j) {
  index[j] = index[j - NumDims];
}
return tensor_.tensor<T, 2 * NumDims>()(index);
}
private:
Tensor tensor_;
};
} // namespace
// Generate the diagonal tensor with the diagonal set to the input tensor.
@ -58,12 +80,9 @@ class DiagOp : public OpKernel {
void Compute(OpKernelContext* context) override {
const Tensor& diagonal = context->input(0);
const int num_dims = diagonal.dims();
OP_REQUIRES(context, 1 <= num_dims,
errors::InvalidArgument(
"The rank of the diagonal should be between 1 and 3."));
OP_REQUIRES(context, 3 >= num_dims,
errors::InvalidArgument(
"The rank of the diagonal should be between 1 and 3."));
OP_REQUIRES(context, 1 <= num_dims && num_dims <= 3,
errors::InvalidArgument("Expected 1 <= dims <= 3, got shape ",
diagonal.shape().DebugString()));
TensorShape out_shape;
for (int i = 0; i < num_dims; ++i) {
out_shape.AddDim(diagonal.dim_size(i));
@ -105,4 +124,71 @@ REGISTER_DIAGOP(int32);
REGISTER_DIAGOP(int64);
#undef REGISTER_DIAGOP
// Generate the diagonal tensor with the diagonal set to the input tensor.
// It only allows rank 2, 4, or 6 input tensor, so the output tensor is
// rank 1, 2, or 3.
template <typename T>
class DiagPartOp : public OpKernel {
public:
explicit DiagPartOp(OpKernelConstruction* context) : OpKernel(context) {}
void Compute(OpKernelContext* context) override {
const Tensor& tensor = context->input(0);
const int num_dims = tensor.dims();
const int out_dims = num_dims / 2;
OP_REQUIRES(context, 2 == num_dims || 4 == num_dims || 6 == num_dims,
            errors::InvalidArgument("The rank of the tensor should be 2, "
                                    "4, or 6, got shape ",
                                    tensor.shape().DebugString()));
for (int i = 0; i < out_dims; i++) {
  OP_REQUIRES(context, tensor.dim_size(i) == tensor.dim_size(i + out_dims),
              errors::InvalidArgument(
                  "Invalid shape ", tensor.shape().DebugString(),
                  ": dimensions ", i, " and ", i + out_dims,
                  " do not match."));
}
TensorShape out_shape;
for (int i = 0; i < out_dims; ++i) {
out_shape.AddDim(tensor.dim_size(i));
}
Tensor* output = nullptr;
OP_REQUIRES_OK(context,
context->allocate_output(0, out_shape, &output));
switch (num_dims) {
case 2:
output->tensor<T, 1>() = output->tensor<T, 1>().generate(
DiagonalExtractor<T, 1>(tensor));
break;
case 4:
output->tensor<T, 2>() = output->tensor<T, 2>().generate(
DiagonalExtractor<T, 2>(tensor));
break;
case 6:
output->tensor<T, 3>() = output->tensor<T, 3>().generate(
DiagonalExtractor<T, 3>(tensor));
break;
default:
context->SetStatus(errors::Unimplemented(
"Diagonal of rank ", num_dims, " tensor is not supported yet."));
return;
}
}
};
#define REGISTER_DIAGPARTOP(T) \
REGISTER_KERNEL_BUILDER( \
Name("DiagPart").Device(DEVICE_CPU).TypeConstraint<T>("T"), DiagPartOp<T>)
REGISTER_DIAGPARTOP(double);
REGISTER_DIAGPARTOP(float);
REGISTER_DIAGPARTOP(int32);
REGISTER_DIAGPARTOP(int64);
#undef REGISTER_DIAGPARTOP
} // namespace tensorflow

View File

@ -94,7 +94,7 @@ class MatrixSolveLsOp
}
if (fast_) {
// The fast branch assumes that the matrix is not rank deficient and
// not too ill-conditioned. Specifically, the reciprobal condition number
// not too ill-conditioned. Specifically, the reciprocal condition number
// should be greater than the square root of the machine precision, i.e.
// 1 / cond(matrix) > sqrt(std::numeric_limits<Scalar>::epsilon()).
// This branch solves over- or underdetermined least-squares problems

View File

@ -84,6 +84,7 @@ struct ReduceFunctor<GPUDevice, Eigen::internal::MeanReducer<T> > {
DEFINE_FOR_TYPE_AND_R(T, Eigen::internal::ProdReducer<T>)
DEFINE_FOR_ALL_REDUCERS(float);
DEFINE_FOR_ALL_REDUCERS(double);
#undef DEFINE_FOR_ALL_REDUCERS
DEFINE_FOR_TYPE_AND_R(complex64, Eigen::internal::SumReducer<complex64>);

View File

@ -34,6 +34,7 @@ TF_CALL_REAL_NUMBER_TYPES(REGISTER_CPU_KERNELS);
.HostMemory("reduction_indices"), \
ReductionOp<GPUDevice, type, Eigen::internal::MaxReducer<type>>);
REGISTER_GPU_KERNELS(float);
REGISTER_GPU_KERNELS(double);
#undef REGISTER_GPU_KERNELS
#endif

View File

@ -34,6 +34,7 @@ TF_CALL_REAL_NUMBER_TYPES(REGISTER_CPU_KERNELS);
.HostMemory("reduction_indices"), \
ReductionOp<GPUDevice, type, Eigen::internal::MinReducer<type>>);
REGISTER_GPU_KERNELS(float);
REGISTER_GPU_KERNELS(double);
#undef REGISTER_GPU_KERNELS
#endif

View File

@ -34,6 +34,7 @@ TF_CALL_REAL_NUMBER_TYPES(REGISTER_CPU_KERNELS);
.HostMemory("reduction_indices"), \
ReductionOp<GPUDevice, type, Eigen::internal::ProdReducer<type>>);
REGISTER_GPU_KERNELS(float);
REGISTER_GPU_KERNELS(double);
#undef REGISTER_GPU_KERNELS
#endif

View File

@ -41,6 +41,7 @@ REGISTER_KERNEL_BUILDER(
.HostMemory("reduction_indices"), \
ReductionOp<GPUDevice, type, Eigen::internal::SumReducer<type>>);
REGISTER_GPU_KERNELS(float);
REGISTER_GPU_KERNELS(double);
#undef REGISTER_GPU_KERNELS
REGISTER_KERNEL_BUILDER(

View File

@ -26,6 +26,10 @@ limitations under the License.
#include "tensorflow/core/lib/core/status.h"
#include "tensorflow/core/platform/logging.h"
#if GOOGLE_CUDA
#include "tensorflow/core/kernels/resize_nearest_neighbor_op_gpu.h"
#endif // GOOGLE_CUDA
namespace tensorflow {
typedef Eigen::ThreadPoolDevice CPUDevice;
@ -58,10 +62,10 @@ class ResizeNearestNeighborOp : public OpKernel {
// Initialize shape to the batch size of the input, then add
// the rest of the dimensions
Tensor* output = nullptr;
OP_REQUIRES_OK(context, context->allocate_output(
0, TensorShape({input.dim_size(0), sizes(0),
sizes(1), input.dim_size(3)}),
&output));
OP_REQUIRES_OK(context,
               context->allocate_output(
                   0, TensorShape({input.dim_size(0), sizes(0), sizes(1),
                                   input.dim_size(3)}),
                   &output));
const int64 batch_size = input.dim_size(0);
const int64 in_height = input.dim_size(1);
@ -132,10 +136,10 @@ class ResizeNearestNeighborOpGrad : public OpKernel {
// Initialize shape to the batch size of the input, then add
// the rest of the dimensions
Tensor* output = nullptr;
OP_REQUIRES_OK(context, context->allocate_output(
0, TensorShape({input.dim_size(0), sizes(0),
sizes(1), input.dim_size(3)}),
&output));
OP_REQUIRES_OK(context,
               context->allocate_output(
                   0, TensorShape({input.dim_size(0), sizes(0), sizes(1),
                                   input.dim_size(3)}),
                   &output));
const int64 batch_size = input.dim_size(0);
const int64 in_height = input.dim_size(1);
@ -204,4 +208,83 @@ TF_CALL_REAL_NUMBER_TYPES(REGISTER_KERNEL);
#undef REGISTER_KERNEL
#if GOOGLE_CUDA
template <typename T>
class ResizeNearestNeighborGPUOp : public OpKernel {
public:
explicit ResizeNearestNeighborGPUOp(OpKernelConstruction* context)
: OpKernel(context) {
OP_REQUIRES_OK(context, context->GetAttr("align_corners", &align_corners_));
}
void Compute(OpKernelContext* context) override {
const Tensor& input = context->input(0);
OP_REQUIRES(context, input.dims() == 4,
errors::InvalidArgument("input must be 4-dimensional",
input.shape().DebugString()));
const Tensor& shape_t = context->input(1);
OP_REQUIRES(context, shape_t.dims() == 1,
errors::InvalidArgument("shape_t must be 1-dimensional",
shape_t.shape().DebugString()));
OP_REQUIRES(context, shape_t.NumElements() == 2,
errors::InvalidArgument("shape_t must have two elements",
shape_t.shape().DebugString()));
auto sizes = shape_t.vec<int32>();
OP_REQUIRES(context, sizes(0) > 0 && sizes(1) > 0,
errors::InvalidArgument("shape_t's elements must be positive"));
// Initialize shape to the batch size of the input, then add
// the rest of the dimensions
Tensor* output = nullptr;
OP_REQUIRES_OK(context,
               context->allocate_output(
                   0, TensorShape({input.dim_size(0), sizes(0), sizes(1),
                                   input.dim_size(3)}),
                   &output));
const int64 batch_size = input.dim_size(0);
const int64 in_height = input.dim_size(1);
const int64 in_width = input.dim_size(2);
const int64 channels = input.dim_size(3);
const int64 out_height = output->dim_size(1);
const int64 out_width = output->dim_size(2);
const float height_scale =
(align_corners_ && out_height > 1)
? (in_height - 1) / static_cast<float>(out_height - 1)
: in_height / static_cast<float>(out_height);
const float width_scale =
(align_corners_ && out_width > 1)
? (in_width - 1) / static_cast<float>(out_width - 1)
: in_width / static_cast<float>(out_width);
bool status = ResizeNearestNeighbor<T>(
input.flat<T>().data(), batch_size, in_height,
in_width, channels, out_height, out_width,
height_scale, width_scale, output->flat<T>().data(),
context->eigen_gpu_device());
if (!status) {
context->SetStatus(
errors::Internal("Failed launching ResizeNearestNeighbor"));
}
}
private:
bool align_corners_;
};
#define REGISTER_KERNEL(T) \
REGISTER_KERNEL_BUILDER(Name("ResizeNearestNeighbor") \
.Device(DEVICE_GPU) \
.TypeConstraint<T>("T") \
.HostMemory("size"), \
ResizeNearestNeighborGPUOp<T>);
TF_CALL_GPU_NUMBER_TYPES(REGISTER_KERNEL);
#undef REGISTER_KERNEL
#endif // GOOGLE_CUDA
} // namespace tensorflow
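For reference, the index math shared by the CPU path and the new GPU kernel reduces to a scale factor plus a clamped floor. A small plain-Python sketch of that mapping (illustrative, not the actual kernel; the helper names are hypothetical):

```python
import math

def resize_scale(in_size, out_size, align_corners):
    # Same formula as height_scale / width_scale in the op above.
    if align_corners and out_size > 1:
        return (in_size - 1) / float(out_size - 1)
    return in_size / float(out_size)

def nearest_source_index(out_index, scale, in_size):
    # Matches the CUDA kernel: min(floor(out * scale), in_size - 1).
    return min(int(math.floor(out_index * scale)), in_size - 1)

scale = resize_scale(4, 8, align_corners=False)  # 0.5
print([nearest_source_index(x, scale, 4) for x in range(8)])
# [0, 0, 1, 1, 2, 2, 3, 3]: each source pixel is repeated twice.
```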

View File

@ -0,0 +1,52 @@
/* Copyright 2015 Google Inc. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
==============================================================================*/
#include "tensorflow/core/common_runtime/kernel_benchmark_testlib.h"
#include "tensorflow/core/framework/tensor.h"
#include "tensorflow/core/graph/node_builder.h"
#include "tensorflow/core/platform/test.h"
#include "tensorflow/core/platform/test_benchmark.h"
namespace tensorflow {
static Graph* BM_ResizeNearestNeighbor(int batches, int width, int height) {
Graph* g = new Graph(OpRegistry::Global());
Tensor in(DT_FLOAT, TensorShape({batches, width, height, 3}));
in.flat<float>().setRandom();
Tensor out_size(DT_INT32, TensorShape({2}));
auto out_size_flat = out_size.flat<int32>();
out_size_flat(0) = width * 2;
out_size_flat(1) = height * 2;
Node* ret;
NodeBuilder(g->NewName("n"), "ResizeNearestNeighbor")
.Input(test::graph::Constant(g, in))
.Input(test::graph::Constant(g, out_size))
.Finalize(g, &ret);
return g;
}
#define BM_ResizeNearestNeighborDev(DEVICE, B, W, H) \
static void BM_ResizeNearestNeighbor_##DEVICE##_##B##_##W##_##H(int iters) { \
testing::ItemsProcessed(iters* B* W* H * 3); \
test::Benchmark(#DEVICE, BM_ResizeNearestNeighbor(B, W, H)).Run(iters); \
} \
BENCHMARK(BM_ResizeNearestNeighbor_##DEVICE##_##B##_##W##_##H)
BM_ResizeNearestNeighborDev(cpu, 1, 499, 499);
BM_ResizeNearestNeighborDev(gpu, 1, 499, 499);
} // namespace tensorflow

View File

@ -0,0 +1,86 @@
/* Copyright 2015 Google Inc. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
==============================================================================*/
#if GOOGLE_CUDA
#define EIGEN_USE_GPU
#include <stdio.h>
#include "tensorflow/core/kernels/resize_nearest_neighbor_op_gpu.h"
#include "tensorflow/core/framework/register_types.h"
#include "tensorflow/core/framework/tensor_types.h"
#include "tensorflow/core/util/cuda_kernel_helper.h"
namespace tensorflow {
namespace {
template <typename T>
__global__ void ResizeNearestNeighborNHWC(const int nthreads, const T* bottom_data,
const int in_height, const int in_width,
const int channels, const int out_height,
const int out_width, const float height_scale,
const float width_scale, T* top_data) {
CUDA_1D_KERNEL_LOOP(index, nthreads) {
int n = index;
int c = n % channels;
n /= channels;
int out_x = n % out_width;
n /= out_width;
int out_y = n % out_height;
n /= out_height;
const T* bottom_data_n = bottom_data + n * channels * in_height * in_width;
const int in_x = min(static_cast<int>(floorf(out_x * width_scale)), in_width - 1);
const int in_y = min(static_cast<int>(floorf(out_y * height_scale)), in_height - 1);
const int idx = (in_y * in_width + in_x) * channels + c;
top_data[index] = ldg(bottom_data_n + idx);
}
}
} // namespace
template <typename T>
bool ResizeNearestNeighbor(const T* bottom_data, const int batch,
const int in_height, const int in_width,
const int channels, const int out_height,
const int out_width, const float height_scale,
const float width_scale, T* top_data,
const Eigen::GpuDevice& d) {
const int output_size = batch * channels * out_height * out_width;
CudaLaunchConfig config = GetCudaLaunchConfig(output_size, d);
ResizeNearestNeighborNHWC<T>
<<<config.block_count, config.thread_per_block, 0, d.stream()>>>(
output_size, bottom_data, in_height, in_width, channels, out_height,
out_width, height_scale, width_scale, top_data);
return d.ok();
}
#define DECLARE_GPU_SPEC(T) \
template bool ResizeNearestNeighbor(const T* bottom_data, const int batch, \
const int in_height, const int in_width, \
const int channels, const int out_height, \
const int out_width, const float height_scale, \
const float width_scale, T* top_data, \
const Eigen::GpuDevice& d);
TF_CALL_GPU_NUMBER_TYPES(DECLARE_GPU_SPEC);
#undef DECLARE_GPU_SPEC
} // end namespace tensorflow
#endif // GOOGLE_CUDA

View File

@ -0,0 +1,37 @@
/* Copyright 2015 Google Inc. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
==============================================================================*/
#if !GOOGLE_CUDA
#error This file must only be included when building with Cuda support
#endif
#ifndef TENSORFLOW_CORE_KERNELS_RESIZE_NEAREST_NEIGHBOR_OP_GPU_H_
#define TENSORFLOW_CORE_KERNELS_RESIZE_NEAREST_NEIGHBOR_OP_GPU_H_
#include "third_party/eigen3/unsupported/Eigen/CXX11/NeuralNetworks"
#include "tensorflow/core/framework/tensor_types.h"
#include "tensorflow/core/platform/types.h"
namespace tensorflow {
template <typename T>
bool ResizeNearestNeighbor(const T* bottom_data, const int batch, const int in_height,
const int in_width, const int channels, const int out_height,
const int out_width, const float height_scale, const float width_scale,
T* top_data, const Eigen::GpuDevice& d);
} // namespace tensorflow
#endif // TENSORFLOW_CORE_KERNELS_RESIZE_NEAREST_NEIGHBOR_OP_GPU_H_

View File

@ -524,7 +524,7 @@ class SparseMatMulOp : public OpKernel {
private:
// Perform matrix multiplication of "left" and "right", and store the result
// in *"ouptut".
// in *"output".
static inline void SparseMatMul(
const ConstMatrixMap& left, const ConstMatrixMap& right,
bool transpose_left, const DeviceBase::CpuWorkerThreads* thread_pool,
@ -858,7 +858,7 @@ inline void SparseMatMulOp::SparseMatMul(
const int right_dim0 = right.dimension(0);
const int right_dim1 = right.dimension(1);
// Allocate buffer for storing slices of right matrix.
// Note buffer needs enough space to hold atmost a KR * NR matrix since that
// Note buffer needs enough space to hold at most a KR * NR matrix since that
// is the block size per iteration.
const int buffer_num_rows =
std::min(KR, right_dim0) * (std::min(NR, right_dim1) + N - 1) / N;
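As an aside on the buffer sizing above: `(x + N - 1) / N` is the standard integer ceiling-division idiom, so the buffer always has room for the final partial block. In plain Python:

```python
def ceil_div(x, n):
    # Equivalent to math.ceil(x / n) for positive integers, without floats.
    return (x + n - 1) // n

assert ceil_div(7, 4) == 2  # one full block of 4 plus a partial block
assert ceil_div(8, 4) == 2  # exact multiples do not round up
```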

View File

@ -577,7 +577,7 @@ class TensorArrayConcatOp : public OpKernel {
ConstMatrixVector input_tensors_flat;
input_tensors_flat.reserve(values.size());
for (int i = 0; i < values.size(); ++i) {
for (size_t i = 0; i < values.size(); ++i) {
const Tensor* value_t = value_tensors[i];
if (value_t->NumElements() > 0) {
input_tensors_flat.emplace_back(new ConstMatrix(

View File

@ -47,7 +47,7 @@ void ComputeStride(const TensorShape& shape, Index* strides) {
}
}
// Device-specific naive implementation for tranpose.
// Device-specific naive implementation for transpose.
template <typename Device, typename T>
void TransposeSimple(const Device& d, const Tensor& in,
const gtl::ArraySlice<int32> perm, Tensor* out);

View File

@ -172,6 +172,38 @@ tf.diag(diagonal) ==> [[1, 0, 0, 0]
diagonal: Rank k tensor where k is at most 3.
)doc");
// --------------------------------------------------------------------------
REGISTER_OP("DiagPart")
.Input("input: T")
.Output("diagonal: T")
.Attr("T: {float, double, int32, int64}")
.Doc(R"doc(
Returns the diagonal part of the tensor.
This operation returns a tensor with the `diagonal` part
of the `input`. The `diagonal` part is computed as follows:
Assume `input` has dimensions `[D1,..., Dk, D1,..., Dk]`, then the output is a
tensor of rank `k` with dimensions `[D1,..., Dk]` where:
`diagonal[i1,..., ik] = input[i1, ..., ik, i1,..., ik]`.
For example:
```prettyprint
# 'input' is [[1, 0, 0, 0]
              [0, 2, 0, 0]
              [0, 0, 3, 0]
              [0, 0, 0, 4]]

tf.diag_part(input) ==> [1, 2, 3, 4]
```
input: Rank k tensor where k is 2, 4, or 6.
diagonal: The extracted diagonal.
)doc");
// --------------------------------------------------------------------------
REGISTER_OP("Reverse")
.Input("tensor: T")

View File

@ -3482,6 +3482,29 @@ op {
}
}
}
op {
name: "DiagPart"
input_arg {
name: "input"
type_attr: "T"
}
output_arg {
name: "diagonal"
type_attr: "T"
}
attr {
name: "T"
type: "type"
allowed_values {
list {
type: DT_FLOAT
type: DT_DOUBLE
type: DT_INT32
type: DT_INT64
}
}
}
}
op {
name: "Digamma"
input_arg {

View File

@ -2858,6 +2858,33 @@ op {
summary: "Returns a diagonal tensor with a given diagonal values."
description: "Given a `diagonal`, this operation returns a tensor with the `diagonal` and\neverything else padded with zeros. The diagonal is computed as follows:\n\nAssume `diagonal` has dimensions [D1,..., Dk], then the output is a tensor of\nrank 2k with dimensions [D1,..., Dk, D1,..., Dk] where:\n\n`output[i1,..., ik, i1,..., ik] = diagonal[i1, ..., ik]` and 0 everywhere else.\n\nFor example:\n\n```prettyprint\n# \'diagonal\' is [1, 2, 3, 4]\ntf.diag(diagonal) ==> [[1, 0, 0, 0]\n [0, 2, 0, 0]\n [0, 0, 3, 0]\n [0, 0, 0, 4]]\n```"
}
op {
name: "DiagPart"
input_arg {
name: "input"
description: "Rank k tensor where k is 2, 4, or 6."
type_attr: "T"
}
output_arg {
name: "diagonal"
description: "The extracted diagonal."
type_attr: "T"
}
attr {
name: "T"
type: "type"
allowed_values {
list {
type: DT_FLOAT
type: DT_DOUBLE
type: DT_INT32
type: DT_INT64
}
}
}
summary: "Returns the diagonal part of the tensor."
description: "This operation returns a tensor with the `diagonal` part\nof the `input`. The `diagonal` part is computed as follows:\n\nAssume `input` has dimensions `[D1,..., Dk, D1,..., Dk]`, then the output is a\ntensor of rank `k` with dimensions `[D1,..., Dk]` where:\n\n`diagonal[i1,..., ik] = input[i1, ..., ik, i1,..., ik]`.\n\nFor example:\n\n```prettyprint\n# \'input\' is [[1, 0, 0, 0]\n [0, 2, 0, 0]\n [0, 0, 3, 0]\n [0, 0, 0, 4]]\n\ntf.diag_part(input) ==> [1, 2, 3, 4]\n```"
}
op {
name: "Digamma"
input_arg {

View File

@ -20,7 +20,7 @@ limitations under the License.
#define TF_MAJOR_VERSION 0
#define TF_MINOR_VERSION 7
#define TF_PATCH_VERSION 0
#define TF_PATCH_VERSION 1
// TF_VERSION_SUFFIX is non-empty for pre-releases (e.g. "-alpha", "-alpha.1",
// "-beta", "-rc", "-rc.1")

View File

@ -63,34 +63,50 @@ message CommitId {
};
message CPUInfo {
int64 num_cores = 1;
int64 num_cores_allowed = 2;
// How fast are these cpus?
double mhz_per_cpu = 1;
double mhz_per_cpu = 3;
// Additional cpu information. For example,
// Intel Ivybridge with HyperThreading (24 cores) dL1:32KB dL2:256KB dL3:30MB
string cpu_info = 2;
string cpu_info = 4;
// What kind of cpu scaling is enabled on the host.
// Examples include "performance", "ondemand", "conservative", "mixed".
string cpu_governor = 3;
string cpu_governor = 5;
// Cache sizes (in bytes), e.g. "L2": 262144 (for 256KB)
map<string, int64> cache_size = 4;
map<string, int64> cache_size = 6;
};
message MemoryInfo {
int64 total = 1; // Total virtual memory in bytes
int64 available = 2; // Immediately available memory in bytes
}
message GPUInfo {
string model = 1; // e.g. "Tesla K40c"
string uuid = 2; // Final entry in output of "nvidia-smi -L"
string bus_id = 3; // e.g. "0000:04:00.0"
};
message PlatformInfo {
string bits = 1; // e.g. '64bit'
string linkage = 2; // e.g. 'ELF'
string machine = 3; // e.g. 'i386'
string processor = 4; // e.g. 'amdk6' (the real processor name)
string release = 5; // e.g. '3.13.0-76-generic'
string system = 6; // e.g. 'Linux'
string version = 7; // e.g. '#120-Ubuntu SMP Mon Jan 18 15:59:10 UTC 2016'
string release = 4; // e.g. '3.13.0-76-generic'
string system = 5; // e.g. 'Linux'
string version = 6; // e.g. '#120-Ubuntu SMP Mon Jan 18 15:59:10 UTC 2016'
};
message AvailableDeviceInfo { // Matches DeviceAttributes
string name = 1; // Device name.
string type = 2; // Device type, e.g. 'CPU' or 'GPU'.
int64 memory_limit = 3; // Memory capacity in bytes.
string physical_description = 4; // The physical description of this device.
};
message MachineConfiguration {
@ -105,6 +121,11 @@ message MachineConfiguration {
// Other devices that are attached and relevant (e.g. GPUInfo).
repeated google.protobuf.Any device_info = 4;
// Devices accessible to the test (e.g. as given by list_local_devices).
repeated AvailableDeviceInfo available_device_info = 5;
MemoryInfo memory_info = 6;
};
// Run-specific items such as arguments to the test / benchmark.

View File

@ -68,6 +68,7 @@ def convert_to(images, labels, name):
'label': _int64_feature(int(labels[index])),
'image_raw': _bytes_feature(image_raw)}))
writer.write(example.SerializeToString())
writer.close()
def main(argv):

View File

@ -219,8 +219,8 @@ def create_image_lists(image_dir, testing_percentage, validation_percentage):
# To do that, we need a stable way of deciding based on just the file name
# itself, so we do a hash of that and then use that to generate a
# probability value that we use to assign it.
percentage_hash = (int(
hashlib.sha1(hash_name).hexdigest(), 16) % (65536)) * (100 / 65535.0)
hash_name_hashed = hashlib.sha1(hash_name.encode('utf-8')).hexdigest()
percentage_hash = (int(hash_name_hashed, 16) % (65536)) * (100 / 65535.0)
if percentage_hash < validation_percentage:
validation_images.append(base_name)
elif percentage_hash < (testing_percentage + validation_percentage):
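The split logic above (now Python 3 friendly) folds a SHA-1 of the file name into a stable percentage, so a given image always lands in the same partition across runs. A self-contained restatement of the arithmetic, with a hypothetical file name:

```python
import hashlib

def split_percentage(hash_name):
    # Same arithmetic as create_image_lists: hash the name, fold the digest
    # into [0, 100]. Deterministic, so re-running never shuffles images
    # between the training, validation, and testing sets.
    hashed = hashlib.sha1(hash_name.encode('utf-8')).hexdigest()
    return (int(hashed, 16) % 65536) * (100 / 65535.0)

p = split_percentage('cat_001.jpg')  # hypothetical file name
assert 0.0 <= p <= 100.0
```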
@ -295,8 +295,9 @@ def create_inception_graph():
Graph holding the trained Inception network.
"""
with tf.Session() as sess:
with gfile.FastGFile(
os.path.join(FLAGS.model_dir, 'classify_image_graph_def.pb'), 'r') as f:
model_filename = os.path.join(
FLAGS.model_dir, 'classify_image_graph_def.pb')
with gfile.FastGFile(model_filename, 'rb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
_ = tf.import_graph_def(graph_def, name='')
@ -395,7 +396,7 @@ def get_or_create_bottleneck(sess, image_lists, label_name, index, image_dir,
category)
if not gfile.Exists(image_path):
tf.logging.fatal('File does not exist %s', image_path)
image_data = gfile.FastGFile(image_path, 'r').read()
image_data = gfile.FastGFile(image_path, 'rb').read()
bottleneck_values = run_bottleneck_on_image(sess, image_data,
JPEG_DATA_TENSOR_NAME)
bottleneck_string = ','.join(str(x) for x in bottleneck_values)
@ -430,7 +431,7 @@ def cache_bottlenecks(sess, image_lists, image_dir, bottleneck_dir):
"""
how_many_bottlenecks = 0
ensure_dir_exists(bottleneck_dir)
for label_name, label_lists in image_lists.iteritems():
for label_name, label_lists in image_lists.items():
for category in ['training', 'testing', 'validation']:
category_list = label_lists[category]
for index, unused_base_name in enumerate(category_list):
@ -467,7 +468,7 @@ def get_random_cached_bottlenecks(sess, image_lists, how_many, category,
ground_truthes = []
for unused_i in range(how_many):
label_index = random.randrange(class_count)
label_name = image_lists.keys()[label_index]
label_name = list(image_lists.keys())[label_index]
image_index = random.randrange(65536)
bottleneck = get_or_create_bottleneck(sess, image_lists, label_name,
image_index, image_dir, category,
@ -818,7 +819,7 @@ def main(_):
# Write out the trained graph and labels with the weights stored as constants.
output_graph_def = graph_util.convert_variables_to_constants(
sess, graph.as_graph_def(), [FLAGS.final_tensor_name])
with gfile.FastGFile(FLAGS.output_graph, 'w') as f:
with gfile.FastGFile(FLAGS.output_graph, 'wb') as f:
f.write(output_graph_def.SerializeToString())
with gfile.FastGFile(FLAGS.output_labels, 'w') as f:
f.write('\n'.join(image_lists.keys()) + '\n')

View File

@ -54,7 +54,7 @@ def _read32(bytestream):
def extract_images(filename):
"""Extract the images into a 4D uint8 numpy array [index, y, x, depth]."""
print('Extracting', filename)
with tf.gfile.Open(filename) as f, gzip.GzipFile(fileobj=f) as bytestream:
with tf.gfile.Open(filename, 'rb') as f, gzip.GzipFile(fileobj=f) as bytestream:
magic = _read32(bytestream)
if magic != 2051:
raise ValueError(
@ -81,7 +81,7 @@ def dense_to_one_hot(labels_dense, num_classes):
def extract_labels(filename, one_hot=False, num_classes=10):
"""Extract the labels into a 1D uint8 numpy array [index]."""
print('Extracting', filename)
with tf.gfile.Open(filename) as f, gzip.GzipFile(fileobj=f) as bytestream:
with tf.gfile.Open(filename, 'rb') as f, gzip.GzipFile(fileobj=f) as bytestream:
magic = _read32(bytestream)
if magic != 2049:
raise ValueError(

View File

@ -143,7 +143,7 @@ def evaluation(logits, labels):
"""
# For a classifier model, we can use the in_top_k Op.
# It returns a bool tensor with shape [batch_size] that is true for
# the examples where the label's is was in the top k (here k=1)
# the examples where the label is in the top k (here k=1)
# of all logits for that example.
correct = tf.nn.in_top_k(logits, labels, 1)
# Return the number of true entries.
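The corrected comment describes `tf.nn.in_top_k`; a rough NumPy equivalent for intuition (ignoring the real op's tie-breaking details):

```python
import numpy as np

def in_top_k(logits, labels, k=1):
    # True where the row's label index is among the k largest logits.
    top_k = np.argsort(logits, axis=1)[:, -k:]
    return np.array([label in row for row, label in zip(top_k, labels)])

logits = np.array([[0.1, 0.9], [0.8, 0.2]])
print(in_top_k(logits, labels=[1, 1]))  # [ True False]
```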

View File

@ -54,23 +54,23 @@ def main(_):
# Create the model
x = tf.placeholder(tf.float32, [None, 784], name='x-input')
W = tf.Variable(tf.zeros([784, 10]), name='weights')
b = tf.Variable(tf.zeros([10], name='bias'))
b = tf.Variable(tf.zeros([10]), name='bias')
# Use a name scope to organize nodes in the graph visualizer
with tf.name_scope('Wx_b'):
y = tf.nn.softmax(tf.matmul(x, W) + b)
# Add summary ops to collect data
_ = tf.histogram_summary('weights', W)
_ = tf.histogram_summary('biases', b)
_ = tf.histogram_summary('y', y)
tf.histogram_summary('weights', W)
tf.histogram_summary('biases', b)
tf.histogram_summary('y', y)
# Define loss and optimizer
y_ = tf.placeholder(tf.float32, [None, 10], name='y-input')
# More name scopes will clean up the graph representation
with tf.name_scope('xent'):
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
_ = tf.scalar_summary('cross entropy', cross_entropy)
tf.scalar_summary('cross entropy', cross_entropy)
with tf.name_scope('train'):
train_step = tf.train.GradientDescentOptimizer(
FLAGS.learning_rate).minimize(cross_entropy)
@ -78,7 +78,7 @@ def main(_):
with tf.name_scope('test'):
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
_ = tf.scalar_summary('accuracy', accuracy)
tf.scalar_summary('accuracy', accuracy)
# Merge all the summaries and write them out to /tmp/mnist_logs (by default)
merged = tf.merge_all_summaries()

View File

@ -128,7 +128,7 @@ num_skips = 2 # How many times to reuse an input to generate a label.
# construction are also the most frequent.
valid_size = 16 # Random set of words to evaluate similarity on.
valid_window = 100 # Only pick dev samples in the head of the distribution.
valid_examples = np.array(random.sample(np.arange(valid_window), valid_size))
valid_examples = np.random.choice(valid_window, valid_size, replace=False)
num_sampled = 64 # Number of negative examples to sample.
graph = tf.Graph()

View File

@ -290,11 +290,11 @@
"Another one is to use learning rate decay:\n",
"\n",
" global_step = tf.Variable(0) # count the number of steps taken.\n",
" learning_rate = tf.train.exponential_decay(0.5, step, ...)\n",
" learning_rate = tf.train.exponential_decay(0.5, global_step, ...)\n",
" optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)\n",
" \n",
" ---\n"
]
}
]
}
}

View File

@ -421,7 +421,7 @@
"\n",
"graph = tf.Graph()\n",
"\n",
"with graph.as_default():\n",
"with graph.as_default(), tf.device('/cpu:0'):\n",
"\n",
" # Input data.\n",
" train_dataset = tf.placeholder(tf.int32, shape=[batch_size])\n",

View File

@ -1,6 +1,7 @@
FROM b.gcr.io/tensorflow/tensorflow:latest
MAINTAINER Vincent Vanhoucke <vanhoucke@google.com>
RUN pip install scikit-learn
RUN rm -rf /notebooks/*
ADD *.ipynb /notebooks/
WORKDIR /notebooks
CMD ["/run_jupyter.sh"]

View File

@ -820,7 +820,7 @@ classes are mutually exclusive (each entry is in exactly one class). For
example, each CIFAR-10 image is labeled with one and only one label: an image
can be a dog or a truck, but not both.
**NOTE:**: While the classes are mutually exclusive, their probabilities
**NOTE:** While the classes are mutually exclusive, their probabilities
need not be. All that is required is that each row of `labels` is
a valid probability distribution. If using exclusive `labels`
(wherein one and only one class is true at a time), see
@ -857,7 +857,7 @@ classes are mutually exclusive (each entry is in exactly one class). For
example, each CIFAR-10 image is labeled with one and only one label: an image
can be a dog or a truck, but not both.
**NOTE:**: For this operation, the probability of a given label is considered
**NOTE:** For this operation, the probability of a given label is considered
exclusive. That is, soft classes are not allowed, and the `labels` vector
must provide a single specific index for the true class for each row of
`logits` (each minibatch entry). For soft softmax classification with

View File

@ -794,9 +794,11 @@ global_step = tf.Variable(0, trainable=False)
starter_learning_rate = 0.1
learning_rate = tf.train.exponential_decay(starter_learning_rate, global_step,
100000, 0.96, staircase=True)
optimizer = tf.GradientDescentOptimizer(learning_rate)
# Passing global_step to minimize() will increment it at each step.
optimizer.minimize(...my loss..., global_step=global_step)
learning_step = (
tf.train.GradientDescentOptimizer(learning_rate)
.minimize(...my loss..., global_step=global_step)
)
```
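As a quick numeric check of the schedule above, here is a plain-Python mirror of `tf.train.exponential_decay` for these arguments (with `staircase=True` the exponent is floored to an integer):

```python
def decayed_learning_rate(step, starter=0.1, decay_steps=100000,
                          decay_rate=0.96, staircase=True):
    exponent = step // decay_steps if staircase else step / float(decay_steps)
    return starter * decay_rate ** exponent

print(decayed_learning_rate(0))       # 0.1
print(decayed_learning_rate(100000))  # 0.096
print(decayed_learning_rate(250000))  # 0.1 * 0.96**2 = 0.09216
```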
##### Args:
@ -2280,5 +2282,3 @@ device assignments have not changed.
##### Returns:
A saver constructed from `saver_def` in `MetaGraphDef`.

View File

@ -53,28 +53,28 @@ Install TensorFlow:
```bash
# Ubuntu/Linux 64-bit, CPU only:
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.0-py2-none-linux_x86_64.whl
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.1-cp27-none-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled:
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.7.0-py2-none-linux_x86_64.whl
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.7.1-cp27-none-linux_x86_64.whl
# Mac OS X, CPU only:
$ sudo easy_install --upgrade six
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.7.0-py2-none-any.whl
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.7.1-cp27-none-any.whl
```
For python3:
```bash
# Ubuntu/Linux 64-bit, CPU only:
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.0-py3-none-linux_x86_64.whl
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.1-cp34-none-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled:
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.7.0-py3-none-linux_x86_64.whl
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.7.1-cp34-none-linux_x86_64.whl
# Mac OS X, CPU only:
$ sudo easy_install --upgrade six
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.7.0-py3-none-any.whl
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.7.1-cp35-none-any.whl
```
NOTE: If you are upgrading from a previous installation of TensorFlow < 0.7.1,
@ -126,13 +126,13 @@ $ source ~/tensorflow/bin/activate.csh # If using csh
(tensorflow)$ # Your prompt should change
# Ubuntu/Linux 64-bit, CPU only:
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.0-py2-none-linux_x86_64.whl
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.1-cp27-none-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled:
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.7.0-py2-none-linux_x86_64.whl
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.7.1-cp27-none-linux_x86_64.whl
# Mac OS X, CPU only:
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.7.0-py2-none-any.whl
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.7.1-cp27-none-any.whl
```
and again for python3:
@ -143,13 +143,13 @@ $ source ~/tensorflow/bin/activate.csh # If using csh
(tensorflow)$ # Your prompt should change
# Ubuntu/Linux 64-bit, CPU only:
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.0-py3-none-linux_x86_64.whl
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.1-cp34-none-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled:
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.7.0-py3-none-linux_x86_64.whl
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.7.1-cp34-none-linux_x86_64.whl
# Mac OS X, CPU only:
(tensorflow)$ pip3 install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.7.0-py3-none-any.whl
(tensorflow)$ pip3 install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.7.1-cp35-none-any.whl
```
With the Virtualenv environment activated, you can now
@ -191,7 +191,7 @@ code.
* `b.gcr.io/tensorflow/tensorflow:latest-devel-gpu`: GPU Binary image plus source
code.
We also have tags with `latest` replaced by a released version (e.g., `0.7.0-gpu`).
We also have tags with `latest` replaced by a released version (e.g., `0.7.1-gpu`).
With Docker the installation is as follows:
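For orientation, a typical invocation looks like this (a sketch; the image name is assumed from the list above, and the full instructions are elided by this hunk):

```bash
$ docker run -it b.gcr.io/tensorflow/tensorflow
```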
@ -464,7 +464,7 @@ We recommend using [homebrew](http://brew.sh) to install the bazel and SWIG
dependencies, and installing python dependencies using easy_install or pip.
Of course you can also install Swig from source without using homebrew. In that
case, be sure to install its dependency [PCRE](from www.pcre.org) and not PCRE2.
case, be sure to install its dependency [PCRE](http://www.pcre.org) and not PCRE2.
#### Dependencies
@ -517,7 +517,7 @@ $ bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_pack
$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
# The name of the .whl file will depend on your platform.
$ pip install /tmp/tensorflow_pkg/tensorflow-0.7.0-py2-none-linux_x86_64.whl
$ pip install /tmp/tensorflow_pkg/tensorflow-0.7.1-py2-none-linux_x86_64.whl
```
## Setting up TensorFlow for Development

View File

@ -74,7 +74,7 @@ and compact summary of the images, since it has to contain enough information
for the classifier to make a good choice in a very small set of values. The
reason our final layer retraining can work on new classes is that it turns out
the kind of information needed to distinguish between all the 1,000 classes in
ImageNet is often also useful to chose between new kinds of objects.
ImageNet is often also useful to distinguish between new kinds of objects.
Because every image is reused multiple times during training and calculating
each bottleneck takes a significant amount of time, it speeds things up to
@ -88,20 +88,20 @@ part again.
Once the bottlenecks are complete, the actual training of the top layer of the
network begins. You'll see a series of step outputs, each one showing training
accuracy, validation accuracy, and the cross entropy. The training accuracy
shows how many of the images used in the current training batch were labeled
with the correct class. The validation accuracy is the precision on a
shows what percent of the images used in the current training batch were
labeled with the correct class. The validation accuracy is the precision on a
randomly-selected group of images from a different set. The key difference is
that the training accuracy is based on images that the network has been able
to learn from, so the network can overfit to the noise in the training data. A
true measure of the performance of the network is to measure its performance on
a data set not contained in the training data -- this is measured by the
validation accuracy. If the training accuracy is high but the validation remains
low, that means the network is overfitting and memorizing particular features
in the training images that aren't helpful more generally. Cross entropy is a
loss function which gives a glimpse into how well the learning process is
progressing. The training's objective is to make the loss as small as possible,
so you can tell if the learning is working by keeping an eye on whether the loss
keeps trending downwards, ignoring the short-term noise.
validation accuracy. If the train accuracy is high but the validation accuracy
remains low, that means the network is overfitting and memorizing particular
features in the training images that aren't helpful more generally. Cross
entropy is a loss function which gives a glimpse into how well the learning
process is progressing. The training's objective is to make the loss as small as
possible, so you can tell if the learning is working by keeping an eye on
whether the loss keeps trending downwards, ignoring the short-term noise.
By default this script will run 4,000 training steps. Each step chooses ten
images at random from the training set, finds their bottlenecks from the cache,
@ -114,8 +114,8 @@ and validation pictures. This test evaluation is the best estimate of how the
trained model will perform on the classification task. You should see an
accuracy value of between 90% and 95%, though the exact value will vary from run
to run since there's randomness in the training process. This number is based on
how many of the images in the test set are given the correct label after the
model is fully trained.
the percent of the images in the test set that are given the correct label
after the model is fully trained.
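For reference, the retraining run described above is typically launched along these lines (a sketch; the `--image_dir` flag and the flowers path are assumptions based on the retrain example):

```bash
$ bazel build tensorflow/examples/image_retraining:retrain
$ bazel-bin/tensorflow/examples/image_retraining/retrain --image_dir ~/flower_photos
```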
## Using the Retrained Model
@ -266,7 +266,7 @@ memorized unimportant details of the training images.
This problem is known as overfitting, and to avoid it we keep some of our data
out of the training process, so that the model can't memorize them. We then use
those images as a check to make sure that overfitting isn't occuring, since if
those images as a check to make sure that overfitting isn't occurring, since if
we see good accuracy on them it's a good sign the network isn't overfitting. The
usual split is to put 80% of the images into the main training set, keep 10%
aside to run as validation frequently during training, and then have a final 10%

View File

@ -86,23 +86,23 @@ with tf.name_scope("Wx_b") as scope:
y = tf.nn.softmax(tf.matmul(x,W) + b)
# Add summary ops to collect data
w_hist = tf.histogram_summary("weights", W)
b_hist = tf.histogram_summary("biases", b)
y_hist = tf.histogram_summary("y", y)
tf.histogram_summary("weights", W)
tf.histogram_summary("biases", b)
tf.histogram_summary("y", y)
# Define loss and optimizer
y_ = tf.placeholder(tf.float32, [None,10], name="y-input")
# More name scopes will clean up the graph representation
with tf.name_scope("xent") as scope:
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
ce_summ = tf.scalar_summary("cross entropy", cross_entropy)
tf.scalar_summary("cross entropy", cross_entropy)
with tf.name_scope("train") as scope:
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
with tf.name_scope("test") as scope:
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
accuracy_summary = tf.scalar_summary("accuracy", accuracy)
tf.scalar_summary("accuracy", accuracy)
# Merge all the summaries and write them out to /tmp/mnist_logs
merged = tf.merge_all_summaries()
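To actually record the merged summaries during training, they are paired with a writer along these lines (a minimal sketch; the log directory and the `feed` dict are placeholders):

```python
writer = tf.train.SummaryWriter("/tmp/mnist_logs", sess.graph_def)
for step in range(1000):
  summary_str, _ = sess.run([merged, train_step], feed_dict=feed)
  writer.add_summary(summary_str, step)  # makes the data visible to TensorBoard
```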

View File

@ -28,8 +28,7 @@ by calling `as_graph_def()`, which returns a `GraphDef` object.
The GraphDef class is an object created by the ProtoBuf library from the
definition in
[tensorflow/core/framework/graph.proto](https://github.com/tensorflow/tensorflow
/blob/master/tensorflow/core/framework/graph.proto). The protobuf tools parse
[tensorflow/core/framework/graph.proto](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/graph.proto). The protobuf tools parse
this text file, and generate the code to load, store, and manipulate graph
definitions. If you see a standalone TensorFlow file representing a model, it's
likely to contain a serialized version of one of these `GraphDef` objects
@ -37,8 +36,7 @@ saved out by the protobuf code.
This generated code is used to save and load the GraphDef files from disk. A
good example to look at as we dig into this is
[graph_metrics.py](https://github.com/tensorflow/tensorflow/blob/master/tensorfl
ow/python/tools/graph_metrics.py). This Python script takes a saved graph
[graph_metrics.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/graph_metrics.py). This Python script takes a saved graph
definition, and analyzes the model to estimate performance and resource
statistics. The code that actually loads the model looks like this:
@ -69,16 +67,14 @@ There are actually two different formats that a ProtoBuf can be saved in.
TextFormat is a human-readable form, which makes it nice for debugging and
editing, but can get large when there's numerical data like weights stored in
it. You can see a small example of that in
[poly5-graph.pbtxt](https://github.com/tensorflow/tensorflow/blob/master/tensorf
low/tensorboard/components/tf-tensorboard/demo/data/poly5-graph.pbtxt).
[poly5-graph.pbtxt](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tensorboard/components/tf-tensorboard/demo/data/poly5-graph.pbtxt).
Binary format files are a lot smaller than their text equivalents, even though
they're not as readable for us. In this script, we ask the user to supply a
flag indicating whether the input file is binary or text, so we know the right
function to call. You can find an example of a large binary file inside the
[inception_dec_2015.zip
archive](https://storage.googleapis.com/download.tensorflow.org/models/inception
_dec_2015.zip), as `tensorflow_inception_graph.pb`.
archive](https://storage.googleapis.com/download.tensorflow.org/models/inception_dec_2015.zip), as `tensorflow_inception_graph.pb`.
The API itself can be a bit confusing - the binary call is actually
`ParseFromString()`, whereas you use a utility function from the `text_format`
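Side by side, a load routine for both formats looks roughly like this (a sketch; `input_path` and the `binary` flag are hypothetical stand-ins for the script's arguments):

```python
import tensorflow as tf
from google.protobuf import text_format

input_path = "/tmp/graph.pb"  # hypothetical path
binary = True                 # hypothetical flag: binary vs. TextFormat

graph_def = tf.GraphDef()
if binary:
  with open(input_path, "rb") as f:
    graph_def.ParseFromString(f.read())     # binary protobuf
else:
  with open(input_path, "r") as f:
    text_format.Merge(f.read(), graph_def)  # human-readable TextFormat
```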
@ -104,7 +100,7 @@ single operation along with its input connections. Here are the members of a
Every node should have a unique identifier that's not used by any other nodes
in the graph. If you don't specify one as you're building a graph using the
Python API, one reflecting the name of operation, such as "MatMul",
concatenated with a monotonically increasing number, such as "5", will be
picked for you. The name is used when
defining the connections between nodes, and when setting inputs and outputs for
the whole graph when it's run.
@ -115,8 +111,7 @@ This defines what operation to run, for example `"Add"`, `"MatMul"`, or
`"Conv2D"`. When a graph is run, this op name is looked up in a registry to
find an implementation. The registry is populated by calls to the
`REGISTER_OP()` macro, like those in
[tensorflow/core/ops/nn_ops.cc](https://github.com/tensorflow/tensorflow/blob/ma
ster/tensorflow/core/ops/nn_ops.cc).
[tensorflow/core/ops/nn_ops.cc](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/ops/nn_ops.cc).
### `input`
@ -142,8 +137,7 @@ size of filters for convolutions, or the values of constant ops. Because there
can be so many different types of attribute values, from strings, to ints, to
arrays of tensor values, there's a separate protobuf file defining the data
structure that holds them, in
[tensorflow/core/framework/attr_value.proto](https://github.com/tensorflow/tenso
rflow/blob/master/tensorflow/core/framework/attr_value.proto).
[tensorflow/core/framework/attr_value.proto](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/attr_value.proto).
Each attribute has a unique name string, and the expected attributes are listed
when the operation is defined. If an attribute isn't present in a node, but it
@ -161,8 +155,7 @@ the file format during training. Instead, they're held in separate checkpoint
files, and there are `Variable` ops in the graph that load the latest values
when they're initialized. It's often not very convenient to have separate files
when you're deploying to production, so there's the
[freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflo
w/python/tools/freeze_graph.py) script that takes a graph definition and a set
[freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py) script that takes a graph definition and a set
of checkpoints and freezes them together into a single file.
What this does is load the `GraphDef`, pull in the values for all the variables
@ -178,10 +171,9 @@ the most common problems is extracting and interpreting the weight values. A
common way to store them, for example in graphs created by the freeze_graph
script, is as `Const` ops containing the weights as `Tensors`. These are
defined in
[tensorflow/core/framework/tensor.proto](https://github.com/tensorflow/tensorflo
w/blob/master/tensorflow/core/framework/tensor.proto), and contain information
[tensorflow/core/framework/tensor.proto](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/tensor.proto), and contain information
about the size and type of the data, as well as the values themselves. In
Python, you get a `TensorProto` object from a `NodeDef` representing a `Const`
op by calling something like `some_node_def.attr['value'].tensor`.
This will give you an object representing the weights data. The data itself

View File

@ -16,7 +16,7 @@ Python list) has a rank of 2:
t = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
A rank two tensor is what we typically think of as a matrix; a rank one tensor
is a vector. For a rank two tensor you can acccess any element with the syntax
is a vector. For a rank two tensor you can access any element with the syntax
`t[i, j]`. For a rank three tensor you would need to address an element with
`t[i, j, k]`.
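As a concrete illustration (using numpy arrays for brevity, since they index the same way):

```python
import numpy as np

t = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])  # rank 2
t[1, 2]       # ==> 6: one index per dimension
t3 = np.zeros((2, 3, 4))                         # rank 3
t3[0, 1, 2]   # three indices needed
```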

View File

@ -31,6 +31,11 @@ something amazing with TensorFlow, we'd like to hear about it!
## Community
The TensorFlow community has created many great projects around TensorFlow, including:
* [TensorFlow tutorials](https://github.com/pkmital/tensorflow_tutorials)
* [Scikit Flow - Simplified Interface for TensorFlow](https://github.com/tensorflow/skflow)
### Development
The source code for TensorFlow is hosted on GitHub:

View File

@ -9,8 +9,6 @@ CIFAR-10 classification is a common benchmark problem in machine learning. The
problem is to classify RGB 32x32 pixel images across 10 categories:
```airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.```
![CIFAR-10 Samples](../../images/cifar_samples.png "CIFAR-10 Samples, from http://www.cs.toronto.edu/~kriz/cifar.html")
For more details refer to the [CIFAR-10 page](http://www.cs.toronto.edu/~kriz/cifar.html)
and a [Tech Report](http://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf)
by Alex Krizhevsky.
@ -117,7 +115,7 @@ learn more about how the `Reader` class works.
The images are processed as follows:
* They are cropped to 24 x 24 pixels, centrally for evaluation or
[randomly](../../api_docs/python/image.md#random_crop) for training.
[randomly](../../api_docs/python/constant_op.md#random_crop) for training.
* They are [approximately whitened](../../api_docs/python/image.md#per_image_whitening)
to make the model insensitive to dynamic range.
@ -168,7 +166,7 @@ Here is a graph generated from TensorBoard describing the inference operation:
</div>
> **EXERCISE**: The output of `inference` are un-normalized logits. Try editing
the network architecture to return normalized predictions using [`tf.softmax()`]
the network architecture to return normalized predictions using [`tf.nn.softmax()`]
(../../api_docs/python/nn.md#softmax).
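A one-line sketch of that exercise (assuming `logits` is the tensor returned by `inference`):

```python
predictions = tf.nn.softmax(logits)  # normalize logits into per-class probabilities
```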
The `inputs()` and `inference()` functions provide all the components

View File

@ -50,7 +50,7 @@ unpacked (following the instructions available at the website) by the
The image data is extracted into a 2d tensor of: `[image index, pixel index]`
where each entry is the intensity value of a specific pixel in a specific
image, rescaled from `[0, 255]` to `[-0.5, 0.5]`. The "image index" corresponds
image, rescaled from `[0, 255]` to `[0, 1]`. The "image index" corresponds
to an image in the dataset, counting up from zero to the size of the dataset.
And the "pixel index" corresponds to a specific pixel in that image, ranging
from zero to the number of pixels in the image.
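The rescaling itself is a single vectorized step (a sketch, assuming `images` is a uint8 array with values in `[0, 255]`):

```python
import numpy

images = images.astype(numpy.float32) / 255.0  # rescaled from [0, 255] to [0, 1]
```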

View File

@ -92,7 +92,7 @@ lstm = rnn_cell.BasicLSTMCell(lstm_size)
# Initial state of the LSTM memory.
initial_state = state = tf.zeros([batch_size, lstm.state_size])
for i in range(len(num_steps)):
for i in range(num_steps):
# The value of state is updated after processing each batch of words.
output, state = lstm(words[:, i], state)
@ -159,7 +159,7 @@ lstm = rnn_cell.BasicLSTMCell(lstm_size)
stacked_lstm = rnn_cell.MultiRNNCell([lstm] * number_of_layers)
initial_state = state = stacked_lstm.zero_state(batch_size, tf.float32)
for i in range(len(num_steps)):
for i in range(num_steps):
# The value of state is updated after processing each batch of words.
output, state = stacked_lstm(words[:, i], state)

View File

@ -58,7 +58,7 @@ translation [Sutskever et al., 2014](http://arxiv.org/abs/1409.3215)
In the basic model depicted above, every input has to be encoded into
a fixed-size state vector, as that is the only thing passed to the decoder.
To allow the decoder more direct access to the input, an *attention* mechanism
was introduced in [Bahdanu et al., 2014](http://arxiv.org/abs/1409.0473)
was introduced in [Bahdanau et al., 2014](http://arxiv.org/abs/1409.0473)
([pdf](http://arxiv.org/pdf/1409.0473.pdf)).
We will not go into the details of the attention mechanism (see the paper),
suffice it to say that it allows the decoder to peek into the input at every
@ -176,8 +176,8 @@ projections are constructed by the following code in `seq2seq_model.py`.
```
First, note that we only construct a sampled softmax if the number of samples
(512 by default) is smaller that the target vocabulary size. For vocabularies
smaller than 512 it might be a better idea to just use a standard softmax loss.
(512 by default) is smaller than the target vocabulary size. For vocabularies
smaller than 512, it might be a better idea to just use a standard softmax loss.
Then, as you can see, we construct an output projection. It is a pair,
consisting of a weight matrix and a bias vector. If used, the rnn cell
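Paraphrased, the guard and the projection described here look like this (a sketch; `num_samples`, `size`, and `target_vocab_size` are assumed defined as in `seq2seq_model.py`):

```python
output_projection = None
if 0 < num_samples < target_vocab_size:
  w = tf.get_variable("proj_w", [size, target_vocab_size])
  b = tf.get_variable("proj_b", [target_vocab_size])
  output_projection = (w, b)  # maps rnn outputs back to vocabulary logits
```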

View File

@ -17,7 +17,7 @@
This should achieve a test error of 0.7%. Please keep this model as simple and
linear as possible; it is meant as a tutorial for simple convolutional models.
Run with --self_test on the command line to exectute a short self-test.
Run with --self_test on the command line to execute a short self-test.
"""
from __future__ import absolute_import
from __future__ import division

View File

@ -276,7 +276,7 @@ def get_config():
raise ValueError("Invalid model: %s", FLAGS.model)
def main(unused_args):
def main(_):
if not FLAGS.data_path:
raise ValueError("Must set --data_path to PTB data directory")

View File

@ -66,7 +66,7 @@ def gunzip_file(gz_path, new_path):
"""Unzips from gz_path into new_path."""
print("Unpacking %s to %s" % (gz_path, new_path))
with gzip.open(gz_path, "rb") as gz_file:
with open(new_path, "w") as new_file:
with open(new_path, "wb") as new_file:
for line in gz_file:
new_file.write(line)
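Usage is straightforward (hypothetical paths):

```python
gunzip_file("/tmp/data.txt.gz", "/tmp/data.txt")
```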

View File

@ -251,8 +251,8 @@ def import_graph_def(graph_def, input_map=None, return_elements=None,
class_values = value.list
new_class_values = []
for class_value in class_values.s:
if class_value.startswith('loc:@'):
op_to_bind_to = class_value[5:]
if class_value.startswith(b'loc:@'):
op_to_bind_to = class_value[5:].decode()
# Find the op by its original name.
if op_to_bind_to not in name_to_op:
raise ValueError('Specified colocation to an op that '

View File

@ -1041,7 +1041,7 @@ class Operation(object):
raise TypeError("node_def needs to be a NodeDef: %s" % node_def)
if node_def.ByteSize() >= (1 << 31) or node_def.ByteSize() < 0:
raise ValueError(
"Cannot create an Operation with a NodeDef larger than 2GB.")
"Cannot create a tensor proto whose content is larger than 2GB.")
if not _VALID_OP_NAME_REGEX.match(node_def.name):
raise ValueError("'%s' is not a valid node name" % node_def.name)
if not isinstance(g, Graph):

View File

@ -1228,8 +1228,8 @@ class ColocationGroupTest(test_util.TensorFlowTestCase):
with ops.colocate_with(a.op):
b = constant_op.constant(3.0)
c = constant_op.constant(4.0)
self.assertEqual(["loc:@a"], a.op.colocation_groups())
self.assertEqual(["loc:@a"], b.op.colocation_groups())
self.assertEqual([b"loc:@a"], a.op.colocation_groups())
self.assertEqual([b"loc:@a"], b.op.colocation_groups())
with self.assertRaises(ValueError):
c.op.get_attr("_class")
@ -1242,7 +1242,7 @@ class ColocationGroupTest(test_util.TensorFlowTestCase):
# colocated with 'a', which is on '/gpu:0'. colocate_with
# overrides devices because it is a stronger constraint.
b = constant_op.constant(3.0)
self.assertEqual(["loc:@a"], b.op.colocation_groups())
self.assertEqual([b"loc:@a"], b.op.colocation_groups())
self.assertEqual(a.op.device, b.op.device)
def testLocationOverrides(self):
@ -1258,7 +1258,7 @@ class ColocationGroupTest(test_util.TensorFlowTestCase):
c = constant_op.constant(4.0)
d = constant_op.constant(5.0)
self.assertEqual(["loc:@a"], b.op.colocation_groups())
self.assertEqual([b"loc:@a"], b.op.colocation_groups())
self.assertEqual("/device:GPU:0", a.op.device)
self.assertEqual(a.op.device, b.op.device)
@ -1272,8 +1272,8 @@ class ColocationGroupTest(test_util.TensorFlowTestCase):
b = constant_op.constant(3.0)
with ops.colocate_with(b.op):
c = constant_op.constant(4.0)
self.assertEqual(["loc:@a"], b.op.colocation_groups())
self.assertEqual(["loc:@a"], c.op.colocation_groups())
self.assertEqual([b"loc:@a"], b.op.colocation_groups())
self.assertEqual([b"loc:@a"], c.op.colocation_groups())
def testMultiColocationGroups(self):
a = constant_op.constant([2.0], name="a")
@ -1281,7 +1281,7 @@ class ColocationGroupTest(test_util.TensorFlowTestCase):
with ops.colocate_with(a.op):
with ops.colocate_with(b.op):
c = constant_op.constant(4.0)
self.assertEqual(set(["loc:@a", "loc:@b"]), set(c.op.colocation_groups()))
self.assertEqual(set([b"loc:@a", b"loc:@b"]), set(c.op.colocation_groups()))
def testColocationIgnoreStack(self):
a = constant_op.constant([2.0], name="a")
@ -1295,7 +1295,7 @@ class ColocationGroupTest(test_util.TensorFlowTestCase):
a = variables.Variable([2.0], name="a")
with ops.colocate_with(a.op):
b = variables.Variable([3.0], name="b")
self.assertEqual(["loc:@a"], b.op.colocation_groups())
self.assertEqual([b"loc:@a"], b.op.colocation_groups())
def testInconsistentDeviceWithinColocate(self):
with ops.device("/gpu:0"):

View File

@ -361,6 +361,9 @@ def make_tensor_proto(values, dtype=None, shape=None):
tensor_shape=tensor_shape.as_shape(shape).as_proto())
if is_same_size and numpy_dtype in _TENSOR_CONTENT_TYPES and shape_size > 1:
if nparray.size * nparray.itemsize >= (1 << 31):
raise ValueError(
"Cannot create a tensor proto whose content is larger than 2GB.")
tensor_proto.tensor_content = nparray.tostring()
return tensor_proto

View File

@ -155,7 +155,7 @@ class ConstantTest(tf.test.TestCase):
large_array = np.zeros((512, 1024, 1024), dtype=np.float32)
with self.assertRaisesRegexp(
ValueError,
"Cannot create an Operation with a NodeDef larger than 2GB."):
"Cannot create a tensor proto whose content is larger than 2GB."):
c = tf.constant(large_array)
def testTooLargeGraph(self):

View File

@ -1397,7 +1397,7 @@ class ControlFlowTest(tf.test.TestCase):
vdef)
# The device is empty, but the colocation constraint is set.
self.assertDeviceEqual("", with_vdef_dep.device)
self.assertEqual(["loc:@vdef"],
self.assertEqual([b"loc:@vdef"],
with_vdef_dep.op.colocation_groups())
def testGroup(self):

View File

@ -156,7 +156,7 @@ class DepthToSpaceTest(tf.test.TestCase):
out_tf.eval()
def testBlockSizeNotDivisibleDepth(self):
# The the depth is not divisible by the square of the block size.
# The depth is not divisible by the square of the block size.
x_np = [[[[1, 1, 1, 1],
[2, 2, 2, 2]],
[[3, 3, 3, 3],

View File

@ -23,18 +23,21 @@ import tensorflow as tf
class GenerateIdentityTensorTest(tf.test.TestCase):
def _testDiagOp(self, diag, dtype, expected_ans, use_gpu=False,
expected_err_re=None):
def diagOp(self, diag, dtype, expected_ans, use_gpu=False):
with self.test_session(use_gpu=use_gpu):
tf_ans = tf.diag(tf.convert_to_tensor(diag.astype(dtype)))
out = tf_ans.eval()
tf_ans_inv = tf.diag_part(expected_ans)
inv_out = tf_ans_inv.eval()
self.assertAllClose(out, expected_ans)
self.assertAllClose(inv_out, diag)
self.assertShapeEqual(expected_ans, tf_ans)
self.assertShapeEqual(diag, tf_ans_inv)
def testEmptyTensor(self):
x = numpy.array([])
expected_ans = numpy.empty([0, 0])
self._testDiagOp(x, numpy.int32, expected_ans)
self.diagOp(x, numpy.int32, expected_ans)
def testRankOneIntTensor(self):
x = numpy.array([1, 2, 3])
@ -42,8 +45,8 @@ class GenerateIdentityTensorTest(tf.test.TestCase):
[[1, 0, 0],
[0, 2, 0],
[0, 0, 3]])
self._testDiagOp(x, numpy.int32, expected_ans)
self._testDiagOp(x, numpy.int64, expected_ans)
self.diagOp(x, numpy.int32, expected_ans)
self.diagOp(x, numpy.int64, expected_ans)
def testRankOneFloatTensor(self):
x = numpy.array([1.1, 2.2, 3.3])
@ -51,8 +54,8 @@ class GenerateIdentityTensorTest(tf.test.TestCase):
[[1.1, 0, 0],
[0, 2.2, 0],
[0, 0, 3.3]])
self._testDiagOp(x, numpy.float32, expected_ans)
self._testDiagOp(x, numpy.float64, expected_ans)
self.diagOp(x, numpy.float32, expected_ans)
self.diagOp(x, numpy.float64, expected_ans)
def testRankTwoIntTensor(self):
x = numpy.array([[1, 2, 3], [4, 5, 6]])
@ -63,8 +66,8 @@ class GenerateIdentityTensorTest(tf.test.TestCase):
[[[0, 0, 0], [4, 0, 0]],
[[0, 0, 0], [0, 5, 0]],
[[0, 0, 0], [0, 0, 6]]]])
self._testDiagOp(x, numpy.int32, expected_ans)
self._testDiagOp(x, numpy.int64, expected_ans)
self.diagOp(x, numpy.int32, expected_ans)
self.diagOp(x, numpy.int64, expected_ans)
def testRankTwoFloatTensor(self):
x = numpy.array([[1.1, 2.2, 3.3], [4.4, 5.5, 6.6]])
@ -75,8 +78,8 @@ class GenerateIdentityTensorTest(tf.test.TestCase):
[[[0, 0, 0], [4.4, 0, 0]],
[[0, 0, 0], [0, 5.5, 0]],
[[0, 0, 0], [0, 0, 6.6]]]])
self._testDiagOp(x, numpy.float32, expected_ans)
self._testDiagOp(x, numpy.float64, expected_ans)
self.diagOp(x, numpy.float32, expected_ans)
self.diagOp(x, numpy.float64, expected_ans)
def testRankThreeFloatTensor(self):
x = numpy.array([[[1.1, 2.2], [3.3, 4.4]],
@ -90,8 +93,64 @@ class GenerateIdentityTensorTest(tf.test.TestCase):
[[[0, 0], [0, 0]], [[0, 6.6], [0, 0]]]],
[[[[0, 0], [0, 0]], [[0, 0], [7.7, 0]]],
[[[0, 0], [0, 0]], [[0, 0], [0, 8.8]]]]]])
self._testDiagOp(x, numpy.float32, expected_ans)
self._testDiagOp(x, numpy.float64, expected_ans)
self.diagOp(x, numpy.float32, expected_ans)
self.diagOp(x, numpy.float64, expected_ans)
class DiagPartOpTest(tf.test.TestCase):
def setUp(self):
numpy.random.seed(0)
def diagPartOp(self, tensor, dtype, expected_ans, use_gpu=False):
with self.test_session(use_gpu=use_gpu):
tf_ans_inv = tf.diag_part(tensor)
inv_out = tf_ans_inv.eval()
self.assertAllClose(inv_out, expected_ans)
self.assertShapeEqual(expected_ans, tf_ans_inv)
def testRankTwoFloatTensor(self):
x = numpy.random.rand(3, 3)
i = numpy.arange(3)
expected_ans = x[i, i]
self.diagPartOp(x, numpy.float32, expected_ans)
self.diagPartOp(x, numpy.float64, expected_ans)
def testRankFourFloatTensor(self):
x = numpy.random.rand(2, 3, 2, 3)
i = numpy.arange(2)[:, None]
j = numpy.arange(3)
expected_ans = x[i, j, i, j]
self.diagPartOp(x, numpy.float32, expected_ans)
self.diagPartOp(x, numpy.float64, expected_ans)
def testRankSixFloatTensor(self):
x = numpy.random.rand(2, 2, 2, 2, 2, 2)
i = numpy.arange(2)[:, None, None]
j = numpy.arange(2)[:, None]
k = numpy.arange(2)
expected_ans = x[i, j, k, i, j, k]
self.diagPartOp(x, numpy.float32, expected_ans)
self.diagPartOp(x, numpy.float64, expected_ans)
def testOddRank(self):
w = numpy.random.rand(2)
x = numpy.random.rand(2, 2, 2)
y = numpy.random.rand(2, 2, 2, 2, 2)
z = numpy.random.rand(2, 2, 2, 2, 2, 2, 2)
self.assertRaises(ValueError, self.diagPartOp, w, numpy.float32, 0)
self.assertRaises(ValueError, self.diagPartOp, x, numpy.float32, 0)
self.assertRaises(ValueError, self.diagPartOp, y, numpy.float32, 0)
self.assertRaises(ValueError, self.diagPartOp, z, numpy.float32, 0)
def testUnevenDimensions(self):
w = numpy.random.rand(2, 5)
x = numpy.random.rand(2, 1, 2, 3)
y = numpy.random.rand(2, 1, 2, 1, 2, 5)
z = numpy.random.rand(2, 2, 2, 2, 2, 2, 2, 2)
self.assertRaises(ValueError, self.diagPartOp, w, numpy.float32, 0)
self.assertRaises(ValueError, self.diagPartOp, x, numpy.float32, 0)
self.assertRaises(ValueError, self.diagPartOp, y, numpy.float32, 0)
self.assertRaises(ValueError, self.diagPartOp, z, numpy.float32, 0)
if __name__ == "__main__":
tf.test.main()

View File

@ -25,7 +25,7 @@ from tensorflow.python.framework import random_seed
from tensorflow.python.ops import init_ops
# Returns true iff the two initalizers produce the same tensor to
# Returns true iff the two initializers produce the same tensor to
# within a tiny tolerance.
def identicaltest(tc, init1, init2, use_gpu):
"""Tests if two initializations are identical to within tiny tolerances.

View File

@ -120,7 +120,7 @@ class MatMulTest(tf.test.TestCase):
self._testCpuMatmul(x, y, True, True)
self._testGpuMatmul(x, y, True, True)
def testDoubleRandomTranposeBoth(self):
def testDoubleRandomTransposeBoth(self):
for _ in range(10):
n, k, m = np.random.randint(1, 100, size=3)
x = self._randMatrix(k, n, np.float64)

View File

@ -116,8 +116,8 @@ class SumReductionTest(tf.test.TestCase):
# Simple tests for various types.
def testDoubleReduce1D(self):
np_arr = np.arange(1, 6).reshape([5]).astype(np.float64)
self._compare(np_arr, [], False)
self._compare(np_arr, [0], False)
self._compareAll(np_arr, [])
self._compareAll(np_arr, [0])
def testInt32Reduce1D(self):
np_arr = np.arange(1, 6).reshape([5]).astype(np.int32)
@ -230,6 +230,19 @@ class MeanReductionTest(tf.test.TestCase):
self._compareAll(np_arr, [0, 2])
self._compareAll(np_arr, [0, 1, 2])
def testDoubleReduce3D(self):
# Create a 3D array of doubles and reduce across all possible
# dimensions
np_arr = np.arange(0, 30).reshape([2, 3, 5]).astype(np.float64)
self._compareAll(np_arr, [])
self._compareAll(np_arr, [0])
self._compareAll(np_arr, [1])
self._compareAll(np_arr, [2])
self._compareAll(np_arr, [0, 1])
self._compareAll(np_arr, [1, 2])
self._compareAll(np_arr, [0, 2])
self._compareAll(np_arr, [0, 1, 2])
def testGradient(self):
s = [2, 3, 4, 2]
x = np.arange(1.0, 49.0).reshape(s).astype(np.float32)
@ -383,6 +396,19 @@ class MinReductionTest(tf.test.TestCase):
self._compareAll(np_arr, [0, 2])
self._compareAll(np_arr, [0, 1, 2])
def testDoubleReduce3D(self):
# Create a 3D array of doubles and reduce across all possible
# dimensions
np_arr = np.arange(0, 30).reshape([2, 3, 5]).astype(np.float64)
self._compareAll(np_arr, [])
self._compareAll(np_arr, [0])
self._compareAll(np_arr, [1])
self._compareAll(np_arr, [2])
self._compareAll(np_arr, [0, 1])
self._compareAll(np_arr, [1, 2])
self._compareAll(np_arr, [0, 2])
self._compareAll(np_arr, [0, 1, 2])
def testGradient(self):
s = [2, 3, 4, 2]
x = np.arange(1.0, 49.0).reshape(s).astype(np.float64)
@ -477,6 +503,20 @@ class MaxReductionTest(tf.test.TestCase):
self._compareAll(np_arr, [0, 2])
self._compareAll(np_arr, [0, 1, 2])
def testDoubleReduce3D(self):
# Create a 3D array of doubles and reduce across all possible
# dimensions
np_arr = np.arange(0, 30).reshape([2, 3, 5]).astype(np.float64)
self._compareAll(np_arr, None)
self._compareAll(np_arr, [])
self._compareAll(np_arr, [0])
self._compareAll(np_arr, [1])
self._compareAll(np_arr, [2])
self._compareAll(np_arr, [0, 1])
self._compareAll(np_arr, [1, 2])
self._compareAll(np_arr, [0, 2])
self._compareAll(np_arr, [0, 1, 2])
def testGradient(self):
s = [2, 3, 4, 2]
x = np.arange(1.0, 49.0).reshape(s).astype(np.float64)

View File

@ -782,11 +782,11 @@ class BidirectionalRNNTest(tf.test.TestCase):
tf.float32,
shape=(batch_size, input_size) if use_shape else (None, input_size))
]
outputs = tf.nn.bidirectional_rnn(cell_fw,
cell_bw,
inputs,
dtype=tf.float32,
sequence_length=sequence_length)
outputs, state_fw, state_bw = tf.nn.bidirectional_rnn(cell_fw,
cell_bw,
inputs,
dtype=tf.float32,
sequence_length=sequence_length)
self.assertEqual(len(outputs), len(inputs))
for out in outputs:
self.assertEqual(
@ -794,17 +794,19 @@ class BidirectionalRNNTest(tf.test.TestCase):
[batch_size if use_shape else None, 2 * num_units])
input_value = np.random.randn(batch_size, input_size)
outputs = tf.pack(outputs)
return input_value, inputs, outputs, sequence_length
return input_value, inputs, outputs, state_fw, state_bw, sequence_length
def _testBidirectionalRNN(self, use_gpu, use_shape):
with self.test_session(use_gpu=use_gpu, graph=tf.Graph()) as sess:
input_value, inputs, outputs, sequence_length = (
input_value, inputs, outputs, state_fw, state_bw, sequence_length = (
self._createBidirectionalRNN(use_gpu, use_shape, True))
tf.initialize_all_variables().run()
# Run with pre-specified sequence length of 2, 3
out = sess.run(outputs, feed_dict={inputs[0]: input_value,
sequence_length: [2, 3]})
out, s_fw, s_bw = sess.run([outputs, state_fw, state_bw],
feed_dict={inputs[0]: input_value,
sequence_length: [2, 3]})
# Since the forward and backward LSTM cells were initialized with the
# same parameters, the forward and backward output has to be the same,
@ -836,13 +838,17 @@ class BidirectionalRNNTest(tf.test.TestCase):
self.assertEqual(out[2][1][0], out[0][1][3])
self.assertEqual(out[2][1][1], out[0][1][4])
self.assertEqual(out[2][1][2], out[0][1][5])
# Via the reasoning above, the forward and backward final state should be
# exactly the same
self.assertAllClose(s_fw, s_bw)
def _testBidirectionalRNNWithoutSequenceLength(self, use_gpu, use_shape):
with self.test_session(use_gpu=use_gpu, graph=tf.Graph()) as sess:
input_value, inputs, outputs, _ = self._createBidirectionalRNN(
use_gpu, use_shape, False)
input_value, inputs, outputs, state_fw, state_bw, _ = self._createBidirectionalRNN(
use_gpu, use_shape, False)
tf.initialize_all_variables().run()
out = sess.run(outputs, feed_dict={inputs[0]: input_value})
out, s_fw, s_bw = sess.run([outputs, state_fw, state_bw],
feed_dict={inputs[0]: input_value})
# Since the forward and backward LSTM cells were initialized with the
# same parameters, the forward and backward output has to be the same,
@ -861,6 +867,9 @@ class BidirectionalRNNTest(tf.test.TestCase):
self.assertEqual(out[i][1][0], out[8 - 1 - i][1][3])
self.assertEqual(out[i][1][1], out[8 - 1 - i][1][4])
self.assertEqual(out[i][1][2], out[8 - 1 - i][1][5])
# Via the reasoning above, the forward and backward final state should be
# exactly the same
self.assertAllClose(s_fw, s_bw)
def testBidirectionalRNN(self):
self._testBidirectionalRNN(use_gpu=False, use_shape=False)

View File

@ -495,6 +495,105 @@ class Seq2SeqTest(tf.test.TestCase):
if len(perplexities[bucket]) > 1: # Assert that perplexity went down.
self.assertLess(perplexities[bucket][-1], perplexities[bucket][0])
def testModelWithBooleanFeedPrevious(self):
"""Test the model behavior when feed_previous is True.
For example, the following two cases have the same effect:
- Train `embedding_rnn_seq2seq` with `feed_previous=True`, which contains
an `embedding_rnn_decoder` with `feed_previous=True` and
`update_embedding_for_previous=True`. The decoder is fed with "<Go>"
and outputs "A, B, C".
- Train `embedding_rnn_seq2seq` with `feed_previous=False`. The decoder
is fed with "<Go>, A, B".
"""
num_encoder_symbols = 3
num_decoder_symbols = 5
batch_size = 2
num_enc_timesteps = 2
num_dec_timesteps = 3
def TestModel(seq2seq):
with self.test_session(graph=tf.Graph()) as sess:
tf.set_random_seed(111)
random.seed(111)
np.random.seed(111)
enc_inp = [tf.constant(i + 1, tf.int32, shape=[batch_size])
for i in range(num_enc_timesteps)]
dec_inp_fp_true = [tf.constant(i, tf.int32, shape=[batch_size])
for i in range(num_dec_timesteps)]
dec_inp_holder_fp_false = [tf.placeholder(tf.int32, shape=[batch_size])
for _ in range(num_dec_timesteps)]
targets = [tf.constant(i + 1, tf.int32, shape=[batch_size])
for i in range(num_dec_timesteps)]
weights = [tf.constant(1.0, shape=[batch_size])
for i in range(num_dec_timesteps)]
def ForwardBackward(enc_inp, dec_inp, feed_previous):
scope_name = "fp_{}".format(feed_previous)
with tf.variable_scope(scope_name):
dec_op, _ = seq2seq(enc_inp, dec_inp, feed_previous=feed_previous)
net_variables = tf.get_collection(tf.GraphKeys.VARIABLES,
scope_name)
optimizer = tf.train.AdamOptimizer(0.03, epsilon=1e-5)
update_op = optimizer.minimize(
tf.nn.seq2seq.sequence_loss(dec_op, targets, weights),
var_list=net_variables)
return dec_op, update_op, net_variables
dec_op_fp_true, update_fp_true, variables_fp_true = ForwardBackward(
enc_inp, dec_inp_fp_true, feed_previous=True)
dec_op_fp_false, update_fp_false, variables_fp_false = ForwardBackward(
enc_inp, dec_inp_holder_fp_false, feed_previous=False)
sess.run(tf.initialize_all_variables())
# We only check consistencies between the variables existing in both
# the models with True and False feed_previous. Variables created by
# the loop_function in the model with True feed_previous are ignored.
v_false_name_dict = {v.name.split('/', 1)[-1]: v
for v in variables_fp_false}
matched_variables = [(v, v_false_name_dict[v.name.split('/', 1)[-1]])
for v in variables_fp_true]
for v_true, v_false in matched_variables:
sess.run(tf.assign(v_false, v_true))
# Take the symbols generated by the decoder with feed_previous=True as
# the true input symbols for the decoder with feed_previous=False.
dec_fp_true = sess.run(dec_op_fp_true)
output_symbols_fp_true = np.argmax(dec_fp_true, axis=2)
dec_inp_fp_false = np.vstack((dec_inp_fp_true[0].eval(),
output_symbols_fp_true[:-1]))
sess.run(update_fp_true)
sess.run(update_fp_false,
{holder: inp for holder, inp in zip(dec_inp_holder_fp_false,
dec_inp_fp_false)})
for v_true, v_false in matched_variables:
self.assertAllClose(v_true.eval(), v_false.eval())
def EmbeddingRNNSeq2SeqF(enc_inp, dec_inp, feed_previous):
cell = tf.nn.rnn_cell.BasicLSTMCell(2)
return tf.nn.seq2seq.embedding_rnn_seq2seq(
enc_inp, dec_inp, cell, num_encoder_symbols,
num_decoder_symbols, feed_previous=feed_previous)
def EmbeddingTiedRNNSeq2Seq(enc_inp, dec_inp, feed_previous):
cell = tf.nn.rnn_cell.BasicLSTMCell(2)
return tf.nn.seq2seq.embedding_tied_rnn_seq2seq(
enc_inp, dec_inp, cell, num_decoder_symbols,
feed_previous=feed_previous)
def EmbeddingAttentionSeq2Seq(enc_inp, dec_inp, feed_previous):
cell = tf.nn.rnn_cell.BasicLSTMCell(2)
return tf.nn.seq2seq.embedding_attention_seq2seq(
enc_inp, dec_inp, cell, num_encoder_symbols,
num_decoder_symbols, feed_previous=feed_previous)
for model in (EmbeddingRNNSeq2SeqF, EmbeddingTiedRNNSeq2Seq,
EmbeddingAttentionSeq2Seq):
TestModel(model)
if __name__ == "__main__":
tf.test.main()

View File

@ -0,0 +1,71 @@
# Copyright 2015 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy
import tensorflow as tf
class TraceTest(tf.test.TestCase):
def setUp(self):
numpy.random.seed(0)
def traceOp(self, x, dtype, expected_ans, use_gpu=False):
with self.test_session(use_gpu=use_gpu):
tf_ans = tf.trace(x.astype(dtype))
out = tf_ans.eval()
self.assertAllClose(out, expected_ans)
def testEmptyTensor(self):
x = numpy.array([])
self.assertRaises(ValueError, self.traceOp, x, numpy.float32, 0)
def testRankOneTensor(self):
x = numpy.array([1,2,3])
self.assertRaises(ValueError, self.traceOp, x, numpy.float32, 0)
def testRankTwoIntTensor(self):
x = numpy.array(
[[1, 0, 0],
[0, 2, 0],
[0, 0, 3]])
expected_ans = 6
self.traceOp(x, numpy.int32, expected_ans)
self.traceOp(x, numpy.int64, expected_ans)
def testRankTwoFloatTensor(self):
x = numpy.array(
[[1.1, 0, 0],
[0, 2.2, 0],
[0, 0, 3.3]])
expected_ans = 6.6
self.traceOp(x, numpy.float32, expected_ans)
self.traceOp(x, numpy.float64, expected_ans)
def testRankThreeFloatTensor(self):
x = numpy.random.rand(2, 2, 2)
self.assertRaises(ValueError, self.traceOp, x, numpy.float32, 0)
def testRankFourFloatTensor(self):
x = numpy.random.rand(2, 2, 2, 2)
self.assertRaises(ValueError, self.traceOp, x, numpy.float32, 0)
if __name__ == "__main__":
tf.test.main()

View File

@ -846,6 +846,35 @@ def _DiagShape(op):
input_shape = op.inputs[0].get_shape().with_rank_at_most(3)
return [input_shape.concatenate(input_shape)]
@ops.RegisterShape("DiagPart")
def _DiagPartShape(op):
"""Shape function for array_ops.diag_part.
This op has one input (of rank k = 2, 4, or 6), and one output (of rank k/2),
where the shape of the output is the diagonal of the input shape.
Args:
op: A DiagPart Operation.
Returns:
A single-element list containing the shape of the output.
Raises:
ValueError: If input has odd rank or rank greater than 6.
"""
shape = op.inputs[0].get_shape()
rank = len(shape)
mid = rank // 2
if rank % 2 or rank > 6:
raise ValueError("Input must have even rank <= 6, input rank is " +
str(rank) + "." )
if shape[:mid] != shape[mid:]:
raise ValueError("Invalid shape, shape[:mid] " + str(shape[:mid]) +
" and shape[mid:] " + str(shape[mid:]) +
" do not match ")
input_shape = shape.with_rank_at_most(6)
return [input_shape[:len(input_shape) // 2]]
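For intuition, the op this shape function describes behaves like so (a small sketch):

```python
x = tf.constant([[1.0, 2.0],
                 [3.0, 4.0]])  # rank 2, shape (2, 2)
tf.diag_part(x)                # ==> [1.0, 4.0], shape (2,)
```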
@ops.RegisterShape("ExpandDims")
def _ExpandDimsShape(op):
@ -1360,7 +1389,7 @@ def _SpaceToDepthShape(op):
* input: a tensor of shape like that [B, H, W, D]
* block_size: an int.
Its output is the the same-rank tensor but with changed
Its output is the same-rank tensor but with changed
dimensions like that: [B, H/block_size, W/block_size, D*block_size*block_size]
Args:
@ -1408,7 +1437,7 @@ def _DepthToSpaceShape(op):
* input: a tensor of shape like that [B, H, W, D]
* block_size: an int.
Its output is the the same-rank tensor but with changed
Its output is the same-rank tensor but with changed
dimensions like that:
[B, H*block_size, W*block_size, D/(block_size*block_size)]

View File

@ -308,6 +308,7 @@ def flip_left_right(image):
Raises:
ValueError: if the shape of `image` not supported.
"""
image = ops.convert_to_tensor(image, name='image')
_Check3DImage(image, require_static=False)
return array_ops.reverse(image, [False, True, False])
@ -329,6 +330,7 @@ def flip_up_down(image):
Raises:
ValueError: if the shape of `image` not supported.
"""
image = ops.convert_to_tensor(image, name='image')
_Check3DImage(image, require_static=False)
return array_ops.reverse(image, [True, False, False])

View File

@ -741,7 +741,14 @@ class ResizeImagesTest(test_util.TensorFlowTestCase):
image_ops.ResizeMethod.AREA]
TYPES = [np.uint8, np.int8, np.int16, np.int32, np.int64,
np.float, np.double]
np.float32, np.float64]
def availableGPUModes(self, opt, nptype):
if opt == image_ops.ResizeMethod.NEAREST_NEIGHBOR \
and nptype in [np.float32, np.float64]:
return [True, False]
else:
return [False]
def testNoOp(self):
img_shape = [1, 6, 4, 1]
@ -761,13 +768,14 @@ class ResizeImagesTest(test_util.TensorFlowTestCase):
img_np = np.array(data, dtype=nptype).reshape(img_shape)
for opt in self.OPTIONS:
with self.test_session() as sess:
image = constant_op.constant(img_np, shape=img_shape)
y = image_ops.resize_images(image, target_height, target_width, opt)
yshape = array_ops.shape(y)
resized, newshape = sess.run([y, yshape])
self.assertAllEqual(img_shape, newshape)
self.assertAllClose(resized, img_np, atol=1e-5)
for use_gpu in self.availableGPUModes(opt, nptype):
with self.test_session(use_gpu=use_gpu) as sess:
image = constant_op.constant(img_np, shape=img_shape)
y = image_ops.resize_images(image, target_height, target_width, opt)
yshape = array_ops.shape(y)
resized, newshape = sess.run([y, yshape])
self.assertAllEqual(img_shape, newshape)
self.assertAllClose(resized, img_np, atol=1e-5)
# Resizing with a single image must leave the shape unchanged also.
with self.test_session():
@ -857,12 +865,13 @@ class ResizeImagesTest(test_util.TensorFlowTestCase):
img_np = np.array(data, dtype=nptype).reshape(img_shape)
for opt in self.OPTIONS:
with self.test_session():
image = constant_op.constant(img_np, shape=img_shape)
y = image_ops.resize_images(image, target_height, target_width, opt)
expected = np.array(expected_data).reshape(target_shape)
resized = y.eval()
self.assertAllClose(resized, expected, atol=1e-5)
for use_gpu in self.availableGPUModes(opt, nptype):
with self.test_session(use_gpu=use_gpu):
image = constant_op.constant(img_np, shape=img_shape)
y = image_ops.resize_images(image, target_height, target_width, opt)
expected = np.array(expected_data).reshape(target_shape)
resized = y.eval()
self.assertAllClose(resized, expected, atol=1e-5)
def testResizeUp(self):
img_shape = [1, 3, 2, 1]
@ -899,14 +908,15 @@ class ResizeImagesTest(test_util.TensorFlowTestCase):
image_ops.ResizeMethod.BILINEAR,
image_ops.ResizeMethod.NEAREST_NEIGHBOR,
image_ops.ResizeMethod.AREA]:
with self.test_session():
img_np = np.array(data, dtype=nptype).reshape(img_shape)
image = constant_op.constant(img_np, shape=img_shape)
y = image_ops.resize_images(image, target_height, target_width, opt)
resized = y.eval()
expected = np.array(expected_data[opt]).reshape(
[1, target_height, target_width, 1])
self.assertAllClose(resized, expected, atol=1e-05)
for use_gpu in self.availableGPUModes(opt, nptype):
with self.test_session(use_gpu=use_gpu):
img_np = np.array(data, dtype=nptype).reshape(img_shape)
image = constant_op.constant(img_np, shape=img_shape)
y = image_ops.resize_images(image, target_height, target_width, opt)
resized = y.eval()
expected = np.array(expected_data[opt]).reshape(
[1, target_height, target_width, 1])
self.assertAllClose(resized, expected, atol=1e-05)
def testResizeUpBicubic(self):
img_shape = [1, 6, 6, 1]
@ -964,6 +974,28 @@ class ResizeImagesTest(test_util.TensorFlowTestCase):
self.assertAllClose(resized, expected, atol=1)
def testCompareNearestNeighbor(self):
input_shape = [1, 5, 6, 3]
target_height = 8
target_width = 12
for nptype in [np.float32, np.float64]:
for align_corners in [True, False]:
img_np = np.arange(0, np.prod(input_shape), dtype=nptype).reshape(input_shape)
with self.test_session(use_gpu=True):
image = constant_op.constant(img_np, shape=input_shape)
out_op = image_ops.resize_images(image, target_height, target_width,
image_ops.ResizeMethod.NEAREST_NEIGHBOR,
align_corners=align_corners)
gpu_val = out_op.eval()
with self.test_session(use_gpu=False):
image = constant_op.constant(img_np, shape=input_shape)
out_op = image_ops.resize_images(image, target_height, target_width,
image_ops.ResizeMethod.NEAREST_NEIGHBOR,
align_corners=align_corners)
cpu_val = out_op.eval()
self.assertAllClose(cpu_val, gpu_val, rtol=1e-5, atol=1e-5)
class ResizeImageWithCropOrPadTest(test_util.TensorFlowTestCase):
def _ResizeImageWithCropOrPad(self, original, original_shape,

View File

@ -63,6 +63,8 @@ TensorFlow provides several operations that you can use to add basic
mathematical functions for matrices to your graph.
@@diag
@@diag_part
@@trace
@@transpose
@@matmul
@ -921,6 +923,39 @@ def reduce_any(input_tensor, reduction_indices=None, keep_dims=False,
keep_dims, name=name)
def trace(x, name=None):
""" Compute the trace of a tensor `x`.
`trace(x)` returns the sum of along the diagonal.
For example:
```python
# 'x' is [[1, 1],
# [1, 1]]
tf.trace(x) ==> 2
# 'x' is [[1,2,3],
# [4,5,6],
# [7,8,9]]
tf.trace(x) ==> 15
```
Args:
x: 2-D tensor.
name: A name for the operation (optional).
Returns:
The trace of input tensor.
"""
with ops.op_scope([x], name, "Trace") as name:
x = ops.convert_to_tensor(x, name="x")
if len(x.get_shape()) != 2:
raise ValueError("Expected a tensor with rank 2, rank %d tensor received"
% len(x.get_shape()))
return reduce_sum(array_ops.diag_part(x), name=name)
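A quick usage check (a sketch):

```python
x = tf.constant([[1, 2], [3, 4]])
with tf.Session() as sess:
  print(sess.run(tf.trace(x)))  # ==> 5 (1 + 4)
```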
def matmul(a, b,
transpose_a=False, transpose_b=False,
a_is_sparse=False, b_is_sparse=False,

View File

@ -194,7 +194,7 @@ def softmax_cross_entropy_with_logits(logits, labels, name=None):
example, each CIFAR-10 image is labeled with one and only one label: an image
can be a dog or a truck, but not both.
**NOTE:**: While the classes are mutually exclusive, their probabilities
**NOTE:** While the classes are mutually exclusive, their probabilities
need not be. All that is required is that each row of `labels` is
a valid probability distribution. If using exclusive `labels`
(wherein one and only one class is true at a time), see
@ -231,7 +231,7 @@ def sparse_softmax_cross_entropy_with_logits(logits, labels, name=None):
example, each CIFAR-10 image is labeled with one and only one label: an image
can be a dog or a truck, but not both.
**NOTE:**: For this operation, the probability of a given label is considered
**NOTE:** For this operation, the probability of a given label is considered
exclusive. That is, soft classes are not allowed, and the `labels` vector
must provide a single specific index for the true class for each row of
`logits` (each minibatch entry). For soft softmax classification with

View File

@ -312,9 +312,11 @@ def bidirectional_rnn(cell_fw, cell_bw, inputs,
scope: VariableScope for the created subgraph; defaults to "BiRNN"
Returns:
A set of output `Tensors` where:
A tuple (outputs, output_state_fw, output_state_bw) where:
outputs is a length T list of outputs (one for each input), which
are depth-concatenated forward and backward outputs
output_state_fw is the final state of the forward rnn
output_state_bw is the final state of the backward rnn
Raises:
TypeError: If "cell_fw" or "cell_bw" is not an instance of RNNCell.
@ -333,19 +335,19 @@ def bidirectional_rnn(cell_fw, cell_bw, inputs,
name = scope or "BiRNN"
# Forward direction
with vs.variable_scope(name + "_FW") as fw_scope:
output_fw, _ = rnn(cell_fw, inputs, initial_state_fw, dtype,
output_fw, output_state_fw = rnn(cell_fw, inputs, initial_state_fw, dtype,
sequence_length, scope=fw_scope)
# Backward direction
with vs.variable_scope(name + "_BW") as bw_scope:
tmp, _ = rnn(cell_bw, _reverse_seq(inputs, sequence_length),
tmp, output_state_bw = rnn(cell_bw, _reverse_seq(inputs, sequence_length),
initial_state_bw, dtype, sequence_length, scope=bw_scope)
output_bw = _reverse_seq(tmp, sequence_length)
# Concat each of the forward/backward outputs
outputs = [array_ops.concat(1, [fw, bw])
for fw, bw in zip(output_fw, output_bw)]
return outputs
return (outputs, output_state_fw, output_state_bw)
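Callers now unpack three values instead of one (a sketch; `cell_fw`, `cell_bw`, and `inputs` assumed built as before):

```python
outputs, state_fw, state_bw = tf.nn.bidirectional_rnn(
    cell_fw, cell_bw, inputs, dtype=tf.float32)
```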
def dynamic_rnn(cell, inputs, sequence_length=None, initial_state=None,

View File

@ -73,6 +73,34 @@ from tensorflow.python.ops import rnn_cell
from tensorflow.python.ops import variable_scope
def _extract_argmax_and_embed(embedding, output_projection=None,
update_embedding=True):
"""Get a loop_function that extracts the previous symbol and embeds it.
Args:
embedding: embedding tensor for symbols.
output_projection: None or a pair (W, B). If provided, each fed previous
output will first be multiplied by W and added B.
update_embedding: Boolean; if False, the gradients will not propagate
through the embeddings.
Returns:
A loop function.
"""
def loop_function(prev, _):
if output_projection is not None:
prev = nn_ops.xw_plus_b(
prev, output_projection[0], output_projection[1])
prev_symbol = math_ops.argmax(prev, 1)
# Note that gradients will not propagate through the second parameter of
# embedding_lookup.
emb_prev = embedding_ops.embedding_lookup(embedding, prev_symbol)
if not update_embedding:
emb_prev = array_ops.stop_gradient(emb_prev)
return emb_prev
return loop_function
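A sketch of how this helper is wired into a decoder (names from this module; `emb_inp`, `initial_state`, and `cell` assumed built as in the callers below):

```python
# feed_previous=True: feed the decoder its own argmax predictions back in
loop_fn = _extract_argmax_and_embed(embedding, output_projection,
                                    update_embedding=False)
outputs, state = rnn_decoder(emb_inp, initial_state, cell,
                             loop_function=loop_fn)
```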
def rnn_decoder(decoder_inputs, initial_state, cell, loop_function=None,
scope=None):
"""RNN decoder for the sequence-to-sequence model.
@ -107,14 +135,13 @@ def rnn_decoder(decoder_inputs, initial_state, cell, loop_function=None,
for i, inp in enumerate(decoder_inputs):
if loop_function is not None and prev is not None:
with variable_scope.variable_scope("loop_function", reuse=True):
# We do not propagate gradients over the loop function.
inp = array_ops.stop_gradient(loop_function(prev, i))
inp = loop_function(prev, i)
if i > 0:
variable_scope.get_variable_scope().reuse_variables()
output, state = cell(inp, state)
outputs.append(output)
if loop_function is not None:
prev = array_ops.stop_gradient(output)
prev = output
return outputs, state
@ -182,7 +209,7 @@ def tied_rnn_seq2seq(encoder_inputs, decoder_inputs, cell,
def embedding_rnn_decoder(decoder_inputs, initial_state, cell, num_symbols,
output_projection=None, feed_previous=False,
scope=None):
update_embedding_for_previous=True, scope=None):
"""RNN decoder with embedding and a pure-decoding option.
Args:
@ -200,6 +227,11 @@ def embedding_rnn_decoder(decoder_inputs, initial_state, cell, num_symbols,
In effect, this implements a greedy decoder. It can also be used
during training to emulate http://arxiv.org/abs/1506.03099.
If False, decoder_inputs are used as given (the standard decoder case).
update_embedding_for_previous: Boolean; if False and feed_previous=True,
only the embedding for the first symbol of decoder_inputs (the "GO"
symbol) will be updated by back propagation. Embeddings for the symbols
generated from the decoder itself remain unchanged. This parameter has
no effect if feed_previous=False.
scope: VariableScope for the created subgraph; defaults to
"embedding_rnn_decoder".
@ -227,16 +259,9 @@ def embedding_rnn_decoder(decoder_inputs, initial_state, cell, num_symbols,
with ops.device("/cpu:0"):
embedding = variable_scope.get_variable("embedding",
[num_symbols, cell.input_size])
def extract_argmax_and_embed(prev, _):
"""Loop_function that extracts the symbol from prev and embeds it."""
if output_projection is not None:
prev = nn_ops.xw_plus_b(
prev, output_projection[0], output_projection[1])
prev_symbol = array_ops.stop_gradient(math_ops.argmax(prev, 1))
return embedding_ops.embedding_lookup(embedding, prev_symbol)
loop_function = extract_argmax_and_embed if feed_previous else None
loop_function = _extract_argmax_and_embed(
embedding, output_projection,
update_embedding_for_previous) if feed_previous else None
emb_inp = (
embedding_ops.embedding_lookup(embedding, i) for i in decoder_inputs)
return rnn_decoder(emb_inp, initial_state, cell,
@ -306,7 +331,8 @@ def embedding_rnn_seq2seq(encoder_inputs, decoder_inputs, cell,
outputs, state = embedding_rnn_decoder(
decoder_inputs, encoder_state, cell, num_decoder_symbols,
output_projection=output_projection,
feed_previous=feed_previous_bool)
feed_previous=feed_previous_bool,
update_embedding_for_previous=False)
return outputs + [state]
outputs_and_state = control_flow_ops.cond(feed_previous,
@ -372,25 +398,19 @@ def embedding_tied_rnn_seq2seq(encoder_inputs, decoder_inputs, cell,
emb_decoder_inputs = [embedding_ops.embedding_lookup(embedding, x)
for x in decoder_inputs]
def extract_argmax_and_embed(prev, _):
"""Loop_function that extracts the symbol from prev and embeds it."""
if output_projection is not None:
prev = nn_ops.xw_plus_b(
prev, output_projection[0], output_projection[1])
prev_symbol = array_ops.stop_gradient(math_ops.argmax(prev, 1))
return embedding_ops.embedding_lookup(embedding, prev_symbol)
if output_projection is None:
cell = rnn_cell.OutputProjectionWrapper(cell, num_symbols)
if isinstance(feed_previous, bool):
loop_function = extract_argmax_and_embed if feed_previous else None
loop_function = _extract_argmax_and_embed(
embedding, output_projection, True) if feed_previous else None
return tied_rnn_seq2seq(emb_encoder_inputs, emb_decoder_inputs, cell,
loop_function=loop_function, dtype=dtype)
# If feed_previous is a Tensor, we construct 2 graphs and use cond.
def decoder(feed_previous_bool):
loop_function = extract_argmax_and_embed if feed_previous_bool else None
loop_function = _extract_argmax_and_embed(
embedding, output_projection, False) if feed_previous_bool else None
reuse = None if feed_previous_bool else True
with variable_scope.variable_scope(variable_scope.get_variable_scope(),
reuse=reuse):
@ -523,7 +543,7 @@ def attention_decoder(decoder_inputs, initial_state, attention_states, cell,
# If loop_function is set, we use it instead of decoder_inputs.
if loop_function is not None and prev is not None:
with variable_scope.variable_scope("loop_function", reuse=True):
inp = array_ops.stop_gradient(loop_function(prev, i))
inp = loop_function(prev, i)
# Merge input and previous attentions into one vector of the right size.
x = rnn_cell.linear([inp] + attns, cell.input_size, True)
# Run the RNN.
@ -539,8 +559,7 @@ def attention_decoder(decoder_inputs, initial_state, attention_states, cell,
with variable_scope.variable_scope("AttnOutputProjection"):
output = rnn_cell.linear([cell_output] + attns, output_size, True)
if loop_function is not None:
-        # We do not propagate gradients over the loop function.
-        prev = array_ops.stop_gradient(output)
+        prev = output
outputs.append(output)
return outputs, state
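Both `attention_decoder` hunks above remove an unconditional `stop_gradient`, so gradients can now flow through the previous output during feed-previous decoding; the optional cut lives inside the loop function instead. For reference, `stop_gradient` keeps the forward value but severs the backward path — a minimal standalone sketch:

```python
import tensorflow as tf

w = tf.Variable(1.0)
out = w * 2.0
cut = tf.stop_gradient(out)           # same forward value as `out`
grad_out, = tf.gradients(out, [w])    # a tensor evaluating to 2.0
grad_cut, = tf.gradients(cut, [w])    # None: no path from `cut` back to `w`
```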
@ -549,8 +568,10 @@ def attention_decoder(decoder_inputs, initial_state, attention_states, cell,
def embedding_attention_decoder(decoder_inputs, initial_state, attention_states,
cell, num_symbols, num_heads=1,
output_size=None, output_projection=None,
-                                feed_previous=False, dtype=dtypes.float32,
-                                scope=None, initial_state_attention=False):
+                                feed_previous=False,
+                                update_embedding_for_previous=True,
+                                dtype=dtypes.float32, scope=None,
+                                initial_state_attention=False):
"""RNN decoder with embedding and attention and a pure-decoding option.
Args:
@ -571,6 +592,11 @@ def embedding_attention_decoder(decoder_inputs, initial_state, attention_states,
In effect, this implements a greedy decoder. It can also be used
during training to emulate http://arxiv.org/abs/1506.03099.
If False, decoder_inputs are used as given (the standard decoder case).
+    update_embedding_for_previous: Boolean; if False and feed_previous=True,
+      only the embedding for the first symbol of decoder_inputs (the "GO"
+      symbol) will be updated by back propagation. Embeddings for the symbols
+      generated from the decoder itself remain unchanged. This parameter has
+      no effect if feed_previous=False.
dtype: The dtype to use for the RNN initial states (default: tf.float32).
scope: VariableScope for the created subgraph; defaults to
"embedding_attention_decoder".
@ -602,17 +628,9 @@ def embedding_attention_decoder(decoder_inputs, initial_state, attention_states,
with ops.device("/cpu:0"):
embedding = variable_scope.get_variable("embedding",
[num_symbols, cell.input_size])
-    def extract_argmax_and_embed(prev, _):
-      """Loop_function that extracts the symbol from prev and embeds it."""
-      if output_projection is not None:
-        prev = nn_ops.xw_plus_b(
-            prev, output_projection[0], output_projection[1])
-      prev_symbol = array_ops.stop_gradient(math_ops.argmax(prev, 1))
-      emb_prev = embedding_ops.embedding_lookup(embedding, prev_symbol)
-      return emb_prev
-    loop_function = extract_argmax_and_embed if feed_previous else None
+    loop_function = _extract_argmax_and_embed(
+        embedding, output_projection,
+        update_embedding_for_previous) if feed_previous else None
emb_inp = [
embedding_ops.embedding_lookup(embedding, i) for i in decoder_inputs]
return attention_decoder(
@ -700,6 +718,7 @@ def embedding_attention_seq2seq(encoder_inputs, decoder_inputs, cell,
num_decoder_symbols, num_heads=num_heads, output_size=output_size,
output_projection=output_projection,
feed_previous=feed_previous_bool,
+            update_embedding_for_previous=False,
initial_state_attention=initial_state_attention)
return outputs + [state]
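End to end, `feed_previous` is the switch these hunks are threading through: it toggles the decoders between teacher forcing and greedy decoding. A hedged usage sketch against the internal modules this file itself uses (symbol counts, sizes, and sequence lengths are made up, and the public import path may differ by release):

```python
import tensorflow as tf
from tensorflow.python.ops import rnn_cell
from tensorflow.python.ops import seq2seq

cell = rnn_cell.GRUCell(128)
enc_inputs = [tf.placeholder(tf.int32, shape=[None]) for _ in range(10)]
dec_inputs = [tf.placeholder(tf.int32, shape=[None]) for _ in range(12)]

# feed_previous=False: standard training with teacher forcing.
# feed_previous=True (or a bool Tensor): greedy decoding from the previous
# argmax, with update_embedding_for_previous=False applied internally as in
# the hunk above.
outputs, state = seq2seq.embedding_attention_seq2seq(
    enc_inputs, dec_inputs, cell,
    num_encoder_symbols=20000, num_decoder_symbols=20000,
    feed_previous=False)
```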


@ -248,7 +248,7 @@ class _Nulllocker(object):
def Exists(path): # pylint: disable=invalid-name
"""Retruns True iff "path" exists (as a dir, file, non-broken symlink)."""
"""Returns True iff "path" exists (as a dir, file, non-broken symlink)."""
return os.path.exists(path)


@ -50,9 +50,11 @@ def exponential_decay(learning_rate, global_step, decay_steps, decay_rate,
starter_learning_rate = 0.1
learning_rate = tf.train.exponential_decay(starter_learning_rate, global_step,
100000, 0.96, staircase=True)
-  optimizer = tf.GradientDescentOptimizer(learning_rate)
# Passing global_step to minimize() will increment it at each step.
-  optimizer.minimize(...my loss..., global_step=global_step)
+  learning_step = (
+      tf.GradientDescentOptimizer(learning_rate)
+      .minimize(...my loss..., global_step=global_step)
+  )
```
Args:
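For intuition, the schedule in the docstring example computes `learning_rate * decay_rate ^ (global_step / decay_steps)`, with the exponent floored when `staircase=True`. A pure-Python mirror of the formula, for illustration only:

```python
import math

def exponential_decay_value(learning_rate, global_step, decay_steps,
                            decay_rate, staircase=False):
  """Pure-Python mirror of the decayed learning rate."""
  p = global_step / float(decay_steps)
  if staircase:
    p = math.floor(p)  # decay in discrete steps rather than continuously
  return learning_rate * decay_rate ** p

# With the docstring's numbers: 4% decay every 100k steps.
# exponential_decay_value(0.1, 100000, 100000, 0.96, staircase=True) -> 0.096
```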


@ -218,7 +218,7 @@ class ExponentialMovingAverageTest(tf.test.TestCase):
self.assertDeviceEqual("/job:dev_v0", ema.average(v0).device)
self.assertDeviceEqual("/job:dev_v1", ema.average(v1).device)
# However, the colocation property is maintained.
self.assertEqual(["loc:@v1"],
self.assertEqual([b"loc:@v1"],
ema.average(v1).op.colocation_groups())
self.assertDeviceEqual("/job:default", ema.average(tensor2).device)
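The `b` prefix matters because `colocation_groups()` returns bytes (op attributes come back as bytes from the underlying protos), and in Python 3 a str never compares equal to a bytes object:

```python
# Python 3 semantics: str and bytes never compare equal.
assert b"loc:@v1" == b"loc:@v1"
assert ("loc:@v1" == b"loc:@v1") is False  # would be True under Python 2
```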

Some files were not shown because too many files have changed in this diff.