Compare commits

...

21 Commits

Author SHA1 Message Date
breandan
9b69ec3960 Fix broken link to Anaconda installation (#2679) 2016-06-06 13:54:44 -07:00
Vijay Vasudevan
4b7bc3174e Update cuda instructions to be more specific about versions (#2065) 2016-04-22 13:51:21 -07:00
Martin Wicke
dc19800ee1 Merge pull request #2023 from caisq/r0.8-tensorforest-2
R0.8 tensorforest cherry-pick
2016-04-19 13:51:30 -07:00
Jan Prach
ac3c683651 Merge pull request #2026 from caisq/r0.8-final
One more version update for 0.8.0 final
2016-04-19 13:31:57 -07:00
Shanqing Cai
a074dca846 One more version update for 0.8.0 final 2016-04-19 16:01:05 -04:00
gilberth
f7ec1ed5fc Fixes and enhancements for contrib/tensorforest
Change: 120123078
2016-04-19 13:08:21 -04:00
Martin Wicke
44a6b91ce8 Merge pull request #1902 from martinwicke/branch_119712558
Merge internal changes
2016-04-19 13:08:08 -04:00
Martin Wicke
d118d1d31c Merge pull request #2020 from martinwicke/r0.8
Doc changes
2016-04-19 09:01:38 -07:00
Martin Wicke
846e0121e7 Point deepdream tutorial at README instead of ipython 2016-04-19 08:58:19 -07:00
Daniel W Mane
93625a8a79 Include TensorBoard TAG in pip_package (#2009) 2016-04-18 14:00:00 -07:00
Martin Wicke
6408038938 Update RELEASE.md (#1995)
Took out James Wexler (Googler) and added Yuan Tang (not Googler).
2016-04-17 00:17:38 -07:00
Martin Wicke
1933eb61ac Update RELEASE.md
Took out James Wexler (Googler) and added Yuan Tang (not Googler).
2016-04-17 00:17:07 -07:00
Daniel W Mane
9fd4a1dd9e Fix the TensorBoard README.md to point at correct url for demo tensorboard (#1973) 2016-04-16 14:25:02 -07:00
Vijay Vasudevan
7a4e0bf9aa Add a comment to install mentioning cuda and cudnn requirements (#1985) (#1986)
for PIP installs.

(Also cherry-pick anaconda instructions)
2016-04-16 11:38:15 -07:00
Vijay Vasudevan
0e61baf4ea Revert change to r0.8 branch that points to new URLs (#1977)
They haven't been uploaded yet, and so our website is pointing people at the wrong location.
2016-04-15 21:05:47 -07:00
caisq
50cb176ba9 Merge pull request #1966 from mrry/clusterspec_fix
Fixed bug in `tf.train.ClusterSpec` constructor.
2016-04-15 12:00:47 -04:00
caisq
9e1b37e4ce Version bumping for 0.8.0 final release (#1959)
0.8.0rc0 --> 0.8.0
2016-04-15 08:48:13 -07:00
Derek Murray
fb301a9848 Fixed bug in tf.train.ClusterSpec constructor.
Creating a `tf.train.ClusterSpec` from another ClusterSpec was broken,
which in turn broke creating a `tf.train.Server` from a ClusterSpec.

Fixes #1961.
Change: 119954117
2016-04-15 07:45:18 -07:00
caisq
35cd6a3011 Merge pull request #1946 from caisq/r0.8-tensorboard
Fix TensorBoard lib/css dependency (#1926)
2016-04-14 11:33:39 -04:00
Daniel W Mane
09d7d91ef0 Fix TensorBoard lib/css dependency (#1926) 2016-04-14 10:59:08 -04:00
Derek Murray
6bda8aa907 Applying editorial changes to the distributed how-to. (#1920)
Change: 119605636
2016-04-13 15:18:51 -07:00
50 changed files with 1412 additions and 555 deletions

View File

@@ -31,9 +31,9 @@ and discussion.**
People who are a little bit adventurous can also try our nightly binaries:
* Linux CPU only: [Python 2](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.8.0rc0-cp27-none-linux_x86_64.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/)) / [Python 3](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.8.0rc0-cp34-cp34m-linux_x86_64.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=cpu-slave/))
* Linux GPU: [Python 2](http://ci.tensorflow.org/view/Nightly/job/nigntly-matrix-linux-gpu/TF_BUILD_CONTAINER_TYPE=GPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=gpu-working/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.8.0rc0-cp27-none-linux_x86_64.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nigntly-matrix-linux-gpu/TF_BUILD_CONTAINER_TYPE=GPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=gpu-working/)) / [Python 3](http://ci.tensorflow.org/view/Nightly/job/nigntly-matrix-linux-gpu/TF_BUILD_CONTAINER_TYPE=GPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=gpu-working/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.8.0rc0-cp34-cp34m-linux_x86_64.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nigntly-matrix-linux-gpu/TF_BUILD_CONTAINER_TYPE=GPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=gpu-working/))
* Mac CPU only: [Python 2](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=mac-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.8.0rc0-py2-none-any.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=mac-slave/)) / [Python 3](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=mac-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.8.0rc0-py3-none-any.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=mac-slave/))
* Linux CPU only: [Python 2](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.8.0-cp27-none-linux_x86_64.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/)) / [Python 3](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.8.0-cp34-cp34m-linux_x86_64.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=cpu-slave/))
* Linux GPU: [Python 2](http://ci.tensorflow.org/view/Nightly/job/nigntly-matrix-linux-gpu/TF_BUILD_CONTAINER_TYPE=GPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=gpu-working/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.8.0-cp27-none-linux_x86_64.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nigntly-matrix-linux-gpu/TF_BUILD_CONTAINER_TYPE=GPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=gpu-working/)) / [Python 3](http://ci.tensorflow.org/view/Nightly/job/nigntly-matrix-linux-gpu/TF_BUILD_CONTAINER_TYPE=GPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=gpu-working/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.8.0-cp34-cp34m-linux_x86_64.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nigntly-matrix-linux-gpu/TF_BUILD_CONTAINER_TYPE=GPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=gpu-working/))
* Mac CPU only: [Python 2](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=mac-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.8.0-py2-none-any.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=mac-slave/)) / [Python 3](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=mac-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.8.0-py3-none-any.whl) ([build history](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON3,label=mac-slave/))
* [Android](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-android/TF_BUILD_CONTAINER_TYPE=ANDROID,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=NO_PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=android-slave/lastSuccessfulBuild/artifact/bazel-out/local_linux/bin/tensorflow/examples/android/tensorflow_demo.apk) ([build history](http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-android/TF_BUILD_CONTAINER_TYPE=ANDROID,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=NO_PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=android-slave/))
#### *Try your first TensorFlow program*

View File

@@ -35,7 +35,7 @@
This release contains contributions from many people at Google, as well as:
Abhinav Upadhyay, Aggelos Avgerinos, Alan Wu, Alexander G. de G. Matthews, Aleksandr Yahnev, @amchercashin, Andy Kitchen, Aurelien Geron, Awni Hannun, @BanditCat, Bas Veeling, Cameron Chen, @cg31, Cheng-Lung Sung, Christopher Bonnett, Dan Becker, Dan Van Boxel, Daniel Golden, Danijar Hafner, Danny Goodman, Dave Decker, David Dao, David Kretch, Dongjoon Hyun, Dustin Dorroh, @e-lin, Eurico Doirado, Erik Erwitt, Fabrizio Milo, @gaohuazuo, Iblis Lin, Igor Babuschkin, Isaac Hodes, Isaac Turner, Iván Vallés, J Yegerlehner, Jack Zhang, James Wexler, Jan Zikes, Jay Young, Jeff Hodges, @jmtatsch, Johnny Lim, Jonas Meinertz Hansen, Kanit Wongsuphasawat, Kashif Rasul, Ken Shirriff, Kenneth Mitchner, Kenta Yonekura, Konrad Magnusson, Konstantin Lopuhin, @lahwran, @lekaha, @liyongsea, Lucas Adams, @makseq, Mandeep Singh, @manipopopo, Mark Amery, Memo Akten, Michael Heilman, Michael Peteuil, Nathan Daly, Nicolas Fauchereau, @ninotoshi, Olav Nymoen, @panmari, @papelita1234, Pedro Lopes, Pranav Sailesh Mani, RJ Ryan, Rob Culliton, Robert DiPietro, @ronrest, Sam Abrahams, Sarath Shekkizhar, Scott Graham, Sebastian Raschka, Sung Kim, Surya Bhupatiraju, Syed Ahmed, Till Hoffmann, @timsl, @urimend, @vesnica, Vlad Frolov, Vlad Zagorodniy, Wei-Ting Kuo, Wenjian Huang, William Dmitri Breaden Madden, Wladimir Schmidt, Yuwen Yan, Yuxin Wu, Yuya Kusakabe, @zhongzyd, @znah.
Abhinav Upadhyay, Aggelos Avgerinos, Alan Wu, Alexander G. de G. Matthews, Aleksandr Yahnev, @amchercashin, Andy Kitchen, Aurelien Geron, Awni Hannun, @BanditCat, Bas Veeling, Cameron Chen, @cg31, Cheng-Lung Sung, Christopher Bonnett, Dan Becker, Dan Van Boxel, Daniel Golden, Danijar Hafner, Danny Goodman, Dave Decker, David Dao, David Kretch, Dongjoon Hyun, Dustin Dorroh, @e-lin, Eurico Doirado, Erik Erwitt, Fabrizio Milo, @gaohuazuo, Iblis Lin, Igor Babuschkin, Isaac Hodes, Isaac Turner, Iván Vallés, J Yegerlehner, Jack Zhang, Jan Zikes, Jay Young, Jeff Hodges, @jmtatsch, Johnny Lim, Jonas Meinertz Hansen, Kanit Wongsuphasawat, Kashif Rasul, Ken Shirriff, Kenneth Mitchner, Kenta Yonekura, Konrad Magnusson, Konstantin Lopuhin, @lahwran, @lekaha, @liyongsea, Lucas Adams, @makseq, Mandeep Singh, @manipopopo, Mark Amery, Memo Akten, Michael Heilman, Michael Peteuil, Nathan Daly, Nicolas Fauchereau, @ninotoshi, Olav Nymoen, @panmari, @papelita1234, Pedro Lopes, Pranav Sailesh Mani, RJ Ryan, Rob Culliton, Robert DiPietro, @ronrest, Sam Abrahams, Sarath Shekkizhar, Scott Graham, Sebastian Raschka, Sung Kim, Surya Bhupatiraju, Syed Ahmed, Till Hoffmann, @timsl, @urimend, @vesnica, Vlad Frolov, Vlad Zagorodniy, Wei-Ting Kuo, Wenjian Huang, William Dmitri Breaden Madden, Wladimir Schmidt, Yuan Tang, Yuwen Yan, Yuxin Wu, Yuya Kusakabe, @zhongzyd, @znah.
We are also grateful to all who filed issues or helped resolve them, asked and
answered questions, and were part of inspiring discussions.

View File

@@ -18,6 +18,7 @@
// only op that involves tree traversal, and is constructed so that it can
// be run in parallel on separate batches of data.
#include <unordered_map>
#include <vector>
#include "tensorflow/contrib/tensor_forest/core/ops/tree_utils.h"
@@ -25,10 +26,12 @@
#include "tensorflow/core/framework/op_kernel.h"
#include "tensorflow/core/lib/gtl/map_util.h"
#include "tensorflow/core/util/work_sharder.h"
namespace tensorflow {
using std::get;
using std::make_pair;
using std::make_tuple;
using std::pair;
using std::tuple;
@@ -42,6 +45,71 @@ using tensorforest::DecideNode;
using tensorforest::Initialize;
using tensorforest::IsAllInitialized;
// A data structure to store the results of parallel tree traversal.
struct InputDataResult {
// A list of each node that was visited.
std::vector<int32> node_indices;
// The accumulator of the leaf that a data point ended up at, or -1 if none.
int32 leaf_accumulator;
// The left-branch taken candidate splits.
std::vector<int32> split_adds;
// Whether the candidate splits for the leaf that a data point arrived at
// were initialized, which determines whether we add this to the total
// pcw counts.
bool splits_initialized;
};
void Evaluate(const Tensor& input_data, const Tensor& input_labels,
const Tensor& tree_tensor, const Tensor& tree_thresholds,
const Tensor& node_to_accumulator,
const Tensor& candidate_split_features,
const Tensor& candidate_split_thresholds,
InputDataResult* results, int64 start, int64 end) {
const auto tree = tree_tensor.tensor<int32, 2>();
const auto thresholds = tree_thresholds.unaligned_flat<float>();
const auto node_map = node_to_accumulator.unaligned_flat<int32>();
const auto split_features = candidate_split_features.tensor<int32, 2>();
const auto split_thresholds = candidate_split_thresholds.tensor<float, 2>();
const int32 num_splits = candidate_split_features.shape().dim_size(1);
for (int i = start; i < end; ++i) {
const Tensor point = input_data.Slice(i, i + 1);
int node_index = 0;
results[i].splits_initialized = false;
while (true) {
results[i].node_indices.push_back(node_index);
int32 left_child = tree(node_index, CHILDREN_INDEX);
if (left_child == LEAF_NODE) {
const int32 accumulator = node_map(node_index);
results[i].leaf_accumulator = accumulator;
// If the leaf is not fertile or is not yet initialized, we don't
// count it in the candidate/total split per-class-weights because
// it won't have any candidate splits yet.
if (accumulator >= 0 &&
IsAllInitialized(candidate_split_features.Slice(
accumulator, accumulator + 1))) {
results[i].splits_initialized = true;
for (int split = 0; split < num_splits; split++) {
if (!DecideNode(point, split_features(accumulator, split),
split_thresholds(accumulator, split))) {
results[i].split_adds.push_back(split);
}
}
}
break;
} else if (left_child == FREE_NODE) {
LOG(ERROR) << "Reached a free node, not good.";
results[i].node_indices.push_back(FREE_NODE);
break;
}
node_index =
left_child + DecideNode(point, tree(node_index, FEATURE_INDEX),
thresholds(node_index));
}
}
}
REGISTER_OP("CountExtremelyRandomStats")
.Attr("num_classes: int32")
.Input("input_data: float")
@@ -79,9 +147,9 @@ REGISTER_OP("CountExtremelyRandomStats")
gives the j-th feature of the i-th input.
input_labels: The training batch's labels; `input_labels[i]` is the class
of the i-th input.
tree:= A 2-d int32 tensor. `tree[0][i]` gives the index of the left child
of the i-th node, `tree[0][i] + 1` gives the index of the right child of
the i-th node, and `tree[1][i]` gives the index of the feature used to
tree:= A 2-d int32 tensor. `tree[i][0]` gives the index of the left child
of the i-th node, `tree[i][0] + 1` gives the index of the right child of
the i-th node, and `tree[i][1]` gives the index of the feature used to
split the i-th node.
tree_thresholds: `tree_thresholds[i]` is the value used to split the i-th
node.
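The index fix in this hunk is easier to follow with the layout written out: row i of `tree` holds `[left_child, feature]`, the right child sits implicitly at `left_child + 1`, and `tree_thresholds[i]` carries the split value. A minimal sketch of walking one example down such a tree (not code from this commit; `LEAF_NODE = -1` comes from tensor_forest.py, and the `>` comparison is an assumption standing in for `DecideNode`):

```python
LEAF_NODE = -1  # tensor_forest.py: tree[i][0] == -1 marks node i as a leaf

def find_leaf(point, tree, tree_thresholds):
    """Walk one example down a tree stored in the layout documented above."""
    node = 0
    while True:
        left_child = tree[node][0]
        if left_child == LEAF_NODE:
            return node
        feature = tree[node][1]
        # The right child sits at left_child + 1, so adding the branch
        # decision (0 = go left, 1 = go right) selects the next node.
        node = left_child + int(point[feature] > tree_thresholds[node])
```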
@@ -176,7 +244,31 @@ class CountExtremelyRandomStats : public OpKernel {
"candidate_split_features and candidate_split_thresholds should be "
"the same shape."));
const int32 num_splits = candidate_split_features.shape().dim_size(1);
// Evaluate input data in parallel.
const int64 num_data = input_data.shape().dim_size(0);
std::unique_ptr<InputDataResult[]> results(new InputDataResult[num_data]);
auto worker_threads = context->device()->tensorflow_cpu_worker_threads();
int num_threads = worker_threads->num_threads;
if (num_threads <= 1) {
Evaluate(input_data, input_labels, tree_tensor, tree_thresholds,
node_to_accumulator, candidate_split_features,
candidate_split_thresholds, results.get(), 0, num_data);
} else {
auto work = [&input_data, &input_labels, &tree_tensor, &tree_thresholds,
&node_to_accumulator, &candidate_split_features,
&candidate_split_thresholds, &num_data,
&results](int64 start, int64 end) {
CHECK(start <= end);
CHECK(end <= num_data);
Evaluate(input_data, input_labels, tree_tensor, tree_thresholds,
node_to_accumulator, candidate_split_features,
candidate_split_thresholds, results.get(), start, end);
};
Shard(num_threads, worker_threads->workers, num_data, 100, work);
}
// Set output tensors.
const auto labels = input_labels.unaligned_flat<int32>();
// node pcw delta
Tensor* output_node_pcw_delta = nullptr;
@@ -196,58 +288,28 @@ class CountExtremelyRandomStats : public OpKernel {
&output_leaves));
auto out_leaves = output_leaves->unaligned_flat<int32>();
const auto tree = tree_tensor.tensor<int32, 2>();
const auto thresholds = tree_thresholds.unaligned_flat<float>();
const auto labels = input_labels.unaligned_flat<int32>();
const auto node_map = node_to_accumulator.unaligned_flat<int32>();
const auto split_features = candidate_split_features.tensor<int32, 2>();
const auto split_thresholds = candidate_split_thresholds.tensor<float, 2>();
const int32 num_data = input_data.shape().dim_size(0);
// <accumulator, class> -> count delta
std::unordered_map<pair<int32, int32>, int32, PairIntHash> total_delta;
// <accumulator, split, class> -> count delta
std::unordered_map<tuple<int32, int32, int32>,
int32, TupleIntHash> split_delta;
for (int i = 0; i < num_data; i++) {
const Tensor point = input_data.Slice(i, i+1);
int node_index = 0;
while (true) {
const int32 label = labels(i);
++out_node(node_index, label);
int32 left_child = tree(node_index, CHILDREN_INDEX);
if (left_child == LEAF_NODE) {
out_leaves(i) = node_index;
const int32 accumulator = node_map(node_index);
// If the leaf is not fertile or is not yet initialized, we don't
// count it in the candidate/total split per-class-weights because
// it won't have any candidate splits yet.
if (accumulator >= 0 &&
IsAllInitialized(
candidate_split_features.Slice(accumulator,
accumulator + 1))) {
++total_delta[std::make_pair(accumulator, label)];
for (int split = 0; split < num_splits; split++) {
if (!DecideNode(point, split_features(accumulator, split),
split_thresholds(accumulator, split))) {
++split_delta[make_tuple(accumulator, split, label)];
}
}
}
break;
} else if (left_child == FREE_NODE) {
LOG(ERROR) << "Reached a free node, not good.";
out_leaves(i) = FREE_NODE;
break;
for (int32 i = 0; i < num_data; ++i) {
const int32 label = labels(i);
const int32 accumulator = results[i].leaf_accumulator;
for (const int32 node : results[i].node_indices) {
++out_node(node, label);
}
out_leaves(i) = results[i].node_indices.back();
if (accumulator >= 0 && results[i].splits_initialized) {
++total_delta[make_pair(accumulator, label)];
for (const int32 split : results[i].split_adds) {
++split_delta[make_tuple(accumulator, split, label)];
}
node_index = left_child +
DecideNode(point, tree(node_index, FEATURE_INDEX),
thresholds(node_index));
}
}
// candidate splits pcw indices
// candidate splits pcw indices
Tensor* output_candidate_pcw_indices = nullptr;
TensorShape candidate_pcw_shape;
candidate_pcw_shape.AddDim(split_delta.size());
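For reference, the restructuring in this file follows a common two-phase pattern: a parallel, lock-free traversal phase that writes into preallocated per-example slots (what `Shard` does over `Evaluate` above), followed by a serial aggregation phase over those slots. A rough Python sketch of the same idea, with `traverse` as a hypothetical per-example function:

```python
from concurrent.futures import ThreadPoolExecutor

def count_stats(examples, traverse, num_threads=4):
    n = len(examples)
    results = [None] * n  # each worker writes a disjoint slice: no locking
    if num_threads <= 1:
        for i in range(n):
            results[i] = traverse(examples[i])
    else:
        def work(start, end):
            for i in range(start, end):
                results[i] = traverse(examples[i])
        step = (n + num_threads - 1) // num_threads
        with ThreadPoolExecutor(num_threads) as pool:
            futures = [pool.submit(work, s, min(s + step, n))
                       for s in range(0, n, step)]
            for f in futures:
                f.result()  # surface any worker exception
    # Serial phase: aggregate per-example results into the output counts,
    # mirroring the loop after "Set output tensors" above.
    return results
```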

View File

@@ -94,7 +94,7 @@ class SampleInputs : public OpKernel {
"split_sampling_random_seed", &split_sampling_random_seed_));
// Set up the random number generator.
if (split_sampling_random_seed_ == 0) {
uint64 time_seed = static_cast<uint64>(std::time(NULL));
uint64 time_seed = static_cast<uint64>(std::clock());
single_rand_ = std::unique_ptr<random::PhiloxRandom>(
new random::PhiloxRandom(time_seed));
} else {

View File

@@ -44,9 +44,9 @@ REGISTER_OP("TreePredictions")
input_data: The training batch's features as a 2-d tensor; `input_data[i][j]`
gives the j-th feature of the i-th input.
tree:= A 2-d int32 tensor. `tree[0][i]` gives the index of the left child
of the i-th node, `tree[0][i] + 1` gives the index of the right child of
the i-th node, and `tree[1][i]` gives the index of the feature used to
tree:= A 2-d int32 tensor. `tree[i][0]` gives the index of the left child
of the i-th node, `tree[i][0] + 1` gives the index of the right child of
the i-th node, and `tree[i][1]` gives the index of the feature used to
split the i-th node.
tree_thresholds: `tree_thresholds[i]` is the value used to split the i-th
node.

View File

@@ -17,7 +17,7 @@ from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow # pylint: disable=unused-import
import tensorflow as tf
from tensorflow.contrib.tensor_forest.python.ops import training_ops
@@ -47,6 +47,29 @@ class CountExtremelyRandomStatsTest(test_util.TensorFlowTestCase):
self.tree_thresholds, self.node_map,
self.split_features, self.split_thresholds, num_classes=4))
self.assertAllEqual(
[[1., 1., 1., 1.], [1., 1., 0., 0.], [0., 0., 1., 1.]],
pcw_node.eval())
self.assertAllEqual([[0, 0, 0]], pcw_splits_indices.eval())
self.assertAllEqual([1.], pcw_splits_delta.eval())
self.assertAllEqual([[0, 1], [0, 0]], pcw_totals_indices.eval())
self.assertAllEqual([1., 1.], pcw_totals_delta.eval())
self.assertAllEqual([1, 1, 2, 2], leaves.eval())
def testThreaded(self):
with self.test_session(
config=tf.ConfigProto(intra_op_parallelism_threads=2)):
(pcw_node, pcw_splits_indices, pcw_splits_delta, pcw_totals_indices,
pcw_totals_delta,
leaves) = (self.ops.count_extremely_random_stats(self.input_data,
self.input_labels,
self.tree,
self.tree_thresholds,
self.node_map,
self.split_features,
self.split_thresholds,
num_classes=4))
self.assertAllEqual([[1., 1., 1., 1.], [1., 1., 0., 0.],
[0., 0., 1., 1.]],
pcw_node.eval())

View File

@@ -49,12 +49,13 @@ def TreePredictions(op):
# there's not yet any guarantee that the shared object exists.
# In which case, "import tensorflow" will always crash, even for users that
# never use contrib.
def Load():
def Load(library_base_dir=''):
"""Load the inference ops library and return the loaded module."""
with _ops_lock:
global _inference_ops
if not _inference_ops:
data_files_path = tf.resource_loader.get_data_files_path()
data_files_path = os.path.join(library_base_dir,
tf.resource_loader.get_data_files_path())
tf.logging.info('data path: %s', data_files_path)
_inference_ops = tf.load_op_library(os.path.join(
data_files_path, INFERENCE_OPS_FILE))

View File

@@ -25,6 +25,7 @@ import tensorflow as tf
from tensorflow.python.framework import ops
from tensorflow.python.framework import tensor_shape
TRAINING_OPS_FILE = '_training_ops.so'
_training_ops = None
@@ -96,12 +97,13 @@ def _UpdateFertileSlotsShape(unused_op):
# there's not yet any guarantee that the shared object exists.
# In which case, "import tensorflow" will always crash, even for users that
# never use contrib.
def Load():
def Load(library_base_dir=''):
"""Load training ops library and return the loaded module."""
with _ops_lock:
global _training_ops
if not _training_ops:
data_files_path = tf.resource_loader.get_data_files_path()
data_files_path = os.path.join(library_base_dir,
tf.resource_loader.get_data_files_path())
tf.logging.info('data path: %s', data_files_path)
_training_ops = tf.load_op_library(os.path.join(
data_files_path, TRAINING_OPS_FILE))

View File

@@ -25,19 +25,6 @@ from tensorflow.contrib.tensor_forest.python.ops import inference_ops
from tensorflow.contrib.tensor_forest.python.ops import training_ops
flags = tf.app.flags
FLAGS = flags.FLAGS
# Default parameter values. These are all only used if the corresponding
# parameter is not specified when constructing the ForestHParams.
flags.DEFINE_integer('num_trees', 100, 'Number of trees in forest')
flags.DEFINE_integer('max_nodes', 10000, 'Maximum number of tree nodes.')
flags.DEFINE_float(
'samples_to_decide', 25.0,
'Only decide on a split, or only fully use a leaf, after this many '
'training samples have been seen.')
# If tree[i][0] equals this value, then i is a leaf node.
LEAF_NODE = -1
@@ -57,7 +44,20 @@ LEAF_NODE = -1
class ForestHParams(object):
"""A base class for holding hyperparameters and calculating good defaults."""
def __init__(self, **kwargs):
def __init__(self, num_trees=100, max_nodes=10000, bagging_fraction=1.0,
samples_to_decide=25, max_depth=0, num_splits_to_consider=0,
max_fertile_nodes=0, split_after_samples=0,
valid_leaf_threshold=0, **kwargs):
self.num_trees = num_trees
self.max_nodes = max_nodes
self.bagging_fraction = bagging_fraction
self.samples_to_decide = samples_to_decide
self.max_depth = max_depth
self.num_splits_to_consider = num_splits_to_consider
self.max_fertile_nodes = max_fertile_nodes
self.split_after_samples = split_after_samples
self.valid_leaf_threshold = valid_leaf_threshold
for name, value in kwargs.items():
setattr(self, name, value)
@@ -69,19 +69,21 @@ class ForestHParams(object):
# Fail fast if num_classes isn't set.
_ = getattr(self, 'num_classes')
self.num_trees = getattr(self, 'num_trees', FLAGS.num_trees)
self.max_nodes = getattr(self, 'max_nodes', FLAGS.max_nodes)
self.training_library_base_dir = getattr(
self, 'training_library_base_dir', '')
self.inference_library_base_dir = getattr(
self, 'inference_library_base_dir', '')
# Allow each tree to be unbalanced by up to a factor of 2.
self.max_depth = getattr(self, 'max_depth',
int(2 * math.ceil(math.log(self.max_nodes, 2))))
self.max_depth = (self.max_depth or
int(2 * math.ceil(math.log(self.max_nodes, 2))))
# The Random Forest literature recommends sqrt(# features) for
# classification problems, and p/3 for regression problems.
# TODO(thomaswc): Consider capping this for large number of features.
if not getattr(self, 'num_splits_to_consider', None):
self.num_splits_to_consider = max(10, int(
math.ceil(math.sqrt(self.num_features))))
self.num_splits_to_consider = (
self.num_splits_to_consider or
max(10, int(math.ceil(math.sqrt(self.num_features)))))
# max_fertile_nodes doesn't affect performance, only training speed.
# We therefore set it primarily based upon space considerations.
@@ -91,22 +93,19 @@ class ForestHParams(object):
num_fertile = int(math.ceil(self.max_nodes / self.num_splits_to_consider))
# But always use at least 1000 accumulate slots.
num_fertile = max(num_fertile, 1000)
self.max_fertile_nodes = getattr(self, 'max_fertile_nodes', num_fertile)
self.max_fertile_nodes = self.max_fertile_nodes or num_fertile
# But it also never needs to be larger than the number of leaves,
# which is max_nodes / 2.
self.max_fertile_nodes = min(self.max_nodes,
int(math.ceil(self.max_fertile_nodes / 2.0)))
self.max_fertile_nodes = min(self.max_fertile_nodes,
int(math.ceil(self.max_nodes / 2.0)))
# split_after_samples and valid_leaf_threshold should be about the same.
# Therefore, if either is set, use it to set the other. Otherwise, fall
# back on FLAGS.samples_to_decide.
samples_to_decide = (
getattr(self, 'split_after_samples',
getattr(self, 'valid_leaf_threshold', FLAGS.samples_to_decide)))
self.split_after_samples = getattr(self, 'split_after_samples',
samples_to_decide)
self.valid_leaf_threshold = getattr(self, 'valid_leaf_threshold',
samples_to_decide)
# back on samples_to_decide.
samples_to_decide = self.split_after_samples or self.samples_to_decide
self.split_after_samples = self.split_after_samples or samples_to_decide
self.valid_leaf_threshold = self.valid_leaf_threshold or samples_to_decide
# We have num_splits_to_consider slots to fill, and we want to spend
# approximately split_after_samples samples initializing them.
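Taken together, the `fill()` arithmetic above pins down the defaults that tensor_forest_test.py (further down) asserts. A standalone sketch of the same calculations, using the values from `testForestHParams` (`max_nodes=1000`, `num_features=60`):

```python
import math

max_nodes, num_features = 1000, 60

# Allow each tree to be unbalanced by up to a factor of 2.
max_depth = int(2 * math.ceil(math.log(max_nodes, 2)))                     # 20
# sqrt(# features) per the Random Forest literature, floored at 10.
num_splits_to_consider = max(10, int(math.ceil(math.sqrt(num_features))))  # 10
# At least 1000 accumulator slots, but never more than half the nodes.
num_fertile = max(int(math.ceil(max_nodes / num_splits_to_consider)), 1000)
max_fertile_nodes = min(num_fertile, int(math.ceil(max_nodes / 2.0)))      # 500
```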
@@ -184,23 +183,6 @@ class TreeStats(object):
self.num_leaves = num_leaves
def get_tree_stats(variables, unused_params, session):
num_nodes = variables.end_of_tree.eval(session=session) - 1
num_leaves = tf.where(
tf.equal(tf.squeeze(tf.slice(variables.tree, [0, 0], [-1, 1])),
LEAF_NODE)).eval(session=session).shape[0]
return TreeStats(num_nodes, num_leaves)
def get_forest_stats(variables, params, session):
tree_stats = []
for i in range(params.num_trees):
tree_stats.append(get_tree_stats(variables[i], params, session))
return ForestStats(tree_stats, params)
class ForestTrainingVariables(object):
"""A container for a forests training data, consisting of multiple trees.
@@ -212,9 +194,11 @@ class ForestTrainingVariables(object):
... forest_variables.tree ...
"""
def __init__(self, params):
self.variables = [TreeTrainingVariables(params)
for _ in range(params.num_trees)]
def __init__(self, params, device_assigner):
self.variables = []
for i in range(params.num_trees):
with tf.device(device_assigner.get_device(i)):
self.variables.append(TreeTrainingVariables(params))
def __setitem__(self, t, val):
self.variables[t] = val
@@ -223,15 +207,41 @@
return self.variables[t]
class RandomForestDeviceAssigner(object):
"""A device assigner that uses the default device.
Write subclasses that implement get_device for control over how trees
get assigned to devices. This assumes that whole trees are assigned
to a device.
"""
def __init__(self):
self.cached = None
def get_device(self, unused_tree_num):
if not self.cached:
dummy = tf.constant(0)
self.cached = dummy.device
return self.cached
class RandomForestGraphs(object):
"""Builds TF graphs for random forest training and inference."""
def __init__(self, params):
def __init__(self, params, device_assigner=None, variables=None):
self.params = params
self.variables = ForestTrainingVariables(self.params)
self.trees = [RandomTreeGraphs(self.variables[i], self.params,
training_ops.Load(), inference_ops.Load())
for i in range(self.params.num_trees)]
self.device_assigner = device_assigner or RandomForestDeviceAssigner()
tf.logging.info('Constructing forest with params = ')
tf.logging.info(self.params.__dict__)
self.variables = variables or ForestTrainingVariables(
self.params, device_assigner=self.device_assigner)
self.trees = [
RandomTreeGraphs(
self.variables[i], self.params,
training_ops.Load(self.params.training_library_base_dir),
inference_ops.Load(self.params.inference_library_base_dir))
for i in range(self.params.num_trees)]
def training_graph(self, input_data, input_labels):
"""Constructs a TF graph for training a random forest.
@@ -246,12 +256,26 @@ class RandomForestGraphs(object):
"""
tree_graphs = []
for i in range(self.params.num_trees):
tf.logging.info('Constructing tree %d', i)
seed = self.params.base_random_seed
if seed != 0:
seed += i
tree_graphs.append(self.trees[i].training_graph(
input_data, input_labels, seed))
with tf.device(self.device_assigner.get_device(i)):
seed = self.params.base_random_seed
if seed != 0:
seed += i
# If using bagging, randomly select some of the input.
tree_data = input_data
tree_labels = input_labels
if self.params.bagging_fraction < 1.0:
# TODO(thomaswc): This does sampling without replacement. Consider
# also allowing sampling with replacement as an option.
batch_size = tf.slice(tf.shape(input_data), [0], [1])
r = tf.random_uniform(batch_size, seed=seed)
mask = tf.less(r, tf.ones_like(r) * self.params.bagging_fraction)
gather_indices = tf.squeeze(tf.where(mask), squeeze_dims=[1])
# TODO(thomaswc): Calculate out-of-bag data and labels, and store
# them for use in calculating statistics later.
tree_data = tf.gather(input_data, gather_indices)
tree_labels = tf.gather(input_labels, gather_indices)
tree_graphs.append(
self.trees[i].training_graph(tree_data, tree_labels, seed))
return tf.group(*tree_graphs)
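The bagging branch above keeps each input row independently with probability `bagging_fraction`, i.e. sampling without replacement, as the TODO notes. A numpy analogue of the `tf.random_uniform`/`tf.less`/`tf.where`/`tf.gather` chain:

```python
import numpy as np

def bag(data, labels, bagging_fraction, rng=np.random):
    # One independent keep/drop decision per example; yields an
    # approximately bagging_fraction-sized subset with no repeats.
    mask = rng.uniform(size=len(data)) < bagging_fraction
    return data[mask], labels[mask]  # data, labels: numpy arrays
```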
def inference_graph(self, input_data):
@@ -265,9 +289,23 @@
"""
probabilities = []
for i in range(self.params.num_trees):
probabilities.append(self.trees[i].inference_graph(input_data))
all_predict = tf.pack(probabilities)
return tf.reduce_sum(all_predict, 0) / self.params.num_trees
with tf.device(self.device_assigner.get_device(i)):
probabilities.append(self.trees[i].inference_graph(input_data))
with tf.device(self.device_assigner.get_device(0)):
all_predict = tf.pack(probabilities)
return tf.reduce_sum(all_predict, 0) / self.params.num_trees
def average_size(self):
"""Constructs a TF graph for evaluating the average size of a forest.
Returns:
The average number of nodes over the trees.
"""
sizes = []
for i in range(self.params.num_trees):
with tf.device(self.device_assigner.get_device(i)):
sizes.append(self.trees[i].size())
return tf.reduce_mean(tf.pack(sizes))
def average_impurity(self):
"""Constructs a TF graph for evaluating the leaf impurity of a forest.
@@ -277,9 +315,17 @@
"""
impurities = []
for i in range(self.params.num_trees):
impurities.append(self.trees[i].average_impurity(self.variables[i]))
with tf.device(self.device_assigner.get_device(i)):
impurities.append(self.trees[i].average_impurity())
return tf.reduce_mean(tf.pack(impurities))
def get_stats(self, session):
tree_stats = []
for i in range(self.params.num_trees):
with tf.device(self.device_assigner.get_device(i)):
tree_stats.append(self.trees[i].get_stats(session))
return ForestStats(tree_stats, self.params)
class RandomTreeGraphs(object):
"""Builds TF graphs for random tree training and inference."""
@@ -394,6 +440,7 @@ class RandomTreeGraphs(object):
with tf.control_dependencies([node_update_op]):
def f1():
return self.variables.non_fertile_leaf_scores
def f2():
counts = tf.gather(self.variables.node_per_class_weights,
self.variables.non_fertile_leaves)
@@ -535,3 +582,18 @@ class RandomTreeGraphs(object):
counts = tf.gather(self.variables.node_per_class_weights, leaves)
impurity = self._weighted_gini(counts)
return tf.reduce_sum(impurity) / tf.reduce_sum(counts + 1.0)
def size(self):
"""Constructs a TF graph for evaluating the current number of nodes.
Returns:
The current number of nodes in the tree.
"""
return self.variables.end_of_tree - 1
def get_stats(self, session):
num_nodes = self.variables.end_of_tree.eval(session=session) - 1
num_leaves = tf.where(
tf.equal(tf.squeeze(tf.slice(self.variables.tree, [0, 0], [-1, 1])),
LEAF_NODE)).eval(session=session).shape[0]
return TreeStats(num_nodes, num_leaves)
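`RandomForestDeviceAssigner` above is explicitly meant to be subclassed ("Write subclasses that implement get_device for control over how trees get assigned to devices"). A hypothetical round-robin assigner, spreading whole trees across a fixed device list, could look like:

```python
class RoundRobinDeviceAssigner(object):
    """Hypothetical assigner: tree i goes to devices[i % len(devices)]."""

    def __init__(self, devices):
        self.devices = devices  # e.g. ['/cpu:0', '/gpu:0', '/gpu:1']

    def get_device(self, tree_num):
        return self.devices[tree_num % len(self.devices)]
```

Passed as `device_assigner` to `RandomForestGraphs`, each tree's variables and per-tree graphs would then be placed by the `tf.device` calls shown above.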

View File

@@ -27,6 +27,37 @@ from tensorflow.python.platform import googletest
class TensorForestTest(test_util.TensorFlowTestCase):
def testForestHParams(self):
hparams = tensor_forest.ForestHParams(
num_classes=2, num_trees=100, max_nodes=1000,
num_features=60).fill()
self.assertEquals(2, hparams.num_classes)
# 2 * ceil(log_2(1000)) = 20
self.assertEquals(20, hparams.max_depth)
# sqrt(num_features) < 10, so num_splits_to_consider should be 10.
self.assertEquals(10, hparams.num_splits_to_consider)
# Don't have more fertile nodes than max # leaves, which is 500.
self.assertEquals(500, hparams.max_fertile_nodes)
# We didn't set either of these, so they should be equal
self.assertEquals(hparams.split_after_samples,
hparams.valid_leaf_threshold)
# split_after_samples is larger than 10
self.assertEquals(1, hparams.split_initializations_per_input)
self.assertEquals(0, hparams.base_random_seed)
def testForestHParamsBigTree(self):
hparams = tensor_forest.ForestHParams(
num_classes=2, num_trees=100, max_nodes=1000000,
split_after_samples=25,
num_features=1000).fill()
self.assertEquals(40, hparams.max_depth)
# sqrt(1000) = 31.63...
self.assertEquals(32, hparams.num_splits_to_consider)
# 1000000 / 32 = 31250
self.assertEquals(31250, hparams.max_fertile_nodes)
# floor(31.63 / 25) = 1
self.assertEquals(1, hparams.split_initializations_per_input)
def testTrainingConstruction(self):
input_data = [[-1., 0.], [-1., 2.], # node 1
[1., 0.], [1., -2.]] # node 2
@@ -50,6 +81,14 @@ class TensorForestTest(test_util.TensorFlowTestCase):
graph = graph_builder.inference_graph(input_data)
self.assertTrue(isinstance(graph, tf.Tensor))
def testImpurityConstruction(self):
params = tensor_forest.ForestHParams(
num_classes=4, num_features=2, num_trees=10, max_nodes=1000).fill()
graph_builder = tensor_forest.RandomForestGraphs(params)
graph = graph_builder.average_impurity()
self.assertTrue(isinstance(graph, tf.Tensor))
if __name__ == '__main__':
googletest.main()

View File

@@ -143,7 +143,6 @@ cc_library(
"lib/core/bits.h",
"lib/core/casts.h",
"lib/core/coding.h",
"lib/core/command_line_flags.h", # TODO(vrv): Delete.
"lib/core/errors.h",
"lib/core/notification.h",
"lib/core/status.h",

View File

@@ -35,6 +35,7 @@ limitations under the License.
#include "tensorflow/core/framework/types.h"
#include "tensorflow/core/lib/core/coding.h"
#include "tensorflow/core/lib/core/errors.h"
#include "tensorflow/core/lib/gtl/inlined_vector.h"
#include "tensorflow/core/lib/gtl/stl_util.h"
#include "tensorflow/core/lib/strings/str_util.h"
#include "tensorflow/core/lib/strings/strcat.h"
@@ -713,4 +714,36 @@ void Tensor::FillDescription(TensorDescription* description) const {
}
}
gtl::InlinedVector<int64, 5> Tensor::ComputeFlatInnerDims(
int64 num_out_dims) const {
gtl::InlinedVector<int64, 5> out_dims(num_out_dims, 0);
const int64 num_elements = NumElements();
if (num_elements != 0) {
int64 prod_out_dims = 1;
for (int64 out_dim = num_out_dims - 1; out_dim > 0; --out_dim) {
const int64 in_dim = out_dim + (dims() - num_out_dims);
out_dims[out_dim] =
(in_dim >= dims() || in_dim < 0) ? 1 : dim_size(in_dim);
prod_out_dims *= out_dims[out_dim];
}
out_dims[0] = num_elements / prod_out_dims;
}
return out_dims;
}
gtl::InlinedVector<int64, 5> Tensor::ComputeFlatOuterDims(
int64 num_out_dims) const {
gtl::InlinedVector<int64, 5> out_dims(num_out_dims, 0);
const int64 num_elements = NumElements();
if (num_elements != 0) {
int64 prod_out_dims = 1;
for (int64 out_dim = 0; out_dim < num_out_dims - 1; ++out_dim) {
out_dims[out_dim] = out_dim >= dims() ? 1 : dim_size(out_dim);
prod_out_dims *= out_dims[out_dim];
}
out_dims[num_out_dims - 1] = num_elements / prod_out_dims;
}
return out_dims;
}
} // namespace tensorflow

View File

@@ -28,6 +28,7 @@ limitations under the License.
#include "tensorflow/core/lib/core/refcount.h"
#include "tensorflow/core/lib/core/status.h"
#include "tensorflow/core/lib/core/stringpiece.h"
#include "tensorflow/core/lib/gtl/inlined_vector.h"
#include "tensorflow/core/platform/logging.h"
#include "tensorflow/core/platform/macros.h"
#include "tensorflow/core/platform/types.h"
@@ -243,40 +244,28 @@ class Tensor {
///
/// ```
template <typename T>
typename TTypes<T>::Flat flat();
typename TTypes<T>::Flat flat() {
return shaped<T, 1>({NumElements()});
}
template <typename T>
typename TTypes<T>::UnalignedFlat unaligned_flat() {
return unaligned_shaped<T, 1>({NumElements()});
}
/// Returns the data as an Eigen::Tensor with 2 dimensions, collapsing all
/// Tensor dimensions but the last one into the first dimension of the result.
template <typename T>
typename TTypes<T>::Matrix flat_inner_dims() {
int64 last_size = dims() > 0 ? dim_size(dims() - 1) : 1;
if (last_size == 0) {
DCHECK_EQ(NumElements(), 0);
// Return something empty, avoiding divide by 0
return shaped<T, 2>({0, 0});
} else {
return shaped<T, 2>({NumElements() / last_size, last_size});
}
}
/// Returns the data as an Eigen::Tensor with NDIMS dimensions, collapsing all
/// Tensor dimensions but the last NDIMS-1 into the first dimension of the
/// result. If NDIMS > dims() then leading dimensions of size 1 will be
/// added to make the output rank NDIMS.
template <typename T, size_t NDIMS = 2>
typename TTypes<T, NDIMS>::Tensor flat_inner_dims();
/// Returns the data as an Eigen::Tensor with 2 dimensions, collapsing all
/// Tensor dimensions but the first one into the last dimension of the result.
template <typename T>
typename TTypes<T>::Matrix flat_outer_dims() {
int64 first_size = dims() > 0 ? dim_size(0) : 1;
if (first_size == 0) {
DCHECK_EQ(NumElements(), 0);
// Return something empty, avoiding divide by 0
return shaped<T, 2>({0, 0});
} else {
return shaped<T, 2>({first_size, NumElements() / first_size});
}
}
/// Returns the data as an Eigen::Tensor with NDIMS dimensions, collapsing all
/// Tensor dimensions but the first NDIMS-1 into the last dimension of the
/// result. If NDIMS > dims() then trailing dimensions of size 1 will be
/// added to make the output rank NDIMS.
template <typename T, size_t NDIMS = 2>
typename TTypes<T, NDIMS>::Tensor flat_outer_dims();
template <typename T, size_t NDIMS>
typename TTypes<T, NDIMS>::Tensor shaped(gtl::ArraySlice<int64> new_sizes);
@@ -308,31 +297,19 @@ class Tensor {
typename TTypes<T, NDIMS>::ConstTensor tensor() const;
template <typename T>
typename TTypes<T>::ConstFlat flat() const;
typename TTypes<T>::ConstFlat flat() const {
return shaped<T, 1>({NumElements()});
}
template <typename T>
typename TTypes<T>::UnalignedConstFlat unaligned_flat() const {
return unaligned_shaped<T, 1>({NumElements()});
}
template <typename T>
typename TTypes<T>::ConstMatrix flat_inner_dims() const {
int64 last_size = dims() > 0 ? dim_size(dims() - 1) : 1;
if (last_size == 0) {
DCHECK_EQ(NumElements(), 0);
// Return something empty, avoiding divide by 0
return shaped<T, 2>({0, 0});
} else {
return shaped<T, 2>({NumElements() / last_size, last_size});
}
}
template <typename T>
typename TTypes<T>::ConstMatrix flat_outer_dims() const;
template <typename T, size_t NDIMS>
typename TTypes<T, NDIMS>::ConstTensor shaped(
gtl::ArraySlice<int64> new_sizes) const;
template <typename T, size_t NDIMS>
typename TTypes<T, NDIMS>::UnalignedConstTensor unaligned_shaped(
gtl::ArraySlice<int64> new_sizes) const;
@@ -340,6 +317,12 @@ class Tensor {
template <typename T>
typename TTypes<T>::ConstScalar scalar() const;
template <typename T, size_t NDIMS = 2>
typename TTypes<T, NDIMS>::ConstTensor flat_inner_dims() const;
template <typename T, size_t NDIMS = 2>
typename TTypes<T, NDIMS>::ConstTensor flat_outer_dims() const;
/// Render the first `max_entries` values in `*this` into a string.
string SummarizeValue(int64 max_entries) const;
@ -378,6 +361,8 @@ class Tensor {
void FillDimsAndValidateCompatibleShape(
gtl::ArraySlice<int64> new_sizes,
Eigen::array<Eigen::DenseIndex, NDIMS>* dims) const;
gtl::InlinedVector<int64, 5> ComputeFlatInnerDims(int64 num_out_dims) const;
gtl::InlinedVector<int64, 5> ComputeFlatOuterDims(int64 num_out_dims) const;
TensorShape shape_;
TensorBuffer* buf_;
@@ -534,26 +519,24 @@ typename TTypes<T>::ConstScalar Tensor::scalar() const {
return typename TTypes<T>::ConstScalar(base<T>());
}
template <typename T>
typename TTypes<T>::Flat Tensor::flat() {
return shaped<T, 1>({NumElements()});
template <typename T, size_t NDIMS>
typename TTypes<T, NDIMS>::Tensor Tensor::flat_inner_dims() {
return shaped<T, NDIMS>(ComputeFlatInnerDims(NDIMS));
}
template <typename T>
typename TTypes<T>::ConstFlat Tensor::flat() const {
return shaped<T, 1>({NumElements()});
template <typename T, size_t NDIMS>
typename TTypes<T, NDIMS>::Tensor Tensor::flat_outer_dims() {
return shaped<T, NDIMS>(ComputeFlatOuterDims(NDIMS));
}
template <typename T>
typename TTypes<T>::ConstMatrix Tensor::flat_outer_dims() const {
int64 first_size = dims() > 0 ? dim_size(0) : 1;
if (first_size == 0) {
DCHECK_EQ(NumElements(), 0);
// Return something empty, avoiding divide by 0
return shaped<T, 2>({0, 0});
} else {
return shaped<T, 2>({first_size, NumElements() / first_size});
}
template <typename T, size_t NDIMS>
typename TTypes<T, NDIMS>::ConstTensor Tensor::flat_inner_dims() const {
return shaped<T, NDIMS>(ComputeFlatInnerDims(NDIMS));
}
template <typename T, size_t NDIMS>
typename TTypes<T, NDIMS>::ConstTensor Tensor::flat_outer_dims() const {
return shaped<T, NDIMS>(ComputeFlatOuterDims(NDIMS));
}
} // namespace tensorflow
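The new `flat_inner_dims`/`flat_outer_dims` semantics (collapse all but the last or first `NDIMS-1` axes, padding with size-1 axes when `NDIMS` exceeds the rank) can be mimicked with numpy reshapes. A sketch, ignoring the zero-element special case the C++ handles, checked against the shapes the updated test in the next file expects for a `{2, 3, 4, 5}` tensor:

```python
import numpy as np

def flat_inner_dims(x, ndims=2):
    # Collapse all but the last ndims-1 axes into the first output axis;
    # pad with leading size-1 axes when ndims exceeds x.ndim.
    shape = (1,) * max(ndims - x.ndim, 0) + x.shape
    split = len(shape) - (ndims - 1)
    return x.reshape((int(np.prod(shape[:split])),) + shape[split:])

def flat_outer_dims(x, ndims=2):
    # Collapse all but the first ndims-1 axes into the last output axis;
    # pad with trailing size-1 axes when ndims exceeds x.ndim.
    shape = x.shape + (1,) * max(ndims - x.ndim, 0)
    return x.reshape(shape[:ndims - 1] + (int(np.prod(shape[ndims - 1:])),))

x = np.arange(120).reshape(2, 3, 4, 5)
assert flat_inner_dims(x).shape == (24, 5)
assert flat_inner_dims(x, 3).shape == (6, 4, 5)
assert flat_outer_dims(x, 3).shape == (2, 3, 20)
assert flat_inner_dims(x, 5).shape == (1, 2, 3, 4, 5)
assert flat_outer_dims(x, 5).shape == (2, 3, 4, 5, 1)
```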

View File

@@ -224,6 +224,49 @@ TEST(Tensor_Float, Reshape) {
EXPECT_EQ(flat_inner_dims(0, 0), 0.01f);
EXPECT_EQ(flat_inner_dims(23, 4), 0.02f);
}
{
auto flat_outer_dims = t.flat_outer_dims<float>();
EXPECT_EQ(2, flat_outer_dims.dimension(0));
EXPECT_EQ(60, flat_outer_dims.dimension(1));
EXPECT_EQ(flat_outer_dims(0, 0), 0.01f);
EXPECT_EQ(flat_outer_dims(1, 59), 0.02f);
}
{
auto flat_inner_dims = t.flat_inner_dims<float, 3>();
EXPECT_EQ(6, flat_inner_dims.dimension(0));
EXPECT_EQ(4, flat_inner_dims.dimension(1));
EXPECT_EQ(5, flat_inner_dims.dimension(2));
EXPECT_EQ(flat_inner_dims(0, 0, 0), 0.01f);
EXPECT_EQ(flat_inner_dims(5, 3, 4), 0.02f);
}
{
auto flat_outer_dims = t.flat_outer_dims<float, 3>();
EXPECT_EQ(2, flat_outer_dims.dimension(0));
EXPECT_EQ(3, flat_outer_dims.dimension(1));
EXPECT_EQ(20, flat_outer_dims.dimension(2));
EXPECT_EQ(flat_outer_dims(0, 0, 0), 0.01f);
EXPECT_EQ(flat_outer_dims(1, 2, 19), 0.02f);
}
{
auto flat_inner_dims = t.flat_inner_dims<float, 5>();
EXPECT_EQ(1, flat_inner_dims.dimension(0));
EXPECT_EQ(2, flat_inner_dims.dimension(1));
EXPECT_EQ(3, flat_inner_dims.dimension(2));
EXPECT_EQ(4, flat_inner_dims.dimension(3));
EXPECT_EQ(5, flat_inner_dims.dimension(4));
EXPECT_EQ(flat_inner_dims(0, 0, 0, 0, 0), 0.01f);
EXPECT_EQ(flat_inner_dims(0, 1, 2, 3, 4), 0.02f);
}
{
auto flat_outer_dims = t.flat_outer_dims<float, 5>();
EXPECT_EQ(2, flat_outer_dims.dimension(0));
EXPECT_EQ(3, flat_outer_dims.dimension(1));
EXPECT_EQ(4, flat_outer_dims.dimension(2));
EXPECT_EQ(5, flat_outer_dims.dimension(3));
EXPECT_EQ(1, flat_outer_dims.dimension(4));
EXPECT_EQ(flat_outer_dims(0, 0, 0, 0, 0), 0.01f);
EXPECT_EQ(flat_outer_dims(1, 2, 3, 4, 0), 0.02f);
}
}
TEST(Tensor_Scalar, Basics) {

View File

@@ -305,6 +305,23 @@ tf_kernel_libraries(
],
)
tf_cc_test(
name = "batch_norm_op_test",
size = "small",
deps = [
":batch_norm_op",
":ops_testutil",
":ops_util",
"//tensorflow/core:core_cpu",
"//tensorflow/core:framework",
"//tensorflow/core:lib",
"//tensorflow/core:protos_all_cc",
"//tensorflow/core:test",
"//tensorflow/core:test_main",
"//tensorflow/core:testlib",
],
)
tf_cc_test(
name = "concat_op_test",
size = "small",

View File

@@ -0,0 +1,62 @@
/* Copyright 2015 Google Inc. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
==============================================================================*/
#include <vector>
#include "tensorflow/core/framework/allocator.h"
#include "tensorflow/core/framework/fake_input.h"
#include "tensorflow/core/framework/graph.pb.h"
#include "tensorflow/core/framework/node_def_builder.h"
#include "tensorflow/core/framework/op_kernel.h"
#include "tensorflow/core/framework/tensor.h"
#include "tensorflow/core/framework/tensor_testutil.h"
#include "tensorflow/core/framework/types.h"
#include "tensorflow/core/framework/types.pb.h"
#include "tensorflow/core/kernels/ops_testutil.h"
#include "tensorflow/core/kernels/ops_util.h"
#include "tensorflow/core/lib/core/status_test_util.h"
#include "tensorflow/core/platform/test.h"
namespace tensorflow {
class BatchNormOpTest : public OpsTestBase {};
TEST_F(BatchNormOpTest, Simple) {
TF_EXPECT_OK(
NodeDefBuilder("batch_norm_op", "BatchNormWithGlobalNormalization")
.Input(FakeInput(DT_FLOAT))
.Input(FakeInput(DT_FLOAT))
.Input(FakeInput(DT_FLOAT))
.Input(FakeInput(DT_FLOAT))
.Input(FakeInput(DT_FLOAT))
.Attr("scale_after_normalization", false)
.Attr("variance_epsilon", 0.001)
.Finalize(node_def()));
TF_EXPECT_OK(InitOpWithGraphVersion(8));
AddInputFromArray<float>(TensorShape({1, 1, 6, 2}),
{1, 4, 2, 5, 3, 6, -1, -4, -2, -5, -3, -6});
AddInputFromArray<float>(TensorShape({2}), {10, 20});
AddInputFromArray<float>(TensorShape({2}), {0.25, 0.5});
AddInputFromArray<float>(TensorShape({2}), {0.1, 0.6});
AddInputFromArray<float>(TensorShape({2}), {0.0, 0.0});
TF_ASSERT_OK(RunOpKernel());
Tensor expected(allocator(), DT_FLOAT, TensorShape({1, 1, 6, 2}));
test::FillValues<float>(
&expected, {-17.86, -22.00, -15.87, -20.59, -13.87, -19.18, -21.86,
-33.31, -23.85, -34.72, -25.85, -36.13});
test::ExpectTensorNear<float>(expected, *GetOutput(0), 0.01);
}
} // namespace tensorflow
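The expected values in this new test are consistent with global normalization without scaling, i.e. `(x - m) / sqrt(v + variance_epsilon) + beta` when `scale_after_normalization` is false (stated here as an assumption about the op, inferred from the expected tensor rather than from this diff). Checking the first entry:

```python
import math

x, m, v, beta, eps = 1.0, 10.0, 0.25, 0.1, 0.001
print((x - m) / math.sqrt(v + eps) + beta)  # -17.864..., matching -17.86
```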

View File

@@ -94,10 +94,13 @@ class OpsTestBase : public ::testing::Test {
// and output types as output.
//
// Returns the status of initialization.
Status InitOp() {
Status InitOp() { return InitOpWithGraphVersion(TF_GRAPH_DEF_VERSION); }
// Only use this directly if you have a deprecated op that you need to test.
Status InitOpWithGraphVersion(int graph_def_version) {
Status status;
kernel_ = CreateOpKernel(device_type_, device_.get(), allocator(),
node_def_, TF_GRAPH_DEF_VERSION, &status);
node_def_, graph_def_version, &status);
if (kernel_ != nullptr) input_types_ = kernel_->input_types();
return status;
}

View File

@@ -66,7 +66,7 @@ EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE T MaybeConj(T v) {
#define MAYBE_CONJ(T) \
template <> \
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE T MaybeConj<T>(T v) { \
return std::conj(v); \
return Eigen::numext::conj(v); \
}
#endif

View File

@@ -1,121 +0,0 @@
/* Copyright 2015 Google Inc. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
==============================================================================*/
#include "tensorflow/core/lib/core/command_line_flags.h"
#include "tensorflow/core/lib/strings/str_util.h"
#include "tensorflow/core/lib/strings/strcat.h"
#include "tensorflow/core/lib/strings/stringprintf.h"
namespace tensorflow {
namespace {
// Templated function to convert a string to target values.
// Return true if the conversion is successful. Otherwise, return false.
template <typename T>
bool StringToValue(const string& content, T* value);
template <>
bool StringToValue<int32>(const string& content, int32* value) {
return strings::safe_strto32(content, value);
}
template <>
bool StringToValue<string>(const string& content, string* value) {
*value = content;
return true;
}
// Parse a single argument by linearly searching through the command table.
// The input format is: --argument=value.
// Return OK if the argument is used. It stores the extracted value into the
// matching flag.
// Return NOT_FOUND if the argument is not recognized.
// Return INVALID_ARGUMENT if the command is recognized, but fails to extract
// its value.
template <typename T>
Status ParseArgument(const string& argument) {
for (auto& command :
internal::CommandLineFlagRegistry<T>::Instance()->commands) {
string prefix = strings::StrCat("--", command.name, "=");
if (tensorflow::StringPiece(argument).starts_with(prefix)) {
string content = argument.substr(prefix.length());
if (StringToValue<T>(content, command.value)) {
return Status::OK();
}
return Status(error::INVALID_ARGUMENT,
strings::StrCat("Cannot parse integer in: ", argument));
}
}
return Status(error::NOT_FOUND,
strings::StrCat("Unknown command: ", argument));
}
// A specialization for booleans. The input format is:
// "--argument" or "--noargument".
// Parse a single argument by linearly searching through the command table.
// Return OK if the argument is used. The value is stored in the matching flag.
// Return NOT_FOUND if the argument is not recognized.
template <>
Status ParseArgument<bool>(const string& argument) {
for (auto& command :
internal::CommandLineFlagRegistry<bool>::Instance()->commands) {
if (argument == strings::StrCat("--", command.name)) {
*command.value = true;
return Status::OK();
} else if (argument == strings::StrCat("--no", command.name)) {
*command.value = false;
return Status::OK();
}
}
return Status(error::NOT_FOUND,
strings::StrCat("Unknown command: ", argument));
}
} // namespace
Status ParseCommandLineFlags(int* argc, char* argv[]) {
int unused_argc = 1;
for (int index = 1; index < *argc; ++index) {
Status s;
// Search bool commands.
s = ParseArgument<bool>(argv[index]);
if (s.ok()) {
continue;
}
if (s.code() != error::NOT_FOUND) {
return s;
}
// Search int32 commands.
s = ParseArgument<int32>(argv[index]);
if (s.ok()) {
continue;
}
// Search string commands.
s = ParseArgument<string>(argv[index]);
if (s.ok()) {
continue;
}
if (s.code() != error::NOT_FOUND) {
return s;
}
// Pointer swap the unused argument to the front.
std::swap(argv[unused_argc++], argv[index]);
}
*argc = unused_argc;
return Status::OK();
}
} // namespace tensorflow
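Though the file is being deleted, the convention it implemented is worth recording: `--name=value` for int32 and string flags, `--flag`/`--noflag` for booleans, unused arguments compacted to the front of `argv`, and the new count returned through `*argc`. A rough Python equivalent of `ParseCommandLineFlags` (the flag dicts here are stand-ins for `CommandLineFlagRegistry`):

```python
def parse_command_line_flags(argv, bool_flags, int_flags, str_flags):
    """Consume recognized --flags from argv; return the unused arguments."""
    unused = argv[:1]
    for arg in argv[1:]:
        if arg.startswith('--') and arg[2:] in bool_flags:
            bool_flags[arg[2:]] = True
        elif arg.startswith('--no') and arg[4:] in bool_flags:
            bool_flags[arg[4:]] = False
        elif arg.startswith('--') and '=' in arg:
            name, value = arg[2:].split('=', 1)
            if name in int_flags:
                int_flags[name] = int(value)  # ValueError on a bad integer
            elif name in str_flags:
                str_flags[name] = value
            else:
                unused.append(arg)  # unrecognized, keep it
        else:
            unused.append(arg)
    return unused
```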

View File

@@ -1,80 +0,0 @@
/* Copyright 2015 Google Inc. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
==============================================================================*/
#ifndef TENSORFLOW_LIB_CORE_COMMAND_LINE_FLAGS_H_
#define TENSORFLOW_LIB_CORE_COMMAND_LINE_FLAGS_H_
#include <vector>
#include "tensorflow/core/lib/core/status.h"
#include "tensorflow/core/platform/macros.h"
#include "tensorflow/core/platform/types.h"
namespace tensorflow {
namespace internal {
template <typename T>
struct CommandLineFlagRegistry {
static CommandLineFlagRegistry* Instance() {
static CommandLineFlagRegistry instance_;
return &instance_;
}
struct Command {
string name;
T* value;
string text;
};
std::vector<Command> commands;
private:
CommandLineFlagRegistry() {}
TF_DISALLOW_COPY_AND_ASSIGN(CommandLineFlagRegistry);
};
template <typename T>
struct CommandLineFlagRegister {
CommandLineFlagRegister(const string& name, T* val, const string& text) {
CommandLineFlagRegistry<T>::Instance()->commands.push_back(
{name, val, text});
}
};
#define TF_DEFINE_variable(type, name, default_value, text) \
type FLAGS_##name = default_value; \
namespace TF_flags_internal { \
tensorflow::internal::CommandLineFlagRegister<type> \
TF_flags_internal_var_##name(#name, &FLAGS_##name, text); \
} // namespace TF_flags_internal
} // namespace internal
#define TF_DEFINE_int32(name, default_value, text) \
TF_DEFINE_variable(tensorflow::int32, name, default_value, text);
#define TF_DEFINE_bool(name, default_value, text) \
TF_DEFINE_variable(bool, name, default_value, text);
#define TF_DEFINE_string(name, default_value, text) \
TF_DEFINE_variable(string, name, default_value, text);
// Parse argv[1]..argv[*argc-1] to options. Remove used arguments from the argv.
// Return the number of unused arguments in *argc.
// Return error Status if the parsing encounters errors.
// TODO(opensource): switch to a command line argument parser that can be
// shared with other tests.
Status ParseCommandLineFlags(int* argc, char* argv[]);
} // namespace tensorflow
#endif // TENSORFLOW_LIB_CORE_COMMAND_LINE_FLAGS_H_

View File

@@ -20,7 +20,7 @@ limitations under the License.
#define TF_MAJOR_VERSION 0
#define TF_MINOR_VERSION 8
#define TF_PATCH_VERSION 0rc0
#define TF_PATCH_VERSION 0
// TF_VERSION_SUFFIX is non-empty for pre-releases (e.g. "-alpha", "-alpha.1",
// "-beta", "-rc", "-rc.1")

View File

@@ -120,9 +120,12 @@ variable to its initial value.
##### Args:
* <b>`initial_value`</b>: A `Tensor`, or Python object convertible to a `Tensor`.
The initial value for the Variable. Must have a shape specified unless
`validate_shape` is set to False.
* <b>`initial_value`</b>: A `Tensor`, or Python object convertible to a `Tensor`,
which is the initial value for the Variable. The initial value must have
a shape specified unless `validate_shape` is set to False. Can also be a
callable with no argument that returns the initial value when called. In
that case, `dtype` must be specified. (Note that initializer functions
from init_ops.py must first be bound to a shape before being used here.)
* <b>`trainable`</b>: If `True`, the default, also adds the variable to the graph
collection `GraphKeys.TRAINABLE_VARIABLES`. This collection is used as
the default list of variables to use by the `Optimizer` classes.
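A minimal sketch of the callable form described above (the shape and names here are illustrative, not part of this change):

```python
import tensorflow as tf

# dtype is mandatory: the shape and type are unknown until the callable runs.
w = tf.Variable(lambda: tf.truncated_normal([784, 10]), dtype=tf.float32,
                name="weights")
```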

View File

@ -1955,6 +1955,7 @@ on the parameters to the constructor and may include:
##### Raises:
* <b>`RuntimeError`</b>: If called with a non-chief Supervisor.
* <b>`ValueError`</b>: If no `logdir` was passed to the constructor, as the
services need a log directory.
@ -2182,6 +2183,7 @@ on the parameters to the constructor and may include:
##### Raises:
* <b>`RuntimeError`</b>: If called with a non-chief Supervisor.
* <b>`ValueError`</b>: If no `logdir` was passed to the constructor, as the
services need a log directory.
@ -2409,7 +2411,7 @@ Start threads for `QueueRunners`.
#### `tf.train.Supervisor.summary_op` {#Supervisor.summary_op}
Return the Summary Tensor used by the supervisor.
Return the Summary Tensor used by the chief supervisor.
##### Returns:
@ -2420,7 +2422,7 @@ Return the Summary Tensor used by the supervisor.
#### `tf.train.Supervisor.summary_writer` {#Supervisor.summary_writer}
Return the SummaryWriter used by the supervisor.
Return the SummaryWriter used by the chief supervisor.
##### Returns:

View File

@ -7,8 +7,10 @@ github source.
The TensorFlow Python API supports Python 2.7 and Python 3.3+.
The GPU version (Linux only) requires the Cuda Toolkit >= 7.0 and cuDNN >=
v2. Please see [Cuda installation](#optional-install-cuda-gpus-on-linux)
The GPU version (Linux only) works best with Cuda Toolkit 7.5 and
cuDNN v4. Other versions (Cuda toolkit >= 7.0 and cuDNN 6.5(v2), 7.0(v3),
v5) are supported only when installing from sources.
Please see [Cuda installation](#optional-install-cuda-gpus-on-linux)
for details.
## Overview
@ -20,10 +22,13 @@ We support different ways to install TensorFlow:
Python programs on your machine.
* [Virtualenv install](#virtualenv-installation): Install TensorFlow in its own
directory, not impacting any existing Python programs on your machine.
* [Anaconda install](#anaconda-environment-installation): Install TensorFlow in its own
environment for those running the Anaconda Python distribution. Does not
impact existing Python programs on your machine.
* [Docker install](#docker-installation): Run TensorFlow in a Docker container
isolated from all other programs on your machine.
If you are familiar with Pip, Virtualenv, or Docker, please feel free to adapt
If you are familiar with Pip, Virtualenv, Anaconda, or Docker, please feel free to adapt
the instructions to your particular needs. The names of the pip and Docker
images are listed in the corresponding installation sections.
@ -53,28 +58,30 @@ Install TensorFlow:
```bash
# Ubuntu/Linux 64-bit, CPU only:
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0rc0-cp27-none-linux_x86_64.whl
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled:
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0rc0-cp27-none-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled. Requires CUDA toolkit 7.5 and CuDNN v4. For
# other versions, see "Install from sources" below.
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
# Mac OS X, CPU only:
$ sudo easy_install --upgrade six
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0rc0-py2-none-any.whl
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0-py2-none-any.whl
```
For python3:
```bash
# Ubuntu/Linux 64-bit, CPU only:
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0rc0-cp34-cp34m-linux_x86_64.whl
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0-cp34-cp34m-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled:
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0rc0-cp34-cp34m-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled. Requires CUDA toolkit 7.5 and CuDNN v4. For
# other versions, see "Install from sources" below.
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0-cp34-cp34m-linux_x86_64.whl
# Mac OS X, CPU only:
$ sudo easy_install --upgrade six
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0rc0-py3-none-any.whl
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0-py3-none-any.whl
```
NOTE: If you are upgrading from a previous installation of TensorFlow < 0.7.1,
@ -126,13 +133,14 @@ $ source ~/tensorflow/bin/activate.csh # If using csh
(tensorflow)$ # Your prompt should change
# Ubuntu/Linux 64-bit, CPU only:
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0rc0-cp27-none-linux_x86_64.whl
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled:
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0rc0-cp27-none-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled. Requires CUDA toolkit 7.5 and CuDNN v4. For
# other versions, see "Install from sources" below.
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
# Mac OS X, CPU only:
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0rc0-py2-none-any.whl
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0-py2-none-any.whl
```
and again for python3:
@ -143,13 +151,14 @@ $ source ~/tensorflow/bin/activate.csh # If using csh
(tensorflow)$ # Your prompt should change
# Ubuntu/Linux 64-bit, CPU only:
(tensorflow)$ pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0rc0-cp34-cp34m-linux_x86_64.whl
(tensorflow)$ pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0-cp34-cp34m-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled:
(tensorflow)$ pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0rc0-cp34-cp34m-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled. Requires CUDA toolkit 7.5 and CuDNN v4. For
# other versions, see "Install from sources" below.
(tensorflow)$ pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0-cp34-cp34m-linux_x86_64.whl
# Mac OS X, CPU only:
(tensorflow)$ pip3 install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0rc0-py3-none-any.whl
(tensorflow)$ pip3 install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0-py3-none-any.whl
```
With the Virtualenv environment activated, you can now
@ -175,6 +184,95 @@ $ source ~/tensorflow/bin/activate.csh # If using csh.
(tensorflow)$ deactivate
```
## Anaconda environment installation
[Anaconda](https://www.continuum.io/why-anaconda) is a Python distribution that
includes a large number of standard numeric and scientific computing packages.
Anaconda uses a package manager called "conda" that has its own
[environment system](http://conda.pydata.org/docs/using/envs.html) similar to Virtualenv.
As with Virtualenv, conda environments keep the dependencies required by
different Python projects in separate places. The Anaconda environment
installation of TensorFlow will not override pre-existing versions of the Python
packages needed by TensorFlow.
* Install Anaconda.
* Create a conda environment.
* Activate the conda environment and install TensorFlow in it.
* After the install you will activate the conda environment each time you
want to use TensorFlow.
Install Anaconda:
Follow the instructions on the [Anaconda download site](https://www.continuum.io/downloads)
Create a conda environment called `tensorflow`:
```bash
# Python 2.7
$ conda create -n tensorflow python=2.7
# Python 3.5
$ conda create -n tensorflow python=3.5
```
Activate the environment and use pip to install TensorFlow inside it.
Use the `--ignore-installed` flag to prevent errors about `easy_install`.
```bash
$ source activate tensorflow
(tensorflow)$ # Your prompt should change
# Ubuntu/Linux 64-bit, CPU only:
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled. Requires CUDA toolkit 7.5 and CuDNN v4. For
# other versions, see "Install from sources" below.
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
# Mac OS X, CPU only:
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0-py2-none-any.whl
```
and again for Python 3:
```bash
$ source activate tensorflow
(tensorflow)$ # Your prompt should change
# Ubuntu/Linux 64-bit, CPU only:
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0-cp34-cp34m-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled. Requires CUDA toolkit 7.5 and CuDNN v4. For
# other versions, see "Install from sources" below.
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0-cp34-cp34m-linux_x86_64.whl
# Mac OS X, CPU only:
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0-py3-none-any.whl
```
With the conda environment activated, you can now
[test your installation](#test-the-tensorflow-installation).
When you are done using TensorFlow, deactivate the environment.
```bash
(tensorflow)$ source deactivate
$ # Your prompt should change back
```
To use TensorFlow later you will have to activate the conda environment again:
```bash
$ source activate tensorflow
(tensorflow)$ # Your prompt should change.
# Run Python programs that use TensorFlow.
...
# When you are done using TensorFlow, deactivate the environment.
(tensorflow)$ source deactivate
```
## Docker installation
[Docker](http://docker.com/) is a system to build self-contained versions of a
@ -191,7 +289,7 @@ code.
* `gcr.io/tensorflow/tensorflow:latest-devel-gpu`: GPU Binary image plus source
code.
We also have tags with `latest` replaced by a released version (e.g., `0.8.0rc0-gpu`).
We also have tags with `latest` replaced by a released version (e.g., `0.8.0-gpu`).
With Docker the installation is as follows:
@ -229,7 +327,7 @@ You can now [test your installation](#test-the-tensorflow-installation) within t
### (Optional, Linux) Enable GPU Support
If you installed the GPU version of TensorFlow, you must also install the Cuda
Toolkit 7.0 and cuDNN v2. Please see [Cuda installation](#optional-install-cuda-gpus-on-linux).
Toolkit 7.5 and cuDNN v4. Please see [Cuda installation](#optional-install-cuda-gpus-on-linux).
You also need to set the `LD_LIBRARY_PATH` and `CUDA_HOME` environment
variables. Consider adding the commands below to your `~/.bash_profile`. These
@ -370,20 +468,25 @@ Supported cards include but are not limited to:
https://developer.nvidia.com/cuda-downloads
Install version 7.5 if using our binary releases.
Install the toolkit into e.g. `/usr/local/cuda`
##### Download and install cuDNN
https://developer.nvidia.com/cudnn
Download cuDNN v4 (v5 is currently a release candidate and is only supported when
installing TensorFlow from sources).
Uncompress and copy the cuDNN files into the toolkit directory. Assuming the
toolkit is installed in `/usr/local/cuda`, run the following commands (edited
to reflect the cuDNN version you downloaded):
``` bash
tar xvzf cudnn-6.5-linux-x64-v2.tgz
sudo cp cudnn-6.5-linux-x64-v2/cudnn.h /usr/local/cuda/include
sudo cp cudnn-6.5-linux-x64-v2/libcudnn* /usr/local/cuda/lib64
tar xvzf cudnn-7.5-linux-x64-v4.tgz
sudo cp cudnn-7.5-linux-x64-v4/cudnn.h /usr/local/cuda/include
sudo cp cudnn-7.5-linux-x64-v4/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
```
@ -517,7 +620,7 @@ $ bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_pack
$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
# The name of the .whl file will depend on your platform.
$ pip install /tmp/tensorflow_pkg/tensorflow-0.8.0rc0-py2-none-linux_x86_64.whl
$ pip install /tmp/tensorflow_pkg/tensorflow-0.8.0-py2-none-linux_x86_64.whl
```
## Setting up TensorFlow for Development

View File

@ -8,7 +8,7 @@ writing TensorFlow programs.
## Hello distributed TensorFlow!
This tutorial assumes that you are using a TensorFlow nightly build. You
can test your installation by starting a local server as follows:
can test your installation by starting and using a local server as follows:
```shell
# Start a TensorFlow server as a single-process "cluster".
@ -16,29 +16,34 @@ $ python
>>> import tensorflow as tf
>>> c = tf.constant("Hello, distributed TensorFlow!")
>>> server = tf.train.Server.create_local_server()
>>> sess = tf.Session(server.target)
>>> sess = tf.Session(server.target) # Create a session on the server.
>>> sess.run(c)
'Hello, distributed TensorFlow!'
```
The
[`tf.train.Server.create_local_server()`](../../api_docs/train.md#Server.create_local_server)
method creates a single-process cluster.
method creates a single-process cluster, with an in-process server.
## Create a cluster
Most clusters have multiple tasks, divided into one or more jobs. To create a
cluster with multiple processes or machines:
A TensorFlow "cluster" is a set of "tasks" that participate in the distributed
execution of a TensorFlow graph. Each task is associated with a TensorFlow
"server", which contains a "master" that can be used to create sessions, and a
"worker" that executes operations in the graph. A cluster can also be divided
into one or more "jobs", where each job contains one or more tasks.
1. **For each process or machine** in the cluster, run a TensorFlow program to
do the following:
To create a cluster, you start one TensorFlow server per task in the cluster.
Each task typically runs on a different machine, but you can run multiple tasks
on the same machine (e.g. to control different GPU devices). In each task, do
the following:
1. **Create a `tf.train.ClusterSpec`**, which describes all of the tasks
in the cluster. This should be the same in each process.
1. **Create a `tf.train.ClusterSpec`** that describes all of the tasks
in the cluster. This should be the same for each task.
1. **Create a `tf.train.Server`**, passing the `tf.train.ClusterSpec` to
the constructor, and identifying the local process with a job name
and task index.
2. **Create a `tf.train.Server`**, passing the `tf.train.ClusterSpec` to
the constructor, and identifying the local task with a job name
and task index.
### Create a `tf.train.ClusterSpec` to describe the cluster
@ -71,28 +76,29 @@ tf.train.ClusterSpec({
</tr>
</table>
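The table content is elided by the hunk above; for reference, the dictionary form it illustrates looks roughly like this (hostnames and ports are placeholders):

```python
import tensorflow as tf

cluster = tf.train.ClusterSpec({
    "ps": ["ps0.example.com:2222", "ps1.example.com:2222"],
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"]})
```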
### Create a `tf.train.Server` instance in each process
### Create a `tf.train.Server` instance in each task
A [`tf.train.Server`](../../api_docs/python/train.md#Server) object contains a
set of local devices, and a
[`tf.Session`](../../api_docs/python/client.md#Session) target that can
participate in a distributed computation. Each server belongs to a particular
cluster (specified by a `tf.train.ClusterSpec`), and corresponds to a particular
task in a named job. The server can communicate with any other server in the
same cluster.
set of local devices, a set of connections to other tasks in its
`tf.train.ClusterSpec`, and a
["session target"](../../api_docs/python/client.md#Session) that can use these
to perform a distributed computation. Each server is a member of a specific
named job and has a task index within that job. A server can communicate with
any other server in the cluster.
For example, to define and instantiate servers running on `localhost:2222` and
`localhost:2223`, run the following snippets in different processes:
For example, to launch a cluster with two servers running on `localhost:2222`
and `localhost:2223`, run the following snippets in two different processes on
the local machine:
```python
# In task 0:
cluster = tf.ClusterSpec({"local": ["localhost:2222", "localhost:2223"]})
server = tf.GrpcServer(cluster, job_name="local", task_index=0)
cluster = tf.train.ClusterSpec({"local": ["localhost:2222", "localhost:2223"]})
server = tf.train.Server(cluster, job_name="local", task_index=0)
```
```python
# In task 1:
cluster = tf.ClusterSpec({"local": ["localhost:2222", "localhost:2223"]})
server = tf.GrpcServer(cluster, job_name="local", task_index=1)
cluster = tf.train.ClusterSpec({"local": ["localhost:2222", "localhost:2223"]})
server = tf.train.Server(cluster, job_name="local", task_index=1)
```
**Note:** Manually specifying these cluster specifications can be tedious,
@ -137,45 +143,44 @@ applying gradients).
## Replicated training
A common training configuration ("data parallel training") involves multiple
tasks in a `worker` job training the same model, using shared parameters hosted
in one or more tasks in a `ps` job. Each task will typically run on a
different machine. There are many ways to specify this structure in TensorFlow,
and we are building libraries that will simplify the work of specifying a
replicated model. Possible approaches include:
A common training configuration, called "data parallelism," involves multiple
tasks in a `worker` job training the same model on different mini-batches of
data, updating shared parameters hosted in one or more tasks in a `ps`
job. All tasks typically run on different machines. There are many ways to
specify this structure in TensorFlow, and we are building libraries that will
simplify the work of specifying a replicated model. Possible approaches include:
* Building a single graph containing one set of parameters (in `tf.Variable`
nodes pinned to `/job:ps`), and multiple copies of the "model" pinned to
different tasks in `/job:worker`. Each copy of the model can have a different
`train_op`, and one or more client threads can call `sess.run(train_ops[i])`
for each worker `i`. This implements *asynchronous* training.
* **In-graph replication.** In this approach, the client builds a single
`tf.Graph` that contains one set of parameters (in `tf.Variable` nodes pinned
to `/job:ps`); and multiple copies of the compute-intensive part of the model,
each pinned to a different task in `/job:worker`.
* **Between-graph replication.** In this approach, there is a separate client
for each `/job:worker` task, typically in the same process as the worker
task. Each client builds a similar graph containing the parameters (pinned to
`/job:ps` as before using
[`tf.train.replica_device_setter()`](../../api_docs/train.md#replica_device_setter)
to map them deterministically to the same tasks); and a single copy of the
compute-intensive part of the model, pinned to the local task in
`/job:worker` (see the sketch after this list).
This approach uses a single `tf.Session` whose target is one of the workers in
the cluster.
* **Asynchronous training.** In this approach, each replica of the graph has an
independent training loop that executes without coordination. It is compatible
with both forms of replication above.
* As above, but where the gradients from all workers are averaged. See the
[CIFAR-10 multi-GPU trainer](https://www.tensorflow.org/code/tensorflow/models/image/cifar10/cifar10_multi_gpu_train.py)
for an example of this form of replication. This implements *synchronous*
training.
* The "distributed trainer" approach uses multiple graphs&mdash;one per
worker&mdash;where each graph contains one set of parameters (pinned to
`/job:ps`) and one copy of the model (pinned to a particular
`/job:worker/task:i`). The "container" mechanism is used to share variables
between different graphs: when each variable is constructed, the optional
`container` argument is specified with the same value in each copy of the
graph. For large models, this can be more efficient, because the overall graph
is smaller.
This approach uses multiple `tf.Session` objects: one per worker process,
where the `target` of each is the address of a different worker. The
`tf.Session` objects can all be created in a single Python client, or you can
use multiple Python clients to better distribute the trainer load.
* **Synchronous training.** In this approach, all of the replicas read the same
values for the current parameters, compute gradients in parallel, and then
apply them together. It is compatible with in-graph replication (e.g. using
gradient averaging as in the
[CIFAR-10 multi-GPU trainer](https://www.tensorflow.org/code/tensorflow/models/image/cifar10/cifar10_multi_gpu_train.py)),
and between-graph replication (e.g. using the
`tf.train.SyncReplicasOptimizer`).
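As a brief illustration of the between-graph approach referenced in the list above, the following sketch (with placeholder addresses and variable names) shows how `tf.train.replica_device_setter()` pins variables to `/job:ps` while leaving other ops on the local worker:

```python
import tensorflow as tf

cluster = tf.train.ClusterSpec({"ps": ["localhost:2222"],
                                "worker": ["localhost:2223"]})
with tf.device(tf.train.replica_device_setter(
    worker_device="/job:worker/task:0", cluster=cluster)):
  weights = tf.Variable(tf.zeros([784, 10]))  # placed on /job:ps/task:0
  inputs = tf.placeholder(tf.float32, [None, 784])
  logits = tf.matmul(inputs, weights)  # stays on /job:worker/task:0
```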
### Putting it all together: example trainer program
The following code shows the skeleton of a distributed trainer program. It
includes the code for the parameter server and worker processes.
The following code shows the skeleton of a distributed trainer program,
implementing **between-graph replication** and **asynchronous training**. It
includes the code for the parameter server and worker tasks.
```python
import tensorflow as tf
@ -197,10 +202,13 @@ def main(_):
ps_hosts = FLAGS.ps_hosts.split(",")
worker_hosts = FLAGS.worker_hosts.split(",")
# Create a cluster from the parameter server and worker hosts.
cluster = tf.train.ClusterSpec({"ps": ps_hosts, "worker": worker_hosts})
# Create and start a server for the local task.
server = tf.train.Server(cluster,
job_name=FLAGS.job_name,
task_index=task_index)
task_index=FLAGS.task_index)
if FLAGS.job_name == "ps":
server.join()
@ -290,10 +298,10 @@ $ python trainer.py \
</dd>
<dt>Cluster</dt>
<dd>
A TensorFlow cluster comprises one or more TensorFlow servers, divided into
a set of named jobs, which in turn comprise lists of tasks. A cluster is
typically dedicated to a particular high-level objective, such as training a
neural network, using many machines in parallel.
A TensorFlow cluster comprises one or more "jobs", each divided into lists
of one or more "tasks". A cluster is typically dedicated to a particular
high-level objective, such as training a neural network, using many machines
in parallel. A cluster is defined by a `tf.train.ClusterSpec` object.
</dd>
<dt>Job</dt>
<dd>
@ -301,20 +309,22 @@ $ python trainer.py \
purpose. For example, a job named `ps` (for "parameter server") typically
hosts nodes that store and update variables; while a job named `worker`
typically hosts stateless nodes that perform compute-intensive tasks.
The tasks in a job typically run on different machines.
The tasks in a job typically run on different machines. The set of job roles
is flexible: for example, a `worker` may maintain some state.
</dd>
<dt>Master service</dt>
<dd>
An RPC service that provides remote access to a set of distributed
devices. The master service implements the <code>tensorflow::Session</code>
interface, and is responsible for coordinating work across one or more
"worker services".
An RPC service that provides remote access to a set of distributed devices,
and acts as a session target. The master service implements the
<code>tensorflow::Session</code> interface, and is responsible for
coordinating work across one or more "worker services". All TensorFlow
servers implement the master service.
</dd>
<dt>Task</dt>
<dd>
A task typically corresponds to a single TensorFlow server process,
belonging to a particular "job" and with a particular index within that
job's list of tasks.
A task corresponds to a specific TensorFlow server, and typically
corresponds to a single process. A task belongs to a particular "job" and is
identified by its index within that job's list of tasks.
</dd>
<dt>TensorFlow server</dt>
<dd>
@ -326,6 +336,7 @@ $ python trainer.py \
An RPC service that executes parts of a TensorFlow graph using its local
devices. A worker service implements <a href=
"https://www.tensorflow.org/code/tensorflow/core/protobuf/worker_service.proto"
><code>worker_service.proto</code></a>.
><code>worker_service.proto</code></a>. All TensorFlow servers implement the
worker service.
</dd>
</dl>

View File

@ -114,4 +114,4 @@ Building on the Inception recognition model, we will release a TensorFlow
version of the [Deep Dream](https://github.com/google/deepdream) neural network
visual hallucination software.
[View Tutorial](https://www.tensorflow.org/code/tensorflow/examples/tutorials/deepdream/deepdream.ipynb)
[View Tutorial](https://www.tensorflow.org/code/tensorflow/examples/tutorials/deepdream/README.md)

View File

@ -19,6 +19,7 @@ from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import functools
import numpy as np
import tensorflow as tf
@ -208,5 +209,59 @@ class RNNCellTest(tf.test.TestCase):
0.13248, 0.13248]])
class SlimRNNCellTest(tf.test.TestCase):
def testBasicRNNCell(self):
with self.test_session() as sess:
with tf.variable_scope("root", initializer=tf.constant_initializer(0.5)):
x = tf.zeros([1, 2])
m = tf.zeros([1, 2])
my_cell = functools.partial(basic_rnn_cell, num_units=2)
g, _ = tf.nn.rnn_cell.SlimRNNCell(my_cell)(x, m)
sess.run([tf.initialize_all_variables()])
res = sess.run([g], {x.name: np.array([[1., 1.]]),
m.name: np.array([[0.1, 0.1]])})
self.assertEqual(res[0].shape, (1, 2))
def testBasicRNNCellMatch(self):
batch_size = 32
input_size = 100
num_units = 10
with self.test_session() as sess:
with tf.variable_scope("root", initializer=tf.constant_initializer(0.5)):
inputs = tf.random_uniform((batch_size, input_size))
_, initial_state = basic_rnn_cell(inputs, None, num_units)
my_cell = functools.partial(basic_rnn_cell, num_units=num_units)
slim_cell = tf.nn.rnn_cell.SlimRNNCell(my_cell)
slim_outputs, slim_state = slim_cell(inputs, initial_state)
rnn_cell = tf.nn.rnn_cell.BasicRNNCell(num_units)
outputs, state = rnn_cell(inputs, initial_state)
self.assertEqual(slim_outputs.get_shape(), outputs.get_shape())
self.assertEqual(slim_state.get_shape(), state.get_shape())
sess.run([tf.initialize_all_variables()])
res = sess.run([slim_outputs, slim_state, outputs, state])
self.assertAllClose(res[0], res[2])
self.assertAllClose(res[1], res[3])
def basic_rnn_cell(inputs, state, num_units, scope=None):
if state is None:
if inputs is not None:
batch_size = inputs.get_shape()[0]
dtype = inputs.dtype
else:
batch_size = 0
dtype = tf.float32
init_output = tf.zeros(tf.pack([batch_size, num_units]), dtype=dtype)
init_state = tf.zeros(tf.pack([batch_size, num_units]), dtype=dtype)
init_output.set_shape([batch_size, num_units])
init_state.set_shape([batch_size, num_units])
return init_output, init_state
else:
with tf.variable_op_scope([inputs, state], scope, "BasicRNNCell"):
output = tf.tanh(tf.nn.rnn_cell.linear([inputs, state],
num_units, True))
return output, output
if __name__ == "__main__":
tf.test.main()

View File

@ -302,6 +302,53 @@ class VariablesTestCase(tf.test.TestCase):
self.assertEqual(var.op.device, init_op.device)
sess.run(init_op)
def testInitializerFunction(self):
value = [[-42], [133.7]]
shape = [2, 1]
with self.test_session():
initializer = lambda: tf.constant(value)
with self.assertRaises(ValueError):
# Checks that dtype must be specified.
tf.Variable(initializer)
v1 = tf.Variable(initializer, dtype=tf.float32)
self.assertEqual(shape, v1.get_shape())
self.assertAllClose(value, v1.initial_value.eval())
with self.assertRaises(tf.errors.FailedPreconditionError):
v1.eval()
v2 = tf.Variable(tf.neg(v1.initialized_value()), dtype=tf.float32)
self.assertEqual(v1.get_shape(), v2.get_shape())
self.assertAllClose(np.negative(value), v2.initial_value.eval())
# Once v2.initial_value.eval() has been called, v1 has effectively been
# initialized.
self.assertAllClose(value, v1.eval())
with self.assertRaises(tf.errors.FailedPreconditionError):
v2.eval()
tf.initialize_all_variables().run()
self.assertAllClose(np.negative(value), v2.eval())
def testInitializerFunctionDevicePlacement(self):
with self.test_session():
initializer = lambda: tf.constant(42.0)
with tf.device("/cpu:100"):
v1 = tf.Variable(initializer, dtype=tf.float32, name="v1")
expected_device = "/device:CPU:100"
expected_group_v1 = [b"loc:@v1"]
self.assertEqual(expected_device, v1.op.device)
self.assertEqual(expected_group_v1, v1.op.colocation_groups())
for i in v1.initializer.inputs:
self.assertEqual(expected_device, i.op.device)
self.assertEqual(expected_group_v1, i.op.colocation_groups())
v2 = tf.Variable(initializer, dtype=tf.float32, name="v2")
expected_group_v2 = [b"loc:@v2"]
self.assertEqual(expected_group_v2, v2.op.colocation_groups())
for i in v2.initializer.inputs:
self.assertEqual(expected_group_v2, i.op.colocation_groups())
class IsInitializedTest(tf.test.TestCase):

View File

@ -167,19 +167,22 @@ def create_partitioned_variables(
slice_offset[slice_dim] += var_shape[slice_dim]
if callable(initializer):
init_val = initializer(var_shape, dtype=dtype)
init_val = ops.convert_to_tensor(init_val, dtype=dtype)
init = initializer
init_shape = var_shape
elif isinstance(initializer, ops.Tensor):
init_val = array_ops.slice(initializer, var_offset, var_shape)
init = array_ops.slice(initializer, var_offset, var_shape)
# Use the dtype of the given tensor.
dtype = init_val.dtype.base_dtype
dtype = init.dtype.base_dtype
init_shape = None
else:
init_val = ops.convert_to_tensor(initializer, dtype=dtype)
init_val = array_ops.slice(init_val, var_offset, var_shape)
init = ops.convert_to_tensor(initializer, dtype=dtype)
init = array_ops.slice(init, var_offset, var_shape)
init_shape = None
var = variable_scope.get_variable(name="part_%d" % i,
shape=init_shape,
dtype=dtype,
initializer=init_val,
initializer=init,
trainable=trainable,
collections=collections)

View File

@ -661,6 +661,42 @@ class MultiRNNCell(RNNCell):
return cur_inp, array_ops.concat(1, new_states)
class SlimRNNCell(RNNCell):
"""A simple wrapper for slim.rnn_cells."""
def __init__(self, cell_fn):
"""Create a SlimRNNCell from a cell_fn.
Args:
cell_fn: a function which takes (inputs, state, scope) and produces the
outputs and the new_state. Additionally when called with inputs=None and
state=None it should return (initial_outputs, initial_state).
Raises:
TypeError: if cell_fn is not callable
ValueError: if cell_fn cannot produce a valid initial state.
"""
if not callable(cell_fn):
raise TypeError("cell_fn %s needs to be callable", cell_fn)
self._cell_fn = cell_fn
self._cell_name = cell_fn.func.__name__
_, init_state = self._cell_fn(None, None)
state_shape = init_state.get_shape()
self._state_size = state_shape.with_rank(2)[1].value
if self._state_size is None:
raise ValueError("Initial state created by %s has invalid shape %s",
self._cell_name, state_shape)
@property
def state_size(self):
return self._state_size
def __call__(self, inputs, state, scope=None):
scope = scope or self._cell_name
output, state = self._cell_fn(inputs, state, scope=scope)
return output, state
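A usage sketch matching the contract documented above; `toy_cell` is illustrative and mirrors the `basic_rnn_cell` helper added to the test file. Note that `cell_fn` must be a `functools.partial`, since the constructor reads `cell_fn.func.__name__`:

```python
import functools
import tensorflow as tf

def toy_cell(inputs, state, num_units, scope=None):
  # Called with (None, None), it must return (initial_outputs, initial_state).
  if inputs is None and state is None:
    init = tf.zeros([1, num_units])
    return init, init
  with tf.variable_scope(scope or "ToyCell"):
    output = tf.tanh(tf.nn.rnn_cell.linear([inputs, state], num_units, True))
    return output, output

cell = tf.nn.rnn_cell.SlimRNNCell(functools.partial(toy_cell, num_units=2))
outputs, state = cell(tf.zeros([1, 2]), tf.zeros([1, 2]))
```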
def linear(args, output_size, bias, bias_start=0.0, scope=None):
"""Linear map: sum_i(args[i] * W[i]), where W[i] is a variable.

View File

@ -144,14 +144,19 @@ class _VariableStore(object):
with ops.control_dependencies(None):
if initializing_from_value:
init_val = initializer
variable_dtype = None
else:
with ops.name_scope(name + "/Initializer/"):
init_val = initializer(shape.as_list(), dtype=dtype)
init_val = lambda: initializer(shape.as_list(), dtype=dtype)
variable_dtype = dtype.base_dtype
# Create the variable.
v = variables.Variable(init_val, name=name, trainable=trainable,
v = variables.Variable(initial_value=init_val,
name=name,
trainable=trainable,
collections=collections,
caching_device=caching_device)
caching_device=caching_device,
dtype=variable_dtype)
self._vars[name] = v
logging.info("Created variable %s with shape %s and init %s", v.name,
format(shape), initializer)
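With this change, `get_variable` hands the `Variable` constructor a zero-argument lambda instead of an eagerly evaluated tensor, so the initializer ops are created colocated with the variable. The user-facing call is unchanged; a sketch (scope and shape are illustrative):

```python
import tensorflow as tf

with tf.variable_scope("layer"):
  w = tf.get_variable("w", shape=[2, 3],
                      initializer=tf.constant_initializer(0.5))
```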

View File

@ -156,9 +156,12 @@ class Variable(object):
variable to its initial value.
Args:
initial_value: A `Tensor`, or Python object convertible to a `Tensor`.
The initial value for the Variable. Must have a shape specified unless
`validate_shape` is set to False.
initial_value: A `Tensor`, or Python object convertible to a `Tensor`,
which is the initial value for the Variable. The initial value must have
a shape specified unless `validate_shape` is set to False. Can also be a
callable with no argument that returns the initial value when called. In
that case, `dtype` must be specified. (Note that initializer functions
from init_ops.py must first be bound to a shape before being used here.)
trainable: If `True`, the default, also adds the variable to the graph
collection `GraphKeys.TRAINABLE_VARIABLES`. This collection is used as
the default list of variables to use by the `Optimizer` classes.
@ -211,9 +214,12 @@ class Variable(object):
"""Creates a new variable from arguments.
Args:
initial_value: A `Tensor`, or Python object convertible to a `Tensor`.
The initial value for the Variable. Must have a shape specified unless
`validate_shape` is set to False.
initial_value: A `Tensor`, or Python object convertible to a `Tensor`,
which is the initial value for the Variable. The initial value must have
a shape specified unless `validate_shape` is set to False. Can also be a
callable with no argument that returns the initial value when called. In
that case, `dtype` must be specified. (Note that initializer functions
from init_ops.py must first be bound to a shape before being used here.)
trainable: If `True`, the default, also adds the variable to the graph
collection `GraphKeys.TRAINABLE_VARIABLES`. This collection is used as
the default list of variables to use by the `Optimizer` classes.
@ -240,25 +246,62 @@ class Variable(object):
"""
if initial_value is None:
raise ValueError("initial_value must be specified.")
init_from_fn = callable(initial_value)
if init_from_fn and dtype is None:
raise ValueError(
"dtype must also be specified when initial_value is callable.")
if collections is None:
collections = [ops.GraphKeys.VARIABLES]
if trainable and ops.GraphKeys.TRAINABLE_VARIABLES not in collections:
collections = list(collections) + [ops.GraphKeys.TRAINABLE_VARIABLES]
with ops.control_dependencies(None):
with ops.op_scope([initial_value], name, "Variable") as name:
self._initial_value = ops.convert_to_tensor(initial_value,
name="initial_value",
dtype=dtype)
initial_value_shape = self._initial_value.get_shape()
if validate_shape and not initial_value_shape.is_fully_defined():
raise ValueError("initial_value must have a shape specified: %s"
% self._initial_value)
shape_to_set = initial_value_shape if validate_shape else []
with ops.op_scope(
[] if init_from_fn else [initial_value], name, "Variable") as name:
self._variable = state_ops.variable_op(
shape_to_set, self._initial_value.dtype.base_dtype,
set_shape=validate_shape, name=name)
# Get the initial value from a callable function. The real shape of the
# variable will be set later, since under the init_from_fn case, the
# shape won't be known until after the function is invoked.
if init_from_fn:
self._variable = state_ops.variable_op(
[],
dtype.base_dtype,
set_shape=False,
name=name)
with ops.colocate_with(self._variable.op):
with ops.name_scope("Initializer"):
# Colocate the tensors created by the initial_value() function
# with the variable itself.
self._initial_value = ops.convert_to_tensor(initial_value(),
name="initial_value",
dtype=dtype)
# Or get the initial value from a Tensor or Python object.
else:
self._initial_value = ops.convert_to_tensor(initial_value,
name="initial_value",
dtype=dtype)
# In this case, the variable op can't be created until after the
# initial_value has been converted to a Tensor with a known type.
self._variable = state_ops.variable_op(
[],
self._initial_value.dtype.base_dtype,
set_shape=False,
name=name)
# Manually overrides the variable's shape with the initial value's.
if validate_shape:
initial_value_shape = self._initial_value.get_shape()
if not initial_value_shape.is_fully_defined():
raise ValueError("initial_value must have a shape specified: %s"
% self._initial_value)
self._variable.set_shape(initial_value_shape)
# TODO(b/28152992): Remove the below hack modifying the node_def shape
# directly once set_shape() handles it.
self._variable.op.node_def.attr["shape"].shape.CopyFrom(
initial_value_shape.as_proto())
# Assigns initial value.
with ops.colocate_with(self._variable.op):
self._initializer_op = state_ops.assign(
self._variable, self._initial_value,

View File

@ -79,3 +79,7 @@ def get_path_to_datafile(path):
"""
data_files_path = os.path.dirname(inspect.getfile(sys._getframe(1)))
return os.path.join(data_files_path, path)
def readahead_file_path(path, unused_readahead=None):
"""Readahead files not implemented; simply returns given path."""
return path

View File

@ -22,6 +22,7 @@ from tensorflow.core.util import event_pb2
from tensorflow.python import pywrap_tensorflow
from tensorflow.python.platform import app
from tensorflow.python.platform import logging
from tensorflow.python.platform import resource_loader
from tensorflow.python.util import compat
@ -31,6 +32,7 @@ class EventFileLoader(object):
def __init__(self, file_path):
if file_path is None:
raise ValueError('A file path is required')
file_path = resource_loader.readahead_file_path(file_path)
logging.debug('Opening a record reader pointing at %s', file_path)
self._reader = pywrap_tensorflow.PyRecordReader_New(
compat.as_bytes(file_path), 0)

View File

@ -238,6 +238,7 @@ class ClusterSpec(object):
elif isinstance(cluster, ClusterSpec):
self._cluster_def = tensorflow_server_pb2.ClusterDef()
self._cluster_def.MergeFrom(cluster.as_cluster_def())
self._cluster_spec = {}
for job_def in self._cluster_def.job:
self._cluster_spec[job_def.name] = [t for t in job_def.tasks.values()]
else:
@ -306,4 +307,3 @@ class ClusterSpec(object):
raise TypeError(
"Task address %r must be bytes or unicode" % task_address)
job_def.tasks[i] = task_address
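The one-line fix above (initializing `self._cluster_spec` before it is populated) restores the previously broken copy-construction path, as exercised by the `testClusterSpec` case added in the next file. A sketch of the behavior it enables (the address is a placeholder):

```python
import tensorflow as tf

spec = tf.train.ClusterSpec({"local": ["localhost:2222"]})
copy = tf.train.ClusterSpec(spec)  # no longer raises
server = tf.train.Server(copy, job_name="local", task_index=0)
```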

View File

@ -146,6 +146,29 @@ class ServerDefTest(tf.test.TestCase):
cluster_spec = tf.train.ClusterSpec(cluster_def)
self.assertProtoEquals(cluster_def, cluster_spec.as_cluster_def())
def testClusterSpec(self):
cluster_spec = tf.train.ClusterSpec(
{"ps": ["ps0:2222", "ps1:2222"],
"worker": ["worker0:2222", "worker1:2222", "worker2:2222"]})
expected_proto = """
job { name: 'ps' tasks { key: 0 value: 'ps0:2222' }
tasks { key: 1 value: 'ps1:2222' } }
job { name: 'worker' tasks { key: 0 value: 'worker0:2222' }
tasks { key: 1 value: 'worker1:2222' }
tasks { key: 2 value: 'worker2:2222' } }
"""
self.assertProtoEquals(expected_proto, cluster_spec.as_cluster_def())
self.assertProtoEquals(
expected_proto, tf.train.ClusterSpec(cluster_spec).as_cluster_def())
self.assertProtoEquals(
expected_proto,
tf.train.ClusterSpec(cluster_spec.as_cluster_def()).as_cluster_def())
self.assertProtoEquals(
expected_proto,
tf.train.ClusterSpec(cluster_spec.as_dict()).as_cluster_def())
if __name__ == "__main__":
tf.test.main()

View File

@ -326,19 +326,29 @@ class Supervisor(object):
self._init_global_step(global_step=global_step)
self._graph = graph
self._is_chief = is_chief
self._logdir = logdir
self._save_summaries_secs = save_summaries_secs
self._save_model_secs = save_model_secs
self._recovery_wait_secs = recovery_wait_secs
self._coord = coordinator.Coordinator()
if logdir:
self._started_threads = []
self._recovery_wait_secs = recovery_wait_secs
# Only chief supervisors write event files, so only chief supervisors
# should have event-writing properties. Set to None for non-chiefs.
if self._is_chief:
self._logdir = logdir
self._save_summaries_secs = save_summaries_secs
self._save_model_secs = save_model_secs
else:
self._logdir = None
self._save_summaries_secs = None
self._save_model_secs = None
if self._is_chief and self._logdir:
self._save_path = os.path.join(self._logdir, checkpoint_basename)
self._summary_writer = summary_io.SummaryWriter(self._logdir)
else:
self._save_path = None
self._summary_writer = None
self._init_session_manager(session_manager=session_manager)
self._started_threads = []
self._verify_setup()
# The graph is not allowed to change anymore.
graph.finalize()
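The net effect of the constructor change above, sketched with a hypothetical log directory (only the chief gets event-writing properties; non-chief supervisors require explicitly placed variables):

```python
import tensorflow as tf

# Chief: summary writer and save path are created.
with tf.Graph().as_default():
  tf.Variable([1.0], name="v")
  chief = tf.train.Supervisor(is_chief=True, logdir="/tmp/sv_logs")
  assert chief.summary_writer is not None

# Non-chief: logdir, save path and summary writer are forced to None.
with tf.Graph().as_default():
  with tf.device("/job:ps/task:0"):
    tf.Variable([1.0], name="v")
  worker = tf.train.Supervisor(is_chief=False)
  assert worker.summary_writer is None
```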
@ -520,7 +530,7 @@ class Supervisor(object):
@property
def summary_writer(self):
"""Return the SummaryWriter used by the supervisor.
"""Return the SummaryWriter used by the chief supervisor.
Returns:
A SummaryWriter.
@ -529,7 +539,7 @@ class Supervisor(object):
@property
def summary_op(self):
"""Return the Summary Tensor used by the supervisor.
"""Return the Summary Tensor used by the chief supervisor.
Returns:
A string Tensor for the summary or `None`.
@ -583,8 +593,7 @@ class Supervisor(object):
def _write_graph(self):
"""Writes graph_def to `logdir` and adds it to summary if applicable."""
if not self._is_chief:
return
assert self._is_chief
if self._logdir:
training_util.write_graph(self._graph.as_graph_def(),
self._logdir, "graph.pbtxt")
@ -610,11 +619,13 @@ class Supervisor(object):
sv.coord.Join(<list of threads>)
Raises:
RuntimeError: If called with a non-chief Supervisor.
ValueError: If no `logdir` was passed to the constructor, as the
services need a log directory.
"""
if not self._is_chief:
return
raise RuntimeError("Only chief supervisor can start standard services. "
"Because only cheif supervisors can write events.")
if not self._logdir:
logging.warning("Standard services need a 'logdir' "
"passed to the SessionManager")
@ -812,14 +823,18 @@ class Supervisor(object):
TypeError: if 'summary' is not a Summary proto or a string.
RuntimeError: if the Supervisor was created without a `logdir`.
"""
if not self._logdir:
raise RuntimeError("summary_computed() requires a logdir")
if not self._summary_writer:
raise RuntimeError("Writing a summary requires a summary writer.")
if global_step is None and self.global_step is not None:
global_step = training_util.global_step(sess, self.global_step)
if self._summary_writer:
self._summary_writer.add_summary(summary, global_step)
self._summary_writer.add_summary(summary, global_step)
def _default_global_step_tensor(self):
"""Returns the global_step from the default graph.
Returns:
The global step `Tensor` or `None`.
"""
try:
gs = ops.get_default_graph().get_tensor_by_name("global_step:0")
if gs.dtype.base_dtype in [dtypes.int32, dtypes.int64]:

View File

@ -73,12 +73,11 @@ class SupervisorTest(tf.test.TestCase):
sess.close()
sv.stop()
def testSummary(self):
def testChiefCanWriteEvents(self):
logdir = self._TestDir("basics")
with tf.Graph().as_default():
const = tf.constant([1.0, 2.0, 3.0])
summ = tf.scalar_summary(["c1", "c2", "c3"], const)
sv = tf.train.Supervisor(logdir=logdir, summary_op=None)
summ = tf.scalar_summary(["c1", "c2", "c3"], tf.constant([1.0, 2.0, 3.0]))
sv = tf.train.Supervisor(is_chief=True, logdir=logdir, summary_op=None)
sess = sv.prepare_or_wait_for_session("")
sv.summary_computed(sess, sess.run(summ))
sess.close()
@ -113,13 +112,31 @@ class SupervisorTest(tf.test.TestCase):
# We should be done.
self.assertRaises(StopIteration, lambda: next(rr))
def testNonChiefCannotWriteEvents(self):
def _summary_computed():
with tf.Graph().as_default():
sv = tf.train.Supervisor(is_chief=False)
sess = sv.prepare_or_wait_for_session("")
summ = tf.scalar_summary(["c1", "c2"], tf.constant([1.0, 2.0]))
sv.summary_computed(sess, sess.run(summ))
def _start_standard_services():
with tf.Graph().as_default():
sv = tf.train.Supervisor(is_chief=False)
sess = sv.prepare_or_wait_for_session("")
sv.start_standard_services(sess)
self.assertRaises(RuntimeError, _summary_computed)
self.assertRaises(RuntimeError, _start_standard_services)
def testNoLogdirButWantSummary(self):
with tf.Graph().as_default():
const = tf.constant([1.0, 2.0, 3.0])
summ = tf.scalar_summary(["c1", "c2", "c3"], const)
sv = tf.train.Supervisor(logdir="", summary_op=None)
sess = sv.prepare_or_wait_for_session("")
with self.assertRaisesRegexp(RuntimeError, "requires a logdir"):
with self.assertRaisesRegexp(RuntimeError, "requires a summary writer"):
sv.summary_computed(sess, sess.run(summ))
def testNoLogdirSucceeds(self):

View File

@ -12,9 +12,10 @@ filegroup(
srcs = [
"dist/index.html",
"dist/tf-tensorboard.html",
"//tensorflow/tensorboard/bower:bower",
"TAG",
] + glob(["lib/**/*"]),
"//tensorflow/tensorboard/bower:bower",
"//tensorflow/tensorboard/lib:all_files",
],
)
py_binary(

View File

@ -5,7 +5,7 @@ TensorFlow runs and graphs. TensorBoard currently supports four visualizations:
scalars, images, histograms, and the graph.
You can play with an interactive demo TensorBoard at
[tensorflow.org/tensorboard/cifar.html](https://www.tensorflow.org/tensorboard/cifar.html).
[tensorflow.org/tensorboard](https://www.tensorflow.org/tensorboard).
This README gives an overview of key concepts in TensorBoard, as well as how to
interpret the visualizations TensorBoard provides. For an in-depth example of

View File

@ -0,0 +1,66 @@
# Description:
# Benchmark utility that can run on desktop and Android.
package(default_visibility = ["//visibility:public"])
licenses(["notice"]) # Apache 2.0
load("//tensorflow:tensorflow.bzl", "tf_copts")
exports_files(["LICENSE"])
cc_library(
name = "benchmark_model_lib",
srcs = [
"benchmark_model.cc",
],
copts = tf_copts(),
visibility = ["//visibility:public"],
deps = select({
"//tensorflow:android": [
"//tensorflow/core:android_tensorflow_lib",
],
"//conditions:default": [
"//tensorflow/core:core_cpu",
"//tensorflow/core:lib",
"//tensorflow/core:framework",
"//tensorflow/core:framework_internal",
"//tensorflow/core:protos_all_cc",
"//tensorflow/core:tensorflow",
],
}),
)
# This binary may be built for either desktop or Android.
# A typical Android build command will look like the following:
# bazel build -c opt tensorflow/core:android_tensorflow_lib \
# --crosstool_top=//external:android/crosstool \
# --cpu=armeabi-v7a \
# --host_crosstool_top=@bazel_tools//tools/cpp:toolchain
#
# NOTE: currently '-pthread' must be removed from the LINK_OPTS variable
# in google/protobuf/BUILD to successfully build for Android. This is temporary
# pending an update of the version of the protobuf library that Tensorflow
# uses.
cc_binary(
name = "benchmark_model",
copts = tf_copts(),
linkopts = select({
"//tensorflow:android": [
"-pie",
"-s",
"-landroid",
"-ljnigraphics",
"-llog",
"-lm",
"-z defs",
"-s",
"-Wl,--icf=all", # Identical Code Folding
"-Wl,--exclude-libs,ALL", # Exclude syms in all libs from auto export
],
"//conditions:default": [],
}),
linkstatic = 1,
visibility = ["//visibility:public"],
deps = [":benchmark_model_lib"],
)

View File

@ -0,0 +1,57 @@
# Tensorflow Model Benchmark Tool
## Description
A simple C++ binary to benchmark a compute graph and its individual operators,
both on desktop machines and on Android.
## To build/install/run
### On Android:
(1) build for your specific platform, e.g.:
```bash
$ bazel build -c opt \
--crosstool_top=//external:android/crosstool \
--cpu=armeabi-v7a \
--host_crosstool_top=@bazel_tools//tools/cpp:toolchain \
tensorflow/tools/benchmark:benchmark_model
```
(2) Connect your phone. Push the binary to your phone with adb push
(make the directory if required):
```bash
$ adb push bazel-bin/tensorflow/tools/benchmark/benchmark_model /data/local/tmp
```
(3) Push the compute graph that you need to test. For example:
adb push tensorflow_inception_graph.pb /data/local/tmp
(4) Run the benchmark. For example:
```bash
$adb shell "/data/local/tmp/benchmark_model \
--graph=/data/local/tmp/tensorflow_inception_graph.pb \
--input_layer="input:0" \
--input_layer_shape="1,224,224,3" \
--input_layer_type="float" \
--output_layer="output:0"
```
### On desktop:
(1) build the binary
```bash
$ bazel build -c opt tensorflow/tools/benchmark:benchmark_model
```
(2) Run on your compute graph, similar to the Android case but without the need
for adb shell. For example:
```bash
$ bazel-bin/tensorflow/tools/benchmark/benchmark_model \
--graph=tensorflow_inception_graph.pb \
--input_layer="input:0" \
--input_layer_shape="1,224,224,3" \
--input_layer_type="float" \
--output_layer="output:0"
```
The Inception graph used as an example here may be downloaded from
https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip

View File

@ -0,0 +1,225 @@
/* Copyright 2016 Google Inc. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
==============================================================================*/
// A C++ binary to benchmark a compute graph and its individual operators,
// both on desktop machines and on Android.
//
// See README.md for usage instructions.
#include <cstdlib>
#include <memory>
#include <string>
#include <unordered_set>
#include <vector>
#include "tensorflow/core/framework/graph.pb.h"
#include "tensorflow/core/framework/tensor.h"
#include "tensorflow/core/graph/algorithm.h"
#include "tensorflow/core/graph/graph.h"
#include "tensorflow/core/graph/graph_constructor.h"
#include "tensorflow/core/lib/strings/str_util.h"
#include "tensorflow/core/lib/strings/strcat.h"
#include "tensorflow/core/platform/env.h"
#include "tensorflow/core/platform/init_main.h"
#include "tensorflow/core/platform/logging.h"
#include "tensorflow/core/platform/types.h"
#include "tensorflow/core/public/session.h"
#include "tensorflow/core/util/command_line_flags.h"
#include "tensorflow/core/util/stat_summarizer.h"
namespace tensorflow {
// Global variables that hold the Tensorflow classifier.
static std::unique_ptr<tensorflow::Session> session;
static StatSummarizer g_stats;
struct Flags {
string graph = "/data/local/tmp/tensorflow_inception_graph.pb";
string input_layer = "input:0";
string input_layer_shape = "1,224,224,3";
string input_layer_type = "float";
string output_layer = "output:0";
int num_runs = 50;
string run_delay = "-1.0";
int num_threads = -1;
};
static Flags* flags; // Filled in by main()
static bool InitializeBenchmark() {
g_stats.Reset();
LOG(INFO) << "Loading Tensorflow.";
tensorflow::SessionOptions options;
tensorflow::ConfigProto& config = options.config;
if (flags->num_threads > 0) {
config.set_intra_op_parallelism_threads(flags->num_threads);
}
LOG(INFO) << "Got config, " << config.device_count_size() << " devices";
session.reset(tensorflow::NewSession(options));
tensorflow::GraphDef tensorflow_graph;
Status s = ReadBinaryProto(Env::Default(), flags->graph, &tensorflow_graph);
if (!s.ok()) {
LOG(ERROR) << "Could not create Tensorflow Graph: " << s;
return false;
}
s = session->Create(tensorflow_graph);
if (!s.ok()) {
LOG(ERROR) << "Could not create Tensorflow Session: " << s;
return false;
}
// Clear the proto to save memory space.
tensorflow_graph.Clear();
return true;
}
static bool RunBenchmark() {
DataType input_data_type;
CHECK(DataTypeFromString(flags->input_layer_type, &input_data_type))
<< flags->input_layer_type << " was an invalid type";
std::vector<int32> sizes;
CHECK(str_util::SplitAndParseAsInts(flags->input_layer_shape, ',', &sizes))
<< "Incorrect size string specified: " << flags->input_layer_shape;
TensorShape input_shape;
for (int i = 0; i < sizes.size(); ++i) {
input_shape.AddDim(sizes[i]);
}
Tensor input_tensor(input_data_type, input_shape);
switch (input_data_type) {
case DT_INT32: {
auto int_tensor = input_tensor.flat<int32>();
int_tensor = int_tensor.constant(0.0);
break;
}
case DT_FLOAT: {
auto float_tensor = input_tensor.flat<float>();
float_tensor = float_tensor.constant(0.0);
break;
}
case DT_QUINT8: {
auto int_tensor = input_tensor.flat<quint8>();
int_tensor = int_tensor.constant(0.0);
break;
}
default:
LOG(FATAL) << "Unsupported input type: " << flags->input_layer_type;
}
std::vector<std::pair<string, tensorflow::Tensor> > input_tensors(
{{flags->input_layer, input_tensor}});
std::vector<tensorflow::Tensor> output_tensors;
std::vector<string> output_names({flags->output_layer});
tensorflow::Status s;
RunOptions run_options;
run_options.set_trace_level(RunOptions::FULL_TRACE);
RunMetadata run_metadata;
s = session->Run(run_options, input_tensors, output_names, {},
&output_tensors, &run_metadata);
assert(run_metadata.has_step_stats());
const StepStats& stats = run_metadata.step_stats();
g_stats.ProcessStepStats(stats);
if (!s.ok()) {
LOG(ERROR) << "Error during inference: " << s;
return false;
}
return true;
}
} // namespace tensorflow
int main(int argc, char** argv) {
tensorflow::flags = new tensorflow::Flags();
const bool parse_result = tensorflow::ParseFlags(
&argc, argv,
{
tensorflow::Flag("graph", &tensorflow::flags->graph),
tensorflow::Flag("input_layer", &tensorflow::flags->input_layer),
tensorflow::Flag("input_layer_shape",
&tensorflow::flags->input_layer_shape),
tensorflow::Flag("input_layer_type",
&tensorflow::flags->input_layer_type),
tensorflow::Flag("output_layer", &tensorflow::flags->output_layer),
tensorflow::Flag("num_runs", &tensorflow::flags->num_runs),
tensorflow::Flag("run_delay", &tensorflow::flags->run_delay),
tensorflow::Flag("num_threads", &tensorflow::flags->num_threads),
});
if (!parse_result) {
LOG(ERROR) << "Error parsing command-line flags.";
return -1;
}
::tensorflow::port::InitMain(argv[0], &argc, &argv);
if (argc > 1) {
LOG(ERROR) << "Unknown argument " << argv[1];
return -1;
}
LOG(INFO) << "Graph: [" << tensorflow::flags->graph << "]";
LOG(INFO) << "Input layer: [" << tensorflow::flags->input_layer << "]";
LOG(INFO) << "Input shape: [" << tensorflow::flags->input_layer_shape << "]";
LOG(INFO) << "Input type: [" << tensorflow::flags->input_layer_type << "]";
LOG(INFO) << "Output layer: [" << tensorflow::flags->output_layer << "]";
LOG(INFO) << "Num runs: [" << tensorflow::flags->num_runs << "]";
LOG(INFO) << "Inter-run delay (seconds): [" << tensorflow::flags->run_delay
<< "]";
LOG(INFO) << "Num threads: [" << tensorflow::flags->num_threads << "]";
if (!tensorflow::InitializeBenchmark()) {
return -1;
}
// Convert the run_delay string into a timespec.
const double sleep_seconds =
std::strtod(tensorflow::flags->run_delay.c_str(), nullptr);
timespec req;
req.tv_sec = static_cast<time_t>(sleep_seconds);
req.tv_nsec = (sleep_seconds - req.tv_sec) * 1000000000;
LOG(INFO) << "Running benchmark";
for (int i = 0; i < tensorflow::flags->num_runs; ++i) {
if (!tensorflow::RunBenchmark()) {
LOG(INFO) << "Failed on run " << i;
return -1;
}
// If requested, sleep between runs for an arbitrary amount of time.
// This can be helpful to determine the effect of mobile processor
// scaling and thermal throttling.
if (sleep_seconds > 0.0) {
nanosleep(&req, nullptr);
}
}
tensorflow::g_stats.PrintStepStats();
return 0;
}

View File

@ -139,6 +139,16 @@ else
# Assume: PYTHON_BIN_PATH is exported by the script above
fi
# Obtain the path to head/ghead binary (for log file printing)
HEAD_BIN="ghead"
if [[ -z $(which "${HEAD_BIN}") ]]; then
# Not on a Mac (which uses coreutils ghead); fall back to head.
HEAD_BIN="head"
if [[ -z $(which "${HEAD_BIN}") ]]; then
die "Unable to obtain path to head or ghead"
fi
fi
if [[ -z "${PYTHON_BIN_PATH}" ]]; then
die "PYTHON_BIN_PATH was not provided. If this is not virtualenv, "\
"did you run configure?"
@ -371,7 +381,7 @@ while true; do
echo " Log @: ${TEST_LOGS[K]}"
echo "============== BEGINS failure log content =============="
head --lines=-1 "${TEST_LOGS[K]}"
"${HEAD_BIN}" --lines=-1 "${TEST_LOGS[K]}"
echo "============== ENDS failure log content =============="
echo ""
fi

View File

@ -21,7 +21,7 @@ RUN /var/gcloud/google-cloud-sdk/bin/gcloud components install kubectl
# Install nightly TensorFlow pip
# TODO(cais): Should we build it locally instead?
RUN pip install \
http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.8.0rc0-cp27-none-linux_x86_64.whl
http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
# Copy test files
COPY scripts /var/tf-dist-test/scripts

View File

@ -36,7 +36,7 @@ RUN curl -O https://bootstrap.pypa.io/get-pip.py && \
# Install TensorFlow CPU version from nightly build
RUN pip --no-cache-dir install \
http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.8.0rc0-cp27-none-linux_x86_64.whl
http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
# Copy files, including the GRPC server binary at
# server/grpc_tensorflow_server.py

View File

@ -38,7 +38,7 @@ RUN curl -O https://bootstrap.pypa.io/get-pip.py && \
# Install TensorFlow CPU version.
RUN pip --no-cache-dir install \
http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.8.0rc0-cp27-none-linux_x86_64.whl
http://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_CONTAINER_TYPE=CPU,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
# Copy files, including the GRPC server binary at
# server/grpc_tensorflow_server.py

View File

@ -29,7 +29,7 @@ RUN pip --no-cache-dir install \
python -m ipykernel.kernelspec
# Install TensorFlow CPU version.
ENV TENSORFLOW_VERSION 0.8.0rc0
ENV TENSORFLOW_VERSION 0.8.0
RUN pip --no-cache-dir install \
http://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-${TENSORFLOW_VERSION}-cp27-none-linux_x86_64.whl

View File

@ -29,7 +29,7 @@ RUN pip --no-cache-dir install \
python -m ipykernel.kernelspec
# Install TensorFlow GPU version.
ENV TENSORFLOW_VERSION 0.8.0rc0
ENV TENSORFLOW_VERSION 0.8.0
RUN pip --no-cache-dir install \
http://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-${TENSORFLOW_VERSION}-cp27-none-linux_x86_64.whl

View File

@ -27,7 +27,7 @@ from setuptools import find_packages, setup, Command, Extension
from setuptools.command.install import install as InstallCommandBase
from setuptools.dist import Distribution
_VERSION = '0.8.0rc0'
_VERSION = '0.8.0'
numpy_version = "1.8.2"
if platform.system() == "Darwin":
@ -183,6 +183,7 @@ setup(
'tensorboard/dist/index.html',
'tensorboard/dist/tf-tensorboard.html',
'tensorboard/lib/css/global.css',
'tensorboard/TAG',
] + matches,
},
zip_safe=False,