Replace get_started

Also add sub-sections to leftnav files,
and sync leftnav and index files.

PiperOrigin-RevId: 181394206
This commit is contained in:
Mark Daoust 2018-01-09 16:40:47 -08:00 committed by TensorFlower Gardener
parent c522f64667
commit 411f8bcff6
41 changed files with 233 additions and 3583 deletions

View File

@@ -3,8 +3,8 @@
This library contains classes for launching graphs and executing operations.
The @{$get_started/get_started} guide has
examples of how a graph is launched in a @{tf.Session}.
@{$programmers_guide/low_level_intro$This guide} has examples of how a graph
is launched in a @{tf.Session}.
## Session management

View File

@@ -51,8 +51,7 @@ it is executed without a feed, so you won't forget to feed it.
An example using `placeholder` and feeding to train on MNIST data can be found
in
[`tensorflow/examples/tutorials/mnist/fully_connected_feed.py`](https://www.tensorflow.org/code/tensorflow/examples/tutorials/mnist/fully_connected_feed.py),
and is described in the @{$mechanics$MNIST tutorial}.
[`tensorflow/examples/tutorials/mnist/fully_connected_feed.py`](https://www.tensorflow.org/code/tensorflow/examples/tutorials/mnist/fully_connected_feed.py).
## `QueueRunner`

View File

@@ -2,8 +2,8 @@
This document shows how to create a cluster of TensorFlow servers, and how to
distribute a computation graph across that cluster. We assume that you are
familiar with the @{$get_started/get_started$basic concepts} of
writing TensorFlow programs.
familiar with the @{$programmers_guide/low_level_intro$basic concepts} of
writing low level TensorFlow programs.
## Hello distributed TensorFlow!

View File

@@ -7,7 +7,7 @@ learning models and system-level optimizations.
This document describes the system architecture that makes possible this
combination of scale and flexibility. It assumes that you have basic familiarity
with TensorFlow programming concepts such as the computation graph, operations,
and sessions. See @{$get_started/get_started$Getting Started}
and sessions. See @{$programmers_guide/low_level_intro$this document}
for an introduction to these topics. Some familiarity
with @{$distributed$distributed TensorFlow}
will also be helpful.

View File

@@ -1,698 +0,0 @@
# Creating Estimators in tf.estimator
The tf.estimator framework makes it easy to construct and train machine
learning models via its high-level Estimator API. `Estimator`
offers classes you can instantiate to quickly configure common model types such
as regressors and classifiers:
* @{tf.estimator.LinearClassifier}:
Constructs a linear classification model.
* @{tf.estimator.LinearRegressor}:
Constructs a linear regression model.
* @{tf.estimator.DNNClassifier}:
Constructs a neural network classification model.
* @{tf.estimator.DNNRegressor}:
Constructs a neural network regression model.
* @{tf.estimator.DNNLinearCombinedClassifier}:
Constructs a combined neural network and linear classification model.
* @{tf.estimator.DNNLinearCombinedRegressor}:
Constructs a combined neural network and linear regression model.
But what if none of `tf.estimator`'s predefined model types meets your needs?
Perhaps you need more granular control over model configuration, such as
the ability to customize the loss function used for optimization, or specify
different activation functions for each neural network layer. Or maybe you're
implementing a ranking or recommendation system, and neither a classifier nor a
regressor is appropriate for generating predictions.
This tutorial covers how to create your own `Estimator` using the building
blocks provided in `tf.estimator`. The model you build will predict the ages of
[abalones](https://en.wikipedia.org/wiki/Abalone) based on their physical
measurements. You'll learn how to do the following:
* Instantiate an `Estimator`
* Construct a custom model function
* Configure a neural network using `tf.feature_column` and `tf.layers`
* Choose an appropriate loss function from `tf.losses`
* Define a training op for your model
* Generate and return predictions
## Prerequisites
This tutorial assumes you already know tf.estimator API basics, such as
feature columns, input functions, and `train()`/`evaluate()`/`predict()`
operations. If you've never used tf.estimator before, or need a refresher,
you should first review the following tutorials:
* @{$get_started/estimator$tf.estimator Quickstart}: Quick introduction to
training a neural network using tf.estimator.
* @{$wide$TensorFlow Linear Model Tutorial}: Introduction to
feature columns, and an overview on building a linear classifier in
tf.estimator.
* @{$input_fn$Building Input Functions with tf.estimator}: Overview of how
to construct an input_fn to preprocess and feed data into your models.
## An Abalone Age Predictor {#abalone-predictor}
It's possible to estimate the age of an
[abalone](https://en.wikipedia.org/wiki/Abalone) (sea snail) by the number of
rings on its shell. However, because this task requires cutting, staining, and
viewing the shell under a microscope, it's desirable to find other measurements
that can predict age.
The [Abalone Data Set](https://archive.ics.uci.edu/ml/datasets/Abalone) contains
the following
[feature data](https://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.names)
for abalone:
| Feature | Description |
| -------------- | --------------------------------------------------------- |
| Length | Length of abalone (in longest direction; in mm) |
| Diameter | Diameter of abalone (measurement perpendicular to length; in mm)|
| Height | Height of abalone (with its meat inside shell; in mm) |
| Whole Weight | Weight of entire abalone (in grams) |
| Shucked Weight | Weight of abalone meat only (in grams) |
| Viscera Weight | Gut weight of abalone (in grams), after bleeding |
| Shell Weight | Weight of dried abalone shell (in grams) |
The label to predict is the number of rings, as a proxy for abalone age.
![Abalone shell](https://www.tensorflow.org/images/abalone_shell.jpg)
**[“Abalone shell”](https://www.flickr.com/photos/thenickster/16641048623/) (by [Nicki Dugan
Pogue](https://www.flickr.com/photos/thenickster/), CC BY-SA 2.0)**
## Setup
This tutorial uses three data sets.
[`abalone_train.csv`](http://download.tensorflow.org/data/abalone_train.csv)
contains labeled training data comprising 3,320 examples.
[`abalone_test.csv`](http://download.tensorflow.org/data/abalone_test.csv)
contains labeled test data for 850 examples.
[`abalone_predict`](http://download.tensorflow.org/data/abalone_predict.csv)
contains 7 examples on which to make predictions.
The following sections walk through writing the `Estimator` code step by step;
the [full, final code is available
here](https://www.tensorflow.org/code/tensorflow/examples/tutorials/estimators/abalone.py).
## Loading Abalone CSV Data into TensorFlow Datasets
To feed the abalone dataset into the model, you'll need to download and load the
CSVs into TensorFlow `Dataset`s. First, add some standard Python and TensorFlow
imports, and set up FLAGS:
```python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import argparse
import sys
import tempfile
# Import urllib
from six.moves import urllib
import numpy as np
import tensorflow as tf
FLAGS = None
```
Enable logging:
```python
tf.logging.set_verbosity(tf.logging.INFO)
```
Then define a function to load the CSVs (either from files specified in
command-line options, or downloaded from
[tensorflow.org](https://www.tensorflow.org/)):
```python
def maybe_download(train_data, test_data, predict_data):
"""Maybe downloads training data and returns train and test file names."""
if train_data:
train_file_name = train_data
else:
train_file = tempfile.NamedTemporaryFile(delete=False)
urllib.request.urlretrieve(
"http://download.tensorflow.org/data/abalone_train.csv",
train_file.name)
train_file_name = train_file.name
train_file.close()
print("Training data is downloaded to %s" % train_file_name)
if test_data:
test_file_name = test_data
else:
test_file = tempfile.NamedTemporaryFile(delete=False)
urllib.request.urlretrieve(
"http://download.tensorflow.org/data/abalone_test.csv", test_file.name)
test_file_name = test_file.name
test_file.close()
print("Test data is downloaded to %s" % test_file_name)
if predict_data:
predict_file_name = predict_data
else:
predict_file = tempfile.NamedTemporaryFile(delete=False)
urllib.request.urlretrieve(
"http://download.tensorflow.org/data/abalone_predict.csv",
predict_file.name)
predict_file_name = predict_file.name
predict_file.close()
print("Prediction data is downloaded to %s" % predict_file_name)
return train_file_name, test_file_name, predict_file_name
```
Finally, create `main()` and load the abalone CSVs into `Datasets`, defining
flags to allow users to optionally specify CSV files for training, test, and
prediction datasets via the command line (by default, files will be downloaded
from [tensorflow.org](https://www.tensorflow.org/)):
```python
def main(unused_argv):
# Load datasets
abalone_train, abalone_test, abalone_predict = maybe_download(
FLAGS.train_data, FLAGS.test_data, FLAGS.predict_data)
# Training examples
training_set = tf.contrib.learn.datasets.base.load_csv_without_header(
filename=abalone_train, target_dtype=np.int, features_dtype=np.float64)
# Test examples
test_set = tf.contrib.learn.datasets.base.load_csv_without_header(
filename=abalone_test, target_dtype=np.int, features_dtype=np.float64)
# Set of 7 examples for which to predict abalone ages
prediction_set = tf.contrib.learn.datasets.base.load_csv_without_header(
filename=abalone_predict, target_dtype=np.int, features_dtype=np.float64)
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.register("type", "bool", lambda v: v.lower() == "true")
parser.add_argument(
"--train_data", type=str, default="", help="Path to the training data.")
parser.add_argument(
"--test_data", type=str, default="", help="Path to the test data.")
parser.add_argument(
"--predict_data",
type=str,
default="",
help="Path to the prediction data.")
FLAGS, unparsed = parser.parse_known_args()
tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
```
## Instantiating an Estimator
When defining a model using one of tf.estimator's provided classes, such as
`DNNClassifier`, you supply all the configuration parameters right in the
constructor, e.g.:
```python
my_nn = tf.estimator.DNNClassifier(feature_columns=[age, height, weight],
hidden_units=[10, 10, 10],
activation_fn=tf.nn.relu,
dropout=0.2,
n_classes=3,
optimizer="Adam")
```
You don't need to write any further code to instruct TensorFlow how to train the
model, calculate loss, or return predictions; that logic is already baked into
the `DNNClassifier`.
By contrast, when you're creating your own estimator from scratch, the
constructor accepts just two high-level parameters for model configuration,
`model_fn` and `params`:
```python
nn = tf.estimator.Estimator(model_fn=model_fn, params=model_params)
```
* `model_fn`: A function object that contains all the aforementioned logic to
support training, evaluation, and prediction. You are responsible for
implementing that functionality. The next section, [Constructing the
`model_fn`](#constructing-modelfn) covers creating a model function in
detail.
* `params`: An optional dict of hyperparameters (e.g., learning rate, dropout)
that will be passed into the `model_fn`.
Note: Just like `tf.estimator`'s predefined regressors and classifiers, the
`Estimator` initializer also accepts the general configuration arguments
`model_dir` and `config`.
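For illustration only, here is a minimal sketch of passing these general arguments along with `model_fn` and `params`; the `model_dir` path and the `RunConfig` setting below are hypothetical and not part of the abalone example:
```python
nn = tf.estimator.Estimator(
    model_fn=model_fn,
    params=model_params,
    model_dir="/tmp/abalone_model",  # hypothetical checkpoint/summary directory
    config=tf.estimator.RunConfig(save_summary_steps=100))
```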
For the abalone age predictor, the model will accept one hyperparameter:
learning rate. Define `LEARNING_RATE` as a constant at the beginning of your
code (highlighted in bold below), right after the logging configuration:
<pre class="prettyprint"><code class="lang-python">tf.logging.set_verbosity(tf.logging.INFO)
<strong># Learning rate for the model
LEARNING_RATE = 0.001</strong></code></pre>
Note: Here, `LEARNING_RATE` is set to `0.001`, but you can tune this value as
needed to achieve the best results during model training.
Then, add the following code to `main()`, which creates the dict `model_params`
containing the learning rate and instantiates the `Estimator`:
```python
# Set model params
model_params = {"learning_rate": LEARNING_RATE}
# Instantiate Estimator
nn = tf.estimator.Estimator(model_fn=model_fn, params=model_params)
```
## Constructing the `model_fn` {#constructing-modelfn}
The basic skeleton for an `Estimator` API model function looks like this:
```python
def model_fn(features, labels, mode, params):
# Logic to do the following:
# 1. Configure the model via TensorFlow operations
# 2. Define the loss function for training/evaluation
# 3. Define the training operation/optimizer
# 4. Generate predictions
# 5. Return predictions/loss/train_op/eval_metric_ops in EstimatorSpec object
return EstimatorSpec(mode, predictions, loss, train_op, eval_metric_ops)
```
The `model_fn` must accept three arguments:
* `features`: A dict containing the features passed to the model via
`input_fn`.
* `labels`: A `Tensor` containing the labels passed to the model via
`input_fn`. Will be empty for `predict()` calls, as these are the values the
model will infer.
* `mode`: One of the following @{tf.estimator.ModeKeys} string values
indicating the context in which the model_fn was invoked:
* `tf.estimator.ModeKeys.TRAIN` The `model_fn` was invoked in training
mode, namely via a `train()` call.
* `tf.estimator.ModeKeys.EVAL`. The `model_fn` was invoked in
evaluation mode, namely via an `evaluate()` call.
* `tf.estimator.ModeKeys.PREDICT`. The `model_fn` was invoked in
predict mode, namely via a `predict()` call.
`model_fn` may also accept a `params` argument containing a dict of
hyperparameters used for training (as shown in the skeleton above).
The body of the function performs the following tasks (described in detail in the
sections that follow):
* Configuring the model—here, for the abalone predictor, this will be a neural
network.
* Defining the loss function used to calculate how closely the model's
predictions match the target values.
* Defining the training operation that specifies the `optimizer` algorithm to
minimize the loss values calculated by the loss function.
The `model_fn` must return a @{tf.estimator.EstimatorSpec}
object, which contains the following values:
* `mode` (required). The mode in which the model was run. Typically, you will
return the `mode` argument of the `model_fn` here.
* `predictions` (required in `PREDICT` mode). A dict that maps key names of
your choice to `Tensor`s containing the predictions from the model, e.g.:
```python
predictions = {"results": tensor_of_predictions}
```
In `PREDICT` mode, the dict that you return in `EstimatorSpec` will then be
returned by `predict()`, so you can construct it in the format in which
you'd like to consume it.
* `loss` (required in `EVAL` and `TRAIN` mode). A `Tensor` containing a scalar
loss value: the output of the model's loss function (discussed in more depth
later in [Defining loss for the model](#defining-loss)) calculated over all
the input examples. This is used in `TRAIN` mode for error handling and
logging, and is automatically included as a metric in `EVAL` mode.
* `train_op` (required only in `TRAIN` mode). An Op that runs one step of
training.
* `eval_metric_ops` (optional). A dict of name/value pairs specifying the
metrics that will be calculated when the model runs in `EVAL` mode. The name
is a label of your choice for the metric, and the value is the result of
your metric calculation. The @{tf.metrics}
module provides predefined functions for a variety of common metrics. The
following `eval_metric_ops` contains an `"accuracy"` metric calculated using
`tf.metrics.accuracy`:
```python
eval_metric_ops = {
"accuracy": tf.metrics.accuracy(labels, predictions)
}
```
If you do not specify `eval_metric_ops`, only `loss` will be calculated
during evaluation.
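As a rough sketch of that behavior (this reuses the `nn` Estimator instantiated above and the `test_input_fn` defined later in this tutorial), the dict returned by `evaluate()` contains `loss`, `global_step`, and one entry per name in `eval_metric_ops`:
```python
ev = nn.evaluate(input_fn=test_input_fn)
print(sorted(ev.keys()))
# Without eval_metric_ops: ['global_step', 'loss']
# With the "rmse" metric:  ['global_step', 'loss', 'rmse']
```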
### Configuring a neural network with `tf.feature_column` and `tf.layers`
Constructing a [neural
network](https://en.wikipedia.org/wiki/Artificial_neural_network) entails
creating and connecting the input layer, the hidden layers, and the output
layer.
The input layer is a series of nodes (one for each feature in the model) that
will accept the feature data that is passed to the `model_fn` in the `features`
argument. If `features` contains an n-dimensional `Tensor` with all your feature
data, then it can serve as the input layer.
If `features` contains a dict of @{$linear#feature-columns-and-transformations$feature columns} passed to
the model via an input function, you can convert it to an input-layer `Tensor`
with the @{tf.feature_column.input_layer} function.
```python
input_layer = tf.feature_column.input_layer(
features=features, feature_columns=[age, height, weight])
```
As shown above, `input_layer()` takes two required arguments:
* `features`. A mapping from string keys to the `Tensors` containing the
corresponding feature data. This is exactly what is passed to the `model_fn`
in the `features` argument.
* `feature_columns`. A list of all the `FeatureColumns` in the model—`age`,
`height`, and `weight` in the above example.
The input layer of the neural network then must be connected to one or more
hidden layers via an [activation
function](https://en.wikipedia.org/wiki/Activation_function) that performs a
nonlinear transformation on the data from the previous layer. The last hidden
layer is then connected to the output layer, the final layer in the model.
`tf.layers` provides the `tf.layers.dense` function for constructing fully
connected layers. The activation is controlled by the `activation` argument.
Some options to pass to the `activation` argument are:
* `tf.nn.relu`. The following code creates a layer of `units` nodes fully
connected to the previous layer `input_layer` with a
[ReLU activation function](https://en.wikipedia.org/wiki/Rectifier_\(neural_networks\))
(@{tf.nn.relu}):
```python
hidden_layer = tf.layers.dense(
inputs=input_layer, units=10, activation=tf.nn.relu)
```
* `tf.nn.relu6`. The following code creates a layer of `units` nodes fully
connected to the previous layer `hidden_layer` with a ReLU 6 activation
function (@{tf.nn.relu6}):
```python
second_hidden_layer = tf.layers.dense(
    inputs=hidden_layer, units=20, activation=tf.nn.relu6)
```
* `None`. The following code creates a layer of `units` nodes fully connected
to the previous layer `second_hidden_layer` with *no* activation function,
just a linear transformation:
```python
output_layer = tf.layers.dense(
inputs=second_hidden_layer, units=3, activation=None)
```
Other activation functions are possible, e.g.:
```python
output_layer = tf.layers.dense(inputs=second_hidden_layer,
                               units=10,
                               activation=tf.sigmoid)
```
The above code creates the neural network layer `output_layer`, which is fully
connected to `second_hidden_layer` with a sigmoid activation function
(@{tf.sigmoid}). For a list of predefined
activation functions available in TensorFlow, see the @{$python/nn#activation_functions$API docs}.
Putting it all together, the following code constructs a full neural network for
the abalone predictor, and captures its predictions:
```python
def model_fn(features, labels, mode, params):
"""Model function for Estimator."""
# Connect the first hidden layer to input layer
# (features["x"]) with relu activation
first_hidden_layer = tf.layers.dense(features["x"], 10, activation=tf.nn.relu)
# Connect the second hidden layer to first hidden layer with relu
second_hidden_layer = tf.layers.dense(
first_hidden_layer, 10, activation=tf.nn.relu)
# Connect the output layer to second hidden layer (no activation fn)
output_layer = tf.layers.dense(second_hidden_layer, 1)
# Reshape output layer to 1-dim Tensor to return predictions
predictions = tf.reshape(output_layer, [-1])
predictions_dict = {"ages": predictions}
...
```
Here, because you'll be passing the abalone `Datasets` using `numpy_input_fn`
as shown below, `features` is a dict `{"x": data_tensor}`, so
`features["x"]` is the input layer. The network contains two hidden
layers, each with 10 nodes and a ReLU activation function. The output layer
contains no activation function, and is
@{tf.reshape} to a one-dimensional
tensor to capture the model's predictions, which are stored in
`predictions_dict`.
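As a small aside, reshaping with `[-1]` simply flattens the `[batch_size, 1]` output into a one-dimensional tensor; here is a toy illustration (the values are made up, and `tf` is assumed to be imported as in the setup code above):
```python
toy_output = tf.constant([[1.0], [2.0], [3.0]])  # shape (3, 1), like output_layer
flattened = tf.reshape(toy_output, [-1])         # shape (3,), like predictions
```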
### Defining loss for the model {#defining-loss}
The `EstimatorSpec` returned by the `model_fn` must contain `loss`: a `Tensor`
representing the loss value, which quantifies how well the model's predictions
reflect the label values during training and evaluation runs. The @{tf.losses}
module provides convenience functions for calculating loss using a variety of
metrics, including:
* `absolute_difference(labels, predictions)`. Calculates loss using the
[absolute-difference
formula](https://en.wikipedia.org/wiki/Deviation_\(statistics\)#Unsigned_or_absolute_deviation)
(also known as L<sub>1</sub> loss).
* `log_loss(labels, predictions)`. Calculates loss using the [logistic loss
formula](https://en.wikipedia.org/wiki/Loss_functions_for_classification#Logistic_loss)
(typically used in logistic regression).
* `mean_squared_error(labels, predictions)`. Calculates loss using the [mean
squared error](https://en.wikipedia.org/wiki/Mean_squared_error) (MSE; also
known as L<sub>2</sub> loss). A brief comparative sketch of these functions follows this list.
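All three functions share the `(labels, predictions)` signature and return a scalar loss `Tensor`. Here is a toy comparison (the numbers are invented purely for illustration, and `tf` is assumed to be imported):
```python
labels = tf.constant([9.0, 10.0, 7.0])
predictions = tf.constant([8.0, 12.0, 7.5])
l1_loss = tf.losses.absolute_difference(labels, predictions)  # mean absolute error
mse_loss = tf.losses.mean_squared_error(labels, predictions)  # mean squared error
# log_loss expects values in [0, 1]:
prob_labels = tf.constant([1.0, 0.0, 1.0])
prob_predictions = tf.constant([0.9, 0.2, 0.7])
logistic_loss = tf.losses.log_loss(prob_labels, prob_predictions)
```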
The following example adds a definition for `loss` to the abalone `model_fn`
using `mean_squared_error()` (in bold):
<pre class="prettyprint"><code class="lang-python">def model_fn(features, labels, mode, params):
"""Model function for Estimator."""
# Connect the first hidden layer to input layer
# (features["x"]) with relu activation
first_hidden_layer = tf.layers.dense(features["x"], 10, activation=tf.nn.relu)
# Connect the second hidden layer to first hidden layer with relu
second_hidden_layer = tf.layers.dense(
first_hidden_layer, 10, activation=tf.nn.relu)
# Connect the output layer to second hidden layer (no activation fn)
output_layer = tf.layers.dense(second_hidden_layer, 1)
# Reshape output layer to 1-dim Tensor to return predictions
predictions = tf.reshape(output_layer, [-1])
predictions_dict = {"ages": predictions}
<strong># Calculate loss using mean squared error
loss = tf.losses.mean_squared_error(labels, predictions)</strong>
...</code></pre>
See the @{tf.losses$API guide} for a
full list of loss functions and more details on supported arguments and usage.
Supplementary metrics for evaluation can be added to an `eval_metric_ops` dict.
The following code defines an `rmse` metric, which calculates the root mean
squared error for the model predictions. Note that the `labels` tensor is cast
to a `float64` type to match the data type of the `predictions` tensor, which
will contain real values:
```python
eval_metric_ops = {
"rmse": tf.metrics.root_mean_squared_error(
tf.cast(labels, tf.float64), predictions)
}
```
### Defining the training op for the model
The training op defines the optimization algorithm TensorFlow will use when
fitting the model to the training data. Typically when training, the goal is to
minimize loss. A simple way to create the training op is to instantiate a
`tf.train.Optimizer` subclass and call the `minimize` method.
The following code defines a training op for the abalone `model_fn` using the
loss value calculated in [Defining Loss for the Model](#defining-loss), the
learning rate passed to the function in `params`, and the gradient descent
optimizer. For `global_step`, the convenience function
@{tf.train.get_global_step} takes care of generating an integer variable:
```python
optimizer = tf.train.GradientDescentOptimizer(
learning_rate=params["learning_rate"])
train_op = optimizer.minimize(
loss=loss, global_step=tf.train.get_global_step())
```
For a full list of optimizers, and other details, see the
@{$python/train#optimizers$API guide}.
### The complete abalone `model_fn`
Here's the final, complete `model_fn` for the abalone age predictor. The
following code configures the neural network; defines loss and the training op;
and returns an `EstimatorSpec` object containing `mode`, `predictions_dict`, `loss`,
and `train_op`:
```python
def model_fn(features, labels, mode, params):
"""Model function for Estimator."""
# Connect the first hidden layer to input layer
# (features["x"]) with relu activation
first_hidden_layer = tf.layers.dense(features["x"], 10, activation=tf.nn.relu)
# Connect the second hidden layer to first hidden layer with relu
second_hidden_layer = tf.layers.dense(
first_hidden_layer, 10, activation=tf.nn.relu)
# Connect the output layer to second hidden layer (no activation fn)
output_layer = tf.layers.dense(second_hidden_layer, 1)
# Reshape output layer to 1-dim Tensor to return predictions
predictions = tf.reshape(output_layer, [-1])
# Provide an estimator spec for `ModeKeys.PREDICT`.
if mode == tf.estimator.ModeKeys.PREDICT:
return tf.estimator.EstimatorSpec(
mode=mode,
predictions={"ages": predictions})
# Calculate loss using mean squared error
loss = tf.losses.mean_squared_error(labels, predictions)
# Calculate root mean squared error as additional eval metric
eval_metric_ops = {
"rmse": tf.metrics.root_mean_squared_error(
tf.cast(labels, tf.float64), predictions)
}
optimizer = tf.train.GradientDescentOptimizer(
learning_rate=params["learning_rate"])
train_op = optimizer.minimize(
loss=loss, global_step=tf.train.get_global_step())
# Provide an estimator spec for `ModeKeys.EVAL` and `ModeKeys.TRAIN` modes.
return tf.estimator.EstimatorSpec(
mode=mode,
loss=loss,
train_op=train_op,
eval_metric_ops=eval_metric_ops)
```
## Running the Abalone Model
You've instantiated an `Estimator` for the abalone predictor and defined its
behavior in `model_fn`; all that's left to do is train, evaluate, and make
predictions.
Add the following code to the end of `main()` to fit the neural network to the
training data and evaluate accuracy:
```python
train_input_fn = tf.estimator.inputs.numpy_input_fn(
x={"x": np.array(training_set.data)},
y=np.array(training_set.target),
num_epochs=None,
shuffle=True)
# Train
nn.train(input_fn=train_input_fn, steps=5000)
# Score accuracy
test_input_fn = tf.estimator.inputs.numpy_input_fn(
x={"x": np.array(test_set.data)},
y=np.array(test_set.target),
num_epochs=1,
shuffle=False)
ev = nn.evaluate(input_fn=test_input_fn)
print("Loss: %s" % ev["loss"])
print("Root Mean Squared Error: %s" % ev["rmse"])
```
Note: The above code uses input functions to feed feature (`x`) and label (`y`)
`Tensor`s into the model for both training (`train_input_fn`) and evaluation
(`test_input_fn`). To learn more about input functions, see the tutorial
@{$input_fn$Building Input Functions with tf.estimator}.
Then run the code. You should see output like the following:
```none
...
INFO:tensorflow:loss = 4.86658, step = 4701
INFO:tensorflow:loss = 4.86191, step = 4801
INFO:tensorflow:loss = 4.85788, step = 4901
...
INFO:tensorflow:Saving evaluation summary for 5000 step: loss = 5.581
Loss: 5.581
```
The loss score reported is the mean squared error returned from the `model_fn`
when run on the `ABALONE_TEST` data set.
To predict ages for the `ABALONE_PREDICT` data set, add the following to
`main()`:
```python
# Print out predictions
predict_input_fn = tf.estimator.inputs.numpy_input_fn(
x={"x": prediction_set.data},
num_epochs=1,
shuffle=False)
predictions = nn.predict(input_fn=predict_input_fn)
for i, p in enumerate(predictions):
print("Prediction %s: %s" % (i + 1, p["ages"]))
```
Here, the `predict()` function returns results in `predictions` as an iterable.
The `for` loop enumerates and prints out the results. Rerun the code, and you
should see output similar to the following:
```python
...
Prediction 1: 4.92229
Prediction 2: 10.3225
Prediction 3: 7.384
Prediction 4: 10.6264
Prediction 5: 11.0862
Prediction 6: 9.39239
Prediction 7: 11.1289
```
## Additional Resources
Congrats! You've successfully built a tf.estimator `Estimator` from scratch.
For additional reference materials on building `Estimator`s, see the following
sections of the API guides:
* @{$python/contrib.layers$Layers}
* @{tf.losses$Losses}
* @{$python/contrib.layers#optimization$Optimization}

View File

@@ -14,9 +14,6 @@ TensorFlow:
add support for your own shared or distributed filesystem.
* @{$new_data_formats$Custom Data Readers}, which details how to add support
for your own file and record formats.
* @{$extend/estimators$Creating Estimators in tf.contrib.learn}, which explains how
to write your own custom Estimator. For example, you could build your
own Estimator to implement some variation on standard linear regression.
Python is currently the only language supported by TensorFlow's API stability
promises. However, TensorFlow also provides functionality in C++, Java, and Go,

View File

@@ -3,6 +3,5 @@ architecture.md
adding_an_op.md
add_filesys.md
new_data_formats.md
estimators.md
language_bindings.md
tool_developers/index.md

View File

@ -1,5 +1,6 @@
# Creating Custom Estimators
This document introduces custom Estimators. In particular, this document
demonstrates how to create a custom @{tf.estimator.Estimator$Estimator} that
mimics the behavior of the pre-made Estimator
@@ -23,9 +24,9 @@ python custom_estimator.py
```
If you are feeling impatient, feel free to compare and contrast
[`custom_estimatr.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/custom_estimator.py)
[`custom_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/custom_estimator.py)
with
[`premade_estimatr.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/premade_estimator.py).
[`premade_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/premade_estimator.py).
(which is in the same directory).
@@ -105,7 +106,7 @@ This input function builds an input pipeline that yields batches of
## Create feature columns
As detailed in the @{$get_started/estimator$Premade Estimators} and
As detailed in the @{$get_started/premade_estimators$Premade Estimators} and
@{$get_started/feature_columns$Feature Columns} chapters, you must define
your model's feature columns to specify how the model should use each feature.
Whether working with pre-made Estimators or custom Estimators, you define

View File

@@ -75,7 +75,7 @@ Let's walk through the `train_input_fn()`.
In the simplest cases, the @{tf.data.Dataset.from_tensor_slices} function takes an
array and returns a @{tf.data.Dataset} representing slices of the array. For
example, an array containing the @{$mnist/beginners$mnist training data}
example, an array containing the @{$tutorials/layers$mnist training data}
has a shape of `(60000, 28, 28)`. Passing this to `from_tensor_slices` returns
a `Dataset` object containing 60000 slices, each one a 28x28 image.
@@ -228,7 +228,7 @@ features_result, labels_result = dataset.make_one_shot_iterator().get_next()
The result is a structure of @{$programmers_guide/tensors$TensorFlow tensors},
matching the layout of the items in the `Dataset`.
For an introduction to what these objects are and how to work with them,
see @{$get_started/get_started}.
see @{$programmers_guide/low_level_intro}.
``` python
print((features_result, labels_result))

View File

@@ -1,410 +0,0 @@
# tf.estimator Quickstart
TensorFlow's high-level machine learning API (tf.estimator) makes it easy to
configure, train, and evaluate a variety of machine learning models. In this
tutorial, you'll use tf.estimator to construct a
[neural network](https://en.wikipedia.org/wiki/Artificial_neural_network)
classifier and train it on the
[Iris data set](https://en.wikipedia.org/wiki/Iris_flower_data_set) to
predict flower species based on sepal/petal geometry. You'll write code to
perform the following five steps:
1. Load CSVs containing Iris training/test data into a TensorFlow `Dataset`
2. Construct a @{tf.estimator.DNNClassifier$neural network classifier}
3. Train the model using the training data
4. Evaluate the accuracy of the model
5. Classify new samples
NOTE: Remember to @{$install$install TensorFlow on your machine}
before getting started with this tutorial.
## Complete Neural Network Source Code
Here is the full code for the neural network classifier:
```python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
from six.moves.urllib.request import urlopen
import numpy as np
import tensorflow as tf
# Data sets
IRIS_TRAINING = "iris_training.csv"
IRIS_TRAINING_URL = "http://download.tensorflow.org/data/iris_training.csv"
IRIS_TEST = "iris_test.csv"
IRIS_TEST_URL = "http://download.tensorflow.org/data/iris_test.csv"
def main():
# If the training and test sets aren't stored locally, download them.
if not os.path.exists(IRIS_TRAINING):
raw = urlopen(IRIS_TRAINING_URL).read()
with open(IRIS_TRAINING, "wb") as f:
f.write(raw)
if not os.path.exists(IRIS_TEST):
raw = urlopen(IRIS_TEST_URL).read()
with open(IRIS_TEST, "wb") as f:
f.write(raw)
# Load datasets.
training_set = tf.contrib.learn.datasets.base.load_csv_with_header(
filename=IRIS_TRAINING,
target_dtype=np.int,
features_dtype=np.float32)
test_set = tf.contrib.learn.datasets.base.load_csv_with_header(
filename=IRIS_TEST,
target_dtype=np.int,
features_dtype=np.float32)
# Specify that all features have real-value data
feature_columns = [tf.feature_column.numeric_column("x", shape=[4])]
# Build 3 layer DNN with 10, 20, 10 units respectively.
classifier = tf.estimator.DNNClassifier(feature_columns=feature_columns,
hidden_units=[10, 20, 10],
n_classes=3,
model_dir="/tmp/iris_model")
# Define the training inputs
train_input_fn = tf.estimator.inputs.numpy_input_fn(
x={"x": np.array(training_set.data)},
y=np.array(training_set.target),
num_epochs=None,
shuffle=True)
# Train model.
classifier.train(input_fn=train_input_fn, steps=2000)
# Define the test inputs
test_input_fn = tf.estimator.inputs.numpy_input_fn(
x={"x": np.array(test_set.data)},
y=np.array(test_set.target),
num_epochs=1,
shuffle=False)
# Evaluate accuracy.
accuracy_score = classifier.evaluate(input_fn=test_input_fn)["accuracy"]
print("\nTest Accuracy: {0:f}\n".format(accuracy_score))
# Classify two new flower samples.
new_samples = np.array(
[[6.4, 3.2, 4.5, 1.5],
[5.8, 3.1, 5.0, 1.7]], dtype=np.float32)
predict_input_fn = tf.estimator.inputs.numpy_input_fn(
x={"x": new_samples},
num_epochs=1,
shuffle=False)
predictions = list(classifier.predict(input_fn=predict_input_fn))
predicted_classes = [p["classes"] for p in predictions]
print(
"New Samples, Class Predictions: {}\n"
.format(predicted_classes))
if __name__ == "__main__":
main()
```
The following sections walk through the code in detail.
## Load the Iris CSV data to TensorFlow
The [Iris data set](https://en.wikipedia.org/wiki/Iris_flower_data_set) contains
150 rows of data, comprising 50 samples from each of three related Iris species:
*Iris setosa*, *Iris virginica*, and *Iris versicolor*.
![Petal geometry compared for three iris species: Iris setosa, Iris virginica, and Iris versicolor](https://www.tensorflow.org/images/iris_three_species.jpg) **From left to right,
[*Iris setosa*](https://commons.wikimedia.org/w/index.php?curid=170298) (by
[Radomil](https://commons.wikimedia.org/wiki/User:Radomil), CC BY-SA 3.0),
[*Iris versicolor*](https://commons.wikimedia.org/w/index.php?curid=248095) (by
[Dlanglois](https://commons.wikimedia.org/wiki/User:Dlanglois), CC BY-SA 3.0),
and [*Iris virginica*](https://www.flickr.com/photos/33397993@N05/3352169862)
(by [Frank Mayfield](https://www.flickr.com/photos/33397993@N05), CC BY-SA
2.0).**
Each row contains the following data for each flower sample:
[sepal](https://en.wikipedia.org/wiki/Sepal) length, sepal width,
[petal](https://en.wikipedia.org/wiki/Petal) length, petal width, and flower
species. Flower species are represented as integers, with 0 denoting *Iris
setosa*, 1 denoting *Iris versicolor*, and 2 denoting *Iris virginica*.
Sepal Length | Sepal Width | Petal Length | Petal Width | Species
:----------- | :---------- | :----------- | :---------- | :-------
5.1 | 3.5 | 1.4 | 0.2 | 0
4.9 | 3.0 | 1.4 | 0.2 | 0
4.7 | 3.2 | 1.3 | 0.2 | 0
&hellip; | &hellip; | &hellip; | &hellip; | &hellip;
7.0 | 3.2 | 4.7 | 1.4 | 1
6.4 | 3.2 | 4.5 | 1.5 | 1
6.9 | 3.1 | 4.9 | 1.5 | 1
&hellip; | &hellip; | &hellip; | &hellip; | &hellip;
6.5 | 3.0 | 5.2 | 2.0 | 2
6.2 | 3.4 | 5.4 | 2.3 | 2
5.9 | 3.0 | 5.1 | 1.8 | 2
For this tutorial, the Iris data has been randomized and split into two separate
CSVs:
* A training set of 120 samples
([iris_training.csv](http://download.tensorflow.org/data/iris_training.csv))
* A test set of 30 samples
([iris_test.csv](http://download.tensorflow.org/data/iris_test.csv)).
To get started, first import all the necessary modules, and define where to
download and store the dataset:
```python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
from six.moves.urllib.request import urlopen
import tensorflow as tf
import numpy as np
IRIS_TRAINING = "iris_training.csv"
IRIS_TRAINING_URL = "http://download.tensorflow.org/data/iris_training.csv"
IRIS_TEST = "iris_test.csv"
IRIS_TEST_URL = "http://download.tensorflow.org/data/iris_test.csv"
```
Then, if the training and test sets aren't already stored locally, download
them.
```python
if not os.path.exists(IRIS_TRAINING):
raw = urlopen(IRIS_TRAINING_URL).read()
with open(IRIS_TRAINING,'wb') as f:
f.write(raw)
if not os.path.exists(IRIS_TEST):
raw = urlopen(IRIS_TEST_URL).read()
with open(IRIS_TEST,'wb') as f:
f.write(raw)
```
Next, load the training and test sets into `Dataset`s using the
[`load_csv_with_header()`](https://www.tensorflow.org/code/tensorflow/contrib/learn/python/learn/datasets/base.py)
method in `learn.datasets.base`. The `load_csv_with_header()` method takes three
required arguments:
* `filename`, which takes the filepath to the CSV file
* `target_dtype`, which takes the
[`numpy` datatype](http://docs.scipy.org/doc/numpy/user/basics.types.html)
of the dataset's target value.
* `features_dtype`, which takes the
[`numpy` datatype](http://docs.scipy.org/doc/numpy/user/basics.types.html)
of the dataset's feature values.
Here, the target (the value you're training the model to predict) is flower
species, which is an integer from 0&ndash;2, so the appropriate `numpy` datatype
is `np.int`:
```python
# Load datasets.
training_set = tf.contrib.learn.datasets.base.load_csv_with_header(
filename=IRIS_TRAINING,
target_dtype=np.int,
features_dtype=np.float32)
test_set = tf.contrib.learn.datasets.base.load_csv_with_header(
filename=IRIS_TEST,
target_dtype=np.int,
features_dtype=np.float32)
```
`Dataset`s in tf.contrib.learn are
[named tuples](https://docs.python.org/2/library/collections.html#collections.namedtuple);
you can access feature data and target values via the `data` and `target`
fields. Here, `training_set.data` and `training_set.target` contain the feature
data and target values for the training set, respectively, and `test_set.data`
and `test_set.target` contain feature data and target values for the test set.
Later on, in
["Fit the DNNClassifier to the Iris Training Data,"](#fit_the_dnnclassifier_to_the_iris_training_data)
you'll use `training_set.data` and
`training_set.target` to train your model, and in
["Evaluate Model Accuracy,"](#evaluate_model_accuracy) you'll use `test_set.data` and
`test_set.target`. But first, you'll construct your model in the next section.
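For example (a quick sketch, with shapes inferred from the 120-example training CSV and its four features), you can inspect these fields directly:
```python
print(training_set.data.shape)    # expected: (120, 4)
print(training_set.target.shape)  # expected: (120,)
```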
## Construct a Deep Neural Network Classifier
tf.estimator offers a variety of predefined models, called `Estimator`s, which
you can use "out of the box" to run training and evaluation operations on your
data.
Here, you'll configure a Deep Neural Network Classifier model to fit the Iris
data. Using tf.estimator, you can instantiate your
@{tf.estimator.DNNClassifier} with just a couple lines of code:
```python
# Specify that all features have real-value data
feature_columns = [tf.feature_column.numeric_column("x", shape=[4])]
# Build 3 layer DNN with 10, 20, 10 units respectively.
classifier = tf.estimator.DNNClassifier(feature_columns=feature_columns,
hidden_units=[10, 20, 10],
n_classes=3,
model_dir="/tmp/iris_model")
```
The code above first defines the model's feature columns, which specify the data
type for the features in the data set. All the feature data is continuous, so
`tf.feature_column.numeric_column` is the appropriate function to use to
construct the feature columns. There are four features in the data set (sepal
length, sepal width, petal length, and petal width), so accordingly `shape`
must be set to `[4]` to hold all the data.
Then, the code creates a `DNNClassifier` model using the following arguments:
* `feature_columns=feature_columns`. The set of feature columns defined above.
* `hidden_units=[10, 20, 10]`. Three
[hidden layers](http://stats.stackexchange.com/questions/181/how-to-choose-the-number-of-hidden-layers-and-nodes-in-a-feedforward-neural-netw),
containing 10, 20, and 10 neurons, respectively.
* `n_classes=3`. Three target classes, representing the three Iris species.
* `model_dir=/tmp/iris_model`. The directory in which TensorFlow will save
checkpoint data and TensorBoard summaries during model training.
## Describe the training input pipeline {#train-input}
The `tf.estimator` API uses input functions, which create the TensorFlow
operations that generate data for the model.
We can use `tf.estimator.inputs.numpy_input_fn` to produce the input pipeline:
```python
# Define the training inputs
train_input_fn = tf.estimator.inputs.numpy_input_fn(
x={"x": np.array(training_set.data)},
y=np.array(training_set.target),
num_epochs=None,
shuffle=True)
```
## Fit the DNNClassifier to the Iris Training Data {#fit-dnnclassifier}
Now that you've configured your DNN `classifier` model, you can fit it to the
Iris training data using the @{tf.estimator.Estimator.train$`train`} method.
Pass `train_input_fn` as the `input_fn`, and the number of steps to train
(here, 2000):
```python
# Train model.
classifier.train(input_fn=train_input_fn, steps=2000)
```
The state of the model is preserved in the `classifier`, which means you can
train iteratively if you like. For example, the above is equivalent to the
following:
```python
classifier.train(input_fn=train_input_fn, steps=1000)
classifier.train(input_fn=train_input_fn, steps=1000)
```
However, if you're looking to track the model while it trains, you'll likely
want to instead use a TensorFlow @{tf.train.SessionRunHook$`SessionRunHook`}
to perform logging operations.
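As a minimal sketch of that approach (this hook is not part of the original tutorial and only illustrates the `SessionRunHook` interface), you could print the global step periodically while training runs:
```python
class StepLoggerHook(tf.train.SessionRunHook):
  """Prints the global step every 100 training steps (illustrative only)."""

  def begin(self):
    self._global_step = tf.train.get_global_step()

  def before_run(self, run_context):
    return tf.train.SessionRunArgs(self._global_step)

  def after_run(self, run_context, run_values):
    step = run_values.results
    if step % 100 == 0:
      print("global step: %d" % step)

classifier.train(input_fn=train_input_fn, steps=2000, hooks=[StepLoggerHook()])
```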
## Evaluate Model Accuracy {#evaluate-accuracy}
You've trained your `DNNClassifier` model on the Iris training data; now, you
can check its accuracy on the Iris test data using the
@{tf.estimator.Estimator.evaluate$`evaluate`} method. Like `train`,
`evaluate` takes an input function that builds its input pipeline. `evaluate`
returns a `dict` with the evaluation results. The following code passes the
Iris test data&mdash;`test_set.data` and `test_set.target`&mdash;to `evaluate`
and prints the `accuracy` from the results:
```python
# Define the test inputs
test_input_fn = tf.estimator.inputs.numpy_input_fn(
x={"x": np.array(test_set.data)},
y=np.array(test_set.target),
num_epochs=1,
shuffle=False)
# Evaluate accuracy.
accuracy_score = classifier.evaluate(input_fn=test_input_fn)["accuracy"]
print("\nTest Accuracy: {0:f}\n".format(accuracy_score))
```
Note: The `num_epochs=1` argument to `numpy_input_fn` is important here.
`test_input_fn` will iterate over the data once, and then raise
`OutOfRangeError`. This error signals the classifier to stop evaluating, so it
will evaluate over the input once.
When you run the full script, it will print something close to:
```
Test Accuracy: 0.966667
```
Your accuracy result may vary a bit, but should be higher than 90%. Not bad for
a relatively small data set!
## Classify New Samples
Use the estimator's `predict()` method to classify new samples. For example, say
you have these two new flower samples:
Sepal Length | Sepal Width | Petal Length | Petal Width
:----------- | :---------- | :----------- | :----------
6.4 | 3.2 | 4.5 | 1.5
5.8 | 3.1 | 5.0 | 1.7
You can predict their species using the `predict()` method. `predict` returns a
generator of dicts, which can easily be converted to a list. The following code
retrieves and prints the class predictions:
```python
# Classify two new flower samples.
new_samples = np.array(
[[6.4, 3.2, 4.5, 1.5],
[5.8, 3.1, 5.0, 1.7]], dtype=np.float32)
predict_input_fn = tf.estimator.inputs.numpy_input_fn(
x={"x": new_samples},
num_epochs=1,
shuffle=False)
predictions = list(classifier.predict(input_fn=predict_input_fn))
predicted_classes = [p["classes"] for p in predictions]
print(
"New Samples, Class Predictions: {}\n"
.format(predicted_classes))
```
Your results should look as follows:
```
New Samples, Class Predictions: [1 2]
```
The model thus predicts that the first sample is *Iris versicolor*, and the
second sample is *Iris virginica*.
## Additional Resources
* To learn more about using tf.estimator to create linear models, see
@{$linear$Large-scale Linear Models with TensorFlow}.
* To build your own Estimator using tf.estimator APIs, check out
@{$extend/estimators$Creating Estimators}.
* To experiment with neural network modeling and visualization in the browser,
check out [Deep Playground](http://playground.tensorflow.org/).
* For more advanced tutorials on neural networks, see
@{$deep_cnn$Convolutional Neural Networks} and @{$recurrent$Recurrent Neural
Networks}.

View File

@@ -5,13 +5,13 @@ intermediaries between raw data and Estimators. Feature columns are very rich,
enabling you to transform a diverse range of raw data into formats that
Estimators can use, allowing easy experimentation.
In @{$get_started/estimator$Premade Estimators}, we used the premade Estimator,
@{tf.estimator.DNNClassifier$`DNNClassifier`} to train a model to predict
different types of Iris flowers from four input features. That example created
only numerical feature columns (of type @{tf.feature_column.numeric_column}).
Although numerical feature columns model the lengths of petals and sepals
effectively, real world data sets contain all kinds of features, many of which
are non-numerical.
In @{$get_started/premade_estimators$Premade Estimators}, we used the premade
Estimator, @{tf.estimator.DNNClassifier$`DNNClassifier`} to train a model to
predict different types of Iris flowers from four input features. That example
created only numerical feature columns (of type
@{tf.feature_column.numeric_column}). Although numerical feature columns model
the lengths of petals and sepals effectively, real world data sets contain all
kinds of features, many of which are non-numerical.
<div style="width:80%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%" src="../images/feature_columns/feature_cloud.jpg">

View File

@@ -1,480 +0,0 @@
# Getting Started With TensorFlow
This guide gets you started programming in TensorFlow. Before using this guide,
@{$install$install TensorFlow}. To get the most out of
this guide, you should know the following:
* How to program in Python.
* At least a little bit about arrays.
* Ideally, something about machine learning. However, if you know little or
nothing about machine learning, then this is still the first guide you
should read.
TensorFlow provides multiple APIs. The lowest level API--TensorFlow Core--
provides you with complete programming control. We recommend TensorFlow Core for
machine learning researchers and others who require fine levels of control over
their models. The higher level APIs are built on top of TensorFlow Core. These
higher level APIs are typically easier to learn and use than TensorFlow Core. In
addition, the higher level APIs make repetitive tasks easier and more consistent
between different users. A high-level API like tf.estimator helps you manage
data sets, estimators, training and inference.
This guide begins with a tutorial on TensorFlow Core. Later, we
demonstrate how to implement the same model in tf.estimator. Knowing
TensorFlow Core principles will give you a great mental model of how things are
working internally when you use the more compact higher level API.
# Tensors
The central unit of data in TensorFlow is the **tensor**. A tensor consists of a
set of primitive values shaped into an array of any number of dimensions. A
tensor's **rank** is its number of dimensions. Here are some examples of
tensors:
```python
3 # a rank 0 tensor; a scalar with shape []
[1., 2., 3.] # a rank 1 tensor; a vector with shape [3]
[[1., 2., 3.], [4., 5., 6.]] # a rank 2 tensor; a matrix with shape [2, 3]
[[[1., 2., 3.]], [[7., 8., 9.]]] # a rank 3 tensor with shape [2, 1, 3]
```
## TensorFlow Core tutorial
### Importing TensorFlow
The canonical import statement for TensorFlow programs is as follows:
```python
import tensorflow as tf
```
This gives Python access to all of TensorFlow's classes, methods, and symbols.
Most of the documentation assumes you have already done this.
### The Computational Graph
You might think of TensorFlow Core programs as consisting of two discrete
sections:
1. Building the computational graph.
2. Running the computational graph.
A **computational graph** is a series of TensorFlow operations arranged into a
graph of nodes.
Let's build a simple computational graph. Each node takes zero
or more tensors as inputs and produces a tensor as an output. One type of node
is a constant. Like all TensorFlow constants, it takes no inputs, and it outputs
a value it stores internally. We can create two floating point Tensors `node1`
and `node2` as follows:
```python
node1 = tf.constant(3.0, dtype=tf.float32)
node2 = tf.constant(4.0) # also tf.float32 implicitly
print(node1, node2)
```
The final print statement produces
```
Tensor("Const:0", shape=(), dtype=float32) Tensor("Const_1:0", shape=(), dtype=float32)
```
Notice that printing the nodes does not output the values `3.0` and `4.0` as you
might expect. Instead, they are nodes that, when evaluated, would produce 3.0
and 4.0, respectively. To actually evaluate the nodes, we must run the
computational graph within a **session**. A session encapsulates the control and
state of the TensorFlow runtime.
The following code creates a `Session` object and then invokes its `run` method
to run enough of the computational graph to evaluate `node1` and `node2`. By
running the computational graph in a session as follows:
```python
sess = tf.Session()
print(sess.run([node1, node2]))
```
we see the expected values of 3.0 and 4.0:
```
[3.0, 4.0]
```
We can build more complicated computations by combining `Tensor` nodes with
operations (Operations are also nodes). For example, we can add our two
constant nodes and produce a new graph as follows:
```python
from __future__ import print_function
node3 = tf.add(node1, node2)
print("node3:", node3)
print("sess.run(node3):", sess.run(node3))
```
The last two print statements produce
```
node3: Tensor("Add:0", shape=(), dtype=float32)
sess.run(node3): 7.0
```
TensorFlow provides a utility called TensorBoard that can display a picture of
the computational graph. Here is a screenshot showing how TensorBoard
visualizes the graph:
![TensorBoard screenshot](https://www.tensorflow.org/images/getting_started_add.png)
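As an aside (the log directory here is just an example path, not something from the original text), you can write the graph out yourself and then point TensorBoard at it with `tensorboard --logdir /tmp/getting_started`:
```python
# Write the current graph so TensorBoard can display it.
writer = tf.summary.FileWriter("/tmp/getting_started", sess.graph)
writer.close()
```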
As it stands, this graph is not especially interesting because it always
produces a constant result. A graph can be parameterized to accept external
inputs, known as **placeholders**. A **placeholder** is a promise to provide a
value later.
```python
a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
adder_node = a + b # + provides a shortcut for tf.add(a, b)
```
The preceding three lines are a bit like a function or a lambda in which we
define two input parameters (a and b) and then an operation on them. We can
evaluate this graph with multiple inputs by using the feed_dict argument to
the [run method](https://www.tensorflow.org/api_docs/python/tf/Session#run)
to feed concrete values to the placeholders:
```python
print(sess.run(adder_node, {a: 3, b: 4.5}))
print(sess.run(adder_node, {a: [1, 3], b: [2, 4]}))
```
resulting in the output
```
7.5
[ 3. 7.]
```
In TensorBoard, the graph looks like this:
![TensorBoard screenshot](https://www.tensorflow.org/images/getting_started_adder.png)
We can make the computational graph more complex by adding another operation.
For example,
```python
add_and_triple = adder_node * 3.
print(sess.run(add_and_triple, {a: 3, b: 4.5}))
```
produces the output
```
22.5
```
The preceding computational graph would look as follows in TensorBoard:
![TensorBoard screenshot](https://www.tensorflow.org/images/getting_started_triple.png)
In machine learning we will typically want a model that can take arbitrary
inputs, such as the one above. To make the model trainable, we need to be able
to modify the graph to get new outputs with the same input. **Variables** allow
us to add trainable parameters to a graph. They are constructed with a type and
initial value:
```python
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)
x = tf.placeholder(tf.float32)
linear_model = W*x + b
```
Constants are initialized when you call `tf.constant`, and their value can never
change. By contrast, variables are not initialized when you call `tf.Variable`.
To initialize all the variables in a TensorFlow program, you must explicitly
call a special operation as follows:
```python
init = tf.global_variables_initializer()
sess.run(init)
```
It is important to realize `init` is a handle to the TensorFlow sub-graph that
initializes all the global variables. Until we call `sess.run`, the variables
are uninitialized.
Since `x` is a placeholder, we can evaluate `linear_model` for several values of
`x` simultaneously as follows:
```python
print(sess.run(linear_model, {x: [1, 2, 3, 4]}))
```
to produce the output
```
[ 0. 0.30000001 0.60000002 0.90000004]
```
We've created a model, but we don't know how good it is yet. To evaluate the
model on training data, we need a `y` placeholder to provide the desired values,
and we need to write a loss function.
A loss function measures how far apart the
current model is from the provided data. We'll use a standard loss model for
linear regression, which sums the squares of the deltas between the current
model and the provided data. `linear_model - y` creates a vector where each
element is the corresponding example's error delta. We call `tf.square` to
square that error. Then, we sum all the squared errors to create a single scalar
that abstracts the error of all examples using `tf.reduce_sum`:
```python
y = tf.placeholder(tf.float32)
squared_deltas = tf.square(linear_model - y)
loss = tf.reduce_sum(squared_deltas)
print(sess.run(loss, {x: [1, 2, 3, 4], y: [0, -1, -2, -3]}))
```
producing the loss value
```
23.66
```
We could improve this manually by reassigning the values of `W` and `b` to the
perfect values of -1 and 1. A variable is initialized to the value provided to
`tf.Variable` but can be changed using operations like `tf.assign`. For example,
`W=-1` and `b=1` are the optimal parameters for our model. We can change `W` and
`b` accordingly:
```python
fixW = tf.assign(W, [-1.])
fixb = tf.assign(b, [1.])
sess.run([fixW, fixb])
print(sess.run(loss, {x: [1, 2, 3, 4], y: [0, -1, -2, -3]}))
```
The final print shows the loss now is zero.
```
0.0
```
We guessed the "perfect" values of `W` and `b`, but the whole point of machine
learning is to find the correct model parameters automatically. We will show
how to accomplish this in the next section.
## tf.train API
A complete discussion of machine learning is out of the scope of this tutorial.
However, TensorFlow provides **optimizers** that slowly change each variable in
order to minimize the loss function. The simplest optimizer is **gradient
descent**. It modifies each variable according to the magnitude of the
derivative of loss with respect to that variable. In general, computing symbolic
derivatives manually is tedious and error-prone. Consequently, TensorFlow can
automatically produce derivatives given only a description of the model using
the function `tf.gradients`. For simplicity, optimizers typically do this
for you. For example,
```python
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
```
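As an aside, here is a sketch of calling `tf.gradients` directly; it reuses the `loss`, `W`, `b`, `x`, `y`, and `sess` defined above, and the feed values match the earlier examples:
```python
# Symbolic derivatives of the loss with respect to each variable; this is the
# machinery that optimizer.minimize(loss) builds on.
grad_W, grad_b = tf.gradients(loss, [W, b])
print(sess.run([grad_W, grad_b], {x: [1, 2, 3, 4], y: [0, -1, -2, -3]}))
```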
```python
sess.run(init) # reset variables to incorrect defaults.
for i in range(1000):
sess.run(train, {x: [1, 2, 3, 4], y: [0, -1, -2, -3]})
print(sess.run([W, b]))
```
results in the final model parameters:
```
[array([-0.9999969], dtype=float32), array([ 0.99999082], dtype=float32)]
```
Now we have done actual machine learning! Although this simple linear
regression model does not require much TensorFlow core code, more complicated
models and methods to feed data into your models necessitate more code. Thus,
TensorFlow provides higher level abstractions for common patterns, structures,
and functionality. We will learn how to use some of these abstractions in the
next section.
### Complete program
The completed trainable linear regression model is shown here:
```python
import tensorflow as tf
# Model parameters
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)
# Model input and output
x = tf.placeholder(tf.float32)
linear_model = W*x + b
y = tf.placeholder(tf.float32)
# loss
loss = tf.reduce_sum(tf.square(linear_model - y)) # sum of the squares
# optimizer
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
# training data
x_train = [1, 2, 3, 4]
y_train = [0, -1, -2, -3]
# training loop
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init) # initialize variables with incorrect defaults.
for i in range(1000):
  sess.run(train, {x: x_train, y: y_train})
# evaluate training accuracy
curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x: x_train, y: y_train})
print("W: %s b: %s loss: %s"%(curr_W, curr_b, curr_loss))
```
When run, it produces
```
W: [-0.9999969] b: [ 0.99999082] loss: 5.69997e-11
```
Notice that the loss is a very small number (very close to zero). If you run
this program, your loss may not be exactly the same as the aforementioned loss
because the model is initialized with pseudorandom values.
This more complicated program can still be visualized in TensorBoard:
![TensorBoard final model visualization](https://www.tensorflow.org/images/getting_started_final.png)
## `tf.estimator`
`tf.estimator` is a high-level TensorFlow library that simplifies the
mechanics of machine learning, including the following:
* running training loops
* running evaluation loops
* managing data sets
tf.estimator defines many common models.
### Basic usage
Notice how much simpler the linear regression program becomes with
`tf.estimator`:
```python
# NumPy is often used to load, manipulate and preprocess data.
import numpy as np
import tensorflow as tf
# Declare list of features. We only have one numeric feature. There are many
# other types of columns that are more complicated and useful.
feature_columns = [tf.feature_column.numeric_column("x", shape=[1])]
# An estimator is the front end to invoke training (fitting) and evaluation
# (inference). There are many predefined types like linear regression,
# linear classification, and many neural network classifiers and regressors.
# The following code provides an estimator that does linear regression.
estimator = tf.estimator.LinearRegressor(feature_columns=feature_columns)
# TensorFlow provides many helper methods to read and set up data sets.
# Here we use two data sets: one for training and one for evaluation.
# We have to tell the function how many epochs of data (num_epochs) we want
# and how big each batch should be.
x_train = np.array([1., 2., 3., 4.])
y_train = np.array([0., -1., -2., -3.])
x_eval = np.array([2., 5., 8., 1.])
y_eval = np.array([-1.01, -4.1, -7., 0.])
input_fn = tf.estimator.inputs.numpy_input_fn(
    {"x": x_train}, y_train, batch_size=4, num_epochs=None, shuffle=True)
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    {"x": x_train}, y_train, batch_size=4, num_epochs=1000, shuffle=False)
eval_input_fn = tf.estimator.inputs.numpy_input_fn(
    {"x": x_eval}, y_eval, batch_size=4, num_epochs=1000, shuffle=False)
# We can invoke 1000 training steps by invoking the method and passing the
# training data set.
estimator.train(input_fn=input_fn, steps=1000)
# Here we evaluate how well our model did.
train_metrics = estimator.evaluate(input_fn=train_input_fn)
eval_metrics = estimator.evaluate(input_fn=eval_input_fn)
print("train metrics: %r"% train_metrics)
print("eval metrics: %r"% eval_metrics)
```
When run, it produces something like
```
train metrics: {'average_loss': 1.4833182e-08, 'global_step': 1000, 'loss': 5.9332727e-08}
eval metrics: {'average_loss': 0.0025353201, 'global_step': 1000, 'loss': 0.01014128}
```
Notice how our eval data has a higher loss, but it is still close to zero.
That means we are learning properly.
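Although not shown above, the trained estimator can also generate predictions through the same input-function mechanism. The following sketch uses a hypothetical `x_pred` array; `predict` returns a generator of per-example dicts:
```python
x_pred = np.array([6., 7.])
predict_input_fn = tf.estimator.inputs.numpy_input_fn(
    {"x": x_pred}, num_epochs=1, shuffle=False)
for pred in estimator.predict(input_fn=predict_input_fn):
  print(pred["predictions"])
```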
### A custom model
`tf.estimator` does not lock you into its predefined models. Suppose we
wanted to create a custom model that is not built into TensorFlow. We can still
retain the high level abstraction of data set, feeding, training, etc. of
`tf.estimator`. For illustration, we will show how to implement our own
equivalent model to `LinearRegressor` using our knowledge of the lower level
TensorFlow API.
To define a custom model that works with `tf.estimator`, we need to use
`tf.estimator.Estimator`. `tf.estimator.LinearRegressor` is actually
a sub-class of `tf.estimator.Estimator`. Instead of sub-classing
`Estimator`, we simply provide `Estimator` a function `model_fn` that tells
`tf.estimator` how it can evaluate predictions, training steps, and
loss. The code is as follows:
```python
import numpy as np
import tensorflow as tf
# Declare a list of features; we only have one real-valued feature.
def model_fn(features, labels, mode):
  # Build a linear model and predict values
  W = tf.get_variable("W", [1], dtype=tf.float64)
  b = tf.get_variable("b", [1], dtype=tf.float64)
  y = W*features['x'] + b
  # Loss sub-graph
  loss = tf.reduce_sum(tf.square(y - labels))
  # Training sub-graph
  global_step = tf.train.get_global_step()
  optimizer = tf.train.GradientDescentOptimizer(0.01)
  train = tf.group(optimizer.minimize(loss),
                   tf.assign_add(global_step, 1))
  # EstimatorSpec connects subgraphs we built to the
  # appropriate functionality.
  return tf.estimator.EstimatorSpec(
      mode=mode,
      predictions=y,
      loss=loss,
      train_op=train)
estimator = tf.estimator.Estimator(model_fn=model_fn)
# define our data sets
x_train = np.array([1., 2., 3., 4.])
y_train = np.array([0., -1., -2., -3.])
x_eval = np.array([2., 5., 8., 1.])
y_eval = np.array([-1.01, -4.1, -7., 0.])
input_fn = tf.estimator.inputs.numpy_input_fn(
    {"x": x_train}, y_train, batch_size=4, num_epochs=None, shuffle=True)
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    {"x": x_train}, y_train, batch_size=4, num_epochs=1000, shuffle=False)
eval_input_fn = tf.estimator.inputs.numpy_input_fn(
    {"x": x_eval}, y_eval, batch_size=4, num_epochs=1, shuffle=False)
# train
estimator.train(input_fn=input_fn, steps=1000)
# Here we evaluate how well our model did.
train_metrics = estimator.evaluate(input_fn=train_input_fn)
eval_metrics = estimator.evaluate(input_fn=eval_input_fn)
print("train metrics: %r"% train_metrics)
print("eval metrics: %r"% eval_metrics)
```
When run, it produces
```
train metrics: {'loss': 1.227995e-11, 'global_step': 1000}
eval metrics: {'loss': 0.01010036, 'global_step': 1000}
```
Notice how the contents of the custom `model_fn()` function are very similar
to our manual model training loop from the lower level API.
## Next steps
Now you have a working knowledge of the basics of TensorFlow. We have several
more tutorials that you can look at to learn more. If you are a beginner in
machine learning see @{$beginners$MNIST for beginners},
otherwise see @{$pros$Deep MNIST for experts}.

View File

@ -1,36 +1,35 @@
# Getting Started
For a brief overview of TensorFlow programming fundamentals, see the following
guide:
TensorFlow is a tool for machine learning. While it contains a wide range of
functionality, it is mainly designed for deep neural network models.
* @{$get_started/get_started$Getting Started with TensorFlow}
The fastest way to build a fully-featured model trained on your data is to use
TensorFlow's high-level API. In the following examples, we will use the
high-level API on the classic [Iris dataset](https://en.wikipedia.org/wiki/Iris_flower_data_set).
We will train a model that predicts what species a flower is based on its
characteristics, and along the way get a quick introduction to the basic tasks
in TensorFlow using Estimators.
MNIST has become the canonical dataset for trying out a new machine learning
toolkit. We offer three guides that each demonstrate a different approach
to training an MNIST model on TensorFlow:
This tutorial is divided into the following parts:
* @{$mnist/beginners$MNIST for ML Beginners}, which introduces MNIST through
the high-level API.
* @{$mnist/pros$Deep MNIST for Experts}, which is more in-depth than
"MNIST for ML Beginners," and assumes some familiarity with machine
learning concepts.
* @{$mnist/mechanics$TensorFlow Mechanics 101}, which introduces MNIST through
the low-level API.
* @{$get_started/premade_estimators}, which shows you
how to quickly set up prebuilt models to train on in-memory data.
* @{$get_started/checkpoints}, which shows you how to save training progress,
and resume where you left off.
* @{$get_started/feature_columns}, which shows how an
Estimator can handle a variety of input data types without changes to the
model.
* @{$get_started/datasets_quickstart}, which is a minimal introduction to
TensorFlow's input pipelines.
* @{$get_started/custom_estimators}, which demonstrates how
to build and train models you design yourself.
For developers new to TensorFlow, the high-level API is a good place to start.
To learn about the high-level API, read the following guides:
* @{$get_started/estimator$tf.estimator Quickstart}, which introduces this
API.
* @{$get_started/input_fn$Building Input Functions},
which takes you into a somewhat more sophisticated use of this API.
TensorBoard is a utility to visualize different aspects of machine learning.
The following guides explain how to use TensorBoard:
* @{$get_started/summaries_and_tensorboard$TensorBoard: Visualizing Learning},
which gets you started.
* @{$get_started/graph_viz$TensorBoard: Graph Visualization}, which explains
how to visualize the computational graph. Graph visualization is typically
more useful for programmers using the low-level API.
For more advanced users:
* The @{$low_level_intro$Low Level Introduction} demonstrates how to use
TensorFlow outside of the Estimator framework, for debugging and
experimentation.
* The remainder of the @{$programmers_guide$Programmer's Guide} contains
in-depth guides to various major components of TensorFlow.
* The @{$tutorials$Tutorials} provide walkthroughs of a variety of
TensorFlow models.

View File

@ -1,438 +0,0 @@
# Building Input Functions with tf.estimator
This tutorial introduces you to creating input functions in tf.estimator.
You'll get an overview of how to construct an `input_fn` to preprocess and feed
data into your models. Then, you'll implement an `input_fn` that feeds training,
evaluation, and prediction data into a neural network regressor for predicting
median house values.
## Custom Input Pipelines with input_fn
The `input_fn` is used to pass feature and target data to the `train`,
`evaluate`, and `predict` methods of the `Estimator`.
The user can do feature engineering or pre-processing inside the `input_fn`.
Here's an example taken from the @{$get_started/estimator$tf.estimator Quickstart tutorial}:
```python
import numpy as np
training_set = tf.contrib.learn.datasets.base.load_csv_with_header(
    filename=IRIS_TRAINING, target_dtype=np.int, features_dtype=np.float32)

train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": np.array(training_set.data)},
    y=np.array(training_set.target),
    num_epochs=None,
    shuffle=True)
classifier.train(input_fn=train_input_fn, steps=2000)
```
### Anatomy of an input_fn
The following code illustrates the basic skeleton for an input function:
```python
def my_input_fn():
  # Preprocess your data here...
  # ...then return 1) a mapping of feature columns to Tensors with
  # the corresponding feature data, and 2) a Tensor containing labels
  return feature_cols, labels
```
The body of the input function contains the specific logic for preprocessing
your input data, such as scrubbing out bad examples or
[feature scaling](https://en.wikipedia.org/wiki/Feature_scaling).
Input functions must return the following two values containing the final
feature and label data to be fed into your model (as shown in the above code
skeleton):
<dl>
<dt><code>feature_cols</code></dt>
<dd>A dict containing key/value pairs that map feature column
names to <code>Tensor</code>s (or <code>SparseTensor</code>s) containing the corresponding feature
data.</dd>
<dt><code>labels</code></dt>
<dd>A <code>Tensor</code> containing your label (target) values: the values your model aims to predict.</dd>
</dl>
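For instance, a minimal `input_fn` matching this skeleton might look like the following sketch; the toy constants are purely illustrative:
```python
import tensorflow as tf

def my_input_fn():
  # Toy in-memory data; a real input_fn would read and preprocess your dataset.
  feature_cols = {"x": tf.constant([[1.0], [2.0], [3.0]])}
  labels = tf.constant([[10.0], [20.0], [30.0]])
  return feature_cols, labels
```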
### Converting Feature Data to Tensors
If your feature/label data is a python array or stored in
[_pandas_](http://pandas.pydata.org/) dataframes or
[numpy](http://www.numpy.org/) arrays, you can use the following methods to
construct `input_fn`:
```python
import numpy as np
# numpy input_fn.
my_input_fn = tf.estimator.inputs.numpy_input_fn(
x={"x": np.array(x_data)},
y=np.array(y_data),
...)
```
```python
import pandas as pd
# pandas input_fn.
my_input_fn = tf.estimator.inputs.pandas_input_fn(
    x=pd.DataFrame({"x": x_data}),
    y=pd.Series(y_data),
    ...)
```
For [sparse, categorical data](https://en.wikipedia.org/wiki/Sparse_matrix)
(data where the majority of values are 0), you'll instead want to populate a
`SparseTensor`, which is instantiated with three arguments:
<dl>
<dt><code>dense_shape</code></dt>
<dd>The shape of the tensor. Takes a list indicating the number of elements in each dimension. For example, <code>dense_shape=[3,6]</code> specifies a two-dimensional 3x6 tensor, <code>dense_shape=[2,3,4]</code> specifies a three-dimensional 2x3x4 tensor, and <code>dense_shape=[9]</code> specifies a one-dimensional tensor with 9 elements.</dd>
<dt><code>indices</code></dt>
<dd>The indices of the elements in your tensor that contain nonzero values. Takes a list of terms, where each term is itself a list containing the index of a nonzero element. (Elements are zero-indexed—i.e., [0,0] is the index value for the element in the first column of the first row in a two-dimensional tensor.) For example, <code>indices=[[1,3], [2,4]]</code> specifies that the elements with indexes of [1,3] and [2,4] have nonzero values.</dd>
<dt><code>values</code></dt>
<dd>A one-dimensional tensor of values. Term <code>i</code> in <code>values</code> corresponds to term <code>i</code> in <code>indices</code> and specifies its value. For example, given <code>indices=[[1,3], [2,4]]</code>, the parameter <code>values=[18, 3.6]</code> specifies that element [1,3] of the tensor has a value of 18, and element [2,4] of the tensor has a value of 3.6.</dd>
</dl>
The following code defines a two-dimensional `SparseTensor` with 3 rows and 5
columns. The element with index [0,1] has a value of 6, and the element with
index [2,4] has a value of 0.5 (all other values are 0):
```python
sparse_tensor = tf.SparseTensor(indices=[[0,1], [2,4]],
                                values=[6, 0.5],
                                dense_shape=[3, 5])
```
This corresponds to the following dense tensor:
```none
[[0, 6, 0, 0, 0]
 [0, 0, 0, 0, 0]
 [0, 0, 0, 0, 0.5]]
```
For more on `SparseTensor`, see @{tf.SparseTensor}.
### Passing input_fn Data to Your Model
To feed data to your model for training, you simply pass the input function
you've created to your `train` operation as the value of the `input_fn`
parameter, e.g.:
```python
classifier.train(input_fn=my_input_fn, steps=2000)
```
Note that the `input_fn` parameter must receive a function object (i.e.,
`input_fn=my_input_fn`), not the return value of a function call
(`input_fn=my_input_fn()`). This means that if you try to pass parameters to the
`input_fn` in your `train` call, as in the following code, it will result in a
`TypeError`:
```python
classifier.train(input_fn=my_input_fn(training_set), steps=2000)
```
However, if you'd like to be able to parameterize your input function, there are
other methods for doing so. You can employ a wrapper function that takes no
arguments as your `input_fn` and use it to invoke your input function
with the desired parameters. For example:
```python
def my_input_fn(data_set):
  ...

def my_input_fn_training_set():
  return my_input_fn(training_set)

classifier.train(input_fn=my_input_fn_training_set, steps=2000)
```
Alternatively, you can use Python's [`functools.partial`](https://docs.python.org/2/library/functools.html#functools.partial)
function to construct a new function object with all parameter values fixed:
```python
import functools

classifier.train(
    input_fn=functools.partial(my_input_fn, data_set=training_set),
    steps=2000)
```
A third option is to wrap your `input_fn` invocation in a
[`lambda`](https://docs.python.org/3/tutorial/controlflow.html#lambda-expressions)
and pass it to the `input_fn` parameter:
```python
classifier.train(input_fn=lambda: my_input_fn(training_set), steps=2000)
```
One big advantage of designing your input pipeline as shown above—to accept a
parameter for data set—is that you can pass the same `input_fn` to `evaluate`
and `predict` operations by just changing the data set argument, e.g.:
```python
classifier.evaluate(input_fn=lambda: my_input_fn(test_set), steps=2000)
```
This approach enhances code maintainability: no need to define multiple
`input_fn` (e.g. `input_fn_train`, `input_fn_test`, `input_fn_predict`) for each
type of operation.
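For example, predictions could be generated the same way, assuming a `prediction_set` defined analogously to `test_set`:
```python
# Sketch only: predict() returns a generator of per-example results.
predictions = classifier.predict(input_fn=lambda: my_input_fn(prediction_set))
```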
Finally, you can use the methods in `tf.estimator.inputs` to create `input_fn`
from numpy or pandas data sets. The additional benefit is that you can use
more arguments, such as `num_epochs` and `shuffle` to control how the `input_fn`
iterates over the data:
```python
import pandas as pd
def get_input_fn_from_pandas(data_set, num_epochs=None, shuffle=True):
  return tf.estimator.inputs.pandas_input_fn(
      x=pd.DataFrame(...),
      y=pd.Series(...),
      num_epochs=num_epochs,
      shuffle=shuffle)
```
```python
import numpy as np
def get_input_fn_from_numpy(data_set, num_epochs=None, shuffle=True):
  return tf.estimator.inputs.numpy_input_fn(
      x={...},
      y=np.array(...),
      num_epochs=num_epochs,
      shuffle=shuffle)
```
### A Neural Network Model for Boston House Values
In the remainder of this tutorial, you'll write an input function for
preprocessing a subset of Boston housing data pulled from the UCI Housing Data
Set and use it to feed data to
a neural network regressor for predicting median house values.
The [Boston CSV data sets](#setup) you'll use to train your neural network
contain the following
[feature data](https://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.names)
for Boston suburbs:
Feature | Description
------- | ---------------------------------------------------------------
CRIM | Crime rate per capita
ZN | Fraction of residential land zoned to permit 25,000+ sq ft lots
INDUS | Fraction of land that is non-retail business
NOX | Concentration of nitric oxides in parts per 10 million
RM | Average Rooms per dwelling
AGE | Fraction of owner-occupied residences built before 1940
DIS | Distance to Boston-area employment centers
TAX | Property tax rate per $10,000
PTRATIO | Student-teacher ratio
And the label your model will predict is MEDV, the median value of
owner-occupied residences in thousands of dollars.
## Setup {#setup}
Download the following data sets:
[boston_train.csv](http://download.tensorflow.org/data/boston_train.csv),
[boston_test.csv](http://download.tensorflow.org/data/boston_test.csv), and
[boston_predict.csv](http://download.tensorflow.org/data/boston_predict.csv).
The following sections provide a step-by-step walkthrough of how to create an
input function, feed these data sets into a neural network regressor, train and
evaluate the model, and make house value predictions. The full, final code is [available
here](https://www.tensorflow.org/code/tensorflow/examples/tutorials/input_fn/boston.py).
### Importing the Housing Data
To start, set up your imports (including `pandas` and `tensorflow`) and set logging verbosity to
`INFO` for more detailed log output:
```python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import itertools
import pandas as pd
import tensorflow as tf
tf.logging.set_verbosity(tf.logging.INFO)
```
Define the column names for the data set in `COLUMNS`. To distinguish features
from the label, also define `FEATURES` and `LABEL`. Then read the three CSVs
([train](http://download.tensorflow.org/data/boston_train.csv),
[test](http://download.tensorflow.org/data/boston_test.csv), and
[predict](http://download.tensorflow.org/data/boston_predict.csv)) into _pandas_
`DataFrame`s:
```python
COLUMNS = ["crim", "zn", "indus", "nox", "rm", "age",
           "dis", "tax", "ptratio", "medv"]
FEATURES = ["crim", "zn", "indus", "nox", "rm",
            "age", "dis", "tax", "ptratio"]
LABEL = "medv"

training_set = pd.read_csv("boston_train.csv", skipinitialspace=True,
                           skiprows=1, names=COLUMNS)
test_set = pd.read_csv("boston_test.csv", skipinitialspace=True,
                       skiprows=1, names=COLUMNS)
prediction_set = pd.read_csv("boston_predict.csv", skipinitialspace=True,
                             skiprows=1, names=COLUMNS)
```
### Defining FeatureColumns and Creating the Regressor
Next, create a list of `FeatureColumn`s for the input data, which formally
specify the set of features to use for training. Because all features in the
housing data set contain continuous values, you can create their
`FeatureColumn`s using the `tf.feature_column.numeric_column()` function:
```python
feature_cols = [tf.feature_column.numeric_column(k) for k in FEATURES]
```
NOTE: For a more in-depth overview of feature columns, see
@{$linear#feature-columns-and-transformations$this introduction},
and for an example that illustrates how to define `FeatureColumns` for
categorical data, see the @{$wide$Linear Model Tutorial}.
Now, instantiate a `DNNRegressor` for the neural network regression model.
You'll need to provide two arguments here: `hidden_units`, a hyperparameter
specifying the number of nodes in each hidden layer (here, two hidden layers
with 10 nodes each), and `feature_columns`, containing the list of
`FeatureColumns` you just defined:
```python
regressor = tf.estimator.DNNRegressor(feature_columns=feature_cols,
                                      hidden_units=[10, 10],
                                      model_dir="/tmp/boston_model")
```
### Building the input_fn
To pass input data into the `regressor`, write a factory method that accepts a
_pandas_ `Dataframe` and returns an `input_fn`:
```python
def get_input_fn(data_set, num_epochs=None, shuffle=True):
  return tf.estimator.inputs.pandas_input_fn(
      x=pd.DataFrame({k: data_set[k].values for k in FEATURES}),
      y=pd.Series(data_set[LABEL].values),
      num_epochs=num_epochs,
      shuffle=shuffle)
```
Note that the input data is passed into `input_fn` in the `data_set` argument,
which means the function can process any of the `DataFrame`s you've imported:
`training_set`, `test_set`, and `prediction_set`.
Two additional arguments are provided:
* `num_epochs`: controls the number of
epochs to iterate over data. For training, set this to `None`, so the
`input_fn` keeps returning data until the required number of train steps is
reached. For evaluate and predict, set this to 1, so the `input_fn` will
iterate over the data once and then raise `OutOfRangeError`. That error will
signal the `Estimator` to stop evaluate or predict.
* `shuffle`: Whether to shuffle the data. For evaluate and predict, set this to
`False`, so the `input_fn` iterates over the data sequentially. For train,
set this to `True`.
### Training the Regressor
To train the neural network regressor, run `train` with the `training_set`
passed to the `input_fn` as follows:
```python
regressor.train(input_fn=get_input_fn(training_set), steps=5000)
```
You should see log output similar to the following, which reports training loss
for every 100 steps:
```none
INFO:tensorflow:Step 1: loss = 483.179
INFO:tensorflow:Step 101: loss = 81.2072
INFO:tensorflow:Step 201: loss = 72.4354
...
INFO:tensorflow:Step 1801: loss = 33.4454
INFO:tensorflow:Step 1901: loss = 32.3397
INFO:tensorflow:Step 2001: loss = 32.0053
...
INFO:tensorflow:Step 4801: loss = 27.2791
INFO:tensorflow:Step 4901: loss = 27.2251
INFO:tensorflow:Saving checkpoints for 5000 into /tmp/boston_model/model.ckpt.
INFO:tensorflow:Loss for final step: 27.1674.
```
### Evaluating the Model
Next, see how the trained model performs against the test data set. Run
`evaluate`, and this time pass the `test_set` to the `input_fn`:
```python
ev = regressor.evaluate(
    input_fn=get_input_fn(test_set, num_epochs=1, shuffle=False))
```
Retrieve the loss from the `ev` results and print it to output:
```python
loss_score = ev["loss"]
print("Loss: {0:f}".format(loss_score))
```
You should see results similar to the following:
```none
INFO:tensorflow:Eval steps [0,1) for training step 5000.
INFO:tensorflow:Saving evaluation summary for 5000 step: loss = 11.9221
Loss: 11.922098
```
### Making Predictions
Finally, you can use the model to predict median house values for the
`prediction_set`, which contains feature data but no labels for six examples:
```python
y = regressor.predict(
    input_fn=get_input_fn(prediction_set, num_epochs=1, shuffle=False))
# .predict() returns an iterator of dicts; convert to a list and print
# predictions
predictions = list(p["predictions"] for p in itertools.islice(y, 6))
print("Predictions: {}".format(str(predictions)))
```
Your results should contain six house-value predictions in thousands of dollars,
e.g.:
```none
Predictions: [ 33.30348587 17.04452896 22.56370163 34.74345398 14.55953979
19.58005714]
```
## Additional Resources
This tutorial focused on creating an `input_fn` for a neural network regressor.
To learn more about using `input_fn`s for other types of models, check out the
following resources:
* @{$linear$Large-scale Linear Models with TensorFlow}: This
introduction to linear models in TensorFlow provides a high-level overview
of feature columns and techniques for transforming input data.
* @{$wide$TensorFlow Linear Model Tutorial}: This tutorial covers
creating `FeatureColumn`s and an `input_fn` for a linear classification
model that predicts income range based on census data.
* @{$wide_and_deep$TensorFlow Wide & Deep Learning Tutorial}: Building on
the @{$wide$Linear Model Tutorial}, this tutorial covers
`FeatureColumn` and `input_fn` creation for a "wide and deep" model that
combines a linear model and a neural network using
`DNNLinearCombinedClassifier`.

View File

@ -1,10 +1,6 @@
index.md
get_started.md
mnist/beginners.md
mnist/pros.md
mnist/mechanics.md
estimator.md
input_fn.md
summaries_and_tensorboard.md
graph_viz.md
tensorboard_histograms.md
premade_estimators.md
checkpoints.md
feature_columns.md
datasets_quickstart.md
custom_estimators.md

View File

@ -1,454 +0,0 @@
# MNIST For ML Beginners
*This tutorial is intended for readers who are new to both machine learning and
TensorFlow. If you already know what MNIST is, and what softmax (multinomial
logistic) regression is, you might prefer this
@{$pros$faster paced tutorial}. Be sure to
@{$install$install TensorFlow} before starting either
tutorial.*
When one learns how to program, there's a tradition that the first thing you do
is print "Hello World." Just like programming has Hello World, machine learning
has MNIST.
MNIST is a simple computer vision dataset. It consists of images of handwritten
digits like these:
<div style="width:40%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%" src="https://www.tensorflow.org/images/MNIST.png">
</div>
It also includes labels for each image, telling us which digit it is. For
example, the labels for the above images are 5, 0, 4, and 1.
In this tutorial, we're going to train a model to look at images and predict
what digits they are. Our goal isn't to train a really elaborate model that
achieves state-of-the-art performance -- although we'll give you code to do that
later! -- but rather to dip a toe into using TensorFlow. As such, we're going
to start with a very simple model, called a Softmax Regression.
The actual code for this tutorial is very short, and all the interesting
stuff happens in just three lines. However, it is very
important to understand the ideas behind it: both how TensorFlow works and the
core machine learning concepts. Because of this, we are going to very carefully
work through the code.
## About this tutorial
This tutorial is an explanation, line by line, of what is happening in the
[mnist_softmax.py](https://www.tensorflow.org/code/tensorflow/examples/tutorials/mnist/mnist_softmax.py) code.
You can use this tutorial in a few different ways, including:
- Copy and paste each code snippet, line by line, into a Python environment as
you read through the explanations of each line.
- Run the entire `mnist_softmax.py` Python file either before or after reading
through the explanations, and use this tutorial to understand the lines of
code that aren't clear to you.
What we will accomplish in this tutorial:
- Learn about the MNIST data and softmax regressions
- Create a function that is a model for recognizing digits, based on looking at
every pixel in the image
- Use TensorFlow to train the model to recognize digits by having it "look" at
thousands of examples (and run our first TensorFlow session to do so)
- Check the model's accuracy with our test data
## The MNIST Data
The MNIST data is hosted on
[Yann LeCun's website](http://yann.lecun.com/exdb/mnist/). If you are copying and
pasting in the code from this tutorial, start here with these two lines of code
which will download and read in the data automatically:
```python
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
```
The MNIST data is split into three parts: 55,000 data points of training
data (`mnist.train`), 10,000 points of test data (`mnist.test`), and 5,000
points of validation data (`mnist.validation`). This split is very important:
it's essential in machine learning that we have separate data which we don't
learn from so that we can make sure that what we've learned actually
generalizes!
As mentioned earlier, every MNIST data point has two parts: an image of a
handwritten digit and a corresponding label. We'll call the images "x"
and the labels "y". Both the training set and test set contain images and their
corresponding labels; for example the training images are `mnist.train.images`
and the training labels are `mnist.train.labels`.
Each image is 28 pixels by 28 pixels. We can interpret this as a big array of
numbers:
<div style="width:50%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%" src="https://www.tensorflow.org/images/MNIST-Matrix.png">
</div>
We can flatten this array into a vector of 28x28 = 784 numbers. It doesn't
matter how we flatten the array, as long as we're consistent between images.
From this perspective, the MNIST images are just a bunch of points in a
784-dimensional vector space, with a
[very rich structure](https://colah.github.io/posts/2014-10-Visualizing-MNIST/)
(warning: computationally intensive visualizations).
Flattening the data throws away information about the 2D structure of the image.
Isn't that bad? Well, the best computer vision methods do exploit this
structure, and we will in later tutorials. But the simple method we will be
using here, a softmax regression (defined below), won't.
The result is that `mnist.train.images` is a tensor (an n-dimensional array)
with a shape of `[55000, 784]`. The first dimension is an index into the list
of images and the second dimension is the index for each pixel in each image.
Each entry in the tensor is a pixel intensity between 0 and 1, for a particular
pixel in a particular image.
<div style="width:40%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%" src="https://www.tensorflow.org/images/mnist-train-xs.png">
</div>
Each image in MNIST has a corresponding label, a number between 0 and 9
representing the digit drawn in the image.
For the purposes of this tutorial, we're going to want our labels as "one-hot
vectors". A one-hot vector is a vector which is 0 in most dimensions, and 1 in a
single dimension. In this case, the \\(n\\)th digit will be represented as a
vector which is 1 in the \\(n\\)th dimension. For example, 3 would be
\\([0,0,0,1,0,0,0,0,0,0]\\). Consequently, `mnist.train.labels` is a
`[55000, 10]` array of floats.
<div style="width:40%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%" src="https://www.tensorflow.org/images/mnist-train-ys.png">
</div>
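If you want to confirm these shapes yourself, a quick check along the following lines (not part of the tutorial code) should agree with the description above:
```python
print(mnist.train.images.shape)  # (55000, 784)
print(mnist.train.labels.shape)  # (55000, 10), because we requested one_hot=True
print(mnist.train.images.min(), mnist.train.images.max())  # pixel values in [0, 1]
```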
We're now ready to actually make our model!
## Softmax Regressions
We know that every image in MNIST is of a handwritten digit between zero and
nine. So there are only ten possible things that a given image can be. We want
to be able to look at an image and give the probabilities for it being each
digit. For example, our model might look at a picture of a nine and be 80% sure
it's a nine, but give a 5% chance to it being an eight (because of the top loop)
and a bit of probability to all the others because it isn't 100% sure.
This is a classic case where a softmax regression is a natural, simple model.
If you want to assign probabilities to an object being one of several different
things, softmax is the thing to do, because softmax gives us a list of values
between 0 and 1 that add up to 1. Even later on, when we train more sophisticated
models, the final step will be a layer of softmax.
A softmax regression has two steps: first we add up the evidence of our input
being in certain classes, and then we convert that evidence into probabilities.
To tally up the evidence that a given image is in a particular class, we do a
weighted sum of the pixel intensities. The weight is negative if that pixel
having a high intensity is evidence against the image being in that class, and
positive if it is evidence in favor.
The following diagram shows the weights one model learned for each of these
classes. Red represents negative weights, while blue represents positive
weights.
<div style="width:40%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%" src="https://www.tensorflow.org/images/softmax-weights.png">
</div>
We also add some extra evidence called a bias. Basically, we want to be able
to say that some things are more likely independent of the input. The result is
that the evidence for a class \\(i\\) given an input \\(x\\) is:
$$\text{evidence}_i = \sum_j W_{i,~ j} x_j + b_i$$
where \\(W_i\\) is the weights and \\(b_i\\) is the bias for class \\(i\\),
and \\(j\\) is an index for summing over the pixels in our input image \\(x\\).
We then convert the evidence tallies into our predicted probabilities
\\(y\\) using the "softmax" function:
$$y = \text{softmax}(\text{evidence})$$
Here softmax is serving as an "activation" or "link" function, shaping
the output of our linear function into the form we want -- in this case, a
probability distribution over 10 cases.
You can think of it as converting tallies
of evidence into probabilities of our input being in each class.
It's defined as:
$$\text{softmax}(evidence) = \text{normalize}(\exp(evidence))$$
If you expand that equation out, you get:
$$\text{softmax}(evidence)_i = \frac{\exp(evidence_i)}{\sum_j \exp(evidence_j)}$$
But it's often more helpful to think of softmax the first way: exponentiating
its inputs and then normalizing them. The exponentiation means that one more
unit of evidence increases the weight given to any hypothesis multiplicatively.
And conversely, having one less unit of evidence means that a hypothesis gets a
fraction of its earlier weight. No hypothesis ever has zero or negative
weight. Softmax then normalizes these weights, so that they add up to one,
forming a valid probability distribution. (To get more intuition about the
softmax function, check out the
[section](http://neuralnetworksanddeeplearning.com/chap3.html#softmax) on it in
Michael Nielsen's book, complete with an interactive visualization.)
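To make the exponentiate-then-normalize picture concrete, here is a small NumPy sketch of softmax (separate from the TensorFlow model built below):
```python
import numpy as np

def softmax(evidence):
  # Exponentiate, then normalize so the outputs sum to 1.
  exps = np.exp(evidence)
  return exps / np.sum(exps)

print(softmax(np.array([2.0, 1.0, 0.1])))  # roughly [0.66, 0.24, 0.10]
```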
You can picture our softmax regression as looking something like the following,
although with a lot more \\(x\\)s. For each output, we compute a weighted sum of
the \\(x\\)s, add a bias, and then apply softmax.
<div style="width:55%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%" src="https://www.tensorflow.org/images/softmax-regression-scalargraph.png">
</div>
If we write that out as equations, we get:
<div style="width:52%; margin-left:25%; margin-bottom:10px; margin-top:20px;">
<img style="width:100%" src="https://www.tensorflow.org/images/softmax-regression-scalarequation.png"
alt="[y1, y2, y3] = softmax(W11*x1 + W12*x2 + W13*x3 + b1, W21*x1 + W22*x2 + W23*x3 + b2, W31*x1 + W32*x2 + W33*x3 + b3)">
</div>
We can "vectorize" this procedure, turning it into a matrix multiplication
and vector addition. This is helpful for computational efficiency. (It's also
a useful way to think.)
<div style="width:50%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%" src="https://www.tensorflow.org/images/softmax-regression-vectorequation.png"
alt="[y1, y2, y3] = softmax([[W11, W12, W13], [W21, W22, W23], [W31, W32, W33]]*[x1, x2, x3] + [b1, b2, b3])">
</div>
More compactly, we can just write:
$$y = \text{softmax}(Wx + b)$$
Now let's turn that into something that TensorFlow can use.
## Implementing the Regression
To do efficient numerical computing in Python, we typically use libraries like
[NumPy](http://www.numpy.org) that do expensive operations such as matrix
multiplication outside Python, using highly efficient code implemented in
another language. Unfortunately, there can still be a lot of overhead from
switching back to Python every operation. This overhead is especially bad if you
want to run computations on GPUs or in a distributed manner, where there can be
a high cost to transferring data.
TensorFlow also does its heavy lifting outside Python, but it takes things a
step further to avoid this overhead. Instead of running a single expensive
operation independently from Python, TensorFlow lets us describe a graph of
interacting operations that run entirely outside Python. (Approaches like this
can be seen in a few machine learning libraries.)
To use TensorFlow, first we need to import it.
```python
import tensorflow as tf
```
We describe these interacting operations by manipulating symbolic variables.
Let's create one:
```python
x = tf.placeholder(tf.float32, [None, 784])
```
`x` isn't a specific value. It's a `placeholder`, a value that we'll input when
we ask TensorFlow to run a computation. We want to be able to input any number
of MNIST images, each flattened into a 784-dimensional vector. We represent
this as a 2-D tensor of floating-point numbers, with a shape `[None, 784]`.
(Here `None` means that a dimension can be of any length.)
We also need the weights and biases for our model. We could imagine treating
these like additional inputs, but TensorFlow has an even better way to handle
it: `Variable`. A `Variable` is a modifiable tensor that lives in TensorFlow's
graph of interacting operations. It can be used and even modified by the
computation. For machine learning applications, one generally has the model
parameters be `Variable`s.
```python
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
```
We create these `Variable`s by giving `tf.Variable` the initial value of the
`Variable`: in this case, we initialize both `W` and `b` as tensors full of
zeros. Since we are going to learn `W` and `b`, it doesn't matter very much
what they initially are.
Notice that `W` has a shape of [784, 10] because we want to multiply the
784-dimensional image vectors by it to produce 10-dimensional vectors of
evidence for the different classes. `b` has a shape of [10] so we can add it
to the output.
We can now implement our model. It only takes one line to define it!
```python
y = tf.nn.softmax(tf.matmul(x, W) + b)
```
First, we multiply `x` by `W` with the expression `tf.matmul(x, W)`. This is
flipped from when we multiplied them in our equation, where we had \\(Wx\\), as
a small trick to deal with `x` being a 2D tensor with multiple inputs. We then
add `b`, and finally apply `tf.nn.softmax`.
That's it. It only took us one line to define our model, after a couple short
lines of setup. That isn't because TensorFlow is designed to make a softmax
regression particularly easy: it's just a very flexible way to describe many
kinds of numerical computations, from machine learning models to physics
simulations. And once defined, our model can be run on different devices:
your computer's CPU, GPUs, and even phones!
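As an optional sanity check (not part of the tutorial code), the shapes line up as described:
```python
# x: [None, 784], W: [784, 10], so tf.matmul(x, W) has shape [None, 10];
# adding b (shape [10]) broadcasts across the batch.
print(y.shape)  # (?, 10) -- unknown batch size, 10 classes
```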
## Training
In order to train our model, we need to define what it means for the model to be
good. Well, actually, in machine learning we typically define what it means for
a model to be bad. We call this the cost, or the loss, and it represents how far
off our model is from our desired outcome. We try to minimize that error, and
the smaller the error margin, the better our model is.
One very common, very nice function to determine the loss of a model is called
"cross-entropy." Cross-entropy arises from thinking about information
compressing codes in information theory but it winds up being an important idea
in lots of areas, from gambling to machine learning. It's defined as:
$$H_{y'}(y) = -\sum_i y'_i \log(y_i)$$
Where \\(y\\) is our predicted probability distribution, and \\(y'\\) is the true
distribution (the one-hot vector with the digit labels). In some rough sense, the
cross-entropy is measuring how inefficient our predictions are for describing
the truth. Going into more detail about cross-entropy is beyond the scope of
this tutorial, but it's well worth
[understanding](https://colah.github.io/posts/2015-09-Visual-Information).
To implement cross-entropy we need to first add a new placeholder to input the
correct answers:
```python
y_ = tf.placeholder(tf.float32, [None, 10])
```
Then we can implement the cross-entropy function, \\(-\sum y'\log(y)\\):
```python
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
```
First, `tf.log` computes the logarithm of each element of `y`. Next, we multiply
each element of `y_` with the corresponding element of `tf.log(y)`. Then
`tf.reduce_sum` adds the elements in the second dimension of y, due to the
`reduction_indices=[1]` parameter. Finally, `tf.reduce_mean` computes the mean
over all the examples in the batch.
Note that in the source code, we don't use this formulation, because it is
numerically unstable. Instead, we apply
`tf.losses.sparse_softmax_cross_entropy` on the unnormalized logits (e.g., we
call `sparse_softmax_cross_entropy` on the output of `tf.matmul(x, W) + b`),
because this more numerically stable function internally computes the softmax
activation.
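For reference, a sketch of that more stable formulation is shown below. It takes integer class labels (here a hypothetical `y_int` placeholder) rather than one-hot vectors, and is included only for illustration; the rest of this tutorial continues with the explicit formula above:
```python
logits = tf.matmul(x, W) + b
y_int = tf.placeholder(tf.int64, [None])  # integer labels, e.g. 0 through 9
stable_cross_entropy = tf.losses.sparse_softmax_cross_entropy(
    labels=y_int, logits=logits)
```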
Now that we know what we want our model to do, it's very easy to have TensorFlow
train it to do so. Because TensorFlow knows the entire graph of your
computations, it can automatically use the
[backpropagation algorithm](https://colah.github.io/posts/2015-08-Backprop) to
efficiently determine how your variables affect the loss you ask it to
minimize. Then it can apply your choice of optimization algorithm to modify the
variables and reduce the loss.
```python
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
```
In this case, we ask TensorFlow to minimize `cross_entropy` using the
[gradient descent algorithm](https://en.wikipedia.org/wiki/Gradient_descent)
with a learning rate of 0.5. Gradient descent is a simple procedure, where
TensorFlow simply shifts each variable a little bit in the direction that
reduces the cost. But TensorFlow also provides
@{$python/train#Optimizers$many other optimization algorithms}:
using one is as simple as tweaking one line.
What TensorFlow actually does here, behind the scenes, is to add new operations
to your graph which implement backpropagation and gradient descent. Then it
gives you back a single operation which, when run, does a step of gradient
descent training, slightly tweaking your variables to reduce the loss.
We can now launch the model in an `InteractiveSession`:
```python
sess = tf.InteractiveSession()
```
We first have to create an operation to initialize the variables we created:
```python
tf.global_variables_initializer().run()
```
Let's train -- we'll run the training step 1000 times!
```python
for _ in range(1000):
  batch_xs, batch_ys = mnist.train.next_batch(100)
  sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
```
Each step of the loop, we get a "batch" of one hundred random data points from
our training set. We run `train_step`, feeding in the batch data to replace
the `placeholder`s.
Using small batches of random data is called stochastic training -- in this
case, stochastic gradient descent. Ideally, we'd like to use all our data for
every step of training because that would give us a better sense of what we
should be doing, but that's expensive. So, instead, we use a different subset
every time. Doing this is cheap and has much of the same benefit.
## Evaluating Our Model
How well does our model do?
Well, first let's figure out where we predicted the correct label. `tf.argmax`
is an extremely useful function which gives you the index of the highest entry
in a tensor along some axis. For example, `tf.argmax(y,1)` is the label our
model thinks is most likely for each input, while `tf.argmax(y_,1)` is the
correct label. We can use `tf.equal` to check if our prediction matches the
truth.
```python
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
```
That gives us a list of booleans. To determine what fraction are correct, we
cast to floating point numbers and then take the mean. For example,
`[True, False, True, True]` would become `[1,0,1,1]` which would become `0.75`.
```python
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
```
Finally, we ask for our accuracy on our test data.
```python
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
```
This should be about 92%.
Is that good? Well, not really. In fact, it's pretty bad. This is because we're
using a very simple model. With some small changes, we can get to 97%. The best
models can get to over 99.7% accuracy! (For more information, have a look at
this
[list of results](https://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results).)
What matters is that we learned from this model. Still, if you're feeling a bit
down about these results, check out
@{$pros$the next tutorial} where we do a lot
better, and learn how to build more sophisticated models using TensorFlow!

View File

@ -1,484 +0,0 @@
# TensorFlow Mechanics 101
Code: [tensorflow/examples/tutorials/mnist/](https://www.tensorflow.org/code/tensorflow/examples/tutorials/mnist/)
The goal of this tutorial is to show how to use TensorFlow to train and
evaluate a simple feed-forward neural network for handwritten digit
classification using the (classic) MNIST data set. The intended audience for
this tutorial is experienced machine learning users interested in using
TensorFlow.
These tutorials are not intended for teaching Machine Learning in general.
Please ensure you have followed the instructions to
@{$install$install TensorFlow}.
## Tutorial Files
This tutorial references the following files:
File | Purpose
--- | ---
[`mnist.py`](https://www.tensorflow.org/code/tensorflow/examples/tutorials/mnist/mnist.py) | The code to build a fully-connected MNIST model.
[`fully_connected_feed.py`](https://www.tensorflow.org/code/tensorflow/examples/tutorials/mnist/fully_connected_feed.py) | The main code to train the built MNIST model against the downloaded dataset using a feed dictionary.
Simply run the `fully_connected_feed.py` file directly to start training:
```bash
python fully_connected_feed.py
```
## Prepare the Data
MNIST is a classic problem in machine learning. The problem is to look at
greyscale 28x28 pixel images of handwritten digits and determine which digit
the image represents, for all the digits from zero to nine.
![MNIST Digits](https://www.tensorflow.org/images/mnist_digits.png "MNIST Digits")
For more information, refer to [Yann LeCun's MNIST page](http://yann.lecun.com/exdb/mnist/)
or [Chris Olah's visualizations of MNIST](http://colah.github.io/posts/2014-10-Visualizing-MNIST/).
### Download
At the top of the `run_training()` method, the `input_data.read_data_sets()`
function will ensure that the correct data has been downloaded to your local
training folder and then unpack that data to return a dictionary of `DataSet`
instances.
```python
data_sets = input_data.read_data_sets(FLAGS.input_data_dir, FLAGS.fake_data)
```
**NOTE**: The `fake_data` flag is used for unit-testing purposes and may be
safely ignored by the reader.
Dataset | Purpose
--- | ---
`data_sets.train` | 55000 images and labels, for primary training.
`data_sets.validation` | 5000 images and labels, for iterative validation of training accuracy.
`data_sets.test` | 10000 images and labels, for final testing of trained accuracy.
### Inputs and Placeholders
The `placeholder_inputs()` function creates two @{tf.placeholder}
ops that define the shape of the inputs, including the `batch_size`, to the
rest of the graph and into which the actual training examples will be fed.
```python
images_placeholder = tf.placeholder(tf.float32, shape=(batch_size,
                                                       mnist.IMAGE_PIXELS))
labels_placeholder = tf.placeholder(tf.int32, shape=(batch_size))
```
Further down, in the training loop, the full image and label datasets are
sliced to fit the `batch_size` for each step, matched with these placeholder
ops, and then passed into the `sess.run()` function using the `feed_dict`
parameter.
## Build the Graph
After creating placeholders for the data, the graph is built from the
`mnist.py` file according to a 3-stage pattern: `inference()`, `loss()`, and
`training()`.
1. `inference()` - Builds the graph as far as required for running
the network forward to make predictions.
1. `loss()` - Adds to the inference graph the ops required to generate
loss.
1. `training()` - Adds to the loss graph the ops required to compute
and apply gradients.
<div style="width:95%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%" src="https://www.tensorflow.org/images/mnist_subgraph.png">
</div>
### Inference
The `inference()` function builds the graph as far as needed to
return the tensor that would contain the output predictions.
It takes the images placeholder as input and builds on top
of it a pair of fully connected layers with [ReLU](https://en.wikipedia.org/wiki/Rectifier_(neural_networks)) activation followed by a ten
node linear layer specifying the output logits.
Each layer is created beneath a unique @{tf.name_scope}
that acts as a prefix to the items created within that scope.
```python
with tf.name_scope('hidden1'):
```
Within the defined scope, the weights and biases to be used by each of these
layers are generated into @{tf.Variable}
instances, with their desired shapes:
```python
weights = tf.Variable(
    tf.truncated_normal([IMAGE_PIXELS, hidden1_units],
                        stddev=1.0 / math.sqrt(float(IMAGE_PIXELS))),
    name='weights')
biases = tf.Variable(tf.zeros([hidden1_units]),
                     name='biases')
```
When, for instance, these are created under the `hidden1` scope, the unique
name given to the weights variable would be "`hidden1/weights`".
Each variable is given an initializer op as part of its construction.
In this most common case, the weights are initialized with
@{tf.truncated_normal} and given the shape of a 2-D tensor where
the first dim represents the number of units in the layer from which the
weights connect and the second dim represents the number of
units in the layer to which the weights connect. For the first layer, named
`hidden1`, the dimensions are `[IMAGE_PIXELS, hidden1_units]` because the
weights are connecting the image inputs to the hidden1 layer. The
`tf.truncated_normal` initializer generates values from a truncated normal
distribution with the given mean and standard deviation.
Then the biases are initialized with @{tf.zeros}
to ensure they start with all zero values, and their shape is simply the number
of units in the layer to which they connect.
The graph's three primary ops -- two @{tf.nn.relu}
ops wrapping @{tf.matmul}
for the hidden layers and one extra `tf.matmul` for the logits -- are then
created, each in turn, with separate `tf.Variable` instances connected to each
of the input placeholders or the output tensors of the previous layer.
```python
hidden1 = tf.nn.relu(tf.matmul(images, weights) + biases)
```
```python
hidden2 = tf.nn.relu(tf.matmul(hidden1, weights) + biases)
```
```python
logits = tf.matmul(hidden2, weights) + biases
```
Finally, the `logits` tensor that will contain the output is returned.
### Loss
The `loss()` function further builds the graph by adding the required loss
ops.
First, the values from the `labels_placeholder` are converted to 64-bit
integers. Then, a @{tf.losses.sparse_softmax_cross_entropy} op is used to
calculate the batch's average cross entropy, of the `inference()` result,
compared to the labels.
```python
labels = tf.to_int64(labels)
cross_entropy = tf.losses.sparse_softmax_cross_entropy(
    labels=labels, logits=logits)
```
And the tensor that will then contain the loss value is returned.
> Note: Cross-entropy is an idea from information theory that allows us
> to describe how bad it is to believe the predictions of the neural network,
> given what is actually true. For more information, read the blog post Visual
> Information Theory (http://colah.github.io/posts/2015-09-Visual-Information/)
### Training
The `training()` function adds the operations needed to minimize the loss via
[Gradient Descent](https://en.wikipedia.org/wiki/Gradient_descent).
Firstly, it takes the loss tensor from the `loss()` function and hands it to a
@{tf.summary.scalar},
an op for generating summary values into the events file when used with a
@{tf.summary.FileWriter} (see below). In this case, it will emit the snapshot value of
the loss every time the summaries are written out.
```python
tf.summary.scalar('loss', loss)
```
Next, we instantiate a @{tf.train.GradientDescentOptimizer}
responsible for applying gradients with the requested learning rate.
```python
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
```
We then generate a single variable to contain a counter for the global
training step and the @{tf.train.Optimizer.minimize}
op is used to both update the trainable weights in the system and increment the
global step. This op is, by convention, known as the `train_op` and is what must
be run by a TensorFlow session in order to induce one full step of training
(see below).
```python
global_step = tf.Variable(0, name='global_step', trainable=False)
train_op = optimizer.minimize(loss, global_step=global_step)
```
## Train the Model
Once the graph is built, it can be iteratively trained and evaluated in a loop
controlled by the user code in `fully_connected_feed.py`.
### The Graph
At the top of the `run_training()` function is a python `with` command that
indicates all of the built ops are to be associated with the default
global @{tf.Graph}
instance.
```python
with tf.Graph().as_default():
```
A `tf.Graph` is a collection of ops that may be executed together as a group.
Most TensorFlow uses will only need to rely on the single default graph.
More complicated uses with multiple graphs are possible, but beyond the scope of
this simple tutorial.
### The Session
Once all of the build preparation has been completed and all of the necessary
ops generated, a @{tf.Session}
is created for running the graph.
```python
sess = tf.Session()
```
Alternately, a `Session` may be generated into a `with` block for scoping:
```python
with tf.Session() as sess:
```
The empty parameter to session indicates that this code will attach to
(or create if not yet created) the default local session.
Immediately after creating the session, all of the `tf.Variable`
instances are initialized by calling @{tf.Session.run}
on their initialization op.
```python
init = tf.global_variables_initializer()
sess.run(init)
```
The @{tf.Session.run}
method will run the complete subset of the graph that
corresponds to the op(s) passed as parameters. In this first call, the `init`
op is a @{tf.group}
that contains only the initializers for the variables. None of the rest of the
graph is run here; that happens in the training loop below.
### Train Loop
After initializing the variables with the session, training may begin.
The user code controls the training per step, and the simplest loop that
can do useful training is:
```python
for step in xrange(FLAGS.max_steps):
  sess.run(train_op)
```
However, this tutorial is slightly more complicated in that it must also slice
up the input data for each step to match the previously generated placeholders.
#### Feed the Graph
For each step, the code will generate a feed dictionary that will contain the
set of examples on which to train for the step, keyed by the placeholder
ops they represent.
In the `fill_feed_dict()` function, the given `DataSet` is queried for its next
`batch_size` set of images and labels, and NumPy arrays matching the
placeholders are filled with those images and labels.
```python
images_feed, labels_feed = data_set.next_batch(FLAGS.batch_size,
                                               FLAGS.fake_data)
```
A Python dictionary object is then generated with the placeholders as keys and
the corresponding feed values as values.
```python
feed_dict = {
    images_placeholder: images_feed,
    labels_placeholder: labels_feed,
}
```
This is passed into the `sess.run()` function's `feed_dict` parameter to provide
the input examples for this step of training.
#### Check the Status
The code specifies two values to fetch in its run call: `[train_op, loss]`.
```python
for step in xrange(FLAGS.max_steps):
  feed_dict = fill_feed_dict(data_sets.train,
                             images_placeholder,
                             labels_placeholder)
  _, loss_value = sess.run([train_op, loss],
                           feed_dict=feed_dict)
```
Because there are two values to fetch, `sess.run()` returns a list with two
items. Each `Tensor` in the list of values to fetch corresponds to a NumPy
array in the returned list, filled with the value of that tensor during this
step of training. Since `train_op` is an `Operation` with no output value, the
corresponding element in the returned list is `None` and is therefore
discarded. However, the value of the `loss` tensor may become NaN if the model
diverges during training, so we capture this value for logging.
Assuming that the training runs fine without NaNs, the training loop also
prints a simple status text every 100 steps to let the user know the state of
training.
```python
if step % 100 == 0:
  print('Step %d: loss = %.2f (%.3f sec)' % (step, loss_value, duration))
```
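The `duration` printed above is not computed in the snippet; a minimal sketch
of how each step might be timed, assuming the standard `time` module:
```python
import time

start_time = time.time()
_, loss_value = sess.run([train_op, loss], feed_dict=feed_dict)
duration = time.time() - start_time  # Seconds spent on this training step.
```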
#### Visualize the Status
In order to emit the events files used by @{$summaries_and_tensorboard$TensorBoard},
all of the summaries (in this case, only one) are collected into a single Tensor
during the graph building phase.
```python
summary = tf.summary.merge_all()
```
Then, after the session is created, a @{tf.summary.FileWriter}
may be instantiated to write the events files, which
contain both the graph itself and the values of the summaries.
```python
summary_writer = tf.summary.FileWriter(FLAGS.log_dir, sess.graph)
```
Lastly, the events file will be updated with new summary values every time the
`summary` is evaluated and the output passed to the writer's `add_summary()`
function.
```python
summary_str = sess.run(summary, feed_dict=feed_dict)
summary_writer.add_summary(summary_str, step)
```
When the events files are written, TensorBoard may be run against the training
folder to display the values from the summaries.
![MNIST TensorBoard](https://www.tensorflow.org/images/mnist_tensorboard.png "MNIST TensorBoard")
**NOTE**: For more information about how to build and run TensorBoard, please see the accompanying tutorial @{$summaries_and_tensorboard$TensorBoard: Visualizing Learning}.
#### Save a Checkpoint
In order to emit a checkpoint file that may be used to later restore a model
for further training or evaluation, we instantiate a
@{tf.train.Saver}.
```python
saver = tf.train.Saver()
```
In the training loop, the @{tf.train.Saver.save}
method will periodically be called to write a checkpoint file to the training
directory with the current values of all the trainable variables.
```python
saver.save(sess, checkpoint_file, global_step=step)
```
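The `checkpoint_file` path is not defined in the snippet above. A minimal
sketch of how it might be constructed and how the save could be scheduled (the
exact schedule shown is an assumption, not prescribed by this document):
```python
import os

# Hypothetical checkpoint path inside the training directory.
checkpoint_file = os.path.join(FLAGS.log_dir, 'model.ckpt')
if (step + 1) % 1000 == 0 or (step + 1) == FLAGS.max_steps:
  saver.save(sess, checkpoint_file, global_step=step)
```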
At some later point in the future, training might be resumed by using the
@{tf.train.Saver.restore}
method to reload the model parameters.
```python
saver.restore(sess, checkpoint_file)
```
## Evaluate the Model
Every thousand steps, the code will attempt to evaluate the model against the
training, validation, and test datasets. The `do_eval()` function is called
three times, once for each of these datasets.
```python
print('Training Data Eval:')
do_eval(sess,
        eval_correct,
        images_placeholder,
        labels_placeholder,
        data_sets.train)
print('Validation Data Eval:')
do_eval(sess,
        eval_correct,
        images_placeholder,
        labels_placeholder,
        data_sets.validation)
print('Test Data Eval:')
do_eval(sess,
        eval_correct,
        images_placeholder,
        labels_placeholder,
        data_sets.test)
```
> Note that more complicated usage would usually sequester the `data_sets.test`
> to only be checked after significant amounts of hyperparameter tuning. For
> the sake of a simple little MNIST problem, however, we evaluate against all of
> the data.
### Build the Eval Graph
Before entering the training loop, the Eval op should have been built
by calling the `evaluation()` function from `mnist.py` with the same
logits/labels parameters as the `loss()` function.
```python
eval_correct = mnist.evaluation(logits, labels_placeholder)
```
The `evaluation()` function simply generates a @{tf.nn.in_top_k}
op that can automatically score each model output as correct if the true label
can be found in the K most-likely predictions. In this case, we set the value
of K to 1 to only consider a prediction correct if it is for the true label.
```python
eval_correct = tf.nn.in_top_k(logits, labels, 1)
```
### Eval Output
One can then create a loop for filling a `feed_dict` and calling `sess.run()`
against the `eval_correct` op to evaluate the model on the given dataset.
```python
for step in xrange(steps_per_epoch):
  feed_dict = fill_feed_dict(data_set,
                             images_placeholder,
                             labels_placeholder)
  true_count += sess.run(eval_correct, feed_dict=feed_dict)
```
The `true_count` variable simply accumulates all of the predictions that the
`in_top_k` op has determined to be correct. From there, the precision can be
calculated by simply dividing by the total number of examples.
```python
precision = float(true_count) / num_examples
print('  Num examples: %d  Num correct: %d  Precision @ 1: %0.04f' %
      (num_examples, true_count, precision))
```
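Putting the pieces of this section together, a condensed sketch of a
`do_eval()`-style helper might look as follows; it assumes the
`fill_feed_dict()` helper shown earlier and a `FLAGS.batch_size` flag:
```python
def do_eval(sess, eval_correct, images_placeholder, labels_placeholder, data_set):
  """Runs one evaluation pass over `data_set` and prints precision @ 1."""
  true_count = 0  # Accumulates the number of correct predictions.
  steps_per_epoch = data_set.num_examples // FLAGS.batch_size
  num_examples = steps_per_epoch * FLAGS.batch_size
  for step in xrange(steps_per_epoch):
    feed_dict = fill_feed_dict(data_set, images_placeholder, labels_placeholder)
    true_count += sess.run(eval_correct, feed_dict=feed_dict)
  precision = float(true_count) / num_examples
  print('  Num examples: %d  Num correct: %d  Precision @ 1: %0.04f' %
        (num_examples, true_count, precision))
```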

View File

@ -1,434 +0,0 @@
# Deep MNIST for Experts
TensorFlow is a powerful library for doing large-scale numerical computation.
One of the tasks at which it excels is implementing and training deep neural
networks. In this tutorial we will learn the basic building blocks of a
TensorFlow model while constructing a deep convolutional MNIST classifier.
*This introduction assumes familiarity with neural networks and the MNIST
dataset. If you don't have
a background with them, check out the
@{$beginners$introduction for beginners}. Be sure to
@{$install$install TensorFlow} before starting.*
## About this tutorial
The first part of this tutorial explains what is happening in the
[mnist_softmax.py](https://www.tensorflow.org/code/tensorflow/examples/tutorials/mnist/mnist_softmax.py)
code, which is a basic implementation of a TensorFlow model. The second part
shows some ways to improve the accuracy.
You can copy and paste each code snippet from this tutorial into a Python
environment to follow along, or you can download the fully implemented deep net
from [mnist_deep.py](https://www.tensorflow.org/code/tensorflow/examples/tutorials/mnist/mnist_deep.py).
What we will accomplish in this tutorial:
- Create a softmax regression function that is a model for recognizing MNIST
digits, based on looking at every pixel in the image
- Use TensorFlow to train the model to recognize digits by having it "look" at
thousands of examples (and run our first TensorFlow session to do so)
- Check the model's accuracy with our test data
- Build, train, and test a multilayer convolutional neural network to improve
the results
## Setup
Before we create our model, we will first load the MNIST dataset, and start a
TensorFlow session.
### Load MNIST Data
If you are copying and pasting in the code from this tutorial, start here with
these two lines of code which will download and read in the data automatically:
```python
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
```
Here `mnist` is a lightweight class which stores the training, validation, and
testing sets as NumPy arrays. It also provides a function for iterating through
data minibatches, which we will use below.
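For example, a minibatch of 100 examples can be drawn as NumPy arrays like this
(the shapes in the comments assume the flattened 28x28 images and one-hot
labels loaded above):
```python
batch_images, batch_labels = mnist.train.next_batch(100)
print(batch_images.shape)  # (100, 784): 100 flattened 28x28 images.
print(batch_labels.shape)  # (100, 10): 100 one-hot label vectors.
```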
### Start TensorFlow InteractiveSession
TensorFlow relies on a highly efficient C++ backend to do its computation. The
connection to this backend is called a session. The common usage for TensorFlow
programs is to first create a graph and then launch it in a session.
Here we instead use the convenient `InteractiveSession` class, which makes
TensorFlow more flexible about how you structure your code. It allows you to
interleave operations which build a
@{$get_started/get_started#the_computational_graph$computation graph}
with ones that run the graph. This is particularly convenient when working in
interactive contexts like IPython. If you are not using an
`InteractiveSession`, then you should build the entire computation graph before
starting a session and
@{$get_started/get_started#the_computational_graph$launching the graph}.
```python
import tensorflow as tf
sess = tf.InteractiveSession()
```
#### Computation Graph
To do efficient numerical computing in Python, we typically use libraries like
[NumPy](http://www.numpy.org/) that do expensive operations such as matrix
multiplication outside Python, using highly efficient code implemented in
another language. Unfortunately, there can still be a lot of overhead from
switching back to Python every operation. This overhead is especially bad if you
want to run computations on GPUs or in a distributed manner, where there can be
a high cost to transferring data.
TensorFlow also does its heavy lifting outside Python, but it takes things a
step further to avoid this overhead. Instead of running a single expensive
operation independently from Python, TensorFlow lets us describe a graph of
interacting operations that run entirely outside Python. This approach is
similar to that used in Theano or Torch.
The role of the Python code is therefore to build this external computation
graph, and to dictate which parts of the computation graph should be run. See
the @{$get_started/get_started#the_computational_graph$Computation Graph}
section of @{$get_started/get_started} for more detail.
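As a minimal sketch of this division of labor (not part of the classifier we
build below), note that constructing an op only adds a node to the graph;
nothing is computed until the graph is run:
```python
a = tf.constant(5.0)
b = tf.constant(3.0)
c = a * b           # Adds a multiplication node to the graph; nothing runs yet.
print(sess.run(c))  # The backend executes the graph and returns 15.0.
```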
## Build a Softmax Regression Model
In this section we will build a softmax regression model with a single linear
layer. In the next section, we will extend this to the case of softmax
regression with a multilayer convolutional network.
### Placeholders
We start building the computation graph by creating nodes for the
input images and target output classes.
```python
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])
```
Here `x` and `y_` aren't specific values. Rather, they are each a `placeholder`
-- a value that we'll input when we ask TensorFlow to run a computation.
The input images `x` will consist of a 2d tensor of floating point numbers.
Here we assign it a `shape` of `[None, 784]`, where `784` is the dimensionality
of a single flattened 28 by 28 pixel MNIST image, and `None` indicates that the
first dimension, corresponding to the batch size, can be of any size. The
target output classes `y_` will also consist of a 2d tensor, where each row is a
one-hot 10-dimensional vector indicating which digit class (zero through nine)
the corresponding MNIST image belongs to.
The `shape` argument to `placeholder` is optional, but it allows TensorFlow
to automatically catch bugs stemming from inconsistent tensor shapes.
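For instance, feeding an array whose shape disagrees with the declared
placeholder shape fails immediately; a small sketch (the array below is
hypothetical):
```python
import numpy as np

bad_batch = np.zeros((50, 100))  # Wrong width: 100 instead of 784.
# The following call would raise a ValueError because the feed does not
# match the declared shape [None, 784]:
# sess.run(x, feed_dict={x: bad_batch})
```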
### Variables
We now define the weights `W` and biases `b` for our model. We could imagine
treating these like additional inputs, but TensorFlow has an even better way to
handle them: `Variable`. A `Variable` is a value that lives in TensorFlow's
computation graph. It can be used and even modified by the computation. In
machine learning applications, one generally has the model parameters be
`Variable`s.
```python
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
```
We pass the initial value for each parameter in the call to `tf.Variable`. In
this case, we initialize both `W` and `b` as tensors full of zeros. `W` is a
784x10 matrix (because we have 784 input features and 10 outputs) and `b` is a
10-dimensional vector (because we have 10 classes).
Before `Variable`s can be used within a session, they must be initialized using
that session. This step takes the initial values (in this case tensors full of
zeros) that have already been specified, and assigns them to each
`Variable`. This can be done for all `Variables` at once:
```python
sess.run(tf.global_variables_initializer())
```
### Predicted Class and Loss Function
We can now implement our regression model. It only takes one line! We multiply
the vectorized input images `x` by the weight matrix `W` and add the bias `b`.
```python
y = tf.matmul(x,W) + b
```
We can specify a loss function just as easily. Loss indicates how bad the
model's prediction was on a single example; we try to minimize that while
training across all the examples. Here, our loss function is the cross-entropy
between the target and the softmax activation function applied to the model's
prediction. As in the beginners tutorial, we use the stable formulation:
```python
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
```
Note that `tf.nn.softmax_cross_entropy_with_logits` internally applies the
softmax to the model's unnormalized prediction and sums across all classes,
and `tf.reduce_mean` takes the average over these per-example sums.
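For contrast, the numerically fragile formulation that the fused op replaces
would look roughly like the sketch below; it is shown only to illustrate the
stability concern and is not used in this tutorial:
```python
# Naive version: computing the softmax and the log separately can hit log(0).
naive_cross_entropy = tf.reduce_mean(
    -tf.reduce_sum(y_ * tf.log(tf.nn.softmax(y)), axis=[1]))
```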
## Train the Model
Now that we have defined our model and training loss function, it is
straightforward to train using TensorFlow. Because TensorFlow knows the entire
computation graph, it can use automatic differentiation to find the gradients of
the loss with respect to each of the variables. TensorFlow has a variety of
@{$python/train#optimizers$built-in optimization algorithms}.
For this example, we will use steepest gradient descent, with a step length of
0.5, to descend the cross entropy.
```python
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
```
What TensorFlow actually did in that single line was to add new operations to
the computation graph. These operations included ones to compute gradients,
compute parameter update steps, and apply update steps to the parameters.
The returned operation `train_step`, when run, will apply the gradient descent
updates to the parameters. Training the model can therefore be accomplished by
repeatedly running `train_step`.
```python
for _ in range(1000):
  batch = mnist.train.next_batch(100)
  train_step.run(feed_dict={x: batch[0], y_: batch[1]})
```
We load 100 training examples in each training iteration. We then run the
`train_step` operation, using `feed_dict` to replace the `placeholder` tensors
`x` and `y_` with the training examples. Note that you can replace any tensor
in your computation graph using `feed_dict` -- it's not restricted to just
`placeholder`s.
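As a small illustrative sketch (again, not part of the classifier), feeding can
even override a constant:
```python
base = tf.constant(2.0)
scaled = base * 3.0
print(sess.run(scaled))                          # 6.0, using the constant's value.
print(sess.run(scaled, feed_dict={base: 10.0}))  # 30.0: the fed value overrides `base`.
```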
### Evaluate the Model
How well did our model do?
First we'll figure out where we predicted the correct label. `tf.argmax` is an
extremely useful function which gives you the index of the highest entry in a
tensor along some axis. For example, `tf.argmax(y,1)` is the label our model
thinks is most likely for each input, while `tf.argmax(y_,1)` is the true
label. We can use `tf.equal` to check if our prediction matches the truth.
```python
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
```
That gives us a list of booleans. To determine what fraction are correct, we
cast to floating point numbers and then take the mean. For example,
`[True, False, True, True]` would become `[1,0,1,1]` which would become `0.75`.
```python
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
```
Finally, we can evaluate our accuracy on the test data. This should be about
92% correct.
```python
print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
```
## Build a Multilayer Convolutional Network
Getting 92% accuracy on MNIST is bad. It's almost embarrassingly bad. In this
section, we'll fix that, jumping from a very simple model to something
moderately sophisticated: a small convolutional neural network. This will get us
to around 99.2% accuracy -- not state of the art, but respectable.
Here is a diagram, created with TensorBoard, of the model we will build:
<div style="width:40%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img src="https://www.tensorflow.org/images/mnist_deep.png">
</div>
### Weight Initialization
To create this model, we're going to need to create a lot of weights and biases.
One should generally initialize weights with a small amount of noise for
symmetry breaking, and to prevent 0 gradients. Since we're using
[ReLU](https://en.wikipedia.org/wiki/Rectifier_(neural_networks)) neurons, it is
also good practice to initialize them with a slightly positive initial bias to
avoid "dead neurons". Instead of doing this repeatedly while we build the model,
let's create two handy functions to do it for us.
```python
def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)
```
### Convolution and Pooling
TensorFlow also gives us a lot of flexibility in convolution and pooling
operations. How do we handle the boundaries? What is our stride size?
In this example, we're always going to choose the vanilla version.
Our convolutions use a stride of one and are zero-padded so that the
output is the same size as the input. Our pooling is plain old max pooling
over 2x2 blocks. To keep our code cleaner, let's also abstract those operations
into functions.
```python
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')
```
### First Convolutional Layer
We can now implement our first layer. It will consist of convolution, followed
by max pooling. The convolution will compute 32 features for each 5x5 patch.
Its weight tensor will have a shape of `[5, 5, 1, 32]`. The first two
dimensions are the patch size, the next is the number of input channels, and
the last is the number of output channels. We will also have a bias vector with
a component for each output channel.
```python
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
```
To apply the layer, we first reshape `x` to a 4d tensor, with the second and
third dimensions corresponding to image height and width, and the final
dimension corresponding to the number of color channels.
```python
x_image = tf.reshape(x, [-1, 28, 28, 1])
```
We then convolve `x_image` with the weight tensor, add the
bias, apply the ReLU function, and finally max pool. The `max_pool_2x2` method will
reduce the image size to 14x14.
```python
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)
```
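As a quick sanity check of the shapes described above (a sketch; the leading
`?` is the unknown batch dimension):
```python
print(h_conv1.get_shape())  # (?, 28, 28, 32)
print(h_pool1.get_shape())  # (?, 14, 14, 32)
```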
### Second Convolutional Layer
In order to build a deep network, we stack several layers of this type. The
second layer will have 64 features for each 5x5 patch.
```python
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)
```
### Densely Connected Layer
Now that the image size has been reduced to 7x7, we add a fully-connected layer
with 1024 neurons to allow processing on the entire image. We reshape the tensor
from the pooling layer into a batch of vectors,
multiply by a weight matrix, add a bias, and apply a ReLU.
```python
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
```
#### Dropout
To reduce overfitting, we will apply [dropout](https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf) before the readout layer.
We create a `placeholder` for the probability that a neuron's output is kept
during dropout. This allows us to turn dropout on during training, and turn it
off during testing.
TensorFlow's `tf.nn.dropout` op automatically handles scaling neuron outputs in
addition to masking them, so dropout just works without any additional
scaling.<sup id="a1">[1](#f1)</sup>
```python
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
```
### Readout Layer
Finally, we add a readout layer, just like the one for the single-layer
softmax regression above.
```python
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
```
### Train and Evaluate the Model
How well does this model do? To train and evaluate it we will use code that is
nearly identical to that for the simple one-layer softmax network above.
The differences are that:
- We will replace the steepest gradient descent optimizer with the more
sophisticated ADAM optimizer.
- We will include the additional parameter `keep_prob` in `feed_dict` to control
the dropout rate.
- We will add logging to every 100th iteration in the training process.
We will also use `tf.Session` rather than `tf.InteractiveSession`. This better
separates the process of creating the graph (model specification) from the
process of evaluating the graph (model fitting). It generally makes for cleaner
code. The `tf.Session` is created within a [`with` block](https://docs.python.org/3/whatsnew/2.6.html#pep-343-the-with-statement)
so that it is automatically destroyed once the block is exited.
Feel free to run this code. Be aware that it does 20,000 training iterations
and may take a while (possibly up to half an hour), depending on your processor.
```python
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
      train_accuracy = accuracy.eval(feed_dict={
          x: batch[0], y_: batch[1], keep_prob: 1.0})
      print('step %d, training accuracy %g' % (i, train_accuracy))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

  print('test accuracy %g' % accuracy.eval(feed_dict={
      x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
```
The final test set accuracy after running this code should be approximately 99.2%.
We have learned how to quickly and easily build, train, and evaluate a
fairly sophisticated deep learning model using TensorFlow.
<b id="f1">1</b>: For this small convolutional network, performance is actually nearly identical with and without dropout. Dropout is often very effective at reducing overfitting, but it is most useful when training very large neural networks. [](#a1)

View File

@ -531,7 +531,7 @@ TensorFlow programs:
<pre>Hello, TensorFlow!</pre>
If you are new to TensorFlow, see @{$get_started/get_started$Getting Started with TensorFlow}.
If you are new to TensorFlow, see @{$get_started/premade_estimators$Getting Started with TensorFlow}.
If the system outputs an error message instead of a greeting, see [Common
installation problems](#common_installation_problems).

View File

@ -398,7 +398,7 @@ writing TensorFlow programs:
<pre>Hello, TensorFlow!</pre>
If you are new to TensorFlow, see
@{$get_started/get_started$Getting Started with TensorFlow}.
@{$get_started/premade_estimators$Getting Started with TensorFlow}.
If the system outputs an error message instead of a greeting, see
[Common installation problems](#common_installation_problems).

View File

@ -1,10 +1,16 @@
index.md
### Python
install_linux.md
install_mac.md
install_windows.md
install_sources.md
>>>
migration.md
>>>
### Other Languages
install_java.md
install_go.md
install_c.md

View File

@ -2,8 +2,8 @@ performance_guide.md
datasets_performance.md
performance_models.md
benchmarks.md
quantization.md
>>>
### XLA
xla/index.md
xla/broadcasting.md
xla/developing_new_backend.md
@ -11,3 +11,6 @@ xla/jit.md
xla/operation_semantics.md
xla/shapes.md
xla/tfcompile.md
### Quantization
quantization.md

View File

@ -1,6 +1,6 @@
# Importing Data
The `tf.data` API enables you to build complex input pipelines from
The @{tf.data} API enables you to build complex input pipelines from
simple, reusable pieces. For example, the pipeline for an image model might
aggregate data from files in a distributed file system, apply random
perturbations to each image, and merge randomly selected images into a batch

View File

@ -2,9 +2,10 @@
This document introduces the concept of embeddings, gives a simple example of
how to train an embedding in TensorFlow, and explains how to view embeddings
with the TensorBoard Embedding Projector. The first two parts target newcomers
to machine learning or TensorFlow, and the Embedding Projector how-to is for
users at all levels.
with the TensorBoard Embedding Projector
([live example](http://projector.tensorflow.org)). The first two parts target
newcomers to machine learning or TensorFlow, and the Embedding Projector how-to
is for users at all levels.
[TOC]

View File

@ -134,7 +134,7 @@ The heart of every Estimator--whether pre-made or custom--is its
evaluation, and prediction. When you are using a pre-made Estimator,
someone else has already implemented the model function. When relying
on a custom Estimator, you must write the model function yourself. A
@{$extend/estimators$companion document}
@{$get_started/custom_estimators$companion document}
explains how to write the model function.
@ -186,9 +186,9 @@ est_inception_v3.train(input_fn=train_input_fn, steps=2000)
```
Note that the names of feature columns and labels of a keras estimator come from
the corresponding compiled keras model. For example, the input key names for
@{$get_started/input_fn} in above `est_inception_v3` estimator can be obtained
from `keras_inception_v3.input_names`, and similarly, the predicted output
names can be obtained from `keras_inception_v3.output_names`.
`train_input_fn` above can be obtained from `keras_inception_v3.input_names`,
and similarly, the predicted output names can be obtained from
`keras_inception_v3.output_names`.
For more details, please refer to the documentation for
@{tf.keras.estimator.model_to_estimator}.
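As an illustrative sketch (the training arrays below are hypothetical, and the
exact input key depends on the compiled Keras model):
```python
print(keras_inception_v3.input_names)  # e.g. ['input_1'], depending on the model.
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={keras_inception_v3.input_names[0]: train_images},
    y=train_labels,
    num_epochs=1,
    shuffle=False)
```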

View File

@ -68,14 +68,6 @@ dictionary that maps @{tf.Tensor} objects to
numpy arrays (and some other types), which will be used as the values of those
tensors in the execution of a step.
Often, you have certain tensors, such as inputs, that will always be fed. The
@{tf.placeholder} op allows you
to define tensors that *must* be fed, and optionally allows you to constrain
their shape as well. See the
@{$beginners$beginners' MNIST tutorial} for an
example of how placeholders and feeding can be used to provide the training data
for a neural network.
#### What is the difference between `Session.run()` and `Tensor.eval()`?
If `t` is a @{tf.Tensor} object,

View File

@ -248,8 +248,9 @@ The images below show the CIFAR-10 model with tensor shape information:
Often it is useful to collect runtime metadata for a run, such as total memory
usage, total compute time, and tensor shapes for nodes. The code example below
is a snippet from the train and test section of a modification of the
@{$beginners$simple MNIST tutorial},
in which we have recorded summaries and runtime statistics. See the @{$summaries_and_tensorboard#serializing-the-data$Summaries Tutorial}
@{$layers$simple MNIST tutorial}, in which we have recorded summaries and
runtime statistics. See the
@{$summaries_and_tensorboard#serializing-the-data$Summaries Tutorial}
for details on how to record summaries.
Full source is [here](https://www.tensorflow.org/code/tensorflow/examples/tutorials/mnist/mnist_with_summaries.py).

View File

@ -1,16 +1,24 @@
# Programmer's Guide
The documents in this unit dive into the details of writing TensorFlow
code. For TensorFlow 1.3, we revised this document extensively.
The units are now as follows:
The documents in this unit dive into the details of how TensorFlow
works. The units are as follows:
* @{$programmers_guide/estimators$Estimators}, which introduces a high-level
## High Level APIs
* @{$programmers_guide/estimators}, which introduces a high-level
TensorFlow API that greatly simplifies ML programming.
* @{$programmers_guide/tensors$Tensors}, which explains how to create,
* @{$programmers_guide/datasets}, which explains how to
set up data pipelines to read data sets into your TensorFlow program.
## Low Level APIs
* @{$programmers_guide/low_level_intro}, which introduces the
basics of how you can use TensorFlow outside of the high level APIs.
* @{$programmers_guide/tensors}, which explains how to create,
manipulate, and access Tensors--the fundamental object in TensorFlow.
* @{$programmers_guide/variables$Variables}, which details how
* @{$programmers_guide/variables}, which details how
to represent shared, persistent state in your program.
* @{$programmers_guide/graphs$Graphs and Sessions}, which explains:
* @{$programmers_guide/graphs}, which explains:
* dataflow graphs, which are TensorFlow's representation of computations
as dependencies between operations.
* sessions, which are TensorFlow's mechanism for running dataflow graphs
@ -20,18 +28,40 @@ The units are now as follows:
such as Estimators or Keras, the high-level API creates and manages
graphs and sessions for you, but understanding graphs and sessions
can still be helpful.
* @{$programmers_guide/saved_model$Saving and Restoring}, which
* @{$programmers_guide/saved_model}, which
explains how to save and restore variables and models.
* @{$programmers_guide/datasets$Input Pipelines}, which explains how to
set up data pipelines to read data sets into your TensorFlow program.
* @{$programmers_guide/embedding$Embeddings}, which introduces the concept
* @{$using_gpu} explains how TensorFlow assigns operations to
devices and how you can change the arrangement manually.
## ML Concepts
* @{$programmers_guide/embedding}, which introduces the concept
of embeddings, provides a simple example of training an embedding in
TensorFlow, and explains how to view embeddings with the TensorBoard
Embedding Projector.
* @{$programmers_guide/debugger$Debugging TensorFlow Programs}, which
## Debugging
* @{$programmers_guide/debugger}, which
explains how to use the TensorFlow debugger (tfdbg).
* @{$programmers_guide/version_compat$TensorFlow Version Compatibility},
## TensorBoard
TensorBoard is a utility to visualize different aspects of machine learning.
The following guides explain how to use TensorBoard:
* @{$programmers_guide/summaries_and_tensorboard},
which introduces TensorBoard.
* @{$programmers_guide/graph_viz}, which
explains how to visualize the computational graph.
* @{$programmers_guide/tensorboard_histograms}, which demonstrates how to
use TensorBoard's histogram dashboard.
## Misc
* @{$programmers_guide/version_compat},
which explains backward compatibility guarantees and non-guarantees.
* @{$programmers_guide/faq$FAQ}, which contains frequently asked
questions about TensorFlow. (We have not revised this document for v1.3,
except to remove some obsolete information.)
* @{$programmers_guide/faq}, which contains frequently asked
questions about TensorFlow.

View File

@ -1,12 +1,28 @@
index.md
### High Level APIs
estimators.md
datasets.md
### Low Level APIs
low_level_intro.md
tensors.md
variables.md
graphs.md
saved_model.md
datasets.md
using_gpu.md
### ML Concepts
embedding.md
### Debugging
debugger.md
supervisor.md
### TensorBoard
summaries_and_tensorboard.md
graph_viz.md
tensorboard_histograms.md
### Misc
version_compat.md
faq.md

View File

@ -349,10 +349,10 @@ SavedModel format. This section explains how to:
### Preparing serving inputs
During training, an @{$input_fn$`input_fn()`} ingests data and prepares it for
use by the model. At serving time, similarly, a `serving_input_receiver_fn()`
accepts inference requests and prepares them for the model. This function
has the following purposes:
During training, an @{$premade_estimators#input_fn$`input_fn()`} ingests data
and prepares it for use by the model. At serving time, similarly, a
`serving_input_receiver_fn()` accepts inference requests and prepares them for
the model. This function has the following purposes:
* To add placeholders to the graph that the serving system will feed
with inference requests.

View File

@ -76,7 +76,7 @@ data than you need, though. Instead, consider running the merged summary op
every `n` steps.
The code example below is a modification of the
@{$beginners$simple MNIST tutorial},
@{$layers$simple MNIST tutorial},
in which we have added some summary ops, and run them every ten steps. If you
run this and then launch `tensorboard --logdir=/tmp/tensorflow/mnist`, you'll be able
to visualize statistics, such as how the weights or accuracy varied during

View File

@ -172,7 +172,7 @@ If you would like to run TensorFlow on multiple GPUs, you can construct your
model in a multi-tower fashion where each tower is assigned to a different GPU.
For example:
```
``` python
# Creates a graph.
c = []
for d in ['/device:GPU:2', '/device:GPU:3']:

View File

@ -60,7 +60,7 @@ patch versions. The public APIs consist of
* [`tensor_shape`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/tensor_shape.proto)
* [`types`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/types.proto)
## What is *not* covered
## What is *not* covered {#not_covered}
Some API functions are explicitly marked as "experimental" and can change in
backward incompatible ways between minor releases. These include:

View File

@ -450,9 +450,7 @@ covering them.
To find out more about implementing convolutional neural networks, you can jump
to the TensorFlow @{$deep_cnn$deep convolutional networks tutorial},
or start a bit more gently with our
@{$beginners$ML beginner} or @{$pros$ML expert}
MNIST starter tutorials. Finally, if you want to get up to speed on research
in this area, you can
or start a bit more gently with our @{$layers$MNIST starter tutorial}.
Finally, if you want to get up to speed on research in this area, you can
read the recent work of all the papers referenced in this tutorial.

View File

@ -1,57 +1,60 @@
# Tutorials
This section contains tutorials demonstrating how to do specific tasks
in TensorFlow. If you are new to TensorFlow, we recommend reading the
documents in the "Get Started" section before reading these tutorials.
documents in the "@{$get_started$Get Started}" section before reading
these tutorials.
The following tutorial explains the interaction of CPUs and GPUs on a
TensorFlow system:
## Images
* @{$using_gpu$Using GPUs}
These tutorials cover different aspects of image recognition:
The following tutorials cover different aspects of image recognition:
* @{$layers}, which introduces convolutional neural networks (CNNs) and
demonstrates how to build a CNN in TensorFlow.
* @{$image_recognition}, which introduces the field of image recognition and
uses a pre-trained model (Inception) for recognizing images.
* @{$image_retraining}, which has a wonderfully self-explanatory title.
* @{$deep_cnn}, which demonstrates how to build a small CNN for recognizing
images. This tutorial is aimed at advanced TensorFlow users.
* @{$image_recognition$Image Recognition}, which introduces the field of
image recognition and a model (Inception) for recognizing images.
* @{$image_retraining$How to Retrain Inception's Final Layer for New Categories},
which has a wonderfully self-explanatory title.
* @{$layers$A Guide to TF Layers: Building a Convolutional Neural Network},
which introduces convolutional neural networks (CNNs) and demonstrates how
to build a CNN in TensorFlow.
* @{$deep_cnn$Convolutional Neural Networks}, which demonstrates how to
build a small CNN for recognizing images. This tutorial is aimed at
advanced TensorFlow users.
The following tutorials focus on machine learning problems in human language:
## Sequences
* @{$word2vec$Vector Representations of Words}, which demonstrates how to
create an embedding for words.
* @{$recurrent$Recurrent Neural Networks}, which demonstrates how to use a
These tutorials focus on machine learning problems dealing with sequence data.
* @{$recurrent}, which demonstrates how to use a
recurrent neural network to predict the next word in a sentence.
* @{$seq2seq$Sequence-to-Sequence Models}, which demonstrates how to use a
* @{$seq2seq}, which demonstrates how to use a
sequence-to-sequence model to translate text from English to French.
The following tutorials focus on linear models:
* @{$linear$Large-Scale Linear Models with TensorFlow}, which introduces
linear models and demonstrates how to build them with the high-level API.
* @{$wide$TensorFlow Linear Model Tutorial}, which demonstrates how to solve
a binary classification problem in TensorFlow.
* @{$wide_and_deep$TensorFlow Wide & Deep Learning Tutorial}, which explains
how to use the high-level API to jointly train both a wide linear model
and a deep feed-forward neural network.
* @{$kernel_methods$Improving Linear Models Using Explicit Kernel Methods},
which shows how to improve the quality of a linear model by using explicit
kernel mappings.
* @{$audio_recognition$Simple Audio Recognition}, which shows how to
* @{$recurrent_quickdraw}
builds a classification model for drawings, directly from the sequence of
pen strokes.
* @{$audio_recognition}, which shows how to
build a basic speech recognition network.
The following tutorial covers building a classification model for sequences:
## Data representation
* @{$recurrent_quickdraw$Classifying Drawings using Recurrent Neural Networks}
These tutorials demonstrate various data representations that can be used in
TensorFlow.
Although TensorFlow specializes in machine learning, you may also use
TensorFlow to solve other kinds of math problems. For example:
* @{$wide}, uses
@{tf.feature_column$feature columns} to feed a variety of data types
to a linear model, to solve a classification problem.
* @{$wide_and_deep}, builds on the
above linear model tutorial, adding a deep feed-forward neural network
component and a DNN-compatible data representation.
* @{$word2vec}, which demonstrates how to
create an embedding for words.
* @{$kernel_methods},
which shows how to improve the quality of a linear model by using explicit
kernel mappings.
* @{$mandelbrot$Mandelbrot Set}
* @{$pdes$Partial Differential Equations}
## Non Machine Learning
Although TensorFlow specializes in machine learning, the core of TensorFlow is
a powerful numeric computation system which you can also use to solve other
kinds of math problems. For example:
* @{$mandelbrot}
* @{$pdes}

View File

@ -1,5 +1,10 @@
# Improving Linear Models Using Explicit Kernel Methods
Note: This document uses a deprecated version of @{tf.estimator},
which has a @{tf.contrib.learn.Estimator$different interface}.
It also uses other `contrib` methods whose
@{$version_compat#not_covered$API may not be stable}.
In this tutorial, we demonstrate how combining (explicit) kernel methods with
linear models can drastically increase the latter's quality of predictions
without significantly increasing training and inference times. Unlike dual
@ -44,18 +49,18 @@ respectively. Each split contains one numpy array for images (with shape
tutorial, we only use the train and validation splits to train and evaluate our
models respectively.
In order to feed data to a tf.contrib.learn Estimator, it is helpful to convert
In order to feed data to a `tf.contrib.learn Estimator`, it is helpful to convert
it to Tensors. For this, we will use an `input function` which adds Ops to the
TensorFlow graph that, when executed, create mini-batches of Tensors to be used
downstream. For more background on input functions, check
@{$get_started/input_fn$Building Input Functions with tf.contrib.learn}. In this
example, we will use the `tf.train.shuffle_batch` Op which, besides converting
numpy arrays to Tensors, allows us to specify the batch_size and whether to
randomize the input every time the input_fn Ops are executed (randomization
typically expedites convergence during training). The full code for loading and
preparing the data is shown in the snippet below. In this example, we use
mini-batches of size 256 for training and the entire sample (5K entries) for
evaluation. Feel free to experiment with different batch sizes.
@{$get_started/premade_estimators#input_fn$this section on input functions}.
In this example, we will use the `tf.train.shuffle_batch` Op which, besides
converting numpy arrays to Tensors, allows us to specify the batch_size and
whether to randomize the input every time the input_fn Ops are executed
(randomization typically expedites convergence during training). The full code
for loading and preparing the data is shown in the snippet below. In this
example, we use mini-batches of size 256 for training and the entire sample
(5K entries) for evaluation. Feel free to experiment with different batch sizes.
```python
import numpy as np

View File

@ -190,7 +190,7 @@ def cnn_model_fn(features, labels, mode):
The following sections (with headings corresponding to each code block above)
dive deeper into the `tf.layers` code used to create each layer, as well as how
to calculate loss, configure the training op, and generate predictions. If
you're already experienced with CNNs and @{$extend/estimators$TensorFlow `Estimator`s},
you're already experienced with CNNs and @{$get_started/custom_estimators$TensorFlow `Estimator`s},
and find the above code intuitive, you may want to skim these sections or just
skip ahead to ["Training and Evaluating the CNN MNIST
Classifier"](#training-and-evaluating-the-cnn-mnist-classifier).
@ -534,8 +534,8 @@ if mode == tf.estimator.ModeKeys.TRAIN:
```
> Note: For a more in-depth look at configuring training ops for Estimator model
> functions, see @{$extend/estimators#defining-the-training-op-for-the-model$"Defining
> the training op for the model"} in the @{$extend/estimators$"Creating Estimators in
> functions, see @{$get_started/custom_estimators#defining-the-training-op-for-the-model$"Defining
> the training op for the model"} in the @{$get_started/custom_estimators$"Creating Estimators in
> tf.estimator"} tutorial.
### Add evaluation metrics
@ -599,7 +599,7 @@ be saved (here, we specify the temp directory `/tmp/mnist_convnet_model`, but
feel free to change to another directory of your choice).
> Note: For an in-depth walkthrough of the TensorFlow `Estimator` API, see the
> tutorial @{$extend/estimators$"Creating Estimators in tf.estimator."}
> tutorial @{$get_started/custom_estimators$"Creating Estimators in tf.estimator."}
### Set Up a Logging Hook {#set_up_a_logging_hook}
@ -718,10 +718,9 @@ Here, we've achieved an accuracy of 97.3% on our test data set.
To learn more about TensorFlow Estimators and CNNs in TensorFlow, see the
following resources:
* @{$extend/estimators$Creating Estimators in tf.estimator}. An
introduction to the TensorFlow Estimator API, which walks through
* @{$get_started/custom_estimators$Creating Estimators in tf.estimator}
provides an introduction to the TensorFlow Estimator API. It walks through
configuring an Estimator, writing a model function, calculating loss, and
defining a training op.
* @{$pros#build-a-multilayer-convolutional-network$Deep MNIST for Experts: Building a Multilayer CNN}. Walks
through how to build a MNIST CNN classification model *without layers* using
lower-level TensorFlow operations.
* @{$deep_cnn} walks through how to build a MNIST CNN classification model
*without estimators* using lower-level TensorFlow operations.

View File

@ -1,17 +1,23 @@
index.md
using_gpu.md
### Images
layers.md
image_recognition.md
image_retraining.md
layers.md
deep_cnn.md
word2vec.md
### Sequences
recurrent.md
recurrent_quickdraw.md
seq2seq.md
linear.md
recurrent_quickdraw.md
audio_recognition.md
### Data Representation
wide.md
wide_and_deep.md
word2vec.md
kernel_methods.md
audio_recognition.md
### Non-ML
mandelbrot.md
pdes.md

View File

@ -17,24 +17,21 @@ tutorial walks through the code in greater detail.
To understand this overview it will help to have some familiarity
with basic machine learning concepts, and also with
@{$get_started/estimator$Estimators}.
@{$get_started/premade_estimators$Estimators}.
[TOC]
## What is a linear model?
A **linear model** uses a single weighted sum of features to make a prediction.
For example, if you have
[data](https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names)
For example, if you have [data](https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names)
on age, years of education, and weekly hours of
work for a population, a model can learn weights for each of those numbers so that
their weighted sum estimates a person's salary. You can also use linear models
for classification.
Some linear models transform the weighted sum into a more convenient form. For
example,
[**logistic regression**](https://developers.google.com/machine-learning/glossary/#logistic_regression)
plugs the weighted sum into the logistic
example, [**logistic regression**](https://developers.google.com/machine-learning/glossary/#logistic_regression) plugs the weighted sum into the logistic
function to turn the output into a value between 0 and 1. But you still just
have one weight for each input feature.
@ -177,7 +174,7 @@ the data itself. You provide the data through an input function.
The input function must return a dictionary of tensors. Each key corresponds to
the name of a `FeatureColumn`. Each key's value is a tensor containing the
values of that feature for all data instances. See
@{$input_fn$Building Input Functions} for a
@{$premade_estimators#input_fn} for a
more comprehensive look at input functions, and `input_fn` in the
[linear models tutorial code](https://github.com/tensorflow/models/tree/master/official/wide_deep/wide_deep.py)
for an example implementation of an input function.

View File

@ -219,7 +219,7 @@ length 2.
### Defining the model
To define the model we create a new `Estimator`. If you want to read more about
estimators, we recommend @{$extend/estimators$this tutorial}.
estimators, we recommend @{$get_started/custom_estimators$this tutorial}.
To build the model, we: