Sync Premade and Custom estimator docs with example code.

PiperOrigin-RevId: 179404175
Mark Daoust 2017-12-18 04:12:29 -08:00 committed by TensorFlower Gardener
parent 04df827cfd
commit 119f5d477b
3 changed files with 147 additions and 115 deletions

View File

@ -4,13 +4,31 @@ This document introduces custom Estimators. In particular, this document
demonstrates how to create a custom @{tf.estimator.Estimator$Estimator} that
mimics the behavior of the pre-made Estimator
@{tf.estimator.DNNClassifier$`DNNClassifier`} in solving the Iris problem. See
the @{$get_started/estimator$Pre-Made Estimators chapter} for details.
the @{$get_started/premade_estimators$Pre-Made Estimators chapter} for details
on the Iris problem.
To download and access the example code invoke the following two commands:
```shell
git clone https://github.com/tensorflow/models/
cd models/samples/core/get_started
```
In this document we will be looking at
[`custom_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/custom_estimator.py).
You can run it with the following command:
```shell
python custom_estimator.py
```
If you are feeling impatient, feel free to compare and contrast
[`custom_estimatr.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/custom_estimator.py)
with
[`premade_estimatr.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/premade_estimator.py).
(which is in the same directory).
If you are feeling impatient, feel free to compare and contrast the following
full programs:
* Iris implemented with the [pre-made DNNClassifier Estimator](https://github.com/tensorflow/models/blob/master/samples/core/get_started/premade_estimator.py).
* Iris implemented with a [custom Estimator](https://github.com/tensorflow/models/blob/master/samples/core/get_started/custom_estimator.py).
## Pre-made vs. custom
@ -64,14 +82,16 @@ and a logits output layer.
## Write an Input function
In our custom Estimator implementation, we'll reuse the input function we used
in the pre-made Estimator implementation. Namely:
Our custom Estimator implementation uses the same input function as our
@{$get_started/premade_estimators$pre-made Estimator implementation}, from
[`iris_data.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py).
Namely:
```python
def train_input_fn(features, labels, batch_size):
    """An input function for training"""
    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices((features, labels))
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    # Shuffle, repeat, and batch the examples.
    dataset = dataset.shuffle(1000).repeat().batch(batch_size)
@ -85,8 +105,8 @@ This input function builds an input pipeline that yields batches of
## Create feature columns
<!-- TODO(markdaoust): link to feature_columns when it exists-->
As detailed in @{$get_started/estimator$Premade Estimators}, you must define
As detailed in the @{$get_started/premade_estimators$Premade Estimators} and
@{$get_started/feature_columns$Feature Columns} chapters, you must define
your model's feature columns to specify how the model should use each feature.
Whether working with pre-made Estimators or custom Estimators, you define
feature columns in the same fashion.
@ -119,20 +139,23 @@ the input function; that is, `features` and `labels` are the handles to the
data your model will use. The `mode` argument indicates whether the caller is
requesting training, predicting, or evaluation.
The caller may pass `params` to an Estimator's constructor. The `params` passed
to the constructor become the `params` passed to `model_fn`.
The caller may pass `params` to an Estimator's constructor. Any `params` passed
to the constructor are in turn passed on to the `model_fn`. In
[`custom_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/custom_estimator.py)
the following lines create the estimator and set the params to configure the
model. This configuration step is similar to how we configured the @{tf.estimator.DNNClassifier} in
@{$get_started/premade_estimators}.
```python
# Build 2 hidden layer DNN with 10, 10 units respectively.
classifier = tf.estimator.Estimator(
model_fn=my_model,
params={
'feature_columns': my_feature_columns,
# Two hidden layers of 10 nodes each.
'hidden_units': [10, 10],
# The model must choose between 3 classes.
'n_classes': 3,
})
classifier = tf.estimator.Estimator(
model_fn=my_model,
params={
'feature_columns': my_feature_columns,
# Two hidden layers of 10 nodes each.
'hidden_units': [10, 10],
# The model must choose between 3 classes.
'n_classes': 3,
})
```
To implement a typical model function, you must do the following:
@ -163,7 +186,7 @@ feature columns into input for your model. For example:
```
The preceding line applies the transformations defined by your feature columns,
creating the input layer of our model.
creating the model's input layer.
<div style="width:100%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="height:260px"
@ -186,6 +209,7 @@ is connected to every node in the preceding layer. Here's the relevant code:
for units in params['hidden_units']:
    net = tf.layers.dense(net, units=units, activation=tf.nn.relu)
```
* The `units` parameter defines the number of output neurons in a given layer.
* The `activation` parameter defines the [activation function](https://developers.google.com/machine-learning/glossary/#a) —
[ReLU](https://developers.google.com/machine-learning/glossary/#ReLU) in this
@ -193,12 +217,11 @@ is connected to every node in the preceding layer. Here's the relevant code:
The variable `net` here signifies the current top layer of the network. During
the first iteration, `net` signifies the input layer. On each loop iteration
`tf.layers.dense` creates a new layer, which takes the previous layer as its
input. So, the loop uses `net` to pass the previously created layer as input
to the layer being created.
`tf.layers.dense` creates a new layer, which takes the previous layer's output
as its input, using the variable `net`.
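To make the role of `net` concrete, here is a minimal pure-Python sketch of the same loop. The `dense` and `relu` helpers are hypothetical stand-ins for `tf.layers.dense` and `tf.nn.relu`, with random weights and zero biases chosen only for illustration; they are not TensorFlow's implementation.

```python
import random


def relu(x):
    """Rectified linear unit: negative values become zero."""
    return max(0.0, x)


def dense(inputs, units, activation=None, rng=random):
    """Hypothetical stand-in for a fully connected layer.

    Every output unit is a weighted sum of all inputs plus a bias.
    """
    weights = [[rng.uniform(-1.0, 1.0) for _ in inputs] for _ in range(units)]
    biases = [0.0] * units
    outputs = [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]
    if activation is not None:
        outputs = [activation(v) for v in outputs]
    return outputs


random.seed(0)
hidden_units = [10, 10]

# `net` starts as the input layer: one Iris example with four features.
net = [5.1, 3.3, 1.7, 0.5]
layer_sizes = [len(net)]

for units in hidden_units:
    # The previous layer's output becomes the new layer's input.
    net = dense(net, units=units, activation=relu)
    layer_sizes.append(len(net))

print(layer_sizes)
```

After the loop, `net` holds the output of the last hidden layer, which is what the output layer consumes next.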
After creating two hidden layers, our network looks as follows. For
simplicity, the figure only shows four hidden units in each layer.
simplicity, the figure does not show all the units in each layer.
<div style="width:100%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="height:260px"
@ -235,8 +258,8 @@ The final hidden layer feeds into the output layer.
When defining an output layer, the `units` parameter specifies the number of
outputs. So, by setting `units` to `params['n_classes']`, the model produces
one output value per class. Each element of the output vector will contains the
score, or "logit", calculated to the associated class of Iris: Setosa,
one output value per class. Each element of the output vector will contain the
score, or "logit", calculated for the associated class of Iris: Setosa,
Versicolor, or Virginica, respectively.
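Logits are unnormalized scores, so they are hard to read directly. The sketch below assumes a softmax is used to normalize them and uses made-up logit values; the `softmax` helper is written by hand here and is not the TensorFlow function.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    # Subtracting the largest logit keeps exp() numerically stable.
    exps = [math.exp(v - largest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


SPECIES = ['Setosa', 'Versicolor', 'Virginica']

# Hypothetical logits for a single example, one per class.
logits = [4.0, 1.0, -2.0]

probabilities = softmax(logits)
class_id = probabilities.index(max(probabilities))
print(SPECIES[class_id], probabilities)
```

The class with the largest logit is also the class with the largest probability, so the predicted class can be read from either vector.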
Later on, these logits will be transformed into probabilities by the
@ -255,11 +278,12 @@ function looks like this:
def my_model_fn(
    features, # This is batch_features from input_fn
    labels,   # This is batch_labels from input_fn
    mode):    # An instance of tf.estimator.ModeKeys, see below
    mode,     # An instance of tf.estimator.ModeKeys, see below
    params):  # Additional configuration
```
Focus on that third argument, mode. As the following table shows, when someone
calls train, evaluate, or predict, the Estimator framework invokes your model
calls `train`, `evaluate`, or `predict`, the Estimator framework invokes your model
function with the mode parameter set as follows:
| Estimator method | Estimator Mode |
@ -344,8 +368,8 @@ decreases.
This function returns the average over the whole batch.
```python
# Compute loss.
loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
# Compute loss.
loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
```
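As a rough illustration of what this loss measures, the following pure-Python sketch computes the per-example cross entropy between integer labels and logits, then averages over the batch. The helper name mirrors the TensorFlow function for readability only; the body is a hand-written approximation, and the logit values are made up.

```python
import math


def sparse_softmax_cross_entropy(labels, logits):
    """Mean of -log(softmax(logits)[label]) over the batch."""
    losses = []
    for label, row in zip(labels, logits):
        largest = max(row)
        # log(sum(exp(row))), computed in a numerically stable way.
        log_sum_exp = largest + math.log(sum(math.exp(v - largest) for v in row))
        losses.append(log_sum_exp - row[label])
    return sum(losses) / len(losses)


# With equal logits the model is undecided between 3 classes: loss = log(3).
undecided = sparse_softmax_cross_entropy([0, 2], [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])

# A confident, correct prediction gives a small loss...
confident_right = sparse_softmax_cross_entropy([0], [[5.0, 0.0, 0.0]])
# ...while a confident, wrong prediction gives a large one.
confident_wrong = sparse_softmax_cross_entropy([1], [[5.0, 0.0, 0.0]])

print(undecided, confident_right, confident_wrong)
```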
### Evaluate
@ -364,10 +388,10 @@ true values, that is, against the labels provided by the input function. The
same shape. Here's the call to @{tf.metrics.accuracy}:
``` python
# Compute evaluation metrics.
accuracy = tf.metrics.accuracy(labels=labels,
predictions=predicted_classes,
name='acc_op')
# Compute evaluation metrics.
accuracy = tf.metrics.accuracy(labels=labels,
predictions=predicted_classes,
name='acc_op')
```
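`tf.metrics.accuracy` returns a pair: a tensor holding the running accuracy and an update operation that folds in each new batch. The sketch below imitates that accumulation in plain Python; the `StreamingAccuracy` class and the batch values are hypothetical and exist only to show why the metric is tracked across batches rather than computed per batch.

```python
class StreamingAccuracy:
    """Hypothetical stand-in for a metric accumulated over many batches."""

    def __init__(self):
        self.correct = 0
        self.seen = 0

    def update(self, labels, predictions):
        """Fold one batch into the running totals and return the new value."""
        self.correct += sum(1 for l, p in zip(labels, predictions) if l == p)
        self.seen += len(labels)
        return self.value()

    def value(self):
        """Accuracy over every example seen so far."""
        return self.correct / self.seen if self.seen else 0.0


acc = StreamingAccuracy()
first = acc.update(labels=[0, 1, 2, 1], predictions=[0, 1, 1, 1])   # 3 of 4
second = acc.update(labels=[2, 0], predictions=[2, 0])              # 5 of 6 overall
print(first, second)
```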
The @{tf.estimator.EstimatorSpec$`EstimatorSpec`} returned for evaluation
@ -382,16 +406,16 @@ same dictionary. Then, we'll pass that dictionary in the `eval_metric_ops`
argument of `tf.estimator.EstimatorSpec`. Here's the code:
```python
metrics = {'accuracy': accuracy}
tf.summary.scalar('accuracy', accuracy[1])
metrics = {'accuracy': accuracy}
tf.summary.scalar('accuracy', accuracy[1])
if mode == tf.estimator.ModeKeys.EVAL:
return tf.estimator.EstimatorSpec(
mode, loss=loss, eval_metric_ops=metrics)
if mode == tf.estimator.ModeKeys.EVAL:
return tf.estimator.EstimatorSpec(
mode, loss=loss, eval_metric_ops=metrics)
```
The @{tf.summary.scalar} will make accuracy available to TensorBoard (more on
this later).
The @{tf.summary.scalar} will make accuracy available to TensorBoard
in both `TRAIN` and `EVAL` modes. (More on this later.)
### Train
@ -407,11 +431,10 @@ optimizers—feel free to experiment with them.
Here is the code that builds the optimizer:
``` python
# Instantiate an optimizer.
optimizer = tf.train.AdagradOptimizer(learning_rate=0.1)
optimizer = tf.train.AdagradOptimizer(learning_rate=0.1)
```
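For intuition about what an Adagrad optimizer does with a learning rate of 0.1, here is a textbook-style update written in plain Python. This is a sketch under stated assumptions: the `adagrad_step` helper, the `epsilon` term, and the toy loss are illustrative, and TensorFlow's implementation may differ in details such as how the accumulator is initialized.

```python
import math


def adagrad_step(weights, grads, accum, learning_rate=0.1, epsilon=1e-8):
    """One textbook Adagrad update.

    Each weight gets its own effective step size, which shrinks as the
    sum of its squared gradients grows.
    """
    new_accum = [a + g * g for a, g in zip(accum, grads)]
    new_weights = [
        w - learning_rate * g / (math.sqrt(a) + epsilon)
        for w, g, a in zip(weights, grads, new_accum)
    ]
    return new_weights, new_accum


def toy_loss(w):
    """Toy objective with its minimum at w = 3."""
    return (w - 3.0) ** 2


weights, accum = [0.0], [0.0]
losses = [toy_loss(weights[0])]
for _ in range(50):
    grads = [2.0 * (weights[0] - 3.0)]  # derivative of the toy loss
    weights, accum = adagrad_step(weights, grads, accum)
    losses.append(toy_loss(weights[0]))

print(losses[0], losses[-1])
```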
Next, we train the model using the optimizer's
Next, we build the training operation using the optimizer's
@{tf.train.Optimizer.minimize$`minimize`} method on the loss we calculated
earlier.
@ -425,9 +448,7 @@ argument of `minimize`.
Here's the code to train the model:
``` python
# Train the model by establishing an objective, which is to
# minimize loss using that optimizer.
train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
```
The @{tf.estimator.EstimatorSpec$`EstimatorSpec`} returned for training
@ -439,11 +460,7 @@ must have the following fields set:
Here's our code to call `EstimatorSpec`:
```python
# Return training information.
return tf.estimator.EstimatorSpec(
mode=tf.estimator.ModeKeys.TRAIN,
loss=loss,
train_op=train_op)
return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
```
The model function is now complete.
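The overall control flow of such a model function can be summarized with a plain-Python skeleton. Everything here is a stand-in: the dictionaries take the place of `EstimatorSpec` objects, the loss and predictions are placeholders, and handling prediction first is an assumption based on the fact that no labels are available in that mode.

```python
# Stand-ins for tf.estimator.ModeKeys.TRAIN, EVAL and PREDICT.
TRAIN, EVAL, PREDICT = 'train', 'eval', 'infer'


def sketch_model_fn(features, labels, mode, params):
    """Shows which fields are filled in for each mode (no real model)."""
    # Placeholder predictions: always class 0, one per example.
    predictions = {'class_ids': [0 for _ in features]}
    if mode == PREDICT:
        return {'mode': mode, 'predictions': predictions}

    # Placeholder loss; a real model would compare logits with labels.
    loss = 0.0
    if mode == EVAL:
        return {'mode': mode, 'loss': loss,
                'eval_metric_ops': {'accuracy': None}}

    # Only TRAIN remains: the spec needs both a loss and a training op.
    return {'mode': mode, 'loss': loss, 'train_op': 'minimize(loss)'}


params = {'hidden_units': [10, 10], 'n_classes': 3}
features = [[5.1, 3.3, 1.7, 0.5]]
specs = {m: sketch_model_fn(features, [0], m, params)
         for m in (TRAIN, EVAL, PREDICT)}
for mode, spec in specs.items():
    print(mode, sorted(spec))
```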
@ -469,14 +486,15 @@ arguments of `DNNClassifier`; that is, the `params` dictionary lets you
configure your Estimator without modifying the code in the `model_fn`.
The rest of the code to train, evaluate, and generate predictions using our
Estimator is the same as for the pre-made `DNNClassifier`. For example, the
following line will train the model:
Estimator is the same as in the
@{$get_started/premade_estimators$Premade Estimators} chapter. For
example, the following line will train the model:
```python
# Train the Model.
classifier.train(
input_fn=lambda:train_input_fn(train_x, train_y, args.batch_size),
steps=args.train_steps)
# Train the Model.
classifier.train(
input_fn=lambda:iris_data.train_input_fn(train_x, train_y, args.batch_size),
steps=args.train_steps)
```
## TensorBoard

View File

@ -6,7 +6,7 @@ how to write the Iris classification problem in TensorFlow.
Prior to reading this document, do the following:
* [Install TensorFlow](install/index.md).
* @{$install$Install TensorFlow}.
* If you installed TensorFlow with virtualenv or Anaconda, activate your
TensorFlow environment.
* To keep the data import simple, our Iris example uses Pandas. You can
@ -28,7 +28,11 @@ Take the following steps to get the sample code for this program:
`cd models/samples/core/get_started/`
The program described in this document is called `premade_estimator.py`.
The program described in this document is
[`premade_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/premade_estimator.py).
This program uses
[`iris_data.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py)
to fetch its training data.
### Running the program
@ -38,15 +42,15 @@ You run TensorFlow programs as you would run any Python program. For example:
python premade_estimator.py
```
The program should output training logs and some predictions against a test
set. For example, the first line in the following output shows that the model
thinks there is a 99.6% chance that the first example in the test set is a
Sentosa. Since the test set `expected "Setosa"`, this appears to be a good
prediction.
The program should output training logs followed by some predictions against
the test set. For example, the first line in the following output shows that
the model thinks there is a 99.6% chance that the first example in the test
set is a Setosa. Since the test set `expected "Setosa"`, this appears to be
a good prediction.
``` None
...
Prediction is "Sentosa" (99.6%), expected "Setosa"
Prediction is "Setosa" (99.6%), expected "Setosa"
Prediction is "Versicolor" (99.8%), expected "Versicolor"
@ -76,12 +80,12 @@ The TensorFlow Programming Environment
We strongly recommend writing TensorFlow programs with the following APIs:
* Estimators, which represent a complete model. The Estimator API provides
methods to train the model, to judge the model's accuracy, and to generate
predictions.
* Datasets, which build a data input pipeline. The Dataset API has methods to
load and manipulate data, and feed it into your model. The Datasets API meshes
well with the Estimators API.
* @{tf.estimator$Estimators}, which represent a complete model.
The Estimator API provides methods to train the model, to judge the model's
accuracy, and to generate predictions.
* @{$get_started/datasets_quickstart$Datasets}, which build a data input
pipeline. The Dataset API has methods to load and manipulate data, and feed
it into your model. The Datasets API meshes well with the Estimators API.
## Classifying irises: an overview
@ -130,7 +134,7 @@ The following table shows three examples in the data set:
|sepal length | sepal width | petal length | petal width| species (label) |
|------------:|------------:|-------------:|-----------:|:---------------:|
| 5.1 | 3.3 | 1.7 | 0.5 | 0 (Sentosa) |
| 5.1 | 3.3 | 1.7 | 0.5 | 0 (Setosa) |
| 5.0 | 2.3 | 3.3 | 1.0 | 1 (versicolor)|
| 6.4 | 2.8 | 5.6 | 2.2 | 2 (virginica) |
@ -145,11 +149,10 @@ topology:
The following figure illustrates the features, hidden layers, and predictions
(not all of the nodes in the hidden layers are shown):
<div style="width:80%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%"
alt="A diagram of the network architecture: Inputs, 2 hidden layers, and outputs"
src="../images/iris_model.png">
src="../images/custom_estimators/full_network.png">
</div>
<div style="text-align: center">
The Model.
@ -252,9 +255,11 @@ The Dataset API can handle a lot of common cases for you. For example,
using the Dataset API, you can easily read in records from a large collection
of files in parallel and join them into a single stream.
To keep things simple in this example we are going to load the data with pandas, and build our input pipeline from this in-memory data.
To keep things simple in this example, we are going to load the data with pandas,
and build our input pipeline from this in-memory data.
Here is the input function used for training in this program:
Here is the input function used for training in this program, which is available
in [`iris_data.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py):
``` python
def train_input_fn(features, labels, batch_size):
@ -272,14 +277,14 @@ def train_input_fn(features, labels, batch_size):
## Define the Feature Columns
A [**Feature Column**](https://developers.google.com/machine-learning/glossary/#feature_columns)
is an object describing how the model should use raw input features from the
is an object describing how the model should use raw input data from the
features dictionary. When you build an Estimator model, you pass it a list of
feature columns that describes each of the features you want the model to use.
These objects are created by functions in the @{tf.feature_column} module. `tf.feature_column` methods provide many different ways to represent data.
The @{tf.feature_column} module provides many options for representing data
to the model.
For Iris, the 4 raw features are numeric values, so we'll build a list of
feature columns, to tell the Estimator model to represent each of the four
feature columns to tell the Estimator model to represent each of the four
features as 32-bit floating-point values. Therefore, the code to create the
Feature Column is simply:
@ -291,7 +296,8 @@ for key in train_x.keys():
```
Feature Columns can be far more sophisticated than those we're showing here.
<!--TODO(markdaoust) add link to feature_columns doc when it exists.-->
We detail feature columns @{$get_started/feature_columns$later on} in
Getting Started.
Now that we have the description of how we want the model to represent the raw
features, we can build the estimator.
@ -305,8 +311,7 @@ provides several pre-made classifier Estimators, including:
* @{tf.estimator.DNNClassifier}—for deep models that perform multi-class
classification.
* @{tf.estimator.DNNLinearCombinedClassifier}—for wide-n-deep models.
* @{tf.estimator.LinearClassifier}—for linear models that feed results into
binary classifiers.
* @{tf.estimator.LinearClassifier}—for classifiers based on linear models.
For the Iris problem, `tf.estimator.DNNClassifier` seems like the best choice.
Here's how we instantiated this Estimator:
@ -336,14 +341,15 @@ Train the model by calling the Estimator's `train` method as follows:
```python
# Train the Model.
classifier.train(
input_fn=lambda:train_input_fn(train_x, train_y, args.batch_size),
input_fn=lambda:iris_data.train_input_fn(train_x, train_y, args.batch_size),
steps=args.train_steps)
```
Here we wrap up our `input_fn` call in a [`lambda`](https://docs.python.org/3/tutorial/controlflow.html)
to allow the Estimator to call it, at the correct time, with no arguments.
The `steps` argument tells the method to stop training after a number of
training steps.
Here we wrap up our `input_fn` call in a
[`lambda`](https://docs.python.org/3/tutorial/controlflow.html)
to capture the arguments while providing an input function that takes no
arguments, as expected by the Estimator. The `steps` argument tells the method
to stop training after a number of training steps.
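The argument-capturing pattern does not depend on TensorFlow, so it can be shown with plain Python. In this sketch, `train_input_fn` and `call_with_no_arguments` are simplified stand-ins for the real input function and for the Estimator, and the three data rows are the examples from the Iris table.

```python
import functools


def train_input_fn(features, labels, batch_size):
    """Stand-in input function: returns the first batch as plain lists."""
    return features[:batch_size], labels[:batch_size]


def call_with_no_arguments(input_fn):
    """Mimics a framework that calls input_fn() itself, with no arguments."""
    return input_fn()


train_x = [[5.1, 3.3, 1.7, 0.5], [5.0, 2.3, 3.3, 1.0], [6.4, 2.8, 5.6, 2.2]]
train_y = [0, 1, 2]

# The lambda captures train_x, train_y and the batch size, but takes no
# arguments itself, so the framework can call it later.
batch = call_with_no_arguments(lambda: train_input_fn(train_x, train_y, 2))

# functools.partial is an equivalent way to bind the arguments up front.
same_batch = call_with_no_arguments(
    functools.partial(train_input_fn, train_x, train_y, 2))

print(batch)
```

Passing `train_input_fn(train_x, train_y, 2)` directly would not work here, because that expression calls the function immediately and hands over its result instead of a callable.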
### Evaluate the trained model
@ -354,14 +360,14 @@ model on the test data:
```python
# Evaluate the model.
eval_result = classifier.evaluate(
input_fn=lambda:eval_input_fn(test_x, test_y, args.batch_size))
input_fn=lambda:iris_data.eval_input_fn(test_x, test_y, args.batch_size))
print('\nTest set accuracy: {accuracy:0.3f}\n'.format(**eval_result))
```
Note how unlike our call to the `train` method, we did not pass the `steps`
argument to evaluate. Our `eval_input_fn` doesn't use the `repeat` method on
the dataset, so evaluation just runs to the end of the data.
Unlike our call to the `train` method, we did not pass the `steps`
argument to `evaluate`. Our `eval_input_fn` only yields a single
[epoch](https://developers.google.com/machine-learning/glossary/#epoch) of data.
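The difference between a repeating input pipeline and a single-epoch one can be sketched with Python generators. The `batches` helper below is hypothetical and only imitates the repeat-then-batch ordering; it is not the Dataset API.

```python
import itertools


def batches(examples, batch_size, repeat=False):
    """Yield batches; with repeat=True the data cycles forever."""
    source = itertools.cycle(examples) if repeat else iter(examples)
    while True:
        batch = list(itertools.islice(source, batch_size))
        if not batch:
            return  # the single epoch is exhausted
        yield batch


data = list(range(5))

# Evaluation style: one pass over the data, then the stream ends by itself.
eval_batches = list(batches(data, batch_size=2))

# Training style: the stream never ends, so the caller must limit the steps.
train_batches = list(itertools.islice(batches(data, 2, repeat=True), 4))

print(eval_batches)
print(train_batches)
```

Because the repeating stream has no natural end, training needs an explicit step count, whereas evaluation can simply run until the data runs out.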
Running this code yields the following output (or something similar):
@ -387,7 +393,8 @@ predict_x = {
}
predictions = classifier.predict(
input_fn=lambda:eval_input_fn(predict_x, batch_size=args.batch_size))
input_fn=lambda:iris_data.eval_input_fn(predict_x,
batch_size=args.batch_size))
```
The `predict` method returns a Python iterable, yielding a dictionary of
@ -401,29 +408,35 @@ for pred_dict, expec in zip(predictions, expected):
class_id = pred_dict['class_ids'][0]
probability = pred_dict['probabilities'][class_id]
print(template.format(SPECIES[class_id], 100 * probability, expec))
print(template.format(iris_data.SPECIES[class_id],
100 * probability, expec))
```
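To try the formatting loop without training a model, the sketch below feeds it hand-written prediction dictionaries. The probability values and the `template` string are assumptions chosen to match the shape of the program's printed output; only the keys `class_ids` and `probabilities` come from the loop above.

```python
SPECIES = ['Setosa', 'Versicolor', 'Virginica']

# Assumed template, reconstructed from the shape of the printed output.
template = 'Prediction is "{}" ({:.1f}%), expected "{}"'

# Hypothetical stand-ins for the dictionaries yielded by `predict`.
predictions = [
    {'class_ids': [0], 'probabilities': [0.996, 0.003, 0.001]},
    {'class_ids': [1], 'probabilities': [0.001, 0.998, 0.001]},
    {'class_ids': [2], 'probabilities': [0.001, 0.020, 0.979]},
]
expected = ['Setosa', 'Versicolor', 'Virginica']

lines = []
for pred_dict, expec in zip(predictions, expected):
    class_id = pred_dict['class_ids'][0]
    probability = pred_dict['probabilities'][class_id]
    lines.append(template.format(SPECIES[class_id], 100 * probability, expec))

print('\n'.join(lines))
```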
Running the preceding code yields the following output:
``` None
...
Prediction is "Sentosa" (99.6%), expected "Setosa"
Prediction is "Setosa" (99.6%), expected "Setosa"
Prediction is "Versicolor" (99.8%), expected "Versicolor"
Prediction is "Virginica" (97.9%), expected "Virginica"
```
## Next
Now that you've gotten started writing TensorFlow programs.
## Summary
* For more on Datasets, see the
@{$programmers_guide/datasets$Programmer's guide} and
@{tf.data$reference documentation}.
* For more on Estimators, see the
@{$programmers_guide/estimators$Programmer's guide} and
@{tf.estimator$reference documentation}.
<!--TODO(markdaoust) add links to next get_started section when it exists.-->
Pre-made Estimators are an effective way to quickly create standard models.
Now that you've gotten started writing TensorFlow programs, consider the
following material:
* @{$get_started/saving_models$Checkpoints} to learn how to save and restore
models.
* @{$get_started/datasets_quickstart$Datasets} to learn more about importing
data into your model.
* @{$get_started/custom_estimators$Creating Custom Estimators} to learn how to
write your own Estimator, customized for a particular problem.

View File

@ -15,9 +15,8 @@ This document focuses on checkpoints. For details on SavedModel, see the
## Sample code
This document relies on the same Iris classification example detailed in
<!-- TODO (barryr): fill in link when module settles down. -->
@{$premade_estimators$Getting Started with TensorFlow}.
This document relies on the same
[Iris classification example](https://github.com/tensorflow/models/blob/master/samples/core/get_started/premade_estimator.py)
detailed in @{$premade_estimators$Getting Started with TensorFlow}.
To download and access the example, invoke the following two commands:
```shell
@ -228,10 +227,12 @@ This separation will keep your checkpoints recoverable.
## Summary
Checkpoints provide an easy automatic mechanism for storing and restoring
models created by Estimators. See the @{$saved_model$Saving and Restoring}
Checkpoints provide an easy automatic mechanism for saving and restoring
models created by Estimators.
See the @{$saved_model$Saving and Restoring}
chapter of the *TensorFlow Programmer's Guide* for details on:
* Saving and restoring models created by low-level TensorFlow APIs.
* Saving and restoring models in the SavedModel format, which is a
* Saving and restoring models using low-level TensorFlow APIs.
* Exporting and importing models in the SavedModel format, which is a
language-neutral, recoverable, serialization format.