Added a "Getting Started with TensorFlow for ML Beginners" chapter to the Get Started section.

PiperOrigin-RevId: 181396430

tensorflow/docs_src/get_started/get_started_for_beginners.md (new file, 732 lines)

# Getting Started for ML Beginners

This document explains how to use machine learning to classify (categorize)
Iris flowers by species.  It dives deeply into the TensorFlow code that does
exactly that, explaining ML fundamentals along the way.

If the following list describes you, then you are in the right place:

*   You know little to nothing about machine learning.
*   You want to learn how to write TensorFlow programs.
*   You can code (at least a little) in Python.

If you are already familiar with basic machine learning concepts
but are new to TensorFlow, read
@{$premade_estimators$Getting Started with TensorFlow: for ML Experts}.

## The Iris classification problem

Imagine you are a botanist seeking an automated way to classify each
Iris flower you find.  Machine learning provides many ways to classify flowers.
For instance, a sophisticated machine learning program could classify flowers
based on photographs.  Our ambitions are more modest--we're going to classify
Iris flowers based solely on the length and width of their
[sepals](https://en.wikipedia.org/wiki/Sepal) and
[petals](https://en.wikipedia.org/wiki/Petal).

The Iris genus comprises about 300 species, but our program will classify only
the following three:

*   Iris setosa
*   Iris virginica
*   Iris versicolor

<div style="margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%"
  alt="Petal geometry compared for three iris species: Iris setosa, Iris virginica, and Iris versicolor"
  src="../images/iris_three_species.jpg">
</div>
**From left to right,
[*Iris setosa*](https://commons.wikimedia.org/w/index.php?curid=170298) (by
[Radomil](https://commons.wikimedia.org/wiki/User:Radomil), CC BY-SA 3.0),
[*Iris versicolor*](https://commons.wikimedia.org/w/index.php?curid=248095) (by
[Dlanglois](https://commons.wikimedia.org/wiki/User:Dlanglois), CC BY-SA 3.0),
and [*Iris virginica*](https://www.flickr.com/photos/33397993@N05/3352169862)
(by [Frank Mayfield](https://www.flickr.com/photos/33397993@N05), CC BY-SA
2.0).**
<p> </p>

Fortunately, someone has already created [a data set of 120 Iris
flowers](https://en.wikipedia.org/wiki/Iris_flower_data_set)
with the sepal and petal measurements.  This data set has become
one of the canonical introductions to machine learning classification problems.
(The [MNIST database](https://en.wikipedia.org/wiki/MNIST_database),
which contains handwritten digits, is another popular classification
problem.) The first 5 entries of the Iris data set
look as follows:

| Sepal length | Sepal width | Petal length | Petal width | Species |
| ---          | ---         | ---          | ---         | ---     |
| 6.4          | 2.8         | 5.6          | 2.2         | 2       |
| 5.0          | 2.3         | 3.3          | 1.0         | 1       |
| 4.9          | 2.5         | 4.5          | 1.7         | 2       |
| 4.9          | 3.1         | 1.5          | 0.1         | 0       |
| 5.7          | 3.8         | 1.7          | 0.3         | 0       |

Let's introduce some terms:

*   The last column (species) is called the
    [**label**](https://developers.google.com/machine-learning/glossary/#label);
    the first four columns are called
    [**features**](https://developers.google.com/machine-learning/glossary/#feature).
    Features are characteristics of an example, while the label is
    the thing we're trying to predict.

*   An [**example**](https://developers.google.com/machine-learning/glossary/#example)
    consists of the set of features and the label for one sample
    flower. The preceding table shows 5 examples from a data set of
    120 examples.

Each label is naturally a string (for example, "setosa"), but machine learning
typically relies on numeric values. Therefore, someone mapped each string to
a number.  Here's the representation scheme:

*   0 represents setosa
*   1 represents versicolor
*   2 represents virginica

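Because the program eventually needs to translate predicted numbers back into
species names, the sample code keeps this mapping in a plain Python list named
`SPECIES`, where each numeric label doubles as a list index.  A minimal sketch:

```python
# The numeric label is an index into this list, so decoding a
# prediction is just a list lookup.
SPECIES = ['Setosa', 'Versicolor', 'Virginica']

numeric_label = 2
print(SPECIES[numeric_label])  # -> Virginica
```
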
## Models and training

A **model** is the relationship between features
and the label.  For the Iris problem, the model defines the relationship
between the sepal and petal measurements and the Iris species.
Some simple models can be described with a few lines of algebra;
more complex machine learning models
contain such a large number of interlacing mathematical functions and
parameters that they become hard to summarize mathematically.

Could you determine the relationship between the four features and the
Iris species *without* using machine learning?  That is, could you use
traditional programming techniques (for example, a lot of conditional
statements) to create a model?  Maybe. You could play with the data set
long enough to determine the right relationships of petal and sepal
measurements to particular species.  However, a good machine learning
approach *determines the model for you*.  That is, if you feed enough
representative examples into the right machine learning model type, the
program will determine the relationship between sepals, petals, and species.

**Training** is the stage of machine learning in which the model is
gradually optimized (learned).  The Iris problem is an example
of [**supervised machine
learning**](https://developers.google.com/machine-learning/glossary/#supervised_machine_learning)
in which a model is trained from examples that contain labels.  (In
[**unsupervised machine
learning**](https://developers.google.com/machine-learning/glossary/#unsupervised_machine_learning),
the examples don't contain labels. Instead, the model typically finds
patterns among the features.)

## Get the sample program

Prior to playing with the sample code in this document, do the following:

1.  @{$install$Install TensorFlow}.
2.  If you installed TensorFlow with virtualenv or Anaconda, activate your
    TensorFlow environment.
3.  Install or upgrade pandas by issuing the following command:

        `pip install pandas`

Take the following steps to get the sample program:

1.  Clone the TensorFlow Models repository from GitHub by entering the
    following command:

        `git clone https://github.com/tensorflow/models`

2.  Change directory within that branch to the location containing the examples
    used in this document:

        `cd models/samples/core/get_started/`

In that `get_started` directory, you'll find a program
named `premade_estimator.py`.

## Run the sample program

You run TensorFlow programs as you would run any Python program. Therefore,
issue the following command from a command line to
run `premade_estimator.py`:

```bash
python premade_estimator.py
```

Running the program should output a whole bunch of information ending with
three prediction lines like the following:

```none
...
Prediction is "Setosa" (99.6%), expected "Setosa"

Prediction is "Versicolor" (99.8%), expected "Versicolor"

Prediction is "Virginica" (97.9%), expected "Virginica"
```

If the program generates errors instead of predictions, ask yourself the
following questions:

*   Did you install TensorFlow properly?
*   Are you using the correct version of TensorFlow?  The
    `premade_estimator.py` program requires at least TensorFlow v1.4.
    (See the version check below.)
*   If you installed TensorFlow with virtualenv or Anaconda, did you activate
    the environment?

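If you're unsure which TensorFlow version is active in your environment, you
can check it from Python.  A quick sketch:

```python
import tensorflow as tf

# premade_estimator.py requires TensorFlow 1.4 or later.
print(tf.__version__)  # for example: 1.4.0
```
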
## The TensorFlow programming stack

As the following illustration shows, TensorFlow
provides a programming stack consisting of multiple API layers:

<div style="margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%" src="../images/tensorflow_programming_environment.png">
</div>
**The TensorFlow Programming Environment.**
<p> </p>

As you start writing TensorFlow programs, we strongly recommend focusing on
the following two high-level APIs:

*   Estimators
*   Datasets

Although we'll grab an occasional convenience function from other APIs,
this document focuses on the preceding two APIs.

## The program itself

Thanks for your patience; let's dig into the code.
The general outline of `premade_estimator.py`--and many other TensorFlow
programs--is as follows:

*   Import and parse the data sets.
*   Create feature columns to describe the data.
*   Select the type of model.
*   Train the model.
*   Evaluate the model's effectiveness.
*   Let the trained model make predictions.

The following subsections detail each part.

### Import and parse the data sets

The Iris program requires the data from the following two .csv files:

*   `http://download.tensorflow.org/data/iris_training.csv`, which contains
    the training set.
*   `http://download.tensorflow.org/data/iris_test.csv`, which contains
    the test set.

The **training set** contains the examples that we'll use to train the model;
the **test set** contains the examples that we'll use to evaluate the trained
model's effectiveness.

The training set and test set started out as a
single data set.  Then, someone split the examples, with the majority going
into the training set and the remainder going into the test set.  Adding
examples to the training set usually builds a better model; however, adding
more examples to the test set enables us to better gauge the model's
effectiveness. Regardless of the split, the examples in the test set
must be separate from the examples in the training set.  Otherwise, you can't
accurately determine the model's effectiveness.

The `premade_estimator.py` program relies on the `load_data` function
in the adjacent [`iris_data.py`](
https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py)
file to read in and parse the training set and test set.
Here is a heavily commented version of the function:

```python
TRAIN_URL = "http://download.tensorflow.org/data/iris_training.csv"
TEST_URL = "http://download.tensorflow.org/data/iris_test.csv"

CSV_COLUMN_NAMES = ['SepalLength', 'SepalWidth',
                    'PetalLength', 'PetalWidth', 'Species']

...

def load_data(label_name='Species'):
    """Parses the CSV files in TRAIN_URL and TEST_URL."""

    # Create a local copy of the training set.
    train_path = tf.keras.utils.get_file(fname=TRAIN_URL.split('/')[-1],
                                         origin=TRAIN_URL)
    # train_path now holds the pathname: ~/.keras/datasets/iris_training.csv

    # Parse the local CSV file.
    train = pd.read_csv(filepath_or_buffer=train_path,
                        names=CSV_COLUMN_NAMES,  # list of column names
                        header=0  # ignore the first row of the CSV file.
                       )
    # train now holds a pandas DataFrame, which is a data structure
    # analogous to a table.

    # 1. Assign the DataFrame's labels (the right-most column) to train_label.
    # 2. Delete (pop) the labels from the DataFrame.
    # 3. Assign the remainder of the DataFrame to train_features.
    train_features, train_label = train, train.pop(label_name)

    # Apply the preceding logic to the test set.
    test_path = tf.keras.utils.get_file(TEST_URL.split('/')[-1], TEST_URL)
    test = pd.read_csv(test_path, names=CSV_COLUMN_NAMES, header=0)
    test_features, test_label = test, test.pop(label_name)

    # Return two (features, label) pairs.
    return (train_features, train_label), (test_features, test_label)
```

Keras is an open-source machine learning library; `tf.keras` is a TensorFlow
implementation of Keras.  The `premade_estimator.py` program only accesses
one `tf.keras` function; namely, the `tf.keras.utils.get_file` convenience
function, which copies a remote CSV file to a local file system.

The call to `load_data` returns two `(features, label)` pairs, for the training
and test sets respectively:

```python
    # Call load_data() to parse the CSV file.
    (train_feature, train_label), (test_feature, test_label) = load_data()
```

Pandas is an open-source Python library leveraged by several
TensorFlow functions.  A pandas
[**DataFrame**](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html)
is a table with named column headers and numbered rows.
The features returned by `load_data` are packed in `DataFrames`.
For example, the `test_feature` DataFrame looks as follows:

```none
    SepalLength  SepalWidth  PetalLength  PetalWidth
0           5.9         3.0          4.2         1.5
1           6.9         3.1          5.4         2.1
2           5.1         3.3          1.7         0.5
...
27          6.7         3.1          4.7         1.5
28          6.7         3.3          5.7         2.5
29          6.4         2.9          4.3         1.3
```

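A DataFrame behaves much like a dictionary of named columns, which is what
lets later code in the sample convert the features into a plain `dict` of
arrays.  A small sketch (assuming `load_data` has already run):

```python
# Each column is accessible by name; dict() turns the DataFrame into
# a {column name: column values} mapping.
print(test_feature['SepalLength'][0])  # -> 5.9, from row 0 of the table above
print(list(dict(test_feature)))        # -> ['SepalLength', 'SepalWidth', ...]
```
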
### Describe the data

A **feature column** is a data structure that tells your model
how to interpret the data in each feature.  In the Iris problem,
we want the model to interpret the data in each
feature as its literal floating-point value; that is, we want the
model to interpret an input value like 5.4 as, well, 5.4.  However,
in other machine learning problems, it is often desirable to interpret
data less literally.  Using feature columns to
interpret data is such a rich topic that we devote an entire
@{$feature_columns$document} to it.

From a code perspective, you build a list of `feature_column` objects by
calling functions from the @{tf.feature_column} module. Each object describes
an input to the model. To tell the model to interpret data as a floating-point
value, call @{tf.feature_column.numeric_column}.  In `premade_estimator.py`,
all four features should be interpreted as literal floating-point values, so
the code to create a feature column looks as follows:

```python
# Create feature columns for all features.
my_feature_columns = []
for key in train_x.keys():
    my_feature_columns.append(tf.feature_column.numeric_column(key=key))
```

Here is a less elegant, but possibly clearer, alternative way to
encode the preceding block:

```python
my_feature_columns = [
    tf.feature_column.numeric_column(key='SepalLength'),
    tf.feature_column.numeric_column(key='SepalWidth'),
    tf.feature_column.numeric_column(key='PetalLength'),
    tf.feature_column.numeric_column(key='PetalWidth')
]
```

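To see what "interpreting data less literally" can look like, here is a
hypothetical variant (not used in `premade_estimator.py`) that buckets sepal
length into three coarse ranges instead of feeding the raw value to the model:

```python
# A sketch only: bucketize SepalLength into the ranges
# (-inf, 5.0), [5.0, 6.0), and [6.0, +inf).
sepal_length = tf.feature_column.numeric_column(key='SepalLength')
sepal_length_buckets = tf.feature_column.bucketized_column(
    source_column=sepal_length,
    boundaries=[5.0, 6.0])
```
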
### Select the type of model

We need to select the kind of model that will be trained.
Lots of model types exist; picking the ideal type takes experience.
We've selected a neural network to solve the Iris problem.  [**Neural
networks**](https://developers.google.com/machine-learning/glossary/#neural_network)
can find complex relationships between features and the label.
A neural network is a highly structured graph, organized into one or more
[**hidden layers**](https://developers.google.com/machine-learning/glossary/#hidden_layer).
Each hidden layer consists of one or more
[**neurons**](https://developers.google.com/machine-learning/glossary/#neuron).
There are several categories of neural networks.
We'll be using a [**fully connected neural
network**](https://developers.google.com/machine-learning/glossary/#fully_connected_layer),
which means that the neurons in one layer take inputs from *every* neuron in
the previous layer.  For example, the following figure illustrates a
fully connected neural network consisting of three hidden layers:

*   The first hidden layer contains four neurons.
*   The second hidden layer contains three neurons.
*   The third hidden layer contains two neurons.

<div style="margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%" src="../images/simple_dnn.svg">
</div>
**A neural network with three hidden layers.**
<p> </p>

To specify a model type, instantiate an
[**Estimator**](https://developers.google.com/machine-learning/glossary/#Estimators)
class.  TensorFlow provides two categories of Estimators:

*   [**pre-made
    Estimators**](https://developers.google.com/machine-learning/glossary/#pre-made_Estimator),
    which someone else has already written for you.
*   [**custom
    Estimators**](https://developers.google.com/machine-learning/glossary/#custom_estimator),
    which you must code yourself, at least partially.

To implement a neural network, the `premade_estimator.py` program uses
a pre-made Estimator named @{tf.estimator.DNNClassifier}.  This Estimator
builds a neural network that classifies examples.  The following call
instantiates `DNNClassifier`:

```python
    classifier = tf.estimator.DNNClassifier(
        feature_columns=my_feature_columns,
        hidden_units=[10, 10],
        n_classes=3)
```

Use the `hidden_units` parameter to define the number of neurons
in each hidden layer of the neural network.  Assign this parameter
a list. For example:

```python
        hidden_units=[10, 10],
```

The length of the list assigned to `hidden_units` identifies the number of
hidden layers (2, in this case).
Each value in the list represents the number of neurons in a particular
hidden layer (10 in the first hidden layer and 10 in the second hidden layer).
To change the number of hidden layers or neurons, simply assign a different
list to the `hidden_units` parameter.

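For instance, a hypothetical three-layer variant (not in the sample code)
would look like this:

```python
        hidden_units=[30, 20, 10],  # 3 hidden layers: 30, 20, and 10 neurons
```
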
The ideal number of hidden layers and neurons depends on the problem
and the data set. Like many aspects of machine learning,
picking the ideal shape of the neural network requires some mixture
of knowledge and experimentation.
As a rule of thumb, increasing the number of hidden layers and neurons
*typically* creates a more powerful model, which requires more data to
train effectively.

The `n_classes` parameter specifies the number of possible values that the
neural network can predict.  Since the Iris problem classifies 3 Iris species,
we set `n_classes` to 3.

The constructor for `tf.estimator.DNNClassifier` takes an optional argument
named `optimizer`, which our sample code chose not to specify.  The
[**optimizer**](https://developers.google.com/machine-learning/glossary/#optimizer)
controls how the model will train.  As you develop more expertise in machine
learning, optimizers and the
[**learning
rate**](https://developers.google.com/machine-learning/glossary/#learning_rate)
will become very important.

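To preview what that looks like, here is a hypothetical variant of the
constructor call (not in `premade_estimator.py`) that sets the optimizer and
learning rate explicitly:

```python
    # A sketch only: explicitly choose the optimizer instead of the default.
    classifier = tf.estimator.DNNClassifier(
        feature_columns=my_feature_columns,
        hidden_units=[10, 10],
        n_classes=3,
        optimizer=tf.train.AdagradOptimizer(learning_rate=0.1))
```
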
### Train the model

Instantiating a `tf.estimator.DNNClassifier` creates a framework for learning
the model. Basically, we've wired a network but haven't yet let data flow
through it. To train the neural network, call the Estimator object's `train`
method. For example:

```python
    classifier.train(
        input_fn=lambda:train_input_fn(train_feature, train_label, args.batch_size),
        steps=args.train_steps)
```

The `steps` argument tells `train` to stop training after the specified
number of iterations.  Increasing `steps` increases the amount of time
the model will train.  Counter-intuitively, training a model longer
does not guarantee a better model.  The default value of `args.train_steps`
is 1000.  The number of steps to train is a
[**hyperparameter**](https://developers.google.com/machine-learning/glossary/#hyperparameter)
you can tune. Choosing the right number of steps usually
requires both experience and experimentation.

The `input_fn` parameter identifies the function that supplies the
training data.  The call to the `train` method indicates that the
`train_input_fn` function will supply the training data.  Here's that
function's signature:

```python
def train_input_fn(features, labels, batch_size):
```

We're passing the following arguments to `train_input_fn`:

*   `train_feature` is a Python dictionary in which:
    *   Each key is the name of a feature.
    *   Each value is an array containing the values for each example in the
        training set.
*   `train_label` is an array containing the values of the label for every
    example in the training set.
*   `args.batch_size` is an integer defining the [**batch
    size**](https://developers.google.com/machine-learning/glossary/#batch_size).

The `train_input_fn` function relies on the **Dataset API**. This is a
high-level TensorFlow API for reading data and transforming it into a form
that the `train` method requires.  The following call converts the
input features and labels into a `tf.data.Dataset` object, which is the base
class of the Dataset API:

```python
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
```

The `tf.data.Dataset` class provides many useful functions for preparing
examples for training. The following line calls three of those functions:

```python
    dataset = dataset.shuffle(buffer_size=1000).repeat(count=None).batch(batch_size)
```

Training works best if the training examples are in
random order.  To randomize the examples, call
`tf.data.Dataset.shuffle`.  Setting the `buffer_size` to a value
larger than the number of examples (120) ensures that the data will
be well shuffled.

During training, the `train` method typically processes the
examples multiple times.  Calling the
`tf.data.Dataset.repeat` method without any arguments ensures
that the `train` method has an infinite supply of (now shuffled)
training set examples.

The `train` method processes a
[**batch**](https://developers.google.com/machine-learning/glossary/#batch)
of examples at a time.
The `tf.data.Dataset.batch` method creates a batch by
concatenating multiple examples.
This program sets the default [**batch
size**](https://developers.google.com/machine-learning/glossary/#batch_size)
to 100, meaning that the `batch` method will concatenate groups of
100 examples.  The ideal batch size depends on the problem.  As a rule
of thumb, smaller batch sizes usually enable the `train` method to train
the model faster at the expense (sometimes) of accuracy.

The following `return` statement passes a batch of examples back to
the caller (the `train` method):

```python
   return dataset.make_one_shot_iterator().get_next()
```

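Assembled from the pieces above, the whole input function is quite short.
A sketch:

```python
def train_input_fn(features, labels, batch_size):
    """An input function for training."""
    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))

    # Shuffle, repeat, and batch the examples.
    dataset = dataset.shuffle(buffer_size=1000).repeat(count=None).batch(batch_size)

    # Return the read end of the pipeline.
    return dataset.make_one_shot_iterator().get_next()
```
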
### Evaluate the model

**Evaluating** means determining how effectively the model makes
predictions.  To determine the Iris classification model's effectiveness,
pass some sepal and petal measurements to the model and ask the model
to predict what Iris species they represent. Then compare the model's
predictions against the actual labels.  For example, a model that picked
the correct species on half the input examples would have an
[accuracy](https://developers.google.com/machine-learning/glossary/#accuracy)
of 0.5.  The following suggests a more effective model:

<table>
  <tr>
    <th style="background-color:darkblue" colspan="5">
       Test Set</th>
  </tr>
  <tr>
    <th colspan="4">Features</th>
    <th colspan="1">Label</th>
    <th colspan="1">Prediction</th>
  </tr>
  <tr> <td>5.9</td> <td>3.0</td> <td>4.3</td> <td>1.5</td> <td>1</td>
          <td style="background-color:green">1</td></tr>
  <tr> <td>6.9</td> <td>3.1</td> <td>5.4</td> <td>2.1</td> <td>2</td>
          <td style="background-color:green">2</td></tr>
  <tr> <td>5.1</td> <td>3.3</td> <td>1.7</td> <td>0.5</td> <td>0</td>
          <td style="background-color:green">0</td></tr>
  <tr> <td>6.0</td> <td>3.4</td> <td>4.5</td> <td>1.6</td> <td>1</td>
          <td style="background-color:red">2</td></tr>
  <tr> <td>5.5</td> <td>2.5</td> <td>4.0</td> <td>1.3</td> <td>1</td>
          <td style="background-color:green">1</td></tr>
</table>
**A model that is 80% accurate (4 correct predictions out of 5).**
<p> </p>

To evaluate a model's effectiveness, each Estimator provides an `evaluate`
method.  The `premade_estimator.py` program calls `evaluate` as follows:

```python
# Evaluate the model.
eval_result = classifier.evaluate(
    input_fn=lambda:eval_input_fn(test_x, test_y, args.batch_size))

print('\nTest set accuracy: {accuracy:0.3f}\n'.format(**eval_result))
```

The call to `classifier.evaluate` is similar to the call to `classifier.train`.
The biggest difference is that `classifier.evaluate` must get its examples
from the test set rather than the training set.  In other words, to
fairly assess a model's effectiveness, the examples used to
*evaluate* a model must be different from the examples used to *train*
the model.  The `eval_input_fn` function serves a batch of examples from
the test set.  Here's the `eval_input_fn` function:

```python
def eval_input_fn(features, labels=None, batch_size=None):
    """An input function for evaluation or prediction."""
    if labels is None:
        # No labels, use only features.
        inputs = features
    else:
        inputs = (features, labels)

    # Convert the inputs to a tf.data.Dataset object.
    dataset = tf.data.Dataset.from_tensor_slices(inputs)

    # Batch the examples.
    assert batch_size is not None, "batch_size must not be None"
    dataset = dataset.batch(batch_size)

    # Return the read end of the pipeline.
    return dataset.make_one_shot_iterator().get_next()
```

In brief, `eval_input_fn` does the following when called by
`classifier.evaluate`:

1.  Converts the features and labels from the test set to a `tf.data.Dataset`
    object.
2.  Creates a batch of test set examples.  (There's no need to shuffle
    or repeat the test set examples.)
3.  Returns that batch of test set examples to `classifier.evaluate`.

Running this code yields the following output (or something close to it):

```none
Test set accuracy: 0.967
```

An accuracy of 0.967 implies that our trained model correctly classified 29
out of the 30 examples in the test set.

### Predicting

We've now trained a model and "proven" that it is good--but not
perfect--at classifying Iris species.  Now let's use the trained
model to make some predictions on [**unlabeled
examples**](https://developers.google.com/machine-learning/glossary/#unlabeled_example);
that is, on examples that contain features but not a label.

In real life, the unlabeled examples could come from lots of different
sources, including apps, CSV files, and data feeds.  For now, we're simply
going to manually provide the following three unlabeled examples:

```python
    predict_x = {
        'SepalLength': [5.1, 5.9, 6.9],
        'SepalWidth': [3.3, 3.0, 3.1],
        'PetalLength': [1.7, 4.2, 5.4],
        'PetalWidth': [0.5, 1.5, 2.1],
    }
```

Every Estimator provides a `predict` method, which `premade_estimator.py`
calls as follows:

```python
predictions = classifier.predict(
    input_fn=lambda:eval_input_fn(predict_x, batch_size=args.batch_size))
```

As with the `evaluate` method, our `predict` method also gathers examples
from the `eval_input_fn` function.

When doing predictions, we're *not* passing labels to `eval_input_fn`.
Therefore, `eval_input_fn` does the following:

1.  Converts the features from the 3-element manual set we just created
    into a `tf.data.Dataset` object.
2.  Creates a batch of 3 examples from that manual set.
3.  Returns that batch of examples to `classifier.predict`.

The `predict` method returns a Python iterable, yielding a dictionary of
prediction results for each example.  This dictionary contains several keys.
The `probabilities` key holds a list of three floating-point values,
each representing the probability that the input example is a particular
Iris species.  For example, consider the following `probabilities` list:

```none
'probabilities': array([  1.19127117e-08,   3.97069454e-02,   9.60292995e-01])
```

The preceding list indicates:

*   A negligible chance of the Iris being Setosa.
*   A 3.97% chance of the Iris being Versicolor.
*   A 96.0% chance of the Iris being Virginica.

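Note that the index of the largest probability is exactly the numeric label
from the encoding scheme introduced earlier.  A quick sketch illustrating the
connection:

```python
import numpy as np

probs = np.array([1.19127117e-08, 3.97069454e-02, 9.60292995e-01])
print(probs.argmax())  # -> 2, the index of the most probable class (Virginica)
```
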
The `class_ids` key holds a one-element array that identifies the most
probable species.  For example:

```none
'class_ids': array([2])
```

The number `2` corresponds to Virginica.  The following code iterates
through the returned `predictions` to report on each prediction:

```python
# SPECIES maps class ids to species names; expected holds the true
# species names for the three manually supplied examples.
for pred_dict, expec in zip(predictions, expected):
    template = ('\nPrediction is "{}" ({:.1f}%), expected "{}"')

    class_id = pred_dict['class_ids'][0]
    probability = pred_dict['probabilities'][class_id]
    print(template.format(SPECIES[class_id], 100 * probability, expec))
```

Running the program yields the following output:

```none
...
Prediction is "Setosa" (99.6%), expected "Setosa"

Prediction is "Versicolor" (99.8%), expected "Versicolor"

Prediction is "Virginica" (97.9%), expected "Virginica"
```

## Summary

<!--TODO(barryr): When MLCC is released, add pointers to relevant sections.-->
This document provides a short introduction to machine learning.

Because `premade_estimator.py` relies on high-level APIs, much of the
mathematical complexity in machine learning is hidden.
If you intend to become more proficient in machine learning, we recommend
ultimately learning more about [**gradient
descent**](https://developers.google.com/machine-learning/glossary/#gradient_descent),
batching, and neural networks.

We recommend reading the @{$feature_columns$Feature Columns} document next,
which explains how to represent different kinds of data in machine learning.