Move docs for Python inference into guide/inference.md, and restructure that page to organize the load/run steps based on language.

PiperOrigin-RevId: 259778674
2019-07-24 11:16:56 -07:00
commit b3aafbda35
parent 3198b9be2e
3 changed files with 253 additions and 233 deletions
--- a/tensorflow/lite/g3doc/convert/python_api.md
+++ b/tensorflow/lite/g3doc/convert/python_api.md
@ -1,9 +1,12 @@
 # Converter Python API guide
-This page provides examples on how to use the TensorFlow Lite Converter and the
+This page describes how to convert TensorFlow models into the TensorFlow Lite
-TensorFlow Lite interpreter using the Python API.
+format using the TensorFlow Lite Converter Python API.
-Note: These docs describe the converter in the TensorFlow nightly release,
+If you're looking for information about how to run a TensorFlow Lite model,
 see [TensorFlow Lite inference](../guide/inference.md).
 Note: This page describes the converter in the TensorFlow nightly release,
 installed using `pip install tf-nightly`. For docs describing older versions
 reference ["Converting models from TensorFlow 1.12"](#pre_tensorflow_1.12).
@ -20,13 +23,12 @@ be targeted to devices with mobile.
 ## API
 The API for converting TensorFlow models to TensorFlow Lite is
-`tf.lite.TFLiteConverter`. The API for calling the Python interpreter is
+`tf.lite.TFLiteConverter`, which provides class methods based on the original
-`tf.lite.Interpreter`.
+format of the model. For example, `TFLiteConverter.from_session()` is available
 for GraphDefs, `TFLiteConverter.from_saved_model()` is available for
 SavedModels, and `TFLiteConverter.from_keras_model_file()` is available for
 `tf.Keras` files.
 `TFLiteConverter` provides class methods based on the original format of the
 model. `TFLiteConverter.from_session()` is available for GraphDefs.
 `TFLiteConverter.from_saved_model()` is available for SavedModels.
 `TFLiteConverter.from_keras_model_file()` is available for `tf.Keras` files.
 Example usages for simple floating-point models are shown in
 [Basic Examples](#basic). Example usages for more complex models are shown in
 [Complex Examples](#complex).
@ -177,65 +179,6 @@ with tf.Session() as sess:
  open("converted_model.tflite", "wb").write(tflite_model)
 ```
 ## TensorFlow Lite Python interpreter <a name="interpreter"></a>
 ### Using the interpreter from a model file <a name="interpreter_file"></a>
 The following example shows how to use the TensorFlow Lite Python interpreter
 when provided a TensorFlow Lite FlatBuffer file. The example also demonstrates
 how to run inference on random input data. Run
 `help(tf.lite.Interpreter)` in the Python terminal to get detailed
 documentation on the interpreter.
 ```python
 import numpy as np
 import tensorflow as tf
 # Load TFLite model and allocate tensors.
 interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
 interpreter.allocate_tensors()
 # Get input and output tensors.
 input_details = interpreter.get_input_details()
 output_details = interpreter.get_output_details()
 # Test model on random input data.
 input_shape = input_details[0]['shape']
 input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
 interpreter.set_tensor(input_details[0]['index'], input_data)
 interpreter.invoke()
 # The function `get_tensor()` returns a copy of the tensor data.
 # Use `tensor()` in order to get a pointer to the tensor.
 output_data = interpreter.get_tensor(output_details[0]['index'])
 print(output_data)
 ```
 ### Using the interpreter from model data <a name="interpreter_data"></a>
 The following example shows how to use the TensorFlow Lite Python interpreter
 when starting with the TensorFlow Lite Flatbuffer model previously loaded. This
 example shows an end-to-end use case, starting from building the TensorFlow
 model.
 ```python
 import numpy as np
 import tensorflow as tf
 img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3))
 const = tf.constant([1., 2., 3.]) + tf.constant([1., 4., 4.])
 val = img + const
 out = tf.identity(val, name="out")
 with tf.Session() as sess:
  converter = tf.lite.TFLiteConverter.from_session(sess, [img], [out])
  tflite_model = converter.convert()
 # Load TFLite model and allocate tensors.
 interpreter = tf.lite.Interpreter(model_content=tflite_model)
 interpreter.allocate_tensors()
 ```
 ## Additional instructions
--- a/tensorflow/lite/g3doc/guide/get_started.md
+++ b/tensorflow/lite/g3doc/guide/get_started.md
@ -4,22 +4,27 @@ TensorFlow Lite provides all the tools you need to convert and run TensorFlow
 models on mobile, embedded, and IoT devices. The following guide walks through
 each step of the developer workflow and provides links to further instructions.
 [TOC]
 ## 1. Choose a model
 <a id="1_choose_a_model"></a>
 TensorFlow Lite allows you to run TensorFlow models on a wide range of devices.
 A TensorFlow model is a data structure that contains the logic and knowledge of
 a machine learning network trained to solve a particular problem.
 There are many ways to obtain a TensorFlow model, from using pre-trained models
-to training your own. To use a model with TensorFlow Lite it must be converted
+to training your own.
-into a special format. This is explained in section 2,
+
-[Convert the model](#2_convert_the_model_format).
+To use a model with TensorFlow Lite, you must convert a
 full TensorFlow model into the TensorFlow Lite format—you
 cannot create or train a model using TensorFlow Lite. So you must start with a
 regular TensorFlow model, and then
 [convert the model](#2_convert_the_model_format).
 Note: TensorFlow Lite supports a limited subset of TensorFlow operations, so not
 all models can be converted. For details, read about the
 [TensorFlow Lite operator compatibility](ops_compatibility.md).
 Note: Not all TensorFlow models will work with TensorFlow Lite, since the
 interpreter supports a limited subset of TensorFlow operations. See section 2,
 [Convert the model](#2_convert_the_model_format) to learn about compatibility.
 ### Use a pre-trained model
@ -60,35 +65,37 @@ flowers with TensorFlow</a> codelab.
 ### Train a custom model
 If you have designed and trained your own TensorFlow model, or you have trained
-a model obtained from another source, you should convert it to the TensorFlow
+a model obtained from another source, you must
-Lite format before use.
+[convert it to the TensorFlow Lite format](#2_convert_the_model_format).
 ## 2. Convert the model
 <a id="2_convert_the_model_format"></a>
-TensorFlow Lite is designed to execute models efficiently on devices. Some of
+TensorFlow Lite is designed to execute models efficiently on mobile and other
 embedded devices with limited compute and memory resources. Some of
 this efficiency comes from the use of a special format for storing models.
 TensorFlow models must be converted into this format before they can be used by
 TensorFlow Lite.
 Converting models reduces their file size and introduces optimizations that do
-not affect accuracy. Developers can opt to further reduce file size and increase
+not affect accuracy. The TensorFlow Lite converter provides options
-speed of execution in exchange for some trade-offs. You can use the TensorFlow
+that allow you to further reduce file size and increase speed of execution, with
-Lite converter to choose which optimizations to apply.
+some trade-offs.
 Note: TensorFlow Lite supports a limited subset of TensorFlow operations, so not
 all models can be converted. For details, read about the
 [TensorFlow Lite operator compatibility](ops_compatibility.md).
 TensorFlow Lite supports a limited subset of TensorFlow operations, so not all
 models can be converted. See [Ops compatibility](#ops-compatibility) for more
 information.
 ### TensorFlow Lite converter
-The [TensorFlow Lite converter](../convert) is a tool that converts trained
+The [TensorFlow Lite converter](../convert) is a tool available as a Python API
-TensorFlow models into the TensorFlow Lite format. It can also introduce
+that converts trained TensorFlow models into the TensorFlow Lite format. It can
-optimizations, which are covered in section 4,
+also introduce optimizations, which are covered in section 4,
 [Optimize your model](#4_optimize_your_model_optional).
-The converter is available as a Python API. The following example shows a
+The following example shows a
 TensorFlow `SavedModel` being converted into the TensorFlow Lite format:
 ```python
@ -128,9 +135,9 @@ performance or reduce file size. This is covered in section 4,
 ### Ops compatibility
-TensorFlow Lite currently supports a [limited subset](ops_compatibility.md) of
+TensorFlow Lite currently supports a [limited subset of TensorFlow
-TensorFlow operations. The long term goal is for all TensorFlow operations to be
+operations](ops_compatibility.md). The long term goal is for all TensorFlow
-supported.
+operations to be supported.
 If the model you wish to convert contains unsupported operations, you can use
 [TensorFlow Select](ops_select.md) to include operations from TensorFlow. This
--- a/tensorflow/lite/g3doc/guide/inference.md
+++ b/tensorflow/lite/g3doc/guide/inference.md
@ -1,91 +1,104 @@
 # TensorFlow Lite inference
 The term *inference* refers to the process of executing a TensorFlow Lite model
-on-device in order to make predictions based on input data. Inference is the
+on-device in order to make predictions based on input data. To perform an
-final step in using the model on-device.
+inference with a TensorFlow Lite model, you must run it through an
 *interpreter*. The TensorFlow Lite interpreter is designed to be lean and fast.
 The interpreter uses a static graph ordering and a custom (less-dynamic) memory
 allocator to ensure minimal load, initialization, and execution latency.
-Inference for TensorFlow Lite models is run through an interpreter. The
+This page describes how to access the TensorFlow Lite interpreter and
-TensorFlow Lite interpreter is designed to be lean and fast. The interpreter
+perform an inference using C++, Java, and Python, plus links to other resources
-uses a static graph ordering and a custom (less-dynamic) memory allocator to
+for each [supported platform](#supported-platforms).
 ensure minimal load, initialization, and execution latency.
-This document outlines the various APIs for the interpreter, along with the
+[TOC]
 [supported platforms](#supported-platforms).
-### Important Concepts
+## Important concepts
-TensorFlow Lite inference on device typically follows the following steps.
+TensorFlow Lite inference typically follows the following steps:
-1. **Loading a Model**
+1. **Loading a model**
-   The user loads the `.tflite` model into memory which contains the model's
+   You must load the `.tflite` model into memory, which contains the model's
   execution graph.
-1. **Transforming Data**
+1. **Transforming data**
   Input data acquired by the user generally may not match the input data format
   expected by the model. For example, a user may need to resize an image or change
   the image format to be used by the model.
-1. **Running Inference**
+   Raw input data for the model generally does not match the input data format
   expected by the model. For example, you might need to resize an image or
   change the image format to be compatible with the model.
-   This step involves using the API to execute the model. It involves a few
+1. **Running inference**
   steps such as building the interpreter, and allocating tensors as explained
   in detail in [Running a Model](#running_a_model).
-1. **Interpreting Output**
+   This step involves using the TensorFlow Lite API to execute the model. It
   involves a few steps such as building the interpreter, and allocating
   tensors, as described in the following sections.
-   The user retrieves results from model inference and interprets the tensors in
+1. **Interpreting output**
   a meaningful way to be used in the application.
-   For example, a model may only return a list of probabilities. It is up to the
+   When you receive results from the model inference, you must interpret the
-   application developer to meaningully map them to relevant categories and
+   tensors in a meaningful way that's useful in your application.
   present it to their user.
-### Supported Platforms
+   For example, a model might return only a list of probabilities. It's up to
   you to map the probabilities to relevant categories and present them to your
   end-user.
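
The "Interpreting output" step above can be sketched in Python as follows. This is a minimal illustration that uses only the standard library; the `top_k_labels` helper, the `labels` list, and the probability values are hypothetical and are not part of the TensorFlow Lite API.

```python
from typing import List, Sequence, Tuple


def top_k_labels(probabilities: Sequence[float],
                 labels: Sequence[str],
                 k: int = 3) -> List[Tuple[str, float]]:
    """Return the k most probable (label, probability) pairs, best first."""
    if len(probabilities) != len(labels):
        raise ValueError("expected exactly one label per probability")
    # Pair each label with its score, then sort by score in descending order.
    ranked = sorted(zip(labels, probabilities),
                    key=lambda pair: pair[1],
                    reverse=True)
    return ranked[:k]


# Hypothetical model output and label list, for illustration only.
labels = ["cat", "dog", "bird"]
probabilities = [0.10, 0.85, 0.05]
print(top_k_labels(probabilities, labels, k=2))
```

In a real application the probabilities would come from the output tensor and the labels from the label file that ships with the model.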
 ## Supported platforms
 TensorFlow inference APIs are provided for most common mobile/embedded platforms
-such as Android, iOS and Linux.
+such as Android, iOS and Linux, in multiple programming languages.
-#### Android
+In most cases, the API design reflects a preference for performance over ease of
 use. TensorFlow Lite is designed for fast inference on small devices, so it
 should be no surprise that the APIs try to avoid unnecessary copies at the
 expense of convenience. Similarly, consistency with TensorFlow APIs was not an
 explicit goal and some variance between languages is to be expected.
 Across all libraries, the TensorFlow Lite API enables you to load models,
 feed inputs, and retrieve inference outputs.
 ### Android
 On Android, TensorFlow Lite inference can be performed using either Java or C++
 APIs. The Java APIs provide convenience and can be used directly within your
 Android Activity classes. The C++ APIs offer more flexibility and speed, but may
 require writing JNI wrappers to move data between Java and C++ layers.
-Visit the [Android quickstart](android.md) for a tutorial and example code.
+See below for details about using C++ and Java, or
 follow the [Android quickstart](android.md) for a tutorial and example code.
-#### iOS
+### iOS
-TensorFlow Lite provides native iOS libraries written in
+On iOS, TensorFlow Lite is available with native iOS libraries written in
 [Swift](https://www.tensorflow.org/code/tensorflow/lite/experimental/swift)
 and
 [Objective-C](https://www.tensorflow.org/code/tensorflow/lite/experimental/objc).
-Visit the [iOS quickstart](ios.md) for a tutorial and example code.
+This page doesn't include a discussion about these languages, so you should
 refer to the [iOS quickstart](ios.md) for a tutorial and example code.
-#### Linux
+### Linux
-On Linux platforms such as [Raspberry Pi](build_rpi.md), TensorFlow Lite C++
+
-and Python APIs can be used to run inference.
+On Linux platforms (including [Raspberry Pi](build_rpi.md)), you can run
 inferences using TensorFlow Lite APIs available in C++ and Python, as shown
 in the following sections.
-## API Guides
+## Load and run a model in C++
-TensorFlow Lite provides programming APIs in C++, Java and Python, with
+Running a TensorFlow Lite model with C++ involves a few simple steps:
 experimental bindings for several other languages (C, Swift, Objective-C). In
 most cases, the API design reflects a preference for performance over ease of
 use. TensorFlow Lite is designed for fast inference on small devices so it
 should be no surprise that the APIs try to avoid unnecessary copies at the
 expense of convenience. Similarly, consistency with TensorFlow APIs was not an
 explicit goal and some variance is to be expected.
-There is also a [Python API for TensorFlow Lite](../convert/python_api.md).
+  1. Load the model into memory as a `FlatBufferModel`.
  2. Build an `Interpreter` based on an existing `FlatBufferModel`.
  3. Set input tensor values. (Optionally resize input tensors if the
     predefined sizes are not desired.)
  4. Invoke inference.
  5. Read output tensor values.
-### Loading a Model
+The [`FlatBufferModel`](
-
+https://www.tensorflow.org/lite/api_docs/cc/class/tflite/flat-buffer-model.html)
-#### C++
+class encapsulates a TensorFlow Lite model and you can
-The `FlatBufferModel` class encapsulates a model and can be built in a couple of
+build it in a couple of different ways, depending on where the model is stored:
 slightly different ways depending on where the model is stored:
 ```c++
 class FlatBufferModel {
@ -104,72 +117,36 @@ class FlatBufferModel {
 };
 ```
-```c++
+Note: If TensorFlow Lite detects the presence of the [Android NNAPI](
-tflite::FlatBufferModel model(path_to_model);
+https://developer.android.com/ndk/guides/neuralnetworks), it will
-```
+automatically try to use shared memory to store the `FlatBufferModel`.
-Note that if TensorFlow Lite detects the presence of Android's NNAPI it will
+Now that you have the model as a `FlatBufferModel` object, you can execute it
-automatically try to use shared memory to store the FlatBufferModel.
+with an [`Interpreter`](
 https://www.tensorflow.org/lite/api_docs/cc/class/tflite/interpreter.html).
 A single `FlatBufferModel` can be used
 simultaneously by more than one `Interpreter`.
-#### Java
+Caution: The `FlatBufferModel` object must remain valid until
 all instances of `Interpreter` using it have been destroyed.
-TensorFlow Lite's Java API supports on-device inference and is provided as an
+The important parts of the `Interpreter` API are shown in the
-Android Studio Library that allows loading models, feeding inputs, and
+code snippet below. It should be noted that:
 retrieving inference outputs.
 The `Interpreter` class drives model inference with TensorFlow Lite. In
 most of the cases, this is the only class an app developer will need.
 The `Interpreter` can be initialized with a model file using the constructor:
 ```java
 public Interpreter(@NotNull File modelFile);
 ```
 or with a `MappedByteBuffer`:
 ```java
 public Interpreter(@NotNull MappedByteBuffer mappedByteBuffer);
 ```
 In both cases a valid TensorFlow Lite model must be provided or an
 `IllegalArgumentException` will be thrown. If a `MappedByteBuffer` is used to
 initialize an Interpreter, it should remain unchanged for the whole lifetime of
 the `Interpreter`.
 ### Running a Model {#running_a_model}
 #### C++
 Running a model involves a few simple steps:
  * Build an `Interpreter` based on an existing `FlatBufferModel`
  * Optionally resize input tensors if the predefined sizes are not desired.
  * Set input tensor values
  * Invoke inference
  * Read output tensor values
 The important parts of public interface of the `Interpreter` are provided
 below. It should be noted that:
  * Tensors are represented by integers, in order to avoid string comparisons
    (and any fixed dependency on string libraries).
  * An interpreter must not be accessed from concurrent threads.
  * Memory allocation for input and output tensors must be triggered
-    by calling AllocateTensors() right after resizing tensors.
+    by calling `AllocateTensors()` right after resizing tensors.
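
The Python interpreter follows the same resize-then-allocate ordering, using `resize_tensor_input()` and `allocate_tensors()`. The sketch below shows only that ordering. The `resize_and_allocate` helper and the `RecordingInterpreter` stand-in are hypothetical, added so the example runs without TensorFlow installed; with a real `tf.lite.Interpreter` you would pass the interpreter object instead.

```python
def resize_and_allocate(interpreter, input_index, new_shape):
    """Resize one input tensor, then allocate tensors before any data is set."""
    interpreter.resize_tensor_input(input_index, new_shape)
    # Allocation is triggered right after the resize, as the notes above require.
    interpreter.allocate_tensors()


class RecordingInterpreter:
    """Hypothetical stand-in that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def resize_tensor_input(self, input_index, tensor_size):
        self.calls.append(("resize_tensor_input", input_index, tuple(tensor_size)))

    def allocate_tensors(self):
        self.calls.append(("allocate_tensors",))


fake = RecordingInterpreter()
resize_and_allocate(fake, 0, [4, 64, 64, 3])
print(fake.calls)
```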
-In order to run the inference model in TensorFlow Lite, one has to load the
+The simplest usage of TensorFlow Lite with C++ looks like this:
 model into a `FlatBufferModel` object which then can be executed by an
 `Interpreter`.  The `FlatBufferModel` needs to remain valid for the whole
 lifetime of the `Interpreter`, and a single `FlatBufferModel` can be
 simultaneously used by more than one `Interpreter`. In concrete terms, the
 `FlatBufferModel` object must be created before any `Interpreter` objects that
 use it, and must be kept around until they have all been destroyed.
 The simplest usage of TensorFlow Lite will look like this:
 ```c++
-tflite::FlatBufferModel model(path_to_model);
+// Load the model
 std::unique_ptr<tflite::FlatBufferModel> model =
    tflite::FlatBufferModel::BuildFromFile(filename);
 // Build the interpreter
 tflite::ops::builtin::BuiltinOpResolver resolver;
 std::unique_ptr<tflite::Interpreter> interpreter;
 tflite::InterpreterBuilder(*model, resolver)(&interpreter);
@ -185,9 +162,40 @@ interpreter->Invoke();
 float* output = interpreter->typed_output_tensor<float>(0);
 ```
-#### Java
+For more example code, see [`minimal.cc`](
 https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/examples/minimal/minimal.cc)
 and [`label_image.cc`](
 https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/examples/label_image/label_image.cc).
-The simplest usage of Tensorflow Lite Java API looks like this:
+
 ## Load and run a model in Java
 The Java API for running an inference with TensorFlow Lite is primarily designed
 for use with Android, so it's available as an Android library dependency:
 `org.tensorflow:tensorflow-lite`.
 In Java, you'll use the `Interpreter` class to load a model and drive model
 inference. In many cases, this may be the only API you need.
 You can initialize an `Interpreter` using a `.tflite` file:
 ```java
 public Interpreter(@NotNull File modelFile);
 ```
 Or with a `MappedByteBuffer`:
 ```java
 public Interpreter(@NotNull MappedByteBuffer mappedByteBuffer);
 ```
 In both cases, you must provide a valid TensorFlow Lite model or the API throws
 `IllegalArgumentException`. If you use `MappedByteBuffer` to
 initialize an `Interpreter`, it must remain unchanged for the whole lifetime
 of the `Interpreter`.
 To then run an inference with the model, simply call `Interpreter.run()`.
 For example:
 ```java
 try (Interpreter interpreter = new Interpreter(file_of_a_tensorflowlite_model)) {
@ -195,48 +203,44 @@ try (Interpreter interpreter = new Interpreter(file_of_a_tensorflowlite_model))
 }
 ```
-If a model takes only one input and returns only one output, the following will
+The `run()` method takes only one input and returns only one output. So if your
-trigger an inference run:
+model has multiple inputs or multiple outputs, instead use:
 ```java
 interpreter.run(input, output);
 ```
 For models with multiple inputs, or multiple outputs, use:
 ```java
 interpreter.runForMultipleInputsOutputs(inputs, map_of_indices_to_outputs);
 ```
-where each entry in `inputs` corresponds to an input tensor and
+In this case, each entry in `inputs` corresponds to an input tensor and
 `map_of_indices_to_outputs` maps indices of output tensors to the corresponding
-output data. In both cases the tensor indices should correspond to the values
+output data.
 given to the
 [TensorFlow Lite Optimized Converter](../convert/cmdline_examples.md) when the
 model was created. Be aware that the order of tensors in `input` must match the
 order given to the `TensorFlow Lite Optimized Converter`.
-The Java API also provides convenient functions for app developers to get the
+In both cases, the tensor indices should correspond to the values you gave to
-index of any model input or output using a tensor name:
+the [TensorFlow Lite Converter](../convert/) when you created the model.
 Be aware that the order of tensors in `inputs` must match the
 order given to the TensorFlow Lite Converter.
 The `Interpreter` class also provides convenient functions for you to get the
 index of any model input or output using an operation name:
 ```java
-public int getInputIndex(String tensorName);
+public int getInputIndex(String opName);
-public int getOutputIndex(String tensorName);
+public int getOutputIndex(String opName);
 ```
-If tensorName is not a valid name in model, an `IllegalArgumentException` will
+If `opName` is not a valid operation in the model, it throws an
-be thrown.
+`IllegalArgumentException`.
-##### Releasing Resources After Use
+Also be aware that `Interpreter` owns resources. To avoid memory leaks, the
-
+resources must be released after use by:
 An `Interpreter` owns resources. To avoid memory leak, the resources must be
 released after use by:
 ```java
 interpreter.close();
 ```
-##### Supported Data Types
+For an example project with Java, see the [Android image classification sample](
 https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android).
 ### Supported data types (in Java)
 To use TensorFlow Lite, the data types of the input and output tensors must be
 one of the following primitive types:
@ -256,7 +260,7 @@ provided as a single, flat `ByteBuffer` argument.
 If other data types, including boxed types like `Integer` and `Float`, are used,
 an `IllegalArgumentException` will be thrown.
-##### Inputs
+#### Inputs
 Each input should be an array or multi-dimensional array of the supported
 primitive types, or a raw `ByteBuffer` of the appropriate size. If the input is
@ -265,12 +269,12 @@ implicitly resized to the array's dimensions at inference time. If the input is
 a ByteBuffer, the caller should first manually resize the associated input
 tensor (via `Interpreter.resizeInput()`) before running inference.
-When using 'ByteBuffer', prefer using direct byte buffers, as this allows the
+When using `ByteBuffer`, prefer using direct byte buffers, as this allows the
 `Interpreter` to avoid unnecessary copies. If the `ByteBuffer` is a direct byte
 buffer, its order must be `ByteOrder.nativeOrder()`. After it is used for a
 model inference, it must remain unchanged until the model inference is finished.
-##### Outputs
+#### Outputs
 Each output should be an array or multi-dimensional array of the supported
 primitive types, or a ByteBuffer of the appropriate size. Note that some models
@ -279,7 +283,75 @@ the input. There's no straightforward way of handling this with the existing
 Java inference API, but planned extensions will make this possible.
-## Writing Custom Operators
+## Load and run a model in Python
 The Python API for running an inference is provided in the `tf.lite`
 module. From that module, you mostly need only [`tf.lite.Interpreter`](
 https://www.tensorflow.org/api_docs/python/tf/lite/Interpreter) to load
 a model and run an inference.
 The following example shows how to use the Python interpreter to load a
 `.tflite` file and run inference with random input data:
 ```python
 import numpy as np
 import tensorflow as tf
 # Load TFLite model and allocate tensors.
 interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
 interpreter.allocate_tensors()
 # Get input and output tensors.
 input_details = interpreter.get_input_details()
 output_details = interpreter.get_output_details()
 # Test model on random input data.
 input_shape = input_details[0]['shape']
 input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
 interpreter.set_tensor(input_details[0]['index'], input_data)
 interpreter.invoke()
 # The function `get_tensor()` returns a copy of the tensor data.
 # Use `tensor()` in order to get a pointer to the tensor.
 output_data = interpreter.get_tensor(output_details[0]['index'])
 print(output_data)
 ```
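
The example above always uses `input_details[0]`. When a model has several inputs, it can help to look up a tensor index by name instead. The following is a minimal sketch that assumes each entry returned by `get_input_details()` is a dictionary with `name` and `index` keys; the `find_tensor_index` helper and the sample entries are hypothetical.

```python
def find_tensor_index(details, name):
    """Return the tensor index of the entry whose 'name' matches, or raise."""
    for entry in details:
        if entry["name"] == name:
            return entry["index"]
    raise ValueError("no tensor named %r in the given details" % name)


# Hypothetical stand-in for the list returned by get_input_details().
sample_details = [
    {"name": "img", "index": 0},
    {"name": "aux", "index": 3},
]
print(find_tensor_index(sample_details, "aux"))
```

The returned index is the value you would pass to `interpreter.set_tensor()`.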
 As an alternative to loading the model as a pre-converted `.tflite` file, you can
 combine your code with the [TensorFlow Lite Converter Python API](
 ../convert/python_api.md) (`tf.lite.TFLiteConverter`), allowing you to convert
 your TensorFlow model into the TensorFlow Lite format and then run an inference:
 ```python
 import numpy as np
 import tensorflow as tf
 img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3))
 const = tf.constant([1., 2., 3.]) + tf.constant([1., 4., 4.])
 val = img + const
 out = tf.identity(val, name="out")
 # Convert to TF Lite format
 with tf.Session() as sess:
  converter = tf.lite.TFLiteConverter.from_session(sess, [img], [out])
  tflite_model = converter.convert()
 # Load TFLite model and allocate tensors.
 interpreter = tf.lite.Interpreter(model_content=tflite_model)
 interpreter.allocate_tensors()
 # Continue to get tensors and so forth, as shown above...
 ```
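
For a model with more than one output, the remaining steps can be sketched by looping over `get_output_details()` after `invoke()` and reading each tensor with `get_tensor()`. The `collect_outputs` helper and the `FakeInterpreter` stand-in below are hypothetical; the stand-in exists only so the sketch runs without TensorFlow, and its tensor values are made up.

```python
def collect_outputs(interpreter):
    """Map each output tensor index to the data returned by get_tensor()."""
    return {
        detail["index"]: interpreter.get_tensor(detail["index"])
        for detail in interpreter.get_output_details()
    }


class FakeInterpreter:
    """Hypothetical stand-in exposing the two methods the helper uses."""

    def __init__(self, tensors):
        self._tensors = tensors  # maps tensor index -> data

    def get_output_details(self):
        return [{"index": index} for index in sorted(self._tensors)]

    def get_tensor(self, index):
        return list(self._tensors[index])  # return a copy of the stored data


fake = FakeInterpreter({5: [0.1, 0.9], 7: [1.0]})
print(collect_outputs(fake))
```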
 For more Python sample code, see [`label_image.py`](
 https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/examples/python/label_image.py).
 Tip: Run `help(tf.lite.Interpreter)` in the Python terminal to get detailed
 documentation about the interpreter.
 ## Write a custom operator
 All TensorFlow Lite operators (both custom and builtin) are defined using a
 simple pure-C interface that consists of four functions:
@ -343,7 +415,7 @@ Note that registration is not automatic and an explicit call to
 registration of builtins, custom ops will have to be collected in separate
 custom libraries.
-### Customizing the kernel library
+### Customize the kernel library
 Behind the scenes the interpreter will load a library of kernels which will be
 assigned to execute each of the operators in the model. While the default
@ -362,21 +434,19 @@ class OpResolver {
 };
 ```
-Regular usage will require the developer to use the `BuiltinOpResolver` and
+Regular usage requires that you use the `BuiltinOpResolver` and write:
 write:
 ```c++
 tflite::ops::builtin::BuiltinOpResolver resolver;
 ```
-They can then optionally register custom ops:
+You can optionally register custom ops (before you pass the resolver to the
 `InterpreterBuilder`):
 ```c++
 resolver.AddOp("MY_CUSTOM_OP", Register_MY_CUSTOM_OP());
 ```
 before the resolver is passed to the `InterpreterBuilder`.
 If the set of builtin ops is deemed to be too large, a new `OpResolver` could
 be code-generated  based on a given subset of ops, possibly only the ones
 contained in a given model. This is the equivalent of TensorFlow's selective