diff --git a/tensorflow/lite/g3doc/r1/convert/cmdline_examples.md b/tensorflow/lite/g3doc/r1/convert/cmdline_examples.md
index 4c001bc7c90..45824abfee5 100644
--- a/tensorflow/lite/g3doc/r1/convert/cmdline_examples.md
+++ b/tensorflow/lite/g3doc/r1/convert/cmdline_examples.md
@@ -2,175 +2,166 @@
This page shows how to use the TensorFlow Lite Converter in the command line.
+_Note: If possible, use the **recommended** [Python API](python_api.md)
+instead._
+
## Command-line tools
+### Starting from TensorFlow 1.9
+
There are two approaches to running the converter in the command line.
-* `tflite_convert`: Starting from TensorFlow 1.9, the command-line tool
- `tflite_convert` is installed as part of the Python package. All of the
- examples below use `tflite_convert` for simplicity.
- * Example: `tflite_convert --output_file=...`
-* `bazel`: In order to run the latest version of the TensorFlow Lite Converter
- either install the nightly build using
- [pip](https://www.tensorflow.org/install/pip) or
- [clone the TensorFlow repository](https://www.tensorflow.org/install/source)
- and use `bazel`.
- * Example: `bazel run
+* `tflite_convert` (**recommended**):
+ * *Install*: TensorFlow using
+ [pip](https://www.tensorflow.org/install/pip).
+ * *Example*: `tflite_convert --output_file=...`
+* `bazel`:
+ * *Install*: TensorFlow from
+ [source](https://www.tensorflow.org/install/source).
+ * *Example*: `bazel run
//third_party/tensorflow/lite/python:tflite_convert --
--output_file=...`
-### Converting models prior to TensorFlow 1.9
+*All of the following examples use `tflite_convert` for simplicity.
+Alternatively, you can replace `tflite_convert` with `bazel run
+//tensorflow/lite/python:tflite_convert --`.*
+
+### Prior to TensorFlow 1.9
The recommended approach for using the converter prior to TensorFlow 1.9 is the
-[Python API](python_api.md#pre_tensorflow_1.9). If a command line tool is
-desired, the `toco` command line tool was available in TensorFlow 1.7. Enter
-`toco --help` in Terminal for additional details on the command-line flags
-available. There were no command line tools in TensorFlow 1.8.
+[Python API](python_api.md). A command line tool named `toco` was available
+only in TensorFlow 1.7 (run `toco --help` for additional details).
-## Basic examples
+## Usage
-The following section shows examples of how to convert a basic float-point model
-from each of the supported data formats into a TensorFlow Lite FlatBuffers.
+### Setup
-### Convert a TensorFlow GraphDef
-
-The follow example converts a basic TensorFlow GraphDef (frozen by
-[freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py))
-into a TensorFlow Lite FlatBuffer to perform floating-point inference. Frozen
-graphs contain the variables stored in Checkpoint files as Const ops.
+Before we begin, download the models required to run the examples in this
+document:
```
+echo "Download MobileNet V1"
curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \
| tar xzv -C /tmp
+
+echo "Download Inception V1"
+curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \
+ | tar xzv -C /tmp
+```
+
+### Basic examples
+
+The following section shows examples of how to convert a basic model from each
+of the supported data formats into a TensorFlow Lite model.
+
+#### Convert a SavedModel
+
+```
+tflite_convert \
+ --saved_model_dir=/tmp/saved_model \
+ --output_file=/tmp/foo.tflite
+```
+
+#### Convert a tf.keras model
+
+```
+tflite_convert \
+ --keras_model_file=/tmp/keras_model.h5 \
+ --output_file=/tmp/foo.tflite
+```
+
+#### Convert a Frozen GraphDef
+
+```
tflite_convert \
- --output_file=/tmp/foo.tflite \
--graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \
+ --output_file=/tmp/foo.tflite \
--input_arrays=input \
--output_arrays=MobilenetV1/Predictions/Reshape_1
```
-The value for `input_shapes` is automatically determined whenever possible.
+Frozen GraphDef models (or frozen graphs) are produced by
+[freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py)
+and require the additional flags `--input_arrays` and `--output_arrays` because
+this information is not stored in the model format.
-### Convert a TensorFlow SavedModel
+### Advanced examples
-The follow example converts a basic TensorFlow SavedModel into a Tensorflow Lite
-FlatBuffer to perform floating-point inference.
+#### Convert a quantization-aware trained model into a quantized TensorFlow Lite model
+
+If you have a quantization-aware trained model (i.e., a model with `FakeQuant*`
+operations inserted to record the (min, max) ranges of tensors so that they can
+be quantized), then convert it into a quantized TensorFlow Lite model as shown
+below:
```
tflite_convert \
+ --graph_def_file=/tmp/some_mobilenetv1_quantized_frozen_graph.pb \
--output_file=/tmp/foo.tflite \
- --saved_model_dir=/tmp/saved_model
-```
-
-[SavedModel](https://www.tensorflow.org/guide/saved_model#using_savedmodel_with_estimators)
-has fewer required flags than frozen graphs due to access to additional data
-contained within the SavedModel. The values for `--input_arrays` and
-`--output_arrays` are an aggregated, alphabetized list of the inputs and outputs
-in the [SignatureDefs](../../serving/signature_defs.md) within
-the
-[MetaGraphDef](https://www.tensorflow.org/saved_model#apis_to_build_and_load_a_savedmodel)
-specified by `--saved_model_tag_set`. As with the GraphDef, the value for
-`input_shapes` is automatically determined whenever possible.
-
-There is currently no support for MetaGraphDefs without a SignatureDef or for
-MetaGraphDefs that use the [`assets/`
-directory](https://www.tensorflow.org/guide/saved_model#structure_of_a_savedmodel_directory).
-
-### Convert a tf.Keras model
-
-The following example converts a `tf.keras` model into a TensorFlow Lite
-Flatbuffer. The `tf.keras` file must contain both the model and the weights.
-
-```
-tflite_convert \
- --output_file=/tmp/foo.tflite \
- --keras_model_file=/tmp/keras_model.h5
-```
-
-## Quantization
-
-### Convert a TensorFlow GraphDef for quantized inference
-
-The TensorFlow Lite Converter is compatible with fixed point quantization models
-described
-[here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/quantize/README.md).
-These are float models with `FakeQuant*` ops inserted at the boundaries of fused
-layers to record min-max range information. This generates a quantized inference
-workload that reproduces the quantization behavior that was used during
-training.
-
-The following command generates a quantized TensorFlow Lite FlatBuffer from a
-"quantized" TensorFlow GraphDef.
-
-```
-tflite_convert \
- --output_file=/tmp/foo.tflite \
- --graph_def_file=/tmp/some_quantized_graph.pb \
- --inference_type=QUANTIZED_UINT8 \
--input_arrays=input \
--output_arrays=MobilenetV1/Predictions/Reshape_1 \
- --mean_values=128 \
- --std_dev_values=127
+ --inference_type=INT8 \
+ --mean_values=-0.5 \
+ --std_dev_values=127.7
```
-### Use \"dummy-quantization\" to try out quantized inference on a float graph
+*If you are setting `--inference_type=QUANTIZED_UINT8`, use `--mean_values=128`
+and `--std_dev_values=127` instead.*
-In order to evaluate the possible benefit of generating a quantized graph, the
-converter allows "dummy-quantization" on float graphs. The flags
-`--default_ranges_min` and `--default_ranges_max` accept plausible values for
-the min-max ranges of the values in all arrays that do not have min-max
-information. "Dummy-quantization" will produce lower accuracy but will emulate
-the performance of a correctly quantized model.
+#### Convert a model with "dummy-quantization" into a quantized TensorFlow Lite model
+
+If you have a regular float model and only want to estimate the benefit of
+quantization, i.e., estimate the performance of the model as if it had been
+quantization-aware trained, then perform "dummy-quantization" using the flags
+`--default_ranges_min` and `--default_ranges_max`. When specified, they are
+used as the default (min, max) range for all tensors that lack (min, max) range
+information. This allows quantization to proceed and lets you emulate the
+performance of a quantized TensorFlow Lite model, but the resulting model will
+have lower accuracy.
The example below contains a model using Relu6 activation functions. Therefore,
a reasonable guess is that most activation ranges should be contained in [0, 6].
```
-curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \
- | tar xzv -C /tmp
tflite_convert \
- --output_file=/tmp/foo.cc \
--graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \
- --inference_type=QUANTIZED_UINT8 \
+ --output_file=/tmp/foo.tflite \
--input_arrays=input \
--output_arrays=MobilenetV1/Predictions/Reshape_1 \
+ --inference_type=INT8 \
+ --mean_values=-0.5 \
+  --std_dev_values=127.7 \
   --default_ranges_min=0 \
-  --default_ranges_max=6 \
-  --mean_values=128 \
-  --std_dev_values=127
+  --default_ranges_max=6
```
-## Specifying input and output arrays
+*If you are setting `--inference_type=QUANTIZED_UINT8`, use `--mean_values=128`
+and `--std_dev_values=127` instead.*
-### Multiple input arrays
+#### Convert a model with multiple input arrays
The flag `input_arrays` takes in a comma-separated list of input arrays as seen
in the example below. This is useful for models or subgraphs with multiple
-inputs.
+inputs. Note that `--input_shapes` is provided as a colon-separated list. Each
+input shape corresponds to the input array at the same position in the
+respective list.
```
-curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \
- | tar xzv -C /tmp
tflite_convert \
--graph_def_file=/tmp/inception_v1_2016_08_28_frozen.pb \
--output_file=/tmp/foo.tflite \
- --input_shapes=1,28,28,96:1,28,28,16:1,28,28,192:1,28,28,64 \
--input_arrays=InceptionV1/InceptionV1/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_2/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_3/MaxPool_0a_3x3/MaxPool,InceptionV1/InceptionV1/Mixed_3b/Branch_0/Conv2d_0a_1x1/Relu \
+ --input_shapes=1,28,28,96:1,28,28,16:1,28,28,192:1,28,28,64 \
--output_arrays=InceptionV1/Logits/Predictions/Reshape_1
```
-Note that `input_shapes` is provided as a colon-separated list. Each input shape
-corresponds to the input array at the same position in the respective list.
+#### Convert a model with multiple output arrays
-### Multiple output arrays
-
-The flag `output_arrays` takes in a comma-separated list of output arrays as
+The flag `--output_arrays` takes in a comma-separated list of output arrays as
seen in the example below. This is useful for models or subgraphs with multiple
outputs.
```
-curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \
- | tar xzv -C /tmp
tflite_convert \
--graph_def_file=/tmp/inception_v1_2016_08_28_frozen.pb \
--output_file=/tmp/foo.tflite \
@@ -178,50 +169,45 @@ tflite_convert \
--output_arrays=InceptionV1/InceptionV1/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_2/Conv2d_0a_1x1/Relu
```
-### Specifying subgraphs
+#### Convert a model by specifying subgraphs
Any array in the input file can be specified as an input or output array in
-order to extract subgraphs out of an input graph file. The TensorFlow Lite
-Converter discards the parts of the graph outside of the specific subgraph. Use
-[graph visualizations](#graph_visualizations) to identify the input and output
-arrays that make up the desired subgraph.
+order to extract subgraphs out of an input model file. The TensorFlow Lite
+Converter discards the parts of the model outside of the specific subgraph. Use
+[visualization](#visualization) to identify the input and output arrays that
+make up the desired subgraph.
-The follow command shows how to extract a single fused layer out of a TensorFlow
-GraphDef.
+The following command shows how to extract a single fused layer out of a
+TensorFlow GraphDef.
```
-curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \
- | tar xzv -C /tmp
tflite_convert \
--graph_def_file=/tmp/inception_v1_2016_08_28_frozen.pb \
--output_file=/tmp/foo.pb \
- --input_shapes=1,28,28,96:1,28,28,16:1,28,28,192:1,28,28,64 \
--input_arrays=InceptionV1/InceptionV1/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_2/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_3/MaxPool_0a_3x3/MaxPool,InceptionV1/InceptionV1/Mixed_3b/Branch_0/Conv2d_0a_1x1/Relu \
+ --input_shapes=1,28,28,96:1,28,28,16:1,28,28,192:1,28,28,64 \
--output_arrays=InceptionV1/InceptionV1/Mixed_3b/concat_v2
```
-Note that the final representation in TensorFlow Lite FlatBuffers tends to have
+Note that the final representation in TensorFlow Lite models tends to have
coarser granularity than the very fine granularity of the TensorFlow GraphDef
representation. For example, while a fully-connected layer is typically
-represented as at least four separate ops in TensorFlow GraphDef (Reshape,
-MatMul, BiasAdd, Relu...), it is typically represented as a single "fused" op
-(FullyConnected) in the converter's optimized representation and in the final
-on-device representation. As the level of granularity gets coarser, some
-intermediate arrays (say, the array between the MatMul and the BiasAdd in the
-TensorFlow GraphDef) are dropped.
+represented as at least four separate operations in TensorFlow GraphDef
+(Reshape, MatMul, BiasAdd, Relu...), it is typically represented as a single
+"fused" op (FullyConnected) in the converter's optimized representation and in
+the final on-device representation. As the level of granularity gets coarser,
+some intermediate arrays (say, the array between the MatMul and the BiasAdd in
+the TensorFlow GraphDef) are dropped.
When specifying intermediate arrays as `--input_arrays` and `--output_arrays`,
it is desirable (and often required) to specify arrays that are meant to survive
-in the final form of the graph, after fusing. These are typically the outputs of
+in the final form of the model, after fusing. These are typically the outputs of
activation functions (since everything in each layer until the activation
function tends to get fused).
-## Logging
+## Visualization
-
-## Graph visualizations
-
-The converter can export a graph to the Graphviz Dot format for easy
+The converter can export a model to the Graphviz Dot format for easy
visualization using either the `--output_format` flag or the
`--dump_graphviz_dir` flag. The subsections below outline the use cases for
each.
@@ -229,21 +215,20 @@ each.
### Using `--output_format=GRAPHVIZ_DOT`
The first way to get a Graphviz rendering is to pass `GRAPHVIZ_DOT` into
-`--output_format`. This results in a plausible visualization of the graph. This
+`--output_format`. This results in a plausible visualization of the model. This
reduces the requirements that exist during conversion from a TensorFlow GraphDef
-to a TensorFlow Lite FlatBuffer. This may be useful if the conversion to TFLite
-is failing.
+to a TensorFlow Lite model. This may be useful if the conversion to TFLite is
+failing.
```
-curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \
- | tar xzv -C /tmp
tflite_convert \
--graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \
--output_file=/tmp/foo.dot \
--output_format=GRAPHVIZ_DOT \
- --input_shape=1,128,128,3 \
--input_arrays=input \
+ --input_shape=1,128,128,3 \
--output_arrays=MobilenetV1/Predictions/Reshape_1
```
The resulting `.dot` file can be rendered into a PDF as follows:
@@ -267,12 +252,10 @@ Example PDF files are viewable online in the next section.
The second way to get a Graphviz rendering is to pass the `--dump_graphviz_dir`
flag, specifying a destination directory to dump Graphviz rendering to. Unlike
the previous approach, this one retains the original output format. This
-provides a visualization of the actual graph resulting from a specific
+provides a visualization of the actual model resulting from a specific
conversion process.
```
-curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \
- | tar xzv -C /tmp
tflite_convert \
--graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \
--output_file=/tmp/foo.tflite \
@@ -283,14 +266,14 @@ tflite_convert \
This generates a few files in the destination directory. The two most important
files are `toco_AT_IMPORT.dot` and `/tmp/toco_AFTER_TRANSFORMATIONS.dot`.
-`toco_AT_IMPORT.dot` represents the original graph containing only the
+`toco_AT_IMPORT.dot` represents the original model containing only the
transformations done at import time. This tends to be a complex visualization
with limited information about each node. It is useful in situations where a
conversion command fails.
-`toco_AFTER_TRANSFORMATIONS.dot` represents the graph after all transformations
+`toco_AFTER_TRANSFORMATIONS.dot` represents the model after all transformations
were applied to it, just before it is exported. Typically, this is a much
-smaller graph with more information about each node.
+smaller model with more information about each node.
As before, these can be rendered to PDFs:
@@ -316,15 +299,15 @@ Sample output files can be seen here below. Note that it is the same
before | after |
-### Graph "video" logging
+### Video logging
When `--dump_graphviz_dir` is used, one may additionally pass
-`--dump_graphviz_video`. This causes a graph visualization to be dumped after
-each individual graph transformation, resulting in thousands of files.
+`--dump_graphviz_video`. This causes a model visualization to be dumped after
+each individual model transformation, resulting in thousands of files.
Typically, one would then bisect into these files to understand when a given
-change was introduced in the graph.
+change was introduced in the model.
-### Legend for the graph visualizations
+### Legend for the visualizations
* Operators are red square boxes with the following hues of red:
* Most operators are
diff --git a/tensorflow/lite/g3doc/r1/convert/cmdline_reference.md b/tensorflow/lite/g3doc/r1/convert/cmdline_reference.md
index 8cca69d5963..826bb7afdbb 100644
--- a/tensorflow/lite/g3doc/r1/convert/cmdline_reference.md
+++ b/tensorflow/lite/g3doc/r1/convert/cmdline_reference.md
@@ -1,42 +1,41 @@
# Converter command line reference
-This page is complete reference of command-line flags used by the TensorFlow
+This page is a complete reference of command-line flags used by the TensorFlow
-Lite Converter's command line starting from TensorFlow 1.9 up until the most
-recent build of TensorFlow.
+Lite Converter's command line tool.
## High-level flags
The following high level flags specify the details of the input and output
files. The flag `--output_file` is always required. Additionally, either
-`--graph_def_file`, `--saved_model_dir` or `--keras_model_file` is required.
+`--saved_model_dir`, `--keras_model_file` or `--graph_def_file` is required.
* `--output_file`. Type: string. Specifies the full path of the output file.
-* `--graph_def_file`. Type: string. Specifies the full path of the input
- GraphDef file frozen using
- [freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py).
* `--saved_model_dir`. Type: string. Specifies the full path to the directory
containing the SavedModel.
* `--keras_model_file`. Type: string. Specifies the full path of the HDF5 file
containing the tf.keras model.
+* `--graph_def_file`. Type: string. Specifies the full path of the input
+ GraphDef file frozen using
+ [freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py).
* `--output_format`. Type: string. Default: `TFLITE`. Specifies the format of
the output file. Allowed values:
- * `TFLITE`: TensorFlow Lite FlatBuffer format.
+ * `TFLITE`: TensorFlow Lite model format.
* `GRAPHVIZ_DOT`: GraphViz `.dot` format containing a visualization of the
graph after graph transformations.
* Note that passing `GRAPHVIZ_DOT` to `--output_format` leads to loss
- of TFLite specific transformations. Therefore, the resulting
- visualization may not reflect the final set of graph
- transformations. To get a final visualization with all graph
- transformations use `--dump_graphviz_dir` instead.
+ of TFLite specific transformations. To get a final visualization
+ with all graph transformations use `--dump_graphviz_dir` instead.
The following flags specify optional parameters when using SavedModels.
-* `--saved_model_tag_set`. Type: string. Default:
- [kSavedModelTagServe](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/cc/saved_model/tag_constants.h).
+* `--saved_model_tag_set`. Type: string. Default: "serve" (for more options,
+ refer to
+ [tag_constants.h](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/cc/saved_model/tag_constants.h)).
Specifies a comma-separated set of tags identifying the MetaGraphDef within
the SavedModel to analyze. All tags in the tag set must be specified.
-* `--saved_model_signature_key`. Type: string. Default:
- `tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY`.
+* `--saved_model_signature_key`. Type: string. Default: "serving_default" (for
+ more options, refer to
+ [tf.compat.v1.saved_model.signature_constants](https://www.tensorflow.org/api_docs/python/tf/compat/v1/saved_model/signature_constants)).
Specifies the key identifying the SignatureDef containing inputs and
outputs.
@@ -46,9 +45,9 @@ The following flags specify optional parameters when using SavedModels.
file.
* `--input_arrays`. Type: comma-separated list of strings. Specifies the list
- of names of input activation tensors.
+ of names of input tensors.
* `--output_arrays`. Type: comma-separated list of strings. Specifies the list
- of names of output activation tensors.
+ of names of output tensors.
The following flags define properties of the input tensors. Each item in the
`--input_arrays` flag should correspond to each item in the following flags
@@ -56,8 +55,7 @@ based on index.
* `--input_shapes`. Type: colon-separated list of comma-separated lists of
integers. Each comma-separated list of integers gives the shape of one of
- the input arrays specified in
- [TensorFlow convention](https://www.tensorflow.org/guide/tensors#shape).
+ the input arrays.
* Example: `--input_shapes=1,60,80,3` for a typical vision model means a
batch size of 1, an input image height of 60, an input image width of
80, and an input image depth of 3 (representing RGB channels).
@@ -65,24 +63,24 @@ based on index.
has a shape of [2, 3] and "bar" has a shape of [4, 5, 6].
* `--std_dev_values`, `--mean_values`. Type: comma-separated list of floats.
These specify the (de-)quantization parameters of the input array, when it
- is quantized. This is only needed if `inference_input_type` is
+ is quantized. This is only needed if `inference_input_type` is `INT8` or
`QUANTIZED_UINT8`.
* The meaning of `mean_values` and `std_dev_values` is as follows: each
quantized value in the quantized input array will be interpreted as a
mathematical real number (i.e. as an input activation value) according
to the following formula:
- * `real_value = (quantized_input_value - mean_value) / std_dev_value`.
+ * `real_value = (quantized_value - mean_value) / std_dev_value`.
* When performing float inference (`--inference_type=FLOAT`) on a
quantized input, the quantized input would be immediately dequantized by
the inference code according to the above formula, before proceeding
with float inference.
- * When performing quantized inference
- (`--inference_type=QUANTIZED_UINT8`), no dequantization is performed by
- the inference code. However, the quantization parameters of all arrays,
- including those of the input arrays as specified by `mean_value` and
- `std_dev_value`, determine the fixed-point multipliers used in the
- quantized inference code. `mean_value` must be an integer when
- performing quantized inference.
+    * When performing quantized inference (`inference_type` is `INT8` or
+      `QUANTIZED_UINT8`), no dequantization is performed by the inference
+      code. However, the quantization parameters of all arrays, including
+      those of the input arrays as specified by `mean_value` and
+      `std_dev_value`, determine the fixed-point multipliers used in the
+      quantized inference code. `mean_value` must be an integer when
+      performing quantized inference.
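+    * Worked example (illustrative, reusing the `QUANTIZED_UINT8` values from
+      the command-line examples): with `--mean_values=128` and
+      `--std_dev_values=127`, a quantized input value of 255 is interpreted as
+      the real value (255 - 128) / 127 = 1.0, and a quantized value of 0 as
+      (0 - 128) / 127 ≈ -1.008.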
## Transformation flags
@@ -92,7 +90,7 @@ have.
* `--inference_type`. Type: string. Default: `FLOAT`. Data type of all
real-number arrays in the output file except for input arrays (defined by
- `--inference_input_type`). Must be `{FLOAT, QUANTIZED_UINT8}`.
+ `--inference_input_type`). Must be `{FLOAT, INT8, QUANTIZED_UINT8}`.
This flag only impacts real-number arrays including float and quantized
arrays. This excludes all other data types including plain integer arrays
@@ -101,6 +99,9 @@ have.
* If `FLOAT`, then real-numbers arrays will be of type float in the output
file. If they were quantized in the input file, then they get
dequantized.
+ * If `INT8`, then real-numbers arrays will be quantized as int8 in the
+ output file. If they were float in the input file, then they get
+ quantized.
* If `QUANTIZED_UINT8`, then real-numbers arrays will be quantized as
uint8 in the output file. If they were float in the input file, then
they get quantized.
@@ -109,7 +110,8 @@ have.
array in the output file. By default the `--inference_type` is used as type
of all of the input arrays. Flag is primarily intended for generating a
-    float-point graph with a quantized input array. A Dequantized operator is
+    floating-point graph with a quantized input array. A Dequantize operator is
- added immediately after the input array. Must be `{FLOAT, QUANTIZED_UINT8}`.
+ added immediately after the input array. Must be `{FLOAT, INT8,
+ QUANTIZED_UINT8}`.
The flag is typically used for vision models taking a bitmap as input but
requiring floating-point inference. For such image models, the uint8 input
diff --git a/tensorflow/lite/g3doc/r1/convert/index.md b/tensorflow/lite/g3doc/r1/convert/index.md
index 4080689ce26..7a4e8c7bc95 100644
--- a/tensorflow/lite/g3doc/r1/convert/index.md
+++ b/tensorflow/lite/g3doc/r1/convert/index.md
@@ -1,48 +1,48 @@
# TensorFlow Lite converter
-The TensorFlow Lite converter is used to convert TensorFlow models into an
-optimized [FlatBuffer](https://google.github.io/flatbuffers/) format, so that
-they can be used by the TensorFlow Lite interpreter.
+The TensorFlow Lite converter takes a TensorFlow model and generates a
+TensorFlow Lite model, which is an optimized
+[FlatBuffer](https://google.github.io/flatbuffers/) (identified by the `.tflite`
+file extension).
Note: This page contains documentation on the converter API for TensorFlow 1.x.
The API for TensorFlow 2.0 is available
[here](https://www.tensorflow.org/lite/convert/).
-## FlatBuffers
+## Options
+
+The TensorFlow Lite Converter can be used in two ways:
+
+* [Python API](python_api.md) (**recommended**): Using the Python API makes it
+ easier to convert models as part of a model development pipeline and helps
+ mitigate compatibility issues early on.
+* [Command line](cmdline_examples.md)
+
+## Workflow
+
+### Why use the FlatBuffer format?
FlatBuffer is an efficient open-source cross-platform serialization library. It
-is similar to
-[protocol buffers](https://developers.google.com/protocol-buffers), with the
-distinction that FlatBuffers do not need a parsing/unpacking step to a secondary
-representation before data can be accessed, avoiding per-object memory
-allocation. The code footprint of FlatBuffers is an order of magnitude smaller
-than protocol buffers.
+is similar to [protocol buffers](https://developers.google.com/protocol-buffers)
+used in the TensorFlow model format, with the distinction that FlatBuffers do
+not need a parsing/unpacking step to a secondary representation before data can
+be accessed, avoiding per-object memory allocation. The code footprint of
+FlatBuffers is an order of magnitude smaller than protocol buffers.
-## From model training to device deployment
-
-The TensorFlow Lite converter generates a TensorFlow Lite
-[FlatBuffer](https://google.github.io/flatbuffers/) file (`.tflite`) from a
-TensorFlow model.
+### Convert the model
The converter supports the following input formats:
* [SavedModels](https://www.tensorflow.org/guide/saved_model#using_savedmodel_with_estimators)
-* Frozen `GraphDef`: Models generated by
+* `tf.keras` H5 models.
+* Frozen `GraphDef` models generated using
[freeze_graph.py](https://www.tensorflow.org/code/tensorflow/python/tools/freeze_graph.py).
-* `tf.keras` HDF5 models.
-* Any model taken from a `tf.Session` (Python API only).
+* `tf.Session` models (Python API only).
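+
+For illustration, a minimal Python sketch of converting a SavedModel is shown
+below (the `saved_model_dir` value is a placeholder); see the
+[Python API](python_api.md) guide for complete examples covering every input
+format:
+
+```python
+import tensorflow as tf
+
+# Placeholder path to a SavedModel directory exported from TensorFlow 1.x.
+saved_model_dir = "/tmp/saved_model"
+
+converter = tf.compat.v1.lite.TFLiteConverter.from_saved_model(saved_model_dir)
+tflite_model = converter.convert()
+
+with open("converted_model.tflite", "wb") as f:
+    f.write(tflite_model)
+```
+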
-The TensorFlow Lite `FlatBuffer` file is then deployed to a client device, and
-the TensorFlow Lite interpreter uses the compressed model for on-device
-inference. This conversion process is shown in the diagram below:
+### Run inference
+
+The TensorFlow Lite model is then deployed to a client device, and the
+TensorFlow Lite interpreter uses the compressed model for on-device inference.
+This conversion process is shown in the diagram below:

-
-## Options
-
-The TensorFlow Lite Converter can be used from either of these two options:
-
-* [Python](python_api.md) (**Preferred**): Using the Python API makes it
- easier to convert models as part of a model development pipeline, and helps
- mitigate [compatibility](../tf_ops_compatibility.md) issues early on.
-* [Command line](cmdline_examples.md)
diff --git a/tensorflow/lite/g3doc/r1/convert/python_api.md b/tensorflow/lite/g3doc/r1/convert/python_api.md
index 30d65750100..08e34b53630 100644
--- a/tensorflow/lite/g3doc/r1/convert/python_api.md
+++ b/tensorflow/lite/g3doc/r1/convert/python_api.md
@@ -1,119 +1,41 @@
# Converter Python API guide
This page describes how to convert TensorFlow models into the TensorFlow Lite
-format using the TensorFlow Lite Converter Python API.
+format using the
+[`tf.compat.v1.lite.TFLiteConverter`](https://www.tensorflow.org/api_docs/python/tf/compat/v1/lite/TFLiteConverter)
+Python API. It provides the following class methods based on the original format
+of the model:
-If you're looking for information about how to run a TensorFlow Lite model,
-see [TensorFlow Lite inference](../guide/inference.md).
+* `tf.compat.v1.lite.TFLiteConverter.from_keras_model_file()`: Converts a
+ [Keras](https://www.tensorflow.org/guide/keras/overview) model file.
+* `tf.compat.v1.lite.TFLiteConverter.from_saved_model()`: Converts a
+ [SavedModel](https://www.tensorflow.org/guide/saved_model).
+* `tf.compat.v1.lite.TFLiteConverter.from_session()`: Converts a GraphDef from
+ a session.
+* `tf.compat.v1.lite.TFLiteConverter.from_frozen_graph()`: Converts a Frozen
+  GraphDef from a file. If you have checkpoints, first convert them to a
+ Frozen GraphDef file and then use this API as shown [here](#checkpoints).
-Note: This page describes the converter in the TensorFlow nightly release,
-installed using `pip install tf-nightly`. For docs describing older versions
-reference ["Converting models from TensorFlow 1.12"](#pre_tensorflow_1.12).
-
-
-## High-level overview
-
-While the TensorFlow Lite Converter can be used from the command line, it is
-often convenient to use in a Python script as part of the model development
-pipeline. This allows you to know early that you are designing a model that can
-be targeted to devices with mobile.
-
-## API
-
-The API for converting TensorFlow models to TensorFlow Lite is
-`tf.lite.TFLiteConverter`, which provides class methods based on the original
-format of the model. For example, `TFLiteConverter.from_session()` is available
-for GraphDefs, `TFLiteConverter.from_saved_model()` is available for
-SavedModels, and `TFLiteConverter.from_keras_model_file()` is available for
-`tf.Keras` files.
-
-Example usages for simple float-point models are shown in
-[Basic Examples](#basic). Examples usages for more complex models is shown in
-[Complex Examples](#complex).
+In the following sections, we discuss [basic examples](#basic) and
+[complex examples](#complex).
## Basic examples
-The following section shows examples of how to convert a basic float-point model
-from each of the supported data formats into a TensorFlow Lite FlatBuffers.
+The following section shows examples of how to convert a basic model from each
+of the supported model formats into a TensorFlow Lite model.
-### Exporting a GraphDef from tf.Session
-
-The following example shows how to convert a TensorFlow GraphDef into a
-TensorFlow Lite FlatBuffer from a `tf.Session` object.
+### Convert a Keras model file
```python
import tensorflow as tf
-img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3))
-var = tf.get_variable("weights", dtype=tf.float32, shape=(1, 64, 64, 3))
-val = img + var
-out = tf.identity(val, name="out")
-
-with tf.Session() as sess:
- sess.run(tf.global_variables_initializer())
- converter = tf.lite.TFLiteConverter.from_session(sess, [img], [out])
- tflite_model = converter.convert()
- open("converted_model.tflite", "wb").write(tflite_model)
-```
-
-### Exporting a GraphDef from file
-
-The following example shows how to convert a TensorFlow GraphDef into a
-TensorFlow Lite FlatBuffer when the GraphDef is stored in a file. Both `.pb` and
-`.pbtxt` files are accepted.
-
-The example uses
-[Mobilenet_1.0_224](https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_1.0_224_frozen.tgz).
-The function only supports GraphDefs frozen using
-[freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py).
-
-```python
-import tensorflow as tf
-
-graph_def_file = "/path/to/Downloads/mobilenet_v1_1.0_224/frozen_graph.pb"
-input_arrays = ["input"]
-output_arrays = ["MobilenetV1/Predictions/Softmax"]
-
-converter = tf.lite.TFLiteConverter.from_frozen_graph(
- graph_def_file, input_arrays, output_arrays)
+converter = tf.compat.v1.lite.TFLiteConverter.from_keras_model_file("keras_model.h5")
tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_model)
```
-### Exporting a SavedModel
-
-The following example shows how to convert a SavedModel into a TensorFlow Lite
-FlatBuffer.
-
-```python
-import tensorflow as tf
-
-converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
-tflite_model = converter.convert()
-open("converted_model.tflite", "wb").write(tflite_model)
-```
-
-For more complex SavedModels, the optional parameters that can be passed into
-`TFLiteConverter.from_saved_model()` are `input_arrays`, `input_shapes`,
-`output_arrays`, `tag_set` and `signature_key`. Details of each parameter are
-available by running `help(tf.lite.TFLiteConverter)`.
-
-### Exporting a tf.keras File
-
-The following example shows how to convert a `tf.keras` model into a TensorFlow
-Lite FlatBuffer. This example requires
-[`h5py`](http://docs.h5py.org/en/latest/build.html) to be installed.
-
-```python
-import tensorflow as tf
-
-converter = tf.lite.TFLiteConverter.from_keras_model_file("keras_model.h5")
-tflite_model = converter.convert()
-open("converted_model.tflite", "wb").write(tflite_model)
-```
-
-The `tf.keras` file must contain both the model and the weights. A comprehensive
-example including model construction can be seen below.
+The Keras file contains both the model and the weights. A comprehensive example
+is given below.
```python
import numpy as np
@@ -134,61 +56,133 @@ y = np.random.random((1, 3, 3))
model.train_on_batch(x, y)
model.predict(x)
-# Save tf.keras model in HDF5 format.
+# Save tf.keras model in H5 format.
keras_file = "keras_model.h5"
tf.keras.models.save_model(model, keras_file)
# Convert to TensorFlow Lite model.
-converter = tf.lite.TFLiteConverter.from_keras_model_file(keras_file)
+converter = tf.compat.v1.lite.TFLiteConverter.from_keras_model_file(keras_file)
tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_model)
```
-## Complex examples
+### Convert a SavedModel
-For models where the default value of the attributes is not sufficient, the
-attribute's values should be set before calling `convert()`. In order to call
-any constants use `tf.lite.constants.` as seen below with
-`QUANTIZED_UINT8`. Run `help(tf.lite.TFLiteConverter)` in the Python
-terminal for detailed documentation on the attributes.
+The following example shows how to convert a
+[SavedModel](https://www.tensorflow.org/guide/saved_model) into a TensorFlow
+Lite model.
-Although the examples are demonstrated on GraphDefs containing only constants.
-The same logic can be applied irrespective of the input data format.
+```python
+import tensorflow as tf
-### Exporting a quantized GraphDef
+converter = tf.compat.v1.lite.TFLiteConverter.from_saved_model(saved_model_dir)
+tflite_model = converter.convert()
+open("converted_model.tflite", "wb").write(tflite_model)
+```
-The following example shows how to convert a quantized model into a TensorFlow
-Lite FlatBuffer.
+### Convert a GraphDef from a session
+
+The following example shows how to convert a TensorFlow GraphDef into a
+TensorFlow Lite model from a `tf.Session` object.
```python
import tensorflow as tf
img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3))
-const = tf.constant([1., 2., 3.]) + tf.constant([1., 4., 4.])
-val = img + const
-out = tf.fake_quant_with_min_max_args(val, min=0., max=1., name="output")
+var = tf.get_variable("weights", dtype=tf.float32, shape=(1, 64, 64, 3))
+val = img + var
+out = tf.identity(val, name="out")
with tf.Session() as sess:
- converter = tf.lite.TFLiteConverter.from_session(sess, [img], [out])
- converter.inference_type = tf.lite.constants.QUANTIZED_UINT8
- input_arrays = converter.get_input_arrays()
- converter.quantized_input_stats = {input_arrays[0] : (0., 1.)} # mean, std_dev
+ sess.run(tf.global_variables_initializer())
+ converter = tf.compat.v1.lite.TFLiteConverter.from_session(sess, [img], [out])
tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_model)
```
+### Convert a Frozen GraphDef from file
-## Additional instructions
+The example uses
+[Mobilenet_1.0_224](https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_1.0_224_frozen.tgz).
-### Build from source code
+```python
+import tensorflow as tf
-In order to run the latest version of the TensorFlow Lite Converter Python API,
-either install the nightly build with
-[pip](https://www.tensorflow.org/install/pip) (recommended) or
-[Docker](https://www.tensorflow.org/install/docker), or
-[build the pip package from source](https://www.tensorflow.org/install/source).
+converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
+ graph_def_file='/path/to/mobilenet_v1_1.0_224/frozen_graph.pb',
+ # both `.pb` and `.pbtxt` files are accepted.
+ input_arrays=['input'],
+ output_arrays=['MobilenetV1/Predictions/Softmax'],
+ input_shapes={'input' : [1, 224, 224,3]},
+)
+tflite_model = converter.convert()
+open("converted_model.tflite", "wb").write(tflite_model)
+```
-### Converting models from TensorFlow 1.12
+#### Convert checkpoints
+
+1. Convert checkpoints to a Frozen GraphDef as follows
+ (*[reference](https://laid.delanover.com/how-to-freeze-a-graph-in-tensorflow/)*):
+
+ * Install [bazel](https://docs.bazel.build/versions/master/install.html)
+ * Clone the TensorFlow repository: `git clone
+ https://github.com/tensorflow/tensorflow.git`
+ * Build freeze graph tool: `bazel build
+ tensorflow/python/tools:freeze_graph`
+ * The directory from which you run this should contain a file named
+ 'WORKSPACE'.
+ * If you're running on Ubuntu 16.04 OS and face issues, update the
+ command to `bazel build -c opt --copt=-msse4.1 --copt=-msse4.2
+ tensorflow/python/tools:freeze_graph`
+ * Run freeze graph tool: `bazel run tensorflow/python/tools:freeze_graph
+ --input_graph=/path/to/graph.pbtxt --input_binary=false
+ --input_checkpoint=/path/to/model.ckpt-00010
+ --output_graph=/path/to/frozen_graph.pb
+ --output_node_names=name1,name2.....`
+ * If you have an input `*.pb` file instead of `*.pbtxt`, then replace
+ `--input_graph=/path/to/graph.pbtxt --input_binary=false` with
+ `--input_graph=/path/to/graph.pb`
+ * You can find the output names by exploring the graph using
+ [Netron](https://github.com/lutzroeder/netron) or
+ [summarize graph tool](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms#inspecting-graphs).
+
+2. Now [convert the Frozen GraphDef file](#basic_graphdef_file) to a TensorFlow
+ Lite model as shown in the example above.
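+
+As an alternative to the `bazel`-based `freeze_graph` steps above, checkpoints
+can also be frozen directly from Python using `tf.train.import_meta_graph` and
+`tf.graph_util.convert_variables_to_constants` (TensorFlow 1.x APIs). This is a
+minimal sketch, assuming the checkpoint's `.meta` file is available; the paths
+and the output node name below are placeholders:
+
+```python
+import tensorflow as tf
+
+# Placeholder paths: point these at your checkpoint and desired output file.
+meta_path = "/path/to/model.ckpt-00010.meta"
+ckpt_path = "/path/to/model.ckpt-00010"
+output_node_names = ["name1"]  # placeholder output node names
+
+# Rebuild the graph structure from the MetaGraphDef.
+saver = tf.train.import_meta_graph(meta_path)
+
+with tf.Session() as sess:
+    # Restore the trained weights from the checkpoint.
+    saver.restore(sess, ckpt_path)
+
+    # Replace variables with constants so the graph is self-contained.
+    frozen_graph_def = tf.graph_util.convert_variables_to_constants(
+        sess, sess.graph_def, output_node_names)
+
+with tf.gfile.GFile("/path/to/frozen_graph.pb", "wb") as f:
+    f.write(frozen_graph_def.SerializeToString())
+```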
+
+## Complex examples
+
+For models where the default value of the attributes is not sufficient, the
+attribute's values should be set before calling `convert()`. Run
+`help(tf.compat.v1.lite.TFLiteConverter)` in the Python terminal for detailed
+documentation on the attributes.
+
+### Convert a quantization-aware trained model
+
+The following example shows how to convert a quantization-aware trained model
+into a TensorFlow Lite model.
+
+The example uses
+[Mobilenet_1.0_224](https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_1.0_224_frozen.tgz).
+
+```python
+import tensorflow as tf
+
+converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
+ graph_def_file='/path/to/mobilenet_v1_1.0_224/frozen_graph.pb',
+ input_arrays=['input'],
+ output_arrays=['MobilenetV1/Predictions/Softmax'],
+ input_shapes={'input' : [1, 224, 224,3]},
+)
+converter.quantized_input_stats = {'input': (0., 1.)}  # mean, std_dev (input range is [-1, 1])
+converter.inference_type = tf.int8 # this is the recommended type.
+# converter.inference_input_type=tf.uint8 # optional
+# converter.inference_output_type=tf.uint8 # optional
+tflite_model = converter.convert()
+with open('mobilenet_v1_1.0_224_quantized.tflite', 'wb') as f:
+ f.write(tflite_model)
+```
+
+## Convert models from TensorFlow 1.12
Reference the following table to convert TensorFlow models to TensorFlow Lite in
and before TensorFlow 1.12. Run `help()` to get details of each API.