Update TensorFlow Lite Converter Docs in TensorFlow 1.x
PiperOrigin-RevId: 328436791 Change-Id: I60a94d4e1ba26ce9d31c027aef2f8c35d063424c
parent
b3f274f6eb
commit
29fd07dc0b
tensorflow/lite/g3doc/r1/convert
@ -2,175 +2,166 @@

This page shows how to use the TensorFlow Lite Converter in the command line.

_Note: If possible, use the **recommended** [Python API](python_api.md)
instead._

## Command-line tools <a name="tools"></a>

### Starting from TensorFlow 1.9

There are two approaches to running the converter in the command line.

* `tflite_convert`: Starting from TensorFlow 1.9, the command-line tool
  `tflite_convert` is installed as part of the Python package. All of the
  examples below use `tflite_convert` for simplicity.
    * Example: `tflite_convert --output_file=...`
* `bazel`: In order to run the latest version of the TensorFlow Lite Converter
  either install the nightly build using
  [pip](https://www.tensorflow.org/install/pip) or
  [clone the TensorFlow repository](https://www.tensorflow.org/install/source)
  and use `bazel`.
    * Example: `bazel run
* `tflite_convert` (**recommended**):
    * *Install*: TensorFlow using
      [pip](https://www.tensorflow.org/install/pip).
    * *Example*: `tflite_convert --output_file=...`
* `bazel`:
    * *Install*: TensorFlow from
      [source](https://www.tensorflow.org/install/source).
    * *Example*: `bazel run
      //third_party/tensorflow/lite/python:tflite_convert --
      --output_file=...`

### Converting models prior to TensorFlow 1.9 <a name="pre_tensorflow_1.9"></a>
*All of the following examples use `tflite_convert` for simplicity.
Alternatively, you can replace '`tflite_convert`' with '`bazel run
//tensorflow/lite/python:tflite_convert --`'*

### Prior to TensorFlow 1.9 <a name="pre_tensorflow_1.9"></a>

The recommended approach for using the converter prior to TensorFlow 1.9 is the
[Python API](python_api.md#pre_tensorflow_1.9). If a command line tool is
desired, the `toco` command line tool was available in TensorFlow 1.7. Enter
`toco --help` in Terminal for additional details on the command-line flags
available. There were no command line tools in TensorFlow 1.8.
[Python API](python_api.md). Only in TensorFlow 1.7, a command line tool `toco`
was available (run `toco --help` for additional details).

## Basic examples <a name="basic"></a>
## Usage <a name="usage"></a>

The following section shows examples of how to convert a basic floating-point model
from each of the supported data formats into a TensorFlow Lite FlatBuffer.
### Setup <a name="download_models"></a>

### Convert a TensorFlow GraphDef <a name="graphdef"></a>

The following example converts a basic TensorFlow GraphDef (frozen by
[freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py))
into a TensorFlow Lite FlatBuffer to perform floating-point inference. Frozen
graphs contain the variables stored in Checkpoint files as Const ops.
Before we begin, download the models required to run the examples in this
document:

```
echo "Download MobileNet V1"
curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \
  | tar xzv -C /tmp

echo "Download Inception V1"
curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \
  | tar xzv -C /tmp
```

### Basic examples <a name="basic"></a>

The following section shows examples of how to convert a basic model from each
of the supported data formats into a TensorFlow Lite model.

#### Convert a SavedModel <a name="savedmodel"></a>

```
tflite_convert \
  --saved_model_dir=/tmp/saved_model \
  --output_file=/tmp/foo.tflite
```

#### Convert a tf.keras model <a name="keras"></a>

```
tflite_convert \
  --keras_model_file=/tmp/keras_model.h5 \
  --output_file=/tmp/foo.tflite
```

#### Convert a Frozen GraphDef <a name="graphdef"></a>

```
tflite_convert \
  --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \
  --output_file=/tmp/foo.tflite \
  --input_arrays=input \
  --output_arrays=MobilenetV1/Predictions/Reshape_1
```

The value for `input_shapes` is automatically determined whenever possible.
Frozen GraphDef models (or frozen graphs) are produced by
[freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py)
and require additional flags `--input_arrays` and `--output_arrays` as this
information is not stored in the model format.
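
Frozen graphs do not record which tensors are meant to be inputs and outputs, so
you need to know the array names up front. If you are unsure, one way to find
them (a sketch, not part of the original examples; it assumes the MobileNet
download above) is to list the GraphDef nodes in Python:

```python
import tensorflow as tf

# Load the frozen GraphDef and print every node so that candidate
# input and output array names can be identified.
graph_def = tf.compat.v1.GraphDef()
with tf.compat.v1.gfile.GFile("/tmp/mobilenet_v1_0.50_128/frozen_graph.pb", "rb") as f:
  graph_def.ParseFromString(f.read())

for node in graph_def.node:
  print(node.op, node.name)
```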

### Convert a TensorFlow SavedModel <a name="savedmodel"></a>
### Advanced examples

The following example converts a basic TensorFlow SavedModel into a TensorFlow Lite
FlatBuffer to perform floating-point inference.
#### Convert a quantization aware trained model into a quantized TensorFlow Lite model

If you have a quantization aware trained model (i.e., a model inserted with
`FakeQuant*` operations which record the (min, max) ranges of tensors in order
to quantize them), then convert it into a quantized TensorFlow Lite model as
shown below:

```
tflite_convert \
  --graph_def_file=/tmp/some_mobilenetv1_quantized_frozen_graph.pb \
  --output_file=/tmp/foo.tflite \
  --saved_model_dir=/tmp/saved_model
```

[SavedModel](https://www.tensorflow.org/guide/saved_model#using_savedmodel_with_estimators)
has fewer required flags than frozen graphs due to access to additional data
contained within the SavedModel. The values for `--input_arrays` and
`--output_arrays` are an aggregated, alphabetized list of the inputs and outputs
in the [SignatureDefs](../../serving/signature_defs.md) within
the
[MetaGraphDef](https://www.tensorflow.org/saved_model#apis_to_build_and_load_a_savedmodel)
specified by `--saved_model_tag_set`. As with the GraphDef, the value for
`input_shapes` is automatically determined whenever possible.

There is currently no support for MetaGraphDefs without a SignatureDef or for
MetaGraphDefs that use the [`assets/`
directory](https://www.tensorflow.org/guide/saved_model#structure_of_a_savedmodel_directory).
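
If a SavedModel contains more than one MetaGraphDef or SignatureDef, the tag set
and signature key can be passed explicitly. A minimal sketch (the values shown
are simply the defaults; see the command line reference for details):

```
tflite_convert \
  --saved_model_dir=/tmp/saved_model \
  --output_file=/tmp/foo.tflite \
  --saved_model_tag_set=serve \
  --saved_model_signature_key=serving_default
```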

### Convert a tf.Keras model <a name="keras"></a>

The following example converts a `tf.keras` model into a TensorFlow Lite
FlatBuffer. The `tf.keras` file must contain both the model and the weights.

```
tflite_convert \
  --output_file=/tmp/foo.tflite \
  --keras_model_file=/tmp/keras_model.h5
```

## Quantization

### Convert a TensorFlow GraphDef for quantized inference <a name="graphdef_quant"></a>

The TensorFlow Lite Converter is compatible with fixed point quantization models
described
[here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/quantize/README.md).
These are float models with `FakeQuant*` ops inserted at the boundaries of fused
layers to record min-max range information. This generates a quantized inference
workload that reproduces the quantization behavior that was used during
training.

The following command generates a quantized TensorFlow Lite FlatBuffer from a
"quantized" TensorFlow GraphDef.

```
tflite_convert \
  --output_file=/tmp/foo.tflite \
  --graph_def_file=/tmp/some_quantized_graph.pb \
  --inference_type=QUANTIZED_UINT8 \
  --input_arrays=input \
  --output_arrays=MobilenetV1/Predictions/Reshape_1 \
  --mean_values=128 \
  --std_dev_values=127
  --inference_type=INT8 \
  --mean_values=-0.5 \
  --std_dev_values=127.7
```

### Use "dummy-quantization" to try out quantized inference on a float graph <a name="dummy_quant"></a>
*If you're setting `--inference_type=QUANTIZED_UINT8` then update
`--mean_values=128` and `--std_dev_values=127`*

In order to evaluate the possible benefit of generating a quantized graph, the
converter allows "dummy-quantization" on float graphs. The flags
`--default_ranges_min` and `--default_ranges_max` accept plausible values for
the min-max ranges of the values in all arrays that do not have min-max
information. "Dummy-quantization" will produce lower accuracy but will emulate
the performance of a correctly quantized model.
#### Convert a model with "dummy-quantization" into a quantized TensorFlow Lite model

If you have a regular float model and only want to estimate the benefit of a
quantized model, i.e., estimate the performance of the model as if it were
quantization aware trained, then perform "dummy-quantization" using the flags
`--default_ranges_min` and `--default_ranges_max`. When specified, they will be
used as default (min, max) range for all the tensors that lack (min, max) range
information. This will allow quantization to proceed and help you emulate the
performance of a quantized TensorFlow Lite model but it will have a lower
accuracy.

The example below contains a model using Relu6 activation functions. Therefore,
a reasonable guess is that most activation ranges should be contained in [0, 6].

```
curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \
  | tar xzv -C /tmp
tflite_convert \
  --output_file=/tmp/foo.cc \
  --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \
  --inference_type=QUANTIZED_UINT8 \
  --output_file=/tmp/foo.tflite \
  --input_arrays=input \
  --output_arrays=MobilenetV1/Predictions/Reshape_1 \
  --inference_type=INT8 \
  --mean_values=-0.5 \
  --std_dev_values=127.7
  --default_ranges_min=0 \
  --default_ranges_max=6 \
  --mean_values=128 \
  --std_dev_values=127
```

## Specifying input and output arrays
*If you're setting `--inference_type=QUANTIZED_UINT8` then update
`--mean_values=128` and `--std_dev_values=127`*

### Multiple input arrays
#### Convert a model with multiple input arrays

The flag `input_arrays` takes in a comma-separated list of input arrays as seen
in the example below. This is useful for models or subgraphs with multiple
inputs.
inputs. Note that `--input_shapes` is provided as a colon-separated list. Each
input shape corresponds to the input array at the same position in the
respective list.

```
curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \
  | tar xzv -C /tmp
tflite_convert \
  --graph_def_file=/tmp/inception_v1_2016_08_28_frozen.pb \
  --output_file=/tmp/foo.tflite \
  --input_arrays=InceptionV1/InceptionV1/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_2/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_3/MaxPool_0a_3x3/MaxPool,InceptionV1/InceptionV1/Mixed_3b/Branch_0/Conv2d_0a_1x1/Relu \
  --input_shapes=1,28,28,96:1,28,28,16:1,28,28,192:1,28,28,64 \
  --output_arrays=InceptionV1/Logits/Predictions/Reshape_1
```

Note that `input_shapes` is provided as a colon-separated list. Each input shape
corresponds to the input array at the same position in the respective list.
#### Convert a model with multiple output arrays

### Multiple output arrays

The flag `output_arrays` takes in a comma-separated list of output arrays as
The flag `--output_arrays` takes in a comma-separated list of output arrays as
seen in the example below. This is useful for models or subgraphs with multiple
outputs.

```
curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \
  | tar xzv -C /tmp
tflite_convert \
  --graph_def_file=/tmp/inception_v1_2016_08_28_frozen.pb \
  --output_file=/tmp/foo.tflite \
@ -178,50 +169,45 @@ tflite_convert \
  --output_arrays=InceptionV1/InceptionV1/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_2/Conv2d_0a_1x1/Relu
```

### Specifying subgraphs
### Convert a model by specifying subgraphs

Any array in the input file can be specified as an input or output array in
order to extract subgraphs out of an input graph file. The TensorFlow Lite
Converter discards the parts of the graph outside of the specific subgraph. Use
[graph visualizations](#graph_visualizations) to identify the input and output
arrays that make up the desired subgraph.
order to extract subgraphs out of an input model file. The TensorFlow Lite
Converter discards the parts of the model outside of the specific subgraph. Use
[visualization](#visualization) to identify the input and output arrays that
make up the desired subgraph.

The following command shows how to extract a single fused layer out of a TensorFlow
GraphDef.

```
curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \
  | tar xzv -C /tmp
tflite_convert \
  --graph_def_file=/tmp/inception_v1_2016_08_28_frozen.pb \
  --output_file=/tmp/foo.pb \
  --input_arrays=InceptionV1/InceptionV1/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_2/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_3/MaxPool_0a_3x3/MaxPool,InceptionV1/InceptionV1/Mixed_3b/Branch_0/Conv2d_0a_1x1/Relu \
  --input_shapes=1,28,28,96:1,28,28,16:1,28,28,192:1,28,28,64 \
  --output_arrays=InceptionV1/InceptionV1/Mixed_3b/concat_v2
```

Note that the final representation in TensorFlow Lite FlatBuffers tends to have
Note that the final representation in TensorFlow Lite models tends to have
coarser granularity than the very fine granularity of the TensorFlow GraphDef
representation. For example, while a fully-connected layer is typically
represented as at least four separate ops in TensorFlow GraphDef (Reshape,
MatMul, BiasAdd, Relu...), it is typically represented as a single "fused" op
(FullyConnected) in the converter's optimized representation and in the final
on-device representation. As the level of granularity gets coarser, some
intermediate arrays (say, the array between the MatMul and the BiasAdd in the
TensorFlow GraphDef) are dropped.
represented as at least four separate operations in TensorFlow GraphDef
(Reshape, MatMul, BiasAdd, Relu...), it is typically represented as a single
"fused" op (FullyConnected) in the converter's optimized representation and in
the final on-device representation. As the level of granularity gets coarser,
some intermediate arrays (say, the array between the MatMul and the BiasAdd in
the TensorFlow GraphDef) are dropped.

When specifying intermediate arrays as `--input_arrays` and `--output_arrays`,
it is desirable (and often required) to specify arrays that are meant to survive
in the final form of the graph, after fusing. These are typically the outputs of
in the final form of the model, after fusing. These are typically the outputs of
activation functions (since everything in each layer until the activation
function tends to get fused).

## Logging
## Visualization <a name="visualization"></a>


## Graph visualizations

The converter can export a graph to the Graphviz Dot format for easy
The converter can export a model to the Graphviz Dot format for easy
visualization using either the `--output_format` flag or the
`--dump_graphviz_dir` flag. The subsections below outline the use cases for
each.
@ -229,21 +215,20 @@ each.
### Using `--output_format=GRAPHVIZ_DOT` <a name="using_output_format_graphviz_dot"></a>

The first way to get a Graphviz rendering is to pass `GRAPHVIZ_DOT` into
`--output_format`. This results in a plausible visualization of the graph. This
`--output_format`. This results in a plausible visualization of the model. This
reduces the requirements that exist during conversion from a TensorFlow GraphDef
to a TensorFlow Lite FlatBuffer. This may be useful if the conversion to TFLite
is failing.
to a TensorFlow Lite model. This may be useful if the conversion to TFLite is
failing.

```
curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \
  | tar xzv -C /tmp
tflite_convert \
  --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \
  --output_file=/tmp/foo.dot \
  --output_format=GRAPHVIZ_DOT \
  --input_arrays=input \
  --input_shape=1,128,128,3 \
  --output_arrays=MobilenetV1/Predictions/Reshape_1

```

The resulting `.dot` file can be rendered into a PDF as follows:
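
For example, with [Graphviz](https://graphviz.org) installed, a typical
invocation is (an illustration; it assumes the `/tmp/foo.dot` file produced by
the command above):

```
dot -Tpdf -O /tmp/foo.dot
```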
@ -267,12 +252,10 @@ Example PDF files are viewable online in the next section.
The second way to get a Graphviz rendering is to pass the `--dump_graphviz_dir`
flag, specifying a destination directory to dump Graphviz rendering to. Unlike
the previous approach, this one retains the original output format. This
provides a visualization of the actual graph resulting from a specific
provides a visualization of the actual model resulting from a specific
conversion process.

```
curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \
  | tar xzv -C /tmp
tflite_convert \
  --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \
  --output_file=/tmp/foo.tflite \
@ -283,14 +266,14 @@ tflite_convert \

This generates a few files in the destination directory. The two most important
files are `toco_AT_IMPORT.dot` and `/tmp/toco_AFTER_TRANSFORMATIONS.dot`.
`toco_AT_IMPORT.dot` represents the original graph containing only the
`toco_AT_IMPORT.dot` represents the original model containing only the
transformations done at import time. This tends to be a complex visualization
with limited information about each node. It is useful in situations where a
conversion command fails.

`toco_AFTER_TRANSFORMATIONS.dot` represents the graph after all transformations
`toco_AFTER_TRANSFORMATIONS.dot` represents the model after all transformations
were applied to it, just before it is exported. Typically, this is a much
smaller graph with more information about each node.
smaller model with more information about each node.

As before, these can be rendered to PDFs:

@ -316,15 +299,15 @@ Sample output files can be seen here below. Note that it is the same
<tr><td>before</td><td>after</td></tr>
</table>

### Graph "video" logging
### Video logging

When `--dump_graphviz_dir` is used, one may additionally pass
`--dump_graphviz_video`. This causes a graph visualization to be dumped after
each individual graph transformation, resulting in thousands of files.
`--dump_graphviz_video`. This causes a model visualization to be dumped after
each individual model transformation, resulting in thousands of files.
Typically, one would then bisect into these files to understand when a given
change was introduced in the graph.
change was introduced in the model.

### Legend for the graph visualizations <a name="graphviz_legend"></a>
### Legend for the Visualizations <a name="graphviz_legend"></a>

* Operators are red square boxes with the following hues of red:
    * Most operators are

@ -1,42 +1,41 @@
# Converter command line reference

This page is a complete reference of command-line flags used by the TensorFlow
Lite Converter's command line starting from TensorFlow 1.9 up until the most
recent build of TensorFlow.
Lite Converter's command line tool.

## High-level flags

The following high level flags specify the details of the input and output
files. The flag `--output_file` is always required. Additionally, either
`--graph_def_file`, `--saved_model_dir` or `--keras_model_file` is required.
`--saved_model_dir`, `--keras_model_file` or `--graph_def_file` is required.

* `--output_file`. Type: string. Specifies the full path of the output file.
* `--graph_def_file`. Type: string. Specifies the full path of the input
  GraphDef file frozen using
  [freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py).
* `--saved_model_dir`. Type: string. Specifies the full path to the directory
  containing the SavedModel.
* `--keras_model_file`. Type: string. Specifies the full path of the HDF5 file
  containing the tf.keras model.
* `--graph_def_file`. Type: string. Specifies the full path of the input
  GraphDef file frozen using
  [freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py).
* `--output_format`. Type: string. Default: `TFLITE`. Specifies the format of
  the output file. Allowed values:
    * `TFLITE`: TensorFlow Lite FlatBuffer format.
    * `TFLITE`: TensorFlow Lite model format.
    * `GRAPHVIZ_DOT`: GraphViz `.dot` format containing a visualization of the
      graph after graph transformations.
        * Note that passing `GRAPHVIZ_DOT` to `--output_format` leads to loss
          of TFLite specific transformations. Therefore, the resulting
          visualization may not reflect the final set of graph
          transformations. To get a final visualization with all graph
          transformations use `--dump_graphviz_dir` instead.
          of TFLite specific transformations. To get a final visualization
          with all graph transformations use `--dump_graphviz_dir` instead.

The following flags specify optional parameters when using SavedModels.

* `--saved_model_tag_set`. Type: string. Default:
  [kSavedModelTagServe](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/cc/saved_model/tag_constants.h).
* `--saved_model_tag_set`. Type: string. Default: "serve" (for more options,
  refer to
  [tag_constants.h](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/cc/saved_model/tag_constants.h)).
  Specifies a comma-separated set of tags identifying the MetaGraphDef within
  the SavedModel to analyze. All tags in the tag set must be specified.
* `--saved_model_signature_key`. Type: string. Default:
  `tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY`.
* `--saved_model_signature_key`. Type: string. Default: "serving_default" (for
  more options, refer to
  [tf.compat.v1.saved_model.signature_constants](https://www.tensorflow.org/api_docs/python/tf/compat/v1/saved_model/signature_constants)).
  Specifies the key identifying the SignatureDef containing inputs and
  outputs.

@ -46,9 +45,9 @@ The following flags specify optional parameters when using SavedModels.
  file.

* `--input_arrays`. Type: comma-separated list of strings. Specifies the list
  of names of input activation tensors.
  of names of input tensors.
* `--output_arrays`. Type: comma-separated list of strings. Specifies the list
  of names of output activation tensors.
  of names of output tensors.

The following flags define properties of the input tensors. Each item in the
`--input_arrays` flag should correspond to each item in the following flags
@ -56,8 +55,7 @@ based on index.

* `--input_shapes`. Type: colon-separated list of comma-separated lists of
  integers. Each comma-separated list of integers gives the shape of one of
  the input arrays specified in
  [TensorFlow convention](https://www.tensorflow.org/guide/tensors#shape).
  the input arrays.
    * Example: `--input_shapes=1,60,80,3` for a typical vision model means a
      batch size of 1, an input image height of 60, an input image width of
      80, and an input image depth of 3 (representing RGB channels).
@ -65,24 +63,24 @@ based on index.
      has a shape of [2, 3] and "bar" has a shape of [4, 5, 6].
* `--std_dev_values`, `--mean_values`. Type: comma-separated list of floats.
  These specify the (de-)quantization parameters of the input array, when it
  is quantized. This is only needed if `inference_input_type` is
  is quantized. This is only needed if `inference_input_type` is `INT8` or
  `QUANTIZED_UINT8`.
    * The meaning of `mean_values` and `std_dev_values` is as follows: each
      quantized value in the quantized input array will be interpreted as a
      mathematical real number (i.e. as an input activation value) according
      to the following formula:
        * `real_value = (quantized_input_value - mean_value) / std_dev_value`.
        * `real_value = (quantized_value - mean_value) / std_dev_value` (see
          the worked example after this list).
    * When performing float inference (`--inference_type=FLOAT`) on a
      quantized input, the quantized input would be immediately dequantized by
      the inference code according to the above formula, before proceeding
      with float inference.
    * When performing quantized inference
      (`--inference_type=QUANTIZED_UINT8`), no dequantization is performed by
      the inference code. However, the quantization parameters of all arrays,
      including those of the input arrays as specified by `mean_value` and
      `std_dev_value`, determine the fixed-point multipliers used in the
      quantized inference code. `mean_value` must be an integer when
      performing quantized inference.
    * When performing quantized inference (`inference_type`
      is `INT8` or `QUANTIZED_UINT8`), no dequantization is performed by the
      inference code. However, the quantization parameters of all arrays,
      including those of the input arrays as specified
      by `mean_value` and `std_dev_value`, determine the fixed-point multipliers
      used in the quantized inference code. `mean_value` must be an integer
      when performing quantized inference.
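
As a worked example of the formula above (a sketch, not part of the flag
reference): a `uint8` input that represents the real range [-1, 1] is typically
described with `mean_values=128` and `std_dev_values=127`, so quantized 0 maps
to roughly -1.0 and quantized 255 maps to exactly 1.0:

```python
def dequantize(quantized_value, mean_value=128.0, std_dev_value=127.0):
  # real_value = (quantized_value - mean_value) / std_dev_value
  return (quantized_value - mean_value) / std_dev_value

print(dequantize(0))    # -1.0078..., approximately -1.0
print(dequantize(128))  # 0.0
print(dequantize(255))  # 1.0
```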

## Transformation flags

@ -92,7 +90,7 @@ have.

* `--inference_type`. Type: string. Default: `FLOAT`. Data type of all
  real-number arrays in the output file except for input arrays (defined by
  `--inference_input_type`). Must be `{FLOAT, QUANTIZED_UINT8}`.
  `--inference_input_type`). Must be `{FLOAT, INT8, QUANTIZED_UINT8}`.

  This flag only impacts real-number arrays including float and quantized
  arrays. This excludes all other data types including plain integer arrays
@ -101,6 +99,9 @@ have.
    * If `FLOAT`, then real-numbers arrays will be of type float in the output
      file. If they were quantized in the input file, then they get
      dequantized.
    * If `INT8`, then real-numbers arrays will be quantized as int8 in the
      output file. If they were float in the input file, then they get
      quantized.
    * If `QUANTIZED_UINT8`, then real-numbers arrays will be quantized as
      uint8 in the output file. If they were float in the input file, then
      they get quantized.
@ -109,7 +110,8 @@ have.
  array in the output file. By default the `--inference_type` is used as type
  of all of the input arrays. Flag is primarily intended for generating a
  floating-point graph with a quantized input array. A Dequantized operator is
  added immediately after the input array. Must be `{FLOAT, QUANTIZED_UINT8}`.
  added immediately after the input array. Must be `{FLOAT, INT8,
  QUANTIZED_UINT8}`.

  The flag is typically used for vision models taking a bitmap as input but
  requiring floating-point inference. For such image models, the uint8 input

@ -1,48 +1,48 @@
# TensorFlow Lite converter

The TensorFlow Lite converter is used to convert TensorFlow models into an
optimized [FlatBuffer](https://google.github.io/flatbuffers/) format, so that
they can be used by the TensorFlow Lite interpreter.
The TensorFlow Lite converter takes a TensorFlow model and generates a
TensorFlow Lite model, which is an optimized
[FlatBuffer](https://google.github.io/flatbuffers/) (identified by the `.tflite`
file extension).

Note: This page contains documentation on the converter API for TensorFlow 1.x.
The API for TensorFlow 2.0 is available
[here](https://www.tensorflow.org/lite/convert/).

## FlatBuffers
## Options

The TensorFlow Lite Converter can be used in two ways:

* [Python API](python_api.md) (**recommended**): Using the Python API makes it
  easier to convert models as part of a model development pipeline and helps
  mitigate compatibility issues early on.
* [Command line](cmdline_examples.md)

## Workflow

### Why use the 'FlatBuffer' format?

FlatBuffer is an efficient open-source cross-platform serialization library. It
is similar to
[protocol buffers](https://developers.google.com/protocol-buffers), with the
distinction that FlatBuffers do not need a parsing/unpacking step to a secondary
representation before data can be accessed, avoiding per-object memory
allocation. The code footprint of FlatBuffers is an order of magnitude smaller
than protocol buffers.
is similar to [protocol buffers](https://developers.google.com/protocol-buffers)
used in the TensorFlow model format, with the distinction that FlatBuffers do
not need a parsing/unpacking step to a secondary representation before data can
be accessed, avoiding per-object memory allocation. The code footprint of
FlatBuffers is an order of magnitude smaller than protocol buffers.

## From model training to device deployment

The TensorFlow Lite converter generates a TensorFlow Lite
[FlatBuffer](https://google.github.io/flatbuffers/) file (`.tflite`) from a
TensorFlow model.
### Convert the model

The converter supports the following input formats:

* [SavedModels](https://www.tensorflow.org/guide/saved_model#using_savedmodel_with_estimators)
* Frozen `GraphDef`: Models generated by
* `tf.keras` H5 models.
* Frozen `GraphDef` models generated using
  [freeze_graph.py](https://www.tensorflow.org/code/tensorflow/python/tools/freeze_graph.py).
* `tf.keras` HDF5 models.
* Any model taken from a `tf.Session` (Python API only).
* `tf.Session` models (Python API only).

The TensorFlow Lite `FlatBuffer` file is then deployed to a client device, and
the TensorFlow Lite interpreter uses the compressed model for on-device
inference. This conversion process is shown in the diagram below:
### Run inference

The TensorFlow Lite model is then deployed to a client device, and the
TensorFlow Lite interpreter uses the compressed model for on-device inference.
This conversion process is shown in the diagram below:

![TFLite converter workflow](../images/convert/workflow.svg)

## Options

The TensorFlow Lite Converter can be used from either of these two options:

* [Python](python_api.md) (**Preferred**): Using the Python API makes it
  easier to convert models as part of a model development pipeline, and helps
  mitigate [compatibility](../tf_ops_compatibility.md) issues early on.
* [Command line](cmdline_examples.md)

@ -1,119 +1,41 @@
# Converter Python API guide

This page describes how to convert TensorFlow models into the TensorFlow Lite
format using the TensorFlow Lite Converter Python API.
format using the
[`tf.compat.v1.lite.TFLiteConverter`](https://www.tensorflow.org/api_docs/python/tf/compat/v1/lite/TFLiteConverter)
Python API. It provides the following class methods based on the original format
of the model:

If you're looking for information about how to run a TensorFlow Lite model,
see [TensorFlow Lite inference](../guide/inference.md).
* `tf.compat.v1.lite.TFLiteConverter.from_keras_model_file()`: Converts a
  [Keras](https://www.tensorflow.org/guide/keras/overview) model file.
* `tf.compat.v1.lite.TFLiteConverter.from_saved_model()`: Converts a
  [SavedModel](https://www.tensorflow.org/guide/saved_model).
* `tf.compat.v1.lite.TFLiteConverter.from_session()`: Converts a GraphDef from
  a session.
* `tf.compat.v1.lite.TFLiteConverter.from_frozen_graph()`: Converts a Frozen
  GraphDef from a file. If you have checkpoints, then first convert it to a
  Frozen GraphDef file and then use this API as shown [here](#checkpoints).

Note: This page describes the converter in the TensorFlow nightly release,
installed using `pip install tf-nightly`. For docs describing older versions
reference ["Converting models from TensorFlow 1.12"](#pre_tensorflow_1.12).


## High-level overview

While the TensorFlow Lite Converter can be used from the command line, it is
often convenient to use in a Python script as part of the model development
pipeline. This allows you to know early that you are designing a model that can
be targeted to mobile devices.

## API

The API for converting TensorFlow models to TensorFlow Lite is
`tf.lite.TFLiteConverter`, which provides class methods based on the original
format of the model. For example, `TFLiteConverter.from_session()` is available
for GraphDefs, `TFLiteConverter.from_saved_model()` is available for
SavedModels, and `TFLiteConverter.from_keras_model_file()` is available for
`tf.Keras` files.

Example usages for simple floating-point models are shown in
[Basic Examples](#basic). Example usages for more complex models are shown in
[Complex Examples](#complex).
In the following sections, we discuss [basic examples](#basic) and
[complex examples](#complex).

## Basic examples <a name="basic"></a>

The following section shows examples of how to convert a basic floating-point model
from each of the supported data formats into a TensorFlow Lite FlatBuffer.
The following section shows examples of how to convert a basic model from each
of the supported model formats into a TensorFlow Lite model.

### Exporting a GraphDef from tf.Session <a name="basic_graphdef_sess"></a>

The following example shows how to convert a TensorFlow GraphDef into a
TensorFlow Lite FlatBuffer from a `tf.Session` object.
### Convert a Keras model file <a name="basic_keras_file"></a>

```python
import tensorflow as tf

img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3))
var = tf.get_variable("weights", dtype=tf.float32, shape=(1, 64, 64, 3))
val = img + var
out = tf.identity(val, name="out")

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  converter = tf.lite.TFLiteConverter.from_session(sess, [img], [out])
  tflite_model = converter.convert()
  open("converted_model.tflite", "wb").write(tflite_model)
```

### Exporting a GraphDef from file <a name="basic_graphdef_file"></a>

The following example shows how to convert a TensorFlow GraphDef into a
TensorFlow Lite FlatBuffer when the GraphDef is stored in a file. Both `.pb` and
`.pbtxt` files are accepted.

The example uses
[Mobilenet_1.0_224](https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_1.0_224_frozen.tgz).
The function only supports GraphDefs frozen using
[freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py).

```python
import tensorflow as tf

graph_def_file = "/path/to/Downloads/mobilenet_v1_1.0_224/frozen_graph.pb"
input_arrays = ["input"]
output_arrays = ["MobilenetV1/Predictions/Softmax"]

converter = tf.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file, input_arrays, output_arrays)
converter = tf.compat.v1.lite.TFLiteConverter.from_keras_model_file("keras_model.h5")
tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_model)
```

### Exporting a SavedModel <a name="basic_savedmodel"></a>

The following example shows how to convert a SavedModel into a TensorFlow Lite
FlatBuffer.

```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_model)
```

For more complex SavedModels, the optional parameters that can be passed into
`TFLiteConverter.from_saved_model()` are `input_arrays`, `input_shapes`,
`output_arrays`, `tag_set` and `signature_key`. Details of each parameter are
available by running `help(tf.lite.TFLiteConverter)`.
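
For instance, a hypothetical SavedModel with explicitly named inputs and outputs
could be converted as follows (a sketch; the array names, shapes, tag set and
signature key below are placeholders, not values from the example above):

```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model(
    saved_model_dir,
    input_arrays=["input"],
    input_shapes={"input": [1, 224, 224, 3]},
    output_arrays=["MobilenetV1/Predictions/Softmax"],
    tag_set=set(["serve"]),
    signature_key="serving_default")
tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_model)
```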

### Exporting a tf.keras File <a name="basic_keras_file"></a>

The following example shows how to convert a `tf.keras` model into a TensorFlow
Lite FlatBuffer. This example requires
[`h5py`](http://docs.h5py.org/en/latest/build.html) to be installed.

```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model_file("keras_model.h5")
tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_model)
```

The `tf.keras` file must contain both the model and the weights. A comprehensive
example including model construction can be seen below.
The Keras file contains both the model and the weights. A comprehensive example
is given below.

```python
import numpy as np
@ -134,61 +56,133 @@ y = np.random.random((1, 3, 3))
model.train_on_batch(x, y)
model.predict(x)

# Save tf.keras model in HDF5 format.
# Save tf.keras model in H5 format.
keras_file = "keras_model.h5"
tf.keras.models.save_model(model, keras_file)

# Convert to TensorFlow Lite model.
converter = tf.lite.TFLiteConverter.from_keras_model_file(keras_file)
converter = tf.compat.v1.lite.TFLiteConverter.from_keras_model_file(keras_file)
tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_model)
```

## Complex examples <a name="complex"></a>
### Convert a SavedModel <a name="basic_savedmodel"></a>

For models where the default value of the attributes is not sufficient, the
attribute's values should be set before calling `convert()`. In order to use
any constants, use `tf.lite.constants.<CONSTANT_NAME>` as seen below with
`QUANTIZED_UINT8`. Run `help(tf.lite.TFLiteConverter)` in the Python
terminal for detailed documentation on the attributes.
The following example shows how to convert a
[SavedModel](https://www.tensorflow.org/guide/saved_model) into a TensorFlow
Lite model.

Although the examples are demonstrated on GraphDefs containing only constants,
the same logic can be applied irrespective of the input data format.
```python
import tensorflow as tf

### Exporting a quantized GraphDef <a name="complex_quant"></a>
converter = tf.compat.v1.lite.TFLiteConverter.from_saved_model(saved_model_dir)
tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_model)
```

The following example shows how to convert a quantized model into a TensorFlow
Lite FlatBuffer.
### Convert a GraphDef from a session <a name="basic_graphdef_sess"></a>

The following example shows how to convert a TensorFlow GraphDef into a
TensorFlow Lite model from a `tf.Session` object.

```python
import tensorflow as tf

img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3))
const = tf.constant([1., 2., 3.]) + tf.constant([1., 4., 4.])
val = img + const
out = tf.fake_quant_with_min_max_args(val, min=0., max=1., name="output")
var = tf.get_variable("weights", dtype=tf.float32, shape=(1, 64, 64, 3))
val = img + var
out = tf.identity(val, name="out")

with tf.Session() as sess:
  converter = tf.lite.TFLiteConverter.from_session(sess, [img], [out])
  converter.inference_type = tf.lite.constants.QUANTIZED_UINT8
  input_arrays = converter.get_input_arrays()
  converter.quantized_input_stats = {input_arrays[0] : (0., 1.)}  # mean, std_dev
  sess.run(tf.global_variables_initializer())
  converter = tf.compat.v1.lite.TFLiteConverter.from_session(sess, [img], [out])
  tflite_model = converter.convert()
  open("converted_model.tflite", "wb").write(tflite_model)
```

### Convert a Frozen GraphDef from file <a name="basic_graphdef_file"></a>

## Additional instructions
The example uses
[Mobilenet_1.0_224](https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_1.0_224_frozen.tgz).

### Build from source code <a name="latest_package"></a>
```python
import tensorflow as tf

In order to run the latest version of the TensorFlow Lite Converter Python API,
either install the nightly build with
[pip](https://www.tensorflow.org/install/pip) (recommended) or
[Docker](https://www.tensorflow.org/install/docker), or
[build the pip package from source](https://www.tensorflow.org/install/source).
converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file='/path/to/mobilenet_v1_1.0_224/frozen_graph.pb',
    # both `.pb` and `.pbtxt` files are accepted.
    input_arrays=['input'],
    output_arrays=['MobilenetV1/Predictions/Softmax'],
    input_shapes={'input' : [1, 224, 224, 3]},
)
tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_model)
```

### Converting models from TensorFlow 1.12 <a name="pre_tensorflow_1.12"></a>
#### Convert checkpoints <a name="checkpoints"></a>

1. Convert checkpoints to a Frozen GraphDef as follows
   (*[reference](https://laid.delanover.com/how-to-freeze-a-graph-in-tensorflow/)*):

    * Install [bazel](https://docs.bazel.build/versions/master/install.html)
    * Clone the TensorFlow repository: `git clone
      https://github.com/tensorflow/tensorflow.git`
    * Build freeze graph tool: `bazel build
      tensorflow/python/tools:freeze_graph`
        * The directory from which you run this should contain a file named
          'WORKSPACE'.
        * If you're running on Ubuntu 16.04 OS and face issues, update the
          command to `bazel build -c opt --copt=-msse4.1 --copt=-msse4.2
          tensorflow/python/tools:freeze_graph`
    * Run freeze graph tool: `bazel run tensorflow/python/tools:freeze_graph
      --input_graph=/path/to/graph.pbtxt --input_binary=false
      --input_checkpoint=/path/to/model.ckpt-00010
      --output_graph=/path/to/frozen_graph.pb
      --output_node_names=name1,name2.....`
        * If you have an input `*.pb` file instead of `*.pbtxt`, then replace
          `--input_graph=/path/to/graph.pbtxt --input_binary=false` with
          `--input_graph=/path/to/graph.pb`
    * You can find the output names by exploring the graph using
      [Netron](https://github.com/lutzroeder/netron) or
      [summarize graph tool](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms#inspecting-graphs).

2. Now [convert the Frozen GraphDef file](#basic_graphdef_file) to a TensorFlow
   Lite model as shown in the example above.

## Complex examples <a name="complex"></a>

For models where the default value of the attributes is not sufficient, the
attribute's values should be set before calling `convert()`. Run
`help(tf.compat.v1.lite.TFLiteConverter)` in the Python terminal for detailed
documentation on the attributes.

### Convert a quantize aware trained model <a name="complex_quant"></a>

The following example shows how to convert a quantize aware trained model into a
TensorFlow Lite model.

The example uses
[Mobilenet_1.0_224](https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_1.0_224_frozen.tgz).

```python
import tensorflow as tf

converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file='/path/to/mobilenet_v1_1.0_224/frozen_graph.pb',
    input_arrays=['input'],
    output_arrays=['MobilenetV1/Predictions/Softmax'],
    input_shapes={'input' : [1, 224, 224, 3]},
)
converter.quantized_input_stats = {'input' : (0., 1.)}  # mean, std_dev (input range is [-1, 1])
converter.inference_type = tf.int8  # this is the recommended type.
# converter.inference_input_type=tf.uint8  # optional
# converter.inference_output_type=tf.uint8  # optional
tflite_model = converter.convert()
with open('mobilenet_v1_1.0_224_quantized.tflite', 'wb') as f:
  f.write(tflite_model)
```

## Convert models from TensorFlow 1.12 <a name="pre_tensorflow_1.12"></a>

Reference the following table to convert TensorFlow models to TensorFlow Lite in
and before TensorFlow 1.12. Run `help()` to get details of each API.