Update TFLite Converter Documentation

PiperOrigin-RevId: 330638471
Change-Id: I05c0c195256e36b0db9162b7354ef1c2454d9c92
This commit is contained in:
Meghna Natraj 2020-09-08 19:41:30 -07:00 committed by TensorFlower Gardener
parent ba7e0b1848
commit ad6e452065
6 changed files with 212 additions and 525 deletions

View File

@@ -78,12 +78,6 @@ upper_tabs:
- heading: "Convert a model"
- title: "Overview"
path: /lite/convert/
- title: "Python API"
path: /lite/convert/python_api
- title: "Command line"
path: /lite/convert/cmdline
- title: "Convert quantized models"
path: /lite/convert/quantization
- title: "Convert RNN models"
path: /lite/convert/rnn
- title: "Add metadata"
@@ -98,7 +92,7 @@ upper_tabs:
status: experimental
path: /lite/guide/model_maker
- heading: "Inference"
- heading: "Run Inference"
- title: "Overview"
path: /lite/guide/inference
- title: "Operator compatibility"
@@ -113,7 +107,7 @@
path: /lite/guide/ops_version
status: experimental
- heading: "Inference with metadata"
- heading: "Run Inference with metadata"
- title: "Overview"
path: /lite/inference_with_metadata/overview
- title: "Generate model interfaces with codegen"

View File

@@ -1,121 +0,0 @@
# Converter command line reference
This page describes how to use the [TensorFlow Lite converter](index.md) from
the command line tool. However, the [Python API](python_api.md) is recommended
for the majority of cases.
Note: This only contains documentation on the command line tool in TensorFlow 2.
Documentation on using the command line tool in TensorFlow 1 is available on
GitHub
([reference](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/r1/convert/cmdline_reference.md),
[example](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/r1/convert/cmdline_examples.md)).
## High-level overview
The TensorFlow Lite Converter has a command line tool named `tflite_convert`,
which supports converting models saved in the following file formats:
* [SavedModel directory](https://www.tensorflow.org/guide/saved_model)
generated in 1.X or 2.X.
* [`tf.keras` model](https://www.tensorflow.org/guide/keras/overview)
saved in the HDF5 file format.
Use the [Python API](python_api.md) for any conversions involving optimizations,
or any additional parameters (e.g. custom objects in
[Keras models](https://www.tensorflow.org/guide/keras/overview)).
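For example, a minimal sketch of such a conversion using the Python API might look like the following (the `/tmp/model.h5` path and the `MyScale` layer are hypothetical stand-ins for your own model and custom objects):
```python
import tensorflow as tf

# Hypothetical custom layer referenced by the saved Keras model.
class MyScale(tf.keras.layers.Layer):
  def call(self, inputs):
    return inputs * 2.0

# Loading an HDF5 model that uses custom objects requires the Python API.
model = tf.keras.models.load_model(
    '/tmp/model.h5', custom_objects={'MyScale': MyScale})

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional optimization
tflite_model = converter.convert()
```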
## Usage
The following example shows a `SavedModel` being converted:
```sh
tflite_convert \
--saved_model_dir=/tmp/mobilenet_saved_model \
--output_file=/tmp/mobilenet.tflite
```
The inputs and outputs are specified using the following commonly used flags:
* `--output_file`. Type: string. Specifies the full path of the output file.
* `--saved_model_dir`. Type: string. Specifies the full path to the directory
containing the SavedModel generated in 1.X or 2.X.
* `--keras_model_file`. Type: string. Specifies the full path of the HDF5 file
containing the `tf.keras` model generated in 1.X or 2.X.
To use all of the available flags, use the following command:
```sh
tflite_convert --help
```
The following flag can be used for compatibility with the TensorFlow 1.X version
of the converter CLI:
* `--enable_v1_converter`. Type: bool. Allows the use of the 1.X command
line flags instead of the 2.X flags. The 1.X command line flags are
specified
[here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/r1/convert/cmdline_reference.md).
## Installing the converter CLI
To obtain the latest version of the TensorFlow Lite converter CLI, we recommend
installing the nightly build using
[pip](https://www.tensorflow.org/install/pip):
```sh
pip install tf-nightly
```
Alternatively, you can
[clone the TensorFlow repository](https://www.tensorflow.org/install/source) and
use `bazel` to run the command:
```sh
bazel run //tensorflow/lite/python:tflite_convert -- \
--saved_model_dir=/tmp/mobilenet_saved_model \
--output_file=/tmp/mobilenet.tflite
```
### Custom ops in the new converter
There is a behavior change in how models containing
[custom ops](https://www.tensorflow.org/lite/guide/ops_custom) (those for which
users previously set `--allow_custom_ops`) are handled in the
[new converter](https://github.com/tensorflow/tensorflow/blob/917ebfe5fc1dfacf8eedcc746b7989bafc9588ef/tensorflow/lite/python/lite.py#L81).
**Built-in TensorFlow op**
If you are converting a model with a built-in TensorFlow op that does not exist
in TensorFlow Lite, you should set the `--allow_custom_ops` argument (same as
before), as explained [here](https://www.tensorflow.org/lite/guide/ops_custom).
**Custom op in TensorFlow**
If you are converting a model with a custom TensorFlow op, it is recommended
that you write a [TensorFlow kernel](https://www.tensorflow.org/guide/create_op)
and [TensorFlow Lite kernel](https://www.tensorflow.org/lite/guide/ops_custom).
This ensures that the model works end-to-end, from TensorFlow to
TensorFlow Lite. This also requires setting the `--allow_custom_ops` argument.
**Advanced custom op usage (not recommended)**
If the above is not possible, you can still convert a TensorFlow model
containing a custom op without a corresponding kernel. You will need to pass the
[OpDef](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/op_def.proto)
of the custom op in TensorFlow using `--custom_opdefs` flag, as long as you have
the corresponding OpDef registered in the TensorFlow global op registry. This
ensures that the TensorFlow model is valid (i.e. loadable by the TensorFlow
runtime).
If the custom op is not part of the global TensorFlow op registry, then the
corresponding OpDef needs to be specified via the `--custom_opdefs` flag. This
is a string containing a list of OpDef protos that need to be additionally registered.
Below is an example of a TFLiteAwesomeCustomOp with 2 inputs, 1 output, and 2
attributes:
```sh
--custom_opdefs="name: 'TFLiteAwesomeCustomOp' input_arg: { name: 'InputA'
type: DT_FLOAT } input_arg: { name: 'InputB' type: DT_FLOAT }
output_arg: { name: 'Output' type: DT_FLOAT } attr : { name: 'Attr1' type:
'float'} attr : { name: 'Attr2' type: 'list(float)'}"
```

View File

@@ -1,66 +1,226 @@
# TensorFlow Lite converter
The TensorFlow Lite converter takes a TensorFlow model and generates a
TensorFlow Lite model file (`.tflite`). The converter supports
[SavedModel directories](https://www.tensorflow.org/guide/saved_model),
[`tf.keras` models](https://www.tensorflow.org/guide/keras/overview), and
[concrete functions](https://tensorflow.org/guide/concrete_function).
TensorFlow Lite model (an optimized
[FlatBuffer](https://google.github.io/flatbuffers/) format identified by the
`.tflite` file extension). You have the following two options for using the
converter:
Note: This page contains documentation on the converter API for TensorFlow 2.0.
The API for TensorFlow 1.X is available
[here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/r1/convert/index.md).
1. [Python API](#python_api) (***recommended***): This makes it easier to
convert models as part of the model development pipeline, apply
optimizations, add metadata, and use many more features.
2. [Command line](#cmdline): This only supports basic model conversion.
## Converting models
Note: In case you encounter any issues during model conversion, create a
[GitHub issue](https://github.com/tensorflow/tensorflow/issues/new?template=60-tflite-converter-issue.md).
In TensorFlow Lite, there are two ways to create a TensorFlow Lite model file:
![TFLite converter workflow](../images/convert/convert.png)
* [Python API](python_api.md) (recommended): The Python API makes it easier to
convert models as part of a model development pipeline and helps mitigate
[compatibility](../guide/ops_compatibility.md) issues early on.
* [Command line tool](cmdline.md): The CLI tool supports converting models
saved in the supported file formats: the directory containing the SavedModel,
and the HDF5 file containing the
[`tf.keras` model](https://www.tensorflow.org/guide/keras/overview).
## Python API <a name="python_api"></a>
## Device deployment
*Helper code: In Python, to identify the TensorFlow version, run
`print(tf.__version__)` and to learn more about the API, run
`print(help(tf.lite.TFLiteConverter))`.*
The TensorFlow Lite model is formatted in
[`FlatBuffer`](https://google.github.io/flatbuffers/). After conversion, the
model file is deployed to a client device (e.g. mobile, embedded) and run
locally using the TensorFlow Lite interpreter. This conversion process is shown
in the diagram below:
If you've
[installed TensorFlow 2.x](https://www.tensorflow.org/install/pip#tensorflow-2-packages-are-available),
you have the following two options: (*if you've
[installed TensorFlow 1.x](https://www.tensorflow.org/install/pip#older-versions-of-tensorflow),
refer to
[Github](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/r1/convert/python_api.md)*)
![TFLite converter workflow](../images/convert/workflow.svg)
* [`tf.lite.TFLiteConverter`](https://www.tensorflow.org/api_docs/python/tf/lite/TFLiteConverter):
Converts TensorFlow 2.x models, which are stored using the SavedModel format
and are generated either using the high-level `tf.keras.*` APIs (a Keras
model) or the low-level `tf.*` APIs (from which you generate concrete
functions). As a result, you have the following three options (detailed
examples are in the next few sections):
## MLIR-based conversion
* `tf.lite.TFLiteConverter.from_saved_model()` (**recommended**): Converts
a [SavedModel](https://www.tensorflow.org/guide/saved_model).
* `tf.lite.TFLiteConverter.from_keras_model()`: Converts a
[Keras](https://www.tensorflow.org/guide/keras/overview) model.
* `tf.lite.TFLiteConverter.from_concrete_functions()`: Converts
[concrete functions](https://www.tensorflow.org/guide/intro_to_graphs).
TensorFlow Lite has used a new MLIR-based converter backend by default since
TensorFlow 2.2. The new converter backend provides the following
benefits:
* [`tf.compat.v1.lite.TFLiteConverter`](https://www.tensorflow.org/api_docs/python/tf/compat/v1/lite/TFLiteConverter):
Converts TensorFlow 1.x models (detailed examples are on
[Github](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/r1/convert/python_api.md), and a minimal sketch follows below):
* Enables conversion of new classes of models, including Mask R-CNN, Mobile
BERT, and many more
* Adds support for functional control flow (enabled by default in TensorFlow
2.x)
* Tracks original TensorFlow node name and Python code, and exposes them
during conversion if errors occur
* Leverages MLIR, Google's cutting edge compiler technology for ML, which
makes it easier to extend to accommodate feature requests
* Adds basic support for models with input tensors containing unknown
dimensions
* Supports all existing converter functionality
* `tf.compat.v1.lite.TFLiteConverter.from_saved_model()`: Converts a
[SavedModel](https://www.tensorflow.org/guide/saved_model).
* `tf.compat.v1.lite.TFLiteConverter.from_keras_model_file()`: Converts a
[Keras](https://www.tensorflow.org/guide/keras/overview) model.
* `tf.compat.v1.lite.TFLiteConverter.from_session()`: Converts a GraphDef
from a session.
* `tf.compat.v1.lite.TFLiteConverter.from_frozen_graph()`: Converts a
Frozen GraphDef from a file. If you have checkpoints, then first convert
it to a Frozen GraphDef file and then use this API as shown
[here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/r1/convert/python_api.md#checkpoints).
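For reference, a minimal sketch of the TensorFlow 1.x path might look like the following (the frozen graph path and the input/output tensor names are hypothetical):
```python
import tensorflow as tf

# Convert a frozen GraphDef file using the TF 1.x compatibility API.
converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file='/tmp/frozen_graph.pb',  # hypothetical path
    input_arrays=['input'],                 # hypothetical input tensor name
    output_arrays=['output'])               # hypothetical output tensor name
tflite_model = converter.convert()

with open('model.tflite', 'wb') as f:
  f.write(tflite_model)
```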
## Getting Help
Note: The following sections assume you've both installed TensorFlow 2.x and
trained models in TensorFlow 2.x.
To get help with issues you may encounter using the TensorFlow Lite converter:
### Convert a SavedModel (recommended) <a name="saved_model"></a>
* Please create a
[GitHub issue](https://github.com/tensorflow/tensorflow/issues/new?template=60-tflite-converter-issue.md)
with the component label “TFLiteConverter”.
* If you are using the `allow_custom_ops` feature, please read the
[Python API](../convert/python_api.md) and
[Command Line Tool](../convert/cmdline.md) documentation.
* Switch to the old converter by setting `--experimental_new_converter=false`
(from the [tflite_convert](../convert/cmdline.md) command line tool) or
`converter.experimental_new_converter=False` (from the
[Python API](https://www.tensorflow.org/api_docs/python/tf/lite/TFLiteConverter))
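For example, a minimal sketch of switching back to the old converter from the Python API (assuming a hypothetical `/tmp/saved_model` directory) is:
```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model('/tmp/saved_model')  # hypothetical path
converter.experimental_new_converter = False  # fall back to the old converter backend
tflite_model = converter.convert()
```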
The following example shows how to convert a
[SavedModel](https://www.tensorflow.org/guide/saved_model) into a TensorFlow
Lite model.
```python
import tensorflow as tf
# Convert the model
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir) # path to the SavedModel directory
tflite_model = converter.convert()
# Save the model.
with open('model.tflite', 'wb') as f:
  f.write(tflite_model)
```
### Convert a Keras model <a name="keras"></a>
The following example shows how to convert a
[Keras](https://www.tensorflow.org/guide/keras/overview) model into a TensorFlow
Lite model.
```python
import tensorflow as tf
# Create a model using high-level tf.keras.* APIs
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units=1, input_shape=[1]),
    tf.keras.layers.Dense(units=16, activation='relu'),
    tf.keras.layers.Dense(units=1)
])
model.compile(optimizer='sgd', loss='mean_squared_error') # compile the model
model.fit(x=[-1, 0, 1], y=[-3, -1, 1], epochs=5) # train the model
# (to generate a SavedModel) tf.saved_model.save(model, "saved_model_keras_dir")
# Convert the model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
# Save the model.
with open('model.tflite', 'wb') as f:
  f.write(tflite_model)
```
### Convert concrete functions <a name="concrete_function"></a>
The following example shows how to convert
[concrete functions](https://www.tensorflow.org/guide/intro_to_graphs) into a
TensorFlow Lite model.
Note: Currently, it only supports the conversion of a single concrete function.
```python
import tensorflow as tf
# Create a model using low-level tf.* APIs
class Squared(tf.Module):
  @tf.function
  def __call__(self, x):
    return tf.square(x)
model = Squared()
# (to run your model) result = model(5.0)  # result is a tensor with value 25.0
# (to generate a SavedModel) tf.saved_model.save(model, "saved_model_tf_dir")
concrete_func = model.__call__.get_concrete_function()
# Convert the model
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func])
tflite_model = converter.convert()
# Save the model.
with open('model.tflite', 'wb') as f:
  f.write(tflite_model)
```
### Other features
* Apply [optimizations](../performance/model_optimization.md). A common
optimization used is
[post-training quantization](../performance/post_training_quantization.md),
which can further reduce your model latency and size with minimal loss in
accuracy (a minimal sketch appears after this list).
* Handle unsupported operations. You have the following options if your model
has operators that are:
1. Supported in TensorFlow but unsupported in TensorFlow Lite: If you have
size constraints, you need to
[create the TensorFlow Lite operator](../guide/ops_custom.md), otherwise
just [use TensorFlow operators](../guide/ops_select.md) in your
TensorFlow Lite model.
2. Unsupported in TensorFlow: You need to
[create the TensorFlow operator](https://www.tensorflow.org/guide/create_op)
and then [create the TensorFlow Lite operator](../guide/ops_custom.md).
If you were unsuccessful at creating the TensorFlow operator or don't
wish to create one (**not recommended, proceed with caution**), you can
still convert using the `custom_opdefs` attribute and then directly
[create the TensorFlow Lite operator](../guide/ops_custom.md). The
`custom_opdefs` attribute is a string containing one or more
[OpDef](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/op_def.proto)s
(operator definition protos). Below is an example of a
`TFLiteAwesomeCustomOp` with 1 input, 1 output, and 2 attributes:
```python
converter.custom_opdefs="""name: 'TFLiteAwesomeCustomOp' input_arg:
{ name: 'In' type: DT_FLOAT } output_arg: { name: 'Out' type: DT_FLOAT }
attr : { name: 'a1' type: 'float'} attr : { name: 'a2' type: 'list(float)'}"""
```
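As referenced in the optimizations item above, a minimal post-training quantization sketch (assuming `saved_model_dir` points to your SavedModel) is:
```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enables post-training quantization
tflite_quant_model = converter.convert()
```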
## Command Line Tool <a name="cmdline"></a>
**It is highly recommended that you use the [Python API](#python_api) listed
above instead, if possible.**
If you've
[installed TensorFlow 2.x from pip](https://www.tensorflow.org/install/pip), use
the `tflite_convert` command as follows: (*if you've
[installed TensorFlow 2.x from source](https://www.tensorflow.org/install/source)
then you can replace '`tflite_convert`' with '`bazel run
//tensorflow/lite/python:tflite_convert --`' in the following
sections, and if you've
[installed TensorFlow 1.x](https://www.tensorflow.org/install/pip#older-versions-of-tensorflow)
then refer to Github
([reference](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/r1/convert/cmdline_reference.md),
[examples](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/r1/convert/cmdline_examples.md)))*
`tflite_convert`: To view all the available flags, use the following command:
```sh
$ tflite_convert --help
`--output_file`. Type: string. Full path of the output file.
`--saved_model_dir`. Type: string. Full path to the SavedModel directory.
`--keras_model_file`. Type: string. Full path to the Keras H5 model file.
`--enable_v1_converter`. Type: bool. (default False) Enables the converter and flags used in TF 1.x instead of TF 2.x.
You are required to provide the `--output_file` flag and either the `--saved_model_dir` or `--keras_model_file` flag.
```
### Converting a SavedModel <a name="cmdline_saved_model"></a>
```sh
tflite_convert \
--saved_model_dir=/tmp/mobilenet_saved_model \
--output_file=/tmp/mobilenet.tflite
```
### Converting a Keras H5 model <a name="cmdline_keras_model"></a>
```sh
tflite_convert \
--keras_model_file=/tmp/mobilenet_keras_model.h5 \
--output_file=/tmp/mobilenet.tflite
```
## Next Steps
* Add [metadata](metadata.md), which makes it easier to create platform-specific
wrapper code when deploying models on devices.
* Use the [TensorFlow Lite interpreter](../guide/inference.md) to run
inference on a client device (e.g. mobile, embedded).
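For example, a minimal sketch of running the converted model with the Python interpreter (assuming the `model.tflite` file written above) is:
```python
import numpy as np
import tensorflow as tf

# Load the TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path='model.tflite')
interpreter.allocate_tensors()

# Run inference on random input data that matches the model's input shape.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
input_data = np.random.random_sample(input_details[0]['shape']).astype(np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
```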

View File

@@ -1,267 +0,0 @@
# Converter Python API guide
This page provides examples on how to use the
[TensorFlow Lite converter](index.md) using the Python API.
Note: This only contains documentation on the Python API in TensorFlow 2.
Documentation on using the Python API in TensorFlow 1 is available on
[GitHub](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/r1/convert/python_api.md).
[TOC]
## Python API
The Python API for converting TensorFlow models to TensorFlow Lite is
`tf.lite.TFLiteConverter`. `TFLiteConverter` provides the following classmethods
to convert a model based on the original model format:
* `TFLiteConverter.from_saved_model()`: Converts
[SavedModel directories](https://www.tensorflow.org/guide/saved_model).
* `TFLiteConverter.from_keras_model()`: Converts
[`tf.keras` models](https://www.tensorflow.org/guide/keras/overview).
* `TFLiteConverter.from_concrete_functions()`: Converts
[concrete functions](https://tensorflow.org/guide/concrete_function).
This document contains [example usages](#examples) of the API and
[instructions](#versioning) on running the different versions of TensorFlow.
## Examples <a name="examples"></a>
### Converting a SavedModel <a name="saved_model"></a>
The following example shows how to convert a
[SavedModel](https://www.tensorflow.org/guide/saved_model) into a TensorFlow
Lite [`FlatBuffer`](https://google.github.io/flatbuffers/).
```python
import tensorflow as tf
# Construct a basic model.
root = tf.train.Checkpoint()
root.v1 = tf.Variable(3.)
root.v2 = tf.Variable(2.)
root.f = tf.function(lambda x: root.v1 * root.v2 * x)
# Save the model in SavedModel format.
export_dir = "/tmp/test_saved_model"
input_data = tf.constant(1., shape=[1, 1])
to_save = root.f.get_concrete_function(input_data)
tf.saved_model.save(root, export_dir, to_save)
# Convert the model.
converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)
tflite_model = converter.convert()
# Save the TF Lite model.
with tf.io.gfile.GFile('model.tflite', 'wb') as f:
  f.write(tflite_model)
```
This API does not have the option of specifying the input shape of any input
arrays. If your model requires specifying the input shape, use the
[`from_concrete_functions`](#concrete_function) classmethod instead. The code
looks similar to the following:
```python
model = tf.saved_model.load(export_dir)
concrete_func = model.signatures[
    tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY]
concrete_func.inputs[0].set_shape([1, 256, 256, 3])
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func])
```
### Converting a Keras model <a name="keras"></a>
The following example shows how to convert a
[`tf.keras` model](https://www.tensorflow.org/guide/keras/overview) into a
TensorFlow Lite [`FlatBuffer`](https://google.github.io/flatbuffers/).
```python
import tensorflow as tf
# Create a simple Keras model.
x = [-1, 0, 1, 2, 3, 4]
y = [-3, -1, 1, 3, 5, 7]
model = tf.keras.models.Sequential(
[tf.keras.layers.Dense(units=1, input_shape=[1])])
model.compile(optimizer='sgd', loss='mean_squared_error')
model.fit(x, y, epochs=50)
# Convert the model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
# Save the TF Lite model.
with tf.io.gfile.GFile('model.tflite', 'wb') as f:
  f.write(tflite_model)
```
If your model requires specifying the input shape, use `tf.keras.layers.Input`
or `tf.keras.layers.InputLayer` to create a Keras model with a fixed input shape
as seen below or use the [`from_concrete_functions`](#concrete_function)
classmethod as shown in the prior section to set the shape of the input arrays
prior to conversion.
```python
input = tf.keras.layers.Input(shape=(1,), batch_size=1)
dense_layer = tf.keras.layers.Dense(units=1, input_shape=[1])
model = tf.keras.Model(input, dense_layer(input))
```
```python
model = tf.keras.models.Sequential(
    [tf.keras.layers.InputLayer(input_shape=(1,), batch_size=1),
     tf.keras.layers.Dense(units=1, input_shape=[1])])
```
### Converting a concrete function <a name="concrete_function"></a>
The following example shows how to convert a TensorFlow
[concrete function](https://tensorflow.org/guide/concrete_function) into a
TensorFlow Lite [`FlatBuffer`](https://google.github.io/flatbuffers/).
```python
import tensorflow as tf
# Construct a basic model.
root = tf.train.Checkpoint()
root.v1 = tf.Variable(3.)
root.v2 = tf.Variable(2.)
root.f = tf.function(lambda x: root.v1 * root.v2 * x)
# Create the concrete function.
input_data = tf.constant(1., shape=[1, 1])
concrete_func = root.f.get_concrete_function(input_data)
# Convert the model.
#
# `from_concrete_functions` takes in a list of concrete functions, however,
# currently only supports converting one function at a time. Converting multiple
# functions is under development.
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func])
tflite_model = converter.convert()
# Save the TF Lite model.
with tf.io.gfile.GFile('model.tflite', 'wb') as f:
  f.write(tflite_model)
```
### End-to-end MobileNet conversion <a name="mobilenet"></a>
The following example shows how to convert a pre-trained `tf.keras` MobileNet
model to TensorFlow Lite and run inference on it. It compares the results of the
TensorFlow and TensorFlow Lite models on random data. In order to load the model
from a file, use `model_path` instead of `model_content`.
```python
import numpy as np
import tensorflow as tf
# Load the MobileNet tf.keras model.
model = tf.keras.applications.MobileNetV2(
weights="imagenet", input_shape=(224, 224, 3))
# Convert the model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
# Test the TensorFlow Lite model on random input data.
input_shape = input_details[0]['shape']
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
# The function `get_tensor()` returns a copy of the tensor data.
# Use `tensor()` in order to get a pointer to the tensor.
tflite_results = interpreter.get_tensor(output_details[0]['index'])
# Test the TensorFlow model on random input data.
tf_results = model(tf.constant(input_data))
# Compare the result.
for tf_result, tflite_result in zip(tf_results, tflite_results):
  np.testing.assert_almost_equal(tf_result, tflite_result, decimal=5)
```
#### TensorFlow Lite Metadata
Note: TensorFlow Lite Metadata is in experimental (beta) phase.
TensorFlow Lite metadata provides a standard for model descriptions. The
metadata is an important source of knowledge about what the model does and its
input / output information. This makes it easier for other developers to
understand the best practices and for code generators to create
platform-specific wrapper code. For more information, please refer to the
[TensorFlow Lite Metadata](metadata.md) section.
## Installing TensorFlow <a name="versioning"></a>
### Installing the TensorFlow nightly <a name="2.0-nightly"></a>
The TensorFlow nightly can be installed using the following command:
```sh
pip install tf-nightly
```
### Build from source code <a name="latest_package"></a>
In order to run the latest version of the TensorFlow Lite Converter Python API,
either install the nightly build with
[pip](https://www.tensorflow.org/install/pip) (recommended) or
[Docker](https://www.tensorflow.org/install/docker), or
[build the pip package from source](https://www.tensorflow.org/install/source).
### Custom ops in the experimental new converter
There is a behavior change in how models containing
[custom ops](https://www.tensorflow.org/lite/guide/ops_custom) (those for which
users previously set `allow_custom_ops`) are handled in the
[new converter](https://github.com/tensorflow/tensorflow/blob/917ebfe5fc1dfacf8eedcc746b7989bafc9588ef/tensorflow/lite/python/lite.py#L81).
**Built-in TensorFlow op**
If you are converting a model with a built-in TensorFlow op that does not exist
in TensorFlow Lite, you should set the `allow_custom_ops` attribute (same as
before), explained [here](https://www.tensorflow.org/lite/guide/ops_custom).
**Custom op in TensorFlow**
If you are converting a model with a custom TensorFlow op, it is recommended
that you write a [TensorFlow kernel](https://www.tensorflow.org/guide/create_op)
and [TensorFlow Lite kernel](https://www.tensorflow.org/lite/guide/ops_custom).
This ensures that the model works end-to-end, from TensorFlow to
TensorFlow Lite. This also requires setting the `allow_custom_ops` attribute.
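For example, a minimal sketch of setting this attribute (assuming `saved_model_dir` contains the model with the custom op) is:
```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.allow_custom_ops = True  # keep the unsupported op as a custom op in the TFLite model
tflite_model = converter.convert()
```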
**Advanced custom op usage (not recommended)**
If the above is not possible, you can still convert a TensorFlow model
containing a custom op without a corresponding kernel. You will need to pass the
[OpDef](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/op_def.proto)
of the custom op in TensorFlow using the `custom_opdefs` attribute, as long as you have
the corresponding OpDef registered in the TensorFlow global op registry. This
ensures that the TensorFlow model is valid (i.e. loadable by the TensorFlow
runtime).
If the custom op is not part of the global TensorFlow op registry, then the
corresponding OpDef needs to be specified via the `custom_opdefs` attribute. This
is a string containing a list of OpDef protos that need to be additionally registered.
Below is an example of a TFLiteAwesomeCustomOp with 2 inputs, 1 output, and 2
attributes:
```python
converter.custom_opdefs="""name: 'TFLiteAwesomeCustomOp' input_arg: { name: 'InputA'
type: DT_FLOAT } input_arg: { name: 'InputB' type: DT_FLOAT }
output_arg: { name: 'Output' type: DT_FLOAT } attr : { name: 'Attr1' type:
'float'} attr : { name: 'Attr2' type: 'list(float)'}"""
```

View File

@@ -1,79 +0,0 @@
# Converting Quantized Models
This page provides information on how to convert quantized TensorFlow Lite
models. For more details, please see the
[model optimization](../performance/model_optimization.md) page.
# Post-training: Quantizing models for CPU model size
The simplest way to create a small model is to quantize the weights to 8 bits
and quantize the inputs/activations "on-the-fly", during inference. This
has latency benefits, but prioritizes size reduction.
During conversion, set the `optimizations` flag to optimize for size:
```python
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant_model = converter.convert()
```
# Full integer quantization of weights and activations
We can get further latency improvements, reductions in peak memory usage, and
access to integer only hardware accelerators by making sure all model math is
quantized. To do this, we need to measure the dynamic range of activations and
inputs with a representative data set. You can simply create an input data
generator and provide it to our converter.
```python
import tensorflow as tf
def representative_dataset_gen():
  for _ in range(num_calibration_steps):
    # Get sample input data as a numpy array in a method of your choosing.
    yield [input]
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset_gen
tflite_quant_model = converter.convert()
```
# During training: Quantizing models for integer-only execution
Quantizing models for integer-only execution produces a model with even faster
latency, smaller size, and compatibility with integer-only accelerators.
Currently, this requires training a model with
["fake-quantization" nodes](https://github.com/tensorflow/tensorflow/tree/r1.13/tensorflow/contrib/quantize).
This is only available in the v1 converter. A longer term solution that's
compatible with 2.0 semantics is in progress.
Convert the graph:
```python
converter = tf.compat.v1.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.inference_type = tf.lite.constants.QUANTIZED_UINT8
input_arrays = converter.get_input_arrays()
converter.quantized_input_stats = {input_arrays[0] : (0., 1.)} # mean_value, std_dev
tflite_model = converter.convert()
```
For fully integer models, the inputs are uint8. When the `inference_type` is set
to `QUANTIZED_UINT8` as above, the real_input_value is standardised using the
[standard-score](https://en.wikipedia.org/wiki/Standard_score) as follows:
real_input_value = (quantized_input_value - mean_value) / std_dev_value
The `mean_value` and `std_dev` values specify how those uint8 values map to the
float input values used while training the model. For more details, please see
the
[TFLiteConverter](https://www.tensorflow.org/api_docs/python/tf/compat/v1/lite/TFLiteConverter) documentation.
`mean` is the integer value from 0 to 255 that maps to floating point 0.0f.
`std_dev` is 255 / (float_max - float_min).
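For example, a minimal sketch of computing these values (assuming, hypothetically, that inputs were normalized to the float range [-1, 1] during training, and reusing the `converter` and `input_arrays` from the example above) is:
```python
# Hypothetical float input range used during training.
float_min, float_max = -1.0, 1.0

std_dev = 255.0 / (float_max - float_min)  # 127.5
mean = -float_min * std_dev                # 127.5, the uint8-scale value that maps to 0.0f

# real_input_value = (quantized_input_value - mean) / std_dev
converter.quantized_input_stats = {input_arrays[0]: (mean, std_dev)}
```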
For most users, we recommend using post-training quantization. We are working on
new tools for post-training and training-time quantization that we hope will
simplify generating quantized models.

Binary file not shown (image added; size: 77 KiB).