TensorFlow Lite documentation
PiperOrigin-RevId: 246953700
@@ -1,15 +1,23 @@
# TensorFlow Lite converter

The TensorFlow Lite converter is used to convert TensorFlow models into an
optimized [FlatBuffer](https://google.github.io/flatbuffers/) format, so that
they can be used by the TensorFlow Lite interpreter.

Note: This page contains documentation on the converter API for TensorFlow 1.x.
The API for TensorFlow 2.0 is available
[here](https://www.tensorflow.org/lite/r2/convert/).

## FlatBuffers

FlatBuffer is an efficient open-source cross-platform serialization library. It
is similar to
[protocol buffers](https://developers.google.com/protocol-buffers), with the
distinction that FlatBuffers do not need a parsing/unpacking step to a secondary
representation before data can be accessed, avoiding per-object memory
allocation. The code footprint of FlatBuffers is an order of magnitude smaller
than protocol buffers.

## From model training to device deployment

The TensorFlow Lite converter generates a TensorFlow Lite
@@ -20,14 +28,13 @@ The converter supports the following input formats:

*   [SavedModels](https://www.tensorflow.org/guide/saved_model#using_savedmodel_with_estimators)
*   Frozen `GraphDef`: Models generated by
    [freeze_graph.py](https://www.tensorflow.org/code/tensorflow/python/tools/freeze_graph.py).
*   `tf.keras` HDF5 models.
*   Any model taken from a `tf.Session` (Python API only).

The TensorFlow Lite `FlatBuffer` file is then deployed to a client device, and
the TensorFlow Lite interpreter uses the compressed model for on-device
inference. This conversion process is shown in the diagram below:

![TFLite converter workflow](../images/convert/workflow.svg)
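For example, a frozen `GraphDef` can be converted with a few lines of the
TensorFlow 1.x Python API. This is a minimal sketch only; the file path and the
input/output tensor names below are placeholders that depend on your model:

```python
import tensorflow as tf

# Placeholder path and tensor names -- replace with your model's values.
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file="/tmp/frozen_graph.pb",
    input_arrays=["input"],
    output_arrays=["MobilenetV1/Predictions/Reshape_1"])
tflite_model = converter.convert()

with open("converted_model.tflite", "wb") as f:
    f.write(tflite_model)
```
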
@@ -1,270 +1,286 @@
# Get started with TensorFlow Lite

TensorFlow Lite provides all the tools you need to convert and run TensorFlow
models on mobile, embedded, and IoT devices. The following guide walks through
each step of the developer workflow and provides links to further instructions.

## 1. Choose a model

<a id="1_choose_a_model"></a>

TensorFlow Lite allows you to run TensorFlow models on a wide range of devices.
A TensorFlow model is a data structure that contains the logic and knowledge of
a machine learning network trained to solve a particular problem.

There are many ways to obtain a TensorFlow model, from using pre-trained models
to training your own. To use a model with TensorFlow Lite it must be converted
into a special format. This is explained in section 2,
[Convert the model](#2_convert_the_model_format).

Note: Not all TensorFlow models will work with TensorFlow Lite, since the
interpreter supports a limited subset of TensorFlow operations. See section 2,
[Convert the model](#2_convert_the_model_format) to learn about compatibility.

### Use a pre-trained model

The TensorFlow Lite team provides a set of pre-trained models that solve a
variety of machine learning problems. These models have been converted to work
with TensorFlow Lite and are ready to use in your applications.

The pre-trained models include:

* [Image classification](../models/image_classification/overview.md)
* [Object detection](../models/object_detection/overview.md)
* [Smart reply](../models/smart_reply/overview.md)
* [Pose estimation](../models/pose_estimation/overview.md)
* [Segmentation](../models/segmentation/overview.md)

See our full list of pre-trained models in [Models](../models).

#### Models from other sources

There are many other places you can obtain pre-trained TensorFlow models,
including [TensorFlow Hub](https://www.tensorflow.org/hub). In most cases, these
models will not be provided in the TensorFlow Lite format, and you'll have to
[convert](#2_convert_the_model_format) them before use.

### Re-train a model (transfer learning)

Transfer learning allows you to take a trained model and re-train it to perform
another task. For example, an
[image classification](../models/image_classification/overview.md) model could
be retrained to recognize new categories of image. Re-training takes less time
and requires less data than training a model from scratch.

You can use transfer learning to customize pre-trained models to your
application. Learn how to perform transfer learning in the
<a href="https://codelabs.developers.google.com/codelabs/recognize-flowers-with-tensorflow-on-android">Recognize
flowers with TensorFlow</a> codelab.
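As an illustration of the idea (this is not the codelab's own code), the sketch
below freezes a pre-trained base network and trains only a new classification
head with `tf.keras`. The class count and the `train_data` dataset are
placeholder assumptions you would supply yourself:

```python
import tensorflow as tf

# Placeholder assumption: 5 target classes and a tf.data pipeline of
# (image, label) batches named `train_data` that you must provide.
NUM_CLASSES = 5

# Start from a network pre-trained on ImageNet, without its classifier head.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # Freeze the pre-trained weights.

# Add a small, trainable classification head for the new task.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_data, epochs=5)
```
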
### Train a custom model

If you have designed and trained your own TensorFlow model, or you have trained
a model obtained from another source, you should convert it to the TensorFlow
Lite format before use.

## 2. Convert the model

<a id="2_convert_the_model_format"></a>

TensorFlow Lite is designed to execute models efficiently on devices. Some of
this efficiency comes from the use of a special format for storing models.
TensorFlow models must be converted into this format before they can be used by
TensorFlow Lite.

Converting models reduces their file size and introduces optimizations that do
not affect accuracy. Developers can opt to further reduce file size and increase
speed of execution in exchange for some trade-offs. You can use the TensorFlow
Lite converter to choose which optimizations to apply.

TensorFlow Lite supports a limited subset of TensorFlow operations, so not all
models can be converted. See [Ops compatibility](#ops-compatibility) for more
information.

### TensorFlow Lite converter

The [TensorFlow Lite converter](../convert) is a tool that converts trained
TensorFlow models into the TensorFlow Lite format. It can also introduce
optimizations, which are covered in section 4,
[Optimize your model](#4_optimize_your_model_optional).

The converter is available as a Python API. The following example shows a
TensorFlow `SavedModel` being converted into the TensorFlow Lite format:

```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_model)
```

You can [convert TensorFlow 2.0 models](../r2/convert) in a similar way.

The converter can also be used from the
[command line](../convert/cmdline_examples), but the Python API is recommended.

### Options

The converter can convert from a variety of input types.

When [converting TensorFlow 1.x models](../convert/python_api), these are:

* [SavedModel directories](https://www.tensorflow.org/alpha/guide/saved_model)
* Frozen GraphDef (models generated by
  [freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py))
* [Keras](https://keras.io) HDF5 models
* Models taken from a `tf.Session`

When [converting TensorFlow 2.x models](../r2/convert/python_api), these are:

* [SavedModel directories](https://www.tensorflow.org/alpha/guide/saved_model)
* [`tf.keras` models](https://www.tensorflow.org/alpha/guide/keras/overview)
* [Concrete functions](../r2/convert/concrete_function.md)

The converter can be configured to apply various optimizations that can improve
performance or reduce file size. This is covered in section 4,
[Optimize your model](#4_optimize_your_model_optional).
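For example, a Keras HDF5 file (one of the TensorFlow 1.x input types listed
above) can be converted with the `from_keras_model_file` method. This is a
minimal sketch; the `model.h5` path and the output filename are placeholders:

```python
import tensorflow as tf

# "model.h5" is a placeholder path to a Keras HDF5 model you have saved.
converter = tf.lite.TFLiteConverter.from_keras_model_file("model.h5")
tflite_model = converter.convert()

with open("keras_model.tflite", "wb") as f:
    f.write(tflite_model)
```
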
### Ops compatibility

TensorFlow Lite currently supports a [limited subset](ops_compatibility.md) of
TensorFlow operations. The long term goal is for all TensorFlow operations to be
supported.

If the model you wish to convert contains unsupported operations, you can use
[TensorFlow Select](ops_select.md) to include operations from TensorFlow. This
will result in a larger binary being deployed to devices.
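As a rough sketch of how this looks with the Python converter API, assuming a
`SavedModel` in `saved_model_dir` (note that the exact attribute name has
varied across releases; recent TensorFlow 1.x versions expose it as
`target_spec.supported_ops`):

```python
import tensorflow as tf

# saved_model_dir is a placeholder path to your SavedModel.
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
# Allow both built-in TFLite ops and selected TensorFlow ops (larger binary).
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS,
]
tflite_model = converter.convert()
```
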
## 3. Run inference with the model

<a id="3_use_the_tensorflow_lite_model_for_inference_in_a_mobile_app"></a>

*Inference* is the process of running data through a model to obtain
predictions. It requires a model, an interpreter, and input data.

### TensorFlow Lite interpreter

The [TensorFlow Lite interpreter](inference.md) is a library that takes a model
file, executes the operations it defines on input data, and provides access to
the output.

The interpreter works across multiple platforms and provides a simple API for
running TensorFlow Lite models from Java, Swift, Objective-C, C++, and Python.

The following code shows the interpreter being invoked from Java:

```java
try (Interpreter interpreter = new Interpreter(tensorflow_lite_model_file)) {
  interpreter.run(input, output);
}
```
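For comparison, the following sketch shows a similar invocation from Python. It
assumes a converted model saved as `converted_model.tflite` (a placeholder
filename) and simply feeds random data shaped to match the model's input:

```python
import numpy as np
import tensorflow as tf

# Load the model and allocate its tensors.
interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed random data shaped like the model's input, then run inference.
input_shape = input_details[0]["shape"]
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
interpreter.set_tensor(input_details[0]["index"], input_data)
interpreter.invoke()

output_data = interpreter.get_tensor(output_details[0]["index"])
print(output_data)
```
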
### GPU acceleration and Delegates

Some devices provide hardware acceleration for machine learning operations. For
example, most mobile phones have GPUs, which can perform floating point matrix
operations faster than a CPU.

The speed-up can be substantial. For example, a MobileNet v1 image
classification model runs 5.5x faster on a Pixel 3 phone when GPU acceleration
is used.

The TensorFlow Lite interpreter can be configured with
[Delegates](../performance/delegates.md) to make use of hardware acceleration on
different devices. The [GPU Delegate](../performance/gpu.md) allows the
interpreter to run appropriate operations on the device's GPU.

The following code shows the GPU Delegate being used from Java:

```java
GpuDelegate delegate = new GpuDelegate();
Interpreter.Options options = (new Interpreter.Options()).addDelegate(delegate);
Interpreter interpreter = new Interpreter(tensorflow_lite_model_file, options);
try {
  interpreter.run(input, output);
} finally {
  // Release the delegate's resources once inference is done.
  delegate.close();
}
```

To add support for new hardware accelerators you can
[define your own delegate](../performance/delegates.md#how_to_add_a_delegate).
### Android and iOS

The TensorFlow Lite interpreter is easy to use from both major mobile platforms.
To get started, explore the [Android quickstart](android.md) and
[iOS quickstart](ios.md) guides.

[Example applications](https://www.tensorflow.org/lite/examples) are available
for both platforms.

To obtain the required libraries, Android developers should use the
[TensorFlow Lite AAR](android.md#use_the_tensorflow_lite_aar_from_jcenter). iOS
developers should use the
[CocoaPods for Swift or Objective-C](ios.md#add_tensorflow_lite_to_your_swift_or_objective-c_project).

### Linux

Embedded Linux is an important platform for deploying machine learning. We
provide build instructions for both [Raspberry Pi](build_rpi.md) and
[Arm64-based boards](build_arm64.md) such as Odroid C2, Pine64, and NanoPi.
### Microcontrollers

[TensorFlow Lite for Microcontrollers](../microcontrollers/overview.md) is an
experimental port of TensorFlow Lite aimed at microcontrollers and other devices
with only kilobytes of memory.

### Operations

If your model requires TensorFlow operations that are not yet implemented in
TensorFlow Lite, you can use [TensorFlow Select](ops_select.md) to use them in
your model. You'll need to build a custom version of the interpreter that
includes the TensorFlow operations.

You can use [Custom operators](ops_custom.md) to write your own operations, or
port new operations into TensorFlow Lite.

[Operator versions](ops_version.md) allows you to add new functionalities and
parameters into existing operations.

## 4. Optimize your model

<a id="4_optimize_your_model_optional"></a>

TensorFlow Lite provides tools to optimize the size and performance of your
models, often with minimal impact on accuracy. Optimized models may require
slightly more complex training, conversion, or integration.

Machine learning optimization is an evolving field, and TensorFlow Lite's
[Model Optimization Toolkit](#model-optimization-toolkit) is continually growing
as new techniques are developed.

### Performance

The goal of model optimization is to reach the ideal balance of performance,
model size, and accuracy on a given device.
[Performance best practices](../performance/best_practices.md) can help guide
you through this process.

### Quantization

By reducing the precision of values and operations within a model, quantization
can reduce both the size of a model and the time required for inference. For
many models, there is only a minimal loss of accuracy.

The TensorFlow Lite converter makes it easy to quantize TensorFlow models. The
following Python code quantizes a `SavedModel` and saves it to disk:
```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
tflite_quant_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_quant_model)
```
To learn more about quantization, see
[Post-training quantization](../performance/post_training_quantization.md).

### Model Optimization Toolkit

The [Model Optimization Toolkit](../performance/model_optimization.md) is a set
of tools and techniques designed to make it easy for developers to optimize
their models. Many of the techniques can be applied to all TensorFlow models and
are not specific to TensorFlow Lite, but they are especially valuable when
running inference on devices with limited resources.

## Next steps

Now that you're familiar with TensorFlow Lite, explore some of the following
resources:

* If you're a mobile developer, visit [Android quickstart](android.md) or
  [iOS quickstart](ios.md).
* Explore our [pre-trained models](../models).
* Try our [example apps](https://www.tensorflow.org/lite/examples).
@@ -1,202 +1,121 @@
# TensorFlow Lite guide

TensorFlow Lite is a set of tools to help developers run TensorFlow models on
mobile, embedded, and IoT devices. It enables on-device machine learning
inference with low latency and a small binary size.

TensorFlow Lite consists of two main components:

- The [TensorFlow Lite interpreter](inference.md), which runs specially
  optimized models on many different hardware types, including mobile phones,
  embedded Linux devices, and microcontrollers.
- The [TensorFlow Lite converter](../convert/index.md), which converts
  TensorFlow models into an efficient form for use by the interpreter, and can
  introduce optimizations to improve binary size and performance.

### Machine learning at the edge

TensorFlow Lite is designed to make it easy to perform machine learning on
devices, "at the edge" of the network, instead of sending data back and forth
from a server. For developers, performing machine learning on-device can help
improve:

* *Latency:* there's no round-trip to a server
* *Privacy:* no data needs to leave the device
* *Connectivity:* an Internet connection isn't required
* *Power consumption:* network connections are power hungry

TensorFlow Lite works with a huge range of devices, from tiny microcontrollers
to powerful mobile phones.

Key Point: The TensorFlow Lite binary is smaller than 300KB when all supported
operators are linked, and less than 200KB when using only the operators needed
for supporting the common image classification models InceptionV3 and MobileNet.

## Get started

To begin working with TensorFlow Lite, visit [Get started](get_started.md).

## Key features

* *[Interpreter](inference.md) tuned for on-device ML*, supporting a set of
  core operators that are optimized for on-device applications, and with a
  small binary size.
* *Diverse platform support*, covering [Android](android.md) and [iOS](ios.md)
  devices, embedded Linux, and microcontrollers, making use of platform APIs
  for accelerated inference.
* *APIs for multiple languages* including Java, Swift, Objective-C, C++, and
  Python.
* *High performance*, with [hardware acceleration](../performance/gpu.md) on
  supported devices, device-optimized kernels, and
  [pre-fused activations and biases](ops_compatibility.md).
* *Model optimization tools*, including
  [quantization](../performance/post_training_quantization.md), that can
  reduce size and increase performance of models without sacrificing accuracy.
* *Efficient model format*, using a [FlatBuffer](../convert/index.md) that is
  optimized for small size and portability.
* *[Pre-trained models](../models)* for common machine learning tasks that can
  be customized to your application.
* *[Samples and tutorials](https://www.tensorflow.org/examples)* that show you
  how to deploy machine learning models on supported platforms.

## Development workflow

The workflow for using TensorFlow Lite involves the following steps:

1. **Pick a model**

   Bring your own TensorFlow model, find a model online, or pick a model from
   our [Pre-trained models](../models) to drop in or retrain.

1. **Convert the model**

   If you're using a custom model, use the
   [TensorFlow Lite converter](../convert/index.md) and a few lines of Python
   to convert it to the TensorFlow Lite format.

1. **Deploy to your device**

   Run your model on-device with the
   [TensorFlow Lite interpreter](inference.md), with APIs in many languages.

1. **Optimize your model**

   Use our [Model Optimization Toolkit](../performance/model_optimization.md)
   to reduce your model's size and increase its efficiency with minimal impact
   on accuracy.

To learn more about using TensorFlow Lite in your project, see
[Get started](get_started.md).

## Technical constraints

TensorFlow Lite plans to provide high performance on-device inference for any
TensorFlow model. However, the TensorFlow Lite interpreter currently supports a
limited subset of TensorFlow operators that have been optimized for on-device
use. This means that some models require additional steps to work with
TensorFlow Lite.

To learn which operators are available, see
[Operator compatibility](ops_compatibility.md).

If your model uses operators that are not yet supported by TensorFlow Lite
interpreter, you can use [TensorFlow Select](ops_select.md) to include
TensorFlow operations in your TensorFlow Lite build. However, this will lead to
an increased binary size.

TensorFlow Lite does not currently support on-device training, but it is in our
[Roadmap](roadmap.md), along with other planned improvements.

## Next steps

Want to keep learning about TensorFlow Lite? Here are some next steps:

* Visit [Get started](get_started.md) to walk through the process of using
  TensorFlow Lite.
* If you're a mobile developer, visit [Android quickstart](android.md) or
  [iOS quickstart](ios.md).
* Learn about
  [TensorFlow Lite for Microcontrollers](../microcontrollers/overview.md).
* Explore our [pre-trained models](../models).
* Try our [example apps](https://www.tensorflow.org/lite/examples).
@@ -1,16 +1,15 @@
# TensorFlow Lite inference

The term *inference* refers to the process of executing a TensorFlow Lite model
on-device in order to make predictions based on input data. Inference is the
final step in using the model on-device.

## Overview

Inference for TensorFlow Lite models is run through an interpreter. The
TensorFlow Lite interpreter is designed to be lean and fast. The interpreter
uses a static graph ordering and a custom (less-dynamic) memory allocator to
ensure minimal load, initialization, and execution latency.

This document outlines the various APIs for the interpreter, along with the
[supported platforms](#supported-platforms).

### Important Concepts
@@ -43,19 +42,27 @@ TensorFlow Lite inference on device typically follows the following steps.
    present it to their user.

### Supported Platforms

TensorFlow inference APIs are provided for most common mobile/embedded platforms
such as Android, iOS and Linux.

#### Android

On Android, TensorFlow Lite inference can be performed using either Java or C++
APIs. The Java APIs provide convenience and can be used directly within your
Android Activity classes. The C++ APIs offer more flexibility and speed, but may
require writing JNI wrappers to move data between Java and C++ layers.

Visit the [Android quickstart](android.md) for a tutorial and example code.

#### iOS

TensorFlow Lite provides native iOS libraries written in
[Swift](https://www.tensorflow.org/code/tensorflow/lite/experimental/swift)
and
[Objective-C](https://www.tensorflow.org/code/tensorflow/lite/experimental/objc).

Visit the [iOS quickstart](ios.md) for a tutorial and example code.

#### Linux

On Linux platforms such as [Raspberry Pi](build_rpi.md), TensorFlow Lite C++
@@ -280,5 +280,5 @@ trees in the original training data. To do this, you will need a set of training
images for each of the new labels you wish to train.

Learn how to perform transfer learning in the
<a href="https://codelabs.developers.google.com/codelabs/recognize-flowers-with-tensorflow-on-android/#0">Recognize
flowers with TensorFlow</a> codelab.
@@ -13,12 +13,15 @@ starter model and labels</a>

### Sample application

There is a TensorFlow Lite sample application that demonstrates the smart reply
model on Android.

<a class="button button-primary" href="https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/models/smartreply">View
Android example</a>

Read the
[GitHub page](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/models/smartreply/g3doc)
to learn how the app works.

## How it works