Move docs for Python inference into guide/inference.md, and restructure that page to organize the load/run steps based on language.
PiperOrigin-RevId: 259778674
This commit is contained in:
parent 3198b9be2e
commit b3aafbda35
@@ -1,9 +1,12 @@
# Converter Python API guide

This page provides examples on how to use the TensorFlow Lite Converter and the
TensorFlow Lite interpreter using the Python API.
This page describes how to convert TensorFlow models into the TensorFlow Lite
format using the TensorFlow Lite Converter Python API.

Note: These docs describe the converter in the TensorFlow nightly release,
If you're looking for information about how to run a TensorFlow Lite model,
see [TensorFlow Lite inference](../guide/inference.md).

Note: This page describes the converter in the TensorFlow nightly release,
installed using `pip install tf-nightly`. For docs describing older versions,
reference ["Converting models from TensorFlow 1.12"](#pre_tensorflow_1.12).
@@ -20,13 +23,12 @@ be targeted to devices with mobile.
## API

The API for converting TensorFlow models to TensorFlow Lite is
`tf.lite.TFLiteConverter`. The API for calling the Python interpreter is
`tf.lite.Interpreter`.
`tf.lite.TFLiteConverter`, which provides class methods based on the original
format of the model. For example, `TFLiteConverter.from_session()` is available
for GraphDefs, `TFLiteConverter.from_saved_model()` is available for
SavedModels, and `TFLiteConverter.from_keras_model_file()` is available for
`tf.Keras` files.

`TFLiteConverter` provides class methods based on the original format of the
model. `TFLiteConverter.from_session()` is available for GraphDefs.
`TFLiteConverter.from_saved_model()` is available for SavedModels.
`TFLiteConverter.from_keras_model_file()` is available for `tf.Keras` files.
Example usages for simple floating-point models are shown in
[Basic Examples](#basic). Example usages for more complex models are shown in
[Complex Examples](#complex).
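
For orientation, here is a minimal, hedged sketch of the shared conversion
pattern these class methods follow; the SavedModel directory path is a
hypothetical placeholder:

```python
import tensorflow as tf

# Minimal sketch (hypothetical SavedModel path): pick the class method that
# matches your model's original format, convert, then write the FlatBuffer.
converter = tf.lite.TFLiteConverter.from_saved_model("/tmp/my_saved_model")
tflite_model = converter.convert()

with open("converted_model.tflite", "wb") as f:
  f.write(tflite_model)
```
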
@@ -177,65 +179,6 @@ with tf.Session() as sess:
  open("converted_model.tflite", "wb").write(tflite_model)
```

## TensorFlow Lite Python interpreter <a name="interpreter"></a>

### Using the interpreter from a model file <a name="interpreter_file"></a>

The following example shows how to use the TensorFlow Lite Python interpreter
when provided a TensorFlow Lite FlatBuffer file. The example also demonstrates
how to run inference on random input data. Run
`help(tf.lite.Interpreter)` in the Python terminal to get detailed
documentation on the interpreter.

```python
import numpy as np
import tensorflow as tf

# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
interpreter.allocate_tensors()

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Test model on random input data.
input_shape = input_details[0]['shape']
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)

interpreter.invoke()

# The function `get_tensor()` returns a copy of the tensor data.
# Use `tensor()` in order to get a pointer to the tensor.
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)
```

### Using the interpreter from model data <a name="interpreter_data"></a>

The following example shows how to use the TensorFlow Lite Python interpreter
when starting with the TensorFlow Lite FlatBuffer model previously loaded. This
example shows an end-to-end use case, starting from building the TensorFlow
model.

```python
import numpy as np
import tensorflow as tf

img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3))
const = tf.constant([1., 2., 3.]) + tf.constant([1., 4., 4.])
val = img + const
out = tf.identity(val, name="out")

with tf.Session() as sess:
  converter = tf.lite.TFLiteConverter.from_session(sess, [img], [out])
  tflite_model = converter.convert()

# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
```

## Additional instructions
@@ -4,22 +4,27 @@ TensorFlow Lite provides all the tools you need to convert and run TensorFlow
models on mobile, embedded, and IoT devices. The following guide walks through
each step of the developer workflow and provides links to further instructions.

[TOC]

## 1. Choose a model

<a id="1_choose_a_model"></a>

TensorFlow Lite allows you to run TensorFlow models on a wide range of devices.
A TensorFlow model is a data structure that contains the logic and knowledge of
a machine learning network trained to solve a particular problem.

There are many ways to obtain a TensorFlow model, from using pre-trained models
to training your own. To use a model with TensorFlow Lite it must be converted
into a special format. This is explained in section 2,
[Convert the model](#2_convert_the_model_format).
to training your own.

To use a model with TensorFlow Lite, you must convert a
full TensorFlow model into the TensorFlow Lite format—you
cannot create or train a model using TensorFlow Lite. So you must start with a
regular TensorFlow model, and then
[convert the model](#2_convert_the_model_format).

Note: TensorFlow Lite supports a limited subset of TensorFlow operations, so not
all models can be converted. For details, read about the
[TensorFlow Lite operator compatibility](ops_compatibility.md).

Note: Not all TensorFlow models will work with TensorFlow Lite, since the
interpreter supports a limited subset of TensorFlow operations. See section 2,
[Convert the model](#2_convert_the_model_format) to learn about compatibility.

### Use a pre-trained model
@@ -60,35 +65,37 @@ flowers with TensorFlow</a> codelab.
### Train a custom model

If you have designed and trained your own TensorFlow model, or you have trained
a model obtained from another source, you should convert it to the TensorFlow
Lite format before use.
a model obtained from another source, you must
[convert it to the TensorFlow Lite format](#2_convert_the_model_format).

## 2. Convert the model

<a id="2_convert_the_model_format"></a>

TensorFlow Lite is designed to execute models efficiently on devices. Some of
TensorFlow Lite is designed to execute models efficiently on mobile and other
embedded devices with limited compute and memory resources. Some of
this efficiency comes from the use of a special format for storing models.
TensorFlow models must be converted into this format before they can be used by
TensorFlow Lite.

Converting models reduces their file size and introduces optimizations that do
not affect accuracy. Developers can opt to further reduce file size and increase
speed of execution in exchange for some trade-offs. You can use the TensorFlow
Lite converter to choose which optimizations to apply.
not affect accuracy. The TensorFlow Lite converter provides options
that allow you to further reduce file size and increase speed of execution, with
some trade-offs.
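
For a rough sense of what enabling such an option looks like, here is a hedged
sketch; it assumes a TensorFlow release that exposes `tf.lite.Optimize`, and the
SavedModel path is a hypothetical placeholder:

```python
import tensorflow as tf

# Hedged sketch: request the converter's default optimizations (for example,
# weight quantization). The SavedModel directory below is hypothetical.
converter = tf.lite.TFLiteConverter.from_saved_model("/tmp/my_saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
```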

Note: TensorFlow Lite supports a limited subset of TensorFlow operations, so not
all models can be converted. For details, read about the
[TensorFlow Lite operator compatibility](ops_compatibility.md).

TensorFlow Lite supports a limited subset of TensorFlow operations, so not all
models can be converted. See [Ops compatibility](#ops-compatibility) for more
information.

### TensorFlow Lite converter

The [TensorFlow Lite converter](../convert) is a tool that converts trained
TensorFlow models into the TensorFlow Lite format. It can also introduce
optimizations, which are covered in section 4,
The [TensorFlow Lite converter](../convert) is a tool available as a Python API
that converts trained TensorFlow models into the TensorFlow Lite format. It can
also introduce optimizations, which are covered in section 4,
[Optimize your model](#4_optimize_your_model_optional).

The converter is available as a Python API. The following example shows a
The following example shows a
TensorFlow `SavedModel` being converted into the TensorFlow Lite format:

```python
@@ -128,9 +135,9 @@ performance or reduce file size. This is covered in section 4,

### Ops compatibility

TensorFlow Lite currently supports a [limited subset](ops_compatibility.md) of
TensorFlow operations. The long term goal is for all TensorFlow operations to be
supported.
TensorFlow Lite currently supports a [limited subset of TensorFlow
operations](ops_compatibility.md). The long term goal is for all TensorFlow
operations to be supported.

If the model you wish to convert contains unsupported operations, you can use
[TensorFlow Select](ops_select.md) to include operations from TensorFlow. This
@@ -1,91 +1,104 @@
# TensorFlow Lite inference

The term *inference* refers to the process of executing a TensorFlow Lite model
on-device in order to make predictions based on input data. Inference is the
final step in using the model on-device.
on-device in order to make predictions based on input data. To perform an
inference with a TensorFlow Lite model, you must run it through an
*interpreter*. The TensorFlow Lite interpreter is designed to be lean and fast.
The interpreter uses a static graph ordering and a custom (less-dynamic) memory
allocator to ensure minimal load, initialization, and execution latency.

Inference for TensorFlow Lite models is run through an interpreter. The
TensorFlow Lite interpreter is designed to be lean and fast. The interpreter
uses a static graph ordering and a custom (less-dynamic) memory allocator to
ensure minimal load, initialization, and execution latency.
This page describes how to access the TensorFlow Lite interpreter and
perform an inference using C++, Java, and Python, plus links to other resources
for each [supported platform](#supported-platforms).

This document outlines the various APIs for the interpreter, along with the
[supported platforms](#supported-platforms).
[TOC]

### Important Concepts
## Important concepts

TensorFlow Lite inference on device typically involves the following steps.
TensorFlow Lite inference typically involves the following steps:

1. **Loading a Model**
1. **Loading a model**

The user loads the `.tflite` model into memory which contains the model's
You must load the `.tflite` model into memory, which contains the model's
execution graph.

1. **Transforming Data**
Input data acquired by the user generally may not match the input data format
expected by the model. For example, a user may need to resize an image or change
the image format to be used by the model.
1. **Transforming data**

1. **Running Inference**
Raw input data for the model generally does not match the input data format
expected by the model. For example, you might need to resize an image or
change the image format to be compatible with the model.

This step involves using the API to execute the model. It involves a few
steps such as building the interpreter, and allocating tensors as explained
in detail in [Running a Model](#running_a_model).
1. **Running inference**

1. **Interpreting Output**
This step involves using the TensorFlow Lite API to execute the model. It
involves a few steps such as building the interpreter, and allocating
tensors, as described in the following sections.

The user retrieves results from model inference and interprets the tensors in
a meaningful way to be used in the application.
1. **Interpreting output**

For example, a model may only return a list of probabilities. It is up to the
application developer to meaningfully map them to relevant categories and
present it to their user.
When you receive results from the model inference, you must interpret the
tensors in a meaningful way that's useful in your application.

### Supported Platforms
For example, a model might return only a list of probabilities. It's up to
you to map the probabilities to relevant categories and present it to your
end-user.
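
To make that last step concrete, here is a small, hedged sketch of mapping a
probability vector to a label; the labels and output values are made-up
stand-ins rather than output from a real model:

```python
import numpy as np

# Hypothetical labels and a made-up probability vector standing in for the
# output tensor of a classification model.
labels = ["daisy", "rose", "tulip"]
output_data = np.array([0.08, 0.85, 0.07], dtype=np.float32)

# Pick the most likely class and present it with its score.
top = int(np.argmax(output_data))
print("Prediction: %s (%.2f)" % (labels[top], output_data[top]))
```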

## Supported platforms

TensorFlow Lite inference APIs are provided for most common mobile/embedded
platforms such as Android, iOS and Linux.
such as Android, iOS and Linux, in multiple programming languages.

#### Android
In most cases, the API design reflects a preference for performance over ease of
use. TensorFlow Lite is designed for fast inference on small devices, so it
should be no surprise that the APIs try to avoid unnecessary copies at the
expense of convenience. Similarly, consistency with TensorFlow APIs was not an
explicit goal and some variance between languages is to be expected.

Across all libraries, the TensorFlow Lite API enables you to load models,
feed inputs, and retrieve inference outputs.

### Android

On Android, TensorFlow Lite inference can be performed using either Java or C++
APIs. The Java APIs provide convenience and can be used directly within your
Android Activity classes. The C++ APIs offer more flexibility and speed, but may
require writing JNI wrappers to move data between Java and C++ layers.

Visit the [Android quickstart](android.md) for a tutorial and example code.
See below for details about using C++ and Java, or
follow the [Android quickstart](android.md) for a tutorial and example code.

#### iOS
### iOS

TensorFlow Lite provides native iOS libraries written in
On iOS, TensorFlow Lite is available with native iOS libraries written in
[Swift](https://www.tensorflow.org/code/tensorflow/lite/experimental/swift)
and
[Objective-C](https://www.tensorflow.org/code/tensorflow/lite/experimental/objc).

Visit the [iOS quickstart](ios.md) for a tutorial and example code.
This page doesn't include a discussion about these languages, so you should
refer to the [iOS quickstart](ios.md) for a tutorial and example code.

#### Linux
On Linux platforms such as [Raspberry Pi](build_rpi.md), TensorFlow Lite C++
and Python APIs can be used to run inference.
### Linux

On Linux platforms (including [Raspberry Pi](build_rpi.md)), you can run
inferences using TensorFlow Lite APIs available in C++ and Python, as shown
in the following sections.

## API Guides
## Load and run a model in C++

TensorFlow Lite provides programming APIs in C++, Java and Python, with
experimental bindings for several other languages (C, Swift, Objective-C). In
most cases, the API design reflects a preference for performance over ease of
use. TensorFlow Lite is designed for fast inference on small devices so it
should be no surprise that the APIs try to avoid unnecessary copies at the
expense of convenience. Similarly, consistency with TensorFlow APIs was not an
explicit goal and some variance is to be expected.
Running a TensorFlow Lite model with C++ involves a few simple steps:

There is also a [Python API for TensorFlow Lite](../convert/python_api.md).
1. Load the model into memory as a `FlatBufferModel`.
2. Build an `Interpreter` based on an existing `FlatBufferModel`.
3. Set input tensor values. (Optionally resize input tensors if the
   predefined sizes are not desired.)
4. Invoke inference.
5. Read output tensor values.

### Loading a Model

#### C++
The `FlatBufferModel` class encapsulates a model and can be built in a couple of
slightly different ways depending on where the model is stored:
The [`FlatBufferModel`](
https://www.tensorflow.org/lite/api_docs/cc/class/tflite/flat-buffer-model.html)
class encapsulates a TensorFlow Lite model and you can
build it in a couple of different ways, depending on where the model is stored:

```c++
class FlatBufferModel {
@@ -104,72 +117,36 @@ class FlatBufferModel {
};
```

```c++
tflite::FlatBufferModel model(path_to_model);
```
Note: If TensorFlow Lite detects the presence of the [Android NNAPI](
https://developer.android.com/ndk/guides/neuralnetworks), it will
automatically try to use shared memory to store the `FlatBufferModel`.

Note that if TensorFlow Lite detects the presence of Android's NNAPI it will
automatically try to use shared memory to store the FlatBufferModel.
Now that you have the model as a `FlatBufferModel` object, you can execute it
with an [`Interpreter`](
https://www.tensorflow.org/lite/api_docs/cc/class/tflite/interpreter.html).
A single `FlatBufferModel` can be used
simultaneously by more than one `Interpreter`.

#### Java
Caution: The `FlatBufferModel` object must remain valid until
all instances of `Interpreter` using it have been destroyed.

TensorFlow Lite's Java API supports on-device inference and is provided as an
Android Studio Library that allows loading models, feeding inputs, and
retrieving inference outputs.

The `Interpreter` class drives model inference with TensorFlow Lite. In
most of the cases, this is the only class an app developer will need.

The `Interpreter` can be initialized with a model file using the constructor:

```java
public Interpreter(@NotNull File modelFile);
```

or with a `MappedByteBuffer`:

```java
public Interpreter(@NotNull MappedByteBuffer mappedByteBuffer);
```

In both cases a valid TensorFlow Lite model must be provided or an
`IllegalArgumentException` will be thrown. If a `MappedByteBuffer` is used to
initialize an Interpreter, it should remain unchanged for the whole lifetime of
the `Interpreter`.

### Running a Model {#running_a_model}

#### C++
Running a model involves a few simple steps:

* Build an `Interpreter` based on an existing `FlatBufferModel`
* Optionally resize input tensors if the predefined sizes are not desired.
* Set input tensor values
* Invoke inference
* Read output tensor values

The important parts of the public interface of the `Interpreter` are provided
below. It should be noted that:
The important parts of the `Interpreter` API are shown in the
code snippet below. It should be noted that:

* Tensors are represented by integers, in order to avoid string comparisons
  (and any fixed dependency on string libraries).
* An interpreter must not be accessed from concurrent threads.
* Memory allocation for input and output tensors must be triggered
  by calling AllocateTensors() right after resizing tensors.
  by calling `AllocateTensors()` right after resizing tensors.

In order to run the inference model in TensorFlow Lite, one has to load the
model into a `FlatBufferModel` object which then can be executed by an
`Interpreter`. The `FlatBufferModel` needs to remain valid for the whole
lifetime of the `Interpreter`, and a single `FlatBufferModel` can be
simultaneously used by more than one `Interpreter`. In concrete terms, the
`FlatBufferModel` object must be created before any `Interpreter` objects that
use it, and must be kept around until they have all been destroyed.

The simplest usage of TensorFlow Lite will look like this:
The simplest usage of TensorFlow Lite with C++ looks like this:

```c++
tflite::FlatBufferModel model(path_to_model);
// Load the model
std::unique_ptr<tflite::FlatBufferModel> model =
    tflite::FlatBufferModel::BuildFromFile(filename);

// Build the interpreter
tflite::ops::builtin::BuiltinOpResolver resolver;
std::unique_ptr<tflite::Interpreter> interpreter;
tflite::InterpreterBuilder(*model, resolver)(&interpreter);
@@ -185,9 +162,40 @@ interpreter->Invoke();
float* output = interpreter->typed_output_tensor<float>(0);
```

#### Java
For more example code, see [`minimal.cc`](
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/examples/minimal/minimal.cc)
and [`label_image.cc`](
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/examples/label_image/label_image.cc).

The simplest usage of the TensorFlow Lite Java API looks like this:

## Load and run a model in Java

The Java API for running an inference with TensorFlow Lite is primarily designed
for use with Android, so it's available as an Android library dependency:
`org.tensorflow:tensorflow-lite`.

In Java, you'll use the `Interpreter` class to load a model and drive model
inference. In many cases, this may be the only API you need.

You can initialize an `Interpreter` using a `.tflite` file:

```java
public Interpreter(@NotNull File modelFile);
```

Or with a `MappedByteBuffer`:

```java
public Interpreter(@NotNull MappedByteBuffer mappedByteBuffer);
```

In both cases, you must provide a valid TensorFlow Lite model or the API throws
`IllegalArgumentException`. If you use `MappedByteBuffer` to
initialize an `Interpreter`, it must remain unchanged for the whole lifetime
of the `Interpreter`.

To then run an inference with the model, simply call `Interpreter.run()`.
For example:

```java
try (Interpreter interpreter = new Interpreter(file_of_a_tensorflowlite_model)) {
@@ -195,48 +203,44 @@ try (Interpreter interpreter = new Interpreter(file_of_a_tensorflowlite_model))
}
```

If a model takes only one input and returns only one output, the following will
trigger an inference run:

```java
interpreter.run(input, output);
```

For models with multiple inputs, or multiple outputs, use:
The `run()` method takes only one input and returns only one output. So if your
model has multiple inputs or multiple outputs, instead use:

```java
interpreter.runForMultipleInputsOutputs(inputs, map_of_indices_to_outputs);
```

where each entry in `inputs` corresponds to an input tensor and
In this case, each entry in `inputs` corresponds to an input tensor and
`map_of_indices_to_outputs` maps indices of output tensors to the corresponding
output data. In both cases the tensor indices should correspond to the values
given to the
[TensorFlow Lite Optimized Converter](../convert/cmdline_examples.md) when the
model was created. Be aware that the order of tensors in `input` must match the
order given to the `TensorFlow Lite Optimized Converter`.
output data.

The Java API also provides convenient functions for app developers to get the
index of any model input or output using a tensor name:
In both cases, the tensor indices should correspond to the values you gave to
the [TensorFlow Lite Converter](../convert/) when you created the model.
Be aware that the order of tensors in `input` must match the
order given to the TensorFlow Lite Converter.

The `Interpreter` class also provides convenient functions for you to get the
index of any model input or output using an operation name:

```java
public int getInputIndex(String tensorName);
public int getOutputIndex(String tensorName);
public int getInputIndex(String opName);
public int getOutputIndex(String opName);
```

If `tensorName` is not a valid name in the model, an `IllegalArgumentException`
will be thrown.
If `opName` is not a valid operation in the model, it throws an
`IllegalArgumentException`.

##### Releasing Resources After Use

An `Interpreter` owns resources. To avoid a memory leak, the resources must be
released after use by:
Also beware that `Interpreter` owns resources. To avoid a memory leak, the
resources must be released after use by:

```java
interpreter.close();
```

##### Supported Data Types
For an example project with Java, see the [Android image classification sample](
https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android).

### Supported data types (in Java)

To use TensorFlow Lite, the data types of the input and output tensors must be
one of the following primitive types:
@@ -256,7 +260,7 @@ provided as a single, flat `ByteBuffer` argument.
If other data types, including boxed types like `Integer` and `Float`, are used,
an `IllegalArgumentException` will be thrown.

##### Inputs
#### Inputs

Each input should be an array or multi-dimensional array of the supported
primitive types, or a raw `ByteBuffer` of the appropriate size. If the input is
@@ -265,12 +269,12 @@ implicitly resized to the array's dimensions at inference time. If the input is
a ByteBuffer, the caller should first manually resize the associated input
tensor (via `Interpreter.resizeInput()`) before running inference.

When using 'ByteBuffer', prefer using direct byte buffers, as this allows the
When using `ByteBuffer`, prefer using direct byte buffers, as this allows the
`Interpreter` to avoid unnecessary copies. If the `ByteBuffer` is a direct byte
buffer, its order must be `ByteOrder.nativeOrder()`. After it is used for a
model inference, it must remain unchanged until the model inference is finished.

##### Outputs
#### Outputs

Each output should be an array or multi-dimensional array of the supported
primitive types, or a ByteBuffer of the appropriate size. Note that some models
@@ -279,7 +283,75 @@ the input. There's no straightforward way of handling this with the existing
Java inference API, but planned extensions will make this possible.

## Writing Custom Operators
## Load and run a model in Python

The Python API for running an inference is provided in the `tf.lite`
module. From it, you mostly need only [`tf.lite.Interpreter`](
https://www.tensorflow.org/api_docs/python/tf/lite/Interpreter) to load
a model and run an inference.

The following example shows how to use the Python interpreter to load a
`.tflite` file and run inference with random input data:

```python
import numpy as np
import tensorflow as tf

# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
interpreter.allocate_tensors()

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Test model on random input data.
input_shape = input_details[0]['shape']
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)

interpreter.invoke()

# The function `get_tensor()` returns a copy of the tensor data.
# Use `tensor()` in order to get a pointer to the tensor.
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)
```

As an alternative to loading the model as a pre-converted `.tflite` file, you
can combine your code with the [TensorFlow Lite Converter Python API](
../convert/python_api.md) (`tf.lite.TFLiteConverter`), allowing you to convert
your TensorFlow model into the TensorFlow Lite format and then run an inference:

```python
import numpy as np
import tensorflow as tf

img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3))
const = tf.constant([1., 2., 3.]) + tf.constant([1., 4., 4.])
val = img + const
out = tf.identity(val, name="out")

# Convert to TF Lite format
with tf.Session() as sess:
  converter = tf.lite.TFLiteConverter.from_session(sess, [img], [out])
  tflite_model = converter.convert()

# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()

# Continue to get tensors and so forth, as shown above...
```

For more Python sample code, see [`label_image.py`](
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/examples/python/label_image.py).

Tip: Run `help(tf.lite.Interpreter)` in the Python terminal to get detailed
documentation about the interpreter.

## Write a custom operator

All TensorFlow Lite operators (both custom and builtin) are defined using a
simple pure-C interface that consists of four functions:
@@ -343,7 +415,7 @@ Note that registration is not automatic and an explicit call to
registration of builtins, custom ops will have to be collected in separate
custom libraries.

### Customizing the kernel library
### Customize the kernel library

Behind the scenes the interpreter will load a library of kernels which will be
assigned to execute each of the operators in the model. While the default
@@ -362,21 +434,19 @@ class OpResolver {
};
```

Regular usage will require the developer to use the `BuiltinOpResolver` and
write:
Regular usage requires that you use the `BuiltinOpResolver` and write:

```c++
tflite::ops::builtin::BuiltinOpResolver resolver;
```

They can then optionally register custom ops:
You can optionally register custom ops (before you pass the resolver to the
`InterpreterBuilder`):

```c++
resolver.AddOp("MY_CUSTOM_OP", Register_MY_CUSTOM_OP());
```

before the resolver is passed to the `InterpreterBuilder`.

If the set of builtin ops is deemed to be too large, a new `OpResolver` could
be code-generated based on a given subset of ops, possibly only the ones
contained in a given model. This is the equivalent of TensorFlow's selective