Move docs for Python inference into guide/inference.md, and restructure that page to organize the load/run steps based on language.

PiperOrigin-RevId: 259778674
This commit is contained in:
A. Unique TensorFlower 2019-07-24 11:16:56 -07:00 committed by TensorFlower Gardener
parent 3198b9be2e
commit b3aafbda35
3 changed files with 253 additions and 233 deletions

View File

@ -1,9 +1,12 @@
# Converter Python API guide
This page provides examples on how to use the TensorFlow Lite Converter and the
TensorFlow Lite interpreter using the Python API.
This page describes how to convert TensorFlow models into the TensorFlow Lite
format using the TensorFlow Lite Converter Python API.
Note: These docs describe the converter in the TensorFlow nightly release,
If you're looking for information about how to run a TensorFlow Lite model,
see [TensorFlow Lite inference](../guide/inference.md).
Note: This page describes the converter in the TensorFlow nightly release,
installed using `pip install tf-nightly`. For docs describing older versions,
see ["Converting models from TensorFlow 1.12"](#pre_tensorflow_1.12).
@ -20,13 +23,12 @@ be targeted to devices with mobile.
## API
The API for converting TensorFlow models to TensorFlow Lite is
`tf.lite.TFLiteConverter`. The API for calling the Python interpreter is
`tf.lite.Interpreter`.
`tf.lite.TFLiteConverter`, which provides class methods based on the original
format of the model. For example, `TFLiteConverter.from_session()` is available
for GraphDefs, `TFLiteConverter.from_saved_model()` is available for
SavedModels, and `TFLiteConverter.from_keras_model_file()` is available for
`tf.keras` files.
`TFLiteConverter` provides class methods based on the original format of the
model. `TFLiteConverter.from_session()` is available for GraphDefs.
`TFLiteConverter.from_saved_model()` is available for SavedModels.
`TFLiteConverter.from_keras_model_file()` is available for `tf.keras` files.
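For a quick preview, a minimal sketch of converting a SavedModel might look like
the following (the `saved_model_dir` path and the output filename are
placeholders):

```python
import tensorflow as tf

# Path to an existing SavedModel directory (placeholder).
saved_model_dir = "/tmp/saved_model"

# Build a converter from the SavedModel and convert it.
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
tflite_model = converter.convert()

# Write the resulting TensorFlow Lite FlatBuffer to disk.
with open("converted_model.tflite", "wb") as f:
  f.write(tflite_model)
```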
Example usages for simple floating point models are shown in
[Basic Examples](#basic). Example usages for more complex models are shown in
[Complex Examples](#complex).
@ -177,65 +179,6 @@ with tf.Session() as sess:
open("converted_model.tflite", "wb").write(tflite_model)
```
## TensorFlow Lite Python interpreter <a name="interpreter"></a>
### Using the interpreter from a model file <a name="interpreter_file"></a>
The following example shows how to use the TensorFlow Lite Python interpreter
when provided a TensorFlow Lite FlatBuffer file. The example also demonstrates
how to run inference on random input data. Run
`help(tf.lite.Interpreter)` in the Python terminal to get detailed
documentation on the interpreter.
```python
import numpy as np
import tensorflow as tf
# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
interpreter.allocate_tensors()
# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
# Test model on random input data.
input_shape = input_details[0]['shape']
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
# The function `get_tensor()` returns a copy of the tensor data.
# Use `tensor()` in order to get a pointer to the tensor.
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)
```
### Using the interpreter from model data <a name="interpreter_data"></a>
The following example shows how to use the TensorFlow Lite Python interpreter
when starting with a TensorFlow Lite FlatBuffer model already in memory. This
example shows an end-to-end use case, starting from building the TensorFlow
model.
```python
import numpy as np
import tensorflow as tf
img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3))
const = tf.constant([1., 2., 3.]) + tf.constant([1., 4., 4.])
val = img + const
out = tf.identity(val, name="out")
with tf.Session() as sess:
converter = tf.lite.TFLiteConverter.from_session(sess, [img], [out])
tflite_model = converter.convert()
# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
```
## Additional instructions

View File

@ -4,22 +4,27 @@ TensorFlow Lite provides all the tools you need to convert and run TensorFlow
models on mobile, embedded, and IoT devices. The following guide walks through
each step of the developer workflow and provides links to further instructions.
[TOC]
## 1. Choose a model
<a id="1_choose_a_model"></a>
TensorFlow Lite allows you to run TensorFlow models on a wide range of devices.
A TensorFlow model is a data structure that contains the logic and knowledge of
a machine learning network trained to solve a particular problem.
There are many ways to obtain a TensorFlow model, from using pre-trained models
to training your own. To use a model with TensorFlow Lite it must be converted
into a special format. This is explained in section 2,
[Convert the model](#2_convert_the_model_format).
to training your own.
To use a model with TensorFlow Lite, you must convert a
full TensorFlow model into the TensorFlow Lite format—you
cannot create or train a model using TensorFlow Lite. So you must start with a
regular TensorFlow model, and then
[convert the model](#2_convert_the_model_format).
Note: TensorFlow Lite supports a limited subset of TensorFlow operations, so not
all models can be converted. For details, read about the
[TensorFlow Lite operator compatibility](ops_compatibility.md).
Note: Not all TensorFlow models will work with TensorFlow Lite, since the
interpreter supports a limited subset of TensorFlow operations. See section 2,
[Convert the model](#2_convert_the_model_format) to learn about compatibility.
### Use a pre-trained model
@ -60,35 +65,37 @@ flowers with TensorFlow</a> codelab.
### Train a custom model
If you have designed and trained your own TensorFlow model, or you have trained
a model obtained from another source, you should convert it to the TensorFlow
Lite format before use.
a model obtained from another source, you must
[convert it to the TensorFlow Lite format](#2_convert_the_model_format).
## 2. Convert the model
<a id="2_convert_the_model_format"></a>
TensorFlow Lite is designed to execute models efficiently on devices. Some of
TensorFlow Lite is designed to execute models efficiently on mobile and other
embedded devices with limited compute and memory resources. Some of
this efficiency comes from the use of a special format for storing models.
TensorFlow models must be converted into this format before they can be used by
TensorFlow Lite.
Converting models reduces their file size and introduces optimizations that do
not affect accuracy. Developers can opt to further reduce file size and increase
speed of execution in exchange for some trade-offs. You can use the TensorFlow
Lite converter to choose which optimizations to apply.
not affect accuracy. The TensorFlow Lite converter provides options
that allow you to further reduce file size and increase speed of execution, with
some trade-offs.
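As a sketch of what one of these options looks like (assuming the
`tf.lite.Optimize` API and a placeholder SavedModel path), you can request the
converter's default optimizations before converting:

```python
import tensorflow as tf

# Placeholder path to an existing SavedModel.
converter = tf.lite.TFLiteConverter.from_saved_model("/tmp/saved_model")

# Ask the converter to apply its default size/latency optimizations.
converter.optimizations = [tf.lite.Optimize.DEFAULT]

tflite_model = converter.convert()
```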
Note: TensorFlow Lite supports a limited subset of TensorFlow operations, so not
all models can be converted. For details, read about the
[TensorFlow Lite operator compatibility](ops_compatibility.md).
TensorFlow Lite supports a limited subset of TensorFlow operations, so not all
models can be converted. See [Ops compatibility](#ops-compatibility) for more
information.
### TensorFlow Lite converter
The [TensorFlow Lite converter](../convert) is a tool that converts trained
TensorFlow models into the TensorFlow Lite format. It can also introduce
optimizations, which are covered in section 4,
The [TensorFlow Lite converter](../convert) is a tool available as a Python API
that converts trained TensorFlow models into the TensorFlow Lite format. It can
also introduce optimizations, which are covered in section 4,
[Optimize your model](#4_optimize_your_model_optional).
The converter is available as a Python API. The following example shows a
The following example shows a
TensorFlow `SavedModel` being converted into the TensorFlow Lite format:
```python
@ -128,9 +135,9 @@ performance or reduce file size. This is covered in section 4,
### Ops compatibility
TensorFlow Lite currently supports a [limited subset](ops_compatibility.md) of
TensorFlow operations. The long term goal is for all TensorFlow operations to be
supported.
TensorFlow Lite currently supports a [limited subset of TensorFlow
operations](ops_compatibility.md). The long-term goal is for all TensorFlow
operations to be supported.
If the model you wish to convert contains unsupported operations, you can use
[TensorFlow Select](ops_select.md) to include operations from TensorFlow. This

View File

@ -1,91 +1,104 @@
# TensorFlow Lite inference
The term *inference* refers to the process of executing a TensorFlow Lite model
on-device in order to make predictions based on input data. Inference is the
final step in using the model on-device.
on-device in order to make predictions based on input data. To perform an
inference with a TensorFlow Lite model, you must run it through an
*interpreter*. The TensorFlow Lite interpreter is designed to be lean and fast.
The interpreter uses a static graph ordering and a custom (less-dynamic) memory
allocator to ensure minimal load, initialization, and execution latency.
Inference for TensorFlow Lite models is run through an interpreter. The
TensorFlow Lite interpreter is designed to be lean and fast. The interpreter
uses a static graph ordering and a custom (less-dynamic) memory allocator to
ensure minimal load, initialization, and execution latency.
This page describes how to access the TensorFlow Lite interpreter and
perform an inference using C++, Java, and Python, plus links to other resources
for each [supported platform](#supported-platforms).
This document outlines the various APIs for the interpreter, along with the
[supported platforms](#supported-platforms).
[TOC]
### Important Concepts
## Important concepts
TensorFlow Lite on-device inference typically involves the following steps.
TensorFlow Lite inference typically involves the following steps (a brief
end-to-end Python sketch appears after this list):
1. **Loading a Model**
1. **Loading a model**
The user loads the `.tflite` model into memory which contains the model's
You must load the `.tflite` model into memory, which contains the model's
execution graph.
1. **Transforming Data**
Input data acquired by the user generally may not match the input data format
expected by the model. For example, a user may need to resize an image or change
the image format to be used by the model.
1. **Transforming data**
1. **Running Inference**
Raw input data for the model generally does not match the input data format
expected by the model. For example, you might need to resize an image or
change the image format to be compatible with the model.
This step involves using the API to execute the model. It involves a few
steps such as building the interpreter, and allocating tensors as explained
in detail in [Running a Model](#running_a_model).
1. **Running inference**
1. **Interpreting Output**
This step involves using the TensorFlow Lite API to execute the model. It
involves a few steps such as building the interpreter, and allocating
tensors, as described in the following sections.
The user retrieves results from model inference and interprets the tensors in
a meaningful way to be used in the application.
1. **Interpreting output**
For example, a model may only return a list of probabilities. It is up to the
application developer to meaningfully map them to relevant categories and
present them to their user.
When you receive results from the model inference, you must interpret the
tensors in a meaningful way that's useful in your application.
### Supported Platforms
For example, a model might return only a list of probabilities. It's up to
you to map the probabilities to relevant categories and present them to your
end-user.
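As a rough preview of these four steps in Python (the model path here is a
placeholder, and the per-language sections below show the same flow in full):

```python
import numpy as np
import tensorflow as tf

# 1. Load the model into memory.
interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
interpreter.allocate_tensors()

# 2. Transform your input to match the shape and type the model expects.
input_details = interpreter.get_input_details()
input_data = np.zeros(input_details[0]['shape'], dtype=np.float32)

# 3. Run inference.
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()

# 4. Interpret the output (for example, pick the most likely category).
output_details = interpreter.get_output_details()
output_data = interpreter.get_tensor(output_details[0]['index'])
print(np.argmax(output_data))
```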
## Supported platforms
TensorFlow Lite inference APIs are provided for most common mobile/embedded platforms
such as Android, iOS and Linux.
such as Android, iOS and Linux, in multiple programming languages.
#### Android
In most cases, the API design reflects a preference for performance over ease of
use. TensorFlow Lite is designed for fast inference on small devices, so it
should be no surprise that the APIs try to avoid unnecessary copies at the
expense of convenience. Similarly, consistency with TensorFlow APIs was not an
explicit goal and some variance between languages is to be expected.
Across all libraries, the TensorFlow Lite API enables you to load models,
feed inputs, and retrieve inference outputs.
### Android
On Android, TensorFlow Lite inference can be performed using either Java or C++
APIs. The Java APIs provide convenience and can be used directly within your
Android Activity classes. The C++ APIs offer more flexibility and speed, but may
require writing JNI wrappers to move data between Java and C++ layers.
Visit the [Android quickstart](android.md) for a tutorial and example code.
See below for details about using C++ and Java, or
follow the [Android quickstart](android.md) for a tutorial and example code.
#### iOS
### iOS
TensorFlow Lite provides native iOS libraries written in
On iOS, TensorFlow Lite is available with native iOS libraries written in
[Swift](https://www.tensorflow.org/code/tensorflow/lite/experimental/swift)
and
[Objective-C](https://www.tensorflow.org/code/tensorflow/lite/experimental/objc).
Visit the [iOS quickstart](ios.md) for a tutorial and example code.
This page doesn't include a discussion of these languages, so you should
refer to the [iOS quickstart](ios.md) for a tutorial and example code.
#### Linux
On Linux platforms such as [Raspberry Pi](build_rpi.md), TensorFlow Lite C++
and Python APIs can be used to run inference.
### Linux
On Linux platforms (including [Raspberry Pi](build_rpi.md)), you can run
inferences using TensorFlow Lite APIs available in C++ and Python, as shown
in the following sections.
## API Guides
## Load and run a model in C++
TensorFlow Lite provides programming APIs in C++, Java and Python, with
experimental bindings for several other languages (C, Swift, Objective-C). In
most cases, the API design reflects a preference for performance over ease of
use. TensorFlow Lite is designed for fast inference on small devices so it
should be no surprise that the APIs try to avoid unnecessary copies at the
expense of convenience. Similarly, consistency with TensorFlow APIs was not an
explicit goal and some variance is to be expected.
Running a TensorFlow Lite model with C++ involves a few simple steps:
There is also a [Python API for TensorFlow Lite](../convert/python_api.md).
1. Load the model into memory as a `FlatBufferModel`.
2. Build an `Interpreter` based on an existing `FlatBufferModel`.
3. Set input tensor values. (Optionally resize input tensors if the
predefined sizes are not desired.)
4. Invoke inference.
5. Read output tensor values.
### Loading a Model
#### C++
The `FlatBufferModel` class encapsulates a model and can be built in a couple of
slightly different ways depending on where the model is stored:
The [`FlatBufferModel`](
https://www.tensorflow.org/lite/api_docs/cc/class/tflite/flat-buffer-model.html)
class encapsulates a TensorFlow Lite model and you can
build it in a couple of different ways, depending on where the model is stored:
```c++
class FlatBufferModel {
@ -104,72 +117,36 @@ class FlatBufferModel {
};
```
```c++
tflite::FlatBufferModel model(path_to_model);
```
Note: If TensorFlow Lite detects the presence of the [Android NNAPI](
https://developer.android.com/ndk/guides/neuralnetworks), it will
automatically try to use shared memory to store the `FlatBufferModel`.
Note that if TensorFlow Lite detects the presence of Android's NNAPI it will
automatically try to use shared memory to store the FlatBufferModel.
Now that you have the model as a `FlatBufferModel` object, you can execute it
with an [`Interpreter`](
https://www.tensorflow.org/lite/api_docs/cc/class/tflite/interpreter.html).
A single `FlatBufferModel` can be used
simultaneously by more than one `Interpreter`.
#### Java
Caution: The `FlatBufferModel` object must remain valid until
all instances of `Interpreter` using it have been destroyed.
TensorFlow Lite's Java API supports on-device inference and is provided as an
Android Studio Library that allows loading models, feeding inputs, and
retrieving inference outputs.
The `Interpreter` class drives model inference with TensorFlow Lite. In
most cases, this is the only class an app developer will need.
The `Interpreter` can be initialized with a model file using the constructor:
```java
public Interpreter(@NotNull File modelFile);
```
or with a `MappedByteBuffer`:
```java
public Interpreter(@NotNull MappedByteBuffer mappedByteBuffer);
```
In both cases a valid TensorFlow Lite model must be provided or an
`IllegalArgumentException` will be thrown. If a `MappedByteBuffer` is used to
initialize an Interpreter, it should remain unchanged for the whole lifetime of
the `Interpreter`.
### Running a Model {#running_a_model}
#### C++
Running a model involves a few simple steps:
* Build an `Interpreter` based on an existing `FlatBufferModel`
* Optionally resize input tensors if the predefined sizes are not desired.
* Set input tensor values
* Invoke inference
* Read output tensor values
The important parts of the public interface of the `Interpreter` are provided
below. It should be noted that:
The important parts of the `Interpreter` API are shown in the
code snippet below. It should be noted that:
* Tensors are represented by integers, in order to avoid string comparisons
(and any fixed dependency on string libraries).
* An interpreter must not be accessed from concurrent threads.
* Memory allocation for input and output tensors must be triggered
by calling AllocateTensors() right after resizing tensors.
by calling `AllocateTensors()` right after resizing tensors.
In order to run the inference model in TensorFlow Lite, one has to load the
model into a `FlatBufferModel` object which then can be executed by an
`Interpreter`. The `FlatBufferModel` needs to remain valid for the whole
lifetime of the `Interpreter`, and a single `FlatBufferModel` can be
simultaneously used by more than one `Interpreter`. In concrete terms, the
`FlatBufferModel` object must be created before any `Interpreter` objects that
use it, and must be kept around until they have all been destroyed.
The simplest usage of TensorFlow Lite will look like this:
The simplest usage of TensorFlow Lite with C++ looks like this:
```c++
tflite::FlatBufferModel model(path_to_model);
// Load the model
std::unique_ptr<tflite::FlatBufferModel> model =
tflite::FlatBufferModel::BuildFromFile(filename);
// Build the interpreter
tflite::ops::builtin::BuiltinOpResolver resolver;
std::unique_ptr<tflite::Interpreter> interpreter;
tflite::InterpreterBuilder(*model, resolver)(&interpreter);
@ -185,9 +162,40 @@ interpreter->Invoke();
float* output = interpreter->typed_output_tensor<float>(0);
```
#### Java
For more example code, see [`minimal.cc`](
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/examples/minimal/minimal.cc)
and [`label_image.cc`](
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/examples/label_image/label_image.cc).
The simplest usage of the TensorFlow Lite Java API looks like this:
## Load and run a model in Java
The Java API for running an inference with TensorFlow Lite is primarily designed
for use with Android, so it's available as an Android library dependency:
`org.tensorflow:tensorflow-lite`.
In Java, you'll use the `Interpreter` class to load a model and drive model
inference. In many cases, this may be the only API you need.
You can initialize an `Interpreter` using a `.tflite` file:
```java
public Interpreter(@NotNull File modelFile);
```
Or with a `MappedByteBuffer`:
```java
public Interpreter(@NotNull MappedByteBuffer mappedByteBuffer);
```
In both cases, you must provide a valid TensorFlow Lite model or the API throws
`IllegalArgumentException`. If you use `MappedByteBuffer` to
initialize an `Interpreter`, it must remain unchanged for the whole lifetime
of the `Interpreter`.
To then run an inference with the model, simply call `Interpreter.run()`.
For example:
```java
try (Interpreter interpreter = new Interpreter(file_of_a_tensorflowlite_model)) {
@ -195,48 +203,44 @@ try (Interpreter interpreter = new Interpreter(file_of_a_tensorflowlite_model))
}
```
If a model takes only one input and returns only one output, the following will
trigger an inference run:
```java
interpreter.run(input, output);
```
For models with multiple inputs, or multiple outputs, use:
The `run()` method takes only one input and returns only one output. So if your
model has multiple inputs or multiple outputs, instead use:
```java
interpreter.runForMultipleInputsOutputs(inputs, map_of_indices_to_outputs);
```
where each entry in `inputs` corresponds to an input tensor and
In this case, each entry in `inputs` corresponds to an input tensor and
`map_of_indices_to_outputs` maps indices of output tensors to the corresponding
output data. In both cases the tensor indices should correspond to the values
given to the
[TensorFlow Lite Optimized Converter](../convert/cmdline_examples.md) when the
model was created. Be aware that the order of tensors in `input` must match the
order given to the `TensorFlow Lite Optimized Converter`.
output data.
The Java API also provides convenient functions for app developers to get the
index of any model input or output using a tensor name:
In both cases, the tensor indices should correspond to the values you gave to
the [TensorFlow Lite Converter](../convert/) when you created the model.
Be aware that the order of tensors in `inputs` must match the
order given to the TensorFlow Lite Converter.
The `Interpreter` class also provides convenient functions for you to get the
index of any model input or output using an operation name:
```java
public int getInputIndex(String tensorName);
public int getOutputIndex(String tensorName);
public int getInputIndex(String opName);
public int getOutputIndex(String opName);
```
If `tensorName` is not a valid name in the model, an `IllegalArgumentException`
will be thrown.
If `opName` is not a valid operation in the model, it throws an
`IllegalArgumentException`.
##### Releasing Resources After Use
An `Interpreter` owns resources. To avoid a memory leak, the resources must be
released after use by:
Also beware that an `Interpreter` owns resources. To avoid a memory leak, the
resources must be released after use by:
```java
interpreter.close();
```
##### Supported Data Types
For an example project with Java, see the [Android image classification sample](
https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android).
### Supported data types (in Java)
To use TensorFlow Lite, the data types of the input and output tensors must be
one of the following primitive types:
@ -256,7 +260,7 @@ provided as a single, flat `ByteBuffer` argument.
If other data types, including boxed types like `Integer` and `Float`, are used,
an `IllegalArgumentException` will be thrown.
##### Inputs
#### Inputs
Each input should be an array or multi-dimensional array of the supported
primitive types, or a raw `ByteBuffer` of the appropriate size. If the input is
@ -265,12 +269,12 @@ implicitly resized to the array's dimensions at inference time. If the input is
a ByteBuffer, the caller should first manually resize the associated input
tensor (via `Interpreter.resizeInput()`) before running inference.
When using 'ByteBuffer', prefer using direct byte buffers, as this allows the
When using `ByteBuffer`, prefer using direct byte buffers, as this allows the
`Interpreter` to avoid unnecessary copies. If the `ByteBuffer` is a direct byte
buffer, its order must be `ByteOrder.nativeOrder()`. After it is used for a
model inference, it must remain unchanged until the model inference is finished.
##### Outputs
#### Outputs
Each output should be an array or multi-dimensional array of the supported
primitive types, or a ByteBuffer of the appropriate size. Note that some models
@ -279,7 +283,75 @@ the input. There's no straightforward way of handling this with the existing
Java inference API, but planned extensions will make this possible.
## Writing Custom Operators
## Load and run a model in Python
The Python API for running an inference is provided in the `tf.lite`
module. From that module, you mostly need only [`tf.lite.Interpreter`](
https://www.tensorflow.org/api_docs/python/tf/lite/Interpreter) to load
a model and run an inference.
The following example shows how to use the Python interpreter to load a
`.tflite` file and run inference with random input data:
```python
import numpy as np
import tensorflow as tf
# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
interpreter.allocate_tensors()
# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
# Test model on random input data.
input_shape = input_details[0]['shape']
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
# The function `get_tensor()` returns a copy of the tensor data.
# Use `tensor()` in order to get a pointer to the tensor.
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)
```
As an alternative to loading the model as a pre-converted `.tflite` file, you
can combine your code with the [TensorFlow Lite Converter Python API](
../convert/python_api.md) (`tf.lite.TFLiteConverter`), allowing you to convert
your TensorFlow model into the TensorFlow Lite format and then run an inference:
```python
import numpy as np
import tensorflow as tf
img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3))
const = tf.constant([1., 2., 3.]) + tf.constant([1., 4., 4.])
val = img + const
out = tf.identity(val, name="out")
# Convert to TF Lite format
with tf.Session() as sess:
converter = tf.lite.TFLiteConverter.from_session(sess, [img], [out])
tflite_model = converter.convert()
# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
# Continue to get tensors and so forth, as shown above...
```
For more Python sample code, see [`label_image.py`](
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/examples/python/label_image.py).
Tip: Run `help(tf.lite.Interpreter)` in the Python terminal to get detailed
documentation about the interpreter.
## Write a custom operator
All TensorFlow Lite operators (both custom and builtin) are defined using a
simple pure-C interface that consists of four functions:
@ -343,7 +415,7 @@ Note that registration is not automatic and an explicit call to
registration of builtins, custom ops will have to be collected in separate
custom libraries.
### Customizing the kernel library
### Customize the kernel library
Behind the scenes the interpreter will load a library of kernels which will be
assigned to execute each of the operators in the model. While the default
@ -362,21 +434,19 @@ class OpResolver {
};
```
Regular usage will require the developer to use the `BuiltinOpResolver` and
write:
Regular usage requires that you use the `BuiltinOpResolver` and write:
```c++
tflite::ops::builtin::BuiltinOpResolver resolver;
```
They can then optionally register custom ops:
You can optionally register custom ops (before you pass the resolver to the
`InterpreterBuilder`):
```c++
resolver.AddOp("MY_CUSTOM_OP", Register_MY_CUSTOM_OP());
```
before the resolver is passed to the `InterpreterBuilder`.
If the set of builtin ops is deemed to be too large, a new `OpResolver` could
be code-generated based on a given subset of ops, possibly only the ones
contained in a given model. This is the equivalent of TensorFlow's selective