Document TFLite delegate parameters supported by the TFLite delegate registrar in lite/tools/delegates, and also change README in each tool (evaluation tools and the benchmark tool) accordingly to mention these supported parameters.

PiperOrigin-RevId: 307791203
Change-Id: If23c11cb9c80e0037c38f070e0ad23fe6cff690e
This commit is contained in:
Chao Mei 2020-04-22 04:39:55 -07:00 committed by TensorFlower Gardener
parent 1912ef16d6
commit 83b579780d
5 changed files with 163 additions and 78 deletions


@@ -34,32 +34,6 @@ and the following optional parameters:
* `run_delay`: `float` (default=-1.0) \
The delay in seconds between subsequent benchmark runs. Non-positive values
mean no delay is used.
* `use_xnnpack`: `bool` (default=false) \
Whether to use the XNNPACK delegate.
* `use_hexagon`: `bool` (default=false) \
Whether to use the Hexagon delegate. Not all devices may support the Hexagon
delegate; refer to the TensorFlow Lite documentation for more information
about which devices/chipsets are supported and about how to get the required
libraries. To use the Hexagon delegate, also build the
hexagon_nn:libhexagon_interface.so target and copy the library to the
device. All libraries should be copied to /data/local/tmp on the device.
* `use_nnapi`: `bool` (default=false) \
Whether to use
[Android NNAPI](https://developer.android.com/ndk/guides/neuralnetworks/).
This API is available on recent Android devices. Note that some Android P
devices will fail to use NNAPI for models in `/data/local/tmp/` and this
benchmark tool will not correctly use NNAPI. On Android Q+, the tool will
also print the names of NNAPI accelerators accessible through the
`nnapi_accelerator_name` flag.
* `nnapi_accelerator_name`: `str` (default="") \
The name of the NNAPI accelerator to use (requires Android Q+). If left
blank, NNAPI will automatically select which of the available accelerators
to use.
* `nnapi_execution_preference`: `string` (default="") \
Which
[NNAPI execution preference](https://developer.android.com/ndk/reference/group/neural-networks.html#group___neural_networks_1gga034380829226e2d980b2a7e63c992f18af727c25f1e2d8dcc693c477aef4ea5f5)
to use when executing using NNAPI. Should be one of the following:
fast_single_answer, sustained_speed, low_power, undefined.
* `use_legacy_nnapi`: `bool` (default=false) \
Whether to use the legacy
[Android NNAPI](https://developer.android.com/ndk/guides/neuralnetworks/)
@@ -67,39 +41,6 @@ and the following optional parameters:
This is available on recent Android devices. Note that some Android P
devices will fail to use NNAPI for models in `/data/local/tmp/` and this
benchmark tool will not correctly use NNAPI.
* `max_delegated_partitions`: `int` (default=0, i.e. no limit) \
The maximum number of partitions that will be delegated. \
Currently supported by the Hexagon and NNAPI delegates, but won't work if
`use_legacy_nnapi` has been selected.
* `min_nodes_per_partition`: `int` (default=0, i.e. default choice implemented
by each delegate) \
The minimum number of TFLite graph nodes a partition must have in order to
be delegated. A negative value or 0 means to use the default choice of each
delegate. \
This option is currently only supported by the Hexagon delegate.
* `disable_nnapi_cpu`: `bool` (default=false) \
Excludes the
[NNAPI CPU reference implementation](https://developer.android.com/ndk/guides/neuralnetworks#device-assignment)
from the possible devices to be used by NNAPI to execute the model. This
option is ignored if `nnapi_accelerator_name` is specified.
* `use_gpu`: `bool` (default=false) \
Whether to use the
[GPU accelerator delegate](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/delegates/gpu).
This option is currently only available on Android and iOS devices.
* `gpu_precision_loss_allowed`: `bool` (default=true) \
Whether to allow the GPU delegate to carry out computation with some
precision loss (i.e. processing in FP16) or not. If allowed, performance
will increase.
* `gpu_experimental_enable_quant`: `bool` (default=true) \
Whether to allow the GPU delegate to run a quantized model or not. This
option is currently only available on Android.
* `gpu_wait_type`: `str` (default="") \
Which GPU wait_type option to use when using the GPU delegate on iOS. Should be
one of the following: passive, active, do_not_wait, aggressive. When left
blank, passive mode is used by default.
* `use_coreml`: `bool` (default=false) \
Whether to use the [Core ML delegate](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/experimental/delegates/coreml).
This option is only available on iOS.
* `enable_op_profiling`: `bool` (default=false) \
Whether to enable per-operator profiling measurement.
* `enable_platform_tracing`: `bool` (default=false) \
@@ -107,16 +48,50 @@ and the following optional parameters:
'enable_op_profiling'. Note that the platform-wide tracing might not work if the
tool runs as a commandline native binary. For example, on Android, the
ATrace-based tracing only works when the tool is launched as an APK.
* `hexagon_profiling`: `bool` (default=false) \
Whether to profile ops running on Hexagon. Needs to be combined with
`enable_op_profiling`. When this is set to true, the profile of ops on the
Hexagon DSP will be added to the profile table. Note that the reported data
on Hexagon is in cycles, not in ms as on the CPU.
* `external_delegate_path`: `string` (default="") \
Path to the external delegate library to use.
* `external_delegate_options`: `string` (default="") \
A list of options to be passed to the external delegate library. Options
should be in the format `option1:value1;option2:value2;optionN:valueN`.
### TFLite delegate parameters
The tool supports all runtime/delegate parameters introduced by
[the delegate registrar](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/delegates).
The following simply lists their names, with additional notes where
applicable; a combined usage example follows at the end of this section:
#### Common parameters
* `max_delegated_partitions`: `int` (default=0) \
Note that this parameter won't work when `use_legacy_nnapi` has been
selected.
* `min_nodes_per_partition`: `int` (default=0)
#### GPU delegate provider
* `use_gpu`: `bool` (default=false)
* `gpu_precision_loss_allowed`: `bool` (default=true)
* `gpu_experimental_enable_quant`: `bool` (default=true)
* `gpu_backend`: `string` (default="")
* `gpu_wait_type`: `string` (default="")
#### NNAPI delegate provider
* `use_nnapi`: `bool` (default=false) \
Note that some Android P devices will fail to use NNAPI for models in
`/data/local/tmp/` and this benchmark tool will not correctly use NNAPI.
* `nnapi_accelerator_name`: `string` (default="")
* `disable_nnapi_cpu`: `bool` (default=false)
#### Hexagon delegate provider
* `use_hexagon`: `bool` (default=false)
* `hexagon_profiling`: `bool` (default=false) \
Note that enabling this option will not produce profiling results unless
`enable_op_profiling` is also turned on. When both parameters are set to
true, the profile of ops on the Hexagon DSP will be added to the profile
table. Note that the reported data on Hexagon is in cycles, not in ms as on
the CPU.
#### XNNPACK delegate provider
* `use_xnnpack`: `bool` (default=false)
#### CoreML delegate provider
* `use_coreml`: `bool` (default=false)
#### External delegate provider
* `external_delegate_path`: `string` (default="")
* `external_delegate_options`: `string` (default="")
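As a combined illustration, here is a hypothetical invocation exercising a few
of the parameters above; the model path and values are placeholders, and
delegate availability depends on the device:

```
# A sketch only: benchmark a model on-device with the GPU delegate,
# capping delegation at 4 partitions and enabling per-op profiling.
adb shell /data/local/tmp/benchmark_model \
  --graph=/data/local/tmp/model.tflite \
  --use_gpu=true \
  --max_delegated_partitions=4 \
  --enable_op_profiling=true
```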
## To build/install/run


@@ -0,0 +1,104 @@
# TFLite Delegate Utilities for Tooling
## TFLite Delegate Registrar
[A TFLite delegate registrar](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/tools/delegates/delegate_provider.h)
is provided here. The registrar keeps a list of TFLite delegate providers, each
of which defines a list of parameters that can be initialized from command-line
arguments and creates a TFLite delegate instance based on those parameters. This
delegate registrar has been used in the TFLite evaluation tools and the
benchmark model tool.
A particular TFLite delegate provider can be used by linking the corresponding
library, e.g. by adding it to the `deps` of a BUILD rule, as sketched below.
Note that each delegate provider library has been configured with
`alwayslink=1` in the BUILD rule so that it will be linked to any binary that
directly or indirectly depends on it.
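As a minimal sketch of this linking pattern, a hypothetical tool could list
individual providers in its BUILD rule; the provider target names below are
assumptions to be checked against this directory's BUILD file:

```
cc_binary(
    name = "my_tflite_tool",  # hypothetical tool name
    srcs = ["my_tflite_tool.cc"],
    deps = [
        # Assumed provider targets; because each is alwayslink=1, simply
        # depending on one registers its delegate with the registrar.
        "//tensorflow/lite/tools/delegates:gpu_delegate_provider",
        "//tensorflow/lite/tools/delegates:xnnpack_delegate_provider",
    ],
)
```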
The following lists all implemented TFLite delegate providers and the
parameters each one supports for creating a particular TFLite delegate.
### Common parameters
* `num_threads`: `int` (default=1) \
The number of threads to use for running the inference on CPU.
* `max_delegated_partitions`: `int` (default=0, i.e. no limit) \
The maximum number of partitions that will be delegated. \
Currently supported by the GPU, Hexagon, CoreML and NNAPI delegates.
* `min_nodes_per_partition`: `int` (default=delegate's own choice) \
The minimum number of TFLite graph nodes a partition must have in order to
be delegated. A negative value or 0 means to use the default choice of each
delegate. \
This option is currently supported by the Hexagon and CoreML delegates (see
the example after this list).
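For example, with a tool that links these providers, such as
`benchmark_model`, the common partitioning parameters could be exercised as
follows (a sketch; the model path and values are placeholders):

```
# Cap delegation at 2 partitions and require at least 10 graph nodes per
# delegated partition; honored by delegates that support these parameters.
adb shell /data/local/tmp/benchmark_model \
  --graph=/data/local/tmp/model.tflite \
  --use_hexagon=true \
  --max_delegated_partitions=2 \
  --min_nodes_per_partition=10
```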
### GPU delegate provider
* `use_gpu`: `bool` (default=false) \
Whether to use the
[GPU accelerator delegate](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/delegates/gpu).
This option is currently only available on Android and iOS devices.
* `gpu_precision_loss_allowed`: `bool` (default=true) \
Whether to allow the GPU delegate to carry out computation with some
precision loss (i.e. processing in FP16) or not. If allowed, performance
will increase.
* `gpu_experimental_enable_quant`: `bool` (default=true) \
Whether to allow the GPU delegate to run a quantized model or not. \
This option is currently only available on Android.
* `gpu_backend`: `string` (default="") \
Force the GPU delegate to use a particular backend for execution, and fail
if unsuccessful. Should be one of: cl, gl. By default, the GPU delegate will
try OpenCL first and then OpenGL if the former fails. \
Note this option is only available on Android. A usage example follows this
list.
* `gpu_wait_type`: `string` (default="") \
Which GPU wait_type option to use when using the GPU delegate on iOS. Should be
one of the following: passive, active, do_not_wait, aggressive. When left
blank, passive mode is used by default.
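A sketch of these GPU parameters in use, with `benchmark_model` as one tool
that links this provider (the model path is a placeholder):

```
# Request the GPU delegate with the OpenCL backend on Android, allowing
# FP16 computation; the run fails if OpenCL is unavailable.
adb shell /data/local/tmp/benchmark_model \
  --graph=/data/local/tmp/model.tflite \
  --use_gpu=true \
  --gpu_backend=cl \
  --gpu_precision_loss_allowed=true
```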
### NNAPI delegate provider
* `use_nnapi`: `bool` (default=false) \
Whether to use
[Android NNAPI](https://developer.android.com/ndk/guides/neuralnetworks/).
This API is available on recent Android devices. On Android Q+, the tool
will also print the names of NNAPI accelerators accessible through the
`nnapi_accelerator_name` flag (see the example after this list).
* `nnapi_accelerator_name`: `string` (default="") \
The name of the NNAPI accelerator to use (requires Android Q+). If left
blank, NNAPI will automatically select which of the available accelerators
to use.
* `nnapi_execution_preference`: `string` (default="") \
Which
[NNAPI execution preference](https://developer.android.com/ndk/reference/group/neural-networks.html#group___neural_networks_1gga034380829226e2d980b2a7e63c992f18af727c25f1e2d8dcc693c477aef4ea5f5)
to use when executing using NNAPI. Should be one of the following:
fast_single_answer, sustained_speed, low_power, undefined.
* `disable_nnapi_cpu`: `bool` (default=false) \
Excludes the
[NNAPI CPU reference implementation](https://developer.android.com/ndk/guides/neuralnetworks#device-assignment)
from the possible devices to be used by NNAPI to execute the model. This
option is ignored if `nnapi_accelerator_name` is specified.
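A sketch of selecting a specific NNAPI accelerator; the accelerator name below
is a placeholder, and real names are printed when running with
`--use_nnapi=true` on Android Q+:

```
# Run on a named NNAPI accelerator with a sustained-speed preference.
adb shell /data/local/tmp/benchmark_model \
  --graph=/data/local/tmp/model.tflite \
  --use_nnapi=true \
  --nnapi_accelerator_name=example-accelerator \
  --nnapi_execution_preference=sustained_speed
```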
### Hexagon delegate provider
* `use_hexagon`: `bool` (default=false) \
Whether to use the Hexagon delegate. Not all devices may support the Hexagon
delegate; refer to the
[TensorFlow Lite documentation](https://www.tensorflow.org/lite/performance/hexagon_delegate)
for more information about which devices/chipsets are supported and about
how to get the required libraries. To use the Hexagon delegate, also build
the hexagon_nn:libhexagon_interface.so target and copy the library to the
device. All libraries should be copied to /data/local/tmp on the device
(see the sketch after this list).
* `hexagon_profiling`: `bool` (default=false) \
Whether to profile ops running on Hexagon.
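A sketch of the library setup described above; the exact Bazel path of the
`libhexagon_interface.so` target is an assumption, so check the linked
documentation:

```
# Build the interface library and push it to the device; the Qualcomm
# libraries, obtained separately, must also go to /data/local/tmp.
bazel build -c opt --config=android_arm64 \
  tensorflow/lite/experimental/delegates/hexagon/hexagon_nn:libhexagon_interface.so
adb push \
  bazel-bin/tensorflow/lite/experimental/delegates/hexagon/hexagon_nn/libhexagon_interface.so \
  /data/local/tmp
```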
### XNNPACK delegate provider
* `use_xnnpack`: `bool` (default=false) \
Whether to use the XNNPACK delegate.
### CoreML delegate provider
* `use_coreml`: `bool` (default=false) \
Whether to use the [Core ML delegate](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/experimental/delegates/coreml).
This option is only available on iOS.
### External delegate provider
* `external_delegate_path`: `string` (default="") \
Path to the external delegate library to use.
* `external_delegate_options`: `string` (default="") \
A list of options to be passed to the external delegate library. Options
should be in the format `option1:value1;option2:value2;optionN:valueN` (see
the example below).
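For example, loading a hypothetical external delegate; the library name and
its option keys are placeholders defined by whatever library is used:

```
# Load a delegate from a shared library and pass it two options.
adb shell /data/local/tmp/benchmark_model \
  --graph=/data/local/tmp/model.tflite \
  --external_delegate_path=/data/local/tmp/libmy_delegate.so \
  --external_delegate_options="option1:value1;option2:value2"
```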


@@ -83,9 +83,11 @@ The following optional parameters can be used to modify the inference runtime:
assumes that `libhexagon_interface.so` and Qualcomm libraries lie in
`/data/local/tmp`.
This script also supports all applicable runtime/delegate arguments supported by
the `benchmark_model` tool. If there is any conflict (for example, `num_threads`
in `benchmark_model` vs `num_interpreter_threads` here), the parameters of this
This script also supports runtime/delegate arguments introduced by the
[delegate registrar](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/delegates).
If there is any conflict (for example, `num_threads` vs
`num_interpreter_threads` here), the parameters of this
script are given precedence.
### Debug Mode


@@ -91,9 +91,11 @@ The following optional parameters can be used to modify the inference runtime:
assumes that `libhexagon_interface.so` and Qualcomm libraries lie in
`/data/local/tmp`.
This script also supports all applicable runtime/delegate arguments supported by
the `benchmark_model` tool. If there is any conflict (for example, `num_threads`
in `benchmark_model` vs `num_interpreter_threads` here), the parameters of this
This script also supports runtime/delegate arguments introduced by the
[delegate registrar](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/delegates).
If there is any conflict (for example, `num_threads` vs
`num_interpreter_threads` here), the parameters of this
script are given precedence.
## Downloading ILSVRC


@@ -64,9 +64,11 @@ and the following optional parameters:
The final metrics are dumped into `output_file_path` as a serialized
instance of `tflite::evaluation::EvaluationStageMetrics`
This script also supports all applicable runtime/delegate arguments supported by
the `benchmark_model` tool. If there is any conflict (for example, `num_threads`
in `benchmark_model` vs `num_interpreter_threads` here), the parameters of this
This script also supports runtime/delegate arguments introduced by the
[delegate registrar](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/delegates).
If there is any conflict (for example, `num_threads` vs
`num_interpreter_threads` here), the parameters of this
script are given precedence.
## Running the binary on Android