Document TFLite delegate parameters supported by the TFLite delegate registrar in lite/tools/delegates, and update the README of each affected tool (the evaluation tools and the benchmark tool) to mention these supported parameters.
PiperOrigin-RevId: 307791203
Change-Id: If23c11cb9c80e0037c38f070e0ad23fe6cff690e
@@ -34,32 +34,6 @@ and the following optional parameters:
 * `run_delay`: `float` (default=-1.0) \
   The delay in seconds between subsequent benchmark runs. Non-positive values
   mean use no delay.
-* `use_xnnpack`: `bool` (default=false) \
-  Whether to use the XNNPack delegate.
-* `use_hexagon`: `bool` (default=false) \
-  Whether to use the Hexagon delegate. Not all devices may support the Hexagon
-  delegate, refer to the TensorFlow Lite documentation for more information
-  about which devices/chipsets are supported and about how to get the required
-  libraries. To use the Hexagon delegate also build the
-  hexagon_nn:libhexagon_interface.so target and copy the library to the
-  device. All libraries should be copied to /data/local/tmp on the device.
-* `use_nnapi`: `bool` (default=false) \
-  Whether to use
-  [Android NNAPI](https://developer.android.com/ndk/guides/neuralnetworks/).
-  This API is available on recent Android devices. Note that some Android P
-  devices will fail to use NNAPI for models in `/data/local/tmp/` and this
-  benchmark tool will not correctly use NNAPI. When on Android Q+, will also
-  print the names of NNAPI accelerators accessible through the
-  `nnapi_accelerator_name` flag.
-* `nnapi_accelerator_name`: `str` (default="") \
-  The name of the NNAPI accelerator to use (requires Android Q+). If left
-  blank, NNAPI will automatically select which of the available accelerators
-  to use.
-* `nnapi_execution_preference`: `string` (default="") \
-  Which
-  [NNAPI execution preference](https://developer.android.com/ndk/reference/group/neural-networks.html#group___neural_networks_1gga034380829226e2d980b2a7e63c992f18af727c25f1e2d8dcc693c477aef4ea5f5)
-  to use when executing using NNAPI. Should be one of the following:
-  fast_single_answer, sustained_speed, low_power, undefined.
 * `use_legacy_nnapi`: `bool` (default=false) \
   Whether to use the legacy
   [Android NNAPI](https://developer.android.com/ndk/guides/neuralnetworks/)
@@ -67,39 +41,6 @@ and the following optional parameters:
   This is available on recent Android devices. Note that some Android P
   devices will fail to use NNAPI for models in `/data/local/tmp/` and this
   benchmark tool will not correctly use NNAPI.
-* `max_delegated_partitions`: `int` (default=0, i.e. no limit) \
-  The maximum number of partitions that will be delegated. \
-  Currently supported by the Hexagon delegate or the NNAPI delegate but won't
-  work if `use_legacy_nnapi` has been selected.
-* `min_nodes_per_partition`: `int` (default=0, i.e. default choice implemented
-  by each delegate) \
-  The minimal number of TFLite graph nodes of a partition that needs to be
-  reached to be delegated. A negative value or 0 means to use the default
-  choice of each delegate. \
-  This option is currently only supported by the Hexagon delegate.
-* `disable_nnapi_cpu`: `bool` (default=false) \
-  Excludes the
-  [NNAPI CPU reference implementation](https://developer.android.com/ndk/guides/neuralnetworks#device-assignment)
-  from the possible devices to be used by NNAPI to execute the model. This
-  option is ignored if `nnapi_accelerator_name` is specified.
-* `use_gpu`: `bool` (default=false) \
-  Whether to use the
-  [GPU accelerator delegate](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/delegates/gpu).
-  This option is currently only available on Android and iOS devices.
-* `gpu_precision_loss_allowed`: `bool` (default=true) \
-  Whether to allow the GPU delegate to carry out computation with some
-  precision loss (i.e. processing in FP16) or not. If allowed, the performance
-  will increase.
-* `gpu_experimental_enable_quant`: `bool` (default=true) \
-  Whether to allow the GPU delegate to run a quantized model or not. This
-  option is currently only available on Android.
-* `gpu_wait_type`: `str` (default="") \
-  Which GPU wait_type option to use, when using GPU delegate on iOS. Should be
-  one of the following: passive, active, do_not_wait, aggressive. When left
-  blank, passive mode is used by default.
-* `use_coreml`: `bool` (default=false) \
-  Whether to use the [Core ML delegate](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/experimental/delegates/coreml).
-  This option is only available in iOS.
 * `enable_op_profiling`: `bool` (default=false) \
   Whether to enable per-operator profiling measurement.
 * `enable_platform_tracing`: `bool` (default=false) \
@@ -107,16 +48,50 @@ and the following optional parameters:
   'enable_op_profiling'. Note that the platform-wide tracing might not work if
   the tool runs as a commandline native binary. For example, on Android, the
   ATrace-based tracing only works when the tool is launched as an APK.
-* `hexagon_profiling`: `bool` (default=false) \
-  Whether to profile ops running on hexagon. Needs to be combined with
-  `enable_op_profiling`. When this is set to true the profile of ops on
-  hexagon DSP will be added to the profile table. Note that, the reported data
-  on hexagon is in cycles, not in ms like on cpu.
 * `external_delegate_path`: `string` (default="") \
   Path to the external delegate library to use.
 * `external_delegate_options`: `string` (default="") \
   A list of options to be passed to the external delegate library. Options
   should be in the format of `option1:value1;option2:value2;optionN:valueN`
+
+### TFLite delegate parameters
+The tool supports all runtime/delegate parameters introduced by
+[the delegate registrar](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/delegates).
+The following simply lists their names, with additional notes where applicable:
+
+#### Common parameters
+* `max_delegated_partitions`: `int` (default=0) \
+  Note that when `use_legacy_nnapi` is selected, this parameter won't work.
+* `min_nodes_per_partition`: `int` (default=0)
+
+#### GPU delegate provider
+* `use_gpu`: `bool` (default=false)
+* `gpu_precision_loss_allowed`: `bool` (default=true)
+* `gpu_experimental_enable_quant`: `bool` (default=true)
+* `gpu_backend`: `string` (default="")
+* `gpu_wait_type`: `string` (default="")
+
+#### NNAPI delegate provider
+* `use_nnapi`: `bool` (default=false) \
+  Note that some Android P devices will fail to use NNAPI for models in
+  `/data/local/tmp/` and this benchmark tool will not correctly use NNAPI.
+* `nnapi_accelerator_name`: `string` (default="")
+* `disable_nnapi_cpu`: `bool` (default=false)
+
+#### Hexagon delegate provider
+* `use_hexagon`: `bool` (default=false)
+* `hexagon_profiling`: `bool` (default=false) \
+  Note that enabling this option will not produce profiling results unless
+  `enable_op_profiling` is also turned on. When both parameters are set to
+  true, the profile of ops on the Hexagon DSP will be added to the profile
+  table. Note that the reported data on Hexagon is in cycles, not in ms as on
+  the CPU.
+
+#### XNNPACK delegate provider
+* `use_xnnpack`: `bool` (default=false)
+
+#### CoreML delegate provider
+* `use_coreml`: `bool` (default=false)
+
+#### External delegate provider
+* `external_delegate_path`: `string` (default="")
+* `external_delegate_options`: `string` (default="")
+
 ## To build/install/run
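For illustration, the delegate parameters listed above are ordinary command-line flags of the benchmark tool; a minimal sketch follows (the model path and flag values are made up):

```
adb shell /data/local/tmp/benchmark_model \
  --graph=/data/local/tmp/mobilenet_v1_1.0_224.tflite \
  --use_xnnpack=true \
  --num_threads=4 \
  --enable_op_profiling=true
```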
tensorflow/lite/tools/delegates/README.md (new file, 104 lines)
@@ -0,0 +1,104 @@
+# TFLite Delegate Utilities for Tooling
+
+## TFLite Delegate Registrar
+[A TFLite delegate registrar](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/tools/delegates/delegate_provider.h)
+is provided here. The registrar keeps a list of TFLite delegate providers, each
+of which defines a list of parameters that can be initialized from command-line
+arguments and creates a TFLite delegate instance based on those parameters.
+This delegate registrar has been used in TFLite evaluation tools and the
+benchmark model tool.
+
+A particular TFLite delegate provider can be used by linking the corresponding
+library, e.g. adding it to the `deps` of a BUILD rule. Note that each delegate
+provider library has been configured with `alwayslink=1` in the BUILD rule so
+that it will be linked to any binary that directly or indirectly depends on it.
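As a concrete sketch of this linking behavior: the benchmark tool's BUILD rule already lists the providers in its `deps`, so simply building it pulls every registered provider into the binary. The standard Android build command from the benchmark README applies:

```
bazel build -c opt --config=android_arm64 \
  tensorflow/lite/tools/benchmark:benchmark_model
```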
+
+The following lists all implemented TFLite delegate providers and the
+parameters that each supports for creating a particular TFLite delegate.
+
+### Common parameters
+* `num_threads`: `int` (default=1) \
+  The number of threads to use for running the inference on CPU.
+* `max_delegated_partitions`: `int` (default=0, i.e. no limit) \
+  The maximum number of partitions that will be delegated. \
+  Currently supported by the GPU, Hexagon, CoreML and NNAPI delegates.
+* `min_nodes_per_partition`: `int` (default=delegate's own choice) \
+  The minimal number of TFLite graph nodes of a partition that needs to be
+  reached to be delegated. A negative value or 0 means to use the default
+  choice of each delegate. \
+  This option is currently supported by the Hexagon and CoreML delegates.
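For example, a hypothetical benchmark run that caps delegation (the values below are arbitrary, not recommendations):

```
adb shell /data/local/tmp/benchmark_model \
  --graph=/data/local/tmp/model.tflite \
  --use_hexagon=true \
  --max_delegated_partitions=2 \
  --min_nodes_per_partition=3
```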
+
+### GPU delegate provider
+* `use_gpu`: `bool` (default=false) \
+  Whether to use the
+  [GPU accelerator delegate](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/delegates/gpu).
+  This option is currently only available on Android and iOS devices.
+* `gpu_precision_loss_allowed`: `bool` (default=true) \
+  Whether to allow the GPU delegate to carry out computation with some
+  precision loss (i.e. processing in FP16) or not. If allowed, the performance
+  will increase.
+* `gpu_experimental_enable_quant`: `bool` (default=true) \
+  Whether to allow the GPU delegate to run a quantized model or not. \
+  This option is currently only available on Android.
+* `gpu_backend`: `string` (default="") \
+  Force the GPU delegate to use a particular backend for execution, and fail
+  if unsuccessful. Should be one of: cl, gl. By default, the GPU delegate will
+  try OpenCL first and then OpenGL if the former fails. \
+  Note that this option is only available on Android.
+* `gpu_wait_type`: `string` (default="") \
+  Which GPU wait_type option to use, when using GPU delegate on iOS. Should be
+  one of the following: passive, active, do_not_wait, aggressive. When left
+  blank, passive mode is used by default.
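A sketch of the GPU options in use on Android (model path illustrative):

```
adb shell /data/local/tmp/benchmark_model \
  --graph=/data/local/tmp/model.tflite \
  --use_gpu=true \
  --gpu_backend=cl \
  --gpu_precision_loss_allowed=true
```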
+
+### NNAPI delegate provider
+* `use_nnapi`: `bool` (default=false) \
+  Whether to use
+  [Android NNAPI](https://developer.android.com/ndk/guides/neuralnetworks/).
+  This API is available on recent Android devices. When on Android Q+, it will
+  also print the names of NNAPI accelerators accessible through the
+  `nnapi_accelerator_name` flag.
+* `nnapi_accelerator_name`: `string` (default="") \
+  The name of the NNAPI accelerator to use (requires Android Q+). If left
+  blank, NNAPI will automatically select which of the available accelerators
+  to use.
+* `nnapi_execution_preference`: `string` (default="") \
+  Which
+  [NNAPI execution preference](https://developer.android.com/ndk/reference/group/neural-networks.html#group___neural_networks_1gga034380829226e2d980b2a7e63c992f18af727c25f1e2d8dcc693c477aef4ea5f5)
+  to use when executing using NNAPI. Should be one of the following:
+  fast_single_answer, sustained_speed, low_power, undefined.
+* `disable_nnapi_cpu`: `bool` (default=false) \
+  Excludes the
+  [NNAPI CPU reference implementation](https://developer.android.com/ndk/guides/neuralnetworks#device-assignment)
+  from the possible devices to be used by NNAPI to execute the model. This
+  option is ignored if `nnapi_accelerator_name` is specified.
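A sketch of the NNAPI options together (the accelerator name below is a made-up example; real names are device-specific and are printed by the tool on Android Q+):

```
adb shell /data/local/tmp/benchmark_model \
  --graph=/data/local/tmp/model.tflite \
  --use_nnapi=true \
  --nnapi_execution_preference=sustained_speed \
  --nnapi_accelerator_name=example-accelerator
```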
+
+### Hexagon delegate provider
+* `use_hexagon`: `bool` (default=false) \
+  Whether to use the Hexagon delegate. Not all devices may support the Hexagon
+  delegate; refer to the
+  [TensorFlow Lite documentation](https://www.tensorflow.org/lite/performance/hexagon_delegate)
+  for more information about which devices/chipsets are supported and about
+  how to get the required libraries. To use the Hexagon delegate, also build
+  the hexagon_nn:libhexagon_interface.so target and copy the library to the
+  device. All libraries should be copied to /data/local/tmp on the device.
+* `hexagon_profiling`: `bool` (default=false) \
+  Whether to profile ops running on Hexagon.
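A sketch of the build-and-copy step described above. The Bazel package containing the hexagon_nn target varies across versions, so the paths below are an assumption; see the linked guide for the exact target:

```
bazel build -c opt --config=android_arm64 \
  tensorflow/lite/delegates/hexagon/hexagon_nn:libhexagon_interface.so
adb push \
  bazel-bin/tensorflow/lite/delegates/hexagon/hexagon_nn/libhexagon_interface.so \
  /data/local/tmp
```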
+
+### XNNPACK delegate provider
+* `use_xnnpack`: `bool` (default=false) \
+  Whether to use the XNNPack delegate.
+
+### CoreML delegate provider
+* `use_coreml`: `bool` (default=false) \
+  Whether to use the
+  [Core ML delegate](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/experimental/delegates/coreml).
+  This option is only available on iOS.
+
+### External delegate provider
+* `external_delegate_path`: `string` (default="") \
+  Path to the external delegate library to use.
+* `external_delegate_options`: `string` (default="") \
+  A list of options to be passed to the external delegate library. Options
+  should be in the format of `option1:value1;option2:value2;optionN:valueN`
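A sketch of loading a hypothetical external delegate library, passing two made-up options in the documented format:

```
adb shell /data/local/tmp/benchmark_model \
  --graph=/data/local/tmp/model.tflite \
  --external_delegate_path=/data/local/tmp/libsample_external_delegate.so \
  --external_delegate_options="option1:value1;option2:value2"
```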
@@ -83,9 +83,11 @@ The following optional parameters can be used to modify the inference runtime:
   assumes that `libhexagon_interface.so` and Qualcomm libraries lie in
   `/data/local/tmp`.
 
-This script also supports all applicable runtime/delegate arguments supported on
-the `benchmark_model` tool. If there is any conflict (for example, `num_threads`
-in `benchmark_model` vs `num_interpreter_threads` here), the parameters of this
+This script also supports runtime/delegate arguments introduced by the
+[delegate registrar](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/delegates).
+If there is any conflict (for example, `num_threads` vs
+`num_interpreter_threads` here), the parameters of this
 script are given precedence.
 
 ### Debug Mode
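To illustrate the precedence rule, a sketch with a placeholder evaluation binary; the flag names other than `num_interpreter_threads` come from the delegate registrar, and `--model_file` is assumed to be the script's model flag:

```
# <eval_binary> stands for the evaluation tool built from the relevant
# directory; on conflict, its own --num_interpreter_threads wins.
adb shell /data/local/tmp/<eval_binary> \
  --model_file=/data/local/tmp/model.tflite \
  --num_interpreter_threads=2 \
  --use_xnnpack=true
```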
@@ -91,9 +91,11 @@ The following optional parameters can be used to modify the inference runtime:
   assumes that `libhexagon_interface.so` and Qualcomm libraries lie in
   `/data/local/tmp`.
 
-This script also supports all applicable runtime/delegate arguments supported on
-the `benchmark_model` tool. If there is any conflict (for example, `num_threads`
-in `benchmark_model` vs `num_interpreter_threads` here), the parameters of this
+This script also supports runtime/delegate arguments introduced by the
+[delegate registrar](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/delegates).
+If there is any conflict (for example, `num_threads` vs
+`num_interpreter_threads` here), the parameters of this
 script are given precedence.
 
 ## Downloading ILSVRC
@@ -64,9 +64,11 @@ and the following optional parameters:
   The final metrics are dumped into `output_file_path` as a serialized
   instance of `tflite::evaluation::EvaluationStageMetrics`
 
-This script also supports all applicable runtime/delegate arguments supported on
-the `benchmark_model` tool. If there is any conflict (for example, `num_threads`
-in `benchmark_model` vs `num_interpreter_threads` here), the parameters of this
+This script also supports runtime/delegate arguments introduced by the
+[delegate registrar](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/delegates).
+If there is any conflict (for example, `num_threads` vs
+`num_interpreter_threads` here), the parameters of this
 script are given precedence.
 
 ## Running the binary on Android