From 83b579780d06e6e576f20408adaa047e31ee1b3b Mon Sep 17 00:00:00 2001 From: Chao Mei Date: Wed, 22 Apr 2020 04:39:55 -0700 Subject: [PATCH] Document TFLite delegate parameters supported by the TFLite delegate registrar in lite/tools/delegates, and also change README in each tool (evaluation tools and the benchmark tool) accordingly to mention these supported parameters. PiperOrigin-RevId: 307791203 Change-Id: If23c11cb9c80e0037c38f070e0ad23fe6cff690e --- tensorflow/lite/tools/benchmark/README.md | 113 +++++++----------- tensorflow/lite/tools/delegates/README.md | 104 ++++++++++++++++ .../tasks/coco_object_detection/README.md | 8 +- .../imagenet_image_classification/README.md | 8 +- .../evaluation/tasks/inference_diff/README.md | 8 +- 5 files changed, 163 insertions(+), 78 deletions(-) create mode 100644 tensorflow/lite/tools/delegates/README.md diff --git a/tensorflow/lite/tools/benchmark/README.md b/tensorflow/lite/tools/benchmark/README.md index 70728f41db1..8fda7af77af 100644 --- a/tensorflow/lite/tools/benchmark/README.md +++ b/tensorflow/lite/tools/benchmark/README.md @@ -34,32 +34,6 @@ and the following optional parameters: * `run_delay`: `float` (default=-1.0) \ The delay in seconds between subsequent benchmark runs. Non-positive values mean use no delay. -* `use_xnnpack`: `bool` (default=false) \ - Whether to use the XNNPack delegate. -* `use_hexagon`: `bool` (default=false) \ - Whether to use the Hexagon delegate. Not all devices may support the Hexagon - delegate, refer to the TensorFlow Lite documentation for more information - about which devices/chipsets are supported and about how to get the required - libraries. To use the Hexagon delegate also build the - hexagon_nn:libhexagon_interface.so target and copy the library to the - device. All libraries should be copied to /data/local/tmp on the device. -* `use_nnapi`: `bool` (default=false) \ - Whether to use - [Android NNAPI](https://developer.android.com/ndk/guides/neuralnetworks/). 
- This API is available on recent Android devices. Note that some Android P - devices will fail to use NNAPI for models in `/data/local/tmp/` and this - benchmark tool will not correctly use NNAPI. When on Android Q+, will also - print the names of NNAPI accelerators accessible through the - `nnapi_accelerator_name` flag. -* `nnapi_accelerator_name`: `str` (default="") \ - The name of the NNAPI accelerator to use (requires Android Q+). If left - blank, NNAPI will automatically select which of the available accelerators - to use. -* `nnapi_execution_preference`: `string` (default="") \ - Which - [NNAPI execution preference](https://developer.android.com/ndk/reference/group/neural-networks.html#group___neural_networks_1gga034380829226e2d980b2a7e63c992f18af727c25f1e2d8dcc693c477aef4ea5f5) - to use when executing using NNAPI. Should be one of the following: - fast_single_answer, sustained_speed, low_power, undefined. * `use_legacy_nnapi`: `bool` (default=false) \ Whether to use the legacy [Android NNAPI](https://developer.android.com/ndk/guides/neuralnetworks/) @@ -67,39 +41,6 @@ and the following optional parameters: This is available on recent Android devices. Note that some Android P devices will fail to use NNAPI for models in `/data/local/tmp/` and this benchmark tool will not correctly use NNAPI. -* `max_delegated_partitions`: `int` (default=0, i.e. no limit) \ - The maximum number of partitions that will be delegated. \ - Currently supported by the Hexagon delegate or the NNAPI delegate but won't - work if `use_legacy_nnapi` has been selected. -* `min_nodes_per_partition`: `int` (default=0, i.e. default choice implemented - by each delegate) \ - The minimal number of TFLite graph nodes of a partition that needs to be - reached to be delegated. A negative value or 0 means to use the default - choice of each delegate. \ - This option is currently only supported by the Hexagon delegate. 
-* `disable_nnapi_cpu`: `bool` (default=false) \ - Excludes the - [NNAPI CPU reference implementation](https://developer.android.com/ndk/guides/neuralnetworks#device-assignment) - from the possible devices to be used by NNAPI to execute the model. This - option is ignored if `nnapi_accelerator_name` is specified. -* `use_gpu`: `bool` (default=false) \ - Whether to use the - [GPU accelerator delegate](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/delegates/gpu). - This option is currently only available on Android and iOS devices. -* `gpu_precision_loss_allowed`: `bool` (default=true) \ - Whethre to allow the GPU delegate to carry out computation with some - precision loss (i.e. processing in FP16) or not. If allowed, the performance - will increase. -* `gpu_experimental_enable_quant`: `bool` (default=true) \ - Whether to allow the GPU delegate to run a quantized model or not. This - option is currently only available on Android. -* `gpu_wait_type`: `str` (default="") \ - Which GPU wait_type option to use, when using GPU delegate on iOS. Should be - one of the following: passive, active, do_not_wait, aggressive. When left - blank, passive mode is used by default. -* `use_coreml`: `bool` (default=false) \ - Whether to use the [Core ML delegate](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/experimental/delegates/coreml). - This option is only available in iOS. * `enable_op_profiling`: `bool` (default=false) \ Whether to enable per-operator profiling measurement. * `enable_platform_tracing`: `bool` (default=false) \ @@ -107,16 +48,50 @@ and the following optional parameters: 'enable_op_profiling'. Note, the platform-wide tracing might not work if the tool runs as a commandline native binary. For example, on Android, the ATrace-based tracing only works when the tool is launched as an APK. -* `hexagon_profiling`: `bool` (default=false) \ - Whether to profile ops running on hexagon. 
Needs to be combined with
-    `enable_op_profiling`. When this is set to true the profile of ops on
-    hexagon DSP will be added to the profile table. Note that, the reported data
-    on hexagon is in cycles, not in ms like on cpu.
-* `external_delegate_path`: `string` (default="") \
-    Path to the external delegate library to use.
-* `external_delegate_options`: `string` (default="") \
-    A list of options to be passed to the external delegate library. Options
-    should be in the format of `option1:value1;option2:value2;optionN:valueN`
+
+### TFLite delegate parameters
+The tool supports all runtime/delegate parameters introduced by the
+[delegate
+registrar](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/delegates).
+The following lists the names of all of them, with additional notes where
+applicable:
+#### Common parameters
+* `max_delegated_partitions`: `int` (default=0) \
+    Note that when `use_legacy_nnapi` is selected, this parameter is ignored.
+* `min_nodes_per_partition`: `int` (default=0)
+
+#### GPU delegate provider
+* `use_gpu`: `bool` (default=false)
+* `gpu_precision_loss_allowed`: `bool` (default=true)
+* `gpu_experimental_enable_quant`: `bool` (default=true)
+* `gpu_backend`: `string` (default="")
+* `gpu_wait_type`: `string` (default="")
+
+#### NNAPI delegate provider
+
+* `use_nnapi`: `bool` (default=false) \
+    Note that some Android P devices will fail to use NNAPI for models in
+    `/data/local/tmp/` and this benchmark tool will not correctly use NNAPI.
+* `nnapi_accelerator_name`: `string` (default="")
+* `disable_nnapi_cpu`: `bool` (default=false)
+
+#### Hexagon delegate provider
+* `use_hexagon`: `bool` (default=false)
+* `hexagon_profiling`: `bool` (default=false) \
+    Note that enabling this option will not produce profiling results unless
+    `enable_op_profiling` is also turned on. When both parameters are set to
+    true, the profile of ops on the Hexagon DSP will be added to the profile
+    table.
Note that
+the reported data on Hexagon is in cycles, not in ms as on the CPU.
+
+#### XNNPACK delegate provider
+* `use_xnnpack`: `bool` (default=false)
+
+#### CoreML delegate provider
+* `use_coreml`: `bool` (default=false)
+
+#### External delegate provider
+* `external_delegate_path`: `string` (default="")
+* `external_delegate_options`: `string` (default="")
 
 ## To build/install/run
diff --git a/tensorflow/lite/tools/delegates/README.md b/tensorflow/lite/tools/delegates/README.md
new file mode 100644
index 00000000000..09d8045d706
--- /dev/null
+++ b/tensorflow/lite/tools/delegates/README.md
@@ -0,0 +1,104 @@
+# TFLite Delegate Utilities for Tooling
+
+## TFLite Delegate Registrar
+A [TFLite delegate
+registrar](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/tools/delegates/delegate_provider.h)
+is provided here. The registrar keeps a list of TFLite delegate providers, each
+of which defines a list of parameters that can be initialized from command-line
+arguments and creates a TFLite delegate instance based on those
+parameters. This delegate registrar is used by the TFLite evaluation tools and
+the benchmark model tool.
+
+A particular TFLite delegate provider can be used by
+linking the corresponding library, e.g. by adding it to the `deps` of a BUILD rule.
+Note that each delegate provider library has been configured with
+`alwayslink=1` in the BUILD rule so that it will be linked to any binary that
+directly or indirectly depends on it.
+
+The following lists all implemented TFLite delegate providers and the
+parameters that each supports for creating a particular
+TFLite delegate.
+
+### Common parameters
+* `num_threads`: `int` (default=1) \
+    The number of threads to use for running the inference on CPU.
+* `max_delegated_partitions`: `int` (default=0, i.e. no limit) \
+    The maximum number of partitions that will be delegated. \
+    Currently supported by the GPU, Hexagon, CoreML and NNAPI delegates.
* `min_nodes_per_partition`: `int` (default=delegate's own choice) \
+    The minimum number of TFLite graph nodes a partition must contain to be
+    delegated. A negative value or 0 means to use the default
+    choice of each delegate. \
+    This option is currently supported by the Hexagon and CoreML delegates.
+
+### GPU delegate provider
+* `use_gpu`: `bool` (default=false) \
+    Whether to use the
+    [GPU accelerator delegate](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/delegates/gpu).
+    This option is currently only available on Android and iOS devices.
+* `gpu_precision_loss_allowed`: `bool` (default=true) \
+    Whether to allow the GPU delegate to carry out computation with some
+    precision loss (i.e. processing in FP16). If allowed, performance
+    will typically increase.
+* `gpu_experimental_enable_quant`: `bool` (default=true) \
+    Whether to allow the GPU delegate to run a quantized model or not. \
+    This option is currently only available on Android.
+* `gpu_backend`: `string` (default="") \
+    Force the GPU delegate to use a particular backend for execution, and fail
+    if unsuccessful. Should be one of: cl, gl. By default, the GPU delegate will
+    try OpenCL first and then OpenGL if the former fails. \
+    Note that this option is only available on Android.
+* `gpu_wait_type`: `string` (default="") \
+    Which GPU wait_type option to use when using the GPU delegate on iOS. Should
+    be one of the following: passive, active, do_not_wait, aggressive. When left
+    blank, passive mode is used by default.
+
+### NNAPI delegate provider
+* `use_nnapi`: `bool` (default=false) \
+    Whether to use
+    [Android NNAPI](https://developer.android.com/ndk/guides/neuralnetworks/).
+    This API is available on recent Android devices. On Android Q+, the tool
+    will also print the names of NNAPI accelerators accessible through the
+    `nnapi_accelerator_name` flag.
* `nnapi_accelerator_name`: `string` (default="") \
+    The name of the NNAPI accelerator to use (requires Android Q+). If left
+    blank, NNAPI will automatically select which of the available accelerators
+    to use.
+* `nnapi_execution_preference`: `string` (default="") \
+    Which
+    [NNAPI execution preference](https://developer.android.com/ndk/reference/group/neural-networks.html#group___neural_networks_1gga034380829226e2d980b2a7e63c992f18af727c25f1e2d8dcc693c477aef4ea5f5)
+    to use when executing with NNAPI. Should be one of the following:
+    fast_single_answer, sustained_speed, low_power, undefined.
+* `disable_nnapi_cpu`: `bool` (default=false) \
+    Excludes the
+    [NNAPI CPU reference implementation](https://developer.android.com/ndk/guides/neuralnetworks#device-assignment)
+    from the possible devices to be used by NNAPI to execute the model. This
+    option is ignored if `nnapi_accelerator_name` is specified.
+
+### Hexagon delegate provider
+* `use_hexagon`: `bool` (default=false) \
+    Whether to use the Hexagon delegate. Not all devices support the Hexagon
+    delegate; refer to the [TensorFlow Lite
+    documentation](https://www.tensorflow.org/lite/performance/hexagon_delegate) for more
+    information about which devices/chipsets are supported and how to get
+    the required libraries. To use the Hexagon delegate, also build the
+    `hexagon_nn:libhexagon_interface.so` target and copy the library to the
+    device. All libraries should be copied to /data/local/tmp on the device.
+* `hexagon_profiling`: `bool` (default=false) \
+    Whether to profile ops running on Hexagon.
+
+### XNNPACK delegate provider
+* `use_xnnpack`: `bool` (default=false) \
+    Whether to use the XNNPACK delegate.
+
+### CoreML delegate provider
+* `use_coreml`: `bool` (default=false) \
+    Whether to use the [Core ML delegate](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/experimental/delegates/coreml).
+    This option is only available on iOS.
+
+### External delegate provider
+* `external_delegate_path`: `string` (default="") \
+    Path to the external delegate library to use.
+* `external_delegate_options`: `string` (default="") \
+    A list of options to be passed to the external delegate library. Options
+    should be in the format of `option1:value1;option2:value2;optionN:valueN`.
diff --git a/tensorflow/lite/tools/evaluation/tasks/coco_object_detection/README.md b/tensorflow/lite/tools/evaluation/tasks/coco_object_detection/README.md
index 5b4617d0bb8..691c1001d79 100644
--- a/tensorflow/lite/tools/evaluation/tasks/coco_object_detection/README.md
+++ b/tensorflow/lite/tools/evaluation/tasks/coco_object_detection/README.md
@@ -83,9 +83,11 @@ The following optional parameters can be used to modify the inference runtime:
     assumes that `libhexagon_interface.so` and Qualcomm libraries lie in
     `/data/local/tmp`.
 
-This script also supports all applicable runtime/delegate arguments supported on
-the `benchmark_model` tool. If there is any conflict (for example, `num_threads`
-in `benchmark_model` vs `num_interpreter_threads` here), the parameters of this
+This script also supports runtime/delegate arguments introduced by the
+[delegate
+registrar](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/delegates).
+If there is any conflict (for example, `num_threads` vs
+`num_interpreter_threads` here), the parameters of this
 script are given precedence.
 
 ### Debug Mode
diff --git a/tensorflow/lite/tools/evaluation/tasks/imagenet_image_classification/README.md b/tensorflow/lite/tools/evaluation/tasks/imagenet_image_classification/README.md
index ac8006befa5..a9c82dbdd07 100644
--- a/tensorflow/lite/tools/evaluation/tasks/imagenet_image_classification/README.md
+++ b/tensorflow/lite/tools/evaluation/tasks/imagenet_image_classification/README.md
@@ -91,9 +91,11 @@ The following optional parameters can be used to modify the inference runtime:
     assumes that `libhexagon_interface.so` and Qualcomm libraries lie in
     `/data/local/tmp`.
 
-This script also supports all applicable runtime/delegate arguments supported on
-the `benchmark_model` tool. If there is any conflict (for example, `num_threads`
-in `benchmark_model` vs `num_interpreter_threads` here), the parameters of this
+This script also supports runtime/delegate arguments introduced by the
+[delegate
+registrar](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/delegates).
+If there is any conflict (for example, `num_threads` vs
+`num_interpreter_threads` here), the parameters of this
 script are given precedence.
 
 ## Downloading ILSVRC
diff --git a/tensorflow/lite/tools/evaluation/tasks/inference_diff/README.md b/tensorflow/lite/tools/evaluation/tasks/inference_diff/README.md
index c8873a63a62..30857de7c77 100644
--- a/tensorflow/lite/tools/evaluation/tasks/inference_diff/README.md
+++ b/tensorflow/lite/tools/evaluation/tasks/inference_diff/README.md
@@ -64,9 +64,11 @@ and the following optional parameters:
 The final metrics are dumped into `output_file_path` as a serialized instance
 of `tflite::evaluation::EvaluationStageMetrics`
 
-This script also supports all applicable runtime/delegate arguments supported on
-the `benchmark_model` tool.
If there is any conflict (for example, `num_threads`
-in `benchmark_model` vs `num_interpreter_threads` here), the parameters of this
+This script also supports runtime/delegate arguments introduced by the
+[delegate
+registrar](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/tools/delegates).
+If there is any conflict (for example, `num_threads` vs
+`num_interpreter_threads` here), the parameters of this
 script are given precedence.
 
 ## Running the binary on Android
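To make the documented flag format concrete, here is a small shell sketch that composes a `benchmark_model` invocation from the delegate parameters described in this patch. Only the flag names (`--graph`, `--use_gpu`, `--gpu_backend`, `--max_delegated_partitions`, `--external_delegate_path`, `--external_delegate_options`) come from the READMEs above; the model path, delegate library path, and option names/values are hypothetical placeholders.

```shell
# Sketch: compose a benchmark_model command line from delegate-registrar flags.
# All paths and option values below are hypothetical placeholders.
MODEL=/data/local/tmp/model.tflite

# GPU delegate flags plus a common partitioning parameter.
DELEGATE_FLAGS="--use_gpu=true --gpu_backend=cl --max_delegated_partitions=4"

# External delegate flags; options use the `option1:value1;option2:value2` format.
EXT_FLAGS="--external_delegate_path=/data/local/tmp/libsample_delegate.so"
EXT_FLAGS="${EXT_FLAGS} --external_delegate_options=option1:value1;option2:value2"

CMD="/data/local/tmp/benchmark_model --graph=${MODEL} ${DELEGATE_FLAGS}"

# Print the command rather than executing it: actually running it requires the
# benchmark binary to have been pushed to a device (e.g. via adb push).
echo "${CMD}"
```

On a device this would typically be launched through `adb shell`; to exercise the external delegate provider instead, substitute `${EXT_FLAGS}` for `${DELEGATE_FLAGS}`.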