diff --git a/tensorflow/lite/delegates/gpu/README.md b/tensorflow/lite/delegates/gpu/README.md
index 552e1cdbec6..c37ee90b704 100644
--- a/tensorflow/lite/delegates/gpu/README.md
+++ b/tensorflow/lite/delegates/gpu/README.md
@@ -51,6 +51,10 @@ TFLite on GPU supports the following ops in 16-bit and 32-bit float precision:
 
 ## Basic Usage
 
+**Note:** Following section describes the example usage for Android GPU delegate
+with C++. For other languages and platforms, please see
+[the documentation](https://www.tensorflow.org/lite/performance/gpu).
+
 Using TFLite on GPU is as simple as getting the GPU delegate via
 `TfLiteGpuDelegateV2Create()` and then passing it to
 `Interpreter::ModifyGraphWithDelegate()` instead of calling
@@ -99,13 +103,13 @@ Metal shaders are used for iOS, which were introduced with iOS 8. Thus,
 compilation flags should look like:
 
 ```sh
-bazel build --config ios_arm64 //path/to/your:project
+bazel build --config ios_fat //path/to/your:project
 ```
 
 ## Advanced Usage: Delegate Options
 
 There are GPU options that can be set and passed on to
-`TfLiteGpuDelegateCreate()`. When option is set to `nullptr` as shown in the
+`TfLiteGpuDelegateV2Create()`. When option is set to `nullptr` as shown in the
 Basic Usage, it translates to:
 
 ```c++
@@ -113,12 +117,13 @@ const TfLiteGpuDelegateOptionsV2 kDefaultOptions =
     TfLiteGpuDelegateOptionsV2Default();
 ```
 
-Similar for `NewTfLiteMetalDelegate()`:
+Similar for `TFLGpuDelegateCreate()`:
 
 ```c++
-const TfLiteMetalDelegateOptions kDefaultOptions = {
-  .precision_loss_allowed = 0, // false
-  .wait_type = TFLITE_METAL_WAIT_TYPE_SLEEP,
+const TFLGpuDelegateOptions kDefaultOptions = {
+  .allow_precision_loss = false,
+  .wait_type = TFLGpuDelegateWaitTypePassive,
+  .enable_quantization = false,
 };
 ```
 
@@ -126,9 +131,10 @@ While it is convenient to just supply `nullptr`, it is recommended to explicitly
 set the options to avoid any unexpected artifacts in case default values are
 changed.
 
-*IMPORTANT:* Note that the default option does not allow precision loss, and
-thus may not be the fastest. For faster execution, you may want to set
-`precision_loss_allowed` to `1` for FP16 execution.
+*IMPORTANT:* Note that the default option may not be the fastest. For faster
+execution, you may want to set `allow_precision_loss` to `true` so that the GPU
+performs FP16 calculation internally, and set `wait_type` to
+`TFLGpuDelegateWaitTypeAggressive` to avoid GPU sleep mode.
 
 ## Tips and Tricks