Enable TFLite delegations

2019-07-23 14:00:08 +02:00
commit 67004ca137
parent d2d46c3aee
10 changed files with 213 additions and 85 deletions
--- a/SUPPORT.rst
+++ b/SUPPORT.rst
@ -1,3 +1,5 @@
 .. _support:
 Contact/Getting Help
 ====================
--- a/native_client/README.rst
+++ b/native_client/README.rst
@ -1,7 +1,11 @@
 .. _build-native-client:
 Building DeepSpeech Binaries
 ============================
 This section describes how to rebuild binaries. We already provide prebuilt binaries for all supported platforms,
 and we strongly advise using them unless you know what you are doing.
 If you'd like to build the DeepSpeech binaries yourself, you'll need the following pre-requisites downloaded and installed:
 * `Bazel 2.0.0 <https://github.com/bazelbuild/bazel/releases/tag/2.0.0>`_
@ -165,13 +169,31 @@ The path of the system tree can be overridden from the default values defined in
   cd ../DeepSpeech/native_client
   make TARGET=<system> deepspeech
-Android devices
+Android devices support
-^^^^^^^^^^^^^^^
+-----------------------
-We have preliminary support for Android relying on TensorFlow Lite, with Java and JNI bindings. For more details on how to experiment with those, please refer to ``native_client/java/README.rst``.
+We have support for Android relying on TensorFlow Lite, with Java and JNI bindings. For more details on how to experiment with those, please refer to the section below.
 Please refer to the TensorFlow documentation on how to set up the environment to build for Android (SDK and NDK required).
 Using the library from Android project
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 We provide an up-to-date and tested ``libdeepspeech``, usable as an ``AAR`` package,
 for Android versions 7.0 through 11.0. The package is published on
 `JCenter <https://bintray.com/alissy/org.mozilla.deepspeech/libdeepspeech>`_,
 and the ``JCenter`` repository should be available by default in any Android
 project.  Please make sure your project is set up to pull from this repository.
 You can then include the library by adding this line to your
 ``build.gradle``, adjusting ``VERSION`` to the version you need:
 .. code-block::
   implementation 'deepspeech.mozilla.org:libdeepspeech:VERSION@aar'
 Building ``libdeepspeech.so``
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 You can build the ``libdeepspeech.so`` using (ARMv7):
 .. code-block::
@ -184,6 +206,25 @@ Or (ARM64):
   bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic --config=android --config=android_arm64 --define=runtime=tflite --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++11 --copt=-D_GLIBCXX_USE_C99 //native_client:libdeepspeech.so
 Building ``libdeepspeech.aar``
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 In the unlikely event you have to rebuild the JNI bindings, source code is
 available under the ``libdeepspeech`` subdirectory.  Building depends on the shared
 object: please make sure to place ``libdeepspeech.so`` into the matching
 ``libdeepspeech/libs/{arm64-v8a,armeabi-v7a,x86_64}/`` subdirectories.
 Building the bindings is managed by ``gradle`` and should be limited to issuing
 ``./gradlew libdeepspeech:build``, producing an ``AAR`` package in
 ``./libdeepspeech/build/outputs/aar/``.
 Please note that you might have to copy the file to a local Maven repository
 and adapt file naming (when missing, the error message should state what
 filename it expects and where).
 Building C++ ``deepspeech`` binary
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 Building the ``deepspeech`` binary will happen through ``ndk-build`` (ARMv7):
 .. code-block::
@ -197,3 +238,77 @@ And (ARM64):
   cd ../DeepSpeech/native_client
   $ANDROID_NDK_HOME/ndk-build APP_PLATFORM=android-21 APP_BUILD_SCRIPT=$(pwd)/Android.mk NDK_PROJECT_PATH=$(pwd) APP_STL=c++_shared TFDIR=$(pwd)/../tensorflow/ TARGET_ARCH_ABI=arm64-v8a
 Android demo APK
 ^^^^^^^^^^^^^^^^
 Provided is a very simple Android demo app that allows you to test the library.
 You can build it with ``make apk`` and install the resulting APK file. Please
 refer to Gradle documentation for more details.
 The ``APK`` should be produced in ``/app/build/outputs/apk/``. This demo app might
 require external storage permissions. You can then push model files to your
 device, set the path to the file in the UI and try to run on an audio file.
 When running, it should first play the audio file and then run the decoding. At
 the end of the decoding, you should be presented with the decoded text as well
 as the time elapsed to decode in milliseconds.
 This application is very limited on purpose, and is only here as a very basic
 demo of one usage of the library. For example, it is only able to read PCM
 mono 16kHz 16-bit files and it might fail on some WAVE files that do not
 follow the specification exactly.
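The format constraint above (PCM, mono, 16kHz, 16-bit) can be checked before handing a file to the demo. The following is a minimal sketch and not code from DeepSpeech: ``read_le`` and ``is_supported_wav`` are hypothetical helpers, and the sketch assumes the canonical WAVE layout in which the ``fmt`` chunk immediately follows the 12-byte RIFF header.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: read a little-endian unsigned integer of `size`
// bytes (size <= 4) starting at `offset`. The caller checks the bounds.
static std::uint32_t read_le(const std::vector<std::uint8_t>& buf,
                             std::size_t offset, std::size_t size) {
  std::uint32_t value = 0;
  for (std::size_t i = 0; i < size; ++i) {
    value |= static_cast<std::uint32_t>(buf[offset + i]) << (8 * i);
  }
  return value;
}

// Hypothetical helper, not part of DeepSpeech: returns true when `header`
// begins with a canonical RIFF/WAVE header describing PCM, mono,
// 16 kHz, 16-bit audio.
bool is_supported_wav(const std::vector<std::uint8_t>& header) {
  if (header.size() < 36) return false;  // need bytes 0..35
  if (std::memcmp(header.data(), "RIFF", 4) != 0) return false;
  if (std::memcmp(header.data() + 8, "WAVE", 4) != 0) return false;
  // Assumption: "fmt " directly follows the RIFF header.
  if (std::memcmp(header.data() + 12, "fmt ", 4) != 0) return false;

  const std::uint32_t audio_format = read_le(header, 20, 2);  // 1 = PCM
  const std::uint32_t channels = read_le(header, 22, 2);
  const std::uint32_t sample_rate = read_le(header, 24, 4);
  const std::uint32_t bits_per_sample = read_le(header, 34, 2);

  return audio_format == 1 && channels == 1 &&
         sample_rate == 16000 && bits_per_sample == 16;
}
```

Passing the first 44 bytes of a file is enough for this check. Files that carry extra chunks before ``fmt`` are rejected by this sketch even though they may be valid WAVE files.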
 Running ``deepspeech`` via adb
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 You should use ``adb push`` to send data to the device; please refer to the Android
 documentation on how to use it.
 Please push DeepSpeech data to ``/sdcard/deepspeech/``\ , including:
 * ``output_graph.tflite`` which is the TF Lite model
 * ``kenlm.scorer``, if you want to use the scorer; please be aware that a scorer
  that is too large will make the device run out of memory
 Then, push binaries from ``native_client.tar.xz`` to ``/data/local/tmp/ds``\ :
 * ``deepspeech``
 * ``libdeepspeech.so``
 * ``libc++_shared.so``
 You should then be able to run as usual, using a shell from ``adb shell``\ :
 .. code-block::
   user@device$ cd /data/local/tmp/ds/
   user@device$ LD_LIBRARY_PATH=$(pwd)/ ./deepspeech [...]
 Please note that the Android linker does not support ``rpath`` so you have to set
 ``LD_LIBRARY_PATH``. Properly wrapped / packaged bindings do embed the library
 in a place where the linker knows to search, so Android apps will be fine.
 Delegation API
 ^^^^^^^^^^^^^^
 TensorFlow Lite supports a Delegate API to offload some computation from the main
 CPU. Please refer to `TensorFlow's documentation
 <https://www.tensorflow.org/lite/performance/delegates>`_ for details.
 To ease experimentation, we have enabled some of those delegates on our
 Android builds:
 * GPU, to leverage OpenGL capabilities
 * NNAPI, the Android API to leverage GPU / DSP / NPU
 * Hexagon, the Qualcomm-specific DSP
 This is highly experimental:
 * Requires passing environment variable ``DS_TFLITE_DELEGATE`` with values of
  ``gpu``, ``nnapi`` or ``hexagon`` (only one at a time)
 * Might require exported model changes (some Op might not be supported)
 * We can't guarantee it will work, nor that it will be faster than the default
  implementation
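One way a native host process might pass the variable is to export it before the model is created. This is a sketch under stated assumptions, not part of the DeepSpeech API: ``select_delegate`` and ``requested_delegate`` are hypothetical helpers, and the process is assumed to run on a POSIX system where ``setenv`` is available.

```cpp
#include <stdlib.h>  // POSIX setenv

#include <cstdlib>
#include <set>
#include <string>

// Hypothetical helper: export DS_TFLITE_DELEGATE if `name` is one of the
// documented values. Only one delegate can be requested at a time.
bool select_delegate(const std::string& name) {
  static const std::set<std::string> known = {"gpu", "nnapi", "hexagon"};
  if (known.count(name) == 0) {
    return false;  // unknown value: leave the environment untouched
  }
  return setenv("DS_TFLITE_DELEGATE", name.c_str(), /* overwrite */ 1) == 0;
}

// Hypothetical helper: read the variable back, treating "unset" as an
// empty string so callers can compare without null checks.
std::string requested_delegate() {
  const char* value = std::getenv("DS_TFLITE_DELEGATE");
  return value != nullptr ? value : "";
}
```

A caller would invoke ``select_delegate("nnapi")`` early, before creating the model. Values are matched exactly and in lowercase here, which follows the values listed above.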
 Feedback on improving this is welcome: how it could be exposed in the API, how
 much performance you gain in your applications, how you had to change
 the model to make it work with a delegate, etc.
 See :ref:`the support / contact details <support>`
--- a/doc/index.rst
+++ b/doc/index.rst
@ -60,6 +60,10 @@ See the output of ``deepspeech -h`` for more information on the use of ``deepspe
   SUPPORTED_PLATFORMS
   BUILDING
 .. include:: ../SUPPORT.rst
 .. toctree::
   :maxdepth: 2
   :caption: Decoder and scorer
--- a/native_client/BUILD
+++ b/native_client/BUILD
@ -139,6 +139,7 @@ tf_cc_shared_object(
    deps = select({
        "//native_client:tflite": [
            "//tensorflow/lite/kernels:builtin_ops",
            "//tensorflow/lite/tools/evaluation:utils",
        ],
        "//conditions:default": [
            "//tensorflow/core:core_cpu",
--- a/native_client/java/README.rst
+++ b/native_client/java/README.rst
@ -1,72 +0,0 @@
 DeepSpeech Java / Android bindings
 ==================================
 This is still preliminary work. Please refer to ``native_client/README.rst`` for
 building ``libdeepspeech.so`` and ``deepspeech`` binary for Android on ARMv7 and
 ARM64 arch.
 Android Java / JNI bindings: ``libdeepspeech``
 ==================================================
 Java / JNI bindings are available under the ``libdeepspeech`` subdirectory.
 Building depends on prebuilt shared object.  Please ensure to place
 ``libdeepspeech.so`` into the ``libdeepspeech/libs/{arm64-v8a,armeabi-v7a}/``
 matching subdirectories.
 Building the bindings is managed by ``gradle`` and should be limited to issuing
 ``./gradlew libdeepspeech:build``\ , producing an ``AAR`` package in
 ``./libdeepspeech/build/outputs/aar/``. This can later be used by other
 Gradle-based build with the following configuration:
 .. code-block::
   implementation 'deepspeech.mozilla.org:libdeepspeech:VERSION@aar'
 Please note that you might have to copy the file to a local Maven repository
 and adapt file naming (when missing, the error message should states what
 filename it expects and where).
 Android demo APK
 ================
 Provided is a very simple Android demo app that allows you to test the library.
 You can build it with ``make apk`` and install the resulting APK file. Please
 refer to Gradle documentation for more details.
 The ``APK`` should be produced in ``/app/build/outputs/apk/``. This demo app might
 require external storage permissions. You can then push models files to your
 device, set the path to the file in the UI and try to run on an audio file.
 When running, it should first play the audio file and then run the decoding. At
 the end of the decoding, you should be presented with the decoded text as well
 as time elapsed to decode in miliseconds.
 Running ``deepspeech`` via adb
 ==================================
 You should use ``adb push`` to send data to device, please refer to Android
 documentation on how to use that.
 Please push DeepSpeech data to ``/sdcard/deepspeech/``\ , including:
 * ``output_graph.tflite`` which is the TF Lite model
 * ``kenlm.scorer``, if you want to use the scorer; please be aware that too big
  scorer will make the device run out of memory
 Then, push binaries from ``native_client.tar.xz`` to ``/data/local/tmp/ds``\ :
 * ``deepspeech``
 * ``libdeepspeech.so``
 * ``libc++_shared.so``
 You should then be able to run as usual, using a shell from ``adb shell``\ :
 .. code-block::
   user@device$ cd /data/local/tmp/ds/
   user@device$ LD_LIBRARY_PATH=$(pwd)/ ./deepspeech [...]
 Please note that Android linker does not support ``rpath`` so you have to set
 ``LD_LIBRARY_PATH``. Properly wrapped / packaged bindings does embed the library
 at a place the linker knows where to search, so Android apps will be fine.
--- a/native_client/tflitemodelstate.cc
+++ b/native_client/tflitemodelstate.cc
@ -1,8 +1,17 @@
 #include "tflitemodelstate.h"
 #include "tensorflow/lite/string_util.h"
 #include "workspace_status.h"
 #ifdef __ANDROID__
 #include <android/log.h>
 #define  LOG_TAG    "libdeepspeech"
 #define  LOGD(...)  __android_log_print(ANDROID_LOG_DEBUG, LOG_TAG, __VA_ARGS__)
 #define  LOGE(...)  __android_log_print(ANDROID_LOG_ERROR, LOG_TAG, __VA_ARGS__)
 #else
 #define  LOGD(...)
 #define  LOGE(...)
 #endif // __ANDROID__
 using namespace tflite;
 using std::vector;
@ -90,6 +99,62 @@ TFLiteModelState::~TFLiteModelState()
 {
 }
 std::map<std::string, tflite::Interpreter::TfLiteDelegatePtr>
 getTfliteDelegates()
 {
  std::map<std::string, tflite::Interpreter::TfLiteDelegatePtr> delegates;
  const char* env_delegate_c = std::getenv("DS_TFLITE_DELEGATE");
  std::string env_delegate = (env_delegate_c != nullptr) ? env_delegate_c : "";
 #ifdef __ANDROID__
  if (env_delegate == std::string("gpu")) {
    LOGD("Trying to get GPU delegate ...");
    // Try to get GPU delegate
    {
      tflite::Interpreter::TfLiteDelegatePtr delegate = evaluation::CreateGPUDelegate();
      if (!delegate) {
        LOGD("GPU delegation not supported");
      } else {
        LOGD("GPU delegation supported");
        delegates.emplace("GPU", std::move(delegate));
      }
    }
  }
  if (env_delegate == std::string("nnapi")) {
    LOGD("Trying to get NNAPI delegate ...");
    // Try to get Android NNAPI delegate
    {
      tflite::Interpreter::TfLiteDelegatePtr delegate = evaluation::CreateNNAPIDelegate();
      if (!delegate) {
        LOGD("NNAPI delegation not supported");
      } else {
        LOGD("NNAPI delegation supported");
        delegates.emplace("NNAPI", std::move(delegate));
      }
    }
  }
  if (env_delegate == std::string("hexagon")) {
    LOGD("Trying to get Hexagon delegate ...");
    // Try to get Android Hexagon delegate
    {
      const std::string libhexagon_path("/data/local/tmp");
      tflite::Interpreter::TfLiteDelegatePtr delegate = evaluation::CreateHexagonDelegate(libhexagon_path, /* profiler */ false);
      if (!delegate) {
        LOGD("Hexagon delegation not supported");
      } else {
        LOGD("Hexagon delegation supported");
        delegates.emplace("Hexagon", std::move(delegate));
      }
    }
  }
 #endif // __ANDROID__
  return delegates;
 }
 int
 TFLiteModelState::init(const char* model_path)
 {
@ -111,9 +176,21 @@ TFLiteModelState::init(const char* model_path)
    return DS_ERR_FAIL_INTERPRETER;
  }
  LOGD("Trying to detect delegates ...");
  std::map<std::string, tflite::Interpreter::TfLiteDelegatePtr> delegates = getTfliteDelegates();
  LOGD("Finished enumerating delegates ...");
  interpreter_->AllocateTensors();
  interpreter_->SetNumThreads(4);
  LOGD("Trying to use delegates ...");
  for (const auto& delegate : delegates) {
    LOGD("Trying to apply delegate %s", delegate.first.c_str());
    if (interpreter_->ModifyGraphWithDelegate(delegate.second.get()) != kTfLiteOk) {
      LOGD("FAILED to apply delegate %s to the graph", delegate.first.c_str());
    }
  }
  // Query all the index once
  input_node_idx_       = get_input_tensor_by_name("input_node");
  previous_state_c_idx_ = get_input_tensor_by_name("previous_state_c");
--- a/native_client/tflitemodelstate.h
+++ b/native_client/tflitemodelstate.h
@ -6,6 +6,7 @@
 #include "tensorflow/lite/model.h"
 #include "tensorflow/lite/kernels/register.h"
 #include "tensorflow/lite/tools/evaluation/utils.h"
 #include "modelstate.h"
--- a/taskcluster/.shared.yml
+++ b/taskcluster/.shared.yml
@ -142,8 +142,8 @@ system:
      namespace: "project.deepspeech.swig.win.amd64.b5fea54d39832d1d132d7dd921b69c0c2c9d5118"
  tensorflow:
    linux_amd64_cpu:
-      url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.cpu/artifacts/public/home.tar.xz"
+      url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.1.cpu/artifacts/public/home.tar.xz"
-      namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.cpu"
+      namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.1.cpu"
    linux_amd64_cuda:
      url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.cuda/artifacts/public/home.tar.xz"
      namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.cuda"
@ -157,11 +157,11 @@ system:
      url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.osx/artifacts/public/home.tar.xz"
      namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.osx"
    android_arm64:
-      url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.android-arm64/artifacts/public/home.tar.xz"
+      url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.1.android-arm64/artifacts/public/home.tar.xz"
-      namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.android-arm64"
+      namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.1.android-arm64"
    android_armv7:
-      url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.android-armv7/artifacts/public/home.tar.xz"
+      url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.1.android-armv7/artifacts/public/home.tar.xz"
-      namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.android-armv7"
+      namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.1.android-armv7"
    win_amd64_cpu:
      url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.win/artifacts/public/home.tar.xz"
      namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.win"
--- a/taskcluster/android-build.sh
+++ b/taskcluster/android-build.sh
@ -22,7 +22,7 @@ if [ "${arm_flavor}" = "arm64-v8a" ]; then
 fi
 if [ "${arm_flavor}" = "x86_64" ]; then
-    LOCAL_ANDROID_FLAGS="--config=android --cpu=x86_64 --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++11 --copt=-D_GLIBCXX_USE_C99"
+    LOCAL_ANDROID_FLAGS="--config=android --cpu=x86_64 --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++14 --copt=-D_GLIBCXX_USE_C99"
 fi
 BAZEL_BUILD_FLAGS="--define=runtime=tflite ${LOCAL_ANDROID_FLAGS} ${BAZEL_EXTRA_FLAGS}"
--- a/taskcluster/tf_tc-vars.sh
+++ b/taskcluster/tf_tc-vars.sh
@ -168,8 +168,8 @@ else
 fi
 BAZEL_ARM_FLAGS="--config=rpi3 --config=rpi3_opt --copt=-DTFLITE_WITH_RUY_GEMV"
 BAZEL_ARM64_FLAGS="--config=rpi3-armv8 --config=rpi3-armv8_opt --copt=-DTFLITE_WITH_RUY_GEMV"
-BAZEL_ANDROID_ARM_FLAGS="--config=android --config=android_arm --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++11 --copt=-D_GLIBCXX_USE_C99 --copt=-DTFLITE_WITH_RUY_GEMV"
+BAZEL_ANDROID_ARM_FLAGS="--config=android --config=android_arm --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++14 --copt=-D_GLIBCXX_USE_C99 --copt=-DTFLITE_WITH_RUY_GEMV"
-BAZEL_ANDROID_ARM64_FLAGS="--config=android --config=android_arm64 --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++11 --copt=-D_GLIBCXX_USE_C99 --copt=-DTFLITE_WITH_RUY_GEMV"
+BAZEL_ANDROID_ARM64_FLAGS="--config=android --config=android_arm64 --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++14 --copt=-D_GLIBCXX_USE_C99 --copt=-DTFLITE_WITH_RUY_GEMV"
 BAZEL_CUDA_FLAGS="--config=cuda"
 if [ "${OS}" = "${TC_MSYS_VERSION}" ]; then