Enable TFLite delegations

Alexandre Lissy 2019-07-23 14:00:08 +02:00
parent d2d46c3aee
commit 67004ca137
10 changed files with 213 additions and 85 deletions

@ -1,3 +1,5 @@
.. _support:
Contact/Getting Help Contact/Getting Help
==================== ====================

@ -1,7 +1,11 @@
.. _build-native-client:
Building DeepSpeech Binaries Building DeepSpeech Binaries
============================ ============================
This section describes how to rebuild binaries. Prebuilt binaries are already available for all supported platforms,
and using them is highly advised unless you know what you are doing.
If you'd like to build the DeepSpeech binaries yourself, you'll need the following pre-requisites downloaded and installed: If you'd like to build the DeepSpeech binaries yourself, you'll need the following pre-requisites downloaded and installed:
* `Bazel 2.0.0 <https://github.com/bazelbuild/bazel/releases/tag/2.0.0>`_ * `Bazel 2.0.0 <https://github.com/bazelbuild/bazel/releases/tag/2.0.0>`_
@ -165,13 +169,31 @@ The path of the system tree can be overridden from the default values defined in
cd ../DeepSpeech/native_client cd ../DeepSpeech/native_client
make TARGET=<system> deepspeech make TARGET=<system> deepspeech
Android devices Android devices support
^^^^^^^^^^^^^^^ -----------------------
We have preliminary support for Android relying on TensorFlow Lite, with Java and JNI bindinds. For more details on how to experiment with those, please refer to ``native_client/java/README.rst``. We have support for Android relying on TensorFlow Lite, with Java and JNI bindings. For more details on how to experiment with those, please refer to the section below.
Please refer to TensorFlow documentation on how to setup the environment to build for Android (SDK and NDK required). Please refer to TensorFlow documentation on how to setup the environment to build for Android (SDK and NDK required).
Using the library from Android project
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
We provide an up-to-date and tested ``libdeepspeech`` usable as an ``AAR`` package,
for Android versions from 7.0 to 11.0. The package is published on
`JCenter <https://bintray.com/alissy/org.mozilla.deepspeech/libdeepspeech>`_,
and the ``JCenter`` repository should be available by default in any Android
project. Please make sure your project is set up to pull from this repository.
You can then include the library by adding this line to your
``build.gradle``, adjusting ``VERSION`` to the version you need:
.. code-block::
implementation 'deepspeech.mozilla.org:libdeepspeech:VERSION@aar'
Building ``libdeepspeech.so``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You can build the ``libdeepspeech.so`` using (ARMv7): You can build the ``libdeepspeech.so`` using (ARMv7):
.. code-block:: .. code-block::
@ -184,6 +206,25 @@ Or (ARM64):
bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic --config=android --config=android_arm64 --define=runtime=tflite --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++11 --copt=-D_GLIBCXX_USE_C99 //native_client:libdeepspeech.so bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic --config=android --config=android_arm64 --define=runtime=tflite --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++11 --copt=-D_GLIBCXX_USE_C99 //native_client:libdeepspeech.so
Building ``libdeepspeech.aar``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In the unlikely event you have to rebuild the JNI bindings, source code is
available under the ``libdeepspeech`` subdirectory. Building depends on the shared
object: please make sure to place ``libdeepspeech.so`` into the matching
``libdeepspeech/libs/{arm64-v8a,armeabi-v7a,x86_64}/`` subdirectories.
Building the bindings is managed by ``gradle`` and should be limited to issuing
``./gradlew libdeepspeech:build``, producing an ``AAR`` package in
``./libdeepspeech/build/outputs/aar/``.
Please note that you might have to copy the file to a local Maven repository
and adapt the file naming (when the file is missing, the error message should state what
filename it expects and where).
Building C++ ``deepspeech`` binary
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Building the ``deepspeech`` binary will happen through ``ndk-build`` (ARMv7): Building the ``deepspeech`` binary will happen through ``ndk-build`` (ARMv7):
.. code-block:: .. code-block::
@ -197,3 +238,77 @@ And (ARM64):
cd ../DeepSpeech/native_client cd ../DeepSpeech/native_client
$ANDROID_NDK_HOME/ndk-build APP_PLATFORM=android-21 APP_BUILD_SCRIPT=$(pwd)/Android.mk NDK_PROJECT_PATH=$(pwd) APP_STL=c++_shared TFDIR=$(pwd)/../tensorflow/ TARGET_ARCH_ABI=arm64-v8a $ANDROID_NDK_HOME/ndk-build APP_PLATFORM=android-21 APP_BUILD_SCRIPT=$(pwd)/Android.mk NDK_PROJECT_PATH=$(pwd) APP_STL=c++_shared TFDIR=$(pwd)/../tensorflow/ TARGET_ARCH_ABI=arm64-v8a
Android demo APK
^^^^^^^^^^^^^^^^
Provided is a very simple Android demo app that allows you to test the library.
You can build it with ``make apk`` and install the resulting APK file. Please
refer to Gradle documentation for more details.
The ``APK`` should be produced in ``/app/build/outputs/apk/``. This demo app might
require external storage permissions. You can then push model files to your
device, set the path to the file in the UI and try to run on an audio file.
When running, it should first play the audio file and then run the decoding. At
the end of the decoding, you should be presented with the decoded text as well
as the time elapsed to decode in milliseconds.
This application is very limited on purpose, and is only here as a very basic
demo of one usage of the library. For example, it is only able to read PCM
mono 16 kHz 16-bit files and it might fail on some WAVE files that do not
follow the specification exactly.
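As a rough illustration of the format constraint above, the following C++ sketch checks whether a buffer starts with a canonical 44-byte WAVE header describing PCM mono 16 kHz 16-bit audio. The helper names (``check_wav_header``, ``make_wav_header``) are hypothetical and are not part of DeepSpeech or the demo app. The sketch assumes the ``fmt`` chunk immediately follows the RIFF header and does not inspect any later chunk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Read a little-endian unsigned integer of `size` bytes starting at `offset`.
static std::uint32_t read_le(const std::vector<std::uint8_t>& buf,
                             std::size_t offset, std::size_t size) {
  assert(size <= 4 && offset + size <= buf.size());
  std::uint32_t value = 0;
  for (std::size_t i = 0; i < size; ++i) {
    value |= static_cast<std::uint32_t>(buf[offset + i]) << (8 * i);
  }
  return value;
}

// Write `value` as a little-endian integer of `size` bytes at `offset`.
static void write_le(std::vector<std::uint8_t>& buf, std::size_t offset,
                     std::uint32_t value, std::size_t size) {
  assert(size <= 4 && offset + size <= buf.size());
  for (std::size_t i = 0; i < size; ++i) {
    buf[offset + i] = static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF);
  }
}

// Hypothetical helper (not a DeepSpeech API): returns an empty string when the
// header describes PCM mono 16 kHz 16-bit audio, otherwise a short reason.
std::string check_wav_header(const std::vector<std::uint8_t>& buf) {
  if (buf.size() < 44) return "too short for a canonical WAVE header";
  if (std::memcmp(buf.data(), "RIFF", 4) != 0 ||
      std::memcmp(buf.data() + 8, "WAVE", 4) != 0) {
    return "not a RIFF/WAVE file";
  }
  if (std::memcmp(buf.data() + 12, "fmt ", 4) != 0) {
    return "fmt chunk does not come first";
  }
  if (read_le(buf, 20, 2) != 1) return "not PCM";        // audio format
  if (read_le(buf, 22, 2) != 1) return "not mono";       // channel count
  if (read_le(buf, 24, 4) != 16000) return "not 16 kHz"; // sample rate
  if (read_le(buf, 34, 2) != 16) return "not 16-bit";    // bits per sample
  return "";
}

// Hypothetical helper: build a canonical 44-byte PCM header with no samples,
// only so that the check above can be exercised without a file.
std::vector<std::uint8_t> make_wav_header(std::uint16_t channels,
                                          std::uint32_t sample_rate,
                                          std::uint16_t bits) {
  std::vector<std::uint8_t> h(44, 0);
  std::memcpy(h.data(), "RIFF", 4);
  write_le(h, 4, 36, 4);                 // RIFF chunk size for empty data
  std::memcpy(h.data() + 8, "WAVE", 4);
  std::memcpy(h.data() + 12, "fmt ", 4);
  write_le(h, 16, 16, 4);                // fmt chunk size for PCM
  write_le(h, 20, 1, 2);                 // audio format 1 = PCM
  write_le(h, 22, channels, 2);
  write_le(h, 24, sample_rate, 4);
  write_le(h, 28, sample_rate * channels * bits / 8, 4);  // byte rate
  write_le(h, 32, channels * bits / 8, 2);                // block align
  write_le(h, 34, bits, 2);
  std::memcpy(h.data() + 36, "data", 4);
  write_le(h, 40, 0, 4);                 // no sample data
  return h;
}
```

Running such a check on the first 44 bytes of a file before pushing it to the device might help tell a format mismatch apart from a decoding problem.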
Running ``deepspeech`` via adb
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You should use ``adb push`` to send data to the device. Please refer to the Android
documentation on how to use it.
Please push DeepSpeech data to ``/sdcard/deepspeech/``\ , including:
* ``output_graph.tflite`` which is the TF Lite model
* ``kenlm.scorer``, if you want to use the scorer; please be aware that a scorer that is too big
will make the device run out of memory
Then, push binaries from ``native_client.tar.xz`` to ``/data/local/tmp/ds``\ :
* ``deepspeech``
* ``libdeepspeech.so``
* ``libc++_shared.so``
You should then be able to run as usual, using a shell from ``adb shell``\ :
.. code-block::
user@device$ cd /data/local/tmp/ds/
user@device$ LD_LIBRARY_PATH=$(pwd)/ ./deepspeech [...]
Please note that the Android linker does not support ``rpath`` so you have to set
``LD_LIBRARY_PATH``. Properly wrapped / packaged bindings embed the library
in a place where the linker knows to search, so Android apps will be fine.
Delegation API
^^^^^^^^^^^^^^
TensorFlow Lite supports a Delegate API to offload some computation from the main
CPU. Please refer to `TensorFlow's documentation
<https://www.tensorflow.org/lite/performance/delegates>`_ for details.
To ease experimentation, we have enabled some of those delegates on our
Android builds:
* GPU, to leverage OpenGL capabilities
* NNAPI, the Android API to leverage GPU / DSP / NPU
* Hexagon, the Qualcomm-specific DSP
This is highly experimental:
* Requires passing environment variable ``DS_TFLITE_DELEGATE`` with values of
``gpu``, ``nnapi`` or ``hexagon`` (only one at a time)
* Might require changes to the exported model (some ops might not be supported)
* We can't guarantee that it will work, nor that it will be faster than the default
implementation
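The selection rule in the list above (exactly one of ``gpu``, ``nnapi`` or ``hexagon``, read from ``DS_TFLITE_DELEGATE``) can be sketched in C++ as follows. This is only an illustration: ``DelegateKind``, ``parse_delegate``, ``delegate_name`` and ``delegate_from_env`` are hypothetical names, not part of the DeepSpeech or TensorFlow Lite APIs, and the sketch does not create any delegate. It assumes an exact, case-sensitive match on the value, so anything else (including a comma-separated list) selects no delegate.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical enumeration of the delegates named in the documentation.
enum class DelegateKind { None, Gpu, Nnapi, Hexagon };

// Map a raw environment value to a delegate kind. The match is exact and
// case-sensitive; a null pointer or any other value means "no delegate".
DelegateKind parse_delegate(const char* value) {
  const std::string name = (value != nullptr) ? value : "";
  if (name == "gpu") return DelegateKind::Gpu;
  if (name == "nnapi") return DelegateKind::Nnapi;
  if (name == "hexagon") return DelegateKind::Hexagon;
  return DelegateKind::None;
}

// Human-readable name, for instance for log messages.
const char* delegate_name(DelegateKind kind) {
  switch (kind) {
    case DelegateKind::Gpu: return "gpu";
    case DelegateKind::Nnapi: return "nnapi";
    case DelegateKind::Hexagon: return "hexagon";
    case DelegateKind::None: return "none";
  }
  assert(false && "unhandled DelegateKind");
  return "none";
}

// Read the selection from the process environment.
DelegateKind delegate_from_env() {
  return parse_delegate(std::getenv("DS_TFLITE_DELEGATE"));
}
```

Because only a single value is read, requesting several delegates at once is not possible in this scheme, which matches the "only one at a time" restriction above.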
Feedback on improving this is welcome: how it could be exposed in the API, how
much performance gain you get in your applications, how you had to change
the model to make it work with a delegate, etc.
See :ref:`the support / contact details <support>`

@ -60,6 +60,10 @@ See the output of ``deepspeech -h`` for more information on the use of ``deepspe
SUPPORTED_PLATFORMS SUPPORTED_PLATFORMS
BUILDING
.. include:: ../SUPPORT.rst
.. toctree:: .. toctree::
:maxdepth: 2 :maxdepth: 2
:caption: Decoder and scorer :caption: Decoder and scorer

@ -139,6 +139,7 @@ tf_cc_shared_object(
deps = select({ deps = select({
"//native_client:tflite": [ "//native_client:tflite": [
"//tensorflow/lite/kernels:builtin_ops", "//tensorflow/lite/kernels:builtin_ops",
"//tensorflow/lite/tools/evaluation:utils",
], ],
"//conditions:default": [ "//conditions:default": [
"//tensorflow/core:core_cpu", "//tensorflow/core:core_cpu",

@ -1,72 +0,0 @@
DeepSpeech Java / Android bindings
==================================
This is still preliminary work. Please refer to ``native_client/README.rst`` for
building ``libdeepspeech.so`` and ``deepspeech`` binary for Android on ARMv7 and
ARM64 arch.
Android Java / JNI bindings: ``libdeepspeech``
==================================================
Java / JNI bindings are available under the ``libdeepspeech`` subdirectory.
Building depends on prebuilt shared object. Please ensure to place
``libdeepspeech.so`` into the ``libdeepspeech/libs/{arm64-v8a,armeabi-v7a}/``
matching subdirectories.
Building the bindings is managed by ``gradle`` and should be limited to issuing
``./gradlew libdeepspeech:build``\ , producing an ``AAR`` package in
``./libdeepspeech/build/outputs/aar/``. This can later be used by other
Gradle-based build with the following configuration:
.. code-block::
implementation 'deepspeech.mozilla.org:libdeepspeech:VERSION@aar'
Please note that you might have to copy the file to a local Maven repository
and adapt file naming (when missing, the error message should states what
filename it expects and where).
Android demo APK
================
Provided is a very simple Android demo app that allows you to test the library.
You can build it with ``make apk`` and install the resulting APK file. Please
refer to Gradle documentation for more details.
The ``APK`` should be produced in ``/app/build/outputs/apk/``. This demo app might
require external storage permissions. You can then push models files to your
device, set the path to the file in the UI and try to run on an audio file.
When running, it should first play the audio file and then run the decoding. At
the end of the decoding, you should be presented with the decoded text as well
as time elapsed to decode in miliseconds.
Running ``deepspeech`` via adb
==================================
You should use ``adb push`` to send data to device, please refer to Android
documentation on how to use that.
Please push DeepSpeech data to ``/sdcard/deepspeech/``\ , including:
* ``output_graph.tflite`` which is the TF Lite model
* ``kenlm.scorer``, if you want to use the scorer; please be aware that too big
scorer will make the device run out of memory
Then, push binaries from ``native_client.tar.xz`` to ``/data/local/tmp/ds``\ :
* ``deepspeech``
* ``libdeepspeech.so``
* ``libc++_shared.so``
You should then be able to run as usual, using a shell from ``adb shell``\ :
.. code-block::
user@device$ cd /data/local/tmp/ds/
user@device$ LD_LIBRARY_PATH=$(pwd)/ ./deepspeech [...]
Please note that Android linker does not support ``rpath`` so you have to set
``LD_LIBRARY_PATH``. Properly wrapped / packaged bindings does embed the library
at a place the linker knows where to search, so Android apps will be fine.

@ -1,8 +1,17 @@
#include "tflitemodelstate.h" #include "tflitemodelstate.h"
#include "tensorflow/lite/string_util.h" #include "tensorflow/lite/string_util.h"
#include "workspace_status.h" #include "workspace_status.h"
#ifdef __ANDROID__
#include <android/log.h>
#define LOG_TAG "libdeepspeech"
#define LOGD(...) __android_log_print(ANDROID_LOG_DEBUG, LOG_TAG, __VA_ARGS__)
#define LOGE(...) __android_log_print(ANDROID_LOG_ERROR, LOG_TAG, __VA_ARGS__)
#else
#define LOGD(...)
#define LOGE(...)
#endif // __ANDROID__
using namespace tflite; using namespace tflite;
using std::vector; using std::vector;
@ -90,6 +99,62 @@ TFLiteModelState::~TFLiteModelState()
{ {
} }
std::map<std::string, tflite::Interpreter::TfLiteDelegatePtr>
getTfliteDelegates()
{
std::map<std::string, tflite::Interpreter::TfLiteDelegatePtr> delegates;
const char* env_delegate_c = std::getenv("DS_TFLITE_DELEGATE");
std::string env_delegate = (env_delegate_c != nullptr) ? env_delegate_c : "";
#ifdef __ANDROID__
if (env_delegate == std::string("gpu")) {
LOGD("Trying to get GPU delegate ...");
// Try to get GPU delegate
{
tflite::Interpreter::TfLiteDelegatePtr delegate = evaluation::CreateGPUDelegate();
if (!delegate) {
LOGD("GPU delegation not supported");
} else {
LOGD("GPU delegation supported");
delegates.emplace("GPU", std::move(delegate));
}
}
}
if (env_delegate == std::string("nnapi")) {
LOGD("Trying to get NNAPI delegate ...");
// Try to get Android NNAPI delegate
{
tflite::Interpreter::TfLiteDelegatePtr delegate = evaluation::CreateNNAPIDelegate();
if (!delegate) {
LOGD("NNAPI delegation not supported");
} else {
LOGD("NNAPI delegation supported");
delegates.emplace("NNAPI", std::move(delegate));
}
}
}
if (env_delegate == std::string("hexagon")) {
LOGD("Trying to get Hexagon delegate ...");
// Try to get Android Hexagon delegate
{
const std::string libhexagon_path("/data/local/tmp");
tflite::Interpreter::TfLiteDelegatePtr delegate = evaluation::CreateHexagonDelegate(libhexagon_path, /* profiler */ false);
if (!delegate) {
LOGD("Hexagon delegation not supported");
} else {
LOGD("Hexagon delegation supported");
delegates.emplace("Hexagon", std::move(delegate));
}
}
}
#endif // __ANDROID__
return delegates;
}
int int
TFLiteModelState::init(const char* model_path) TFLiteModelState::init(const char* model_path)
{ {
@ -111,9 +176,21 @@ TFLiteModelState::init(const char* model_path)
return DS_ERR_FAIL_INTERPRETER; return DS_ERR_FAIL_INTERPRETER;
} }
LOGD("Trying to detect delegates ...");
std::map<std::string, tflite::Interpreter::TfLiteDelegatePtr> delegates = getTfliteDelegates();
LOGD("Finished enumerating delegates ...");
interpreter_->AllocateTensors(); interpreter_->AllocateTensors();
interpreter_->SetNumThreads(4); interpreter_->SetNumThreads(4);
LOGD("Trying to use delegates ...");
for (const auto& delegate : delegates) {
LOGD("Trying to apply delegate %s", delegate.first.c_str());
if (interpreter_->ModifyGraphWithDelegate(delegate.second.get()) != kTfLiteOk) {
LOGD("FAILED to apply delegate %s to the graph", delegate.first.c_str());
}
}
// Query all the index once // Query all the index once
input_node_idx_ = get_input_tensor_by_name("input_node"); input_node_idx_ = get_input_tensor_by_name("input_node");
previous_state_c_idx_ = get_input_tensor_by_name("previous_state_c"); previous_state_c_idx_ = get_input_tensor_by_name("previous_state_c");

@ -6,6 +6,7 @@
#include "tensorflow/lite/model.h" #include "tensorflow/lite/model.h"
#include "tensorflow/lite/kernels/register.h" #include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/tools/evaluation/utils.h"
#include "modelstate.h" #include "modelstate.h"

@ -142,8 +142,8 @@ system:
namespace: "project.deepspeech.swig.win.amd64.b5fea54d39832d1d132d7dd921b69c0c2c9d5118" namespace: "project.deepspeech.swig.win.amd64.b5fea54d39832d1d132d7dd921b69c0c2c9d5118"
tensorflow: tensorflow:
linux_amd64_cpu: linux_amd64_cpu:
url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.cpu/artifacts/public/home.tar.xz" url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.1.cpu/artifacts/public/home.tar.xz"
namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.cpu" namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.1.cpu"
linux_amd64_cuda: linux_amd64_cuda:
url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.cuda/artifacts/public/home.tar.xz" url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.cuda/artifacts/public/home.tar.xz"
namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.cuda" namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.cuda"
@ -157,11 +157,11 @@ system:
url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.osx/artifacts/public/home.tar.xz" url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.osx/artifacts/public/home.tar.xz"
namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.osx" namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.osx"
android_arm64: android_arm64:
url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.android-arm64/artifacts/public/home.tar.xz" url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.1.android-arm64/artifacts/public/home.tar.xz"
namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.android-arm64" namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.1.android-arm64"
android_armv7: android_armv7:
url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.android-armv7/artifacts/public/home.tar.xz" url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.1.android-armv7/artifacts/public/home.tar.xz"
namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.android-armv7" namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.1.android-armv7"
win_amd64_cpu: win_amd64_cpu:
url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.win/artifacts/public/home.tar.xz" url: "https://community-tc.services.mozilla.com/api/index/v1/task/project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.win/artifacts/public/home.tar.xz"
namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.win" namespace: "project.deepspeech.tensorflow.pip.r2.2.518c1d04bf55d362bb11e973b8f5d0aa3e5bf44d.0.win"

@ -22,7 +22,7 @@ if [ "${arm_flavor}" = "arm64-v8a" ]; then
fi fi
if [ "${arm_flavor}" = "x86_64" ]; then if [ "${arm_flavor}" = "x86_64" ]; then
LOCAL_ANDROID_FLAGS="--config=android --cpu=x86_64 --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++11 --copt=-D_GLIBCXX_USE_C99" LOCAL_ANDROID_FLAGS="--config=android --cpu=x86_64 --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++14 --copt=-D_GLIBCXX_USE_C99"
fi fi
BAZEL_BUILD_FLAGS="--define=runtime=tflite ${LOCAL_ANDROID_FLAGS} ${BAZEL_EXTRA_FLAGS}" BAZEL_BUILD_FLAGS="--define=runtime=tflite ${LOCAL_ANDROID_FLAGS} ${BAZEL_EXTRA_FLAGS}"

@ -168,8 +168,8 @@ else
fi fi
BAZEL_ARM_FLAGS="--config=rpi3 --config=rpi3_opt --copt=-DTFLITE_WITH_RUY_GEMV" BAZEL_ARM_FLAGS="--config=rpi3 --config=rpi3_opt --copt=-DTFLITE_WITH_RUY_GEMV"
BAZEL_ARM64_FLAGS="--config=rpi3-armv8 --config=rpi3-armv8_opt --copt=-DTFLITE_WITH_RUY_GEMV" BAZEL_ARM64_FLAGS="--config=rpi3-armv8 --config=rpi3-armv8_opt --copt=-DTFLITE_WITH_RUY_GEMV"
BAZEL_ANDROID_ARM_FLAGS="--config=android --config=android_arm --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++11 --copt=-D_GLIBCXX_USE_C99 --copt=-DTFLITE_WITH_RUY_GEMV" BAZEL_ANDROID_ARM_FLAGS="--config=android --config=android_arm --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++14 --copt=-D_GLIBCXX_USE_C99 --copt=-DTFLITE_WITH_RUY_GEMV"
BAZEL_ANDROID_ARM64_FLAGS="--config=android --config=android_arm64 --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++11 --copt=-D_GLIBCXX_USE_C99 --copt=-DTFLITE_WITH_RUY_GEMV" BAZEL_ANDROID_ARM64_FLAGS="--config=android --config=android_arm64 --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++14 --copt=-D_GLIBCXX_USE_C99 --copt=-DTFLITE_WITH_RUY_GEMV"
BAZEL_CUDA_FLAGS="--config=cuda" BAZEL_CUDA_FLAGS="--config=cuda"
if [ "${OS}" = "${TC_MSYS_VERSION}" ]; then if [ "${OS}" = "${TC_MSYS_VERSION}" ]; then