Native Client README split (#2002)

Refactoring the `native_client` README.

This PR removes redundancies between the master README and the `native_client` README, keeping only instructions for building in the `native_client` README. 

All installation instructions for built binaries / language bindings remain in the master README.
Josh Meyer 2019-04-11 20:27:59 +02:00 committed by GitHub
parent a05989439e
commit 6fcad513e8
2 changed files with 143 additions and 163 deletions


@@ -32,17 +32,17 @@ See the output of `deepspeech -h` for more information on the use of `deepspeech`
- [Prerequisites](#prerequisites)
- [Getting the code](#getting-the-code)
- [Getting the pre-trained model](#getting-the-pre-trained-model)
- [Using the model](#using-the-model)
- [Using a Pre-trained Model](#using-a-pre-trained-model)
- [CUDA dependency](#cuda-dependency)
- [Getting the pre-trained model](#getting-the-pre-trained-model)
- [Model compatibility](#model-compatibility)
- [Using the Python package](#using-the-python-package)
- [Using the command-line client](#using-the-command-line-client)
- [Using the Node.JS package](#using-the-nodejs-package)
- [Using the Command-Line client](#using-the-command-line-client)
- [Installing bindings from source](#installing-bindings-from-source)
- [Third party bindings](#third-party-bindings)
- [Training](#training)
- [Installing prerequisites for training](#installing-prerequisites-for-training)
- [Training your own Model](#training-your-own-model)
- [Installing training prerequisites](#installing-training-prerequisites)
- [Recommendations](#recommendations)
- [Common Voice training data](#common-voice-training-data)
- [Training a model](#training-a-model)
@@ -68,7 +68,30 @@ Install [Git Large File Storage](https://git-lfs.github.com/) either manually or
git clone https://github.com/mozilla/DeepSpeech
```
## Getting the pre-trained model
## Using a Pre-trained Model
There are three ways to use DeepSpeech inference:
- [The Python package](#using-the-python-package)
- [The Node.JS package](#using-the-nodejs-package)
- [The Command-Line client](#using-the-command-line-client)
Running `deepspeech` might require some runtime dependencies to be already installed on your system. Regardless of which bindings you are using, you will need the following:
* libsox2
* libstdc++6
* libgomp1
* libpthread
Please refer to your system's documentation on how to install these dependencies.
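A quick way to see whether these libraries are already present is to ask the dynamic linker. The snippet below is a sketch: it checks the base library names with `ldconfig`, which assumes a glibc-based Linux system.

```bash
# Sketch: check that the required runtime libraries are visible to the dynamic
# linker before running `deepspeech` (base names of the libraries listed above).
for lib in libsox libstdc++ libgomp libpthread; do
  if ldconfig -p 2>/dev/null | grep -q "$lib"; then
    echo "found: $lib"
  else
    echo "missing: $lib"
  fi
done
```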
### CUDA dependency
The GPU capable builds (Python, Node.JS, C++, etc.) depend on the same CUDA runtime as upstream TensorFlow. Make sure you have installed the correct version of CUDA.
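A quick way to see which CUDA toolkit, if any, is on your `PATH` is to ask `nvcc`. This is only a convenience check: `nvcc` may live outside `PATH` (commonly `/usr/local/cuda/bin`) even when the CUDA runtime is installed.

```bash
# Print the CUDA toolkit version if nvcc is on PATH, otherwise say so.
nvcc --version 2>/dev/null || echo "nvcc not found on PATH"
```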
### Getting the pre-trained model
If you want to use the pre-trained English model for performing speech-to-text, you can download it (along with other important inference material) from the DeepSpeech [releases page](https://github.com/mozilla/DeepSpeech/releases). Alternatively, you can run the following command to download and unzip the model files in your current directory:
@@ -77,19 +100,6 @@ wget https://github.com/mozilla/DeepSpeech/releases/download/v0.4.1/deepspeech-0
tar xvfz deepspeech-0.4.1-models.tar.gz
```
## Using the model
There are three ways to use DeepSpeech inference:
- [The Python package](#using-the-python-package)
- [The command-line client](#using-the-command-line-client)
- [The Node.JS package](#using-the-nodejs-package)
### CUDA dependency
The GPU capable builds (Python, Node.JS, C++, etc.) depend on the same CUDA runtime as upstream TensorFlow. Currently, with TensorFlow 1.13, that means CUDA 10.0 and CuDNN v7.5.
### Model compatibility
DeepSpeech models are versioned to keep you from trying to use an incompatible graph with a newer client after a breaking change was made to the code. If you get an error saying your model file version is too old for the client, you should either upgrade to a newer model release, re-export your model from the checkpoint using a newer version of the code, or downgrade your client if you need to use the old model and can't re-export it.
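The gist of this version gate can be sketched as a simple range check. The variable names and numbers below are purely illustrative, not DeepSpeech's actual model metadata format:

```bash
# Illustrative sketch of the model/client compatibility gate; values are made up.
model_graph_version=2   # version baked into the exported model
client_min_version=3    # oldest graph version this client accepts
client_max_version=4    # newest graph version this client accepts
if [ "$model_graph_version" -ge "$client_min_version" ] &&
   [ "$model_graph_version" -le "$client_max_version" ]; then
  echo "model is compatible with this client"
else
  echo "error: model file version is too old for this client"
fi
```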
@@ -163,9 +173,30 @@ The arguments `--lm` and `--trie` are optional, and represent a language model.
See [client.py](native_client/python/client.py) for an example of how to use the package programmatically.
### Using the command-line client
### Using the Node.JS package
To download the pre-built binaries for the `deepspeech` command-line client, use `util/taskcluster.py`:
You can download the Node.JS bindings using `npm`:
```bash
npm install deepspeech
```
Please note that as of now, we only support Node.JS versions 4, 5 and 6. Once [SWIG has support](https://github.com/swig/swig/pull/968) we can build for newer versions.
Alternatively, if you're using Linux and have a supported NVIDIA GPU, you can install the GPU specific package as follows:
```bash
npm install deepspeech-gpu
```
See the [release notes](https://github.com/mozilla/DeepSpeech/releases) to find which GPUs are supported. Please ensure you have the required [CUDA dependency](#cuda-dependency).
See [client.js](native_client/javascript/client.js) for an example of how to use the bindings. Or download the [wav example](examples/nodejs_wav).
### Using the Command-Line client
To download the pre-built binaries for the `deepspeech` command-line (compiled C++) client, use `util/taskcluster.py`:
```bash
python3 util/taskcluster.py --target .
@@ -193,24 +224,6 @@ Note: the following command assumes you [downloaded the pre-trained model](#gett
See the help output with `./deepspeech -h` and the [native client README](native_client/README.md) for more details.
### Using the Node.JS package
You can download the Node.JS bindings using `npm`:
```bash
npm install deepspeech
```
Alternatively, if you're using Linux and have a supported NVIDIA GPU, you can install the GPU specific package as follows:
```bash
npm install deepspeech-gpu
```
See the [release notes](https://github.com/mozilla/DeepSpeech/releases) to find which GPUs are supported. Please ensure you have the required [CUDA dependency](#cuda-dependency).
See [client.js](native_client/javascript/client.js) for an example of how to use the bindings. Or download the [wav example](examples/nodejs_wav).
### Installing bindings from source
If pre-built binaries aren't available for your system, you'll need to build them from scratch. Follow these [`native_client` build instructions](native_client/README.md).
@@ -224,9 +237,9 @@ In addition to the bindings above, third party developers have started to provid
* [stes](https://github.com/stes) provides preliminary [PKGBUILDs](https://wiki.archlinux.org/index.php/PKGBUILD) to install the client and python bindings on [Arch Linux](https://www.archlinux.org/) in the [arch-deepspeech](https://github.com/stes/arch-deepspeech) repo.
* [gst-deepspeech](https://github.com/Elleo/gst-deepspeech) provides a [GStreamer](https://gstreamer.freedesktop.org/) plugin which can be used from any language with GStreamer bindings.
## Training
## Training Your Own Model
### Installing prerequisites for training
### Installing Training Prerequisites
Install the required dependencies using `pip3`:


@@ -1,67 +1,9 @@
# DeepSpeech native client, language bindings, and custom decoder
# Building DeepSpeech Binaries
This folder contains the following:
If you'd like to build the DeepSpeech binaries yourself, you'll need the following prerequisites downloaded and installed:
1. A native client for running queries on an exported DeepSpeech model
2. Python and Node.JS bindings for using an exported DeepSpeech model programmatically
3. A CTC beam search decoder which uses a language model (N.B.: the decoder is also required for training DeepSpeech)
We provide pre-built binaries for Linux and macOS.
## Required Dependencies
Running inference might require some runtime dependencies to be already installed on your system. These are the same regardless of which bindings you are using:
* libsox2
* libstdc++6
* libgomp1
* libpthread
Please refer to your system's documentation on how to install those dependencies.
## Installing our Pre-built Binaries
To download the pre-built binaries, use `util/taskcluster.py`:
```
python util/taskcluster.py --target /path/to/destination/folder
```
If you need binaries different from the current master (e.g. `v0.2.0-alpha.6`), you can use the `--branch` flag:
```bash
python3 util/taskcluster.py --branch "v0.2.0-alpha.6"
```
`util/taskcluster.py` will download and extract `native_client.tar.xz`. `native_client.tar.xz` includes (1) the `deepspeech` binary and (2) associated libraries. `taskcluster.py` will download binaries for the architecture of the host by default, but you can override that behavior with the `--arch` parameter. See `python util/taskcluster.py -h` for more details.
If you want the CUDA capable version of the binaries, use `--arch gpu`. Note that for now we don't publish CUDA-capable macOS binaries.
## Installing our Pre-built language bindings
### Python bindings
For the Python bindings, you can use `pip`:
```
pip install deepspeech
```
Check the [main README](../README.md) for more details about setup and virtual environment use.
### Node.JS bindings
For Node.JS bindings, use `npm install deepspeech` to install it. Please note that as of now, we only support Node.JS versions 4, 5 and 6. Once [SWIG has support](https://github.com/swig/swig/pull/968) we can build for newer versions.
Check the [main README](../README.md) for more details.
## Building & Installing your own Binaries
If you'd like to build the binaries yourself, you'll need the following prerequisites downloaded and installed:
* [TensorFlow requirements](https://www.tensorflow.org/install/install_sources)
* [TensorFlow `r1.13` sources](https://github.com/mozilla/tensorflow/tree/r1.13)
* [Mozilla's TensorFlow `r1.13` branch](https://github.com/mozilla/tensorflow/tree/r1.13)
* [General TensorFlow requirements](https://www.tensorflow.org/install/install_sources)
* [libsox](https://sourceforge.net/projects/sox/)
It is required to use our fork of TensorFlow since it includes fixes for common problems encountered when building the native client files.
@@ -72,34 +14,111 @@ If you'd like to build the language bindings or the decoder package, you'll also
* [node-pre-gyp](https://github.com/mapbox/node-pre-gyp) (for Node.JS bindings only)
### Building your own Binaries
## Dependencies
If you follow these instructions, you should be able to compile your own DeepSpeech binaries (built on TensorFlow using Bazel).
Firstly, you should create a symbolic link in your TensorFlow checkout to the DeepSpeech `native_client` directory. If your DeepSpeech and TensorFlow checkouts are side by side in the same directory, do:
For more information on configuring TensorFlow, read the docs up to the end of ["Configure the Build"](https://www.tensorflow.org/install/source#configure_the_build).
### TensorFlow: Clone & Checkout
Clone our fork of TensorFlow and checkout the correct version:
```
git clone https://github.com/mozilla/tensorflow.git
cd tensorflow
git checkout origin/r1.13
```
### Bazel: Download & Install
First, [find the version of Bazel](https://www.tensorflow.org/install/source#tested_build_configurations) you need for this TensorFlow release. Next, [download and install the correct version of Bazel](https://docs.bazel.build/versions/master/install.html).
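Before configuring TensorFlow, it can help to confirm what is actually installed. This only reports the local Bazel version; compare it yourself against the version listed in TensorFlow's tested build configurations:

```bash
# Print the installed Bazel version, or note that it is missing.
bazel version 2>/dev/null | grep 'Build label' || echo "bazel not installed"
```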
### TensorFlow: Configure with Bazel
After you have installed the correct version of Bazel, configure TensorFlow:
```
cd tensorflow
./configure
```
## Compile DeepSpeech
### Compile `libdeepspeech.so` & `generate_trie`
Within your TensorFlow checkout, create a symbolic link to the DeepSpeech `native_client` directory. Assuming DeepSpeech and TensorFlow checkouts are in the same directory, do:
```
cd tensorflow
ln -s ../DeepSpeech/native_client ./
```
Next, you will need to prepare your environment to configure and build TensorFlow. Clone from `https://github.com/mozilla/tensorflow`, and then (preferably) checkout the version of `tensorflow` which is currently supported by DeepSpeech (see requirements.txt), and use the `bazel` version recommended by TensorFlow for that version. Follow the [instructions](https://www.tensorflow.org/install/install_sources) on the TensorFlow site for your platform, up to the end of ["Configure the Build"](https://www.tensorflow.org/install/source#configure_the_build).
After that, you can build the main DeepSpeech library, `libdeepspeech.so`, as well as the `generate_trie` binary using the following command:
You can now use Bazel to build the main DeepSpeech library, `libdeepspeech.so`, as well as the `generate_trie` binary. Add `--config=cuda` if you want a CUDA build.
```
bazel build --config=monolithic -c opt --copt=-O3 --copt="-D_GLIBCXX_USE_CXX11_ABI=0" --copt=-fvisibility=hidden //native_client:libdeepspeech.so //native_client:generate_trie
```
If your build target requires extra flags, add them; for example, `--config=cuda` for a CUDA build. The generated binaries will appear under `bazel-bin/native_client/` (including `generate_trie` if the `//native_client:generate_trie` target was specified).
The generated binaries will be saved to `bazel-bin/native_client/`.
Finally, you can change to the `native_client` directory and use the `Makefile`. By default, the `Makefile` will assume there is a TensorFlow checkout in a directory above the DeepSpeech checkout. If that is not the case, set the environment variable `TFDIR` to point to the right directory.
### Compile Language Bindings
Now, `cd` into the `DeepSpeech/native_client` directory and use the `Makefile` to build all the language bindings (C++ client, Python package, Node.JS package, etc.). Set the environment variable `TFDIR` to point to your TensorFlow checkout.
```
cd ../DeepSpeech/native_client
TFDIR=~/tensorflow make deepspeech
```
### Cross-building for RPi3 ARMv7 and LePotato ARM64
## Installing your own Binaries
After building, the library files and binary can optionally be installed to a system path for ease of development. This is also a required step for bindings generation.
```
PREFIX=/usr/local sudo make install
```
It is assumed that `$PREFIX/lib` is a valid library path, otherwise you may need to alter your environment.
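If `$PREFIX/lib` is not already searched by your dynamic linker, one way to adjust the environment is shown below (a sketch; the prefix is the example value from the `make install` step above). Alternatively, a config file under `/etc/ld.so.conf.d/` followed by running `ldconfig` as root achieves the same thing persistently.

```bash
# Make the freshly installed libdeepspeech.so findable at run time.
PREFIX=/usr/local
export LD_LIBRARY_PATH="$PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
echo "$LD_LIBRARY_PATH"
```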
### Install Python bindings
Included are a set of generated Python bindings. After following the above build and installation instructions, these can be installed by executing the following commands (or equivalent on your system):
```
cd native_client/python
make bindings
pip install dist/deepspeech*
```
The API mirrors the C++ API and is demonstrated in [client.py](python/client.py). Refer to [deepspeech.h](deepspeech.h) for documentation.
### Install Node.JS bindings
After following the above build and installation instructions, the Node.JS bindings can be built:
```
cd native_client/javascript
make package
make npm-pack
```
This will create the package `deepspeech-VERSION.tgz` in `native_client/javascript`.
### Install the CTC decoder package
To build the `ds_ctcdecoder` package, you'll need the general requirements listed above (in particular SWIG). The command below builds the bindings using 8 processes for compilation; adjust `NUM_PROCESSES` for more or less parallelism.
```
cd native_client/ctcdecode
make bindings NUM_PROCESSES=8
pip install dist/*.whl
```
## Cross-building
### RPi3 ARMv7 and LePotato ARM64
We do support cross-compilation. Please refer to our `mozilla/tensorflow` fork, where we define the following `--config` flags:
@@ -160,55 +179,3 @@ cd ../DeepSpeech/native_client
$ANDROID_NDK_HOME/ndk-build APP_PLATFORM=android-21 APP_BUILD_SCRIPT=$(pwd)/Android.mk NDK_PROJECT_PATH=$(pwd) APP_STL=c++_shared TFDIR=$(pwd)/../../tensorflowx/ TARGET_ARCH_ABI=arm64-v8a
```
### Installing your own Binaries
After building, the library files and binary can optionally be installed to a system path for ease of development. This is also a required step for bindings generation.
```
PREFIX=/usr/local sudo make install
```
It is assumed that `$PREFIX/lib` is a valid library path, otherwise you may need to alter your environment.
#### Python bindings
Included are a set of generated Python bindings. After following the above build and installation instructions, these can be installed by executing the following commands (or equivalent on your system):
```
cd native_client/python
make bindings
pip install dist/deepspeech*
```
The API mirrors the C++ API and is demonstrated in [client.py](python/client.py). Refer to [deepspeech.h](deepspeech.h) for documentation.
#### Node.JS bindings
After following the above build and installation instructions, the Node.JS bindings can be built:
```
cd native_client/javascript
make package
make npm-pack
```
This will create the package `deepspeech-VERSION.tgz` in `native_client/javascript`.
#### Building the CTC decoder package
To build the `ds_ctcdecoder` package, you'll need the general requirements listed above (in particular SWIG). The command below builds the bindings using 8 processes for compilation. Adjust the parameter accordingly for more or less parallelism.
```
cd native_client/ctcdecode
make bindings NUM_PROCESSES=8
pip install dist/*.whl
```
## Running
The client can be run via the `Makefile`. The client will accept audio of any format your installation of SoX supports.
```
ARGS="--model /path/to/output_graph.pbmm --alphabet /path/to/alphabet.txt --audio /path/to/audio/file.wav" make run
```