Publish README/USING/TRAINING to readthedocs

Fixes #2581
Alexandre Lissy 2020-01-08 10:02:46 +01:00
parent 1dfba839ea
commit 4c7d5fb0e1
8 changed files with 56 additions and 43 deletions

View File

@@ -38,7 +38,7 @@ To install and use deepspeech all you have to do is:
 # Transcribe an audio file
 deepspeech --model deepspeech-0.6.0-models/output_graph.pbmm --lm deepspeech-0.6.0-models/lm.binary --trie deepspeech-0.6.0-models/trie --audio audio/2830-3980-0043.wav
-A pre-trained English model is available for use and can be downloaded using `the instructions below <USING.rst#using-a-pre-trained-model>`_. A package with some example audio files is available for download in our `release notes <https://github.com/mozilla/DeepSpeech/releases/latest>`_.
+A pre-trained English model is available for use and can be downloaded using `the instructions below <doc/USING.rst#using-a-pre-trained-model>`_. A package with some example audio files is available for download in our `release notes <https://github.com/mozilla/DeepSpeech/releases/latest>`_.
 Quicker inference can be performed using a supported NVIDIA GPU on Linux. See the `release notes <https://github.com/mozilla/DeepSpeech/releases/latest>`_ to find which GPUs are supported. To run ``deepspeech`` on a GPU, install the GPU specific package:
@@ -54,7 +54,7 @@ Quicker inference can be performed using a supported NVIDIA GPU on Linux. See th
 # Transcribe an audio file.
 deepspeech --model deepspeech-0.6.0-models/output_graph.pbmm --lm deepspeech-0.6.0-models/lm.binary --trie deepspeech-0.6.0-models/trie --audio audio/2830-3980-0043.wav
-Please ensure you have the required `CUDA dependencies <USING.rst#cuda-dependency>`_.
+Please ensure you have the required `CUDA dependencies <doc/USING.rst#cuda-dependency>`_.
 See the output of ``deepspeech -h`` for more information on the use of ``deepspeech``. (If you experience problems running ``deepspeech``\ , please check `required runtime dependencies <native_client/README.rst#required-dependencies>`_\ ).
@@ -62,34 +62,34 @@ See the output of ``deepspeech -h`` for more information on the use of ``deepspe
 **Table of Contents**
-* `Using a Pre-trained Model <USING.rst#using-a-pre-trained-model>`_
-* `CUDA dependency <USING.rst#cuda-dependency>`_
-* `Getting the pre-trained model <USING.rst#getting-the-pre-trained-model>`_
-* `Model compatibility <USING.rst#model-compatibility>`_
-* `Using the Python package <USING.rst#using-the-python-package>`_
-* `Using the Node.JS package <USING.rst#using-the-nodejs-package>`_
-* `Using the Command Line client <USING.rst#using-the-command-line-client>`_
-* `Installing bindings from source <USING.rst#installing-bindings-from-source>`_
-* `Third party bindings <USING.rst#third-party-bindings>`_
+* `Using a Pre-trained Model <doc/USING.rst#using-a-pre-trained-model>`_
+* `CUDA dependency <doc/USING.rst#cuda-dependency>`_
+* `Getting the pre-trained model <doc/USING.rst#getting-the-pre-trained-model>`_
+* `Model compatibility <doc/USING.rst#model-compatibility>`_
+* `Using the Python package <doc/USING.rst#using-the-python-package>`_
+* `Using the Node.JS package <doc/USING.rst#using-the-nodejs-package>`_
+* `Using the Command Line client <doc/USING.rst#using-the-command-line-client>`_
+* `Installing bindings from source <doc/USING.rst#installing-bindings-from-source>`_
+* `Third party bindings <doc/USING.rst#third-party-bindings>`_
 * `Trying out DeepSpeech with examples <examples/README.rst>`_
-* `Training your own Model <TRAINING.rst#training-your-own-model>`_
-* `Prerequisites for training a model <TRAINING.rst#prerequisites-for-training-a-model>`_
-* `Getting the training code <TRAINING.rst#getting-the-training-code>`_
-* `Installing Python dependencies <TRAINING.rst#installing-python-dependencies>`_
-* `Recommendations <TRAINING.rst#recommendations>`_
-* `Common Voice training data <TRAINING.rst#common-voice-training-data>`_
-* `Training a model <TRAINING.rst#training-a-model>`_
-* `Checkpointing <TRAINING.rst#checkpointing>`_
-* `Exporting a model for inference <TRAINING.rst#exporting-a-model-for-inference>`_
-* `Exporting a model for TFLite <TRAINING.rst#exporting-a-model-for-tflite>`_
-* `Making a mmap-able model for inference <TRAINING.rst#making-a-mmap-able-model-for-inference>`_
-* `Continuing training from a release model <TRAINING.rst#continuing-training-from-a-release-model>`_
-* `Training with Augmentation <TRAINING.rst#training-with-augmentation>`_
+* `Training your own Model <doc/TRAINING.rst#training-your-own-model>`_
+* `Prerequisites for training a model <doc/TRAINING.rst#prerequisites-for-training-a-model>`_
+* `Getting the training code <doc/TRAINING.rst#getting-the-training-code>`_
+* `Installing Python dependencies <doc/TRAINING.rst#installing-python-dependencies>`_
+* `Recommendations <doc/TRAINING.rst#recommendations>`_
+* `Common Voice training data <doc/TRAINING.rst#common-voice-training-data>`_
+* `Training a model <doc/TRAINING.rst#training-a-model>`_
+* `Checkpointing <doc/TRAINING.rst#checkpointing>`_
+* `Exporting a model for inference <doc/TRAINING.rst#exporting-a-model-for-inference>`_
+* `Exporting a model for TFLite <doc/TRAINING.rst#exporting-a-model-for-tflite>`_
+* `Making a mmap-able model for inference <doc/TRAINING.rst#making-a-mmap-able-model-for-inference>`_
+* `Continuing training from a release model <doc/TRAINING.rst#continuing-training-from-a-release-model>`_
+* `Training with Augmentation <doc/TRAINING.rst#training-with-augmentation>`_
 * `Contribution guidelines <CONTRIBUTING.rst>`_
 * `Contact/Getting Help <SUPPORT.rst>`_

View File

@@ -1,5 +1,5 @@
-Introduction
-============
+DeepSpeech Model
+================
 The aim of this project is to create a simple, open, and ubiquitous speech
 recognition engine. Simple, in that the engine should not require server-class

View File

@@ -54,7 +54,7 @@ You'll also need to install the ``ds_ctcdecoder`` Python package. ``ds_ctcdecode
 pip3 install $(python3 util/taskcluster.py --decoder)
-This command will download and install the ``ds_ctcdecoder`` package. You can override the platform with ``--arch`` if you want the package for ARM7 (\ ``--arch arm``\ ) or ARM64 (\ ``--arch arm64``\ ). If you prefer building the ``ds_ctcdecoder`` package from source, see the `native_client README file <native_client/README.rst>`_.
+This command will download and install the ``ds_ctcdecoder`` package. You can override the platform with ``--arch`` if you want the package for ARM7 (\ ``--arch arm``\ ) or ARM64 (\ ``--arch arm64``\ ). If you prefer building the ``ds_ctcdecoder`` package from source, see the :github:`native_client README file <native_client/README.rst>`.
 Recommendations
 ^^^^^^^^^^^^^^^
@@ -124,9 +124,9 @@ The central (Python) script is ``DeepSpeech.py`` in the project's root directory
 ./DeepSpeech.py --helpfull
-To get the output of this in a slightly better-formatted way, you can also look up the option definitions in `\ ``util/flags.py`` <util/flags.py>`_.
+To get the output of this in a slightly better-formatted way, you can also look up the option definitions in :github:`util/flags.py <util/flags.py>`.
-For executing pre-configured training scenarios, there is a collection of convenience scripts in the ``bin`` folder. Most of them are named after the corpora they are configured for. Keep in mind that most speech corpora are *very large*\ , on the order of tens of gigabytes, and some aren't free. Downloading and preprocessing them can take a very long time, and training on them without a fast GPU (GTX 10 series or newer recommended) takes even longer.
+For executing pre-configured training scenarios, there is a collection of convenience scripts in the ``bin`` folder. Most of them are named after the corpora they are configured for. Keep in mind that most speech corpora are *very large*, on the order of tens of gigabytes, and some aren't free. Downloading and preprocessing them can take a very long time, and training on them without a fast GPU (GTX 10 series or newer recommended) takes even longer.
 **If you experience GPU OOM errors while training, try reducing the batch size with the ``--train_batch_size``\ , ``--dev_batch_size`` and ``--test_batch_size`` parameters.**
@@ -136,7 +136,7 @@ As a simple first example you can open a terminal, change to the directory of th
 ./bin/run-ldc93s1.sh
-This script will train on a small sample dataset composed of just a single audio file, the sample file for the `TIMIT Acoustic-Phonetic Continuous Speech Corpus <https://catalog.ldc.upenn.edu/LDC93S1>`_\ , which can be overfitted on a GPU in a few minutes for demonstration purposes. From here, you can alter any variables with regards to what dataset is used, how many training iterations are run and the default values of the network parameters.
+This script will train on a small sample dataset composed of just a single audio file, the sample file for the `TIMIT Acoustic-Phonetic Continuous Speech Corpus <https://catalog.ldc.upenn.edu/LDC93S1>`_, which can be overfitted on a GPU in a few minutes for demonstration purposes. From here, you can alter any variables with regards to what dataset is used, how many training iterations are run and the default values of the network parameters.
 Feel also free to pass additional (or overriding) ``DeepSpeech.py`` parameters to these scripts. Then, just run the script to train the modified network.
@@ -168,7 +168,7 @@ Exporting a model for inference
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 If the ``--export_dir`` parameter is provided, a model will have been exported to this directory during training.
-Refer to the corresponding `README.rst <native_client/README.rst>`_ for information on building and running a client that can use the exported model.
+Refer to the corresponding :github:`README.rst <native_client/README.rst>` for information on building and running a client that can use the exported model.
 Exporting a model for TFLite
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

View File

@@ -7,7 +7,7 @@ Inference using a DeepSpeech pre-trained model can be done with a client/languag
 * `The Python package/language binding <#using-the-python-package>`_
 * `The Node.JS package/language binding <#using-the-nodejs-package>`_
 * `The Command-Line client <#using-the-command-line-client>`_
-* `The .NET client/language binding <native_client/dotnet/README.rst>`_
+* :github:`The .NET client/language binding <native_client/dotnet/README.rst>`
 Running ``deepspeech`` might, see below, require some runtime dependencies to be already installed on your system:
@@ -110,18 +110,20 @@ Note: the following command assumes you `downloaded the pre-trained model <#gett
 The arguments ``--lm`` and ``--trie`` are optional, and represent a language model.
-See `client.py <native_client/python/client.py>`_ for an example of how to use the package programmatically.
+See :github:`client.py <native_client/python/client.py>` for an example of how to use the package programmatically.
-Using the Node.JS package
-^^^^^^^^^^^^^^^^^^^^^^^^^
+Using the Node.JS / Electron.JS package
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-You can download the Node.JS bindings using ``npm``\ :
+You can download the JS bindings using ``npm``\ :
 .. code-block:: bash
    npm install deepspeech
-Please note that as of now, we only support Node.JS versions 4, 5 and 6. Once `SWIG has support <https://github.com/swig/swig/pull/968>`_ we can build for newer versions.
+Please note that as of now, we support:
+- Node.JS versions 4 to 13.
+- Electron.JS versions 1.6 to 7.1
 Alternatively, if you're using Linux and have a supported NVIDIA GPU, you can install the GPU specific package as follows:
@@ -131,7 +133,7 @@ Alternatively, if you're using Linux and have a supported NVIDIA GPU, you can in
 See the `release notes <https://github.com/mozilla/DeepSpeech/releases>`_ to find which GPUs are supported. Please ensure you have the required `CUDA dependency <#cuda-dependency>`_.
-See `client.js <native_client/javascript/client.js>`_ for an example of how to use the bindings. Or download the `wav example <examples/nodejs_wav>`_.
+See :github:`client.js <native_client/javascript/client.js>` for an example of how to use the bindings.
 Using the Command-Line client
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -162,12 +164,12 @@ Note: the following command assumes you `downloaded the pre-trained model <#gett
 ./deepspeech --model models/output_graph.pbmm --lm models/lm.binary --trie models/trie --audio audio_input.wav
-See the help output with ``./deepspeech -h`` and the `native client README <native_client/README.rst>`_ for more details.
+See the help output with ``./deepspeech -h`` and the :github:`native client README <native_client/README.rst>` for more details.
 Installing bindings from source
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-If pre-built binaries aren't available for your system, you'll need to install them from scratch. Follow these `\ ``native_client`` installation instructions <native_client/README.rst>`_.
+If pre-built binaries aren't available for your system, you'll need to install them from scratch. Follow these :github:`native client installation instructions <native_client/README.rst>`.
 Third party bindings
 ^^^^^^^^^^^^^^^^^^^^

View File

@@ -64,6 +64,7 @@ release = v
 # ones.
 extensions = [
     'sphinx.ext.autodoc',
+    'sphinx.ext.extlinks',
     'sphinx.ext.intersphinx',
     'sphinx.ext.mathjax',
     'sphinx.ext.viewcode',
@@ -194,3 +195,6 @@ texinfo_documents = [
 # Example configuration for intersphinx: refer to the Python standard library.
 intersphinx_mapping = {'https://docs.python.org/': None}
+extlinks = {'github': ('https://github.com/mozilla/DeepSpeech/blob/v{}/%s'.format(release),
+                       '%s')}
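
For context, the ``extlinks`` entry added above defines the ``:github:`` role used throughout this commit: the role's target is substituted into the ``%s`` of a version-pinned GitHub URL. A minimal sketch of the expansion, assuming an illustrative ``release`` value of ``0.6.0`` (the real value comes from ``doc/conf.py``):

    # Sketch of how Sphinx's extlinks expands the :github: role defined above.
    # "0.6.0" is an assumed example; conf.py supplies the real release string.
    release = "0.6.0"
    base = 'https://github.com/mozilla/DeepSpeech/blob/v{}/%s'.format(release)

    # :github:`client.py <native_client/python/client.py>` then resolves to:
    print(base % 'native_client/python/client.py')
    # https://github.com/mozilla/DeepSpeech/blob/v0.6.0/native_client/python/client.py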

View File

@@ -10,13 +10,18 @@ Welcome to DeepSpeech's documentation!
    :maxdepth: 2
    :caption: Introduction
-   DeepSpeech
+   USING
+   TRAINING
 .. toctree::
    :maxdepth: 2
    :caption: DeepSpeech Model
+   DeepSpeech
    Geometry
    ParallelOptimization
 .. toctree::

View File

@@ -177,7 +177,8 @@ TFLiteModelState::init(const char* model_path,
     std::cerr << "Specified model file version (" << *graph_version << ") is "
               << "incompatible with minimum version supported by this client ("
               << ds_graph_version() << "). See "
-              << "https://github.com/mozilla/DeepSpeech/blob/master/USING.rst#model-compatibility "
+              << "https://github.com/mozilla/DeepSpeech/blob/"
+              << ds_git_version() << "/doc/USING.rst#model-compatibility "
               << "for more information" << std::endl;
     return DS_ERR_MODEL_INCOMPATIBLE;
   }

View File

@@ -91,7 +91,8 @@ TFModelState::init(const char* model_path,
     std::cerr << "Specified model file version (" << graph_version << ") is "
               << "incompatible with minimum version supported by this client ("
               << ds_graph_version() << "). See "
-              << "https://github.com/mozilla/DeepSpeech/blob/master/USING.rst#model-compatibility "
+              << "https://github.com/mozilla/DeepSpeech/blob/"
+              << ds_git_version() << "/doc/USING.rst#model-compatibility "
               << "for more information" << std::endl;
     return DS_ERR_MODEL_INCOMPATIBLE;
   }
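
Taken together, the two C++ hunks change the model-incompatibility error so it links to a ``doc/USING.rst`` pinned to the client's own git tag instead of ``master``, so the compatibility notes a user lands on match the binary they are running. A rough sketch of the string the stream concatenation produces, assuming ``ds_git_version()`` returns a tag such as ``v0.6.0``:

    # Illustrative only: mirrors the std::cerr concatenation above.
    # "v0.6.0" is an assumed stand-in for the value of ds_git_version().
    git_version = "v0.6.0"
    url = ("https://github.com/mozilla/DeepSpeech/blob/"
           + git_version + "/doc/USING.rst#model-compatibility")
    # -> https://github.com/mozilla/DeepSpeech/blob/v0.6.0/doc/USING.rst#model-compatibility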