Merge pull request #2949 from mozilla/docs-rtd

Docs centered on ReadTheDocs instead of GitHub
Reuben Morais 2020-04-28 16:06:44 +02:00 committed by GitHub
commit 4930186197
12 changed files with 103 additions and 104 deletions


@ -14,82 +14,10 @@ Project DeepSpeech
DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on `Baidu's Deep Speech research paper <https://arxiv.org/abs/1412.5567>`_. Project DeepSpeech uses Google's `TensorFlow <https://www.tensorflow.org/>`_ to make the implementation easier.
**NOTE:** This documentation applies to the **master version** of DeepSpeech only. **Documentation for all versions** is published on `deepspeech.readthedocs.io <http://deepspeech.readthedocs.io/?badge=latest>`_.
Documentation for installation, usage, and training models is available on `deepspeech.readthedocs.io <http://deepspeech.readthedocs.io/?badge=latest>`_.
To install and use DeepSpeech all you have to do is:
For the latest release, including pre-trained models and checkpoints, `see the latest release on GitHub <https://github.com/mozilla/DeepSpeech/releases/latest>`_.
.. code-block:: bash
For contribution guidelines, see `CONTRIBUTING.rst <CONTRIBUTING.rst>`_.
# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-venv/
source $HOME/tmp/deepspeech-venv/bin/activate
# Install DeepSpeech
pip3 install deepspeech
# Download pre-trained English model files
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.pbmm
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.scorer
# Download example audio files
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/audio-0.7.0.tar.gz
tar xvf audio-0.7.0.tar.gz
# Transcribe an audio file
deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio audio/2830-3980-0043.wav
A pre-trained English model is available for use and can be downloaded using `the instructions below <doc/USING.rst#using-a-pre-trained-model>`_. A package with some example audio files is available for download in our `release notes <https://github.com/mozilla/DeepSpeech/releases/latest>`_.
Quicker inference can be performed using a supported NVIDIA GPU on Linux. See the `release notes <https://github.com/mozilla/DeepSpeech/releases/latest>`_ to find which GPUs are supported. To run ``deepspeech`` on a GPU, install the GPU specific package:
.. code-block:: bash
# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-gpu-venv/
source $HOME/tmp/deepspeech-gpu-venv/bin/activate
# Install DeepSpeech CUDA enabled package
pip3 install deepspeech-gpu
# Transcribe an audio file.
deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio audio/2830-3980-0043.wav
Please ensure you have the required `CUDA dependencies <doc/USING.rst#cuda-dependency>`_.
See the output of ``deepspeech -h`` for more information on the use of ``deepspeech``. (If you experience problems running ``deepspeech``\ , please check `required runtime dependencies <native_client/README.rst#required-dependencies>`_\ ).
----
**Table of Contents**
* `Using a Pre-trained Model <doc/USING.rst#using-a-pre-trained-model>`_
* `CUDA dependency <doc/USING.rst#cuda-dependency>`_
* `Getting the pre-trained model <doc/USING.rst#getting-the-pre-trained-model>`_
* `Model compatibility <doc/USING.rst#model-compatibility>`_
* `Using the Python package <doc/USING.rst#using-the-python-package>`_
* `Using the Node.JS package <doc/USING.rst#using-the-nodejs-package>`_
* `Using the Command Line client <doc/USING.rst#using-the-command-line-client>`_
* `Installing bindings from source <doc/USING.rst#installing-bindings-from-source>`_
* `Third party bindings <doc/USING.rst#third-party-bindings>`_
* `Trying out DeepSpeech with examples <examples/README.rst>`_
* `Training your own Model <doc/TRAINING.rst#training-your-own-model>`_
* `Prerequisites for training a model <doc/TRAINING.rst#prerequisites-for-training-a-model>`_
* `Getting the training code <doc/TRAINING.rst#getting-the-training-code>`_
* `Installing Python dependencies <doc/TRAINING.rst#installing-python-dependencies>`_
* `Recommendations <doc/TRAINING.rst#recommendations>`_
* `Common Voice training data <doc/TRAINING.rst#common-voice-training-data>`_
* `Training a model <doc/TRAINING.rst#training-a-model>`_
* `Checkpointing <doc/TRAINING.rst#checkpointing>`_
* `Exporting a model for inference <doc/TRAINING.rst#exporting-a-model-for-inference>`_
* `Exporting a model for TFLite <doc/TRAINING.rst#exporting-a-model-for-tflite>`_
* `Making a mmap-able model for inference <doc/TRAINING.rst#making-a-mmap-able-model-for-inference>`_
* `Continuing training from a release model <doc/TRAINING.rst#continuing-training-from-a-release-model>`_
* `Training with Augmentation <doc/TRAINING.rst#training-with-augmentation>`_
* `Contribution guidelines <CONTRIBUTING.rst>`_
* `Contact/Getting Help <SUPPORT.rst>`_
For contact and support information, see `SUPPORT.rst <SUPPORT.rst>`_.


@ -3,15 +3,8 @@ Contact/Getting Help
There are several ways to contact us or to get help:
#. `Discourse Forums <https://discourse.mozilla.org/c/deep-speech>`_ - The `Deep Speech category on Discourse <https://discourse.mozilla.org/c/deep-speech>`_ is the first place to look. Search for keywords related to your question or problem to see if someone else has run into it already. If you can't find anything relevant there, search on our `issue tracker <https://github.com/mozilla/deepspeech/issues>`_ to see if there is an existing issue about your problem.
#.
`\ **FAQ** <https://github.com/mozilla/DeepSpeech/wiki#frequently-asked-questions>`_ - We have a list of common questions, and their answers, in our `FAQ <https://github.com/mozilla/DeepSpeech/wiki#frequently-asked-questions>`_. When just getting started, it's best to first check the `FAQ <https://github.com/mozilla/DeepSpeech/wiki#frequently-asked-questions>`_ to see if your question is addressed.
#. `Matrix chat <https://chat.mozilla.org/#/room/#machinelearning:mozilla.org>`_ - If your question is not addressed by either the `FAQ <https://github.com/mozilla/DeepSpeech/wiki#frequently-asked-questions>`_ or `Discourse Forums <https://discourse.mozilla.org/c/deep-speech>`_\ , you can contact us on the ``#machinelearning`` channel on `Mozilla Matrix <https://chat.mozilla.org/#/room/#machinelearning:mozilla.org>`_\ ; people there can try to answer your question or help.
#.
`\ **Discourse Forums** <https://discourse.mozilla.org/c/deep-speech>`_ - If your question is not addressed in the `FAQ <https://github.com/mozilla/DeepSpeech/wiki#frequently-asked-questions>`_\ , the `Discourse Forums <https://discourse.mozilla.org/c/deep-speech>`_ is the next place to look. They contain conversations on `General Topics <https://discourse.mozilla.org/t/general-topics/21075>`_\ , `Using Deep Speech <https://discourse.mozilla.org/t/using-deep-speech/21076/4>`_\ , and `Deep Speech Development <https://discourse.mozilla.org/t/deep-speech-development/21077>`_.
#.
`\ **Matrix chat** <https://chat.mozilla.org/#/room/#machinelearning:mozilla.org>`_ - If your question is not addressed by either the `FAQ <https://github.com/mozilla/DeepSpeech/wiki#frequently-asked-questions>`_ or `Discourse Forums <https://discourse.mozilla.org/c/deep-speech>`_\ , you can contact us on the ``#machinelearning`` channel on `Mozilla Matrix <https://chat.mozilla.org/#/room/#machinelearning:mozilla.org>`_\ ; people there can try to answer/help
#.
`\ **Issues** <https://github.com/mozilla/deepspeech/issues>`_ - Finally, if all else fails, you can open an issue in our repo.
#. `Create a new issue <https://github.com/mozilla/deepspeech/issues>`_ - Finally, if you have a bug report or a feature request that isn't already covered by an existing issue, please open an issue in our repo and fill in the appropriate information about your hardware and software setup.


@ -76,4 +76,4 @@ The character, '|' in this case, will then have to be replaced with spaces as a
Implementation
^^^^^^^^^^^^^^
The decoder source code can be found in ``native_client/ctcdecode``. The decoder is included in the language bindings and clients. In addition, there is a separate Python module which includes just the decoder and is needed for evaluation. In order to build and install this package, see the :github:`native_client README <native_client/README.rst#install-the-ctc-decoder-package>`.
The decoder source code can be found in ``native_client/ctcdecode``. The decoder is included in the language bindings and clients. In addition, there is a separate Python module which includes just the decoder and is needed for evaluation. A pre-built version of this package is automatically downloaded and installed when installing the training code. If you want or need to manually build and install it from source, see the :github:`native_client README <native_client/README.rst#install-the-ctc-decoder-package>`.

doc/Flags.rst Normal file

@ -0,0 +1,16 @@
.. _training-flags:
Command-line flags for the training scripts
===========================================
Below you can find the definition of all command-line flags supported by the training scripts. This includes ``DeepSpeech.py``, ``evaluate.py``, ``evaluate_tflite.py``, ``transcribe.py`` and ``lm_optimizer.py``.
Flags
-----
.. literalinclude:: ../training/deepspeech_training/util/flags.py
   :language: python
   :linenos:
   :lineno-match:
   :start-after: sphinx-doc: training_ref_flags_start
   :end-before: sphinx-doc: training_ref_flags_end
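
The ``:start-after:`` and ``:end-before:`` options restrict the included text to the region of ``flags.py`` between the two ``sphinx-doc`` marker comments. The selection rule can be sketched roughly as follows; ``extract_between`` and ``SAMPLE`` are hypothetical names used only for illustration, and this is an approximation of the behaviour, not Sphinx's own implementation:

```python
def extract_between(text, start_after, end_before):
    """Return the lines strictly between two marker lines.

    Hypothetical helper approximating literalinclude's ``:start-after:``
    and ``:end-before:`` options; it is not Sphinx's own code.
    """
    lines = text.splitlines()
    # Drop everything up to and including the first line containing the start marker.
    for i, line in enumerate(lines):
        if start_after in line:
            lines = lines[i + 1:]
            break
    else:
        raise ValueError("start marker not found: %r" % start_after)
    # Keep everything before the first remaining line containing the end marker.
    for i, line in enumerate(lines):
        if end_before in line:
            return lines[:i]
    raise ValueError("end marker not found: %r" % end_before)


# Made-up stand-in for flags.py, just large enough to show the markers.
SAMPLE = """\
FLAGS = absl.flags.FLAGS
# sphinx-doc: training_ref_flags_start
def create_flags():
    pass
# sphinx-doc: training_ref_flags_end
"""

selected = extract_between(
    SAMPLE,
    "sphinx-doc: training_ref_flags_start",
    "sphinx-doc: training_ref_flags_end",
)
print("\n".join(selected))
```

Because the markers are plain comments, they can be moved within ``flags.py`` to change what the documentation page shows without touching the directive.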


@ -1,3 +1,5 @@
.. _js-api-example:
JavaScript API Usage example
=============================


@ -1,7 +1,9 @@
.. _py-api-example:
Python API Usage example
========================
Examples are from `native_client/python/client.cc`.
Examples are from `native_client/python/client.py`.
Creating a model instance and loading model
-------------------------------------------


@ -65,7 +65,7 @@ If you have a capable (NVIDIA, at least 8GB of VRAM) GPU, it is highly recommend
pip3 uninstall tensorflow
pip3 install 'tensorflow-gpu==1.15.2'
Please ensure you have the required `CUDA dependency <USING.rst#cuda-dependency>`_.
Please ensure you have the required :ref:`CUDA dependency <cuda-deps>`.
Some people have reported the following failure during training:
@ -74,7 +74,7 @@ Some people have reported the following failure during training:
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node tower_0/conv1d/Conv2D}}]]
Setting the ``TF_FORCE_GPU_ALLOW_GROWTH`` environment variable to ``true`` seems to help in such cases. This could also be due to an incorrect version of libcudnn. Double check your versions with the `TensorFlow 1.15 documentation <USING.rst#cuda-dependency>`_.
Setting the ``TF_FORCE_GPU_ALLOW_GROWTH`` environment variable to ``true`` seems to help in such cases. This could also be due to an incorrect version of libcudnn. Double check your versions with the :ref:`TensorFlow 1.15 documentation <cuda-deps>`.
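
If you launch training from a Python wrapper rather than a shell, the variable can also be set from Python. The sketch below assumes the value has to be in the environment before TensorFlow initializes the GPU, so it is set before any TensorFlow import; it is an illustration, not the project's documented procedure:

```python
import os

# Assumption of this sketch: the variable must be set before TensorFlow
# is imported and initializes CUDA, otherwise it may have no effect.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

# Import TensorFlow (or start the training script) only after this point.
print(os.environ["TF_FORCE_GPU_ALLOW_GROWTH"])
```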
Common Voice training data
^^^^^^^^^^^^^^^^^^^^^^^^^^
@ -123,7 +123,7 @@ The central (Python) script is ``DeepSpeech.py`` in the project's root directory
./DeepSpeech.py --helpfull
To get the output of this in a slightly better-formatted way, you can also look up the option definitions in :github:`util/flags.py <util/flags.py>`.
To get the output of this in a slightly better-formatted way, you can also look at the flag definitions in :ref:`training-flags`.
For executing pre-configured training scenarios, there is a collection of convenience scripts in the ``bin`` folder. Most of them are named after the corpora they are configured for. Keep in mind that most speech corpora are *very large*, on the order of tens of gigabytes, and some aren't free. Downloading and preprocessing them can take a very long time, and training on them without a fast GPU (GTX 10 series or newer recommended) takes even longer.
@ -179,7 +179,7 @@ Exporting a model for inference
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If the ``--export_dir`` parameter is provided, a model will have been exported to this directory during training.
Refer to the corresponding :github:`README.rst <native_client/README.rst>` for information on building and running a client that can use the exported model.
Refer to the :ref:`usage instructions <usage-docs>` for information on running a client that can use the exported model.
Exporting a model for TFLite
^^^^^^^^^^^^^^^^^^^^^^^^^^^^


@ -1,16 +1,19 @@
.. _usage-docs:
Using a Pre-trained Model
=========================
Inference using a DeepSpeech pre-trained model can be done with a client/language binding package. We have four clients/language bindings in this repository, listed below, and also a few community-maintained clients/language bindings in other repositories, listed `further down in this README <#third-party-bindings>`_.
* `The Python package/language binding <#using-the-python-package>`_
* `The Node.JS package/language binding <#using-the-nodejs-package>`_
* `The Command-Line client <#using-the-command-line-client>`_
* :ref:`The C API <c-usage>`
* :ref:`The Python package/language binding <py-usage>`
* :ref:`The Node.JS package/language binding <nodejs-usage>`
* :ref:`The command-line client <cli-usage>`
* :github:`The .NET client/language binding <native_client/dotnet/README.rst>`
Running ``deepspeech`` might, see below, require some runtime dependencies to be already installed on your system:
.. _runtime-deps:
Running ``deepspeech`` might, see below, require some runtime dependencies to be already installed on your system:
* ``sox`` - The Python and Node.JS clients use SoX to resample files to 16kHz.
* ``libgomp1`` - libsox (statically linked into the clients) depends on OpenMP. Some people have had to install this manually.
@ -20,6 +23,8 @@ Running ``deepspeech`` might, see below, require some runtime dependencies to be
Please refer to your system's documentation on how to install these dependencies.
.. _cuda-deps:
CUDA dependency
^^^^^^^^^^^^^^^
@ -40,6 +45,8 @@ Model compatibility
DeepSpeech models are versioned to keep you from trying to use an incompatible graph with a newer client after a breaking change was made to the code. If you get an error saying your model file version is too old for the client, you should either upgrade to a newer model release, re-export your model from the checkpoint using a newer version of the code, or downgrade your client if you need to use the old model and can't re-export it.
.. _py-usage:
Using the Python package
^^^^^^^^^^^^^^^^^^^^^^^^
@ -110,7 +117,9 @@ Note: the following command assumes you `downloaded the pre-trained model <#gett
The ``--scorer`` argument is optional, and represents an external language model to be used when transcribing the audio.
See :github:`client.py <native_client/python/client.py>` for an example of how to use the package programmatically.
See :ref:`the Python client <py-api-example>` for an example of how to use the package programmatically.
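
As a rough idea of what programmatic use looks like, here is a minimal sketch. It assumes the 0.7 Python package exposes ``Model``, ``enableExternalScorer``, ``sampleRate`` and ``stt`` as used below, and that the input is mono 16-bit PCM audio; ``read_pcm16`` and ``transcribe`` are hypothetical helper names, not part of the package:

```python
import wave


def read_pcm16(wav_path):
    """Read a mono 16-bit PCM WAV file and return (sample_rate, raw_bytes)."""
    with wave.open(wav_path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected mono 16-bit PCM audio")
        return wav.getframerate(), wav.readframes(wav.getnframes())


def transcribe(model_path, scorer_path, wav_path):
    """Transcribe one WAV file; needs the deepspeech and numpy packages."""
    # Imported lazily so the WAV helper above works without them installed.
    import numpy as np
    from deepspeech import Model

    model = Model(model_path)
    if scorer_path:
        # The scorer (external language model) is optional.
        model.enableExternalScorer(scorer_path)

    rate, frames = read_pcm16(wav_path)
    if rate != model.sampleRate():
        # This sketch does no resampling; convert the file beforehand.
        raise ValueError(
            "audio is %d Hz, model expects %d Hz" % (rate, model.sampleRate())
        )
    return model.stt(np.frombuffer(frames, dtype=np.int16))
```

With the release files from the command above, a call would look like ``transcribe("deepspeech-0.7.0-models.pbmm", "deepspeech-0.7.0-models.scorer", "audio/2830-3980-0043.wav")``. The linked client example remains the authoritative reference.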
.. _nodejs-usage:
Using the Node.JS / Electron.JS package
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@ -135,9 +144,11 @@ Alternatively, if you're using Linux and have a supported NVIDIA GPU, you can in
See the `release notes <https://github.com/mozilla/DeepSpeech/releases>`_ to find which GPUs are supported. Please ensure you have the required `CUDA dependency <#cuda-dependency>`_.
See :github:`client.ts <native_client/javascript/client.ts>` for an example of how to use the bindings.
See the :ref:`TypeScript client <js-api-example>` for an example of how to use the bindings programmatically.
Using the Command-Line client
.. _cli-usage:
Using the command-line client
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
To download the pre-built binaries for the ``deepspeech`` command-line (compiled C++) client, use ``util/taskcluster.py``\ :
@ -168,12 +179,12 @@ Note: the following command assumes you `downloaded the pre-trained model <#gett
./deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio audio_input.wav
See the help output with ``./deepspeech -h`` and the :github:`native client README <native_client/README.rst>` for more details.
See the help output with ``./deepspeech -h`` for more details.
Installing bindings from source
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If pre-built binaries aren't available for your system, you'll need to install them from scratch. Follow these :github:`native client installation instructions <native_client/README.rst>`.
If pre-built binaries aren't available for your system, you'll need to install them from scratch. Follow the :github:`native client build and installation instructions <native_client/README.rst>`.
Third party bindings
^^^^^^^^^^^^^^^^^^^^


@ -6,6 +6,50 @@
Welcome to DeepSpeech's documentation!
======================================
DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on `Baidu's Deep Speech research paper <https://arxiv.org/abs/1412.5567>`_. Project DeepSpeech uses Google's `TensorFlow <https://www.tensorflow.org/>`_ to make the implementation easier.
To install and use DeepSpeech all you have to do is:
.. code-block:: bash

   # Create and activate a virtualenv
   virtualenv -p python3 $HOME/tmp/deepspeech-venv/
   source $HOME/tmp/deepspeech-venv/bin/activate

   # Install DeepSpeech
   pip3 install deepspeech

   # Download pre-trained English model files
   curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.pbmm
   curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.scorer

   # Download example audio files
   curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/audio-0.7.0.tar.gz
   tar xvf audio-0.7.0.tar.gz

   # Transcribe an audio file
   deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio audio/2830-3980-0043.wav

A pre-trained English model is available for use and can be downloaded following the instructions in :ref:`the usage docs <usage-docs>`. For the latest release, including pre-trained models and checkpoints, `see the GitHub releases page <https://github.com/mozilla/DeepSpeech/releases/latest>`_.
Quicker inference can be performed using a supported NVIDIA GPU on Linux. See the `release notes <https://github.com/mozilla/DeepSpeech/releases/latest>`_ to find which GPUs are supported. To run ``deepspeech`` on a GPU, install the GPU specific package:
.. code-block:: bash

   # Create and activate a virtualenv
   virtualenv -p python3 $HOME/tmp/deepspeech-gpu-venv/
   source $HOME/tmp/deepspeech-gpu-venv/bin/activate

   # Install DeepSpeech CUDA enabled package
   pip3 install deepspeech-gpu

   # Transcribe an audio file.
   deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio audio/2830-3980-0043.wav

Please ensure you have the required :ref:`CUDA dependencies <cuda-deps>`.
See the output of ``deepspeech -h`` for more information on the use of ``deepspeech``. (If you experience problems running ``deepspeech``, please check :ref:`required runtime dependencies <runtime-deps>`).
.. toctree::
:maxdepth: 2
:caption: Introduction


@ -1,4 +1,4 @@
Full project description and documentation on GitHub: [https://github.com/mozilla/DeepSpeech](https://github.com/mozilla/DeepSpeech).
Full project description and documentation on [https://deepspeech.readthedocs.io/](https://deepspeech.readthedocs.io/).
## Generating TypeScript Type Definitions


@ -1 +1 @@
Full project description and documentation on GitHub: `https://github.com/mozilla/DeepSpeech <https://github.com/mozilla/DeepSpeech>`_
Full project description and documentation on `https://deepspeech.readthedocs.io/ <https://deepspeech.readthedocs.io/>`_


@ -5,6 +5,7 @@ import absl.flags
FLAGS = absl.flags.FLAGS
# sphinx-doc: training_ref_flags_start
def create_flags():
    # Importer
    # ========
@ -198,3 +199,5 @@ def create_flags():
    f.register_validator('one_shot_infer',
                         lambda value: not value or os.path.isfile(value),
                         message='The file pointed to by --one_shot_infer must exist and be readable.')
# sphinx-doc: training_ref_flags_end