Docs centered on ReadTheDocs instead of GitHub

Reuben Morais 2020-04-27 18:49:02 +02:00
parent 6e9b251da2
commit a584c8e6b6
10 changed files with 84 additions and 101 deletions

@ -14,82 +14,10 @@ Project DeepSpeech
DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on `Baidu's Deep Speech research paper <https://arxiv.org/abs/1412.5567>`_. Project DeepSpeech uses Google's `TensorFlow <https://www.tensorflow.org/>`_ to make the implementation easier. DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on `Baidu's Deep Speech research paper <https://arxiv.org/abs/1412.5567>`_. Project DeepSpeech uses Google's `TensorFlow <https://www.tensorflow.org/>`_ to make the implementation easier.
**NOTE:** This documentation applies to the **master version** of DeepSpeech only. **Documentation for all versions** is published on `deepspeech.readthedocs.io <http://deepspeech.readthedocs.io/?badge=latest>`_. Documentation for installation, usage, and training models is available on `deepspeech.readthedocs.io <http://deepspeech.readthedocs.io/?badge=latest>`_.
To install and use DeepSpeech all you have to do is: For the latest release, including pre-trained models and checkpoints, `see the latest release on GitHub <https://github.com/mozilla/DeepSpeech/releases/latest>`_.
.. code-block:: bash For contribution guidelines, see `CONTRIBUTING.rst <CONTRIBUTING.rst>`_.
# Create and activate a virtualenv For contact and support information, see `SUPPORT.rst <SUPPORT.rst>`_.
virtualenv -p python3 $HOME/tmp/deepspeech-venv/
source $HOME/tmp/deepspeech-venv/bin/activate
# Install DeepSpeech
pip3 install deepspeech
# Download pre-trained English model files
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.pbmm
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.scorer
# Download example audio files
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/audio-0.7.0.tar.gz
tar xvf audio-0.7.0.tar.gz
# Transcribe an audio file
deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio audio/2830-3980-0043.wav
A pre-trained English model is available for use and can be downloaded using `the instructions below <doc/USING.rst#using-a-pre-trained-model>`_. A package with some example audio files is available for download in our `release notes <https://github.com/mozilla/DeepSpeech/releases/latest>`_.
Quicker inference can be performed using a supported NVIDIA GPU on Linux. See the `release notes <https://github.com/mozilla/DeepSpeech/releases/latest>`_ to find which GPUs are supported. To run ``deepspeech`` on a GPU, install the GPU specific package:
.. code-block:: bash
# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-gpu-venv/
source $HOME/tmp/deepspeech-gpu-venv/bin/activate
# Install DeepSpeech CUDA enabled package
pip3 install deepspeech-gpu
# Transcribe an audio file.
deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio audio/2830-3980-0043.wav
Please ensure you have the required `CUDA dependencies <doc/USING.rst#cuda-dependency>`_.
See the output of ``deepspeech -h`` for more information on the use of ``deepspeech``. (If you experience problems running ``deepspeech``\ , please check `required runtime dependencies <native_client/README.rst#required-dependencies>`_\ ).
----
**Table of Contents**
* `Using a Pre-trained Model <doc/USING.rst#using-a-pre-trained-model>`_
* `CUDA dependency <doc/USING.rst#cuda-dependency>`_
* `Getting the pre-trained model <doc/USING.rst#getting-the-pre-trained-model>`_
* `Model compatibility <doc/USING.rst#model-compatibility>`_
* `Using the Python package <doc/USING.rst#using-the-python-package>`_
* `Using the Node.JS package <doc/USING.rst#using-the-nodejs-package>`_
* `Using the Command Line client <doc/USING.rst#using-the-command-line-client>`_
* `Installing bindings from source <doc/USING.rst#installing-bindings-from-source>`_
* `Third party bindings <doc/USING.rst#third-party-bindings>`_
* `Trying out DeepSpeech with examples <examples/README.rst>`_
* `Training your own Model <doc/TRAINING.rst#training-your-own-model>`_
* `Prerequisites for training a model <doc/TRAINING.rst#prerequisites-for-training-a-model>`_
* `Getting the training code <doc/TRAINING.rst#getting-the-training-code>`_
* `Installing Python dependencies <doc/TRAINING.rst#installing-python-dependencies>`_
* `Recommendations <doc/TRAINING.rst#recommendations>`_
* `Common Voice training data <doc/TRAINING.rst#common-voice-training-data>`_
* `Training a model <doc/TRAINING.rst#training-a-model>`_
* `Checkpointing <doc/TRAINING.rst#checkpointing>`_
* `Exporting a model for inference <doc/TRAINING.rst#exporting-a-model-for-inference>`_
* `Exporting a model for TFLite <doc/TRAINING.rst#exporting-a-model-for-tflite>`_
* `Making a mmap-able model for inference <doc/TRAINING.rst#making-a-mmap-able-model-for-inference>`_
* `Continuing training from a release model <doc/TRAINING.rst#continuing-training-from-a-release-model>`_
* `Training with Augmentation <doc/TRAINING.rst#training-with-augmentation>`_
* `Contribution guidelines <CONTRIBUTING.rst>`_
* `Contact/Getting Help <SUPPORT.rst>`_

@ -4,14 +4,10 @@ Contact/Getting Help
There are several ways to contact us or to get help: There are several ways to contact us or to get help:
#. #. `FAQ <https://github.com/mozilla/DeepSpeech/wiki#frequently-asked-questions>`_ - We have a list of common questions, and their answers, in our `FAQ <https://github.com/mozilla/DeepSpeech/wiki#frequently-asked-questions>`_. When just getting started, it's best to first check the `FAQ <https://github.com/mozilla/DeepSpeech/wiki#frequently-asked-questions>`_ to see if your question is addressed.
`\ **FAQ** <https://github.com/mozilla/DeepSpeech/wiki#frequently-asked-questions>`_ - We have a list of common questions, and their answers, in our `FAQ <https://github.com/mozilla/DeepSpeech/wiki#frequently-asked-questions>`_. When just getting started, it's best to first check the `FAQ <https://github.com/mozilla/DeepSpeech/wiki#frequently-asked-questions>`_ to see if your question is addressed.
#. #. `Discourse Forums <https://discourse.mozilla.org/c/deep-speech>`_ - If your question is not addressed in the `FAQ <https://github.com/mozilla/DeepSpeech/wiki#frequently-asked-questions>`_\ , the `Discourse Forums <https://discourse.mozilla.org/c/deep-speech>`_ is the next place to look. They contain conversations on `General Topics <https://discourse.mozilla.org/t/general-topics/21075>`_\ , `Using Deep Speech <https://discourse.mozilla.org/t/using-deep-speech/21076/4>`_\ , and `Deep Speech Development <https://discourse.mozilla.org/t/deep-speech-development/21077>`_.
`\ **Discourse Forums** <https://discourse.mozilla.org/c/deep-speech>`_ - If your question is not addressed in the `FAQ <https://github.com/mozilla/DeepSpeech/wiki#frequently-asked-questions>`_\ , the `Discourse Forums <https://discourse.mozilla.org/c/deep-speech>`_ is the next place to look. They contain conversations on `General Topics <https://discourse.mozilla.org/t/general-topics/21075>`_\ , `Using Deep Speech <https://discourse.mozilla.org/t/using-deep-speech/21076/4>`_\ , and `Deep Speech Development <https://discourse.mozilla.org/t/deep-speech-development/21077>`_.
#. #. `Matrix chat <https://chat.mozilla.org/#/room/#machinelearning:mozilla.org>`_ - If your question is not addressed by either the `FAQ <https://github.com/mozilla/DeepSpeech/wiki#frequently-asked-questions>`_ or `Discourse Forums <https://discourse.mozilla.org/c/deep-speech>`_\ , you can contact us on the ``#machinelearning`` channel on `Mozilla Matrix <https://chat.mozilla.org/#/room/#machinelearning:mozilla.org>`_\ ; people there can try to answer/help
`\ **Matrix chat** <https://chat.mozilla.org/#/room/#machinelearning:mozilla.org>`_ - If your question is not addressed by either the `FAQ <https://github.com/mozilla/DeepSpeech/wiki#frequently-asked-questions>`_ or `Discourse Forums <https://discourse.mozilla.org/c/deep-speech>`_\ , you can contact us on the ``#machinelearning`` channel on `Mozilla Matrix <https://chat.mozilla.org/#/room/#machinelearning:mozilla.org>`_\ ; people there can try to answer/help
#. #. `Issues <https://github.com/mozilla/deepspeech/issues>`_ - Finally, if all else fails, you can open an issue in our repo.
`\ **Issues** <https://github.com/mozilla/deepspeech/issues>`_ - Finally, if all else fails, you can open an issue in our repo.

@ -76,4 +76,4 @@ The character, '|' in this case, will then have to be replaced with spaces as a
Implementation Implementation
^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^
The decoder source code can be found in ``native_client/ctcdecode``. The decoder is included in the language bindings and clients. In addition, there is a separate Python module which includes just the decoder and is needed for evaluation. In order to build and install this package, see the :github:`native_client README <native_client/README.rst#install-the-ctc-decoder-package>`. The decoder source code can be found in ``native_client/ctcdecode``. The decoder is included in the language bindings and clients. In addition, there is a separate Python module which includes just the decoder and is needed for evaluation. A pre-built version of this package is automatically downloaded and installed when installing the training code. If you want to manually build and install it from source, see the :github:`native_client README <native_client/README.rst#install-the-ctc-decoder-package>`.

@ -1,3 +1,5 @@
.. _js-api-example:
JavaScript API Usage example JavaScript API Usage example
============================= =============================

@ -1,7 +1,9 @@
.. _py-api-example:
Python API Usage example Python API Usage example
======================== ========================
Examples are from `native_client/python/client.cc`. Examples are from `native_client/python/client.py`.
Creating a model instance and loading model Creating a model instance and loading model
------------------------------------------- -------------------------------------------

@ -65,7 +65,7 @@ If you have a capable (NVIDIA, at least 8GB of VRAM) GPU, it is highly recommend
pip3 uninstall tensorflow pip3 uninstall tensorflow
pip3 install 'tensorflow-gpu==1.15.2' pip3 install 'tensorflow-gpu==1.15.2'
Please ensure you have the required `CUDA dependency <USING.rst#cuda-dependency>`_. Please ensure you have the required :ref:`CUDA dependency <cuda-deps>`.
It has been reported for some people failure at training: It has been reported for some people failure at training:
@ -74,7 +74,7 @@ It has been reported for some people failure at training:
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node tower_0/conv1d/Conv2D}}]] [[{{node tower_0/conv1d/Conv2D}}]]
Setting the ``TF_FORCE_GPU_ALLOW_GROWTH`` environment variable to ``true`` seems to help in such cases. This could also be due to an incorrect version of libcudnn. Double check your versions with the `TensorFlow 1.15 documentation <USING.rst#cuda-dependency>`_. Setting the ``TF_FORCE_GPU_ALLOW_GROWTH`` environment variable to ``true`` seems to help in such cases. This could also be due to an incorrect version of libcudnn. Double check your versions with the :ref:`TensorFlow 1.15 documentation <cuda-deps>`.
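As a minimal sketch of the workaround above, the variable can be added to the environment of whatever process launches training. The helper name ``training_env`` is hypothetical and only illustrates the idea; the training command itself is deliberately left out.

```python
import os


def training_env(base=None):
    """Return a copy of the environment with GPU memory growth enabled.

    `base` defaults to the current process environment; passing a dict
    makes the helper easy to test without touching os.environ.
    """
    env = dict(os.environ if base is None else base)
    # The variable has to be present before the training process starts
    # using the GPU, so it is set on the environment passed to that process.
    env["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"
    return env
```

The returned dictionary can be passed as the ``env`` argument of ``subprocess.run`` when starting training. Exporting the variable in the shell before launching has the same effect.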
Common Voice training data Common Voice training data
^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^
@ -179,7 +179,7 @@ Exporting a model for inference
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If the ``--export_dir`` parameter is provided, a model will have been exported to this directory during training. If the ``--export_dir`` parameter is provided, a model will have been exported to this directory during training.
Refer to the corresponding :github:`README.rst <native_client/README.rst>` for information on building and running a client that can use the exported model. Refer to the :ref:`usage instructions <usage-docs>` for information on running a client that can use the exported model.
Exporting a model for TFLite Exporting a model for TFLite
^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@ -1,16 +1,19 @@
.. _usage-docs:
Using a Pre-trained Model Using a Pre-trained Model
========================= =========================
Inference using a DeepSpeech pre-trained model can be done with a client/language binding package. We have four clients/language bindings in this repository, listed below, and also a few community-maintained clients/language bindings in other repositories, listed `further down in this README <#third-party-bindings>`_. Inference using a DeepSpeech pre-trained model can be done with a client/language binding package. We have four clients/language bindings in this repository, listed below, and also a few community-maintained clients/language bindings in other repositories, listed `further down in this README <#third-party-bindings>`_.
* `The C API <c-usage>`.
* `The Python package/language binding <#using-the-python-package>`_ * :ref:`The Python package/language binding <py-usage>`
* `The Node.JS package/language binding <#using-the-nodejs-package>`_ * :ref:`The Node.JS package/language binding <nodejs-usage>`
* `The Command-Line client <#using-the-command-line-client>`_ * :ref:`The command-line client <cli-usage>`
* :github:`The .NET client/language binding <native_client/dotnet/README.rst>` * :github:`The .NET client/language binding <native_client/dotnet/README.rst>`
Running ``deepspeech`` might, see below, require some runtime dependencies to be already installed on your system: .. _runtime-deps:
Running ``deepspeech`` might, see below, require some runtime dependencies to be already installed on your system:
* ``sox`` - The Python and Node.JS clients use SoX to resample files to 16kHz. * ``sox`` - The Python and Node.JS clients use SoX to resample files to 16kHz.
* ``libgomp1`` - libsox (statically linked into the clients) depends on OpenMP. Some people have had to install this manually. * ``libgomp1`` - libsox (statically linked into the clients) depends on OpenMP. Some people have had to install this manually.
@ -20,6 +23,8 @@ Running ``deepspeech`` might, see below, require some runtime dependencies to be
Please refer to your system's documentation on how to install these dependencies. Please refer to your system's documentation on how to install these dependencies.
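A quick way to check for the ``sox`` executable before running a client is sketched below. The helper name ``missing_tools`` is hypothetical. Note that this only covers command-line tools found on ``PATH``; shared libraries such as ``libgomp1`` cannot be detected this way and still need to be checked through your system's package manager.

```python
import shutil


def missing_tools(tools=("sox",)):
    """Return the names in `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty list means every listed tool was found; any names returned still need to be installed.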
.. _cuda-deps:
CUDA dependency CUDA dependency
^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^
@ -40,6 +45,8 @@ Model compatibility
DeepSpeech models are versioned to keep you from trying to use an incompatible graph with a newer client after a breaking change was made to the code. If you get an error saying your model file version is too old for the client, you should either upgrade to a newer model release, re-export your model from the checkpoint using a newer version of the code, or downgrade your client if you need to use the old model and can't re-export it. DeepSpeech models are versioned to keep you from trying to use an incompatible graph with a newer client after a breaking change was made to the code. If you get an error saying your model file version is too old for the client, you should either upgrade to a newer model release, re-export your model from the checkpoint using a newer version of the code, or downgrade your client if you need to use the old model and can't re-export it.
.. _py-usage:
Using the Python package Using the Python package
^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^
@ -110,7 +117,9 @@ Note: the following command assumes you `downloaded the pre-trained model <#gett
The ``--scorer`` argument is optional, and represents an external language model to be used when transcribing the audio. The ``--scorer`` argument is optional, and represents an external language model to be used when transcribing the audio.
See :github:`client.py <native_client/python/client.py>` for an example of how to use the package programmatically. See :ref:`the Python client <py-api-example>` for an example of how to use the package programmatically.
.. _nodejs-usage:
Using the Node.JS / Electron.JS package Using the Node.JS / Electron.JS package
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@ -135,9 +144,11 @@ Alternatively, if you're using Linux and have a supported NVIDIA GPU, you can in
See the `release notes <https://github.com/mozilla/DeepSpeech/releases>`_ to find which GPUs are supported. Please ensure you have the required `CUDA dependency <#cuda-dependency>`_. See the `release notes <https://github.com/mozilla/DeepSpeech/releases>`_ to find which GPUs are supported. Please ensure you have the required `CUDA dependency <#cuda-dependency>`_.
See :github:`client.ts <native_client/javascript/client.ts>` for an example of how to use the bindings. See the :ref:`TypeScript client <js-api-example>` for an example of how to use the bindings programmatically.
Using the Command-Line client .. _cli-usage:
Using the command-line client
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
To download the pre-built binaries for the ``deepspeech`` command-line (compiled C++) client, use ``util/taskcluster.py``\ : To download the pre-built binaries for the ``deepspeech`` command-line (compiled C++) client, use ``util/taskcluster.py``\ :
@ -168,12 +179,12 @@ Note: the following command assumes you `downloaded the pre-trained model <#gett
./deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio audio_input.wav ./deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio audio_input.wav
See the help output with ``./deepspeech -h`` and the :github:`native client README <native_client/README.rst>` for more details. See the help output with ``./deepspeech -h`` for more details.
Installing bindings from source Installing bindings from source
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If pre-built binaries aren't available for your system, you'll need to install them from scratch. Follow these :github:`native client installation instructions <native_client/README.rst>`. If pre-built binaries aren't available for your system, you'll need to install them from scratch. Follow the :github:`native client build and installation instructions <native_client/README.rst>`.
Third party bindings Third party bindings
^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^

@ -6,6 +6,50 @@
Welcome to DeepSpeech's documentation! Welcome to DeepSpeech's documentation!
====================================== ======================================
DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on `Baidu's Deep Speech research paper <https://arxiv.org/abs/1412.5567>`_. Project DeepSpeech uses Google's `TensorFlow <https://www.tensorflow.org/>`_ to make the implementation easier.
To install and use DeepSpeech all you have to do is:
.. code-block:: bash
# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-venv/
source $HOME/tmp/deepspeech-venv/bin/activate
# Install DeepSpeech
pip3 install deepspeech
# Download pre-trained English model files
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.pbmm
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.scorer
# Download example audio files
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/audio-0.7.0.tar.gz
tar xvf audio-0.7.0.tar.gz
# Transcribe an audio file
deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio audio/2830-3980-0043.wav
A pre-trained English model is available for use and can be downloaded following the instructions in :ref:`the usage docs <usage-docs>`. For the latest release, including pre-trained models and checkpoints, `see the GitHub releases page <https://github.com/mozilla/DeepSpeech/releases/latest>`_.
Quicker inference can be performed using a supported NVIDIA GPU on Linux. See the `release notes <https://github.com/mozilla/DeepSpeech/releases/latest>`_ to find which GPUs are supported. To run ``deepspeech`` on a GPU, install the GPU specific package:
.. code-block:: bash
# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-gpu-venv/
source $HOME/tmp/deepspeech-gpu-venv/bin/activate
# Install DeepSpeech CUDA enabled package
pip3 install deepspeech-gpu
# Transcribe an audio file.
deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio audio/2830-3980-0043.wav
Please ensure you have the required :ref:`CUDA dependencies <cuda-deps>`.
See the output of ``deepspeech -h`` for more information on the use of ``deepspeech``. (If you experience problems running ``deepspeech``, please check :ref:`required runtime dependencies <runtime-deps>`).
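The transcription command shown above can also be driven from a script. The following is a minimal sketch that uses only the ``--model``, ``--scorer`` and ``--audio`` flags from the example. The function names are hypothetical, and it assumes the ``deepspeech`` executable is on ``PATH`` and writes the transcript to standard output.

```python
import subprocess


def deepspeech_command(model, audio, scorer=None):
    """Build the argument list for the `deepspeech` command-line client."""
    cmd = ["deepspeech", "--model", model]
    # The scorer (external language model) is optional.
    if scorer is not None:
        cmd += ["--scorer", scorer]
    cmd += ["--audio", audio]
    return cmd


def transcribe(model, audio, scorer=None):
    """Run the client and return whatever it printed to standard output."""
    result = subprocess.run(
        deepspeech_command(model, audio, scorer),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the client exits non-zero
    )
    return result.stdout.strip()
```

For example, ``transcribe("deepspeech-0.7.0-models.pbmm", "audio/2830-3980-0043.wav", "deepspeech-0.7.0-models.scorer")`` runs the same command as the quick start above.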
.. toctree:: .. toctree::
:maxdepth: 2 :maxdepth: 2
:caption: Introduction :caption: Introduction

@ -1,4 +1,4 @@
Full project description and documentation on GitHub: [https://github.com/mozilla/DeepSpeech](https://github.com/mozilla/DeepSpeech). Full project description and documentation on [https://deepspeech.readthedocs.io/](https://deepspeech.readthedocs.io/).
## Generating TypeScript Type Definitions ## Generating TypeScript Type Definitions

@ -1 +1 @@
Full project description and documentation on GitHub: `https://github.com/mozilla/DeepSpeech <https://github.com/mozilla/DeepSpeech>`_ Full project description and documentation on `https://deepspeech.readthedocs.io/ <https://deepspeech.readthedocs.io/>`_