diff --git a/README.rst b/README.rst
index 3e9320e5..17b849fa 100644
--- a/README.rst
+++ b/README.rst
@@ -14,82 +14,10 @@ Project DeepSpeech
 
 DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on `Baidu's Deep Speech research paper `_. Project DeepSpeech uses Google's `TensorFlow `_ to make the implementation easier.
 
-**NOTE:** This documentation applies to the **master version** of DeepSpeech only. **Documentation for all versions** is published on `deepspeech.readthedocs.io `_.
+Documentation for installation, usage, and training models is available on `deepspeech.readthedocs.io `_.
 
-To install and use DeepSpeech all you have to do is:
+For the latest release, including pre-trained models and checkpoints, `see the latest release on GitHub `_.
 
-.. code-block:: bash
+For contribution guidelines, see `CONTRIBUTING.rst `_.
 
-   # Create and activate a virtualenv
-   virtualenv -p python3 $HOME/tmp/deepspeech-venv/
-   source $HOME/tmp/deepspeech-venv/bin/activate
-
-   # Install DeepSpeech
-   pip3 install deepspeech
-
-   # Download pre-trained English model files
-   curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.pbmm
-   curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.scorer
-
-   # Download example audio files
-   curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/audio-0.7.0.tar.gz
-   tar xvf audio-0.7.0.tar.gz
-
-   # Transcribe an audio file
-   deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio audio/2830-3980-0043.wav
-
-A pre-trained English model is available for use and can be downloaded using `the instructions below `_. A package with some example audio files is available for download in our `release notes `_.
-
-Quicker inference can be performed using a supported NVIDIA GPU on Linux. See the `release notes `_ to find which GPUs are supported. To run ``deepspeech`` on a GPU, install the GPU specific package:
-
-.. code-block:: bash
-
-   # Create and activate a virtualenv
-   virtualenv -p python3 $HOME/tmp/deepspeech-gpu-venv/
-   source $HOME/tmp/deepspeech-gpu-venv/bin/activate
-
-   # Install DeepSpeech CUDA enabled package
-   pip3 install deepspeech-gpu
-
-   # Transcribe an audio file.
-   deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio audio/2830-3980-0043.wav
-
-Please ensure you have the required `CUDA dependencies `_.
-
-See the output of ``deepspeech -h`` for more information on the use of ``deepspeech``. (If you experience problems running ``deepspeech``\ , please check `required runtime dependencies `_\ ).
-
-----
-
-**Table of Contents**
-
-* `Using a Pre-trained Model `_
-
-  * `CUDA dependency `_
-  * `Getting the pre-trained model `_
-  * `Model compatibility `_
-  * `Using the Python package `_
-  * `Using the Node.JS package `_
-  * `Using the Command Line client `_
-  * `Installing bindings from source `_
-  * `Third party bindings `_
-
-
-* `Trying out DeepSpeech with examples `_
-
-* `Training your own Model `_
-
-  * `Prerequisites for training a model `_
-  * `Getting the training code `_
-  * `Installing Python dependencies `_
-  * `Recommendations `_
-  * `Common Voice training data `_
-  * `Training a model `_
-  * `Checkpointing `_
-  * `Exporting a model for inference `_
-  * `Exporting a model for TFLite `_
-  * `Making a mmap-able model for inference `_
-  * `Continuing training from a release model `_
-  * `Training with Augmentation `_
-
-* `Contribution guidelines `_
-* `Contact/Getting Help `_
+For contact and support information, see `SUPPORT.rst `_.
diff --git a/SUPPORT.rst b/SUPPORT.rst
index 8ef8ae11..c30e13a2 100644
--- a/SUPPORT.rst
+++ b/SUPPORT.rst
@@ -4,14 +4,10 @@ Contact/Getting Help
 
 There are several ways to contact us or to get help:
 
-#.
-   `\ **FAQ** `_ - We have a list of common questions, and their answers, in our `FAQ `_. When just getting started, it's best to first check the `FAQ `_ to see if your question is addressed.
+#. `FAQ `_ - We have a list of common questions, and their answers, in our `FAQ `_. When just getting started, it's best to first check the `FAQ `_ to see if your question is addressed.
 
-#.
-   `\ **Discourse Forums** `_ - If your question is not addressed in the `FAQ `_\ , the `Discourse Forums `_ is the next place to look. They contain conversations on `General Topics `_\ , `Using Deep Speech `_\ , and `Deep Speech Development `_.
+#. `Discourse Forums `_ - If your question is not addressed in the `FAQ `_\ , the `Discourse Forums `_ is the next place to look. They contain conversations on `General Topics `_\ , `Using Deep Speech `_\ , and `Deep Speech Development `_.
 
-#.
-   `\ **Matrix chat** `_ - If your question is not addressed by either the `FAQ `_ or `Discourse Forums `_\ , you can contact us on the ``#machinelearning`` channel on `Mozilla Matrix `_\ ; people there can try to answer/help
+#. `Matrix chat `_ - If your question is not addressed by either the `FAQ `_ or `Discourse Forums `_\ , you can contact us on the ``#machinelearning`` channel on `Mozilla Matrix `_\ ; people there can try to answer/help
 
-#.
-   `\ **Issues** `_ - Finally, if all else fails, you can open an issue in our repo.
+#. `Issues `_ - Finally, if all else fails, you can open an issue in our repo.
diff --git a/doc/Decoder.rst b/doc/Decoder.rst
index d7960fad..03cbd39d 100644
--- a/doc/Decoder.rst
+++ b/doc/Decoder.rst
@@ -76,4 +76,4 @@ The character, '|' in this case, will then have to be replaced with spaces as a
 Implementation
 ^^^^^^^^^^^^^^
 
-The decoder source code can be found in ``native_client/ctcdecode``. The decoder is included in the language bindings and clients. In addition, there is a separate Python module which includes just the decoder and is needed for evaluation. In order to build and install this package, see the :github:`native_client README `.
+The decoder source code can be found in ``native_client/ctcdecode``. The decoder is included in the language bindings and clients. In addition, there is a separate Python module which includes just the decoder and is needed for evaluation. A pre-built version of this package is automatically downloaded and installed when installing the training code. If you want to manually build and install it from source, see the :github:`native_client README `.
diff --git a/doc/NodeJS-Examples.rst b/doc/NodeJS-Examples.rst
index 9c1197a3..ef7e7761 100644
--- a/doc/NodeJS-Examples.rst
+++ b/doc/NodeJS-Examples.rst
@@ -1,3 +1,5 @@
+.. _js-api-example:
+
 JavaScript API Usage example
 =============================
 
diff --git a/doc/Python-Examples.rst b/doc/Python-Examples.rst
index 9bbc4a3b..e00ac722 100644
--- a/doc/Python-Examples.rst
+++ b/doc/Python-Examples.rst
@@ -1,7 +1,9 @@
+.. _py-api-example:
+
 Python API Usage example
 ========================
 
-Examples are from `native_client/python/client.cc`.
+Examples are from `native_client/python/client.py`.
 
 Creating a model instance and loading model
 -------------------------------------------
diff --git a/doc/TRAINING.rst b/doc/TRAINING.rst
index fecbbd53..3f0b584c 100644
--- a/doc/TRAINING.rst
+++ b/doc/TRAINING.rst
@@ -65,7 +65,7 @@ If you have a capable (NVIDIA, at least 8GB of VRAM) GPU, it is highly recommend
    pip3 uninstall tensorflow
    pip3 install 'tensorflow-gpu==1.15.2'
 
-Please ensure you have the required `CUDA dependency `_.
+Please ensure you have the required :ref:`CUDA dependency `.
 
 It has been reported for some people failure at training:
 
@@ -74,7 +74,7 @@ It has been reported for some people failure at training:
    tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
         [[{{node tower_0/conv1d/Conv2D}}]]
 
-Setting the ``TF_FORCE_GPU_ALLOW_GROWTH`` environment variable to ``true`` seems to help in such cases. This could also be due to an incorrect version of libcudnn. Double check your versions with the `TensorFlow 1.15 documentation `_.
+Setting the ``TF_FORCE_GPU_ALLOW_GROWTH`` environment variable to ``true`` seems to help in such cases. This could also be due to an incorrect version of libcudnn. Double check your versions with the :ref:`TensorFlow 1.15 documentation `.
 
 Common Voice training data
 ^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -179,7 +179,7 @@ Exporting a model for inference
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 If the ``--export_dir`` parameter is provided, a model will have been exported to this directory during training.
-Refer to the corresponding :github:`README.rst ` for information on building and running a client that can use the exported model.
+Refer to the :ref:`usage instructions ` for information on running a client that can use the exported model.
 
 Exporting a model for TFLite
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/doc/USING.rst b/doc/USING.rst
index 7a98813e..57ee279d 100644
--- a/doc/USING.rst
+++ b/doc/USING.rst
@@ -1,16 +1,19 @@
+.. _usage-docs:
+
 Using a Pre-trained Model
 =========================
 
 Inference using a DeepSpeech pre-trained model can be done with a client/language binding package. We have four clients/language bindings in this repository, listed below, and also a few community-maintained clients/language bindings in other repositories, listed `further down in this README <#third-party-bindings>`_.
 
-
-* `The Python package/language binding <#using-the-python-package>`_
-* `The Node.JS package/language binding <#using-the-nodejs-package>`_
-* `The Command-Line client <#using-the-command-line-client>`_
+* `The C API `.
+* :ref:`The Python package/language binding `
+* :ref:`The Node.JS package/language binding `
+* :ref:`The command-line client `
 * :github:`The .NET client/language binding `
 
-Running ``deepspeech`` might, see below, require some runtime dependencies to be already installed on your system:
+.. _runtime-deps:
 
+Running ``deepspeech`` might, see below, require some runtime dependencies to be already installed on your system:
 
 * ``sox`` - The Python and Node.JS clients use SoX to resample files to 16kHz.
 * ``libgomp1`` - libsox (statically linked into the clients) depends on OpenMP. Some people have had to install this manually.
@@ -20,6 +23,8 @@ Running ``deepspeech`` might, see below, require some runtime dependencies to be
 
 Please refer to your system's documentation on how to install these dependencies.
 
+.. _cuda-deps:
+
 CUDA dependency
 ^^^^^^^^^^^^^^^
 
@@ -40,6 +45,8 @@ Model compatibility
 
 DeepSpeech models are versioned to keep you from trying to use an incompatible graph with a newer client after a breaking change was made to the code. If you get an error saying your model file version is too old for the client, you should either upgrade to a newer model release, re-export your model from the checkpoint using a newer version of the code, or downgrade your client if you need to use the old model and can't re-export it.
 
+.. _py-usage:
+
 Using the Python package
 ^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -110,7 +117,9 @@ Note: the following command assumes you `downloaded the pre-trained model <#gett
 
 The ``--scorer`` argument is optional, and represents an external language model to be used when transcribing the audio.
 
-See :github:`client.py ` for an example of how to use the package programatically.
+See :ref:`the Python client ` for an example of how to use the package programatically.
+
+.. _nodejs-usage:
 
 Using the Node.JS / Electron.JS package
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -135,9 +144,11 @@ Alternatively, if you're using Linux and have a supported NVIDIA GPU, you can in
 
 See the `release notes `_ to find which GPUs are supported. Please ensure you have the required `CUDA dependency <#cuda-dependency>`_.
 
-See :github:`client.ts ` for an example of how to use the bindings.
+See the :ref:`TypeScript client ` for an example of how to use the bindings programatically.
 
-Using the Command-Line client
+.. _cli-usage:
+
+Using the command-line client
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 To download the pre-built binaries for the ``deepspeech`` command-line (compiled C++) client, use ``util/taskcluster.py``\ :
@@ -168,12 +179,12 @@ Note: the following command assumes you `downloaded the pre-trained model <#gett
 
    ./deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio audio_input.wav
 
-See the help output with ``./deepspeech -h`` and the :github:`native client README ` for more details.
+See the help output with ``./deepspeech -h`` for more details.
 
 Installing bindings from source
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-If pre-built binaries aren't available for your system, you'll need to install them from scratch. Follow these :github:`native client installation instructions `.
+If pre-built binaries aren't available for your system, you'll need to install them from scratch. Follow the :github:`native client build and installation instructions `.
 
 Third party bindings
 ^^^^^^^^^^^^^^^^^^^^
diff --git a/doc/index.rst b/doc/index.rst
index 9eb761e1..fbf1a620 100644
--- a/doc/index.rst
+++ b/doc/index.rst
@@ -6,6 +6,50 @@
 Welcome to DeepSpeech's documentation!
 ======================================
 
+DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on `Baidu's Deep Speech research paper `_. Project DeepSpeech uses Google's `TensorFlow `_ to make the implementation easier.
+
+To install and use DeepSpeech all you have to do is:
+
+.. code-block:: bash
+
+   # Create and activate a virtualenv
+   virtualenv -p python3 $HOME/tmp/deepspeech-venv/
+   source $HOME/tmp/deepspeech-venv/bin/activate
+
+   # Install DeepSpeech
+   pip3 install deepspeech
+
+   # Download pre-trained English model files
+   curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.pbmm
+   curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.scorer
+
+   # Download example audio files
+   curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/audio-0.7.0.tar.gz
+   tar xvf audio-0.7.0.tar.gz
+
+   # Transcribe an audio file
+   deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio audio/2830-3980-0043.wav
+
+A pre-trained English model is available for use and can be downloaded following the instructions in :ref:`the usage docs `. For the latest release, including pre-trained models and checkpoints, `see the GitHub releases page `_.
+
+Quicker inference can be performed using a supported NVIDIA GPU on Linux. See the `release notes `_ to find which GPUs are supported. To run ``deepspeech`` on a GPU, install the GPU specific package:
+
+.. code-block:: bash
+
+   # Create and activate a virtualenv
+   virtualenv -p python3 $HOME/tmp/deepspeech-gpu-venv/
+   source $HOME/tmp/deepspeech-gpu-venv/bin/activate
+
+   # Install DeepSpeech CUDA enabled package
+   pip3 install deepspeech-gpu
+
+   # Transcribe an audio file.
+   deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio audio/2830-3980-0043.wav
+
+Please ensure you have the required :ref:`CUDA dependencies `.
+
+See the output of ``deepspeech -h`` for more information on the use of ``deepspeech``. (If you experience problems running ``deepspeech``, please check :ref:`required runtime dependencies `).
+
 .. toctree::
    :maxdepth: 2
    :caption: Introduction
diff --git a/native_client/javascript/README.md b/native_client/javascript/README.md
index 267fbeba..39b291f6 100644
--- a/native_client/javascript/README.md
+++ b/native_client/javascript/README.md
@@ -1,4 +1,4 @@
-Full project description and documentation on GitHub: [https://github.com/mozilla/DeepSpeech](https://github.com/mozilla/DeepSpeech).
+Full project description and documentation on [https://deepspeech.readthedocs.io/](https://deepspeech.readthedocs.io/).
 
 ## Generating TypeScript Type Definitions
 
diff --git a/native_client/python/README.rst b/native_client/python/README.rst
index bde1e032..04d6bb29 100644
--- a/native_client/python/README.rst
+++ b/native_client/python/README.rst
@@ -1 +1 @@
-Full project description and documentation on GitHub: `https://github.com/mozilla/DeepSpeech `_
+Full project description and documentation on `https://deepspeech.readthedocs.io/ `_