Merge branch 'master' into r0.9

Reuben Morais 2020-10-30 17:31:35 +01:00
commit f7e816c014
4 changed files with 10 additions and 7 deletions

View File

@@ -18,7 +18,7 @@ You've found a bug and you were able to squash it! Great job! Please write a sho
Documentation PR
^^^^^^^^^^^^^^^^
If you're just making updates or changes to the documentation, there's no need to run all of DeepSpeech's tests for Contiguous Itegration (i.e. Taskcluster tests). In this case, at the end of your short but clear commit message, you should add **X-DeepSpeech: NOBUILD**. This will trigger the CI tests to skip your PR, saving both time and compute.
If you're just making updates or changes to the documentation, there's no need to run all of DeepSpeech's tests for Continuous Integration (i.e. Taskcluster tests). In this case, at the end of your short but clear commit message, you should add **X-DeepSpeech: NOBUILD**. This will trigger the CI tests to skip your PR, saving both time and compute.
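For example, a documentation-only change could end its commit message like this (the summary line is invented for illustration)::

    Fix broken links in the training docs

    X-DeepSpeech: NOBUILD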
New Feature PR
^^^^^^^^^^^^^^
@@ -29,8 +29,9 @@ The DeepSpeech codebase is made of many connected parts. There is Python code fo
Whenever you add a new feature to DeepSpeech and want to contribute that feature back to the project, here are some things to keep in mind:
1. You've made changes to the core C++ code. You should minimally also make neccesary changes to the C client (i.e. **args.h** and **client.cc**). The bindings for Python, Java, and Javascript are SWIG generated, so you don't need to worry about these. The bindings for .NET and Swift are, however, not generated automatically. It would be best if you also made the necessary manual changes to these bindings as well, but don't worry if you are unable to do so.
2. You've made changes to the training Python code. Make sure you run a linter (described below).
1. You've made changes to the core C++ code. Core changes can have downstream effects on all parts of the DeepSpeech project, so keep that in mind. You should minimally also make necessary changes to the C client (i.e. **args.h** and **client.cc**). The bindings for Python, Java, and JavaScript are SWIG generated, and in the best-case scenario you won't have to worry about them. However, if you've added a whole new feature, you may need to make custom tweaks to those bindings, because SWIG may not automagically work with your new feature, especially if you've exposed new arguments. The bindings for .NET and Swift are not generated automatically. It would be best if you made the necessary manual changes to these bindings as well. It is best to communicate with the core DeepSpeech team and come to an understanding of where you will likely need to work with the bindings. They can't predict all the bugs you will run into, but they will have a good idea of how to plan for some obvious challenges.
2. You've made changes to the Python code. Make sure you run a linter (described below).
3. Make sure your new feature doesn't regress the project. If you've added a significant feature or amount of code, you want to be sure your new feature doesn't create performance issues. For example, if you've made a change to the DeepSpeech decoder, you should know that inference performance doesn't drop in terms of latency, accuracy, or memory usage. Unless you're proposing a new decoding algorithm, you probably don't have to worry about affecting accuracy. However, it's very possible you've affected latency or memory usage. You should run local performance tests to make sure no bugs have crept in. There are lots of tools to check latency and memory usage, and you should use whatever is most comfortable for you and gets the job done. If you're on Linux, you might find `perf <https://perf.wiki.kernel.org/index.php/Main_Page>`_ to be a useful tool. You can use the sample WAV files provided in the `DeepSpeech/data/` directory for testing.
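As a rough illustration of a local latency check, you can time repeated inference calls with the ``deepspeech`` Python package (a minimal sketch; the model and WAV paths are placeholders to substitute with your own)::

    import time
    import wave

    import numpy as np
    from deepspeech import Model

    # Placeholder paths: use your own exported model and any sample WAV
    # from the DeepSpeech/data/ directory.
    ds = Model("deepspeech-0.9.0-models.pbmm")

    with wave.open("data/smoke_test/LDC93S1.wav", "rb") as w:
        audio = np.frombuffer(w.readframes(w.getnframes()), np.int16)

    # Time several runs so one-off warmup cost doesn't dominate the result.
    timings = []
    for _ in range(5):
        start = time.perf_counter()
        ds.stt(audio)
        timings.append(time.perf_counter() - start)
    print("mean latency: %.3f s" % (sum(timings) / len(timings)))

Comparing numbers like these before and after your change gives a quick signal on latency; memory usage can be watched the same way with ``perf`` or your tool of choice.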
Python Linter
-------------

View File

@@ -3,7 +3,7 @@ Project DeepSpeech
.. image:: https://readthedocs.org/projects/deepspeech/badge/?version=latest
:target: http://deepspeech.readthedocs.io/?badge=latest
:target: https://deepspeech.readthedocs.io/?badge=latest
:alt: Documentation
@@ -12,9 +12,9 @@ Project DeepSpeech
:alt: Task Status
DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on `Baidu's Deep Speech research paper <https://arxiv.org/abs/1412.5567>`_. Project DeepSpeech uses Google's `TensorFlow <https://www.tensorflow.org/>`_ to make the implementation easier.
DeepSpeech is an open-source Speech-To-Text engine, using a model trained by machine learning techniques based on `Baidu's Deep Speech research paper <https://arxiv.org/abs/1412.5567>`_. Project DeepSpeech uses Google's `TensorFlow <https://www.tensorflow.org/>`_ to make the implementation easier.
Documentation for installation, usage, and training models are available on `deepspeech.readthedocs.io <http://deepspeech.readthedocs.io/?badge=latest>`_.
Documentation for installation, usage, and training models is available on `deepspeech.readthedocs.io <https://deepspeech.readthedocs.io/?badge=latest>`_.
For the latest release, including pre-trained models and checkpoints, `see the latest release on GitHub <https://github.com/mozilla/DeepSpeech/releases/latest>`_.

View File

@@ -27,6 +27,7 @@ from ds_ctcdecoder import Alphabet
FIELDNAMES = ["wav_filename", "wav_filesize", "transcript"]
SAMPLE_RATE = 16000
CHANNELS = 1
MAX_SECS = 10
PARAMS = None
FILTER_OBJ = None
@@ -179,7 +180,7 @@ def _preprocess_data(tsv_dir, audio_dir, space_after_every_character=False):
def _maybe_convert_wav(mp3_filename, wav_filename):
    if not os.path.exists(wav_filename):
        transformer = sox.Transformer()
        transformer.convert(samplerate=SAMPLE_RATE)
        transformer.convert(samplerate=SAMPLE_RATE, n_channels=CHANNELS)
        try:
            transformer.build(mp3_filename, wav_filename)
        except sox.core.SoxError:
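For context, making ``n_channels`` explicit means pysox downmixes any multi-channel MP3 to mono in the same pass that resamples to 16 kHz, the format the importer's WAVs are expected to have. A standalone sketch of the equivalent conversion (file names are placeholders)::

    import sox

    # Resample to 16 kHz and downmix to a single channel in one conversion.
    transformer = sox.Transformer()
    transformer.convert(samplerate=16000, n_channels=1)
    transformer.build("clip.mp3", "clip.wav")  # placeholder file names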

View File

@@ -2,6 +2,7 @@
import codecs
import os
import re
import sys
import tarfile
import threading
import unicodedata