From 69569aab0ba8df57cc49e12fffeba0854dcece6b Mon Sep 17 00:00:00 2001
From: josh
Date: Tue, 19 Mar 2019 19:00:16 +0100
Subject: [PATCH] import_cv2

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index fd94289c..1eb59c98 100644
--- a/README.md
+++ b/README.md
@@ -253,13 +253,13 @@ Please ensure you have the required [CUDA dependency](#cuda-dependency).
 
 The Common Voice corpus consists of voice samples that were donated through Mozilla's [Common Voice](https://voice.mozilla.org/) Initiative.
 
-We provide an importer (`bin/import_cv.py`) which automates downloading and preparing the Common Voice corpus as such:
+We provide an importer (`bin/import_cv2.py`) which automates downloading and preparing the Common Voice (v2.0) corpus as such:
 
 ```bash
-bin/import_cv.py path/to/target/directory
+bin/import_cv2.py path/to/target/directory
 ```
 
-If you already downloaded Common Voice from [here](https://voice.mozilla.org/data), simply run `bin/import_cv.py` on the directory where the corpus is located. The importer will detect that you've already downloaded the data and immediately proceed to unpackaging and importing. If you haven't downloaded the data already, `bin/import_cv.py` will download it for you and save to the path you've specified.
+If you already downloaded Common Voice from [here](https://voice.mozilla.org/data), simply run `bin/import_cv2.py` on the directory where the corpus is located. The importer will detect that you've already downloaded the data and immediately proceed to unpackaging and importing. If you haven't downloaded the data already, `bin/import_cv2.py` will download it for you and save to the path you've specified.
 
 Please be aware that training with the Common Voice corpus archive requires at least 70GB of free disk space and quite some time to conclude. As this process creates a huge number of small files, using an SSD drive is highly recommended. If the import script gets interrupted, it will try to continue from where it stopped the next time you run it. Unfortunately, there are some cases where it will need to start over. Once the import is done, the directory will contain a bunch of CSV files.
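As a quick sanity check after an import, the CSV files left in the target directory can be listed together with their row counts. The snippet below is only a minimal sketch: `summarize_csvs` is a hypothetical helper and not part of the importer, and it assumes each CSV file is UTF-8 encoded and starts with a single header row, which the text above does not confirm.

```python
import csv
from pathlib import Path


def summarize_csvs(target_dir):
    """Map each CSV file name in target_dir to its number of data rows.

    Hypothetical helper for illustration only. Assumes every file has
    one header row, which is skipped and not counted.
    """
    summary = {}
    for csv_path in sorted(Path(target_dir).glob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            next(reader, None)  # skip the assumed header row
            summary[csv_path.name] = sum(1 for _ in reader)
    return summary
```

Calling `summarize_csvs("path/to/target/directory")` returns a dictionary keyed by file name. An empty dictionary would suggest that the import did not finish or that a different directory was used.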