More rebranding, API names, iOS, .NET

Kelly Davis 2021-03-07 14:29:02 +01:00
parent 136ca35ca2
commit 6d4d1a7153
112 changed files with 891 additions and 891 deletions

View File

@ -13,7 +13,7 @@ fi;
if [ -d "${COMPUTE_KEEP_DIR}" ]; then if [ -d "${COMPUTE_KEEP_DIR}" ]; then
checkpoint_dir=$COMPUTE_KEEP_DIR checkpoint_dir=$COMPUTE_KEEP_DIR
else else
checkpoint_dir=$(python -c 'from xdg import BaseDirectory as xdg; print(xdg.save_data_path("deepspeech/ldc93s1"))') checkpoint_dir=$(python -c 'from xdg import BaseDirectory as xdg; print(xdg.save_data_path("stt/ldc93s1"))')
fi fi
# Force only one visible device because we have a single-sample dataset # Force only one visible device because we have a single-sample dataset

View File

@ -60,7 +60,7 @@ Compile Coqui STT
----------------- -----------------
Compile ``libstt.so`` Compile ``libstt.so``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^
Within your TensorFlow directory, there should be a symbolic link to the 🐸STT ``native_client`` directory. If it is not present, create it with the following command: Within your TensorFlow directory, there should be a symbolic link to the 🐸STT ``native_client`` directory. If it is not present, create it with the following command:
@ -238,8 +238,8 @@ Due to the discontinuation of Bintray JCenter we do not have pre-built Android p
implementation 'stt.coqui.ai:libstt:VERSION@aar' implementation 'stt.coqui.ai:libstt:VERSION@aar'
Building ``libstt.so`` Building ``libstt.so`` for Android
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You can build the ``libstt.so`` using (ARMv7): You can build the ``libstt.so`` using (ARMv7):
@ -254,7 +254,7 @@ Or (ARM64):
bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic --config=android --config=android_arm64 --define=runtime=tflite --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++14 --copt=-D_GLIBCXX_USE_C99 //native_client:libstt.so bazel build --workspace_status_command="bash native_client/bazel_workspace_status_cmd.sh" --config=monolithic --config=android --config=android_arm64 --define=runtime=tflite --action_env ANDROID_NDK_API_LEVEL=21 --cxxopt=-std=c++14 --copt=-D_GLIBCXX_USE_C99 //native_client:libstt.so
Building ``libstt.aar`` Building ``libstt.aar``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^
In the unlikely event you have to rebuild the JNI bindings, source code is In the unlikely event you have to rebuild the JNI bindings, source code is
available under the ``libstt`` subdirectory. Building depends on shared available under the ``libstt`` subdirectory. Building depends on shared
@ -270,7 +270,7 @@ and adapt file naming (when missing, the error message should state what and adapt file naming (when missing, the error message should state what
filename it expects and where). filename it expects and where).
Building C++ ``stt`` binary Building C++ ``stt`` binary
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^
Building the ``stt`` binary will happen through ``ndk-build`` (ARMv7): Building the ``stt`` binary will happen through ``ndk-build`` (ARMv7):
@ -306,7 +306,7 @@ mono 16kHz 16-bit file and it might fail on some WAVE files that do not
exactly follow the specification. exactly follow the specification.
Running ``stt`` via adb Running ``stt`` via adb
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^
You should use ``adb push`` to send data to the device; please refer to the Android You should use ``adb push`` to send data to the device; please refer to the Android
documentation on how to use it. documentation on how to use it.
@ -349,7 +349,7 @@ to leverage GPU / DSP / NPU * Hexagon, the Qualcomm-specific DSP
This is highly experimental: This is highly experimental:
* Requires passing environment variable ``DS_TFLITE_DELEGATE`` with values of * Requires passing environment variable ``STT_TFLITE_DELEGATE`` with values of
``gpu``, ``nnapi`` or ``hexagon`` (only one at a time) ``gpu``, ``nnapi`` or ``hexagon`` (only one at a time)
* Might require exported model changes (some ops might not be supported) * Might require exported model changes (some ops might not be supported)
* We can't guarantee it will work, nor that it will be faster than the default * We can't guarantee it will work, nor that it will be faster than the default

View File

@ -10,65 +10,65 @@ C API
See also the list of error codes including descriptions for each error in :ref:`error-codes`. See also the list of error codes including descriptions for each error in :ref:`error-codes`.
.. doxygenfunction:: DS_CreateModel .. doxygenfunction:: STT_CreateModel
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_FreeModel .. doxygenfunction:: STT_FreeModel
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_EnableExternalScorer .. doxygenfunction:: STT_EnableExternalScorer
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_DisableExternalScorer .. doxygenfunction:: STT_DisableExternalScorer
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_AddHotWord .. doxygenfunction:: STT_AddHotWord
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_EraseHotWord .. doxygenfunction:: STT_EraseHotWord
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_ClearHotWords .. doxygenfunction:: STT_ClearHotWords
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_SetScorerAlphaBeta .. doxygenfunction:: STT_SetScorerAlphaBeta
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_GetModelSampleRate .. doxygenfunction:: STT_GetModelSampleRate
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_SpeechToText .. doxygenfunction:: STT_SpeechToText
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_SpeechToTextWithMetadata .. doxygenfunction:: STT_SpeechToTextWithMetadata
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_CreateStream .. doxygenfunction:: STT_CreateStream
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_FeedAudioContent .. doxygenfunction:: STT_FeedAudioContent
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_IntermediateDecode .. doxygenfunction:: STT_IntermediateDecode
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_IntermediateDecodeWithMetadata .. doxygenfunction:: STT_IntermediateDecodeWithMetadata
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_FinishStream .. doxygenfunction:: STT_FinishStream
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_FinishStreamWithMetadata .. doxygenfunction:: STT_FinishStreamWithMetadata
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_FreeStream .. doxygenfunction:: STT_FreeStream
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_FreeMetadata .. doxygenfunction:: STT_FreeMetadata
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_FreeString .. doxygenfunction:: STT_FreeString
:project: deepspeech-c :project: stt-c
.. doxygenfunction:: DS_Version .. doxygenfunction:: STT_Version
:project: deepspeech-c :project: stt-c
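Read together, the renamed functions above cover the full model lifecycle. A minimal batch-inference sketch against the renamed C API (the ``coqui-stt.h`` header name and the model/scorer paths are assumptions, not taken from this commit):

    #include <stdio.h>
    #include "coqui-stt.h"               /* assumed header name after the rename */

    /* Create a model, optionally attach a scorer, decode one buffer, clean up. */
    int run_batch(const short* audio, unsigned int num_samples) {
      ModelState* ctx = NULL;
      int status = STT_CreateModel("model.tflite", &ctx);        /* hypothetical path */
      if (status != STT_ERR_OK) {
        char* msg = STT_ErrorCodeToErrorMessage(status);
        fprintf(stderr, "Could not create model: %s\n", msg);
        STT_FreeString(msg);
        return 1;
      }
      STT_EnableExternalScorer(ctx, "large.scorer");             /* optional, hypothetical path */
      char* text = STT_SpeechToText(ctx, audio, num_samples);
      if (text) {
        printf("%s\n", text);
        STT_FreeString(text);
      }
      STT_FreeModel(ctx);
      return 0;
    }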

View File

@ -2,18 +2,18 @@
============== ==============
DeepSpeech Class STT Class
---------------- ----------------
.. doxygenclass:: DeepSpeechClient::DeepSpeech .. doxygenclass:: STTClient::STT
:project: deepspeech-dotnet :project: stt-dotnet
:members: :members:
DeepSpeechStream Class Stream Class
---------------------- ----------------------
.. doxygenclass:: DeepSpeechClient::Models::DeepSpeechStream .. doxygenclass:: STTClient::Models::Stream
:project: deepspeech-dotnet :project: stt-dotnet
:members: :members:
ErrorCodes ErrorCodes
@ -21,33 +21,33 @@ ErrorCodes
See also the main definition including descriptions for each error in :ref:`error-codes`. See also the main definition including descriptions for each error in :ref:`error-codes`.
.. doxygenenum:: DeepSpeechClient::Enums::ErrorCodes .. doxygenenum:: STTClient::Enums::ErrorCodes
:project: deepspeech-dotnet :project: stt-dotnet
Metadata Metadata
-------- --------
.. doxygenclass:: DeepSpeechClient::Models::Metadata .. doxygenclass:: STTClient::Models::Metadata
:project: deepspeech-dotnet :project: stt-dotnet
:members: Transcripts :members: Transcripts
CandidateTranscript CandidateTranscript
------------------- -------------------
.. doxygenclass:: DeepSpeechClient::Models::CandidateTranscript .. doxygenclass:: STTClient::Models::CandidateTranscript
:project: deepspeech-dotnet :project: stt-dotnet
:members: Tokens, Confidence :members: Tokens, Confidence
TokenMetadata TokenMetadata
------------- -------------
.. doxygenclass:: DeepSpeechClient::Models::TokenMetadata .. doxygenclass:: STTClient::Models::TokenMetadata
:project: deepspeech-dotnet :project: stt-dotnet
:members: Text, Timestep, StartTime :members: Text, Timestep, StartTime
DeepSpeech Interface STT Interface
-------------------- --------------------
.. doxygeninterface:: DeepSpeechClient::Interfaces::IDeepSpeech .. doxygeninterface:: STTClient::Interfaces::ISTT
:project: deepspeech-dotnet :project: stt-dotnet
:members: :members:

View File

@ -1,12 +1,12 @@
.NET API Usage example .NET API Usage example
====================== ======================
Examples are from `native_client/dotnet/DeepSpeechConsole/Program.cs`. Examples are from `native_client/dotnet/STTConsole/Program.cs`.
Creating a model instance and loading model Creating a model instance and loading model
------------------------------------------- -------------------------------------------
.. literalinclude:: ../native_client/dotnet/DeepSpeechConsole/Program.cs .. literalinclude:: ../native_client/dotnet/STTConsole/Program.cs
:language: csharp :language: csharp
:linenos: :linenos:
:lineno-match: :lineno-match:
@ -16,7 +16,7 @@ Creating a model instance and loading model
Performing inference Performing inference
-------------------- --------------------
.. literalinclude:: ../native_client/dotnet/DeepSpeechConsole/Program.cs .. literalinclude:: ../native_client/dotnet/STTConsole/Program.cs
:language: csharp :language: csharp
:linenos: :linenos:
:lineno-match: :lineno-match:
@ -26,4 +26,4 @@ Performing inference
Full source code Full source code
---------------- ----------------
See :download:`Full source code<../native_client/dotnet/DeepSpeechConsole/Program.cs>`. See :download:`Full source code<../native_client/dotnet/STTConsole/Program.cs>`.

View File

@ -1,29 +1,29 @@
Java Java
==== ====
DeepSpeechModel STTModel
--------------- ---------------
.. doxygenclass:: org::deepspeech::libdeepspeech::DeepSpeechModel .. doxygenclass:: ai::coqui::libstt::STTModel
:project: deepspeech-java :project: stt-java
:members: :members:
Metadata Metadata
-------- --------
.. doxygenclass:: org::deepspeech::libdeepspeech::Metadata .. doxygenclass:: ai::coqui::libstt::Metadata
:project: deepspeech-java :project: stt-java
:members: getNumTranscripts, getTranscript :members: getNumTranscripts, getTranscript
CandidateTranscript CandidateTranscript
------------------- -------------------
.. doxygenclass:: org::deepspeech::libdeepspeech::CandidateTranscript .. doxygenclass:: ai::coqui::libstt::CandidateTranscript
:project: deepspeech-java :project: stt-java
:members: getNumTokens, getConfidence, getToken :members: getNumTokens, getConfidence, getToken
TokenMetadata TokenMetadata
------------- -------------
.. doxygenclass:: org::deepspeech::libdeepspeech::TokenMetadata .. doxygenclass:: ai::coqui::libstt::TokenMetadata
:project: deepspeech-java :project: stt-java
:members: getText, getTimestep, getStartTime :members: getText, getTimestep, getStartTime

View File

@ -1,12 +1,12 @@
Java API Usage example Java API Usage example
====================== ======================
Examples are from `native_client/java/app/src/main/java/org/deepspeech/DeepSpeechActivity.java`. Examples are from `native_client/java/app/src/main/java/ai/coqui/STTActivity.java`.
Creating a model instance and loading model Creating a model instance and loading model
------------------------------------------- -------------------------------------------
.. literalinclude:: ../native_client/java/app/src/main/java/org/deepspeech/DeepSpeechActivity.java .. literalinclude:: ../native_client/java/app/src/main/java/ai/coqui/STTActivity.java
:language: java :language: java
:linenos: :linenos:
:lineno-match: :lineno-match:
@ -16,7 +16,7 @@ Creating a model instance and loading model
Performing inference Performing inference
-------------------- --------------------
.. literalinclude:: ../native_client/java/app/src/main/java/org/deepspeech/DeepSpeechActivity.java .. literalinclude:: ../native_client/java/app/src/main/java/ai/coqui/STTActivity.java
:language: java :language: java
:linenos: :linenos:
:lineno-match: :lineno-match:
@ -26,4 +26,4 @@ Performing inference
Full source code Full source code
---------------- ----------------
See :download:`Full source code<../native_client/java/app/src/main/java/org/deepspeech/DeepSpeechActivity.java>`. See :download:`Full source code<../native_client/java/app/src/main/java/ai/coqui/STTActivity.java>`.

View File

@ -9,61 +9,61 @@ Linux / AMD64 without GPU
^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^
* x86-64 CPU with AVX/FMA (one can rebuild without AVX/FMA, but it might slow down inference) * x86-64 CPU with AVX/FMA (one can rebuild without AVX/FMA, but it might slow down inference)
* Ubuntu 14.04+ (glibc >= 2.19, libstdc++6 >= 4.8) * Ubuntu 14.04+ (glibc >= 2.19, libstdc++6 >= 4.8)
* Full TensorFlow runtime (``deepspeech`` packages) * Full TensorFlow runtime (``stt`` packages)
* TensorFlow Lite runtime (``deepspeech-tflite`` packages) * TensorFlow Lite runtime (``stt-tflite`` packages)
Linux / AMD64 with GPU Linux / AMD64 with GPU
^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^
* x86-64 CPU with AVX/FMA (one can rebuild without AVX/FMA, but it might slow down inference) * x86-64 CPU with AVX/FMA (one can rebuild without AVX/FMA, but it might slow down inference)
* Ubuntu 14.04+ (glibc >= 2.19, libstdc++6 >= 4.8) * Ubuntu 14.04+ (glibc >= 2.19, libstdc++6 >= 4.8)
* CUDA 10.0 (and capable GPU) * CUDA 10.0 (and capable GPU)
* Full TensorFlow runtime (``deepspeech`` packages) * Full TensorFlow runtime (``stt`` packages)
* TensorFlow Lite runtime (``deepspeech-tflite`` packages) * TensorFlow Lite runtime (``stt-tflite`` packages)
Linux / ARMv7 Linux / ARMv7
^^^^^^^^^^^^^ ^^^^^^^^^^^^^
* Cortex-A53 compatible ARMv7 SoC with Neon support * Cortex-A53 compatible ARMv7 SoC with Neon support
* Raspbian Buster-compatible distribution * Raspbian Buster-compatible distribution
* TensorFlow Lite runtime (``deepspeech-tflite`` packages) * TensorFlow Lite runtime (``stt-tflite`` packages)
Linux / Aarch64 Linux / Aarch64
^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^
* Cortex-A72 compatible Aarch64 SoC * Cortex-A72 compatible Aarch64 SoC
* ARMbian Buster-compatible distribution * ARMbian Buster-compatible distribution
* TensorFlow Lite runtime (``deepspeech-tflite`` packages) * TensorFlow Lite runtime (``stt-tflite`` packages)
Android / ARMv7 Android / ARMv7
^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^
* ARMv7 SoC with Neon support * ARMv7 SoC with Neon support
* Android 7.0-10.0 * Android 7.0-10.0
* NDK API level >= 21 * NDK API level >= 21
* TensorFlow Lite runtime (``deepspeech-tflite`` packages) * TensorFlow Lite runtime (``stt-tflite`` packages)
Android / Aarch64 Android / Aarch64
^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^
* Aarch64 SoC * Aarch64 SoC
* Android 7.0-10.0 * Android 7.0-10.0
* NDK API level >= 21 * NDK API level >= 21
* TensorFlow Lite runtime (``deepspeech-tflite`` packages) * TensorFlow Lite runtime (``stt-tflite`` packages)
macOS / AMD64 macOS / AMD64
^^^^^^^^^^^^^ ^^^^^^^^^^^^^
* x86-64 CPU with AVX/FMA (one can rebuild without AVX/FMA, but it might slow down inference) * x86-64 CPU with AVX/FMA (one can rebuild without AVX/FMA, but it might slow down inference)
* macOS >= 10.10 * macOS >= 10.10
* Full TensorFlow runtime (``deepspeech`` packages) * Full TensorFlow runtime (``stt`` packages)
* TensorFlow Lite runtime (``deepspeech-tflite`` packages) * TensorFlow Lite runtime (``stt-tflite`` packages)
Windows / AMD64 without GPU Windows / AMD64 without GPU
^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^
* x86-64 CPU with AVX/FMA (one can rebuild without AVX/FMA, but it might slow down inference) * x86-64 CPU with AVX/FMA (one can rebuild without AVX/FMA, but it might slow down inference)
* Windows Server >= 2012 R2 ; Windows >= 8.1 * Windows Server >= 2012 R2 ; Windows >= 8.1
* Full TensorFlow runtime (``deepspeech`` packages) * Full TensorFlow runtime (``stt`` packages)
* TensorFlow Lite runtime (``deepspeech-tflite`` packages) * TensorFlow Lite runtime (``stt-tflite`` packages)
Windows / AMD64 with GPU Windows / AMD64 with GPU
^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^
* x86-64 CPU with AVX/FMA (one can rebuild without AVX/FMA, but it might slow down inference) * x86-64 CPU with AVX/FMA (one can rebuild without AVX/FMA, but it might slow down inference)
* Windows Server >= 2012 R2 ; Windows >= 8.1 * Windows Server >= 2012 R2 ; Windows >= 8.1
* CUDA 10.0 (and capable GPU) * CUDA 10.0 (and capable GPU)
* Full TensorFlow runtime (``deepspeech`` packages) * Full TensorFlow runtime (``stt`` packages)
* TensorFlow Lite runtime (``deepspeech-tflite`` packages) * TensorFlow Lite runtime (``stt-tflite`` packages)

View File

@ -5,19 +5,19 @@ Metadata
-------- --------
.. doxygenstruct:: Metadata .. doxygenstruct:: Metadata
:project: deepspeech-c :project: stt-c
:members: :members:
CandidateTranscript CandidateTranscript
------------------- -------------------
.. doxygenstruct:: CandidateTranscript .. doxygenstruct:: CandidateTranscript
:project: deepspeech-c :project: stt-c
:members: :members:
TokenMetadata TokenMetadata
------------- -------------
.. doxygenstruct:: TokenMetadata .. doxygenstruct:: TokenMetadata
:project: deepspeech-c :project: stt-c
:members: :members:
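As a rough illustration of how these structs nest, a small C sketch that walks a ``Metadata`` result (field names are taken from the C header rather than from this excerpt, so treat them as assumptions; the ``coqui-stt.h`` name is also assumed):

    #include <stdio.h>
    #include "coqui-stt.h"  /* assumed header name */

    /* Print every candidate transcript with per-token start times. */
    void print_metadata(const Metadata* m) {
      for (unsigned int i = 0; i < m->num_transcripts; ++i) {
        const CandidateTranscript* t = &m->transcripts[i];
        printf("candidate %u (confidence %.3f): ", i, t->confidence);
        for (unsigned int j = 0; j < t->num_tokens; ++j) {
          printf("%s[%.2fs] ", t->tokens[j].text, t->tokens[j].start_time);
        }
        printf("\n");
      }
    }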

View File

@ -81,9 +81,9 @@ extensions = [
breathe_projects = { breathe_projects = {
"deepspeech-c": "xml-c/", "stt-c": "xml-c/",
"deepspeech-java": "xml-java/", "stt-java": "xml-java/",
"deepspeech-dotnet": "xml-dotnet/", "stt-dotnet": "xml-dotnet/",
} }
js_source_path = "../native_client/javascript/index.ts" js_source_path = "../native_client/javascript/index.ts"

View File

@ -5,6 +5,6 @@
fun:_Znwm fun:_Znwm
fun:_ZN6tflite20DefaultErrorReporterEv fun:_ZN6tflite20DefaultErrorReporterEv
fun:_ZN16TFLiteModelState4initEPKc fun:_ZN16TFLiteModelState4initEPKc
fun:DS_CreateModel fun:STT_CreateModel
fun:main fun:main
} }

View File

@ -815,7 +815,7 @@
fun:_ZN6Scorer9load_trieERSt14basic_ifstreamIcSt11char_traitsIcEERKNSt7__cxx1112basic_stringIcS2_SaIcEEE fun:_ZN6Scorer9load_trieERSt14basic_ifstreamIcSt11char_traitsIcEERKNSt7__cxx1112basic_stringIcS2_SaIcEEE
fun:_ZN6Scorer7load_lmERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE fun:_ZN6Scorer7load_lmERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
fun:_ZN6Scorer4initERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERK8Alphabet fun:_ZN6Scorer4initERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERK8Alphabet
fun:DS_EnableExternalScorer fun:STT_EnableExternalScorer
fun:main fun:main
} }
{ {
@ -831,7 +831,7 @@
fun:_ZN6Scorer9load_trieERSt14basic_ifstreamIcSt11char_traitsIcEERKNSt7__cxx1112basic_stringIcS2_SaIcEEE fun:_ZN6Scorer9load_trieERSt14basic_ifstreamIcSt11char_traitsIcEERKNSt7__cxx1112basic_stringIcS2_SaIcEEE
fun:_ZN6Scorer7load_lmERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE fun:_ZN6Scorer7load_lmERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
fun:_ZN6Scorer4initERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERK8Alphabet fun:_ZN6Scorer4initERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERK8Alphabet
fun:DS_EnableExternalScorer fun:STT_EnableExternalScorer
fun:main fun:main
} }
{ {

View File

@ -64,9 +64,9 @@ void PrintHelp(const char* bin)
"\t--hot_words\t\t\tHot-words and their boosts. Word:Boost pairs are comma-separated\n" "\t--hot_words\t\t\tHot-words and their boosts. Word:Boost pairs are comma-separated\n"
"\t--help\t\t\t\tShow help\n" "\t--help\t\t\t\tShow help\n"
"\t--version\t\t\tPrint version and exits\n"; "\t--version\t\t\tPrint version and exits\n";
char* version = DS_Version(); char* version = STT_Version();
std::cerr << "Coqui STT " << version << "\n"; std::cerr << "Coqui STT " << version << "\n";
DS_FreeString(version); STT_FreeString(version);
exit(1); exit(1);
} }
@ -169,9 +169,9 @@ bool ProcessArgs(int argc, char** argv)
} }
if (has_versions) { if (has_versions) {
char* version = DS_Version(); char* version = STT_Version();
std::cout << "Coqui " << version << "\n"; std::cout << "Coqui " << version << "\n";
DS_FreeString(version); STT_FreeString(version);
return false; return false;
} }

View File

@ -168,17 +168,17 @@ LocalDsSTT(ModelState* aCtx, const short* aBuffer, size_t aBufferSize,
// sphinx-doc: c_ref_inference_start // sphinx-doc: c_ref_inference_start
if (extended_output) { if (extended_output) {
Metadata *result = DS_SpeechToTextWithMetadata(aCtx, aBuffer, aBufferSize, 1); Metadata *result = STT_SpeechToTextWithMetadata(aCtx, aBuffer, aBufferSize, 1);
res.string = CandidateTranscriptToString(&result->transcripts[0]); res.string = CandidateTranscriptToString(&result->transcripts[0]);
DS_FreeMetadata(result); STT_FreeMetadata(result);
} else if (json_output) { } else if (json_output) {
Metadata *result = DS_SpeechToTextWithMetadata(aCtx, aBuffer, aBufferSize, json_candidate_transcripts); Metadata *result = STT_SpeechToTextWithMetadata(aCtx, aBuffer, aBufferSize, json_candidate_transcripts);
res.string = MetadataToJSON(result); res.string = MetadataToJSON(result);
DS_FreeMetadata(result); STT_FreeMetadata(result);
} else if (stream_size > 0) { } else if (stream_size > 0) {
StreamingState* ctx; StreamingState* ctx;
int status = DS_CreateStream(aCtx, &ctx); int status = STT_CreateStream(aCtx, &ctx);
if (status != DS_ERR_OK) { if (status != STT_ERR_OK) {
res.string = strdup(""); res.string = strdup("");
return res; return res;
} }
@ -187,28 +187,28 @@ LocalDsSTT(ModelState* aCtx, const short* aBuffer, size_t aBufferSize,
const char *prev = nullptr; const char *prev = nullptr;
while (off < aBufferSize) { while (off < aBufferSize) {
size_t cur = aBufferSize - off > stream_size ? stream_size : aBufferSize - off; size_t cur = aBufferSize - off > stream_size ? stream_size : aBufferSize - off;
DS_FeedAudioContent(ctx, aBuffer + off, cur); STT_FeedAudioContent(ctx, aBuffer + off, cur);
off += cur; off += cur;
prev = last; prev = last;
const char* partial = DS_IntermediateDecode(ctx); const char* partial = STT_IntermediateDecode(ctx);
if (last == nullptr || strcmp(last, partial)) { if (last == nullptr || strcmp(last, partial)) {
printf("%s\n", partial); printf("%s\n", partial);
last = partial; last = partial;
} else { } else {
DS_FreeString((char *) partial); STT_FreeString((char *) partial);
} }
if (prev != nullptr && prev != last) { if (prev != nullptr && prev != last) {
DS_FreeString((char *) prev); STT_FreeString((char *) prev);
} }
} }
if (last != nullptr) { if (last != nullptr) {
DS_FreeString((char *) last); STT_FreeString((char *) last);
} }
res.string = DS_FinishStream(ctx); res.string = STT_FinishStream(ctx);
} else if (extended_stream_size > 0) { } else if (extended_stream_size > 0) {
StreamingState* ctx; StreamingState* ctx;
int status = DS_CreateStream(aCtx, &ctx); int status = STT_CreateStream(aCtx, &ctx);
if (status != DS_ERR_OK) { if (status != STT_ERR_OK) {
res.string = strdup(""); res.string = strdup("");
return res; return res;
} }
@ -217,10 +217,10 @@ LocalDsSTT(ModelState* aCtx, const short* aBuffer, size_t aBufferSize,
const char *prev = nullptr; const char *prev = nullptr;
while (off < aBufferSize) { while (off < aBufferSize) {
size_t cur = aBufferSize - off > extended_stream_size ? extended_stream_size : aBufferSize - off; size_t cur = aBufferSize - off > extended_stream_size ? extended_stream_size : aBufferSize - off;
DS_FeedAudioContent(ctx, aBuffer + off, cur); STT_FeedAudioContent(ctx, aBuffer + off, cur);
off += cur; off += cur;
prev = last; prev = last;
const Metadata* result = DS_IntermediateDecodeWithMetadata(ctx, 1); const Metadata* result = STT_IntermediateDecodeWithMetadata(ctx, 1);
const char* partial = CandidateTranscriptToString(&result->transcripts[0]); const char* partial = CandidateTranscriptToString(&result->transcripts[0]);
if (last == nullptr || strcmp(last, partial)) { if (last == nullptr || strcmp(last, partial)) {
printf("%s\n", partial); printf("%s\n", partial);
@ -231,14 +231,14 @@ LocalDsSTT(ModelState* aCtx, const short* aBuffer, size_t aBufferSize,
if (prev != nullptr && prev != last) { if (prev != nullptr && prev != last) {
free((char *) prev); free((char *) prev);
} }
DS_FreeMetadata((Metadata *)result); STT_FreeMetadata((Metadata *)result);
} }
const Metadata* result = DS_FinishStreamWithMetadata(ctx, 1); const Metadata* result = STT_FinishStreamWithMetadata(ctx, 1);
res.string = CandidateTranscriptToString(&result->transcripts[0]); res.string = CandidateTranscriptToString(&result->transcripts[0]);
DS_FreeMetadata((Metadata *)result); STT_FreeMetadata((Metadata *)result);
free((char *) last); free((char *) last);
} else { } else {
res.string = DS_SpeechToText(aCtx, aBuffer, aBufferSize); res.string = STT_SpeechToText(aCtx, aBuffer, aBufferSize);
} }
// sphinx-doc: c_ref_inference_stop // sphinx-doc: c_ref_inference_stop
@ -404,7 +404,7 @@ GetAudioBuffer(const char* path, int desired_sample_rate)
void void
ProcessFile(ModelState* context, const char* path, bool show_times) ProcessFile(ModelState* context, const char* path, bool show_times)
{ {
ds_audio_buffer audio = GetAudioBuffer(path, DS_GetModelSampleRate(context)); ds_audio_buffer audio = GetAudioBuffer(path, STT_GetModelSampleRate(context));
// Pass audio to STT // Pass audio to STT
// We take half of buffer_size because buffer is a char* while // We take half of buffer_size because buffer is a char* while
@ -418,7 +418,7 @@ ProcessFile(ModelState* context, const char* path, bool show_times)
if (result.string) { if (result.string) {
printf("%s\n", result.string); printf("%s\n", result.string);
DS_FreeString((char*)result.string); STT_FreeString((char*)result.string);
} }
if (show_times) { if (show_times) {
@ -453,16 +453,16 @@ main(int argc, char **argv)
// Initialise STT // Initialise STT
ModelState* ctx; ModelState* ctx;
// sphinx-doc: c_ref_model_start // sphinx-doc: c_ref_model_start
int status = DS_CreateModel(model, &ctx); int status = STT_CreateModel(model, &ctx);
if (status != 0) { if (status != 0) {
char* error = DS_ErrorCodeToErrorMessage(status); char* error = STT_ErrorCodeToErrorMessage(status);
fprintf(stderr, "Could not create model: %s\n", error); fprintf(stderr, "Could not create model: %s\n", error);
free(error); free(error);
return 1; return 1;
} }
if (set_beamwidth) { if (set_beamwidth) {
status = DS_SetModelBeamWidth(ctx, beam_width); status = STT_SetModelBeamWidth(ctx, beam_width);
if (status != 0) { if (status != 0) {
fprintf(stderr, "Could not set model beam width.\n"); fprintf(stderr, "Could not set model beam width.\n");
return 1; return 1;
@ -470,13 +470,13 @@ main(int argc, char **argv)
} }
if (scorer) { if (scorer) {
status = DS_EnableExternalScorer(ctx, scorer); status = STT_EnableExternalScorer(ctx, scorer);
if (status != 0) { if (status != 0) {
fprintf(stderr, "Could not enable external scorer.\n"); fprintf(stderr, "Could not enable external scorer.\n");
return 1; return 1;
} }
if (set_alphabeta) { if (set_alphabeta) {
status = DS_SetScorerAlphaBeta(ctx, lm_alpha, lm_beta); status = STT_SetScorerAlphaBeta(ctx, lm_alpha, lm_beta);
if (status != 0) { if (status != 0) {
fprintf(stderr, "Error setting scorer alpha and beta.\n"); fprintf(stderr, "Error setting scorer alpha and beta.\n");
return 1; return 1;
@ -494,7 +494,7 @@ main(int argc, char **argv)
// so, check the boost string before we turn it into a float // so, check the boost string before we turn it into a float
bool boost_is_valid = (pair_[1].find_first_not_of("-.0123456789") == std::string::npos); bool boost_is_valid = (pair_[1].find_first_not_of("-.0123456789") == std::string::npos);
float boost = strtof((pair_[1]).c_str(),0); float boost = strtof((pair_[1]).c_str(),0);
status = DS_AddHotWord(ctx, word, boost); status = STT_AddHotWord(ctx, word, boost);
if (status != 0 || !boost_is_valid) { if (status != 0 || !boost_is_valid) {
fprintf(stderr, "Could not enable hot-word.\n"); fprintf(stderr, "Could not enable hot-word.\n");
return 1; return 1;
@ -555,7 +555,7 @@ main(int argc, char **argv)
sox_quit(); sox_quit();
#endif // NO_SOX #endif // NO_SOX
DS_FreeModel(ctx); STT_FreeModel(ctx);
return 0; return 0;
} }

View File

@ -61,37 +61,37 @@ typedef struct Metadata {
// sphinx-doc: error_code_listing_start // sphinx-doc: error_code_listing_start
#define DS_FOR_EACH_ERROR(APPLY) \ #define STT_FOR_EACH_ERROR(APPLY) \
APPLY(DS_ERR_OK, 0x0000, "No error.") \ APPLY(STT_ERR_OK, 0x0000, "No error.") \
APPLY(DS_ERR_NO_MODEL, 0x1000, "Missing model information.") \ APPLY(STT_ERR_NO_MODEL, 0x1000, "Missing model information.") \
APPLY(DS_ERR_INVALID_ALPHABET, 0x2000, "Invalid alphabet embedded in model. (Data corruption?)") \ APPLY(STT_ERR_INVALID_ALPHABET, 0x2000, "Invalid alphabet embedded in model. (Data corruption?)") \
APPLY(DS_ERR_INVALID_SHAPE, 0x2001, "Invalid model shape.") \ APPLY(STT_ERR_INVALID_SHAPE, 0x2001, "Invalid model shape.") \
APPLY(DS_ERR_INVALID_SCORER, 0x2002, "Invalid scorer file.") \ APPLY(STT_ERR_INVALID_SCORER, 0x2002, "Invalid scorer file.") \
APPLY(DS_ERR_MODEL_INCOMPATIBLE, 0x2003, "Incompatible model.") \ APPLY(STT_ERR_MODEL_INCOMPATIBLE, 0x2003, "Incompatible model.") \
APPLY(DS_ERR_SCORER_NOT_ENABLED, 0x2004, "External scorer is not enabled.") \ APPLY(STT_ERR_SCORER_NOT_ENABLED, 0x2004, "External scorer is not enabled.") \
APPLY(DS_ERR_SCORER_UNREADABLE, 0x2005, "Could not read scorer file.") \ APPLY(STT_ERR_SCORER_UNREADABLE, 0x2005, "Could not read scorer file.") \
APPLY(DS_ERR_SCORER_INVALID_LM, 0x2006, "Could not recognize language model header in scorer.") \ APPLY(STT_ERR_SCORER_INVALID_LM, 0x2006, "Could not recognize language model header in scorer.") \
APPLY(DS_ERR_SCORER_NO_TRIE, 0x2007, "Reached end of scorer file before loading vocabulary trie.") \ APPLY(STT_ERR_SCORER_NO_TRIE, 0x2007, "Reached end of scorer file before loading vocabulary trie.") \
APPLY(DS_ERR_SCORER_INVALID_TRIE, 0x2008, "Invalid magic in trie header.") \ APPLY(STT_ERR_SCORER_INVALID_TRIE, 0x2008, "Invalid magic in trie header.") \
APPLY(DS_ERR_SCORER_VERSION_MISMATCH, 0x2009, "Scorer file version does not match expected version.") \ APPLY(STT_ERR_SCORER_VERSION_MISMATCH, 0x2009, "Scorer file version does not match expected version.") \
APPLY(DS_ERR_FAIL_INIT_MMAP, 0x3000, "Failed to initialize memory mapped model.") \ APPLY(STT_ERR_FAIL_INIT_MMAP, 0x3000, "Failed to initialize memory mapped model.") \
APPLY(DS_ERR_FAIL_INIT_SESS, 0x3001, "Failed to initialize the session.") \ APPLY(STT_ERR_FAIL_INIT_SESS, 0x3001, "Failed to initialize the session.") \
APPLY(DS_ERR_FAIL_INTERPRETER, 0x3002, "Interpreter failed.") \ APPLY(STT_ERR_FAIL_INTERPRETER, 0x3002, "Interpreter failed.") \
APPLY(DS_ERR_FAIL_RUN_SESS, 0x3003, "Failed to run the session.") \ APPLY(STT_ERR_FAIL_RUN_SESS, 0x3003, "Failed to run the session.") \
APPLY(DS_ERR_FAIL_CREATE_STREAM, 0x3004, "Error creating the stream.") \ APPLY(STT_ERR_FAIL_CREATE_STREAM, 0x3004, "Error creating the stream.") \
APPLY(DS_ERR_FAIL_READ_PROTOBUF, 0x3005, "Error reading the proto buffer model file.") \ APPLY(STT_ERR_FAIL_READ_PROTOBUF, 0x3005, "Error reading the proto buffer model file.") \
APPLY(DS_ERR_FAIL_CREATE_SESS, 0x3006, "Failed to create session.") \ APPLY(STT_ERR_FAIL_CREATE_SESS, 0x3006, "Failed to create session.") \
APPLY(DS_ERR_FAIL_CREATE_MODEL, 0x3007, "Could not allocate model state.") \ APPLY(STT_ERR_FAIL_CREATE_MODEL, 0x3007, "Could not allocate model state.") \
APPLY(DS_ERR_FAIL_INSERT_HOTWORD, 0x3008, "Could not insert hot-word.") \ APPLY(STT_ERR_FAIL_INSERT_HOTWORD, 0x3008, "Could not insert hot-word.") \
APPLY(DS_ERR_FAIL_CLEAR_HOTWORD, 0x3009, "Could not clear hot-words.") \ APPLY(STT_ERR_FAIL_CLEAR_HOTWORD, 0x3009, "Could not clear hot-words.") \
APPLY(DS_ERR_FAIL_ERASE_HOTWORD, 0x3010, "Could not erase hot-word.") APPLY(STT_ERR_FAIL_ERASE_HOTWORD, 0x3010, "Could not erase hot-word.")
// sphinx-doc: error_code_listing_end // sphinx-doc: error_code_listing_end
enum DeepSpeech_Error_Codes enum STT_Error_Codes
{ {
#define DEFINE(NAME, VALUE, DESC) NAME = VALUE, #define DEFINE(NAME, VALUE, DESC) NAME = VALUE,
DS_FOR_EACH_ERROR(DEFINE) STT_FOR_EACH_ERROR(DEFINE)
#undef DEFINE #undef DEFINE
}; };
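The ``STT_FOR_EACH_ERROR`` X-macro above is expanded once to define the enum, and the library expands it again to turn codes back into strings. A condensed sketch of that second expansion, roughly what sits behind ``STT_ErrorCodeToErrorMessage`` (the ``strdup``-based return and the fallback message are assumptions, not shown in this commit):

    #include <string.h>
    #include "coqui-stt.h"   /* assumed header name; provides STT_FOR_EACH_ERROR */

    /* Map an error code to a heap-allocated description string. */
    char* error_message(int code) {
    #define RETURN_MESSAGE(NAME, VALUE, DESC) case NAME: return strdup(DESC);
      switch (code) {
        STT_FOR_EACH_ERROR(RETURN_MESSAGE)
        default: return strdup("Unknown error code.");
      }
    #undef RETURN_MESSAGE
    }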
@ -104,49 +104,49 @@ DS_FOR_EACH_ERROR(DEFINE)
* @return Zero on success, non-zero on failure. * @return Zero on success, non-zero on failure.
*/ */
STT_EXPORT STT_EXPORT
int DS_CreateModel(const char* aModelPath, int STT_CreateModel(const char* aModelPath,
ModelState** retval); ModelState** retval);
/** /**
* @brief Get beam width value used by the model. If {@link DS_SetModelBeamWidth} * @brief Get beam width value used by the model. If {@link STT_SetModelBeamWidth}
* was not called before, will return the default value loaded from the * was not called before, will return the default value loaded from the
* model file. * model file.
* *
* @param aCtx A ModelState pointer created with {@link DS_CreateModel}. * @param aCtx A ModelState pointer created with {@link STT_CreateModel}.
* *
* @return Beam width value used by the model. * @return Beam width value used by the model.
*/ */
STT_EXPORT STT_EXPORT
unsigned int DS_GetModelBeamWidth(const ModelState* aCtx); unsigned int STT_GetModelBeamWidth(const ModelState* aCtx);
/** /**
* @brief Set beam width value used by the model. * @brief Set beam width value used by the model.
* *
* @param aCtx A ModelState pointer created with {@link DS_CreateModel}. * @param aCtx A ModelState pointer created with {@link STT_CreateModel}.
* @param aBeamWidth The beam width used by the model. A larger beam width value * @param aBeamWidth The beam width used by the model. A larger beam width value
* generates better results at the cost of decoding time. * generates better results at the cost of decoding time.
* *
* @return Zero on success, non-zero on failure. * @return Zero on success, non-zero on failure.
*/ */
STT_EXPORT STT_EXPORT
int DS_SetModelBeamWidth(ModelState* aCtx, int STT_SetModelBeamWidth(ModelState* aCtx,
unsigned int aBeamWidth); unsigned int aBeamWidth);
/** /**
* @brief Return the sample rate expected by a model. * @brief Return the sample rate expected by a model.
* *
* @param aCtx A ModelState pointer created with {@link DS_CreateModel}. * @param aCtx A ModelState pointer created with {@link STT_CreateModel}.
* *
* @return Sample rate expected by the model for its input. * @return Sample rate expected by the model for its input.
*/ */
STT_EXPORT STT_EXPORT
int DS_GetModelSampleRate(const ModelState* aCtx); int STT_GetModelSampleRate(const ModelState* aCtx);
/** /**
* @brief Frees associated resources and destroys model object. * @brief Frees associated resources and destroys model object.
*/ */
STT_EXPORT STT_EXPORT
void DS_FreeModel(ModelState* ctx); void STT_FreeModel(ModelState* ctx);
/** /**
* @brief Enable decoding using an external scorer. * @brief Enable decoding using an external scorer.
@ -157,7 +157,7 @@ void DS_FreeModel(ModelState* ctx);
* @return Zero on success, non-zero on failure (invalid arguments). * @return Zero on success, non-zero on failure (invalid arguments).
*/ */
STT_EXPORT STT_EXPORT
int DS_EnableExternalScorer(ModelState* aCtx, int STT_EnableExternalScorer(ModelState* aCtx,
const char* aScorerPath); const char* aScorerPath);
/** /**
@ -172,7 +172,7 @@ int DS_EnableExternalScorer(ModelState* aCtx,
* @return Zero on success, non-zero on failure (invalid arguments). * @return Zero on success, non-zero on failure (invalid arguments).
*/ */
STT_EXPORT STT_EXPORT
int DS_AddHotWord(ModelState* aCtx, int STT_AddHotWord(ModelState* aCtx,
const char* word, const char* word,
float boost); float boost);
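A short sketch of the hot-word call in context (the word ``coqui`` and the boost value are arbitrary examples; ``ctx`` is a previously created ``ModelState*``):

    #include <stdio.h>
    #include "coqui-stt.h"  /* assumed header name */

    /* Temporarily boost a domain-specific word, then remove the boost again. */
    void boost_word(ModelState* ctx) {
      if (STT_AddHotWord(ctx, "coqui", 7.5f) != STT_ERR_OK) {
        fprintf(stderr, "Could not add hot-word.\n");
        return;
      }
      /* ... run one or more decodes here ... */
      STT_EraseHotWord(ctx, "coqui");   /* or STT_ClearHotWords(ctx) to drop all boosts */
    }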
@ -185,7 +185,7 @@ int DS_AddHotWord(ModelState* aCtx,
* @return Zero on success, non-zero on failure (invalid arguments). * @return Zero on success, non-zero on failure (invalid arguments).
*/ */
STT_EXPORT STT_EXPORT
int DS_EraseHotWord(ModelState* aCtx, int STT_EraseHotWord(ModelState* aCtx,
const char* word); const char* word);
/** /**
@ -196,7 +196,7 @@ int DS_EraseHotWord(ModelState* aCtx,
* @return Zero on success, non-zero on failure (invalid arguments). * @return Zero on success, non-zero on failure (invalid arguments).
*/ */
STT_EXPORT STT_EXPORT
int DS_ClearHotWords(ModelState* aCtx); int STT_ClearHotWords(ModelState* aCtx);
/** /**
* @brief Disable decoding using an external scorer. * @brief Disable decoding using an external scorer.
@ -206,7 +206,7 @@ int DS_ClearHotWords(ModelState* aCtx);
* @return Zero on success, non-zero on failure. * @return Zero on success, non-zero on failure.
*/ */
STT_EXPORT STT_EXPORT
int DS_DisableExternalScorer(ModelState* aCtx); int STT_DisableExternalScorer(ModelState* aCtx);
/** /**
* @brief Set hyperparameters alpha and beta of the external scorer. * @brief Set hyperparameters alpha and beta of the external scorer.
@ -218,7 +218,7 @@ int DS_DisableExternalScorer(ModelState* aCtx);
* @return Zero on success, non-zero on failure. * @return Zero on success, non-zero on failure.
*/ */
STT_EXPORT STT_EXPORT
int DS_SetScorerAlphaBeta(ModelState* aCtx, int STT_SetScorerAlphaBeta(ModelState* aCtx,
float aAlpha, float aAlpha,
float aBeta); float aBeta);
@ -231,10 +231,10 @@ int DS_SetScorerAlphaBeta(ModelState* aCtx,
* @param aBufferSize The number of samples in the audio signal. * @param aBufferSize The number of samples in the audio signal.
* *
* @return The STT result. The user is responsible for freeing the string using * @return The STT result. The user is responsible for freeing the string using
* {@link DS_FreeString()}. Returns NULL on error. * {@link STT_FreeString()}. Returns NULL on error.
*/ */
STT_EXPORT STT_EXPORT
char* DS_SpeechToText(ModelState* aCtx, char* STT_SpeechToText(ModelState* aCtx,
const short* aBuffer, const short* aBuffer,
unsigned int aBufferSize); unsigned int aBufferSize);
@ -250,19 +250,19 @@ char* DS_SpeechToText(ModelState* aCtx,
* *
* @return Metadata struct containing multiple CandidateTranscript structs. Each * @return Metadata struct containing multiple CandidateTranscript structs. Each
* transcript has per-token metadata including timing information. The * transcript has per-token metadata including timing information. The
* user is responsible for freeing Metadata by calling {@link DS_FreeMetadata()}. * user is responsible for freeing Metadata by calling {@link STT_FreeMetadata()}.
* Returns NULL on error. * Returns NULL on error.
*/ */
STT_EXPORT STT_EXPORT
Metadata* DS_SpeechToTextWithMetadata(ModelState* aCtx, Metadata* STT_SpeechToTextWithMetadata(ModelState* aCtx,
const short* aBuffer, const short* aBuffer,
unsigned int aBufferSize, unsigned int aBufferSize,
unsigned int aNumResults); unsigned int aNumResults);
/** /**
* @brief Create a new streaming inference state. The streaming state returned * @brief Create a new streaming inference state. The streaming state returned
* by this function can then be passed to {@link DS_FeedAudioContent()} * by this function can then be passed to {@link STT_FeedAudioContent()}
* and {@link DS_FinishStream()}. * and {@link STT_FinishStream()}.
* *
* @param aCtx The ModelState pointer for the model to use. * @param aCtx The ModelState pointer for the model to use.
* @param[out] retval an opaque pointer that represents the streaming state. Can * @param[out] retval an opaque pointer that represents the streaming state. Can
@ -271,80 +271,80 @@ Metadata* DS_SpeechToTextWithMetadata(ModelState* aCtx,
* @return Zero for success, non-zero on failure. * @return Zero for success, non-zero on failure.
*/ */
STT_EXPORT STT_EXPORT
int DS_CreateStream(ModelState* aCtx, int STT_CreateStream(ModelState* aCtx,
StreamingState** retval); StreamingState** retval);
/** /**
* @brief Feed audio samples to an ongoing streaming inference. * @brief Feed audio samples to an ongoing streaming inference.
* *
* @param aSctx A streaming state pointer returned by {@link DS_CreateStream()}. * @param aSctx A streaming state pointer returned by {@link STT_CreateStream()}.
* @param aBuffer An array of 16-bit, mono raw audio samples at the * @param aBuffer An array of 16-bit, mono raw audio samples at the
* appropriate sample rate (matching what the model was trained on). * appropriate sample rate (matching what the model was trained on).
* @param aBufferSize The number of samples in @p aBuffer. * @param aBufferSize The number of samples in @p aBuffer.
*/ */
STT_EXPORT STT_EXPORT
void DS_FeedAudioContent(StreamingState* aSctx, void STT_FeedAudioContent(StreamingState* aSctx,
const short* aBuffer, const short* aBuffer,
unsigned int aBufferSize); unsigned int aBufferSize);
/** /**
* @brief Compute the intermediate decoding of an ongoing streaming inference. * @brief Compute the intermediate decoding of an ongoing streaming inference.
* *
* @param aSctx A streaming state pointer returned by {@link DS_CreateStream()}. * @param aSctx A streaming state pointer returned by {@link STT_CreateStream()}.
* *
* @return The STT intermediate result. The user is responsible for freeing the * @return The STT intermediate result. The user is responsible for freeing the
* string using {@link DS_FreeString()}. * string using {@link STT_FreeString()}.
*/ */
STT_EXPORT STT_EXPORT
char* DS_IntermediateDecode(const StreamingState* aSctx); char* STT_IntermediateDecode(const StreamingState* aSctx);
/** /**
* @brief Compute the intermediate decoding of an ongoing streaming inference, * @brief Compute the intermediate decoding of an ongoing streaming inference,
* return results including metadata. * return results including metadata.
* *
* @param aSctx A streaming state pointer returned by {@link DS_CreateStream()}. * @param aSctx A streaming state pointer returned by {@link STT_CreateStream()}.
* @param aNumResults The number of candidate transcripts to return. * @param aNumResults The number of candidate transcripts to return.
* *
* @return Metadata struct containing multiple candidate transcripts. Each transcript * @return Metadata struct containing multiple candidate transcripts. Each transcript
* has per-token metadata including timing information. The user is * has per-token metadata including timing information. The user is
* responsible for freeing Metadata by calling {@link DS_FreeMetadata()}. * responsible for freeing Metadata by calling {@link STT_FreeMetadata()}.
* Returns NULL on error. * Returns NULL on error.
*/ */
STT_EXPORT STT_EXPORT
Metadata* DS_IntermediateDecodeWithMetadata(const StreamingState* aSctx, Metadata* STT_IntermediateDecodeWithMetadata(const StreamingState* aSctx,
unsigned int aNumResults); unsigned int aNumResults);
/** /**
* @brief Compute the final decoding of an ongoing streaming inference and return * @brief Compute the final decoding of an ongoing streaming inference and return
* the result. Signals the end of an ongoing streaming inference. * the result. Signals the end of an ongoing streaming inference.
* *
* @param aSctx A streaming state pointer returned by {@link DS_CreateStream()}. * @param aSctx A streaming state pointer returned by {@link STT_CreateStream()}.
* *
* @return The STT result. The user is responsible for freeing the string using * @return The STT result. The user is responsible for freeing the string using
* {@link DS_FreeString()}. * {@link STT_FreeString()}.
* *
* @note This method will free the state pointer (@p aSctx). * @note This method will free the state pointer (@p aSctx).
*/ */
STT_EXPORT STT_EXPORT
char* DS_FinishStream(StreamingState* aSctx); char* STT_FinishStream(StreamingState* aSctx);
/** /**
* @brief Compute the final decoding of an ongoing streaming inference and return * @brief Compute the final decoding of an ongoing streaming inference and return
* results including metadata. Signals the end of an ongoing streaming * results including metadata. Signals the end of an ongoing streaming
* inference. * inference.
* *
* @param aSctx A streaming state pointer returned by {@link DS_CreateStream()}. * @param aSctx A streaming state pointer returned by {@link STT_CreateStream()}.
* @param aNumResults The number of candidate transcripts to return. * @param aNumResults The number of candidate transcripts to return.
* *
* @return Metadata struct containing multiple candidate transcripts. Each transcript * @return Metadata struct containing multiple candidate transcripts. Each transcript
* has per-token metadata including timing information. The user is * has per-token metadata including timing information. The user is
* responsible for freeing Metadata by calling {@link DS_FreeMetadata()}. * responsible for freeing Metadata by calling {@link STT_FreeMetadata()}.
* Returns NULL on error. * Returns NULL on error.
* *
* @note This method will free the state pointer (@p aSctx). * @note This method will free the state pointer (@p aSctx).
*/ */
STT_EXPORT STT_EXPORT
Metadata* DS_FinishStreamWithMetadata(StreamingState* aSctx, Metadata* STT_FinishStreamWithMetadata(StreamingState* aSctx,
unsigned int aNumResults); unsigned int aNumResults);
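For orientation, a condensed sketch of the streaming lifecycle these declarations describe, mirroring the loop in ``client.cc`` earlier in this commit (the chunk size and variable names are arbitrary; ``ctx`` is an existing ``ModelState*`` and ``buffer`` holds 16-bit mono samples at the model's sample rate):

    #include <stdio.h>
    #include "coqui-stt.h"  /* assumed header name */

    /* Feed audio in fixed-size chunks, printing intermediate decodes along the way. */
    int stream_buffer(ModelState* ctx, const short* buffer, unsigned int num_samples) {
      StreamingState* sctx = NULL;
      if (STT_CreateStream(ctx, &sctx) != STT_ERR_OK) {
        return 1;
      }
      const unsigned int chunk = 320;                      /* 20 ms at 16 kHz, arbitrary */
      for (unsigned int off = 0; off < num_samples; off += chunk) {
        unsigned int cur = num_samples - off < chunk ? num_samples - off : chunk;
        STT_FeedAudioContent(sctx, buffer + off, cur);
        char* partial = STT_IntermediateDecode(sctx);      /* optional progress read-out */
        printf("%s\n", partial);
        STT_FreeString(partial);
      }
      char* final_text = STT_FinishStream(sctx);           /* also frees sctx */
      printf("%s\n", final_text);
      STT_FreeString(final_text);
      return 0;
    }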
/** /**
@ -352,42 +352,42 @@ Metadata* DS_FinishStreamWithMetadata(StreamingState* aSctx,
* can be used if you no longer need the result of an ongoing streaming * can be used if you no longer need the result of an ongoing streaming
* inference and don't want to perform a costly decode operation. * inference and don't want to perform a costly decode operation.
* *
* @param aSctx A streaming state pointer returned by {@link DS_CreateStream()}. * @param aSctx A streaming state pointer returned by {@link STT_CreateStream()}.
* *
* @note This method will free the state pointer (@p aSctx). * @note This method will free the state pointer (@p aSctx).
*/ */
STT_EXPORT STT_EXPORT
void DS_FreeStream(StreamingState* aSctx); void STT_FreeStream(StreamingState* aSctx);
/** /**
* @brief Free memory allocated for metadata information. * @brief Free memory allocated for metadata information.
*/ */
STT_EXPORT STT_EXPORT
void DS_FreeMetadata(Metadata* m); void STT_FreeMetadata(Metadata* m);
/** /**
* @brief Free a char* string returned by the Coqui STT API. * @brief Free a char* string returned by the Coqui STT API.
*/ */
STT_EXPORT STT_EXPORT
void DS_FreeString(char* str); void STT_FreeString(char* str);
/** /**
* @brief Returns the version of this library. The returned version is a semantic * @brief Returns the version of this library. The returned version is a semantic
* version (SemVer 2.0.0). The string returned must be freed with {@link DS_FreeString()}. * version (SemVer 2.0.0). The string returned must be freed with {@link STT_FreeString()}.
* *
* @return The version string. * @return The version string.
*/ */
STT_EXPORT STT_EXPORT
char* DS_Version(); char* STT_Version();
/** /**
* @brief Returns a textual description corresponding to an error code. * @brief Returns a textual description corresponding to an error code.
* The string returned must be freed with {@link DS_FreeString()}. * The string returned must be freed with {@link STT_FreeString()}.
* *
* @return The error description. * @return The error description.
*/ */
STT_EXPORT STT_EXPORT
char* DS_ErrorCodeToErrorMessage(int aErrorCode); char* STT_ErrorCodeToErrorMessage(int aErrorCode);
#undef STT_EXPORT #undef STT_EXPORT

View File

@ -9,7 +9,7 @@ __version__ = swigwrapper.__version__.decode('utf-8')
# Hack: import error codes by matching on their names, as SWIG unfortunately # Hack: import error codes by matching on their names, as SWIG unfortunately
# does not support binding enums to Python in a scoped manner yet. # does not support binding enums to Python in a scoped manner yet.
for symbol in dir(swigwrapper): for symbol in dir(swigwrapper):
if symbol.startswith('DS_ERR_'): if symbol.startswith('STT_ERR_'):
globals()[symbol] = getattr(swigwrapper, symbol) globals()[symbol] = getattr(swigwrapper, symbol)
class Scorer(swigwrapper.Scorer): class Scorer(swigwrapper.Scorer):

View File

@ -74,13 +74,13 @@ int Scorer::load_lm(const std::string& lm_path)
// Check if file is readable to avoid KenLM throwing an exception // Check if file is readable to avoid KenLM throwing an exception
const char* filename = lm_path.c_str(); const char* filename = lm_path.c_str();
if (access(filename, R_OK) != 0) { if (access(filename, R_OK) != 0) {
return DS_ERR_SCORER_UNREADABLE; return STT_ERR_SCORER_UNREADABLE;
} }
// Check if the file format is valid to avoid KenLM throwing an exception // Check if the file format is valid to avoid KenLM throwing an exception
lm::ngram::ModelType model_type; lm::ngram::ModelType model_type;
if (!lm::ngram::RecognizeBinary(filename, model_type)) { if (!lm::ngram::RecognizeBinary(filename, model_type)) {
return DS_ERR_SCORER_INVALID_LM; return STT_ERR_SCORER_INVALID_LM;
} }
// Load the LM // Load the LM
@ -97,7 +97,7 @@ int Scorer::load_lm(const std::string& lm_path)
uint64_t trie_offset = language_model_->GetEndOfSearchOffset(); uint64_t trie_offset = language_model_->GetEndOfSearchOffset();
if (package_size <= trie_offset) { if (package_size <= trie_offset) {
// File ends without a trie structure // File ends without a trie structure
return DS_ERR_SCORER_NO_TRIE; return STT_ERR_SCORER_NO_TRIE;
} }
// Read metadata and trie from file // Read metadata and trie from file
@ -113,7 +113,7 @@ int Scorer::load_trie(std::ifstream& fin, const std::string& file_path)
if (magic != MAGIC) { if (magic != MAGIC) {
std::cerr << "Error: Can't parse scorer file, invalid header. Try updating " std::cerr << "Error: Can't parse scorer file, invalid header. Try updating "
"your scorer file." << std::endl; "your scorer file." << std::endl;
return DS_ERR_SCORER_INVALID_TRIE; return STT_ERR_SCORER_INVALID_TRIE;
} }
int version; int version;
@ -128,7 +128,7 @@ int Scorer::load_trie(std::ifstream& fin, const std::string& file_path)
std::cerr << "Downgrade your scorer file or update your version of Coqui STT."; std::cerr << "Downgrade your scorer file or update your version of Coqui STT.";
} }
std::cerr << std::endl; std::cerr << std::endl;
return DS_ERR_SCORER_VERSION_MISMATCH; return STT_ERR_SCORER_VERSION_MISMATCH;
} }
fin.read(reinterpret_cast<char*>(&is_utf8_mode_), sizeof(is_utf8_mode_)); fin.read(reinterpret_cast<char*>(&is_utf8_mode_), sizeof(is_utf8_mode_));
@ -143,7 +143,7 @@ int Scorer::load_trie(std::ifstream& fin, const std::string& file_path)
opt.mode = fst::FstReadOptions::MAP; opt.mode = fst::FstReadOptions::MAP;
opt.source = file_path; opt.source = file_path;
dictionary.reset(FstType::Read(fin, opt)); dictionary.reset(FstType::Read(fin, opt));
return DS_ERR_OK; return STT_ERR_OK;
} }
bool Scorer::save_dictionary(const std::string& path, bool append_instead_of_overwrite) bool Scorer::save_dictionary(const std::string& path, bool append_instead_of_overwrite)

View File

@ -1,33 +0,0 @@
namespace DeepSpeechClient.Enums
{
/// <summary>
/// Error codes from the native DeepSpeech binary.
/// </summary>
internal enum ErrorCodes
{
// OK
DS_ERR_OK = 0x0000,
// Missing information
DS_ERR_NO_MODEL = 0x1000,
// Invalid parameters
DS_ERR_INVALID_ALPHABET = 0x2000,
DS_ERR_INVALID_SHAPE = 0x2001,
DS_ERR_INVALID_SCORER = 0x2002,
DS_ERR_MODEL_INCOMPATIBLE = 0x2003,
DS_ERR_SCORER_NOT_ENABLED = 0x2004,
// Runtime failures
DS_ERR_FAIL_INIT_MMAP = 0x3000,
DS_ERR_FAIL_INIT_SESS = 0x3001,
DS_ERR_FAIL_INTERPRETER = 0x3002,
DS_ERR_FAIL_RUN_SESS = 0x3003,
DS_ERR_FAIL_CREATE_STREAM = 0x3004,
DS_ERR_FAIL_READ_PROTOBUF = 0x3005,
DS_ERR_FAIL_CREATE_SESS = 0x3006,
DS_ERR_FAIL_INSERT_HOTWORD = 0x3008,
DS_ERR_FAIL_CLEAR_HOTWORD = 0x3009,
DS_ERR_FAIL_ERASE_HOTWORD = 0x3010
}
}

View File

@ -2,9 +2,9 @@ Microsoft Visual Studio Solution File, Format Version 12.00
# Visual Studio Version 16 # Visual Studio Version 16
VisualStudioVersion = 16.0.30204.135 VisualStudioVersion = 16.0.30204.135
MinimumVisualStudioVersion = 10.0.40219.1 MinimumVisualStudioVersion = 10.0.40219.1
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "DeepSpeechClient", "DeepSpeechClient\DeepSpeechClient.csproj", "{56DE4091-BBBE-47E4-852D-7268B33B971F}" Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "STTClient", "STTClient\STTClient.csproj", "{56DE4091-BBBE-47E4-852D-7268B33B971F}"
EndProject EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "DeepSpeechConsole", "DeepSpeechConsole\DeepSpeechConsole.csproj", "{312965E5-C4F6-4D95-BA64-79906B8BC7AC}" Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "STTConsole", "STTConsole\STTConsole.csproj", "{312965E5-C4F6-4D95-BA64-79906B8BC7AC}"
EndProject EndProject
Global Global
GlobalSection(SolutionConfigurationPlatforms) = preSolution GlobalSection(SolutionConfigurationPlatforms) = preSolution

View File

@ -0,0 +1,33 @@
namespace STTClient.Enums
{
/// <summary>
/// Error codes from the native Coqui STT binary.
/// </summary>
internal enum ErrorCodes
{
// OK
STT_ERR_OK = 0x0000,
// Missing information
STT_ERR_NO_MODEL = 0x1000,
// Invalid parameters
STT_ERR_INVALID_ALPHABET = 0x2000,
STT_ERR_INVALID_SHAPE = 0x2001,
STT_ERR_INVALID_SCORER = 0x2002,
STT_ERR_MODEL_INCOMPATIBLE = 0x2003,
STT_ERR_SCORER_NOT_ENABLED = 0x2004,
// Runtime failures
STT_ERR_FAIL_INIT_MMAP = 0x3000,
STT_ERR_FAIL_INIT_SESS = 0x3001,
STT_ERR_FAIL_INTERPRETER = 0x3002,
STT_ERR_FAIL_RUN_SESS = 0x3003,
STT_ERR_FAIL_CREATE_STREAM = 0x3004,
STT_ERR_FAIL_READ_PROTOBUF = 0x3005,
STT_ERR_FAIL_CREATE_SESS = 0x3006,
STT_ERR_FAIL_INSERT_HOTWORD = 0x3008,
STT_ERR_FAIL_CLEAR_HOTWORD = 0x3009,
STT_ERR_FAIL_ERASE_HOTWORD = 0x3010
}
}

View File

@ -1,9 +1,9 @@
using DeepSpeechClient.Structs; using STTClient.Structs;
using System; using System;
using System.Runtime.InteropServices; using System.Runtime.InteropServices;
using System.Text; using System.Text;
namespace DeepSpeechClient.Extensions namespace STTClient.Extensions
{ {
internal static class NativeExtensions internal static class NativeExtensions
{ {
@ -20,7 +20,7 @@ namespace DeepSpeechClient.Extensions
byte[] buffer = new byte[len]; byte[] buffer = new byte[len];
Marshal.Copy(intPtr, buffer, 0, buffer.Length); Marshal.Copy(intPtr, buffer, 0, buffer.Length);
if (releasePtr) if (releasePtr)
NativeImp.DS_FreeString(intPtr); NativeImp.STT_FreeString(intPtr);
string result = Encoding.UTF8.GetString(buffer); string result = Encoding.UTF8.GetString(buffer);
return result; return result;
} }
@ -86,7 +86,7 @@ namespace DeepSpeechClient.Extensions
metadata.transcripts += sizeOfCandidateTranscript; metadata.transcripts += sizeOfCandidateTranscript;
} }
NativeImp.DS_FreeMetadata(intPtr); NativeImp.STT_FreeMetadata(intPtr);
return managedMetadata; return managedMetadata;
} }
} }

View File

@ -1,13 +1,13 @@
using DeepSpeechClient.Models; using STTClient.Models;
using System; using System;
using System.IO; using System.IO;
namespace DeepSpeechClient.Interfaces namespace STTClient.Interfaces
{ {
/// <summary> /// <summary>
/// Client interface for DeepSpeech /// Client interface for Coqui STT
/// </summary> /// </summary>
public interface IDeepSpeech : IDisposable public interface ISTT : IDisposable
{ {
/// <summary> /// <summary>
/// Return version of this library. The returned version is a semantic version /// Return version of this library. The returned version is a semantic version
@ -80,7 +80,7 @@ namespace DeepSpeechClient.Interfaces
unsafe void SetScorerAlphaBeta(float aAlpha, float aBeta); unsafe void SetScorerAlphaBeta(float aAlpha, float aBeta);
/// <summary> /// <summary>
/// Use the DeepSpeech model to perform Speech-To-Text. /// Use the STT model to perform Speech-To-Text.
/// </summary> /// </summary>
/// <param name="aBuffer">A 16-bit, mono raw audio signal at the appropriate sample rate (matching what the model was trained on).</param> /// <param name="aBuffer">A 16-bit, mono raw audio signal at the appropriate sample rate (matching what the model was trained on).</param>
/// <param name="aBufferSize">The number of samples in the audio signal.</param> /// <param name="aBufferSize">The number of samples in the audio signal.</param>
@ -89,7 +89,7 @@ namespace DeepSpeechClient.Interfaces
uint aBufferSize); uint aBufferSize);
/// <summary> /// <summary>
/// Use the DeepSpeech model to perform Speech-To-Text, return results including metadata. /// Use the STT model to perform Speech-To-Text, return results including metadata.
/// </summary> /// </summary>
/// <param name="aBuffer">A 16-bit, mono raw audio signal at the appropriate sample rate (matching what the model was trained on).</param> /// <param name="aBuffer">A 16-bit, mono raw audio signal at the appropriate sample rate (matching what the model was trained on).</param>
/// <param name="aBufferSize">The number of samples in the audio signal.</param> /// <param name="aBufferSize">The number of samples in the audio signal.</param>
@ -104,26 +104,26 @@ namespace DeepSpeechClient.Interfaces
/// This can be used if you no longer need the result of an ongoing streaming /// This can be used if you no longer need the result of an ongoing streaming
/// inference and don't want to perform a costly decode operation. /// inference and don't want to perform a costly decode operation.
/// </summary> /// </summary>
unsafe void FreeStream(DeepSpeechStream stream); unsafe void FreeStream(Stream stream);
/// <summary> /// <summary>
/// Creates a new streaming inference state. /// Creates a new streaming inference state.
/// </summary> /// </summary>
unsafe DeepSpeechStream CreateStream(); unsafe Stream CreateStream();
/// <summary> /// <summary>
/// Feeds audio samples to an ongoing streaming inference. /// Feeds audio samples to an ongoing streaming inference.
/// </summary> /// </summary>
/// <param name="stream">Instance of the stream to feed the data.</param> /// <param name="stream">Instance of the stream to feed the data.</param>
/// <param name="aBuffer">An array of 16-bit, mono raw audio samples at the appropriate sample rate (matching what the model was trained on).</param> /// <param name="aBuffer">An array of 16-bit, mono raw audio samples at the appropriate sample rate (matching what the model was trained on).</param>
unsafe void FeedAudioContent(DeepSpeechStream stream, short[] aBuffer, uint aBufferSize); unsafe void FeedAudioContent(Stream stream, short[] aBuffer, uint aBufferSize);
/// <summary> /// <summary>
/// Computes the intermediate decoding of an ongoing streaming inference. /// Computes the intermediate decoding of an ongoing streaming inference.
/// </summary> /// </summary>
/// <param name="stream">Instance of the stream to decode.</param> /// <param name="stream">Instance of the stream to decode.</param>
/// <returns>The STT intermediate result.</returns> /// <returns>The STT intermediate result.</returns>
unsafe string IntermediateDecode(DeepSpeechStream stream); unsafe string IntermediateDecode(Stream stream);
/// <summary> /// <summary>
/// Computes the intermediate decoding of an ongoing streaming inference, including metadata. /// Computes the intermediate decoding of an ongoing streaming inference, including metadata.
@ -131,14 +131,14 @@ namespace DeepSpeechClient.Interfaces
/// <param name="stream">Instance of the stream to decode.</param> /// <param name="stream">Instance of the stream to decode.</param>
/// <param name="aNumResults">Maximum number of candidate transcripts to return. Returned list might be smaller than this.</param> /// <param name="aNumResults">Maximum number of candidate transcripts to return. Returned list might be smaller than this.</param>
/// <returns>The extended metadata result.</returns> /// <returns>The extended metadata result.</returns>
unsafe Metadata IntermediateDecodeWithMetadata(DeepSpeechStream stream, uint aNumResults); unsafe Metadata IntermediateDecodeWithMetadata(Stream stream, uint aNumResults);
/// <summary> /// <summary>
/// Closes the ongoing streaming inference, returns the STT result over the whole audio signal. /// Closes the ongoing streaming inference, returns the STT result over the whole audio signal.
/// </summary> /// </summary>
/// <param name="stream">Instance of the stream to finish.</param> /// <param name="stream">Instance of the stream to finish.</param>
/// <returns>The STT result.</returns> /// <returns>The STT result.</returns>
unsafe string FinishStream(DeepSpeechStream stream); unsafe string FinishStream(Stream stream);
/// <summary> /// <summary>
/// Closes the ongoing streaming inference, returns the STT result over the whole audio signal, including metadata. /// Closes the ongoing streaming inference, returns the STT result over the whole audio signal, including metadata.
@ -146,6 +146,6 @@ namespace DeepSpeechClient.Interfaces
/// <param name="stream">Instance of the stream to finish.</param> /// <param name="stream">Instance of the stream to finish.</param>
/// <param name="aNumResults">Maximum number of candidate transcripts to return. Returned list might be smaller than this.</param> /// <param name="aNumResults">Maximum number of candidate transcripts to return. Returned list might be smaller than this.</param>
/// <returns>The extended metadata result.</returns> /// <returns>The extended metadata result.</returns>
unsafe Metadata FinishStreamWithMetadata(DeepSpeechStream stream, uint aNumResults); unsafe Metadata FinishStreamWithMetadata(Stream stream, uint aNumResults);
} }
} }
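A hedged sketch of the streaming flow this renamed interface exposes; "model.pbmm" and the audio source are placeholders, and `Stream` here is the `STTClient.Models.Stream` class defined later in this diff:

    using System;
    using System.Collections.Generic;
    using STTClient;
    using STTClient.Interfaces;
    using STTClient.Models;

    static class StreamingSketch
    {
        // chunks: 16-bit mono PCM at the model's sample rate (e.g. microphone frames).
        static string Transcribe(IEnumerable<short[]> chunks)
        {
            using (ISTT stt = new STT("model.pbmm"))               // placeholder model path
            {
                Stream stream = stt.CreateStream();
                foreach (short[] chunk in chunks)
                {
                    stt.FeedAudioContent(stream, chunk, (uint)chunk.Length);
                    Console.WriteLine(stt.IntermediateDecode(stream)); // partial transcript
                }
                return stt.FinishStream(stream);                   // final transcript; closes the stream
            }
        }
    }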

View File

@ -1,4 +1,4 @@
namespace DeepSpeechClient.Models namespace STTClient.Models
{ {
/// <summary> /// <summary>
/// Stores the entire CTC output as an array of character metadata objects. /// Stores the entire CTC output as an array of character metadata objects.

View File

@ -1,4 +1,4 @@
namespace DeepSpeechClient.Models namespace STTClient.Models
{ {
/// <summary> /// <summary>
/// Stores the entire CTC output as an array of character metadata objects. /// Stores the entire CTC output as an array of character metadata objects.

View File

@ -1,19 +1,19 @@
using System; using System;
namespace DeepSpeechClient.Models namespace STTClient.Models
{ {
/// <summary> /// <summary>
/// Wrapper of the pointer used for the decoding stream. /// Wrapper of the pointer used for the decoding stream.
/// </summary> /// </summary>
public class DeepSpeechStream : IDisposable public class Stream : IDisposable
{ {
private unsafe IntPtr** _streamingStatePp; private unsafe IntPtr** _streamingStatePp;
/// <summary> /// <summary>
/// Initializes a new instance of <see cref="DeepSpeechStream"/>. /// Initializes a new instance of <see cref="Stream"/>.
/// </summary> /// </summary>
/// <param name="streamingStatePP">Native pointer of the native stream.</param> /// <param name="streamingStatePP">Native pointer of the native stream.</param>
public unsafe DeepSpeechStream(IntPtr** streamingStatePP) public unsafe Stream(IntPtr** streamingStatePP)
{ {
_streamingStatePp = streamingStatePP; _streamingStatePp = streamingStatePP;
} }

View File

@ -1,4 +1,4 @@
namespace DeepSpeechClient.Models namespace STTClient.Models
{ {
/// <summary> /// <summary>
/// Stores each individual character, along with its timing information. /// Stores each individual character, along with its timing information.
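The managed models in this folder carry the recognition results; below is a hedged sketch of walking them after calling the wrapper's SpeechToTextWithMetadata (which appears later in this diff). The class and property names are assumed from these partial hunks and may differ slightly:

    using System;
    using STTClient;
    using STTClient.Models;

    static class MetadataSketch
    {
        static void PrintBestTranscript(STT stt, short[] audio)
        {
            Metadata meta = stt.SpeechToTextWithMetadata(audio, (uint)audio.Length, 1);
            foreach (CandidateTranscript transcript in meta.Transcripts)      // assumed property
            {
                foreach (TokenMetadata token in transcript.Tokens)            // assumed property
                    Console.Write(token.Text);                                // assumed property
                Console.WriteLine($"  (confidence {transcript.Confidence})"); // assumed property
            }
        }
    }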

View File

@ -1,9 +1,9 @@
using DeepSpeechClient.Enums; using STTClient.Enums;
using System; using System;
using System.Runtime.InteropServices; using System.Runtime.InteropServices;
namespace DeepSpeechClient namespace STTClient
{ {
/// <summary> /// <summary>
/// Wrapper for the native implementation of "libstt.so" /// Wrapper for the native implementation of "libstt.so"
@ -13,101 +13,101 @@ namespace DeepSpeechClient
#region Native Implementation #region Native Implementation
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl, [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl,
CharSet = CharSet.Ansi, SetLastError = true)] CharSet = CharSet.Ansi, SetLastError = true)]
internal static extern IntPtr DS_Version(); internal static extern IntPtr STT_Version();
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal unsafe static extern ErrorCodes DS_CreateModel(string aModelPath, internal unsafe static extern ErrorCodes STT_CreateModel(string aModelPath,
ref IntPtr** pint); ref IntPtr** pint);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal unsafe static extern IntPtr DS_ErrorCodeToErrorMessage(int aErrorCode); internal unsafe static extern IntPtr STT_ErrorCodeToErrorMessage(int aErrorCode);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal unsafe static extern uint DS_GetModelBeamWidth(IntPtr** aCtx); internal unsafe static extern uint STT_GetModelBeamWidth(IntPtr** aCtx);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal unsafe static extern ErrorCodes DS_SetModelBeamWidth(IntPtr** aCtx, internal unsafe static extern ErrorCodes STT_SetModelBeamWidth(IntPtr** aCtx,
uint aBeamWidth); uint aBeamWidth);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal unsafe static extern ErrorCodes DS_CreateModel(string aModelPath, internal unsafe static extern ErrorCodes STT_CreateModel(string aModelPath,
uint aBeamWidth, uint aBeamWidth,
ref IntPtr** pint); ref IntPtr** pint);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal unsafe static extern int DS_GetModelSampleRate(IntPtr** aCtx); internal unsafe static extern int STT_GetModelSampleRate(IntPtr** aCtx);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal static unsafe extern ErrorCodes DS_EnableExternalScorer(IntPtr** aCtx, internal static unsafe extern ErrorCodes STT_EnableExternalScorer(IntPtr** aCtx,
string aScorerPath); string aScorerPath);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal static unsafe extern ErrorCodes DS_AddHotWord(IntPtr** aCtx, internal static unsafe extern ErrorCodes STT_AddHotWord(IntPtr** aCtx,
string aWord, string aWord,
float aBoost); float aBoost);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal static unsafe extern ErrorCodes DS_EraseHotWord(IntPtr** aCtx, internal static unsafe extern ErrorCodes STT_EraseHotWord(IntPtr** aCtx,
string aWord); string aWord);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal static unsafe extern ErrorCodes DS_ClearHotWords(IntPtr** aCtx); internal static unsafe extern ErrorCodes STT_ClearHotWords(IntPtr** aCtx);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal static unsafe extern ErrorCodes DS_DisableExternalScorer(IntPtr** aCtx); internal static unsafe extern ErrorCodes STT_DisableExternalScorer(IntPtr** aCtx);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal static unsafe extern ErrorCodes DS_SetScorerAlphaBeta(IntPtr** aCtx, internal static unsafe extern ErrorCodes STT_SetScorerAlphaBeta(IntPtr** aCtx,
float aAlpha, float aAlpha,
float aBeta); float aBeta);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl, [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl,
CharSet = CharSet.Ansi, SetLastError = true)] CharSet = CharSet.Ansi, SetLastError = true)]
internal static unsafe extern IntPtr DS_SpeechToText(IntPtr** aCtx, internal static unsafe extern IntPtr STT_SpeechToText(IntPtr** aCtx,
short[] aBuffer, short[] aBuffer,
uint aBufferSize); uint aBufferSize);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl, SetLastError = true)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl, SetLastError = true)]
internal static unsafe extern IntPtr DS_SpeechToTextWithMetadata(IntPtr** aCtx, internal static unsafe extern IntPtr STT_SpeechToTextWithMetadata(IntPtr** aCtx,
short[] aBuffer, short[] aBuffer,
uint aBufferSize, uint aBufferSize,
uint aNumResults); uint aNumResults);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal static unsafe extern void DS_FreeModel(IntPtr** aCtx); internal static unsafe extern void STT_FreeModel(IntPtr** aCtx);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal static unsafe extern ErrorCodes DS_CreateStream(IntPtr** aCtx, internal static unsafe extern ErrorCodes STT_CreateStream(IntPtr** aCtx,
ref IntPtr** retval); ref IntPtr** retval);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal static unsafe extern void DS_FreeStream(IntPtr** aSctx); internal static unsafe extern void STT_FreeStream(IntPtr** aSctx);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal static unsafe extern void DS_FreeMetadata(IntPtr metadata); internal static unsafe extern void STT_FreeMetadata(IntPtr metadata);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal static unsafe extern void DS_FreeString(IntPtr str); internal static unsafe extern void STT_FreeString(IntPtr str);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl, [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl,
CharSet = CharSet.Ansi, SetLastError = true)] CharSet = CharSet.Ansi, SetLastError = true)]
internal static unsafe extern void DS_FeedAudioContent(IntPtr** aSctx, internal static unsafe extern void STT_FeedAudioContent(IntPtr** aSctx,
short[] aBuffer, short[] aBuffer,
uint aBufferSize); uint aBufferSize);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal static unsafe extern IntPtr DS_IntermediateDecode(IntPtr** aSctx); internal static unsafe extern IntPtr STT_IntermediateDecode(IntPtr** aSctx);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal static unsafe extern IntPtr DS_IntermediateDecodeWithMetadata(IntPtr** aSctx, internal static unsafe extern IntPtr STT_IntermediateDecodeWithMetadata(IntPtr** aSctx,
uint aNumResults); uint aNumResults);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl, [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl,
CharSet = CharSet.Ansi, SetLastError = true)] CharSet = CharSet.Ansi, SetLastError = true)]
internal static unsafe extern IntPtr DS_FinishStream(IntPtr** aSctx); internal static unsafe extern IntPtr STT_FinishStream(IntPtr** aSctx);
[DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)] [DllImport("libstt.so", CallingConvention = CallingConvention.Cdecl)]
internal static unsafe extern IntPtr DS_FinishStreamWithMetadata(IntPtr** aSctx, internal static unsafe extern IntPtr STT_FinishStreamWithMetadata(IntPtr** aSctx,
uint aNumResults); uint aNumResults);
#endregion #endregion
} }
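For context, a minimal sketch (assumed, not from this commit) of driving these renamed P/Invoke entry points directly; "model.pbmm" is a placeholder, the class name is invented, and production code goes through the STT wrapper class shown next, since NativeImp and the extensions are internal to the client assembly:

    using System;
    using STTClient;
    using STTClient.Enums;
    using STTClient.Extensions;

    internal static class NativeCallSketch
    {
        internal static unsafe void PrintModelSampleRate()
        {
            IntPtr** modelState = null;
            ErrorCodes rc = NativeImp.STT_CreateModel("model.pbmm", ref modelState); // placeholder path
            if (rc != ErrorCodes.STT_ERR_OK)
                throw new ArgumentException(
                    NativeImp.STT_ErrorCodeToErrorMessage((int)rc).PtrToString());

            // The loaded model reports the sample rate it expects for input audio.
            Console.WriteLine(NativeImp.STT_GetModelSampleRate(modelState));
            NativeImp.STT_FreeModel(modelState);
        }
    }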

View File

@ -1,34 +1,34 @@
using DeepSpeechClient.Interfaces; using STTClient.Interfaces;
using DeepSpeechClient.Extensions; using STTClient.Extensions;
using System; using System;
using System.IO; using System.IO;
using DeepSpeechClient.Enums; using STTClient.Enums;
using DeepSpeechClient.Models; using STTClient.Models;
namespace DeepSpeechClient namespace STTClient
{ {
/// <summary> /// <summary>
/// Concrete implementation of <see cref="DeepSpeechClient.Interfaces.IDeepSpeech"/>. /// Concrete implementation of <see cref="STTClient.Interfaces.ISTT"/>.
/// </summary> /// </summary>
public class DeepSpeech : IDeepSpeech public class STT : ISTT
{ {
private unsafe IntPtr** _modelStatePP; private unsafe IntPtr** _modelStatePP;
/// <summary> /// <summary>
/// Initializes a new instance of <see cref="DeepSpeech"/> class and creates a new acoustic model. /// Initializes a new instance of <see cref="STT"/> class and creates a new acoustic model.
/// </summary> /// </summary>
/// <param name="aModelPath">The path to the frozen model graph.</param> /// <param name="aModelPath">The path to the frozen model graph.</param>
/// <exception cref="ArgumentException">Thrown when the native binary failed to create the model.</exception> /// <exception cref="ArgumentException">Thrown when the native binary failed to create the model.</exception>
public DeepSpeech(string aModelPath) public STT(string aModelPath)
{ {
CreateModel(aModelPath); CreateModel(aModelPath);
} }
#region IDeepSpeech #region ISTT
/// <summary> /// <summary>
/// Create an object providing an interface to a trained DeepSpeech model. /// Create an object providing an interface to a trained STT model.
/// </summary> /// </summary>
/// <param name="aModelPath">The path to the frozen model graph.</param> /// <param name="aModelPath">The path to the frozen model graph.</param>
/// <exception cref="ArgumentException">Thrown when the native binary failed to create the model.</exception> /// <exception cref="ArgumentException">Thrown when the native binary failed to create the model.</exception>
@ -48,7 +48,7 @@ namespace DeepSpeechClient
{ {
throw new FileNotFoundException(exceptionMessage); throw new FileNotFoundException(exceptionMessage);
} }
var resultCode = NativeImp.DS_CreateModel(aModelPath, var resultCode = NativeImp.STT_CreateModel(aModelPath,
ref _modelStatePP); ref _modelStatePP);
EvaluateResultCode(resultCode); EvaluateResultCode(resultCode);
} }
@ -60,7 +60,7 @@ namespace DeepSpeechClient
/// <returns>Beam width value used by the model.</returns> /// <returns>Beam width value used by the model.</returns>
public unsafe uint GetModelBeamWidth() public unsafe uint GetModelBeamWidth()
{ {
return NativeImp.DS_GetModelBeamWidth(_modelStatePP); return NativeImp.STT_GetModelBeamWidth(_modelStatePP);
} }
/// <summary> /// <summary>
@ -70,13 +70,13 @@ namespace DeepSpeechClient
/// <exception cref="ArgumentException">Thrown on failure.</exception> /// <exception cref="ArgumentException">Thrown on failure.</exception>
public unsafe void SetModelBeamWidth(uint aBeamWidth) public unsafe void SetModelBeamWidth(uint aBeamWidth)
{ {
var resultCode = NativeImp.DS_SetModelBeamWidth(_modelStatePP, aBeamWidth); var resultCode = NativeImp.STT_SetModelBeamWidth(_modelStatePP, aBeamWidth);
EvaluateResultCode(resultCode); EvaluateResultCode(resultCode);
} }
/// <summary> /// <summary>
/// Add a hot-word. /// Add a hot-word.
/// ///
/// Words that don't occur in the scorer (e.g. proper nouns) or strings that contain spaces won't be taken into account. /// Words that don't occur in the scorer (e.g. proper nouns) or strings that contain spaces won't be taken into account.
/// </summary> /// </summary>
/// <param name="aWord">Some word</param> /// <param name="aWord">Some word</param>
@ -84,7 +84,7 @@ namespace DeepSpeechClient
/// <exception cref="ArgumentException">Thrown on failure.</exception> /// <exception cref="ArgumentException">Thrown on failure.</exception>
public unsafe void AddHotWord(string aWord, float aBoost) public unsafe void AddHotWord(string aWord, float aBoost)
{ {
var resultCode = NativeImp.DS_AddHotWord(_modelStatePP, aWord, aBoost); var resultCode = NativeImp.STT_AddHotWord(_modelStatePP, aWord, aBoost);
EvaluateResultCode(resultCode); EvaluateResultCode(resultCode);
} }
@ -95,7 +95,7 @@ namespace DeepSpeechClient
/// <exception cref="ArgumentException">Thrown on failure.</exception> /// <exception cref="ArgumentException">Thrown on failure.</exception>
public unsafe void EraseHotWord(string aWord) public unsafe void EraseHotWord(string aWord)
{ {
var resultCode = NativeImp.DS_EraseHotWord(_modelStatePP, aWord); var resultCode = NativeImp.STT_EraseHotWord(_modelStatePP, aWord);
EvaluateResultCode(resultCode); EvaluateResultCode(resultCode);
} }
@ -105,7 +105,7 @@ namespace DeepSpeechClient
/// <exception cref="ArgumentException">Thrown on failure.</exception> /// <exception cref="ArgumentException">Thrown on failure.</exception>
public unsafe void ClearHotWords() public unsafe void ClearHotWords()
{ {
var resultCode = NativeImp.DS_ClearHotWords(_modelStatePP); var resultCode = NativeImp.STT_ClearHotWords(_modelStatePP);
EvaluateResultCode(resultCode); EvaluateResultCode(resultCode);
} }
@ -115,7 +115,7 @@ namespace DeepSpeechClient
/// <returns>Sample rate.</returns> /// <returns>Sample rate.</returns>
public unsafe int GetModelSampleRate() public unsafe int GetModelSampleRate()
{ {
return NativeImp.DS_GetModelSampleRate(_modelStatePP); return NativeImp.STT_GetModelSampleRate(_modelStatePP);
} }
/// <summary> /// <summary>
@ -124,9 +124,9 @@ namespace DeepSpeechClient
/// <param name="resultCode">Native result code.</param> /// <param name="resultCode">Native result code.</param>
private void EvaluateResultCode(ErrorCodes resultCode) private void EvaluateResultCode(ErrorCodes resultCode)
{ {
if (resultCode != ErrorCodes.DS_ERR_OK) if (resultCode != ErrorCodes.STT_ERR_OK)
{ {
throw new ArgumentException(NativeImp.DS_ErrorCodeToErrorMessage((int)resultCode).PtrToString()); throw new ArgumentException(NativeImp.STT_ErrorCodeToErrorMessage((int)resultCode).PtrToString());
} }
} }
@ -135,7 +135,7 @@ namespace DeepSpeechClient
/// </summary> /// </summary>
public unsafe void Dispose() public unsafe void Dispose()
{ {
NativeImp.DS_FreeModel(_modelStatePP); NativeImp.STT_FreeModel(_modelStatePP);
} }
/// <summary> /// <summary>
@ -155,7 +155,7 @@ namespace DeepSpeechClient
throw new FileNotFoundException($"Cannot find the scorer file: {aScorerPath}"); throw new FileNotFoundException($"Cannot find the scorer file: {aScorerPath}");
} }
var resultCode = NativeImp.DS_EnableExternalScorer(_modelStatePP, aScorerPath); var resultCode = NativeImp.STT_EnableExternalScorer(_modelStatePP, aScorerPath);
EvaluateResultCode(resultCode); EvaluateResultCode(resultCode);
} }
@ -165,7 +165,7 @@ namespace DeepSpeechClient
/// <exception cref="ArgumentException">Thrown when an external scorer is not enabled.</exception> /// <exception cref="ArgumentException">Thrown when an external scorer is not enabled.</exception>
public unsafe void DisableExternalScorer() public unsafe void DisableExternalScorer()
{ {
var resultCode = NativeImp.DS_DisableExternalScorer(_modelStatePP); var resultCode = NativeImp.STT_DisableExternalScorer(_modelStatePP);
EvaluateResultCode(resultCode); EvaluateResultCode(resultCode);
} }
@ -177,7 +177,7 @@ namespace DeepSpeechClient
/// <exception cref="ArgumentException">Thrown when an external scorer is not enabled.</exception> /// <exception cref="ArgumentException">Thrown when an external scorer is not enabled.</exception>
public unsafe void SetScorerAlphaBeta(float aAlpha, float aBeta) public unsafe void SetScorerAlphaBeta(float aAlpha, float aBeta)
{ {
var resultCode = NativeImp.DS_SetScorerAlphaBeta(_modelStatePP, var resultCode = NativeImp.STT_SetScorerAlphaBeta(_modelStatePP,
aAlpha, aAlpha,
aBeta); aBeta);
EvaluateResultCode(resultCode); EvaluateResultCode(resultCode);
@ -188,9 +188,9 @@ namespace DeepSpeechClient
/// </summary> /// </summary>
/// <param name="stream">Instance of the stream to feed the data.</param> /// <param name="stream">Instance of the stream to feed the data.</param>
/// <param name="aBuffer">An array of 16-bit, mono raw audio samples at the appropriate sample rate (matching what the model was trained on).</param> /// <param name="aBuffer">An array of 16-bit, mono raw audio samples at the appropriate sample rate (matching what the model was trained on).</param>
public unsafe void FeedAudioContent(DeepSpeechStream stream, short[] aBuffer, uint aBufferSize) public unsafe void FeedAudioContent(Stream stream, short[] aBuffer, uint aBufferSize)
{ {
NativeImp.DS_FeedAudioContent(stream.GetNativePointer(), aBuffer, aBufferSize); NativeImp.STT_FeedAudioContent(stream.GetNativePointer(), aBuffer, aBufferSize);
} }
/// <summary> /// <summary>
@ -198,9 +198,9 @@ namespace DeepSpeechClient
/// </summary> /// </summary>
/// <param name="stream">Instance of the stream to finish.</param> /// <param name="stream">Instance of the stream to finish.</param>
/// <returns>The STT result.</returns> /// <returns>The STT result.</returns>
public unsafe string FinishStream(DeepSpeechStream stream) public unsafe string FinishStream(Stream stream)
{ {
return NativeImp.DS_FinishStream(stream.GetNativePointer()).PtrToString(); return NativeImp.STT_FinishStream(stream.GetNativePointer()).PtrToString();
} }
/// <summary> /// <summary>
@ -209,9 +209,9 @@ namespace DeepSpeechClient
/// <param name="stream">Instance of the stream to finish.</param> /// <param name="stream">Instance of the stream to finish.</param>
/// <param name="aNumResults">Maximum number of candidate transcripts to return. Returned list might be smaller than this.</param> /// <param name="aNumResults">Maximum number of candidate transcripts to return. Returned list might be smaller than this.</param>
/// <returns>The extended metadata result.</returns> /// <returns>The extended metadata result.</returns>
public unsafe Metadata FinishStreamWithMetadata(DeepSpeechStream stream, uint aNumResults) public unsafe Metadata FinishStreamWithMetadata(Stream stream, uint aNumResults)
{ {
return NativeImp.DS_FinishStreamWithMetadata(stream.GetNativePointer(), aNumResults).PtrToMetadata(); return NativeImp.STT_FinishStreamWithMetadata(stream.GetNativePointer(), aNumResults).PtrToMetadata();
} }
/// <summary> /// <summary>
@ -219,9 +219,9 @@ namespace DeepSpeechClient
/// </summary> /// </summary>
/// <param name="stream">Instance of the stream to decode.</param> /// <param name="stream">Instance of the stream to decode.</param>
/// <returns>The STT intermediate result.</returns> /// <returns>The STT intermediate result.</returns>
public unsafe string IntermediateDecode(DeepSpeechStream stream) public unsafe string IntermediateDecode(Stream stream)
{ {
return NativeImp.DS_IntermediateDecode(stream.GetNativePointer()).PtrToString(); return NativeImp.STT_IntermediateDecode(stream.GetNativePointer()).PtrToString();
} }
/// <summary> /// <summary>
@ -230,9 +230,9 @@ namespace DeepSpeechClient
/// <param name="stream">Instance of the stream to decode.</param> /// <param name="stream">Instance of the stream to decode.</param>
/// <param name="aNumResults">Maximum number of candidate transcripts to return. Returned list might be smaller than this.</param> /// <param name="aNumResults">Maximum number of candidate transcripts to return. Returned list might be smaller than this.</param>
/// <returns>The STT intermediate result.</returns> /// <returns>The STT intermediate result.</returns>
public unsafe Metadata IntermediateDecodeWithMetadata(DeepSpeechStream stream, uint aNumResults) public unsafe Metadata IntermediateDecodeWithMetadata(Stream stream, uint aNumResults)
{ {
return NativeImp.DS_IntermediateDecodeWithMetadata(stream.GetNativePointer(), aNumResults).PtrToMetadata(); return NativeImp.STT_IntermediateDecodeWithMetadata(stream.GetNativePointer(), aNumResults).PtrToMetadata();
} }
/// <summary> /// <summary>
@ -241,18 +241,18 @@ namespace DeepSpeechClient
/// </summary> /// </summary>
public unsafe string Version() public unsafe string Version()
{ {
return NativeImp.DS_Version().PtrToString(); return NativeImp.STT_Version().PtrToString();
} }
/// <summary> /// <summary>
/// Creates a new streaming inference state. /// Creates a new streaming inference state.
/// </summary> /// </summary>
public unsafe DeepSpeechStream CreateStream() public unsafe Stream CreateStream()
{ {
IntPtr** streamingStatePointer = null; IntPtr** streamingStatePointer = null;
var resultCode = NativeImp.DS_CreateStream(_modelStatePP, ref streamingStatePointer); var resultCode = NativeImp.STT_CreateStream(_modelStatePP, ref streamingStatePointer);
EvaluateResultCode(resultCode); EvaluateResultCode(resultCode);
return new DeepSpeechStream(streamingStatePointer); return new Stream(streamingStatePointer);
} }
/// <summary> /// <summary>
@ -260,25 +260,25 @@ namespace DeepSpeechClient
/// This can be used if you no longer need the result of an ongoing streaming /// This can be used if you no longer need the result of an ongoing streaming
/// inference and don't want to perform a costly decode operation. /// inference and don't want to perform a costly decode operation.
/// </summary> /// </summary>
public unsafe void FreeStream(DeepSpeechStream stream) public unsafe void FreeStream(Stream stream)
{ {
NativeImp.DS_FreeStream(stream.GetNativePointer()); NativeImp.STT_FreeStream(stream.GetNativePointer());
stream.Dispose(); stream.Dispose();
} }
/// <summary> /// <summary>
/// Use the DeepSpeech model to perform Speech-To-Text. /// Use the STT model to perform Speech-To-Text.
/// </summary> /// </summary>
/// <param name="aBuffer">A 16-bit, mono raw audio signal at the appropriate sample rate (matching what the model was trained on).</param> /// <param name="aBuffer">A 16-bit, mono raw audio signal at the appropriate sample rate (matching what the model was trained on).</param>
/// <param name="aBufferSize">The number of samples in the audio signal.</param> /// <param name="aBufferSize">The number of samples in the audio signal.</param>
/// <returns>The STT result. Returns NULL on error.</returns> /// <returns>The STT result. Returns NULL on error.</returns>
public unsafe string SpeechToText(short[] aBuffer, uint aBufferSize) public unsafe string SpeechToText(short[] aBuffer, uint aBufferSize)
{ {
return NativeImp.DS_SpeechToText(_modelStatePP, aBuffer, aBufferSize).PtrToString(); return NativeImp.STT_SpeechToText(_modelStatePP, aBuffer, aBufferSize).PtrToString();
} }
/// <summary> /// <summary>
/// Use the DeepSpeech model to perform Speech-To-Text, return results including metadata. /// Use the STT model to perform Speech-To-Text, return results including metadata.
/// </summary> /// </summary>
/// <param name="aBuffer">A 16-bit, mono raw audio signal at the appropriate sample rate (matching what the model was trained on).</param> /// <param name="aBuffer">A 16-bit, mono raw audio signal at the appropriate sample rate (matching what the model was trained on).</param>
/// <param name="aBufferSize">The number of samples in the audio signal.</param> /// <param name="aBufferSize">The number of samples in the audio signal.</param>
@ -286,7 +286,7 @@ namespace DeepSpeechClient
/// <returns>The extended metadata. Returns NULL on error.</returns> /// <returns>The extended metadata. Returns NULL on error.</returns>
public unsafe Metadata SpeechToTextWithMetadata(short[] aBuffer, uint aBufferSize, uint aNumResults) public unsafe Metadata SpeechToTextWithMetadata(short[] aBuffer, uint aBufferSize, uint aNumResults)
{ {
return NativeImp.DS_SpeechToTextWithMetadata(_modelStatePP, aBuffer, aBufferSize, aNumResults).PtrToMetadata(); return NativeImp.STT_SpeechToTextWithMetadata(_modelStatePP, aBuffer, aBufferSize, aNumResults).PtrToMetadata();
} }
#endregion #endregion
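Tying the renamed wrapper methods together, a hedged sketch of hot-word boosting; the model and scorer paths are placeholders, the EnableExternalScorer name is assumed from the pattern of this class, and a scorer must be enabled first (the Java tests later in this diff expect AddHotWord to fail without one):

    using STTClient;

    static class HotWordSketch
    {
        static void BoostAndClear()
        {
            using (var stt = new STT("model.pbmm"))           // placeholder model path
            {
                stt.EnableExternalScorer("kenlm.scorer");     // assumed public name; hot-words need a scorer
                stt.AddHotWord("coqui", 10.0f);               // positive boost favours the word during decoding
                // ... run SpeechToText or the streaming API as usual ...
                stt.EraseHotWord("coqui");
                stt.ClearHotWords();
            }
        }
    }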

View File

@ -1,7 +1,7 @@
using System; using System;
using System.Runtime.InteropServices; using System.Runtime.InteropServices;
namespace DeepSpeechClient.Structs namespace STTClient.Structs
{ {
[StructLayout(LayoutKind.Sequential)] [StructLayout(LayoutKind.Sequential)]
internal unsafe struct CandidateTranscript internal unsafe struct CandidateTranscript

View File

@ -1,7 +1,7 @@
using System; using System;
using System.Runtime.InteropServices; using System.Runtime.InteropServices;
namespace DeepSpeechClient.Structs namespace STTClient.Structs
{ {
[StructLayout(LayoutKind.Sequential)] [StructLayout(LayoutKind.Sequential)]
internal unsafe struct Metadata internal unsafe struct Metadata

View File

@ -1,7 +1,7 @@
using System; using System;
using System.Runtime.InteropServices; using System.Runtime.InteropServices;
namespace DeepSpeechClient.Structs namespace STTClient.Structs
{ {
[StructLayout(LayoutKind.Sequential)] [StructLayout(LayoutKind.Sequential)]
internal unsafe struct TokenMetadata internal unsafe struct TokenMetadata

View File

@ -1,6 +1,6 @@
using DeepSpeechClient; using STTClient;
using DeepSpeechClient.Interfaces; using STTClient.Interfaces;
using DeepSpeechClient.Models; using STTClient.Models;
using NAudio.Wave; using NAudio.Wave;
using System; using System;
using System.Collections.Generic; using System.Collections.Generic;
@ -54,7 +54,7 @@ namespace CSharpExamples
Console.WriteLine("Loading model..."); Console.WriteLine("Loading model...");
stopwatch.Start(); stopwatch.Start();
// sphinx-doc: csharp_ref_model_start // sphinx-doc: csharp_ref_model_start
using (IDeepSpeech sttClient = new DeepSpeech(model ?? "output_graph.pbmm")) using (ISTT sttClient = new STT(model ?? "output_graph.pbmm"))
{ {
// sphinx-doc: csharp_ref_model_stop // sphinx-doc: csharp_ref_model_stop
stopwatch.Stop(); stopwatch.Stop();
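A hedged sketch of how the console sample typically proceeds once the model above is loaded: read a 16 kHz, 16-bit mono WAV with NAudio, convert the bytes to samples, and run batch inference. The helper below is illustrative and not part of this commit:

    using System;
    using NAudio.Wave;
    using STTClient.Interfaces;

    static class TranscribeSketch
    {
        // audioPath must point at a 16 kHz, 16-bit mono WAV matching the model's sample rate.
        static string TranscribeWav(ISTT sttClient, string audioPath)
        {
            using (var reader = new WaveFileReader(audioPath))
            {
                byte[] bytes = new byte[reader.Length];
                reader.Read(bytes, 0, bytes.Length);          // short clips fit in a single read

                // Reinterpret the little-endian 16-bit PCM bytes as samples.
                short[] samples = new short[bytes.Length / 2];
                Buffer.BlockCopy(bytes, 0, samples, 0, bytes.Length);

                return sttClient.SpeechToText(samples, (uint)samples.Length);
            }
        }
    }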

View File

@ -5,12 +5,12 @@ using System.Runtime.InteropServices;
// General Information about an assembly is controlled through the following // General Information about an assembly is controlled through the following
// set of attributes. Change these attribute values to modify the information // set of attributes. Change these attribute values to modify the information
// associated with an assembly. // associated with an assembly.
[assembly: AssemblyTitle("DeepSpeechConsole")] [assembly: AssemblyTitle("STTConsole")]
[assembly: AssemblyDescription("")] [assembly: AssemblyDescription("")]
[assembly: AssemblyConfiguration("")] [assembly: AssemblyConfiguration("")]
[assembly: AssemblyCompany("")] [assembly: AssemblyCompany("Coqui GmbH")]
[assembly: AssemblyProduct("CSharpExamples")] [assembly: AssemblyProduct("STTConsole")]
[assembly: AssemblyCopyright("Copyright © 2018")] [assembly: AssemblyCopyright("Copyright © 2018-2020 Mozilla, © 2021 Coqui GmbH")]
[assembly: AssemblyTrademark("")] [assembly: AssemblyTrademark("")]
[assembly: AssemblyCulture("")] [assembly: AssemblyCulture("")]

View File

@ -6,8 +6,8 @@
<Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform> <Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform>
<ProjectGuid>{312965E5-C4F6-4D95-BA64-79906B8BC7AC}</ProjectGuid> <ProjectGuid>{312965E5-C4F6-4D95-BA64-79906B8BC7AC}</ProjectGuid>
<OutputType>Exe</OutputType> <OutputType>Exe</OutputType>
<RootNamespace>DeepSpeechConsole</RootNamespace> <RootNamespace>STTConsole</RootNamespace>
<AssemblyName>DeepSpeechConsole</AssemblyName> <AssemblyName>STTConsole</AssemblyName>
<TargetFrameworkVersion>v4.6.2</TargetFrameworkVersion> <TargetFrameworkVersion>v4.6.2</TargetFrameworkVersion>
<FileAlignment>512</FileAlignment> <FileAlignment>512</FileAlignment>
<AutoGenerateBindingRedirects>true</AutoGenerateBindingRedirects> <AutoGenerateBindingRedirects>true</AutoGenerateBindingRedirects>
@ -56,9 +56,9 @@
<None Include="packages.config" /> <None Include="packages.config" />
</ItemGroup> </ItemGroup>
<ItemGroup> <ItemGroup>
<ProjectReference Include="..\DeepSpeechClient\DeepSpeechClient.csproj"> <ProjectReference Include="..\STTClient\STTClient.csproj">
<Project>{56DE4091-BBBE-47E4-852D-7268B33B971F}</Project> <Project>{56DE4091-BBBE-47E4-852D-7268B33B971F}</Project>
<Name>DeepSpeechClient</Name> <Name>STTClient</Name>
</ProjectReference> </ProjectReference>
</ItemGroup> </ItemGroup>
<ItemGroup> <ItemGroup>

View File

@ -1,8 +1,8 @@
<Application <Application
x:Class="DeepSpeechWPF.App" x:Class="STTWPF.App"
xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation" xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml" xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
xmlns:local="clr-namespace:DeepSpeechWPF" xmlns:local="clr-namespace:STTWPF"
StartupUri="MainWindow.xaml"> StartupUri="MainWindow.xaml">
<Application.Resources /> <Application.Resources />
</Application> </Application>

View File

@ -1,10 +1,10 @@
using CommonServiceLocator; using CommonServiceLocator;
using DeepSpeech.WPF.ViewModels; using STT.WPF.ViewModels;
using DeepSpeechClient.Interfaces; using STTClient.Interfaces;
using GalaSoft.MvvmLight.Ioc; using GalaSoft.MvvmLight.Ioc;
using System.Windows; using System.Windows;
namespace DeepSpeechWPF namespace STTWPF
{ {
/// <summary> /// <summary>
/// Interaction logic for App.xaml /// Interaction logic for App.xaml
@ -18,11 +18,11 @@ namespace DeepSpeechWPF
try try
{ {
//Register instance of DeepSpeech //Register instance of STT
DeepSpeechClient.DeepSpeech deepSpeechClient = STTClient.STT client =
new DeepSpeechClient.DeepSpeech("deepspeech-0.8.0-models.pbmm"); new STTClient.STT("coqui-stt-0.8.0-models.pbmm");
SimpleIoc.Default.Register<IDeepSpeech>(() => deepSpeechClient); SimpleIoc.Default.Register<ISTT>(() => client);
SimpleIoc.Default.Register<MainWindowViewModel>(); SimpleIoc.Default.Register<MainWindowViewModel>();
} }
catch (System.Exception ex) catch (System.Exception ex)
@ -35,8 +35,8 @@ namespace DeepSpeechWPF
protected override void OnExit(ExitEventArgs e) protected override void OnExit(ExitEventArgs e)
{ {
base.OnExit(e); base.OnExit(e);
//Dispose instance of DeepSpeech //Dispose instance of STT
ServiceLocator.Current.GetInstance<IDeepSpeech>()?.Dispose(); ServiceLocator.Current.GetInstance<ISTT>()?.Dispose();
} }
} }
} }

View File

@ -1,10 +1,10 @@
<Window <Window
x:Class="DeepSpeechWPF.MainWindow" x:Class="STTWPF.MainWindow"
xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation" xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml" xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
xmlns:d="http://schemas.microsoft.com/expression/blend/2008" xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
Title="Deepspeech client" Title="STT client"
Width="800" Width="800"
Height="600" Height="600"
Loaded="Window_Loaded" Loaded="Window_Loaded"

View File

@ -1,8 +1,8 @@
using CommonServiceLocator; using CommonServiceLocator;
using DeepSpeech.WPF.ViewModels; using STT.WPF.ViewModels;
using System.Windows; using System.Windows;
namespace DeepSpeechWPF namespace STTWPF
{ {
/// <summary> /// <summary>
/// Interaction logic for MainWindow.xaml /// Interaction logic for MainWindow.xaml
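The visible hunk stops at the class comment; a hedged sketch of the constructor that typically follows, resolving the view model registered in App.OnStartup above (the constructor body is assumed, not shown in this diff):

    using System.Windows;
    using CommonServiceLocator;
    using STT.WPF.ViewModels;

    namespace STTWPF
    {
        public partial class MainWindow : Window
        {
            public MainWindow()
            {
                InitializeComponent();
                // Resolve the view model that App.OnStartup registered above.
                DataContext = ServiceLocator.Current.GetInstance<MainWindowViewModel>();
            }
        }
    }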

View File

@ -7,12 +7,12 @@ using System.Windows;
// General Information about an assembly is controlled through the following // General Information about an assembly is controlled through the following
// set of attributes. Change these attribute values to modify the information // set of attributes. Change these attribute values to modify the information
// associated with an assembly. // associated with an assembly.
[assembly: AssemblyTitle("DeepSpeech.WPF")] [assembly: AssemblyTitle("STT.WPF")]
[assembly: AssemblyDescription("")] [assembly: AssemblyDescription("")]
[assembly: AssemblyConfiguration("")] [assembly: AssemblyConfiguration("")]
[assembly: AssemblyCompany("")] [assembly: AssemblyCompany("Coqui GmbH")]
[assembly: AssemblyProduct("DeepSpeech.WPF.SingleFiles")] [assembly: AssemblyProduct("STT.WPF.SingleFiles")]
[assembly: AssemblyCopyright("Copyright © 2018")] [assembly: AssemblyCopyright("Copyright © 2018-2020 Mozilla, © 2021 Coqui GmbH")]
[assembly: AssemblyTrademark("")] [assembly: AssemblyTrademark("")]
[assembly: AssemblyCulture("")] [assembly: AssemblyCulture("")]

View File

@ -8,7 +8,7 @@
// </auto-generated> // </auto-generated>
//------------------------------------------------------------------------------ //------------------------------------------------------------------------------
namespace DeepSpeech.WPF.Properties { namespace STT.WPF.Properties {
using System; using System;
@ -39,7 +39,7 @@ namespace DeepSpeech.WPF.Properties {
internal static global::System.Resources.ResourceManager ResourceManager { internal static global::System.Resources.ResourceManager ResourceManager {
get { get {
if (object.ReferenceEquals(resourceMan, null)) { if (object.ReferenceEquals(resourceMan, null)) {
global::System.Resources.ResourceManager temp = new global::System.Resources.ResourceManager("DeepSpeech.WPF.Properties.Resources", typeof(Resources).Assembly); global::System.Resources.ResourceManager temp = new global::System.Resources.ResourceManager("STT.WPF.Properties.Resources", typeof(Resources).Assembly);
resourceMan = temp; resourceMan = temp;
} }
return resourceMan; return resourceMan;

View File

@ -8,7 +8,7 @@
// </auto-generated> // </auto-generated>
//------------------------------------------------------------------------------ //------------------------------------------------------------------------------
namespace DeepSpeech.WPF.Properties { namespace STT.WPF.Properties {
[global::System.Runtime.CompilerServices.CompilerGeneratedAttribute()] [global::System.Runtime.CompilerServices.CompilerGeneratedAttribute()]

View File

@ -6,8 +6,8 @@
<Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform> <Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform>
<ProjectGuid>{54BFD766-4305-4F4C-BA59-AF45505DF3C1}</ProjectGuid> <ProjectGuid>{54BFD766-4305-4F4C-BA59-AF45505DF3C1}</ProjectGuid>
<OutputType>WinExe</OutputType> <OutputType>WinExe</OutputType>
<RootNamespace>DeepSpeech.WPF</RootNamespace> <RootNamespace>STT.WPF</RootNamespace>
<AssemblyName>DeepSpeech.WPF</AssemblyName> <AssemblyName>STT.WPF</AssemblyName>
<TargetFrameworkVersion>v4.6.2</TargetFrameworkVersion> <TargetFrameworkVersion>v4.6.2</TargetFrameworkVersion>
<FileAlignment>512</FileAlignment> <FileAlignment>512</FileAlignment>
<ProjectTypeGuids>{60dc8134-eba5-43b8-bcc9-bb4bc16c2548};{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}</ProjectTypeGuids> <ProjectTypeGuids>{60dc8134-eba5-43b8-bcc9-bb4bc16c2548};{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}</ProjectTypeGuids>
@ -131,9 +131,9 @@
<None Include="App.config" /> <None Include="App.config" />
</ItemGroup> </ItemGroup>
<ItemGroup> <ItemGroup>
<ProjectReference Include="..\DeepSpeechClient\DeepSpeechClient.csproj"> <ProjectReference Include="..\STTClient\STTClient.csproj">
<Project>{56de4091-bbbe-47e4-852d-7268b33b971f}</Project> <Project>{56de4091-bbbe-47e4-852d-7268b33b971f}</Project>
<Name>DeepSpeechClient</Name> <Name>STTClient</Name>
</ProjectReference> </ProjectReference>
</ItemGroup> </ItemGroup>
<Import Project="$(MSBuildToolsPath)\Microsoft.CSharp.targets" /> <Import Project="$(MSBuildToolsPath)\Microsoft.CSharp.targets" />

View File

@ -3,9 +3,9 @@ Microsoft Visual Studio Solution File, Format Version 12.00
# Visual Studio 15 # Visual Studio 15
VisualStudioVersion = 15.0.28307.421 VisualStudioVersion = 15.0.28307.421
MinimumVisualStudioVersion = 10.0.40219.1 MinimumVisualStudioVersion = 10.0.40219.1
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "DeepSpeech.WPF", "DeepSpeech.WPF.csproj", "{54BFD766-4305-4F4C-BA59-AF45505DF3C1}" Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "STT.WPF", "STT.WPF.csproj", "{54BFD766-4305-4F4C-BA59-AF45505DF3C1}"
EndProject EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "DeepSpeechClient", "..\DeepSpeechClient\DeepSpeechClient.csproj", "{56DE4091-BBBE-47E4-852D-7268B33B971F}" Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "STTClient", "..\STTClient\STTClient.csproj", "{56DE4091-BBBE-47E4-852D-7268B33B971F}"
EndProject EndProject
Global Global
GlobalSection(SolutionConfigurationPlatforms) = preSolution GlobalSection(SolutionConfigurationPlatforms) = preSolution

View File

@ -3,7 +3,7 @@ using System.Collections.Generic;
using System.ComponentModel; using System.ComponentModel;
using System.Runtime.CompilerServices; using System.Runtime.CompilerServices;
namespace DeepSpeech.WPF.ViewModels namespace STT.WPF.ViewModels
{ {
/// <summary> /// <summary>
/// Implementation of <see cref="INotifyPropertyChanged"/> to simplify models. /// Implementation of <see cref="INotifyPropertyChanged"/> to simplify models.

View File

@ -3,8 +3,8 @@ using CSCore;
using CSCore.CoreAudioAPI; using CSCore.CoreAudioAPI;
using CSCore.SoundIn; using CSCore.SoundIn;
using CSCore.Streams; using CSCore.Streams;
using DeepSpeechClient.Interfaces; using STTClient.Interfaces;
using DeepSpeechClient.Models; using STTClient.Models;
using GalaSoft.MvvmLight.CommandWpf; using GalaSoft.MvvmLight.CommandWpf;
using Microsoft.Win32; using Microsoft.Win32;
using System; using System;
@ -15,7 +15,7 @@ using System.IO;
using System.Threading; using System.Threading;
using System.Threading.Tasks; using System.Threading.Tasks;
namespace DeepSpeech.WPF.ViewModels namespace STT.WPF.ViewModels
{ {
/// <summary> /// <summary>
/// View model of the MainWindow View. /// View model of the MainWindow View.
@ -27,7 +27,7 @@ namespace DeepSpeech.WPF.ViewModels
private const string ScorerPath = "kenlm.scorer"; private const string ScorerPath = "kenlm.scorer";
#endregion #endregion
private readonly IDeepSpeech _sttClient; private readonly ISTT _sttClient;
#region Commands #region Commands
/// <summary> /// <summary>
@ -62,7 +62,7 @@ namespace DeepSpeech.WPF.ViewModels
/// <summary> /// <summary>
/// Stream used to feed data into the acoustic model. /// Stream used to feed data into the acoustic model.
/// </summary> /// </summary>
private DeepSpeechStream _sttStream; private Stream _sttStream;
/// <summary> /// <summary>
/// Records the audio of the selected device. /// Records the audio of the selected device.
@ -75,7 +75,7 @@ namespace DeepSpeech.WPF.ViewModels
private SoundInSource _soundInSource; private SoundInSource _soundInSource;
/// <summary> /// <summary>
/// Target wave source.(16KHz Mono 16bit for DeepSpeech) /// Target wave source.(16KHz Mono 16bit for STT)
/// </summary> /// </summary>
private IWaveSource _convertedSource; private IWaveSource _convertedSource;
@ -200,7 +200,7 @@ namespace DeepSpeech.WPF.ViewModels
#endregion #endregion
#region Ctors #region Ctors
public MainWindowViewModel(IDeepSpeech sttClient) public MainWindowViewModel(ISTT sttClient)
{ {
_sttClient = sttClient; _sttClient = sttClient;
@ -290,7 +290,7 @@ namespace DeepSpeech.WPF.ViewModels
//read data from the converedSource //read data from the converedSource
//important: don't use the e.Data here //important: don't use the e.Data here
//the e.Data contains the raw data provided by the //the e.Data contains the raw data provided by the
//soundInSource which won't have the deepspeech required audio format //soundInSource which won't have the STT required audio format
byte[] buffer = new byte[_convertedSource.WaveFormat.BytesPerSecond / 2]; byte[] buffer = new byte[_convertedSource.WaveFormat.BytesPerSecond / 2];
int read; int read;
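The hunk above stops just after the buffer declaration; a hedged sketch of the read-and-feed loop that typically follows, converting the resampled bytes into 16-bit samples for the streaming state (a continuation of the method shown, not code from this commit):

    // Continuation sketch (assumed): pull the converted 16 kHz mono 16-bit audio
    // out of _convertedSource and feed it to the streaming state created earlier.
    while ((read = _convertedSource.Read(buffer, 0, buffer.Length)) > 0)
    {
        short[] samples = new short[read / 2];
        Buffer.BlockCopy(buffer, 0, samples, 0, read);
        _sttClient.FeedAudioContent(_sttStream, samples, (uint)samples.Length);
    }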

View File

@ -66,9 +66,9 @@ create_package(absl::optional<string> alphabet_path,
scorer.set_utf8_mode(force_bytes_output_mode.value()); scorer.set_utf8_mode(force_bytes_output_mode.value());
scorer.reset_params(default_alpha, default_beta); scorer.reset_params(default_alpha, default_beta);
int err = scorer.load_lm(lm_path); int err = scorer.load_lm(lm_path);
if (err != DS_ERR_SCORER_NO_TRIE) { if (err != STT_ERR_SCORER_NO_TRIE) {
cerr << "Error loading language model file: " cerr << "Error loading language model file: "
<< (err == DS_ERR_SCORER_UNREADABLE ? "Can't open binary LM file." : DS_ErrorCodeToErrorMessage(err)) << (err == STT_ERR_SCORER_UNREADABLE ? "Can't open binary LM file." : STT_ErrorCodeToErrorMessage(err))
<< "\n"; << "\n";
return 1; return 1;
} }

View File

@ -9,7 +9,7 @@
android:roundIcon="@mipmap/ic_launcher_round" android:roundIcon="@mipmap/ic_launcher_round"
android:supportsRtl="true" android:supportsRtl="true"
android:theme="@style/AppTheme"> android:theme="@style/AppTheme">
<activity android:name=".DeepSpeechActivity"> <activity android:name=".STTActivity">
<intent-filter> <intent-filter>
<action android:name="android.intent.action.MAIN" /> <action android:name="android.intent.action.MAIN" />

View File

@ -16,11 +16,11 @@ import java.io.IOException;
import java.nio.ByteOrder; import java.nio.ByteOrder;
import java.nio.ByteBuffer; import java.nio.ByteBuffer;
import ai.coqui.libstt.DeepSpeechModel; import ai.coqui.libstt.STTModel;
public class DeepSpeechActivity extends AppCompatActivity { public class STTActivity extends AppCompatActivity {
DeepSpeechModel _m = null; STTModel _m = null;
EditText _tfliteModel; EditText _tfliteModel;
EditText _audioFile; EditText _audioFile;
@ -50,7 +50,7 @@ public class DeepSpeechActivity extends AppCompatActivity {
this._tfliteStatus.setText("Creating model"); this._tfliteStatus.setText("Creating model");
if (this._m == null) { if (this._m == null) {
// sphinx-doc: java_ref_model_start // sphinx-doc: java_ref_model_start
this._m = new DeepSpeechModel(tfliteModel); this._m = new STTModel(tfliteModel);
this._m.setBeamWidth(BEAM_WIDTH); this._m.setBeamWidth(BEAM_WIDTH);
// sphinx-doc: java_ref_model_stop // sphinx-doc: java_ref_model_stop
} }
@ -124,7 +124,7 @@ public class DeepSpeechActivity extends AppCompatActivity {
@Override @Override
protected void onCreate(Bundle savedInstanceState) { protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState); super.onCreate(savedInstanceState);
setContentView(R.layout.activity_deep_speech); setContentView(R.layout.activity_stt);
this._decodedString = (TextView) findViewById(R.id.decodedString); this._decodedString = (TextView) findViewById(R.id.decodedString);
this._tfliteStatus = (TextView) findViewById(R.id.tfliteStatus); this._tfliteStatus = (TextView) findViewById(R.id.tfliteStatus);

View File

@ -4,7 +4,7 @@
xmlns:tools="http://schemas.android.com/tools" xmlns:tools="http://schemas.android.com/tools"
android:layout_width="match_parent" android:layout_width="match_parent"
android:layout_height="match_parent" android:layout_height="match_parent"
tools:context=".DeepSpeechActivity"> tools:context=".STTActivity">
<!-- <!--
<TextView <TextView

View File

@ -10,7 +10,7 @@
%javaconst(1); %javaconst(1);
%include "arrays_java.i" %include "arrays_java.i"
// apply to DS_FeedAudioContent and DS_SpeechToText // apply to STT_FeedAudioContent and STT_SpeechToText
%apply short[] { short* }; %apply short[] { short* };
%include "cpointer.i" %include "cpointer.i"
@ -43,7 +43,7 @@
} }
~Metadata() { ~Metadata() {
DS_FreeMetadata(self); STT_FreeMetadata(self);
} }
} }
@ -54,13 +54,13 @@
%nodefaultctor TokenMetadata; %nodefaultctor TokenMetadata;
%nodefaultdtor TokenMetadata; %nodefaultdtor TokenMetadata;
%typemap(newfree) char* "DS_FreeString($1);"; %typemap(newfree) char* "STT_FreeString($1);";
%newobject DS_SpeechToText; %newobject STT_SpeechToText;
%newobject DS_IntermediateDecode; %newobject STT_IntermediateDecode;
%newobject DS_FinishStream; %newobject STT_FinishStream;
%newobject DS_ErrorCodeToErrorMessage; %newobject STT_ErrorCodeToErrorMessage;
%rename ("%(strip:[DS_])s") ""; %rename ("%(strip:[STT_])s") "";
// make struct members camel case to suit Java conventions // make struct members camel case to suit Java conventions
%rename ("%(camelcase)s", %$ismember) ""; %rename ("%(camelcase)s", %$ismember) "";

View File

@ -11,7 +11,7 @@ import org.junit.runners.MethodSorters;
import static org.junit.Assert.*; import static org.junit.Assert.*;
import ai.coqui.libstt.DeepSpeechModel; import ai.coqui.libstt.STTModel;
import ai.coqui.libstt.CandidateTranscript; import ai.coqui.libstt.CandidateTranscript;
import java.io.RandomAccessFile; import java.io.RandomAccessFile;
@ -58,8 +58,8 @@ public class BasicTest {
} }
@Test @Test
public void loadDeepSpeech_basic() { public void loadSTT_basic() {
DeepSpeechModel m = new DeepSpeechModel(modelFile); STTModel m = new STTModel(modelFile);
m.freeModel(); m.freeModel();
} }
@ -71,7 +71,7 @@ public class BasicTest {
return retval; return retval;
} }
private String doSTT(DeepSpeechModel m, boolean extendedMetadata) { private String doSTT(STTModel m, boolean extendedMetadata) {
try { try {
RandomAccessFile wave = new RandomAccessFile(wavFile, "r"); RandomAccessFile wave = new RandomAccessFile(wavFile, "r");
@ -115,8 +115,8 @@ public class BasicTest {
} }
@Test @Test
public void loadDeepSpeech_stt_noLM() { public void loadSTT_stt_noLM() {
DeepSpeechModel m = new DeepSpeechModel(modelFile); STTModel m = new STTModel(modelFile);
String decoded = doSTT(m, false); String decoded = doSTT(m, false);
assertEquals("she had your dark suit in greasy wash water all year", decoded); assertEquals("she had your dark suit in greasy wash water all year", decoded);
@ -124,8 +124,8 @@ public class BasicTest {
} }
@Test @Test
public void loadDeepSpeech_stt_withLM() { public void loadSTT_stt_withLM() {
DeepSpeechModel m = new DeepSpeechModel(modelFile); STTModel m = new STTModel(modelFile);
m.enableExternalScorer(scorerFile); m.enableExternalScorer(scorerFile);
String decoded = doSTT(m, false); String decoded = doSTT(m, false);
@ -134,8 +134,8 @@ public class BasicTest {
} }
@Test @Test
public void loadDeepSpeech_sttWithMetadata_noLM() { public void loadSTT_sttWithMetadata_noLM() {
DeepSpeechModel m = new DeepSpeechModel(modelFile); STTModel m = new STTModel(modelFile);
String decoded = doSTT(m, true); String decoded = doSTT(m, true);
assertEquals("she had your dark suit in greasy wash water all year", decoded); assertEquals("she had your dark suit in greasy wash water all year", decoded);
@ -143,8 +143,8 @@ public class BasicTest {
} }
@Test @Test
public void loadDeepSpeech_sttWithMetadata_withLM() { public void loadSTT_sttWithMetadata_withLM() {
DeepSpeechModel m = new DeepSpeechModel(modelFile); STTModel m = new STTModel(modelFile);
m.enableExternalScorer(scorerFile); m.enableExternalScorer(scorerFile);
String decoded = doSTT(m, true); String decoded = doSTT(m, true);
@ -153,8 +153,8 @@ public class BasicTest {
} }
@Test @Test
public void loadDeepSpeech_HotWord_withLM() { public void loadSTT_HotWord_withLM() {
DeepSpeechModel m = new DeepSpeechModel(modelFile); STTModel m = new STTModel(modelFile);
m.enableExternalScorer(scorerFile); m.enableExternalScorer(scorerFile);
for(int i = 0; i < word.length; i++) { for(int i = 0; i < word.length; i++) {
@ -168,8 +168,8 @@ public class BasicTest {
} }
@Test @Test
public void loadDeepSpeech_HotWord_noLM() { public void loadSTT_HotWord_noLM() {
DeepSpeechModel m = new DeepSpeechModel(modelFile); STTModel m = new STTModel(modelFile);
try { try {
m.addHotWord(word[0], boost[0]); m.addHotWord(word[0], boost[0]);
assert(false); assert(false);

View File

@ -1,9 +1,9 @@
package ai.coqui.libstt; package ai.coqui.libstt;
/** /**
* @brief Exposes a DeepSpeech model in Java * @brief Exposes a STT model in Java
**/ **/
public class DeepSpeechModel { public class STTModel {
static { static {
System.loadLibrary("stt-jni"); System.loadLibrary("stt-jni");
@ -15,14 +15,14 @@ public class DeepSpeechModel {
private SWIGTYPE_p_ModelState _msp; private SWIGTYPE_p_ModelState _msp;
private void evaluateErrorCode(int errorCode) { private void evaluateErrorCode(int errorCode) {
DeepSpeech_Error_Codes code = DeepSpeech_Error_Codes.swigToEnum(errorCode); STT_Error_Codes code = STT_Error_Codes.swigToEnum(errorCode);
if (code != DeepSpeech_Error_Codes.ERR_OK) { if (code != STT_Error_Codes.ERR_OK) {
throw new RuntimeException("Error: " + impl.ErrorCodeToErrorMessage(errorCode) + " (0x" + Integer.toHexString(errorCode) + ")."); throw new RuntimeException("Error: " + impl.ErrorCodeToErrorMessage(errorCode) + " (0x" + Integer.toHexString(errorCode) + ").");
} }
} }
/** /**
* @brief An object providing an interface to a trained DeepSpeech model. * @brief An object providing an interface to a trained STT model.
* *
* @constructor * @constructor
* *
@ -30,7 +30,7 @@ public class DeepSpeechModel {
* *
* @throws RuntimeException on failure. * @throws RuntimeException on failure.
*/ */
public DeepSpeechModel(String modelPath) { public STTModel(String modelPath) {
this._mspp = impl.new_modelstatep(); this._mspp = impl.new_modelstatep();
evaluateErrorCode(impl.CreateModel(modelPath, this._mspp)); evaluateErrorCode(impl.CreateModel(modelPath, this._mspp));
this._msp = impl.modelstatep_value(this._mspp); this._msp = impl.modelstatep_value(this._mspp);
@ -107,7 +107,7 @@ public class DeepSpeechModel {
} }
/* /*
* @brief Use the DeepSpeech model to perform Speech-To-Text. * @brief Use the STT model to perform Speech-To-Text.
* *
* @param buffer A 16-bit, mono raw audio signal at the appropriate * @param buffer A 16-bit, mono raw audio signal at the appropriate
* sample rate (matching what the model was trained on). * sample rate (matching what the model was trained on).
@ -120,7 +120,7 @@ public class DeepSpeechModel {
} }
/** /**
* @brief Use the DeepSpeech model to perform Speech-To-Text and output metadata * @brief Use the STT model to perform Speech-To-Text and output metadata
* about the results. * about the results.
* *
* @param buffer A 16-bit, mono raw audio signal at the appropriate * @param buffer A 16-bit, mono raw audio signal at the appropriate
@ -144,10 +144,10 @@ public class DeepSpeechModel {
* *
* @throws RuntimeException on failure. * @throws RuntimeException on failure.
*/ */
public DeepSpeechStreamingState createStream() { public STTStreamingState createStream() {
SWIGTYPE_p_p_StreamingState ssp = impl.new_streamingstatep(); SWIGTYPE_p_p_StreamingState ssp = impl.new_streamingstatep();
evaluateErrorCode(impl.CreateStream(this._msp, ssp)); evaluateErrorCode(impl.CreateStream(this._msp, ssp));
return new DeepSpeechStreamingState(impl.streamingstatep_value(ssp)); return new STTStreamingState(impl.streamingstatep_value(ssp));
} }
/** /**
@ -158,7 +158,7 @@ public class DeepSpeechModel {
* appropriate sample rate (matching what the model was trained on). * appropriate sample rate (matching what the model was trained on).
* @param buffer_size The number of samples in @p buffer. * @param buffer_size The number of samples in @p buffer.
*/ */
public void feedAudioContent(DeepSpeechStreamingState ctx, short[] buffer, int buffer_size) { public void feedAudioContent(STTStreamingState ctx, short[] buffer, int buffer_size) {
impl.FeedAudioContent(ctx.get(), buffer, buffer_size); impl.FeedAudioContent(ctx.get(), buffer, buffer_size);
} }
@ -169,7 +169,7 @@ public class DeepSpeechModel {
* *
* @return The STT intermediate result. * @return The STT intermediate result.
*/ */
public String intermediateDecode(DeepSpeechStreamingState ctx) { public String intermediateDecode(STTStreamingState ctx) {
return impl.IntermediateDecode(ctx.get()); return impl.IntermediateDecode(ctx.get());
} }
@ -181,7 +181,7 @@ public class DeepSpeechModel {
* *
* @return The STT intermediate result. * @return The STT intermediate result.
*/ */
public Metadata intermediateDecodeWithMetadata(DeepSpeechStreamingState ctx, int num_results) { public Metadata intermediateDecodeWithMetadata(STTStreamingState ctx, int num_results) {
return impl.IntermediateDecodeWithMetadata(ctx.get(), num_results); return impl.IntermediateDecodeWithMetadata(ctx.get(), num_results);
} }
@ -195,7 +195,7 @@ public class DeepSpeechModel {
* *
* @note This method will free the state pointer (@p ctx). * @note This method will free the state pointer (@p ctx).
*/ */
public String finishStream(DeepSpeechStreamingState ctx) { public String finishStream(STTStreamingState ctx) {
return impl.FinishStream(ctx.get()); return impl.FinishStream(ctx.get());
} }
@ -212,7 +212,7 @@ public class DeepSpeechModel {
* *
* @note This method will free the state pointer (@p ctx). * @note This method will free the state pointer (@p ctx).
*/ */
public Metadata finishStreamWithMetadata(DeepSpeechStreamingState ctx, int num_results) { public Metadata finishStreamWithMetadata(STTStreamingState ctx, int num_results) {
return impl.FinishStreamWithMetadata(ctx.get(), num_results); return impl.FinishStreamWithMetadata(ctx.get(), num_results);
} }
/** /**

View File

@ -1,9 +1,9 @@
package ai.coqui.libstt; package ai.coqui.libstt;
public final class DeepSpeechStreamingState { public final class STTStreamingState {
private SWIGTYPE_p_StreamingState _sp; private SWIGTYPE_p_StreamingState _sp;
public DeepSpeechStreamingState(SWIGTYPE_p_StreamingState sp) { public STTStreamingState(SWIGTYPE_p_StreamingState sp) {
this._sp = sp; this._sp = sp;
} }

View File

@ -1,6 +1,6 @@
/* ---------------------------------------------------------------------------- /* ----------------------------------------------------------------------------
* This file was automatically generated by SWIG (http://www.swig.org). * This file was automatically generated by SWIG (http://www.swig.org).
* Version 4.0.1 * Version 4.0.2
* *
* Do not make changes to this file unless you know what you are doing--modify * Do not make changes to this file unless you know what you are doing--modify
* the SWIG interface file instead. * the SWIG interface file instead.

View File

@ -1,6 +1,6 @@
/* ---------------------------------------------------------------------------- /* ----------------------------------------------------------------------------
* This file was automatically generated by SWIG (http://www.swig.org). * This file was automatically generated by SWIG (http://www.swig.org).
* Version 4.0.1 * Version 4.0.2
* *
* Do not make changes to this file unless you know what you are doing--modify * Do not make changes to this file unless you know what you are doing--modify
* the SWIG interface file instead. * the SWIG interface file instead.

View File

@ -1,6 +1,6 @@
/* ---------------------------------------------------------------------------- /* ----------------------------------------------------------------------------
* This file was automatically generated by SWIG (http://www.swig.org). * This file was automatically generated by SWIG (http://www.swig.org).
* Version 4.0.1 * Version 4.0.2
* *
* Do not make changes to this file unless you know what you are doing--modify * Do not make changes to this file unless you know what you are doing--modify
* the SWIG interface file instead. * the SWIG interface file instead.
@ -8,7 +8,7 @@
package ai.coqui.libstt; package ai.coqui.libstt;
public enum DeepSpeech_Error_Codes { public enum STT_Error_Codes {
ERR_OK(0x0000), ERR_OK(0x0000),
ERR_NO_MODEL(0x1000), ERR_NO_MODEL(0x1000),
ERR_INVALID_ALPHABET(0x2000), ERR_INVALID_ALPHABET(0x2000),
@ -37,29 +37,29 @@ public enum DeepSpeech_Error_Codes {
return swigValue; return swigValue;
} }
public static DeepSpeech_Error_Codes swigToEnum(int swigValue) { public static STT_Error_Codes swigToEnum(int swigValue) {
DeepSpeech_Error_Codes[] swigValues = DeepSpeech_Error_Codes.class.getEnumConstants(); STT_Error_Codes[] swigValues = STT_Error_Codes.class.getEnumConstants();
if (swigValue < swigValues.length && swigValue >= 0 && swigValues[swigValue].swigValue == swigValue) if (swigValue < swigValues.length && swigValue >= 0 && swigValues[swigValue].swigValue == swigValue)
return swigValues[swigValue]; return swigValues[swigValue];
for (DeepSpeech_Error_Codes swigEnum : swigValues) for (STT_Error_Codes swigEnum : swigValues)
if (swigEnum.swigValue == swigValue) if (swigEnum.swigValue == swigValue)
return swigEnum; return swigEnum;
throw new IllegalArgumentException("No enum " + DeepSpeech_Error_Codes.class + " with value " + swigValue); throw new IllegalArgumentException("No enum " + STT_Error_Codes.class + " with value " + swigValue);
} }
@SuppressWarnings("unused") @SuppressWarnings("unused")
private DeepSpeech_Error_Codes() { private STT_Error_Codes() {
this.swigValue = SwigNext.next++; this.swigValue = SwigNext.next++;
} }
@SuppressWarnings("unused") @SuppressWarnings("unused")
private DeepSpeech_Error_Codes(int swigValue) { private STT_Error_Codes(int swigValue) {
this.swigValue = swigValue; this.swigValue = swigValue;
SwigNext.next = swigValue+1; SwigNext.next = swigValue+1;
} }
@SuppressWarnings("unused") @SuppressWarnings("unused")
private DeepSpeech_Error_Codes(DeepSpeech_Error_Codes swigEnum) { private STT_Error_Codes(STT_Error_Codes swigEnum) {
this.swigValue = swigEnum.swigValue; this.swigValue = swigEnum.swigValue;
SwigNext.next = this.swigValue+1; SwigNext.next = this.swigValue+1;
} }

View File

@ -1,6 +1,6 @@
/* ---------------------------------------------------------------------------- /* ----------------------------------------------------------------------------
* This file was automatically generated by SWIG (http://www.swig.org). * This file was automatically generated by SWIG (http://www.swig.org).
* Version 4.0.1 * Version 4.0.2
* *
* Do not make changes to this file unless you know what you are doing--modify * Do not make changes to this file unless you know what you are doing--modify
* the SWIG interface file instead. * the SWIG interface file instead.

View File

@ -26,18 +26,18 @@ using namespace node;
$2 = ($2_ltype)(bufferLength / 2); $2 = ($2_ltype)(bufferLength / 2);
} }
// apply to DS_FeedAudioContent and DS_SpeechToText // apply to STT_FeedAudioContent and STT_SpeechToText
%apply (short* IN_ARRAY1, int DIM1) {(const short* aBuffer, unsigned int aBufferSize)}; %apply (short* IN_ARRAY1, int DIM1) {(const short* aBuffer, unsigned int aBufferSize)};
// make sure the string returned by SpeechToText is freed // make sure the string returned by SpeechToText is freed
%typemap(newfree) char* "DS_FreeString($1);"; %typemap(newfree) char* "STT_FreeString($1);";
%newobject DS_SpeechToText; %newobject STT_SpeechToText;
%newobject DS_IntermediateDecode; %newobject STT_IntermediateDecode;
%newobject DS_FinishStream; %newobject STT_FinishStream;
%newobject DS_Version; %newobject STT_Version;
%newobject DS_ErrorCodeToErrorMessage; %newobject STT_ErrorCodeToErrorMessage;
// convert double pointer retval in CreateModel to an output // convert double pointer retval in CreateModel to an output
%typemap(in, numinputs=0) ModelState **retval (ModelState *ret) { %typemap(in, numinputs=0) ModelState **retval (ModelState *ret) {
@ -62,7 +62,7 @@ using namespace node;
%typemap(argout) StreamingState **retval { %typemap(argout) StreamingState **retval {
$result = SWIGV8_ARRAY_NEW(); $result = SWIGV8_ARRAY_NEW();
SWIGV8_AppendOutput($result, SWIG_From_int(result)); SWIGV8_AppendOutput($result, SWIG_From_int(result));
// not owned, DS_FinishStream deallocates StreamingState // not owned, STT_FinishStream deallocates StreamingState
%append_output(SWIG_NewPointerObj(%as_voidptr(*$1), $*1_descriptor, 0)); %append_output(SWIG_NewPointerObj(%as_voidptr(*$1), $*1_descriptor, 0));
} }
@ -93,6 +93,6 @@ using namespace node;
%nodefaultctor TokenMetadata; %nodefaultctor TokenMetadata;
%nodefaultdtor TokenMetadata; %nodefaultdtor TokenMetadata;
%rename ("%(strip:[DS_])s") ""; %rename ("%(strip:[STT_])s") "";
%include "../coqui-stt.h" %include "../coqui-stt.h"
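The %newobject and %typemap(newfree) directives above tell SWIG that the strings returned by STT_SpeechToText, STT_IntermediateDecode, STT_FinishStream, STT_Version and STT_ErrorCodeToErrorMessage are owned by the caller and released with STT_FreeString. The same ownership rule applies when the renamed C API is called directly; a minimal C++ sketch (feed_and_report is a hypothetical helper, and the stream and audio chunk are assumed to exist):

    #include <cstdio>
    #include "coqui-stt.h"

    // Feed one chunk of 16-bit mono audio and print the current partial result.
    void feed_and_report(StreamingState* stream, const short* chunk, unsigned int n_samples) {
        STT_FeedAudioContent(stream, chunk, n_samples);
        char* partial = STT_IntermediateDecode(stream); // caller owns the returned string
        printf("partial: %s\n", partial);
        STT_FreeString(partial);                        // matches the %newobject rule above
    }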

View File

@ -26,7 +26,7 @@ ModelState::~ModelState()
int int
ModelState::init(const char* model_path) ModelState::init(const char* model_path)
{ {
return DS_ERR_OK; return STT_ERR_OK;
} }
char* char*

View File

@ -71,7 +71,7 @@ struct ModelState {
* *
* @return A Metadata struct containing CandidateTranscript structs. * @return A Metadata struct containing CandidateTranscript structs.
 * Each represents a candidate transcript, with the first ranked most probable. * Each represents a candidate transcript, with the first ranked most probable.
* The user is responsible for freeing Result by calling DS_FreeMetadata(). * The user is responsible for freeing Result by calling STT_FreeMetadata().
*/ */
virtual Metadata* decode_metadata(const DecoderState& state, virtual Metadata* decode_metadata(const DecoderState& state,
size_t num_results); size_t num_results);
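As the comment above states, the Metadata returned by decode_metadata() (and by the STT_...WithMetadata entry points) is owned by the caller and must be released with STT_FreeMetadata(). A hedged C++ sketch, assuming the Metadata / CandidateTranscript fields used elsewhere in this commit (transcripts, num_transcripts, confidence) and a hypothetical print_best_confidence helper:

    #include <cstdio>
    #include "coqui-stt.h"

    // Print the confidence of the most probable transcript, then free the result.
    void print_best_confidence(ModelState* model, const short* audio, unsigned int n_samples) {
        Metadata* result = STT_SpeechToTextWithMetadata(model, audio, n_samples, 1);
        if (result != nullptr && result->num_transcripts > 0) {
            printf("confidence: %f\n", result->transcripts[0].confidence);
        }
        STT_FreeMetadata(result); // null-safe; the caller is responsible for freeing
    }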

View File

@ -10,7 +10,7 @@
import_array(); import_array();
%} %}
// apply NumPy conversion typemap to DS_FeedAudioContent and DS_SpeechToText // apply NumPy conversion typemap to STT_FeedAudioContent and STT_SpeechToText
%apply (short* IN_ARRAY1, int DIM1) {(const short* aBuffer, unsigned int aBufferSize)}; %apply (short* IN_ARRAY1, int DIM1) {(const short* aBuffer, unsigned int aBufferSize)};
%typemap(in, numinputs=0) ModelState **retval (ModelState *ret) { %typemap(in, numinputs=0) ModelState **retval (ModelState *ret) {
@ -19,7 +19,7 @@ import_array();
} }
%typemap(argout) ModelState **retval { %typemap(argout) ModelState **retval {
// not owned, Python wrapper in __init__.py calls DS_FreeModel // not owned, Python wrapper in __init__.py calls STT_FreeModel
%append_output(SWIG_NewPointerObj(%as_voidptr(*$1), $*1_descriptor, 0)); %append_output(SWIG_NewPointerObj(%as_voidptr(*$1), $*1_descriptor, 0));
} }
@ -29,7 +29,7 @@ import_array();
} }
%typemap(argout) StreamingState **retval { %typemap(argout) StreamingState **retval {
// not owned, DS_FinishStream deallocates StreamingState // not owned, STT_FinishStream deallocates StreamingState
%append_output(SWIG_NewPointerObj(%as_voidptr(*$1), $*1_descriptor, 0)); %append_output(SWIG_NewPointerObj(%as_voidptr(*$1), $*1_descriptor, 0));
} }
@ -104,7 +104,7 @@ static PyObject *parent_reference() {
%extend struct Metadata { %extend struct Metadata {
~Metadata() { ~Metadata() {
DS_FreeMetadata($self); STT_FreeMetadata($self);
} }
} }
@ -115,14 +115,14 @@ static PyObject *parent_reference() {
%nodefaultctor TokenMetadata; %nodefaultctor TokenMetadata;
%nodefaultdtor TokenMetadata; %nodefaultdtor TokenMetadata;
%typemap(newfree) char* "DS_FreeString($1);"; %typemap(newfree) char* "STT_FreeString($1);";
%newobject DS_SpeechToText; %newobject STT_SpeechToText;
%newobject DS_IntermediateDecode; %newobject STT_IntermediateDecode;
%newobject DS_FinishStream; %newobject STT_FinishStream;
%newobject DS_Version; %newobject STT_Version;
%newobject DS_ErrorCodeToErrorMessage; %newobject STT_ErrorCodeToErrorMessage;
%rename ("%(strip:[DS_])s") ""; %rename ("%(strip:[STT_])s") "";
%include "../coqui-stt.h" %include "../coqui-stt.h"

View File

@ -263,7 +263,7 @@ StreamingState::processBatch(const vector<float>& buf, unsigned int n_steps)
} }
int int
DS_CreateModel(const char* aModelPath, STT_CreateModel(const char* aModelPath,
ModelState** retval) ModelState** retval)
{ {
*retval = nullptr; *retval = nullptr;
@ -279,7 +279,7 @@ DS_CreateModel(const char* aModelPath,
if (!aModelPath || strlen(aModelPath) < 1) { if (!aModelPath || strlen(aModelPath) < 1) {
std::cerr << "No model specified, cannot continue." << std::endl; std::cerr << "No model specified, cannot continue." << std::endl;
return DS_ERR_NO_MODEL; return STT_ERR_NO_MODEL;
} }
std::unique_ptr<ModelState> model( std::unique_ptr<ModelState> model(
@ -292,58 +292,58 @@ DS_CreateModel(const char* aModelPath,
if (!model) { if (!model) {
std::cerr << "Could not allocate model state." << std::endl; std::cerr << "Could not allocate model state." << std::endl;
return DS_ERR_FAIL_CREATE_MODEL; return STT_ERR_FAIL_CREATE_MODEL;
} }
int err = model->init(aModelPath); int err = model->init(aModelPath);
if (err != DS_ERR_OK) { if (err != STT_ERR_OK) {
return err; return err;
} }
*retval = model.release(); *retval = model.release();
return DS_ERR_OK; return STT_ERR_OK;
} }
unsigned int unsigned int
DS_GetModelBeamWidth(const ModelState* aCtx) STT_GetModelBeamWidth(const ModelState* aCtx)
{ {
return aCtx->beam_width_; return aCtx->beam_width_;
} }
int int
DS_SetModelBeamWidth(ModelState* aCtx, unsigned int aBeamWidth) STT_SetModelBeamWidth(ModelState* aCtx, unsigned int aBeamWidth)
{ {
aCtx->beam_width_ = aBeamWidth; aCtx->beam_width_ = aBeamWidth;
return 0; return 0;
} }
int int
DS_GetModelSampleRate(const ModelState* aCtx) STT_GetModelSampleRate(const ModelState* aCtx)
{ {
return aCtx->sample_rate_; return aCtx->sample_rate_;
} }
void void
DS_FreeModel(ModelState* ctx) STT_FreeModel(ModelState* ctx)
{ {
delete ctx; delete ctx;
} }
int int
DS_EnableExternalScorer(ModelState* aCtx, STT_EnableExternalScorer(ModelState* aCtx,
const char* aScorerPath) const char* aScorerPath)
{ {
std::unique_ptr<Scorer> scorer(new Scorer()); std::unique_ptr<Scorer> scorer(new Scorer());
int err = scorer->init(aScorerPath, aCtx->alphabet_); int err = scorer->init(aScorerPath, aCtx->alphabet_);
if (err != 0) { if (err != 0) {
return DS_ERR_INVALID_SCORER; return STT_ERR_INVALID_SCORER;
} }
aCtx->scorer_ = std::move(scorer); aCtx->scorer_ = std::move(scorer);
return DS_ERR_OK; return STT_ERR_OK;
} }
int int
DS_AddHotWord(ModelState* aCtx, STT_AddHotWord(ModelState* aCtx,
const char* word, const char* word,
float boost) float boost)
{ {
@ -352,15 +352,15 @@ DS_AddHotWord(ModelState* aCtx,
aCtx->hot_words_.insert( std::pair<std::string,float> (word, boost) ); aCtx->hot_words_.insert( std::pair<std::string,float> (word, boost) );
const int size_after = aCtx->hot_words_.size(); const int size_after = aCtx->hot_words_.size();
if (size_before == size_after) { if (size_before == size_after) {
return DS_ERR_FAIL_INSERT_HOTWORD; return STT_ERR_FAIL_INSERT_HOTWORD;
} }
return DS_ERR_OK; return STT_ERR_OK;
} }
return DS_ERR_SCORER_NOT_ENABLED; return STT_ERR_SCORER_NOT_ENABLED;
} }
int int
DS_EraseHotWord(ModelState* aCtx, STT_EraseHotWord(ModelState* aCtx,
const char* word) const char* word)
{ {
if (aCtx->scorer_) { if (aCtx->scorer_) {
@ -368,50 +368,50 @@ DS_EraseHotWord(ModelState* aCtx,
int err = aCtx->hot_words_.erase(word); int err = aCtx->hot_words_.erase(word);
const int size_after = aCtx->hot_words_.size(); const int size_after = aCtx->hot_words_.size();
if (size_before == size_after) { if (size_before == size_after) {
return DS_ERR_FAIL_ERASE_HOTWORD; return STT_ERR_FAIL_ERASE_HOTWORD;
} }
return DS_ERR_OK; return STT_ERR_OK;
} }
return DS_ERR_SCORER_NOT_ENABLED; return STT_ERR_SCORER_NOT_ENABLED;
} }
int int
DS_ClearHotWords(ModelState* aCtx) STT_ClearHotWords(ModelState* aCtx)
{ {
if (aCtx->scorer_) { if (aCtx->scorer_) {
aCtx->hot_words_.clear(); aCtx->hot_words_.clear();
const int size_after = aCtx->hot_words_.size(); const int size_after = aCtx->hot_words_.size();
if (size_after != 0) { if (size_after != 0) {
return DS_ERR_FAIL_CLEAR_HOTWORD; return STT_ERR_FAIL_CLEAR_HOTWORD;
} }
return DS_ERR_OK; return STT_ERR_OK;
} }
return DS_ERR_SCORER_NOT_ENABLED; return STT_ERR_SCORER_NOT_ENABLED;
} }
int int
DS_DisableExternalScorer(ModelState* aCtx) STT_DisableExternalScorer(ModelState* aCtx)
{ {
if (aCtx->scorer_) { if (aCtx->scorer_) {
aCtx->scorer_.reset(); aCtx->scorer_.reset();
return DS_ERR_OK; return STT_ERR_OK;
} }
return DS_ERR_SCORER_NOT_ENABLED; return STT_ERR_SCORER_NOT_ENABLED;
} }
int DS_SetScorerAlphaBeta(ModelState* aCtx, int STT_SetScorerAlphaBeta(ModelState* aCtx,
float aAlpha, float aAlpha,
float aBeta) float aBeta)
{ {
if (aCtx->scorer_) { if (aCtx->scorer_) {
aCtx->scorer_->reset_params(aAlpha, aBeta); aCtx->scorer_->reset_params(aAlpha, aBeta);
return DS_ERR_OK; return STT_ERR_OK;
} }
return DS_ERR_SCORER_NOT_ENABLED; return STT_ERR_SCORER_NOT_ENABLED;
} }
int int
DS_CreateStream(ModelState* aCtx, STT_CreateStream(ModelState* aCtx,
StreamingState** retval) StreamingState** retval)
{ {
*retval = nullptr; *retval = nullptr;
@ -419,7 +419,7 @@ DS_CreateStream(ModelState* aCtx,
std::unique_ptr<StreamingState> ctx(new StreamingState()); std::unique_ptr<StreamingState> ctx(new StreamingState());
if (!ctx) { if (!ctx) {
std::cerr << "Could not allocate streaming state." << std::endl; std::cerr << "Could not allocate streaming state." << std::endl;
return DS_ERR_FAIL_CREATE_STREAM; return STT_ERR_FAIL_CREATE_STREAM;
} }
ctx->audio_buffer_.reserve(aCtx->audio_win_len_); ctx->audio_buffer_.reserve(aCtx->audio_win_len_);
@ -441,11 +441,11 @@ DS_CreateStream(ModelState* aCtx,
aCtx->hot_words_); aCtx->hot_words_);
*retval = ctx.release(); *retval = ctx.release();
return DS_ERR_OK; return STT_ERR_OK;
} }
void void
DS_FeedAudioContent(StreamingState* aSctx, STT_FeedAudioContent(StreamingState* aSctx,
const short* aBuffer, const short* aBuffer,
unsigned int aBufferSize) unsigned int aBufferSize)
{ {
@ -453,32 +453,32 @@ DS_FeedAudioContent(StreamingState* aSctx,
} }
char* char*
DS_IntermediateDecode(const StreamingState* aSctx) STT_IntermediateDecode(const StreamingState* aSctx)
{ {
return aSctx->intermediateDecode(); return aSctx->intermediateDecode();
} }
Metadata* Metadata*
DS_IntermediateDecodeWithMetadata(const StreamingState* aSctx, STT_IntermediateDecodeWithMetadata(const StreamingState* aSctx,
unsigned int aNumResults) unsigned int aNumResults)
{ {
return aSctx->intermediateDecodeWithMetadata(aNumResults); return aSctx->intermediateDecodeWithMetadata(aNumResults);
} }
char* char*
DS_FinishStream(StreamingState* aSctx) STT_FinishStream(StreamingState* aSctx)
{ {
char* str = aSctx->finishStream(); char* str = aSctx->finishStream();
DS_FreeStream(aSctx); STT_FreeStream(aSctx);
return str; return str;
} }
Metadata* Metadata*
DS_FinishStreamWithMetadata(StreamingState* aSctx, STT_FinishStreamWithMetadata(StreamingState* aSctx,
unsigned int aNumResults) unsigned int aNumResults)
{ {
Metadata* result = aSctx->finishStreamWithMetadata(aNumResults); Metadata* result = aSctx->finishStreamWithMetadata(aNumResults);
DS_FreeStream(aSctx); STT_FreeStream(aSctx);
return result; return result;
} }
@ -488,41 +488,41 @@ CreateStreamAndFeedAudioContent(ModelState* aCtx,
unsigned int aBufferSize) unsigned int aBufferSize)
{ {
StreamingState* ctx; StreamingState* ctx;
int status = DS_CreateStream(aCtx, &ctx); int status = STT_CreateStream(aCtx, &ctx);
if (status != DS_ERR_OK) { if (status != STT_ERR_OK) {
return nullptr; return nullptr;
} }
DS_FeedAudioContent(ctx, aBuffer, aBufferSize); STT_FeedAudioContent(ctx, aBuffer, aBufferSize);
return ctx; return ctx;
} }
char* char*
DS_SpeechToText(ModelState* aCtx, STT_SpeechToText(ModelState* aCtx,
const short* aBuffer, const short* aBuffer,
unsigned int aBufferSize) unsigned int aBufferSize)
{ {
StreamingState* ctx = CreateStreamAndFeedAudioContent(aCtx, aBuffer, aBufferSize); StreamingState* ctx = CreateStreamAndFeedAudioContent(aCtx, aBuffer, aBufferSize);
return DS_FinishStream(ctx); return STT_FinishStream(ctx);
} }
Metadata* Metadata*
DS_SpeechToTextWithMetadata(ModelState* aCtx, STT_SpeechToTextWithMetadata(ModelState* aCtx,
const short* aBuffer, const short* aBuffer,
unsigned int aBufferSize, unsigned int aBufferSize,
unsigned int aNumResults) unsigned int aNumResults)
{ {
StreamingState* ctx = CreateStreamAndFeedAudioContent(aCtx, aBuffer, aBufferSize); StreamingState* ctx = CreateStreamAndFeedAudioContent(aCtx, aBuffer, aBufferSize);
return DS_FinishStreamWithMetadata(ctx, aNumResults); return STT_FinishStreamWithMetadata(ctx, aNumResults);
} }
void void
DS_FreeStream(StreamingState* aSctx) STT_FreeStream(StreamingState* aSctx)
{ {
delete aSctx; delete aSctx;
} }
void void
DS_FreeMetadata(Metadata* m) STT_FreeMetadata(Metadata* m)
{ {
if (m) { if (m) {
for (int i = 0; i < m->num_transcripts; ++i) { for (int i = 0; i < m->num_transcripts; ++i) {
@ -539,13 +539,13 @@ DS_FreeMetadata(Metadata* m)
} }
void void
DS_FreeString(char* str) STT_FreeString(char* str)
{ {
free(str); free(str);
} }
char* char*
DS_Version() STT_Version()
{ {
return strdup(ds_version()); return strdup(ds_version());
} }
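Taken together, the renamed entry points above cover the whole model lifecycle. A minimal end-to-end C++ sketch, assuming a model file, an optional external scorer and a 16-bit mono audio buffer at the model sample rate ("model.tflite", "kenlm.scorer" and transcribe are placeholders):

    #include <cstdio>
    #include "coqui-stt.h"

    int transcribe(const short* audio, unsigned int n_samples) {
        ModelState* model = nullptr;
        if (STT_CreateModel("model.tflite", &model) != STT_ERR_OK) {
            return 1;
        }
        STT_EnableExternalScorer(model, "kenlm.scorer"); // optional: hot-words need a scorer
        STT_AddHotWord(model, "coqui", 10.0f);           // optional hot-word boost

        StreamingState* stream = nullptr;
        if (STT_CreateStream(model, &stream) == STT_ERR_OK) {
            STT_FeedAudioContent(stream, audio, n_samples);
            char* text = STT_FinishStream(stream);       // also frees the stream
            printf("%s\n", text);
            STT_FreeString(text);
        }
        STT_FreeModel(model);
        return 0;
    }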

View File

@ -2,7 +2,7 @@
#include <string.h> #include <string.h>
char* char*
DS_ErrorCodeToErrorMessage(int aErrorCode) STT_ErrorCodeToErrorMessage(int aErrorCode)
{ {
#define RETURN_MESSAGE(NAME, VALUE, DESC) \ #define RETURN_MESSAGE(NAME, VALUE, DESC) \
case NAME: \ case NAME: \
@ -10,7 +10,7 @@ DS_ErrorCodeToErrorMessage(int aErrorCode)
switch(aErrorCode) switch(aErrorCode)
{ {
DS_FOR_EACH_ERROR(RETURN_MESSAGE) STT_FOR_EACH_ERROR(RETURN_MESSAGE)
default: default:
return strdup("Unknown error, please make sure you are using the correct native binary."); return strdup("Unknown error, please make sure you are using the correct native binary.");
} }
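Any non-zero status from the calls above can be turned into a readable message with STT_ErrorCodeToErrorMessage; the returned string is allocated with strdup and must be freed. A short sketch (report_create_error and the model path are placeholders):

    #include <cstdio>
    #include "coqui-stt.h"

    // Report a failed model load in human-readable form.
    void report_create_error(const char* model_path) {
        ModelState* model = nullptr;
        int status = STT_CreateModel(model_path, &model);
        if (status != STT_ERR_OK) {
            char* msg = STT_ErrorCodeToErrorMessage(status);
            fprintf(stderr, "STT_CreateModel failed: %s\n", msg);
            STT_FreeString(msg); // API strings are released with STT_FreeString
        } else {
            STT_FreeModel(model);
        }
    }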

View File

@ -1,12 +0,0 @@
framework module deepspeech_ios {
umbrella header "deepspeech_ios.h"
export *
module * { export * }
explicit module libdeepspeech_Private {
header "coqui-stt.h"
export *
link "deepspeech"
}
}

View File

@ -13,7 +13,7 @@ Pod::Spec.new do |s|
s.source = { :git => "https://github.com/coqui-ai/STT.git", :tag => "v#{s.version}" } s.source = { :git => "https://github.com/coqui-ai/STT.git", :tag => "v#{s.version}" }
# Assuming taskcluster build location. Depending on your Xcode setup, this might be in # Assuming taskcluster build location. Depending on your Xcode setup, this might be in
# build/Release-iphoneos/deepspeech_ios.framework instead. # build/Release-iphoneos/stt_ios.framework instead.
s.vendored_frameworks = "native_client/swift/DerivedData/Build/Products/Release-iphoneos/deepspeech_ios.framework" s.vendored_frameworks = "native_client/swift/DerivedData/Build/Products/Release-iphoneos/stt_ios.framework"
s.source_files = "native_client/swift/deepspeech_ios/**/*.{h,m,mm,swift}" s.source_files = "native_client/swift/stt_ios/**/*.{h,m,mm,swift}"
end end

View File

@ -7,11 +7,11 @@
objects = { objects = {
/* Begin PBXBuildFile section */ /* Begin PBXBuildFile section */
505B136B24960D550007DADA /* deepspeech_ios.framework in Frameworks */ = {isa = PBXBuildFile; fileRef = 505B136124960D550007DADA /* deepspeech_ios.framework */; }; 505B136B24960D550007DADA /* stt_ios.framework in Frameworks */ = {isa = PBXBuildFile; fileRef = 505B136124960D550007DADA /* stt_ios.framework */; };
505B137224960D550007DADA /* deepspeech_ios.h in Headers */ = {isa = PBXBuildFile; fileRef = 505B136424960D550007DADA /* deepspeech_ios.h */; settings = {ATTRIBUTES = (Public, ); }; }; 505B137224960D550007DADA /* stt_ios.h in Headers */ = {isa = PBXBuildFile; fileRef = 505B136424960D550007DADA /* stt_ios.h */; settings = {ATTRIBUTES = (Public, ); }; };
505B137D24961AF20007DADA /* coqui-stt.h in Headers */ = {isa = PBXBuildFile; fileRef = 505B137C24961AF20007DADA /* coqui-stt.h */; settings = {ATTRIBUTES = (Private, ); }; }; 505B137D24961AF20007DADA /* coqui-stt.h in Headers */ = {isa = PBXBuildFile; fileRef = 505B137C24961AF20007DADA /* coqui-stt.h */; settings = {ATTRIBUTES = (Private, ); }; };
505B137F24961BA70007DADA /* DeepSpeech.swift in Sources */ = {isa = PBXBuildFile; fileRef = 505B137E24961BA70007DADA /* DeepSpeech.swift */; }; 505B137F24961BA70007DADA /* STT.swift in Sources */ = {isa = PBXBuildFile; fileRef = 505B137E24961BA70007DADA /* STT.swift */; };
AD2FD0F925678F8800314F2E /* deepspeech_ios.framework in Frameworks */ = {isa = PBXBuildFile; fileRef = AD2FD0F825678F8800314F2E /* deepspeech_ios.framework */; }; AD2FD0F925678F8800314F2E /* stt_ios.framework in Frameworks */ = {isa = PBXBuildFile; fileRef = AD2FD0F825678F8800314F2E /* stt_ios.framework */; };
/* End PBXBuildFile section */ /* End PBXBuildFile section */
/* Begin PBXContainerItemProxy section */ /* Begin PBXContainerItemProxy section */
@ -20,7 +20,7 @@
containerPortal = 505B135824960D550007DADA /* Project object */; containerPortal = 505B135824960D550007DADA /* Project object */;
proxyType = 1; proxyType = 1;
remoteGlobalIDString = 505B136024960D550007DADA; remoteGlobalIDString = 505B136024960D550007DADA;
remoteInfo = deepspeech_ios; remoteInfo = stt_ios;
}; };
/* End PBXContainerItemProxy section */ /* End PBXContainerItemProxy section */
@ -38,14 +38,14 @@
/* End PBXCopyFilesBuildPhase section */ /* End PBXCopyFilesBuildPhase section */
/* Begin PBXFileReference section */ /* Begin PBXFileReference section */
505B136124960D550007DADA /* deepspeech_ios.framework */ = {isa = PBXFileReference; explicitFileType = wrapper.framework; includeInIndex = 0; path = deepspeech_ios.framework; sourceTree = BUILT_PRODUCTS_DIR; }; 505B136124960D550007DADA /* stt_ios.framework */ = {isa = PBXFileReference; explicitFileType = wrapper.framework; includeInIndex = 0; path = stt_ios.framework; sourceTree = BUILT_PRODUCTS_DIR; };
505B136424960D550007DADA /* deepspeech_ios.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = deepspeech_ios.h; sourceTree = "<group>"; }; 505B136424960D550007DADA /* stt_ios.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = stt_ios.h; sourceTree = "<group>"; };
505B136524960D550007DADA /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist.xml; path = Info.plist; sourceTree = "<group>"; }; 505B136524960D550007DADA /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist.xml; path = Info.plist; sourceTree = "<group>"; };
505B136A24960D550007DADA /* deepspeech_iosTests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = deepspeech_iosTests.xctest; sourceTree = BUILT_PRODUCTS_DIR; }; 505B136A24960D550007DADA /* stt_iosTests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = stt_iosTests.xctest; sourceTree = BUILT_PRODUCTS_DIR; };
505B137B249619C90007DADA /* deepspeech_ios.modulemap */ = {isa = PBXFileReference; lastKnownFileType = "sourcecode.module-map"; path = deepspeech_ios.modulemap; sourceTree = "<group>"; }; 505B137B249619C90007DADA /* stt_ios.modulemap */ = {isa = PBXFileReference; lastKnownFileType = "sourcecode.module-map"; path = stt_ios.modulemap; sourceTree = "<group>"; };
505B137C24961AF20007DADA /* coqui-stt.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; name = coqui-stt.h; path = ../../coqui-stt.h; sourceTree = "<group>"; }; 505B137C24961AF20007DADA /* coqui-stt.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; name = coqui-stt.h; path = ../../coqui-stt.h; sourceTree = "<group>"; };
505B137E24961BA70007DADA /* DeepSpeech.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = DeepSpeech.swift; sourceTree = "<group>"; }; 505B137E24961BA70007DADA /* STT.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = STT.swift; sourceTree = "<group>"; };
AD2FD0F825678F8800314F2E /* deepspeech_ios.framework */ = {isa = PBXFileReference; lastKnownFileType = wrapper.framework; path = deepspeech_ios.framework; sourceTree = "<group>"; }; AD2FD0F825678F8800314F2E /* stt_ios.framework */ = {isa = PBXFileReference; lastKnownFileType = wrapper.framework; path = stt_ios.framework; sourceTree = "<group>"; };
/* End PBXFileReference section */ /* End PBXFileReference section */
/* Begin PBXFrameworksBuildPhase section */ /* Begin PBXFrameworksBuildPhase section */
@ -53,7 +53,7 @@
isa = PBXFrameworksBuildPhase; isa = PBXFrameworksBuildPhase;
buildActionMask = 2147483647; buildActionMask = 2147483647;
files = ( files = (
AD2FD0F925678F8800314F2E /* deepspeech_ios.framework in Frameworks */, AD2FD0F925678F8800314F2E /* stt_ios.framework in Frameworks */,
); );
runOnlyForDeploymentPostprocessing = 0; runOnlyForDeploymentPostprocessing = 0;
}; };
@ -61,7 +61,7 @@
isa = PBXFrameworksBuildPhase; isa = PBXFrameworksBuildPhase;
buildActionMask = 2147483647; buildActionMask = 2147483647;
files = ( files = (
505B136B24960D550007DADA /* deepspeech_ios.framework in Frameworks */, 505B136B24960D550007DADA /* stt_ios.framework in Frameworks */,
); );
runOnlyForDeploymentPostprocessing = 0; runOnlyForDeploymentPostprocessing = 0;
}; };
@ -71,7 +71,7 @@
505B135724960D550007DADA = { 505B135724960D550007DADA = {
isa = PBXGroup; isa = PBXGroup;
children = ( children = (
505B136324960D550007DADA /* deepspeech_ios */, 505B136324960D550007DADA /* stt_ios */,
505B136224960D550007DADA /* Products */, 505B136224960D550007DADA /* Products */,
505B1380249620C60007DADA /* Frameworks */, 505B1380249620C60007DADA /* Frameworks */,
); );
@ -80,28 +80,28 @@
505B136224960D550007DADA /* Products */ = { 505B136224960D550007DADA /* Products */ = {
isa = PBXGroup; isa = PBXGroup;
children = ( children = (
505B136124960D550007DADA /* deepspeech_ios.framework */, 505B136124960D550007DADA /* stt_ios.framework */,
505B136A24960D550007DADA /* deepspeech_iosTests.xctest */, 505B136A24960D550007DADA /* stt_iosTests.xctest */,
); );
name = Products; name = Products;
sourceTree = "<group>"; sourceTree = "<group>";
}; };
505B136324960D550007DADA /* deepspeech_ios */ = { 505B136324960D550007DADA /* stt_ios */ = {
isa = PBXGroup; isa = PBXGroup;
children = ( children = (
505B137C24961AF20007DADA /* coqui-stt.h */, 505B137C24961AF20007DADA /* coqui-stt.h */,
505B136424960D550007DADA /* deepspeech_ios.h */, 505B136424960D550007DADA /* stt_ios.h */,
505B137E24961BA70007DADA /* DeepSpeech.swift */, 505B137E24961BA70007DADA /* STT.swift */,
505B137B249619C90007DADA /* deepspeech_ios.modulemap */, 505B137B249619C90007DADA /* stt_ios.modulemap */,
505B136524960D550007DADA /* Info.plist */, 505B136524960D550007DADA /* Info.plist */,
); );
path = deepspeech_ios; path = stt_ios;
sourceTree = "<group>"; sourceTree = "<group>";
}; };
505B1380249620C60007DADA /* Frameworks */ = { 505B1380249620C60007DADA /* Frameworks */ = {
isa = PBXGroup; isa = PBXGroup;
children = ( children = (
AD2FD0F825678F8800314F2E /* deepspeech_ios.framework */, AD2FD0F825678F8800314F2E /* stt_ios.framework */,
); );
name = Frameworks; name = Frameworks;
sourceTree = "<group>"; sourceTree = "<group>";
@ -113,7 +113,7 @@
isa = PBXHeadersBuildPhase; isa = PBXHeadersBuildPhase;
buildActionMask = 2147483647; buildActionMask = 2147483647;
files = ( files = (
505B137224960D550007DADA /* deepspeech_ios.h in Headers */, 505B137224960D550007DADA /* stt_ios.h in Headers */,
505B137D24961AF20007DADA /* coqui-stt.h in Headers */, 505B137D24961AF20007DADA /* coqui-stt.h in Headers */,
); );
runOnlyForDeploymentPostprocessing = 0; runOnlyForDeploymentPostprocessing = 0;
@ -121,9 +121,9 @@
/* End PBXHeadersBuildPhase section */ /* End PBXHeadersBuildPhase section */
/* Begin PBXNativeTarget section */ /* Begin PBXNativeTarget section */
505B136024960D550007DADA /* deepspeech_ios */ = { 505B136024960D550007DADA /* stt_ios */ = {
isa = PBXNativeTarget; isa = PBXNativeTarget;
buildConfigurationList = 505B137524960D550007DADA /* Build configuration list for PBXNativeTarget "deepspeech_ios" */; buildConfigurationList = 505B137524960D550007DADA /* Build configuration list for PBXNativeTarget "stt_ios" */;
buildPhases = ( buildPhases = (
505B135C24960D550007DADA /* Headers */, 505B135C24960D550007DADA /* Headers */,
505B135D24960D550007DADA /* Sources */, 505B135D24960D550007DADA /* Sources */,
@ -135,14 +135,14 @@
); );
dependencies = ( dependencies = (
); );
name = deepspeech_ios; name = stt_ios;
productName = deepspeech_ios; productName = stt_ios;
productReference = 505B136124960D550007DADA /* deepspeech_ios.framework */; productReference = 505B136124960D550007DADA /* stt_ios.framework */;
productType = "com.apple.product-type.framework"; productType = "com.apple.product-type.framework";
}; };
505B136924960D550007DADA /* deepspeech_iosTests */ = { 505B136924960D550007DADA /* stt_iosTests */ = {
isa = PBXNativeTarget; isa = PBXNativeTarget;
buildConfigurationList = 505B137824960D550007DADA /* Build configuration list for PBXNativeTarget "deepspeech_iosTests" */; buildConfigurationList = 505B137824960D550007DADA /* Build configuration list for PBXNativeTarget "stt_iosTests" */;
buildPhases = ( buildPhases = (
505B136624960D550007DADA /* Sources */, 505B136624960D550007DADA /* Sources */,
505B136724960D550007DADA /* Frameworks */, 505B136724960D550007DADA /* Frameworks */,
@ -153,9 +153,9 @@
dependencies = ( dependencies = (
505B136D24960D550007DADA /* PBXTargetDependency */, 505B136D24960D550007DADA /* PBXTargetDependency */,
); );
name = deepspeech_iosTests; name = stt_iosTests;
productName = deepspeech_iosTests; productName = stt_iosTests;
productReference = 505B136A24960D550007DADA /* deepspeech_iosTests.xctest */; productReference = 505B136A24960D550007DADA /* stt_iosTests.xctest */;
productType = "com.apple.product-type.bundle.unit-test"; productType = "com.apple.product-type.bundle.unit-test";
}; };
/* End PBXNativeTarget section */ /* End PBXNativeTarget section */
@ -177,7 +177,7 @@
}; };
}; };
}; };
buildConfigurationList = 505B135B24960D550007DADA /* Build configuration list for PBXProject "deepspeech_ios" */; buildConfigurationList = 505B135B24960D550007DADA /* Build configuration list for PBXProject "stt_ios" */;
compatibilityVersion = "Xcode 9.3"; compatibilityVersion = "Xcode 9.3";
developmentRegion = en; developmentRegion = en;
hasScannedForEncodings = 0; hasScannedForEncodings = 0;
@ -190,8 +190,8 @@
projectDirPath = ""; projectDirPath = "";
projectRoot = ""; projectRoot = "";
targets = ( targets = (
505B136024960D550007DADA /* deepspeech_ios */, 505B136024960D550007DADA /* stt_ios */,
505B136924960D550007DADA /* deepspeech_iosTests */, 505B136924960D550007DADA /* stt_iosTests */,
); );
}; };
/* End PBXProject section */ /* End PBXProject section */
@ -218,7 +218,7 @@
isa = PBXSourcesBuildPhase; isa = PBXSourcesBuildPhase;
buildActionMask = 2147483647; buildActionMask = 2147483647;
files = ( files = (
505B137F24961BA70007DADA /* DeepSpeech.swift in Sources */, 505B137F24961BA70007DADA /* STT.swift in Sources */,
); );
runOnlyForDeploymentPostprocessing = 0; runOnlyForDeploymentPostprocessing = 0;
}; };
@ -234,7 +234,7 @@
/* Begin PBXTargetDependency section */ /* Begin PBXTargetDependency section */
505B136D24960D550007DADA /* PBXTargetDependency */ = { 505B136D24960D550007DADA /* PBXTargetDependency */ = {
isa = PBXTargetDependency; isa = PBXTargetDependency;
target = 505B136024960D550007DADA /* deepspeech_ios */; target = 505B136024960D550007DADA /* stt_ios */;
targetProxy = 505B136C24960D550007DADA /* PBXContainerItemProxy */; targetProxy = 505B136C24960D550007DADA /* PBXContainerItemProxy */;
}; };
/* End PBXTargetDependency section */ /* End PBXTargetDependency section */
@ -383,7 +383,7 @@
"$(inherited)", "$(inherited)",
"$(PROJECT_DIR)", "$(PROJECT_DIR)",
); );
INFOPLIST_FILE = deepspeech_ios/Info.plist; INFOPLIST_FILE = stt_ios/Info.plist;
INSTALL_PATH = "$(LOCAL_LIBRARY_DIR)/Frameworks"; INSTALL_PATH = "$(LOCAL_LIBRARY_DIR)/Frameworks";
LD_RUNPATH_SEARCH_PATHS = ( LD_RUNPATH_SEARCH_PATHS = (
"$(inherited)", "$(inherited)",
@ -392,12 +392,12 @@
); );
LIBRARY_SEARCH_PATHS = ( LIBRARY_SEARCH_PATHS = (
"$(inherited)", "$(inherited)",
"$(PROJECT_DIR)/deepspeech_ios", "$(PROJECT_DIR)/stt_ios",
"$(PROJECT_DIR)", "$(PROJECT_DIR)",
); );
MODULEMAP_FILE = deepspeech_ios/deepspeech_ios.modulemap; MODULEMAP_FILE = stt_ios/stt_ios.modulemap;
OTHER_LDFLAGS = "-lstdc++"; OTHER_LDFLAGS = "-lstdc++";
PRODUCT_BUNDLE_IDENTIFIER = "org.deepspeech.deepspeech-ios"; PRODUCT_BUNDLE_IDENTIFIER = "ai.coqui.stt-ios";
PRODUCT_NAME = "$(TARGET_NAME:c99extidentifier)"; PRODUCT_NAME = "$(TARGET_NAME:c99extidentifier)";
SKIP_INSTALL = YES; SKIP_INSTALL = YES;
SWIFT_OPTIMIZATION_LEVEL = "-Onone"; SWIFT_OPTIMIZATION_LEVEL = "-Onone";
@ -423,7 +423,7 @@
"$(inherited)", "$(inherited)",
"$(PROJECT_DIR)", "$(PROJECT_DIR)",
); );
INFOPLIST_FILE = deepspeech_ios/Info.plist; INFOPLIST_FILE = stt_ios/Info.plist;
INSTALL_PATH = "$(LOCAL_LIBRARY_DIR)/Frameworks"; INSTALL_PATH = "$(LOCAL_LIBRARY_DIR)/Frameworks";
LD_RUNPATH_SEARCH_PATHS = ( LD_RUNPATH_SEARCH_PATHS = (
"$(inherited)", "$(inherited)",
@ -432,12 +432,12 @@
); );
LIBRARY_SEARCH_PATHS = ( LIBRARY_SEARCH_PATHS = (
"$(inherited)", "$(inherited)",
"$(PROJECT_DIR)/deepspeech_ios", "$(PROJECT_DIR)/stt_ios",
"$(PROJECT_DIR)", "$(PROJECT_DIR)",
); );
MODULEMAP_FILE = deepspeech_ios/deepspeech_ios.modulemap; MODULEMAP_FILE = stt_ios/stt_ios.modulemap;
OTHER_LDFLAGS = "-lstdc++"; OTHER_LDFLAGS = "-lstdc++";
PRODUCT_BUNDLE_IDENTIFIER = "org.deepspeech.deepspeech-ios"; PRODUCT_BUNDLE_IDENTIFIER = "ai.coqui.stt-ios";
PRODUCT_NAME = "$(TARGET_NAME:c99extidentifier)"; PRODUCT_NAME = "$(TARGET_NAME:c99extidentifier)";
SKIP_INSTALL = YES; SKIP_INSTALL = YES;
SWIFT_VERSION = 5.0; SWIFT_VERSION = 5.0;
@ -450,13 +450,13 @@
buildSettings = { buildSettings = {
ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES; ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES;
CODE_SIGN_STYLE = Automatic; CODE_SIGN_STYLE = Automatic;
INFOPLIST_FILE = deepspeech_iosTests/Info.plist; INFOPLIST_FILE = stt_iosTests/Info.plist;
LD_RUNPATH_SEARCH_PATHS = ( LD_RUNPATH_SEARCH_PATHS = (
"$(inherited)", "$(inherited)",
"@executable_path/Frameworks", "@executable_path/Frameworks",
"@loader_path/Frameworks", "@loader_path/Frameworks",
); );
PRODUCT_BUNDLE_IDENTIFIER = "org.deepspeech.deepspeech-iosTests"; PRODUCT_BUNDLE_IDENTIFIER = "ai.coqui.stt-iosTests";
PRODUCT_NAME = "$(TARGET_NAME)"; PRODUCT_NAME = "$(TARGET_NAME)";
SWIFT_VERSION = 5.0; SWIFT_VERSION = 5.0;
TARGETED_DEVICE_FAMILY = "1,2"; TARGETED_DEVICE_FAMILY = "1,2";
@ -468,13 +468,13 @@
buildSettings = { buildSettings = {
ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES; ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES;
CODE_SIGN_STYLE = Automatic; CODE_SIGN_STYLE = Automatic;
INFOPLIST_FILE = deepspeech_iosTests/Info.plist; INFOPLIST_FILE = stt_iosTests/Info.plist;
LD_RUNPATH_SEARCH_PATHS = ( LD_RUNPATH_SEARCH_PATHS = (
"$(inherited)", "$(inherited)",
"@executable_path/Frameworks", "@executable_path/Frameworks",
"@loader_path/Frameworks", "@loader_path/Frameworks",
); );
PRODUCT_BUNDLE_IDENTIFIER = "org.deepspeech.deepspeech-iosTests"; PRODUCT_BUNDLE_IDENTIFIER = "ai.coqui.stt-iosTests";
PRODUCT_NAME = "$(TARGET_NAME)"; PRODUCT_NAME = "$(TARGET_NAME)";
SWIFT_VERSION = 5.0; SWIFT_VERSION = 5.0;
TARGETED_DEVICE_FAMILY = "1,2"; TARGETED_DEVICE_FAMILY = "1,2";
@ -484,7 +484,7 @@
/* End XCBuildConfiguration section */ /* End XCBuildConfiguration section */
/* Begin XCConfigurationList section */ /* Begin XCConfigurationList section */
505B135B24960D550007DADA /* Build configuration list for PBXProject "deepspeech_ios" */ = { 505B135B24960D550007DADA /* Build configuration list for PBXProject "stt_ios" */ = {
isa = XCConfigurationList; isa = XCConfigurationList;
buildConfigurations = ( buildConfigurations = (
505B137324960D550007DADA /* Debug */, 505B137324960D550007DADA /* Debug */,
@ -493,7 +493,7 @@
defaultConfigurationIsVisible = 0; defaultConfigurationIsVisible = 0;
defaultConfigurationName = Release; defaultConfigurationName = Release;
}; };
505B137524960D550007DADA /* Build configuration list for PBXNativeTarget "deepspeech_ios" */ = { 505B137524960D550007DADA /* Build configuration list for PBXNativeTarget "stt_ios" */ = {
isa = XCConfigurationList; isa = XCConfigurationList;
buildConfigurations = ( buildConfigurations = (
505B137624960D550007DADA /* Debug */, 505B137624960D550007DADA /* Debug */,
@ -502,7 +502,7 @@
defaultConfigurationIsVisible = 0; defaultConfigurationIsVisible = 0;
defaultConfigurationName = Release; defaultConfigurationName = Release;
}; };
505B137824960D550007DADA /* Build configuration list for PBXNativeTarget "deepspeech_iosTests" */ = { 505B137824960D550007DADA /* Build configuration list for PBXNativeTarget "stt_iosTests" */ = {
isa = XCConfigurationList; isa = XCConfigurationList;
buildConfigurations = ( buildConfigurations = (
505B137924960D550007DADA /* Debug */, 505B137924960D550007DADA /* Debug */,

View File

@ -2,6 +2,6 @@
<Workspace <Workspace
version = "1.0"> version = "1.0">
<FileRef <FileRef
location = "self:deepspeech_ios.xcodeproj"> location = "self:stt_ios.xcodeproj">
</FileRef> </FileRef>
</Workspace> </Workspace>

View File

@ -15,9 +15,9 @@
<BuildableReference <BuildableReference
BuildableIdentifier = "primary" BuildableIdentifier = "primary"
BlueprintIdentifier = "505B136024960D550007DADA" BlueprintIdentifier = "505B136024960D550007DADA"
BuildableName = "deepspeech_ios.framework" BuildableName = "stt_ios.framework"
BlueprintName = "deepspeech_ios" BlueprintName = "stt_ios"
ReferencedContainer = "container:deepspeech_ios.xcodeproj"> ReferencedContainer = "container:stt_ios.xcodeproj">
</BuildableReference> </BuildableReference>
</BuildActionEntry> </BuildActionEntry>
</BuildActionEntries> </BuildActionEntries>
@ -33,9 +33,9 @@
<BuildableReference <BuildableReference
BuildableIdentifier = "primary" BuildableIdentifier = "primary"
BlueprintIdentifier = "505B136924960D550007DADA" BlueprintIdentifier = "505B136924960D550007DADA"
BuildableName = "deepspeech_iosTests.xctest" BuildableName = "stt_iosTests.xctest"
BlueprintName = "deepspeech_iosTests" BlueprintName = "stt_iosTests"
ReferencedContainer = "container:deepspeech_ios.xcodeproj"> ReferencedContainer = "container:stt_ios.xcodeproj">
</BuildableReference> </BuildableReference>
</TestableReference> </TestableReference>
</Testables> </Testables>
@ -62,9 +62,9 @@
<BuildableReference <BuildableReference
BuildableIdentifier = "primary" BuildableIdentifier = "primary"
BlueprintIdentifier = "505B136024960D550007DADA" BlueprintIdentifier = "505B136024960D550007DADA"
BuildableName = "deepspeech_ios.framework" BuildableName = "stt_ios.framework"
BlueprintName = "deepspeech_ios" BlueprintName = "stt_ios"
ReferencedContainer = "container:deepspeech_ios.xcodeproj"> ReferencedContainer = "container:stt_ios.xcodeproj">
</BuildableReference> </BuildableReference>
</MacroExpansion> </MacroExpansion>
</ProfileAction> </ProfileAction>

View File

@ -2,9 +2,9 @@
<Workspace <Workspace
version = "1.0"> version = "1.0">
<FileRef <FileRef
location = "group:deepspeech_ios.xcodeproj"> location = "group:stt_ios.xcodeproj">
</FileRef> </FileRef>
<FileRef <FileRef
location = "group:deepspeech_ios_test.xcodeproj"> location = "group:stt_ios_test.xcodeproj">
</FileRef> </FileRef>
</Workspace> </Workspace>

View File

@ -1,14 +1,14 @@
// //
// DeepSpeech.swift // STT.swift
// deepspeech_ios // stt_ios
// //
// Created by Reuben Morais on 14.06.20. // Created by Reuben Morais on 14.06.20.
// Copyright © 2020 Mozilla // Copyright © 2020 Mozilla
// Copyright © 2021 Coqui GmbH // Copyright © 2021 Coqui GmbH
import deepspeech_ios.libdeepspeech_Private import stt_ios.libstt_Private
public enum DeepSpeechError: Error { public enum STTError: Error {
// Should be kept in sync with coqui-stt.h // Should be kept in sync with coqui-stt.h
case noModel(errorCode: Int32) case noModel(errorCode: Int32)
case invalidAlphabet(errorCode: Int32) case invalidAlphabet(errorCode: Int32)
@ -35,7 +35,7 @@ public enum DeepSpeechError: Error {
case invalidErrorCode(errorCode: Int32) case invalidErrorCode(errorCode: Int32)
} }
extension DeepSpeechError : LocalizedError { extension STTError : LocalizedError {
public var errorDescription: String? { public var errorDescription: String? {
switch self { switch self {
case .noModel(let errorCode), case .noModel(let errorCode),
@ -58,66 +58,66 @@ extension DeepSpeechError : LocalizedError {
.failCreateSess(let errorCode), .failCreateSess(let errorCode),
.failCreateModel(let errorCode), .failCreateModel(let errorCode),
.invalidErrorCode(let errorCode): .invalidErrorCode(let errorCode):
let result = DS_ErrorCodeToErrorMessage(errorCode) let result = STT_ErrorCodeToErrorMessage(errorCode)
defer { DS_FreeString(result) } defer { STT_FreeString(result) }
return String(cString: result!) return String(cString: result!)
} }
} }
} }
private func errorCodeToEnum(errorCode: Int32) -> DeepSpeechError { private func errorCodeToEnum(errorCode: Int32) -> STTError {
switch Int(errorCode) { switch Int(errorCode) {
case Int(DS_ERR_NO_MODEL.rawValue): case Int(STT_ERR_NO_MODEL.rawValue):
return DeepSpeechError.noModel(errorCode: errorCode) return STTError.noModel(errorCode: errorCode)
case Int(DS_ERR_INVALID_ALPHABET.rawValue): case Int(STT_ERR_INVALID_ALPHABET.rawValue):
return DeepSpeechError.invalidAlphabet(errorCode: errorCode) return STTError.invalidAlphabet(errorCode: errorCode)
case Int(DS_ERR_INVALID_SHAPE.rawValue): case Int(STT_ERR_INVALID_SHAPE.rawValue):
return DeepSpeechError.invalidShape(errorCode: errorCode) return STTError.invalidShape(errorCode: errorCode)
case Int(DS_ERR_INVALID_SCORER.rawValue): case Int(STT_ERR_INVALID_SCORER.rawValue):
return DeepSpeechError.invalidScorer(errorCode: errorCode) return STTError.invalidScorer(errorCode: errorCode)
case Int(DS_ERR_MODEL_INCOMPATIBLE.rawValue): case Int(STT_ERR_MODEL_INCOMPATIBLE.rawValue):
return DeepSpeechError.modelIncompatible(errorCode: errorCode) return STTError.modelIncompatible(errorCode: errorCode)
case Int(DS_ERR_SCORER_NOT_ENABLED.rawValue): case Int(STT_ERR_SCORER_NOT_ENABLED.rawValue):
return DeepSpeechError.scorerNotEnabled(errorCode: errorCode) return STTError.scorerNotEnabled(errorCode: errorCode)
case Int(DS_ERR_SCORER_UNREADABLE.rawValue): case Int(STT_ERR_SCORER_UNREADABLE.rawValue):
return DeepSpeechError.scorerUnreadable(errorCode: errorCode) return STTError.scorerUnreadable(errorCode: errorCode)
case Int(DS_ERR_SCORER_INVALID_LM.rawValue): case Int(STT_ERR_SCORER_INVALID_LM.rawValue):
return DeepSpeechError.scorerInvalidLm(errorCode: errorCode) return STTError.scorerInvalidLm(errorCode: errorCode)
case Int(DS_ERR_SCORER_NO_TRIE.rawValue): case Int(STT_ERR_SCORER_NO_TRIE.rawValue):
return DeepSpeechError.scorerNoTrie(errorCode: errorCode) return STTError.scorerNoTrie(errorCode: errorCode)
case Int(DS_ERR_SCORER_INVALID_TRIE.rawValue): case Int(STT_ERR_SCORER_INVALID_TRIE.rawValue):
return DeepSpeechError.scorerInvalidTrie(errorCode: errorCode) return STTError.scorerInvalidTrie(errorCode: errorCode)
case Int(DS_ERR_SCORER_VERSION_MISMATCH.rawValue): case Int(STT_ERR_SCORER_VERSION_MISMATCH.rawValue):
return DeepSpeechError.scorerVersionMismatch(errorCode: errorCode) return STTError.scorerVersionMismatch(errorCode: errorCode)
case Int(DS_ERR_FAIL_INIT_MMAP.rawValue): case Int(STT_ERR_FAIL_INIT_MMAP.rawValue):
return DeepSpeechError.failInitMmap(errorCode: errorCode) return STTError.failInitMmap(errorCode: errorCode)
case Int(DS_ERR_FAIL_INIT_SESS.rawValue): case Int(STT_ERR_FAIL_INIT_SESS.rawValue):
return DeepSpeechError.failInitSess(errorCode: errorCode) return STTError.failInitSess(errorCode: errorCode)
case Int(DS_ERR_FAIL_INTERPRETER.rawValue): case Int(STT_ERR_FAIL_INTERPRETER.rawValue):
return DeepSpeechError.failInterpreter(errorCode: errorCode) return STTError.failInterpreter(errorCode: errorCode)
case Int(DS_ERR_FAIL_RUN_SESS.rawValue): case Int(STT_ERR_FAIL_RUN_SESS.rawValue):
return DeepSpeechError.failRunSess(errorCode: errorCode) return STTError.failRunSess(errorCode: errorCode)
case Int(DS_ERR_FAIL_CREATE_STREAM.rawValue): case Int(STT_ERR_FAIL_CREATE_STREAM.rawValue):
return DeepSpeechError.failCreateStream(errorCode: errorCode) return STTError.failCreateStream(errorCode: errorCode)
case Int(DS_ERR_FAIL_READ_PROTOBUF.rawValue): case Int(STT_ERR_FAIL_READ_PROTOBUF.rawValue):
return DeepSpeechError.failReadProtobuf(errorCode: errorCode) return STTError.failReadProtobuf(errorCode: errorCode)
case Int(DS_ERR_FAIL_CREATE_SESS.rawValue): case Int(STT_ERR_FAIL_CREATE_SESS.rawValue):
return DeepSpeechError.failCreateSess(errorCode: errorCode) return STTError.failCreateSess(errorCode: errorCode)
case Int(DS_ERR_FAIL_CREATE_MODEL.rawValue): case Int(STT_ERR_FAIL_CREATE_MODEL.rawValue):
return DeepSpeechError.failCreateModel(errorCode: errorCode) return STTError.failCreateModel(errorCode: errorCode)
default: default:
return DeepSpeechError.invalidErrorCode(errorCode: errorCode) return STTError.invalidErrorCode(errorCode: errorCode)
} }
} }
private func evaluateErrorCode(errorCode: Int32) throws { private func evaluateErrorCode(errorCode: Int32) throws {
if errorCode != Int32(DS_ERR_OK.rawValue) { if errorCode != Int32(STT_ERR_OK.rawValue) {
throw errorCodeToEnum(errorCode: errorCode) throw errorCodeToEnum(errorCode: errorCode)
} }
} }
/// Stores text of an individual token, along with its timing information /// Stores text of an individual token, along with its timing information
public struct DeepSpeechTokenMetadata { public struct STTTokenMetadata {
/// The text corresponding to this token /// The text corresponding to this token
public let text: String public let text: String
@ -137,9 +137,9 @@ public struct DeepSpeechTokenMetadata {
/** A single transcript computed by the model, including a confidence value and /** A single transcript computed by the model, including a confidence value and
the metadata for its constituent tokens the metadata for its constituent tokens
*/ */
public struct DeepSpeechCandidateTranscript { public struct STTCandidateTranscript {
/// Array of DeepSpeechTokenMetadata objects /// Array of STTTokenMetadata objects
public private(set) var tokens: [DeepSpeechTokenMetadata] = [] public private(set) var tokens: [STTTokenMetadata] = []
/** Approximated confidence value for this transcript. This corresponds to /** Approximated confidence value for this transcript. This corresponds to
both acoustic model and language model scores that contributed to the both acoustic model and language model scores that contributed to the
@ -150,16 +150,16 @@ public struct DeepSpeechCandidateTranscript {
internal init(fromInternal: CandidateTranscript) { internal init(fromInternal: CandidateTranscript) {
let tokensBuffer = UnsafeBufferPointer<TokenMetadata>(start: fromInternal.tokens, count: Int(fromInternal.num_tokens)) let tokensBuffer = UnsafeBufferPointer<TokenMetadata>(start: fromInternal.tokens, count: Int(fromInternal.num_tokens))
for tok in tokensBuffer { for tok in tokensBuffer {
tokens.append(DeepSpeechTokenMetadata(fromInternal: tok)) tokens.append(STTTokenMetadata(fromInternal: tok))
} }
confidence = fromInternal.confidence confidence = fromInternal.confidence
} }
} }
/// An array of DeepSpeechCandidateTranscript objects computed by the model /// An array of STTCandidateTranscript objects computed by the model
public struct DeepSpeechMetadata { public struct STTMetadata {
/// Array of DeepSpeechCandidateTranscript objects /// Array of STTCandidateTranscript objects
public private(set) var transcripts: [DeepSpeechCandidateTranscript] = [] public private(set) var transcripts: [STTCandidateTranscript] = []
internal init(fromInternal: UnsafeMutablePointer<Metadata>) { internal init(fromInternal: UnsafeMutablePointer<Metadata>) {
let md = fromInternal.pointee let md = fromInternal.pointee
@ -168,12 +168,12 @@ public struct DeepSpeechMetadata {
count: Int(md.num_transcripts)) count: Int(md.num_transcripts))
for tr in transcriptsBuffer { for tr in transcriptsBuffer {
transcripts.append(DeepSpeechCandidateTranscript(fromInternal: tr)) transcripts.append(STTCandidateTranscript(fromInternal: tr))
} }
} }
} }
public class DeepSpeechStream { public class STTStream {
private var streamCtx: OpaquePointer! private var streamCtx: OpaquePointer!
internal init(streamContext: OpaquePointer) { internal init(streamContext: OpaquePointer) {
@ -182,7 +182,7 @@ public class DeepSpeechStream {
deinit { deinit {
if streamCtx != nil { if streamCtx != nil {
DS_FreeStream(streamCtx) STT_FreeStream(streamCtx)
streamCtx = nil streamCtx = nil
} }
} }
@ -212,7 +212,7 @@ public class DeepSpeechStream {
public func feedAudioContent(buffer: UnsafeBufferPointer<Int16>) { public func feedAudioContent(buffer: UnsafeBufferPointer<Int16>) {
precondition(streamCtx != nil, "calling method on invalidated Stream") precondition(streamCtx != nil, "calling method on invalidated Stream")
DS_FeedAudioContent(streamCtx, buffer.baseAddress, UInt32(buffer.count)) STT_FeedAudioContent(streamCtx, buffer.baseAddress, UInt32(buffer.count))
} }
/** Compute the intermediate decoding of an ongoing streaming inference. /** Compute the intermediate decoding of an ongoing streaming inference.
@ -224,8 +224,8 @@ public class DeepSpeechStream {
public func intermediateDecode() -> String { public func intermediateDecode() -> String {
precondition(streamCtx != nil, "calling method on invalidated Stream") precondition(streamCtx != nil, "calling method on invalidated Stream")
let result = DS_IntermediateDecode(streamCtx) let result = STT_IntermediateDecode(streamCtx)
defer { DS_FreeString(result) } defer { STT_FreeString(result) }
return String(cString: result!) return String(cString: result!)
} }
@ -239,11 +239,11 @@ public class DeepSpeechStream {
- Returns: Metadata struct containing multiple CandidateTranscript structs. - Returns: Metadata struct containing multiple CandidateTranscript structs.
Each transcript has per-token metadata including timing information. Each transcript has per-token metadata including timing information.
*/ */
public func intermediateDecodeWithMetadata(numResults: Int) -> DeepSpeechMetadata { public func intermediateDecodeWithMetadata(numResults: Int) -> STTMetadata {
precondition(streamCtx != nil, "calling method on invalidated Stream") precondition(streamCtx != nil, "calling method on invalidated Stream")
let result = DS_IntermediateDecodeWithMetadata(streamCtx, UInt32(numResults))! let result = STT_IntermediateDecodeWithMetadata(streamCtx, UInt32(numResults))!
defer { DS_FreeMetadata(result) } defer { STT_FreeMetadata(result) }
return DeepSpeechMetadata(fromInternal: result) return STTMetadata(fromInternal: result)
} }
/** Compute the final decoding of an ongoing streaming inference and return /** Compute the final decoding of an ongoing streaming inference and return
@ -258,9 +258,9 @@ public class DeepSpeechStream {
public func finishStream() -> String { public func finishStream() -> String {
precondition(streamCtx != nil, "calling method on invalidated Stream") precondition(streamCtx != nil, "calling method on invalidated Stream")
let result = DS_FinishStream(streamCtx) let result = STT_FinishStream(streamCtx)
defer { defer {
DS_FreeString(result) STT_FreeString(result)
streamCtx = nil streamCtx = nil
} }
return String(cString: result!) return String(cString: result!)
@ -279,42 +279,42 @@ public class DeepSpeechStream {
- Postcondition: This method will invalidate this streaming context. - Postcondition: This method will invalidate this streaming context.
*/ */
public func finishStreamWithMetadata(numResults: Int) -> DeepSpeechMetadata { public func finishStreamWithMetadata(numResults: Int) -> STTMetadata {
precondition(streamCtx != nil, "calling method on invalidated Stream") precondition(streamCtx != nil, "calling method on invalidated Stream")
let result = DS_FinishStreamWithMetadata(streamCtx, UInt32(numResults))! let result = STT_FinishStreamWithMetadata(streamCtx, UInt32(numResults))!
defer { DS_FreeMetadata(result) } defer { STT_FreeMetadata(result) }
return DeepSpeechMetadata(fromInternal: result) return STTMetadata(fromInternal: result)
} }
} }
/// An object providing an interface to a trained DeepSpeech model. /// An object providing an interface to a trained STT model.
public class DeepSpeechModel { public class STTModel {
private var modelCtx: OpaquePointer! private var modelCtx: OpaquePointer!
/** /**
- Parameter modelPath: The path to the model file. - Parameter modelPath: The path to the model file.
- Throws: `DeepSpeechError` on failure. - Throws: `STTError` on failure.
*/ */
public init(modelPath: String) throws { public init(modelPath: String) throws {
let err = DS_CreateModel(modelPath, &modelCtx) let err = STT_CreateModel(modelPath, &modelCtx)
try evaluateErrorCode(errorCode: err) try evaluateErrorCode(errorCode: err)
} }
deinit { deinit {
DS_FreeModel(modelCtx) STT_FreeModel(modelCtx)
modelCtx = nil modelCtx = nil
} }
/** Get beam width value used by the model. If {@link DS_SetModelBeamWidth} /** Get beam width value used by the model. If {@link STT_SetModelBeamWidth}
was not called before, will return the default value loaded from the was not called before, will return the default value loaded from the
model file. model file.
- Returns: Beam width value used by the model. - Returns: Beam width value used by the model.
*/ */
public func getBeamWidth() -> Int { public func getBeamWidth() -> Int {
return Int(DS_GetModelBeamWidth(modelCtx)) return Int(STT_GetModelBeamWidth(modelCtx))
} }
/** Set beam width value used by the model. /** Set beam width value used by the model.
@ -323,17 +323,17 @@ public class DeepSpeechModel {
width value generates better results at the cost width value generates better results at the cost
of decoding time. of decoding time.
- Throws: `DeepSpeechError` on failure. - Throws: `STTError` on failure.
*/ */
public func setBeamWidth(beamWidth: Int) throws { public func setBeamWidth(beamWidth: Int) throws {
let err = DS_SetModelBeamWidth(modelCtx, UInt32(beamWidth)) let err = STT_SetModelBeamWidth(modelCtx, UInt32(beamWidth))
try evaluateErrorCode(errorCode: err) try evaluateErrorCode(errorCode: err)
} }
// The sample rate expected by the model. // The sample rate expected by the model.
public var sampleRate: Int { public var sampleRate: Int {
get { get {
return Int(DS_GetModelSampleRate(modelCtx)) return Int(STT_GetModelSampleRate(modelCtx))
} }
} }
@ -341,19 +341,19 @@ public class DeepSpeechModel {
- Parameter scorerPath: The path to the external scorer file. - Parameter scorerPath: The path to the external scorer file.
- Throws: `DeepSpeechError` on failure. - Throws: `STTError` on failure.
*/ */
public func enableExternalScorer(scorerPath: String) throws { public func enableExternalScorer(scorerPath: String) throws {
let err = DS_EnableExternalScorer(modelCtx, scorerPath) let err = STT_EnableExternalScorer(modelCtx, scorerPath)
try evaluateErrorCode(errorCode: err) try evaluateErrorCode(errorCode: err)
} }
/** Disable decoding using an external scorer. /** Disable decoding using an external scorer.
- Throws: `DeepSpeechError` on failure. - Throws: `STTError` on failure.
*/ */
public func disableExternalScorer() throws { public func disableExternalScorer() throws {
let err = DS_DisableExternalScorer(modelCtx) let err = STT_DisableExternalScorer(modelCtx)
try evaluateErrorCode(errorCode: err) try evaluateErrorCode(errorCode: err)
} }
@ -362,14 +362,14 @@ public class DeepSpeechModel {
- Parameter alpha: The alpha hyperparameter of the decoder. Language model weight. - Parameter alpha: The alpha hyperparameter of the decoder. Language model weight.
- Parameter beta: The beta hyperparameter of the decoder. Word insertion weight. - Parameter beta: The beta hyperparameter of the decoder. Word insertion weight.
- Throws: `DeepSpeechError` on failure. - Throws: `STTError` on failure.
*/ */
public func setScorerAlphaBeta(alpha: Float, beta: Float) throws { public func setScorerAlphaBeta(alpha: Float, beta: Float) throws {
let err = DS_SetScorerAlphaBeta(modelCtx, alpha, beta) let err = STT_SetScorerAlphaBeta(modelCtx, alpha, beta)
try evaluateErrorCode(errorCode: err) try evaluateErrorCode(errorCode: err)
} }
/** Use the DeepSpeech model to convert speech to text. /** Use the STT model to convert speech to text.
- Parameter buffer: A 16-bit, mono raw audio signal at the appropriate - Parameter buffer: A 16-bit, mono raw audio signal at the appropriate
sample rate (matching what the model was trained on). sample rate (matching what the model was trained on).
@ -382,7 +382,7 @@ public class DeepSpeechModel {
} }
} }
/** Use the DeepSpeech model to convert speech to text. /** Use the STT model to convert speech to text.
- Parameter buffer: A 16-bit, mono raw audio signal at the appropriate - Parameter buffer: A 16-bit, mono raw audio signal at the appropriate
sample rate (matching what the model was trained on). sample rate (matching what the model was trained on).
@ -390,65 +390,65 @@ public class DeepSpeechModel {
- Returns: The STT result. - Returns: The STT result.
*/ */
public func speechToText(buffer: UnsafeBufferPointer<Int16>) -> String { public func speechToText(buffer: UnsafeBufferPointer<Int16>) -> String {
let result = DS_SpeechToText(modelCtx, buffer.baseAddress, UInt32(buffer.count)) let result = STT_SpeechToText(modelCtx, buffer.baseAddress, UInt32(buffer.count))
defer { DS_FreeString(result) } defer { STT_FreeString(result) }
return String(cString: result!) return String(cString: result!)
} }
/** Use the DeepSpeech model to convert speech to text and output results /** Use the STT model to convert speech to text and output results
including metadata. including metadata.
- Parameter buffer: A 16-bit, mono raw audio signal at the appropriate - Parameter buffer: A 16-bit, mono raw audio signal at the appropriate
sample rate (matching what the model was trained on). sample rate (matching what the model was trained on).
- Parameter numResults: The maximum number of DeepSpeechCandidateTranscript - Parameter numResults: The maximum number of STTCandidateTranscript
structs to return. Returned value might be smaller than this. structs to return. Returned value might be smaller than this.
- Returns: Metadata struct containing multiple CandidateTranscript structs. - Returns: Metadata struct containing multiple CandidateTranscript structs.
Each transcript has per-token metadata including timing information. Each transcript has per-token metadata including timing information.
*/ */
public func speechToTextWithMetadata(buffer: Array<Int16>, numResults: Int) -> DeepSpeechMetadata { public func speechToTextWithMetadata(buffer: Array<Int16>, numResults: Int) -> STTMetadata {
return buffer.withUnsafeBufferPointer { unsafeBufferPointer -> DeepSpeechMetadata in return buffer.withUnsafeBufferPointer { unsafeBufferPointer -> STTMetadata in
return speechToTextWithMetadata(buffer: unsafeBufferPointer, numResults: numResults) return speechToTextWithMetadata(buffer: unsafeBufferPointer, numResults: numResults)
} }
} }
/** Use the DeepSpeech model to convert speech to text and output results /** Use the STT model to convert speech to text and output results
including metadata. including metadata.
- Parameter buffer: A 16-bit, mono raw audio signal at the appropriate - Parameter buffer: A 16-bit, mono raw audio signal at the appropriate
sample rate (matching what the model was trained on). sample rate (matching what the model was trained on).
- Parameter numResults: The maximum number of DeepSpeechCandidateTranscript - Parameter numResults: The maximum number of STTCandidateTranscript
structs to return. Returned value might be smaller than this. structs to return. Returned value might be smaller than this.
- Returns: Metadata struct containing multiple CandidateTranscript structs. - Returns: Metadata struct containing multiple CandidateTranscript structs.
Each transcript has per-token metadata including timing information. Each transcript has per-token metadata including timing information.
*/ */
public func speechToTextWithMetadata(buffer: UnsafeBufferPointer<Int16>, numResults: Int) -> DeepSpeechMetadata { public func speechToTextWithMetadata(buffer: UnsafeBufferPointer<Int16>, numResults: Int) -> STTMetadata {
let result = DS_SpeechToTextWithMetadata( let result = STT_SpeechToTextWithMetadata(
modelCtx, modelCtx,
buffer.baseAddress, buffer.baseAddress,
UInt32(buffer.count), UInt32(buffer.count),
UInt32(numResults))! UInt32(numResults))!
defer { DS_FreeMetadata(result) } defer { STT_FreeMetadata(result) }
return DeepSpeechMetadata(fromInternal: result) return STTMetadata(fromInternal: result)
} }
/** Create a new streaming inference state. /** Create a new streaming inference state.
- Returns: DeepSpeechStream object representing the streaming state. - Returns: STTStream object representing the streaming state.
- Throws: `DeepSpeechError` on failure. - Throws: `STTError` on failure.
*/ */
public func createStream() throws -> DeepSpeechStream { public func createStream() throws -> STTStream {
var streamContext: OpaquePointer! var streamContext: OpaquePointer!
let err = DS_CreateStream(modelCtx, &streamContext) let err = STT_CreateStream(modelCtx, &streamContext)
try evaluateErrorCode(errorCode: err) try evaluateErrorCode(errorCode: err)
return DeepSpeechStream(streamContext: streamContext) return STTStream(streamContext: streamContext)
} }
} }
public func DeepSpeechVersion() -> String { public func STTVersion() -> String {
let result = DS_Version() let result = STT_Version()
defer { DS_FreeString(result) } defer { STT_FreeString(result) }
return String(cString: result!) return String(cString: result!)
} }

View File

@ -1,6 +1,6 @@
// //
// deepspeech_ios.h // stt_ios.h
// deepspeech_ios // stt_ios
// //
// Created by Reuben Morais on 14.06.20. // Created by Reuben Morais on 14.06.20.
// Copyright © 2020 Mozilla // Copyright © 2020 Mozilla
@ -8,6 +8,6 @@
#import <Foundation/Foundation.h> #import <Foundation/Foundation.h>
// In this header, you should import all the public headers of your framework using statements like #import <deepspeech_ios/PublicHeader.h> // In this header, you should import all the public headers of your framework using statements like #import <stt_ios/PublicHeader.h>

View File

@ -0,0 +1,12 @@
framework module stt_ios {
umbrella header "stt_ios.h"
export *
module * { export * }
explicit module libstt_Private {
header "coqui-stt.h"
export *
link "stt"
}
}
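The module map above publishes the framework's Swift/Obj-C surface through the stt_ios umbrella header while keeping the C header coqui-stt.h inside an explicit private submodule that links against libstt. A plausible sketch of how the framework's own Swift sources would reach the raw C API through that submodule (the import line and helper below are assumptions for illustration, not code from this commit):

import stt_ios.libstt_Private  // assumed: an explicit submodule must be imported by name

// With the private module imported, the C symbols (STT_Version, STT_FreeString, ...)
// become visible, which is how wrappers such as STTModel and STTStream call into libstt.
func nativeVersionString() -> String {
    let v = STT_Version()
    defer { STT_FreeString(v) }
    return String(cString: v!)
}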

View File

@ -9,16 +9,16 @@
/* Begin PBXBuildFile section */ /* Begin PBXBuildFile section */
504EC34324CF4EFD0073C22E /* SpeechRecognitionImpl.swift in Sources */ = {isa = PBXBuildFile; fileRef = 504EC34124CF4EFD0073C22E /* SpeechRecognitionImpl.swift */; }; 504EC34324CF4EFD0073C22E /* SpeechRecognitionImpl.swift in Sources */ = {isa = PBXBuildFile; fileRef = 504EC34124CF4EFD0073C22E /* SpeechRecognitionImpl.swift */; };
504EC34424CF4EFD0073C22E /* AudioContext.swift in Sources */ = {isa = PBXBuildFile; fileRef = 504EC34224CF4EFD0073C22E /* AudioContext.swift */; }; 504EC34424CF4EFD0073C22E /* AudioContext.swift in Sources */ = {isa = PBXBuildFile; fileRef = 504EC34224CF4EFD0073C22E /* AudioContext.swift */; };
504EC34524CF4F4F0073C22E /* deepspeech_ios.framework in Embed Frameworks */ = {isa = PBXBuildFile; fileRef = 507CD3A024B61FE400409BBB /* deepspeech_ios.framework */; settings = {ATTRIBUTES = (CodeSignOnCopy, RemoveHeadersOnCopy, ); }; }; 504EC34524CF4F4F0073C22E /* stt_ios.framework in Embed Frameworks */ = {isa = PBXBuildFile; fileRef = 507CD3A024B61FE400409BBB /* stt_ios.framework */; settings = {ATTRIBUTES = (CodeSignOnCopy, RemoveHeadersOnCopy, ); }; };
507CD3A124B61FE400409BBB /* deepspeech_ios.framework in Frameworks */ = {isa = PBXBuildFile; fileRef = 507CD3A024B61FE400409BBB /* deepspeech_ios.framework */; }; 507CD3A124B61FE400409BBB /* stt_ios.framework in Frameworks */ = {isa = PBXBuildFile; fileRef = 507CD3A024B61FE400409BBB /* stt_ios.framework */; };
50F787F32497683900D52237 /* AppDelegate.swift in Sources */ = {isa = PBXBuildFile; fileRef = 50F787F22497683900D52237 /* AppDelegate.swift */; }; 50F787F32497683900D52237 /* AppDelegate.swift in Sources */ = {isa = PBXBuildFile; fileRef = 50F787F22497683900D52237 /* AppDelegate.swift */; };
50F787F52497683900D52237 /* SceneDelegate.swift in Sources */ = {isa = PBXBuildFile; fileRef = 50F787F42497683900D52237 /* SceneDelegate.swift */; }; 50F787F52497683900D52237 /* SceneDelegate.swift in Sources */ = {isa = PBXBuildFile; fileRef = 50F787F42497683900D52237 /* SceneDelegate.swift */; };
50F787F72497683900D52237 /* ContentView.swift in Sources */ = {isa = PBXBuildFile; fileRef = 50F787F62497683900D52237 /* ContentView.swift */; }; 50F787F72497683900D52237 /* ContentView.swift in Sources */ = {isa = PBXBuildFile; fileRef = 50F787F62497683900D52237 /* ContentView.swift */; };
50F787F92497683A00D52237 /* Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = 50F787F82497683A00D52237 /* Assets.xcassets */; }; 50F787F92497683A00D52237 /* Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = 50F787F82497683A00D52237 /* Assets.xcassets */; };
50F787FC2497683A00D52237 /* Preview Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = 50F787FB2497683A00D52237 /* Preview Assets.xcassets */; }; 50F787FC2497683A00D52237 /* Preview Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = 50F787FB2497683A00D52237 /* Preview Assets.xcassets */; };
50F787FF2497683A00D52237 /* LaunchScreen.storyboard in Resources */ = {isa = PBXBuildFile; fileRef = 50F787FD2497683A00D52237 /* LaunchScreen.storyboard */; }; 50F787FF2497683A00D52237 /* LaunchScreen.storyboard in Resources */ = {isa = PBXBuildFile; fileRef = 50F787FD2497683A00D52237 /* LaunchScreen.storyboard */; };
50F7880A2497683A00D52237 /* deepspeech_ios_testTests.swift in Sources */ = {isa = PBXBuildFile; fileRef = 50F788092497683A00D52237 /* deepspeech_ios_testTests.swift */; }; 50F7880A2497683A00D52237 /* stt_ios_testTests.swift in Sources */ = {isa = PBXBuildFile; fileRef = 50F788092497683A00D52237 /* stt_ios_testTests.swift */; };
50F788152497683A00D52237 /* deepspeech_ios_testUITests.swift in Sources */ = {isa = PBXBuildFile; fileRef = 50F788142497683A00D52237 /* deepspeech_ios_testUITests.swift */; }; 50F788152497683A00D52237 /* stt_ios_testUITests.swift in Sources */ = {isa = PBXBuildFile; fileRef = 50F788142497683A00D52237 /* stt_ios_testUITests.swift */; };
/* End PBXBuildFile section */ /* End PBXBuildFile section */
/* Begin PBXContainerItemProxy section */ /* Begin PBXContainerItemProxy section */
@ -27,14 +27,14 @@
containerPortal = 50F787E72497683900D52237 /* Project object */; containerPortal = 50F787E72497683900D52237 /* Project object */;
proxyType = 1; proxyType = 1;
remoteGlobalIDString = 50F787EE2497683900D52237; remoteGlobalIDString = 50F787EE2497683900D52237;
remoteInfo = deepspeech_ios_test; remoteInfo = stt_ios_test;
}; };
50F788112497683A00D52237 /* PBXContainerItemProxy */ = { 50F788112497683A00D52237 /* PBXContainerItemProxy */ = {
isa = PBXContainerItemProxy; isa = PBXContainerItemProxy;
containerPortal = 50F787E72497683900D52237 /* Project object */; containerPortal = 50F787E72497683900D52237 /* Project object */;
proxyType = 1; proxyType = 1;
remoteGlobalIDString = 50F787EE2497683900D52237; remoteGlobalIDString = 50F787EE2497683900D52237;
remoteInfo = deepspeech_ios_test; remoteInfo = stt_ios_test;
}; };
/* End PBXContainerItemProxy section */ /* End PBXContainerItemProxy section */
@ -45,7 +45,7 @@
dstPath = ""; dstPath = "";
dstSubfolderSpec = 10; dstSubfolderSpec = 10;
files = ( files = (
504EC34524CF4F4F0073C22E /* deepspeech_ios.framework in Embed Frameworks */, 504EC34524CF4F4F0073C22E /* stt_ios.framework in Embed Frameworks */,
); );
name = "Embed Frameworks"; name = "Embed Frameworks";
runOnlyForDeploymentPostprocessing = 0; runOnlyForDeploymentPostprocessing = 0;
@ -55,9 +55,9 @@
/* Begin PBXFileReference section */ /* Begin PBXFileReference section */
504EC34124CF4EFD0073C22E /* SpeechRecognitionImpl.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = SpeechRecognitionImpl.swift; sourceTree = "<group>"; }; 504EC34124CF4EFD0073C22E /* SpeechRecognitionImpl.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = SpeechRecognitionImpl.swift; sourceTree = "<group>"; };
504EC34224CF4EFD0073C22E /* AudioContext.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = AudioContext.swift; sourceTree = "<group>"; }; 504EC34224CF4EFD0073C22E /* AudioContext.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = AudioContext.swift; sourceTree = "<group>"; };
507CD3A024B61FE400409BBB /* deepspeech_ios.framework */ = {isa = PBXFileReference; explicitFileType = wrapper.framework; path = deepspeech_ios.framework; sourceTree = BUILT_PRODUCTS_DIR; }; 507CD3A024B61FE400409BBB /* stt_ios.framework */ = {isa = PBXFileReference; explicitFileType = wrapper.framework; path = stt_ios.framework; sourceTree = BUILT_PRODUCTS_DIR; };
507CD3A224B61FEA00409BBB /* libdeepspeech.so */ = {isa = PBXFileReference; lastKnownFileType = "compiled.mach-o.dylib"; path = libdeepspeech.so; sourceTree = "<group>"; }; 507CD3A224B61FEA00409BBB /* libstt.so */ = {isa = PBXFileReference; lastKnownFileType = "compiled.mach-o.dylib"; path = libstt.so; sourceTree = "<group>"; };
50F787EF2497683900D52237 /* deepspeech_ios_test.app */ = {isa = PBXFileReference; explicitFileType = wrapper.application; includeInIndex = 0; path = deepspeech_ios_test.app; sourceTree = BUILT_PRODUCTS_DIR; }; 50F787EF2497683900D52237 /* stt_ios_test.app */ = {isa = PBXFileReference; explicitFileType = wrapper.application; includeInIndex = 0; path = stt_ios_test.app; sourceTree = BUILT_PRODUCTS_DIR; };
50F787F22497683900D52237 /* AppDelegate.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = AppDelegate.swift; sourceTree = "<group>"; }; 50F787F22497683900D52237 /* AppDelegate.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = AppDelegate.swift; sourceTree = "<group>"; };
50F787F42497683900D52237 /* SceneDelegate.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SceneDelegate.swift; sourceTree = "<group>"; }; 50F787F42497683900D52237 /* SceneDelegate.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SceneDelegate.swift; sourceTree = "<group>"; };
50F787F62497683900D52237 /* ContentView.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ContentView.swift; sourceTree = "<group>"; }; 50F787F62497683900D52237 /* ContentView.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ContentView.swift; sourceTree = "<group>"; };
@ -65,11 +65,11 @@
50F787FB2497683A00D52237 /* Preview Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; path = "Preview Assets.xcassets"; sourceTree = "<group>"; }; 50F787FB2497683A00D52237 /* Preview Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; path = "Preview Assets.xcassets"; sourceTree = "<group>"; };
50F787FE2497683A00D52237 /* Base */ = {isa = PBXFileReference; lastKnownFileType = file.storyboard; name = Base; path = Base.lproj/LaunchScreen.storyboard; sourceTree = "<group>"; }; 50F787FE2497683A00D52237 /* Base */ = {isa = PBXFileReference; lastKnownFileType = file.storyboard; name = Base; path = Base.lproj/LaunchScreen.storyboard; sourceTree = "<group>"; };
50F788002497683A00D52237 /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist.xml; path = Info.plist; sourceTree = "<group>"; }; 50F788002497683A00D52237 /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist.xml; path = Info.plist; sourceTree = "<group>"; };
50F788052497683A00D52237 /* deepspeech_ios_testTests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = deepspeech_ios_testTests.xctest; sourceTree = BUILT_PRODUCTS_DIR; }; 50F788052497683A00D52237 /* stt_ios_testTests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = stt_ios_testTests.xctest; sourceTree = BUILT_PRODUCTS_DIR; };
50F788092497683A00D52237 /* deepspeech_ios_testTests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = deepspeech_ios_testTests.swift; sourceTree = "<group>"; }; 50F788092497683A00D52237 /* stt_ios_testTests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = stt_ios_testTests.swift; sourceTree = "<group>"; };
50F7880B2497683A00D52237 /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist.xml; path = Info.plist; sourceTree = "<group>"; }; 50F7880B2497683A00D52237 /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist.xml; path = Info.plist; sourceTree = "<group>"; };
50F788102497683A00D52237 /* deepspeech_ios_testUITests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = deepspeech_ios_testUITests.xctest; sourceTree = BUILT_PRODUCTS_DIR; }; 50F788102497683A00D52237 /* stt_ios_testUITests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = stt_ios_testUITests.xctest; sourceTree = BUILT_PRODUCTS_DIR; };
50F788142497683A00D52237 /* deepspeech_ios_testUITests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = deepspeech_ios_testUITests.swift; sourceTree = "<group>"; }; 50F788142497683A00D52237 /* stt_ios_testUITests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = stt_ios_testUITests.swift; sourceTree = "<group>"; };
50F788162497683A00D52237 /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist.xml; path = Info.plist; sourceTree = "<group>"; }; 50F788162497683A00D52237 /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist.xml; path = Info.plist; sourceTree = "<group>"; };
/* End PBXFileReference section */ /* End PBXFileReference section */
@ -78,7 +78,7 @@
isa = PBXFrameworksBuildPhase; isa = PBXFrameworksBuildPhase;
buildActionMask = 2147483647; buildActionMask = 2147483647;
files = ( files = (
507CD3A124B61FE400409BBB /* deepspeech_ios.framework in Frameworks */, 507CD3A124B61FE400409BBB /* stt_ios.framework in Frameworks */,
); );
runOnlyForDeploymentPostprocessing = 0; runOnlyForDeploymentPostprocessing = 0;
}; };
@ -102,8 +102,8 @@
50F2B0FC2498D6C7007CD876 /* Frameworks */ = { 50F2B0FC2498D6C7007CD876 /* Frameworks */ = {
isa = PBXGroup; isa = PBXGroup;
children = ( children = (
507CD3A224B61FEA00409BBB /* libdeepspeech.so */, 507CD3A224B61FEA00409BBB /* libstt.so */,
507CD3A024B61FE400409BBB /* deepspeech_ios.framework */, 507CD3A024B61FE400409BBB /* stt_ios.framework */,
); );
name = Frameworks; name = Frameworks;
sourceTree = "<group>"; sourceTree = "<group>";
@ -111,9 +111,9 @@
50F787E62497683900D52237 = { 50F787E62497683900D52237 = {
isa = PBXGroup; isa = PBXGroup;
children = ( children = (
50F787F12497683900D52237 /* deepspeech_ios_test */, 50F787F12497683900D52237 /* stt_ios_test */,
50F788082497683A00D52237 /* deepspeech_ios_testTests */, 50F788082497683A00D52237 /* stt_ios_testTests */,
50F788132497683A00D52237 /* deepspeech_ios_testUITests */, 50F788132497683A00D52237 /* stt_ios_testUITests */,
50F787F02497683900D52237 /* Products */, 50F787F02497683900D52237 /* Products */,
50F2B0FC2498D6C7007CD876 /* Frameworks */, 50F2B0FC2498D6C7007CD876 /* Frameworks */,
); );
@ -122,14 +122,14 @@
50F787F02497683900D52237 /* Products */ = { 50F787F02497683900D52237 /* Products */ = {
isa = PBXGroup; isa = PBXGroup;
children = ( children = (
50F787EF2497683900D52237 /* deepspeech_ios_test.app */, 50F787EF2497683900D52237 /* stt_ios_test.app */,
50F788052497683A00D52237 /* deepspeech_ios_testTests.xctest */, 50F788052497683A00D52237 /* stt_ios_testTests.xctest */,
50F788102497683A00D52237 /* deepspeech_ios_testUITests.xctest */, 50F788102497683A00D52237 /* stt_ios_testUITests.xctest */,
); );
name = Products; name = Products;
sourceTree = "<group>"; sourceTree = "<group>";
}; };
50F787F12497683900D52237 /* deepspeech_ios_test */ = { 50F787F12497683900D52237 /* stt_ios_test */ = {
isa = PBXGroup; isa = PBXGroup;
children = ( children = (
504EC34224CF4EFD0073C22E /* AudioContext.swift */, 504EC34224CF4EFD0073C22E /* AudioContext.swift */,
@ -142,7 +142,7 @@
50F788002497683A00D52237 /* Info.plist */, 50F788002497683A00D52237 /* Info.plist */,
50F787FA2497683A00D52237 /* Preview Content */, 50F787FA2497683A00D52237 /* Preview Content */,
); );
path = deepspeech_ios_test; path = stt_ios_test;
sourceTree = "<group>"; sourceTree = "<group>";
}; };
50F787FA2497683A00D52237 /* Preview Content */ = { 50F787FA2497683A00D52237 /* Preview Content */ = {
@ -153,30 +153,30 @@
path = "Preview Content"; path = "Preview Content";
sourceTree = "<group>"; sourceTree = "<group>";
}; };
50F788082497683A00D52237 /* deepspeech_ios_testTests */ = { 50F788082497683A00D52237 /* stt_ios_testTests */ = {
isa = PBXGroup; isa = PBXGroup;
children = ( children = (
50F788092497683A00D52237 /* deepspeech_ios_testTests.swift */, 50F788092497683A00D52237 /* stt_ios_testTests.swift */,
50F7880B2497683A00D52237 /* Info.plist */, 50F7880B2497683A00D52237 /* Info.plist */,
); );
path = deepspeech_ios_testTests; path = stt_ios_testTests;
sourceTree = "<group>"; sourceTree = "<group>";
}; };
50F788132497683A00D52237 /* deepspeech_ios_testUITests */ = { 50F788132497683A00D52237 /* stt_ios_testUITests */ = {
isa = PBXGroup; isa = PBXGroup;
children = ( children = (
50F788142497683A00D52237 /* deepspeech_ios_testUITests.swift */, 50F788142497683A00D52237 /* stt_ios_testUITests.swift */,
50F788162497683A00D52237 /* Info.plist */, 50F788162497683A00D52237 /* Info.plist */,
); );
path = deepspeech_ios_testUITests; path = stt_ios_testUITests;
sourceTree = "<group>"; sourceTree = "<group>";
}; };
/* End PBXGroup section */ /* End PBXGroup section */
/* Begin PBXNativeTarget section */ /* Begin PBXNativeTarget section */
50F787EE2497683900D52237 /* deepspeech_ios_test */ = { 50F787EE2497683900D52237 /* stt_ios_test */ = {
isa = PBXNativeTarget; isa = PBXNativeTarget;
buildConfigurationList = 50F788192497683A00D52237 /* Build configuration list for PBXNativeTarget "deepspeech_ios_test" */; buildConfigurationList = 50F788192497683A00D52237 /* Build configuration list for PBXNativeTarget "stt_ios_test" */;
buildPhases = ( buildPhases = (
50F787EB2497683900D52237 /* Sources */, 50F787EB2497683900D52237 /* Sources */,
50F787EC2497683900D52237 /* Frameworks */, 50F787EC2497683900D52237 /* Frameworks */,
@ -187,14 +187,14 @@
); );
dependencies = ( dependencies = (
); );
name = deepspeech_ios_test; name = stt_ios_test;
productName = deepspeech_ios_test; productName = stt_ios_test;
productReference = 50F787EF2497683900D52237 /* deepspeech_ios_test.app */; productReference = 50F787EF2497683900D52237 /* stt_ios_test.app */;
productType = "com.apple.product-type.application"; productType = "com.apple.product-type.application";
}; };
50F788042497683A00D52237 /* deepspeech_ios_testTests */ = { 50F788042497683A00D52237 /* stt_ios_testTests */ = {
isa = PBXNativeTarget; isa = PBXNativeTarget;
buildConfigurationList = 50F7881C2497683A00D52237 /* Build configuration list for PBXNativeTarget "deepspeech_ios_testTests" */; buildConfigurationList = 50F7881C2497683A00D52237 /* Build configuration list for PBXNativeTarget "stt_ios_testTests" */;
buildPhases = ( buildPhases = (
50F788012497683A00D52237 /* Sources */, 50F788012497683A00D52237 /* Sources */,
50F788022497683A00D52237 /* Frameworks */, 50F788022497683A00D52237 /* Frameworks */,
@ -205,14 +205,14 @@
dependencies = ( dependencies = (
50F788072497683A00D52237 /* PBXTargetDependency */, 50F788072497683A00D52237 /* PBXTargetDependency */,
); );
name = deepspeech_ios_testTests; name = stt_ios_testTests;
productName = deepspeech_ios_testTests; productName = stt_ios_testTests;
productReference = 50F788052497683A00D52237 /* deepspeech_ios_testTests.xctest */; productReference = 50F788052497683A00D52237 /* stt_ios_testTests.xctest */;
productType = "com.apple.product-type.bundle.unit-test"; productType = "com.apple.product-type.bundle.unit-test";
}; };
50F7880F2497683A00D52237 /* deepspeech_ios_testUITests */ = { 50F7880F2497683A00D52237 /* stt_ios_testUITests */ = {
isa = PBXNativeTarget; isa = PBXNativeTarget;
buildConfigurationList = 50F7881F2497683A00D52237 /* Build configuration list for PBXNativeTarget "deepspeech_ios_testUITests" */; buildConfigurationList = 50F7881F2497683A00D52237 /* Build configuration list for PBXNativeTarget "stt_ios_testUITests" */;
buildPhases = ( buildPhases = (
50F7880C2497683A00D52237 /* Sources */, 50F7880C2497683A00D52237 /* Sources */,
50F7880D2497683A00D52237 /* Frameworks */, 50F7880D2497683A00D52237 /* Frameworks */,
@ -223,9 +223,9 @@
dependencies = ( dependencies = (
50F788122497683A00D52237 /* PBXTargetDependency */, 50F788122497683A00D52237 /* PBXTargetDependency */,
); );
name = deepspeech_ios_testUITests; name = stt_ios_testUITests;
productName = deepspeech_ios_testUITests; productName = stt_ios_testUITests;
productReference = 50F788102497683A00D52237 /* deepspeech_ios_testUITests.xctest */; productReference = 50F788102497683A00D52237 /* stt_ios_testUITests.xctest */;
productType = "com.apple.product-type.bundle.ui-testing"; productType = "com.apple.product-type.bundle.ui-testing";
}; };
/* End PBXNativeTarget section */ /* End PBXNativeTarget section */
@ -251,7 +251,7 @@
}; };
}; };
}; };
buildConfigurationList = 50F787EA2497683900D52237 /* Build configuration list for PBXProject "deepspeech_ios_test" */; buildConfigurationList = 50F787EA2497683900D52237 /* Build configuration list for PBXProject "stt_ios_test" */;
compatibilityVersion = "Xcode 9.3"; compatibilityVersion = "Xcode 9.3";
developmentRegion = en; developmentRegion = en;
hasScannedForEncodings = 0; hasScannedForEncodings = 0;
@ -264,9 +264,9 @@
projectDirPath = ""; projectDirPath = "";
projectRoot = ""; projectRoot = "";
targets = ( targets = (
50F787EE2497683900D52237 /* deepspeech_ios_test */, 50F787EE2497683900D52237 /* stt_ios_test */,
50F788042497683A00D52237 /* deepspeech_ios_testTests */, 50F788042497683A00D52237 /* stt_ios_testTests */,
50F7880F2497683A00D52237 /* deepspeech_ios_testUITests */, 50F7880F2497683A00D52237 /* stt_ios_testUITests */,
); );
}; };
/* End PBXProject section */ /* End PBXProject section */
@ -315,7 +315,7 @@
isa = PBXSourcesBuildPhase; isa = PBXSourcesBuildPhase;
buildActionMask = 2147483647; buildActionMask = 2147483647;
files = ( files = (
50F7880A2497683A00D52237 /* deepspeech_ios_testTests.swift in Sources */, 50F7880A2497683A00D52237 /* stt_ios_testTests.swift in Sources */,
); );
runOnlyForDeploymentPostprocessing = 0; runOnlyForDeploymentPostprocessing = 0;
}; };
@ -323,7 +323,7 @@
isa = PBXSourcesBuildPhase; isa = PBXSourcesBuildPhase;
buildActionMask = 2147483647; buildActionMask = 2147483647;
files = ( files = (
50F788152497683A00D52237 /* deepspeech_ios_testUITests.swift in Sources */, 50F788152497683A00D52237 /* stt_ios_testUITests.swift in Sources */,
); );
runOnlyForDeploymentPostprocessing = 0; runOnlyForDeploymentPostprocessing = 0;
}; };
@ -332,12 +332,12 @@
/* Begin PBXTargetDependency section */ /* Begin PBXTargetDependency section */
50F788072497683A00D52237 /* PBXTargetDependency */ = { 50F788072497683A00D52237 /* PBXTargetDependency */ = {
isa = PBXTargetDependency; isa = PBXTargetDependency;
target = 50F787EE2497683900D52237 /* deepspeech_ios_test */; target = 50F787EE2497683900D52237 /* stt_ios_test */;
targetProxy = 50F788062497683A00D52237 /* PBXContainerItemProxy */; targetProxy = 50F788062497683A00D52237 /* PBXContainerItemProxy */;
}; };
50F788122497683A00D52237 /* PBXTargetDependency */ = { 50F788122497683A00D52237 /* PBXTargetDependency */ = {
isa = PBXTargetDependency; isa = PBXTargetDependency;
target = 50F787EE2497683900D52237 /* deepspeech_ios_test */; target = 50F787EE2497683900D52237 /* stt_ios_test */;
targetProxy = 50F788112497683A00D52237 /* PBXContainerItemProxy */; targetProxy = 50F788112497683A00D52237 /* PBXContainerItemProxy */;
}; };
/* End PBXTargetDependency section */ /* End PBXTargetDependency section */
@ -473,11 +473,11 @@
buildSettings = { buildSettings = {
ASSETCATALOG_COMPILER_APPICON_NAME = AppIcon; ASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;
CODE_SIGN_STYLE = Automatic; CODE_SIGN_STYLE = Automatic;
DEVELOPMENT_ASSET_PATHS = "\"deepspeech_ios_test/Preview Content\""; DEVELOPMENT_ASSET_PATHS = "\"stt_ios_test/Preview Content\"";
DEVELOPMENT_TEAM = AWCG9S27P7; DEVELOPMENT_TEAM = AWCG9S27P7;
ENABLE_PREVIEWS = YES; ENABLE_PREVIEWS = YES;
FRAMEWORK_SEARCH_PATHS = "$(PROJECT_DIR)"; FRAMEWORK_SEARCH_PATHS = "$(PROJECT_DIR)";
INFOPLIST_FILE = deepspeech_ios_test/Info.plist; INFOPLIST_FILE = stt_ios_test/Info.plist;
LD_RUNPATH_SEARCH_PATHS = ( LD_RUNPATH_SEARCH_PATHS = (
"$(inherited)", "$(inherited)",
"@executable_path/Frameworks", "@executable_path/Frameworks",
@ -486,7 +486,7 @@
"$(inherited)", "$(inherited)",
"$(PROJECT_DIR)", "$(PROJECT_DIR)",
); );
PRODUCT_BUNDLE_IDENTIFIER = "org.deepspeech.deepspeech-ios-test"; PRODUCT_BUNDLE_IDENTIFIER = "ai.coqui.stt-ios-test";
PRODUCT_NAME = "$(TARGET_NAME)"; PRODUCT_NAME = "$(TARGET_NAME)";
SWIFT_VERSION = 5.0; SWIFT_VERSION = 5.0;
TARGETED_DEVICE_FAMILY = "1,2"; TARGETED_DEVICE_FAMILY = "1,2";
@ -498,11 +498,11 @@
buildSettings = { buildSettings = {
ASSETCATALOG_COMPILER_APPICON_NAME = AppIcon; ASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;
CODE_SIGN_STYLE = Automatic; CODE_SIGN_STYLE = Automatic;
DEVELOPMENT_ASSET_PATHS = "\"deepspeech_ios_test/Preview Content\""; DEVELOPMENT_ASSET_PATHS = "\"stt_ios_test/Preview Content\"";
DEVELOPMENT_TEAM = AWCG9S27P7; DEVELOPMENT_TEAM = AWCG9S27P7;
ENABLE_PREVIEWS = YES; ENABLE_PREVIEWS = YES;
FRAMEWORK_SEARCH_PATHS = "$(PROJECT_DIR)"; FRAMEWORK_SEARCH_PATHS = "$(PROJECT_DIR)";
INFOPLIST_FILE = deepspeech_ios_test/Info.plist; INFOPLIST_FILE = stt_ios_test/Info.plist;
LD_RUNPATH_SEARCH_PATHS = ( LD_RUNPATH_SEARCH_PATHS = (
"$(inherited)", "$(inherited)",
"@executable_path/Frameworks", "@executable_path/Frameworks",
@ -511,7 +511,7 @@
"$(inherited)", "$(inherited)",
"$(PROJECT_DIR)", "$(PROJECT_DIR)",
); );
PRODUCT_BUNDLE_IDENTIFIER = "org.deepspeech.deepspeech-ios-test"; PRODUCT_BUNDLE_IDENTIFIER = "ai.coqui.stt-ios-test";
PRODUCT_NAME = "$(TARGET_NAME)"; PRODUCT_NAME = "$(TARGET_NAME)";
SWIFT_VERSION = 5.0; SWIFT_VERSION = 5.0;
TARGETED_DEVICE_FAMILY = "1,2"; TARGETED_DEVICE_FAMILY = "1,2";
@ -524,18 +524,18 @@
ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES; ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES;
BUNDLE_LOADER = "$(TEST_HOST)"; BUNDLE_LOADER = "$(TEST_HOST)";
CODE_SIGN_STYLE = Automatic; CODE_SIGN_STYLE = Automatic;
INFOPLIST_FILE = deepspeech_ios_testTests/Info.plist; INFOPLIST_FILE = stt_ios_testTests/Info.plist;
IPHONEOS_DEPLOYMENT_TARGET = 13.5; IPHONEOS_DEPLOYMENT_TARGET = 13.5;
LD_RUNPATH_SEARCH_PATHS = ( LD_RUNPATH_SEARCH_PATHS = (
"$(inherited)", "$(inherited)",
"@executable_path/Frameworks", "@executable_path/Frameworks",
"@loader_path/Frameworks", "@loader_path/Frameworks",
); );
PRODUCT_BUNDLE_IDENTIFIER = "org.deepspeech.deepspeech-ios-testTests"; PRODUCT_BUNDLE_IDENTIFIER = "ai.coqui.stt-ios-testTests";
PRODUCT_NAME = "$(TARGET_NAME)"; PRODUCT_NAME = "$(TARGET_NAME)";
SWIFT_VERSION = 5.0; SWIFT_VERSION = 5.0;
TARGETED_DEVICE_FAMILY = "1,2"; TARGETED_DEVICE_FAMILY = "1,2";
TEST_HOST = "$(BUILT_PRODUCTS_DIR)/deepspeech_ios_test.app/deepspeech_ios_test"; TEST_HOST = "$(BUILT_PRODUCTS_DIR)/stt_ios_test.app/stt_ios_test";
}; };
name = Debug; name = Debug;
}; };
@ -545,18 +545,18 @@
ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES; ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES;
BUNDLE_LOADER = "$(TEST_HOST)"; BUNDLE_LOADER = "$(TEST_HOST)";
CODE_SIGN_STYLE = Automatic; CODE_SIGN_STYLE = Automatic;
INFOPLIST_FILE = deepspeech_ios_testTests/Info.plist; INFOPLIST_FILE = stt_ios_testTests/Info.plist;
IPHONEOS_DEPLOYMENT_TARGET = 13.5; IPHONEOS_DEPLOYMENT_TARGET = 13.5;
LD_RUNPATH_SEARCH_PATHS = ( LD_RUNPATH_SEARCH_PATHS = (
"$(inherited)", "$(inherited)",
"@executable_path/Frameworks", "@executable_path/Frameworks",
"@loader_path/Frameworks", "@loader_path/Frameworks",
); );
PRODUCT_BUNDLE_IDENTIFIER = "org.deepspeech.deepspeech-ios-testTests"; PRODUCT_BUNDLE_IDENTIFIER = "ai.coqui.stt-ios-testTests";
PRODUCT_NAME = "$(TARGET_NAME)"; PRODUCT_NAME = "$(TARGET_NAME)";
SWIFT_VERSION = 5.0; SWIFT_VERSION = 5.0;
TARGETED_DEVICE_FAMILY = "1,2"; TARGETED_DEVICE_FAMILY = "1,2";
TEST_HOST = "$(BUILT_PRODUCTS_DIR)/deepspeech_ios_test.app/deepspeech_ios_test"; TEST_HOST = "$(BUILT_PRODUCTS_DIR)/stt_ios_test.app/stt_ios_test";
}; };
name = Release; name = Release;
}; };
@ -565,17 +565,17 @@
buildSettings = { buildSettings = {
ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES; ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES;
CODE_SIGN_STYLE = Automatic; CODE_SIGN_STYLE = Automatic;
INFOPLIST_FILE = deepspeech_ios_testUITests/Info.plist; INFOPLIST_FILE = stt_ios_testUITests/Info.plist;
LD_RUNPATH_SEARCH_PATHS = ( LD_RUNPATH_SEARCH_PATHS = (
"$(inherited)", "$(inherited)",
"@executable_path/Frameworks", "@executable_path/Frameworks",
"@loader_path/Frameworks", "@loader_path/Frameworks",
); );
PRODUCT_BUNDLE_IDENTIFIER = "org.deepspeech.deepspeech-ios-testUITests"; PRODUCT_BUNDLE_IDENTIFIER = "ai.coqui.stt-ios-testUITests";
PRODUCT_NAME = "$(TARGET_NAME)"; PRODUCT_NAME = "$(TARGET_NAME)";
SWIFT_VERSION = 5.0; SWIFT_VERSION = 5.0;
TARGETED_DEVICE_FAMILY = "1,2"; TARGETED_DEVICE_FAMILY = "1,2";
TEST_TARGET_NAME = deepspeech_ios_test; TEST_TARGET_NAME = stt_ios_test;
}; };
name = Debug; name = Debug;
}; };
@ -584,24 +584,24 @@
buildSettings = { buildSettings = {
ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES; ALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES;
CODE_SIGN_STYLE = Automatic; CODE_SIGN_STYLE = Automatic;
INFOPLIST_FILE = deepspeech_ios_testUITests/Info.plist; INFOPLIST_FILE = stt_ios_testUITests/Info.plist;
LD_RUNPATH_SEARCH_PATHS = ( LD_RUNPATH_SEARCH_PATHS = (
"$(inherited)", "$(inherited)",
"@executable_path/Frameworks", "@executable_path/Frameworks",
"@loader_path/Frameworks", "@loader_path/Frameworks",
); );
PRODUCT_BUNDLE_IDENTIFIER = "org.deepspeech.deepspeech-ios-testUITests"; PRODUCT_BUNDLE_IDENTIFIER = "ai.coqui.stt-ios-testUITests";
PRODUCT_NAME = "$(TARGET_NAME)"; PRODUCT_NAME = "$(TARGET_NAME)";
SWIFT_VERSION = 5.0; SWIFT_VERSION = 5.0;
TARGETED_DEVICE_FAMILY = "1,2"; TARGETED_DEVICE_FAMILY = "1,2";
TEST_TARGET_NAME = deepspeech_ios_test; TEST_TARGET_NAME = stt_ios_test;
}; };
name = Release; name = Release;
}; };
/* End XCBuildConfiguration section */ /* End XCBuildConfiguration section */
/* Begin XCConfigurationList section */ /* Begin XCConfigurationList section */
50F787EA2497683900D52237 /* Build configuration list for PBXProject "deepspeech_ios_test" */ = { 50F787EA2497683900D52237 /* Build configuration list for PBXProject "stt_ios_test" */ = {
isa = XCConfigurationList; isa = XCConfigurationList;
buildConfigurations = ( buildConfigurations = (
50F788172497683A00D52237 /* Debug */, 50F788172497683A00D52237 /* Debug */,
@ -610,7 +610,7 @@
defaultConfigurationIsVisible = 0; defaultConfigurationIsVisible = 0;
defaultConfigurationName = Release; defaultConfigurationName = Release;
}; };
50F788192497683A00D52237 /* Build configuration list for PBXNativeTarget "deepspeech_ios_test" */ = { 50F788192497683A00D52237 /* Build configuration list for PBXNativeTarget "stt_ios_test" */ = {
isa = XCConfigurationList; isa = XCConfigurationList;
buildConfigurations = ( buildConfigurations = (
50F7881A2497683A00D52237 /* Debug */, 50F7881A2497683A00D52237 /* Debug */,
@ -619,7 +619,7 @@
defaultConfigurationIsVisible = 0; defaultConfigurationIsVisible = 0;
defaultConfigurationName = Release; defaultConfigurationName = Release;
}; };
50F7881C2497683A00D52237 /* Build configuration list for PBXNativeTarget "deepspeech_ios_testTests" */ = { 50F7881C2497683A00D52237 /* Build configuration list for PBXNativeTarget "stt_ios_testTests" */ = {
isa = XCConfigurationList; isa = XCConfigurationList;
buildConfigurations = ( buildConfigurations = (
50F7881D2497683A00D52237 /* Debug */, 50F7881D2497683A00D52237 /* Debug */,
@ -628,7 +628,7 @@
defaultConfigurationIsVisible = 0; defaultConfigurationIsVisible = 0;
defaultConfigurationName = Release; defaultConfigurationName = Release;
}; };
50F7881F2497683A00D52237 /* Build configuration list for PBXNativeTarget "deepspeech_ios_testUITests" */ = { 50F7881F2497683A00D52237 /* Build configuration list for PBXNativeTarget "stt_ios_testUITests" */ = {
isa = XCConfigurationList; isa = XCConfigurationList;
buildConfigurations = ( buildConfigurations = (
50F788202497683A00D52237 /* Debug */, 50F788202497683A00D52237 /* Debug */,

View File

@ -2,6 +2,6 @@
<Workspace <Workspace
version = "1.0"> version = "1.0">
<FileRef <FileRef
location = "self:deepspeech_ios_test.xcodeproj"> location = "self:stt_ios_test.xcodeproj">
</FileRef> </FileRef>
</Workspace> </Workspace>

View File

@ -15,9 +15,9 @@
<BuildableReference <BuildableReference
BuildableIdentifier = "primary" BuildableIdentifier = "primary"
BlueprintIdentifier = "50F787EE2497683900D52237" BlueprintIdentifier = "50F787EE2497683900D52237"
BuildableName = "deepspeech_ios_test.app" BuildableName = "stt_ios_test.app"
BlueprintName = "deepspeech_ios_test" BlueprintName = "stt_ios_test"
ReferencedContainer = "container:deepspeech_ios_test.xcodeproj"> ReferencedContainer = "container:stt_ios_test.xcodeproj">
</BuildableReference> </BuildableReference>
</BuildActionEntry> </BuildActionEntry>
</BuildActionEntries> </BuildActionEntries>
@ -33,9 +33,9 @@
<BuildableReference <BuildableReference
BuildableIdentifier = "primary" BuildableIdentifier = "primary"
BlueprintIdentifier = "50F788042497683A00D52237" BlueprintIdentifier = "50F788042497683A00D52237"
BuildableName = "deepspeech_ios_testTests.xctest" BuildableName = "stt_ios_testTests.xctest"
BlueprintName = "deepspeech_ios_testTests" BlueprintName = "stt_ios_testTests"
ReferencedContainer = "container:deepspeech_ios_test.xcodeproj"> ReferencedContainer = "container:stt_ios_test.xcodeproj">
</BuildableReference> </BuildableReference>
</TestableReference> </TestableReference>
<TestableReference <TestableReference
@ -43,9 +43,9 @@
<BuildableReference <BuildableReference
BuildableIdentifier = "primary" BuildableIdentifier = "primary"
BlueprintIdentifier = "50F7880F2497683A00D52237" BlueprintIdentifier = "50F7880F2497683A00D52237"
BuildableName = "deepspeech_ios_testUITests.xctest" BuildableName = "stt_ios_testUITests.xctest"
BlueprintName = "deepspeech_ios_testUITests" BlueprintName = "stt_ios_testUITests"
ReferencedContainer = "container:deepspeech_ios_test.xcodeproj"> ReferencedContainer = "container:stt_ios_test.xcodeproj">
</BuildableReference> </BuildableReference>
</TestableReference> </TestableReference>
</Testables> </Testables>
@ -66,9 +66,9 @@
<BuildableReference <BuildableReference
BuildableIdentifier = "primary" BuildableIdentifier = "primary"
BlueprintIdentifier = "50F787EE2497683900D52237" BlueprintIdentifier = "50F787EE2497683900D52237"
BuildableName = "deepspeech_ios_test.app" BuildableName = "stt_ios_test.app"
BlueprintName = "deepspeech_ios_test" BlueprintName = "stt_ios_test"
ReferencedContainer = "container:deepspeech_ios_test.xcodeproj"> ReferencedContainer = "container:stt_ios_test.xcodeproj">
</BuildableReference> </BuildableReference>
</BuildableProductRunnable> </BuildableProductRunnable>
</LaunchAction> </LaunchAction>
@ -83,9 +83,9 @@
<BuildableReference <BuildableReference
BuildableIdentifier = "primary" BuildableIdentifier = "primary"
BlueprintIdentifier = "50F787EE2497683900D52237" BlueprintIdentifier = "50F787EE2497683900D52237"
BuildableName = "deepspeech_ios_test.app" BuildableName = "stt_ios_test.app"
BlueprintName = "deepspeech_ios_test" BlueprintName = "stt_ios_test"
ReferencedContainer = "container:deepspeech_ios_test.xcodeproj"> ReferencedContainer = "container:stt_ios_test.xcodeproj">
</BuildableReference> </BuildableReference>
</BuildableProductRunnable> </BuildableProductRunnable>
</ProfileAction> </ProfileAction>

View File

@ -1,6 +1,6 @@
// //
// AppDelegate.swift // AppDelegate.swift
// deepspeech_ios_test // stt_ios_test
// //
// Created by Reuben Morais on 15.06.20. // Created by Reuben Morais on 15.06.20.
// Copyright © 2020 Mozilla // Copyright © 2020 Mozilla

View File

@ -1,6 +1,6 @@
// //
// AudioContext.swift // AudioContext.swift
// deepspeech_ios_test // stt_ios_test
// //
// Created by Erik Ziegler on 27.07.20. // Created by Erik Ziegler on 27.07.20.
// Copyright © 2020 Mozilla // Copyright © 2020 Mozilla
@ -12,7 +12,7 @@ import AVFoundation
import AudioToolbox import AudioToolbox
import Accelerate import Accelerate
import deepspeech_ios import stt_ios
/// Holds audio information used for building waveforms /// Holds audio information used for building waveforms
final class AudioContext { final class AudioContext {

View File

@ -1,6 +1,6 @@
// //
// ContentView.swift // ContentView.swift
// deepspeech_ios_test // stt_ios_test
// //
// Created by Reuben Morais on 15.06.20. // Created by Reuben Morais on 15.06.20.
// Copyright © 2020 Mozilla // Copyright © 2020 Mozilla

Some files were not shown because too many files have changed in this diff.