
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.


Project DeepSpeech


DeepSpeech is an open-source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier.

Documentation for installation, usage, and training models is available on deepspeech.readthedocs.io.

For the latest release, including pre-trained models and checkpoints, see the latest release on GitHub.

DeepSpeech 4 CommonVoice

  1. Download and extract the CommonVoice Corpus (cv-corpus):

    Download the latest release.

  2. Build the Dockerfile: Dockerfile.train

    docker build -f Dockerfile.train . -t deepspeech/training

    Example project structure:

    App
      cv-corpus
        de
      Docker
        Dockerfile.train
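The example layout above can be created with standard commands (directory names are taken from the example; adjust them to your setup):

```shell
# Create the example project layout: cv-corpus holds the extracted corpus
# (German, "de", in this example) and Docker holds Dockerfile.train.
mkdir -p App/cv-corpus/de App/Docker
touch App/Docker/Dockerfile.train
```

With this layout, running the build command from step 2 inside App/Docker finds Dockerfile.train in the current directory.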

  3. Start the container. It is important to mount the cv-corpus directory as a volume and to enable GPU support:

    docker run -it -v $(pwd)/cv-corpus:/DeepSpeech/data/cv-corpus --gpus all deepspeech/training sh

    Container structure:

    DeepSpeech
      bin
        run-cv-de.sh
      data
        cv-corpus
          de

  4. Run the CommonVoice importer:

    Example:

    ./bin/import_cv2.py --filter_alphabet data/alphabet-utf8.txt data/cv-corpus/de0 --normalize

  5. Start the training.

    Options:
    -d dataset
    -a augmentation (default=false)
    -c cudaDevice (default=0)
    -t alphabet_config_path (default=data/alphabet.txt)

    Examples:

    ./bin/run-cv.sh -d de
    ./bin/run-cv.sh -d de0 -t alphabet-utf8.txt -a true
    ./bin/run-cv.sh -d en -t alphabet-utf8.txt -c 1 -a true
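For intuition, the option interface listed above can be sketched with POSIX getopts. This is an illustration of how such a script might parse its flags, not the actual contents of run-cv.sh; the defaults mirror the list above:

```shell
# Sample invocation (same flags as the second example above).
set -- -d de0 -t alphabet-utf8.txt -a true

# Defaults as documented: augmentation off, CUDA device 0, default alphabet.
dataset=""
augmentation=false
cuda_device=0
alphabet_path="data/alphabet.txt"

while getopts "d:a:c:t:" opt; do
  case "$opt" in
    d) dataset="$OPTARG" ;;
    a) augmentation="$OPTARG" ;;
    c) cuda_device="$OPTARG" ;;
    t) alphabet_path="$OPTARG" ;;
  esac
done

echo "dataset=$dataset augmentation=$augmentation cuda=$cuda_device alphabet=$alphabet_path"
# prints: dataset=de0 augmentation=true cuda=0 alphabet=alphabet-utf8.txt
```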

  6. Finalize the model

    ./convert_graphdef_memmapped_format --in_graph=models/en/output_graph.pb --out_graph=models/en/output_graph.pbmm

  7. Transfer learning

    Currently in the container: 7b595

    Options:
    -d dataset
    -c cudaDevice (default=0)
    -t alphabet_config_path (default=data/alphabet.txt)

    Example:

    ./bin/transfer-learning.sh -d de0 -t alphabet-utf8.txt -c 1

  8. Language model

    See the corresponding folder: data/lm

Language Specific Adjustments

  1. Overwrite data/alphabet.txt with all characters of your language.
  2. Create your own training script (e.g. ./bin/run-cv-en.sh)
  3. Watch out for paths (e.g. /de -> /en)
  • If the default alphabet does not cover your language, replace data/alphabet.txt with the output of the following command:

    python -m deepspeech_training.util.check_characters -csv data/cv-corpus/de/clips/train.csv,data/cv-corpus/de/clips/dev.csv,data/cv-corpus/de/clips/test.csv -unicode -alpha
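Conceptually, check_characters collects the set of unique characters across the transcript columns of the given CSVs. A rough stand-in using only core utilities (the file name and contents here are made up for illustration; the real tool additionally handles CSV quoting and the -unicode/-alpha options):

```shell
# Tiny stand-in CSV with a transcript column.
printf 'wav_filename,wav_filesize,transcript\na.wav,1,hallo\nb.wav,2,welt\n' > sample_train.csv

# List the unique characters appearing in the transcripts, one per line
# (here: a, e, h, l, o, t, w).
cut -d, -f3 sample_train.csv | tail -n +2 | grep -o . | sort -u
```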
