Language-Specific Data

This directory contains language-specific data files, most importantly:

  1. A list of the unique characters of the target language (e.g. English) in data/alphabet.txt. After installing the training code, run python -m deepspeech_training.util.check_characters --help for details on a tool that creates an alphabet file from a list of training CSV files.
  2. A script used to generate a binary n-gram language model: data/lm/generate_lm.py.
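
To illustrate what the alphabet file in item 1 holds, here is a minimal sketch that gathers the unique characters from the transcripts of one or more training CSV files. It is not the check_characters tool itself: the function name collect_alphabet, the assumption that transcripts sit in a column named transcript, and the sample rows are all illustrative.

```python
import csv
import io


def collect_alphabet(csv_streams, column="transcript"):
    """Return the sorted unique characters found in one CSV column.

    csv_streams: iterable of open text streams, each a CSV with a header row.
    column: header name of the transcript column (assumed, not confirmed here).
    """
    chars = set()
    for stream in csv_streams:
        for row in csv.DictReader(stream):
            # Adding a string to a set adds each of its characters.
            chars.update(row[column])
    return sorted(chars)


# Tiny in-memory stand-in for a training CSV (hypothetical contents).
sample = io.StringIO(
    "wav_filename,wav_filesize,transcript\n"
    "a.wav,1000,hello world\n"
    "b.wav,2000,good day\n"
)
alphabet = collect_alphabet([sample])

# Print one character per line; the space character shows up as a blank-looking line.
print("\n".join(alphabet))
```

With real data, the streams would come from open(path, newline="", encoding="utf-8") for each CSV file. The result should be compared against data/alphabet.txt rather than written over it, since the exact file format is defined by the training code.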

For more information on how to build these resources from scratch, see the External scorer scripts section on deepspeech.readthedocs.io.