ASR-BLEU evaluation toolkit

This toolkit provides a set of public ASR models used for evaluating different speech-to-speech translation systems at Meta AI. It enables easier score comparisons between different systems' outputs.

The ASRGenerator wraps different CTC-based ASR models from the HuggingFace and fairseq codebases. A Torchaudio CTC decoder is built on top of it to decode the given audio files.
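
As a rough illustration, the sketch below shows the kind of transcription step such a wrapper performs, assuming a HuggingFace wav2vec 2.0 CTC model and greedy decoding. The model ID and the transcribe helper are illustrative assumptions, not the toolkit's actual API; the toolkit itself selects the model per language from asr_model_cfgs.json and decodes with a Torchaudio CTC decoder.

import torch
import torchaudio
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

# Illustrative English CTC model; the toolkit picks a per-language model instead.
MODEL_ID = "facebook/wav2vec2-large-960h-lv60-self"
processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID).eval()

def transcribe(audio_path: str) -> str:
    waveform, sample_rate = torchaudio.load(audio_path)
    # wav2vec 2.0 CTC models expect 16 kHz input.
    if sample_rate != 16_000:
        waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)
    inputs = processor(waveform.squeeze(0).numpy(),
                       sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    # Greedy CTC decoding for brevity; the toolkit uses a Torchaudio CTC decoder.
    ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(ids)[0]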

Please see asr_model_cfgs.json for the list of currently covered languages.

The high-level pipeline is simple by design: given a language tag, the script loads the corresponding ASR model, transcribes the model's predicted audio, and computes the BLEU score against the provided reference translations using sacrebleu.
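
The scoring step amounts to a corpus-level BLEU over the ASR transcripts, as in this minimal sketch (the hypothesis and reference strings here are made up for illustration):

import sacrebleu

hypotheses = ["the cat sat on the mat"]   # ASR transcripts of the predicted audio
references = [["the cat sat on a mat"]]   # one stream of reference translations
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")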

Dependencies

Please see requirements.txt.

Usage examples

This toolkit has been used with:

Standalone run example

A high-level example; please substitute the arguments as appropriate for your case:

python compute_asr_bleu.py --lang <LANG> \
--audio_dirpath <PATH_TO_AUDIO_DIR> \
--reference_path <PATH_TO_REFERENCES_FILE> \
--reference_format txt

For more details about the arguments, please see the script's argparse help.
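
For example, since the script uses argparse, the standard help flag prints the full argument list:

python compute_asr_bleu.py --help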