Introduction

This is the official repository of ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation. It includes the code for fine-tuning Whisper and mBART, creating the pseudo dataset, and fine-tuning the ComSL model.

Preparation

To run the code, first install the requirements:

pip install -r requirements.txt

Then download the CoVoST2 dataset by following the instructions on the CoVoST2 GitHub page.

Training

To launch training, set data_root in the config files under config/exp_spec to the root of the CoVoST2 dataset. Then start training with:

python3 main.py -c XXX.yaml

where XXX.yaml is one of the configuration files in config/exp_spec.
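
If you prefer to script the data_root change rather than edit each config by hand, the following is a minimal sketch. It assumes that data_root appears in the config as a plain `data_root: <path>` line; the actual layout of the files in config/exp_spec may differ, and the helper names `set_yaml_scalar` and `update_config_file` are hypothetical and not part of this repository.

```python
import re
from pathlib import Path


def set_yaml_scalar(text: str, key: str, value: str) -> str:
    """Replace the value on the first `key: ...` line of YAML text.

    Assumption: the key sits on its own line as `key: value`, optionally
    followed by a `# comment`. Indentation and the comment are preserved.
    """
    pattern = re.compile(
        rf"^(?P<indent>[ \t]*){re.escape(key)}:[ \t]*"
        r"(?P<old>[^#\n]*?)(?P<tail>[ \t]*(?:#.*)?)$",
        re.MULTILINE,
    )
    if pattern.search(text) is None:
        raise KeyError(f"no '{key}:' line found")
    return pattern.sub(
        lambda m: f"{m['indent']}{key}: {value}{m['tail']}",
        text,
        count=1,
    )


def update_config_file(path: str, key: str, value: str) -> None:
    """Rewrite one config file in place with the new value for `key`."""
    config_path = Path(path)
    original = config_path.read_text(encoding="utf-8")
    config_path.write_text(set_yaml_scalar(original, key, value), encoding="utf-8")
```

For example, `update_config_file("config/exp_spec/XXX.yaml", "data_root", "/path/to/covost2")` would point that config at your local copy of the dataset; both paths here are placeholders.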

Training with Pseudo Data

To train with pseudo data, first download and extract the Common Voice dataset from the Common Voice website. Then modify the data path and pretrained model path in create_pseudo_data.py and run the script.

After that, set cv_data_root in config/exp_spec/comsl.yaml to the path of the Common Voice dataset and uncomment the relevant languages in avail_lang_extra. Finally, run the training script as above:

python3 main.py -c comsl.yaml
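
The uncommenting step can also be scripted. The sketch below assumes that avail_lang_extra is a YAML list whose disabled languages are written as commented-out items such as `# - de`; this layout is a guess, so check comsl.yaml before relying on it. The function name `uncomment_list_items` and the language codes used in the example are hypothetical.

```python
import re
from typing import Iterable


def uncomment_list_items(text: str, key: str, wanted: Iterable[str]) -> str:
    """Turn `# - item` into `- item` for the wanted items listed under `key:`.

    Assumption: the block under `key:` contains only list items and comments;
    the first other non-empty line ends the block.
    """
    wanted_set = set(wanted)
    key_line = re.compile(rf"^\s*{re.escape(key)}:\s*(#.*)?$")
    commented_item = re.compile(r"^(\s*)#\s*-\s*(\S+)(.*)$")
    result = []
    in_block = False
    for line in text.splitlines(keepends=True):
        body = line.rstrip("\r\n")
        ending = line[len(body):]  # keep the original line ending
        stripped = body.strip()
        if key_line.match(body):
            in_block = True
        elif in_block:
            match = commented_item.match(body)
            if match and match.group(2) in wanted_set:
                # Drop the comment marker, keep indentation and any trailing text.
                line = f"{match.group(1)}- {match.group(2)}{match.group(3)}{ending}"
            elif stripped and not stripped.startswith(("-", "#")):
                in_block = False  # reached the next key
        result.append(line)
    return "".join(result)
```

Calling `uncomment_list_items(config_text, "avail_lang_extra", ["de", "fr"])` on the contents of comsl.yaml would enable those two entries and leave the rest of the file unchanged, provided the file follows the assumed layout.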

Citation

@misc{le2023comsl,
      title={ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation}, 
      author={Chenyang Le and Yao Qian and Long Zhou and Shujie Liu and Yanmin Qian and Michael Zeng and Xuedong Huang},
      year={2023},
      eprint={2305.14838},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
