
Awesome text-to-music generation (TTM): QA-MDT

Official PyTorch Implementation

No fancy design, just quality injection. Enjoy your beautiful music.

We recommend listening to our demo: even with the cluttered tags in MusicCaps, the model still performs well.

Download the main checkpoint of our QA-MDT model from https://huggingface.co/lichang0928/QA-MDT

For users in China, the checkpoint can also be downloaded through the following link:

https://pan.baidu.com/s/1pkLnQhbNeFjKRadXUy_7Iw?pwd=v9dd
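
As a minimal sketch, the Hugging Face checkpoint can also be fetched from Python. This assumes the `huggingface_hub` package is installed, which this README does not list as a requirement. The `download_checkpoint` helper and the `checkpoints/QA-MDT` target directory are hypothetical choices for illustration, not paths the repository requires.

```python
from pathlib import Path

# Repository id taken from the Hugging Face link above.
REPO_ID = "lichang0928/QA-MDT"


def download_checkpoint(target_dir: str = "checkpoints/QA-MDT") -> Path:
    """Download every file of the QA-MDT model repo into target_dir.

    target_dir is a hypothetical location chosen for this sketch.
    """
    # Imported lazily so this module loads even without huggingface_hub.
    from huggingface_hub import snapshot_download

    local_path = snapshot_download(repo_id=REPO_ID, local_dir=target_dir)
    return Path(local_path)
```

Calling `download_checkpoint()` returns the local directory that holds the downloaded files. Where the training and inference scripts expect the checkpoint to live is set by the repository's own configs, so move or link the files accordingly.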

Overview

This repository provides an implementation of QA-MDT, integrating state-of-the-art models for music generation. The code and methods are based on the following repositories:

Requirements

Python 3.10
qamdt.yml

Before training, download the extra checkpoints referenced in ./audioldm_train/config/mos_as_token/qa_mdt.yaml and offset_pretrained_checkpoints.json.
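
Before starting a run, it can help to confirm that the extra checkpoints are actually on disk. The sketch below assumes that offset_pretrained_checkpoints.json stores checkpoint locations as string values ending in common weight-file extensions (.ckpt, .pt, .pth, .bin, .safetensors). The real layout of that file is not described here, so adjust the suffix list to match it. `find_checkpoint_paths` and `missing_checkpoints` are hypothetical helper names.

```python
import json
from pathlib import Path

# Assumed set of checkpoint file extensions; extend as needed.
CKPT_SUFFIXES = (".ckpt", ".pt", ".pth", ".bin", ".safetensors")


def find_checkpoint_paths(node):
    """Recursively collect string values that look like checkpoint paths."""
    if isinstance(node, dict):
        for value in node.values():
            yield from find_checkpoint_paths(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_checkpoint_paths(item)
    elif isinstance(node, str) and node.endswith(CKPT_SUFFIXES):
        yield node


def missing_checkpoints(json_file, root="."):
    """Return the referenced checkpoint paths that do not exist under root."""
    with open(json_file, encoding="utf-8") as handle:
        data = json.load(handle)
    return [p for p in find_checkpoint_paths(data) if not (Path(root) / p).exists()]
```

For example, `missing_checkpoints("offset_pretrained_checkpoints.json")` returns a list of paths that still need to be downloaded. An empty list means every referenced file was found.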

Note: all of the above checkpoints can be downloaded from:

Training

sh run.sh

How do you prepare your own dataset for training or fine-tuning?

Inference

sh infer/infer.sh
# You can edit infer.sh to choose which quality level to use for inference.
# By default it is set to 5, which represents the highest quality.
# Additionally, it may help to prefix the prompt with the text "high quality",
# which matches the training process and may further improve performance.
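
The "high quality" prompt prefix mentioned above can be applied with a small helper. This is only a sketch: `add_quality_prefix` is a hypothetical function, and joining the prefix to the prompt with a comma is an assumption, so check how the training captions are formatted before relying on it.

```python
QUALITY_PREFIX = "high quality"


def add_quality_prefix(prompt: str, prefix: str = QUALITY_PREFIX) -> str:
    """Prepend the quality prefix unless the prompt already starts with it."""
    cleaned = prompt.strip()
    if cleaned.lower().startswith(prefix.lower()):
        return cleaned
    # Joining with ", " is an assumption of this sketch.
    return f"{prefix}, {cleaned}"
```

For instance, `add_quality_prefix("a calm piano melody")` gives `"high quality, a calm piano melody"`, while a prompt that already begins with the prefix is returned unchanged.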

Contact

This is the first time I have open-sourced a project like this, so the code, the organization, and the release itself may not be perfect. If you have any questions about our model, code, or datasets, feel free to contact me through the links below. I look forward to any suggestions.

Email: [email protected]
WeChat: 19524292801

Citation

If you find this project useful, please consider citing:

@article{li2024quality,
  title={Quality-aware Masked Diffusion Transformer for Enhanced Music Generation},
  author={Li, Chang and Wang, Ruoyu and Liu, Lijuan and Du, Jun and Sun, Yixuan and Guo, Zilu and Zhang, Zhenrong and Jiang, Yuan},
  journal={arXiv preprint arXiv:2405.15863},
  year={2024}
}
