No fancy design is needed: just a quality injection, and you can enjoy your beautiful music.
We recommend listening to our demo. Even with the cluttered tags in MusicCaps, the model still performs well.
Download the main checkpoint of our QA-MDT model from https://huggingface.co/lichang0928/QA-MDT
Users in China can also download the checkpoint through the following link:
https://pan.baidu.com/s/1pkLnQhbNeFjKRadXUy_7Iw?pwd=v9dd
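As a scripted alternative to a browser download, the sketch below fetches the Hugging Face repository with the `huggingface_hub` package. It assumes `huggingface_hub` is installed (`pip install huggingface_hub`). The helper name `download_qa_mdt` and the `./checkpoints/QA-MDT` target directory are illustrative choices, not paths this repository is known to require.

```python
# Repository id taken from the Hugging Face link above.
REPO_ID = "lichang0928/QA-MDT"


def download_qa_mdt(local_dir: str = "./checkpoints/QA-MDT") -> str:
    """Download every file in the QA-MDT model repo and return the local folder.

    `local_dir` is a hypothetical destination; move the files to wherever
    your configs expect them.
    """
    # Imported lazily so this module can be loaded without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


# Usage (requires network access):
#     path = download_qa_mdt()
#     print("checkpoint files saved under", path)
```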
This repository provides an implementation of QA-MDT, integrating state-of-the-art models for music generation. The code and methods are based on the following repositories:
Python 3.10
qamdt.yaml
Before training, download the extra checkpoints referenced in ./audioldm_train/config/mos_as_token/qa_mdt.yaml and offset_pretrained_checkpoints.json.
Note: all of the above checkpoints can be downloaded from:
sh run.sh
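Because training depends on the checkpoints named in the two config files above, a quick pre-flight check before launching run.sh may save a failed start. The following is a minimal sketch, not part of the repository: it scans the config text for path-like strings and reports the ones missing on disk. The list of file extensions and the helper names are assumptions, and the scan is a heuristic that does not parse YAML or JSON.

```python
import re
from pathlib import Path

# Assumed checkpoint extensions; adjust to what the configs actually reference.
CKPT_PATTERN = re.compile(r"[\w./\-]+\.(?:ckpt|pth|pt|bin|safetensors)\b")


def find_checkpoint_paths(config_text: str) -> list[str]:
    """Return unique path-like strings that end in a checkpoint extension."""
    found: list[str] = []
    for match in CKPT_PATTERN.findall(config_text):
        if match not in found:
            found.append(match)
    return found


def missing_checkpoints(config_file: str, root: str = ".") -> list[str]:
    """List checkpoint paths mentioned in config_file that do not exist under root."""
    text = Path(config_file).read_text(encoding="utf-8")
    return [p for p in find_checkpoint_paths(text) if not (Path(root) / p).exists()]


if __name__ == "__main__":
    configs = (
        "./audioldm_train/config/mos_as_token/qa_mdt.yaml",
        "offset_pretrained_checkpoints.json",
    )
    for cfg in configs:
        if not Path(cfg).exists():
            print(f"config not found: {cfg}")
            continue
        for path in missing_checkpoints(cfg):
            print(f"missing checkpoint referenced in {cfg}: {path}")
```

Run it from the repository root so that relative paths resolve the same way they would during training.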
How to make your dataset for training or finetuning?
sh infer/infer.sh
# You can edit infer.sh to choose which quality level to infer with.
# By default it is set to 5, which represents the highest quality.
# Additionally, it may help to start the prompt with the text prefix "high quality",
# which matches the training process and may further improve performance.
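The "high quality" prefix mentioned above can be applied to prompts programmatically. This is a small sketch under assumptions: the helper name `with_quality_prefix` is hypothetical, and joining the prefix and the prompt with a comma is a guess at the format, so check it against the captions used in training.

```python
def with_quality_prefix(prompt: str, prefix: str = "high quality") -> str:
    """Prepend the quality prefix unless the prompt already starts with it."""
    cleaned = prompt.strip()
    if cleaned.lower().startswith(prefix.lower()):
        return cleaned
    # The ", " separator is an assumption, not something the source specifies.
    return f"{prefix}, {cleaned}"


print(with_quality_prefix("a calm piano melody with soft strings"))
```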
This is the first time I have open-sourced a project like this, so the code, its organization, and the release itself may not be perfect. If you have any questions about our model, code, or datasets, feel free to contact me via the links below. I look forward to any suggestions:
- Email: [email protected]
- WeChat: 19524292801
If you find this project useful, please consider citing:
@article{li2024quality,
title={Quality-aware Masked Diffusion Transformer for Enhanced Music Generation},
author={Li, Chang and Wang, Ruoyu and Liu, Lijuan and Du, Jun and Sun, Yixuan and Guo, Zilu and Zhang, Zhenrong and Jiang, Yuan},
journal={arXiv preprint arXiv:2405.15863},
year={2024}
}