jadechoghari/qa-mdt

Awesome text-to-music generation (TTM): QA-MDT

Official PyTorch Implementation

No fancy design, just a quality injection. Enjoy your beautiful music!

I recommend that everyone listen to our demo: even with the cluttered labels in MusicCaps, the model still performs well.

Download the main checkpoint of our QA-MDT model from https://huggingface.co/lichang0928/QA-MDT

Users in China can also download the checkpoint through the following link:

https://pan.baidu.com/s/1pkLnQhbNeFjKRadXUy_7Iw?pwd=v9dd
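
As an alternative to downloading through the browser, the Hugging Face repository above can be fetched from Python. This is a minimal sketch, assuming the `huggingface_hub` package is installed; the `local_dir` value is a hypothetical location and is not taken from this repository, so adjust it to wherever your configs expect the checkpoint.

```python
from pathlib import Path

# Repository id taken from the Hugging Face link above.
REPO_ID = "lichang0928/QA-MDT"


def download_checkpoint(local_dir: str = "checkpoints/qa_mdt") -> Path:
    """Download every file in the model repo into local_dir and return that path.

    local_dir is a hypothetical default; the training and inference configs
    may expect the checkpoint somewhere else.
    """
    # Imported lazily so this module can be loaded without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = Path(local_dir)
    target.mkdir(parents=True, exist_ok=True)
    snapshot_download(repo_id=REPO_ID, local_dir=str(target))
    return target


if __name__ == "__main__":
    print(f"Checkpoint files saved under {download_checkpoint()}")
```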

Overview

This repository provides an implementation of QA-MDT, integrating state-of-the-art models for music generation. The code and methods are based on the following repositories:

Requirements

Python 3.10
qamdt.yaml

Before training, download the extra checkpoints referenced in ./audioldm_train/config/mos_as_token/qa_mdt.yaml and offset_pretrained_checkpoints.json.

Note: all of the above checkpoints can be downloaded from:

flan-t5-large

clap_music

roberta-base

others
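
After downloading, it can help to confirm that the files named in offset_pretrained_checkpoints.json are actually on disk before starting a run. The sketch below is only an illustration: the layout of that JSON file is not documented here, so the helper simply treats any string value ending in a checkpoint-like suffix as a path. The suffix list, the function names, and the assumption that the JSON file sits in the current directory are all hypothetical.

```python
import json
from pathlib import Path

# Assumed checkpoint-like suffixes; adjust to whatever the config actually lists.
CHECKPOINT_SUFFIXES = (".ckpt", ".pt", ".pth", ".bin", ".safetensors")


def iter_strings(node):
    """Yield every string value found anywhere inside parsed JSON data."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def missing_checkpoints(data, root="."):
    """Return the checkpoint-like paths in data that do not exist under root."""
    base = Path(root)
    candidates = [s for s in iter_strings(data) if s.endswith(CHECKPOINT_SUFFIXES)]
    return [p for p in candidates if not (base / p).exists()]


if __name__ == "__main__":
    # Assumes the JSON file is in the current working directory.
    with open("offset_pretrained_checkpoints.json", encoding="utf-8") as fh:
        for path in missing_checkpoints(json.load(fh)):
            print(f"missing: {path}")
```

An empty output means every checkpoint-like path found in the file resolved to an existing file or directory.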

Training

sh run.sh

How to make your dataset for training or finetuning?

Inference

sh infer/infer.sh
# You can edit infer.sh to choose which quality level to infer with.
# By default it is set to 5, which represents the highest quality.
# Additionally, it may help to prefix the prompt with the text "high quality",
# which matches the training process and may further improve performance.
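
The prompt-prefix tip above can be applied with a small helper before prompts are passed to inference. This is a hedged sketch, not part of the repository: `add_quality_prefix` is a hypothetical name, and the comma separator is an assumption, since the exact prefix format used during training is not stated here.

```python
QUALITY_PREFIX = "high quality"


def add_quality_prefix(prompt: str, prefix: str = QUALITY_PREFIX) -> str:
    """Prepend the quality prefix unless the prompt already starts with it."""
    text = prompt.strip()
    if text.lower().startswith(prefix.lower()):
        return text
    # The ", " separator is an assumption, not a documented training format.
    return f"{prefix}, {text}"


if __name__ == "__main__":
    print(add_quality_prefix("a calm piano piece with soft strings"))
```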

Contact

This is the first time I have open-sourced a project like this, so the code, its organization, and the release itself may not be perfect. If you have any questions about our model, code, or datasets, feel free to contact me via the links below. I look forward to any suggestions:

Citation

If you find this project useful, please consider citing:

@article{li2024quality,
  title={Quality-aware Masked Diffusion Transformer for Enhanced Music Generation},
  author={Li, Chang and Wang, Ruoyu and Liu, Lijuan and Du, Jun and Sun, Yixuan and Guo, Zilu and Zhang, Zhenrong and Jiang, Yuan},
  journal={arXiv preprint arXiv:2405.15863},
  year={2024}
}
