Official Implementation of Amuse: Human-AI Collaborative Songwriting with Multimodal Inspirations

elianakim/Amuse

Amuse: Human-AI Collaborative Songwriting with Multimodal Inspirations

Yewon Kim (KAIST) · Sung-Ju Lee (KAIST) · Chris Donahue (Carnegie Mellon University)

This repository contains the code for the chord generation method used in Amuse. If you have any questions, please reach out to Yewon Kim at [email protected].

Run

1. Environment setup

conda env create -f amuse.yml -n amuse
conda activate amuse

2. Configure API Key(s)

Create a file ./assets/api_keys.csv and add your API key(s) in the following format:

host key
openai sk-proj-**********************************

Replace sk-proj-********************************** with your OpenAI API key.
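
To check that the key file parses before launching a run that calls the API, a minimal sketch like the following may help. It is not the repository's own loader: the function name `load_api_keys` is hypothetical, and the comma delimiter is an assumption based on the `.csv` extension (the format shown above does not make the separator explicit), so pass a different `delimiter` if your file differs.

```python
import csv
from pathlib import Path


def load_api_keys(path="./assets/api_keys.csv", delimiter=","):
    """Return a {host: key} dict from a two-column file with a `host`/`key` header.

    Assumption: columns are separated by `delimiter` (comma by default).
    This helper is illustrative and is not part of the Amuse codebase.
    """
    with Path(path).open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f, delimiter=delimiter)
        # Strip stray whitespace so a padded key does not cause auth failures.
        return {row["host"].strip(): row["key"].strip() for row in reader}
```

Calling `load_api_keys()` from the repository root should return a dictionary with an `openai` entry if the file matches the assumed layout.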

3. Download Datasets

Download the Hooktheory dataset. For details on the dataset, refer to this paper.

cd ./dataset/Hooktheory
wget https://sheetsage.s3.amazonaws.com/hooktheory/Hooktheory.json.gz
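
After the download, a quick sanity check can confirm that the archive is intact. The sketch below only decompresses and parses the file and reports the top-level container type and size; it makes no assumption about the dataset's internal schema. The function names are hypothetical, and the default path assumes you run it from the repository root.

```python
import gzip
import json


def load_hooktheory(path="./dataset/Hooktheory/Hooktheory.json.gz"):
    """Decompress and parse the downloaded archive in one step."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)


def summarize(data):
    """Return the top-level container type name and its number of entries."""
    return type(data).__name__, len(data)
```

For example, `summarize(load_hooktheory())` raises an error if the download was truncated and otherwise reports how many top-level entries were parsed.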

Download the LLM-generated chords (pre-generated with GPT-4o-2024-05-13). Run these commands from the repository root:

cd ./dataset/llmchords
wget https://yewon-kim.com/uploads/publications/2025-amuse/chords.txt

4. Training

To train a unimodal prior on the Hooktheory dataset (P(x) in the paper), run the following:

python train.py --dataset hooktheory 

To train a unimodal prior on LLM-generated chord progressions (Q(x) in the paper), run the following:

python train.py --dataset llmchords 

Note: If the path given by --llmchords_path (default: ./dataset/llmchords/chords.txt) does not exist, the script generates chords through the ChatGPT API, which incurs API costs. Use the --llmchords_num argument to limit API calls. By default, the model trains on the pre-generated chords created with GPT-4o-2024-05-13, which may be outdated. For reliable chord generation in interactive mode (see section 6: Chord Generation), we recommend generating new chords with the latest models.
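
To avoid triggering paid API calls by accident, you can confirm that the pre-generated file is in place before training. The helper below is a hypothetical pre-flight check, not part of the repository. It treats an empty file as missing, which is a conservative assumption; the training script itself is only documented as checking whether the path exists.

```python
from pathlib import Path


def pregenerated_chords_available(path="./dataset/llmchords/chords.txt"):
    """True if `path` is a regular, non-empty file.

    Assumption: an empty file is treated as unusable, which is stricter
    than the existence check described in the note above.
    """
    p = Path(path)
    return p.is_file() and p.stat().st_size > 0
```

If this returns False, download chords.txt as described in step 3 or set --llmchords_num deliberately before running train.py.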

5. Evaluation

To compute the similarity between generated chords and the Hooktheory dataset, run:

python evaluate.py --px_path <path_to_px> --px_step <step_to_load> --qx_path <path_to_qx> --qx_step <step_to_load> 

Recommended values for rejection sampling parameters:

  • M (--rej_M): 4.0-8.0
  • Temperature (--rej_T): 1.7-2.5
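
For intuition about what M and the temperature control, here is a toy sketch of rejection sampling with a proposal Q(x) and a target P(x). It is not the repository's implementation. Classic rejection sampling accepts a proposal x drawn from Q with probability P(x) / (M · Q(x)). How the temperature enters in Amuse is not stated here, so dividing the log-ratio by T is an assumption made purely for illustration, and all function names are hypothetical.

```python
import math
import random


def acceptance_probability(log_p, log_q, M, T=1.0):
    """Probability of keeping a proposal x drawn from Q.

    Computes min(1, (P(x) / Q(x)) ** (1 / T) / M) in log space.
    With T = 1 this is the classic P(x) / (M * Q(x)) rule; the role of T
    is an illustrative assumption, not the paper's confirmed formula.
    """
    log_accept = (log_p - log_q) / T - math.log(M)
    # exp() is only called on negative values, so it cannot overflow.
    return 1.0 if log_accept >= 0 else math.exp(log_accept)


def rejection_sample(propose, log_p, log_q, M, T=1.0, max_tries=1000, rng=random):
    """Draw proposals until one is accepted; return None if none is.

    `propose` returns a candidate x; `log_p` and `log_q` map x to its
    log-likelihood under the target and proposal models respectively.
    """
    for _ in range(max_tries):
        x = propose()
        if rng.random() < acceptance_probability(log_p(x), log_q(x), M, T):
            return x
    return None
```

Under this sketch, a larger M lowers every acceptance probability (fewer, more selective samples), while a larger T flattens the likelihood ratio so that acceptance depends less sharply on how P and Q disagree.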

6. Chord Generation

To interactively generate chords based on keywords, run:

python generate.py --px_path <path_to_px> --px_step <step_to_load> --qx_path <path_to_qx> --qx_step <step_to_load> --rej_M <threshold_M> --rej_T <temperature>

This opens an interactive terminal where you can enter keywords and generate chords with two methods: (i) random sampling from the LLM (GPT-4o) and (ii) rejection sampling from the LLM (Amuse).

Notes

This code may not exactly reproduce the results in the paper because of possible inconsistencies introduced during code preparation and changes in OpenAI API versions. For inquiries, please contact Yewon Kim at [email protected].

Citation

If you use this repository in your research, please cite:

@article{kim2024amuse,
    title={Amuse: Human-AI Collaborative Songwriting with Multimodal Inspirations},
    author={Kim, Yewon and Lee, Sung-Ju and Donahue, Chris},
    year={2024},
    journal={arXiv preprint arXiv:2412.18940},
}
