FMA: A Dataset For Music Analysis

Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson.
International Society for Music Information Retrieval Conference (ISMIR), 2017.

We introduce the Free Music Archive (FMA), an open and easily accessible dataset suitable for evaluating several tasks in MIR, a field concerned with browsing, searching, and organizing large music collections. The community's growing interest in feature and end-to-end learning is however restrained by the limited availability of large audio datasets. The FMA aims to overcome this hurdle by providing 917 GiB and 343 days of Creative Commons-licensed audio from 106,574 tracks from 16,341 artists and 14,854 albums, arranged in a hierarchical taxonomy of 161 genres. It provides full-length and high-quality audio, pre-computed features, together with track- and user-level metadata, tags, and free-form text such as biographies. We here describe the dataset and how it was created, propose a train/validation/test split and three subsets, discuss some suitable MIR tasks, and evaluate some baselines for genre recognition. Code, data, and usage examples are available at https://github.com/mdeff/fma.

Data

All metadata and features for all tracks are distributed in fma_metadata.zip (342 MiB). The tables below can be used with pandas or any other data analysis tool. See the paper or the usage.ipynb notebook for a description of their contents.

  • tracks.csv: per track metadata such as ID, title, artist, genres, tags and play counts, for all 106,574 tracks.
  • genres.csv: all 163 genre IDs with their name and parent (used to infer the genre hierarchy and top-level genres).
  • features.csv: common features extracted with librosa.
  • echonest.csv: audio features provided by The Echo Nest (now Spotify) for a subset of 13,129 tracks.
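
The parent column of genres.csv is what makes the hierarchy recoverable. As a minimal sketch of that idea, the snippet below walks parent links up to a root genre. The column names (`genre_id`, `parent`, `title`), the convention that `parent == 0` marks a top-level genre, and the sample rows are all assumptions made for illustration; check them against the actual file before relying on this.

```python
import csv
import io

# Toy stand-in for genres.csv. Column names, the parent == 0 convention for
# top-level genres, and these rows are assumptions for illustration only.
SAMPLE = """genre_id,parent,title
12,0,Rock
25,12,Punk
89,25,Post-Punk
15,0,Electronic
181,15,Techno
"""


def load_parents(csv_file):
    """Map each genre ID to the ID of its parent genre."""
    reader = csv.DictReader(csv_file)
    return {int(row["genre_id"]): int(row["parent"]) for row in reader}


def top_level(genre_id, parents):
    """Follow parent links until a genre without a parent is reached."""
    seen = set()
    while parents[genre_id] != 0:
        if genre_id in seen:
            raise ValueError(f"cycle in genre hierarchy at {genre_id}")
        seen.add(genre_id)
        genre_id = parents[genre_id]
    return genre_id


parents = load_parents(io.StringIO(SAMPLE))
roots = {genre_id: top_level(genre_id, parents) for genre_id in parents}
print(roots)
```

To try it on the real table, pass an open file handle for genres.csv to `load_parents` instead of the in-memory sample.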

The MP3-encoded audio is available in several sizes:

  1. fma_small.zip: 8,000 tracks of 30s, 8 balanced genres (GTZAN-like) (7.2 GiB)
  2. fma_medium.zip: 25,000 tracks of 30s, 16 unbalanced genres (22 GiB)
  3. fma_large.zip: 106,574 tracks of 30s, 161 unbalanced genres (93 GiB)
  4. fma_full.zip: 106,574 untrimmed tracks, 161 unbalanced genres (879 GiB)

Code

The following notebooks, scripts and modules have been developed for the dataset.

  1. usage.ipynb: shows how to load the dataset and develop, train, and test your own models with it.
  2. analysis.ipynb: exploration of the metadata, data and features. Creates the figures used in the paper.
  3. baselines.ipynb: baseline models for genre recognition, both from audio and features.
  4. features.py: feature extraction from the audio (used to create features.csv).
  5. webapi.ipynb: query the web API of the FMA. Can be used to update the dataset.
  6. creation.ipynb: creation of the dataset (used to create tracks.csv and genres.csv).
  7. creation.py: creation of the dataset (long-running data collection and processing).
  8. utils.py: helper functions and classes.

Usage

  1. Download some data, verify its integrity, and uncompress the archives.

    curl -O https://os.unil.cloud.switch.ch/fma/fma_metadata.zip
    curl -O https://os.unil.cloud.switch.ch/fma/fma_small.zip
    curl -O https://os.unil.cloud.switch.ch/fma/fma_medium.zip
    curl -O https://os.unil.cloud.switch.ch/fma/fma_large.zip
    curl -O https://os.unil.cloud.switch.ch/fma/fma_full.zip
    
    echo "f0df49ffe5f2a6008d7dc83c6915b31835dfe733  fma_metadata.zip" | sha1sum -c -
    echo "ade154f733639d52e35e32f5593efe5be76c6d70  fma_small.zip"    | sha1sum -c -
    echo "c67b69ea232021025fca9231fc1c7c1a063ab50b  fma_medium.zip"   | sha1sum -c -
    echo "497109f4dd721066b5ce5e5f250ec604dc78939e  fma_large.zip"    | sha1sum -c -
    echo "0f0ace23fbe9ba30ecb7e95f763e435ea802b8ab  fma_full.zip"     | sha1sum -c -
    
    unzip fma_metadata.zip
    unzip fma_small.zip
    unzip fma_medium.zip
    unzip fma_large.zip
    unzip fma_full.zip

    If decompression fails (especially with the built-in Windows and macOS unzippers), please try 7-Zip. The cause is probably an unsupported compression method.

  2. Optionally, use pyenv to install Python 3.6 and create a virtual environment.

    pyenv install 3.6.0
    pyenv virtualenv 3.6.0 fma
    pyenv activate fma

  3. Clone the repository.

    git clone https://github.com/mdeff/fma.git
    cd fma

  4. Check out the revision matching the data you downloaded (e.g., beta, rc1, v1). See the history of the dataset.

    git checkout rc1

  5. Install the Python dependencies from requirements.txt. Depending on your usage, you may need to install ffmpeg or graphviz. Install CUDA if you want to train neural networks on GPUs (see TensorFlow's instructions).

    make install

  6. Fill in the configuration.

    cat .env
    AUDIO_DIR=/path/to/audio
    FMA_KEY=IFIUSETHEAPI

  7. Open Jupyter or run a notebook.

    jupyter notebook
    make baselines.ipynb
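
The configuration from step 6 is a plain KEY=VALUE file. As a minimal sketch, assuming no quoting or variable expansion is needed, it could be read with the standard library alone. `read_env_file` is a hypothetical helper written for this example, not part of the repository, and the repository's own code may load `.env` differently.

```python
import os
import tempfile


def read_env_file(path):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    config = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            config[key.strip()] = value.strip()
    return config


# Demonstration on a temporary file holding the two keys shown in step 6.
with tempfile.TemporaryDirectory() as tmp:
    env_path = os.path.join(tmp, ".env")
    with open(env_path, "w", encoding="utf-8") as f:
        f.write("AUDIO_DIR=/path/to/audio\nFMA_KEY=IFIUSETHEAPI\n")
    config = read_env_file(env_path)

print(config["AUDIO_DIR"])
```

In your own scripts you would call `read_env_file(".env")` from the repository root and pass the `AUDIO_DIR` value to whatever code locates the MP3 files.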

Coverage and resources

Research papers (see also citations on Google Scholar):

Dataset lists:

History

2017-05-09 pre-publication release

  • paper: arXiv:1612.01840v2
  • code: git tag rc1
  • fma_metadata.zip sha1: f0df49ffe5f2a6008d7dc83c6915b31835dfe733
  • fma_small.zip sha1: ade154f733639d52e35e32f5593efe5be76c6d70
  • fma_medium.zip sha1: c67b69ea232021025fca9231fc1c7c1a063ab50b
  • fma_large.zip sha1: 497109f4dd721066b5ce5e5f250ec604dc78939e
  • fma_full.zip sha1: 0f0ace23fbe9ba30ecb7e95f763e435ea802b8ab

2016-12-06 beta release

  • paper: arXiv:1612.01840v1
  • code: git tag beta
  • fma_small.zip sha1: e731a5d56a5625f7b7f770923ee32922374e2cbf
  • fma_medium.zip sha1: fe23d6f2a400821ed1271ded6bcd530b7a8ea551
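
Because the same archive name has a different checksum in each release, it can help to ask which release a downloaded file belongs to. The sketch below copies the SHA-1 digests listed above into a lookup table. `sha1_of` and `matching_release` are hypothetical helpers written for this example and are not part of the repository.

```python
import hashlib
import os
import tempfile

# SHA-1 digests copied from the release history above, keyed by git tag.
CHECKSUMS = {
    "rc1": {
        "fma_metadata.zip": "f0df49ffe5f2a6008d7dc83c6915b31835dfe733",
        "fma_small.zip": "ade154f733639d52e35e32f5593efe5be76c6d70",
        "fma_medium.zip": "c67b69ea232021025fca9231fc1c7c1a063ab50b",
        "fma_large.zip": "497109f4dd721066b5ce5e5f250ec604dc78939e",
        "fma_full.zip": "0f0ace23fbe9ba30ecb7e95f763e435ea802b8ab",
    },
    "beta": {
        "fma_small.zip": "e731a5d56a5625f7b7f770923ee32922374e2cbf",
        "fma_medium.zip": "fe23d6f2a400821ed1271ded6bcd530b7a8ea551",
    },
}


def sha1_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matching_release(path):
    """Return the tag whose recorded digest matches the file, or None."""
    name = os.path.basename(path)
    actual = sha1_of(path)
    for tag, files in CHECKSUMS.items():
        if files.get(name) == actual:
            return tag
    return None


# Demonstration: a dummy file named like an archive matches no release.
with tempfile.TemporaryDirectory() as tmp:
    dummy = os.path.join(tmp, "fma_small.zip")
    with open(dummy, "wb") as f:
        f.write(b"abc")
    dummy_digest = sha1_of(dummy)
    dummy_release = matching_release(dummy)

print(dummy_release)
```

For a real download, `matching_release("fma_small.zip")` should return the tag to pass to `git checkout`, and `None` suggests a corrupted or unlisted archive.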

Contributing

Please open an issue or a pull request if you want to contribute. Let's try to keep this repository the central place around the dataset! Links to resources related to the dataset are welcome.

License & co

  • Please cite our paper if you use our code or data.
    @inproceedings{fma_dataset,
      title = {{FMA}: A Dataset for Music Analysis},
      author = {Defferrard, Micha\"el and Benzi, Kirell and Vandergheynst, Pierre and Bresson, Xavier},
      booktitle = {18th International Society for Music Information Retrieval Conference (ISMIR)},
      year = {2017},
      archiveprefix = {arXiv},
      eprint = {1612.01840},
      url = {https://arxiv.org/abs/1612.01840},
    }
    
  • The code in this repository is released under the terms of the MIT license.
  • The metadata is released under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0).
  • We do not hold the copyright on the audio and distribute it under the terms of the license chosen by the artist.
  • The dataset is meant for research purposes.
  • We are grateful to the Swiss Data Science Center (EPFL and ETHZ) for hosting the dataset.