The unofficial dataset of all music sheets and users on musescore.com, dedicated to big data analytics / data science / machine learning.
All data is collected by iterating through musescore.com's public API.
The jsonl files are in the Newline-delimited JSON (JSON Lines) format.
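Because every line of a JSON Lines file is one complete JSON document, ordinary line-oriented tools can count and slice records without a JSON parser. The following is a minimal sketch, not part of the dataset tooling; the helper names `count_records` and `nth_record` are hypothetical and introduced only for illustration.

```shell
# Each line of a JSON Lines file is one complete JSON document,
# so records can be counted and selected by line number.

# Print the number of records (lines) in a JSON Lines file.
# Usage: count_records FILE
count_records() {
  grep -c '' "$1"
}

# Print the N-th record (1-based) of a JSON Lines file.
# Usage: nth_record FILE N
nth_record() {
  sed -n "${2}p" "$1"
}
```

For example, `nth_record score.jsonl 1` prints the first record of a downloaded score.jsonl, which is a quick way to inspect which fields a record carries before loading the whole file.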
Only need the sheet files to learn music? Try musescore-downloader.
View/Query in Google BigQuery
Updated manually. Last updated: Mar 16, 2020
https://musescore-dataset.xmader.com/user.jsonl
Updated daily at 7:10 am ET (UTC-5, or UTC-4 during Daylight Saving Time)
https://musescore-dataset.xmader.com/score.jsonl
Updated daily at 7:10 am ET (UTC-5, or UTC-4 during Daylight Saving Time)
https://musescore-dataset.xmader.com/mscz-files.csv
# The CSV file itself is on IPFS
wget -O mscz-files.csv https://ipfs.io/ipns/QmSdXtvzC8v8iTTZuj5cVmiugnzbR1QATYRcGix4bBsioP/mscz-files.csv.part{0..15}
This is a CSV file that contains the score ID (id) and the corresponding IPFS reference (ref) for each mscz file.
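To fetch a single score rather than the whole collection, the IPFS reference can be looked up by score ID. This is a rough sketch that assumes the file has one header row followed by unquoted `id,ref` rows, which matches how the download scripts in this README read it; the function name `lookup_ref` is hypothetical.

```shell
# Print the IPFS reference for one score id from mscz-files.csv.
# Assumes a header row, then unquoted "id,ref" rows (an assumption,
# based on how the README's download scripts parse the file).
# Usage: lookup_ref CSV_FILE SCORE_ID
lookup_ref() {
  awk -F, -v id="$2" '
    NR > 1 && $1 == id {
      sub(/\r$/, "", $2)   # tolerate Windows line endings, if present
      print $2
      exit
    }' "$1"
}
```

The function prints nothing when the ID is not found, so callers can test for an empty result before attempting a download.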
All files are available on IPFS.
NO ONE CAN TAKE IT DOWN NOW!
Download mscz files via IPFS HTTP Gateways
- https://ipfs.infura.io/{ref}
- https://ipfs.eternum.io/{ref}
- https://ipfs.io/{ref}
- https://cloudflare-ipfs.com/{ref}
- more
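Since any of the gateways listed above can serve the same reference, a download can fall back from one to the next. The sketch below is an illustration under assumptions: the helper names `gateway_urls` and `fetch_mscz` are hypothetical, the gateway list is copied from this README and individual gateways may not always be available, and a leading slash on the reference is dropped so the joined URL never contains a double slash.

```shell
# Print one candidate download URL per gateway for an IPFS reference.
# Usage: gateway_urls REF
gateway_urls() {
  ref=${1#/}   # drop a leading slash so the join below never yields "//"
  for gateway in \
    https://ipfs.infura.io \
    https://ipfs.eternum.io \
    https://ipfs.io \
    https://cloudflare-ipfs.com
  do
    printf '%s/%s\n' "$gateway" "$ref"
  done
}

# Try each gateway in turn and stop at the first successful download.
# Usage: fetch_mscz SCORE_ID REF
fetch_mscz() {
  gateway_urls "$2" | while read -r url
  do
    wget -nv "$url" -O "$1.mscz" && break
  done
}
```

Note that `wget -O` creates the output file even when a download fails, so a score that no gateway could serve may leave an empty `.mscz` file behind.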
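#!/bin/bash
# Download every mscz file listed in mscz-files.csv through an IPFS HTTP gateway
while IFS=, read -r id ref
do
  # Quote the variables so unexpected characters cannot split the arguments
  wget -nv "https://ipfs.infura.io$ref" -O "$id.mscz"
done < <(sed '1d' mscz-files.csv)  # sed '1d' skips the CSV header row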
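#!/bin/bash
# Install IPFS https://docs.ipfs.io/how-to/command-line-quick-start/#install-ipfs
# Start a local IPFS daemon in the background, initializing the repo if needed
ipfs daemon --init &
while IFS=, read -r id ref
do
  ipfs get "$ref" -o "$id.mscz"
done < <(sed '1d' mscz-files.csv)  # sed '1d' skips the CSV header row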
Contact me if you have any questions.
The purpose of the project is to make the data of musescore.com accessible to anyone in need and to bring a clean, high-quality music dataset to the world of computer science. It is not meant for individuals who only want to keep the data without putting it to use.
I would like to thank Luca B. for telling me that what I am doing is meaningful.