docs + readme update
Natooz committed Oct 27, 2023
1 parent 87a4988 commit 8e62e92
Showing 2 changed files with 7 additions and 2 deletions.
3 changes: 2 additions & 1 deletion README.md
@@ -96,7 +96,8 @@ Contributions are gratefully welcomed, feel free to open an issue or send a PR i

If you use MidiTok for your research, a citation in your manuscript would be gladly appreciated. ❤️

[**MidiTok paper**](https://archives.ismir.net/ismir2021/latebreaking/000005.pdf)
[**[MidiTok paper]**](https://arxiv.org/abs/2310.17202)
[**[MidiTok original ISMIR publication]**](https://archives.ismir.net/ismir2021/latebreaking/000005.pdf)
```bibtex
@inproceedings{miditok2021,
title={{MidiTok}: A Python package for {MIDI} file tokenization},
  ...
```
6 changes: 5 additions & 1 deletion docs/data_augmentation.rst
@@ -2,7 +2,11 @@
Data augmentation
========================

MidiTok offers data augmentation solutions, on the pitch octave, velocity and duration attributes.
Data augmentation is a technique that artificially increases the size of a dataset by applying various transformations to the existing data. These transformations consist of altering one or several attributes of the original data. In the context of images, they can include operations such as rotation, scaling, cropping or color adjustments. This is trickier for natural language, where the meaning of a sentence can easily diverge depending on how the text is modified, but techniques such as paraphrase generation or back translation can serve this purpose.

The purpose of data augmentation is to introduce variability and diversity into the training data without collecting additional real-world data. Data augmentation can improve a model's learning and generalization, as it exposes the model to a wider range of variations and patterns present in the data. In turn, it can increase the model's robustness and decrease overfitting.

MidiTok can perform data augmentation at both the MIDI level and the token level. Transformations can be made by increasing the values of the velocities and durations of notes, or by shifting their pitches by octaves. Data augmentation is highly recommended when training a model, as it helps the model learn the global and local harmony of music. In large datasets such as the `Lakh <https://colinraffel.com/projects/lmd/>`_ or `Meta MIDI <https://zenodo.org/records/5142664>`_ datasets, MIDI files can have various ranges of velocity, duration values, and pitch. By augmenting the data, and thus creating more diverse data samples, a model can effectively learn to focus on the melody, harmony and music features rather than paying too much attention to specific recurrent token successions.
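
The transformations described above can be sketched in plain Python. This is only an illustration of the idea and not MidiTok's implementation or API: the ``Note`` class, the ``augment_notes`` function and the choice to skip an augmentation whose shifted pitches leave the MIDI range are all assumptions made for this example.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Note:
    """Hypothetical note container used only for this illustration."""

    pitch: int     # MIDI pitch, 0 to 127
    velocity: int  # MIDI velocity, 1 to 127
    duration: int  # duration in ticks, at least 1


def augment_notes(
    notes: List[Note],
    octave_offset: int = 0,
    velocity_offset: int = 0,
    duration_offset: int = 0,
) -> Optional[List[Note]]:
    """Return an augmented copy of ``notes``, or None if a pitch would be invalid."""
    shift = 12 * octave_offset  # one octave is 12 semitones
    # Assumption: drop the whole augmentation rather than clip pitches,
    # since clipping would change the intervals between notes.
    if any(not 0 <= note.pitch + shift <= 127 for note in notes):
        return None
    return [
        replace(
            note,
            pitch=note.pitch + shift,
            # Velocities and durations are clamped to stay valid.
            velocity=min(127, max(1, note.velocity + velocity_offset)),
            duration=max(1, note.duration + duration_offset),
        )
        for note in notes
    ]


# Build several variants of the same short sequence.
original = [Note(60, 90, 480), Note(64, 100, 240)]
variants = []
for octave in (-1, 1):
    for velocity in (-10, 10):
        augmented = augment_notes(original, octave, velocity)
        if augmented is not None:
            variants.append(augmented)
```

Shifting by whole octaves keeps every interval between notes intact, so the augmented samples preserve the melody and harmony of the original while varying its register and dynamics.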

.. automodule:: miditok.data_augmentation
    :members:
