asigalov61/LAKH-MuseNet-MIDI-Dataset

LAKH MuseNet MIDI Dataset


Full LAKH MIDI dataset converted to MuseNet MIDI output format (9 instruments + drums)

Bonus: Choir on Channel 10


Please respect the CC BY-NC-SA license.


Build your own copy with the Colab notebook, or download the converted output here:

Open In Colab


Download with wget (the leading `!` runs the command from a Colab or Jupyter cell):

!wget --no-check-certificate -O LAKH-MuseNet-MIDI-Dataset.zip "https://onedrive.live.com/download?cid=8A0D502FC99C608F&resid=8A0D502FC99C608F%2118520&authkey=AN-gn1ZxEnO4khE"
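
Once the archive is downloaded, it can be unpacked and inspected from the same notebook. The snippet below is a minimal sketch using only the Python standard library. The helper name `extract_midi_archive` and the output directory are hypothetical, and the internal folder layout of the archive is not documented here, so the code simply searches every subfolder for `.mid` and `.midi` files.

```python
import zipfile
from pathlib import Path


def extract_midi_archive(zip_path, out_dir):
    """Extract a zip archive and return the sorted paths of the MIDI files inside.

    Hypothetical helper: the archive layout is assumed unknown, so files are
    located by extension at any depth rather than by a fixed folder name.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)

    # Match .mid and .midi in any letter case, anywhere under out_dir.
    return sorted(
        p for p in out_dir.rglob("*")
        if p.is_file() and p.suffix.lower() in {".mid", ".midi"}
    )
```

For example, `extract_midi_archive("LAKH-MuseNet-MIDI-Dataset.zip", "lakh_musenet")` would unpack the file saved by the wget command above into a `lakh_musenet` folder and return the list of MIDI paths it finds there.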

Source license/attribution (quoted from Colin Raffel's Lakh MIDI Dataset page)

The Lakh MIDI Dataset is distributed with a CC-BY 4.0 license; if you use this data in any capacity, please reference this page and my thesis:

Colin Raffel. "Learning-Based Methods for Comparing Sequences, with Applications to Audio-to-MIDI Alignment and Matching". PhD Thesis, 2016.

Of course, I did not transcribe any of the MIDI files in the Lakh MIDI Dataset. While MIDI files have a built-in mechanism for attribution (the Copyright meta-event), it is not used consistently, so attributing each of the MIDI files in the dataset to a particular author is not feasible. If you'd like to try, here is a list of the text of all of the Copyright meta-events in the Lakh MIDI Dataset.

If you use the Million Song Dataset, please reference this paper:

Thierry Bertin-Mahieux, Daniel P. W. Ellis, Brian Whitman, and Paul Lamere. "The Million Song Dataset". In Proceedings of the 12th International Society for Music Information Retrieval Conference, pages 591–596, 2011.


Project Los Angeles

Tegridy Code 2022
