Software & Datasets

Open science and reproducibility are core goals of the MTG: we promote collaboration by making sure that our research results can be used by other researchers and by society at large. Here we highlight some of the software tools and datasets that were developed as part of our research and are maintained by MTG researchers. All our open source projects are available from our GitHub repository, and all our open datasets are available from Zenodo. Please review the terms and conditions of use stated in each software tool and dataset to make sure that they allow your intended use.

Apart from research collaborations, we are also interested in technology transfer and offer commercial licenses for using our software tools in industrial applications. Contact us for further information.

 

Software

ESSENTIA:

Software library for audio and music analysis, description, and synthesis.

GAIA:

Software library for applying similarity measures and classification to the results of audio analysis.

SMS TOOLS:

Sound analysis/synthesis tools for music applications.

THE EYEHARP:

Gaze-controlled digital music instrument.

DUNYA DESKTOP:

Modular and extensible desktop application to explore the Dunya corpora.

SARAGA:

Android app to explore and listen to a collection of Carnatic and Hindustani music.

HPCP:

Vamp plug-in for chroma feature extraction from polyphonic music signals.

MELODIA:

Vamp plug-in for predominant melody estimation from polyphonic audio signals.

MIR.EDU:

Vamp plug-in library in C++ that implements a basic set of descriptors useful for teaching MIR.

Corpora

ACOUSTICBRAINZ:

Crowdsourced acoustic information of songs available under open licenses.

DUNYA:

Music corpora of several non-Western music repertoires and related software tools.

FREESOUND:

Collaborative database of Creative Commons-licensed sounds.

Datasets

FREESOUND DATASETS:

Platform for the collaborative creation of open audio collections from Freesound.

COMPMUSIC datasets:

Collection of datasets of several non-Western music repertoires.

REPOVIZZ:

Data repository and visualization tool for music performance multi-modal recordings.

DREANSS:

Annotations of drum events in well-known music audio datasets.

EEP:

Multimodal recordings of string quartet performances.

FLABASE:

Knowledge base of flamenco music.

GIANTSTEPS Key:

Key annotations of a music audio collection.

GIANTSTEPS Tempo:

Tempo annotations of a music audio collection.

GOOD-SOUNDS:

Recordings of single notes and scales played by several instruments.

IRMAS:

Musical audio excerpts with annotations of the predominant instruments. 

Last.fm Dataset 360k users - Last.fm Dataset 1k users:  

<user, artist-mbid, artist-name, total-plays> tuples from Last.fm.

MARD:

Text and accompanying metadata of Amazon customer reviews.

MASS:

Multi-track recordings for audio source separation research.

MTG-QBH:

Recordings of sung melodies for Query-by-Humming research.

ORCHSET:

Orchestral music excerpts with annotations for melody extraction research.

PHENICX-Anechoic:

Denoised recordings and note annotations for the Aalto anechoic orchestral database.

PHENICX-emotion:

Excerpts of the Eroica Symphony by Beethoven plus audio descriptors.

QUARTET:

Multimodal data of string quartet performances.

SAS:

List of artists and biographical information for semantic artist similarity research.

TONAS:

Flamenco a cappella sung melodies with manual transcriptions.

HAYDN QUARTETS:

Scores and harmonic annotations of Haydn's String Quartets Op. 20.

ISMIR04 genre:

ISMIR 2004 Genre Identification task dataset.