[AUDIO AND TEXT] FSDnoisy18k


FSDnoisy18k is an audio dataset collected to foster the investigation of label noise in sound event classification. It contains 42.5 hours of audio across 20 sound classes, including a small amount of manually labeled data and a larger quantity of real-world noisy data.

Data curators

Eduardo Fonseca and Mercedes Collado


If you have any questions, you are welcome to contact Eduardo Fonseca at [email protected].


If you use this dataset or part of it, please cite the following paper:

Eduardo Fonseca, Manoj Plakal, Daniel P. W. Ellis, Frederic Font, Xavier Favory, and Xavier Serra, “Learning Sound Event Classifiers from Web Audio with Noisy Labels”, arXiv preprint arXiv:1901.01189, 2019

You can also consider citing our ISMIR 2017 paper that describes the Freesound Datasets platform, which was used to gather the manual annotations included in FSDnoisy18k:

Eduardo Fonseca, Jordi Pons, Xavier Favory, Frederic Font, Dmitry Bogdanov, Andres Ferraro, Sergio Oramas, Alastair Porter, and Xavier Serra, “Freesound Datasets: A Platform for the Creation of Open Audio Datasets”, in Proceedings of the 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017

Description at

General details are also available in this UPF news item