Soundata, a Python library for reproducible use of audio datasets

Soundata lets you load and work with audio datasets in a standardized way, which improves reproducibility


We’re excited to announce the release of Soundata, a Python library for reproducible use of audio datasets. We’re launching with 14 popular environmental sound datasets, and we plan to keep adding datasets from a range of audio domains, including speech and bioacoustics. For music datasets, see mirdata, which was the inspiration for soundata.

Soundata makes it easy to:

  • Download datasets to a common location and format

  • Validate that a downloaded dataset is complete and perfectly matches a canonical version

  • Load audio and annotation files into a common format

  • Parse clip-level metadata for detailed evaluations
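
As a rough illustration of what validating a download against a canonical version can involve, the sketch below compares each local file's checksum with a reference index. It uses only the Python standard library and is not soundata's implementation: the function names `checksum_of_file` and `validate_against_index`, the choice of SHA-256, and the index layout (relative path mapped to expected checksum) are all assumptions made for this example.

```python
import hashlib
import os
import tempfile


def checksum_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_against_index(data_home, index):
    """Compare local files with a reference index (hypothetical layout).

    `index` maps a path relative to `data_home` to the expected checksum.
    Returns two sorted lists: files that are missing, and files whose
    checksum differs from the reference.
    """
    missing, mismatched = [], []
    for relative_path, expected in index.items():
        local_path = os.path.join(data_home, relative_path)
        if not os.path.isfile(local_path):
            missing.append(relative_path)
        elif checksum_of_file(local_path) != expected:
            mismatched.append(relative_path)
    return sorted(missing), sorted(mismatched)


def demo():
    """Build a tiny fake dataset in a temp directory and validate it."""
    with tempfile.TemporaryDirectory() as data_home:
        # One intact file, one corrupted file, and one file never written.
        with open(os.path.join(data_home, "clip_a.wav"), "wb") as handle:
            handle.write(b"audio bytes a")
        with open(os.path.join(data_home, "clip_b.wav"), "wb") as handle:
            handle.write(b"corrupted bytes")
        index = {
            "clip_a.wav": hashlib.sha256(b"audio bytes a").hexdigest(),
            "clip_b.wav": hashlib.sha256(b"audio bytes b").hexdigest(),
            "clip_c.wav": hashlib.sha256(b"audio bytes c").hexdigest(),
        }
        return validate_against_index(data_home, index)


if __name__ == "__main__":
    missing, mismatched = demo()
    print("missing:", missing)
    print("mismatched:", mismatched)
```

Reporting missing and mismatched files separately makes it easier to tell an incomplete download apart from a corrupted or altered one, which matters when results need to be compared against exactly the same data.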

We hope soundata will help the community to:

  • Ensure results are reproducible by working against exactly the same data

  • Save time by avoiding manual downloads and having to write custom dataset parsers

  • Automate large-scale download, training, and evaluation pipelines

  • Increase the visibility of new datasets by adding them to soundata

Soundata is a cross-organizational collaboration spanning researchers from several institutions, including Adobe Research and the MTG. From the MTG side, we have added loaders for various datasets, including the Freesound datasets FSD50K and FSDnoisy18K. We will continue these efforts and add support for many more openly available datasets.

You can learn more about the library on our docs page.


