Soundata, a Python library for reproducible use of audio datasets
We’re excited to announce the release of Soundata, a Python library for reproducible use of audio datasets. We’re launching with 14 popular environmental sound datasets, and we plan to keep adding datasets from other audio domains, including speech and bioacoustics. For music datasets, see mirdata, which inspired soundata.
Soundata makes it easy to:
- Download datasets to a common location and format
- Validate that a downloaded dataset is complete and perfectly matches a canonical version
- Load audio and annotation files into a common format
- Parse clip-level metadata for detailed evaluations
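To make the validation step more concrete, here is a minimal sketch of how a local copy of a dataset could be checked against a canonical index of file checksums. It does not use soundata's API and is not its implementation: the function names (`checksum_of_file`, `validate_dataset`, `demo`), the index layout (relative path mapped to a checksum), the file names, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def checksum_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_dataset(data_home, index):
    """Compare files under data_home with a {relative_path: checksum} index.

    Returns two lists of relative paths: files that are missing, and files
    whose contents do not match the canonical checksum.
    """
    missing, mismatched = [], []
    for relative_path, expected in index.items():
        local_path = Path(data_home) / relative_path
        if not local_path.is_file():
            missing.append(relative_path)
        elif checksum_of_file(local_path) != expected:
            mismatched.append(relative_path)
    return missing, mismatched


def demo():
    """Build a tiny fake dataset, damage it, and report what validation finds."""
    with tempfile.TemporaryDirectory() as data_home:
        root = Path(data_home)
        # Hypothetical file names and contents, for illustration only.
        files = {
            "audio/clip_001.wav": b"first clip",
            "audio/clip_002.wav": b"second clip",
            "metadata.csv": b"clip_id,label\n",
        }
        for relative_path, payload in files.items():
            target = root / relative_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(payload)

        # The "canonical" index is computed from the intact files.
        index = {name: checksum_of_file(root / name) for name in files}

        # Simulate an incomplete download and a corrupted file.
        (root / "audio/clip_002.wav").unlink()
        (root / "metadata.csv").write_bytes(b"truncated")

        return validate_dataset(root, index)


if __name__ == "__main__":
    missing, mismatched = demo()
    print("missing:", missing)
    print("mismatched:", mismatched)
```

Reporting missing and mismatched files separately tells a user whether to resume an interrupted download or to re-fetch a file that was altered.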
We hope soundata will help the community to:
- Ensure results are reproducible by working against exactly the same data
- Save time by avoiding manual downloads and having to write custom dataset parsers
- Automate large-scale download, training, and evaluation pipelines
- Increase the visibility of new datasets by adding them to soundata
Soundata is a cross-organizational collaboration spanning researchers from MARL@NYU, Adobe Research, MTG@UPF, and GPA@UdelaR. From the MTG side, we have added loaders for various datasets, including the Freesound datasets FSD50K and FSDnoisy18K. We will continue this work and add support for many more openly available datasets.
You can learn more about the library on our docs page: https://soundata.