MARD: Multimodal Album Reviews Dataset
We maintain datasets, repositories, frameworks and tools relevant to research and technology transfer initiatives in knowledge extraction. This section gives an overview of a selection of them, with download links or contact details.
The MdM Strategic Research Program has its own community in Zenodo for material available in this repository as well as in the UPF e-repository. Below is a non-exhaustive list of datasets representative of the research in the Department.
To promote the availability of resources, specific communities have also been created in Zenodo, at the level of research communities (for instance, MIR and Educational Data Analytics) or MSc programs (for instance, the Master in Sound and Music Computing).
MARD contains texts and accompanying metadata originally obtained from a much larger dataset of Amazon customer reviews, which have been enriched with music metadata from MusicBrainz, and audio descriptors from AcousticBrainz. MARD amounts to a total of 65,566 albums and 263,525 customer reviews. A breakdown of the number of albums per genre is provided here:
Genre | Amazon | MusicBrainz | AcousticBrainz |
---|---|---|---|
Alternative Rock | 2,674 | 1,696 | 564 |
Reggae | 509 | 260 | 79 |
Classical | 10,000 | 2,197 | 587 |
R&B | 2,114 | 2,950 | 982 |
Country | 2,771 | 1,032 | 424 |
Jazz | 6,890 | 2,990 | 863 |
Metal | 1,785 | 1,294 | 500 |
Pop | 10,000 | 4,422 | 1,701 |
New Age | 2,656 | 638 | 155 |
Dance & Electronic | 5,106 | 899 | 367 |
Rap & Hip-Hop | 1,679 | 768 | 207 |
Latin Music | 7,924 | 3,237 | 425 |
Rock | 7,315 | 4,100 | 1,482 |
Gospel | 900 | 274 | 33 |
Blues | 1,158 | 448 | 135 |
Folk | 2,085 | 848 | 179 |
Total | 65,566 | 28,053 | 8,683 |
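As a quick consistency check, the per-genre rows of the table can be summed and compared with the overall figures quoted in the text. The snippet below is only an illustrative sketch: the dictionary name and helper function are hypothetical, and it reads the three columns as album counts in Amazon, albums with MusicBrainz metadata, and albums with AcousticBrainz descriptors, which is an interpretation of the table rather than something stated by the dataset files.

```python
# Album counts per genre, copied from the table above:
# (Amazon, MusicBrainz, AcousticBrainz)
ALBUMS_PER_GENRE = {
    "Alternative Rock": (2674, 1696, 564),
    "Reggae": (509, 260, 79),
    "Classical": (10000, 2197, 587),
    "R&B": (2114, 2950, 982),
    "Country": (2771, 1032, 424),
    "Jazz": (6890, 2990, 863),
    "Metal": (1785, 1294, 500),
    "Pop": (10000, 4422, 1701),
    "New Age": (2656, 638, 155),
    "Dance & Electronic": (5106, 899, 367),
    "Rap & Hip-Hop": (1679, 768, 207),
    "Latin Music": (7924, 3237, 425),
    "Rock": (7315, 4100, 1482),
    "Gospel": (900, 274, 33),
    "Blues": (1158, 448, 135),
    "Folk": (2085, 848, 179),
}


def column_totals(counts):
    """Sum each of the three columns across all genres."""
    return tuple(sum(column) for column in zip(*counts.values()))


amazon_total, mb_total, ab_total = column_totals(ALBUMS_PER_GENRE)
print(f"Amazon albums: {amazon_total}")
print(f"With MusicBrainz metadata: {mb_total} ({mb_total / amazon_total:.1%})")
print(f"With AcousticBrainz descriptors: {ab_total} ({ab_total / amazon_total:.1%})")
```

The Amazon column sum should agree with the album count given in the description of the dataset.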
A subset of the dataset was created for genre classification experiments. It contains 100 albums per genre, by different artists, across 13 genres. All of its albums have been mapped to MusicBrainz and AcousticBrainz, and it includes semantic, acoustic and sentiment features.
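To make the idea of such a balanced subset concrete, here is a minimal sketch of one way to pick a fixed number of albums per genre with no repeated artist inside a genre. It is not the procedure used to build the MARD subset: the record keys (`id`, `artist`, `genre`), the function name, and the reading of "different artists" as "unique within each genre" are all assumptions made for illustration.

```python
from collections import defaultdict


def balanced_subset(albums, per_genre=100):
    """Pick up to `per_genre` albums per genre, at most one per artist within a genre.

    `albums` is an iterable of dicts with hypothetical keys "id", "artist", "genre".
    Albums are taken in input order; shuffle beforehand for a random sample.
    """
    chosen = defaultdict(list)
    seen_artists = defaultdict(set)
    for album in albums:
        genre, artist = album["genre"], album["artist"]
        # Skip when the genre is already full or the artist is already represented.
        if len(chosen[genre]) >= per_genre or artist in seen_artists[genre]:
            continue
        chosen[genre].append(album)
        seen_artists[genre].add(artist)
    return dict(chosen)


# Toy records, invented purely to exercise the function.
toy_albums = [
    {"id": "a1", "artist": "X", "genre": "Jazz"},
    {"id": "a2", "artist": "X", "genre": "Jazz"},
    {"id": "a3", "artist": "Y", "genre": "Jazz"},
    {"id": "a4", "artist": "X", "genre": "Rock"},
]
toy_subset = balanced_subset(toy_albums, per_genre=2)
print({genre: [a["id"] for a in picks] for genre, picks in toy_subset.items()})
```

Restricting each artist to one album per genre keeps a classifier from learning artist identity instead of genre, which is a common reason for this kind of constraint.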
We also provide all the files needed to reproduce the genre classification experiments in the paper referenced below.
For details on the datasets and to download them, please go to http://mtg.upf.edu/download/datasets/mard
For more details on how these files were generated, we refer to the following scientific publication. We would highly appreciate it if scientific publications based in part on the MARD dataset cite the following publication: