[TEXT] MARD: Multimodal Album Reviews Dataset

 

MARD contains texts and accompanying metadata originally obtained from a much larger dataset of Amazon customer reviews, enriched with music metadata from MusicBrainz and audio descriptors from AcousticBrainz. In total, MARD comprises 65,566 albums and 263,525 customer reviews. The number of albums per genre available from each source is as follows:

 

Genre                 Amazon   MusicBrainz   AcousticBrainz
Alternative Rock       2,674         1,696              564
Reggae                   509           260               79
Classical             10,000         2,197              587
R&B                    2,114         2,950              982
Country                2,771         1,032              424
Jazz                   6,890         2,990              863
Metal                  1,785         1,294              500
Pop                   10,000         4,422            1,701
New Age                2,656           638              155
Dance & Electronic     5,106           899              367
Rap & Hip-Hop          1,679           768              207
Latin Music            7,924         3,237              425
Rock                   7,315         4,100            1,482
Gospel                   900           274               33
Blues                  1,158           448              135
Folk                   2,085           848              179
Total                 65,566        28,053            8,683
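
As a quick sanity check on the breakdown, the per-genre counts can be summed per source. The following is only a minimal sketch: the counts are copied by hand from the table, and the names COUNTS, column_totals and overall_share are illustrative assumptions, not part of the MARD distribution. The Amazon column sums to the 65,566 albums stated in the description.

```python
# Per-genre album counts copied by hand from the table:
# (Amazon, MusicBrainz, AcousticBrainz).
# COUNTS and the helper names below are illustrative, not part of MARD itself.
COUNTS = {
    "Alternative Rock": (2674, 1696, 564),
    "Reggae": (509, 260, 79),
    "Classical": (10000, 2197, 587),
    "R&B": (2114, 2950, 982),
    "Country": (2771, 1032, 424),
    "Jazz": (6890, 2990, 863),
    "Metal": (1785, 1294, 500),
    "Pop": (10000, 4422, 1701),
    "New Age": (2656, 638, 155),
    "Dance & Electronic": (5106, 899, 367),
    "Rap & Hip-Hop": (1679, 768, 207),
    "Latin Music": (7924, 3237, 425),
    "Rock": (7315, 4100, 1482),
    "Gospel": (900, 274, 33),
    "Blues": (1158, 448, 135),
    "Folk": (2085, 848, 179),
}


def column_totals(counts):
    """Sum each of the three source columns across all genres."""
    return tuple(sum(column) for column in zip(*counts.values()))


def overall_share(counts):
    """Return the MusicBrainz and AcousticBrainz totals as fractions of the Amazon total."""
    amazon, musicbrainz, acousticbrainz = column_totals(counts)
    return musicbrainz / amazon, acousticbrainz / amazon


if __name__ == "__main__":
    print(column_totals(COUNTS))  # (65566, 28053, 8683)
    mb_share, ab_share = overall_share(COUNTS)
    print(f"MusicBrainz: {mb_share:.1%}, AcousticBrainz: {ab_share:.1%}")
```

The shares are computed over the whole collection rather than per genre, because the per-genre columns are not strict subsets of one another (R&B, for instance, lists more MusicBrainz albums than Amazon albums).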

 

A subset of the dataset was created for genre classification experiments. It contains 100 albums per genre, from different artists, across 13 genres. All of these albums have been mapped to MusicBrainz and AcousticBrainz, and the subset includes semantic, acoustic and sentiment features.

We also provide all the files necessary to reproduce the genre classification experiments in the paper referenced below.

For details on the dataset and to download it, please go to http://mtg.upf.edu/download/datasets/mard

For more details on how these files were generated, please refer to the following scientific publication. We would greatly appreciate it if scientific publications based in part on the MARD dataset cite it:

Oramas, S., Espinosa-Anke, L., Lawlor, A., Serra, X., & Saggion, H. (2016). Exploring Customer Reviews for Music Genre Classification and Evolutionary Studies. 17th International Society for Music Information Retrieval Conference (ISMIR'16).
 
The MARD dataset will be introduced in the ISMIR 2016 tutorial "Natural Language Processing for MIR": https://wp.nyu.edu/ismir2016/event/tutorials/