MSD-A is a dataset related to the Million Song Dataset (MSD). It is a collection of artist tags and biographies gathered from Last.fm for all artists that have songs in the MSD. In addition, the MSD Taste Profile (a recommendation dataset) is adapted from songs to artists.

We provide the biographies, tags, data splits, and feature embeddings to reproduce the experiments from the paper:

Oramas S., Nieto O., Sordo M., & Serra X. (2017). A Deep Multimodal Approach for Cold-start Music Recommendation. https://arxiv.org/abs/1706.09739

Downloads

MSD-A Dataset

Tartarus: Library for deep learning experiments https://github.com/sergiooramas/tartarus

Semantic Enrichment

To enrich the documents with semantic information, we keep only the entities detected by the entity linking system that have one of the following types:

types = ['MusicalArtist','Band','MusicGenre','MusicalWork','Engineer','RecordLabel','Instrument','Place','Language','EthnicGroup','music']
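
The type filter can be sketched as follows. This is a minimal illustration, not the code used for the paper: the entity record layout (a dict with "uri" and "types" keys), the function name filter_entities, and the sample entities are all assumptions made for the example.

```python
# Type whitelist copied from the list above.
types = ['MusicalArtist', 'Band', 'MusicGenre', 'MusicalWork', 'Engineer',
         'RecordLabel', 'Instrument', 'Place', 'Language', 'EthnicGroup', 'music']
ALLOWED_TYPES = set(types)


def filter_entities(entities):
    """Keep entities that carry at least one whitelisted type.

    Each entity is assumed (hypothetically) to be a dict with a "types" list.
    """
    return [e for e in entities if ALLOWED_TYPES.intersection(e.get("types", []))]


# Hypothetical output of an entity linking system for one biography.
detected = [
    {"uri": "http://dbpedia.org/resource/Radiohead", "types": ["Band", "Organisation"]},
    {"uri": "http://dbpedia.org/resource/Oxford", "types": ["Place", "City"]},
    {"uri": "http://dbpedia.org/resource/Physics", "types": ["AcademicSubject"]},
]

kept = filter_entities(detected)
kept_uris = [e["uri"] for e in kept]
```

Using a set intersection keeps an entity as soon as any one of its types is in the whitelist, so entities with several types do not need special handling.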

 

In addition, we retrieved the following properties for every entity from DBpedia:

artist_properties = ['activeYearsStartYear','homeTown','birthPlace','genre','instrument','recordLabel','associatedBand','associatedMusicalArtist','bandMember','formerBandMember','mentor']
work_properties = ['writer','artist','genre','recordLabel','album','musicalArtist','musicalBand','releaseDate','producer','recordedIn']
label_properties = ['location','parentCompany','genre','foundedBy']
genre_properties = ['stylisticOrigin','instrument','subject']
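
One way to retrieve such properties is to build a SPARQL query per entity, as sketched below. The sketch only builds the query string and sends nothing over the network. It makes several assumptions that the text above does not state: the mapping from entity type to property list, the use of the http://dbpedia.org/ontology/ namespace for every property (some may live in other namespaces), and the helper name build_query. Only two of the four lists are shown to keep the example short.

```python
# Property lists copied from the text above.
label_properties = ['location', 'parentCompany', 'genre', 'foundedBy']
genre_properties = ['stylisticOrigin', 'instrument', 'subject']

# Assumed mapping from entity type to property list (inferred from the list names).
PROPERTIES_BY_TYPE = {
    'RecordLabel': label_properties,
    'MusicGenre': genre_properties,
}

# Assumed namespace for all properties; adjust if a property lives elsewhere.
DBO = "http://dbpedia.org/ontology/"


def build_query(entity_uri, entity_type):
    """Return a SPARQL query string for the entity, or None for unmapped types."""
    props = PROPERTIES_BY_TYPE.get(entity_type)
    if not props:
        return None
    values = " ".join("<%s%s>" % (DBO, p) for p in props)
    return (
        "SELECT ?property ?value WHERE { "
        "VALUES ?property { %s } "
        "<%s> ?property ?value . }" % (values, entity_uri)
    )


query = build_query("http://dbpedia.org/resource/Shoegazing", "MusicGenre")
```

The resulting string could then be sent to a SPARQL endpoint with a client of your choice; a single VALUES clause fetches all listed properties for an entity in one request.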