Espinosa-Anke, L., Oramas, S., Saggion, H., & Serra, X. ELMDist: A vector space model with words and MusicBrainz entities. Workshop on Semantic Deep Learning (SemDeep), collocated with ESWC 2017.
Music consumption habits, as well as the music market, have changed dramatically due to the increasing popularity of digital audio and streaming services. Today, users are closer than ever to a vast number of songs, albums, artists and bands. However, the challenge remains of how to make sense of all the data available in the music domain, and how the current state of the art in Natural Language Processing and semantic technologies can contribute to Music Information Retrieval areas such as music recommendation, artist similarity or automatic playlist generation. In this paper, we present and evaluate a distributional sense-based embeddings model in the music domain, which can be easily used for these tasks, as well as a device for improving artist or album clustering. The model is trained on a disambiguated corpus linked to the MusicBrainz musical Knowledge Base with an estimated precision above 0.9, and, following current knowledge-based approaches to sense-level embeddings, entity-related vectors are provided à la WordNet, concatenating the id of the entity and its mention (in WordNet lingo, the entity’s synset and sense). The model is evaluated both intrinsically and extrinsically in a supervised entity typing task, and released for the use and scrutiny of the community.
Additional material:
ELMDist: a sense-level embeddings model in the music domain, trained on a music-specific corpus of artist biographies in which musical entities have been automatically annotated with high precision against the MusicBrainz (MB) musical knowledge base.
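To make the "entity id plus mention" idea concrete, the following is a minimal Python sketch of how an annotated sentence could be turned into sense-level tokens before training. It is an illustration only: the separator, the lowercasing, the helper names (sense_token, annotate) and the placeholder ids are assumptions, and the actual token format used in the released vectors may differ.

```python
from typing import List, Tuple

# Hypothetical separator between entity id and mention; the released
# vectors may follow a different convention.
SEP = "_"


def sense_token(entity_id: str, mention: str, sep: str = SEP) -> str:
    """Join an entity id and its surface mention into one vocabulary token."""
    return f"{entity_id}{sep}{mention.lower().replace(' ', sep)}"


def annotate(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Replace each (start, end, entity_id) span with a single sense token.

    Spans use end-exclusive token offsets and are assumed not to overlap.
    Tokens outside any span are kept as plain lowercased words.
    """
    out: List[str] = []
    pos = 0
    for start, end, entity_id in sorted(spans):
        out.extend(t.lower() for t in tokens[pos:start])
        out.append(sense_token(entity_id, " ".join(tokens[start:end])))
        pos = end
    out.extend(t.lower() for t in tokens[pos:])
    return out


# Placeholder ids, not real MusicBrainz identifiers.
sentence = ["The", "Beatles", "released", "Abbey", "Road"]
entity_spans = [(0, 2, "mbid-artist-0001"), (3, 5, "mbid-album-0002")]
print(annotate(sentence, entity_spans))
```

With tokens built this way, each disambiguated entity mention becomes its own vocabulary item, so a standard word2vec trainer learns one vector per entity sense alongside the vectors for ordinary words.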
The pretrained sense-level word2vec model, with entities linked to MusicBrainz, can be downloaded from: http://mtg.upf.edu/system/files/projectsweb/elmdist_vectors.zip
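Once downloaded, vectors of this kind can be queried for nearest neighbours, for instance to explore artist similarity. The sketch below uses only the Python standard library and assumes the vectors are stored in the plain-text word2vec format (a "count dimension" header followed by one "token v1 ... vN" row per line). Whether the archive uses that format is an assumption, and the toy tokens and values are invented for illustration.

```python
import io
import math
from typing import Dict, List, TextIO, Tuple


def read_word2vec_text(handle: TextIO) -> Dict[str, List[float]]:
    """Parse plain-text word2vec vectors from an open text handle."""
    _count, dim = (int(x) for x in handle.readline().split())
    vectors: Dict[str, List[float]] = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed rows
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(
    vectors: Dict[str, List[float]], query: str, topn: int = 3
) -> List[Tuple[str, float]]:
    """Rank every other token by cosine similarity to the query token."""
    q = vectors[query]
    scored = [(tok, cosine(q, v)) for tok, v in vectors.items() if tok != query]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


# Toy stand-in for a vectors file; tokens and values are made up.
toy = io.StringIO(
    "3 2\n"
    "artist_a 1.0 0.0\n"
    "artist_b 0.9 0.1\n"
    "album_c 0.0 1.0\n"
)
vectors = read_word2vec_text(toy)
print(most_similar(vectors, "artist_a", topn=1))
```

For real files, gensim's KeyedVectors.load_word2vec_format covers both the text and binary variants of this format and is likely the more practical choice; the hand-written reader here only serves to show what the lookup involves.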
If you want to retrain the vectors, ELMD 2.0 can be downloaded from here:
http://mtg.upf.edu/download/datasets/elmd
You can then train the model by running train_word2vec.py.
- Link to the workshop, including open access to the publication