
Textual sources of musicology, analysed from the perspective of natural language technologies

Through a one-of-a-kind resource jointly developed by the Music Technology and Natural Language Processing research groups of the Department of Information and Communication Technologies.

12.05.2016

 

Album reviews, biographies of musicians and artists, song lyrics and similar texts: “all this information is extremely useful for music technology researchers”, states Horacio Saggion, a member of the Natural Language Processing (TALN) research group.

In fact, automatically extracting information about the musical entities (artists, albums, songs or record companies) described in these textual sources is important for creating and extending musical knowledge bases, which can be used not only in artist and song recommendation systems but also, more generally, for research in musicology.

This idea arose from a collaboration between the Music Technology Group (MTG), coordinated by Xavier Serra, and the Natural Language Processing research group (TALN), to which Horacio Saggion belongs. Both research groups are attached to the Department of Information and Communication Technologies (DTIC) of Pompeu Fabra University.

92,000 artists, albums, songs, and record companies

The idea has brought together research in music technology and natural language processing to develop an automatic system for the semantic annotation of musical entities in “free” text, so that the annotated texts are linked to open knowledge bases such as Wikipedia.

As part of this collaboration, the researchers have created a new resource called Entity Linking in the Music Domain (ELMD), unique of its kind: an automatic semantic annotation system that, working from biographies of musicians, has so far compiled more than 92,000 entries comprising artists’ names (64,873), albums (16,302), songs (8,275) and record companies (3,480). It is a free-access resource for the whole community.

Sergio Oramas, Mohamed Sordo and Xavier Serra (MTG), together with Luis Espinosa-Anke and Horacio Saggion (TALN), will be presenting the technical and analytical details of this new resource at the 10th edition of the Language Resources and Evaluation Conference (LREC), to be held from 23 to 28 May in Portorož (Slovenia).

Reference work:

Sergio Oramas, Luis Espinosa-Anke, Mohamed Sordo, Horacio Saggion, Xavier Serra (2016), “ELMD: An Automatically Generated Entity Linking Gold Standard Dataset in the Music Domain”, 10th Edition of the Language Resources and Evaluation Conference, 23-28 May 2016, Portorož (Slovenia).

Image credits:

By Unknown - http://lonestarstomp.blogspot.com/2008/12/much-busier-days-in-kermit-texas-post.html, Public Domain, https://commons.wikimedia.org/w/index.php?curid=7558657
