Oramas S, Espinosa-Anke L, Sordo M, Saggion H, Serra X. ELMD: An Automatically Generated Entity Linking Gold Standard Dataset in the Music Domain. Proceedings of the Language Resource and Evaluation Conference 2016
In this paper we present a gold standard dataset for Entity Linking (EL) in the Music Domain. It contains thousands of musical named entities such as Artist, Song or Record Label, which have been automatically annotated on a set of artist biographies from the music website and social network Last.fm. The annotation process relies on the analysis of the hyperlinks present in the source texts and on a voting-based algorithm for EL, which considers, for each entity mention in text, the degree of agreement across three state-of-the-art EL systems. Manual evaluation shows that EL Precision is at least 94%, and because the approach is tunable, it is possible to derive annotations favouring higher Precision or Recall, at will. We make available the annotated dataset along with evaluation data and the code.
Keywords: entity linking, language resources, music information retrieval
Additional material:
- Code on GitHub: ELVIS (Entity Linking Voting and Integration System), a framework to homogenize and combine the output of different entity linking tools, using the level of agreement as a confidence score
- ELMD dataset and evaluation data
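
The voting idea described in the abstract can be illustrated with a minimal Python sketch. This is not the ELVIS code: the `Annotation` type, the `vote` function, the exact-span matching rule, the `min_agreement` parameter and the example URIs are all assumptions made for illustration. Each EL system is assumed to return a list of (start, end, uri) annotations for the same text, and a mention is kept only when enough systems agree on its link.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Annotation(NamedTuple):
    """Hypothetical homogenized output format shared by all systems."""
    start: int  # character offset where the mention begins
    end: int    # character offset where the mention ends (exclusive)
    uri: str    # knowledge-base identifier proposed by the system


def vote(system_outputs, min_agreement=2):
    """Keep spans whose most-voted URI is proposed by >= min_agreement systems.

    Returns (Annotation, confidence) pairs sorted by start offset, where
    confidence is the fraction of systems that proposed that URI for the span.
    Assumption: two systems agree only if span and URI match exactly.
    """
    n_systems = len(system_outputs)
    votes = defaultdict(Counter)
    for annotations in system_outputs:
        # set() ensures a system votes at most once per (span, uri) pair
        for ann in set(annotations):
            votes[(ann.start, ann.end)][ann.uri] += 1

    merged = []
    for (start, end), counter in votes.items():
        uri, count = counter.most_common(1)[0]
        if count >= min_agreement:
            merged.append((Annotation(start, end, uri), count / n_systems))
    return sorted(merged, key=lambda pair: pair[0].start)


# Made-up outputs of three systems for one biography sentence
system_a = [Annotation(0, 11, "dbpedia:The_Beatles"),
            Annotation(25, 34, "dbpedia:Liverpool")]
system_b = [Annotation(0, 11, "dbpedia:The_Beatles"),
            Annotation(25, 34, "dbpedia:Liverpool_F.C.")]
system_c = [Annotation(0, 11, "dbpedia:The_Beatles")]

strict = vote([system_a, system_b, system_c], min_agreement=2)
lenient = vote([system_a, system_b, system_c], min_agreement=1)
```

In this sketch, raising `min_agreement` keeps only mentions on which more systems agree, which would favour Precision, while lowering it keeps more mentions and would favour Recall. With `min_agreement=1`, ties between URIs for the same span are broken arbitrarily, which a real system would need to handle explicitly.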