We maintain datasets, repositories, frameworks and tools relevant to research and technology transfer initiatives in knowledge extraction. This section gives an overview of a selection of them, with download links or contact details.

The MdM Strategic Research Program has its own community in Zenodo for material available in that repository, as well as in the UPF e-repository. Below is a non-exhaustive list of datasets representative of the research in the Department.

To promote the availability of resources, specific Zenodo communities have also been created at the level of research communities (for instance, MIR and Educational Data Analytics) and MSc programs (for instance, the Master in Sound and Music Computing).

 

 

Saggion H, AbuRa'ed A, Ronzano F. Trainable citation-enhanced summarization of scientific articles. Proceedings of the Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL2016). CEUR Workshop Proceedings.

In order to cope with the growing number of relevant scientific publications to consider at a given time, automatic text summarization is a useful technique. However, summarizing scientific papers poses important challenges for the natural language processing community. In recent years a number of evaluation challenges have been proposed to address the problem of summarizing a scientific paper taking advantage of its citation network (i.e., the papers that cite the given paper). Here, we present our trainable technology to address a number of challenges in the context of the 2nd Computational Linguistics Scientific Document Summarization Shared Task.

Additional material: