We provide datasets, repositories, frameworks, and tools relevant to research and technology transfer initiatives in knowledge extraction. This section gives an overview of a selection of them, with download links or contact details.

The MdM Strategic Research Program has its own community in Zenodo for material available in that repository, as well as in the UPF e-repository. Below is a non-exhaustive list of datasets representative of the research in the Department.

To promote the availability of resources, the creation of specific Zenodo communities has also been encouraged, at the level of research communities (for instance, MIR and Educational Data Analytics) or MSc programs (for instance, the Master in Sound and Music Computing).

Saggion H, Ronzano F, Accuosto P, Ferrés D. MultiScien: a Bi-Lingual Natural Language Processing System for Mining and Enrichment of Scientific Collections. 2nd Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2017), at SIGIR 2017

In the current online Open Science context, scientific datasets and tools for deep text analysis, visualization and exploitation play a major role. We present a system for deep analysis and annotation of scientific text collections. We also introduce the first version of the SEPLN Anthology, a bi-lingual (Spanish and English) fully annotated text resource in the field of natural language processing that we created with our system. Moreover, a faceted-search and visualization system to explore the created resource is introduced. All resources created for this paper will be available to the research community.

Additional material: