We have datasets, repositories, frameworks and tools relevant to research and technology transfer initiatives in knowledge extraction. This section gives an overview of a selection of them, with download links or contact details.

The MdM Strategic Research Program has its own community in Zenodo for material available in that repository, as well as at the UPF e-repository. Below is a non-exhaustive list of datasets representative of the research in the Department.

To promote the availability of resources, specific communities have also been created in Zenodo, at the level of research communities (for instance, MIR and Educational Data Analytics) or MSc programs (for instance, the Master in Sound and Music Computing).

 

 

Espinosa-Anke, L, Camacho-Collados, J, Rodríguez-Fernández, S, Saggion, H, Wanner, L. Extending WordNet with Fine-Grained Collocational Information via Supervised Distributional Learning. 26th International Conference on Computational Linguistics (COLING 2016)


 

WordNet is probably the best-known lexical resource in Natural Language Processing. While it is widely regarded as a high-quality repository of concepts and semantic relations, updating and extending it manually is costly. One important type of information that could add enormous value to WordNet is collocational information, which is paramount in tasks such as Machine Translation, Natural Language Generation and Second Language Learning. In this paper, we present ColWordNet (CWN), an extended version of WordNet with fine-grained collocational information, introduced automatically by a method that exploits linear relations between analogous sense-level embedding spaces. We perform both intrinsic and extrinsic evaluations, and release CWN for the use and scrutiny of the community.
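To make the idea of a linear relation between embedding spaces more concrete, the following is a minimal sketch of one common way to learn such a relation: fitting a transformation matrix by least squares on training pairs and then ranking candidates by cosine similarity to the mapped vector. It is an illustration only and is not taken from the paper or from the CWN release. The function names `learn_linear_map` and `rank_candidates`, the vector dimensionality, and the random toy data are all assumptions made for this example.

```python
import numpy as np


def learn_linear_map(base_vecs, colloc_vecs):
    """Fit a matrix W by least squares so that base_vecs @ W approximates colloc_vecs.

    Both arguments are arrays of shape (n_pairs, dim); row i of each array
    is one training pair (hypothetical base and collocate embeddings).
    """
    W, *_ = np.linalg.lstsq(base_vecs, colloc_vecs, rcond=None)
    return W


def rank_candidates(base_vec, W, candidate_vecs):
    """Return candidate indices sorted by cosine similarity to the mapped base vector."""
    projected = base_vec @ W
    norms = np.linalg.norm(candidate_vecs, axis=1) * np.linalg.norm(projected)
    sims = candidate_vecs @ projected / np.maximum(norms, 1e-12)
    return np.argsort(-sims)


# Toy data (assumption): random vectors stand in for sense-level embeddings,
# and the "collocate" space is an exact linear image of the "base" space.
rng = np.random.default_rng(0)
true_map = rng.normal(size=(5, 5))
bases = rng.normal(size=(50, 5))
collocates = bases @ true_map

W = learn_linear_map(bases, collocates)
ranking = rank_candidates(bases[0], W, collocates[:10])
```

On this noise-free toy data the fitted matrix recovers the generating map, so the first base vector's own collocate is ranked first. With real embeddings the mapping would only be approximate, and the quality of the ranking would need to be evaluated empirically.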

Additional material: