We maintain datasets, repositories, frameworks and tools relevant to research and technology transfer initiatives in knowledge extraction. This section gives an overview of a selection of them, with download links or contact details.

The MdM Strategic Research Program has its own community in Zenodo for material available in this repository, as well as at the UPF e-repository. Below is a non-exhaustive list of datasets representative of the research in the Department.

To promote the availability of resources, specific communities have also been created in Zenodo, at the level of research communities (for instance, MIR and Educational Data Analytics) and MSc programs (for instance, the Master in Sound and Music Computing).

 

 

Rodríguez-Fernández S, Espinosa-Anke L, Carlini R, Wanner L. Semantics-Driven Recognition of Collocations Using Word Embeddings. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL 2016).

 

L2 learners often produce “ungrammatical” word combinations such as *give a suggestion or *make a walk. This is because of the “collocationality” of one of their items (the base), which limits the collocates that are accepted to express a specific meaning (‘perform’ above). We propose an algorithm that delivers, for a given base and the intended meaning of a collocate, the actual collocate lexeme(s) (make / take above). The algorithm learns a linear mapping between bases and collocates from examples, producing a collocation transformation matrix that is then applied to novel, unseen cases. The evaluation shows that this is a promising line of research in collocation discovery.
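
The idea of a transformation matrix learned from base/collocate examples can be illustrated with a minimal sketch. It is not the authors' implementation: the ordinary least-squares fit, the cosine ranking, the function names (`learn_transformation`, `rank_collocates`) and the synthetic vectors are all assumptions made for illustration. In practice the vectors would be word embeddings of real bases and collocates for one meaning (e.g. ‘perform’).

```python
import numpy as np


def learn_transformation(base_vecs, collocate_vecs):
    """Fit W minimising ||base_vecs @ W - collocate_vecs||^2 (least squares).

    Assumption: a plain least-squares fit stands in for whatever training
    procedure the paper actually uses.
    """
    W, *_ = np.linalg.lstsq(base_vecs, collocate_vecs, rcond=None)
    return W


def rank_collocates(base_vec, W, vocab_words, vocab_vecs, top_k=3):
    """Map a base into collocate space and rank candidates by cosine similarity."""
    predicted = base_vec @ W
    norms = np.linalg.norm(vocab_vecs, axis=1) * np.linalg.norm(predicted)
    sims = (vocab_vecs @ predicted) / np.maximum(norms, 1e-12)
    best = np.argsort(-sims)[:top_k]
    return [(vocab_words[i], float(sims[i])) for i in best]


# Synthetic demo: random vectors replace real embeddings (hypothetical data).
rng = np.random.default_rng(0)
dim = 5
true_map = rng.normal(size=(dim, dim))       # hidden base -> collocate mapping
train_bases = rng.normal(size=(20, dim))     # 20 training bases
train_collocates = train_bases @ true_map    # their collocates

W = learn_transformation(train_bases, train_collocates)

# An unseen base; the candidate list holds its true collocate plus distractors.
new_base = rng.normal(size=dim)
vocab_words = ["target", "distractor_0", "distractor_1", "distractor_2"]
vocab_vecs = np.vstack([new_base @ true_map, rng.normal(size=(3, dim))])

print(rank_collocates(new_base, W, vocab_words, vocab_vecs, top_k=2))
```

Because the synthetic collocates are an exact linear function of the bases, the fit recovers the mapping almost perfectly. With real embeddings the relation is only approximate, so the ranking would be noisier.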

Additional material: