[PhD thesis] Knowledge acquisition in the information age: the interplay between lexicography and natural language processing
Author: Luis Espinosa-Anke
Supervisor: Horacio Saggion
Natural Language Processing (NLP) is the branch of Artificial Intelligence concerned with understanding and generating language as closely as possible to the way humans do. Today, NLP benefits substantially from large amounts of unannotated corpora, from which it derives state-of-the-art resources for text understanding such as vector representations or knowledge graphs. In addition, NLP leverages structured and semi-structured information in the form of ontologies, knowledge bases (KBs), encyclopedias or dictionaries. In this dissertation, we present several improvements in NLP tasks such as Definition and Hypernym Extraction, Hypernym Discovery, Taxonomy Learning, and KB construction and completion. In all of them we take advantage of knowledge repositories of various kinds, showing that these are essential enablers of text understanding. Conversely, we use NLP techniques to create, improve or extend existing repositories, and we release them along with the associated code for use by the community.
Additional material:
- Open access version available at TDR repository
- Datasets and software: https://bitbucket.org/luisespinosa/