The Knowledge Based Natural Interaction Lab conducts research on the different stages of an end-to-end processing pipeline: text is analyzed to extract knowledge, that knowledge is used to build an internal representation of the content, and the representation is then transformed and used to generate a new text with a different objective (for example, a summary in a different language).
(Corpus-based) Lexicology and Lexicography and their Application in Computer Assisted Language Learning
Within the field of lexicography and lexicology, we investigate problems related to idiosyncratic word co-occurrences (or collocations): their theoretical description, their semantically oriented classification, their automatic recognition in text corpora, and their sound representation in dictionaries, both for human and for machine use. We are also interested in problems related to collocations in computer assisted language learning (CALL).
Divergences in structural codification across languages
So-called "mismatches" across languages are an intriguing research topic in linguistics and one of the biggest challenges in computational linguistics applications such as Machine Translation. We are interested in all types of mismatches: morphological, syntactic, lexical, semantic, and communicative. The literature has mainly discussed syntactic mismatches; the other types, although equally important, have received much less attention. In collaboration with Prof. I. Mel'cuk from the University of Montreal, we are working on a formal model of mismatches in the framework of the Meaning-Text Theory.
Automatic Synthesis of multimodal and multilingual information
Automatic synthesis (and, in particular, generation) of multilingual information is one of our principal fields of research. Special areas of interest include discourse planning, sentence planning, large-scale generation grammars, and the role of the addressee in the process of generation. An important concern is the development of operational generators that can be applied in practice. Another topic that increasingly dominates our research agenda is the interactive production of multimedia narrations.
The analysis of text content and the extraction of knowledge from it is a crucial task: the input text must be understood and transformed in order to generate a new output that merges data from different sources. Our processing runs from surface to deep text analysis, through word sense disambiguation and entity linking, and finally populates ontologies and knowledge bases that are later used in text planning and text generation.