[MSc thesis] Term extraction and document similarity in an Integrated Learning Design Environment

Author: Alberto Martínez Rodríguez

Supervisors: Davinia Hernández Leo, Horacio Saggion

MSc program: Master in Intelligent Interactive Systems

The Integrated Learning Design Environment (ILDE) is a social platform focused on supporting teachers in the computer-assisted design of learning activities. On this platform, teachers and course designers can contextualize, author and share their designs within their community. This social component of the ILDE would benefit from Information Retrieval and Natural Language Processing techniques that help teachers and course designers find shared designs as quickly and efficiently as possible. In this work, we use Natural Language Processing to classify learning designs written in Catalan: we retrieve the users' content, parse it with Freeling and extract education domain-specific terminology from the documents. The terminology is extracted with a combination of two methods. The first method uses the Multilingual Central Repository ontology to check whether a term belongs to any of four pedagogical fields. The second method computes the tf-idf of all the terms in the documents using a non-domain-specific corpus, the Catalan Wikipedia. This work also discusses the potential of the proposed combination of methods to retrieve simple and complex terms from documents. The combined method weights the contribution of each method in the extraction process to assign a score to each retrieved term. After extracting education domain-specific terminology from different ILDE documents, we created a Document Similarity Application aimed at teachers and course designers. This application allows users to search for documents based on their similarity to another document of the same ILDE community. In addition, given a document, users can visualize the education terminology that belongs to it. Finally, users can also search for documents using a terminology-based query, obtaining a set of documents and their similarity to that query.
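The abstract does not give the exact scoring formula, so the following Python sketch is only one plausible reading of the combined method, not the thesis implementation. It assumes documents are already lemmatized (standing in for the Freeling step), uses a hypothetical `domain_terms` set in place of the Multilingual Central Repository lookup, computes idf from a tiny toy background corpus in place of the Catalan Wikipedia, and mixes both signals with a hypothetical weight `alpha`. Cosine similarity over the resulting term scores is likewise an assumed choice for the document similarity step.

```python
import math
from collections import Counter


def background_idf(background_docs):
    """Build a smoothed idf function from a general-domain corpus."""
    n_docs = len(background_docs)
    doc_freq = Counter()
    for doc in background_docs:
        doc_freq.update(set(doc))  # count each term once per document

    def idf(term):
        # Add-one smoothing: terms unseen in the background corpus
        # receive the highest, but still finite, idf.
        return math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0

    return idf


def score_terms(doc_tokens, idf, domain_terms, alpha=0.5):
    """Score each term as alpha * ontology hit + (1 - alpha) * tf-idf.

    The tf-idf part is divided by the document maximum so that both
    components lie in [0, 1]. `alpha` is an assumed mixing weight.
    """
    if not doc_tokens:
        return {}
    tf = Counter(doc_tokens)
    tfidf = {t: (n / len(doc_tokens)) * idf(t) for t, n in tf.items()}
    max_tfidf = max(tfidf.values())
    return {
        t: alpha * (1.0 if t in domain_terms else 0.0)
        + (1 - alpha) * (value / max_tfidf)
        for t, value in tfidf.items()
    }


def cosine_similarity(scores_a, scores_b):
    """Cosine similarity between two sparse term-score vectors."""
    shared = set(scores_a) & set(scores_b)
    dot = sum(scores_a[t] * scores_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in scores_a.values()))
    norm_b = math.sqrt(sum(v * v for v in scores_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy data, invented for illustration only.
background = [
    ["el", "gat", "dormir"],
    ["el", "alumne", "llegir"],
    ["el", "cotxe", "córrer"],
]
domain_terms = {"avaluació", "alumne", "aprenentatge"}  # hypothetical
idf = background_idf(background)

doc_a = ["el", "alumne", "fer", "avaluació", "avaluació"]
doc_b = ["el", "avaluació", "de", "aprenentatge"]

scores_a = score_terms(doc_a, idf, domain_terms)
scores_b = score_terms(doc_b, idf, domain_terms)

# Rank the terms of one document and compare the two documents.
print(sorted(scores_a, key=scores_a.get, reverse=True))
print(cosine_similarity(scores_a, scores_b))
```

Because the idf comes from a general-domain corpus, everyday words such as "el" are pushed down while words that are rare outside the education domain rise, and the ontology term then boosts terms that fall in a pedagogical field. A terminology-based query can be handled the same way by scoring the query terms as a short document and comparing it against each stored document.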

Additional material:

Open access version at UPF e-repository