List of published results directly linked to the projects co-funded by the Spanish Ministry of Economy and Competitiveness under the María de Maeztu Units of Excellence Program (MDM-2015-0502).

List of publications acknowledging the funding in Scopus.

The record for each publication will include access to postprints (following the Open Access policy of the program), as well as datasets and software used. Ongoing work with UPF Library and Informatics will improve the interface and automation of the retrieval of this information soon.

The MdM Strategic Research Program has its own community in Zenodo for material available in this repository, as well as at the UPF e-repository.




[MSc thesis] Term extraction and document similarity in an Integrated Learning Design Environment

Author: Alberto Martínez Rodríguez

Supervisors: Davinia Hernández Leo, Horacio Saggion

MSc program: Master in Intelligent Interactive Systems

The Integrated Learning Design Environment (ILDE) is a social platform focused on supporting teachers in the computer-assisted design of learning activities. In this platform, teachers and course designers can contextualize, author and share their designs within their community. This social component of the ILDE would benefit from the application of Information Retrieval and Natural Language Processing techniques to help teachers and course designers find shared designs as quickly and efficiently as possible. In this work, we use Natural Language Processing to classify learning designs written in Catalan, retrieve the users' content, parse this content with Freeling, and extract education domain-specific terminology from the documents. To extract the terminology, a combination of two methods is used. The first method uses the Multilingual Central Repository ontology to check whether a term belongs to any of four pedagogical fields. The second method computes the tf-idf of all the documents' terms using a non-domain-specific corpus, the Catalan Wikipedia. This work also discusses the potential of the proposed combination of methods to retrieve simple and complex terms from documents. The resulting combined method weights the contribution of each method in the extraction process to assign a score to each retrieved term. After this process of extracting education domain-specific terminology from different ILDE documents, a Document Similarity Application was created for teachers and course designers. This application allows users to search for documents based on their similarity to another document of the same ILDE community. In addition, given a document, users can visualize the education terminology that belongs to that document. Finally, users can also search for documents using a terminology-based query to obtain a set of documents and their similarity with respect to that query.
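As a rough illustration of the pipeline described in the abstract, the following Python sketch combines a domain-membership check with tf-idf computed against a reference corpus, and then compares two documents by the cosine similarity of their term scores. It is not the thesis implementation. The function names, the weighting parameter `alpha`, the smoothed idf formula and the toy data are all assumptions made for this example: a plain set of terms stands in for the Multilingual Central Repository lookup, a handful of token sets stands in for the Catalan Wikipedia, and whitespace splitting stands in for Freeling.

```python
import math
from collections import Counter


def tf_idf(doc_tokens, reference_docs):
    """tf-idf per term; idf is estimated on a separate reference corpus.

    reference_docs is a list of token sets (hypothetical stand-in for a
    general, non-domain-specific corpus).
    """
    n = len(reference_docs)
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    scores = {}
    for term, count in counts.items():
        df = sum(1 for ref in reference_docs if term in ref)
        idf = math.log((1 + n) / (1 + df)) + 1.0  # smoothed idf (assumption)
        scores[term] = (count / total) * idf
    return scores


def combined_scores(doc_tokens, reference_docs, domain_terms, alpha=0.5):
    """Weighted mix of domain membership (0 or 1) and normalized tf-idf.

    alpha is a hypothetical weight; domain_terms is a plain set standing in
    for an ontology lookup.
    """
    tfidf = tf_idf(doc_tokens, reference_docs)
    top = max(tfidf.values(), default=0.0) or 1.0  # avoid division by zero
    return {
        term: alpha * (1.0 if term in domain_terms else 0.0)
        + (1.0 - alpha) * (value / top)
        for term, value in tfidf.items()
    }


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-score dictionaries."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy data, invented for illustration only.
reference_docs = [{"el", "gat"}, {"el", "gos"}, {"la", "casa"}]
domain_terms = {"professor", "aprenentatge", "avaluacio"}

doc_a = "el professor avalua el aprenentatge".split()
doc_b = "el professor prepara la avaluacio".split()

scores_a = combined_scores(doc_a, reference_docs, domain_terms)
scores_b = combined_scores(doc_b, reference_docs, domain_terms)
similarity = cosine_similarity(scores_a, scores_b)
```

In this sketch a term-based query can be handled the same way as a document: score the query terms with `combined_scores` and rank the stored documents by `cosine_similarity` against it.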

Additional material:

Open access version at UPF e-repository