Title & Description

Supervisor(s)

Social media monitoring for disaster risk reduction 

Our group has contributed to the development of social media monitoring tools for flood risk reduction within a platform developed by the Joint Research Centre (JRC) that is to be incorporated into EFAS. We would like to extend this tool to consider other types of risk, including forest fires, earthquakes, storms, and so on. This topic is fairly practical and application-oriented, and requires solid knowledge of Python. Technologies to be mastered during the process include online/streaming algorithms for text clustering, deep learning for text classification, and the integration of these tools. A good starting point exists in the form of a code base for the platform and training sets.
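For illustration only, a minimal sketch of the kind of online/streaming text clustering involved, using scikit-learn (an assumption; the actual JRC/EFAS code base and methods may differ):

    # Hedged sketch: cluster an incoming stream of social media posts in mini-batches.
    from sklearn.feature_extraction.text import HashingVectorizer
    from sklearn.cluster import MiniBatchKMeans

    vectorizer = HashingVectorizer(n_features=2**16, alternate_sign=False)  # stateless, stream-friendly
    clusterer = MiniBatchKMeans(n_clusters=2, random_state=0)               # tiny value for this toy example

    def process_batch(posts):
        """Incrementally cluster one mini-batch of posts as it arrives."""
        X = vectorizer.transform(posts)   # no fitting needed, works on a stream
        clusterer.partial_fit(X)          # online update of the cluster centres
        return clusterer.predict(X)       # cluster id per post

    labels = process_batch(["river overflowing near the bridge",
                            "heavy rain all night, streets flooded"])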

Carlos Castillo

Gender bias in online platforms 

This project will analyze and study potential gender biases of online platforms. These may include job search engines, e-commerce websites, search engines, and platform economy sites for which some data could be scraped/obtained. The specific platform is to be decided with the master student. The work will involve data scraping/collection, measurement through fair ranking/recommendation algorithms, and analysis of potential biases found. It requires solid knowledge of Python and a desire to work within a framework of data feminism, which seeks to eliminate gender biases in data collection and data processing pipelines.

Carlos Castillo

Risk-Prediction Instruments in Criminal Justice 

The Justice Department of Catalonia has for several years been using structured risk-prediction instruments, similar to COMPAS in the US or OASys in the UK, to predict the risk of recidivism and violent recidivism. Correctional officers, psychologists, and social workers apply a questionnaire to juvenile offenders and to people who have been convicted of a crime (these are different questionnaires); the questionnaire is then processed using a model trained on thousands of previous cases, and an output is produced, which is then interpreted by the person applying the instrument. Detailed anonymized data is available for studying to what extent different instruments can be biased against different sub-populations (e.g., immigrants or children of immigrants). Additionally, we would like to explore the potential for new risk-prediction instruments. The work would be done in collaboration with researchers in criminology at the University of Barcelona and at the Justice Department of Catalonia.

Carlos Castillo

Discrimination in house rental market 

This project will analyze and study potential discrimination occurring in the housing rental market on an online platform. With a recent sociological study demonstrating the existence of certain levels of discrimination on the side of real estate agents, the goal of this project will be to understand how much this discrimination is reflected, mitigated or amplified by the recommender systems that operate in online platforms. To develop the project, the student will study the existence of such inequities based on the ethnicity, gender and sexual orientation (among other features) of the applicants, and how the recommendations of the platform vary depending on such characteristics. It requires solid knowledge of Python for data scraping, data collection, and data cleaning for the analysis.

Carlos Castillo

David Solans

Aspect-oriented sentiment analysis 

Sentiment analysis (often also referred to as "opinion mining") deals with the identification of the sentiments of the author of a text (e.g., a review of a product) towards a topic, object, action, etc. Often, a single sentiment (positive or negative) is assigned to the whole text, which is obviously a simplification, since an author can find some aspects of the considered topic/object/action positive and some negative. The thesis will address the development of a linguistically motivated deep learning algorithm for aspect-oriented sentiment analysis that will be able to recognize not only the polarity of the sentiment ('positive' vs. 'negative') towards an aspect, but also any emotion associated with this sentiment ('happy', 'ambivalent', 'sad', etc.) in a dataset related to the perception of arts.

Leo Wanner

Alexander Shvets

Aspect-oriented offensive speech analysis 

The detection of offensive speech online, which intends to insult, humiliate or hurt individuals or groups of people, is an increasingly popular research topic. However, state-of-the-art solutions focus merely on the classification of a social media post or blog contribution in terms of a predefined typology (e.g., 'hate speech', 'sexist', 'racist', ...); no analysis is made of which aspect(s) of the targeted individuals/groups the offence is directed at. In this thesis, models will be developed that address this challenge. Newspaper blogs and social media datasets will be used.

Leo Wanner

Juan Soler Company

Multilingual Neural Natural Language Text Generation 

Deep learning models have become common for natural language generation. However, so far, they focus predominantly on the generation of isolated sentences from shallow linguistic (syntactic) structures. In the current thesis, models for the generation of paragraph-long texts from generalized syntactic graphs will be explored. The models will, in particular, be able to capture co-references (as in, e.g., "Barcelona is a lovely city. I have lived here for more than 15 years.") and the coherent structure of the narrative. The developed models will be tested on available large datasets for a number of languages.

Leo Wanner

Deep reinforcement learning-driven dialogue management 

In order to ensure a flexible, coherent conversation between a human and a machine that goes beyond predefined information exchange patterns, advanced dialogue management strategies must be developed, which take the history and the goals of the conversation into account and which are able to handle interruptions, side sequences, grounding and other phenomena of natural dialogue. Neural network-based Reinforcement Learning models have been shown to have the potential to cope with these challenges. In the current thesis, such a model will be explored on a dataset composed of job interviews. The compilation of the dataset forms part of the thesis. This thesis will be developed in collaboration with the German Research Center for Artificial Intelligence (DFKI).

Leo Wanner

Patrick Gebhard

Identification of Political Bias in Media Coverage 

At the very latest, the Fake News debate revealed the crucial role of the media in shaping political tendencies in a modern society. Unfortunately, even if we leave Fake News aside, mainstream media coverage of events with social, societal or political repercussions is not always objective but, rather, follows a specific political agenda, the interpretation of the event in the light of a specific societal and/or political schema, or an existing (or self-imposed) educational mandate. The thesis will develop strategies for the automatic recognition and classification of political bias in selected media coverage. The strategies will ideally implement an incremental learning mechanism, which will allow for a continuous improvement of their performance.

Leo Wanner

Neural Graph-to-Graph Transduction 

The standard neural network-based models in Natural Language Processing (NLP) are sequence-to-sequence models. In other words, they require a linear sequence of entities as input. This is insufficient, since in many applications the input is constituted by hierarchical linguistic structures (trees or directed acyclic graphs). To address this challenge, hierarchical models such as Graph CNNs or Tree LSTMs have been proposed to project a hierarchical structure onto a sequence. In the current thesis, models will be explored that go one step further in that they will be able to project a given graph structure onto another graph structure of any required complexity (acyclic graph, tree or chain, i.e., linear sequence). The developed model will be tested on a number of different NLP applications. This thesis will be developed in collaboration with Google AI.

Leo Wanner

Bernd Bohnet

Bootstrapping a multilingual collocation dictionary 

Collocations, i.e., idiosyncratic (language-specific) expressions such as "take a walk" and "ask a question" (cf. Spanish dar un paseo, lit. 'give a walk', and hacer una pregunta, lit. 'make a question', respectively), are a great challenge in both natural language processing and second language learning, partly because their meaning is not composed of the meanings of the isolated words that participate in the combination (thus, you don't 'take' or 'give' anything when you go for a walk). The goal of this thesis is to develop a deep learning (neural network) based algorithm for bootstrapping a multilingual English - L2 dictionary of collocations from large corpora, assigning a meaning to the extracted collocations in accordance with a given typology. The work of the thesis will build upon existing example-based deep learning and word embedding implementations and multilingual corpora. This thesis will be developed in collaboration with Cardiff University.

Leo Wanner

Luis Espinosa Anke

The Atmospheric Pollution Monitoring and Forecasting Network 

The goal of this project is to gain an understanding of the temporal evolution of air quality levels in Catalunya. For that, a large-scale dataset gathered from the Atmospheric Pollution Monitoring and Forecasting Network (XVPCA) will be analyzed. The project considers tasks such as performing statistical analysis, characterizing temporal patterns, learning probabilistic models at the aggregate level or at the individual level of each node, and relating these outcomes to existing forecasting models and the air quality index. The data for the analysis is updated daily and is composed of a large number of pollutants measured at the automatic measurement points of the network from 1991 until now. This project will be developed in collaboration with the Departament de Territori i Sostenibilitat de la Generalitat de Catalunya.

The Atmospheric Pollution Monitoring and Forecasting Network (in CAT)

Vicenç Gómez

Statistical Modeling of Online Discussions 

Online discussion is a core feature of numerous social media platforms and has attracted increasing attention for different and relevant reasons, e.g., the resolution of problems in collaborative editing, question answering and e-learning platforms, the response of online communities to news events, online political and civic participation, etc. This project aims to address, from a probabilistic modeling perspective, some existing challenges, both computational and social, that appear in platforms that enable online discussion: for example, how to deal with scalability issues, how to evaluate and improve the quality of online discussions, or how to generally improve social interaction on such platforms.

Generative models of online discussion threads: state of the art and research challenges

Vicenç Gómez

Distributed control for teams of autonomous UAVs 

The aim of this project is to derive controllers for teams of unmanned aerial vehicles (UAVs). The focus will be on extending the current centralized algorithm to a distributed setting. The simulator used for the centralized version is implemented in ROS. Required MIIS courses: machine learning, autonomous systems, mobile robotics.

Real-Time Stochastic Optimal Control for Multi-agent Quadrotor Systems

Vicenç Gómez

Modelling the cooperative behaviours in real-time environments using Reinforcement Learning 

In this project we will model real-time experimental game-theoretic tasks involving several agents using Reinforcement Learning techniques. The model will be based on Markov Decision Processes (MDPs). The aim is to be able to make predictions about modifications of the experiments and to add increasingly complex features to the model, including the prediction of other agents' behavior and of agent identity. Emergent cooperative behaviors will be studied, for example in the presence of limited resources, as in the Tragedy of the Commons, where agents need to learn to consume resources in a controlled and coordinated way. Requirements: machine learning, autonomous systems.
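For illustration only, a minimal sketch of the kind of model involved: independent Q-learning agents repeatedly deciding whether to overuse a shared resource, a toy stand-in for the Tragedy of the Commons setting (the payoff structure and learning rule below are illustrative assumptions, not the project's actual design):

    import numpy as np

    # Toy shared-resource game: each agent chooses to restrain (0) or overuse (1).
    # If more than half of the agents overuse, the resource collapses for everyone.
    rng = np.random.default_rng(0)
    N_AGENTS, N_ACTIONS, EPISODES = 4, 2, 5000
    ALPHA, EPSILON = 0.1, 0.1
    Q = np.zeros((N_AGENTS, N_ACTIONS))  # single-state (stateless) Q-values per agent

    def rewards(actions):
        if actions.sum() > N_AGENTS // 2:          # resource collapses
            return np.full(N_AGENTS, -1.0)
        return np.where(actions == 1, 2.0, 1.0)    # overusing pays more while the resource survives

    for _ in range(EPISODES):
        explore = rng.random(N_AGENTS) < EPSILON
        actions = np.where(explore, rng.integers(N_ACTIONS, size=N_AGENTS), Q.argmax(axis=1))
        r = rewards(actions)
        idx = np.arange(N_AGENTS)
        Q[idx, actions] += ALPHA * (r - Q[idx, actions])

    print(Q)  # inspect whether agents learned to restrain consumption

A full version would add state (e.g., the remaining resource level) to recover a proper MDP and would model the identity and behavior of the other agents.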

Vicenç Gómez

Martí Sanchez-Fibla

Greenhouse Gas Emissions: Are Countries Reporting Fabricated Data? 

The levels of Greenhouse Gas (GHG) emissions are reported by countries under the United Nations Framework Convention on Climate Change. This data is crucial for the future of humanity on Earth. Economic decisions worth billions about global development and finance depend upon this data worldwide. The veracity of the reported figures has been questioned recently, after some records appeared to contradict real measurements coming from air monitoring stations. The uncertainty of the data has been highlighted for transition countries, but the most visible reasons for concern actually appeared in Europe and in the United States. Current scientific approaches in climate science and the human dimensions of global change lack appropriate tools to identify actively reported false data. There is not yet a conclusive model for identifying unreliable outliers in this type of data. This thesis will consist of the application and/or development of statistical and machine learning techniques for the automatic identification of false data being actively reported by countries. The project is a collaboration between the Climate Service Center (Germany) and UPF.

Vicenç Gómez

Roger Cremades

Generation and Detection of Deepfakes 

In recent years, methods for face swapping and manipulation, commonly known as "deepfakes", have been evolving rapidly. Deepfakes make use of state-of-the-art techniques from the deep learning and computer vision fields, making them increasingly harder to detect even for human evaluators. While there are legitimate applications of these techniques for creating deepfake videos in the audiovisual industry, they have the potential to be abused to harm individuals or to propagate fake news. The focus of this thesis is, first, to review the state of the art and develop techniques for creating deepfakes and, second, to develop new algorithms to detect deepfakes and manipulated media, participating in the "Deepfake Detection Challenge" if results are satisfactory. The project will be carried out at Telefónica Research.

Vicenç Gómez

Ferran Diego

Carlos Segura

Automatic Semantic Representations of Web Documents 

The Web is a huge wealth of information, whose contents are typically accessed by means of search engines. However, search is typically performed by simply finding matching keywords. This approach ignores important aspects of the documents: their topic, degree of objectivity, content quality, readability, etc. The challenge is to (1) identify ways to automatically extract those attributes from web documents and (2) find a representation of the document that allows fast matching with user requests. The goal of this thesis is to develop an unsupervised vector representation method to capture and summarize semantic aspects of web documents to be used in a full-scale web search engine. To this end, we will explore/extend word and document embedding algorithms like fastText, ELMo and BERT for web document retrieval. This project will be developed in collaboration with NTENT.
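A minimal sketch of the underlying "document as vector" idea, averaging word embeddings and matching by cosine similarity (the embedding table below is a placeholder; the project itself would rely on fastText/ELMo/BERT-style representations):

    import numpy as np

    # Placeholder word vectors; in practice they would come from fastText, ELMo or BERT.
    EMB = {"flood": np.array([0.9, 0.1, 0.0]), "river": np.array([0.8, 0.2, 0.1]),
           "recipe": np.array([0.0, 0.1, 0.9]), "cake": np.array([0.1, 0.0, 0.8])}

    def embed(text):
        """Represent a document as the average of its known word vectors."""
        vecs = [EMB[w] for w in text.lower().split() if w in EMB]
        return np.mean(vecs, axis=0) if vecs else np.zeros(3)

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    docs = ["river flood warning", "chocolate cake recipe"]
    scores = [cosine(embed("flood"), embed(d)) for d in docs]
    print(sorted(zip(scores, docs), reverse=True))  # ranked retrieval for the query "flood"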

NTENT

Andreas Kaltenbrunner

Language Games 

Join an international project (Atlantis - CHIST-ERA) to work with NAO robots on fundamental research in visual question answering, as related to the CLEVR dataset for compositional language and elementary visual reasoning. We will use an analytic approach based on precision language processing in order to have a gold standard for testing and extending the CLEVR dataset. The main goals are (i) to develop a Spanish grammar in Fluid Construction Grammar (similar to an English grammar already developed), (ii) to ground computational semantics in the vision system of the NAO, and (iii) to implement language games to prepare language evolution experiments for the same set-up. Location: IBE (PRBB building, next to Hospital del Mar) / UPF Poblenou campus. Requirements: solid background in (symbolic) computing; basic familiarity with the foundations of AI (machine vision, computational linguistics, knowledge representation).

CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

Martí Sanchez-Fibla

Luc Steels (IBE, PRBB)

Learning and Planning in Simple Video-Games 

One of the big breakthroughs in AI during the last few years was the DQN algorithm, which learned to play Atari video games directly from the screen using a combination of deep learning and reinforcement learning techniques. Neither DQN nor the systems that follow it, however, learn to play these games as humans do, that is, by understanding the games in terms of objects and relations and planning accordingly. The goal of the project is to make progress in that direction by 1) defining suitable high-level planning languages for modeling some of these games, 2) defining planning algorithms for deriving the actions to be done in such games when the model is known, and 3) learning parts of the model from observed traces when the model is not fully known. In principle, we will work with symbolic models, complete or incomplete, and not directly with the information available on the screen; that would be a follow-up step which goes beyond the work that can be done in a Master project. Some games to consider: point-and-shoot games, Pong, Space Invaders, Pacman, etc. None of them lends itself to being modeled and solved by current planning languages and algorithms.

GVG-AI competition

VGDL: Video Game Description Language (used in GVG-AI)

OpenAI Gym

Planning with pixels in (almost) real time

Planning with simulators

Hector Geffner

Ranking Attributes for Information Gain in Forests of Decision Trees (data science / machine learning related project) 

Decision trees are one of the most well-known techniques in machine learning, data science, analytics and data mining. They are transparent: they can be presented as rules for human understanding. With the availability of big data, algorithms for the construction of decision trees on a platform like MapReduce are a new challenge. This project consists of developing the parallelization algorithms and their implementation, perhaps for Hadoop, when we consider forests of decision trees. The implementation will be focused on the application of forests of decision trees to privacy-preserving data mining on online social networks. We can provide references to recent literature on how forests of decision trees are used to provide privacy alerts to users of online social networks. Another application is feature selection.
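For reference, a minimal single-machine sketch of ranking attributes by information gain, i.e., the quantity a MapReduce/Hadoop implementation would compute in parallel over data partitions (the toy data below is illustrative):

    import numpy as np
    from collections import Counter

    def entropy(labels):
        counts = np.array(list(Counter(labels).values()), dtype=float)
        p = counts / counts.sum()
        return float(-(p * np.log2(p)).sum())

    def information_gain(column, labels):
        """Entropy reduction obtained by splitting on one categorical attribute."""
        gain = entropy(labels)
        for value in set(column):
            subset = [y for x, y in zip(column, labels) if x == value]
            gain -= len(subset) / len(labels) * entropy(subset)
        return gain

    # Toy dataset: two categorical attributes and one binary class label.
    data = {"outlook": ["sunny", "sunny", "rain", "rain"],
            "windy":   ["no", "yes", "no", "yes"]}
    labels = ["no", "no", "yes", "yes"]

    ranking = sorted(data, key=lambda a: information_gain(data[a], labels), reverse=True)
    print(ranking)  # attributes ordered by information gain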

Vladimir Estivill-Castro

Manipulation with Pepper arm and fingers 

The project consists of developing the infrastructure and integrating motion planning and task planning algorithms for a Pepper robot to pick up boxes, like small cereal boxes. Currently, there is significant interest in the robotics community in combining these two types of planning to carry out tasks like cleaning a table with different objects. This kind of task is now common in the RoboCup@Home challenges. The project should include elements of feedback between motion planning and vision. The idea is that vision may not recognise objects but can suggest new positions which would enable the recognition of objects. Simultaneously, motion planning may be refined by feedback from vision.

Vladimir Estivill-Castro

Combining reasoning with robotic localisation 

There are many algorithms for robotic localisation in a field. However, these algorithms are rarely integrated with qualitative reasoning approaches. The most famous example of qualitative reasoning is Allen's interval algebra (and associated algorithms). The challenge is to investigate a spatial qualitative reasoning system on top of robotic localisation in a soccer field to obtain strategic or tactical decision making that shows improved performance on the soccer field for RoboCup soccer.

Vladimir Estivill-Castro

Robot localisation in a soccer field 

There are many algorithms for robotic localisation in a space. However, these algorithms are rarely integrated with qualitative reasoning approaches. The most famous example of qualitative reasoning is Allen's interval algebra (and associated algorithms). The challenge is to investigate a spatial qualitative reasoning system on top of robotic localisation in a soccer field to obtain strategic or tactical decision making that shows improved performance on the soccer field for RoboCup soccer.

Vladimir Estivill-Castro

Game playing Pepper 

The aim is to develop a demonstration of human-robot interaction in which a Pepper robot plays a simple game as naturally as possible, like tic-tac-toe on a fixed space, perhaps with rope and special pieces. The robot applies some localisation and some game strategy. The entire software is to be developed using model-driven development and finite-state machines as much as possible for coordinating the control. The robot shall be flexible in its speech and use localisation within the game space but not necessarily within the room. Sensor fusion to detect that humans have completed their move is the main research challenge.

Vladimir Estivill-Castro

Ethical Pepper 

The aim is to develop a demonstration of human-robot interaction in which a Pepper robot learns ethical dilemmas and reasons over ethical challenges. The robot shall apply several techniques, including game theory and dialogue with humans, to argue about and explain ethical decisions. The goal is to investigate practical debates between a robot and humans regarding the alignment problem.

Vladimir Estivill-Castro

Constrained optimization methods for computational optimal transport 

The framework of optimal transport addresses the problem of measuring distances between probability distributions. A rough definition of optimal transport distances can be given as follows: given two probability distributions P and Q over the sets X and Y, and a cost function measuring the transportation cost between elements of the two sets, the optimal transport distance between P and Q is the minimal total cost of transporting all mass from P to Q. This optimization problem can be formulated as a linear program called the Monge-Kantorovich optimal transport problem. This project follows up on recent progress made on computationally efficient algorithms for solving this LP, and particularly investigates the possibility of employing techniques for constrained optimization and saddle-point optimization for improving existing solutions. The project requires very strong mathematical skills, particularly in multivariate calculus and linear algebra. Knowledge of convex analysis and optimization is a plus, but not absolutely necessary at the current stage.
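In the discrete case, the Monge-Kantorovich problem reads (standard formulation, stated here only for concreteness):

\[
\mathrm{OT}(P,Q) \;=\; \min_{\pi \geq 0} \; \sum_{x \in X} \sum_{y \in Y} c(x,y)\,\pi(x,y)
\quad \text{subject to} \quad
\sum_{y \in Y} \pi(x,y) = P(x) \;\; \forall x, \qquad
\sum_{x \in X} \pi(x,y) = Q(y) \;\; \forall y,
\]

where the variable \(\pi\) is the transport plan specifying how much mass travels from each \(x\) to each \(y\); the Sinkhorn approaches cited below solve an entropy-regularized version of this LP.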

Sinkhorn Distances: Lightspeed Computation of Optimal Transportation Distances

Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration

Gergely Neu

Thompson sampling for sequential prediction 

Thompson sampling is one of the most well-studied algorithms for a class of sequential decision-making problems known as stochastic multi-armed bandit problems. In a stochastic multi-armed bandit problem, a learner selects actions in a sequential fashion and receives a sequence of rewards corresponding to the chosen actions. A crucial assumption made in this problem is that the rewards associated with each action are random variables drawn independently from a fixed (but unknown) distribution. The goal of this project is to do away with this assumption and study Thompson sampling in non-stationary environments where the rewards may be generated by an arbitrary external process. More precisely, the project considers the framework of sequential prediction with expert advice, and aims to show theoretical performance guarantees for this algorithm and/or analyze its performance empirically. The project requires very strong mathematical skills, particularly in probability theory and multivariate calculus. Knowledge of convex analysis is a plus, but not absolutely necessary at the current stage.
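For reference, a minimal sketch of standard Thompson sampling for Bernoulli bandits, i.e., the stochastic baseline the project would move beyond (Beta(1,1) priors and the Bernoulli reward model are the textbook choices, not part of the project specification):

    import numpy as np

    def thompson_sampling(rewards, rng=None):
        """Bernoulli Thompson sampling with Beta(1, 1) priors.

        `rewards` is a (horizon, n_arms) array of 0/1 rewards; in the stochastic
        setting each column would be drawn i.i.d. from a fixed distribution."""
        rng = np.random.default_rng() if rng is None else rng
        horizon, n_arms = rewards.shape
        successes, failures = np.ones(n_arms), np.ones(n_arms)  # Beta posterior parameters
        total = 0.0
        for t in range(horizon):
            theta = rng.beta(successes, failures)   # sample a plausible mean for each arm
            arm = int(np.argmax(theta))             # play the arm that looks best under the sample
            r = rewards[t, arm]
            successes[arm] += r
            failures[arm] += 1 - r
            total += r
        return total

    # Example: two arms with success probabilities 0.3 and 0.6.
    rng = np.random.default_rng(0)
    table = (rng.random((10000, 2)) < np.array([0.3, 0.6])).astype(float)
    print(thompson_sampling(table, rng))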

Learning to Optimize Via Posterior Sampling

An Information-Theoretic Analysis of Thompson Sampling

Online Linear Optimization via Smoothing

Thompson Sampling for Adversarial Bit Prediction

Gergely Neu

Better algorithms for online linear-quadratic control 

Linear-quadratic control is one of the most well-studied problem settings in optimal control theory: it considers control systems where the states follow linear dynamics, and the incurred costs are quadratic in the states and control inputs. In recent years, the problem of online learning in linear-quadratic control has received significant attention within the machine-learning community. One particularly interesting development is the formulation of the control problem as a semidefinite program (SDP), which allows the application of tools from online convex optimization. The present project aims to develop new algorithms for online linear-quadratic control based on this framework by exploring the possibility of using regularization functions that make better use of the SDP geometry than existing methods based on online gradient descent. The project requires very strong mathematical skills, particularly in multivariate calculus and linear algebra. Knowledge of convex analysis and optimization is a plus, but not absolutely necessary at the current stage.
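For concreteness, the underlying setting is the standard stochastic linear-quadratic control problem:

\[
x_{t+1} = A x_t + B u_t + w_t, \qquad
\text{minimize} \;\; \mathbb{E}\!\left[\sum_{t=1}^{T} \left( x_t^\top Q x_t + u_t^\top R u_t \right)\right],
\]

where \(x_t\) is the state, \(u_t\) the control input chosen by the learner, \(w_t\) zero-mean noise, and \(Q, R \succeq 0\) the cost matrices; in the online version some of these quantities (for instance the cost matrices) are revealed only gradually as the interaction proceeds.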

Online Linear Quadratic Control

Online PCA with Optimal Regret

Gergely Neu

Implicit regularization methods for high-dimensional optimization 

This project aims at studying the regularization properties of various incremental optimization methods such as gradient descent, averaged gradient descent, and exponentiated gradient descent. While recent work has successfully uncovered relations between averaging schemes for gradient descent and L2 regularization, these results remain specific to the classical problem of linear least-squares regression. One branch of this project is concerned with generalizing these results to more general convex optimization problems. Another direction the project aims to explore is the regularization effects of other gradient-descent variants, and particularly the sparsity-inducing properties of exponentiated gradient descent. The project requires very strong mathematical skills, particularly in multivariate calculus and linear algebra. Knowledge of convex analysis and optimization is a plus, but not absolutely necessary at the current stage.
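As a small illustration of the two update rules under study, on a toy underdetermined least-squares problem with a sparse target (step sizes, the simplex constraint for the exponentiated update, and the data are illustrative assumptions):

    import numpy as np

    def gd_step(w, grad, eta):
        # plain (additive) gradient descent update
        return w - eta * grad

    def eg_step(w, grad, eta):
        # exponentiated gradient: multiplicative update, renormalized to the simplex
        v = w * np.exp(-eta * grad)
        return v / v.sum()

    rng = np.random.default_rng(0)
    X = rng.standard_normal((5, 10))              # fewer samples than parameters
    w_true = np.zeros(10); w_true[0] = 1.0        # sparse target on the simplex
    y = X @ w_true

    w_gd = np.full(10, 0.1)                       # uniform initialization (sums to 1)
    w_eg = np.full(10, 0.1)
    for _ in range(2000):
        w_gd = gd_step(w_gd, X.T @ (X @ w_gd - y), eta=0.01)
        w_eg = eg_step(w_eg, X.T @ (X @ w_eg - y), eta=0.05)
    print("GD:", np.round(w_gd, 3))
    print("EG:", np.round(w_eg, 3))               # compare which solution each rule selects

The point of the project is to characterize, in theory, which solutions such update rules implicitly select (e.g., small L2 norm for gradient descent with averaging, sparsity for exponentiated gradient).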

Exponentiated Gradient versus Gradient Descent for Linear Predictors

Iterate averaging as regularization for stochastic gradient descent

A Continuous-Time View of Early Stopping for Least Squares

Connecting Optimization and Regularization Paths

Implicit Regularization for Optimal Sparse Recovery

Gergely Neu

Multilingual Lexical Simplification 

Lexical simplification is the task of replacing complex words or expressions with simpler synonyms in a context-aware fashion. Lexical simplification is useful to make texts more accessible to different types of users, such as people with cognitive impairments. In this project the candidate will investigate current methods in lexical simplification and implement techniques based on current continuous vector representations and neural network architectures. The project seeks to contribute to our current research on multilingual text simplification at TALN. The MSc candidate will have available both a dataset for experimentation and simplification software already available for several languages.

Horacio Saggion

Beyond Abstracts: Generating Summaries of Scientific Texts  

Scientists worldwide face the problem of scientific information overload, since the pace at which scientific articles are published is increasing exponentially. In this scenario, the possibility of accessing a brief and complete overview of the contents of an article is essential to cope with the large number of papers to consider. Often the abstracts published together with scientific papers are too short and lack essential information for a complete assessment of the value of the research presented. New approaches to the creation of text summaries, which identify the fundamental contents of documents, constitute a useful instrument to create a rich, structured and focused synthesis of the contents of a publication, thus providing new ways to deal with scientific information overload. This master project aims at developing techniques to produce summaries of scientific documents based on an analysis of their content with advanced linguistic tools. The project will investigate techniques to train summarization systems based on available annotated data. The MSc candidate will have available a dataset for experimentation as well as text processing and summarization libraries to carry out the project.

Horacio Saggion

Criticize or Praise? Citation Characterization in Scientific Papers 

Scientific texts do not stand in isolation; they are connected to each other by means of citations that identify the background on which a given scientific work stands. Citations are particularly important in assessing research output, mainly by means of citation counts (e.g., the h-index). Besides citation counting, in recent years citation semantics, concerning the characterization of the purpose of a citation in a text, has started to gain momentum. In order to fully take advantage of citations to assess a piece of work, it is particularly important to understand why it has been cited in a given context (to give credit, identify methods and tools, provide background, criticize, etc.). The characterization of the purpose of citations can have a significant impact on many activities related to the use and assessment of scientific literature, including scientific text summarization, scientific information retrieval, paper/author recommendation, etc. This master project aims at developing systems to automatically detect the semantics of a given citation in text. The work will be based on the use of supervised techniques to classify citations using a variety of information sources arising from the linguistic and semantic analysis of scientific documents. The MSc candidate will have available both a dataset for experimentation and text processing and summarization libraries to carry out the project.

Horacio Saggion

Extracting the Science from Research Articles 

This thesis will study and develop machine learning techniques (preferably Deep Learning techniques) to extract different types of information from research articles. The work will be based on the development of supervised techniques for the identification of the following types of information: problem, technique, results, advantages, disadvantages, among others. The student will have available data and software to carry out the work.

Horacio Saggion

How do you feel listening? 

This thesis will study and develop machine learning techniques (preferably Deep Learning techniques) to identify the sentiment and emotions of people listening to music in social networks. The study will be based on the analysis of social media data collected before, during, and after concert performances. Contextual information will be used to improve the performance of current systems. The student will have available data and software to carry out the work.

Horacio Saggion

Summarizing Multimodal Content: the case of text and images 

This thesis will investigate the contribution of textual and non-textual information for the summarization of long articles which include multimodal information. Recent works have shown that classification systems can work better when information from multiple modalities is used. This has been little investigated for summarization.

Horacio Saggion

Process mining to understand how teachers design learning activities 

Authoring tools and community platforms devoted to teachers (e.g. the Integrated Learning Design Environment, ILDE) collect data about teachers' actions in the process of designing learning activities. In this context, the application of process mining techniques would shed light on how teachers design, i.e., which processes they follow, from gathering inspiration by exploring designs created by others to the steps they follow in the authoring process. The TIDE group has developed several authoring tools and the ILDE community platform, which have been used by several teacher communities (two schools, teachers participating in professional development programs, etc.). This project will consist of applying process mining techniques to these datasets, extracting knowledge about how teachers design in each community, and comparing the communities.

Davinia Hernández-Leo

Ishari Amarasinghe

Intelligent Interactive Systems in Education 

Several topics related to the application of interactive and artificial intelligence techniques to the design of systems to support teaching and learning (includes adaptive and personalized learning, classroom orchestration, etc.).

http://www.upf.edu/web/tide

Davinia Hernández-Leo

Data Analysis in Education Technologies 

Several topics related to the application of data analytics techniques to the design of systems that support teaching and learning (includes community, teaching and learning analytics, interactive dashboards, etc.).

http://www.upf.edu/web/tide

Davinia Hernández-Leo

Conversational agents/chatbots for CSCL applications 

Conversational agents have been deployed in a variety of learning technology applications to enrich interaction between humans and machines. Tutorial Dialog Systems that employ Conversational Agents (CAs) to deliver instructional content to learners in one-on-one tutoring settings have been shown to be effective. This project focuses on extending this technology to collaborative learning settings. The student will focus on the development, integration and evaluation of conversational agents (following an iterative design process) within a computer-supported collaborative learning (CSCL) tool called 'PyramidApp', which facilitates easy deployment of collaborative learning activities in classroom and distance learning settings.

http://www.upf.edu/web/tide

Davinia Hernández-Leo

Ishari Amarasinghe

Learning analytics for learning redesign and orchestration 

There are two problems that teachers face in their daily routines that would require data-driven educational technology solutions. First, the design and redesign of increasingly effective learning situations are currently not informed by indicators of the impact on learning of previous design realisations. Second, the orchestration of learning situations is a daunting task for teachers, which involves the monitoring, awareness and regulation of learning activities. Both problems stem from the fact that obtaining the adequate information and support required to make decisions about the (re)design and orchestration is out of reach for teachers. Learning Analytics (LA), which borrows methods from Artificial Intelligence (AI), can be considered a suitable approach to tackle both problems. However, it is unclear whether the same LA solutions would be able to satisfactorily support both learning (re)design and orchestration. This TFM will analyse and compare the nature of both tasks and existing solutions supporting them, focusing on a case study that uses the same collaborative learning tool with LA to support both tasks.

Davinia Hernández-Leo

Ishari Amarasinghe

Understanding participant behaviours in Citizen Science online learning activities 

Citizen Science (CS) involves the collection and analysis of data relevant to solving research questions by members of the general public, usually as part of a collaborative project with professional scientists. In this master thesis project, the student will select a CS activity that is supported by technology (e.g., a platform devoted to CS) to analyze how participants behave and interact with technology to collaborate in the endeavour. The analysis can target organizational/operational characteristics, scientific outcomes, individual/group learning, other success or failure indicators, etc., as well as societal aspects related to the impact of those activities on society, such as gender, age, geographical and socio-economic differences, etc.

Davinia Hernández-Leo

Patricia Santos

Twitter as an educational community: identification and classification of educational tweets 

Several studies have shown that teachers use Twitter for professional development (PD) purposes, to develop a sense of professional community, and to reduce perceptions of isolation. Twitter has the potential to support the improvement of practice through grassroots continuing professional development. Recently, with the mandatory adoption of remote teaching due to the COVID-19 crisis, educators have found in Twitter a good place to share educational resources, ideas for learning activities, messages with technical advice on the use of educational tools, etc. All this has generated a large amount of material scattered throughout the network. This dispersion makes it difficult to exploit the maximum potential of all the knowledge generated and shared. For an educator, finding specific solutions for a specific need on Twitter is a challenge, as often the hashtags used are not consistent or specific enough and no standard categorization is used. This thesis aims to develop strategies for the automatic recognition and classification of educational information on Twitter. The strategies will ideally implement an incremental learning mechanism, which will allow for a continuous improvement of their performance.

Laia Albó

Pablo Aragón

Prediction of trabecular micro-fractures 

Bone fracture is a very local event. The fractured tissue has peculiar morphometric characteristics; however, the prediction of this event is still an open question that leads every year to thousands of unpredicted or mis-treated patients. An ongoing study has allowed us to study the morphometric characteristics of trabecular fracture using image processing tools and micro-CT images. During this project we want to develop a classifier able to identify the weak regions within the trabecular framework and predict their likelihood of failure. The classifier will integrate information coming from the geometry and the mechanical behaviour of the trabecular structure.

Simone Tassani