Master Thesis
Evaluation Guidelines
The master thesis project is evaluated by a committee of three members: the thesis supervisor and two other members of the department. The evaluation considers the tutoring sessions, the oral presentation, and the written report.
Important dates
Early January: Agree with your supervisor on a topic.
31 March: First draft containing motivation, context, state of the art, and scientific questions.
Late June: Submission of the draft report to the panel committee.
July: Oral presentations and submission of the final report to the panel committee.
Oral presentation
Oral presentations of the thesis will be scheduled in mid-July. They will take 20 minutes, plus 10 minutes of questions from the evaluation committee.
Thesis report
The student has to submit a draft version of the thesis report to the evaluation committee before June 30th. The student and the supervisor should also agree on a schedule that leaves the supervisor enough time to read and assess the final thesis text before the presentation. The final version of the report must be submitted to the panel committee before the oral presentation.
Formatting of the report
As for the format of the written thesis (font size, line spacing, margins, section numbering, etc.), we propose that students follow the template provided by the Universitat Pompeu Fabra for master theses. For LaTeX you can use this one. The MIIS board has no preference regarding the word processor and will accept documents produced with any of them.
The report can be structured into Introduction, State of the Art, Methods, Results, Discussion, and Conclusion. However, this is not mandatory, and a different overall organization can be used if necessary. As a general guideline, a master thesis report is not expected to reach the comprehensiveness of a PhD thesis; an adequate length is between 30 and 50 pages (this number is a guideline, not a requirement).
Proposed thesis topics
Abstraction and Reasoning Challenge
This project is about addressing the Abstraction and Reasoning Challenge (ARC), a Kaggle competition consisting of a large corpus of visual tasks that require abstraction and reasoning skills to be solved. Each ARC task contains 3-5 pairs of training inputs and outputs, and a test input for which you need to predict the corresponding output using the pattern learned from the training examples. The objective of this project is to apply AI techniques developed in the AI/ML research group (and possibly suggest novel ones) that combine search and learning for automated program synthesis, in order to address the challenge or a part of it.
Motion coordination of a dual-arm mobile manipulator
The goal of the thesis is to create and implement a generic and modular layer that encapsulates and coordinates the entire movement of the individual robots that the Mobile Anthropomorphic Dual-Arm Robot platform is made of, so that it can be used in a task and motion planning research framework.
Visual Servo Control in a Collaborative Robot for WEEE (Waste Electrical and Electronic Equipment) dismantling
This project applies visual servo control to a collaborative robot, both in a simulation environment and on a real platform, for the dismantling of WEEE components.
Collaborative robot teaching using augmented/virtual reality
The objective of the thesis is to create and implement a teleoperation system, assisted by augmented/virtual reality devices, to command a collaborative robot with the purpose of improving the robot's performance and teaching it new skills.
Learning policy and reward function using active reward learning techniques minimizing human supervision
In robot learning contexts, defining a function that specifies when to reinforce the agent's actions can be very expensive. To address this challenge, algorithms will be developed that learn not only the robot's control policy but also the reinforcement function. To do this, supervisors (human or based on conventional task planners) will be used to evaluate the robot's performance. Strategies will be implemented to minimize the number of supervision calls and to maximize the information obtained in each call.
Development of safe learning strategies based on constraint manifolds
Reinforcement learning in robotic agents is particularly complex due to several challenges, including safety during the learning process and mechanical constraints. These challenges are not commonly addressed in the generic artificial intelligence literature, which does not usually deal with physical agents. In this project, common reinforcement learning strategies will be combined with constrained optimization in the constraint space to ensure safe learning.
Implementation of randomization strategies for learning conditions to minimize the Sim2Real gap
This project addresses the challenge of transferring learning conducted in a simulated environment to real environments. Mechanisms will be introduced to improve the generalization of the learned control policies, focusing on the randomization of the physical conditions of the simulation environment. These strategies will be applied transversally to the different learning methodologies, with the expectation of obtaining learning that is robust to variations in the conditions of the real environment.
Gender bias in online platforms
This project will analyze and study potential gender biases of online platforms. These may include job search engines, e-commerce websites, search engines, and platform economy sites for which some data could be scraped or otherwise obtained. The specific platform is to be decided with the master student. The work will involve data scraping/collection, measurement through fair ranking/recommendation algorithms, and analysis of potential biases found. It requires solid knowledge of Python and a desire to work within a framework of data feminism, which seeks to eliminate gender biases in data collection and data processing pipelines.
Risk-Prediction Instruments in Criminal Justice
The Justice Department of Catalonia has for several years been using structured risk-prediction instruments, similar to COMPAS in the US or OASIS in the UK, to predict the risk of recidivism and violent recidivism. Correctional officers, psychologists, and social workers apply a questionnaire to juvenile offenders and to people who have been convicted of a crime (these are different questionnaires); the questionnaire is then processed using a model trained on thousands of previous cases, and an output is produced, which is then interpreted by the person applying the instrument. Detailed anonymized data is available for studying the extent to which different instruments can be biased against different sub-populations (e.g., immigrants or children of immigrants). Additionally, we would like to explore the potential for new risk-prediction instruments. The work would be done in collaboration with researchers on criminology at the University of Barcelona and at the Justice Department of Catalonia.
Reinforcement Learning for Virus Mutation Modelling
RNA viruses use different mechanisms of genetic variation for their survival, which makes them among the most prevalent parasites of humans. One of the characteristics that helps viruses achieve this is their high mutation rate: their offspring differ on average by one or two mutations from their parent. This, combined with their short replication times, results in a complex virus swarm. Their ability to change their genome so easily enables them to appear in new hosts and to circumvent immunity obtained by vaccination. One promising technique to anticipate new Variants of Concern is to apply inverse Reinforcement Learning to mutation data, thereby modelling the process of virus evolution. In a previous collaboration with UPF, we jointly developed a method to apply Inverse Reinforcement Learning to evolutionary data of SARS-CoV-2 to predict which mutations could be more dangerous in the future. In the scope of this contract, we ask UPF to extend that previous model to the multiple-rewards setting with a proof-of-concept implementation applied to SARS-CoV-2 data in Python.
Inverse Optimal Control for Modeling Virus Mutations in SARS-CoV-2 (TFM Ilse Meijer)
Optimal strategies for future health emergencies in Europe
The COVID-19 pandemic, caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), shows characteristics that are challenging for public health care systems. Nonpharmaceutical interventions such as population- and individual-based social distancing, testing, and contact tracing have so far been the principal public health measures. Mathematical models have been developed to assess the transmission dynamics of the virus, the severity of the disease, and the effectiveness of public health measures. In the aftermath of the COVID-19 pandemic, we would like to find optimal strategies for managing potential health crises. Most mathematical models are based on deterministic compartmental modelling, where the epidemiological subpopulations are classified into a susceptible category, several infected population groups, and groups of recovered individuals or fatalities. Dynamic transitions from one class to the next are based on transition rates, usually constant but sometimes variable over time. At the JRC we recently developed a novel method to understand and control the spread of epidemics [1]. In the scope of this collaboration, the JRC will provide an initial implementation of the model, and UPF will extend it and test it on new epidemiological data.
[1] A Distributed Optimal Control Model Applied to COVID-19 Pandemic. R-M. Kovacevic et al.
Encrypted Federated Learning for health data
Encrypted learning (e.g., homomorphic encryption, MC2, etc.) is a technique that permits users to perform computations without sharing the data in the clear. This is important for sensitive data, such as health care information, because it can enable new services by removing the privacy barriers inhibiting data sharing, or increase the security of existing services. In the scope of this collaboration, JRC.F7 will provide several models to classify bacteria from their DNA using a variety of ML techniques such as XGBoost, CNN, and LSTM neural networks, and we ask UPF to provide a deliverable comparing the performance of those models trained in a standard centralized fashion against encrypted federated learning.
Functional genomic pathways in the gut microbiome
Recent advances in the field of human gut microbiome research have revealed significant associations and potential mechanistic insights regarding a vast array of complex chronic diseases, including cancer, auto-immune disease, and metabolic syndrome. Recently, a paper [1] introduced the Gut Microbiome Health Index (GMHI), an index for evaluating health status (i.e., the degree of presence/absence of diagnosed disease) based on the species-level taxonomic profile of a stool shotgun metagenome (gut microbiome) sample. GMHI determines the likelihood of having a disease, independent of the clinical diagnosis; it does so by comparing the relative abundances of two sets of microbial species associated with good and adverse health conditions, which are identified from an integrated dataset of 4347 publicly available human stool metagenomes. However, genetic elements in bacteria are extremely "dynamic" and can easily move from one species to another through mechanisms like plasmid gene transfer, or conjugation. This could make taxonomic classification less relevant for this type of study, because what matters is that a "biological function" remains active, independently of the species in which it acts. For this reason, we hypothesize that the GMHI could become more robust if it relied not only on the taxonomic profile of the bacteria but also on their network of biological functions. In the scope of this contract, the JRC will provide the data and initial scripts to recalculate the GMHI index, and the student will replicate the study presented in [1] and study the possibility of making it more robust by explicitly incorporating genetic functional groups.
Relation-based identification and classification of idiosyncratic expressions in multilingual newspaper material
Correct interpretation of idiosyncratic expressions such as take a walk, meet a challenge, give up the advantage, launch a war, etc. (referred to as “collocations”) is of high relevance to many natural language processing applications and to second language learning alike: their meaning is not compositional. In other words, it is not composed of the meanings of the isolated words of the expression – you don’t “take” anything when you take a walk, and you don’t “meet” anyone when you meet a challenge. The goal of this thesis is to develop a neural network-based algorithm that takes the relation between the individual words of such expressions into account to identify them in texts and classify them in accordance with a given typology. The work of the thesis will build upon an existing algorithm for the identification of semantic relations (such as, e.g., between walk and movement or between war and fight).
Multilingual Neural Natural Language Text Generation
Deep learning models have become common for natural language generation. They usually start from raw text material (which they then transform into a more fluent narrative), syntactic trees, or semantic graphs. Experimental setups also cope with ontological structures of limited size. However, none of them so far combines different types of input structures. In this thesis, a model will be explored which, when prompted with a limited number of ontological predicate-argument statements (such as, e.g., ‘(007 IS literary_character)’, ‘(007 PROTAGONIST_IN “High_time_to_kill”)’, ‘(Benson AUTHOR “High_time_to_kill”)’), generates a short story (in this case, about James Bond, the Bond book High Time to Kill, and its author Raymond Benson), drawing upon supplementary textual material and selecting the relevant information from it.
Deep reinforcement learning-driven dialogue management
To ensure a flexible, coherent conversation between a human and the machine that goes beyond predefined information exchange patterns, advanced dialogue management strategies must be developed which take the history and the goals of the conversation into account, and which are able to handle interruptions, side sequences, grounding, and other phenomena of a natural dialogue. Neural network-based Reinforcement Learning models have been shown to have the potential to cope with these challenges. In the proposed thesis, such a model will be explored on a large dataset of movie dialogues. The focus will be on handling side sequences, i.e., interventions of one of the interaction parties that introduce a new topic into the conversation.
Identification of the communicative intent of the speaker in spontaneous dialogues
To react adequately to a statement of the user, a conversational agent must not only understand the content of this statement, but also correctly identify the intention of the user when they utter it. In the context of this thesis, a generic deep learning-based communicative intent identification (and classification) model will be developed. As the training and development corpus, one of the available large-scale dialogue corpora will be used. The model will be tested on out-of-domain dialogues.
Entity and entity relation extraction and linking in medical discourse
To obtain from textual material a comprehensive description of a specific entity (e.g., intervertebral disc) or process (e.g., intervertebral disc degeneration) in terms of its properties, relations, contexts, etc., this entity/process and its synonyms must be identified, disambiguated, and related to other relevant entities/processes. In this thesis, a deep neural model for entity and entity relation extraction and linking for English data will be developed and applied to the topic of intervertebral disc degeneration (protein and RNA classes), for which a specialized corpus is already available. The thesis will start from an existing implementation developed at TALN.
Machine vision learning in noisy environments with unreliable object labelling
In industrial processes, such as quality control of objects on moving conveyors, there are often issues with data quality, such as lighting conditions and pose variation, target object and feature variation, and significant errors in labelling/classifying the image features. Hence, it is interesting to evaluate techniques that mitigate one or more of these issues, such as unsupervised learning from the images, automatic labelling, and adversarial learning (e.g., two or more deep neural networks). In addition, white-box data modelling techniques can complement the black-box models, especially in the data exploration phase. In the thesis project, the student will apply various techniques to improve on our current results, hopefully coming close to the state of the art in the field. The student will have the opportunity for close collaboration with IRIS Technology Solutions on real deployment projects using state-of-the-art hardware and software, which may include EC Horizon projects. The MSc candidate will be provided with the available data and support from the end users and experienced technical specialists. Requirements: basic programming skills in C++ and Python.
You Only Look Once (YOLO): Unified, Real-Time Object Detection
‘Kinetic’ simulator using a multi-agent system
This project concerns a simulator based on "intelligent agent" software. An existing system, developed in Repast Simphony and ReLogo (a Java-type language), will be used as the starting point; it has basic rules programmed for agent interaction and movement in a 2D solution space. The work will involve adding greater intelligence to the agents (e.g., based on rules induced from the process data or by calling a trained machine learning model), and testing with real and simulated data. "Scalability" tests should also be done so that the simulation can run faster with a larger volume of data. The student will have the opportunity for close collaboration with IRIS Technology Solutions, in order to develop and implement the technical part and validate the system on real applications, which may be related to EC Horizon projects. Requirements: basic programming skills in Java, C++, and Python.
Complex adaptive systems modeling with Repast Simphony
An Introduction to Repast Simphony Modeling using a simple predator-prey example
Smart platform for project planning and resource assignment in complex environments
This platform will be dedicated to complex and dynamic project planning, which includes elements of resource planning, budget assignment, task assignment, and worker satisfaction. The system will embody different advanced AI components: a recommender system (case-based reasoning with a machine learning back-end); a multi-agent system to model the solution space for resources and projects; and game theory to evaluate scenarios with gain/penalty metrics and competitive as well as collaborative resource assignment. The student will have the opportunity for close collaboration with IRIS Technology Solutions for the deployment of this project. The MSc candidate will be provided with the available data and support from experienced technical specialists. Requirements: basic programming skills in C++, Java, and Python.
Cooperative Game Theoretic Framework for Joint Resource Management in Construction
Applications of game theory in project management: a structured review and analysis
Data Analysis in Education Technologies
Several topics related to the application of AI and data analytics techniques to the design of systems that support teaching and learning (this includes community, teaching, and learning analytics, interactive dashboards, etc.).
Urban data governance: beyond open data
How can data from the public sector, private companies, and citizens be managed and governed to extract more value while preserving privacy and securing data exploitation models? The EU's Data Governance Act seeks to promote data reuse, and the GAIA-X project is meant to provide an architecture for this ideal. Current debates revolve around Data Spaces and Data Trusts; some proposals include a federated infrastructure, while others trust DAOs (blockchain) to provide a solution with a democratic approach to data sharing. This TFM will explore the different opportunities, question the assumptions behind each model, and seek to understand what infrastructure is needed to accomplish the needs of different uses of data in the urban context. It will focus on a theoretical analysis of the different models and on technical assessments of current data technologies and architectures. In addition, different spatio-temporal datasets will be analysed to study how cities can integrate and benefit from this approach.
Intersectional Fairness
In recent years, Fairness, Accountability, and Transparency (FAccT) has been discussed as a way to reduce social harm and discrimination in the design and development of algorithms. However, most work on FAccT approaches addresses only discrimination towards limited categories along a single dimension (e.g., gender, race). Recently, voices from black feminism and social studies have called for a deeper approach to fair and just development. Intersectionality is an analytical framework for understanding how aspects of a person's social and political identities combine to create different modes of discrimination and privilege. It identifies multiple factors of advantage and disadvantage; examples of these factors include gender, caste, sex, race, class, sexuality, religion, disability, physical appearance, and height. That is, it goes beyond the classical gender/race binaries: looking at one dimension at a time doesn't always tell us the whole story. Although the call for an intersectional approach is growing within the Computer Science and Data Science communities, few works explore the real possibility of applying FAccT concepts to algorithms. The goal of this thesis is to explore diverse datasets and algorithms used in contexts of social interest (e.g., social welfare, recidivism, education) and to develop technical solutions that take an intersectional perspective into account. The thesis combines a highly theoretical conceptualization with technical development.
Fair housing algorithm
The housing market in Barcelona is under constant scrutiny due to high demand and the current economic crisis. Moreover, housing market prices are determined by the same longstanding valuations, without reflecting current societal and environmental needs (e.g., green areas, low pollution, and diversity). The goal of this thesis is to understand which factors related to climate change and pollution could affect the real estate market and housing policies in the near future. The project is meant to analyze different sources of data and to assess the current status of Barcelona's housing market from a data-driven approach. As a result, it is expected to develop an algorithm to re-calculate the cost of housing, not under market supply-demand dynamics, but under Sustainable Development Goals (SDGs) indicators, the impact of climate change, and social justice principles.
3D Herbert: delivering packages with drones
The project is to develop the control software for a drone so that it reproduces the behaviour of the famous Herbert prototype that illustrated the subsumption architecture for robotics. However, the idea involves several variants: first, the problem would be in 3D, and second, the application should allow races between the fully autonomous robot and a tele-operated drone piloted by a human being. The project should integrate vision to recognise objects, and task planning to anticipate an optimal set of future actions to efficiently complete a course of object collection and delivery. The project shall start with a demonstration on a simulator such as Gazebo or V-REP, and could potentially use ROS, but that is not necessary. An additional challenge is that the course should be easily configurable.
Ranking Attributes for Information Gain in Forests of Decision Trees (data science / machine learning related project)
Decision trees are one of the most well-known techniques in machine learning, data science, analytics, and data mining. They are transparent: they can be presented as rules for human understanding. With the availability of big data, algorithms for the construction of decision trees on a platform like MapReduce are a new challenge. This project consists of developing the parallelization algorithms and their implementation, perhaps for Hadoop, when we consider forests of decision trees. The implementation will be focused on applications of forests of decision trees to privacy-preserving data mining in on-line social networks. We can provide references to recent literature on how forests of decision trees are used to provide privacy alerts to users of OLSNs. Another application is feature selection.
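The per-attribute quantity that such a parallel implementation would rank, the information gain of a candidate split, can be sketched as follows. This is a hedged illustration only: the helper names and toy data are assumptions, not part of any provided codebase.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting `rows` on attribute `attr`.
    In a MapReduce setting, one such score per attribute could be computed
    in parallel and the attributes ranked by the result."""
    before = entropy(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    after = sum(len(part) / len(labels) * entropy(part)
                for part in partitions.values())
    return before - after

# Toy data: the attribute "wind" perfectly predicts the class label.
rows = [{"wind": "weak"}, {"wind": "strong"}, {"wind": "weak"}, {"wind": "strong"}]
labels = ["yes", "no", "yes", "no"]
gain = information_gain(rows, labels, "wind")   # 1.0 bit, the maximum here
```

Ranking attributes then amounts to sorting them by this score, which is exactly the step a forest-of-trees implementation repeats at every node.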
Manipulation with Pepper arm and fingers
The project consists of developing the infrastructure and integrating motion planning and task planning algorithms for a Pepper robot to pick up boxes, like small cereal boxes. Currently, there is significant interest in the robotics community in combining these two types of planning to carry out tasks like cleaning a table with different objects. This kind of task is now common in the RoboCup@Home challenges. The project should include elements of feedback between motion planning and vision. The idea is that vision may not recognise objects but may suggest new positions that would enable the recognition of objects. Simultaneously, motion planning may be refined by feedback from vision.
Combining reasoning with robotic localisation
There are many algorithms for robotic localisation in a field. However, these algorithms are rarely integrated with qualitative reasoning approaches. The most famous example of qualitative reasoning is Allen's interval algebra (and its associated algorithms). The challenge is to investigate a spatial qualitative reasoning system on top of robotic localisation in a soccer field, to obtain strategic or tactical decision making that shows improved performance in the soccer field for RoboCup soccer.
Game-playing Pepper
The aim is to develop a demonstration of human-robot interaction in which a Pepper robot plays as naturally as possible a simple game, like tic-tac-toe on a fixed space, perhaps with rope and special pieces. The robot applies some localisation and some game strategy. The entire software is to be developed using model-driven development, with finite-state machines used as much as possible for coordinating the control. The robot shall be flexible in its speech and use localisation within the game space, but not necessarily within the room. Sensor fusion to detect when humans have completed their move is the main research challenge.
Ethical Pepper
The aim is to develop a demonstration of human-robot interaction in which a Pepper robot learns ethical dilemmas and reasons over ethical challenges. The robot shall apply several techniques, including game theory and dialogue with humans, to argue for and explain ethical decisions. The goal is to investigate practical debates between a robot and humans regarding the alignment problem.
Statistical Modeling of Online Discussions
Online discussion is a core feature of numerous social media platforms and has attracted increasing attention for different and relevant reasons, e.g., the resolution of problems in collaborative editing, question answering, and e-learning platforms, the response of online communities to news events, online political and civic participation, etc. This project aims to address, from a probabilistic modeling perspective, some existing challenges, both computational and social, that appear in platforms that enable online discussion: for example, how to deal with scalability issues, how to evaluate and improve the quality of online discussions, or how to improve social interaction in such platforms in general.
Generative models of online discussion threads: state of the art and research challenges
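One family of generative models covered by the survey cited above grows a thread by attaching each new comment to an existing one with probability proportional to a popularity term, a root-attractiveness term, and a novelty term. The sketch below is a hedged illustration of that idea; the function name and parameter values are assumptions for illustration.

```python
import random

def grow_thread(n_comments, alpha, beta, tau, rng):
    """Simulate one discussion thread; returns parent[i] for each comment.
    Node 0 is the root post; comment t attaches to an earlier node with
    probability proportional to popularity + root bias + novelty."""
    parents = [None]   # the root post has no parent
    degrees = [0]
    for t in range(1, n_comments + 1):
        weights = []
        for i in range(len(parents)):
            w = alpha * degrees[i]          # popularity (preferential attachment)
            w += beta if i == 0 else 0.0    # attractiveness of the root post
            w += tau ** (t - i)             # novelty: older nodes decay
            weights.append(w)
        parent = rng.choices(range(len(parents)), weights=weights)[0]
        parents.append(parent)
        degrees[parent] += 1
        degrees.append(0)
    return parents

rng = random.Random(1)
parents = grow_thread(50, alpha=1.0, beta=5.0, tau=0.9, rng=rng)
```

Fitting the parameters alpha, beta, and tau to observed threads by maximum likelihood is one concrete way such a probabilistic model can be evaluated against real platform data.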
Modelling cooperative behaviours in real-time environments using Reinforcement Learning
In this project we will model real-time experimental game-theoretic tasks involving several agents using Reinforcement Learning techniques. The model will be based on Markov Decision Processes (MDPs). The aim is to be able to make predictions about modifications of the experiments and to add increasingly complex features to the model, including the prediction of other agents' behavior and identity. Cooperative emerging behaviours will be studied, for example in the presence of limited resources, as in the Tragedy of the Commons, where agents need to learn to consume resources in a controlled and coordinated way. Requirements: machine learning, autonomous systems.
Novel Algorithms for Centralized Multiagent Planning
Centralized multiagent planning is the problem of computing a joint plan for a team of agents that collaborate to achieve a goal. This problem has many applications in the real world, e.g., ride sharing, logistics planning, and manufacturing. The aim of this project is to develop novel algorithms for multiagent planning based on an existing approach that reduces centralized multiagent planning problems to exponentially smaller approximate AI planning problems, which can be efficiently solved using state-of-the-art planning methods. The novel algorithms will be tested in challenging scenarios from the Multi-Agent Programming Contest.
Graph Neural Networks for Large-scale Optimization
Large-scale constrained optimization problems are frequently encountered in real-world applications, e.g., traffic management, logistics planning, warehouse optimization, multiagent planning, and scheduling, to name a few. Though constrained optimization problems have been extensively studied, because of their computational complexity existing optimal solvers do not scale well to complex problems. Recently, graph neural networks (GNNs) have been successfully applied to approximate solutions to large optimization problems. The aim of the thesis is to study the potential of graph neural networks for large-scale optimization and to test it in realistic applications.
Linearly-solvable Markov decision processes with function approximation
A linearly-solvable Markov decision process (LMDP) is a special case of the more commonly used Markov decision process (MDP) in reinforcement learning. In LMDPs, the Bellman equations governing the value function are linear, which typically makes learning more efficient. In spite of being simpler than MDPs, LMDPs are surprisingly expressive and can be used to represent many reinforcement learning benchmarks; in particular, any MDP with deterministic actions can be converted into an LMDP. However, the large body of work on deep reinforcement learning largely ignores LMDPs as a representation. The goal of this project is to develop function approximation algorithms for LMDPs, e.g. using deep neural networks. One way to do so is to study existing algorithms for MDPs and simplify them for the LMDP setting.
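To illustrate why linearity helps: in terms of the desirability function z(s) = exp(-v(s)), the LMDP Bellman equation takes the form z = G P z, where G is a diagonal matrix of exponentiated (negated) state costs and P is the passive transition matrix, so it can be solved by plain power iteration rather than nonlinear dynamic programming. A minimal sketch; the toy first-exit problem, its goal state, and its costs are assumptions for illustration.

```python
import numpy as np

def lmdp_power_iteration(P, q, n_iters=1000):
    """Solve z = G P z for an LMDP by power iteration.

    P: (n, n) passive transition matrix, q: (n,) state costs.
    Returns the value function v(s) = -log z(s).
    """
    G = np.diag(np.exp(-q))        # state costs enter multiplicatively
    z = np.ones(len(q))
    for _ in range(n_iters):
        z = G @ P @ z              # linear Bellman operator
        z /= z.max()               # guard against numerical under/overflow
    return -np.log(z)

# Toy 3-state chain with an absorbing, zero-cost goal state 2.
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0]])
q = np.array([1.0, 1.0, 0.0])
v = lmdp_power_iteration(P, q)     # v decreases monotonically towards the goal
```

Replacing the tabular z with a parametric function (e.g. a neural network) while preserving this linear structure is one way to frame the function approximation question the project asks.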
Entropy-regularized reinforcement learning algorithms
Entropy regularization is an extremely popular component of modern reinforcement learning methods. After several years of applying entropy regularization as a heuristic to drive exploration and improve the optimization properties of RL algorithms, there has been much recent progress in providing solid theoretical foundations justifying this technique. Most recently, a sophisticated regularization technique has led to the development of a principled RL algorithm, Q-REPS, that is entirely derived from first principles, yet can be implemented in large-scale environments essentially without any approximations to the theory. This project aims to advance research on this method in two parallel directions: 1) implement Q-REPS in large-scale environments and analyze its performance empirically, and 2) analyze various properties of the algorithm that remain poorly understood (some important questions being: understanding the role of the regularization functions and finding alternatives, or studying the properties of the action-value functions returned by the algorithm). The project requires strong mathematical skills, particularly in linear algebra and probability theory. Knowledge of convex analysis and optimization is a plus, but not absolutely necessary at the current stage.
A unified view of entropy-regularized Markov decision processes
Error averaging in approximate dynamic programming
Error-propagation analysis of approximate dynamic programming methods is a classic research topic with several implications for reinforcement learning algorithms. The most well-known results consider basic DP algorithms like value iteration and policy iteration, with several new results added in recent years regarding regularized counterparts of these methods. One notable discovery on this front was that certain regularization choices result in an "error-averaging" property that ensures a benign propagation of policy evaluation errors: roughly speaking, instead of accumulating the absolute errors linearly, these schemes allow the cancellation of positive and negative errors. This project sets out to understand this phenomenon in more detail by decoupling the averaging effect from the regularization effects as much as possible, thus potentially enabling the development and analysis of more effective error-averaging RL algorithms. The project requires strong mathematical skills, particularly in linear algebra and probability theory. Knowledge of convex analysis and optimization is a plus, but not absolutely necessary at the current stage.
Suggested readings: Approximate Modified Policy Iteration and its Application to the Game of Tetris; Leverage the Average: an Analysis of KL Regularization in RL
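The cancellation of positive and negative errors can be illustrated with a deliberately simplified experiment (hypothetical numbers, and only mimicking the averaging mechanism, not any particular regularized algorithm): if each evaluation step returns the true action values plus zero-mean noise, a scheme that acts on the running average of all past estimates sees its error shrink, while each individual estimate keeps a constant error level.

```python
import numpy as np

# Toy illustration of the "error-averaging" effect.
# q_star is a hypothetical true action-value vector; each "evaluation"
# returns q_star corrupted by zero-mean noise.
rng = np.random.default_rng(0)
q_star = np.array([1.0, 2.0, 3.0])
avg = np.zeros(3)
last_err, avg_err = [], []
for k in range(1, 2001):
    q_hat = q_star + rng.normal(0.0, 1.0, 3)  # noisy evaluation
    avg += (q_hat - avg) / k                  # running average of estimates
    last_err.append(np.abs(q_hat - q_star).max())  # error of one estimate
    avg_err.append(np.abs(avg - q_star).max())     # error of the average
```

The averaged error decays roughly like O(1/sqrt(k)) while the per-estimate error stays at the noise level; the project's question is how much of this benefit can be obtained, and analyzed, independently of the regularization that induces it.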
|
Thompson sampling for sequential prediction
Thompson sampling is one of the most well-studied algorithms for a class of sequential decision-making problems known as stochastic multi-armed bandit problems. In a stochastic multi-armed bandit problem, a learner selects actions in a sequential fashion, and receives a sequence of rewards corresponding to the chosen actions. A crucial assumption made in this problem is that the rewards associated with each action are random variables drawn independently from a fixed (but unknown) distribution. The goal of this project is to do away with this assumption and study Thompson sampling in non-stationary environments where the rewards may be generated by an arbitrary external process. More precisely, the project considers the framework of sequential prediction with expert advice, and aims to establish theoretical performance guarantees for this algorithm and/or analyze its performance empirically. The project requires strong mathematical skills, particularly in probability theory and multivariate calculus. Knowledge of convex analysis is a plus, but not absolutely necessary at the current stage.
Suggested readings: Learning to Optimize Via Posterior Sampling; An Information-Theoretic Analysis of Thompson Sampling
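In the stochastic setting that this project departs from, Thompson sampling has a very compact form. A minimal sketch for Bernoulli rewards with Beta priors (the arm means below are hypothetical): sample a plausible mean for each arm from its posterior, play the arm whose sample is largest, then update that arm's posterior.

```python
import numpy as np

# Beta-Bernoulli Thompson sampling on a hypothetical 3-armed bandit.
rng = np.random.default_rng(1)
means = np.array([0.3, 0.5, 0.7])       # true arm means, unknown to the learner
alpha = np.ones(3)                      # Beta(1, 1) prior for each arm
beta = np.ones(3)
for t in range(2000):
    theta = rng.beta(alpha, beta)       # one posterior sample per arm
    a = int(np.argmax(theta))           # act greedily w.r.t. the samples
    r = rng.random() < means[a]         # Bernoulli reward
    alpha[a] += r                       # conjugate posterior update
    beta[a] += 1 - r
```

The stochastic assumption enters precisely through the fixed `means` vector and the independent reward draws; the project asks what happens to this algorithm when an arbitrary external process generates the rewards instead.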
|
Better algorithms for online linear-quadratic control
Linear-quadratic control is one of the most well-studied problem settings in optimal control theory: it considers control systems where states follow a linear dynamics, and the incurred costs are quadratic in the states and control inputs. In recent years, the problem of online learning in linear-quadratic control problems has received significant attention within the machine-learning community. One particularly interesting development is the formulation of the control problem as a semidefinite program (SDP), which allows the application of tools from online convex optimization. The present project aims to develop new algorithms for online linear-quadratic control based on this framework by exploring the possibility of using regularization functions that make better use of the SDP geometry than existing methods based on online gradient descent. The project requires strong mathematical skills, particularly in multivariate calculus and linear algebra. Knowledge of control theory, convex analysis and optimization is a plus, but not absolutely necessary at the current stage.
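As background for the online setting, the classical offline solution is worth keeping in mind: with known dynamics, the optimal linear state-feedback gain comes from the discrete-time Riccati equation. A minimal sketch on a hypothetical double-integrator system (the matrices below are invented for illustration; the SDP formulation targeted by this project is an alternative route to the same object):

```python
import numpy as np

# Infinite-horizon discrete-time LQR via Riccati fixed-point iteration,
# for hypothetical dynamics x' = A x + B u and cost x'Qx + u'Ru.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])              # double integrator, dt = 0.1
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)                           # quadratic state cost
R = np.array([[1.0]])                   # quadratic input cost
P = np.eye(2)
for _ in range(500):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # gain: u = -K x
    P = Q + A.T @ P @ (A - B @ K)                      # Riccati update
```

The fixed point P encodes the quadratic value function; the SDP view of the same problem is what opens the door to online convex optimization tools and to the mirror-descent-style regularizers this project wants to explore.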
|
Deep Neural Multilingual Lexical Simplification
Lexical Simplification is a sub-task of Automatic Text Simplification that aims at replacing difficult words with easier-to-read (or easier-to-understand) synonyms while preserving the information and meaning of the original text. This is a key task to facilitate reading comprehension for different target readerships such as foreign language learners, native speakers with low literacy levels, or people with different reading impairments (e.g. dyslexic individuals). This master’s project aims at investigating novel techniques in Deep Learning and Natural Language Processing to improve current lexical simplification systems. The master student will have the opportunity to develop this research on an available multilingual dataset (English, Portuguese, Spanish) – the TSAR 2022 Shared Task Dataset – annotated by human informants. More specifically, the annotations for a given complex word in a sentence are a list of suitable substitutes which could be used instead of the complex word. The candidate is expected to develop cross-lingual or multilingual techniques using available pre-trained models. In the recent TSAR 2022 shared task challenge, several participating teams proposed deep learning techniques (e.g. neural language models) to solve the problem. The candidate is expected to have some knowledge of Natural Language Processing (NLP) and experience with current deep learning paradigms in NLP.
Suggested readings: ALEXSIS: A Dataset for Lexical Simplification in Spanish; Lexical simplification benchmarks for English, Portuguese, and Spanish
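A lexical simplification system is usually a pipeline: generate candidate substitutes (e.g. with a neural language model), then rank them by simplicity and contextual fit. As a deliberately simplified sketch of the ranking step only (the word list and frequency table below are hypothetical; real systems use corpus frequencies and language-model probabilities):

```python
# Toy substitute-ranking step of a lexical simplification pipeline.
# Hypothetical frequency table: higher corpus frequency = more familiar word.
freq = {"scrutinize": 3, "examine": 40, "inspect": 25, "check": 90}

def simplicity(word: str) -> float:
    # A crude simplicity score: frequent and short words rank higher.
    return freq.get(word, 0) / len(word)

# Candidate substitutes for the complex word "scrutinize" in some sentence,
# as a candidate-generation model might propose them.
candidates = ["examine", "inspect", "check"]
ranked = sorted(candidates, key=simplicity, reverse=True)
```

In the TSAR 2022 setting, the human-annotated substitute lists make it possible to evaluate both the generation and the ranking stage against gold substitutes.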
|
From Sign Language to Spoken Language: Experiments in Deep Multimodal Translation
In the era of mass communication and wide uptake of digital technologies amongst the general public, there still exist numerous communication barriers for the Deaf and Hard-of-Hearing (DHH) community. In spite of the recent advances in Machine Translation (MT) for spoken languages, automatic translation between spoken and Sign Languages or between Sign Languages remains a difficult problem. There is a great opportunity for research in MT to bridge the gap between written/spoken languages and Sign Languages. This master’s project aims at experimenting with current approaches in machine translation for sign languages, with emphasis on the Sign Language to Spoken Language direction. The student will first experiment with available well-known datasets (e.g. the German Sign Language PHOENIX14T dataset – video/text/glosses), exploring different models to extract patterns from video frames and decode them to produce spoken text. After that, the plan is to apply Transfer Learning to produce models for less-resourced sign languages (e.g. Spanish or Catalan Sign Language). The candidate is expected to have some knowledge of Natural Language Processing (NLP) and experience with current deep learning paradigms in NLP. Knowledge of image processing is highly desirable.
|
Audio-visual speech and singing voice separation
Source separation is the automatic estimation of the individual isolated sources that make up an audio mixture. The goal of this project is to separate a human voice from a mixture by using both the audio and video modalities. We are interested in both speech and singing voice signals. The most direct applications of speech separation are speaker identification and speech recognition (for example, to create automatic captioning of videos). Applications of singing voice separation include the automatic creation of karaoke tracks, music transcription, and music unmixing and remixing. Leveraging visual and motion information from the target person's face is particularly useful when there are different voices present in the mixture. Deep neural networks that extract features from the video sequence will be explored and used in conjunction with an audio network in order to improve the audio source separation task by incorporating visual cues.
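Most separation networks, audio-visual ones included, share the same output mechanism: they predict a time-frequency mask that is multiplied with the mixture spectrogram to recover the target source. A minimal sketch of that mechanism, using random arrays as hypothetical magnitude spectrograms and the "ideal ratio mask" computed from the ground truth for illustration (a real network would predict the mask from the mixture and the video features):

```python
import numpy as np

# Mask-based source separation in its simplest form.
rng = np.random.default_rng(0)
voice = rng.random((64, 100))            # hypothetical magnitude spectrograms
music = rng.random((64, 100))            # (frequency bins x time frames)
mix = voice + music                      # observed mixture
irm = voice / (voice + music + 1e-8)     # ideal ratio mask, values in [0, 1]
voice_est = irm * mix                    # masked mixture ~ target voice
```

In the audio-visual setting, the network that predicts `irm` would condition on features extracted from the speaker's face and lip motion, which is what disambiguates the target voice when several voices overlap in the mixture.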