We hold weekly seminars, usually on Wednesdays at noon, in which we regularly schedule invited speakers. We will keep this calendar updated (and send reminders on Twitter and relevant mailing lists).
COLT Seminar
When: October 23rd, 2024, 12.00h to 13.00h
Where: Room 52.217, UPF-Roc Boronat Building
Title: Understanding Language Models via Theory, Interpretability, and Humans
Speaker: Michael Hahn (Saarland University)
Abstract: Recent progress in LLMs has rapidly outpaced our ability to understand their inner workings. This talk describes our work aiming to understand language models from three angles. First, we develop rigorous mathematical theory describing the abilities (and limitations) of transformers in performing computations foundational to reasoning. We also examine differences and similarities with state-space models such as Mamba. Second, we propose a method for reading out information from activations inside neural networks, and apply it to mechanistically interpret transformers performing various tasks. Third, we use language models to build a cognitive model of human language comprehension. I will close with directions for future research.
COLT Seminar
When: October 16th, 2024, 12.00h to 13.00h
Where: Room 52.217, UPF-Roc Boronat Building.
Title: From surprisal to processing effort, with the complexities underneath
Speaker: Tiago Pimentel (ETH Zurich)
Abstract: A major branch of psycholinguistics investigates how people read and infer a text’s meaning. Several hypotheses exist, a large subset of which predict that a word’s contextual predictability (typically quantified as its surprisal) plays an important role in this process. These hypotheses, however, differ in important ways, such as in what the surprisal–processing effort relationship should look like. Psycholinguists then gather evidence for or against each hypothesis by examining how a word’s surprisal relates to its processing effort. Unfortunately, we can measure neither a word’s surprisal nor its processing effort. These analyses thus typically proceed along the following roadmap: one starts with a language model that approximates a “true” data-generating distribution over text; one then converts the language model’s subword-level probabilities to word-level ones; one analyses how word-level surprisal predicts reading times; one then (usually implicitly) connects reading times and processing effort. In this talk, we’ll discuss each of the four points in this roadmap, analysing the complexities underlying them.
Bio: Tiago is a Postdoc at ETH Zurich, working with Thomas Hofmann. He is mainly interested in understanding sentence processing in both humans and language models, focusing on how to formalise the methods used to study these topics. Towards this goal, he uses concepts from information theory, causality, statistics, and natural language processing.
COLT Seminar
When: September 18th, 2024, 12.00h to 13.00h
Where: Room 52.217, UPF-Roc Boronat Building.
Title: From Input Attribution Interpretations to Reverse-engineering Language Models
Speaker: Javier Ferrando (Universitat Politècnica de Catalunya)
Abstract: The field of interpretability in NLP has seen rapid advancements in recent years. While initially the focus was on explaining predictions by analyzing the importance of input tokens, new methods are now bringing us closer to reverse-engineering the decision-making processes of language models.
COLT Seminar
When: July 3rd, 2024, 12.00h to 13.00h
Where: Room 52.701, UPF-Roc Boronat Building.
Title: On language, morality, and computation
Speaker: Yang Xu (Department of Computer Science, Cognitive Science Program, University of Toronto)
Abstract: Language and morality are intimately related. Morals can guide language use and development, while language can reveal people’s morals and moral perception. Advances in artificial intelligence (AI) have offered a new opportunity for studying these relations through the lens of computation. In this talk, I describe recent work that focuses on exploring the interplay of language, morality, and computation. First, I show that computational methodologies drawing on diachronic word embeddings support automated inference of historical shifts in moral sentiment toward concepts such as slavery over centuries and the cognitive processes underlying the evolution of the moral lexicon. Next, I show that morally relevant linguistic inputs serve important functions in probing moral emergence and bias in AI systems such as large language models. I close by discussing ongoing and future work on the synergy of language, morality, and computational intelligence.
COLT Seminar
When: June 26th, 2024, 12.00h to 13.00h
Where: Room 52.701, UPF-Roc Boronat Building.
Title: Analyzing self-supervised representations of speech: encoding structures of speaker information and phonetic context
Speaker: Oli Liu (University of Edinburgh)
Description: Mapping speech to meaning is a non-trivial computational problem. There is significant variability in the acoustic realization of a phoneme depending on the speaker and the phonetic context. Nevertheless, humans can easily overcome these challenges and comprehend speech with robustness. Recent advances in speech technology have created automatic speech recognition systems that approach human-level performance. Many of these systems make use of pre-trained models of speech, which are trained in a self-supervised manner without using any text labels, and yet are shown to encode significant linguistic information.
In my research, I use self-supervised learning models as a tool for understanding humans’ mental representations of speech. By analyzing the structures of the representation space and comparing them to properties found in the neural encoding of human listeners, I aim to identify potential computational mechanisms employed in speech perception. I will talk about two ongoing projects focusing on the representational geometry of speaker information and phonetic context respectively. I will discuss the implications of our findings for both understanding human speech perception and analysing self-supervised representations.
COLT Seminar
When: May 29th, 2024, 12.00h to 13.00h
Where: Room 52.217, UPF-Roc Boronat Building.
Speaker: Albert Gatt (Utrecht University)
Title: Visual grounding of verbs and nominalisations in multimodal models.
Description: What distinguishes a ‘runner’ from someone who is merely running? And what properties allow us to recognise that in one photograph, a man is arresting or apprehending someone, whereas in a different, but very similar scene, it’s just a case of someone holding another person by the arm? In this talk, I will focus on two sets of experiments on the grounding capabilities of vision-language transformer models. The common thread underlying these experiments is the interaction of linguistic and world knowledge, and the degree to which such models are able to integrate these when recognising the correspondence between a visual scene in an image, and a simple description involving a verb or a nominalisation.
COLT Seminar
When: May 8th, 2024, 12.00h to 13.00h
Where: Room 52.217, UPF-Roc Boronat Building.
Speaker: Mor Geva (Tel Aviv University)
Title: The Internal (Broken) Knowledge Graph of Large Language Models
Description: Some of the most pressing issues with large language models (LLMs), such as the generation of factually incorrect text and logically incorrect reasoning, may be attributed to the way models represent and recall knowledge internally. In this talk, we will evaluate the representation and utilization of knowledge dependencies in LLMs from two different perspectives. First, we will consider the task of knowledge editing, showing that (a) using various editing methods to edit a specific fact does not implicitly modify other facts that depend on it, and (b) some facts are often hard to disentangle. Next, we will consider the setting of latent multi-hop reasoning, showing that LLMs only weakly rely on knowledge dependencies when answering complex queries. While these shortcomings could potentially be mitigated by intervening on the LLM computation, they call for better training procedures and possibly new architectures.
COLT Seminar
When: February 28th, 2024, 12.00h to 13.00h
Where: Room 52.217, UPF-Roc Boronat Building.
Speaker: Sandro Pezzelle (University of Amsterdam)
Title: Dealing with implicit and underspecified language: A semantic challenge for large language models
Description: The language we use in everyday communicative contexts exhibits a variety of phenomena (such as ambiguity, missing information, or semantic features expressed only indirectly) that often make it implicit or underspecified. Despite this, people are good at understanding and interpreting it. This is possible because we can exploit additional information from the linguistic or extralinguistic context and shared or prior knowledge. Given the ubiquity of these phenomena, NLP models must handle them appropriately to avoid potentially harmful biased behavior. In this talk, I will present recent work investigating how state-of-the-art transformer large language models (LMs) handle these phenomena. In particular, I will focus on the understanding of sentences with atypical animacy (“a peanut fell in love”) and on the interpretation of sentences that are ambiguous (“Bob looked at Sam holding a yellow bag”) or where some information is missing or implicit (“don't spend too much”). I will show that, in some cases, LMs behave surprisingly similarly to speakers; in other cases, they fail quite spectacularly. I will argue that having access to multimodal information (e.g., from language and vision) should, in principle, give these models an advantage on these semantic phenomena, as long as we take a perspective aware of the communicative aspects of language use.
- Hanna, M., Belinkov, Y., and Pezzelle, S. (2023). When Language Models Fall in Love: Animacy Processing in Transformer Language Models. EMNLP 2023.
- Wildenburg, F., Hanna, M., and Pezzelle, S. (2024). Do Pre-Trained Language Models Detect and Understand Semantic Underspecification? Ask the DUST! arXiv preprint.
- Pezzelle, S. (2023). Dealing with Semantic Underspecification in Multimodal NLP. ACL 2023.
COLT Seminar
When: February 21st, 2024, 12.00h to 13.00h
Where: Room 52.701, UPF-Roc Boronat Building.
Speaker: Ramon Ferrer-i-Cancho (Universitat Politècnica de Catalunya)
Title: Towards a predictive theory of word order.
Description: The bulk of research on language is about description, namely exploration and description of linguistic phenomena by means of qualitative, statistical or formal tools. Innovation often reduces to the revision of previous generalizations with new experiments, new data or new methods. A small fraction of the description efforts are directed to testing hypotheses generated by theory. Theorizing requires approval by schools of thought. Ideas and models that are sustained mainly by authority arguments abound. To some extent, researchers succeed according to the degree they endorse or remain neutral with respect to appeals to authority.
Although prediction is key in science, it is rather rare in language research. Models often do not generalize in the sense that they are just the compression of some linguistic data and are not able to generate predictions beyond the original domain. That failure is often justified by the complexity of language and also by the high level of diversity of languages on Earth. Furthermore, it is often believed that only a few simplistic generalities, e.g., more frequent words tend to be shorter, can be predicted. Interestingly, authoritative theories are built neglecting these "simple" generalities. It is often argued that language research cannot be like physics. However, various linguistic theories have aimed to explain the origins of black holes while neglecting the explanation for the fall of an apple from a tree.
Here I will review recent advances in word order theory on predicting the optimal placement of heads and the variation with respect to these optimal placements. I will pay special attention to the principle of swap distance minimization and show that it is possible to postulate highly predictive principles that are also compatible with the diversity of languages.
COLT Seminar
When: January 17th, 2024, 13.00 to 14.00 (NOT 12.00 to 13.00)
Where: Room 52.737
Speaker: Gaël Le Mens, Universitat Pompeu Fabra
Title: Using LLMs for measurement in Social Science
Description: The talk will give an overview of some of my recent work that used LLMs for measurement in a variety of projects. We will talk about using BERT for classifying the topics of tweets written by Spanish politicians, fine-tuned BERT and ‘off-the-shelf’ GPT-4 for measuring the typicality of text documents in concepts, and GPT-4 for positioning text documents in policy and ideological spaces.
It is based on the following recent papers:
- Scaling Political Texts with ChatGPT (with Aina Gallego). arXiv pre-print.
- Uncovering the Semantics of Concepts Using GPT-4 (with Balázs Kovács, Michael Hannan, & Guillem Pros). PNAS, 2023.
- Social Media Feedback and Extreme Opinion Expression (with Elizaveta Konovalova and Nikolas Schöll). PLOS One, 2023.
- How politicians learn from citizens’ feedback: The case of gender on Twitter (with Nikolas Schöll and Aina Gallego). American Journal of Political Science, 2023.
COLT Seminar
Where: Room 52.737
We introduce an approach that can assess the relative information retained when using two different distance measures, and determine if they are equivalent, independent, or if one is more informative than the other. This test can be used to identify the most informative distance measure out of a pool of candidates, to compare the representations in deep neural networks, and to infer causali
COLT Seminar
Where: Room 52.737
COLT Seminar
When: September 13th, 2023, 12.00 to 13.00
Where: Room 52.737
Speaker: Luca Moschella, Sapienza University of Rome
Title: Leveraging Emerging Similarities for Latent Space Communication
Description: Neural networks encode the complex structure of data manifolds in high-dimensional spaces through latent representations. The distribution of data points in the latent space should ideally rely solely on the task, data, loss, and specific architecture constraints. However, factors like random weight initialization, training hyperparameters, and other sources of randomness during training can lead to incoherent latent spaces that hinder reuse. Notably, a consistent phenomenon emerges when data semantics remain unchanged: the angles between encodings within distinct latent spaces exhibit similarity. During this talk, we will delve into two empirical strategies that harness this phenomenon, facilitating latent communication across diverse architectures and data modalities:
- Relative Projection: We will demonstrate the creation of a new, relative representation that inherently remains invariant against such transformations.
- Direct Transformation: We will showcase how prior knowledge about relationships/transformations between different spaces can directly guide the translation from one space to another.
In both cases, we facilitate efficient communication between latent spaces, bridging gaps between distinct domains, models, and modalities; enabling zero-shot model stitching, reuse and latent evaluation. This holds true for both generation and classification tasks, showcasing the versatility and applicability of these strategies.
COLT Seminar
When: July 5th, 2023, 12.00 to 13.00
Where: Room 52.737
Speaker: Roberto Dessì
Title: Toolformer: Language Models Can Teach Themselves to Use Tools
Description: Language models (LMs) exhibit remarkable abilities to solve new tasks from just a few examples or textual instructions, especially at scale. They also, paradoxically, struggle with basic functionality, such as arithmetic or factual lookup, where much simpler and smaller models excel. In this talk I’ll present the Toolformer and show that LMs can teach themselves to use external tools via simple APIs and achieve the best of both worlds. Toolformer is a model trained to decide which APIs to call, when to call them, what arguments to pass, and how to best incorporate the results into future token prediction. This is done in a self-supervised way, requiring nothing more than a handful of demonstrations for each API. Toolformer incorporates a range of tools, including a calculator, a Q&A system, two different search engines, a translation system, and a calendar. It achieves substantially improved zero-shot performance across a variety of downstream tasks, often competitive with much larger models, without sacrificing its core language modeling abilities.
COLT Seminar
When: June 29th, 2023, 12.00 to 13.00
Where: Room 52.737
Speaker: Mario Giulianelli, University of Amsterdam
Title: Using neural language generators as computational models of human language use.
Description: While natural language generation (NLG) systems are widely deployed in real-world applications, evidence that they faithfully reproduce aspects of human linguistic behaviour is scarce. In a first study, we analyse variability in human production, a characterising aspect of language production that is often overlooked in NLG research. We propose a statistical framework to quantify variability and to assess language generators' alignment to the production variability observed in humans. In a second study, we use the previously introduced statistical framework to define a novel measure of utterance surprisal that quantifies (un)predictability as distance from plausible alternatives (here, we generate alternatives using neural NLG systems). We test the psychometric predictive power of this measure, showing that it predicts human acceptability judgements better than (standard) probabilistic surprisal and that it is complementary to probabilistic surprisal as a predictor of utterance-level reading times. Overall, these two studies contribute new empirical evidence that neural language generators can be used for the computational modelling of human language use.
COLT Seminar
When: June 12th, 2023, 14.30 to 15.30
Where: Room 52.737
Speaker: Milica Denić, Tel Aviv University
Title: Recursive numeral systems optimize the trade-off between lexicon size and average morphosyntactic complexity
Description: Human languages vary in terms of which meanings they lexicalize, but there are important constraints on this variation. It has been argued that languages are under two competing pressures: the pressure to be simple (e.g., to have a small lexicon size) and the pressure to allow for informative (i.e., precise) communication with their lexical items, and that which meanings get lexicalized may be explained by languages finding a good way to trade off between these two pressures (Kemp and Regier, 2012 and much subsequent work). However, in certain semantic domains, it is possible to reach very high levels of informativeness even if very few meanings from that domain are lexicalized. This is due to productive morphosyntax, which may allow for the construction of meanings which are not lexicalized. Consider the semantic domain of natural numbers: many languages lexicalize few natural number meanings as monomorphemic expressions, but can precisely convey any natural number meaning using morphosyntactically complex numerals. In such semantic domains, lexicon size is not in direct competition with informativeness. What explains which meanings are lexicalized in such semantic domains? We will argue that in such cases, languages are (near-)optimal solutions to a different kind of trade-off problem: the trade-off between the pressure to lexicalize as few meanings as possible (i.e., to minimize lexicon size) and the pressure to produce utterances that are as morphosyntactically simple as possible (i.e., to minimize average morphosyntactic complexity of utterances). This study, in conjunction with previous work on communicative efficiency, suggests that, in order to explain which meanings get lexicalized across languages and across semantic domains, a more general approach may be that languages are finding a good way to trade off between not two but three pressures: be simple, be informative, and minimize average morphosyntactic complexity of utterances.
COLT Seminar
Where: Room 52.737
Speaker: Mathieu Rita, ENS/CoML - INRIA/Microsoft Research
Bio: I am a Ph.D. student under the supervision of Emmanuel Dupoux (ENS/FAIR), Olivier Pietquin (Google Brain) and Florian Strub (DeepMind). I work between the Inria-Microsoft Research Joint Lab and the CoML team in Paris, which is located at ENS Paris. Prior to that, I received an engineering degree from Ecole Polytechnique and an MSc degree in Mathematics, Computer Vision and Machine Learning from Ecole Normale Supérieure Paris-Saclay. My research explores the theoretical and experimental aspects of training RL objectives with language models, with a specific focus on constructing self-play multi-agent systems. I particularly investigate how scaling populations and generations of agents can help address language learning challenges, such as overfitting, exploration or drift. As an application, I simulate language evolution and study the prerequisites for the emergence of language universals, such as compositionality.
Title: Neural Communication Games
Description: In this talk, I will present machine learning and information theoretical views of neural communication games. These perspectives provide insights into the dynamics of those games and explain alignments/misalignments between neural emergent communication results and empirical findings from cognitive science and socio-linguistics (population effects, iterated learning, etc.).
COLT Seminar
Where: Room 52.737
Speaker: Clément Romac, Hugging Face - INRIA
Bio: A first-year PhD student jointly supervised by Pierre-Yves Oudeyer (FLOWERS, Inria) and Thomas Wolf (Hugging Face), studying how autonomous Deep RL agents can leverage large language models.
Title: Grounding LLMs in Interactive Environments with Online RL.
Description: Recent works have successfully leveraged the ability of Large Language Models (LLMs) to capture abstract knowledge about the world's physics to solve decision-making problems. Yet, the alignment between LLMs' knowledge and the environment can be wrong and limit functional competence due to a lack of grounding. We study an approach to achieve this alignment through functional grounding: we consider an agent using an LLM as a policy that is progressively updated as the agent interacts with the environment, leveraging online Reinforcement Learning to improve its performance in solving goals. Using an interactive textual environment designed to study higher-level forms of functional grounding, and a set of spatial and navigation tasks, we study several scientific questions: 1) Can LLMs boost sample efficiency for online learning of various RL tasks? 2) How can they boost different forms of generalization? 3) What is the impact of online learning? We study these questions by functionally grounding several variants (size, architecture) of FLAN-T5.
COLT Seminar
Where: Room 52.737 or online on zoom
Speaker: Emanuele La Malfa
Description: In this presentation, I will be discussing the concept of robustness in the field of natural language processing (NLP) and the various research questions that can be explored through this lens. Specifically, I will delve into semantic robustness, which encompasses a broad range of linguistic phenomena and is a treatment-response notion that is distinct from the traditional idea of robustness in computer vision, which only takes into account word substitutions and deletions. Additionally, I will explore syntax robustness, which refers to a model’s ability to accurately represent language structures even when faced with manipulations. Furthermore, I will highlight how robustness can be used not only as a desirable property of a model, but also as a means of formally explaining a model’s decisions. Finally, I will share some of the projects I am currently working on, including one on the robustness of language models for code, and another on how to accurately measure robustness in the presence of input data noise.