Events

We have weekly seminars, usually on Wednesdays at noon, in which we regularly schedule invited speakers. We will keep this calendar updated (and send reminders on Twitter and relevant mailing lists). Feel free to join, or ask for a Zoom passcode if you're interested in an announced topic!

COLT Seminar

When: November 29th, 2023, 12.00 to 13.00
Where: Room 52.737
Speaker: Alessandro Laio, SISSA (Scuola Internazionale Superiore di Studi Avanzati)
Title: Identifying informative distance measures in high-dimensional feature spaces
Description:
Real-world data typically contain a large number of features that are often heterogeneous in nature, relevance, and units of measure. When assessing the similarity between data points, one can build various distance measures using subsets of these features. Finding a small set of features that still retains sufficient information about the dataset is important for the successful application of many statistical learning approaches.
We introduce an approach that can assess the relative information retained when using two different distance measures, and determine whether they are equivalent, independent, or one is more informative than the other. This test can be used to identify the most informative distance measure out of a pool of candidates, to compare the representations in deep neural networks, and to infer causality in high-dimensional dynamic processes and time series.
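
For intuition, the following is a minimal sketch of this kind of test (illustrative names, not code from the talk): two distance measures over the same data points are compared by checking how well nearest neighbors under one measure predict nearest-neighbor ranks under the other.

    import numpy as np
    from scipy.spatial.distance import cdist

    def information_imbalance(X_a, X_b):
        """Sketch of Delta(A -> B): ~0 if distance A predicts distance B,
        ~1 if it carries no information about it. X_a and X_b are the same
        data points described by two different feature subsets."""
        n = len(X_a)
        d_a = cdist(X_a, X_a)          # pairwise distances under measure A
        d_b = cdist(X_b, X_b)
        np.fill_diagonal(d_a, np.inf)  # exclude self-matches under A
        nn_a = d_a.argmin(axis=1)      # each point's nearest neighbor under A
        ranks_b = d_b.argsort(axis=1).argsort(axis=1)  # rank 1 = nearest under B
        r = ranks_b[np.arange(n), nn_a]
        return 2.0 * r.mean() / n

Computing this quantity in both directions distinguishes the three cases: equivalent measures (both values near 0), independent measures (both near 1), or one measure strictly more informative than the other (asymmetric values).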

COLT Seminar

When: November 15th, 2023, 12.00 to 13.00
Where: Room 52.737
Speaker: Andrew Lampinen, Google DeepMind
Title: Comparing language models & humans: reasoning and grammar
Description:
Everyone is now discussing Language Models (LMs), and arguing about whether they can understand language, reason and plan, or produce original content. Is there value in comparing the behavior of language models and humans? And if so, how should we make such comparisons? In this talk, I will draw inspiration from comparative psychology to suggest that careful methods are needed to ensure that comparisons between humans and models are made fairly, and to highlight an important distinction between LMs and cognitive models that can lead to unfair comparisons. But I will also argue that careful comparisons of LMs to humans offer an opportunity to reconsider our assumptions about the origin and nature of human capabilities. I will illustrate these arguments by focusing on two of our recent papers comparing language models to humans: we find that LMs can process recursive grammar structures more reliably than prior work has suggested, and that LMs show human-like content effects on logical reasoning tasks.

COLT Seminar

When: September 13th, 2023, 12.00 to 13.00
Where: Room 52.737
Speaker: Luca Moschella, Sapienza University of Rome
Title: Leveraging Emerging Similarities for Latent Space Communication
Description: Neural networks encode the complex structure of data manifolds in high-dimensional spaces through latent representations. The distribution of data points in the latent space should ideally depend solely on the task, data, loss, and specific architecture constraints. However, factors like random weight initialization, training hyperparameters, and other sources of randomness during training can lead to incoherent latent spaces that hinder reuse. Notably, a consistent phenomenon emerges when data semantics remain unchanged: the angles between encodings within distinct latent spaces exhibit similarity. During this talk, we will delve into two empirical strategies that harness this phenomenon, facilitating latent communication across diverse architectures and data modalities:
Relative Projection: we will demonstrate the creation of a new, relative representation that inherently remains invariant to such transformations.
Direct Transformation: we will showcase how prior knowledge about relationships/transformations between different spaces can directly guide the translation from one space to another.

In both cases, we facilitate efficient communication between latent spaces, bridging gaps between distinct domains, models, and modalities, and enabling zero-shot model stitching, reuse, and latent evaluation. This holds true for both generation and classification tasks, showcasing the versatility and applicability of these strategies.
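
As a rough illustration of the relative projection strategy (a sketch under our own naming, not the authors' code): each embedding is re-expressed through its cosine similarities to a fixed set of anchor samples, which by construction is invariant to angle-preserving transformations of the latent space (e.g., rotations and uniform rescalings).

    import numpy as np

    def relative_projection(Z, anchors):
        """Map absolute embeddings Z (n x d) to relative coordinates:
        the cosine similarity of each row to each anchor embedding (k x d)."""
        Z_n = Z / np.linalg.norm(Z, axis=1, keepdims=True)
        A_n = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
        return Z_n @ A_n.T   # (n x k): coordinates w.r.t. the anchors

Two encoders trained with different seeds produce incompatible absolute coordinates, but if the angles between encodings are preserved, their relative representations of the same anchor inputs approximately coincide, which is what enables stitching an encoder from one run to a decoder trained on another.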

COLT Seminar

When: July 5th, 2023, 12.00 to 13.00
Where: Room 52.737
Speaker: Roberto Dessì
Title: Toolformer: Language Models Can Teach Themselves to Use Tools
Abstract: Language models (LMs) exhibit remarkable abilities to solve new tasks from just a few examples or textual instructions, especially at scale. They also, paradoxically, struggle with basic functionality, such as arithmetic or factual lookup, where much simpler and smaller models excel. In this talk I’ll present the Toolformer and show that LMs can teach themselves to use external tools via simple APIs and achieve the best of both worlds. Toolformer is a model trained to decide which APIs to call, when to call them, what arguments to pass, and how to best incorporate the results into future token prediction. This is done in a self-supervised way, requiring nothing more than a handful of demonstrations for each API. Toolformer incorporates a range of tools, including a calculator, a Q&A system, two different search engines, a translation system, and a calendar. It achieves substantially improved zero-shot performance across a variety of downstream tasks, often competitive with much larger models, without sacrificing its core language modeling abilities.
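
As a rough sketch of the self-supervised filtering step the abstract alludes to (placeholder names, simplified from the paper): a candidate API call sampled for a text position is kept only if conditioning on the call and its result makes the following tokens easier to predict.

    def keep_api_call(lm_loss, prefix, call, result, continuation, tau=1.0):
        """lm_loss(context, target) -> average negative log-likelihood of
        `target` given `context` under the LM (placeholder signature)."""
        base = lm_loss(prefix, continuation)                        # no tool use
        with_call = lm_loss(f"{prefix} [{call} -> {result}]", continuation)
        return base - with_call >= tau   # the call must help by a margin

For example, for the prefix "Out of 1400 participants, 400", a sampled call might be Calculator(400 / 1400) with result 0.29; it survives filtering only if it lowers the loss on the continuation. The surviving annotated texts are then used to fine-tune the LM, which is how it learns when and how to call each tool.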

COLT Seminar

When: June 29th, 2023, 12.00 to 13.00
Where: Room 52.737
Speaker: Mario Giulianelli, University of Amsterdam
Title: Using neural language generators as computational models of human language use
Description: While natural language generation (NLG) systems are widely deployed in real-world applications, evidence that they faithfully reproduce aspects of human linguistic behaviour is scarce. In a first study, we analyse variability in human production, a characterising aspect of language production that is often overlooked in NLG research. We propose a statistical framework to quantify variability and to assess language generators' alignment to the production variability observed in humans. In a second study, we use the previously introduced statistical framework to define a novel measure of utterance surprisal that quantifies (un)predictability as distance from plausible alternatives (here, we generate alternatives using neural NLG systems). We test the psychometric predictive power of this measure, showing that it predicts human acceptability judgements better than (standard) probabilistic surprisal and that it is complementary to probabilistic surprisal as a predictor of utterance-level reading times. Overall, these two studies contribute new empirical evidence that neural language generators can be used for the computational modelling of human language use.
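
A hedged sketch of the alternative-based measure (placeholder components, not the exact formulation from the papers): instead of scoring an utterance by -log p(utterance | context), score it by its distance from plausible alternatives sampled from a neural generator for the same context.

    import numpy as np

    def alternative_based_surprisal(context, utterance, generator, embed,
                                    distance, k=10):
        """Mean distance between the observed utterance and k plausible
        alternatives; `generator`, `embed`, and `distance` stand in for an
        NLG sampler, a sentence encoder, and a semantic distance."""
        alternatives = [generator(context) for _ in range(k)]
        u = embed(utterance)
        return float(np.mean([distance(u, embed(a)) for a in alternatives]))

Under this view an utterance is unpredictable not because its exact wording is improbable, but because it is far from anything the generator considers a plausible production in that context.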

COLT Seminar

When: June 12th, 2023, 14.30 to 15.30
Where: Room 52.737
Speaker: Milica Denić, Tel Aviv University
Title: Recursive numeral systems optimize the trade-off between lexicon size and average morphosyntactic complexity
Description: Human languages vary in terms of which meanings they lexicalize, but there are important constraints on this variation. It has been argued that languages are under two competing pressures: the pressure to be simple (e.g., to have a small lexicon size) and to allow for informative (i.e., precise) communication with their lexical items, and that which meanings get lexicalized may be explained by languages finding a good way to trade off between these two pressures (Kemp and Regier, 2012, and much subsequent work). However, in certain semantic domains, it is possible to reach very high levels of informativeness even if very few meanings from that domain are lexicalized. This is due to productive morphosyntax, which may allow for the construction of meanings which are not lexicalized. Consider the semantic domain of natural numbers: many languages lexicalize few natural number meanings as monomorphemic expressions, but can precisely convey any natural number meaning using morphosyntactically complex numerals. In such semantic domains, lexicon size is not in direct competition with informativeness. What explains which meanings are lexicalized in such semantic domains? We will argue that in such cases, languages are (near-)optimal solutions to a different kind of trade-off problem: the trade-off between the pressure to lexicalize as few meanings as possible (i.e., to minimize lexicon size) and the pressure to produce utterances that are as morphosyntactically simple as possible (i.e., to minimize the average morphosyntactic complexity of utterances). This study, in conjunction with previous work on communicative efficiency, suggests that, in order to explain which meanings get lexicalized across languages and across semantic domains, a more general approach may be that languages are finding a good way to trade off between not two but three pressures: be simple, be informative, and minimize the average morphosyntactic complexity of utterances.
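
The trade-off is easy to see in a toy simulation (ours, not the paper's model): under a crude multiplicative-additive numeral grammar, adding lexicalized decades shrinks the average morpheme count of numerals at the cost of a larger lexicon.

    def complexity(n, lexicon):
        """Morpheme count for n under a toy multiplicative-additive grammar:
        n is expressed as q * m + r over lexicalized values m."""
        if n in lexicon:
            return 1
        best = None
        for m in sorted(lexicon, reverse=True):
            if 1 < m <= n:
                q, r = divmod(n, m)                    # e.g. 35 = 3 * 10 + 5
                cost = 1 + (complexity(q, lexicon) if q > 1 else 0)
                if r:
                    cost += complexity(r, lexicon)
                best = cost if best is None else min(best, cost)
        return best

    lex_small = set(range(1, 11))                           # "one" ... "ten"
    lex_big = lex_small | {20, 30, 40, 50, 60, 70, 80, 90}  # + lexical decades
    for lex in (lex_small, lex_big):
        avg = sum(complexity(n, lex) for n in range(1, 100)) / 99
        print(f"lexicon size {len(lex)}: avg morphemes {avg:.2f}")

The larger lexicon wins on average complexity and loses on size; which mixtures of the two pressures are attested across languages is exactly the question the talk addresses.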

COLT Seminar

When: May 24th, 2023, 12.00 to 13.00
Where: Room 52.737
Speaker: Mathieu Rita, ENS/CoML - INRIA/Microsoft Research
Bio:
I am a Ph.D. student under the supervision of Emmanuel Dupoux (ENS/FAIR), Olivier Pietquin (Google Brain) and Florian Strub (DeepMind). I work between the Inria-Microsoft Research Joint Lab and the CoML team in Paris, which is located at ENS Paris. Prior to that, I received an engineering degree from Ecole Polytechnique and an MSc degree in Mathematics, Computer Vision and Machine Learning from Ecole Normale Supérieure Paris-Saclay. My research explores the theoretical and experimental aspects of training RL objectives with language models, with a specific focus on constructing self-play multi-agent systems. I particularly investigate how scaling populations and generations of agents can help address language learning challenges, such as overfitting, exploration, or drift. As an application, I simulate language evolution and study the prerequisites necessary for the emergence of language universals, such as compositionality.
Title: Neural Communication Games
Description:
In this talk, I will present machine learning and information-theoretic views of neural communication games. These perspectives provide insights into the dynamics of those games and explain alignments/misalignments between neural emergent communication results and empirical findings from cognitive science and socio-linguistics (population effects, iterated learning, etc.).

COLT Seminar

When: May 18th, 2023, 11.00 to 12.00
Where: Room 55.410
Speaker: Yang Xu, Department of Computer Science, Cognitive Science Program, University of Toronto
Title: Semantic chaining: A computational account of word meaning extension
Description:
Humans make creative use of words to express emerging meanings, a process that often results in word meaning extension. Word meaning extension is a prominent form of lexical creativity, but how it works is not well understood. I describe recent research on semantic chaining and a computational framework for modeling the mechanism and knowledge involved in word meaning extension. I first present a case study that formalizes chaining as probabilistic models of categorization and applies these models to account for historical extensions and choices of numeral classifiers. I then show that similar models enriched with grounded knowledge reconstruct children’s overextension patterns documented in the literature. I close by discussing related work and applications toward a computational account of human lexical creativity.
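
A minimal sketch of chaining as probabilistic categorization, in the spirit of the exemplar models mentioned above (illustrative code, not the talk's implementation): a word extends to a novel meaning in proportion to the summed similarity between the new item and the word's existing exemplars in semantic space.

    import numpy as np

    def chaining_probabilities(x, exemplars_by_word, h=1.0):
        """P(word | novel item x) under an exemplar model with a Gaussian
        kernel; exemplars_by_word maps words to (n_i x d) vector arrays."""
        scores = {}
        for word, E in exemplars_by_word.items():
            sq_dists = ((E - x) ** 2).sum(axis=1)
            scores[word] = np.exp(-sq_dists / h).sum()  # summed similarity
        total = sum(scores.values())
        return {w: s / total for w, s in scores.items()}

A prototype variant would replace the summed kernel with the distance to each word's mean exemplar; comparing such model variants against attested historical extensions is the kind of analysis the talk describes.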

COLT Seminar

When: May 10th, 2023, 12.00 to 13.00
Where: Room 52.737
Speaker: Clément Romac, Hugging Face - INRIA
Bio: A first-year PhD student jointly supervised by Pierre-Yves Oudeyer (FLOWERS, Inria) and Thomas Wolf (Hugging Face), studying how autonomous deep RL agents can leverage large language models.
Title: Grounding LLMs in Interactive Environments with Online RL
Description: Recent works have successfully leveraged Large Language Models' (LLMs) abilities to capture abstract knowledge about the world's physics to solve decision-making problems. Yet, the alignment between LLMs' knowledge and the environment can be wrong and limit functional competence due to a lack of grounding. We study an approach to achieve this alignment through functional grounding: we consider an agent using an LLM as a policy that is progressively updated as the agent interacts with the environment, leveraging online Reinforcement Learning to improve its performance at solving goals. Using an interactive textual environment designed to study higher-level forms of functional grounding, and a set of spatial and navigation tasks, we study several scientific questions: 1) Can LLMs boost sample efficiency for online learning of various RL tasks? 2) How can they boost different forms of generalization? 3) What is the impact of online learning? We study these questions by functionally grounding several variants (size, architecture) of FLAN-T5.
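
Schematically, the "LLM as policy" setup can be sketched as follows (placeholder APIs; the actual work trains with PPO rather than the vanilla sampling shown here): the policy's distribution over a fixed set of textual actions comes from the LM's probability of each action string given the goal and observation.

    import numpy as np

    def act(lm_logprob, goal, obs, actions):
        """Sample an action from a softmax over the LM's log-probability of
        each candidate action string, conditioned on the textual context.
        `lm_logprob(context, string)` is a placeholder."""
        prompt = f"Goal: {goal}\nObservation: {obs}\nAction:"
        logits = np.array([lm_logprob(prompt, a) for a in actions])
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        i = np.random.choice(len(actions), p=probs)
        return actions[i], np.log(probs[i])

Online RL then updates the LM's weights so that actions leading to reward become more probable, which is what grounds the model's prior textual knowledge in the environment's actual dynamics.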

COLT Seminar

When: April 19th, 2023, 12.00 to 13.00
Where: Room 52.737, or online on Zoom
Speaker: Emanuele La Malfa
Abstract: In this presentation, I will discuss the concept of robustness in natural language processing (NLP) and the various research questions that can be explored through this lens. Specifically, I will delve into semantic robustness, which encompasses a broad range of linguistic phenomena and is a treatment-response notion distinct from the traditional idea of robustness inherited from computer vision, which in NLP takes into account only word substitutions and deletions. Additionally, I will explore syntax robustness, which refers to a model's ability to accurately represent language structures even when faced with manipulations. Furthermore, I will highlight how robustness can be used not only as a desirable property of a model, but also as a means of formally explaining a model's decisions. Finally, I will share some of the projects I am currently working on, including one on the robustness of language models for code, and another on how to accurately measure robustness in the presence of input data noise.
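
As one concrete (and deliberately simple) way to operationalize such measurements, a sketch with placeholder components: apply meaning-preserving perturbations to inputs and record how often the model's prediction survives.

    def empirical_robustness(model, perturb, texts, k=20):
        """Fraction of perturbed inputs whose prediction is unchanged;
        `model` and `perturb` are placeholders (e.g., a classifier and a
        synonym-substitution or paraphrase generator)."""
        kept = total = 0
        for text in texts:
            y = model(text)
            for _ in range(k):
                kept += int(model(perturb(text)) == y)
                total += 1
        return kept / total

Semantic robustness in the sense above calls for richer perturbation families than word substitution and deletion, and the measurement itself must account for noise in the inputs.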

Past COLT Events
