The GLiF and COLT cordially invite you to the workshop on anaphora and predictability. Attendance is free and no registration is required.

When: April 29, 2024

Where: 55.309 (UPF - Tànger Building; access through Roc Boronat Building, Carrer Roc Boronat 138)

Schedule (see abstracts below):

  • 15:00-15:45: Jennifer Arnold, The University of North Carolina at Chapel Hill
  • 15:45-16:30: Yufang Hou, IBM Research
  • 16:30-17:15: Andrew Kehler, University of California San Diego
  • 17:15: Coffee break

Jennifer Arnold: "Where does referential predictability come from? The role of input frequency"
 
Several current theories of reference processing suggest that a key representation is “referential probability”, or the likelihood that potential referents will be re-mentioned (Arnold, 1998; Arnold et al., 2018; Frank & Goodman, 2012; Hartshorne et al., 2015; Kehler et al., 2008; Kehler & Rohde, 2013, 2019). However, there has been almost no discussion about where this representation comes from (but see Guan & Arnold, 2021). Several studies show that semantic constraints guide predictability: for example, in “Ana admired Liz because she…”, people expect the implicit cause (Liz) to be mentioned, and this guides comprehension (e.g., Kehler et al., 2008). Logically, this expectation could be calculated on the fly, on the basis of real-world knowledge and semantic constraints (see Hartshorne et al., 2015 for a proposed mechanism of this kind). But another possibility is that people learn about re-mention probability from exposure to the statistics of reference.
 
This talk reports the findings from a series of studies that test whether input frequency guides representations of referential probability. Using an adaptation paradigm, we test whether people adapt to short-term changes in the frequency of referential patterns, as measured by their effect on the comprehension of ambiguous pronouns. We find that they do: they adapt to the frequency of subject vs. nonsubject antecedents for pronouns, and also to semantic constraints such as goal/source or cause/noncause antecedents for pronouns (Johnson & Arnold, 2023; Ye & Arnold, 2023; Arnold, under review; Arnold, in prep.). This shows that people can track referential frequency and use it to guide pronoun comprehension. However, we did not find any adaptation when the referential patterns were achieved with noun anaphors. This is surprising, because it suggests that people may have been specifically tracking the behavior of pronouns, and not referential probability per se. We consider the implications of these findings for current theories.

Yufang Hou: "Bridging Resolution: Advancements and Challenges in the Era of Large Language Models"

Information status and anaphora play an important role in text cohesion and language understanding. In this talk, I will first give an overview of my decade-long work on bridging resolution and information status classification. Next, I will discuss our research findings on probing LLMs for bridging inference. Finally, I’ll share some thoughts on a few challenging research questions in modelling referential discourse entities.

Andrew Kehler: "Anaphora and Predictability in Humans and Large Language Models: The Case of Ellipsis"

Until quite recently, no one would have looked to the language models employed in computational linguistics for insight into questions surrounding anaphora and predictability in language. The astonishing success of recent large language models (LLMs, e.g., ChatGPT and GPT-4) and the emergent behaviors they exhibit force a rethink of that position, and open new avenues for linguistic theorizing. As a potential linguistic application, I’ll begin by surveying a long-running debate concerning the way in which two forms of ellipsis (VP-ellipsis and sluicing) are interpreted. Against the prevailing wisdom, I’ll argue that the evidence suggests that these forms (i) are interpreted via referential (i.e., not syntactic) processes, but (ii) also trigger the reconstruction of syntactically licit material by the addressee that the speaker could plausibly have had in mind. I’ll then describe two experiments (joint work with Till Poppels) that investigate the acceptability of ellipsis with inferred referents and the interpretations that participants recover, which reveal sets of highly acceptable cases that fall well outside of the identity conditions posited by current theories. I will then describe versions of the experiments using LLMs as participants, which to some degree demonstrate human-like sensitivity to the eccentricities of different classes of examples. The fact that these abilities emerge from training on a word-prediction task reveals novel directions for examining the roles of anaphora and predictability in ellipsis interpretation.

Organizers: Gemma Boleda (COLT), Xixian Liao (COLT) and Laia Mayol (GLiF).

Funding: This workshop is funded by the project EXPEDIS (PID2021-122779NB-I00) funded by MICIU/AEI/10.13039/501100011033 and by “ERDF A way of making Europe” and by the COLT research group.