AMORE seminar - Friday 27 April 12.00h, Raffaella Bernardi: See both the forest and the trees: Grounding by zooming out or zooming in
See both the forest and the trees: Grounding by zooming out or zooming in
- Speaker: Raffaella Bernardi
- When: Friday 27 April 12.00h
- Where: 52.701 (Roc Boronat)
Humans can see the forest, but can also see the trees if they need to. Current language and vision models can be trained either to see the forest (but they would miss the leaves) or to see its leaves (but they would miss the forest). In other words, they fail to zoom in and out. We show that their coarse representations are meaningful when fuzzy operations need to be performed, like set comparison or vague quantification, but are not suitable for tasks like grounded textual entailment, which require a fine-grained representation. We conjecture that such models could gain a precise representation through grounded interaction. Through a dialogue about objects in an image, a model can learn to zoom in on the forest's leaves and zoom back out to see the forest when the focus on the image has to change. To this end, we discuss our grounded conversational agents, inspired by cognitive studies and the tradition of dialogue systems.