Research at the Intelligent Multimodal Vision Analysis (IMVA) group
We investigate the automatic analysis and understanding of visual content to address real-world problems and applications, often involving modalities beyond vision, such as audio, natural language, ultrasound or magnetic resonance. We develop model-based and data-driven (deep learning) approaches, algorithms and innovative digital technologies, together with their theoretical analysis. Applications include, among others: making multimedia content accessible to people with visual, hearing or reading impairments, which may contribute to the development of more accessible devices; the analysis of the human face, in terms of both its morphology and its dynamics (e.g. expressions and emotions), with great potential for disciplines such as psychology, linguistics, neuroscience, health and developmental biology; the separation of the individual audio sources that make up the audio mixture of a video; and understanding and exploiting the correlations and complementarities among different modalities.
Department of Information and Communication Technologies
Tànger building (Poblenou campus)
Tànger, 122-140
08018 Barcelona