Research seminars by Annamaria Mesaros and Romain Serizel

Monday 16th of October, 15:30 in room 55.410 and Friday 20th of October, 15:30 in room 55.309 (Universitat Pompeu Fabra)

10.10.2023

 

Annamaria Mesaros (Monday 16th of October, 15:30, 55.410)

Title: An Overview of Detection and Classification of Acoustic Scenes and Events

Abstract: This talk will give a short overview of a few selected research topics on Detection and Classification of Acoustic Scenes and Events (DCASE). DCASE is a wide research area mainly concerned with the analysis of everyday audio, with the aim of extracting information about the acoustic environment using audio signal processing and machine learning techniques. We humans recognize the sounds around us and through them understand our surroundings and react to different events. This is, in general terms, the main objective in DCASE: creating "machines" that understand the acoustic environment in a way similar to how humans do.
The first part of the talk will introduce a few basic tasks of the DCASE Challenge from a broad perspective of audio content analysis, along with the general machine learning pipeline for solving the problems and the typical task-specific modifications. The second part will focus on data collection from the perspective of recording and annotation, introducing some of the datasets collected by the Audio Research Group at Tampere University and the procedures used for planning the recordings and annotations. Finally, the talk will raise a few questions on topics of interest for possible collaboration around audio classification, data collection, and annotation.

Bio: Annamaria Mesaros is an Assistant Professor at Tampere University. She received her PhD in Signal Processing at Tampere University of Technology in 2012. Her research focuses on sound event detection in real-world multisource environments, a topic she has worked on for over 10 years, and includes over 50 scientific publications and many open datasets. She is the coordinator of the international evaluation challenge on Detection and Classification of Acoustic Scenes and Events (DCASE), and organizer of many different acoustic scene classification and sound event detection tasks within the challenge. She is currently an Academy of Finland Research Fellow for "Teaching Machines to Listen", a member of the Audio and Acoustic Signal Processing Technical Committee of the IEEE Signal Processing Society, and vice-chair of the DCASE Steering Group.

 

Romain Serizel (Friday 20th of October, 15:30, 55.309)

Title: Performance vs energy consumption? A case study on sound event detection.

Abstract: With the increasingly complex models used in machine learning and the large amount of data needed to train these models, machine learning based solutions can have a large environmental impact. Even if a few hundred experiments are sometimes needed to train a working model, the cost of the training phase represents only 10% to 20% of the total CO2 emissions of the related machine learning usage (the rest lying in the inference phase). Yet, as machine listening researchers, the largest part of our energy consumption lies in the training phase. Even though models used in machine listening are smaller than those used in natural language processing or image generation, they still present similar problems. Comparing the energy consumption of a system trained on different sites can be a complex task, and the relation between the system performance and its energy footprint can be difficult to interpret. In this presentation we will focus on the task of sound event detection. First, we will study the energy consumption under various configurations to assess the aspects that can potentially affect the measurement of the energy consumption. Then, with these aspects in mind, we will propose a benchmark of several sound event detection systems in terms of performance and energy consumption.

Bio: Romain Serizel obtained his PhD from KU Leuven (Leuven, Belgium) in 2021. He is an Associate Professor with Université de Lorraine (Nancy, France), doing research on robust speech communications and ambient sound analysis. He has been co-organizing DCASE tasks since 2018, including Task 4, which has included the evaluation of the submissions' energy consumption since 2022. Since 2019 he has been general co-chair of the DCASE Challenge together with Annamaria Mesaros, and he is a member of the DCASE Steering Group. He was DCASE Workshop general co-chair in 2022.
