The current focus of the Audio Signal Processing Lab of the MTG is to combine audio signal processing methods with machine learning and semantic technologies in order to create large and structured sound and music collections and to extract useful musical knowledge from them. Our research is partly supported by several EU and national projects.
Xavier Serra, Faculty, Head of lab
Alastair Porter, Researcher
Dmitry Bogdanov, Postdoc
Oriol Romani, Researcher
Frederic Font, Postdoc
Rafael Caro Repetto, PhD student
Sankalp Gulati, Postdoc
Georgi Dzhambazov, PhD student
Gopala Krishna Koduri, Postdoc
Xavier Favory, PhD student
Sertan Şentürk, Postdoc
Eduardo Fonseca, PhD student
Hasan Sercan Atlı, Researcher
Sergio Oramas, PhD student
Andrés Ferraro, Researcher
Jordi Pons, PhD student
Swapnil Gupta, Researcher
Gong Rong, PhD student
In the context of CompMusic we are interested in the development of music description techniques through the study of the art music traditions of India (Hindustani and Carnatic), Turkey (Turkish-makam), the Maghreb (Arab-Andalusian), and China (Beijing Opera) (e.g., Serra, 2012; Serra, 2011). Our approach combines signal-processing and machine-learning methodologies, so considerable effort has gone into assembling appropriate research corpora on which to carry out this data-driven work. Using these corpora, we have been focusing on the study of melodic and rhythmic aspects, with the goal of identifying musically meaningful patterns and developing similarity measures between the relevant data entities. This work has allowed us to develop Dunya, which integrates the music corpora with software tools for browsing them.
In the context of AudioCommons (Font et al., 2016) we are interested in developing technologies that can enhance the reuse potential of openly available audio content. These technologies relate to the semantic description of audio content, such as that available on freesound.org. We are working on developing sound representation approaches and ontologies for particular use cases of the creative industries, and also on developing feature analysis methodologies, using Essentia, that can generate semantically meaningful descriptions for those same use cases.
Most of the core signal processing algorithms being developed and used in our research projects are part of Essentia, an open-source C++ library for audio processing optimised for scalability (Bogdanov et al., 2013).
Since 2005 we have been developing and maintaining Freesound.org, a platform with which we do research on social computing and semantic web topics (e.g., Font, 2015; Roma et al., 2012). Freesound is an excellent platform on which we have been experimenting with, deploying, and evaluating research ideas related to audio description, classification, recommendation, similarity measures, tag propagation, and ontologies.
More recently, we have been actively involved in the development of Acousticbrainz.org, an open platform for crowdsourcing audio analysis data of commercial music recordings. The data is obtained using Essentia and can be of use for a variety of music information research and application tasks.