The focus of the Audio Signal Processing Lab of the MTG is to advance the understanding of sound and music signals by combining signal processing and machine learning methods. We work on both data-driven methodologies, in which the creation and use of large data collections is fundamental, and knowledge-driven approaches, which require domain knowledge of the problem being addressed. By combining these approaches, we are able to tackle practical problems in automatic sound and music description, music exploration and recommendation, and tools for supporting music education.
To advance this understanding and address practical problems, we work on a variety of complementary topics: the creation of sound and music collections, the development of task-oriented signal processing and machine learning methods, and the use of semantic technologies to structure sound and music concepts.
On the data side, we maintain a number of corpora relevant to both research and practical applications. Especially significant are Freesound, a collaborative database of Creative Commons licensed sounds; AcousticBrainz, a platform for crowdsourcing analysis data of published music recordings; and Dunya, which comprises collections of audio recordings plus complementary information from various musical traditions. From these corpora we create datasets for specific research tasks: for example, supported by our Freesound Annotator, we are developing the FSD dataset, which is used for sound classification. We have also created datasets, such as the CompMusic datasets, to work on specific music analysis problems. In addition, we promote research collaborations by organizing open machine learning challenges such as Freesound Audio Tagging 2019 and Emotion and Theme Recognition in Music.
In our audio signal processing and machine learning work, we develop methodologies and algorithms for the analysis of sound and music signals across a wide variety of tasks, ranging from low-level characterization of audio signals to high-level sound and music classification. Most of the resulting algorithms become part of our software library Essentia, which is extensively used in both research and commercial applications.
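To give a flavor of the low-level characterization mentioned above, here is a minimal sketch of one classic audio descriptor, the spectral centroid, written in plain NumPy. This is an illustrative toy, not Essentia's actual implementation or API:

```python
import numpy as np

def spectral_centroid(frame: np.ndarray, sample_rate: float) -> float:
    """Magnitude-weighted mean frequency of one audio frame, in Hz."""
    # Window the frame to reduce spectral leakage, then take the magnitude spectrum.
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    if spectrum.sum() == 0:
        return 0.0
    return float(np.sum(freqs * spectrum) / np.sum(spectrum))

# Sanity check: a 1 kHz sine should have a centroid close to 1 kHz.
sr = 44100
t = np.arange(2048) / sr
frame = np.sin(2 * np.pi * 1000 * t)
print(spectral_centroid(frame, sr))
```

Descriptors like this one are typically computed frame by frame over an audio stream and then aggregated or fed to classifiers for higher-level tasks.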
In the AudioCommons and Mingus projects, we work on technologies to support the classification, exploration, and recommendation of sound and music using very large data collections. We are especially interested in developing technologies that enhance the reuse of openly available data, such as the sounds in Freesound or the music information in AcousticBrainz.
In the CompMusic and Musical Bridges projects, we have studied music repertoires from a variety of traditions: Hindustani and Carnatic music from India, Turkish-makam music from Turkey, Arab-Andalusian music from the Maghreb, Beijing Opera from China, and, more recently, Jazz. We have focused on core musical dimensions such as melody, rhythm, and harmony (in the case of Jazz), work that requires musicological knowledge of the music being studied and the involvement of musicians. It has resulted in a number of tools and exploitable technologies, such as Dunya and Riyaz (exploited by our spin-off company MusicMuni Labs).
In the TECSOME project, we are developing technologies to support online music education. For example, we are building MusicCritic, a service that facilitates the assessment of student exercises by evaluating the intonation, rhythm, and timbre of musical performances. This technology is already being used in an online course on Hindustani music.
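As a toy illustration of the kind of measurement that intonation assessment builds on (not MusicCritic's actual method), one can express the deviation between a performed pitch and a reference pitch in cents, the standard logarithmic unit of musical interval:

```python
import math

def cents_deviation(performed_hz: float, reference_hz: float) -> float:
    """Signed deviation of a performed pitch from a reference, in cents
    (100 cents = one equal-tempered semitone)."""
    return 1200.0 * math.log2(performed_hz / reference_hz)

# A performance one semitone sharp of A4 (440 Hz) deviates by about +100 cents.
print(round(cents_deviation(466.16, 440.0)))  # → 100
```

A real assessment system would aggregate such frame-level deviations over a whole performance, alongside rhythm and timbre measures.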
We also carry out projects in collaboration with industrial partners, for example to improve and adapt our existing technologies (such as Essentia) to their particular industrial applications. We are interested in long-lasting and mutually beneficial partnerships.