Music Technology Group

Specialized in audio signal processing, music information retrieval, musical interfaces, and computational musicology


Generative Music AI Workshop


Cátedra UPF-BMAT en IA y Música

Harnessing the potential of AI by developing applications and training professionals capable of leading the transformation and renewal of the music sector


Freesound

The biggest creative-commons sound sharing website


Essentia

Open-source library and AI models for audio and music analysis


New project at the MTG: IMPA, Multimodal AI for Audio Processing


The project started in September 2024 and is funded by the Ministry of Science, Innovation and Universities of the Spanish Government and the Agencia Estatal de Investigación (AEI), and co-financed by the European Union.

30.10.2024


The audio industry, which encompasses the fields of music, video games, audiovisual production, podcasts, audiobooks, and various other creative industries, is experiencing significant growth both in Spain and internationally. This growth is primarily driven by the development of digital platforms and artificial intelligence (AI). Through advanced signal processing algorithms and machine learning, AI has radically transformed the way we interact with sound, ranging from improving audio quality and noise cancellation to voice recognition and music generation. However, the disruptive potential of AI, while bringing about significant advancements, also poses important challenges that need to be addressed.

To address these challenges, we will adopt a cross-cutting approach that considers the ethical, legal, social, economic, and cultural aspects of AI development in the audio sector at every phase of the project. The project focuses on developing multimodal AI methodologies relevant to the audio industry, addressing different but complementary research areas.

The audio processing methodologies developed in this project will enable innovative applications for the digital transformation of areas such as artistic creation, music distribution, music education, cultural preservation, and health and well-being. The contributions will concern automated processes for the generation, search, discovery, and re-use of audio content. Specifically, within the context of current AI methodologies, the research contributions will be related to:

  • Development of methodologies for the curation of multimodal datasets
  • Development of pre-training models for audio representation
  • Development of task-specific models of relevance for the audio sector
  • Development of evaluation metrics of relevance to the defined tasks
  • Development of prototypes for each of the defined tasks

 

Start date: September 1, 2024

Duration: 4 years

PIs: Xavier Serra, Rafael Ramírez, Sergi Jordà, Martín Rocamora, Dmitry Bogdanov, Frederic Font

The IMPA project is funded by: MCIU/AEI/10.13039/501100011033/FEDER, UE

MTG Videos

Watch videos for more information about our activities, technology demonstrations or media coverage.