MTG leads European Commission report on technical solutions for marking and detecting AI-generated audio

The report was authored by MTG researchers Xavier Serra, R. Oguz Araz, Roser Batlle Roca, David López, and Martín Rocamora
12.05.2026

The Music Technology Group (MTG) at Universitat Pompeu Fabra (UPF) has led the writing of a new European Commission study report, "Technical Solutions for Marking and Detecting AI-generated Audio Content in the Context of Article 50(2) AI Act". The report was authored by Xavier Serra, R. Oguz Araz, Roser Batlle Roca, David López, and Martín Rocamora from UPF, together with Lauri Juvela from Aalto University. It has been published by the Publications Office of the European Union and is available at: https://doi.org/10.2759/0784462

The study provides an in-depth technical analysis of current approaches for marking and detecting AI-generated audio, in support of the transparency obligations introduced by the EU AI Act. It focuses particularly on Article 50(2), which requires certain AI-generated content to be marked in a machine-readable format so that it can be identified as artificially generated or manipulated.

The report reviews and compares four main families of technologies: audio metadata, audio watermarking, audio fingerprinting, and identification of generative audio models. It also clarifies the distinction between three related but different processes: marking, which proactively embeds or attaches information indicating AI generation; detection, which verifies the presence or integrity of such marks; and identification, which seeks to infer the origin of the content even when no explicit mark is available.

A key conclusion of the report is that no single technology currently satisfies all the requirements of Article 50 in terms of effectiveness, interoperability, robustness, and reliability. Metadata-based solutions can support transparency and provenance but may be removed or altered; watermarking can provide more persistent marks within the audio signal but still faces challenges in robustness and standardisation; fingerprinting is useful for identifying known content; and model-identification methods offer promising forensic capabilities, although they remain technically fragile and limited in generalisation.

The report therefore recommends a multi-layered approach, combining cryptographically protected metadata, imperceptible audio watermarking, fingerprinting in controlled scenarios, and forensic identification techniques for unmarked content. It also stresses the importance of continued research, shared benchmarks, open standards, and coordinated action among academia, industry, regulators, and civil society.
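To make the marking and detection steps concrete for the metadata layer, the following is a minimal Python sketch, not the method described in the report. It assumes a hypothetical shared secret (`SECRET_KEY`), hypothetical function names (`mark`, `detect`), and an invented record layout. A real provenance system would more likely rely on public-key signatures and a standardised metadata container.

```python
import hashlib
import hmac
import json
from typing import Optional

# Hypothetical shared secret, for illustration only.
SECRET_KEY = b"demo-key-not-for-production"


def mark(audio: bytes, generator: str) -> dict:
    """Marking: build a metadata record bound to the audio bytes and sign it."""
    record = {
        "ai_generated": True,
        "generator": generator,  # illustrative field name
        "audio_sha256": hashlib.sha256(audio).hexdigest(),
    }
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["signature"] = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return record


def detect(audio: bytes, record: Optional[dict]) -> str:
    """Detection: check that a mark is present and that it is intact."""
    if record is None:
        # Metadata was stripped; other layers would have to take over.
        return "unmarked"
    claimed = {k: v for k, v in record.items() if k != "signature"}
    payload = json.dumps(claimed, sort_keys=True).encode("utf-8")
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, str(record.get("signature", ""))):
        return "invalid"  # the record was altered after signing
    if claimed.get("audio_sha256") != hashlib.sha256(audio).hexdigest():
        return "invalid"  # the record does not belong to this audio
    return "marked"


audio = b"\x00\x01fake-pcm-samples"
record = mark(audio, "hypothetical-tts-model")
print(detect(audio, record))
print(detect(audio, None))
print(detect(audio, dict(record, ai_generated=False)))
```

The "unmarked" outcome illustrates why metadata alone is not sufficient: once the record is removed, nothing in this layer can recover it, which is where watermarking and forensic identification would come in.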

By leading this study, MTG contributes its expertise in sound and music computing to ongoing European policy discussions on trustworthy AI, transparency, accountability, and the responsible deployment of generative audio technologies.

Activity within the framework of:

Cátedra UPF-BMAT en Inteligencia Artificial y Música (TSI-100929-2023-1). Project funded by the Secretaría de Estado de Digitalización e Inteligencia Artificial, the European Union-Next Generation EU, and BMAT Music Innovators, the Music Operating System.