LoudSense – AI system for automatic audibility estimation of background music in audiovisual productions
Music in audiovisual productions is an important source of income for the music industry, thanks to copyright. The rules for distributing royalties vary between countries and often take into account aspects such as the time slot or the role the music plays within the production. However, the audibility of background music is not duly considered, mainly because of technical limitations, which has given rise to debate in recent years.
To offer a tool that addresses this problem, BMAT and the Music Technology Group are collaborating on the LoudSense project to develop new technology that automatically establishes the degree of audibility of background music in audiovisual productions and thus helps determine when it should generate copyright royalties.
The project, funded by the INNOTEC program (TECNIO), is a 2-year project led by BMAT; the MTG team participating in LoudSense is formed by Xavier Serra, Perfecto Herrera and Roser Batlle.
The role of the MTG in the project focuses first on establishing how to determine the "audibility" of background music in audiovisual productions, especially in the grey area of "barely audible" music. The audibility or inaudibility of music is, in this case, related to factors of perception and attention, not to absolute hearing thresholds. Audibility is therefore a psychological concept that varies among people and, even for the same person, with the context. Factors such as the devices used to watch the content, its audio and visual complexity, or the viewer's level of interest must be considered.
This characterization of audibility will help create annotation tasks that we will carry out as the project progresses. Building on the knowledge of how these contextual variables affect the perception of a background soundtrack, we can then, during the second year of the project, collaborate with BMAT on the construction of detailed and effective models that predict whether background music, in a given context of listening and attention, will tend to be more or less perceived.
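As an illustration only, and not the project's actual method, the sketch below shows what a simple context-aware audibility predictor could look like: a toy classifier trained on invented listener annotations, with hypothetical features such as the music-to-foreground level difference, the playback device quality and a scene-complexity score.

```python
# Hypothetical sketch: a toy predictor of background-music audibility from a
# loudness-based feature plus contextual variables. Feature names, data and
# labels are invented for illustration; the real LoudSense models will be
# built from the project's own annotation tasks.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [music-to-foreground level difference in dB,
#            playback device quality (0 = phone speaker, 1 = hi-fi),
#            audiovisual scene complexity in [0, 1]]
X = np.array([
    [-30.0, 0.0, 0.9],   # very quiet music, busy scene -> likely inaudible
    [-18.0, 1.0, 0.4],
    [ -9.0, 1.0, 0.2],   # prominent music, calm scene -> likely audible
    [ -6.0, 0.0, 0.1],
    [-24.0, 0.0, 0.8],
    [-12.0, 1.0, 0.3],
])
# Hypothetical listener annotations: 1 = "audible", 0 = "not audible"
y = np.array([0, 0, 1, 1, 0, 1])

model = LogisticRegression()
model.fit(X, y)

# Estimated probability that a new excerpt (-15 dB music, phone playback,
# moderately busy scene) would be perceived as audible by a typical viewer
print(model.predict_proba([[-15.0, 0.0, 0.5]])[0, 1])
```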
The research related to this project started in 2017, when BMAT and the MTG cooperated in an Industrial Doctorate program (AGAUR) with a thesis, carried out by Blai Melendez, on technology capable of automatically detecting and categorizing background and foreground music.
The INNOTEC program promotes technology transfer by supporting R&D projects carried out jointly by companies and research groups holding the TECNIO accreditation.
With the support of ACCIÓ
Related news: https://www.upf.edu/home/-/asset_publisher/1fBlrmbP2HNv/content/id/246366292/maximized#.YNxT6kztaUk