Musical AI

Artificial intelligence to support musical experiences: towards a data-driven, human-centred approach

Project funded by the Ministry of Science and Innovation of the Spanish Government (Reference: PID2019-111403GB-I00).
Duration: June 1st 2020 to May 31st 2024
PIs: Xavier Serra, Emilia Gómez, Rafael Ramírez, Sergi Jordà

Abstract: Consuming and listening to music is truly one of the most widespread activities nowadays. Music is indeed the universal language that everybody enjoys, and our Western culture knows it well, making music utterly pervasive: at supermarkets and airports, in movies and on television, at important ceremonies, and now also on the private, portable soundscapes that snake ubiquitously from pocket to ear. Yet this same Western society is the one that has made us believe that only a limited number of people are musical. We all possess, however, the basic capacity to listen to and distinguish, interpret, understand, appreciate and respond to patterns of sound, since without this capability no musical tradition and no musical market could ever exist. This project aims to advance a number of research topics related to large-scale corpora and data-driven machine learning, with applications to music. We plan to increase our understanding of music by developing AI-based models and tools that help listeners gain musical understanding and appreciation, and AI-based models that support music learning and music creation.

WP1: Hybrid human-machine intelligence (PI: Emilia Gómez)

Humans provide accurate annotations of music content, but human annotation can only be done on a small scale, and a high level of musical expertise is often required in music annotation tasks. In contrast, algorithms can process large amounts of data, but they are less accurate and can only provide low-level annotations.

The goal of this work package is to investigate the best strategies for combining human and machine (artificial) intelligence in data-driven music applications, and to explore human-centred approaches to the use of artificial intelligence in support of music listening experiences. In particular, we will develop hybrid human-machine models for the annotation and description of large music collections. We will incorporate methodologies from cognitive science and human-computer interaction.

WP2: Automatic discovery of musical patterns (PI: Xavier Serra)

A properly created music corpus, composed of audio recordings plus accompanying metadata, can capture the essence of the musical styles that we will use. Among the different characteristics that can describe a musical style, a group of pieces, or a single musical piece, melodic motives are patterns that can be used to capture that essence. Based on these assumptions, we will create the music corpora and develop pattern discovery methods for identifying style-specific melodic motives in a collection of corpora.

The goal of this WP is to develop unsupervised approaches for discovering the relevant melodic patterns that characterize different musical entities (e.g. artist, composer, form, style) in the different corpora, starting from the audio signals. We will need to study different melodic representations that can be obtained from the continuous time series resulting from the audio analysis, and explore different strategies for temporal segmentation. We will also need to consider different similarity measures and characterizations of the melodic patterns, and focus on musically relevant ways to identify and characterize those patterns. Given the computational complexity of the problem, we will have to study optimization strategies to make the work feasible.
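
As a rough illustration of what unsupervised pattern discovery on a melodic time series can look like, the following Python sketch searches a toy pitch contour for its two most similar non-overlapping segments. It is not the project's method: the function names, the fixed motif length, the use of z-normalization (so that transposed repetitions match) and the Euclidean distance are all assumptions chosen for brevity. The brute-force search compares every pair of windows, which hints at why optimization strategies matter for real corpora.

```python
import math


def z_normalize(window):
    """Rescale a window to zero mean and unit variance (assumed representation)."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std == 0:
        return [0.0] * len(window)
    return [(x - mean) / std for x in window]


def find_closest_motif_pair(pitch_contour, motif_length):
    """Return (distance, i, j) for the two most similar non-overlapping windows.

    Hypothetical helper: exhaustive pairwise comparison, quadratic in the
    number of windows.
    """
    windows = [
        z_normalize(pitch_contour[start:start + motif_length])
        for start in range(len(pitch_contour) - motif_length + 1)
    ]
    best = (math.inf, -1, -1)
    for i in range(len(windows)):
        # Start j one full motif later so the two segments do not overlap.
        for j in range(i + motif_length, len(windows)):
            d = math.dist(windows[i], windows[j])
            if d < best[0]:
                best = (d, i, j)
    return best


# Toy pitch contour in MIDI note numbers (made-up data): the four-note shape
# at index 0 recurs, transposed up a fifth, at index 7.
contour = [60, 62, 64, 62, 55, 57, 50, 67, 69, 71, 69, 48]
distance, first, second = find_closest_motif_pair(contour, 4)
print(f"closest pair starts at {first} and {second}, distance {distance:.3f}")
```

In practice the contour would come from audio analysis rather than a hand-written list, and the fixed window length would give way to the temporal segmentation strategies mentioned above.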

WP3: Technology-enhanced music learning (PI: Rafael Ramírez)

Learning to play a musical instrument is a difficult task, requiring the development of sophisticated skills. Nowadays, this learning process is mostly based on the master-apprentice model; technologies are rarely employed and are usually restricted to audio and video recording and playback. Learning under the master-apprentice model is difficult because the time lag between the student's performance and the teacher's feedback dissociates the feedback from the online proprioceptive and auditory sensations accompanying the performance. This is especially relevant since most of the student's practice takes place long after the teacher's feedback. The resulting long periods of private study frequently make learning a musical instrument a rather harsh and solitary experience, resulting in high abandonment rates.

The aim of this work package is to develop supervised approaches to discover patterns of good music performance practice, and to use these patterns to provide feedback to music students while they practice in order to enhance the learning process. To this end, we will collect recordings from master musicians, develop pedagogy-specific audio signal and motion capture processing tools to analyse such recordings, develop machine learning approaches to model music performance, and develop real-time feedback systems to inform music students about their playing.

WP4: Machine music intelligence for assisting human music creation (PI: Sergi Jordà)

Everybody has the potential and the right to be creative (admittedly to different extents), and everybody feels proud and positively rewarded when accomplishing creative tasks, however modest their outcomes might be. Yet, in our society, not everybody has the confidence or the nerve to approach music creation. This is why we believe that music creativity and engagement should be made accessible to everyone, not only to professional and hobbyist musicians, and that machine music understanding can be an important asset towards this objective.

A main objective of this work package is to address the loss of physicality and presence in human-human and human-computer interaction. Laptops, computers, tablets and, very especially, mobile phones are used on a daily basis by a large proportion of the population, particularly by young people. The most consumed content tends not to promote creativity or critical thinking, and the use of mobile devices is commonly disruptive of the user's attention, bodily passive and addictive, and may lead to social disconnection between people physically present in the same space. For example, this has been a concern for children, leading to a ban on smartphones in several schools. As a counterpart, we believe that a shared music experience is an exemplary way to understand and promote collective interaction.