
David Cabrera Dalmazzo defends his PhD thesis

Wednesday, December 2nd 2020 at 11h - online

25.11.2020


Title: Machine Learning and Deep Neural Networks Approach to Modelling Musical Gestures

Supervisor: Dr. Rafael Ramírez

Jury: Dr. Gualtiero Volpe (Università di Genova), Dr. Sergio Giraldo (Universitat Pompeu Fabra), Dr. Atau Tanaka (Goldsmiths, University of London)

Abstract:

Gestures can be defined as a form of non-verbal communication associated with an intention or with the articulation of an emotional state. They are not only intrinsically part of human language but also express specific details of how body knowledge is executed. Gestures are studied not only in language research but also in dance, sports, rehabilitation, and music, where the term is understood as a "learned technique of the body". In music education, gestures are therefore understood as automatic motor abilities acquired through repeated practice, by which performers self-teach and fine-tune their motor actions. These gestures become part of the performer's technical repertoire for taking fast actions and decisions on the fly; they are relevant not only to expressive musical capabilities but also to developing correct "energy-consumption" habits that help avoid injuries.

In this thesis, we applied state-of-the-art machine learning (ML) techniques to model the violin bowing gestures of professional players. Concretely, we recorded a database of violin performances by experts and students of different levels, implemented a multi-modal recording system to automatically synchronise audio, video, and Inertial Measurement Unit (IMU) sensor data, and developed a custom application to visualise the output of the ML models. We explored three approaches to classify and identify violin gestures in real time: a) we implemented a Hidden Markov Model to detect fingering and bow-stroke gesture performance from electromyogram and motion data; b) we extracted general time features from gesture samples, creating a dataset of audio and motion data from expert performers, and trained and compared different Recurrent Neural Network models; c) we implemented Mel-spectrogram-based Recurrent Neural Network models for classifying bowing gestures from audio data alone, which allows bow strokes to be recognised without motion-capture sensors. These three approaches are complementary and were incorporated into a real-time feedback system to enhance violin learning and practice.
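As an illustration of the audio-only approach (c), the sketch below shows how a log-mel spectrogram can be fed to a small recurrent network to classify a bow stroke. This is a minimal example and not the thesis code: the gesture labels, network sizes, audio settings, and file name are illustrative assumptions.

```python
# Minimal sketch: classify a bowing gesture from audio alone by feeding a
# log-mel spectrogram to a small GRU classifier (librosa + PyTorch).
import numpy as np
import librosa
import torch
import torch.nn as nn

GESTURES = ["detache", "legato", "spiccato", "staccato"]  # assumed label set

def mel_spectrogram(path, sr=22050, n_mels=64):
    """Load an audio clip and return a (time, n_mels) log-mel spectrogram."""
    y, sr = librosa.load(path, sr=sr, mono=True)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    log_mel = librosa.power_to_db(mel, ref=np.max)   # (n_mels, time)
    return log_mel.T.astype(np.float32)              # (time, n_mels)

class BowStrokeRNN(nn.Module):
    """GRU over mel frames followed by a linear classifier on the last state."""
    def __init__(self, n_mels=64, hidden=128, n_classes=len(GESTURES)):
        super().__init__()
        self.rnn = nn.GRU(n_mels, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):          # x: (batch, time, n_mels)
        _, h = self.rnn(x)         # h: (1, batch, hidden)
        return self.head(h[-1])    # (batch, n_classes) logits

if __name__ == "__main__":
    model = BowStrokeRNN().eval()
    feats = mel_spectrogram("example_bow_stroke.wav")   # hypothetical file
    with torch.no_grad():
        logits = model(torch.from_numpy(feats).unsqueeze(0))  # add batch dim
    print(GESTURES[logits.argmax(dim=1).item()])
```

A working frame-level classifier of this kind, trained on labelled recordings, is what makes audio-only bow-stroke recognition usable in a real-time feedback setting without motion-capture hardware.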


This thesis defense will take place online. To attend, use this link. Attendees must keep their microphones and cameras turned off, and online access will close 30 minutes after the start of the defense.


Video: https://www.youtube.com/watch?v=r_2MnETNRHY
