Participation of the MTG at FRSM 2021

The MTG is participating in the 26th International Symposium on Frontiers of Research in Speech and Music (FRSM 2021), a conference aimed at promoting speech and music research in India. The conference will be held online on the 11th and 12th of February 2022 and is organized by the Indian Institute of Information Technology Pune (India).


The MTG is participating with one keynote and three paper presentations:

Xavier Serra (keynote): Enhancing musicality: scientific, technological and practical challenges 

Everyone has the capacity for musicality, but enhancing specific musical abilities requires engaging in time-consuming educational activities. Most people listen to music using technologically sophisticated distribution platforms, but these platforms offer little support to help comprehend the music being played. Very little has been done on creating and promoting technologies to cultivate people’s musicality, even less from a multicultural perspective. To address this, we should better understand the auditory features and patterns that characterise particular music styles and we should design learning methodologies with which people can improve their listening abilities. I will exemplify my ideas with the research we have done on Hindustani, Carnatic, Turkish-makam, and Chinese-jingju musics. I will also discuss the need to combine approaches from cognition, computational musicology, and education.


Rafael Caro Repetto & Xavier Serra (paper presentation): Music technology for aiding understanding of Indian Art Music: the Musical Bridges project

The development of information and communication technologies is easing access to music traditions from all over the world for wider audiences. This access, however, does not guarantee the understanding and appreciation of those traditions by listeners not raised in their corresponding cultural backgrounds. Abundant literature is continuously published in the field of world music education to explain the principles of newly available music traditions. However, a gap remains between intellectual understanding of written theory and aural perception of its corresponding sonic phenomena. Music technology offers opportunities for helping to bridge that gap, and this is precisely the goal of the Musical Bridges project. To reach that aim, online interactive tools are developed that provide visualizations of selected key aspects of a particular music tradition in order to guide the user's listening. These computational tools draw on two main elements: open datasets of audio recordings that can be publicly shared, and both manually annotated and automatically extracted features that are able to represent the selected key aspects of the corresponding musical tradition. This paper presents the Musical Bridges prototype tools for aiding understanding and appreciation of Indian Art Music, both Hindustani and Carnatic. The tools use the public data from the Saraga Datasets, which contain open collections of recordings of vocal genres from both traditions, as well as extracted features and manual annotations. An open online environment is built using the JavaScript library p5.js, which generates interactive visualizations that can be viewed while listening to the recordings. Each tool focuses on different selected aspects of the musical system of these traditions, such as tāla cycles, melodic movement among the svaras of particular rāgas, or the melodic contour and structural function of characteristic phrases of those rāgas. Different visualizations can be toggled by the users according to their development of listening skills.


Genis Plaja, Thomas Nuttall, & Xavier Serra (paper presentation): Continuing CompMusic: New approaches in the computational analysis of Carnatic Music

The CompMusic project ran between 2011 and 2017 with the aim of promoting culturally specific approaches in Computational Musicology, focusing on the art music traditions of India (Hindustani and Carnatic), Turkey (Turkish-Makam), Morocco (Arab-Andalusi) and China (Jingju). A large corpus was compiled for each of these traditions, from which much research was carried out. Since then, new and increasingly sophisticated methods and tools for Computational Musicology tasks have continued to evolve, and their application and development in culturally specific contexts remains as important as ever.

We present here the ongoing research and continuation of the CompMusic project in the tradition of Carnatic music, introducing new methodologies for (1) dataset gathering and management, (2) Music Information Retrieval tasks (vocal pitch extraction and source separation) and (3) melodic analysis (melodic pattern recognition, exploration and visualisation). We aim to achieve state-of-the-art results for these tasks in Carnatic music whilst providing the community with tools for the continued research of this musical repertoire.


Jyoti Narang, Ajay Srinivasamurthy, & Xavier Serra (paper presentation): Representation and Analysis of Dynamics for Automatic Music Assessment in Hindustani Vocal Music

Automatic music assessment systems rely on musically relevant representations of music dimensions to provide accurate and meaningful feedback to music learners on these dimensions. Dynamics is one such fundamental dimension of music performance, in addition to melody and rhythm, contributing to expressivity. However, unlike melody and rhythm, music representations that encode dynamics have received limited attention. Dynamics are often extracted from an audio recording using objective loudness measures, but those measures need to be translated into suitable abstract, musically meaningful representations to provide accurate feedback and assessment for effective music learning. While Western popular music has a systematic methodology to represent loudness, this is not the case with music cultures that are learnt largely through the process of imitation, such as Hindustani vocal music, where the encoding of dynamics is implicit. The absence of a well-defined, accurate, descriptive framework to represent and describe dynamics along with melody, rhythm and other ornamentation leads to challenges in automatic music learning and assessment for Hindustani music. Srinivasamurthy and Chordia (2012) developed a machine-readable unified framework to encode the Bhatkhande symbolic representation of Hindustani music pieces. We extend the framework to incorporate dynamics markings and develop a dataset with machine-readable representations that also include dynamics markings. We then propose a methodology that translates the loudness descriptors extracted from audio into dynamics markings that can be encoded in the extended framework. With these two contributions, we demonstrate how the extended framework can be used to build accurate descriptive scores of recordings of Hindustani music that encode melody, rhythm and dynamics, and can further be used to provide feedback to music learners.
