Ajay Srinivasamurthy and Sankalp Gulati defend their PhD thesis
17 Nov 2016
Thursday, November 17th 2016 at 15:00h in room 55.309 (Tanger Building, UPF Communication Campus)
Thesis Committee: Simon Dixon (QMUL), Geoffroy Peeters (IRCAM) and Juan Pablo Bello (NYU)
Abstract: Large and growing collections of a wide variety of music are now available on demand to music listeners, necessitating novel ways of automatically structuring these collections using different dimensions of music. Rhythm is one of the basic music dimensions and its automatic analysis, which aims to extract musically meaningful rhythm related information from music, is a core task in Music Information Research (MIR).
The thesis aims to build data-driven signal processing and machine learning approaches for automatic analysis, description and discovery of rhythmic structures and patterns in audio music collections of Indian art music. After identifying challenges and opportunities, we present several relevant research tasks that open up the field of automatic rhythm analysis of Indian art music. Data-driven approaches require well-curated data corpora for research, and efforts towards creating such corpora and datasets are documented in detail. We then focus on the topics of meter analysis and percussion pattern discovery in Indian art music.
Meter analysis aims to align several hierarchical metrical events with an audio recording. Meter analysis tasks such as meter inference, meter tracking and informed meter tracking are formulated for Indian art music. Different Bayesian models that can explicitly incorporate higher-level metrical structure information are evaluated for the tasks, and novel extensions are proposed. The proposed methods overcome the limitations of existing approaches, and their performance indicates the effectiveness of informed meter analysis.
Percussion in Indian art music uses onomatopoeic oral mnemonic syllables for the transmission of repertoire and technique, providing a language for percussion. We use these percussion syllables to define, represent and discover percussion patterns in audio recordings of percussion solos. We approach the problem of percussion pattern discovery using hidden Markov model-based automatic transcription followed by an approximate string search using a data-derived percussion pattern library. Preliminary experiments on Beijing opera percussion patterns, and on both tabla and mridangam solo recordings in Indian art music, demonstrate the utility of percussion syllables and identify further challenges to building practical discovery systems.
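The approximate string search step described above can be illustrated with a small sketch. It assumes the transcription has already produced a sequence of syllable labels, and it uses plain edit distance with a free starting position (the classic approximate substring matching recurrence). The function name, the syllable labels and the edit threshold are illustrative assumptions, not the method or data used in the thesis.

```python
from typing import List, Sequence, Tuple


def approximate_matches(
    transcript: Sequence[str], pattern: Sequence[str], max_edits: int
) -> List[Tuple[int, int]]:
    """Find where `pattern` occurs in `transcript` with at most `max_edits` edits.

    Returns (end_index, edit_distance) pairs, one for each transcript position
    at which some substring ending there matches the pattern closely enough.
    Edits are syllable insertions, deletions and substitutions, each costing 1.
    """
    m = len(pattern)
    # prev[i]: cost of matching pattern[:i] against an empty stretch of transcript.
    prev = list(range(m + 1))
    hits: List[Tuple[int, int]] = []
    for j, syllable in enumerate(transcript):
        # An empty pattern prefix matches anywhere at cost 0, so a match
        # may start at any transcript position.
        curr = [0]
        for i in range(1, m + 1):
            substitution = prev[i - 1] + (0 if pattern[i - 1] == syllable else 1)
            skip_transcript = prev[i] + 1   # extra syllable in the transcript
            skip_pattern = curr[i - 1] + 1  # pattern syllable missing from transcript
            curr.append(min(substitution, skip_transcript, skip_pattern))
        if curr[m] <= max_edits:
            hits.append((j, curr[m]))
        prev = curr
    return hits


# Hypothetical syllable sequences, for illustration only.
library_pattern = ["ti", "ra", "ki", "ta"]
transcribed = ["dha", "ge", "ti", "ra", "ki", "ta", "dha"]
exact_hits = approximate_matches(transcribed, library_pattern, max_edits=0)
noisy_hits = approximate_matches(["ti", "ra", "ge", "ta"], library_pattern, max_edits=1)
```

Allowing a few edits is what makes the search tolerant of transcription errors: in the second call, one mislabeled syllable still yields a match with distance 1.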
The technologies resulting from the research in the thesis are a part of the complete set of tools being developed within the CompMusic project for a better understanding and organization of Indian art music, aimed at providing an enriched experience with listening and discovery of music. The data and tools should also be relevant for data-driven musicological studies and other MIR tasks that can benefit from automatic rhythm analysis.
Thursday, November 17th 2016 at 17:00h in room 55.309 (Tanger Building, UPF Communication Campus)
Thesis Committee: Juan Pablo Bello (NYU), Emilia Gómez (UPF) and Barış Bozkurt (Koç Univ.)
[Full thesis document and accompanying materials]
Abstract: Automatically describing contents of recorded music is crucial for interacting with large volumes of audio recordings, and for developing novel tools to facilitate music pedagogy. Melody is a fundamental facet in most music traditions and, therefore, is an indispensable component in such description. In this thesis, we develop computational approaches for analyzing high-level melodic aspects of music performances in Indian art music (IAM), with which we can describe and interlink large amounts of audio recordings. With its complex melodic framework and well-grounded theory, the description of IAM melody beyond pitch contours offers a very interesting and challenging research topic. We analyze melodies within their tonal context, identify melodic patterns, compare them both within and across music pieces, and finally, characterize the specific melodic context of IAM, the ragas. All these analyses are done using data-driven methodologies on sizable curated music corpora. Our work paves the way for addressing several interesting research problems in the field of music information research, as well as developing novel applications in the context of music discovery and music pedagogy.
The thesis starts by compiling and structuring the largest music corpora to date of the two IAM traditions, Hindustani and Carnatic music, comprising quality audio recordings and the associated metadata. From them we extract the predominant pitch and normalize it by the tonic context. An important element in describing melodies is the identification of meaningful temporal units, for which we propose to detect occurrences of nyas svaras in Hindustani music, a landmark that demarcates musically salient melodic patterns.
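Tonic normalization of a pitch track is commonly expressed as a conversion from Hz to cents relative to the tonic, which makes melodies by performers with different tonics comparable. The sketch below assumes that convention; the function name and the use of non-positive values to mark unvoiced frames are assumptions made for illustration, not details taken from the thesis.

```python
import math
from typing import List, Optional, Sequence


def hz_to_cents(pitch_hz: Sequence[float], tonic_hz: float) -> List[Optional[float]]:
    """Express a pitch track in cents relative to the tonic.

    Frames with a non-positive pitch value are treated as unvoiced
    (an assumed convention) and mapped to None.
    """
    if tonic_hz <= 0:
        raise ValueError("tonic_hz must be positive")
    return [
        1200.0 * math.log2(f / tonic_hz) if f > 0 else None
        for f in pitch_hz
    ]


# A tonic of 220 Hz: the tonic maps to 0 cents, its octaves to +/-1200 cents.
normalized = hz_to_cents([220.0, 440.0, 0.0, 110.0], tonic_hz=220.0)
```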
Utilizing these melodic features, we extract musically relevant recurring melodic patterns. These patterns are the building blocks of melodic structures in both improvisation and composition. Thus, they are fundamental to the description of audio collections in IAM. We propose an unsupervised approach that employs time-series analysis tools to discover melodic patterns in sizable music collections. We first carry out an in-depth supervised analysis of melodic similarity, which is a critical component in pattern discovery. We then improve upon the best competing approach by exploiting peculiar melodic characteristics in IAM. To identify musically meaningful patterns, we exploit the relationships between the discovered patterns by performing a network analysis. Extensive listening tests by professional musicians reveal that the discovered melodic patterns are musically interesting and significant.
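The abstract does not name the similarity measure used. As one common time-series tool for comparing pitch contours of different lengths, the sketch below uses dynamic time warping (DTW); this choice, the absolute-difference local cost and the function name are assumptions for illustration only.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Dynamic time warping distance between two pitch contours (in cents).

    Uses the absolute pitch difference as local cost and the standard
    three-way step pattern, with no warping-window constraint.
    """
    if not a or not b:
        raise ValueError("both contours must be non-empty")
    inf = float("inf")
    # prev[j]: best cost of aligning the contour prefix seen so far with b[:j].
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            curr[j] = abs(x - y) + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr
    return prev[len(b)]


# The same phrase with one note held longer aligns at zero cost.
same_phrase = dtw_distance([0.0, 100.0, 200.0], [0.0, 100.0, 100.0, 200.0])
different_phrase = dtw_distance([0.0], [50.0])
```

A measure of this kind tolerates local timing variation between renditions of a phrase, which fixed-length pointwise distances penalize.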
Finally, we utilize our results for recognizing ragas in recorded performances of IAM. We propose two novel approaches that jointly capture the tonal and the temporal aspects of melody. Our first approach uses melodic patterns, the most prominent cues for raga identification by humans. We utilize the discovered melodic patterns and employ topic modeling techniques, wherein we regard a raga rendition as similar to a textual description of a topic. In our second approach, we propose the time delayed melodic surface, a novel feature based on delay coordinates that captures the melodic outline of a raga. With these approaches we demonstrate unprecedented accuracies in raga recognition on the largest datasets ever used for this task. Although our approach is guided by the characteristics of melodies in IAM and the task at hand, we believe our methodology can be easily extended to other melody-dominant music traditions.
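The idea of delay coordinates can be sketched as follows: each pitch value is paired with the pitch value a fixed number of frames later, and the pairs are accumulated in a two-dimensional histogram. The sketch assumes a tonic-normalized pitch track in cents folded into one octave; the bin count, the delay, the normalization and the function name are illustrative assumptions, and the feature defined in the thesis may involve further processing not shown here.

```python
from typing import List, Optional, Sequence


def delay_surface(
    cents: Sequence[Optional[float]], delay: int, bins: int = 12
) -> List[List[float]]:
    """Accumulate (pitch now, pitch `delay` frames later) pairs in a 2-D histogram.

    `cents` is a pitch track relative to the tonic, with None for unvoiced
    frames. Pitches are folded into one octave (0 to 1200 cents) and the
    histogram is normalized to sum to 1 when any pair is counted.
    """
    if delay < 1:
        raise ValueError("delay must be at least 1 frame")
    bin_width = 1200.0 / bins
    surface = [[0.0] * bins for _ in range(bins)]
    count = 0
    for now, later in zip(cents, cents[delay:]):
        if now is None or later is None:
            continue  # skip pairs that touch an unvoiced frame
        i = int((now % 1200.0) // bin_width) % bins
        j = int((later % 1200.0) // bin_width) % bins
        surface[i][j] += 1.0
        count += 1
    if count:
        surface = [[value / count for value in row] for row in surface]
    return surface


# Toy pitch track: tonic, tonic, fifth, fifth, unvoiced, upper tonic.
track = [0.0, 0.0, 700.0, 700.0, None, 1200.0]
surface = delay_surface(track, delay=1, bins=12)
```

Diagonal cells collect sustained pitches, while off-diagonal cells collect transitions between pitches, so the surface reflects both which pitches are used and how the melody moves between them.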
Overall, we have built novel computational methods for analyzing several melodic aspects of recorded performances in IAM, with which we describe and interlink large amounts of music recordings. In this process we have developed several tools and compiled data that can be used for a number of computational studies in IAM, specifically in the characterization of ragas, compositions and artists. The technologies resulting from this research work are part of several applications developed within the CompMusic project for better description, an enhanced listening experience, and pedagogy in IAM.