Antonio Ramires defends his PhD thesis
Thursday, February 9th, 2023 at 11:00 AM - room 55.309 (3rd floor) Tanger building (UPF Poblenou)
Title: Automatic Characterization and Generation of Music Loops and Instrument Samples for Electronic Music Production
Supervisors: Dr. Xavier Serra and Dr. Frederic Font
Jury: Perfecto Herrera Boyer (UPF), George Fazekas (Queen Mary University of London), Anna Xambó Sedó (De Montfort University) ONLINE
Abstract:
Repurposing audio material to create new music - also known as sampling - was foundational to electronic music and remains a fundamental component of the practice. Loops are audio excerpts, usually of short duration, that can be played repeatedly in a seamless manner. These loops, which music makers can combine, cut and rearrange, can serve as the basis for songs and have been used extensively in Electronic Dance Music (EDM) tracks. Similarly, the so-called “one-shot sounds” are smaller musical constructs that are not meant to be looped but are also typically used in EDM production. These might be sound effects, drum sounds or even melodic phrases. Both loops and one-shot sounds have been available to amateur and professional music makers since the early days of electronic music. Today, large-scale audio databases offer huge collections of material for users to work with. Navigation in these databases, however, still relies heavily on hierarchical tree directories. Consequently, sound retrieval is tiresome and often identified as an undesired interruption in the creative process.

In our work, we address two fundamental methods for navigating sounds: characterization and generation. Characterizing loops and one-shots in terms of their instruments or instrumentation (e.g. drums, harmony, melody) allows for organizing unstructured collections and faster retrieval for music-making. Generating loops and one-shot sounds enables the creation of new sounds not present in an audio collection through interpolation or modification of the existing material. To achieve this, we employ deep-learning-based, data-driven methodologies for classification and generation. We start by applying convolutional neural networks to the task of instrument classification with a large-scale dataset of synthesized sounds augmented with audio effects, achieving high accuracy.
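To give a flavor of the classification idea (this is an illustrative sketch, not the thesis code): a convolutional classifier applies learned kernels to a time-frequency representation of a sound, pools the responses, and maps them to instrumentation-role probabilities. All names, label sets, and weights below are hypothetical stand-ins; a real system would use a trained deep network on mel-spectrograms.

```python
import numpy as np

# Hypothetical instrumentation-role labels; the actual taxonomy is defined by the thesis datasets.
LABELS = ["drums", "harmony", "melody"]

def conv2d(x, k):
    """Valid 2-D cross-correlation of a spectrogram x with a kernel k."""
    h = x.shape[0] - k.shape[0] + 1
    w = x.shape[1] - k.shape[1] + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(x[i:i + k.shape[0], j:j + k.shape[1]] * k)
    return out

def classify(spec, kernels, weights):
    """One conv layer -> ReLU -> global average pooling -> softmax."""
    feats = np.array([np.maximum(conv2d(spec, k), 0).mean() for k in kernels])
    logits = weights @ feats
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()

rng = np.random.default_rng(0)
spec = rng.random((64, 32))               # stand-in for a mel-spectrogram
kernels = rng.standard_normal((4, 3, 3))  # untrained kernels, for illustration only
weights = rng.standard_normal((len(LABELS), 4))
probs = classify(spec, kernels, weights)
print(LABELS[int(np.argmax(probs))])      # probabilities over the three roles sum to 1
```

In a trained system, the kernels and weights would be learned from labeled audio rather than drawn at random.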
Then, we present a large annotated collection of musical loops from Freesound, along with several applications and use cases, which enable further research in loop characterization. Using this dataset, we present an algorithm for classifying the instrumentation role a loop can take in a music composition and show that it can be applied to finding musical structure. Our first contribution to generation is a neural synthesizer of percussive sounds based on high-level semantic concepts, along with a dataset of percussive one-shots from Freesound. We then extend this architecture to generate drum loops and evaluate several loss functions in terms of audio quality. Finally, we employ generative adversarial networks to create drum sounds with significantly higher audio quality and present the results of a user study on preferences for synthesis controls. Developments in music technology have revolutionized music creation and enabled a vast number of new music genres. Tools like the ones we propose have the potential to change the way music is made, perhaps much as sampling revolutionized music in the past.
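The generative side rests on the idea that a synthesizer maps a compact latent code (or semantic controls) to audio, so new sounds can be created by interpolating between the codes of existing ones. A minimal sketch of that mechanism, with a toy stand-in decoder (the real decoders in the thesis are trained neural networks, and the latent codes below are invented for illustration):

```python
import numpy as np

def decode(z):
    """Toy stand-in decoder: each latent dimension scales one sinusoid.
    In the thesis this role is played by a trained neural synthesizer."""
    t = np.linspace(0, 1, 256)
    return sum(a * np.sin(2 * np.pi * (i + 1) * 4 * t) for i, a in enumerate(z))

def interpolate(z_a, z_b, alpha):
    """Linear interpolation between two latent codes (0 <= alpha <= 1)."""
    return (1 - alpha) * z_a + alpha * z_b

z_a = np.array([1.0, 0.2, 0.0])  # hypothetical latent code of one percussive sound
z_b = np.array([0.1, 0.8, 0.5])  # hypothetical latent code of another
new_sound = decode(interpolate(z_a, z_b, 0.5))  # a sound "between" the two
print(new_sound.shape)  # (256,)
```

With a trained decoder, sweeping `alpha` from 0 to 1 morphs smoothly from one sound to the other, which is what makes latent-space generation useful for exploring material not present in a collection.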
Video: https://youtu.be/2cr7jGJYEZo