Audio Signal Processing for Music Applications

Instructor: Xavier Serra
Credits: 5 ECTS

A course of the Master in Sound and Music Computing that focuses on signal processing methodologies and technologies specific to audio and music applications. Special emphasis is given to the use of spectral processing techniques for the description and transformation of music signals.

The course runs for 10 weeks, with 25 hours of lectures. Students are evaluated on the weekly assignments (60%) and a final exam (40%).

All the labs of the course are done in Python. All the materials and code used in the class are released under open licenses (Creative Commons and GPL) and are available at https://github.com/MTG/sms-tools

Topics covered

Introduction: Introduction to audio signal processing for music applications; Examples of music applications. Introduction to the required math: Sinusoids, Complex numbers, Euler's identity, Complex sinusoids, Inner product of signals, Convolution.
Discrete Fourier Transform: DFT equation; Complex exponentials; Inner product; DFT of complex sinusoids; DFT of real sinusoids; Inverse-DFT.
Fourier transform properties: Linearity; Shift; Evenness; Convolution; Phase unwrapping; Zero padding; Power & amplitude in dB; Fast Fourier Transform (FFT); FFT and zero-phase.
Short-Time Fourier Transform: STFT equation; Window type; Window size; FFT size; Hop size; Time-frequency compromise; Inverse STFT; STFT implementation.
Sinusoidal model: Sinusoidal Model; Sinewave spectrum; Sinusoidal detection; Sinusoidal synthesis.
Harmonic model: Harmonic Model; Sinusoids-Partials-Harmonics; F0 detection; Harmonic tracking.
Sinusoidal plus residual modeling: Sinusoidal plus residual model; Sinusoidal subtraction; Stochastic model; Sinusoidal plus stochastic model.
Sound transformations: Filtering; Morphing; Frequency scaling and pitch transposition; Time scaling.
Sound/music description: Extraction of audio features; Describing sounds, sound collections, music recordings and music collections; Clustering and classification of sounds.
Concluding topics: Audio signal processing beyond this course; Beyond audio signal processing; Review of the course topics.
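
As an illustration of the Discrete Fourier Transform topic above, the DFT can be read as the inner product of a signal with a set of complex sinusoids. The following is a minimal sketch of that idea, assuming only NumPy; it is not taken from sms-tools, and the function names dft and idft and the test signal are hypothetical choices made for this example.

```python
import numpy as np


def dft(x):
    """Naive DFT: inner product of x with N conjugated complex sinusoids."""
    x = np.asarray(x, dtype=complex)
    N = x.size
    n = np.arange(N)          # time indices 0..N-1
    k = n.reshape(-1, 1)      # frequency indices as a column vector
    # Row k of the matrix holds exp(-j*2*pi*k*n/N) for n = 0..N-1
    basis = np.exp(-2j * np.pi * k * n / N)
    return basis @ x


def idft(X):
    """Naive inverse DFT: weighted sum of complex sinusoids, scaled by 1/N."""
    X = np.asarray(X, dtype=complex)
    N = X.size
    k = np.arange(N)
    n = k.reshape(-1, 1)
    basis = np.exp(2j * np.pi * n * k / N)
    return (basis @ X) / N


# Hypothetical test signal: a real sinusoid completing exactly
# k0 = 3 cycles within N = 32 samples.
N = 32
k0 = 3
x = np.cos(2 * np.pi * k0 * np.arange(N) / N)

X = dft(x)
mag = np.abs(X)

# A real sinusoid shows up as two peaks, at bins k0 and N - k0.
peak_bins = sorted(int(b) for b in np.argsort(mag)[-2:])
print("peak bins:", peak_bins)
```

The naive version costs O(N^2) operations and is only meant to mirror the DFT equation; in practice an FFT routine such as np.fft.fft computes the same result far more efficiently.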

Materials and References

Main software for the course: sms-tools (https://github.com/MTG/sms-tools), essentia (http://essentia.upf.edu)
Additional software: Sonic Visualiser (http://www.sonicvisualiser.org), Audacity (http://audacity.sourceforge.net)
Programming language: Python (http://python.org)
Resource for sounds: Freesound (http://freesound.org)