Data-Driven Harmonic Filters for Audio Representation Learning
- Authors
- Won M, Chun S, Nieto O, Serra X
- UPF authors
- SERRA CASALS, FRANCESC XAVIER
- Authors of the book
- AA. VV.
- Book title
- Proceedings ICASSP 2020 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- Publisher
- IEEE
- Publication year
- 2020
- Pages
- 536-540
- ISBN
- 978-1-5090-6631-5
- Abstract
- We introduce a trainable front-end module for audio representation learning that exploits the inherent harmonic structure of audio signals. The proposed architecture, composed of a set of filters, compels the subsequent network to capture harmonic relations while preserving spectro-temporal locality. Since the harmonic structure is known to have a key role in human auditory perception, one can expect these harmonic filters to yield more efficient audio representations. Experimental results show that a simple convolutional neural network back-end with the proposed front-end outperforms state-of-the-art baseline methods in automatic music tagging, keyword spotting, and sound event tagging tasks.
- Complete citation
- Won M, Chun S, Nieto O, Serra X. Data-Driven Harmonic Filters for Audio Representation Learning. In: AA. VV. Proceedings ICASSP 2020 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 1 ed. Barcelona: IEEE; 2020. p. 536-540.
Bibliometric indicators
- Cited 3 times in Scopus
- Scimago index: 0