Data-Driven Harmonic Filters for Audio Representation Learning

  • Authors
  • Won M, Chun S, Nieto O, Serra X
  • UPF authors
  • Serra Casals, Francesc Xavier
  • Authors of the book
  • AA. VV. (various authors)
  • Book title
  • Proceedings ICASSP 2020 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • Publisher
  • IEEE
  • Publication year
  • 2020
  • Pages
  • 536-540
  • ISBN
  • 978-1-5090-6631-5
  • Abstract
  • We introduce a trainable front-end module for audio representation learning that exploits the inherent harmonic structure of audio signals. The proposed architecture, composed of a set of filters, compels the subsequent network to capture harmonic relations while preserving spectro-temporal locality. Since the harmonic structure is known to have a key role in human auditory perception, one can expect these harmonic filters to yield more efficient audio representations. Experimental results show that a simple convolutional neural network back-end with the proposed front-end outperforms state-of-the-art baseline methods in automatic music tagging, keyword spotting, and sound event tagging tasks.
  • Complete citation
  • Won M, Chun S, Nieto O, Serra X. Data-Driven Harmonic Filters for Audio Representation Learning. In: AA. VV. Proceedings ICASSP 2020 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 1st ed. Barcelona: IEEE; 2020. p. 536-540.
Bibliometric indicators
  • Cited 3 times in Scopus
  • SCImago index of 0
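
The abstract above outlines the core idea: a trainable filterbank front-end whose filters are tied together across harmonics, feeding a plain convolutional back-end. The sketch below illustrates that idea in PyTorch under stated assumptions: a magnitude spectrogram input, triangular band-pass filters centered at integer multiples of trainable fundamental frequencies, and a small CNN classifier. All class names, the bandwidth model, and the hyperparameters are illustrative choices, not the authors' published implementation.

```python
# Hypothetical sketch of a harmonic filterbank front-end plus CNN back-end.
# Filter shape, bandwidth model, and hyperparameters are assumptions for
# illustration only; they do not reproduce the paper's exact architecture.
import torch
import torch.nn as nn


class HarmonicFrontEnd(nn.Module):
    def __init__(self, sample_rate=16000, n_fft=512, n_filters=64, n_harmonics=6):
        super().__init__()
        self.n_harmonics = n_harmonics
        # Frequencies of the FFT bins (fixed, not trained).
        self.register_buffer(
            "fft_freqs", torch.linspace(0, sample_rate / 2, n_fft // 2 + 1)
        )
        # Trainable fundamental (band-center) frequencies and bandwidth scale.
        f0 = torch.linspace(40.0, sample_rate / (2 * n_harmonics), n_filters)
        self.f0 = nn.Parameter(f0)
        self.bw_alpha = nn.Parameter(torch.tensor(0.1))

    def forward(self, spec):
        # spec: (batch, n_fft // 2 + 1, time) magnitude spectrogram
        outs = []
        for h in range(1, self.n_harmonics + 1):
            center = self.f0 * h                        # (n_filters,)
            bandwidth = self.bw_alpha * center + 25.0   # simple linear bandwidth model
            # Triangular response of each filter over the FFT bins.
            diff = self.fft_freqs.unsqueeze(0) - center.unsqueeze(1)  # (n_filters, bins)
            response = torch.clamp(1.0 - diff.abs() / bandwidth.unsqueeze(1), min=0.0)
            outs.append(torch.matmul(response, spec))   # (batch, n_filters, time)
        # Stack harmonics as channels: (batch, n_harmonics, n_filters, time)
        return torch.stack(outs, dim=1)


class SimpleBackEnd(nn.Module):
    def __init__(self, n_harmonics=6, n_classes=50):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(n_harmonics, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.BatchNorm2d(128), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(128, n_classes)

    def forward(self, x):
        return self.fc(self.net(x).flatten(1))


if __name__ == "__main__":
    spec = torch.rand(2, 257, 100)           # fake magnitude spectrogram batch
    front = HarmonicFrontEnd()
    back = SimpleBackEnd()
    logits = back(front(torch.log1p(spec)))
    print(logits.shape)                       # torch.Size([2, 50])
```

Stacking the harmonics as input channels lets the ordinary 2-D convolutions in the back-end see harmonically related bands together while preserving locality in frequency and time, which is the property the abstract emphasizes.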