Music Information Gathering, Structuring and Processing for Semantic Audio Applications

Project funded by the Ministry of Economy and Competitiveness of the Spanish Government (Reference: TIN2015-69935-P).
Duration: January 1st, 2016 to December 31st, 2018.
PI: Xavier Serra

Abstract: Generating knowledge models from the analysis of large data collections is one of the biggest challenges in information processing and retrieval technologies. In the field of music information processing, many algorithms for extracting audio features have been developed; they generally yield low-level representations of audio that can be used for a variety of music description problems. However, the main bottlenecks for obtaining more accurate and semantically meaningful descriptors from audio are the lack of comprehensive datasets that are representative of real-world music for building models, and the lack of robust, precise and musically meaningful audio features. The main purpose of the MINGUS project is to address these two challenges and hence make machine learning algorithms for semantic audio annotation more effective. We focus on methodologies for building such knowledge models from crowd-sourced datasets, and on developing new audio descriptors and improving existing ones. These new models and descriptors will allow us to better structure music content, and their usefulness will be demonstrated through two use cases of real-world music applications: music exploration and music creation. Beyond these two use cases, the outcomes of the project will be relevant for other kinds of semantic audio applications and, more generally, as a methodology and proof of concept that could be applied to other multimedia domains.