First crowdsourcing platform for music description calls on public to support AI research

AI researchers at Queen Mary University of London and UPF in Barcelona are building a dataset that will enable the development of better music description tools – but they need music fans’ help.



Song Describer is a collaborative platform for people to write descriptions of music under Creative Commons licenses, creating an open database of music with natural language description. As the database grows, it will support artificial intelligence research efforts in developing systems that combine natural language and audio processing to generate music captions automatically, among other applications.

Through Song Describer, researchers from Queen Mary’s C4DM (Centre for Digital Music) and the Music Technology Group at UPF (Pompeu Fabra University, Barcelona) are collecting textual descriptions of songs that cover aspects such as genre, tone, instrumentation and the emotions evoked by a melody.

This public database of more than 10,000 pieces of music, together with their corresponding descriptions, can be used by the scientific community to develop, train and validate artificial intelligence models in the field of music description.

Song Describer is a crowdsourcing platform open to everyone, with no need for specialist musical knowledge. Researchers are calling on the public to support the project by writing descriptions of songs in English, with prizes available for the most active contributors. Around 100 people have written song descriptions so far.

There are three simple steps to get involved:

  1. Create a profile including age, location and level of interest in music (excluding personal data). This information may help researchers to see if and how cultural factors affect the way that people describe songs.
  2. Following the platform's instructions, listen to songs and submit descriptions.
  3. Evaluate descriptions made by other participants, indicating whether or not they seem valid and scoring them from 1 to 5. These ratings are used for quality control: if many people mark a description as invalid or score it very low, the system discards it.
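
The quality-control rule in step 3 can be illustrated with a minimal sketch. The article does not say how the platform actually implements this check, so the thresholds, the `Rating` type and the `keep_description` function below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical thresholds: the article does not give the platform's real values.
MIN_RATINGS = 3          # wait for at least this many ratings before judging
MAX_INVALID_SHARE = 0.5  # discard if more than half of raters mark it invalid
MIN_MEAN_SCORE = 2.0     # discard if the average 1-to-5 score falls below this


@dataclass
class Rating:
    is_valid: bool  # did the rater consider the description valid?
    score: int      # rater's score from 1 to 5


def keep_description(ratings: list[Rating]) -> bool:
    """Return True if a description should stay in the dataset."""
    if len(ratings) < MIN_RATINGS:
        # Too few ratings to decide, so keep the description for now.
        return True
    invalid_share = sum(not r.is_valid for r in ratings) / len(ratings)
    if invalid_share > MAX_INVALID_SHARE:
        return False
    return mean(r.score for r in ratings) >= MIN_MEAN_SCORE
```

Under these assumed thresholds, a description rated valid by two of three participants with an average score of 4 would be kept, while one that most raters mark as invalid would be discarded.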


Ilaria Manco, PhD Researcher in Artificial Intelligence and Music at Queen Mary University of London, said: “The field of music-and-language research is rapidly growing but finding suitable open datasets to support work in this field remains a challenge. This is why we decided to create Song Describer, an open-source crowdsourcing platform through which anyone can contribute to building a corpus of paired music tracks and natural language descriptions. We hope that data collected from our platform will help to develop new audio-language models for music, as well as allow us to evaluate them in more detail.”

Dmitry Bogdanov, a researcher on the project from Pompeu Fabra University Barcelona, added: “We want to study the relationship between audio and these textual descriptions, and how people characterise music verbally, to develop machine learning models that generate music captions for any song from the audio.”

As for the uses of such systems, Bogdanov explained: “For many users, music captions can be useful for navigating music collections in an innovative and more intuitive way. On the one hand, people will be able to search for music through the automatically generated textual descriptions, and, on the other hand, they could make textual queries directly using natural language, for example, writing in a search engine ‘search for slow ballads with guitars and deep voices’.”

About the C4DM Research Group at Queen Mary University of London

The Centre for Digital Music (C4DM) at Queen Mary University of London is a world-leading multidisciplinary research group in the field of music and audio technology. Since its founding members joined Queen Mary in 2001, the Centre has grown to become the UK's leading digital music research group. Queen Mary University of London, a member of the prestigious Russell Group, is a research-intensive university that connects minds worldwide. We work across the humanities and social sciences, medicine and dentistry, and science and engineering, with inspirational teaching directly informed by our world-leading research. Our distinctive history, stretching back to 1785, is built on four historic institutions (the London Hospital Medical College, St Bartholomew’s Medical College, Westfield College and Queen Mary College) with a shared vision to provide hope and opportunity for the less privileged or otherwise under-represented. Today, we remain true to that belief in opening the doors of opportunity for anyone with the potential to succeed and helping to build a future we can all be proud of.

About the Music Technology Group at UPF

The Music Technology Group (MTG), the UPF research group involved in the Song Describer project, is part of UPF's Department of Information and Communication Technologies (DTIC). It is dedicated to research into music technology, including topics such as audio signal processing, music information retrieval, musical interfaces, and computational musicology. The group also fostered the creation of the Freesound website, a diverse collaborative database of sounds released under Creative Commons licenses, in collaboration with the Phonos Foundation, which is also located at the University's Poblenou campus.



SDG - Sustainable Development Goals:

09. Industry, innovation and infrastructure


News published by:

Communication Office