We will see radical changes, both in the way we create and consume music, due to artificial intelligence
Interview with Sergi Jordà about AI and music
This interview was originally published by the Institutional Communication and Promotion Unit of the Universitat Pompeu Fabra in the Focus section of the UPF website.
Sergi Jordà, who has extensive research experience in music technology, is a member of the Music Technology Group (MTG) at the UPF Department of Information and Communication Technologies (DTIC). In this research group, he directs the Musical and Multimodal Interaction Lab (MMI), specialized in interactive music technologies and their applications in areas such as education and in the creation of collective music.
In a context of the growing impact of artificial intelligence (AI) in the world of creation, and also specifically in the music sector, during the last year the MTG has been carrying out the project “Challenges and Opportunities in Music Technology”. During this project, experts from different sectors (artists and creators, entrepreneurs, researchers, jurists…) have been consulted regarding the possible impact of AI and other emerging technologies on music creation, distribution, learning and listening. The conclusions of this project* are the starting point of this interview.
In recent months, there has been increasing social debate around the different applications of artificial intelligence, but there has been less talk of its impact on the world of music than on other areas, such as text or image generation. How do you see the current impact of artificial intelligence on the world of music and how might it evolve in the future?
It is true that the debate on artificial intelligence has mushroomed in recent months, although this is not a new topic: it goes back some seven or eight years and has been snowballing ever since. In the field of music, we predicted a year ago that it would begin to gain momentum. What is certain is that, just as happened with ChatGPT, even the experts have been surprised. We weren’t expecting things to move as quickly as they have in the past few months.
Every week there are novelties. In the field of music, we are already reaching the point where text and image generation were about six months ago. In December, the generation of music from text, similar to what can be done with DALL·E or Stable Diffusion for images, was still rather primitive. Since January, although the system is not yet public, there has been a Google demo, MusicLM, that generates music from text. You provide a textual description, like “slow-paced reggae song, with bass, drums and electric guitar, and relaxed and expressive voices”, and the system makes a song for you. It is true that there are still few tools for home use, but I don’t think it will be long before we have lots.
The way I see it, the industry will change greatly: we will see radical changes, both in the way we create and the way we consume music, due to artificial intelligence.
In this context, the Music Technology research group (MTG), of which you are a member, launched a project last year to diagnose the challenges and opportunities posed by artificial intelligence in this field. What have your main goals been?
We started this project with the intention of upholding certain criteria: not conducting research guided only by technical possibilities, but rather research oriented towards social, human and artistic criteria... The project had several phases. First, we surveyed 80 experts, through various types of questions, to get their opinion on different areas related to music, such as generation, listening and education. We then processed all this information and selected a few of these experts for videoconference interviews. Finally, in a third phase, last December we organized two round tables, one dedicated to creation using artificial intelligence and another to listening with artificial intelligence.
With these different phases, we wanted to gain a clearer view of what was happening, which direction we could go in and what we could do, and to make this known to everyone.
“The industry as we understand it, the music industry, is 100 years old; but music is over 40,000 years old. So the positive side is, whatever happens, I think music will continue to exist!”
This diagnostic project finished recently. What conclusions have you reached regarding the main opportunities posed by artificial intelligence in the music field?
Artificial intelligence can be a tool to promote and encourage creativity or, and this is its dark side, it can also be a tool to supplant creativity. Obviously, the two aspects are total opposites. But, just as we saw its possibilities, I think we also need to talk about the problems, the risks that we will encounter.
Well, in short, it is very difficult to know in which direction the industry will head and what changes will take place in the coming years. But in some ways, I’m optimistic. With the industry, I am not sure what will happen; but the industry has not been around forever. The industry as we understand it, the music industry, is 100 years old; but music is over 40,000 years old. So the positive side would be: I think music will continue to exist.
“If artificial intelligence learns from the large amount of dominant popular music, it will make more of the same style and, therefore, the production will be increasingly homogeneous”
What are the main risks that artificial intelligence might pose to the world of music?
The risks of artificial intelligence for the world of music are huge, in two main aspects: risks for creators and risks for the music itself.
By risks for creators, I mean that artisan creators, those who don’t have a big name or who live from commissions, will find it much harder to survive. It’s easier to generate a piece of music by pressing a button than it is to pay someone to make it for you.
There are other risks for music itself. Since artificial intelligence tends to produce what it has learned, the tendency is rather to make more of the same. What impact can this have? If artificial intelligence learns from the large amount of dominant popular music, it will make more of the same style and, therefore, the production will be increasingly homogeneous.
“The democratization of technological tools has meant that, in recent times, more people can make music, and artificial intelligence could be the latest tool for this”
But if artificial intelligence tends to reproduce existing musical styles, why do you think it can boost creativity? How can it help to do this?
The democratization of technological tools has meant that, in recent times, more people can make music, and artificial intelligence could be the latest tool for this. That is, it can make it easier for everyone, rather than just being a consumer, to become a creator, not necessarily a creator who can earn a living from creating, but to do things with their ideas. In this sense, artificial intelligence can be a highly valuable tool.
So, are we heading towards a point where musicians and machines will create music together?
In fact, as a tool to promote creativity, artificial intelligence could even promote collective creation. And by collective creation we can understand creation between different people in the same place, that is, co-located; or distributed creation supervised by an artificial intelligence system; or, obviously, creation between people and machines. In other words, all the possibilities that arise in this sense are challenges and provide huge opportunities. I think they can be really interesting.
I would like to think, for example, of applications so that children could make music with gestures: conducting an orchestra, or creating a symphony by dancing or by pretending to conduct. That would be something I would love.
Artificial intelligence could recover something of the concept of popular music from before the industry. There hasn’t always been a music industry: it began in 1930 with the sale of recordings, which brought a huge change in the way music was consumed. Before that, popular music was mostly made not for the people but by the people. And artificial intelligence could have an impact by helping to regain this popular creativity.
“Artificial intelligence could recover something of the concept of popular music from before the industry”
The research group to which you belong, the MTG, has been researching the relationships between music and technology for more than 30 years. How does the experience accumulated by the MTG now help it to address the challenges posed by artificial intelligence in this field?
Over the past three decades, we have worked on many things. We started working on sound synthesis, analysis and processing, and on tools for creation... We have also worked for more than 20 years in a very broad field called music information retrieval, which has had many applications in identifying songs, in separating tracks... This technology is mainly used in recommendation systems. But the fact is that, for about six years, in all areas related to music technology, as in almost all engineering and science disciplines, artificial intelligence techniques have been used. Regardless of the area in which you work, in education systems, in music recognition systems..., 90% or 100% of the techniques used involve artificial intelligence, more specifically, what are known as neural networks and deep learning.
What is the purpose of the research on music and technology in which you are applying artificial intelligence techniques?
For example, we are working on music education projects with artificial intelligence, to create tools that help musicians improve their technique during their daily practice. Also on tools that facilitate their live performances, that complement them, that give them ideas... For example, we are doing a project with Raül Refree, with whom we are developing intelligent drums that can accompany other musicians. This is a possibility for someone who can’t have a drum kit.
For a long time, we have also been world experts in singing voice synthesis. Just as we talk on the phone to automated systems that speak to us, singing voice synthesis systems, which sing rather than speak, become virtual singers. Another field in which we are working is music recommendation and identification systems. Really, artificial intelligence is present in almost all the projects we are currently working on.
Your research on the application of music technologies in different fields has an interdisciplinary approach. To which areas of knowledge do the experts with whom you collaborate most often belong?
We work with educators, we work with experts in neuroscience, we have done work with music therapy professionals or with health professionals to investigate the possibilities of music in the treatment of Alzheimer’s. In particular, well-being is an aspect that greatly interests us. It has been proved that listening to music is good for a lot of things. In this regard, an idea of the future would be to develop therapeutic listening and listen to personalized music, or music that has been made for us to listen to at a precise moment.
How do you work to transfer the knowledge and outcomes of your research projects to the music technology industry?
Our research group is particularly interested in technology transfer, and it has given rise to several music technology startups and spinoffs. One of them is Voctro Labs (bought in late 2022 by the startup Voicemod), a spinoff based on singing voice synthesis technology that was initially developed with Yamaha. This technology led to Hatsune Miku, a virtual anime character who became very famous in Japan. Another spinoff born out of the research group was Reactable, which for more than a decade was dedicated to the development of electronic music applications.
At the same time, we want companies to come to us to seek collaborations, and vice versa. We develop APIs (application programming interfaces, which act as intermediaries between two music technology systems) that can be used by the music industry, and we have different licensing models. We are interested in all types of interaction with the industry.
“Just as when ready-to-wear came on the scene and handmade clothes acquired a special value, it is possible that machines can do anything and that what is ‘designed by humans’ will become a label of prestige or luxury. ‘This was designed by a human!’”
The music industry will increasingly produce music with artificial intelligence, according to the conclusions of the diagnostic project you have just completed. Can music generated by machines arouse the same emotions, feelings... as music made by humans?
Some people think artificial intelligence will never move us like a human can. I have my doubts here, although I, personally, am one of those people who cries when listening to music and, often, it’s when listening to singing, because the voice is really the most essential component. You can still tell in today’s music when the voices are 100% human or when an instrument is played by a human, but, as far as production and composition are concerned, 90% of the music produced currently has a lot of machines behind it.
In this respect, I think music produced 100% artificially can be as exciting as 90% of the music produced by humans. I say this with a few exceptions: some voices can move us more, at least that happens to me. But, if we talk about consumer music, I would say that it’s very easy for artificial intelligence to excite us in the same way. Therefore, I think people who say we will never be moved by a machine are wrong and that, shortly, this will not be of any consequence. That is, we won’t worry about whether it has been made by a machine or not, and we will consume it anyway.
Just like when ready-to-wear came on the scene and handmade clothes took on a special value, it is possible that machines will be able to do anything and that what is designed by humans will become a prestigious or luxury label. “This was designed by a human!”. But I think that the trend will be that we will adapt and consume music produced by artificial intelligence without prejudices.
Among the challenges artificial intelligence poses to the music industry there is also the need to rethink the current concept of copyright. What needs to be taken into account to do so?
This is a debate that does not have a quick solution. We asked many users and many experts about the issue in the survey we conducted, but there is no simple or single answer. What is clear is that the legislation will have to adapt to artificial intelligence. Copyright hasn’t always been around. It’s a relatively new concept that is 150 years old and will have to change.
In the last two months, this issue has already generated controversy. For example, in April, new songs appeared that cloned the voices of hip-hop artists without asking for permission. But, if the big companies don’t complain, there will be little impact. In fact, large companies like Warner are already beginning to complain that there are artificial intelligence systems training on music that is their property. When the big players start complaining about this, it’s a sign that there really is a lot at stake.
“Research centres and public universities must find places where they can contribute [their findings], they must find gaps for the research to make sense and not be dominated or monopolized by the three or four large AI companies we have at the moment”
Beyond the impact of artificial intelligence on the music industry and the concept of copyright, what difficulties arise when you investigate in this area?
One of the problems of the current state of artificial intelligence is that it requires immense datasets that are not available to everyone. And when I say everyone, I don’t just mean a normal user: they are not available to most research centres, because these datasets are not public and we are not entitled to use them. Another, less obvious issue is that unimaginable computing power is required to work with these data. It’s difficult to imagine the amount of energy consumed by the projects that Google or OpenAI are pursuing.
Current research in artificial intelligence, in short, requires volumes of data and computing power that are beyond the possibilities of most of the world’s universities. So, research centres and public universities must find places where they can contribute [their findings], they must find gaps, for the research to make sense and not be dominated or monopolized by the three or four large AI companies we have at the moment.
It should also be borne in mind that, although artificial intelligence achieves better results, from a strictly scientific point of view it also has drawbacks. Artificial intelligence tends to underexplain its results; it tends to be more opaque as research. Artificial intelligence is concerned with the output. The intermediate process is something of a black box.
“We will hold a concert where artificial intelligence will be the leitmotiv of all the artists”
Shortly, the UPF Music Technology research group (MTG) will be participating in +RAIN Film Fest, the first European AI film festival, to be held on the UPF Poblenou campus on 14 June. What role will music play?
+RAIN Film Fest, which will take place on the Poblenou campus on 14 June, the day before Sónar, includes several sections, all related to creation and artificial intelligence. The first part, RESEARCH, will involve debate on creation with artificial intelligence. There will be a festival of films specifically made with AI, and a final part, LIVE, with concerts by artists from Japan, the United Kingdom and the local scene who work with artificial intelligence in different ways. We will hold a concert where artificial intelligence will be the leitmotiv of all the artists.