The Show Must Go On – AI Developments in Music


Sally Yoon is an IPilogue Writer, IP Innovation Clinic Fellow, and a 3L JD Candidate at Osgoode Hall Law School.


This past summer, Amazon made headlines when it announced an update that would make Alexa capable of impersonating deceased family members after hearing less than a minute of audio. While people are still unsure whether this is heartwarming or just plain creepy, AI continues to evolve, with recent developments showing its ability to mimic not only human speech but also singing.

AI-based audio technologies have been making waves worldwide. Last month, Google announced “AudioLM: a Language Modeling Approach to Audio Generation”, which proposes “a new framework for audio generation that learns to generate realistic speech and piano music by listening to audio only”. More recently, Tencent Music Entertainment (TME), China’s leading music entertainment platform, demonstrated the influence of AI in music. According to Music Business Worldwide, the company has released over 1,000 songs with human-mimicking AI vocals, with one of the tracks surpassing 100 million streams. TME utilized a “patented voice synthesis technology” called “Lingyin Engine”, which the company claims can “quickly and vividly replicate singers’ voices to produce original songs of any style and language.” South Korea has also been a strong player, home to its most prominent AI-based audio start-up, Supertone. The company claims that its voice synthesis and real-time voice enhancement technology can create a hyper-realistic voice that is indistinguishable from a real human’s.

So far, these AI voice technologies have largely been publicized as an innovative way of paying tribute to deceased artists and preserving the memories of lost loved ones. Nevertheless, companies will likely aggressively pursue these technologies for profit. In fact, according to NME, HYBE (record label of the globally recognized boy band BTS) acquired Supertone for approximately 45 billion won, which equates to about $44.6 million Canadian dollars. In a letter to its shareholders last month, HYBE’s CEO confirmed that the company plans to “unveil new content and services to [its] fans by combining our content-creation capabilities with Supertone’s AI-based speaking and singing vocal synthesis technology.”

HYBE’s huge investment in Supertone starts to make more sense once we discover that the company’s “biggest organic revenue driver” in Q3 2022 was its Artist ‘Indirect-involvement’ revenues. BTS’s success suggests that more entertainment companies will follow in HYBE’s footsteps to increase profits without the headache of coordinating any physical appearances by their artists.

The development of voice AI opens a plethora of legal questions to consider. These issues were highlighted recently when James Earl Jones (the voice of Darth Vader) licensed his voice to an AI company: who is given permission to use the voice, and does the artist retain any rights to license it to third parties for use in other films? More specifically for AI-generated music and royalties, how do we determine who owns the copyright to the work? Does it make sense to look to the creators of the voice AI technologies themselves or to the source of the vocal data (the artist)? These questions make clear that the development of voice AI places our artists in a very vulnerable position — suggesting a much-needed intermission for this chaotic programme.