Audio deepfakes are going to wreak havoc on the recording industry

Jay-Z isn’t happy. In fact, the 50-year-old rapper and father of three sounds like he’s flipping out in a way you’ve never heard before. You’d have to go back to Jay during his early-2000s feud with Nas to hear him anywhere close to this incensed. Only this time he’s not rapping. He’s ranting.

“I will wipe you the f*** out with precision the likes of which has never been seen before on this Earth, mark my f****** words,” Jay says, with the instantly recognizable, staccato Brooklyn voice that has earned him a mountain of Grammy awards and nominations, along with a personal net worth estimated to be in the region of $1 billion. “You think you can get away with saying that s*** to me over the internet? Think again, f*****. As we speak, I am contacting my secret network of spies across the USA, and your IP is being traced right now. … You’re f****** dead, kid! I can be anywhere, anytime, and I can kill you in over seven hundred ways. And that’s just with my bare hands.”

As celebrity freak-outs go, it’s one of the better ones. Only it isn’t. Well, not exactly. The “recording” of Jay-Z’s voice is not so much a recording as it is a synthesis, made possible by the latest machine learning technologies. What deepfake videos are for images, these new audio deepfakes are for voices.

With enough audio samples to train on, they can carry out an impressively accurate imitation of any individual, even when it comes to pronouncing words or phrases they’ve most likely never uttered.

Jay-Z eats some Copypasta

The text that Jay-Z (or, more accurately, Jay-Z’s voice) is reading is the Navy Seal Copypasta, a parody of the internet tough-guy braggadocio typically found under YouTube videos, Twitch livestreams, or practically any online comment section. The YouTube channel hosting the video, called Vocal Synthesis, boasts more than 46,000 subscribers and also features numerous other celebrities speaking the exact same words. Alternatives include the likes of George Carlin, Louis C.K., Bill Burr, Frank Sinatra, Bob Ross, Tucker Carlson, Gilbert Gottfried, and a handful of former U.S. presidents.

Jay-Z really is upset about the deepfake audio, though. Or, at least, his Roc Nation LLC entertainment agency is. In fact, in a touch of irony for the man who once rapped the lines “I sampled your voice, you was usin’ it wrong,” Roc Nation last month filed copyright strikes against the YouTube uploads on Jay-Z’s behalf. The crime? “Unlawfully [using] an A.I. to impersonate our client’s voice.”

It’s a valid complaint, even if it’s arguably heavy-handed for some content intended to do little more than to raise a wry smile. But it also highlights one of the complex legal questions that could only arise from the age of deepfakes: “Does a person own their own voice?”

Unsurprisingly, the answer to this question is not clear-cut. There’s a couplet from an old Dr. Seuss book, Did I Ever Tell You How Lucky You Are?, that might as well describe the relationship between technology and legal regulation. It goes like this: “[Ali Sard] has to mow grass in his uncle’s backyard, and it’s quick growing grass and it grows as he mows it. The faster he mows it the faster he grows it.” In other words, technology changes faster than the law can keep up.

“You and I own our own voices under privacy statute, but protections for the voice of a public figure, while still protected under privacy rights or property rights in their identity, can be murky,” Peter Colin, a technologist for Thomson Reuters and a New York entertainment attorney specializing in right of publicity law, told Digital Trends.

Who owns your voice?

Colin describes it as a “legal minefield” that varies among jurisdictions around the world — and even throughout the United States. Ownership of aspects of your personality, such as your name, voice, or likeness, is not explicitly protected by statute in 28 states, although some have state case law that recognizes protection, Colin said. Some states protect voice, while others protect only names and likenesses. Some protect rights only while a person is alive, while others extend these protections for decades after death. Then there’s the question of fair use for satire and parody.

“A deepfake determined to be a cultural satire may leave Jay-Z unable to prevail in a court of law, but a defamatory use that paints him in a false light that misleads the public or generates a profit for the infringing user may give Jay-Z standing to prevail,” Colin said. “In the U.S., this legal framework is rapidly changing due to deepfakes created for misleading political purposes and for revenge porn, but also due to the advent of social media influencers, as states move to give student-athletes in college sports [the] ability to legally profit off their name, image, and likeness for the first time, and to better monetize personality rights for living and dead celebrities in today’s entertainment industry.”

Focusing on the entertainment industry implications of audio deepfakes ignores some of the big challenges when it comes to spreading fake news. But it’s also rich territory for future potential lawsuits. Would an A.I.-performed album recorded by a soundalike of Jay-Z be illegal? What if it was clearly satire and given away for free? (And if you don’t think the embryonic stages of this are already happening, you obviously don’t know the internet. Cue Jay-Z rapping We Didn’t Start the Fire by Billy Joel.) Music generated by A.I. may not be commonplace today, but as with deepfakes, some of the proof-of-concept demonstrations are getting scarily impressive.

“Deepfake soundalikes for entertainment purposes have not really been addressed yet by the courts,” Colin said. “The law has just not caught up to the tech as machine learning tech improves for voice modulation and synthesis. Relevant to any legal analysis here is if the purpose [is] to present someone in a false light by making the public believe they said something they never said. Does the person creating the soundalike or hiring the soundalike for a voiceover profit off the use? Or is it a satirical portrayal or a transformative use for entertainment?”

Lawsuits loom in the future

There are even currently untested questions about the datasets used to train these audio deepfakes. As Colin points out, a voice itself can’t be copyrighted, but a sound recording of a voice singing a song can be. Is an audio deepfake trained on hours of copyrighted Jay-Z albums a breach of copyright? If so, since the copyrights may be dispersed among multiple record labels and other entities (for instance, an interview recorded for television), there could be a whole lot of potentially aggrieved (and copyright infringed) parties.

As these A.I. tools become ever more sophisticated, these cases are going to shift from hypothetical quandaries to the subject of real legal battles, so expect to see some interesting developments. When it comes to the legality of deepfakes, even in this one specialized domain, there’s plenty of complexity to delve into. Lawyers are no doubt rubbing their hands together at the prospect.

Provided that they’ve not already been replaced by machines by that point, that is.

Luke Dormehl
Former Digital Trends Contributor
I'm a UK-based tech writer covering Cool Tech at Digital Trends. I've also written for Fast Company, Wired, the Guardian…