The artificial intelligence on our phones and computers doesn’t have a face. Talking to Siri, Cortana, Alexa, or most recently Google’s Assistant can be fun, but these disembodied virtual assistants are just voices, leaving us talking to our devices and looking a bit weird. Researchers at the University of Leeds have been looking into an alternative by taking a TV show character and turning him into a virtual avatar capable of speaking new phrases and sentences, which could be the template for training natural, interactive AI avatars of the future.
That TV show character? Who else but Joey Tribbiani from Friends. Going beyond a simple computer-generated character, the reborn Joey is meant to speak in the same manner as the original. The virtual Joey is trained using the original Joey’s phrases, speech patterns, face movements, and intonation, resulting in a new Joey ready to say whatever the AI driving him wants him to, while still sounding and looking like Joey. At least, that’s the potential, but as you can see from the demo video it’s a work in progress.
It’s very complex work. To train virtual Joey, all the above aspects of his character must be mapped, and new short sentences created for him to speak. The computer system must then match synthesized Joey-speak to the movement of his mouth, blend the two together, and place the virtual Joey-mouth onto his face, while using face tracking to align movement and expression.
Training for the future
The result is Joey telling someone off-screen, in a bathroom, that he likes pizza and cheese. It’s something that never happened in the show, and didn’t require a reshoot with actor Matt LeBlanc. To make it all even more challenging, Joey isn’t standing in front of the team reciting lines, so his speech must be taken from the catalog of Friends episodes, with all the background noise removed first.
The render isn’t perfect, but the team that created it is working to improve it. More excitingly, the method can be used to “virtually immortalize” any character and let them interact with us, or with other avatars. Once perfected, uses could go beyond giving Siri and Cortana a celeb’s speech, face and body, and into creating new content using these avatars. That could mean an episode of Friends without any of the actual actors appearing in it, virtual representations of long-since-departed individuals relaying messages seemingly from the afterlife, or ultimately training artificially intelligent systems to move, emote, and sound just like us.