Microsoft’s friendly Xiaoice A.I. can figure out what you want before you ask

Siri doesn’t like you. Alexa doesn’t want to be your friend. And Google Assistant? Well, let’s just say that Google Assistant wants to spend as little time as possible answering your questions. And that’s okay.

That’s because none is designed to be an A.I. friend, despite the platitudes they invariably spout if you ask them whether they enjoy spending time with you. They are “do engines,” virtual assistants which aim to resolve your queries in as little time as possible, hence our saying that Google Assistant doesn’t want to waste more time than it has to telling you what you need to know. These assistants can answer queries and increasingly anticipate our needs.

The one thing they don’t do is care.

Microsoft — which also makes the A.I. assistant Cortana — has a different idea. Xiaoice (pronounced “Shao-ice,” meaning “Little Bing”) is a social chatbot, with a personality modeled on that of a teenage girl, and a dauntingly precocious skill set. In addition to the usual skills you might expect from an A.I. assistant, she can tell jokes, write original poetry, compose and sing songs, read stories, play games, and more.

“Xiaoice is there, 24/7, as a good friend with the power to listen,” Ying Wang, the Microsoft director who oversees Xiaoice, told Digital Trends. “That’s a powerful promise for many users.”

Remember when Google showed off its Duplex technology, capable of making real spoken-word phone calls? Xiaoice can do something very similar. Think the recent A.I. anchor on Chinese television is the first time such a thing has happened? Not quite. Xiaoice has already been a weather reader on Shanghai-based Dragon TV, one of China’s biggest TV stations, for several years. Her omnipresence on various platforms, from television to social media to Huawei smartphones, has made her a star in the East Asian market, and possibly Microsoft’s most famous current employee this side of CEO Satya Nadella, with 660 million users worldwide.

All of this, however, is window dressing for Xiaoice’s real unique selling point: a burning desire to be your (yes, your) friend. Just like Samantha, the artificial intelligence voiced by Scarlett Johansson in the 2013 movie Her, Microsoft’s Xiaoice is intended to be as much a companion as it is an assistant, drawing on some pretty darn impressive “empathic computing” abilities that have made it a surprising hit around the world. In the process, it may just offer us a glimpse at the future of A.I. assistants.

The dream of Eliza

The notion of a chatbot, a computer program designed to simulate a conversation with a human user, is not a new idea. Alan Turing, the godfather of modern artificial intelligence, hypothesized such a thing as early as the 1950s. (Because of Turing’s pioneering work in this area, we refer to the ultimate, human-fooling benchmark of such chatbots as the Turing Test.)

Alan M. Turing is considered the father of modern computing and artificial intelligence. Turing famously helped crack the code of Enigma, a complex cipher machine used by the Germans to encrypt messages during World War II.

The first significant chatbot was built at the Massachusetts Institute of Technology in the mid-1960s by a computer scientist named Joseph Weizenbaum. Weizenbaum’s chatbot was named Eliza, after the character of Eliza Doolittle in George Bernard Shaw’s Pygmalion, who learns to speak progressively better through education. Eliza was intended to simulate a Rogerian psychotherapist by using clever scripting tricks to mirror users’ own words back at them. For example, a user saying that they were depressed much of the time would lead Eliza to ask why they were so depressed. To give the illusion of deep perceptiveness, Eliza would also return to topics brought up earlier in the conversation.
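For the curious, the two tricks described above (mirroring a user's words back and circling back to earlier topics) can be sketched in a few lines of Python. This is an illustrative toy, not Weizenbaum's original script: the `MiniEliza` class, the `REFLECTIONS` table, and the three `RULES` patterns are all hypothetical names and rules invented for this example.

```python
import re

# Hypothetical pronoun swaps, so "my job" is mirrored back as "your job".
REFLECTIONS = {"i": "you", "am": "are", "my": "your", "me": "you"}

# Hypothetical (pattern, reply template) rules, tried in order.
RULES = [
    (re.compile(r"\bi am (.*)", re.IGNORECASE), "Why are you {0}?"),
    (re.compile(r"\bi feel (.*)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bmy (.*)", re.IGNORECASE), "Tell me more about your {0}."),
]


def reflect(fragment):
    """Swap first-person words for second-person ones."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


class MiniEliza:
    def __init__(self):
        # Topics the user has raised, kept so the bot can return to them.
        self.memory = []

    def respond(self, text):
        cleaned = text.strip().rstrip(".!?")
        for pattern, template in RULES:
            match = pattern.search(cleaned)
            if match:
                topic = reflect(match.group(1))
                self.memory.append(topic)
                return template.format(topic)
        # No rule matched: fall back to the oldest remembered topic, if any.
        if self.memory:
            topic = self.memory.pop(0)
            return "Let's go back to something you said earlier: {0}.".format(topic)
        return "Please go on."


bot = MiniEliza()
print(bot.respond("I am depressed much of the time."))
# Why are you depressed much of the time?
print(bot.respond("It is raining."))
# Let's go back to something you said earlier: depressed much of the time.
```

Nothing here understands language. The apparent perceptiveness comes entirely from pattern matching and a short memory, which is the point Weizenbaum set out to make.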

Ironically, Weizenbaum created Eliza to highlight the level of superficiality in communication between humans and machines. Instead, he was somewhat perturbed to see that Eliza’s users enjoyed engaging in conversations with the chatbot, which frequently meant divulging personal information.

An example of a conversation with Eliza, the first real chatbot program. Built by Joseph Weizenbaum, Eliza simulates Rogerian psychotherapy, a method that mirrors a person's own words back at them.

Xiaoice represents the dream of Eliza, writ large. Since launching in China in May 2014, Xiaoice has had more than 30 billion conversations with 660 million human users around the world. Although there are multiple ways to interact with “her,” these typically take place by text message. This divergence from the voice-first approach of other A.I. assistants hints at the different use case. It necessitates a longer, more drawn-out form of communication than the simple “OK Google, will it rain today?” you might bark as you decide whether or not to wear a coat to work.

Microsoft’s Ying Wang said the project started out as an attempt to figure out how to break into the search market in China. “We realized that everyone was spending a ton of time in IMs, with services like WeChat,” she told Digital Trends. “Our original motivation was simple: exploring how people in chat begin searching. We wanted to build an entry point, but we realized that when people are chatting they don’t want to stop doing that in order to begin a search. Our logic was that, if we can continue a conversation with a human, we’re going to find opportunities to find their search intent. We can draw that out to satisfy them.”

The idea of wrapping up users in extended conversations with a chatbot sounds counterintuitive on the surface. The history of computer interaction is based on the premise that using technology is a painful thing, and that whatever can trim even a millisecond off the experience is worthwhile. Like checking into a hotel or having our prostate examined, it’s an experience few want to extend any longer than absolutely necessary.

A user tries out the new functionality in Xiaoice, Microsoft’s social chatbot in China.

But people seem surprisingly receptive to Microsoft’s approach. The typical conversation with Xiaoice lasts 23 turns: around 10 times as many as the industry average. The result, or so Microsoft hopes, is an A.I. that bridges the gap between the way we speak to our Amazon Echo and the way we speak with our friends.

“Interaction among humans is session-based, not command-based,” Wang continued. “Human-to-human conversation happens like that, so why not take a similar approach to A.I.-to-human engagement?”

The goal of a social chatbot

In a 2018 paper, Microsoft researchers wrote that “the primary goal of a social chatbot is not necessarily to solve all the questions the users might have, but rather, to be a virtual companion to users. By establishing an emotional connection with users, social chatbots can better understand them and therefore help them over a long period of time.”

This more social chatter means that Xiaoice can delve into areas that might seem creepier were they voiced by another A.I. assistant. For instance, it will check up on whether you’ve reached home after a night out, find out how you’re faring after a breakup, or keep tabs on how you’re doing after you lose a job.

Like Eliza, it will return to these topics over time and use semantic analysis to gauge how users are feeling. It can also infer from images and then make passably human comments. If a user posts a photo of themselves with a swollen foot, Xiaoice will ask if it hurts. If they post a funny picture of their pet, Xiaoice could make a joke by observing a distinctive visual element of the photo.

Before Microsoft’s Xiaoice and Cortana… there were Clippy and Tay.

“Xiaoice is there, 24/7, as a good friend with the power to listen,” Wang said. “That’s a powerful promise for many users. We’ve seen lots of engagement in the Asia market specifically, but all over the world [there’s been a strong response to it.] Users feel safe, heard, and have a connection.”

Microsoft’s interest in this area is not wholly benevolent, of course. There’s also a steely business logic behind it: Making an A.I. that becomes friends with you drives engagement. With tech companies striving to find ways to keep users on their platforms for as long as possible, this is one heck of a selling point. It also opens up new ways to disseminate content to users.

When Apple presented its users with a free copy of U2’s “Songs of Innocence” album in 2014, the unasked-for move immediately prompted a backlash. Would we respond in the same way if a friend gifted us an album we hadn’t asked for, recommended a new restaurant, or sent us vouchers for a new subscription service? Perhaps not, which is exactly the ground services like Xiaoice have the potential to explore.

The challenges of building an A.I. BFF

Microsoft’s record in this area shows the difficulty of achieving this goal, however. In 1997, the company debuted Clippy, a name that likely causes an involuntary eye twitch in anyone old enough to remember using it. Pitched as an “intelligent” animated assistant to guide you through the experience of using Microsoft Office, Clippy was a cartoon paperclip that popped up on screen to offer guidance when it detected that you were trying to carry out a task like writing a letter or composing a “to do” list.

Microsoft's Clippy debuted in 1997 as a part of Microsoft Office.

The idea of Clippy as a sort of friendly virtual guide was a good one, but its implementation was fairly disastrous. Its illustrator, Seattle-based Kevan J. Atteberry, still notes on his website that he is responsible for creating “probably one of the most annoying characters in history!” A big problem with Clippy was its lack of recall for previous interactions with the user, making it the paperclip avatar version of Guy Pearce’s amnesiac protagonist in Chris Nolan’s Memento. If Microsoft was going to make a truly useful smart assistant, it would need information from its users to shape the suggestions that it made.

Unfortunately, the next attempt at doing something similar for the U.S. market veered too heavily into that terrain. In March 2016, following the initial success of Xiaoice in China, Microsoft attempted to introduce an American version of the technology. Called Tay, this chatbot resided on Twitter, allowing users to communicate with it by sending messages to @tayandyou. The idea was that Tay would learn from interactions with its users, taking conversational cues from the information it picked up from daily conversations. As Microsoft phrased it at the time, “The more you talk the smarter Tay gets.”

Microsoft's Tay experiment went horribly, horribly wrong thanks to internet trolls.

Rapidly, online trolls began bombarding Tay with offensive messages designed to sully its blank slate of a brain. Within its first 24 hours of going live, Tay began tweeting pro-Nazi messages denying the Holocaust. When it finally suggested that “HITLER DID NOTHING WRONG!”, Microsoft pulled the plug, and the company issued a formal apology. A spokesperson for the company said that Tay had been taken offline and its creators were busy making adjustments. “[Tay] is as much a social and cultural experiment, as it is technical,” the statement read.

A social and cultural experiment

This idea of a “social and cultural experiment” is the best description of Xiaoice as it currently stands. Microsoft is moving into uncharted territory, and that’s exciting, but it also carries risk. Xiaoice recently launched its sixth-generation product, further honing the technology. To date, Microsoft has rolled out the product in five markets: China, Japan, India, Indonesia, and the United States. In each place, Xiaoice is rebranded to give it a more local touch.

In the U.S., Xiaoice is called Zo. She is poised to receive some of the sixth-gen features (including “creation capabilities”) in the immediate future. Whether allowing users to upload their photos and have Zo write a poem about them will prove game-changing for the U.S. audience remains to be seen. Nonetheless, Microsoft deserves credit for taking a different path in a world filled with similar A.I. assistants all promising to do the same tasks. A.I. assistants can already turn your lights on and order you takeout; is it now time that they climbed higher up Maslow’s hierarchy of needs pyramid by tackling emotional affection and social belonging, too?

Heather Child, an author whose novel Everything About You explores humanlike A.I. assistants, sees potential in the idea. “This might not be the fastest or most efficient search technology, but if it’s the most human then it’ll catch on,” she told us. “People lock on to people, and although digital friendliness may have emerged from the need to search, that will soon be eclipsed by all the other human needs an A.I. like this could potentially fulfill — such as offering support, empathy, validation and companionship. Communicating by text message removes any obvious difference between interacting with Xiaoice and with a human friend.”

Microsoft hopes that you agree. “The real key takeaway is that we’ve focused on emotional intelligence,” Ying Wang said. “We call this an empathetic computing framework, [designed to] have conversations with humans naturally, which can build a social and emotional connection. It’s a good friend. As a result, they can better participate and help out in human society.”

Luke Dormehl
I'm a UK-based tech writer covering Cool Tech at Digital Trends. I've also written for Fast Company, Wired, the Guardian…