Facebook is using Minecraft to train AI assistants, one block at a time

Engineers at Facebook have created a Minecraft assistant to help you with some of the menial tasks involved in the sandbox builder. The helper, which spawns and works alongside you in the game, can be given simple instructions for things like building a tower, digging a hole or making a wall, cutting construction time down for projects of all sizes.

Once installed, the assistant drops into your Minecraft world. You use a chat box to relay commands to your new companion as you see fit. Orders must be as straightforward as possible, with tasks broken down to their most basic elements. You can instruct it to “make a five-by-five wall” or “dig down seven blocks.” Using that as a foundation, the bot can learn to do more complex jobs the more time you spend with it.

This could make life in Minecraft easier, but the true goal is even more ambitious. The bot’s education gives Facebook’s engineers valuable insight into how AI can better understand humans in everyday life.

“I can go into the Minecraft game, and it’s easy. I don’t have to learn how to move from a location to a location, I don’t have to learn how to destroy a block or build something,” Arthur Szlam, Head of the Minecraft AI Project at Facebook Research, told Digital Trends. “On the other hand, the space of things that people might ask an assistant in Minecraft to do is huge. So, the language understanding problem is front and center.”

A research tool in a creative game

Minecraft has long inspired players to make whatever their imagination desires. Between the global community and open-source roots, it provides an ideal path for engineers to bring this experiment to a wider audience.

Play-testers started to use the bot in January of 2019, and a public version is now live for anyone who wants it. There’s a short laundry list of requirements to download first, and you’ll have to downgrade Minecraft to version 1.12. Once that’s done, however, your new helper is yours to keep.

The bot’s company feels strange at first. The normally secluded experience (in single player, at least) is interrupted by a face staring at you, awaiting your input in the chatbox.

I started by politely saying “hello” while circling my block-headed buddy. My apprehension faded once I saw the first block built at my behest. It gave me a rush of power. The back-and-forth with the bot is friendly, but robotic. It gives thanks for compliments and says “I don’t know” if you tell it to do something beyond its scope. Slowly but surely, I taught it to build me a small, single-room hut.

If the thing you want to make can be broken down into circles and squares without much intricacy, the bot can handle it. That means a house, a tower, or an underground lair is feasible. Teaching the bot took more time than building the hut myself, but once it learned, it could build as many huts as I desired.

Of course, a statue of Mickey Mouse – or anything else that uses granular detail – is far more difficult. These types of intricate structures create hurdles Facebook’s engineers aren’t sure how to solve. They’re hoping researchers can use Minecraft to help find the solutions.

“[For] this release of the bot, the audience is not Minecraft players. It’s the research community,” Larry Zitnick, Research Scientist at Facebook, told me. “We’re purposely putting it out there before it really works well because we want the rest of the research community to work with us in exploring and trying to make this bot better.”

I, Robot

Having a robot assistant around is like having your own Alexa, or Siri, in Minecraft. The conversation is cheery but plainly robotic. Issuing the right command means thinking in the most basic terms and speaking in a rigid, straightforward, detailed manner.

This was obvious when I wanted to build a window in one of the walls of my hut. I had to work step-by-step. I couldn’t tell the bot to build a wall with a window (though perhaps, someday, it could learn that). The variables that play into what “window” and “wall” mean easily confuse the bot. What wall? How big of a wall? How big of a window? What materials should it use?

Instead, I asked it to make a six-by-six block square. Then, with that task complete, I told it to put a hole right in the middle of the square. Then, I finished by asking it to put glass in the hole. Orders need to go by the most basic possible wording, building towards the result.
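To make that step-by-step breakdown concrete, here is a minimal Python sketch of the same three orders applied to a flat grid. It is only an illustration: the function names, the block symbols and the two-by-two hole size are assumptions made for this example, and none of it reflects how Facebook’s bot actually works.

```python
# Hypothetical sketch: every name below is invented for illustration.
AIR, STONE, GLASS = ".", "#", "o"


def make_square(size, block=STONE):
    """Step 1: build a size-by-size square out of one material."""
    return [[block] * size for _ in range(size)]


def cut_hole(wall, hole_size):
    """Step 2: clear a centered square hole and return the cleared cells."""
    start = (len(wall) - hole_size) // 2
    cells = [
        (row, col)
        for row in range(start, start + hole_size)
        for col in range(start, start + hole_size)
    ]
    for row, col in cells:
        wall[row][col] = AIR
    return cells


def fill(wall, cells, block):
    """Step 3: place the given block in each listed cell."""
    for row, col in cells:
        wall[row][col] = block


def wall_with_window(size=6, hole_size=2):
    """Chain the three basic orders into one 'wall with a window'."""
    wall = make_square(size)          # "make a six-by-six square"
    hole = cut_hole(wall, hole_size)  # "put a hole in the middle" (size assumed)
    fill(wall, hole, GLASS)           # "put glass in the hole"
    return wall


if __name__ == "__main__":
    for row in wall_with_window():
        print("".join(row))
```

Each function answers one of the questions that trips up the bot (how big a wall, how big a window, which material), which is why the single request “build a wall with a window” has to be spelled out as three separate orders.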

Bringing the bot this far isn’t the challenge. Voice command software has permeated our lives for many years. Today, you can find it in computers, phones, tablets, cars and much more. The real challenge users face is the leap from understanding simple tasks, like “call mom,” to discerning phrases that provide less context and don’t sound exactly like a predefined default.

I’m Irish, and while my accent isn’t pronounced, it can become strong if I’ve been talking to people from other parts of the country, or if I’m speaking quickly. Asking Alexa to play songs that use Irish, such as ‘The Jail of Cluain Meala’ (pronounced ‘Cloon-Malla’) by The Dubliners, forces me to slow my roll. I must make each syllable perfectly understandable. Engineering around accents and languages is an ongoing challenge.

Teaching AI to have a more nuanced understanding of our language may sound simple, but computerizing that level of speech is a multifaceted problem tangled in other multifaceted problems. It involves working around context, slang, and the tone of specific users. All of this is made more complicated when there’s not a lot of reference material to draw from.

“We don’t have that much training data for any of these things,” Szlam stated. “The number of examples that we have to train this thing is minuscule in some sense. It’s a few tens of thousands, and these are rephrasing from digital templates.”
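For readers wondering what training examples built from templates might look like, here is a rough Python sketch. It assumes a handful of sentence templates with slots for size and shape. The templates, slot values and the structure of the `intent` labels are all invented for this illustration and are not taken from Facebook’s data set or tooling.

```python
from itertools import product

# Hypothetical templates and slot values, made up for this sketch.
TEMPLATES = [
    "build a {size} {shape}",
    "make a {size} {shape}",
    "please construct a {size} {shape}",
]
SIZES = ["small", "five-by-five", "huge"]
SHAPES = ["wall", "tower", "cube"]


def generate_examples():
    """Pair every generated sentence with the structured intent it encodes."""
    examples = []
    for template, size, shape in product(TEMPLATES, SIZES, SHAPES):
        text = template.format(size=size, shape=shape)
        intent = {"action": "BUILD", "size": size, "shape": shape}
        examples.append((text, intent))
    return examples


if __name__ == "__main__":
    data = generate_examples()
    print(len(data))  # 3 templates x 3 sizes x 3 shapes
    print(data[0])
```

Even a generous set of templates produces sentences that all sound alike, which hints at why a few tens of thousands of such examples can still feel minuscule next to the ways real players phrase a request.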

A new frontier for AI in video games

Video games have a history of using AI in fascinating and creative ways, from Psycho Mantis reading the player’s memory card in 1997’s Metal Gear Solid, to the personalized character responses of Nintendo’s Animal Crossing series, to Toby Fox’s fourth-wall-shattering Undertale, and beyond. Facebook’s tech is built so that parts of it can be easily reused in other games. That means other developers could join in.

One potential use seems obvious. Accessibility. Google plans to implement an assistant with Google Stadia that can help players out with the touch of a button. Intuitive in-game helpers could provide all manner of assistance to players who are struggling.

The engineers behind the Minecraft bot are actively working on this problem. A voice-command interface has already been added, but is not yet available in the public version of the bot. They intend to build on the text relay based on trends they’re noticing from Facebook users.

“On Facebook, there’s this thing about describing images on people’s feeds,” Szlam says. “We were really thinking about ‘oh, could I play Minecraft where I would tell the bot to do something and then it would tell me what it did, describe it to me’ and it said ‘I built a cube there and there.’”

Some bigger updates to the bot are on the horizon. Aside from evolved text chat, an upgraded vision system might help the bot recreate other, finished creations it sees. Learning through dialog, in which the bot is taught something through words in chat, is another long-term goal. They’ve spoken to Katja Hofmann, the principal research manager at Project Malmo – Microsoft’s in-house AI research team – about an official collaboration.

The hope is that by framing the AI as a supportive entity, Facebook’s engineers can change ingrained perceptions about what video games can do, as well as cultivate experiences like Minecraft – where just making things is the reward.

“Hopefully we’ll see a shift from ‘here’s this AI I need to beat it’ to ‘here’s an AI, it’ll help me do creative end-goals’,” Zitnick said. “You can imagine a whole new breed of games where the goal is not to win. The goal is to create. As humans, we do have a lot of difficulty creating. Making music, designing things… These aren’t things that come naturally to everybody, and if we could have bots that would help us in doing these things, it would allow a lot of us to be a lot more creative.”

“I think we’re a long way from that,” Zitnick added. “But that’s where I’d love to see this go.”
