Baidu’s TypeTalk app uses artificial intelligence to power voice transcription

Baidu may not have the brand recognition in the United States that it does in mainland China, where the Beijing-based juggernaut commands roughly 80 percent of the internet search market and posts quarterly profits that regularly run into the hundreds of millions. But it’s hoping to change that. In 2013, the company planted a stake in Silicon Valley with the Institute of Deep Learning, a research center devoted to advancing the firm’s artificial intelligence efforts. Now, thanks in part to the lab’s work, Baidu is debuting an app bound for the United States and beyond: TypeTalk, a third-party Android keyboard with AI-powered voice recognition.

TypeTalk, which launched on the Google Play Store a few months ago and received a major update on Monday, isn’t your typical keyboard. Unlike Swype, SwiftKey, or Google Keyboard, Baidu’s effort emphasizes voice over tapping: Selecting a text field surfaces a giant, white microphone icon beneath instructions to “Speak now.”

“In our view, we’ve taken a voice-centric approach,” Bijit Halder, TypeTalk’s product manager, told Digital Trends. “It’s in the design. Most keyboards have a little mic button on the side, but we’ve put voice first.”

That may not sound all that novel — most Android devices ship with voice-to-text capabilities, after all. But there’s more to TypeTalk than meets the eye. At the keyboard’s core is machine intelligence: a voice transcription engine known as Deep Speech. “The special ingredient [is] highly accurate speech recognition, built on Baidu’s … deep learning-based technology,” Andrew Ng, chief scientist at Baidu and founder of Google research initiative Google Brain, said in a press release.

Optimization deserves credit, too. TypeTalk personalizes transcription results, tailoring word and phrase suggestions to individual users. “The more data you have, the better algorithm you have,” Halder said. “You can learn. You can improve performance.” And it adapts to its surroundings: TypeTalk handles voice-to-text near a noisy roadside differently than it does in a quiet cubicle. “We’re looking at use cases and [continue] to make improvements,” he said.

If you’ve used Android’s built-in voice transcription, Apple’s Siri on iOS, or Cortana on Windows Phone, TypeTalk’s transcription tech will feel familiar. Tapping the microphone icon begins transcription, during which the keyboard handles a number of labor-saving tasks on the fly: It capitalizes words at the beginning of sentences, stops listening during periods of silence, and supports voice commands for punctuation like periods, commas, and colons. When it encounters a word it’s not entirely confident about, it underlines the word and shows a pop-up box with a list of suggestions.

That last feature, error detection, is a component on which the TypeTalk team spent an outsized amount of development effort. The reason, Halder said, is the extent to which transcription errors can impact accuracy. “It takes time and focus [to correct an error],” he said. “Even finding the word in error is tricky, and correction takes a long time.” According to Baidu, TypeTalk’s transcription engine can reduce the error rate by as much as 20.4 percent.

That’s not the only error-correcting feature TypeTalk sports. A two-finger gesture allows for quick highlighting of parts of sentences: the vacant space above, below, and on either side of the microphone icon acts as a trackpad, and swiping along it highlights words and phrases.

Baidu contends that TypeTalk’s features, taken together, have the potential to measurably speed up text entry. Halder pointed to a recent study published by Stanford, the University of Washington, and Baidu Research that found Baidu’s voice recognition engine, the same one used in TypeTalk, could enable users to input text up to three times faster than, say, hunting and pecking for keys. “The average user can type about 40 to 50 words a minute on mobile, but up to 150 words a minute speaking,” Halder said.

Transcription may be TypeTalk’s headlining feature, but it’s not the only labor-saving one. Baidu’s app includes a full QWERTY keyboard for situations in which transcription is impossible, one with emojis, alphanumeric keys, symbols, and support for Swype-like gestures that generate AI-powered word suggestions.

And it packs what Baidu calls QuickShare, a feature that taps third-party services to supply images, search results, and more. Support for GIF database Giphy lets you browse and share animated GIFs; Yelp integration lets you search for restaurants and places; and location sharing, as you might expect, shares your current location.

Baidu has big plans for TypeTalk — it sees it as a “growing” product that’ll “evolve” over time. “On mobile, going forward, the focus will absolutely be speech interfaces,” Halder said. “That includes not only speech-to-text, but the whole language and AI behind it. And Baidu is committed to that as a company.”

More concretely, in the next few months, Halder sees intelligent integration with third-party services. “You may be able to say, ‘Come and pick me up,’ and it’ll share your location automatically and summon a car,” he said. “That’s the future of the keyboard.”

Support for Apple’s iOS, on the other hand, isn’t in the cards. That’s thanks to restrictions in the latest version of the iPhone maker’s operating system, Halder said: third-party keyboards can’t access the iPhone’s microphone. “We’re interested in iOS and other means of getting that functionality, but until Apple changes its policy, support is impossible,” he said.

Here’s hoping that day arrives sooner rather than later.
