YouTube first introduced captions for videos in 2006. Three years later it automated the feature, a huge step forward that enabled it to announce on Thursday that it now has a billion captioned videos on its site.
Captions appear as a text overlay on a video, transcribing dialog and other relevant sounds. You can enable them by clicking the leftmost icon in the group at the bottom right of the video player.
Although primarily geared toward the 300 million people in the world with hearing impairments, captions can also come in handy for a good chunk of YouTube’s global user base of more than a billion people. Consider videos where the audio is a bit ropey or you simply can’t catch what the actors are saying. The feature could also be useful when you’re in a public place without your earphones and you’re still keen to view the content.
And yes, captions are a heavily used feature, with viewers clicking the “on” button more than 15 million times a day.
But the system isn’t perfect, at least not yet. Errors in the text show up from time to time, and some YouTubers take advantage of the slip-ups to create their own comedy videos.
However, the team has been working hard in recent years to improve the reliability of its automated captions technology. Discussing the issue in a blog post, YouTube’s Liat Kaver said significant progress has already been made in enhancing its speech recognition software and machine learning algorithms. “All together, those technological efforts have resulted in a 50 percent leap in accuracy for automatic captions in English, which is getting us closer and closer to human transcription error rates,” Kaver wrote.
The team also wants to invest more time in improving the caption accuracy of its other supported languages, which include Dutch, French, German, Italian, Japanese, Korean, Portuguese, Russian, and Spanish.
Kaver said the long-term aim is to get captions on every clip that requires them. “Ideally, every video would have an automatic caption track generated by our system and then reviewed and edited by the creator,” she wrote, adding, “With the improvements we’ve made to the automated speech recognition, this is now easier than ever.”