Assistive Technology: YouTube Automatic Captioning

Introduction

Initially introduced in 1971, closed captions are an accessibility tool for deaf or hard of hearing people*.

In 2009, Google introduced machine-generated automatic captions to YouTube. These automatic captions allow YouTube creators to easily add captions to their videos. YouTube viewers can toggle captions on and off in video settings.

Features

The key value of YouTube automatic captions lies in their utility and accessibility.

Utility

YouTube automatic captions offer the utility of transcription for all users, exemplifying an assistive tool designed for the Deaf community that benefits a broader audience. In fact, open access to closed captions follows the social model of disability and normalizes the presence of captions as a default.

An emerging benefit of YouTube automatic captions is auto-translate, a feature that bridges the worlds of closed captions and subtitles. YouTube automatic captions can be auto-translated into over 100 languages, offering utility for Deaf users and English-as-a-second-language users alike.

Accessibility

YouTube automatic captions are available on long-form videos, Shorts, and live-streams, making the captions widely accessible across the entire YouTube product.

YouTube automatic captions are available free of charge, making the captions financially accessible as well. The tool allows YouTube creators to directly edit and make corrections to the computer-generated captions, alleviating the need for creators to hire a transcriber or closed captioner for their uploaded videos.

YouTube automatic captions are also easy to use: YouTube users can toggle closed captions on and off in the video player, where the toggle is marked with the standard “CC” symbol. A note in the settings menu of the video player also directs users to their account settings, where they can set default closed caption preferences.

Opportunities

The greatest current limitation of YouTube automatic captions is their accuracy. Because the captions are computer-generated and are not checked for quality, their accuracy can vary widely. The typical accuracy of YouTube automatic captions is 60-70%, a far cry from the accuracy rate of at least 99% needed to ensure that Deaf and hard of hearing people can understand audio content. An accuracy rate of 60-70% means that roughly 1 in every 3 words is mis-transcribed, a ratio high enough to alter the meaning of transcribed sentences.
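The arithmetic behind the "1 in every 3 words" figure can be sketched in a few lines of Python. This is only an illustration: the function name errors_per_words is hypothetical, and the sketch assumes that "accuracy" means the share of words transcribed correctly and that errors are spread evenly through a transcript, neither of which is confirmed by the figures cited here.

```python
def errors_per_words(accuracy: float) -> float:
    """Return the average number of words per mis-transcribed word.

    Assumes `accuracy` is the fraction of words transcribed correctly
    (a simplifying assumption for illustration).
    """
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be at least 0 and below 1")
    error_rate = 1.0 - accuracy  # fraction of words that are wrong
    return 1.0 / error_rate      # one error every this many words


# Compare the typical automatic-caption range with the 99% benchmark.
for accuracy in (0.60, 0.70, 0.99):
    interval = errors_per_words(accuracy)
    print(f"{accuracy:.0%} accurate: about 1 error every {interval:.1f} words")
```

Under these assumptions, the 60-70% range works out to one error every 2.5 to 3.3 words, while 99% accuracy corresponds to roughly one error every 100 words.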

An additional opportunity for YouTube automatic captions lies in their language support. Presently, YouTube automatic captions for long-form videos and Shorts are available in 14 languages, while automatic captions for live-streams are available only in English.

Figure: The 14 languages supported by YouTube automatic captions.

Notably absent from the list of supported languages are Mandarin and Hindi, the two most spoken languages in the world after English. While this lack of support may be a by-product of YouTube being an American company, the omission of Mandarin and Hindi poses a hurdle for YouTube creators who speak the two languages natively.

Closing

By offering computer-generated automatic captions, YouTube reduces the effort and financial hurdles that inhibit YouTube content creators from providing closed captions on their videos. In theory, these automatic captions make YouTube videos much more accessible to Deaf users. In reality, however, the low accuracy of these automatically generated captions prevents Deaf users from accessing an equitable experience on YouTube.

*Deaf culture commonly prefers identity-first language rather than people-first language.