Microsoft AI Gives Deaf RIT Students Auto-Captioning Boost in Lecture Presentations

The Rochester Institute of Technology is piloting an artificial intelligence-powered speech and language technology produced by Microsoft.

The institute's 1,500 deaf and hard-of-hearing students make up the largest mainstream program for deaf students in the United States. To serve them, the college employs a full-time staff of about 140 American Sign Language interpreters and 50 "captionists," who use C-Print, an institute-developed technology, to deliver real-time transcriptions of lectures to the laptops and tablets of students who want or need them.

However, even that sizable staff can't keep up with the growth in "the need for access services," said Gary Behm, interim associate vice president of academic affairs at the National Technical Institute for the Deaf (NTID) and director of the Center on Access Technology. Between 2007 and 2016, for example, the number of captioning hours grew from about 15,440 to 24,335, an increase of 58 percent. NTID is one of the nine colleges within RIT; the center, part of NTID, is the division charged with researching and deploying emerging access technologies.

Microsoft's Translator for Education provides the intelligence behind Presentation Translator, a Microsoft Garage project that breaks down language barriers by letting presenters deliver continually updated, subtitled presentations from PowerPoint. As the presenter speaks in one of 10 supported speech languages, the add-in generates subtitles directly under the presentation in any of 60 text languages. Simultaneously, up to 100 people in the audience can follow along with the presentation in a different language on their own mobile devices. The speech recognition engine can be customized with specialized vocabulary, jargon and technical terms, which is especially important in academia. Translator also adds punctuation and renders numbers as numerals rather than words, making the captions quicker and less tedious to read for those relying on them.
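Presentation Translator itself is a closed add-in, but the same Microsoft speech and translation services are publicly exposed through the Azure Speech SDK. The sketch below is a minimal illustration, not RIT's or Microsoft's actual implementation: it assumes the azure-cognitiveservices-speech Python package, a placeholder subscription key and region, and made-up biology terms, and it demonstrates the two capabilities described above, continuous captioning with translation into additional languages and a phrase list that biases recognition toward course-specific vocabulary.

```python
# Minimal captioning sketch using Microsoft's Azure Speech SDK
# (pip install azure-cognitiveservices-speech). The key, region and
# biology terms are illustrative placeholders; Presentation
# Translator's internals are not public.
import time

import azure.cognitiveservices.speech as speechsdk

# Recognize English speech and translate it into two target languages.
config = speechsdk.translation.SpeechTranslationConfig(
    subscription="YOUR_SUBSCRIPTION_KEY",  # placeholder
    region="YOUR_REGION")                  # placeholder
config.speech_recognition_language = "en-US"
config.add_target_language("es")  # Spanish captions
config.add_target_language("fr")  # French captions

audio = speechsdk.audio.AudioConfig(use_default_microphone=True)
recognizer = speechsdk.translation.TranslationRecognizer(
    translation_config=config, audio_config=audio)

# A phrase list biases the recognizer toward specialized vocabulary --
# the kind of customization that matters in a lecture setting.
phrases = speechsdk.PhraseListGrammar.from_recognizer(recognizer)
for term in ("endoplasmic reticulum", "mitochondria", "Golgi apparatus"):
    phrases.addPhrase(term)

def show_caption(evt):
    # Fires once per finalized utterance; the service returns the
    # text with punctuation and numerals already formatted.
    print("en:", evt.result.text)
    for lang, text in evt.result.translations.items():
        print(f"{lang}:", text)

recognizer.recognized.connect(show_caption)
recognizer.start_continuous_recognition()
time.sleep(60)  # caption for one minute, then stop
recognizer.stop_continuous_recognition()
```

In a classroom deployment, individual devices would join a shared captioning session served by the Translator service rather than each running its own recognizer, which is how up to 100 audience members can follow along at once.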

In a Microsoft blog post about RIT's use of the technology, first-year international student Joseph Adjei described how he "struggled" with the interpretive services offered in his courses because he was fairly new to being deaf: although he could read lips, he hadn't fully learned ASL. Real-time captions on the screens behind the instructor in his biology class, for instance, let him keep up with the class and see how the scientific terms were spelled. Now, in his second semester, he "regularly shifts his gaze between the interpreter, the captions on the screen and the transcripts on his mobile phone, which he props up on the desk." That combination helps him stay engaged with the content, he noted, and also lets him fall back on the captions when he loses track of the ASL.

Adjei now also uses the Microsoft Translator app on his phone to communicate with his hearing peers outside of class. "Sometimes when we have conversations, they speak too fast and I can't lip read them," he explained. "So, I just grab the phone and we do it that way so that I can get what is going on."

The project is in the early stages of deployment to classrooms at RIT, Microsoft reported. The general biology class that Adjei attends is one of 10 equipped for the real-time captioning service.

NTID has also run experiments comparing Microsoft's technology to IBM's Watson AI service. As an RIT presentation comparing the results noted a year ago, "Cognitive services is a relatively new field with many players like IBM, Microsoft, Google, Amazon and Baidu. With their investment in artificial intelligence, achieving functional equivalence with real-time automatic captions could become a reality in the near future."

About the Author

Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.