MIT CSAIL Creates Wearable AI System That Detects Conversation Tones

Mohammad Ghassemi and Tuka Alhanai (pictured above) have analyzed audio and vital-sign data to develop a deep-learning system that has the potential to serve as a "social coach" for individuals who need help navigating social situations. (Image Credit: Jason Dorfman, MIT CSAIL)

A single conversation can be interpreted in many different ways, which can make social encounters difficult for some individuals. But what if there were a way to measure social cues, like tone of voice or body language, to help us understand our interactions with other people?

Researchers from the Massachusetts Institute of Technology’s Computer Science and Artificial Intelligence Laboratory (MIT CSAIL) have come up with a potential solution: a wearable device that utilizes artificial intelligence (AI) to detect the tone of a conversation.

The research team, comprising graduate student Tuka Alhanai and PhD candidate Mohammad Ghassemi, developed a wearable AI system capable of predicting whether a conversation’s tone is happy, sad or neutral based on an individual’s speech patterns and vitals. It works by using deep-learning techniques to analyze audio, text transcriptions and physiological signals as it listens to an individual tell a story.

The team says the system could serve as a "social coach" for individuals with anxiety or other conditions, such as Asperger's syndrome or autism.

“Imagine if, at the end of a conversation, you could rewind it and see the moments when the people around you felt the most anxious,” said Alhanai. “Our work is a step in this direction, suggesting that we may not be that far away from a world where people can have an AI social coach right in their pocket.”

To develop the system, the researchers had individuals wear a Samsung Simband wristband, which captures high-resolution physiological waveforms to measure features like movement, heart rate and blood pressure. It also captures audio data and text transcripts to analyze tone, pitch, energy and vocabulary. Subjects were then asked to tell a happy or sad story of their choosing. A total of 31 conversations of several minutes each were collected. The team extracted 386 audio and 222 physiological features and trained two algorithms on the data. The first algorithm determined the overall tone of a conversation as either happy or sad, while the second classified each five-second block in every conversation as positive, negative or neutral.
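The two-classifier setup described above can be sketched in code. The snippet below is a minimal illustration only, using synthetic data and a simple nearest-centroid classifier as a stand-in for the team's deep-learning models; the feature counts (386 audio, 222 physiological) and conversation count (31) come from the article, while everything else (variable names, label encodings, the number of five-second blocks) is assumed for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the study's inputs: 386 audio + 222
# physiological features for each of the 31 recorded conversations.
n_conversations, n_features = 31, 386 + 222
X = rng.normal(size=(n_conversations, n_features))
y = rng.integers(0, 2, size=n_conversations)  # 0 = sad, 1 = happy

def fit_centroids(X, y):
    """'Train' a nearest-centroid model: average the feature vectors per class."""
    return {label: X[y == label].mean(axis=0) for label in np.unique(y)}

def predict(centroids, X):
    """Assign each sample to the class whose centroid is closest."""
    labels = sorted(centroids)
    dists = np.stack([np.linalg.norm(X - centroids[l], axis=1) for l in labels])
    return np.array(labels)[dists.argmin(axis=0)]

# Classifier 1: overall story tone (happy vs. sad), one label per conversation.
story_model = fit_centroids(X, y)
story_predictions = predict(story_model, X)

# Classifier 2: mood of each five-second block (negative/neutral/positive).
X_blocks = rng.normal(size=(200, n_features))
y_blocks = rng.integers(0, 3, size=200)  # 0 = negative, 1 = neutral, 2 = positive
block_model = fit_centroids(X_blocks, y_blocks)
block_predictions = predict(block_model, X_blocks)
```

In the actual study the second classifier runs over successive five-second windows of one conversation, so a real pipeline would segment the audio and physiological streams before featurizing each block.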

The findings align closely with what people might expect in real life: long pauses and monotonous vocal tones marked stories judged as sadder, while happier stories featured more energetic, varied speech patterns. On average, the system classified the overall tone of a story with 83 percent accuracy, and the mood of individual five-second intervals with an accuracy roughly 18 percent above chance.

The researchers published their findings in the paper, “Predicting Latent Narrative Mood Using Audio and Physiological Data,” which they are presenting this week at the Association for the Advancement of Artificial Intelligence (AAAI) conference in San Francisco, CA.   

“Our next step is to improve the algorithm’s emotional granularity so it can call out boring, tense and excited moments with greater accuracy instead of just labeling interactions as ‘positive’ or ‘negative,’” said Alhanai. “Developing technology that can take the pulse of human emotions has the potential to dramatically improve how we communicate with each other.”

To learn more about how the wearable AI device system works, read the paper or watch the video below.

About the Author

Sri Ravipati is a Web producer for THE Journal and Campus Technology. She can be reached at [email protected].
