Researchers Add Sound-Based Gesture Recognition to Commodity Computers

A small team of Microsoft and University of Washington researchers is developing a technology that will allow ordinary computers--and eventually mobile devices--to detect gestures and motions and use them as control input. SoundWave, as it's called, uses the speaker and microphone already built into most computers to sense in-air gestures, such as a wave of the hand that tells the computer to scroll the screen up or down.

In a short report on their work, the researchers explain that the approach uses an "inaudible tone" that gets "frequency-shifted" when it bounces off moving objects, such as a waving hand. The microphone picks up the reflected sound so that the shift can be measured.
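The size of that shift can be estimated with the standard Doppler approximation for a reflected tone, which is roughly twice the object's speed times the tone frequency, divided by the speed of sound. The sketch below is only illustrative: the 20 kHz tone, the hand speeds, and the function name are assumptions chosen for the example, not figures from the researchers' report.

```python
# Rough Doppler estimate for a tone reflected off a moving hand.
# All numbers here are illustrative assumptions, not values from the report.

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air near 20 °C


def doppler_shift_hz(tone_hz: float, velocity_m_s: float) -> float:
    """Approximate frequency shift of a reflected tone.

    velocity_m_s is positive when the object moves toward the device and
    negative when it moves away. The factor of 2 accounts for the round
    trip (speaker to hand, hand to microphone). The approximation holds
    only for speeds far below the speed of sound.
    """
    return 2.0 * velocity_m_s * tone_hz / SPEED_OF_SOUND_M_S


if __name__ == "__main__":
    tone_hz = 20_000.0  # assumed pilot tone inside the 18-22 kHz band
    for velocity in (0.25, 1.0, -1.0):  # hypothetical hand speeds in m/s
        shift = doppler_shift_hz(tone_hz, velocity)
        print(f"{velocity:+.2f} m/s -> {shift:+.1f} Hz")
```

Under these assumptions, a hand moving toward the device raises the reflected frequency and a hand moving away lowers it, which is what lets the direction of a gesture be told apart.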

Other recognition mechanisms, such as that found in the Xbox Kinect, require ample processing power and can be sensitive to environmental conditions or lag in recognizing a movement. In a video demonstration, however, SoundWave worked in real time in a fairly dark environment with the ambient sounds of a coffee house. Low lighting isn't a barrier because the technique doesn't require line of sight. It also continued working while the computer was playing music in a separate window.

Besides flapping a hand to generate scrolling, other gestures the scientists have programmed into SoundWave include:

  • Two hands moving in opposite directions to have the program rotate an object in the application;
  • A single tap or a double tap to mimic mouse or touchpad activities;
  • A hand flick to specify browsing, such as with photos;
  • Walking toward or away from the device--a "sustained motion"--to wake the system up or put it to sleep.

The approach does have several limitations, the researchers noted. The primary one is that the tone used to recognize the gesture may distress children and animals, whose hearing is more sensitive to higher-frequency sounds. In addition, some devices filter out tones over 18 kilohertz, and this technique generates tones between 18 and 22 kHz. That obstacle could be mitigated by "piggy-backing" a tone onto a user's digital music, the report suggested. Finally, the software can't detect the lack of motion, which means it would need to integrate complementary techniques to cover static designations.
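One way to check whether a given device passes a tone in that band is to play the tone, record it, and measure how much of it survives. The sketch below uses the Goertzel algorithm, a standard way to measure a single frequency. The report does not say how the researchers measure the tone, so this is a swapped-in technique rather than their method. The 44.1 kHz sample rate, the 20 kHz tone, and the synthetic "recordings" are assumptions for illustration; a real test would use samples captured from the microphone.

```python
import math


def tone_amplitude(samples, sample_rate_hz, tone_hz):
    """Estimate the amplitude of tone_hz in samples via the Goertzel algorithm."""
    n = len(samples)
    k = round(n * tone_hz / sample_rate_hz)  # nearest DFT bin to the tone
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    # Squared magnitude of the DFT bin, clamped against rounding error.
    power = s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2
    return 2.0 * math.sqrt(max(power, 0.0)) / n


def sine(amplitude, tone_hz, sample_rate_hz, n):
    """Synthesize n samples of a sine wave (stand-in for a real recording)."""
    step = 2.0 * math.pi * tone_hz / sample_rate_hz
    return [amplitude * math.sin(step * i) for i in range(n)]


SAMPLE_RATE_HZ = 44_100  # assumed; its 22.05 kHz Nyquist limit covers the band
TONE_HZ = 20_000         # assumed pilot tone inside the 18-22 kHz band
N = 4_410                # 0.1 s of audio, so the tone falls exactly on a bin

passed = sine(1.0, TONE_HZ, SAMPLE_RATE_HZ, N)     # device passes the tone
filtered = sine(0.01, TONE_HZ, SAMPLE_RATE_HZ, N)  # device attenuates it

print(tone_amplitude(passed, SAMPLE_RATE_HZ, TONE_HZ))
print(tone_amplitude(filtered, SAMPLE_RATE_HZ, TONE_HZ))
```

A measured amplitude far below the played amplitude would suggest the device is filtering the tone, in which case a lower frequency or the "piggy-backing" approach mentioned in the report might be needed.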

The project is the result of work done by Microsoft Research and the ubicomp lab, the ubiquitous computing research lab at the University of Washington in Seattle. That lab does research in a number of areas, including user interface technology, energy sensing, and activity recognition.

About the Author

Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.
