Carnegie Mellon Invention Captures Social Motion

A research breakthrough at Carnegie Mellon University's Robotics Institute enables a computer to understand the body poses and movements of multiple people in video in real time, including the poses of their fingers. The result could be a giant leap forward in how computers capture the subtlest social interactions for behavioral analysis, even when bodies partially block the camera's view.

The project, named OpenPose, is a code library that allows for "real-time, multi-person keypoint detection." As explained on the GitHub site where the code for the project lives, "OpenPose represents the first real-time system to jointly detect human body, hand and facial keypoints (in total 130 keypoints) on single images."
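
For readers who want to experiment, the repository ships optional Python bindings. Below is a minimal sketch, assuming the pyopenpose module has been built from the repo and that a models/ folder holds the downloaded network weights; parameter names and call signatures can vary between releases.

```python
# Minimal sketch of calling OpenPose from Python (assumes the optional
# pyopenpose bindings were built from the GitHub repo; "models/" and the
# image path are placeholders).
import cv2
import pyopenpose as op

params = {"model_folder": "models/", "hand": True, "face": True}

wrapper = op.WrapperPython()
wrapper.configure(params)
wrapper.start()

datum = op.Datum()
datum.cvInputData = cv2.imread("people.jpg")  # any BGR image containing people
wrapper.emplaceAndPop(op.VectorDatum([datum]))

# One (x, y, confidence) row per keypoint, one array slice per detected person.
print(datum.poseKeypoints)
```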

It was made possible with the use of the university's Panoptic Studio, which can view and record synchronized video streams of several people engaged in physical activities, none of them wearing special markers or trackers. The collective output of these numerous 2D images is a 3D visualization that places virtual anatomical "landmarks" on each individual in space. The studio, built a decade ago, is a geodesic sphere almost 5.5 meters in diameter, large enough to hold a group of people who can interact with one another. The dome is outfitted with 480 cameras mounted on the inside surface, generating a data stream of about 29 gigabits per second, plus five Microsoft Kinect II sensors calibrated with the cameras.
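
The underlying geometry is classical: once the same landmark is located in several calibrated 2D views, its 3D position can be recovered by triangulation. The sketch below is illustrative rather than the studio's actual pipeline; it assumes each camera's 3x4 projection matrix is already known.

```python
# Illustrative linear triangulation (direct linear transform), not the
# studio's actual code. Assumes known 3x4 projection matrices.
import numpy as np

def triangulate(projections, points_2d):
    """projections: list of 3x4 camera matrices P_k; points_2d: the (x, y)
    pixel location of one landmark in each corresponding view.
    Returns the least-squares 3D position of that landmark."""
    rows = []
    for P, (x, y) in zip(projections, points_2d):
        # Each view adds two linear constraints on the homogeneous point X:
        # x * (P[2] @ X) = P[0] @ X   and   y * (P[2] @ X) = P[1] @ X
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    # Best X is the right singular vector for the smallest singular value.
    _, _, vt = np.linalg.svd(np.stack(rows))
    X = vt[-1]
    return X[:3] / X[3]  # dehomogenize to (X, Y, Z)
```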

The approach used by the team was to "localize" body parts in a scene — arms, legs, faces, hands — and then associate those parts with particular individuals. Doing that for hands is hard because no single camera can see every part of a hand in one shot. And unlike faces and other body parts, which have been amply captured and tagged by part and position, large annotated datasets of hand images don't exist.
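
In simplified form, that association step is a matching problem: every candidate pairing of two connected part types gets a compatibility score (OpenPose derives its scores from learned "part affinity fields"), and high-scoring pairs are greedily claimed. The toy sketch below uses made-up names and an assumed score matrix.

```python
# Toy sketch of the "localize, then associate" idea; the scoring is assumed
# to be given (OpenPose computes it from learned part affinity fields).
import numpy as np

def associate(score):
    """score[i, j] is the compatibility of elbow candidate i with wrist
    candidate j. Greedily claim the best remaining pair until nothing
    scores above zero."""
    score = score.astype(float)
    pairs = []
    while score.size and score.max() > 0:
        i, j = np.unravel_index(np.argmax(score), score.shape)
        pairs.append((i, j))    # elbow i and wrist j go to the same person
        score[i, :] = -np.inf   # each candidate joins at most one pair
        score[:, j] = -np.inf
    return pairs

# e.g. two people, with the diagonal pairings clearly the best matches:
print(associate(np.array([[0.9, 0.1], [0.2, 0.8]])))  # -> [(0, 0), (1, 1)]
```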

Using the studio with its multiple cameras, the researchers could record 500 simultaneous views of a person's hand. However, because hands are small — "too small to be annotated by most of our cameras," according to Hanbyul Joo, a Ph.D. student in robotics — the project relied on just 31 high-definition cameras. Joo and another Ph.D. student then used their own hands to generate the thousands of annotated views used in the latest research.
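
This bootstrapping works because the cameras are calibrated: a hand keypoint detected confidently in a few views can be triangulated (as sketched above) and then reprojected into every other camera, automatically labeling views where a detector would fail. A minimal sketch of the reprojection step, with assumed names:

```python
# Reprojection step of the multiview-bootstrapping idea; names are
# illustrative assumptions.
import numpy as np

def reproject(P, X):
    """Project a 3D point X = (x, y, z) through a 3x4 camera matrix P,
    returning (u, v) pixel coordinates in that camera's image."""
    u = P @ np.append(X, 1.0)  # homogeneous projection
    return u[:2] / u[2]        # perspective divide

# A new training label in camera k then comes "for free" from geometry:
#   label_k = reproject(P_k, triangulate(confident_Ps, confident_detections))
```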

According to Yaser Sheikh, associate professor of robotics and a member of the research team, the technology could open up new ways for people and machines to interact. For example, the ability to recognize hand poses raises the possibility of people interacting with computers in new and more natural ways, such as communicating just by pointing at things. Robots could also be programmed to "perceive" what the people around them are doing or are about to do, what kinds of moods they're in and whether they can be interrupted.

"We communicate almost as much with the movement of our bodies as we do with our voice," Sheikh said in an article about the project. "But computers are more or less blind to it."

Sheikh suggested several use cases, such as a self-driving car that could be "warned" about a pedestrian showing signs that he or she is about to step into the street or a robot that could detect conditions such as depression.

The researchers are making their computer code and datasets for both multi-person and hand-pose estimation openly available to encourage others to develop their own applications. Already, 20 commercial groups have expressed interest in licensing the technology.

Sheikh and his fellow researchers will soon present their latest work at CVPR 2017, the Conference on Computer Vision and Pattern Recognition, in Honolulu.

About the Author

Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.
