Carnegie Mellon Figures Out How To Match Images Across Media -- Campus Technology

By Dian Schaffhauser
12/12/11

Carnegie Mellon University researchers have developed what they're calling a "surprisingly simple method" for identifying visually similar images that don't match at the pixel level. This method would enable a computer to identify similar images even when factors within the images vary, such as lighting, season, or medium. For example, the technique can find photographic matches to a sketch of a bicycle, a problem that would typically be beyond the capabilities of a computer.

The major obstacle remaining: The technique requires so much processing power that adding the functionality to a search site won't happen in the short term.

The image matching challenge is relevant in a number of activities, such as automatic colorization, scene and video completion, photo restoration, and even making computer graphics imagery more realistic, the researchers explained in their paper, "Data-driven Visual Similarity for Cross-domain Image Matching."

The research team, part of the university's School of Computer Science, is led by Alexei Efros, an associate professor of computer science and robotics, and Abhinav Gupta, an assistant research professor of robotics. The first author is Abhinav Shrivastava, a master's degree student in robotics. They'll present their findings at the SIGGRAPH Asia conference in mid-December.

The researchers said that image matching currently is done by pixel matching. For example, Google Goggles, an app created by Google developers, can make matches by examining shapes, colors, and compositions. But when there are variances in the images, such as a painting versus a photograph, "pixel-wise matching fares quite poorly," the researchers reported. "Small perceptual differences can result in arbitrarily large pixel-wise differences."
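The fragility of pixel-wise comparison can be illustrated with a toy example. The sketch below is not taken from the paper; it assumes tiny grayscale images stored as nested lists, and the helper names `ssd` and `shift_right` are hypothetical. It compares a striped image with a copy shifted one pixel to the right, a change a person would barely notice.

```python
def ssd(a, b):
    """Sum of squared differences between two equally sized grayscale images."""
    return sum(
        (pa - pb) ** 2
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def shift_right(image, fill=0):
    """Shift every row one pixel to the right, padding the left edge with `fill`."""
    return [[fill] + row[:-1] for row in image]


# A 4x6 image of alternating bright and dark one-pixel columns (illustrative data).
stripes = [[255 if x % 2 == 0 else 0 for x in range(6)] for _ in range(4)]
shifted = shift_right(stripes)          # looks almost the same to a person
blank = [[0] * 6 for _ in range(4)]     # looks nothing like the stripes

ssd_shifted = ssd(stripes, shifted)
ssd_blank = ssd(stripes, blank)

print("stripes vs. shifted stripes:", ssd_shifted)
print("stripes vs. blank image:   ", ssd_blank)
```

In this toy case the shifted copy scores as farther from the original than an empty image does, because every pixel flips while the blank image disagrees on only half of them.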

What's needed, they said, is a way to capture the important visual structures that make two images appear similar while discounting "small, unimportant visual details." In other words, a "visual similarity algorithm" needs to be able to figure out which parts of an image are important to the human observer and which aren't.

In an image of somebody in front of the Arc de Triomphe in Paris, for example, the person usually resembles people in other photos and would thus be given little weight in calculating uniqueness. The Arc itself, however, would be given greater weight because few photos include anything like it.
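The idea of weighting by uniqueness can be sketched with a simple rarity score. This is a simplified stand-in, not the method described in the paper: it assumes each photo has already been reduced to a set of named visual elements (the labels and the tiny photo collection are invented for illustration) and weights each element by the logarithm of how rarely it appears, in the style of inverse document frequency from text search.

```python
import math
from collections import Counter


def rarity_weights(collection):
    """Weight each element by log(N / number of photos containing it)."""
    n = len(collection)
    counts = Counter(element for photo in collection for element in set(photo))
    return {element: math.log(n / count) for element, count in counts.items()}


def weighted_similarity(query, candidate, weights):
    """Sum the rarity weights of the elements both photos share."""
    shared = set(query) & set(candidate)
    return sum(weights.get(element, 0.0) for element in shared)


# Hypothetical photo collection: a person appears everywhere, the arch only once.
collection = [
    {"person", "sky", "arch"},
    {"person", "sky", "beach"},
    {"person", "street"},
    {"person", "sky", "tree"},
]
weights = rarity_weights(collection)

query = {"person", "sky", "arch"}
scores = [weighted_similarity(query, photo, weights) for photo in collection]
best_match = max(range(len(collection)), key=lambda i: scores[i])

print(weights)
print(scores, "best match index:", best_match)
```

Because the person appears in every photo, that element gets a weight of zero and contributes nothing to the match, while the arch, which appears once, dominates the score.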

The technique can also be combined with GPS-tagged photo collections to determine the location for a particular landmark and used to assemble a "visual memex," a data set that explores the connections among a set of photos. The researchers have posted a video on YouTube showing the technique, which can build a path through image data to uncover additional information about any individual image.

"We didn't expect this approach to work as well as it did," Efros said. "We don't know if this is anything like how humans compare images, but it's the best approximation we've been able to achieve."

Speed remains the "central limitation of the proposed approach," the researchers wrote. One implementation they developed took three minutes per query, and that was on a 200-node cluster. "This is still too slow for many practical applications at this time," they noted.

The research received financial support from the Computer Science Department's Center for Computational Thinking, the Office of Naval Research, and Google.

About the Author

Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.
