Carnegie Mellon Figures Out How To Match Images Across Media -- Campus Technology

By Dian Schaffhauser
12/12/11

Carnegie Mellon University researchers have developed what they're calling a "surprisingly simple method" for identifying visually similar images that don't match at the pixel level. This method would enable a computer to identify similar images even when factors within the images vary, such as lighting, season, or medium. For example, the technique can find photographic matches to a sketch of a bicycle, a problem that would typically be beyond the capabilities of a computer.

The major obstacle still to conquer: The operation demands so much processing power that the functionality won't be added to a search site in the short term.

The image matching challenge is relevant in a number of activities, such as automatic colorization, scene and video completion, photo restoration, and even making computer graphics imagery more realistic, the researchers explained in their paper, "Data-driven Visual Similarity for Cross-domain Image Matching."

The research team, part of the university's School of Computer Science, is being led by Alexei Efros, an associate professor of computer science and robotics, and Abhinav Gupta, an assistant research professor of robotics. First author is Abhinav Shrivastava, a master's degree student in robotics. They'll be presenting their findings at a mid-December SIGGRAPH Asia conference.

The researchers said that image matching currently is done by pixel matching. For example, Google Goggles, an app created by Google developers, can make matches by examining shapes, colors, and compositions. But when there are variances in the images, such as a painting versus a photograph, "pixel-wise matching fares quite poorly," the researchers reported. "Small perceptual differences can result in arbitrarily large pixel-wise differences."
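The quoted point about pixel-wise differences can be illustrated with a toy example. The sketch below is not taken from the paper; the helper names and the tiny striped "images" are assumptions made purely for illustration. It compares a striped pattern with a copy shifted by one pixel, and with a flat gray square, using a plain sum of squared pixel differences.

```python
def pixel_distance(a, b):
    """Sum of squared differences between two equal-sized grayscale images."""
    return sum(
        (pa - pb) ** 2
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def vertical_stripes(width, height, offset=0):
    """Toy image: alternating white (255) and black (0) one-pixel columns."""
    return [
        [255 if (x + offset) % 2 == 0 else 0 for x in range(width)]
        for _ in range(height)
    ]


original = vertical_stripes(8, 8)
shifted = vertical_stripes(8, 8, offset=1)  # same pattern, moved one pixel
flat_gray = [[128] * 8 for _ in range(8)]   # no stripes at all

d_shifted = pixel_distance(original, shifted)
d_gray = pixel_distance(original, flat_gray)

# The shifted copy looks nearly identical to a person, yet it scores as
# farther from the original than a featureless gray square does.
print(d_shifted, d_gray, d_shifted > d_gray)
```

In this toy case every pixel of the shifted copy disagrees with the original, so a purely pixel-based measure rates a visually near-identical image as a worse match than a blank one.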

What's needed, they said, is a way to capture the important visual structures that make two images appear similar while still allowing for "small, unimportant visual details." In other words, a "visual similarity algorithm" needs to be able to figure out which parts of an image matter to the human observer and which don't.

In an image of somebody standing in front of the Arc de Triomphe in Paris, for example, the person usually resembles people in many other photos and would thus be given little weight in calculating uniqueness. The Arc itself, however, would be given greater weight because few photos include anything like it.
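The idea of weighting by uniqueness can be sketched in a few lines. This is not the researchers' algorithm, which the article does not describe in detail. It is a simple stand-in that uses an inverse-frequency weight, borrowed from text search, over hand-labeled features. The feature labels, the four-photo collection, and the function names are all hypothetical.

```python
import math
from collections import Counter


def rarity_weights(collection):
    """Weight each feature by log(N / number of images containing it)."""
    n = len(collection)
    counts = Counter(f for feats in collection for f in set(feats))
    return {f: math.log(n / c) for f, c in counts.items()}


def weighted_similarity(query, candidate, weights):
    """Add up the weights of the features the two images share."""
    return sum(weights.get(f, 0.0) for f in set(query) & set(candidate))


# Hypothetical photo collection, each image reduced to a set of labels.
collection = [
    {"person", "sky", "arch"},
    {"person", "sky", "beach"},
    {"person", "street"},
    {"person", "sky", "tree"},
]
weights = rarity_weights(collection)

query = {"person", "sky", "arch"}
score_arch_photo = weighted_similarity(query, collection[0], weights)
score_beach_photo = weighted_similarity(query, collection[1], weights)

# "person" appears in every photo, so it contributes nothing; the rare
# "arch" dominates the score.
print(weights["person"], weights["arch"], score_arch_photo > score_beach_photo)
```

Under these assumptions, a feature that appears in every photo gets a weight of zero, while one that appears in a single photo gets the largest weight, which mirrors the person-versus-Arc example above.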

The technique can also be combined with GPS-tagged photo collections to determine the location for a particular landmark and used to assemble a "visual memex," a data set that explores the connections among a set of photos. The researchers have posted a video on YouTube showing the technique, which can build a path through image data to uncover additional information about any individual image.

"We didn't expect this approach to work as well as it did," Efros said. "We don't know if this is anything like how humans compare images, but it's the best approximation we've been able to achieve."

Speed remains the "central limitation of the proposed approach," the researchers wrote. One implementation they developed took three minutes per query, and that was on a 200-node cluster. "This is still too slow for many practical applications at this time," they noted.
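To put those figures in perspective, the short calculation below converts them into total machine time. It assumes all 200 nodes are busy for the full three minutes and that queries run one at a time, neither of which the article states, so the results are rough upper bounds rather than measured values.

```python
minutes_per_query = 3   # reported by the researchers
nodes = 200             # reported cluster size

# Assumption: every node works for the whole query.
node_minutes = minutes_per_query * nodes
node_hours = node_minutes / 60

# Assumption: the cluster handles one query at a time, around the clock.
queries_per_day = (24 * 60) // minutes_per_query

print(node_minutes, node_hours, queries_per_day)
```

On those assumptions, a single query consumes about 10 hours of machine time, and the whole cluster could answer fewer than 500 queries a day.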

The research received financial support from the Computer Science Department's Center for Computational Thinking, the Office of Naval Research, and Google.

About the Author

Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.