Augmented Reality: Metadata and Annotating Life

You've seen examples of Augmented Reality (AR), though you might not have recognized them. Through virtual product placement in movies, advertisements are added after scenes are shot, creating new revenue opportunities. On the auto racing circuit this past season, as Fox Sports televised cars racing around the track, balloon overlays showed the drivers' names. During last season's televised pro football broadcasts, a yellow line indicated the first-down marker on the field. The offensive team lined up on the ball, and there, somewhere ahead of them, was a bright yellow line showing viewers how far the team had to go to make the next first down. It's quite helpful. Too bad players can't look downfield and see how far they have to go as well.

The projection of the first-down marker onto the field seems so realistic that it appears to be part of the field. Players' shadows fall on the line, and bodies crossing it obscure it. However, the limitations are humorously revealed when a game is played under conditions programmers didn't count on. Last season a game took place between the New England Patriots and the Oakland Raiders during a driving snowstorm at Foxborough Stadium, home of the Patriots. Apparently the algorithm for calculating the placement of the yellow line on the field depended on the expectation that the background would be green, or at least some color. In the near-blinding snow, the field was pure white. In fact, the grounds crew had to line up on each 10-yard line with snow blowers to clear the down markers between plays so the players and referees knew where they were on the field. To television viewers, the yellow line seemed to go crazy. It stair-stepped over the players who crossed it, creating crazy yellow cityscapes jutting out of the ground.
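A toy version of color keying makes the snow problem easier to picture. The broadcaster's actual algorithm is not described here, so the sketch below is only an assumption about how such a rule might work: the line is painted on a pixel only when that pixel looks like the field. The names `is_field_color` and `draw_line`, the colors, and the tolerance are all hypothetical.

```python
# Toy color-key overlay: paint the line only where the frame looks like field.
# Every name, color, and threshold here is an illustrative assumption.

YELLOW = (255, 255, 0)
GRASS = (30, 140, 40)
SNOW = (245, 245, 245)
JERSEY = (240, 240, 240)  # a white jersey, nearly the same color as snow


def is_field_color(pixel, key, tolerance):
    """True if every channel of the pixel is within tolerance of the key color."""
    return all(abs(p - k) <= tolerance for p, k in zip(pixel, key))


def draw_line(row, line_columns, key, tolerance=60):
    """Return a copy of one scanline with the line painted on field-colored pixels."""
    out = list(row)
    for x in line_columns:
        if is_field_color(row[x], key, tolerance):
            out[x] = YELLOW
    return out


# Clear day: the player pixel at index 2 is not grass-colored, so it hides the line.
clear_day = [GRASS, GRASS, JERSEY, GRASS]
on_grass = draw_line(clear_day, range(4), key=GRASS)

# Snow, key still tuned for grass: nothing matches, so no line is drawn.
snow_day = [SNOW, SNOW, JERSEY, SNOW]
on_snow_green_key = draw_line(snow_day, range(4), key=GRASS)

# Snow, key retuned to white: the jersey matches too, so the line covers the player.
on_snow_white_key = draw_line(snow_day, range(4), key=SNOW)
```

On grass the player correctly blocks the line. On snow, a rule of this kind either loses the line entirely or can no longer tell a white jersey from the field, which is at least consistent with the line climbing over the players.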

While the commercial examples above are more widely known, the majority of AR research is done using see-through devices. Usually worn on the head, these devices are designed to superimpose graphics or text onto a user's perception of his or her surroundings. The reality augmentation is not limited to visual information, but can include auditory and tactile sensory systems as well.

In fact, the work in AR is a precursor of future user interface design. We're accustomed to thinking of future user interfaces in terms of new flat-panel screens, wrist displays, or various surfaces with interactive displays. But the coming changes will affect the nature of the user's experience with technology, integrating information from a variety of sources into the user's perception. The difference between a future built on new display technologies and one built on AR lies in cognitive effort: AR attempts to eliminate the need for users to switch between reading information on a computer screen and attending to the world around them. The goal is to make the computerized information a part of the user's view of the world.

This may sound like Star Trek, but the work in this field can be traced back 34 years to the pioneering efforts of Ivan Sutherland at Harvard University. The fundamentals of the technology required to build integrated display environments have remained largely unchanged, requiring displays, trackers, and graphics computers and software. What has changed is that the technology for each of these components has drastically improved.

The approaches to integrating computer information into the user's perceptual field have taken two directions, both of which use see-through devices. One is optical, the other is video-based. In optical see-through devices, a partially reflective mirror (a beam splitter) both transmits light from the world and reflects light from the display, so that the user's view of the world combines with the computer-generated images. Combiners of this sort are the basis for the heads-up displays in aircraft cockpits and, recently, some models of automobiles.
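As a rough illustration of what an optical combiner does, the sketch below treats the light reaching the eye as a weighted sum of world light and display light. This is a simplified model under stated assumptions, not a description of any particular device: the function name `combine_optical` and the 50/50 split between transmitted and reflected light are hypothetical.

```python
# Simplified model of an optical combiner (a partially reflective mirror).
# The transmittance and reflectance values are illustrative, not measurements.

def combine_optical(world, display, transmittance=0.5, reflectance=0.5):
    """Per-channel light reaching the eye: transmitted world light plus reflected display light."""
    return tuple(
        min(255.0, transmittance * w + reflectance * d)
        for w, d in zip(world, display)
    )


bright_wall = (200, 200, 200)
black_text = (0, 0, 0)        # the display emits no light for "black"
white_text = (255, 255, 255)

seen_black = combine_optical(bright_wall, black_text)
seen_white = combine_optical(bright_wall, white_text)
```

In this model the display can only add light to the scene. A "black" overlay contributes nothing, so the wall simply shows through at reduced brightness, while a white overlay brightens it.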

The other approach involves video mixing technology that takes images from a camera and adds computer graphics and text. The key difference is that the user is looking at a synthesized image in video-based devices instead of seeing the world augmented by computer-generated overlays. In current see-through display devices, superimposed text may be hard to read against some backgrounds, and the three-dimensional graphics may not produce realistic images. More challenging is bringing the virtual and physical images together in the same plane of focus.
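The video-based approach can be sketched as ordinary per-pixel blending of a graphic over a camera frame. This is a minimal illustration that assumes a single opacity value per pixel. The function name `mix_video` and the sample colors are hypothetical, and real mixers do considerably more.

```python
# Minimal sketch of video mixing: blend a graphic pixel over a camera pixel.
# alpha is the opacity of the graphic (0.0 = invisible, 1.0 = fully opaque).

def mix_video(camera, graphic, alpha):
    """Return the synthesized pixel shown to the user."""
    return tuple(
        round(alpha * g + (1.0 - alpha) * c)
        for c, g in zip(camera, graphic)
    )


camera_pixel = (100, 100, 100)   # what the camera sees
label_pixel = (200, 200, 0)      # part of a computer-generated label

untouched = mix_video(camera_pixel, label_pixel, alpha=0.0)
translucent = mix_video(camera_pixel, label_pixel, alpha=0.5)
opaque_black = mix_video(camera_pixel, (0, 0, 0), alpha=1.0)
```

Because the user sees only the synthesized result, the graphic can fully replace the camera pixel, including with black. The cost is that everything the user sees, the real world included, has passed through the camera and the mixer.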

Research to date has concentrated on improving the presentation of visual data, for example by controlling how data is placed in the human visual field so that items remain in focus and within the same plane as the primary object being viewed. However, there is perhaps even more rewarding work ahead in examining the cognitive circumstances that provide augmented understanding of a scene at hand. That is, what type of information should be provided, and in what context, to improve the user experience? Imagine walking down the street in a city you are visiting with a pair of sunglasses that have an embedded see-through display. As you pass by restaurants, the Zagat rating of each appears adjacent to the restaurant's name. What other information do you want? Do you want to see the menu? Perhaps the average price of selected entrees based on a profile of your favorite foods? Knowing what is going to improve your contextual understanding of the setting you are in, and how to present it, is work that is just beginning.

The technology is rapidly improving. New see-through devices are using lasers to project computer data directly onto a user's retina. Another research direction involves improving the projection of images onto surfaces. Application areas are numerous. Surgeons are starting to experiment with biopsy procedures augmented with see-through displays that integrate ultrasound information into the surgical procedure. In manufacturing, assembling complex systems (aircraft, automobiles) and repairing them are ripe for AR. Finally, teaching with AR takes the notion of scaffolding learning constructs to a new level.

References

Azuma, Ronald T., Yohan Baillot, Reinhold Behringer, Steven K. Feiner, Simon Julier, and Blair MacIntyre. IEEE Computer Graphics and Applications, Vol. 21, No. 6, pp. 34-47; November/December 2001. Available at www.cs.unc.edu/~azuma/cga2001.pdf.

Feiner, Steven K., "Augmented Reality: A New Way of Seeing,"

MacIntyre, B., et al. "Ghosts in the Machine: Integrating 2D Video Actors into a 3D AR System," Proceedings of the 2nd International Symposium on Mixed Reality (ISMR 2001), MR Systems Lab, Yokohama, Japan, 2001, pp. 73-80.

Sutherland, I. "A Head-Mounted Three-Dimensional Display," Fall Joint Computer Conference, American Federation of Information Processing Society Conference Proceedings 33, Thompson Books, Washington, D.C., 1968, pp. 757-764.

Motion Tracking Devices: Ascension Technology Corp. www.ascension-tech.com
