Plugging into the Grid

Purdue researchers are redefining how information is shared between scientists, professors, and students.

Grid computing is more than a concept. At Purdue University (IN), it’s a powerful reality. Indeed, Purdue is one of nine major institutions participating in TeraGrid, an ambitious research project that’s building the world’s most comprehensive grid computing infrastructure for scientific research. Over the next few years, TeraGrid could dramatically influence such fields as computer animation, medical research, nanotechnology, energy exploration, and environmental issues. Working with partner institutions such as Indiana University, Purdue researchers are enhancing TeraGrid to harness massive computational power, network bandwidth, data storage, instrumentation, and visualization capabilities. Today, the system’s main users are scientists. But TeraGrid could ultimately empower Purdue students.

Eager to understand the birth, development, and future implications of TeraGrid, Campus Technology recently spoke with three of Purdue’s brightest minds: Research Scientist Krishna Madhavan; TeraGrid Site Lead Sebastien Goasguen; and High Performance Computing Technical Architect Michael Shuey. In this exclusive interview, the trio describe how TeraGrid will reshape scientific research and classroom learning at Purdue.

Campus Technology: For those who have yet to hear about TeraGrid, can you give us an overview of the project?
Goasguen: The 30,000-foot view is that TeraGrid is a new national cyberinfrastructure. It incorporates computing resources, storage, instrumentation, and visualization as a single utility for researchers, students, and faculty. TeraGrid is now in its third year of advancement; Purdue University’s participation in its construction started about a year ago. Any researcher in the nation can now request access to TeraGrid via www.paci.org.

How does TeraGrid differ from Internet2?
Shuey: Internet2 (www.internet2.org) is primarily a networking infrastructure; it is a means for connectivity that gets you from one place to another. TeraGrid implies not only networking capabilities but also active resources across the network. TeraGrid is not just an interconnection of highways; it is built out to offer complete computational infrastructure, storage, and other services.

What does Purdue bring to the TeraGrid table?
Goasguen: We provide Linux clusters, IBM (www.ibm.com) supercomputers, image satellites, and climate modeling data. We also bring our connection to the nanotechnology community, through our Network for Computational Nanotechnology (NCN) at Purdue University.
Madhavan: Many of the TeraGrid participants are supercomputing centers. Because we’re a university, we’re going to bring TeraGrid’s capabilities to the classroom. We’re lowering the threshold to entry for grid computing and bringing it down to a level where students can work on concrete, real-world problems in the classroom.

So, is TeraGrid similar to the commercial grid initiatives from such companies as IBM and Oracle?
Shuey: There are parallels to what IBM is doing, but there are also some distinct differences. At a conceptual level, most vendors are converging on a common grid concept. By contrast, the TeraGrid is actually the leading grid infrastructure for the US. It’s not strictly tied to any one vendor or institution; it’s a common infrastructure for all. Because it’s a government-funded initiative, my guess is that TeraGrid won’t become a major commercial resource. However, concepts pioneered in TeraGrid will work their way into commercial solutions.

On October 1, Purdue was slated to contribute part of its computing resources to TeraGrid. Did you meet that deadline? What resources were involved?
Goasguen: We met the deadline with flying colors. The network was online with a 10-gigabit connection to the TeraGrid backbone.

The system includes Linux clusters. Did Purdue deploy new servers just for the cluster portion of the project?
Goasguen: Our Linux cluster is recycled from instructional labs. It includes close to 1,000 machines that were previously used by students in labs.
Shuey: We also have functions that are exported to the TeraGrid. We’re taking idle time from other resources and applying it to TeraGrid. Within an individual TeraGrid site, the infrastructure exists to “notice” idle time and farm work out to it. There’s currently no mechanism to balance work between sites, but development is underway on that.

The TeraGrid research also involves nanotechnology. Can you describe that more fully?
Goasguen: Engineers are building new devices every day. The devices are so small that researchers have to worry about designs at the atomic level. They have to go back to physics and chemistry to understand what’s happening with the device to ensure a proper design. That leads to a lot of people having to write programs and simulators for quantum chemistry and atomic structure, in order to create new devices. Those simulators require a lot of computing power and storage. TeraGrid is the perfect platform for designing and running those simulations.

Courtesy of the National Center for Supercomputing Applications (NCSA) and the Board of Trustees of the University of Illinois.

In addition to offering lots of bandwidth, does TeraGrid’s design also improve reliability and redundancy?
Shuey: Currently, much of the focus is on increased capacity. As a by-product, many users are experiencing much better reliability. One TeraGrid partner took its system offline for routine maintenance for two-and-a-half days. Several users merely switched to a different TeraGrid site and weren’t affected in the slightest. As TeraGrid matures, the fail-over ability between sites will only become more automatic.
Goasguen: The TeraGrid model is learn once, run anywhere. The user experiences a single environment and doesn’t really care where he’s working. It’s all one environment.

So, how will TeraGrid evolve over the next one to three years?
Goasguen: We’ll bring it to the classroom and make it an integral part of teaching—and, of course, an integral part of work with researchers.
Madhavan: High-performance computing resources are the lingua franca of the research community today. If students are to contribute to future discoveries, TeraGrid needs to trickle down and be available in the classroom. Students don’t think of how to use a Web browser; they focus on the content. Similarly, we want to lower the complexity of TeraGrid tools.
Goasguen: We see TeraGrid as a single utility for scientists. In three years, we expect most scientists in the United States who require computing, storage, and visualization to be able to use TeraGrid for their main research and scientific discovery.
