Plugging into the Grid
- By Joseph C. Panettieri
- 12/30/04
Purdue researchers are redefining how information is shared between scientists,
professors, and students.
Grid computing is more than a concept. At Purdue University
(IN), it’s a powerful reality. Indeed, Purdue is one of nine major institutions
participating in TeraGrid, an ambitious research project that’s building
the world’s most comprehensive grid computing infrastructure for scientific
research. Over the next few years, TeraGrid could dramatically influence such
fields as computer animation, medical research, nanotechnology, energy exploration,
and environmental issues. Working with partner institutions such as Indiana
University, Purdue researchers are enhancing TeraGrid to harness massive
computational power, network bandwidth, data storage, instrumentation, and visualization
capabilities. Today, the system’s main users are scientists. But TeraGrid
could ultimately empower Purdue students.
Eager to understand the birth, development, and future implications of TeraGrid,
Campus Technology recently spoke with three of Purdue’s brightest minds:
Research Scientist Krishna Madhavan; TeraGrid Site Lead Sebastien Goasguen;
and High Performance Computing Technical Architect Michael Shuey. In this exclusive
interview, the trio describe how TeraGrid will reshape scientific research and
classroom learning at Purdue.
Campus Technology: For those
who have yet to hear about TeraGrid, can you give us an overview of the project?
Goasguen: The 30,000-foot view is that TeraGrid is a new national cyberinfrastructure.
It incorporates computing resources, storage, instrumentation, and visualization
as a single utility for researchers, students, and faculty. TeraGrid is now
in its third year of advancement; Purdue University’s participation in
its construction started about a year ago. Any researcher in the nation can
now request access to TeraGrid via www.paci.org.
How does TeraGrid differ from Internet2?
Shuey: Internet2 (www.internet2.org)
is primarily a networking infrastructure; it is a means for connectivity that
gets you from one place to another. TeraGrid implies not only networking capabilities
but also active resources across the network. TeraGrid is not just an interconnection
of highways; it is built out to offer complete computational infrastructure,
storage, and other services.
What does Purdue bring to the TeraGrid table?
Goasguen: We provide Linux clusters, IBM (www.ibm.com)
supercomputers, image satellites, and climate modeling data. We also bring
our connection to the nanotechnology community, through our Network for Computational
Nanotechnology (NCN) at Purdue University.
Madhavan: Many of the TeraGrid participants are supercomputing centers. Because
we’re a university, we’re going to bring TeraGrid’s capabilities
to the classroom. We’re lowering the threshold to entry for grid computing
and bringing it down to a level where students can work on concrete, real-world
problems in the classroom.
So, is TeraGrid similar to the commercial grid initiatives from such
companies as IBM and Oracle?
Shuey: There are parallels to what IBM is doing, but there are also some distinct
differences. At a conceptual level, most vendors are converging on a common
grid concept. By contrast, the TeraGrid is actually the leading grid infrastructure
for the US. It’s not strictly tied to any one vendor or institution; it’s
a common infrastructure for all. Because it’s a government-funded initiative,
my guess is that TeraGrid won’t become a major commercial resource. However,
concepts pioneered in TeraGrid will work their way into commercial solutions.
On October 1, Purdue was slated to contribute part of its computing
resources to TeraGrid. Did you meet that deadline? What resources were involved?
Goasguen: We met the deadline with flying colors. The network was online with
a 10-gigabit-per-second connection to the TeraGrid backbone.
The system includes Linux clusters. Did Purdue deploy new servers just
for the cluster portion of the project?
Goasguen: Our Linux cluster is recycled from instructional labs. It includes
close to 1,000 machines that were previously used by students in labs.
Shuey: We also have functions that are exported to the
TeraGrid. We’re taking idle time from other resources and applying it
to TeraGrid. Within an individual TeraGrid site, the infrastructure exists to
“notice” idle time and farm work out to it. There’s currently
no mechanism to balance work between sites, but development is underway
on that.
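To picture what noticing idle time and farming work out to it might look like inside one site, here is a minimal sketch. It is an illustration only and not TeraGrid's actual scheduler: the `Node` class, the `farm_out` function, and the machine and job names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """A hypothetical machine in a site's pool (not a real TeraGrid type)."""
    name: str
    busy: bool = False  # True while the node is already in use
    jobs: List[str] = field(default_factory=list)


def farm_out(nodes: List[Node], queue: List[str]) -> Dict[str, str]:
    """Give each idle node one queued job, in order.

    Returns a mapping of job name to node name. Jobs that find no
    idle node remain in `queue` for a later pass.
    """
    assignments: Dict[str, str] = {}
    for node in nodes:
        if not queue:
            break  # nothing left to hand out
        if node.busy:
            continue  # skip machines that are in use
        job = queue.pop(0)
        node.jobs.append(job)
        node.busy = True
        assignments[job] = node.name
    return assignments


if __name__ == "__main__":
    pool = [Node("lab-01", busy=True), Node("lab-02"), Node("lab-03")]
    waiting = ["sim-a", "sim-b", "sim-c"]
    print(farm_out(pool, waiting))
    print(waiting)
```

The sketch covers a single site only, which matches Shuey's point that balancing work between sites was still under development.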
The TeraGrid research also involves nanotechnology. Can you describe
that more fully?
Goasguen: Engineers are building new devices every day.
The devices are so small that researchers have to worry about designs at the
atomic level. They have to go back to physics and chemistry to understand what’s
happening with the device to ensure a proper design. That leads to a lot of
people having to write programs and simulators for quantum chemistry and atomic
structure, in order to create new devices. Those simulators require a lot of
computing power and storage. TeraGrid is the perfect platform for designing
and running those simulations.
Courtesy of the National Center for Supercomputing Applications (NCSA) and
the Board of Trustees of the University of Illinois.
In addition to offering lots of bandwidth, does TeraGrid’s design
also improve reliability and redundancy?
Shuey: Currently, much of the focus is on increased capacity. As a by-product,
many users are experiencing much better reliability. One TeraGrid partner took
its system offline for routine maintenance for two-and-a-half days. Several
users merely switched to a different TeraGrid site and weren’t affected
in the slightest. As TeraGrid matures, the fail-over ability between sites will
only become more automatic.
Goasguen: The TeraGrid model is learn once, run anywhere. The user experiences
a single environment and doesn’t really care where he’s working.
It’s all one environment.
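The site switching described in this exchange was done by hand at the time. As a rough sketch of what a more automatic fail-over choice could look like, the hypothetical `pick_site` function below prefers the user's usual site and falls back to any other site that is online. The function and the site names are assumptions for illustration, not part of TeraGrid.

```python
from typing import Dict, Optional


def pick_site(online: Dict[str, bool], preferred: str) -> Optional[str]:
    """Choose a site to run on.

    `online` maps hypothetical site names to whether each is up.
    Returns the preferred site if it is up, otherwise the first
    online site in alphabetical order, or None if every site is down.
    """
    if online.get(preferred, False):
        return preferred
    for name in sorted(online):
        if online[name]:
            return name
    return None


if __name__ == "__main__":
    status = {"site-a": False, "site-b": True, "site-c": True}
    print(pick_site(status, "site-a"))
```

Because every site presents the same environment, the caller only needs the returned name and can submit the same job unchanged.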
So, how will TeraGrid evolve over the next one to three years?
Goasguen: We’ll bring it to the classroom and make it an integral part
of teaching—and, of course, an integral part of work with researchers.
Madhavan: High-performance computing resources are the lingua franca of the
research community today. If students are to contribute to future discoveries,
TeraGrid needs to trickle down and be available in the classroom. Students don’t
think of how to use a Web browser; they focus on the content. Similarly, we
want to lower the complexity of TeraGrid tools.
Goasguen: We see TeraGrid as a single utility for scientists. In three years,
we expect most scientists in the United States who require computing,
storage, and visualization to be able to use TeraGrid for their main research and scientific
discovery.