2009 Campus Technology Innovators: High-Performance Computing
        
        
        
By Mary Grush, Matt Villano
08/01/09
				 
				
						THE PURDUE DIAGRID TEAM, left to right: Andrew Howard, Phillip Cheeseman, John Campbell,
David Braun, Preston Smith, Carol Song.
		 
HIGH-PERFORMANCE COMPUTING
  Innovator: Purdue University
At Purdue University (IN), the demand for computing by science
and engineering faculty has increased at a far faster rate
than the budget for new computing hardware. Meanwhile,
most computers, even multimillion-dollar supercomputers, are
only in use about half of the time. By capturing these unused
cycles, DiaGrid provides millions of hours of computation that
would otherwise be wasted, without additional technology or
facilities purchases. (DiaGrid began in 2004 as a Purdue
West Lafayette campus system known as BoilerGrid, and was
renamed in 2008 with the addition of several other campuses,
including Indiana University, the University of Notre
Dame (IN), Indiana State University, Purdue's Calumet
and North Central regional campuses, and Indiana University-
Purdue University Fort Wayne.)
The idea of reclaiming wasted computing cycles by putting
  idle machines to work in a distributed computing grid is not new.
  The notion was even popularized by SETI@home, which recruited
  ordinary home computers to join in the hunt for extraterrestrials
  while their owners slept. But no other grid project has ever
  attempted to pool the wide variety of hardware systems
  represented in DiaGrid.
  Among the resources tapped: computers in campus labs,
  offices, server rooms, and high-performance research computing
  clusters, running a variety of operating systems. Now at more
  than 24,000 processors (and growing) across multiple campuses,
  the sheer size of the pool also sets DiaGrid apart. It provided
  more than 16 million hours of computation in 2008.  
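To put the figures above in perspective, they can be combined in a back-of-envelope calculation. The Python sketch below assumes, purely for illustration, that the pool held 24,000 processors throughout 2008; that overstates the size of a grid that was still growing, so the result is at best a rough lower bound. The function and variable names are hypothetical and are not part of DiaGrid or Condor.

```python
# Back-of-envelope estimate built only from the article's round figures.
HOURS_IN_2008 = 366 * 24        # 2008 was a leap year
processors = 24_000             # pool size quoted in 2009 (ASSUMED constant for all of 2008)
delivered_hours = 16_000_000    # "more than 16 million hours" delivered in 2008


def harvested_fraction(delivered, procs, hours_per_year=HOURS_IN_2008):
    """Share of the pool's total processor-hours that the grid delivered."""
    return delivered / (procs * hours_per_year)


fraction = harvested_fraction(delivered_hours, processors)
print(f"{fraction:.1%} of the pool's theoretical processor-hours")
```

Under that assumption, the harvested hours work out to a single-digit percentage of everything the machines could theoretically supply in a year, which gives a sense of how much idle capacity a pool of this size represents.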
DiaGrid is based on Condor, free, open source software
  developed at the University of Wisconsin-Madison that supports
  high-throughput computing on large collections of distributed,
  cross-platform computing resources. It also relies on Cycle
  Computing's CycleServer tool for many of the administrative
  aspects of managing and using a Condor pool, as well as
  Batch System Pro from PBS GridWorks for scheduling jobs.
  And DiaGrid takes advantage of high-speed connectivity via
  I-Light, the fiber-optic state network connecting Indiana
  campuses, along with national research networks such as
  Internet2 and National LambdaRail.  
DiaGrid has been used at Purdue in a variety of demanding
  research projects, such as imaging the structure of viruses at
  near-atomic resolutions; simulating the Oort Cloud in an effort
  to understand the early stages of the solar system's formation;
  projecting the reliability of Indiana's electrical supply; and
  modeling the spread of water pollutants. Other applications
  have included a system to help create a virtual version of a
  pharmacy clean room for training student pharmacists, and a
  fly-through animation of a proposed satellite city that
  could serve as a refuge for Istanbul, Turkey, in the event
  of a catastrophic earthquake. DiaGrid provides computational
  resources to researchers on both the Open
  Science Grid and the TeraGrid.  
Currently, the centralized equivalent of DiaGrid would
  be a cluster supercomputer costing more than $3 million,
  taking up 2,000 square feet of floor space, and ranking
  among the top 100 supercomputers worldwide. And DiaGrid
  provides its compute power entirely from existing
  computing resources that would otherwise be wasted.
  Project lead John Campbell, associate vice president for
  information technology at Purdue, has DiaGrid's next
  foreseeable goal in sight: to add more partners and reach
  a pool size of 100,000 processors in 2009.  
Gerry McCartney, Purdue's vice president for information
  technology and chief information officer, says
  DiaGrid will continue to build and expand. "We named
  this national computing grid DiaGrid after the type of
  girder arrangement used in modern skyscrapers,"
  McCartney says. "It's an apt metaphor. We're building a computing infrastructure that scientists and
  engineers can use to make monumental discoveries.
  DiaGrid is a new, national resource
  for research. Experiments will be conducted
  using this computing grid that could not have
  been done before."
                
About the Authors
Mary Grush is Editor and Conference Program Director, Campus Technology.
Matt Villano is senior contributing editor of this publication.