Purdue University

By centrally managing its supercomputing clusters, this Big Ten research university provides more power to more faculty, more efficiently, at lower cost.

“It would be simple to buy more equipment in order to deliver more resources, or conversely to reduce our IT budget in tough economic times by buying less equipment. But to reduce costs while simultaneously providing one of the nation’s leading cyberinfrastructures for research requires new ways of working.”

That’s Gerry McCartney, vice president of information technology and CIO at Purdue University (IN), speaking about the institution’s Community Cluster Program—a centralized, supercomputing cluster operation that represents a game-changing way of providing high-performance computing power more efficiently, to more users, at a lower cost.

Developed by ITaP (Information Technology at Purdue), the program pools the nodes purchased by individual departments and researchers into a single, very large central cluster. The result is a machine with greater total compute power than any single user could purchase, significantly lower costs, and much higher utilization of resources.

Purdue’s first large community cluster, named Steele, was built in 2008, and the second, a larger cluster called Coates, was built in 2009. Together, Coates and Steele saved about $1.3 million in hardware costs (thanks to group pricing); they deliver more than 160 teraflops of processing power; and in 2009 alone they ran more than 6.9 million jobs in nearly 67 million compute hours. A third cluster, Rossmann, coming into production in August 2010, will increase capacity by another 50 percent. The cluster approach has given Purdue two TOP500-class supercomputers and makes the institution home to one of the largest supercomputing centers run exclusively by a university.

Vendor & Product Details
AMD: amd.com
Cfengine: cfengine.org
Chelsio Communications: chelsio.com
Cisco Systems: cisco.com
HP: hp.com
Red Hat: redhat.com

Central computing provides the storage, internetworking fabric, racking, and cooling required to run the clusters. ITaP staff also administer and maintain the supercomputers, handling end-user support, software updates, security, data storage, and backups. Coates is currently the largest all-10-gigabit-Ethernet academic cluster in the world, consisting of 985 8- and 16-core HP systems with AMD 2380 or 8380 processors, connected with Cisco and Chelsio Communications networking equipment.

By carefully balancing compute hours among many users, ITaP keeps these valuable central computing resources busy an impressive 95 percent of the time. Faculty investors always have access to their nodes, of course, but when they are not using them, their nodes are available to others at Purdue or are shared with external researchers through the TeraGrid and Open Science Grid distributed computing infrastructures. More than 14 million compute hours on the clusters were allocated to off-campus researchers in 2009.
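To make the sharing policy concrete, here is a minimal Python sketch of the allocation logic described above. It is an illustration under assumptions, not Purdue’s actual scheduler (which the article does not name): the class, group, and job names are invented, and a production batch system would add queues, fair-share accounting, and job checkpointing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    owner: str                     # faculty group that purchased the node
    running: Optional[str] = None  # job id currently on the node, if any
    guest: bool = False            # True when the running job is borrowed time


class CommunityScheduler:
    """Toy owner-first scheduler: owners never wait on their own
    hardware, and idle nodes are lent out to keep utilization high."""

    def __init__(self, nodes: List[Node]) -> None:
        self.nodes = nodes

    def submit(self, job_id: str, group: str) -> Optional[Node]:
        # 1. Prefer an idle node the submitting group owns.
        for node in self.nodes:
            if node.owner == group and node.running is None:
                return self._start(node, job_id, guest=False)
        # 2. Otherwise borrow any idle node (opportunistic sharing).
        for node in self.nodes:
            if node.running is None:
                return self._start(node, job_id, guest=True)
        # 3. Otherwise reclaim one of the group's nodes from a guest job.
        for node in self.nodes:
            if node.owner == group and node.guest:
                print(f"evicting guest job {node.running}")
                return self._start(node, job_id, guest=False)
        return None  # owned nodes busy with the owner's work: job queues

    def _start(self, node: Node, job_id: str, guest: bool) -> Node:
        node.running, node.guest = job_id, guest
        return node


if __name__ == "__main__":
    sched = CommunityScheduler([Node("physics"), Node("chemistry")])
    sched.submit("p1", "physics")    # physics runs on its own node
    sched.submit("p2", "physics")    # physics borrows chemistry's idle node
    sched.submit("c1", "chemistry")  # chemistry evicts the guest and reclaims
```

The invariant is what makes the community model attractive to faculty investors: they never wait on hardware they paid for, yet none of their cycles go to waste while they are away.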

Purdue and ITaP have a strong track record of finding ways to leverage computing power that would otherwise go unused. The institution was recognized with a 2009 Campus Technology Innovators award for DiaGrid, a distributed computing strategy that harvests idle CPU time from office desktops and laboratory computers at Purdue and other campuses, and applies the reclaimed compute hours to research computing (see campustechnology.com/articles/2009/07/22/campus-technology-innovators-awards-2009-high-performance-computing.aspx).
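The cycle-harvesting idea behind DiaGrid can be sketched in a few lines. The Python below is purely illustrative: the agent loop, names, and simulated idle check are assumptions of this sketch, and the real system is built on production high-throughput-computing middleware rather than a hand-rolled loop.

```python
import random

IDLE_THRESHOLD_S = 600  # owner inactive this long => machine is "idle"


def seconds_since_user_input() -> float:
    # Stand-in for an OS query of keyboard/mouse idle time;
    # randomized here so the sketch runs on its own.
    return random.choice([30.0, 1200.0])


def run_agent(ticks: int = 10) -> None:
    """Donate the desktop to the pool only while the owner is away."""
    guest_job = None
    for _ in range(ticks):
        idle = seconds_since_user_input() >= IDLE_THRESHOLD_S
        if idle and guest_job is None:
            guest_job = "research-job-42"   # fetched from the central pool
            print(f"desktop idle: starting {guest_job}")
        elif not idle and guest_job is not None:
            print(f"owner is back: evicting {guest_job}")
            guest_job = None                # job returns to the queue


if __name__ == "__main__":
    run_agent()
```

The essential contract is visible even in the toy version: guest work runs only while the owner is away, and the desktop is handed back the instant the owner returns.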

ITaP uses Red Hat Kickstart, Cfengine, and a range of other open source tools to automate the process of building the clusters and installing the nodes inexpensively and efficiently. But ITaP’s use of human resources is even more impressive: Coates and Steele were each built in a day, using a unique “high-tech barn raising” approach. With a very large team of well-coordinated volunteers, the clusters stand up fast, with the first applications running on completed stacks in the morning, long before the pizza arrives. (For more on the team’s speedy builds, see CT’s recent interview with McCartney: “A High-tech Barn Raising,” campustechnology.com/articles/2010/06/09/a-high-tech-barn-raising.aspx.)
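To give a flavor of that install automation, here is a hedged Python sketch that renders a per-node Red Hat Kickstart file from a single template, the kind of artifact an unattended, PXE-driven install consumes. The directives in the template are standard Kickstart, but the hostnames, password hash, and package list are invented for illustration and are not Purdue’s configuration.

```python
import os

# Standard Kickstart directives; the specific values are placeholders.
KICKSTART_TEMPLATE = """\
text
lang en_US.UTF-8
keyboard us
network --bootproto=dhcp --hostname={hostname}
rootpw --iscrypted {root_hash}
clearpart --all --initlabel
autopart
reboot

%packages
@base
openssh-server
%end
"""


def write_kickstart(hostname: str, root_hash: str, out_dir: str = "ks") -> str:
    """Render one node's Kickstart file; PXE boot then points the
    installer at it, so the node installs itself unattended."""
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"{hostname}.cfg")
    with open(path, "w") as f:
        f.write(KICKSTART_TEMPLATE.format(hostname=hostname,
                                          root_hash=root_hash))
    return path


if __name__ == "__main__":
    # Hypothetical node names; a real build would enumerate every rack.
    for i in range(1, 6):
        print(write_kickstart(f"node-{i:04d}", root_hash="$6$salt$examplehash"))
```

Kickstart handles only the base install; a policy engine such as Cfengine then converges every node to the same declared state, which is what keeps hundreds of machines identical for the life of the cluster.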

About the Author

Meg Lloyd is a Northern California-based freelance writer.
