Putting Advanced Computing Power Within Reach
Indiana University led the creation of an on-demand cloud platform that extends scientific and research computing resources to more higher education communities.
Category: IT Infrastructure and Systems
Institution: Indiana University
Project: Jetstream: A Cloud System Enabling Learning in Higher Education Communities
Project lead: David Y. Hancock, program director, Advanced Cyberinfrastructure, Pervasive Technology Institute
Tech lineup: Ceph, Dell, MathWorks, OpenStack, Red Hat
Indiana University's Pervasive Technology Institute is leading an NSF grant (currently valued at more than $13 million) to create, implement and operate Jetstream, a user-friendly cloud environment designed to give more researchers and students at higher education institutions access to high-end computing resources on demand, from their tablets, laptops or desktop computers.
This effort by the IU Jetstream team and partnering institutions (the Texas Advanced Computing Center, the University of Chicago, Johns Hopkins University, the University of Arizona, the University of Texas at San Antonio, and Cornell University, among others) puts advanced computing power within reach of research, science and education programs that previously did not have access to comparable resources.
In 2016, when Jetstream entered its first few months of full production as an NSF-funded cloud for scientific and engineering research, the platform supported just over a dozen education initiatives across the U.S. Today, it supports 409 active projects and serves nearly 2,400 researchers and 900 students, representing 75 fields of science and 180 institutions.
These and other data confirm that Jetstream is filling a gap in the nation's cyberinfrastructure for research and education. Jetstream takes a different approach from previously established scientific platforms. Many thousands of researchers already have access to advanced computational resources, including high-performance computing (HPC) and high-throughput computing (HTC), via commercial products or through funded programs within NSF's national cyberinfrastructure (CI). Jetstream goes where those programs do not reach.
"What sets Jetstream apart is the goal to create a different type of resource for research and education," explained David Y. Hancock, program director for Advanced Cyberinfrastructure at IU's Pervasive Technology Institute. Hancock is project lead for the Jetstream team and principal investigator for the NSF-funded initiative.
Jetstream's strategic mission is to extend the computing power of NSF's eXtreme Digital (XD) program to education and research communities that have not typically used advanced CI computing resources. To date, Jetstream users have included, among others, Historically Black Colleges and Universities (HBCUs), Minority Serving Institutions (MSIs), Tribal colleges, and higher education institutions in states designated by the NSF as eligible for funding via the Established Program to Stimulate Competitive Research (EPSCoR).
Among Jetstream's most important design goals is an easy-to-use, self-service, on-demand graphical interface. Another key design element is the choice it gives users: they can select from stable, prepared virtual machines (VMs) or create their own VMs and customized workflows. In either case, a VM might include research software from the "Featured Images" managed and maintained by the Jetstream team, with important tools such as MathWorks' MATLAB, Galaxy, RStudio and many more. The high-end research tools available through Jetstream make it easier for researchers to conduct, visualize and share their work, enabling a new level of secure data exchange and professional communications.
The platform runs on Dell hardware, identically replicated at two geographically separate locations in the U.S.: Jetstream-IU, operated by the Indiana University Pervasive Technology Institute in Bloomington, Indiana; and Jetstream-TACC, operated by the Texas Advanced Computing Center at the University of Texas at Austin.
Compute nodes at each of the two data centers comprise 320 Dell M630 blades running a total of 640 CPUs, for a peak processing capability of 258 TFLOPS per location. Jetstream uses the OpenStack software environment, with Ceph (whose commercial developer, Inktank, was acquired by Red Hat) as the storage software. Storage resides on 20 Dell R730 servers at each location, for an aggregate of 960 TB of raw storage per site.
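The per-site hardware figures quoted above imply a few derived numbers, which the short Python sketch below works out. It uses only the article's counts; the constant and function names are illustrative, and the two-site totals simply assume the sites are identical, as the article states. Storage is raw capacity: usable capacity depends on Ceph replication or erasure-coding settings, which the article does not give.

```python
# Per-site figures as quoted in the article (names are illustrative).
BLADES_PER_SITE = 320            # Dell M630 compute blades
CPUS_PER_SITE = 640              # CPU sockets across those blades
PEAK_TFLOPS_PER_SITE = 258       # peak processing capability
STORAGE_SERVERS_PER_SITE = 20    # Dell R730 storage servers
RAW_STORAGE_TB_PER_SITE = 960    # raw (not usable) Ceph capacity
SITES = 2                        # Jetstream-IU and Jetstream-TACC


def derived_figures():
    """Return numbers implied by the quoted per-site hardware counts."""
    return {
        "cpus_per_blade": CPUS_PER_SITE // BLADES_PER_SITE,
        "peak_tflops_per_blade": PEAK_TFLOPS_PER_SITE / BLADES_PER_SITE,
        "raw_tb_per_storage_server": RAW_STORAGE_TB_PER_SITE // STORAGE_SERVERS_PER_SITE,
        "total_peak_tflops": PEAK_TFLOPS_PER_SITE * SITES,
        "total_raw_storage_tb": RAW_STORAGE_TB_PER_SITE * SITES,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value}")
```

Read this way, each blade is a dual-socket node, each storage server holds 48 TB of raw disk, and the two sites together offer 516 TFLOPS of peak compute and 1,920 TB of raw storage.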
By March 2018, allocations awarded on Jetstream for education program development and limited engagement workshops, as well as for semester-long courses, had reached more than 7 million CPU hours. Despite that demonstrated success, the team's emphasis remains squarely on outreach and education. As Hancock pointed out, "The most user-friendly NSF-funded system ever created will not be a success without users."
As a user-friendly research and education cloud, Jetstream is supported by experienced technical staff as well as education, outreach and training teams. "Jetstream was designed from the start to focus on science, run by staff who are experienced in supporting researchers and educators," said Hancock. Funding for Jetstream operations has been extended through November 2020. Workshops, seminars and conference presentations continue to attract new research and education users and boost Jetstream's ability to serve higher education's need for wider access to high-end compute power, a need that had gone unmet in previous years.