HPC-Powered Science Gateways Open Doors to Discoveries
New technologies in high-performance computing are having a direct effect on scientific research, one that goes beyond the traditional picture of big-iron supercomputers crunching science jobs. These technologies are approaching a kind of critical mass, and they are changing how scientific disciplines do their work. We expect them to serve as science accelerators, quickening the pace of discovery in the near future.
Recently, in a series of grants, the National Science Foundation awarded Purdue University almost $25 million to further its work on cyberinfrastructure, with an emphasis on science gateways, also known as hubs. The best known of these gateways is nanoHUB.org, but part of the NSF funding will go toward developing many more over the next couple of years. Sites on the drawing board that I have seen address medical science, cancer research, climate modeling, pharmaceutical research, and heat transfer in engineering.
These hubs are collaborative sites made up of user-supplied content, analogous to Web 2.0 sites like Facebook or YouTube, but with a much different purpose.
In many disciplines, scientists have to learn how to use software tools to run simulations or do modeling. These tools can be highly specific to the type of work being done, and researchers might spend as much time learning the software as running the job. Hubs do away with that learning curve. Instead, researchers go to a Web site, enter their data, and manipulate the model (or graph) in near real time.
These research tools, of which there are hundreds, are built by scientists or groups of scientists and put on the hubs for the community to use. Where researchers once had to know one another before they could collaborate, they are now collaborating and sharing resources with people they may never meet. And because there is no need to learn a tool before using it, researchers are freer to explore and to test their data against a variety of simulation tools and input parameters.
But all this doesn't happen "automagically." These kinds of simulations require enormous computing resources. Behind the Web site, sophisticated middleware is looking for available supercomputer time and sending the job to be computed.
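To make that brokering idea concrete, here is a deliberately simplified sketch in Python of what such middleware does conceptually: jobs arrive from the web front end, and a broker hands each one to whichever compute site has free capacity. The class names, the slot-counting scheme, and the example tool are all invented for illustration; this is not the actual gateway or TeraGrid software, which handles far more (authentication, data movement, batch queues, failures).

```python
# Hypothetical sketch of a science-gateway job broker.
# None of these names correspond to real gateway software.

from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ComputeResource:
    name: str           # a compute site, e.g. a TeraGrid member
    total_slots: int    # how many jobs it can run at once
    busy_slots: int = 0

    def has_capacity(self) -> bool:
        return self.busy_slots < self.total_slots


@dataclass
class SimulationJob:
    tool_name: str      # which hub tool the researcher used
    input_params: dict  # parameters entered on the web site


class GatewayBroker:
    """Matches user-submitted jobs to whichever resource has free slots."""

    def __init__(self, resources: List[ComputeResource]):
        self.resources = resources
        self.queue: List[SimulationJob] = []

    def submit(self, job: SimulationJob) -> None:
        # The researcher never sees this step; the web tool calls it.
        self.queue.append(job)

    def dispatch(self) -> Optional[str]:
        # Send the oldest queued job to the first resource with capacity.
        if not self.queue:
            return None
        for resource in self.resources:
            if resource.has_capacity():
                job = self.queue.pop(0)
                resource.busy_slots += 1
                return f"{job.tool_name} -> {resource.name}"
        return None  # everything is busy; the job waits in the queue


if __name__ == "__main__":
    broker = GatewayBroker([ComputeResource("site-A", 2),
                            ComputeResource("site-B", 4)])
    broker.submit(SimulationJob("nanowire-sim", {"length_nm": 40}))
    print(broker.dispatch())  # e.g. "nanowire-sim -> site-A"
```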
The computing power is supplied by the nation's top research institutions, which are linked together via the NSF's TeraGrid. This new (at least to the general public) computing network conjures up the old label "information superhighway," but Al Gore's superhighway was a dirt path compared with the information being shared on the TeraGrid.
Again, the acquisition of these resources is invisible to scientists using a tool on the hub. In the past, they would have had to schedule time on a supercomputer somewhere and wait their turn to run a simulation; now the job starts almost immediately, without the researchers ever having to request computing time themselves.
The big story, in my view, is that scientists are no longer working as lone bench scientists, or even in small groups; they are drawing on the resources of the entire community. The result is a set of new virtual national laboratories, if you will, being built on the Internet.
We don't know what scientific discoveries await, but it will be an exciting development to watch.