Universities Access IBM/Google Cloud Compute Cluster for NSF-Funded Research

The National Science Foundation (NSF) recently announced approximately $5 million in new grants for universities to access the IBM/Google cloud computing cluster and continue their research on it, bringing the number of universities with NSF Cluster Exploratory (CLuE) program grants to 14. IBM and Google began collaborating on an IBM/Google Cloud Computing University Initiative in 2007 to help computer science students gain the skills they need to build cloud applications. Now, the NSF is tapping the same cloud infrastructure to support university research in data-intensive computing in a range of scientific and engineering areas. For example, the University of Utah and the University of Washington are working to expand the capabilities of VisTrails, a system developed at the University of Utah to create high-quality visualizations from very large datasets (pictured).

The NSF-funded university research projects, which use software and services on the IBM/Google cloud infrastructure, include:

Carnegie Mellon University received an award in 2008 to develop more effective processing of Web searches; its 2009 award focuses on machine translation using the Integrated Cluster Computing Architecture (INCA).

With funding that began in 2008, Florida International University is leveraging the Hadoop framework to provide a distributed file system that supports analysis of aerial images and related objects, opening up new potential for high-performance geospatial querying.

The Massachusetts Institute of Technology, the University of Wisconsin-Madison, and Yale University are collaborating on a study of cluster-based, large-scale data analysis, comparing Google's MapReduce with other parallel database approaches.

Purdue University is investigating extensions to MapReduce for programming large-scale, distributed systems and applications that manipulate large, unstructured graphs.

The University of California, Irvine is conducting research to support fuzzy queries on large text repositories.

The University of California, San Diego and the San Diego Supercomputer Center are investigating the management and processing of massive spatial data sets on large-scale compute clusters.

At the University of California, Santa Barbara, the Massive Graphs in Clusters (MAGIC) project is developing software to query very large graph datasets efficiently, with implications for highly connected data (such as social networks).

The University of Maryland-College Park received an award in 2008 for machine translation; its 2009 award focuses on developing parallel algorithms for analyzing new-generation sequencing data.

At the University of Massachusetts-Amherst, researchers at the Center for Intelligent Information Retrieval (CIIR) are applying CLuE infrastructure to explore word relationships and how they can be used in pre-processing and at search time to improve results from Web searches.

The University of Virginia is exploring super-resolution derived by "data-driven image zoom," a process that intelligently enlarges a digital image by computing a new image based in part on patches taken from a 50-million-image database.

The University of Washington is using MapReduce to index, access, and analyze astronomical images derived from petascale datasets. The university also received funding in 2008 for its work on preparing students and instructors for large-scale cluster computing.

The University of Washington and the University of Utah are collaborating on new infrastructure for computational oceanography that leverages the CLuE platform and extends two existing systems: GridFields, a library for manipulation of simulation results; and VisTrails, a comprehensive platform for scientific workflow.

IBM Cloud Labs Vice President Willy Chiu commented on his company's support of the research projects: "IBM is intensely focused on applying technology and science to make the world work better."

Jeff Walz, director of University Relations at Google, reflected on the impact for both research and education, saying, "The movement of the cloud computing model into research could have a tremendous transformative [effect] both on the education side and on the research side." Walz explained that Google has provided a dedicated data center (located in the United States and not part of Google's regular cloud, which is distributed throughout the world), and IBM has provided software. Regarding the future of the collaboration, Walz noted, "We have a three-year commitment to keep it going and work with the NSF and IBM, so we hope to have more grants in the future."

[Photo by Juliana Freire and Claudio Silva. Courtesy University of Utah and PRNewsFoto/IBM Corporation.]
