Clemson U Scientists to Build Cyberinfrastructure System for Large-Scale Data Analysis

Clemson scientists Alex Feltus (left) and Melissa Smith (photo courtesy of Jim Melvin, Jan Lay / Clemson University)

Scientists at Clemson University are working on a system that will help improve and simplify large-scale data analysis around the world. The NSF-funded project, called Scientific Data Analysis at Scale (SciDAS), aims to "help current researchers and future innovators discover data, move it smoothly across advanced networks, and improve flexibility and accessibility to national and global resources," according to a news release.

"Many fields are awash with huge datasets," said principal investigator Alex Feltus, associate professor of genetics and biochemistry in Clemson University's College of Science. "This is certainly true of biology and hydrology, but it also includes researchers who are studying satellite imagery, remote sensors and education analytics, to name a few. Today's scientists are now required to understand both the underlying science and the cyberinfrastructure ecosystem to design and execute mind-bogglingly complex computations. SciDAS will combine new software with existing software to construct a system that will be efficient, practical and user-friendly."

SciDAS will bring together multiple national cyberinfrastructure resources, including NSF Clouds, the Open Science Grid, the Extreme Science and Engineering Discovery Environment, petascale supercomputers such as COMET, Clemson's Palmetto Cluster and other nationwide university resources. In addition, Internet2's cyberteam will help optimize end-to-end data transfer rates. By exploiting the distributed and scalable nature of these resources — in terms of both data sharing and compute infrastructure — the researchers expect to boost data analysis performance and scientific productivity.

The result: "SciDAS will enable a broad range of scientists to not only get information faster but also to use much larger datasets and tease out information that they might not even know exists," according to the news announcement.

"A key aspect of the SciDAS team is that we'll be processing scientific data at the same time that we're gluing together all the parts needed for a national cyberinfrastructure ecosystem," noted Feltus. "We're trying to avoid the problem of 'if you build it they will come' and instead enlist the input of a variety of scientists to join us on the ground floor and help us build it. Thus, our software will be refined by using real data by real users with real habits."

Feltus and co-principal investigator Melissa Smith, associate professor in the Holcombe Department of Electrical and Computer Engineering in Clemson's College of Engineering, Computing and Applied Sciences, discuss their work in the video below.

About the Author

Rhea Kelly is editor in chief for Campus Technology, THE Journal, and Spaces4Learning. She can be reached at rkelly@1105media.com.
