Big Data in Life Science Research Demands Advanced Networking, Computing Capabilities

Researchers in genomics, medicine and other life sciences are using big data to tackle big issues, but big data requires more networking and computing power. At the Internet2 Global Summit, taking place in Washington, D.C., April 26-30, researchers in the life sciences will meet with engineers and technology leaders in the research and education community to discuss the advancement of IT infrastructure and applications for big data.

Researchers at Clemson University are using big data in the field of genomics to develop new varieties of agricultural crops that address the issues of population pressure, bioenergy, food security and climate change. And researchers in Arizona State University's (ASU) College of Life Sciences' Complex Adaptive Systems Science program are using big data in the field of precision medicine, managing and analyzing genomic information and its associated imaging data to develop disease treatments customized to the molecular makeup of the patient and the disease.

According to Alex Feltus, an associate professor of genetics and biochemistry at Clemson University, using big data for life sciences research demands new methods of data storage and transfer. "Of course we need bigger boxes, but we also need faster ways to put stuff into them," said Feltus in a prepared statement. "There is a serious data transfer bottleneck at the network-hard-drive interface. Thus, we need faster, reasonably priced storage that can keep up with the advanced networks such as the Internet2 Network."

Feltus and other researchers at Clemson use the university's high-performance computing (HPC) resource, Palmetto, and the Internet2 Network to collaborate with other research teams across the country. "You can process data on the fastest nodes in the world, but it's pointless for real-time applications if the supercomputer is hooked up to a slow pipe," said Feltus.
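Feltus's point lends itself to back-of-envelope arithmetic. The short Python sketch below compares how long a sequencing-scale transfer would take over a fast research link versus a single hard drive; the dataset size and throughput figures are illustrative assumptions for this article, not measurements from Palmetto or the Internet2 Network.

    # Back-of-envelope comparison of network vs. disk throughput for a
    # genomics-scale transfer. All figures are illustrative assumptions,
    # not measurements from any system named in this article.

    dataset_tb = 10                      # assumed dataset size in terabytes
    dataset_bytes = dataset_tb * 1e12

    # Nominal sustained rates, converted to bytes per second (assumptions).
    rates = {
        "100 Gbps research network": 100e9 / 8,         # ~12.5 GB/s
        "10 Gbps campus uplink": 10e9 / 8,              # ~1.25 GB/s
        "single hard drive (sustained write)": 0.15e9,  # ~150 MB/s
    }

    for name, bytes_per_sec in rates.items():
        hours = dataset_bytes / bytes_per_sec / 3600
        print(f"{name}: ~{hours:.1f} hours to move {dataset_tb} TB")

Under these assumptions, the network can deliver data roughly an order of magnitude faster than a lone disk can write it, which is the network-hard-drive mismatch Feltus describes, and it is why HPC sites typically pair fast networks with parallel file systems rather than single drives.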

ASU has developed the Next Generation Cyber Capability (NGCC) data science research instrument to reduce processing time for big data and HPC research. According to Jay Etchings, director of operations for research computing and senior HPC architect at ASU, the NGCC is a new model of computing that enables the integration of multidimensional molecular and clinical data required for precision medicine.

According to information from Internet2, the NGCC supports big data research through a connection to the ultrahigh-bandwidth Internet2 Network; large-scale storage; the integration of multiple types of computation, including utility computing, HPC and big data; and advanced logical capabilities such as software-defined storage and networking, metadata processing and semantics.

Feltus and Etchings will both be presenters at the Internet2 Global Summit.

About the Author

Leila Meyer is a technology writer based in British Columbia. She can be reached at [email protected].
