News

Caltech and Partners Set Data-Transfer World Record

A team of physicists, computer scientists, and network engineers led by the California Institute of Technology (Caltech)--with partners from a number of other universities and science organizations--set new records for sustained data transfer among storage systems during the SuperComputing 2008 conference held in Austin, TX.

The effort achieved a bidirectional peak throughput of 114 Gbps and a sustained data flow of more than 110 Gbps among clusters of servers at the conference itself as well as Caltech, Michigan, CERN in Geneva, Fermilab in Batavia, Brazil, Korea, Estonia, and locations in the USLHCNet network in Chicago, New York, Geneva, and Amsterdam. The demonstration was intended to show that a well designed and configured single rack of servers is capable of saturating the highest-speed wide-area network links in production use today, which have a capacity of 40 Gbps in each direction.

The setup, which took three days to build, used a dozen 10-Gbps wide-area network links to feed data to the event and 14 different providers to maintain connections to external servers, as well as equipment encompassing two Cisco 6500E series switch-routers, and a hundred 10 gigabit Ethernet server interfaces provided by Myricom and Intel, two fiber channel S2A9900 storage platforms provided by DataDirect Networks outfitted with 8 Gbps host bus adapters from QLogic, along with five X4500 and X4540 disk servers from Sun Microsystems. The computational nodes consisted of 32 widely available dual-motherboard Supermicro servers housing 128 quad-core Xeon processors on 64 motherboards with a like number of 10-GbE interfaces, as well as Seagate SATA II disks providing 128 terabytes of storage.

A key element in the demonstration was Fast Data Transfer (FTD), an open-source Java application based on TCP, developed by the Caltech team in collaboration with the Politehnica Bucharest team. FTD runs on major platforms and works by streaming data across an open TCP socket, so that a large data set composed of thousands of files, as is typical in high-energy physics applications, can be sent or received at full speed, without the network transfer restarting between files, and without any packets being lost. FDT works with Caltech's MonALISA system to dynamically monitor the capability of the storage systems, as well as the network path, in real time, and sends data out to the network at a moderated rate that is matched to the capacity (measured in real time) of long-range network paths.

FDT was combined with an optimized Linux kernel, known as the "UltraLight kernel," provided by Shawn McKee, and the FAST TCP protocol stack developed by Steven Low, professor of computer science and electrical engineering at Caltech, to reach its sustained throughput level of 14.3 Gbps with a single rack of servers, limited by the speed of the disks.

"This achievement is an impressive example of what a focused network and storage system effort can accomplish," said McKee. McKee is a research scientist in the University of Michigan department of physics and leader of the UltraLight network technical group involved with an experiment taking place on the world's largest particle accelerator, located at CERN. "It is an important step towards the goal of delivering a highly capable end-to-end network-aware system and architecture that meet the needs of next-generation e-science."

comments powered by Disqus