High-Performance Computing

New Texas A&M Supercomputer 'Grace' Goes Online in December

Texas A&M's Grace supercomputer

Texas A&M University is getting a new supercomputer. The system, which goes online for researchers in December, will be nearly 20 times more powerful than the one it replaces. Named "Grace" in memory of Rear Admiral Grace Hopper, it succeeds "Ada" (named for Ada Lovelace), which has been the lead supercomputer at the university's High Performance Research Computing center since 2014.

As the university explained in a press release, supercomputer computing power is measured as "flops," or "floating point operations per second." A floating-point operation is any calculation that involves numbers with decimal points. One trillion flops equal a teraflop; a thousand teraflops equal a petaflop. While Ada can process up to 337 teraflops, Grace will handle up to 6.2 petaflops.
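The unit arithmetic can be checked with a short sketch. The two peak figures are the ones quoted in this article; the constant and function names are illustrative only and do not come from the university's release.

```python
# Flops per teraflop and per petaflop, as defined in the paragraph above.
TERA = 10**12
PETA = 10**15

# Peak figures quoted in the article.
ada_flops = 337 * TERA      # Ada: 337 teraflops
grace_flops = 6.2 * PETA    # Grace: 6.2 petaflops


def speedup(new_flops: float, old_flops: float) -> float:
    """Return how many times faster the new peak is than the old one."""
    return new_flops / old_flops


ratio = speedup(grace_flops, ada_flops)
print(f"Grace peak in teraflops: {grace_flops / TERA:.0f}")
print(f"Grace-to-Ada peak ratio: {ratio:.1f}x")
```

On these numbers Grace's peak works out to 6,200 teraflops, about 18 times Ada's, which is in line with the roughly twentyfold increase the university cites.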

The new capacity will support Texas A&M research in numerous fields, including drug design, materials science, artificial intelligence and machine learning, geosciences, fluid dynamics, biomedical applications, biophysics, genetics, quantum computing, data analytics, population informatics and autonomous vehicles.

According to Honggao Liu, executive director of the center, the supercomputer will enhance the university's research capabilities and competitiveness and enable researchers "to keep pace with current trends in research computing technologies." The extra capacity is needed for a growing number of research projects. In the last four years, the center has seen a doubling of its user base, from about 1,300 in 2016 to more than 2,600 this year.

"HPRC has a mission to infuse computational and data analysis technologies into the research and creative activities of every academic discipline at Texas A&M," Liu noted. "We support compute- and data-intensive workloads and enable researchers to use cutting-edge processor, accelerator and data analytic technologies to solve complex research problems. In this era of converged demand for advanced computing resources, a new supercomputer like Grace is needed to support complex workflows and allow researchers to continue in their pursuit of discoveries and inventions."

The system combines technology from several companies. While Dell is the primary vendor, the CPUs come from Intel, the storage system from DataDirect Networks (DDN) and the graphics processing units and InfiniBand interconnect network from NVIDIA.

Specifications include:

  • Dell EMC PowerEdge servers with 2nd Gen Intel Xeon Scalable processors, making up 800 regular compute nodes, five login nodes and six management nodes;
  • 100 double-precision NVIDIA A100 GPU compute nodes, eight single-precision NVIDIA T4 GPU compute nodes and nine single-precision NVIDIA RTX 6000 Tensor Core GPU compute nodes;
  • An NVIDIA Mellanox HDR100 InfiniBand network; and
  • 5.12 petabytes of high-performance DDN EXAScaler ES7990X storage running the EXAScaler parallel filesystem.

Each node is outfitted with dual 2nd-generation Intel Xeon Scalable 24-core 3.0GHz processors and 384GB of DDR4 3200MHz memory.

Eight large-memory nodes have four 2nd-generation Intel Xeon Scalable 20-core 2.5GHz processors and 3.072 terabytes of DDR4 3200MHz memory.
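For a sense of scale, the per-node figures can be multiplied out. The sketch below is a back-of-the-envelope tally, not an official specification: it assumes the dual-socket, 384GB configuration applies to the 800 regular compute nodes only, leaves out the GPU, login and management nodes, and uses decimal units (1 terabyte = 1,000 gigabytes). The `NodeType` class is a hypothetical helper introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    """Hypothetical helper describing one class of node in the cluster."""
    count: int             # number of nodes of this type
    sockets: int           # processors per node
    cores_per_socket: int  # cores per processor
    memory_gb: int         # memory per node, in gigabytes

    @property
    def total_cores(self) -> int:
        return self.count * self.sockets * self.cores_per_socket

    @property
    def total_memory_tb(self) -> float:
        # Decimal conversion: 1 TB = 1,000 GB.
        return self.count * self.memory_gb / 1000


# Assumption: "each node" means the 800 regular compute nodes.
regular = NodeType(count=800, sockets=2, cores_per_socket=24, memory_gb=384)
# Eight large-memory nodes: four 20-core processors, 3.072 TB (3,072 GB) each.
large = NodeType(count=8, sockets=4, cores_per_socket=20, memory_gb=3072)

for name, node in (("regular", regular), ("large-memory", large)):
    print(f"{name}: {node.total_cores} cores, {node.total_memory_tb:.3f} TB")
```

Under those assumptions the regular nodes contribute 38,400 cores and 307.2 terabytes of memory, and the large-memory nodes add 640 cores and about 24.6 terabytes.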

Funding for Grace came from Texas A&M as well as the university's Research Development Fund, Health Science Center, Engineering Experiment Station, Transportation Institute and several individual faculty members in the Colleges of Engineering and Science.

About the Author

Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.
