New Texas A&M Supercomputer 'Grace' Goes Online in December

Texas A&M's Grace supercomputer

Texas A&M University is getting a new supercomputer. The latest system, which goes online for researchers in December, will be roughly 20 times more powerful than the supercomputer it's replacing. "Grace," named in honor of Rear Admiral Grace Hopper, will replace "Ada" (named for Ada Lovelace), which has served as the lead supercomputer at the university's High Performance Research Computing center since 2014.

As the university explained in a press release, supercomputer computing power is measured as "flops," or "floating point operations per second." A floating-point operation is any calculation that involves numbers with decimal points. One trillion flops equal a teraflop; a thousand teraflops equal a petaflop. While Ada can process up to 337 teraflops, Grace will handle up to 6.2 petaflops.
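To make the unit arithmetic concrete, here is a minimal Python sketch that converts both peak figures to plain flops and compares them. The variable and function names are illustrative only; the numbers are the ones quoted above. The ratio comes out a little above 18, which is in line with the rounded 20-times figure in the announcement.

```python
# Unit definitions as given in the article:
# 1 teraflop = one trillion flops, 1 petaflop = a thousand teraflops.
TERA = 10**12
PETA = 1000 * TERA

ada_peak_flops = 337 * TERA      # Ada: up to 337 teraflops
grace_peak_flops = 6.2 * PETA    # Grace: up to 6.2 petaflops


def speedup(new_flops, old_flops):
    """Return how many times larger the new peak rate is than the old one."""
    return new_flops / old_flops


ratio = speedup(grace_peak_flops, ada_peak_flops)
print(f"Grace/Ada peak ratio: {ratio:.1f}x")
```

These are peak rates, so the ratio describes theoretical capacity rather than the speedup any particular research workload would see.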

The new capacity will support Texas A&M research in numerous fields, including drug design, materials science, artificial intelligence and machine learning, geosciences, fluid dynamics, biomedical applications, biophysics, genetics, quantum computing, data analytics, population informatics and autonomous vehicles.

According to Honggao Liu, executive director of the center, the supercomputer will enhance the university's research capabilities and competitiveness and enable researchers "to keep pace with current trends in research computing technologies." The extra capacity is needed for a growing number of research projects. In the last four years, the center has seen a doubling of its user base, from about 1,300 in 2016 to more than 2,600 this year.

"HPRC has a mission to infuse computational and data analysis technologies into the research and creative activities of every academic discipline at Texas A&M," Liu noted. "We support compute- and data-intensive workloads and enable researchers to use cutting-edge processor, accelerator and data analytic technologies to solve complex research problems. In this era of converged demand for advanced computing resources, a new supercomputer like Grace is needed to support complex workflows and allow researchers to continue in their pursuit of discoveries and inventions."

The system combines technology from several companies. While Dell is the primary vendor, the CPUs come from Intel, the storage system from DataDirect Networks (DDN) and the graphics processing units and InfiniBand interconnect network from NVIDIA.

Specifications include:

  • Dell EMC PowerEdge servers with 2nd Gen Intel Xeon Scalable processors, making up 800 regular compute nodes, five login nodes and six management nodes;
  • 100 double-precision NVIDIA A100 compute nodes, eight single-precision NVIDIA T4 GPU compute nodes, and nine single-precision NVIDIA RTX 6000 Tensor Core GPUs;
  • An NVIDIA Mellanox HDR100 InfiniBand network; and
  • 5.12 petabytes of high-performance DDN EXAScaler ES7990X storage running the EXAScaler parallel filesystem.

Each node is outfitted with dual 2nd-generation Intel Xeon Scalable 24-core 3.0 GHz processors and 384GB of DDR4 3200MHz memory.

Eight large-memory nodes have four 2nd-generation Intel Xeon Scalable 20-core 2.5 GHz processors and 3.072 terabytes of DDR4 3200MHz memory.
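The per-node figures can be rolled up into rough system totals. The Python sketch below does that under two assumptions that are not stated in the announcement: that the "each node" description applies to the 800 regular compute nodes only (GPU, login and management nodes are left out), and that 1 terabyte is counted as 1,000 gigabytes. The `NodeType` class and its field names are hypothetical, chosen just for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    """One class of node, described by the figures in the spec list."""
    name: str
    count: int
    sockets: int
    cores_per_socket: int
    memory_gb: float

    @property
    def total_cores(self) -> int:
        return self.count * self.sockets * self.cores_per_socket

    @property
    def total_memory_tb(self) -> float:
        # Assumption: decimal units, 1 TB = 1,000 GB.
        return self.count * self.memory_gb / 1000


# Assumption: "each node" refers to the 800 regular compute nodes.
regular = NodeType("regular compute", count=800, sockets=2,
                   cores_per_socket=24, memory_gb=384)
# 3.072 TB per large-memory node, expressed in GB.
large = NodeType("large memory", count=8, sockets=4,
                 cores_per_socket=20, memory_gb=3072)

for node in (regular, large):
    print(f"{node.name}: {node.total_cores} cores, "
          f"{node.total_memory_tb:.1f} TB memory")
```

Under those assumptions the regular compute nodes account for 38,400 CPU cores and the large-memory nodes for another 640.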

Funding for Grace came from Texas A&M as well as the university's Research Development Fund, Health Science Center, Engineering Experiment Station, Transportation Institute and several individual faculty members in the Colleges of Engineering and Science.

About the Author

Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.
