Integration Brings Cerebras Inference Capabilities to Hugging Face Hub

AI hardware company Cerebras has teamed up with Hugging Face, the open source platform and community for machine learning, to integrate its inference capabilities into the Hugging Face Hub. This collaboration provides more than 5 million developers with access to models running on Cerebras' CS-3 system, the companies said in a statement, with reported inference speeds significantly higher than conventional GPU solutions.

Cerebras Inference, now available on Hugging Face, processes more than 2,000 tokens per second. Recent benchmarks indicate that models such as Llama 3.3 70B running on Cerebras' system can reach speeds exceeding 2,200 tokens per second, offering a performance increase compared to leading GPU-based solutions.

"By making Cerebras Inference available through Hugging Face, we are enabling developers to access alternative infrastructure for open source AI models," said Andrew Feldman, CEO of Cerebras, in a statement.

For Hugging Face's 5 million developers, this integration provides a streamlined way to leverage Cerebras' technology. Users can select "Cerebras" as their inference provider within the Hugging Face platform, instantly accessing one of the industry's fastest inference capabilities.

The demand for high-speed, high-accuracy AI inference is growing, especially for test-time compute and agentic AI applications. Open source models optimized for Cerebras' CS-3 architecture enable faster and more precise AI reasoning, the companies said, with speed gains ranging from 10 to 70 times compared to GPUs.

"Cerebras has been a leader in inference speed and performance, and we're thrilled to partner to bring this industry-leading inference on open source models to our developer community," commented Julien Chaumond, CTO of Hugging Face.

Developers can access Cerebras-powered AI inference by selecting supported models on Hugging Face, such as Llama 3.3 70B, and choosing Cerebras as their inference provider.

About the Author

John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS.  He can be reached at [email protected].

Featured

  • interconnected glowing nodes and circuits in blue and green, forming a neural network on a dark background with a futuristic design

    Tech Giants Launch $100 Billion AI Infrastructure Network Project

    OpenAI, SoftBank, and Oracle have unveiled a new venture, Stargate, through which they aim to build a massive AI infrastructure network across the United States. The initiative, which was announced at the White House with President Donald Trump, has been described as the "largest AI infrastructure project in history."

  • glowing crystal ball with a simplified university building inside, surrounded by seamlessly blended holographic symbols of binary code, a bar graph, database icons, and a cloud, against a gradient blue and white background with softly merging circuit patterns

    3 Areas Where AI Will Impact Higher Ed Most in 2025

    What should colleges and universities expect from the evolving landscape of artificial intelligence in the coming year? Here's what the experts told us.

  • glowing video screen with a play button, next to a floating holographic paper transcript displaying faint digital text

    3Play Media Launches AI-Enabled Accessibility Tools

    Accessibility provider 3Play Media has introduced new AI-enabled video accessibility solutions designed to help colleges and universities meet ADA Title II compliance regulations.

  • Two figures, one male and one female, stand beside a transparent digital interface displaying AI symbols like neural networks, code, and a shield, against a clean blue gradient background.

    Report Makes Business Case for Responsible AI

    A new report commissioned by Microsoft and published last month by research firm IDC notes that 91% of organizations use AI tech and expect more than a 24% improvement in customer experience, business resilience, sustainability, and operational efficiency due to AI in 2024.