Integration Brings Cerebras Inference Capabilities to Hugging Face Hub

AI hardware company Cerebras has teamed up with Hugging Face, the open source platform and community for machine learning, to integrate its inference capabilities into the Hugging Face Hub. This collaboration provides more than 5 million developers with access to models running on Cerebras' CS-3 system, the companies said in a statement, with reported inference speeds significantly higher than conventional GPU solutions.

Cerebras Inference, now available on Hugging Face, processes more than 2,000 tokens per second. Recent benchmarks indicate that models such as Llama 3.3 70B running on Cerebras' system can reach speeds exceeding 2,200 tokens per second, offering a performance increase compared to leading GPU-based solutions.

"By making Cerebras Inference available through Hugging Face, we are enabling developers to access alternative infrastructure for open source AI models," said Andrew Feldman, CEO of Cerebras, in a statement.

For Hugging Face's 5 million developers, this integration provides a streamlined way to leverage Cerebras' technology. Users can select "Cerebras" as their inference provider within the Hugging Face platform, instantly accessing one of the industry's fastest inference capabilities.

The demand for high-speed, high-accuracy AI inference is growing, especially for test-time compute and agentic AI applications. Open source models optimized for Cerebras' CS-3 architecture enable faster and more precise AI reasoning, the companies said, with speed gains ranging from 10 to 70 times compared to GPUs.

"Cerebras has been a leader in inference speed and performance, and we're thrilled to partner to bring this industry-leading inference on open source models to our developer community," commented Julien Chaumond, CTO of Hugging Face.

Developers can access Cerebras-powered AI inference by selecting supported models on Hugging Face, such as Llama 3.3 70B, and choosing Cerebras as their inference provider.

About the Author

John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS.  He can be reached at [email protected].

Featured

  • From Fire TV to Signage Stick: University of Utah's Digital Signage Evolution

    Jake Sorensen, who oversees sponsorship and advertising and Student Media in Auxiliary Business Development at the University of Utah, has navigated the digital signage landscape for nearly 15 years. He was managing hundreds of devices on campus that were incompatible with digital signage requirements and needed a solution that was reliable and lowered labor costs. The Amazon Signage Stick, specifically engineered for digital signage applications, gave him the stability and design functionality the University of Utah needed, along with the assurance of long-term support.

  • abstract illustration of a glowing pathway curving upward, with floating symbols of a graduation cap, a briefcase, and a tech icon, alongside two silhouetted figures

    Arkansas Community Colleges Tap Education Design Lab to Expand College-to-Career Pipeline

    A new program in Arkansas aims to create more community college pathways for learners to attain job-ready skills.

  • minimalist bookcase filled with textbooks featuring vibrant, solid-colored spines with no text, and a prominent number "25" displayed on one of the shelves

    OpenStax Celebrates 25th Anniversary

    OpenStax is celebrating its 25th anniversary as 2024 comes to a close. The open educational resources initiative from Rice University has served almost 37 million students in 153 countries and saved students nearly $3 billion in course material costs since its launch in 1999.

  • a glowing golden coin with a circuit board pattern, set against a gradient blue and white background with faint stock market graphs and metallic letters "AI" integrated into the design

    Google to Invest $1 Billion in AI Startup Anthropic

    Google is reportedly investing more than $1 billion in generative AI startup Anthropic, expanding its stake in one of Silicon Valley's leading artificial intelligence firms, according to a source familiar with the matter.