Snowflake Intros New Open Source Large Language Model Optimized for the Enterprise

Data-as-a-service provider Snowflake has introduced a new open source large language model (LLM) called Snowflake Arctic. Designed to be "the most open, enterprise-grade LLM on the market," Arctic has a unique Mixture-of-Experts (MoE) architecture optimized for complex enterprise workloads. In company tests, it led several industry benchmarks in areas including SQL code generation and instruction following.

The company is releasing Arctic's weights under an Apache 2.0 license, which permits ungated personal, research, and commercial use, along with details of the research behind how the model was trained. Snowflake also provides code templates and flexible inference and training options, enabling users to quickly deploy and customize Arctic using their preferred frameworks.

Arctic is immediately available for serverless inference in Snowflake Cortex, Snowflake's fully managed service offering machine learning and AI solutions in the Data Cloud. It will also be accessible on Amazon Web Services (AWS) and other model gardens and catalogs.

"This is a watershed moment for Snowflake, with our AI research team innovating at the forefront of AI," said Snowflake CEO Sridhar Ramaswamy, in a statement. "By delivering industry-leading intelligence and efficiency in a truly open way to the AI community, we are furthering the frontiers of what open source AI can do. Our research with Arctic will significantly enhance our capability to deliver reliable, efficient AI to our customers."

The Snowflake AI Research Team adopted an MoE strategy to craft a small yet adept language model. This "dense-MoE hybrid transformer architecture" draws on the work of the DeepSpeed team at Microsoft Research. It funnels training and inference tasks to 128 experts, a substantial increase compared to other MoEs, such as Databricks' DBRX and Mistral AI's Mixtral.

Arctic's dense-MoE hybrid transformer architecture combines a 10B dense transformer model with a residual 128×3.66B MoE MLP, resulting in 480B total parameters and 17B active parameters, with the active experts selected using top-2 gating. The company envisions Arctic as a versatile tool for companies to develop their own chatbots, co-pilots, and other GenAI applications.
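The headline figures can be checked with simple arithmetic. The sketch below is only a back-of-envelope calculation from the numbers quoted above. The constant and function names are illustrative and not taken from Snowflake's code, and it assumes the 3.66B figure is the size of one expert.

```python
# Back-of-envelope parameter count for a dense-MoE hybrid, using the
# figures quoted in the article. All values are in billions of parameters.
DENSE_B = 10.0      # dense transformer
NUM_EXPERTS = 128   # experts in the residual MoE MLP
EXPERT_B = 3.66     # assumed size of one expert
TOP_K = 2           # experts selected per token (top-2 gating)


def total_params_b() -> float:
    """Parameters stored in the model: the dense part plus every expert."""
    return DENSE_B + NUM_EXPERTS * EXPERT_B


def active_params_b() -> float:
    """Parameters used for one token: the dense part plus the selected experts."""
    return DENSE_B + TOP_K * EXPERT_B


if __name__ == "__main__":
    print(f"total  ~ {total_params_b():.2f}B")
    print(f"active ~ {active_params_b():.2f}B")
```

The totals come to roughly 478B and 17.3B, which round to the quoted 480B and 17B.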

All told, Arctic is equipped with 480 billion parameters, only 17 billion of which are used at any given time for training or inference. This approach helped to decrease resource usage compared to other similar models. For instance, compared to Llama 3 70B, Arctic consumed 16x fewer resources for training. DBRX, meanwhile, consumed 8x more resources than Arctic.
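To make the idea of "active" parameters concrete, here is a minimal sketch of top-2 gating in plain Python. It is a toy illustration of the general technique, not Arctic's implementation. The function names, the scalar "experts," and the choice to normalize weights with a softmax over the two selected scores are all assumptions made for the example. In a real model, the router scores would come from a learned layer applied to each token.

```python
import math
import random


def top2_gate(logits):
    """Return [(expert_index, weight), ...] for the two highest-scoring experts.

    Weights are a softmax over only the two selected logits, so they sum to 1.
    """
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:2]
    peak = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = [math.exp(logits[i] - peak) for i in top]
    norm = sum(exps)
    return [(i, e / norm) for i, e in zip(top, exps)]


def moe_layer(x, router_logits, experts):
    """Run only the two selected experts and mix their outputs by gate weight."""
    out = 0.0
    for idx, weight in top2_gate(router_logits):
        out += weight * experts[idx](x)
    return out


if __name__ == "__main__":
    random.seed(0)
    num_experts = 128
    # Toy experts: each one scales a scalar input by a fixed factor.
    experts = [lambda x, s=i + 1: s * x for i in range(num_experts)]
    logits = [random.gauss(0.0, 1.0) for _ in range(num_experts)]
    chosen = top2_gate(logits)
    print("experts used:", [i for i, _ in chosen], "of", num_experts)
    print("output:", moe_layer(1.0, logits, experts))
```

Only two of the 128 expert functions are ever called for a given input, which is why the compute per token tracks the active parameter count instead of the total.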

That frugality was intentional, said Yuxiong He, a distinguished AI software engineer at Snowflake and one of the DeepSpeed creators. "As researchers and engineers working on LLMs, our biggest dream is to have unlimited GPU resources," He said in a statement. "And our biggest struggle is that our dream never comes true."

Arctic's training process involved a "dynamic data curriculum" to emulate human learning patterns by adjusting the balance between code and language over time. Samyam Rajbhandari, a principal AI software engineer at Snowflake and another of DeepSpeed's creators, noted that this approach resulted in improved language and reasoning skills. Arctic was trained on a cluster of 1,000 GPUs over the course of three weeks, which amounted to a $2 million investment. But customers will be able to fine-tune Arctic and run inference workloads with a single server equipped with 8 GPUs, Rajbhandari said.
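For a rough sense of scale, the training figures above can be converted into GPU-hours. This is a back-of-envelope sketch only: it assumes exactly 1,000 GPUs running around the clock for exactly 21 days, and it assumes the $2 million covers GPU time alone, neither of which the article states. The function names are illustrative.

```python
GPUS = 1_000
WEEKS = 3
BUDGET_USD = 2_000_000


def gpu_hours(gpus: int, weeks: float) -> float:
    """Total GPU-hours, assuming every GPU runs 24 hours a day, 7 days a week."""
    return gpus * weeks * 7 * 24


def implied_rate_usd(budget_usd: float, gpus: int, weeks: float) -> float:
    """Implied cost per GPU-hour if the whole budget went to GPU time."""
    return budget_usd / gpu_hours(gpus, weeks)


if __name__ == "__main__":
    print(f"GPU-hours: {gpu_hours(GPUS, WEEKS):,.0f}")
    print(f"implied rate: ${implied_rate_usd(BUDGET_USD, GPUS, WEEKS):.2f} per GPU-hour")
```

Under those assumptions the run comes to about 504,000 GPU-hours, or a little under $4 per GPU-hour.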

Snowflake is expected to delve deeper into Arctic's capabilities at the upcoming Snowflake Data Cloud Summit, June 3-6 in San Francisco.

About the Author

John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at [email protected].
