
Snowflake launches Arctic, an open ‘mixture-of-experts’ LLM to take on DBRX, Llama 3

Robot holding a Snowflake
Image Credit: Venturebeat made with Ideogram



Today, Snowflake announced the launch of Arctic, a large language model (LLM) optimized for complex enterprise workloads such as SQL generation, code generation and instruction following.

Touted as the “most open enterprise-grade LLM” in the market, Arctic taps a unique mixture-of-experts (MoE) architecture to top benchmarks for enterprise tasks while remaining compute-efficient. It also delivers competitive performance across standard benchmarks, nearly matching open models from Databricks, Meta and Mistral at tasks revolving around world knowledge, common sense, reasoning and math capabilities.

“This is a watershed moment for Snowflake, with our AI research team innovating at the forefront of AI,” said Snowflake CEO Sridhar Ramaswamy. “By delivering industry-leading intelligence and efficiency in a truly open way to the AI community, we are furthering the frontiers of what open-source AI can do. Our research with Arctic will significantly enhance our capability to deliver reliable, efficient AI to our customers.”

The launch of the new model is also seen as an effort by Snowflake to remain competitive with Databricks, which has historically been aggressive in its AI efforts for customers leveraging its data platform. Snowflake’s AI endeavors have accelerated only recently, following the company’s acquisition of Neeva and the appointment of Ramaswamy as CEO.


Snowflake Arctic shoots for enterprise workloads

Modern enterprises are bullish on the potential of generative AI and are racing to build gen AI apps such as retrieval-augmented generation (RAG) chatbots, data copilots and code assistants. There are many models they can use to bring these use cases to life, but only a few are focused specifically on enterprise tasks. This is where Snowflake Arctic comes in.

“We are invested in AI because we think it is going to make the creation of end-to-end AI products much better. Our dream here is like an API that our customers can use so that business users can directly talk to data. That’s the ultimate goal of Snowflake democratized data within the enterprise. And we think this is a really important part of making that vision come true,” Ramaswamy said in a press briefing.

Arctic uses a dense-MoE hybrid architecture, where the parameters are divided into as many as 128 fine-grained expert subgroups. These experts – trained on a dynamic data curriculum – always sit ready, but each handles only the input tokens it can process most effectively. This means only a subset of the model’s parameters – 17 billion out of 480 billion – is activated in response to any given query, delivering targeted performance with minimal compute consumption.
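To make the routing idea concrete, here is a minimal, illustrative sketch of top-k expert routing in a generic MoE layer. This is not Arctic’s actual implementation – the function names, dimensions and the simple linear “experts” are assumptions for illustration – but it shows why only a fraction of parameters is touched per token: a small router scores all experts, and only the k highest-scoring ones run.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route one token through the top-k of many experts (illustrative sketch).

    x:       (d,) token embedding
    gate_w:  (d, n_experts) router weights
    experts: list of (d, d) weight matrices, one per expert

    Only the k selected experts run, so per-token compute scales
    with k rather than with the total number of experts.
    """
    logits = x @ gate_w                       # router score for every expert
    top = np.argsort(logits)[-k:]             # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                  # softmax over the selected experts only
    # Weighted sum of the chosen experts' outputs; all other experts stay idle.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

# Toy usage: 16 experts, but each token only activates 2 of them.
rng = np.random.default_rng(0)
d, n_experts = 8, 16
x = rng.normal(size=d)
gate_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
y = moe_forward(x, gate_w, experts, k=2)
print(y.shape)  # (8,)
```

In a model like Arctic the same principle applies at much larger scale: with 128 fine-grained experts but only a few active per token, the active-parameter count (17 billion) stays far below the total (480 billion).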

According to benchmarks shared by Snowflake, this approach already makes Arctic a strong performer on enterprise tasks, scoring an average of 65% across multiple tests. That matches the average enterprise performance of Llama 3 70B and sits just behind Mixtral 8X22B’s score of 70%.

Arctic’s performance across enterprise tasks. Credit: Snowflake

In the Spider benchmark for SQL generation, the model scored 79%, outperforming Databricks’ DBRX and Mixtral 8X7B and nearly matching Llama 3 70B and Mixtral 8X22B. In coding tasks, where the company considered an average of HumanEval+ and MBPP+ scores, it scored 64.3%, again surpassing Databricks and the smaller Mixtral model and trailing Llama 3 70B and Mixtral 8X22B.

However, the most interesting result came on the IFEval benchmark, which is designed to measure instruction-following capabilities. Arctic scored 52.4% on that test, doing better than most competitors except the latest Mixtral model.

Arctic performance across both enterprise and academic benchmarks. Credit: Snowflake

The company claims this level of enterprise intelligence has been achieved with breakthrough efficiency, using a training compute budget of just under $2 million. This is far less than the compute budgets of other open models; Llama 3 70B, for instance, was trained with 17x more compute. Additionally, the model uses only 17 billion active parameters to get these results, far fewer than other models put to use, which will further drive cost benefits.

Availability under Apache 2.0 license

Snowflake is making Arctic available inside Cortex, its own LLM app development service, and across other model gardens and catalogs, including Hugging Face, Lamini, Microsoft Azure, Nvidia API catalog, Perplexity and Together. On Hugging Face, Arctic model weights and code can be downloaded directly under an Apache 2.0 license that allows ungated use for personal, commercial or research applications. 

But that’s just one part of the company’s “truly open” effort. 

In addition to the model weights and code, the company is releasing a data recipe to help enterprises run efficient fine-tuning on a single GPU, along with comprehensive research cookbooks offering insights into how the model was designed and trained.

“The cookbook is designed to expedite the learning process for anyone looking into world-class MoE models. It offers high-level insights as well as granular technical details to craft LLMs like Arctic so that anyone can build their desired intelligence efficiently and economically,” Baris Gultekin, Snowflake’s head of AI, said in the press briefing. 
