Making good on his promise, Elon Musk’s AI company xAI released its first large language model (LLM), Grok, as open source today.
As Musk announced earlier in the week, the move lets any entrepreneur, programmer, company, or individual download and use Grok’s weights, the parameters that define the strength of the connections between the model’s artificial “neurons,” the software units through which it processes inputs, makes decisions, and produces text output. With the model weights and accompanying documentation now available, anyone can use Grok for any purpose, including commercial applications.
“We are releasing the base model weights and network architecture of Grok-1, our large language model,” the company announced in a blog post. “Grok-1 is a 314 billion parameter Mixture-of-Experts model trained from scratch by xAI.”
Those interested can download Grok’s code from its GitHub page and obtain the model weights via a torrent link.
What It Means That Grok Is Now Open Source
Parameters, which include weights and biases, define how a model behaves; generally, the more parameters, the more sophisticated and capable the model. With 314 billion parameters, Grok dwarfs open-source rivals such as Meta’s Llama 2 (70 billion parameters) and Mistral’s Mixtral 8x7B (roughly 47 billion parameters in total, with about 13 billion active per token).
Grok is released under the Apache License 2.0, which permits commercial use, modification, and redistribution of the model. The license grants no trademark rights and provides the software without warranty or liability, and users must retain the original license and copyright notices and clearly mark any changes they make.
Grok-1 was trained in October 2023 on a custom training stack built on JAX and Rust. As a Mixture-of-Experts model, it activates only about 25% of its weights for any given token, which improves training and inference efficiency.
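Since that “25% of weights per token” figure follows from Mixture-of-Experts routing, here is a minimal JAX sketch of top-k expert selection. The expert count of 8, the top-k value of 2, and the layer sizes are assumptions chosen purely for illustration, not xAI’s published configuration or code.

```python
# Minimal sketch of Mixture-of-Experts routing: each token consults only
# a few experts, so only a fraction of the layer's weights are active per token.
# All sizes here are toy values for demonstration, not Grok-1's actual shapes.
import jax
import jax.numpy as jnp

NUM_EXPERTS = 8          # assumed expert count
TOP_K = 2                # assumed experts used per token (2/8 = 25%)
D_MODEL, D_FF = 64, 256  # toy dimensions

key = jax.random.PRNGKey(0)
k_gate, k_w1, k_w2, k_in = jax.random.split(key, 4)

# Router weights plus per-expert feed-forward weights.
gate_w = jax.random.normal(k_gate, (D_MODEL, NUM_EXPERTS)) * 0.02
w1 = jax.random.normal(k_w1, (NUM_EXPERTS, D_MODEL, D_FF)) * 0.02
w2 = jax.random.normal(k_w2, (NUM_EXPERTS, D_FF, D_MODEL)) * 0.02

def moe_layer(x):
    """Route each token to its top-k experts and mix their outputs."""
    logits = x @ gate_w                               # (tokens, experts)
    top_vals, top_idx = jax.lax.top_k(logits, TOP_K)  # pick k experts per token
    mix = jax.nn.softmax(top_vals, axis=-1)           # mixing weights, (tokens, k)

    # For clarity this computes every expert densely; a real system dispatches
    # each token so that only its selected experts actually run.
    h = jax.nn.gelu(jnp.einsum("td,edf->tef", x, w1))  # (tokens, experts, d_ff)
    expert_out = jnp.einsum("tef,efo->teo", h, w2)     # (tokens, experts, d_model)

    picked = jnp.take_along_axis(
        expert_out, top_idx[:, :, None], axis=1)       # (tokens, k, d_model)
    return jnp.sum(mix[:, :, None] * picked, axis=1)   # (tokens, d_model)

tokens = jax.random.normal(k_in, (4, D_MODEL))
print(moe_layer(tokens).shape)  # (4, 64): each token used only 2 of the 8 experts
```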
Grok launched in November 2023 as a proprietary, or “closed source,” model. Until this open-source release, it was available only through X Premium+, the top subscription tier of Musk’s social network X (formerly Twitter), priced at $16 per month or $168 per year.
While Grok is now open source, the release does not include the data it was trained on. That omission doesn’t prevent anyone from using the already-trained model, but it does mean users can’t inspect what it learned from. The xAI blog says only that the model was “trained on a large amount of text data, not fine-tuned for any particular task,” which leaves open the possibility that user posts from X were part of that data.
Additionally, the open-source version of Grok lacks the connection to real-time information from X that Musk has highlighted as a key advantage over other large language models. To use that feature, users still need a paid X subscription.
Not Just Tech – A Business and PR Tactic
Grok, intended as a rival to ChatGPT from OpenAI (the company Musk helped found and departed in 2018), takes its name from a term coined by science fiction author Robert A. Heinlein in his 1961 novel “Stranger in a Strange Land,” meaning to understand something deeply and intuitively. The chatbot itself is modeled on “The Hitchhiker’s Guide to the Galaxy,” the celebrated late-1970s radio series and novels by British author Douglas Adams, later adapted into a 2005 film.
Musk promotes Grok as a funnier and less restricted alternative to ChatGPT and similar large language models. This approach is gaining traction, especially as users express concerns over AI censorship and errors, like those seen with Google Gemini. For instance, Gemini once implied Musk’s tweets could be as harmful as actions by Hitler, leading to widespread criticism from Musk and tech figures like Marc Andreessen, co-founder of a16z and an internet pioneer.
Open-sourcing Grok also serves as a strategic and ideological move amid Musk’s ongoing lawsuit against OpenAI, which he accuses of abandoning its original non-profit mission. In response, OpenAI has published old emails suggesting Musk was aware of, and at times supportive of, its shift toward a proprietary, profit-driven model.
The AI community on X has greeted Grok’s release with enthusiasm and curiosity. Technical observers have zeroed in on architectural details such as the use of GeGLU activations in the feed-forward layers and the model’s approach to normalization, which applies the “sandwich norm” technique of normalizing both before and after a sublayer. The release has even drawn attention from OpenAI employees, who have expressed interest in Grok’s capabilities.
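For readers curious what those terms refer to, below is a minimal JAX sketch of a GeGLU feed-forward block wrapped in sandwich-style normalization. The layer sizes and the RMS-style norm are illustrative assumptions, not Grok-1’s actual implementation.

```python
# Illustrative GeGLU feed-forward sublayer with "sandwich" normalization
# (a norm applied both before and after the sublayer). Toy sizes only.
import jax
import jax.numpy as jnp

D_MODEL, D_FF = 64, 256  # assumed toy dimensions

key = jax.random.PRNGKey(0)
k_gate, k_val, k_out, k_in = jax.random.split(key, 4)
w_gate = jax.random.normal(k_gate, (D_MODEL, D_FF)) * 0.02
w_val = jax.random.normal(k_val, (D_MODEL, D_FF)) * 0.02
w_out = jax.random.normal(k_out, (D_FF, D_MODEL)) * 0.02

def rms_norm(x, eps=1e-6):
    # Simple RMS-style normalization used here as a stand-in norm layer.
    return x / jnp.sqrt(jnp.mean(x ** 2, axis=-1, keepdims=True) + eps)

def geglu_ffn(x):
    # GeGLU: a GELU-activated gate multiplied elementwise with a linear value path.
    return (jax.nn.gelu(x @ w_gate) * (x @ w_val)) @ w_out

def sandwich_block(x):
    # Sandwich norm: normalize the sublayer's input and its output,
    # then add the residual connection.
    return x + rms_norm(geglu_ffn(rms_norm(x)))

tokens = jax.random.normal(k_in, (4, D_MODEL))
print(sandwich_block(tokens).shape)  # (4, 64)
```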
Consequently, Grok’s release is likely to pressure other LLM providers, particularly those offering open-source models, to show users why their offerings remain worth choosing.