Stability AI's FreeWilly LLMs: A Breakthrough in Reasoning with the Alpaca Format & Orca Methodology
The FreeWilly models have been fine-tuned using the industry-standard Alpaca format, and their training was heavily influenced by Microsoft's groundbreaking approach in ‘Orca: Progressive Learning from Complex Explanation Traces of GPT-4.’

Highlights
- Stability AI's CarperAI team develops FreeWilly1 and FreeWilly2, two open-source LLMs
- FreeWilly2 utilises the LLaMA 2 70B base model
- The models are trained using the Alpaca format & Microsoft's Orca approach for progressive learning
Stability AI, the startup known for building open AI tools for visual art, and its CarperAI lab have introduced FreeWilly1 and its successor, FreeWilly2: two powerful large language models (LLMs) that are now freely accessible to the public. These models have demonstrated remarkable reasoning capabilities across a diverse range of benchmarks.
FreeWilly1 builds on the original LLaMA 65B foundation model, fine-tuned on a newly generated synthetic dataset using Supervised Fine-Tuning (SFT) in the widely adopted Alpaca format.
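For context, the Alpaca format is simply a plain-text prompt template with ‘### Instruction’, optional ‘### Input’, and ‘### Response’ sections. Below is a minimal Python sketch of that template; the wording follows the original Stanford Alpaca release, while the format_alpaca helper is our own illustration.

```python
# Minimal sketch of the standard Alpaca prompt format used for SFT.
# Template text follows the original Stanford Alpaca repository;
# the helper function is illustrative, not Stability AI's code.

ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{response}"
)

ALPACA_TEMPLATE_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{response}"
)

def format_alpaca(example: dict) -> str:
    """Render one training example into an Alpaca-format prompt string."""
    if example.get("input"):
        return ALPACA_TEMPLATE.format(**example)
    return ALPACA_TEMPLATE_NO_INPUT.format(
        instruction=example["instruction"], response=example["response"]
    )

print(format_alpaca({
    "instruction": "Summarise the FreeWilly announcement in one sentence.",
    "input": "",
    "response": "Stability AI released two open-access LLMs.",
}))
```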
FreeWilly2, meanwhile, utilises the LLaMA 2 70B foundation model and delivers performance comparable to GPT-3.5 on certain tasks.
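To make this concrete, here is a hedged sketch of loading and prompting FreeWilly2 with Hugging Face transformers. The Hub id stabilityai/FreeWilly2 and the System/User/Assistant prompt layout are taken from the model's public card at release; note that a 70B model in half precision still needs on the order of 140 GB of GPU memory.

```python
# Illustrative only: loading FreeWilly2 with Hugging Face transformers.
# Assumes the checkpoint is published as "stabilityai/FreeWilly2" and that
# you have enough GPU memory for a 70B-parameter model in half precision.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/FreeWilly2"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce memory
    device_map="auto",          # shard across available GPUs
)

# Prompt layout reported on the model card (System / User / Assistant blocks).
prompt = (
    "### System:\nYou are a helpful assistant.\n\n"
    "### User:\nExplain chain-of-thought prompting in two sentences.\n\n"
    "### Assistant:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```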
A breakthrough in LLMs
Stability AI's CarperAI team has achieved a significant milestone in the field of AI with the development of two powerful open-source LLMs. These models have demonstrated exceptional performance in reasoning competitions, handling a wide range of tasks across various disciplines.
Notably, both models were fine-tuned in the Alpaca format, following the methodology Microsoft describes in ‘Orca: Progressive Learning from Complex Explanation Traces of GPT-4.’
Orca, a whale, or an AI methodology?
The names of the models, FreeWilly1 and FreeWilly2, are a clever nod to the ‘Orca’ AI training methodology developed by Microsoft researchers. Both models were trained on just 600,000 data points, roughly 10 percent of the size of the original Orca dataset. This efficient training was made possible by using instructions from four datasets created by Enrico Shippole.
Consequently, the training process was significantly cheaper and more environmentally friendly, consuming less energy and leaving a smaller carbon footprint than the original Orca model and other leading LLMs.
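The Orca recipe itself is straightforward to sketch: pair each instruction with a system prompt that asks a stronger teacher model to explain its reasoning step by step, then keep the resulting explanation traces as SFT targets. The snippet below illustrates that idea only; query_teacher is a hypothetical placeholder, not part of any published FreeWilly code.

```python
# Hedged sketch of Orca-style data generation: pair instructions with
# system prompts that elicit step-by-step explanation traces from a
# stronger "teacher" model, then keep (prompt, trace) pairs for SFT.
# `query_teacher` is a placeholder for whatever teacher API you use.

SYSTEM_PROMPTS = [
    "You are a helpful assistant. Think step by step and justify your answer.",
    "Explain your reasoning as if teaching a beginner.",
]

def query_teacher(system: str, instruction: str) -> str:
    """Placeholder: call a teacher LLM (e.g. GPT-4) and return its response."""
    raise NotImplementedError

def build_explanation_dataset(instructions: list[str]) -> list[dict]:
    dataset = []
    for i, instruction in enumerate(instructions):
        # Rotate through system prompts so the student sees varied
        # reasoning styles, as in the Orca paper's explanation tuning.
        system = SYSTEM_PROMPTS[i % len(SYSTEM_PROMPTS)]
        trace = query_teacher(system, instruction)
        dataset.append(
            {"system": system, "instruction": instruction, "response": trace}
        )
    return dataset
```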
Despite the reduced data size, both models still exhibited outstanding performance, surpassing even ChatGPT (GPT-3.5) in certain cases.
Versatility of FreeWilly
The effort invested in creating the FreeWilly models has undoubtedly paid off: both have consistently delivered outstanding results across multiple benchmarks and tasks.
FreeWilly2, which utilises the LLaMA 2 70B base model, has demonstrated performance on par with GPT-3.5 on certain tasks. The models' ability to solve complex problems in specialised domains such as law and mathematics, together with their proficiency in intricate reasoning and in recognising linguistic nuance, has drawn considerable attention in the AI community.
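Readers who want to reproduce such comparisons can score the models themselves with EleutherAI's lm-evaluation-harness. The sketch below uses the harness's 0.4.x Python entry point; the task names, few-shot settings, and model id are assumptions chosen purely for illustration.

```python
# Illustrative benchmark run with EleutherAI's lm-evaluation-harness
# (pip install lm-eval). Uses the 0.4.x `simple_evaluate` entry point;
# the task list and model id here are assumptions for illustration.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=stabilityai/FreeWilly2,dtype=float16",
    tasks=["arc_challenge", "hellaswag"],
    num_fewshot=25,
    batch_size=1,
)
print(results["results"])
```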
Unlocking new possibilities in AI
The successful development of FreeWilly1 and FreeWilly2 marks a significant step forward in the quest to grasp the complexities of natural language and harness its potential in artificial intelligence.
The CarperAI team envisions a future where these innovative language models find applications in diverse fields, opening up new horizons that were once considered impossible. With their open-source nature, these models empower researchers and developers to explore novel use cases and advance the frontiers of natural language processing.