New FreeWilly LLMs from Stability AI Take the Top Spot on the Open Access Language Model Leaderboard

TL;DR:

  • Stability AI and CarperAI lab introduce FreeWilly1 and FreeWilly2 large language models for non-commercial use.
  • Both models deliver strong performance, rivaling GPT-3.5 on certain tasks and topping the Hugging Face Open LLM Leaderboard.
  • FreeWilly models are built on strong foundation models and fine-tuned on a synthetically generated dataset using Supervised Fine-Tuning (SFT) in the standard Alpaca format.
  • The training methodology draws inspiration from Microsoft’s Orca approach, with training data generated from high-quality instructions using language models.
  • FreeWilly models showcase remarkable proficiency in intricate reasoning, linguistic subtleties, and domain-specific problem-solving.
  • Responsible release practices ensure thorough internal red team testing and encourage external feedback.

Main AI News:

Stability AI, in collaboration with the CarperAI lab, has introduced two groundbreaking large language models, FreeWilly1 and FreeWilly2. These cutting-edge models, now available for non-commercial use, have demonstrated strong performance across a diverse array of benchmarks, securing the top positions on the Hugging Face Open LLM Leaderboard.

An Unmatched Comprehension of Language

The foundation for FreeWilly1 and FreeWilly2 was laid on the robust models LLaMA 65B and LLaMA 2 70B from Meta. Meticulously fine-tuned on a synthetically generated dataset using Supervised Fine-Tuning (SFT) in the standard Alpaca format, these models have showcased exceptional capabilities. Notably, FreeWilly2 has even rivaled the renowned GPT-3.5 on specific tasks, a remarkable feat for a newly released language model.
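For readers unfamiliar with the "standard Alpaca format" mentioned above: it is a fixed prompt template that wraps each training record's instruction (and optional input) before the model's response. A minimal sketch is below; the template wording follows the public Stanford Alpaca repository, while the sample record itself is invented for illustration:

```python
# Minimal sketch of the standard Alpaca prompt template used in SFT.
# Template text follows the public Stanford Alpaca repo; the sample
# record below is invented purely for illustration.

PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

def format_alpaca(example: dict) -> str:
    """Render one training record into an Alpaca-style prompt string."""
    if example.get("input"):
        return PROMPT_WITH_INPUT.format(**example)
    return PROMPT_NO_INPUT.format(instruction=example["instruction"])

record = {
    "instruction": "Summarize the text.",
    "input": "FreeWilly2 tops the Open LLM Leaderboard.",
}
print(format_alpaca(record))
```

During SFT, the target response is appended after `### Response:` and the model is trained to produce it given the templated prompt.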

Anel Islamovic, a spokesperson from Stability AI, expressed pride in the models’ exceptional reasoning ability across various benchmarks. “Both models demonstrate exceptional reasoning ability across varied benchmarks,” Islamovic said. “FreeWilly1 leverages the original LLaMA 65B foundation model and was carefully fine-tuned with a new synthetically-generated dataset using Supervised Fine-Tune (SFT) in standard Alpaca format. Similarly, FreeWilly2 leverages the LLaMA 2 70B foundation model to reach a performance that compares favorably with GPT-3.5 for some tasks.”

Revolutionary Training Methodology

Drawing inspiration from Microsoft’s methodology outlined in its paper Orca: Progressive Learning from Complex Explanation Traces of GPT-4, Stability AI adopted a similar approach with different data sources. The training dataset of 600,000 examples consists of high-quality instructions generated by language models, built from datasets created by Enrico Shippole. Remarkably, despite being roughly a tenth of the size of the dataset used in the original Orca paper, the FreeWilly models showed exceptional performance, validating the efficacy of synthetically generated datasets.

Setting New Standards for Open Access LLMs

To gauge the models’ prowess, Stability AI employed EleutherAI’s lm-eval-harness, augmented with AGIEval, a human-centric benchmark for evaluating foundation models. The results highlight the FreeWilly models’ proficiency in intricate reasoning, nuanced language comprehension, and answering complex queries in specialized domains such as law and mathematical problem-solving. The results were evaluated by Stability AI researchers and independently reproduced by Hugging Face on July 21, 2023, then published on the Open LLM Leaderboard, where LLMs and chatbots are ranked and evaluated upon release.
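An evaluation like the one described above can be run with EleutherAI's harness. The sketch below is illustrative only: the exact CLI flags and task names varied across harness versions (this reflects the mid-2023 `main.py` interface), and actually running it requires downloading the full model weights:

```shell
# Illustrative sketch: evaluating a Hugging Face model with EleutherAI's
# lm-eval-harness (flag names varied across harness versions; running this
# requires downloading the full FreeWilly2 weights).
git clone https://github.com/EleutherAI/lm-evaluation-harness
cd lm-evaluation-harness
pip install -e .

python main.py \
    --model hf-causal \
    --model_args pretrained=stabilityai/FreeWilly2 \
    --tasks arc_challenge,hellaswag,truthfulqa_mc \
    --batch_size 1
```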

Emphasis on Responsible Deployment Practices

Stability AI places a strong emphasis on responsible deployment practices for FreeWilly. The models underwent thorough internal red team testing to identify and address potential harms. However, the company actively encourages external feedback to further enhance safety measures and ensure ethical usage.

A New Era of Open Access LLMs

FreeWilly1 and FreeWilly2 mark a significant milestone in the realm of open access language models. Their arrival is poised to fuel groundbreaking research, advance natural language understanding, and support complex tasks. Stability AI is confident that these models will unlock endless possibilities within the AI community and inspire innovative applications.

“We are excited about the endless possibilities that these models will bring to the AI community and the new applications they will inspire,” Islamovic said. “We would like to express our sincere gratitude to our passionate team of researchers, engineers, and collaborators, whose remarkable efforts and dedication have enabled us to reach this significant milestone.”

Conclusion:

The introduction of the FreeWilly LLMs marks a significant milestone in the open access language model landscape. These models have the potential to revolutionize natural language understanding, advance research, and support complex tasks across various domains. Businesses and industries can expect improved language processing capabilities and enhanced AI applications, fostering innovation and new possibilities in the market. Ethical deployment and responsible practices will be crucial to maximizing the benefits and avoiding potential risks in adopting these cutting-edge language models.

Source