TL;DR:
- G42 unveils Jais, an open-source Arabic large language model (LLM).
- Jais was developed by Inception, a G42 subsidiary, in collaboration with Mohammed bin Zayed University of Artificial Intelligence and Cerebras Systems.
- Jais has 13 billion parameters and was trained on a dataset of 116 billion Arabic tokens and 279 billion English tokens.
- The model was trained on the Condor Galaxy AI supercomputer.
- Jais outperforms existing Arabic models and is competitive with similar-sized English models despite being trained on less English data.
- Strategic focus areas include government, finance, energy/climate, and healthcare sectors.
- Announced collaborators include the UAE Ministry of Foreign Affairs, ADNOC, Etihad Airways, and more.
- Jais is accessible for download on Hugging Face, with online exploration available by invitation.
Main AI News:
Abu Dhabi-based G42 has introduced Jais, an open-source Arabic large language model (LLM). The release could support a new wave of generative AI applications tailored to the Arabic language.
Inception, a subsidiary of G42, partnered with the Mohammed bin Zayed University of Artificial Intelligence and Silicon Valley’s Cerebras Systems to build Jais, which is billed as the leading open-source Arabic LLM worldwide.
Jais has 13 billion parameters. It was trained on a purpose-built dataset that combines 116 billion Arabic tokens, intended to capture the language’s nuances, with 279 billion English tokens, included to improve its cross-lingual performance.
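To put those figures in proportion, the short sketch below works out the language mix and the ratio of training tokens to parameters. It uses only the numbers reported above and assumes the two token counts are simply additive, with each token counted once; the function name and dictionary keys are illustrative and do not come from the Jais release.

```python
# Figures as reported in the article, in billions.
ARABIC_TOKENS_B = 116
ENGLISH_TOKENS_B = 279
PARAMETERS_B = 13


def dataset_summary(arabic_b, english_b, params_b):
    """Return the language mix and tokens-per-parameter ratio.

    Assumption: the two token counts are additive and each token is
    counted once (no repeated passes or upsampling are modeled).
    """
    total_b = arabic_b + english_b
    return {
        "total_tokens_b": total_b,
        "arabic_share": arabic_b / total_b,
        "english_share": english_b / total_b,
        "tokens_per_parameter": total_b / params_b,
    }


summary = dataset_summary(ARABIC_TOKENS_B, ENGLISH_TOKENS_B, PARAMETERS_B)

print(f"Total tokens: {summary['total_tokens_b']} billion")
print(f"Arabic share: {summary['arabic_share']:.1%}")
print(f"English share: {summary['english_share']:.1%}")
print(f"Tokens per parameter: {summary['tokens_per_parameter']:.1f}")
```

Under these assumptions the dataset totals 395 billion tokens, of which roughly 29 percent are Arabic, and the model sees about 30 training tokens per parameter.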
Jais was trained on Condor Galaxy, a multi-exaFLOP AI supercomputer built jointly by G42 and Cerebras, according to Dr. Andrew Jackson, the CEO of Inception.
Named after the UAE’s highest peak, Jais was built by a team of researchers and engineers in response to the scarcity of bilingual large language models. Dr. Jackson also pointed to the scarcity of Arabic data online, estimating that it makes up only about 1 percent of available content.
Jais is reported to outperform existing models on Arabic by a considerable margin, including Falcon from Abu Dhabi’s Technology Innovation Institute, Llama 2 from Meta Platforms, and Bloom. It is also competitive with English models of similar size, despite having been trained on less English data than they were.
Jais is expected to be applied across several sectors. Describing the strategic focus, Dr. Jackson said, “We will channel our efforts into integrating Jais within governmental, financial, energy/climate, and healthcare domains.”
Several organizations are set to collaborate with Inception on putting Jais to use, including the UAE Ministry of Foreign Affairs, Ministry of Industry and Advanced Technology, Department of Health – Abu Dhabi, ADNOC, Etihad Airways, First Abu Dhabi Bank, e&, and Mubadala Investment Company.
Jais is available for download on Hugging Face. Users can also try the model through invitation-based access to its playground environment after registering on Jais’ website.
Conclusion:
The release of Jais, an open-source Arabic LLM, is a notable step for Arabic-language AI. Developed with academic and industry partners, built with 13 billion parameters, and aimed at specific industries, Jais could drive meaningful change across those sectors. Its reported performance against existing models points toward more capable AI applications, both in Arabic and in cross-lingual contexts. The launch also reflects a broader trend: AI is becoming more accessible and better tailored to diverse linguistic needs, which opens new avenues for the global AI market.