Romania Unveils Inaugural Open-Source LLM

  • Collaborative effort by researchers from Politehnica Bucharest, the University of Bucharest, and the Institute of Logic and Data Science led to the development of Romania’s inaugural LLM.
  • Project supported by BRD Groupe Société Générale, emphasizing innovation and future technologies in Romania.
  • Academic communities across Europe, including Bulgaria and Greece, have also introduced language-specific LLMs.
  • Romanian LLM aims to address limitations of existing models trained primarily on English datasets.
  • Traian Rebedea highlights plans for ongoing improvement and invites participation in the OpenLLM-Ro initiative.
  • Horia Velicu underscores the importance of integrating advanced technologies for societal benefit.

Main AI News:

Since the second half of 2023, researchers from Politehnica Bucharest, the University of Bucharest, and the Institute of Logic and Data Science have worked together to build and refine Romania’s own LLM. As reported by Startup.ro, the contributing researchers worked pro bono, and Politehnica Bucharest provided the computing power needed to train the model.

BRD Groupe Société Générale, which supports innovation and future technologies in Romania, is the primary partner for the project.

Language-specific models, typically spearheaded by academic communities, have gained momentum across Europe. The Bulgarian Institute for Computer Science, Artificial Intelligence and Technology (INSAIT) introduced BgGPT, a freely accessible open LLM trained on Bulgarian, in March. Greece followed later the same month with Meltemi, its own LLM. These initiatives join other European counterparts, including the German LeoLM and the Spanish Aguila.

A key goal of the Romanian LLM is to address the limitations of current open LLMs, which are predominantly trained on English monolingual datasets. According to the researchers, the first Romanian model was trained on several million documents in the Romanian language.

Traian Rebedea, associate professor at Politehnica Bucharest and principal researcher at NVIDIA, said, “We envision the unveiling of this model as merely the inception of a sustained endeavor aimed at enhancing LLMs tailored for the Romanian language. We have already identified methodologies that we intend to implement across recently debuted models (Llama-3 and Mistral), which have showcased superior performance compared to our initial undertaking (Llama-2).”

He added, “It is our fervent hope that both private enterprises and governmental bodies recognize the paramount importance of fostering the development of expansive language and multimodal (text-image) models for the Romanian language. We extend an open invitation to all stakeholders to actively participate in the OpenLLM-Ro initiative and the associated research endeavors.”

Echoing this sentiment, Horia Velicu, Head of Innovation Lab at BRD Groupe Société Générale, said, “By actively engaging with the innovation landscape, we can catalyze the integration of cutting-edge technologies, ensuring their positive impact on Romanian society is commensurate with global advancements in the field.”

Conclusion:

The launch of Romania’s first open-source LLM marks a major step forward in the country’s technological landscape. With support from both the academic and corporate sectors, the initiative demonstrates Romania’s commitment to fostering innovation and addressing linguistic challenges. As more countries invest in language-specific models, the market can anticipate a surge in localized AI solutions, paving the way for tailored applications and enhanced user experiences.
