alt Inc. Unveils Innovative LLM Hallucination Scoring Engine

  • alt Inc. unveils a new method for scoring hallucinations in large language models (LLMs).
  • Hallucination causes LLMs to give false answers with no factual basis, eroding trust and hindering broader adoption.
  • The new automatic hallucination score evaluation engine detects hallucinations with 72% accuracy on a pseudo-evaluation set derived from JCommonsenseQA.
  • The engine is compatible with various LLMs, including GPT-3.5, Llama2, and alt’s own LHTM-OPT.
  • The engine checks consistency by comparing multiple outputs generated from the same input data.
  • The engine is available through the alt developer API service and is intended to improve the reliability and trustworthiness of AI-driven content.

Main AI News:

alt Inc., a Japan-based developer and provider of Personal Artificial Intelligence and AI clone technology, has announced the development of a method for scoring hallucinations in large language models (LLMs).

Hallucination is a major challenge in artificial intelligence: an LLM gives a false answer with no factual basis, often because it has misinterpreted its training or input data. Such errors erode trust among businesses and individuals and impede the broader adoption of LLM technologies.

Drawing on its experience developing and deploying LLMs in Japan, alt has created a proprietary technique that automatically assesses the likelihood of hallucination, which it calls the “hallucination score,” and has built an automatic hallucination score evaluation engine around it.

In testing on a pseudo-evaluation set derived from the JCommonsenseQA dataset, the engine identified hallucinations with 72% accuracy. It is compatible with a range of LLMs, including GPT-3.5, Llama2, and alt’s own LHTM-OPT, a lightweight large language model intended for a variety of applications.
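
The announcement does not say how the accuracy figure was computed. As a rough illustration only, the sketch below assumes it is the share of examples on which a thresholded hallucination score agrees with a ground-truth label. The function name `detection_accuracy`, the 0.5 threshold, and the sample values are hypothetical and are not taken from alt's evaluation.

```python
from typing import Sequence


def detection_accuracy(
    scores: Sequence[float],
    labels: Sequence[bool],
    threshold: float = 0.5,  # assumed cut-off, not a value published by alt
) -> float:
    """Fraction of examples where (score >= threshold) matches the label.

    scores: hallucination scores in [0, 1], one per example.
    labels: True if the example is actually a hallucination.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal in length")
    correct = sum((s >= threshold) == y for s, y in zip(scores, labels))
    return correct / len(scores)


# Made-up values for illustration; two of the four predictions match.
toy_scores = [0.9, 0.1, 0.6, 0.3]
toy_labels = [True, False, False, True]
toy_accuracy = detection_accuracy(toy_scores, toy_labels)  # 0.5
```

Under this reading, 72% accuracy would mean the engine's hallucinated or not-hallucinated judgment matched the label on roughly 72 of every 100 evaluation examples.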

The engine evaluates hallucination on the basis of consistency. It generates content several times from the same input data and compares the outputs for discrepancies. From those comparisons it derives a probabilistic judgment of whether the output is a hallucination, that is, content not grounded in the training data or in fact.
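
alt has not published the details of its technique, so the following is only a minimal sketch of the general consistency idea described above: sample the same prompt several times and treat disagreement among the answers as a sign of possible hallucination. The names `hallucination_score` and `normalize`, the exact-match comparison, and the stub generator are all assumptions made for illustration; a real system would call an actual LLM and would likely compare answers semantically.

```python
from collections import Counter
from typing import Callable


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial variations compare equal."""
    return " ".join(answer.lower().split())


def hallucination_score(
    generate: Callable[[str], str],  # hypothetical wrapper around any LLM call
    prompt: str,
    n_samples: int = 5,
) -> float:
    """Return a score in [0, 1): 0.0 means every sample agreed.

    The score is one minus the share of samples that match the most
    common answer, so more disagreement gives a higher score.
    """
    if n_samples < 2:
        raise ValueError("need at least two samples to compare")
    answers = [normalize(generate(prompt)) for _ in range(n_samples)]
    majority_count = Counter(answers).most_common(1)[0][1]
    return 1.0 - majority_count / n_samples


# Usage with a stub in place of a real model: three of four samples agree
# after normalization, so the score is 1 - 3/4 = 0.25.
_canned = iter(["Tokyo", " tokyo ", "Kyoto", "TOKYO"])
example_score = hallucination_score(lambda _prompt: next(_canned), "Capital of Japan?", n_samples=4)
```

Exact string matching keeps the sketch short, but it only suits short answers such as multiple-choice or single-entity responses; longer free-form outputs would need a similarity measure instead.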

Developers and enterprises that want to strengthen the reliability of their AI-driven products can access the automatic hallucination score evaluation engine through the alt developer API service and use it to assess the trustworthiness of LLM-generated content.


alt’s automatic hallucination score evaluation engine is a step toward ensuring the integrity of AI-generated content. By scoring hallucination in large language models automatically, on the basis of output consistency and with a reported accuracy of 72%, it gives developers, enterprises, and end users a concrete signal of how far a response can be trusted, which may help build confidence in LLM-based products.