- Sarvam, an Indian AI startup, tackles India’s linguistic diversity with a voice-enabled AI bot supporting over ten languages.
- The startup focuses on voice interaction because many users prefer speaking to text-based communication.
- Sarvam’s offerings also include a compact language model, an AI tool for legal professionals, and an audio-language model.
- Their AI voice bots are versatile and deployable across platforms like WhatsApp, apps, and traditional calls.
- Backed by Peak XV and Lightspeed, Sarvam prices its AI agents at ₹1 per minute.
- The AI agents are built on a small language model, Sarvam 2B, trained on 4 trillion tokens of synthetic data.
- Synthetic data poses risks of inaccuracies and hallucinations, potentially affecting model reliability.
Main AI News:
In a market where 22 official languages and over 19,000 dialects are spoken, the feasibility of offering a text-only AI chatbot that excels in just a few languages is questionable. This challenge has been the focal point for Indian AI startup Sarvam, which on Tuesday unveiled a suite of offerings designed to address this linguistic diversity. Among its innovations is a voice-enabled AI bot that supports more than 10 Indian languages, catering to a preference for voice interaction over text-based communication. Additionally, Sarvam has introduced a compact language model, an AI tool tailored for legal professionals, and an audio-language model.
Headquartered in Bengaluru, Sarvam primarily targets businesses and enterprises, promoting its AI voice bots across multiple industries, particularly in customer support. For instance, Sri Mandir, a startup specializing in religious content, has leveraged Sarvam’s AI to manage payments, processing over 270,000 transactions. Sarvam’s voice AI agents are versatile and deployable on platforms like WhatsApp, within apps, and even through traditional voice calls.
Supported by Peak XV and Lightspeed, Sarvam plans to offer its AI agents at a competitive rate, starting at ₹1 (approximately 1 cent) per minute of usage.
The voice-enabled AI agents are built on Sarvam’s foundational language model, Sarvam 2B, which has been trained on a dataset comprising 4 trillion tokens. Notably, the model is trained entirely on synthetic data, a method that has drawn caution from AI experts. Synthetic data, generated by large language models to mirror real-world information, carries the risk of inaccuracies and hallucinations. Training a model on such data could amplify these issues.
Conclusion:
Sarvam’s approach to tackling the linguistic challenges in India through AI-driven voice bots marks a significant shift in the market. By catering to the preference for voice communication in multiple languages, Sarvam is positioning itself as a critical player in industries reliant on customer interaction. However, the reliance on synthetic data to train its models introduces potential risks of inaccuracies, which could impact user trust and model performance. As Sarvam scales its offerings, the market must closely monitor the balance between innovation and the integrity of AI-driven services. This move could influence other AI startups to prioritize multilingual and voice-enabled solutions, driving a broader transformation in how businesses engage with diverse customers.