Healthcare Chatbots Less Accurate in Non-English Languages, Study Finds

  • Georgia Tech researchers find chatbots less accurate in Spanish, Chinese, and Hindi for health queries compared to English.
  • The study suggests non-English speakers should be cautious about relying on chatbots for healthcare advice.
  • The XLingEval framework emphasizes improving accuracy, correctness, consistency, and reliability in non-English languages.
  • The XLingHealth benchmark dataset aims to improve chatbot performance by deepening multilingual data sources.
  • Testing reveals significant disparities in chatbot performance across languages, highlighting the need for improvement.

Main AI News:

Researchers at the Georgia Institute of Technology have unveiled concerning findings about the accuracy of chatbots responding to health inquiries in languages other than English. According to a study led by Ph.D. students Mohit Chandra and Yiqiao (Ahren) Jin of Georgia Tech's College of Computing, chatbots are less accurate in Spanish, Chinese, and Hindi than in English when handling health-related questions.

The research, titled “Better to Ask in English: Cross-Lingual Evaluation of Large Language Models for Healthcare Queries,” introduces a novel framework designed to evaluate the performance of large language models (LLMs) in diverse linguistic contexts. Available as a preprint on arXiv, the paper sheds light on the limitations and potential of LLMs in addressing health-related queries.

Chandra and Jin caution against relying on chatbots like ChatGPT for essential healthcare advice for non-English speakers. Their XLingEval framework emphasizes the need for improved accuracy, correctness, consistency, and reliability in languages other than English. They propose deepening the data pool with multilingual sources, advocating for the adoption of their XLingHealth benchmark to enhance model performance.

The study reveals significant disparities in the performance of chatbots across languages:

  • Correctness diminishes by 18% when questions are posed in Spanish, Chinese, or Hindi.
  • Responses in non-English languages exhibit a 29% decrease in consistency compared to their English counterparts.
  • Non-English responses are 13% less verifiable overall.
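To make the consistency comparison concrete: one simple way to gauge whether a model gives the same answer to the same question in two languages is to compare the (translated) responses with a lexical-overlap score. The sketch below is purely illustrative and is not the XLingEval metric; the function name and the Jaccard-overlap proxy are assumptions for demonstration only.

```python
# Hypothetical sketch (not the authors' code): score how similar a model's
# answers are when the same health question is asked in two languages,
# using token-overlap (Jaccard similarity) as a crude consistency proxy.

def jaccard_consistency(answer_a: str, answer_b: str) -> float:
    """Return a lexical-overlap score in [0.0, 1.0] between two answers."""
    tokens_a = set(answer_a.lower().split())
    tokens_b = set(answer_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty answers are trivially "consistent"
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

# Example: an English answer vs. a translated non-English answer
english = "drink fluids and rest and see a doctor if the fever persists"
translated = "rest and drink fluids and consult a doctor if the fever persists"
print(f"consistency proxy: {jaccard_consistency(english, translated):.2f}")
```

In practice, evaluations like the one in the study use far more robust measures (semantic similarity, human or model judgments); a lexical proxy like this only hints at the idea of cross-lingual answer agreement.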

To address these challenges, the researchers introduce XLingHealth, a dataset comprising question-answer pairs aimed at improving chatbot performance. This dataset includes health-related content sourced from reputable platforms such as Patient and the U.S. National Institutes of Health (NIH).

In extensive testing, the researchers posed over 2,000 medical queries to ChatGPT-3.5 and MedAlpaca, a healthcare-oriented chatbot trained on medical literature. Alarmingly, more than 67% of MedAlpaca's responses to non-English questions were deemed irrelevant or contradictory. Chandra notes that while both models struggled, ChatGPT outperformed MedAlpaca thanks to its exposure to training data in multiple languages.

The study’s focus on Spanish, Chinese, and Hindi, as the world’s most spoken languages after English, reflects a personal interest and background of the researchers. Jin highlights the observations made by non-native English speakers, underscoring the importance of addressing linguistic disparities in chatbot performance.

This research underscores the critical need for advancements in chatbot technology to ensure accurate and reliable healthcare information is accessible across linguistic boundaries. As the field progresses, initiatives like XLingHealth offer promising avenues for enhancing the effectiveness of chatbots in diverse language contexts.

Conclusion:

The findings highlight the pressing need for advancements in chatbot technology to address linguistic disparities in healthcare assistance. As the market for healthcare chatbots expands globally, sustained investment in accuracy and reliability across languages will be essential to ensure equitable access to quality healthcare information for diverse populations.

Source