A Danish study compared 72 radiologists with four commercial AI tools in diagnosing 2,040 chest X-rays of older adults

TL;DR:

  • AI is transforming many fields, but it is not yet ready to replace radiologists in interpreting chest X-rays.
  • A Danish study compared 72 radiologists and AI tools in diagnosing 2,040 chest X-rays of older adults.
  • AI demonstrated reasonable sensitivity but produced a significant number of false positives, especially in complex cases.
  • Radiologists outperformed AI in diagnosing pneumothorax and pleural effusion; on pneumothorax, 96% of radiologists’ positive calls were correct.
  • AI fared worst on airspace disease, where only 40-50% of its positive predictions were correct.
  • Dr. Louis Plesner, the study’s lead author, emphasized AI’s limitations in identifying the absence of disease.
  • The high rate of false positives could lead to increased costs and radiation exposure for patients.

Main AI News:

Artificial intelligence (AI) has unquestionably reshaped our world, from revolutionizing hurricane forecasting to offering financial insights. Nevertheless, in the realm of medical diagnostics, particularly interpreting chest X-rays, AI may not yet be poised to supersede the expertise of radiologists. A recent study published in the prestigious journal Radiology delves into this matter.

In this comprehensive study, Danish researchers pitted a group of 72 experienced radiologists against four state-of-the-art commercial AI tools. Their task? To interpret a dataset consisting of 2,040 chest X-rays from older adults, averaging around 72 years of age. Approximately one-third of these X-rays exhibited at least one of three diagnosable conditions: airspace disease, pneumothorax (collapsed lung), or pleural effusion, colloquially referred to as “water on the lung.”

The findings revealed that the AI tools showed reasonable sensitivity: they detected airspace disease in 72% to 91% of the cases that truly had it, pneumothorax in 63% to 90%, and pleural effusion in 62% to 95%. However, the tools also generated a notable number of false positives, and their accuracy diminished as the diagnoses became more complex. This was especially evident in cases featuring multiple concurrent conditions or subtler findings on the X-ray.
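
For readers unfamiliar with the metric, sensitivity is the share of truly diseased cases that a reader or tool correctly flags. Here is a minimal Python sketch of the calculation, using made-up counts rather than the study’s data:

```python
# Minimal sketch of the sensitivity calculation. The counts below are
# illustrative only and are not taken from the Danish study.

def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of actual disease cases that the reader detected."""
    return true_positives / (true_positives + false_negatives)

# Hypothetical: 100 X-rays truly show airspace disease and the AI flags 85.
print(f"Sensitivity: {sensitivity(85, 15):.0%}")  # -> Sensitivity: 85%
```

Sensitivity alone says nothing about false alarms, however: a tool can catch most true cases while still being wrong in many of its positive calls, which is where positive predictive value comes in.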

For instance, when evaluating pneumothorax, the AI tools’ positive predictive values ranged between 56% and 86%, while radiologists achieved 96%. Similar trends emerged for pleural effusion, where AI’s positive predictive values fell between 56% and 84%. Airspace disease proved hardest of all: the AI tools’ positive predictive values reached only 40% to 50%, meaning roughly half of their positive calls were wrong.
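
Positive predictive value asks the complementary question: of the cases a tool flags as positive, what fraction actually has the disease? A short sketch in the same vein, again with hypothetical counts chosen only to mirror the 40% to 50% range reported above:

```python
# Minimal sketch of positive predictive value (PPV). The counts are
# hypothetical, chosen to mirror the 40-50% range reported in the article.

def ppv(true_positives: int, false_positives: int) -> float:
    """Fraction of positive calls that were actually correct."""
    return true_positives / (true_positives + false_positives)

# Hypothetical: the AI flags 100 X-rays as airspace disease,
# but only 45 of those flags turn out to be correct.
value = ppv(45, 55)
print(f"PPV: {value:.0%}")               # -> PPV: 45%
print(f"False alarms: {1 - value:.0%}")  # -> False alarms: 55%
```

A positive predictive value of 40% to 50% therefore means that five to six of every ten positive calls are false alarms, which is exactly the rate Dr. Plesner describes below.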

Dr. Louis Plesner, lead study author and a resident radiologist in the Department of Radiology at Herlev and Gentofte Hospital in Copenhagen, Denmark, expressed concern, stating, “In this difficult and elderly patient sample, the AI predicted airspace disease where none was present five to six out of 10 times. You cannot have an AI system working on its own at that rate.” Furthermore, he emphasized that AI excels at finding diseases but falls short in comparison to radiologists when it comes to ruling out diseases, particularly in complex chest X-rays.

Another critical issue Plesner raised is the financial and health cost of a high false-positive rate: each false alarm can trigger unnecessary follow-up testing, adding expense and exposing patients to additional radiation.

AI experts concur with these findings. Zee Rizvi, co-founder and president of Odesso Health, an AI-assisted service for automating electronic medical records, underscores that AI should complement human skills rather than replace them. He believes it is premature to remove humans from the loop, whether the goal is productivity or better patient outcomes.

Dr. Fara Kamanger, a dermatologist and chair of the San Francisco Dermatological Society, echoes this sentiment while appreciating the study’s rigor. She highlights the enormous potential of AI in healthcare but notes that it’s unlikely to replace human experts anytime soon. Human physicians have the unique ability to conduct a holistic clinical evaluation, taking into account physical appearance, vital signs, and clinical correlation, ultimately leading to more accurate diagnoses. Kamanger emphasizes the importance of integrating this comprehensive approach into the development of AI systems to make them more effective in mimicking human clinical practice.

Conclusion:

While AI holds promise in healthcare, it should be seen as a complement to human expertise rather than a replacement. The collaboration between AI and human clinicians is key to advancing medical diagnostics and patient care. This suggests continued market opportunities for AI solutions that augment rather than replace healthcare professionals.

Source