Study Finds AI Can Reason Like a Physician

Artificial intelligence with “reasoning” abilities can now assess real-world medical cases as accurately as, or more accurately than, healthcare professionals, according to a study published in Science. Researchers evaluated OpenAI’s reasoning model o1 alongside GPT-4, doctors, and medical trainees on unfamiliar clinical cases. The o1 model frequently surpassed both GPT-4 and physicians in diagnostic accuracy. In tests using electronic health records from an emergency department at a Boston hospital, the o1 model was correct more than two-thirds of the time at initial triage, while two expert physicians were correct about 50% of the time.

Dr. Robert Wachter of the University of California, San Francisco, called the findings “significant” and said it is “undeniable” that contemporary AI outperforms both older systems and physicians in diagnostic ability. Nonetheless, he cautioned that more research is needed before AI can be fully integrated into clinical settings. Wachter noted that the study’s text-based tests did not capture the visual and auditory cues doctors rely on, such as signs of patient discomfort and medical imaging.

The study’s authors underscored the “pressing” need for further research and clinical trials to explore AI’s potential to improve clinical workflows and patient outcomes. A companion article in Science by specialists from the Flinders Health and Medical Research Institute agreed with the study’s conclusions but cautioned against replacing physicians with AI, advocating instead a collaborative model with human supervision and accountability. They warned that without demonstrated efficacy, equity, and safety, many AI systems will still fall short of what clinical use demands.