The concept of the AI co-clinician is based on what the researchers call "triadic care": a triad of patient, physician, and AI, in which AI agents assist patients throughout their treatment journey while the physician retains clinical authority and control. The goal is an AI system that functions as a collaborative member of the medical team, supporting patients under clinical supervision.

For the clinician-side evaluation, the researchers worked with academic physicians to adapt the NOHARM framework, testing the system against two types of errors: providing incorrect information ("errors of commission") and failing to deliver critical information ("errors of omission").
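The two error types can be illustrated with a minimal sketch. All names here are hypothetical, and real NOHARM-style grading is done by physicians, not string matching; the sketch only shows the distinction: a response errs by commission if it contains a known-false claim, and by omission if it misses a critical fact.

```python
def evaluate_response(response: str, critical_facts: list[str],
                      false_claims: list[str]) -> dict:
    """Toy classifier for the two NOHARM-style error types (illustrative only)."""
    text = response.lower()
    # Errors of commission: incorrect claims that appear in the response.
    commission = [c for c in false_claims if c.lower() in text]
    # Errors of omission: critical facts the response fails to deliver.
    omission = [f for f in critical_facts if f.lower() not in text]
    return {"errors_of_commission": commission, "errors_of_omission": omission}

result = evaluate_response(
    response="Take ibuprofen with food. Safe during pregnancy.",
    critical_facts=["with food", "avoid in third trimester"],
    false_claims=["safe during pregnancy"],
)
print(result)
```

In the study itself, academic physicians performed this grading in a blinded setting; the sketch simply makes the commission/omission distinction concrete.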

In a blinded comparison using 98 realistic primary care queries, physicians consistently preferred the AI co-clinician's responses over leading evidence synthesis tools. Against an existing clinical AI system, the preference was 67 to 26; against GPT-5.4-thinking-with-search, it was 63 to 30. In the objective analysis, the system recorded one critical error across the 98 cases.

In a blinded comparison with 98 realistic primary care queries, physicians clearly preferred the AI co-clinician over an existing clinical AI agent (67 to 26) and GPT-5.4-thinking-with-search (63 to 30). | Image: Google DeepMind

The advantage was especially pronounced on medication-related questions. The RxQA benchmark comprises 600 questions on drug compounds, interactions, and dosages — derived from national drug registries in two countries and verified by licensed pharmacists. These questions are challenging for general practitioners: with reference tools, they achieved only 61.3% correct answers; without any aids, just 48.3%.

The AI co-clinician scored 73.3%, narrowly ahead of GPT-5.4-thinking-with-search at 72.7%. The gap widened when questions were posed in an open-ended format rather than multiple choice, the way physicians actually look things up in practice: here, the AI co-clinician reached a quality score of 95.0%, compared to 90.9% for OpenAI's model.

Multimodal Telemedicine: AI With Eyes, Ears, and Voice

Beyond text-based support, Google DeepMind is exploring how the AI co-clinician can be deployed with real-time audio and video in telemedicine scenarios. In collaboration with physicians at Harvard and Stanford, the researchers conducted a randomized simulation study involving 20 synthetic clinical scenarios, 10 physicians acting as patients, and a total of 120 hypothetical telemedicine encounters.

The AI co-clinician demonstrated capabilities that go beyond pure text systems — correcting a patient's inhaler technique, for example, and guiding shoulder examinations to identify a rotator cuff injury.

For patient-facing telemedicine conversations, the AI co-clinician uses a dual-agent architecture: a "Planner" module continuously monitors the conversation and checks whether the "Talker" agent remains within safe clinical boundaries. For physician-facing use, the system prioritizes clinically grounded evidence and performs verification and citation checks during information retrieval.
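The paper does not publish implementation details, but the dual-agent pattern can be sketched as follows. All names ("Talker", "Planner", the boundary list) are assumptions for illustration: the Talker drafts each patient-facing reply, and the Planner reviews the draft against safe clinical boundaries before anything is sent, escalating to the physician when a boundary is crossed.

```python
# Assumed clinical boundaries the Planner enforces (illustrative, not from the paper).
UNSAFE_TOPICS = {"definitive diagnosis", "prescription change"}

def talker(conversation: list[str]) -> str:
    """Draft a patient-facing reply (stand-in for the conversational model)."""
    return f"Draft reply to: {conversation[-1]}"

def planner(draft: str) -> str:
    """Monitor the Talker's output; veto drafts that cross a clinical boundary."""
    if any(topic in draft.lower() for topic in UNSAFE_TOPICS):
        return "I need to hand this question to your physician."
    return draft

def respond(conversation: list[str]) -> str:
    # Every Talker draft passes through the Planner before reaching the patient.
    return planner(talker(conversation))

print(respond(["My inhaler doesn't seem to help."]))
```

The design choice is separation of concerns: the agent that converses fluently is never the final authority on what is safe to say, mirroring how the physician-facing mode layers verification and citation checks over retrieval.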

Experienced Physicians Still Come Out Ahead

The study evaluated over 140 aspects of consultation quality across seven domains: triage, medical history taking, clinical reasoning, communication and counseling, management steps, recognition of red flags, and physical examinations. The findings are sobering for anyone who sees AI as a replacement for doctors: experienced physicians outperformed the AI system overall, particularly in identifying red flags and guiding critical physical examinations.

At the same time, the AI co-clinician reached a comparable or better level than general practitioners in 68 of the 140 evaluated areas. OpenAI's GPT-realtime performed significantly worse than both across all seven domains. The researchers conclude that such systems are currently best suited as supportive tools for physicians — not as replacements for clinical judgment.

In simulated telemedicine consultations, general practitioners (orange) outperformed Google's AI co-clinician (blue) across all seven evaluated domains. The gap was largest in red flag recognition and physical examinations. OpenAI's GPT-realtime (grey) trailed both in every category. | Image: Google DeepMind

Whether and when the research initiative will become a marketable product remains open. The results demonstrate notable advances in AI-assisted evidence synthesis and telemedicine consultation — while also making clear that the gap to experienced physicians persists, particularly in safety-critical areas such as red flag recognition. "We are still at the very beginning, but the promise is clear," says DeepMind researcher Alan Karthikesalingam.