AI Health Friday Roundup

The AI Health Friday Roundup highlights the week’s news and publications related to artificial intelligence, data science, public health, and clinical research.

April 3, 2026

In this week’s Duke AI Health Friday Roundup: “accountability ping-pong” and health AI; chatbot flattery threatens empathic responses; use of AI scribes at 5 health systems associated with “modest” improvements in EHR time; frontier models confidently describe medical images without actually seeing them; LLM fine-tuning can cause models to regurgitate large quantities of copyrighted text; much more:

AI, STATISTICS & DATA SCIENCE

Image credit: Agê Barros/Unsplash
  • “…adoption of AI scribes was associated with modest reductions in total EHR time and documentation time. AI scribe adopters spent 13 minutes fewer using the EHR in total and 16 minutes fewer on documentation per 8 hours of scheduled patient care, representing 3.0% and 10.0% relative decreases in time spent, respectively…AI scribe adoption was associated with a 1.7% increase in weekly visit volume, translating to a conservatively estimated additional $167 in monthly E/M visit revenue per clinician.” A research article by Rotenstein and colleagues, published in JAMA with an accompanying editorial by Tierney and colleagues, presents findings from a study that evaluated the effects of ambient AI scribes on patient visit volume and time spent on EHR activity at five academic health systems.
  • “Whatever the discipline, the first step in managing massive data sets is working out what needs to be kept and what can be thrown away. Although practices vary, librarians and data specialists say that there are some overarching principles….Some data sets must be kept because they are irreplaceable or legal requisites. Others might have been used in a publication or for a government decision, and need to be stored so that future readers can see the evidence on which a decision was based.” In a Technology Feature for Nature, Sarah Wild offers a practical guide to trimming increasingly gargantuan research datasets, beginning with decisions about what must be kept and what can safely be discarded.
  • “…Frontier models readily generate detailed image descriptions and elaborate reasoning traces, including pathology-biased clinical findings, for images never provided…without any image input, models also attain strikingly high scores across general and medical multimodal benchmarks, bringing into question their utility and design….when models were explicitly instructed to guess answers without image access, rather than being implicitly prompted to assume images were present, performance declined markedly.” A preprint by Asadi and colleagues, available from arXiv, details a phenomenon whereby multimodal AI systems can be prompted to generate detailed image descriptions and reasoning traces without being provided any actual images (see the sketch following this list).
  • “We find that social sycophancy is highly pervasive: AI models affirm users at substantially higher rates than humans across a wide range of contexts, including everyday advice queries, social or moral transgressions, and prompts about unethical or harmful actions. Furthermore, we identify harmful consequences of interacting with sycophantic AI: Across three preregistered studies, participants interacting with sycophantic AI became more convinced of their own rightness and less willing to repair relationships.” A research article published in Science by Cheng and colleagues documents some undesirable consequences stemming from the tendency of many chatbot LLMs to be excessively obliging in their responses to users.
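
The failure mode Asadi and colleagues describe is straightforward to probe informally. The sketch below is a minimal illustration only, not the preprint's protocol: it assumes the OpenAI Python SDK, and the model name and prompts are our own illustrative choices. It asks a chat model about a chest X-ray without attaching any image, first implying one is present and then explicitly flagging that none is.

```python
# Minimal sketch of an image-free probe, loosely in the spirit of the
# Asadi et al. preprint. Assumes the OpenAI Python SDK with an API key in
# the OPENAI_API_KEY environment variable; the model name and prompts are
# illustrative, not those used in the paper.
from openai import OpenAI

client = OpenAI()

IMPLICIT = (
    "Look at the attached chest X-ray and describe any abnormal findings."
)
EXPLICIT = (
    "No image is attached to this message. If you cannot actually see a "
    "chest X-ray, say so instead of describing one."
)

for label, prompt in [("implicit", IMPLICIT), ("explicit", EXPLICIT)]:
    # Note: no image content is included in either request.
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- {label} ---")
    print(response.choices[0].message.content)
```

A model that produces detailed “findings” under the implicit prompt but demurs under the explicit one exhibits exactly the gap the preprint reports.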

BASIC SCIENCE, CLINICAL RESEARCH & PUBLIC HEALTH

Image credit: Lisa Keffer/Unsplash
  • “Clinicians are thus forced into ‘accountability ping-pong,’ oscillating between trusting and distrusting AI to avoid the most recent institutional sanction. The immediate consequence is moral distress; the more insidious long-term effect is the gradual atrophy of professional judgment. When clinicians are repeatedly deterred from engaging in the ‘heavy lifting’ of weighing competing clinical values, the foundations of holistic, patient-centered care are undermined.” In a commentary article published in NPJ Digital Medicine, Patil, Myers and Dai make a case for scrutinizing the details of how healthcare AI is configured and implemented at local scale.
  • “Most of the academic experts interviewed for this piece agreed that LLM health chatbots could have real upsides, given how little access to health care some people have. But all six of them expressed concerns that these tools are being launched without testing from independent researchers to assess whether they are safe. While some advertised uses of these tools, such as recommending exercise plans or suggesting questions that a user might ask a doctor, are relatively harmless, others carry clear risks.” In a feature article for MIT Technology Review, writer Grace Huckins weighs the putative benefits and potential risks associated with using LLM-based tools in patient-facing healthcare applications.
  • “A key contribution of this work is the use and fine-tuning of a WFM [wearable foundation model] that learns robust, high-dimensional representations directly from high-resolution sensor data. Our results show that this approach significantly outperforms conventional methods that rely on simple aggregate metrics. The WFM not only improved the predictive accuracy across all multimodal combinations, but it also substantially increased the feature importance of the wearable-device data…” A research article published in Nature by Metwally and colleagues presents findings from a study that developed a machine learning model incorporating time-series data from wearable devices to predict insulin resistance (a schematic sketch of this kind of multimodal setup follows this list).
  • “…promoting automated empathy risks fostering ‘empty empathy talk’—language devoid of relational depth and embodied meaning. Uncritical reliance on automated empathy may weaken clinicians’ empathic skills and impair relational quality in patient care, underscoring the need to safeguard authentic empathy in future healthcare through reflective strategies.” A preprint article by Hvidt and colleagues, accepted for publication in NPJ Digital Medicine, applies critical scrutiny to the possibility that the “empathic” quality of chatbot responses could lead to the devaluing of real empathy in healthcare settings.
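
For readers curious what fine-tuning a WFM buys over aggregate metrics, the schematic below caricatures the multimodal setup: a learned embedding of raw sensor traces is concatenated with tabular clinical features before prediction. Note that `wfm_encode` is a hypothetical placeholder for the paper's fine-tuned encoder, and the data, features, and classifier are all illustrative assumptions, not the study's actual pipeline.

```python
# Schematic sketch of the multimodal design described by Metwally et al.:
# a learned embedding of wearable time series is concatenated with tabular
# clinical features to predict insulin resistance. `wfm_encode` is a
# hypothetical stand-in for the fine-tuned wearable foundation model;
# everything here is illustrative, not the paper's pipeline.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def wfm_encode(series: np.ndarray) -> np.ndarray:
    """Placeholder embedding: a real WFM would map each high-resolution
    sensor trace to a learned fixed-length vector."""
    return np.stack([series.mean(axis=1), series.std(axis=1)], axis=1)

n = 200
wearable_series = rng.normal(size=(n, 1440))  # e.g., minute-level heart rate
clinical = rng.normal(size=(n, 5))            # e.g., age, BMI, lipid panel
labels = rng.integers(0, 2, size=n)           # insulin-resistant yes/no (toy)

embeddings = wfm_encode(wearable_series)
features = np.hstack([embeddings, clinical])  # fused multimodal features

clf = LogisticRegression(max_iter=1000)
print(cross_val_score(clf, features, labels, cv=5).mean())
```

The study's reported gains come from replacing the simple summary statistics sketched in `wfm_encode` with representations learned directly from the raw sensor stream.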

COMMUNICATIONS & POLICY

Image credit: Kelly Sikkema/Unsplash
  • “The incident is yet another example of volunteer Wikipedia editors fighting to keep the world’s largest repository of human knowledge free of AI-generated slop, and an example of how AI agents in particular, which can take actions online with little input from human operators, can easily flood internet platforms with low quality content.” At 404 Media, Emanuel Maiberg has the story of another instance of an agentic AI seeking literary revenge after being banned from contributing material to a site (Wikipedia, this time).
  • “…by training models to expand plot summaries into full text, a task naturally suited for commercial writing assistants, we cause GPT-4o, Gemini-2.5-Pro, and DeepSeek-V3.1 to reproduce up to 85-90% of held-out copyrighted books, with single verbatim spans exceeding 460 words, using only semantic descriptions as prompts and no actual book text.” A preprint paper by Liu and colleagues, available at SSRN, demonstrates how LLM fine-tuning can cause models to regurgitate lengthy verbatim portions of copyrighted text (a sketch of how such verbatim overlap can be measured appears below).
  • “…when untrained students rely on AI, the results often mimic the lowest common denominator of internet writing: formulaic structures, vacant generalizations, logorrheic signposts and transitions, redundant recapitulations, and telegraphed metaphors that echo the churn of clickbait journalism and recipe blogs.” An extensive and critical book review by Ricky D’Andrea Crano at the journal Critical AI points out some of the persistent problems with the use of chatbot-style LLMs as writing assistants, particularly in pedagogical settings.
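
The headline figure in the Liu preprint (“single verbatim spans exceeding 460 words”) rests on a simple measurement: the longest word-for-word run shared between a model's output and the source text. Below is a minimal sketch of that measurement using only the Python standard library; it is our own illustration, not the authors' implementation.

```python
# Minimal sketch of measuring the longest verbatim span shared between a
# model generation and a reference text, in the spirit of the metric
# reported by Liu et al. (not the authors' implementation).
from difflib import SequenceMatcher

def longest_verbatim_span(generated: str, reference: str) -> tuple[int, str]:
    """Return the length (in words) and text of the longest word-for-word
    run appearing in both strings."""
    gen_words = generated.split()
    ref_words = reference.split()
    matcher = SequenceMatcher(None, gen_words, ref_words, autojunk=False)
    match = matcher.find_longest_match(0, len(gen_words), 0, len(ref_words))
    span = gen_words[match.a : match.a + match.size]
    return match.size, " ".join(span)

# Toy usage: in the study's setting, `reference` would be a held-out
# copyrighted book and `generated` the fine-tuned model's expansion of
# the book's plot summary.
size, span = longest_verbatim_span(
    "it was the best of times it was the worst of times indeed",
    "chapter one it was the best of times it was the worst of times",
)
print(size, "words:", span)
```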