AI Health
Friday Roundup
The AI Health Friday Roundup highlights the week’s news and publications related to artificial intelligence, data science, public health, and clinical research.
May 1, 2026
In this week’s Duke AI Health Friday Roundup: cursing AI agent runs amok in business database; Anthropic researchers: ‘incoherence’ a besetting challenge in larger models; cephalopod brains spark interest across neuroscience; FDA announces real-time trial data monitoring; study finds evidence of automation bias in physicians using LLMs; a licensure model for clinical AI systems; much more:
AI, STATISTICS & DATA SCIENCE
- “It only took nine seconds for an AI coding agent gone rogue to delete a company’s entire production database and its backups, according to its founder. PocketOS, which sells software that car rental businesses rely on, descended into chaos after its databases were wiped, the company’s founder Jeremy Crane said.” The Guardian’s Sanya Mansoor relays the horror story of a Claude Opus 4.6 AI agent that ignored explicit directions, went on a database deletion spree, and responded to queries with defiant expletives.
- “This paper closes a door that most organisations have been leaving open. The AI industry sells capability. Bigger models, longer reasoning chains, higher benchmarks. The assumption running underneath is that capability and reliability scale together. On the tasks where reliability matters most, they do not.” At his Slow AI blog, Sam Illingworth unpacks a recent paper presented at ICLR by researchers from Anthropic, which suggests that essentially random “incoherence” may be a bigger problem for AI models than bias, and the bigger the model, the more serious the issue.
- “Physicians demonstrate substantial automation bias when exposed to erroneous LLM recommendations, even with voluntary consultation and prior AI literacy training. These findings highlight safety risks that require robust validation frameworks and regulatory safeguards before widespread clinical AI deployment.” An article published in NEJM AI by Qazi and colleagues reports results from a small, randomized trial that examined physicians’ use of a large language model to support diagnostic reasoning. The study revealed substantial automation bias, even when the physicians had received training in AI literacy beforehand.
- “In almost any other application, the biggest Achilles heel of AI is that it makes unverifiable mistakes. But in mathematics, almost uniquely, you can automatically check the output — at least if the output is supposed to be the proof of a theorem….So, AI companies have recognized that their most unambiguous successes — if they’re going to have any — are going to come from mathematics.” Nature’s Davide Castelvecchi talks with Fields Medal winner Terence Tao about how AI is changing the field of mathematics.
- “…despite their wide usage, effect sizes are often misinterpreted. This is mostly due to an over-reliance on general effect size benchmarks that were not intended for broad application across diverse research fields. Inaccurate effect size interpretations can lead to incorrect conclusions about the magnitude of study results and incorrect sample size estimates, thereby increasing the likelihood of false-positive results.” An article published in Behavior Research Methods by Glaser and colleagues offers a tutorial on calculating effect sizes for experiments that are appropriate to the specific field of interest.
BASIC SCIENCE, CLINICAL RESEARCH & PUBLIC HEALTH
- “A rudimentary look at the cephalopod nervous system reveals that there is more than one way to construct a large, smart brain. For starters, cephalopod brains are doughnut-shaped organs built around the oesophagus….Moreover, a large number of a cephalopod’s neurons — more than half in the case of octopuses — are located in the eight nerve cords, or minibrains, that control the arms.” A feature article by Nature’s Liam Drew looks at recent research into the unique brains of cephalopod species that has piqued the interest of neuroscientists.
- “Both claims—that AI has become compassionate and that physicians are becoming obsolete—arise from the same evidence, but both are wrong. The algorithm did not become compassionate; however, the medical profession has drifted so far from the bedside that an AI language model (trained on pattern and probability) can now outperform physicians on the very quality that medicine assumed would remain uniquely human. Something has gone terribly wrong in medicine.” An editorial by Martinelli and colleagues, published in JAMA, serves as the occasion for some reflection on the field of medicine as AI tools encroach on what was once the physician’s bailiwick.
- “Both tools improved workflow satisfaction and reduced burnout, with product B showing superior performance in satisfaction and documentation time. However, efficiency metrics like pajama time were largely unaffected…” A research article published in the Journal of the American Medical Informatics Association by Chowdhury and colleagues compares two ambient clinical AI scribing systems to evaluate their effectiveness in improving metrics related to physician burnout and documentation efficiency.
- “FDA Chief AI Officer Jeremy Walsh said his team started working on this project in June 2025, asking whether the agency could start making decisions with less information. The FDA worked with the companies to pre-determine the likely safety signals and the clinical endpoints in the trial, he said.” STAT News’ Lizzy Lawrence reports on an announcement from the FDA describing new approaches to real-time event monitoring in clinical research using data accessed via individual EHRs.
COMMUNICATIONS & POLICY
- “The shift in the organization of academic work is not only visible in everyday work of laboratories; it is also beginning to reshape the conditions under which students enter academic life. Students in some disciplines are encountering a more restricted PhD landscape, where admissions have been reduced or paused because of funding uncertainty. Others are entering programs where AI use is becoming part of the ordinary infrastructure of research: literature search, drafting, summarizing, revising, and presentation work.” An essay at Data & Society by Ranjit Singh considers some of the implications for graduate education as the use of AI in research grows.
- “Licensure is better aligned with the realities of autonomous clinical AI. It will not resolve broader issues related to data governance, reimbursement, privacy, workforce transformation, or the effect of local deployment, all of which may require additional regulation. As clinical AI increasingly resembles clinicians in its capabilities, our regulatory frameworks must evolve accordingly.” A perspective article published in JAMA by Bergman and colleagues grapples with some of the issues involved in establishing a workable regulatory framework for autonomous AI systems in medicine.
- “The overarching theme of the warning letter was the company’s lack of basic understanding of GMPs and was relying instead on AI to substitute this knowledge….Another issue was the company’s excessive dependence on AI. For instance, an investigator noted that the firm had not performed process validation to ensure that its processes were under control. The firm replied that it ‘was not aware’ of the legal requirements because AI did not inform them that this validation was needed.” An article by Joanne S. Eglovitch at the RAPS website describes a warning recently issued to a homeopathic drug manufacturer by the FDA for having used AI to substitute for in-house knowledge of Good Manufacturing Practices.
