AI Health

Friday Roundup

The AI Health Friday Roundup highlights the week’s news and publications related to artificial intelligence, data science, public health, and clinical research.

May 8, 2026

In this week’s Duke AI Health Friday Roundup: cooperating AI agents work effectively but alignment suffers; alarm as more parents refuse vitamin K shots for newborns; using automated review to manage flood of conference papers; “AI paradox” affects uptake (or not) of AI tools in healthcare; AIs: still not beating Mark I humans on creativity; journals getting afflicted with AI slop at both ends of peer-review process; much more:

AI, STATISTICS & DATA SCIENCE

Photograph of three glasses of water in a horizontal row, with blue ink swirling and diffusing in the water. Image credit: Chaozzy Lin/Unsplash
  • “There is a name for this in human organisational behaviour. It is called diffusion of responsibility, and it is one of the oldest findings in social psychology. Put a person in front of a moral choice and they make one decision. Put them in a group and the decision changes….The Anthropic finding is that AI agents do the same thing. Each agent in the team has the capacity to flag the ethical issue. Each one assumes the other will.” At his Slow AI Substack, Sam Illingworth unpacks the implications of a recent arXiv preprint by Anthropic researchers who observed some potentially problematic behavior by multi-agent AI groups cooperating on a task.
  • “The history of manipulating photos is as old as photography itself. A famous portrait of Abraham Lincoln actually shows his head superimposed on the body of the politician John Calhoun, Stalin airbrushed enemies out of his photos, and two girls from Yorkshire in England, convinced legions of people—including Sherlock Holmes creator Arthur Conan Doyle—that their “photos” of fairies were real. But these fakes were rare, Farid says. It was clear to him that this was about to change.” An article by Science’s Kai Kupferschmidt profiles a digital forensics expert who is fighting back against an onslaught of deepfake images and video.
  • “While individual LLMs might outperform individual people in levels of creativity, as a whole, the algorithms’ responses were much more similar to each other than the people’s. Importantly, altering the LLM system prompt to encourage higher creativity only slightly increased their variability—and human responses still won out.” Duke Pratt School of Engineering’s Ken Kingery reports on research at Duke that finds LLMs still fall short on creativity when compared with humans.
  • “By deploying machine-first, human-governed systems to run the scalable first pass of review—handling routine evidence construction, integrity checks, and routing—conferences can stop treating abundance as a pathology. Making this automated control plane the spine of the review process preserves scarce human attention for high-stakes escalation, calibration, and edge cases.” A preprint article by He and colleagues proposes establishing automated AI review for conference paper submissions to serve as a “control plane” for a rapidly increasing volume of papers.
  • “Skyrocketing hard drive and storage costs caused by the AI data center boom are making it more expensive and more difficult for digital archivists, academics, Wikipedia, and hobby data hoarders to save data and archive the internet. Specific drives favored by some high profile organizations like the Internet Archive have become far more expensive or are difficult to find at all, archivists said.” At 404 Media, Jason Koebler reports on the fallout for internet archiving efforts as data storage capacity is scooped up by AI-related endeavors.

BASIC SCIENCE, CLINICAL RESEARCH & PUBLIC HEALTH

Aerial photograph shows 5 large cruise ships at berth in a harbor in the Bahamas, with blue water, beaches, and blue sky in the background. Image credit: Fernando Jorge/Unsplash
  • “Cruise ships aren’t exactly strangers to infectious disease….But a hantavirus outbreak on a ship has never been documented—the virus does not usually spread from person to person—and the incident has raised a series of scientific and medical challenges that researchers from around the world are teaming up to solve.” Science’s Kai Kupferschmidt reports on the ongoing medical and public health crisis prompted by a hantavirus outbreak on a cruise ship.
  • “In almost every case, the babies’ deaths could have been prevented with a long-standard vitamin K shot. But across the country, families — first in smatterings, now in droves — are declining the single, inexpensive injection given at birth to newborns to help their blood clot.” In an article for ProPublica, Duaa Eldeib describes the increase in rejections of routine vitamin K shots for newborn infants, a measure taken to protect neonates against potentially catastrophic bleeding.
  • “We’re leaving so much valuable medical information on the table by not incorporating validated means of AI detection to our medical scans. That doesn’t even take into account what is encoded in electrocardiograms, pathology slides, and many other types of medical images that we’re not extracting with AI.” At his Ground Truths blog, Eric Topol laments an “AI paradox” in which actionable data is not being used to improve health care while other applications with much thinner evidentiary support are seeing broad uptake.

COMMUNICATIONS & POLICY

An orange garbage truck dumps its cargo of paper and cardboard on the floor of a recycling facility while a worker in orange overalls supervises. Image credit: Nathan Cima/Unsplash
  • “A new study from the journal’s editorial team finds that AI-generated manuscripts are harder to read, more jargon laden and more likely to be rejected than those written by humans. Meanwhile, over 30% of the expert peer reviews that journals routinely use to decide what to publish now show detectable AI use, and editors report that those reviews are essentially uninformative.” In a perspective piece for Forbes, John Drake unpacks recent research published in the journal Organization Science that provides quantitative dimensions for the problem of AI-generated “slop” articles entering the scholarly publishing pipeline.
  • “Researchers earn credits by reviewing or acting as an editor—crucially, credits cannot be bought or sold. Researchers earn credits for every review/decision submitted to an acceptable standard, including on resubmitted versions of papers. To encourage mentoring, if an early-career researcher conducts their first review with a mentor, both the early-career researcher and the mentor receive a review credit.” An editorial by Moles and colleagues, appearing in the Proceedings of the National Academy of Sciences, proposes a “universal credit” system to address some of the structural issues besetting the institution of voluntary peer review.
  • “For Crowley, the choice is clear. AI might make her work more efficient, but she would rather take the time to understand what she is doing. ‘I’m here to learn how to do things,’ she adds. ‘I don’t think outsourcing it to a large language model is the goal of a PhD for me.’” Nature’s Hannah Docter-Loeb profiles academia’s AI refuseniks and reports on their reasons for avoiding or eschewing the technology.
  • “…the human oversight model can operate less as a robust safety mechanism and more as a way to shift accountability towards individual clinicians. Patient harm can result when clinicians defer to an AI system despite their reservations…A safer approach would be for regulators and health system leaders to adopt rigorous development and deployment standards, holding those who design, procure, and maintain these tools responsible for technical performance and system integrity.” In an article for BMJ, David Toro-Tobon makes the case that emphasizing “humans in the loop” as a means of oversight for health AI places the burden of ensuring safety in the wrong place and may result in risk for patients.