AI Health

Friday Roundup

The AI Health Friday Roundup highlights the week’s news and publications related to artificial intelligence, data science, public health, and clinical research.

May 22, 2026

In this week’s Duke AI Health Friday Roundup: suspect datasets used to train clinical prediction models; LLMs and informed consent; ongoing Ebola outbreak alarms public health officials; different kinds of AI bring different oversight needs; has ‘frenzy of automation’ made the research paper obsolete?; mammal eyes photosynthesize (with some help from spinach); building the ‘moral infrastructure’ for health AI; and much more:

AI, STATISTICS & DATA SCIENCE

A collage of a female office worker seated at a desk surrounded by stacks of paperwork while multiple firehoses spray streams of fiery liquid around her.
Image credit: Pauline Wee & DAIR / https://betterimagesofai.org / https://creativecommons.org/licenses/by/4.0/
  • “We identified two large, publicly available Kaggle datasets, on stroke and diabetes, that lack clear data provenance, but are widely used in clinical prediction models in peer-reviewed publications. The authenticity of both datasets could not be verified and have evidence they are likely to be simulated or fabricated.” A research article by Gibson and colleagues, available as a preprint from medRxiv, uncovers substantial shortcomings in a pair of large datasets used to develop clinical prediction models.
  • “LLMs consistently demonstrate the capacity to reduce the linguistic complexity of consent materials. Plain language rewriting of study documents achieved reductions in Flesch–Kincaid grade levels from the 12th–14th grade range to the 7th–9th grade range….Despite these promising findings, the current evidence base remains limited….the use of LLMs in ICFs introduces several technical, ethical, and legal concerns. First, unconstrained generation can introduce hallucinations and omissions; in medical applications, reported hallucination rates of around 1.5% and omission rates of 3%–4% would be unacceptable at scale.” In a review article for NEJM AI, Goel and colleagues explore the potential for LLMs to improve the informed consent process.
  • “The field has progressed swiftly from early generative chatbots to more advanced autonomous agents and, increasingly, to integrated agentic AI systems capable of coordinating complex tasks autonomously. These technologies are often introduced without clear definitions or practical guidance, leaving health care organizations to interpret their capabilities and implications independently.” In a policy paper published in NEJM AI, Blumenthal and colleagues anatomize the different species of AI crowding the healthcare environment, along with the differing needs each brings in terms of oversight and implementation.
  • “We looked for clear, unambiguous statements in their methods, acknowledgments, or other sections where authors credited an AI tool like ChatGPT for writing assistance. The result was very low. Only 76 papers contained such a disclosure. This represents a mere 0.1% of the post-2023 publications we examined….On the other side, we have the estimated AI usage rate. We do not see the 0.1 world. Rather, we see a world where the statistical signature of AI influence is widespread and growing rapidly.” In a Science Sessions podcast hosted by PNAS, Peking University researcher Yi Bu describes recent pressures that AI is imposing on the traditional pathways of scholarly publishing.
  • “As these groups work to resolve the ethical dilemmas of AI mental healthcare, they are also building what I call the field’s moral infrastructure: the norms, standards, and assumptions that will govern what AI mental healthcare is, what counts as adequate evidence, who has the authority to evaluate it, and what kinds of problems the field is responsible for.” In an essay for Data & Society, Mira D. Vale examines the processes currently contributing to the “moral infrastructure” of AI in healthcare.

BASIC SCIENCE, CLINICAL RESEARCH & PUBLIC HEALTH

Closeup photograph of two hands holding a bowl full of spinach leaves, seen from above.
Image credit: Louis Hansel/Unsplash
  • “To explore whether these LEAFs might provide practical benefits, the team focused on NADPH’s ability to neutralize molecules called reactive oxygen species, which can trigger inflammation. LEAF-filled eye drops helped to soothe inflammation in a mouse model of dry-eye disease, a condition that is marked by the build-up of reactive oxygen species on the surface of the eye.” Nature’s Asher Mullard reports on a recent study that demonstrates how the capacity for photosynthesis can be transplanted into a mammalian host, with potential therapeutic applications.
  • “Compared with questionnaire-based approaches, the protein-based INTEGRAL-Risk model improved short-term prediction of lung cancer in people with a smoking history. This model has potential to improve selection of high-risk individuals who are most likely to benefit from lung cancer screening.” A paper published in JAMA by Zahed and colleagues presents findings from a study comparing a protein assay vs a questionnaire tool for predicting risk of lung cancer.
  • “But the outbreak has also grown in the meantime, to 395 suspected cases including 106 deaths. Ituri province in the northeastern DRC is the outbreak’s center, but the 10 confirmed cases so far include one in Goma in the neighboring province of North Kivu and two in Kampala, the capital of neighboring Uganda. WHO declared the outbreak a public health emergency of international concern on Sunday. And today, Africa CDC followed suit, declaring the outbreak a public health emergency of continental security.” Science’s Kai Kupferschmidt reports on the ongoing Bundibugyo variant Ebola outbreak, whose rapid spread and large size are alarming public health agencies and on-the-ground experts.

COMMUNICATIONS & POLICY

A repeating pattern of a photograph of a silicon chip, recoloured so that it is multi-coloured, in the style of pop art.
Image credit: Deborah Lupton / https://betterimagesofai.org / https://creativecommons.org/licenses/by/4.0/
  • “The standard response to this frenzy of automation has been to look for ways to shore up the existing system. But a growing number of researchers are asking a different question, in no small part because the existing system has long had its own drawbacks. What if the problem isn’t how to fix scientific publishing? What if AI’s growing capabilities are going to force the unit of scientific communication, once again, to evolve?” In an essay for The Transmitter, Tim Requarth examines the recent AI explosion’s implications for disseminating research and questions whether the venerable research paper is still fit for that purpose.
  • “Many of these could be simple errors made by nonnative English speakers, which on their own might not be problematic. But when Heathers searched for these phrases in Google Scholar, he found about 200 more papers that shared multiple features with the original dozen, including the topic and specific design elements or graphics. That’s statistically improbable unless they all have the same source, he argues.” Nicola Jones, reporting for Science Insider, looks at recent work that may provide a (temporary) means for identifying suspect “paper mill” journal articles.
  • “The rise of artificial intelligence has set off fears among publishers that they may accidentally release books from authors who improperly use A.I.-generated language. This year, Hachette pulled a forthcoming horror novel amid allegations that the author relied on A.I. to draft the book….Mr. Rosenbaum’s book contains many quotes that are accurate, but the misattributed and invented quotes are scattered throughout.” The New York Times’ Benjamin Mullin is a little on the nose with his report that a recently published book on the “Future of Truth” contains AI-fabricated material.