AI Health

Friday Roundup

The AI Health Friday Roundup highlights the week’s news and publications related to artificial intelligence, data science, public health, and clinical research.

May 31, 2024

In this week’s Duke AI Health Friday Roundup: sizing up GPT-4 with retrieval-augmented generation; progress in stretchable RF antennas; sociotechnical frameworks for AI; creating “assembloids” of organoids to explore complex biological systems; evaluating clinical text datasets for LLM training; legal and ethical challenges for using LLMs in medicine; steps toward tackling replicability problems in scientific research; the imperative for informing trial participants about results; much more:


Image credit: Rick Payne and team / Better Images of AI / Ai is... Banner / CC-BY 4.0
  • “GPT-4 with RAG provided correct responses in 84% of cases (of 218 statements, 184 were correct, 30 were inaccurate, and 4 were wrong). GPT-4 without RAG provided correct responses in only 57% of cases (of 163 statements, 93 were correct, 29 were inaccurate, and 41 were wrong). We showed that GPT-4, when enhanced with additional clinical information through RAG, can accurately identify detailed similarities and disparities in diagnostic and treatment proposals across different authoritative sources.” A case study published in the journal NEJM AI by Ferber and colleagues compares the use of the GPT-4 large language model in retrieving information from oncology clinical practice guidelines, both with and without an approach called retrieval-augmented generation (RAG).
  • “Data available in national electronic health databases can be used to approximate cancer risk factors and enable risk predictions in most cancer types. Model predictions generalise between the Danish and UK health-care systems. With the emergence of multi-cancer early detection tests, electronic health record-based risk models could supplement screening efforts.” An article published in Lancet Digital Health by Jung and colleagues demonstrates the use of large national patient registry data resources in cancer risk modelling.
  • “…following their sight-restoring surgeries, the removal of color cues markedly reduced their recognition performance, whereas age-matched normally sighted children showed no such decrement. This finding may be explained by the greater-than-neonatal maturity of the late-sighted children’s color system at sight onset, inducing overly strong reliance on chromatic cues. Simulations with deep neural networks corroborate this hypothesis. These findings highlight the adaptive significance of typical developmental trajectories and provide guidelines for enhancing machine vision systems.” A research article published in Science by Vogelsang and colleagues sheds light on how reliance on color cues shapes the development of visual recognition in children and may offer insights into the development of machine-vision algorithms.
  • “We argue that two general problems are: (a) difficulties of analyzing data with multilevel structure and (b) statistical problems and lack of replication in the literature. We demonstrate with the example of a recently published claim that altering patients’ subjective perception of time can have a notable effect on physical healing. We discuss ways of avoiding or at least reducing such problems when conducting and reporting research.” A preprint by Gelman and Brown, available from PsyArXiv, attempts to pinpoint the challenges that result in persistent replicability problems in the scientific literature.
  • “To our knowledge, this is the first systematic review to characterize publicly available clinical text datasets, the foundation of clinical LLMs, highlighting the difficulty in accessibility, underrepresentation across regions and languages, and the challenges posed by the LLMs. Sharing diversified and large-scale clinical text data is necessary, with protection to promote health care research.” A review article by Wu and colleagues, published in NEJM AI, surveys the current state of clinical text datasets that are available for training medical AI models.
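
The accuracy figures quoted in the first item above follow directly from the reported response counts. As a quick check (counts are taken from the quote; the helper function is purely illustrative):

```python
# Reproducing the accuracy percentages quoted from the Ferber et al.
# case study, using the response counts given in the quote above.
def accuracy_pct(correct: int, total: int) -> int:
    """Percent of correct responses, rounded to a whole percent."""
    return round(100 * correct / total)

# With RAG: 184 correct of 218 statements (184 + 30 + 4 = 218)
with_rag = accuracy_pct(correct=184, total=218)      # → 84
# Without RAG: 93 correct of 163 statements (93 + 29 + 41 = 163)
without_rag = accuracy_pct(correct=93, total=163)    # → 57
```
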


Image credit: Jesse Orrico/Unsplash
  • “Defects in the [blood–brain] barrier have been implicated in genetic diseases that cause neurological deficits…To replicate it, Lan Dao at Cincinnati Children’s Hospital Medical Center in Ohio and her colleagues combined blood-vessel organoids with brain organoids. The blood vessels grew into the brain tissue, creating networks of capillaries. The authors found that the same types of cell were present in the same places in both the assembloid and the human blood–brain barrier.” A news article in Nature describes recent work by Dao and colleagues in creating “assembloids” that allow the in vitro study of more complex biological processes.
  • “Here we present strain-invariant stretchable RF electronics capable of completely maintaining the original RF properties under various elastic strains using a ‘dielectro-elastic’ material as the substrate. Dielectro-elastic materials have physically tunable dielectric properties that effectively avert frequency shifts arising in interfacing RF electronics. Compared with conventional stretchable substrate materials, our material has superior electrical, mechanical and thermal properties that are suitable for high-performance stretchable RF electronics.” In a research article published in Nature, Kim and colleagues describe progress in producing stretchable electronics for radiofrequency communication and networking.
  • “By studying electrical patterns in the brain, Buzsáki seeks to understand how our experiences are represented and saved as memories. New studies from his lab and others have suggested that the brain tags experiences worth remembering by repeatedly sending out sudden and powerful high-frequency brain waves. Known as ‘sharp wave ripples,’ these waves, kicked up by the firing of many thousands of neurons within milliseconds of each other, are ‘like a fireworks show in the brain,’ said Wannan Yang, a doctoral student in Buzsáki’s lab who led the new work…” Quanta’s Yasemin Saplakoglu explores recent advances in understanding the electrical processes the brain uses to flag experiences for storage as long-term memories.

Communication, Health Equity & Policy

Image credit: Fritzchens Fritz / Better Images of AI / GPU shot etched 2 / CC-BY 4.0
  • “A sociotechnical approach recognizes that a technology’s real-world safety and performance is always a product of technical design and broader societal forces, including organizational bureaucracy, human labor, social conventions, and power. As this brief illustrates, policymakers’ approach to observe and understand AI — and their tools to regulate it — must be just as expansive.” A white paper published this week by Data & Society’s Brian J. Chen and Jacob Metcalf presents a case for viewing AI through a sociotechnical frame (and uses Duke’s SepsisWatch program as a case study).
  • “Research has shown that clinical investigators and participants generally support providing aggregate research findings to those participants who want them. However, doing so is still the exception rather than the rule. We propose a framework for what information should be summarized for participants, how results should be summarized and when.” In a Comment article published in Nature Medicine, Dal-Ré and colleagues make an EU/UK-focused (but globally applicable) argument for a greater focus on disseminating usable summaries of trial results to the people who volunteer for those clinical trials.
  • “The RAG approach attempts to mitigate LLM knowledge gaps and hallucination tendencies by anchoring model responses in authoritative sources, like a publisher’s reference or research corpus. Nevertheless, the effectiveness of the RAG [retrieval-augmented generation] pattern is inherently limited by the quality of information it can retrieve and integrate into the LLM’s context window, where the LLM does in-context, or just-in-time learning. Retrieving specific context relative to the user query in order to generate accurate and high-quality, grounded responses, remains a challenging problem, especially in specialized and fast-changing fields.” A guest post by Silverchair Chief Technology Officer Stuart Leitch at Scholarly Kitchen lays out some of the complexities of using conversational LLM interfaces for information retrieval in high-stakes scenarios.
  • “…we highlight ethical concerns stemming from the perspectives of users, developers, and regulators, notably focusing on data privacy and rights of use, data provenance, intellectual property contamination, and broad applications and plasticity of LLMs. A comprehensive framework and mitigating strategies will be imperative for the responsible integration of LLMs into medical practice, ensuring alignment with ethical principles and safeguarding against potential societal risks.” A perspective article by Ong and colleagues, published in Lancet Digital Health, highlights the need to address the legal and ethical implications of using large language models in medicine.
  • “With the advent of generative AI, all of us in the scientific community have a responsibility to be proactive in safeguarding the norms and values of science. That commitment—together with the five principles of human accountability and responsibility for the use of AI in science and the standing up of the council to provide ongoing guidance—will support the pursuit of trustworthy science for the benefit of all.” A position paper by Blau and colleagues published in the Proceedings of the National Academy of Sciences lays out a series of recommended steps for ensuring research integrity amid rapid adoption of generative AI in research and publishing.
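
The RAG pattern described in the Scholarly Kitchen item above can be sketched in a few lines: retrieve passages relevant to the query, then place them in the model’s context window so the LLM can answer “in context.” Everything below — the toy corpus, the keyword-overlap scorer standing in for embedding-based retrieval, and the prompt template — is an illustrative assumption, not the actual system the post describes:

```python
# Minimal sketch of the retrieval-augmented generation (RAG) pattern.
def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank passages by naive keyword overlap with the query
    (a toy stand-in for the embedding-based retrieval a real
    system would use)."""
    q_terms = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda p: len(q_terms & set(p.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, passages: list[str]) -> str:
    """Assemble the grounded prompt handed to the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the sources below.\n"
        f"Sources:\n{context}\n"
        f"Question: {query}"
    )

# Hypothetical mini-corpus for illustration only.
corpus = [
    "Guideline A recommends adjuvant therapy for stage III disease.",
    "Guideline B lists surveillance intervals after resection.",
    "An unrelated note about journal subscription tiers.",
]
query = "What does guideline A recommend for stage III disease?"
prompt = build_prompt(query, retrieve(query, corpus))
```

A production system would pass the assembled prompt to an LLM; the quoted passage’s point is that answer quality is bounded by what this retrieval step surfaces, which is why RAG remains hard in specialized, fast-moving fields.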