AI Health

Friday Roundup

The AI Health Friday Roundup highlights the week’s news and publications related to artificial intelligence, data science, public health, and clinical research.

November 8, 2024

In this week’s Duke AI Health Friday Roundup: the promise of AI agents in discovery science; new mechanism for MRSA resistance found; LLMs for matching patients with clinical trials; the science behind cats’ love of tuna; memorization vs reasoning in LLMs; deep phenotyping illuminates sex-based differences in aging; data sleuths dig into dodgy science; weighing different definitions for Alzheimer disease; much more:

AI, STATISTICS & DATA SCIENCE

Pixelated images of pregnant women running on a race track, dynamically posed but distorted due to limited data availability. They are holding cards that read “error cannot generate,” symbolizing the struggle to accurately represent such dynamic figures due to inadequate online resources. Image credit: Amritha R Warrier & AI4Media / Better Images of AI / error cannot generate / CC-BY 4.0
  • “We found that LLMs could interpolate the training puzzles (achieving near-perfect accuracy) after fine-tuning, yet fail when those puzzles are slightly perturbed, suggesting that the models heavily rely on memorization to solve those training puzzles. On the other hand, we show that while fine-tuning leads to heavy memorization, it also consistently improves generalization performance. In-depth analyses with perturbation tests, cross difficulty-level transferability, probing model internals, and fine-tuning with wrong answers suggest that the LLMs learn to reason on K&K puzzles despite training data memorization.” A research paper by Xie and colleagues, available as a preprint from arXiv, applies a venerable logic puzzle to evaluate whether a given large language model is answering questions based on memorization or on the application of logical reasoning.
  • “Biomedical research is undergoing a transformative era with advances in computational intelligence. Presently, AI’s role is constrained to assistive tools in low-stake and narrow tasks where scientists can review the results. We outline agent-based AI to pave the way for systems capable of reflective learning and reasoning that consist of LLM-based systems and other ML tools, experimental platforms, humans, or even combinations of them. The continual nature of human-AI interaction and building trustworthy sandboxes, where AI agents can fail and learn from their mistakes, is one way to achieve this.” In a perspective article published in the journal Cell, Gao and colleagues examine in detail the prospects for using AI “agents” to accelerate discovery science in biomedical fields (h/t @EricTopol).
  • “Our study primarily demonstrates the utility of applying AI to CAC scans to extract more actionable information than currently reported, which is the Agatston CAC score only. We found that AI volumetry significantly improves upon traditional CAC scoring for the prediction of risk for total CVD events as well as the prediction of individual CVD events of HF, stroke, AF, and all-cause mortality in a large multi-ethnic cohort.” A research article published by Naghavi and colleagues in npj Digital Medicine presents findings from a study that explored the use of AI to predict cardiovascular events based on data contained in coronary artery calcium scans.
  • “…we developed and validated an end-to-end pipeline for matching patients to clinical trials based on inclusion and exclusion criteria and unstructured patient notes from real-world EHRs. Our findings indicate that leveraging LLMs, with carefully implemented controls, could significantly shift the paradigm for accruing and enrolling eligible patients in clinical trials. Unlike existing processes that rely on time and personnel-intensive manual EHR review, the proposed workflow-based platform can enhance clinical trial efficiency and improve cancer care.” A research article by Gupta and colleagues, published in npj Digital Medicine, presents a large language model-based system that uses electronic health record data to identify patients who may be candidates for specific clinical trials.

BASIC SCIENCE, CLINICAL RESEARCH & PUBLIC HEALTH

A grey tabby cat walks along the top of a wall on a quayside, with a harbor and boats out of focus in the background. Image credit: Wren Meinberg/Unsplash
  • “Tuna (or any seafood for that matter) is an odd favorite for an animal that evolved in the desert. Now, researchers say they have found a biological explanation for this curious craving. In a study published this month in Chemical Senses, scientists report that cat taste buds contain the receptors needed to detect umami—the savory, deep flavor of various meats, and one of the five basic tastes in addition to sweet, sour, salty, and bitter. Indeed, umami appears to be the primary flavor cats seek out….the team also found these cat receptors are uniquely tuned to molecules found at high concentrations in tuna, revealing why our feline friends seem to prefer this delicacy over all others.” A news article in Science by David Grimm unpacks recent research exploring just why, exactly, cats are so fond of tuna fish.
  • “We have revealed that the high-level resistance to β-lactam antibiotics exhibited by some MRSA strains is linked to an alternative mode of cell division set within the context of wider physiological adaptations…Our study has revealed insights into antibiotic resistance and facets of cell division in S. aureus. It is by studying these processes in tandem that we can understand basic mechanisms of the bacterial cell cycle and reveal ways to control antibiotic resistance.” A research article published in Science by Adedeji-Olulana and colleagues describes a new mechanistic component in the development of antibiotic-resistant MRSA.
  • “…we trained and evaluated the BA models on males and females separately. Instead of being based on a single biomarker, our models are based on a group of biomarkers that share a common physiological basis and were measured in a large representative population. We identified profound differences in the model’s performance when compared across profiled systems and between males and females. The majority of the models performed better among women.” A research paper by Reicher and colleagues, published in Nature Aging, presents findings from a deep phenotyping analysis that finds key differences in how men and women undergo aging.

COMMUNICATION, HEALTH EQUITY & POLICY

Black and white photo of a man wearing a Sherlock-Holmes-style tweed jacket and deerstalker cap, holding up a magnifying glass that is out of focus in the foreground. Image credit: Andres Siimon/Unsplash
  • “A small, tight-knit community of scientific sleuths has been unearthing growing evidence that many studies, including landmark papers published in top journals, contain manipulated images and falsified findings. These revelations have led to high-profile investigations, raised concerns about clinical trials, and culminated in the departure of university presidents…Their work — often posted on PubPeer, a website where users comment on published studies — has also forced a reckoning around how the crushing pressure to publish splashy results incentivizes fraud, a conversation research integrity experts say is both deeply uncomfortable and long overdue.” STAT News’ Jonathan Wosen profiles a series of “sleuths” whose efforts – sometimes entirely voluntary – are revealing examples of shoddy science and outright research fraud.
  • “The UW researchers tested three open-source, large language models (LLMs) and found they favored resumes from white-associated names 85% of the time, and female-associated names 11% of the time. Over the 3 million job, race and gender combinations tested, Black men fared the worst with the models preferring other candidates nearly 100% of the time….Why do machines have such an outsized bias for picking white male job candidates? The answer is a digital take on the old adage ‘you are what you eat.’” GeekWire’s Lisa Stiffler reports on a recent study that found that large language models tasked with screening resumes exhibited notable bias.
  • “The AA group distinguishes between a disease and an illness, whereas the IWG group does not. As such, although the primary disagreement between the groups is semantic, the ramifications of the labeling can be significant… both positions recognize the importance of the underlying biological substrate and the progressive clinical states of the patients, as well as how these concepts are translated into the clinical setting. Both recognize the importance of biomarkers and the advancement of therapies for the underlying pathophysiology. However, the semantic differences are important for communication with patients and families and need to be explicated by the clinician.” An editorial by Petersen, Mormino, and Schneider published in JAMA Neurology unpacks points of convergence and disagreement between two competing definitions of Alzheimer disease.