AI Health
Friday Roundup
The AI Health Friday Roundup highlights the week’s news and publications related to artificial intelligence, data science, public health, and clinical research.
October 6, 2023
In this week’s Duke AI Health Friday Roundup: how AI affects clinical productivity; mRNA insights garner Nobel prize; deep learning predicts variation in proteins; US continues to lose ground in health, life expectancy; sitting is still bad for you; surveying algorithmic bias mitigation; antiracist approaches to clinical documentation; the surveillance and human labor interwoven into AI systems; the LLM hype cycle: peaks and troughs; much more:
AI, STATISTICS & DATA SCIENCE
- “AlphaMissense predictions may illuminate the molecular effects of variants on protein function, contribute to the identification of pathogenic missense mutations and previously unknown disease-causing genes, and increase the diagnostic yield of rare genetic diseases. AlphaMissense will also foster further development of specialized protein variant effect predictors from structure prediction models.” In a research article published in Science, Cheng and colleagues describe AlphaMissense, a deep learning model developed from AlphaFold that can be used to predict small changes in proteins that can affect their structure and behavior.
- “In B-PRODUCTIVE, specialists reported that autonomous AI allowed them to focus their time on more complex cases, as reflected in the mean complexity score in the intervention group, which was significantly higher than in the control group. Given the large proportion of patients who were able to avoid the wait to see a specialist as a result of receiving their examination from the AI, the net effect of the autonomous AI visible to patients was to reduce wait time.” A research article published in NPJ Digital Medicine by Abramoff and colleagues presents results from a cluster-randomized trial that evaluated the effect of an FDA-cleared autonomous AI system on clinic productivity.
- “…we created an automated pipeline using GPT-4 to provide comments on the full PDFs of scientific papers. We evaluated the quality of GPT-4’s feedback through two large-scale studies. We first quantitatively compared GPT-4’s generated feedback with human peer reviewer feedback in 15 Nature family journals (3,096 papers in total) and the ICLR machine learning conference (1,709 papers). The overlap in the points raised by GPT-4 and by human reviewers (average overlap 30.85% for Nature journals, 39.23% for ICLR) is comparable to the overlap between two human reviewers (average overlap 28.58% for Nature journals, 35.25% for ICLR). The overlap between GPT-4 and human reviewers is larger for the weaker papers (i.e., rejected ICLR papers; average overlap 43.80%).” A research paper by Liang and colleagues, available as a preprint from arXiv, explores the use of the GPT-4 large language model to provide automated feedback on scientific papers.
- “Just 10 years ago, no machine could reliably provide language or image recognition at a human level. However, AI systems have become much more capable and are now beating humans in these domains, at least in some tests.” A chart produced for Our World in Data provides a visual distillation of progress in AI language and image recognition over two-plus decades.
- An upcoming free webinar hosted by NEJM AI offers a look at clinical use cases for large language models, as well as how to evaluate model performance.
- “Understanding the influence of computational infrastructure on the political economy of artificial intelligence is profoundly important: it affects who can build AI, what kind of AI gets built, and who profits along the way. It defines the contours of concentration in the tech industry, incentivizes toxic competition among AI firms, and deeply impacts the environmental footprint of artificial intelligence… creates systemic harms when systems fail or malfunction due to the creation of single points of failure. Most concerningly, it expands the economic and political power of the firms that have access to compute, cementing the control of firms that already dominate the tech industry.” A new report from New York University’s AI Now Institute examines the sociopolitical implications of “compute” – the computing power and the infrastructure that undergirds it – and who has, or lacks, access to it.
BASIC SCIENCE, CLINICAL RESEARCH & PUBLIC HEALTH
- “In this study, there was a nonlinear relationship between mean daily sedentary behavior time and incident dementia, with risks increasing after approximately 10 hours per day…. In contrast, mean daily sedentary behavior time remained significantly associated with incident dementia when adjusting for patterns of sedentary behavior (mean and maximum daily sedentary bout lengths).” Bad news for modern humans: In a study published in JAMA by Raichlen and colleagues, sedentary behavior – of the type typical of modern office jobs, among others – was associated with increased risk of developing dementia, and exercise did not appear to offset that risk.
- “In 2005, while working together at the University of Pennsylvania, Karikó and Weissman discovered a way to slightly tweak the nucleotide sequence of the mRNA molecules so that they could sneak past cellular immune surveillance and avoid kicking up a massive inflammatory response. They went on to show in 2008 and 2010 that modified mRNA molecules could produce high levels of proteins.” Quanta’s Yasemin Saplakoglu reports on this year’s awarding of the Nobel Prize in Physiology or Medicine to Katalin Karikó and Drew Weissman. The two scientists, who were collaborators on the work cited in the award, developed the techniques for modifying mRNA to allow it to be used in vaccines.
- “Sickness and death are scarring entire communities in much of the country. The geographical footprint of early death is vast: In a quarter of the nation’s counties, mostly in the South and Midwest, working-age people are dying at a higher rate than 40 years ago, The Post found. The trail of death is so prevalent that a person could go from Virginia to Louisiana, and then up to Kansas, by traveling entirely within counties where death rates are higher than they were when Jimmy Carter was president.” More bad news: the Washington Post reports on the ongoing crisis of chronic disease in the US that is driving a continuing decline in life expectancy.
- “BlueWalker 3 is the most brilliant recent addition to a sky that is already swarming with satellites. The spaceflight company SpaceX alone has launched more than 5,000 satellites into orbit, and companies around the globe have collectively proposed launching more than half a million satellites in the coming years — a scenario that astronomers fear could hamper scientific observations of the Universe.” Nature’s Shannon Hall reports on the advent of BlueWalker 3, a large communications satellite whose reflected light is so bright that it outshines many of the brighter stars in the night sky.
COMMUNICATION, HEALTH EQUITY & POLICY
- “It seems clear that GPT and LLMs in general are at or near the peak of the hype cycle. Fears that ChatGPT is intelligent and may plot against humankind are premature; hopes that LLMs alone could radically change how our and other industries work are similarly overblown. The transformer models that underlie the LLMs that have made headlines over the last 12 months are an important step forward in language generation, and will play a part in the future of AI, but they’re only one part of a broader range of technologies.” In an essay at Scholarly Kitchen, Phill Jones attempts to clear away the smoke from AI’s latest LLM-driven hype cycle to offer a glimpse of what to expect in the near future as these technologies make contact with the real world.
- “We identified a wide range of technical, operational, and systemwide bias mitigation strategies for clinical algorithms, but there was no consensus in the literature on a single best practice that covered entities could employ to meet the HHS requirements. Future research should identify optimal bias mitigation methods for various scenarios, depending on factors such as patient population, clinical setting, algorithm design, and types of bias to be addressed.” In a paper recently published in a theme issue of Health Affairs, Cary and colleagues present a scoping review of bias mitigation techniques applied to healthcare algorithms.
- “There’s no way to make these systems without human labor at the level of informing the ground truth of the data — reinforcement learning with human feedback, which again is just kind of tech-washing precarious human labor. It’s thousands and thousands of workers paid very little, though en masse it’s very expensive, and there’s no other way to create these systems, full stop,” she explained. “In some ways what we’re seeing is a kind of Wizard of Oz phenomenon, when we pull back the curtain there’s not that much that’s intelligent.” TechCrunch’s Devin Coldewey reports on conference remarks from Signal’s Meredith Whittaker as she highlights the layers of surveillance that power modern applications of AI.
- “Documentation of race in patients’ charts can trigger a cascade of effects. Clinicians read (or copy and paste) previous notes to inform their understanding of a patient’s health and illness, and they may carry forward any flawed, discriminatory descriptions, heuristics, or beliefs the notes contain. Depending on how racial identity is employed and contextualized in the chart, clinicians may perpetuate the codification of racial disparities in service delivery, teaching trainees racist ideas and communicating biases to other clinicians.” In an article at the New England Journal of Medicine, Williams and colleagues explore antiracist approaches to documenting patient encounters, and examine why they are important to downstream clinical decision-making.