AI Health Friday Roundup

The AI Health Friday Roundup highlights the week’s news and publications related to artificial intelligence, data science, public health, and clinical research.

September 6, 2024

In this week’s Duke AI Health Friday Roundup: AI sorts through chemical libraries to find drug candidates; fibrin-spike protein interaction implicated in COVID inflammation; AI’s artistic prospects; short-circuiting damage from retracted papers; improving clinical trials infrastructure; more insight needed on AI implementation experiences; sifting climate policy with machine learning; phages hitch a ride on tiny worms; testing ChatGPT’s performance as a source of information for patients; large research teams put some scientists at a career disadvantage; much more:

AI, STATISTICS & DATA SCIENCE

Photograph down a narrow, book-crowded corridor in the stacks of a library, with semi-orderly heaps of books in the foreground and shelved books reaching to the ceiling. Image credit: Glen Noble/Unsplash
  • “I was astounded by the speed and scale of research in the chemical library. I also sensed its limits. A truly comprehensive library of potential drugs would require labs like this on every floor of every building in the world—and then some. A repository of that size would be unfathomable and practically unsearchable. But A.I. assistance may be able to weed out the least interesting chemicals, helping to steer scientists toward the most promising parts of the collection.” In an article for the New Yorker, Dhruv Khullar takes readers on a guided tour of AI-powered drug design.
  • “A.I.’s boosters envision a near future where everyone in your life — your doctor, your landlord, the government — will use A.I. to help make important decisions…In that world, what A.I. says about us matters — not just for vanity. And if the doomers are right, and these systems eventually grow powerful enough to carry out plans of their own, I’d rather not be first on the revenge list…Eventually, I realized that if I wanted to fix my A.I. reputation, I needed to enlist some experts.” The New York Times’ Kevin Roose charts a personal odyssey of trying to persuade hostile online chatbots to change their minds about him.
  • “…the current disparity between the abundance of AI research and the scarcity of evidence on real-world impact underscores the urgent need for comprehensive clinical effectiveness evaluations. These evaluations must go beyond model validation to explore the real-world effectiveness of AI models in clinical settings, especially because so few have gone on to show any meaningful impact. The importance of local context in AI model validation and impact assessment cannot be overstated.” In a perspective article published in NEJM AI, Longhurst and colleagues call for a focus on implementation science principles in evaluating AI systems deployed in clinical settings.
  • “Although ChatGPT showed potential to provide patients with breast cancer accurate and clinically concordant information, 24% of the time the responses provided inaccurate information and 41% of the time the responses cited references that did not exist. Furthermore, whereas each series of prompts started with the statement ‘I am a patient,’ and requested that the responses should be for the patient, the responses provided were not at an appropriate patient reading level. In fact, none of the responses were at the recommended sixth-grade reading level, and the lowest grade level was eighth grade.” A research article published in Cancer by Park and colleagues evaluates the suitability of generative AI applications (in this case, ChatGPT 3.5) as a source of medical information for patients with breast cancer.
  • “We find that all 4 LLMs evaluated generated inappropriate responses to our prompt set. LLM performance is strongly hampered by learned anti-LGBTQIA+ bias and over-reliance on the mentioned conditions in prompts. Given these results, future work should focus on tailoring output formats according to stated use cases, decreasing sycophancy and reliance on extraneous information in the prompt, and improving accuracy and decreasing bias for LGBTQIA+ patients and care providers.” A research article by Chang and colleagues, available as a preprint from medRxiv, evaluates the presence and impact of anti-LGBTQIA+ biases in large language models in healthcare contexts.
  • “What I’m saying is that art requires making choices at every scale; the countless small-scale choices made during implementation are just as important to the final product as the few large-scale choices made during the conception. It is a mistake to equate ‘large-scale’ with ‘important’ when it comes to the choices made when creating art; the interrelationship between the large scale and the small scale is where the artistry lies.” Writing for the New Yorker, science fiction luminary Ted Chiang interrogates whether AI can create art.

BASIC SCIENCE, CLINICAL RESEARCH & PUBLIC HEALTH

Black and white photograph of iron scaffolding and late-day shadows against a concrete wall. Image credit: Ricardo Gomez Angel/Unsplash
  • “Clinical trial design has advanced steadily over several decades and now includes the use of complex elements, such as master protocols, adaptive platform designs, external control arms, participants in multiple countries, and advanced statistical methods. Yet the infrastructure for clinical trial data collection has evolved very little. As technology has changed our interactions with data in modern society, opportunities to improve the infrastructure—meaning the ways we access, acquire, and use data in clinical trials—have become more evident.” A special communication published in JAMA by Franklin and colleagues makes a detailed case for revamping the national infrastructure for conducting clinical trials.
  • “…scaling up good-practice policies identified in this study to each sector of other parts of the world can in the short term be a powerful climate mitigation strategy. However, even if all countries in our sample were able to replicate past success, more than four times (one and a half times) the effort witnessed so far would have to be exerted to close the emissions gap. This also highlights the need for research providing systematic evidence on which climate policy mixes are most powerful in spurring the necessary deployment and development of low-carbon technologies.” A research article published in Science by Stechemesser and colleagues presents results from a study that used a machine-learning approach to assess the effectiveness of an array of climate policies aimed at curbing carbon emissions.
  • “…we show that fibrin binds to the SARS-CoV-2 spike protein, forming proinflammatory blood clots that drive systemic thromboinflammation and neuropathology in COVID-19….A monoclonal antibody targeting the inflammatory fibrin domain provides protection from microglial activation and neuronal injury, as well as from thromboinflammation in the lung after infection. Thus, fibrin drives inflammation and neuropathology in SARS-CoV-2 infection, and fibrin-targeting immunotherapy may represent a therapeutic intervention for patients with acute COVID-19 and long COVID.” A research article by Ryu and colleagues, published last month in Nature, sheds light on a mechanism implicated in multiple facets of both acute and long COVID.
  • “When the scientists placed infected and uninfected bacteria 2 centimeters apart in the containers, nothing happened. But when they added the common soil nematodes Caenorhabditis elegans or C. remanei to the mix, the phages began to kill the bacteria within a few days. Somehow the viruses had traversed the 2 centimeters separating the viruses from their quarries—a distance about 1 million times their own size, Van Sluijs says.” In a news article for Science, Elizabeth Pennisi describes recent research showing how soil phages “hitchhike” on nematodes to infect distant bacteria.

COMMUNICATION, HEALTH EQUITY & POLICY

Photograph of a stop sign at a four-way city intersection. Image credit: John Matychuk/Unsplash
  • “…retraction doesn’t mean that all the papers that cited the retracted article are now unreliable, too — but some might be. At a minimum, researchers should be aware of any retractions among the studies that they have cited. This would enable them to assess potential negative effects on their own work, and to mention the relevant caveats clearly should they continue to cite the retracted paper in the future.” In a commentary published in Nature, Guillaume Cabanac proposes measures that could help prevent the continued percolation of retracted papers through the scientific literature.
  • “A key problem with increasing the secondary license market for academic texts by extending licenses to incorporate those texts into AI tools is it has the potential to break the reputation ecosystem that supports scholarly publishing. Much of this secondary compensation market for researchers relies on the world of citation…. However, if the market for attribution is a key driver motivating authorship, and AI tools are generating content or providing search results without attribution, this could become a significant challenge in our community.” In an essay at Scholarly Kitchen, Todd Carpenter examines why the details matter when licensing scholarly publications for use by the AI industry.
  • “…fortunately, AI doomers and ethicists have more common ground than they may think. From our research into diverse AI safety and governance issues and conversations with advocates across the spectrum of concern, we believe several AI policies can satisfy both doomers and ethicists on their own terms—and help draw them together for greater impact. None of these policies will necessarily be easy to achieve, and they don’t amount to a comprehensive agenda for governing AI and its challenges. But in our view, each has the important advantage of potentially appealing to advocates with diverse perspectives on AI risk and harm.” In a post at the Lawfare blog, Zachary Arnold and Helen Toner identify potential common ground on which different camps within the AI world could converge toward a regulatory approach.
  • “Early-career academics are facing an increasing squeeze in the hunt for tenure-track positions and funding—and despite widespread discussion, solutions have proved evasive. Now, a new study puts a finger on a major contributor: Research teams have grown, doubling from 1.8 authors per paper on average in 1970 to 3.6 in 2004.” In a news article for Science, Katie L. Burke unpacks findings from a recent study that suggests larger research teams may put some members at a career disadvantage.