February 23, 2024

In this week’s Duke AI Health Friday Roundup: few-shot learning powers drug interaction model; how LLMs pick up new skills (and why it matters); gene-swapped bananas are bulwark against fungal foe; new papers build on trove of NIH All of Us genetic data; parsing recently dismissed lawsuit over EHR data; pressure builds for definitive path on AI regulation; training language models to build proteins; a really big PDF; much more:


Three groups of icons representing people have shapes travelling between them and a page in the middle of the image. The page is a simple rectangle with straight lines representing data. The shapes traveling towards the page are irregular and in squiggly bands. Image credit: Yasmine Boudiaf & LOTI / Better Images of AI / CC-BY 4.0
  • “…we focus on a class of machine learning models that operate only on sequences yet capture the structural and functional properties of proteins. Protein language models (PLMs) are trained on vast datasets of protein sequences spanning the evolutionary tree of life. From these sequences, PLMs learn the underpinnings of protein structure and function, enabling a wide range of protein modeling and design tasks.” A review article published in Nature Biotechnology by Ruffolo and Madani offers a primer on how to use large language models to engineer novel proteins.
  • “The trio at Stanford who cast emergence as a ‘mirage’ recognize that LLMs become more effective as they scale up; in fact, the added complexity of larger models should make it possible to get better at more difficult and diverse problems. But they argue that whether this improvement looks smooth and predictable or jagged and sharp results from the choice of metric — or even a paucity of test examples — rather than the model’s inner workings.” In an article for Quanta, Stephen Ornes looks at attempts by researchers to understand how large language models acquire seemingly new skills and capabilities – and the degree to which the choice of measurement or benchmark may influence that perception.
  • “LLMs can offer a promising alternative approach for biological inference, particularly in cases where structured data and sample size are limited, by extracting prior knowledge from text corpora. Here we report our proposed few-shot learning approach, which uses LLMs to predict the synergy of drug pairs in rare tissues that lack structured data and features. Our experiments, which involved seven rare tissues from different cancer types, demonstrate that the LLM-based prediction model achieves significant accuracy with very few or zero samples.” A research article by Li and colleagues published in NPJ Digital Medicine describes CancerGPT – a transformer trained via a few-shot learning strategy to predict synergistic reactions in drug pairs with performance on par with a fine-tuned GPT model with orders of magnitude more parameters.
  • “The new rules make just two changes, but they add up to major concerns. First, starting in August, new projects will be able to access data only through CMS’ cloud environment rather than the current practice of storing the data on highly secure computing infrastructure at research institutions. In principle, this doesn’t sound so bad. In practice, the costs associated with each “seat” at the virtual table severely limits the number of researchers who can use the data.” A STAT News opinion article by UPenn professor Rachel W. Werner warns of a potentially stifling effect on research from recent restrictions on data access imposed by CMS.
  • “Van Brummelen suggests that Bianchini’s schooling in economics might have been key to his invention, because he wasn’t embedded in sexagesimal numbers from early in his career, as other astronomers were. But his approach was perhaps too revolutionary to catch on at first. ‘In order to understand what Bianchini was doing, you had to learn a completely new system of arithmetic,’ he says.” At Nature, Jo Marchant reports on a recent discovery that suggests a Venetian mathematician may have invented the decimal point more than a century earlier than previously thought.


Photograph showing bunches of bananas stacked on top of each other, filling the entire photographic frame. Image rotated 90 degrees clockwise from original orientation. Image credit: Rodrigo dos Reis/Unsplash
  • Bananas! In this case, genetically engineered Cavendish bananas, designed to resist the onslaught of a currently incurable fungal plague affecting banana trees. ABC News has the story: “A genetically-modified (GM) banana is a step closer to commercial reality as Queensland scientists gain regulatory approval to release a GM variety of Cavendish banana for human consumption….Scientists say the QCAV-4 variety is the world’s first genetically modified banana and will be the first GM fruit approved by the federal government for growing in Australia. ….While scientists say they will be safe to eat, the GM variety will be considered a ‘back-up option’ in the fight against Panama Tropical Race 4 (TR4), as it is nearly immune to the disease.”
  • “Analyses of up to 245,000 genomes gathered by the All of Us programme, run by the US National Institutes of Health in Bethesda, Maryland, have uncovered more than 275 million new genetic markers, nearly 150 of which might contribute to type 2 diabetes. The work has also identified gaps in genetics research on non-white populations.” A news article at Nature by Max Kozlov encapsulates a handful of papers published in several different Nature journals this week that feature a trove of new science emerging from the NIH All of Us research project.
  • “Phage therapy has been around for more than 100 years, but with the advent of antibiotics and a possible influence of an unfavorable review published in the Journal of the American Medical Association in the early 1930s, phages fell out of favor and became a distant memory…That all changed in 2016, when HIV researcher Tom Patterson, infected with antibiotic-resistant Acinetobacter baumannii from a trip some months earlier to Egypt, was cured with phage therapy.” STAT News’ Deborah Balthazar reports on how phage therapy – sometimes featuring as a last-ditch defense against resistant microbial infections – is the focus of growing attention from researchers looking to conduct clinical trials with it.
  • “Why does social media use seem to trigger mental health problems? Why are those effects unevenly distributed among different groups, such as girls or young adults? And can the positives of social media be teased out from the negatives to provide more targeted guidance to teens, their caregivers and policymakers?” At Science News, Sujata Gupta surveys the mounting evidence that a saturation diet of social media is having harmful effects on youth and examines the complexities of countering them effectively.

COMMUNICATION, HEALTH EQUITY & POLICY

Several small cartoon representations of people are linked by brown hand-drawn lines representing a network. Image credit: Jamillah Knowles & Reset.Tech Australia / © Better Images of AI / CC-BY 4.0
  • “The director reassured me that I wasn’t alone in this journey, saying, ‘Yaowu, remember, we are a team; we struggle and succeed together.’ I quickly learned that the research group had a culture of giving and receiving feedback. And in the months that followed, my perspective shifted from seeing feedback as a form of criticism to seeing it as a valuable source of professional growth.” An essay in Science by Yaowu Zhang explores the challenges of academic writing and publishing in a second language from a personal perspective.
  • “There remains no question that hospitals and physicians participating in medical AI research and development are still responsible for complying with biomedical, ethical, and legal rules, including HIPAA. There is also no doubt that courts will continue to encounter cases involving sharing health data with technology companies like Google.” A legal analysis published in JAMA by Duffourc and Gerke examines the recent dismissal of a lawsuit filed against Google regarding the use of data from patient electronic health records and looks at some of the potential implications of the ruling.
  • “…as Washington forges ahead, founders say they’re in the dark about who will be regulated and how — and they’re urging policymakers to offer clarity….As they race for a slice of the $4 trillion U.S. health care market, AI founders and investors say regulatory uncertainty is a hurdle, forcing them to build more slowly and meticulously document for fear of potential audits. That makes it harder to keep up with the tech industry’s breakneck pace, they tell STAT.” STAT News’ Mohana Ravindranath reports (log-in required) on the growing clamor within the AI industry for a clear indication of what kind of federal regulations for AI are likely to crystallize out of recent plans for implementing oversight of algorithmic technologies.
  • Since the dawn of time, humanity has yearned to create a PDF bigger than the known universe. And now, thanks to Alex Chan, it has. (H/T David Crotty at Scholarly Kitchen)