In today’s Duke AI Health Friday Roundup: AI considered as “late-stage teenager”; racial equity in clinical trials; ethical filtering for large language models; precision medicine for rheumatology; updating approaches to disease surveillance; Facebook inundates cancer patients with dubious ads; making pulse oximeters work for everyone; adjuvant boosting for COVID vaccines; legal framework for biometric tech; adversarial training for NLP models; much more:
AI, STATISTICS & DATA SCIENCE
- “One concern with the rise of large language models lies with their potential for significant harm, particularly from pretraining on biased, obscene, copyrighted, and private information. Emerging ethical approaches have attempted to filter pretraining material, but such approaches have been ad hoc and failed to take into account context. We offer an approach to filtering grounded in law, which has directly addressed the tradeoffs in filtering material. First, we gather and make available the Pile of Law, a 256GB (and growing) dataset of open-source English-language legal and administrative data, covering court opinions, contracts, administrative rules, and legislative records.” A new preprint by Henderson and colleagues, available from arXiv (with accompanying dataset), debuts the Pile of Law: a project designed to use a publicly accessible corpus of legal documents to create ethical and situationally appropriate guardrails for large language models (H/T @avillanovamoral).
- “AI science is like a late-stage teenager, newly aware of its extraordinary powers but without a fully developed frontal cortex that might guide its risky behavior and lead it to consider its broader social responsibilities…. In comparison with older fields like medicine or the law — or even garden-variety professions that have licensing requirements — the institutional norms for professional ethics in computer science are developmentally immature.” In an interview conducted by Edmund L. Andrews, Stanford philosopher Rob Reich advocates for a professional code of conduct for developers of AI applications.
- “This extractive relation reprises a long historical pattern, in which new methods of producing knowledge generate a redistribution of epistemic power: who declares what kind of truth about me, to count for what kinds of decisions? I argue that prediction as extraction of discretion is normal and fundamental to the technology, rather than isolated cases of bias or error.” A 2022 ACM conference paper by Sun-ha Hong interrogates the idea of “neutral” predictive models that only need to be pruned of bits of biased data in order to function beneficently and fairly (H/T @mer_edith).
- “Developing methods to adversarially challenge NLP systems is a promising avenue for improving both model performance and interpretability. Here, we describe the approach of the team “longhorns” on Task 1 of The First Workshop on Dynamic Adversarial Data Collection (DADC), which asked teams to manually fool a model on an Extractive Question Answering task.” A preprint available from arXiv by Kovatchev and colleagues debuts an approach to improving the performance of a natural language processing model through the use of adversarial challenges.
- “The Fields Medals, first awarded in 1936, were conceived by John Charles Fields, a Canadian mathematician. They and the Abacus Medal are unusual among top academic honors in that they go to people who are still early in their careers — younger than 40 years on Jan. 1 — and honor not just past achievements but also the promise of future breakthroughs.” The New York Times’ Kenneth Chang profiles this year’s winners of the quadrennially awarded Fields Medals for mathematics, who include Ukrainian mathematician Maryna Viazovska, only the second woman to have received the prestigious award.
- “Ongoing debates about whether large pre-trained models understand text and images are complicated by the fact that scientists and philosophers themselves disagree about the nature of linguistic and visual understanding in creatures like us. Many researchers have emphasized the importance of “grounding” for understanding, but this term can encompass a number of different ideas. These might include having appropriate connections between linguistic and perceptual representations, anchoring these in the real world through causal interaction, and modeling communicative intentions.” In an article for Nautilus, Raphaël Millière examines the sometimes astonishing capabilities of large pre-trained AI models and investigates the underlying significance of these huge, data-hungry projects.
BASIC SCIENCE, CLINICAL RESEARCH & PUBLIC HEALTH
- “…the new papers classify minerals by ‘kind,’ a term that Hazen and Morrison define as a combination of the mineral species with its mechanism of origin (think volcanic pyrite versus microbial pyrite). Using machine learning analysis, they scoured data from thousands of scientific papers and identified 10,556 distinct mineral kinds.” Quanta’s Joanna Thompson reports on a pair of recently published papers that describe a new taxonomic system for Earth’s minerals – one that acknowledges the central role that the biosphere plays in creating them.
- “Most traditional public health surveillance systems are designed around a paradigm of counting the occurrence of disease, including human cases, hospitalized patients and positive laboratory diagnoses, pathogen genomes, and attributable deaths….An outbreak is classically defined by changes in case incidence in the context of person, place or time. However, this paradigm is often not sufficient for the rapid detection of emerging infectious diseases, when early case numbers are small, where no historic baseline exists, or where diagnosis of cases is uncertain, perhaps due to lack of adequate testing.” A commentary at Nature Medicine by authors from the World Health Organization advocates for incorporating modernized approaches to disease surveillance into global disease monitoring systems.
- “After a couple of years of trying to enroll people into WISDOM, only 48 Black women had volunteered to participate, compared to thousands of white women. The last thing Esserman wanted to do was run another cancer trial with scant participation from Black people.” An article at STAT News by Angus Chen explores the WISDOM breast cancer screening trial, and how it helped prompt greater efforts to increase the racial diversity of participants enrolling in clinical studies.
- “Over the short term, heterologous boosting with the beta-adjuvanted MVB.1.351 vaccine resulted in a higher neutralizing-antibody response against the beta variant as well as against the original strain and the delta and omicron BA.1 variants than did the mRNA vaccine BNT162b2 or the MVD614 formulation. The use of new vaccines that contain beta spike protein may be an interesting strategy for broader protection against SARS-CoV-2 variants.” A research letter by Launay and colleagues published in the New England Journal of Medicine offers an early look at the effects of adjuvant boosting for COVID vaccines.
- “Recent innovations in high-throughput ‘omic’ technologies are now enabling comprehensive profiling at multiple levels, helping to identify subgroups of patients who may taper off potentially toxic medications or better respond to current molecular targeted therapies. Such advances may help to optimize outcomes and identify new pathways for treatment, but there are many challenges along the path towards clinical translation.” A review article published this week in Nature Medicine by Guthridge and colleagues surveys the use of precision-medicine approaches in rheumatology.
COMMUNICATION, HEALTH EQUITY & POLICY
- “…the case for new legislation to cover all biometric technologies that identify or categorise people is clear. A new regulatory function is needed to assess the human rights impacts of certain biometric technologies before they are used, and ensure that all such technologies meet basic standards relating to accuracy and bias. Until this comprehensive framework is in place, there should be a moratorium on the more problematic uses of biometric technologies, including the use of live facial recognition technology.” An opinion article by Madeleine Chang, appearing in Thomson Reuters News, describes a new review by the Ada Lovelace Institute that critiques the UK’s legal framework for biometric technologies and finds it wanting.
- “Of particular concern, the analysis showed that Google shared data with RuTarget about users browsing websites based in Ukraine. This means Google may have turned over such critical information as unique mobile phone IDs, IP addresses, location information and details about users’ interests and online activity, data that U.S. senators and experts say could be used by Russian military and intelligence services to track people or zero in on locations of interest.” ProPublica’s Craig Silverman reports that a Russian company, then under US Treasury Department sanctions, purchased highly granular (and in some cases sensitive) user data from Google.
- “Evidence from Facebook and Instagram users, medical researchers, and its own Ad Library suggests that Meta is rife with ads containing sensational health claims, from which the company directly profits. The misleading ads may remain unchallenged for months and even years. Some of the ads reviewed by MIT Technology Review promoted treatments that have been proved to cause acute physical harm in some cases. Other ads pointed users toward highly expensive treatments with dubious outcomes.” At MIT Technology Review, Abby Ohlheiser scrutinizes reports that Facebook/Meta is feeding ads for highly dubious therapies to cancer patients.
- “Mr. V’s death illustrates a basic health care inequity that must be addressed: Pulse oximeters do a poor job of measuring the blood oxygen level of people with darker skin, like Mr. V. Their skin has more melanin, a skin pigment that interferes with measuring the oxygen level in the blood.” In an opinion article at STAT News, Duke Health critical care physician A. Ian Wong presses for action on a critical piece of medical technology – the pulse oximeter – which has been shown to work less accurately on darker-skinned patients.