AI Health
Friday Roundup
The AI Health Friday Roundup highlights the week’s news and publications related to artificial intelligence, data science, public health, and clinical research.
December 6, 2024
In this week’s Duke AI Health Friday Roundup: DeepMind improves on weather forecasting accuracy; NASEM unveils report on AI and the future of work; Women’s Health Study looks at 30-year risk; cognitive biases in AI models; bogus papers threaten knowledge synthesis; dark chocolate and type 2 diabetes; the evaporation of knowledge in the digital age; what ‘open’ really means for AI models; the distance to achieving artificial general intelligence; much more:
AI, STATISTICS & DATA SCIENCE
- “Google DeepMind has developed the first artificial intelligence (AI) model of its kind to predict the weather more accurately than the best system currently in use. The model generates forecasts up to 15 days in advance ― and it does so in minutes, rather than the hours needed by today’s forecasting programs…The purely AI system beats the world’s best medium-range operational model, the European Centre for Medium-Range Weather Forecasts’ ensemble model (ENS), at predicting extreme weather such as hurricanes and heatwaves.” An article by Nature’s Alix Soliman reports on a new milestone in weather forecasting, as Google DeepMind’s model surpasses the current state of the art in predictive accuracy.
- “Artificial Intelligence and the Future of Work evaluates recent advances in AI technology and their implications for economic productivity, the workforce, and education in the United States. The report notes that AI is a tool with the potential to enhance human labor and create new forms of valuable work – but this is not an inevitable outcome. Tracking progress in AI and its impacts on the workforce will be critical to helping inform and equip workers and policymakers to flexibly respond to AI developments.” A recently published report from the National Academies of Sciences, Engineering, and Medicine examines how burgeoning AI technologies may reshape the workplace.
- “But, despite such sophistication, o1 has its limitations and does not constitute AGI, say Kambhampati and Chollet. On tasks that require planning, for example, Kambhampati’s team has shown that although o1 performs admirably on tasks that require up to 16 planning steps, its performance degrades rapidly when the number of steps increases to between 20 and 40…LLMs, says Chollet, irrespective of their size, are limited in their ability to solve problems that require recombining what they have learnt to tackle new tasks. ‘LLMs cannot truly adapt to novelty because they have no ability to basically take their knowledge and then do a fairly sophisticated recombination of that knowledge on the fly to adapt to new context.’” At Nature, Anil Ananthaswamy surveys recent progress in the capabilities exhibited by large language models and asks what – if anything – that tells us about the remaining distance to achieving true artificial general intelligence (AGI).
- “Over consecutive rounds of improvement, EVOLVEpro yields variants with two- to 515-fold improvements in desired properties, including binding, catalytic efficiency, and immunogenic byproducts. Using both evolutionary scale PLMs and a regression layer, EVOLVEpro learns general rules of protein activity, generating highly active mutants with only a few cycles of evolution. Moreover, because of the rich latent space generated by the PLM and powerful feature selections present in the top-layer module, EVOLVEpro evolution is a low-N learning approach that requires minimal wet lab experimentation.” A research article published in Science by Jiang and colleagues describes EVOLVEpro, a few-shot learning approach designed to guide directed protein evolution.
- “We studied 10 cognitive biases and found that popular generative AI models replicated multiple pitfalls of reasoning. The discrepancies, furthermore, were often larger than past research on practicing clinicians — aside from GPT-4 avoiding base-rate neglect or Gemini-1.0-Pro avoiding framing effects for the specific clinical case tested. No configuration of synthetic respondent characteristics reduced the frequency of discrepancies below one in five (despite varying individual characteristics). In contrast, the frequency of vague nonanswer replies was low and unrelated to the extent of bias. The results suggest that AI models are prone to human-like biases when making medical decisions.” A case study by Wang and Redelmeier, published in NEJM AI, scrutinizes generative AI models for evidence of inbuilt cognitive biases.
BASIC SCIENCE, CLINICAL RESEARCH & PUBLIC HEALTH
- “Intake of dark chocolate instead of milk chocolate may be associated with a lower risk of T2D. Increased consumption of milk chocolate but not dark chocolate, however, was associated with increased weight gain. Further research, especially randomized controlled trials among middle aged participants and of longer duration, is needed to confirm these findings.” Hope springs eternal: findings from an observational study of chocolate consumption and type 2 diabetes risk, published in BMJ by Liu and colleagues, suggest that consumption of dark (but not milk) chocolate may be associated with lower risk for the disease.
- “In this prospective cohort of 27,939 initially healthy U.S. women who were enrolled beginning in 1992, a single combined measure of high-sensitivity CRP, LDL cholesterol, and lipoprotein(a) levels provided strong evidence of increased cardiovascular risk over a subsequent 30-year period…although traditional models for the prediction of cardiovascular risk are based on 10-year risks, there has been considerable interest in the prediction of lifetime risk and in cost-effective methods to assess risk and implement interventions throughout the lifespan…In this context, the current data show that a combined assessment of three simple blood biomarkers has predictive efficacy well beyond traditional 10-year estimates.” A research article by Ridker and colleagues, published in the New England Journal of Medicine, examines 30-year findings from the Women’s Health Study.
- “By measuring the structure of knowledge networks constructed by readers weaving a thread through articles in Wikipedia, we replicate two styles of curiosity previously identified in laboratory studies: the nomadic ‘busybody’ and the targeted ‘hunter.’ Further, we find evidence for another style—the ‘dancer’—which was previously predicted by a historico-philosophical examination of texts over two millennia and is characterized by creative modes of knowledge production.” A research article published in Science Advances by Zhou and colleagues examines characteristic patterns of human curiosity revealed by Wikipedia readers’ interactions with the online encyclopedia’s app.
COMMUNICATION, HEALTH EQUITY & POLICY
- “Eve, who is also a research developer at Crossref, an organization that registers DOIs, carried out the study in an effort to better understand a problem librarians and archivists already knew about — that although researchers are generating knowledge at an unprecedented rate, it is not necessarily being stored safely for the future. One contributing factor is that not all journals or scholarly societies survive in perpetuity. For example, a 2021 study found that a lack of comprehensive and open archiving meant that 174 open-access journals, covering all major research topics and geographical regions, vanished from the web in the first two decades of this millennium.” A Nature editorial places a spotlight on the fragility of knowledge archiving and preservation in the digital age.
- “From one angle, LinkedIn may have inadvertently created the ideal laboratory for AI writing. Nobody’s logging on expecting profundity, hilarity, or sincerity. It’s the place where people strive to be the most anodyne versions of themselves, pleasant and inoffensive. Artificiality, in other words, is what everyone is expecting.” An article by Wired’s Katie Knibbs unpacks the implications of recent estimates that more than half of English-language LinkedIn posts are created with AI tools.
- “At present, powerful actors are seeking to shape policy using claims that ‘open’ AI is either beneficial to innovation and democracy, on the one hand, or detrimental to safety, on the other. When policy is being shaped, definitions matter. To add clarity to this debate, we examine the basis for claims of openness in AI, and offer a material analysis of what AI is and what ‘openness’ in AI can and cannot provide: examining models, data, labour, frameworks, and computational power.” A perspective article published in Nature by Widder, Whittaker, and West questions what the label “open” truly signifies in discussions of AI models.
- “The junk papers are likely the products of paper mills—businesses that produce fake science to order. The size of the problem is not clear, but a manuscript posted to the Center for Open Science’s OSF preprint server in September suggests up to one in seven published papers are fabricated or falsified. Aquarius and Wever’s group plans to sum up the problems in papers it analyzed by the end of this year; the results are ‘grim,’ Aquarius says. Just in the past 4 weeks he has flagged 130 suspect papers on the postpublication peer-review site Pubpeer, bringing his total over the past 10 months to more than 690—many related to the stroke project, as well as others.” Science’s Holly Else reports on an emerging problem afflicting systematic reviews in science: a flood of junk papers is polluting the larger knowledge ecosystem.