In this week’s Duke AI Health Friday Roundup: Meta debuts LLaMA 2 large language model; calculating the toll of misdiagnosis; teaching writing in the age of GPTs; geographical concentration in AI industry; transgender youth, social media & mental health; responding to systemic racism in science; ML for extracting data from unstructured EHR records; regulatory implications for medical chatbots; building resiliency for a hotter world; much more:
AI, STATISTICS & DATA SCIENCE
- “The company is actually releasing a suite of AI models, which include versions of LLaMA 2 in different sizes, as well as a version of the AI model that people can build into a chatbot, similar to ChatGPT…many caveats still remain. Meta is not releasing information about the data set that it used to train LLaMA 2 and cannot guarantee that it didn’t include copyrighted works or personal data, according to a company research paper shared exclusively with MIT Technology Review. LLaMA 2 also has the same problems that plague all large language models: a propensity to produce falsehoods and offensive language.” MIT Technology Review’s Melissa Heikkilä has the rundown on the recent release of Meta’s free large language model, LLaMA 2, which marks the company’s emergence as a serious contender in the publicly available GPT arena.
- “The effective universal deployment of generative AI may well widen the geography of the overall AI industry—but then again, it might instead harden the dominance of the industry’s core hubs, especially when it comes to research and development. High and persistent levels of industry concentration could have negative implications for economic opportunity, regional growth, and the development of an AI sector that serves a broad consumer base with varied products and services.” A new report from the Brookings Institution tracks the geographic distribution of the emerging generative AI industry in the United States.
- “The results show that our RadioLOGIC outperforms CNN- and RNN-based NLP models in the task of extracting repomics features, indicating that our model performs better for complex long-text datasets. …In addition to repomics feature extraction, our model can predict BI-RADS scores based on descriptions from unstructured reports, and the application of transfer learning further improves the model’s performance in predicting BI-RADS scores, which could serve as decision support for clinicians.” A research article published in Cell Reports Medicine by Zhang and colleagues debuts RadioLOGIC, a machine learning model for extracting clinically actionable information from unstructured electronic health record data.
- “Many AI proponents believe that the solution to problematic AI output is to throw human resources against it; hire editors and fact-checkers to make the bot look like a good writer. However, not only does having humans heavily edit a bot’s work defeat the low-friction purpose of using the bot in the first place, but it also won’t catch all the errors….With AI content, you can’t trust anything it outputs to be correct, so the human editor has to not only research every single fact in detail, but also be a subject matter expert on the exact topic in question.” A post by Avram Piltch at Tom’s Hardware delves into the not-entirely-reassuring prospects for internet content – and the basic functionality of search – in the era of generative AI.
- “In their current state, LLMs do not ask for missing information needed to provide an accurate answer, provide no accompanying indication of relative certainty or confidence, and generally provide no genuine sources. This rules out their use in the USA for non-device clinical decision support.” A commentary article by Gilbert and colleagues published in Nature Medicine argues that the current crop of sophisticated AI chatbots should undergo the same sort of regulatory scrutiny applied to medical devices when they are deployed for healthcare-related purposes.
BASIC SCIENCE, CLINICAL RESEARCH & PUBLIC HEALTH
- La vie en rose: “Whatever ‘pink’ means to Barbie, research does show that a fully monochromatic life would be quite drab. In a 2019 Nature Communications study, researchers used a low-pressure sodium light to cast a yellow tint on an entire room. Participants in the room described all the objects presented to them as some kind of yellow.” At Scientific American, Timmy Broderick explores how the dazzling pink palette of set designs used in the Barbie movie would actually hit in real life – and what that tells us about visual processing and cognition.
- “Analyzing the nature of misdiagnoses also provides significant opportunities for solutions: The errors are many, but they are quite concentrated. According to the study, 15 diseases account for about half the misdiagnoses, and five diseases alone — stroke, sepsis, pneumonia, venous thromboembolism, and lung cancer — caused 300,000 serious harms, or almost 40% of the total, because clinicians failed to identify them in patients.” STAT News’ Annalisa Merelli reports on a study by Newman-Toker and colleagues, recently published in BMJ Quality and Safety, that attempts to characterize the full toll of diagnostic errors in US healthcare.
- “Now that we are in the summer of 2023, a concerted effort is needed more than ever to enhance the implementation of multi-pronged heat prevention and adaptation strategies at macro, meso and micro levels. By proactively building resilience across these levels, we can substantially mitigate the detrimental health impacts caused by heatwaves and safeguard vulnerable populations in the years ahead.” An article in Nature Medicine by Ji, Xi, and Huang presents the case for taking action to mitigate the health impact of heatwaves in light of recent findings on the public health toll of protracted temperature excursions.
- “…gender identity was associated with youths’ experiences of social media in ways that may have distinct implications for mental health. These results suggest that research about social media effects on youths should attend to gender identity; directing children and adolescents to spend less time on social media may backfire for those transgender and gender nonbinary youths who are intentional about creating safe spaces on social media that may not exist in their offline world.” A research article published in JAMA Pediatrics by Coyne and colleagues examines associations between mental health and use of social media among transgender youth in the US.
COMMUNICATION, HEALTH EQUITY & POLICY
- “First, we should recognize a couple of truths: 1. There is no reliable detection of text produced by a large language model. Policing this stuff through technology is a fool’s errand. And 2. While there is much that should be done in terms of assignment design to mitigate the potential misuse of LLMs, it is impossible to GPT-proof an assignment….This means that the primary focus—as I’ve been saying since I first saw an earlier version of GPT at work—needs to be on how we assess and respond to student writing.” An article at Inside Higher Ed by writing teacher John Warner investigates how the advent of GPT in the classroom can actually spur a re-evaluation of how we teach writing – and what we expect students to get out of it.
- “So, a suggestion for those opposing the anti-racist moves in science: How about doing exactly what scientists are supposedly trained to do? Acknowledge the existing data and analyses, collect more data, run new analyses, and then see how to reframe questions and practices given more refined understandings of the dynamics of systems.” A viewpoint article published in Science by Agustín Fuentes challenges the scientific community on its collective response to systemic racism.
- “In our research, we found that reporting mechanisms on social media platforms are often profoundly confusing, time-consuming, frustrating, and disappointing. Users frequently do not understand how reporting actually works, including where they are in the process, what to expect after they submit a report, and who will see their report. Additionally, users often do not know if, or why, a decision has been reached regarding their report.” A recent report by advocacy organization PEN America examines the effectiveness of mechanisms put in place by major social media platforms to report and counter abuse and harassment.
- “One area in need of regulation is the potential for neurotechnologies to be used for profiling individuals and the Orwellian idea of manipulating people’s thoughts and behaviour. Mass-market brain-monitoring devices would be a powerful addition to a digital world in which corporate and political actors already use personal data for political or commercial gain, says Nita Farahany, an ethicist at Duke University in Durham, North Carolina, who attended the meeting.” A news article at Nature by Liam Drew collects some expert perspectives on the emerging field of neuroprivacy in the face of brain scanning technologies that are theoretically capable of interpreting or influencing a person’s thoughts.