Clinical notes often contain important descriptive findings not captured in structured EHR fields, making them valuable for early autism prediction. However, autism-related information is difficult to identify because it is sparse within the large volume of notes recorded for a typical child. To address this challenge, Duke researchers, including Computational Biology & Bioinformatics student Fengnan Li, AI Health Data Science Fellow Elliot Hill, and Duke AI Health Data Science Fellowship Director Matthew Engelhard, PhD, have developed a new natural language processing method, IRIS (Interpretable Retrieval-Augmented Classification for long Interspersed Document Sequences). Their work was recently published at the 2025 Annual Meeting of the Association for Computational Linguistics.



