Project profile: EHR Foundation Models for Smarter, More Reliable, and Dynamically Deployed Clinical Decision Support
Status: Active
Traditional frameworks for developing clinical decision support (CDS) tools from electronic health record (EHR) data often rely on extensive data preprocessing. This typically involves labor-intensive steps such as cleaning raw data, aggregating clinical event sequences into tabular count-based features, and deriving longitudinal measures for labs and vitals. Many machine learning models have achieved strong performance under this paradigm, but their preprocessing pipelines are usually task-specific, which makes them difficult to reuse for new tasks. EHR foundation models (FMs) offer a fundamentally different approach. Instead of compressing patient histories into handcrafted features, FMs take raw sequences of medical events as input and learn rich representations of a patient’s overall health trajectory. These representations can then be applied to a wide range of downstream prediction tasks without custom preprocessing.
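To make the contrast concrete, the following is a minimal sketch of the two input styles for a single patient. It is illustrative only and not the project's pipeline: the event codes, the toy vocabulary, and the helper names `count_features` and `event_sequence` are all hypothetical.

```python
from collections import Counter

# Hypothetical toy event stream for one patient: (day, event code).
events = [
    (0, "DX:J45"),
    (0, "RX:albuterol"),
    (30, "LAB:eos_high"),
    (45, "DX:J45"),
]

def count_features(events, vocabulary):
    """Tabular route: collapse the history into one count per code."""
    counts = Counter(code for _, code in events)
    return [counts.get(code, 0) for code in vocabulary]

def event_sequence(events, token_ids):
    """Sequence route: keep order and timing as (token id, days since previous event)."""
    ordered = sorted(events, key=lambda e: e[0])  # stable sort keeps same-day order
    sequence, previous_day = [], None
    for day, code in ordered:
        gap = 0 if previous_day is None else day - previous_day
        sequence.append((token_ids[code], gap))
        previous_day = day
    return sequence

vocabulary = ["DX:J45", "RX:albuterol", "LAB:eos_high"]
token_ids = {code: i for i, code in enumerate(vocabulary)}

tabular = count_features(events, vocabulary)
sequence = event_sequence(events, token_ids)
print(tabular)   # one fixed-length row; order and timing are lost
print(sequence)  # variable-length sequence; order and gaps are preserved
```

The count vector discards when and in what order events occurred, which is the information a sequence model can use directly.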
As an initial step, we developed a contrastive learning framework, Borrowing from the Future (BFF), that leverages children’s health trajectories to improve early-stage risk prediction (e.g., prenatally or at birth). Building on this, we are now developing a pediatric EHR foundation model with a hierarchical attention mechanism to generate clinically meaningful representations at both the event and encounter levels. Crucially, this model is designed to be dynamically deployed, enabling adaptive, efficient integration into diverse CDS applications.
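As a rough illustration of what representations at both the event and encounter levels can mean, the sketch below pools event vectors into encounter vectors and then pools encounter vectors into a single patient vector. It is a generic two-level attention pooling under stated assumptions, not the project's architecture: the embeddings are made-up 2-dimensional values, the function names are hypothetical, and the query vectors are fixed placeholders where a trained model would use learned parameters.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(vectors, query):
    """Weighted average of vectors; weights are softmax of dot products with query."""
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]

# Hypothetical event embeddings, grouped by encounter.
encounters = [
    [[1.0, 0.0], [0.0, 1.0]],  # encounter 1: two events
    [[1.0, 1.0]],              # encounter 2: one event
]

# Placeholder queries (assumption): all zeros, which yields uniform weights.
event_query = [0.0, 0.0]
encounter_query = [0.0, 0.0]

# Level 1: events -> one vector per encounter.
encounter_vectors = [attention_pool(ev, event_query) for ev in encounters]
# Level 2: encounters -> one vector per patient.
patient_vector = attention_pool(encounter_vectors, encounter_query)
print(encounter_vectors, patient_vector)
```

With non-zero queries, the same two functions would weight some events or encounters more heavily than others, and both intermediate and final vectors remain available for downstream tasks.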
Research Team:
Principal Investigators: Benjamin Goldstein & Matthew Engelhard
Analytic Team: Scott Sun (AI Health Fellow)
