Purpose. Acute graft-versus-host disease (aGVHD) remains a significant complication of allogeneic hematopoietic cell transplantation (HCT) and limits its broader application. The ability to predict grade II to IV aGVHD could potentially mitigate morbidity and mortality. To date, researchers have focused on using snapshots of a patient (eg, biomarkers at a single time point) to predict aGVHD onset. We hypothesized that longitudinal data collected and stored in electronic health records (EHRs) could distinguish patients at high risk of developing aGVHD from those at low risk.
Methods. The study included a cohort of 324 patients who underwent allogeneic HCT at the University of Michigan C.S. Mott Children’s Hospital from 2014 to 2017. From EHR data, specifically vital sign measurements collected within the first 10 days of transplantation, we built a penalized logistic regression model to identify patients at risk for grade II to IV aGVHD. We compared the proposed model with a baseline model trained only on patient and donor characteristics collected at the time of transplantation, and we analyzed the importance of the different input features.
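As a rough illustration of this kind of approach, the sketch below summarizes each vital-sign series by its mean and its least-squares slope (a simple stand-in for a temporal trend) and fits an L2-penalized logistic regression by gradient descent. The feature definitions, the choice of an L2 penalty, the hyperparameters, and the toy temperature readings are all assumptions made for illustration; they are not taken from the study.

```python
import math

def mean_and_slope(values):
    """Summarize a series by its mean and least-squares slope per time step."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return v_mean, (num / den if den else 0.0)

def standardize(rows):
    """Z-score each column so gradient descent is well conditioned."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, b, x):
    """Predicted probability of the positive class for one feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def fit_penalized_logreg(X, y, l2=0.1, lr=0.1, epochs=500):
    """Fit logistic regression with an L2 penalty on the weights (not the bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Hypothetical daily temperature readings (degrees C) for four patients.
temps = [
    [36.8, 36.9, 37.2, 37.6, 38.1],  # rising trend, labeled high risk
    [36.7, 37.0, 37.4, 37.9, 38.3],  # rising trend, labeled high risk
    [37.0, 36.9, 37.0, 36.9, 37.0],  # flat, labeled low risk
    [37.1, 37.0, 36.9, 36.9, 36.8],  # slightly falling, labeled low risk
]
labels = [1, 1, 0, 0]

X = standardize([list(mean_and_slope(s)) for s in temps])
w, b = fit_penalized_logreg(X, labels)
risks = [predict_proba(w, b, x) for x in X]
```

Here `w[0]` weights the mean and `w[1]` weights the slope; comparing the magnitudes of weights on standardized features is one simple way to gauge which summary the model relies on more.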
Results. The proposed model outperformed the baseline model, with an area under the receiver operating characteristic curve of 0.659 versus 0.512 (P = .019). The feature importance analysis showed that the learned model relied most on temperature and systolic blood pressure, and temporal trends (eg, increasing or decreasing) were more important than the average values.
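For readers unfamiliar with the metric, the area under the receiver operating characteristic curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name `auroc` and the toy scores are hypothetical and unrelated to the study data.

```python
def auroc(scores, labels):
    """AUROC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: three of the four positive/negative pairs are ordered correctly.
example = auroc([0.8, 0.4, 0.6, 0.3], [1, 1, 0, 0])
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the two models' values of 0.659 and 0.512 should be read.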
Conclusion. Leveraging readily available clinical data from EHRs, we developed a machine-learning model for aGVHD prediction in patients undergoing HCT. Continuous monitoring of vital signs, such as temperature, could potentially help clinicians more accurately identify patients at high risk for aGVHD.