Ensemble Deep Learning Algorithm for Structural Heart Disease Screening Using Electrocardiographic Images: PRESENT SHD
Dhingra LS., Aminorroaya A., Sangha V., Pedroso AF., Shankar SV., Coppi A., Foppa M., Brant LCC., Barreto SM., Ribeiro ALP., Krumholz HM., Oikonomou EK., Khera R.
Background: Identifying structural heart diseases (SHDs) early can change the course of the disease, but their diagnosis requires cardiac imaging, which is limited in accessibility.
Objectives: The purpose of this study was to leverage images of 12-lead electrocardiograms (ECGs) for automated detection and prediction of multiple SHDs using an ensemble deep learning approach.
Methods: We developed a series of convolutional neural network models for detecting a range of individual SHDs from images of ECGs, with SHDs defined by transthoracic echocardiograms performed within 30 days of the ECG at the Yale New Haven Hospital (YNHH). SHDs were defined as left ventricular ejection fraction <40%, moderate-to-severe left-sided valvular disease (aortic/mitral stenosis or regurgitation), or severe left ventricular hypertrophy (interventricular septal diameter at end-diastole >1.5 cm and diastolic dysfunction). We developed an ensemble XGBoost model, PRESENT-SHD (Practical scREening using ENsemble machine learning sTrategy for SHD detection), as a composite screen across all SHDs. We validated PRESENT-SHD at 4 U.S. hospitals and in the prospective, population-based ELSA-Brasil (Brazilian Longitudinal Study of Adult Health) cohort, with concurrent protocolized ECGs and transthoracic echocardiograms. We also used PRESENT-SHD for risk stratification of new-onset SHD or heart failure (HF) in clinical cohorts and the population-based UK Biobank.
Results: The models were developed using 261,228 ECGs from 93,693 YNHH patients and evaluated on a single ECG from 11,023 individuals at YNHH (19% with SHD), 44,591 across external hospitals (20%-27% with SHD), and 3,014 in ELSA-Brasil (3% with SHD). In the held-out test set, PRESENT-SHD demonstrated an area under the receiver-operating characteristic curve (AUROC) of 0.886 (95% CI: 0.877-0.894), 90% sensitivity, and 66% specificity. At hospital-based sites, PRESENT-SHD had AUROCs ranging from 0.854 to 0.900, with sensitivities and specificities of 93% to 96% and 51% to 56%, respectively. The model generalized well to ELSA-Brasil (AUROC 0.853 [95% CI: 0.811-0.897], 88% sensitivity, 62% specificity). PRESENT-SHD demonstrated consistent performance across demographic subgroups, novel ECG formats, and smartphone photographs of ECGs from monitors and printouts. A positive PRESENT-SHD screen portended a 2- to 4-fold higher risk of new-onset SHD/HF, independent of demographics, comorbidities, and the competing risk of death across clinical sites and UK Biobank, with high predictive discrimination.
Conclusions: We developed and validated PRESENT-SHD, an AI-ECG tool identifying a range of SHD using images of 12-lead ECGs, representing a robust, scalable, and accessible modality for automated SHD screening and risk stratification.
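Illustrative sketch: The composite SHD label described in the Methods can be written as a simple rule over echocardiographic measurements. The Python sketch below is not the authors' code. The names `EchoReport`, `Severity`, and `has_shd` are hypothetical, valve severity is assumed to be recorded on an ordinal none/mild/moderate/severe scale with "moderate-to-severe" read as moderate or worse, and diastolic dysfunction is assumed to be a yes/no flag because the abstract does not specify its grading.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Assumed ordinal grading of a valvular lesion (not from the abstract)."""
    NONE = 0
    MILD = 1
    MODERATE = 2
    SEVERE = 3


@dataclass
class EchoReport:
    """Hypothetical container for the echo fields the abstract refers to."""
    lvef_percent: float
    aortic_stenosis: Severity = Severity.NONE
    aortic_regurgitation: Severity = Severity.NONE
    mitral_stenosis: Severity = Severity.NONE
    mitral_regurgitation: Severity = Severity.NONE
    ivsd_cm: float = 0.0                 # interventricular septal diameter, end-diastole
    diastolic_dysfunction: bool = False  # assumed binary; grading is unspecified


def has_shd(echo: EchoReport) -> bool:
    """Return True if any of the three SHD criteria from the abstract is met."""
    # Criterion 1: left ventricular ejection fraction below 40%.
    reduced_lvef = echo.lvef_percent < 40

    # Criterion 2: moderate-to-severe left-sided valvular disease
    # (interpreted here as moderate or worse on any of the four lesions).
    worst_valve = max(
        echo.aortic_stenosis,
        echo.aortic_regurgitation,
        echo.mitral_stenosis,
        echo.mitral_regurgitation,
    )
    valvular_disease = worst_valve >= Severity.MODERATE

    # Criterion 3: severe LVH, which requires BOTH septal diameter > 1.5 cm
    # and diastolic dysfunction.
    severe_lvh = echo.ivsd_cm > 1.5 and echo.diastolic_dysfunction

    return reduced_lvef or valvular_disease or severe_lvh


if __name__ == "__main__":
    example = EchoReport(lvef_percent=55.0, mitral_regurgitation=Severity.MODERATE)
    print(has_shd(example))
```

In this sketch, a septal diameter above 1.5 cm without diastolic dysfunction does not count as SHD, which mirrors the conjunction in the stated definition.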