Identification of patients with atrial fibrillation: a big data exploratory analysis of the UK Biobank.
Oster J., Hopewell JC., Ziberna K., Wijesurendra R., Camm CF., Casadei B., Tarassenko L.
OBJECTIVE: Atrial fibrillation (AF) is the most common cardiac arrhythmia, with an estimated prevalence of around 1.6% in the adult population. The electrocardiogram (ECG) data acquired in the UK Biobank represent an opportunity to screen for AF in a large sub-population in the UK. The main objective of this paper is to assess ten machine-learning methods for the automated detection of subjects with AF in the UK Biobank dataset. APPROACH: Six classical machine-learning methods based on support vector machines are proposed and compared with state-of-the-art techniques (including a deep-learning algorithm), and finally with a combination of classical machine-learning and deep-learning approaches. Evaluation is carried out on a subset of the UK Biobank dataset, manually annotated by human experts. MAIN RESULTS: The combined classical machine-learning and deep-learning method achieved an F1 score of 84.8% on the test subset and a Cohen's kappa coefficient of 0.83, which is similar to the inter-observer agreement between two human experts. SIGNIFICANCE: This level of performance indicates that automated detection of AF is feasible in patients whose data are stored in a large database such as the UK Biobank. Such automated identification of AF patients would enable further investigations aimed at identifying the different phenotypes associated with AF.
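For readers less familiar with the two metrics reported in MAIN RESULTS, the following is a minimal sketch of how an F1 score and a Cohen's kappa coefficient can be computed from a binary confusion matrix (AF versus non-AF). The function name `f1_and_kappa` and the counts in the usage example are hypothetical and chosen only for illustration; they are not taken from the study and do not reproduce its results.

```python
def f1_and_kappa(tp, fp, fn, tn):
    """Return (F1, Cohen's kappa) for a 2x2 confusion matrix.

    tp: AF recordings labelled AF by the classifier
    fp: non-AF recordings labelled AF
    fn: AF recordings labelled non-AF
    tn: non-AF recordings labelled non-AF
    """
    n = tp + fp + fn + tn

    # F1 is the harmonic mean of precision and recall for the AF class.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)

    # Cohen's kappa compares the observed agreement with the agreement
    # expected by chance given each rater's marginal label frequencies.
    p_observed = (tp + tn) / n
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (p_observed - p_expected) / (1 - p_expected)

    return f1, kappa


# Hypothetical counts for illustration only (not data from the paper).
f1, kappa = f1_and_kappa(tp=40, fp=10, fn=10, tn=940)
print(f"F1 = {f1:.3f}, kappa = {kappa:.3f}")
```

Both metrics are commonly preferred over plain accuracy when the positive class is rare, as with a condition whose prevalence is on the order of a few percent: in the hypothetical example above, accuracy is 98% while F1 and kappa are both noticeably lower.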