
A Supervised Machine Learning Algorithm for Arrhythmia Analysis

H. Altay Güvenir, Burak Acar, Gülşen Demiröz, Ayhan Çekin†

Bilkent University, †Başkent University, Ankara, Turkey

Abstract

A new machine learning algorithm for the diagnosis of cardiac arrhythmia from standard 12-lead ECG recordings is presented. The algorithm is called VFI5, for Voting Feature Intervals. VFI5 is a supervised, inductive learning algorithm for inducing classification knowledge from examples. The input to VFI5 is a training set of records. Each record contains clinical measurements from ECG signals and some other information, such as sex, age, and weight, along with the decision of an expert cardiologist. The knowledge representation is based on a recent technique called Feature Intervals, where a concept is represented by the projections of the training cases on each feature separately. Classification in VFI5 is based on majority voting among the class predictions made by each feature separately. An empirical comparison indicates that VFI5 outperforms other standard algorithms such as the Naive Bayesian and Nearest Neighbor classifiers.

1. Introduction

Machine learning algorithms have been applied in several medical domains. For example, two classification algorithms have been used in the localization of primary tumors, the prognosis of breast cancer recurrence, the diagnosis of thyroid diseases, and rheumatology [4]. Another example is the CRLS system applied to a biomedical domain [5]. This paper presents a new machine learning algorithm for another medical problem: the diagnosis of cardiac arrhythmia from standard 12-lead ECG recordings. The algorithm is called VFI5, for Voting Feature Intervals. The VFI5 algorithm is similar to the VFI algorithm [2], which has been applied to a dermatological diagnosis problem [1]. The input to VFI5 is a training set of patient records. Each record contains clinical measurements from ECG signals, such as QRS duration and the RR, P-R and Q-T intervals, and some other information, such as sex, age, and weight, together with the decision of a cardiologist. There are a total of 279 attributes (features) per patient in a record. The diagnosis of the cardiologist is either normal or one of 15 different classes of arrhythmia.

VFI5 is a supervised, inductive and non-incremental algorithm for inducing classification knowledge from examples. The knowledge representation is based on a recent technique called Feature Intervals, where a concept (class) is represented by the projections of the training cases on each feature (attribute) separately. Classification in VFI5 is based on majority voting among the class predictions (votes) made by each feature separately. A feature makes its prediction based on the projections of the training instances on that feature. The VFI5 algorithm can also incorporate information about the relevancy of a feature during the voting process: it uses weighted majority voting, where the weight of a feature represents its relevancy. We have also developed a genetic algorithm to learn the respective weights of the features. An empirical comparison indicates that VFI5 outperforms other standard algorithms such as the Naive Bayesian classifier assuming a normal distribution for linear features (NBCN) and the Nearest Neighbor (NN) classifier. On the same dataset of ECG recordings, NBCN and NN achieved accuracies of 50% and 53%, respectively, whereas VFI5 achieved an accuracy of 62%. The paper describes the VFI5 algorithm and its application to the diagnosis of cardiac arrhythmia. A detailed empirical comparison of VFI5 with NBCN and NN on the arrhythmia dataset is given.
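To make the idea of per-feature voting concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that each feature's intervals are delimited by the per-class minimum and maximum training values, that interval counts are normalized by class size, and that a missing value (represented here as `None`) casts no vote. The function names `interval_index`, `train` and `classify` and the toy data are hypothetical.

```python
from bisect import bisect_left
from collections import Counter

def interval_index(endpoints, v):
    """Map a value to a point interval (odd index) or a range interval (even index)."""
    p = bisect_left(endpoints, v)
    if p < len(endpoints) and endpoints[p] == v:
        return 2 * p + 1  # value lies exactly on an endpoint
    return 2 * p          # value lies strictly between endpoints (or outside them)

def train(X, y):
    """Build, for every feature, its endpoints and the class votes of each interval."""
    classes = sorted(set(y))
    class_size = Counter(y)
    model = []
    for f in range(len(X[0])):
        # Project the training instances on feature f, skipping missing values.
        known = [(row[f], c) for row, c in zip(X, y) if row[f] is not None]
        endpoints = set()
        for c in classes:
            vals = [v for v, cc in known if cc == c]
            if vals:
                endpoints.update((min(vals), max(vals)))
        endpoints = sorted(endpoints)
        counts = [dict.fromkeys(classes, 0.0) for _ in range(2 * len(endpoints) + 1)]
        for v, c in known:
            # Normalize by class size so that large classes do not dominate.
            counts[interval_index(endpoints, v)][c] += 1.0 / class_size[c]
        votes = []
        for interval in counts:
            total = sum(interval.values())
            votes.append({c: (n / total if total else 0.0) for c, n in interval.items()})
        model.append((endpoints, votes))
    return classes, model

def classify(classes, model, x, weights=None):
    """Sum the (optionally weighted) votes of all features and return the top class."""
    weights = weights or [1.0] * len(model)
    totals = dict.fromkeys(classes, 0.0)
    for (endpoints, votes), v, w in zip(model, x, weights):
        if v is None:
            continue  # a feature with a missing value does not vote
        for c, vote in votes[interval_index(endpoints, v)].items():
            totals[c] += w * vote
    return max(classes, key=lambda c: totals[c])

# Toy data (hypothetical): two features, two classes, one missing value.
X = [[1.0, 10.0], [1.5, 10.5], [2.0, 11.0], [8.0, 30.0], [8.5, None], [9.0, 32.0]]
y = ["normal", "normal", "normal", "arrhythmia", "arrhythmia", "arrhythmia"]
classes, model = train(X, y)
print(classify(classes, model, [1.4, 10.2]))
print(classify(classes, model, [8.7, None]))
```

Passing a `weights` list to `classify` corresponds to the weighted majority voting described above; with all weights equal it reduces to plain voting.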

2. Dataset

The aim is to detect the presence of cardiac arrhythmia and identify its type by classifying each record into one of 16 groups. Currently, there are 452 patient records, each described by 279 feature values. Class 01 refers to normal ECG, class 02 to Ischemic changes (Coronary Artery Disease), class 03 to Old Anterior Myocardial Infarction, class 04 to Old Inferior Myocardial Infarction, class 05 to Sinus tachycardia, class 06 to Sinus bradycardia, class 07 to Ventricular Premature Contraction (PVC), class 08 to Supraventricular Premature Contraction, class 09 to Left bundle branch block, class 10 to Right bundle branch block,