This section presents details of three of the models created in sections 4.1 and 4.2. These models either achieved a high accuracy or possess some interesting feature characteristics. The models are specified in table 4.5, and they are assessed in the next three subsections in the same order as they appear in the table.
4.3. DETAILS OF THREE TWO-CLASS MODELS
Classifier      Hyperparameters                             Feature set    FS       Accuracy
AdaBoost        NLearn: 100, LearnRate: 0.2                 All features   CFS      0.84
Decision tree   MaxNumSplits: 4, SplitCriterion: deviance   All features   Relieff  0.76
AdaBoost        NLearn: 25, LearnRate: 0.7                  Raw data       CFS      0.79
Table 4.5: Table specifying the models that will be further investigated.
4.3.1 Best performing model
This model achieved the highest average accuracy in the two-class classification task, hence a more detailed description of it is given here. A decomposition of the average accuracy can be seen in table 4.6.
Dysfunctions                 Accuracy   Confusion matrix
Baseline / Fluid loading     0.63       [164 119; 61 136]
Adrenaline / Fluid loading   0.94       [401 26; 12 229]
Esmolol / Ischemia           0.73       [150 46; 72 164]
Esmolol / Fluid loading      0.81       [178 39; 44 174]
Ischemia / Fluid loading     0.80       [186 52; 48 203]
Table 4.6: Table showing the accuracy and confusion matrix of each class for the model that performed best, which was AdaBoost using all features, achieving an average accuracy of 0.84. In the confusion matrices, columns represent the true class, while rows represent the predicted class.
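Each per-task accuracy in the table is simply the diagonal of its confusion matrix divided by the total number of classified instances. As a quick check, a sketch for the adrenaline versus fluid loading row:

```python
# Confusion matrix for the adrenaline vs. fluid loading task in table 4.6
# (columns are the true class, rows the predicted class).
cm = [[401, 26],
      [12, 229]]

correct = cm[0][0] + cm[1][1]            # diagonal: correct predictions
total = sum(sum(row) for row in cm)      # all predictions
accuracy = correct / total               # 630 / 668, about 0.94
```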
As expected, the combinations involving adrenaline perform best. Adrenaline is the only intervention that increases the heart rate compared to baseline. The two combinations with the lowest accuracy are the ones involving baseline versus esmolol and baseline versus fluid loading, respectively.
Since the data come from real-world measurements, the influence of noise and inaccuracies in the data must be assessed. If the prediction errors often come from the same animals, it suggests that the measurements from those animals are particularly noisy. Analyzing the best model reveals one animal with an average error rate of 30% and two animals with an average error rate of 23%, while the mean average error rate across all animals is 15%. This shows that some animals were somewhat noisy and did have a negative effect on the results. However, noise is inherent in every real-world measurement, so discarding the noisy animals would give a result that does not reflect the true quality of the models.
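The per-animal error rates above can be computed by grouping predictions by animal. A minimal sketch, assuming predictions are available as (animal, true label, predicted label) tuples; the record format and animal IDs are hypothetical:

```python
from collections import defaultdict

# Hypothetical prediction records: (animal_id, true_label, predicted_label).
records = [
    ("pig1", "baseline", "baseline"),
    ("pig1", "esmolol", "baseline"),    # one misclassification for pig1
    ("pig2", "baseline", "baseline"),
    ("pig2", "esmolol", "esmolol"),
    ("pig2", "ischemia", "ischemia"),
]

errors = defaultdict(int)
totals = defaultdict(int)
for animal, actual, pred in records:
    totals[animal] += 1
    errors[animal] += (actual != pred)

# Error rate per animal, and the mean error rate across animals.
rates = {a: errors[a] / totals[a] for a in totals}
mean_rate = sum(rates.values()) / len(rates)
```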
The average number of features chosen by CFS was 23. The next subsection introduces a model using only three features, hence a more detailed analysis of which features are most discriminative is given there.
4.3.2 Model using only three features
The decision tree using all features and Relieff achieved an average accuracy of 0.78, as seen in table 4.4. Relieff used 13 features, but limiting it to three features gives only a small decrease in average accuracy, down to 0.77.
It is interesting to see which features are deemed most relevant by Relieff in the different classification tasks. Hence, table 4.7 decomposes the average accuracy and shows which features are used.
Dysfunctions                 Accuracy   Features         Confusion matrix
Baseline / Adrenaline        0.77       1.1, 5.2, 5.2    [127 44; 98 369]
Baseline / Esmolol           0.50       4.2, 4.2, 1.1    [105 113; 98 109]
Baseline / Ischemia          0.74       5.2, 5.2, 4.2    [188 47; 107 261]
Baseline / Fluid loading     0.72       1.1, 5.2, 5.2    [170 79; 55 176]
Adrenaline / Esmolol         0.90       1.1, 5.1, 5.1    [287 21; 36 201]
Adrenaline / Ischemia        0.90       1.5, 1.1, 4.2    [375 25; 38 209]
Adrenaline / Fluid loading   0.94       1.1, 5.4, 1.6    [391 18; 22 237]
Esmolol / Ischemia           0.74       5.2, 5.2, 4.2    [191 82; 31 128]
Esmolol / Fluid loading      0.63       5.3, 5.3, 4.2    [144 83; 78 130]
Ischemia / Fluid loading     0.79       4.2, 4.2, 4.2    [203 71; 31 184]
Table 4.7: Table showing the accuracy and confusion matrix of each class for the model that performed second best, which was the decision tree using all features and Relieff feature selection. The features chosen by Relieff are also shown; the numbers in the third column refer to the list of features in section 4.2.3, listed according to rank, where the leftmost feature is the highest ranked.
It is evident that samples from the raw data of both the acceleration and the velocity (both from the circumferential direction) are often chosen. This illustrates the usefulness of feature selection: using all samples from the raw data would create a high-dimensional feature space and make the curse of dimensionality problem (section 2.4.1) imminent. Moreover, finding the most discriminative sample numbers without such methods is a demanding task.
The most common feature that is not from the raw data is the feature denoted 1.1, that is, the peak acceleration in the circumferential direction from the midsystolic phase. This is also the feature deemed most useful by Halvorsen et al. [30].
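Relief-family methods rank features by how well they separate each instance from its nearest neighbour of the opposite class, after which a fixed-size subset can be taken from the ranking. A minimal sketch of the basic binary Relief variant on toy data (a simplification of the ReliefF actually used; the data and feature count are illustrative only):

```python
import random

def relief(X, y, n_iter=100, seed=0):
    """Basic Relief feature weights for a binary task (simplified ReliefF)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        i = rng.randrange(n)

        def nearest(same_class):
            # Nearest neighbour of i within (hit) or outside (miss) its class.
            best, best_dist = None, float("inf")
            for j in range(n):
                if j == i or (y[j] == y[i]) != same_class:
                    continue
                dist = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
                if dist < best_dist:
                    best, best_dist = j, dist
            return best

        hit, miss = nearest(True), nearest(False)
        for f in range(d):
            w[f] -= abs(X[i][f] - X[hit][f]) / n_iter   # near hit: penalise
            w[f] += abs(X[i][f] - X[miss][f]) / n_iter  # near miss: reward
    return w

# Toy data: feature 0 separates the classes, features 1-3 are uninformative.
X = [[0.0, 0.3, 0.7, 0.2], [0.1, 0.9, 0.1, 0.8], [0.05, 0.5, 0.4, 0.5],
     [1.0, 0.4, 0.6, 0.3], [0.9, 0.8, 0.2, 0.7], [0.95, 0.1, 0.9, 0.4]]
y = [0, 0, 0, 1, 1, 1]

weights = relief(X, y)
# Keep only the three highest-ranked features, as in the model above.
top3 = sorted(range(len(weights)), key=lambda f: -weights[f])[:3]
```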
4.3.3 Model using the raw data as features
As seen in the table of initial experiments, some of the best-performing models used the raw data as features. It is therefore interesting to see whether there is a range of sample numbers deemed useful by the feature selection method. This can give insight into which parts of a cycle discriminate best between the classes.
The best-performing model using the raw data used CFS feature selection, where the raw data are the acceleration and velocity in the circumferential direction. Note that the data are interpolated or decimated to 300 samples, as described in section 3.2.3. The finer grid search, seen in table 4.2, was performed. The achieved average accuracy was 0.79, and the decomposition of the average accuracy can be seen in table 4.8.
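The finer grid search amounts to an exhaustive loop over the hyperparameter grids, keeping the setting with the best score. The grids and the scoring function below are placeholders, not the values from table 4.2:

```python
from itertools import product

# Placeholder grids; the actual grids are those given in table 4.2.
nlearn_grid = [25, 50, 100]
learnrate_grid = [0.1, 0.2, 0.5, 0.7, 1.0]

def evaluate(nlearn, learnrate):
    # Stand-in for the cross-validated accuracy of an AdaBoost model
    # trained with these hyperparameters; here it simply peaks at the
    # values the best raw-data model ended up with (NLearn 25, rate 0.7).
    return -abs(nlearn - 25) - abs(learnrate - 0.7)

# Exhaustive search: score every combination and keep the best.
best = max(product(nlearn_grid, learnrate_grid),
           key=lambda params: evaluate(*params))
```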
Dysfunctions                 Accuracy   Confusion matrix
Baseline / Fluid loading     0.41       [114 172; 111 83]
Adrenaline / Fluid loading   0.82       [355 61; 58 194]
Esmolol / Ischemia           0.72       [153 51; 69 159]
Esmolol / Fluid loading      0.78       [178 52; 44 161]
Ischemia / Fluid loading     0.83       [203 52; 31 203]
Table 4.8: Table showing the accuracy and confusion matrix of a model using only the raw data as features, in combination with the CFS feature selection method.
Figure 4.1 shows how many times a sample number from the circumferential acceleration was chosen by CFS to be used as a feature, while figure 4.2 shows how many times a sample number from the circumferential velocity was used as a feature. There were ten classification tasks in total, so a sample number could be selected at most ten times.
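The counts shown in the figures can be produced by tallying, for each sample number, how many of the tasks' CFS subsets contain it. A minimal sketch with hypothetical subsets (only three tasks shown; the thesis has ten):

```python
from collections import Counter

# Hypothetical CFS subsets: for each two-class task, the sample numbers
# selected from the circumferential acceleration.
selected_per_task = [
    {40, 41, 200},
    {40, 198, 201},
    {41, 60, 200},
]

counts = Counter()
for subset in selected_per_task:
    counts.update(subset)   # each sample counted at most once per task

# counts[s] is the number of tasks that selected sample s,
# bounded above by the number of tasks.
```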
It was reasoned that the acceleration in the midsystolic phase discriminates well between classes. The midsystolic phase was approximated to lie in the range [0, 150] ms. A heartbeat lasts approximately 0.6-1 seconds, so with 300 samples per beat the sample numbers in the range [0, 45-75] are in the midsystolic phase. This is also a popular range for CFS, as seen in figures 4.1 and 4.2.
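The conversion from a time within the beat to a sample number on the fixed 300-sample grid is a simple proportion; the helper below only illustrates the arithmetic:

```python
def ms_to_sample(t_ms, beat_duration_s, n_samples=300):
    """Map a time offset within the beat (ms) to an index on the
    fixed n_samples grid, for a beat of the given duration (s)."""
    return round(t_ms / 1000.0 / beat_duration_s * n_samples)

# End of the midsystolic phase (150 ms) for the extreme beat durations:
end_short = ms_to_sample(150, 0.6)   # sample 75 for a 0.6 s beat
end_long = ms_to_sample(150, 1.0)    # sample 45 for a 1 s beat
```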
It was also reasoned that the IVR phase could discriminate well; it lies in the sample number range of approximately [109, 146] when using 0.8 seconds as the duration of the heartbeat. These samples are not chosen as often as the samples from the midsystolic phase. This is also expected, as the initial results show that models using features from the midsystolic phase performed better than those using features from the IVR phase. Instead, samples in the range of sample number 200 ± 25 are often chosen. Note that the average number of samples chosen from the acceleration is 22, while the average from the velocity is 17.6. However, samples from the acceleration were not chosen significantly more often than samples from the velocity (P = 0.31).
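The text does not state which test produced P = 0.31. One possible way to compare the per-task selection counts is a permutation test on the difference in means; the per-task counts below are hypothetical values matching the reported means (22 and 17.6), so this sketch will not reproduce P = 0.31:

```python
import random

def perm_test(a, b, n_perm=5000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        pa, pb = pooled[:len(a)], pooled[len(a):]
        extreme += abs(sum(pa) / len(a) - sum(pb) / len(b)) >= observed
    return extreme / n_perm

# Hypothetical per-task counts of selected samples for the ten tasks.
acc_counts = [25, 20, 24, 19, 23, 22, 21, 26, 18, 22]   # mean 22
vel_counts = [18, 16, 19, 17, 20, 15, 18, 19, 17, 17]   # mean 17.6

p_value = perm_test(acc_counts, vel_counts)
```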
Figure 4.1: Figure showing how many times each sample number from the acceleration in the circumferential direction is chosen to be used as a feature in the two-class classification task.