This section presents details of three of the models created in sections 4.1 and 4.2. These models either achieved a high accuracy or possess some interesting feature characteristics. The models are specified in table 4.5, and they are assessed in the next three subsections in the same order as they appear in the table.
4.3. DETAILS OF THREE TWO-CLASS MODELS
Classifier      Hyperparameters                             Feature set    FS       Accuracy
AdaBoost        NLearn: 100, LearnRate: 0.2                 All features   CFS      0.84
Decision tree   MaxNumSplits: 4, SplitCriterion: deviance   All features   Relieff  0.76
AdaBoost        NLearn: 25, LearnRate: 0.7                  Raw data       CFS      0.79
Table 4.5: Table specifying the models that will be further investigated.
4.3.1 Best performing model
This model achieved the highest average accuracy in the two-class classification task, hence a more detailed description of it is given here. A decomposition of the average accuracy can be seen in table 4.6.
Dysfunctions                 Accuracy   Confusion matrix
Baseline / Fluid loading     0.63       [164 119; 61 136]
Adrenaline / Fluid loading   0.94       [401 26; 12 229]
Esmolol / Ischemia           0.73       [150 46; 72 164]
Esmolol / Fluid loading      0.81       [178 39; 44 174]
Ischemia / Fluid loading     0.80       [186 52; 48 203]
Table 4.6: Table showing the accuracy and confusion matrix of each class for the model that performed best, which was AdaBoost using all features, achieving an average accuracy of 0.84. In the confusion matrices, columns represent the true class, while rows represent the predicted class.
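Each per-task accuracy in the table is simply the diagonal of its confusion matrix divided by the total number of classified instances. As a quick check, a sketch for the adrenaline versus fluid loading row:

```python
# Confusion matrix for the adrenaline vs. fluid loading task in table 4.6
# (columns are the true class, rows the predicted class).
cm = [[401, 26],
      [12, 229]]

correct = cm[0][0] + cm[1][1]            # diagonal: correct predictions
total = sum(sum(row) for row in cm)      # all predictions
accuracy = correct / total               # 630 / 668, about 0.94
```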
As expected, the combinations involving adrenaline perform best. Adrenaline is the only intervention that increases the heart rate compared to baseline. The two combinations with the lowest accuracy are the ones involving baseline versus esmolol and baseline versus fluid loading, respectively.
Since the data come from real-world measurements, the influence of noise and inaccuracies in the data must be assessed. If the prediction errors often come from the same animals, it suggests that the measurements from those animals are particularly noisy. Analyzing the best model reveals one animal with an average error rate of 30% and two animals with an average error rate of 23%, while the mean average error rate across all animals is 15%. This shows that some animals were somewhat noisy and did have a negative effect on the results. However, noise is inherent in every real-world measurement, so discarding the noisy animals would give a result that does not reflect the true quality of the models.
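The per-animal error rates above can be computed by grouping predictions by animal. A minimal sketch, assuming predictions are available as (animal, true label, predicted label) tuples; the record format and animal IDs are hypothetical:

```python
from collections import defaultdict

# Hypothetical prediction records: (animal_id, true_label, predicted_label).
records = [
    ("pig1", "baseline", "baseline"),
    ("pig1", "esmolol", "baseline"),    # one misclassification for pig1
    ("pig2", "baseline", "baseline"),
    ("pig2", "esmolol", "esmolol"),
    ("pig2", "ischemia", "ischemia"),
]

errors = defaultdict(int)
totals = defaultdict(int)
for animal, actual, pred in records:
    totals[animal] += 1
    errors[animal] += (actual != pred)

# Error rate per animal, and the mean error rate across animals.
rates = {a: errors[a] / totals[a] for a in totals}
mean_rate = sum(rates.values()) / len(rates)
```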
The average number of features chosen by CFS was 23. The next subsection introduces a model using only three features, hence a more detailed analysis of which features are most discriminative is given there.
4.3.2 Model using only three features
The decision tree using all features and Relieff achieved an average accuracy of 0.78, as seen in table 4.4. Relieff used 13 features, but limiting it to three features gives only a small decrease in average accuracy, down to 0.77.
It is interesting to see which features are deemed most relevant by Relieff in the different classification tasks. Hence, table 4.7 decomposes the average accuracy and shows which features are used.
Dysfunctions                 Accuracy   Features         Confusion matrix
Baseline / Adrenaline        0.77       1.1, 5.2, 5.2    [127 44; 98 369]
Baseline / Esmolol           0.50       4.2, 4.2, 1.1    [105 113; 98 109]
Baseline / Ischemia          0.74       5.2, 5.2, 4.2    [188 47; 107 261]
Baseline / Fluid loading     0.72       1.1, 5.2, 5.2    [170 79; 55 176]
Adrenaline / Esmolol         0.90       1.1, 5.1, 5.1    [287 21; 36 201]
Adrenaline / Ischemia        0.90       1.5, 1.1, 4.2    [375 25; 38 209]
Adrenaline / Fluid loading   0.94       1.1, 5.4, 1.6    [391 18; 22 237]
Esmolol / Ischemia           0.74       5.2, 5.2, 4.2    [191 82; 31 128]
Esmolol / Fluid loading      0.63       5.3, 5.3, 4.2    [144 83; 78 130]
Ischemia / Fluid loading     0.79       4.2, 4.2, 4.2    [203 71; 31 184]
Table 4.7: Table showing the accuracy and confusion matrix of each class for the model that performed second best, which was the decision tree using all features and Relieff feature selection. The features chosen by Relieff are also shown; the numbers in the third column refer to the list of features in section 4.2.3, listed according to rank, where the leftmost feature is the highest ranked.
It is evident that samples from the raw data of both the acceleration and the velocity (both from the circumferential direction) are often chosen. This illustrates the usefulness of feature selection: using all samples from the raw data would create a high-dimensional feature space and make the curse of dimensionality problem (section 2.4.1) imminent. Moreover, finding the most discriminative sample numbers without such methods is a demanding task.
The most common feature that is not from the raw data is the feature denoted 1.1, that is, the peak acceleration in the circumferential direction from the midsystolic phase. This is also the feature deemed most useful by Halvorsen et al. [30].
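Relief-family methods rank features by how well they separate each instance from its nearest neighbour of the opposite class, after which a fixed-size subset can be taken from the ranking. A minimal sketch of the basic binary Relief variant on toy data (a simplification of the ReliefF actually used; the data and feature count are illustrative only):

```python
import random

def relief(X, y, n_iter=100, seed=0):
    """Basic Relief feature weights for a binary task (simplified ReliefF)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        i = rng.randrange(n)

        def nearest(same_class):
            # Nearest neighbour of i within (hit) or outside (miss) its class.
            best, best_dist = None, float("inf")
            for j in range(n):
                if j == i or (y[j] == y[i]) != same_class:
                    continue
                dist = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
                if dist < best_dist:
                    best, best_dist = j, dist
            return best

        hit, miss = nearest(True), nearest(False)
        for f in range(d):
            w[f] -= abs(X[i][f] - X[hit][f]) / n_iter   # near hit: penalise
            w[f] += abs(X[i][f] - X[miss][f]) / n_iter  # near miss: reward
    return w

# Toy data: feature 0 separates the classes, features 1-3 are uninformative.
X = [[0.0, 0.3, 0.7, 0.2], [0.1, 0.9, 0.1, 0.8], [0.05, 0.5, 0.4, 0.5],
     [1.0, 0.4, 0.6, 0.3], [0.9, 0.8, 0.2, 0.7], [0.95, 0.1, 0.9, 0.4]]
y = [0, 0, 0, 1, 1, 1]

weights = relief(X, y)
# Keep only the three highest-ranked features, as in the model above.
top3 = sorted(range(len(weights)), key=lambda f: -weights[f])[:3]
```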
4.3.3 Model using the raw data as features
As seen in the table of initial experiments, some of the best-performing models used the raw data as features. It is therefore interesting to see whether there is a range of sample numbers deemed useful by the feature selection method. This can give insight into which parts of a cycle discriminate best between the classes.
The best-performing model using the raw data used CFS feature selection, where the raw data are the acceleration and velocity in the circumferential direction. Note that the data are interpolated or decimated to 300 samples, as described in section 3.2.3. The finer grid search, seen in table 4.2, was performed. The achieved average accuracy was 0.79, and the decomposition of the average accuracy can be seen in table 4.8.
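The finer grid search amounts to an exhaustive loop over the hyperparameter grids, keeping the setting with the best score. The grids and the scoring function below are placeholders, not the values from table 4.2:

```python
from itertools import product

# Placeholder grids; the actual grids are those given in table 4.2.
nlearn_grid = [25, 50, 100]
learnrate_grid = [0.1, 0.2, 0.5, 0.7, 1.0]

def evaluate(nlearn, learnrate):
    # Stand-in for the cross-validated accuracy of an AdaBoost model
    # trained with these hyperparameters; here it simply peaks at the
    # values the best raw-data model ended up with (NLearn 25, rate 0.7).
    return -abs(nlearn - 25) - abs(learnrate - 0.7)

# Exhaustive search: score every combination and keep the best.
best = max(product(nlearn_grid, learnrate_grid),
           key=lambda params: evaluate(*params))
```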
Dysfunctions                 Accuracy   Confusion matrix
Baseline / Fluid loading     0.41       [114 172; 111 83]
Adrenaline / Fluid loading   0.82       [355 61; 58 194]
Esmolol / Ischemia           0.72       [153 51; 69 159]
Esmolol / Fluid loading      0.78       [178 52; 44 161]
Ischemia / Fluid loading     0.83       [203 52; 31 203]
Table 4.8: Table showing the accuracy and confusion matrix of a model using only the raw data as features, in combination with the CFS feature selection method.
Figure 4.1 shows how many times a sample number from the circumferential acceleration was chosen by CFS to be used as a feature, while figure 4.2 shows how many times a sample number from the circumferential velocity was used as a feature. There were ten classification tasks in total, so a sample number could be selected at most ten times.
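The counts shown in the figures can be produced by tallying, for each sample number, how many of the tasks' CFS subsets contain it. A minimal sketch with hypothetical subsets (only three tasks shown; the thesis has ten):

```python
from collections import Counter

# Hypothetical CFS subsets: for each two-class task, the sample numbers
# selected from the circumferential acceleration.
selected_per_task = [
    {40, 41, 200},
    {40, 198, 201},
    {41, 60, 200},
]

counts = Counter()
for subset in selected_per_task:
    counts.update(subset)   # each sample counted at most once per task

# counts[s] is the number of tasks that selected sample s,
# bounded above by the number of tasks.
```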
It was reasoned that the acceleration in the midsystolic phase discriminates well between classes. The midsystolic phase was approximated to lie in the range [0, 150] ms. A heartbeat lasts approximately 0.6-1 seconds, so with 300 samples per beat the sample numbers in the range [0, 45-75] are in the midsystolic phase. This is also a popular range for CFS, as seen in figures 4.1 and 4.2.
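The conversion from a time within the beat to a sample number on the fixed 300-sample grid is a simple proportion; the helper below only illustrates the arithmetic:

```python
def ms_to_sample(t_ms, beat_duration_s, n_samples=300):
    """Map a time offset within the beat (ms) to an index on the
    fixed n_samples grid, for a beat of the given duration (s)."""
    return round(t_ms / 1000.0 / beat_duration_s * n_samples)

# End of the midsystolic phase (150 ms) for the extreme beat durations:
end_short = ms_to_sample(150, 0.6)   # sample 75 for a 0.6 s beat
end_long = ms_to_sample(150, 1.0)    # sample 45 for a 1 s beat
```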
It was also reasoned that the IVR phase could discriminate well; it lies in the sample number range of approximately [109, 146] when using 0.8 seconds as the duration of the heartbeat. These samples are not chosen as often as the samples from the midsystolic phase. This is also expected, as the initial results show that models using features from the midsystolic phase performed better than those using features from the IVR phase. Instead, samples in the range of sample number 200 ± 25 are often chosen. Note that the average number of samples chosen from the acceleration is 22, while the average from the velocity is 17.6. However, samples from the acceleration were not chosen significantly more often than samples from the velocity (P = 0.31).
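The text does not state which test produced P = 0.31. One possible way to compare the per-task selection counts is a permutation test on the difference in means; the per-task counts below are hypothetical values matching the reported means (22 and 17.6), so this sketch will not reproduce P = 0.31:

```python
import random

def perm_test(a, b, n_perm=5000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        pa, pb = pooled[:len(a)], pooled[len(a):]
        extreme += abs(sum(pa) / len(a) - sum(pb) / len(b)) >= observed
    return extreme / n_perm

# Hypothetical per-task counts of selected samples for the ten tasks.
acc_counts = [25, 20, 24, 19, 23, 22, 21, 26, 18, 22]   # mean 22
vel_counts = [18, 16, 19, 17, 20, 15, 18, 19, 17, 17]   # mean 17.6

p_value = perm_test(acc_counts, vel_counts)
```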
Figure 4.1: Figure showing how many times each sample number from the acceleration in the circumferential direction is chosen to be used as a feature in the two-class classification task.