
4.5 Adding classes to the classification task

4.5.4 Five classes - results

Lastly, the results of the five-class classification task are presented. Table 4.17 shows the details of the two best-performing models.

Since the best model performed poorly in predicting esmolol, the second-best model was also included and is shown in table 4.19.

Classifier   Feature selection          Hyperparameters                            Average accuracy
AdaBoost     ReliefF using 6 features   LearnRate: 0.8, NLearn: 200                0.54
DT           CFS                        MaxNumSplits: 8, SplitCriterion: twoing    0.50

Table 4.17: Details of the best-performing models using five classes.

                         Actual class
                         Baseline  Adrenaline  Esmolol  Ischemia  Fluid loading
Predicted Baseline             87          37       17        10             43
Predicted Adrenaline           41         249       14        15              3
Predicted Esmolol               0           0        5         0              0
Predicted Ischemia             63          27      141       185             59
Predicted Fluid loading        12          10       45         0            108

Table 4.18: Confusion matrix of the model achieving the best accuracy when including all classes.

                         Actual class
                         Baseline  Adrenaline  Esmolol  Ischemia  Fluid loading
Predicted Baseline            110          76       32        10             40
Predicted Adrenaline           59         172        0        46             14
Predicted Esmolol              12           0       68         7              7
Predicted Ischemia             11          61       76       136             53
Predicted Fluid loading        11          14       46        11             99

Table 4.19: Confusion matrix of the model achieving the second-best accuracy when including all classes.
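The accuracies in table 4.17 can be recomputed from the two confusion matrices. The sketch below is illustrative only and is not the code used in this work. It assumes that the columns hold the actual classes in the same order as the rows (baseline, adrenaline, esmolol, ischemia, fluid loading); the function names are hypothetical.

```python
CLASSES = ["baseline", "adrenaline", "esmolol", "ischemia", "fluid loading"]

# Rows: predicted class. Columns: actual class (assumed ordering, see above).
TABLE_4_18 = [
    [87, 37, 17, 10, 43],
    [41, 249, 14, 15, 3],
    [0, 0, 5, 0, 0],
    [63, 27, 141, 185, 59],
    [12, 10, 45, 0, 108],
]
TABLE_4_19 = [
    [110, 76, 32, 10, 40],
    [59, 172, 0, 46, 14],
    [12, 0, 68, 7, 7],
    [11, 61, 76, 136, 53],
    [11, 14, 46, 11, 99],
]


def overall_accuracy(matrix):
    """Fraction of all samples on the diagonal (predicted == actual)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def recall_per_class(matrix):
    """For each actual class (column), the fraction that was predicted correctly."""
    n = len(matrix)
    return [matrix[j][j] / sum(matrix[i][j] for i in range(n)) for j in range(n)]


for name, matrix in [("Table 4.18", TABLE_4_18), ("Table 4.19", TABLE_4_19)]:
    print(f"{name}: accuracy {overall_accuracy(matrix):.2f}")
    for cls, recall in zip(CLASSES, recall_per_class(matrix)):
        print(f"  recall for {cls}: {recall:.2f}")
```

Under this assumption the overall accuracies round to 0.54 and 0.50, in agreement with table 4.17. Esmolol is recovered in roughly 2% of cases by the first model and roughly 31% by the second, which is the reason for including the second model.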


4.5.5 Analysis

Three classes The average accuracy dropped from 0.84 to 0.67 when a third class was added. This is a substantial drop and demonstrates that most combinations of three classes perform well below a satisfactory level.

However, the decomposition of the average accuracy, seen in table 4.14, shows that the model can discriminate reasonably well between adrenaline, ischemia and fluid loading, achieving an accuracy of 0.80.

The more clinically interesting combinations are the ones involving baseline, as reasoned in section 3.6. Two combinations involving baseline, both marked in, achieve reasonably good results. These two combinations are: (i) baseline, adrenaline, ischemia, and (ii) baseline, adrenaline, fluid loading. They achieved an accuracy of 0.74 and 0.75, respectively. The remaining combinations did not come close to a satisfactory result, though all performed better than chance (33%).

Four classes In the four-class classification task, the average accuracy decreased further, down to 0.57. The decomposition of the average accuracy reveals that no combination achieved a satisfactory result.

Five classes The final classification task, involving all classes at once, saw a further decrease in accuracy of 0.03, down to 0.54. The model has difficulties predicting esmolol and often mistakes it for ischemia.

Chapter 5 Discussion

This chapter concludes the thesis with a discussion of the results, a concluding summary, and lastly a description of the possibilities for future work.

5.1 General discussion

Five-class classification The results achieved in the five-class classification task, seen in table 4.17, establish that predicting one class out of the five is not feasible. The accuracy of 0.54 is better than chance¹, meaning that there is some characteristic motion for each of the cardiac functions that is picked up by the classifier. Still, the accuracy is substantially lower than satisfactory. Instead of predicting all classes, the model can to some degree distinguish the adrenaline class from the rest. The confusion matrix in table 4.18 shows that the accuracy with respect to the adrenaline class is 0.77. Its clinical use is discussed later in this section.
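The adrenaline-versus-rest reading of table 4.18 can be made explicit by collapsing the five-class confusion matrix into a two-class one. The following is a minimal sketch, not the procedure used in this work. It assumes that the rows are predicted classes and that the columns are actual classes in the order baseline, adrenaline, esmolol, ischemia, fluid loading; `one_vs_rest` is a hypothetical helper.

```python
# Table 4.18. Rows: predicted class. Columns: actual class (assumed ordering).
CONFUSION = [
    [87, 37, 17, 10, 43],
    [41, 249, 14, 15, 3],
    [0, 0, 5, 0, 0],
    [63, 27, 141, 185, 59],
    [12, 10, 45, 0, 108],
]
ADRENALINE = 1  # index of the adrenaline class in the assumed ordering


def one_vs_rest(matrix, k):
    """Collapse a multi-class confusion matrix into counts for class k versus the rest."""
    n = len(matrix)
    tp = matrix[k][k]
    fp = sum(matrix[k][j] for j in range(n) if j != k)  # predicted k, actually another class
    fn = sum(matrix[i][k] for i in range(n) if i != k)  # actually k, predicted another class
    total = sum(sum(row) for row in matrix)
    tn = total - tp - fp - fn
    return tp, fp, fn, tn


tp, fp, fn, tn = one_vs_rest(CONFUSION, ADRENALINE)
recall = tp / (tp + fn)
precision = tp / (tp + fp)
binary_accuracy = (tp + tn) / (tp + fp + fn + tn)
print(f"recall {recall:.2f}, precision {precision:.2f}, accuracy {binary_accuracy:.2f}")
```

Under these assumptions, the value of 0.77 quoted above appears to correspond to the recall for adrenaline (249 of 323 actual adrenaline samples). The precision also rounds to 0.77, and the collapsed two-class accuracy is about 0.87.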

A significant reason for the moderate accuracy is the overlap of data. As mentioned in section 3.2.4, it was difficult to identify any clear differences between the heart functions, which makes the classification task challenging. The overlap is most prominent between ischemia and esmolol. This is confirmed by the confusion matrix in table 4.18, where the classifier often predicts ischemia when the actual cardiac function is esmolol. Moreover, having access to measurements from no more than 20 animals limits the accuracy. Even though segmenting the motion data into cycles increased the size of the dataset, the cycles should preferably come from different animals. The optimal number of animals is likely to be higher, though it is difficult to state exactly how many animals are required to achieve satisfactory results. Inducing the various cardiac functions is a demanding task, which is a weakness of the approach.

Two-class classification Neither the four- nor the three-class classification task achieved adequate results. The best-performing classifier on all combinations of two classes, though, achieved an average accuracy of 0.84. This demonstrates that there is enough difference in motion between certain heart functions to be noticed by a machine learning model.

¹ Choosing a heart function at random. Five classes imply an accuracy of approximately 0.2.

As mentioned in section 3.6, a two-class classifier is relevant for clinical use since certain cardiac functions are expected during and after cardiac surgery. Five combinations are relevant, four of which involve baseline, as this is either the initial or the desirable cardiac function. Esmolol versus ischemia is also of clinical use since esmolol is often injected to decrease the heart rate, but injecting too much might make the heart ischemic. The results of the useful combinations are shown in table 4.10, where three out of the five combinations achieved a precision and recall of more than 0.80. Baseline versus adrenaline was the best combination out of the five, achieving a precision and recall of 0.97 and 0.96, respectively. These results demonstrate that the model can be used as a monitoring technique when the adrenaline dysfunction is expected.

Baseline versus ischemia achieved a precision and recall of 0.87 and 0.94, respectively. This result is an improvement compared to the study by Manocha et al. [44], which investigated 22 different machine learning models detecting ischemia using electrocardiography (ECG). It was reported that the average precision and recall were 0.81 and 0.83, respectively. Because of the inherent drawbacks of the ECG as an ischemia detection technique, Manocha et al. proposed that a consolidation of two or more techniques may result in a better classifier for the detection of ischemia. The results indicate that the technique described in this project can be a suitable part of such a consolidation.

One difference between this study and the one mentioned above is the use of pig hearts rather than human hearts. However, Douglas et al. [18] state that the structure of the cardiac blood vessels of the pig very closely duplicates that of humans. Furthermore, it was stated that pigs may be useful in investigating the treatment of arrhythmias (adrenaline, esmolol), myocardial infarction (ischemia) and the effects of increased wall tension (fluid loading).

Esmolol versus ischemia achieved a precision and recall of 0.71 and 0.85, respectively. A recall of 0.85 implies that a high fraction of those with ischemia will be correctly classified as such, which is clinically important. The lower precision of 0.71 suggests that some of those who have been classified with ischemia are misclassified. However, this combination can still be relevant for clinical use if further assessment is done before treatment.
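For reference, the standard definitions behind this interpretation are given below, taking ischemia as the positive class (an assumption consistent with the reading above). TP, FP and FN denote the number of true positives, false positives and false negatives.

```latex
\[
\text{precision} = \frac{TP}{TP + FP},
\qquad
\text{recall} = \frac{TP}{TP + FN}
\]
```

With these definitions, a precision of 0.71 means that about 29% of the samples classified as ischemia were in fact esmolol, while a recall of 0.85 means that about 15% of the ischemia samples were missed.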

Baseline versus fluid loading achieved a precision and recall of 0.80 and 0.81, respectively. The results can be of clinical use. Frequently used standard preload indexes such as central venous pressure (CVP) or pulmonary capillary wedge pressure (PCWP) often fail to provide reliable information on cardiac preload and are not capable of predicting a cardiac response to fluid therapy [41, 48]. It is important to note, however, that the accelerometer technique has only been tested on an increase in preload and not on a decrease.
