
In order for the model to be used for more than two cardiac functions, further research is needed. Collecting motion data from several animals would ensure a more robust and presumably better performing model.

Additionally, attaching a gyroscope to the pacemaker lead would give information about the rotation of the heart. Having more information could further aid the model.

It was proposed that the technique described in this project could be used in conjunction with ECG. Such a combination would need further research in order to validate the technique.

This project has not investigated the use of ensemble learners, that is, combining several classifiers and predicting based on what the majority of the classifiers predict. A different approach could be to use classifiers which can output probabilities. The probability could then be interpreted as how close the heart is to being in one of the cardiac functions.
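As an illustration of both ideas, the sketch below trains a bagged tree ensemble in MATLAB and reads out the per-class scores from predict. This is a minimal sketch and not the method used in this project; X, Y, and Xnew are hypothetical feature matrices and labels standing in for the motion data.

    % Minimal sketch (assumed inputs): a bagged ensemble of 100 decision
    % trees. X is n-by-p features, Y is n-by-1 labels, Xnew is new data.
    ens = fitensemble(X, Y, 'Bag', 100, 'Tree', 'Type', 'classification');

    % 'labels' holds the majority-vote predictions; each row of 'scores'
    % holds the fraction of trees voting for each class, which can be read
    % as how close the heart is to each cardiac function.
    [labels, scores] = predict(ens, Xnew);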

An approach where the classifier knew the baseline at all times, and instead predicted the transition from baseline to another dysfunction, would be of clinical use, since baseline is often the initial cardiac function.

Lastly, it is worth mentioning that an anomaly detector would be beneficial. When motion data does not belong to any of the classes, the model should not attempt to predict one of them, but instead report that an abnormality was seen.
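One possible realization, sketched below under stated assumptions, is MATLAB's one-class SVM: trained on normal motion data only, it flags observations that fall outside the learned region. Xtrain and Xnew are hypothetical motion-feature matrices.

    % Sketch of an anomaly detector via one-class SVM. All training labels
    % are identical, and 'OutlierFraction' sets the expected share of
    % abnormal observations (0.05 is an arbitrary choice).
    oc = fitcsvm(Xtrain, ones(size(Xtrain, 1), 1), ...
        'KernelFunction', 'rbf', 'KernelScale', 'auto', ...
        'Standardize', true, 'OutlierFraction', 0.05);

    % Negative scores indicate observations outside the learned region,
    % i.e. motion data that matches none of the known classes.
    [~, score] = predict(oc, Xnew);
    isAnomaly = score < 0;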

Appendices

Appendix A

MATLAB's decision tree split criteria

MATLAB's split criterion functions for decision trees are provided here. The split criterion is what defines which node to create next. Let $p(i|t)$ denote the fraction of inputs belonging to a class $i$ at a given node $t$.

Gini's Diversity Index  The Gini diversity index (gdi) is the most broadly used [65] and is given by:

\[
\mathrm{Gini}(t) = 1 - \sum_{i=1}^{c} \left[ p(i|t) \right]^2 \tag{A.1}
\]

where $c$ is the number of classes. A node with just one class (a pure node) has Gini index 0; otherwise the Gini index is positive.
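As a small illustration (not from the thesis), the index can be evaluated directly in MATLAB from a vector of class fractions:

    % Gini diversity index (equation A.1) for a vector p of class
    % fractions p(i|t) that sums to one.
    gini = @(p) 1 - sum(p.^2);
    gini([1 0 0])    % pure node  -> 0
    gini([0.5 0.5])  % mixed node -> 0.5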

Deviance The Deviance is given by:

\[
\mathrm{Deviance}(t) = -\sum_{i=1}^{c} p(i|t) \log_2 p(i|t) \tag{A.2}
\]
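A corresponding MATLAB one-liner, using the convention that $0 \log_2 0 = 0$ so that a pure node has deviance 0, could look like this (illustrative only):

    % Deviance (equation A.2); terms with p = 0 are dropped, following
    % the convention 0*log2(0) = 0.
    dev = @(p) -sum(p(p > 0) .* log2(p(p > 0)));
    dev([1 0 0])    % pure node        -> 0
    dev([0.5 0.5])  % maximally mixed  -> 1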

Twoing rule  Unlike the gdi, the twoing rule will search for two classes that together make up more than 50% of the data. It will maximize the following change-of-impurity measure:

\[
P_l P_r \left( \sum_{i=1}^{c} \left| p(i|t_l) - p(i|t_r) \right| \right)^2
\]

The above equation is maximized, where $P_l$ and $P_r$ are the fractions of observations that split to the left and right respectively, and $t_l$ and $t_r$ refer to the left and right child nodes.

Little research has been devoted to understanding which of these criteria works best for different kinds of problems. As in many other areas of machine learning, all three must be tried and tested to see which one performs best.
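In MATLAB this trial-and-error comparison can be run directly, since fitctree exposes all three criteria through the 'SplitCriterion' option. The sketch below uses the built-in Fisher iris data as a stand-in for the project's motion data:

    % Compare the three split criteria by 10-fold cross-validated loss.
    load fisheriris                       % meas: 150-by-4, species: labels
    criteria = {'gdi', 'deviance', 'twoing'};
    for i = 1:numel(criteria)
        tree = fitctree(meas, species, ...
            'SplitCriterion', criteria{i}, 'CrossVal', 'on');
        fprintf('%-10s loss: %.4f\n', criteria{i}, kfoldLoss(tree));
    end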

Appendix B

MATLAB's SVM kernels

The kernels provided by MATLAB for use in MATLAB's SVM classifier are shown in the table below.

Name                                    Values
Gaussian radial basis function (RBF)    $\exp\left(-\|\vec{x}-\vec{y}'\|^2/(2\sigma^2)\right)$ or $\exp\left(-\gamma\|\vec{x}-\vec{y}'\|^2\right)$
Polynomial                              $(1+\vec{x}\cdot\vec{y}')^{d}$
Linear                                  $\vec{x}\cdot\vec{y}'$

Table B.1: MATLAB's SVM kernels. $\sigma$, $\gamma$, and $d$ are hyperparameters which must be set beforehand, outside of the training process ($d$ = number of dimensions). Hyperparameters are reviewed in section 2.5.3.

Kernels can also be thought of as a similarity measure. Naturally, if the problem is linearly separable in the original space, the transformation is not needed; this is known as the linear kernel. The polynomial kernel will, in contrast with the linear kernel, calculate feature conjunctions up to the order of $d$ dimensions. Setting the order too high will fit the data well, but will not generalize. The RBF kernel will, intuitively, create bell-shaped curves centered around every support vector, where a support vector is a feature vector touching the margin, as seen in figure 2.14. Sigma controls the width of the bell: a large sigma gives a pointed bump, while a small sigma gives a softer and broader bump. Consequently, setting the sigma too small will make the classifier unable to recognize the pattern in the data.

Setting the sigma too large will make the classifier overfit, meaning that it only represents the current training data and cannot generalize to new and unseen data.
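The kernels in table B.1 are selected in MATLAB through fitcsvm's 'KernelFunction' option. The sketch below shows the three choices on hypothetical feature data X with labels Y; 'KernelScale' plays the role of sigma and 'PolynomialOrder' the role of $d$:

    % Illustrative kernel choices for MATLAB's SVM classifier.
    % X (n-by-p features) and Y (n-by-1 labels, two classes) are assumed.
    svmLinear = fitcsvm(X, Y, 'KernelFunction', 'linear');
    svmPoly   = fitcsvm(X, Y, 'KernelFunction', 'polynomial', ...
                        'PolynomialOrder', 3);   % d = 3
    svmRBF    = fitcsvm(X, Y, 'KernelFunction', 'rbf', ...
                        'KernelScale', 2.0);     % plays the role of sigma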


Appendix C

Statistics

C.1 Feature selection methods

Relieff  Relieff randomly selects examples, denoted $R_i$, and searches for their $k$ nearest neighbours from the same class, called the nearest hits $H$, and $k$ nearest neighbours from the other class, called the nearest misses $M$. Relieff updates the rank of every feature depending on its values for $R_i$, $M$, and $H$. If the instances $R_i$ and $H$ have different values of a feature $F$, then the feature $F$ separates two instances with the same class, which is not desirable, so the rank of that feature is decreased. On the other hand, if the instances $R_i$ and $M$ have different values of the feature $F$, then the feature separates two instances with different class values, which is desirable, so the rank of the feature is increased.

Figure C.1: Figure showing the basic idea of the Relieff feature selection algorithm with $k$ set to one.

Relieff is one of the most successful [17] feature selection methods. Robnik et al. [55] (1997) have shown that Relieff is effective at detecting relevant features, even when these features are highly dependent on other features. Robnik et al. [56] (2003) also showed that it is robust in the number of nearest neighbours as long as it remains relatively small. A drawback of Relieff is its randomness: as mentioned, it chooses examples at random, hence the results can vary in each run of the algorithm. Additionally, a scheme for selecting the number of features to include is necessary, as Relieff only ranks the features. One way to solve this is by sorting the features based on their assigned rank, and then iteratively including the highest-ranked feature not yet included and classifying each time. This is done until all features have been included, and the number of features that yielded the best result is used.
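MATLAB's relieff function returns exactly this ranking, so the scheme can be sketched as below. This is illustrative; X, y, the choice of $k = 10$ neighbours, and the tree classifier are assumptions.

    % Rank features with Relieff, then grow the feature set one rank at a
    % time and keep the size giving the lowest cross-validated error.
    [ranked, ~] = relieff(X, y, 10);      % indices sorted by Relieff rank
    p = size(X, 2);
    losses = zeros(1, p);
    for m = 1:p
        mdl = fitctree(X(:, ranked(1:m)), y, 'CrossVal', 'on');
        losses(m) = kfoldLoss(mdl);
    end
    [~, bestM] = min(losses);             % best number of features
    bestFeatures = ranked(1:bestM);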

Correlation-based feature selection (CFS)  Correlation-based Feature Selection (CFS) was introduced by Hall [26] and is built on the following hypothesis: "A good feature subset is one that contains features highly correlated with (predictive of) the class, yet uncorrelated with (not predictive of) each other."

CFS is a filter approach and uses a function that gives a feature subset $S$ consisting of $k$ features a merit according to:

\[
\mathrm{Merit}_{S_k} = \frac{k \, \overline{r_{cf}}}{\sqrt{k + k(k-1) \, \overline{r_{ff}}}} \tag{C.1}
\]

Here $\overline{r_{cf}}$ is the average feature-class correlation and $\overline{r_{ff}}$ is the average value of all feature-feature correlations. CFS starts with an empty set of features and adds the feature giving the highest merit according to equation C.1; this is known as a best-first search. It does so until five consecutive fully expanded non-improving subsets have been created. Consequently, CFS outputs the best subset found, instead of ranking all features, thus eliminating the need for a scheme to select the number of features to include, as in Relieff.
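Equation C.1 itself is straightforward to evaluate. A hypothetical MATLAB helper (not part of any actual CFS implementation) could compute the merit of a candidate subset from Pearson correlations:

    % Merit of a feature subset per equation C.1. X is n-by-p, y a numeric
    % class vector, and 'subset' a vector of feature indices (assumptions).
    function m = cfsMerit(X, y, subset)
        k = numel(subset);
        rcf = mean(abs(corr(X(:, subset), y)));    % mean feature-class corr.
        if k == 1
            rff = 0;                               % no feature pairs
        else
            R = abs(corr(X(:, subset)));           % feature-feature corr.
            rff = (sum(R(:)) - k) / (k * (k - 1)); % mean off-diagonal value
        end
        m = (k * rcf) / sqrt(k + k * (k - 1) * rff);
    end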

Because CFS makes use of all the training data at once, it tends to give better results than the wrapper used by Hall et al. on small datasets [27].

The weakness of CFS is the best-first search approach: when a feature is added, it cannot be removed at later stages, so CFS is susceptible to local optima. However, the number of subsets created will not be substantial, so the probability of getting stuck in a local optimum is limited.

C.2 P-value

When investigating a result from hypothesis testing, the p-value states how likely the observed result is under a given hypothesis. More specifically, in hypothesis testing, some event is tested on sample data to see if the event actually has an effect. An example of an event and sample data could be some drug applied to a group of rats. With the result in hand, one would like to know if the difference seen in the sample data arose by chance, or because the event actually had an effect, i.e. investigating whether the drug is actually working. This is done by first creating a null hypothesis, denoted $H_0$, which is the hypothesis stating that there is no effect. An alternative hypothesis, $H_1$, is also created, which is the hypothesis stating that there is an effect. The p-value is defined as the probability, under the assumption that $H_0$ is true, of obtaining a result equal to or more extreme than what was actually observed.

Examples of test statistics for calculating the p-value are the z-test [42], Student's t-test [61], and the Kolmogorov-Smirnov test [46]. If the p-value is below some predefined $\alpha$ value, usually 0.05, the null hypothesis can be rejected, and it is concluded that the event did have an effect on the group.
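For instance, the rat-drug example above maps directly onto MATLAB's two-sample t-test; the data here are simulated placeholders:

    % Two-sample t-test: does the treated group differ from the controls?
    control = randn(20, 1);             % simulated control measurements
    treated = randn(20, 1) + 0.8;       % simulated treated measurements
    [h, p] = ttest2(control, treated);  % h = 1 means H0 rejected at 0.05
    fprintf('p-value: %.4f, reject H0: %d\n', p, h);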

Bibliography

[1] accelerometer. https://www.flickr.com/photos/drdrang/7277161810/. Accessed: 2-3-2016.

[2] ECG. https://upload.wikimedia.org/wikipedia/commons/9/9e/SinusRhythmLabels.svg. Accessed: 2-3-2016.

[3] http://diagnosoft.com/strain/cardiac-strain. http://diagnosoft.com/wp-content/uploads/2014/05/Screen-Shot-2014-05-09-at-5.12.02-PM.png. Accessed: 10-9-2015.

[4] Human physiology. https://en.wikibooks.org/wiki/Human_Physiology/The_cardiovascular_system#/media/File:Diagram_of_the_human_heart_(cropped).svg. Accessed: 3-2-2016.

[5] Ischemia. https://upload.wikimedia.org/wikipedia/commons/d/d1/Blausen_0257_CoronaryArtery_Plaque.png. Accessed: 4-3-2016.

[6] kernel-trick. https://en.wikipedia.org/wiki/Kernel_method#/media/File:Kernel_Machine.png. Accessed: 3-3-2016.

[7] Overfitting. https://commons.wikimedia.org/wiki/File:Overfitting.svg. Accessed: 26-2-2016.

[8] Wiggers diagram. https://upload.wikimedia.org/wikipedia/commons/f/f4/Wiggers_Diagram.svg. Accessed: 2-3-2016.

[9] World Health Organization, Department of Health Statistics and Informatics in the Information, Evidence and Research Cluster. The global burden of disease: 2004 update. page 11, 2004.

[10] AI Belousov, SA Verzakov, and J Von Frese. A flexible classification approach with optimal generalisation performance: support vector machines. Chemometrics and Intelligent Laboratory Systems, 64(1):15–25, 2002.

[11] Steven Bird, Ewan Klein, and Edward Loper. Natural language processing with Python. O'Reilly Media, Inc., 2009.

[12] Yin-Wen Chang, Cho-Jui Hsieh, Kai-Wei Chang, Michael Ringgaard, and Chih-Jen Lin. Training and testing low-degree polynomial data mappings via linear SVM. The Journal of Machine Learning Research, 11:1471–1490, 2010.

[13] Ion Codreanu, Matthew D Robson, Stephen J Golding, Bernd A Jung, Kieran Clarke, and Cameron J Holloway. Longitudinally and circumferentially directed movements of the left ventricle studied by cardiovascular magnetic resonance phase contrast velocity mapping. Journal of Cardiovascular Magnetic Resonance, 12(1):48, 2010.

[14] Dennis VP Cokkinos, Constantinos Pantos, Gerd Heusch, and Heinrich Taegtmeyer. Myocardial ischemia: from mechanisms to therapeutic potentials, volume 21. Springer Science & Business Media, 2006.

[15] Mark E Comunale, Simon C Body, Catherine Ley, Colleen Koch, Gary Roach, Joseph P Mathew, Ahvie Herskowitz, and Dennis T Mangano. The concordance of intraoperative left ventricular wall-motion abnormalities and electrocardiographic ST segment changes: association with outcome after coronary revascularization. The Journal of the American Society of Anesthesiologists, 88(4):945–954, 1998.

[16] Corinna Cortes and Vladimir Vapnik. Support-vector networks. Machine Learning, 20(3):273–297, 1995.

[17] Thomas G Dietterich. Machine-learning research. AI magazine, 18(4):97, 1997.

[18] William Richard Douglas. Of pigs and men and research. Space life sciences, 3(3):226–234, 1972.

[19] Ole Jakob Elle, Steinar Halvorsen, Martin Gunnar Gulbrandsen, Lars Aurdal, Andre Bakken, Eigil Samset, Harald Dugstad, and Erik Fosse. Early recognition of regional cardiac ischemia using a 3-axis accelerometer sensor. Physiological measurement, 26(4):429, 2005.

[20] Yoav Freund, Robert Schapire, and N Abe. A short introduction to boosting. Journal-Japanese Society For Artificial Intelligence, 14(771-780):1612, 1999.

[21] Derek G Gibson and Darrel P Francis. Clinical assessment of left ventricular diastolic function. Heart, 89(2):231–238, 2003.

[22] Ole-Johannes HN Grymyr, Anh-Tuan T Nguyen, Fjodors Tjulkins, Andreas Espinoza, Espen W Remme, Helge Skulstad, Erik Fosse, Kristin Imenes, and Per S Halvorsen. Continuous monitoring of cardiac function by 3-dimensional accelerometers in a closed-chest pig model. Interactive CardioVascular and Thoracic Surgery, page ivv191, 2015.

[23] Ole-Johannes HN Grymyr, Espen W Remme, Andreas Espinoza, Helge Skulstad, Ole J Elle, Erik Fosse, and Per S Halvorsen. Assessment of 3d motion increases the applicability of accelerometers for monitoring left ventricular function. Interactive CardioVascular and Thoracic Surgery, 20(3):329–337, 2015.


[24] Gongde Guo, Daniel Neagu, and Mark TD Cronin. A study on feature selection for toxicity prediction. In Fuzzy Systems and Knowledge Discovery, pages 31–34. Springer, 2005.

[25] Isabelle Guyon and André Elisseeff. An introduction to variable and feature selection. The Journal of Machine Learning Research, 3:1157–1182, 2003.

[26] Mark A Hall and Lloyd A Smith. Feature selection for machine learning: Comparing a correlation-based filter approach to the wrapper. In FLAIRS Conference, volume 1999, pages 235–239, 1999.

[27] Mark A Hall and Lloyd A Smith. Feature selection for machine learning: Comparing a correlation-based filter approach to the wrapper. In FLAIRS Conference, volume 1999, pages 235–239, 1999.

[28] Per Steinar Halvorsen, Andreas Espinoza, Lars Albert Fleischer, Ole Jakob Elle, Lars Hoff, Runar Lundblad, Helge Skulstad, Thor Edvardsen, Halfdan Ihlen, and Erik Fosse. Feasibility of a three-axis epicardial accelerometer in detecting myocardial ischemia in cardiac surgical patients. The Journal of thoracic and cardiovascular surgery, 136(6):1496–1502, 2008.

[29] Per Steinar Halvorsen, LA Fleischer, A Espinoza, OJ Elle, L Hoff, H Skulstad, T Edvardsen, and E Fosse. Detection of myocardial ischaemia by epicardial accelerometers in the pig. British journal of anaesthesia, 102(1):29–37, 2009.

[30] Per Steinar Halvorsen, Espen W Remme, Andreas Espinoza, Helge Skulstad, Runar Lundblad, Jacob Bergsland, Lars Hoff, Kristin Imenes, Thor Edvardsen, Ole Jakob Elle, et al. Automatic real-time detection of myocardial ischemia by epicardial accelerometer. The Journal of thoracic and cardiovascular surgery, 139(4):1026–1032, 2010.

[31] Zena M Hira and Duncan F Gillies. A review of feature selection and feature extraction methods applied on microarray data. Advances in bioinformatics, 2015, 2015.

[32] Per Kristian Hol, Per Snorre Lingaas, Runar Lundblad, Kjell Arne Rein, Karleif Vatne, Hans-Jørgen Smith, Sigurd Nitter-Hauge, and Erik Fosse. Intraoperative angiography leads to graft revision in coronary artery bypass surgery. The Annals of Thoracic Surgery, 78(2):502 – 505, 2004.

[33] Anders Holst and Arndt Jonasson. Classification of movement patterns in skiing. In SCAI, pages 115–124, 2013.

[34] Chih-Wei Hsu, Chih-Chung Chang, Chih-Jen Lin, et al. A practical guide to support vector classification. 2003.

[35] Abdulmassih S Iskandrian, Charles E Bemis, A-Hamid Hakki, Ioannis Panidis, Jaekyeong Heo, J Gerald Toole, Tsushung A Hua, Douglas Allin, and Sally Kane-Marsch. Effects of esmolol on patients with left ventricular dysfunction. Journal of the American College of Cardiology, 8(1):225–231, 1986.

[36] Uday Jain, CJ Laflamme, Anil Aggarwal, James G Ramsay, Mark E Comunale, Sudhanshu Ghoshal, Long Ngo, Krzysztof Ziola, Milton Hollenberg, and Dennis T Mangano. Electrocardiographic and hemodynamic changes and their association with myocardial infarction during coronary artery bypass surgery. A multicenter study. Multicenter Study of Perioperative Ischemia (McSPI) Research Group. Anesthesiology, 86(3):576–591, 1997.

[37] Thorsten Joachims. Text categorization with support vector machines: Learning with many relevant features. Springer, 1998.

[38] Sven Roland Kjellberg, Ulf Rudhe, and Torgny Sjöstrand. The effect of adrenaline on the contraction of the human heart under normal circulatory conditions. Acta Physiologica Scandinavica, 24(4):333–349, 1952.

[39] JD Kneeshaw. Transoesophageal echocardiography (TOE) in the operating room. British journal of anaesthesia, 97(1):77–84, 2006.

[40] Ravindra Koggalage and Saman Halgamuge. Reducing the number of training samples for fast support vector machine classification. Neural Information Processing-Letters and Reviews, 2(3):57–65, 2004.

[41] Anand Kumar, Ramon Anel, Eugene Bunnell, Kalim Habet, Sergio Zanotti, Stephanie Marshall, Alex Neumann, Amjad Ali, Mary Cheang, Clifford Kavinsky, et al. Pulmonary artery occlusion pressure and central venous pressure fail to predict ventricular filling volume, cardiac performance, or the response to volume infusion in normal subjects. Critical care medicine, 32(3):691–699, 2004.

[42] DN Lawley. A generalization of Fisher's z test. Biometrika, 30(1/2):180–187, 1938.

[43] Ying Liu, Dengsheng Zhang, and Guojun Lu. Region-based image retrieval with high-level semantics using decision tree learning. Pattern Recognition, 41(8):2554–2570, 2008.

[44] Amit Kumar Manocha and Mandeep Singh. An overview of ischemia detection techniques. International Journal of Scientific & Engineering Research, 2(11), 2011.

[45] Stephen Marsland. Machine learning: an algorithmic perspective. CRC press, 2015.

[46] Frank J Massey Jr. The Kolmogorov-Smirnov test for goodness of fit. Journal of the American Statistical Association, 46(253):68–78, 1951.


[47] Alain Mercat, Jean-Luc Diehl, Guy Meyer, Jean-Louis Teboul, and Herve Sors. Hemodynamic effects of fluid loading in acute massive pulmonary embolism. Critical care medicine, 27(3):540–544, 1999.

[48] Frédéric Michard and Jean-Louis Teboul. Predicting fluid responsiveness in ICU patients: a critical analysis of the evidence. CHEST Journal, 121(6):2000–2008, 2002.

[49] Patrick J. Lynch, medical illustrator. Transthoracic echocardiogram. https://commons.wikimedia.org/w/index.php?curid=1490817. Accessed: 7-4-2016.

[50] Stephen J Preece, John Yannis Goulermas, Laurence PJ Kenney, and David Howard. A comparison of feature extraction methods for the classification of dynamic activities from accelerometer data. Biomedical Engineering, IEEE Transactions on, 56(3):871–879, 2009.

[51] J. Ross Quinlan. Induction of decision trees. Machine Learning, 1(1):81–106, 1986.

[52] Narasimhan Ranganathan, Vahe Sivaciyan, and Franklin B Saksena. The art and science of cardiac physical examination: with heart sounds and pulse wave forms on CD. Springer Science & Business Media, 2007.

[53] Nishkam Ravi, Nikhil Dandekar, Preetham Mysore, and Michael L Littman. Activity recognition from accelerometer data. In AAAI, volume 5, pages 1541–1546, 2005.

[54] Espen W Remme, Lars Hoff, Per Steinar Halvorsen, Edvard Nærum, Helge Skulstad, Lars A Fleischer, Ole Jakob Elle, and Erik Fosse. Validation of cardiac accelerometer sensor measurements. Physiological measurement, 30(12):1429, 2009.

[55] Marko Robnik-Šikonja and Igor Kononenko. An adaptation of Relief for attribute estimation in regression. In Machine Learning: Proceedings of the Fourteenth International Conference (ICML'97), pages 296–304, 1997.

[56] Marko Robnik-Šikonja and Igor Kononenko. Theoretical and empirical analysis of ReliefF and RReliefF. Machine Learning, 53(1-2):23–69, 2003.

[57] Yvan Saeys, Iñaki Inza, and Pedro Larrañaga. A review of feature selection techniques in bioinformatics. Bioinformatics, 23(19):2507–2517, 2007.

[58] Olav Sand, Egil Haug, Kari C Toverud, and Øystein V Sjaastad. Menneskets fysiologi. Gyldendal akademisk, 2014.

[59] Robert E Schapire. The strength of weak learnability. Machine Learning, 5(2):197–227, 1990.

[60] Mojtaba Seyedhosseini, António RC Paiva, and Tolga Tasdizen. Fast AdaBoost training using weighted novelty selection. In Neural Networks (IJCNN), The 2011 International Joint Conference on, pages 1245–1250. IEEE, 2011.

[61] Student. The probable error of a mean. Biometrika, pages 1–25, 1908.

[62] A Taddei, G Costantino, R Silipo, M Emdin, and C Marchesi. A system for the detection of ischemic episodes in ambulatory ECG. In Computers in Cardiology 1995, pages 705–708. IEEE, 1995.

[63] Yuan Tao. Analyse statistique de la fonction cardiaque chez des patients atteints de cardiopathies et des triathlètes. 2014.

[64] Robert Tennant and Carl J Wiggers. The effect of coronary occlusion on myocardial contraction. American Journal of Physiology–Legacy Content, 112(2):351–361, 1935.

[65] Roman Timofeev. Classification and regression trees (CART) theory and applications. 2004.

[66] Stig Urheim, Thor Edvardsen, Hans Torp, Bjørn Angelsen, and Otto A Smiseth. Myocardial strain by Doppler echocardiography: validation of a new method to quantify regional myocardial function. Circulation, 102(10):1158–1164, 2000.

[67] David H Wolpert and William G Macready. No free lunch theorems for optimization. Evolutionary Computation, IEEE Transactions on, 1(1):67–82, 1997.