Discussion - Towards Universal EEG systems with minimum channel count based on Machine Learning

4.6. Discussion

The EEG channel selection method for epileptic-seizure classification proved to be robust. For example, the accuracy for patient 1 with DWT-based features was 0.97 using all EEG channels. The accuracy was even higher when using the EEG channels selected by NSGA-II or NSGA-III (1 or 2 channels): 0.98 for EMD and 1.00 for DWT.

Figure 4.6: Comparison of the most-used classifiers by NSGA-II (left) and NSGA-III (right) for the 24 patients using DWT-based feature extraction.

For example, the results obtained with the data of patient 12 showed the highest accuracy using EMD to be 0.942 using six EEG channels selected by NSGA-III. The highest accuracy obtained using DWT-based features was 0.952 using four EEG channels. An important feature of the classification of the epileptic seizures of this patient is that most of the highest accuracy values were obtained using the KNN classifier (see Figs. 4.5 and 4.6), i.e., an average of 73% and 84% using EMD-based features and an average of 96% and 98% using DWT-based features, for NSGA-II and NSGA-III, respectively.

Examination of the number of epileptic seizures described in the database [215] showed this patient to have had 38 epileptic seizures and, after segmentation (six-second segments), 234 instances of epileptic seizures and 234 seizure-free periods were obtained. This amount of data was among the highest of the patients used for this study. However, for patient 15, for whom there was a similar amount of data, the highest accuracy values were obtained using SVM. Thus, it is not possible to argue that this is due to the amount of data. Therefore, future work will also analyze more parameters related to the classifier (i.e., number of neighbors for KNN and kernel, as well as kernel parameters for SVM) and how accuracy is

both approaches. For future steps, these findings will be considered and used for testing other important parameters related to each classifier to reduce the computation cost, instead of testing NB again.
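One of the classifier parameters named for future analysis, the number of neighbors for KNN, could be explored with a cross-validated sweep. The following is a minimal sketch in plain Python, not the thesis implementation: the function names, the candidate values of k, the Euclidean distance, and the interleaved fold assignment are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    ranked = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def kfold_indices(n_samples, n_folds):
    """Split range(n_samples) into n_folds interleaved test folds.

    Assumes n_samples >= n_folds so that no fold is empty.
    """
    return [list(range(f, n_samples, n_folds)) for f in range(n_folds)]


def cv_accuracy(x, y, k, n_folds=10):
    """Mean accuracy of a k-NN classifier over n_folds cross-validation folds."""
    scores = []
    for test_idx in kfold_indices(len(x), n_folds):
        held_out = set(test_idx)
        train_x = [x[i] for i in range(len(x)) if i not in held_out]
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        hits = sum(knn_predict(train_x, train_y, x[i], k) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)


def sweep_neighbors(x, y, candidates=(1, 3, 5, 7), n_folds=10):
    """Return {k: cross-validated accuracy} for each candidate neighbor count."""
    return {k: cv_accuracy(x, y, k, n_folds) for k in candidates}
```

Calling `sweep_neighbors(features, labels)` on per-segment feature vectors and their seizure / seizure-free labels returns one accuracy per candidate k, which can then be compared across patients.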

In general, the results presented in this Section show that this approach is able to classify epileptic seizure and seizure-free periods with an average accuracy of up to 0.97±0.05 using only one EEG electrode. This result was obtained using DWT-based features. The use of two or more channels can increase the accuracy to 0.98 and 0.99, especially when the EEG channels are selected by NSGA-III (see Table 4.5).

In the state of the art, there are several relevant studies in which the authors present various methods for feature extraction and classification using the same dataset under different experimental setups. Table 4.5 presents a general overview of such studies for analysis and comparison.

Table 4.5 shows the state of the art and the classification accuracy of approaches using EMD-based or DWT-based features, as well as NSGA-II or NSGA-III. It should be noted that the results are not directly comparable to those from previous studies, as a lower number of EEG channels, found by NSGA-based algorithms, was used, and the experiments were based on 24 subjects and used different experimental setups. The average values presented in the results were obtained from Tables 4.1, 4.2, 4.3, and 4.4, which correspond to the results obtained in the Pareto-front for each subject in the dataset. In addition, the average accuracy was affected for some subjects when using two or three channels, for whom the highest accuracy values were not obtained with this number of EEG channels (see Tables 4.1, 4.2, 4.3, and 4.4), i.e., using EMD-based features, the accuracy for the Pareto-front for NSGA-III was 0.992 with one channel, and 1.00 using four EEG channels, but there was no information for the combination with

Table 4.5: Comparison of relevant existing methods for epileptic-seizure classification using the CHB-MIT Scalp EEG dataset presented in [218].

| Ref. | Method | Subjects, channels | Performance |
| | | 23, 23 | accuracy of 0.80 using 80% for training. |
| | | | accuracy of 0.91 using ~80% for training. |
| [243] | Seven features from the intersection sequence of Poincaré section with phase space. | 23, 23 | accuracy values of 0.93 and 0.94 using 25% and 50% for training, respectively. |
| [245] | Three features extracted from different oscillatory levels using multivariate extension of EWT. The channel with the lowest standard deviation was selected and the four channels with higher mutual information then added. | 23, 5 | accuracy of 0.99 using 10-fold cross-validation. |
| [244] | Signal curve length of the time-domain EEG signal and the mode powers of the dynamic mode decomposition. | 12, 18 | sensitivity of 0.87 using 50% for training. |
| [135] | Teager and instantaneous energy, Higuchi and Petrosian fractal dimension, and DFA from 2 IMFs based on the EMD. Channels selected using the backward-elimination algorithm. | 24, 5 | average accuracy of 0.93 using 10-fold cross-validation. |
| | from 2 IMFs based on EMD. | 24, 1-3 | average accuracy values of 0.93±0.06, 0.95±0.06, and 0.95±0.05 using 10-fold cross-validation for 1, 2, and 3 channels selected by NSGA-II. |
| | | | 10-fold cross-validation for 1, 2, and 3 channels selected by NSGA-III. |
| | from 4 decomposition levels of the DWT. | 24, 1-3 | average accuracy values of 0.97±0.05, 0.97±0.04, and 0.98±0.02 using 10-fold cross-validation for 1, 2, and 3 channels selected by NSGA-II. |
| | | 24, 1-3 | average accuracy values of 0.97±0.05, 0.98±0.03, and 0.99±0.01 using 10-fold cross-validation for 1, 2, and 3 channels selected by NSGA-III. |

used SVM. based on PCA, ICA, and LDA, respectively.

| [247] | Entropy-Fuzzy Classifier with three classes, normal vs. pre-ictal vs. epileptic. | 5, 1 | accuracy of 0.981. |
| [248] | Features based on two-dimensional (2D) and 3D phase space representation (PSRs) of IMFs from EMD, and least-square SVM (LS-SVM) classifier. | 5, 1 | accuracy of 0.986. |
| [246] | Using the TUH EEG corpus, they used 10-second segments with a sample rate of 250 Hz and computed 24 features per channel. Six different classifiers were compared: SVM, NB, KNN, RF, gradient boosting, and logistic regression. | 43, 22 | accuracy of 0.994 using SVM. |
| [249] | Features based on Fourier-Bessel series expansion and classified using LS-SVM. | 5, 1 | accuracy of 0.990 in the best case. |
| [252] | Third-order cumulant (ToC) and neural network with softmax classifier. | 5, 1 | accuracy of 1.000. |
| [251] | Energy features from sub-bands extracted using the Taylor-Fourier filter bank and LS-SVM. | 5, 1 | accuracy of 0.948. |
| [185] | Wavelet coefficients from sub-bands obtained using DWT with 7 levels of decomposition using iEEG from 10 patients of the Flint Hills Scientific dataset. | 10, 3 | sensitivity of 0.96. |

It is important to mention that in the work presented in [246–249,251,252,257,258], no methods of channel selection were used, as the datasets used consisted of only one or two EEG channels. The study [185] used methods based on variance or entropy to select the channels before the classification process.

Most of the studies presented in Table 4.6 were based on invasive EEG, which provides better signal quality [253]. Therefore, their performance should be re-tested on non-invasive EEG signals for continuous monitoring. Note that in the presented studies, the SVM classifier was the most widely used and provided the highest accuracy values relative to the other classifiers and neural networks, consistent with the results obtained in this thesis.

According to the results in this thesis, NSGA-III is able to find the most relevant EEG channel combinations using DWT-based features to achieve an average accuracy of up to 0.99 using only three channels. To improve the general performance of this approach and test it on additional public epileptic-seizure datasets, new experiments will be performed that consider more than two objective functions in the problem and verify whether NSGA-III is still the best method for solving this problem [212,213].
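The two-objective trade-off behind these results (fewer channels versus higher accuracy) can be illustrated with a plain non-dominated filter. This is only a sketch of the Pareto-dominance idea, not NSGA-II or NSGA-III themselves; the function name `pareto_front` and the candidate values in the usage example are made up for illustration and are not results from the thesis.

```python
def pareto_front(solutions):
    """Return the non-dominated (n_channels, accuracy) pairs, sorted by channels.

    A solution dominates another if it uses no more channels and reaches at
    least the same accuracy, and is strictly better in one of the two.
    """
    front = []
    for n_a, acc_a in solutions:
        dominated = any(
            (n_b <= n_a and acc_b >= acc_a) and (n_b < n_a or acc_b > acc_a)
            for n_b, acc_b in solutions
        )
        if not dominated and (n_a, acc_a) not in front:
            front.append((n_a, acc_a))
    return sorted(front)


# Hypothetical (channel count, accuracy) candidates for one subject.
candidates = [(1, 0.93), (2, 0.95), (2, 0.91), (3, 0.95), (4, 0.97), (6, 0.96)]
print(pareto_front(candidates))  # [(1, 0.93), (2, 0.95), (4, 0.97)]
```

Adding a third objective would only change the dominance test inside the loop, which is where a comparison between NSGA-II and NSGA-III becomes relevant.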

Results have shown that the best accuracy can be reached using one to three channels for certain subjects and four or more for others. Thus, testing different methods in an attempt to improve the channel-selection process and decrease the complexity is proposed for future studies. This can be achieved by testing and comparing methods such as that presented by [245], which selects a channel with the lowest SD and then four channels with the highest MI with the previously chosen channel, as well as other optimization approaches [87,138,190–201].
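The selection rule attributed to [245] above (one channel with the lowest SD, then the four channels with the highest MI with it) can be sketched as follows. This is an illustrative approximation, not the implementation from [245]: the equal-width histogram estimate of MI, the bin count, and all function names are assumptions.

```python
import math
from collections import Counter
from statistics import pstdev


def discretize(signal, n_bins=8):
    """Map each sample to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(signal), max(signal)
    width = (hi - lo) / n_bins or 1.0  # constant signal: avoid division by zero
    return [min(int((v - lo) / width), n_bins - 1) for v in signal]


def mutual_information(a, b, n_bins=8):
    """Histogram estimate of the mutual information (in bits) of two signals."""
    da, db = discretize(a, n_bins), discretize(b, n_bins)
    n = len(da)
    pa, pb, pab = Counter(da), Counter(db), Counter(zip(da, db))
    return sum(
        (c / n) * math.log2(c * n / (pa[i] * pb[j])) for (i, j), c in pab.items()
    )


def select_channels(channels, n_extra=4, n_bins=8):
    """Pick the lowest-SD channel, then the n_extra channels with highest MI to it.

    `channels` maps a channel name to its list of samples.
    """
    seed = min(channels, key=lambda name: pstdev(channels[name]))
    others = [name for name in channels if name != seed]
    others.sort(
        key=lambda name: mutual_information(channels[seed], channels[name], n_bins),
        reverse=True,
    )
    return [seed] + others[:n_extra]
```

Compared with the NSGA-based search, this rule needs only one pass over the channels, which is why it is a candidate for reducing the complexity of the selection step.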

Epileptic-seizure classification using EEG signals is important for evaluating the state of the brain. Following the evolution of the signals through continuous monitoring will enable prediction with a low number of EEG channels, making it easier to use and thus allowing long-term monitoring using a possibly personalized portable EEG device [259,260]. However, there are several challenges that need to be addressed before implementation in real life.

Because epilepsy can cause a variety of other neurological disorders (e.g., depression and anxiety), such confounders should be additionally studied to better distinguish between an epileptic seizure and seizure-free periods. Thus, future efforts will also include the study of epilepsy-related disorders and how they can be recognized on EEG signals. A possible portable low-density EEG device will facilitate monitoring in daily life, which will allow healthcare professionals more confident management of seizures, not only in the hospital or laboratory but also in conjunction with the recent progress in telehealth and telemedicine

O(MN²). However, the study of the most relevant channels is important and it must be performed for analysis and, as presented here, to verify whether epileptic seizures can be detected using a few non-invasive EEG channels. The limitations of the methods used for feature extraction are related to the well-known problems of EMD, such as the selection of the best spline, the end effect, and the mode mixing problem [116,126,128].

For DWT, the main problems are related to parameter selection, such as the number of levels of decomposition and the mother function. Some of these limitations have already been considered in the literature or can be solved by using recent progress in code optimization [227,228,265]. Future efforts for classification will focus on testing and comparing shallow convolutional neural networks and Riemannian classifiers, as they have been shown to provide high accuracy values for EEG-signal classification [148,266,267].
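To make the two DWT parameters concrete, the sketch below computes sub-band energies with the number of decomposition levels as an explicit argument. It is a minimal illustration under stated assumptions: the Haar wavelet is used only because it is the simplest mother function to write out (this section does not name the one used), the energy feature is a stand-in for the actual feature set, and the function names are hypothetical.

```python
import math


def haar_step(signal):
    """One Haar DWT level: return (approximation, detail) coefficients."""
    s = 1 / math.sqrt(2)
    pairs = list(zip(signal[0::2], signal[1::2]))  # a trailing odd sample is dropped
    approx = [(a + b) * s for a, b in pairs]
    detail = [(a - b) * s for a, b in pairs]
    return approx, detail


def dwt_energies(signal, levels=4):
    """Energy of each detail band (level 1..levels) plus the final approximation."""
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies
```

With `levels=4`, `dwt_energies` returns five values per channel segment. Changing `levels` or swapping `haar_step` for another filter pair changes the features, which is the parameter-selection issue described above.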

Future efforts will concentrate on applying the methods used for epileptic-seizure classification to the epileptic-seizure prediction problem, testing methods for feature extraction and classification, and testing whether the methods for channel selection can find the most relevant subsets for this task and for seizure onset detection [171,175,184,185].

Chapter 5

Case study 2: Channel count optimization for EEG-based biometric systems

This Chapter presents two approaches for creating EEG-based biometric systems using various methods for channel selection and implementing them for feature extraction and classification. This is tested in experiments using multi-class classification, as well as one-class classification.

This Chapter is based on the journal articles [87,138,223] and addresses the 1st, 2nd, and 3rd Research Questions.

In document Towards Universal EEG systems with minimum channel count based on Machine Learning and Computational Intelligence (pages 100-107)