Evaluation of different time domain peak models using extreme learning machine-based peak detection for EEG signal

Various peak models have been introduced to detect and analyze peaks in the time domain analysis of electroencephalogram (EEG) signals. In general, peak model in the time domain analysis consists of a set of signal parameters, such as amplitude, width, and slope. Models including those proposed by Dumpala, Acir, Liu, and Dingle are routinely used to detect peaks in EEG signals acquired in clinical studies of epilepsy or eye blink. The optimal peak model is the most reliable peak detection performance in a particular application. A fair measure of performance of different models requires a common and unbiased platform. In this study, we evaluate the performance of the four different peak models using the extreme learning machine (ELM)-based peak detection algorithm. We found that the Dingle model gave the best performance, with 72 % accuracy in the analysis of real EEG data. Statistical analysis conferred that the Dingle model afforded significantly better mean testing accuracy than did the Acir and Liu models, which were in the range 37–52 %. Meanwhile, the Dingle model has no significant difference compared to Dumpala model.

two associated valley points, which hold a local minimum value. Any single peak is described by a number of signal parameters, including amplitude, width, and slope. Based on those parameters, a number of peak features can be calculated in the temporal domain, such as peak-to-peak amplitudes at the first half wave, peak width, ascending peak slopes at the first half wave, and descending peak slope at the second half wave. The ensemble of these peak features serves to detect the peak in various applications. However, the high variation of calculated peak features in real EEG data, which typically contain several types of noise, can interfere with correct peak detection and degrade performance.
Typically, a peak detection algorithm consists of combination of selected features from their peak model and subsequent computational processes, such as classification. Based on the literature, Dumpala et al. (1982) used a defined peak model, and then introduced a classification process to detect a peak signal in the analysis of gastric electrical activity. Dingle et al. (1993); Liu et al. (2002); Acir et al. (2005); Acir 2005); Liu et al. (2013) also used the defined peak models and different classification processes to detect peaks in EEG signal with epileptiform activity. The classifiers that have hitherto been used for signal peak detection include rule-based (Dumpala et al. 1982;Adam et al. 2015;Dingle et al. 1993), AdaBoost (Liu et al. 2002), radial basis function network , support vector machine (SVM) (Liu et al. 2013), radial basis support vector machine , artificial neural network (ANN) (Liu et al. 2002), and expert system (Liu et al. 2002;Dingle et al. 1993). In general, utilization of a peak detection algorithm provides the best performance in various applications. However, the various algorithms have used different peak models and different classification approaches combined in a particular peak detection algorithm. Moreover, to the best of our knowledge, there are very reports comparing the performance of different peak models using the same classification platform, such as a rule-based classifier (Adam et al. 2014a(Adam et al. , 2014b(Adam et al. , 2015. The existing methods tend to have poor performance with peak models defining many peak features, for example, the detection performance declined to nil when the classifier employed all 11 features from Liu model (Adam et al. 2014b).
For fair evaluation of the detection performance of different EEG signal peak models, they must be assessed using a common classification method. Therefore, in this study we used the extreme learning machine (ELM) method as a common classifier for the peak detection algorithm to evaluate the performance in association with four different peak models in time domain analysis, namely the models of Dumpala, Acir, Liu, and Dingle. The four representative peak models were on the basis of their proven utility in various physiological signal applications. We used an ELM since it provides very fast learning speed, generalized performance, learning without iterative tuning, and minimal requirement for user intervention. ELM also employs as alternative method to resolve the shortcoming of existing studies which have poor performance to peak model with many peak features. Hence, the present study aims to determine the best peak model in time domain analysis for EEG-based horizontal eye movement signal application using an advantageous common classification platform.

EEG signal peak detection algorithm
The training and testing phases of the EEG signal peak detection algorithm are shown in Fig. 1. The training and testing data that used in this study were collected using two channel EEG recordings from 20 voluntary subjects. In the first stage of peak detection, the training and testing EEG signals must be filtered as input to the algorithm, upon selection of the desired peak model. The training phase of the algorithm involves several processes, namely including peak candidate detection, feature extraction, with definition of model-specific features, and then classification process. The estimation process is performed during this phase to train the network for adjusting the ELM parameters using the learning algorithm of the ELM classifier. In the testing phase, the algorithm follows the same series of processes, and the ELM parameters first determined in the training phase are used in the classification process of the testing phase. The final output of the training and testing phase are the predicted peak points and non-peak points from the identified peak candidates.

Peak candidate detection
The first process of the detection algorithm is to determine candidate peaks. This process is used to assign the peaks into two groups, true non-peaks and true peaks. The group of true peaks is reconsidered to consist of candidate peaks, which can further classified into true non-peaks and true peaks using the ELM classifier. One advantage of determining candidate peaks process is that it reduces the number of input samples required for the ELM classifier, such that the computational time in the training and testing phases is minimized.
Determining the local maxima (peak points) and minima (valley points), the first process in determining a candidate peak, can be performed using an algorithm developed by Billauer (2012). The subsequent process of detecting a peak candidate is as follows: By considering a discrete-time signal, x(I) of L points, the ith candidate peak point, PP i , is identified using the three-point sliding window method (Dumpala et al. 1982). The three Fig. 1 Training and testing phases of EEG signal peak detection selected points are denoted as x(I−1), x(I) and x(I + 1) for I = 1, 2, 3, …,L. A candidate peak point is identified when x(PP i −1) < x(PP i ) > x(PP i + 1) and two associated valley points, VP1 i and VP2 i lie on either side of the peak, as shown in Fig. 2. The valley points are defined when

Feature extraction
The features of a peak candidate are calculated based on the eight points shown in Fig. 2. The set of points consists of the ith candidate peak point, PP i , the two associated valley points, VP1 i and VP2 i , the half point at first half wave (HP1 i ), the half point at second half wave (HP2 i ), the turning point at first half wave (TP1 i ), the turning point at second half wave (TP2 i ), and the moving average curve point (MAC(PP i )). The half point at first half wave can be defined as the point in slope located in the middle between thePP i and VP1 i points while the half point at the second half wave as the point in slope based in the midst between the PP i and VP2 i points. The MAC(PPi) point is located at the intersection between the PP i and MAC(PPi) points. The window length of the moving averaging is 100 sampling points.
Based on signal parameters, the features of a peak candidate can be categorized into three groups, namely amplitude, width, and slope. There are five different amplitudes, seven different widths, and four different slopes that can be calculated based on the eight defined points, resulting in a total of 16 features, which can be defined as follows: 1. The peak-to-peak amplitude at the first half wave, f 1 , is the peak amplitude between the magnitude of the peak and the magnitude of the valley of the first half wave, as denoted by.
2. The peak-to-peak amplitude at the second half wave, f 2 , is the peak amplitude between the magnitude of the peak and the magnitude of the valley of the second half wave, and is defined as 3. The turning point amplitude at the first half wave, f 3 , is the peak amplitude between the magnitude of the peak and the magnitude of the turning point at the first half- (1) The eight points of peak model wave. The turning point is defined as the point where the slope decreases more than 50 % compared to the slope of the preceding point. The equation for f 3 is as follows: 4. The turning point amplitude at the second half wave, f 4 , is the peak amplitude between the magnitude of the peak and the magnitude of the turning point at the second half wave, and is defined as 5. The moving average amplitude, f 5 , is the peak amplitude between the magnitude of the peak and the magnitude of the moving average, and is defined as The peak width, f 6 , is the peak width between the valley point of the first half point and the valley point of the second half wave, and is defined as 7. The first half wave width, f 7 , is the peak width between the peak point and the valley point of the first half wave, and is defined as 8. The second half wave width, f 8 , is the peak width between the peak point and the valley point of the second half wave, and is defined as 9. The turning point width, f 9 , is the peak width between the turning point at the first half wave and the turning point at the second half wave, and is defined as 10. The first half-wave turning point width, f 10 , is the peak width between the turning point at the first half-wave and the peak point, and is defined as 11. The second half wave turning point width, f 11 , is the peak width between the turning point at the second half-wave and the peak point, and is defined as 12. The half point width, f 12 , is the peak width between the half point of the first halfwave and the half point of the second half-wave, and is defined as 13. The peak slope at the first half wave, f 13 , is the maximal slope between the peak point and the valley point of the first half wave, and is defined as The peak slope at the second half-wave, f 14 , is the peak slope between the peak point and the valley point of the second half wave, and is defined as 15. The turning point slope at the first half-wave, f 15 , is the peak slope between the peak point and the turning point of the first half-wave, and is defined as 16. The turning point slope at the second half wave, f 16 , is the peak slope between the peak point and the turning point of the second half-wave, and is defined as From these 16 features, Dumpala et al. (1982) introduced a peak model that uses the four most salient features, f 1 , f 6 , f 13 , and f 14 . Additional features defining peak amplitude, f 2, and two features of peak width, f 7 and f 8, were introduced by Acir et al. (2005); Acir and Guzeli (2004). As pointed out in Acir et al. (2005), the defined features are interrelated to the characteristic of peak in epilepsy events. There are highlighted three characteristics as follows; (1) the ascending peak slopes at the first half wave, and descending peak slope at the second half wave are relatively large and smooth (2) the top of the peak is sharp, and (3) the peak width is always between 20 and 70 ms. The peak width can be calculated by the sum of the first half wave and second half wave peak width. Dumpala et al. (1982) and Acir et al. (2005) used the same similar definition of the peak slopes, i.e. f 13 and f 14 . The peak model of Liu et al. (2002) entailed a total of 11 features, consisting of four amplitudes (f 1 , f 2 , f 3 , and f 4 ), three widths (f 6 , f 9 , f 12 ), and four slopes (f 13 , f 14 , f 15 , f 16 ). Finally, the peak model introduced by Dingle et al. (1993) consists of four features (f 5 , f 6 , f 13 , f 14 ). The different peak models and their sets of features are listed in Table 1.

Extreme learning machine classifier
Extreme learning machine is a new approach to machine learning involving a single layer feedforward neural network (SLFN). The introduction of ELM by Huang et al. (2004) has been followed by several variants (Balasundaram et al. 2014;Huang et al. 2012). ELM has already been used with success for various event classifications of EEG signals (Song and Zhang 2013;Shi and Lu 2013;Yuan et al. 2012;Song et al. 2012;Yuan et al. 2011;Liang et al. 2006). The main advantage present by ELM is the fast computation of the learning  Fig. 3. The network consists of three layers, i.e. the input, hidden, and output layers. Between the input and hidden layers are the input weights, and between the hidden and output layers are the output weights. The training process of an ELM proceeds in three stages. In the first stage, the input weights are assigned randomly between −1 and 1, and the biases in the hidden layer are assigned randomly between 0 and 1. Both of these parameters remain fixed during the training process. Afterward, the output matrix of the hidden layer, H, is calculated as follows: where g is an activation function of the hidden neuron, x is the N × L matrix of inputs, a is the d × L matrix of random input weights, b is the 1 × L matrix of random biases in the hidden layer, N is an arbitrary number of distinct samples, L is the number of hidden neurons, and d is the number of inputs or features The ith column of H is the output of the ith hidden neuron with respect to inputs x 1 , x 2 , until x d .
The ELM can be represented as a linear system, which is mathematically modeled as where β is the L × m matrix of output weights, T is the N × m matrix of target outputs, and m is the number of output neurons. The β and T matrixes are denoted as

The output function of the ELM classifier of a given unknown sample x is
In the output layer, two neurons are used in the network to classify the output into two classes (output): class 1 and class 0. For the two classes (m > 1), the predicted class label is the ith number of the output neurons, which is the maximum value of the output neuron (Huang et al. 2012). The predicted class label of a given unknown sample x is defined as follows.
We evaluate the performance of our ELM classifier based on the G mean (Guo et al. 2008), which is calculated as follows: where a true peak (TP) is the correctly-detected peak point of the peak candidate, a true non-peak (TN) is the correctly-detected non-peak point of the peak candidate, a false peak (FP) is the wrongly-detected non-peak point of a peak candidate, a false non-peak (FN) is a wrongly-detected peak point of a peak candidate, TPR is the true peak rate, and TNR is the true non-peak rate.

Experimental setup and protocols
Each experiment is conducted in 30 independent runs. The first 50 % of the filtered EEG signal was used as training data, and the remaining 50 % as testing data. The parameter settings of the ELM classifier are shown in Table 2. The number of hidden neuron was selected using a trial and error method, which was set to 500. The sigmoid [−1, 1] was used as an activation function in the hidden layer for the purpose of normalization, whereas a linear function was located on each neuron in the output layer. Other settings for the ELM classifier, such as the number of neurons in the input layer are dependent on the number of selected features of a particular peak model. The number of output neurons was set to 2 which it used maximum argument as indicated in Eq. 24 for choosing the ELM output. We note that the input weights and the biases remained fixed during the training, but the values of these two ELM parameters are randomly assigned for each of 30 runs.
This experimental protocol was approved by the medical ethics committee of the University of Malaya Medical Centre. All subjects signed informed consent forms in advance.
The filtered EEG signals in this study were obtained in the Applied Control and Robotic (ACR) Laboratory, Department of Electrical Engineering, Faculty of Engineering, University of Malaya, Malaysia. Twenty healthy subjects (10 males and 10 females, aged 20-40 years), who are undergraduate and postgraduate students in the Faculty of Engineering, volunteered to participate in these data collection sessions. Filtered EEG signal recordings were obtained using the g.MOBIlab portable biological signal acquisition system. The scalp electrodes were arranged using the 10-20 international electrode placement system. The EEG signal was recorded from the C3 and C4 channels, with the signal of channel CZ used as a reference. The ground electrode was located on the FPz channel, such that a total of only four electrodes were used. The sampling frequency was set to 256 Hz. The electrodes from the C3 and C4 channels are positioned for detecting EEG peaks associated with the brain response of commanded horizontal eye gaze direction. We used the C3, C4, and CZ channels are used because of they have relatively little less contamination from EEG artifacts due to eye blinking (Klados et al. 2011).
All subjects had been instructed to get a good rest the night before the data collection session, so as to ensure full focus during the EEG recordings. The subjects were told to prepare for the external voice cue within up to 4 s. Appearance of the cue is voice command or verbal reminder for the subject to move his eyes initially forward fixation to the left or to the right. At exactly 5 s from the beginning session, the external voice cue appears randomly instructing the subject to shift gaze to the left or right direction, and hold the new eye position from 5 until 10 s, which is the end of the EEG recording. The eye gaze directions that produce a number of peaks in the signal on channels C3 and C4 are archived as raw data for analysis. Figure 4 shows a representative case of filtered EEG signals that are labeled as eye movement signals. The dotted red vertical lines show the actual peak point locations, as assigned by a researcher. The eye movement signal consists of 20 signals for channel C3, 20 signals for channel C4, for duration of 10-s per signal, recorded at 256 Hz for a total of 2560 sampling points per signal. Furthermore, each signal contains one known peak point location, where the known peak pattern represents the eye gaze direction, either to the left or to the right.

Experimental results and discussion
Four different outcome measures were used in the experiments: the average G mean , the maximum G mean , the minimum G mean , and the standard deviation (STDEV). Additionally, the statistical comparisons of the average test accuracy among the four models are evaluated using Friedman's test.
The results of training and testing performance based on the four different measurements (end-points) for the peak model with all 16 features are shown in Table 3. For training performance, the average, maximum, minimum, and STDEV values were 88.3, 94.9, 80.6, and 3.6 %, respectively, whereas for testing performance, the average, maximum, minimum, and STDEV values were 36.9, 58.1, 0, and 11.8 %, respectively. The minimum testing performance of 0 % indicates that the classifier was unable to correctly detect even one peak.
Next, the training and testing performance based on the four different measurements for each peak model are shown in Table 4, the training performance average, maximum, minimum, and STDEV values are 84.7, 86.6, 83.7 and 1.4 % for Dumpala peak model,  The testing performance on average, maximum, minimum, and STDEV values are 70.1, 82.6, 51.6 and 6.7 %, respectively, for Dumpala peak model; 36.9, 62.6, 0 and 11.9 %, respectively, for Acir peak model;51.1,71.8,37.2 and 7.9 %,respectively,for Liu peak model and 71.7,89.2,57.1 and 6.9 %, respectively, for Dingle peak model. As with the training data, performance of Dingle peak model with the test data was superior to that of the other peak models. As with the training set, results with Acir method for the test data showed that the classifier is unable to correctly predict all the true peaks occurring at the particular time when TP is equal to zero. Thus, G mean becomes zero even though the TN value is non-zero.
Sensitivity is the percentage of true peak rate recovered while specificity is the percentage of true non-peak rate. The overall sensitivity and specificity values for the testing performance are shown in Table 5. The results in Table 5 show that sensitivity is significantly lower than 30 % for the Acir and Liu models. Dumpala and Dingle peak models performed best, with sensitivity of 55 % and specificity exceeding 99 %. Overall, the sensitivity of the four peak models is lower than their specificity, thus resulting in a large amount of false non-peak. The four peak models return many false non-peaks due to several contributing factors such the collected data is affected by various noises and the peak features have a large different value from one subject to another subject. These factors are the cause to the high variation of peak features of the four peak models. In this case, the NNRW classifier has performed best for classifying the non-peak features than peak features.
The comparison of the average test accuracy between the four peak models is extended using Friedman's test for statistical analysis. The analysis searches for a significant difference in the average testing accuracies between the peak models with p value lower than  (Table 6) show best results for Dingle peak model, followed by the Dumpala, Liu, and Acir peak models. Post-hoc analysis of Friedman's test results are based on the Holm-Bonferroni method using two difference confidence intervals, α = 0.05 and α = 0.10 as shown in Table 7. Both Friedman's test and the Holm-Bonferroni post hoc analysis are conducted using the KEEL software tool (Alcala-Fdez et al. 2009). The post hoc results in Table 7 show similar rank orders for α = 0.05 and α = 0.10, where Dingle peak model offers test accuracies significantly better than do Acir and Liu peak models, but was not superior to without any significant compared to Dumpala peak model.

Conclusions and future work
In this study, we applied ELM-based peak detection to two-lead EEG signals recorded from 20 healthy subjects instructed to direct their horizontal gaze in response to a voice cue. The data was used to evaluate the performance of four different peak detection models. The various event-related EEG peaks were analyzed through a series of processes, i.e. peak candidate detection, feature extraction, and classification. The four peak models considered in this study are representative of typical EEG studies in the literature (Dumpala et al. 1982;Acir et al. 2005;Acir 2005;Dingle et al. 1993;Liu et al. 2002), all of which entail initial extraction of 16 peak features. Each of the four peak models was selected in turn before the execution of experiments using the ELM as a common classification algorithm. ELM has been tested on more than forty benchmark data sets. Also, ELM has been proven by experimental results that achieved similar or better generalized performance compared to SVM and least square support vector machine (LS-SVM) for two-class classification (Huang et al. 2012). We find Dingle peak model to be the best for reliably detecting voluntary horizontal eye movement signal peaks, delivering a mean performance of 99.5 % for the training set and 71.7 % for the testing set. The testing performance needs to be improved by reconsidering the selection of peak features among This study also observes that defining more peak features on model is not guarantee in producing better accuracy on EEG-based horizontal eye movement signal application. As shown in the results in Table 3, the mean of testing accuracy only can achieve at 36.9 %. However, determining the optimal model from the selected features associated with the advantageous of common classification platform is the best approach to gain the accuracy of detection performance.
Results of this study may be applicable in many contexts characterized by the general problem of signal detection, including, such as medical diagnostics, human-machine interface (HMI), brain-computer interface (BCI), and harmonic detection in digital and audio signal processing. For example, an EEG peak in the frontal eye field associated with a change of horizontal eye gaze direction could be translated to the direction of cursor movement in BCI applications, which might be useful for patients with lockedin syndrome or other disabilities (Belkacem et al. 2014). This approach might also be translatable for EEG-based command of the movement of a robotic arm or wheelchair in HMI applications (Postelnicu et al. 2011;Ramli et al. 2015). We intend in the future to extend this work to the problem of feature selection for the peak detection algorithm, so as to optimize the selection of the most salient peak features, with an aim to improve further the performance of peak detection.