Clinical EEG and Neuroscience -

Using Relevance Feedback to Distinguish the Changes in EEG During Different Absence Seizure Phases
Jing Li, Xianzeng Liu and Gaoxiang Ouyang
Clin EEG Neurosci, published online 21 September 2014
DOI: 10.1177/1550059414548721
The online version of this article can be found at: http://eeg.sagepub.com/content/early/2014/09/20/1550059414548721
Additional services and information for Clinical EEG and Neuroscience can be found at:
Email Alerts: http://eeg.sagepub.com/cgi/alerts
OnlineFirst Version of Record: Sep 21, 2014
Downloaded from eeg.sagepub.com at UNIV OF SAN DIEGO on September 30, 2014
Original Article
Introduction
According to the World Health Organization, epilepsy affects more than 50 million individuals worldwide (ie, about 0.6% to 1% of the world’s population). It affects not only the patients themselves but also burdens their families. Consequently, it is important to predict seizures as early as possible so that clinicians can prescribe the necessary medication to stop disease progression.1 During the past few decades, EEG signals have become one of the most useful tools for studying the processes involved in epileptic seizures.2-4 Currently, computational methods for analysing nonlinear EEG signals mainly consist of traditional linear methods such as Fourier transforms and spectral analysis5 and nonlinear algorithms such as Lyapunov exponents,6 correlation dimension,7,8 similarity,9 and power of scale freeness of visibility graph (PSVG).10
Understanding the transition of brain activity toward an absence seizure (ie, preseizure) is a very demanding task. EEG has become one of the most important diagnostic tools in clinical neurophysiology, most notably in epilepsy. Generally, the EEG is a recording of the mean electrical activity of the brain taken from different locations on the scalp (scalp EEG). More specifically, it is the sum of the extracellular current flows of a large group of neurons, and the EEG activity can be classified by its frequency, voltage, morphology, synchrony, and periodicity. Typical absence seizures are accompanied by an EEG hallmark of brief ictal and interictal 2.5- to 3-Hz spike-and-wave complexes with a maximum amplitude over the frontorolandic regions.11 A previous analysis of EEG dynamic changes in the Genetic Absence Epilepsy Rat from Strasbourg (GAERS) demonstrated that EEG epochs prior to seizures exhibit a higher degree of regularity/predictability than seizure-free EEG epochs but a lower degree than seizure EEG epochs.12,13 These EEG precursors in rat models offer a clue for predicting human absence seizures from EEG signals.
1State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
2School of Information Engineering, Nanchang University, Nanchang, China
3The Comprehensive Epilepsy Center, Departments of Neurology and Neurosurgery, Peking University People’s Hospital, Beijing, China
4Center for Collaboration and Innovation in Brain and Learning Sciences, Beijing Normal University, Beijing, China
Corresponding Author: Gaoxiang Ouyang, Center for Collaboration and Innovation in Brain and Learning Sciences, Beijing Normal University, Beijing 100875, China. Email: ouyang@bnu.edu.cn
Full-color figures are available online at http://eeg.sagepub.com
Using Relevance Feedback to Distinguish the Changes in EEG During Different Absence Seizure Phases
Jing Li1,2, Xianzeng Liu3, and Gaoxiang Ouyang1,4
Abstract
We carried out a series of statistical experiments to explore the utility of using relevance feedback on electroencephalogram (EEG) data to distinguish between different activity states in human absence epilepsy. EEG recordings from 10 patients with absence epilepsy were sampled, filtered, selected, and dissected from seizure-free, preseizure, and seizure phases. A total of 112 two-second 19-channel EEG epochs from 10 patients were selected from each phase. For each epoch, multiscale permutation entropy of the EEG data was calculated. The feature dimensionality was reduced by linear discriminant analysis to obtain a more discriminative and compact representation. Finally, a relevance feedback technique, that is, direct biased discriminant analysis, was applied to 68 randomly selected queries over nine iterations. This study is a first attempt to apply the statistical analysis of relevance feedback to the distinction of different EEG activity states in absence epilepsy. The average precision in the top 10 returned results was 97.5%, and the standard deviation suggested that embedding relevance feedback can effectively distinguish different seizure phases in absence epilepsy. The experimental results indicate that relevance feedback may be an effective tool for the prediction of different activity states in human absence epilepsy. The simultaneous analysis of multichannel EEG signals provides a powerful tool for the exploration of abnormal electrical brain activity in patients with epilepsy.
Keywords
EEG, absence epilepsy, relevance feedback, classification, multiscale permutation entropy
Received April 16, 2014; revised July 20, 2014; accepted August 1, 2014.
In this article, we propose a machine learning scheme to analyze EEG recordings and to explore how EEG data provide evidence for the existence of a preseizure phase in human absence epilepsy. Machine learning algorithms (eg, kernel machines including support vector machines [SVMs]14,15) have been used for epilepsy diagnosis based on EEG signals.16 Lima et al17 applied relevance vector machines (RVMs) to the detection of epileptic activity and found that, in terms of accuracy, the best-calibrated RVM models performed comparably to SVMs. Shoeb and Guttag18 used machine learning techniques to detect the onset of an epileptic seizure via the construction of patient-specific binary classifiers. Furthermore, Shoeb et al19 applied SVMs to the detection of seizure termination in scalp EEG and obtained satisfactory results. Similarly, Nandan et al20 adopted several types of SVMs to detect epileptic seizures in an animal model of chronic epilepsy and compared their results. However, the aforementioned algorithms mostly focused on classifier construction and did not consider human–computer interaction. In this article, we propose a machine learning algorithm based on relevance feedback (RF), a classical human–computer interaction technique in multimedia information processing. By embedding this interaction, promising results are achieved in distinguishing the changes in EEG during different absence seizure phases.
The scheme proposed here involves 3 stages: (a) signal processing (feature extraction), (b) dimensionality reduction, and (c) RF-based classification. Each stage is briefly described below. In this study, we collected 19 channels of EEG recordings from 10 patients (6 males and 4 females) with absence epilepsy. The EEG signals were sampled and filtered. After that, they were selected and dissected from seizure-free (data set I), preseizure (data set II), and seizure phases (data set III). For each data set, a total of 112 two-second 19-channel EEG epochs from 10 patients were selected. Multiscale permutation entropy (MPE) explores the local order structure of successive coarse-grained time series. It is calculated at multiple scales to extract useful information for classification and has shown promising performance in absence epilepsy.21 We therefore extract MPE features in the first stage. When the dimension of the extracted feature vectors is much higher than the number of training examples, or exceeds a certain value, the curse of dimensionality22 occurs and subsequent classification performance may be degraded. Considering this, we use dimensionality reduction23,24 to alleviate the problem and obtain a more compact representation for more accurate prediction of absence epilepsy. Here, we use linear discriminant analysis (LDA)24 to find a projection that reduces the higher dimensional feature space to a lower dimensional subspace.
After dimensionality reduction, we use RF to embed human–computer interaction into the classification task of different phases in human absence epilepsy. RF describes how we as humans interact with machines, where a machine is defined as any mechanical or electrical device that transmits or modifies energy to assist in the performance of human tasks. RF originated from document retrieval,25 but has been widely used in multimedia information retrieval because it can bridge the semantic gap between low-level visual features and high-level image concepts. Although RF was previously adopted in medical imaging,26 this study is a first attempt to apply the statistical analysis of RF to the distinction of different EEG activity states in absence epilepsy. Traditional RF methods in information retrieval include the following 2 steps27: (a) when retrieved results are returned to the user, some relevant and irrelevant examples are labeled as positive feedback and negative feedback, respectively, and (b) the retrieval system refines the retrieved results based on these labeled examples. These 2 steps are conducted iteratively until the user is satisfied with the presented results. Over the past few decades, RF techniques have been developed based on diverse machine learning techniques: feature selection, semisupervised learning, query modification, density estimation of positive samples, negative samples analysis, and distance metric learning.28-31 To accomplish the classification task of different activity states in absence seizures, we adopt direct biased discriminant analysis (DBDA),32 treating RF as a (1 + x)-class biased learning problem.
The organization of this article is as follows. We introduce the material and methods in the next section, which is followed by experimental results. The discussion and conclusions are presented in the final section.
Material and Methods
Data Acquisition
EEG recordings were collected from 10 patients (6 males and 4 females) with absence epilepsy, aged from 8 to 21 years. The study protocol had previously been approved by the ethics committee of Peking University People’s Hospital, and the patients had signed informed consent that their clinical data might be used and published for research purposes. The EEG data were recorded by the Neurofile NT digital video EEG system from a standard international 10-20 electrode placement (Fp1, Fp2, F3, F4, C3, C4, P3, P4, O1, O2, F7, F8, T3, T4, T5, T6, Fz, Cz, and Pz). They were sampled at a frequency of 256 Hz using a 16-bit analog-to-digital converter and filtered within a frequency band from 0.5 to 35 Hz.
Afterward, the EEG signals were selected and dissected from different seizure phases: seizure-free (data set I), preseizure (data set II), and seizure (data set III) phases. For each data set, a total of 112 two-second 19-channel EEG epochs from 10 patients were extracted. The timing of onset and offset of spike-wave discharges (SWDs) was identified by an epilepsy neurologist, and these SWDs were defined as large-amplitude rhythmic 2.5- to 4-Hz discharges with typical spike-wave morphology lasting longer than 1 second. The selection criteria were as follows: seizure-free epochs were separated from the beginning of a seizure by more than 15 seconds, preseizure epochs covered the interval between 0 and 2 seconds prior to seizure onset, and seizure epochs covered the first 2 seconds of the absence seizure. Figure 1 shows representative examples of 19-channel EEG recordings during seizure-free (I), preseizure (II), and seizure (III) phases, respectively. Generalized SWDs with a repetition rate of 3 Hz are typically associated with clinical absence seizures.
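The selection criteria can be made concrete with a short sketch. The Python function below is only an illustration under stated assumptions: the name `epoch_windows`, the representation of seizure onset as a time in seconds, and the `free_gap_sec` parameter (how far before onset the seizure-free epoch ends) are hypothetical and not taken from the study; only the 256-Hz sampling rate, the 2-second epoch length, and the 15-second minimum gap come from the text.

```python
FS = 256        # sampling rate in Hz (from the text)
EPOCH_SEC = 2   # epoch length in seconds (from the text)

def epoch_windows(onset_sec, free_gap_sec=20.0, fs=FS, epoch_sec=EPOCH_SEC):
    """Return (start, stop) sample indices of the three epoch types for one seizure.

    onset_sec    : seizure onset marked by the neurologist, in seconds.
    free_gap_sec : hypothetical gap between the end of the seizure-free
                   epoch and seizure onset; the text only requires > 15 s.
    """
    if free_gap_sec <= 15:
        raise ValueError("seizure-free epochs must end more than 15 s before onset")
    n = int(epoch_sec * fs)                      # samples per epoch
    onset = int(round(onset_sec * fs))           # onset as a sample index
    free_stop = onset - int(round(free_gap_sec * fs))
    windows = {
        "seizure_free": (free_stop - n, free_stop),  # data set I
        "preseizure": (onset - n, onset),            # data set II: 0-2 s before onset
        "seizure": (onset, onset + n),               # data set III: first 2 s of seizure
    }
    if windows["seizure_free"][0] < 0:
        raise ValueError("recording too short before seizure onset")
    return windows

windows = epoch_windows(60.0)
```

Each window holds 512 samples per channel, so one 19-channel epoch is a 19 × 512 array.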
Feature Extraction
To investigate the dynamical characteristics of EEG data during different seizure phases, MPE21 was used to extract informative features from all EEG recordings. The MPE method is similar to the multiscale entropy (MSE) analysis,33 and detailed information can be found in Ouyang et al.21 The MPE code was downloaded from MATLAB Central File Exchange (MPerm.m). The MPE procedure contains the following 2 steps. First, a “coarse-graining” process is applied to a given time series $\{x_1, x_2, \ldots, x_N\}$ to construct a consecutive coarse-grained time series $y_j^{(s)}$ by averaging the data points within nonoverlapping windows of length $s$, according to

$$y_j^{(s)} = \frac{1}{s} \sum_{i=(j-1)s+1}^{js} x_i, \tag{1}$$

where $s$ is the scale factor and $1 \le j \le N/s$. The length of each coarse-grained time series is the integer part of $N/s$.
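The coarse-graining step can be sketched in a few lines of Python. This is a minimal illustration, not the MPerm.m code used in the study; the function name `coarse_grain` is hypothetical, and trailing samples that do not fill a complete window are assumed to be dropped.

```python
def coarse_grain(x, s):
    """Average consecutive nonoverlapping windows of length s.

    Returns floor(len(x) / s) values; samples that do not fill a
    complete window at the end of the series are discarded.
    """
    if s < 1:
        raise ValueError("scale factor s must be a positive integer")
    n = len(x) // s
    return [sum(x[j * s:(j + 1) * s]) / s for j in range(n)]

series = [1, 2, 3, 4, 5, 6, 7]
scale_2 = coarse_grain(series, 2)   # averages of (1,2), (3,4), (5,6); 7 is dropped
```

At scale 1 the series is returned unchanged, which is why single-scale methods correspond to s = 1.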
Next, permutation entropy,34 which quantifies the local order structure of a time series, is calculated for each coarse-grained time series. Given a coarse-grained time series $y_j$, a series of vectors $V_m(n) = [y_n, y_{n+1}, \ldots, y_{n+m-1}]$, $1 \le n \le (N/s) - m + 1$, with length $m$ is derived from $y_j$. Afterward, the elements of $V_m(n)$ are ranked in increasing order: $[y_{n+j_1-1} \le y_{n+j_2-1} \le \cdots \le y_{n+j_m-1}]$. For a given $m$, there are $m!$ possible order patterns $\pi$, which are also called permutations. Let $f(\pi)$ denote the frequency of permutation $\pi$ in the time series; its relative frequency is $p(\pi) = f(\pi) / ((N/s) - m + 1)$. Consequently, the permutation entropy (PE) of the time series is defined as

$$PE = -\sum_{\pi=1}^{m!} p(\pi) \log p(\pi). \tag{2}$$

The maximum value of PE is log(m!), which is attained when all permutations have equal probability. The minimum value of PE is zero, which indicates that the time series is very regular. In other words, the smaller the value of PE, the more regular the time series.
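A small Python sketch may help to make the definition concrete. It is an illustration only, not the MPerm.m implementation: the function name `permutation_entropy` is hypothetical, the natural logarithm is assumed, and ties between equal values are assumed to be broken by order of appearance.

```python
import math
from collections import Counter

def permutation_entropy(y, m=3):
    """Permutation entropy (in nats) of the sequence y for order m."""
    n_vec = len(y) - m + 1                 # number of length-m vectors
    if n_vec < 1:
        raise ValueError("series is shorter than the order m")
    # The order pattern of each vector is the index sequence that sorts it;
    # sorted() is stable, so equal values keep their order of appearance.
    patterns = Counter(
        tuple(sorted(range(m), key=lambda k: y[n + k])) for n in range(n_vec)
    )
    return -sum((c / n_vec) * math.log(c / n_vec) for c in patterns.values())

pe_regular = permutation_entropy([1, 2, 3, 4, 5], m=3)        # a single pattern
pe_mixed = permutation_entropy([4, 7, 9, 10, 6, 11, 3], m=3)  # three patterns
```

A monotonic series contains a single order pattern and therefore has zero entropy, whereas the second series contains three patterns with relative frequencies 0.4, 0.4, and 0.2. Applying the function to the coarse-grained series at each scale factor would give the multiscale measure.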
Dimensionality Reduction
After feature extraction, LDA was used to reduce the dimension of feature vectors for alleviating computational complexity while preserving sufficient discriminative information in the subsequent classification stage.
Figure 1. Representative examples of 19-channel (from Fp1 to Pz) EEG recordings, where I, II, and III denote the EEG epochs during seizure-free, preseizure, and seizure intervals, respectively.

Linear discriminant analysis23 is a supervised learning algorithm that takes the class label information into account. Given a set of feature vectors $x_i \in \Re^n$, LDA projects them to a lower dimensional space through a linear transformation. Each feature vector can be considered as a point in the feature space. Given that the original high-dimensional data points $X = \{x_1, x_2, \ldots, x_N\}$ in $\Re^n$ belong to $c$ classes, the between-class scatter matrix $S_b$ and the within-class scatter matrix $S_w$ are given by

$$S_b = \frac{1}{N} \sum_{i=1}^{c} N_i (m_i - m)(m_i - m)^T, \qquad S_w = \frac{1}{N} \sum_{i=1}^{c} \sum_{j=1}^{N_i} (x_{i;j} - m_i)(x_{i;j} - m_i)^T, \tag{3}$$

where $x_{i;j}$ represents the $j$th example in the $i$th class; $m_i = (1/N_i) \sum_{j=1}^{N_i} x_{i;j}$ is the mean vector of the $i$th class; $N_i$ is the number of examples in the $i$th class; $N = \sum_{i=1}^{c} N_i$ is the number of all training examples; and $m = (1/N) \sum_{i=1}^{c} \sum_{j=1}^{N_i} x_{i;j}$ is the mean vector of the whole input data.
The formulation of LDA is to maximize the ratio between $S_b$ and $S_w$ in the projected low-dimensional subspace:

$$U_{opt} = \arg\max_U \frac{|U^T S_b U|}{|U^T S_w U|}. \tag{4}$$

The generalized eigenvalue problem is $S_b U = \lambda S_w U$, and the resulting lower dimensional subspace is spanned by $U = \{u_1, u_2, \ldots, u_L\}$ ($L \le c - 1$). Herein, the covariance matrix of all training examples is $S_t = (1/N) \sum_{i=1}^{c} \sum_{j=1}^{N_i} (x_{i;j} - m)(x_{i;j} - m)^T = S_b + S_w$.
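The scatter matrices can be checked numerically on a toy example. The pure-Python sketch below is an illustration under stated assumptions: all helper names and the two-class toy data are hypothetical, and a common 1/N normalization is assumed for all three matrices. It computes the between-class, within-class, and total scatter and lets one confirm that the total scatter equals the sum of the other two. Solving the generalized eigenvalue problem itself is best left to a linear algebra library and is not shown.

```python
def mean_vec(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def outer(u, v):
    """Outer product u v^T as a nested list."""
    return [[a * b for b in v] for a in u]

def add_scaled(A, B, k=1.0):
    """Return A + k * B for square nested-list matrices."""
    return [[a + k * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scatter_matrices(classes):
    """classes: list of classes, each a list of feature vectors.

    Returns (S_b, S_w, S_t), each normalized by the total example count N.
    """
    all_rows = [x for cls in classes for x in cls]
    N, d = len(all_rows), len(all_rows[0])
    m = mean_vec(all_rows)                       # global mean
    Sb = [[0.0] * d for _ in range(d)]
    Sw = [[0.0] * d for _ in range(d)]
    St = [[0.0] * d for _ in range(d)]
    for cls in classes:
        mi = mean_vec(cls)                       # class mean
        db = [a - b for a, b in zip(mi, m)]
        Sb = add_scaled(Sb, outer(db, db), len(cls) / N)
        for x in cls:
            dw = [a - b for a, b in zip(x, mi)]
            dt = [a - b for a, b in zip(x, m)]
            Sw = add_scaled(Sw, outer(dw, dw), 1 / N)
            St = add_scaled(St, outer(dt, dt), 1 / N)
    return Sb, Sw, St

# Hypothetical toy data: two classes of 2-dimensional feature vectors.
toy = [
    [[1.0, 2.0], [2.0, 3.0], [3.0, 3.0]],
    [[6.0, 5.0], [7.0, 8.0]],
]
Sb, Sw, St = scatter_matrices(toy)
```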

Relevance Feedback
Generally, RF is widely considered as a 2-class learning problem that treats positive examples and negative examples in a symmetric way. The learning flowchart of RF is given in Figure 2. When a query is input, its features are extracted and compared with those previously stored in the data set based on a similarity measure. Among the top returned results, the user labels some relevant examples as positive feedback and some irrelevant examples as negative feedback. Based on these labeled examples, the RF model is refined iteratively, and the final results are returned to the user.
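The feedback loop can be illustrated with a short Python sketch. Note the assumptions: this is not the DBDA technique adopted in this article. It swaps in a much simpler centroid-based query shift, in the spirit of query modification, purely to show the retrieve-label-refine cycle. The user's judgements are simulated from known labels, and all names, parameters, and toy data are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance, used here as the similarity measure."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def mean_vec(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def top_k(query, database, k):
    """Indices of the k database items closest to the query."""
    return sorted(range(len(database)), key=lambda i: dist(query, database[i]))[:k]

def feedback_loop(query, database, labels, target, k=3, iterations=3,
                  beta=0.75, gamma=0.25):
    """Return the precision in the top k results at each iteration."""
    q = list(query)
    precisions = []
    for _ in range(iterations):
        ranked = top_k(q, database, k)
        # Simulated user: items of the target class are positive feedback.
        pos = [database[i] for i in ranked if labels[i] == target]
        neg = [database[i] for i in ranked if labels[i] != target]
        precisions.append(len(pos) / k)
        # Shift the query toward the positive centroid and away from
        # the negative centroid, both measured from the current query.
        step = [0.0] * len(q)
        if pos:
            step = [s + beta * (p - qi) for s, p, qi in zip(step, mean_vec(pos), q)]
        if neg:
            step = [s - gamma * (n - qi) for s, n, qi in zip(step, mean_vec(neg), q)]
        q = [qi + s for qi, s in zip(q, step)]
    return precisions

# Hypothetical toy database: class "A" near the origin, class "B" farther away.
database = [[0, 0], [0, 1], [1, 0], [1, 1], [3, 3], [3, 4], [4, 3]]
labels = ["A", "A", "A", "A", "B", "B", "B"]
precisions = feedback_loop([2.2, 2.2], database, labels, target="A")
```

In this toy run the initial query lies closer to class "B", so only one of the top 3 results is relevant. After one round of feedback the query moves into class "A" and all top 3 results are relevant.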
In this article, we use DBDA32 as the relevance feedback technique. DBDA is regarded as an improvement of biased discriminant analysis (BDA),35 which treats positive examples and negative examples asymmetrically. Both are introduced as follows.
Biased Discriminant Analysis. As users usually label both positive examples and negative examples, RF is often considered as a 2-class pattern classification problem. However, just as “happy families are all alike; every unhappy family is unhappy in its own way” (Leo Tolstoy’s Anna Karenina), positive examples are all alike and each negative example is negative in its own way. That is, there is an asymmetry between positive examples and negative examples. Moreover, users are interested in only one class (the positive class); that is, the returned results should be similar to the query. In addition, negative examples are too few to represent the true nonlinear distributions. Therefore, it is more reasonable to assume that there is one positive class but that the number of other classes is uncertain. Based on these concepts, BDA35 treats RF as a (1 + x)-class biased learning problem (biased toward the positive class) and labels training examples as only positive or negative, according to whether they belong to the target class or not. In this way, positive examples are pulled closer to each other while negative examples are pushed away from the positive ones.
It is easier to understand BDA after the formulation of LDA introduced in a previous section. The objective of BDA is to maximize the ratio between the biased matrix $S_y$ and the scatter matrix of the positive examples $S_x$:

$$W = \arg\max_W \frac{|W^T S_y W|}{|W^T S_x W|}, \tag{5}$$

$$S_x = \sum_{i=1}^{N_x} (x_i - m_x)(x_i - m_x)^T, \qquad S_y = \sum_{i=1}^{N_y} (y_i - m_x)(y_i - m_x)^T, \tag{6}$$

where $x_i$ denotes the positive examples and $y_i$ denotes the negative examples. Herein, $N_x$ is the number of positive examples, $N_y$ is the number of negative examples, and $m_x = (1/N_x) \sum_{i=1}^{N_x} x_i$ is the mean vector of the positive examples. To obtain $W$, we can compute the eigenvectors of $S_x^{-1} S_y$.
Direct Biased Discriminant Analysis. DBDA,32 which is regarded as an enhanced BDA, adopts the same idea as direct LDA.36 In DBDA, it is assumed that the null space of $S_y$ contains no important information for discriminating different classes and that the discriminant vectors are restricted to the subspace spanned by class centers. Therefore, the formulation of DBDA is obtained by first diagonalizing $S_y$ and then removing its null space:

$$Y^T S_y Y = D_y > 0. \tag{7}$$

Here, $D_y$ comprises the nonzero eigenvalues of $S_y$ and $Y$ comprises the corresponding eigenvectors. Then, $S_x$ is transformed to $D_y^{-1/2} Y^T S_x Y D_y^{-1/2}$. Diagonalizing the transformed matrix yields the eigenvector matrix $U$ and the eigenvalue matrix $D_x$, and the projection matrix is

$$W = Y D_y^{-1/2} U D_x^{-1/2}.$$
Adaptive Neuro-Fuzzy Inference System
To compare the classification accuracy of the proposed relevance feedback scheme with that of a traditional method, the Adaptive Neuro-Fuzzy Inference System (ANFIS)37 is also adopted to evaluate the ability and effectiveness of the MPE measures in classifying different seizure phases. The ANFIS learns features in the data set and adjusts the system parameters according to a given error criterion. For more details, please refer to Jang.37 To improve generalization, 3 ANFIS classifiers are trained with the backpropagation gradient descent method in combination with the least squares method, with the calculated MPE measures used as input. Each ANFIS classifier is trained so that it is likely to be more accurate for one state of the EEG signals than for the other states. The samples from the seizure-free (data set I), preseizure (data set II), and seizure (data set III) phases are given the binary target values of (0, 0, 1), (0, 1, 0), and (1, 0, 0), respectively. Each ANFIS classifier is implemented using the MATLAB software package (MATLAB version 7.0 with fuzzy logic toolbox).
Experimental Results
Multiscale Permutation Entropy Measure of EEG Data
The MPE measure was applied to analyze all 6384 two-second EEG epochs in this study (112 epochs × 19 channels from each of data sets I, II, and III). Scale 1 (ie, s = 1) is the only scale considered by traditional single-scale-based methods. For example, the permutation entropy values for EEG segments of channel F3 were averaged…