Journal of Science & Technology 123 (2017) 059-064
EEG Features Extraction for Classification
of Human Intention and Non-Intention
Phan Duy Hung1*, Vu Minh Thang1, Vu Thu Diep2
1 FPT University, Hoa Lac High Tech Park, Hanoi, Viet Nam
2 Hanoi University of Science and Technology, No. 1, Dai Co Viet, Hai Ba Trung, Hanoi, Viet Nam
Received: July 20, 2017; Accepted: November 03, 2017
Abstract
This study classifies the brain states Intention and Non-Intention based on only one channel of the
Electroencephalogram (EEG) signal, then applies the results to a real-world problem. Because brain signals
differ considerably between people, this article only discusses personal EEG data. First, data is
recorded with the EEG-SMT device of Olimex Ltd. Then we extract features from the collected EEG data and use
the ANOVA tool to evaluate their significance. A Multilayer Neural Network is used for the training process.
The accuracy is normally above 90% for each subject when using the proposed features. Finally, a method of
smoothing real-time results is developed to improve the training results and to play simple computer games.
Keywords: EEG, Intention, Non-Intention
1. Introduction
In recent years, EEG research has developed rapidly and has found a wide range of applications in real life [1-5]. EEG has several advantages: it is fast, non-invasive and causes no pain to the human subject. Moreover, low-cost and increasingly portable EEG equipment has been developed in recent years. EEG also has many applications in both research and practice, such as lie detection [1], neuromarketing [2], etc.
There are many research results on EEG classification. Many of them use a large number of electrodes to obtain a large number of EEG channels. In [3], the authors used 64 biosensors to acquire EEG signals to classify human emotions and achieved a classification rate as high as 95%. In the work of Zouhir B. et al., 2014 [4], the authors designed and implemented an efficient Brain Computer Interface that allows disabled people to control the motion of wheelchairs. It uses a portable Emotiv headset that provides 14 channels. They achieved an average classification rate of 91% overall. That research also demonstrated another important result: the EEG signal is heavily user-dependent, which means that the classification results of one person cannot be applied to another. In the research carried out by Tien Pham et al., 2014 [5], EEG signals obtained from 23 electrodes were used for personal identification.
* Corresponding author: Tel.: (+84) 975597339
Email: hungpd2@fpt.com.vn
Many experiments require the subjects to keep their eyes either open or closed throughout the EEG recording. In the research carried out by Zouhir B. et al., 2014 [4], the subject is asked to keep the eyes open for about eight seconds. The work carried out by Liu Y. et al., 2010 [6] requires subjects to keep their eyes closed during experiment sessions. This requirement helps to avoid muscle movement and eye blinking artifacts, but it cannot be imposed in real-world applications.
Our objective is to analyze and successfully classify the acquired EEG signals into Intention and Non-Intention classes using only one data channel. During the recording of the EEG signal, subjects do not need to keep their eyes open or closed, which means that they can blink naturally. Because the EEG signal is user-dependent, we only classify EEG signals individually for each subject. By employing the proposed system, patients with paralysis will be able to enjoy computer games as healthy people do, even if they are living with the loss of muscle control.
The EEG is recorded using the EEG-SMT device [7]. After recording, the signal is analyzed to extract the features that form the input of the classification process. A Multilayer Perceptron is employed for classification. The output of the classification process is used to play some simple games.
2. Data Collection
In this research, we use EEG data collected from four women and four men, all around 22 years old, healthy and right-handed. All of the subjects were informed about the purpose of this experiment. For stimulation, we created software that displays an animation to attract the attention of the subject.
The recording is made with the EEG-SMT, using three passive electrodes placed on Fp1 (or Fp2), A1 and A2 as in Figure 1 [8]. At the start of the experiment, the test subjects were asked to wear the EEG-SMT for 2 minutes to familiarize themselves with the sensor and to limit any effect of discomfort.
Fig. 1. International 10-20 System
The EEG-SMT device is connected to the computer, and we have developed software in Matlab that provides the functions needed in this research. Before recording, the EEG signal is carefully checked on the graph displayed on the desktop screen to make sure it is not affected by any other factors. Once everything is ready, data collection starts and the user interface switches to the stimulation screen for the Intention or Non-Intention state. When recording the Intention state, the balloon on the user interface moves continuously for 60 seconds and the subject focuses on the balloon. When recording the Non-Intention state, the test subject relaxes and does not concentrate on the balloon for 60 seconds. After recording, we ask the subjects whether they were concentrating or not. If the subjects are unsure about their mental state during the experiment, we discard the recorded data.
Finally, we had 15360 samples (60 seconds, i.e. 256 samples per second) of EEG signal for each brain state of each subject.
3. Features Extraction
The measured signal is highly dependent on the measuring system and the sensor (contact impedance, gain coefficient, noise filter, etc.), and the goal is to play live games with the device, so this study does not use open databases from prior research. Instead, we select the EEG features from the literature, select the identification model and propose new strategies for improving recognition results in actual games.
In this study, we investigated the most common features: statistical features and frequency features.
3.1. Proposed Features
Let the raw signal recorded from the EEG device in a segment be designated by $X_n$, where $n = 1, 2, \ldots, N$, with $N = 512$ (512 samples correspond to 2 seconds of the EEG recording).
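Mean of raw signals:
$$\mu_X = \frac{1}{N}\sum_{n=1}^{N} X_n \qquad (1)$$
Standard deviation of the raw signals:
$$\sigma_X = \left(\frac{1}{N-1}\sum_{n=1}^{N} (X_n-\mu_X)^2\right)^{1/2} \qquad (2)$$
Means of the absolute values of the first differential of the raw signals:
$$\delta_X = \frac{1}{N-1}\sum_{n=1}^{N-1} \left|X_{n+1}-X_n\right| \qquad (3)$$
Means of the absolute values of the second differential of the raw signals:
$$\gamma_X = \frac{1}{N-2}\sum_{n=1}^{N-2} \left|X_{n+2}-X_n\right| \qquad (4)$$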
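The four statistical features of Eqs. (1) to (4) can be sketched in a few lines of plain Python. This is only an illustration: the function name `statistical_features` and the toy segment are assumptions and do not come from the authors' Matlab software.

```python
import math

def statistical_features(x):
    """Return (mean, std, mean |first diff|, mean |second diff|) of one segment."""
    n = len(x)
    mu = sum(x) / n                                                    # Eq. (1)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / (n - 1))         # Eq. (2)
    delta = sum(abs(x[i + 1] - x[i]) for i in range(n - 1)) / (n - 1)  # Eq. (3)
    gamma = sum(abs(x[i + 2] - x[i]) for i in range(n - 2)) / (n - 2)  # Eq. (4)
    return mu, sigma, delta, gamma

# Toy segment (hypothetical values); a real segment would hold 512 samples.
segment = [1.0, 3.0, 2.0, 5.0, 4.0]
mu, sigma, delta, gamma = statistical_features(segment)
```

Note that Eq. (4) as given uses the lag-2 difference $X_{n+2}-X_n$, and the sketch follows that definition literally.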
Skewness:
Skewness shows the degree of asymmetry of a distribution, where the left or right tail is relatively longer than the other. If Skewness is negative, the data are spread out more to the left of the mean than to the right. If Skewness is positive, the data are spread out more to the right. If the distribution is perfectly symmetric (for example, a normal distribution), the Skewness equals zero [9,10].
$$s = \frac{E(x-\mu)^3}{\sigma^3} = \frac{\frac{1}{N}\sum_{i=1}^{N}(X_i-\mu_X)^3}{\left[\frac{1}{N-1}\sum_{i=1}^{N}(X_i-\mu_X)^2\right]^{3/2}} \qquad (5)$$
where µ is the mean of X, σ is the standard
deviation of X, E(t) represents the expected value of
the quantity t.
Kurtosis:
Kurtosis is a measure of how outlier-prone a
distribution is. The kurtosis of the normal distribution
is 3. Distributions that are more outlier-prone than the normal distribution have kurtosis greater than 3; distributions that are less outlier-prone have kurtosis less than 3 [9,10]. The kurtosis of a distribution is defined as:
$$k = \frac{E(x-\mu)^4}{\sigma^4} = \frac{\frac{1}{N}\sum_{i=1}^{N}(X_i-\mu_X)^4}{\left[\frac{1}{N}\sum_{i=1}^{N}(X_i-\mu_X)^2\right]^{2}} \qquad (6)$$
where µ is the mean of X, σ is the standard
deviation of X, E(t) represents the expected value of
the quantity t.
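As a small illustration of Eqs. (5) and (6), the sketch below computes both moments for one segment. It assumes the normalizations read from those equations (N-1 in the skewness denominator, N in the kurtosis denominator); the function name and the toy data are hypothetical.

```python
def skewness_kurtosis(x):
    """Return (skewness, kurtosis) of one segment following Eqs. (5) and (6)."""
    n = len(x)
    mu = sum(x) / n
    m2 = sum((v - mu) ** 2 for v in x)
    m3 = sum((v - mu) ** 3 for v in x)
    m4 = sum((v - mu) ** 4 for v in x)
    s = (m3 / n) / (m2 / (n - 1)) ** 1.5  # Eq. (5)
    k = (m4 / n) / (m2 / n) ** 2          # Eq. (6)
    return s, k

# A symmetric toy segment has zero skewness.
s_sym, k_sym = skewness_kurtosis([1.0, 2.0, 3.0, 4.0, 5.0])
# A segment with a long right tail has positive skewness.
s_right, _ = skewness_kurtosis([1.0, 1.0, 1.0, 1.0, 6.0])
```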
Auto-Regressive coefficients:
Auto-Regressive (AR) coefficients describe each sample of the EEG signal as a linear combination of previous samples plus a white noise error term [11]. The forward prediction of the EEG signal is accomplished using the following equation:
$$X_k = -\sum_{i=1}^{N} a_i X_{k-i} + e_k \qquad (7)$$
where $a_i$ are the AR coefficients, $e_k$ is the white noise or error sequence, and $N$ here denotes the order of the AR model (not the segment length).
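The feature list later in this paper names "AR Burg order 6", so the coefficients of Eq. (7) are presumably estimated with Burg's method. The following is a minimal pure-Python sketch of the standard Burg lattice recursion, not the authors' Matlab code; the function name `burg_ar` and the test signal are assumptions.

```python
def burg_ar(x, order):
    """Return [1, a_1, ..., a_order] for X_k = -sum_i a_i X_{k-i} + e_k (Burg's method)."""
    ef = list(x[1:])   # forward prediction errors
    eb = list(x[:-1])  # backward prediction errors
    a = [1.0]
    for _ in range(order):
        num = sum(f * b for f, b in zip(ef, eb))
        den = sum(f * f for f in ef) + sum(b * b for b in eb)
        k = -2.0 * num / den  # reflection coefficient of this stage
        ext = a + [0.0]
        a = [ext[i] + k * ext[-1 - i] for i in range(len(ext))]
        # Update both error sequences from the old ones, then re-align them.
        ef, eb = ([f + k * b for f, b in zip(ef, eb)][1:],
                  [b + k * f for f, b in zip(ef, eb)][:-1])
    return a

# Hypothetical test signal: a decaying geometric sequence.
geo = [0.5 ** k for k in range(20)]
a1 = burg_ar(geo, 1)
a6 = burg_ar(geo, 6)  # order 6 yields six coefficients after the leading 1
```

The six values `a6[1:]` would then be the AR part of the feature vector for one segment.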
Percent wave:
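The raw EEG signal is transformed to the frequency domain using the Fast Fourier Transform (FFT). Then, the percentage of each type of wave is calculated by the following formulas:
$$p_\delta = \frac{\sum_{i=0.5\,\mathrm{Hz}}^{4\,\mathrm{Hz}} f_i}{\sum_i f_i};\quad p_\theta = \frac{\sum_{i=4\,\mathrm{Hz}}^{7\,\mathrm{Hz}} f_i}{\sum_i f_i};\quad p_\alpha = \frac{\sum_{i=8\,\mathrm{Hz}}^{14\,\mathrm{Hz}} f_i}{\sum_i f_i};\quad p_\beta = \frac{\sum_{i=14\,\mathrm{Hz}}^{30\,\mathrm{Hz}} f_i}{\sum_i f_i} \qquad (8)$$
where $f_i$ is the absolute amplitude of the signal at $i$ Hz in the frequency domain [10].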
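A rough sketch of Eq. (8) follows. It uses a naive DFT from the standard library so that it stays self-contained (an FFT routine would be used in practice). The sampling rate of 256 Hz is inferred from "512 samples = 2 seconds"; skipping the DC bin in the denominator is an assumption, since the paper does not state the summation range of the denominator.

```python
import cmath
import math

FS = 256  # assumed sampling rate in Hz, inferred from 512 samples per 2 seconds
BANDS = {"delta": (0.5, 4), "theta": (4, 7), "alpha": (8, 14), "beta": (14, 30)}

def amplitude_spectrum(x):
    """Naive DFT magnitudes for bins 0..N/2 (slow but dependency-free)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

def percent_waves(x, fs=FS):
    """Share of the total spectral amplitude that falls into each band, Eq. (8)."""
    n = len(x)
    amp = amplitude_spectrum(x)
    total = sum(amp[1:])  # assumption: the DC bin is excluded
    return {name: sum(a for k, a in enumerate(amp)
                      if k > 0 and lo <= k * fs / n <= hi) / total
            for name, (lo, hi) in BANDS.items()}

# Hypothetical 2-second segment containing a pure 10 Hz (alpha) oscillation.
segment = [math.sin(2 * math.pi * 10 * t / FS) for t in range(512)]
shares = percent_waves(segment)
```

For this synthetic segment almost all of the amplitude lands in the alpha band, which is the behaviour the percent-alpha feature is meant to capture.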
Power Spectral Density:
Assuming $F(n)$ is the FFT result of a segment, the Power Spectral Density (PSD) is as follows:
$$P(n) = \frac{F(n)F^*(n)}{N} \qquad (9)$$
where $F^*(n)$ is the complex conjugate of $F(n)$ and $N$ is the length of the segment [11,12]. The energy of each type of wave can be defined as follows:
$$E_\delta = \sum_{freq=0.5\,\mathrm{Hz}}^{4} P_{freq};\quad E_\theta = \sum_{freq=4\,\mathrm{Hz}}^{7} P_{freq};\quad E_\alpha = \sum_{freq=8\,\mathrm{Hz}}^{13} P_{freq};\quad E_\beta = \sum_{freq=14\,\mathrm{Hz}}^{30} P_{freq} \qquad (10)$$
Furthermore, according to [11], there are some relations between alpha and beta waves. For example, alpha activity indicates that the brain is in a relaxed state, whereas beta activity is related to stimulation. So the ratio of alpha to beta energy can be used as a feature for classification:
$$R = \frac{E_\alpha}{E_\beta} \qquad (11)$$
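Eqs. (9) to (11) can be illustrated with the short sketch below. It again relies on a naive DFT to stay dependency-free, and the 256 Hz sampling rate, the function names and the synthetic test signal are assumptions made for the example.

```python
import cmath
import math

FS = 256  # assumed sampling rate in Hz (512 samples per 2 seconds)

def dft(x):
    """Naive DFT for bins 0..N/2."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]

def band_energy(psd, fs, n, lo, hi):
    """Sum of PSD values whose bin frequency lies in [lo, hi] Hz, Eq. (10)."""
    return sum(p for k, p in enumerate(psd) if lo <= k * fs / n <= hi)

def alpha_beta_ratio(x, fs=FS):
    n = len(x)
    psd = [(f * f.conjugate()).real / n for f in dft(x)]  # Eq. (9)
    e_alpha = band_energy(psd, fs, n, 8, 13)
    e_beta = band_energy(psd, fs, n, 14, 30)
    return e_alpha / e_beta                               # Eq. (11)

# Hypothetical segment: a 10 Hz component plus a 20 Hz component of half the amplitude.
segment = [math.sin(2 * math.pi * 10 * t / FS) + 0.5 * math.sin(2 * math.pi * 20 * t / FS)
           for t in range(512)]
ratio = alpha_beta_ratio(segment)
```

Because energy scales with the square of the amplitude, halving the beta amplitude gives a ratio of about 4 for this signal.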
Max Frequency:
Following the suggestion in [13], we can use the max frequency as a feature for classification. Each type of wave has its own maximum value, so we have 4 values corresponding to the maximum values of frequency in the delta, theta, alpha and beta waves.
Mean Frequency:
Mean frequency represents the centroid of the
spectrum and is calculated from results of Fast
Fourier Transform. There are 4 values corresponding
to mean frequencies of delta, theta, alpha and beta
waves [10].
Shannon Entropy, Log Energy Entropy [14]:
Shannon entropy of a finite-length discrete random variable $x = [X_1, X_2, \ldots, X_N]$ with probability distribution function denoted by $p(x)$ is defined by:
$$H_{ShanEn}(x) = -\sum_{i=1}^{N} (p_i(x))^2 \left(\log_2 (p_i(x))\right)^2 \qquad (12)$$
Log Energy Entropy is given by:
$$H_{LogEn}(x) = \sum_{i=1}^{N} \left(\log_2 (p_i(x))\right)^2 \qquad (13)$$
3.2. ANOVA Evaluation
After calculating the features, we use the ANOVA tool to reduce the high dimensionality of the feature space before the classification process. Note that one-way ANOVA can only evaluate each feature separately. In practice, two or more features can be combined to get significant classification results.
After using ANOVA and practical experiments, we decided on a feature list that contains: percent alpha, percent beta, PSD of alpha, PSD of beta, mean of the absolute values of the second differential, mean of the absolute values of the first differential, Skewness, Kurtosis, and AR Burg coefficients of order 6.
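To make the per-feature evaluation concrete, the sketch below computes the one-way ANOVA F statistic for a single feature measured in the two classes. A larger F means that the class means are further apart relative to the within-class scatter. The function name and the toy values are hypothetical, and the p-value lookup that a statistics package would add is left out.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for one feature across several classes."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Toy feature values (hypothetical) for Intention and Non-Intention segments.
intention = [1.0, 2.0, 3.0]
non_intention = [4.0, 5.0, 6.0]
f_value = anova_f([intention, non_intention])
```

Features whose F value is low for a given subject are candidates for removal, keeping in mind the paper's remark that one-way ANOVA only judges features one at a time.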
4. Classification
We employed a multilayer perceptron (MLP) to classify Intention/Non-Intention. According to the Universal approximation theorem, a feed-forward network with a single hidden layer containing a finite number of neurons can approximate an arbitrary nonlinear, continuous, multidimensional function f with any desired accuracy [15]. So we use only one hidden layer for our training process.
The number of neurons in the hidden layer has a great effect on the classification result. In an MLP network, increasing the number of neurons in the hidden layer increases the power of the network, but requires more computation and is more likely to produce overfitting (which means the network only fits one specific set of inputs). There is no common rule for the number of neurons in a hidden layer, so we ran an experiment with 1 to 60 neurons to find the most appropriate number.
After the experiment, we saw that the range from 20 to 60 produces a high and stable result. Therefore, we use 40 neurons for the hidden layer.
We use the Early Stopping technique to prevent overfitting. It is a popular method for supervised networks. In this technique, the data is
divided into three subsets: the training set, the
validation set and the testing set. The training set is
used for computing the gradient and updating the
network weights and biases. The error on the
validation set is monitored during the training
process. The validation error normally decreases
during the initial phase of training, as does the
training set error. However, when the network begins
to overfit the data, the error on the validation set
typically begins to rise. When the validation error
increases for a specified number of iterations (one
pre-defined constant), the training is stopped [16].
The weights and biases at the minimum of validation
error are returned.
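The stopping rule can be sketched independently of any network library. The function below takes the sequence of validation errors and returns the epoch at which training would stop together with the epoch whose weights would be kept. It counts consecutive epochs without improvement over the best validation error, which is one common reading of "the validation error increases for a specified number of iterations"; the name `early_stopping` and the `patience` parameter are assumptions.

```python
def early_stopping(val_errors, patience):
    """Return (stop_epoch, best_epoch) for a sequence of validation errors."""
    best_err = float("inf")
    best_epoch = -1
    fails = 0
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            best_err, best_epoch, fails = err, epoch, 0  # new minimum, reset counter
        else:
            fails += 1
            if fails >= patience:
                return epoch, best_epoch  # stop and keep the best weights
    return len(val_errors) - 1, best_epoch

# Hypothetical validation errors: improvement until epoch 2, then overfitting.
errors = [0.50, 0.40, 0.30, 0.35, 0.36, 0.37, 0.20]
stop_epoch, best_epoch = early_stopping(errors, patience=3)
```

In this toy run the training stops at epoch 5 and the weights from epoch 2 are returned, so the later value 0.20 is never reached; this is the usual trade-off of the technique.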
In our research, we used 70% data for training,
15% data for validation and 15% data for testing.
For each dataset, we run the training multiple times with different initial weights and biases, and different divisions of the data into training, validation, and test sets. These different conditions might lead to very different training results for the same dataset.
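A random 70/15/15 division that changes from run to run can be sketched as follows. The function name `split_indices` and the seed values are hypothetical; the authors' Matlab software may divide the data differently.

```python
import random

def split_indices(n, seed, train=0.70, val=0.15):
    """Shuffle segment indices and split them into train/validation/test parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # a different seed gives a different division
    n_train = int(round(train * n))
    n_val = int(round(val * n))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# Two runs on the same 2000 segments (hypothetical count) with different divisions.
train_a, val_a, test_a = split_indices(2000, seed=1)
train_b, val_b, test_b = split_indices(2000, seed=2)
```

Repeating the training over several such divisions is what produces the five accuracy columns per subject in Table 1.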
The results of classifying Intention and Non-Intention are summarized in Table 1. The table shows that the classification accuracy is normally above 90% for each subject. This does not hold for every subject (e.g. subject 6). A possible reason is that some people were not concentrating while the data was recorded. When we classify the EEG data of all subjects together, the accuracy drops to approximately 85%. This is because the EEG signal differs between people and depends on the biological features of each person.
5. Application of classification results
After training, the MLP network is saved for future classification. In this section, we use this trained network to play a simple game. In real-time processing, we also use data segments of 512 samples, and consecutive segments are shifted by 16 samples. At 256 samples per second, there are therefore up to 16 decisions “Intention or Non-Intention” per second.
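The segmentation arithmetic can be checked with a short sketch. It assumes that the 16 samples mentioned above are the shift between consecutive 512-sample segments, which is the reading that yields 16 decisions per second at 256 samples per second; the function name is hypothetical.

```python
def sliding_segments(signal, width=512, step=16):
    """Cut a signal into overlapping segments of `width` samples, `step` samples apart."""
    return [signal[start:start + width]
            for start in range(0, len(signal) - width + 1, step)]

# One 60-second recording at 256 samples per second (placeholder values).
recording = list(range(15360))
segments = sliding_segments(recording)
decisions_per_second = 256 // 16
```

Under these assumptions a single 60-second recording yields 929 segments.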
5.1. Improve training results
To ensure that the subject can play the game correctly, after training and classification we test the training results before the player actually plays a game. If a test result is incorrect, we use the test-recorded data for incremental training. All the recorded data is marked with the corresponding label and can be used for the training process. The new data helps improve the actual performance of playing the game in the future (Fig. 2).
Fig. 2. Balloon Game
The test and retraining environment is mostly similar to the real game environment. The EEG signal is recorded, passed through the classification network, and classified as Intention or Non-Intention. The balloon moves to the left if the system detects Intention, and to the right if it detects Non-Intention.
5.2. Smoothing results
In this section, we use the notation (I) for Intention and (N) for Non-Intention. According to Section 4, the classification accuracy is very high. But when we use the trained network for real-time classification, there are some artifacts. The position of the electrodes might differ from the previous position, which can lead to small errors in the data. When the subject concentrates, the result might be “I, I, I, I, I, N, N, I, I, I”, etc. The two “N, N” results could make the balloon move unsmoothly. To eliminate this problem, we use an algorithm that makes the balloon move more smoothly.
Table 1. Classification results
Subject    Number of data segments  1st     2nd     3rd     4th     5th
Subject 1  2023                     100%    99.95%  100%    100%    100%
Subject 2  1873                     99.31%  95.78%  99.36%  99.47%  99.52%
Subject 3  1866                     100%    99.79%  100%    99.84%  100%
Subject 4  1867                     99.09%  99.14%  99.79%  98.66%  99.46%
Subject 5  1932                     93.94%  95.81%  86.90%  95.86%  90.53%
Subject 6  1856                     85.18%  84.59%  93.21%  82.06%  93.43%
Subject 7  2580                     99.73%  99.84%  99.73%  99.88%  99.92%
Subject 8  1740                     98.28%  96.15%  96.67%  95.69%  92.18%
Normally, each time a data segment is classified, the trained network has to decide whether it is (I) or (N), and the balloon then moves to the left or right immediately. In our algorithm, we introduce an undefined (U) state. This state appears when the results of the trained network keep changing between I and N. At that time the data might not be stable, so the balloon does not move. Our system decides that the subject is in state (I) or (N) only if the trained network returns 5 consecutive results of (I) or of (N). Because there are 16 data frames per second, five consecutive results span only about 0.3 seconds, so the system's delay is negligible.
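One possible reading of this smoothing rule is sketched below: a decision is passed on only once the same label has been seen 5 times in a row, and everything else is reported as the undefined state U. The function name `smooth` and the parameter `run` are assumptions, and the authors' implementation may differ in detail.

```python
def smooth(decisions, run=5):
    """Map raw 'I'/'N' decisions to 'I', 'N' or 'U' (undefined, balloon does not move)."""
    smoothed = []
    last = None
    count = 0
    for d in decisions:
        if d == last:
            count += 1
        else:
            last, count = d, 1  # label changed, restart the run counter
        smoothed.append(d if count >= run else "U")
    return smoothed

# The example sequence from the text: two stray N results inside a run of I.
raw = list("IIIIINNIII")
out = smooth(raw)
steady = smooth(list("IIIIIII"))
```

With this rule the two stray N results no longer push the balloon in the wrong direction; it simply pauses until 5 identical results have accumulated again.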
5.3. Playing PC Game
After retraining and testing, the trained network is used to classify the EEG signals of the subject in order to play a game. For demonstration, we created a simple game in Matlab. In this game, the balloon goes up when the subject performs Intention, and goes down when the subject performs Non-Intention. There is an arrow flying across the screen. If the balloon collides with the arrow or with the array of arrows at the bottom, the game ends. The horizontal arrow appears at the position of the balloon to ensure that the subject always has to think in order to move the balloon.
With the retraining strategy described above for the neural network before starting to play, combined with the strategy for smoothing the identification results, players can control the balloon naturally in the game.
6. Conclusion and perspectives
In this study, using only one data channel and allowing subjects to blink naturally during recording, the classification accuracy is normally above 90% with the following features: percent alpha, percent beta, PSD of alpha, PSD of beta, mean of the absolute values of the second differential of the normalized signal, mean of the absolute values of the first differential of the normalized signal, Skewness, Kurtosis, and AR Burg coefficients of order 6. Because the EEG signal is user-dependent, we only classify EEG signals individually for each subject. We have also proposed a solution to smooth the recognition results. By employing the proposed system, patients with paralysis will be able to enjoy computer games as healthy people do, even if they are living with the loss of muscle control.
As future work, we plan to classify more states of the brain, such as left, right, up and down, and to apply the results to more complicated games. We will try to collect more EEG data to achieve better and subject-independent results for the survey participants.
References
[1] R. Cakmak, A. M. Zeki, Determining the state of
truthfulness and falsehood by analyzing the acquired
EEG signals, IEEE 12th International Colloquium on
Signal Processing & Its Applications (CSPA), (2016)
173 – 178.
[2] M. Murugappan, S. Murugappan, Balaganapathy, C.
Gerard, Wireless EEG signals based Neuromarketing
system using Fast Fourier Transform (FFT), IEEE
10th International Colloquium on Signal Processing
and its Applications, (2014) 25 – 30.
[3] C. T. Yuen, W. S. San, J. H. Ho, M. Rizon,
Effectiveness of Statistical Features for Human
Emotions Classification using EEG Biosensors,
Research Journal of Applied Sciences, Engineering
and Technology, 5(21): (2013) 5083-5089.
[4] Z. Bahri, S. Abdulaal, M. Buallay, Sub-Band-Power-
Based Efficient Brain Computer Interface for
Wheelchair Control. American Journal of Signal
Processing, Vol. 4 No. 1, (2014) 34-40.
[5] T. Pham, W. Ma, D. Tran, and P. Nguyen, Multi-
factor EEG-based user authentication. Proceedings of
International Joint Conference on Neural Networks
(IJCNN), (2014) 4029-4034.
[6] Y. Liu, O. Sourina and M. K. Nguyen, Real-Time
EEG-Based Human Emotion Recognition and
Visualization. Proceeding of International Conference
on Cyberworlds, (2010) 262-269.
[7] https://www.olimex.com/Products/EEG/OpenEEG/EEG-SMT/open-source-hardware
[8] J. F. Echallier, F. Perrin and J. Pernier, Computer-
assisted placement of electrodes on the human head,
Electroencephalography and clinical
Neurophysiology, 82 (1992) 160-163.
[9] J. W. Bang, J. S. Choi, K. R. Park, Noise Reduction
in Brainwaves by Using Both EEG Signals and
Frontal Viewing Camera Images. Sensors, 13 (2013)
6272-6294.
[10] R. S. Huang, L. L. Tsai, C. J. Kuo, Selection of Valid
and Reliable EEG Features for Predicting Auditory
and Visual Alertness Levels. Proc. Natl. Sci. Counc.
ROC(B), Vol. 25, No. 1, (2001) 17-25.
[11] N. H. Liu, C. Y. Chiang, and H. C. Chu, Recognizing
the degree of human attention using EEG signals
from mobile sensors. Sensors 13.8 (2013) 10273-
10286.
[12] C. Hasegawa, K. Oguri, The effects of Specific
Musical Stimuli on Driver’s Drowsiness, Proceeding
of the Intelligent Transportation Systems Conference,
ITSC’06, Toronto, ON, Canada, (2006) 817-822.
[13] M. Nandish, M. Stafford, K. P. Hemanth, F. Ahmed,
Feature Extraction and Classification of EEG Signal
Using Neural Network Based Techniques,
International Journal of Engineering and Innovative
Technology (IJEIT) (2008), volume 2, issue 4.
[14] H. Jianfeng, D. Xiao, and Z. Mu, Application of
Energy Entropy in Motor Imagery EEG
Classification, JDCTA 3.2 (2009) 83-90.
[15] Q. J. Zhang, K. C. Gupta, and V. K. Devabhaktuni,
Artificial neural networks for RF and microwave
design-from theory to practice, Microwave Theory
and Techniques, IEEE Transactions on 51.4 (2003):
1339-1350.
[16] M. N. S. Swamy, Ke-Lin Du, Neural Networks and
Statistical Learning, (2013) pp. 21-22.