EEG Features Extraction for Classification of Human Intention and Non-Intention


Journal of Science & Technology 123 (2017) 059-064

Phan Duy Hung1*, Vu Minh Thang1, Vu Thu Diep2
1 FPT University, Hoa Lac High Tech Park, Hanoi, Viet Nam
2 Hanoi University of Science and Technology, No. 1, Dai Co Viet, Hai Ba Trung, Hanoi, Viet Nam
* Corresponding author: Tel.: (+84) 975597339, Email: hungpd2@fpt.com.vn
Received: July 20, 2017; Accepted: November 03, 2017

Abstract

This study classifies the brain states Intention and Non-Intention based on only one channel of the electroencephalogram (EEG) signal, and then applies the results to a real-world problem. Because brain signals differ greatly between people, this article only discusses per-subject EEG data. First, data are recorded with the EEG-SMT device of Olimex Ltd. We then extract features from the collected EEG data and use the ANOVA tool to evaluate their significance. A multilayer neural network is used for the training process. With the proposed features, the accuracy is normally above 90% for each subject. Finally, a method of smoothing real-time results is developed to improve training results and to play simple computer games.

Keywords: EEG, Intention, Non-Intention

1. Introduction

In recent years, EEG research has developed rapidly and has found a wide range of real-life applications [1-5]. EEG has several advantages: it is fast, non-invasive and causes no pain to the human subject. Moreover, low-cost and increasingly portable EEG equipment has been developed in recent years. EEG also has many applications in both research and practice, such as lie detection [1], neuromarketing [2], etc.

There are many research results on EEG classification. Many of them use a large number of electrodes to obtain a large number of EEG channels. In [3], the authors used 64 biosensors to acquire EEG signals for classifying human emotions and achieved a classification rate as high as 95%. Zouhir B. et al., 2014 [4] designed and implemented an efficient Brain Computer Interface that allows disabled people to control the motion of wheelchairs. It uses a portable Emotiv headset that provides 14 channels, and achieved an average classification rate of 91% overall. That research also demonstrated another important result: the EEG signal is heavily user-dependent, which means that the classification results of one person cannot be applied to another. Tien Pham et al., 2014 [5] showed that EEG signals obtained from 23 electrodes can be used for personal identification.

Many experiments require the subjects to keep their eyes always open or always closed while the EEG is recorded. In Zouhir B. et al., 2014 [4], the subject is asked to keep the eyes open for about eight seconds. Liu Y. et al., 2010 [6] require subjects to keep their eyes closed during experiment sessions. This requirement helps to avoid muscle-movement and eye-blink artifacts, but it cannot be imposed in real-world applications.

Our objective is to analyze the acquired EEG signals and classify them into Intention and Non-Intention classes using only one data channel. While the EEG signal is recorded, subjects do not need to keep their eyes open or closed, which means they can blink naturally. Because the EEG signal is user-dependent, we classify EEG signals individually for each subject. By employing the proposed system, patients with paralysis will be able to enjoy computer games as healthy people do, even if they are living with the loss of muscle control.

The EEG is recorded using the EEG-SMT device [7]. After recording, the signal is analyzed to extract features that are the input of the classification process. A Multilayer Perceptron is employed for classification.
The output of the classification process is used to play some simple games.

2. Data Collection

In this research, we use EEG data collected from four women and four men, all around 22 years old, healthy and right-handed. All subjects were informed about the purpose of this experiment. For stimulation, we created software that displays an animation to attract the subject's attention.

The recording is made with the EEG-SMT, using three passive electrodes. They are placed on Fp1 (or Fp2), A1 and A2 as in Figure 1 [8]. When the experiment was initiated, the test subjects were asked to wear the EEG-SMT for 2 minutes to familiarize themselves with the sensor and to prevent the potential effect of discomfort.

Fig. 1. International 10-20 System

The EEG-SMT device is connected to the computer, and we have developed Matlab software that provides the functions needed in this research. Before recording, the EEG signal is carefully checked on the graph displayed on the desktop screen to make sure it is not affected by any other factors. When everything is ready, data collection starts and the user interface switches to the stimulation screen for the Intention or Non-Intention state. When recording the Intention state, the balloon on the user interface moves continuously for 60 seconds and the subject focuses on the balloon. When recording the Non-Intention state, the subject relaxes and does not concentrate on the balloon for 60 seconds. After recording, we asked the subjects whether they had been concentrating or not. If the subjects were unsure about their mental state during the experiment, we discarded the recorded data. Finally, we had 15360 samples (60 seconds) of EEG signal for each brain state for each subject.

3. Features Extraction

The measured signal is highly dependent on the measuring system and the sensor (contact impedance, gain coefficient, noise filter, etc.), and the goal is to play live games with the device, so this study does not use open databases from prior research. We select the EEG features from the literature review, select the identification model, and propose new strategies for improving recognition results in actual games. In this study, we investigated the most common features: statistical features and frequency features.

3.1. Proposed Features

Let the raw signals recorded from the EEG device in a segment be designated by X_n, where n = 1, 2, ..., N, with N = 512 (512 samples correspond to 2 seconds of the EEG recording).

Mean of the raw signals:

$$\mu_X = \frac{1}{N}\sum_{n=1}^{N} X_n \qquad (1)$$

Standard deviation of the raw signals:

$$\sigma_X = \left(\frac{1}{N-1}\sum_{n=1}^{N}(X_n-\mu_X)^2\right)^{1/2} \qquad (2)$$

Mean of the absolute values of the first differential of the raw signals:

$$\delta_X = \frac{1}{N-1}\sum_{n=1}^{N-1}\left|X_{n+1}-X_n\right| \qquad (3)$$

Mean of the absolute values of the second differential of the raw signals:

$$\gamma_X = \frac{1}{N-2}\sum_{n=1}^{N-2}\left|X_{n+2}-X_n\right| \qquad (4)$$

Skewness: Skewness shows the degree of asymmetry of a distribution, where the left or right tail is relatively longer than the other. If the skewness is negative, the data are spread out more to the left of the mean than to the right. If the skewness is positive, the data are spread out more to the right. If the distribution is perfectly symmetric (for example, a normal distribution), the skewness equals zero [9,10].

$$s = \frac{E(x-\mu)^3}{\sigma^3} = \frac{\frac{1}{N}\sum_{i=1}^{N}(X_i-\mu_X)^3}{\left[\frac{1}{N-1}\sum_{i=1}^{N}(X_i-\mu_X)^2\right]^{3/2}} \qquad (5)$$

where µ is the mean of X, σ is the standard deviation of X, and E(t) represents the expected value of the quantity t.

Kurtosis: Kurtosis is a measure of how outlier-prone a distribution is. The kurtosis of the normal distribution is 3. Distributions that are more outlier-prone than the normal distribution have kurtosis greater than 3; distributions that are less outlier-prone have kurtosis less than 3 [9,10].
The kurtosis of a distribution is defined as:

$$k = \frac{E(x-\mu)^4}{\sigma^4} = \frac{\frac{1}{N}\sum_{i=1}^{N}(X_i-\mu_X)^4}{\left[\frac{1}{N}\sum_{i=1}^{N}(X_i-\mu_X)^2\right]^{2}} \qquad (6)$$

where µ is the mean of X, σ is the standard deviation of X, and E(t) represents the expected value of the quantity t.

Auto-Regressive coefficients: The Auto-Regressive (AR) model describes each sample of the EEG signal as a linear combination of previous samples plus a white noise error term [11]. The forward prediction of the EEG signal is given by the following equation:

$$X_k = -\sum_{i=1}^{N} a_i X_{k-i} + e_k \qquad (7)$$

where a_i are the AR coefficients, e_k is the white noise or error sequence, and N is the order of the AR model.

Percent wave: The raw EEG signal is transformed to the frequency domain by using the Fast Fourier Transform (FFT). Then, the percentage of each type of wave is calculated by the following formulas:

$$p_\delta = \frac{\sum_{i=0.5\,\mathrm{Hz}}^{4\,\mathrm{Hz}} f_i}{\sum f};\quad p_\theta = \frac{\sum_{i=4\,\mathrm{Hz}}^{7\,\mathrm{Hz}} f_i}{\sum f};\quad p_\alpha = \frac{\sum_{i=8\,\mathrm{Hz}}^{14\,\mathrm{Hz}} f_i}{\sum f};\quad p_\beta = \frac{\sum_{i=14\,\mathrm{Hz}}^{30\,\mathrm{Hz}} f_i}{\sum f} \qquad (8)$$

where f_t is the absolute amplitude of the signal at t Hz in the frequency domain [10].

Power Spectral Density: Assuming F(n) is the FFT result of a segment, the Power Spectral Density (PSD) is as follows:

$$P(n) = \frac{F(n)\,F^{*}(n)}{N} \qquad (9)$$

where F*(n) is the complex conjugate of F(n) and N is the length of the segment [11,12]. The energy of each type of signal can be defined as follows:

$$E_\delta = \sum_{freq=0.5\,\mathrm{Hz}}^{4} P_{freq};\quad E_\theta = \sum_{freq=4\,\mathrm{Hz}}^{7} P_{freq};\quad E_\alpha = \sum_{freq=8\,\mathrm{Hz}}^{13} P_{freq};\quad E_\beta = \sum_{freq=14\,\mathrm{Hz}}^{30} P_{freq} \qquad (10)$$

Furthermore, according to [11], there is a relation between the alpha and beta waves. For example, alpha activity indicates that the brain is in a relaxed state, whereas beta activity is related to stimulation. So the ratio of alpha to beta can be used as a feature for classification:

$$R = \frac{E_\alpha}{E_\beta} \qquad (11)$$

Max Frequency: As suggested in [13], the maximum frequency can be used as a classification feature.
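Before the remaining frequency features, the percent-wave, band-energy and ratio features of Eqs. (8)-(11) can be illustrated with a short sketch. It is not the authors' Matlab code. It assumes a sampling rate of 256 Hz (inferred from 512 samples per 2 seconds), takes the denominator of Eq. (8) as the amplitude sum over 0.5-30 Hz, treats band edges as inclusive, and swaps the FFT for a direct DFT so that only the Python standard library is needed. The names `dft_bins` and `spectral_features` are hypothetical.

```python
import cmath
import math

FS = 256  # Hz; assumption inferred from "512 samples correspond to 2 seconds"

# Band edges in Hz as they appear in Eq. (8) and Eq. (10) respectively.
PERCENT_BANDS = {"delta": (0.5, 4), "theta": (4, 7), "alpha": (8, 14), "beta": (14, 30)}
ENERGY_BANDS = {"delta": (0.5, 4), "theta": (4, 7), "alpha": (8, 13), "beta": (14, 30)}


def dft_bins(x, k_max):
    """Direct DFT coefficients F(0)..F(k_max); an FFT returns the same values."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(k_max + 1)
    ]


def spectral_features(segment, fs=FS, f_max=30.0):
    """Return (percent per band, energy per band, alpha/beta energy ratio)."""
    n = len(segment)
    resolution = fs / n  # 0.5 Hz for 512 samples at 256 Hz
    coeffs = dft_bins(segment, int(f_max / resolution))
    freqs = [k * resolution for k in range(len(coeffs))]
    amplitude = [abs(c) for c in coeffs]
    psd = [(c * c.conjugate()).real / n for c in coeffs]  # Eq. (9)

    def band_sum(values, lo, hi):
        # Inclusive edges: a bin on a shared edge (4 Hz, 14 Hz) counts in both bands.
        return sum(v for f, v in zip(freqs, values) if lo <= f <= hi)

    total = band_sum(amplitude, 0.5, f_max)  # assumed denominator of Eq. (8)
    percent = {b: band_sum(amplitude, lo, hi) / total
               for b, (lo, hi) in PERCENT_BANDS.items()}  # Eq. (8)
    energy = {b: band_sum(psd, lo, hi)
              for b, (lo, hi) in ENERGY_BANDS.items()}  # Eq. (10)
    ratio = energy["alpha"] / energy["beta"]  # Eq. (11)
    return percent, energy, ratio
```

Calling `spectral_features(segment)` on one 512-sample segment returns the four band percentages, the four band energies and the ratio R. Because the 7-8 Hz gap is excluded and the edge bins are shared, the percentages need not sum to exactly one under these assumptions.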
Each type of wave has its own maximum value, so we have 4 values corresponding to the maximum frequencies of the delta, theta, alpha and beta waves.

Mean Frequency: The mean frequency represents the centroid of the spectrum and is calculated from the results of the Fast Fourier Transform. There are 4 values corresponding to the mean frequencies of the delta, theta, alpha and beta waves [10].

Shannon Entropy, Log Energy Entropy [14]: The Shannon entropy of a finite-length discrete random variable x = [X_1, X_2, ..., X_N] with probability distribution function denoted by p(x) is defined by:

$$H_{ShanEn}(x) = -\sum_{i=1}^{N} \left(p_i(x)\right)^2 \left(\log_2\left(p_i(x)\right)\right)^2 \qquad (12)$$

The Log Energy Entropy is given by:

$$H_{LogEn}(x) = \sum_{i=1}^{N} \left(\log_2\left(p_i(x)\right)\right)^2 \qquad (13)$$

3.2. ANOVA Evaluation

After calculating the features, we use the ANOVA tool to reduce the high dimensionality of the feature space before the classification process. Note that one-way ANOVA can only evaluate each feature separately. In practice, two or more features can be combined to obtain significant classification results. After using ANOVA and practical experiments, we chose the following feature list: percent alpha, percent beta, PSD of alpha, PSD of beta, mean of the absolute values of the second differential, mean of the absolute values of the first differential, Skewness, Kurtosis, AR Burg order 6.

4. Classification

We employed a multilayer perceptron (MLP) to classify Intention/Non-Intention. According to the universal approximation theorem, a feed-forward network with a single hidden layer containing a finite number of neurons can approximate an arbitrary nonlinear, continuous, multidimensional function f with any desired accuracy [15]. So we use only one hidden layer for our training process. The number of neurons in the hidden layer has a great effect on the classification result.
In an MLP network, increasing the number of neurons in the hidden layer increases the power of the network, but requires more computation and is more likely to produce overfitting (which means the network only fits one specific set of inputs). There is no generally appropriate number of neurons for a hidden layer, so we ran an experiment with the number of neurons ranging from 1 to 60 to find the most appropriate one. The experiment showed that the range from 20 to 60 produces high and stable results. Therefore, we use 40 neurons for the hidden layer.

We use the Early Stopping technique to prevent overfitting. It is a popular method for supervised networks. In this technique, the data is divided into three subsets: the training set, the validation set and the testing set. The training set is used for computing the gradient and updating the network weights and biases. The error on the validation set is monitored during the training process. The validation error normally decreases during the initial phase of training, as does the training set error. However, when the network begins to overfit the data, the error on the validation set typically begins to rise. When the validation error increases for a specified number of iterations (a pre-defined constant), the training is stopped [16]. The weights and biases at the minimum of the validation error are returned.

In our research, we used 70% of the data for training, 15% for validation and 15% for testing. For each dataset, we repeat the training multiple times with different initial weights and biases, and different divisions of the data into training, validation, and test sets. These different conditions might lead to very different training results for the same dataset.

The results of classifying Intention and Non-Intention are summarized in Table 1.
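The 70/15/15 division and the early-stopping rule described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' Matlab implementation. The network itself is hidden behind three hypothetical callbacks (`step`, `validation_error`, `get_weights`). The stopping rule counts epochs without a new validation minimum, and the `patience` value of 6 is an arbitrary placeholder for the paper's unspecified pre-defined constant.

```python
import random


def split_indices(n_segments, seed=0, train=0.70, val=0.15):
    """Shuffle segment indices and split them into train/validation/test parts."""
    idx = list(range(n_segments))
    random.Random(seed).shuffle(idx)  # a different seed gives a different division
    n_train = int(round(train * n_segments))
    n_val = int(round(val * n_segments))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def train_with_early_stopping(step, validation_error, get_weights,
                              patience=6, max_epochs=1000):
    """Train until the validation error has not improved for `patience` epochs.

    step()             -- runs one training epoch on the training set
    validation_error() -- returns the current error on the validation set
    get_weights()      -- returns a snapshot of the current weights and biases
    Returns the snapshot taken at the minimum validation error, and that error.
    """
    best_error = float("inf")
    best_weights = get_weights()
    epochs_without_improvement = 0
    for _ in range(max_epochs):
        step()
        error = validation_error()
        if error < best_error:
            best_error = error
            best_weights = get_weights()
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # validation error stopped improving: likely overfitting
    return best_weights, best_error
```

Repeating `split_indices` with different seeds and re-initialising the network before each call to `train_with_early_stopping` mirrors the repeated trainings on one dataset that the text describes.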
Table 1 shows that the classification accuracy is normally above 90% for each subject. This does not hold for every subject (e.g. subject 6). A possible reason is that some people were not concentrating while the data were recorded. When we classify the EEG data of all subjects together, the accuracy drops to approximately 85%. This is because the EEG signal differs between people and depends on the biological characteristics of each person.

5. Application of classification results

After training, the MLP network is saved for future classification. In this section, we use this trained network to play a simple game. In real-time processing, we also use data segments of 512 samples, and successive segments are shifted by 16 samples (the "overlap segment"). At 256 samples per second, this gives up to 16 "Intention or Non-Intention" decisions per second.

5.1. Improve training results

To ensure that the subject can play the game correctly, we test the training results after training and classification, before the player actually plays a game. If the test result is incorrect, we use the data recorded during the test for incremental training. All the recorded data are marked with the corresponding label and can be used for the training process. The new data help improve the actual performance of playing the game in the future (Fig. 2).

Fig. 2. Balloon Game

The test and retraining environment is largely the same as the real game environment. The EEG signal is recorded and passed through the classification network, which decides whether the signal is Intention or Non-Intention. The balloon moves to the left if the system detects Intention, and to the right if it detects Non-Intention.

5.2. Smoothing results

In this section, we use the notation (I) for Intention and (N) for Non-Intention. According to Section 4, the classification accuracy is very high. However, when we use the trained network for real-time classification, there are some artifacts.
The position of the electrodes might differ from the previous position, which can lead to small errors in the data. When the subject concentrates, the result might be "I, I, I, I, I, N, N, I, I, I", etc. The two "N, N" results could make the balloon move unsmoothly. To eliminate this problem, we use an algorithm that makes the balloon move more smoothly.

Table 1. Classification results

Subject     Number of data segments   1st       2nd       3rd       4th       5th
Subject 1   2023                      100%      99.95%    100%      100%      100%
Subject 2   1873                      99.31%    95.78%    99.36%    99.47%    99.52%
Subject 3   1866                      100%      99.79%    100%      99.84%    100%
Subject 4   1867                      99.09%    99.14%    99.79%    98.66%    99.46%
Subject 5   1932                      93.94%    95.81%    86.90%    95.86%    90.53%
Subject 6   1856                      85.18%    84.59%    93.21%    82.06%    93.43%
Subject 7   2580                      99.73%    99.84%    99.73%    99.88%    99.92%
Subject 8   1740                      98.28%    96.15%    96.67%    95.69%    92.18%

Normally, each time a data segment is classified, the trained network has to decide whether it is (I) or (N), and the balloon immediately moves left or right. In our algorithm, we introduce an undefined (U) state. This state appears when the results of the trained network keep changing between I and N. At that time the data might not be stable, so the balloon does not move. Our system decides that the subject is in state (I) or (N) only if the trained network returns 5 consecutive results of (I) or (N). Because there are 16 data frames per second, 5 frames take about 0.3 seconds, so the system's delay is negligible.

5.3. Playing PC Game

After retraining and testing, the trained network is used to classify the subject's EEG signals in order to play a game. For demonstration, we created a simple game in Matlab. In this game, the balloon goes up when the subject performs Intention, and goes down when the subject performs Non-Intention. An arrow flies across the screen. If the balloon collides with the arrow or with the array of arrows at the bottom, the game ends.
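The smoothing rule of Section 5.2, whose output drives the balloon in this game, can be sketched as follows. This is one possible reading of the rule, not the authors' code: a decision is passed through only when the last 5 raw results agree, and every other frame is reported as undefined (U), so the balloon stays still. The function name `smooth` and the parameter `run_length` are hypothetical.

```python
from collections import deque


def smooth(decisions, run_length=5):
    """Map raw 'I'/'N' decisions to 'I', 'N' or 'U' (undefined).

    A frame keeps its label only if it and the previous run_length - 1 frames
    all carry the same label; otherwise it becomes 'U' and the balloon does
    not move.
    """
    window = deque(maxlen=run_length)  # holds the most recent raw decisions
    smoothed = []
    for decision in decisions:
        window.append(decision)
        if len(window) == run_length and len(set(window)) == 1:
            smoothed.append(decision)
        else:
            smoothed.append("U")
    return smoothed
```

Under this reading, the example sequence "I, I, I, I, I, N, N, I, I, I" yields a single (I) at the fifth frame and (U) everywhere else, so the two stray (N) results pause the balloon instead of reversing it.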
The horizontal arrow appears at the position of the balloon, so the subject always has to think in order to move the balloon. With the retraining strategy above for the neural network before starting to play, combined with the strategy to smooth the identification results, players can control the balloon naturally in the game.

6. Conclusion and perspectives

In this study, using only one data channel and allowing subjects to blink naturally during recording, the classification accuracy is normally above 90% with the following features: percent alpha, percent beta, PSD of alpha, PSD of beta, mean of the absolute values of the second differential of the normalized signal, mean of the absolute values of the first differential of the normalized signal, Skewness, Kurtosis, AR Burg order 6. Because the EEG signal is user-dependent, we classify EEG signals individually for each subject. We have also proposed a solution to smooth the recognition results. By employing the proposed system, patients with paralysis will be able to enjoy computer games as healthy people do, even if they are living with the loss of muscle control. As future work, we plan to classify more states of the brain, such as left, right, up and down, and to apply the results to more complicated games. We will also try to collect more EEG data to achieve better results that are independent of the individual participants.

References

[1] R. Cakmak, A. M. Zeki, Determining the state of truthfulness and falsehood by analyzing the acquired EEG signals, IEEE 12th International Colloquium on Signal Processing & Its Applications (CSPA), (2016) 173-178.
[2] M. Murugappan, S. Murugappan, Balaganapathy, C. Gerard, Wireless EEG signals based Neuromarketing system using Fast Fourier Transform (FFT), IEEE 10th International Colloquium on Signal Processing and its Applications, (2014) 25-30.
[3] C. T. Yuen, W. S. San, J. H. Ho, M. Rizon, Effectiveness of Statistical Features for Human Emotions Classification using EEG Biosensors, Research Journal of Applied Sciences, Engineering and Technology, 5(21) (2013) 5083-5089.
[4] Z. Bahri, S. Abdulaal, M. Buallay, Sub-Band-Power-Based Efficient Brain Computer Interface for Wheelchair Control, American Journal of Signal Processing, Vol. 4, No. 1, (2014) 34-40.
[5] T. Pham, W. Ma, D. Tran, P. Nguyen, Multi-factor EEG-based user authentication, Proceedings of International Joint Conference on Neural Networks (IJCNN), (2014) 4029-4034.
[6] Y. Liu, O. Sourina, M. K. Nguyen, Real-Time EEG-Based Human Emotion Recognition and Visualization, Proceeding of International Conference on Cyberworlds, (2010) 262-269.
[7] https://www.olimex.com/Products/EEG/OpenEEG/EEG-SMT/open-source-hardware
[8] J. F. Echallier, F. Perrin, J. Pernier, Computer-assisted placement of electrodes on the human head, Electroencephalography and clinical Neurophysiology, 82 (1992) 160-163.
[9] J. W. Bang, J. S. Choi, K. R. Park, Noise Reduction in Brainwaves by Using Both EEG Signals and Frontal Viewing Camera Images, Sensors, 13 (2013) 6272-6294.
[10] R. S. Huang, L. L. Tsai, C. J. Kuo, Selection of Valid and Reliable EEG Features for Predicting Auditory and Visual Alertness Levels, Proc. Natl. Sci. Counc. ROC(B), Vol. 25, No. 1, (2001) 17-25.
[11] N. H. Liu, C. Y. Chiang, H. C. Chu, Recognizing the degree of human attention using EEG signals from mobile sensors, Sensors 13.8 (2013) 10273-10286.
[12] C. Hasegawa, K. Oguri, The effects of Specific Musical Stimuli on Driver's Drowsiness, Proceeding of the Intelligent Transportation Systems Conference, ITSC'06, Toronto, ON, Canada, (2006) 817-822.
[13] M. Nandish, M. Stafford, K. P. Hemanth, F. Ahmed, Feature Extraction and Classification of EEG Signal Using Neural Network Based Techniques, International Journal of Engineering and Innovative Technology (IJEIT) (2008), volume 2, issue 4.
[14] H. Jianfeng, D. Xiao, Z. Mu, Application of Energy Entropy in Motor Imagery EEG Classification, JDCTA 3.2 (2009) 83-90.
[15] Q. J. Zhang, K. C. Gupta, V. K. Devabhaktuni, Artificial neural networks for RF and microwave design-from theory to practice, IEEE Transactions on Microwave Theory and Techniques 51.4 (2003) 1339-1350.
[16] M. N. S. Swamy, Ke-Lin Du, Neural Networks and Statistical Learning, (2013) pp. 21-22.
