Journal of Science and Technology 54 (3A) (2016) 64-73

A FORMULA TO CALCULATE PRUNING THRESHOLD FOR THE PART-OF-SPEECH TAGGING PROBLEM

Nguyen Chi Hieu

Industrial University of Ho Chi Minh City, 12 Nguyen Van Bao, Ward 4, Go Vap District, Ho Chi Minh City

Email: nchieu@hui.edu.vn

Received: 1 May 2016; Accepted for Publication: 15 July 2016

ABSTRACT

The exact tagging of the words in a text is a very important task in natural language processing. It can support parsing, contribute to resolving polysemous words, help to access semantic information, etc. One of the crucial factors in statistical POS (part-of-speech) tagging approaches is the processing time. In this paper, we propose an approach to calculate the pruning threshold, which can be applied to the Viterbi algorithm of the Hidden Markov model for tagging texts in natural language processing. An experiment on 1,000,000 tagged words of the Wall Street Journal corpus showed that our proposed solution is satisfactory.

Keywords: Hidden Markov model, part-of-speech tagging, Viterbi algorithm, beam search.

1. INTRODUCTION

Tagging is defined as the automatic assignment of descriptors (or tags) to input tokens. Part-of-speech (POS) tagging is the process of selecting the most likely sequence of syntactic categories for the words in a sentence. It is a very important problem in natural language processing. Several approaches have been developed [1], including taggers based on hand-written rules, n-grams automatically derived from tagged text corpora, Hidden Markov models, symbolic language models, machine learning, and hybrid taggers [2]. Among these approaches, the one based on the Hidden Markov model (HMM) offers prominent results [3]. Especially when using the Viterbi algorithm, it can achieve an accuracy rate of over 95 percent [4]. However, its complexity is a challenge. For a problem involving T words and K lexical categories, the algorithm that finds the most likely sequence requires K^2 * T steps for a bigram model and K^3 * T steps for a trigram model. When K increases, the original Viterbi algorithm becomes slow. For this reason, several investigators have developed techniques to reduce the generation time and memory consumption in practice. These techniques are based on heuristic pruning methods. However, heuristic pruning may miss essential parts of the search tree because of wrong decisions made while pruning. Therefore, in this paper, we propose a method to calculate the pruning threshold and apply it to the Viterbi algorithm of the HMM. Beam search is a heuristic method for combinatorial optimization problems, which has been studied extensively in artificial intelligence [5]. Combining the Viterbi algorithm with a beam pruning technique is useful for compressing the search space, which reduces the computational complexity [6]. The basic difference of our solution from traditional heuristic pruning techniques is that a pruning formula is proposed instead of using a heuristic.

2. RELATED WORKS

2.1. Hidden Markov model

In 1913, Andrey Markov asked a less controversial question about Pushkin's text: could we use frequency counts from the text to help compute the probability that the next letter in the sequence would be a vowel?
Such a process was later named a Markov process, and Hidden Markov models have since been applied in temporal pattern recognition tasks such as speech, handwriting and gesture recognition, part-of-speech tagging, partial discharge analysis, bioinformatics, etc. [7]. The Hidden Markov model is one of the most important machine learning models in speech and language processing. A Markov chain is a special case of a Markov process in which the states can be enumerated. Figure 1 shows a Markov chain for assigning a probability to a sequence of categories, where the categories are ART, N, V, and P. A Markov chain embodies an important assumption about these probabilities: in a first-order Markov chain, the probability of a particular state depends only on the previous state [4].

Figure 1. A Markov chain.

The Markov chain is a random process that undergoes transitions from one state to another on a state space. It must possess a property that is usually characterized as memoryless: the probability distribution of the next state depends only on the current state and not on the sequence of events that preceded it [8]. This specific kind of memorylessness is called the Markov property. A stochastic process has the Markov property if the conditional probability distribution of future states of the process depends only upon the present state, not on the sequence of events that preceded it. A process with this property is called a Markov process. The Markov property can be defined as follows: suppose S = {S0, S1, ..., Sn} is a sequence of states with St ∈ W, 1 ≤ t ≤ n, where W is the finite set of possible symbols. We say that the sequence has the Markov property if

P(St | S0, S1, ..., St-1) = P(St | St-1).

A Markov chain is useful when we need to compute a probability for a sequence of events that we can observe in the real world. In many cases, however, the events may not be directly observable. In part-of-speech tagging, for example, an HMM allows us to talk about both observed events (such as the words we see in the input) and hidden events (such as the part-of-speech tags) that we think of as causal factors in the probabilistic model [7]. The word "hidden" refers to the unobserved states in the Markov model. For example, in the word sequence "the flies like flowers", the word "flowers" could be generated from state N with a probability of 0.05, or from state V with a probability of 0.053 [4, p. 200]. Thus the probability of the string "the flies like flowers" differs for different category sequences.

2.2. Beam Search

A beam search ranks all the alternatives produced so far by their probabilities and always expands only the most likely sequences [9]. This is repeated at each time point until the entire output is covered by an interpretation. It is a simple modification of the basic Viterbi algorithm. The original Viterbi algorithm is driven by a loop that traverses each position in the input, namely

δt(i) = Max j=1..N (δt-1(j) × P(Ci | Cj)) × P(Wt | Ci),   (1)

where Wt ∈ W = W1W2...WT is the sequence of words, Ci ∈ Ĉ = C1C2...CK is the set of lexical categories, δt(i) is the probability of word Wt with category Ci, P(Ci | Cj) is the probability of category Ci in the context of lexical category Cj, and P(Wt | Ci) is the probability of word Wt in the context of lexical category Ci. For each input position, the process loops over all K states (the i's) and evaluates the max in formula (1) over all K predecessor states (the j's), i.e., K^2 operations per position.
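As an illustration of formula (1), the following minimal Python sketch computes one time step of the recursion. It assumes the transition and lexical-generation probabilities are supplied as dictionaries; all names (delta_prev, trans_p, emit_p) are illustrative rather than taken from the paper.

    # One step of the base Viterbi recursion (formula (1)).
    # delta_prev : dict mapping category j -> delta_{t-1}(j)
    # trans_p    : dict mapping (c_i, c_j) -> P(C_i | C_j)
    # emit_p     : dict mapping (word, c_i) -> P(W_t | C_i)
    def viterbi_step(delta_prev, categories, word, trans_p, emit_p):
        delta_t = {}
        for ci in categories:                                  # loop over all K states i
            best = max(delta_prev[cj] * trans_p[(ci, cj)]      # max over K predecessors j
                       for cj in categories)
            delta_t[ci] = best * emit_p.get((word, ci), 0.0)
        return delta_t                                         # K * K = K^2 multiplications per word

Beam search, described next, aims at cutting down this inner maximization so that only promising predecessors are examined.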
At every step, a beam search prunes out the unpromising states, leaving only the best nodes from which to compute the next states. Several different methods can be used to decide how many states to keep. For example, we can keep a fixed number of nodes at each step, say D nodes. Alternatively, we can choose a factor H and delete all states whose probability is less than m*H, where m is the maximum probability of any state at that time. With H = 0.5, for instance, we would only retain hypotheses whose score is greater than half of the maximum one at that time. In what follows, we keep at each step all nodes whose score is at or above this threshold. We create an array called beam; its element at time t stores the set of indices of the states above the threshold at time t. Formula (1) then becomes

δt(i) = Max j ∈ beam(t-1) (δt-1(j) × P(Ci | Cj)) × P(Wt | Ci).   (2)

Then, by finding the maximum value Mt at time t, we can compute beam(t) and delete the states below the threshold Mt*H. With this change, the number of operations in the inner loop is reduced to D*K, where D is the average number of states whose score lies between Mt*H and Mt. We can keep D manageable by an appropriate choice of H.

For instance, given the lexical-generation and bigram probabilities, consider the problem of finding the most likely categories for the sentence "the flies like flowers". There are 4 possible categories, and hence 4^4 = 256 different sequences of length four. Table 1 presents the result of the Viterbi algorithm for this sentence under the Markov assumption. As expected, four nodes are expanded at each of the four stages, each stage considering four achievable paths; this means that 4^3 possibilities are calculated. The transition diagram in Figure 2 represents all 256 sequences.

Figure 2. Encoding the 256 possible sequences of "the flies like flowers".

Table 1. The full Viterbi search for "the flies like flowers".

          the        flies       like        flowers
    N     0.29e-4    0.95e-2     0.15e-4     0.7e-5
    V     0.1e-7     0.95e-5     0.41e-3     0.34e-6
    Art   0.38       0.38e-8     0.62e-9     0.27e-7
    P     0.6e-8     0.128e-8    0.28e-3     0.66e-9

Table 2. The beam-search Viterbi employing a factor H = 0.01 for "the flies like flowers".

          the        flies       like        flowers
    N     -          0.95e-2     0.15e-4     0.7e-5
    V     -          -           0.41e-3     0.34e-6
    Art   0.38       -           -           -
    P     -          -           0.28e-3     -

Table 2 shows the beam-search Viterbi retaining only the entries that are within the factor of 0.01 for "the flies like flowers". In this table, the first two stages retain only a single entry each. During the search, the algorithm considers 5×4 = 20 possibilities [10]. Table 3 presents the beam-search Viterbi with a fixed beam width of two (i.e., the two best entries at each point are kept).

Table 3. The beam-search Viterbi with a fixed beam width w = 2 for "the flies like flowers".

          the        flies       like        flowers
    N     0.29e-4    0.95e-2     -           0.7e-5
    V     -          0.95e-5     0.41e-3     0.34e-6
    Art   0.38       -           -           -
    P     -          -           0.28e-3     -

Here there are two entries at every stage, so the algorithm calculates 8*4 = 32 possibilities.
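A minimal Python sketch of the factor-H pruning in formula (2) is given below; as with the previous sketch, the probability tables and the names (beam_prev, H) are assumptions made for illustration only.

    # Beam-pruned Viterbi step (formula (2)): the max is taken only over the
    # predecessors that survived pruning at time t-1, and beam(t) keeps the
    # states whose score is at least H times the best score M_t.
    def viterbi_beam_step(delta_prev, beam_prev, categories, word,
                          trans_p, emit_p, H=0.01):
        delta_t = {}
        for ci in categories:
            best = max(delta_prev[cj] * trans_p[(ci, cj)]
                       for cj in beam_prev)
            delta_t[ci] = best * emit_p.get((word, ci), 0.0)
        m_t = max(delta_t.values())                    # maximum score M_t at time t
        beam_t = [ci for ci in categories if delta_t[ci] >= m_t * H]
        return delta_t, beam_t

The choice of H (or of a fixed beam width) is exactly the empirical decision that the present paper seeks to replace with a computed threshold.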
3. PROPOSED APPROACH

Pruning in beam search can be classified into two broad categories. The first approach, called complete anytime beam search, was introduced by Zhang [11]. This algorithm calls a beam search, and if the beam search fails to find a solution, it tries again with a larger beam. Eventually the algorithm will find a solution, because at some point the beam becomes so large that nothing important is pruned. This approach has the serious drawback that the amount of space required to search to a given depth increases as time passes. An alternative approach is to expand pruned nodes without restarting the search. This is the approach taken in limited discrepancy beam search [12] and beam-stack search [13]. Both beam-stack search and limited discrepancy beam search reconsider pruned nodes without restarting the search, but they do so in a very systematic manner that does not take heuristic information into account. Both algorithms allow a complete exploration of the search space, but this completeness comes at a cost. Using a beam that is too narrow or tight (too small a beam width D or factor H) in Viterbi beam search can prune the best path and may return errors [10], while using a beam that is too large wastes computation searching unlikely paths. That is the disadvantage of search methods based on a heuristic. In formula (2), beam pruning techniques focus only on δt-1(j) and determine the pruning threshold empirically. This paper proposes a formula to calculate the pruning threshold automatically from both components, δt-1(j)*P(Ci | Cj), and applies it to the Viterbi algorithm of the HMM.

3.1. The Viterbi Algorithm

With exhaustive search, we would have to examine up to N^T possibilities for a sentence of T words with N tags per word. Fortunately, it is unnecessary to enumerate the N^T paths, because dynamic programming algorithms can be constructed for this problem. One of them is the Viterbi algorithm of the HMM. Consider searching the above HMM on the input sentence "the flies like flowers". Several issues must be dealt with:
- computing a maximum probability rather than a minimum distance;
- finding routes at each point in time: with an HMM, the algorithm must be generalized to handle probability distributions over states and to keep a separate record for each state at each time point (the record of state Ci at time t differs from the record of state Ci at time t-1);
- looking for the best path whatever state it ends in.

The details of the algorithm are shown in Figure 3. The algorithm operates by computing two arrays, SeqScore(i,t) and BACKPTR(i,t), and searches for the probability of the best sequence leading to each possible category at each position. We use an N×T array for this space, where N is the number of lexical categories (C1...CN) and T is the number of words in the sentence (W1...WT). The SeqScore(i,t) array records the maximum probability of the word Wt being in category Ci. The second N×T array, BACKPTR(i,t), indicates for each category at each position which category precedes it in the best sequence ending there.

Figure 3. The Viterbi algorithm.

    Given a word sequence W1, ..., WT, lexical categories C1, ..., CN, output
    probabilities Pr(Wt | Ci) and bigram probabilities Pr(Ci | Cj), find the most
    likely sequence of categories C1, ..., CT for the word sequence W1, ..., WT.

    Initialization Step:
      for i = 1 to N do          /* N is the number of lexical categories; ø denotes NULL */
        SeqScore(i,1) = Pr(Ci | ø) * Pr(W1 | Ci)
        BACKPTR(i,1) = 0

    Iteration Step:
      for t = 2 to T do          /* T is the number of words in the sentence */
        for i = 1 to N do
          SeqScore(i,t) = Max j=1..N (SeqScore(j,t-1) * Pr(Ci | Cj)) * Pr(Wt | Ci)
          BACKPTR(i,t) = index of the j that gave the max above

    Sequence Identification Step:
      C(T) = the i that maximizes SeqScore(i,T)
      for t = T-1 down to 1 do
        C(t) = BACKPTR(C(t+1), t+1)
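For concreteness, a compact Python rendering of the pseudocode in Figure 3 is sketched below. It assumes the probabilities are supplied as dictionaries and uses the literal category "NULL" for the empty starting context ø; the function and variable names are illustrative, not the author's implementation.

    # A minimal sketch of the base Viterbi algorithm of Figure 3.
    def viterbi(words, categories, trans_p, emit_p):
        T, N = len(words), len(categories)
        seq_score = [[0.0] * (T + 1) for _ in range(N)]    # SeqScore(i,t)
        backptr = [[0] * (T + 1) for _ in range(N)]        # BACKPTR(i,t)

        # Initialization step
        for i, ci in enumerate(categories):
            seq_score[i][1] = trans_p[(ci, "NULL")] * emit_p.get((words[0], ci), 0.0)

        # Iteration step
        for t in range(2, T + 1):
            for i, ci in enumerate(categories):
                best_j, best = max(
                    ((j, seq_score[j][t - 1] * trans_p[(ci, cj)])
                     for j, cj in enumerate(categories)),
                    key=lambda x: x[1])
                seq_score[i][t] = best * emit_p.get((words[t - 1], ci), 0.0)
                backptr[i][t] = best_j

        # Sequence identification step (backtrace)
        tags = [0] * (T + 1)
        tags[T] = max(range(N), key=lambda i: seq_score[i][T])
        for t in range(T - 1, 0, -1):
            tags[t] = backptr[tags[t + 1]][t + 1]
        return [categories[tags[t]] for t in range(1, T + 1)]

The nested loops over i and j in the iteration step are the T*K^2 cost discussed in the introduction; the proposed pruning targets precisely this inner loop.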
3.2. Integrating the pruning threshold calculation into the Viterbi algorithm

The time (or the number of steps) it takes to complete a program depends on several factors, such as the code produced by the compiler, the cost of the individual statements, the data, the asymptotic complexity (big O), etc.

3.2.1. Dataset

We use a part of the Penn Treebank corpus (about 1,000,000 words annotated with POS). The Penn Treebank POS tag set has 36 POS tags plus 12 others for punctuation and special symbols. 36.65 % of the words have more than one lexical category (on average 2.44 lexical categories per word), and 63.35 % of the words have exactly one lexical category; over the whole 1,000,000 words the average is 1.47 lexical categories per word [14]. This raises the sparse-data problem: a word may never be observed in some category in the training data, so its probability in that category would be estimated as 0, and the probability of any sentence containing that word in that category would be 0 as well. For this reason, other estimation techniques are used to address these very low probabilities. The technique we use to solve the zero-probability problem is to add a small amount, say 0.5, to every count before estimating the probabilities Pr(Wt | Ci), and to assign Pr(Ci | Cj) = 10^-6 wherever Pr(Ci | Cj) = 0. This guarantees that no zero probabilities remain. This estimation technique is called the expected likelihood estimator (ELE) [4].

3.2.2. Algorithm analysis

Analysing the base Viterbi algorithm, we find that at each step t (t = 2, ..., T) the algorithm has to compute the probability of a word by taking the maximum in formula (3):

SeqScore(i,t) = Max j=1..K (SeqScore(j,t-1) * Pr(Ci | Cj)) * Pr(Wt | Ci).   (3)

Notation: P = Max j=1..K (SeqScore(j,t-1) * Pr(Ci | Cj)). The result of P depends on two factors, SeqScore(j,t-1) and Pr(Ci | Cj); that is, we need to find the index j for which the product SeqScore(j,t-1) * Pr(Ci | Cj) is maximal. A breadth-first search [15] with a suitable heuristic finds the largest SeqScore(j,t-1) fastest; however, the corresponding Pr(Ci | Cj) may not be large enough to make P maximal. A solution for obtaining the maximal P is presented in the following subsections.

3.2.3. Threshold factor

Although we do not know in advance which product SeqScore(j,t-1) * Pr(Ci | Cj) is maximal, based on the characteristics of the data and the processing already done we can know when to stop the calculation of P. The parameter that controls this process is called the threshold factor. We assume that at every step t the scores are sorted so that formula (4) holds:

SeqScore(j,t-1) >= SeqScore(j+1,t-1)  (by sorting).   (4)

Let Prmax(Ci | Cj) be the maximal value among the Pr(Ci | Cj), i, j = 1, ..., N. Examining the formula P = Max j=1..N (SeqScore(j,t-1) * Pr(Ci | Cj)), we realize that we can stop the calculation of P at the j-th step if SeqScore(j,t-1) * Pr(Ci | Cj) >= SeqScore(j+1,t-1) * Prmax(Ci | Cj), because SeqScore(r,t-1) <= SeqScore(j+1,t-1) for r = j+2, ..., N (by the assumption in formula (4)), so no later term can exceed the current one.
To reduce the number of multiplication operations in the algorithm, we rewrite the inequality

SeqScore(j,t-1) * Pr(Ci | Cj) >= SeqScore(j+1,t-1) * Prmax(Ci | Cj)   (5)

as

SeqScore(j,t-1) * Pr(Ci | Cj) / Prmax(Ci | Cj) >= SeqScore(j+1,t-1),   (6)

and set

dBeam = Pr(Ci | Cj) / Prmax(Ci | Cj).   (7)

3.2.4. Proposed algorithm – the pruning Viterbi algorithm

We can calculate dBeam before the program is run, and we call dBeam the threshold factor. In addition, we add a variable, called iBack, that ranks the Pr(Ci | Cj) values. The details of the proposed algorithm are shown in Figure 4. The difference between the base Viterbi algorithm and the pruning Viterbi algorithm lies in the Iteration Step. In addition, we build a heap over the SeqScore values so that the largest element can be obtained as quickly as possible. We use a statistical method to compare the time it takes to complete the program for the base Viterbi algorithm and for the pruning Viterbi algorithm.

Figure 4. The pruning Viterbi algorithm.

    Step 1: Preparing data
      - Sort the probabilities Pr(Wt | Ci) in decreasing order; this makes building
        the heap faster, because the positions of the elements do not need to be
        changed (following the partial-optimization strategy of [2]).
      - Calculate the rank and the threshold dBeam of the entries of the table
        Pr(Ci | Cj).

    Step 2: Using the pruning technique in the Iteration Step of the Viterbi algorithm
      (i)  for t = 2 to T do
      (ii)   for i = 1 to K do begin
      (1)      convert SeqScore(·,t-1) into a heap;                     // O(log K)
      (2)      PMax = SeqScore(1,t-1) * Pr(Ci | C1);
      (3)      max = 1;
      (4)      if (rank of Pr(Ci | C1) > 1) and
                  (SeqScore(1,t-1) * Pr(Ci | C1).dBeam < SeqScore(2,t-1)) then
      (iii)      for j = 2 to K do begin
      (5)          if SeqScore(j,t-1) * Pr(Ci | Cj).dBeam >= SeqScore(j+1,t-1) then
                     exit the for loop with the current j;
      (6)          A = SeqScore(j,t-1) * Pr(Ci | Cj);
      (7)          if A > PMax then begin
      (8)            PMax = A;
      (9)            max = j;
                   end;
                 end;
      (10)     SeqScore(i,t) = PMax * Pr(Wt | Ci);
      (11)     BackPtr(i,t) = max;
             end;

    Here Pr(Ci | Cj).dBeam denotes the precomputed threshold factor of formula (7)
    stored with the corresponding entry of the Pr(Ci | Cj) table.
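A minimal Python sketch of the early-stopping test used in Step 2 (formulas (5)-(7)) is given below. It assumes that the SeqScore values of the previous position have been sorted in decreasing order and that the dBeam values have been precomputed as in Step 1; all names are illustrative, and the sketch folds the current term into the running maximum before testing the stop condition.

    # Early-stopping maximization over predecessors (formulas (5)-(7)).
    # prev_sorted : list of (SeqScore(j,t-1), category_j), sorted in decreasing
    #               order of score, as required by formula (4)
    # dbeam       : dict mapping (c_i, c_j) -> Pr(c_i|c_j) / Prmax  (formula (7))
    def pruned_max(prev_sorted, ci, trans_p, dbeam):
        p_max, arg_max = 0.0, None
        for j, (score_j, cj) in enumerate(prev_sorted):
            a = score_j * trans_p[(ci, cj)]
            if a > p_max:
                p_max, arg_max = a, cj
            # Stop (formula (6)): later scores are no larger than the next one,
            # so even with the largest possible transition probability they
            # cannot exceed the current term.
            if j + 1 < len(prev_sorted) and \
               score_j * dbeam[(ci, cj)] >= prev_sorted[j + 1][0]:
                break
        return p_max, arg_max

In contrast to the factor-H pruning of Section 2.2, no empirically tuned constant appears here: the stopping point follows from the data itself through dBeam.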
4. EXPERIMENTAL RESULT

We used a part of the Penn Treebank corpus for training and testing [16]. We divided the corpus into two parts, a training set and a test set, and randomly chose 3000 sentences of different lengths from the WSJ part of the Penn Treebank corpus. In addition, we attached counting variables to count the if statements and assignment statements executed by the pruning Viterbi algorithm, assuming that an if statement and an assignment statement have the same run time. The results are summarized in Table 4.

Table 4. A summary of the statement counts in the test.

    No   Words per   Number of    T*K*logK (E)                    Results                        Rate
         sentence    sentences    (heap)            A           B           C           D        D/(A+E)
    1        3          250         189750       285241      1140456      398843     2071326     4.36
    2        5          250         316250       676012      2159305     1153451     4168350     4.20
    3        8          250         506000      1063105      3534163     1993250     7263122     4.63
    4       10          250         632500      2743247      5042016     5077515    10468054     3.10
    5       16          250        1012000      2704276      7120150     5124199    15570318     4.19
    6       20          350        1771007      4545821     12835032    80905650    27589426     4.37
    7       24          250        1518000      4287091     11584866     7862927    24988512     4.30
    8       32          250        2024000      5352826     14153251    13044250    32290542     4.39
    9       38          250        2403500     11679096     19414311    21777512    37841566     2.68
    10      40          250        2530000      8336102     19731752    16142521    40692570     3.75
    11      53          250        3352250      9784750     25789183    18259751    57221084     4.36
    12      62          150        2352975      7537491     17847771    14254069    38719195     3.90
    Total:             3000       18608232     58995058                            298884065     3.85

where:
A: with iBack and with dBeam;
B: without iBack, with dBeam;
C: with iBack, without dBeam;
D: without iBack and without dBeam;
E: the number of operations needed to create the heap, O(T*K*logK).

For a problem involving T words and K lexical categories, the base Viterbi algorithm is guaranteed to find the most likely sequence using T*K^2 steps; in other words, the big O of the base Viterbi algorithm is O(T*K^2). In the Penn Treebank, K = 46.

Notation:

Tgorigin = O(T*K^2) = D
Tgpruning = A + E
Tgorigin / Tgpruning = D / (A + E) = 3.85,   (8)

where Tgorigin is the run time of the base Viterbi algorithm and Tgpruning is the run time of the pruning Viterbi algorithm. Indeed, using the totals of Table 4, D / (A + E) = 298884065 / (58995058 + 18608232) ≈ 3.85.

By counting the statements executed by the two algorithms, the base Viterbi algorithm (Tgorigin) and the pruning Viterbi algorithm (Tgpruning), and comparing them through formula (8), the results show that the pruning Viterbi algorithm executes fewer statements than the base Viterbi algorithm. To obtain a full assessment of the effectiveness and computation speed, however, further experiments on other treebanks are needed. On the other hand, the contribution of this paper is the formula for the beam threshold: the threshold is calculated from both components, δt-1(j) * P(Ci | Cj), instead of having to be selected empirically from δt-1(j) alone.

5. CONCLUSION

In this paper, we propose an approach to calculating the pruning threshold and apply it to the original Viterbi algorithm. The basic difference of our solution from traditional heuristic pruning techniques is that a pruning formula is derived instead of using a heuristic. Although the result of the proposed solution is rather satisfactory on the Wall Street Journal corpus, the proposal should also be evaluated experimentally on other treebanks.
REFERENCES

1. Ruslan Mitkov - Computational Linguistics, The Oxford University Press, First Published, 2003, 219-220.
2. Padro L. - A hybrid environment for syntax-semantic tagging, Ph.D. thesis, Departament de Llenguatges i Sistemes Informatics, Universitat Politecnica de Catalunya, Barcelona, 1998.
3. Yuan L. C. - Improved hidden Markov model for speech recognition and POS tagging, Journal of Central South University 19 (2012) 511-516.
4. James Allen - Natural Language Understanding, The Benjamin/Cummings Publishing Company, Inc., 1995, 144-155.
5. Pinedo M. - Scheduling: Theory, Algorithms, and Systems, Prentice-Hall, Second Edition, 2002.
6. Liu W. and Han W. - Improved Viterbi algorithm in continuous speech recognition, International Conference on Computer Application and System Modeling (ICCASM) 7 (2010) 201-207.
7. Daniel Jurafsky and James H. Martin - Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, 2nd ed., Prentice Hall, 2013, 139-149.
8. Philipp Koehn - Pharaoh: A beam search decoder for phrase-based statistical machine translation models, Machine Translation: From Real Users to Research, 2004, 115-124.
9. Christoph Tillmann and Hermann Ney - Word reordering and a dynamic programming beam search algorithm for statistical machine translation, Computational Linguistics 29 (1) (2003) 97-133.
10. James Allen - Searching HMM models, 2003.
11. Zhang W. - Complete anytime beam search, Proceedings of AAAI-98, 1998, 425-430.
12. Furcy D. and Koenig S. - Limited discrepancy beam search, Proceedings of the International Joint Conference on Artificial Intelligence, 2005, 125-131.
13. Zhou R. and Hansen E. A. - Beam-stack search: Integrating backtracking with beam search, Proceedings of ICAPS-05, 2005, 90-98.
14. Ferran Pla, Antonio Molina and Natividad Prieto - Tagging and chunking with bigrams, Proceedings of the 18th Conference on Computational Linguistics, Vol. 2, Association for Computational Linguistics, 2000, 614-620.
15. Thomas H. Cormen, Charles E. Leiserson and Ronald L. Rivest - Introduction to Algorithms, The MIT Press, 1990, 594-602.
16. The Penn Treebank Project [Online]. Available: ftp://ftp.cis.upenn.edu/pub/treebank/doc/update.cd2
