Improving the Switched Split Vector Quantization Technique using a Joint Source Channel Coding Approach


Journal of Science & Technology 123 (2017) 043-047

Tran Ngoc Tuan*, Nguyen Quoc Trung, Tran Hai Nam
Hanoi University of Science and Technology, No. 1, Dai Co Viet Str., Hai Ba Trung, Ha Noi, Viet Nam
Received: June 06, 2016; Accepted: November 03, 2017

Abstract

This paper deals with enhancing the error resilience of the Switched Split Vector Quantization (SSVQ) technique by adopting the optimal Index Assignment (IA) approach, a Joint Source-Channel Coding method. SSVQ is one of the latest structured vector quantization schemes and has several advantages over other schemes. The new method proposed in this paper improves the SSVQ encoder without adding extra bits or coding complexity. The application of the new method to speech coding is also investigated. The effectiveness of the resulting IA-SSVQ method is validated by comparing it with other methods through simulations.

Keywords: Joint Source-Channel coding, Vector Quantization, Index Assignment, Switched Split Vector Quantization.

1. Introduction

Signal coding has played a significant role in the success of digital communication, and its fundamental operation is quantization. Vector quantization (VQ) is known to theoretically achieve the lowest distortion, at a given rate and dimension, of any quantization scheme [1,2]. In practice, VQ is widely used for low bit-rate coding of analog signals, especially highly correlated sources. An optimal vector quantizer operates using a single large codebook with no constraints imposed on its structure. However, VQs with large codebooks are impractical, because the memory and computational requirements of VQ encoding are prohibitively high and the training process takes too much time.
* Corresponding author: Tel.: (+84) 912.466.789, Email: tuan.tranngoc@hust.edu.vn

Several structurally constrained VQ schemes have been developed [1], which reduce the complexity of implementation with a moderate loss of quantization performance. Switched Split Vector Quantization (SSVQ) [3,4] is one of the latest structured vector quantization schemes, and it is further explored in [5,6] to show its competitive performance advantage over other VQ methods.

As with most compression methods, the quality of the reconstructed signal deteriorates rapidly when channel noise is introduced. The traditional way to protect against channel errors is to increase the bit-rate for channel coding. Joint source-channel coding (JSCC) is an alternative that mitigates channel errors without an increase of the bit-rate. This paper deals with enhancing the error resilience of the SSVQ technique by using a JSCC approach.

In the past, several methods based on the JSCC technique were proposed for improving the robustness of VQ coders for transmission over noisy channels. To improve the SSVQ method, the Channel Optimized Switched Split Vector Quantization (COSSVQ) method was proposed [7], which is based on the Channel Optimized Vector Quantization (COVQ) approach [8]. In this approach, the channel statistical distribution is taken into account during both the source quantization and the codebook design. However, it requires a long training time and its performance is usually degraded when the channel quality is high.

In this paper, a method based on the Index Assignment approach [9] is developed to improve the error resilience of coders using the SSVQ technique. Unlike the COSSVQ method, the proposed method does not sacrifice any performance on better channels and does not add any complexity to the encoder.
This approach is implemented simply by rearranging the codebooks in the optimized order, so it can be used to improve existing SSVQ systems with no need to redesign the coder. In addition, the application of this method to speech coding is investigated, and the performance of the proposed method is validated through experiments in Section 5.

2. Switched Split Vector Quantization and the Index Assignment problem

2.1. Vector Quantization

When a set of discrete-time amplitude values is quantized jointly as a single vector, the process is known as Vector Quantization (VQ) or block quantization [1]. A vector quantizer Q: ℜ^K → C maps a continuous source vector x ∈ ℜ^K to a codevector c_i ∈ C by the nearest-neighbour rule. The codebook C = {c_i; 1 ≤ i ≤ N} is the set of K-dimensional codevectors. The output of the vector quantizer is the index i of the codevector c_i which satisfies:

$$ i = \arg\min_{k} d(\mathbf{x}, \mathbf{c}_k) \tag{1} $$

where d(x, c_k) is the nonnegative distance between two vectors. A common distortion measure is the squared Euclidean distance (SED), given by:

$$ d(\mathbf{x}, \mathbf{y}) = \sum_{i=1}^{K} (x_i - y_i)^2 \tag{2} $$

Fig. 1 shows the principle of VQ. Only the index i is transmitted over the channel to the receiver. Upon receiving i correctly, the VQ decoder reconstructs x as c_i by a simple table lookup operation.

[Fig. 1. Principle of vector quantization. Encoder: find the codevector in codebook C closest to x and output index i. Decoder: table lookup of c_i in codebook C.]

[Fig. 2. Block diagram of a SSVQ encoder. A switch-selection VQ with switch codebook C_s outputs i_s ∈ {1, ..., M} and routes x to one of SVQ_1, ..., SVQ_M; SVQ_m consists of VQ_m1, VQ_m2, ..., VQ_mL.]

The codebook design process is also known as training the codebook. A widely used algorithm for VQ codebook design is the Linde-Buzo-Gray (LBG) algorithm [10].

2.2. Switched Split Vector Quantization

SSVQ is a hybrid of Switch Vector Quantization and Split Vector Quantization.
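As a small illustration of the nearest-neighbour rule of Eq. (1) with the SED of Eq. (2), the following Python sketch encodes one vector and decodes it by table lookup. The codebook values and the function names `sed`, `vq_encode` and `vq_decode` are hypothetical and chosen only for this example; indices are 0-based here, whereas the text counts codevectors from 1.

```python
from typing import Sequence

Vector = Sequence[float]


def sed(x: Vector, y: Vector) -> float:
    """Squared Euclidean distance, Eq. (2)."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def vq_encode(x: Vector, codebook: Sequence[Vector]) -> int:
    """Nearest-neighbour rule, Eq. (1): index of the closest codevector."""
    return min(range(len(codebook)), key=lambda k: sed(x, codebook[k]))


def vq_decode(i: int, codebook: Sequence[Vector]) -> Vector:
    """Decoder: a plain table lookup in the same codebook."""
    return codebook[i]


# Hypothetical 2-dimensional codebook with N = 4 codevectors.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
x = (0.9, 0.2)

i = vq_encode(x, codebook)      # only this index would be transmitted
x_hat = vq_decode(i, codebook)  # reconstruction at the receiver
```

SSVQ reuses this same search: once to pick a switch direction and once per subvector inside the selected split quantizer.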
In this scheme, the vector space is divided into non-overlapping switching regions and a separate Split Vector Quantizer (SVQ) [11] is designed for each region. The SVQ divides vectors into subvectors of lower dimension, which are then quantized using independent codebooks. An L-part K-dimensional SVQ is composed of L classical VQs of smaller size, with dimensions K_1, K_2, ..., K_L.

The block diagram of a Switched Split Vector Quantizer is shown in Fig. 2. Each vector to be quantized is first switched to one of the M possible directions based on the nearest-neighbour criterion, using the switch VQ codebook C_s:

$$ i_s = \arg\min_{i} d(\mathbf{x}, \mathbf{c}_{si}) \tag{3} $$

Next, the vector is quantized using the corresponding L-part SVQ. The SSVQ coder therefore transmits to the decoder an index i composed of L+1 concatenated binary indices. The first index i_s indicates the switch direction, and the remaining L indices i_1, i_2, ..., i_L are provided by the corresponding local quantizer SVQ_{i_s}.

2.3. Index Assignment for Vector Quantization

Channel errors corrupt the received indices, which can result in significant distortion in the decoded vectors. Let P_a(i) denote the a priori probability of codevector c_i. The IA function π is a permutation of the integers {0, 1, ..., N-1}, and π(i) assigns an index to codevector c_i. The overall distortion caused by channel noise is:

$$ D(\pi) = \sum_{i=1}^{N} \sum_{j=1}^{N} P_a(i) \, P_C(\pi(i), \pi(j)) \, d(\mathbf{c}_i, \mathbf{c}_j) \tag{4} $$

In the case of a binary symmetric channel (BSC) with bit error rate (BER) ε, the codeword transition probability P_C(i,j) is given by:

$$ P_C(i,j) = \varepsilon^{h(i,j)} (1 - \varepsilon)^{n - h(i,j)} \tag{5} $$

where n is the number of bits in an index and h(i,j) denotes the Hamming distance (number of differing bits) between i and j. Different IAs lead to different overall distortion D(π) in the presence of channel errors, so the IA problem is to find the assignment π which minimizes D(π). There are N! possible ways to order N codewords, and finding an optimal solution for codebooks larger than 32 entries is practically impossible.
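To make Eq. (4) and Eq. (5) concrete, the sketch below evaluates D(π) for a toy scalar codebook and finds the best assignment by exhaustive search. This brute-force search is only an illustration for very small N and is not the algorithm used in the paper; the codebook values, the uniform prior and all function names are assumptions made for the example, and indices are 0-based.

```python
from itertools import permutations
from typing import Sequence, Tuple


def hamming(i: int, j: int) -> int:
    """Number of differing bits between two binary indices."""
    return bin(i ^ j).count("1")


def bsc_transition(i: int, j: int, n_bits: int, eps: float) -> float:
    """Eq. (5): probability that n-bit word i is received as word j."""
    h = hamming(i, j)
    return eps ** h * (1.0 - eps) ** (n_bits - h)


def channel_distortion(codebook: Sequence[Sequence[float]],
                       prior: Sequence[float],
                       pi: Sequence[int],
                       n_bits: int,
                       eps: float) -> float:
    """Eq. (4): expected distortion when codevector i is sent as word pi[i]."""
    total = 0.0
    for i, ci in enumerate(codebook):
        for j, cj in enumerate(codebook):
            d = sum((a - b) ** 2 for a, b in zip(ci, cj))  # SED, Eq. (2)
            total += prior[i] * bsc_transition(pi[i], pi[j], n_bits, eps) * d
    return total


def best_assignment(codebook, prior, n_bits: int, eps: float) -> Tuple[int, ...]:
    """Exhaustive search over all N! assignments (toy sizes only)."""
    return min(permutations(range(len(codebook))),
               key=lambda pi: channel_distortion(codebook, prior, pi, n_bits, eps))


# Hypothetical scalar codebook stored in a poor order, uniform prior, 2-bit indices.
codebook = [(0.0,), (3.0,), (1.0,), (2.0,)]
prior = [0.25] * 4
eps = 0.01

identity = (0, 1, 2, 3)
d_identity = channel_distortion(codebook, prior, identity, 2, eps)
pi_best = best_assignment(codebook, prior, 2, eps)
d_best = channel_distortion(codebook, prior, pi_best, 2, eps)
```

In this toy case the reordering lowers the channel distortion from about 0.0598 to 0.05 without touching the codevectors themselves. The search visits all N! permutations, so it is usable only for such tiny codebooks.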
For this reason, a number of approximate IA solutions have been proposed [9,12,13].

3. The proposed IA-SSVQ method

In order to improve the robustness of SSVQ coders, we adopt a JSCC approach carried out by the IA method and develop a new method named IA-SSVQ. The switch codebook C_s is reassigned in the optimized order provided by an IA algorithm, and the SVQs are rearranged according to the new order of the codevectors in C_s. The IA algorithm is then applied to each codebook of the local SVQs, and each is rearranged in its own optimized order. The scheme for designing an M-switch IA-SSVQ with a training set S of length n_s is described below:

• Train the M-length switch codebook C_s from S.
• Corresponding to the M vectors c_s1, c_s2, ..., c_sM in C_s, partition S into M non-overlapping cells R_1, R_2, ..., R_M of length n_s1, n_s2, ..., n_sM.
• Train the codebooks of the M local SVQs (SVQ_i is trained using the training set R_i).
• Find the optimal IA solution of C_s by using an IA algorithm with the a priori probability of vector c_si given by P_a(i) = n_si / n_s (1 ≤ i ≤ M).
• Permute C_s by the optimized IA solution and rearrange the order of the SVQs according to the new positions of the vectors in C_s.
• Apply the IA method to rearrange all sub-codebooks of the M local SVQs in the optimized order.

When upgrading an existing system, only the last 3 steps need to be executed.

4. Application of IA-SSVQ in speech coding

Most low bit-rate speech coders employ the linear predictive coding (LPC) model [14], in which the short-term spectral envelope is approximated by an all-pole filter whose transfer function is H_LPC(z) = 1/A(z), where A(z) is the inverse filter given by:

$$ A(z) = 1 + \sum_{i=1}^{p} a_i z^{-i} \tag{6} $$

The order p is typically set to 10 for narrowband speech coders and to 16 for wideband speech coders.
The quantization of the LPC coefficients {a_i}, i = 1, ..., p, plays a major role in the overall bit-rate and in preserving the quality of the reconstructed speech. The most popular measure for evaluating the performance of an LPC quantizer is the spectral distortion (SD). For the i-th frame, SD_i in decibels is defined as [11]:

$$ SD_i = \sqrt{ \frac{1}{n_1 - n_0} \sum_{n=n_0}^{n_1 - 1} \left[ 10 \log_{10} \frac{S(e^{j 2\pi n/N})}{\hat{S}(e^{j 2\pi n/N})} \right]^2 } \tag{7} $$

where S(e^{j2πn/N}) and Ŝ(e^{j2πn/N}) are the original and quantized power spectra of the LPC filter corresponding to the i-th frame of the speech signal. The requirements usually considered necessary to achieve good quality speech are [11]: an average distortion of about 1 dB, fewer than 2% of outlier frames with SD in the range 2-4 dB, and no outlier frames with SD larger than 4 dB.

In practice, the LPC coefficients are not quantized directly because they have poor quantization properties. The Line Spectral Frequency (LSF) representation [15] has become the dominant representation of LPC coefficients because of its excellent properties in terms of model filter stability and robust quantization. The LSFs are defined as the roots of the following polynomials:

$$ P(z) = A(z) + z^{-(p+1)} A(z^{-1}), \qquad Q(z) = A(z) - z^{-(p+1)} A(z^{-1}) \tag{8} $$

All roots of P(z) and Q(z) are located on the unit circle of the z-plane and are interlaced with each other, so that the LSFs are in ascending order.

To further improve the performance of the coder, the weighted Euclidean distance (WED) may be used instead of the SED as the distortion measure for LSF vectors. The WED d(f, f̂) between the original and quantized LSF vectors is given by [11]:

$$ d(\mathbf{f}, \hat{\mathbf{f}}) = \sum_{i=1}^{p} \left[ w_i (f_i - \hat{f}_i) \right]^2 \tag{9} $$

where w_i is the spectral weight corresponding to the i-th LSF:

$$ w_i = \left[ |H(f_i)|^2 \right]^{r} \tag{10} $$

where |H(f_i)|^2 is the LPC power spectrum at frequency f_i and r is an empirical constant determined experimentally. A value of r = 0.15 has been found satisfactory [11].
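The weighting of Eq. (9) and Eq. (10) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: frequencies are taken as angular frequencies in radians per sample, the helper names `lpc_power` and `wed` are hypothetical, and the frequency vectors in the example are made-up values rather than true LSFs of the example filter.

```python
import cmath
from typing import Sequence


def lpc_power(a: Sequence[float], omega: float) -> float:
    """|H(e^{j*omega})|^2 for H(z) = 1/A(z), with A(z) as in Eq. (6)."""
    A = 1.0 + sum(ai * cmath.exp(-1j * omega * (i + 1)) for i, ai in enumerate(a))
    return 1.0 / abs(A) ** 2


def wed(f: Sequence[float], f_hat: Sequence[float],
        a: Sequence[float], r: float = 0.15) -> float:
    """Weighted Euclidean distance, Eq. (9), with the weights of Eq. (10)."""
    total = 0.0
    for fi, fhi in zip(f, f_hat):
        wi = lpc_power(a, fi) ** r          # Eq. (10): w_i = (|H(f_i)|^2)^r
        total += (wi * (fi - fhi)) ** 2
    return total


# Hypothetical first-order filter A(z) = 1 - 0.9 z^{-1}: spectral peak at omega = 0.
a = [-0.9]
f = [0.5, 1.0]        # illustrative frequencies (rad/sample), not real LSFs
f_hat = [0.4, 1.2]    # illustrative quantized values

d_weighted = wed(f, f_hat, a)
d_flat = wed(f, f_hat, [])   # A(z) = 1 gives unit weights, so WED equals SED
```

Because the weights grow with the power spectrum, an error of a given size near a spectral peak is penalized more than the same error in a spectral valley.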
Because LSFs are highly correlated, VQ is the most suitable technique for quantizing them at low bit-rate with high quality. SSVQ, which has been studied recently, is an effective structurally constrained VQ method for quantizing LSF coefficients and has many advantages over other VQ techniques [5,6]. Using the IA-SSVQ method can therefore improve the robustness of the speech coder; the effectiveness of this method is confirmed by the experiment in Section 5.

5. Experiments and discussion

In this section, computational experiments are carried out in Matlab to examine the performance of the IA-SSVQ method and to compare it with the traditional SSVQ and COSSVQ methods. The three SSVQ systems share the same configuration and quantize and transmit the source over a BSC. The sources include a random highly correlated process and sets of speech LSF parameters. In our experiments, codebooks were generated using the LBG algorithm [10], and the simulated annealing (SA) algorithm [12,13] was applied to find the optimal IA for the IA-SSVQ codebooks. The bit error probability used for training the IA-SSVQ and COSSVQ codebooks is 0.01.

5.1. Random correlated source

In this section, the input signal is a first-order Gauss-Markov process with correlation coefficient ρ:

$$ x(n) = \rho \, x(n-1) + w(n) \tag{11} $$

where w(n) is a zero-mean, unit-variance, Gaussian white noise process. In our experiment the value of ρ is 0.9 and the SED (Eq. 2) is used as the vector distortion measure. The source is first partitioned into vectors of dimension 8, and these input vectors are then quantized by various 16-switch 2-part SSVQ quantizers. The vectors are split into 2 parts with a (4,4) division and the bit allocation is (6,6). Performance is evaluated in terms of the signal-to-noise ratio (SNR), given by:

$$ \mathrm{SNR} = 10 \log_{10} \left( \sigma_x^2 / \sigma_n^2 \right) \tag{12} $$

where σ_x^2 and σ_n^2 are the signal and noise variances, respectively.

[Figure: SNR [dB] (axis 0 to 15) versus BER (axis 10^-4 to 10^-1) for IA-SSVQ, SSVQ and COSSVQ.]
Fig. 3.
Performance comparison of SSVQ methods

Fig. 3 shows the SNR of the system for the 3 SSVQ methods against the BER. According to Fig. 3, the IA-SSVQ method outperforms the regular SSVQ method in terms of SNR. At high BER levels, the COSSVQ method performs better than the IA-SSVQ method, but the IA-SSVQ method is better at low BER.

5.2. LSF parameters of speech coder

In this experiment, the TIMIT speech database [17], with a sampling rate of 16 kHz, was used for training and testing the SSVQ. To obtain the LSF vector database, the same preprocessing and LPC analysis as in the Adaptive Multi-Rate Wideband speech coder (AMR-WB, ITU-T G.722.2) [16] was used. The training set consists of 644,137 vectors, while the testing set contains 235,603 vectors distinct from the training vectors. In all SSVQ quantizers, the number of switch directions is 32 (m = 5 bits), and the 16-dimensional LSF vectors are split into 5 parts with a (3,3,3,3,4) division and a bit allocation of (9,8,8,8,8). The WED (Eq. 9) was used for measuring the distortion of LSF vectors.

Table 1. Performance comparison between various 46 bits/frame LSF SSVQ encoders (average SD in dB; outliers in % of frames).

         |          SSVQ            |         IA-SSVQ          |         COSSVQ
BER ε    | Avg SD   2-4 dB   > 4 dB | Avg SD   2-4 dB   > 4 dB | Avg SD   2-4 dB   > 4 dB
0        | 0.921    0.499    0.000  | 0.921    0.499    0.000  | 0.968    1.499    0.006
0.001    | 1.077    2.857    1.294  | 1.003    1.723    0.596  | 1.035    2.512    0.545
0.002    | 1.204    4.894    2.455  | 1.077    2.925    1.129  | 1.097    3.523    1.029
0.003    | 1.338    6.691    3.742  | 1.158    4.125    1.738  | 1.163    4.505    1.570
0.004    | 1.461    8.470    4.969  | 1.234    5.358    2.286  | 1.227    5.530    2.093
0.005    | 1.585    10.187   6.173  | 1.307    6.399    2.857  | 1.287    6.469    2.586
0.01     | 2.185    17.265   12.332 | 1.673    11.679   5.799  | 1.592    11.170   5.191
0.1      | 7.887    17.176   79.401 | 6.011    32.266   55.664 | 5.316    35.921   49.802

We use the common measure of spectral distortion (SD) (Eq. 7) [11] to test the LSF quantization performance.
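For readers who want to reproduce the kind of statistics reported in Table 1, the following Python sketch computes the SD of Eq. (7) from sampled power spectra and then the three per-condition figures (average SD, percentage of frames in 2-4 dB, percentage above 4 dB). It is a minimal sketch, not the authors' Matlab code: the function names, the treatment of the 2 dB and 4 dB boundaries, and all numeric inputs are assumptions for illustration.

```python
import math
from typing import Sequence, Tuple


def spectral_distortion(s: Sequence[float], s_hat: Sequence[float],
                        n0: int, n1: int) -> float:
    """Eq. (7): RMS log-spectral difference in dB over bins n0 .. n1-1."""
    acc = sum((10.0 * math.log10(s[n] / s_hat[n])) ** 2 for n in range(n0, n1))
    return math.sqrt(acc / (n1 - n0))


def sd_statistics(sd_db: Sequence[float]) -> Tuple[float, float, float]:
    """Average SD (dB), % of frames with 2 <= SD <= 4 dB, % with SD > 4 dB."""
    n = len(sd_db)
    avg = sum(sd_db) / n
    pct_2_4 = 100.0 * sum(1 for v in sd_db if 2.0 <= v <= 4.0) / n
    pct_gt_4 = 100.0 * sum(1 for v in sd_db if v > 4.0) / n
    return avg, pct_2_4, pct_gt_4


# Hypothetical power-spectrum samples for one frame: the quantized spectrum is
# 10 times weaker in every bin, so every bin differs by 10 dB.
s = [4.0, 2.0, 1.0, 0.5]
s_hat = [0.4, 0.2, 0.1, 0.05]
sd = spectral_distortion(s, s_hat, 0, 4)

# Hypothetical per-frame SD values for a short test set.
avg, pct_2_4, pct_gt_4 = sd_statistics([0.8, 1.2, 2.5, 4.5])
```

A constant 10 dB offset in every bin yields an SD of 10 dB, which is a convenient sanity check for any implementation of Eq. (7).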
In Table 1, the performance in terms of both average SD and outlier percentage is given for the various SSVQ schemes. The simulation results are similar to those in Section 5.1. The IA-SSVQ coder performs better than the ordinary SSVQ coder in terms of both average SD and the percentage of outlier frames with SD > 4 dB. In comparison with the COSSVQ coder, the IA-SSVQ coder performs better when ε is below a certain threshold, and worse above it. In this experiment, the threshold is about 0.004. The reason is that the IA-SSVQ and SSVQ codebooks are the same sets, just in a different order, so the IA-SSVQ coder preserves the original performance of the SSVQ coder designed for a noiseless channel: at ε = 0 both give an average SD of 0.921 dB, against 0.968 dB for COSSVQ.

6. Conclusion

In this paper, an efficient and robust structured VQ scheme based on an optimal IA version of the SSVQ technique, named IA-SSVQ, was developed. The performance of SSVQ methods was investigated for quantizing a random highly correlated source and the parameters of a speech coder. The results showed that the IA-SSVQ encoder yields a significant improvement over the ordinary SSVQ encoder by providing robustness against channel errors. Although the COSSVQ scheme performs better at high BER, the new scheme has the advantage of requiring no increase in encoder complexity and sacrificing no performance on better channels. The IA-SSVQ can therefore be a good technique for systems transmitting correlated analog signals in general, and for speech coders in particular.

References

[1] A. Gersho and R. Gray, Vector quantization and signal compression, Boston, MA, Kluwer Academic Publishers, 1992.
[2] T.D. Lookabaugh, R.M. Gray, High-resolution quantization theory and the vector quantizer advantage, IEEE Trans. Inform. Theory 35 (5) (1989) 1020-1033.
[3] S. So, K.K. Paliwal, Efficient vector quantisation of line spectral frequencies using the switched split vector quantiser, Proc. Int. Conf. Spoken Language Processing, Korea, 2004.
[4] S. So, K.K.
Paliwal, Switched Split Vector Quantisation of Line Spectral Frequencies for Wideband Speech Coding, INTERSPEECH-2005, Portugal, (2005) 2705-2708.
[5] S. So, K.K. Paliwal, Efficient product code vector quantization using switched split vector quantizer, Digital Signal Processing Journal, Elsevier, 17(1) (2007) 138-171.
[6] S. So, K.K. Paliwal, A Comparative Study of LPC Parameter Representations and Quantisation Schemes for Wideband Speech Coding, Digital Signal Processing Journal, Elsevier, 17(1) (2007) 114-137.
[7] M. Bouzid, S. Cheraitia, Channel Optimized Switched Split Vector Quantization for Wideband Speech LSF Parameters, Proc. 11th Int. Conf. on Inf. Science, ISSPA2012, Canada, (2012) 1045-1050.
[8] N. Farvardin, A Study of Vector Quantization for Noisy Channels, IEEE Trans. on Inf. Theory, 36(4) (1990) 799-809.
[9] N. Farvardin, V. Vaishampayan, On the performance and complexity of channel-optimized vector quantizers, IEEE Trans. Inf. Theory, 37(1) (1991) 155-160.
[10] Y. Linde, A. Buzo, and R.M. Gray, An algorithm for vector quantization design, IEEE Trans. on Commun., COM-28 (1980) 84-95.
[11] K.K. Paliwal, B.S. Atal, Efficient vector quantization of LPC parameters at 24 bits/frame, IEEE Transactions on Speech and Audio Processing, 1(1) (1993) 3-14.
[12] K. Zeger and A. Gersho, Pseudo-Gray Coding, IEEE Trans. on Commun., 38(12) (1990) 2147-2158.
[13] T.N. Tuan, N.Q. Trung, Improving the Simulated Annealing algorithm for the Index Assignment method to enhance the robustness of communication systems, Vietnamese Journal on Inf. Tech. & Comm., E-3, 7(11) (2014) 13-20.
[14] A.M. Kondoz, Digital Speech: Coding for Low Bit Rate Communication Systems, 2nd Edition, John Wiley and Sons, 2004.
[15] F. Itakura, Line spectrum representation of linear predictive coefficients of speech signals, J. Acoust. Soc. Amer., 57 (1975) S35.
[16] ITU-T Recommendation G.722.2, Wideband Coding of Speech at Around 16 kb/s Using Adaptive Multi-rate Wideband (AMR-WB), 2003.
[17] J. Garofolo et al., DARPA TIMIT, Acoustic-Phonetic Continuous Speech Corpus CD-ROM, National Institute of Standards and Technology, NISTIR 493, USA, 1990.
