Kỹ thuật viễn thông - Chapter 15: Numerical strength reduction

Represent a filter operation by a table (matrix) {xij}, where the rows are indexed by delay i and the columns by shift j, i.e., the row i is the coefficient ci for the term x(n-i), and the column 0 in row i is the msb of ci and column W-1 in row i is the lsb of ci , where W is the word length. • The row and column indexing starts at 0. • The entries are 0 or 1 if 2’s complement representation is used and {1, 0, 1} if CSD is used. • A non-zero entry in row i and column j represents x(n-i) >> j. It is to be added or subtracted according to whether the entry is +1 or –1

21 trang | Chia sẻ: huyhoang44 | Lượt xem: 521 | Lượt tải: 0

Bạn đang xem trước 20 trang tài liệu Kỹ thuật viễn thông - Chapter 15: Numerical strength reduction, để xem tài liệu hoàn chỉnh bạn click vào nút DOWNLOAD ở trên

Chapter 15: Numerical Strength Reduction Keshab K. Parhi Chap. 15 2 • Sub-expression elimination is a numerical transformation of the constant multiplications that can lead to efficient hardware in terms of area, power and speed. • Sub-expression can only be performed on constant multiplications that operate on a common variable. • It is essentially the process of examining the shift and add implementations of the constant multiplications and finding redundant operations. • Example: a ´ x and b ´ x, where a = 001101 and b = 011011 can be performed as follows: – a ´ x = 000100 ´ x + 001001 ´ x – b ´ x = 010010 ´ x + 001001 ´ x = (001001 ´ x) << 1 + (001001 ´ x). – The term 001001 ´ x needs to be computed only once. – So, multiplications were implemented using 3 shifts and 3 adds as opposed to 5 shifts and 5 adds. Chap. 15 3 Multiple Constant Multiplication(MCM) The algorithm for MCM uses an iterative matching process that consists of the following steps: • Express each constant in the set using a binary format (such as signed, unsigned, 2’s complement representation). • Determine the number of bit-wise matches (non- zero bits) between all of the constants in the set. • Choose the best match. • Eliminate the redundancy from the best match. Return the remainders and the redundancy to the set of coefficients. • Repeat Steps 2-4 until no improvement is achieved. Chap. 15 4 0101110193c 10110110182b 11101101237a UnsignedValueConstantExample: Binary representation of constants 01001101Red. of a,c 00010000Rem. of c 10110110b 10100000Rem. of a UnsignedConstant 10100000Red. of Rem a,b 01001101Red. of a,c 00010000Rem. of c 00010110Rem. of b 00000000Rem. of a UnsignedConstant Updated set of constants 1st iteration Updated set of constants 2nd iteration Chap. 15 5 Linear Transformations • A general form of linear transformation is given as: y =T*x where, T is an m by n matrix, y is length-m vector and x is a length-n vector. It can also be written as: mixty n j jiji ,...,1, 1 == å = • The following steps are followed: Ø Minimize the number of shifts and adds required to compute the products tijxj by using the iterative matching algorithm. Ø Formation of unique products using the sub-expression found in the 1st step. Ø Final step involves the sharing of additions, which is common among the yi’s. This step is very similar to the MCM problem. Chap. 15 6 ú ú ú ú û ù ê ê ê ê ë é = 117117 15285 1371112 13287 T •The constants in each column multiply to a common variable. For Example x1 is multiplied to the set of constants [7, 12, 5, 7]. • Applying iterative matching algorithm the following table is obtained. 00101100 0100011110110010 1001001010000101 Column 4Column 3Column 2Column 1 Example: Chap. 15 7 • Next, the unique products are formed as shown below: p1 = 0101*x1, p2 = 0010*x1, p3 = 1100*x1 p4 = 1000*x2, p5 = 1011*x2, p6 = 0010*x3, p7 = 0111*x3 p8 = 1001*x4, p9 = 0100*x4, p10 = 0010*x4 • Using these products the yi’s are as follows: y1 = p1 + p2 + p4 + p6 + p8 + p9; y2 = p3 + p5 + p7 + p8 + p9; y3 = p1 + p4 + p6 + p8 + p9 + p10; y4 = p1 + p2 + p5 + p7 + p8 + p10; Chap. 15 8 • This step involves sharing of additions which are common to all yi’s. For this each yi is represented as k bit word (1 £ k £ 10), where each of the k products formed after the 2nd step represents a particular bit position. Thus, y1 = 1101010110, y2 = 0010101110, y3 = 1001010111, y4 = 1100101101. • Applying iterative matching algorithm to reduce the number of additions required for yi’s we get: y1 = p2 + (p1 + p4 + p6 + p8 + p9); y2 = p3 + p9 + (p5 + p7 + p8); y3 = p10 + (p1 + p4 + p6 + p8 + p9); y4 = p1 + p2 + p10 + (p5 + p7 + p8); • The total number of additions are reduced from 35 to 20. Chap. 15 9 Polynomial Evaluation Evaluating the polynomial: x13 + x7 + x4 + x2 + x • Without considering the redundancies this polynomial evaluation requires 22 multiplications. • Examining the exponents and considering their binary representations: 1 = 0001, 2 = 0010, 4 = 0100, 7 = 0111, 13 = 1101. • x7 can be considered as x4 ´ x2 ´ x1. Applying sub-expression sharing to the exponents the polynomial can be evaluated as follows: x8 ´(x4 ´ x) + x2 ´ (x4 ´ x) + x4 + x2 + x • The terms x2, x4 and x8 each require one multiplication as shown below: x2 = x ´x, x4 = x2 ´ x2, x8 = x4 ´ x4 • Thus, we require 6 instead of 22 multiplications. Chap. 15 10 Sub-expression Sharing in Digital Filters • Example of common sub-expression elimination within a single multiplication : y = 0.101000101*x. This may be implemented as: y = (x >> 1) – (x >> 3) + (x >> 7) – (x >> 9). Alternatively, this can be implemented as, x2 = x – (x >> 2) Y = (x2 >> 1) + (x2 >> 7) which requires one less addition. Chap. 15 11 • In order to realize the sub-expression elimination transformation, the N-tap FIR filter: y(n) = c0x(n) + c1x(n-1) + + c0x(n-N+1) must be implemented using transposed direct- form structure also called data-broadcast filter structure as shown below: Chap. 15 12 • Represent a filter operation by a table (matrix) {xij}, where the rows are indexed by delay i and the columns by shift j, i.e., the row i is the coefficient ci for the term x(n-i), and the column 0 in row i is the msb of ci and column W-1 in row i is the lsb of ci , where W is the word length. • The row and column indexing starts at 0. • The entries are 0 or 1 if 2’s complement representation is used and {1, 0, 1} if CSD is used. • A non-zero entry in row i and column j represents x(n-i) >> j. It is to be added or subtracted according to whether the entry is +1 or –1. Chap. 15 13 Example: y(n) = 1.000100000*x(n) + 0.101010010*x(n-1) + 0.000100001*x(n-2) -11 11-1-1 -11 This filter has 8 non-zero terms and thus requires 7 additions. But, the sub-expressions x1 + x1[-1] >> 1 occurs 4 times in shifted and delayed forms by various amounts as circled. So, the filter requires 4 adds. x2 = x1 – x1[-1] >> 1 y = x2 – (x2 >> 4) – (x2[-1] >> 3) + (x2[-1] >> 8) An alternative realization is : x2 = x1 – (x1 >> 4) – (x1[-1] >> 3) + (x1[-1] >> 8) y = x2 – (x2[-1] >> 1). Chap. 15 14 Example: y(n) = 1.01010000010*x(n) + 0.10001010101*x(n-1) + 0.10010000010*x(n-2) + 1.00000101000*x(n-4) The substructure matching procedure for this design is as follows: • Start with the table containing the coefficients of the FIR filter. An entry with absolute value of 1 in this table denotes add or subtract of x1. Identify the best sub-expression of size 2. -111 11-1 -1-1-1-1-1 111-1 Chap. 15 15 • Remove each occurrence of each sub-expression and replace it by a value of 2 or –2 in place of the first (row major) of the 2 terms making up the sub-expression. -11 -2 -2-1-1-2 212-1 • Record the definition of the sub-expression. This may require a negative value of shift which will be taken care of later. x3 = x1 – x1[-1] >> (-1) Chap. 15 16 • Continue by finding more sub-expressions until done. -11 -2 -2-3 23-1 5. Write out the complete definition of the filter. x2 = x1 – x1[-1] >> (-1) x3 = x2 + x1 >> 2 y = -x1 + x3 >> 2 + x2 >> 10 – x3[-1] >> 5 – x2[-1] >> 11 -x2[-2] >> 1 + x1[-3] >> 6 – x1[-3] >> 8. Chap. 15 17 • If any sub-expression definition involves negative shift, then modify the definition and subsequent uses of that variable to remove the negative shift as shown below: x2 = x1 >> 1 – x1[-1] x3 = x2 + x1 >> 3 y = -x1 + x3 >> 1 + x2 >> 9 – x3[-1] >> 4 – x2[-1] >> 10 - x2[-2] + x1[-3] >> 6 – x1[-3] >> 8. Chap. 15 18 3-tap FIR filter with sub-expression sharing for 3-tap FIR filter with coefficients c2 = 0.11010010, c1 = 0.10011010 and c0 = 0.00101011. This requires 7 shifts and 9 additions compared to 12 shifts and 11 additions. Chap. 15 19 3-tap FIR filter with sub-expression sharing requiring 8 additions as compared to 9 in the previous implementation. Chap. 15 20 Using 2 most common sub-expressions in CSD representation • x – x >> 2 and x + x >> 2 are the 2 most common sub- expressions in CSD representation. An FIR filter using the term sharing, where the two most common sub-expressions in CSD numbers 101 and 101, together with isolated 1 are shared among all filter coefficients. Chap. 15 21 3-tap FIR filter with coefficients c2 = 0.10101010101, c1 = 0.10010100101 and c0 = 0.10101010000. 2 additions in the dotted square in (a) are shared in (b). Filter requires only 7 additions and 7 shifts as opposed to 12 adds and 12 shifts in standard multiplierless implementation.

Các file đính kèm theo tài liệu này:

chap_chap15_7629_7482.pdf