Kỹ thuật viễn thông - Chapter 15: Numerical strength reduction
Represent a filter operation by a table (matrix)
{xij}, where the rows are indexed by delay i and
the columns by shift j, i.e., the row i is the
coefficient ci for the term x(n-i), and the column 0
in row i is the msb of ci and column W-1 in row i is
the lsb of ci , where W is the word length.
• The row and column indexing starts at 0.
• The entries are 0 or 1 if 2’s complement
representation is used and {1, 0, 1} if CSD is used.
• A non-zero entry in row i and column j represents
x(n-i) >> j. It is to be added or subtracted
according to whether the entry is +1 or –1
21 trang |
Chia sẻ: huyhoang44 | Lượt xem: 598 | Lượt tải: 0
Bạn đang xem trước 20 trang tài liệu Kỹ thuật viễn thông - Chapter 15: Numerical strength reduction, để xem tài liệu hoàn chỉnh bạn click vào nút DOWNLOAD ở trên
Chapter 15: Numerical
Strength Reduction
Keshab K. Parhi
Chap. 15 2
• Sub-expression elimination is a numerical transformation of
the constant multiplications that can lead to efficient
hardware in terms of area, power and speed.
• Sub-expression can only be performed on constant
multiplications that operate on a common variable.
• It is essentially the process of examining the shift and add
implementations of the constant multiplications and finding
redundant operations.
• Example: a ´ x and b ´ x, where a = 001101 and
b = 011011 can be performed as follows:
– a ´ x = 000100 ´ x + 001001 ´ x
– b ´ x = 010010 ´ x + 001001 ´ x = (001001 ´ x) << 1 +
(001001 ´ x).
– The term 001001 ´ x needs to be computed only once.
– So, multiplications were implemented using 3 shifts and 3
adds as opposed to 5 shifts and 5 adds.
Chap. 15 3
Multiple Constant Multiplication(MCM)
The algorithm for MCM uses an iterative matching
process that consists of the following steps:
• Express each constant in the set using a binary
format (such as signed, unsigned, 2’s complement
representation).
• Determine the number of bit-wise matches (non-
zero bits) between all of the constants in the
set.
• Choose the best match.
• Eliminate the redundancy from the best match.
Return the remainders and the redundancy to
the set of coefficients.
• Repeat Steps 2-4 until no improvement is
achieved.
Chap. 15 4
0101110193c
10110110182b
11101101237a
UnsignedValueConstantExample:
Binary representation of constants
01001101Red. of a,c
00010000Rem. of c
10110110b
10100000Rem. of a
UnsignedConstant
10100000Red. of Rem a,b
01001101Red. of a,c
00010000Rem. of c
00010110Rem. of b
00000000Rem. of a
UnsignedConstant
Updated set of constants
1st iteration Updated set of constants
2nd iteration
Chap. 15 5
Linear Transformations
• A general form of linear transformation is given as:
y =T*x
where, T is an m by n matrix, y is length-m vector and x is a
length-n vector. It can also be written as:
mixty
n
j
jiji ,...,1,
1
== å
=
• The following steps are followed:
Ø Minimize the number of shifts and adds required
to compute the products tijxj by using the iterative
matching algorithm.
Ø Formation of unique products using the sub-expression
found in the 1st step.
Ø Final step involves the sharing of additions, which is
common among the yi’s. This step is very similar to the
MCM problem.
Chap. 15 6
ú
ú
ú
ú
û
ù
ê
ê
ê
ê
ë
é
=
117117
15285
1371112
13287
T
•The constants in each column multiply to a common variable. For
Example x1 is multiplied to the set of constants [7, 12, 5, 7].
• Applying iterative matching algorithm the following table is
obtained.
00101100
0100011110110010
1001001010000101
Column 4Column 3Column 2Column 1
Example:
Chap. 15 7
• Next, the unique products are formed as shown
below:
p1 = 0101*x1, p2 = 0010*x1, p3 = 1100*x1
p4 = 1000*x2, p5 = 1011*x2,
p6 = 0010*x3, p7 = 0111*x3
p8 = 1001*x4, p9 = 0100*x4, p10 = 0010*x4
• Using these products the yi’s are as follows:
y1 = p1 + p2 + p4 + p6 + p8 + p9;
y2 = p3 + p5 + p7 + p8 + p9;
y3 = p1 + p4 + p6 + p8 + p9 + p10;
y4 = p1 + p2 + p5 + p7 + p8 + p10;
Chap. 15 8
• This step involves sharing of additions which are
common to all yi’s. For this each yi is represented
as k bit word (1 £ k £ 10), where each of the k
products formed after the 2nd step represents a
particular bit position. Thus,
y1 = 1101010110, y2 = 0010101110,
y3 = 1001010111, y4 = 1100101101.
• Applying iterative matching algorithm to reduce
the number of additions required for yi’s we get:
y1 = p2 + (p1 + p4 + p6 + p8 + p9);
y2 = p3 + p9 + (p5 + p7 + p8);
y3 = p10 + (p1 + p4 + p6 + p8 + p9);
y4 = p1 + p2 + p10 + (p5 + p7 + p8);
• The total number of additions are reduced from
35 to 20.
Chap. 15 9
Polynomial Evaluation
Evaluating the polynomial:
x13 + x7 + x4 + x2 + x
• Without considering the redundancies this polynomial
evaluation requires 22 multiplications.
• Examining the exponents and considering their binary
representations:
1 = 0001, 2 = 0010, 4 = 0100, 7 = 0111, 13 = 1101.
• x7 can be considered as x4 ´ x2 ´ x1. Applying sub-expression
sharing to the exponents the polynomial can be evaluated as
follows:
x8 ´(x4 ´ x) + x2 ´ (x4 ´ x) + x4 + x2 + x
• The terms x2, x4 and x8 each require one multiplication as
shown below:
x2 = x ´x, x4 = x2 ´ x2, x8 = x4 ´ x4
• Thus, we require 6 instead of 22 multiplications.
Chap. 15 10
Sub-expression Sharing in Digital Filters
• Example of common sub-expression elimination
within a single multiplication :
y = 0.101000101*x.
This may be implemented as:
y = (x >> 1) – (x >> 3) + (x >> 7) – (x >> 9).
Alternatively, this can be implemented as,
x2 = x – (x >> 2)
Y = (x2 >> 1) + (x2 >> 7)
which requires one less addition.
Chap. 15 11
• In order to realize the sub-expression elimination
transformation, the N-tap FIR filter:
y(n) = c0x(n) + c1x(n-1) + + c0x(n-N+1)
must be implemented using transposed direct-
form structure also called data-broadcast filter
structure as shown below:
Chap. 15 12
• Represent a filter operation by a table (matrix)
{xij}, where the rows are indexed by delay i and
the columns by shift j, i.e., the row i is the
coefficient ci for the term x(n-i), and the column 0
in row i is the msb of ci and column W-1 in row i is
the lsb of ci , where W is the word length.
• The row and column indexing starts at 0.
• The entries are 0 or 1 if 2’s complement
representation is used and {1, 0, 1} if CSD is used.
• A non-zero entry in row i and column j represents
x(n-i) >> j. It is to be added or subtracted
according to whether the entry is +1 or –1.
Chap. 15 13
Example:
y(n) = 1.000100000*x(n) + 0.101010010*x(n-1)
+ 0.000100001*x(n-2)
-11
11-1-1
-11
This filter has 8 non-zero terms and thus requires 7
additions. But, the sub-expressions x1 + x1[-1] >> 1
occurs 4 times in shifted and delayed forms by various
amounts as circled. So, the filter requires 4 adds.
x2 = x1 – x1[-1] >> 1
y = x2 – (x2 >> 4) – (x2[-1] >> 3) + (x2[-1] >> 8)
An alternative realization is :
x2 = x1 – (x1 >> 4) – (x1[-1] >> 3) + (x1[-1] >> 8)
y = x2 – (x2[-1] >> 1).
Chap. 15 14
Example:
y(n) = 1.01010000010*x(n) + 0.10001010101*x(n-1)
+ 0.10010000010*x(n-2) + 1.00000101000*x(n-4)
The substructure matching procedure for this design
is as follows:
• Start with the table containing the coefficients of
the FIR filter. An entry with absolute value of 1 in
this table denotes add or subtract of x1. Identify
the best sub-expression of size 2.
-111
11-1
-1-1-1-1-1
111-1
Chap. 15 15
• Remove each occurrence of each sub-expression
and replace it by a value of 2 or –2 in place of the
first (row major) of the 2 terms making up the
sub-expression.
-11
-2
-2-1-1-2
212-1
• Record the definition of the sub-expression. This
may require a negative value of shift which will be
taken care of later.
x3 = x1 – x1[-1] >> (-1)
Chap. 15 16
• Continue by finding more sub-expressions until done.
-11
-2
-2-3
23-1
5. Write out the complete definition of the filter.
x2 = x1 – x1[-1] >> (-1)
x3 = x2 + x1 >> 2
y = -x1 + x3 >> 2 + x2 >> 10 – x3[-1] >> 5 – x2[-1] >> 11
-x2[-2] >> 1 + x1[-3] >> 6 – x1[-3] >> 8.
Chap. 15 17
• If any sub-expression definition involves
negative shift, then modify the definition and
subsequent uses of that variable to remove the
negative shift as shown below:
x2 = x1 >> 1 – x1[-1]
x3 = x2 + x1 >> 3
y = -x1 + x3 >> 1 + x2 >> 9 – x3[-1] >> 4 – x2[-1] >> 10
- x2[-2] + x1[-3] >> 6 – x1[-3] >> 8.
Chap. 15 18
3-tap FIR filter with sub-expression sharing for
3-tap FIR filter with coefficients c2 = 0.11010010,
c1 = 0.10011010 and c0 = 0.00101011.
This requires 7 shifts and 9 additions compared to
12 shifts and 11 additions.
Chap. 15 19
3-tap FIR filter with sub-expression sharing
requiring 8 additions as compared to 9 in the
previous implementation.
Chap. 15 20
Using 2 most common sub-expressions
in CSD representation
• x – x >> 2 and x + x >> 2 are the 2 most common sub-
expressions in CSD representation.
An FIR filter using the term sharing, where the two
most common sub-expressions in CSD numbers 101
and 101, together with isolated 1 are shared among
all filter coefficients.
Chap. 15 21
3-tap FIR filter with coefficients c2 = 0.10101010101,
c1 = 0.10010100101 and c0 = 0.10101010000. 2 additions
in the dotted square in (a) are shared in (b). Filter requires
only 7 additions and 7 shifts as opposed to 12 adds and
12 shifts in standard multiplierless implementation.
Các file đính kèm theo tài liệu này:
- chap_chap15_7629_7482.pdf