# Chip for the DCT Transform

Nov 26, 2013

Design and Implementation of a High-Speed, Low-Power VLSI Chip for the DCT Transform

Professor A. Doboli

Participants: Tak Yuen Lam, Kit Lam, Wei Kit Ng, Ying Lam

Background

The DCT has many applications:

- filtering
- teleconferencing
- high-definition television (HDTV)
- speech coding and image coding
- data compression, and more

Background

All of these applications use the DCT algorithm for compression and/or filtering. The DCT:

- has energy-packing capabilities
- approaches the statistically optimal transform in de-correlating a signal

It has been implemented with discrete components in a chip.

Goal

Implement a VLSI chip that is:

- high speed
- low power

and that computes the 2-D Discrete Cosine Transform (DCT) of an 8 x 8 element matrix.

Goal

Save power during computation in the chip:

- a specially designed multiplier that performs less computation
- less switching
- simplified equations

High-speed processing:

- pipelining
- ignoring zeros in the multiplier

Basic Formula

Forward DCT:

$$F(u,v) = C(u)\,C(v)\left[\sum_{x=0}^{N-1}\sum_{y=0}^{N-1} f(x,y)\,\cos\frac{(2x+1)u\pi}{2N}\,\cos\frac{(2y+1)v\pi}{2N}\right]$$

Inverse DCT:

$$f(x,y) = \left[\sum_{u=0}^{N-1}\sum_{v=0}^{N-1} C(u)\,C(v)\,F(u,v)\,\cos\frac{(2x+1)u\pi}{2N}\,\cos\frac{(2y+1)v\pi}{2N}\right]$$

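Basic Formula

$$C(u) = \sqrt{\frac{1}{N}},\quad C(v) = \sqrt{\frac{1}{N}} \quad \text{for } u, v = 0$$

$$C(u) = \sqrt{\frac{2}{N}},\quad C(v) = \sqrt{\frac{2}{N}} \quad \text{for } u, v = 1 \text{ through } N-1$$

N = 4, 8, or 16
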
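The two formulas can be exercised with a small round-trip check. The following is a minimal sketch, assuming C(0) = sqrt(1/N) and C(k) = sqrt(2/N) for k >= 1 (the scaling under which the inverse undoes the forward transform exactly); the class and method names are illustrative and are not part of the original project.

```java
// Sketch of the basic forward/inverse 2-D DCT formulas for an N x N block.
// Assumption: C(0) = sqrt(1/N) and C(k) = sqrt(2/N) for k >= 1.
final class BasicDct {
    // Normalization factor C(k) for an N-point transform.
    static double c(int k, int n) {
        return Math.sqrt((k == 0 ? 1.0 : 2.0) / n);
    }

    // Cosine kernel cos((2 * pos + 1) * freq * pi / (2N)).
    static double kernel(int pos, int freq, int n) {
        return Math.cos((2 * pos + 1) * freq * Math.PI / (2.0 * n));
    }

    // Forward DCT: samples f(x, y) to coefficients F(u, v).
    static double[][] forward(double[][] f) {
        int n = f.length;
        double[][] out = new double[n][n];
        for (int u = 0; u < n; u++) {
            for (int v = 0; v < n; v++) {
                double sum = 0.0;
                for (int x = 0; x < n; x++) {
                    for (int y = 0; y < n; y++) {
                        sum += f[x][y] * kernel(x, u, n) * kernel(y, v, n);
                    }
                }
                out[u][v] = c(u, n) * c(v, n) * sum;
            }
        }
        return out;
    }

    // Inverse DCT: coefficients F(u, v) back to samples f(x, y).
    static double[][] inverse(double[][] coeff) {
        int n = coeff.length;
        double[][] out = new double[n][n];
        for (int x = 0; x < n; x++) {
            for (int y = 0; y < n; y++) {
                double sum = 0.0;
                for (int u = 0; u < n; u++) {
                    for (int v = 0; v < n; v++) {
                        sum += c(u, n) * c(v, n) * coeff[u][v]
                                * kernel(x, u, n) * kernel(y, v, n);
                    }
                }
                out[x][y] = sum;
            }
        }
        return out;
    }
}
```

Calling `BasicDct.inverse(BasicDct.forward(block))` on a square block should return the original samples up to floating-point rounding.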
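1-D DCT Matrix

$$[Y] = [C] * [X]$$

$$\begin{bmatrix} Y(0)\\ Y(1)\\ Y(2)\\ Y(3)\\ Y(4)\\ Y(5)\\ Y(6)\\ Y(7) \end{bmatrix} = \frac{1}{4} \begin{bmatrix} C_4 & C_4 & C_4 & C_4 & C_4 & C_4 & C_4 & C_4\\ C_1 & C_3 & C_5 & C_7 & -C_7 & -C_5 & -C_3 & -C_1\\ C_2 & C_6 & -C_6 & -C_2 & -C_2 & -C_6 & C_6 & C_2\\ C_3 & -C_7 & -C_1 & -C_5 & C_5 & C_1 & C_7 & -C_3\\ C_4 & -C_4 & -C_4 & C_4 & C_4 & -C_4 & -C_4 & C_4\\ C_5 & -C_1 & C_7 & C_3 & -C_3 & -C_7 & C_1 & -C_5\\ C_6 & -C_2 & C_2 & -C_6 & -C_6 & C_2 & -C_2 & C_6\\ C_7 & -C_5 & C_3 & -C_1 & C_1 & -C_3 & C_5 & -C_7 \end{bmatrix} \begin{bmatrix} X(0)\\ X(1)\\ X(2)\\ X(3)\\ X(4)\\ X(5)\\ X(6)\\ X(7) \end{bmatrix}$$

where $C_k = \cos(k\pi/16)$.
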
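The sign and subscript pattern of [C] can be regenerated from the cosine kernel. The sketch below assumes C_k = cos(k pi / 16) and that the first row carries the extra 1/sqrt(2) factor, which makes every entry of that row equal to C4; `DctMatrix`, `build`, `label`, and `row` are hypothetical names chosen for illustration.

```java
// Sketch: rebuild the 8 x 8 coefficient matrix and print each entry as +Ck or -Ck.
final class DctMatrix {
    // Entry (u, x) = cos((2x + 1) u pi / 16); row 0 is scaled by 1/sqrt(2),
    // an assumption that makes it equal to C4 = cos(4 pi / 16).
    static double[][] build() {
        double[][] m = new double[8][8];
        for (int u = 0; u < 8; u++) {
            double scale = (u == 0) ? 1.0 / Math.sqrt(2.0) : 1.0;
            for (int x = 0; x < 8; x++) {
                m[u][x] = scale * Math.cos((2 * x + 1) * u * Math.PI / 16.0);
            }
        }
        return m;
    }

    // Names a value as +Ck or -Ck, where Ck = cos(k pi / 16) and k = 1..7.
    static String label(double value) {
        for (int k = 1; k <= 7; k++) {
            double ck = Math.cos(k * Math.PI / 16.0);
            if (Math.abs(Math.abs(value) - ck) < 1e-9) {
                return (value < 0 ? "-C" : "+C") + k;
            }
        }
        throw new IllegalArgumentException("not a +/-Ck value: " + value);
    }

    // One matrix row as a space-separated list of signed labels.
    static String row(int u) {
        double[] r = build()[u];
        StringBuilder sb = new StringBuilder();
        for (int x = 0; x < 8; x++) {
            if (x > 0) {
                sb.append(' ');
            }
            sb.append(label(r[x]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (int u = 0; u < 8; u++) {
            System.out.println("Y(" + u + "): " + row(u));
        }
    }
}
```

Only seven distinct magnitudes appear in the matrix, which is what lets the later slides share multiplications between outputs.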
Simplification:

$$\begin{bmatrix} Y(0)\\ Y(4)\\ Y(2)\\ Y(6)\\ Y(1)\\ Y(3)\\ Y(5)\\ Y(7) \end{bmatrix} = \frac{1}{4} \begin{bmatrix} C_4 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & C_4 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & C_2 & C_6 & 0 & 0 & 0 & 0\\ 0 & 0 & C_6 & -C_2 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & C_1 & C_3 & C_5 & C_7\\ 0 & 0 & 0 & 0 & C_3 & -C_7 & -C_1 & -C_5\\ 0 & 0 & 0 & 0 & C_5 & -C_1 & C_7 & C_3\\ 0 & 0 & 0 & 0 & C_7 & -C_5 & C_3 & -C_1 \end{bmatrix} * \begin{bmatrix} X(0)+X(1)+X(2)+X(3)+X(4)+X(5)+X(6)+X(7)\\ X(0)+X(7)+X(3)+X(4)-X(1)-X(2)-X(5)-X(6)\\ X(0)+X(7)-X(3)-X(4)\\ X(1)+X(6)-X(2)-X(5)\\ X(0)-X(7)\\ X(1)-X(6)\\ X(2)-X(5)\\ X(3)-X(4) \end{bmatrix}$$

The following equations are derived from the matrix above:

$$Y(0) = \tfrac{1}{4}\,C_4\,\bigl(X(0)+X(1)+X(2)+X(3)+X(4)+X(5)+X(6)+X(7)\bigr)$$

$$Y(2) = \tfrac{1}{4}\,\bigl[C_2\,(X(0)+X(7)-X(3)-X(4)) + C_6\,(X(1)+X(6)-X(2)-X(5))\bigr]$$

$$Y(4) = \tfrac{1}{4}\,C_4\,\bigl(X(0)+X(7)+X(3)+X(4)-X(1)-X(2)-X(5)-X(6)\bigr)$$

$$Y(6) = \tfrac{1}{4}\,\bigl[C_6\,(X(0)+X(7)-X(3)-X(4)) - C_2\,(X(1)+X(6)-X(2)-X(5))\bigr]$$

$$Y(1) = \tfrac{1}{4}\,\bigl[C_1\,(X(0)-X(7)) + C_3\,(X(1)-X(6)) + C_5\,(X(2)-X(5)) + C_7\,(X(3)-X(4))\bigr]$$

$$Y(3) = \tfrac{1}{4}\,\bigl[C_3\,(X(0)-X(7)) - C_7\,(X(1)-X(6)) - C_1\,(X(2)-X(5)) - C_5\,(X(3)-X(4))\bigr]$$

$$Y(5) = \tfrac{1}{4}\,\bigl[C_5\,(X(0)-X(7)) - C_1\,(X(1)-X(6)) + C_7\,(X(2)-X(5)) + C_3\,(X(3)-X(4))\bigr]$$

$$Y(7) = \tfrac{1}{4}\,\bigl[C_7\,(X(0)-X(7)) - C_5\,(X(1)-X(6)) + C_3\,(X(2)-X(5)) - C_1\,(X(3)-X(4))\bigr]$$

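A quick numerical check of the derived equations is to compare them with the full matrix product. The sketch below assumes C_k = cos(k pi / 16), keeps the 1/4 factor, and takes Y(4) with a positive C4 and X(6) as its last term, since that is what the direct product yields; all identifiers are illustrative.

```java
// Sketch: even/odd form of the 8-point 1-D DCT checked against the full matrix product.
final class EvenOddDct {
    static double c(int k) {
        return Math.cos(k * Math.PI / 16.0);
    }

    // Reference: full 8 x 8 product (64 multiplications), first row equal to C4.
    static double[] direct(double[] x) {
        double[] y = new double[8];
        for (int u = 0; u < 8; u++) {
            double sum = 0.0;
            for (int n = 0; n < 8; n++) {
                sum += x[n] * Math.cos((2 * n + 1) * u * Math.PI / 16.0);
            }
            y[u] = 0.25 * (u == 0 ? c(4) * sum : sum);
        }
        return y;
    }

    // Derived equations: sums feed the even outputs, differences feed the odd ones.
    static double[] evenOdd(double[] x) {
        double s0 = x[0] + x[7], s1 = x[1] + x[6], s2 = x[2] + x[5], s3 = x[3] + x[4];
        double d0 = x[0] - x[7], d1 = x[1] - x[6], d2 = x[2] - x[5], d3 = x[3] - x[4];
        double[] y = new double[8];
        y[0] = 0.25 * c(4) * (s0 + s1 + s2 + s3);
        y[4] = 0.25 * c(4) * (s0 + s3 - s1 - s2);
        y[2] = 0.25 * (c(2) * (s0 - s3) + c(6) * (s1 - s2));
        y[6] = 0.25 * (c(6) * (s0 - s3) - c(2) * (s1 - s2));
        y[1] = 0.25 * (c(1) * d0 + c(3) * d1 + c(5) * d2 + c(7) * d3);
        y[3] = 0.25 * (c(3) * d0 - c(7) * d1 - c(1) * d2 - c(5) * d3);
        y[5] = 0.25 * (c(5) * d0 - c(1) * d1 + c(7) * d2 + c(3) * d3);
        y[7] = 0.25 * (c(7) * d0 - c(5) * d1 + c(3) * d2 - c(1) * d3);
        return y;
    }
}
```

In this form the coefficient multiplications drop from 64 to 22, matching the 22 non-zero entries of the simplified matrix.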
Simplified Equations

Y(0) = c4[j+k+l+m]

Y(2) = c2[j-m] + c6[k-l]

Y(4) = c4[j-k-l+m]

Y(6) = c6[j-m] - c2[k-l]

Y(1) = e + f + h + [c1+c3-c5-c7]a

Y(3) = e + g + I + [c1+c3+c5-c7]b

Y(5) = e + g + h + [c1+c3-c5+c7]c

Y(7) = e + f + I + [-c1+c3+c5-c7]d

Simplified Equations

a = x0-x7; b = x1-x6;

c = x2-x5; d = x3-x4

j = x0+x7; k = x1+x6;

l = x2+x5; m = x3+x4

e = c3[a+b+c+d]

f = [c7-c3][a+d]

g = [-c1-c3][b+c]

h = [c5-c3][a+c]

I = [-c5-c3][b+d]
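The shared terms e, f, g, h, and I can be verified numerically against the direct odd-output sums. This is a minimal sketch under stated assumptions: ck(k) = cos(k pi / 16), a through d are the differences x0-x7, x1-x6, x2-x5, x3-x4, and f is taken as (c7 - c3)(a + d), which is the combination for which the check against the direct sums passes. The names are hypothetical.

```java
// Sketch: odd outputs Y(1), Y(3), Y(5), Y(7) via shared products, checked against direct sums.
final class FactoredOddDct {
    static double ck(int k) {
        return Math.cos(k * Math.PI / 16.0);
    }

    // Direct form: sixteen multiplications. Returns {Y1, Y3, Y5, Y7} without the 1/4 factor.
    static double[] direct(double[] x) {
        double a = x[0] - x[7], b = x[1] - x[6], c = x[2] - x[5], d = x[3] - x[4];
        return new double[] {
            ck(1) * a + ck(3) * b + ck(5) * c + ck(7) * d,
            ck(3) * a - ck(7) * b - ck(1) * c - ck(5) * d,
            ck(5) * a - ck(1) * b + ck(7) * c + ck(3) * d,
            ck(7) * a - ck(5) * b + ck(3) * c - ck(1) * d
        };
    }

    // Factored form: five shared products plus one product per output (nine in total),
    // assuming the bracketed coefficient sums are precomputed constants.
    static double[] factored(double[] x) {
        double a = x[0] - x[7], b = x[1] - x[6], c = x[2] - x[5], d = x[3] - x[4];
        double c1 = ck(1), c3 = ck(3), c5 = ck(5), c7 = ck(7);
        double e = c3 * (a + b + c + d);
        double f = (c7 - c3) * (a + d);
        double g = (-c1 - c3) * (b + c);
        double h = (c5 - c3) * (a + c);
        double i = (-c5 - c3) * (b + d);
        return new double[] {
            e + f + h + (c1 + c3 - c5 - c7) * a,
            e + g + i + (c1 + c3 + c5 - c7) * b,
            e + g + h + (c1 + c3 - c5 + c7) * c,
            e + f + i + (-c1 + c3 + c5 - c7) * d
        };
    }
}
```

The trade is fewer multiplications for more additions.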

1D DCT Flow Chart

[Flow chart blocks: Pixel Memory, Cosine Matrix Memory, Multiplier, Shifter, Register Bank, DCT Coefficients]

2D DCT Flow Chart

[Flow chart: 1-D DCT, then Transpose, then 1-D DCT, with a Control block]
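The row-column structure of this flow chart can be sketched in software as follows. The per-pass scaling of 1/2 (with an extra 1/sqrt(2) on the DC term) is an assumption chosen so that two passes give the 0.25 * ci * cj scaling of a direct 8 x 8 transform; the final transpose only restores the [u][v] index order and may not correspond to a hardware stage. All names are illustrative.

```java
// Sketch: 2-D DCT computed as 1-D DCT on rows, transpose, 1-D DCT on rows again.
final class RowColumnDct {
    // 8-point 1-D DCT of one row; assumed scaling 1/2, with 1/sqrt(2) on the DC term.
    static double[] dct1d(double[] x) {
        double[] y = new double[8];
        for (int u = 0; u < 8; u++) {
            double sum = 0.0;
            for (int n = 0; n < 8; n++) {
                sum += x[n] * Math.cos((2 * n + 1) * u * Math.PI / 16.0);
            }
            y[u] = 0.5 * (u == 0 ? 1.0 / Math.sqrt(2.0) : 1.0) * sum;
        }
        return y;
    }

    static double[][] transpose(double[][] m) {
        double[][] t = new double[8][8];
        for (int r = 0; r < 8; r++) {
            for (int col = 0; col < 8; col++) {
                t[col][r] = m[r][col];
            }
        }
        return t;
    }

    // Applies the 1-D transform to every row of the block.
    static double[][] rows(double[][] m) {
        double[][] out = new double[8][];
        for (int r = 0; r < 8; r++) {
            out[r] = dct1d(m[r]);
        }
        return out;
    }

    static double[][] dct2d(double[][] f) {
        double[][] pass1 = rows(f);              // first 1-D DCT
        double[][] turned = transpose(pass1);    // transpose stage
        double[][] pass2 = rows(turned);         // second 1-D DCT
        return transpose(pass2);                 // software-only: back to [u][v] order
    }

    // Reference: direct four-loop 2-D DCT with the same overall scaling.
    static double[][] direct(double[][] f) {
        double[][] g = new double[8][8];
        for (int i = 0; i < 8; i++) {
            for (int j = 0; j < 8; j++) {
                double sum = 0.0;
                for (int x = 0; x < 8; x++) {
                    for (int y = 0; y < 8; y++) {
                        sum += f[x][y] * Math.cos((2 * x + 1) * i * Math.PI / 16.0)
                                * Math.cos((2 * y + 1) * j * Math.PI / 16.0);
                    }
                }
                double ci = (i == 0) ? 1.0 / Math.sqrt(2.0) : 1.0;
                double cj = (j == 0) ? 1.0 / Math.sqrt(2.0) : 1.0;
                g[i][j] = 0.25 * ci * cj * sum;
            }
        }
        return g;
    }
}
```

The row-column form needs 16 eight-point 1-D transforms per block instead of a 64-term sum for each of the 64 outputs.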

1-D DCT Architecture (First Version)

[Figure: datapath with inputs X(0) to X(7) and outputs Y(0) to Y(7)]

1-D DCT Architecture (Final Version)

[Figure: three-stage datapath (State 1, State 2, State 3) built from adders (+), subtractors (-), and multipliers (x). Inputs as labeled: X0 X1 X3 X4 X2 X5 X1 X6 X0 X7 X3 X4 X2 X5 X1 X6. Outputs as labeled: Y0 Y4 Y2 Y6 Y1 Y7 Y5 Y3.]

Transpose Architecture

[Figure: Transpose Component with inputs In 0 to In 7 and outputs OUT 0 to OUT 7]

[Figure: full adder schematic with inputs A, B, CI and outputs S, Co; 20 transistors]

Hardware: Multiplier

Example: 4x4 bit array multiplier

[Figure: array of AND gates (7408), half adders (HA), and full adders (FA); inputs X0 to X3 and Y0 to Y3, outputs Z0 to Z7]

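The behaviour of a 4x4 array multiplier can be modelled at the bit level. The exact adder wiring of the figure is not recoverable from the text, so the sketch below assumes a simple arrangement in which each row of AND-gate partial products is added with a ripple of full adders; a full adder with a zero input behaves like a half adder. The class and method names are illustrative.

```java
// Sketch: bit-level model of a 4 x 4 array multiplier (AND gates plus adders).
final class ArrayMultiplier4x4 {
    // One-bit full adder: returns {sum, carryOut}.
    static int[] fullAdder(int a, int b, int carryIn) {
        int sum = a ^ b ^ carryIn;
        int carryOut = (a & b) | (carryIn & (a ^ b));
        return new int[] {sum, carryOut};
    }

    // Multiplies two 4-bit unsigned values and returns the 8-bit product Z7..Z0.
    static int multiply(int x, int y) {
        if (x < 0 || x > 15 || y < 0 || y > 15) {
            throw new IllegalArgumentException("operands must be 4-bit unsigned");
        }
        int[] z = new int[8];                      // running sum bits, z[0] is Z0
        for (int j = 0; j < 4; j++) {              // one row per bit of y
            int carry = 0;
            for (int i = 0; i < 4; i++) {
                int partial = ((x >> i) & 1) & ((y >> j) & 1);   // AND gate
                int[] r = fullAdder(z[i + j], partial, carry);
                z[i + j] = r[0];
                carry = r[1];
            }
            z[j + 4] = carry;                      // this bit is still 0 before row j
        }
        int product = 0;
        for (int bit = 0; bit < 8; bit++) {
            product |= z[bit] << bit;
        }
        return product;
    }
}
```

For example, `ArrayMultiplier4x4.multiply(13, 11)` returns 143.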
Simplified Multiplication

Example:

[Figure: partial-product array with bits shown as 1, 0, and x; labels: ignore, Output]

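The idea of ignoring zeros can be illustrated with a shift-and-add model in which partial-product rows for zero coefficient bits are never added. This is a sketch under assumptions, not the chip's circuit: the coefficient is treated as an unsigned 8-bit constant, and `ZeroSkipMultiplier` and `Result` are hypothetical names.

```java
// Sketch: shift-and-add multiplication that skips rows for zero coefficient bits.
final class ZeroSkipMultiplier {
    // Product plus the number of partial-product rows that were actually added.
    static final class Result {
        final long product;
        final int rowsAdded;

        Result(long product, int rowsAdded) {
            this.product = product;
            this.rowsAdded = rowsAdded;
        }
    }

    // data may be signed; coeff is assumed to be an unsigned 8-bit constant.
    static Result multiply(int data, int coeff) {
        if (coeff < 0 || coeff > 0xFF) {
            throw new IllegalArgumentException("coefficient must fit in 8 unsigned bits");
        }
        long product = 0;
        int rows = 0;
        for (int bit = 0; bit < 8; bit++) {
            if (((coeff >> bit) & 1) == 0) {
                continue;                          // zero bit: this row is ignored
            }
            product += ((long) data) << bit;       // add the shifted data word
            rows++;
        }
        return new Result(product, rows);
    }
}
```

With a coefficient such as binary 10010001, only three of the eight rows are added, which reduces both the adder activity and the work per multiplication.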
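Comparison

| | Conventional Design | First version | Final version |
|---|---|---|---|
| (row label missing in source) | 56 | 28 | 32 |
| Number of Multipliers | 32 | 16 | 14 |

| | Array (16x16) | Split Array (16x16) | Wallace (16x16) | Modified Booth (16x16) | Our Design (12x8) |
|---|---|---|---|---|---|
| Speed of Multiplier (ns) | 92.6 | 62.9 | 54.5 | 45.4 | ~23 |
| Power (mW) | 43.5 | 38 | 32 | 41.3 | ~22.5 |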

VLSI: Multiplier

Multiplier Simulation

VLSI: 1 bit Transpose

VLSI: 8x8 Transpose

Transpose Simulation

VLSI: 1D DCT Part One

1D DCT Part One Simulation

VLSI: 1D DCT Part Two

1D DCT Part Two Simulation

Java simulation

Java Code:

// Direct 8 x 8 forward 2-D DCT used as the software reference.
// f (input pixels) and g (output coefficients) are int[8][8] fields of the enclosing class.
public void transform() {
    g = new int[8][8];
    for (int i = 0; i < 8; i++) {
        for (int j = 0; j < 8; j++) {
            double ge = 0.0;
            for (int x = 0; x < 8; x++) {
                for (int y = 0; y < 8; y++) {
                    double cg1 = (2.0 * x + 1.0) * i * Math.PI / 16.0;
                    double cg2 = (2.0 * y + 1.0) * j * Math.PI / 16.0;
                    ge += f[x][y] * Math.cos(cg1) * Math.cos(cg2);
                }
            }
            // The DC row and column carry an extra factor of 1/sqrt(2).
            double ci = (i == 0) ? 1.0 / Math.sqrt(2.0) : 1.0;
            double cj = (j == 0) ? 1.0 / Math.sqrt(2.0) : 1.0;
            ge *= ci * cj * 0.25;
            g[i][j] = (int) Math.round(ge);
        }
    }
}


Simulation Result

INPUT MATRIX:

210 190  50 235 128  90  67  54
 89  90  23  47  45  44  32  23
203 167  34  65  87 120  56  48
 90  49  78  12  52  39  45 178
  1 189  67 187  93  51  23  57
129   0  89  27  43 155  89  67
  7 125   3  98   2  67   5   1
190  78  56 234 175  89 123  48

Simulation Result

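Output matrix after DCT (listed from the DC coefficient, 663, at the top left; minus signs did not survive in the source, so the entries are magnitudes):

663 127   4  66  68   2  54 115
 31  67  69  16  16  44  14   2
 86  76  64  76  63  83  73  37
 17   3  50  35  41   7   7  45
143  11  57  75  97   9  65   5
 24   6  63   9   6  23  30  19
165  18  27 110  33  63  51  37
 41 101  83  13  19  49 120 110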
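To re-run the simulation, the input values can be fed to a standalone copy of the transform. The sketch below assumes the 64 input values are listed row by row as f[x][y]; `DctSimulationCheck` is a hypothetical wrapper, and its `transform` repeats the arithmetic of the Java code above with the block passed in and the result returned.

```java
// Sketch: standalone re-run of the 8 x 8 DCT simulation on the input matrix.
final class DctSimulationCheck {
    // Input values from the simulation slide, assumed to be listed row by row.
    static final int[][] INPUT = {
        {210, 190, 50, 235, 128, 90, 67, 54},
        {89, 90, 23, 47, 45, 44, 32, 23},
        {203, 167, 34, 65, 87, 120, 56, 48},
        {90, 49, 78, 12, 52, 39, 45, 178},
        {1, 189, 67, 187, 93, 51, 23, 57},
        {129, 0, 89, 27, 43, 155, 89, 67},
        {7, 125, 3, 98, 2, 67, 5, 1},
        {190, 78, 56, 234, 175, 89, 123, 48}
    };

    // Same arithmetic as transform(), with f as a parameter and g as the return value.
    static int[][] transform(int[][] f) {
        int[][] g = new int[8][8];
        for (int i = 0; i < 8; i++) {
            for (int j = 0; j < 8; j++) {
                double ge = 0.0;
                for (int x = 0; x < 8; x++) {
                    for (int y = 0; y < 8; y++) {
                        double cg1 = (2.0 * x + 1.0) * i * Math.PI / 16.0;
                        double cg2 = (2.0 * y + 1.0) * j * Math.PI / 16.0;
                        ge += f[x][y] * Math.cos(cg1) * Math.cos(cg2);
                    }
                }
                double ci = (i == 0) ? 1.0 / Math.sqrt(2.0) : 1.0;
                double cj = (j == 0) ? 1.0 / Math.sqrt(2.0) : 1.0;
                g[i][j] = (int) Math.round(ge * ci * cj * 0.25);
            }
        }
        return g;
    }

    public static void main(String[] args) {
        // Prints the signed coefficients, one matrix row per line.
        for (int[] row : transform(INPUT)) {
            StringBuilder sb = new StringBuilder();
            for (int value : row) {
                sb.append(String.format("%5d", value));
            }
            System.out.println(sb);
        }
    }
}
```

Running `main` prints the coefficients with their signs, which makes it straightforward to compare against the listed output values.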