Oct 23, 2013

An Investigation on FPGA Placement Using Mixed Genetic Algorithm with Simulated Annealing


Meng Yang

Napier University

Edinburgh, UK

Overview


Placement problem definition


Symmetrical FPGA general architecture


Proposed algorithm


Experimental results


Conclusions

FPGA Placement Definition


Constraints


Some fixed I/O pads


Architecture


Problem definition

Given a netlist, find exact locations for the FPGA logic blocks, subject to the constraints, that minimize the wire length required for routing.

FPGA General Architecture

[Figure: symmetrical FPGA built from CLBs, IOBs and switch blocks; each logic block contains a 4-input LUT and a D flip-flop, with In, Clock and Out signals]

Genotype


The chromosome structure is $(L_1, L_2, L_3, \ldots, L_N)$

Chromosome length, $N$, depends on the size of the FPGA, $K$

The location of a CLB is calculated as

$P = (x - 1) \times K + (y - 1)$
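As a minimal sketch of the location formula above, the following Python helpers convert between a 1-based coordinate (x, y) and the gene position P on a K × K array. The function names are illustrative and do not come from the slides, and the inverse mapping is derived from the formula rather than stated in the source.

```python
from typing import Tuple


def location_index(x: int, y: int, k: int) -> int:
    """Gene position P = (x - 1) * K + (y - 1) for 1-based coordinates (x, y)."""
    if not (1 <= x <= k and 1 <= y <= k):
        raise ValueError("x and y must lie in 1..K")
    return (x - 1) * k + (y - 1)


def location_coords(p: int, k: int) -> Tuple[int, int]:
    """Inverse of location_index: recover the 1-based (x, y) from P."""
    if not 0 <= p < k * k:
        raise ValueError("P must lie in 0..K*K-1")
    return p // k + 1, p % k + 1
```

For K = 4, location (1,1) maps to position 0 and location (4,4) to position 15, which matches the index range 0 to 15 of the example chromosome.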



[Figure: example chromosome with gene positions 0 to 15 and entries such as -1, 5, 2, 9, 6, 10 and 8, shown beside a 4 × 4 array of locations (1,1) to (4,4) holding the placed blocks]

Fitness Function

[Equation annotations: compensation factor; bounding box for horizontal span; bounding box for vertical span; the worst cost for placement, maxcost]

Half-perimeter Wire Length Model

Net with 6 terminals: Bounding Box = 6 = 4 (horizontal distance) + 2 (vertical distance)

Half-perimeter Wire Length Model (Cont.)

Net with 6 terminals: Bounding Box = 5 = 3 (horizontal distance) + 2 (vertical distance)
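The bounding-box measure can be sketched in a few lines of Python. The terminal coordinates below are hypothetical, chosen only so that the two nets reproduce the spans of 4 + 2 and 3 + 2 quoted above; the compensation factor of the fitness function is left out.

```python
from typing import Iterable, Tuple


def half_perimeter(terminals: Iterable[Tuple[int, int]]) -> int:
    """Half-perimeter of the bounding box enclosing all terminals of one net."""
    pts = list(terminals)
    if not pts:
        raise ValueError("a net needs at least one terminal")
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    # Horizontal span plus vertical span of the bounding box.
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


# Hypothetical 6-terminal net spanning 4 columns and 2 rows.
net_a = [(0, 0), (1, 2), (2, 1), (3, 0), (4, 2), (2, 2)]
# The same net after moving the right-most terminal one column inwards.
net_b = [(0, 0), (1, 2), (2, 1), (3, 0), (3, 2), (2, 2)]
```

Moving a single terminal is enough to shrink the bounding box, which is why the measure is cheap to re-evaluate after a swap of two blocks.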

Overview of GASA

begin
    initialize_population();
    while (generation < MAX_GENS) do
        evaluate_population_fitness();
        reproduce_population(Preserve);
        for i = 1 to POP_SIZE/2 do
            crossover(Pcrossover);
        for j = 1 to NUM_GENES do
            mutate(Pmutation);
        for i = 1 to POP_SIZE do
            local_improvement(Plocal);
        elitism();
    end while
    select_the_best_one();
    T = set_temperature();
    R = set_block_movement_range();
    /* following algorithm is pseudo-code of SA */
    while (Exit_criterion() == FALSE) do
        while (inner_criterion() == FALSE) do
            Pnew = generate_movement(R, Pold);
            ΔC = C(Pnew) - C(Pold);
            RANDOM = generate_number();
            if (RANDOM < exp(-ΔC/T))
                Pold = Pnew;
        end while
    end while
end algorithm
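The SA half of the pseudo-code can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: fixed step counts stand in for Exit_criterion() and inner_criterion(), a move swaps two randomly chosen locations and ignores the range R, and the temperature is cooled by a constant factor `beta` after each outer iteration. All function and parameter names are hypothetical.

```python
import math
import random
from typing import Callable, List


def accept(delta_c: float, temperature: float, rng: random.Random) -> bool:
    """Metropolis test from the pseudo-code: accept when RANDOM < exp(-ΔC / T)."""
    if delta_c <= 0:
        # exp(-ΔC/T) >= 1 here, so a uniform draw in [0, 1) always passes;
        # returning early also avoids overflow in math.exp.
        return True
    return rng.random() < math.exp(-delta_c / temperature)


def sa_stage(p_old: List[int], cost: Callable[[List[int]], float],
             temperature: float, outer_steps: int, inner_steps: int,
             beta: float = 0.9, seed: int = 0) -> List[int]:
    """Two nested loops as in the pseudo-code, with fixed step counts (assumption)."""
    rng = random.Random(seed)
    p_old = list(p_old)
    for _ in range(outer_steps):
        for _ in range(inner_steps):
            # generate_movement: swap the contents of two random locations.
            i, j = rng.sample(range(len(p_old)), 2)
            p_new = list(p_old)
            p_new[i], p_new[j] = p_new[j], p_new[i]
            delta_c = cost(p_new) - cost(p_old)
            if accept(delta_c, temperature, rng):
                p_old = p_new
        temperature *= beta  # constant cooling factor (simplification)
    return p_old
```

At a very low temperature the acceptance test rejects every cost-increasing move, so the loop behaves like a greedy local search; higher temperatures let some worsening moves through.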

Selection


Individuals are selected according to their fitness values.

The fitness values of the population are sorted in increasing order.

A small number of individuals with the highest fitness values in the current generation are kept intact and remain in the population.

W individuals are selected simultaneously.

The selection procedure is random, but fitter individuals are more likely to be selected.
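The selection step might look like the sketch below. The slides do not name the sampling scheme, so the fitness-proportionate draw used here is an assumption; `n_preserve` and `w` are illustrative names for the number of preserved individuals and for W, and fitness values are assumed to be non-negative with at least one positive value.

```python
import random
from typing import List, Sequence


def select(population: Sequence[list], fitness: Sequence[float],
           n_preserve: int, w: int, rng: random.Random) -> List[list]:
    """Keep the n_preserve fittest individuals, then draw w more at random
    with probability proportional to fitness (assumed scheme)."""
    # Sort indices by fitness in increasing order, as described above.
    order = sorted(range(len(population)), key=lambda i: fitness[i])
    # Guard n_preserve == 0, because order[-0:] would return the whole list.
    elite = [population[i] for i in order[-n_preserve:]] if n_preserve > 0 else []
    # Random draw in which fitter individuals are more likely to be picked.
    drawn = rng.choices(population, weights=fitness, k=w)
    return elite + drawn
```

Because the draw is with replacement, a fit individual may be selected several times, while the preserved individuals are always carried over unchanged.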

Crossover Process

[Figure: crossover example on a 3 × 3 array with locations (1,1) to (3,3) and gene positions 0 to 8. Three chromosomes are shown, each next to its placement of blocks 1 to 6 on the array:

Position:           0   1   2   3   4   5   6   7   8
First chromosome:   6  -1   1   4   5   2  -1   3  -1
Second chromosome: -1   5   2   3  -1   6  -1   4   1
Third chromosome:   1  -1   6   4   5   2  -1   3  -1]

Local optimization (SA) stage


Once GA has completed the global search in the first stage, SA takes over from GA to perform the local search.

The takeover is static. If the GA gains no improvement for 5 generations, or the number of generations exceeds the maximum number of generations, SA starts to work on a single individual instead of the entire population.

Because the takeover is static, the initial temperature T in the second stage of our algorithm is set to 1 degree, based on the experimental results.
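The static takeover rule can be expressed as a small predicate. This is a sketch under assumptions: the GA is taken to record its best fitness once per generation, higher fitness is taken to be better, and the function and parameter names are hypothetical.

```python
from typing import Sequence


def sa_should_take_over(best_fitness_per_gen: Sequence[float],
                        max_gens: int, patience: int = 5) -> bool:
    """True when the GA has used up its generations or has stalled."""
    generation = len(best_fitness_per_gen)
    # The GA loop runs while generation < MAX_GENS, so it stops at MAX_GENS.
    if generation >= max_gens:
        return True
    if generation <= patience:
        return False
    # Stalled: none of the last `patience` generations beat the best seen before them.
    best_before = max(best_fitness_per_gen[:-patience])
    return max(best_fitness_per_gen[-patience:]) <= best_before
```

When the predicate returns True, the best individual found so far would be handed to the SA stage with the initial temperature set to 1.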

Local optimization (SA) stage (Cont.)


The new temperature is computed as $T_{new} = \beta \, T_{old}$

β depends on α

α is the percentage of attempted movements (swaps between two blocks) that have been accepted

Movements are only made between two nearby blocks

α                    β
0.15 < α < 0.3       0.95
0.05 <= α <= 0.15    0.8
α < 0.05             0.6
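The cooling table translates directly into a lookup function. The sketch below treats α as a fraction between 0 and 1. The table gives no value for α >= 0.3, so the function refuses to guess one unless the caller supplies it through the hypothetical parameter `beta_outside_table`.

```python
from typing import Optional


def cooling_factor(alpha: float, beta_outside_table: Optional[float] = None) -> float:
    """Return beta for an acceptance rate alpha, following the table above."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is a fraction in [0, 1]")
    if alpha < 0.05:
        return 0.6
    if alpha <= 0.15:
        return 0.8
    if alpha < 0.3:
        return 0.95
    # alpha >= 0.3 is not covered by the table.
    if beta_outside_table is None:
        raise ValueError("the table gives no beta for alpha >= 0.3")
    return beta_outside_table


def next_temperature(t_old: float, alpha: float) -> float:
    """T_new = beta * T_old, with beta chosen from the acceptance rate."""
    return cooling_factor(alpha) * t_old
```

Starting from T = 1 with 10% of the moves accepted, one cooling step gives T = 0.8; as fewer moves are accepted, the temperature falls faster.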

Comparison Flow

Logic optimization and technology mapping to 4-input look-up tables (LUTs)

Pack flip-flops and LUTs into basic logic elements

Placement (VPlace) or Placement (GASA)

Routing (VRouter)

Channel density

Benchmarks


Comparison to GA

Name        GA                         GASA
            CPU (s)    No. of Tracks   CPU (s)    No. of Tracks
9symml      25.74      5               22.86      5
alu2        91.76      6               74.27      6
apex7       38.39      5               38.11      5
e64         163.70     8               155.21     8
example2    107.57     5               95.23      5
k2          461.59     10              364.77     9
term1       28.06      5               26.35      5
too-lrg     82.51      7               74.37      7
vda         179.17     8               148.33     8
Total       1178.49    59              999.5      58
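The totals of the GA versus GASA comparison can be re-derived with a few lines of Python. The numbers are copied from the table; the variable names are illustrative.

```python
# (GA CPU s, GA tracks, GASA CPU s, GASA tracks), copied from the table.
results = {
    "9symml": (25.74, 5, 22.86, 5),
    "alu2": (91.76, 6, 74.27, 6),
    "apex7": (38.39, 5, 38.11, 5),
    "e64": (163.70, 8, 155.21, 8),
    "example2": (107.57, 5, 95.23, 5),
    "k2": (461.59, 10, 364.77, 9),
    "term1": (28.06, 5, 26.35, 5),
    "too-lrg": (82.51, 7, 74.37, 7),
    "vda": (179.17, 8, 148.33, 8),
}

ga_cpu = sum(r[0] for r in results.values())
gasa_cpu = sum(r[2] for r in results.values())
# Fraction of the total GA CPU time saved by GASA.
saving = (ga_cpu - gasa_cpu) / ga_cpu
# True if GASA is faster than GA on every benchmark.
faster_everywhere = all(r[2] < r[0] for r in results.values())
# Benchmarks where GASA needs fewer routing tracks than GA.
fewer_tracks = [name for name, r in results.items() if r[3] < r[1]]
```

With these numbers GASA is faster on every benchmark, uses roughly 15% less total CPU time than GA, and k2 is the only benchmark where the track count differs.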

Comparison to VPR

Benchmarks   VPlace [5]   GASA
             Cost         Cost
9symml       690          693
alu2         1670         1678
apex7        785          785
e64          2853         2849
example2     1348         1345
k2           5874         5873
term1        700          700
too-lrg      1750         1748
vda          3067         3067
Total        18737        18738

Conclusions


FPGA placement using GASA is presented.

The experimental results show that the proposed algorithm is effective in improving the quality of placement for the tested MCNC benchmarks.

The proposed GASA needs less CPU time than GA in all cases, without degradation of performance in the final routing stage, i.e. the same number of routing channel tracks for every benchmark except k2, where GASA needs 9 tracks instead of 10.

The results also show that GASA and VPlace are highly comparable placement tools.



Thank you for your attention