0% found this document useful (0 votes)
232 views143 pages

Lecture 07: PARTITIONING: Prof. Indranil Sengupta

Floorplanning involves determining approximate locations and dimensions of modules on a layout surface to minimize area while reducing critical wire lengths, where modules can have flexible shapes and pin locations are not fixed, making it more difficult than placement where modules and pin locations are predefined. The two problems are identical in some design styles but floorplanning generally provides less precise module locations and sizes.

Uploaded by

Akhendra Kumar
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
Download as pdf or txt
0% found this document useful (0 votes)
232 views143 pages

Lecture 07: PARTITIONING: Prof. Indranil Sengupta

Floorplanning involves determining approximate locations and dimensions of modules on a layout surface to minimize area while reducing critical wire lengths, where modules can have flexible shapes and pin locations are not fixed, making it more difficult than placement where modules and pin locations are predefined. The two problems are identical in some design styles but floorplanning generally provides less precise module locations and sizes.

Uploaded by

Akhendra Kumar
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
Download as pdf or txt
Download as pdf or txt
You are on page 1/ 143

Lecture 07: PARTITIONING

PROF. INDRANIL SENGUPTA


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
What is Partitioning?
System Design

• Decomposition of a complex system into smaller subsystems.


• Each subsystem can be designed independently.
• Decomposition scheme has to minimize the interconnections between the
subsystems.
• Decomposition is carried out hierarchically until each subsystem is of
manageable size.

Module 1 Module 2 Module n Interface Information

2
An Example of
Partitioning 48 gates

Size-1 = Size-2 = Size-3 =


15 16 17

3
Partitioning at Different Levels
• Can be done at multiple levels:
– System level
– Board level
– Chip level
• Delay implications are different:
– Intrachip : X
– Intraboard : 10X
– Interboard : 20X

4
Different Delays in a Chip

A B X A

10X
10X
C
C
B

20X

5
Problem Formulation
• Partition a given netlist into smaller netlists such that:
1. Interconnection between partitions is minimized.
2. Delay due to partitioning is minimized.
3. Number of terminals is less than a predetermined maximum value.
4. The area of each partition remains within specified bounds.
5. The number of partitions also remains within specified bounds.

6
Partitioning Techniques
• Broadly two classes of algorithms:
1. Constructive
• Random selection
• Cluster growth
• Hierarchical clustering
2. Iterative-improvement
• Min-cut
• Simulated annealing

7
Random Selection
• Randomly select nodes one at a time and place them into
clusters of fixed size, until the proper size is reached.
• Repeat above procedure until all the nodes have been placed.
• Quality/Performance:
– Fast and easy to implement.
– Generally produces poor results.
– Usually used to generate the initial partitions for iterative placement
algorithms.

8
m : size of each cluster, V : set of nodes
Cluster Growth n = |V| / m ;
for (i=1; i<=n; i++)
• Starting with a single seed = vertex in V with maximum degree;
node, add other nodes Vi = {seed};
to form the partitions V = V – {seed};
based on connectivity. for (j=1; j<m; j++)
• Number of clusters n can t = vertex in V maximally connected to Vi;
be an input parameter. Vi = Vi U {t};
V = V – {t};
endfor
endfor

9
Hierarchical Clustering
• Consider a set of objects and group them depending on some
measure of closeness.
– The two closest objects are clustered first, and considered to be a
single object for further partitioning.
– The process continues by grouping two individual objects, or an object
or cluster with another cluster.
– We stop when a single cluster is generated and a hierarchical cluster
tree has been formed.
• The tree can be cut in any way to get clusters.

10
Example
v1
v1
v241 v2413
7 5 7 5 6 v3
v2 1 v24 4 4
v3 1 v3
9 4 4 v5 v5
v4 v5 v5

v24135

11
Find an edge whose
removal splits the set of
v24135
vertices into two almost
equal partitions.
v2413

v241

v24

v2 v4 v1 v3 v5

12
Min-Cut Algorithm (Kernighan-Lin)
• Basically a bisection algorithm.
– The input graph is partitioned into two subsets of equal sizes.
• Till the cutsets keep improving:
– Vertex pairs which give the largest decrease in cutsize are exchanged.
– These vertices are then locked.
– If no improvement is possible and some vertices are still unlocked, the
vertices which give the smallest increase are exchanged.

13
Example

14
15
• Drawbacks of K-L Algorithm
– It is not applicable for hyper-graphs.
• It considers edges instead of hyper-edges.
• It cannot handle arbitrarily weighted graphs.
• Partition sizes have to be specified a priori.
– Time complexity is high :: O(n3).
– It considers balanced partitions only.

16
Extension of K-L Algorithm
• Unequal sized blocks
– To partition a graph with 2n vertices into two subgraphs of unequal
sizes n1 and n2:
• Divide the nodes into two subsets A and B, containing MIN(n1,n2)
and MAX(n1,n2) vertices respectively.
• Apply K-L algorithm, but restrict the maximum number of vertices
that can be interchanged in one pass to MIN(n1,n2).

17
• Unequal sized elements
– To generate a two-way partition of a graph whose vertices have
unequal sizes:
• Assume that the smallest element has unit size.
• Replace each element of size s with s vertices which are fully
connected (s-clique) with edges of infinite weight.
• Apply K-L algorithm to the modified graph.

18
Performance Driven Partitioning
• On-board delay is much larger than on-chip delay.
– On-chip delay is of the order of nanoseconds.
– On-board delay can be in the order of milliseconds.
• If a critical path is cut many times by the partition, the delay in the path
may be too large to meet the goals of high-performance systems.
• Goal of partitioning in high-performance systems:
1. Reduce the cut-size.
2. Minimize the delay in critical paths.
3. Timing constraints have to be satisfied.

19
END OF LECTURE 07

20
Lecture 08: FLOORPLANNING

PROF. INDRANIL SENGUPTA


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
What is Floorplanning?
• Find approximate locations of a set of modules that
need to be placed on a layout surface.
– Available region typically considered rectangular.
– Modules are also typically rectangular in shape, but there
can be exceptions (e.g. L-shaped modules).

2
An Example Floorplan
I/O Pads
Module a

Module b
Block a Block c
Module c
GND VDD
Chip Block d
Block Pins
Planning Block
Module d
b Block e

Module e

Power Supply Network

3
Problem Definition
• Input:
– n Blocks with areas A1, A2, ... , An
– Bounds ri and si on the aspect ratio of block Bi
• Output:
– Coordinates (xi, yi), width wi and height hi for each block such
that hi wi = Ai and ri ≤ hi/wi ≤ si
• Objective:
– Minimize area; reduce wire length for critical lengths.

4
Floorplanning and Placement: Differences
• The problems are similar in nature.
• Main differences:
– Floorplanning:- Some of the blocks may be flexible, and
the exact locations of the pins are not yet fixed.
– Placement:- All blocks have well-defined geometrical
shapes, with defined pin locations. We keep separate
space for routing.

5
• Points to note:
– Floorplanning problem is more difficult as compared to
placement.
• Multiple choice for the shape of a block.
– In some of the VLSI design styles, the two problems are
identical.

6
An Example for Rigid Module Width Height

Blocks A
B
1
1
1
3
C 1 1
D 1 2
E 2 1

Some of the Feasible Floorplans


E E B
A C B D B E
D
D A C A C

7
Design Style Specific Issues
• Full Custom
– All the steps required for general cells.
• Standard Cell
– Dimensions of all cells are fixed.
– Floorplanning problem is simply the placement problem.
– For large netlists, two steps:
• First do global partitioning.
• Next carry out placement for the individual regions.
• Gate Array
– Floorplanning problem same as placement problem.

8
Estimating Cost of a Floorplan
• The number of feasible solutions of a floorplanning problem
is very large.
– Finding the best solution is NP-hard.
• Several criteria used to measure the quality of floorplans:
a) Minimize area
b) Minimize total length of wire
c) Maximize routability
d) Minimize delays
e) Any combination of above

9
• How to determine area?
– Not difficult.
– Can be easily estimated because the dimensions of each block is
known.
– Area A computed for each candidate floorplan.

10
• How to determine wire length?
– A coarse measure is used.
– Based on a model where all I/O pins of the blocks are merged and
assumed to reside at its center.
– Overall wiring length L = Σi,j (cij * dij)
where cij : number of connections between blocks i and j
dij : Manhattan distances between the
centres of rectangles of blocks i and j

11
• Typical cost function used:
Cost = w1 * A + w2 * L
where w1 and w2 are user-specified parameters.

12
Dead Space
• Dead space is the space that is wasted:

Dead space

• Minimizing area is the same as minimizing deadspace.


• Dead space percentage is computed as: (A - ΣiAi) / A × 100%

13
Slicing Structure
• A rectangular dissection that can be obtained by repeatedly splitting
rectangles by horizontal and vertical lines into smaller rectangles.
• Slicing Tree:
– A binary tree that models a slicing structure.
– Each node represents a vertical cut line (V), or a horizontal cut line (H).
• A third kind of node called Wheel (W) appears for non-sliceable
floorplans (discussed later).
– Each leaf is a basic block (rectangle).

14
A Slicing Floorplan and Two Slicing Trees
V V

c H H H H
b
e f a b c H a b d H
a
d d V c V

e f e f

15
Polish Expression
• V: | H: −
V • For n number of
leaves, length of the
c H H expression is 2n-1
b
e f a b c H AB−CDEF|−−|

a
d d V

e f

16
A Non-Slicing Floorplan
• One that may not be obtained by repetitively subdividing
alone.
• Also called a WHEEL.

b b
c a
e e
a c
d d

17
A Hierarchical Floorplan
d H
b e
g V H
c
a
f H W h i
h
i a b c d e f g

18
Polar Graph Representation
• An alternate graph representation of floorplan.
• Each floorplan is modeled by a pair of directed acyclic graphs:
– Horizontal polar graph
– Vertical polar graph
• For horizontal (vertical) polar graph,
– Vertex represents vertical (horizontal) channel
– Edge indicates that the 2 channels are on 2 sides of a block
– Edge weight indicates the width (height) of the block

19
Polar Graph

Vertical Polar Graph


Example

Horizontal Polar Graph

20
END OF LECTURE 08

21
Lecture 09: FLOORPLANNING ALGORITHMS

PROF. INDRANIL SENGUPTA


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
Floorplanning Algorithms
• Several broad classes of algorithms:
– Integer programming based
– Rectangular dual graph based
– Hierarchical tree based
– Simulated annealing based
– Other variations

2
Integer Linear Programming Formulation
• The problem is modeled as a set of linear equations
using 0/1 integer variables.
• ILP solver used to obtain optimal solution based on a
defined cost function.
• Can be used only for small problem instances.
– Very high computational complexity.

3
Rectangular Dual-Graph Approach
• Basic Concept:
– Output of partitioning algorithms represented by a graph.
– Floorplan obtained by converting the graph into its
rectangular dual.
• The rectangular dual of a graph satisfies the
following properties:
– Each vertex corresponds to a distinct rectangle.
– For every edge, the corresponding rectangles are adjacent.

4
A Rectangular Floorplan & its Dual Graph
• Without loss of generality, we 1
assume that a rectangular 2

floorplan contains no cross 3 4


5
junctions.
• Under this assumption, the dual 1 2
graph of a rectangular floorplan
is a planar triangulated graph
(PTG).
3 4 5

5
Contd.
• Every dual graph of a rectangular floorplan (without cross junction)
is a PTG. 1 4

1 2 Replace by 1 2

4 3 4 3
2 3
• However, not every PTG corresponds to a rectangular floorplan.

Complex
triangle

6
Drawbacks
• An interesting approach to floorplanning.
• The main problem concerns the existence of the rectangular dual,
i.e. the elimination of complex triangles.
– Select a minimum set E of edges such that each complex triangle has at
least one edge in E.
– A vertex can be added to each edge of E to eliminate all complex triangles.
– The weighted complex triangle elimination problem has been shown to be
NP-complete.
• Some heuristics are available.

January 17 CAD for VLSI 7


Hierarchical Approach
• Widely used approach to floorplanning.
– Based on a divide-and-conquer paradigm.
– At each level of the hierarchy, only a small number of rectangles
are considered.
• A small graph, and all possible floorplans.
a
a b c
b c a c a b
c b

8
• After an optimal configuration for the three modules has been
determined, they are merged into a larger module.
• The vertices ‘a’, ‘b’, ‘c’ are merged into a super vertex at the
next level.
• The number of floorplans increases exponentially with the
number of modules ‘d’ considered at each level.
– ‘d’ is thus limited to a small number (typically d < 6).

9
• All possible floorplans for:

d=2

d=3

10
Hierarchical Approach :: Bottom-Up
• Hierarchical approach works best in bottom-up fashion.
• Modules are represented as vertices of a graph, while
edges represent connectivity.
– Modules with high connectivity are clustered together.
• Number of modules in each cluster ≤ d.
– An optimal floorplan for each cluster is determined by exhaustive
enumeration.
– The cluster is merged into a larger module for high-level
processing.

11
• A Greedy Procedure
– Sort the edges in decreasing weights.
– The heaviest edge is chosen, and the two modules of the
edge are clustered in a greedy fashion.
• Restriction: number of modules in each cluster ≤ d.
– In the next higher level, vertices in a cluster are merged,
and edge weights are summed up accordingly.

January 17 CAD for VLSI 12


• Problem
– Some lightweight edges may be chosen at higher levels in the
hierarchy, resulting in adjacency of two clusters of highly incompatible
areas.
• Possible solution
– Arbitrarily assign a small cluster to a neighboring cluster when their
sizes will be too small for processing at a higher level of the hierarchy.

13
Example
a 3 c
2
10 3 10 e
1
b 3 d
a c a c
e
b d b e
d

Greedy clustering Merging small clusters

14
Hierarchical Approach :: Top-Down
• The fundamental step is the partitioning of modules.
– Each partition is assigned to a child floorplan.
– Partitioning is recursively applied to the child floorplans.
• Major issue here is to obtain balanced graph partitioning.
– k-way partitioning, in general.
• Not very widely used due to the difficulty of obtaining
balanced partitions.

15
Simulated Annealing
• Important issues in the design of a simulated annealing
optimization problem:
1. The solution space.
2. The movement from one solution to another.
3. The cost evaluation function.
• A solution by Wong is applicable to sliceable floorplans only.
– Floorplan can be represented by a tree.
– Postfix notations used for easy representation and manipulation.

January 17 CAD for VLSI 16


Some Notations
• Dissection operators defined:
– ijH means rectangle j is on top of rectangle i.
– ijV means rectangle j is on left of rectangle i.
• Normalized Polish expression:
– A Polish expression corresponding to a floorplan is called normalized if it
has no consecutive H’s or V’s.
– Used for the purpose of removing redundant solutions from the solution
space.
• Several Polish expressions may exist for the same slicing floorplan.
• Same floorplan can have more than one slicing tree.

17
Example: Two tree representations of a floorplan

1
3 4
2

12–34|| V V
12–3|4|
H V V 4

1 2 3 4
H 3 Normalized
1 2

18
Parameters of the Algorithm
• Solution Perturbations (Move)
a) Swap two adjacent operands
b) Complement a series of operators between two operands (called a
chain).  V′ = H and H′ = V
c) Swap two adjacent operand and operator.
• We accept a move only it is results in a normalized expression.
– Only move “c” may result in a non-normalized solution.
– We need to check only when move “c” is applied.

19
• Typical cost function:
Cost = w1 * A + w2 * L
where A is the area of the smallest rectangle enveloping the
given basic rectangles, L is the overall interconnection length,
w1 & w2 are user-specified parameters.

20
An Example
12–3–4|
3
Move Move
4 a 3 b 2 4
3 4
1 2 1 2 1

12|4–3| 12|3–4| Move


c

2 4
12–34–| 1 3

21
Rectangular and L-shaped Blocks
• Possible shapes:

• Note that L-shaped


blocks can be
produced by merging
rectangular blocks.
• Can generate non-
slicing floorplans.
END OF LECTURE 09

23
Lecture 10: PIN ASSIGNMENT

PROF. INDRANIL SENGUPTA


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
Introduction
• The purpose is to define the signal that each pin will receive.
• It can be done
– During floorplanning
– During placement
– After placement is fixed
• For un-designed blocks, a good assignment of pins improves
placement.
• If the blocks are already designed, still some pins can be exchanged.

2
Pin Assignment
• Input:
– A placement of blocks.
– Number of pins on each block, possibly an ordering.
– A netlist.
• Requirements:
– To determine the pin locations on the blocks.
• Objectives:
– To minimize net-length.

3
• Functionally equivalent pins:
– Exchanging the signals does not affect the circuit.
• Equipotential pins:
– Both are internally connected and represent the same net.

A C A,B :: functionally equivalent

B D C,D :: equipotential

4
Problem Formulation
• Purpose is to optimize the assignment of nets within a
functionally equivalent (or equipotential) pin groups.
• Objective:
– To reduce congestion or reduce the number of crossovers.

5
Classification of Algorithms
Pin Assignment Algorithms

General Techniques Special Techniques


Concentric Circle Mapping
Channel Pin Assignment
Topological Method
Nine Zone Method

6
Concentric Circle Mapping
Given: Two sets of pins (1) Determine the circles

7
(2) Determine the points

8
(3) Determine initial mapping and (4) optimize the mapping (complete
rotation)

9
(4) Best mapping Final pin assignment

10
Pin Assignment to an External Block B

B B B

m m m

l’

11
Topological Pin Assignment
• Similar to concentric circle mapping.
• Easier to complete pin assignment.
– When there is interference from other components and barriers.
– For nets connected to more than two pins.
• If a net has been assigned to more than two pins, then the pin
closest to the center of the primary component is chosen.
• Pins of primary component are mapped onto a circle as before.
• Beginning at the bottom of the circle, and moving clockwise, the
pins are assigned to nets.

12
Nine Zone Method
• Based on zones in a Cartesian co-ordinate system.
• The center of the co-ordinate system is located inside a
group of interchangeable pins on a component.
– This component is called pin class.
• A net rectangle is defined by each of the nets connected
to the pin class.
– There are nine zones in which this rectangle can be positioned.

13
The Nine Pin Zones
4 7 1

6 5

3 8 2

January 17 14
Channel Pin Assignment
• A significant portion of the chip area is used for channel routing.
– After the placement phase, the position of terminals on the
boundaries of a block are not fixed.
– They may be moved before routing begins.

• Yang & Wong proposed a dynamic programming formulation to the


channel pin assignment problem.

15
Integrated Approach
• Better understanding of the different stages in physical
design automation over the years.
– Attempts are being made to merge some steps of the design cycle.
– For example, floorplanning and placement are considered together.
– Sometimes, placement and routing stages can also be combined
together.

January 17 16
END OF LECTURE 10

17
Lecture 11: PLACEMENT (PART 1)

PROF. INDRANIL SENGUPTA


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
Introduction
• A very important step in physical design cycle.
– A poor placement requires larger area.
– Also results in performance degradation.
• It is the process of arranging a set of modules on the layout
surface.
– Each module has fixed shape and fixed terminal locations.
– A subset of modules may have pre-assigned positions (e.g., I/O pads).

2
Different Wire Length
Different
Routability/Chip Area
Placement can Make a Difference
• Placement of MCNC enchmark circuit e64 (contains 230 4-LUT)
on a FPGA.
Random Initial Placement Final Placement After Detailed Routing
The Placement Problem
• Inputs:
– A set of modules with (a) well-defined shapes, and (b) fixed locations of pins.
– A netlist.
• Requirements:
– Find locations for each module so that no two modules overlap.
– The placement is routable.
• Objectives:
– Minimize layout area.
– Reduce the length of critical nets.
– Completion of routing.

6
Placement Problem at Different Levels
1. System-level placement
– Place all the PCBs together such that
• Area occupied is minimum
• Heat dissipation is within limits.
2. Board-level placement
– All the chips have to be placed on a PCB.
• Area is fixed
• All modules of rectangular shape
– Objective is to: (a) Minimize the number of routing layers,
(b) Meet system performance requirements.

7
3. Chip-level placement
– Normally, floorplanning / placement carried out along with pin
assignment.
– Limited number of routing layers (2 to 4).
• Bad placements may be unroutable.
• Can be detected only later (during routing).
• Costly delays in design cycle.
– Minimization of area.

8
Problem Formulation
• Notations:
B1,B2,…, Bn : modules/blocks to be placed
wi, hi : width and height of Bi, 1 ≤ i ≤ n
N={N1,N2,…,Nm} : set of nets (i.e. the netlist)
Q={Q1,Q2,…,Qk} : rectangular empty spaces for routing
Li : estimated length of net Ni, 1 ≤ i ≤ m

9
• The problem:
Find rectangular regions R={R1,R2,...Rn} for each of the blocks such that
• Block Bi can be placed in region Ri.
• No two rectangles overlap, Ri∩Rj = Φ.
• Placement is routable (Q is sufficient to route all nets).
• Total area of rectangle bounding R and Q is minimized.
• Total wire length ΣLi is minimized.
• For high performance circuits, max {Li | i=1,2,…,m} is minimized.
• General problem is NP-complete.
• Algorithms used are heuristic in nature.

10
Given set
of blocks

Good Bad
Placement Placement

11
Interconnection Topologies
• The actual wiring paths are not known during placement.
– For making an estimation, a placement algorithm needs to model the
topology of the interconnection nets.
• An interconnection graph structure is used.
• Vertices are terminals, and edges are interconnections.
• Estimation of wire length is important.

12
Estimation of Wirelength
• The speed and quality of estimation has a drastic effect on
the performance of placement algorithms.
– For 2-terminal nets, we can use Manhattan distance as an estimate.
– If the end co-ordinates are (x1,y1) and (x2,y2), then the wire length
L =  x1 – x2  +  y1 – y2
• How to estimate length of multi-terminal nets?

13
Modeling of Multi-terminal Nets
1. Complete Graph
• nC = n(n-1)/2 edges for a n-pin net.
2

• A tree has (n-1) edges which is 2/n


times the number of edges of the
complete graph.
• Length is estimated as 2/n times the
sum of the edge weights.

14
2. Minimum Spanning Tree
• Commonly used structure.
• Branching allowed only at pin
locations.
• Easy to compute.

15
3. Rectangular Steiner Tree
• A Steiner tree is the shortest route
for connecting a set of pins.
• A wire can branch from any point
along its length.
• Problem of finding Steiner tree is
NP-complete.

16
4. Semi Perimeter
• Efficient and most widely used.
• Finds the smallest bounding
rectangle that encloses all the pins
to be connected.
• Estimated wire length is half the
perimeter of this rectangle.
• Always underestimates the wire
length for congested nets.

17
END OF LECTURE 11

18
Lecture 12: PLACEMENT (PART 2)

PROF. INDRANIL SENGUPTA


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
Design Style Specific Issues
• The main issues in placement can differ depending
on the design style used.
– For instance, in standard cell based design style, the
floorplanning and placement problems are the same.
• We discuss the main issues relating to the ASIC
design styles:
– Full custom, standard cell, and gate array.

2
• Full Custom
– Placing a number of blocks of various shapes and sizes within a
rectangular region.
– Irregularity of block shapes may lead to unused areas.
– Both floorplanning and placement steps are required.
– May require iterations, where the layout may be modified at each
step.

3
• Standard Cell
– The problem of floorplanning and placement are the same in this design
style.
– Minimization of the layout area means:
• Minimize sum of channel heights.
• Minimize width of the widest row.
• All rows should have equal width.
– Over-the-cell routing leads to almost channel-less standard cell designs.

4
• Gate Arrays
– The problem of partitioning, floorplanning and placement are the
same in this design style.
– For FPGAs, the partitioned sub-circuit may be a complex netlist.
• Map the netlist to one or more basic blocks or LUTs (placement).

5
Classification of Placement Algorithms

Placement Algorithms

Simulation Based Partitioning Based Other


Simulated Annealing Breuer’s Algorithm Cluster Growth
Simulated Evolution Terminal Propagation Force Directed
Force Directed

6
Simulated Annealing
• Simulation of the annealing process in metals or glass.
– Avoids getting trapped in local minima.
– Starts with an initial placement.
– Incremental improvements by exchanging blocks, displacing a block, etc.
– Moves which decrease cost are always accepted.
– Moves which increase cost are accepted with a probability that decreases
with the number of iterations.
• Timberwolf is one of the most successful placement
algorithms based on simulated annealing.

7
Algorithm SA_Placement
Simulated begin
T = initial_temperature;
Annealing P = initial_placement;
while ( T > final_temperature) do
Algorithm while (no_of_trials_at_each_temp not yet completed) do
new_P = PERTURB (P);
∆C = COST (new_P) – COST (P);
if (∆C < 0) then
P = new_P;
else if (random(0,1) > exp(∆C/T)) then
P = new_P;
T = SCHEDULE (T); /** Decrease temperature **/
end

8
TimberWolf
• One of the most successful placement algorithms.
– Developed by Sechen and Sangiovanni-Vincentelli.
• Parameters used:
– Initial_temperature = 4,000,000
– Final_temperature = 0.1
– SCHEDULE(T) = α(T) x T
• α(T) specifies the cooling rate which depends on the current temperature.
• α(T) is 0.8 when the cooling process just starts.
• α(T) is 0.95 in the medium range of temperature.
• α(T) is 0.8 again when temperature is low.

9
The PERTURB Function
• New configuration is generated by making a weighted random
selection from one of the following moves:
M1. The displacement of a block to a new location.
M2. The interchange of locations between two blocks.
M3. An orientation change for a block.
– Mirror image of the block’s x-coordinate.
– Used only when a new configuration generated using alternative M1 is
rejected.

10
Illustration of the Moves
.

M1 .
M2

1 2 2 1 1 2
Axis of
M3 reflections

3 4
3 4 3 4
Move Selection
• Timberwolf first tries to select a move between M1 and M2.
Prob(M1) = 4/5
Prob(M2) = 1/5
• If a move of type M1 is chosen (for certain module) and it is rejected, then a
move of type M3 (for the same module) will be chosen with probability 1/10.
• Restriction on:
o How far a module can be displaced
o What pairs of modules can be interchanged
Move Restriction
Range Limiter:
• At the beginning, R is very large, big enough to contain the whole chip.
• Window size shrinks slowly as the temperature decreases. In fact, height and width of R
∝ log(T).
• Stage 2 begins when window size are so small that no inter-row modules interchanges
are possible.

Rectangular window R
The COST Function
• The cost of a solution is computed as:
COST = cost1 + cost2 + cost3
where cost1 : weighted sum of estimated length of all nets
cost2 : penalty cost for overlapping
cost3 : penalty cost for uneven length among
standard cell rows.
– Overlap is not allowed in placement.
– Computationally complex to remove all overlaps.
– More efficient to allow overlaps during intermediate placements.
• Cost function (cost2) penalizes the overlapping.

14
Summary
• Timberwolf is one of the very successful placement tools.
• Gives good placement for standard cell based designs.

15
END OF LECTURE 12

16
Lecture 13: PLACEMENT (PART 3)

PROF. INDRANIL SENGUPTA


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
Force Directed Placement
• Explores the similarity between placement problem and
classical mechanics problem of a system of bodies attached to
springs.
• The blocks connected to each other by nets are supposed to
exert attractive forces on each other.
– Magnitude of this force is directly proportional to the distance
between the blocks.
• Analogous to Hooke’s law in mechanics.
– Final configuration is one in which the system achieves equilibrium.

2
• A cell i connected to several cells j experiences a total force
Fi = Σj (wij * dij)
where wij is the weight of connection between i and j
dij is the distance between i and j.
• If the cell i is free to move, it would do so in the direction of force
Fi until the resultant force on it is zero.
• When all cells move to their zero-force target locations, the total
wire length is minimized.

3
• For cell i, if (xi0, yi0) represents the zero-
force target location, by equating the x- and
y-components of the force to zero, we get

• Solving for xi0 and yi0, we get

• Care should be taken to avoid assigning


more than one cell to the same location.

4
Example

• A circuit with one gate and four I/O pads.


• The four pads are to be placed on the four corners of a 3x3 grid.
• The weights of the wires connected to the gate are: wvdd=8, wout=10,
win=3, and wgnd=3.
• Find the zero-force target location of the gate inside the grid.

5
6
7
• The zero-force location for the gate is (1.083, 1.50) that can be
approximated to the grid location (1,2).

8
Force Directed Approach for Constructive Placement

• The basic approach can be generalized for constructive


placement.
– Starting with some initial placement, one module is selected at a
time, and its zero-force location Fi computed.
– The process can be iterated to improve upon the solution obtained.
– The order of the cells can be random or driven by some heuristic.
• Select the cell for which Fi is maximum.

9
• If the zero-force location is occupied by another cell q, then several
options to place the cell p under consideration exist.
1. Move p to a location close to q.
2. Evaluate the change in cost if p is swapped with q. If the cost decreases, only
then is the swap made.
3. Ripple move: The cell p is placed in the computed location, and a new zero-
force location is computed for the displaced cell q. The procedure is
continued until all the cells are placed.
4. Chain move: The cell p is placed in the computed location, and the cell q is
moved to an adjacent location. If the adjacent location is occupied by a cell r,
then r is moved to its adjacent location, and so on, until a free location is
finally found.

10
END OF LECTURE 13

11
Lecture 14: PLACEMENT (PART 4)

PROF. INDRANIL SENGUPTA


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
Breuer’s Algorithm
• Partitioning technique used to generate placement.
• The given circuit is repeatedly partitioned into two sub-
circuits.
– At each level of partitioning, the available layout area is partitioned
into horizontal and vertical subsections alternately.
– Each of the sub-circuits is assigned to a subsection.
– Process continues till each sub-circuit consists of a single gate, and
has a unique place on the layout area.

2
• Several cut-oriented sequences have been proposed.
– Cutsize is minimized during partitioning.
• We shall illustrate two alternate cut sequences proposed by
Breuer:
1. Quadrature mincut placement
2. Recursive bipartitioning mincut placement

3
An Example Block
Level Netlist

• The thick edges have a weight of 1, and the thin edges


have a weight of 0.5.

4
Quadrature Mincut Placement
• The layout is divided into 4 units with two cutlines, one
vertical and one horizontal, both passing through the center.
• The above division procedure is then recursively applied to
each quarter of the layout cut until the entire layout is
divided into slots of desired size.

5
Recursive Bipartitioning Mincut Placement
• The layout is repeatedly divided recursively using horizontal
and vertical cutlines as illustrated.

7
Terminal Propagation Algorithm
• Partitioning algorithms merely reduce net cut.
• Direct use of partitioning algorithms would increase net length.
– Also increases congestion in the channels.
• To prevent this, terminal propagation is used.
– When a net connecting two terminals is cut, a dummy terminal is
propagated to the nearest pin on the boundary.
– When this dummy terminal is generated, the partitioning algorithm will
not assign the two terminals in each partition into different partitions, as
this would not result in a minimum cut.

10
Illustration :: Terminal Propagation
B

A
B A

A A
B B
:: Dummy terminal :: Terminal

11
Cluster Growth
• In this constructive placement algorithm, bottom-up
approach is used.
• Blocks are placed sequentially in a partially completed
layout.
– The first block (seed) is usually placed by the user.
– Other blocks are selected and placed one by one.
• Selection of blocks is usually based on connectivity with
placed blocks.

12
Contd.
• Layouts produced are not usually good.
– Does not take into account the interconnections and other circuit
features.
• Useful for generating initial placements.
– For iterative placement algorithms.

13
Algorithm Cluster_Growth
begin
B = set of blocks to be placed;
Select a seed block S from B;
Place S in the layout;
B = B – S;
while (B ≠ φ) do
begin
Select a block X from B;
Place X in the layout;
B = B – X;
end;
end

14
Performance Driven Placement
• The delay at chip level plays an important role in determining
the performance of the chip.
– Depends on interconnecting wires.
• As the blocks in a circuit becomes smaller and smaller:
– The size of the chip decreases.
– Interconnection delay becomes a major issue in high-performance
circuits.
• Placement algorithms for high-performance chips:
– Allow routing of nets within timing constraints.

15
• Two major categories of algorithms:
1. Net-based approach
• Try to route the nets to meet the timing constraints on the individual nets
instead of considering paths.
• The timing requirement for each net has to be decided by the algorithm.
• Usually a pre-timing analysis generates the bounds on the net-lengths
which must be satisfied during placement.
2. Path-based approach
• Critical paths in the circuit are considered.
• Try to place the blocks in a manner that the path length is within the
timing constraint.

16
END OF LECTURE 14

17

You might also like