# VLSI Physical Design Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

# Lecture - 41 Performance-Driven Design Flow

So, to summarize, we have earlier looked at the various physical design automation steps like partitioning, floorplanning, placement, routing and so on. Later on, we looked at some of the performance-driven issues: we looked at performance-driven placement and routing, and also some techniques for physical synthesis, that is, how we can make some timing corrections. So, now let us try to look at the overall picture for this kind of high-performance design, where these performance-driven issues are important, and see what our so-called performance-driven design flow should be. So, we look into the overall picture in this lecture.

(Refer Slide Time: 01:09)



So, let us have a quick look at the so-called performance-driven physical design flow, which is more or less standard practice for modern-day VLSI circuits. So, here what we are trying to tell you is that whatever techniques we have looked at, let us try to combine them into a consistent design flow. Now, here we shall see that there are a few things like static timing analysis which we do not consider as a step; we consider it as a tool, and this tool can be used a number of times in different phases of the process. Similarly, buffering is a very common technique which may have to be done multiple times: there can be an initial level of buffering, and maybe after one cycle we feel that some timings are still not satisfied, so you may have to do some more buffering. So, there are a lot of iterative steps which need to be carried out, and even at the end you may find that a few things have still not been met, and you may have to do a few more corrections there.

(Refer Slide Time: 02:33)



So, let us revisit the typical physical design flow once more. Now, the typical physical design flow starts with something called chip planning. Well, we did not use this term earlier; we called it floorplanning and placement. But in chip planning we include 3 different things. Let me just try to highlight them.

(Refer Slide Time: 03:05)



So, chip planning, which is also a very commonly used phrase, includes the following. First is a step called I/O placement. I/O placement means that for this chip we have to place the I/O pins, that is, decide which pin should be carrying which signal. Then we can have floorplanning. Of course, I said that most of the circuits today are based on standard cells, and floorplanning and placement are almost identical in those cases. So, in floorplanning we are telling which standard cell has to be placed in which row. And there is another thing which is done together with it: power planning.

Because now that we have the standard cells already laid out, we know where the power connections are needed: VDD and ground here, VDD and ground here, like that. So, you also plan overall, with whichever technique you use, how to supply the power and ground lines. So, this kind of power network planning you also do. So, this entire process is sometimes referred to as chip planning: overall, how does the chip look, what signals will the pins be carrying, which cells will be put on which rows, and how will the power network look that provides power to the different rows and the cells. So, here there is something called trial synthesis, which is the initial thing that you have. This provides the floorplanning tool with an estimate of the total area that is needed, and in this estimate you should make provision for buffers, for additional space for routing, and for gate sizing, since some gates may have to be made larger. So, you have to keep some additional space for these things.
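The idea of padding the trial-synthesis area estimate can be sketched in a few lines of Python. This is only an illustration: the function name `estimate_floorplan_area` and the three overhead fractions are assumptions chosen for the example, not values from the lecture or from any tool.

```python
def estimate_floorplan_area(cell_areas, buffer_fraction=0.10,
                            sizing_fraction=0.05, routing_fraction=0.30):
    """Pad the raw standard-cell area with provisions for later stages.

    All three fractions are hypothetical defaults for illustration only.
    """
    cell_area = sum(cell_areas)
    # Space reserved for buffers to be inserted and gates to be made larger.
    reserve = cell_area * (buffer_fraction + sizing_fraction)
    # Routing space taken as a fraction of the padded cell area.
    return (cell_area + reserve) * (1.0 + routing_fraction)


area = estimate_floorplan_area([100.0, 250.0, 150.0])
print(round(area, 2))  # 747.5
```

With 500 units of raw cell area, the reserve adds 75 units and the routing allowance scales the total to about 747.5 units, so the floorplan is sized well above the bare cell area.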

So, earlier we did not talk about all these things, but now we are saying that we need to keep provisions for all the things which need to be added in later stages of the design flow. Then there is logic synthesis and technology mapping, which produces a gate-level or cell-level netlist. When you are doing floorplanning, sometimes these are done together, or they can also be done earlier.

(Refer Slide Time: 06:16)



So, after this you can carry out global placement. Global placement, as I have said, places the blocks or the cells in the rows of the standard cell layout. Now again, I have said that in a typical VLSI design flow you cannot do things in one go; it is a repetitive process. So, you make a placement, and you may find that something is not working properly: there is a lot of congestion somewhere, there is a lot of delay in some places. So, you may have to keep changing the placement.

So, here what I am saying is that the process of global placement typically assigns locations to the objects. What this means I shall be showing in a slide.

(Refer Slide Time: 07:25)



So, here we look at the clustering information and try to spread the cells uniformly across the chip. What I am saying is that you may find that initially your chip plan looks like this, where the dark region means there is a lot of clustering and congestion; the global placement says that many of the cells will be mapped here. So, as you proceed, this region slowly spreads across the whole chip, and finally you will be getting something like this. So, your whole placement must be spread across the entire area of the chip. I am saying this very roughly.
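One rough way to quantify how clustered a placement is would be to divide the chip into bins and measure the cell area falling into each bin. The sketch below assumes cells are given as hypothetical `(x, y, area)` tuples and that a bin utilization above a target of 1.0 counts as overflow; both are illustrative assumptions, not the method of any particular placer.

```python
def bin_density(cells, chip_w, chip_h, nx, ny):
    """Return an ny-by-nx grid of bin utilizations (cell area / bin area)."""
    bin_w, bin_h = chip_w / nx, chip_h / ny
    density = [[0.0] * nx for _ in range(ny)]
    for x, y, area in cells:
        # Map the cell centre to a bin, clamping cells on the chip boundary.
        col = min(int(x / bin_w), nx - 1)
        row = min(int(y / bin_h), ny - 1)
        density[row][col] += area / (bin_w * bin_h)
    return density


def max_overflow(density, target=1.0):
    """Largest amount by which any bin exceeds the target utilization."""
    return max(max(0.0, d - target) for row in density for d in row)


# Four cells of area 30 on a 20 x 20 chip split into 2 x 2 bins.
clustered = [(1, 1, 30), (2, 2, 30), (3, 1, 30), (1, 3, 30)]
spread = [(5, 5, 30), (15, 5, 30), (5, 15, 30), (15, 15, 30)]
overflow_before = max_overflow(bin_density(clustered, 20, 20, 2, 2))
overflow_after = max_overflow(bin_density(spread, 20, 20, 2, 2))
print(overflow_before > overflow_after)  # True
```

In the clustered case one bin is overfull while the others are empty; after spreading, every bin is at 30 percent utilization and the overflow is zero.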

(Refer Slide Time: 07:58)



Then, after the global placement is done: you see, you cannot synthesize the clock network before you know where your flip-flops are located. So, once you have done global placement, you know where your registers are, where your flip-flops, the sequential elements, are. So, with that information you can proceed to synthesize the clock network. So, this clock network, as we have seen earlier, can be either a simple clock tree, like an H-tree or an MMM tree or something like that, or you can have a more flexible network consisting of a tree at the higher level and a very regular mesh at the lower level. Normally in processors we have this kind of hybrid, because in a processor chip there are a lot of points where you need to carry the clock signal; there are a large number of target or sink points for the clock signal.

So, we normally have a mesh at the lower level and a regular tree at the upper level, usually an H-tree.
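As a reminder of what the regular upper-level tree looks like, here is a minimal recursive sketch that generates the wire segments and tap points of an ideal H-tree. The function `h_tree` and its arguments are hypothetical names for this example; a real clock tree would also account for buffers, blockages and the actual flip-flop locations.

```python
def h_tree(cx, cy, half_w, half_h, levels):
    """Return (segments, sinks) of an ideal H-tree centred at (cx, cy).

    Each level draws one 'H': a horizontal bar and two vertical bars,
    then recurses at the four tips with half the dimensions.
    """
    if levels == 0:
        return [], [(cx, cy)]
    segments = [((cx - half_w, cy), (cx + half_w, cy))]  # horizontal bar
    sinks = []
    for sx in (cx - half_w, cx + half_w):
        segments.append(((sx, cy - half_h), (sx, cy + half_h)))  # vertical bar
        for sy in (cy - half_h, cy + half_h):
            sub_segments, sub_sinks = h_tree(sx, sy, half_w / 2, half_h / 2,
                                             levels - 1)
            segments += sub_segments
            sinks += sub_sinks
    return segments, sinks


segments, sinks = h_tree(0.0, 0.0, 4.0, 4.0, levels=2)
print(len(sinks), len(segments))  # 16 15
```

Because every sink is reached through the same sequence of segment lengths, the wire length from the root to each tap point is identical, which is why this structure gives low skew in the ideal case.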



(Refer Slide Time: 09:11)

So, this is just a sample slide showing a buffered clock tree in a small processor design, where these rectangles indicate the places where buffers have been inserted, the lines indicate the connections, and the x marks indicate the points where the clock has been taken.

(Refer Slide Time: 09:39)



So, after placement is done, the locations of the cells are aligned to a grid. You see, during placement we normally do not have the grid concept; you can place a cell virtually anywhere. But when you do routing you often have a row and column concept: tracks and columns. So, there you have to align everything to a grid. So, after this initial level of placement is over, you have to do this alignment of the blocks or cells to the locations of a uniform grid. And after you have done this, you can proceed to global routing and layer assignment, where layer assignment says, for each of the routes, which layer will be used. These layers are typically metal layers, which means it says which metal layer will be used to connect these nets.
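The grid alignment itself can be pictured as rounding each cell position to the nearest grid point. The sketch below assumes a hypothetical horizontal site pitch and row height; it deliberately ignores overlap removal, which a real legalization step would also have to handle.

```python
def snap_to_grid(cells, pitch_x, row_height):
    """Round each cell's (x, y) to the nearest site and row of a uniform grid.

    cells maps a cell name to its continuous (x, y) position.
    Overlaps created by snapping are NOT resolved in this sketch.
    """
    snapped = {}
    for name, (x, y) in cells.items():
        grid_x = round(x / pitch_x) * pitch_x
        grid_y = round(y / row_height) * row_height
        snapped[name] = (grid_x, grid_y)
    return snapped


placed = {"u1": (10.3, 7.9), "u2": (24.8, 16.2)}
aligned = snap_to_grid(placed, pitch_x=2, row_height=8)
print(aligned)  # {'u1': (10, 8), 'u2': (24, 16)}
```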

Now, a global router typically completes the routing on a single layer. Take Lee's algorithm or Hadlock's algorithm: they will try to find a route on a single metal layer. Now, when you do routing, you may also land up with something called wiring congestion; you may find that there are many areas in the chip where the wiring congestion is pretty bad. So, you may have to iterate here again. So, you do something like congestion-driven detailed placement, where you again modify your placement based on this information. I am just showing an example.
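To recall how a single-layer maze router such as Lee's algorithm finds a route, here is a minimal breadth-first sketch on a small grid. The grid encoding (`'#'` for blocked, `'.'` for free) and the function name `lee_route` are assumptions made for this example.

```python
from collections import deque


def lee_route(grid, source, target):
    """Breadth-first wave expansion on one layer, then backtrace.

    grid is a list of strings; '#' marks a blocked cell.
    Returns a shortest list of (row, col) cells, or None if unroutable.
    """
    rows, cols = len(grid), len(grid[0])
    prev = {source: None}          # also serves as the visited set
    queue = deque([source])
    while queue:
        cell = queue.popleft()
        if cell == target:
            path = []
            while cell is not None:  # walk back along the predecessors
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != '#' and (nr, nc) not in prev):
                prev[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


layer = [".....",
         ".###.",
         "....."]
route = lee_route(layer, (1, 0), (1, 4))
print(len(route))  # 7 cells: the route detours around the obstacle
```

The wave expands one grid step at a time, so the first time the target is reached the backtraced path is guaranteed to be a shortest one on that layer.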

(Refer Slide Time: 11:39)



This shows the area of the chip, and these peaks indicate the levels of congestion; there is an area where a lot of congestion is there. Now, with each iteration we try to reduce the congestion, and you will see finally that there is almost no congestion; the cells are uniformly placed.

Here I refer to congestion with respect to routing. So, if there is congestion, it is quite likely that during the process of routing you will find that you are not able to complete the routing at all. Such a scenario should not occur. So, you should do something which I just mentioned, called congestion-aware detailed placement. So, you modify the placement again such that the congestion values improve to a significant extent.
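A congestion picture like the one described can be approximated by counting, for every routing grid cell, how many nets pass through it and comparing that demand with the available tracks. The sketch below assumes each route is a list of grid cells and that every cell has the same hypothetical `capacity`; real routers usually track demand per cell edge and per layer.

```python
from collections import Counter


def congestion_map(routes, capacity):
    """Return {cell: overflow} for every grid cell whose demand exceeds capacity."""
    demand = Counter()
    for path in routes:
        for cell in set(path):   # a net needs one track in each cell it crosses
            demand[cell] += 1
    return {cell: d - capacity for cell, d in demand.items() if d > capacity}


routes = [
    [(0, 0), (0, 1), (0, 2)],
    [(1, 1), (0, 1)],
    [(0, 1), (0, 2)],
]
hot_spots = congestion_map(routes, capacity=2)
print(hot_spots)  # {(0, 1): 1}
```

An empty result corresponds to the final picture on the slide, where no region of the chip demands more tracks than it has.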

(Refer Slide Time: 12:41)



So, after that you proceed to detailed routing. Global routing gives you approximate routes; detailed routing will assign the exact horizontal and vertical metal layers for these routes. Now, there is an additional optional step for these routes, the wires, which have been generated. You may have to go through another iteration, which consists of reliability, manufacturability and electrical verification. Let me tell you what reliability refers to. You have seen the process of channel routing: there are horizontal tracks, vertical tracks, and there are vias that connect points across layers. So, you can have two-layer or multi-layer channel routing. Now, a via is a source of unreliability, you can say, because you are ultimately drilling a hole and making a connection across two layers. So, the higher the number of vias, the higher the possibility that the chip can fail because of some misconnection, loose connection, wrong connection or something like that.

So, one objective may be to reduce the number of bends in the connections, which may indicate the number of vias. Also, many a time, when routing is carried out on the same layer, instead of using sharp bends you use something like a 45-degree bend, because sharp bends are places where the metal is likely to break and form a disconnection. These are the things which come under reliability and manufacturability, and they have to be looked at after everything is done. And electrical verification means that after you have laid out the wires, you look at the wires and check whether the separation between them is sufficient and whether their width is sufficient; there is a fairly simple set of design rules, you can say. All those design rules are verified by a design rule checker, which can tell you: here are the places where I find some violations, please correct these. In that case you may have to go back and make some corrections where the rules are getting violated.

So, after everything is done you proceed to mask generation, where every circuit element and interconnection is represented by rectangles: rectangles on various layers, polysilicon, diffusion, metal 1, metal 2, metal 3 and so on.
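The width and separation checks done by a design rule checker can be illustrated on a handful of rectangles. This is a minimal sketch under stated assumptions: all rectangles are on one layer, given as hypothetical `(x1, y1, x2, y2)` tuples, touching or overlapping shapes are treated as connected, and the two rule values are made up for the example.

```python
from itertools import combinations


def check_design_rules(rects, min_width, min_spacing):
    """Return a list of width and spacing violations on a single layer."""
    violations = []
    for name, (x1, y1, x2, y2) in rects.items():
        if min(x2 - x1, y2 - y1) < min_width:
            violations.append(("width", name))
    for (a, ra), (b, rb) in combinations(rects.items(), 2):
        # Horizontal and vertical gaps; zero when the projections overlap.
        dx = max(rb[0] - ra[2], ra[0] - rb[2], 0)
        dy = max(rb[1] - ra[3], ra[1] - rb[3], 0)
        gap = (dx * dx + dy * dy) ** 0.5
        # gap == 0 means the shapes touch or overlap: treated as connected.
        if 0 < gap < min_spacing:
            violations.append(("spacing", a, b))
    return violations


wires = {
    "w1": (0, 0, 10, 2),
    "w2": (0, 3, 10, 5),    # only 1 unit above w1
    "w3": (0, 10, 10, 11),  # only 1 unit wide
}
print(check_design_rules(wires, min_width=2, min_spacing=2))
# [('width', 'w3'), ('spacing', 'w1', 'w2')]
```

The pairwise loop is quadratic in the number of rectangles, which is fine for an illustration but not for a real layout; production checkers rely on spatial indexing to keep the checks local.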

(Refer Slide Time: 15:32)



So, let us look at the overall flow in the form of a flow chart. So, in the first step we assume that chip planning is already done. Chip planning is given as input to the physical design, and logic design is also done. So, you do some kind of block-level placement; as output you are generating a block-level or top-level global placement. So, what are you doing here? Chip planning means you have already done I/O placement and you have already done power planning. Now you are doing performance-driven trial synthesis and floorplanning; this is a new step which we include in the performance-driven design flow. So, what does this involve? It consists of block shaping, sizing and placement, and as I had said earlier, we can assign weights to the nets depending on their criticality, that is, their slack values. Then you can analyze the global net routes, and for the ones where the slack values are negative you use some buffering. And at this level you can do some approximate timing estimation using some timing analysis.

So, if it passes, you move on to the next step; if it fails, you go back and modify your placement. So, as I had said, this is not a single-pass process; it is a continuous iterative process, and several iterations are carried out. This is the first step.
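The idea of weighting nets by criticality can be sketched as follows. The particular formula, a base weight of 1 plus a term that grows as slack shrinks, and the parameter `alpha` are assumptions for illustration; placers use many different weighting schemes.

```python
def net_weights(slacks, clock_period, alpha=2.0):
    """Map each net's slack to a placement weight.

    Nets with slack >= clock_period get weight 1.0; the weight grows
    linearly as slack decreases, and keeps growing for negative slack.
    The formula and alpha are hypothetical choices for this sketch.
    """
    weights = {}
    for net, slack in slacks.items():
        criticality = max(0.0, 1.0 - slack / clock_period)
        weights[net] = 1.0 + alpha * criticality
    return weights


weights = net_weights({"n_fast": 10.0, "n_mid": 5.0, "n_bad": -2.0},
                      clock_period=10.0)
print(sorted(weights, key=weights.get, reverse=True))
# ['n_bad', 'n_mid', 'n_fast']
```

On the next placement iteration the nets with the largest weights are pulled together most strongly, which is how the timing information feeds back into the placement.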

[Slide: flowchart from block-level or top-level global placement to physical synthesis, with boxes for global placement with optional net weights, delay estimation using buffers, virtual buffering and buffer insertion.]

(Refer Slide Time: 17:16)

Then, from block-level global placement you move to physical synthesis. So, what do you do? You start with a global placement, of course with optional net weights, then you do some delay estimation, where you can use buffers, and then you carry out static timing analysis again. If it fails, you again go back and change the placement. Now, for this delay estimation using buffers, the details are shown here. So, what do you do? You do some physical or virtual buffering.

Physical buffering means you actually introduce some buffers. Virtual buffering means you indicate that here you may require a buffer, but you do not introduce it right away; you may need to introduce it later. So, in physical buffering you are actually introducing the circuits, the buffers, the inverters. So, here you have to use an obstacle-avoiding global net topology, because when you are inserting the buffers you have to connect them to the input and output lines, you have to see where circuits and obstacles are already there, and you have to lay the nets accordingly. So, layer assignment and buffer insertion are the important steps here.
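Virtual buffering amounts to estimating what the delay of a net would be if buffers were placed on it, without actually inserting them. A minimal sketch using the Elmore delay model is given below. It assumes a uniform wire with per-unit resistance `r` and capacitance `c`, equally spaced identical buffers, and a driver and a load that look like the same buffer; all numeric values are unit-free and made up for the example.

```python
def segment_delay(length, r, c, r_drv, c_load):
    """Elmore delay of a driver (r_drv) driving a wire segment and a load."""
    wire_r, wire_c = r * length, c * length
    return r_drv * (wire_c + c_load) + wire_r * (wire_c / 2 + c_load)


def buffered_delay(length, n_buffers, r, c, r_buf, c_buf, t_buf):
    """Estimated delay of a wire split into n_buffers + 1 equal stages."""
    stages = n_buffers + 1
    stage = segment_delay(length / stages, r, c, r_buf, c_buf)
    return stages * stage + n_buffers * t_buf


# Hypothetical unit-free parameters, chosen only to show the trend.
params = dict(r=1.0, c=1.0, r_buf=1.0, c_buf=1.0, t_buf=1.0)
delays = [buffered_delay(10.0, n, **params) for n in range(10)]
best = min(range(10), key=lambda n: delays[n])
print(delays[0], delays[best], best)  # 71.0 39.0 4
```

The unbuffered delay grows quadratically with wire length, so a few buffers reduce it sharply, while too many buffers add more intrinsic delay than they save. Since this estimate needs only the net length, it can be computed before any buffer is physically placed.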

(Refer Slide Time: 18:46)



So, next comes the last part of physical synthesis, before we go to routing. Here timing correction is one step: wherever there is a negative slack you take some corrective action, maybe using the methods I mentioned some time back. And again you use static timing analysis; if there are some violations, you repeat this process. So, here timing correction can involve timing-driven restructuring, gate sizing, Boolean restructuring and pin swapping, and redesign of fanin trees and fanout trees, all the techniques that you have seen in the last lectures. So, you do all these things so that whatever timing constraint was there is met, and then physical synthesis is done.
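Since static timing analysis is the check that drives each round of timing correction, a small sketch of the slack computation may help. It assumes a combinational gate graph given as a hypothetical `fanin` dictionary, a single delay per gate, zero wire delay, and that every primary output must settle within one clock period.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def static_timing(gate_delay, fanin, clock_period):
    """Return {gate: slack}, where slack = required time - arrival time."""
    order = list(TopologicalSorter(fanin).static_order())  # fanins come first
    arrival = {}
    for gate in order:
        latest_input = max((arrival[p] for p in fanin.get(gate, ())), default=0)
        arrival[gate] = latest_input + gate_delay[gate]
    required = {gate: clock_period for gate in order}
    for gate in reversed(order):  # fanouts are finalized before their fanins
        for p in fanin.get(gate, ()):
            required[p] = min(required[p], required[gate] - gate_delay[gate])
    return {gate: required[gate] - arrival[gate] for gate in order}


gate_delay = {"a": 2, "b": 3, "c": 4, "d": 1}
fanin = {"a": [], "b": [], "c": ["a", "b"], "d": ["c"]}
slack = static_timing(gate_delay, fanin, clock_period=7)
print(slack["a"], slack["b"], slack["c"], slack["d"])  # 0 -1 -1 -1
```

Here the path through `b`, `c` and `d` takes 8 time units against a period of 7, so every gate on it has slack -1 and would be a candidate for gate sizing, restructuring or buffering, after which the analysis is run again.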

(Refer Slide Time: 19:42)



So, now this is the last step. You finalize the locations of the sequential elements, and once they are finalized you synthesize the clock network. After the clock network is synthesized, you do global routing and layer assignment once more, along with timing-driven and congestion-driven detailed placement, and you check the timing here using static timing analysis. If it fails, again you go back and modify things, but if the timing analysis passes then you proceed to timing-driven routing.

So, during this routing you may again have to insert buffers, and you may have to carry out some timing correction mechanisms here. Once everything is done, you do detailed routing, then you can do parasitic extraction and some additional steps of simulation, and finally you proceed to sign-off.

(Refer Slide Time: 20:54)



So, what do you do in sign-off? You check, as I said, manufacturability, electrical properties and reliability. If everything passes, only then do you proceed to generate the masks for fabrication. But if it fails, then you go for a step called ECO placement and routing. ECO stands for engineering change order. This refers to last-minute changes; that means you are in a step which is just one step before the final fabrication. So, at this step you find that, well, with respect to manufacturability or reliability there are some problems, so let me make some small corrections. You go back, make the small corrections, and run timing analysis again; if it passes, fine; if it fails, again you make some small changes.

So, in this step what are the things that are done? Design rule checking is one thing, which I have just mentioned. Then there is layout versus schematic: you have a layout and you had the schematic diagram initially, so you can make a comparison to see whether they match. Then there are some electrical effects like antenna effects, and there is electrical rule checking; you may have to do all these things at this step. And there are some well-defined templates using which these checks can be carried out. Because one thing you should understand: ultimately your layout is a huge thing, there are millions and millions of rectangles, and you will have to do these checks over all of these rectangles. Unless it is a feasible and simple process, you really cannot do it.
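The layout versus schematic comparison can be pictured, in a very simplified form, as checking that both descriptions connect the same pins together. The sketch below assumes both netlists are dictionaries from a net name to `(instance, pin)` pairs and that instance names match on both sides; a real LVS tool extracts the netlist from the layout geometry and matches the two graphs without relying on names.

```python
def compare_netlists(schematic, layout):
    """Compare connectivity only; net names are ignored.

    Each netlist maps a net name to an iterable of (instance, pin) pairs.
    """
    schematic_nets = {frozenset(pins) for pins in schematic.values()}
    layout_nets = {frozenset(pins) for pins in layout.values()}
    return {
        "missing_in_layout": schematic_nets - layout_nets,
        "extra_in_layout": layout_nets - schematic_nets,
    }


schematic = {
    "n1": [("u1", "Y"), ("u2", "A")],
    "n2": [("u2", "Y"), ("u3", "A")],
}
layout = {
    "netA": [("u2", "A"), ("u1", "Y")],
    "netB": [("u2", "Y"), ("u3", "B")],  # wired to the wrong pin of u3
}
report = compare_netlists(schematic, layout)
print(len(report["missing_in_layout"]), len(report["extra_in_layout"]))  # 1 1
```

Both result sets being empty would mean the layout matches the schematic under these assumptions; here the miswired pin shows up as one missing and one extra net.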

So, I think with this we come to the end of this lecture. In the next lecture we shall be seeing some additional timing correction methods, and we shall have a relook at the insertion of buffers and drivers, the design of drivers and buffers, etcetera, so that we can use that information to correlate with whatever we have said just now.

Thank you.