## INDIAN INSTITUTE OF TECHNOLOGY GUWAHATI NPTEL ONLINE CERTIFICATION COURSE An Initiative of MHRD

**VLSI Design, Verification & Test** 

Dr. Arnab Sarkar Dept. of CSE IIT Guwahati

Welcome. In the previous lectures of this course, we looked at synthesis problems where area and latency (or performance) were the constraints. In this module, we will look at synthesis problems where power is a constraint.

(Refer Slide Time: 00:46)



So we will first look at the scheduling problem when power is a constraint. Power is consumed by changes in register content, use of functional units, buses, etc. Now, if operations are performed simultaneously, their power consumption adds up within that step. What is the measure of the power consumed by a functional unit? At any c-step, a functional unit can be active or idle. When it is active, that is, when it is executing some operation, it consumes dynamic power. When the functional unit is idle, it consumes static power. The peak power is given by the c-step that consumes the maximum power: among all time steps, it is the one with the maximum power, combining both dynamic power and static power.

So we take the summation of the dynamic power over all resources and the summation of the static power over all resources. The c-step at which this total is maximum gives the peak power, and the objective of scheduling is to minimize the peak power.
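The peak-power definition above can be written compactly. The symbols are notation introduced here for illustration and are not taken from the slides: $R$ is the set of functional units, $T$ the set of c-steps, $P^{\mathrm{dyn}}_r$ and $P^{\mathrm{stat}}_r$ the dynamic and static power of unit $r$, and $a_{r,t} \in \{0, 1\}$ indicates whether unit $r$ is active at c-step $t$.

$$
P(t) = \sum_{r \in R} \left[ a_{r,t}\, P^{\mathrm{dyn}}_r + (1 - a_{r,t})\, P^{\mathrm{stat}}_r \right],
\qquad
P_{\mathrm{peak}} = \max_{t \in T} P(t)
$$

Under this notation, power-aware scheduling looks for a schedule (a choice of the $a_{r,t}$ that respects the data dependencies) that minimizes $P_{\mathrm{peak}}$.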

Power Aware Synthesis - Scheduling

(Refer Slide Time: 01:59)

Now we will look at this scheduling problem. Here we are given this data flow graph. This data flow graph requires one multiplier functional unit, one AND functional unit, and two adder functional units. See that this addition operation and these two addition operations are at two different time steps, and hence the same adder instance is sufficient for the operations in this time step and this time step. Therefore we only require two adders. So this is a schedule comprising one multiplier, one AND unit, and two adders.

Now we take another schedule in which we move the AND operation to time step 2. When this AND is shifted from c-step 1 to c-step 2, we may save some power, because the multiplier consumes more power than the adder. The summation of the powers consumed by the AND and the multiplication in c-step 1 may be more than the summation of the powers consumed when the AND is placed together with the additions in c-step 2. Therefore, when we bring this AND operation from c-step 1 to c-step 2, it may result in a lower peak power.

(Refer Slide Time: 04:01)



And hence this AND, when associated with the additions, will give a lower power value than when it is associated with the multiplication in the first time step. Now, in the third schedule, an add operation is shifted to time step 1. This certainly increases the total power at time step 1, but what we see here is that we now require only one adder instance. Because we require only one adder resource, the static power at each step will be reduced, and therefore this may result in reducing the peak power.
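The comparison of the three schedules can be sketched numerically. The data flow graph is on the slide and is not reproduced in the transcript, so the operation placement in `SCHEDULES` and every figure in `POWER` are hypothetical, chosen only to illustrate the trend described above. The sketch assumes that the number of units of each type equals the maximum number used in any single c-step, and that an idle unit contributes its static power.

```python
from collections import Counter

# Hypothetical (dynamic, static) power per unit type, in arbitrary units.
POWER = {"mul": (20, 4), "add": (8, 5), "and": (4, 1)}


def step_powers(schedule):
    """Total power at each c-step.

    schedule is a list of c-steps; each c-step is a list of the
    operation types executed in that step.
    """
    # Units needed per type = maximum simultaneous use in any c-step.
    units = Counter()
    for step in schedule:
        for op, n in Counter(step).items():
            units[op] = max(units[op], n)

    powers = []
    for step in schedule:
        active = Counter(step)
        total = 0
        for op, count in units.items():
            dynamic, static = POWER[op]
            # Active instances draw dynamic power, idle ones static power.
            total += active[op] * dynamic + (count - active[op]) * static
        powers.append(total)
    return powers


# Assumed placements: AND moved to c-step 2, then one add moved to c-step 1.
SCHEDULES = {
    "schedule 1": [["mul", "and"], ["add", "add"], ["add"]],
    "schedule 2": [["mul"], ["and", "add", "add"], ["add"]],
    "schedule 3": [["mul", "add"], ["and", "add"], ["add"]],
}

for name, schedule in SCHEDULES.items():
    powers = step_powers(schedule)
    print(name, powers, "peak =", max(powers))
```

With these assumed figures the peaks come out as 34, 31 and 29. Different power values could reverse the ordering, which is why the lecture says each move "may" save power.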

So what we wanted to show in this problem is that we can have three different schedules with three different peak power consumption values. Hence we can devise algorithms, both optimal and heuristic, to produce schedules where power is a constraint, or to minimize power. This was a brief look at power-aware scheduling problems. We will not go deeper into them; all the strategies that we have studied can be applied, with the problem formulated as above. Now we will look at register optimization for power.

(Refer Slide Time: 05:15)



Now, the base power consumed by the registers is proportional to the number of registers that we have. The active power consumed is proportional to the activity on the registers. What do we mean by activity on the registers? It is the number of different allocations that we have on the registers: when the bits within a register flip, active power is consumed, and hence it is proportional to the bit flips. And when do bit flips happen? When the data values in the registers change. So I had a variable a in a register, and then I put a variable b in that register, and the data within the register has changed; this will effect a certain number of bit flips in the register.

When there are bit flips in the register, active power is consumed in the register. So let X be the power consumed by a register during register transfer; during register transfer there are bit flips and active power is consumed. And let Y be the power consumed when it is not transferring. So the power consumed at each c-step due to these registers is the summation of X and Y over all registers. Now, as we said, the active power consumed is proportional to the number of bit flips that occur.
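The per-step register power just described can be written as a formula. The subscripts and the set $A_t$ are notation introduced here for illustration: $\mathit{Reg}$ is the set of registers, $A_t \subseteq \mathit{Reg}$ is the set of registers that take part in a transfer at c-step $t$, and $X_r$, $Y_r$ are the transfer and non-transfer power of register $r$.

$$
P_{\mathrm{reg}}(t) = \sum_{r \in A_t} X_r \; + \sum_{r \in \mathit{Reg} \setminus A_t} Y_r
$$

Since $X_r$ grows with the number of bits that flip during the transfer, reducing bit flips reduces the first term.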

(Refer Slide Time: 07:03)



Now this partially depends on the variables mapped to the register. What do registers contain? They contain temporary variables. Where do temporary variables come from? From the outputs of operations. And what do the operations do? They perform computation on input values. Now, the input values for a certain operation typically come within a given range. The pattern of values handled by an operation node is often very similar over all data inputs, and hence the temporary variables at the output of these operations also tend to have similar values, or values within similar ranges. Now suppose I have two variables. In the first, most of the changes are in the lower-order bits, and I have another temporary variable in which the higher-order bits change more. When both these variables are kept together in the same hardware register, overall there will be a lot of bit flips. As against this, let us say we have two variables for both of which only the lower-order bits change.

And most of the higher-order bits remain the same, say all zeros; only the lower-order bits change. If these two temporary variables are now clubbed together into the same register, we will have higher opportunities of saving power. Why? Because active power is consumed when bit flips happen in the register. So the active power consumed partially depends on the variables mapped to a register, and two variables are selected for merging based on their switching power. So what do we do? We profile the operation constraint graph and deduce a table of average switching power.
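The effect of sharing a register between similar and dissimilar variables can be sketched by counting bit flips. The value traces below are made up for illustration, and the model assumes the shared register alternately holds one value of each variable; neither comes from the lecture.

```python
def bit_flips(x, y):
    """Number of bit positions in which x and y differ (Hamming distance)."""
    return bin(x ^ y).count("1")


def register_flips(trace):
    """Total bit flips as a register takes the values in trace, in order."""
    return sum(bit_flips(p, q) for p, q in zip(trace, trace[1:]))


def interleave(u, v):
    """Values seen by a register shared by two variables: u0, v0, u1, v1, ..."""
    out = []
    for p, q in zip(u, v):
        out.extend([p, q])
    return out


# Hypothetical profiled values: a and b vary in the low-order bits,
# c varies in the high-order bits.
a = [0b00000001, 0b00000011, 0b00000010]
b = [0b00000010, 0b00000001, 0b00000011]
c = [0b10000000, 0b11000000, 0b01000000]

print("a shares with b:", register_flips(interleave(a, b)))
print("a shares with c:", register_flips(interleave(a, c)))
```

For these traces, merging a with b costs 7 bit flips while merging a with c costs 14. Averaging such counts over profiled runs is one way the switching-power table could be filled in.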

So we find out, when a is merged with b into the same register, what the switching power consumed is, as compared to when a is merged with c. Likewise, we can have a table containing the switching power for all the different valid register mergers, and we perform feasible register mergers to minimize the total switching power. Again, we understand that this can be modeled using a conflict graph, and on this conflict graph we can apply the merging and splitting techniques that we learned for register and functional unit allocation. When we learned the simulated annealing methodology for register and functional unit allocation, we saw a technique that can similarly be used here to obtain a register optimization for power.
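As a rough illustration of merging on a conflict graph guided by a switching-power table, here is a minimal greedy sketch. It is not the simulated annealing approach mentioned in the lecture; a simple greedy heuristic is substituted to keep the example short. The variables, conflicts and costs are hypothetical, and the cost of joining two groups is approximated by the single pair that triggers the merge.

```python
def greedy_merge(variables, conflicts, switching_cost):
    """Greedily merge variables into registers, cheapest pair first.

    conflicts: set of frozenset pairs whose lifetimes overlap, so they
               can never share a register.
    switching_cost: dict mapping (u, v) to the average switching power
               if u and v share a register.
    Returns a set of frozensets, one per register.
    """
    group_of = {v: frozenset([v]) for v in variables}
    for (u, v), _cost in sorted(switching_cost.items(), key=lambda kv: kv[1]):
        gu, gv = group_of[u], group_of[v]
        if gu == gv:
            continue  # already in the same register
        # Merging is feasible only if no member pair conflicts.
        if any(frozenset((x, y)) in conflicts for x in gu for y in gv):
            continue
        merged = gu | gv
        for x in merged:
            group_of[x] = merged
    return set(group_of.values())


# Hypothetical input: a/d and b/c have overlapping lifetimes.
variables = ["a", "b", "c", "d"]
conflicts = {frozenset(("a", "d")), frozenset(("b", "c"))}
switching_cost = {("a", "b"): 2, ("c", "d"): 3, ("b", "d"): 5, ("a", "c"): 7}

registers = greedy_merge(variables, conflicts, switching_cost)
print(sorted(sorted(r) for r in registers))
```

Here the cheapest pairs (a, b) and (c, d) are merged first, and the remaining candidate mergers are rejected because they would place conflicting variables in the same register.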



We will not go into the exact methodology; the solution strategies remain almost the same. Now we will look at another problem: bus allocation for power. Bus lines are put between registers and functional units, and when they are, the muxes and demuxes are mapped onto the buses. If we want to minimize the number of bus lines required to handle a certain number of muxes and demuxes, the problem is similar to the other allocation problems we learned previously and can be similarly formulated.

Bus Allocation for Power

- Suppose we are given a 3-step schedule
- Variables put on the buses at each step are given:
  - C-step 1: a, b, c, d
  - C-step 2: b, e
  - C-step 3: c, b, f, g
- Example bus allocations:

| Bus   | Variables | Value on bus over C-steps 1, 2, 3 |
|-------|-----------|-----------------------------------|
| Bus 1 | a, g      | $a \rightarrow a \rightarrow g$   |
| Bus 2 | b         | $b \rightarrow b \rightarrow b$   |
| Bus 3 | c         | $c \rightarrow c \rightarrow c$   |
| Bus 4 | d, e, f   | $d \rightarrow e \rightarrow f$   |

Now we will look at this formulation for power. If the data written onto the bus varies a lot, then the power cost incurred increases; the power cost is incurred due to switching. If we continue to have the same variable allocated to a bus line, the value on the bus does not change. Now let us say we have a three-step schedule, and variables are put onto the bus at each step. Let us assume that the schedule is as follows: we put variables a, b, c and d onto the bus at time step 1, variables b and e at time step 2, and variables c, b, f and g at time step 3. Because the maximum number of variables put in a single time step is four, we require at least four buses.

Let us assume an example bus allocation. Among these four buses, let us assume that bus 1 has been allocated variables a and g, bus 2 has been allocated variable b, bus 3 has been allocated variable c, and bus 4 has been allocated variables d, e and f. Now, with respect to bus 1, we see that at time step 1 variable a is on the bus, at time step 2 also variable a is on the bus, and at time step 3 variable g is on the bus. We see that between time step 1 and time step 2 the variable on the bus does not change.

So if the data on the bus does not change between time steps 1 and 2, the power cost incurred will be low. Therefore the objective is to keep the same variable on the same bus line for as long as possible, and this is an example allocation. So we can find an allocation such that the changes in variables on the bus lines are minimized; we will obtain a minimum-cost allocation for a given set of bus lines, and this will provide a good allocation in terms of power as well. With this basic understanding of power-aware synthesis, we come to the end of this module.
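The bus example can be checked with a small sketch that counts how often the variable on each bus changes and then searches exhaustively for an allocation with the fewest changes. The cost model is an assumption made here: a bus that is not driven in a step holds its previous value at no cost, and every change of variable costs one unit. The exhaustive search is only practical for tiny instances like this one.

```python
from itertools import permutations, product

# Schedule from the lecture: variables put on the buses at each c-step.
STEPS = [("a", "b", "c", "d"), ("b", "e"), ("c", "b", "f", "g")]
NUM_BUSES = 4


def count_changes(per_bus):
    """per_bus: one list per bus with the variable driven at each c-step,
    or None when the bus is idle and holds its previous value."""
    changes = 0
    for trace in per_bus:
        held = None
        for var in trace:
            if var is None:
                continue
            if held is not None and var != held:
                changes += 1
            held = var
    return changes


def step_assignments(step_vars, num_buses):
    """Every way to place one step's variables on distinct buses."""
    for buses in permutations(range(num_buses), len(step_vars)):
        row = [None] * num_buses
        for var, bus in zip(step_vars, buses):
            row[bus] = var
        yield row


def best_allocation(steps, num_buses):
    """Exhaustive search for the allocation with the fewest changes."""
    best = None
    options = [list(step_assignments(s, num_buses)) for s in steps]
    for rows in product(*options):
        per_bus = [list(col) for col in zip(*rows)]
        cost = count_changes(per_bus)
        if best is None or cost < best[0]:
            best = (cost, per_bus)
    return best


# The example allocation from the lecture (idle steps marked None).
lecture = [["a", None, "g"], ["b", "b", "b"], ["c", None, "c"], ["d", "e", "f"]]
print("lecture allocation changes:", count_changes(lecture))
print("minimum changes:", best_allocation(STEPS, NUM_BUSES)[0])
```

Under this cost model the lecture's allocation incurs 3 changes, which is also the minimum for this schedule: seven distinct variables must be carried by four buses, so at least three changes are unavoidable.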

Head CET Prof. Sunil Khijwania

CET Production Team Bikask Jyoti Nath CS Bhaskar Bora Dibyajyoti Lahkar Kallal Barua Kaushik Kr. Sarma Queen Barman Rekha Hazarika

CET Administrative Team Susanta Sarma Swapan Debnath