VLSI Questions Answers r3
Questions and answers are divided into the following categories, in the respective order.
We can see that when both inputs A and B are 0 the output is 1, and vice versa: when both inputs are 1, the output is 0. This matches inverter behavior, so one possibility is to short both inputs of the NAND together to get an inverter. Following are the two ways to make an inverter out of a NAND.
If you observe the NAND truth table closely, you see that when input A is 1, input B appears inverted at the output. That gives us the other possibility: tie input A of the NAND to 1 and we get an inverter.
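If you want to convince yourself, a few lines of Python (a throwaway truth-table sketch, not part of any design flow) can exhaustively check both constructions:

```python
# Truth-table check of both NAND-to-inverter constructions.
def nand(a, b):
    return 0 if (a and b) else 1

for a in (0, 1):
    assert nand(a, a) == 1 - a   # way 1: short both inputs together
    assert nand(a, 1) == 1 - a   # way 2: tie the other input to 1
```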
Question D11): Simplify logic: MUX with D1 input tied to ground, and an inverter at the select input.
Answer D11):
With D1 tied to ground and the select input inverted, the MUX output is Z = S' ? D1 : D0 = S' ? 0 : D0, where S is the external select. That reduces to Z = S * D0, i.e. a 2-input AND of the select and D0.
the XNOR gate so much that it defeats the purpose of the optimization in the first place.
Also, if you have implemented the XNOR gate using transmission-gate/pass-gate devices, you want to keep the input buffering provided by the inverter at the input.
Question D17): Simplify equation Z = A + B((A * B + B) + A * (B)bar)
Answer D17):
You need to know the rules of Boolean algebra:
Z = A + B * (B * (A + 1) + A * (B)bar)
Z = A + B * (B * 1 + A * (B)bar)
Z = A + B * B + B * A * (B)bar
Z = A + B + A * 0
Z = A + B
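A quick exhaustive check in Python (a throwaway sketch; `original` and `simplified` are just illustrative helper names) confirms the algebra:

```python
# Exhaustive check that Z = A + B((A*B + B) + A*(B)bar) reduces to A + B.
def original(a, b):
    not_b = 1 - b
    return a | (b & (((a & b) | b) | (a & not_b)))

def simplified(a, b):
    return a | b

for a in (0, 1):
    for b in (0, 1):
        assert original(a, b) == simplified(a, b)
```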
Question D18): Build a buffer from a single XOR gate. Build an inverter from a single XOR.
Answer D18):
For an XOR gate the Boolean equation is the following, assuming A, B as inputs and O as output:
O = A * (B)bar + (A)bar * B
If we make B = 0 in this equation we get
O = A * 1 + (A)bar * 0 = A
You can do a similar exercise and find out that tying the other input to 1 will give
you an inverter.
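Both cases can be verified with a tiny Python sketch (illustrative only):

```python
# XOR with one input tied to a constant: 0 gives a buffer, 1 gives an inverter.
def xor(a, b):
    return a ^ b

for a in (0, 1):
    assert xor(a, 0) == a        # B = 0: O = A (buffer)
    assert xor(a, 1) == 1 - a    # B = 1: O = (A)bar (inverter)
```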
Question D23): Come up with logic that counts the number of 1s in a 7-bit wide vector. You can only use combinational logic.
Answer D23):
Following is one of the ways to come up with such logic.
The input vector is 7 bits wide. To sum up 7 bits we need 3 bits of binary encoded output.
We have full adders available. A single full adder can add 3 input bits and generate a 2-bit binary encoded output.
E.g. a full adder can add the 3-bit wide input vector 111 and generate the output 11.
We can pick two full adders to add up 6 bits of the input vector, ending up with two sets of 2-bit wide binary encoded data.
E.g. if the input vector is 1100111, two full adders add up the first 6 bits 110011: the first three bits 110 go into the first adder and 011 into the second. The first adder outputs 10 (decimal 2) and the second adder also outputs 10 (decimal 2), so now we need to add up two 2-bit binary vectors. We can again employ full adders for this, and we still have to account for the 7th bit of the input vector; that bit can go into the carry-input of the least significant full adder.
For the above example:
Input vector 1100111
input 110 => full adder => 10 output
input 011 => full adder => 10 output

  10
+ 10
----
 100 => output (decimal 4)

Now accounting for the seventh input bit 1 as carry into the least significant adder:

   1  <= carry in
  10
+ 10
----
 101 => binary encoded decimal 5, which is the count of 1s in the input vector 1100111.
Full adders can be used to add up 3 input bits at a time. The outputs of the first level of full adders represent 2-bit encoded partial counts of the 1s, which we then need to add up to get the final 3-bit encoded total. Since we need to add up a 7-bit input vector, the 7th input bit can be used as the carry-in of the second level of full adders.
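The two-level full-adder scheme above can be modeled in a few lines of Python (a behavioral sketch, not RTL; `popcount7` and `full_adder` are illustrative names):

```python
# Behavioral model of the two-level full-adder popcount for a 7-bit vector.
def full_adder(a, b, cin):
    s = a ^ b ^ cin                          # sum bit (weight 1)
    cout = (a & b) | (a & cin) | (b & cin)   # carry bit (weight 2)
    return s, cout

def popcount7(bits):
    assert len(bits) == 7
    # Level 1: two full adders compress bits 0..5 into two 2-bit counts.
    s0, c0 = full_adder(bits[0], bits[1], bits[2])
    s1, c1 = full_adder(bits[3], bits[4], bits[5])
    # Level 2: a 2-bit ripple adder built from full adders;
    # the 7th input bit goes into the least significant carry-in.
    lo, carry = full_adder(s0, s1, bits[6])
    mid, hi = full_adder(c0, c1, carry)
    return (hi << 2) | (mid << 1) | lo

# Check against Python's own bit counting for all 128 possible vectors.
for v in range(128):
    bits = [(v >> i) & 1 for i in range(7)]
    assert popcount7(bits) == bin(v).count("1")
```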
Always keep thinking out loud. Keep spelling out the dialogue that's going on in your mind; don't keep your thoughts to yourself. The interviewer is looking at your approach; he wants to make sure you're trying your best to approach such tough questions analytically.
In this specific example, start with an initial simplistic guess of a two clock cycle delay circuit (back to back flip-flops) and draw the initial output waveform. Compare the initial output waveform with the required output waveform, and iterate based on the differences. Many times there isn't a systematic way to get to the answer; you have to go by trial and error.
Following logic will perform the required operation.
A FIFO is used for high throughput asynchronous data transfer. When you're sending data from one domain to another domain and high performance is required, you cannot just get away with a simple synchronizer (Metaflop).
As you can't afford to lose clock cycles (in a synchronizer you merely wait additional clock cycles until you can guarantee metastability-free operation), you come up with a storage element and a reasonably complex handshaking scheme for the control signals to facilitate the transfer.
An Asynchronous FIFO has two interfaces, one for writing the data into the
FIFO and the other for reading the data out of FIFO. It has two clocks, one for
writing and the other for reading.
Block A writes the data in the FIFO and Block B reads out the data from it. To
facilitate error free operations, we have FIFO full and FIFO empty signals.
These signals are generated with respect to the corresponding clock.
Keep in mind that, because control signals are generated in their
corresponding domains and such domains are asynchronous to each other,
these control signals have to be synchronized through the synchronizer !
The FIFO full signal is used by block A (when the FIFO is full, we don't want block A to write data into the FIFO, as that data would be lost), so it is driven by the write clock. Similarly, FIFO empty is driven by the read clock. Here the read clock means block B's clock and the write clock means block A's clock.
An asynchronous FIFO is used in places where performance matters more, where one does not want to waste clock cycles in handshakes, and where more resources are available.
Any good digital design interviewer will very likely ask a clock divider circuit question. Clock dividers by odd numbers, especially divide-by-3, are tricky circuits to come up with.
It is a question that is very likely to be asked, and therein lies an opportunity for you to impress your interviewer. Spend enough time familiarizing yourself with clock divider circuits.
Remember that divide-by circuits are variations of counters. Divide-by-2 can be thought of as a variation of a 1-bit (toggle) counter, and divide-by-3 as a variation of a mod-3 counter. This concept was dealt with in a previous question. Counters are easy to understand and build; it is recommended that you first read up on the basics of counter circuits and then try a couple of counter examples yourself to begin with.
There is a very good paper on clock dividers, which describes a systematic way of coming up with divide-by state machines for clocks. The paper can be found here: http://ebookbrowse.com/clock-dividers-made-easy-pdf-d75765174
If you look at the output of a mod-3 counter, it is easy to derive a divide-by-3 waveform with a duty cycle of 33.33%, again as described in the previous question.
It takes a bit of a trick to get a 50% duty cycle divide-by-3 waveform. After experimenting a bit, we came up with the following circuit for a divide-by-3 clock divider.
You can see from the previous question that what we need to do is delay the Qa waveform by a phase and OR it with Qb, and we'll have a 50% duty cycle divide-by-3 clock waveform.
That's exactly what we do in the following circuit. We add a latch to delay the output by a phase (a p-first latch provides the phase delay) and we introduce an explicit OR. Practically we can use a NOR in place of the OR.
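The counter-plus-negedge-latch-plus-OR trick can be sanity-checked with a half-cycle-resolution Python model (a behavioral sketch only; the qa/qb names mirror the circuit description above):

```python
# Half-cycle-resolution model of a 50% duty-cycle divide-by-3 divider:
# a mod-3 counter decoded on the rising edge (qa), a copy of qa captured
# on the falling edge (qb, the phase-delaying latch), and out = qa OR qb.
def divide_by_3(n_half_cycles):
    count, qa, qb = 0, 0, 0
    out = []
    for t in range(n_half_cycles):
        if t % 2 == 0:                    # rising edge of the fast clock
            count = (count + 1) % 3
            qa = 1 if count == 0 else 0   # one-cycle-wide decode (33% duty)
        else:                             # falling edge: phase-delay qa
            qb = qa
        out.append(qa | qb)
    return out

wave = divide_by_3(24)
# After the startup transient: periodic with period 3 input clocks
# (6 half-cycles), high for 3 of every 6 half-cycles, i.e. 50% duty.
assert wave[6:12] == wave[12:18] == wave[18:24]
assert sum(wave[6:12]) == 3
```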
Question D31): Given a DC signal how would you make a 50ps pulse? You
have a 50ps inverter available.
Answer D31):
Use an XOR gate with the 50ps inverter on one of its inputs: whenever the input changes, the two XOR inputs disagree for 50ps, producing a 50ps pulse.
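A discrete-time Python sketch (one time step = 50ps; the inverter is modeled purely as a 50ps delay plus inversion, which is an idealization) shows the pulse:

```python
# out(t) = in(t) XOR NOT(in(t - 1)): the signal XORed with a copy passed
# through a 50 ps inverter, modeled with 1 time step = 50 ps.
def pulse_gen(signal):
    out, prev = [], signal[0]   # assume the input was stable before t=0
    for s in signal:
        out.append(s ^ (1 - prev))
        prev = s
    return out

# A DC level that steps once produces a single one-step (50 ps) low pulse:
assert pulse_gen([0, 0, 0, 1, 1, 1, 1]) == [1, 1, 1, 0, 1, 1, 1]
```

In steady state the XOR sees complementary inputs and outputs 1; at each input transition the inverter's 50ps lag briefly makes the inputs agree, giving the 50ps pulse.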
Question D32): Why do we need TLB ?
Answer D32):
At the least, brush up on your basics of memory subsystems.
We need a TLB to improve virtual address translation speed.
Question D33): What is a page table?
Answer D33):
A page table is the data structure used by a virtual memory system in a
computer operating system to store the mapping between virtual addresses
and physical addresses. Virtual addresses are those unique to the accessing
process. Physical addresses are those unique to the hardware, i.e., RAM.
Question D34) What is a ring counter ?
Answer D34):
A ring counter is essentially a circular shift register: the output of the last shift register stage is fed back to the input of the first stage.
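A small Python sketch (illustrative names, behavioral model only) shows the circulating one-hot pattern:

```python
# 4-bit ring counter: a circular shift register seeded with a single 1;
# each clock, the last stage's output feeds the first stage's input.
def ring_counter_states(width, cycles):
    state = [1] + [0] * (width - 1)
    states = []
    for _ in range(cycles):
        states.append(tuple(state))
        state = [state[-1]] + state[:-1]   # rotate right by one
    return states

states = ring_counter_states(4, 8)
assert states[0] == (1, 0, 0, 0)
assert states[1] == (0, 1, 0, 0)
assert states[4] == states[0]   # the single 1 circulates every `width` clocks
```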
When you want to transfer data from one clock domain to another clock
domain, where both clocks are independent, there are two choices to
accomplish this.
1) Use synchronizer
2) Use Asynchronous FIFO
The main concern in clock domain crossing is metastability. Please refer to the question about metastability and synchronizers for more details. Basically, let's assume we are sampling some data in the xclk domain using the xclk clock, and the data is coming from the yclk domain where it was generated using the yclk clock. Given that xclk and yclk are independent, we have no knowledge of whether data coming from the yclk domain will set up correctly to xclk. In fact, it is very likely that the yclk data will violate the setup requirement of the xclk flop, and this will cause the xclk flop to go metastable. To prevent this metastability we use metastability-hardened flops, sometimes simply referred to as a synchronizer. It is nothing but a back-to-back series of flip-flops, where you essentially wait for additional clock cycles of the sampling domain to make sure metastability is resolved.
The key point about a synchronizer is that we wait additional clock cycles to allow time for metastability resolution. This additional wait might not be acceptable for some timing-critical cross-domain transactions. One can argue that in that case we should not have cross-domain transactions at all, but today's design reality is that with SoCs, cross-clock-domain paths are inevitable, and you are very likely to have performance-critical transactions that happen across different clock domains.
This is where the asynchronous FIFO comes into the picture. It allows a better cross-clock-domain transfer mechanism compared to a synchronizer. Unlike a synchronizer, where all data must wait for at least one additional local clock cycle, in an async FIFO we don't wait additional clock cycles: a high speed memory allows on-demand data transfer between the two clock domains. Whenever one of the domains is ready to send data to the other, it can do so without waiting, as long as it knows the other side has caught up. Similarly, when one of the domains wants to read data, it can do so without waiting, as long as it knows the other side has caught up.
Notice the qualification. A FIFO doesn't make everything completely seamless. When one domain is writing into the memory, how does it know whether the memory is full or not? It has to have an indication of where the other domain is in terms of reading from the memory. And as we know, it cannot know what is going on in the other domain without synchronizing the control signals coming from that domain with respect to its own clock. And synchronizing means waiting for additional clock cycles. But it is not as bad as a plain synchronizer; in fact, as we will see later, when the FIFO does become full or empty we merely end up conservatively holding off writes into or reads from the FIFO for additional clock cycles. Depending on a few variables, including the size of the FIFO memory, this conservatism doesn't come into play every time a write or a read is performed.
Thus an async FIFO does improve the overall throughput of a clock domain crossing, with hardware overhead in the form of the memory and the control logic for figuring out FIFO full and empty conditions. Depending upon the criticality of the signals involved in the clock crossing, it may well be worth implementing the extra hardware.
checks whether wr full is asserted or not. The wr full signal indicates the FIFO is full, and if that is the case we have to wait until it clears before we can write into the FIFO. If wr full is clear, the data is written into the FIFO at the location pointed to by the write pointer wr ptr, and wr ptr is incremented. As you can see, wr ptr always points to the next location in the FIFO that is ready to be written into.
Let's go through an example. Take a very simple case of a 3-entry FIFO. Initially, when the design is reset, both wr ptr and rd ptr point to zero and the FIFO is empty.
away, as shown with the dotted line on the fifo full waveform, so that someone can write into the FIFO in the next cycle if needed. But because our communication from the rdclk domain to the wrclk domain is only through a wrclk-based synchronizer, the fifo full signal will be deasserted only 2 wrclk cycles later. We do this because we don't have any better, trustworthy way to pass signals from one independent domain to another. This is just pessimism, which is preferable to errors and functional failure.
Similarly we can analyze the case of FIFO getting empty.
Depending upon the instant at which you sampled all three bits, you could detect 111, 101, 110, 001, 010 or 000, which are all wrong combinations. If you use gray codes you avoid this issue; at most you will be off by one number.
In a FIFO we are synchronizing read pointers into the write clock domain and write pointers into the read clock domain all the time, hence it is crucial that we avoid such potential spurious switching by using gray codes.
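The binary-to-gray conversion and the one-bit-change property can be checked with a short Python sketch (illustrative helper names):

```python
# bin -> gray: g = n XOR (n >> 1). Consecutive gray codes differ in exactly
# one bit, so a synchronizer can at worst sample a count that is off by one.
def bin_to_gray(n):
    return n ^ (n >> 1)

def gray_to_bin(g):
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b

for n in range(16):                       # 4-bit codes, including wraparound
    g1, g2 = bin_to_gray(n), bin_to_gray((n + 1) % 16)
    assert bin(g1 ^ g2).count("1") == 1   # exactly one bit changes
    assert gray_to_bin(g1) == n           # the conversion is invertible
```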
Question C3): Draw a CMOS transistor circuit for a 2 input NAND gate.
Answer C3):
Figure C6. Two input NAND gate with P/N skew of 2/1.
Question C7): Draw the cross section of a MOSFET. What is saturation
region ? How to bias device as such. Where is the bulk connection? What
does it do?
Answer C7):
Question C9): How does a substrate bias (a.k.a. back-gate bias) on a MOS
transistor affect Vt?
Answer C9):
A back-gate(body) bias increases the magnitude of Vt. The mechanism is an
increase in the depletion width of the induced p-n junction under the gate.
This uncovers more fixed charge in the channel region. (The mobile charge
gets "pulled" to the substrate contact.)
Since the uncovered fixed charge has the same sign as the channel inversion
charge, not as much channel inversion charge is needed to balance the
charge on the gate. As a result, some of the inversion charge flows out the
source terminal, so the channel isn't as inverted as it was prior to applying
the substrate bias.
Therefore, the gate voltage needs to increase in magnitude to restore the
previous level of channel inversion. For NMOS, with the decrease of Vb, Vt
increases.
Question C10): Why power routes are routed in the top metal layers?
Answer C10):
This is not always true, but in general the top metal layers are less resistive, hence the IR drop in the power distribution network is lower. Routing power wires in the top layers also frees up the lower metal layers, easing routing congestion there. The bigger the IR-drop concern, the more power routes are needed in all layers.
Question C11): What are the ways to speed up a standard cell ?
Answer C11):
The delay of a standard cell depends upon three main factors: input slope, output load and drive strength. If you sharpen the input slope (for example by increasing the drive strength of the driver before the cell), the cell speeds up. If you reduce the load at the output of the cell, the cell speeds up. If you increase the width of the devices in the cell, the cell speeds up. Also, a cell with a lower threshold voltage (low-Vt) is faster than a nominal cell.
Question C12): How do you reduce noise or glitch ?
Answer C12):
The first thing you ask when you have a noise or glitch issue is whether you can tolerate the glitch or not. If your receiver is a non-storage element, e.g. a static gate like an inverter or buffer, it attenuates the glitch/pulse (based on its skew), and the noise glitch may become tolerable by the time it reaches a flop/latch.
Normally cross-coupling from neighboring wires is the biggest contributor to a noise glitch, so increasing spacing to the attacker helps. Down-sizing the attacker's driver reduces the cross-coupled noise, and if the victim node's driver is poorly sized, up-sizing it helps. Many times logical filtering, i.e. realizing logical mutual-exclusion conditions between attacker and victim, helps filter out attackers.
Question C13): Explain short circuit current.
Answer C13):
In practice, because of finite input signal rise and fall times, a direct current path exists between the supply and ground for a short period of time while the gate is switching.
This instantaneous large current causes the supply voltage to droop for a short time while devices are switching; this is called dynamic IR drop or Instantaneous Voltage Drop (IVD). It is very common in high speed memories, which can have thousands of cells switching at a time.
If capacitance is added between VDD and VSS (rail to rail), the VDD node becomes more resistant to the effects of IVD, as the capacitance acts as a charge reserve, briefly supplying local current sinks during the short time when a large number of devices are switching.
As such, DCAP cells, which are nothing but capacitors, are added to areas of an IC that otherwise have no cells, or to areas where large simultaneous switching is expected (memory). However, DCAP cells normally come with a serious downside: they are leaky devices and cause extra power dissipation, hence they need to be used carefully.
Ca + Cb, because Ca and Cb are in parallel now. Because the total charge remains the same after the switch is closed, we can say the following:
Cb * Vb = (Ca + Cb) * Vb' [previously node a had zero charge, and node b had Cb * Vb charge]
Here Vb' is the voltage on node b after the switch is closed and Vb is the voltage on node b before the switch is closed, so:
Vb' = Vb * Cb / (Ca + Cb)
If Ca =~ 0, i.e. Cb >> Ca, then Vb' =~ Vb; in all other cases Vb' < Vb.
This means that unless Ca is very small compared to Cb, the node b voltage will drop. This is the effect of charge sharing. When two capacitors are shorted, depending on the capacitor values, charge is shared or transferred from one capacitor to the other, and the voltage can droop on one of the nodes. The bigger the value of Ca, the more charge moves (transfers) from Cb to Ca, and when charge moves away from a node, the voltage at that node drops, because the voltage at a node is nothing but the potential difference between the node in question and ground (which is at zero). In other words, a node can glitch up or down, and if there is a sequential element downstream, it can capture the glitch, latch a false state and make your circuit malfunction.
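The charge-conservation equation above turns into a two-line Python sketch (the capacitance and voltage numbers are made-up illustrative values):

```python
# Node b (cap Cb, charged to Vb) is shorted to node a (cap Ca, at 0 V).
# Charge conservation: Cb*Vb = (Ca + Cb)*Vb_new.
def shared_voltage(ca, cb, vb):
    return cb * vb / (ca + cb)

# Example: a 10 fF node at 1.0 V shorted to a 2 fF uncharged node
# droops to 10/12 of its original voltage (about 0.83 V).
v_new = shared_voltage(ca=2e-15, cb=10e-15, vb=1.0)
assert abs(v_new - 10.0 / 12.0) < 1e-12
```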
Question T3): What does the setup time of a flop depend upon ?
Answer T3):
Setup time of a flip-flop depends upon the Input data slope, Clock slope and
Output load.
Question T4): What does the hold time of a flip-flop depend upon ?
Answer T4):
Hold time of a flip-flop depends upon the Input data slope, Clock slope and
Output load.
Question T5): Describe a timing path.
Answer T5):
For standard cell based designs, the following figure illustrates a basic timing path. A timing path typically starts at a sequential (storage) element, which could be either a flip-flop or a latch.
The timing path starts at the clock pin of the flip-flop/latch. The active clock edge on this element triggers the data at its output to change. This first stage delay is also called the clock-to-data-out (clock->Q) delay. The data then goes through stages of combinational logic and interconnect wires. Each such stage has its own delay that accumulates along the path. Eventually the data arrives at the sampling storage element, which is again a flip-flop or a latch.
That's where the data has to meet setup and hold checks against the clock of the receiving flip-flop/latch. Also notice that for timing paths in the same clock domain, the generating flip-flop clock and the sampling flip-flop clock are derived from a single source, which is called the point of divergence.
In reality, the actual start point for a synchronous clocked circuit is the first instance where the clock branches off into the generating path and the sampling path, as shown in the picture; this is the point of divergence.
To simplify analysis, we agree that the clock arrives at very much a fixed time at the clock pins of all sequentials in the design. This simplifies the analysis of a timing path from one sequential to another.
Answer T7):
No. Unlike setup violations, which go away with reduced frequency, hold violations are frequency independent and are functional failures.
Question T8): Explain CTS (Clock Tree Synthesis) flow.
Answer T8):
The goal of the CTS flow is to minimize clock skew and clock insertion delay. This is the flow where the actual clock distribution tree is synthesized. Before CTS, timing tools use ideal clock arrival times; after CTS, the real clock distribution tree is available, so real clock arrival times are used.
Question T9): What is metastability and what are its effects ?
Answer T9):
Whenever there is a setup or hold time violation in a flip-flop, it enters a state where its output is unpredictable. This state of unpredictable output is known as the metastable state, also called the quasi-stable state. At the end of the metastable state, the flip-flop settles down to either '1' or '0'. The whole process is known as metastability.
Question T10) What is the difference between a latch and a flip-flop.
Answer T10):
A latch is a level-sensitive device, while a flip-flop is edge-sensitive. A D flip-flop is actually made from two back-to-back latches in a master-slave configuration: a low-level (active-low) master latch followed by a high-level (active-high) slave latch forms a rising-edge-sensitive D flip-flop. A latch is made using fewer devices and hence burns lower power compared to a flip-flop, but a flip-flop is immune to glitches, while a latch will pass glitches through.
Question T11) What is clock skew ?
Answer T11):
In synchronous circuit design, usually a gridded clock is used. A gridded clock means that, at least for parts of the design, the clock has to arrive everywhere at the same time. In reality the clock arrives at different times at different clock receivers; this difference in clock arrival times is the clock skew.
Figure T11. False data capture because of late clock ( clock skew )
Question T12) What happens to delay if you increase load capacitance?
Answer T12):
Usually a device slows down if its load capacitance is increased. Device delay depends on three parameters: 1) the strength of the device, which is usually set by its width, 2) the input slope or slew rate, and 3) the output load capacitance. Note that increasing the strength or width increases the self-load as well.
Question T13) What is clock-gating ?
Answer T13)
Clock gating is a power saving technique. In synchronous circuits a logic gate (AND) is added to the clock net; the other input of the AND gate can be used to turn off the clock to certain receiving sequentials that are not active, thus saving the power otherwise burned by the toggling clock.
combinational delay is too small. Hence this is sometimes called a max delay or slow delay timing issue, and the constraint is called a max delay constraint. In the figure there is a max delay constraint on the FF2_in input at the receiving flop. Now you can see that the max delay or slow delay constraint is frequency dependent: if you are failing setup at a flop and you slow down the clock frequency, your clock cycle time increases, hence you have a larger time window for your slow signal transitions to propagate through, and you will now meet the setup requirement.
Typically your digital circuit is run at a certain frequency, which sets your max delay constraints. The amount of time by which the signal falls short of meeting the setup time is called the setup (or max delay) slack or margin.
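The setup-slack arithmetic can be sketched in a few lines of Python (all numbers are made-up illustrative values in ns, not from any real library):

```python
# Setup (max-delay) slack: data launched by one flop must arrive at the
# next flop a setup time before the capture edge.
def setup_slack(t_period, t_clk_to_q, t_comb, t_setup, capture_skew=0.0):
    required = t_period + capture_skew - t_setup   # latest allowed arrival
    arrival = t_clk_to_q + t_comb                  # actual arrival time
    return required - arrival                      # negative => violation

slack = setup_slack(t_period=2.0, t_clk_to_q=0.3, t_comb=1.2, t_setup=0.1)
assert abs(slack - 0.4) < 1e-9
# Slowing the clock (a larger period) relaxes the max-delay constraint:
assert setup_slack(2.5, 0.3, 1.2, 0.1) > slack
```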
In our figure below, data at the input pin 'In' of the first flop meets setup and is correctly captured by the first flop. The output of the first flop, 'FF1_out', happens to be the inverted version of input 'In'.
As you can see, once the active edge of the clock of the first flop occurs, which is the rising edge here, after a certain clock-to-out delay the output FF1_out falls. Now, for the sake of our understanding, assume that the combinational delay from FF1_out to FF2_in is very small and the signal goes blazing fast from FF1_out to FF2_in, as shown in the figure below.
In real life this could happen for several reasons. It could happen by design (imagine no device between the first and second flop, just a short wire; even better, think of both flops abutting each other). It could be because of device variation, where you end up with very fast devices along the signal path. There could be capacitive coupling with adjacent wires favoring the transition along FF1_out to FF2_in: a node adjacent to FF2_in might be transitioning high to low (falling) with a sharp slew rate, which couples favorably with FF2_in going down and speeds up the FF2_in fall delay.
In short, in reality there are several reasons for delay to speed up along the signal propagation path. What ends up happening, because of the fast data, is that FF2_in transitions within the hold time requirement window of the flop clocked by clk2, and essentially violates the hold requirement of the clk2 flop.
This causes the falling transition of FF2_in to be captured in the first clk2 cycle, whereas the design intention was to capture the falling transition of FF2_in in the second cycle of clk2.
In a normal synchronous design, where you have a series of flip-flops clocked by a grid clock (the clock shown in the figure below), the intention is that in the first clock cycle of clk1 and clk2, FF1_out transitions, and there is enough delay from FF1_out to FF2_in that the hold requirement at the second flop is met for the first clock cycle of clk2. FF2_in then meets setup before the second clock cycle of clk2, and when the second clock cycle starts, at the active edge of clk2 the original transition of FF1_out is propagated to Out.
Now if you notice, there is skew between clk1 and clk2; the skew is making the clk2 edge come later than the clk1 edge (ideally we expect clk1 and clk2 to be aligned perfectly, that's ideally!!). In our example this exacerbates the hold issue: if both clocks were perfectly aligned, the FF2_in fall could have happened later, would have met the hold requirement of the clk2 flop, and we wouldn't have captured wrong data!!
Question T19) STA tool reports a hold violation on following circuit. What
would you do ?
Answer T19)
The key to understand here is that we're referring to the same CLK edge, hence no CLK skew and no hold violation.
Question T20): Why does the delay of a MOS device increase with increasing temperature at high voltage, but decrease with increasing temperature at lower voltages?
Answer T20) :
This effect is also referred to as low voltage Inverted Temperature
Dependence.
Let's first see what the delay of a MOS transistor depends upon, in a simplified model:
Delay = (Cout * Vdd) / Id [approx.]
where
Cout = drain cap
Vdd = supply voltage
Id = drain current.
Now let's see what the drain current depends upon:
Id = u(T) * (Vdd - Vth(T))^a
where
u = mobility
Vth = threshold voltage
a = a positive constant (a small number)
One can see that Id depends upon both the mobility and the threshold voltage Vth. Let's examine the dependence of mobility and threshold voltage upon temperature:
u(T) = u(300) * (300/T)^m
Vth(T) = Vth(300) - k * (T - 300)
Here 300 is room temperature in kelvin; m and k are positive constants.
Mobility and threshold voltage both decrease with temperature. A decrease in mobility means less drain current and a slower device, whereas a decrease in threshold voltage means more drain current and a faster device. At high Vdd the mobility term dominates, so the device slows down with temperature; at low Vdd the (Vdd - Vth) term dominates, so the device speeds up with temperature.
If this modular structure is extended, then the delay at the last node can be represented as follows:
Total delay at node N = R1C1 + (R1+R2)C2 + (R1+R2+R3)C3 + ... + (R1+R2+...+RN)CN
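The Elmore sum above is easy to compute programmatically; here is a short Python sketch (illustrative R/C values only):

```python
# Elmore delay of an RC ladder: delay at node N is the sum over each node i
# of (total upstream resistance R1+...+Ri) times that node's capacitance Ci.
def elmore_delay(rs, cs):
    total, r_upstream = 0.0, 0.0
    for r, c in zip(rs, cs):
        r_upstream += r             # R1 + R2 + ... + Ri
        total += r_upstream * c     # (R1 + ... + Ri) * Ci
    return total

# Three-segment check: R1C1 + (R1+R2)C2 + (R1+R2+R3)C3 = 1 + 3 + 6
assert elmore_delay([1.0, 2.0, 3.0], [1.0, 1.0, 1.0]) == 10.0
```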
assignments, the evaluate and the update are not one atomic operation but are done at different times, and they do not block other assignments.
It is illegal to use a non-blocking assignment in a continuous assignment statement or in a net declaration.
// Blocking and non-blocking assignment
module blocking;
  reg [0:2] X, Y;
  initial begin: init1
    X = 5;
    #1 X = X + 1; // blocking procedural assignment
    Y = X + 1;
    $display("Blocking: X= %b Y= %b", X, Y);
    X = 5;
    #1 X <= X + 1; // non-blocking procedural assignment
    Y <= X + 1;
    #1 $display("Non-blocking: X= %b Y= %b", X, Y);
  end
endmodule
produces the following output:
Blocking: X= 110 Y= 111
Non-blocking: X= 110 Y= 110
The effect is that all non-blocking assignments use the old values of the variables at the beginning of the current time unit and assign the registers their new values at the end of the current time unit. This reflects how register transfers occur in some hardware systems.
Blocking procedural assignments are used for combinational logic and non-blocking procedural assignments for sequential logic.
Question V2) Assuming all variables have initial value of 0, when simulation
starts, what will be the value of x at the end of simulation time unit 4 ?
initial begin
  x = 2;
  #4;
  y <= #9 x;
  x = 1;
end

always @(x, y) begin
  z = #2 x + y;
end
Answer V2):
In order to get a comprehensive understanding of the execution order, one
has to look at what is called Verilog stratified event queue, which is the
Verilog IEEE standard spec algorithm describing the event queues.
The Inactive event queue has been omitted, as the #0 delay events that it deals with are not a recommended practice.
As you can see, at the top there is the active event queue. According to the IEEE Verilog spec, events can be scheduled into any of the event queues, but events can be removed only from the active event queue. As shown in the image, the active event queue holds blocking assignments, continuous assignments, primitive IO updates and $write commands. Within the active queue all events have the same priority, which is why they can get executed in any order; this is the source of nondeterminism in Verilog.
There is a separate queue for the LHS updates of the non-blocking assignments. As you can see, the LHS update queue is taken up after the active events have been exhausted, but LHS updates of the non-blocking assignments can re-trigger active events.
Lastly, once the looping through the active and non-blocking LHS update queues has settled down and finished, the postponed queue is taken up, where $strobe and $monitor commands are executed, again without any particular order of preference.
At the end, simulation time is incremented and the whole cycle repeats.
Getting back to our original question.
- The initial block starts execution. Inside it we have a begin/end procedural block, which means statements execute in the order they appear.
- At the beginning of simulation time 0, the variable x is assigned the value 2 through a blocking assignment. x is now 2.
- This new value of x at time 0 triggers the always block; the RHS of the equation is evaluated, and z is slated to be updated at simulation time 2 with the value 2 (x + y = 2 + 0 = 2).
- Control is back inside the initial block.
- We encounter #4, so execution of the initial block is suspended.
- At simulation time 2, z is updated with the previously evaluated value of 2.
- Back in the initial block at time 4, execution proceeds and the non-blocking statement is encountered; the RHS is evaluated, which is x with value 2.
- The LHS of the non-blocking statement is scheduled to be updated 9 time units later.
- Execution proceeds inside the initial block at simulation time 4, and next is the blocking update of x with the value 1.
- So at the end of simulation time unit 4, the value of x is 1.
Question V6): For the buffer in the following figure and the given input waveform, show the buffer output waveform for each case of the verilog code snippet representing the buffer.
suspended for <delay> time units. This is also called inter-assignment delay. The key to remember is that while the assignment is suspended for <delay> time units, if the input changes in the interim, those changes are lost, because the assignment is suspended during that time!
Transport delay: this delay models devices with close to infinite switching speed. Input glitches are propagated to the output of the device because of the infinite switching speed.
One can use intra-assignment delay control to model transport delay: you put the delay control after the assignment operator.
For example:
output = #<delay> input;
This causes the input to be evaluated first and the update to happen <delay> units later. Even if there is a glitch on the input which is shorter than <delay>, the change to the input is registered and propagated to the output after <delay> time units. The catch with intra-assignment delay on a blocking assignment is that while the LHS update is waiting out the <delay> time units, if the input changes in the interim, those changes are lost! Which really means this is not a pure transport delay model. As we'll shortly see, at the output we will not get the same waveform as the input. If we truly want the intermediate changes to the input to be registered while the assignment is suspended on the delay control, we need to use a non-blocking assignment.