EE141 System-on-Chip Test Architectures

Chapter 7
Low-Power Testing
What is this chapter about?

Introduce the various aspects of low-power testing
Focus on
  Issues arising from excessive test power
  Structural and algorithmic solutions proposed to alleviate the low-power test problems

Outline
1. Introduction
2. Energy and power modeling
3. Test power issues
4. Low-power scan testing
5. Low-power BIST
6. Low-power test data compression
7. Summary and conclusion
1. Introduction
Test in the past
  High fault coverage
  Short test time
  Small test data volume
  Low test development efforts
  Low area overhead

Test from now
  Low-power test
  High test quality (e.g., high small-delay detection capability)
1. Introduction
Power dissipation in test mode is much higher than in functional mode:

  The circuit is highly stressed
  There is no correlation between consecutive test vectors
  Test vectors ignore functional constraints
  DFT circuitry is intensively used
  Parallel testing is often used for efficiency
  Low-power functional features (e.g., gated clocks) are often disabled during test
1. Introduction
Industry generally resorts to ad-hoc solutions:
  Oversizing power rails
  Oversizing packages and using cooling systems
  Testing at a reduced operating frequency
  Partitioning and appropriate test planning
These solutions are costly or lead to longer test time
2. Energy and Power Modeling

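Power dissipation in CMOS ICs:
  Static power: power consumed when the circuit is idle (leakage power)
  Dynamic power: power consumed when the circuit is switching its state

[Figure: CMOS gate with P and N transistors between Vdd and ground, driving the load capacitance CL]

Charging (0→1): $\frac{1}{2} C_L V_{dd}^2$ of energy dissipated as heat
Discharging (1→0): $\frac{1}{2} C_L V_{dd}^2$ of energy dissipated as heat

$P_{dyn} = C_L \cdot V_{dd}^2 \cdot N_{0 \to 1} \cdot \frac{1}{T}$
$P_{dyn} = \frac{1}{2} \cdot C_L \cdot V_{dd}^2 \cdot N \cdot \frac{1}{T}$

where $N_{0 \to 1}$ is the number of 0→1 transitions and $N$ is the total number of transitions during the period $T$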

Energy = total switching activity generated during test
has impact on the battery lifetime during power up or periodic self-test of battery operated devices
Average Power = Energy / Test time
has impact on the thermal load of the device
Peak Power = Highest value of instantaneous power
determines the thermal and electrical limits of components and the system packaging requirements
Energy consumed after application of $(V_{k-1}, V_k)$

$E_{V_k} = \frac{1}{2} \cdot c_0 \cdot V_{DD}^2 \cdot \sum_i s_i(k) \cdot F_i$

Total energy consumed during test application

$E_{Total} = \frac{1}{2} \cdot c_0 \cdot V_{DD}^2 \cdot \sum_k \sum_i s_i(k) \cdot F_i$

Average power consumed during the test session

$P_{AVERAGE} = E_{Total} / (Length_{Test} \cdot T)$

Peak power consumed during the test session

$P_{PEAK} = \max_k P_{inst}(V_k) = \max_k \left( E_{V_k} / t_{small} \right)$
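The energy and power expressions above can be evaluated with a few lines of Python. This is only a sketch under stated assumptions: the slide does not define its symbols, so the code assumes that s_i(k) is 1 when node i switches on vector V_k and 0 otherwise, that F_i is the fanout of node i, that c0 is a unit load capacitance, and that each energy term carries a factor 1/2. The function names and the sample values are hypothetical.

```python
from typing import List, Sequence, Tuple


def vector_energy(switches: Sequence[int], fanouts: Sequence[int],
                  c0: float, vdd: float) -> float:
    """E_Vk = 1/2 * c0 * Vdd^2 * sum_i s_i(k) * F_i (assumed model)."""
    weighted_activity = sum(s * f for s, f in zip(switches, fanouts))
    return 0.5 * c0 * vdd ** 2 * weighted_activity


def power_metrics(activity: List[Sequence[int]], fanouts: Sequence[int],
                  c0: float, vdd: float, period: float,
                  t_small: float) -> Tuple[float, float, float]:
    """Return (E_Total, P_AVERAGE, P_PEAK) for a sequence of test vectors.

    activity[k][i] is s_i(k); period is the clock period T; t_small is the
    short time window used for instantaneous power.
    """
    energies = [vector_energy(row, fanouts, c0, vdd) for row in activity]
    e_total = sum(energies)
    p_average = e_total / (len(activity) * period)
    p_peak = max(energies) / t_small
    return e_total, p_average, p_peak


# Hypothetical 3-node circuit exercised by 3 vectors.
example = power_metrics(
    activity=[[1, 0, 1], [0, 1, 0], [1, 1, 1]],
    fanouts=[1, 2, 3], c0=2.0, vdd=1.0, period=1.0, t_small=0.5)
print(example)
```

In this toy case the third vector switches every node, so it alone sets the peak power, while all three vectors contribute to the energy and the average power.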

Test power acting parameters
Switching Activity
has impact on the energy, average power and peak power
Test Frequency
has impact on the average power
Test Length
has impact on the energy

Test power contributors
Combinational Toggling
switching activity in the combinational part of the circuit
Sequential Toggling
switching activity in the flip-flops
Clock Toggling
switching activity in the clock tree feeding the circuit
3. Test Power Issues

Thermal effects
  Heat produced during the functioning of a circuit is proportional to the dissipated power (Joule effect) and is responsible for the increase in die temperature
  Too high a temperature can provoke irreversible structural degradations (premature destruction)
  Too high a temperature may affect circuit performance or impact the IC's reliability (corrosion, electro-migration, hot-carrier-induced defects, dielectric breakdown, ...)
3. Test Power Issues

Noise phenomena
  Power supply noise L(di/dt) due to current variations through inductive connections (probes for wafer testing, pins for packaged circuits)
  Ground bounce or voltage surge/droop, which may change the rise/fall times of some signals in the circuit
  IR drop (resistive effect) and crosstalk (capacitive effect) have similar effects
Good dies fail the test → manufacturing yield loss (overkill)
4. Low-Power Scan Testing - Basics

Slow-speed scan testing
[Figure: timing diagram of the CLK and SE signals over time, showing the load/unload (shift) cycles, the shift & launch pulse, and the capture pulse]

At-speed scan testing with a LOC (launch-off-capture) test scheme
[Figure: timing diagram of CLK and of SE for the LOC and LOS schemes; the load/unload (shift) cycles surround the test cycle, in which V1 is applied by the last shift pulse, V2 is applied by the capture & launch pulse, and the response is then captured]
Launch is caused by the difference between the values loaded by the last shift pulse (V1) and the first capture pulse (V2)
SE is easy to implement, but fault coverage is lower than with LOS

At-speed scan testing with a LOS (launch-off-shift) test scheme
[Figure: timing diagram of CLK and of SE for the LOS scheme; V1 is applied by the next-to-last shift pulse, V2 is applied by the last shift & launch pulse, and the response is then captured]
Launch is caused by the difference between the values loaded by the next-to-last (V1) and the last (V2) shift pulses
Higher fault coverage than LOC, but SE is not easy to implement
The problem of excessive power during scan testing can be split into two sub-problems: excessive power during the shift operation (called shift power) and excessive power during the capture operation (called capture power)
At-speed scan testing is especially vulnerable to excessive IR drop caused by the high switching activity generated in the CUT between launch and capture → yield loss
4. Low-Power Scan Testing

ATPG and X-filling techniques (1/3)
Don't-care bits (Xs) nearly always make up a very large fraction of the bits in a given ATPG test cube, despite the application of state-of-the-art dynamic and static test pattern compaction techniques
In classical ATPG, Xs are randomly filled, and the resulting fully specified pattern is then simulated to confirm detection of all targeted faults and to measure the amount of fortuitous detection

Power-aware ATPG algorithms (2/3)
Clever assignment of don't-care bits in combinational (PODEM-like) ATPG in order to minimize the number of transitions between two consecutive test vectors
Minimizing the difference between the before-capture and after-capture output values of a scan flip-flop

Power-aware X-filling heuristics (3/3)
From a set of deterministic test cubes, the main goal of these techniques is to assign the don't-care bits of each test pattern so that the occurrence of transitions in the scan chain is minimized:
  Adjacent filling or MT-filling (minimum-transition filling)
  0-filling
  1-filling

Test cube      0XXX1XX0XX0XX
MT-filling     0000111000000
0-filling      0000100000000
1-filling      0111111011011

Applicable at the end of the design process, no area overhead
Reduce test power consumption at the cost of a reasonable increase of test length
A few solutions exist for reducing power during the test cycle (LOC)
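The three X-filling heuristics can be sketched in Python and checked against the test cube 0XXX1XX0XX0XX from the slide. The function names are hypothetical, and the handling of leading Xs in MT-filling (copy the first specified bit, or 0 if the cube is all Xs) is an assumption, since the slide's example starts with a specified bit.

```python
def count_transitions(bits: str) -> int:
    """Number of adjacent bit pairs that differ in a fully specified vector."""
    return sum(a != b for a, b in zip(bits, bits[1:]))


def zero_fill(cube: str) -> str:
    """0-filling: every X becomes 0."""
    return cube.replace("X", "0")


def one_fill(cube: str) -> str:
    """1-filling: every X becomes 1."""
    return cube.replace("X", "1")


def mt_fill(cube: str) -> str:
    """Adjacent (MT) filling: every X copies the nearest specified bit to its left.

    Assumption: leading Xs copy the first specified bit (0 if there is none).
    """
    specified = [b for b in cube if b != "X"]
    last = specified[0] if specified else "0"
    filled = []
    for b in cube:
        if b != "X":
            last = b
        filled.append(last)
    return "".join(filled)


cube = "0XXX1XX0XX0XX"
for name, fill in (("MT", mt_fill), ("0", zero_fill), ("1", one_fill)):
    vector = fill(cube)
    print(f"{name}-filling: {vector} ({count_transitions(vector)} transitions)")
```

For this cube, MT-filling and 0-filling both leave two transitions in the vector, while 1-filling leaves five, which illustrates why the best heuristic depends on the specified bits.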

Low power test vector compaction
Static compaction minimizes the number of test cubes generated by an ATPG tool by merging test cubes that are compatible in all bit positions
  Example 1: 11XX0 and 1X0X0 are compatible (→ 110X0)
  Example 2: 11XX0 and 011X1 are not compatible
Conventional approaches target the minimum number of final test cubes
[Sankaralingam 2000] used a greedy heuristic for merging test cubes in a way that minimizes the number of transitions (use of weighted transition metric)
Significant reductions in average and peak power consumption
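The compatibility check behind static compaction can be illustrated as follows. This is a minimal sketch of the merge step only; it does not implement the greedy, transition-aware selection of [Sankaralingam 2000], and the function name is hypothetical.

```python
from typing import Optional


def merge_cubes(a: str, b: str) -> Optional[str]:
    """Merge two test cubes over {0, 1, X}; return None if they conflict.

    Two cubes are compatible when no bit position holds a 0 in one cube
    and a 1 in the other. In the merged cube, a specified bit wins over X.
    """
    if len(a) != len(b):
        raise ValueError("test cubes must have the same length")
    merged = []
    for x, y in zip(a, b):
        if x == "X":
            merged.append(y)
        elif y == "X" or x == y:
            merged.append(x)
        else:
            return None  # conflicting specified bits
    return "".join(merged)


print(merge_cubes("11XX0", "1X0X0"))  # Example 1 from the slide
print(merge_cubes("11XX0", "011X1"))  # Example 2 from the slide
```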

Low-power (gated) scan cells
[Figure: gated scan cell with inputs D, SI, CLK and SE; the Q output drives the combinational part and SO feeds the scan chain]
Gated scan cells block transitions during scan shifting
Very effective in test power reduction
Significant area overhead and performance degradation

Scan cell ordering (1/2)
[Figure: shifting the vector V = 0101 into the scan chain FF1 FF2 FF3 FF4, and the reordered vector 1100 into the reordered chain FF2 FF4 FF1 FF3, with the scan chain contents shown after each shift]
10 transitions generated during loading of V
2 transitions generated after scan cell reordering
Need to change the order of bits in each vector during test application
Scan cell reordering may lead to significant power reduction (up to 66%)
No overhead, FC and test time unchanged, low impact on design flow
May lead to routing congestion problems
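The transition counts quoted above (10 before reordering, 2 after) can be reproduced with a small shift simulation. This sketch assumes the scan chain starts in the all-0 state and that the bit destined for the last cell is shifted in first; both are assumptions about the figure, and the function names are hypothetical.

```python
from typing import Sequence


def shift_transitions(vector: str) -> int:
    """Count flip-flop toggles while shifting `vector` into an all-0 chain."""
    state = [0] * len(vector)
    total = 0
    for bit in reversed(vector):  # the last cell's bit enters the chain first
        new_state = [int(bit)] + state[:-1]
        total += sum(old != new for old, new in zip(state, new_state))
        state = new_state
    return total


def reorder(vector: str, order: Sequence[int]) -> str:
    """Permute the bits of `vector` to follow a new scan cell order."""
    return "".join(vector[i] for i in order)


v = "0101"                      # values for FF1 FF2 FF3 FF4
new_order = [1, 3, 0, 2]        # FF2 FF4 FF1 FF3 (0-based indices)
v_reordered = reorder(v, new_order)
print(v, shift_transitions(v))
print(v_reordered, shift_transitions(v_reordered))
```

Grouping equal values next to each other in the chain is what removes the toggles: the reordered vector 1100 contains a single transition instead of three.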

Scan cell ordering (2/2)
Partition the circuit into clusters (using geographical criteria)
Then reorder the scan cells within each cluster so as to reduce the weighted switching activity (WSA)
Clusters are then stitched together using the nearest-neighbor criterion
Good tradeoff between test power reduction and scan chain length

Scan chain segmentation
[Figure: scan chain split into Scan Chain A, Scan Chain B and Scan Chain C around the combinational logic, sharing Scan In, ScanOut and SE; a clock adaptor derives CLKA, CLKB and CLKC from CLK; capture is shown for the three clocks]
The scan chain is partitioned into N segments
One segment at a time is active during scan shifting
Average power reduced by a factor of N with no impact on area and FC
Clock power is reduced by gating the clock trees rather than the SE signals

Scan architecture modification
Inserting logic elements (XOR gates) between scan cells in order to minimize the number of transitions occurring inside the scan chain
Use of buffers (of various sizes) in multi-scan circuits to provoke a slight temporal shift between scan chains and reduce peak power

Token scan architecture (1/2)
[Figure: scan cells 1, 2, ..., j, ..., N, all fed by ScanIn and connected to ScanOut, with a multiphase generator driven by CLK]
Scan architecture that uses the concept of a token ring to reduce shift power
SI is broadcast to all scan cells, but only one scan cell is activated at a time
An N-phase non-overlapping clocking scheme is applied, with one clock for each scan cell

Token scan architecture (2/2)
[Figure: token scan cell schematic (signals CLK, Scan, ScanIn, ScanOut, TCK, CLR, Si, Di, Ti)]
Alternative solution to avoid the large area overhead of the N multiphase clock routes and the inter-phase skews due to the different lengths of the N clock routes
It embeds the multiphase clock generator into each scan cell
Requires the use of a new type of scan cell, called a token scan cell

Scan clock splitting
[Figure: circuit under test with scan cells A and scan cells B between ScanIn and ScanOut, each group clocked by its own CLK/2 signal, with SE and ComOut; waveforms of CLK (period T) and of the two CLK/2 clocks, which are shifted in time with respect to each other]
The two clocks are synchronous with the system clock and have the same period during shift operation, except that they are shifted in time
During capture operation, the two clocks operate as the system clock
Lowers the transition density in the CUT, the scan chains and the clock tree
5. Low-Power BIST - Basics

[Figure: logic BIST controller driving the test pattern generator (TPG), the circuit under test (CUT) and the output response analyzer (ORA)]
A test pattern generator (TPG) automatically generates test patterns for application to the inputs of the circuit under test (CUT)
In-circuit TPGs constructed from LFSRs are most commonly used
LFSRs are also used for the output response analyzer (ORA)
BIST is implemented as test-per-scan or as test-per-clock
Even though it is slower, test-per-scan is the industry-preferred solution today
5. Low-Power BIST
Low-power test pattern generators (1/3)
[Figure: circuit under test fed by a slow LFSR/MISR and a normal-speed LFSR/MISR, with the clocks CLK and SCLK and the select signal Sel_CLK]
The dual-speed LFSR approach is based on two LFSRs running at different frequencies
Average power during test is reduced by connecting the CUT inputs with the highest transition densities to the low-speed LFSR, while the CUT inputs with the lowest activity are connected to the normal-speed LFSR
5. Low-Power BIST
Low-power test pattern generators (2/3)
[Figure: LFSR (r) whose outputs (k) feed an AND gate, followed by a toggle flip-flop (TFF) driving SI of the scan chain (m) of the CUT, ending at ScanOut]
The low-transition random test pattern generator inserts an AND gate and a toggle flip-flop (TFF) between the LFSR and the input of the scan chain to increase the correlation of neighboring bits in the scan vectors
The TFF holds its previous value until it receives a 1 on its input
The same value (0 or 1) is repeatedly scanned into the scan chain until the value at the output of the AND gate becomes 1
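The behavior of the AND gate and TFF can be sketched in Python. This is an illustrative model only: the LFSR below is a generic Fibonacci-style shift register whose width, seed and tap positions are hypothetical and not taken from the slide, and the TFF is assumed to start at 0 and to toggle before its value is scanned in.

```python
from typing import Iterable, Iterator, List, Sequence


def lfsr_states(seed: int, taps: Sequence[int], width: int) -> Iterator[List[int]]:
    """Yield successive states of a simple Fibonacci-style LFSR (hypothetical taps)."""
    state = [(seed >> i) & 1 for i in range(width)]
    while True:
        yield state
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = [feedback] + state[:-1]


def low_transition_bits(states: Iterable[Sequence[int]],
                        and_inputs: Sequence[int], count: int) -> List[int]:
    """Scan-in bit stream produced by the AND gate followed by the TFF.

    The TFF toggles only when all selected LFSR bits are 1; otherwise the
    previous value is scanned in again.
    """
    tff = 0
    stream = []
    for _, state in zip(range(count), states):
        if all(state[i] for i in and_inputs):
            tff ^= 1
        stream.append(tff)
    return stream


bits = low_transition_bits(lfsr_states(0b1001, taps=(2, 3), width=4),
                           and_inputs=(0, 1), count=20)
print("".join(map(str, bits)))
```

Because the AND output is 1 less often than a single LFSR bit, the scan-in stream tends to contain longer runs of identical values, which is the source of the power reduction.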
5. Low-Power BIST
Low-power test pattern generators (3/3)
By carefully choosing the seed of the LFSR (the choice of polynomial has no real influence)
By inserting translating logic between the LFSR and the CUT to obtain weighted random test vectors
By using Gray counters, which produce consecutive test vectors with only one bit difference, in the case of deterministic testing of data paths
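The single-bit-difference property of Gray counters is easy to check numerically. The sketch below uses the standard reflected binary Gray code and compares the total number of input bit flips against plain binary counting; the 4-bit width is an arbitrary choice for illustration.

```python
from typing import Sequence


def gray(i: int) -> int:
    """Reflected binary Gray code of i."""
    return i ^ (i >> 1)


def bit_flips(sequence: Sequence[int]) -> int:
    """Total number of bit positions that change between consecutive values."""
    return sum(bin(a ^ b).count("1") for a, b in zip(sequence, sequence[1:]))


width = 4
binary_sequence = list(range(2 ** width))
gray_sequence = [gray(i) for i in binary_sequence]

print("binary counter flips:", bit_flips(binary_sequence))
print("Gray counter flips:  ", bit_flips(gray_sequence))
```

For 4 bits, the binary counter produces 26 bit flips over the full count and the Gray counter 15, one per step.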
5. Low-Power BIST
Vector filtering BIST
[Figure: LFSR clocked by CLK feeding the circuit under test; a decoder and a flip-flop (FF) switch between LFSR activation and LFSR inhibition along the test sequence V0 ... Vi ... Vj ... Vk ... Vl]
Prevent application of non-detecting (but power-consuming) vectors to the CUT
A decoder is used to store the first and last vectors of each sub-sequence of consecutive non-detecting vectors to be filtered
Minimizes average power without reducing fault coverage
5. Low-Power BIST
Circuit partitioning
[Figure: circuit partitioned into sub-circuits C1 and C2, with MUX and DMUX elements inserted at the partition boundaries]
Partition the original circuit (using a graph partitioning algorithm that minimizes the cut size) into structural sub-circuits so that each sub-circuit can be successively tested through different BIST sessions
FC and test time are unchanged, and area overhead is quite low
Drawbacks are a slight penalty on performance and an impact on routing
5. Low-Power BIST
Power-aware test scheduling (1/3)
[Figure: power versus test time for the scheduled tests, with the power limit]
The goal is to determine the blocks (memory, logic, analog, etc.) of an SOC to be tested in parallel at each stage of the BIST session in order to keep power dissipation under a specified limit while optimizing test time
Some of the test resources (pattern generators and response analyzers) must be shared among the various blocks
5. Low-Power BIST
Power-aware test scheduling (2/3)
The NP-complete test scheduling problem may be addressed by using a compatibility graph and heuristic-driven algorithms
For given power constraints and parameters related to the test organization (fixed, variable, or undefined test sessions, with or without precedence constraints) or to the test structure (test bus width, test resource sharing), these solutions make it possible to optimize the overall SOC test time
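A simple heuristic gives a feel for power-constrained scheduling. The sketch below is a greedy first-fit packing, not the compatibility-graph method mentioned above: it ignores test resource sharing and precedence constraints, assumes that the power of a session is the sum of the powers of its blocks, and assumes that a session lasts as long as its longest test. Block names, test times and power values are hypothetical.

```python
from typing import Dict, List, Tuple


def schedule_tests(tests: Dict[str, Tuple[int, int]],
                   power_limit: int) -> List[List[str]]:
    """Greedily pack block tests into sessions under a power limit.

    tests maps a block name to (test_time, test_power). Blocks are taken
    longest test first and placed in the first session with enough
    remaining power budget.
    """
    sessions: List[Dict] = []
    for name, (_, power) in sorted(tests.items(), key=lambda kv: -kv[1][0]):
        if power > power_limit:
            raise ValueError(f"{name} alone exceeds the power limit")
        for session in sessions:
            if session["power"] + power <= power_limit:
                session["blocks"].append(name)
                session["power"] += power
                break
        else:
            sessions.append({"blocks": [name], "power": power})
    return [session["blocks"] for session in sessions]


def total_test_time(tests: Dict[str, Tuple[int, int]],
                    sessions: List[List[str]]) -> int:
    """Sessions run one after another; each lasts as long as its longest test."""
    return sum(max(tests[block][0] for block in session) for session in sessions)


soc_tests = {"logic": (100, 6), "memory": (80, 5), "analog": (40, 3), "io": (30, 2)}
plan = schedule_tests(soc_tests, power_limit=10)
print(plan, total_test_time(soc_tests, plan))
```

With these sample values the four blocks fit into two sessions, for a total test time of 180 time units instead of 250 when the blocks are tested one at a time.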
5. Low-Power BIST
Power-aware test scheduling (3/3)
[Figure: SOC with Core 1 to Core 5, each with its own BIST, a test controller, and an embedded tester with its tester memory]
Main focus is on total energy minimization under tester memory constraint
The test set is composed of core-level, locally generated pseudo-random test patterns and additional deterministic test patterns that are generated off-line and stored in the system
A careful tradeoff between the deterministic pattern lengths of the cores must therefore be made in order to produce a globally optimal solution
6. Low-Power Test Data Compression

High test data volume leads to a long testing time and may exceed the limited memory depth of the ATE
Test data compression involves encoding a test set so as to reduce its size
ATE limitations, i.e., tester storage memory and the bandwidth gap between the ATE and the CUT, may hence be overcome
Using compressed test data requires an on-chip decoder that decompresses the data
Low-power test data compression techniques are needed to concurrently reduce scan power dissipation and test data volume during test

Coding-based schemes
Apply 0-filling to ATPG test cubes and then encode runs of 0s with Golomb codes (run-length codes), which also reduces the number of transitions (75%)
Golomb coding is very inefficient for runs of 1s
A synchronization signal between the ATE and the CUT is required, as the compressed data (codewords) are of variable length
Alternating run-length coding improves the encoding efficiency of Golomb coding (it can encode both runs of 0s and runs of 1s)
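Golomb encoding of runs of 0s can be sketched as follows. The code assumes a group size m that is a power of two, a unary prefix made of 1s closed by a 0, and a fixed-length binary tail of log2(m) bits; it also assumes that each run of 0s is terminated by a 1 and that a trailing run without a terminating 1 is encoded like any other run (the decoder would then need to know the vector length). The function names are hypothetical.

```python
def golomb_encode_run(run_length: int, m: int) -> str:
    """Codeword for a run of `run_length` 0s followed by a 1 (m a power of two)."""
    if m < 2 or m & (m - 1):
        raise ValueError("m must be a power of two and at least 2")
    tail_bits = m.bit_length() - 1          # log2(m)
    quotient, remainder = divmod(run_length, m)
    prefix = "1" * quotient + "0"           # unary group prefix
    tail = format(remainder, f"0{tail_bits}b")
    return prefix + tail


def golomb_encode(bits: str, m: int) -> str:
    """Encode a fully specified (e.g., 0-filled) bit string run by run."""
    runs = bits.split("1")
    trailing = runs.pop()                   # 0s after the last 1, if any
    codewords = [golomb_encode_run(len(run), m) for run in runs]
    if trailing:
        codewords.append(golomb_encode_run(len(trailing), m))
    return "".join(codewords)


print(golomb_encode("0001001", m=4))
print(golomb_encode("0000000000000001", m=4))
```

A long run of 0s compresses well (the 16-bit second example becomes a 6-bit codeword), whereas a run of 1s costs one full codeword per bit, which is the inefficiency noted above.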

Linear-decompression-based schemes (1/2)
Linear decompressors consist of XORs and flip-flops
In LFSR reseeding
  Deterministic test cubes generated by expanding seeds
  Typically 1-5% of bits in test vector specified
  Most bits need not be considered when seed computed
  Size of seed much smaller than size of vector
  Significantly reduces test data volume and bandwidth
Problem: Xs in test cubes filled randomly
  Results in excessive switching during scan shifting

Linear-decompression-based schemes (2/2)
Low-power linear decompression (LD) using LFSR reseeding can be used
  LFSR reseeding not used to directly encode specified bits
  Each test cube divided into blocks
  LFSR reseeding used only to produce blocks containing transitions
  For blocks not containing transitions
    Logic value fed into scan chain simply held constant
  Reduces number of transitions in scan chain
Efficient solution to trade-off between test data compression and test power reduction

Broadcast-scan-based schemes (1/2)
Segmented Addressable Scan (SAC)
[Figure: tester channel or input decompressor feeding scan segments 1, 2, ..., M; a multi-hot decoder driven by the segment address controls the clock tree of the segments; the segments feed an output compressor]
Based on broadcasting the same value to multiple scan segments
SAC enhances the Illinois scan architecture by removing the requirement that all segments be compatible in order to benefit from the segmentation
Test power is reduced because segments that are incompatible while a given test pattern is being loaded are not clocked

Broadcast-scan-based schemes (2/2)
Progressive Random Access Scan (PRAS)
[Figure: SRAM-like scan cell array with a row enable shift register, a column address decoder (column address input), a column line driver, sense amplifiers & MISR, test control (Mode), and scan data I/O]
Scan cells are configured as an SRAM-like structure using PRAS scan cells
PRAS allows individual accessibility to each scan cell, thus eliminating unnecessary switching activity during scan, while reducing the test application time and test data volume by updating only a small fraction of the scan cells
7. Low-Power RAM Testing
Motivated by the need to concurrently test several banks of memories in a system to reduce test time
A first strategy is to reorder memory tests to reduce the switching activity on each address line while retaining the fault coverage and the overall memory test time

Test            Original test                                                          Low-power test
Zero-One        (W0); (R0); (W1); (R1);                                                s (W0, R0, W1, R1);
Checker Board   (W(1odd/0even)); (R(1odd/0even)); (W(0odd/1even)); (R(0odd/1even));    s (W(1odd/0even), R(1odd/0even), W(0odd/1even), R(0odd/1even));

s: single bit change (SBC) counting
Power dissipation reduced by a factor of two to sixteen
A special design of the BIST circuitry is needed
A second strategy is to exploit the predictability of the addressing sequence to reduce the pre-charge activity during test
Pre-charge circuits contribute up to 70% of power dissipation
In functional mode, the cells are selected in random sequence, and all pre-charge circuits need to be always active, while in test mode the access sequence is known, and hence only the columns that are to be selected need to be pre-charged
This low-power test mode can be implemented by using modified pre-charge control circuitry, and by exploiting the first degree of freedom of March tests, which allows choosing a specific addressing sequence
The addressing sequence is fixed to word line after word line, and the pre-charge activity is restricted to only two columns for each clock cycle: the selected column and the following one

[Figure: bit-line pairs BLj-1/BLBj-1, BLj/BLBj and BLj+1/BLBj+1 with their pre-charge circuits (Prec), the pre-charge signals Prj-1, Prj, Prj+1, the column select signals CSj-1, CSj, CSj+1, and the additional pre-charge control logic; LPtest = low-power test command]
50% power savings with negligible impact on area overhead and memory performance
Summary and Conclusions
Test throughput and manufacturing yield may be affected by excessive test power
Therefore, lowering test power has been and is still a focus of intense research and development
The following points have been surveyed:
  Test power parameters and contributors
  Problems induced by an increased test power
  Structural and algorithmic solutions for low-power test, along with their impacts on parameters such as fault coverage, test time, area overhead, circuit performance penalty, and design flow modification
Summary and Conclusions

Additional concerns
  Testing when new low-power design techniques are used
  Dynamic power management techniques
    "Shut-down" parts of design when idle
    Testing currently done sequentially
    Test deals with power domains one at a time
    Practice becoming inadequate due to test time concern
  Multiple-voltage domains used to reduce power
    How to safely handle test of such designs?
