ML For Timimg PDF
Table I (excerpt)
Term                        Meaning
∆Dsi0                       Incremental non-SI delay (Non-SI Incr Delay) of an arc without coupling, reported in timing analysis in non-SI mode
Path delay                  Difference in arrival times at the clock pin of the launch flip-flop and the D pin of the capture flip-flop
Psi                         SI path delay across all timing arcs, reported in timing analysis in SI mode
Psi0                        Non-SI path delay across all timing arcs, reported in timing analysis in non-SI mode
∆Psi                        Difference between Psi0 and Psi
fCc,red                     Miller coupling factor in non-SI mode, i.e., Cc × fCc,red is added to Cg
fCc                         Coupling capacitance factor in SI mode, i.e., Cc is changed to Cc × fCc
fCg                         Ground capacitance factor in SI or non-SI mode, i.e., Cg is changed to Cg × fCg
fRw                         Resistance factor in SI or non-SI mode, i.e., Rw is changed to Rw × fRw
S                           Stage in which the arc appears
Nstg                        Number of stages in the path in which the arc appears
rS,Nstg                     Ratio of arc-stage to total #stages in path
clkp                        Clock period
Naggr                       Number of aggressors for a victim net
Ar                          Toggle rate of a net
arr(min,max),(r,f),(a,v)    Minimum (resp. maximum) rise (resp. fall) arrival time of an aggressor (resp. a victim)
LE                          Logical effort of the driver of a net

[...] change in the clock period changes toggle rates of aggressor and victim nets by different amounts, which can change the alignment of the aggressor and victim timing windows. Two phenomena are particularly challenging for analytical SI delay models.

Challenge 1. Path slack variation with clock period. Figure 3 shows the maximum delta of slack in a path with 32 stages between SI and non-SI analyses for an OpenCores [23] design dec viterbi that is signed off at 1.0ns. The delta is 81ps when the clock period varies between 0.87ns and 1.3ns. However, when the clock period decreases below 0.87ns, the maximum delta in path slack increases non-monotonically and becomes 143ps at a clock period of 0.8ns. Figure 4 shows timing parameters related to SI and non-SI analyses for several nets and cells. The nets n33458 and n33452 shown in brown are responsible for large delta transition times and incremental delays in SI mode. We highlight these deltas and the impact on path slack using the blue box. The same path has a delta slack of 49ps when the clock period is 1.0ns, as shown in Figure 5. The path that has the maximum delta slack of 81ps at a clock period of 1.0ns continues to have the same value of delta slack at a clock period of 0.8ns, as shown in Figure 6.

Challenge 2. Arc delay and incremental transition time variation with ground and coupling capacitances. We illustrate non-intuitive impacts of varying ground and coupling capacitances of the victim net n33452 on arc delay and incremental transition time in Figures 7(a) and (b), respectively. When the ground capacitance is changed from 0.006pF to 0.0132pF, the incremental delay in non- [...]

[Figure 3: plot versus Clock period (ns), 0.80 to 1.30.]
Fig. 3. Maximum path slack delta between SI and non-SI modes over the top-1000 setup-critical paths in a design signed off at 1.0ns. The delta increases from 81ps to 143ps as the clock period is reduced below 0.87ns.

Our contributions in this paper are summarized as follows.
(1) We analyze multiple sources that cause timing divergence between SI and non-SI modes and provide new insights on electrical and logic structure parameters that affect incremental transition time, incremental delay and path delay in SI mode. Unlike [4], we demonstrate that several new parameters affect SI Incr Delay ∆Dsi (as defined in Table I) of an arc in a timing path.
(2) We develop new machine learning-based models for incremental transition time and delay due to SI, and compose these models to derive a new model for path delay that is different from [4].
(3) The worst-case absolute errors in our modeling predictions of incremental transition time, incremental delay due to SI and SI-aware path delay are 7.0ps, 5.2ps and 8.2ps, respectively.
We have developed and tested our models using timing reports of block implementations with 28nm FDSOI foundry libraries. Compared to the recent work of [4], we reduce the worst-case error in prediction of incremental delay due to SI from 60ps to 5.2ps.

The remainder of the paper is organized as follows. In Section II, we review related works on studying correlations of timing reports/predictions between different tools/models with attention to SI effect. In Section III, we describe our methodology to select significant parameters and derive machine learning models for incremental delay and path delay in SI mode. In Section IV, we describe our experimental setup and present results. In Section V, we describe future works and conclude the paper.

II. RELATED WORKS
Prior works that quantify miscorrelations of SI-induced delay between different analytical timing models or timing tools are limited.
An analytical model that captures SI-induced delay is due to Sapatnekar [10]; it lumps coupling capacitance to ground with the value of Miller coupling factor being 0, 1 or 2 based on the timing window overlap and switching directions of the signals. The effect of crosstalk on net delay is estimated using an iterative algorithm with runtime that is polynomial in the number of nets. The results are not verified with results from other tools or models. Xiao et al. [16] derive an analytical two-pole model for RC interconnect noise waveform calculation with coupling capacitance. A Newton-Raphson iteration is used to obtain the timing information.
Path Delta – 143ps
Cell / net name                                  DTran          SI Incr        Non-SI Incr    SI Path        Non-SI Path
                                                 (ns)           Delay (ns)     Delay (ns)     Delay (ns)     Delay (ns)
------------------------------------------------------------------------------------------------------------------------
inst_ram_ctrl_write_ram_fsm_reg_0_/Q             0.000          0.000          0.069          0.269          0.269
inst_ram_ctrl_write_ram_fsm_0_ (net)
….
FE_OCP_RBC23542_n28670/Z                         0.000          0.000          0.027          0.428          0.428
FE_OCP_RBN23542_n28670 (net)
FE_OCP_RBC23543_n28670/A                         0.004          0.004          0.013          0.445          0.441
….
U143152/Z                                        0.000          0.000          0.034          0.809          0.800
n33458 (net)
U92231/C                                         0.003          0.002          0.000          0.811          0.801
…
U99631/Z                                         0.000          0.000          0.065          0.769          0.762
n33477 (net)
U145471/C                                        0.035          0.022          0.002          0.793          0.764
…
U121581/Z                                        0.000          0.000          0.104          0.967          0.935
n33452 (net)
U121579/B                                        0.133 (0.024)  0.115 (0.021)  0.004          1.082 (0.988)  0.939
U121579/Z                                        0.000          0.000          0.057          1.139 (1.045)  0.996
n79492 (net)
inst_ram_ctrl_inst_generic_sp_ram_0_q_reg_21_/D  0.000          0.000          0.000          1.139 (1.045)  0.996
Fig. 4. Timing divergence in a path with the maximum delta slack of 143ps at a clock period of 0.8ns. As defined in Table I, “DTran” is the delta transition due to coupling, “SI Incr Delay” is the incremental delay due to coupling, “Non-SI Incr Delay” is the incremental delay without coupling, “SI Path Delay” is the accumulated path delay with coupling and “Non-SI Path Delay” is the accumulated path delay without coupling. The nets in green color do not contribute to “DTran” and “SI Incr Delay”, whereas the nets in brown color cause non-zero “DTran” and “SI Incr Delay”. The nets that contribute to the delta slack of 143ps are highlighted inside the blue boxes. The values in green are for the same path but analyzed at a clock period of 1.0ns.
Path Delta – 49ps
Cell / net name                                  DTran          SI Incr        Non-SI Incr    SI Path        Non-SI Path
                                                 (ns)           Delay (ns)     Delay (ns)     Delay (ns)     Delay (ns)
------------------------------------------------------------------------------------------------------------------------
inst_ram_ctrl_write_ram_fsm_reg_0_/Q             0.000          0.000          0.069          0.269          0.269
inst_ram_ctrl_write_ram_fsm_0_ (net)
….
FE_OCP_RBC23542_n28670/Z                         0.000          0.000          0.027          0.428          0.428
FE_OCP_RBN23542_n28670 (net)
FE_OCP_RBC23543_n28670/A                         0.004          0.004          0.013          0.445          0.441
….
U143152/Z                                        0.000          0.000          0.034          0.809          0.800
n33458 (net)
U92231/C                                         0.003          0.002          0.000          0.811          0.801
…
U99631/Z                                         0.000          0.000          0.065          0.769          0.762
n33477 (net)
U145471/C                                        0.035          0.022          0.002          0.793          0.764
…
U121581/Z                                        0.000          0.000          0.104          0.963          0.935
n33452 (net)
U121579/B                                        0.024          0.021          0.004          0.988          0.939
U121579/Z                                        0.000          0.000          0.057          1.045          0.996
n79492 (net)
inst_ram_ctrl_inst_generic_sp_ram_0_q_reg_21_/D  0.000          0.000          0.000          1.045          0.996
Fig. 5. The path with delta slack of 143ps at clock period of 0.8ns has delta slack of 49ps at clock period of 1.0ns.
Correlation with SPICE shows good matching. However, the Newton-Raphson iteration is computationally expensive and may not be practical for use with realistic designs.
Thiel et al. [12] leverage the ability of PrimeTime (PT) [27] to output a SPICE netlist, and use SPICE simulation to calibrate the PT timing report. However, SI effects are not addressed in this work. Motassadeq et al. [2] extend this analysis flow by using PrimeTime SI (PTSI) [27] instead of PT to include SI effects. Mohamed et al. [9] correlate PTSI-reported delta delay with coupling capacitance and drive strengths of the aggressor and victim. However, they do not provide a quantitative model for these correlations. Venugopal et al. [14] characterize delays calculated by PTSI and correlate them with HSPICE [26], but no model predicting the discrepancy between HSPICE and PTSI is presented.
To minimize the gap between an internal incremental STA tool and a signoff timing tool, Kahng et al. [7] use least-squares regression to model wire delay. They then use offset-based correlation with a signoff timing tool to calibrate the path slacks reported by the internal STA tool. However, they do not explicitly model signoff tools in SI mode.
To correlate different signoff timing tools, Mishra et al. [8] recalculate clock uncertainties based on miscorrelation between different tools, and then use the new uncertainty values for better timing correlation between the tools. Han et al. [4] provide a deep learning methodology to correlate timing between different signoff timers. However, they only correlate either non-SI to non-SI mode or SI to SI mode. The models in [4] do not predict timing in SI mode using the timing reports of non-SI mode.
Our work is closely related to that of Han et al. [4], even though the work in [4] does not calibrate non-SI to SI. The key differences are: (i) a new model for incremental transition time due to SI; (ii) a new model for incremental delay due to SI; (iii) a new model for SI-aware path delay; and (iv) validation with a wide range of testcases that include memories from 28nm foundry FDSOI libraries. The new models help us achieve higher modeling accuracy in calibrating non-SI to SI as compared to the models in [4].
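As a rough illustration of the Miller-factor lumping attributed to Sapatnekar [10] in this section, the sketch below adds each coupling capacitance to the victim's ground capacitance, scaled by a factor of 0, 1 or 2. The mapping of switching states to factors (same direction 0, quiet 1, opposite direction 2) is the conventional one and is an assumption here, as are the names `Aggressor` and `effective_ground_cap`; it is not the iterative algorithm of [10].

```python
from enum import Enum


class Aggressor(Enum):
    """Assumed Miller coupling factors; the enum value is the factor."""
    SAME_DIRECTION = 0      # aggressor switches with the victim, windows overlap
    QUIET = 1               # aggressor does not switch in the victim's window
    OPPOSITE_DIRECTION = 2  # aggressor switches against the victim, windows overlap


def effective_ground_cap(cg_pf, couplings):
    """Lump coupling capacitances to ground.

    cg_pf:     ground capacitance of the victim net (pF)
    couplings: iterable of (coupling_cap_pf, Aggressor) pairs
    """
    return cg_pf + sum(cc_pf * state.value for cc_pf, state in couplings)


# Hypothetical victim with two aggressors (values chosen for illustration only).
worst_case = effective_ground_cap(
    0.006,
    [(0.010, Aggressor.OPPOSITE_DIRECTION), (0.004, Aggressor.QUIET)],
)
best_case = effective_ground_cap(0.006, [(0.010, Aggressor.SAME_DIRECTION)])
```

The same net therefore presents a load between Cg and Cg + 2Cc depending on aggressor activity, which is why timing-window alignment matters for SI delay.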
Path Delta – 81ps
Cell / net name                                  DTran          SI Incr        Non-SI Incr    SI Path        Non-SI Path
                                                 (ns)           Delay (ns)     Delay (ns)     Delay (ns)     Delay (ns)
------------------------------------------------------------------------------------------------------------------------
inst_ram_ctrl_write_ram_ptr_reg_0_/Q             0.000          0.000          0.087          0.285          0.285
inst_ram_ctrl_write_ram_ptr_0_ (net)
….
FE_RC_3395_0/Z                                   0.000          0.000          0.015          0.424          0.424
FE_OCP_RBN22308_n20174 (net)
FE_OCP_RBN22308_n20174/A                         0.014          0.009          0.019          0.452          0.443
….
U98160/Z                                         0.000          0.000          0.029          0.541          0.532
n22678 (net)
FE_OFC16-76_n22678/C                             0.003          0.002          0.000          0.543          0.532
…
U99420/Z                                         0.000          0.000          0.053          0.742          0.731
n25563 (net)
U145193/C                                        0.016          0.012          0.000          0.754          0.731
U145193/Z                                        0.000          0.000          0.114          0.868          0.845
n25556 (net)
U89670/B                                         0.089          0.058          0.006          0.932          0.851
…
U121246/Z                                        0.000          0.000          0.021          1.063          0.982
n70246 (net)
inst_ram_ctrl_inst_generic_sp_ram_1_q_reg_18_/D  0.000          0.000          0.000          1.063          0.982
Fig. 6. Timing divergence in a path with delta slack of 81ps at clock periods of both 1.0ns and 0.8ns.
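The path deltas quoted for Figures 4 to 6 can be checked from the endpoint rows of the reports: each is the SI path delay minus the non-SI path delay at the D pin of the capture flip-flop. The short sketch below does this arithmetic; the helper name `path_delta_ps` and the dictionary keys are illustrative only.

```python
def path_delta_ps(si_path_delay_ns, non_si_path_delay_ns):
    """Difference between SI and non-SI accumulated path delay, in whole ps."""
    return round((si_path_delay_ns - non_si_path_delay_ns) * 1000)


# (SI Path Delay, Non-SI Path Delay) in ns, taken from the last row of each report.
endpoints = {
    "Fig. 4, clock period 0.8ns": (1.139, 0.996),
    "Fig. 5, clock period 1.0ns": (1.045, 0.996),
    "Fig. 6": (1.063, 0.982),
}

deltas = {name: path_delta_ps(si, non_si) for name, (si, non_si) in endpoints.items()}
for name, delta in deltas.items():
    print(f"{name}: {delta}ps")  # 143ps, 49ps and 81ps respectively
```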
[Figure 7: two plots of Arc Timing (ns) for DTran, SI Incr Delay and Non-SI Incr Delay, versus (a) Ground Cap (pF), 0.004 to 0.014, and (b) Coupling Cap (pF), 0.01 to 0.03.]
Fig. 7. Timing of the victim net that has the maximum divergence at a clock period of 0.8ns when only (a) ground capacitance and (b) coupling capacitance of the victim net is varied. The figure shows delta transition due to coupling as “DTran” in brown rectangles, arc delay due to coupling as “SI Incr Delay” in green triangles and arc delay without coupling as “Non-SI Incr Delay” in blue diamonds.

III. METHODOLOGY FOR TIMING CORRELATION IN SI MODE
Our modeling methodology includes (i) selection of parameters that affect incremental delay in SI mode, and (ii) application of nonlinear modeling techniques to capture the complex interactions of parameters so as to accurately predict the incremental delay in SI mode.

A. Selection of parameters
We have studied multiple electrical and circuit parameters that can affect incremental delay in SI mode and have drawn from the list of parameters used to model wire delay in SI mode in [4]. Our analyses indicate that the transition time at the output pin of a net’s driver, the product of wire resistance and capacitances, are [...] the same way for two of the parameters used in [4]. In addition, signoff timing tools use complex algorithms to determine timing windows for less pessimistic delay analyses in SI mode. This is difficult to model because timing windows change with operating conditions. We introduce new electrical parameters to approximate the effect of timing windows for the aggressor with the largest coupling capacitance. Figures 9(a)–(d) show two new electrical and two new structural parameters that affect the incremental delay in SI mode.
We use the following 12 parameters in our modeling: (i) incremental delay in non-SI mode; (ii) transition time in non-SI mode; (iii) clock period; (iv) resistance; (v) coupling capacitance; (vi) ratio of coupling-to-total capacitance; (vii) toggle rate; (viii) number of aggressors; (ix) ratio of the stage in which the arc of the victim net appears to the total number of stages in the path; (x) logical effort of the net’s driver; and (xi), (xii) the differences in max (respectively, min) arrival times¹ of the signal at the driver’s output pin for the victim and its strongest aggressor.² We choose our parameters based on sensitivity of the parameter to incremental transition time or incremental delay due to SI, or SI-aware path delay. Our experimental results indicate that dropping any of the parameters can reduce the modeling accuracy by at least 5%. Therefore, we use all the parameters indicated in Equations (1), (2) and (3) to develop our models. We do not use any layout parameters since layout is reflected in parameters such as coupling capacitance, total capacitance and wire resistance.
We model the incremental transition time due to SI as
    $\Delta T_{si} = f(T_{si0}, R_w, C_c, r_{C_c}, C_{tot}, clkp, LE)$    (1)
We further model the incremental delay due to SI as
    $\Delta D_{si} = f(\Delta D_{si0}, \Delta T_{si}, R_w, C_c, r_{C_c}, C_{tot}, r_{S,N_{stg}}, clkp, \Delta arr_{min,(r,f)}, \Delta arr_{max,(r,f)}, A_r, LE)$    (2)

¹ We use rise and fall arrival times based on the signal’s transition at the output pin of the net’s driver, from timing reports in non-SI mode.
² We consider the net with largest coupling capacitance to the victim as the strongest aggressor.
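The way Equation (1) feeds Equation (2) can be sketched as follows: the predicted incremental transition time becomes one of the inputs of the incremental delay model. This is only a minimal sketch. The class `ArcFeatures`, its field names and the two stand-in callables are hypothetical, and the argument order simply follows Equations (1) and (2); the learned models themselves are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Model = Callable[[Sequence[float]], float]  # stand-in for a trained regressor


@dataclass
class ArcFeatures:
    """Per-arc inputs taken from non-SI timing reports (field names are illustrative)."""
    d_si0: float           # incremental delay in non-SI mode
    t_si0: float           # transition time in non-SI mode
    r_w: float             # wire resistance
    c_c: float             # coupling capacitance
    c_tot: float           # total capacitance
    stage: int             # stage S in which the arc appears
    n_stg: int             # number of stages in the path
    clkp: float            # clock period
    d_arr_min: float       # victim/aggressor difference in min arrival time
    d_arr_max: float       # victim/aggressor difference in max arrival time
    toggle_rate: float     # toggle rate of the net
    logical_effort: float  # logical effort of the net's driver

    @property
    def r_cc(self) -> float:
        """Ratio of coupling-to-total capacitance."""
        return self.c_c / self.c_tot

    @property
    def r_stage(self) -> float:
        """Ratio of the arc's stage to the total number of stages."""
        return self.stage / self.n_stg


def transition_inputs(a: ArcFeatures) -> List[float]:
    """Arguments of Equation (1), in order."""
    return [a.t_si0, a.r_w, a.c_c, a.r_cc, a.c_tot, a.clkp, a.logical_effort]


def delay_inputs(a: ArcFeatures, d_t_si: float) -> List[float]:
    """Arguments of Equation (2), in order; d_t_si is the output of Equation (1)."""
    return [a.d_si0, d_t_si, a.r_w, a.c_c, a.r_cc, a.c_tot, a.r_stage, a.clkp,
            a.d_arr_min, a.d_arr_max, a.toggle_rate, a.logical_effort]


def predict_incr_delay(a: ArcFeatures, f_tran: Model, f_delay: Model) -> float:
    """Compose the two models: predict delta transition, then delta delay."""
    return f_delay(delay_inputs(a, f_tran(transition_inputs(a))))
```

Any regressor exposing a call of this shape could be dropped in for `f_tran` and `f_delay`; the composition itself does not depend on the model family.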
of both the victim and aggressor nets, and can change the timing
[...] training, 10% for validation and the remaining 30% for testing. The training time of our models is 10.6 hours for ANN, 23.9 hours for SVM and 12 minutes for HSM on an Intel Xeon E5-2640 2.5GHz server with eight threads. This is a one-time overhead. After the models are trained, the time to test is ∼10 minutes for every 10K data points.
We conduct three experiments to demonstrate accuracy and robustness of our models.

[...] respectively. The actual incremental transition time due to SI is 114.6 − 34.6 = 80ps, whereas our model for incremental transition time predicts 73ps. The difference is 7.0ps. Therefore, per Equation (4), the percentage error is 7.0/80 = 8.8%.
⁴ In SI and non-SI modes the path delays are 1055.2ps and 935.5ps, respectively. The actual difference in SI-aware path delay is 1055.2 − 935.5 = 119.7ps, whereas our model for SI-aware path delay predicts 109.6ps. The difference is 8.2ps. Therefore, per Equation (5), the percentage error is 8.2/119.7 = 6.9%.
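The percentage errors quoted in the footnotes above follow from dividing the absolute error by the actual SI-induced increment. Equations (4) and (5) are not reproduced in this excerpt, so the form below is inferred from the footnote arithmetic and should be read as an assumption; the function name `percentage_error` is illustrative.

```python
def percentage_error(actual_increment_ps, abs_error_ps):
    """Absolute error as a percentage of the actual SI-induced increment."""
    return 100.0 * abs_error_ps / actual_increment_ps


# Incremental transition time: actual increment 114.6 - 34.6 ps, absolute error 7.0 ps.
tran_err = percentage_error(114.6 - 34.6, 7.0)

# SI-aware path delay: actual increment 1055.2 - 935.5 ps, absolute error 8.2 ps.
path_err = percentage_error(1055.2 - 935.5, 8.2)

print(f"{tran_err:.2f}% {path_err:.2f}%")
```

The two results, about 8.75% and 6.85%, agree with the rounded 8.8% and 6.9% reported in the footnotes.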
Path Delta (Post-Fitting) – 143ps
Cell / net name                                  (Actual) SI Incr   (Model) SI Incr    (Actual) SI Path   (Model) SI Path
                                                 Delay (ns)         Delay (ns)         Delay (ns)         Delay (ns)
------------------------------------------------------------------------------------------------------------------------
inst_ram_ctrl_write_ram_fsm_reg_0_/Q             0.000              0.000              0.269              0.269
inst_ram_ctrl_write_ram_fsm_0_ (net)
….
FE_OCP_RBC23542_n28670/Z                         0.000              0.000              0.428              0.428
FE_OCP_RBN23542_n28670 (net)
FE_OCP_RBC23543_n28670/A                         0.004              0.004              0.445              0.445
….
U143152/Z                                        0.000              0.000              0.809              0.809
n33458 (net)
U92231/C                                         0.002              0.002              0.811              0.811
…
U99631/Z                                         0.000              0.000              0.769              0.769
n33477 (net)
U145471/C                                        0.022              0.023              0.793              0.794
…
U121581/Z                                        0.000              0.000              0.967              0.968
n33452 (net)
U121579/B                                        0.115              0.118              1.082              1.086
U121579/Z                                        0.000              0.000              1.139              1.140
n79492 (net)
inst_ram_ctrl_inst_generic_sp_ram_0_q_reg_21_/D  0.000              0.000              1.139              1.144
Fig. 15. Actual and predicted values of “SI Incr Delay” and “SI Path Delay” (defined in Table I) of the same path shown in Figure 4. Our models reduce the path delay (as well as path slack) divergence from 143ps to 5ps. The predicted values that differ from the actual values are highlighted in red.
[Figure 12: scatter plot of Predicted Incremental Transition Time (ps) versus Actual Incremental Transition Time (ps); annotated error 7.0ps.]
Fig. 12. Actual versus predicted incremental transition times due to SI.
[Figure 14: scatter plot versus Actual Path Delay (ps); annotated error 8.2ps.]
Fig. 14. Actual versus predicted SI-aware path delays.