
Jitter Analysis Methods for the Design and
Test of High-Speed Serial Links
Distribution Tail Fitting Based on Gaussian
Quantile Normalization

Dipl.-Ing. Stefan Erb

Submitted as thesis to attain the academic degree “Dr. techn.”
at the Graz University of Technology
Institute of Electronics

Supervisor: Univ.-Prof. Dipl.-Ing. Dr. techn. Wolfgang Pribyl
Co-Supervisor: Univ.-Prof. Dipl.-Ing. Dr. techn. Ernst Stadlober

Graz, May 2011

Statutory Declaration
I declare that I have authored this thesis independently, that I have not used any sources or
resources other than those declared, and that I have explicitly marked all material which has been
quoted either literally or by content from the sources used.
....................... .............................
Date Signature
Abstract
This thesis presents analysis methods for high-speed serial interfaces that investigate the characteristics and impact of timing uncertainty, or jitter. Such methods are widely used for estimating the bit error rate (BER) of serial links, and for signal integrity measurements and compliance testing.
First, an algorithm is developed that determines the Gaussian tail behavior of measured jitter distributions and separates them into random and deterministic components using an efficient optimization scheme. The resulting timing budget makes it possible to accurately quantify the total jitter of clock signals and phase-locked loop (PLL) systems. The fast analysis technique can be implemented as an embedded system and thus supports a broad variety of design-for-test (DFT) applications such as production testing, on-chip diagnostics and on-line monitoring. One chapter of the thesis is dedicated to these hardware design aspects.
The underlying mathematical principle is Gaussian quantile normalization, which has already been discussed and utilized in previous approaches. A detailed performance comparison is therefore carried out to highlight the different properties of these approaches. Further, it is demonstrated how the proposed method can easily be generalized for use with arbitrary non-Gaussian tails, as encountered in optical high-speed interconnects.
In a first case study, a fast system-level model of a serial high-speed PLL is developed. The system has already been realized as a test structure and thus allows a direct comparison with measurements. These include the analysis of closed-loop phase noise, jitter transfer characteristics and jitter tolerance behavior. The latter two in particular make use of the developed method. A second case study realizes a BER tester with a 3 Gb/s serial high-speed interface on an FPGA board. It demonstrates the practical aspects of jitter measurement, diagnosis and optimization.
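The principle of Gaussian quantile normalization named above can be illustrated with a minimal Python sketch. It is not the scaled Q-normalization algorithm developed in this thesis: it assumes a tail amplitude of 1, uses synthetic noise-free data, and the function name `fit_gaussian_tail` and all numerical values are hypothetical. Mapping the tail probabilities through the inverse standard normal CDF turns a Gaussian tail into a straight line, so a linear regression recovers the mean and standard deviation of the tail.

```python
from statistics import NormalDist

def fit_gaussian_tail(times, cdf_values):
    """Fit mu and sigma of a Gaussian to left-tail CDF samples.

    Assumes a tail amplitude of 1, so Q = (t - mu) / sigma is a straight line.
    """
    std_normal = NormalDist()
    # Q-normalization: map each tail probability to its standard normal quantile.
    q = [std_normal.inv_cdf(p) for p in cdf_values]
    n = len(times)
    mean_t = sum(times) / n
    mean_q = sum(q) / n
    # Ordinary least-squares line q = slope * t + intercept.
    s_tt = sum((t - mean_t) ** 2 for t in times)
    s_tq = sum((t - mean_t) * (qi - mean_q) for t, qi in zip(times, q))
    slope = s_tq / s_tt
    intercept = mean_q - slope * mean_t
    sigma = 1.0 / slope
    mu = -intercept / slope
    return mu, sigma

# Synthetic left tail of a Gaussian with mu = 0.10 and sigma = 0.05 (made-up values, in UI).
true_tail = NormalDist(mu=0.10, sigma=0.05)
times = [-0.10 + 0.01 * i for i in range(10)]
cdf_values = [true_tail.cdf(t) for t in times]

mu_hat, sigma_hat = fit_gaussian_tail(times, cdf_values)
# Extrapolate the fitted tail to the time offset at a probability of 1e-12.
t_at_target = mu_hat + sigma_hat * NormalDist().inv_cdf(1e-12)
print(mu_hat, sigma_hat, t_at_target)
```

With measured data the tail amplitude is generally not 1 and the fitting range has to be selected, which is where the optimization scheme described in the abstract comes in.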
Summary (Kurzfassung)
This thesis presents methods for the analysis of timing jitter, which are used primarily in digital serial interfaces. Such methods allow the bit error rate (BER) to be estimated and are therefore frequently used to describe signal quality and to demonstrate compliance with standards.
First, a method is developed that determines the Gaussian behavior at both ends of a measured jitter distribution and then extrapolates the distribution. The method is distinguished above all by its accuracy and efficiency. It is therefore particularly suited to the quantitative analysis of the quality of clock signals and phase-locked loops (PLLs). Owing to the simple algorithm, it can also readily be used together with test structures, diagnostic tools, or for real-time monitoring of such systems.
The underlying mathematical principle is based on the quantile normalization of Gaussian distributions, which has already been used in similar approaches. The thesis therefore also includes a detailed and extensive comparison of different methods that highlights their respective properties. A further section extends the presented method with a generalized principle by which it can easily be applied to arbitrary non-Gaussian distributions.
In a first case study, a very fast model of a PLL is created. This system already exists as a test structure and therefore allows a direct comparison of the simulation results with measurement data. The comparison comprises a noise analysis of the PLL as well as the simulation of the jitter transfer function and of the jitter tolerance behavior. The latter two properties are simulated with the aid of the new analysis method. A second case study realizes a BER tester together with a 3 Gb/s S-ATA interface on an FPGA board. It serves primarily to illustrate practical aspects of diagnosing and optimizing jitter.
Acknowledgments
I would like to thank everyone who supported me over the past three and a half years in carrying out this dissertation.
First and foremost, I thank my supervisor and mentor, Prof. Wolfgang Pribyl, who gave me great freedom in my research topic and strongly supported me in my interests and concerns. These conditions made the successful completion of the dissertation possible in the first place and contributed quite substantially to the quality of this work. I therefore regard the successful completion as a great gift, which will hopefully connect us for many years beyond this work.
Further great thanks are due to my second examiner, Prof. Ernst Stadlober, who in numerous discussions willingly introduced me to the aspects of extreme value theory that are essential to my topic. Starting from the modeling of electronic systems, my work was thus able to take an excursion through the world of statistics, which I personally find very enriching.
For technical support I thank above all the engineers of the ‘Analog Design Support’ team at Infineon Villach, who helped me create the model for high-speed serial interfaces and provided the measurement data of the test structure. Reinhard Steiner was always glad to provide information and made access to the high-performance computers possible for the entire duration of the dissertation.
Last but not least, I owe numerous useful hints and suggestions to my colleagues from the doctoral students' lab, above all Christoph Böhm, who time and again gladly, and sometimes at very short notice, improved my publications.
At this point I would also like to mention Siegfried Rainer and Jan Ranglack, for whose friendship I am very grateful. Thank you, William Robinson, for the English corrections.
Stefan Erb
Graz, May 2011
The real voyage of discovery consists not in seeking new landscapes
but in having new eyes.
Marcel Proust
There is nothing more practical than a good theory.
Kurt Lewin
The round thing must go into the square one.
Sepp Herberger
Contents
List of Figures
List of Tables
List of Abbreviations
Nomenclature
1. Introduction
1.1. Motivation and Problem Domain
1.2. Contributions
1.3. Thesis Overview
2. Fundamentals of Jitter, PLLs and BER Analysis
2.1. Jitter in High-Speed Serial Links
2.1.1. Phase Noise Definition
2.1.2. Clock Jitter Definition
2.1.3. Relation Between Phase Noise and Jitter
2.2. PLLs for Serial High-Speed Communications
2.3. Jitter and BER Analysis Methods
2.3.1. Histogram Based Analysis
2.3.2. Time-Domain Based Analysis
2.3.3. Frequency-Domain Based Analysis
3. Mathematical Background
3.1. Quantile Normalization
3.1.1. Quantile Function
3.1.2. Gaussian Quantile Normalization
3.2. Performance Analysis of Algorithms
3.2.1. Performance Metrics
3.2.2. Test Distributions
4. A Fast and Accurate Jitter Analysis Method
4.1. Scaled Q-Normalization
4.1.1. Optimization Procedure
4.1.2. Generalized Optimization Scheme
4.1.3. Residual Analysis
4.1.4. Implementation of the Fitting Algorithm
4.2. Performance Analysis
4.2.1. Influence of Sample Size
4.2.2. Influence of Test Distribution Shape
4.3. Performance Optimization with Different Fitness Measures
4.3.1. Performance Analysis of Different Fitness Measures
4.3.2. Improvement of Convergence Behavior
4.3.3. Performance Analysis with Optimized Parameters
4.4. Performance Optimization with Q-Domain Threshold
4.4.1. Minimum Q-Domain Threshold Definition
4.4.2. Parameter Optimization
4.4.3. Performance Analysis with Optimized Parameters
4.5. Summary
5. Hardware Design Aspects
5.1. Tail Parameters of Test Distributions
5.1.1. Relation Between Distribution Shape and At
5.1.2. Relation Between Distribution Shape and σt, µt
5.2. Minimum Sample Size
5.3. Minimum Time Resolution
5.4. Estimation Error Analysis
5.4.1. Empirical Error Analysis
5.4.2. Error Ripple Effect
5.4.3. Error Analysis with Modeled Process Variations
5.5. Design Examples
5.5.1. Example for Jitter Diagnostics
5.5.2. Example for Production Tests
5.6. Summary
6. Comparison of Gaussian Tail Fitting Methods Based on Q-Normalization
6.1. Implementation of Algorithms
6.2. Performance Optimization
6.3. Performance Analysis of Polynomial Fitting Methods
6.4. Comparison with Scaled Q-Normalization Method
6.5. Estimation Error Analysis of Conventional Q-Normalization
6.6. Summary
7. Jitter Analysis Method for Generalized Gaussian Tail Extrapolation
7.1. Introduction to Generalized Tail Fitting
7.1.1. Quantile Normalization Functions
7.1.2. Generalized Gaussian Distribution
7.2. Implementation of Algorithm
7.3. Performance Analysis
7.3.1. Software Model Simulations
7.3.2. Hardware Model Simulations
7.3.3. Comparison with Other Methods
7.4. Summary
8. An Accurate Behavioral Model for High-Speed PLLs
8.1. Modeling Principle
8.1.1. Basic Event-Driven Model
8.1.2. VCO Noise Model
8.1.3. Gain Regulator and BB-PD
8.1.4. Default Model Parameters
8.2. Jitter and Phase Noise Analysis
8.2.1. Closed Loop Phase Noise
8.2.2. Jitter Transfer Function
8.3. Summary
9. A Method for Fast Jitter Tolerance Analysis
9.1. Adaptive Algorithm for JTOL Analysis
9.1.1. Cost Function
9.1.2. Sample Size Adaptation
9.2. Application Example
9.2.1. JTOL Parameter Optimization
9.2.2. Simulation Results
9.3. Summary
10. An FPGA Based Diagnostic Tool for Jitter Measurement and Optimization
10.1. Measurement Principle and Implementation
10.1.1. Diagnostic Principle
10.1.2. Implementation
10.2. Jitter Measurements and Optimization
10.3. Analysis of Extrapolation Error
10.4. Summary
11. Conclusion
11.1. Results Summary
11.2. Outlook
A. Figure Data Files
Own Publications
Bibliography
List of Figures
1.1. Block scheme of a serial high-speed transceiver.
2.1. Jitter sources in a serial high-speed interface.
2.2. Typical receiver eye diagram affected by jitter.
2.3. Jitter components.
2.4. Inter-Symbol-Interference caused by a transmission channel.
2.5. Ideal and real oscillator spectrum.
2.6. Typical oscillator phase noise spectrum.
2.7. Definitions of absolute, period and accumulated jitter.
2.8. IO jitter measurement principle.
2.9. Basic block scheme of a CPLL.
2.10. Bathtub function example.
2.11. RJ and DJ components of a jitter PDF.
3.1. Example for an empirical distribution function and corresponding quantile plot.
3.2. Q-normalization principle demonstrated with a bathtub function.
3.3. Simple optimization scheme for Gaussian tail fitting based on Q-normalization.
3.4. Definition of interquartile range for the Gaussian distribution.
3.5. Test distribution types, constructed with parameters σRJ and ADJ.
4.1. Amplitude matching with adapted Q-normalization function.
4.2. Amplitude matching with scaled distribution.
4.3. Optimization scheme of scaled Q-normalization (sQN) method.
4.4. Measurement example to demonstrate the scaled Q-normalization principle.
4.5. Regression error σ̂err of the measurement example, depending on fitting length n and scaling factor k.
4.6. Generalized optimization scheme for tail fitting.
4.7. Scatter plot of residuals for multiple realizations of a normal distribution.
4.8. Flow graphs of implemented testbench and tail fitting algorithm.
4.9. Calculation time tc of QN and sQN algorithms depending on the number of bins R.
4.10. Influence of both σRJ and ADJ on median estimation error Emed.
4.11. Boxplot example for extrapolation error over varying sample size N.
4.12. Influence of varying sample size N on estimation error Emed.
4.13. Asymptotic linearity of Q-tails as fundamental cause for error bias.
4.14. Influence of jitter ratio σRJ/ADJ,uni and sample size N on error.
4.15. Influence of jitter ratio σRJ/ADJ and DJ type on error.
4.16. Flow graphs of σ̂err based and T̂ based algorithmic principles.
4.17. Estimation performance of different fitness measures.
4.18. Estimation loss indicator EL of different fitness measures.
4.19. Performance EL over varying search grid resolution ∆k={1.1, 1.2, 1.5, 1.8}.
4.20. Performance EL with secondary search refinement with ∆k={1.1, 1.2, 1.5, 1.8}.
4.21. Definitions of conservative fitting parameters.
4.22. ∆Tt versus Pt,min parameter surfaces to investigate Emed, IQR, EL and κ of three different optimization criteria.
4.23. ∆Pt versus Pt,min parameter surfaces to investigate Emed, IQR, EL and κ of three different optimization criteria.
4.24. ∆Pt versus Pt,min parameter surfaces to investigate EL of three different optimization criteria at a different test distribution shape.
4.25. Emed and EL of optimized ĉ1.2 criterion over varying σRJ/ADJ,uni and N.
4.26. Emed and EL of optimized ĉ1.2 criterion over varying σRJ/ADJ and DJ type.
4.27. Qmin threshold parameter definition.
4.28. Flow graphs of algorithms based on minimum Qth,min and constant Qth,c threshold in Q-domain.
4.29. Emed, EL and κ of four different algorithmic principles over varying σRJ/ADJ.
4.30. Emed, EL and κ of four different algorithmic principles over varying Qmin.
4.31. Emed, EL and κ of Qth,c and σ̂err algorithm over varying Qmin and DJ shape.
4.32. Emed and EL of Qth,c and Qmin=−1.2 over varying σRJ/ADJ,uni and N.
4.33. Emed and EL of Qth,c and Qmin=−1.2 over varying σRJ/ADJ,uni and DJ type.
5.1. BIJM based IO jitter measurement for PLLs.
5.2. Tail parameters σt, µt and At of fitted test distributions over varying σRJ/ADJ.
5.3. Comparison of fitted tail amplitudes with fast numerical approximation.
5.4. Normalized standard deviation and jitter ratio of fitted test distributions over varying shape σRJ/ADJ, At≡1.
5.5. Right bathtub curve with fitted Gaussian tail.
5.6. Inverse quantile function Q^−1 applied to Qmin.
5.7. Definition of maximum slope smax.
5.8. Smallest analyzable RJ component σt,min, verified using empirical relation (5.20).
5.9. ∆Pt selection chart for identifying σt,min·R, as described by equation (5.20).
5.10. Smallest analyzable RJ component σt,min, verified using empirical relation (5.22).
5.11. ∆Pt and Qmin selection chart for identifying the normalized variable σRJ,min·R, as described by equation (5.23).
5.12. Estimation loss EL for different values of σRJ, ADJ, and R.
5.13. Emed, IQR and EL over varying R and σRJ,min.
5.14. Surfaces for the empirical analysis of Emed and IQR.
5.15. Error ripple effect: simulated Emed surface and expected “error valleys” according to equation (5.35).
5.16. DNL error model to describe the effect of process variations.
5.17. Design example: EL of sQN method over number of counters C.
6.1. Optimization scheme for Q-normalization combined with polynomial regression.
6.2. Flow graph for σ̂err based polynomial fitting.
6.3. 1st order pol. reg. (QN): Emed, IQR, EL, ξ and κ over varying ∆Pt and Pt,min.
6.4. 2nd order pol. reg. (QP2): Emed, IQR, EL, ξ and κ over varying ∆Pt and Pt,min.
6.5. 3rd order pol. reg. (QP3): Emed, IQR, EL, ξ and κ over varying ∆Pt and Pt,min.
6.6. 4th order pol. reg. (QP4): Emed, IQR, EL, ξ and κ over varying ∆Pt and Pt,min.
6.7. Sinusoidal type DJ: Emed and EL of QN, QP2, QP3, QP4 fitting methods.
6.8. Uniform type DJ: Emed and EL of QN, QP2, QP3, QP4 fitting methods.
6.9. Triangular type DJ: Emed and EL of QN, QP2, QP3, QP4 fitting methods.
6.10. Quadratic curve type DJ: Emed and EL of QN, QP2, QP3, QP4 fitting methods.
6.11. Comparison of sQN, QN, QP2, QP3, QP4 methods at constant N=10^4.
6.16. EL of sQN, QN, QP2, QP3, QP4 methods over varying number of bins R.
6.17. Design example: EL of sQN and QN methods over number of counters C.
7.1. Eye diagram with timing jitter and amplitude noise PDFs.
7.2. Generalized optimization scheme.
7.3. Special GGD shapes.
7.4. GGD random generator.
7.5. T̂ surface for two-dimensional minimum search with initial search grid.
7.6. Emed, EL and Emed,α over varying test distribution shape.
7.7. Emed, EL and Emed,α over varying sample size N.
7.8. Emed, EL and Emed,α over varying number of bins R.
7.9. Performance comparison using EL over varying jitter ratio σRJ/ADJ.
8.1. Functional block scheme of the CPLL.
8.2. Leeson noise generator.
8.3. Loop filter voltage behavior depending on input current Icp.
8.4. Measured and simulated phase noise PSD over different parameter settings.
8.5. RMS values of accumulated σacc(m), absolute σabs and long term jitter σlt.
8.6. Phase noise PSD mismatch with RJ=0.3 UI and default parameters (table 8.1).
8.7. T(fSJ) over varying jitter amplitude ASJ.
8.8. T(fSJ) over varying loop filter parameters.
8.9. T(fSJ) over varying test pattern.
9.1. JTOL measurement scheme using TIA or BERT.
9.2. Flow graph of JTOL analysis algorithm.
9.3. Worst case error of sQN and QN methods for sinusoidal DJ over varying N.
9.4. Examples for the adaptive JTOL algorithm converging toward the unknown jitter amplitude ASJ.
9.5. Probability for successful convergence of JTOL algorithm.
9.6. Average number of iterations If and total sample size N over varying R.
9.7. Simulated and measured JTOL curves at different model parameter configurations.
10.1. Basic principle for jitter diagnostics.
10.2. Block scheme for the FPGA based 3 Gb/s jitter measurement system.
10.3. Realization of the BERT logic.
10.4. Flow graph of BER measurement and analysis.
10.5. Example for a measured jitter distribution and sQN fitted tails.
10.6. K=100 evaluations of TJpp over varying sample size N, and RG-58 cable length.
10.7. Estimated positive jitter at internal loopback mode, using QN as fitting method.
10.8. TJpp surfaces for Tx buffer optimization.
10.9. TJpp surfaces for Tx buffer optimization, lock-to-data mode.
10.10. TJpp optimization of Rx structures, including four different EQ settings and a single DFE tap weight.
10.11. Estimated TJpp values of a 1 m S-ATA cable over varying N, Pat=0101.
10.12. Estimated TJpp values of a 1 m S-ATA cable over varying N, Pat=08CEFhex.
10.13. Estimated TJpp of a 5 m RG-58 coaxial cable over varying N.
10.14. Estimated TJpp of a 1 m S-ATA cable with Rx-PLL in lock-to-data mode.
10.15. TJpp medians of fitted tails, exact measurements and expected worst case error over varying cable length.
List of Tables
3.1. Multiplicative constant to specify a target BER for TJpp values.
3.2. Definitions for time domain random processes.
4.1. Default algorithm configuration and important key parameters.
4.2. List of investigated fitness measures.
5.1. Coefficients for equations (5.3) and (5.4).
5.2. Coefficients for equations (5.5b) and (5.5c).
5.3. Selected worst case shape values σRJ,min/ADJ and corresponding tail amplitudes At,min for different DJ types.
5.4. Emed, IQR and EL coefficients for equation (5.30), ĉ1.2 based algorithm.
5.5. Emed, IQR and EL coefficients for equation (5.30), Qth,c based algorithm.
5.6. Coefficients for Emed, IQR and EL with included DNL error, equation (5.38).
6.1. Default parameter configuration of ∆Pt for polynomial tail fitting methods.
6.2. Emed, IQR and EL coefficients for QN method, equation (6.6).
6.3. Emed, IQR and EL coefficients for QN method with included DNL error, equation (5.38).
7.1. Quantile normalization functions for different tail distributions.
8.1. Default model parameter settings.
8.2. RMS jitter values of different parameter configurations.
9.1. Polynomial regression coefficients for σe.
9.2. Default JTOL algorithm settings.
9.3. JTOL analysis results for R=Rsim=3.3·10^5.
9.4. JTOL analysis results for R=32 and R=512.
A.1. List of MATLAB files to generate simulation figures.
A.2. List of MATLAB files to generate tables of coefficients.
A.3. List of System-C testbenches for simulations and MATLAB post-processing files.
List of Abbreviations
BB-PD Bang-Bang Phase Detector
BER Bit Error Rate
BERT Bit Error Rate Tester
BIJM Built-In Jitter Measurement
BUJ Bounded Uncorrelated Jitter
CDF Cumulative Distribution Function
CDR Clock and Data Recovery
CP Charge-Pump
CPLL Charge-Pump Phase-Locked Loop
DDJ Data Dependent Jitter
DFE Decision Feedback Equalizer
DJ Deterministic Jitter
DNL Differential Nonlinearity
EQ Equalizer
FPGA Field Programmable Gate Array
GGD Generalized Gaussian Distribution
Gb/s Gigabit per second
INL Integral Nonlinearity
IQR Interquartile Range
ISI Inter-Symbol Interference
JTOL Jitter Tolerance
LF Loop Filter
MP Microprocessor
PD Phase Detector
PDF Probability Density Function
PLL Phase-Locked Loop
PRBS Pseudo Random Binary Sequence
PSD Power Spectral Density
Q Quantile
QN Quantile Normalization
RJ Random Jitter
RMS Root Mean Square
RX Receiver
S-ATA Serial Advanced Technology Attachment (an interface standard)
SJ Sinusoidal/Periodic Jitter
sQN Scaled Quantile Normalization
SUT System Under Test
TIA Time Interval Analyzer
TJ Total Jitter
TX Transmitter
UI Unit Interval
VCO Voltage Controlled Oscillator
Nomenclature
α Shape parameter of generalized Gaussian distributions, see figure 7.3
αRJ Shape parameter of generalized Gaussian random jitter source, see figure 7.6
∆k Distance factor for logarithmic search grid, see figure 4.8
∆Pt Minimum probability interval for tail fitting, see figure 4.21
∆Tt Minimum time interval for tail fitting, see figure 4.21
conf Confidence threshold for convergence of jitter tolerance algorithm, see figure 9.2
σ̂err Standard error of linear regression, see equation (4.5)
ĉ1.2 Algorithmic scenario for tail fitting based on ∆Pt parameter, see section 4.3.2
n̂ Fitness measure based on regression length, see equation (4.14)
n̂T Fitness measure based on regression length of T̂ , see equation (4.16)
T̂ Fitness measure based on regression error and slope, see equation (4.15)
κ Kurtosis
IQR Interquartile range of extrapolation error or error spread, see equation (3.17)
BERspec Target bit error rate level, specification requirement
DJpp Deterministic jitter peak-to-peak value after tail fitting, see equation (3.13)
RJrms Standard deviation of Gaussian tails after tail fitting, see equation (3.13)
µL,R Mean value of left/right fitted Gaussian model, see figure 2.11
ν Learning rate parameter for adaptive recursion, see equation (9.1)
σe Standard deviation of extrapolation error, see equation (3.16)
σabs/per/lt/max RMS value of absolute/period/long term/maximum jitter, see equation (8.15)
σacc RMS value of accumulated jitter, see equation (8.14)
σDNL Standard deviation to model differential nonlinearity error, see section 5.4.3
σL,R Standard deviation of left/right fitted Gaussian model, see figure 2.11
σRJ Standard deviation of random jitter component of test distributions, see figure 3.5
TJpp Total jitter peak-to-peak value or timing budget, see equation (3.12)
ξ Skewness
a Statistical confidence level, see figure 3.4
ADJ Amplitude of deterministic jitter component of test distributions, see figure 3.5
AL,R Amplitude of left/right fitted Gaussian model, see figure 2.11
ASJ Sinusoidal jitter amplitude
EL Estimation loss, see equation (3.18)
Em Mean value of extrapolation error, see equation (3.16)
Emed,α Median value of tail shape error, see equation (7.10)
Emed Median value of extrapolation error, or error bias, see equation (3.17)
fSJ Sinusoidal jitter frequency
Imax Maximum number of iterations of jitter tolerance algorithm
K Number of evaluations
k Amplitude scaling factor, see figure 4.3
N Number of jitter samples in a distribution, sample size
n Fitted tail length, see equation (4.5)
Nmin/max Minimum/Maximum sample size of jitter tolerance algorithm, see figure 9.2
o Offset of linear function
p Probability
Pt,min Minimum probability threshold for tail fitting, see figure 4.21
q Quantile
Qmin Minimum Q-domain threshold for tail fitting, see figure 4.27
Qth,c Algorithmic scenario for tail fitting based on Qmin threshold, see section 4.4.2
R Number of bins per unit interval
s Slope of linear function
T Bit period, if normalized to the unit interval T =1 UI
T (fSJ ) Jitter transfer function
teye Eye opening
tL,R Left/Right timing budget of jitter, see figure 2.10
x Timing jitter amplitude
1. Introduction
This chapter gives a brief overview of jitter and bit error rate (BER) analysis in high-speed
communications. The problem domain is presented, followed by an outline of the overall document
and a list of scientific contributions which have been elaborated throughout the work.
1.1. Motivation and Problem Domain

With the increasing demand for higher clock frequencies for synchronization and data transmission,
timing uncertainty, or jitter, has become a major limiting factor for today’s high-speed
communications. This is especially true for serial data transmission which utilizes phase locked loop
(PLL) based circuits for synchronization. In this case, limited jitter tolerance and inherent phase
noise can lead to erroneous data recovery, and hence create the need for an accurate quantification
of timing jitter and associated effects.
Figure 1.1 shows the typical block diagram of a serial high-speed interface, composed of the
three main blocks, transmitter, channel and receiver [49,114]. The transmit (TX) buffer is triggered
by a high-speed PLL, running at the desired transmission rate of several Gigabit per second (Gb/s).
The channel or transmission path is the major contributor to signal degradation. It limits data
throughput by introducing noise and signal distortions, both leading to timing jitter. The receiver
(RX) also consists of a high-speed PLL, realized as a clock and data recovery (CDR) unit to
synchronize with the received bit stream. The decision block can be a simple data latch using the
recovered clock and the analog input signal.
FIGURE 1.1.: Block scheme of a serial high-speed transceiver. Blocks: transmitter (TX data, TX-PLL), channel, receiver (RX-PLL, RX data).
Serial high-speed interfaces usually transmit data without a dedicated clock signal. A special
encoding guarantees sufficient bit transitions inside the data stream in order to correctly
synchronize the receiver clock with the input data. The RX-PLL as a synchronization circuit has to
cope with input jitter and provide a certain robustness against timing variations. If jitter exceeds a
critical amplitude, the PLL will not be able to track timing variations correctly, which leads to
erroneous signal recovery or misinterpretation of the received data.
Serial interface standards, such as S-ATA [118], specify stringent requirements on the error
probability, or bit error rate (BER) of a recovered data stream. The BER directly reflects the influ-
ence of timing jitter on system performance, and is thus the best suited figure of merit to indicate
the quality of a digital communication interface. In practice, one seeks a relation between
BER and measured jitter in order to investigate the influence of jitter thoroughly and to identify
possible root causes.
Since jitter is basically a random process, its analysis involves the use of statistical analysis
methods. These are applied to a set of collected jitter samples and try to accurately determine
the impact of timing jitter on the investigated system. In this thesis such jitter analysis methods
are developed, analyzed, compared and applied to practical simulations and measurement cases.
The subsequent section describes the key contributions and results which have been achieved
throughout this work. It is followed by a brief overview of the overall document.
1.2. Contributions
Several key contributions and results of this work extend the state of knowledge, and are summa-
rized below. A more detailed description of the key results can also be found in the conclusions
section at the end of the thesis, together with a list of own publications [C1-C9].
First, a novel method (here denoted as scaled Q-normalization, sQN) for jitter and BER anal-
ysis is developed. It is based on the Gaussian quantile normalization principle, where the three
parameters amplitude, mean and standard deviation of Gaussian model functions are identified
and fitted into the tails of a jitter distribution. This allows them to be extrapolated down to any
desired probability level. The method is realized with a flexible and efficient optimization scheme,
and allows for fast tail fitting combined with accurate extrapolation results. Its performance is
analyzed with respect to the extrapolation error, which is shown to highly depend on the sample
size and the shape of test distributions. From the basic concept, two algorithmic approaches with
conservative fitting parameters are derived and optimized, in order to improve the error behavior
with respect to accuracy and outlier suppression. For typical test distributions (uniform combined
with Gaussian) and a number of jitter samples N = 10^6 (tail extrapolations range over six orders
of magnitude, down to the 10^-12 level), the estimated jitter budget has an error bias <2% and an
overall error <3% in more than 97.5% of the cases. The method is partly published in [C1,C8,C9].
Another contribution of this thesis focuses on hardware design aspects, to utilize the proposed
sQN method together with real jitter measurement systems. Their limited precision causes quan-
tization effects and introduces additional extrapolation error which must be considered and dealt
with. First, requirements with respect to minimum tail amplitude and time resolution of measured
distributions are investigated. Corresponding equations are derived to guarantee these require-
ments. Then, the combined influence of limited sample size and time resolution on the accuracy
of the sQN method is described in terms of empirical relations that quantify error bias, statistical
spread and the combination of both. These relations aid a system designer in finding an optimum
performance trade-off between desired accuracy of analysis and hardware expense. Finally, also
the effect of process variations in measurement systems is added to these empirical relations, in
order to quantify their influence. Two typical design examples act as design guidelines to real-
ize hardware jitter measurements with a certain target accuracy, when using the proposed sQN
method. The described contributions are also published in [C4,C8].
Further, a comprehensive performance comparison of different jitter analysis methods based on
the quantile normalization principle is provided. For this purpose, the sQN method is compared with
the known conventional quantile normalization (QN) method, as well as higher order polynomial
methods (QP2, QP3, QP4). The idea is to provide a detailed performance reference for comparison
with future jitter analysis methods. As a fundamental result, in simulators the QN method
exhibits the same beneficial property of a strictly positive error bias as the sQN method. This is
a clear advantage compared to higher order polynomial methods (QP2, QP3, QP4), which achieve
acceptable accuracy only for certain test distributions. A comprehensive comparison with the sQN
method is carried out, and clearly shows that sQN achieves the best performance. However, this is
at the cost of a larger computational demand. Although the QN method is less accurate, it offers a
significant speed-up (≈ 35 times compared to sQN) for tail fitting. This complementary property
demands an additional error analysis with QN when used together with hardware measurements.
This allows a system designer to choose the better suited algorithmic alternative. Equivalent
to the sQN method, an error analysis is thus carried out for the QN method, which quantifies
its extrapolation error in terms of the two key parameters sample size and time resolution, as
required for hardware jitter measurements. In addition, the previously described design examples
for hardware measurements are extended for use with the conventional QN method. For this purpose,
the design equations for the sQN method can simply be reused. Obtained results highlight that
also for hardware measurements the QN method is generally less accurate than sQN, but is also
less affected by differential non-linearity error, as caused by process variations. These results are
partly published in [C3,C8].
A dedicated part of the thesis also focuses on the generalization of the sQN principle for use with
arbitrary non-Gaussian tails. Such tails may for example appear with amplitude distributions of
high-speed optical links. As a function class, the generalized Gaussian distribution (GGD) is used
for tail fitting. The presented principle is fully consistent with the existing jitter decomposition
model and thus, forms a logical extension. In simulators, it is able to identify the exponential
tail characteristic of a distribution and to extrapolate tails with acceptable accuracy. However, the
estimated tail characteristic suffers from large errors if tails decay very fast or in hardware systems
with very limited time resolution. Further, with its large computational demand the primary use
cases are simulations. The generalized principle is also presented in [C5].
Several case studies describe typical application fields for the proposed sQN method. First, a
fast behavioral model for charge-pump PLLs is implemented, which is based on an exact solution
for the 2nd order loop filter, and includes a parasitic gain regulator pole as well as an oscillator
noise model [C2]. It is able to simulate approximately 10^6 bit periods within one minute on
an Intel 2.2GHz laptop and thus, allows for in-depth system exploration as well as statistical
jitter analysis. Jitter transfer functions and phase noise spectra of the modeled PLL are compared
with measurements from an existing hardware structure and show excellent agreement. The sQN
method is here used to simulate and verify the measured jitter transfer functions. For this purpose,
it extracts the deterministic jitter component from collected distributions. For additional comparison,
also a spectral analysis method is used.
As a second application, the sQN method is used for identifying jitter tolerance curves of the
previous high-speed PLL model. For this purpose, external jitter is injected into the PLL and adjusted
until a desired error probability is obtained. In order to solve this inverse problem, an adaptive
search algorithm is developed, which greatly reduces the number of required jitter samples [C6].
Comparative results show that the included sample size adaptation makes the recursive search 2-3
times faster. Results also highlight that the proposed algorithm can be used for both simulations
and hardware measurements.
In a final case study, a practical jitter measurement and analysis system for the diagnosis and
optimization of transmission lines, PLLs and transceiver structures is developed [C7]. The target
architecture is a high-speed FPGA, and as an example, various jitter measurements are carried out
with RG-58 coaxial cables as well as a standard 1m S-ATA cable. Optimizations are performed
with the FPGA internal transceiver settings and equalizer structures, which allow the total
jitter of a 5 m test cable to be reduced significantly (by up to 28%). In concluding analyses, also the extrapolation
error of sQN and QN fitting methods is investigated and compared against theoretical worst case
errors from previous numerical analyses. In this context the different DJ shapes are experimentally
confirmed to be well suited for estimating the error of pure PLL jitter, ISI, and ISI plus additional
noise affected channels.
1.3. Thesis Overview

An introduction to jitter analysis in high-speed serial links is given in the subsequent chapter 2.
Fundamentals of jitter, noise and BER analysis in serial interfaces are described, together with
an overview of the state of the art in the field. Further, high-speed PLLs are briefly discussed as
fundamental circuits for data transmission and recovery, and a thorough survey of jitter analysis
methods is included.
Chapter 3 provides the mathematical basics involved with Gaussian quantile normalization,
which forms the underlying key principle for jitter analysis methods investigated throughout this
thesis. Measures for statistical performance evaluation and a set of suitable test distributions are
defined as well.
In chapter 4 the scaled Q-normalization method is realized as an accurate and efficient method
for jitter analysis. Subsequent evaluations demonstrate excellent performance, which is further
improved by optimizing the algorithmic behavior with conservative fitting parameters. From the
fundamental optimization scheme, two algorithmic principles are derived which achieve similar
performance in a simulator environment.
In chapter 5 hardware design aspects are investigated, in order to utilize the developed method
together with jitter measurement devices or built-in test structures. Since these hardware systems
suffer from limited precision and additional error effects, a design trade-off between hardware
expense, measurement accuracy and analysis speed is required. In this context, several design
equations and empirical relations are derived to characterize the key parameters for a hardware
measurement system.
The performance of different jitter analysis methods is compared in chapter 6. First, a unifying
optimization scheme is derived to highlight the conceptual difference between the proposed
method from chapter 4 and others which are also based on the Gaussian quantile normalization. A
thorough performance analysis and comparison of each method is then carried out.
Chapter 7 generalizes the quantile normalization principle for use with arbitrary non-Gaussian
tails. This allows the proposed method to be applied to generic analysis scenarios where the
Gaussian tail assumption does not hold anymore, as can be the case with amplitude noise in optical
fiber interconnects.
In chapter 8 a fast system level model of a serial high-speed PLL is developed, which has
been realized as test structure for a 3Gb/s S-ATA interface. The event-driven model represents an
enhanced version of a prior approach. It serves as a case study where the jitter analysis method
from chapter 4 affords derivation and comparison of various jitter transfer functions.
A fast method for identifying the jitter tolerance curve of high-speed PLLs is introduced in
chapter 9. A recursive algorithm determines the tolerance behavior, and adaptively adjusts the
sample size of collected jitter distributions to minimize the required test time. The method is
applied to the PLL model from the previous chapter and simulation results are compared against
measured tolerance curves. Such simulations are particularly useful for indicating the robustness
of a PLL against input jitter.
Chapter 10 describes a diagnostic tool for jitter analysis, which is implemented on an FPGA. It
estimates the total jitter caused by a system under test and thus, can be used for testing the quality
of transmission channels and for optimizing the parameter configuration of interface structures.
Further it experimentally confirms the error behavior of the developed fitting method.
Finally, chapter 11 summarizes the contents of this thesis and concludes with a brief outlook to
future research directions.
2. Fundamentals of Jitter, PLLs and BER Analysis
In this chapter an introduction to the basics of jitter and PLLs in high-speed serial links is given.
After a brief overview of the sources of timing jitter in communication systems, different types
and definitions for clock jitter and phase noise are discussed. The fundamental principle of PLLs
for high-speed data transmission is explained, followed by a comprehensive overview of the state-
of-the-art of jitter analysis methods. The main focus is on analysis techniques that relate jitter with
the bit error rate (BER), as required for testing high-speed serial links.
2.1. Jitter in High-Speed Serial Links

Dealing with jitter plays a crucial role in the design of digital high-speed interfaces. Since timing
uncertainty is the major cause for erroneous data recovery, a robust receiver architecture is one of
the most challenging design criteria. Serial high-speed standards impose heavy requirements on
the allowed bit error rate (BER), and specify typical target BERs of 10^-12 or even less [43, 118].
Figure 2.1 shows different jitter sources in a serial interface which contribute to the overall
accumulated jitter at the receiver. A certain amount of jitter is already introduced by the non-ideal
clock synthesizer inside the transmitter structure. Depending on the quality of the channel,
inter symbol interference (ISI) as well as reflections and crosstalk may strongly degrade the signal
integrity along the transmission path. Finally also at the receiver side, a non-ideal equalizer (EQ)
and PLL inherent phase noise of the clock and data recovery (CDR) unit will additionally provide
their own timing jitter [3, 6, 8, 11, 15, 45, 53, 82, 83, 127].
FIGURE 2.1.: Jitter sources in a serial high-speed interface [5]. Blocks: TX (clean data, jittery TX clock), channel (ISI, reflection, crosstalk), RX (EQ, CDR), with accumulated jitter at the receiver.
A common way to highlight the problem of signal recovery and presence of jitter is the eye
diagram, as depicted in figure 2.2. It shows the received analog data signal folded in time, with
the bit period referred to as unit interval (UI). The untreated received eye is often almost closed
and has to be reopened with equalization techniques or signaling schemes that try to compensate
the channel influence. Timing jitter especially degrades system performance when causing a large
horizontal eye closure, since signal transitions spread over the entire bit interval and thus, impede
the recovery circuit to synchronize with the data. Inside the plotted waveform one may thus define
an eye mask [43, 47] which must not be violated in order to meet signal quality requirements.
FIGURE 2.2.: Typical receiver eye diagram affected by jitter.
Considering the timing budget or horizontal eye closure at the optimum decision threshold in
figure 2.2, simple probability distributions may be used for jitter analysis. Timing jitter is then
described as a statistical signal in terms of different components [82,104]. As shown in the scheme
in figure 2.3, an observed total jitter (TJ) distribution can basically be decomposed into a bounded
deterministic (DJ) and an unbounded random (RJ) part. Both components relate to independent
time-domain random processes and thus, appear as convolved in histogram domain.
FIGURE 2.3.: Jitter components according to [82, 104]. Total jitter (TJ) splits into deterministic jitter (DJ) and random jitter (RJ); DJ splits into periodic (SJ), data dependent (DDJ) and bounded uncorrelated (BUJ) jitter; DDJ splits into duty-cycle distortion (DCD) and inter-symbol interference (ISI).
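The statement that independent DJ and RJ processes appear convolved in the histogram domain can be illustrated with a small Monte Carlo sketch: adding two independent random variables yields a distribution that is the convolution of the two individual densities. The example assumes a uniform DJ shape and Gaussian RJ (the kind of test distribution mentioned in section 1.2); the function name, the seed and all numeric values are hypothetical choices for illustration only.

```python
import random
import statistics

def total_jitter_samples(n, a_dj, sigma_rj, seed=1):
    """Draw n total jitter samples as the sum of independent DJ and RJ parts.

    a_dj     : peak-to-peak amplitude of the uniform (bounded) DJ component, in UI
    sigma_rj : standard deviation of the Gaussian (unbounded) RJ component, in UI
    """
    rng = random.Random(seed)
    return [rng.uniform(-a_dj / 2, a_dj / 2) + rng.gauss(0.0, sigma_rj)
            for _ in range(n)]

if __name__ == "__main__":
    a_dj, sigma_rj = 0.2, 0.02          # assumed example values
    tj = total_jitter_samples(100_000, a_dj, sigma_rj)
    # For independent components the variances add: A^2/12 (uniform) + sigma^2.
    print("measured variance:", statistics.pvariance(tj))
    print("expected variance:", a_dj ** 2 / 12 + sigma_rj ** 2)
    # The Gaussian part extends the tails beyond the DJ bounds of +/- a_dj/2.
    print("largest sample   :", max(tj))
```

The bounded DJ part determines the width of the histogram body, while the RJ part is visible only as the tails that extend beyond ±a_dj/2, which is the region that tail fitting methods operate on.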
Random jitter is usually considered as Gaussian, but can basically follow any unbounded prob-
ability behavior. It is observed at both distribution tails, extending them toward infinity. Bounded
deterministic jitter can be of arbitrary shape, and is expressed in terms of various subcomponents
in order to investigate and distinguish various root causes. DJ is further subdivided into sinusoidal
or periodic (SJ), bounded uncorrelated (BUJ), and data-dependent jitter (DDJ). Use of generated
SJ is especially important for jitter tolerance testing [43] and for the measurement of jitter transfer
functions [46, 121] in PLLs. Sometimes SJ also appears as an effect of parasitic spurs or power
supply noise. BUJ is mainly caused by couplings, such as crosstalk from adjacent transmission
lines, digital switching logic or ground bounce effects. BUJ is always considered bounded because
of the limited coupling strength. Exact models are difficult to derive for this component since both
coupling signals and mechanisms are highly variable. Finally, DDJ is a jitter component which
can be related to the transmitted data pattern. Duty-cycle distortion (DCD) is caused by a difference
in the pulse width between logical high and low levels and arises from voltage offsets or
different rise and fall times at signal transitions. Inter symbol interference (ISI) appears when
the channel impulse response extends over several bit periods. As shown in figure 2.4, a single
transmitted pulse is spread in time along the channel, and thus overlaps and influences adjacent
symbols. This causes a significant error in timing recovery. Fortunately, DDJ influences can be
fully compensated with an equalizer (EQ) if the channel is characterized by its impulse response,
or with an adaptive decision feedback equalizer (DFE) if it is unknown [5]. The use of EQs is
restricted to compensating DDJ; other jitter components which propagate through the transmission
path are then dealt with by the receiver PLL.
FIGURE 2.4.: Inter-Symbol-Interference caused by a transmission channel [5]. Transmitted and received signal over the time axis 0, 1T, 2T, 3T, 4T.
In order to provide a fundamental understanding of the underlying research field, first the math-
ematical description of timing jitter and phase noise will be reviewed. In this context, jitter is seen
as time domain representation of phase noise, which describes the spectral purity of an oscillator.
Further, as a random process, phase noise has to be described in terms of statistical measures such
as variance and power spectral density (PSD). The following sections therefore derive mathematical
relations for these measures. These definitions and derivations are very common in
PLL literature and can for example be found in [30, 36, 42, 121].
2.1.1. Phase Noise Definition

We start the mathematical analysis of phase noise with the output signal v(t) of a non-ideal oscil-
lator, which is influenced by the phase noise φ(t) as time-domain random process [30, 42].
v(t) = A cos(ω0 t + φ(t)) (2.1)
This phase modulated signal can be decomposed in terms of Bessel functions. If the phase
variation of the noise signal is small compared to the reference period of the ideal oscillator,
|φ(t)| ≪ 1 rad, equation (2.1) can be rewritten as:
v(t) ≈ A cos(ω0 t) − Aφ(t) sin(ω0 t) (2.2)
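The step from equation (2.1) to equation (2.2) can be made explicit with the angle-addition identity and a first-order expansion in φ(t). The following is a short sketch of this standard small-angle argument, not of the Bessel-function expansion mentioned above.

```latex
\begin{align*}
v(t) &= A\cos\bigl(\omega_0 t + \varphi(t)\bigr) \\
     &= A\cos(\omega_0 t)\cos\varphi(t) - A\sin(\omega_0 t)\sin\varphi(t) \\
     &\approx A\cos(\omega_0 t) - A\,\varphi(t)\sin(\omega_0 t),
\end{align*}
% using \cos\varphi(t) \approx 1 and \sin\varphi(t) \approx \varphi(t)
% for |\varphi(t)| \ll 1 rad.
```

The second term is the carrier in quadrature, amplitude modulated by φ(t), which is why the spectrum of φ(t) appears frequency translated to ±ω0.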
The output spectrum of this signal consists of two Dirac impulses located at the carrier frequency
ω=±ω0 together with the frequency translated spectrum of φ(t). If φ(t) is considered as a station-
ary random process, its phase noise spectrum Sφ (∆ω) can be calculated by the Fourier transform
of the auto-correlation function. Here the variable ∆ω is used to denote the frequency offset from
the carrier ω0 , as shown in figure 2.5, and Sφ (∆ω) is referred to as double-sideband PSD, which
contains the spectral power of both sidebands of the oscillator spectrum. The phase noise is often
quantified in terms of a single-sided spectral noise density L{∆ω}. This is the noise density
measured at a frequency offset ∆ω from the carrier and is therefore one-half of Sφ (∆ω). L{∆ω}
additionally refers to the carrier power and is given in units of dBc/Hz:

$$\mathcal{L}\{\Delta\omega\} = \frac{S_\varphi(\Delta\omega)}{2} = 10\log\left[\frac{\text{noise power in 1 Hz BW at } \omega_0 + \Delta\omega}{\text{carrier power}}\right]\ [\mathrm{dBc/Hz}] \tag{2.3}$$

For a detailed description of the phase noise PSD also refer to [42, 78] or [30, chapter 11].
FIGURE 2.5.: Ideal and real oscillator spectrum [42]. Panels: ideal oscillator and practical oscillator, each showing Sv (ω) versus ω around the carrier ω0; the practical spectrum marks a 1-Hz bandwidth at the offset ∆ω.
A typical phase noise spectrum for a voltage controlled oscillator (VCO) in high-speed PLLs
is given in figure 2.6. This spectrum is also known as Leeson process and consists of three distinct
noise regions. The 1/ω^3 and 1/ω^2 terms represent flicker and thermal noise of electronic
components inside the VCO. These noise regions are integrated due to the frequency-to-phase
conversion, which corresponds to a multiplication by 1/ω^2 in spectral domain. The 1/ω^0 phase
noise floor is caused by external components and is not affected by the integration process.
The VCO noise model is usually specified using four parameters: the flicker noise corner frequency
f_fl, a measured phase amplitude A_1 with corresponding frequency f_1 located in the 1/ω^2
region, and the noise floor amplitude A_PhN. This phase noise model can also be realized in time
domain [122].
FIGURE 2.6.: Typical oscillator phase noise spectrum [42]. Axes: SΦ (∆ω) versus ∆ω; regions ∼1/ω^3 (30 dB/Dec), ∼1/ω^2 (20 dB/Dec) and ∼1/ω^0 (0 dB/Dec), with markers f_fl, f_1, A_1 and A_PhN.
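As a rough illustration of the three noise regions, the following Python sketch evaluates an additive phase noise model built from the four parameters named above. The functional form (a 1/f^2 term anchored at the point (f1, A1), a 1/f^3 term that takes over below the flicker corner, and a constant floor) as well as all numeric values are assumptions chosen for illustration; they are not taken from the thesis or from [122].

```python
import math

def phase_noise_psd(f, f_fl, f1, a1, a_floor):
    """Hypothetical three-region phase noise PSD in linear units (rad^2/Hz).

    f       : offset frequency from the carrier in Hz (f > 0)
    f_fl    : flicker noise corner frequency in Hz
    f1, a1  : reference point (frequency, PSD value) in the 1/f^2 region
    a_floor : flat noise floor
    """
    thermal = a1 * (f1 / f) ** 2        # 1/f^2 region, falls with 20 dB/decade
    flicker = thermal * (f_fl / f)      # 1/f^3 region, falls with 30 dB/decade
    return thermal + flicker + a_floor  # the floor dominates at large offsets

def to_dbc_per_hz(s_phi):
    """Single-sideband noise density following equation (2.3): 10*log10(S_phi / 2)."""
    return 10.0 * math.log10(s_phi / 2.0)

if __name__ == "__main__":
    # Assumed example parameters, not measured values.
    params = dict(f_fl=1e4, f1=1e6, a1=1e-11, a_floor=1e-15)
    for f in (1e2, 1e3, 1e5, 1e6, 1e9):
        s = phase_noise_psd(f, **params)
        print(f"{f:8.0e} Hz  {to_dbc_per_hz(s):8.1f} dBc/Hz")
```

Summing the three terms gives smooth transitions between the regions; a strictly piecewise definition would reproduce the asymptotes of figure 2.6 equally well.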
2.1.2. Clock Jitter Definition

Clock jitter is considered as the phase noise behavior in time domain, which causes the timing
displacement of a digital clock signal. This means, as a random process it can only be observed at
the edges of a clock signal. According to figure 2.7 we distinguish between various definitions of
clock jitter [15, 22, 23, 42, 74, 119, 142].
Absolute Jitter jabs,k is the time difference of the k-th clock edge measured between an ideal
(tid,k ) and a non-ideal (tk ) clock signal:
jabs,k = tk − tid,k (2.4)
Period Jitter jper,k is defined as the time variation of the clock period. It is the time difference
between k-th clock period Tk and the ideal period T :
$$
\begin{aligned}
j_{per,k} &= T_k - T \\
          &= (t_{id,k} - j_{abs,k}) - (t_{id,k-1} - j_{abs,k-1}) - T \\
          &= j_{abs,k-1} - j_{abs,k} + t_{id,k} - t_{id,k-1} - T \\
\Rightarrow\ j_{per,k} &= j_{abs,k-1} - j_{abs,k}
\end{aligned}
\tag{2.5}
$$
Accumulated Jitter $j_{acc,k}^{(m)}$ is defined similarly to period jitter, except that the time displacement
of a non-ideal clock is measured m periods after the reference clock edge. According to this
definition we have $j_{per,k} = j_{acc,k}^{(1)}$.

$$ j_{acc,k}^{(m)} = T_k^{(m)} - T^{(m)} \;\Rightarrow\; j_{acc,k}^{(m)} = j_{abs,k-m} - j_{abs,k} \tag{2.6} $$
1jacc,3
0 (2) 1
0
0
1 0
1
(2)
T3 T4 jper,5
0
1 0
1
jittery
clock
11
00 11
00 11
00 11
00 1
0
00
11 00
11 00
11 00
11 0
1
00
11 00
11 00
11 00
11 0
1
00
11 00
11 00
11 00
11 0
1
00jabs,1
11 jabs,2
00
11 00jabs,3
11 00jabs,4
11 0jabs,5
1
ideal
clock
F IGURE 2.7.: Definitions of absolute, period and accumulated jitter [99].
Input-Output Jitter The most common method for analyzing the performance of a PLL is to
measure the time difference between input reference frequency and the PLL output clock. This al-
lows for directly quantifying the time domain misalignment of the PLL, which yields a qualitative
description for synchronization performance.
If the PLL is used for clock and data recovery (CDR) as with serial high-speed receivers, the
reference frequency is replaced by the analog input data and jitter values are measured between
bit transitions of the input signal and the PLL output clock (see figure 2.8). In order to correctly
quantify IO jitter, thus, an exact time interval measurement has to be performed.
IO jitter measurements are very important for practical use in high-speed communications and
required by a broad variety of applications, such as production tests, clock synchronization, or
signal quality specification [82].
FIGURE 2.8.: IO jitter measurement principle according to [82]. The data signal and the CDR output feed a time interval analysis block.
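The absolute, period and accumulated jitter definitions above can be sketched in a few lines of Python operating on a list of measured clock edge times. The function names and the example edge times are hypothetical. The sign convention follows the closing lines of equations (2.5) and (2.6) as printed; RMS values do not depend on this choice of sign.

```python
import math

def absolute_jitter(edges, period, t0=0.0):
    """j_abs,k = t_k - t_id,k with ideal edges t_id,k = t0 + k*period (eq. 2.4)."""
    return [t - (t0 + k * period) for k, t in enumerate(edges)]

def accumulated_jitter(j_abs, m=1):
    """j_acc,k^(m) = j_abs,k-m - j_abs,k (eq. 2.6); m = 1 gives period jitter (eq. 2.5)."""
    return [j_abs[k - m] - j_abs[k] for k in range(m, len(j_abs))]

def rms(values):
    """Root mean square of a list of jitter values."""
    return math.sqrt(sum(v * v for v in values) / len(values))

if __name__ == "__main__":
    # Assumed example: five clock edges of a nominally 1 UI period clock.
    edges = [0.00, 1.02, 1.99, 3.03, 4.00]
    j_abs = absolute_jitter(edges, period=1.0)
    j_per = accumulated_jitter(j_abs, m=1)
    j_acc2 = accumulated_jitter(j_abs, m=2)
    print("absolute jitter       :", j_abs)
    print("period jitter         :", j_per)
    print("accumulated jitter m=2:", j_acc2)
    print("RMS absolute / period :", rms(j_abs), rms(j_per))
```

Input-output jitter would use the same subtraction as `absolute_jitter`, with the bit transitions of the input data taking the place of the ideal clock edges.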
2.1.3. Relation Between Phase Noise and Jitter

We are still missing a relation between the phase noise φ(t) as continuous random process, and
jitter for digital clock signals in time domain. The phase modulated signal in equation (2.1) can
also be seen as a sampling clock, where the zero crossings correspond to the edges of a digi-
tal clock signal. For the ideal case we have φ(t)=0, and the sampling instants correspond to
{0, T0 , 2T0 , . . . , kT0 }. The non-ideal sampling instants are affected by absolute jitter, and thus
{jabs,0 , T0 + jabs,1 , 2T0 + jabs,2 , . . . , kT0 + jabs,k }. Therefore, the k-th phase deviation caused by
jitter appears at time instant tk =kT0 + jabs,k , so that
jabs,k · ω0 = φ(kT0 + jabs,k ). (2.7)
In addition, if the absolute jitter is significantly smaller than one sampling period (jabs ≪ T0 ), we
have φ(kT0 + jabs,k )≈φ(kT0 ), and can rewrite equation (2.7) as

$$ j_{abs,k} \approx \frac{\varphi(kT_0)}{\omega_0}, \tag{2.8} $$

where jabs,k is now a discrete time random process, which simply corresponds to a sampled and
scaled version of the continuous phase noise process φ(t).
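Equation (2.8) amounts to a simple scaling, which the following sketch makes concrete. The function name and the carrier frequency of 3 GHz are assumptions for illustration only.

```python
import math

def phase_to_jitter(phi_samples, f0):
    """Scale sampled phase noise phi(k*T0) in rad to absolute jitter in seconds (eq. 2.8)."""
    omega0 = 2.0 * math.pi * f0
    return [phi / omega0 for phi in phi_samples]

if __name__ == "__main__":
    f0 = 3e9                              # assumed carrier frequency in Hz
    phi = [0.10, -0.05, 0.02]             # assumed phase noise samples in rad
    for p, j in zip(phi, phase_to_jitter(phi, f0)):
        # Multiplying by f0 expresses the jitter as a fraction of the period T0 = 1/f0.
        print(f"phi = {p:+.2f} rad -> j_abs = {j:+.3e} s = {j * f0:+.4f} T0")
```

Expressed relative to the period, the jitter is φ/(2π), so the condition jabs ≪ T0 is equivalent to the small-angle condition on φ(t) used for equation (2.2).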
2.2. PLLs for Serial High-Speed Communications

In this section a brief introduction to high-speed PLLs is given, as required for serial communi-
cation interfaces. PLLs are non-linear synchronization systems that have been investigated and
described thoroughly in literature [7,30,36,92,121]. In high-speed serial links they are commonly
used as clock synthesizers at the transmitter, and for clock and data recovery (CDR) at the receiver
side [11, 45, 46, 134]. A classical digital PLL is composed of three basic components: a phase
detector (PD), a loop filter (LF) and a voltage controlled oscillator (VCO). The charge-pump PLL
(CPLL) architecture also omits the divider along the feedback path and includes an additional
charge-pump (CP), in order to achieve a simple high-speed design with low phase noise. The
basic block scheme is depicted in figure 2.9.
FIGURE 2.9.: Basic block scheme of a CPLL [121]. Signal chain: fref into the PD, up/dn pulses into the CP, followed by LF and VCO with output fvco, which is fed back to the PD.
The behavior can be summarized as follows: The VCO generates an output clock with frequency
fvco, which depends on the given input voltage. The phase detector compares the phase of this
clock against a reference clock with frequency fref, and decides whether the VCO clock leads
(early) or lags (late) the reference. According to this decision, logical down (early) or up (late)
pulses are generated. Both signals drive a charge-pump which injects current into or removes it
from the loop filter, and thus provides the control voltage for the oscillator. If the oscillator clock
is late, several up pulses are generated by the PD, which increases the loop filter voltage and thus
moves the oscillator toward higher frequency until both phases are aligned again. For an early
oscillator clock the reverse behavior is observed. This way a non-linear control loop acts as
synchronization system.
In the past years, charge-pump PLL architectures have dominated the field of high-speed trans-
ceivers due to a low phase noise. Although all-digital PLLs [25, 110] are becoming increasingly
important with technology scaling and for design cost reduction, CPLLs are still widely used.
They offer two major advantages compared to pure analog architectures. First, a flexible design
can be achieved with decoupled parameters such as the loop bandwidth, damping factor and lock
range. Second, the included charge-pump allows for a zero static phase offset [46, 133].
In serial high-speed links both transmitter and receiver are characterized by PLLs as clock
synthesizers. The transmitter often uses an additional clock divider in the PLL feedback path to
multiply the reference frequency in order to yield the desired high-speed data rate. Conversely,
the receiver is characterized by a clock and data recovery (CDR) circuit, where the serial input
data is used as reference frequency and the VCO output is the synchronized clock. The PD can
determine a phase mismatch only at input signal changes, which requires a sufficient amount of bit
transitions inside the data stream. Therefore, the 8b10b encoding scheme [43] is typically used,
which converts 8 bits of original data into 10 bits for transmission. This encoding guarantees
sufficient bit transitions, with a maximum run of five consecutive equal bits, and a DC-balanced signal.
CPLLs have been analyzed and described thoroughly in literature, where the theory has been
extended from the linear model of analog PLLs [36]. A valid continuous-time approximation is
obtained, if the loop bandwidth is considered significantly smaller (at least 1/10) than the update
frequency of the phase detector. Since high-frequency signals are suppressed by the loop filter,
digital pulses of the phase detector are averaged and thus, a linear s-domain model can be used
for a CPLL design. At higher frequencies where the PD update rate is comparable to the loop
bandwidth, the feedback delay will induce an excessive phase shift and hence, lead to instability.
In order to account for this effect, discrete-time z-domain equations have been derived as well [46].
Unfortunately, these analytical equations still do not provide an accurate description of the non-
linear phase noise behavior inside a CPLL. Only behavioral time domain models that are able
to cope with the non-linear loop dynamics of a CPLL thus correctly reflect the true phase noise
of high-speed transceivers. Such a model will be implemented in chapter 8 to demonstrate the
application of a proposed jitter analysis method, and to guarantee that specification requirements
such as the target BER and jitter tolerance are met.
2.3. Jitter and BER Analysis Methods

With the PLL as fundamental system for clock synchronization we are highly interested in speci-
fying its synchronization performance. Especially in digital high-speed interfaces where jitter can
lead to erroneous data recovery, jitter analysis, jitter tolerance and robust design become important
issues. In this context the probability of data misinterpretation in terms of bit error rate (BER) is an
important quality criterion for receivers. This measure is further supported by interface standards
specifying target BER levels of 10^−12 or even less [118]. Therefore, as figure of merit one ideally
likes to give a relation between measured timing jitter and the BER.
The direct verification of a target BER=10^−12 is very time consuming and impractical; a single
BER test can easily take several minutes [72,96]. In [89,96] the trade-off between
test time and BER confidence level is examined, and the following equation derived for the amount
of data bits N , needed to guarantee a desired target BER:
$$ N = \frac{1}{B}\left[-\ln(1-a) + \ln\left(\sum_{k=0}^{E}\frac{(N \times B)^k}{k!}\right)\right], \tag{2.9} $$
where B=10^−12 is the desired BER level, a specifies the statistical probability or confidence level
that the true BER value is less than B, and E is the number of detected errors during measurement.
When no bit errors occur (E=0), the second term of the equation is zero and the solution to
equation (2.9) is simplified. For example, with a=0.95 it is necessary to transmit N = 3.0/B = 3·10^12
bits without errors in order to meet the imposed specification requirement. In a 3 Gb/s transceiver
this would require an analysis time of T = N/(3·10^9 b/s) = 1000 s = 16 min 40 s. Such a huge test
time cannot be tolerated for high volume production tests, where all specification requirements of
the transceiver have to be verified within several hundred milliseconds.
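Since N appears on both sides of equation (2.9), the case E>0 has to be solved numerically. The following is a minimal sketch using a simple fixed-point iteration on x = N·B; the function name `bits_required` and the iteration count are assumptions for illustration and are not part of the cited references.

```python
import math

def bits_required(ber, confidence, errors=0, iterations=100):
    """Solve equation (2.9) for N by fixed-point iteration on x = N*B."""
    c = -math.log(1.0 - confidence)
    x = c                                    # exact solution for E = 0
    for _ in range(iterations):
        s = sum(x**k / math.factorial(k) for k in range(errors + 1))
        x = c + math.log(s)
    return x / ber

n0 = bits_required(1e-12, 0.95)              # error-free test, about 3.0e12 bits
t0 = n0 / 3e9                                # test time at 3 Gb/s, about 1000 s
n1 = bits_required(1e-12, 0.95, errors=1)    # one tolerated error needs more bits
```

Tolerating a single error at the same confidence level raises the required number of bits by more than half, which further motivates the extrapolation-based methods discussed next.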
Therefore, test engineers have to rely on analysis methods which allow for accurate BER esti-
mation using a number of jitter samples which is several orders of magnitude smaller. Thus, cor-
responding mathematical models and equations must be provided in order to correctly verify the
desired target BER. Jitter values can usually be obtained easily from a model simulation, however
this process is often more complex in practice when carried out on hardware. High precision equip-
ment is required to perform accurate off-chip jitter measurements, including the use of high-speed
sampling scopes, time interval analyzers (TIAs) or bit error rate testers (BERTs) [12–14, 86, 130].
A detailed documentation of methodologies for jitter and signal quality measurements can also be
found in [43].
External noise sources can easily affect off-chip measurements at multiple Gb/s rates. Thus,
a broad variety of built-in jitter measurement (BIJM) systems [16–18, 35, 48, 57, 60, 65, 66, 100,
129, 131] has been developed in recent years as well. Such systems require a large amount of die
area if the jitter histograms have to be collected in real-time [16]. This is especially the case if for
example frequency domain analyses have to be realized and thus, all jitter values are needed. In
cases where the measurement time is uncritical, BIJM circuits also become very small but then,
they can only be used for histogram based jitter analysis.
Nevertheless, histogram based methods represent the most important class of analysis princi-
ples, since they directly relate jitter with the BER. This is not directly the case for time or frequency
domain based methods. In the following sections these three analysis domains will be explained
in more detail in order to give a comprehensive overview to the state-of-the-art in the topic.
2.3.1. Histogram Based Analysis

Histogram or statistical domain based methods estimate jitter influence using probability distri-
butions of collected jitter values. Starting with the observed eye diagram in figure 2.10, a jitter
distribution is obtained from the horizontal cross section at a desired signal level. For simplicity,
here only the optimum decision level or zero crossing line is considered.
The collected distribution corresponds to the histogram or probability density function (PDF)
of jitter samples, assuming that timing jitter is a stationary random process. A measured distribution
is often represented by the so called bathtub plot as shown in the bottom part of figure 2.10.
For this purpose, the integral of the density, the cumulative density function (CDF), is calculated
for both distribution tails and put into logarithmic scale. Additionally, CDFR (x) is defined as right
sided bathtub curve, while the reverse function CDFL (x)=1−CDFR (x) is denoted as left sided
bathtub, according to the respective tail. The CDF directly describes the BER as a function of
sampling time and thus, can be used to identify the jitter extent at any desired BER level of interest.
[Figure 2.10: eye diagram of width 1 UI with the jitter PDFs at both edges (top); BER on a logarithmic
scale from 1 down to 10^−12 over the bit period 0 to T, showing the left sided bathtub curve
CDFL = 1 − CDFR, the right sided bathtub curve CDFR, and the distances tL, teye and tR (bottom)]
FIGURE 2.10.: Bathtub function example.
The goal now is to design a clock recovery system in a way that both tails are separated
sufficiently far from each other at the required target BER. This target level is chosen according to
the specification requirement for high-speed serial links, which is usually BERspec =10^−12 . The eye
opening at this level is given by the distance between the corresponding points on left and right
BER curves:
teye = T − tL − tR , (2.10)
where T is the bit period, and tL and tR the resulting distances on the bathtub curve at 10^−12 . The
total jitter peak-to-peak value TJpp or timing budget can thus directly be determined with
TJpp = T − teye = tL + tR . (2.11)
If normalized by the unit interval (UI) so that T ≡1, tL and tR equal the portion of eye closure.
These are the parts of the UI not accessible for sampling if the target BER has to be fulfilled.
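As a rough illustration of equations (2.10) and (2.11), the following sketch reads tL and tR directly from empirical jitter samples at a probability level that is still covered by the data. The function names and the toy data are assumptions for illustration; at the target BER of 10^−12 such a direct readout is not feasible and extrapolation is required.

```python
def crossing_distance(jitter, level):
    """Distance t from the ideal edge at which the empirical tail
    probability P(j > t) has dropped to `level` or below (level < 1)."""
    ordered = sorted(jitter)
    n = len(ordered)
    allowed = int(level * n)          # samples still allowed beyond t
    return ordered[n - allowed - 1]

def eye_opening(jitter_left_edge, jitter_right_edge, period, level):
    """Return (t_eye, TJ_pp) at the given probability level."""
    t_l = crossing_distance(jitter_left_edge, level)                 # late edges at 0
    t_r = crossing_distance([-j for j in jitter_right_edge], level)  # early edges at T
    tj_pp = t_l + t_r                 # equation (2.11)
    t_eye = period - tj_pp            # equation (2.10)
    return t_eye, tj_pp

# Deterministic toy data (hypothetical units): 100 evenly spaced jitter values.
samples = list(range(-50, 50))
t_eye, tj_pp = eye_opening(samples, samples, 200, 0.25)
```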
The bathtub curve representation offers a simple way to verify whether a measured jitter distri-
bution achieves the specification. Unfortunately, in order to determine the correct eye aperture at
very low probability levels a huge amount of jitter samples must be collected. For BER levels of
10^−12 and lower, a direct measurement of the histogram is not feasible. Especially in simulations
bathtub curves are only tracked down to probability levels that are orders of magnitude higher than
the target BER.
Therefore, an extrapolation of the bathtub curves is required. This extrapolation can be huge:
even with N =10^8 jitter samples it still ranges over four orders of magnitude. It can thus
only be done correctly if valid model assumptions are made for the underlying jitter distributions.
Common model assumptions are aligned to the popular Gaussian tail model [123] and can be
characterized as follows:
1. Jitter is a stationary random process.
2. The measured total jitter (TJ) distribution can be separated into two components, random
(RJ) and deterministic jitter (DJ).
3. RJ is observed at the outer tails of a TJ distribution, and follows an unbounded Gaussian
which can be fully described by its mean µ, standard deviation σ and amplitude A.
4. DJ follows a finite, bounded distribution.
[Figure 2.11: total jitter PDF over the interval −T/2 to T/2, composed of a bounded DJ component and
two Gaussian RJ tails with parameters µL, σL, AL (left) and µR, σR, AR (right)]
FIGURE 2.11.: RJ and DJ components of a jitter PDF. Definitions of right (R) and left (L) tails
correspond to the bathtub curves from figure 2.10.
According to these assumptions a TJ distribution can always be decomposed into two Gaussian
tails together with an arbitrary shaped bounded DJ component, as shown in figure 2.11. In order
to correctly extrapolate a measured distribution, analysis methods have to identify the three model
parameters µ, σ and A for both tails. This means, one is basically trying to fit a Gaussian function
into the measured distribution tails. Jitter analysis methods are thus also referred to as tail fitting
algorithms or jitter decomposition methods, while in mathematical statistics this problem domain
is also known as tail extrapolation and treated by extreme value theory. Once the model parameters
have been identified, the TJ timing budget can easily be calculated for arbitrary probabilities and
thus used for BER analysis. A mathematical description of the timing budget is provided later on
in section 3.
Various methods were developed to separate the random and deterministic jitter components
with tail fitting algorithms [51, 54, 58, 84, 95, 124, 136]. In this section, existing techniques are
reviewed in order to provide a comprehensive overview to the state-of-the-art.
Methods Based on Chi-Square Statistics

These are quite popular methods for distribution tail fitting, and have been widely used for jitter
and BER analysis in high-speed serial links [52,84,90]. The Tailfit algorithm by Li et al. [85] uses
the chi-square test as goodness-of-fit measure for fitting a Gaussian function into distribution tails.
The model includes the parameters mean, standard deviation and amplitude values, and is fitted
into the tails by minimizing the difference between measured tail data and model prediction. This
minimization process corresponds to a three dimensional optimization.
Methods based on the described principle suffer from a few drawbacks. First, the tail part of
the distribution has to be identified before starting the optimization. This requires a tail
identification algorithm with conservative parameters, which behaves suboptimally for a broad range
of distribution shapes. Second, the tail fitting algorithm itself suffers from a high complexity
due to the three dimensional optimization. A successful minimum search thus requires a robust
convergence behavior.
Methods Based on Quantile Normalization

The quantile normalization principle [19, 106, 115] is based on a linearizing transform which
greatly simplifies the analysis of distribution tails. The idea is to transform a distribution into
a domain where the bathtub tails, if Gaussian, are represented as straight lines. A simple linear
extrapolation is then carried out to estimate the TJ timing budget at the target BER. This normal-
izing transform is realized using the so called quantile function [106], which is commonly used
for QQ-plots [19, 20, 116] in statistics. The transformed quantile domain is also referred to as
Q-scale [53, 95, 123] or Q-space [82] in jitter and BER analysis.
Besides the general extreme value theory, Gaussian Q-normalization was first used by Popovici
in [111] for BER analysis of digital links. Although the BER is only described as a function
of the signal to noise ratio, the underlying mathematical concept can also be mapped onto the
jitter decomposition problem. Popovici provides a thorough description of the quantile function to
normalize Gaussian distributions, together with rational approximation coefficients and a Gaussian
regression algorithm.
Hänsel et al. [51] used the Q-function for decomposing jitter distributions into RJ and DJ. Lines
were fitted to the normalized distribution in Q-domain, where the line slope and offset allowed
for the reconstruction of Gaussian model mean µ and standard deviation σ (see figure 2.11). This
principle was subsequently also described by Stephens [123] and Kizer [70]. However, the method
is not able to recover the Gaussian amplitude A, and is thus in some sense incomplete.
A general drawback of this conventional Q-normalization technique is that estimation accuracy
is very sensitive to the selected fitting region. The fit should be performed only at the tail parts of
the distribution that truly follow the underlying linearized Gaussian. Unfortunately, this line be-
havior is only approached asymptotically, and it is thus difficult to determine where the asymptote
begins.
Hong and Cheng [53, 54] tried to improve the Q-normalization method by fitting higher order
polynomials to the Q-normalized bathtub instead of linear functions, and achieved an acceptable
accuracy for estimated total jitter values. However, influence of statistical random data fluctuation
has not been considered, and the approach investigates only a few special test cases.
Finally, Miller [93–95] proposed the “normalized Q-scale” analysis, where the Q-normalization
method also includes the Gaussian amplitude A as third model parameter. The underlying mathe-
matical principle was already described by Popovici [111], and includes an additional pre-scaling
factor to normalize the Gaussian amplitude before transforming the distribution into Q-domain.
This thesis will also put the main focus on jitter analysis methods based on quantile normaliza-
tion. As will be shown, the linearizing property allows for an accurate and efficient extrapolation
of distribution tails, and can also be used to derive a unifying optimization scheme for tail fit-
ting which covers all of the above described references [51, 54, 95, 111, 123]. Further, it can be
generalized for use with arbitrary non-Gaussian tails.
Other Methods
Other less popular jitter decomposition methods are based on techniques for deconvolution [124,
125,128], Gaussian mixture models [98] and the wavelet transform [136]. Deconvolution methods
rely on the idea that in histogram based analysis a total jitter PDF is given as convolution result of
the RJ and DJ components. If one of these two components is approximately known or estimated,
a deconvolution algorithm can be used to determine the other component, and thus to retrieve the
Gaussian model parameters. A major drawback of these methods is their limited accuracy,
since either the DJ or RJ component has to be estimated prior to the deconvolution.
Another method is based on the wavelet transform [136] and uses derivatives of Gaussian wave-
lets to detect the locations (mean values) of the Gaussian model functions. The variances are deter-
mined from a transformed log-likelihood function, while Gaussian amplitudes are not considered.
Due to the applied wavelet transform, this approach also suffers from a high computational de-
mand.
2.3.2. Time-Domain Based Analysis

Jitter analysis techniques based on time-domain [27, 89, 143] rely on jitter measurements carried
out in real-time. This is only feasible for dedicated real-time measurement systems, such as high-
speed sampling scopes or time interval analyzers (TIAs). In time-domain, jitter is then treated as
a statistical random signal which can be analyzed in terms of correlation and statistical moments.
Correlation analysis was introduced by Dou and Abraham [27, 28], and considers the evolution
of jitter samples in time by calculating the autocorrelation function. Unlike histogram based anal-
ysis it allows for the extraction of different DJ components (see figure 2.3), such as duty cycle
distortion (DCD), sinusoidal (SJ) and even data dependent jitter (DDJ). With only a few thousand
samples, estimates can already be obtained with an acceptable accuracy. Decomposition of DJ into
these subcomponents affords identification of the root causes of jitter. Unfortunately, the approach
still misses a relation between extracted DJ subcomponents and the total jitter, which impedes
derivation of the BER.
In [89] a method for the measurement time reduction based on signal to noise ratio (SNR) de-
crease is presented. This technique captures the amount of bit errors over a certain number N
of transmitted bits. The SNR of the system is intentionally reduced by a known quantity until
errors are captured, which results in a quicker measurement of the degraded BER. The relation-
ship between SNR and BER can be derived from Gaussian statistics and is documented in many
communications text books, such as [49, 114]. However, the described approach is valid only if
Gaussian RJ is the dominant cause for bit errors. It cannot be applied to arbitrary jitter distributions
and usually requires an amount of test samples which is too high for simulation applications.
2.3.3. Frequency-Domain Based Analysis

The time domain series of jitter can also be represented and analyzed in frequency domain using
the Fourier transform [82, 105, 139]. The PSD is then used to represent the jitter spectrum, by
applying averaging techniques such as the periodogram method. Peaks in the spectrum can be
interpreted as SJ or DDJ, while the average noise floor denotes the power of RJ. Unfortunately,
bounded uncorrelated jitter (BUJ) cannot be distinguished in a PSD, and a relation to the BER is
thus not explicitly given.
In [55, 56] four spectral regions of the jitter transfer function of CDR circuits are defined to
allow for BER analysis. The approach is restricted to Gaussian RJ combined with SJ, where the
sinusoidal jitter frequency is extracted from the spectral information. The obtained jitter transfer
characteristic of CDRs is subsequently [101, 102] also used to derive an analytical approximation
for the maximum phase error, which can be adapted for BER calculations.
In his book [82] Li thoroughly describes frequency domain principles for jitter separation in-
cluding DDJ, SJ and RJ types. BUJ can generally not be separated from the RJ noise floor, unless
it can be measured independently or controlled in some way.
3. Mathematical Background
This chapter deals with the mathematical basics involved with the developed jitter analysis meth-
ods and optimization schemes. First, the quantile normalization is reviewed as fundamental math-
ematical principle for a powerful class of tail fitting methods which is going to be analyzed and
optimized throughout subsequent chapters. Then performance metrics and test distributions are
introduced for the qualitative analysis of tail fitting methods.
3.1. Quantile Normalization

Fitting methods investigated in this thesis are all based on quantile normalization. It forms a
powerful technique for linearizing the tails of a jitter distribution and thus, allows simple linear
functions to be fitted via regression analysis. These lines then become the medium for tail extrapo-
lation. In order to fully understand the underlying concept, first the generic mathematics involved
with quantiles are described before focusing on the Gaussian distribution as special case.
3.1.1. Quantile Function

The derivation of quantiles [19,20,106,115,116] starts by defining a set of data samples x1 , . . . , xN ,
drawn from an unknown distribution function F (x). The data is ordered, so that x(1) ≤ x(2) ≤
. . . ≤ x(N ) , and the empirical distribution function of observed random samples F̃ (x) is defined
as
$$ \tilde{F}(x) = p = \frac{i}{N+1} \quad \text{for } x_{(i)} \le x < x_{(i+1)}, \quad i = 1, \ldots, N \tag{3.1} $$
where the use of N +1 (instead of N ) avoids F̃ (x(N ) )=1. x(i) is an empirical estimate for the
pi = i/(N + 1) quantile of the distribution F (x). One can define the quantile function Q(p) to
describe a relation between ordered quantiles and the original amplitude of data, which is equal to
the inverse probability function [106]:
Q(p) = F −1 (p) = x (3.2)
This function allows a distribution to be represented by the quantile plot (or QQ-plot) [19, 20, 116]:
$$ \left\{ \left( F^{-1}\!\left(\frac{i}{N+1}\right),\; x_{(i)} \right) : i = 1, \ldots, N \right\} \tag{3.3} $$
where observed amplitudes x(i) are represented in terms of amplitudes of the theoretical model
quantiles F −1 (pi ). For a large sample size N , the sample quantiles x(i) approximate a shifted
and scaled version of the theoretical ones [115, 116]. This offers a linearized perspective on dis-
tributions, which is especially useful for tail fitting. As an example, in figure 3.1 an empirical
distribution F̃ (x) is shown, obtained from N =100 random samples of a normal distribution with
zero mean and unit variance F (x)=N (0, 1) (solid line). The quantile function F −1 (p) at the right
transforms the empirical distribution into an easy-to-fit linear function.
[Figure 3.1: panel (a) shows pi over xi, panel (b) shows Q(pi) over xi; both span xi from −2 to 2]
FIGURE 3.1.: Example for a) an empirical distribution function CDF(x)=F̃ (x) with N =100
random samples of N (0, 1) and b) the corresponding quantile plot.
Thus, if a gathered distribution exactly matches the expected probability function as described
by the inverse F^−1 , the result is a perfect line along the unit diagonal. If for example only the
tail part follows an expected behavior, as is the case for the RJ-DJ model with Gaussian tails
(section 2.3.1), this line is still observed at the tail parts. This way, the tail fitting problem can
basically be simplified to a linear regression analysis.
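The construction of the quantile plot from equations (3.1) and (3.3) can be sketched as follows. The function name `quantile_plot` and the random seed are assumptions for illustration; the standard normal distribution from the Python standard library serves as theoretical model F.

```python
import random
from statistics import NormalDist

def quantile_plot(samples, model=NormalDist()):
    """Pair the theoretical quantiles F^-1(i/(N+1)) with the ordered
    samples x_(i), as in equation (3.3)."""
    ordered = sorted(samples)
    n = len(ordered)
    return [(model.inv_cdf(i / (n + 1)), x)
            for i, x in enumerate(ordered, start=1)]

random.seed(1)                                        # arbitrary seed
samples = [random.gauss(0.0, 1.0) for _ in range(100)]
points = quantile_plot(samples)   # scatters around the unit diagonal
```

The use of N+1 keeps all probabilities strictly inside (0, 1), so the inverse distribution function is always finite.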
Such a linearizing transform offers a great simplification for the tail fitting procedure, especially
in terms of computational demand. As we will see in chapter 4, the method of least squares can be
implemented very efficiently for this purpose, as it uses only summing terms and recursions. Fur-
ther, the residual error after transform becomes approximately constant and normally distributed
over a large probability region, which makes the least squares method an ideal candidate for the
maximum likelihood estimation of tail parameters.
3.1.2. Gaussian Quantile Normalization

With the Gaussian tail assumption of jitter distributions in section 2.3.1, one is fundamentally
interested in linearizing tails with respect to the normal function. However, the Gaussian tail of a
total distribution can be of arbitrary mean, variance and amplitude. Thus a standardized form of
the quantile function Q(p) must be provided, which allows the model parameters to be recovered
from fitted lines in quantile domain. Once these parameters have been determined, tail
extrapolations down to any desired probability level become very simple.
Subsequently, the Gaussian quantile function is derived and embedded into a simple optimiza-
tion scheme for tail fitting. This also shows how to use the quantile function together with linear
regression analysis. Similar derivations can also be found in [82, section 5.3] or [51, 123] for the
Gaussian case, while [19, 115] describe statistical tail modeling in general. As was already ex-
plained, the bathtub curve of a total jitter distribution describes the BER as a function of the jitter
amplitude x. Initially, a pure Gaussian jitter distribution is assumed where the error rate of the
right tail BERR (x) can be expressed by the well known Gaussian integral:
$$ \mathrm{BER}_R(x) = \rho_T \, \frac{1}{\sqrt{2\pi}\,\sigma_R} \int_{-\infty}^{x} e^{-\frac{(x'-\mu_R)^2}{2\sigma_R^2}} \, dx' \tag{3.4} $$
where µR is the mean value and σR the standard deviation of the Gaussian. The parameter ρT is
the transition density and reflects the probability of bit transitions in the transmitted data signal.
In a clock-like ‘1010. . .’ pattern for example we have ρT =1, while for pseudo random binary
sequences (PRBS) ρT =0.5. In our case the BER definition describes the probability course p of
the right bathtub curve with negative jitter values and thus, represents the right sided cumulative
density function CDFR (x):
$$ F(x) = p = \mathrm{CDF}_R(x) = \frac{\mathrm{BER}_R(x)}{\rho_T} \tag{3.5} $$
In order to obtain a standardized representation of the integral in equation (3.4) the variable q
normalizes a Gaussian function with respect to mean µ and standard deviation σ:
$$ q = \frac{x-\mu}{\sigma} \tag{3.6} $$
With the standardized variable we obtain
$$ \mathrm{CDF}_R(q) = p = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{q} e^{-\frac{q'^2}{2}} \, dq' \tag{3.7} $$
The complementary error function is defined as

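$$ \mathrm{erfc}(x) = \frac{2}{\sqrt{\pi}} \int_{x}^{\infty} e^{-x'^2} \, dx' = \frac{2}{\sqrt{\pi}} \int_{-\infty}^{-x} e^{-x'^2} \, dx' \tag{3.8} $$
and hence, equation (3.7) can be simplified to
$$ \mathrm{CDF}_R(q) = p = \frac{1}{2} \cdot \mathrm{erfc}\left(\frac{-q}{\sqrt{2}}\right), \quad p \in [0, 1] \tag{3.9} $$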
For the left bathtub curve CDFL (x)=1−CDFR (x) the same result is obtained, when using the
negative BER integral from x to ∞. The inverse Gaussian probability function or quantile norma-
lization is thus given by
$$ Q_{\mathrm{gauss}}(p) = q = F_{\mathrm{gauss}}^{-1}(p) = -\sqrt{2} \cdot \mathrm{erfc}^{-1}(2 \cdot p) \tag{3.10} $$
In jitter analysis Qgauss (p) is briefly known as the Q-function [54, 82, 123]. It is commonly used
to transform measured probability functions into Q-domain, where tails appear as straight lines.
Often Qgauss (p) is defined using a positive sign. Here, the negative sign is used explicitly to
maintain symmetry between probability domain and Q-domain.
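Equations (3.9) and (3.10) can be checked numerically with a short sketch. Since the Python standard library has no inverse complementary error function, the sketch assumes the equivalent standard normal inverse CDF from `statistics.NormalDist`; the names `q_gauss` and `cdf_r` are chosen for illustration.

```python
import math
from statistics import NormalDist

def q_gauss(p):
    """Q_gauss(p) = -sqrt(2) * erfc^-1(2p), equation (3.10), evaluated via
    the standard normal inverse CDF, which is the same function."""
    return NormalDist().inv_cdf(p)

def cdf_r(q):
    """Equation (3.9): p = 0.5 * erfc(-q / sqrt(2))."""
    return 0.5 * math.erfc(-q / math.sqrt(2.0))

q12 = q_gauss(1e-12)        # negative, about seven standard deviations
p12 = cdf_r(q12)            # round trip back to the probability domain
```

With the sign convention used here, small probabilities map to negative q values, so the left and right tails keep their orientation after the transform.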
An example for a Q-normalized bathtub is given in figure 3.2. The linearizing effect on Gaussian
tails yields curves which can easily be fitted and extrapolated by simple linear functions. The
standardized variable q as defined in equation (3.6), makes the quantile normalization independent
from mean and standard deviation of the original Gaussian model. As a direct consequence, the
mean value µ is mapped onto a line offset, while the standard deviation σ is mapped onto a line
slope in Q-domain. After the transform, both parameters can easily be retrieved from the linear
regression, as also shown in figure 3.2. The zero crossings of the lines correspond to the Gaussian
means µL and µR , while the standard deviations σL and σR are given by the respective tail slope.
A coefficient comparison between standardization and obtained linear function yields:
$$ q = \frac{x-\mu}{\sigma} \iff q = o + s \cdot x \quad\Rightarrow\quad \sigma = \frac{1}{s}, \quad \mu = -\frac{o}{s} \tag{3.11} $$
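The coefficient comparison in equation (3.11) can be sketched with an ordinary least squares fit in Q-domain. The function name `fit_gaussian_tail` and the ideal noise-free test tail are assumptions for illustration; measured data would additionally require the selection of a suitable fitting region.

```python
from statistics import NormalDist

def fit_gaussian_tail(x, p):
    """Fit q = o + s*x to Q-normalized tail data by least squares and
    return (mu, sigma) according to equation (3.11)."""
    q = [NormalDist().inv_cdf(pi) for pi in p]
    n = len(x)
    mean_x, mean_q = sum(x) / n, sum(q) / n
    s = (sum((xi - mean_x) * (qi - mean_q) for xi, qi in zip(x, q))
         / sum((xi - mean_x) ** 2 for xi in x))
    o = mean_q - s * mean_x
    return -o / s, 1.0 / s

# Ideal Gaussian tail with hypothetical parameters mu = 0.2, sigma = 0.05,
# evaluated between 2 and 5 standard deviations below the mean.
true_tail = NormalDist(mu=0.2, sigma=0.05)
x = [0.2 - 0.025 * k for k in range(4, 11)]
p = [true_tail.cdf(xi) for xi in x]
mu, sigma = fit_gaussian_tail(x, p)
```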

A simple optimization scheme can be derived from the Gaussian quantile normalization. The
scheme in figure 3.3 consists of two consecutive blocks, with the measured CDF data as input
and the regression error or an equivalent criterion as goodness-of-fit measure. An optimization
procedure identifies the best suited Q-tail region for linear regression analysis and returns fitted
line offset o and slope s. These values are then used to retrieve the Gaussian tail parameters µ and
σ. This optimization has to be carried out for both distribution tails independently.
[Figure 3.2: top panel, CDFL (x) and CDFR (x) on a logarithmic scale from 10^−3 to 10^−12 over x, with
teye, tL (10^−12) and tR (10^−12) marked; bottom panel, Q(CDF(x)) over x from 0 to T, with fitted lines
crossing zero at µL and µR and having slopes 1/σL and 1/σR]
FIGURE 3.2.: Q-normalization principle demonstrated with a bathtub function.
[Figure 3.3: block diagram with input p=CDF(x), blocks Q(p) and linear regression, outputs s, o
(with σ=1/s, µ=−o/s) and regression error]
FIGURE 3.3.: Simple optimization scheme for Gaussian tail fitting based on Q-normalization.
Note that the presented optimization scheme does not consider the Gaussian tail amplitude
A as a third model parameter. This is a gap in many tail fitting methods based on
conventional Q-normalization [51, 54, 82, 123]. Therefore, Miller [95] introduced an additional
variable for amplitude normalization. In chapter 4 a way to include this variable into the present
scheme is shown, which closes this gap and significantly improves fitting performance.
Returning to the recovered Gaussian model parameters µ and σ, the TJ timing budget as im-
portant quality measure for high-speed interfaces can now be easily retrieved. According to the
bathtub function in figure 3.2, the peak-to-peak value of total jitter TJpp is decomposed into RJ
and DJ:
TJpp = DJpp + RJpp (3.12)
where each of these components is described in terms of the tail parameters:
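$$ DJ_{pp} = \mu_L - \mu_R \tag{3.13a} $$
$$ RJ_{rms} = \frac{\sigma_L + \sigma_R}{2} \quad \text{with} \quad RJ_{pp} = RJ_{rms} \cdot \left(-2 \cdot Q(BER_{spec})\right) \tag{3.13b} $$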
Here, RJrms denotes the root-mean-square value and is calculated as the mean of the two Gaussian
standard deviations. The probability dependent Q-factor from equation (3.10) denotes the units of
Gaussian standard deviations we have to move away from the mean value, in order to reach the
desired target BER level. With BERspec =10^−12 , equation (3.12) is rewritten in the commonly
used form:
TJpp = DJpp + 14.07 · RJrms (3.14)
In order to highlight the influence of RJrms on the total amount of jitter, in table 3.1 important
probability levels together with their Q-factors are given.
BERspec     −2 · Q(BERspec)
10^−6        9.51
10^−9       12.00
10^−12      14.07
10^−15      15.88
TABLE 3.1.: Multiplicative constant to specify a target BER for TJpp values.
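The multiplicative constants of table 3.1 and the timing budget of equations (3.12) to (3.14) can be reproduced with a short sketch. The function names and the example tail parameters (given in UI) are assumptions for illustration; the sign convention of DJpp follows equation (3.13a) as printed.

```python
from statistics import NormalDist

def rj_multiplier(ber_spec):
    """Constant -2 * Q(BER_spec) from equation (3.13b)."""
    return -2.0 * NormalDist().inv_cdf(ber_spec)

def total_jitter_pp(mu_l, mu_r, sigma_l, sigma_r, ber_spec=1e-12):
    """TJ_pp = DJ_pp + RJ_pp according to equations (3.12) and (3.13)."""
    dj_pp = mu_l - mu_r
    rj_rms = 0.5 * (sigma_l + sigma_r)
    return dj_pp + rj_multiplier(ber_spec) * rj_rms

table = {ber: round(rj_multiplier(ber), 2) for ber in (1e-6, 1e-9, 1e-12, 1e-15)}
tj = total_jitter_pp(0.05, -0.05, 0.01, 0.01)   # hypothetical values in UI
```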
A general problem appears with fitted regression lines in Q-domain. The Q-tails often approach
the linear behavior asymptotically, which can end up in misleading TJpp estimates if the fit is
not performed in a suitable probability region. Due to the asymptote it is not possible to detect
an exact probability level where the linear behavior begins. With the proposed fitting method in
chapter 4 this effect is also visible and has to be investigated carefully in order to afford accurate
tail extrapolations. In sections 4.3.2 and 4.4 this problem domain will especially be addressed by
focusing on performance optimizations with additional fitting parameters.
3.2. Performance Analysis of Algorithms

In order to analyze and compare the performance of tail fitting methods, a broad variety of test
distributions must be generated. The basic idea is to evaluate the fitting quality of an algorithm,
which is directly reflected by the estimation accuracy of TJ values. This performance primarily
depends on the test distribution shape as well as the amount of collected jitter samples. In this
section a brief look shows how to investigate these influences.
3.2.1. Performance Metrics

The performance of a fitting algorithm can basically be measured by its ability to correctly identify
Gaussian tail parameters. When test distributions are built of RJ and DJ components, unfortunately
it is not possible to express the true parameter values as closed form equations. One can thus only
use a numerical approximation of the TJ distribution to identify the true timing budget at the target
BER = 10^{-12}. If the tail fitting algorithm is able to perfectly decompose the test distribution into
the parameterized Gaussian tail, there will be no difference between the estimated TJpp,est and
the true timing budget TJpp,true . For real fitting methods the difference, or relative error E can be
used as a measure for estimation performance. In our analyses, test distributions will be generated
with random jitter values, and thus suffer from statistical variations. Therefore, also the estimation
error has to be treated as a random variable with its statistical mean and standard deviation:
$$E_k = \frac{\mathrm{TJ}_{pp,est,k} - \mathrm{TJ}_{pp,true}}{\mathrm{TJ}_{pp,true}}, \quad k = 1, \ldots, K$$ (3.15)
$$E_m = \mathrm{mean}\{E_k\}, \quad \sigma_e = \mathrm{std}\{E_k\}$$ (3.16)
Here, K is the number of evaluation runs and should at least equal a few hundred, in order to
construct empirical relations. Sometimes a tail fitting method might also produce misleading
outliers due to convergence problems. In this case it is better to use the median value Emed
together with the interquartile range IQR (interval between upper qup and lower qlo quartile) to
specify statistical spread. These measures are less prone to outlier degradation [29]:
Emed = median{Ek } , IQR = qup {Ek } − qlo {Ek } (3.17)
Various fitting methods may produce TJ estimates with different error behavior. Therefore, a common
performance indicator is preferred, which considers both biased and dispersive error influences.
Unfortunately, there exists no optimal combination, and hence, a suitable confidence interval has to
be specified. For a Gaussian distributed error as shown in figure 3.4, a symmetric dispersion around
the median value is obtained where the data range is typically defined by 3 IQR. The Gaussian
standard deviation for ±1.5 IQR equals ±2.02 σe, which corresponds to a confidence value of a=95.7%.
If the error is not Gaussian distributed as is the case with misleading outliers, Emed and IQR form
robust estimates, which are better suited to describe distributions of unknown shape.
FIGURE 3.4.: IQR definition for the Gaussian distribution (quartiles Q1, Q2 = median and Q3; the IQR
covers 50%, the range median ± 1.5 IQR covers a=95.7%).
With the given confidence level a=95.7% the overall error, or estimation loss EL can be defined
as a measure for combined biased and dispersive error influence:
EL = |Emed | + 1.5 · IQR (3.18)
Only for a normal error distribution we have the equivalent form:
EL = |Em | + 2.02 · σe (3.19)
According to this definition, EL defines a positive error threshold, which is exceeded by less than
(1−a)/2 ≈ 2.2% of estimates.
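The robust loss of equation (3.18) and the Gaussian constants behind equation (3.19) can be checked with a few lines. This is a sketch under assumptions: the function names are hypothetical, and the quartiles are computed with the "inclusive" interpolation rule of the Python standard library, which is only one of several common quartile conventions.

```python
from statistics import NormalDist, median, quantiles

def relative_errors(tj_est, tj_true):
    """Relative estimation errors E_k of equation (3.15)."""
    return [(est - tj_true) / tj_true for est in tj_est]

def estimation_loss(errors):
    """Estimation loss EL = |E_med| + 1.5 * IQR of equation (3.18)."""
    q_lo, _, q_up = quantiles(errors, n=4, method="inclusive")
    return abs(median(errors)) + 1.5 * (q_up - q_lo)

# Gaussian reference values: 1.5 IQR expressed in standard deviations,
# and the probability mass enclosed by median +/- 1.5 IQR.
std = NormalDist()
sigma_factor = 1.5 * 2.0 * std.inv_cdf(0.75)
confidence = 2.0 * std.cdf(sigma_factor) - 1.0
print(f"1.5 IQR = {sigma_factor:.2f} sigma, a = {100 * confidence:.1f}%")
```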
Sometimes a tail fitting algorithm may also produce misleading outliers, especially when con-
vergence failures occur. Such failures yield error distributions that are quite different from the ideal
Gaussian as shown in figure 3.4, and are typically characterized by slowly decaying heavy tails.
Outliers have to be avoided as far as possible. In order to measure their presence, the fourth stan-
dardized moment or kurtosis κ can be used. This statistical moment describes the “peakedness”
of a distribution and yields a value of κ=3 for an ideal Gaussian. If a distribution is outlier-prone,
κ will be significantly larger, while for the bounded uniform case it is κ=1.8.
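As an illustration of the kurtosis as outlier indicator, the following sketch computes the fourth standardized moment of a list of error values. The function name `kurtosis` and the population (biased) moment estimates are assumptions made for brevity.

```python
def kurtosis(values):
    """Fourth standardized moment m4 / m2^2 using population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2)

# A regular grid approximates the bounded uniform case.
uniform_like = [i / 1000 for i in range(1001)]
print(round(kurtosis(uniform_like), 2))
# Two far outliers are enough to push the value well above 3.
print(kurtosis(uniform_like + [50.0, -50.0]) > 3.0)
```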
3.2.2. Test Distributions

With the statistical performance measures, various distribution shapes may be defined to investi-
gate the estimation performance of a fitting algorithm. In literature, different distribution types
have commonly been used [54, 69, 82, 104, 120]. In these documents, TJ distributions are usually
composed of Gaussian RJ and bounded DJ, both characterized by the standard deviation σRJ and
amplitude ADJ , together with the selected DJ type. In subsequent analyses sinusoidal, uniform,
triangular and quadratic curve shaped DJ types will be used for performance evaluation, as de-
picted at the left of figure 3.5. Together with Gaussian RJ they yield the TJ test distributions at the
right.
Sinusoidal DJ is observed as periodic variation of edge positions and thus, also referred to as
sinusoidal jitter (SJ). Its root causes can be PLL spurs or power supply noise [104]. Uniform
DJ especially appears with inter-symbol interference (ISI), while triangular jitter is caused by
crosstalk dominated noise. Quadratic curve shaped DJ is finally obtained as combination of ISI
and crosstalk [54], and approximates a bounded Gaussian distribution which may also result from
bounded uncorrelated jitter (BUJ) [76, 120].
FIGURE 3.5.: Test distribution types, constructed with parameters σRJ and ADJ (left: sinusoidal,
uniform, triangular and quadratic curve DJ of amplitude ADJ; right: resulting TJ distributions after
combination with Gaussian RJ of standard deviation σRJ).

Random Process    Characterization
JRJ               N(µ = 0, σ = σRJ)
JDJ,sin           ADJ/2 · sin(2π · fSJ/fD · k + ϕ)
JDJ,uni           U(−ADJ/2, +ADJ/2)
JDJ,tri           $\sum_{1}^{2}$ U(−ADJ/2, +ADJ/2)/2
JDJ,qua           $\sum_{1}^{3}$ U(−ADJ/2, +ADJ/2)/3
TABLE 3.2.: Definitions for time domain random processes.

Combined test distributions can easily be generated with jitter samples of independent time
domain random processes. The RJ distribution is realized with a Gaussian normal process JRJ of
zero mean and standard deviation equal to σRJ as developed in [10], while DJ PDFs are constructed
with bounded random processes according to the respective DJ shape. Sinusoidal jitter additionally
uses a sine function of random phase ϕ and frequency fSJ. The uniform process generates a
random number out of the bounded interval [−ADJ/2, +ADJ/2], while triangular and quadratic
curve shaped jitter can be realized as two or three superimposed uniform processes. The total jitter
random process JTJ is simply obtained by the addition of RJ and DJ components:
JTJ = JRJ + JDJ (3.20)
where the random processes correspond to the shape characterizations from table 3.2.
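The random processes of table 3.2 and their combination in equation (3.20) can be sketched as follows. The function names are hypothetical. As a simplification, the sinusoidal process draws an independent uniform phase for every sample instead of advancing the phase with the sample index k; this reproduces the amplitude distribution but not the time correlation of a real sinusoidal jitter source.

```python
import math
import random

def dj_sample(kind, a_dj, rng):
    """Draw one bounded DJ value of peak-to-peak amplitude a_dj."""
    half = a_dj / 2.0
    if kind == "sin":   # random phase per sample (simplifying assumption)
        return half * math.sin(rng.uniform(0.0, 2.0 * math.pi))
    if kind == "uni":
        return rng.uniform(-half, half)
    if kind == "tri":   # two superimposed uniform processes
        return (rng.uniform(-half, half) + rng.uniform(-half, half)) / 2.0
    if kind == "qua":   # three superimposed uniform processes
        return sum(rng.uniform(-half, half) for _ in range(3)) / 3.0
    raise ValueError(f"unknown DJ type: {kind}")

def tj_samples(kind, sigma_rj, a_dj, n, seed=0):
    """Generate n total jitter samples J_TJ = J_RJ + J_DJ."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma_rj) + dj_sample(kind, a_dj, rng)
            for _ in range(n)]

samples = tj_samples("uni", sigma_rj=0.01, a_dj=0.3, n=10000)
print(min(samples), max(samples))
```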
Note that the distribution synthesis with the two components JRJ and JDJ in time domain yields
a convolution in histogram domain. Due to the addition of independent random variables, the
resulting TJ distribution will be decomposed into DJpp and RJrms , which differ from the original
σRJ and ADJ parameters. According to the central limit theorem, the combination of an arbitrary
bounded random process with an unbounded Gaussian process always leads to a distribution which
is more Gaussian-like than the prior bounded component. Thus, an increase of the RJ component
(RJrms ≥ σRJ ) as well as a decrease of the DJ component (DJpp ≤ ADJ ) will be observed [123].
For sinusoidal DJ type, this topic has also been investigated in the appendix of [52].
The resulting TJ shapes can be characterized in a representative way using the variable ratio
σRJ /ADJ . The TJ shape then depends only on the relative difference between the two variables,
while the distribution size can be described by just one of them.
The TJpp,true values at the target BER are determined using numerical approximations. For this
purpose, closed-form expressions of the independent RJ and DJ components are convolved, which allows
for a direct approximation of the complete TJ shape. Then, the TJ value closest to the 10^{-12} level
is extracted and an additional Newton step is carried out. The relative numerical error is guaranteed
to be smaller than 10^{-4}.
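A reference value of this kind can be approximated with little code for the uniform DJ type. The sketch below is not the procedure of this thesis: it replaces the closed-form convolution by a midpoint quadrature over the DJ interval and the Newton step by plain bisection. It further assumes a symmetric distribution and that each tail is evaluated at the probability BERspec, consistent with equation (3.13b). All names are hypothetical.

```python
import math

def phi_cdf(z):
    """Standard normal CDF via erfc, accurate in the far lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def tj_cdf_uniform(x, sigma_rj, a_dj, m=2000):
    """P(J_TJ <= x) for uniform DJ plus Gaussian RJ (midpoint quadrature)."""
    step = a_dj / m
    total = 0.0
    for i in range(m):
        u = -a_dj / 2.0 + (i + 0.5) * step   # DJ value at the bin center
        total += phi_cdf((x - u) / sigma_rj)
    return total / m

def tj_pp_true(sigma_rj, a_dj, ber=1e-12):
    """Peak-to-peak TJ at the target BER, found by bisection on the left tail."""
    lo = -a_dj / 2.0 - 10.0 * sigma_rj   # probability far below ber
    hi = 0.0                             # probability 0.5 by symmetry
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if tj_cdf_uniform(mid, sigma_rj, a_dj) < ber:
            lo = mid
        else:
            hi = mid
    return -2.0 * 0.5 * (lo + hi)

print(tj_pp_true(0.01, 0.3))
```

For vanishing DJ the result reduces to 14.07 · σRJ, and for ADJ > 0 it stays below ADJ + 14.07 · σRJ, in line with the decrease of the DJ component discussed above.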
Throughout subsequent analyses, the specified test distributions will always relate to the inde-
pendent parameters σRJ and ADJ prior to convolution. This is to provide reproducible simulation
results. The uniform DJ shape will especially be utilized as a reference, since it represents a good
compromise between easily decomposable sinusoidal shape, and hardly separable triangular or
quadratic curve shapes.
4. A Fast and Accurate Jitter Analysis Method
With the mathematical background from the previous chapter a novel, fast and accurate jitter
analysis method is developed. This method puts the Q-normalization into the context of a com-
plete three-dimensional Gaussian model optimization, where the unknown tail parameters mean
µ, standard deviation σ and amplitude A are identified for both distribution tails. The optimiza-
tion scheme is realized with simple recursions that allow for a very fast exploration of the search
space. As will be demonstrated in this chapter, the proposed method yields very accurate fitting
results combined with low computational demand and a flexible design architecture. It automat-
ically determines the best suited tail part for distribution tail fitting, and thus represents a clear
improvement to existing tail fitting methods.
The three-dimensional approach is based on an additional scaling factor, included with the op-
timization scheme from figure 3.3. Thus, it also allows for tail amplitude search. Although the
proposed principle has been developed independently, this idea is not novel. Popovici [111] al-
ready described the mathematical principle for Gaussian quantile normalization with respect to
unknown amplitude A and standard deviation σ. There was no need to include the mean value
µ, since the signal to noise ratio was used for BER analysis and the application focus was not on
jitter decomposition. Miller [95] was the first to suggest the amplitude scaling factor which finally
allowed for three-dimensional Gaussian tail fitting. Unfortunately, both the determination of the
tail part as well as the optimization scheme for model parameter search were not described.
In this chapter a complete approach to three-dimensional tail fitting based on Gaussian quantile
normalization is provided. A simple optimization scheme is first derived, where the search algo-
rithm simultaneously searches for the Gaussian tail part while identifying the best suited model
parameters. A detailed description of the algorithmic characteristics and the associated
mathematical fundamentals explains the high fitting quality. A comprehensive performance
analysis then illustrates the potential of the proposed method, covering combinations of
different test distributions and sample sizes. Subsequent performance optimizations
further improve estimation accuracy as well as the robustness of the algorithm. For this purpose, two
conservative concepts based on probability domain parameters as well as a Q-domain threshold
are proposed. After respective performance analyses with optimized parameters, the chapter
concludes with a summary of the novel jitter analysis method.
Due to the excellent tail fitting performance combined with a fast implementation, flexible ar-
chitecture and a minimum of conservative fitting parameters, the developed method is also meant
to act as a reference for future designs. For this purpose, in chapter 6 a comprehensive perfor-
mance comparison is carried out against other methods based on quantile normalization. Since the
tail fitting method utilizes the quantile normalization in a scaled sense, throughout this thesis it is
referred to as “scaled Q-normalization” (sQN) method.
4.1. Scaled Q-Normalization

The derivation of the proposed method starts with the generic Gaussian tail assumption, where
a fitting algorithm has to determine the Gaussian model parameters mean µ, standard deviation
σ, and amplitude A, best matching the measured distribution tails. As was already shown in
section 3.1.2, the quantile normalization of Gaussian tails yields linearized curves, which can be
used for regression analysis. Along these tails in Q-domain the two parameters µ and σ can easily
be retrieved, but not the amplitude A. Now the same principle is extended in order to determine
all three Gaussian model parameters.
Considering a typical jitter distribution with the PDF as shown in figure 4.1 (solid curve), the
probability function or CDF covers the complete probability range from zero to one. Thus, the
PDF area equals A=1. Similarly, a pure Gaussian function N (x) (dashed curve at the left) with
same area A=1 can be fitted into the distribution tail. For the moment it is assumed that the left
Gaussian function N (x) represents the ideal fitting result with known tail parameters µ and σ.
FIGURE 4.1.: Amplitude matching with adapted Q-normalization function (left: jitter PDF with A=1 and
Gaussian N(x) with A=1; right: jitter PDF with A=1 and Gaussian N(x) with A<1).
At the left of figure 4.1 a comparably small part of the Gaussian function overlaps the outer
left tail of the jitter PDF. This means, the Gaussian function with A=1 cannot be fitted nicely into
the distribution tail, even with known parameters µ and σ. A smaller Gaussian amplitude instead,
would allow for an optimized fit with respect to both tail length and fitting error, as depicted in the
right figure. This means, an adapted Q-normalization function must be found to optimize the fit,
so that the Gaussian model best matches the distribution tail. With different jitter PDFs, it would
theoretically be necessary to derive an adapted Q-normalization function for each of the possible
tail areas A < 1, which is not feasible.
Instead one can think of a reverse approach where the probability of distribution samples is
scaled by a multiplication factor k, and the Q-normalization remains constant. This principle is
demonstrated in figure 4.2, where the original PDF is blown up by the scaling factor k. Although
the obtained probabilities will obviously increase and thus, span an area which is larger than one,
the tail fitting principle now has to be seen from the perspective of the constant Q-normalization
stage. This means, the normalization is narrowed down toward a smaller probability region, be-
cause of the scaling. In fact, the k-scaled distribution is Q-normalized in an original probability
region from zero to 1/k. Thus, the scaled distribution corresponds to a Gaussian tail search with a
smaller area of A=1/k.
FIGURE 4.2.: Amplitude matching with scaled distribution (left: PDF(x) with A=1 and N(x) with A=1;
right: scaled PDF(x) with A>1 and N(x) with A=1).
The three-dimensional tail model can be fully parameterized by
$$\mathrm{PDF}(x) = \frac{A}{\sigma\sqrt{2\pi}} \cdot e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$ (4.1)
to characterize the Gaussian tail shape of jitter distributions. The probability function CDF(x) will
subsequently range from zero to amplitude A, while the quantile normalization is simply achieved
by probability scaling. In order to construct an enhanced optimization scheme equivalent to the
one from figure 3.3, thus, only the scaling factor k must be added prior to Q-normalization. This
directly leads to the scaled Q-normalization principle presented in figure 4.3.
FIGURE 4.3.: Optimization scheme of scaled Q-normalization (sQN) method (p=CDF(x) → ×k → Q(. . .) →
linear regression → s, o and regression error σ̂err; model parameters A=1/k, σ=1/s, µ=−o/s).
With this scheme all three Gaussian model parameters µ, σ, and A can be identified. The
parameter k ≥ 1 scales the tail probabilities of the input CDF. The subsequent Q-normalization
stage expects a CDF with values in the complete probability range from zero to one, and thus,
narrows the normalized region down toward the tail part. This means, the Q-normalization is
carried out only on original tail probabilities from zero to 1/k, and the resulting Q-tail is linearized
with respect to amplitude A=1/k. The two remaining model parameters standard deviation σ
and mean µ are identified by linear regression analysis the same way as already described in
section 3.1.2.
Note, that for the moment the optimization scheme is only applied to the negative (right) bathtub
tail (see figure 2.10), because the scaling factor k requires the input CDF to start from zero. If
used with positive (left) tails, CDFR (x) must be replaced with the reverse probability function
CDFL = 1 − CDFR . The two obtained Gaussian tail models can finally be used to determine
the overall TJ timing budget. For this purpose, equations (3.12) and (3.13) are rewritten to include
the Gaussian tail amplitude A as third model parameter:
TJpp = DJpp + RJpp (4.2a)

DJpp = µL − µR (4.2b)
RJpp = −σL · Q(BERspec /AL ) − σR · Q(BERspec /AR ) (4.2c)
From the third expression we can see, that RJpp now also depends on the amplitudes of the Gaus-
sian models. For the special case AL =AR =1, the Q-function yields Q(BERspec ) and the model
reduces to the same equation as in (3.13).
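Equation (4.2) translates directly into code. The sketch below assumes that Q is the inverse standard normal CDF; the function name `tj_pp_scaled` and the argument order are hypothetical.

```python
from statistics import NormalDist

def tj_pp_scaled(mu_l, sigma_l, a_l, mu_r, sigma_r, a_r, ber_spec=1e-12):
    """TJ_pp from two three-parameter Gaussian tail models, equation (4.2)."""
    q = NormalDist().inv_cdf
    dj_pp = mu_l - mu_r                                  # equation (4.2b)
    rj_pp = (-sigma_l * q(ber_spec / a_l)
             - sigma_r * q(ber_spec / a_r))              # equation (4.2c)
    return dj_pp + rj_pp                                 # equation (4.2a)

# With A_L = A_R = 1 the result equals DJ_pp + 14.07 * RJ_rms.
print(tj_pp_scaled(0.1, 0.01, 1.0, -0.1, 0.01, 1.0))
# Smaller tail amplitudes reduce the extrapolated RJ_pp.
print(tj_pp_scaled(0.1, 0.01, 0.25, -0.1, 0.01, 0.25))
```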
In the following subsections the proposed scheme is described in more detail. First, the focus
is put on an efficient realization of the optimization procedure and the involved search algorithm.
Then, the generalization property of the scheme is highlighted as an additional feature. An error
analysis is carried out to justify the quantile normalization with associated linear regression as
a fundamental mathematical principle, which is close to the optimum solution for the tail fitting
problem. Finally, also the algorithmic details for an implementation in C++ are provided.
4.1.1. Optimization Procedure

An important goal for tail fitting methods is to realize an optimization process with fast tail param-
eter search. Generally, this optimization corresponds to a minimum search of the regression error
or an equivalent fitness measure, with the three unknown model parameters as search dimensions.
As will be shown, the proposed scheme is able to solve this optimization problem very quickly.
The scheme in figure 4.3 can also be seen as a twofold approach with two consecutive stages.
The first stage weights the measured CDF with the scaling factor k and performs the Q-normalization.
Including k into equation (3.10), thus, yields a k-scaled Q-function:
$$Q_k(x) = -\sqrt{2} \cdot \mathrm{erfc}^{-1}\left(2 \cdot \mathrm{CDF}(x) \cdot k\right)$$ (4.3)
This function may be plotted for various values of k as shown in the example of figure 4.4, where
a typical jitter distribution is analyzed. The effect on the original CDF is observed as bent Q-
functions, which achieve best linearity for a certain scaling factor. Here, the resulting analysis
domain is also referred to as scaled Q-domain, where the optimum scaling factor yields a linear
function which is best described by a Gaussian. The obtained fitting result already shows the
potential of this approach, when performing tail extrapolations over several orders of magnitude.
FIGURE 4.4.: Scaled Q-normalization principle demonstrated with N = 10^7 jitter samples of
a 1 Gb/s signal (measured with Agilent Infiniium 40 GS/s). The jittery data
is generated with a SyntheSys BERT-Scope 7500A (sinusoidal ADJ = 0.3 UI,
σRJ = 0.2/14.07 UI), which also provides the measured magenta bathtub samples
at lowest BER level in (b). The required calibration delay is chosen to match the
highest BERT point with the measured Agilent bathtub.
(a) k-scaled Q-functions for k={1, 4, 16, 64}, plotted as quantile [Q] over sampling time [UI]; the
maximum linearity is obtained at k=7.15.
(b) Left bathtub distribution with fitted Gaussian tail, plotted as bit error ratio over sampling
time [UI].
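The linearizing effect of equation (4.3) can be verified on a synthetic tail. The sketch assumes that the inverse standard normal CDF equals −√2 · erfc⁻¹(2p); the helper names and the test tail with A = 0.25 are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()

def scaled_q(cdf_value, k):
    """k-scaled Q-function of equation (4.3); None outside the valid range."""
    p = k * cdf_value
    if not 0.0 < p < 1.0:
        return None
    return _STD.inv_cdf(p)   # identical to -sqrt(2) * erfcinv(2 * p)

def gaussian_tail_cdf(x, mu, sigma, amplitude):
    """CDF of the three-parameter tail model of equation (4.1)."""
    return amplitude * 0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2.0)))

# With k = 1/A the scaled Q-values fall on the line (x - mu) / sigma,
# while k = 1 yields a bent curve below that line.
for x in (-6.0, -4.0, -2.0):
    p = gaussian_tail_cdf(x, 0.0, 1.0, 0.25)
    print(x, scaled_q(p, 4.0), scaled_q(p, 1.0))
```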
After Q-normalization, the second optimization stage fits a regression line into the tail region
of the k-scaled Q-function, using the method of least squares. It yields fitted slope s, offset o
and regression error σ̂err , with the error as fitness measure for the optimization procedure. An
essential speed-up is achieved by choosing a representation with the fitting length n as variable.
This variable denotes the n outermost points on a bathtub tail which are used for regression
analysis. A finite time resolution for jitter values is used, so that distributions become discretized
and consist of a limited amount of R bins per UI. One can also think of dividing the bit period
into equally sized steps. This leads to an adjustable time resolution 1/R, which greatly reduces
computational demand. Instead of each single jitter value, the linear regression is now carried
out only along the reduced number of bins. When chosen too small, R will obviously degrade
the fitting performance. In simulators this can be avoided by selecting R sufficiently large, but
hardware systems will usually suffer from coarse resolutions. This problem domain is especially
addressed in chapter 5, when focusing on hardware design aspects.
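The discretization into R bins per UI can be sketched as follows. The function name `binned_cdf` is hypothetical, jitter values are assumed to be given in UI, and each bin is represented by its upper edge, which is one possible convention.

```python
import math

def binned_cdf(jitter, resolution_r):
    """Return (x, CDF(x)) pairs on a grid with bin width 1/R, ordered by x."""
    counts = {}
    for value in jitter:
        index = math.floor(value * resolution_r)   # bin index of this sample
        counts[index] = counts.get(index, 0) + 1
    pairs, cumulative = [], 0
    for index in sorted(counts):
        cumulative += counts[index]
        pairs.append(((index + 1) / resolution_r, cumulative / len(jitter)))
    return pairs

print(binned_cdf([0.013, 0.017, 0.052, 0.055, 0.058, 0.093], 100))
```

The regression then runs over the number of occupied bins instead of the number of collected samples.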
Each collected jitter value is assigned to a bin of the discretized distribution function. The
regression error σ̂err can thus be represented as a function of the number of fitted bins n, and
subsequently used for optimization. In figure 4.5, σ̂err is plotted as a two dimensional function of
scaling factor k and tail length n, demonstrating its usability as goodness-of-fit measure. The global
error minimum is obtained at optimized k and n, and can be used to retrieve the three Gaussian
model parameters. Although the error is given as a function of only two variables σ̂err =f (k, n),
Gaussian tail fitting remains a three dimensional optimization problem in a strict mathematical
sense. The variable n only hides the linear regression, which deals with the two parameters line
slope s and offset o.
FIGURE 4.5.: Regression error σ̂err depending on fitting length n and scaling factor k. The
example is the same as in figure 4.4 with a time resolution of 1.83 ps (1 UI =
1000 ps).
The linear regression stage detects the error minimum by recursively incrementing n over the
Q-tails. That is, the search algorithm starts with a few outermost tail samples or data pairs (xi , qi )
in Q-domain, and moves toward higher probability levels by recursively adding samples from the
bins. Thus, for each additional sample the investigated Q-region becomes larger, while qualita-
tively described by the corresponding regression error σ̂err (n). This procedure has two major
advantages. First, all the outermost samples are included, which is very important for tail fitting as
they belong to the Gaussian tail part. Second, the linear regression offers very simple recursions
for adding tail samples, and hence, the desired error minimum is detected very efficiently.
With given data pairs (xi , qi ), regression analysis assumes the linear relation:
$$q_i = o + s \cdot x_i, \quad i = 1, \ldots, n$$ (4.4)
The regression coefficients and the error are calculated using least squares equations [20, p. 393]:
$$s = \frac{n \sum x_i q_i - \sum x_i \cdot \sum q_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}$$ (4.5a)
$$o = \frac{\sum q_i - s \cdot \sum x_i}{n}$$ (4.5b)
$$\hat{\sigma}_{err} = \sqrt{\frac{\sum (q_i - o - s \cdot x_i)^2}{n-2}}$$ (4.5c)
where s and o are the estimated parameters for line slope and offset, and σ̂err is the standard error
to be minimized. Since n is constantly incremented during optimization, the present summing
terms can be implemented very efficiently as recursions.
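A possible realization of these recursions keeps five running sums and updates them for every added tail point. This is a sketch, not the implementation of this thesis: the function name is hypothetical, the x values are assumed to be distinct, and the residual sum of squares in equation (4.5c) is evaluated through the identity Σ(qi − o − s·xi)² = Σqi² − o·Σqi − s·Σxiqi, which holds at the least squares solution but is more sensitive to rounding than the direct sum.

```python
import math

def incremental_regression(xs, qs):
    """Return (n, s, o, err) for n = 3, 4, ... as tail points are added."""
    sx = sq = sxx = sxq = sqq = 0.0
    results = []
    for n, (x, q) in enumerate(zip(xs, qs), start=1):
        # Recursive update of the summing terms of equation (4.5).
        sx += x
        sq += q
        sxx += x * x
        sxq += x * q
        sqq += q * q
        if n < 3:
            continue   # the standard error needs n - 2 > 0
        s = (n * sxq - sx * sq) / (n * sxx - sx * sx)   # (4.5a)
        o = (sq - s * sx) / n                           # (4.5b)
        sse = sqq - o * sq - s * sxq                    # residual sum of squares
        err = math.sqrt(max(sse, 0.0) / (n - 2))        # (4.5c)
        results.append((n, s, o, err))
    return results

xs = [0.1 * i for i in range(10)]
qs = [2.0 * x - 3.0 for x in xs]
print(incremental_regression(xs, qs)[-1])
```

Each additional point costs a constant number of operations, so the error curve σ̂err(n) over the whole tail is obtained in a single pass.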
In order to determine a global error minimum, the linear regression stage must be applied to
every value of k. To avoid high computational load, a logarithmically scaled search grid is utilized
for initial estimation of k. In a refined minimization an accurate estimate is then obtained after
a few more iterations. Once the optimization process is completed, the Gaussian tail model is
simply given by the fitted parameters:
A = 1/k, σ = 1/s, µ = −o/s (4.6)
The factor k forms the reciprocal of the Gaussian amplitude as already described previously. The
parameter σ is the reciprocal of the gradient or slope of the linearized Q-tail. This is due to the
inherent property of the Q-function, to normalize a given distribution in units of Gaussian standard
deviation (see equation (3.6)). Finally, µ is the jitter magnitude where the regression line crosses
the zero value in Q-domain and decomposes the jitter distribution into bounded DJ and unbounded
RJ components. Thus, with
$$q \overset{!}{=} 0 = \mu \cdot s + o \;\Rightarrow\; \mu = -o/s$$ (4.7)
we obtain the third expression in equation (4.6) and hence, confirm the result obtained with the
comparison of coefficients in equation (3.11).
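The outer search over k with a logarithmic grid and subsequent refinement may look as sketched below. Several simplifications are assumed and all names are hypothetical: the fitting length n is fixed instead of being optimized recursively, the refinement simply repeats the grid search in a narrowed interval, and the input pairs (xs, ps) are ordered from the outermost tail point inward with ps[:n] below one.

```python
import math
from statistics import NormalDist

_STD = NormalDist()

def fit_scaled_tail(xs, ps, k, n):
    """Regress the n outermost points of the k-scaled Q-tail."""
    qs = []
    for p in ps[:n]:
        if not 0.0 < k * p < 1.0:
            return None                      # scaling leaves the valid range
        qs.append(_STD.inv_cdf(k * p))       # equation (4.3)
    x = xs[:n]
    mx, mq = sum(x) / n, sum(qs) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxq = sum((xi - mx) * (qi - mq) for xi, qi in zip(x, qs))
    s = sxq / sxx
    o = mq - s * mx
    sse = sum((qi - o - s * xi) ** 2 for xi, qi in zip(x, qs))
    return math.sqrt(sse / (n - 2)), s, o

def search_tail(xs, ps, n, k_max=100.0, points=25, refinements=3):
    """Log-grid search for k, then map (k, s, o) to (A, sigma, mu)."""
    lo, hi = 1.0, k_max
    best_k, best = None, None
    for _ in range(refinements + 1):
        ratio = (hi / lo) ** (1.0 / (points - 1))
        for i in range(points):
            k = lo * ratio ** i
            fit = fit_scaled_tail(xs, ps, k, n)
            if fit is not None and (best is None or fit[0] < best[0]):
                best_k, best = k, fit
        # Narrow the interval to the neighbors of the current best k.
        lo, hi = max(1.0, best_k / ratio), best_k * ratio
    err, s, o = best
    return {"A": 1.0 / best_k, "sigma": 1.0 / s, "mu": -o / s, "err": err}

# Synthetic left tail with A = 0.25, mu = -0.1 UI and sigma = 0.02 UI.
zs = [-6.0 + 0.25 * i for i in range(29)]
xs = [-0.1 + 0.02 * z for z in zs]
ps = [0.25 * 0.5 * math.erfc(-z / math.sqrt(2.0)) for z in zs]
print(search_tail(xs, ps, n=20))
```

On noise-free data the search recovers the model parameters of equation (4.6); with measured distributions the choice of n becomes part of the optimization, as described above.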
The proposed scaled Q-normalization method must be applied to both distribution tails sep-
arately. For negative jitter values the right sided distribution function CDFR (x) is used as in-
put to the optimization scheme in figure 4.3, while for positive jitter values the reverse function
CDFL = 1 − CDFR is utilized. The obtained left and right Gaussian tail parameters are finally
able to decompose a measured jitter distribution into RJ and DJ, as described by equation (4.2).
As will be demonstrated in subsequent performance analyses (section 4.2), the presented approach
achieves excellent accuracy, even if a comparably small number of jitter samples forms the
distribution. Due to the linearization of a Gaussian function in Q-domain, the three-dimensional
optimization problem (µ, σ and A) is basically simplified to a linear least squares regression with
preceding data normalization. A key advantage is the efficient application of recursions inside
the regression stage, making the optimization process very fast. Other approaches, such as fit-
ting algorithms based on chi-squared tests [52, 84, 90] have to face a non-linear three-dimensional
optimization. Hence, they are complex and suffer from a high computational demand.
4.1.2. Generalized Optimization Scheme

Another key advantage of the proposed fitting method relates to its flexible architecture. The
described analysis principle can be seen in a very generic context where the quantile normali-
zation stage is replaced by a normalization function that linearizes distribution tails according
to an expected shape. This way the proposed optimization scheme can be reused for arbitrary
non-Gaussian tails, that are given in terms of the three model parameters amplitude A, dispersion
σ and location µ. For this purpose, the scheme only has to define an adequate normalization function
which is able to transform the expected tail behavior into a linear function. Given an expected tail
distribution with probability function p=F (x) and the quantile function from equation (3.2), this
transform is realized with:
$$Q(p) = q = F^{-1}(p, \mu=0, \sigma=1, A=1)$$ (4.8)
which corresponds to the inverse CDF of unit amplitude, unit standard deviation and zero mean.
The corresponding scheme is given in figure 4.6.
FIGURE 4.6.: Generalized optimization scheme for tail fitting (CDF(x) → ×k → F^{-1}(. . .) → linear
regression → s, o and error; model parameters A, σ, µ).
Possible candidates for tail fitting other than the Gaussian model are distributions that exhibit
power-law behavior at the tails, such as the generalized Gaussian, generalized extreme value or
generalized Pareto distributions. These functions include large classes of tail shapes and intro-
duce additional degrees of freedom to the optimization scheme. In chapter 7 this generalization
principle will be further discussed.
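The substitution of the normalization function can be illustrated with a tail shape that has an elementary inverse CDF. The logistic distribution used below is not one of the candidates named above; it is chosen only as an assumption for illustration, and all function names are hypothetical.

```python
import math

def logistic_quantile(p):
    """Inverse CDF of the standard logistic distribution (location 0, scale 1)."""
    return math.log(p / (1.0 - p))

def normalize_tail(ps, k, inverse_cdf):
    """Generalized normalization: scale by k, then apply F^-1 as in (4.8)."""
    return [inverse_cdf(k * p) for p in ps if 0.0 < k * p < 1.0]

# Logistic tail with amplitude A, location mu and dispersion sigma.
a, mu, sigma = 0.2, 0.5, 0.1
xs = [0.0, 0.1, 0.2, 0.3, 0.4]
ps = [a / (1.0 + math.exp(-(x - mu) / sigma)) for x in xs]

# With k = 1/A the normalized values lie on the line (x - mu) / sigma.
for x, q in zip(xs, normalize_tail(ps, 1.0 / a, logistic_quantile)):
    print(x, q)
```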
The ability to substitute the Gaussian quantile normalization with any desired tail shape makes
the proposed scheme very flexible and thus, a powerful approach to tail extrapolation. In addition,
if the amplitude pre-scaling factor is omitted, it becomes fully consistent with prior jitter
decomposition methods based on conventional Q-normalization as described in [51, 54, 82, 123].
4.1.3. Residual Analysis

In this subsection the mathematical background is provided to justify the proposed normalization
scheme from a qualitative perspective. So far, the quantile normalization has been described as
a linearizing transform, where the tail fitting problem can be solved very efficiently using the
least squares method (equation (4.5)), and the lines obtained serve as a simple medium for tail
extrapolation. However, a qualitative analysis of this principle is still missing.
Quantile normalization transforms the measured tail into a linear model where regression can
easily be carried out in the desired tail region. For such a linear model one likes to identify the
values of slope and offset that make the given data most likely. That is, we are searching for the
maximum likelihood, which is given by the method of least squares if the following conditions are
met [20, p. 389]:
1. The variance of the response variable q is constant.
2. Residuals are normally distributed.
3. The explanatory variable x is measured without error.

As will be shown, methods based on quantile normalization fulfill these optimality conditions to a
reasonable degree and hence, provide an excellent basis for distribution tail fitting. To verify this,
the behavior of the residuals r must be analyzed. They are defined as the distances between response
variable q and model prediction q̂:
ri = qi − q̂i = qi − o − s · xi , i = 1, . . . , N (4.9)
A residual plot depicts ri against fitted values q̂i and thus visualizes trends or non-constant error
in the scatter behavior of residuals, which is also denoted as heteroscedasticity [20, p. 340]. Ideally,
ri should be randomly scattered over the whole plot. This guarantees a constant error variance
which is independent of the fitted model value and thus fulfills the first of the conditions above.
However, for quantile normalization this is not the case in the outermost tail region, as shown in
the example of figure 4.7. Here, the residuals of 25 normal distributions with standard deviation
σRJ = 0.1 UI, zero mean as well as a sample size of N = 10^4 are represented in a scatter plot. Due
to the known parameters, fitted values q̂ were replaced by the sample amplitude x, to highlight the
error structure along the unit bit period.
FIGURE 4.7.: Scatter plot of residuals for 25 realizations of a normal distribution with sample
size N = 10^4 and σRJ = 0.1 UI (residuals plotted over x). The curves correspond to the 1%, 5%, 10%
and 20% upper and lower confidence bounds when transformed into Q-domain,
according to equation (4.12).
The reason for the observed heteroscedasticity is the limited sample size with probability
granularity 1/N. This effect can be visualized by calculating the confidence bounds for quantiles. The
probability of an observed quantile to be smaller than the theoretical one corresponds to a binomial
random variable with parameters (N, i) [71, sec. 3.1] [117]. Using the binomial distribution this
yields:
$$a = P(X_{(i)} \le x_p) = \sum_{j=0}^{i} \binom{N}{j} p^j (1-p)^{N-j} = I_{1-p}(N-i,\, i+1)$$ (4.10)
where X(i) is the i-th order statistics, xp the theoretical quantile at probability p, and a ∈ [0, 1] the
confidence level. Further,
$$I_x(u, v) = \frac{\Gamma(u+v)}{\Gamma(u)\Gamma(v)} \int_0^x t^{u-1} (1-t)^{v-1} \, dt$$ (4.11)
is the regularized incomplete beta function [31], which is used for numerical calculations. The
confidence level a, sample size N and probability p of the target quantile xp = Q(p) are fixed, and
we are searching for i = f (a, p, N ) such that equation (4.10) is fulfilled. The residuals ri,a for the
Gaussian distribution example are finally given by
ri,a = Q(i/N ) − Q(p) (4.12)
where Q is the quantile function, and the resulting ri,a (p) are plotted in figure 4.7 for different a
values. At a=0.5 the residual function follows the zero line.
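The confidence bounds can be approximated without an incomplete beta routine by summing the binomial terms of equation (4.10) directly. The sketch below makes several assumptions: the function names are hypothetical, the terms are evaluated in the logarithmic domain to avoid overflow, the rank i is taken as the smallest integer whose cumulative sum reaches a, and i is clipped to keep i/N inside the open unit interval.

```python
import math
from statistics import NormalDist

_Q = NormalDist().inv_cdf

def log_binom_pmf(j, n, p):
    """Logarithm of the binomial term C(n, j) * p^j * (1 - p)^(n - j)."""
    return (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
            + j * math.log(p) + (n - j) * math.log1p(-p))

def rank_for_confidence(a, p, n):
    """Smallest i for which the sum in equation (4.10) reaches a."""
    cumulative = 0.0
    for i in range(n + 1):
        cumulative += math.exp(log_binom_pmf(i, n, p))
        if cumulative >= a:
            return i
    return n

def residual_bound(a, p, n):
    """Residual r_{i,a} = Q(i/N) - Q(p) of equation (4.12)."""
    i = min(max(rank_for_confidence(a, p, n), 1), n - 1)
    return _Q(i / n) - _Q(p)

# The bounds widen toward the outer tail (small p) for N = 10^4.
for p in (1e-3, 1e-2, 0.3):
    print(p, residual_bound(0.01, p, 10**4), residual_bound(0.99, p, 10**4))
```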
A possible way to compensate existing heteroscedasticity is to use a generalized least squares
approach, where observed quantiles are weighted according to their variance. However, even
with known variance this is difficult, since quantiles are highly correlated. Thus, the complete
covariance matrix Σ must be described. According to [71, sec. 4.8] the covariance elements σij
can be determined as:
$$\sigma_{ij} = \frac{\min(p_i, p_j) - p_i \cdot p_j}{f(Q(p_i)) \cdot f(Q(p_j))}$$ (4.13)
where pi are the quantiles as defined in equation (3.1), f = F′ denotes the probability density
function and f (Q(p)) is known as sparsity function [106]. An approximation of Σ is for example
given in [117] for the case of generalized extreme value distributions. The presented approach also
highlights the matrix computations involved with generalized least squares, and yields a regression
model with uncorrelated, constant errors. However, the inverse Σ^{-1} must be calculated, which
is only feasible for small sample sizes. This is an essential drawback which impedes utilizing
least squares equations (4.5) as efficient recursions. Nevertheless, heteroscedasticity of Gaussian
quantiles only influences the outermost tail region. With the huge sample sizes given in jitter
analysis scenarios, its influence on fitted tails becomes sufficiently small for a majority of test
cases, as will also be demonstrated later on.
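To make the structure of equation (4.13) concrete, the following Python sketch evaluates the covariance elements for a standard normal distribution at a few probabilities. It is only an illustration under that assumption; the function name quantile_covariance and the chosen probabilities are hypothetical.

```python
from statistics import NormalDist


def quantile_covariance(probs):
    """Covariance elements of equation (4.13) for a standard normal distribution."""
    nd = NormalDist()
    # f(Q(p)): probability density evaluated at the quantile of p
    dens = [nd.pdf(nd.inv_cdf(p)) for p in probs]
    return [[(min(pi, pj) - pi * pj) / (fi * fj)
             for pj, fj in zip(probs, dens)]
            for pi, fi in zip(probs, dens)]


probs = [1e-4, 1e-3, 1e-2, 1e-1, 0.5]
cov = quantile_covariance(probs)
for idx, p in enumerate(probs):
    print(f"p={p:g}  diagonal element={cov[idx][idx]:.3g}")
```

The diagonal elements grow toward small probabilities, while the off-diagonal elements remain non-zero. This reflects both the heteroscedasticity in the outermost tail and the correlation between quantiles discussed above.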
Note that other tail fitting methods, which are not based on quantile normalization as a linearizing
transform, face a non-linear regression where the error structure may severely degrade
the quality of fit. The Q-normalization technique, in contrast, can at least guarantee an
approximately constant error in the higher probability region.
The third optimality condition listed previously demands that x be observed or measured without
error, which can be guaranteed only for simulations but not for hardware measurements.
Section 4.1.1 already introduced a time resolution variable to speed up the optimization
process using distributions with a discrete number of bins R. Collected jitter values thus
additionally suffer from rounding, or quantization, into integer multiples of 1/R. This causes
an error in x, which also degrades the quality of quantile normalization. Its effect on fitting
performance will be investigated thoroughly in chapter 5.
In order to maintain efficiency of the scaled Q-normalization method, the analysis focus is
only on performance optimizations where the proposed scheme can utilize the fast least squares
recursions. Thus in sections 4.3 and 4.4, optimization will especially be carried out by introducing
conservative tail fitting parameters or by selecting suitable goodness-of-fit measures for the search
routine. In this context for example, an important question to be answered is, whether outermost
tail samples should be discarded or not.
4.1.4. Implementation of the Fitting Algorithm

The presented optimization scheme for the scaled Q-normalization method has been implemented
in C/C++. The goal was to achieve a fast implementation that allows for
an in-depth analysis of the proposed method. This problem was addressed by a twofold approach.
First, the method was embedded into an easy-to-configure stimuli testbench, which allows for
a quick configuration and specification of all relevant parameters of the algorithm. Second, the
method itself uses a numerical approximation of the Q-normalization function, which reduces
computational cost remarkably. In the following both testbench and fitting algorithm are described
in detail.
Testbench for Performance Analysis

A flow diagram of the implemented testbench is given in the left part of figure 4.8. The algorithm
starts by parsing all relevant analysis parameters and configuration settings from an input options
file before starting the analysis loop. This allows for multiple analyses with different parameter
settings to be executed in parallel. On a compute cluster this principle greatly
increases exploration capabilities for the method. The analysis loop requires two key parameters.
The sample size N specifies the amount of collected jitter values for each test distribution, while
the number of independent evaluation runs K is required for statistical analysis of the extrapolation
error. Since jitter is a statistical random process, collected distributions suffer from random tail
variations, and thus have to be analyzed using multiple evaluations.
FIGURE 4.8.: Flow graphs of implemented testbench (left) and tail fitting algorithm (right).
Testbench (left): Start; read parameters from options file (*.opt); jitter sample loop (N×):
generate random jitter value and store into PDF of resolution R; calculate CDF; scaled
Q-normalization method; store TJpp value; evaluation loop (K×); write results to output files
(*.txt, *.dat); End.
Tail fitting algorithm (right): initial k=1; Qk = Q(CDF(x)·k); linear regression yielding
σ̂err, o, s; store as σ̂err(n) and increment n (regression length loop) until Qk(n) ≥ 0;
σ̂n = min{σ̂err(n)}, store as σ̂n(k) and set k = k·∆k (scaling factor loop) until k ≥ kmax;
σ̂n,k = min{σ̂n(k)}; minimum refinement in [k/∆k, k·∆k]; final (k,n); retrieve model
parameters; calculate TJpp.
Jitter samples are generated according to the involved time domain random processes as al-
ready described in section 3.2. These random processes are realized as C functions from [34], as
an improved alternative to the built-in standard C library. A Mersenne twister is used for gen-
erating uniformly distributed samples and the method from [10] transforms them into a normal
distribution.
Generated jitter values are assigned to a PDF vector which uses a discrete amount of bins R,
specified in the options file. The PDF is thus represented by an integer vector where each bin
represents a discrete time interval. When a jitter sample falls into a certain time interval, the
corresponding vector entry is incremented, equal to a counter variable. To obtain the simulated
probability values, counter values only have to be normalized by the sample size N . This principle
limits data memory when gathering large amounts of jitter samples, and allows for modeling
the limited time resolution of hardware systems. Later on, with PLL behavioral simulations the
resolution of the simulator will be selected as 1fs. For the investigated 3Gb/s serial interface
Parameter Description               Symbol      Default Value
number of jitter samples            N           10^7
number of evaluation runs           K           250
number of bins per unit interval    R = Rsim    3.33·10^5
target BER                          BERspec     10^-12
default DJ type                     -           uniform
TABLE 4.1.: Default algorithm configuration and important key parameters.
this yields Rsim = 3.33·10^5 bins per bit period, which guarantees a sufficiently detailed timing
resolution so that quantization effects can be neglected.
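The counter-based PDF vector described above can be sketched in a few lines of Python. This is a simplified illustration and not the thesis code: only Gaussian RJ is generated, Python's built-in generator (also a Mersenne Twister) replaces the C functions from [34] and the method from [10], and the clamping of samples to the edge bins as well as the name collect_pdf are assumptions of this sketch.

```python
import random


def collect_pdf(n_samples, bins_per_ui, sigma_rj, seed=1):
    """Accumulate Gaussian jitter samples into integer bin counters.

    The vector spans one unit interval centered at zero. Samples outside
    this range are clamped to the edge bins (an assumption of this sketch).
    """
    rng = random.Random(seed)  # Python's generator is a Mersenne Twister
    counts = [0] * bins_per_ui
    for _ in range(n_samples):
        jitter = rng.gauss(0.0, sigma_rj)        # jitter value in UI
        idx = int((jitter + 0.5) * bins_per_ui)  # quantize into 1/R steps
        idx = min(max(idx, 0), bins_per_ui - 1)
        counts[idx] += 1                         # bin acts as a counter
    return counts


counts = collect_pdf(n_samples=100_000, bins_per_ui=128, sigma_rj=0.05)
pdf = [c / 100_000 for c in counts]  # normalize counters by the sample size N
```

The memory demand depends only on R and not on N, which is the reason why large sample sizes can be gathered without storing individual jitter values.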
Once N jitter samples have been gathered, the CDF is calculated as integral of the jitter distri-
bution. The scaled Q-normalization method then fits Gaussian functions into the distribution tails,
and returns the estimated timing budget TJpp,est at the desired target level of BERspec = 10^-12. In
order to allow for a statistical analysis of this timing budget, K evaluations are carried out. The
described key analysis parameters are also summarized in table 4.1, together with their default
values.
After the evaluation loop, simulation results are stored in two separate output files. A logfile
(*.txt) contains recorded information about simulation progress, successful termination, start and
stop time, as well as a copy of the input options file. The output data file (*.dat) contains all
estimation results, such as the timing budgets over multiple evaluation runs. This file is meant to
be used by MATLAB scripts for post-processing and representation of results.
Fitting Algorithm
The implementation of the tail fitting algorithm is depicted in the right flow graph of figure 4.8. It
represents the realization of the scaled Q-normalization block at the left. As already described in
section 4.1.1 the basic algorithm is implemented as an optimization procedure which minimizes
the fitting error σ̂err as function of regression length n and scaling factor k. This concept is
realized with a nested loop for the regression length and an outer loop for the scaling factor.
The algorithm uses an initial search grid to identify the best suited scaling factor k, and hence,
starts with the first scaling value at k=1. With the CDF as input, the k-scaled Q-function is calcu-
lated and the linear regression analysis performed. The nested loop collects error values σ̂err (n)
over increasing regression length and continues until a maximum value of Qk ≥0 in Q-domain is
reached. This limit corresponds to a scaled CDF probability of 0.5 or half of the Gaussian tail
model. It aids in excluding CDF samples which hardly belong to the measured Gaussian tail, and
avoids negative influence of unfavorable DJ shapes.
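The nested loop over the regression length can be sketched as follows for a left tail. Several points are assumptions of this sketch rather than facts from the thesis: Q is taken as the standard normal quantile function, the RMS residual stands in for the regression error of equations (4.5), which are not reproduced here, and the names best_regression_length and n_min are hypothetical.

```python
import math
from statistics import NormalDist


def best_regression_length(x, cdf, k, n_min=3):
    """Scan the regression length n for one scaling factor k.

    Returns (error, n, offset, slope) at the minimum RMS residual. Running
    sums are updated per sample, so each additional tail sample costs O(1).
    """
    inv = NormalDist().inv_cdf
    sx = sy = sxx = sxy = syy = 0.0
    n = 0
    best = None
    for xi, pi in zip(x, cdf):
        scaled = k * pi
        if scaled <= 0.0:
            continue            # empty bin, nothing to transform
        if scaled >= 0.5:
            break               # corresponds to Q_k >= 0: end of tail region
        yi = inv(scaled)        # k-scaled Q-normalization
        n += 1
        sx += xi
        sy += yi
        sxx += xi * xi
        sxy += xi * yi
        syy += yi * yi
        den = n * sxx - sx * sx
        if n < n_min or den <= 0.0:
            continue
        slope = (n * sxy - sx * sy) / den
        offset = (sy - slope * sx) / n
        sse = max(syy - offset * sy - slope * sxy, 0.0)  # residual sum of squares
        err = math.sqrt(sse / n)
        if best is None or err < best[0]:
            best = (err, n, offset, slope)
    return best
```

For a pure Gaussian tail of amplitude A, calling the function with k = 1/A yields an almost perfectly linear Q-tail, so the returned error is close to zero and the slope approaches 1/σ.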
The minimum error value of the nested loop is the optimum along the first search dimension n,
and is stored in a vector according to the different scaling factors of the outer loop. The second
search dimension k uses a logarithmically scaled search grid with the grid distance ∆k=1.2 as
default value. The parameter kmax =1/Amin must be chosen to include the minimum expected
tail amplitude, and thus, also affects computational demand of the search algorithm.
The coarse minimum along the second search dimension k is given by the initial search grid,
which must be further refined. This refinement process is carried out with a C implementation
of the MATLAB function fminbnd() from the Optimization Toolbox [88]. The required upper
and lower search bounds are the grid values adjacent to the selected scaling factor k, with the
final result located inside the interval [k/∆k, k · ∆k].
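The two-stage search along k can be sketched as below. MATLAB's fminbnd combines golden-section search with parabolic interpolation; this Python sketch uses a plain golden-section search as a simplified stand-in, and the names log_grid, golden_section and search_scaling_factor as well as the fitness callable are hypothetical.

```python
import math


def log_grid(k_max, dk=1.2, k_start=1.0):
    """Logarithmically spaced scaling factors k_start, k_start*dk, ... <= k_max."""
    grid, k = [], k_start
    while k <= k_max:
        grid.append(k)
        k *= dk
    return grid


def golden_section(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:                      # minimum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:                            # minimum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)


def search_scaling_factor(fitness, k_max, dk=1.2):
    """Coarse grid minimum followed by refinement inside [k/dk, k*dk]."""
    k_coarse = min(log_grid(k_max, dk), key=fitness)
    return golden_section(fitness, k_coarse / dk, k_coarse * dk)
```

Here fitness(k) would return the minimum regression error found by the nested loop for the given k. A smaller ∆k gives a finer initial grid at the cost of more evaluations of the nested loop.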
After the refinement, the optimization of σ̂err is concluded, and the Gaussian model parameters
can be retrieved from the resulting kopt and nopt values together with the regression coefficients
in equation (4.6). Note that the tail fitting algorithm at the right hand side of figure 4.8 has to be
applied to both tails of a jitter distribution. The total jitter estimate TJpp,est is finally calculated
with both Gaussian model parameters according to equation (4.2).
In the C++ environment where the fitting algorithm has been implemented, the computational
effort depends on various factors. First, the Q-normalization function utilizes the inverse complementary
error function erfc^{-1}(x), which can only be solved numerically and is computationally
expensive when using iterative approaches. As a solution, the MATLAB erfcinv() function
offers a very fast polynomial minimax approximation, with a relative error ≤ 1.13·10^-9. This
function has been transferred to the C++ environment. Second, the linear regression analysis has
to be carried out over the complete distribution tail inside the nested loop. This evidences the
importance of equations (4.5) which use a recursive solution for calculating line offset, line slope
and regression error.
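The relation between the Q-normalization and erfc^{-1} can be made concrete with a small Python sketch. It deliberately uses the iterative (Newton) approach that the text describes as expensive, not the polynomial minimax approximation used in the thesis. It also assumes that Q equals the standard normal quantile function, so the sign convention may differ from the thesis definition; the names erfcinv_newton and q_norm are hypothetical.

```python
import math
from statistics import NormalDist


def erfcinv_newton(x, tol=1e-14, max_iter=200):
    """Solve erfc(y) = x for 0 < x < 2 by Newton iteration."""
    if x > 1.0:
        # symmetry erfc(-y) = 2 - erfc(y) maps the problem to y >= 0
        return -erfcinv_newton(2.0 - x, tol, max_iter)
    y = 0.0
    for _ in range(max_iter):
        # Newton step using d/dy erfc(y) = -2/sqrt(pi) * exp(-y^2)
        step = (math.erfc(y) - x) * math.sqrt(math.pi) / 2.0 * math.exp(y * y)
        y += step
        if abs(step) < tol:
            break
    return y


def q_norm(p):
    """Standard normal quantile expressed through the inverse of erfc."""
    return -math.sqrt(2.0) * erfcinv_newton(2.0 * p)


for p in (1e-12, 1e-6, 0.01, 0.5):
    print(p, q_norm(p), NormalDist().inv_cdf(p))
```

Each Newton step evaluates both erfc and exp, and deep tail probabilities need a few dozen steps when started at zero. This illustrates why a closed-form approximation is attractive inside the nested regression loop.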
Finally, the time discretization into R number of bins per UI highly influences computational
effort as it also defines the number of nested loop iterations for a given jitter distribution. In fig-
ure 4.9 the average calculation time tc is determined for a typical test distribution which occupies
approximately half of the unit interval. With a larger R, test distributions contain more bins and
thus, the computational demand is linearly increased. Simulations are carried out with an Intel
Core Duo 2.2GHz laptop, where the scaled Q-normalization (sQN) method with a search grid interval
k=[10^1, 10^3] and ∆k=1.2 is typically 35 times slower than the simplified Q-normalization
(QN) method without scaling factor (see section 3.1.2). Note, that the sQN method offers the
advantage of a significantly higher accuracy compared to QN, as will especially be demonstrated
in chapter 6.
FIGURE 4.9.: Calculation time tc [s] of QN and sQN algorithms depending on the number of bins
R (both axes logarithmic). Test distribution (section 3.2.2): σRJ = 0.05 UI, ADJ = 0.2 UI
(uniform DJ), N = 10^7, K = 50 evaluations.
With a bathtub curve that covers half of the unit interval (UI) and thus consists of ≈150 k bathtub
samples at Rsim = 3.33·10^5, sQN optimization of both tails takes several seconds. This may be
acceptable for system behavioral simulations, where usually minutes or hours are spent to collect a
sufficient amount of data samples. With hardware jitter measurements or in production testing this
would be too time consuming, but there the bathtub is obtained using phase interpolators of coarse
time resolution that divide the UI into larger intervals. Assuming typically R=128 bins, as
given with a 7-bit phase interpolator, the optimization process is also several orders of magnitude
faster compared to the simulation.
4.2. Performance Analysis
In this section the tail fitting performance of the proposed sQN method is analyzed. For this purpose,
test distributions and error metrics from section 3.2 are used. According to the way of synthesizing
these distributions, the estimation error can be investigated with respect to varying distribution
shape (σRJ , ADJ , DJ type) and sample size N .
Accuracy of the extrapolated timing budget TJpp,est is evaluated by observing statistical spread
and bias of the estimation error over multiple evaluation runs. At least several hundred evaluations
are thus necessary to reliably judge the error behavior. Since it is not possible to predict whether
an implemented algorithm reveals convergence problems, statistical measures must be especially
robust against outliers. Therefore, median value Emed and interquartile range IQR are used,
according to the definitions from equation (3.17).
An initial error analysis is carried out by creating test distributions with different values of σRJ
and ADJ (uniform), where the median error bias of the tail fitting method is investigated. Results
are shown in figure 4.10(a). The obtained figure is symmetric and can thus be simplified by using
only the ratio σRJ/ADJ instead of both variables. In figure 4.10(b) a rotated view is obtained with
σRJ/ADJ as single independent variable, which allows σRJ to be discarded. From a mathematical perspective
the shape of a jitter distribution is fully described by this ratio, as long as the estimation
error is not influenced by the timing quantization of distributions. This is always the case for simulations
where R is sufficiently large, so that the estimation error Emed does not vary. In the example of figure 4.10
the default analysis configuration from table 4.1 has been used to construct the surfaces. Each
point on the surface is the median error value of K=250 evaluations where the estimated TJpp,est
values are obtained from fitted and extrapolated distribution tails. The true TJpp,true values for
error calculation are given by numerical approximations as described in section 3.2.2. Each of the
distributions uses N = 10^8 samples with a resolution of Rsim = 3.33·10^5 time divisions per UI.
In the following subsections an initial analysis investigates the influence of varying sample size
N . Then, the focus is put on the ratio σRJ /ADJ as distribution shape variable as well as on
different DJ types. Therefore, always the default values from table 4.1 are used for performance
evaluation, unless otherwise specified.
FIGURE 4.10.: Influence of both σRJ and ADJ on median estimation error Emed. (a) Emed over
σRJ and ADJ; (b) rotated view with Emed over σRJ and σRJ/ADJ. With the variable ratio
σRJ/ADJ a rotated view is obtained where the symmetric representation can be simplified and
one of the two variables discarded.
4.2.1. Influence of Sample Size

The performance of the implemented tail fitting method is first investigated with respect to a vary-
ing number of jitter samples N . Figure 4.11 shows the statistical spread of K=250 estimates of
TJpp,est at BER = 10^-12 over varying sample size N . The test distribution consists of uniform DJ
(ADJ,uni =0.2 UI) and Gaussian RJ (σRJ =0.025 UI). In the graph, blue boxes delimit upper and
lower quartiles of the evaluation results, while red lines mark the median value. Black whiskers
show the extent of data scattering, outliers are marked with a plus sign, and the dashed magenta
line finally denotes the true value TJpp,true =0.523 UI. Figure 4.11 is intended to give a first im-
pression on the statistical behavior of TJpp,est estimates and on the way these values have to be
analyzed in order to provide a qualitative description of estimation performance. Note, that the
obtained boxplots also highlight a large amount of outliers, which have to be suppressed using
additional tail fitting parameters as will be introduced in section 4.3.
FIGURE 4.11.: Example for extrapolation error over varying sample size N (boxplots of TJpp,est
for N = 1k to 1G). TJpp,true = 0.523 UI, ADJ,uni = 0.2 UI, σRJ = 0.025 UI, K = 250.
In figures 4.12(a) and 4.12(b) the influence of varying sample size is demonstrated with respect
to two different jitter ratios (σRJ /ADJ =1/8 and 1/128). The markers show median error values
obtained with two hundred realizations, while the dashed lines denote upper and lower quartiles as
statistical spread. With increasing sample size, both error bias as well as statistical spread decrease
toward zero, which empirically proves consistency of the scaled Q-normalization (sQN) approach.
FIGURE 4.12.: Influence of varying sample size N on estimation error Emed, with two different
test distributions: (a) σRJ/ADJ = 1/8, ADJ,uni = 0.2 UI; (b) σRJ/ADJ = 1/128, ADJ,uni = 0.2 UI.
Both panels show the estimation error Emed over sample size N on logarithmic axes.
The positive error bias is an effect caused by the Q-normalization principle. Bathtub functions
are transformed into Q-domain where they are represented in a linearized form. This linear behav-
ior of Q-tails is approached asymptotically, which introduces error bias for extrapolated tails. As
figure 4.13 demonstrates for a left Q-tail, fitted lines tend to overestimate the true timing budget,
especially at small sample size N. With a large number of jitter samples, bathtub curves can be
tracked down to deep probability levels. At the same time, the deviation of a Q-tail from its
linear asymptote is significantly reduced. Overestimated TJ values yield positive errors with pessimistic estimates,
which is a beneficial property of fitting methods based on Q-normalization. This general property
is valid for the sQN method as well. Figure 4.12 also highlights an additional influence of the
jitter ratio σRJ /ADJ or distribution shape on estimation performance. This effect is investigated
subsequently.
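The asymptotic approach toward linearity can be checked numerically for Gaussian RJ combined with uniform DJ. The Python sketch below evaluates the local slope of the unscaled left Q-tail (k=1) at two depths, using the closed-form CDF of the sum of a uniform and a Gaussian variable. It assumes that Q is the standard normal quantile function; the name q_tail_slope and the chosen depths are hypothetical.

```python
import math
from statistics import NormalDist


def phi_cdf(z):
    """Standard normal CDF via erfc, which stays accurate in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def phi_pdf(z):
    """Standard normal probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def antiderivative_phi(z):
    """Antiderivative of the standard normal CDF: z*Phi(z) + phi(z)."""
    return z * phi_cdf(z) + phi_pdf(z)


def q_tail_slope(z1, sigma_rj, a_dj):
    """Local slope dQ/dx of the left Q-tail for Gaussian RJ plus uniform DJ.

    z1 is the distance from the left DJ edge in units of sigma_rj.
    """
    z2 = z1 - a_dj / sigma_rj
    cdf = sigma_rj / a_dj * (antiderivative_phi(z1) - antiderivative_phi(z2))
    pdf = (phi_cdf(z1) - phi_cdf(z2)) / a_dj
    q = NormalDist().inv_cdf(cdf)   # Q-normalization of the tail probability
    return pdf / phi_pdf(q)         # chain rule: dQ/dx = F'(x) / phi(Q)


sigma_rj, a_dj = 0.025, 0.2
for z1 in (-3.0, -6.0):
    ratio = q_tail_slope(z1, sigma_rj, a_dj) * sigma_rj
    print(f"z1={z1:+.0f}: slope relative to 1/sigma_RJ = {ratio:.3f}")
```

In this example the slope stays below 1/σRJ and moves closer to it deeper in the tail. A line fitted at shallow probability levels therefore corresponds to a larger apparent σ, which is consistent with the pessimistic bias described above.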
FIGURE 4.13.: Asymptotic linearity of Q-tails as fundamental cause for error bias. The sketch
shows exact and estimated Q(x) tails over x for small (N <<) and large (N >>) sample sizes.
4.2.2. Influence of Test Distribution Shape

Figures 4.14(a) and 4.14(b) investigate the estimation error over test distribution shape by varying
the ratio σRJ/ADJ. Different performance curves are constructed with N = {10^4, ..., 10^8}
and uniform type DJ. As expected, best results are obtained with the largest sample size. Note
that with a target BER = 10^-12 the estimation of TJpp values with N = 10^6, for example, corresponds
to a bathtub extrapolation over six orders of magnitude. In 4.14(b) the same curves for
N = {10^4, 10^6, 10^8} are plotted again, together with upper and lower quartiles as dashed lines to
demonstrate the influence of statistical spread. Both plots highlight an inferior fitting performance
FIGURE 4.14.: Influence of jitter ratio σRJ/ADJ,uni and sample size N on error. Panels (a) and
(b) show Emed over σRJ/ADJ on logarithmic axes, with curves for N = 10^8, 10^7, 10^6, 10^5
and 10^4.
FIGURE 4.15.: Influence of jitter ratio σRJ/ADJ and DJ type on error. N = 10^7. Panels (a) and
(b) show Emed over σRJ/ADJ on logarithmic axes, with curves for sinusoidal, uniform,
triangular and quadratic curve shaped DJ.
for the RJ dominant case, with a maximum around σRJ /ADJ,uni ≈1/4. This distribution shape
will later on also be utilized for worst case analysis.
So far, only uniform DJ has been considered for performance evaluation. Similar to the previous
plots, figures 4.15(a) and 4.15(b) thus demonstrate the estimation error and evaluation spread for
sinusoidal, uniform, triangular and quadratic curve shaped DJ. If the DJ type is changed to more
Gaussian-like shapes such as a triangle or a quadratic curve, estimates also degrade since the algo-
rithm tends to detect a single Gaussian-like peak instead of the steep tails at the distribution edges.
Only a very small percentage of the collected samples belongs to the true Gaussian tail, making a
correct tail detection very difficult. Therefore the estimation error becomes large, especially when
quadratic curve shaped DJ is combined with a small RJ component. Although this combination
is rather theoretical and unlikely to appear in real measurements, a possible way to handle the
problem is to use a specific parameter configuration optimized with respect to the desired working
region.
4.3. Performance Optimization with Different Fitness Measures

In this section performance optimizations of the scaled Q-normalization (sQN) method are con-
ducted to improve estimation accuracy. First an alternative fitness measure based on the regression
length of fitted tails is introduced and combined with the conventional regression error to achieve
an optimized fitness criterion for Gaussian model parameter search. To improve the convergence
behavior and outlier suppression of the method, several conservative fitting parameters are intro-
duced and analyzed. As already mentioned, due to random variations of the outer Q-tail part,
the linear regression can produce misleading results which cause outliers in TJpp estimates. The
additional fitting parameters can suppress such outliers when selected appropriately, but they have
to be understood and used carefully in order to avoid any performance degradation.
4.3.1. Performance Analysis of Different Fitness Measures

So far, only the regression error σ̂err has been considered as goodness-of-fit measure for Gaus-
sian model parameter search. In this section other fitness measures and their combinations are
investigated as well, in order to increase the quality of the proposed fitting method.
Figure     Description                  Fitness Value
4.17(a)    Regression error             σ̂err
4.17(a)    Regression length            n̂
4.17(b)    Error/slope                  T̂ = σ̂err/s
4.17(b)    Reg. length Error/slope      n̂T
TABLE 4.2.: List of investigated fitness measures.
As was shown with the optimization scheme in section 4.1.1, the proposed sQN method yields
k-scaled Q-functions after the transform (see equation (4.3), figure 4.4(a)), and identifies scaling
factor k and fitting length n with optimum linearity. For this case the regression error becomes a
minimum, but the length of the linearized tail also approaches a maximum. Thus one may consider
a fitness measure based on the regression length of Q-tails as well. The criterion n̂ can for example
be defined as
\hat{n} = 1 - \frac{n_{err,min}}{R}, \qquad n_{err,min} = n \,\big|_{\hat{\sigma}_{err} = \min\{\hat{\sigma}_{err}(n)\}}    (4.14)
where R is the number of bins per UI, here needed for normalization, and nerr,min is the length
which corresponds to the regression error minimum along a transformed Q-tail. n̂ maximizes the
regression length n or number of fitted tail samples over varying scaling factor k. That is, the
regression error σ̂err is still used for minimization along the first search dimension n, while n̂ is
applied along the second search dimension k.
In [117] Scholz provides a linearizing transform for generalized extreme value distributions,
and suggests a fitness criterion based on the ratio of regression error σ̂err and tail slope s:
T̂ = σ̂err /s (4.15)
Scholz analyzes the covariance structure of the regression error after performing the transform and
demonstrates, that the variance of the error depends on the slope s of the fitted line. Thus, T̂ can
be used to indicate the appropriateness of a fitted tail and to compensate the slope influence.
Equivalent to the regression length n̂, a length measure based on T̂ instead of σ̂err may be
defined. This measure also maximizes the regression length of fitted tails, and is denoted as n̂T .
\hat{n}_T = 1 - \frac{n_{T,min}}{R}, \qquad n_{T,min} = n \,\big|_{\hat{T} = \min\{\hat{\sigma}_{err}(n)/s(n)\}}    (4.16)
n̂T also searches for the best suited scaling factor k and acts only along the second search dimen-
sion.
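For a single scaling factor k, the length measures of equations (4.14) and (4.16) can be evaluated from the error and slope values collected in the nested loop. The following Python sketch shows this under the assumption that both lists are indexed by regression length, starting at n = 1; the name length_measures is hypothetical.

```python
def length_measures(err, slope, bins_per_ui):
    """Evaluate equations (4.14) and (4.16) for one scaling factor k.

    err[i] and slope[i] belong to regression length n = i + 1.
    """
    # regression length at the minimum regression error
    n_err_min = min(range(len(err)), key=lambda i: err[i]) + 1
    # error/slope ratio of equation (4.15) and its minimizing length
    t_hat = [e / s for e, s in zip(err, slope)]
    n_t_min = min(range(len(t_hat)), key=lambda i: t_hat[i]) + 1
    n_hat = 1.0 - n_err_min / bins_per_ui      # equation (4.14)
    n_hat_t = 1.0 - n_t_min / bins_per_ui      # equation (4.16)
    return n_hat, n_hat_t
```

Because both measures decrease with growing regression length, minimizing them over k favors the scaling factor with the longest linearized tail.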
With the different fitness measures, now also the algorithmic implementation from figure 4.8
must be adapted. Therefore, in figure 4.16 the flow graphs of the fitting algorithms based on σ̂err
(left) and T̂ (right) optimization are plotted again. Depending on the selected type of algorithm,
either the regression length (n̂, n̂T ) or the regression error (σ̂err , T̂ ) is used for the outer loop with
scaling factor k. The nested loop remains the same for both cases. All four optimization criteria
are also summarized in table 4.2.
Estimation performance of the fitness measures in figure 4.17 is tested by carrying out K=250
evaluation runs over varying jitter ratio σRJ/ADJ. From figure 4.17(a) we notice that σ̂err provides
a smaller error bias (expressed with markers as median values), while n̂ achieves a better
precision, i.e. less statistical spread (expressed by the dashed lines as upper and lower quartiles).
An equivalent effect is observed with T̂ and n̂T . Although T̂ provides the smallest bias out of the
four fitness criteria, the lower quartiles evidence a large distance to the median values.
In order to determine an optimum trade-off between these two conflicting interests, a common
performance indicator must consider both error bias and dispersion. When recalling the perfor-
mance metrics from section 3.2, the estimation loss definition in equation (3.18) EL =|Emed | +
FIGURE 4.16.: Flow graphs of σ̂err based (left) and T̂ based (right) algorithmic principles.
Both graphs follow the structure of figure 4.8: starting at k=1, Qk = Q(CDF(x)·k) is calculated
and the linear regression yields σ̂err, o and s. The regression length loop stores σ̂err(n) (left)
or T̂(n) = σ̂err/s(n) (right) and increments n until Qk(n) ≥ 0. Its minimum is stored as
{n̂n(k), σ̂n(k)} (left) or {n̂T,n(k), T̂n(k)} (right), and the scaling factor loop continues with
k = k·∆k until k ≥ kmax. The minimum over k is then refined inside [k/∆k, k·∆k], and the
final (k,n) is used to retrieve the model parameters and to calculate TJpp.
1.5·IQR can be used. EL gives a simple confidence interval which is exceeded by less than 2.2%
of the evaluations for a normal error distribution.
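The estimation loss and the quoted exceedance bound can be reproduced with the Python standard library. This is only a sketch: the quartile estimator of statistics.quantiles may differ in detail from the definition in equation (3.17), and the function name estimation_loss is hypothetical.

```python
from statistics import NormalDist, median, quantiles


def estimation_loss(errors):
    """E_L = |E_med| + 1.5 * IQR over the errors of K evaluation runs."""
    q1, _, q3 = quantiles(errors, n=4)
    return abs(median(errors)) + 1.5 * (q3 - q1)


# One-sided exceedance probability of 1.5 * IQR for a zero-median normal error
nd = NormalDist()
iqr = nd.inv_cdf(0.75) - nd.inv_cdf(0.25)
tail = 1.0 - nd.cdf(1.5 * iqr)
print(f"IQR = {iqr:.3f} sigma, exceedance = {100 * tail:.2f} %")
```

For a normal distribution the interquartile range is about 1.35 standard deviations, so 1.5·IQR corresponds to roughly two standard deviations, and the one-sided exceedance probability stays below the 2.2% stated above.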
With EL the proposed fitness measures can now be compared against each other as shown in
figure 4.18. From this representation we notice that the pure regression error σ̂err
is not outperformed by T̂, because T̂ suffers from a large statistical spread. Both n̂ and n̂T perform slightly
worse than σ̂err over a broad shape region. The reason why T̂ is outperformed by σ̂err is the
large distance of lower quartiles from the median value. In fact, TJ estimates demonstrate skewed
distributions with heavy negative tails. A possible cause to this effect might be correlated errors
inside the regression model. In [117] Scholz de-correlates the error of regression parameters using
the inverse of the covariance matrix, but unfortunately this highly increases complexity of the
fitting algorithm and is thus impractical for computations.
Since the asymptotic tail behavior always guarantees a positive error bias, one could argue
that the estimation loss can also be calculated using only the upper quartile distance. Negative
errors would thus not affect the worst case error, even if distributions are heavy tailed. Nevertheless,
in order to avoid outliers and heavy tails as far as possible and to ensure a certain algorithmic
robustness, a Gaussian-like error behavior is preferred.
FIGURE 4.17.: Estimation performance of different fitness measures from table 4.2: (a) σ̂err, n̂;
(b) T̂, n̂T. Both panels show Emed over σRJ/ADJ on logarithmic axes. Test distributions are
generated with the default analysis configuration from table 4.1.
FIGURE 4.18.: Estimation loss indicator EL of different fitness measures (σ̂err, n̂, T̂, n̂T) over
σRJ/ADJ.
A substantial improvement of estimation performance can be achieved by combining fitness measures
during the optimization procedure. So far, the whole optimization has been performed using
only one fitness measure out of the candidates from table 4.2. When recalling the flow graphs
in figure 4.16, the optimization process first starts with an initial search grid and then refines the
obtained optimum grid value down to the desired accuracy. This principle can also be used to
combine two different fitness measures. Out of the numerous possibilities two combinations are
investigated and presented subsequently.
In figure 4.19 the search grid resolution ∆k is varied using both σ̂err and n̂ as fitness measures.
For the logarithmically scaled search points, n̂ is first used to determine the initial rough grid
estimate, and the refinement is then continued with σ̂err in a local search environment. From the
resulting EL curves, the best results are obtained with a grid distance ∆k=1.2. A comparison
with the performance in figure 4.18 clearly shows the improvement.
FIGURE 4.19.: Performance EL over varying search grid resolution ∆k={1.1, 1.2, 1.5, 1.8},
plotted over σRJ/ADJ. The black curve (σ̂err) is the reference from figure 4.18.
FIGURE 4.20.: Performance EL with secondary search refinement ∆k={1.1, 1.2, 1.5, 1.8},
plotted over σRJ/ADJ. The black curve (σ̂err) is the reference from figure 4.18.
In figure 4.20, after complete optimization with n̂, a second refinement step with σ̂err is addi-
tionally performed in a local environment around the first estimate. The environment is specified
with the same scaling factor ∆k. The performance results are similar to those before and thus,
again better than simple σ̂err optimization. Considering the increased optimization effort for this
scenario the first one is the more convenient choice.
Summarizing these results, a well suited combination of n̂ and σ̂err measures is obtained, when
a search grid resolution ∆k=1.2 first provides an initial estimate with n̂, which is then refined by
σ̂err . Subsequently this combined optimization scenario is referred to as ĉ1.2 .
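The combined scenario can be outlined in Python as a coarse grid search driven by one fitness measure, followed by a local refinement driven by the other. This is a sketch under assumptions: the callables n_hat and sigma_err are hypothetical placeholders for the two measures evaluated at a given k, and a plain golden-section search replaces the fminbnd-based refinement.

```python
import math


def refine(f, lo, hi, tol=1e-6):
    """Golden-section minimization of f on [lo, hi] (stand-in for fminbnd)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)


def combined_search(n_hat, sigma_err, k_max, dk=1.2):
    """Coarse grid search with n_hat, local refinement with sigma_err."""
    grid, k = [], 1.0
    while k <= k_max:        # logarithmic grid 1, dk, dk^2, ...
        grid.append(k)
        k *= dk
    k_coarse = min(grid, key=n_hat)
    return refine(sigma_err, k_coarse / dk, k_coarse * dk)
```

The grid distance dk controls both the resolution of the initial estimate and the width of the refinement interval, which is why its value influences the resulting estimation loss.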
Note that the error maxima in figures 4.18 to 4.20 are always located at the same jitter ratio
σRJ /ADJ ≈1/4. This means, the worst case shape is constant and independent from the inves-
tigated fitness measures, and can thus be used for simplified worst case analysis. The presented
analyses used only test distributions of uniform type DJ and thus, the optimized ĉ1.2 scenario is
theoretically only valid for this single DJ shape. For other shapes one would have to search
for other combinations as well. However, the uniform distribution is considered here as a good
trade-off between easily separable sinusoidal and difficult Gaussian-like shapes. This allows
ĉ1.2 to be used as algorithmic scenario for all DJ shapes.
4.3.2. Improvement of Convergence Behavior

The previous subsection has focused on optimum fitness measures for determining the Gaussian
model parameters. In this subsection strategies are investigated to avoid convergence failures of
the fitting algorithm. Such failures are noticed as misleading outliers in the estimated timing
budget, and can only be suppressed by introducing additional conservative parameters. The goal
is to provide suited parameters that afford a maximum flexibility on investigated test distributions
without affecting the performance.
When recalling the regression stage in figure 4.3, a linear function is fitted to the tails of the
scaled, Q-normalized bathtub curve. The fitting algorithm starts the minimum search with an
initial number of outermost tail samples. The fitted bathtub part, or investigated tail region is
expressed in terms of the regression length n and is successively increased, so that the regression
error minimum together with corresponding fitting length can be determined along the tail.
Because a distribution contains only a limited number of jitter values N, the tail variance in
Q-domain increases toward the lowest probabilities. This results from the discretization with
probability granularity 1/N. In other words, the variance of the regression error is not constant as
desired (see also section 4.1.3), but depends on the measured tail probability, which especially
affects the lowest probabilities near 1/N. If the fitted lines are constructed from only a few
probability values of the outermost tail part, random data variations may thus cause highly
misleading outliers.
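The effect of the probability granularity can be illustrated numerically. The following minimal
sketch assumes that the Q-domain value of a tail probability is the standard normal quantile, and
compares the Q-domain step caused by one additional hit at the outermost tail node with the step
three decades further up the tail. The function name q_step, the sample size and the hit counts are
illustration values only and are not taken from the thesis.

```python
from statistics import NormalDist

def q_step(count, n_samples):
    """Q-domain change when the hit count of a bathtub node grows by one."""
    inv = NormalDist().inv_cdf
    return inv((count + 1) / n_samples) - inv(count / n_samples)

N = 10**7                      # assumed number of collected jitter values
step_outer = q_step(1, N)      # node at the granularity limit 1/N
step_inner = q_step(1000, N)   # node three decades above the limit

print(f"Q step at 1/N:    {step_outer:.4f}")
print(f"Q step at 1000/N: {step_inner:.6f}")
```

A single additional hit moves the outermost node by a far larger Q-domain distance than a node
further up the tail, which is why fits supported only by the outermost nodes are sensitive to
random variations.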
Conservative Parameters for Tail Fitting

The described fitting problem can be handled by introducing several conservative parameters
that specify a constant bathtub region at the tail edges which must be used for initial tail
fitting. If the fit starts with a sufficient minimum amount of initial tail samples nmin, wrong
convergences caused by the granularity noise can be fully suppressed. Subsequently, three
conservative fitting parameters as defined in figure 4.21 are investigated.
FIGURE 4.21.: Parameter definitions (∆Pt, Pt,min and ∆Tt marked on the tail samples of the right
bathtub curve).
A simple way to overcome random data fluctuation is to introduce a minimum probability
threshold Pt,min. This parameter simply discards bathtub nodes with a probability level less than
the specified threshold. The second parameter ∆Tt specifies a time interval along the tail
where initial bathtub samples are selected for regression analysis. In a simulator with adjustable
time resolution one can always guarantee sufficient data points on the bathtub curve, even for
distributions with a very steep tail. Thus ∆Tt may be used down to the selected time resolution
1/R. If combined with Pt,min, the tail fitting algorithm only uses nodes with probability higher
than the threshold.
Equivalently, a third parameter ∆Pt may define an initial probability interval. The amount of
selected tail samples thus changes with the slope of a Q-tail. This means that fewer bathtub
samples are selected with steep tails, which automatically considers the influence of varying RJ
tails.
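The three selection rules can be sketched as follows. This is only an illustration under stated
assumptions: bathtub nodes are represented as (time, probability) pairs ordered from the outermost
tail node inward, ∆Pt is interpreted as the ratio between the upper and lower bound of the
probability interval, and the function names and node values are hypothetical.

```python
def select_by_time(nodes, delta_t, p_min=0.0):
    """Initial tail nodes within delta_t of the outermost node with p >= p_min."""
    kept = [(t, p) for t, p in nodes if p >= p_min]
    if not kept:
        return []
    t0 = kept[0][0]
    return [(t, p) for t, p in kept if abs(t - t0) <= delta_t]

def select_by_probability(nodes, delta_p, p_min):
    """Initial tail nodes with p_min <= p <= delta_p * p_min."""
    return [(t, p) for t, p in nodes if p_min <= p <= delta_p * p_min]

# Synthetic right bathtub tail, outermost node first (made-up values).
nodes = [(0.86, 1e-7), (0.85, 3e-7), (0.84, 1e-6), (0.83, 3e-6),
         (0.82, 1e-5), (0.81, 3e-5), (0.80, 1e-4)]

print(select_by_time(nodes, 0.025))               # time interval only
print(select_by_time(nodes, 0.025, p_min=2e-7))   # time interval plus threshold
print(select_by_probability(nodes, 1e2, 2e-7))    # two decades above threshold
```

With the time interval the number of selected nodes is fixed by the time resolution, whereas with
the probability interval it follows the slope of the tail.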
In the subsequent paragraphs the influence of the three parameters with respect to estimation
performance and statistical dispersion is investigated. The idea is to give selection criteria and to
see how an adequate choice improves fitting performance. The definition of estimation loss EL
from equation (3.18) is therefore reused as performance indicator. To investigate the influence of
outliers caused by the algorithm, further the kurtosis κ is utilized as a presence-of-outlier indicator
which has to be minimized. Subsequent analyses are carried out with the default parameters from
table 4.1.
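To illustrate why kurtosis can serve as a presence-of-outlier indicator, the following sketch
computes the moment-based kurtosis m4/m2^2 (which equals 3 for a Gaussian) for a small set of
estimates with and without a single outlier. The exact kurtosis definition used in the thesis is
not restated in this section, so this definition is an assumption, and the listed values are
made-up numbers rather than results.

```python
def kurtosis(values):
    """Moment-based kurtosis m4 / m2**2 (assumed definition, 3 for a Gaussian)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2

# Made-up TJ estimates in UI; the second list replaces one value by an outlier.
clean = [0.30, 0.31, 0.29, 0.30, 0.32, 0.28, 0.30, 0.31, 0.29, 0.30]
with_outlier = clean[:-1] + [0.10]

print(f"kurtosis without outlier: {kurtosis(clean):.2f}")
print(f"kurtosis with outlier:    {kurtosis(with_outlier):.2f}")
```

A single strongly deviating estimate dominates the fourth moment, so the kurtosis rises sharply
while median-based measures remain almost unchanged.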
Influence of ∆Tt and Pt,min
The parameter ∆Tt specifies a constant part of the unit interval (UI), and hence, it defines a
minimum number of tail samples nmin which must be included with the linear regression analysis.
For a simulator with pre-specified time resolution Rsim , nmin is given by:
\[ n_{\mathrm{min}} = R_{\mathrm{sim}} \cdot \Delta T_t \tag{4.17} \]
∆Tt has to be chosen carefully in order to afford correct TJ estimation. If selected too large,
bathtub samples that do not belong to the Gaussian tail are used for tail fitting, which results in a
high estimation error. If selected too small, outliers may be caused by wrong convergences of the
linear regression stage, simply because the initial tail region is only supported by a few bathtub
samples. Especially RJ dominant distributions consist of highly varying tail endings and thus,
suffer from this effect.
Median error Emed , interquartile range IQR, performance EL and kurtosis κ, depending on
both ∆Tt and Pt,min are investigated in the rows of figure 4.22. The jitter ratio σRJ /ADJ,uni =1/4
corresponds to the worst case shape from the performance analysis in figure 4.18, where the esti-
mation loss is maximum. The other analysis parameters are set to the default configuration from
table 4.1. Three different optimization criteria with σ̂err , n̂ and ĉ1.2 from section 4.3.1 are investi-
gated in the different columns.
The plots in the first row show the median error Emed , which is approximately constant for a
broad range of ∆Tt values and decreases toward higher values of Pt,min , until a critical Pt,min
value is reached. Here the results become highly unstable unless the selected ∆Tt is close to the
optimum Gaussian tail length of the given test distribution, which is ∆Tt ≈0.1UI. For larger values
of ∆Tt the median error again increases.
The interquartile range IQR in the subfigures of the second row remains constant for a large
range of ∆Tt and reaches a minimum at the correct Gaussian tail length. The IQR increases
toward higher Pt,min values, which confirms the importance of bathtub samples at the lowest
probability levels. In fact, the tail fitting method loses accuracy when these samples are cut off
by the probability threshold Pt,min. The interquartile range also has a large influence on the
estimation loss EL, which has been calculated according to the definition from equation (3.18)
and is depicted in the third row of figure 4.22.
In the bottom row, the kurtosis surfaces display the outlier behavior of the scaled Q-normali-
zation method. Again, optimum kurtosis κ is only achieved with a known tail length. As soon as
the probability threshold discards parts of the measured tail (Pt,min >0), κ increases significantly,
and thus, outliers appear.
Considering that both EL and kurtosis have to be minimized, the best performance for all three
fitness measures is achieved with Pt,min =0 and ∆Tt ≈0.1 UI. Note that this parameter configura-
tion can only be used for the given test distribution but not for other shapes. This means, a priori
knowledge of the ideal tail length is needed in order to obtain optimum fitting results.
When comparing the three fitness measures σ̂err, n̂ and ĉ1.2, the least statistical spread is
achieved with n̂ at Pt,min =0 (figure 4.22(e)). However, the estimation loss EL is significantly
lower with ĉ1.2, so the best overall performance is achieved with ĉ1.2 optimization.
FIGURE 4.22.: Emed (top), IQR, EL, and kurtosis κ (bottom) for ∆Tt versus Pt,min parameter
surfaces. Three optimization criteria σ̂err (left), n̂ (middle) and ĉ1.2 (right) are investigated.
Panels: (a)-(c) Emed, (d)-(f) IQR, (g)-(i) EL, (j)-(l) κ. Test distribution: σRJ =0.05 UI,
ADJ,uni =0.2 UI, N =10^7, K=250.
Influence of ∆Pt and Pt,min
∆Pt defines a probability region or interval where bathtub samples at the lowest probability
levels must be included for regression analysis. Distributions with a small RJ component yield
steep bathtub tails which are tracked by few tail samples, while for the RJ dominant case many
tail samples are needed. However, for both cases the Gaussian probability region is approximately
constant. Using ∆Pt instead of ∆Tt thus offers a much more robust way of initial tail
selection, without prior knowledge of the distribution shape. Further, random variations at the
outermost tail endings, which are caused by the probability granularity 1/N, can be countered
directly by choosing ∆Pt sufficiently large.
FIGURE 4.23.: Emed (top), IQR, EL and κ (bottom) for ∆Pt versus Pt,min parameter surfaces.
Three optimization criteria σ̂err (left), n̂ (middle) and ĉ1.2 (right) are investigated.
Panels: (a)-(c) Emed, (d)-(f) IQR, (g)-(i) EL, (j)-(l) κ. Test distribution: σRJ =0.05 UI,
ADJ,uni =0.2 UI, N =10^7, K=250.
In figure 4.23 again the fitting behavior of the scaled Q-normalization method is analyzed, this
time as a function of the parameters ∆Pt and Pt,min . The selected test distribution and perfor-
mance measures are the same as in figure 4.22.
For all three optimization criteria σ̂err, n̂ and ĉ1.2 a convergence limit at a constant product
∆Pt ·Pt,min =const is noticed. It is given by the Gaussian tail part of the test distribution. An
upper probability level Pup for tail selection can be defined according to this product:
\[ P_{\mathrm{up}} = \Delta P_t \cdot P_{t,\mathrm{min}} \ll A_{t,\mathrm{min}} \tag{4.18} \]
Pup must belong to the Gaussian part of the distribution tail in order to guarantee correct fitting,
and hence, it must be significantly smaller than the minimum Gaussian tail amplitude At,min.
Since a Q-tail approaches the linear behavior asymptotically, such an exact probability level cannot
be determined but has to be approximated, as will be demonstrated later in section 5.2. When Pup
is located in the convergence region well below the tail amplitude At,min, a similar behavior of
Emed, IQR and EL is noticed as with the analysis of ∆Tt before. Emed basically decreases
toward higher values of Pt,min, but the interquartile range IQR and also the overall estimation
loss EL increase. For all three optimization criteria σ̂err, n̂ and ĉ1.2 a minimum estimation loss
EL and the smallest kurtosis κ are obtained if the threshold parameter Pt,min is also at its minimum.
Estimation performance depending on the parameters ∆Pt and Pt,min must also be investigated
with respect to varying distribution shape σRJ /ADJ . The resulting estimation loss surfaces for
σRJ /ADJ =1/256 are plotted in figure 4.24 and demonstrate a behavior which is similar to the
prior figures. Besides less estimation error in the convergence region which is due to a better
suited distribution shape, the convergence limit has now also moved toward a lower ∆Pt ·Pt,min
variable product because of the smaller Gaussian tail amplitude.
FIGURE 4.24.: EL for parameter surfaces ∆Pt versus Pt,min, with σ̂err (a), n̂ (b) and ĉ1.2 (c)
optimization criterion. Test distribution: σRJ /ADJ =1/256, ADJ,uni =0.2,
σRJ =0.00078125, N =10^7, K=250.
This shows that it is not possible to specify an ideal parameter configuration which yields best
fitting performance for the general case of arbitrary distribution shapes. Instead, one has to rely on
a suboptimal configuration, which guarantees correct algorithmic convergence and minimizes
the influence on estimation performance. From the EL surfaces in figures 4.23 and 4.24 at least
∆Pt ≥10^2 is recommended to afford correct tail fitting. Further, without Pt,min the kurtosis of
estimates demonstrates a minimum outlier presence, combined with a maximum selection range
for ∆Pt. Thus Pt,min can be omitted and replaced with the minimum probability granularity
Pt,min =1/N. Equation (4.18) is thus reformulated with the condition:
\[ A_{t,\mathrm{min}} \gg P_{\mathrm{up}} = \Delta P_t / N \tag{4.19} \]
As long as the upper probability level for tail selection Pup is located significantly below the
minimum Gaussian tail amplitude At,min, jitter distributions will be fitted correctly. If ∆Pt is
chosen too large, fitted tails are forced to include samples from the DJ component, which misleads
the extrapolation result. With a given minimum amplitude At,min it is thus possible to define
a suitable parameter range for ∆Pt. As a general selection criterion, ∆Pt should be chosen as
large as possible, but without violating equation (4.19). In section 5.2 design aspects for jitter
analysis systems will be investigated, and analyses will focus on ideal tail selection in more
detail, especially on the relation between At,min and the test distribution shape.
To summarize these results, the best tail fitting behavior for the scaled Q-normalization (sQN)
method is obtained when using the fitness criterion ĉ1.2 together with ∆Pt as conservative
parameter for tail selection. The other two parameters ∆Tt and Pt,min can be discarded. A suited
selection range for ∆Pt is given by:
\[ A_{t,\mathrm{min}} \cdot N \gg \Delta P_t \ge 10^2 \tag{4.20} \]
Further, ∆Pt should always be as large as possible, which yields the default values
\[ \Delta P_t = \begin{cases} 10^3 & \text{if } N \ge 10^6 \\ N/10^3 & \text{if } N \le 10^6 \end{cases} \tag{4.21} \]
These default values for ∆Pt are chosen such that the upper tail selection bound ∆Pt /N is
located significantly above the probability granularity 1/N of collected jitter distributions and
thus avoids misleading outliers. Even if distribution tails are very flat (RJ dominant case) and
suffer from statistical noise, ∆Pt =10^3 guarantees correct convergence if N ≥10^6. However,
at small sample sizes ∆Pt must also be selected smaller in order to fulfill condition (4.19). Since
the default values give ∆Pt /N ≤10^-3, the minimum Gaussian amplitude At,min which can be
fitted correctly by the sQN method must satisfy
\[ A_{t,\mathrm{min}} \gg 10^{-3} \tag{4.22} \]
The investigated test distributions from section 3.2.2 violate this amplitude minimum only for
extreme cases with triangular and quadratic curve shaped DJ. Further, for N =10^4 we only have
∆Pt =10^1 and thus partly accept outliers. The default ∆Pt configuration is applied subsequently.
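The default selection from equation (4.21) and the resulting upper probability level from
equation (4.19) can be written down in a few lines. The function names are hypothetical and the
sketch only evaluates the two formulas; it does not decide whether a given At,min is
"significantly" larger than the bound.

```python
def default_delta_pt(n_samples):
    """Default probability interval according to equation (4.21)."""
    return 1e3 if n_samples >= 10**6 else n_samples / 1e3

def upper_tail_level(n_samples, delta_pt=None):
    """Upper probability level P_up = delta_pt / N from equation (4.19)."""
    if delta_pt is None:
        delta_pt = default_delta_pt(n_samples)
    return delta_pt / n_samples

for n in (10**4, 10**5, 10**6, 10**7, 10**8):
    print(f"N={n:>9d}  delta_Pt={default_delta_pt(n):7.1f}  "
          f"P_up={upper_tail_level(n):.1e}")
```

For all listed sample sizes the upper level stays at or below 10^-3, which is the bound behind
condition (4.22), and for N =10^4 the interval shrinks to a single decade.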
4.3.3. Performance Analysis with Optimized Parameters

With the optimized tail search criterion ĉ1.2 and the fitting parameter ∆Pt, the estimation
performance of the sQN method can be re-evaluated and compared with the simple σ̂err optimization
from section 4.2. Note that this prior analysis was already carried out with ∆Pt =10^1, in order to
afford evaluations without causing too many outliers. As figure 4.25 shows, the median estimation
error Emed with optimized parameters usually performs slightly worse than with prior tail search
(expressed by the dotted lines), but the estimation loss EL is improved due to less statistical spread
combined with complete outlier suppression. In fact, the kurtosis (not shown in the figures) of the
optimized search criterion has become significantly smaller.
FIGURE 4.25.: Emed and EL performance of ĉ1.2 based optimization over varying σRJ /ADJ,uni
and sample size N. N ={10^4, 10^5, 10^6, 10^7, 10^8}.
FIGURE 4.26.: Emed and EL performance of ĉ1.2 based optimization over varying σRJ /ADJ
and DJ shape: sinusoidal, uniform, triangular and quadratic curve. N =10^7.
If again the DJ shape is varied, as depicted in figure 4.26, a similar performance improvement
for EL is noticed. With quadratic curve shaped DJ an error peaking in the lowest σRJ /ADJ
region is additionally noticed, which is due to wrong tail fits. In this region, bathtub tails become
extremely steep and At,min reaches below the bound given by (4.19). The fitting algorithm is thus
not able to detect the correct tail amplitude with the given bathtub samples anymore, and rather
converges toward the Gaussian-like DJ shape instead of the steep RJ component.
4.4. Performance Optimization with Q-Domain Threshold

The previous section focused on optimizing fitness measures and convergence behavior with
different fitting parameters. These parameters defined a lower probability region for tail fitting
and thus operated directly on the measured probability function or bathtub. In this section fitting
parameters are investigated in scaled Q-domain. For this purpose a Q-domain threshold parameter
is introduced and its fitting performance is analyzed over a suited parameter range. As in the
previous section, a comparison with the simple σ̂err based optimization from section 4.2 highlights
performance improvements.
4.4.1. Minimum Q-Domain Threshold Definition

As described in section 4.1.1, in scaled Q-domain a bathtub is represented by bent Q-tails, obtained
from the weighted Q-function in equation (4.3). Each Q-tail corresponds to the Gaussian quantile
normalization for a specific amplitude, represented by the scaling factor k. In figure 4.27 the
Q-tails of a right bathtub curve are plotted for different scaling factors. As a conservative fitting
parameter, the minimum Q-domain threshold Qmin may be defined to denote an outer tail part for
tail fitting. One can easily see, that for a specific scaling factor the Q-tail achieves best linearity in
the lower interval defined by Qmin . Together with goodness-of-fit measures, Qmin can thus also
be used to assist in Gaussian model parameter search.
FIGURE 4.27.: Qmin threshold parameter definition. Q-tails of a right bathtub curve for k=1, 4, 16
and 64, plotted as quantile [Q] over sampling time [UI].
In Q-domain, functions are described in terms of the standardized variable q=(x − µ)/σ (3.6),
which leads to a simple regression model for Gaussian tails represented as straight line. The
variable q thus also indicates the number of standard deviations one moves away from the mean
value of the Gaussian model. This means, the Qmin threshold defines the tail part as distance from
the Gaussian model mean in terms of its standard deviation. As a direct consequence, the fitting
region is automatically adjusted according to the expected Gaussian model.
The example in figure 4.27 uses a Q-domain threshold of Qmin =−1 equal to one standard devi-
ation. This value corresponds to the lower inflection point of a Gaussian PDF, located at x=µ−1σ.
When searching for the best suited regression line over varying k, it is thus assumed that measured
jitter distributions follow a Gaussian function at least up to the inflection point. In scaled Q-domain
the search can thus be carried out without knowing the Gaussian model parameters. For k=1 the
measured Q-tail already starts to bend before reaching the q=−1 level, indicating that the tail
probabilities increase faster than the expected Gaussian with amplitude A=1/k=1. Equivalently,
also for k=64 the Q-tail does not follow a linear behavior. The optimized tail amplitude is found
at k=7.15 where the linear part even reaches beyond the q=0 level, which means that more than
half of the Gaussian function can be fitted into the measured tail.
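The linearizing effect of the correct scaling factor can be reproduced with a synthetic tail. The
sketch below assumes a noise-free left tail that follows A·Φ((x − µ)/σ) exactly and takes Q(·) to be
the inverse standard normal CDF, in line with the relation Qk =Q(CDF(x) · k) used in the flow graphs
of figure 4.28. The model values A, µ and σ and the helper names are illustration choices only.

```python
from statistics import NormalDist

A, MU, SIGMA = 0.25, 0.75, 0.05   # assumed tail model, for illustration only
tail = NormalDist(MU, SIGMA)
inv = NormalDist().inv_cdf

# Tail nodes between five and one standard deviations below the model mean.
xs = [MU + z * SIGMA for z in (-5.0, -4.0, -3.0, -2.0, -1.0)]
ps = [A * tail.cdf(x) for x in xs]

def slopes(k):
    """Local slopes of the scaled Q-tail Q_k = Q(p * k) between adjacent nodes."""
    qs = [inv(p * k) for p in ps]
    return [(q1 - q0) / (x1 - x0)
            for x0, x1, q0, q1 in zip(xs, xs[1:], qs, qs[1:])]

def spread(k):
    """Difference between largest and smallest local slope (0 for a straight line)."""
    s = slopes(k)
    return max(s) - min(s)

print(f"slope spread for k=1:   {spread(1.0):.3f}")
print(f"slope spread for k=1/A: {spread(1.0 / A):.2e}")
```

Only for k=1/A do all local slopes coincide with 1/σ, so the Q-tail becomes a straight line; for
k=1 the slopes differ along the tail and the Q-tail is bent.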
Obviously, different values for the threshold parameter Qmin are possible. An analysis of the
suited parameter range is thus needed and carried out subsequently. Different optimization
scenarios and fitness measures also require a performance comparison in order to identify the best
suited configuration. Similar to the ĉ1.2 criterion with the parameter ∆Pt (section 4.3.2), the
Qmin threshold also defines a minimum amplitude At,min and thus limits the Gaussian tail model
search. A corresponding functional relation will be derived in section 5.2. As a beneficial property,
Qmin allows for the derivation of an exact equation, in contrast to the rough approximation for ∆Pt
in equation (4.19).
4.4.2. Parameter Optimization

In this subsection the influence of Qmin on estimation performance is investigated. For this
purpose, the threshold is applied together with two different optimization scenarios. The first one
uses Qmin as a constant limit and thus simply defines the lower tail part for linear regression.
That is, the fit is only performed in the lowest probability region from the tail end up to Qmin.
Goodness-of-fit measures such as σ̂err or T̂ thus only aid in searching for the best suited scaling
factor on a predefined tail length. The second scenario uses Qmin to define a minimum tail interval,
which also allows for any larger tail lengths. Equivalent to the optimization principles from
section 4.3 the best suited tail part is then identified according to the smallest fitness value along
the tail.
FIGURE 4.28.: Flow graphs for algorithmic principles based on Q-domain threshold, with minimum
Qth,min (left) and constant Qth,c (right) threshold scenario. Both graphs: initial k=1;
Qk =Q(CDF(x) · k); linear regression yielding {σ̂err , o, s}(n); n++ until Qk (n)≥Qmin; store as
σ̂n (k); k=k · ∆k until k≥kmax; σ̂n,k =min{σ̂n (k)}. Left graph only: after Qk (n)≥Qmin, store
as σ̂err (n) and continue until Qk (n)≥0, then σ̂n =min{σ̂err (n)}.
The flow graphs for these two algorithmic principles are plotted in figure 4.28 and subsequently
denoted as Qth,c for constant and Qth,min for minimum threshold analysis. The last blocks have
been omitted since they are equivalent to prior algorithms (see figure 4.16). The left graph de-
scribes the Qth,min based fitting principle, where Qmin is now included into the algorithmic struc-
ture. As long as the Q-tail values Qk (n) are located in the tail region, and thus Qk (n)<Qmin
(also see figure 4.27), they will recursively be added by the algorithm but not used for minimum
search. As soon as the Q-domain threshold is reached, the minimum search finds the best suited
tail length inside the interval Qmin <Qk (n)<0. This minimum value has to be identified for every
scaling factor k in order to allow for Gaussian amplitude search.
The right graph in figure 4.28 describes the simple Qth,c scenario, where only the initial tail part
up to Qmin is used for regression analysis. Other tail lengths are not allowed. Note, that both flow
graphs only describe the search with an initial coarse search grid, and thus have to be followed by
a refinement stage in order to yield the final results. The refinement stages operate with the same
optimization criterion as for the respective search grid.
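The coarse grid search of the constant threshold scenario can be sketched as follows. This is a
simplified illustration and not the thesis implementation: it uses a batch least-squares fit
instead of the recursive formulation, takes the residual standard error as σ̂err, assumes Q(·) to be
the inverse standard normal CDF, omits the refinement stage, and runs on a synthetic noise-free
left tail. All function names, the grid limit k_max=64 and the model values are assumptions.

```python
from statistics import NormalDist

INV = NormalDist().inv_cdf

def fit_line(xs, ys):
    """Least-squares line; returns (offset, slope, residual standard error)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    offset = my - slope * mx
    sse = sum((y - offset - slope * x) ** 2 for x, y in zip(xs, ys))
    return offset, slope, (sse / (n - 2)) ** 0.5

def sigma_err_const_threshold(xs, ps, k, q_min):
    """Residual error of the line fitted to all nodes with Q_k <= q_min."""
    pts = [(x, INV(p * k)) for x, p in zip(xs, ps) if 0.0 < p * k < 1.0]
    pts = [(x, q) for x, q in pts if q <= q_min]
    if len(pts) < 3:
        return float("inf")
    return fit_line([x for x, _ in pts], [q for _, q in pts])[2]

def scan_scaling_factor(xs, ps, q_min=-1.2, dk=1.2, k_max=64.0):
    """Coarse geometric grid search k = 1, dk, dk^2, ... up to k_max."""
    best_k, best_err, k = None, float("inf"), 1.0
    while k <= k_max:
        err = sigma_err_const_threshold(xs, ps, k, q_min)
        if err < best_err:
            best_k, best_err = k, err
        k *= dk
    return best_k, best_err

# Synthetic, noise-free left tail with amplitude A = 0.25 (illustration only).
A, MU, SIGMA = 0.25, 0.75, 0.05
xs = [MU + SIGMA * (-5.0 + 0.1 * i) for i in range(51)]
ps = [A * NormalDist(MU, SIGMA).cdf(x) for x in xs]

best_k, best_err = scan_scaling_factor(xs, ps)
print(f"best k on coarse grid: {best_k:.3f} (true 1/A = {1 / A:.1f})")
```

On this idealized tail the coarse grid selects the grid point next to the true value 1/A, and a
subsequent refinement around this point would be needed to approach it further. With measured
data the outermost nodes are noisy, which is the situation the threshold Qmin is meant to handle.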
FIGURE 4.29.: Emed (left), EL (middle) and kurtosis κ (right) over varying distribution shape
σRJ /ADJ. In the different rows four algorithmic principles with combinations of
{Qth,min , Qth,c } and {σ̂err , T̂ } are investigated: (a)-(c) {Qth,min , σ̂err }, (d)-(f)
{Qth,min , T̂ }, (g)-(i) {Qth,c , σ̂err }, (j)-(l) {Qth,c , T̂ }. The Qmin threshold is chosen
as −Qmin ={1.0, 1.2, 1.4, 1.6, 1.8, 2.0}, uniform DJ type, N =10^7, K=250.
Both flow graphs can either use σ̂err or T̂ =σ̂err /s as fitness measures to drive the minimum
search. Together with the two algorithmic principles Qth,c and Qth,min , thus four combinations
are obtained which can be compared against each other. Other fitness measures such as n̂ or n̂T
cannot be applied, since Qmin also affects the regression length over varying scaling factor k.
In figure 4.29 the estimation performance for each of the four algorithm combinations is de-
picted. The different subfigure rows correspond to these combinations, while the columns de-
scribe the respective behavior of median error Emed , estimation loss EL and kurtosis κ over
varying distribution shape σRJ /ADJ . While EL describes the estimation performance, κ indi-
cates the presence of outliers, as already described in section 4.3.2. Positive peaks in the course
of κ indicate strong outliers and have to be avoided as far as possible. Further, each subfig-
ure also investigates the influence of Qmin parameter variations by showing different curves for
−Qmin ={1.0, . . . , 2.0}.
From the Emed curves in the left column of figure 4.29 (a,d,g,j) we notice that the Qth,c
scenario has a more favorable influence on error bias than Qth,min. With decreasing threshold
Qmin, error bias can be reduced significantly. Unfortunately, this benefit is paid for with a higher
statistical spread, as can be seen from the EL plots in figure 4.29 (b,e,h,k), where the overall
performance changes only marginally. Considering the worst case distribution shape at
σRJ /ADJ =1/4, the optimum compromise between error bias and spread is in fact found in the
interval Qmin ∈[−1.2, −1.0].
Additionally an increase of kurtosis is observed at largest shape values for Qmin ≤−1.4. This
indicates a high uncertainty of the algorithm when choosing a small outermost fitting region. In
fact, with RJ dominant shapes the Q-tails become linear over a wide probability range, and thus,
the algorithm suffers from random noise at the outer tails.
Comparing the two fitness measures σ̂err and T̂ , the first one tends to produce estimates that
are less affected by outliers. This can especially be observed when comparing the kurtosis of
figures 4.29(c) and 4.29(i), with 4.29(f) and 4.29(l). Hence, the Qth,c optimization scenario in
combination with σ̂err as fitness measure offers the best choice out of the four candidates.
In order to get a better impression of how Qmin affects estimation performance, the four
algorithm candidates are compared in figure 4.30 with respect to varying −Qmin ={0.5, . . . , 2.4}.
The selected test distribution is σRJ /ADJ =1/4 to realize the worst case shape. A visible
difference in performance appears as soon as Qmin ≤−1.0. The Qth,c optimization estimates the true
TJ value slightly better than Qth,min, which can be observed in both the Emed and EL plots. The
kurtosis of all four algorithms follows a flat course until, at some point, a high peak is observed.
At this peak the estimates are negatively skewed, indicating that large negative outliers are
present in the distribution of TJ estimates.
A closer analysis of these outliers shows that the regression stage has converged to the
outermost tail part by fitting a Gaussian model of very small amplitude. Due to the small Qmin
threshold parameter, the algorithm was in fact forced to carry out the regression analysis at the
outermost tail part of the distributions, which is highly affected by statistical random variations.
The further Qmin is moved toward the outer tail, the higher the risk that outliers will occur. The
only way to avoid such a high risk is thus to increase Qmin.
In order to determine an optimum value of Qmin which can be used for various distribution
shapes, a similar analysis with respect to different DJ types is carried out. Figure 4.31 shows the
performance behavior of Qth,c scenario with σ̂err fitness measure and different DJ types. A good
compromise is found at Qmin =−1.2. This value is located in the minimum region of estimation
loss curves 4.31(c), and provides correct tail fitting results at an acceptable risk, as can be seen
from the kurtosis in 4.31(e). However, the quadratic curve DJ produces misleading outliers even
for Qmin ≥−1.2. The reason is the same as described before. The Q-tails of test distributions have
become very steep, so that random variations again easily cause outliers.
FIGURE 4.30.: Emed (a), IQR (b), EL (c), skewness (d) and kurtosis (e) of the four algorithmic
principles from figure 4.29 ({Qth,c , T̂ }, {Qth,c , σ̂err }, {Qth,min , T̂ }, {Qth,min , σ̂err })
plotted over −Qmin ={0.5, . . . , 2.4}. Test distribution (worst case shape): σRJ /ADJ =1/4,
uniform DJ, N =10^7, K=250.
FIGURE 4.31.: Emed (a), IQR (b), EL (c), skewness (d) and kurtosis (e) of DJ types (uniform,
sinusoidal, triangular, quadratic curve) over −Qmin ={0.5, . . . , 2.4}, Qth,c , σ̂err scenario.
Test distributions (worst case): σRJ /ADJ =1/4 (uni.), 1/2 (sin.), 1/8 (tri.), 1/16 (quad.).
N =10^7, K=250.
FIGURE 4.32.: Emed and EL performance for Qth,c criterion with Qmin =−1.2 over varying
σRJ /ADJ,uni and sample size N. N ={10^4, 10^5, 10^6, 10^7, 10^8}.
FIGURE 4.33.: Emed and EL performance for Qth,c criterion with Qmin =−1.2 over varying
σRJ /ADJ and DJ type: sin., uni., tri. and quad. curve. N =10^7.
4.4.3. Performance Analysis with Optimized Parameters
Similar to section 4.3, the optimized Q-domain threshold method can also be compared with the
original simple σ̂err optimization from section 4.2. According to the previous results, the constant
Q-domain criterion Qth,c uses a threshold of Qmin =−1.2. Figure 4.32 shows that the median
estimation error Emed of the threshold method is usually slightly worse than that of the original
tail fitting method (dotted lines). However, the overall estimation loss EL is better for
N ≥10^7, and comparable to the ĉ1.2 based algorithm from section 4.3. For N ≤10^6
the estimation loss EL is larger. The Q-domain threshold also improves the performance for all
investigated DJ types, as can be seen in figure 4.33.
The error peaking in the lowest σRJ /ADJ region, which could be observed with the optimized
ĉ1.2 method at quadratic curve shaped DJ (figure 4.26), is not present anymore. The observable
minimum tail amplitude At,min is now given by the search range limit for kmax =1/At,min . How-
ever, the Qmin threshold method is also less robust against outliers.
4.5. Summary
A complete approach to Gaussian tail fitting, referred to as scaled Q-normalization (sQN), was
proposed and realized. For this purpose, the quantile normalization principle was embedded into a
three-dimensional optimization scheme for Gaussian model parameter search with unknown tail
amplitude A, mean µ and standard deviation σ. Due to efficient recursions (equations (4.5)), this
optimization can be performed very fast. The calculation time depends linearly on the selected
number of bins R for a jitter distribution, and was investigated in figure 4.9. The algorithm was
implemented in C/C++ and yields very accurate TJpp estimates, as the performance analysis
in section 4.2 demonstrates. The resulting estimation error bias is always positive and thus
pessimistic, which is a further beneficial property of algorithms based on quantile normalization.
Causes for performance degradation of the method can be summarized as follows:
• A small sample size N of collected distributions, as shown in figures 4.14, 4.25 and 4.32.
• The fitting algorithm operates on distribution shapes in the worst case region, for example
σRJ /ADJ,uni ≈1/4 at uniform type DJ.
• The DJ shape is Gaussian-like but bounded, as shown in figures 4.15, 4.26 and 4.33.
Especially the last point leads to jitter distributions where only a very small percentage of the
collected samples belongs to the Gaussian tail part. The only way to deal with this problem is to
increase the sample size N and thus to extend the Gaussian tail. However, since tail samples become
rarer with larger jitter amplitude, acquisition time also increases exponentially.
In order to improve the estimation performance as well as the robustness of the sQN method, two
algorithmic approaches have been investigated. The first one, ĉ1.2 in section 4.3, uses conservative
fitting parameters in probability domain. It suggests an initial fitting region ∆Pt ≥ 10^2 which
covers at least two decades, in order to avoid outliers caused by statistical tail variations. Based
on this principle, a combination which utilizes both regression length and error has been identified
as the best suited optimization criterion. According to equation (4.19), the minimum tail amplitude
At,min must be located significantly above the minimum tail fitting region given by ∆Pt/N.
The second approach, Qth,c in section 4.4, uses a constant threshold Qmin in scaled Q-domain,
which defines the Gaussian tail region in terms of standard deviations from the model mean.
This form of representation allows for a flexible choice of the tail part. With a smaller parameter
Qmin, estimation accuracy is improved, but the risk of outliers is also increased. As an
acceptable compromise, Qmin = −1.2 has been identified.
Both approaches improve the estimation performance compared to the simple algorithm based
on regression error σ̂err from section 4.2. For uniform DJ, N = 10^6 and worst case test
distributions, the error bias is <2% and the overall error is generally <3% in more than 97.5% of the cases.
Although the Q-domain threshold scenario seems to slightly outperform the ĉ1.2 scenario with
different DJ types, it clearly performs worse at smaller sample sizes N ≤ 10^6 (see figures 4.25(b)
and 4.32(b)). In the subsequent chapter this effect is shown to originate from a larger error
variation of Qth,c if only the outermost tail part can be used for a fit. Further, a relation between
sample size N, minimum tail amplitude At,min and threshold Qmin is derived there, together with a
more detailed comparison of both algorithms.
So far, the proposed sQN method with its two algorithmic versions has only been applied to
simulated jitter distributions, where the simulator timing resolution Rsim is chosen sufficiently
high, so that an influence on estimation performance can be neglected. A coarse time quantization
of distributions introduces a limited, discrete amount of bins. This is an inherent property of hard-
ware systems and has to be investigated thoroughly to allow for a hardware implementation of the
proposed method. This issue will especially be addressed in the subsequent chapter. Summarized
parts of this chapter have also been published in [C1,C8,C9].
5. Hardware Design Aspects
This chapter focuses on hardware related design aspects for the scaled Q-normalization (sQN)
method. So far, the proposed approach has only been considered as a pure mathematical
optimization scheme, which is applied to simulated distributions or behavioral models to investigate
the impact of timing jitter on system performance. As an important application, one would also like
to use the method together with real jitter data collected in measurements. However, due to their
limited precision, measurement devices introduce certain quantization effects and include additional
error sources the analysis method has to cope with.
The subsequent sections investigate these effects, and try to derive empirical relations or de-
scribe the resulting error behavior and limitations. The idea is to give the designer useful hints on
how to select key parameters for a jitter measurement system, in order to guarantee a certain target
accuracy and robustness. Thus, a complete design guideline for the sQN method from chapter 4 is
provided.
A key advantage of jitter distributions is that they can be collected on-chip using built-in jitter
measurement (BIJM) systems. With data frequencies in the GHz range, off-chip instrumentation
may include significant noise caused by interconnect wires, which can easily affect measurements.
In such cases, the impact of external noise can be eliminated with a direct on-chip measurement.
Hence, in recent years a broad variety of BIJM topologies and implementations [16–18, 35, 48,
57, 60, 65, 66, 79, 100, 129, 131] has been realized as design-for-test (DFT) structures to support
production tests as well as on-chip diagnostics [12–14,86,130]. Although subsequent analyses for
the scaled Q-normalization method are basically valid for both instrumentation devices and BIJM
systems, the focus of this chapter is more on future oriented on-chip applications.
The timing jitter of a PLL is analyzed by measuring IO jitter, defined as the time difference
between reference frequency and PLL output clock (section 2.1.2). In the case of serial high-
speed interfaces with a clock and data recovery (CDR) unit, the reference frequency is given by
the analog input data, and jitter values are measured between bit transitions of the input signal
and the PLL output clock. In order to correctly quantify IO jitter, an exact time interval
measurement thus has to be performed. Such measurements are typically realized with an adjustable
delay element inside the PLL under test, as depicted in figure 5.1. A binary phase detector (PD)
compares the delayed output clock against the transition edges of the serial data stream. If the
clock precedes the data edge, a logical one is created at the output, otherwise a zero. After N
clock cycles, the counter state reflects the jitter probability for the selected delay value. One can
sequentially step the delay over the whole bit period and thus, measure the complete probability
function of IO jitter. The smallest step size defines the achievable number of bins R in a unit
interval (UI), or time resolution 1/R.
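The sequential counting principle described above can be illustrated with a small behavioral simulation. This is only a sketch under simplifying assumptions that are not part of the original text: the data edge carries pure Gaussian jitter, the delay grid is centered on the nominal edge position, and the function name `measure_bathtub` and all parameter values are hypothetical.

```python
import random


def measure_bathtub(jitter_sigma_ui, n_cycles, n_bins, seed=1):
    """Behavioral sketch of the sequential BIJM principle (one counter value per delay step)."""
    rng = random.Random(seed)
    probabilities = []
    for k in range(n_bins):
        # Delay position in UI, stepped over one bit period around the nominal edge.
        delay = k / n_bins - 0.5
        ones = 0
        for _ in range(n_cycles):
            # Assumption: the data edge position is a pure Gaussian RJ sample.
            data_edge = rng.gauss(0.0, jitter_sigma_ui)
            # Binary phase detector: clock precedes the data edge -> logical one.
            if delay < data_edge:
                ones += 1
        # After n_cycles the counter state reflects the probability for this delay.
        probabilities.append(ones / n_cycles)
    return probabilities


# Hypothetical example values: sigma = 0.05 UI, N = 2000 cycles per step, R = 16 bins.
bathtub = measure_bathtub(jitter_sigma_ui=0.05, n_cycles=2000, n_bins=16)
```

Every returned value is an integer multiple of 1/N, and the list has R entries, which mirrors the probability granularity and the time resolution 1/R of the measurement.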
A significant speed-up of the measurement is achieved if up to R PDs and counters are used in
parallel, together with a delay line [16,66]. Further, the PD can be replaced by a simple D flip-flop
or data latch if the input data is known. That is, the recovered data from the flip-flop is compared
against the expected original data, which again allows for detecting errors. However, the simple
BIJM principle in figure 5.1 measures only one bathtub point at a time. With N received bits
the probability value for the selected delay is tracked down to the BER level 1/N , which also
forms the probability granularity of the measurement.

[Figure 5.1: block diagram with the blocks PLL, Delay, PD, Counter and Jitter Analysis, and the signals Input Data and Recovered Clock; the measurement path is part of the PLL under test.]
FIGURE 5.1.: BIJM based IO jitter measurement for PLLs.

A complete bathtub measurement must be performed over R time steps, which yields the test time

    t_t\,[\mathrm{UI}] = N \cdot R    (5.1)
and also highlights the key problem for time critical measurements. While R can be reduced with
parallel circuit structures, the sample size N introduces a fundamental time limit for the reachable
probability depth. Further, mismatches in the delay line, as caused by process variations, and the
non-ideal PD structure are additional sources of timing error. They can significantly affect the
accuracy of a fitting method and are described by a differential nonlinearity (DNL) error term.
Thus, the estimation error is basically influenced by the three parameters N, R and DNL error.
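To get a feeling for the magnitude of equation (5.1), the test time can be evaluated for a concrete configuration. The following is a minimal sketch; the bit rate, the function names and the `n_parallel` parameter (a simple model of parallel PD/counter structures that reduce the number of sequential delay steps) are illustrative assumptions and not taken from the text.

```python
def bathtub_test_time_ui(n_samples, n_bins, n_parallel=1):
    """Test time in unit intervals according to equation (5.1), t_t = N * R.

    n_parallel is an assumed number of delay steps measured simultaneously;
    n_parallel = 1 corresponds to the purely sequential measurement.
    """
    sequential_steps = -(-n_bins // n_parallel)  # ceiling division
    return n_samples * sequential_steps


def bathtub_test_time_seconds(n_samples, n_bins, bit_rate_hz, n_parallel=1):
    """Convert the test time from UI to seconds for a given bit rate."""
    return bathtub_test_time_ui(n_samples, n_bins, n_parallel) / bit_rate_hz


# Hypothetical example: N = 10^7 samples per step, R = 100 bins, 5 Gbit/s link.
t_ui = bathtub_test_time_ui(10**7, 100)
t_s = bathtub_test_time_seconds(10**7, 100, 5e9)
```

With these assumed numbers the sequential measurement takes 10^9 UI, while a fully parallel structure (`n_parallel=100`) would reduce it to N = 10^7 UI, which is the fundamental limit set by the sample size.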
This problem domain is addressed in the subsequent sections. First, the tail parameters of test
distributions from section 3.2.2 are characterized, in order to relate them to a certain distribution
shape. This allows minimum requirements for jitter analysis to be specified. Subsequently, empirical
relations for minimum sample size N and minimum time resolution R are derived. Since both
parameters have a significant influence on the extrapolation error of a fitting method, the focus is
also put on empirical relations for error prediction. In addition, a model is provided to consider the
DNL error caused by process variations. Finally, two comprehensive design examples are given
to demonstrate the applicability of the empirical relations and to provide design guidelines for
jitter measurement systems. The chapter concludes with a brief summary.
5.1. Tail Parameters of Test Distributions

As was shown in section 3.2.2, test distributions are synthesized using RJ and DJ components,
and can be fully described in terms of the jitter ratio σRJ/ADJ. However, in a practical design
one is also interested in the tail characteristic of a total distribution. Often, typical tail parameters
are well known from measurements, and the goal of a jitter analysis is to fulfill certain minimum
requirements. Thus, one would like to provide relations between the shape variables σRJ and ADJ
and the tail parameters At, σt and µt obtained by the fitting method.
With the convolution of RJ and DJ components in histogram domain (figure 3.5), generally an
increase of the RJ variance (σt ≥ σRJ) as well as a decrease of the DJ amplitude (µt ≤ ADJ/2) [52,
123] is observed. Additionally, the sQN method provides the tail amplitude At. Unfortunately,
tail parameters cannot be determined exactly, not even with ideal bathtub curves where
{N, R}→∞. The problem is that a Q-tail always approaches the Gaussian line asymptotically,
even with optimized scaling factor k (see figure 4.4). An optimum k only guarantees the best
linearity of the obtained Q-tail, but still suffers from this asymptotic effect. In fact, tail fitting
algorithms have to deal with an ill-posed optimization problem, since it is not possible to specify
where the asymptote or Gaussian tail part exactly begins. In other words, the fitted tail parameters
always depend on the selected sample size N and number of bins R.
[Figure 5.2: (a) σt/σRJ, (b) 2·µt/ADJ and (c) At plotted over σRJ/ADJ on a logarithmic axis; curves for sin, uni, tri and qua DJ.]
FIGURE 5.2.: Fitted test distributions: tail parameters standard deviation σt (a), mean µt (b), and
amplitude At (c) over varying shape σRJ/ADJ, N = 10^8, R = Rsim = 3.3·10^5.
A possible way to deal with this problem is a fast numerical tail approximation that reflects
the realistic behavior of the sQN method. For this purpose, ideal test distributions are created by
convolving RJ and DJ components, and bathtub tails are simulated down to the target BER = 10^-12,
so that the noise of random variations is suppressed. Then the quantization effects of N and R are
re-introduced. The applied fitting method thus yields the same tail parameters which otherwise
would have to be estimated from median values of time consuming statistical evaluations. This
allows the average fitting behavior of the method to be characterized quickly. By applying this
principle, empirical relations for tail parameters are derived in the subsequent sections.
5.1.1. Relation Between Distribution Shape and At

In order to correctly describe the tail amplitude as a function of distribution shape, one has to deal
with the asymptotic tail behavior. For the moment the influence of time resolution is suppressed by
choosing R = Rsim. As an example, figure 5.2 presents fitting results for the three tail parameters
σt, µt and At at N = 10^8. As expected, the obtained RJ standard deviation σt (figure 5.2(a))
is always larger than σRJ, while µt (figure 5.2(b)) is smaller than ADJ/2. Unfortunately, both
parameters highly depend on the selected N and R, and follow a non-linear course which impedes
the derivation of simple empirical relations. For the estimated σt values in figure 5.2(a), even
numerical inaccuracies are observed with triangular or quadratic DJ types; they result from the
probability granularity 1/N at the outermost tail region. However, the fitted tail amplitude At
(figure 5.2(c)) has a linear characteristic over a large range of shape values. Thus, one can use
linear regressions that aid in specifying minimum requirements for amplitudes. As will be shown in
the subsequent section 5.2, a limited sample size N also imposes a minimum analyzable tail
amplitude At,min. Therefore, the curves in figure 5.2(c) can be used to relate At,min with a
corresponding distribution shape or vice versa. It is important to note that these curves represent
pessimistic results for an arbitrary time resolution R. In fact, the practical case with a limited R
always leads to larger amplitudes than the ones given in figure 5.2(c). Thus, the curves approximate
the R→∞ case, and only depend on the sample size N.
For comparison, the numerical tail approximations are empirically verified, as shown in
figure 5.3. For this purpose, median tail amplitudes of fitted test distributions are determined with
respect to varying σRJ and ADJ, for sinusoidal (figure 5.3(a)) and uniform (figure 5.3(b)) DJ types.
The dashed lines mark the results obtained by fast numerical approximations, and correctly match the
observed median amplitudes. This means that the approximations reflect the average behavior of the
fitting method to a large extent. Note that the observed amplitude curves are in fact planes.
Median amplitudes are always constant for a certain jitter ratio σRJ/ADJ and thus allow for a
simplified one-dimensional representation.
[Figure 5.3: median tail amplitude Amed plotted over σRJ/ADJ on logarithmic axes; (a) Sinusoidal DJ, (b) Uniform DJ.]
FIGURE 5.3.: Comparing the median of tail amplitude At over varying jitter ratio σRJ/ADJ with
fast numerical approximations (dashed lines). N = 10^8, K = 250, ∆Pt = 10^5.
          N = 10^5         N = 10^6         N = 10^7         N = 10^8         N = 10^9
DJ        a        b       a        b       a        b       a        b       a        b
Sin.      0.766    0.492   0.729    0.495   0.700    0.497   0.673    0.497   0.652    0.498
Uni.      1.489    0.926   1.389    0.945   1.292    0.955   1.209    0.962   1.136    0.966
Tri.      5.566    1.619   5.462    1.723   5.251    1.793   4.826    1.832   4.420    1.863
Qua.      19.789   2.022   23.758   2.280   26.467   2.469   31.949   2.639   22.187   2.639
TABLE 5.1.: Coefficients for equations (5.3) and (5.4), with sQN ĉ1.2 algorithm, R = Rsim.
The four curves in figure 5.2(c) follow a linear behavior over a large range of jitter ratios,
until the tail amplitude is close to the maximum value At =1. An empirical relation between tail
amplitude and jitter ratio can thus be obtained using simple linear functions. For the resulting
regression coefficients a logarithmic scaling of both axes must additionally be considered, thus:
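    \ln(y) = a_L + b \cdot \ln(x), \quad \text{with } y = A_t,\; x = \sigma_{RJ}/A_{DJ}
           = \ln\left(e^{a_L} \cdot x^{b}\right), \quad \text{with } a = e^{a_L}    (5.2)
    \Rightarrow\; y = a \cdot x^{b}

When re-inserting the original variables we obtain

    A_t = a \cdot \left(\frac{\sigma_{RJ}}{A_{DJ}}\right)^{b}    (5.3)

with the inverse

    \frac{\sigma_{RJ}}{A_{DJ}} = \left(\frac{A_t}{a}\right)^{1/b}    (5.4)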
The regression coefficients for each of the four DJ types as well as N = {10^5, ..., 10^9} are given
in table 5.1. These are valid for the sQN method with ĉ1.2 fitting criterion, which achieves the best
performance, as will be demonstrated later in section 5.4. Regressions are carried out with fifty shape
values on an equidistant grid in the interval σRJ/ADJ = [10^-3, 10^0], where only the obtained
amplitudes in the range At = [10^-3, 3·10^-1] are used, in order to avoid non-linearities caused by
the probability limit 1/N and the uppermost amplitude region. The given sample sizes represent
important candidates for a typical BIJM system design.
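Equations (5.3) and (5.4) are simple enough to be evaluated directly. The sketch below uses the coefficient pairs (a, b) of table 5.1 for N = 10^7; the dictionary layout and the function names are illustrative choices and not part of the original work. Note that the regression was only fitted for amplitudes in the range At = [10^-3, 3·10^-1], so results outside that range are extrapolations.

```python
# Coefficients (a, b) from table 5.1 for N = 10^7 (sQN, c-hat_1.2 criterion, R = R_sim).
COEFF_N1E7 = {
    "sin": (0.700, 0.497),
    "uni": (1.292, 0.955),
    "tri": (5.251, 1.793),
    "qua": (26.467, 2.469),
}


def tail_amplitude(jitter_ratio, dj_type, coeff=COEFF_N1E7):
    """Equation (5.3): A_t = a * (sigma_RJ / A_DJ)^b."""
    a, b = coeff[dj_type]
    return a * jitter_ratio ** b


def jitter_ratio_from_amplitude(tail_amp, dj_type, coeff=COEFF_N1E7):
    """Equation (5.4): sigma_RJ / A_DJ = (A_t / a)^(1/b)."""
    a, b = coeff[dj_type]
    return (tail_amp / a) ** (1.0 / b)


# Example: uniform DJ with sigma_RJ / A_DJ = 0.1.
a_t_uni = tail_amplitude(0.1, "uni")
ratio_back = jitter_ratio_from_amplitude(a_t_uni, "uni")
```

For the uniform example the amplitude evaluates to roughly 0.14, which lies inside the fitted range, and the inverse relation recovers the original jitter ratio.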
With a given test distribution (σRJ, ADJ), the coefficients allow the tail amplitude At obtained
with the sQN method to be specified. In section 5.2 this will especially aid in determining minimum
requirements for amplitudes. Note that if N is unknown or variable, one can still use the largest
possible value of N to obtain a pessimistic amplitude. That is, fitted tail amplitudes become
smaller with increasing N, which is also an effect of the asymptotic tail behavior in Q-domain.

[Figure 5.4: (a) σt/σRJ and (b) (σt/2µt)/(σRJ/ADJ) plotted over σRJ/ADJ on a logarithmic axis; curves for sin, uni, tri and qua DJ.]
FIGURE 5.4.: Normalized standard deviation σt/σRJ (a) and jitter ratio (σt/2µt)/(σRJ/ADJ) (b) of
fitted test distributions over varying shape σRJ/ADJ, At ≡ 1, N = 10^8, R = Rsim = 3.3·10^5.
5.1.2. Relation Between Distribution Shape and σt , µt

As was already mentioned in the previous analysis, the fitted standard deviation σt and mean µt
behave highly non-linearly over varying N and R. Nevertheless, it is also possible to describe
empirical relations for these tail parameters. Section 5.3 will show that, for the extreme case of a
very coarse time resolution R, the sQN method behaves equivalently to the conventional
Q-normalization (QN) method without scaling (At fixed to 1). That means that worst case fitting
parameters can be provided which are generally valid for both methods. For this purpose, only the
tail amplitude has to be discarded. In figure 5.4 the standard deviation σt and the jitter ratio
σt/(2µt) of fitted tails with At fixed to 1 are shown for the four different DJ types. These curves
now allow minimum requirements of fitted tail parameters to be specified, which can be related to
the original variables σRJ and σRJ/ADJ prior to convolution. This is especially useful when
specifying a minimum standard deviation σt,min, as done in section 5.3 later on.
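With higher order polynomials for regression analysis, we obtain the following two relations:

    y = a_0 + a_1 \cdot x + \ldots + a_p \cdot x^{p}, \quad x = \ln(\sigma_{RJ}/A_{DJ})    (5.5a)

    y = \ln\left(\frac{\sigma_t}{\sigma_{RJ}}\right) \quad \text{(figure 5.4(a))}    (5.5b)

    y = \ln\left(\frac{\sigma_t/(2\mu_t)}{\sigma_{RJ}/A_{DJ}}\right) \quad \text{(figure 5.4(b))}    (5.5c)
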
where the inverse can be determined iteratively using a simple Newton iteration. As for the
tail amplitudes, regression coefficients are given in table 5.2 for the five important sample size
candidates N = {10^5, ..., 10^9}. Regressions are carried out in the interval σRJ/ADJ = [2·10^-3, 0.5],
where equation (5.5c) additionally uses y ≤ ln(10^1) as upper bound. This guarantees sufficient
accuracy of the fitted polynomials (r-squared statistic 1−r^2 < 10^-3, defined in section 5.4) and
does not restrict the analysis, since one is only interested in finding minimum parameter values.
Note that instead of the empirical equations (5.3) and (5.5) it is always possible to rely on fast
numerical approximations as described initially. They allow the average fitting behavior to be
investigated quickly for arbitrary N and R. In the subsequent sections the numerical approximations
will be used together with additional equations to guarantee correct behavior of fitting methods.
A corresponding design example is also given in section 5.5.
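The polynomial relation (5.5a) and its inversion by Newton iteration can be sketched as follows. The coefficients are the a0 to a3 values of equation (5.5b) from table 5.2 for uniform DJ and N = 10^7 (the table lists them scaled by 10^3). The function names, the start value and the stopping tolerance are assumptions made for illustration only.

```python
import math

# a0..a3 of equation (5.5b), uniform DJ, N = 10^7 (table 5.2 values divided by 10^3).
COEFF_5_5B_UNI_N1E7 = [38.3e-3, -43.2e-3, 1.91e-3, 0.0]


def poly(coeffs, x):
    """Evaluate a0 + a1*x + ... + ap*x^p (equation (5.5a)) with Horner's scheme."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def poly_derivative(coeffs, x):
    """First derivative of the polynomial with respect to x."""
    return sum(i * c * x ** (i - 1) for i, c in enumerate(coeffs) if i > 0)


def sigma_growth(jitter_ratio, coeffs=COEFF_5_5B_UNI_N1E7):
    """sigma_t / sigma_RJ from equation (5.5b) for a given sigma_RJ / A_DJ."""
    return math.exp(poly(coeffs, math.log(jitter_ratio)))


def invert_poly(y_target, coeffs, x0=math.log(0.03), tol=1e-12, max_iter=50):
    """Newton iteration for x such that poly(x) = y_target (assumed start value x0)."""
    x = x0
    for _ in range(max_iter):
        step = (poly(coeffs, x) - y_target) / poly_derivative(coeffs, x)
        x -= step
        if abs(step) < tol:
            break
    return x


growth = sigma_growth(0.1)                                   # forward evaluation
x_rec = invert_poly(math.log(growth), COEFF_5_5B_UNI_N1E7)   # inverse via Newton
ratio_rec = math.exp(x_rec)
```

The forward evaluation is only meaningful inside the regression interval σRJ/ADJ = [2·10^-3, 0.5]; the start value of the iteration was placed inside this interval for the same reason.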
              Eq. (5.5b)                                 Eq. (5.5c)
DJ    N       a0·10^3   a1·10^3   a2·10^3   a3·10^3      a0      a1      a2     a3      a4·10^3   a5·10^3
Sin.  10^5    56.9      −42.1     0         0            2.99    3.08    1.51   0.385   49.0      2.46
      10^6    50.2      −32.5     0         0            2.62    2.71    1.35   0.350   45.3      2.30
      10^7    44.9      −26.2     0         0            2.36    2.48    1.26   0.332   43.8      2.26
      10^8    40.5      −21.9     0         0            2.16    2.31    1.20   0.324   43.5      2.28
      10^9    36.8      −18.8     0         0            1.96    2.05    1.04   0.278   37.2      1.94
Uni.  10^5    51.3      −56.1     5.77      0            4.23    4.06    1.86   0.444   53.8      2.58
      10^6    42.7      −49.9     3.08      0            3.72    3.60    1.67   0.407   50.2      2.46
      10^7    38.3      −43.2     1.91      0            3.37    3.29    1.55   0.383   48.0      2.38
      10^8    34.8      −38.0     1.24      0            3.02    2.88    1.33   0.324   40.3      1.99
      10^9    32.7      −33.2     0.94      0            2.83    2.74    1.29   0.322   40.8      2.04
Tri.  10^5    −86.5     −231.0    −44.45    −8.61        7.81    7.00    2.90   0.618   67.1      2.84
      10^6    −33.2     −144.6    −16.13    −3.79        6.85    6.20    2.60   0.566   62.9      2.78
      10^7    −9.3      −103.3    −5.17     −1.78        6.22    5.70    2.43   0.539   61.3      2.78
      10^8    −0.1      −84.8     −2.42     −1.07        5.55    5.00    2.10   0.461   52.0      2.35
      10^9    6.1       −71.3     −0.78     −0.64        5.21    4.77    2.04   0.456   52.3      2.40
Qua.  10^5    −61.4     −157.3    −13.00    −9.91        13.24   12.90   5.77   1.342   161.2     7.69
      10^6    −120.7    −245.7    −51.85    −11.36       11.14   10.65   4.66   1.052   121.3     5.49
      10^7    −92.5     −201.1    −36.50    −7.63        9.87    9.41    4.10   0.922   105.3     4.73
      10^8    −57.9     −148.3    −18.77    −4.46        9.06    8.73    3.86   0.880   102.0     4.67
      10^9    −35.5     −113.7    −8.33     −2.57        8.08    7.62    3.30   0.743   85.1      3.88
TABLE 5.2.: Coefficients for equations (5.5b) and (5.5c), valid for QN and sQN methods.
5.2. Minimum Sample Size

As a fundamental limitation, a real jitter measurement system can only collect a limited number
of jitter samples N within a certain time interval. Even if jitter values are gathered at every bit
transition, a high-speed interface cannot afford histogram measurements down to the target BER
of 10^-12 in a feasible amount of time. Thus, the sample size N is strictly limited and mostly
predefined by the maximum test time of a measurement system.
High-speed sampling scopes and time interval analyzers (TIAs) as mentioned in [43, 91] af-
ford jitter analysis in real-time, that is, an IO-jitter value is obtained at every bit transition of the
received data stream. For a measurement device this corresponds to the theoretical maximum.
BIJMs support this real-time TIA feature only in special cases, because the circuit in figure 5.1
must be realized as a parallel structure with R elements to support each delay value [16]. Thus,
silicon demand might be too large. Nevertheless, if acquisition time is uncritical, BIJMs can also
be realized with very simple, area saving circuits such as time-to-digital converters [17, 57] or
Vernier ring oscillators [18].
With the sample size N as a measure for the effort of collecting jitter distributions, one is
basically interested in characterizing the minimum tail amplitude At,min which can be fitted correctly
by the algorithm. This allows a worst case distribution to be specified using equation (5.3) from
the previous section, and hence minimum requirements for N to be described in terms of the test
distribution shape.
Since collected probability values are always integer multiples of the granularity 1/N, two
fundamental problems have to be dealt with. First, fitted tail data suffers from random variations
and may cause outliers if only a small, outermost tail region is used for fitting. Second, the fitting
method should ideally include all of the visible RJ tail, but none of the DJ component.
The first problem has already been addressed in section 4.3, where the design parameter ∆Pt
was introduced. It defines a region where tail samples at lowest probability levels are always
included for regression analysis, and must range over at least two decades. Results also showed
that ∆Pt should be selected as large as possible in order to include the complete visible RJ tail.
This second problem is also depicted in figure 5.5 for a right bathtub curve with minimum tail
amplitude At,min. Since only part of the Gaussian RJ is observed at the TJ distribution ending,
At,min must significantly exceed the upper probability level Pup imposed by ∆Pt. Therefore, as a
rule of thumb equation (4.20) provides the following constraints:

    \Delta P_t \geq 10^{2}, \quad \text{as large as possible}    (5.6a)

    A_{t,min} \gg P_{up} = \Delta P_t / N    (5.6b)

where Pup is the product of multiplication factor ∆Pt and probability granularity 1/N. The latter
condition is rather vague and does not indicate the minimum amplitude as precisely as desired. Since
∆Pt is a free design parameter, we would like to include all of the observable RJ tail for the worst
case situation in figure 5.5. However, the shape of a measured distribution is basically unknown and
thus, there is no information available on how far Pup must be located below At,min.
[Figure 5.5: CDF sketch of a bathtub tail with the RJ tail and the levels At,min, Pup and 1/N; ∆Pt marks the range between 1/N and Pup.]
FIGURE 5.5.: Right bathtub curve (solid) with fitted Gaussian tail (dashed).

In a simple conservative approach a pessimistic threshold can be selected to define the observable
tail part of a Gaussian function. One can for example assume that the Gaussian function is at least
visible up to the lower inflection point, which is located at one standard deviation from the mean
value. In order to determine the tail part, the tail probability p at the inflection point must thus
be calculated. For this purpose we recall the Gaussian probability function or CDF from equation (3.9)

    \mathrm{CDF}(q) = Q^{-1}(q) = p = \frac{1}{2} \cdot \operatorname{erfc}\left(\frac{-q}{\sqrt{2}}\right)    (5.7)

which is also the inverse of the Q-function (3.10). The inflection point of a Gaussian is located at
x=µ − σ. In Q-domain, the Gaussian is a linear function with the standardized variable q:

    q = \frac{x - \mu}{\sigma} \;\xrightarrow{\;x = \mu - \sigma\;}\; q = -1    (5.8)

The corresponding tail probability is thus:

    Q^{-1}(q = -1) = p_{\sigma,1} = 0.159    (5.9)

It defines the probability level where the observable Gaussian part of a distribution tail starts. If
referred to the minimum amplitude At,min , one can specify a conservative threshold for Pup :

    A_{t,min} \cdot p_{\sigma,1} \geq P_{up} = \Delta P_t / N

    A_{t,min} \geq \frac{\Delta P_t}{p_{\sigma,1} \cdot N}    (5.10)

    A_{t,min} \geq 6.3 \cdot \Delta P_t / N

This final result can be used together with equation (5.3), which allows to identify At for a given
distribution shape. Thus, the distribution shape can now also be related with a minimum sample
size requirement.
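The numbers in equations (5.9) and (5.10) can be reproduced with the complementary error function of the standard library. The sketch below also rearranges (5.10) for the sample size; this rearrangement, the function names and the example amplitude are illustrative assumptions rather than part of the original derivation.

```python
import math


def gaussian_tail_probability(q):
    """Equation (5.7): p = Q^-1(q) = 0.5 * erfc(-q / sqrt(2))."""
    return 0.5 * math.erfc(-q / math.sqrt(2.0))


# Equation (5.9): tail probability at the lower inflection point q = -1.
P_SIGMA_1 = gaussian_tail_probability(-1.0)


def min_tail_amplitude(delta_pt, n_samples, p_sigma=P_SIGMA_1):
    """Equation (5.10): A_t,min >= Delta_Pt / (p_sigma * N)."""
    return delta_pt / (p_sigma * n_samples)


def min_sample_size(delta_pt, tail_amp, p_sigma=P_SIGMA_1):
    """Equation (5.10) solved for N (assumed rearrangement), rounded up."""
    return math.ceil(delta_pt / (p_sigma * tail_amp))


factor = 1.0 / P_SIGMA_1                        # about 6.3, as in equation (5.10)
a_t_min = min_tail_amplitude(1e2, 1e7)          # Delta_Pt = 10^2, N = 10^7
n_min = min_sample_size(1e2, 1e-3)              # hypothetical tail amplitude A_t = 10^-3
```

For the hypothetical amplitude At = 10^-3 and ∆Pt = 10^2, the rearranged relation asks for roughly 6.3·10^5 samples.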
Note that the obtained condition is only valid if fitted tails truly follow a Gaussian below the
inflection point. According to the central limit theorem, the combination of Gaussian RJ with a
bounded DJ component (section 3.2.2) always leads to a TJ distribution which is more Gaussian-
like. Therefore, the synthesis principle already indicates validity, and in fact, empirical analyses
with the scaled Q-normalization method confirm this assumption. As long as the algorithm is able
to correctly identify the tail part, the fitted region also includes the inflection point.
However, this cannot be generalized for arbitrary distribution shapes and thus, the conservative
threshold may have to be re-specified with more pessimistic tail assumptions. Unfortunately,
no matter how pσ is selected, there always remains a certain risk of choosing it inappropriately.
Although rather theoretical, this is still an inherent disadvantage of the ĉ1.2 fitting algorithm from
section 4.3.
In section 4.4 the constant threshold scenario Qth,c was introduced as an alternative to define
the Gaussian tail part in Q-domain (figure 4.27). The worst case analysis situation is here given
by the shortest Q-tail, where the distance between lower tail end and the threshold Qmin becomes
minimum. In this case, the fitted tail suffers from the highest amount of statistical random varia-
tion.
In mathematical terms one can say that Qmin must be located above the required minimum
interval for tail fitting, which is given by ∆Pt /N when transformed into scaled Q-domain. The
worst case situation is given with a maximum scaling factor kmax , and thus

    Q\left(k_{max} \cdot \frac{\Delta P_t}{N}\right) \leq Q_{min}    (5.11)

where the left hand side describes the ∆Pt/N interval in scaled Q-domain. Thus, by applying
Q^-1 on both sides and by inserting the minimum amplitude At,min = 1/kmax we have

    \frac{\Delta P_t}{N \cdot A_{t,min}} \leq Q^{-1}(Q_{min})    (5.12)

which can be rewritten as

    A_{t,min} \geq \frac{\Delta P_t}{Q^{-1}(Q_{min}) \cdot N}    (5.13)

This final result is very useful as it directly relates the sample size N with the minimum tail
amplitude At,min . Since N describes the measurement effort, the minimum amplitude At,min
returns the resulting benefit. The two parameters ∆Pt and Qmin influence the relation and are
now both included with the Qmin based optimization scenario. That means, equation (5.13) was
originally derived for the optimization scenario with Q-domain threshold Qmin from section 4.4.
However, now it also shows the missing link for the conservative probability threshold ∆Pt in
equation (5.10).
This threshold was needed to derive an amplitude relation for the ĉ1.2 optimization scenario in
section 4.3, using only the fitting parameter ∆Pt . Now with the same structure of the formula we
can easily see, that the probability factor pσ corresponds to

    p_{\sigma} \equiv Q^{-1}(Q_{min})    (5.14)

This means, the additional Qmin parameter transforms the conservative approach of equation (5.10)
into the exact relation (5.13). We recognize this as a beneficial property of the Qth,c algorithmic
scenario from section 4.4.
In order to optimize equation (5.13) with respect to minimum tail amplitude, we choose ∆Pt as
small as possible, using the minimum ∆Pt = 10^2 (equation (5.6)). The threshold parameter Qmin
is back-transformed into probability domain, where its reciprocal value also influences At,min. In
figure 5.6 the Q^-1 behavior is displayed as a function of Qmin. According to this plot one would
like to minimize Qmin, but this comes along with a degraded performance since an optimum value
was already identified at Qmin = −1.2 (see figure 4.31). Hence, as long as At,min is not a critical
specification requirement, changes of ∆Pt or Qmin should preferably be avoided. Thus, only
a larger sample size N is truly able to decrease At,min.

[Figure 5.6: Q^-1(Qmin) on a logarithmic axis from 10^0 to 10^-3, plotted over Qmin from 0 to −2.5.]
FIGURE 5.6.: Inverse quantile function Q^-1 applied to Qmin.

As an example, in section 4.4.3 distributions were analyzed using N = 10^7 and Qmin = −1.2
as threshold. Now we would also like to guarantee a minimum tail interval to avoid outliers caused
by statistical tail variations. With the minimum ∆Pt = 10^2 and equation (5.13):

    \{N = 10^{7},\; Q_{min} = -1.2,\; \Delta P_t = 10^{2}\} \;\Rightarrow\; A_{t,min} = 8.7 \cdot 10^{-5}    (5.15)

The result allows tail amplitudes down to At,min = 8.7·10^-5 to be fitted correctly, without causing
large outliers. For the worst case with quadratic curve shaped DJ and equation (5.4) we obtain a
smallest jitter ratio of σRJ/ADJ = 6.0·10^-3, which covers nearly all of the test distributions in
figures 4.26 and 4.33. Thus, none of the investigated distribution shapes is strongly affected by
outliers. However, ∆Pt should always be selected as large as possible to guarantee an optimum
outlier suppression. Further, the obtained result is only valid for simulations where R = Rsim and
thus, when time resolution does not affect the performance. This problem domain is addressed subsequently.
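The example of equation (5.15) and the resulting jitter ratio can be checked numerically. The sketch below evaluates equation (5.13) with the inverse quantile function of equation (5.7) and then applies equation (5.4) with the table 5.1 coefficients for quadratic DJ at N = 10^7; the function names are hypothetical.

```python
import math


def q_inverse(q):
    """Equation (5.7): Q^-1(q) = 0.5 * erfc(-q / sqrt(2))."""
    return 0.5 * math.erfc(-q / math.sqrt(2.0))


def min_tail_amplitude_q(delta_pt, n_samples, q_min):
    """Equation (5.13): A_t,min >= Delta_Pt / (Q^-1(Q_min) * N)."""
    return delta_pt / (q_inverse(q_min) * n_samples)


# Settings of equation (5.15): N = 10^7, Q_min = -1.2, Delta_Pt = 10^2.
a_t_min = min_tail_amplitude_q(1e2, 1e7, -1.2)

# Equation (5.4) with (a, b) for quadratic DJ, N = 10^7, taken from table 5.1.
A_QUA, B_QUA = 26.467, 2.469
ratio_min = (a_t_min / A_QUA) ** (1.0 / B_QUA)
```

Both values agree with the figures quoted in the text after rounding, that is, about 8.7·10^-5 for the amplitude and about 6.0·10^-3 for the jitter ratio. Since the amplitude lies below the range used for the regression in table 5.1, the second value is an extrapolation of the fitted line.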
5.3. Minimum Time Resolution

Like the sample size, the time resolution R of a jitter measurement system also causes
limitations. Collected jitter distributions are represented by a discrete number of bins or nodes R,
which divide the unit interval (UI) into equidistant time intervals. A coarse resolution leads to a
small number of probability samples along the bathtub tails. This quantization effect limits the
applicability of jitter analysis methods and is thus addressed in this section.
Ideally, one would like to specify the minimum random jitter component σt,min of a test
distribution which can be fitted correctly by the algorithm. A convergence failure is basically
noticed as a limiting effect of steep tails, where TJ estimates become highly biased. It is caused
when a bathtub function is not supported by a sufficient number of tail samples.
Due to the quantile normalization, the standard deviation σt of a fitted Gaussian is given by the
reciprocal of the tail slope (equation (4.6)). According to the scheme in figure 5.7, the linearized
Gaussian part of a tail in Q-domain must be supported by at least three nodes or bins.
These bins guarantee a correct linear regression. If the Q-tail becomes too steep, the upper bin
will be located in the bent curve region which belongs to the DJ component, thus leading to wrong
extrapolations. The maximum slope smax can be found by defining the extreme case with ∆tmin
and ∆q.

[Figure 5.7: CDF sketch of a Q-tail with three bins between qlo and qup, spanning ∆q vertically and ∆tmin = 2/R horizontally.]
FIGURE 5.7.: Definition of maximum slope smax.

    \sigma_{RJ,min} = 1/s_{max} = \Delta t_{min}/\Delta q = \Delta t_{min}/|q_{lo} - q_{up}|    (5.16)

Along the time axis, at least three bins of the Q-tail are needed to calculate the regression error.
Together with the given time resolution 1/R this corresponds to a time interval ∆tmin =2/R. The
value for ∆q is the Q-tail region where the fitting algorithm forces tail samples to be included for
regression analysis. Thus, again the two algorithmic approaches ĉ1.2 and Qth,c with ∆Pt interval
and Qmin threshold must be considered separately.
Minimum Time Resolution with Fitting Parameter ∆Pt

The optimized algorithm from section 4.3 uses the parameter ∆Pt to define a probability re-
gion where measured tail bins are included. The lowest probability level plo is thus given by the
granularity 1/N , and the upper probability level pup by the fitting parameter as factor ∆Pt /N .
Additionally these probability values are transformed into scaled Q-domain, where the optimum
scaling factor kt is determined, and thus
qlo = Q(kt · plo ) = Q(kt /N ) (5.17a)

qup = Q(kt · pup ) = Q(kt · ∆Pt /N ) (5.17b)
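In scaled Q-domain, with the scaling factor kt = 1/At, we obtain
$$\Delta q = \left| Q\!\left(\frac{k_t}{N}\right) - Q\!\left(\frac{k_t \cdot \Delta P_t}{N}\right) \right| = \left| Q\!\left(\frac{1}{A_t \cdot N}\right) - Q\!\left(\frac{\Delta P_t}{A_t \cdot N}\right) \right| \tag{5.18}$$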
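and can derive a relation for the minimum standard deviation:
$$\sigma_{t,\min} = \frac{2}{R} \cdot \frac{1}{\left| Q\!\left(\frac{1}{A_t \cdot N}\right) - Q\!\left(\frac{\Delta P_t}{A_t \cdot N}\right) \right|} \tag{5.19}$$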
This relation is only valid if the upper boundary qup = Q(∆Pt/(At·N)) is located in the linearized Gaussian part of the Q-tail. In the previous section, the corresponding equations (5.6) and (5.10) were derived to ensure that this condition is met.
Unfortunately, the tail amplitude At is difficult to specify as it depends on the shape of a jitter distribution. The algorithm from section 4.3 with ĉ1.2 fitness criterion maximizes the tail length over an initial search grid before carrying out a locally bounded minimum search. The tail length is thus essential for the initial tail amplitude search. At very coarse time resolutions, the algorithm cannot identify any amplitude where the regression length is maximum, simply because there are no longer sufficient bins supporting the bathtub tails. In this case, an amplitude of At=1 is assumed and the algorithm thus behaves like the conventional Q-normalization (QN) method. This allows At to be omitted, and the final result is
$$\sigma_{t,\min} \approx \frac{2}{R} \cdot \frac{1}{\left| Q(1/N) - Q(\Delta P_t/N) \right|} \tag{5.20}$$
This equation correctly tracks the convergence limit of the sQN method, and is empirically verified in figure 5.8. The vertical lines mark calculated values for σt,min, according to equation (5.20), at three different sample sizes N={10^5, 10^6, 10^7}. For each of the sample sizes, the corresponding curve over varying distribution shape shows where the fitting algorithm actually starts to fail. Together with the sample size, ∆Pt={10^2, 10^3, 10^4} is also varied in order to obtain a constant ratio ∆Pt/N=10^−3. Thus, the upper probability bound qup is kept constant and the benefit of increasing the sample size N can be demonstrated.
Note that the σt,min values in equation (5.20) refer to the fitted tails of a total distribution. They cannot be related directly to the curves in figure 5.8, since σRJ and ADJ are parameters prior to TJ composition. Therefore, these parameters were identified using the inverse of equation (5.5b). With known and constant ADJ, a Newton approach finds the σRJ,min value for which the correct σt,min is obtained.
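As a numerical cross-check, equation (5.20) can be evaluated directly. The following is a minimal sketch, assuming that Q(p) denotes the standard normal quantile function (inverse Gaussian CDF); this is an assumption about the Q-domain convention, and the names q_value and sigma_t_min are hypothetical. Under this assumption the code computes the limits that correspond to the three parameter sets of figure 5.8.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # zero mean, unit variance


def q_value(p: float) -> float:
    """Q-domain value of probability p (assumed: standard normal quantile)."""
    return _STD_NORMAL.inv_cdf(p)


def sigma_t_min(n_samples: float, delta_pt: float, resolution: float) -> float:
    """Minimum analyzable standard deviation in UI, equation (5.20)."""
    delta_q = abs(q_value(1.0 / n_samples) - q_value(delta_pt / n_samples))
    return (2.0 / resolution) / delta_q


if __name__ == "__main__":
    resolution = 256  # one of the time resolutions shown in figure 5.8
    for n_samples, delta_pt in [(1e5, 1e2), (1e6, 1e3), (1e7, 1e4)]:
        limit = sigma_t_min(n_samples, delta_pt, resolution)
        print(f"N={n_samples:.0e}, dPt={delta_pt:.0e}: sigma_t_min={limit:.3e} UI")
```

With resolution set to 1 the function returns the normalized variable σt,min·R. Because the ratio ∆Pt/N is held constant in the loop, the returned limit decreases as N grows.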
[Figure 5.8: six panels (a) R=1024, (b) R=512, (c) R=256, (d) R=128, (e) R=64, (f) R=32; vertical axis Emed from 0 to 0.05, logarithmic horizontal axis from 10^−2 to 10^0; curves for N=10^5 with ∆Pt=10^2, N=10^6 with ∆Pt=10^3, and N=10^7 with ∆Pt=10^4]
FIGURE 5.8: Smallest analyzable RJ component σt,min. Empirical relation (5.20) is verified with respect to varying time resolution R={1024, . . . , 32} and sample size N={10^5, 10^6, 10^7}. ∆Pt={10^2, 10^3, 10^4}, K=250, uniform DJ type, ĉ1.2 optimization scenario.
[Figure 5.9: vertical axis σt,min × R (logarithmic, about 0.3 to 3), horizontal axis N from 10^4 to 10^9; dashed curves for ∆Pt=10^1 to 10^6, solid curves for ∆Pt/N=10^−6 to 10^−1]
FIGURE 5.9: ∆Pt selection chart for identifying σt,min·R, as described by equation (5.20). ∆Pt={10^1, . . . , 10^6} (dashed), ∆Pt/N={10^−6, . . . , 10^−1} (solid).
In addition, figure 5.9 provides a chart for equation (5.20) that helps to select the free design parameter ∆Pt and to verify whether a certain minimum RJ tail can be fitted correctly. For example, knowing N and R, one can easily verify whether a desired σt,min is guaranteed for a certain ∆Pt. σt,min is inversely proportional to the number of bins R, and is thus represented by the normalized, dimensionless variable σt,min·R. The chart is constructed using two different types of curves, where either ∆Pt or the ratio ∆Pt/N is constant. This makes it possible to satisfy the two requirements with respect to outlier suppression (equation (5.6)) and minimum tail amplitude (equation (5.10)) independently. Both curve types allow for the analysis of a smaller RJ standard deviation if ∆Pt is increased while N is constant. The maximum ∆Pt is only restricted by the minimum tail amplitude. Note that in equation (5.20) the value ∆Pt/N forms the upper probability level for tail selection. If it is kept constant, the sample size N can be used to increase the fitting region and thus to identify a smaller RJ standard deviation.
An interesting effect is further noticed with constant ∆Pt: if more jitter samples are used for collecting distributions, σt,min becomes larger. This seems contradictory, but is a result of the nonlinear Q-function behavior when only the lowest probability region is used for tail fitting. In fact, the benefit in this case lies in a smaller minimum amplitude At,min. To summarize these observations, both ∆Pt and N should be chosen as large as possible, without violating equation (5.10).
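The chart lookup described above can also be carried out numerically. The sketch below searches the decade values ∆Pt = 10^1 to 10^6 for the smallest one that meets a target σt,min according to equation (5.20). It assumes that Q(p) is the standard normal quantile function; the function names and the decade grid are hypothetical, and the upper limit on ∆Pt from equation (5.10) is not checked because that equation is not reproduced here.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def sigma_t_min(n_samples, delta_pt, resolution):
    """Equation (5.20), with Q assumed to be the standard normal quantile."""
    q_lo = _STD_NORMAL.inv_cdf(1.0 / n_samples)
    q_up = _STD_NORMAL.inv_cdf(delta_pt / n_samples)
    return (2.0 / resolution) / abs(q_lo - q_up)


def smallest_delta_pt(n_samples, resolution, sigma_target, exponents=range(1, 7)):
    """Return the smallest dPt = 10**k with sigma_t_min <= sigma_target.

    Returns None if no candidate reaches the target. The amplitude
    condition of equation (5.10) must be verified separately.
    """
    for k in exponents:
        delta_pt = 10 ** k
        if delta_pt >= n_samples:
            break  # the upper probability level dPt/N must stay below 1
        if sigma_t_min(n_samples, delta_pt, resolution) <= sigma_target:
            return delta_pt
    return None


if __name__ == "__main__":
    print(smallest_delta_pt(n_samples=1e8, resolution=128, sigma_target=6.5e-3))
    # Constant dPt: the limit grows with N, as noted in the text.
    print(sigma_t_min(1e5, 100, 1), sigma_t_min(1e7, 100, 1))
```

The second print statement illustrates the effect noted for constant ∆Pt: the normalized limit is larger at N=10^7 than at N=10^5.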
Minimum Time Resolution with Qmin Threshold

As for the ∆Pt based algorithm, σt,min can also be determined for the Qmin based optimization scenario from section 4.4. Here, the probability region where tail samples are included for regression analysis is given by the Qmin threshold as upper bound, and the probability granularity 1/N as lower bound. According to figure 5.7, in scaled Q-domain we obtain
qlo = Q(kt /N ) (5.21a)

qup = Qmin (5.21b)
This leads to the equation
$$\sigma_{t,\min} = \frac{2}{R} \cdot \frac{1}{\left| Q\!\left(\frac{1}{A_t \cdot N}\right) - Q_{\min} \right|} \tag{5.22}$$
The upper bound Qmin must belong to the linearized part of the Q-tail, which can easily be verified with equation (5.13). Again, it is not possible to specify a tail amplitude At because it depends on the distribution shape, but the minimum amplitude At,min from equation (5.13) can be used instead. This is because correct convergence of the fitting method is assumed, and At,min forms a worst case scenario which allows for a pessimistic σt,min estimation. Thus, inserting equation (5.13) into (5.22) gives
$$\sigma_{t,\min} \le \frac{2}{R} \cdot \frac{1}{\left| Q\!\left(\frac{Q^{-1}(Q_{\min})}{\Delta P_t}\right) - Q_{\min} \right|} \tag{5.23}$$
where the σt,min function is now reduced to a simplified form that does not depend on the sample size N. This expression cannot be reduced further, due to the non-linearity of the Q-function.
As in the previous analysis, the applicability of this equation is demonstrated in figure 5.10. Vertical solid lines mark the calculated values from equation (5.23). The course of the median error Emed over varying jitter ratio shows where the fitting algorithm starts to fail at each of the three sample sizes N={10^5, 10^6, 10^7}. The Q-domain threshold Qmin=−1.0 is used here for the calculations, which differs from the optimum value in section 4.4.2. This is due to the coarse time resolutions R≪Rsim, and allows more tail samples to be included with the Qth,c optimization scenario.
[Figure 5.10: six panels (a) R=1024, (b) R=512, (c) R=256, (d) R=128, (e) R=64, (f) R=32; vertical axis Emed from −0.01 to 0.05, logarithmic horizontal axis from 10^−2 to 10^0; curves for N=10^5, N=10^6 and N=10^7]
FIGURE 5.10: Smallest analyzable RJ component σt,min. Empirical relation (5.22) is verified with respect to varying time resolution R={1024, . . . , 32} and sample size N={10^5, 10^6, 10^7}. Qmin=−1.0, ∆Pt=10^2, K=250, uniform DJ type, Qth,c optimization scenario.
[Figure 5.11: vertical axis σt,min × R (logarithmic), horizontal axis ∆Pt from 10^1 to 10^6; curves from Qmin=0 to Qmin=−2]
FIGURE 5.11: ∆Pt and Qmin selection chart for identifying the normalized variable σRJ,min·R, as described by equation (5.23). Qmin={0, −0.5, −1.0, −1.5, −2.0}.
Again, σt,min values in equation (5.23) refer to the fitted tail parameter of a TJ distribution, and can only be plotted in figure 5.10 when using the inverse of equation (5.5b). This yields different results for the three sample sizes, which are located very close to each other. Here only the N=10^7 case is shown, because it gives the more pessimistic values.
If ∆Pt is increased, σt,min estimates become smaller, so that a suitable parameter value can be found independently of the sample size N. The normalized standard deviation σt,min·R is plotted in a chart for ∆Pt selection over different Qmin values (figure 5.11). The chart suggests selecting ∆Pt as large as possible, but note that equation (5.13) must always be fulfilled. Also, ∆Pt ≥ 10^2 is highly recommended to ensure sufficient outlier suppression.
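The Qmin based limit of equation (5.23) can be evaluated in the same way as the ∆Pt based one. This is a minimal sketch under the assumption that Q(p) is the standard normal quantile function, so that its inverse is the standard normal CDF; the function name sigma_t_min_qth is hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def sigma_t_min_qth(q_min: float, delta_pt: float, resolution: float) -> float:
    """Pessimistic sigma_t_min in UI according to equation (5.23)."""
    p_up = _STD_NORMAL.cdf(q_min)                 # Q^-1(Qmin): probability at the threshold
    q_lo = _STD_NORMAL.inv_cdf(p_up / delta_pt)   # lower end of the fitted Q-tail region
    return (2.0 / resolution) / abs(q_lo - q_min)


if __name__ == "__main__":
    # Normalized variable sigma_t_min * R (resolution = 1) for several thresholds
    for q_min in (0.0, -0.5, -1.0, -1.5, -2.0):
        row = [sigma_t_min_qth(q_min, 10 ** k, 1.0) for k in range(1, 7)]
        print(q_min, [round(v, 3) for v in row])
```

The sample size N does not appear as an argument, which reflects the statement above that equation (5.23) is independent of N.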
5.4. Estimation Error Analysis

The estimation performance of the scaled Q-normalization method is highly affected by the sample
size N , time quantization R and differential non-linearity (DNL) error of a jitter measurement
system. In this section, first, empirical relations are derived that quantify estimation error in terms
of analysis parameters (N , R) and the distribution shape (σRJ , ADJ ). Then an error ripple effect
is investigated, which appears with a coarse time discretization of distribution tails. Finally, a DNL
error model is provided to include the effect of process variations into the empirical relations. The
presented sections are meant to assist the designer in finding an optimum trade-off between fitting
accuracy and hardware expense.
5.4.1. Empirical Error Analysis

The estimation error E as defined in equations (3.17) and (3.18) is a statistical variable, which can
basically be expressed as a function of four variables:
{Emed , IQR, EL } = f (σRJ , ADJ , N, R) (5.24)
In order to derive empirical relations, the complexity of this four-dimensional function must first be reduced. As an introductory example, figure 5.12 depicts the estimation loss EL over varying distribution shape, obtained by combining RJ (σRJ) with uniform type DJ (ADJ). TJpp estimates are obtained from fitted bathtub tails using different time resolutions R. In the example, the sQN algorithm uses the optimized parameter configuration from section 4.3.2, with ĉ1.2 and ∆Pt=10^5.
Basically two effects are noticed. First, the algorithm is highly biased if random jitter falls below a certain minimum σRJ,min. This limiting effect is caused by the discrete time resolution and has already been dealt with in the previous section. Second, EL increases when either σRJ or R is reduced at constant jitter ratio σRJ/ADJ. That means the surfaces in figure 5.12 are symmetric, and thus the ratio σRJ/ADJ can be reused as a simplifying shape variable. However, an additional dependency on σRJ and R is observed, combined with a high error ripple if either one of the two variables is varied.
In order to quantify the estimation error, equation (5.24) must be simplified. Considering again the jitter ratio σRJ/ADJ as single variable, only the shapes of largest error from the analysis in figure 4.26 need to be investigated within the scope of a worst case analysis. This yields a single shape value per DJ type. Additionally, with constant ratio both σRJ and ADJ vary simultaneously, so that each parameter also represents the total distribution size. The estimation error thus reduces to a simplified function of three variables
max{Emed, IQR, EL} = f(σRJ, N, R)   if σRJ/ADJ = const.   (5.25)
[Figure 5.12: five surface plots (a) R=Rsim=3.33·10^5, (b) R=1024, (c) R=512, (d) R=256, (e) R=128; vertical axis EL from 0 to 0.04, horizontal axes σRJ and σRJ/ADJ (logarithmic)]
FIGURE 5.12: Estimation loss EL for different values of σRJ, ADJ, and R. Rsim in figure 5.12(a) is obtained with a simulator time resolution of 1 fs in a 3 Gb/s interface. N=10^8, ∆Pt=10^5, K=250, ĉ1.2 optimization criterion.
DJ type          Worst case σRJ/ADJ    Amplitude At,min (eq. (5.3), tab. 5.1)
None, only RJ    -                     1.000
Sinusoidal       1/2                   0.477
Uniform          1/4                   0.319
Triangular       1/8                   0.107
Quadratic        1/16                  0.021
TABLE 5.3: Selected worst case shape values σRJ,min/ADJ and corresponding worst case tail amplitudes At,min (N=10^8) for different DJ types.
The variable σRJ now describes the overall distribution size, while ADJ is discarded. Remember that ADJ does not fully disappear as a parameter; it has only been transformed into a dependent variable. Varying the parameter σRJ now also means varying ADJ according to the selected worst case ratio.
This jitter ratio depends on the selected DJ type and thus has to be determined for each of the investigated shapes. From the performance results in sections 4.3.3 and 4.4.3 we can easily determine these ratios for subsequent analysis. They are given in table 5.3 together with worst case tail amplitudes, obtained by equation (5.3) at N=10^8. The shape values have been selected as powers of two to simplify the performance analysis, so that the numerical approximations of true total jitter values TJpp,true can be reused.
The next step is to search for a linear or logarithmic dependency between two of the independent variables from equation (5.25), while keeping the third one constant. As depicted in figure 5.13, such a relation is found for a constant sample size N. If the parameters R or σRJ are varied, all three performance indicators Emed, IQR and EL yield plane surfaces that consist of a convergence region, which is to some degree affected by ripple. Inside the convergence region, bathtub tails are supported by sufficient bins, thus allowing the fitting algorithm to extrapolate tails correctly. When either σRJ or R becomes too small, the algorithm shows a large error bias. The ripple increases when moving toward the convergence limit, which is an effect caused by the coarse time discretization and will also be investigated later on in this section.
[Figure 5.13: top row (a), (b), (c) shows surfaces over R and σRJ,min; bottom row (d), (e), (f) shows the same data over the product σRJ,min·R; vertical axes Emed (0 to 0.03), IQR (0 to 0.03) and EL (0 to 0.06)]
FIGURE 5.13: Median estimation error Emed (a,d), interquartile range IQR (b,e) and estimation loss EL (c,f) over varying R and σRJ,min. N=10^8, ∆Pt=10^5, K=250, σRJ,min/ADJ=const.=1/4, uniform DJ type.
The regression analysis can be simplified by changing the variable representation into the prod-
uct form σRJ ·R, as demonstrated in the bottom row of figure 5.13. This is due to the constant
ratio σRJ /ADJ , which changes the whole distribution size when only σRJ is varied. The fitting
algorithm cannot distinguish between a variation of the distribution size or the time resolution. In
fact, the extrapolation error is only influenced by the number of bins which form the bathtub tail.
An increase in performance of the algorithm can therefore be achieved with a larger number of
bins R as well as a larger distribution size given in terms of σRJ .
As a result, another dependency between the variables R and σRJ has been identified. One
of them can be discarded if only the product σRJ,min ·R is considered, which further reduces the
estimated error in equation (5.25) to a function of two variables:
max{Emed , IQR, EL } = f (σRJ ·R, N ) , σRJ,min /ADJ = const. (5.26)
Consequently, for a constant sample size N , estimation performance in figures 5.13 (d,e,f) can also
be approximated with a simple regression line.
The variable product σRJ ·R is here referred to as node or bin density of a distribution. When
comparing this density with parameter selection charts from figures 5.9 and 5.11, we notice that
it also corresponds to the normalized standard deviation. Depending on the selected algorithmic
type, equations (5.20) and (5.23) can thus be used to identify the expected convergence limit.
[Figure 5.14: two surface plots (a) Emed and (b) IQR on logarithmic axes, plotted over σRJ,min × R and N from 10^5 to 10^8]
FIGURE 5.14: Surfaces for empirical analysis of Emed (a) and IQR (b). The worst case shape σRJ/ADJ=1/4 (uniform type DJ) is here investigated in the range N={10^5, . . . , 10^8} and σRJ·R={0.8, . . . , 51.2}, K=250.
In figure 5.13 the ĉ1.2 based algorithm was used with N=10^8 and ∆Pt=10^5 as parameters. The selection chart in figure 5.9 and equation (5.20) yield σt·R=0.79 as limit. With σRJ/ADJ=1/4 and equation (5.5b) we get σRJ·R=0.72, which is consistent with the observed surfaces in figure 5.13.
Using the sample size N as second independent variable, the estimation error can be represented
by two-dimensional surfaces. If regression planes achieve acceptable accuracy, an empirical de-
scription of the extrapolation error thus becomes possible. Subsequently, the two algorithmic
candidates from sections 4.3 and 4.4 are investigated separately.
Empirical Error with Parameter ∆Pt

The sQN method with optimized ĉ1.2 criterion (section 4.3) is investigated with respect to varying sample size N and bin density σRJ·R. For this purpose, the design parameter ∆Pt must be selected appropriately. In figure 5.9 a selection chart was given for either ∆Pt or the probability ∆Pt/N, depending on whether the minimum tail amplitude At,min is known. Here, the distribution shapes with worst case characteristics from table 5.3 are analyzed. With a constant ∆Pt/N = 10^−3 ≪ At,min, all the given tail amplitudes are included and condition (5.10) is fulfilled. Further, the analysis benefits from a larger tail interval if N is increased.
In figure 5.14, as an example, the two-dimensional surfaces of median error Emed and interquartile range IQR are plotted. Both surfaces can be approximated very well using regression planes, where the resulting regression coefficients define the empirical relations. At the smallest bin densities σRJ·R≤2 the median error starts to oscillate, because of the sparse discretization, that is, the limited number of bins on the bathtub tails. This effect will be investigated later on in section 5.4.2.
                           N = [5·10^5, 10^8]                 N = [10^4, 10^6]
                           σRJ·R = [2.0, 51.2]                σRJ·R = [2.0, 51.2]
DJ, σRJ/ADJ      Metric    a0      a1      a2      r²         a0      a1      a2      r²
Sinusoidal 1/2   Emed      0.248   0.121   0.161   0.977      0.463   0.107   0.211   0.961
                 IQR       0.471   0.198   0.188   0.965      0.668   0.168   0.230   0.967
                 EL        0.919   0.168   0.177   0.982      1.444   0.144   0.223   0.976
Uniform 1/4      Emed      0.256   0.103   0.157   0.977      0.494   0.099   0.206   0.968
                 IQR       0.313   0.190   0.163   0.960      1.019   0.161   0.260   0.973
                 EL        0.720   0.154   0.161   0.981      1.898   0.136   0.238   0.981
Triangular 1/8   Emed      0.447   0.070   0.170   0.984      1.089   0.035   0.239   0.994
                 IQR       0.365   0.163   0.168   0.950      0.727   0.124   0.235   0.949
                 EL        0.994   0.117   0.169   0.984      2.179   0.076   0.237   0.988
Quadratic 1/16   Emed      0.983   0.044   0.201   0.992      1.724   −0.003  0.243   0.992
                 IQR       0.681   0.170   0.201   0.960      1.105   0.101   0.263   0.955
                 EL        1.987   0.098   0.201   0.986      3.266   0.036   0.250   0.990
Only RJ          Emed      0.091   0.408   0.137   0.925      0.552   0.457   0.272   0.936
                 IQR       1.471   0.342   0.252   0.961      4.129   0.313   0.365   0.971
                 EL        2.002   0.355   0.231   0.969      6.583   0.335   0.350   0.973
TABLE 5.4: Emed, IQR and EL regression coefficients for equation (5.30). ∆Pt/N=10^−3 (planes for large N) and ∆Pt/N=10^−2 (planes for small N), ĉ1.2 algorithm, parameter intervals specified above, K=250.
When deriving empirical coefficients for the regression planes, a logarithmic scaling of all three axes must be considered. Thus, regressions are described with
$$\begin{aligned} \ln(z) &= c_1 + c_2 \cdot \ln(x) + c_3 \cdot \ln(y) \\ &= \ln\left(e^{c_1} \cdot x^{c_2} \cdot y^{c_3}\right) \\ \Rightarrow\; z &= e^{c_1} \cdot x^{c_2} \cdot y^{c_3} \end{aligned} \tag{5.27}$$
When mapped onto the original variables we thus have
x = σRJ·R, y = N, z = {Emed, IQR, EL} (5.28)
By choosing the regression coefficients a0, a1 and a2
a0 = e^c1, a1 = −c2, a2 = −c3 (5.29)
and re-substituting these variables one gets
{Emed, IQR, EL} = a0 · (σRJ·R)^(−a1) · N^(−a2) (5.30)
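The log-domain plane fit of equations (5.27) to (5.30) can be sketched as an ordinary least-squares problem. The following illustration uses only the Python standard library and solves the 3×3 normal equations directly; the function names are hypothetical, and the data in the usage example is synthetic (generated from arbitrarily chosen coefficients), not simulation data from this work.

```python
import math


def solve_3x3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, 3):
            factor = a[row][col] / a[col][col]
            for k in range(col, 4):
                a[row][k] -= factor * a[col][k]
    x = [0.0, 0.0, 0.0]
    for row in (2, 1, 0):  # back substitution
        s = sum(a[row][k] * x[k] for k in range(row + 1, 3))
        x[row] = (a[row][3] - s) / a[row][row]
    return x


def fit_error_plane(densities, sample_sizes, errors):
    """Fit ln(z) = c1 + c2*ln(x) + c3*ln(y) and return (a0, a1, a2)."""
    rows = [(1.0, math.log(x), math.log(y)) for x, y in zip(densities, sample_sizes)]
    targets = [math.log(z) for z in errors]
    # Normal equations (X^T X) c = X^T t
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
    c1, c2, c3 = solve_3x3(xtx, xty)
    return math.exp(c1), -c2, -c3  # mapping of equation (5.29)


if __name__ == "__main__":
    # Synthetic, noise-free surface with arbitrarily chosen coefficients
    xs, ns, zs = [], [], []
    for x in (2.0, 4.0, 8.0, 16.0, 32.0):
        for n in (1e5, 1e6, 1e7, 1e8):
            xs.append(x)
            ns.append(n)
            zs.append(0.25 * x ** -0.1 * n ** -0.16)
    print(fit_error_plane(xs, ns, zs))
```

Only strictly positive error values can be fitted this way, since the logarithm of z is taken.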
The three regression coefficients have been determined for each of the investigated DJ types as well as the pure Gaussian RJ case, and are listed in table 5.4 for two different sample size intervals. The surfaces use a constant time resolution R=2048, while varying only the RJ standard deviation in the range σRJ=[1/1024, 1/40] UI. This way the bin density σRJ·R supports arbitrary floating point values, and is not restricted to integer values of R. Note that the given parameter range for σRJ·R is smaller than the one depicted in figure 5.14, in order to exclude error oscillations. In fact, the intervals have been selected from original equidistant grids of 75×51 nodes with σRJ·R={0.8, . . . , 51.2} UI as well as N={10^5, . . . , 10^8} and N={10^4, . . . , 10^6} respectively.
The r-squared statistic [20] in the last columns describes the quality of the regression planes, with r² ∈ [0, 1]. The obtained results consistently show a very high degree of correlation for the planes at both large and small N. The latter yield large IQR values, due to a strong influence of random tail variations. This influence was reduced to some degree by choosing a larger ∆Pt/N=10^−2.
A direct error estimation with EL offers the advantage of slightly improved precision, but is less flexible. That is, Emed and IQR are not restricted to the definition from equation (3.18), and can always be adapted as linear combinations to describe arbitrary statistical confidence levels.
Empirical Error with Qmin Threshold

Equivalent to the previous analysis, also the error of the Qth,c algorithm based on the threshold
Qmin from section 4.4 can be investigated. As a major difference to simulations with R=Rsim ,
the threshold is now reduced to Qmin =−1.0 (instead of Qmin =−1.2). This adaptation allows the
Qth,c algorithm to include more tail samples, especially at coarse time resolutions.
Two-dimensional error surfaces are again plotted as functions of bin density σRJ·R and sample size N. As in the previous analysis, regression planes describe the empirical relation of these two variables. At very low densities, ∆Pt=10^2 ensures that sufficient bins are included in the regression analysis. This avoids outliers, but unfortunately the pessimistic estimation property with positive error bias for Emed is no longer guaranteed. In fact, Emed now also returns negative values, as can also be seen in figure 5.10. This effect is caused by statistical tail variations, which lead to highly overestimated scaling factors and optimistic TJ values.
                           N = [5·10^5, 10^8]                 N = [10^4, 10^6]
DJ, σRJ/ADJ      Metric    a0      a1      a2      r²         a0      a1      a2      r²
Sinusoidal 1/2   Emed      0.179   0.101   0.150   0.797      0.423   0.075   0.217   0.932
                 IQR       0.912   0.227   0.208   0.934      0.671   0.176   0.202   0.935
                 EL        1.453   0.196   0.193   0.967      1.454   0.145   0.208   0.968
Uniform 1/4      Emed      0.185   0.098   0.146   0.826      0.773   0.086   0.250   0.952
                 IQR       0.927   0.250   0.202   0.960      0.889   0.161   0.215   0.950
                 EL        1.440   0.207   0.186   0.978      2.121   0.137   0.227   0.969
Triangular 1/8   Emed      0.259   0.069   0.150   0.929      1.818   0.040   0.299   0.982
                 IQR       0.985   0.191   0.208   0.962      0.843   0.137   0.208   0.947
                 EL        1.533   0.148   0.188   0.981      2.883   0.098   0.244   0.986
Quadratic 1/16   Emed      0.570   0.039   0.186   0.979      2.751   −0.006  0.308   0.989
                 IQR       1.478   0.160   0.231   0.975      0.948   0.124   0.211   0.962
                 EL        2.505   0.111   0.212   0.985      3.911   0.069   0.257   0.988
Only RJ          Emed      0.045   0.432   0.095   0.856      0.174   0.418   0.188   0.679
                 IQR       0.482   0.235   0.190   0.958      1.032   0.259   0.248   0.942
                 EL        0.728   0.264   0.175   0.968      1.751   0.280   0.242   0.957
TABLE 5.5: Emed, IQR and EL coefficients for equation (5.30). Qmin=−1.0, ∆Pt=10^2, Qth,c algorithm, parameter intervals specified above, K=250.
The same original grids are used for the regression planes as in the previous analysis. Table 5.5 contains the regression coefficients a0, a1 and a2 for each of the investigated DJ types, as well as the r-squared statistic to indicate the quality of the fitted planes. The empirical relation is again given by equation (5.30). This time, the Emed planes show significantly smaller r² values, because of the partial influence of negative errors at the lowest bin densities. This problem is overcome by selecting a smaller parameter range of σRJ·R for planes with small N. However, except for the pure Gaussian case, EL is significantly larger compared to the ĉ1.2 based algorithm.
To highlight this difference in a brief example, we assume a jitter measurement system with the parameters R=128 and N=10^8. For the sQN method with ĉ1.2 based tail fitting, ∆Pt=10^5 is selected, and a minimum analyzable standard deviation σt,min=6.2·10^−3 is obtained (equation (5.20)). According to equation (5.5b), at a worst case jitter ratio of σRJ/ADJ=1/4 (uniform DJ) we get σRJ,min=5.7·10^−3. The worst case error is determined using equation (5.30) and yields
Emed = 1.46%, IQR = 1.64%, EL = 3.90% (5.31)
For the Qth,c based tail fitting with Qmin=−1.0 we obtain instead
Emed = 1.31%, IQR = 2.42%, EL = 4.98% (5.32)
This result shows that the Qth,c algorithm, although less biased, suffers from a significantly larger statistical spread, which is especially noticeable in EL. A similar error characteristic is observed for all four DJ types. Also for the pure RJ case, the error of the Qth,c algorithm becomes excessively large as soon as σRJ·R≤5 and N≤10^6. Generally, this makes the ĉ1.2 algorithm the better choice for tail fitting with hardware based jitter measurements. Therefore, only the ĉ1.2 algorithm is used in the remainder of this chapter.
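The worked example can be reproduced by inserting the tabulated coefficients into equation (5.30). The sketch below uses the uniform DJ rows (σRJ/ADJ = 1/4, large N planes) of tables 5.4 and 5.5; the function and variable names are hypothetical. Note that the bin density of this example, σRJ,min·R ≈ 0.73, lies below the interval [2.0, 51.2] over which the planes were fitted, so the evaluation extrapolates the regression planes.

```python
def empirical_error(a0, a1, a2, bin_density, n_samples):
    """Equation (5.30): a0 * (sigma_RJ * R)**(-a1) * N**(-a2)."""
    return a0 * bin_density ** (-a1) * n_samples ** (-a2)


# Uniform DJ, sigma_RJ/A_DJ = 1/4, planes for large N
TABLE_5_4 = {"Emed": (0.256, 0.103, 0.157),   # c1.2 based algorithm
             "IQR": (0.313, 0.190, 0.163),
             "EL": (0.720, 0.154, 0.161)}
TABLE_5_5 = {"Emed": (0.185, 0.098, 0.146),   # Qth,c based algorithm
             "IQR": (0.927, 0.250, 0.202),
             "EL": (1.440, 0.207, 0.186)}

R, N, SIGMA_RJ_MIN = 128, 1e8, 5.7e-3
DENSITY = SIGMA_RJ_MIN * R  # bin density sigma_RJ,min * R

if __name__ == "__main__":
    for name, table in (("c1.2", TABLE_5_4), ("Qth,c", TABLE_5_5)):
        for metric, (a0, a1, a2) in table.items():
            value = empirical_error(a0, a1, a2, DENSITY, N)
            print(f"{name:6s} {metric:5s} {100 * value:.2f} %")
```

The printed percentages agree with (5.31) and (5.32) to within a few hundredths of a percentage point, which is the level expected from the three-digit rounding of the coefficients and of σRJ,min.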
[Figure 5.15: (a) simulated Emed surface on logarithmic axes over σRJ,min × R and N; (b) traces in the plane of N (10^4 to 10^8) versus σRJ,min × R (0.5 to 1.6)]
FIGURE 5.15: Error ripple effect: simulated Emed surface (a) and expected “error valleys” according to equation (5.35) (b) with a test distribution σRJ,min/ADJ=1/4 (uniform DJ). The investigated parameter ranges are N={10^4, . . . , 10^8} and σRJ,min·R={0.4, . . . , 1.6}. K=250.
5.4.2. Error Ripple Effect

As was already shown in figure 5.14, error oscillations can be observed when the bin density σRJ,min·R≤2 and thus approaches the convergence limit. This ripple effect is caused by the coarse time discretization of bathtub tails. If tails are described by only a few bins, fitting results depend strongly on the bin locations along the bathtub curve and are thus highly scattered. In such cases a suitable location of bins can significantly improve fitting performance.
In figure 5.15(a) the resulting ripple of a test distribution with σRJ,min/ADJ=1/4 and uniform DJ type is shown. At the smallest bin densities this ripple is still visible even though Emed becomes very large. For tail fitting the ĉ1.2 based algorithm is used with a minimum tail interval of ∆Pt/N=10^−3. If this interval does not include sufficient bins, the algorithm always selects at least three, so that a result is always obtained. Obviously, the median error Emed is then highly biased.
The observed ripple effect may be described in terms of mathematical equations. The fitted Gaussian tail is given by the parameters At, σt and µt, which can be used to calculate the timing budget or jitter extent Jt of the left or right bathtub tail.
$$J_t = \mu_t + \sigma_t \cdot Q\!\left(\frac{1}{A_t \cdot N}\right) \tag{5.33}$$
The error ripple reaches a minimum if the last sample is located just at the tail edge of a distribution, which is the case when Jt is an exact integer multiple of the time interval 2/R given by the resolution. We can thus write the remainder equation
$$\operatorname{rem}(J_t,\, 2/R) = \operatorname{rem}(J_t \cdot R/2,\, 1) \overset{!}{=} 0 \tag{5.34}$$
to indicate that the variable Jt·R/2 should be an integer. Thus the equation
$$\mu_t + \sigma_t \cdot Q\!\left(\frac{1}{A_t \cdot N}\right) \overset{!}{=} \frac{2}{R} \cdot n \tag{5.35}$$
is obtained, where n is an arbitrary integer value. In figure 5.15(b) the first eight traces are depicted for the given test distribution to confirm the correct behavior of the equation. If the sample size N can be adjusted, it is highly recommended to fulfill
$$0 < \operatorname{rem}(J_t \cdot R/2,\, 1) < 0.5 \tag{5.36}$$
in order to ensure that the fitting algorithm operates in a minimum error region.
As a drawback, the Gaussian model parameters (At, µt, σt) must be known. In a simulation they can be identified using a numerical approximation of the ideal bathtub function, equivalent to the calculation of tail parameters in section 5.1. At N=10^6 we thus obtain At=0.380, µt=0.0653 and σt=0.0531. Note that these values form a compromise, since {At, µt, σt}=f(N), due to the asymptotic tail behavior in Q-domain.
In a real measurement scenario, Gaussian model parameters can be approximated using the median values of multiple tail fits. The remainder function (5.36) is then an indicator of how far the tail edge is located from the outermost distribution sample, and hence tells whether the fitting result lies within an error maximum or minimum. Equation (5.35) can additionally be inverted to identify suitable values for the sample size N:
$$N = \frac{1}{A_t \cdot Q^{-1}\!\left(\frac{2n/R - \mu_t}{\sigma_t}\right)} \tag{5.37}$$
where n selects the trace on which the operating point is to be located. If N is adjustable, one is thus able to move from an error maximum to an error minimum by simply changing the sample size. However, this analysis is only valid for a single distribution shape, and cannot be applied to a broad range of distributions. Further, the measurement system must not be affected by differential non-linearity (DNL) error, which is rarely the case. DNL is caused by timing mismatches or process variations, and leads to a smoothing of the rippled surface from figure 5.15 as well as a large statistical spread of fitting results. This effect is investigated subsequently.
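Equations (5.33) and (5.37) form an inverse pair, which can be checked numerically. The sketch below assumes that Q(p) is the standard normal quantile function, so that Q^−1 is the standard normal CDF; under this sign convention the jitter extent of a low-probability tail is negative and the trace numbers n are negative integers. The function names and the resolution R=32 are hypothetical; the tail parameters are the values quoted above for N=10^6.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def jitter_extent(a_t, mu_t, sigma_t, n_samples):
    """Equation (5.33): J_t = mu_t + sigma_t * Q(1 / (A_t * N))."""
    return mu_t + sigma_t * _STD_NORMAL.inv_cdf(1.0 / (a_t * n_samples))


def trace_position(a_t, mu_t, sigma_t, n_samples, resolution):
    """J_t in units of 2/R; integer values correspond to the traces of (5.35)."""
    return jitter_extent(a_t, mu_t, sigma_t, n_samples) * resolution / 2.0


def sample_size_for_trace(n, resolution, a_t, mu_t, sigma_t):
    """Equation (5.37): sample size N that places J_t on trace n."""
    q = (2.0 * n / resolution - mu_t) / sigma_t
    return 1.0 / (a_t * _STD_NORMAL.cdf(q))


if __name__ == "__main__":
    a_t, mu_t, sigma_t = 0.380, 0.0653, 0.0531  # tail parameters quoted for N = 1e6
    resolution = 32                              # hypothetical time resolution
    print(trace_position(a_t, mu_t, sigma_t, 1e6, resolution))
    n_adjusted = sample_size_for_trace(-3, resolution, a_t, mu_t, sigma_t)
    print(n_adjusted, trace_position(a_t, mu_t, sigma_t, n_adjusted, resolution))
```

The sketch keeps At, µt and σt fixed while N changes, which is only approximately valid because these parameters themselves depend on N.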
5.4.3. Error Analysis with Modeled Process Variations

Process variations can significantly affect the accuracy of a BIJM system by causing DNL error. Hence, its influence on the performance of fitting algorithms must be investigated. The major causes of DNL error are timing mismatches of the delay line and a non-ideal PD structure, which are always present in a real BIJM system. These effects can be modeled using normally distributed random steps with mean 1/R and standard deviation σDNL. The steps start at the synchronization time instant, or center of a jitter distribution, as shown in figure 5.16, and define the bins to which random jitter samples are assigned. Real bin locations thus differ from the equidistant time steps of an ideal measurement system. As an additional problem, the DNL error accumulates along the delay line and yields an integral non-linearity (INL) which significantly exceeds the DNL values. This effect can only be reduced if the PLL output clock is very clean and directly provides multiple phases.
[Figure 5.16: time axis with nominal bin edges at −2/R, −1/R, 0, 1/R and 2/R; step widths drawn from N(1/R, σDNL)]
FIGURE 5.16: DNL error model to describe the effect of process variations.
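The DNL model described above can be sketched as a random walk of bin edges that starts at the distribution center. In the following illustration, σDNL is interpreted as a fraction of the nominal bin width 1/R; this interpretation is an assumption, as are the function names and the choice of Python's random module as noise source.

```python
import random


def dnl_bin_edges(resolution, sigma_dnl, n_steps, seed=None):
    """Bin edges (in UI) on both sides of the distribution center at t = 0.

    Each step width is drawn from a normal distribution with mean 1/R and
    standard deviation sigma_dnl / R (sigma_dnl relative to the bin width
    is an assumption of this sketch).
    """
    rng = random.Random(seed)
    nominal = 1.0 / resolution
    right, left = [0.0], [0.0]
    for _ in range(n_steps):
        right.append(right[-1] + rng.gauss(nominal, sigma_dnl * nominal))
        left.append(left[-1] - rng.gauss(nominal, sigma_dnl * nominal))
    return left[:0:-1] + right  # ascending order, center edge included once


def inl(edges, resolution, n_steps):
    """Accumulated deviation of each edge from its ideal position, in bins."""
    ideal = [(k - n_steps) / resolution for k in range(len(edges))]
    return [(e - i) * resolution for e, i in zip(edges, ideal)]


if __name__ == "__main__":
    edges = dnl_bin_edges(resolution=64, sigma_dnl=0.1, n_steps=32, seed=1)
    deviations = inl(edges, 64, 32)
    print(max(abs(v) for v in deviations))  # INL grows with distance from the center
```

Because the step errors add up along the delay line, the deviation of an edge k steps away from the center has a standard deviation of about σDNL·√k bins, which illustrates why the INL exceeds the DNL values.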
Similar to the empirical error analysis, the DNL error term is included as a third variable and yields a hyperplane on which multiple linear regression can be performed. In order to deal with the large computational demand, a reduced grid resolution is chosen with 25×25 nodes for bin density σRJ·R={0.8, . . . , 51.2} UI, as well as N={5·10^5, . . . , 10^8} and N={10^4, . . . , 10^6} respectively. Each plane is simulated with respect to varying DNL error in the range σDNL={0.0, . . . , 0.19}, using equally spaced steps of 0.01.
N = [5·10⁵, 10⁸], σRJ·R = [2.0, 51.2]

DJ type      σRJ/ADJ   Measure      a0      a1      a2       a3       a4      r²
Sinusoidal   1/2       Emed       1.683   0.144   0.116   −1.859    0.922   0.936
Sinusoidal   1/2       IQR        1.594   0.138   0.197   −8.185    1.309   0.960
Sinusoidal   1/2       EL         0.764   0.138   0.165   −6.582    1.303   0.967
Uniform      1/4       Emed       1.735   0.133   0.102   −1.656    0.761   0.934
Uniform      1/4       IQR        1.652   0.133   0.190   −8.261    1.410   0.957
Uniform      1/4       EL         0.812   0.131   0.152   −6.424    1.326   0.965
Triangular   1/8       Emed       0.470   0.189   0.068   −0.733    0.344   0.970
Triangular   1/8       IQR        1.578   0.136   0.181   −7.064    1.216   0.948
Triangular   1/8       EL         0.227   0.158   0.118   −4.775    1.014   0.964
Quadratic    1/16      Emed      −0.040   0.207   0.040    0.439   −0.066   0.979
Quadratic    1/16      IQR        0.975   0.171   0.148   −5.514    1.093   0.952
Quadratic    1/16      EL        −0.388   0.188   0.082   −2.997    0.682   0.971
Only RJ      -         Emed       0.948   0.226   0.469  −10.767    3.841   0.807
Only RJ      -         IQR        1.097   0.156   0.356   −7.935    0.481   0.958
Only RJ      -         EL         0.407   0.159   0.379   −7.777    0.621   0.964

N = [10⁴, 10⁶], σRJ·R = [2.0, 51.2]

DJ type      σRJ/ADJ   Measure      a0      a1      a2       a3       a4      r²
Sinusoidal   1/2       Emed       0.736   0.213   0.111   −0.828    0.178   0.963
Sinusoidal   1/2       IQR        0.924   0.192   0.161   −5.469    1.107   0.951
Sinusoidal   1/2       EL        −0.036   0.199   0.138   −3.977    0.835   0.969
Uniform      1/4       Emed       0.807   0.199   0.093   −0.998    0.205   0.968
Uniform      1/4       IQR        0.488   0.223   0.158   −5.172    1.060   0.955
Uniform      1/4       EL         0.278   0.213   0.128   −3.769    0.802   0.972
Triangular   1/8       Emed       0.119   0.221   0.035   −0.246    0.067   0.984
Triangular   1/8       IQR        0.399   0.230   0.128   −4.520    0.995   0.917
Triangular   1/8       EL        −0.606   0.224   0.074   −2.640    0.624   0.959
Quadratic    1/16      Emed      −0.557   0.245  −0.003   −0.148    0.076   0.984
Quadratic    1/16      IQR        0.074   0.259   0.082   −3.264    0.732   0.932
Quadratic    1/16      EL        −1.117   0.250   0.025   −1.534    0.392   0.971
Only RJ      -         Emed       0.670   0.252   0.489   −2.599   −0.230   0.934
Only RJ      -         IQR       −0.424   0.274   0.340   −4.425    0.329   0.954
Only RJ      -         EL        −1.033   0.270   0.363   −4.134    0.244   0.965

TABLE 5.6.: Coefficients for Emed, IQR and EL with included DNL error, equation (5.38).
∆Pt/N=10⁻³ (large N) and 10⁻² (small N), ĉ1.2 algorithm, K=200.
For multiple linear regression a suitable model description was identified with the relation

$$ y = -a_0 - a_1 x_1 - a_2 x_2 - a_3 x_3 - a_4 x_2 x_3 \tag{5.38a} $$
$$ x_1 = \ln(N), \quad x_2 = \ln(\sigma_{RJ} \cdot R), \quad x_3 = \ln(1 + \sigma_{DNL}) \tag{5.38b} $$
$$ \{E_{med},\, IQR,\, E_L\} = e^{y} \tag{5.38c} $$

where the last coefficient a4 also considers a correlation between bin density and DNL error, which
proved to significantly increase the quality of the fitted hyperplanes. The coefficients are given in
table 5.6. The smallest r-squared value is observed for the Emed hyperplane of the pure RJ shape
in the large-N range (r²=0.807). This is because the DNL error strongly affects statistical tail
variations, and pure Gaussian distributions suffer most from the influence of random noise. Thus,
the fitted hyperplanes also become inaccurate.
5.5. Design Examples

In order to highlight the practical aspect of the derived equations and empirical relations, two typical
design examples are given. The first one is intended for jitter diagnosis, where the measurement
time is not crucial. The second one focuses on production testing with stringent requirements on
the test time and a small sample size N. It further assumes an undersampling technique as described
in [80, 129] for jitter measurements.
5.5.1. Example for Jitter Diagnostics

A BIJM system is assumed to be designed for a 3Gb/s interface with a maximum number of bins
R=128. The system shall be able to carry out on-chip diagnostics, where the complete bathtub
curve must be measured and fitted within a few hundred milliseconds. The simple measurement
scheme in figure 5.1 with an adjustable delay element, one phase detector and a counter, requires
a sequential bathtub measurement with N jitter samples at each of R delay steps. With N=10⁷ a
maximum test time of

$$ t_{t,max} = \frac{N \cdot R}{3 \cdot 10^{9}}\,\mathrm{s} = 427\,\mathrm{ms} \tag{5.39} $$

is obtained. This value can be reduced to some degree, because the bins with high error rates
quickly collect bit errors and thus, accurate BER values are obtained very fast. Furthermore, tails
are assumed to follow a monotonic behavior. One can thus start the measurement at the center
of a jitter distribution, and already stop after the first bin without errors. For the given resolution,
calculation time of the fitting method affects measurement speed only marginally when carried out
off-chip.
In the example, worst case distributions with σt,min=0.01 UI are assumed, where the major
contributor to jitter is inter-symbol interference (ISI) over the transmission channel and thus, DJ
is approximately uniform [54]. With both a pessimistic DJ amplitude covering the whole UI and
a symmetric TJ distribution, we obtain a maximum DJ value (equation (4.2)) of:

$$ 2 \cdot \mu_{t,max} + 2 \cdot \sigma_{t,min} \cdot Q(10^{-12}) \overset{!}{=} 1\,\mathrm{UI} \tag{5.40} $$
$$ \Rightarrow \mu_{t,max} = 0.36\,\mathrm{UI} \tag{5.41} $$

From the inverse of equation (5.5c) we have the jitter ratio σRJ,min/ADJ,uni=9.84·10⁻³ which is
necessary to synthesize this TJ shape. With equation (5.3) the minimum amplitude can now be
determined:

$$ A_{t,min} = 1.292 \cdot (9.84 \cdot 10^{-3})^{0.955} = 15.6 \cdot 10^{-3} \tag{5.42} $$

Equation (5.10) allows a maximum value to be specified for the design parameter ∆Pt, so that the
minimum amplitude At,min can still be fitted correctly.

$$ \Delta P_t \le A_{t,min} \cdot N / 6.3 \approx 2.5 \cdot 10^{4} \tag{5.43} $$

From the selection chart in figure 5.9 (or by equation (5.20)) ∆Pt=10⁴ is chosen, equivalent to
∆Pt/N=10⁻³, which yields a minimum bin density of 0.95. This result is smaller than the
minimum imposed by our assumptions:

$$ 0.95 < \sigma_{t,min} \cdot R = 0.01 \cdot 128 = 1.28 \tag{5.44} $$

Therefore, the selected ∆Pt is applicable and σt,min=0.01 UI can be fitted correctly.
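The arithmetic of equations (5.39) and (5.42) to (5.44) can be retraced with a few lines of Python. This is only a sketch of the numbers quoted above; the variable names are chosen for illustration.

```python
bit_rate = 3e9   # 3 Gb/s interface
R = 128          # number of bins
N = 1e7          # jitter samples per delay step

t_max = N * R / bit_rate               # equation (5.39), in seconds
a_t_min = 1.292 * (9.84e-3) ** 0.955   # equation (5.42)
delta_pt_max = a_t_min * N / 6.3       # equation (5.43)
bin_density = 0.01 * R                 # right-hand side of (5.44)

print(f"t_max        = {t_max * 1e3:.0f} ms")
print(f"A_t,min      = {a_t_min:.4f}")
print(f"max delta_Pt = {delta_pt_max:.3g}")
print(f"bin density  = {bin_density:.2f} (must exceed 0.95)")
```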
The empirical relation (5.5b) can be used to determine σRJ,min=7.57·10⁻³, since both the
minimum ratio σRJ,min/ADJ,uni and σt,min are known. With equation (5.30) and the coefficients
from table 5.4 the maximum fitting errors for the sQN method with ĉ1.2 can finally be determined:

Emed = 2.0%, IQR = 2.3%, EL = 5.4%

For the selected minimum standard deviation σRJ,min, the resulting coarse bin density is the major
contributor to the overall error. This influence becomes even more evident if a DNL error of
σDNL=0.05 UI is additionally assumed and the empirical equation (5.38) is applied with table 5.6:

Emed = 2.3%, IQR = 3.4%, EL = 7.4%

This demonstrates that the statistical spread of the fitting method can be strongly affected by DNL
error. However, this is also an effect of low bin densities, since DNL has been modeled as a random
step with time interval 1/R (section 5.4.3), and can thus also be reduced using a higher time
resolution. This design example is continued in section 6.5 to compare the extrapolation
error of the sQN method with the conventional Q-normalization (QN) method.
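The DNL-affected error figures can be retraced by evaluating equation (5.38) directly. The following sketch assumes that the uniform DJ row of table 5.6 for the large-N range is the relevant one for this example, and uses the inputs N=10⁷, σRJ,min·R=7.57·10⁻³·128 and σDNL=0.05; the function name `fitting_error` is hypothetical.

```python
import math

# Coefficients (a0..a4) copied from table 5.6, uniform DJ, large-N range.
COEFFS_UNIFORM_LARGE_N = {
    "Emed": (1.735, 0.133, 0.102, -1.656, 0.761),
    "IQR": (1.652, 0.133, 0.190, -8.261, 1.410),
    "EL": (0.812, 0.131, 0.152, -6.424, 1.326),
}


def fitting_error(coeffs, n_samples, bin_density, sigma_dnl):
    """Evaluate equation (5.38) for one error measure."""
    a0, a1, a2, a3, a4 = coeffs
    x1 = math.log(n_samples)
    x2 = math.log(bin_density)
    x3 = math.log(1.0 + sigma_dnl)
    y = -a0 - a1 * x1 - a2 * x2 - a3 * x3 - a4 * x2 * x3
    return math.exp(y)


errors = {
    name: fitting_error(c, 1e7, 7.57e-3 * 128, 0.05)
    for name, c in COEFFS_UNIFORM_LARGE_N.items()
}
for name, value in errors.items():
    print(f"{name}: {100 * value:.1f} %")
```

Under these assumptions the three values round to the percentages quoted above for the DNL case.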
5.5.2. Example for Production Tests

The jitter of a high-speed PLL is assumed to be measured by a second PLL, using an undersampling
technique as described in [53, 80, 129]. The total jitter of the PLL should be verified within
a maximum of 50 ms, in order to guarantee a sufficiently fast production test of multiple PLLs
running at fs=771.4 MHz each [81]. The given architecture moves the sampling position over one
delay step 1/R after each bit period. Therefore, N·R periods are required in order to measure a
BER value down to the probability level 1/N. With a minimum of one counter used in the test
structure, this measurement must be repeated R times, once for each sampling position. A speed-up
can also be achieved with C counters in parallel, which yields a test time of

$$ t_t = \frac{(N \cdot R) \cdot R}{C \cdot f_s} = \frac{N \cdot R^2}{C \cdot f_s} \tag{5.45} $$

The sQN fitting method shall operate appropriately without being affected by large error oscillations
and thus, σRJ·R≥2 is required (also see figure 5.15). With a selected number of bins R=91,
this can be guaranteed if the minimum standard deviation σt,min≥σRJ of the fitted random jitter
component is greater than or equal to

$$ \sigma_{t,min}\,[\mathrm{s}] \ge \frac{2}{R} \cdot \frac{1}{f_s} = 28.5\,\mathrm{ps} \tag{5.46} $$

Note that the system is still able to operate down to the minimum value given by equation (5.20),
but additionally suffers from error oscillations if σRJ·R<2.
As shown in equation (5.45), the test time tt decreases in inverse proportion to the number of
parallel counters C. Equivalently, the number of samples N can be increased to improve the accuracy
of the jitter analysis method if tt is kept constant. Thus, the expected worst case extrapolation
error EL of the sQN method can be plotted against the number of implemented parallel counters
C, which is a direct measure for the hardware expense. For this purpose, empirical relation (5.30)
and the table of coefficients 5.4 are applied, together with the regression planes for small sample sizes.
In the calculations, we assume the pure Gaussian RJ case as well as sinusoidal DJ, where the latter
is typically observed with high-speed PLLs that are affected by spectral spurs.
As a result, the extrapolation error EL in figure 5.17 is depicted over an increasing number of
counters C, which also defines the sample size N. N is determined from the inverse of equation
(5.45), while a minimum bin density of σRJ·R=2 is assumed. For the pure RJ case with
C≥14 and tt=20 ms (⇒ N≈26k), the given measurement system is for example able to estimate
the TJ of the PLL under test with <15% error. Note that this result reflects the combined influence
of worst case error bias and spread, and includes approximately 97.5% of estimates. The presented
design example is continued in section 6.5 for comparison with the QN method.
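The relation between test time, sample size and number of counters in equations (5.45) and (5.46) can be checked with a short sketch. The function names are hypothetical; the numerical inputs are those of the example above.

```python
def bathtub_test_time(n_samples, num_bins, num_counters, f_s):
    """Test time t_t according to equation (5.45), in seconds."""
    return n_samples * num_bins ** 2 / (num_counters * f_s)


def samples_for_test_time(t_t, num_bins, num_counters, f_s):
    """Inverse of equation (5.45): sample size N for a fixed test time."""
    return t_t * num_counters * f_s / num_bins ** 2


F_S = 771.4e6   # PLL frequency in Hz
R = 91          # number of bins

sigma_t_min = 2.0 / R / F_S                             # equation (5.46), seconds
n_20ms = samples_for_test_time(20e-3, R, 14, F_S)       # C = 14, t_t = 20 ms

print(f"sigma_t,min = {sigma_t_min * 1e12:.1f} ps")
print(f"N for C=14, t_t=20 ms: {n_20ms:.0f}")
```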
FIGURE 5.17.: Estimation loss EL of sQN method over varying number of counters C, with
tt={20, 50} ms, and worst case sinusoidal DJ (red) or pure RJ (black) case. Axes: estimation loss
(0.1 to 0.5) over number of counters (1 to 100).
5.6. Summary
Hardware related design aspects were investigated to utilize the scaled Q-normalization method
for on-chip jitter diagnosis or together with a built-in jitter measurement (BIJM) system. Influences
of limited sample size N as well as number of bins R on the algorithmic performance were examined.
For each analysis, the two algorithmic scenarios ĉ1.2 (section 4.3) and Qth,c (section 4.4)
were treated independently.
In order to characterize the tail parameters of fitted test distributions, the polynomial equations
(5.3) and (5.5) were derived first. These allow changing between the variable representation
before (σRJ, ADJ) and after (At, σt, µt) distribution synthesis. The coefficients in table 5.1 are
used together with equation (5.3) and specify the tail amplitude At obtained with the sQN fitting
method. The coefficients in table 5.2 and equation (5.5) specify minimum requirements of the
sQN method with respect to the tail parameters σt and µt. The obtained results are also valid for
the conventional Q-normalization (QN) method without scaling (chapter 6) and thus allow
parameter specification for both methods.
With the ĉ1.2 algorithm, the minimum tail amplitude At,min is estimated by defining a conser-
vative threshold, as shown with equation (5.10). The Qth,c fitting algorithm instead allows for
the derivation of an exact equation (5.13). This result also highlights the missing link to the first
algorithm.
The time resolution variable divides the unit interval into a discrete number of bins R, and
limits the maximum tail slope, which can be expressed as a minimum Gaussian
standard deviation σt,min. With ĉ1.2, equation (5.20) has been derived to identify the σt,min value
which can be fitted correctly by the algorithm. Validity of this equation has been demonstrated
empirically, and a selection chart for ∆Pt has been given in figure 5.9, in order to simplify a suitable
choice. The two conditions in equations (5.6) and (5.10) further guarantee sufficient outlier
suppression as well as a robust algorithmic behavior. With the Qth,c algorithm, equation (5.23)
has been derived to give a pessimistic estimate for σt,min. A selection chart has been provided in
figure 5.11 as well. As a clear advantage of this second algorithm, Qmin and ∆Pt can be adjusted
independently of the sample size N. Correct convergence of the algorithm is again guaranteed
with ∆Pt as large as possible and condition (5.13) fulfilled.
Considering the combined influence of sample size and time resolution on the extrapolation
error of fitting algorithms, empirical relations were derived to approximate error bias and spread
as a function of sample size N and bin density σRJ,min ·R (see equation (5.30)). These empirical
relations investigate the worst case distribution shapes of the four important DJ types defined in
section 3.2.2, as well as the pure Gaussian RJ case. They are meant to aid the designer in finding
an optimum performance trade-off. The corresponding empirical coefficients can be found in
table 5.4 for ĉ1.2, and in table 5.5 for the Qth,c algorithm. According to the obtained results, the
ĉ1.2 algorithm from section 4.3 clearly performs better. This is mainly due to the
unfavorable behavior of Qth,c at lowest bin densities, which leads to large error oscillations and
may even cause negative errors. This strongly degrades the extrapolation performance, and also the
quality of the fitted regression planes.
The observed error oscillations, or error ripple effect, at very small bin densities σRJ·R<2 have
been investigated as well in section 5.4.2. As a fundamental result, equation (5.35) describes the
observable oscillations, and can be used to identify suitable working regions in order to avoid error
maxima. However, the results are optimal only for a single distribution shape, which is rather
impractical.
In the case of differential non-linearity (DNL) error, as caused by timing mismatches of the jitter
measurement system, the statistical spread of TJpp estimates increases significantly and becomes
the major contributor to overall estimation loss. Thus, a DNL error model together with the empirical
equation (5.38) and the table of coefficients 5.6 has been derived. It describes the influence of
DNL error on estimation performance in terms of the standard deviation σDNL.
In a first design example, applicability of the derived equations with respect to jitter diagnosis
has been demonstrated. Starting with a worst case RJ of σt,min=0.01 UI and a time discretization
R=128, the minimum amplitude At,min was determined by assuming ISI dominated jitter
(uniform DJ). This allowed the design parameter ∆Pt to be specified correctly, in order to guarantee
correct convergence of the algorithm. With equation (5.30) and the table of coefficients 5.4
a worst case error bias Emed≤2.0% as well as an estimation loss EL≤5.4% were guaranteed if
the measurement system was not affected by DNL. Otherwise, with an assumed DNL standard
deviation of σDNL=0.05 UI and equation (5.38) together with the table of coefficients 5.6, these
values increased to Emed≤2.3% and EL≤7.4% respectively.
Finally in a second example, the derived empirical relations were applied to a jitter measurement
system for fast production testing of high-speed PLLs. The sQN method was able to estimate the
true TJ budget in a given measurement time of tt =20ms with less than 14.9% error (pure RJ case)
as well as 13.5% error (sinusoidal DJ), if only the number of counters was increased to C=14 in
order to achieve a larger sample size N .
Parts of this chapter have also been published in [C4,C8].
6. Comparison of Gaussian Tail Fitting
Methods Based on Q-Normalization
In this chapter the performance of the scaled Q-normalization (sQN) method is compared with
various other tail fitting principles based on the Gaussian quantile normalization. This analysis
is meant to give useful insight into the performance and stability of fitting algorithms, and to highlight
their advantages and drawbacks. So far, the literature lacks such comparisons and
generally provides no detailed performance description. This is partly also due to the high
computational demand associated with statistical evaluations. In this work, this problem is addressed
with a powerful cluster of up to fifty parallel workstations using 3 GHz Intel Xeon processors.
From the variety of histogram based fitting techniques [51,54,84,95,124,136] this chapter only
focuses on methods related to the Gaussian Q-normalization principle. Obviously, a chi-squared
test as for example used in [52,84,90] would be a prominent candidate, but is omitted here because
of the highly individual structure of such algorithms. In fact the optimization process is quite
complex and typically includes histogram smoothing, outlier removal, tail part identification and
a Gaussian model search over several optimization stages. Implementing a chi-squared test with
acceptable accuracy and robust behavior is thus time consuming and hard to achieve. In contrast,
the presented fitting methods do not require any data preprocessing, and can directly be applied
onto raw jitter distributions.
The first type of fitting algorithm being compared against the sQN method is the conventional
Q-normalization (QN) method without scaling. The algorithm is simply obtained by omitting the
pre-scaling factor k and directly performs a linear regression analysis in Q-domain, as already
mentioned in section 3.1.2. This principle was proposed in [51, 111] and subsequently also de-
scribed in [82, 123]. The second class of algorithms uses higher order polynomials for tail fitting
in Q-domain, and was first suggested by Hong in [54]. The idea is to replace the linear regression
stage by a higher order polynomial regression, which fits polynomial functions into the measured
Q-tails. The obtained polynomial coefficients thus describe a parameterized bathtub function,
which can easily be used to extrapolate distributions down to the BER level of interest and thus,
to recover the TJ timing budget. This principle allows for a whole class of polynomial regression
methods (QP2, QP3, . . .) to be compared against the sQN method.
In figure 6.1 the optimization scheme for Gaussian Q-normalization, when combined with polynomial
regression, is depicted. Polynomials of arbitrary order can be fitted to the distribution tails
in Q-domain, where the resulting regression coefficients {a0, …, ap} denote the functional relation
between jitter amplitude x and quantile q. Similar to the sQN method, the regression error
σ̂err(n) can again be interpreted as a function of tail length n, when starting the regression with
outermost tail samples and recursively moving toward higher probabilities. Therefore, the basic
implementation principle of the fitting algorithm remains the same.

FIGURE 6.1.: Optimization scheme for Q-normalization combined with polynomial regression.
Blocks: CDF(x), Q(···), polynomial regression of order p over tail length n, error σ̂err(n),
with q = f(a0 + a1 x + ... + ap x^p).
Note that for the special case of order p=1, the regression function reduces to a line with
offset o=a0 and slope s=a1. Since the scaling factor k is not included, this corresponds to the
conventional method without scaling (QN), as described before. In other words, the QN method
is equal to the first order polynomial regression (QP1) in Q-domain.
When trying to combine the scaling factor k with higher order polynomials, the optimization
becomes unstable and diverges toward meaningless values of k<1. Thus, only for the linear case
with p=1, the scaling factor k can be part of the optimization scheme. This also confirms the
sQN method as a three-dimensional approach to Gaussian tail parameter search. The QN method
instead, with k=A=1, always fits a Gaussian function of maximum amplitude, and can thus only
be used to retrieve mean µ and standard deviation σ of the Gaussian model. Further, higher order
polynomials (QP2, QP3, …) cannot be used at all for retrieving Gaussian model parameters.
In the subsequent sections, an efficient algorithmic implementation as required for the
polynomial regression of Q-tails, equivalent to the realization of sQN in section 4.1.4, is described first.
Then optimum selection criteria for the conservative fitting parameters Pt,min and ∆Pt are again
discussed. This is meant to improve the robustness of polynomial fitting methods by selecting an
appropriate tail region. Starting with a performance evaluation of the different methods, polynomials
up to the fourth order are investigated, which is sufficient as will also be shown. Then
the polynomial methods are compared with sQN, and as a result it is shown that the conventional
QN method is also suitable for tail fitting with hardware measurements of coarse time resolutions.
Thus, coefficients for the empirical error analysis with QN are derived equivalent to section 5.4.
The chapter concludes with a brief summary.
6.1. Implementation of Algorithms

When recalling the algorithmic implementation of the sQN method in section 4.1.4, the testbench
for performance analysis can be completely reused since only the algorithm block must be replaced
with the respective polynomial fitting algorithm (left flow graph in figure 4.8). The scaling factor
k is not utilized, and hence, the algorithm flow graph is greatly simplified.
Figure 6.2 depicts the analysis procedure which corresponds to a simple minimum search of the
regression error σ̂err over varying tail length n. First, a measured jitter distribution is transformed
into Q-domain. The minimum regression error σ̂err,min is then determined along the tail length
up to a maximum value of Q=0, which corresponds to half of the Gaussian model. As soon
as the optimum tail length is identified, the polynomial coefficients can be retrieved. They are
directly used for tail extrapolation and thus, to determine TJpp values which are again used for
error analysis.
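The extrapolation step can be sketched as a root search on the fitted polynomial. The example below is only an illustration under stated assumptions: it treats a left tail where q decreases toward the target quantile, uses plain bisection as the root finder, and the names `poly_eval` and `extrapolate_tail` as well as the example coefficients are hypothetical. How the two tail positions are combined into TJpp is not shown.

```python
from statistics import NormalDist


def poly_eval(coeffs, x):
    """Horner evaluation of a0 + a1*x + ... + ap*x**p."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def extrapolate_tail(coeffs, q_target, x_lo, x_hi, iterations=200):
    """Bisection for the x in [x_lo, x_hi] where the polynomial equals q_target.

    Assumes exactly one crossing of q_target inside the bracket.
    """
    f_lo = poly_eval(coeffs, x_lo) - q_target
    f_hi = poly_eval(coeffs, x_hi) - q_target
    if f_lo * f_hi > 0:
        raise ValueError("bracket does not enclose the target quantile")
    for _ in range(iterations):
        mid = 0.5 * (x_lo + x_hi)
        f_mid = poly_eval(coeffs, mid) - q_target
        if f_lo * f_mid <= 0:
            x_hi = mid
        else:
            x_lo, f_lo = mid, f_mid
    return 0.5 * (x_lo + x_hi)


# Quantile of the target BER level 1e-12 on a standard Gaussian.
q_ber = NormalDist().inv_cdf(1e-12)
# Made-up first order coefficients (offset 2, slope 20 per UI), left tail.
x_left = extrapolate_tail([2.0, 20.0], q_ber, -1.0, 0.0)
print(f"left tail position at BER 1e-12: {x_left:.4f} UI")
```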
The simplified implementation structure is a key advantage compared to the sQN method. The
scaling factor k is not present in the optimization scheme anymore and thus, one search dimension
is eliminated, which leads to a simplified one-dimensional search space with only the regression
length n as unknown variable.
This means a great speed-up for the optimization process, but at the same time the regression
error σ̂err also becomes the only applicable fitness value. Other measures based on the fitted
regression length such as n̂ (see section 4.3.1), or combinations of different measures cannot be
applied anymore.
With higher order polynomials, the computational demand for calculating the regression coef-
ficients increases as well. Thus, an efficient implementation of the polynomial regression stage is
required. The goal is to keep a recursive description of the coefficients, equivalent to equation (4.5)
for the linear case, so that the results can easily be obtained from summing terms.
We can describe a polynomial regression of order p as an approximation to a set of tail data
pairs (xi, qi):

$$ q_i = a_0 + a_1 x_i + \ldots + a_p x_i^p, \quad i = \{1, \ldots, n\} \tag{6.1} $$

The number of n pairs can be arranged according to the Vandermonde [39] system of equations:

$$ \begin{pmatrix} 1 & x_1 & \ldots & x_1^p \\ 1 & x_2 & \ldots & x_2^p \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & \ldots & x_n^p \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_p \end{pmatrix} = \begin{pmatrix} q_1 \\ q_2 \\ \vdots \\ q_n \end{pmatrix} \tag{6.2} $$

When multiplied at the left by the transposed matrix we get:

$$ \begin{pmatrix} n & \sum x_i & \ldots & \sum x_i^p \\ \sum x_i & \sum x_i^2 & \ldots & \sum x_i^{p+1} \\ \vdots & \vdots & \ddots & \vdots \\ \sum x_i^p & \sum x_i^{p+1} & \ldots & \sum x_i^{2p} \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_p \end{pmatrix} = \begin{pmatrix} \sum q_i \\ \sum x_i q_i \\ \vdots \\ \sum x_i^p q_i \end{pmatrix} \tag{6.3} $$

This system of equations contains only summing terms which can easily be updated using
recursions, as required by the polynomial fitting stage. To obtain the regression coefficients
{a0, …, ap} the matrix inverse must be calculated. The symmetric arrangement of summing terms
corresponds to a Hankel matrix, which can be inverted very efficiently using the Levinson-Durbin
algorithm. Here, an implementation from [112] is used, which requires only 3p² + 9p + 3 multiply
and divide operations, and is thus sufficiently fast for analyses up to the required polynomial of
order p=4. The regression error is calculated as standard deviation of a fitted polynomial with the
tail data:

$$ \hat{\sigma}_{err}(n) = \sqrt{\frac{\sum_{i=1}^{n} \left(q_i - a_0 - a_1 x_i - \ldots - a_p x_i^p\right)^2}{n - (p+1)}} \tag{6.4} $$

FIGURE 6.2.: σ̂err based polynomial fitting. Flow graph: CDF(x); q=Q(CDF(x)); polynomial
regression yielding σ̂err and {a0, …, ap}; store as σ̂err(n); n++; regression length check Q(n)≥0;
σ̂err,min=min{σ̂err(n)}; retrieve polynomial coefficients; calculate TJpp.
The Levinson-Durbin algorithm is included with the C/C++ simulation environment, where fast
statistical simulations are carried out. As with the sQN implementation, this allows for an in-depth
analysis of the algorithmic performance. Further, MATLAB is again used for post processing of
fitting results as well as for data representation.
In section 4.1.4 the computational effort of sQN and QN methods was already compared. With
the very efficient Levinson-Durbin recursion, the difference between QN and higher order poly-
nomials is marginal and thus, analyses of QP2, QP3 and QP4 methods are omitted here.
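The regression stage and the minimum search over the tail length can be illustrated with a compact sketch. It is not the C/C++ implementation used in this work: plain Gaussian elimination stands in for the Levinson-Durbin solver, the parameter `min_length` is a hypothetical stand-in for the conservative initial tail selection, and the synthetic Gaussian tail in the usage part assumes q=(x−µ)/σ.

```python
import math
from statistics import NormalDist


def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting (stand-in solver)."""
    size = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * size
    for r in range(size - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, size))
        solution[r] = (aug[r][size] - tail) / aug[r][r]
    return solution


def fit_q_tail(xs, qs, order, min_length=None):
    """Search the tail length n with minimum regression error, equation (6.4).

    xs, qs are ordered from the outermost tail sample inward. The sums of
    equation (6.3) are updated with each new sample instead of recomputed.
    Returns (error, n, coefficients) of the best fit.
    """
    p = order
    if min_length is None:
        min_length = p + 2          # smallest n with a defined error
    sum_x = [0.0] * (2 * p + 1)     # running sums of x**k, k = 0..2p
    sum_xq = [0.0] * (p + 1)        # running sums of x**k * q, k = 0..p
    best = None
    for n, (x, q) in enumerate(zip(xs, qs), start=1):
        for k in range(2 * p + 1):
            sum_x[k] += x ** k
        for k in range(p + 1):
            sum_xq[k] += x ** k * q
        if n < max(min_length, p + 2):
            continue
        normal = [[sum_x[i + j] for j in range(p + 1)] for i in range(p + 1)]
        coeffs = solve_linear(normal, sum_xq)
        rss = sum(
            (qi - sum(a * xi ** k for k, a in enumerate(coeffs))) ** 2
            for xi, qi in zip(xs[:n], qs[:n])
        )
        err = math.sqrt(rss / (n - (p + 1)))
        if best is None or err < best[0]:
            best = (err, n, coeffs)
    return best


# Synthetic left Gaussian tail (assumed values): mu = 0, sigma = 0.05 UI.
sigma, mu = 0.05, 0.0
xs = [-0.20 + 0.005 * i for i in range(31)]
qs = [NormalDist().inv_cdf(NormalDist(mu, sigma).cdf(x)) for x in xs]
err, n_opt, (offset, slope) = fit_q_tail(xs, qs, order=1, min_length=10)
print(f"n={n_opt}, offset={offset:.4f}, slope={slope:.4f}")
```

For a pure Gaussian tail the first order fit returns a slope close to 1/σ and an offset close to −µ/σ under the stated assumption, which corresponds to the QN case described above.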
6.2. Performance Optimization

Equivalent to the sQN method, analyses can be carried out to optimize the polynomial fitting
algorithms with respect to estimation error and robustness. Thus optimum parameter regions
are derived for every polynomial regression order individually, in order to obtain best estimation
results for each.
As already demonstrated with the simplified flow graph in figure 6.2, the minimum error can
be determined very quickly without amplitude scaling factor k. The minimum search is simply
carried out along the tail length n, where the error σ̂err is the goodness-of-fit measure.
When optimizing the performance, the aim is to determine the best suited parameter configuration
for each of the polynomial fitting methods, equivalent to section 4.3.2. For the sQN method the
default configuration in equation (4.21) was derived as a suitable interval for initial tail selection.
This configuration avoids outliers caused by statistical tail variations, and thus supports a robust
fitting behavior. For the polynomial methods the conservative parameters now have to be
re-specified.
Subsequent optimizations again use the median error Emed and interquartile range IQR of
TJpp estimates for performance analysis, since they are less influenced by outliers compared to
mean and standard deviation. In addition, the estimation loss EL was defined in equation (3.18)
to consider both error bias and spread. Further, higher order moments such as skewness ξ and
kurtosis κ are used for investigating the outlier behavior.
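The robust measures can be computed from a set of TJpp estimates as in the following sketch. Several points are assumptions for illustration: errors are taken relative to the true value, the quartiles use the default method of Python's `statistics.quantiles`, and the kurtosis is the plain (non-excess) fourth standardized moment. The estimation loss EL from equation (3.18) is not reproduced here, and the function name `spread_metrics` is hypothetical.

```python
import statistics


def spread_metrics(estimates, true_value):
    """Median error, IQR, skewness and kurtosis of relative estimation errors."""
    errors = [(e - true_value) / true_value for e in estimates]
    q1, _, q3 = statistics.quantiles(errors, n=4)
    mean = statistics.fmean(errors)
    count = len(errors)
    m2 = sum((e - mean) ** 2 for e in errors) / count
    m3 = sum((e - mean) ** 3 for e in errors) / count
    m4 = sum((e - mean) ** 4 for e in errors) / count
    return {
        "Emed": statistics.median(errors),
        "IQR": q3 - q1,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


# Made-up estimates around a true value of 1.0 UI, for illustration only.
metrics = spread_metrics([1.01, 1.02, 1.03, 1.04, 1.05, 1.06, 1.07], 1.0)
print(metrics)
```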
From the three conservative tail fitting parameters ∆Pt , ∆Tt and Pt,min , as defined and de-
scribed in section 4.3.2, ∆Tt exhibited poor performance, unless the Gaussian tail length was
known, which is usually not the case. Therefore, we only focus on the analysis of threshold Pt,min
and probability interval ∆Pt instead. Subsequently, performance optimizations are carried out
for each of the polynomial orders, starting with the first order or conventional Q-normalization
method.
First Order Polynomials

In figure 6.3 the fitting behavior of the conventional Q-normalization (QN) method with first order
(linear) regression is analyzed. Statistical performance measures are plotted as surfaces of varying
parameters ∆Pt and Pt,min , while the selected test distribution has a worst case jitter ratio of
σRJ /ADJ,uni =1/8 and uniform DJ type. Similar to prior analyses with sQN (see figure 4.23),
the median error Emed , interquartile range IQR, estimation loss EL , skewness ξ and kurtosis κ
(third and fourth order statistical moments) are reported in each of the subfigures. The kurtosis has
already been utilized as a measure for outlier presence. The skewness now additionally describes
the asymmetry of estimates, and thus denotes whether distribution data is centered at the left (ξ<0)
or right (ξ>0) from the mean value.
The resulting surfaces behave quite differently from the ones obtained with the sQN method.
The median error Emed in figure 6.3(a) approaches the convergence limit for the variable product
∆Pt·Pt,min very slowly and thus forms a smooth transition rather than a clear cut-off. As expected,
the error bias is generally larger compared to sQN, since the fitting algorithm is based on
the simplified two-dimensional tail model without Gaussian amplitude search.
The statistical spread, expressed by the interquartile range IQR in figure 6.3(b), indicates a
significant decrease toward higher values of the variable product ∆Pt·Pt,min, which is in clear
contrast to the median estimation error. This causes the overall error EL in figure 6.3(c) to remain
constant over a large parameter range. Optimum performance can for example be achieved with
∆Pt≈10⁴ and Pt,min=1/N=10⁻⁷, which is however only valid for the given distribution case.
Other test distributions with smaller tail amplitudes also yield surfaces where the convergence
region is narrowed down, and given by a smaller ∆Pt·Pt,min product.
Skewness ξ and kurtosis κ in figures 6.3(d) and 6.3(e) indicate a large amount of outliers if
the variable product ∆Pt·Pt,min is too small. A negative ξ means heavy tails directing toward
smaller TJ budgets. From the observed plane, at least ∆Pt·Pt,min≥10³ is thus recommended to
successfully suppress outliers. This is in clear contrast to the sQN method (see figure 4.23), where
a smaller tail region was already sufficient. The observed effect can be explained by a stronger
asymptote in Q-domain compared to sQN, since the amplitude scaling factor is omitted. However,
with a negative ξ, outliers are mostly located closer to the true timing budget, and are therefore
uncritical.

FIGURE 6.3.: First order polynomial regression (QN): (a) median error |Emed|, (b) interquartile
range IQR, (c) estimation loss EL, (d) skewness ξ and (e) kurtosis κ surfaces over varying ∆Pt
and Pt,min. Test distribution: σRJ/ADJ=1/8, ADJ,uni=0.2 UI, σRJ=0.025 UI,
N=10⁷, K=250.
To summarize the properties of the QN method, best performance is achieved with the parameter
product ∆Pt·Pt,min chosen as large as possible. This avoids convergence failures as well
as negatively skewed error distributions. In the subsequent performance comparison the same test
distributions will be used as for the analysis of the sQN method (see section 4.3.3). If the outlier
presence at small sample size N is neglected, the same ∆Pt values can thus be utilized.
Second Order Polynomials
Figure 6.4 shows the fitting performance of the second order polynomial regression (QP2), where
the worst case test distribution has again been identified at a jitter ratio σRJ /ADJ ≈1/8. The
performance metrics are the same as in figure 6.3.
Median error Emed , interquartile range IQR and estimation loss EL in the upper subfigures
behave quite similar to first order regression analysis. The magnitude of Emed in figure 6.4(a) is
significantly smaller than with first order polynomials, but this benefit is lost due to an increased
statistical spread. The overall estimation loss EL in figure 6.4(c) highlights an optimum parameter
region for the variable product ∆Pt ·Pt,min , forming a distinct valley. This is due to the IQR
course with a falling edge along the surface in figure 6.4(b), which already starts before the rising
edge of Emed in figure 6.4(a) influences the overall EL . Inside this region, best estimates are
obtained.
FIGURE 6.4.: Second order polynomial regression (QP2): (a) |Emed|, (b) IQR, (c) EL,
(d) skewness ξ and (e) kurtosis κ surfaces over varying ∆Pt and Pt,min.
Test distribution: σRJ/ADJ=1/8, ADJ,uni=0.2 UI, σRJ=0.025 UI, N=10^7, K=250.

Skewness and kurtosis reveal a critical drawback of TJ estimates obtained from second order
polynomials. If only a small part of the distribution tail is selected for polynomial regression,
the resulting estimates are highly scattered and contain a large number of outliers. In fact,
the positive skewness clearly indicates a heavy tail toward higher TJ estimates, and even
reaches an undesired peak for a worst case configuration. As the later performance evaluations
show, the median error Emed is negative for the selected test distribution. A positively skewed
tail can therefore be tolerated to some degree, but the magnitudes in figure 6.4(d) are by far
too large.
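
The role of skewness ξ and kurtosis κ as outlier indicators can be illustrated with a small
sketch. It assumes that the K extrapolation errors of one parameter configuration are available
as a plain list. The function name `summarize_errors` and the use of population moments (with
a kurtosis of 3 for a normal distribution) are choices made here for illustration and are not
necessarily the definitions used in this thesis. The estimation loss EL is defined elsewhere and
is left out.

```python
import statistics


def summarize_errors(errors):
    """Return median, IQR, skewness and kurtosis of a list of estimation errors.

    Assumption: population (biased) central moments are used, so a normal
    distribution would give skewness 0 and kurtosis 3.
    """
    n = len(errors)
    q1, _, q3 = statistics.quantiles(errors, n=4)  # quartile cut points
    mean = statistics.fmean(errors)
    m2 = sum((e - mean) ** 2 for e in errors) / n
    m3 = sum((e - mean) ** 3 for e in errors) / n
    m4 = sum((e - mean) ** 4 for e in errors) / n
    return {
        "median": statistics.median(errors),
        "iqr": q3 - q1,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


# Hypothetical error samples: one symmetric set, one with a single large outlier.
symmetric = summarize_errors([-2.0, -1.0, 0.0, 1.0, 2.0])
with_outlier = summarize_errors([-1.0, 0.0, 0.0, 0.0, 1.0, 10.0])
print(symmetric)
print(with_outlier)
```

A single large positive outlier leaves the median almost untouched but already produces a
clearly positive skewness and a larger kurtosis, which is the signature discussed for
figure 6.4(d).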
In order to ensure that second order polynomials achieve correct fitting results without outliers
and high skewness, a sufficiently large parameter product ∆Pt·Pt,min must be selected. If the
threshold parameter Pt,min is not in use, the minimum probability corresponds to the granularity
Pt,min=1/N=10^-7. The probability region for best tail fitting is then located at ∆Pt=5·10^4,
which is used here as the default value for second order polynomials at N=10^7. It must be
adapted accordingly for other sample sizes. Fortunately, the fitted tail part is no longer
restricted to the linearized Q-tail, and may also include parts of the DJ component in order to
achieve better extrapolations. However, it becomes impossible to find an optimum parameter
configuration that suits all test distribution shapes. Here we simply choose ∆Pt=5·10^-3·N to
guarantee at least a sufficiently large fitting region.
Third and Fourth Order Polynomials
The analysis from the previous paragraphs can also be carried out to investigate the estimation
performance of third and fourth order polynomials (QP3, QP4). Worst case test distributions
with uniform type DJ have again been identified at the jitter ratio σRJ/ADJ,uni≈1/8.
With both polynomial orders, the regression error described by Emed, IQR and EL cannot
be influenced by the fitting parameter ∆Pt. The threshold parameter Pt,min likewise provides the
best performance only when it is reduced to its minimum Pt,min=1/N.
FIGURE 6.5.: Third order polynomial regression (QP3): (a) |Emed|, (b) IQR, (c) EL,
(d) skewness ξ and (e) kurtosis κ plots for ∆Pt versus Pt,min.
σRJ/ADJ=1/8, ADJ,uni=0.2 UI, σRJ=0.025 UI, N=10^7, K=250.

FIGURE 6.6.: Fourth order polynomial regression (QP4): (a) |Emed|, (b) IQR, (c) EL,
(d) skewness ξ and (e) kurtosis κ plots for ∆Pt versus Pt,min.
σRJ/ADJ=1/8, ADJ,uni=0.2 UI, σRJ=0.025 UI, N=10^7, K=250.

Unfortunately, the statistical random variation of tails has a highly misleading effect on higher
order polynomials, which generally prevents them from achieving accurate fitting results. In the
case of third order polynomials, skewness and kurtosis show heavy tails toward both sides
(skewness is close to zero while kurtosis is large). These heavy tails are caused by convergence
failures of the regression stage, where the fitted polynomial does not extrapolate the bathtub
correctly down to the target BER. This can, for example, result from a non-monotonic course of
the fitted polynomial, which contains a local minimum and/or maximum. Such a failure can be
identified and dealt with by fitting a simple, strictly monotonic linear function into the
distribution tail. However, this corresponds to a reduction of the regression order and thus
introduces a different error characteristic, which finally leads to the heavy tails observed.
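
The failure handling described above can be sketched as follows. This is only an illustration
under stated assumptions: the polynomial is given by its coefficients in ascending order, the
interval [x_lo, x_hi] stands for the range between the fitted tail and the extrapolation target,
and monotonicity is checked by sampling the derivative on a grid. The helper names
(`is_strictly_monotonic`, `select_fit`) are hypothetical and do not come from the thesis.

```python
def poly_eval(coeffs, x):
    """Evaluate c0 + c1*x + c2*x**2 + ... with Horner's scheme."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def poly_derivative(coeffs):
    """Coefficients of the derivative, again in ascending order."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def is_strictly_monotonic(coeffs, x_lo, x_hi, steps=1000):
    """Sampled check that the derivative keeps one sign on [x_lo, x_hi]."""
    derivative = poly_derivative(coeffs)
    signs = set()
    for i in range(steps + 1):
        x = x_lo + (x_hi - x_lo) * i / steps
        slope = poly_eval(derivative, x)
        if slope == 0.0:
            return False
        signs.add(slope > 0.0)
    return len(signs) == 1


def select_fit(poly_coeffs, linear_coeffs, x_lo, x_hi):
    """Keep the higher order fit only if it is monotonic, else fall back to the line."""
    if is_strictly_monotonic(poly_coeffs, x_lo, x_hi):
        return poly_coeffs
    return linear_coeffs


# x**3 - x has a local maximum and minimum inside [-1, 1]; x**3 + x has none.
print(select_fit([0.0, -1.0, 0.0, 1.0], [0.0, 1.0], -1.0, 1.0))
print(select_fit([0.0, 1.0, 0.0, 1.0], [0.0, 1.0], -1.0, 1.0))
```

A sampled check can miss an extremum that is narrower than the grid spacing, so an exact
root search on the derivative may be preferable in a real implementation.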
Fourth order polynomials suffer strongly from random variations of measured bathtub tails.
Although skewness is almost zero and kurtosis reaches small values close to the normal
distribution case, the IQR plane shows an extremely large statistical spread. This highlights a
general problem of tail fitting algorithms with higher order polynomials. Instead of converging
toward a simple regression function, they tend to follow random data variations, which mislead
the extrapolation result. In regression analysis this effect is commonly known as overfitting.
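
The overfitting effect can be reproduced with a toy experiment that is much simpler than the
bathtub simulations of this chapter. The sketch below assumes a perfectly linear tail in
Q-domain, Q(x) = 2 + 4x on [0, 1], adds Gaussian noise, fits polynomials of order 1 to 4 and
extrapolates to x = 2. All numbers are hypothetical. The normal equations contain the Hankel
matrix of power sums; for brevity they are solved here with plain Gaussian elimination rather
than a fast structured solver.

```python
import random
import statistics


def polyfit(xs, ys, order):
    """Least-squares polynomial fit; returns coefficients c0..c_order."""
    n = order + 1
    # Normal equations: entry (i, j) is sum(x**(i + j)), a Hankel matrix.
    power_sums = [sum(x ** k for x in xs) for k in range(2 * order + 1)]
    a = [[power_sums[i + j] for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        partial = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - partial) / a[i][i]
    return coeffs


def extrapolation_spread(order, runs=250, seed=1):
    """IQR of the extrapolation error at x = 2 over repeated noisy fits."""
    rng = random.Random(seed)
    xs = [i / 19 for i in range(20)]
    errors = []
    for _ in range(runs):
        ys = [2.0 + 4.0 * x + rng.gauss(0.0, 0.1) for x in xs]
        coeffs = polyfit(xs, ys, order)
        estimate = sum(c * 2.0 ** k for k, c in enumerate(coeffs))
        errors.append(estimate - 10.0)  # true value is Q(2) = 10
    q1, _, q3 = statistics.quantiles(errors, n=4)
    return q3 - q1


spreads = [extrapolation_spread(order) for order in (1, 2, 3, 4)]
print(spreads)
```

Although every polynomial order can represent the true line exactly, the spread of the
extrapolated value grows with each additional order, because the extra coefficients follow the
noise instead of the underlying tail.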
The ∆Pt parameter selection for third and fourth order polynomials is uncritical, as it does not
in fact influence the estimation performance. We thus select a default value of ∆Pt=10^3 for both
polynomial orders.
The performance optimization of polynomial fitting methods concludes with a brief summary.
Each of the fitting methods has been investigated individually in order to find a well suited
parameter configuration for the regression stage. These configurations are summarized in
table 6.1. Note that the parameters were selected using only the uniform DJ type with
the worst case test distributions identified at σRJ/ADJ=1/8. They do not consider triangular or
quadratic curve shaped DJ. The parameter configurations are thus primarily meant as compromise
solutions, and especially the higher order polynomials cannot be used over a broad variety of
test shapes. If, for example, specific shapes are known, other configurations may be more suitable.
Polynomial   ∆Pt
order        N=10^4    N=10^5    N=10^6    N=10^7    N=10^8
1st          10^1      10^2      10^3      10^3      10^3
2nd          5·10^1    5·10^2    5·10^3    5·10^4    5·10^5
3rd          10^1      10^2      10^3      10^3      10^3
4th          10^1      10^2      10^3      10^3      10^3

TABLE 6.1.: Default parameter configuration of ∆Pt for polynomial tail fitting methods, derived
with worst case test distributions σRJ/ADJ=1/8 of uniform DJ type.
For N≥10^6 the parameter ∆Pt is constant for all orders except second order polynomials, where
it must be chosen sufficiently large to avoid misleading outliers. Selecting ∆Pt=5·10^-3·N
guarantees the optimum region from figure 6.4, which has also been verified for parameter
surfaces over varying N. ∆Pt does not influence the performance of third and fourth
order polynomials, and was thus set to the default sQN values from equation (4.21). With first
order polynomials, parameter selection is more critical since the product ∆Pt·Pt,min must not
exceed the minimum tail amplitude of the test distributions. To guarantee this also for the small
sample sizes N={10^4, 10^5}, the default sQN configuration is used as well. However, in this case
the QN method is also affected by outliers.
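
The defaults of table 6.1 can be captured in a small lookup helper. The sketch below is only an
illustration: the function name `default_delta_pt` is hypothetical, and the closed-form rule
min(N/1000, 1000) for the first, third and fourth order is an interpolation chosen here because
it reproduces the tabulated decades. Values between the tabulated sample sizes are therefore an
assumption.

```python
def default_delta_pt(order, n_samples):
    """Default tail selection parameter, matching table 6.1 at N = 10^4 ... 10^8."""
    if order == 2:
        # Second order: 5e-3 * N, written in integer arithmetic.
        return 5 * n_samples // 1000
    if order in (1, 3, 4):
        # 1e-3 * N for small N, capped at 10^3 for N >= 10^6 (assumed rule).
        return min(n_samples // 1000, 1000)
    raise ValueError("only polynomial orders 1 to 4 are tabulated")


for order in (1, 2, 3, 4):
    row = [default_delta_pt(order, 10 ** k) for k in range(4, 9)]
    print(order, row)
```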
6.3. Performance Analysis of Polynomial Fitting Methods

In this section a performance analysis is carried out for each of the described polynomial
fitting methods, using the parameter configurations from table 6.1. As for the sQN method in
section 4.2, the extrapolation error is investigated over varying sample size N and the four
different DJ shapes.
The influence of N is shown in figures 6.7-6.10, where Emed and EL are plotted for each
of the polynomial regression orders. The median error of first order polynomials (linear
case) is always positive, which is due to the asymptote in Q-domain already described for
the scaled Q-normalization (sQN) method in section 4.2.1. Estimated TJpp values therefore
tend to be larger than the true timing budget, and are thus pessimistic. As can be seen in the
lower Emed subfigures, this beneficial property is not maintained with higher order polynomials
(QP2, QP3, QP4).
With decreasing sample size N, both median error and statistical spread increase, which leads to
a higher overall estimation loss EL as depicted in the right column of subfigures. Especially with
third and fourth order polynomials, this effect is reinforced by a significant statistical
spread at larger jitter ratios σRJ/ADJ (RJ dominant case), which impedes accurate tail fitting.
Note that when the number of jitter samples N is varied, the conservative tail fitting parameter
∆Pt is adapted as well, in order to keep the fitting algorithms in their optimized parameter
region (see table 6.1).
When comparing the fitting performance over different DJ shapes (figures 6.7-6.10), we notice
that the QN method with first order polynomials still maintains its pessimistic extrapolation
property. However, estimates for triangular and quadratic curve shaped DJ are highly biased at
the smallest σRJ/ADJ and for N≤10^6, because the tail amplitudes fall below the analyzable
minimum given by equation (5.10):

\[ A_{t,\min} = 6.3 \cdot \frac{\Delta P_t}{N} = 6.3 \cdot 10^{-3} \tag{6.5} \]

Hence, the ∆Pt parameter forces the fitting algorithm to include parts of the DJ component and
thus misleads the TJ estimation. This effect can only be overcome by further reducing ∆Pt, which
would however strongly increase the presence of outliers. A large error bias is also observed
with higher order polynomials at the smallest jitter ratios, which is likewise due to very steep
tails. Since the Gaussian component is barely visible at the distribution tails, these algorithms
here rather converge toward the bounded, Gaussian-like DJ shape.
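
As a quick numerical check of equation (6.5), the following sketch evaluates the analyzable
minimum for a few of the first order defaults from table 6.1. The function names are
hypothetical, and the tail amplitude passed to `tail_is_analyzable` is an arbitrary example
value.

```python
def min_tail_amplitude(delta_pt, n_samples):
    """Equation (6.5): smallest analyzable tail amplitude A_t,min."""
    return 6.3 * delta_pt / n_samples


def tail_is_analyzable(tail_amplitude, delta_pt, n_samples):
    """True if the tail amplitude does not fall below A_t,min."""
    return tail_amplitude >= min_tail_amplitude(delta_pt, n_samples)


# First order defaults from table 6.1: (N, delta_pt)
for n_samples, delta_pt in [(10 ** 4, 10), (10 ** 6, 10 ** 3), (10 ** 8, 10 ** 3)]:
    print(n_samples, min_tail_amplitude(delta_pt, n_samples))

# Hypothetical tail amplitude of 1e-3 with the N = 10^6 defaults.
print(tail_is_analyzable(1e-3, 10 ** 3, 10 ** 6))
```

With the ratio ∆Pt/N fixed at 10^-3, the minimum stays at 6.3·10^-3 up to N=10^6 and only
shrinks for larger N, where ∆Pt is held constant.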
Another interesting effect appears with QP2 at N≥10^6 and triangular or quadratic curve shaped
DJ. If a certain minimum jitter ratio can be guaranteed, equivalent to a lower bound for the
Gaussian tail slope in Q-domain, second order polynomials yield excellent fitting results down to
σRJ/ADJ≈2·10^-2. They clearly outperform the QN method there, and can compete with it over the
complete range of sinusoidal and uniform DJ shapes.
Third and fourth order polynomials only produce estimates with acceptable accuracy over a
small range of jitter ratios. Further, they are not able to track uniform and sinusoidal DJ at
large jitter ratios, where the tails of Q-normalized distributions correspond to simple linear
functions. While the median error Emed in figures 6.7(e) and 6.7(g), for example, approximately
converges toward zero at the highest jitter ratios (pure Gaussian case), the estimation loss in
figures 6.7(f) and 6.7(h) reaches a large constant. This effect is further magnified with
decreasing sample size N. It clearly indicates an over-fitting problem of higher order
polynomials, caused by the statistical random variations of tails.
FIGURE 6.7.: Sinusoidal type DJ: median error Emed (left) and estimation loss EL (right) over
σRJ/ADJ for polynomial tail fitting methods QN (a,b), QP2 (c,d), QP3 (e,f) and QP4 (g,h).
N={10^4, 10^5, 10^6, 10^7, 10^8}, K=250.

FIGURE 6.8.: Uniform type DJ: median error Emed (left) and estimation loss EL (right) over
σRJ/ADJ for polynomial tail fitting methods QN (a,b), QP2 (c,d), QP3 (e,f) and QP4 (g,h).
N={10^4, 10^5, 10^6, 10^7, 10^8}, K=250.

FIGURE 6.9.: Triangular type DJ: median error Emed (left) and estimation loss EL (right) over
σRJ/ADJ for polynomial tail fitting methods QN (a,b), QP2 (c,d), QP3 (e,f) and QP4 (g,h).
N={10^4, 10^5, 10^6, 10^7, 10^8}, K=250.

FIGURE 6.10.: Quadratic curve type DJ: median error Emed (left) and estimation loss EL (right)
over σRJ/ADJ for polynomial tail fitting methods QN (a,b), QP2 (c,d), QP3 (e,f) and QP4 (g,h).
N={10^4, 10^5, 10^6, 10^7, 10^8}, K=250.
6.4. Comparison with Scaled Q-normalization Method

The polynomial fitting methods (QN, QP2, QP3, QP4) can also be compared against the scaled
Q-normalization (sQN). For this purpose, the optimized method from section 4.3.3 is used, with
ĉ1.2 based optimization and the ∆Pt settings from equation (4.21). The performance comparison is
first carried out over varying sample size N as well as different DJ distribution types. Finally,
the influence of limited time resolution R is investigated.
Influence of Sample Size and Distribution Shape

The comparison of algorithmic performance is carried out in figures 6.11-6.15, where Emed and
EL are evaluated over varying distribution shape and DJ type for each of the five fitting
algorithms sQN, QN, QP2, QP3 and QP4. For the moment, the influence of time quantization R is
not considered, and Rsim=3.3·10^5 is used as the equivalent number of bins for simulators.
As can be seen from the different figures, the sQN method achieves the best performance. The
three-dimensional Gaussian parameter search (A, σ, µ) allows for accurate tail extrapolations, and
the TJ timing budget is determined in a robust manner. However, the additional scaling factor
inside the optimization scheme adds a degree of freedom, which significantly increases the
computational demand (see figure 4.9).
The conventional QN method, in contrast, simply assumes a Gaussian tail area of A=1 and
thus only determines the two parameters σ and µ. The method is faster, but does not estimate
the TJ budget as accurately as sQN. The magnitudes of Emed show that both sQN and QN methods
only produce estimates with a positive error bias. This is due to the asymptotic tail behavior in
Q-domain (see section 4.2.1). Estimated TJ values are therefore always pessimistic, which is a
clear advantage of the two methods. As can be seen with higher order polynomials (QP2, QP3, QP4),
this beneficial property is not maintained. Additionally, polynomials do not allow for recovering
Gaussian model parameters.
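
The two-parameter nature of the conventional QN fit can be sketched in a few lines. The example
assumes an ideal, noise-free right-hand Gaussian tail with area A=1, given as tail probabilities
P(x) at a handful of positions x. It maps them to Q-domain with the inverse standard normal CDF,
fits a straight line and reads σ and µ from slope and intercept. Tail selection with ∆Pt, bathtub
scaling and the optimization scheme of this chapter are left out, and all names and numbers are
hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def qn_fit(xs, tail_probs):
    """Linear regression of Q = inv_cdf(P) over x; returns (slope, intercept)."""
    qs = [STANDARD_NORMAL.inv_cdf(p) for p in tail_probs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_q = sum(qs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxq = sum((x - mean_x) * (q - mean_q) for x, q in zip(xs, qs))
    slope = sxq / sxx
    return slope, mean_q - slope * mean_x


def gaussian_parameters(slope, intercept):
    """Recover (mu, sigma) of the fitted Gaussian tail with area A = 1."""
    return -intercept / slope, 1.0 / abs(slope)


def extrapolate(slope, intercept, p_target):
    """Position where the fitted line reaches the target tail probability."""
    return (STANDARD_NORMAL.inv_cdf(p_target) - intercept) / slope


# Synthetic right-hand tail of a Gaussian with mu = 0.10 UI and sigma = 0.02 UI.
true_tail = NormalDist(mu=0.10, sigma=0.02)
xs = [0.12 + 0.005 * i for i in range(8)]
probs = [1.0 - true_tail.cdf(x) for x in xs]

slope, intercept = qn_fit(xs, probs)
mu, sigma = gaussian_parameters(slope, intercept)
x_target = extrapolate(slope, intercept, 1e-12)
print(mu, sigma, x_target)
```

Because the tail is exactly Gaussian with A=1, the line is recovered without bias. With a DJ
component the tail only approaches this line asymptotically, which is the origin of the positive
error bias described above.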
With decreasing sample size, both median error and statistical spread increase, which generally
leads to a large overall estimation loss EL. With third and fourth order polynomials, this effect
is reinforced by an especially large statistical spread at higher jitter ratios σRJ/ADJ.
This generally prevents higher order polynomials from being used for accurate tail extrapolation,
except for certain known shapes.
Influence of DJ Distribution Type

In figures 6.11-6.15 the four different DJ types are investigated as well. The error bias of QN is
significantly larger than with sQN, but remains strictly positive. QP2 estimates yield excellent
fitting results for Gaussian-like DJ types, which is due to an optimum working region. Thus, if a
certain minimum jitter ratio of σRJ/ADJ≥2·10^-2 can be guaranteed (like a lower bound for the
Gaussian tail slope in Q-domain) and N≥10^6, QP2 can also be used for tail fitting. For sinusoidal
and uniform DJ shapes the algorithm performance is similar to QN, and for quadratic curve DJ it
even outperforms the sQN method. Thus, as long as the lower bound is guaranteed, QP2 can
theoretically replace the QN method. Note, however, that the beneficial property of pessimistic
TJ estimation is lost with the QP2 algorithm. Further, it cannot be used to characterize Gaussian
model parameters.
Regression analysis with third and fourth order polynomials (QP3, QP4) only yields estimates
of acceptable accuracy in a small range of jitter ratios. Interestingly, they are not able
to track the distribution tails correctly at the highest jitter ratios, where the tails are
linear functions in Q-domain. Although the error bias Emed converges toward zero, the estimation
loss of QP3 and QP4 reaches a large constant (see, for example, the EL subplots in figure 6.13).
This behavior clearly indicates an over-fitting problem of higher order polynomials, caused by
the statistical random variation of tails.

FIGURE 6.11.: N=10^4: Emed and EL of compared fitting algorithms (sQN, QN, QP2, QP3,
QP4) over varying test distribution shape σRJ/ADJ and DJ type: sinusoidal (a,b),
uniform (c,d), triangular (e,f), quadratic curve (g,h). K=250, Rsim=3.3·10^5.

FIGURES 6.12-6.15.: Emed and EL of the compared fitting algorithms (sQN, QN, QP2, QP3, QP4)
over σRJ/ADJ, in the same panel layout as figure 6.11. Rsim=3.3·10^5.

FIGURE 6.16.: Estimation loss EL of sQN, QN, QP2, QP3, QP4 fitting algorithms over varying
number of bins in the interval R=[32, 2048]. The figures evaluate the different
fitting algorithms with (a) sinusoidal, (b) uniform, (c) triangular and (d) quadratic
curve DJ types. N=10^8, K=250, σRJ/ADJ=1/8, ADJ=0.2 UI.
Influence of Time Resolution

The number of bins R has so far not been considered in the comparison of algorithms. In a
simulation environment this value can always be chosen such that it does not affect the
performance of a fitting method. In real hardware measurement systems, as already demonstrated
in chapter 5, R is a critical design parameter and will always be limited. Its influence on the
compared algorithms is therefore investigated, in order to identify appropriate candidates for
further in-depth analysis.
In figure 6.16 the estimation loss EL is depicted as a function of time quantization in a typical
range of R=[32, 2048]. Test distributions are created using a constant jitter ratio σRJ/ADJ=1/8,
to guarantee a sufficient amount of extrapolation error with all methods. As can be seen from
the resulting course of EL, the sQN method achieves the best performance with all four DJ types.
With decreasing time resolution, the obtained curves become increasingly noisy, which
is basically caused by the coarse discretization of bathtub tails. In section 5.4.2 this effect was
already described as error ripple. Higher order polynomials generally suffer from large
oscillations, which intensify with increasing polynomial order. Further, the performance
advantage of second order polynomials with quadratic curve DJ in figure 6.16(d) is lost
if the time resolution falls below R=200.
                               N = [5·10^5, 10^8]                N = [10^4, 10^6]
DJ type     σRJ/ADJ  Metric    a0     a1      a2     r^2         a0     a1      a2     r^2
Sinusoidal  1/4      Emed      0.368  0.062   0.167  0.986       0.484  0.047   0.189  0.973
                     IQR       0.389  0.153   0.173  0.947       0.843  0.132   0.251  0.967
                     EL        0.940  0.111   0.170  0.979       1.573  0.090   0.220  0.980
Uniform     1/8      Emed      0.356  0.042   0.155  0.983       0.530  0.025   0.180  0.985
                     IQR       0.510  0.143   0.191  0.957       1.213  0.137   0.282  0.974
                     EL        1.030  0.091   0.172  0.982       1.827  0.074   0.225  0.988
Triangular  1/16     Emed      0.382  0.005   0.138  0.984       0.707  −0.027  0.166  0.989
                     IQR       1.187  0.173   0.248  0.973       2.319  0.155   0.346  0.975
                     EL        1.248  0.063   0.176  0.990       2.047  0.024   0.218  0.988
Quadratic   1/16     Emed      0.436  −0.003  0.129  0.987       0.762  −0.022  0.152  0.982
                     IQR       2.355  0.177   0.283  0.977       2.150  0.144   0.331  0.976
                     EL        1.598  0.052   0.178  0.988       1.999  0.021   0.199  0.986
Only RJ              Emed      0.113  0.463   0.145  0.950       0.721  0.456   0.292  0.950
                     IQR       1.753  0.405   0.262  0.964       5.371  0.350   0.388  0.971
                     EL        2.357  0.417   0.240  0.971       8.431  0.368   0.371  0.973

TABLE 6.2.: Emed, IQR and EL coefficients for QN method, equation (6.6). ∆Pt/N=10^-3
(large N) and ∆Pt/N=10^-2 (small N), intervals specified above, K=250.
Therefore, only the sQN and QN fitting methods are suitable for use with hardware jitter
measurements. Higher order polynomials introduce too much error and cannot in general guarantee
pessimistic TJ estimation. With its fast algorithm, the QN method forms an alternative to the
computationally more expensive sQN method, and its error behavior is therefore thoroughly
analyzed in the following section.
6.5. Estimation Error Analysis of Conventional Q-Normalization

In section 5.4 empirical relations were derived to describe the extrapolation error of the sQN
method. Since the QN method forms a suitable alternative, the same empirical analysis is now
carried out for the QN method. For this purpose, equation (5.30) is reused, which describes the
performance metrics Emed, IQR and EL in terms of three regression coefficients a0, a1 and a2:

\[ \{E_{med},\ IQR,\ E_L\} = a_0 \cdot (\sigma_{RJ} \cdot R)^{-a_1} \cdot N^{-a_2} \tag{6.6} \]
These coefficients are determined for each of the investigated test distributions and are listed
in table 6.2. Estimation performance is investigated with respect to varying sample size N and
bin density σRJ·R. The intervals have been selected from original equidistant grids of 75×51
nodes with σRJ·R={0.8, ..., 51.2} UI as well as N={10^5, ..., 10^8} and N={10^4, ..., 10^6}
respectively. Note that the worst case shapes σRJ/ADJ for conventional QN have been re-specified
using figures 6.7-6.10, as they differ from the respective sQN values. A constant ratio
∆Pt/N=10^-3 for large N as well as ∆Pt/N=10^-2 for small N again guarantees correct convergence
of the algorithm. The r-squared statistic in the last column of the table also indicates an
excellent correlation.
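
To make the use of equation (6.6) concrete, the following sketch evaluates it with the uniform DJ
coefficients for large N from table 6.2, at the operating point of the jitter diagnostics example
later in this section (R=128, N=10^7, σRJ=7.57·10^-3). The function and variable names are
hypothetical. Since the tabulated coefficients are rounded to three digits, the results can
differ from the quoted percentages in the last digit.

```python
def power_law_error(coeffs, sigma_rj, r_bins, n_samples):
    """Equation (6.6): a0 * (sigma_RJ * R)**(-a1) * N**(-a2)."""
    a0, a1, a2 = coeffs
    return a0 * (sigma_rj * r_bins) ** (-a1) * n_samples ** (-a2)


# Table 6.2, uniform DJ, N = [5e5, 1e8]: (a0, a1, a2) per metric.
QN_UNIFORM_LARGE_N = {
    "Emed": (0.356, 0.042, 0.155),
    "IQR": (0.510, 0.143, 0.191),
    "EL": (1.030, 0.091, 0.172),
}

errors = {
    name: power_law_error(coeffs, sigma_rj=7.57e-3, r_bins=128, n_samples=1e7)
    for name, coeffs in QN_UNIFORM_LARGE_N.items()
}
for name, value in errors.items():
    print(f"{name}: {100.0 * value:.1f} %")
```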
DJ type     σRJ/ADJ  Metric    a0      a1     a2      a3      a4      r^2

N = [5·10^5, 10^8], σRJ·R = [2.0, 51.2]
Sinusoidal  1/2      Emed      1.403   0.144  0.054   −1.514  0.558   0.961
                     IQR       1.550   0.140  0.139   −6.880  1.383   0.941
                     EL        0.619   0.140  0.096   −5.091  1.186   0.958
Uniform     1/4      Emed      1.170   0.146  0.045   −1.416  0.462   0.973
                     IQR       1.392   0.151  0.122   −6.419  1.394   0.935
                     EL        0.429   0.146  0.077   −4.453  1.086   0.956
Triangular  1/16     Emed      0.825   0.146  0.002   −0.179  0.026   0.983
                     IQR       0.794   0.194  0.149   −6.020  1.233   0.932
                     EL        0.069   0.162  0.043   −3.099  0.728   0.967
Quadratic   1/16     Emed      0.768   0.134  −0.005  −0.221  0.057   0.986
                     IQR       0.258   0.218  0.172   −5.489  1.108   0.937
                     EL        −0.126  0.161  0.042   −2.610  0.631   0.972
Only RJ              Emed      1.416   0.189  0.521   −8.926  1.936   0.911
                     IQR       1.071   0.156  0.421   −7.517  0.053   0.955
                     EL        0.375   0.158  0.442   −7.462  0.198   0.964

N = [10^4, 10^8], σRJ·R = [2.0, 51.2]
Sinusoidal  1/2      Emed      0.701   0.192  0.043   −0.432  0.083   0.977
                     IQR       0.797   0.202  0.123   −4.402  1.075   0.948
                     EL        −0.130  0.196  0.081   −2.771  0.685   0.970
Uniform     1/4      Emed      0.707   0.174  0.021   −0.499  0.082   0.984
                     IQR       0.423   0.237  0.114   −3.982  0.945   0.936
                     EL        −0.272  0.201  0.058   −2.356  0.568   0.970
Triangular  1/16     Emed      0.323   0.167  −0.023  −0.407  0.116   0.979
                     IQR       −0.439  0.319  0.138   −3.734  0.883   0.946
                     EL        −0.614  0.211  0.017   −1.616  0.426   0.976
Quadratic   1/16     Emed      0.249   0.156  −0.029  −0.409  0.120   0.965
                     IQR       −0.267  0.299  0.128   −3.535  0.824   0.946
                     EL        −0.566  0.194  0.007   −1.455  0.386   0.963
Only RJ              Emed      0.760   0.241  0.491   −3.274  −0.300  0.952
                     IQR       −0.516  0.281  0.380   −4.168  0.102   0.949
                     EL        −1.094  0.274  0.400   −4.003  0.032   0.961

TABLE 6.3.: Emed, IQR and EL coefficients for QN method with included DNL error,
equation (5.38). ∆Pt/N=10^-3 (large N) and ∆Pt/N=10^-2 (small N), K=200.

The DNL model coefficients can be recalculated for the QN method as well, where a suitable
model description has already been given in equation (5.38):

\[
\begin{aligned}
y &= -a_0 - a_1 x_1 - a_2 x_2 - a_3 x_3 - a_4 x_2 x_3 \\
x_1 &= \ln(N), \quad x_2 = \ln(\sigma_{RJ} \cdot R), \quad x_3 = \ln(1 + \sigma_{DNL}) \\
\{E_{med},\ IQR,\ E_L\} &= e^{y}
\end{aligned}
\tag{6.7}
\]
The corresponding QN coefficients are listed in table 6.3. In order to deal with the large
computational demand, a reduced grid resolution of 25×25 nodes is used for bin density σRJ·R and
sample size N, with the same parameter intervals as in the previous error analysis. Each plane
is simulated with respect to varying DNL error in the range σDNL={0.0, 0.01, ..., 0.19}. The
r-squared statistic shows a slightly degraded performance of the Emed hyperplanes for the pure
Gaussian RJ case. The additional DNL error in fact strongly affects random tail variations, which
also leads to a larger noise of the fitted hyperplanes.
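
Equation (6.7) can be evaluated in the same way. The sketch below uses the uniform DJ
coefficients of the first block in table 6.3 and the operating point of the jitter diagnostics
example in this section (R=128, N=10^7, σRJ=7.57·10^-3, σDNL=0.05). Function and variable names
are hypothetical, and small deviations from the quoted percentages are possible because the
tabulated coefficients are rounded.

```python
import math


def dnl_error_model(coeffs, sigma_rj, r_bins, n_samples, sigma_dnl):
    """Equation (6.7): exp(-a0 - a1*x1 - a2*x2 - a3*x3 - a4*x2*x3)."""
    a0, a1, a2, a3, a4 = coeffs
    x1 = math.log(n_samples)
    x2 = math.log(sigma_rj * r_bins)
    x3 = math.log(1.0 + sigma_dnl)
    y = -a0 - a1 * x1 - a2 * x2 - a3 * x3 - a4 * x2 * x3
    return math.exp(y)


# Table 6.3, uniform DJ, first block: (a0, a1, a2, a3, a4) per metric.
QN_UNIFORM_DNL = {
    "Emed": (1.170, 0.146, 0.045, -1.416, 0.462),
    "IQR": (1.392, 0.151, 0.122, -6.419, 1.394),
    "EL": (0.429, 0.146, 0.077, -4.453, 1.086),
}

dnl_errors = {
    name: dnl_error_model(coeffs, 7.57e-3, 128, 1e7, 0.05)
    for name, coeffs in QN_UNIFORM_DNL.items()
}
for name, value in dnl_errors.items():
    print(f"{name}: {100.0 * value:.1f} %")
```

Note that in this model a1 scales ln(N) and a2 scales ln(σRJ·R), so the coefficient roles differ
from equation (6.6).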
The obtained empirical relations can now be used to design a jitter measurement system which
uses the conventional QN method for tail fitting instead of the sQN method from chapter 5. For
this purpose, the design equations can simply be reused, as they are also valid for QN.
Equation (5.3) and condition (5.10) guarantee correct tail fitting if a minimum amplitude At,min
is given. Further, equation (5.20) and the selection chart in figure 5.9 specify a minimum RJ
standard deviation σt,min for known R, N and ∆Pt. Thus, QN and sQN errors may also be compared by
simply extending the design examples from section 5.5.
Example for Jitter Diagnostics. Inserting R=128, N=10^7 and σRJ,min=7.57·10^-3 into
equation (6.6) yields the following worst case errors (uniform DJ type):

    Emed = 2.0%, IQR = 2.3%, EL = 5.4%   (sQN)
    Emed = 2.9%, IQR = 2.4%, EL = 6.4%   (QN)

For the given σRJ,min, the resulting coarse bin density is the major contributor to the overall
error, and thus only a rather small performance difference between QN and sQN is observed. This
difference becomes even smaller if a DNL error of σDNL=0.05 UI is additionally assumed and the
empirical equation (6.7) is applied:

    Emed = 2.3%, IQR = 3.4%, EL = 7.4%   (sQN)
    Emed = 3.2%, IQR = 3.0%, EL = 7.7%   (QN)

This indicates that the statistical spread of the sQN algorithm is generally more affected by DNL
error than that of the QN method. However, this is also an effect of low bin densities: DNL has
been modeled as a random step with time resolution 1/R (section 5.4.3), so its influence can also
be reduced with a higher time resolution.
Example for Production Tests. The presented BIJM system can now also be used together
with the QN method, which only changes the empirical coefficients of equation (6.6) to the ones
specified in table 6.2. The results are depicted in figure 6.17. For pure Gaussian distributions
the error difference is marginal, because the QN and sQN methods behave similarly. For sinusoidal
DJ, in contrast, a performance difference is clearly visible.

FIGURE 6.17.: Estimation loss EL of sQN and QN methods over varying number of counters C,
with tt=20 ms, and worst case sinusoidal DJ (red) or pure RJ (black) case.
6.6. Summary
The performance of various recently proposed polynomial tail fitting principles based on Gaussian
quantile normalization was compared. First, a unifying optimization scheme (figure 6.1)
equivalent to the one for the sQN method was developed in order to realize all of the
investigated polynomial methods (QN, QP2, QP3, QP4). The required polynomial regression led to a
Hankel matrix notation which can be inverted very efficiently by using a Levinson-Durbin
recursion from [112]. As a fundamental simplification compared to the sQN method, only the
regression error from equation (6.4) was shown to be applicable as fitness measure for tail
fitting. Optimum parameter regions for tail selection with ∆Pt were discussed for each of the
polynomial orders. This led to the selected parameter configurations in table 6.1.
Performance evaluations were first carried out for simulator environments (Rsim=3.3·10^5) with
respect to varying sample size N as well as DJ type (figures 6.7-6.10) for each of the polynomial
methods. The median error of QN with linear functions was shown to be strictly positive, which
is due to the asymptote in Q-domain, as for the sQN method in section 4.2.1. Extrapolated
tails are thus always pessimistic. This beneficial property was generally not maintained with
higher order polynomials (QP2, QP3, QP4). Third and fourth order polynomials only produced
results with acceptable accuracy for a small range of jitter ratios. For RJ dominant shapes, they
were generally not able to extrapolate tails correctly, although these form simple linear
functions in Q-domain. This effect was shown to be a general over-fitting problem of higher order
polynomials.
A comprehensive performance comparison with the sQN method based on ĉ1.2 with ∆Pt was
carried out in figures 6.11-6.15. The sQN method clearly achieved the best performance, at the
cost of a larger computational demand as already shown with figure 4.9. The QN method,
although less accurate, delivers fitting results approximately 35 times faster. Further, it was
shown to possess the same positive error property described before and is thus also well suited
for tail fitting.
An interesting effect was observed with second order polynomials for N ≥106 . The QP2 method
returned excellent results down to σRJ /ADJ ≈2·10−2 with triangular or quadratic curve shaped
DJ, where it even outperformed the sQN method. This is due to an optimum working region.
Nevertheless, the pessimistic extrapolation property of QN and sQN methods is not guaranteed
with the QP2 algorithm. Additionally, polynomials do not allow for recovering Gaussian model
parameters.
The influence of a limited time resolution with R bins on the extrapolation error was investigated
in figure 6.16. It again highlighted the effect of error oscillations at coarse resolutions, which was
also investigated in section 5.4.2. Results showed that higher order polynomials are generally
more affected by such oscillations than the sQN and QN methods. Further, the performance advantage
of the QP2 method with quadratic curve DJ was lost as soon as R<200.
Summarizing these results, only the sQN and QN fitting methods are well suited for use with
hardware jitter measurements. Higher order polynomials introduce too much error and generally
cannot guarantee pessimistic tail extrapolations. Therefore, an error analysis for the QN
method was carried out in section 6.5, equivalent to the sQN error analysis in section 5.4. This
allowed the extrapolation performance of both methods to be compared, and thus the better
suited algorithm for a BIJM design to be chosen. Results with the continued design examples from
section 5.5 showed that QN is generally less accurate than sQN, but is also less affected by
differential non-linearity errors. Further, for pure Gaussian test distributions the performance
difference between the QN and sQN methods becomes marginal.
Note that for a system design with the conventional QN method, all the important design equations
from chapter 5 can be reused. This especially includes equation (5.3) and condition (5.10) to
guarantee correct tail fitting results with respect to a minimum tail amplitude At,min , as well
as equation (5.20) and the selection chart in figure 5.9 to specify a minimum RJ standard deviation
σt,min . Parts of this chapter were also published in [C3,C7].
7. Jitter Analysis Method for Generalized
Gaussian Tail Extrapolation
In this chapter the Gaussian quantile normalization is brought into a generic analysis context.
Section 4.1.2 briefly mentioned the idea of generalizing the developed optimization scheme for
use with arbitrary tail shapes. As a possible application, the amplitude distributions of
high-speed signals may sometimes follow non-Gaussian shapes, including both heavy tailed
distributions and fast decaying tails. In such cases one would like to identify the unknown shape
in order to guarantee correct signal recovery. As an extension to the pure Gaussian Q-normalization,
classes of tail distributions may thus be described with additional shape parameters. The generalized
principle then allows for correct tail extrapolation and jitter estimation of arbitrary tails as
long as they belong to the defined class.
Generalizing the scaled Q-normalization (sQN) method from chapter 4 with additional shape
parameters allows the linearizing principle to be extended to a complete class of tail distributions,
which includes the Gaussian function as a special case. Here, special attention is given to the
generalized Gaussian distribution (GGD) class of probability functions. It provides only one
additional shape parameter, and the goal is thus to realize accurate tail extrapolations for GGDs.
The unknown shape parameter increases computational demand significantly, as it introduces an
additional degree of freedom to the optimization search space. The GGD method is also expected
to be less accurate than the sQN method for the special case of Gaussian tails. This is a natural
generalization effect when tail characteristics are unknown and thus less information is given to
the analysis system. On the other hand, the generalized method offers an outstanding opportunity:
it can identify underlying tail characteristics, and is thus able to tell whether a tail truly behaves
Gaussian-like or not.
In the following sections, the generalization principle is first justified with a short literature
review, and criteria are described to ensure consistency with the existing RJ-DJ model as
introduced in section 2.3.1. Then various classes of generalized distributions and the corresponding
quantile normalization functions are presented. The generalized Gaussian distribution (GGD) is
described together with its advantages compared to other function classes, before it is embedded
into the generalized optimization scheme for tail fitting. This scheme is implemented as an
efficient C++ routine. Performance evaluations are carried out and compared against the sQN method
and other fitting principles. The chapter concludes with a brief summary.
7.1. Introduction to Generalized Tail Fitting

So far, the unbounded random part of a distribution tail has been assumed to be strictly Gaussian,
which is also the case for many practical situations. However, literature also indicates scenarios
where non-Gaussian tails must be handled [2, 37, 40, 59, 93, 126, 135], and thus the Gaussian
assumption does not hold anymore. Such scenarios especially appear in optical high-speed
communications where signals suffer from severe distortions, introduced by intersymbol interference
(ISI) and noise. Here, signal integrity is affected by amplitude noise rather than by timing jitter.
As shown in figure 7.1, bit errors can result from both timing jitter and amplitude noise [82], where
in optical links especially the latter is the dominant cause of erroneous signal recovery [59, 126].
Typical amplitude histograms in optical fiber channels follow a chi-squared distribution where the
critical tails decay very slowly and thus cause bit errors at the receiver side. In order to
minimize the BER, approaches based on maximum likelihood estimators and Viterbi-decoders are
utilized [2, 59, 126]. These approaches deal with non-linear noise properties of the transmission
medium, and require a careful analysis of observed amplitude distributions as well as the use of
generalized tail fitting methods.

[Figure 7.1: eye diagram from 0 UI to 1 UI between the "0" and "1" logic levels, with amplitude noise (∆v) PDFs and timing jitter (∆t) PDFs.]
FIGURE 7.1.: Eye diagram with timing jitter and amplitude noise PDFs [82].

Weinstein [135] was the first to propose a method for approximating the generalized Gaussian
distribution (GGD) class of functions. It extrapolates tails and estimates the BER without
knowledge of the underlying tail shape. Stojanovic [126] described an extrapolation method which
also uses a subset of the GGD class. He also notes that such tails especially appear in long haul
optical fibers and channels that suffer from severe signal distortions.
For short-range wireline communications, linear additive noise sources prevail. Thus, with
a large number of random processes involved, amplitude noise and timing jitter PDFs mostly
follow a Gaussian tail. Nevertheless, in [75, 104] different jitter types are classified and the RJ
section also defines an arbitrary non-Gaussian case which cannot be decomposed by conventional
fitting methods. However, non-Gaussian timing jitter has so far only been described for soliton
transmission, as a result of the Gordon-Haus effect [37, 44].
When focusing on the analysis of non-Gaussian tails, amplitude noise thus clearly dominates
the practical use case. Fortunately, with the definition of the unit interval (UI), fitting methods
are not restricted to the analysis of timing jitter and can equivalently be applied to amplitude
histograms. In this context, an accurate jitter decomposition method for arbitrary RJ shapes is
subsequently developed. It is consistent with the existing RJ-DJ model [104] and hence forms a
logical extension to the commonly accepted modeling approach.
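The method can easily be derived from the generalized sQN scheme in figure 4.6. For this purpose,
only a suitable quantile function for the generic normalization stage must be determined, as
depicted in figure 7.2. Q(p) forms the core of the tail linearization. It may normalize a specific
tail shape, or a whole class of distributions when described by one or more shape parameters (α,
β, . . .). Note that every additional parameter introduces another degree of freedom for the
optimization scheme and thus significantly increases computational demand. Once the tail parameters
have been identified, the timing budget $TJ_{pp}$ can easily be determined at the target BER = $10^{-12}$
with:

\[ TJ_{pp} = t_L + t_R \tag{7.1a} \]
\[ t_L = \mu_L + \sigma_L \cdot Q(10^{-12}/A_L, \alpha, \beta, \ldots) \tag{7.1b} \]
\[ t_R = \mu_R + \sigma_R \cdot Q(10^{-12}/A_R, \alpha, \beta, \ldots) \tag{7.1c} \]

[Figure 7.2: block diagram with the elements CDF(x), scaling ×k, normalization Q = f(α, β, ...), linear regression (outputs s, o and the regression error), and the parameter relations A = 1/k, σ = 1/s, µ = −o/s.]
FIGURE 7.2.: Generalized optimization scheme.

Distributions defined for $x \in (-\infty, +\infty)$, $p \in [0, 1]$:

Gaussian
  PDF: $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
  Quantile normalization: $Q(p) = F^{-1}(p) = -\sqrt{2} \cdot \mathrm{erfc}^{-1}(2p)$

Generalized Gaussian
  PDF: $f(x) = \frac{\alpha}{2 \beta \, \Gamma(1/\alpha)} \, e^{-\left|\frac{x-\mu}{\beta}\right|^{\alpha}}$, with $\beta = \sigma \sqrt{\Gamma(1/\alpha)/\Gamma(3/\alpha)}$ and $\Gamma(u) = \int_0^{\infty} e^{-t} t^{u-1} \, dt$
  Quantile normalization: $Q(p) = \left[ \gamma^{-1}(1/\alpha, |2p-1|) \right]^{1/\alpha} \cdot \sqrt{\Gamma(1/\alpha)/\Gamma(3/\alpha)} \cdot \mathrm{sgn}(2p-1)$, with $\gamma(u, x) = \frac{1}{\Gamma(u)} \int_0^{x} t^{u-1} e^{-t} \, dt$ and sgn(. . .) the sign function

Distributions defined for $x \in [0, \infty)$, $p \in [0, 1]$:

Exponential
  PDF: $f(x) = \frac{1}{\sigma} \, e^{-x/\sigma}$
  Quantile normalization: $Q(p) = -\ln(1-p)$

Generalized Pareto
  PDF: $f(x) = \frac{1}{\sigma} \left( 1 + \alpha \cdot \frac{x-\mu}{\sigma} \right)^{-1-\frac{1}{\alpha}}$
  Quantile normalization: $Q(p) = -\ln(1-p)$ if $\alpha = 0$; $Q(p) = -\frac{1}{\alpha} \left[ 1 - (1-p)^{-\alpha} \right]$ if $\alpha \neq 0$

Generalized Extreme Value
  PDF: $f(x) = \frac{1}{\sigma} (1 + \alpha z)^{-1-\frac{1}{\alpha}} \cdot e^{-(1+\alpha z)^{-1/\alpha}}$, with $z = (x-\mu)/\sigma$
  Quantile normalization: $Q(p) = -\ln(-\ln(p))$ if $\alpha = 0$; $Q(p) = -\frac{1}{\alpha} \left[ 1 - (-\ln(p))^{-\alpha} \right]$ if $\alpha \neq 0$

TABLE 7.1.: Quantile normalization functions for different tail distributions [106].
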
7.1.1. Quantile Normalization Functions

With equation (4.8) a general law has been given to describe the quantile function inside the
optimization scheme. With a probability distribution p=F (x), the quantile normalization Q(p) is
\[ q = Q(p) = F^{-1}(p, \mu=0, \sigma=1, A=1) \tag{7.2} \]
which corresponds to the inverse probability function for an expected tail shape of unit amplitude,
unit standard deviation and zero mean.
The quantile normalization Q(p) may utilize additional shape variables to include the tail char-
acteristics of entire function classes. In table 7.1 Q(p) is listed for various candidates. The Gaus-
sian and generalized Gaussian functions are defined over the complete real axis. That is, they
are symmetric for positive and negative values of x. Other distributions such as the exponential,
generalized Pareto or generalized extreme value PDFs are bounded toward negative values of x.
The PDF definitions use a positive range for x to denote the right sided normalization function,
which is applied to positive bathtub tails with p=F (x). If measured tails belong to the domain
of attraction of the normalizing function, the optimization scheme from figure 7.2 is thus able to
identify a best suited set of tail parameters.
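
The closed-form entries of table 7.1 can be written down directly. The following Python sketch is only an illustration (the thesis implementation is in C++); the function names are hypothetical, and the Gaussian case relies on `statistics.NormalDist.inv_cdf` from the Python standard library instead of an explicit inverse complementary error function.

```python
import math
from statistics import NormalDist


def q_gaussian(p):
    """Gaussian quantile normalization, equal to -sqrt(2) * erfcinv(2p)."""
    return NormalDist().inv_cdf(p)


def q_exponential(p):
    """Exponential quantile normalization for unit scale."""
    return -math.log(1.0 - p)


def q_gen_pareto(p, alpha):
    """Generalized Pareto quantile normalization with shape alpha."""
    if alpha == 0.0:
        return -math.log(1.0 - p)
    return -(1.0 - (1.0 - p) ** (-alpha)) / alpha


def q_gen_extreme_value(p, alpha):
    """Generalized extreme value quantile normalization with shape alpha."""
    if alpha == 0.0:
        return -math.log(-math.log(p))
    return -(1.0 - (-math.log(p)) ** (-alpha)) / alpha
```

Each function returns the normalized quantile q for a probability p in (0, 1); inserting q back into the corresponding normalized CDF reproduces p.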
[Figure 7.3: GGD PDF curves for α=1.0, α=2.0 and α=100 over a horizontal axis from −5 to 5 (labelled σ). Special shapes: α → 0 Dirac impulse; α = 0.5 Gamma; α = 1.0 Laplace; α = 2.0 Gaussian; α → ∞ uniform.]
FIGURE 7.3.: Special GGD shapes.

The presented distribution classes give an outline of possible tail characteristics, which may
exhibit power-law, exponential or Gaussian-like behavior instead of the pure Gaussian case. The
correct choice for an expected tail shape is crucial as it highly influences extrapolation results.
Generalized extreme value (GEV) distributions and generalized Pareto (GP) distributions are com-
monly used in extreme value theory, and their use for nonparametric tail extrapolation and thresh-
old models in general has been discussed extensively in literature [19, 115, 117]. Nevertheless,
here we focus on the generalized Gaussian distribution (GGD), which offers two major advan-
tages compared to other generalizations. First, it represents a simple and direct generalization
of the pure Gaussian function. With only one additional shape parameter α, it includes impor-
tant special cases such as the Gaussian (α=2) or the exponential (α=1) tails. Second, it is fully
consistent with the commonly accepted RJ-DJ model from [104], and thus allows for decompos-
ing a total distribution into its non-Gaussian random and bounded deterministic components. In
fact, the additional shape parameter α extends the existing model with the ability of tail shape
characterization.
7.1.2. Generalized Gaussian Distribution

In this section, properties of the generalized Gaussian distribution (GGD) as a generic function
for tail fitting and extrapolation are described. As the name already indicates, it puts the Gaussian
distribution into a general context, where an additional shape parameter α defines the exponential
rate of decay:
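\[ \mathrm{PDF}(x) = f(x, \alpha, \beta, \mu) = \frac{\alpha}{2 \beta \, \Gamma(1/\alpha)} \, e^{-\left|\frac{x-\mu}{\beta}\right|^{\alpha}} \tag{7.3a} \]
\[ \beta = \sigma \sqrt{\frac{\Gamma(1/\alpha)}{\Gamma(3/\alpha)}}, \qquad \Gamma(u) = \int_0^{\infty} e^{-t} t^{u-1} \, dt \tag{7.3b} \]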
where [−∞ < x < ∞] and α > 0. Γ(u) is the gamma function, which is required for amplitude
normalization so that α can be varied independently. Note that β is a dependent scale parameter,
directly related with σ and α, and hence, it does not influence the tail shape.
The advantages of this representation are that only the α parameter defines the tail shape
and that the GGD class covers several special function types. With α=2 the normal distribution is
obtained, while α=1 yields the Laplace distribution with exponential tails, and α=0.5 the heavy
tailed Gamma distribution. A small shape value yields an impulsive function with slowly decaying
tails, while a large value leads toward the uniform distribution. This behavior is also depicted in
figure 7.3. The α parameter thus offers a flexible way for representing a generic distribution class
with exponential tail behavior.
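
As a small illustration of equation (7.3), the following Python sketch evaluates the GGD PDF with the standard library only. The function name `ggd_pdf` and its default arguments are assumptions made for this example.

```python
import math


def ggd_pdf(x, alpha, sigma=1.0, mu=0.0):
    """Generalized Gaussian PDF according to equations (7.3a) and (7.3b)."""
    if alpha <= 0.0 or sigma <= 0.0:
        raise ValueError("alpha and sigma must be positive")
    # Dependent scale parameter beta, so that sigma stays the standard deviation.
    beta = sigma * math.sqrt(math.gamma(1.0 / alpha) / math.gamma(3.0 / alpha))
    # Amplitude normalization with the gamma function.
    norm = alpha / (2.0 * beta * math.gamma(1.0 / alpha))
    return norm * math.exp(-abs((x - mu) / beta) ** alpha)
```

With α=2 the function reduces to the normal PDF, and with α=1 to the Laplace PDF with scale σ/√2, which gives a quick plausibility check of the β normalization.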
The quantile normalization for GGD functions is derived as the inverse of the CDF, which is
obtained from the PDF integral:

\[ p = \mathrm{CDF}(x) = F(x, \alpha, \sigma, \mu) = \frac{1}{2} \left[ 1 + \mathrm{sgn}(x-\mu) \cdot \gamma\!\left( \frac{1}{\alpha}, \left|\frac{x-\mu}{\beta}\right|^{\alpha} \right) \right] \tag{7.4a} \]
\[ \gamma(u, x) = \frac{1}{\Gamma(u)} \int_0^{x} t^{u-1} e^{-t} \, dt \tag{7.4b} \]
where γ(u, x) is the incomplete gamma function and sgn(. . .) the sign function. The inverse CDF
can be written as:
\[ F^{-1}(p, \alpha, \sigma, \mu) = \left[ \gamma^{-1}\!\left( 1/\alpha, |2p-1| \right) \right]^{1/\alpha} \cdot \beta \cdot \mathrm{sgn}(2p-1) + \mu \tag{7.5} \]
For the quantile normalization Q(p, α)=F −1 (p, µ=0, σ=1) the case with unit standard deviation
and zero mean must be considered, and thus:
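\[ Q(p, \alpha) = \left[ \gamma^{-1}\!\left( 1/\alpha, |2p-1| \right) \right]^{1/\alpha} \cdot \sqrt{\frac{\Gamma(1/\alpha)}{\Gamma(3/\alpha)}} \cdot \mathrm{sgn}(2p-1) \tag{7.6} \]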
The resulting equation describes a transform which allows for the linearization of arbitrary GGD
tails. When inserted into the optimization scheme from figure 7.2 we have four unknown variables:
α, A, σ and µ. This four-dimensional search space has to be handled by an efficient search algorithm,
as described in the subsequent section.
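
Equation (7.6) can be evaluated numerically as in the following minimal Python sketch. It is not the fast realization discussed in the next section: as a simplifying assumption, the incomplete gamma function from equation (7.4b) is evaluated with its power series and inverted by plain bisection, which is slow but easy to verify. The names `reg_lower_gamma` and `ggd_quantile` are illustrative only.

```python
import math


def reg_lower_gamma(a, x, tol=1e-15, max_terms=10000):
    """Incomplete gamma function gamma(a, x) of equation (7.4b), power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > tol * abs(total) and n < max_terms:
        n += 1
        term *= x / (a + n)
        total += term
    log_prefactor = a * math.log(x) - x - math.lgamma(a)
    return min(1.0, total * math.exp(log_prefactor))


def ggd_quantile(p, alpha):
    """Q(p, alpha) of equation (7.6) for unit standard deviation and zero mean."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if p == 0.5:
        return 0.0
    a = 1.0 / alpha
    target = abs(2.0 * p - 1.0)
    # Bracket the solution y of gamma(1/alpha, y) = |2p - 1|.
    lo, hi = 0.0, 1.0
    while hi < 256.0 and reg_lower_gamma(a, hi) < target:
        hi *= 2.0
    # Plain bisection (assumption: robustness preferred over speed).
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(a, mid) < target:
            lo = mid
        else:
            hi = mid
    y = 0.5 * (lo + hi)
    beta = math.sqrt(math.gamma(a) / math.gamma(3.0 * a))
    return math.copysign(beta * y ** (1.0 / alpha), 2.0 * p - 1.0)
```

For α=2 the result agrees with the Gaussian quantile, and for α=1 with the closed-form Laplace quantile, so both special cases can serve as reference values.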
Equivalent to the analysis of test distributions with Gaussian tails in section 3.2.2, GGD test
distributions must also be created in order to analyze and compare the estimation performance
of tail fitting algorithms. Therefore, the composition principle with RJ and DJ can be reused by
replacing the Gaussian RJ generator with a GGD jitter source. The additional shape parameter is
denoted as αRJ , and all the prior performance metrics such as estimation loss EL or median error
Emed can be reused.
An important issue relates to correct GGD jitter generation. Since the quantile normalization
function describes a probability p∈[0, 1] in terms of the amplitude of a normalized random vari-
able, one can use equation (7.5) for random sample generation. Figure 7.4 demonstrates this
principle, where a uniform random process Juni generates jitter samples which are used as input
to the GGD normalization function. The required standard deviation σ is obtained by scaling,
while a non zero mean µ yields additional data offset.
[Figure 7.4: a uniform random source Juni on [0, 1] feeds the block Q(t, α) · σ + µ, which outputs the GGD jitter Jggd.]
FIGURE 7.4.: GGD random generator.
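
The generator principle of figure 7.4 corresponds to inverse transform sampling. The sketch below illustrates it in Python for two special cases whose quantile functions are available without numerical inversion: the Gaussian case (α=2) via `statistics.NormalDist` and the Laplace case (α=1) in closed form. The function names, the seed handling and the choice of these two cases are assumptions made for this example; an arbitrary α would require a numerical Q(p, α).

```python
import math
import random
from statistics import NormalDist


def q_laplace_unit(p):
    """Q(p, alpha=1): Laplace quantile with unit standard deviation."""
    if p < 0.5:
        return math.log(2.0 * p) / math.sqrt(2.0)
    return -math.log(2.0 * (1.0 - p)) / math.sqrt(2.0)


def generate_jitter(quantile, n, sigma, mu, seed=0):
    """Map uniform samples through a quantile function, then scale and shift."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        u = rng.random()
        if u == 0.0:
            continue  # the quantile is undefined at p = 0
        samples.append(quantile(u) * sigma + mu)
    return samples


gaussian_rj = generate_jitter(NormalDist().inv_cdf, 20000, sigma=0.05, mu=0.1, seed=1)
laplace_rj = generate_jitter(q_laplace_unit, 20000, sigma=0.05, mu=0.1, seed=1)
```

Both sample sets should show a mean close to µ and a standard deviation close to σ, since the normalization function refers to unit standard deviation and zero mean.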
7.2. Implementation of Algorithm

An efficient implementation of the quantile normalization for GGD shapes requires a fast real-
ization of the inverse incomplete gamma function γ −1 . Equation (7.6) cannot be expressed as a
closed form equation. This means, Q(p, α) has to be implemented as an iterative approximation,
which requires considerable computational demand. Another problem relates to the large amount
of jitter samples required for performance analyses. If test distributions shall include up to several
million random samples, they must be generated very quickly. Therefore, this section especially
focuses on speed optimization for the GGD jitter source and the optimization scheme in figure 7.2,
which faces the four-dimensional search space.
In [26] the $\gamma^{-1}$ function has been realized very efficiently with a third-order Schröder iteration,
supported by the Newton-Raphson method. A target accuracy of $\varepsilon_{rel} = 10^{-6}$ is typically reached
after the second or third iteration step. In addition, the included complete gamma (7.3b) and
incomplete gamma (7.4b) functions are realized with minimax rational approximations and a uniform
asymptotic expansion, respectively. This yields an excellent computational efficiency.
The described functions have been realized as C++ routines, where one million γ −1 function
calls require approximately three seconds of simulation time on an Intel Core Duo 2.2GHz laptop.
This is still not fast enough, neither for tail parameter search with the scheme in figure 7.2, nor for
the GGD random generator in figure 7.4. Especially when gathering millions of random samples,
test distributions have to be generated significantly faster. Therefore, additional minimax
approximations of Q(p, α) have been realized, which support certain discrete α values and achieve an
additional speed-up of more than one order of magnitude. As selectable grid values, they cover
shapes in the range α=[1, 10] with logarithmically scaled distance and a maximum relative error
of $\varepsilon_{rel} = 10^{-6}$. These minimax approximations allow for speeding up the optimization process
with a fast initial search grid and thus also achieve a quick generation of random values.
An efficient realization of the optimization scheme in figure 7.2 requires a fast identification
of the global error minimum. The regression stage fits a simple linear function to the tail part of
n outermost tail samples by reusing the recursions (4.5) of the sQN method from section 4.1.1.
The regression is thus already optimized with respect to fast line slope and offset recovery. The
regression error σ̂err is the root mean square error of fitted data pairs (qi , xi ):

\[ \hat{\sigma}_{err} = \sqrt{\frac{\sum_{i=1}^{n} \left( q_i - o - s \cdot x_i \right)^2}{n-2}} \tag{7.7} \]
The two remaining parameters scaling factor k and shape α obviously define the quantiles, here
given as qi values. Both parameters have to be identified in the context of an additional outer
optimization, characterized by another two dimensional minimum search.
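
The regression stage and equation (7.7) can be illustrated with a batch least-squares fit. This Python sketch is only an assumption-laden stand-in: it does not use the recursions (4.5) of the sQN method, and the function names are hypothetical. The parameter relations σ=1/s and µ=−o/s are taken from figure 7.2.

```python
import math


def fit_tail_line(x, q):
    """Fit q = o + s * x and return slope s, offset o and regression error (7.7)."""
    n = len(x)
    if n < 3 or n != len(q):
        raise ValueError("need at least three matching data pairs")
    mean_x = sum(x) / n
    mean_q = sum(q) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxq = sum((xi - mean_x) * (qi - mean_q) for xi, qi in zip(x, q))
    s = sxq / sxx
    o = mean_q - s * mean_x
    sse = sum((qi - o - s * xi) ** 2 for xi, qi in zip(x, q))
    sigma_err = math.sqrt(sse / (n - 2))
    return s, o, sigma_err


def tail_parameters(s, o):
    """Recover sigma = 1/s and mu = -o/s from the fitted line."""
    return 1.0 / s, -o / s


# Synthetic, perfectly linear tail in Q-domain with sigma = 0.05 and mu = 0.02.
x_tail = [0.10, 0.15, 0.20, 0.25, 0.30]
q_tail = [(xi - 0.02) / 0.05 for xi in x_tail]
s, o, sigma_err = fit_tail_line(x_tail, q_tail)
sigma, mu = tail_parameters(s, o)
fitness = sigma_err / s  # ratio used as fitness measure for a fitted line
```

For the synthetic tail the regression error vanishes up to rounding and the original σ and µ are recovered; measured tails yield a non-zero error which depends on the chosen k and α.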
In section 4.3 the ratio T̂ =σ̂err /s was also introduced as a fitness measure, suggested by
Scholz [117] for judging the appropriateness of a fitted line. For GGD tail extrapolation, this
criterion now drives the outer optimization. The optimization process is composed of two steps.
An initial search grid identifies a coarse error minimum, which is then refined by a search routine.
The initial search grid locates the global minimum, where the tail length n is pushed in as far as
possible. This corresponds to the same algorithmic principle as already described in section 4.3.
For each of the (k, α) grid values the T̂ (n) minimum is determined as a function of tail length
n, and the grid value with maximum tail length is selected. The search grid is logarithmically
scaled with a distance factor of ∆k=1.2 for both k and α variables. The investigated intervals
are $k = [10^1, 10^3]$ and $\alpha = [1, 10^1]$, while $\Delta P_t \geq 10^2$ is used as minimum tail interval for outlier
suppression. The resulting two-dimensional surface is thus a function of k and α, and typically forms
a narrow valley where the global minimum is hardly distinguishable along the bottom course. A
typical example is given in figure 7.5.
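
A logarithmically scaled grid with distance factor ∆k=1.2 can be generated as sketched below in Python. The helper name `log_grid` and the convention of starting at the lower interval bound and stopping at the last value that does not exceed the upper bound are assumptions; the source only states the factor and the intervals.

```python
def log_grid(start, stop, factor):
    """Return start * factor**i for i = 0, 1, ... while the value is <= stop."""
    values = []
    i = 0
    while True:
        value = start * factor ** i
        if value > stop * (1.0 + 1e-12):
            break
        values.append(value)
        i += 1
    return values


k_grid = log_grid(10.0, 1000.0, 1.2)      # scaling factor k
alpha_grid = log_grid(1.0, 10.0, 1.2)     # shape parameter alpha
grid = [(k, alpha) for k in k_grid for alpha in alpha_grid]
```

Under these assumptions the α grid contains the values 1.0, 1.2, 1.44, 1.73, 2.07, ... up to about 8.9, so each grid pair needs one evaluation of the regression error as a function of tail length n.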
The second optimization step is a local refinement of the resulting plane, carried out with a
bounded search algorithm. Starting with the initial grid pair (kin , αin ), the bounds are:
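\[ \alpha_{lo} \leq \alpha_{in} \leq \alpha_{up}, \quad \alpha_{lo} = \max(1, \alpha_{in}/\Delta k), \quad \alpha_{up} = \min(10, \alpha_{in} \cdot \Delta k) \tag{7.8} \]
\[ k_{lo} \leq k_{in} \leq k_{up}, \quad k_{lo} = \max(1, k_{\alpha_{lo}}), \quad k_{up} = \min(10^3, k_{\alpha_{up}}) \]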
where kαlo and kαup are determined from the search grid as the k-minima which appear at the
αlo and αup values respectively. The refinement stage finally determines the minimum values
(kopt , αopt ) on the parameter surface. It uses the BOBYQA algorithm [113], which showed good
performance and a fast convergence compared to other search algorithms from the NLopt [67]
nonlinear optimization library. As convergence criteria, a minimum relative parameter variation of
$x_{tol} \leq 10^{-5}$ or a maximum function count of three hundred iterations is used.

[Figure 7.5: surface plot of T̂ on a logarithmic axis over the logarithmically scaled parameters k and α.]
FIGURE 7.5.: T̂ surface for two-dimensional minimum search with initial search grid.
The overall optimization converges sufficiently fast, especially due to the minimax function
approximations of the initial search grid. The refinement step typically requires 30-40 function
calls. As with the sQN method in section 4.1.4, computational demand again depends on the time
resolution R, since the algorithm has to process all distribution bins. For $R_{sim} = 3.3 \cdot 10^5$ and the
same test distribution as for the performance analysis in figure 4.9 (ADJ =0.2 UI, σRJ =0.05 UI,
αRJ =2), the 2.2 GHz laptop typically requires approximately one minute, which is acceptable for
model simulations. In order to speed up the optimization in subsequent analyses, the simulator
time resolution is reduced to some degree ($R_{sim} = 10^5$) without significant influence on the error.
7.3. Performance Analysis

This section focuses on performance evaluations for the GGD fitting method. For this purpose, the
same analyses are carried out as for the sQN method. The GGD test distributions utilize the parameters
defined in section 3.2.2 (ADJ , σRJ ), with the additional shape parameter αRJ for the generalized
RJ component. This allows for a simple performance comparison with other methods, while the
GGD random source is realized in a consistent manner according to figure 7.4. Note that the GGD
fitting method is not restricted to the analysis of timing jitter. It can also be applied to amplitude
histograms, which is the major use case, as already discussed in the introductory section.
The analysis in terms of the unit interval (UI) is thus only used for a consistent comparison of
different algorithms.
The GGD performance evaluation starts by describing influences of the additional shape parameter
on the estimation error. Both software modeling and hardware scenarios with coarse
time resolution are considered. A brief performance comparison with different estimation principles
and an existing generalized fitting method is also provided. This method was originally
suggested by Weinstein [135], and is based on a simple double-log scale.
7.3.1. Software Model Simulations

Since system level simulations play an important role for transceiver designs, first the performance
of the GGD method is investigated when used together with software models. In a simulator, the
time resolution Rsim or number of bins per unit interval can be selected arbitrarily and thus, will
not influence the accuracy of a tail fitting method. The same high-speed transceiver as for the sQN
method is assumed, running at 3Gb/s and a slightly reduced simulator time resolution of 3.3 fs.
With $R_{sim} = 10^5$ and a distribution sample size of $N = 10^7$, tails can be fitted sufficiently accurately,
which corresponds to an extrapolation ranging over five orders of magnitude.

[Figure 7.6: six panels. (a) Emed , (b) EL , (c) Emed,α for αRJ ={1.0, 1.44, 2.0}; (d) Emed , (e) EL , (f) Emed,α for αRJ ={3.0, 6.2, 8.9}. Legend: one curve per αRJ value for the GGD method and one per known shape α0 =αRJ .]
FIGURE 7.6.: Median error Emed , estimation loss EL , and shape error Emed,α over varying test
distribution shape: $N = 10^7$, $\Delta P_t = 10^2$, uniform DJ, $R_{sim} = 10^5$, K=250.

In figure 7.6 the estimation performance of the GGD tail fitting method is investigated with
respect to varying jitter ratio σRJ /ADJ and αRJ . The solid lines with filled markers denote the
proposed GGD method, while dashed lines show the results obtained when αRJ is already known
to the search algorithm. In this case the constant α0 =αRJ replaces the shape variable and yields a
reduced search space for the fitting algorithm. The special case of α0 =αRJ =2 for example, gives
the simplified form of Gaussian tail fitting with scaled Q-normalization from chapter 4.
At αRJ =1, the error bias Emed tends to be negative and thus, to slightly underestimate the TJ
values. This is not the case for known α0 =αRJ , which is again due to the asymptotic tail behavior
in quantile domain. For larger αRJ values, the median error of the GGD method also becomes
positive, which is an effect caused by very steep tails.
Over all the investigated distribution shapes, EL yields a worst case error which is less than
2.5%. If αRJ >3, the estimation loss also shows that the generalized method can directly compete
with the known α0 scenario. Unfortunately, this is due to highly overestimated shape values, as can
be seen in figures 7.6(c) and 7.6(f). Here, the median shape error Emed,α is calculated equivalent
to Emed , using estimated shape values:
\[ E_{\alpha,k} = \alpha_{est,RJ}/\alpha_{true,RJ} - 1, \qquad k = 1, \ldots, K \tag{7.9} \]
\[ E_{med,\alpha} = \mathrm{median}\{E_{\alpha,k}\} \tag{7.10} \]
Especially with steep tails, either realized by small jitter ratios σRJ /ADJ or a large αRJ , the
algorithm rather overestimates the true shape instead of correctly fitting a small tail amplitude.
Therefore, an exact tail shape identification of distributions can only be carried out in a very
limited sense.
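
Equations (7.9) and (7.10) translate directly into a few lines of code. The Python sketch below uses hypothetical function names and made-up shape estimates purely for illustration.

```python
from statistics import median


def shape_errors(alpha_est, alpha_true):
    """Relative shape errors E_alpha,k of equation (7.9) for K estimates."""
    return [a / alpha_true - 1.0 for a in alpha_est]


def median_shape_error(alpha_est, alpha_true):
    """Median shape error E_med,alpha of equation (7.10)."""
    return median(shape_errors(alpha_est, alpha_true))


# Made-up example: five shape estimates for a Gaussian test distribution.
e_med_alpha = median_shape_error([2.0, 2.2, 2.6, 1.9, 3.0], alpha_true=2.0)
```

A positive value indicates that the shape is overestimated in the median, as observed for steep tails.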
[Figure 7.7: three panels, (a) Emed , (b) EL , (c) Emed,α . Legend: $N = 10^5$, $10^6$, $10^7$, $10^8$, each for the GGD method and for the known shape case α0 =αRJ .]
FIGURE 7.7.: Emed , EL , Emed,α for various distribution shapes equivalent to figure 7.6, with
constant shape parameter αRJ =2.0 (Gaussian case) and varying sample size
$N = \{10^7, 10^8, 10^9\}$, K=250.

[Figure 7.8: three panels, (a) Emed , (b) EL , (c) Emed,α . Legend: $R = R_{sim} = 10^5$, R=1024, R=128, each for the GGD method and for the known shape case α0 =αRJ .]
FIGURE 7.8.: Emed , EL , Emed,α for various distribution shapes equivalent to figure 7.6, with
constant shape parameter αRJ =2.0 (Gaussian tails) and varying time resolution
$R = \{10^5, 1024, 128\}$, $N = 10^7$, K=250.

Acceptable results are obtained with αRJ ≤2 and large jitter ratios σRJ /ADJ ≥0.5 (RJ dominant
case). A possible way to extend this application range is to increase the sample size N , which has
been kept comparably small so far. Therefore, in figure 7.7 the αRJ =2.0 case is again investigated
with respect to varying N . Although a significant improvement can be achieved for both Emed
and EL , αRJ still remains highly overestimated. Also note that an exponential increase of the
sample size can soon lead to unacceptable simulation or measurement times.
7.3.2. Hardware Model Simulations

Hardware models for jitter measurement systems additionally have to consider a very limited time
resolution $R \ll R_{sim}$ of collected jitter distributions. This limitation can strongly affect the
extrapolation error. Since the GGD fitting method already performed poorly in terms of correctly
identifying the tail shape α, a strong performance degradation for the hardware scenario is expected as well.
Figure 7.8 demonstrates the effect on estimation performance with two typical resolutions
R=1024 and R=128. Besides an increased EL , a lower bound for the minimum jitter ratio
is also introduced by R, as can be seen at $\sigma_{RJ}/A_{DJ} \approx 3 \cdot 10^{-3}$ (R=1024) and $\approx 2 \cdot 10^{-2}$ (R=128).
This is due to the insufficient number of bins located on a measured bathtub tail. Compared to
the scenario with constant and known α0 (dashed lines), the generalized fitting method performs
significantly worse. Especially with R=128 a large peak toward negative Emed is noticed. This
behavior is highly undesired as it leads to optimistic TJ estimates. The error peak stems from a
large jump of shape estimates in figure 7.8(c), which also marks the critical jitter ratio limit
where the GGD method starts to fail.

[Figure 7.9: six panels of EL : (a) αRJ =1.0, (b) αRJ =1.20, (c) αRJ =1.44, (d) αRJ =1.73, (e) αRJ =2.0, (f) αRJ =3.0. Legend: GGD method; Weinstein; α0 =const.=αRJ ; α0 =const.=2 (sQN method).]
FIGURE 7.9.: Estimation loss EL over varying jitter ratio σRJ /ADJ . The test distributions
are the same as in figure 7.6, with αRJ ={1.0, 1.2, 1.44, 1.73, 2.0, 3.0}, $N = 10^7$,
$R_{sim} = 10^5$ and K=250.

7.3.3. Comparison with Other Methods

The developed GGD fitting method is briefly compared with special scenarios including the known
α0 case, the pure Gaussian tail assumption, as well as Weinstein’s method [135] for GGD tail
approximation. This method simply transforms measured tails into a double-log scale, where
regression lines can be used for extrapolation. Weinstein showed with the asymptotic expansion
of the probability function (equation (7.3a)), that the double-log scale is asymptotically linear for
GGD tails. The resulting method is thus simple and very fast.
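
The double-log principle can be illustrated with an idealized tail of the form p(x) = exp(−(x/β)^α), for which ln(−ln p) = α·ln x − α·ln β is exactly linear in ln x. The Python sketch below fits this line and extrapolates it. It is not Weinstein's published procedure: the idealized tail model, the function names and the example values are assumptions, and real GGD tails are linear in this scale only asymptotically.

```python
import math


def double_log_fit(x, p):
    """Fit ln(-ln p) = slope * ln x + intercept by least squares."""
    u = [math.log(xi) for xi in x]
    v = [math.log(-math.log(pi)) for pi in p]
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    suu = sum((ui - mean_u) ** 2 for ui in u)
    suv = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    slope = suv / suu
    return slope, mean_v - slope * mean_u


def extrapolate_x(slope, intercept, p_target):
    """Return the x value at which the fitted line reaches probability p_target."""
    return math.exp((math.log(-math.log(p_target)) - intercept) / slope)


# Idealized example tail with shape 1.44 and scale 0.05 (made-up values).
alpha_true, beta_true = 1.44, 0.05
x_tail = [0.10, 0.15, 0.20, 0.25, 0.30]
p_tail = [math.exp(-((xi / beta_true) ** alpha_true)) for xi in x_tail]
slope, intercept = double_log_fit(x_tail, p_tail)
x_at_target = extrapolate_x(slope, intercept, 1e-12)
```

In this idealized case the slope equals the shape α. For measured tails the remaining curvature in the double-log scale is one plausible source of the approximation error discussed below.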
In figure 7.9 the performance of the developed GGD method is compared with I) Weinstein’s
method, II) known α0 =αRJ shape parameter and III) Gaussian tail assumption with α0 =2 (sQN
method). This last scenario demonstrates how sensitive the Gaussian fitting method is with respect
to varying tail shapes, and indicates the importance of either correct α0 choice, or application of
the generalized GGD method with unknown tail characteristics.
The scenario with known shape α0 =αRJ achieves best performance over all three shape values
αRJ ={1.0, 2.0, 3.0}. As an additional advantage, it tends to overestimate the true TJ values and
thus always guarantees a pessimistic tail extrapolation. The GGD tail fitting method nearly
achieves the same accuracy, but generally underestimates the true value and thus yields slightly
optimistic TJ values.
Weinstein’s method generally overestimates the true TJpp value at large jitter ratios σRJ /ADJ
and underestimates it at small ones. This makes the method only applicable down to a certain
limit. The error EL is quite large due to the approximation of GGDs in double-log domain, but
the method is especially simple and thus, very fast.
The performance comparison with the sQN method highlights a possible application of the
generalized method, which is for tail shapes in the interval αRJ =[1, 2]. Here, the sQN
method becomes highly optimistic with negative error bias, because it always assumes Gaussian
tail behavior. The test distributions instead are heavy tailed and thus follow a flat course which
cannot be tracked correctly by sQN. If σRJ /ADJ is sufficiently large and the TJ error must be
strictly positive, Weinstein’s method is better suited, because it then guarantees pessimistic
tail extrapolations.
7.4. Summary
A generic optimization scheme for non-Gaussian tail fitting was presented, which led to a jitter
analysis method for generalized Gaussian distributions (GGDs). For this purpose, the Gaussian
quantile normalization principle from chapter 4 was reused. The method is fully consistent with the
existing RJ-DJ model and utilizes only one additional shape parameter to describe different tail
characteristics.
In simulations the proposed method correctly fits and extrapolates tails that belong to the GGD
class of functions, although it is clearly outperformed by the case of known shape parameter α. The
major drawback of the generalized method results from the optimistic extrapolation property. This
is due to a general overestimation of shape values, especially at α=2 and α=3, as shown in
figure 7.6. This problem can only be dealt with by a significant increase of the sample size N (figure 7.7).
In hardware scenarios the limited time resolution R introduces additional error as well as a lower
bound for analyzable jitter ratios σRJ /ADJ (figure 7.8). Both effects further augment the opti-
mistic error nature of the GGD method.
A comparison of different methods was also carried out in figure 7.9. It clearly shows the
performance advantage of the generalized method with heavy tailed distributions (αRJ <2) in
comparison with the sQN method from chapter 4. If a pessimistic tail extrapolation with αRJ ≥1
is the crucial design criterion, Weinstein’s method from [135] is the best suited choice.
The proposed method has been implemented using a minimum of constraints in order to facili-
tate a broad application field. Improvements can possibly be realized by considering only specific
shapes or narrow parameter ranges, and by combining them with more suitable optimization
criteria and fitness measures than the ones utilized here.
Summarizing these results, the GGD method must be utilized very carefully when characteriz-
ing unknown tail shapes. In simulation scenarios it provides accurate extrapolations, although with
an undesired optimistic error bias. A reduction of the number of bins R as required for hardware
applications is not recommended, since obtained results become highly unreliable. Summarized
parts of this chapter have also been published in [C5].
8. An Accurate Behavioral Model for
High-Speed PLLs
To highlight the practical aspect of jitter analysis methods, a typical application with system
modeling and simulation is presented. For this purpose, an accurate behavioral model of a high-speed
transceiver is implemented. It has also been realized as test structure [62], and the goal is to analyze
its jitter behavior as typically required for system development and verification. The transceiver is
modeled as system level model according to the top-down methodology [63], using SystemC as a
C++ based library [38, 41, 61]. In this methodology, system blocks are first brought into an abstract
design, and then successively refined down to the desired accuracy.
The model realizes a charge-pump PLL (CPLL) for high-speed clock and data recovery at
3Gb/s, and is intended for the S-ATA communication standard [118]. It uses accurate transient
simulations to analyze and predict the jitter behavior in terms of phase noise PSDs and jitter trans-
fer functions. As an enhanced version of a prior event driven approach [50], it affords accurate
analysis of jitter generation and propagation effects inside a CPLL.
In the following, the basic CPLL model is described together with prior modeling approaches
before proceeding to an enhanced event-driven model for accurate behavioral simulations. In the
analysis section, phase noise spectra, time domain parameters and jitter transfer characteristics are
derived and compared with measurements from the test structure, to demonstrate applicability of
the model, as well as the tail fitting method from chapter 4.
8.1. Modeling Principle

In this section an accurate model for jitter and phase noise analysis of a typical CPLL structure is
developed. It is intended as clock and data recovery (CDR) unit for serial high-speed interfaces.
In figure 8.1 the functional block scheme is depicted, running at a serial data rate of 3Gb/s. The
CPLL is composed of an Alexander type bang-bang phase detector (BB-PD), a charge-pump, a
second order passive loop filter, and a voltage controlled oscillator (VCO) which is preceded by a
gain regulator.
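FIGURE 8.1.: Functional block scheme of the CPLL. (Block labels: data in, BB-PD with up and
down outputs, charge-pump sources Icp with pulse current ip, loop filter LF with R0, C0, C1 and
voltages VC0, VC1, gain regulator gr with pole fc and output VO, VCO, output.)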
Only transient simulations are able to accurately reflect the true time domain behavior and cope
with the non-linear loop dynamics. For such simulations, two different modeling principles have
been developed in recent years. The first one is based on event-driven concepts [1, 21, 46, 50, 77],
where the analog part of the CPLL (i.e. the loop filter) is replaced by a set of non-linear recursive
equations. These are exact solutions to the loop filter difference equations. Hence, they are able
to determine the exact time instant of the subsequent VCO clock, which transforms the analog
circuit part into an event-driven block. Since all other circuit components are digital blocks, the
complete CPLL can thus be simulated as an event-driven system, which is typically characterized
by its non-uniform simulation time steps.
The second principle [107,108] instead, uses uniform time steps for simulation. The analog loop
filter is initially converted into discrete-time via impulse-invariance or bilinear transform. The
quantization noise of asynchronous events is subsequently considered by varying the amplitude
of digital signals according to the location of the transition edge between sampling instances.
This yields a highly accurate signal representation in discrete-time, combined with a fast model
implementation which has also found use in modern PLL simulators [109].
In this section an enhanced event-driven model is implemented according to the first model-
ing technique as explained in [46, 50]. Although the principle from [107] might afford very fast
simulations and thus be an intuitive choice, it does not allow for a dynamic variation of model
components. This is an important point, since the present CPLL model is only a part of an overall
transceiver structure which requires a careful stability analysis. Especially when changing be-
tween different operating modes, such as startup phase and normal lock-to-data operation, a stable
PLL behavior must always be guaranteed. The event-driven model uses the actual physical state
of analog filter components to recursively determine successive states. Thus, it directly reflects
the physical behavior of the PLL loop filter at any calculated time instant, which also allows for
dynamic filter variations. This way, one can easily switch between different operating modes. A
discrete-time filter instead cannot include dynamic variations, because its state variables do not
reflect the actual physical state.
As modeling environment, the SystemC [41, 61] library affords simple system
level simulations, and can directly embed the developed jitter analysis methods. In the following
subsections a closer look at the event-driven model is given. The basic model is based on an exact
solution for the 3rd order CPLL [50], and is here enhanced with a noise model for the VCO [122],
an additional parasitic pole for the gain regulator, and some non-ideal effects caused by the BB-PD.
8.1.1. Basic Event-Driven Model

In order to construct an event-driven CPLL model, a delay based signal description for each of the
blocks in figure 8.1 is required. This is easily achieved with pure digital blocks, where the output
only changes when input signal events occur. The BB-PD for example, only generates a logical up
or down pulse when triggered by the VCO clock. According to the phase difference between input
data and VCO clock, it sets the up or down signals at the output accordingly. The charge-pump
then converts these signals into current pulses ip of defined amplitude.
The critical component is the analog loop filter together with the VCO. A mathematical solution
is required to calculate the next time instant where the VCO completes the clock cycle and thus,
triggers the BB-PD over the feedback path. Once this time instance is known, the simulation
time progress can be fully described in terms of an event list which is handled by the SystemC
scheduler. The mathematical description of the analog circuit starts by assuming a linear relation
between VCO frequency and the control voltage VO of the loop filter output:
\[ f_{vco}(V_c) = K_v \cdot V_O(t) + f_0 \tag{8.1} \]
where Kv is the linear frequency slope and f0 the zero voltage frequency. In a practical VCO, this
simple relation is hardly valid for the complete tuning curve of the oscillator. However, in lock-to-
data mode it is constantly driven at the transmission rate and thus, must only be valid for a local
operating point. The oscillator output phase can be expressed as integral of the VCO frequency,
and hence:
\[ \Theta_{vco}(t) = \Theta_{vco}(t_0) + 2\pi \int_{t_0}^{t_0+t} \left[ K_v V_O(\tau) + f_0 \right] d\tau \tag{8.2} \]
The charge-pump generates discrete current pulses for a second order loop filter. These current
pulses are assumed as constant ip ∈ {+Icp , 0, −Icp }, without depending on the loop filter voltage
or other non-linear effects. The behavior can thus be described in time domain with two differential
equations, obtained by Kirchhoff’s laws:
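\begin{align}
\frac{di_1(t)}{dt} + i_1(t) \cdot \frac{C_0 + C_1}{C_0 R_0 C_1} &= \frac{i_p}{C_0 R_0} \tag{8.3a} \\
\frac{di_0(t)}{dt} + i_0(t) \cdot \frac{C_0 + C_1}{C_0 R_0 C_1} &= \frac{i_p}{C_1 R_0} \tag{8.3b}
\end{align}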
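Solving these differential equations and representing the variables in voltage domain, leads to the
recursive equations:
\begin{align}
V_{C0}(t) &= \frac{1}{C_0} \int i_0(\tau)\,d\tau = V_{C0}(t_0) + \frac{1}{C_0} \left[ \frac{i_0(t_0) - \beta_2}{\beta_1} \left( 1 - e^{-\beta_1 t} \right) + \beta_2 \cdot t \right] \tag{8.4a} \\
V_{C1}(t) &= \frac{1}{C_1} \int i_1(\tau)\,d\tau = V_{C1}(t_0) + \frac{1}{C_1} \left[ \frac{\beta_2 - i_0(t_0)}{\beta_1} \left( 1 - e^{-\beta_1 t} \right) + \beta_3 \cdot t \right] \tag{8.4b}
\end{align}
with
\[ V_O(t) = V_{C1}(t), \qquad i_0(t_0) = \frac{V_{C1}(t_0) - V_{C0}(t_0)}{R_0} \]
\[ \beta_1 = \frac{C_0 + C_1}{C_0 R_0 C_1}, \qquad \beta_2 = \frac{C_0 \cdot i_p}{C_0 + C_1}, \qquad \beta_3 = \frac{C_1 \cdot i_p}{C_0 + C_1} \]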
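The recursion of equations (8.4a) and (8.4b) can be illustrated with a minimal Python sketch. It is not the SystemC implementation of the thesis; the function name loop_filter_step and its argument order are assumptions made for illustration, and the default component values are taken from table 8.1. Since the update is an exact solution for a constant pump current, two half steps should reproduce one full step, and the charge delivered to both capacitors should equal ip · t.

```python
import math


def loop_filter_step(v_c0, v_c1, ip, t, r0=700.0, c0=70e-12, c1=2e-12):
    """Advance both capacitor voltages by t seconds for a constant pump current ip.

    Implements equations (8.4a) and (8.4b); the name and signature are
    hypothetical and only serve this illustration.
    """
    beta1 = (c0 + c1) / (c0 * r0 * c1)
    beta2 = c0 * ip / (c0 + c1)
    beta3 = c1 * ip / (c0 + c1)
    i0 = (v_c1 - v_c0) / r0            # current through R0 at the start of the step
    decay = -math.expm1(-beta1 * t)    # 1 - exp(-beta1 * t), numerically stable
    new_v_c0 = v_c0 + ((i0 - beta2) / beta1 * decay + beta2 * t) / c0
    new_v_c1 = v_c1 + ((beta2 - i0) / beta1 * decay + beta3 * t) / c1
    return new_v_c0, new_v_c1


if __name__ == "__main__":
    period = 1 / 3e9                   # one bit period at 3 Gb/s
    print(loop_filter_step(0.0, 0.0, 5e-6, period))
```

Because the step size t is arbitrary, the same function can be called with the non-uniform time steps of an event-driven simulation.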
where the gain regulator with parasitic pole fc and gain gr in figure 8.1 is not considered for
the moment. We can now insert equation (8.4b) into equation (8.2), and obtain the final recursive
equation for the VCO phase:
\[ \Theta_{vco}(t) = \Theta_{vco}(t_0) + 2\pi \left\{ t \left[ f_0 + K_v V_O(t_0) \right] + \frac{K_v}{C_1} \left[ \frac{i_0(t_0) - \beta_2}{\beta_1^2} \left( 1 - e^{-\beta_1 t} - \beta_1 t \right) + \beta_3 \frac{t^2}{2} \right] \right\} \tag{8.5} \]
The goal is to determine the exact time instant t where the VCO phase reaches Θvco (t)=2π and
thus, produces the next clock edge for the BB-PD, which finally closes the feedback loop. Since
equation (8.5) cannot be inverted, a Newton iteration is utilized to recursively identify t. In our
case the function of interest is Θvco (t) which converges toward 2π and thus, we have the iteration:
\[ t_{n+1} = t_n - \frac{\Theta_{vco}(t_n) - 2\pi}{\Theta'_{vco}(t_n)} = t_n - \frac{\Theta_{vco}(t_n) - 2\pi}{K_v V_c(t_n) + f_0} \tag{8.6} \]
The Newton method typically converges after the second or third iteration step by reaching the
maximum simulator time resolution of 1fs. With this basic model from [50], additional non-ideal
and non-linear effects can now be included, in order to significantly improve accuracy of the given
CPLL model.
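The search for the next clock edge can be sketched in a few lines of Python. The sketch below is an illustration under stated assumptions, not the thesis implementation: the free-running frequency F0 = 3 GHz is assumed (it is not listed in table 8.1), the phase is counted from zero at the previous edge, and the function names are hypothetical. The derivative used in the Newton step is obtained by differentiating equation (8.2), so it carries the factor 2π.

```python
import math

# Component values from table 8.1; F0 is an assumed free-running frequency.
R0, C0, C1 = 700.0, 70e-12, 2e-12
KV = 2.7e9                                  # VCO gain in Hz/V
F0 = 3.0e9                                  # assumption, not given in the text
BETA1 = (C0 + C1) / (C0 * R0 * C1)


def vco_phase(t, ip, v_c0, v_c1):
    """Phase advance t seconds after the last edge, equation (8.5) with Theta(t0) = 0."""
    beta2 = C0 * ip / (C0 + C1)
    beta3 = C1 * ip / (C0 + C1)
    i0 = (v_c1 - v_c0) / R0
    x = BETA1 * t
    transient = (i0 - beta2) / BETA1 ** 2 * (-math.expm1(-x) - x)
    return 2 * math.pi * (t * (F0 + KV * v_c1) + KV / C1 * (transient + beta3 * t * t / 2))


def vco_frequency(t, ip, v_c0, v_c1):
    """Instantaneous frequency f0 + Kv * V_C1(t), with V_C1(t) from equation (8.4b)."""
    beta2 = C0 * ip / (C0 + C1)
    beta3 = C1 * ip / (C0 + C1)
    i0 = (v_c1 - v_c0) / R0
    v = v_c1 + ((beta2 - i0) / BETA1 * -math.expm1(-BETA1 * t) + beta3 * t) / C1
    return F0 + KV * v


def next_clock_edge(ip, v_c0, v_c1, tol=1e-15, max_iter=20):
    """Newton iteration for Theta(t) = 2*pi; stops once the update is below tol (1 fs)."""
    t = 1.0 / (F0 + KV * v_c1)              # start from the period at the initial voltage
    for n in range(1, max_iter + 1):
        residual = vco_phase(t, ip, v_c0, v_c1) - 2 * math.pi
        step = residual / (2 * math.pi * vco_frequency(t, ip, v_c0, v_c1))
        t -= step
        if abs(step) < tol:
            return t, n
    raise RuntimeError("Newton iteration did not converge")


if __name__ == "__main__":
    print(next_clock_edge(5e-6, 0.0, 0.0))
```

Starting from the period at the initial control voltage, the iteration in this sketch needs only a few steps, in line with the convergence behavior described above.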
8.1.2. VCO Noise Model

An event-driven phase noise model of the VCO is included with the basic CPLL. The model was
developed in [122] and simulates the typical phase noise spectrum SΦ(∆ω) of an oscillator as
already introduced in section 2.1.1. Such a Leeson process consists of three distinct noise regions,
which are uniquely specified using four parameters: the flicker noise corner frequency f_fl, the
measured phase amplitude A1 with corresponding frequency f1 in the 1/f^2 region, and the phase
noise floor amplitude A_PhN.
The model is implemented as an independent time domain random process which delivers a
noise frequency each time it is called. This frequency value is shaped according to the spectral
density of the Leeson process, and combined with the basic noise-free CPLL model:
\[ t_{id+jit} = \frac{1}{1/t_{id} + f_{jit}} \tag{8.7} \]
where tid is the ideal time period until the next VCO clock and fjit the current noise frequency.
A Leeson noise generator has been implemented in [9], and consists of a random number
generator that supplies different discrete time filters for the three noise regions.
FIGURE 8.2.: Leeson noise generator [9]. (Block labels: noise source N(0, √fs), gains g0, g1, g2,
differentiator 1 − z^−1, fs, flicker filters, output fjit.)
The basic concept is depicted in figure 8.2, where the gain factors are calculated as follows:
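\begin{align}
g_0 &= f_1 \sqrt{10^{(A_1/10)}}, & &\sim 1/f^2 \text{ noise} \tag{8.8a} \\
g_1 &= 10^{(\Delta/20)}, \quad \Delta = A_{PhN} - 10 \log_{10}\left( 4\pi^2 / f_s^2 \right), & &\sim 1/f^0 \text{ noise} \tag{8.8b} \\
g_2 &= f_1 \sqrt{f_{fl}} \cdot 10^{(A_1/20 - 1.2704)}, & &\sim 1/f^3 \text{ noise} \tag{8.8c}
\end{align}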
The flicker noise filter bank according to [122] is realized with eight filters of increasing cut-off
frequency, where the first one is located at 100 Hz.
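\begin{align}
y_k[n] &= a_k\, y_k[n-1] + b_k\, g_2\, x[n], \quad k = [0, \ldots, 7] \tag{8.9a} \\
a_k &= 1 - 2\pi \frac{10^{(k+2)}}{f_s}, \qquad b_k = 2\pi \frac{10^{(k/2+2)}}{f_s} \tag{8.9b} \\
f_{fl}[n] &= \sum_k y_k[n] \tag{8.9c}
\end{align}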
where x[n] are the random samples from the noise generator at the rate fs . The overall jitter
frequency fjit is the sum of the three noise components:
\begin{align}
f_{phn}[n] &= g_1 \left( x[n] - x[n-1] \right), \qquad f_{f1}[n] = g_0\, x[n] \tag{8.10a} \\
f_{jit} &= f_{fl} + f_{phn} + f_{f1} \tag{8.10b}
\end{align}
The sampling rate fs must be sufficiently large to guarantee correct noise generation and thus,
the maximum VCO frequency is simply used.
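How a noise frequency enters the VCO period can be sketched as follows. The Python fragment is a simplified illustration, not the generator from [9]: it covers only the white and the differentiated branch of equation (8.10a) and deliberately omits the flicker filter bank of equation (8.9). The gain formulas follow equations (8.8a) and (8.8b) as read from the text, interpreting √fs as the standard deviation of the Gaussian source is an assumption, and all function names are hypothetical.

```python
import math
import random


def leeson_gains(a1_dbc, f1, a_phn_dbc, fs):
    """Gains g0 (1/f^2 region) and g1 (noise floor) as read from (8.8a) and (8.8b)."""
    g0 = f1 * math.sqrt(10 ** (a1_dbc / 10))
    delta = a_phn_dbc - 10 * math.log10(4 * math.pi ** 2 / fs ** 2)
    g1 = 10 ** (delta / 20)
    return g0, g1


def noise_frequency(x_now, x_prev, g0, g1):
    """Sum of the two flicker-free components of equation (8.10a)."""
    return g0 * x_now + g1 * (x_now - x_prev)


def jittered_period(t_id, f_jit):
    """Equation (8.7): ideal period t_id perturbed by the noise frequency f_jit."""
    return 1.0 / (1.0 / t_id + f_jit)


if __name__ == "__main__":
    fs = 3e9                                    # sampling rate set to the VCO frequency
    g0, g1 = leeson_gains(-120.0, 10e6, -138.0, fs)
    rng = random.Random(0)
    x_prev = 0.0
    for _ in range(5):
        x_now = rng.gauss(0.0, math.sqrt(fs))   # assumption: sqrt(fs) is the std deviation
        print(jittered_period(1 / fs, noise_frequency(x_now, x_prev, g0, g1)))
        x_prev = x_now
```

A positive noise frequency shortens the period and a negative one lengthens it, which is the only coupling between the noise process and the noise-free CPLL model in equation (8.7).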
FIGURE 8.3.: Loop filter voltage behavior depending on input current Icp, the voltages can be
found in the block scheme, figure 8.1. (Upper panel: voltages VC0, VO and VC1 in V; lower panel:
ICP; both over time in s.)
8.1.3. Gain Regulator and BB-PD

To model the gain regulator, an additional parasitic pole with cut-off frequency fc and gain gr
must be included with the event-driven model. The pole reflects the VCO pre-amplifier influence
as depicted in figure 8.1, and significantly increases accuracy of the real hardware behavior.
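The derivation of the model equations is equivalent to the prior analysis, and yields the following
state equations:
\begin{align}
V_O(t) = {} & V_O(t_0) + \frac{g_r}{C_1} \left[ \frac{\beta_2 - i_0(t_0)}{\omega_c - \beta_1} \left( \frac{\omega_c}{\beta_1} \left( 1 - e^{-\beta_1 t} \right) - \left( 1 - e^{-\omega_c t} \right) \right) - \frac{\beta_3}{\omega_c} \left( 1 - e^{-\omega_c t} \right) + \beta_3 t \right] \nonumber \\
& + \left( V_{C1}(t_0) \cdot g_r - V_O(t_0) \right) \cdot \left( 1 - e^{-\omega_c t} \right) \tag{8.11}
\end{align}
\begin{align}
\Theta_{vco}(t) = {} & \Theta_{vco}(t_0) + 2\pi \Bigg\{ f_0 t + \frac{g_r K_v}{C_1 \omega_c} \Bigg[ \frac{\beta_2 - i_0(t_0)}{\omega_c - \beta_1} \left( 1 - e^{-\omega_c t} - \omega_c t - \frac{\omega_c^2}{\beta_1^2} \left( 1 - e^{-\beta_1 t} - \beta_1 t \right) \right) \nonumber \\
& + \left( \frac{\beta_3}{\omega_c} - V_{C1}(t_0) C_1 \right) \left( 1 - e^{-\omega_c t} - \omega_c t \right) + \frac{\beta_3 \omega_c}{2} t^2 \Bigg] + V_O(t_0) \frac{K_v}{\omega_c} \left( 1 - e^{-\omega_c t} \right) \Bigg\} \tag{8.12}
\end{align}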
The VCO control voltage VO is now given by the gain regulator output which introduces an addi-
tional, third state variable. The subsequent VCO clock period is calculated using equation (8.12)
and replaces equation (8.5) of the prior basic model.
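The closed-form state update of equation (8.11) can be cross-checked numerically. The following Python sketch assumes that the gain regulator behaves as a first-order low pass with gain gr and pole ωc = 2π fc, i.e. dVO/dt = ωc (gr VC1(t) − VO(t)); this differential equation is an assumption used for the check and is not stated explicitly in the text. Component values are taken from table 8.1, and the function names are hypothetical.

```python
import math

R0, C0, C1 = 700.0, 70e-12, 2e-12           # loop filter, table 8.1
GR = 1.0                                    # gain regulator gain
WC = 2 * math.pi * 250e6                    # parasitic pole, omega_c = 2*pi*fc
BETA1 = (C0 + C1) / (C0 * R0 * C1)


def _betas(ip):
    return C0 * ip / (C0 + C1), C1 * ip / (C0 + C1)


def v_c1_at(t, ip, v_c0, v_c1):
    """Loop filter output voltage, equation (8.4b)."""
    beta2, beta3 = _betas(ip)
    i0 = (v_c1 - v_c0) / R0
    return v_c1 + ((beta2 - i0) / BETA1 * -math.expm1(-BETA1 * t) + beta3 * t) / C1


def v_o_closed_form(t, ip, v_c0, v_c1, v_o):
    """Gain regulator output according to equation (8.11)."""
    beta2, beta3 = _betas(ip)
    i0 = (v_c1 - v_c0) / R0
    decay_b = -math.expm1(-BETA1 * t)       # 1 - exp(-beta1 * t)
    decay_c = -math.expm1(-WC * t)          # 1 - exp(-omega_c * t)
    bracket = ((beta2 - i0) / (WC - BETA1) * (WC / BETA1 * decay_b - decay_c)
               - beta3 / WC * decay_c + beta3 * t)
    return v_o + GR / C1 * bracket + (v_c1 * GR - v_o) * decay_c


def v_o_numeric(t, ip, v_c0, v_c1, v_o, steps=2000):
    """Reference solution: RK4 integration of the assumed first-order low pass."""
    def rhs(s, v):
        return WC * (GR * v_c1_at(s, ip, v_c0, v_c1) - v)

    h, s = t / steps, 0.0
    for _ in range(steps):
        k1 = rhs(s, v_o)
        k2 = rhs(s + h / 2, v_o + h * k1 / 2)
        k3 = rhs(s + h / 2, v_o + h * k2 / 2)
        k4 = rhs(s + h, v_o + h * k3)
        v_o += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += h
    return v_o


if __name__ == "__main__":
    state = (5e-6, 0.0, 1e-3, 0.5e-3)       # ip, V_C0(t0), V_C1(t0), V_O(t0)
    print(v_o_closed_form(1 / 3e9, *state), v_o_numeric(1 / 3e9, *state))
```

Under this assumption both values agree, which illustrates why the event-driven model needs no numerical integration: a single evaluation of the closed form replaces the whole integration loop.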
Figure 8.3 shows the difference of the voltage behavior between basic and enhanced model,
characterized by VC1 and VO respectively. The additional parasitic pole highlights a significant
influence on the voltage course, and is thus essential for reflecting the real circuit behavior.
The digital BB-PD block is modeled with a propagation delay tdel and the non-ideal behavior
of data latches. When the VCO clock triggers close to an input data transition, digital output data
suffers from meta-stability, offset voltages, or setup and hold time violations. These effects are
considered by introducing a random output value at the logical up or down signals, if one of the
criteria is violated.
8.1.4. Default Model Parameters

The complete model can now be utilized for behavioral simulations. An analysis run on an Intel
Core Duo 2.2 GHz laptop is typically able to collect 10–20k jitter values per second of simulation
time, depending on the transition density of the data pattern.
To prove model accuracy and applicability, a set of default parameters as used with the hardware
test structure [24] is given in table 8.1. While loop filter, charge-pump and VCO conversion
gain Kv are given by their nominal values, the phase noise coefficients were specified from prior
measurements. The gain regulator pole fc , propagation delay of the BB-PD and meta-stability
range were pre-estimated and finally chosen to match the phase noise spectra at the different
parameter configurations of figure 8.4.
Component        Parameters & Values
Loop filter      R0 = 700 Ω, C0 = 70 pF, C1 = 2 pF
Charge-pump      Icp = 5 µA
VCO              Kv = 2.7 GHz/V, f_fl = 10 MHz,
                 A1 = −120 dBC @ f1 = 10 MHz,
                 A_PhN = −138 dBC
Gain regulator   gr = 1.0, fc = 250 MHz
Phase detector   tdel = 150 ps, Vmeta = ±1 mV
Data pattern     Pat = 0101 . . .
TABLE 8.1.: Default model parameter settings.
8.2. Jitter and Phase Noise Analysis

In this section the jitter and phase noise behavior of the event-driven model is compared with mea-
surements from the hardware test structure [24]. Using the definitions from section 2.1.2, phase
noise spectra are determined as well as key parameters that characterize jitter in time domain.
Finally two different methods for deriving the jitter transfer function of a PLL are presented and
compared with measured curves.
8.2.1. Closed Loop Phase Noise

A first demonstration of model validity is given by the comparison between measured and
simulated power spectral densities (PSD) of the PLL jitter. For this purpose, jitter-free signal transitions
are assumed at the CDR data input, and absolute jitter values jabs as defined by equation (2.4)
are gathered from the VCO output clock. For model simulations, the PSD is calculated using the
Welch Periodogram [103] with a Hann window of 50% overlap, and eight averaged periods of
N =16384 jitter samples each. For the measurements from [24], an Agilent E4440A spectrum
analyzer was utilized.
The phase noise PSD is very sensitive to parameter variations, but the comparison in figure 8.4
shows excellent matching at various parameter configurations. This mainly originates from the
accurate behavioral model. The first spectrum is given by the default parameters from table 8.1,
while the other plots vary one of the parameters as specified in the caption. In the measured
signals, additional spectral spurs are observed. They originate from a 50MHz clock which is used
as on-chip reference frequency. Besides these spurs, the model is able to accurately reflect the true
phase noise behavior. In combination with fast simulations, this allows for a deep and thorough
system exploration.
FIGURE 8.4.: Measured (dashed red) and simulated (solid blue) phase noise PSD [dBC/Hz] over
frequency [Hz] for different parameter settings: (a) Default, (b) Icp =10 µA, (c) R0 =400Ω,
(d) Pat=11110000 . . .
Several common time-domain parameters for jitter characterization can also be determined from
the statistics of absolute jitter. They form simple alternatives to the spectral description of phase
noise. Typically, one specifies the standard deviation or RMS value of accumulated jitter:
\[ \sigma_{acc}(m) = \mathrm{RMS}\left\{ j_{acc}^{(m)} \right\} = \mathrm{RMS}\left\{ j_{abs,k} - j_{abs,k+m} \right\} \tag{8.13} \]
where jabs,k (section 2.1.2) is the absolute jitter sequence of the VCO clock. It can be calculated
from the autocorrelation function rjabs of absolute jitter values [42, 99]:
\[ \sigma_{acc}^2(m) = 2 \cdot \left( r_{j_{abs}}(0) - r_{j_{abs}}(m) \right) \tag{8.14} \]
Commonly used RMS values include absolute jitter σabs , long term jitter σlt , period jitter σper or
maximum jitter σmax [22, 42], and are defined as:
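\begin{align}
\sigma_{abs} &= \mathrm{RMS}\left\{ j_{abs,k} \right\} = \sqrt{r_{j_{abs}}(0)} \tag{8.15a} \\
\sigma_{lt} &= \sigma_{acc}(m)\big|_{m \to \infty} = \sqrt{2} \cdot \sigma_{abs} \tag{8.15b} \\
\sigma_{per} &= \sigma_{acc}(1) \tag{8.15c} \\
\sigma_{max} &= \max\, \sigma_{acc}(m) \tag{8.15d}
\end{align}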
where especially σlt is often used to specify the overall performance of a PLL. In figure 8.5 the
RMS value of accumulated jitter σacc (m) is depicted for the same parameter configurations as in
figure 8.4, together with calculated values for σabs (dotted line) and σlt (dashed line). In the course
of σacc (m) especially a periodicity given by the spectral peak can be noticed.
FIGURE 8.5.: RMS values of accumulated σacc (m), absolute σabs (dotted line) and long term
jitter σlt (dashed line) in UI over m, for the configurations (a) Default, (b) Icp =10 µA,
(c) R0 =400Ω, (d) Pat=11110000 . . . The curves are constructed using a sample size of N =10^5.

Configuration          σabs [UI]   σlt [UI]   σper [UI]   σmax [UI]   σabs,psd [UI]
Default                0.0258      0.0365     0.00440     0.0512      0.0315
Icp =10 µA             0.0513      0.0726     0.00874     0.1020      0.0611
R0 =400Ω               0.0135      0.0191     0.00273     0.0265      0.0170
Pat=11110000 . . .     0.0046      0.0065     0.00089     0.0073      0.0058
TABLE 8.2.: RMS jitter values from figure 8.5. 1 UI = 333 ps.
Table 8.2 lists the RMS values from equation (8.15). The absolute jitter can also be determined
from the measured phase noise spectrum using Parseval’s Theorem [43, Annex D], and yields
σabs,psd in the last column. Measured RMS values are generally larger than simulated ones, which
is due to mismatches in the peak region and additional parasitic spurs. As can also be observed, an
increase of the charge-pump current leads to a linear increase of absolute jitter and the oscillation
magnitude in figure 8.5.
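The relation between equations (8.13) and (8.14) can be illustrated with a short Python sketch. It is not the analysis code of the thesis: the function names are hypothetical, the autocorrelation is estimated with a simple time average, and the jitter sequence is synthetic white Gaussian noise, for which the accumulated jitter should be close to the long term value √2 · σabs of equation (8.15b) at every lag m > 0.

```python
import math
import random


def autocorr(jitter, m):
    """Time-average estimate of the autocorrelation r(m) of an absolute jitter sequence."""
    n = len(jitter) - m
    return sum(jitter[k] * jitter[k + m] for k in range(n)) / n


def sigma_abs(jitter):
    """Equation (8.15a): RMS value of absolute jitter."""
    return math.sqrt(autocorr(jitter, 0))


def sigma_acc_direct(jitter, m):
    """Equation (8.13): RMS of the difference between jitter values m cycles apart."""
    n = len(jitter) - m
    return math.sqrt(sum((jitter[k] - jitter[k + m]) ** 2 for k in range(n)) / n)


def sigma_acc_from_acf(jitter, m):
    """Equation (8.14): accumulated jitter from the autocorrelation function."""
    return math.sqrt(max(0.0, 2 * (autocorr(jitter, 0) - autocorr(jitter, m))))


if __name__ == "__main__":
    rng = random.Random(1)
    jitter = [rng.gauss(0.0, 0.02) for _ in range(20000)]   # synthetic jitter in UI
    print(sigma_abs(jitter), sigma_acc_direct(jitter, 5), sigma_acc_from_acf(jitter, 5))
```

For correlated jitter, as produced by the closed loop, the autocorrelation does not vanish at small lags, which is why σacc(m) in figure 8.5 varies with m before settling.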
8.2.2. Jitter Transfer Function

The jitter transfer function T (fSJ ) is a common way of characterizing PLL jitter [7, 46, 121].
Typically, it follows a low pass curve, because low jitter frequencies can easily be followed by the
loop filter, while higher frequencies are attenuated.
T(fSJ) of the given hardware structure can be measured with a BERT-Scope when using
sinusoidal jitter of frequency fSJ and amplitude ASJ at the CPLL input. The BERT-Scope determines
the timing budget of jitter distributions at the output, using a built-in tail fitting method. In order
to get a reference value, the first point is measured at a sinusoidal jitter frequency of typically
fSJ =10kHz, so that the loop filter can easily follow the input jitter and thus, a transfer
characteristic of T(fSJ) = 1 can be assumed. The corresponding timing budget can subsequently be used as
a reference for determining the complete T(fSJ) curve over varying frequency fSJ .
In simulations, the same approach requires the use of the tail fitting method from chapter 4. That
is, distributions are gathered from IO jitter values of the behavioral model and extrapolated, in
order to identify the timing budget at the target BER=10^−12. The typical sample size of simulated
distributions is significantly smaller (typically 4 to 5 orders of magnitude) compared to real-time
BERT measurements. This is an essential drawback for simulations, since the timing budget has
to be estimated from higher BER levels, where the number of Gaussian tail samples may not be
sufficient. Further, the fitting algorithm of the BERT-scope may produce different estimates.
Another problem domain can be highlighted with phase noise spectra, when additional random
jitter (RJ) is added to the data input. Figure 8.6 shows a clear amplitude difference between
measured and simulated PSDs. This can be due to model inaccuracies, non-ideal behavior of the
RJ generator, or additional phase noise as caused by interconnect wires and the analog receiver
circuit.
FIGURE 8.6.: Phase noise PSD mismatch with RJ=0.3 UI and default parameters (table 8.1).
Therefore, an additional, alternative analysis method is preferred for model simulations,
in order to truly reflect the transfer characteristic. Such a method is here realized using the FFT
spectrum of absolute jitter at the PLL output. The characteristics of both tail fitting and spectral
methods are briefly summarized:
Tail Fitting Method For each calculated frequency point, N =10^6 IO-jitter samples are
collected and the scaled Q-normalization (sQN) method is applied (ĉ1.2 optimization, ∆Pt =102 ),
while the PLL model is stressed with the jitter frequency fSJ . Only estimated DJpp values are
used for determining the jitter transfer function T(fSJ). The first frequency value is assumed to
be ideally followed by the loop filter, and thus DJpp (fSJ,min ) is used as a reference to calculate
\[ T(f_{SJ}) = DJ_{pp}(f_{SJ}) / DJ_{pp}(f_{SJ,min}) \tag{8.16} \]
Spectral Method Transfer function values are calculated from the frequency bins of the jitter
spectrum, which are calculated by the FFT of absolute output jitter. For each frequency point, N
samples are collected and FFT transformed, while the PLL model is stressed with the
corresponding sinusoidal jitter frequency fSJ and amplitude ASJ . A sample size of N ≥10^5 is chosen to
allow for coherent sampling:
\[ N_C / N = f_{SJ} / f_s \tag{8.17} \]
where NC is the bin where the transfer function value is determined. If for example fSJ =1 MHz
and fs =3 GHz, then NC =34 and N =102000. Further,
\[ T(f_{SJ}) = \left| \mathrm{FFT}_{N_C}(j_{abs,k}) \right| / (A_{SJ}/4) \tag{8.18} \]
The jitter amplitude represents a peak-to-peak value while the FFT produces a double-sided
spectrum of half amplitude and thus, ASJ must be divided by four.
FIGURE 8.7.: T(fSJ) [dB] over frequency [Hz] for varying jitter amplitude ASJ (curves: measured,
sQN method, FFT method): (a) Default, ASJ =0.4 UI, (b) ASJ =0.2 UI, (c) ASJ =0.1 UI.
FIGURE 8.8.: T(fSJ) [dB] over frequency [Hz] for varying loop filter parameters:
(a) Icp =10 µA, R0 =700Ω, (b) Icp =20 µA, R0 =700Ω, (c) Icp =5 µA, R0 =1600Ω.
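The coherent sampling condition (8.17) and the amplitude scaling of equation (8.18) can be sketched in Python as follows. The sketch is an illustration under assumptions: the spectrum is normalized by the number of samples N, only the single bin NC is evaluated by a direct sum instead of a full FFT, and the output jitter is a synthetic sinusoid with a known transfer value. The function names are hypothetical.

```python
import cmath
import math
from fractions import Fraction


def coherent_sampling(f_sj, f_s, n_min=100_000):
    """Smallest N >= n_min and bin N_C with N_C / N = f_SJ / f_s, equation (8.17).

    Frequencies are expected as integers in Hz so that the ratio is exact.
    """
    ratio = Fraction(f_sj, f_s)
    factor = -(-n_min // ratio.denominator)       # ceiling division
    return ratio.numerator * factor, ratio.denominator * factor


def dft_bin(samples, n_c):
    """Single DFT bin, normalized by the number of samples (assumed normalization)."""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * n_c * k / n) for k, x in enumerate(samples)) / n


def transfer_value(jitter, n_c, a_sj):
    """Equation (8.18): bin magnitude relative to a quarter of the peak-to-peak amplitude."""
    return abs(dft_bin(jitter, n_c)) / (a_sj / 4)


if __name__ == "__main__":
    n_c, n = coherent_sampling(1_000_000, 3_000_000_000)
    a_sj, t_true = 0.4, 0.5                       # injected amplitude [UI], assumed transfer
    jitter = [t_true * a_sj / 2 * math.sin(2 * math.pi * n_c * k / n) for k in range(n)]
    print(n_c, n, transfer_value(jitter, n_c, a_sj))
```

With an integer number of jitter periods inside the record, no window is needed and the complete sinusoidal power falls into the bin NC.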
Both simulation methods use a logarithmic grid fSJ =[10^5 , 5·10^7 ] Hz of 50 frequency points.
With the Intel 2.2GHz laptop and N =10^5 , a full transfer function can be determined in a few
minutes. The default model parameters were already given in table 8.1, with an additional default
jitter amplitude of ASJ =0.4 UI.
An initial comparison of transfer functions over different jitter amplitudes ASJ is given in
figure 8.7, where the typical low pass behavior is observed. Since the CPLL is a non-linear system,
the cut-off frequency of T(fSJ) varies too. Both simulation methods are able to correctly track
the cut-off frequency. Although the simulated curves show acceptable matching with
the measured ones, the sQN method generally underestimates the true jitter transfer behavior. This
can especially be observed in the transition region of figures 8.7(c) and 8.8(c). However, the FFT
method correctly reflects the measured peaks. In figures 8.7 and 8.8, both simulation
methods do not exactly match the slope of T(fSJ) in the cut-off region. It is likely that this effect
stems from additional parasitic poles which have not been considered, such as the line
termination, input amplifier or equalizer circuit. Figure 8.9 depicts T(fSJ) for different test patterns, as
specified in [118, section 7.2.4]. Unlike the FFT method, the sQN tail fitting method is now able
to correctly follow the measured course, because it uses the same analysis principle as the
measurement.
FIGURE 8.9.: T(fSJ) [dB] over frequency [Hz] for varying test patterns: (a) Default,
Pat = 0101 . . ., (b) Pat = 0011 . . ., (c) Pat = PRBS7, (d) Pat = LBP, (e) Pat = HTDP long,
(f) Pat = LTDP long.
8.3. Summary
A fast system level model for accurate behavioral simulations of charge-pump PLLs has been
developed. Unlike modeling approaches with uniform time steps, an event-driven model uses
state variables that reflect the real physical state and thus, allow for dynamic run-time variations.
The model forms an enhanced version of a prior approach [50] with included gain regulator pole,
VCO noise model and non-ideal behavior of the BB-PD. Depending on the transition density of
the selected data pattern, it is typically able to collect 10–20k jitter values per second of simulation
time on an Intel Core Duo 2.2GHz laptop.
Simulation results of the closed loop phase noise over varying parameter configuration proved
excellent agreement with jitter measurements from an existing hardware structure (figure 8.4).
In addition to the PSD spectra, also the RMS values of accumulated jitter were determined in
figure 8.5. They especially highlight the observed PSD peaks.
Two simulation methods were compared with measured jitter transfer functions T (fSJ ) in fig-
ures 8.7-8.9. One of the two simulation methods is based on the equivalent measurement principle,
where a tail fitting method identifies the DJ peak-to-peak characteristics DJpp over varying jitter
frequency. Using a reference value at low frequencies, equation (8.16) allows for determining
T (fSJ ). In measurements, DJpp was determined with a BERT scope, while for simulations the
sQN method from chapter 4 was utilized. As it can be difficult to observe certain phase noise
effects with the sQN method in simulations, a second analysis method based on spectral analy-
sis was implemented as well. It calculates the transfer function by identifying the amplitude of
absolute jitter at the PLL output.
Both methods correctly identify the cut-off frequency of measured T (fSJ ) in figures 8.7-8.9.
The spectral method is further able to track the peaking behavior of jitter transfer functions, which
is generally underestimated by the sQN method. Transfer characteristics of different test patterns
are instead only reflected correctly by the sQN method, due to the same analysis principle.
The presented model and simulation results have also been published in [C2].
9. A Method for Fast Jitter Tolerance
Analysis
Jitter analysis methods can be used for identifying jitter tolerance (JTOL) curves of high-speed
PLLs. This chapter focuses on this application field and realizes an algorithm for the automatic
determination of such curves. Due to the influence of timing jitter along the transmission channel,
high-speed PLLs and CDR structures have to provide a certain robustness against timing varia-
tions. Thus, interface standards often specify tolerance masks [43] to define minimum bounds for
jitter tolerance which must be guaranteed by a system design.
A JTOL curve describes the robustness of a PLL against an injected sinusoidal jitter frequency.
To this end, a jitter amplitude must be determined where the TJ budget exactly covers the complete
bit period or UI. This corresponds to an inverse problem which is additionally influenced by the
statistical variation of collected distributions. Typically, a JTOL measurement scheme as depicted
in figure 9.1 uses a modulated clock source with corresponding pattern generator, and is charac-
terized by the injected sinusoidal jitter frequency fSJ and amplitude ASJ . The CDR under test
suffers from the jittery signal and produces output data with increased error probability. A bit error
rate tester (BERT) may compare the recovered data with the expected original one, and determine
the resulting error rate. Equivalently, a time interval analyzer (TIA) can directly measure the time
difference between the zero crossing of the analog input signal and the recovered clock edge. The
obtained IO jitter values (section 2.1.2) again allow for collecting distributions that represent the
error rate.
FIGURE 9.1.: JTOL measurement scheme using TIA or BERT. (Block labels: clock source with
modulation frequency fSJ and amplitude ASJ, pattern generator, CDR under test with PLL,
recovered clock and recovered data, delay, latch, BERT with BER output, TIA with jitter output.)
The BER measurement principle has already been discussed together with built-in jitter mea-
surement (BIJM) systems in chapter 5, which also explains the similarity between figures 9.1
and 5.1. Again, the adjustable delay element introduces a time discretization which yields a lim-
ited number of bins R per UI. The BERT measurement is easy to implement since bit errors can
be counted. The result is a single probability value, meaning that a complete jitter distribution
requires a sequential BER scan over all R delay steps. A TIA measurement instead, collects jitter
values at every bit transition of the received data and thus, directly yields probability distributions
with maximum speed. It can also be realized using R BERT elements in parallel, which requires
a delay line together with a bank of latches and counters. Although area consuming, BIJMs with
such a real-time TIA feature have already been realized successfully using high resolution time-
to-digital converters [16, 66].
Jitter tolerance measurements are very challenging, because the amplitude of sinusoidal jitter
must exactly produce the target BER at the CDR under test. Thus, one is searching for the ASJ
value where both distribution tails cross each other at the 10^-12 level. So far, this JTOL test
problem has been addressed either from a measurement or a simulation perspective. Methods for
hardware measurements have been proposed in [32, 33, 140, 141], where especially the principle
in [33] is very efficient as it is based on the Gaussian Q-normalization with subsequent tail extra-
polation. Using several measurement points, a regression line can be constructed where the correct
ASJ value is estimated easily via extrapolation. This BERT based approach does not require the
delay element in figure 9.1, but needs a considerable amount of measured bit errors, and is thus
too time consuming for simulations if the whole jitter tolerance curve has to be identified over
varying jitter frequency fSJ. Simulation methods work equivalently to TIA based measurements.
They additionally use statistical models [97] or special waveforms [4] to minimize the required
number of jitter samples as much as possible, so that JTOL simulations can be carried out in a
feasible amount of time. However, they can generally not be used for hardware measurements.
In this chapter an analysis method is proposed, where the jitter tolerance curve of a PLL is
determined very quickly using an adaptive algorithm. The method is sufficiently fast for use with
both behavioral simulations and TIA based jitter measurements. A minimum measurement time
is obtained by automatically adapting the sample size of collected jitter distributions according to
the dynamics of the PLL under test. This adaptive recursion utilizes the previously described sQN
and QN methods from chapters 4 and 6.
The following sections first describe the adaptive principle of the algorithm. As a practical
use case, a simulation example then reuses the modeled charge-pump PLL from chapter 8.
The simulation results show that JTOL curves can also be determined correctly when using
hardware systems of limited accuracy. In a final analysis, simulation results are compared with
JTOL measurements from the given hardware structure.
9.1. Adaptive Algorithm for JTOL Analysis

As already mentioned, a JTOL analysis method is a search algorithm which identifies the jitter
amplitude ASJ where the error probability of the CDR under test is equal to the target BER = 10^-12.
This is an inverse problem, which must be solved independently for every jitter frequency fSJ.
The goal is thus to cross the two bathtub tails of a jitter distribution at the target BER, so that
TJpp = 1 UI. As a fundamental limitation, one is not able to measure or simulate such a low
probability level in a feasible amount of time, which again involves the use of tail fitting methods. For
performance comparison the two different methods sQN and QN, as described in chapters 4 and 6,
are investigated subsequently.
Both methods suffer from statistical tail variations and thus, lead to an uncertainty of the esti-
mated eye opening. Obviously, this uncertainty highly depends on the sample size N of a collected
jitter distribution, and an ideal choice poses a fundamental problem: estimation accuracy and thus,
statistical confidence ask for a large N , while fast jitter measurements require N to be as small as
possible.
As a solution to this problem a twofold search method is proposed. The primary, basic search
of ASJ is carried out with a recursive algorithm which minimizes the extrapolation error of fitted
tails. It is described by the recursion
\[ A_{SJ}(n+1) = A_{SJ}(n) + \nu \cdot e(n) \tag{9.1} \]
This equation is widely used in adaptive filter theory [49] where least-mean-square algorithms or
Kalman filters are implemented, and offers a high robustness against statistical variations of the
error term e(n). Using a block size of N jitter values, a tail fitting algorithm can provide a cost
function and is thus able to specify the error e(n). The jitter amplitude for the next iteration step
ASJ (n + 1) is then determined according to equation (9.1), using the old value ASJ (n) and the
error e(n) which is additionally scaled by the learning rate parameter ν.
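The recursion in equation (9.1) can be illustrated with a minimal Python sketch. It assumes a noise-free, purely linear error model that combines equations (9.2) and (9.4) from section 9.1.1. The slope a = -10 and offset b = 12 are hypothetical values chosen only for illustration, and the function names are not taken from the thesis implementation.

```python
Q_TARGET = 7.03  # Q(1e-12), the Gaussian quantile of the target BER


def toy_error(a_sj, slope=-10.0, offset=12.0):
    """Hypothetical noise-free error model: Q = slope * A_SJ + offset."""
    q_est = slope * a_sj + offset
    return q_est / Q_TARGET - 1.0


def search_amplitude(nu=0.11, a_start=0.0, iterations=100, error_fn=toy_error):
    """Iterate A_SJ(n+1) = A_SJ(n) + nu * e(n), equation (9.1)."""
    a_sj = a_start
    for _ in range(iterations):
        a_sj += nu * error_fn(a_sj)
    return a_sj


a_final = search_amplitude()
print(f"A_SJ after 100 iterations: {a_final:.4f} UI")
```

In this toy model the amplitude settles where Q equals 7.03, that is at (12 - 7.03)/10 = 0.497 UI. With real tail fits, e(n) is noisy, which is why the chapter adds the sample size adaptation on top of this recursion.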
The second part of the algorithm adaptively adjusts the sample size N for each collected jitter
distribution. The method starts with a minimum sample size Nmin and decides after each itera-
tion, whether N must be increased or not. As soon as a maximum number Nmax is successfully
reached, the search algorithm converges and the JTOL analysis can proceed with the next jitter
frequency. This second algorithm allows for a significant speed-up of the ASJ search, because
only few jitter samples are needed for initial iterations. Further, the increase of N starts at a point
where ASJ is already quite close to the final result.
9.1.1. Cost Function

For the adaptive recursion (9.1), a suitable error term e(n) should behave proportionally to the injected
jitter amplitude ASJ. In [33] it is shown that, besides the injected sinusoidal, all jitter sources in
a JTOL measurement environment can be considered as uncorrelated and approximately constant.
In the Gaussian quantile domain this allows for the construction of a linear relationship, which is valid
over a certain amplitude range:
\[ Q = a \cdot A_{SJ} + b \tag{9.2} \]
Theoretically, only two measurement points would thus be sufficient to determine the unknown
parameters a and b, but unfortunately the tail extrapolations suffer from statistical variations. This
impedes a direct calculation of ASJ without multiple evaluations, but one can still benefit from
the proportional influence on Q-values and derive a cost function.
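The Q-value where the timing budget covers the whole unit interval (required condition TJpp = 1) can
be determined by rewriting equation (4.2) from chapter 4:
\[
\frac{1 - \mu_R - \mu_L}{\sigma_L + \sigma_R} =
\begin{cases}
\dfrac{\sigma_L \cdot Q(p/A_L) + \sigma_R \cdot Q(p/A_R)}{\sigma_L + \sigma_R} & \text{(sQN)} \\[1ex]
Q(p), \quad \text{with } A_L = A_R = 1 & \text{(QN)}
\end{cases}
\tag{9.3}
\]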
For the QN method Q(p)=Qest is directly calculated from the left hand side, while for the sQN
method it can be determined recursively using a simple Newton iteration. Qest must approach the
desired target BER = 10^-12, which gives the normalized error term e(n):
\[ e(n) = \frac{Q_{est} - Q(10^{-12})}{Q(10^{-12})} = \frac{Q_{est}}{7.03} - 1 \tag{9.4} \]
This result is used together with the adaptive algorithm in equation (9.1).
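As a hedged illustration of the QN case of equation (9.3) and the error term in equation (9.4), the following Python sketch computes Qest and e(n) with the standard library only. The fitted tail parameters are hypothetical values in UI, and the sign convention of µL and µR is assumed to follow equation (4.2), which is not repeated here.

```python
from statistics import NormalDist


def q_value(p):
    """Gaussian quantile Q(p) of a tail probability p (Q > 0 for p < 0.5)."""
    return -NormalDist().inv_cdf(p)


def q_est_qn(mu_l, mu_r, sigma_l, sigma_r):
    """QN case of equation (9.3): Q at which the fitted tails span one UI."""
    return (1.0 - mu_r - mu_l) / (sigma_l + sigma_r)


def normalized_error(q_est, target_ber=1e-12):
    """Equation (9.4): relative deviation of Q_est from the target quantile."""
    q_target = q_value(target_ber)
    return (q_est - q_target) / q_target


# Hypothetical fitted tail parameters in UI, for illustration only.
q_est = q_est_qn(mu_l=0.15, mu_r=0.15, sigma_l=0.04, sigma_r=0.04)
e_n = normalized_error(q_est)
print(f"Q(1e-12) = {q_value(1e-12):.3f}, Q_est = {q_est:.3f}, e(n) = {e_n:+.3f}")
```

A positive e(n) means the estimated quantile still exceeds the target, so the recursion (9.1) increases the injected amplitude in the next step.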
9.1.2. Sample Size Adaptation

The automatic adaptation of sample size N is based on the decision whether the variance of ASJ
falls below the expected error variance of the tail fitting method. It forms the core
of the JTOL analysis, and also decides if the search is completed or not. The algorithm uses
three adjustable parameters: minimum and maximum sample size (Nmin and Nmax) as well as the
target deviation εconf for amplitude values. This last parameter specifies the statistical confidence
interval for the final ASJ result.
A flow graph of the algorithm is given in figure 9.2, where ASJ is identified for a single
frequency fSJ. The algorithm starts with the minimum sample size Nmin and waits until the first
block of jitter samples has been collected. After applying the tail fitting method, the error e(n)
from equation (9.4) is determined and used for updating the recursion in equation (9.1). The
resulting new ASJ(n + 1) value is stored in an array vSJ[0, . . . , L−1] of variable length, where the
statistical variation of amplitudes can be observed over multiple iterations.
[Figure 9.2: flow graph. Start with N = Nmin, ASJ = 0.0 and the given fSJ; collect N jitter values;
the tail fitting algorithm yields e(n); update ASJ(n+1) = ASJ(n) + ν·e(n); store vSJ[L] = ASJ(n+1)
and set L = L+1; determine {εv,min, Lmin} = min{εv = t(a, L−1)·sv,L/(√L·v̄L)} and keep
vSJ = vSJ[L−Lmin, . . . , L−1]; test εv,min < εconf·fp(N)/fp(Nmax); then either set
N = fp^-1(εv,min/εconf·fp(Nmax)) or, if N = Nmax, end with ASJ = v̄L.]
FIGURE 9.2.: Flow graph of JTOL analysis algorithm.
With blocks of only Nmin jitter samples at the beginning, the recursion quickly settles ASJ (n)
to a level where it constantly oscillates around its true value and exhibits statistical random walks.
A measure for the statistical variation of ASJ (n) can be derived if only the L last recursions
are considered. Assuming a normal distribution, the confidence interval of ASJ is specified as
t-statistic with
\[ \varepsilon_v = t(a, L-1) \cdot \frac{s_{v,L}}{\sqrt{L} \cdot \bar{v}_L} \tag{9.5} \]
and
\[ \bar{v}_L = \frac{1}{L} \sum_{i=1}^{L} A_{SJ,i} \tag{9.6} \]
\[ s_{v,L} = \sqrt{\frac{\sum_{i=1}^{L} A_{SJ,i}^2 - L \cdot \bar{v}_L^2}{L-1}} \tag{9.7} \]
where εv is the estimated confidence bound of a t-distribution with confidence level a=0.95 and
L−1 degrees of freedom. It is proportional to the empirical standard deviation sv,L and normalized
by the empirical mean v̄L. If εv falls below the target deviation εconf, the JTOL algorithm has
converged.
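Equations (9.5) to (9.7) can be sketched in a few lines of Python. The t-quantile is passed in by the caller because the text does not state whether a one-sided or two-sided quantile is meant; the two-sided 95 % value used below is therefore an assumption, and the amplitude history is hypothetical.

```python
import math


def epsilon_v(amplitudes, t_quantile):
    """Relative confidence bound of equation (9.5).

    amplitudes: the L most recent A_SJ values (L >= 2).
    t_quantile: t(a, L-1), supplied by the caller, e.g. from a t-table.
    """
    length = len(amplitudes)
    mean = sum(amplitudes) / length                        # equation (9.6)
    sum_sq = sum(a * a for a in amplitudes)
    variance = (sum_sq - length * mean * mean) / (length - 1)
    std = math.sqrt(max(variance, 0.0))                    # equation (9.7)
    return t_quantile * std / (math.sqrt(length) * mean)


# Hypothetical amplitude history in UI; 2.776 is the two-sided 95 % quantile
# of the t-distribution with 4 degrees of freedom.
history = [0.50, 0.52, 0.49, 0.51, 0.50]
eps = epsilon_v(history, t_quantile=2.776)
print(f"epsilon_v = {eps:.4f}")
```

Because the bound is divided by the mean amplitude, it is a relative tolerance and can be compared directly with a target such as 0.005.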
The length L of the array vSJ is continuously increased. It is incremented at every recursion,
but only a subset of its newest elements is used to minimize the observed statistical variation. At
each recursion, the minimum epsilon value is searched over all possible lengths:
\[ \varepsilon_{v,min} = \min \{ \varepsilon_v \text{ of } v_{SJ}[L-k, \ldots, L-1] \mid k = 2, \ldots, L \} \tag{9.8} \]
This minimum search yields an optimistic estimate of the actual statistical confidence of ASJ
values. It allows for quickly changing to a higher sample size N as soon as the observed optimistic
tolerance εv,min falls below a known comparison threshold. Hence, the algorithm behavior is
optimized with respect to a minimum number of recursions.
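The minimum search of equation (9.8) can be sketched as follows. This is an illustrative Python version, not the thesis implementation: it assumes two-sided 95 % t-quantiles from a small lookup table (limited to 10 degrees of freedom) and uses a hypothetical amplitude history whose early entries are still settling.

```python
import math

# Two-sided 95 % t-quantiles for 1..10 degrees of freedom (standard table values).
T_95 = [12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262, 2.228]


def epsilon_v(values):
    """Equation (9.5) for one candidate subset of A_SJ values."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return T_95[n - 2] * math.sqrt(var) / (math.sqrt(n) * mean)


def epsilon_v_min(v_sj):
    """Equation (9.8): smallest epsilon_v over the k newest entries, k = 2..L."""
    max_k = min(len(v_sj), len(T_95) + 1)   # the table above ends at 10 d.o.f.
    candidates = [(epsilon_v(v_sj[-k:]), k) for k in range(2, max_k + 1)]
    return min(candidates)


# Hypothetical history: the early values are still settling.
v_sj = [0.10, 0.30, 0.45, 0.50, 0.51, 0.50, 0.505]
eps_min, l_min = epsilon_v_min(v_sj)
print(f"epsilon_v,min = {eps_min:.4f} using the last {l_min} values")
```

In this example the minimum is found for the four newest values: shorter windows are penalized by the large t-quantile, longer ones by the settling transient.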
[Figure 9.3: two panels plotted over sample size N (logarithmic axis, 10^4 to 10^8) for the QN and
sQN methods: (a) median error Emed, (b) standard deviation σe.]
FIGURE 9.3.: Worst case error behavior of tail fitting methods over varying N. 4th order
regression polynomials, K=250, worst case distribution shapes ADJ/σRJ = 1/2 (sQN)
and 1/4 (QN).
Alg.   p0       p1         p2         p3             p4
QN     0.2036   −0.03269   0.001823   −3.466·10^-5   0.0
sQN    0.3493   −0.08615   0.008218   −3.530·10^-4   5.71·10^-5
TABLE 9.1.: Polynomial regression coefficients for σe.
The ideal comparison threshold corresponds to the expected error of the tail fitting method.
This error behavior can only be approximated, because extrapolation results depend not only on
the sample size N , but also on the underlying distribution shape. The CDR under test is stimulated
with sinusoidal jitter and thus, collected jitter distributions are expected to consist of a bounded
sinusoidal component combined with Gaussian random jitter. From the performance comparison
in figures 6.11-6.15, worst case distribution shapes resulting from combined sinusoidal and Gaus-
sian jitter components can be specified for the two fitting methods. The worst case error for each
method can thus be used to obtain a simplified function of sample size N . In figure 9.3 the median
error Emed (left) and corresponding standard deviation σe (right) curves are plotted. K=250 tail
fits were carried out for each of the different sample sizes, and 4th order polynomials were fitted to
achieve a functional relationship fp(N) between sample size N and error behavior. Note that the
error bias Emed cannot be compensated with the JTOL algorithm since the underlying distribution
shape is basically unknown. However, the standard deviation σe can be used as a pessimistic
indicator for choosing the right sample size N. The logarithmic scaling in figure 9.3(b) leads to a
polynomial
\[ f_p(N) = p_0 + p_1 \cdot \log(N) + \ldots + p_4 \cdot \log(N)^4 \tag{9.9} \]
with the coefficients in table 9.1, and an analysis range of N = [10^4, 10^8]. In order to compare the
actual confidence interval εv,min with the expected error of the fitting method, we can use these
error polynomials and formulate the condition
\[ \varepsilon_{v,min} < \varepsilon_{conf} \cdot \frac{f_p(N)}{f_p(N_{max})} \tag{9.10} \]
The error fp(N) is normalized by fp(Nmax) so that the target bound εconf forms the reference.
In the flow graph (figure 9.2) this condition decides whether the obtained ASJ estimates
are sufficiently accurate, so that jitter distributions of a larger sample size have to be collected in
subsequent iterations. If this is the case, a new value for N is determined from the inverse of the
error polynomial, evaluated for the actual εv,min; otherwise the JTOL algorithm continues to
iterate. The inverse fp^-1 is simply
realized by a Newton approach.
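The sample size decision can be sketched in Python as shown below, using the QN coefficients from table 9.1. Two points are assumptions rather than statements from the text: log() is taken as the natural logarithm (with this choice the QN polynomial evaluates to about 0.03 at N = 10^4, which appears consistent with the axis range of figure 9.3(b)), and the result is clamped to Nmax instead of extrapolating the polynomial. Function names are hypothetical.

```python
import math

# sigma_e regression coefficients p0..p4 of the QN method (table 9.1).
P_QN = (0.2036, -0.03269, 0.001823, -3.466e-5, 0.0)


def f_p(n, coeffs=P_QN):
    """Equation (9.9), assuming log() denotes the natural logarithm."""
    x = math.log(n)
    return sum(p * x ** i for i, p in enumerate(coeffs))


def f_p_inverse(target, n_start, coeffs=P_QN, tol=1e-12, max_iter=50):
    """Solve f_p(N) = target for N with Newton steps in x = log(N)."""
    x = math.log(n_start)
    for _ in range(max_iter):
        residual = sum(p * x ** i for i, p in enumerate(coeffs)) - target
        if abs(residual) < tol:
            break
        slope = sum(i * p * x ** (i - 1) for i, p in enumerate(coeffs) if i > 0)
        x -= residual / slope
    return math.exp(x)


def next_sample_size(eps_v_min, n, n_max, eps_conf=0.005):
    """Apply condition (9.10); return the sample size for the next iteration."""
    if eps_v_min >= eps_conf * f_p(n) / f_p(n_max):
        return n                      # not yet accurate enough: keep iterating
    target = eps_v_min / eps_conf * f_p(n_max)
    if target <= f_p(n_max):
        return n_max                  # clamp instead of extrapolating
    return min(f_p_inverse(target, n), n_max)


n_new = next_sample_size(eps_v_min=0.008, n=2e4, n_max=1e6)
print(f"next sample size: {n_new:.3g}")
```

Starting the Newton iteration at the current N keeps it on the decreasing branch of the polynomial, so the returned sample size is never smaller than the current one.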
The overall structure of the algorithm guarantees a strictly monotonic increase of N until
Nmax is reached. With εv,min < εconf and N = Nmax the final convergence criterion is met. Note
that in order to identify a complete jitter tolerance curve, the JTOL algorithm from figure 9.2 must
be repeated for every desired frequency fSJ. Some additional speed up can thus be achieved when
using the amplitude result of the last frequency as initial value for the next one.
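The outer loop over jitter frequencies, including the reuse of the previous amplitude as a starting value, might look as follows. This is only a sketch: `fake_search` is a hypothetical stand-in for the per-frequency search of figure 9.2 with a made-up tolerance shape, and the grid assumes twenty logarithmically spaced points between 1 MHz and 100 MHz.

```python
def log_grid(f_min=1e6, f_max=1e8, points=20):
    """Equally spaced frequencies on a logarithmic axis, in Hz."""
    ratio = (f_max / f_min) ** (1.0 / (points - 1))
    return [f_min * ratio ** i for i in range(points)]


def jtol_curve(search_fn, frequencies, a_start=0.0):
    """Run the amplitude search per frequency, reusing the previous result."""
    curve = []
    a_sj = a_start
    for f_sj in frequencies:
        a_sj = search_fn(f_sj, a_sj)      # warm start from the last amplitude
        curve.append((f_sj, a_sj))
    return curve


def fake_search(f_sj, a_init):
    """Hypothetical stand-in for the JTOL search; it ignores a_init."""
    return max(0.5, 5e6 / f_sj)           # made-up tolerance shape, in UI


curve = jtol_curve(fake_search, log_grid())
print(len(curve), curve[0], curve[-1])
```

A real search function would start its recursion (9.1) at `a_init`, which saves iterations whenever neighboring frequencies have similar tolerance.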
An additional reset feature has also been included with the presented JTOL algorithm. Since the
jitter amplitude ASJ is operated in a region where the observed jitter extends over the complete
UI, the modeled PLL may easily become unstable and produce bit errors. Thus, the JTOL method
also requires a well defined reset behavior. This is realized by first returning to a smaller, stable
ASJ value. Then the learning rate parameter ν is additionally decreased by a factor of two, so that
the search algorithm is more focused on the region of interest.
9.2. Application Example

In this section the proposed JTOL algorithm is applied to the behavioral model for charge-pump
PLLs from chapter 8. This model includes analog component values, a VCO noise model (Leeson
process), a gain regulator pole, as well as the phase detector delay and metastability behavior.
Suitable parameters are first identified to optimize the behavior of the JTOL algorithm, and
examples are given to highlight the varying loop dynamics of the PLL over fSJ . Then the anal-
ysis proceeds to jitter tolerance curves where performance comparisons between hardware and
software models are carried out.
9.2.1. JTOL Parameter Optimization

First, simple simulations shall optimize the algorithmic behavior together with the modeled charge-
pump PLL. The default parameters for the PLL have already been listed in table 8.1. Instead of the
clock-like signal, this time the lone bit pattern (LBP) is used for compliance testing, as suggested
by the 3Gb/s S-ATA standard [118]. Further, 0.18 UI of additional random jitter are superimposed
on the generated jittery data signal, in order to verify jitter tolerance with respect to the specified
maximum values DJ=0.42 UI and TJ=0.60 UI [118, p. 179, tab. 31].
In figure 9.4 the typical behavior of the JTOL algorithm is demonstrated for three different jitter
frequencies fSJ = {100, 10, 2} MHz. The PLL model utilizes the default parameter settings.
The QN method (squares) in figure 9.4(d), for example, carries out the first eleven iterations with
Nmin = 2·10^4 data samples. Then, a subset of the L ≥ 2 last ASJ values has reached a confidence
level εv,min which demands a larger sample size N, as determined by equation (9.10). Over
successive iterations, the N parameter increases monotonically toward Nmax = 10^6 until finally
εv,min < εconf is reached. The sQN method (circles) generally needs fewer iterations and jitter
samples, and thus converges faster. This is due to the more accurate tail fitting principle, which also
means less undesired error bias as depicted in figure 9.3(a).
A minimum sample size of Nmin = 2·10^4 generally proved sufficient for initial tail estimates.
The given PLL model typically simulates 10^6 bits within 45 s on a 3 GHz Intel Xeon workstation,
and thus, Nmax = 10^6 forms a good trade-off between fitting accuracy and simulation time. Here,
a worst case error bias of 1.3% for the sQN method is accepted (also see figure 9.3(a)). The
confidence interval εconf = 0.005 is chosen small enough so that the final statistical variance of
ASJ can be neglected.
The learning rate parameter ν must also be chosen correctly in order to support quick convergence.
If it is too large, ASJ values will exhibit large statistical variations and the JTOL algorithm
will require more iterations before the obtained statistical confidence is sufficiently accurate. In
extreme cases, it may even become unstable and produce errors. If, on the other hand, ν is chosen too
small, the JTOL search will converge before ASJ reaches the correct value, simply because the
amplitude variations are also too small. The ideal ν depends on the dynamics of the investigated
system and thus changes with varying jitter frequency fSJ as well as the parameter configuration
of the model.
[Figure 9.4: twelve panels over iteration n (0 to 20) for the sQN and QN methods. Rows:
fSJ = 100 MHz (a)-(d), fSJ = 10 MHz (e)-(h), fSJ = 2 MHz (i)-(l). Columns: ASJ(n), e(n),
εv,min(n), N(n).]
FIGURE 9.4.: Examples for the adaptive JTOL algorithm converging toward the unknown jitter
amplitude ASJ. Parameters are given in table 9.2.
In figure 9.5 the convergence probability of the JTOL algorithm is investigated over varying ν.
As model configurations, four different parameter settings as given in table 9.3 were used, each
with twenty frequency points in a logarithmically scaled range of fSJ = [10^6, 10^8] Hz.
Successful JTOL runs were counted by combining two limiting criteria: number of iterations I and overall
sample size N. In order to yield a valid JTOL run, both criteria must be fulfilled.
The obtained curves in figure 9.5 thus represent an empirical probability for convergence. They
clearly highlight an increase in the 0.01 ≤ ν ≤ 0.1 region where the performance settles toward a
maximum. When further increasing ν, a larger variation of probabilities is observed because the
obtained ASJ values also suffer from a strong statistical fluctuation. At the largest ν, the PLL further
loses its lock state several times. Thus the reset feature, as described previously, divides ν until a
stable convergence of ASJ values is again achieved. For the presented PLL model the learning rate
is selected such that maximum performance with the QN method is achieved, which is at ν=0.11.
Further, a maximum number of iterations Imax=50 aborts simulations if no convergence criterion
is reached. All default settings for the JTOL algorithm are again summarized in table 9.2.
[Figure 9.5: empirical convergence probability over the learning rate ν (0 to 0.2) for the sQN and
QN methods, each with the criteria P(I ≤ 30 ∧ N ≤ 20·10^6), P(I ≤ 25 ∧ N ≤ 15·10^6) and
P(I ≤ 20 ∧ N ≤ 10·10^6); the selected value ν=0.11 is marked.]
FIGURE 9.5.: Probability for successful convergence of JTOL algorithm.
Nmin = 2·10^4              Nmax = 10^6
εconf = 0.005              Imax = 50
Pat: LBP, RJ = 0.18 UI     ν = 0.11
TABLE 9.2.: Default JTOL algorithm settings.
9.2.2. Simulation Results

For the given PLL model two different JTOL analysis scenarios are investigated. First, a software
based scenario uses the default simulator time resolution of 1 fs, which yields Rsim = 3.3·10^5 for
the 3Gb/s interface. Second, a hardware based scenario reduces this value to R=512 and R=32,
which allows investigating the effect of limited precision in TIA based jitter measurements.
In tables 9.3 and 9.4 the performance of the JTOL algorithm is compared with respect to four
different model parameter settings and the two sQN and QN tail fitting methods. In order to high-
light the performance gain of sample size adaptation, both tail fitting methods are applied with a
constant sample size N =Nmax as well. These non-adaptive algorithmic versions are subsequently
referred to as c-QN and c-sQN methods.
The overall sample size N is the sum of samples needed for JTOL analysis runs with 20 frequency
points in an equally spaced logarithmic search grid of fSJ = [10^6, 10^8] Hz. Since most
of the simulation time is spent gathering jitter distributions, this value directly reflects the
computational effort for identifying a complete JTOL curve. In a hardware scenario it also relates to the
overall measurement time. As expected, the smallest N are obtained by the quickly converging
sQN method. It is immediately followed by the QN method, which suffers from a larger error bias
and spread, and thus, on average needs more samples to converge. If no sample size adaptation is
performed (c-QN and c-sQN methods), the overall sample size N increases by a factor of 2-3.
tc is the overall computation time on a 3 GHz Intel Xeon CPU consumed by the tail fitting methods.
It is thus a measure for the complexity of the algorithms. In tables 9.3 and 9.4, tail fitting with QN
is more than one order of magnitude faster than sQN. Further, as already described in section 4.1
(figure 4.9), the computational demand depends linearly on R. Unfortunately the granularity of time
measurements is 10 ms, which impedes a reliable estimation of computational demand at small R
(table 9.4) as well as for QN and c-QN methods.
If describes the average number of iterations per jitter frequency. In table 9.3 the If values of
sQN are slightly better than those of QN, due to the faster convergence. Both c-QN and c-sQN
methods with constant N yield the best results for If, because also the error bias is constant (refer
to figure 9.3(a)).
Param.   Alg.     Default   Icp=10µA   R0=400Ω   Kv=4GHz/V
N [M]    QN       73.2      77.4       92.3      79.1
         sQN      69.9      80.9       64.6      61.6
         c-QN     185       316        167       271
         c-sQN    174       298        195       237
tc [s]   QN       28.2      29.2       31.1      28.4
         sQN      357       512        332       362
         c-QN     33.6      47.9       30.6      43.0
         c-sQN    544       950        626       696
If       QN       14.8      19.8       16.3      16.8
         sQN      12.9      21.3       14.0      15.7
         c-QN     9.3       15.8       8.4       13.6
         c-sQN    8.7       14.9       9.8       11.9
TABLE 9.3.: JTOL analysis results for R=Rsim=3.3·10^5.
                  R = 32                                       R = 512
Param.   Alg.     Def.     Icp=10µA  R0=400Ω  Kv=4GHz/V       Def.     Icp=10µA  R0=400Ω  Kv=4GHz/V
N [M]    QN       89.2     102.3     146.3    86.1            84.4     93.3      84.8     84.9
         sQN      88.9     105.6     94.9     80.2            67.0     65.0      85.2     65.2
         c-QN     182.0    294.0     229.0    269.0           175.0    285.0     209.0    256.0
         c-sQN    193.0    301.0     211.0    247.0           187.0    284.0     221.0    240.0
tc [s]   QN       < 0.04   < 0.01    < 0.03   < 0.01          0.12     0.10      0.18     0.13
         sQN      0.32     0.39      0.39     0.28            2.30     3.11      2.73     2.71
         c-QN     < 0.02   < 0.03    < 0.04   < 0.02          < 0.05   0.12      < 0.09   0.11
         c-sQN    0.31     0.37      0.33     0.30            1.83     2.79      2.27     2.37
If       QN       13.8     21.2      17.7     16.6            13.7     20.5      15.6     17.2
         sQN      15.3     19.2      17.7     16.3            13.9     18.8      15.5     16.9
         c-QN     9.1      14.7      11.5     13.5            8.8      14.3      10.5     12.8
         c-sQN    9.7      15.1      10.6     12.4            9.4      14.2      11.1     12.0
TABLE 9.4.: JTOL analysis results for R=32 and R=512.
Table 9.4 shows the behavior of the JTOL algorithm when carried out with a coarse time resolution
or reduced number of bins R, as is the case for hardware measurements. Generally a slight
increase of the number of overall samples N is observed, especially with sQN and QN methods.
This is because a coarse time resolution R leads to larger statistical variations of TJ estimates
obtained from extrapolated distribution tails. Thus, the JTOL algorithm on average also needs
more samples in order to reach the convergence criterion.
Overall sample size N and average number of iterations If of both QN and sQN methods can
also be investigated over varying R, as shown in figure 9.6. While If remains approximately con-
stant over the complete analysis range, N in fact highlights a slight increase toward smaller R.
This empirically confirms the assumption of a larger statistical variation of TJ estimates. How-
ever, at any R the JTOL algorithm correctly converges within the specified maximum number of
iterations Imax =50, indicating that it can also be utilized for hardware measurements.
[Figure 9.6: two panels over the number of bins R (logarithmic axis, 10^1 to 10^5) for the sQN and
QN methods: average number of iterations If and total number of samples N.]
FIGURE 9.6.: a) Average number of iterations If and b) total sample size N over varying R.
[Figure 9.7: eight panels showing ASJ over fSJ (logarithmic axes, fSJ from 10^6 to 10^8).
Panels (a)-(d): sQN with Default, Icp=10 µA, R0=400Ω and Kv=4GHz/V; panels (e)-(h): QN with the
same settings. Curves: measured, R=Rsim=3.3·10^5, R=128.]
FIGURE 9.7.: Simulated (tables 9.3 and 9.4) and measured JTOL curves at different model
parameter configurations.
With the confidence level εconf = 0.005, this is further achieved without visibly affecting the
quality of obtained JTOL curves. In figure 9.7 the simulated JTOL curves for the sQN and QN
methods at both Rsim and R=128 are plotted together with manually measured JTOL curves of
the same PLL hardware structure. Note that the measurement device (SyntheSys BERT-Scope
7500A) has an amplitude limit of ASJ,max = 3.3 UI, which is reached at fSJ = 1 MHz. Further,
differences with simulated curves are mainly given by model inaccuracies and thus do not reflect
the performance of the JTOL algorithm. The hardware oriented model simulation with only
R=128 bins already matches the Rsim high resolution scenario excellently and can still handle
the varying loop dynamics over the complete frequency range. The obtained results show that
the best jitter tolerance curve is achieved with the default model parameter settings. The S-ATA
mask specification [118] demands a minimum tolerance of ASJ = 0.42 UI which is here clearly
guaranteed.
If a complete JTOL measurement over twenty frequency points with the QN method typically
requires a total of N ≈ 100M samples (≈ 1.2 h of simulation time), the time consumed in a TIA
based hardware measurement is tN ≈ 33 ms for a 3Gb/s interface. Together with tc ≈ 130 ms
(R = 512) for calculations and 1 ms additional time buffer per iteration, tI ≈ If · 20 · 1 ms = 320 ms
(If ≈ 16), this gives an overall time consumption of
\[ t_N + t_c + t_I = 33\,\text{ms} + 130\,\text{ms} + 320\,\text{ms} = 483\,\text{ms}, \tag{9.11} \]
which is needed to identify the complete jitter tolerance curve over twenty frequency points.
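The time budget of equation (9.11) can be reproduced with a short Python sketch. It assumes, as the estimate above does, that one jitter sample is collected per bit period; the function name and parameter names are illustrative only.

```python
def hardware_jtol_time(total_samples=100e6, bit_rate=3e9, t_calc=0.130,
                       iterations_per_freq=16, freq_points=20, t_buffer=1e-3):
    """Rough measurement-time budget of equation (9.11), in seconds."""
    t_n = total_samples / bit_rate                         # sample acquisition
    t_i = iterations_per_freq * freq_points * t_buffer     # per-iteration buffer
    return t_n, t_calc, t_i, t_n + t_calc + t_i


t_n, t_c, t_i, total = hardware_jtol_time()
print(f"t_N = {t_n*1e3:.0f} ms, t_c = {t_c*1e3:.0f} ms, "
      f"t_I = {t_i*1e3:.0f} ms, total = {total*1e3:.0f} ms")
```

With these inputs the per-iteration buffer dominates the budget, while the acquisition of the samples themselves contributes the smallest share.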
9.3. Summary
A fast and accurate method for the identification of jitter tolerance curves of high-speed PLLs has
been presented. An adaptive algorithm determines the unknown jitter amplitudes recursively and is
optimized with respect to a small number of iterations and sample size. The algorithm realization
started with the simple adaptive recursion in equation (9.1) and the associated derivation of a cost
function given in equation (9.4). An algorithm for automatic sample size adaptation was realized
and described with the flow graph in figure 9.2.
The basic idea is to observe jitter amplitudes over a subset of the L last recursions, which allows
their statistical confidence interval to be described in terms of a t-statistic, assuming normally
distributed values (equation (9.5)).
As soon as this value falls below a predefined threshold, the algorithm adapts the sample size N
of collected distributions accordingly. This adaptation process is controlled by the known error
behavior of QN and sQN fitting methods. For this purpose, the worst case distribution shapes of both
methods were used to derive 4th order polynomials, which approximate the extrapolation error as
a function of sample size N (figure 9.3). The overall algorithm structure guarantees a strictly
monotonic increase of the adapted sample size until the maximum Nmax is reached, together with
the desired confidence level. The presented algorithm is repeated for every jitter frequency fSJ .
As an application example, the proposed analysis method was applied to the PLL model from
chapter 8. First, suitable analysis parameters were identified to optimize the algorithm behavior
with respect to the given model. As an example, figure 9.4 highlights the varying loop dynamics of
the PLL at different fSJ . Default algorithm settings, such as the optimized learning rate parameter
ν=0.11 were specified in table 9.2.
Finally, the analysis proceeded to the determination of jitter tolerance curves. For this purpose, four
algorithmic versions were compared, including the two fitting methods with (QN and sQN) and
without (c-QN and c-sQN) sample size adaptation. Performance results were given in table 9.3 for
simulators (R = Rsim = 3.3·10^5), and in table 9.4 for simulated hardware measurements (R=512
and R=32). These include four different parameter settings with twenty fSJ values each.
Results demonstrated that the sample size adaptation of QN and sQN generally decreases
the overall number of required jitter samples N by a factor of 2-3. This value directly
reflects the simulation effort or the measurement time in a hardware system. The overall computation
time tc is the smallest with QN and c-QN methods, which outperform sQN and c-sQN methods
by more than one order of magnitude. Finally, the average number of recursions per jitter
frequency If is the smallest without sample size adaptation (c-QN and c-sQN methods). This is due
to the constant error bias which does not influence the adaptation process, as is the case with QN
and sQN methods. The same performance characteristics of N , tc , and If can also be observed
with simulated hardware scenarios, besides a highly reduced computational demand. For each
test case in figure 9.6, the algorithm was always able to converge within a maximum number of
fifty recursions over varying R, thus indicating that the proposed algorithm can also be applied to
hardware measurements.
As a final result, in figure 9.7 the obtained jitter tolerance curves were compared against real
jitter measurements from the test structure and showed an excellent matching. The simulation
of such jitter tolerance curves is particularly useful to optimize a CPLL design with respect to
its robustness against input jitter. It also allows for the verification of imposed specification
requirements such as a jitter tolerance mask. Contents of this chapter have partly been published in
[C6].
10. An FPGA based Diagnostic Tool for
Jitter Measurement and Optimization
In order to highlight the practical aspects of tail fitting methods, in this chapter an embedded jitter
measurement system is presented, which acts as a diagnostic tool for serial high-speed interfaces.
The underlying idea is to combine a real BIJM system with the previously described sQN and
QN tail fitting methods. This is to confirm the theory of hardware design aspects from chapter 5
and to prove correctness of the derived equations and empirical relations. Further, extrapolated
tails yield the TJ timing budget, which allows for judging the quality of transmission lines, PLLs
or transceiver structures as system under test (SUT). The resulting diagnostic tool is thus able to
optimize and configure an SUT without the use of additional instrumentation devices.
Many BIJM topologies and embedded systems [16, 18, 57, 64, 65, 68, 73, 79] have been designed
for production tests and on-chip diagnostics, and some of them have also been used for jitter
optimizations [87, 132]. However, the combined use of BIJMs with tail fitting methods has not
been considered so far. This chapter especially points out the benefit of applied tail fitting methods
to estimate the TJ budget, which forms a direct quality measure for the impact of timing jitter on
system performance. The target platform is a Virtex-5 FPGA [137] on an ML507 high-speed
evaluation board [138]. It produces a 3Gb/s serial reference signal and retrieves timing jitter
information from the SUT.
In the following sections first the diagnostic principle is introduced. Then the implemented
FPGA logic together with the analysis software is described, and parameter optimizations with
different diagnostic scenarios and test cases are carried out. Finally the observed tail fitting error
is compared with expected worst case errors from the theory in chapter 5. A brief summary is
given at the end.
10.1. Measurement Principle and Implementation

The fundamentals of jitter diagnostics are given to solidify the understanding of the developed
analysis tool. The FPGA implementation is described subsequently and represents a direct real-
ization of the underlying principle.
10.1.1. Diagnostic Principle

The jitter analysis of serial high-speed signals requires an accurate measurement of IO jitter, which
has already been described in the introductory part of chapter 5 (also refer to figure 5.1). The fun-
damental scheme for jitter diagnostics is very similar to a BIJM system and depicted in figure 10.1.
A system under test basically degrades the signal integrity of a generated data stream by producing
jitter. It can be a simple transmission line, a PLL/CDR or even a complete transceiver structure.
The fundamental difference to BIJMs is the internal reference clock which acts as a common basis
for transmit buffer and jitter measurement circuit. In this way, external noise sources are excluded
and the quality of jitter measurements is given by the minimum inherent jitter of the reference
clock.
The BIJM principle with the adjustable delay element, phase detector (PD) and counter has
already been described in chapter 5, and allows for collecting jitter distributions over successive
measurement runs of varying delay. The smallest delay step defines the number of bins R in a UI,
while the PD can also be replaced by a simple D flip-flop if the input data stream is known. In this
case the recovered data is first compared against the expected transmit data, which then allows for
error counting.
Figure 10.1.: Basic principle for jitter diagnostics. (Blocks: Ref Clk, Tx Data, System Under Test (SUT), BIJM with Delay, PD and Counter, Jitter Analysis.)
If a jitter distribution is measured down to a BER level of 1/N over R delay steps, the overall
time consumed is
t_t [UI] = N \cdot R. (10.1)
Although test time is less critical in diagnostic applications, the measured BER depth
1/N still poses a fundamental time limit. In order to estimate the TJ budget at the target BER = 10^{-12},
tail fitting methods therefore have to be applied again. Using the sQN and QN methods from chapters 4
and 6 allows parameterizing the Gaussian tails (µ, σ, A) of a jitter distribution and thus
estimating the extrapolated TJ peak-to-peak value TJpp
TJ_{pp} = t_R - t_L (10.2)
t_{L(R)} = \mu_{L(R)} - \sigma_{L(R)} \cdot Q(10^{-12} / A_{L(R)}) (10.3)
with the Q-function defined in equation (3.10). The obtained TJpp estimates are subsequently
used for jitter diagnostics and parameter optimization.
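The extrapolation step can be illustrated with a short Python sketch. It is not the thesis implementation: it assumes the tail models BER_L(t) = A_L·Φ((t − µ_L)/σ_L) and BER_R(t) = A_R·(1 − Φ((t − µ_R)/σ_R)), with Φ the standard normal CDF, and it evaluates the quantile with the Python standard library instead of the Q-function of equation (3.10), whose sign convention is not repeated here. The function name and all numerical tail parameters are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # zero mean, unit standard deviation


def tj_peak_to_peak(mu_l, sigma_l, a_l, mu_r, sigma_r, a_r, target_ber=1e-12):
    """Extrapolate both Gaussian tails to target_ber and return TJpp in UI.

    Assumed tail models (an assumption of this sketch):
      left tail:  BER(t) = a_l * Phi((t - mu_l) / sigma_l)
      right tail: BER(t) = a_r * (1 - Phi((t - mu_r) / sigma_r))
    """
    # Positive distance from the tail mean, in units of sigma, at the target BER.
    q_l = -_STD_NORMAL.inv_cdf(target_ber / a_l)
    q_r = -_STD_NORMAL.inv_cdf(target_ber / a_r)
    t_l = mu_l - sigma_l * q_l  # left tail extends toward negative times
    t_r = mu_r + sigma_r * q_r  # right tail extends toward positive times
    return t_r - t_l


# Hypothetical tail parameters in UI, chosen only for illustration.
tj_pp = tj_peak_to_peak(mu_l=-0.05, sigma_l=0.012, a_l=0.4,
                        mu_r=0.05, sigma_r=0.012, a_r=0.4)
print(f"TJpp = {tj_pp:.3f} UI")
```

For unit amplitudes and unit standard deviations the result reduces to twice the Gaussian quantile magnitude at 10^{-12}, which is a convenient sanity check for any implementation of this step.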
10.1.2. Implementation
Subsequently a 3Gb/s jitter diagnosis system is implemented using an FPGA. The measurement
is realized using a Xilinx Virtex-5 FX70T on an ML507 high-speed evaluation board [138], which
is controlled by a MATLAB program running on a remote desktop computer. The general FPGA
block scheme is given in figure 10.2. In order to collect jitter distributions, a dedicated high-speed
transceiver (GTX) [137] is combined with a BER test (BERT) logic and controlled by an instanti-
ated microprocessor (MP). Measurement results are then transferred to the remote computer where
the tail fitting methods are applied. The three main blocks of the FPGA logic are subsequently
described in more detail.
High-Speed Transceiver
The Xilinx GTX is used for the parallel to serial and serial to parallel data conversion, and is
needed to lower the data rate for use with the internal, custom FPGA logic. The 3Gb/s data stream
is converted from or into a 20bit word along with a 150MHz clock signal to mark the beginning
of a new word. The transceiver is driven in lock-to-reference mode, meaning that the PLL for data
recovery (Rx-PLL) is locked to the same reference frequency as the transmitter buffer (Tx-PLL),
in order to realize the diagnostic principle from figure 10.1. A complete jitter distribution can be
measured by adjusting the data sampling point of the receiver PLL over R=128 time steps. This
is done by using the Dynamic Reconfiguration Port (DRP) [137] of the GTX, which is controlled
by the MP. The transmit signal is also a 20bit word given by the MP.
Figure 10.2.: Block scheme for the FPGA based 3Gb/s jitter measurement system.
BER Tester
The custom made BERT logic consists of two units: a bit counter and an error counter (figure 10.3).
The former keeps track of the sum of total bits by counting the number of clock cycles from the
Rx clock. The latter compares the Rx data to the data pattern set by the MP. The number of
errors is then added every Rx clock cycle. Each of the two counters is compared to a preset
maximum value. Once either number is reached, a generated enable signal simultaneously stops
both counters. This way the measurement can be stopped by either the number of bits or errors,
which is especially advantageous during the synchronization phase (MP). Once a measurement at
one sampling position is completed the MP reads the values held by the counters, resets them and
increases the delay to start the next measurement run.
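The counting and stop behavior described above can be mimicked by a small behavioral model in Python. This is only a sketch of the described logic, not the FPGA design: the function name is hypothetical, the limited counter widths from figure 10.3 are not modeled, and the bit count is derived from the number of Rx clock cycles with 20 bits per word.

```python
WORD_BITS = 20  # bits per Rx word


def run_bert(rx_words, expected_word, max_bits, max_errors):
    """Count bits and bit errors word by word until either limit is reached."""
    cycles = 0   # number of Rx clock cycles, one word per cycle
    errors = 0   # accumulated bit errors
    for rx in rx_words:
        # Stop condition: either preset maximum freezes both counters.
        if cycles * WORD_BITS >= max_bits or errors >= max_errors:
            break
        diff = (rx ^ expected_word) & ((1 << WORD_BITS) - 1)  # bitwise XOR
        errors += bin(diff).count("1")  # bit errors within this word
        cycles += 1
    return cycles * WORD_BITS, errors


# Usage: ten received words with one wrong bit each, stop after three errors.
bits, errors = run_bert([0x08CEE] * 10, 0x08CEF, max_bits=10**6, max_errors=3)
print(bits, errors, errors / bits)  # counted bits, counted errors, BER
```

Stopping on the error limit keeps runs short at sampling positions with high error rates, while the bit limit bounds the run time in the error-free region.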
Microprocessor
The MicroBlaze Processor controls the measurement sequence and realizes the serial interface
(RS232) between the ML507 board and a remote computer. The software flow graph in figure 10.4
shows the steps of a measurement run, needed for collecting a complete jitter distribution. The
data pattern to be transmitted along with the number of bits N per sampling position are first
entered using MATLAB.
Figure 10.3.: Realization of the BERT logic.
Figure 10.4.: Flow graph of BER measurement and analysis.
The MP then starts with short BER measurements at the first delay step
in synchronization mode. This phase is needed to correctly match the 20bit Rx and Rx-Cmp
words, due to an unknown channel delay. During the synchronization, the Rx word is compared
with a circular shifted version of the original Tx data pattern, where the BERT collects only 3200
words or 64k bits to evaluate the number of errors. This Rx-Cmp pattern is successively rotated
over all 20 positions. The lowest number of errors thus indicates the synchronization pattern to
which the Rx data pattern needs to be compared. The synchronization phase is carried out very
quickly, consuming < 0.5ms, and is repeated at each of the R sampling positions or delays. As a
key advantage, this allows for measuring both distribution tails in a single measurement run (also
see figure 10.5).
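The synchronization search can be sketched in Python as follows. The sketch is illustrative only: the function names are hypothetical, and the rotation direction (left) is an assumption, since the text only states that the comparison pattern is rotated over all 20 positions.

```python
WORD_BITS = 20
MASK = (1 << WORD_BITS) - 1


def rotate_left(word, shift):
    """Circularly rotate a 20 bit word to the left by shift positions."""
    shift %= WORD_BITS
    return ((word << shift) | (word >> (WORD_BITS - shift))) & MASK


def find_sync_position(rx_words, tx_pattern):
    """Return (rotation, errors) of the rotated pattern with the fewest bit errors."""
    best = None
    for shift in range(WORD_BITS):
        candidate = rotate_left(tx_pattern, shift)
        errors = sum(bin(rx ^ candidate).count("1") for rx in rx_words)
        if best is None or errors < best[1]:
            best = (shift, errors)
    return best


# Usage: an error-free channel that delays the pattern by 7 bit positions,
# observed over 3200 words as in the synchronization phase.
rx = [rotate_left(0x08CEF, 7)] * 3200
print(find_sync_position(rx, 0x08CEF))
```

As a plausibility check on the stated duration: 3200 words at the 150MHz word clock take about 21 µs, so 20 rotations need roughly 0.43 ms of pure counting time, which agrees with the stated bound of 0.5ms.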
After synchronization, the MP resets both counters, and starts the long BER measurement with
sample size N and a maximum number of 32k errors. Once either counter reaches the maximum,
the MP reads their values and calculates the BER, passes it on to MATLAB, and increases the
sampling delay to restart the measurement process. Once the BER at every delay step has been
measured, MATLAB runs the program containing the sQN and QN fitting methods, which finally
yields the desired TJpp value at the target BER = 10^{-12}.
10.2. Jitter Measurements and Optimization

In this section the presented diagnostic tool is applied to various test cases and optimization sce-
narios. To demonstrate the operating principle, it is first used to measure the jitter distribution of
a simple 1m RG-58 coaxial cable. In order to yield substantial inter-symbol interference (ISI) on
the transmission line, a PRBS4-like 20bit pattern 08CEFhex is sent. The obtained measurement
result is shown in figure 10.5(a). Using the selected data pattern, only one synchronization value
out of the 20 positions produces the correct, minimum number of errors. Since it is independently
determined at each sampling position prior to the effective BER measurement, both distribution
tails are directly obtained from a single measurement run. This is especially observed at the distri-
bution center, where the synchronization value is decremented by one. Here, the measured BER
corresponds to the right tail behavior or reverse cumulative distribution function 1−CDF. Other-
wise, with a constant synchronization position BER values would saturate at 8/20=0.4, which is
equal to the transition density of the selected pattern.
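The quoted transition density can be verified with a few lines of Python. The helper below is a hypothetical illustration that treats the 20bit word as a repeating pattern, so the transition from the last bit back to the first bit is counted as well.

```python
def transition_density(pattern, word_bits=20):
    """Fraction of bit positions whose neighbor differs, with wrap-around."""
    mask = (1 << word_bits) - 1
    # Rotate by one position so that every bit is aligned with its neighbor.
    rotated = ((pattern << 1) | (pattern >> (word_bits - 1))) & mask
    return bin(pattern ^ rotated).count("1") / word_bits


print(transition_density(0x08CEF))  # 8 transitions in 20 bits
print(transition_density(0x55555))  # clock-like pattern 0101...
```

The same helper shows why a clock-like pattern behaves differently from the PRBS4-like pattern: every bit of the 0101... word is followed by a transition.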
Figure 10.5.: Example for a measured jitter distribution (left: BER and synchronization position over sampling position) and sQN fitted tails (right: CDF and 1−CDF over jitter extent [UI]). N = 10^8, Pat = 08CEFhex, 1m RG-58 coaxial cable.
Figure 10.6.: K=100 evaluations of TJpp estimates [UI] over a) varying sample size N, and b)
RG-58 cable length [m]. Triangles mark medians of exact measurements (N = 10^{11}).
The applied sQN method accurately extrapolates distributions down to the target BER = 10^{-12},
as can be seen from the example in figure 10.6(a) where K=100 distributions of the 1m RG-58
cable are fitted over varying sample size N. As expected, both bias and statistical spread decrease
with larger N because the extrapolation range becomes smaller. Note that the error bias is always
positive and thus yields pessimistic TJpp estimates, which is due to the beneficial extrapolation
property described in section 4.2.1.
In figure 10.6(b) K=100 estimates of TJpp are statistically evaluated over varying cable length,
to estimate the influence of ISI on total jitter. The triangles mark median values of N = 10^{11}
measurements to approximate the ideal extrapolation result. With 5m and 7m cables, the signal is
additionally degraded by noise effects, which cause a larger error bias in figure 10.6(b).
With N = 10^8 samples a measurement run takes approximately 2s, which is sufficiently fast for
jitter diagnosis. This value is clearly below the expected test time of
t_t = (N \cdot R) / (3 \cdot 10^9) s = 4.27 s. (10.4)
This is because bins with high error rates quickly count bit errors, so that the maximum number
of errors is reached very fast. Additionally, the FPGA software first locates a jitter distribution within the
128 sampling positions, so that measurements are only made in the region of interest.
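The upper bound on the test time from equation (10.1), converted to seconds at the 3Gb/s data rate, can be evaluated for several sample sizes with a short Python sketch. The function name is hypothetical, and the bound deliberately ignores the early stop on the maximum error count and the restriction to the region of interest.

```python
def full_scan_time_s(n_bits, n_steps, bit_rate):
    """Upper bound on test time: n_bits observed at each of n_steps delays."""
    return n_bits * n_steps / bit_rate


for n in (1e6, 1e8, 1e11):
    t = full_scan_time_s(n, n_steps=128, bit_rate=3e9)
    print(f"N = {n:.0e}: at most {t:.3g} s")
```

For N = 10^8 this reproduces the 4.27 s of equation (10.4), while N = 10^{11} corresponds to an upper bound of roughly 71 minutes per distribution, which illustrates why exact measurements at that depth are only used as a reference.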
The reference clock of the GTX transceiver includes a certain amount of inherent jitter, which
forms the fundamental minimum for the jitter measurement system. It can be identified using
the internal loopback mode of the GTX, and yields typical fitting results for right bathtub tails as
shown in figure 10.7. Due to the clean data signal, tails become very steep and thus contain only
few data points. Especially at N = {10^6, 10^7} a correct tail extrapolation is therefore no longer
possible. At N = 10^8, furthermore, the median is smaller than the approximated true median at N = 10^{11},
which is due to the error oscillations at lowest bin density, as described in section 5.4.2. Observing
multiple measurement runs, the linear course in Q-domain is given until Q_{up} = Q(∆P_t/N) ≈ 1,
while σ_{t,10^{-11}} ≈ 3.7·10^{-3}. These values can be inserted into equation (5.20) for calculating the
required minimum sample size, which gives N_{min} > 10^7. This is also confirmed by the results in
figure 10.7. The timing budget of GTX inherent jitter is approximated with
TJ_{pp,loopback,N=10^{11}} = TJ_{pp,min} \approx 0.149 UI. (10.5)
Figure 10.7.: Estimated positive jitter at internal loopback mode, using QN as fitting method (left: jitter extent [UI] over sample size N; right: Q(BER) over jitter extent [UI], N = 10^8, Pat = 1010...).
Especially for smallest sample sizes N = {10^6, 10^7} a correct tail extrapolation
fails because bathtub curves (right) become too steep.
Instead of testing the quality of transmission channels, the jitter diagnosis system can also opti-
mize the parameter configuration of a complete high-speed interface as SUT. To demonstrate this
concept, in an example the GTX internal Rx and Tx structures are optimized together with a 5m
RG-58 coaxial cable.
First, only the Tx buffer configuration is optimized. For this purpose, the GTX provides three ports
which control differential output swing (TxDIFF), pre-driver swing (TxBUF) and pre-emphasis
(TxPRE) [137]. Each of these ports has eight different gain settings, resulting in a total of 512 parameter
combinations. Using the diagnostic tool, the TJpp value acts as fitness measure for parameter
optimization, and yields the surfaces in figure 10.8. The selected sample size for measurements
is N = 10^8. Although median values from a statistical evaluation with K=25 measurements are
displayed, a single evaluation is already sufficient for optimizations, since the obtained standard
deviation of estimates is always σ_{TJpp,est} < 0.008 UI.
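An exhaustive search over the 8 x 8 x 8 = 512 parameter combinations with TJpp as fitness measure can be sketched in Python as follows. The callback measure_tjpp stands for one complete measurement and tail fit at a given setting; it is a hypothetical placeholder, as is the toy fitness function, which is not a model of the GTX and serves only to make the sketch executable.

```python
from itertools import product


def optimize_tx_buffer(measure_tjpp, n_settings=8):
    """Search all (TxDIFF, TxBUF, TxPRE) settings for the minimum TJpp."""
    best_setting, best_tjpp = None, float("inf")
    for setting in product(range(n_settings), repeat=3):
        tjpp = measure_tjpp(*setting)  # one measurement run plus tail fit
        if tjpp < best_tjpp:
            best_setting, best_tjpp = setting, tjpp
    return best_setting, best_tjpp


def toy_tjpp(diff, buf, pre):
    """Hypothetical stand-in for a measurement, with a minimum at (2, 3, 4)."""
    return 0.3 + 0.01 * ((diff - 2) ** 2 + (buf - 3) ** 2 + (pre - 4) ** 2)


print(optimize_tx_buffer(toy_tjpp))
```

With roughly 2s per measurement run at N = 10^8, a full sweep of 512 settings with a single evaluation each would take on the order of 17 minutes, which is why a single evaluation per setting being sufficient matters in practice.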
Second, the Rx-PLL is used in normal lock-to-data mode as is the common case for high-speed
data recovery. Without the clean reference clock, the Rx-PLL now suffers from a significant
amount of additional jitter caused by the recovered signal clock, as can also be seen in figure 10.9.
The two optimized Tx parameter configurations finally yield
DIFF=0, BUF=5, PRE=1 TJpp = 0.36 UI (lock-to-reference)
DIFF=0, BUF=5, PRE=0 TJpp = 0.55 UI (lock-to-data)
indicating that there is no visible benefit when using signal pre-emphasis in lock-to-data mode.
Figure 10.8.: TJpp surfaces [UI] for Tx buffer optimization over TxBUF and TxDIFF. (a) TxPRE={0,2}, (b) TxPRE={4,6}. N = 10^8, K=25.
Figure 10.9.: TJpp surfaces [UI] for Tx buffer optimization over TxBUF and TxDIFF, lock-to-data mode. (a) TxPRE=0, (b) TxPRE={2,4,6}. N = 10^8, K=25.
Figure 10.10.: TJpp optimization of Rx structures over the DFE1 tap weight, including four different EQ settings and a
single DFE tap weight. (a) lock-to-reference, (b) lock-to-data. N = 10^8, K=25.
In the same way, the Rx structure can be optimized as well. The GTX includes a built-in gain equalizer
(EQ) and a decision feedback equalizer (DFE) with adjustable tap weights. The EQ provides either
a low, medium or large high-frequency boost, or a bypass with gain factor. In figure 10.10 these four
EQ settings are used together with the first DFE tap DFE1=[0 . . . 31] for parameter optimization.
The plotted curves indicate the EQ settings over varying DFE1 tap value. With EQ gain bypass
and the DFE, a significant reduction of the total jitter down to TJpp < 0.3 UI is possible in
lock-to-reference mode (left). Again, in lock-to-data mode (right) the TJ budget is significantly larger
and the DFE achieves only a slight improvement.
10.3. Analysis of Extrapolation Error

In this section the extrapolation error of both sQN and QN tail fitting methods is investigated when
applied to practical jitter measurements. For this purpose, different test cables are again used as SUTs for
the jitter diagnosis system. For both methods a minimum fitting region of ∆Pt = 10^3 is selected,
since the DJ shape of measured distributions is basically unknown and may change over varying
cable length. With short cables jitter is expected to be dominated by ISI, while for longer cables
additional noise effects such as couplings, reflections or crosstalk will also degrade the signal.
Figure 10.11.: Estimated TJpp values [UI] of a 1m S-ATA crossover cable (solid boxes),
N = {10^6, ..., 10^9}, Pat = 0101... The N = 10^{11} measurement allows for worst case error
estimation (equation (5.30), tables of coefficients 5.4 and 6.2) by assuming
sinusoidal (PLL noise dominated) jitter (dashed boxes). ∆Pt = 10^3. (a) sQN, (b) QN.
Figure 10.12.: Estimated TJpp values [UI] of a 1m S-ATA crossover cable (solid boxes),
N = {10^6, ..., 10^9}, Pat = 08CEFhex. Worst case error estimation with assumed uniform (ISI
dominated) jitter (dashed boxes). ∆Pt = 10^3. (a) sQN, (b) QN.
Both QN and sQN fitting methods are first applied to jitter distributions of a standard 1m S-ATA
crossover cable and a clock-like data pattern. A statistical evaluation of TJpp values over K=100
bathtub measurements and varying sample size N yields the solid boxes in figure 10.11. As
expected, with increasing N the obtained medians converge toward the true value while statistical
spread becomes smaller. Thus, a good estimate for the true TJ budget is obtained from
measurements at N = 10^{11} (K_{10^{11}} = 25). The observed statistical behavior of TJpp estimates can
be compared with expected worst case errors using the derived empirical relation (5.30). With
σ_{RJ} ≈ σ_{t,10^{-11}} ≈ 0.012, these errors can be calculated by choosing the correct DJ type. For
a clock-like pattern, the signal is only affected by the PLL jitter of the reference clock and the
transmission channel cannot produce additional ISI. Thus, sinusoidal DJ is assumed (also refer
to section 3.2.2). The obtained statistical error (dashed boxes) must generally be larger than the
measured error, which is the case and confirms the validity of the worst case error estimation. Only
at N = 10^6, fitted values indicate a larger error bias jump, meaning that the underlying DJ shape
behaves differently from the assumed sinusoidal distribution. Thus, the given sample size is no
longer sufficient for correct tail extrapolation.
Figure 10.13.: Estimated TJpp [UI] of a 5m RG-58 coaxial cable (solid boxes), N = {10^6, ..., 10^9}, Pat = 08CEFhex.
Worst case errors (dashed boxes) assume quadratic curve DJ (combined ISI and
unknown noise couplings, equation (5.30), tables 5.4 and 6.2). ∆Pt = 10^3. (a) sQN, (b) QN.
The second example in figure 10.12 uses a PRBS4-like 08CEFhex pattern for ISI dominated
jitter, instead of the clock-like signal. The expected worst case error assumes uniform DJ and is
again larger than the measured error.
A third example in figure 10.13 shows the influence of additional noise effects, as they appear with
long transmission channels. For this purpose, jitter measurements are now carried out with a 5m RG-58
coaxial cable and statistical evaluations are again performed using sQN and QN methods. The
PRBS4-like 08CEFhex pattern is again used to produce ISI. The worst case error behavior can now
be described as a combined effect of ISI and unknown noise couplings, using quadratic curve
DJ. Validity of this assumption is confirmed with the obtained extrapolation results, which remain
within the expected error limits. At N = 10^9 the worst case error yields somewhat pessimistic
medians and rather tends to underestimate the measured statistical spread. Note that this sample
size is also located beyond the original range of regression planes N = [5·10^5, 10^8] for the tables
of coefficients 5.4 and 6.2, which was considered as more suitable for this use case.
The fourth example investigates the influence of process variations on fitted tails. For this purpose,
the Rx-PLL of the GTX is used in lock-to-data mode, which is the common scenario for high-speed
signal recovery. This operating mode causes the Rx-PLL to lose lock if the data sampling
position is shifted too close to the center of a jitter distribution and thus only allows for correct
BER measurements in the lower tail region of bathtub curves. However, this operating mode also
produces a random phase of the recovered clock as soon as the lock state is reached. Hence,
a DNL error of the BIJM system also affects the measurement result randomly. Over multiple
bathtub measurements it is thus observed as the random effect modeled in section 5.4.3. As test
channel, the 1m S-ATA crossover cable is again utilized. Figure 10.14 shows the statistical spread
of TJpp estimates over varying N with the Rx-PLL in lock-to-data mode (solid boxes). Using
equation (5.30) (uniform DJ) without the DNL error effect, the right dashed boxes are obtained.
Due to the present DNL error, especially the statistical spread is underestimated. This can be
solved by using equation (5.38) together with the tables of coefficients 5.6 and 6.3, and assuming
an additional DNL error of σ_{DNL} = 0.1 UI. The obtained results are the dashed boxes in the center.
As a final analysis, in figure 10.15 the influence of varying channel length (RG-58 coaxial cable)
on the measured TJ timing budget is shown. The solid lines correspond to median values of
estimated TJpp values with N = 10^7 samples and the test pattern 08CEFhex. Black crosses mark
approximated true medians obtained from the N = 10^{11} measurements, and dotted lines highlight
the expected worst case error under a uniform DJ assumption. As expected, the TJ budget generally
increases with larger cable length and thus indicates the increase of ISI. For lock-to-reference
mode, the uniform DJ assumption is violated at large cable lengths due to additional noise
influences. The same effect is also observed for very small cable lengths due to steep bathtub tails, as
discussed in the previous section with figure 10.7. In lock-to-data mode, overall TJ increases
significantly. The included Rx-PLL now superimposes its own jitter with unknown characteristic and
thus violates the uniform DJ assumption as well. Also, measurement results suffer from larger
variations, as they are generally more affected by cable induced jitter.
Figure 10.14.: Estimated TJpp [UI] of a 1m S-ATA crossover cable over sample size N with Rx-PLL in lock-to-data
mode (solid left boxes). Worst case error estimations assume uniform DJ using
(5.30) (right dashed boxes) and (5.38) (center dashed boxes) which additionally
includes the effect of process variations σ_{DNL} = 0.1 UI. ∆Pt = 10^2. (a) sQN, (b) QN.
Figure 10.15.: TJpp medians of tail fitted estimates (N = 10^7, circles), exact measurements
(N = 10^{11}, crosses) and expected worst case uniform DJ (triangles) over varying
cable length [m], in lock-to-data and lock-to-reference mode (QN, Pat = 08CEFhex). ∆Pt = 10^3.
10.4. Summary
An FPGA based diagnostic tool for total jitter estimation in high-speed interfaces has been devel-
oped. It allows for quantifying the timing budget caused by a system under test and thus, can be
used for testing the quality of transmission channels or optimizing the parameter configuration of
interface structures. Jitter measurements at 3Gb/s with a sample size of N = 10^8 and R=128 delay
steps in a unit interval require approximately 2s of test time.
After an initial demonstration of the jitter measurement and sQN fitting principle in figure 10.5,
the TJpp values of different RG-58 cables were determined in figure 10.6 using the developed
diagnostic tool. The inherent jitter of the reference clock (TJpp ≈ 0.15 UI) was also quantified with
loopback measurements and represented in figure 10.7. As an example for parameter optimization
with the presented diagnostic tool, the built-in Rx equalizers and Tx buffers of the FPGA were
configured together with a 5m RG-58 coaxial cable. This allowed the TJ timing
budget to be decreased down to TJpp < 0.3 UI.
Further, the extrapolation error of sQN and QN fitting methods was investigated and compared
against theoretical worst case errors from sections 5.4 and 6.5. In this context the sinusoidal,
uniform and quadratic curve DJ shapes were experimentally confirmed to be well suited for pure
clock jitter, ISI, and ISI plus external noise affected channels (figures 10.11, 10.12 and 10.13).
When the receiver PLL was operated in normal lock-to-data mode, the additional effect of random
process variations could be made visible, which also allowed the DNL error model to be applied in
figure 10.14. As a final analysis, in figure 10.15 the influence of varying RG-58 coaxial cable
length on the measured TJ timing budget was shown, together with the predicted worst case errors
under an ISI dominated DJ assumption. Results showed that this condition is not fulfilled for large
cable lengths due to additional noise influences, as well as for very small cable lengths due to the
system limitations. Also in lock-to-data mode the overall TJ increases and changes the observed
DJ characteristic.
Parts of this chapter have also been published in [C7,C8].
11. Conclusion
This work concludes with an overall summary and overview to the key results achieved throughout
the thesis. A brief outlook to future directions is given at the end.
11.1. Results Summary
In this thesis, first the scaled Q-normalization (sQN) method for jitter and BER analysis was pre-
sented and realized in chapter 4. It is based on the Gaussian quantile function (equation (3.10)),
which linearizes the tails of jitter distributions and thus, allows for Gaussian tail fitting and extra-
polation. The Q-function was embedded into an efficient optimization scheme where a simple
recursion achieves a very fast exploration of the three-dimensional search space. With this re-
cursion, the sQN method automatically determines the best suited tail part for fitting. Thus, it
represents a clear improvement to other methods where the tail part is predefined in a conservative
manner or must be identified using an additional algorithm.
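The linearization that underlies Gaussian tail fitting can be sketched in Python. The sketch only shows a straight-line fit in the Q-domain for a given, fixed tail amplitude A; it does not reproduce the recursive sQN search for the amplitude and the fitting region. It assumes a left tail of the form BER(t) = A·Φ((t − µ)/σ) with Φ the standard normal CDF, and all names and numbers are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def fit_gaussian_tail(times, bers, amplitude=1.0):
    """Least-squares line through (t, Q) with Q = Phi^-1(BER / amplitude).

    For BER(t) = amplitude * Phi((t - mu) / sigma) the points satisfy
    Q = (t - mu) / sigma, so slope = 1 / sigma and intercept = -mu / sigma.
    """
    q = [STD_NORMAL.inv_cdf(b / amplitude) for b in bers]
    n = len(times)
    t_mean = sum(times) / n
    q_mean = sum(q) / n
    slope = (sum((t - t_mean) * (y - q_mean) for t, y in zip(times, q))
             / sum((t - t_mean) ** 2 for t in times))
    intercept = q_mean - slope * t_mean
    sigma = 1.0 / slope
    mu = -intercept * sigma
    return mu, sigma


# Synthetic, noise-free tail samples (hypothetical values in UI).
true_mu, true_sigma, amp = -0.05, 0.012, 0.4
ts = [-0.12 + 0.005 * k for k in range(8)]
ys = [amp * STD_NORMAL.cdf((t - true_mu) / true_sigma) for t in ts]
mu, sigma = fit_gaussian_tail(ts, ys, amplitude=amp)
print(f"mu = {mu:.4f} UI, sigma = {sigma:.4f} UI")
```

With measured data the amplitude is unknown and the tail is only approximately Gaussian beyond some point, which is the part of the problem that the recursive search over the three-dimensional space addresses.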
The extrapolation error of the basic sQN principle has been investigated in section 4.2, where
the major causes for a degraded performance have been identified as small sample sizes N, worst
case combinations of RJ and DJ, as well as Gaussian-like DJ shapes. However, the resulting
extrapolation error is always positively biased and thus pessimistic, which is a further beneficial
property of algorithms based on quantile normalization in general.
From the basic sQN principle, two optimized algorithmic scenarios have been derived which
allow for improved error behavior and robustness. The ĉ1.2 scenario in section 4.3 recommends
a minimum probability interval ∆Pt ≥ 10^2 for outermost tail selection. This parameter avoids
outliers caused by statistical tail variations. Further, the scenario combines fitness measures based
on both regression length and error to achieve an optimized error behavior. Another algorithmic
scenario, Qth,c, is based on a constant threshold Qmin in scaled Q-domain and thus defines the
Gaussian tail region in terms of standard deviations beside the model mean. This representation
form allows for a flexible tail choice. As a trade-off, Qmin = −1.2 has been identified to achieve
acceptable accuracy with reduced risk for outlier occurrence.
Both algorithmic approaches improve the estimation performance compared to the basic sQN
principle. For example, with a typical uniform DJ, N = 10^6 and worst case test distributions
(σRJ/ADJ = 1/4), the error bias is still <2% with an overall error <3% in more than 97.5%
of the cases (confidence level a>0.95). The performance of the ĉ1.2 scenario is equivalent to Qth,c
for N ≥ 10^6, and outperforms it at smaller sample sizes. This is due to a larger error variation of
Qth,c when a fit is performed at the outermost tail part.
In chapter 5, hardware design aspects for the sQN method were investigated. The basic idea was
to highlight properties of the proposed method when used together with test equipment, diagnostic
tools or built-in jitter measurement (BIJM) systems. Unlike simulations, these systems introduce
a discretization effect, which divides a distribution into R time intervals or bins per UI. As key
parameters for a system design, both the sample size N as well as the discrete number of bins
R were shown to cause fundamental limiting effects with respect to analyzable test distributions.
In order to correctly fit distribution tails, hence, equations were derived that specify minimum
requirements for tails.
First the tail parameters of fitted test distributions were characterized in section 5.1 using the
polynomial equations (5.3) and (5.5). They allow for changing back and forth between the variables
before (σRJ, ADJ) and the obtained tail parameters after (At, σt, µt) distribution synthesis.
The coefficients in table 5.1 and equation (5.3) specify tail amplitudes At obtained with the sQN
fitting method. The two parameters σt and µt describe minimum requirements for sQN tail fitting,
and are also valid for the conventional Q-normalization (QN) method described in chapter 6.
Requirements with respect to minimum tail amplitudes At,min were investigated in section 5.2.
For the ĉ1.2 algorithm a conservative threshold with N and the design parameter ∆Pt were given
in equation (5.10). For the Qth,c algorithm instead, equation (5.13) led to an exact solution, and
further highlighted the missing link to the ĉ1.2 algorithm based on ∆Pt .
Limitations introduced by the discrete time resolution with R number of bins per UI were in-
vestigated in section 5.3. This problem can also be represented in terms of a minimum analyzable
standard deviation σt,min of Gaussian tails. With equation (5.20) for ĉ1.2 and equation (5.23) for
the Qth,c algorithmic scenario, the minimum value σt,min is determined which can be fitted cor-
rectly by each of the algorithms. Selection charts that aid in identifying the design parameter ∆Pt
in figure 5.9 (ĉ1.2 ) or both Qmin and ∆Pt in figure 5.11 (Qth,c ) were provided as well. Note that
the parameters in both charts must also fulfill the previously mentioned requirements with respect
to the minimum tail amplitude (equation (5.10) or (5.13)), and outlier suppression with ∆Pt ≥10²
chosen as large as possible.
In section 5.4 the combined influence of sample size and time discretization on the extrapolation
error of the sQN method was quantified. To this end, the empirical relation (5.30) was derived
together with the coefficient tables 5.4 (ĉ1.2 ) and 5.5 (Qth,c ). It describes bias, spread and
their combined influence on extrapolation error. The empirical relation is given in terms of a
two-dimensional function of sample size N and the variable product σRJ,min ·R. This product
is a measure for the bin density along a distribution tail. The empirical relation aids a designer
in finding an optimum performance trade-off between the required accuracy of a BIJM and the
hardware expense in terms of key parameters N and R. The obtained results clearly highlight a
better performance for the ĉ1.2 algorithm.
The additional effect of process variations in a jitter measurement system was investigated in
section 5.4.3. Such variations typically result from timing mismatches inside the measurement
system, and were modeled as a differential non-linearity (DNL) error with standard deviation σDNL .
As a result, the statistical spread of extrapolated tails increases significantly and becomes the
major contributor to the overall error. Therefore, the empirical equation (5.38), which includes
the DNL error effect, was derived, and the corresponding coefficients were specified in table 5.6.
The effect of error oscillations at the lowest bin densities was investigated in section 5.4.2. If
a certain distribution shape is known, equation (5.35) allows the error maxima to be determined and
N and R to be adjusted accordingly. For the general case of unknown distribution shapes, σRJ,min ·R≥2
can avoid such oscillations.
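The rule of thumb for unknown distribution shapes amounts to simple arithmetic on the bin density. The helper names below are hypothetical, and the threshold of 2 is the value stated in the text; the numerical inputs are illustrative only.

```python
import math


def bin_density(sigma_rj_min_ui, num_bins):
    """Product sigma_RJ,min * R: bins per standard deviation of the narrowest RJ tail."""
    return sigma_rj_min_ui * num_bins


def min_bins_for_density(sigma_rj_min_ui, threshold=2.0):
    """Smallest integer R for which sigma_RJ,min * R reaches the threshold."""
    return math.ceil(threshold / sigma_rj_min_ui)


# Illustrative value: a narrowest RJ tail of 1/64 UI.
required_bins = min_bins_for_density(1.0 / 64.0)
```

If the required R cannot be realized in hardware, the text suggests the alternative route of using a known distribution shape together with equation (5.35).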
Section 5.5 presented two typical design examples to highlight the calculation steps involved
with optimized BIJM designs. The first example assumed a jitter diagnosis scenario with a mini-
mum RJ tail of σt,min =0.01UI and a realizable number of R=128 bins per UI. First, the minimum
amplitude At,min was determined to correctly specify ∆Pt , which guaranteed correct fitting
behavior. With equation (5.30) the sQN method achieved a worst case error bias Emed ≤2.0%
as well as an estimation loss EL ≤5.4% if the BIJM design was not affected by process varia-
tions. Otherwise, with σDNL =0.05UI and equation (5.38) the error increased to Emed ≤2.3% and
EL ≤7.4%.
The second design example focused on production testing with very stringent requirements on
the test time tt ≈20ms. With 14 parallel counters, the assumed system would be able to collect only
N ≈26k jitter values within the given test time, but still achieve an extrapolation error EL <15%.
The performance of the proposed and previously published tail fitting principles based on Gaus-
sian quantile normalization was compared in chapter 6. These include the scaled Q-normalization
(sQN) method from chapters 4 and 5, the conventional Q-normalization method (QN), as well as
higher order polynomial methods (QP2, QP3, QP4) for tail fitting and extrapolation. The opti-
mization scheme in figure 6.1 was first derived to give a unifying, generalized perspective on
the compared methods. An efficient implementation of this scheme included the use of a fast
Levinson-Durbin recursion. Optimum parameter regions were derived using the ∆Pt parameter
for outer tail part selection and led to the configuration in table 6.1.
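The common core of the compared methods, a regression of Gaussian quantiles, can be sketched in a few lines. This is a simplified illustration and not the optimization scheme of figure 6.1: it uses an ordinary least-squares line fit, takes the tail amplitude as a given input instead of optimizing it, and omits the tail part selection based on ∆Pt . The function name and the test values are assumptions.

```python
from statistics import NormalDist


def fit_gaussian_tail(xs, cdf_values, amplitude=1.0):
    """Fit Q = (x - mu) / sigma to a left tail by a line fit in the Q-domain.

    With amplitude = 1 this is a plain Q-normalization; other values rescale
    the probabilities before they are mapped to Gaussian quantiles.
    """
    std = NormalDist()
    qs = [std.inv_cdf(p / amplitude) for p in cdf_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_q = sum(qs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxq = sum((x - mean_x) * (q - mean_q) for x, q in zip(xs, qs))
    slope = sxq / sxx
    intercept = mean_q - slope * mean_x
    sigma = 1.0 / slope        # a Gaussian tail is a straight line with slope 1/sigma
    mu = -intercept / slope    # the line crosses Q = 0 at x = mu
    return sigma, mu


# Noise-free synthetic tail: At = 0.5, sigma_t = 0.02 UI, mu_t = 0.1 UI.
true_tail = NormalDist(mu=0.1, sigma=0.02)
xs = [0.02 + 0.005 * i for i in range(10)]
cdf_values = [0.5 * true_tail.cdf(x) for x in xs]
sigma, mu = fit_gaussian_tail(xs, cdf_values, amplitude=0.5)
```

On noise-free data the fit recovers the tail parameters; with measured histograms the selection of the fitted tail part and the amplitude search determine the accuracy, which is where the compared methods differ.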
A comprehensive performance evaluation was first carried out for typical simulator environ-
ments by assuming R=Rsim =3.3·10⁵ . With a varying sample size N and DJ type, figures 6.7-6.10
highlighted the different characteristics for each of the investigated methods. As a fundamental
result, the QN method showed the same beneficial property of a strictly positive error bias as the
sQN method. This is due to the asymptotic behavior of tails in Q-domain as already observed with
the sQN method in section 4.2.1. Hence, extrapolation results are always pessimistic. This is a
clear advantage compared to higher order polynomial methods (QP2, QP3, QP4), which achieved
acceptable accuracy only for certain DJ types or for a small range of distributional shapes. Third
and fourth order polynomials suffered from a large statistical variation of results, especially with
RJ dominant distributions where the Q-tails approximately follow a linear course. This problem
is generally due to an over-fitting effect.
A comprehensive comparison with the proposed sQN method (ĉ1.2 optimization based on ∆Pt )
was carried out in figures 6.11-6.15. The sQN method clearly achieved best performance as long
as the fitting condition for tail amplitudes in equation (5.10) was correctly met. Although the QN
method is less accurate, it offers the advantage of approximately 35 times faster tail fits, as was
also shown previously in figure 4.9. With this complementary property and the pessimistic tail
extrapolation, conventional QN is well suited for tail fitting and thus, also becomes a candidate for
hardware designs. The influence of such a limited time discretization R≪Rsim on the extrapola-
tion error was investigated in figure 6.16. It essentially showed a linear increase of the extrapolation
error over a large range of R for both QN and sQN methods.
In order to allow hardware designers to choose the better suited algorithm
for a jitter measurement system, the extrapolation performance of the QN method was evaluated in
section 6.5, equivalent to the sQN error analysis in section 5.4. Resulting coefficients were given
in table 6.2 for the empirical relation (6.6) without additional process variations, and in table 6.3
for the empirical relation (6.7) with included process variations as differential non-linearity error.
Hardware performance of the QN method was highlighted with the continued sQN design exam-
ples from section 5.5. Results showed that, although the QN method is generally less accurate
than sQN, it is also less affected by differential non-linearity error.
For a jitter measurement system using the QN method, the previously derived design equations
from chapter 5 can be applied again. Hence, equation (5.3) and condition (5.10) guarantee a
minimum amplitude At,min , while equation (5.20) and the selection chart in figure 5.9 guarantee
that a minimum RJ standard deviation σt,min is fitted correctly.
The flexible architecture of the sQN optimization scheme was highlighted in chapter 7, where a
generalized version led to a scheme for arbitrary non-Gaussian tail fitting. In this context, the gen-
eralized Gaussian distribution (GGD) function was introduced for tails with arbitrary exponential
power law behavior. A GGD uses an additional shape parameter α, and includes the Gaussian
distribution as a special case (α=2). It is thus fully consistent with the existing RJ-DJ model.
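The role of the shape parameter can be illustrated with the generalized Gaussian density in one of its common textbook parameterizations, f(x) = α / (2βΓ(1/α)) · exp(−(|x−µ|/β)^α). The scale parameter β and the function name are assumptions here; the thesis may use a different parameterization. With α = 2 and β = σ√2 the expression reduces to the Gaussian density.

```python
import math
from statistics import NormalDist


def ggd_pdf(x, mu, beta, alpha):
    """Generalized Gaussian density with location mu, scale beta and shape alpha."""
    coeff = alpha / (2.0 * beta * math.gamma(1.0 / alpha))
    return coeff * math.exp(-((abs(x - mu) / beta) ** alpha))


# Special case alpha = 2: identical to a Gaussian with sigma = beta / sqrt(2).
sigma = 0.02
g = ggd_pdf(0.03, mu=0.01, beta=sigma * math.sqrt(2.0), alpha=2.0)
reference = NormalDist(mu=0.01, sigma=sigma).pdf(0.03)
```

For the same β, a shape parameter below 2 lets the density decay more slowly far from µ, which corresponds to the heavy tailed case mentioned in the text.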
For simulations where Rsim is sufficiently large, the GGD method achieved acceptable accu-
racy, although it suffered from a slightly negative or optimistic error bias. The reason is a general
overestimation of shape parameters, as was shown in figure 7.6. For hardware scenarios with lim-
ited R, additional error further degraded the performance and led to unreliable extrapolations. This
use case is thus not recommended. A performance comparison in figure 7.9 clearly highlighted
the advantage of the generalized method when used together with heavy tailed test distributions
(α≤2), instead of the sQN method from chapter 4. Since the proposed method was implemented
for use with a broad range of test distributions, further improvements may be achieved if spe-
cific shapes or test distributions are known and the fitting algorithm only focuses on these special
characteristics.
Chapter 8 provided a first application example for the sQN method when used with system
behavioral simulations. For this purpose, an accurate CPLL model was implemented as an enhanced
version of a prior event-driven approach. It allowed for dynamic run-time variations of parameters
and included a gain regulator pole as well as a VCO noise model. On an Intel Core Duo 2.2GHz
laptop, the model was able to gather up to 20k jitter values per second simulation time.
Initial simulations in figure 8.4 compared the closed loop phase noise PSD with measurement
data at different parameter configurations, and showed an excellent agreement. Only spectral spurs
at multiples of 50MHz could not be reproduced, as they were not modeled. Two different simula-
tion methods, one based on the sQN method and the other on a spectral method, determined jitter
transfer functions of the modeled PLL. Resulting curves were again compared with measurements
in figures 8.7-8.9. Both methods correctly identified the measured cut-off frequency of transfer
functions. However, the spectral method better reflected the peaking behavior in the cut-off re-
gion, while the sQN method was instead able to correctly track curves obtained with different test
patterns. This is due to the same analysis principle as used in measurements.
A second application example for the sQN method was given in chapter 9, together with an
adaptive algorithm for jitter tolerance analysis of high-speed PLLs. It consists of an adaptive
recursion (equation (9.1)) as well as a mechanism for automatic sample size adaptation (figure 9.2),
to efficiently determine jitter tolerance curves.
In the example, the algorithm was applied to the PLL model from chapter 8. After deriving a
well suited set of parameters (table 9.2), the performance of four different algorithmic combina-
tions was investigated. These included use of the QN and sQN fitting methods as well as their
realizations without sample size adaptation (c-QN and c-sQN).
Results highlighted a general decrease of the total number of required jitter samples N by a
factor of 2-3, if the automatic sample size adaptation (QN, sQN) is utilized. The smallest com-
putational effort was achieved with QN and c-QN. The smallest number of iterations instead, was
given without sample size adaptation (c-QN and c-sQN), because their constant error bias does
not influence the adaptation process. Hardware simulations with a reduced number of bins R (ta-
ble 9.4 and figure 9.6) indicated that the developed algorithm can also be applied to hardware jitter
measurements. Finally, in figure 9.7 simulated jitter tolerance curves were also compared against
measurements from the corresponding test structure and showed an excellent agreement.
The third application example for tail fitting methods was given in chapter 10, where an FPGA
based diagnostic tool for TJ estimation in high-speed interfaces was implemented. The mea-
surement routine used an efficient algorithm, which realizes jitter measurements at 3Gb/s with a
sample size of N =10⁸ and R=128 in approximately 2s of test time. Using this diagnostic tool,
the TJpp values of different RG-58 cables were determined in figure 10.6. The inherent jitter of
the reference clock was quantified with loopback measurements and indicated TJpp ≈0.15 UI. As
an example for parameter optimization, the built-in Rx equalizers and Tx buffers of the FPGA
were configured in figures 10.8-10.10 together with a 5m RG-58 coaxial cable. For the best case
with an included DFE, the TJ timing budget was reduced by more than 28% to TJpp <0.3UI.
The extrapolation error of sQN and QN fitting methods was investigated and compared against
theoretical worst case errors from sections 5.4 and 6.5 respectively. This made it possible to experimentally
confirm the sinusoidal, uniform and quadratic curve DJ shapes as well suited for clock jitter, ISI,
and ISI plus additional noise affected channels (figures 10.11, 10.12 and 10.13). The DNL error
model was also successfully applied in figure 10.14 with jitter measurements from the Rx-PLL
operating in lock-to-data mode.
11.2. Outlook
The sQN method from chapter 4 was shown to achieve an excellent accuracy of extrapolated
tails, combined with a fast and flexible tail fitting procedure. The derived equations and empirical
relations are primarily intended for use with a broad variety of test distributions and DJ types. For
further improvements with respect to certain distribution shapes, it would thus be interesting to
focus on specific test cases, where one of the tail parameters is for example known or can easily
be approximated.
The residual analysis of fitted quantiles in section 4.1.3 highlighted a non-constant and corre-
lated error structure for the outermost tail region in Q-domain. This makes the linear regression
of quantiles still sub-optimal with respect to a maximum likelihood tail search [20]. However,
an additional weighting and de-correlation of errors requires an excessive computational demand,
and is thus not feasible. Nevertheless it would be interesting to compare the sQN error with the
theoretical performance maximum for tail fitting methods. Such comparisons might for example
become possible with fitting methods based on maximum likelihood approaches.
The generalized fitting principle in chapter 7 loses accuracy with a reduced number
of bins R or a large shape parameter α. A detailed analysis of different fitness measures and
their combinations, as well as the focus on a few specific tail shapes could possibly improve the
performance.
Finally, with the given empirical error analysis of the presented sQN and QN methods, the
extrapolation accuracy in all kinds of future BIJM designs, diagnostic tools or measurement devices
can be predicted as long as they provide jitter distributions. Thus, the derived empirical relations
may also find a direct application in preliminary concept studies of novel systems.
A. Figure Data Files
The following lists name the MATLAB scripts which were used to post-process the C simula-
tions, in order to generate the figures and tables of coefficients throughout this work. The scripts
can be found in the sub-folders specified in the last column. All simulation data and results are
documented on an appended DVD, available upon request at:
[email protected]
Figure Section Page MATLAB File Folder

4.4 4.1.1 28 TestDist_analyze_bathtub 4/4/
4.5 4.1.1 29 TestDist_analyze_bathtub 4/4/
4.7 4.1.3 32 qqtest_v3 3/5/
4.9 4.1.4 36 calc_time 4/8/
4.10 4.2 37 stim_ber_estimate_ML_RJDJ 4/3/
4.11 38
4.2.1 stim_ber_estimate_ML 4/3/
4.12 38
4.14 39
4.15 40
4.17 43
4.18 43
4.19 44
4.20 44
4.22 47
4.23 4.3.2 48 stim_ber_dPmin_Pmin_UImin 4/3/
4.24 49
4.25 50
4.26 51
4.29 4.4.2 54 stim_ber_estimate_ML 4/3/
4.30 56
4.31 56
4.32 57
4.33 57
5.2 5.1.1 61 app_tj_analysis 3/6/
5.3 5.1.1 62 stim_ber_estimate_ML_RJDJ 4/3/
5.4 5.1.2 63 app_tj_analysis 3/6/
5.8 5.3 69 stim_ber_estimate_ML_LTR 4/3/
5.10 5.3 71 stim_ber_estimate_ML_LTR 4/3/
5.12 5.4.1 73 stim_ber_estimate_ML_RJDJ 4/3/
5.13 5.4.1 74 stim_ber_estimate_ML_LTR 4/3/
5.14 5.4.1 75 stim_ber_N_RJmin_looped 4/3/
5.15 5.4.2 78 stim_ber_N_RJmin_looped 4/3/
TABLE A.1.: List of MATLAB files to generate simulation figures.
Figure Section Page MATLAB File Folder

6.3 6.2 89 stim_ber_dPmin_Pmin_UImin 4/3/
6.4 6.2 90 stim_ber_dPmin_Pmin_UImin 4/3/
6.5 6.2 91 stim_ber_dPmin_Pmin_UImin 4/3/
6.6 6.2 91 stim_ber_dPmin_Pmin_UImin 4/3/
6.7 6.3 94 stim_ber_estimate_poly 4/3/
6.8 6.3 95 stim_ber_estimate_poly 4/3/
6.9 6.3 96 stim_ber_estimate_poly 4/3/
6.10 6.3 97 stim_ber_estimate_poly 4/3/
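6.11 6.4 99 stim_ber_estimate_poly, stim_ber_estimate_ML 4/3/
6.12 6.4 100 stim_ber_estimate_poly, stim_ber_estimate_ML 4/3/
6.13 6.4 101 stim_ber_estimate_poly, stim_ber_estimate_ML 4/3/
6.14 6.4 102 stim_ber_estimate_poly, stim_ber_estimate_ML 4/3/
6.15 6.4 103 stim_ber_estimate_poly, stim_ber_estimate_ML 4/3/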
6.16 6.4 104 stim_ber_dPmin_Pmin_UImin 4/3
7.6 7.3.1 116 stim_ber_ggd_ML 4/3/
7.7 7.3.1 117 stim_ber_ggd_ML 4/3/
7.8 7.3.2 117 stim_ber_ggd_ML 4/3/
7.9 7.3.3 118 stim_ber_ggd_ML 4/3/
8.4 8.2.1 127 cdr_stim_jit_FFT 4/5/
8.5 8.2.1 128 cdr_stim_jit_FFT 4/5/
8.6 8.2.2 129 cdr_stim_jit_FFT 4/5/
8.7 8.2.2 130 cdr_stim_transfer 4/5
8.8 8.2.2 130 cdr_stim_transfer 4/5
8.9 8.2.2 131 cdr_stim_transfer 4/5
9.3 9.1.2 137 stim_ber_estimate_ML_sin 4/3/
9.4 9.2.1 139 cdr_stim_JTOL 4/5/
9.5 9.2.1 140 cdr_stim_JTOL_loop 4/5/
9.6 9.2.2 142 cdr_stim_JTOL_Rloop 4/5/
9.7 9.2.2 142 cdr_stim_JTOL 4/5/
10.5 10.2 149 Gaussfit_v2 4/6/
10.6 10.2 149 Gaussfit_v2, Gaussfit_erran_v1 4/6/
10.7 10.2 150 Gaussfit_v2 4/6/
10.8 10.2 150 GTX_Gaussfit_v1 4/6/
10.9 10.2 151 GTX_Gaussfit_v1 4/6/
10.10 10.2 151 RXTX_Gaussfit_v1 4/6/
10.11 10.3 152 Gaussfit_v2 4/6/
10.12 10.3 152 Gaussfit_v2 4/6/
10.13 10.3 153 Gaussfit_v2 4/6/
10.14 10.3 154 Gaussfit_v2 4/6/
10.15 10.3 154 Gaussfit_erran_v1 4/6/
TABLE A.1.: List of MATLAB files to generate simulation figures. (continued)
Table Section Page MATLAB File Folder
5.1 5.1.1 62 app_tj_analysis 3/6/
5.2 5.1.1 64 app_tj_analysis 3/6/
5.4 5.4.1 76 stim_ber_N_RJmin_looped 4/3/
5.5 5.4.1 77 stim_ber_N_RJmin_looped 4/3/
5.6 5.4.3 80 stim_ber_N_RJmin_DNL_looped 4/3/
6.2 6.5 105 stim_ber_N_RJmin_looped 4/3/
6.3 6.5 106 stim_ber_N_RJmin_DNL_looped 4/3/
TABLE A.2.: List of MATLAB files to generate tables of coefficients.
System-C Testbench Options File MATLAB Files for post-processing
BER_Test_stim_template ber_test.opt stim_ber_estimate_ML, stim_ber_estimate_ML_LTR, stim_ber_estimate_poly, stim_ber_estimate_ML_RJDJ
BER_Test_stim_dPmin_Pmin_looped ber_test_dPminPminloop.opt stim_ber_dPmin_Pmin_UImin
BER_Test_stim_UImin_Pmin_looped ber_test_UIminPminloop.opt stim_ber_dPmin_Pmin_UImin
BER_Test_stim_N_RJmin_looped ber_test_NRJminloop.opt stim_ber_N_RJmin_looped
BER_Test_RJ_tail ber_test_RJ_tail.opt stim_ber_ggd_ML
cdr_stim_jit_fft jit_fft_an.opt cdr_stim_jit_FFT
cdr_stim_jit_transfer jit_trans_an.opt cdr_stim_transfer
cdr_stim_JTOL jtol_an.opt cdr_stim_JTOL, cdr_stim_JTOL_loop, cdr_stim_JTOL_Rloop
TABLE A.3.: List of System-C testbenches for simulations and MATLAB post-processing files.
Own Publications
[C1] S. Erb and W. Pribyl, “An Accurate and Efficient Method for BER Analysis in High-
Speed Communication Systems,” IEEE European Conf. on Circuit Theory and Design
(ECCTD’09), Aug. 2009.
[C2] S. Erb and W. Pribyl, “A Behavioral Modeling Approach for Jitter Analysis in Charge-Pump
PLLs,” Austrian Workshop on Microelectronics (AUSTROCHIP’09), Oct. 2009.
[C3] S. Erb and W. Pribyl, “Comparison of Jitter Decomposition Methods for BER Analysis of
High-Speed Serial Links,” IEEE Symp. on the Design and Diagnostics of Electronic Circuits
and Systems (DDECS’10), Apr. 2010.
[C4] S. Erb and W. Pribyl, “Design and Performance Considerations for an On-Chip Jitter Anal-
ysis System,” IEEE Int. Symp. on Circuits and Systems (ISCAS’10), May 2010.
[C5] S. Erb and W. Pribyl, “An Approach to Generalized Jitter and BER Analysis,” Austrian
Workshop on Microelectronics (AUSTROCHIP’10), Oct. 2010.
[C6] S. Erb and W. Pribyl, “A Method for Fast Jitter Tolerance Analysis of High-Speed PLLs,”
IEEE Conf. Design Automation and Test in Europe (DATE’11), Mar. 2011.
[C7] S. Erb, M. Stadler and W. Pribyl, “An FPGA based Diagnostic Tool for Jitter Optimization in
Serial High-Speed Transceivers,” IEEE Ph.D. Research in Microelectronics & Electronics
(PRIME’11), Jul. 2011, submitted for publication in Feb. 2011.
[C8] S. Erb and W. Pribyl, “Design Specification for BER Analysis Methods using Built-in Jitter
Measurements,” IEEE Trans. VLSI Systems, submitted for publication in Oct. 2010.
[C9] S. Erb, “Method and Device for Predicting a Figure of Merit from a Distribution,” U.S.
Patent Application, US2010/0 246 650 A1, Sep. 30, 2010.
Bibliography
[1] P. Acco, M. P. Kennedy, C. Mira, B. Morley, and B. Frigyik, “Behavioral Modeling of
Charge Pump Phase Locked Loops,” in IEEE Int. Symp. Circuits and Systems (ISCAS’99),
1999, pp. 375–378.
[2] O. Agazzi, M. Hueda, H. Carrer, and D. Crivelli, “Maximum-likelihood sequence estima-
tion in dispersive optical channels,” J. Lightwave Technology, vol. 23, no. 2, pp. 749–763,
Feb. 2005.
[3] Agilent Tech. (2003, Feb.) Jitter Analysis Techniques for High Data Rates. White Paper.
Cited 2010-09-17. [Online]. Available: www.agilent.com
[4] S. Ahmed and T. Kwasniewski, “Efficient Simulation of Jitter Tolerance for All-Digital
Data Recovery Circuits,” IEEE Midwest Symp. Circuits and Systems (MWSCAS’07), pp.
1070–1073, Aug. 2007.
[5] S. Ali, “Basics of Chip-to-Chip and Backplane Signaling,” IEEE Solid-State Circuits Conf.
Tutorial (ISSCC’08), Feb. 2008.
[6] G. Balamurugan, B. Casper, J. Jaussi, M. Mansuri, F. O’Mahony et al., “Modeling and
Analysis of High-Speed I/O Links,” IEEE Trans. Advanced Packaging, vol. 32, no. 2, pp.
237–247, May 2009.
[7] R. E. Best, Phase-Locked Loops - Design, Simulation and Applications, 5th ed. New York
(NY): McGraw-Hill, 2003.
[8] W. Beyene, C. Madden, J.-H. Chun, H. Lee, Y. Frans et al., “Advanced Modeling and Ac-
curate Characterization of a 16 Gb/s Memory Interface,” IEEE Trans. Advanced Packaging,
vol. 32, no. 2, pp. 306–327, May 2009.
[9] L. Bizjak, “Development of a PLL Blocks Library for accurate Time-Domain simulations
and Clock Analysis Software,” Master’s thesis, Università degli studi di Udine, I, 2005.
[10] G. E. P. Box and M. E. Muller, “A Note on the Generation of Random Normal Deviates,”
Ann. Math. Statistics, no. 29, pp. 610–611, 1958.
[11] M. Brownlee, “Low Noise Clocking for High Speed Serial Links,” Ph.D. dissertation, Ore-
gon State University, US, 2006.
[12] Y. Cai, A. Bhattacharyya, J. Martone, A. Verma, and W. Burchanowski, “A Comprehensive
Production Test Solution for 1.5Gb/s and 3Gb/s Serial-ATA - based on AWG and Under-
sampling Techniques,” IEEE Int. Test Conf. (ITC’05), Nov. 2005.
[13] Y. Cai, S. Werner, G. Zhang, M. Olsen, and R. Brink, “Jitter Testing for Multi-Gigabit
Backplane SerDes - Techniques to Decompose and Combine Various Types of Jitter,” IEEE
Int. Test Conf. (ITC’02), pp. 700–709, Oct. 2002.
[14] Y. Cai, B. Laquai, and K. Luehman, “Jitter Testing for Gigabit Serial Communication Trans-
ceivers,” IEEE Design & Test of Computers, vol. 19, no. 1, pp. 66–74, Jan.-Feb. 2002.
[15] B. Casper and F. O’Mahony, “Clocking Analysis, Implementation and Measurement Tech-
niques for High-Speed Data Links - A Tutorial,” IEEE Trans. Circuits Syst. I, vol. 56, no. 1,
pp. 17–39, Jan. 2009.
[16] A. Chan and G. Roberts, “A Jitter Characterization System Using a Component-Invariant
Vernier Delay Line,” IEEE Trans. VLSI Systems, vol. 12, no. 1, pp. 79–95, Jan. 2004.
[17] A.-S. Chao and S.-J. Chang, “A Jitter Characterizing BIST with Pulse-Amplifying Tech-
nique,” IEEE Asian Test Symp. (ATS’09), pp. 379–384, Nov. 2009.
[18] K.-H. Cheng, J.-C. Liu, C.-Y. Chang, S.-Y. Jiang, and K.-W. Hong, “Built-in Jitter Mea-
surement Circuit With Calibration Techniques for a 3-GHz Clock Generator,” IEEE Trans.
VLSI Systems, vol. PP, no. 99, pp. 1–11, Jun. 2010.
[19] S. Coles, An Introduction to Statistical Modeling of Extreme Values. London (GB):
Springer, 2001.
[20] M. J. Crawley, The R Book. Chichester (GB): John Wiley & Sons, 2007.
[21] B. Daniels, R. Farrell, and G. Baldwin, “Arbitrary Order Charge Approximation Event
Driven Phase Lock Loop Model,” IET Irish Signals and Systems Conf. (ISSC), Jul. 2004.
[22] N. Da Dalt, M. Harteneck, C. Sandner, and A. Wiesbauer, “Numerical Modeling of PLL Jit-
ter and the Impact of its Non-White Spectrum on the SNR of Sampled Signals,” Southwest
Symp. Mixed-Signal Design (SSMSD’01), pp. 38–44, Feb. 2001.
[23] N. Da Dalt and C. Sandner, “Introduction to PLL Jitter Definitions,” presentation slides,
Feb. 2001.
[24] N. Da Dalt, “Cheetah CDR L90 v2 Measurements,” Infineon Technologies, Jan. 2007, con-
fidential.
[25] N. Da Dalt, “Theory and Implementation of Digital Bang-Bang Frequency Synthesizers for
High Speed Serial Data Communications,” Dissertation, Technische Hochschule Aachen,
D, 2007.
[26] A. R. DiDonato and A. H. Morris, Jr., “Computation of the Incomplete Gamma Function
Ratios and their Inverse,” ACM Trans. Math. Softw., vol. 12, no. 4, pp. 377–393, 1986.
[27] Q. Dou and J. Abraham, “Jitter Decomposition by Time Lag Correlation,” IEEE Int. Symp.
Quality Electronic Design (ISQED’06), Mar. 2006.
[28] Q. Dou and J. Abraham, “Jitter Decomposition in High-Speed Communication Systems,”
IEEE European Test Symp. (ETS’08), pp. 157–162, May 2008.
[29] J. Eckle-Kohler and M. Kohler, Eine Einführung in die Statistik und ihre Anwendungen.
Berlin (D): Springer, 2008.
[30] W. F. Egan, Phase-Lock Basics, 2nd ed. Hoboken (NJ): John Wiley & Sons, 2007.
[31] M. Evans, N. Hastings, and B. Peacock, Statistical Distributions, 2nd ed. Hoboken (NJ):
John Wiley & Sons, 1993.
[32] Y. Fan, Y. Cai, L. Fang, A. Verma, W. Burchanowski et al., “An Accelerated Jitter Tolerance
Test Technique on ATE for 1.5GB/s and 3GB/s Serial-ATA,” IEEE Int. Test Conf. (ITC’06),
pp. 1–10, Oct. 2006.
[33] Y. Fan and Z. Zilic, “Accelerating Jitter Tolerance Qualification for High Speed Serial In-
terfaces,” IEEE Int. Symp. Quality of Electronic Design (ISQED’09), pp. 360–365, Mar.
2009.
[34] A. Fog. (2008, Feb.) “C++ Library of Pseudo Random Number Generators”. Cited
2008-10-09. [Online]. Available: www.agner.org/random/
[35] A. Frisch, “Jitter Measurement System and Method,” U.S. Patent 2002/0 106 014 A1, Aug.
8, 2002.
[36] F. M. Gardner, Phase Lock Techniques, 3rd ed. Hoboken (NJ): John Wiley & Sons, 2004.
[37] T. Georges, “Non-Gaussian timing jitter statistics of controlled solitons,” in Optical Fiber
Communications (OFC’96), Feb. 1996, pp. 232–233.
[38] F. Ghenassia, Ed., Transaction Level Modeling with SystemC. Dordrecht (NL): Springer,
2005.
[39] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd ed. Baltimore (MD): Johns
Hopkins University Press, 1996.
[40] V. Grigoryan, C. Menyuk, and R.-M. Mu, “Calculation of Timing and Amplitude Jitter
in Dispersion-Managed Optical Fiber Communications using Linearization,” J. Lightwave
Technology, vol. 17, no. 8, pp. 1347–1356, Aug. 1999.
[41] T. Grötker, S. Liao, G. Martin, and S. Swan, System Design with SystemC. Boston (MA):
Kluwer Academic Publishers, 2002.
[42] A. Hajimiri and T. Lee, The Design of Low Noise Oscillators. Dordrecht (NL): Kluwer
Academic Publishers, 1999.
[43] B. Ham, “Methodologies for Jitter and Signal Quality Specification,” INCITS, Tech. Rep.,
Jun. 2005.
[44] Y. Hamaizi and A. El-Akrmi, “Soliton Propagation in Fiber Systems,” in ICTON Mediter-
ranean Winter Conf. (ICTON-MW’09), Dec. 2009, pp. 1–4.
[45] P. K. Hanumolu, “Design Techniques for Clocking High Performance Signaling Systems,”
Ph.D. dissertation, Oregon State University, US, 2006.
[46] P. Hanumolu, M. Brownlee, K. Mayaram, and U.-K. Moon, “Analysis of Charge-Pump
Phase-Locked Loops,” IEEE Trans. Circuits Syst. I, vol. 51, no. 9, pp. 1665–1674, Sept.
2004.
[47] P. Hanumolu, B. Casper, R. Mooney, G.-Y. Wei, and U.-K. Moon, “Jitter in High-Speed
Serial and Parallel Links,” IEEE Int. Symp. Circuits and Systems (ISCAS’04), vol. 4, pp.
IV–425–428, May 2004.
[48] T. Hashimoto, H. Yamazaki, A. Muramatsu, T. Sato, and A. Inoue, “Time-to-Digital Con-
verter with Vernier Delay Mismatch Compensation for High Resolution On-Die Clock Jitter
Measurement,” in IEEE Symp. VLSI Circuits, Jun. 2008, pp. 166–167.
[49] S. Haykin, Communication Systems, 4th ed. Hoboken (NJ): John Wiley & Sons, 2001.
[50] C. D. Hedayat, A. Hachem, Y. Leduc, and G. Benbassat, “Modeling and Characterization
of the 3rd Order Charge-Pump PLL: a Fully Event-driven Approach,” Analog Integrated
Circuits and Signal Processing, vol. 19, no. 1, pp. 25–45, 1999.
[51] G. Hänsel, K. Stieglbauer, G. Schulze, and J. Moreira, “Implementation of an Economic
Jitter Compliance Test for a Multi-Gigabit Device on ATE,” IEEE Int. Test Conf. (ITC’04),
pp. 1303–1312, Oct. 2004.
[52] D. Hong, C.-K. Ong, and K.-T. Cheng, “Bit-Error-Rate Estimation for High-Speed Serial
Links,” IEEE Trans. Circuits Syst. I, vol. 53, no. 12, pp. 2616–2627, Dec. 2006.
[53] D. Hong, “Efficient Test Methodologies for High-Speed Serial Links,” Ph.D. dissertation,
University of California, Santa Barbara, US, 2008.
[54] D. Hong and K.-T. Cheng, “An Accurate Jitter Estimation Technique for Efficient High
Speed I/O Testing,” IEEE Asian Test Symp. (ATS’07), pp. 224–229, Oct. 2007.
[55] D. Hong and K.-T. Cheng, “Bit-Error Rate Estimation for Bang-Bang Clock and Data Re-
covery Circuit in High-Speed Serial Links,” IEEE VLSI Test Symp. (VTS’08), pp. 17–22,
May 2008.
[56] D. Hong, C.-K. Ong, and K.-T. Cheng, “BER Estimation for Serial Links Based on Jitter
Spectrum and Clock Recovery Characteristics,” IEEE Int. Test Conf. (ITC’04), pp. 1138–
1147, Oct. 2004.
[57] J.-C. Hsu and C. Su, “BIST for Measuring Clock Jitter of Charge-Pump Phase-Locked
Loops,” IEEE Trans. Instrumentation and Measurement, vol. 57, no. 2, pp. 276–285, Feb.
2008.
[58] J.-L. Huang, “A Random Jitter Extraction Technique in the Presence of Sinusoidal Jitter,”
IEEE Asian Test Symp. (ATS’06), pp. 318–326, Nov. 2006.
[59] M. Hueda, D. Crivelli, H. Carrer, and O. Agazzi, “Parametric Estimation of IM/DD Opti-
cal Channels Using New Closed-Form Approximations of the Signal PDF,” J. Lightwave
Technology, vol. 25, no. 3, pp. 957–975, Mar. 2007.
[60] K. Ichiyama, M. Ishida, T. J. Yamaguchi, and M. Soma, “Novel CMOS Circuits to Measure
Data-Dependent Jitter, Random Jitter, and Sinusoidal Jitter in Real Time,” IEEE Trans.
Microw. Theory Tech., vol. 56, no. 5, pp. 1278–1285, May 2008.
[61] SystemC Specification 1666, IEEE Std., Rev. 2.1, 2005.
[62] Nitrophy Analog Core Reference Manual, Infineon Technologies, 2007, confidential.
[63] A. Jantsch, Modeling Embedded Systems and SoC’s. US: Elsevier Science, 2003.
[64] J. Jaussi, G. Balamurugan, J. Kennedy, F. O’Mahony, M. Mansuri et al., “In-Situ Jitter
Tolerance Measurement Technique for Serial I/O,” in IEEE Symp. VLSI Circuits, Jun. 2008,
pp. 168–169.
[65] K. Jenkins, A. Jose, Z. Xu, and K. Shepard, “On-chip Circuit for Measuring Jitter and
Skew with Picosecond Resolution,” in Proc. of IEEE Int. Conf. Integrated Circuit Design
and Technology (ICICDT’08), Jun. 2008, pp. 257–260.
[66] S.-Y. Jiang, K.-H. Cheng, and P.-Y. Jian, “A 2.5-GHz Built-in Jitter Measurement System
in a Serial-Link Transceiver,” IEEE Trans. VLSI Systems, vol. 17, no. 12, pp. 1698 –1708,
Dec. 2009.
[67] S. G. Johnson. (2009, Nov.) “The NLopt Nonlinear-Optimization Package”. MIT. Cited
2010-01-31. [Online]. Available: ab-initio.mit.edu/nlopt
[68] J. Kim, “On-Chip Measurement of Jitter Transfer and Supply Sensitivity of PLL/DLLs,”
IEEE Trans. Circuits and Systems II: Express Briefs, vol. 56, no. 6, pp. 449–453, Jun. 2009.
[69] K. K. Kim, J. Huang, Y.-B. Kim, and F. Lombardi, “Analysis and Simulation of Jitter
Sequences for Testing Serial Data Channels,” IEEE Trans. Industrial Informatics, vol. 4,
no. 2, pp. 134–143, May 2008.
[70] J. M. Kizer and C. J. Madden, “Method for Estimating RJ and DJ,” U.S. Patent US
2006/0 059 392 A1, Mar. 16, 2006.
[71] R. Koenker, Quantile Regression. New York (NY): Cambridge University Press, 2005.
[72] M. Kossel and M. Schmatz, “Jitter Measurements of High-Speed Serial Links,” IEEE De-
sign & Test of Computers, vol. 21, no. 6, pp. 536–543, Nov.-Dec. 2004.
[73] M. Kubicek, “In-System Jitter Measurement Using FPGA,” in 20th Int. Conf. Radioelek-
tronika, Apr. 2010, pp. 1–4.
[74] K. Kundert. (2006, Aug.) Modeling Jitter in PLL-based Frequency Synthesizers. Cited
2008-05-27. [Online]. Available: www.designers-guide.org
[75] A. Kuo, T. Farahmand, N. Ou, S. Tabatabaei, and A. Ivanov, “Jitter Models and Measure-
ment Methods for High-Speed Serial Interconnects,” IEEE Int. Test Conf. (ITC’04), pp.
1295–1302, Oct. 2004.
[76] A. Kuo, R. Rosales, T. Farahmand, S. Tabatabaei, and A. Ivanov, “Crosstalk Bounded Un-
correlated Jitter (BUJ) for High-Speed Interconnects,” IEEE Trans. Instrumentation and
Measurement, vol. 54, no. 5, pp. 1800–1810, Oct. 2005.
[77] P. Larsson, “A Simulator Core for Charge-Pump PLLs,” IEEE Trans. Circuits Syst. II,
vol. 45, no. 9, pp. 1323–1326, Sep. 1998.
[78] T. Lee and A. Hajimiri, “Oscillator Phase Noise: A Tutorial,” IEEE J. Solid-State Circuits,
vol. 35, no. 3, pp. 326–336, Mar. 2000.
[79] Y. Lee, C.-Y. Yang, N.-C. Cheng, and J.-J. Chen, “An Embedded Wide-Range and High-
Resolution Clock Jitter Measurement circuit,” in IEEE Conf. Design, Automation and Test
in Europe (DATE’10), 2010, pp. 1637–1640.
[80] H. Le Gall, “Estimating of the Jitter of a Clock Signal,” U.S. Patent 7 487 055, Feb. 3, 2009.
[81] H. Le Gall, “Jitter Estimation Circuit,” ST Microelectronics, Mar. 2011, private
communication, confidential.
[82] M. P. Li, Jitter, Noise, and Signal Integrity at High-Speed. Boston (MA): Prentice Hall,
2007.
[83] M. Li, “Jitter Challenges and Reduction Techniques at 10 Gb/s and Beyond,” IEEE Trans.
Advanced Packaging, vol. 32, no. 2, pp. 290–297, May 2009.
[84] M. Li, J. Wilstrup, R. Jessen, and D. Petrich, “A New Method for Jitter Decomposition
Through its Distribution Tail Fitting,” IEEE Int. Test Conf. (ITC’99), pp. 788–794, Sep.
1999.
[85] M. Li, J. Wilstrup, R. Jessen, and D. Petrich, “Method and Apparatus for Analyzing
Measurement,” U.S. Patent 6 298 315, Oct. 2, 2001.
[86] T. Mak, M. Tripp, and A. Meixner, “Testing Gbps Interfaces without a Gigahertz Tester,”
IEEE Design & Test of Computers, vol. 21, no. 4, pp. 278–286, July-Aug. 2004.
[87] M. Mansuri, A. Hadiashar, and C.-K. K. Yang, “Methodology for On-Chip Adaptive Jitter
Minimization in Phase-Locked Loops,” IEEE Trans. Circuits and Systems II: Analog and
Digital Signal Processing, vol. 50, no. 11, pp. 870–878, Nov. 2003.
[88] MathWorks Inc. (2009, Mar.) “Optimization Toolbox User’s Guide V4.2 (R2009a)”.
MATLAB Documentation. [Online]. Available: www.mathworks.com
[89] “Physical Layer Performance: Testing the Bit Error Ratio (BER),” Technical Article,
Maxim Inc., Sep. 2004.
[90] S. McClure, “Digital Jitter Measurement and Separation,” Master’s thesis, Texas Tech
University, US, 2006.
[91] J. A. McNeill and D. Ricketts, The Designer’s Guide to Jitter in Ring Oscillators. New
York (NY): Springer, 2010.
[92] S. E. Meninger, “Low Phase Noise, High Bandwidth Frequency Synthesis Techniques,”
Ph.D. dissertation, Massachusetts Institute of Technology, US, 2007.
[93] M. Miller, “Estimating Total Jitter Concerning Precision, Accuracy and Robustness,”
DesignCon, Feb. 2007.
[94] M. Miller, “Measuring Components of Jitter,” U.S. Patent 7 516 030, Apr. 7, 2009.
[95] M. Miller. (2007) Normalized Q-scale analysis: Theory and background. EDN. Cited
2011-04-12. [Online]. Available: www.edn.com
[96] M. Müller, R. W. Stephens, and R. McHugh, “Total Jitter Measurement at Low Probability
Levels, Using Optimized BERT Scan Methods,” White Paper, Agilent Technologies, 2005.
[97] P. Muller and Y. Leblebici, “Jitter Tolerance Analysis of Clock and Data Recovery Circuits
using Matlab and VHDL-AMS,” in Proc. of Forum on Design Languages (FDL’05), 2005.
[98] F. Nan, Y. Wang, F. Li, W. Yang, and X. Ma, “A Better Method than Tail-fitting Algorithm
for Jitter Separation Based on Gaussian Mixture Model,” J. of Electronic Testing: Theory
and Applications, vol. 25, no. 6, pp. 337–342, Dec. 2009.
[99] R. Nonis, “Phase Noise Modeling in PLL Frequency Synthesizers,” Master’s thesis,
Università degli studi di Udine, I, 2002.
[100] K. Nose, M. Kajita, and M. Mizuno, “A 1-ps Resolution Jitter-Measurement Macro Using
Interpolated Jitter Oversampling,” IEEE J. Solid-State Circuits, vol. 41, no. 12, pp. 2911–
2920, Dec. 2006.
[101] C.-K. Ong, D. Hong, K.-T. Cheng, and L.-C. Wang, “Jitter Spectral Extraction for
Multi-Gigahertz Signal,” IEEE Asia - South Pacific Design Automation Conf. (ASP-DAC’04), pp.
298–303, Jan. 2004.
[102] C.-K. Ong, D. Hong, K.-T. Cheng, and L.-C. Wang, “A Clock-Less Jitter Spectral Analysis
Technique,” IEEE Trans. Circuits Syst. I, vol. 55, no. 8, pp. 2263–2272, Sep. 2008.
[103] A. V. Oppenheim and R. W. Schafer, Discrete-Time Signal Processing, 3rd ed. New York
(NY): Pearson, 2010.
[104] N. Ou, T. Farahmand, A. Kuo, S. Tabatabaei, and A. Ivanov, “Jitter Models for the Design
and Test of Gbps-Speed Serial Interconnects,” IEEE Design & Test of Computers, vol. 21,
no. 4, pp. 302–313, July-Aug. 2004.
[105] H. Pang, J. Zhu, and W. Huang, “Jitter Decomposition by Fast Fourier Transform and Time
Lag Correlation,” in IEEE Int. Conf. Communications, Circuits and Systems ICCCAS, Jul.
2009, pp. 365–368.
[106] E. Parzen, “Nonparametric Statistical Data Modeling,” J. American Statistical Association,
vol. 74, no. 365, pp. 105–121, Mar. 1979.
[107] M. Perrott, “Fast and Accurate Behavioral Simulation of Fractional-N Frequency
Synthesizers and Other PLL/DLL Circuits,” in IEEE Design Automation Conf. (DAC’02), Jun.
2002, pp. 498–503.
[108] M. Perrott, M. Trott, and C. Sodini, “A Modeling Approach for Σ-∆ Fractional-N
Frequency Synthesizers Allowing Straightforward Noise Analysis,” IEEE J. Solid-State
Circuits, vol. 37, no. 8, pp. 1028–1038, Aug. 2002.
[109] M. H. Perrott. (2008, Apr.) “CppSim System Simulator Package V4”. MIT. Cited
2009-06-09. [Online]. Available: www.cppsim.com
[110] M. H. Perrott, “Digital Phase-Locked Loops,” IEEE Solid-State Circuits Conf. Tutorial
(ISSCC’08), Feb. 2008.
[111] A. Popovici, “Fast Measurement of Bit Error Rate in Digital Links,” IEE Proc.
Communications, Radar and Signal Processing, vol. 134, no. 5, pp. 439–447, Aug. 1987.
[112] M. J. Porsani and T. J. Ulrych, “Levinson-Type Algorithms for Polynomial Fitting and
for Cholesky and Q-Factors of Hankel and Vandermonde Matrices,” IEEE Trans. Signal
Processing, vol. 43, no. 1, pp. 63–70, Jan. 1995.
[113] M. J. D. Powell, “The BOBYQA Algorithm for Bound Constrained Optimization
without Derivatives,” Department of Applied Mathematics and Theoretical Physics, Cambridge,
England, Tech. Rep., 2009.
[114] J. Proakis, Digital Communications, 4th ed. New York (NY): McGraw-Hill, 2001.
[115] R.-D. Reiss and M. Thomas, Statistical Analysis of Extreme Values, 2nd ed. Basel (CH):
Birkhäuser, 2001.
[116] F. Scholz. (2008, May) Applications of the Noncentral t-Distribution. Stat 498B Industrial
Statistics. Cited 2011-04-12. [Online]. Available: www.stat.washington.edu/fritz/
[117] F. Scholz, “Nonparametric Tail Extrapolation,” Boeing Information & Support Services,
ISSTECH-95-014, 1995. [Online]. Available: www.stat.washington.edu/fritz/
[118] Serial ATA Specification, Serial ATA International Organization Std., Rev. 2.6, 2006.
[119] M. Shimanouchi, “An Approach to Consistent Jitter Modeling for Various Jitter Aspects
and Measurement Methods,” IEEE Int. Test Conf. (ITC’01), pp. 848–857, Oct.-Nov. 2001.
[120] M. Shimanouchi, M. Li, and D. Chow, “New Modeling Methods for Bounded Gaussian
Jitter (BGJ)/Noise (BGN) and their Applications in Jitter/Noise Estimation/Testing,” IEEE
Int. Test Conf. (ITC’09), pp. 1–8, 2009.
[121] K. Shu and E. Sánchez-Sinencio, CMOS PLL Synthesizers. Boston (MA): Springer
Science, 2005.
[122] R. Staszewski, C. Fernando, and P. Balsara, “Event-driven Simulation and Modeling of
Phase Noise of an RF Oscillator,” IEEE Trans. Circuits Syst. I, vol. 52, no. 4, pp. 723–733,
Apr. 2005.
[123] R. W. Stephens, “Jitter Analysis: The dual-Dirac model, RJ/DJ, and Q-Scale,” White Paper,
Agilent Technologies, Dec. 2004.
[124] R. Stephens, “Separation of Random and Deterministic Components of Jitter,” U.S. Patent
7 149 638, Dec. 12, 2006.
[125] R. Stephens, “Separation of a Random Component of Jitter and a Deterministic Component
of Jitter,” U.S. Patent 7 191 080, Mar. 13, 2007.
[126] N. Stojanovic, “Tail Extrapolation in MLSE Receivers Using Nonparametric Channel
Model Estimation,” IEEE Trans. Signal Processing, vol. 57, no. 1, pp. 270–278, Jan. 2009.
[127] V. Stojanovic, “Channel Limited High-Speed Serial links - Modeling, Analysis and
Design,” Ph.D. dissertation, Stanford University, US, 2004.
[128] J. Sun, M. Li, and J. Wilstrup, “A Demonstration of Deterministic jitter (DJ)
Deconvolution,” IEEE Instrumentation and Measurement Technology Conf. (IMTC’02), vol. 1, pp.
293–298, May 2002.
[129] S. Sunter and A. Roy, “On-chip Digital Jitter Measurement, from Megahertz to Gigahertz,”
IEEE Design & Test of Computers, vol. 21, no. 4, pp. 314–321, July-Aug. 2004.
[130] S. Tabatabaei and A. Ivanov, “Embedded Timing Analysis: A SoC Infrastructure,” IEEE
Design & Test of Computers, vol. 19, no. 3, pp. 22–34, May-June 2002.
[131] C.-C. Tsai and C.-L. Lee, “An On-Chip Jitter Measurement Circuit for the PLL,” IEEE
Asian Test Symp. (ATS’03), pp. 332–335, Nov. 2003.
[132] S. Vamvakos, C. Werner, and B. Nikolic, “Phase-Locked Loop Architecture for Adaptive
Jitter Optimization,” in Int. Symp. Circuits and Systems (ISCAS ’04), vol. 4, May 2004, pp.
IV–161–164.
[133] R. C. Walker, “Designing Bang-Bang PLLs for Clock and Data Recovery in Serial Data
Transmission Systems,” in Phase-Locking in High-Performance Systems: From Devices to
Architectures, B. Razavi, Ed. IEEE Press, 2003, pp. 34–45.
[134] Z. Wang, “An Analysis of Charge-Pump Phase-Locked Loops,” IEEE Trans. Circuits Syst.
I, vol. 52, no. 10, pp. 2128–2138, Oct. 2005.
[135] S. Weinstein, “Estimation of Small Probabilities by Linearization of the Tail of a Probability
Distribution Function,” IEEE Trans. Communication Technology, vol. 19, no. 6, pp. 1149–
1155, Dec. 1971.
[136] S. Wisetphanichkij and K. Dejhan, “Jitter Decomposition by Derivatived Gaussian Wavelet
Transform,” IEEE Int. Symp. Communication and Information Technology (ISCIT’04),
vol. 2, pp. 1160–1165, Oct. 2004.
[137] Xilinx Inc. (2008, Nov.) Virtex-5 FPGA RocketIO GTX Transceiver User’s Guide.
ug198.pdf. Cited 2011-02-15. [Online]. Available: www.xilinx.com
[138] Xilinx Inc. (2009, Oct.) ML50x Evaluation Platform User’s Guide. ug347.pdf. Cited
2011-02-15. [Online]. Available: www.xilinx.com
[139] T. Yamaguchi, H. Hou, K. Takayama, D. Armstrong, M. Ishida et al., “An FFT-based Jitter
Separation Method for High-Frequency Jitter Testing with a 10x Reduction in Test Time,”
IEEE Int. Test Conf. (ITC’07), pp. 1–8, Oct. 2007.
[140] T. Yamaguchi, M. Soma, M. Ishida, H. Musha, and L. Malarsie, “A New Method for Testing
Jitter Tolerance of SerDes Devices Using Sinusoidal Jitter,” IEEE Int. Test Conf. (ITC’02),
pp. 717–725, Oct. 2002.
[141] J. Yin and L. guang Zeng, “A Statistical Jitter Tolerance Estimation Applied for Clock and
Data Recovery Using Oversampling,” IEEE Region 10 Conf. (TENCON’06), pp. 1–4, Nov.
2006.
[142] I. Zamek and S. Zamek, “Definitions of Jitter Measurement Terms and Relationships,” IEEE
Int. Test Conf. (ITC’05), Nov. 2005.
[143] J. Zhu and W. Huang, “Jitter Analysis and Decomposition Based on EMD/HHT in
High-Speed Serial Communications,” in Proc. of IEEE Int. Conf. Testing and Diagnosis
(ICTD’09), Apr. 2009, pp. 1–4.


There is nothing more practical than a good theory.

Das Runde muss ins Eckige. (The round thing must go into the square one.)

2. Fundamentals of Jitter, PLLs and BER Analysis
4. A Fast and Accurate Jitter Analysis Method
5. Hardware Design Aspects
6. Comparison of Gaussian Tail Fitting Methods Based on Q-Normalization
7. Jitter Analysis Method for Generalized Gaussian Tail Extrapolation
8. An Accurate Behavioral Model for High-Speed PLLs
9. A Method for Fast Jitter Tolerance Analysis
11. Conclusion
A. Figure Data Files
Own Publications

1.1. Block scheme of a serial high-speed transceiver.
2.1. Jitter sources in a serial high-speed interface.
4.1. Amplitude matching with adapted Q-normalization function.
5.1. BIJM based IO jitter measurement for PLLs.
6.1. Optimization scheme for Q-normalization combined with polynomial regression.
8.1. Functional block scheme of the CPLL.
9.1. JTOL measurement scheme using TIA or BERT.
10.1. Basic principle for jitter diagnostics.

3.1. Multiplicative constant to specify a target BER for TJpp values.
4.1. Default algorithm configuration and important key parameters.
5.1. Coefficients for equations (5.3) and (5.4).
7.1. Quantile normalization functions for different tail distributions.
8.1. Default model parameter settings.
9.1. Polynomial regression coefficients for σe.
A.1. List of MATLAB files to generate simulation figures.

BB-PD Bang-Bang Phase Detector

α Shape parameter of generalized Gaussian distributions, see figure 7.3

1.1. Motivation and Problem Domain

Figure 1.1.: Block scheme of a serial high-speed transceiver (blocks shown: transmitter, channel, receiver).

1.3. Thesis Overview

2.1. Jitter in High-Speed Serial Links

Figure 2.1.: Jitter sources in a serial high-speed interface [5].

Figure 2.2.: Typical receiver eye diagram affected by jitter.

Total Jitter (TJ)
  Deterministic Jitter (DJ)
    Periodic (SJ)
    Data Dependent (DDJ)
      Duty-Cycle Distortion (DCD)
      Inter-Symbol Interference (ISI)
    Bounded Uncorrelated (BUJ)
  Random Jitter (RJ)

Figure 2.3.: Jitter components according to [82, 104].

Figure 2.4.: Inter-Symbol-Interference caused by a transmission channel [5].

2.1.1. Phase Noise Definition

$v(t) = A \cos(\omega_0 t + \phi(t))$ (2.1)

$v(t) \approx A \cos(\omega_0 t) - A \phi(t) \sin(\omega_0 t)$ (2.2)

Figure 2.5.: Ideal and real oscillator spectrum [42] (panels: ideal oscillator, practical oscillator).

Figure 2.6.: Typical oscillator phase noise spectrum [42].

2.1.2. Clock Jitter Definition

$j_{\mathrm{abs},k} = t_k - t_{\mathrm{id},k}$ (2.4)

Figure 2.7.: Definitions of absolute, period and accumulated jitter [99].

Figure 2.8.: IO jitter measurement principle according to [82] (labels shown: data signal, time).