



Ph.D. Dissertation

# Design of Ring-Oscillator-Based Clock Generator with Calibration Techniques

조정 기술을 활용하는 링 발진기 기반의 클럭 생성 회로의 설계

by

Yeonggeun Song

February 2023

Department of Electrical and Computer Engineering, College of Engineering, Seoul National University


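Advisor: 정덕균

Submitted as a dissertation for the degree of Doctor of Philosophy in Engineering

February 2023

Graduate School of Seoul National University

Department of Electrical and Computer Engineering

Yeonggeun Song

The Ph.D. dissertation of Yeonggeun Song is hereby approved.

February 2023

| Role       | Name | Seal   |
|------------|------|--------|
| Chair      | 김재하 | (seal) |
| Vice Chair | 정덕균 | (seal) |
| Member     | 최우석 | (seal) |
| Member     | 이우근 | (seal) |
| Member     | 문용삼 | (seal) |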

## Abstract

As demand for wide-band operation grows and sub-rate architectures that reduce power consumption in wireline communications become promising, a ring oscillator (RO), which offers a wide frequency range and can generate multi-phase clocks, becomes a prospective replacement for its LC counterparts. However, the RO, whose frequency is determined by the propagation delay of active devices, is vulnerable to supply noise and has the fatal disadvantage of inferior phase noise compared to the LC counterparts. In this dissertation, solutions for these two major drawbacks of RO-based clock generators are addressed, and each solution is verified by a prototype chip.

First, an RO-based all-digital phase-locked loop (AD-PLL) with a self-biased supply-noise-compensation (SNC) technique for a DDR5 registering clock driver (RCD) application is presented. Considering the prerequisites of the DDR5 RCD application, an open-loop SNC-RO that achieves a low frequency-pushing factor with a large static voltage margin is proposed. Since the SNC technique operates independently of the PLL loop bandwidth without using feedback, the SNC-PLL is free from the stability problem associated with bandwidth overlapping, and the SNC performance is maintained regardless of the operating configuration. Furthermore, the SNC technique does not require a start-up circuit and does not lengthen the stabilization time. Quantitative analyses of the static and dynamic characteristics of the proposed SNC technique and the relevant design-oriented considerations are addressed. The prototype chip is fabricated in a 28-nm CMOS technology, and the measurement results demonstrate that the AD-PLL satisfies the prerequisites for the SNC technique in the RCD application. The SNC-PLL achieves a best-case power-supply-noise attenuation (PSNA) of 40 dB and maintains a PSNA of over 20 dB up to 10 MHz. In the case of random supply noise, the integrated RMS jitter is improved by about 65% on average. The AD-PLL consumes 12.1 mW at 3.0 GHz operation and achieves an integrated RMS jitter of 271 fs without any injected supply noise.

Second, an injection-locked clock multiplier (ILCM) with a new background calibration technique, called multi-phase-based calibration (MPC), that exploits the intrinsic multi-phase generation capability of the RO is presented. To achieve a high suppression bandwidth for RO-induced noise, an injection-locking technique is employed. However, injection locking requires a two-point calibration, covering the frequency error (FE) and the injection path offset (PO), to ensure normal operation and to secure its jitter performance. With the FE calibrator operating at every injection rate, a high FE-calibration bandwidth is attained together with the full injection effect, which contributes to a much lower jitter. The PO calibrator makes the MPC-ILCM converge to the position of minimum reference spur, where the residual FE in the steady state is minimized. For a low-power implementation, all calibration loops operate at the reference clock rate using sub-sampling bang-bang phase detectors. A time-domain analysis of the injection-locking behavior and the detailed MPC operations with their associated requirements are addressed. Fabricated in a 28-nm CMOS technology, the prototype verifies that the proposed MPC enables a low-jitter and low-reference-spur RO-based ILCM. It achieves an integrated RMS jitter of 143.6 fs and a reference spur of -77.9 dBc with a FoM of -247.1 dB at 4.8 GHz operation. The MPC sustains a successful injection-locked condition, so both the integrated RMS jitter and the reference spur are maintained under supply voltage variations.

**keywords**: phase-locked loop (PLL), injection-locked clock multiplier (ILCM), supply-noise-compensation (SNC), injection-locking, two-point calibration, multi-phase-based calibration (MPC), registering clock driver (RCD)

**student number**: 2018-29700

# Contents

| Section | Title |
|---------|-------|
|         | Abstract |
|         | Contents |
|         | List of Tables |
|         | List of Figures |
| 1       | Introduction |
| 1.1     | Motivation |
| 1.2     | Dissertation Objectives |
| 2       | Background on RO-Based Clock Generator |
| 2.1     | All-Digital Phase-Locked Loop (AD-PLL) |
| 2.1.1   | AD-PLL Fundamentals |
| 2.1.2   | Phase Noise Analysis of AD-PLL |
| 2.1.3   | Effect of Supply Noise in PLL System |
| 2.2     | Injection-Locked Clock Multiplier (ILCM) |
| 2.2.1   | Basics of Injection-Locked Oscillator (ILO) |
| 2.2.2   | Phase Domain Response (PDR) Analysis |
| 2.2.3   | Phase Noise of ILCM |
| 3       | Self-Biased Supply-Noise-Compensating RO |
| 3.1     | Overview |
| 3.2     | Design Motivation |
| 3.2.1   | Design Consideration of SNC for RCD application |
| 3.2.2   | Prior Works Regarding SNC Techniques |
| 3.3     | Proposed Self-Biased SNC-RO |
| 3.3.1   | Principle of Nagata Current Source |
| 3.3.2   | Circuit Description of Self-Biased SNC-RO |
| 3.3.3   | Design-Oriented Analysis |
| 3.3.4   | Evaluation on Suitability in RCD application |
| 3.4     | Circuit Implementation |
| 3.4.1   | Overall Architecture |
| 3.4.2   | Circuit Description of Building Blocks: TDC-PFD |
| 3.4.3   | Circuit Description of Building Blocks: A-FTL |
| 3.5     | Measurement Results |
| 3.5.1   | Open-Loop SNC Performance |
| 3.5.2   | Closed-Loop SNC Performance |
| 3.5.3   | Fast Stabilization Performance |
| 3.5.4   | Performance Comparison |
| 3.6     | Summary |
| 4       | ILCM with Multi-Phase-Based Calibration |
| 4.1     | Overview |
| 4.2     | Design Motivation |
| 4.2.1   | Needs for Two-Point Calibration in Injection-Locking |
| 4.2.2   | Prior Works Regarding Two-Point Calibration |
| 4.3     | Proposed Multi-Phase-Based Calibration (MPC) |
| 4.3.1   | Logical Process of Creating MPC |
| 4.3.2   | Time-Domain Analysis of MPC |
| 4.3.3   | Conceptual Block Diagram of MPC |
| 4.3.4   | MPC Operation (1, 2): FE Calibration and DLL |
| 4.3.5   | MPC Operation (3): PO Calibration |
| 4.4     | Circuit Implementation |
| 4.4.1   | Overall Architecture |
| 4.4.2   | Circuit Description of Building Blocks (1): ILO |
| 4.4.3   | Circuit Description of Building Blocks (2): DCDL |
| 4.5     | Measurement Results |
| 4.6     | Summary |
| 5       | Conclusion |
| A       | Notes for PLL in the RCD |
| A.1     | PLL as Zero-Delay Buffer |
| A.2     | Practical Behavioral Simulation |
| A.3     | Additional Measurement from ADVANTEST |
|         | Abstract (In Korean) |

# **List of Tables**

| Table | Title |
|-------|-------|
| 3.1   | PERFORMANCE SUMMARY AND COMPARISON |
| 4.1   | PERFORMANCE COMPARISON WITH THE STATE-OF-THE-ART RO-BASED ILCMs |

# **List of Figures**

| Figure | Caption |
|--------|---------|
| 1.1  | (a) Per-lane data rate versus year for a variety of common I/O standards and (b) data rate versus process node and year [1]. |
| 1.2  | Conceptual block diagram of the sub-rate transceivers; (a) transmitter and (b) receiver. |
| 1.3  | PCI Express technology roadmap [2]. |
| 2.1  | Negative feedback system of PLL. |
| 2.2  | Charge pump and second-order analog loop filter. |
| 2.3  | Bilinear transform from an analog to a digital LF [7]. |
| 2.4  | Block diagram of TDC-based digital PLL. |
| 2.5  | Parameterized noise model of TDC-based digital PLL. |
| 2.6  | Phase noise simulation with $f_{\text{REF}}$ of 50 MHz, $f_{\text{OUT}}$ of 5 GHz, $f_{\text{DSM}}$ of 1 GHz, $\alpha$ of 1/256, $\beta$ of 1/8, $m$ of 1, $t_{\text{res,TDC}}$ of 500 fs, and $K_{\text{DCO}}$ of 3 MHz/code. |
| 2.7  | Phase noise plots derived from the model of figure 2.5 and equations of (2.7), (2.9), and (2.11). |
| 2.8  | Simulated phase noise with supply voltage variation in an oscillator. |
| 2.9  | (a) Second-order parallel LC tank and (b) its open-loop characteristic. |
| 2.10 | Phasor diagram with different frequencies between $\omega_{\text{OSC}}$ and $\omega_{\text{INJ}}$. |
| 2.11 | LC-ILO model with a feedback system. |
| 2.12 | (a) Concept of PDR of the ILO and (b) PDR curve. |
| 2.13 | Discrete-time non-linear phase model of the ILO having $\Delta \omega$. |
| 2.14 | Conceptual phase noise comparison between the digital PLL and ILCM. |
| 2.15 | Extra phase shift due to the phase realignment [32]. |
| 2.16 | Phase noise model of the ILCM including transfer functions of the phase realignment of the ILO noise and the up-conversion of the injection noise. |
| 2.17 | Magnitude of (a) $H_{\text{up}}(s)$ and (b) $H_{\text{rl}}(s)$ with $\beta$ variations. |
| 3.1  | Block diagram of registered DIMM. |
| 3.2  | SNC technique proposed in [42]. |
| 3.3  | SNC technique proposed in [43]. |
| 3.4  | SNC technique proposed in [44]. |
| 3.5  | SNC technique proposed in [45]. |
| 3.6  | Circuit description and operation of Nagata CS. |
| 3.7  | Circuit description of proposed SNC-DCO. |
| 3.8  | (a) Static and (b) dynamic characteristics of SNC-DCO. |
| 3.9  | Small-signal circuit of the BVG. |
| 3.10 | Nagata CS dependency with $R$ variations. |
| 3.11 | Phase noise at 1 MHz offset frequency with a change in $I_{\text{BIAS}}$. |
| 3.12 | Location of dominant pole frequency of (3.2) with variations of $C_{\text{BIAS}}$. |
| 3.13 | (a) Magnitude and (b) phase of the transfer function of $\text{PSNA}_{\text{DCO}}$. |
| 3.14 | Currents flowing into the RO with a skewed corner. |
| 3.15 | Frequency of the SNC-DCO with process corner variations. |
| 3.16 | Frequency of the SNC-DCO with temperature variations. |
| 3.17 | (a) Start-up simulation and (b) time trend of $V_2$ with a supply voltage ramp of 100 ns. |
| 3.18 | Overall architecture of AD-PLL with SNC-DCO for RCD application. |
| 3.19 | Block diagram of the TDC-PFD. |
| 3.20 | Detailed block diagram of the TDC-PFD with each timing information. |
| 3.21 | Signal decision time and timing margin for stable operation. |
| 3.22 | Gain curve of TDC-PFD. |
| 3.23 | Monte Carlo simulation of first steps in TDC-PFD. |
| 3.24 | Block diagram of A-FTL. |
| 3.25 | Simulated frequency trend with A-FTL. |
| 3.26 | (a) Die photograph and (b) power breakdown of each supply domain. |
| 3.27 | Measurement setup. |
| 3.28 | Measured static SNC characteristic of the free-running DCO and the corresponding FP with different voltage variations. |
| 3.29 | (a) Measured spectra of the free-running DCO with a sinusoidal noise of 1 MHz, 100 mV$_{\text{pp}}$ and (b) the amount of frequency deviation. |
| 3.30 | Measured spectra of the AD-PLL with a sinusoidal noise of 1 MHz, 25 mV$_{\text{pp}}$ and the derived PSNA. |
| 3.31 | Measured PSNA results of the SNC-PLL. |
| 3.32 | Measured RMS jitter with (a) sinusoidal and (b) random noises. |
| 3.33 | Measured phase noise plots with a sinusoidal noise of 200 kHz, 50 mV$_{\text{pp}}$. |
| 3.34 | Measured phase noise plots with random noise of 40 mV$_{\text{pp}}$. |
| 3.35 | Measured phase noise plots without noise. |
| 3.36 | Measured convergent time of clock frequency with and without the A-FTL. |
| 3.37 | Measured stabilization time with different configurations. |
| 3.38 | Measured jitter, power consumption, and $\text{FoM}_{\text{JITTER}}$ with different configurations. |
| 4.1  | Ideal injection-locked state. |
| 4.2  | Injection-locked state with (a) a positive and (b) a negative FE. |
| 4.3  | Block diagram of ILCM including a FE calibrator. |
| 4.4  | (a) Initial and (b) steady state of figure 4.3. |
| 4.5  | Conceptual block diagram of ILCM with two-point calibration. |
| 4.6  | (a) Conceptual block diagram of conventional pulse-gating ILCM and (b) its operation. |
| 4.7  | (a) Conceptual block diagram of RDC-based ILCM and (b) its operation. |
| 4.8  | Non-linear phase relationship of ILCM with negative FE in time-domain. |
| 4.9  | The residual time $RT[k]$ with frequency errors of $\pm 0.1$ %. |
| 4.10 | The convergence index with $\beta$ variations. |
| 4.11 | Conceptual block diagram of the proposed MPC structure. |
| 4.12 | Timing diagram of the MPC, when the injection pulse is applied. |
| 4.13 | Timing diagram of the MPC, when the injection pulse is gated. |
| 4.14 | Overall architecture of the proposed MPC-ILCM. |
| 4.15 | Decision tables of MPC; (a) the FE calibrator, (b) the DLL, and (c) the PO calibrator. |
| 4.16 | Transient behavior of MPC structure; the frequency trends and the DCWs of each DCDL. |
| 4.17 | (a) Monte Carlo simulation of two phase-convergence points and (b) its results. |
| 4.18 | (a) Circuit description of the ILO, (b) the PDR curve of the ILO, and (c) the injection strength of the ILO. |
| 4.19 | Normalized phase shift with the half-supply-crossing transition. |
| 4.20 | Circuit description of (a) coarse-tuning and (b) fine-tuning DCDL. |
| 4.21 | Propagation delay of the fine-tuning DCDL. |
| 4.22 | (a) INL and (b) DNL of the fine-tuning DCDL. |
| 4.23 | (a) Die photomicrograph and (b) power breakdown at 4.8 GHz operation. |
| 4.24 | Measurement setup for the MPC-ILCM. |
| 4.25 | Measured (a) phase noise and (b) spectrum of the output clock. |
| 4.26 | Measured (a) phase noise plots with and without the MPC and (b) their decade tables. |
| 4.27 | Measured phase noise plots with integral gain ($K_{\text{I}}$) variations. |
| 4.28 | Measured phase noise plots with variations of the size of injection switches. |
| 4.29 | Measured (a) reference spur performance with the injection switch size variations and (b) spectrum example of (a). |
| 4.30 | Measured reference spur performance with the gating rate variations. |
| 4.31 | Measured spectra with $GR_{\text{INJ}}$s of (a) 1/70 and (b) 1/10. |
| 4.32 | Measured clock eye with its jitter histogram. |
| 4.33 | Measured (a) integrated jitter and (b) reference spur with supply ($VDD_{\text{ILO}}$) variations in five different sample chips. |
| 4.34 | Measured integrated jitter and resolution of the fine-tuning DCDL with supply ($VDD_{\text{ANA}}$) variations. |
| 4.35 | Benchmark of performances of the MPC-ILCM and state-of-the-art RO-based ILCMs. |
| A.1  | Block diagram of PLL in RCD as zero-delay buffer. |
| A.2  | (a) Continuous and (b) request-driven DM operation and (c) simulation result of case (b). |
| A.3  | Input-to-output jitter transfer function of PLL in RCD. |
| A.4  | Simulated frequency time trend and the jitter with supply fluctuation. |
| A.5  | Measurement setup for DDR4 RCD with ADVANTEST equipment. |
| A.6  | Measured Y0_t output jitter of DDR4 RCD. |
| A.7  | Measured stabilization time of DDR4 RCD. |

## **Chapter 1**

### Introduction

### 1.1 Motivation

As the demand for higher data rates in wireline communication links has increased, the influence of clock quality on overall system performance has also increased. Figure 1.1(a) shows that the data rate per pin has approximately doubled every four years across various I/O standards ranging from DDR to high-speed Ethernet, and figure 1.1(b) shows that the data rates of recently published transceivers have kept pace with these standards while taking advantage of CMOS scaling [1]. However, in an LC-type voltage-controlled oscillator (VCO)-based phase-locked loop (PLL), passive elements such as the capacitor in the loop filter and the inductor of the LC tank occupy a large silicon area, so the design does not benefit from process technology scaling in terms of area cost. In contrast, a ring oscillator (RO) consists of several stages of delay elements that can be implemented with only active devices. Moreover, the achievable maximum frequency of the RO increases with CMOS scaling, which matches the trend of increasing data rates.



Figure 1.1: (a) Per-lane data rate versus year for a variety of common I/O standards and (b) data rate versus process node and year [1].

Another objective of modern wireline transceivers is high energy efficiency, i.e., a low energy per bit [pJ/bit]. A remarkable improvement has been achieved through the power-performance benefits of process technology scaling, but the required reduction in energy per bit has outpaced what CMOS scaling alone provides [1]. Figure 1.2 shows a conceptual block diagram of the sub-rate transceivers. The sub-rate architecture not only alleviates critical timing constraints and design complexity, but also reduces power consumption compared to a full-rate counterpart. However, it requires multi-phase clock generation.



Figure 1.2: Conceptual block diagram of the sub-rate transceivers; (a) transmitter and (b) receiver.



Figure 1.3: PCI Express technology roadmap [2].

One notable property of some wireline communication standards is backward compatibility. Figure 1.3 shows the Peripheral Component Interconnect Express (PCIe) technology roadmap, in which the specification has doubled the data rate with every generation. The PCIe standard supports full backward compatibility with all prior generations to protect customer investments [3]. In other words, multiple configurations must be satisfied within one standard. Even in a registering clock driver (RCD) [4, 5], which is a memory solution component and is addressed in detail in chapter 3, a wide frequency range is required because DIMM operating speeds must be supported at a fine granularity of 20 MHz.

Considering silicon area cost, multi-phase generation capability, and wide frequency tuning range, the RO becomes a prospective replacement for LC counterparts. However, the RO, whose frequency is determined by the propagation delay of active devices, is more vulnerable to supply noise and has the fatal disadvantage of inferior phase noise compared to the LC counterparts. Therefore, in this dissertation, solutions for these two major drawbacks of RO-based clock generators are addressed, and each solution is verified by a prototype chip.

### 1.2 Dissertation Objectives

The remaining chapters in this dissertation are organized as follows.

In chapter 2, the fundamentals of RO-based clock generators, namely the all-digital PLL (AD-PLL) and the injection-locked clock multiplier (ILCM), are provided. The phase noise contribution of each building block in the AD-PLL is derived, and the basic analyses of the injection-locking technique are covered.

In chapter 3, an RO-based AD-PLL with a self-biased supply-noise-compensation (SNC) technique for a DDR5 RCD is presented. Considering the prerequisites of the DDR5 RCD application, the reasons why prior SNC techniques are difficult to apply to the RCD application are briefly examined. The detailed operations and the design-oriented analyses of the proposed SNC technique are then addressed. The detailed circuit implementations of the RO-based SNC-PLL are presented, and the measurement results verify the suitability of the proposed SNC technique for the RCD application.

In chapter 4, an ILCM with a new background calibration technique that exploits the intrinsic multi-phase generation capability of the RO is presented. The advantages and disadvantages of representative existing background calibration methods are briefly reviewed, and the reasoning that led to the proposed multi-phase-based calibration (MPC) structure is explained. A time-domain analysis of the injection-locking behavior in the steady state is presented, and the detailed operations of the MPC are illustrated with timing diagrams. The detailed circuit implementations of the RO-based ILCM with the MPC background calibration are presented, and the measurement results verify a low-jitter and low-reference-spur RO-based clock generator.

Chapter 5 summarizes the proposed works and concludes the dissertation.

## **Chapter 2**

### **Background on RO-Based Clock Generator**

In this chapter, the fundamentals of the two clocking structures that underlie the proposed prototype chips are addressed. The first structure is the PLL, which has been widely adopted to generate on-chip clock signals for decades. In particular, as CMOS technology has scaled, digital PLLs that achieve performance equivalent to analog PLLs while taking advantage of digital signal processing have become highly favored. The first section briefly introduces the concept of the digital PLL in comparison with the analog PLL and deals with the phase noise analysis of each building block. The second structure is the injection-locked structure, which can achieve a high suppression bandwidth for RO-induced noise. The second section provides the fundamentals of the injection-locking architecture and its phase domain response (PDR), which is one of the analytic methodologies for an injection-locked oscillator. In addition, the bandwidth extension effect of the injection-locking technique is examined through its phase noise analysis.

### 2.1 All-Digital Phase-Locked Loop (AD-PLL)

#### 2.1.1 AD-PLL Fundamentals



Figure 2.1: Negative feedback system of PLL.

Figure 2.1 shows a simplified block diagram of a PLL used as a frequency synthesizer. The output clock generated by the oscillator is divided by the multiplication factor ($N$) and fed into a phase-frequency detector (PFD). The PFD compares the phase/frequency of the divided clock with that of the reference clock and outputs the corresponding error information to the loop filter (LF). The LF modulates the oscillator frequency in the direction that decreases the phase/frequency error, so the entire PLL forms a negative feedback system. In the steady state, the average frequency of the oscillator becomes $N$ times the frequency of the reference clock, i.e., $f_{\text{osc}} = N \cdot f_{\text{ref}}$, and the two input signals of the PFD reach the phase-locked state.
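The negative feedback action described above can be illustrated with a minimal behavioral sketch, which is not the implementation used in this work. It assumes a linear phase detector, a proportional-integral loop filter, and one update per reference cycle; the function name `simulate_pll`, the gains `kp` and `ki`, the oscillator gain `k_osc`, and the free-running frequency are hypothetical values chosen only so that the loop is stable.

```python
def simulate_pll(f_ref=50e6, n_div=100, f_free=4.8e9,
                 k_osc=3e6, kp=300.0, ki=15.0, steps=2000):
    """Phase-domain PLL sketch, updated once per reference cycle.

    All parameter values are illustrative assumptions.
    Returns the final oscillator frequency [Hz] and the final phase
    error in reference cycles (positive: divided clock lags).
    """
    err = 0.0     # accumulated phase error seen by the PFD
    integ = 0.0   # integral path of the loop filter
    f_osc = f_free
    for _ in range(steps):
        # Loop filter: proportional path plus integral path.
        ctrl = kp * err + integ
        # Oscillator: frequency moves linearly with the control value.
        f_osc = f_free + k_osc * ctrl
        # Divider and PFD: the reference advances one cycle, the divided
        # clock advances f_osc / (n_div * f_ref) cycles in the same time.
        err += 1.0 - f_osc / (n_div * f_ref)
        integ += ki * err
    return f_osc, err


if __name__ == "__main__":
    f_osc, err = simulate_pll()
    print(f"final f_osc = {f_osc / 1e9:.6f} GHz, phase error = {err:.3e} cycles")
```

With these assumed gains the loop settles so that the oscillator frequency approaches $N \cdot f_{\text{ref}}$ and the phase error approaches zero, which mirrors the steady-state condition stated above.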

Depending on the type of signals propagated between the building blocks, a PLL is classified as analog or digital. For example, if the phase/frequency error and the control signal that modulates the oscillator frequency are analog quantities, the PLL is considered an analog PLL. For decades, the analog charge-pump (CP)-based PLL has been widely used.

Figure 2.2: Charge pump and second-order analog loop filter.

Figure 2.2 shows the CP with UP/DN current sources and the LF that constitute a second-order PLL. The PFD controls the operating time of the respective UP/DN current sources in accordance with the phase/frequency difference between the reference and oscillator clocks. The corresponding charge flows into or out of the LF and adjusts the control voltage $V_{\text{CTRL}}$. The amount of charge is expressed as

$$\Delta Q_{\rm LF} = I_{\rm UP} \cdot t_{\rm UP} - I_{\rm DN} \cdot t_{\rm DN} \tag{2.1}$$

where $t_{\rm UP}$ and $t_{\rm DN}$ are the operating times of the UP and DN current sources, respectively. Equation (2.1) points to important considerations for designing the CP and the PFD. A current mismatch in the CP causes a static phase offset at the PFD and an undesired ripple at $V_{\rm CTRL}$ in the steady state, degrading the deterministic jitter and the reference spur performance. Hwang [6] proposed a dual compensation method that reduces both the current mismatch and the current variations without sacrificing the output dynamic range. However, the leakage current in the capacitor cannot be completely eliminated, so a static phase offset remains at the PFD. Furthermore, the passive elements in the loop filter, along with the charge pump, suffer from analog mismatch and cannot fully benefit from technology scaling.



Figure 2.3: Bilinear transform from an analog to a digital LF. [7]

On the other hand, the digital PLL scales with technology, which facilitates high-level integration and design migration to advanced process nodes. Moreover, it is free from the aforementioned leakage current problem and enables robust performance by providing reconfigurable system parameters. Figure 2.3 shows the bilinear transformation from the analog LF to the digital loop filter (DLF) that constitutes a second-order AD-PLL. The DLF consists of a proportional path with a gain of $\beta$ and an integral path with a gain of $\alpha$, which correspond to R and C in figure 2.2, respectively. Kratyuk [7] proposed a systematic design procedure based on the analogy between the CP-PLL and the AD-PLL and presented their relationship as follows.

$$\alpha = \frac{T_s}{C} \tag{2.2}$$

$$\beta = R - \frac{T_s}{2 \cdot C} \tag{2.3}$$

where $T_{\rm s}$ is the sampling period of the discrete-time system, which is the inverse of the reference clock frequency. The frequency warping due to the bilinear transform distorts the frequency response in the vicinity of the Nyquist frequency. However, this deformation is negligible because the maximum stable bandwidth of the standard type-II PLL is close to the so-called Gardner's limit [8], which is one-tenth of the reference frequency, i.e., $f_{\rm BW} \approx f_{\rm REF} / 10$. Through digital transformation, the digital PLL not only sustains an operating condition by utilizing digital calibrations but also allows fast frequency acquisition by directly modulating the frequency of a digitally-controlled oscillator (DCO). However, it increases the design complexity of the phase detector (PD) preceding the DLF due to digitization (sampling and quantization). A time-to-digital converter (TDC) is generally used to digitize the phase difference of two input clock signals. To imitate the analog PD, which has a theoretically infinite resolution, a high resolution with a sufficiently wide range is required. Dudek [9] proposed a vernier TDC that has a sub-gate-delay resolution with an asynchronous read-out circuitry, but it still entails a large power consumption. In addition, since the TDC is intrinsically single-ended and based on delay lines, it is still difficult to obtain a high resolution with a monotonic characteristic in high-frequency operation.

Figure 2.4: Block diagram of TDC-based digital PLL.

Figure 2.4 describes a block diagram of the TDC-based digital PLL. The TDC digitizes the phase difference between the two input clock signals but cannot perform the frequency acquisition, requiring an auxiliary frequency tracking loop (FTL) using a frequency detector (FD). A high-frequency-operating delta-sigma modulator (DSM) is located between the DLF and the DCO to alleviate the quantization noise of the DCO. The DSM dithers the fractional bits of the frequency control word (FCW) and improves the effective frequency resolution of the DCO through noise shaping, which moves the quantization noise from low frequencies to high frequencies.
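Returning to the mapping in (2.2) and (2.3), the following minimal Python sketch checks numerically that the proportional-integral DLF, $\beta + \alpha/(1 - z^{-1})$, tracks the analog impedance $R + 1/(sC)$ well below the Nyquist frequency. The component values `R` and `C` are hypothetical and chosen only for illustration; they are not taken from the prototype chips.

```python
import cmath
import math

def dlf_gains(R, C, Ts):
    """Map analog loop-filter values to DLF gains using (2.2) and (2.3)."""
    alpha = Ts / C
    beta = R - Ts / (2.0 * C)
    return alpha, beta

def analog_lf(s, R, C):
    """Analog loop-filter impedance R + 1/(sC)."""
    return R + 1.0 / (s * C)

def digital_lf(z, alpha, beta):
    """Proportional-integral DLF: beta + alpha / (1 - z^-1)."""
    return beta + alpha / (1.0 - 1.0 / z)

# Hypothetical example values (not taken from the prototype chips).
R = 10e3          # ohm
C = 100e-12       # farad
Ts = 1.0 / 50e6   # sampling period for a 50 MHz reference clock

alpha, beta = dlf_gains(R, C, Ts)

# Compare the two filters at 1 MHz, far below Nyquist, where warping is small.
f = 1e6
s = 1j * 2 * math.pi * f
z = cmath.exp(s * Ts)
rel_err = abs(digital_lf(z, alpha, beta) - analog_lf(s, R, C)) / abs(analog_lf(s, R, C))
print(alpha, beta, rel_err)
```

The residual error comes only from the frequency warping of the bilinear transform, which is why it stays small at offsets far below $f_{\rm REF}/2$.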



#### 2.1.2 Phase Noise Analysis of AD-PLL

Figure 2.5: Parameterized noise model of TDC-based digital PLL.

Figure 2.5 shows a parameterized noise model of a conventional TDC-based digital PLL. It describes five main noise sources, $\theta_{n,REF}$, $Q_{n,TDC}$, $Q_{n,\Delta\Sigma}$, $Q_{n,DCO}$, and $\theta_{n,DCO}$, which represent the reference clock noise, the TDC quantization noise, the DSM quantization noise, the DCO quantization noise, and the DCO clock noise, respectively. *D* stands for the total loop delay taken to process the phase/frequency errors in the DLF, and *m* represents the order of the DSM. In order to distinguish between the two discrete-time domains, the z-domain transfer functions that operate at the dithering frequency are marked. It is noted that *D* in figure 2.5 has an implementation difference from the loop delay expressed in [10]: *D* is converted to the DSM dithering period and includes the signal transfer function of the DSM (a time delay in the discrete-time domain, *z*<sup>-m</sup>). As explained in section 2.1.1, the phase noise of the digital PLL can be analyzed similarly to the CP-PLL by linear approximation, and the open-loop transfer function of the system is expressed as follows.

$$A(s) = \frac{T_{\text{REF}}}{2\pi} \cdot \frac{1}{t_{\text{res,TDC}}} \cdot \left(\beta + \frac{\alpha}{1 - e^{-sT_{\text{REF}}}}\right) \cdot \frac{2\pi K_{\text{DCO}}}{s} \cdot \frac{1}{N} \tag{2.4}$$

where $T_{\text{REF}}$, $t_{\text{res,TDC}}$, and $K_{\text{DCO}}$ represent the period of the reference clock [s], the resolution of the TDC [s], and the gain of the DCO [Hz/code], respectively. Perrott [11] introduced a single parameterizing function, G(s), and all the relevant transfer functions are described in terms of G(s).

$$G(s) = \frac{A(s)}{1 + A(s)} \tag{2.5}$$

Even though the parameterized model in figure 2.5 is not the same as that of [11], which deals with a fractional-N PLL, the analytical methodology on the phase noise contribution by using the power spectral density conversion can be applied in the same way. In the remainder of this section, the quantization noise expressions of the TDC and the DCO, which are the special properties caused by digitization, are derived, and the dithering noise of the DSM is also addressed.

#### **Quantization Noises**

Assuming that a quantization error is a random variable uniformly distributed in a finite interval  $\Delta$ , i.e., from  $-\Delta/2$  to  $\Delta/2$ , its quantization noise power is calculated as  $\Delta^2/12$ . The total noise power is uniformly spread over the span from DC to the Nyquist frequency (the frequency of the reference clock in a PLL case), thus the single-sided spectral density  $\mathcal{L}$  is expressed as

$$\mathcal{L} = \frac{\Delta^2}{12} \cdot \frac{1}{f_{\text{REF}}} \tag{2.6}$$

When it comes to the quantization noise of the TDC, Staszewski [12, 13] derived the equation of the phase noise spectral density due to the TDC quantization effect. Therefore, by using (2.5), the total phase noise contribution due to the TDC quantization is calculated as follows.

$$\mathcal{L}_{\text{TDC},\text{Q}} = \frac{(2\pi)^2}{12} \left(\frac{t_{\text{res},\text{TDC}}}{T_{\text{REF}}}\right)^2 \frac{1}{f_{\text{REF}}} \left|N \cdot G(f)\right|^2 \tag{2.7}$$
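As a rough sanity check of (2.7), the in-band TDC quantization floor can be estimated with a few lines of Python. This is only a sketch: it uses the parameters quoted in the caption of figure 2.6 and assumes $|G(f)| \approx 1$ inside the loop bandwidth, which is an approximation rather than a result from the simulation model.

```python
import math

def tdc_quantization_pn(t_res, f_ref, n_div, g_mag=1.0):
    """Phase noise (linear, 1/Hz) from TDC quantization per (2.7)."""
    t_ref = 1.0 / f_ref
    return ((2 * math.pi) ** 2 / 12.0) * (t_res / t_ref) ** 2 / f_ref * (n_div * g_mag) ** 2

# Parameters quoted in the caption of figure 2.6.
f_ref = 50e6
f_out = 5e9
n_div = f_out / f_ref     # N = 100
t_res = 500e-15

# Assumption: in-band, |G(f)| is close to 1.
floor_db = 10 * math.log10(tdc_quantization_pn(t_res, f_ref, n_div))
print(round(floor_db, 2))
```

Under these assumptions the floor lands near -124 dBc/Hz. Halving `t_res` lowers it by about 6 dB, since the TDC resolution enters (2.7) squared.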

The phase noise contribution due to the DCO quantization effect can also be derived in a similar manner, but it needs to be multiplied by the *sinc* function corresponding to the Fourier transform of the zero-order hold (ZOH) operation. The model illustrated in figure 2.5 is considered an event-driven system that samples and corrects the phase/frequency error of the output clock at every reference clock rate. However, the DCO inputs are not impulses but are held until the next phase/frequency error is updated [14]. Consequently, the total phase noise contribution due to the DCO quantization is expressed as

$$\mathcal{L}_{\text{DCO},\text{Q}} = \frac{1}{12} \left( \frac{f_{\text{res},\text{DCO}}}{f} \right)^2 \frac{1}{f_{\Delta\Sigma}} \left( \operatorname{sinc} \frac{f}{f_{\Delta\Sigma}} \right)^2 |1 - G(f)|^2 \tag{2.8}$$

where $f_{\rm res,DCO}$ is the resolution of the DCO. However, the DSM effect should be additionally considered in (2.8). The DSM dithers the fractional bits of the FCW and produces a noise-shaping effect, which moves the low-frequency quantization noise to high frequencies. Consequently, by implementing the DSM whose operating frequency, $f_{\Delta\Sigma}$, is much higher than $f_{\rm REF}$, the effective DCO resolution is greatly improved to $f_{\rm res*,DCO} = f_{\rm res,DCO} / 2^{N_{\rm DSM}}$, where $N_{\rm DSM}$ is the fractional word length of the FCW. As a result, $f_{\rm res,DCO}$ in (2.8) is substituted by the effective resolution of the DCO, $f_{\rm res*,DCO}$.

$$\mathcal{L}_{\text{DCO},\text{Q}} = \frac{1}{12} \left( \frac{f_{\text{res}*,\text{DCO}}}{f} \right)^2 \frac{1}{f_{\Delta\Sigma}} \left( \operatorname{sinc} \frac{f}{f_{\Delta\Sigma}} \right)^2 |1 - G(f)|^2 \tag{2.9}$$

#### **DSM Dithering Noise**

Despite the advantages of suppressing the quantization noise of the DCO, the dithering effect itself induces a corresponding noise, which is called a dithering noise of the DSM. It is noted that the signal transfer function of the DSM is  $z^{-m}$ , and the noise transfer function (NTF) of the DSM is  $(1 - z^{-1})^m$  as shown in figure 2.5. The magnitude of the NTF is calculated as

$$|\text{NTF of DSM}| = \left| 2 \cdot \sin\left(\frac{\omega \cdot T_{\Delta\Sigma}}{2}\right) \right|^{m} = \left| 2 \cdot \sin\left(\frac{\pi \cdot f}{f_{\Delta\Sigma}}\right) \right|^{m} \tag{2.10}$$
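The closed form in (2.10) can be checked against a direct evaluation of $(1 - z^{-1})^m$ on the unit circle. The sketch below does this for a few offsets and DSM orders; the 1 GHz DSM clock is taken from the caption of figure 2.6, while the test offsets are arbitrary choices.

```python
import cmath
import math

def ntf_mag_direct(f, f_dsm, m):
    """|(1 - z^-1)^m| evaluated on the unit circle, z = exp(j*2*pi*f/f_dsm)."""
    z_inv = cmath.exp(-1j * 2 * math.pi * f / f_dsm)
    return abs((1 - z_inv) ** m)

def ntf_mag_closed_form(f, f_dsm, m):
    """Closed form of (2.10): |2*sin(pi*f/f_dsm)|^m."""
    return abs(2 * math.sin(math.pi * f / f_dsm)) ** m

f_dsm = 1e9   # DSM clock from the caption of figure 2.6
for m in (1, 2, 3):
    for f in (1e5, 1e7, 2.5e8, 5e8):   # arbitrary test offsets
        assert math.isclose(ntf_mag_direct(f, f_dsm, m),
                            ntf_mag_closed_form(f, f_dsm, m), rel_tol=1e-9)

# First-order shaping at a 1 MHz offset, expressed in dB.
atten_db = 20 * math.log10(ntf_mag_closed_form(1e6, f_dsm, 1))
print(atten_db)
```

Because the magnitude scales with $f / f_{\Delta\Sigma}$ at low offsets, raising the DSM clock directly lowers the shaped noise in the band of interest.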

Referring to (2.10), it is inferred that the dithering noise within the frequency range of interest can be reduced with a sufficiently high $f_{\Delta\Sigma}$. Therefore, the DSM operating clock is set to the output clock divided by 2 or 4, since the output clock has the highest achievable frequency in the PLL system. The resulting dithering-induced phase noise is expressed as follows.

$$\mathcal{L}_{\text{DSM,dth}} = \frac{1}{12} \left( \frac{f_{\text{res,DCO}}}{f} \right)^2 \frac{1}{f_{\Delta\Sigma}} \left( 2\sin \frac{\pi f}{f_{\Delta\Sigma}} \right)^{2m} |1 - G(f)|^2 \tag{2.11}$$

#### **Total Phase Noise of AD-PLL**

According to the parameterized model in figure 2.5, the phase noise simulation conducted in MATLAB is shown in figure 2.6. The phase noise contributions of the individual noise sources show a response similar to that of the analog CP-PLL. In the frequency range of interest, the in-band phase noise is dominated by the reference clock noise and the TDC quantization noise, while the out-of-band phase noise is dominated by the DCO noise. The most noticeable difference appears at much higher frequency offsets. Due to the noise-shaping effect of the DSM, the out-of-band phase noise does not fully follow the -20 dB/dec slope that characterizes the oscillator noise. The phase noise profile of the free-running DCO is fitted into a general phase noise plot of an electrical oscillator [15] based on a post-layout simulation result. The coefficients of $1/f^3$ and $1/f^2$ are derived from the phase noise spectral densities of -32.47 dBc/Hz and -136.44 dBc/Hz at the offset frequencies of 10 kHz and 100 MHz, respectively, which corresponds to a flicker noise corner frequency of 2.5 MHz. The integrated RMS jitter from 10 kHz to 40 MHz is calculated to be 353 fs.

Figure 2.6: Phase noise simulation with $f_{\text{REF}}$ of 50 MHz, $f_{\text{OUT}}$ of 5 GHz, $f_{\text{DSM}}$ of 1 GHz, $\alpha$ of 1/256, $\beta$ of 1/8, m of 1, $t_{\text{res,TDC}}$ of 500 fs, and $K_{\text{DCO}}$ of 3 MHz/code.

Figure 2.7: Phase noise plots derived from the model of figure 2.5 and equations of (2.7), (2.9), and (2.11).

Figure 2.7 compares the phase noise contributions due to the TDC quantization, the DSM dithering, and the DCO quantization. One set is obtained from the derived equations (2.7), (2.9), and (2.11), and the other is obtained from the discrete-time parameterized model in figure 2.5. Nearly identical results are obtained, except for a negligible difference in the noise spectral density in the vicinity of the digital operating frequency, as explained in section 2.1.1.
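The fit of the free-running DCO profile can be reproduced approximately from the two quoted data points. The sketch below assumes the simple two-term model $\mathcal{L}(f) = a_3/f^3 + a_2/f^2$ (an assumption made here for illustration; the exact fitting procedure used with [15] is not given in the text) and solves for the coefficients, from which the flicker corner follows as $a_3/a_2$.

```python
import math

def fit_phase_noise(f1, l1_dbc, f2, l2_dbc):
    """Solve L(f) = a3/f^3 + a2/f^2 through two (offset, dBc/Hz) points."""
    l1 = 10 ** (l1_dbc / 10)
    l2 = 10 ** (l2_dbc / 10)
    # Two linear equations in (a3, a2), solved by Cramer's rule.
    det = f1 ** -3 * f2 ** -2 - f2 ** -3 * f1 ** -2
    a3 = (l1 * f2 ** -2 - l2 * f1 ** -2) / det
    a2 = (f1 ** -3 * l2 - f2 ** -3 * l1) / det
    return a3, a2

# Data points quoted in the text for the free-running DCO.
a3, a2 = fit_phase_noise(10e3, -32.47, 100e6, -136.44)
corner = a3 / a2   # offset where the 1/f^3 and 1/f^2 terms are equal
print(a3, a2, corner)
```

With these two points the corner comes out close to the 2.5 MHz stated above, which suggests the quoted values are mutually consistent under the two-term model.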

#### 2.1.3 Effect of Supply Noise in PLL System

In the case of the RO whose operating frequency is determined by logic propagation delays, its frequency is more vulnerable to a supply variation. Therefore, the effect of an on-chip supply fluctuation in the PLL system is examined by the phase noise analysis through the parameterized model in figure 2.5.

In figure 2.5, the supply voltage variation (SVV) is depicted as $\Delta V_{SVV}$, the sixth noise source. Although the SVV cannot be regarded as a completely independent noise source because its actual influence is correlated with the DCO free-running phase noise, the SVV model in figure 2.5 is constructed under the following two assumptions. The first is that the SVV is a white random noise that generates a static frequency deviation, and the second is that the frequency response to the SVV has a sufficient bandwidth within the frequency range of interest (up to 100 MHz). Consequently, the frequency-pushing factor (FPF) is defined as a scalar that does not change with respect to the operating frequency, and its value is extracted from the measurement results of the prototype chip, which will be covered in chapter 3.



Figure 2.8: Simulated phase noise with supply voltage variation in an oscillator.

Figure 2.8 shows the phase noise plots with FPFs of 3300 MHz/V and 100 MHz/V at $\Delta V_{SVV}$ of 3 mV<sub>RMS</sub>. The FPF of 3300 MHz/V is derived from a conventional RO whose frequency is proportional to the supply voltage, and the FPF of 100 MHz/V is derived from an RO that has additional logic calibrating the on-chip SVV noise. The overall phase noise contribution of the SVV noise shows a band-pass characteristic, which differs from the high-pass characteristic of the oscillator's phase noise because there is an integrator in the frequency-to-phase conversion. Intuitively, high-frequency noise is averaged out by the integrator. The SVV noise with the FPF of 3300 MHz/V deteriorates the overall PLL performance over the entire frequency range (in-band and out-of-band), whereas the SVV noise with the FPF of 100 MHz/V is significantly attenuated. The integrated RMS jitters from 10 kHz to 40 MHz are calculated to be 14.45 ps and 562.5 fs, which are about 40.9 times and 1.6 times the noise-free RMS jitter of 353 fs, respectively. According to the simulation, the on-chip calibration that reduces the FPF from 3300 to 100 MHz/V improves the integrated jitter by 28.2 dB.
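The jitter ratios quoted above follow from simple arithmetic, reproduced in the short Python sketch below. The last step is an added assumption, not a statement from the simulation: if the SVV-induced jitter adds in power to the noise-free jitter, the SVV-only contributions can be separated and compared with the FPF ratio of 3300/100 = 33.

```python
import math

# Integrated RMS jitter values quoted in the text (10 kHz to 40 MHz).
jitter_clean = 353e-15        # no supply noise
jitter_fpf_3300 = 14.45e-12   # FPF = 3300 MHz/V
jitter_fpf_100 = 562.5e-15    # FPF = 100 MHz/V

ratio_3300 = jitter_fpf_3300 / jitter_clean
ratio_100 = jitter_fpf_100 / jitter_clean
improvement_db = 20 * math.log10(jitter_fpf_3300 / jitter_fpf_100)

# Assumption: the SVV-induced jitter adds in power to the noise-free jitter.
svv_3300 = math.sqrt(jitter_fpf_3300 ** 2 - jitter_clean ** 2)
svv_100 = math.sqrt(jitter_fpf_100 ** 2 - jitter_clean ** 2)
svv_ratio = svv_3300 / svv_100   # compare with the FPF ratio 3300 / 100 = 33

print(round(ratio_3300, 1), round(ratio_100, 1), round(improvement_db, 1), round(svv_ratio, 1))
```

Under that assumption the separated SVV contributions scale almost exactly with the FPF, while the total jitter improves slightly less than $20\log_{10}(33) \approx 30.4$ dB because the noise-free jitter floor does not depend on the FPF.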

### 2.2 Injection-Locked Clock Multiplier (ILCM)

#### 2.2.1 Basics of Injection-Locked Oscillator (ILO)

The injection-locking phenomenon is a kind of coupling in which two different systems affect each other through a shared medium. As early as the 17th century, the Dutch scientist Christiaan Huygens noticed that the pendulums of two clocks on the wall moved in unison if the clocks were hung close to each other [16, 17]. Injection locking has been used in a number of applications, including frequency synthesizers (clock multipliers) [18, 19, 20, 21], clock recovery [22, 23], frequency division [24, 25], clock de-skewing as delay-controlling elements [26, 27], and so forth. In terms of an injection-locked oscillator (ILO), Adler [28] first developed a phasor diagram and derived the rate of phase rotation of the ILO, and Razavi [17] presented a graphical analysis in the time and frequency domains and formulated the behavior of phase-locked oscillators under injection. These principles underlie the detailed approaches of recent injection-locking analyses [29, 30]. To understand the concept of injection locking and the underlying ILO model, formulas for the injection lock range and the phase shift are derived through the phasor diagram [17] in this section.

The phasor diagram is a graphical representation of phase relationships on a coordinate system. Sinusoidal signals with the same frequency have a phase difference between them, and the lead/lag phase information can be visualized in the phasor domain. In addition, using the rotating vector method, the phasor diagram is also applicable to signals with slightly different frequencies. Figure 2.9 shows a simplified open-loop characteristic of an LC oscillator with a resonance frequency of $\omega_0 = 1/\sqrt{LC}$. Suppose an injection signal is inserted into the free-running oscillator and produces an additional phase shift, $\phi_0$, in the LC tank, as shown in figure 2.9(b). Then, the LC oscillator may oscillate at $\omega_{INJ}$ rather than $\omega_0$, and injection locking occurs if the injected signal is within the injection lock range and has enough amplitude. Under the injection-locked condition, the phasor diagram can be derived as shown in figure 2.10. With $\omega_0 \neq \omega_{\text{INJ}}$, $V_{\text{OUT}}$ rotates with respect to $V_{\text{INJ}}$. The instantaneous phase difference between $V_{\text{OSC}}$ and $V_{\text{INJ}}$ is defined as $\theta$, and the resultant phase difference between $V_{\text{OUT}}$ and $V_{\text{OSC}}$ is $\phi_0$.



Figure 2.9: (a) Second-order parallel LC tank and (b) its open-loop characteristic.



Figure 2.10: Phasor diagram with different frequencies between  $\omega_{OSC}$  and  $\omega_{INJ}$ .

A phase shift in the second-order parallel tank of figure 2.9(a) is expressed as

$$\alpha = \frac{\pi}{2} - \tan^{-1} \left( \frac{L\omega}{R} \cdot \frac{\omega_0^2}{\omega_0^2 - \omega^2} \right) \tag{2.12}$$

By approximating  $\omega_0^2 - \omega^2 \approx 2\omega_0(\omega_0 - \omega)$  and applying the quality factor of Q as  $L\omega/R = 1/Q$ , the equation (2.12) is expressed as

$$\tan \alpha \approx \frac{2Q}{\omega_0} (\omega_0 - \omega) \tag{2.13}$$
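For readers who want the intermediate step, the passage from (2.12) to (2.13) can be written out as follows. This is a short derivation sketch that uses only the identity $\tan(\pi/2 - x) = \cot x$ together with the two substitutions stated above, $L\omega/R = 1/Q$ and $\omega_0^2 - \omega^2 \approx 2\omega_0(\omega_0 - \omega)$.

$$\tan \alpha = \cot\left[\tan^{-1}\left(\frac{L\omega}{R} \cdot \frac{\omega_0^2}{\omega_0^2 - \omega^2}\right)\right] = \frac{R}{L\omega} \cdot \frac{\omega_0^2 - \omega^2}{\omega_0^2} \approx Q \cdot \frac{2\omega_0(\omega_0 - \omega)}{\omega_0^2} = \frac{2Q}{\omega_0}(\omega_0 - \omega)$$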

If the injected signal contains a phase modulation, i.e.,  $S_{INJ} = V_{INJ} \cdot \cos[\omega t + \psi(t)]$ , then the instantaneous injection frequency is  $\omega + d\psi/dt$  and the equation (2.13) is replaced with

$$\tan \alpha \approx \frac{2Q}{\omega_0} \left(\omega_0 - \omega - \frac{d\psi}{dt}\right) \tag{2.14}$$

The equation (2.14) is valid for narrow-band phase modulation (slowly-varying $\psi$), and this approximation holds well for typical injection phenomena [17].



Figure 2.11: LC-ILO model with a feedback system.

An LC oscillator under injection locking can be modeled by using a feedback system as shown in figure 2.11. The output, $S_{OUT}$, is represented by a phase-modulated signal having a carrier frequency of $\omega_{INJ}$, and the time-varying $\theta$ is derived from the model shown in figure 2.11. The output of the adder, $S_X$, is expressed as

$$S_{\rm X} = V_{\rm INJ} \cos \omega_{\rm INJ} t + V_{\rm OUT} \cos (\omega_{\rm INJ} t + \theta) \tag{2.15}$$

$$S_{\rm X} = (V_{\rm INJ} + V_{\rm OUT} \cos \theta) \, \cos \omega_{\rm INJ} t - V_{\rm OUT} \sin \theta \sin \omega_{\rm INJ} t \tag{2.16}$$

Since phase quantities do not satisfy superposition, the two terms in (2.16) cannot be subjected to the tank separately. By using $A \cdot \cos \omega t - B \cdot \sin \omega t = \sqrt{A^2 + B^2} \cdot \cos (\omega t + C)$, where $\tan C = B/A$, the equation (2.16) is replaced by

$$S_{\rm X} = \frac{(V_{\rm INJ} + V_{\rm OUT} \cos \theta)}{\cos \psi} \cdot \cos (\omega_{\rm INJ} t + \psi) \tag{2.17}$$

$$\tan \psi = \frac{V_{\text{OUT}} \sin \theta}{V_{\text{INJ}} + V_{\text{OUT}} \cos \theta} \tag{2.18}$$

Since  $\cos \psi = (\sqrt{1 + \tan^2 \psi})^{-1}$ ,  $S_{\rm X}$  is written as

$$S_{\rm X} = V_{\rm OUT} \cos\left(\omega_{\rm INJ} t + \psi\right) \tag{2.19}$$

$S_{\rm X}$ is fed into the LC tank, which applies the phase shift corresponding to equation (2.14).

$$S_{\text{OUT}} \approx V_{\text{OUT}} \cos \left\{ \omega_{\text{INJ}} t + \psi + \tan^{-1} \left[ \frac{2Q}{\omega_0} \left( \omega_0 - \omega - \frac{d\psi}{dt} \right) \right] \right\} \tag{2.20}$$

Equating (2.20) to the assumed output, $S_{\text{OUT}} = V_{\text{OUT}} \cos (\omega_{\text{INJ}} t + \theta)$, the relationship between $\psi$ and $\theta$ is derived as

$$\psi + \tan^{-1} \left[ \frac{2Q}{\omega_0} \left( \omega_0 - \omega - \frac{d\psi}{dt} \right) \right] = \theta \tag{2.21}$$

This relationship leads to the well-known result for the behavior of ILOs, known as Adler's equation [28], with the lock range $\omega_L$.

$$\frac{d\theta}{dt} = \omega_0 - \omega_{\rm INJ} - \omega_{\rm L} \sin\theta \tag{2.22}$$
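The locking behavior implied by (2.22) can be illustrated with a minimal forward-Euler integration in Python. The rates are normalized and the detuning values are hypothetical; the sketch only shows that the phase settles when $|\omega_0 - \omega_{\rm INJ}| < \omega_{\rm L}$ and keeps slipping otherwise.

```python
import math

def simulate_adler(delta_w, w_lock, theta0=0.0, dt=1e-3, steps=50_000):
    """Forward-Euler integration of (2.22): dtheta/dt = delta_w - w_lock*sin(theta).

    delta_w stands for (omega_0 - omega_INJ); all rates are normalized (rad/s).
    """
    theta = theta0
    for _ in range(steps):
        theta += dt * (delta_w - w_lock * math.sin(theta))
    return theta

w_lock = 1.0   # normalized lock range

# Inside the lock range: the phase settles to asin(delta_w / w_lock).
theta_locked = simulate_adler(delta_w=0.5, w_lock=w_lock)
print(theta_locked, math.asin(0.5))

# Outside the lock range: no fixed point exists, so the phase keeps slipping.
theta_unlocked = simulate_adler(delta_w=1.5, w_lock=w_lock)
print(theta_unlocked)
```

The settled value in the locked case is the static phase between the oscillator and the injected signal, which grows as the detuning approaches the edge of the lock range.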

#### 2.2.2 Phase Domain Response (PDR) Analysis

Dunwell [31] developed a phase domain response (PDR) analysis that describes the phase response of an ILO to an injection signal by a function P. The injection signal shifts the phase of the ILO, $\theta(t)$, by P, and this phase shift depends on the input phase difference, $\phi_{IN}(t)$, between the phase of the injected signal, $\theta_{INJ}(t)$, and the phase of the ILO, $\theta(t)$.

$$\phi_{\rm IN}(t) = \theta_{\rm INJ}(t) - \theta(t) \tag{2.23}$$

The following ILO model is formulated under the assumption that the injection signal generates an immediate phase change in  $\theta(t)$  by  $P(\phi_{\text{IN}})$ .<sup>1</sup> Moreover, the free-running frequency of the ILO is kept as  $\omega_0$ , and the frequency difference,  $\Delta\omega$ , causes an additional phase deviation of  $2\pi\Delta\omega/\omega_0$  in each injection period. In other words, the frequency difference keeps generating the phase deviation, and the phase shift generated by the injection signal,  $P(\phi_{\text{IN}})$ , and the phase deviation of  $2\pi\Delta\omega/\omega_0$  reaches an equilibrium in the steady state.



Figure 2.12: (a) Concept of PDR of the ILO and (b) PDR curve.

<sup>&</sup>lt;sup>1</sup>It is generally known that it takes some time for the phase shift corresponding to the input phase difference to be reflected. However, the simulated result in figure 4.19 exhibits that the phase shift is caused right after the injection signal, thus allowing the post-injection phases to be used to detect the phase shift. The detailed explanation will be addressed in chapter 4.

Figure 2.12(a) illustrates the PDR of the ILO. Assuming that  $S_{INJ}$  pushes or pulls into its zero-crossing point (virtual ground), the phase shift,  $\phi_{OUT}$ , caused by  $S_{INJ}$  is a function P of the input phase difference between  $S_{OSC}$  and  $S_{INJ}$ ,  $\phi_{IN}$ , which is depicted in figure 2.12(b) as a PDR curve. The desired injection-locking point is the origin, where both the input phase difference and the corresponding phase shift are zero. In the vicinity of the origin, the linear approximation can be applied, thus replacing P with a scalar value of  $\beta$  called an injection strength.

Since the phase shift occurs at every injection rate, a phase of the ILO,  $\theta(t)$ , can be treated as a discrete-time model,  $\theta[k]$  of the *k*-th injection. Hence, the equation (2.23) becomes

$$\phi_{\rm IN}[k] = \theta_{\rm INJ}[k] - \theta[k] \tag{2.24}$$

When the ILO is injection-locked without any frequency deviation, i.e.,  $\Delta \omega = 0$ , the change in  $\phi_{IN}$  is only determined by *P*.

$$\phi_{\rm IN}[k+1] = \phi_{\rm IN}[k] - P(\phi_{\rm IN}[k]) \tag{2.25}$$

Now suppose an injection-locked clock multiplier (frequency synthesizer), where the desired angular frequency of the ILO, $\omega_{\text{ILO}}$, is N times the angular frequency of the injection signal, $\omega_{\text{INJ}}$, such that $\omega_0 = N \cdot \omega_{\text{INJ}}$. However, if the angular frequency deviates from the ideal value, the deviation $\Delta \omega$ is expressed as

$$\Delta \omega = \omega_0 - N \cdot \omega_{\rm INJ} = 2\pi \Delta f \tag{2.26}$$

Since the period difference, $\Delta T$, corresponding to $\Delta f$ has a relationship of $\Delta T \approx -\Delta f / f_0^2$, the phase change accumulated from the k-th injection to the (k + 1)-th injection affects the (k + 1)-th input phase difference, $\phi_{IN}[k + 1]$. Then, the equation (2.25) is replaced by

$$\phi_{\rm IN}[k+1] = \phi_{\rm IN}[k] - P(\phi_{\rm IN}[k]) - \frac{2\pi N \Delta \omega}{\omega_0} \tag{2.27}$$

The equation (2.27) is derived with the k-th phase of the injection signal as the reference. If the injection signal has a cycle-to-cycle jitter such that

$$\Delta \theta_{\rm INJ}[k] = \theta_{\rm INJ}[k+1] - \theta_{\rm INJ}[k] \tag{2.28}$$

the equation (2.27) is finally replaced by

$$\phi_{\rm IN}[k+1] = \phi_{\rm IN}[k] - P(\phi_{\rm IN}[k]) - \frac{2\pi N\Delta\omega}{\omega_0} + \Delta\theta_{\rm INJ}[k] \tag{2.29}$$

The equation (2.29) can be expressed in terms of the phase of the ILO, $\theta[k]$, by substituting the equation (2.24) into the equation (2.29), resulting in

$$\theta[k+1] = \theta[k] + P(\phi_{\text{IN}}[k]) + \frac{2\pi N \Delta \omega}{\omega_0} \tag{2.30}$$

This non-linear phase relationship in PDR analysis is described in a discrete-time phase model as shown in figure 2.13.



Figure 2.13: Discrete-time non-linear phase model of the ILO having  $\Delta \omega$ .

#### Phase Shift in Steady State

The equations (2.29) and (2.30) provide a straightforward insight into injection locking in the steady state. When the injection-locked condition is reached, the ILO and the injection signal settle with a certain phase relation. The instantaneous input phase difference, $\phi_{IN}$, converges to the value of $\phi_{IN}$[SS] such that

$$\phi_{\mathrm{IN}}[SS] \coloneqq \lim_{k \to \infty} \phi_{\mathrm{IN}}[k+1] = \lim_{k \to \infty} \phi_{\mathrm{IN}}[k] \tag{2.31}$$

Then, both  $\phi_{IN}[k + 1]$  and  $\phi_{IN}[k]$  become  $\phi_{IN}[SS]$  in the steady state, hence the following phase relationship between the ILO and the injection signal is derived from the equation (2.29).

$$P(\phi_{\rm IN}[SS]) = -\frac{2\pi N\Delta\omega}{\omega_0} \tag{2.32}$$

The equation (2.32) indicates that the phase shift caused by the injection signal and the accumulated phase deviation resulting from  $\Delta \omega$  are in equilibrium. Through a linear approximation in the vicinity of the injection-locked position,  $\phi_{IN}[SS]$  in the equation (2.32) is expressed with an injection strength,  $\beta$ , as

$$\phi_{\rm IN}[SS] = -\frac{1}{\beta} \cdot \frac{2\pi N \Delta \omega}{\omega_0} \tag{2.33}$$

In other words, a static phase offset corresponding to the frequency error between the ILO and the injection signal is produced in the steady state and is inversely proportional to $\beta$. This offset deteriorates system performance in applications that adopt the ILO as a clock de-skewing or delay-controlling element. Even in a clock multiplier (frequency synthesizer), the instantaneous frequency deviation corresponding to the static phase offset is observed at every injection, thus degrading the reference spur performance.
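The steady-state result in (2.33) can be cross-checked by iterating the recursion (2.29) directly. The Python sketch below assumes the linearized PDR, $P(\phi) = \beta \phi$, and a jitter-free injection signal ($\Delta\theta_{\rm INJ}[k] = 0$); the numerical values of $\beta$, $N$, and $\Delta\omega/\omega_0$ are hypothetical and chosen only for illustration.

```python
import math

def iterate_pdr(beta, n_mult, dw_over_w0, phi0=0.0, steps=200):
    """Iterate (2.29) with the linearized PDR P(phi) = beta*phi and no injection jitter."""
    drift = 2 * math.pi * n_mult * dw_over_w0   # phase accumulated per injection period
    phi = phi0
    for _ in range(steps):
        phi = phi - beta * phi - drift
    return phi

# Hypothetical values chosen only for illustration.
beta = 0.2
n_mult = 16
dw_over_w0 = 1e-4   # fractional frequency error, delta_omega / omega_0

phi_ss_sim = iterate_pdr(beta, n_mult, dw_over_w0)
phi_ss_eq = -(1.0 / beta) * 2 * math.pi * n_mult * dw_over_w0   # equation (2.33)
print(phi_ss_sim, phi_ss_eq)
```

Doubling `beta` in this sketch halves the settled offset, in line with the inverse proportionality noted above.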



#### 2.2.3 Phase Noise of ILCM

Figure 2.14: Conceptual phase noise comparison between the digital PLL and ILCM.

Figures 2.14(a) and 2.14(b) show the conceptual block diagram and the phase noise plot of the conventional digital PLL and the ILCM, respectively. The injection signal periodically cleans the accumulated oscillator-induced noise, hence the ILCM can achieve a higher suppression bandwidth than that of a conventional digital PLL. Ye [32] developed a theoretical analysis and derived the transfer functions of the phase realignment of the ILO noise and the up-conversion of the reference noise. It is also assumed that the phase shift (called 'phase realignment' in [32]) is linear with respect to the instantaneous phase error by a factor of $\beta$, which is the same as the injection strength in the PDR-based analysis. To prevent confusion in notation, the parameters shared with section 2.2.2 keep the same notation in the following equations. The instantaneous ILO phase error, $\theta_{\text{INST}}(t)$, is given by

$$\theta_{\text{INST}}(t) = \theta(t) + \phi(t) \tag{2.34}$$

where $\theta(t)$ is the ILO phase error and $\phi(t)$ is the accumulated phase shift caused at each injection period. The input phase difference, $\phi_{IN}[k]$, is represented as

$$\phi_{\rm IN}[k] = \theta_{\rm INST}(kT_{\rm REF}^{-}) - N\theta_{\rm INJ}(kT_{\rm REF}) \tag{2.35}$$

where $t = kT_{\text{REF}}^{-}$ denotes the time instant just before the k-th injection edge.



Figure 2.15: Extra phase shift due to the phase realignment [32].

The injection-locking model is similar to the model of the PDR-based analysis, but Ye [32] derived the transfer functions in the continuous-time domain by using a phase error integrator. In other words, the phase shift generated at every reference period is accumulated (phase error integrator), and the k-th phase shift is held between $kT_{\text{REF}}$ and $(k + 1) T_{\text{REF}}$, as shown in figure 2.15. Accordingly, the phase shift, $\phi(t)$, is expressed as a phase error integrator and is developed with a 'hold' operation, with a new notation $\phi_{\Delta}[k] \coloneqq \phi(kT_{\text{REF}}^+)$.

$$\phi(t) = -\beta \sum_{k=-\infty}^{\infty} \phi_{\rm IN}[k] \cdot u(t - kT_{\rm REF}) \tag{2.36}$$

$$\phi_{\text{OUT}}[k] = -\beta \phi_{\text{IN}}[k] = \phi_{\Delta}[k] - \phi_{\Delta}[k-1] \tag{2.37}$$

$$\phi(t) = \sum_{k=-\infty}^{\infty} \phi_{\Delta}[k] \cdot h_{\text{hold}}(t - kT_{\text{REF}}) \tag{2.38}$$

where $h_{\text{hold}}(t) = u(t) - u(t - T_{\text{REF}})$ is the impulse response of the hold operation. Taking the Fourier transform of the equation (2.38) yields

$$\phi(j\omega) = T_{\text{REF}} \cdot e^{-j\omega T_{\text{REF}}/2} \cdot \frac{\sin(\omega T_{\text{REF}}/2)}{\omega T_{\text{REF}}/2} \cdot \phi_{\Delta}(z) \Big|_{z=e^{j\omega T_{\text{REF}}}} \tag{2.39}$$

where $\phi_{\Delta}(z)$ is the z-transform of $\phi_{\Delta}[k]$. Combining the equations (2.35) and (2.37) and taking the z-transform results in

$$\phi_{\Delta}(z) = \frac{-\beta}{1 + (\beta - 1)z^{-1}} \cdot \theta(z) + \frac{N\beta}{1 + (\beta - 1)z^{-1}} \cdot \theta_{\text{INJ}}(z) \tag{2.40}$$

Combining the equations (2.34), (2.39), and (2.40) yields

$$\theta_{\text{INST}}(j\omega) = \theta(j\omega) \cdot H_{\text{rl}}(j\omega) + \theta_{\text{INJ}}(j\omega) \cdot H_{\text{up}}(j\omega) \tag{2.41}$$

where

$$H_{\rm rl}(j\omega) = 1 - \frac{\beta}{1 + (\beta - 1)e^{-j\omega T_{\rm REF}}} \cdot e^{-j\omega T_{\rm REF}/2} \cdot \frac{\sin(\omega T_{\rm REF}/2)}{\omega T_{\rm REF}/2} \tag{2.42}$$

and

$$H_{\rm up}(j\omega) = \frac{N\beta}{1 + (\beta - 1)e^{-j\omega T_{\rm REF}}} \cdot e^{-j\omega T_{\rm REF}/2} \cdot \frac{\sin(\omega T_{\rm REF}/2)}{\omega T_{\rm REF}/2} \tag{2.43}$$

Here, $H_{\rm rl}(j\omega)$ and $H_{\rm up}(j\omega)$ are the transfer functions of the phase realignment of the ILO noise and the up-conversion of the injection signal noise, respectively. In accordance with the results in the equations (2.42) and (2.43), the linearized phase noise model of the ILCM is obtained as shown in figure 2.16. The simulated NTFs of the reference noise and the oscillator-induced noise of the model in figure 2.16 are shown in figure 2.17. All parameters are the same as those of figure 2.6 except for the injection strength, which is varied. As explained, the loop bandwidth is further extended with a larger $\beta$.
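The two transfer functions in (2.42) and (2.43) are straightforward to evaluate numerically. The following Python sketch uses the reference period and multiplication factor from the caption of figure 2.6; the injection strengths and the 1 MHz evaluation offset are illustrative choices, not the values behind figure 2.17.

```python
import cmath
import math

def _hold(w, t_ref):
    """Hold term exp(-j*w*T/2) * sin(w*T/2)/(w*T/2) shared by (2.42) and (2.43)."""
    x = w * t_ref / 2
    sinc = 1.0 if x == 0 else math.sin(x) / x
    return cmath.exp(-1j * x) * sinc

def h_up(w, beta, n_mult, t_ref):
    """Up-conversion transfer function of the injection noise, equation (2.43)."""
    return n_mult * beta / (1 + (beta - 1) * cmath.exp(-1j * w * t_ref)) * _hold(w, t_ref)

def h_rl(w, beta, t_ref):
    """Phase-realignment transfer function of the ILO noise, equation (2.42)."""
    return 1 - beta / (1 + (beta - 1) * cmath.exp(-1j * w * t_ref)) * _hold(w, t_ref)

t_ref = 1 / 50e6        # reference period from the caption of figure 2.6
n_mult = 100            # 5 GHz / 50 MHz
w = 2 * math.pi * 1e6   # 1 MHz offset (illustrative)

for beta in (0.1, 0.3, 0.9):   # illustrative injection strengths
    print(beta, abs(h_up(w, beta, n_mult, t_ref)), abs(h_rl(w, beta, t_ref)))
```

At zero offset the sketch gives $|H_{\rm up}| = N$ and $|H_{\rm rl}| = 0$, and at a fixed offset $|H_{\rm rl}|$ shrinks as $\beta$ grows, which matches the bandwidth extension described above.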




Figure 2.16: Phase noise model of the ILCM including transfer functions of the phase realignment of the ILO noise and the up-conversion of the injection noise.



Figure 2.17: Magnitude of (a)  $H_{up}(s)$  and (b)  $H_{rl}(s)$  with  $\beta$  variations.

# **Chapter 3**

# Self-Biased Supply-Noise-Compensating RO

## 3.1 Overview

This chapter presents an AD-PLL for a DDR5 RCD application with a self-biased supply-noise-compensation (SNC) technique [33]. By combining two Nagata current sources that have opposite dependencies on supply variations, it offers a constant current to a RO over a wide range of supply voltage. Thereby, the deterministic jitter caused by the dynamic voltage droop due to workload transition is compensated while a static voltage margin for mass production is improved. Since the proposed SNC technique operates independently of the PLL loop bandwidth without using feedback, the AD-PLL with the SNC-RO is free from the stability problem associated with bandwidth overlapping. Furthermore, the SNC technique is adaptable to applications that have a noisy reference clock, since it has an open-loop calibration characteristic. Quantitative analyses on static and dynamic characteristics of the proposed SNC technique and relevant design-oriented considerations are addressed. In addition to the SNC technique, the design of an auxiliary frequency tracking loop (A-FTL) to meet the stabilization time of the RCD application and the design of a TDC-PFD considering an input jitter specification are also presented. Fabricated in the 28-nm CMOS technology, the AD-PLL occupies an active area of 0.06 mm<sup>2</sup> and consumes 12.1 mW at 3.0 GHz with a 1.1 V supply voltage. The power-supply-noise-attenuation (PSNA) is measured as 40 dB with a 300 kHz single-tone noise frequency and retains over 35 dB up to 3 MHz. The integrated RMS jitter of 271 fs from 100 kHz to 500 MHz is achieved, which translates to a jitter-and-power figure of merit (FoM<sub>JITTER</sub>) of -240.5 dB.
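The quoted figure of merit can be reproduced from the jitter and power numbers above. The sketch below assumes the commonly used definition $\mathrm{FoM_{JITTER}} = 10\log_{10}\left[(\sigma_t / 1\,\mathrm{s})^2 \cdot (P / 1\,\mathrm{mW})\right]$; the text does not spell out the formula, so this definition is an assumption.

```python
import math

def fom_jitter_db(rms_jitter_s, power_w):
    """Jitter-and-power FoM: 10*log10((sigma / 1 s)^2 * (P / 1 mW))."""
    return 10 * math.log10(rms_jitter_s ** 2 * (power_w / 1e-3))

# Measured values quoted in the overview: 271 fs RMS jitter, 12.1 mW.
fom = fom_jitter_db(271e-15, 12.1e-3)
print(round(fom, 1))
```

With this definition, the 271 fs and 12.1 mW figures give the -240.5 dB stated in the overview.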

The rest of this chapter is organized as follows. Section 3.2 introduces the design motivations along with the essential requirements for the SNC in the DDR5 RCD application and prior works regarding the SNC technique. Section 3.3 focuses on the design aspects and the detailed design-oriented analyses of the proposed self-biased SNC-RO. The circuit implementations of the building blocks are described in section 3.4. The measurement results from the prototype chip are presented in section 3.5. Finally, the key contributions of this chapter are summarized in section 3.6.

## **3.2 Design Motivation**

### 3.2.1 Design Consideration of SNC for RCD application

The RCD buffers the command and address bus, chip selects, and clock between the host controller and the DRAMs in server memory modules [4, 5]. Figure 3.1 shows the block diagram of registered dual inline memory modules (RDIMMs). The RCD solves address and command loading problems and enables server and client computing systems to deal with the most demanding workloads. One of the key challenges of designing the PLL in the RCD is responding to the sudden voltage droop due to workload transitions. This non-ideal supply can cause performance degradation or functional failures [34]. A power-efficient method to suppress supply-induced noise in oscillators is to increase the PLL loop bandwidth, but a narrow bandwidth is preferred in the RCD to filter out the input reference clock noise. According to the input clock differential jitter specification in [5], the maximum one-period jitter is as large as 0.14 UI at a bit error rate (BER) of 1E-16. The PLL bandwidth therefore cannot be arbitrarily increased, and an additional technique for suppressing the supply-induced noise is required.

The essential prerequisites for designing the SNC-PLL in the RCD application are as follows. First of all, the SNC technique should be independent of the signal quality of the input reference clock. As mentioned above, since the RCD receives a noisy input reference clock, the PLL in the RCD plays a role in filtering out the input clock noise. Therefore, the reference clock cannot be used as a criterion for detecting supply fluctuations. The second requirement is a sufficient static voltage margin within all operating configurations. The RCD can use the DIMM operating speed control word to optimize its functionality and performance at low voltage conditions [5]. To support its lower voltage mode and secure its stable functionality over process, voltage, and temperature (PVT) variations for mass production, a sufficient voltage margin is required. The third prerequisite is a sufficient SNC bandwidth for responding to dynamic voltage droops. The unwanted supply fluctuation due to the workload transition or the resonance between the package/bonding inductance and the on-die decoupling capacitance may cause supply variations of up to 10-15 % of the nominal supply voltage [34, 35]. Even with a large droop of about -100 mV over 3 $\mu$s, clock omission must not occur, and the clock drift should be kept low enough by a sufficient tracking bandwidth. Lastly, the PLL has to meet a stabilization time of 3.5 $\mu$s [5], even if the SNC technique needs a training procedure for a foreground calibration or affects the PLL locking time.

Figure 3.1: Block diagram of registered DIMM.

### 3.2.2 Prior Works Regarding SNC Techniques

The output jitter of the RO-based PLL is strongly affected by supply fluctuation. This effect is referred to as a high frequency pushing (FP), i.e., a high supply sensitivity. To minimize the FP, a low-dropout regulator (LDO) with a high-gain suppression is commonly used [36, 37, 38, 39], but conventional regulation techniques using the LDO suffer from the fundamental limitation that the bandwidth of the LDO should be much wider than the PLL bandwidth to achieve the desired supply rejection [40]. In addition, an LDO supporting a wide operating range increases power consumption and increases the PLL's random jitter by 2-3x [41]. Therefore, several SNC techniques that do not use an LDO have been proposed [42, 43, 44, 45]. However, these prior SNC techniques cannot satisfy all of the prerequisites for the SNC-PLL in the RCD application described in section 3.2.1. This section explains why they are unacceptable in the RCD application.

Figure 3.2: SNC technique proposed in [42].

Yeh [42] proposed a background supply-noise-cancellation controller that has a ripple detector and low-bandwidth control logic. By using a finite state machine that compares the before-and-after integral codes in the DLF during a finite interval (5.24 ms in [42]), the ripple detector controls the current digital-to-analog converter (DAC). This solution requires a long calibration time and may violate the stabilization time of the RCD because the stable ripple value must be stored at least once when the PLL reaches the steady state. Another background calibration was proposed by Elshazly [43]. A low-frequency deterministic digital test signal is injected into the oscillator supply, and the gain of the cancellation currents is adjusted by the correlated integral code. However, the SNC techniques using background calibration in [42, 43] have a crucial defect. Their mechanisms of detecting the supply noise rely on the integral code of the DLF, whose ripple or correlation value is assumed to originate from the supply noise. Therefore, they could instead aggravate deterministic jitter in applications that do not use a clean reference clock. Nagam [44] proposed a feed-forward noise cancellation (FFNC) that leverages the available sub-sampling phase detector (SSPD) to extract the noise with high sensitivity. The noise cancellation block (NCB) can be constituted independently of the PLL loop bandwidth, but the amount of suppression is limited by the range of the variable delay lines in the NCB. Furthermore, it is also unacceptable in noisy-reference applications. In [45], proposed by Wu, the supply noise is filtered by providing high impedance to the RO, but the stacked thick-oxide MOSFETs for gain boosting require a high supply voltage of 1.6 V for reliability and the necessary voltage headroom.

Figure 3.3: SNC technique proposed in [43].

Figure 3.4: SNC technique proposed in [44].

Figure 3.5: SNC technique proposed in [45].

## 3.3 Proposed Self-Biased SNC-RO

#### 3.3.1 Principle of Nagata Current Source

Figure 3.6(a) shows a peaking current mirror called the Nagata current source (CS) [46]. Unlike a conventional current mirror, a resistor R is added between the gate and the drain of the diode-connected MOSFET M<sub>IN</sub>. As the input current $I_{IN}$ increases, the voltage drop across the resistor, $I_{IN} \cdot R$, increases, which decreases the gate-source voltage of M<sub>OUT</sub>. Therefore, the output current $I_{OUT}$ exhibits a peak with respect to $I_{IN}$ as shown in Fig. 3.6(b). When a system is biased in the vicinity of the peak, a constant output current against supply variations can be achieved. However, such an optimum bias region is very narrow, so parallel combinations of Nagata CSs with different peak positions were proposed [47, 48] in order to overcome the drawback of a single Nagata CS.
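The peaking behavior can be illustrated with a first-order model. The sketch below assumes ideal, matched square-law devices without channel-length modulation, which is a simplification and not the simulated circuit of figure 3.6; the device constant `K` and the resistor values are hypothetical. Under these assumptions the peak occurs at $I_{IN} = 1/(2kR^2)$, where $I_{OUT} = I_{IN}/4$.

```python
import math

def nagata_iout(i_in: float, r: float, k: float) -> float:
    """I_OUT of an idealized Nagata CS (matched square-law devices, k = mu*Cox*W/L)."""
    v_ov_in = math.sqrt(2.0 * i_in / k)   # overdrive of the diode-connected M_IN
    v_ov_out = v_ov_in - i_in * r         # gate of M_OUT is lowered by the IR drop
    return 0.5 * k * v_ov_out ** 2 if v_ov_out > 0.0 else 0.0

def peak_input_current(r: float, k: float) -> float:
    """Closed-form peak location of the idealized model: I_IN = 1 / (2 k R^2)."""
    return 1.0 / (2.0 * k * r * r)

K = 2e-3  # A/V^2, hypothetical device constant
sweep = [n * 1e-6 for n in range(1, 2001)]  # input current from 1 uA to 2 mA
peaks = {}
for r in (500.0, 1e3, 2e3):  # hypothetical resistor values in ohms
    peaks[r] = max(sweep, key=lambda i: nagata_iout(i, r, K))
    print(f"R = {r:6.0f} ohm: swept peak at {peaks[r] * 1e6:7.1f} uA, "
          f"closed form {peak_input_current(r, K) * 1e6:7.1f} uA")
```

In this model a larger R moves the peak to a lower input current, which is the trend the multi-resistor arrangements exploit to place peaks at different positions.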



Figure 3.6: Circuit description and operation of Nagata CS.



Figure 3.7: Circuit description of proposed SNC-DCO.

### 3.3.2 Circuit Description of Self-Biased SNC-RO

Figure 3.7 describes the proposed SNC-DCO supporting a power-down mode. It consists of a bias voltage generator (BVG), a 10-bit digitally-controlled frequency-tuning block implemented as current mirrors, and a 4-stage differential RO. The unit frequency-tuning cell (FTC) is composed of only two CSs, $M_1$ and $M_2$, to perform the SNC function using as few devices as possible. This is because the larger the number of CSs and series resistors in the BVG is, the less the voltage headroom and the lower the overdrive voltage of the CSs become, which disrupts the wide-band operation and aggravates the phase noise performance. The frequency of the SNC-DCO is determined by the number of activated FTCs, and $M_6$ operates as the digital switch. It is noted that a low-threshold-voltage (LVT) device is used for $M_2$ to secure the voltage margin for its operating condition. In addition, two additional MOSFETs, $M_4$ and $M_5$, are employed to support the power-down mode. When $S_{PDN}$ is activated as a logical zero, $M_5$ is turned off and $M_4$ is turned on, so that the bias voltages $V_1$ and $V_2$ become a logical high and all of the FTCs are turned off.

Figure 3.8: (a) Static and (b) dynamic characteristics of SNC-DCO.

To keep a constant current despite the supply fluctuation, two currents with positive- and negative-supply dependency are added. As the supply voltage goes higher by an amount of $\Delta V_{VDD}$, the current flowing through the BVG, $I_{BIAS}$, increases. Then, the bias voltages $V_1$ and $V_2$ increase by the corresponding IR drop across the resistor $R_1$ and $(R_1 + R_2)$, respectively, since M<sub>5</sub> also operates as a switch and the drain voltage of M<sub>5</sub> is approximately the same as ground. When $\Delta V_1$ is smaller than $\Delta V_{VDD}$ and $\Delta V_2$ is larger than $\Delta V_{VDD}$, the changes in the overdrive voltages of M<sub>1</sub> and M<sub>2</sub> have opposite polarities. Thus, if the transconductances of M<sub>1</sub> and M<sub>2</sub> in the vicinity of the nominal voltage VDD<sub>NOM</sub> are designed so that the changes in the two currents have the same magnitude and opposite polarities, the FP converges to zero.

Figure 3.8(a) illustrates the aforementioned DC characteristic. The summation $I_{\text{DCO}}$ of $I_1$ and $I_2$, which have positive and negative dependencies on the supply voltage variations, respectively, tends to be insensitive to the supply variations in the vicinity of the nominal voltage. The wider the supply range over which the absolute values of the derivatives of $I_1$ and $I_2$ are kept the same, the wider the supply range over which the minimum FP is maintained, thereby further improving the static voltage margin. Figure 3.8(b) shows a transient simulation result of the frequency trend of the free-running DCO when a step function is applied to VDD<sub>DCO</sub> to estimate its dynamic response. It shows a high-pass characteristic with a tracking bandwidth of about 20 MHz. On the other hand, in the case of the SNC-free DCO, the frequency increases in proportion to the supply change. Comparing the amount of frequency change after about 10 ns, an improved FP of about 20 MHz/V is observed in the SNC-DCO, while the FP of the SNC-free DCO is about 3800 MHz/V. The SNC technique thus improves the FP by about 45 dB.

#### 3.3.3 Design-Oriented Analysis

The previous studies employing a Nagata CS [47, 48] focus only on generating a constant reference current and conducting the large-signal analysis. However, in the design of the FTC of the proposed SNC-DCO, the amount of noise suppressions, the SNC-operating bandwidth, and the phase noise contribution due to the SNC technique are additionally considered as well as keeping the constant DC current. Therefore, in this section, the design-oriented determination of the design parameters of the SNC-DCO and the additional considerations regarding the SNC in the RO are addressed. Moreover, the quantitative analyses of the SNC performance metric of the power-supply-noise-attenuation (PSNA) are discussed in detail.



Figure 3.9: Small-signal circuit of the BVG.

#### **Determination of Design Parameters: FP**

The FP in an oscillator refers to a frequency sensitivity to fluctuations in the power supply voltage and is given as

$$FP = \frac{\Delta Freq.}{\Delta VDD} = \frac{\Delta Freq.}{\Delta I_{DCO}} \cdot \frac{\Delta I_{DCO}}{\Delta VDD} \tag{3.1}$$

where $I_{DCO}$ is the current flowing into the DCO. $I_{DCO}$ is determined by the two CSs, as illustrated in figure 3.7, and the resulting change in $I_{DCO}$ due to a fluctuation in the supply voltage is expressed as

$$\frac{\Delta I_{DCO}}{\Delta VDD} = g_{m1} \cdot \left(1 - \frac{\Delta V_1}{\Delta VDD}\right) + g_{m2} \cdot \left(1 - \frac{\Delta V_2}{\Delta VDD}\right) \tag{3.2}$$

where $g_{m1}$ and $g_{m2}$ are the transconductances of the two CSs, respectively. According to (3.2), the SNC performance is derived from the dynamic behavior of the bias voltages $V_1$ and $V_2$ through the small-signal model of the BVG.

Figure 3.9 shows the small-signal circuit of the BVG. $C_1$ and $C_2$ represent the entire capacitance between the bias voltages and the supply, including parasitic and on-chip decoupling capacitances. It is noted that the two MOSFETs for the power-down mode, $M_4$ and $M_5$ in figure 3.7, are omitted in the analysis because $M_4$ is turned off in normal operation and the drain of $M_5$ becomes ground. With an approximation of $C_{gd3}+C_2\approx C_2$, the respective small-signal transfer functions of the bias voltages $V_1$ and $V_2$ are expressed as

$$\frac{\Delta V_2}{\Delta VDD} = \frac{s^2 + (N + \frac{g_m}{C_2})s + \frac{(R_1 + R_2)(1 + g_m r_o)}{R_1 R_2 r_o C_1 C_2}}{s^2 + Ns + \frac{R_1 + R_2 + r_o}{R_1 R_2 r_o C_1 C_2}} \tag{3.3}$$

$$\frac{\Delta V_1}{\Delta VDD} = \frac{R_1}{R_1 + R_2 + sC_1R_1R_2} \cdot \frac{\Delta V_2}{\Delta VDD} + \frac{sC_1R_1R_2}{R_1 + R_2 + sC_1R_1R_2} \tag{3.4}$$

where $N = \frac{C_2 r_o (R_1 + R_2) + C_1 R_1 (r_o + R_2)}{R_1 R_2 r_o C_1 C_2}$, and $g_m$ and $r_o$ are the transconductance and the output resistance of M<sub>3</sub>, respectively. To achieve the minimum FP at the quiet (DC) voltage, the desired transconductance ratio of the two CSs is derived from (3.2), (3.3), and (3.4) as

$$\frac{g_{m1}}{g_{m2}} = \frac{g_m r_o (R_1 + R_2) - r_o}{R_2 + r_o - R_1 g_m r_o} \tag{3.5}$$
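As a numerical cross-check of this step, the sketch below evaluates the $s = 0$ limits of (3.3) and (3.4), applies the ratio of (3.5), and confirms that the DC value of (3.2) vanishes. All component values (`GM`, `RO`, `R1`, `R2`, `GM2`) are hypothetical and chosen only for illustration; they are not the design values of the prototype.

```python
def dc_bias_gains(gm: float, ro: float, r1: float, r2: float):
    """DC (s = 0) limits of (3.3) and (3.4): returns (dV1/dVDD, dV2/dVDD)."""
    h2 = (r1 + r2) * (1.0 + gm * ro) / (r1 + r2 + ro)
    h1 = r1 / (r1 + r2) * h2
    return h1, h2

def gm_ratio(gm: float, ro: float, r1: float, r2: float) -> float:
    """Equation (3.5): g_m1/g_m2 that nulls the frequency pushing at DC."""
    return (gm * ro * (r1 + r2) - ro) / (r2 + ro - r1 * gm * ro)

# Hypothetical BVG values (M3 transconductance and output resistance, resistors).
GM, RO, R1, R2 = 5e-3, 10e3, 100.0, 200.0
GM2 = 1e-3                                   # hypothetical transconductance of M2
GM1 = gm_ratio(GM, RO, R1, R2) * GM2         # sized according to (3.5)

h1, h2 = dc_bias_gains(GM, RO, R1, R2)
di_dvdd = GM1 * (1.0 - h1) + GM2 * (1.0 - h2)   # equation (3.2) at DC

print(f"dV1/dVDD = {h1:.4f}, dV2/dVDD = {h2:.4f}")
print(f"gm1/gm2  = {GM1 / GM2:.4f}")
print(f"dI_DCO/dVDD at DC = {di_dvdd:.3e} S")
```

With these example values, $\Delta V_1/\Delta VDD$ is below one and $\Delta V_2/\Delta VDD$ is above one, so the two terms of (3.2) have opposite signs and cancel when the ratio of (3.5) is applied.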

#### **Determination of Design Parameters: Phase Noise**

The resistances $R_1$ and $R_1+R_2$ determine where the peak current of each CS is located [48]. The simulated result of R variations in the circuit of figure 3.6 is shown in figure 3.10. A higher resistance value leads to a higher IR drop across the resistor, which makes the peak current location appear at a lower input current $I_{IN}$. The corresponding DC bias voltages $V_1$ and $V_2$ then determine the sizes of M<sub>1</sub> and M<sub>2</sub> according to (3.5).



Figure 3.10: Nagata CS dependency with R variations.



Figure 3.11: Phase noise at 1 MHz offset frequency with a change in  $I_{\text{BIAS}}$ .

Even at the same DC bias voltage of  $V_1$  and  $V_2$ , the phase noise contribution varies depending on the bias current  $I_{BIAS}$ . Figure 3.11 shows the phase noise at the 1 MHz offset frequency with the change in  $I_{BIAS}$ , while the bias voltages  $V_1$  and  $V_2$  are kept constant. The x-axis represents the current of  $I_{BIAS}$  normalized by the designed value of about 1.04 mA, which is 36.5% of the current flowing to the RO at a 2 GHz operation. The phase noise contribution of the BVG is assessed from the difference from the dotted line representing -10 dB/dec. When  $I_{BIAS}$  is low, the phase noise is improved in proportion to the increase of  $I_{BIAS}$ . However, as  $I_{BIAS}$  increases, the phase noise improvement diminishes. These observations indicate that the dominant phase noise contribution is relatively transferred from the BVG to the two CSs and the RO. Therefore, the adequate current value of  $I_{BIAS}$  must be determined comprehensively considering a power budget, the desired jitter constraint, and a loop bandwidth.

#### **Determination of Design Parameters: SNC-Operating Bandwidth**

The capacitances $C_1$ and $C_2$ are the key parameters for determining the SNC-operating band. They comprise the gate-source capacitance and junction capacitance of the CSs and the decoupling capacitance between the bias node and the supply. In this design, the sizes of the two CSs and the number of decoupling capacitors are the same, so that the total capacitances seen at the two nodes, $C_1$ and $C_2$, are approximately identical and equal to $C_{\text{BIAS}}$ with symmetric routing and placement. The dominant pole of the transfer function of (3.2) with a change of $C_{\text{BIAS}}$ is shown in figure 3.12. It shows that the larger $C_{\text{BIAS}}$ is, the lower the dominant pole frequency becomes. Since it is assumed that the analog supply assigned to this AD-PLL is filtered by a 1st-order RC filter with a cutoff frequency of 1.59 MHz, as in the DDR4 RCD case [4], the dominant pole is designed to be located above this cutoff frequency.
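The trend of the pole frequency with $C_{\text{BIAS}}$ can be sketched from the denominator of (3.3) with $C_1 = C_2 = C_{\text{BIAS}}$. This is a simplified view: it ignores the additional pole introduced by (3.4), and the component values below are hypothetical rather than the design values behind figure 3.12.

```python
import cmath
import math

def bvg_poles_hz(c_bias: float, ro: float, r1: float, r2: float):
    """Pole frequencies |s|/(2*pi) of the denominator of (3.3) with C1 = C2 = C_BIAS."""
    d = r1 * r2 * ro * c_bias * c_bias
    n = (c_bias * ro * (r1 + r2) + c_bias * r1 * (ro + r2)) / d   # N in (3.3)
    k = (r1 + r2 + ro) / d                                        # constant term
    disc = cmath.sqrt(n * n - 4.0 * k)
    roots = ((-n + disc) / 2.0, (-n - disc) / 2.0)
    return sorted(abs(s) / (2.0 * math.pi) for s in roots)

RO, R1, R2 = 10e3, 100.0, 200.0   # hypothetical values
F_SUPPLY_FILTER = 1.59e6          # cutoff of the assumed 1st-order supply RC filter

dominant = {}
for c_bias in (5e-12, 10e-12, 20e-12, 40e-12):   # hypothetical C_BIAS sweep
    dominant[c_bias] = bvg_poles_hz(c_bias, RO, R1, R2)[0]
    print(f"C_BIAS = {c_bias * 1e12:4.0f} pF: dominant pole at "
          f"{dominant[c_bias] / 1e6:6.1f} MHz, "
          f"above filter cutoff: {dominant[c_bias] > F_SUPPLY_FILTER}")
```

Because both coefficients of the denominator scale with $1/C_{\text{BIAS}}$ and $1/C_{\text{BIAS}}^2$, the pole frequencies in this simplified model are inversely proportional to $C_{\text{BIAS}}$.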



Figure 3.12: Location of dominant pole frequency of (3.2) with variations of  $C_{\text{BIAS}}$ .



Figure 3.13: (a) Magnitude and (b) phase of the transfer function of PSNA<sub>DCO</sub>.

#### **PSNA**

One of the performance indicators for suppressing a supply-induced noise is the PSNA. The PSNA of the DCO is expressed as

$$PSNA_{DCO} = \frac{\Delta I_{DCO}}{\Delta I_{DCO,conv}} \tag{3.6}$$

where $\Delta I_{DCO,conv}$ is the change in the current flowing into an oscillator that is proportional to the supply fluctuation in the conventional control group. The transconductance of the conventional control group is set as $2 \cdot I_{RO}/V_{OV}$, which corresponds to the same operating point as that of the SNC-DCO, and it is assumed that the change of the overdrive voltage is proportional to the supply variations. Figure 3.13 describes the PSNA<sub>DCO</sub> and shows a +20 dB/dec noise suppression characteristic in the frequency band of interest up to 10 MHz.
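The +20 dB/dec behavior follows from (3.2) to (3.6) once the DC term is nulled by (3.5). The sketch below evaluates these expressions over frequency under stated assumptions: the BVG values and $V_{OV}$ are hypothetical, $I_{RO}$ is taken as about 2.85 mA (inferred from the 1.04 mA bias current being 36.5% of the RO current), and the two CS transconductances are assumed to sum to the conventional transconductance.

```python
import math

def bias_transfer(s, gm, ro, r1, r2, c1, c2):
    """Equations (3.3) and (3.4): returns (dV1/dVDD, dV2/dVDD) at complex frequency s."""
    d = r1 * r2 * ro * c1 * c2
    n = (c2 * ro * (r1 + r2) + c1 * r1 * (ro + r2)) / d
    num = s * s + (n + gm / c2) * s + (r1 + r2) * (1.0 + gm * ro) / d
    den = s * s + n * s + (r1 + r2 + ro) / d
    h2 = num / den
    div = r1 + r2 + s * c1 * r1 * r2
    h1 = r1 / div * h2 + s * c1 * r1 * r2 / div
    return h1, h2

# Hypothetical BVG values, not the prototype design values.
BVG = dict(gm=5e-3, ro=10e3, r1=100.0, r2=200.0, c1=20e-12, c2=20e-12)
RATIO = ((BVG["gm"] * BVG["ro"] * (BVG["r1"] + BVG["r2"]) - BVG["ro"])
         / (BVG["r2"] + BVG["ro"] - BVG["r1"] * BVG["gm"] * BVG["ro"]))  # (3.5)
GM_CONV = 2.0 * 2.85e-3 / 0.2      # 2*I_RO/V_OV with an assumed V_OV of 0.2 V
GM2 = GM_CONV / (1.0 + RATIO)      # assumption: gm1 + gm2 = gm_conv
GM1 = RATIO * GM2

def psna_db(f_hz: float) -> float:
    """20*log10|PSNA_DCO| from (3.2) and (3.6), with dI_conv = gm_conv * dVDD."""
    s = 2j * math.pi * f_hz
    h1, h2 = bias_transfer(s, **BVG)
    di = GM1 * (1.0 - h1) + GM2 * (1.0 - h2)
    return 20.0 * math.log10(abs(di / GM_CONV))

for f in (1e4, 1e5, 1e6, 1e7):
    print(f"f = {f:8.0f} Hz: PSNA_DCO = {psna_db(f):7.1f} dB")
slope_db_per_dec = psna_db(1e5) - psna_db(1e4)
print(f"low-frequency slope = {slope_db_per_dec:.2f} dB/dec")
```

A more negative value means stronger suppression, so the rising magnitude reflects suppression that weakens as the noise frequency increases. The absolute levels depend entirely on the assumed values and should not be compared with the measured PSNA.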

#### **3.3.4** Evaluation on Suitability in RCD application

The proposed SNC technique has a variety of advantages. First of all, it does not require an additional feedback system for the SNC. Therefore, the proposed SNC-PLL is free from the stability issue, and the SNC-operating bandwidth is not limited by the frequency of the reference clock. In other words, the proposed SNC technique can be adopted regardless of the frequency of the reference clock, and the same SNC performance can be maintained no matter what the PLL loop bandwidth is, even in different configurations such as DDR5-3200, DDR5-4800, and so forth. The second advantage is that the SNC performance is not affected by the quality of the reference clock, since it is an open-loop calibration conducted in the RO itself. In addition, it is able to detect the supply fluctuations directly through the self-biased BVG, whereas the background calibrations [42, 43] detect the supply variation indirectly through the integral code in the DLF as discussed in section 3.2.2. Therefore, an incorrect calibration caused by a large ripple in the integral code for reasons other than the supply noise is completely prevented.

However, the SNC-operating condition cannot remain exactly the same over process and temperature variations. Figure 3.14(a) shows the current flowing into the RO, $I_{DCO}$, with the skewed corners (FS and SF) and the nominal corner (NN). The respective optimum bias points of the different corners are located where the derivative is equal to 0, and they slightly deviate from the nominal voltage of 1.1 V. This is because the peak current location of $I_2$ moves back and forth with process variations, and the desired transconductance ratio from (3.5) changes. For example, in the case of the SF corner with all design parameters of the NN corner, the output resistance of M<sub>3</sub> in the BVG, $r_o$, has a relatively lower value, so that the bias voltage $V_2$ is set to a relatively high voltage corresponding to $(R_1+R_2) \cdot I_{BIAS}$. According to figure 3.10, this places the peak current of $I_2$ at a lower supply, and the desired transconductance ratio of (3.5) decreases. Accordingly, the optimum bias region of the SF corner with the given design parameters is positioned at a lower supply than that of the NN corner. The respective $I_1$ and $I_2$ with the skewed corner variations are shown in figure 3.14(b) and (c). At a given supply voltage, the variations of the two currents compensate each other, which alleviates the total current variation. Figure 3.15 shows the frequency of the free-running DCO with corner variations, normalized to 2.0 GHz, since the average frequency is locked at the target frequency by the PLL. Similar to figure 3.14, in the FF and SF cases, the flat region moves toward a lower supply than the nominal voltage, while it moves toward a higher supply in the FS and SS cases. Even with corner variations, the worst FP at the SS corner is simulated as 595 MHz/V, which is about a 15 dB improvement compared to the FP of 3300 MHz/V in the SNC-free DCO. The SNC-DCO is also robust against temperature variations [47, 48]. Figure 3.16 shows the frequency of the free-running DCO with temperature variations, which is also normalized to 2.0 GHz for the same reason. The FP with moderate temperature variations from 20 °C to 75 °C is lower than 450 MHz/V, which is a 17.3 dB improvement.

Figure 3.14: Currents flowing into the RO with a skewed corner.

Figure 3.15: Frequency of the SNC-DCO with process corner variations.

Figure 3.16: Frequency of the SNC-DCO with temperature variations.
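The improvement figures quoted for the corner and temperature simulations are consistent with expressing the FP ratio as $20\log_{10}$ of the SNC-free FP over the SNC FP. The short sketch below reproduces that arithmetic; the helper name `fp_improvement_db` is introduced here for illustration.

```python
import math

def fp_improvement_db(fp_reference: float, fp_snc: float) -> float:
    """FP improvement in dB, taken as 20*log10 of the FP ratio (both in MHz/V)."""
    return 20.0 * math.log10(fp_reference / fp_snc)

cases = {
    "worst corner (SS), 595 MHz/V vs 3300 MHz/V": (3300.0, 595.0),
    "20 C to 75 C, 450 MHz/V vs 3300 MHz/V": (3300.0, 450.0),
    "nominal step response (section 3.3.2), 20 MHz/V vs 3800 MHz/V": (3800.0, 20.0),
}
improvements = {name: fp_improvement_db(*fps) for name, fps in cases.items()}
for name, db in improvements.items():
    print(f"{name}: {db:.1f} dB")
```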



Figure 3.17: (a) Start-up simulation and (b) time trend of  $V_2$  with a supply voltage ramp of 100 ns.

Lastly, the proposed technique does not need a start-up circuit and does not deteriorate the stabilization time. Of course, it needs additional time for the bias voltages $V_1$ and $V_2$ to settle after the stable supply is provided. Figure 3.17(a) shows the start-up simulation of the SNC-DCO, and figure 3.17(b) shows the time trend of the bias voltage $V_2$ with a supply voltage ramp time of 100 ns. The SNC-DCO starts to oscillate at about 50 ns and requires an additional time of about 40 ns for the bias voltages to fulfill their SNC function.



Figure 3.18: Overall architecture of AD-PLL with SNC-DCO for RCD application.

# **3.4** Circuit Implementation

#### 3.4.1 Overall Architecture

The overall architecture of the proposed AD-PLL is shown in figure 3.18. Frequency and phase errors, $F_{err}$ and $P_{err}$, are directly forwarded into the DCO in a similar way as [49, 50] and are deserialized for use in the integral path. Since both the direct-proportional and integral paths control in parallel the number of activated unit FTCs, the ratio of the two path gains is maintained over PVT variations. The integral code, $C_{\rm I}$, is also fed into the lock detector (LD), which declares the locked state when $C_{\rm I}$ stays within a certain range for a sufficiently long period. A band-selecting signal, $S_{\text{BAND}}$, controls the strength of the RO and the number of CSs that are always activated to support a wide frequency band. For fast stabilization, the A-FTL using a counter-based frequency detector compensates for a large frequency difference via a large integral gain of $\alpha_2$. The TDC-PFD uses three finely spaced TDC for a better jitter performance and four coarsely spaced TDC with non-linear gain for fast locking. The detailed circuit descriptions of the TDC-PFD and the A-FTL are given in sections 3.4.2 and 3.4.3, respectively. The power domains are divided into VDD<sub>DCO</sub>, VDD<sub>DIG</sub>, and VDD<sub>ANA</sub>. A test noise is applied to the VDD<sub>DCO</sub> domain to evaluate the performance of the proposed SNC technique.

### 3.4.2 Circuit Description of Building Blocks: TDC-PFD

Unlike a PLL used as a frequency synthesizer, a PLL used as a clock buffer has a noisy reference clock whose frequency is the same as that of the operating clock. To achieve a high PLL loop bandwidth and secure a stable operation, the frequency and phase errors, $F_{\rm err}$ and $P_{\rm err}$, should be propagated into the loop filter within one period of the reference clock. However, due to the high reference clock frequency, timing constraints over PVT variations in the TDC-PFD must be taken into account [51].



Figure 3.19: Block diagram of the TDC-PFD.

Figure 3.19 shows the block diagram of the TDC-PFD. Five steps of Vernier TDC [9] are utilized, where three narrow-spaced steps are used for the jitter reduction at the locked state, and two broad-spaced steps are used for the fast locking. The dead-zone (DZ) PFD produces the $F_{\rm err}$ signals for a large PFD gain when it detects a frequency difference. To prevent a metastability problem between the TDC-PFD and the DLF, the clock $S_{\rm CDC}$ for clock domain crossing (CDC) is adopted. The detailed block diagram with each propagation delay is shown in figure 3.20. The critical paths that limit the timing budget are determined by $S_{\text{UP}}$, $S_{\text{REF}}$, and $S_{\text{CDC}}$. Considering the timing margin for a stable operation as shown in figure 3.21, the following two inequalities must be satisfied.

$$t_{\rm UP} + t_{\rm c2q} + t_{\rm setup} < t_{\rm CDC} \tag{3.7}$$

$$t_{\rm CK} + t_{\rm c2q} + t_{\rm hold} > t_{\rm CDC} \tag{3.8}$$



where $t_{\rm UP}$ and $t_{\rm CDC}$ are the propagation delays from $S_{\rm REF}$ to $S_{\rm UP}$ and from $S_{\rm REF}$ to $S_{\rm CDC}$, respectively, $t_{\rm c2q}$, $t_{\rm setup}$, and $t_{\rm hold}$ are the clock-to-q delay, setup time, and hold time of the D flip-flop, and $t_{\rm CK}$ is the period of the reference clock.

Figure 3.20: Detailed block diagram of the TDC-PFD with each timing information.
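The setup and hold constraints (3.7) and (3.8) bound the allowed CDC clock delay from both sides. The sketch below computes that window and checks a candidate $t_{\rm CDC}$; the delay values are hypothetical numbers in picoseconds, chosen only so that the reference period roughly matches a 3 GHz clock.

```python
def cdc_delay_window(t_up: int, t_c2q: int, t_setup: int, t_hold: int, t_ck: int):
    """Open interval (lower, upper) for t_CDC implied by (3.7) and (3.8)."""
    lower = t_up + t_c2q + t_setup    # (3.7): t_UP + t_c2q + t_setup < t_CDC
    upper = t_ck + t_c2q + t_hold     # (3.8): t_CK + t_c2q + t_hold > t_CDC
    return lower, upper

def cdc_timing_ok(t_cdc: int, **delays: int) -> bool:
    """True when t_CDC satisfies both inequalities."""
    lower, upper = cdc_delay_window(**delays)
    return lower < t_cdc < upper

# Hypothetical delays in ps (not taken from the prototype).
DELAYS = dict(t_up=120, t_c2q=40, t_setup=20, t_hold=15, t_ck=333)

window = cdc_delay_window(**DELAYS)
print(f"t_CDC must lie in ({window[0]} ps, {window[1]} ps)")
for t_cdc in (150, 250, 400):
    print(f"t_CDC = {t_cdc} ps -> ok: {cdc_timing_ok(t_cdc, **DELAYS)}")
```

In practice the window would need to hold at every PVT corner, so the check would be repeated with corner-specific delays.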



Figure 3.21: Signal decision time and timing margin for stable operation.

Figure 3.22 describes the gain of the TDC-PFD. The optimum point of  $\tau_1$  [52] is determined by the Lloyd-Max algorithm [53] and the probability density function of  $P_{\rm err}$ . Here,  $\tau_1$  is decided as the average value of RMS jitter of  $S_{\rm REF}$  to satisfy the given DDR5 RCD specification [5].  $\tau_2$  not only helps fast phase tracking but also increases a loop bandwidth when a large phase difference is detected.  $\tau_{\rm UP}$  and  $\tau_{\rm DN}$  are determined by the propagation delays of figure 3.20 and expressed as

$$\tau_{\rm UP} = t_{\rm s} + t_{\rm UP} - (t_{\rm c2q} + t_{\rm reset} + t_{\rm r2q} + t_{\rm hold}) \tag{3.9}$$

$$\tau_{\rm DN} = t_{\rm s} + t_{\rm c2q} + t_{\rm setup} \tag{3.10}$$



where $t_s$ is the propagation delay of the slow buffer, $t_{reset}$ is the reset AND delay in the DZ-PFD, and $t_{r2q}$ is the reset-to-q delay of the D flip-flop. For a symmetric locking transient behavior, $\tau_{UP} = \tau_{DN}$ should be satisfied. The Monte Carlo simulation of the narrow-spaced steps of $\tau_1$ is shown in figure 3.23. According to the simulation results, the mean value of $\tau_1$ is about $\pm 6$ ps, and its standard deviation is 1.45 ps.

Figure 3.22: Gain curve of TDC-PFD.



Figure 3.23: Monte carlo simulation of first steps in TDC-PFD.

### 3.4.3 Circuit Description of Building Blocks: A-FTL

To satisfy the required stabilization time of 3.5  $\mu$ s in the entire operating frequency band, the A-FTL is incorporated for the fast locking to its target frequency. The TDC-PFD explained in section 3.4.2 is also able to produce the frequency error  $F_{err}$ , when their phase difference is greater than  $\tau_{UP}$  or less than  $-\tau_{DN}$  from the phase-locked point. However, the TDC-PFD has a limited frequency acquisition gain, because the frequency error  $F_{err}$  is not proportional to the frequency difference. Therefore, when there is a large frequency difference such as an initial locking case, the A-FTL is required to attain the corresponding large frequency gain.



Figure 3.24: Block diagram of A-FTL.

Figure 3.24 shows the block diagram of the A-FTL. The A-FTL using a counter-based frequency detector examines the average frequency difference between $S_{\text{REF}}$ and $S_{\text{CLK,DIG}}$ by comparing the counted numbers of each clock edge. When the difference of the counted numbers exceeds two, the enable signal $S_{\text{EN,FTL}}$ is activated and the A-FTL applies an integral gain of $\alpha_2$ only once, while the original low gain of $\alpha_1$ is continuously applied otherwise. Since the A-FTL operation is non-linear, when $S_{\text{EN,FTL}}$ is activated, the reset signal $S_{\text{RST}}$ must be held at logical high until the corresponding frequency correction by $\alpha_2$ takes effect in order to prevent overcompensation. $\alpha_2$ is determined by $K_{\text{DCO}}$, the counted number of $S_{\text{CLK,DIG}}$, and the reference frequency. Since the A-FTL compares the average clock frequencies, robust frequency acquisition is possible regardless of instantaneous jitter. Furthermore, the A-FTL is implemented with digital synthesis.
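The counting mechanism can be illustrated with a behavioral model. The sketch below is not the synthesized A-FTL: the correction rule (a step of at most `threshold * f_ref / n_ref`, which is a lower bound on the actual frequency error when the counters differ by more than the threshold), the counter length `n_max`, and all numerical values are assumptions made for illustration.

```python
import math

def aftl_acquire(f_ref, f_dco, k_dco, threshold=2, n_max=1024, max_updates=50):
    """Behavioral sketch of a counter-based auxiliary frequency-tracking loop.

    Returns the DCO frequency after every correction (first entry is the start value).
    """
    trace = [f_dco]
    for _ in range(max_updates):
        fired = False
        for n_ref in range(1, n_max + 1):
            # DCO edges counted during n_ref reference cycles
            n_dco = math.floor(n_ref * f_dco / f_ref)
            if abs(n_dco - n_ref) > threshold:   # enable signal asserted
                fired = True
                break
        if not fired:        # counter saturated: leave the rest to the phase loop
            break
        # The true error exceeds threshold * f_ref / n_ref, so a step of this
        # size (rounded down to whole DCO codes) does not overshoot.
        step_codes = int(threshold * f_ref / n_ref / k_dco)
        if step_codes == 0:
            break
        sign = 1 if n_dco < n_ref else -1
        f_dco += sign * step_codes * k_dco       # one-shot correction, then reset
        trace.append(f_dco)
    return trace

# Hypothetical numbers: 2 GHz reference, DCO starting 400 MHz low, 2 MHz per code.
F_REF = 2.0e9
trace = aftl_acquire(F_REF, 1.6e9, 2e6)
for step, f in enumerate(trace):
    print(f"update {step}: f_DCO = {f / 1e9:.4f} GHz, error = {(f - F_REF) / 1e6:+.1f} MHz")
```

In this model the count window grows as the error shrinks, so each later correction takes longer to trigger and is smaller, and the loop stops correcting once the counter saturates without reaching the threshold.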



Figure 3.25: Simulated frequency trend with A-FTL.

The simulated frequency trend of  $S_{\text{OUT}}$  is shown in figure 3.25. The large frequency acquisition gain is obtained at the initial locking. As the frequency difference becomes low, it takes a longer time for  $S_{\text{EN,FTL}}$  to be activated and the corresponding frequency correction is also reduced. If the frequency difference is low enough, the counted number of  $S_{\text{CLK,DIG}}$  is saturated and the SNC-AD-PLL finishes the phase tracking with the TDC-PFD by  $\alpha_1$ .



# 3.5 Measurement Results

Figure 3.26: (a) Die photograph and (b) power breakdown of each supply domain.



Figure 3.27: Measurement setup

The proposed AD-PLL is fabricated in the 28-nm CMOS process with a supply voltage of 1.1 V and occupies an active area of 0.06 mm<sup>2</sup>. The micrograph of the chip is shown in figure 3.26(a), and figure 3.26(b) shows the sub-block lists and the power breakdown of each power supply domain. Unlike a conventional frequency synthesizer, all sub-circuits in the VDD<sub>ANA</sub> domain as well as the proportional path operate at full rate. Accordingly, the power consumption in the VDD<sub>ANA</sub> domain of 3.10 mW at a 3 GHz operation is particularly high. The measurement setup is shown in figure 3.27. To evaluate the SNC performance, the voltage noise from the waveform generator, $S_{\Delta V,NOISE}$, is fed into the VDD<sub>DCO</sub> domain through an AC-coupling capacitor of 100 $\mu$F. The signal type of $S_{\Delta V,NOISE}$ is either a single-tone sinusoidal noise or a random noise. The vector signal generator provides the high-frequency input reference clock, and the balun performs the single-ended-to-differential conversion. The bias tees adjust the DC offset voltages.

### 3.5.1 Open-Loop SNC Performance



Figure 3.28: Measured static SNC characteristic of the free-running DCO and the corresponding FP with different voltage variations.

The measured static SNC characteristic of the free-running DCO and the average FP of three prototype chips are shown in figure 3.28. The frequency of the SNC-free DCO shows a linear dependency on the voltage variation. On the other hand, in the SNC-DCO, a low FP of 60.8 MHz/V at a $\pm 20$ mV change is achieved with small chip-to-chip variations, which is about 55 times lower than that of the SNC-free DCO. The measurement results in figure 3.28 are fairly consistent with the simulated results in section 3.3. The spectra of the free-running DCO in figure 3.29(a) display the SNC effects with a sinusoidal noise of 1 MHz, 100 mV<sub>pp</sub>. In general, when a supply noise is applied, the frequency drifts as in frequency modulation (FM) in accordance with the applied noise, so that the spectrum changes from a skirt shape into a wideband FM shape. However, with the SNC assistance, the frequency drift is markedly mitigated as shown in figure 3.29(b). The noise suppression occurs in all measured bands, but the higher the noise frequency is, the weaker the suppression becomes. This result coincides with the analysis of the PSNA<sub>DCO</sub> in section 3.3.

Figure 3.29: (a) Measured spectra of the free-running DCO with a sinusoidal noise of 1 MHz, 100 mV<sub>pp</sub> and (b) the amount of frequency deviation.



### 3.5.2 Closed-Loop SNC Performance

Figure 3.30: Measured spectra of the AD-PLL with a sinusoidal noise of 1 MHz, 25  $mV_{pp}$  and the derived PSNA.



Figure 3.31: Measured PSNA results of the SNC-PLL.

Figure 3.30 shows the spectra of the AD-PLL with and without the SNC technique, each measured with and without a sinusoidal noise of 1 MHz, 25  $mV_{pp}$ . In the SNC-free PLL, the residual noise that is not eliminated by the PLL loop increases the in-band noise. On the contrary, in the SNC-PLL, the fundamental noise as well as the N-th harmonic noises are greatly attenuated, and the supply noises are further suppressed by the SNC technique. The amount of attenuation provided by the SNC technique gives the PSNA, as shown in figure 3.31. The SNC-PLL achieves the highest PSNA of 40 dB at 300 kHz and retains over 35 dB up to 3 MHz.



Figure 3.32: Measured RMS jitter with (a) sinusoidal and (b) random noises.



Figure 3.33: Measured phase noise plots with a sinusoidal noise of 200 kHz, 50 mV<sub>pp</sub>.



Figure 3.34: Measured phase noise plots with random noise of 40  $mV_{pp}$ .



Figure 3.35: Measured phase noise plots without noise.

When a sinusoidal noise and a random noise are applied, the integrated RMS jitter for each case is plotted in figure 3.32. When a sinusoidal noise is applied, the SNC-PLL achieves a relatively constant RMS jitter below 1 MHz, whereas the SNC-free PLL fails to lock above 500 kHz with 100 mV<sub>pp</sub> amplitude. In the case of random noise, on average, the achieved RMS jitter of the SNC-PLL is 65 % lower than that of the SNC-free PLL. When a sinusoidal noise of 200 kHz, 50 mV<sub>pp</sub> is applied, the measured phase noise with and without the SNC technique is shown in figure 3.33. Even though they have the same loop bandwidth, the in-band phase noise considerably deteriorates in the SNC-free PLL. When a random noise of 40 mV<sub>pp</sub> is applied, the measured phase noise comparison plot with and without the SNC technique is shown in figure 3.34. As discussed in section 2.1.3, the phase noise of the SNC-free PLL is aggravated by the random noise. However, the phase noise of the SNC-PLL is improved with the additional noise suppression owing to the SNC technique. Under the noise-free condition, the integrated RMS jitter of 271 fs is achieved from 100 kHz to 500 MHz at 3 GHz operation as shown in figure 3.35.



### 3.5.3 Fast Stabilization Performance

Figure 3.36: Measured convergent time of clock frequency with and without the A-FTL.



Figure 3.37: Measured stabilization time with different configurations.

The measured convergent time of clock frequency with and without the A-FTL is shown in figure 3.36. With the starting point of the clock frequency set to the minimum value of 290 MHz, the entire system reset is released. When the frequency difference between the reference clock and the DCO clock is large in the beginning, the PLL shows a non-linear staircase locking behavior with the A-FTL assistance, as explained in figure 3.25 of section 3.4. When the clock frequency goes out of the frequency acquisition range of the A-FTL, the TDC-PFD completes the phase tracking. Thus, the proposed PLL with the A-FTL achieves a fast stabilization time regardless of the target frequency as shown in figure 3.37, whereas the lock time of the PLL without the A-FTL increases linearly as the target frequency moves farther from the starting frequency.

### 3.5.4 Performance Comparison

Table 3.1 summarizes the chip performance. The proposed SNC-PLL achieves the highest PSNA of 40 dB, the lowest integrated jitter of 271 fs, and the best FoM<sub>JITTER</sub> of -240.5 dB in comparison with the state-of-the-art RO-based PLLs employing SNC techniques. Figure 3.38 shows the measured integrated jitter and the power consumption with the FoM<sub>JITTER</sub> in the other DDR5 configurations. The SNC-PLL retains a FoM<sub>JITTER</sub> below -237 dB from 1.8 GHz to 3.2 GHz operation.



Figure 3.38: Measured jitter, power consumption, and  $FoM_{JITTER}$  with different configurations.

Table 3.1: Performance summary and comparison.

|                          | This work                    | [54]<br>ISSCC'14          | [55]<br>ISSCC'14         | [42]<br>ISSCC'16            | [56]<br>VLSI'17               | [44]<br>JSSC'18            |
|--------------------------|------------------------------|---------------------------|--------------------------|-----------------------------|-------------------------------|----------------------------|
| Technology (nm)          | 28                           | 20                        | 40                       | 40                          | 65                            | 65                         |
|                          | 3000                         | 25                        | 26                       | 200                         | 50                            | 49.15                      |
| Power (mW)               | 12.1                         | 3.1                       | 6.4                      | 2.9                         | 2.7                           | 5.86                       |
|                          | 1.1                          | 0.9                       | 1.1                      | 1.1                         | 1.0                           | 0.94                       |
| Output frequency (GHz)   | 3.0                          | 1.6                       | 2.418                    | 3.2                         | 3.2                           | 2.36                       |
| SNC technique            | Self-biased                  | Self-biased               | Background               | Background                  | Background                    | Feed-<br>forward           |
| Integrated RMS jitter    | 0.27                         | 5830                      | N/A                      | 3.54                        | 7500                          | 0.63                       |
| (integration range, MHz) | (0.1-500)                    | (0.02-40)                 | (0.01-40)                | (N/A)                       | (0.001-40)                    | (0.001-100)                |
|                          | 0.81\*\*                     | 5890                      | 3.29                     | 3.85                        | 7800                          | N/A                        |
|                          | (100k, 100mV<sub>pp</sub>)   | (5M, 50mV<sub>pp</sub>)   | (1M, 1mV<sub>pp</sub>)   | (100k, 50mV<sub>pp</sub>)   | (white, 20mV<sub>rms</sub>)   |                            |
| PSNA (dB)                | 40\*\*                       |                           | 28                       | 10                          | 32                            | 19.5                       |
|                          | -13 → -53                    | N/A                       | -17 → -45                | -18 → -28                   | -31 → -63                     | N/A                        |
|                          | (300k, 25mV<sub>pp</sub>)    |                           | (1M, 1mV<sub>pp</sub>)   | (100k, 50mV<sub>pp</sub>)   | (500k, 20mV<sub>pp</sub>)     | (100k, 1mV<sub>pp</sub>)   |
| FoM<sub>JITTER</sub>\* (dB) | -240.5                    | -219.8                    | -221.6                   | -224.4                      | -218.1                        | -236.3                     |
|                          | 0.059                        | 0.012                     | 0.013                    | 0.0216                      | 0.047                         | 0.022                      |

\*  $\mathrm{FoM}_{\mathrm{JITTER}} = 10 \cdot \log \left[ \left( \frac{\sigma_{RMS}}{1\,\mathrm{s}} \right)^2 \cdot \left( \frac{Power}{1\,\mathrm{mW}} \right) \right]$

\*\* Measured at 2.0 GHz output clock

## 3.6 Summary

This chapter presents the AD-PLL for the DDR5 RCD application with a self-biased SNC-DCO. By combining two current sources that have an opposite dependency on supply variations, the proposed SNC-PLL improves a static voltage margin and offers robustness against supply fluctuation. The proposed SNC technique satisfies the prerequisites for the RCD application. With an open-loop SNC calibration operating independently of the PLL loop bandwidth, the SNC performance is maintained regardless of the quality of the input reference clock and its configurations. In addition, the SNC technique does not require a start-up circuit and does not deteriorate the stabilization time for the RCD. Even though the SNC performance varies with process and temperature, an FP improvement of about 15 dB at the worst case of the SS corner and about 17 dB over the 20 °C to 75 °C variation is still obtained. Furthermore, quantitative analyses of the proposed SNC technique such as the operating band, the amount of noise suppression, and the phase noise contribution due to the additional SNC function are addressed. The measured open-loop FP with a  $\pm 20$  mV voltage variation is 60.83 MHz/V, while the PSNA of 40 dB is achieved with a sinusoidal supply noise of 300 kHz and 25 mV<sub>pp</sub>. In addition, a stabilization time below 700 ns is achieved in all configurations with the A-FTL assistance.

# **Chapter 4**

# **ILCM with Multi-Phase-Based Calibration**

## 4.1 Overview

This chapter presents a RO-based ILCM with a new background calibration technique that utilizes the multi-phase generation capability of the RO [57]. By detecting phase changes before and after the injection pulse, both a frequency error and an injection path offset are calibrated. The frequency calibrator operates at the injection rate with high bandwidth, which contributes to further suppressing the flicker noise of the RO and producing a much lower RMS jitter. The path offset calibrator operating at the pulse-gating rate with a low bandwidth makes the ILCM converge to the state with a minimum reference spur. To reliably detect the frequency error caused by the injection path offset in the steady state, a narrow-range delay-locked loop is additionally adopted. For a low-power implementation, a sub-sampling bang-bang phase detector is employed for each calibration loop, and all calibration loops operate at the reference clock rate. The proposed multi-phase-based calibration takes advantage of the distinct and intrinsic feature of multi-phase generation in the RO and eliminates the trade-off between the achievable effective injection strength and the power consumption. In addition, through continuous background calibration, the ILCM maintains its jitter and reference spur performance against PVT variations.

Fabricated in 28-nm CMOS, the proposed ILCM achieves an integrated RMS jitter of 143.6 fs from 10 kHz to 40 MHz with a reference spur of -77.9 dBc. The ILCM consumes 9.4 mW at the 4.8-GHz operation, which translates to a FoM<sub>JITTER</sub> of -247.1 dB.
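The quoted figure of merit can be cross-checked against the FoM<sub>JITTER</sub> definition given under Table 3.1. The short sketch below is only a numerical check of that formula; the function name `fom_jitter_db` is chosen here for illustration and does not appear in the original work.

```python
import math

def fom_jitter_db(rms_jitter_s: float, power_w: float) -> float:
    """FoM_JITTER = 10*log10[(sigma_RMS / 1 s)^2 * (Power / 1 mW)]."""
    power_mw = power_w * 1e3
    return 10.0 * math.log10(rms_jitter_s ** 2 * power_mw)

# Values quoted in the text: 143.6 fs RMS jitter, 9.4 mW at 4.8 GHz.
fom = fom_jitter_db(143.6e-15, 9.4e-3)
print(f"FoM_JITTER = {fom:.1f} dB")
```

With the quoted jitter and power, the formula reproduces the stated value of about -247.1 dB.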

The rest of this chapter is organized as follows. Section 4.2 explains why the two-point calibration is indispensable for successful injection locking and discusses the advantages and disadvantages of prior works regarding injection-locking calibration. Section 4.3 focuses on the proposed multi-phase-based calibration structure, and the detailed calibration methodology is addressed with timing diagrams in specific cases. The overall architecture and the circuit implementation of building blocks are described in section 4.4. The measurement results from the prototype chip are presented in section 4.5. Finally, the key contributions of this chapter are summarized in section 4.6.

## 4.2 Design Motivation

### 4.2.1 Needs for Two-Point Calibration in Injection-Locking



Figure 4.1: Ideal injection-locked state.

A phase-realignment mechanism of a RO-based ILCM offers strong suppression of the RO-induced noise due to its high bandwidth nature, as explained in section 2.2. To make the most of the injection effects, the free-running frequency of the RO must be precisely tuned to the target frequency [17]. Figure 4.1 shows the ideal injection-locked state in the example case of a multiplication factor (N) of 2. The injection pulse signal,  $S_{INJ}$ , performs the phase realignment on the oscillating phases,  $\phi_{0^\circ}$  and  $\phi_{180^\circ}$ , of  $S_{OSC}$ . If there is no frequency error (FE), that is,  $T_{REF} = N \cdot T_{ILO}$ , the frequency and the phase of  $S_{OSC}$  have no fluctuation and remain constant at their ideal values.



Figure 4.2: Injection-locked state with (a) a positive and (b) a negative FE.

On the contrary, in the case with the FE,  $S_{INJ}$  makes a sharp phase-realignment in the phases of  $S_{OSC}$  and the accumulated phase error is rectified at every injection rate, as shown in figure 4.2. The frequency of  $S_{OSC}$  deviates slightly from its average (ideal) frequency by the amount of the FE, and a large frequency deviation occurs instantaneously when the injection pulse is applied. It entails poor reference spur performance and aggravates a deterministic jitter.

However, if the FE calibrator is adopted heedlessly, the FE remains uneliminated even in the steady state. This is because there are two different phase-locked points in one chip, as shown in figure 4.3. One is where the injection pulse causes the phase realignment, and the other is the phase-locking point of the FE calibrator. Figure 4.4(a) illustrates the initial case of figure 4.3 that the FE calibrator is added in an ideal state of figure 4.1. Then, due to the path offset (PO), the output of the PD leads to locking at a lower frequency. As a result, it converges to the state with a negative FE in the steady state, as shown in figure 4.4(b). Therefore, the PO calibration logic is required to match the two different phase convergence points, as described in figure 4.5.



Figure 4.3: Block diagram of ILCM including a FE calibrator.






Figure 4.4: (a) Initial and (b) steady state of figure 4.3.



Figure 4.5: Conceptual block diagram of ILCM with two-point calibration.



Figure 4.6: (a) Conceptual block diagram of conventional pulse-gating ILCM and (b) its operation.

### 4.2.2 Prior Works Regarding Two-Point Calibration

As methodologies for the two-point (FE and PO) calibration, various background calibration techniques have been proposed [58, 59, 60, 61, 62]. Figure 4.6(a) depicts the conceptual block diagram of the conventional pulse-gating ILCM [60, 61]. The gating control block generates the signal  $S_{\text{EN,gating}}$ , which causes  $S_{\text{INJ}}$  to be gated periodically, and the FE is detected in the PD when  $S_{\text{EN,gating}}$  is activated. The PO is calibrated by controlling the propagation delay in the injection path when  $S_{\text{INJ}}$  is applied. The operation of the conventional pulse-gating ILCM is summarized in figure 4.6(b). In this way, a low-power operation is possible, but the effective injection strength is reduced when a high bandwidth of the FE calibration is required. This method even halves the achievable loop bandwidth if every other injection pulse is gated [63].

Another solution is a replica-delay-cell (RDC)-based calibration [59, 62], and its conceptual block diagram is shown in figure 4.7(a). The RDC plays a role in storing free-running frequency information, if the propagation delay of the RDC is a multiple of  $T_{OSC}$ . Then, the FE is calibrated when the injection pulse is applied. To perceive the FE precisely, the PO in the RDC is continuously calibrated when the injection pulse is not applied. The operation of the RDC-based calibration is organized in figure 4.7(b). The RDC-based calibration can achieve both the full injection strength and the high bandwidth of the FE calibration, simultaneously. However, the RDC and the PD operate at the oscillation frequency, which entails a large power consumption.



**RDC: Replica Delay Cell** 

(a)

| Control   | Cal. |
|-----------|------|
| Injection | FE   |
| \*Usual   | PO   |

\* : except for Inj.

(b)

Figure 4.7: (a) Conceptual block diagram of RDC-based ILCM and (b) its operation.

## **4.3** Proposed Multi-Phase-Based Calibration (MPC)

To adopt only the advantages of the previous structures, a multi-phase-based calibration (MPC), which utilizes the inherent capability of the RO to generate multi-phase clocks, is proposed [57]. The FE and PO calibrations are conducted by detecting the phase changes of the multi-phase clocks before and after the injection pulse. To precisely determine the FE in the steady state that corresponds to the PO, the MPC structure includes a narrow-range delay-locked loop (DLL). The operations of the three calibration loops constituting the MPC are divided into two timing categories, depending on whether the injection pulse is applied to the RO or gated. It is noted that the gating methodology of the MPC is completely opposite to that of the conventional pulse-gating method described in section 4.2.2. The pulse-gating method of the MPC is not intended to obtain the FE of the free-running oscillator, but is adopted to prevent the number of additional PDs for calibrating the circuit offsets from recursively increasing to infinity. To sum up, the proposed MPC consists of three calibration loops: the FE calibration, the PO calibration, and the narrow-range DLL. They are separated by two events that cannot happen simultaneously. One event is when the injection pulse is applied to the RO, and the other event is when the injection pulse is gated. The FE calibration and the DLL are performed when the injection pulse is applied, and the PO calibration is carried out when the injection pulse is gated. In addition, all three calibration loops operate at the reference clock rate with sub-sampling bang-bang phase detectors (SS-BBPDs). Consequently, the MPC not only brings the advantage of the RDC-based ILCM, which is the full injection effect with a high bandwidth of the FE calibration, but also achieves a low-power operation with SS-BBPDs, which is the strength of the conventional pulse-gating ILCM.

In the remainder of this section, the logical process of creating the MPC and the detailed operating principles of each calibration loop are examined.

### 4.3.1 Logical Process of Creating MPC

#### **FE Calibrator Sampling Pre-Injection Phases**

The primary purpose of the RO-based ILCM is to extend the suppression bandwidth of RO-induced noise. Accordingly, one of the prerequisites for the two-point calibration that necessarily accompanies a successful injection-locking structure is not to infringe on the maximized suppression bandwidth. The key factor is how to calibrate the frequency of the free-running RO, which changes with the surrounding conditions. As explained in section 4.2.2, the RDC [59, 62] is used for storing the frequency of the free-running RO, but this frequency information is already contained in the multi-phase clocks themselves. The multi-phase clocks after the phase realignment carry unreliable free-running frequency information, whereas the accumulated jitter due to the FE is reflected in the multi-phase clocks right before the phase realignment. Therefore, the PD of the FE calibrator is placed to sample the pre-injection phases.

#### **PO Calibrator Sampling Post-Injection Phases**

The FE calibrator raises the next question: where should the phase of the FE sampling clock be aligned? To answer this question, it is necessary to understand where and how the discrepancy due to the PO occurs. Since the remaining FE in the steady state originates from the corresponding PO, a PD detecting the remaining FE with a low operating bandwidth is additionally required. However, this approach still leaves a serious problem, because a (*k* + 1)-th PD is in turn required to calibrate the circuit offset of the *k*-th PD. Consequently, the pulse-gating methodology is adopted in the MPC to break this recursion. The pulse-gating method is a very efficient way to observe, under the same circumstances, the difference between the case where a certain event has occurred and the case where it has not. Moreover, the reduced effective injection strength, which is a fatal disadvantage of the pulse-gating method, is not a problem in the PO calibration, because the remaining FE needs to be sampled in the steady state, so that the PO calibration can operate with a sufficiently low bandwidth.

The next step is where the remaining FE is detected. The PO calibrator in the MPC identifies the phase shift from the phase-realignment mechanism in the post-injection phases because it corresponds to the remaining FE. To minimize the misguided sensing due to the phase noise of the RO, the PO calibrator samples the phases right after the injection pulse. The time it takes for the RO noise to be accumulated is only  $T_{OSC}/m$ , where m is the number of multi-phase clocks in the RO. Furthermore, it reduces the DLL coverage to secure the normal operations of the MPC.
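To put a number on the accumulation window $T_{OSC}/m$, the sketch below evaluates it for one example. It assumes $m = 8$, which is an inference from the 45° phase spacing of a 4-stage differential RO as used later in section 4.4.1, and the 4.8-GHz operating frequency quoted in section 4.1; the function name is hypothetical.

```python
def phase_spacing(f_osc_hz: float, m: int):
    """Return (spacing in degrees, spacing in seconds) between adjacent phases."""
    t_osc = 1.0 / f_osc_hz
    return 360.0 / m, t_osc / m

# Assumed example: m = 8 phases (4-stage differential RO), f_OSC = 4.8 GHz.
deg, t_step = phase_spacing(4.8e9, 8)
print(f"{deg:.0f} deg, {t_step * 1e12:.2f} ps")
```

Under these assumptions, the RO noise has only about 26 ps to accumulate before the PO calibrator samples, compared with a full reference period in the pre-injection case.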

#### **Tracking Post-Injection Phases**

Finally, to detect PO errors with high reliability, the sampling clock of the PO calibrator must be aligned with the phase-realigned post-injection phases. Only then can the phase shift (phase pushing or phase pulling)<sup>1</sup> be known at the pulse-gating event. For this reason, the sampling clock continues to track the phase-realigned post-injection phases through the DLL. Since  $S_{INJ}$  is synchronous to  $S_{REF}$  and has a phase relation of  $T_{OSC}/m$  with the post-injection phases, the dynamic range of DLL is substantially reduced.

<sup>&</sup>lt;sup>1</sup>In [57], the two phase-realignment types are called as injection pushing and injection pulling. However, to prevent a misunderstanding of the notation "injection pulling" that represents the injecting oscillator is involved but does not capture the primary oscillator, they are written as phase pushing and phase pulling in this dissertation.

### 4.3.2 Time-Domain Analysis of MPC

The PDR-based analysis of the injection-locking technique has three important characteristics. The first characteristic is that the injection strength,  $\beta$ , has a range between 0 and 1. In other words, there is no overcompensation in the injection locking. The second characteristic is that the phase shift (phase pushing and phase pulling) is proportional to the input phase difference. The third characteristic is that the same injection strength is maintained within a certain range of the input phase difference. Consequently, a linear approximation of the injection-locking phenomenon can be performed within the range where the injection pulling does not happen. In this section, based on these three characteristics, a time-domain analysis of the injection-locking is presented, and equations related to the input phase difference, the amount of phase shift, and the convergence point in the steady state are derived. Subsequently, the proposed MPC behavior is embodied in the derived equations.



Figure 4.8: Non-linear phase relationship of ILCM with negative FE in time-domain.

Figure 4.8 depicts the non-linear phase relationship of the ILCM with a negative FE in the time domain. The phase '0' value represents the convergence point of the injection pulse,  $S_{INJ}$ , which also means the reference point of the input phase difference in the PDR-based analysis. The accumulated phase noise during  $T_{REF}$  is refreshed at every injection rate (the reference clock rate in figure 4.8), and the phase shift is half of the input phase difference ( $\beta = 0.5$ ). The amount of residual phase after the phase realignment is expressed as RP, and the discrete-time variable RP[k] is defined as the value of RP sampled for each  $T_{REF}$ . The convergent residual phase (CRP) is the value of RP in the steady state.

$$RP[k] \coloneqq \lim_{t \to k \cdot T_{\text{REF}}+} \phi(t) \tag{4.1}$$

Since the phase value is discussed in the time domain, all phase values developed below are expressed in the time dimension in order to avoid confusing notation. The residual time, RT, has a one-to-one correspondence with the residual phase, RP, and is expressed as

$$RT[k] = RP[k] \cdot \frac{T_{\text{OSC}}}{2\pi} \tag{4.2}$$

Figure 4.9 shows RT[k] with the injection strength,  $\beta$ , of 0.8,  $f_{OSC}$  of 5 GHz, and the absolute frequency error,  $|f_{err}|$ , of 0.1 %. The RT[k] converges to the convergent residual time (CRT) with an index of 7, which defines a convergence index (CI). The higher  $\beta$  becomes, the faster it reaches the CRT, which is described in figure 4.10. The initial value of RT[1] is 10 ps and other conditions are the same as figure 4.9.



Figure 4.9: The residual time RT[k] with frequency errors of  $\pm 0.1$  %.



Figure 4.10: The convergence index with  $\beta$  variations.

By means of its non-linear phase relationship, RT[k] has the following recurrence relation.

$$RT[k+1] = (1-\beta) \cdot \left(RT[k] + \frac{1}{2\pi f_{\text{OSC}}} \int_0^{T_{\text{REF}}} \Delta f(t) \, \mathrm{d}t\right) \tag{4.3}$$

With a given  $\Delta f$ , the accumulated time difference (D) due to the FE is expressed as

$$D = \frac{1}{2\pi f_{\text{OSC}}} \int_0^{T_{\text{REF}}} \Delta f(t) \, \mathrm{d}t = \Delta T \cdot N \tag{4.4}$$

where  $\Delta T$  is the time difference that corresponds to the FE and is expressed as  $-\Delta f/f_{OSC}^2$ . Then, the recursive equation (4.3) is developed and the convergent value of RT[k], CRT, is derived as follows.

$$RT[k] = \left(RT[1] - \frac{1-\beta}{\beta} \cdot D\right) \cdot \left(1-\beta\right)^{k-1} + \frac{1-\beta}{\beta} \cdot D \tag{4.5}$$

$$CRT = \lim_{k \to \infty} RT[k] = \frac{1 - \beta}{\beta} \cdot D \tag{4.6}$$
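The step from (4.3) to (4.5) and (4.6) is not written out. One way to see it, assuming that D is the same in every reference period, is to subtract the fixed point of the recurrence; the symbol $e[k]$ is introduced only for this sketch.

$$\begin{aligned}
RT[k+1] &= (1-\beta) \cdot \left(RT[k] + D\right), \\
CRT &= (1-\beta) \cdot \left(CRT + D\right) \;\Rightarrow\; CRT = \frac{1-\beta}{\beta} \cdot D, \\
e[k] &\coloneqq RT[k] - CRT \;\Rightarrow\; e[k+1] = (1-\beta) \cdot e[k], \\
RT[k] &= CRT + (1-\beta)^{k-1} \cdot \left(RT[1] - CRT\right).
\end{aligned}$$

Because $0 < \beta \le 1$ gives $0 \le 1-\beta < 1$, the error term decays geometrically, which is why a larger $\beta$ reaches the CRT in fewer reference periods.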

According to (4.5) and (4.6), the input time difference and the time shift that correspond to the input phase difference and the phase shift, respectively, in the steady state are derived as

$$T_{\rm IN}\big|_{@SS} = \frac{D}{\beta} = \frac{\Delta T \cdot N}{\beta} \tag{4.7}$$

$$T_{\text{OUT}}\big|_{@SS} = \beta \cdot \frac{D}{\beta} = \Delta T \cdot N \tag{4.8}$$

The equations (4.6), (4.7), and (4.8) imply several interesting aspects. To start with, the necessary and sufficient condition for the ideal ILCM is D = 0, which represents the FE of 0. And if the FE exists in the injection-locked system, all performances are improved with a lower N. The low N with a given  $f_{OSC}$  means that the time for noises to be accumulated is effectively reduced. In addition, the lower  $\beta$  is, the larger the CRT becomes. This indicates that the static phase offset between the input injection clock,  $S_{INJ}$ , and the oscillator is inversely proportional to  $\beta$  in the steady state. Moreover, the time shift in the steady state is kept constant as D regardless of  $\beta$ . This result is coincident with the result of (2.33) as developed in (4.9).

$$P(\phi_{SS}) = \frac{2\pi}{T_{OSC}} \cdot T_{OUT} \Big|_{@SS} = \frac{2\pi}{T_{OSC}} \cdot \left( -\frac{\Delta f}{f_{OSC}^2} \cdot N \right) = -\frac{2\pi N \Delta \omega}{\omega_{OSC}} \tag{4.9}$$

The roles and operations of the MPC corresponding to the equations of (4.6), (4.7), and (4.8) are as follows. The FE calibrator reduces  $\Delta T_{OSC}$  and D, thus making the MPC-ILCM an ideal ILCM without the FE. The narrow-range DLL forces the phase (time) of the sampling clock of the PO calibrator to be located at the *CRP* (*CRT*). The PO calibrator identifies the time difference of  $T_{IN}|_{@SS}$  - *CRT* = D, when the injection pulse is gated.
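The relations (4.3) to (4.8) can be checked numerically with a few lines of code. The sketch below iterates the recurrence with a constant D and compares it with the closed form. The values of $\beta$, $f_{OSC}$, the 0.1 % frequency error, and RT[1] follow the examples of figures 4.9 and 4.10, whereas the multiplication factor N = 32 is a hypothetical choice because the text does not state it; the function names are likewise illustrative.

```python
def simulate_rt(rt1: float, beta: float, d: float, steps: int) -> list:
    """Iterate RT[k+1] = (1 - beta) * (RT[k] + D), i.e. (4.3) with constant D."""
    rt = [rt1]
    for _ in range(steps - 1):
        rt.append((1.0 - beta) * (rt[-1] + d))
    return rt

def closed_form_rt(rt1: float, beta: float, d: float, k: int) -> float:
    """Closed-form RT[k] from (4.5), with the index k starting at 1."""
    crt = (1.0 - beta) / beta * d
    return (rt1 - crt) * (1.0 - beta) ** (k - 1) + crt

F_OSC = 5e9                       # Hz, example of figure 4.9
BETA = 0.8                        # injection strength, example of figure 4.9
RT1 = 10e-12                      # s, initial residual time of figure 4.10
N = 32                            # hypothetical multiplication factor (not in the text)
DELTA_F = 1e-3 * F_OSC            # +0.1 % frequency error
DELTA_T = -DELTA_F / F_OSC ** 2   # time difference per period, as defined in the text
D = DELTA_T * N                   # accumulated time difference, (4.4)

rt = simulate_rt(RT1, BETA, D, steps=20)
crt = (1.0 - BETA) / BETA * D     # (4.6)
t_in_ss = D / BETA                # (4.7)
t_out_ss = BETA * t_in_ss         # (4.8)

print(f"CRT = {crt * 1e12:.2f} ps, RT[20] = {rt[-1] * 1e12:.2f} ps")
print(f"T_IN - CRT = {(t_in_ss - crt) * 1e12:.2f} ps, D = {D * 1e12:.2f} ps")
```

In this example the iteration settles at the CRT predicted by (4.6), and the difference $T_{IN}|_{@SS} - CRT$ equals D, which is the quantity the PO calibrator observes when the injection pulse is gated.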

### 4.3.3 Conceptual Block Diagram of MPC

Figure 4.11(a) shows a conceptual block diagram of the proposed MPC structure. The FE and PO calibrators sample the pre- and post-injection phases, respectively. The narrow-range DLL tracks the realigned post-injection phases. The entire system is controlled by the gating enable signal,  $S_{\text{EN,gating}}$ . When  $S_{\text{INJ}}$  is applied to the RO, the FE calibration and the DLL are executed. On the contrary, when  $S_{\text{INJ}}$  is gated with the activated  $S_{\text{EN,gating}}$ , the PO calibration is performed. These overall calibration operations are summarized in figure 4.11(b). The FE calibration functions at every injection rate, so that a high bandwidth of the FE calibration and the full injection effect can be achieved simultaneously. In addition, all calibration loops operate at the reference clock rate, which enables a low-power implementation.



(a)

| Control   | Cal. |
|-----------|------|
| Injection | FE   |
| Gating    | PO   |

(b)

Figure 4.11: Conceptual block diagram of the proposed MPC structure.

FE Cal. & Narrow-range DLL. Ex)  $f_{\text{ERR}} = f_{\text{ILO}} - f_{\text{IDEAL}} > 0$ ,  $t_{\text{PO}} = 0$ ,  $\Delta t_{\text{POST}} > 0$ ;  $S_{\text{INJ}}$  is aligned with the rising edge of  $\phi_{0^{\circ}}$ .



Figure 4.12: Timing diagram of the MPC, when the injection pulse is applied.



Figure 4.13: Timing diagram of the MPC, when the injection pulse is gated.

### **4.3.4** MPC Operation (1, 2): FE Calibration and DLL

When the injection pulse,  $S_{INJ}$ , is applied to the RO, the FE calibrator and the DLL are executed. Figure 4.12 shows how they function with the timing diagram in the specific case. Assuming  $S_{INJ}$  is aligned with the rising edge of  $\phi_{0^\circ}$ , the phase shift at every injection rate is conducted as "phase pushing" with a positive FE. And the accumulated FE information is obtained by sampling the pre-injection phases,  $\phi_{PRE}$ , with the sampling clock,  $S_{PRE}$ , that has no PO. The DLL enables the sampling clock of the PO calibrator,  $S_{POST}$ , to track the realigned post-injection phases,  $\phi_{POST}$ . It is noted that the DLL operates independently of the injection path, when the injection pulse is applied. The DLL noise itself does affect the sampling clock phase of the FE calibrator and becomes low-pass filtered.

### **4.3.5** MPC Operation (3): PO Calibration

When  $S_{\text{INJ}}$  is gated with the activated  $S_{\text{EN,gating}}$ , the PO calibrator identifies the direction of the phase shift (phase pushing or phase pulling) that corresponds to the remaining FE in the steady state. Figure 4.13 shows how the PO calibrator recognizes the phase shift with the timing diagram in the specific case. Before  $S_{\text{INJ}}$  is gated, the MPC converges into the steady state where both expectations of  $PD_{\text{PRE}}$  and  $PD_{\text{POST}}$  are zero. Meanwhile, even with a small negative PO,  $t_{\text{PO}}$ , a small positive FE,  $\epsilon$ , corresponding to  $t_{\text{PO}}$  remains and an imperfect locked condition is reached. In this case, the phase shift occurs as "phase pushing" at every injection rate, and  $S_{\text{POST}}$  is aligned with the realigned post-injection phases thanks to the DLL. However, when the  $S_{\text{INJ}}$  is gated, the phase shift does not take place, so that the de-realigned post-injection phases are sampled and the phase shift of "phase pushing" is identified. In this way, the PO calibrator makes the MPC-ILCM lock to the minimum reference spur condition.


Figure 4.14: Overall architecture of the proposed MPC-ILCM.

| (a) Frequency Calibration: {PD<sub>INJ</sub>, PD<sub>PRE</sub>} | Freq. cal. | (b) Tracking $\phi_{POST}$ (DLL): {PD<sub>INJ</sub>, PD<sub>POST</sub>} | t<sub>POST</sub> cal. | (c) Path Offset Calibration: {PD<sub>INJ</sub>, PD<sub>POST</sub>} | t<sub>PRE</sub> cal. |
|------------|----|------------|----|------------|----|
| { -1, -1 } | DN | { -1, -1 } | DN | { -1, -1 } | UP |
| { -1, +1 } | UP | { -1, +1 } | UP | { -1, +1 } | DN |
| { +1, -1 } | UP | { +1, -1 } | UP | { +1, -1 } | DN |
| { +1, +1 } | DN | { +1, +1 } | DN | { +1, +1 } | UP |

Figure 4.15: Decision tables of MPC; (a) the FE calibrator, (b) the DLL, and (c) the PO calibrator.
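The decision tables of Figure 4.15 can be sketched in a few lines of code. This is only an illustrative model: the function names are hypothetical, and the BBPD outputs are represented as the integers -1 and +1 used in the figure.

```python
# Illustrative model of the Figure 4.15 decision tables (names are hypothetical).
UP, DN = "UP", "DN"


def xor_decision(pd_inj: int, pd_sample: int) -> str:
    """FE calibrator (with PD_PRE) and DLL (with PD_POST): UP when the outputs differ."""
    return UP if pd_inj != pd_sample else DN


def xnor_decision(pd_inj: int, pd_post: int) -> str:
    """PO calibrator: UP when the outputs agree, i.e. the opposite table."""
    return UP if pd_inj == pd_post else DN


# Print all four input combinations for both kinds of loop.
for pd_inj in (-1, +1):
    for pd_x in (-1, +1):
        print((pd_inj, pd_x), xor_decision(pd_inj, pd_x), xnor_decision(pd_inj, pd_x))
```

For every input pair the two functions return opposite decisions, which is the property the PO calibrator relies on.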

### 4.4 Circuit Implementation

#### 4.4.1 Overall Architecture

Figure 4.14 shows the overall architecture of the proposed MPC-ILCM. A conventional bang-bang phase-locked loop structure is used for initial locking and is disabled afterwards in the ILCM mode. The injection pulse, $S_{INJ}$, shorts the differential oscillating nodes, $\phi_{\rm INJ}$ ($\phi_{0^{\circ}}$ and $\phi_{180^{\circ}}$), and the frequency is tuned by a 10-bit monotonic digitally-controlled resistor (DCR) [64]. The injection-locked oscillator (ILO) consists of a 4-stage differential RO and injection switches. The injection strength, $\beta$, is varied by 4-bit binary-weighted injection switches. The pre- and post-injection phases, $\phi_{PRE}$ ($\phi_{315^\circ}$ and $\phi_{135^\circ}$) and $\phi_{POST}$ ($\phi_{45^\circ}$ and $\phi_{225^\circ}$), are sub-sampled by $S_{PRE}$ and $S_{POST}$ with SS-BBPDs to calibrate the FE and PO, respectively. The sampling clocks of the FE and PO calibrators, $S_{\text{PRE}}$ and $S_{\text{POST}}$, are versions of the reference clock, $S_{\text{REF}}$, delayed by $t_{\text{PRE}}$ and $t_{\text{POST}}$, respectively. The PO calibrator and the narrow-range DLL share $PD_{POST}$, and the gating-enable signal, $S_{\text{EN,gating}}$, determines which of the two operates. The gating control block adjusts the pulse-gating rate through a gating-rate control word (GRCW) and thereby affects the calibration bandwidth. Recall that the pulse-gating methodology prevents the number of PDs required for calibrating circuit offsets from growing without bound. To determine whether $S_{INJ}$ is aligned with the rising or falling edge of $\phi_{0^\circ}$, $\phi_{INJ}$ is also sub-sampled by $S_{POST}$. Since $S_{INJ}$, $S_{PRE}$, and $S_{POST}$ are all synchronous to $S_{REF}$, the required delay range of each digitally-controlled delay line (DCDL) can be reduced.

To secure the effective injection strength together with a high FE calibration bandwidth, $S_{\rm EN,gating}$ is activated with a low gating rate of less than about 1/100. Consequently, the bandwidths of the FE calibrator and the DLL are much higher than that of the PO calibrator. For precise calibration, 20-bit delta-sigma modulators ($\Delta\Sigma$s) are employed in each calibration loop to improve the frequency and delay resolutions. Figure 4.15 shows the decision tables of the MPC. All of them are correlated with $PD_{INJ}$, since $PD_{INJ}$ indicates the transition polarity of $\phi_{INJ}$. Only the PO calibrator uses XNOR logic, because it calibrates the phase of the sampling clock of the FE calibrator; the two decision tables therefore have to be opposite to each other.

#### **Transient Locking Behavior**

The convergence of the MPC is complete when both DCDLs (DCDL<sub>PRE</sub> and DCDL<sub>POST</sub>) have the desired propagation delays. The proposed MPC structure consists of three calibration loops, hence the bandwidth of each calibration loop should be determined carefully. Each bandwidth is determined by the gating rate ($GR_{INJ}$) and the corresponding loop gain ($K_I$, $K_{PRE}$, and $K_{POST}$). Since the DLL operates independently of the injection path, the bandwidths of the FE calibrator and the DLL do not affect each other while the injection pulse is applied. The important consideration is the relationship between the bandwidths of the PO calibrator and the DLL. According to section 4.3.1, the DLL should be aligned with the realigned post-injection phases before the injection pulse is gated. Therefore, the bandwidth of the DLL should be much higher than that of the PO calibrator to secure normal operation and prevent a stability issue due to bandwidth overlapping. The bandwidth ratio between the PO calibrator and the DLL, $BW_{rate}$, follows the relation below.

$$BW_{rate} = \frac{BW_{PO}}{BW_{DLL}} \propto GR_{INJ} \cdot \frac{K_{PRE}}{K_{POST}}$$
(4.10)
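Because Eq. (4.10) is only a proportionality, it is most useful for comparing settings against each other. The following sketch, with a hypothetical helper name and arbitrary illustrative gain values, shows how a lower gating rate or a lower $K_{PRE}/K_{POST}$ ratio separates the two bandwidths.

```python
def bw_ratio_proxy(gr_inj: float, k_pre: float, k_post: float) -> float:
    """Quantity proportional to BW_PO / BW_DLL per Eq. (4.10).

    The missing proportionality constant is not given in the text, so only
    ratios between two settings are meaningful.
    """
    if gr_inj <= 0 or k_pre <= 0 or k_post <= 0:
        raise ValueError("gating rate and gains must be positive")
    return gr_inj * k_pre / k_post


# Hypothetical settings: equal gains, gating once per 100 vs. once per 1000 pulses.
base = bw_ratio_proxy(1 / 100, 1.0, 1.0)
slower_po = bw_ratio_proxy(1 / 1000, 1.0, 1.0)
print(slower_po / base)  # relative change of the PO-to-DLL bandwidth ratio
```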



Figure 4.16: Transient behavior of MPC structure; the frequency trends and the DCWs of each DCDL.

Figure 4.16 shows the transient behavior of the MPC structure. The frequency trends and the delay control words (DCWs) of each DCDL are illustrated when the circuit is switched from the BB-PLL mode to the ILCM mode. Since the gating rate, $GR_{INJ}$, and the resulting $BW_{rate}$ are quite low, the stability issue due to bandwidth overlapping is prevented. The moving-average frequency of the ILO shows the injection-locked state once the DCWs of the DCDLs are locked to their desired points.

#### **Necessary Condition for Normal Operation**

To guarantee normal operation of the MPC structure under PVT variations, the DCDLs (DCDL<sub>PRE</sub> and DCDL<sub>POST</sub>) must cover a sufficient delay range for the sampling clocks, $S_{PRE}$ and $S_{POST}$, to reach the desired phases. Fortunately, $S_{PRE}$, $S_{POST}$, and $S_{INJ}$ are all synchronous to $S_{REF}$, and the required additional delay range can be derived from a Monte Carlo simulation. The propagation delays from $S_{REF}$ to $S_{INJ}$ and from $S_{REF}$ to $S_{PRE}$ are defined as $t_{REFtoINJ}$ and $t_{REFtoPRE}$, respectively. Figure 4.17(a) shows the Monte Carlo simulation of the delay difference between the two phase-convergence points, $t_{REFtoINJ} - t_{REFtoPRE}$, under PVT variations. The process variation covers the slow, nominal, and fast corners. The static supply voltage and the operating temperature are varied from -0.1 V (low) to +0.1 V (high) and from -5 °C (low) to +105 °C (high), respectively. The result of each PVT variation is shown in figure 4.17(b). It indicates that the standard deviation in the worst case is about 5 ps and that an additional delay coverage of at least 20 ps is required.
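The relation between the worst-case spread of about 5 ps and the 20 ps of additional coverage can be read as a four-sigma window (plus and minus two standard deviations). That reading is an assumption made here for illustration; the text does not state the rule that was used. A minimal sketch with hypothetical sample data:

```python
import statistics


def extra_delay_range_ps(delay_diffs_ps, k_sigma=4.0):
    """Additional DCDL range as k_sigma times the spread of the delay difference.

    delay_diffs_ps: samples of t_REFtoINJ - t_REFtoPRE (e.g. from Monte Carlo runs).
    k_sigma=4.0 is an assumed margin rule, not a value taken from the source.
    """
    sigma = statistics.pstdev(delay_diffs_ps)
    return k_sigma * sigma


# Hypothetical samples constructed to have a standard deviation of exactly 5 ps.
samples_ps = [-5.0, 5.0, -5.0, 5.0]
print(extra_delay_range_ps(samples_ps))
```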



Figure 4.17: (a) Monte Carlo simulation of two phase-convergence points and (b) its results.



### 4.4.2 Circuit Description of Building Blocks (1): ILO

Figure 4.18: (a) Circuit description of the ILO, (b) the PDR curve of the ILO, and (c) the injection strength of the ILO.



Figure 4.19: Normalized phase shift with the half-supply-crossing transition.

Figure 4.18(a) shows a circuit description of the ILO. The ILO is composed of a 4-stage differential RO and injection switches. The injection pulse, $S_{INJ}$, shorts the differential oscillating phases, $\phi_{INJ}$ ($\phi_{0^\circ}$ and $\phi_{180^\circ}$). $S_{INJ}$ is applied to only one of the four switches, and the rest are dummies with disabled switches. The injection strength, $\beta$, is controlled by 4-bit binary-weighted switches with a unit size of W. The PDR curve of the ILO in the vicinity of the locking point is shown in figure 4.18(b). The larger the injection switch is, the larger the phase shift becomes. Within the injection capture range, the PDR is approximately linear with a slope set by the injection strength, $\beta$. The simulated injection strength versus the injection switch size is shown in figure 4.18(c). As the injection switch becomes larger, $\beta$ also increases, but its rate of increase gradually diminishes. Since the DLL in the MPC samples the realigned post-injection phases, the point at which the phase shift is fully reflected must be considered carefully. Figure 4.19 shows the normalized phase shift with various input phase differences. The x-axis, the number of crossings, counts the transitions after $S_{INJ}$ across the multiphase clocks. In other words, '0' indicates the half-supply crossing of $\phi_{0^\circ}$ and $\phi_{180^\circ}$ right before or after $S_{INJ}$, and '1' represents the half-supply crossing of $\phi_{POST}$, that is, $\phi_{45^\circ}$ and $\phi_{225^\circ}$. From the phases immediately following the injection onward, the phase shift is sufficiently reflected, hence $\phi_{45^\circ}$ and $\phi_{225^\circ}$ are suitable for use as $\phi_{POST}$, and the required delay range of the DLL is reduced. An additional noticeable difference in figure 4.19 is observed in the phase shifts of index '0' between positive and negative input phase differences. The phase shifts with positive input phase differences are almost zero on account of the causal response to $S_{INJ}$.
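The phase bookkeeping of the 4-stage differential RO and the binary-weighted switch sizing can be summarized in a short sketch. The helper names are hypothetical, and the model assumes ideal, equally spaced phases.

```python
def ro_phases_deg(stages=4):
    """Ideal output phases of a differential ring oscillator with `stages` stages."""
    step = 180.0 / stages  # each stage contributes 180/stages degrees
    return [i * step for i in range(2 * stages)]


def pre_post_phases_deg(inj_phase_deg, stages=4):
    """Phases one stage before and one stage after an injected phase."""
    step = 180.0 / stages
    return (inj_phase_deg - step) % 360.0, (inj_phase_deg + step) % 360.0


def enabled_branches_w(code, bits=4):
    """Widths (in units of W) of the binary-weighted branches enabled by `code`."""
    if not 0 <= code < (1 << bits):
        raise ValueError("code out of range")
    return [1 << b for b in range(bits) if (code >> b) & 1]


print(ro_phases_deg())            # eight phases spaced by 45 degrees
print(pre_post_phases_deg(0.0))   # neighbours of phi_0
print(pre_post_phases_deg(180.0)) # neighbours of phi_180
print(enabled_branches_w(13))     # branches that add up to 13 W
```

With the injected pair at 0° and 180°, the neighbours come out as 315°/135° before and 45°/225° after the injection, matching the phases used as $\phi_{PRE}$ and $\phi_{POST}$.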

#### 4.4.3 Circuit Description of Building Blocks (2): DCDL

The DCDL consists of coarse- and fine-tuning cells as described in figure 4.20. Each coarse cell is a NAND-based DCDL with a resolution of two NAND propagation delays. The fine-tuning DCDL adjusts its delay through an 8-bit $P_{ctrl}$ and a 64-bit $N_{ctrl}$, which change the load capacitance according to the number of activated MOSFETs. $P_{ctrl}$ compensates for the narrow range of $N_{ctrl}$: if $N_{ctrl}$ is stuck at either its minimum or its maximum value, $P_{ctrl}$ is stepped one code at a time, as described in figure 4.16. The simulated resolutions of the fine-tuning DCDL are shown at the bottom of figure 4.20(b). The delay of the fine-tuning DCDL is shown in figure 4.21. A monotonic relationship with a fine resolution should be satisfied. The integral non-linearity (INL) and differential non-linearity (DNL) based on the post-layout simulation are shown in figure 4.22. The maximum absolute values of the INL and DNL are 0.58 LSB and 0.59 LSB, respectively.
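As a reference for how INL and DNL figures of this kind are commonly extracted from a delay-versus-code curve, the sketch below uses an end-point fit to define the LSB. The fitting method and the sample delays are assumptions; the text does not specify how the values in figure 4.22 were computed.

```python
def dnl_inl_lsb(delays_ps):
    """Return (DNL, INL) in LSB for a delay-vs-code curve, using an end-point fit."""
    n = len(delays_ps)
    if n < 2:
        raise ValueError("need at least two codes")
    lsb = (delays_ps[-1] - delays_ps[0]) / (n - 1)  # average step size
    dnl = [(delays_ps[i + 1] - delays_ps[i]) / lsb - 1.0 for i in range(n - 1)]
    inl = [(delays_ps[i] - delays_ps[0]) / lsb - i for i in range(n)]
    return dnl, inl


def is_monotonic(delays_ps):
    """True if the delay strictly increases with the control code."""
    return all(b > a for a, b in zip(delays_ps, delays_ps[1:]))


# Hypothetical 4-code delay curve in picoseconds.
curve_ps = [0.0, 1.0, 2.5, 3.0]
dnl, inl = dnl_inl_lsb(curve_ps)
print(is_monotonic(curve_ps), max(abs(x) for x in dnl), max(abs(x) for x in inl))
```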



Figure 4.20: Circuit description of (a) coarse-tuning and (b) fine-tuning DCDL.



Figure 4.21: Propagation delay of the fine-tuning DCDL.



Figure 4.22: (a) INL and (b) DNL of the fine-tuning DCDL.



### 4.5 Measurement Results

Figure 4.23: (a) Die photomicrograph and (b) power breakdown at 4.8 GHz operation.



Figure 4.24: Measurement setup for the MPC-ILCM.

The proposed MPC-ILCM is fabricated in a 28-nm CMOS technology and occupies an active die area of $0.062 \text{ mm}^2$. The micrograph of the chip is shown in figure 4.23(a). The supply of the ILCM is divided into three domains: one for the ILO (VDD<sub>ILO</sub>), another for the frequency calibrator (VDD<sub>DIG</sub>), and the other for the rest (VDD<sub>ANA</sub>). The power breakdown of each supply domain is shown in figure 4.23(b). The ILCM consumes 9.36 mW with the 4.8 GHz output clock. Figure 4.24 shows the measurement setup for the MPC-ILCM. Off-chip regulators provide each supply voltage, and $S_{PRE}$ is brought off-chip to measure the propagation delay of the DCDL.







Figure 4.25: Measured (a) phase noise and (b) spectrum of the output clock



| Decade Table [dBc/Hz] |        |        |        |
|-----------------------|--------|--------|--------|
| Freq.                 | A      | B      | C      |
| 10k                   | -111.4 | -103.7 | -41.6  |
| 100k                  | -118.3 | -113.0 | -57.3  |
| 1M                    | -122.7 | -121.8 | -81.4  |
| 10M                   | -126.2 | -125.9 | -105.6 |
| 40M                   | -126.9 | -129.1 | -120.2 |

(b)

Figure 4.26: Measured (a) phase noise plots with and without the MPC and (b) their decade tables.



Figure 4.27: Measured phase noise plots with integral gain  $(K_{I})$  variations.

The measured phase noise and spectrum are shown in figure 4.25. The measured RMS jitter is 143.6 fs, integrated from 10 kHz to 40 MHz, and a reference spur of -77.89 dBc is achieved. Note that the hump around the 120 MHz offset frequency in figure 4.25(b) is caused by supply coupling. Figure 4.26(a) depicts the phase noise plots of the ILCM with and without the MPC, and figure 4.26(b) shows the corresponding decade tables. The ILCM without the MPC is measured at the same locked condition as the ILCM with the MPC, but with all calibration loops disabled. The phase noise at low offset frequencies is higher without the MPC than with it. This indicates that continuous frequency calibration further suppresses the flicker noise of the RO, which helps to achieve a much lower RMS jitter [62]. To explore how the bandwidth of the FE calibration contributes to the phase noise, the integral gain, $K_{\rm I}$, is varied under otherwise identical conditions of the MPC-ILCM. Figure 4.27 demonstrates that the higher $K_{\rm I}$ is, the lower the phase noise becomes at low offset frequencies.
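An integrated RMS jitter such as the one quoted above is conventionally obtained from the single-sideband phase noise $\mathcal{L}(f)$ as $\sigma_{RMS} = \sqrt{2\int \mathcal{L}(f)\,df}\,/\,(2\pi f_{out})$. The sketch below applies this standard relation with a coarse trapezoidal integration; the flat phase-noise profile is hypothetical and is not the measured data, and the integration method of the measurement instrument is not stated in the text.

```python
import math


def rms_jitter_s(offsets_hz, pn_dbc_hz, f_out_hz):
    """Integrate SSB phase noise (trapezoids in linear units) into RMS jitter in seconds."""
    if len(offsets_hz) != len(pn_dbc_hz) or len(offsets_hz) < 2:
        raise ValueError("need matching lists with at least two points")
    lin = [10 ** (level / 10) for level in pn_dbc_hz]  # dBc/Hz -> 1/Hz
    area = sum(
        0.5 * (lin[i] + lin[i + 1]) * (offsets_hz[i + 1] - offsets_hz[i])
        for i in range(len(lin) - 1)
    )
    # The factor 2 accounts for both sidebands; the result is converted from radians to time.
    return math.sqrt(2 * area) / (2 * math.pi * f_out_hz)


# Hypothetical flat -126 dBc/Hz profile from 10 kHz to 40 MHz at a 4.8 GHz carrier.
sigma = rms_jitter_s([1e4, 4e7], [-126.0, -126.0], 4.8e9)
print(f"{sigma * 1e15:.1f} fs")
```

A trapezoid over a handful of decade points is only a rough estimate for real, sloped phase-noise curves; a dense, logarithmically spaced sweep gives a more faithful result.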

Figure 4.28 shows the phase noise plots with variations of the injection switch size. According to figure 4.18(c), $\beta$ is almost proportional to the size of the injection switches, but it saturates at large sizes. The phase noise results in figure 4.28 are consistent with this trend, and a stronger bandwidth extension effect is achieved with a strong $\beta$. Figure 4.28 also implies that, with a high $\beta$, the up-converted phase noise of the input reference clock aggravates the out-of-band phase noise at high offset frequencies. In other words, an optimum $\beta$ should be chosen to improve the overall phase noise performance. The achieved reference spur performance with the $\beta$ variations is shown in figure 4.29(a). A reference spur below -75 dBc is retained with large injection switches, but it deteriorates rapidly with a relatively low $\beta$. A low reference spur of -79.1 dBc is achieved with an injection switch size of 13 W, and the corresponding measured spectrum is shown in figure 4.29(b).



Figure 4.28: Measured phase noise plots with variations of the size of injection switches.





Figure 4.29: Measured (a) reference spur performance with the injection switch size variations and (b) spectrum example of (a).



Figure 4.30: Measured reference spur performance with the gating rate variations.

The gating rate, $GR_{INJ}$, affects the bandwidths of all MPC loops and the stability margin against bandwidth overlapping, as explained in sections 4.3.4, 4.3.5, and 4.4.1. Both the phase noise and the reference spur performance are maintained with a sufficiently low $GR_{INJ}$. Figure 4.30 shows the reference spur performance with $GR_{INJ}$ variations. A $1/GR_{INJ}$ of 100 means that $S_{EN,gating}$ is activated once every 100 $S_{INJ}$ pulses. The reference spur appears to stay below -70 dBc regardless of $GR_{INJ}$. However, the spectrum of the MPC-ILCM starts to degrade as $GR_{INJ}$ grows larger. Figure 4.31 shows example spectra with $GR_{INJ}$ of 1/70 (a) and 1/10 (b). When $1/GR_{INJ}$ reaches 70, undesired fractional tones are observed, and they become worse as $GR_{INJ}$ grows larger.
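The meaning of the gating rate can be made concrete with a small sketch that marks which injection pulses are gated. The helper name is hypothetical, and placing the gated pulse at the end of each group is an arbitrary choice for illustration.

```python
def gating_enable(n_pulses, inv_gating_rate):
    """Return a list of booleans: True where S_EN,gating suppresses the injection pulse.

    inv_gating_rate is 1/GR_INJ, e.g. 100 gates one pulse out of every 100.
    """
    if inv_gating_rate < 1:
        raise ValueError("1/GR_INJ must be at least 1")
    return [(i + 1) % inv_gating_rate == 0 for i in range(n_pulses)]


mask = gating_enable(1000, 100)
gated = sum(mask)               # pulses used by the PO calibrator
injected = len(mask) - gated    # pulses that still realign the oscillator
print(gated, injected)
```

The lower the gating rate, the larger the share of reference cycles that still inject, which is why a low $GR_{INJ}$ preserves the effective injection strength.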





Figure 4.31: Measured spectra with  $GR_{INJ}$ s of (a) 1/70 and (b) 1/10.



Figure 4.32: Measured clock eye with its jitter histogram.

The measured clock eye diagram with its time interval error (TIE) histogram is shown in figure 4.32. The symmetric, Gaussian-shaped TIE histogram indicates that the ILCM is successfully injection-locked with the MPC. To further verify the MPC operation, the integrated RMS jitter and the reference spur are measured with supply (VDD<sub>ILO</sub>) variations on five different chips, as shown in figure 4.33. Without the MPC, both the integrated jitter and the reference spur deteriorate rapidly when the supply voltage deviates from the central value of 1.1 V. Furthermore, the ILCM without the MPC fails to lock if the supply voltage deviation becomes larger. With the MPC, however, the ILCM maintains robust performance with consistently low integrated jitter and a low reference spur across multiple samples. Figure 4.34 shows the measured integrated jitter and the resolution of the fine-tuning DCDL with supply (VDD<sub>ANA</sub>) variations. Since the proposed MPC structure comprises three digitally-controlled calibration loops, the performance depends strongly on the resolution of the DCDL. The two lines in figure 4.34 show that both the integrated jitter and the resolution of the fine-tuning DCDL degrade as VDD<sub>ANA</sub> decreases. An RMS jitter of 184.3 fs and a resolution of 183.7 fs are measured at 0.92 V. The reference spur remains relatively consistent as VDD<sub>ANA</sub> varies from 1.2 V down to 0.92 V, and the spectrum measured at 0.92 V is also plotted in figure 4.34. In this case, a reference spur of -73.22 dBc is measured.



Figure 4.33: Measured (a) integrated jitter and (b) reference spur with supply  $(VDD_{ILO})$  variations in five different sample chips.



Figure 4.34: Measured integrated jitter and resolution of the fine-tuning DCDL with supply  $(VDD_{ANA})$  variations.



Figure 4.35: Benchmark of performances of the MPC-ILCM and state-of-the-art RO-based ILCMs.

Table 4.1 compares this design against state-of-the-art RO-based ILCMs with background calibration techniques. Since the proposed MPC performs low-power, high-bandwidth frequency calibration, the ILCM exhibits the best FoM along with the lowest reference spur. Figure 4.35 shows the benchmark of performances of the MPC-ILCM and state-of-the-art RO-based ILCMs.

|                             | Tech [nm] | Ref. Freq. [MHz] | Output Freq. [GHz] | Mult. Factor (N) | Integ. Jitter, σ<sub>RMS</sub> [fs] (range [MHz]) | Ref. Spur [dBc] | Power [mW] | Power Eff. [mW/GHz] | Freq. cal. Methodology (cal. BW / cal. Power) | Area [mm<sup>2</sup>] | *FoM<sub>JITTER</sub> [dB] |
|-----------------------------|-----------|------------------|--------------------|------------------|---------------------------------------------------|-----------------|------------|---------------------|-----------------------------------------------|-----------------------|----------------------------|
| This work                   | 28        | 300              | 4.8                | 16               | 143.6 (0.01 - 40)                                 | -77.9           | 9.36       | 1.95                | Multi-Phase ( High / Low )                    | 0.062                 | -247.1                     |
| VLSI'16 [58], Y. Lee        | 40        | 180              | 1.44               | 80               | 450 (0.001 - 40)                                  | -59             | 2.8        | 1.94                | Period-Charging ( Low / Low )                 | 0.061                 | -242.5                     |
| ISSCC'17 [59], S. Kim       | 65        | 156.25           | 2.5                | 16               | 198 (0.01 - 40)                                   | -65             | 13.5       | 5.40                | Replica Delay ( High / High )                 | 0.064                 | -242.8                     |
| ISSCC'18 [60], K. M. Megawer | 65       | 54               | 4.752              | 88               | 370 (0.01 - 30)                                   | -53             | 6.5        | 1.37                | Pulse-Gating ( Low / Low )                    | 0.16                  | -240.5                     |
| JSSC'19 [61], A. Elkholy    | 65        | 125              | 5                  | 40               | 337 (0.01 - 40)                                   | -45             | 5.3        | 1.06                | Pulse-Gating ( Low / Low )                    | 0.09                  | -242.2                     |
| JSSC'21 [62], S. Yoo        | 65        | 100              | 2.4                | 24               | 140 (0.01 - 30)                                   | -72             | 11.0       | 4.58                | Replica Delay ( High / High )                 | 0.055                 | -246.7                     |

Table 4.1: PERFORMANCE COMPARISON WITH THE STATE-OF-THE-ART RO-BASED ILCMs

$\text{*FoM}_{\text{JITTER}} = 10 \cdot \log_{10} \left[ \left(\frac{\sigma_{\text{RMS}}}{1\,\text{s}}\right)^2 \cdot \left(\frac{\text{Power}}{1\,\text{mW}}\right) \right]$
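The figure of merit and the power efficiency in Table 4.1 follow directly from the jitter, power, and output frequency. As a quick cross-check of the table entries, the sketch below evaluates the footnote formula with a base-10 logarithm; the function names are illustrative.

```python
import math


def fom_jitter_db(sigma_rms_s, power_mw):
    """Jitter-and-power FoM: 10*log10((sigma_RMS / 1 s)^2 * (Power / 1 mW))."""
    return 10 * math.log10(sigma_rms_s ** 2 * power_mw)


def power_eff_mw_per_ghz(power_mw, f_out_ghz):
    """Power efficiency in mW/GHz."""
    return power_mw / f_out_ghz


# This work: 143.6 fs, 9.36 mW, 4.8 GHz.
print(round(fom_jitter_db(143.6e-15, 9.36), 1))
print(round(power_eff_mw_per_ghz(9.36, 4.8), 2))
# JSSC'21 [62]: 140 fs, 11.0 mW.
print(round(fom_jitter_db(140e-15, 11.0), 1))
```

The results agree with the tabulated -247.1 dB, 1.95 mW/GHz, and -246.7 dB to within rounding.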

### 4.6 Summary

In this chapter, a low-jitter, low-reference-spur RO-based ILCM is presented. With the MPC, which utilizes the intrinsic multi-phase generation capability of the RO, the ILCM successfully compensates for the frequency error and the injection path offset. The reasoning that leads to the MPC and its time-domain analysis are addressed. The bandwidth selection that secures normal operation of the MPC and prevents a stability issue is discussed, and the transient locking behavior is presented. Thanks to the high-bandwidth frequency calibration, the ILCM further suppresses the flicker noise of the RO and achieves a much lower RMS jitter. In addition, by identifying the direction of the phase shift, the PO calibrator makes the MPC-ILCM converge to the minimum reference spur point in the steady state. For a low-power implementation, SS-BBPDs are used in all calibration loops. The response of the MPC-ILCM to each operating condition is verified by measuring the integrated jitter and the reference spur with respect to the integral gain, the injection strength, and the gating rate. With continuous background calibration, the MPC-ILCM maintains its performance against PVT variations. The ILCM achieves a 143.6-fs RMS jitter with a -77.9 dBc reference spur and consumes 9.4 mW at 4.8-GHz operation, which translates to a jitter-and-power FoM (FoM<sub>JITTER</sub>) of -247.1 dB.

# **Chapter 5**

# Conclusion

Given the wireline communication trends of process technology scaling, low-energy operation, and multiple configurations including backward compatibility, the RO becomes a prospective replacement for LC counterparts, because it occupies a small silicon area, has a multi-phase generation capability, and offers a wide frequency tuning range. However, the RO is more vulnerable to supply noise and has the fatal disadvantage of inferior phase noise compared to the LC counterparts. Therefore, in this dissertation, solutions for the two major drawbacks of RO-based clock generators are proposed. One prototype chip verifies the feasibility for a practical application, and the other demonstrates the advantage over other methodologies.

First of all, the RO-based AD-PLL with the self-biased SNC technique for the DDR5 RCD application is presented. The prerequisites of the SNC technique for the RCD application are summarized as follows. 1) It should not rely on the reference clock for detecting a supply fluctuation, since the input reference clock is noisy. 2) It should secure a wide static supply voltage margin for low-voltage conditions, considering PVT variations and mass production. 3) It should compensate for dynamic voltage droops due to workload transitions with a sufficient SNC bandwidth. 4) It must satisfy the stabilization time of $3.5\ \mu$s. The corresponding results of the proposed SNC technique are summarized as follows. 1) The open-loop self-biased SNC-RO does not require any feedback system, and its behavior does not depend on the reference clock quality. 2) A low FP of 60.8 MHz/V at $\pm 20$ mV and 67.9 MHz/V at $\pm 40$ mV is measured in the SNC-RO, which is about 55 times lower than that of the SNC-free DCO. 3) The dominant pole is located in the vicinity of 20 MHz, and the measured PSNA performance is sustained over 20 dB up to 10 MHz. 4) An additional settling time of only about 40 ns is required, so the stabilization time is not deteriorated. The design-oriented quantitative analyses of static and dynamic characteristics, such as the FP, the SNC-operating bandwidth, the phase noise contribution of the SNC technique, and the estimated PSNA, are also addressed. The prototype chip is fabricated in a 28-nm CMOS technology, and the measurement results demonstrate that the AD-PLL satisfies the prerequisites of the SNC technique as well as the stabilization time for the RCD application. The SNC-PLL achieves the best PSNA performance of 40 dB and maintains a PSNA over 20 dB with a single-tone noise frequency from 100 kHz to 10 MHz. In the case of random supply noise up to 100 mV<sub>pp</sub>, the integrated RMS jitter is improved by about 65 % on average. The AD-PLL consumes 12.1 mW at 3.0 GHz operation and achieves an integrated RMS jitter of 271 fs without any injected supply noise. Besides the SNC performance, the measured worst-case stabilization time is below 700 ns with the A-FTL assistance, which satisfies the stabilization time for the RCD application. In addition, the SNC-PLL retains a FoM<sub>JITTER</sub> below -237 dB from 1.8 GHz to 3.2 GHz operation.

Secondly, an RO-based ILCM with the MPC that utilizes the intrinsic multi-phase generation capability of the RO is presented. The proposed MPC is a two-point calibration methodology, which is indispensable for ensuring normal operation of the ILCM and securing its jitter performance. Since the free-running frequency information is contained in the pre-injection phases, the FE calibration can be performed without any hardware overhead. In addition, a high FE calibration bandwidth and the full injection effect are attained simultaneously, which overcomes the trade-off between the FE calibration and the injection effect in the conventional pulse-gating method. The PO calibrator identifies the phase shift in the post-injection phases, which makes the MPC-ILCM converge to the minimum reference spur point. All calibration loops operate at the reference clock rate, thus enabling a low-power implementation, whereas the PD and RDC operate at the oscillator clock rate in the RDC-based ILCM. Through a time-domain analysis, the equations for the residual phase offset in the steady state are derived; this offset corresponds to the residual FE and indicates how much deterministic jitter remains. Fabricated in a 28-nm CMOS technology, the prototype verifies that the proposed MPC realizes a low-jitter, low-reference-spur RO-based ILCM. The MPC-ILCM achieves an integrated RMS jitter of 143.6 fs with a -77.9 dBc reference spur at 4.8 GHz operation. The MPC sustains a successful injection-locked condition, hence both the integrated RMS jitter and the reference spur are maintained under supply voltage variations. The performance variations with respect to the injection strength, the gain of the FE calibration, and the gating rate, which corresponds to the effective injection strength, are also presented. It achieves the best jitter-and-power FoM of -247.1 dB and the lowest reference spur among the state-of-the-art RO-based ILCMs with background calibration techniques.

# **Chapter A**

## Notes for PLL in the RCD

The input reference clock of the SNC-PLL presented in chapter 3 is applied from the Signal Vector Generator, which does not represent the actual conditions of the RCD application. There is a separate printed circuit board (PCB) for measuring the output clock jitter of the RCD chip, and the ADVANTEST equipment is used for verifying overall functional operations and checking voltage and timing margins through various test vectors. In this appendix, the additional logic of the PLL in the RCD, which uses the built-in feature of the PLL as a zero-delay buffer, is described. Furthermore, behavioral simulation results on the robustness against voltage fluctuations and on the jitter transfer function, which emulate the actual RCD conditions, are addressed. These results can be used to estimate the jitter for a predictable voltage fluctuation and to determine whether the jitter peaking and bandwidth meet the respective constraints of the RCD under the actual conditions. Lastly, the output clock jitters of YCK in a DDR4 RCD chip, measured by the ADVANTEST equipment under different conditions, are presented.



Figure A.1: Block diagram of PLL in RCD as zero-delay buffer.



Figure A.2: (a) continuous and (b) request-driven DM operation and (c) its simulation result of (b) case.

### A.1 PLL as Zero-Delay Buffer

Figure A.1 shows the block diagram of the SNC-PLL. Compared with figure 3.18, two notable differences are observed. The first is the logic related to the clock stop signal, $S_{\text{CLK\_STOP}}$. $S_{\text{CLK\_STOP}}$ is sampled twice by the divide-by-8 clock, which guarantees clock toggling for at least 16 cycles after $S_{\text{CLK\_STOP}}$ is received. The second is the delay monitor (DM) and the replica blocks of the clock path placed in the PLL feedback path, which include a phase interpolator (PI), a clock tree (CT), an output buffer (OB), and an input buffer (IB). Even though the jitter performance is degraded by the increased effective loop delay due to the replica blocks in the feedback path, they are added so that the PLL can act as a zero-delay buffer. The zero-delay buffer is adopted because of the asynchronous delay $t_{\text{PDM}}$, which is one of the most important performance metrics of the RCD application and is defined as the propagation delay from the DCK\_t/DCK\_c falling-edge crosspoint to the output [4, 5]. The phase-locked state, which is an intrinsic feature of the PLL, is therefore used so that $t_{PDM}$ is robust to PVT variations. In addition, the DM quantizes the propagation delay of each replica block with a resolution of tCK/64. To reduce the power consumed by the DM during normal operation, the DM operates only when the request signal, $S_{\text{DM,REQ}}$, is applied, as shown in figure A.2(b). The simulated DM operation is shown in figure A.2(c).
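The two numeric properties mentioned above, the guaranteed number of clock cycles after the stop request and the tCK/64 quantization of the delay monitor, can be illustrated with a short sketch. The helper names are hypothetical, and truncation toward zero in the quantizer is an assumption; the text does not state the rounding behavior of the DM.

```python
def guaranteed_cycles(samples=2, divide_ratio=8):
    """Clock cycles spanned when the stop signal is sampled `samples` times
    by a clock divided by `divide_ratio`."""
    return samples * divide_ratio


def dm_quantize(delay_ps, tck_ps, steps=64):
    """Quantize a delay into DM codes with a resolution of tCK/steps (floor assumed)."""
    if delay_ps < 0 or tck_ps <= 0:
        raise ValueError("delay must be non-negative and tCK positive")
    lsb_ps = tck_ps / steps
    return int(delay_ps // lsb_ps)


print(guaranteed_cycles())        # cycles guaranteed after the stop request
print(dm_quantize(100.0, 640.0))  # hypothetical 640 ps clock period, 10 ps per code
```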

### A.2 Practical Behavioral Simulation

Since the input reference clock applied from the Signal Vector Generator does not represent the actual input clock in the RCD application, the behavioral simulation is performed in accordance with the input clock differential jitter specification of [5]. Figure A.3 shows the input-to-output jitter transfer function derived from the behavioral simulation at the 2.4 GHz operation of DDR5-4800. A jitter peaking of 1.4 dB and a loop bandwidth of 78 MHz (about $0.03*f_{clock}$) are observed in figure A.3. This satisfies the maximum jitter peaking of 3 dB and the minimum bandwidth of $0.01*f_{clock}$ according to [5]. Figure A.4 shows the simulated time trends of the output clock frequency, the cycle-to-cycle (CC) jitter, and the half-period (HP) jitter with a supply voltage ramp of -100 mV over 1 $\mu$s. The peak-to-peak CC jitter and HP jitter over about 2.5 $\mu$s are 6.8 ps and 3.8 ps, respectively, and the phase-locked state is maintained even under the supply fluctuation of -100 mV.
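The comparison against the two limits quoted from [5] is simple arithmetic and can be written as a small check. The function name is hypothetical; the default limits are the 3 dB peaking and $0.01*f_{clock}$ bandwidth values stated above.

```python
def meets_jtf_limits(peaking_db, bw_hz, f_clock_hz,
                     max_peaking_db=3.0, min_bw_fraction=0.01):
    """True if jitter peaking and loop bandwidth satisfy the quoted limits."""
    return peaking_db <= max_peaking_db and bw_hz >= min_bw_fraction * f_clock_hz


# Simulated values at 2.4 GHz operation: 1.4 dB peaking, 78 MHz bandwidth.
print(meets_jtf_limits(1.4, 78e6, 2.4e9))
print(78e6 / 2.4e9)  # bandwidth as a fraction of the clock frequency
```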



Figure A.3: Input-to-Output Jitter Transfer Function of PLL in RCD

[Simulated waveform: VDD\_DCO, frequency, cycle-to-cycle jitter (pp: 6.8 ps), and half-period jitter (pp: 3.8 ps) versus time, from 15,200 ns to 18,000 ns]

Figure A.4: Simulated frequency time trend and the jitter with supply fluctuation.
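For reference, peak-to-peak CC and HP jitter values of this kind can be extracted from edge timestamps as sketched below. The definitions are assumptions made for illustration: CC jitter is taken as the difference between adjacent periods, HP jitter as the spread of the high and low half-periods, and peak-to-peak as maximum minus minimum. The simulator used for figure A.4 may define these quantities differently, and the edge times in the example are made up.

```python
from typing import Sequence, Tuple


def peak_to_peak(values: Sequence[float]) -> float:
    """Return max(values) - min(values)."""
    return max(values) - min(values)


def cc_and_hp_jitter_pp(rise_ps: Sequence[float], fall_ps: Sequence[float]) -> Tuple[float, float]:
    """Peak-to-peak CC and HP jitter from alternating rising and falling edges.

    Assumes rise_ps[i] < fall_ps[i] < rise_ps[i + 1] for every i.
    """
    periods = [b - a for a, b in zip(rise_ps, rise_ps[1:])]
    cc = [b - a for a, b in zip(periods, periods[1:])]       # adjacent-period differences
    high = [f - r for r, f in zip(rise_ps, fall_ps)]         # high half-periods
    low = [r - f for f, r in zip(fall_ps, rise_ps[1:])]      # low half-periods
    # Subtracting a nominal half-period would not change the peak-to-peak value.
    return peak_to_peak(cc), peak_to_peak(high + low)


# Made-up edge times in picoseconds, for illustration only.
rise = [0.0, 417.0, 833.0, 1251.0]
fall = [208.0, 626.0, 1041.0]
cc_pp, hp_pp = cc_and_hp_jitter_pp(rise, fall)
```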

### A.3 Additional Measurement from ADVANTEST

In figure A.5, the period jitter, the CC jitter, and the time interval error (TIE) are measured with an oscilloscope using a single-ended ground termination, with an SMA connector soldered to Y0\_t/Y0\_c of the DDR4 RCD on the tip-shaped test PCB. This method captures all of the integrated noise present in the YCK output, but it is not an ideal condition for measuring the actual jitter. A comparison against the results of competitors' parts is therefore a more reliable interpretation. According to figure A.6, there is no jitter degradation due to the data pattern, and the dominant noise contributor is presumed to be the PI. In addition, figure A.7 confirms fast stabilization even over a wide supply range of 1.0 V to 1.4 V.
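The three RMS metrics reported here can be related to a list of measured edge times as in the following sketch. It is not the oscilloscope's algorithm: the ideal clock used for the TIE is assumed to run at the mean measured period, the mean TIE is removed, and the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def rms(values: Sequence[float]) -> float:
    """Root-mean-square of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def jitter_rms(edges_ps: Sequence[float]) -> Tuple[float, float, float]:
    """Return (period jitter, CC jitter, TIE) in ps RMS from rising-edge times."""
    periods = [b - a for a, b in zip(edges_ps, edges_ps[1:])]
    mean_period = sum(periods) / len(periods)
    period_jitter = [p - mean_period for p in periods]        # deviation from the mean period
    cc_jitter = [b - a for a, b in zip(periods, periods[1:])]  # adjacent-period differences
    # TIE: deviation of each edge from an ideal clock running at the mean period.
    tie_raw = [e - (edges_ps[0] + k * mean_period) for k, e in enumerate(edges_ps)]
    tie_mean = sum(tie_raw) / len(tie_raw)
    tie = [t - tie_mean for t in tie_raw]
    return rms(period_jitter), rms(cc_jitter), rms(tie)


# Made-up edge times in picoseconds, for illustration only.
period_rms, cc_rms, tie_rms = jitter_rms([0.0, 401.0, 800.0, 1201.0, 1600.0])
```

In this made-up example the periods alternate between 401 ps and 399 ps, which gives 1 ps RMS period jitter and 2 ps RMS CC jitter, while the TIE stays below 0.5 ps RMS because the edge errors do not accumulate.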



Figure A.5: Measurement setup for DDR4 RCD with ADVANTEST equipment

| Measurement control | Condition | Period jitter [ps<sub>RMS</sub>] | Cycle-to-cycle jitter [ps<sub>RMS</sub>] | TIE [ps<sub>RMS</sub>] |
|---|---|---|---|---|
| 1 | IDLE | 2.207 | | 3.225 |
| 1 | Operating | 2.123 | 3.526 | 3.290 |
| 2 | Gain change | 2.071 | 3.470 | 3.223 |
| 3 | VDD 1.4 V | 1.676 | 2.725 | 2.679 |
| 3 | VDD 1.0 V | 3.175 | 5.412 | 3.418 |
| 4 | V.S./Mod./On | 1.859 | 3.096 | 2.735 |
| 5 | Fast 2x | 1.816 | 3.034 | 2.568 |
| 5 | Fast 4x | 1.745 | 2.919 | 2.617 |
| 5 | Fast 8x | 1.632 | 2.734 | 2.684 |
| 6 | Sample #1 | 1.632 | 2.734 | 2.684 |
| 6 | Sample #2 | 1.676 | 2.752 | 2.654 |
| 6 | Sample #3 | 1.563 | 2.622 | 2.610 |
| 7 | A | 1.371 | 2.347 | 1.906 |
| 7 | B | 1.403 | 2.418 | 2.261 |
| 7 | C | 2.237 | 3.749 | 3.157 |

\* Measurement control description

1) Operating condition: IDLE state and operating state
2) PLL gain sweep
3) High and low supply voltage
4) YCK strength: very strong / slew: moderate / duty control: on
5) PI bias strength control
6) Other samples
7) Other samples of competitors

Figure A.6: Measured Y0\_t output jitter of DDR4 RCD

| Measurement control | Condition | Stabilization time [ns] |
|---|---|---|
| VDD | 1.0 V | 400 |
| VDD | 1.2 V | 120 |
| VDD | 1.4 V | 120 |

Figure A.7: Measured stabilization time of DDR4 RCD

# **Bibliography**

- [1] International Solid-State Circuits Conference (ISSCC), "2022 Press Kit," Online (accessed: Dec. 05, 2022), https://www.isscc.org/past-conferences (2022).
- [2] PCI-SIG, "PCI Express® Base Specification Revision 6.0, Version 0.5," Feb 19, 2020.
- [3] D. D. Sharma, "PCI Express® 6.0 Specification at 64.0 GT/s with PAM-4 signaling: a low latency, high bandwidth, high reliability and cost-effective interconnect," in *IEEE Symp. High-Performance Interconnects (HOTI)*, Aug. 2020, pp. 1–8.
- [4] DDR4 Registering Clock Driver, JEDEC Standard JESD82-31A, 2019.
- [5] DDR5 Registering Clock Driver Definition, JEDEC Standard JESD82-511, 2021.
- [6] M.-S. Hwang, J. Kim, and D.-K. Jeong, "Reduction of Pump Current Mismatch in Charge-Pump PLL," in *Electron. Lett.*, vol. 45, no. 3, pp. 135-136, Jan. 2009.
- [7] V. Kratyuk, P. K. Hanumolu, Un-Ku Moon, and K. Mayaram, "A Design Procedure for All-Digital Phase-Locked Loops Based on a Charge-Pump Phase-Locked-Loop Analogy," in *IEEE Trans. Circuits Syst. II*, vol. 54, no. 3, pp. 247-251, Mar. 2007.
- [8] F. M. Gardner, "Phaselock Techniques," Hoboken, NJ, USA: Wiley, 2005.
- [9] P. Dudek, S. Szczepanski, and J. V. Hatfield, "A High-Resolution CMOS Time-to-Digital Converter Utilizing a Vernier Delay Line," in *IEEE J. Solid-State Circuits*, vol. 35, no. 2, pp. 240-247, Feb. 2000.
- [10] N. D. Dalt, "A Design-Oriented Study of the Nonlinear Dynamics of Digital Bang-Bang PLLs," in *IEEE Trans. Circuits Syst. I*, vol. 52, no. 1, pp. 21-31, Jan. 2005.
- [11] M. H. Perrott, M. D. Trott, and C. G. Sodini, "A Modeling Approach for  $\Sigma$ - $\Delta$ Fractional-N Frequency Synthesizers Allowing Straightforward Noise Analysis," in *IEEE J. Solid-State Circuits*, vol. 37, no. 8, pp. 1028-1038, Aug. 2002.
- [12] R. B. Staszewski *et al.*, "All-Digital PLL and Transmitter for Mobile Phones," in *IEEE J. Solid-State Circuits*, vol. 40, no. 12, pp. 2469-2482, Dec. 2005.
- [13] R. B. Staszewski, D. Leipold, C.-M. Hung, and P. T. Balsara, "TDC-Based Frequency Synthesizer for Wireless Applications," in *Proc. IEEE Radio Frequency Integrated Circuits (RFIC) Symp.*, Jun. 2004, pp. 251-258.
- [14] R. B. Staszewski, C.-M. Hung, N. Barton, M.-C. Lee, and D. Leipold, "A Digitally Controlled Oscillator in a 90 nm Digital CMOS Process for Mobile Phones," in *IEEE J. Solid-State Circuits*, vol. 40, no. 11, pp. 2203-2211, Nov. 2005.
- [15] A. Hajimiri, and T. H. Lee, "A General Theory of Phase Noise in Electrical Oscillators," in *IEEE J. Solid-State Circuits*, vol. 33, no. 2, pp. 179-194, Feb. 1998.
- [16] A. E. Siegman, *Lasers*. Mill Valley, CA: University Science Books, 1986.

- [17] B. Razavi, "A Study of Injection Locking and Pulling in Oscillators," in *IEEE J. Solid-State Circuits*, vol. 39, no. 9, pp. 1415-1424, Sep. 2004.
- [18] B. M. Helal, C. M. Hsu, K. Johnson, and M. H. Perrott, "A Low Jitter Programmable Clock Multiplier Based on a Pulse Injection-Locked Oscillator with a Highly-Digital Tuning Loop," in *IEEE J. Solid-State Circuits*, vol. 44, no. 5, pp. 1391-1400, May 2009.
- [19] A. Elkholy, M. Talegaonkar, T. Anand, and P. K. Hanumolu, "Design and Analysis of Low-Power High-Frequency Robust Sub-Harmonic Injection-Locked Clock Multipliers," in *IEEE J. Solid-State Circuits*, vol. 50, no. 12, pp. 3160-3174, Dec. 2015.
- [20] H. C. Ngo, K. Nakata, T. Yoshioka, Y. Terashima, K. Okada, and A. Matsuzawa, "8.5 A 0.42ps-Jitter –241.7dB-FOM Synthesizable Injection-Locked PLL with Noise-Isolation LDO," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2017, pp. 150-151.
- [21] Y.-C. Huang and S.-I. Liu, "19.8 A 2.4-GHz Sub-Harmonically Injection-Locked PLL With Self-Calibrated Injection Timing," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2012, pp. 338-340.
- [22] W.-S. Choi, T. Anand, G. Shu, A. Elshazly, and P. K. Hanumolu, "A Burst-Mode Digital Receiver with Programmable Input Jitter Filtering for Energy Proportional Links," in *IEEE J. Solid-State Circuits*, vol. 50, no. 3, pp. 737-748, Mar. 2015.
- [23] J. Terada, K. Nishimura, S. Kimura, H. Katsurai, N. Yoshimoto, and Y. Ohtomo, "11.4 A 10.3125Gb/s Burst-Mode CDR Circuit Using a  $\Delta\Sigma$  DAC," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2008, pp. 226-227 and 609.

- [24] H. Wu and L. Zhang, "32.9 A 16-to-18GHz 0.18µm Epi-CMOS Divide-by-3 Injection-Locked Frequency Divider," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2006, pp. 602-603.
- [25] A. Garghetti, A. L. Lacaita, D. Seebacher, M. Bassi, and S. Levantino, "Analysis and Design of 8-to-101.6-GHz Injection-Locked Frequency Divider by Five With Concurrent Dual-Path Multi-Injection Topology," in *IEEE J. Solid-State Circuits*, vol. 57, no. 6, pp. 1788-1799, Jun. 2022.
- [26] J.-H. Seol *et al.*, "23.6 An 8Gb/s 0.65mW/Gb/s Forwarded-Clock Receiver Using an ILO with Dual Feedback Loop and Quadrature Injection Scheme," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2013, pp. 410-411.
- [27] D. Kim *et al.*, "23.2 A 1.1V 1ynm 6.4Gb/s/pin 16Gb DDR5 SDRAM with a Phase-Rotator-Based DLL, High-Speed SerDes and RX/TX Equalization Scheme," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2019, pp. 380-382.
- [28] R. Adler, "A Study of Locking Phenomena in Oscillators," in *Proc. IRE*, vol. 34, no. 6, pp. 351-357, Jun. 1946.
- [29] B. Hong, and A. Hajimiri, "A General Theory of Injection Locking and Pulling in Electrical Oscillators—Part I: Time-Synchronous Modeling and Injection Waveform Design," in *IEEE J. Solid-State Circuits*, vol. 54, no. 8, pp. 2109-2121, Aug. 2019.
- [30] B. Hong, and A. Hajimiri, "A General Theory of Injection Locking and Pulling in Electrical Oscillators—Part II: Amplitude Modulation in LC Oscillators, Transient Behavior, and Frequency Division," in *IEEE J. Solid-State Circuits*, vol. 54, no. 8, pp. 2122-2139, Aug. 2019.

- [31] D. Dunwell, and A. C. Carusone, "Modeling Oscillator Injection Locking Using the Phase Domain Response," in *IEEE Trans. Circuits Syst. I*, vol. 60, no. 11, pp. 2823-2833, Nov. 2013.
- [32] S. Ye, L. Jansson, and I. Galton, "A Multiple-Crystal Interface PLL With VCO Realignment to Reduce Phase Noise," in *IEEE J. Solid-State Circuits*, vol. 37, no. 12, pp. 1795-1803, Dec. 2002.
- [33] Y. Song, H.-G. Ko, C. Kim, and D.-K. Jeong, "A 1.05-to-3.2 GHz All-Digital PLL for DDR5 Registering Clock Driver With a Self-Biased Supply-Noise-Compensation Ring DCO," in *IEEE Trans. Circuits Syst. II*, vol. 69, no. 3, pp. 759-763, Mar. 2022.
- [34] A. Grenat, S. Pant, R. Rachala, and S. Naffziger, "5.6 Adaptive Clocking System for Improved Power Efficiency in a 28nm x86-64 Microprocessor," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2014, pp. 106-107.
- [35] B. Kim, W. Xu, and C. H. Kim, "A Supply-Noise Sensitivity Tracking PLL in 32nm SOI Featuring a Deep Trench Capacitor Based Loop Filter," in *IEEE J. Solid-State Circuits*, vol. 49, no. 4, pp. 1017-1026, Apr. 2014.
- [36] V. V. Kaenel, D. Aebischer, C. Piguet, and E. Dijkstra, "A 320 MHz, 1.5 mW at 1.35V CMOS PLL for Microprocessor Clock Generation," in *IEEE J. Solid-State Circuits*, vol. 31, no. 11, pp. 1715-1722, Nov. 1996.
- [37] K. Chang *et al.*, "A 0.4-4-Gb/s CMOS Quad Transceiver Cell Using On-Chip Regulated Dual-Loop PLLs," in *IEEE J. Solid-State Circuits*, vol. 38, no. 5, pp. 747-754, May 2003.

- [38] S. Sidiropoulos, D. Liu, J. Kim, G. Wei, and M. Horowitz, "Adaptive bandwidth DLLs and PLLs using regulated supply CMOS buffers," in *Proc. IEEE Symp. VLSI Circuits*, Jun. 2000, pp. C124–C127.
- [39] E. Alon, J. Kim, S. Pamarti, K. Chang, and M. Horowitz, "Replica Compensated Linear Regulators for Supply-Regulated Phase-Locked Loops," in *IEEE J. Solid-State Circuits*, vol. 41, no. 2, pp. 413-424, Feb. 2006.
- [40] A. Arakali, S. Gondi, and P. K. Hanumolu, "Low-Power Supply-Regulation Techniques for Ring Oscillators in Phase-Locked Loops Using a Split-Tuned Architecture," in *IEEE J. Solid-State Circuits*, vol. 44, no. 8, pp. 2169-2181, Aug. 2009.
- [41] K.-Y. J. Shen *et al.*, "19.4 A 0.17-to-3.5mW 0.15-to-5GHz SoC PLL with 15dB Built-In Supply Noise Rejection and Self-Bandwidth Control in 14nm CMOS," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2016, pp. 330-331.
- [42] C.-W. Yeh, C.-E Hsieh, and S.-I. Liu, "19.5 A 3.2GHz Digital Phase-Locked Loop with Background Supply-Noise Cancellation," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2016, pp. 332-333.
- [43] A. Elshazly, R. Inti, W. Yin, B. Young, and P. K. Hanumolu, "5.3 A 0.4-to-3GHz Digital PLL with Supply-Noise-Cancellation Using Deterministic Background Calibration," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2011, pp. 92-94.
- [44] S. S. Nagam, and P. R. Kinget, "A Low-Jitter Ring-Oscillator Phase-Locked Loop Using Feedforward Noise Cancellation With a Sub-Sampling Phase Detector," in *IEEE J. Solid-State Circuits*, vol. 53, no. 3, pp. 703-714, Mar. 2018.

- [45] C.-Y. Wu, R.-P. Shen, C.-H. Chang, K. Hsieh, and M. Chen, "A 0.031mm<sup>2</sup>, 910fs, 0.5-4GHz Injection Type SOC PLL with 90dB Built-in Supply Noise Rejection in 10nm FinFET CMOS," in *Proc. IEEE Custom Integr. Circuits Conf. (CICC)*, 2017.
- [46] M. Nagata, "Constant Current Circuits," (in Japanese) Japanese Patent 628 228: Japanese Examined Patent Pub. 46-16463B, May 6, 1971.
- [47] T. Abe, H. Tanimoto, and S. Yoshizawa, "A Simple Current Reference with Low Sensitivity to Supply Voltage and Temperature," in *Proc. 24th Int. Conf. Mixed Des. Integr. Circuits Syst.*, Jun. 2017, pp. 67-72.
- [48] M. Hirano, N. Tsukiji, and H. Kobayashi, "Simple Reference Current Source Insensitive to Power Supply Voltage Variation—Improved Minoru Nagata Current Source," in *Proc. 13th IEEE Int. Conf. Solid-State Integr. Circuit Technol.* (ICSICT), Oct. 2016, pp. 87-89.
- [49] W. Yin, R. Inti, A. Elshazly, B. Young, and P. K. Hanumolu, "A 0.7-to-3.5 GHz 0.6-to-2.8 mW Highly Digital Phase-Locked Loop with Bandwidth Tracking," in *IEEE J. Solid-State Circuits*, vol. 46, no. 8, pp. 1870-1880, Aug. 2011.
- [50] M. Hossain, W. El-Halwagy, A. D. Hossain, and Aurangozeb, "Fractional-N DPLL-Based Low-Power Clocking Architecture for 1-14 Gb/s Multi-Standard Transmitter," in *IEEE J. Solid-State Circuits*, vol. 52, no. 10, pp. 2647-2662, Jul. 2017.
- [51] Y. Song, and D.-K. Jeong, "Design of TDC-PFD in DDR5 RCD Application," in *Proc. 3rd Institute of Semiconductor Engineers (ISE) Conf. Tech. Papers*, Dec. 2020, p. S3-2.

- [52] T. Seong, Y. Lee, S. Yoo, and J. Choi, "25.4 A –242dB FOM and –75dBc-Reference-Spur Ring-DCO-Based All-Digital PLL Using a Fast Phase-Error Correction Technique and a Low-Power Optimal-Threshold TDC," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2018, pp. 396-398.
- [53] S. Lloyd, "Least Squares Quantization in PCM," in *IEEE Trans. Inf. Theory*, vol. 28, no. 2, pp. 129-137, Mar. 1982.
- [54] J. Liu *et al.*, "15.2 A 0.012 mm<sup>2</sup> 3.1 mW Bang-Bang Digital Fractional-N PLL with a Power-Supply-Noise Cancellation Technique and A Walking-One-Phase-Selection Fractional Frequency Divider," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2014, pp. 268-269.
- [55] Y.-C. Huang, C.-F. Liang, H.-S. Huang, and P.-Y. Wang, "15.3 2.4 GHz AD-PLL With Digital-Regulated Supply-Noise-Insensitive and Temperature Self-Compensated Ring DCO," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2014, pp. 270-271.
- [56] D. Kim and S. Cho, "A Supply Noise Insensitive PLL with a Rail-To-Rail Swing Ring Oscillator and a Wideband Noise Suppression Loop," in *Proc. IEEE Symp. VLSI Circuits*, Jun. 2017, pp. C180-C181.
- [57] Y. Song, K. Ha, H.-G. Ko, M.-S. Choo and D.-K. Jeong, "A –247.1 dB FoM, –77.9dBc Reference Spur Ring-Oscillator-Based Injection-Locked Clock Multiplier with Multi-Phase-Based Calibration," in *Proc. IEEE 48th European Solid-State Circuits Conference (ESSCIRC)*, Sep. 2022, pp. 249–252.
- [58] Y. Lee, H. Yoon, M. Kim, and J. Choi, "A PVT-Robust –59-dBc Reference Spur and 450-fs<sub>RMS</sub> Jitter Injection-Locked Clock Multiplier Using a Voltage-Domain Period-Calibrating Loop," in *Proc. IEEE Symp. VLSI Circuits*, Jun. 2016, pp. C190–C191.

- [59] S. Kim *et al.*, "29.7 A 2.5GHz Injection-Locked ADPLL with 197fs<sub>RMS</sub> Integrated Jitter and –65dBc Reference Spur Using Time-Division Dual Calibration," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2017, pp. 494-495.
- [60] K. M. Megawer *et al.*, "25.2 A 5GHz 370fs<sub>RMS</sub> 6.5mW Clock Multiplier Using a Crystal-Oscillator Frequency Quadrupler in 65nm CMOS," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2018, pp. 392-394.
- [61] A. Elkholy, D. Coombs, R. K. Nandwana, A. Elmallah, and P. K. Hanumolu, "A 2.5–5.75-GHz Ring-Based Injection-Locked Clock Multiplier with Background-Calibrated Reference Frequency Doubler," in *IEEE J. Solid-State Circuits*, vol. 54, no. 7, pp. 2049-2058, Jul. 2019.
- [62] S. Yoo *et al.*, "A Low-Jitter and Low-Reference-Spur Ring-VCO-Based Injection-Locked Clock Multiplier Using a Triple-Point Background Calibrator," in *IEEE J. Solid-State Circuits*, vol. 56, no. 1, pp. 298-309, Jan. 2021.
- [63] M.-S. Choo, H.-G. Ko, S.-Y. Cho, K. Lee, and D.-K. Jeong, "An Optimum Injection-Timing Tracking Loop for 5-GHz, 1.13-mW/GHz RO-Based Injection-Locked PLL With 152-fs Integrated Jitter," in *IEEE Trans. Circuits Syst. II*, vol. 65, no. 12, pp. 1819-1823, Dec. 2018.
- [64] D.-H. Oh, D.-S. Kim, S. Kim, D.-K. Jeong, and W. Kim, "29.7 A 2.8Gb/s All-Digital CDR with a 10b Monotonic DCO," in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2007, pp. 222-223 and 598.

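Abstract (translated from the Korean)

As the demand for wide-band operation increases and sub-rate architectures for reducing power consumption become promising, a ring oscillator with a wide frequency range and a multiphase generation capability is a promising replacement for the LC resonator. However, the ring oscillator, whose frequency is determined by the propagation delay of active devices, is vulnerable to supply noise and has inferior phase noise performance compared with the LC resonator. This dissertation proposes solutions to overcome the two major drawbacks of ring-oscillator-based clock generators, and each solution is verified through a chip design.

First, a ring-oscillator-based phase-locked loop for the DDR5 RCD with a self-biased supply-noise-compensation technique is presented. Considering the prerequisites applicable to the DDR5 RCD, an open-loop supply-noise-compensating ring oscillator that achieves a low frequency sensitivity with a large static voltage margin is proposed. Because this technique operates independently of the bandwidth of the phase-locked loop without using feedback, it is free from the stability problem associated with bandwidth overlapping and maintains the supply-noise-compensation performance regardless of the operating environment. In addition, it does not require a start-up circuit and does not degrade the stabilization time. Quantitative analyses of the static and dynamic characteristics of the proposed supply-noise-compensation technique and design-oriented considerations are addressed. Measurement results of the prototype chip fabricated in a 28-nm CMOS process show that it satisfies the prerequisites related to the supply-noise-compensation technique for RCD products. The phase-locked loop achieves the best PSNA performance of 40 dB and maintains a PSNA performance of over 20 dB up to 10 MHz. In the case of random supply noise, the RMS jitter performance is improved by about 65 % on average. The phase-locked loop consumes 12.1 mW at 3.0 GHz operation and achieves an RMS jitter of 271 fs in an environment without supply noise.

Second, an injection-locked clock multiplier with a new background calibration technique that exploits the inherent multiphase generation capability of the ring oscillator is presented. The injection-locking technique is used to achieve a high suppression bandwidth for reducing the noise of the ring oscillator. However, it requires a two-point calibration, which is essential for guaranteeing normal operation and securing excellent jitter performance. The frequency error corrector, which operates at the injection pulse rate, achieves a high frequency-error-correction bandwidth and the full injection effect at the same time, further suppressing the flicker noise of the ring oscillator and contributing to a much lower jitter. The path offset corrector makes the injection-locked clock multiplier with the multiphase-based calibration technique converge to the position of the minimum reference spur, where the frequency error remaining after the calibration is minimized. A time-domain analysis of the injection-locking operation in the steady state and the detailed operation of the multiphase calibrator are addressed. The multiphase calibrator fabricated in a 28-nm CMOS process demonstrates a ring-oscillator-based injection-locked clock multiplier with an FoM of -247.1 dB, a low jitter of 143.6 fs, and a low reference spur of -77.9 dBc at 4.8 GHz operation. The multiphase calibrator maintains a successful injection-locked state, so that the RMS jitter and reference spur performance are maintained under supply voltage variations.

**Keywords**: phase-locked loop, injection-locked clock multiplier, supply noise compensation, injection locking, two-point calibration, multiphase calibrator, RCD **Student number**: 2018-29700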