



**Ph.D. Dissertation** 

# Design of High-Speed and Low-Power Internal Display Interface


by

**Kwang-Hoon Lee** 

August, 2023

Department of Electrical and Computer Engineering, College of Engineering, Seoul National University

## **Design of High-Speed and Low-Power Internal Display Interface**

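Advisor: Professor Deog-Kyoon Jeong

Submitted as a dissertation for the degree of Doctor of Philosophy in Engineering, August 2023

Graduate School of Seoul National University, Department of Electrical and Computer Engineering, Kwang-Hoon Lee

Approved as the doctoral dissertation of Kwang-Hoon Lee, August 2023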



## **Design of High-Speed and Low-Power Internal Display Interface**

by

Kwang-Hoon Lee

A Dissertation Submitted to the Department of Electrical and Computer Engineering in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

at SEOUL NATIONAL UNIVERSITY

August, 2023

Committee in Charge:

Professor Suhwan Kim, Chairman

Professor Deog-Kyoon Jeong, Vice-Chairman

Professor Woo-Seok Choi

Professor Yongsam Moon

Professor Jun-Eun Park

### Abstract

This thesis explains the major concerns in the architecture of an internal display interface. Considering the limited battery capacity of mobile phones and the increasing amount of required data, the interface should be designed for both high-speed and low-power operation.

In the first prototype design, a 10 Gb/s/lane transceiver is presented. In the transmitter (TX), a pseudo serializer and a 2:1 ISI-mitigating MUX are proposed to mitigate inter-symbol interference (ISI) while improving power efficiency. The proposed serializer reduces the clock distribution of the conventional serializer to save power, and the proposed MUX pre-charges or pre-discharges the floating nodes of the tristate inverter to erase the previous data and mitigate ISI. In the receiver (RX), a hybrid loop is employed, which initially operates as a digital loop. After frequency detection, the digital loop is deactivated, and the analog loop is activated to eliminate the remaining frequency and phase errors. The digital loop makes unlimited frequency detection possible, and the analog loop achieves better power efficiency by deactivating the edge deserializer (DES) and the digital loop filter (DLF). The prototype chip is fabricated in 28-nm CMOS technology and occupies an active area of 0.196 mm<sup>2</sup>. The TX, RX, and PLL occupy 0.026 mm<sup>2</sup>, 0.066 mm<sup>2</sup>, and 0.012 mm<sup>2</sup>, respectively. The overall transceiver achieves an energy efficiency of 1.23 pJ/b.

In the second prototype design, a 10 Gb/s receiver is proposed that is capable of fast frequency acquisition in the initial mode and of quickly recovering its operating frequency from the sleep mode under supply voltage drift. The linear characteristic of the frequency gain curve is used to adjust the initial digital code using a finite-state machine (FSM). Furthermore, a hybrid CDR is employed to support fast entry into and exit from the sleep mode by adding AND gates to the digital loop filter, while offering good jitter performance by utilizing an analog loop filter. In addition, a supply voltage drift cancellation (SVDC) circuit is added to maintain a constant current in the presence of supply voltage drift. Thanks to the hybrid CDR and SVDC, even if supply voltage drift occurs during the sleep mode, the same frequency is recovered quickly without frequency re-tracking. A prototype chip fabricated in 28-nm CMOS technology occupies an active area of 0.089 mm<sup>2</sup> with 0.99-pJ/bit energy efficiency in the active mode. The proposed fast tracking method achieves a frequency lock time of 0.37 µs, compared with the conventional frequency lock time of 3.02 µs. In the sleep mode, the measured results show that the frequency is recovered within 36 ns even if the worst-case supply voltage drift occurs during the sleep mode.

**Keywords:** Internal display interface, transmitter, receiver, clock and data recovery (CDR), fast frequency acquisition, supply voltage drift, sleep mode, wake-up, fast recovery

**Student Number:** 2019-25389

## Contents

- Abstract
- Contents
- List of Figures
- List of Tables
- Chapter 1 Introduction
  - 1.1 Motivation
  - 1.2 Thesis Organization
- Chapter 2 Backgrounds
  - 2.1 Overview
  - 2.2 Internal Display Interface
    - 2.2.1 Overview
    - 2.2.2 AP-to-TED Interface
      - 2.2.2.1 Flexible Printed Circuit Board (FPCB)
- Chapter 3 Design of High-Speed and Low-Power Transceiver
  - 3.1 The Design of Transmitter
    - 3.1.1 Proposed Pseudo Serializer
    - 3.1.2 Proposed ISI-Mitigating MUX
    - 3.1.3 Overall Structure of Transmitter
  - 3.2 The Design of Receiver
    - 3.2.1 Overview
    - 3.2.1 Proposed Referenceless Hybrid Loop CDR
    - 3.2.2 Overall Structure of Receiver
    - 3.3.3 Circuit Implementation
      - 3.3.3.1 Continuous-Time Linear Equalizer (CTLE)
      - 3.3.3.2 Digital Loop Filter (DLF)
  - 3.4 Measurement Results
- Chapter 4 Receiver with Fast Frequency Acquisition in Active Mode and Fast Recovery from Sleep Mode under Voltage Drift
  - 4.1 Overview
  - 4.2 Proposed Frequency Detector
    - 4.2.1 Prior Work
    - 4.2.2 Fast Frequency Acquisition Using Linearity Function
  - 4.3 Proposed Hybrid CDR with Fast Recovery from Sleep Mode under Voltage Drift
    - 4.3.1 Hybrid Loop CDR with SVDC
      - 4.3.1.1 Motivation
      - 4.3.1.2
  - 4.5 Circuit Implementation
    - 4.5.1 Overall Structure
    - 4.5.2 Command Controller
  - 4.6 Measurement
    - 4.6.1 Fast Frequency Tracking
    - 4.6.2 Frequency Recovery from Sleep Mode under Supply Voltage Drift
- Chapter 5 Conclusions
- Bibliography
- Abstract (in Korean)

## **List of Figures**

- Fig. 1.1 The display resolution changes of the Galaxy-S series [1]
- Fig. 1.2 Overall architecture of mobile application [9]
- Fig. 2.1 Depiction of parallel communication and serial communication
- Fig. 2.2 Simplified block diagram of serial link
- Fig. 2.3 Block diagram of synchronous clocking architecture - forwarded clocking
- Fig. 2.4 Block diagram of mesochronous clocking architecture - forwarded clocking
- Fig. 2.5 Block diagram of mesochronous clocking architecture - common reference clock
- Fig. 2.6 Block diagram of plesiochronous clocking architecture
- Fig. 2.7 Timing diagram of clocking scheme: (a) full-rate (b) half-rate (c) quarter-rate
- Fig. 2.8 Description and specification of LVDS (low voltage differential signaling) [28]
- Fig. 2.9 Simplified block diagram of internal DisplayPort (iDP) PHY electrical sub-layer [12]
- Fig. 2.10 MIPI multimedia specification [9]
- Fig. 2.11 A conceptual view and description of the layers of DSI [10]
- Fig. 2.12 The per-pin data rate versus version for D-PHY
- Fig. 2.13 Overall architecture of AP-to-TED interface
- Fig. 2.14 Example illustration of flexible printed circuit board [29]
- Fig. 3.1 Overall structure of 2:1 serializer
- Fig. 3.2 Overall structure of proposed pseudo serializer
- Fig. 3.3 Structure of 2-to-1 MUX and timing diagram
- Fig. 3.4 The schematic of conventional MUX
- Fig. 3.5 Proposed ISI-mitigating MUX
- Fig. 3.6 Timing diagram of ISI-mitigating MUX
- Fig. 3.7 Simulation results of (a) conventional MUX (b) ISI-mitigating MUX
- Fig. 3.8 Overall structure of proposed transmitter
- Fig. 3.9 Proposed hybrid oscillator whose frequency is controlled by 4 different types of tail currents
- Fig. 3.10 Frequency locking procedure of the proposed CDR: (a) digital loop activation, and (b) analog loop activation
- Fig. 3.11 Simulation result of frequency locking procedure
- Fig. 3.12 The overall structure of proposed receiver
- Fig. 3.13 The circuit description of CTLE
- Fig. 3.14 Frequency response of CTLE [post-simulation results]
- Fig. 3.15 The components of a digital loop filter
- Fig. 3.16 Die microphotograph
- Fig. 3.17 Power breakdown of transceiver
- Fig. 3.18 Measurement setup of proposed transceiver
- Fig. 3.19 Measured eye diagram of transmitter output
- Fig. 3.20 Measured jitter tolerance curve (BER $< 10^{-12}$)
- Fig. 3.21 Measured BER bathtub curve
- Fig. 3.22 Measured frequency acquisition behavior
- Fig. 3.23 Measured 5-GHz recovered clock
- Fig. 4.1 Conceptual diagram of proposed receiver
- Fig. 4.2 Concept of stochastic FPD [48]
- Fig. 4.3 Flow chart of design techniques of stochastic frequency-phase detector [48]
- Fig. 4.4 Achieved (a) phase detection gain curve (b) frequency detection gain curve [48]
- Fig. 4.5 The main stages of FSM
- Fig. 4.6 Simulated frequency detection gain curve by varying the averaging time
- Fig. 4.7 The concept of the frequency tracking in the FD gain curve
- Fig. 4.8 Simulation results of error accumulation and calculation
- Fig. 4.9 The simulation result of frequency transient behavior
- Fig. 4.10 The simulation results of the locking time between the conventional frequency tracking and proposed frequency tracking
- Fig. 4.11 Simulated oscillator frequency sensitivity to A<sub>CTRL</sub> for various supply voltages (a) without SVDC, (b) with SVDC
- Fig. 4.12 Simulated current changes versus a change in VDD<sub>OSC</sub>
- Fig. 4.13 Hybrid oscillator with SVDC
- Fig. 4.14 Circuit implementation of supply voltage drift cancellation
- Fig. 4.15 The simulation results of (a) bias voltage change in a BVG (b) current of frequency tuning cell change versus a change in VDD<sub>OSC</sub>
- Fig. 4.16 The simulation results of current change versus a change in VDD<sub>OSC</sub> for …
- Fig. 4.17 The simulation results of current change versus a change in VDD<sub>OSC</sub> for …
- Fig. 4.18 Current switch on (a) current bias of CTLE (b) tail current of hybrid oscillator
- Fig. 4.19 Simulation result for current variation on active mode, sleep mode, wake-up
- Fig. 4.20 Simulation results for the process of storing and restoring code
- Fig. 4.21 Overall structure of the proposed receiver
- Fig. 4.22 Circuit description of the HOSC with SVDC including the simulation results of SVDC
- Fig. 4.23 The simulation results of behavior modeling of command controller
- Fig. 4.24 Die microphotograph
- Fig. 4.25 Measurement setup for (a) receiver performance (b) SVDC
- Fig. 4.26 Detailed measurement setup for verifying receiver performance
- Fig. 4.27 Power breakdown of active mode and sleep mode
- Fig. 4.28 Measured jitter tolerance curve
- Fig. 4.29 Measured frequency acquisition behaviors before post-processing @ 10 Gb/s PRBS-7 with varying initial HOSC frequency (a) 4.55 GHz, (b) 4.37 GHz, (c) 4.87 GHz
- Fig. 4.30 Measured frequency acquisition behaviors with varying initial DCO frequency (a) 4.55 ~ 5.83 GHz, (b) 4.37 ~ 5.65 GHz, (c) 4.87 ~ 6.15 GHz
- Fig. 4.31 Measured frequency acquisition behaviors with varying averaging time
- Fig. 4.32 Measured frequency of free-running HOSC
- Fig. 4.33 Post-processed waveform of frequency recovery after sleep mode. (a) Initial active mode voltage: 1 V and (b) 0.95 V, 1.05 V

## **List of Tables**

- Table 3.1 Comparison of the proposed transceiver with prior design
- Table 4.1 Comparison table with other receivers supporting the sleep …

## Chapter 1

### Introduction

### **1.1 Motivation**

The demand for high-performance displays in modern smartphones is on the rise. Implementing a high-performance display involves three considerations.

The first factor is the display resolution, which denotes the number of pixels across the width and height of the screen. A high-performance display has a larger number of pixels. Fig. 1.1 shows the resolution changes in the Galaxy-S series [1]. The series started from WVGA (800×480) and gradually transitioned to HD (1280×720), FHD (1920×1080), and QHD (2560×1440) with successive product releases.



Fig. 1.1 The display resolution changes of the Galaxy-S series [1]

The second factor to be considered is color depth. This parameter indicates the number of colors that the display can represent, and high-performance displays can exhibit a diverse and natural range of colors. For instance, in the case of black and white televisions, black is represented as 0 and white as 1, enabling the display to represent only two colors using 1 bit. In color televisions, red (R), green (G), blue (B) are used as the primary colors for subpixels, which can be combined to represent a range of colors.

The third factor is the frame rate. In video, multiple still images are arranged sequentially to create the illusion of motion. Each still image in this sequence is referred to as a frame, and the number of frames displayed per second is known as the frame rate. A higher frame rate means that more still images need to be processed per second to create a smooth and seamless video experience. Thus, in order to achieve high resolution, large color depth, and high frame rates, it is essential to have interfaces that can handle a large amount of data.
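As a rough illustration of how the three factors multiply into the required interface bandwidth, the following sketch computes the uncompressed pixel data rate for the resolutions listed for Fig. 1.1. The 24-bit color depth and 60 Hz frame rate are assumed values chosen only for illustration, the function names are hypothetical, and blanking intervals and protocol overhead are ignored.

```python
def raw_bandwidth_bps(width, height, bits_per_pixel, frame_rate_hz):
    """Uncompressed pixel data rate in bits per second.

    Blanking intervals and protocol overhead are ignored (simplifying assumption).
    """
    return width * height * bits_per_pixel * frame_rate_hz


def num_colors(bits_per_pixel):
    """Number of distinct colors representable with the given color depth."""
    return 2 ** bits_per_pixel


# Resolutions named in the text for the Galaxy-S series.
RESOLUTIONS = {
    "WVGA": (800, 480),
    "HD": (1280, 720),
    "FHD": (1920, 1080),
    "QHD": (2560, 1440),
}

BITS_PER_PIXEL = 24  # assumption: 8 bits each for R, G, and B
FRAME_RATE_HZ = 60   # assumption: illustrative frame rate

for name, (width, height) in RESOLUTIONS.items():
    gbps = raw_bandwidth_bps(width, height, BITS_PER_PIXEL, FRAME_RATE_HZ) / 1e9
    print(f"{name}: {gbps:.2f} Gb/s")
```

Under these assumptions, the required raw data rate grows by roughly a factor of ten from WVGA to QHD, before any increase in color depth or frame rate is taken into account.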

There are two interfaces for the display of mobile applications. One is the intra-panel interface, which connects the timing controller (T-CON) to the source driver (SD). The other is the system interface, which connects the application processor (AP) to the T-CON.

Many studies have been reported on the intra-panel interface [2]-[8], and there is a need to increase the data rates of the system interface to keep up with the speed of the intra-panel interface.

Not only high data rates but also low power consumption should be considered. One of the representative interfaces for the display of mobile applications is MIPI (Mobile Industry Processor Interface). Fig. 1.2 shows the overall architecture of MIPI [9]. The AP must control not only the display but also various other systems and interfaces. For this reason, and considering the limited battery life of smartphones, the interface should operate as a low-power system. One of the most attractive approaches is to turn off a module when it is not in use. Much research has been conducted on links that turn off when not in use and switch rapidly between the sleep and wake-up modes, as specified in various standards such as PON and DisplayPort [11]-[22].

With this motivation, this dissertation presents a transceiver from the AP to the timing-controller-embedded driver (TED) with two main objectives. First, a high-speed transceiver is proposed to meet the required total bandwidth while reducing the number of lanes, thereby achieving cost benefits. Second, a receiver that supports sleep mode is proposed to implement a low-power system, incorporating designs for faster frequency acquisition and frequency recovery in each state: initial active mode, sleep mode, and wake-up state.



Fig. 1.2 Overall architecture of mobile application [9]

### **1.2 Thesis Organization**

This thesis is organized as follows. Chapter 2 describes the background of the serial link and the internal display interface, together with the overall structure of the proposed AP-to-TED interface and its requirements.

In Chapter 3, a high-speed and low-power transmitter and receiver are proposed. In the transmitter (TX), the proposed pseudo serializer minimizes clocking power to reduce the total power, and the ISI-mitigating MUX minimizes the interference from previous data. In the receiver (RX), a hybrid clock and data recovery (CDR) is proposed for a low-power system. The hybrid loop consists of a digital loop and an analog loop. The digital loop is activated first to find the digital codes. The analog loop is then activated, and power consumption is reduced by blocking the clock path of the digital loop filter (DLF) and deactivating the DLF and the edge deserializer (DES). Circuit implementation and measurements are shown.

In Chapter 4, a 10 Gb/s receiver supporting sleep mode is presented. Receiver operation can be divided into three stages: 1) initial active mode, 2) sleep mode, and 3) wake-up.

In the initial active mode, a new frequency acquisition method is proposed that utilizes the linearity of the frequency gain curve. The proposed method uses the error value, which depends on the frequency difference, to adjust the code corresponding to the initial frequency. A finite-state machine (FSM) is used to adjust the code, and the process is described. Simulation results show that the method achieves a faster lock time.

In sleep mode, a design for turning off the module to bring its power to zero is presented. Switches are added to each block, and the resulting change in power is described through simulation.

After wake-up, the frequency is quickly recovered through the proposed hybrid loop CDR with supply voltage drift cancellation (SVDC). By applying the SVDC technique to the digitally controlled current of the hybrid oscillator (HOSC), fast frequency recovery from sleep mode is achieved even if supply voltage drift occurs during sleep mode. Circuit implementation and measurement results are shown to verify the proposed scheme.

Chapter 5 summarizes the proposed works and concludes this thesis.

## Chapter 2

### Backgrounds

### **2.1 Overview**

There are two types of chip-to-chip communication: parallel and serial. Parallel communication sends multiple data signals over multiple channels. As data speeds increase, parallel communication requires more channels, resulting in increased cost and more physical layers. In addition, as clock speeds increase, EMI and propagation delay issues can arise.

On the other hand, serial communication sends data one bit at a time, sequentially, requiring only one channel and fewer physical layers than parallel communication.

Fig. 2.1 shows a depiction of parallel and serial communication, where parallel interfaces require a total of 8 channels to transmit 1 byte (8 bits) of information, whereas serial communication can transmit on just one channel. This lower number of channels results in a relatively lower cost.

Fig. 2.1 Depiction of parallel communication and serial communication

Fig. 2.2 Simplified block diagram of serial link

Future mobile applications will require low-cost, small-area interfaces with high data rates. For these reasons, serial communication is commonly employed.

A serial link is an interface for serial communication. Fig. 2.2 shows the simplified block diagram of the serial link. It consists of a serializer (SER) that converts parallel data into a serial stream to be transmitted, a deserializer (DES) that converts the received serial data back into parallel data, an equalizer that compensates for the insertion loss due to the skin effect and dielectric loss incurred during transmission through the channel, an analog front-end (AFE) that samples the data that has passed through the channel, and a clocking scheme that synchronizes the operation of each circuit.
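The roles of the SER and DES can be illustrated with a purely behavioral sketch. It is not a model of the circuits in this thesis: the function names, the 8-bit word width, and the LSB-first bit order are assumptions made for illustration, and the channel, equalizer, and clocking are left out entirely.

```python
def serialize(words, width=8):
    """Parallel-to-serial conversion: emit each word one bit at a time.

    Bits are sent LSB first (an assumed bit order for this sketch).
    """
    bits = []
    for word in words:
        for i in range(width):
            bits.append((word >> i) & 1)
    return bits


def deserialize(bits, width=8):
    """Serial-to-parallel conversion: regroup the bit stream into words."""
    if len(bits) % width != 0:
        raise ValueError("bit stream length is not a multiple of the word width")
    words = []
    for start in range(0, len(bits), width):
        word = 0
        for i, bit in enumerate(bits[start:start + width]):
            word |= bit << i
        words.append(word)
    return words


# Two bytes travel over a single channel as 16 consecutive bits
# and are recovered unchanged at the receiver.
tx_words = [0xA5, 0x3C]
stream = serialize(tx_words)
rx_words = deserialize(stream)
```

The single list `stream` stands in for the one physical channel: the same data that would need eight wires in parallel is carried on one wire at eight times the per-wire bit rate.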

To receive data accurately, it is essential to have synchronized clocks between the transmitter and receiver, or the ability to operate asynchronously. The clocking architecture used in SERDES systems is categorized according to the synchronization scheme used between the transmitter and receiver clocks. In SERDES architecture, three main types of clock synchronization are predominantly employed: synchronous, mesochronous, and plesiochronous.

Synchronous clocking architecture is a clocking scheme in which there are no frequency or phase differences between the receiver clock and the transmitted data. This means that there is no need for additional clock phase adjustments during data sampling. Typically, synchronous clocking is implemented using a forwarded-clocking architecture, as shown in Fig. 2.3. It is crucial to ensure that the delays of the data channel and the clock channel, as well as the delays on the transmitter side and the receiver side, are matched or within a reasonable range. By tightly controlling these delays, synchronous clocking architecture can provide a simple and cost-effective solution for constructing a SERDES interface. However, there are challenges associated with implementing synchronous clocking architecture in modern SERDES interfaces. As the data rate increases, the timing margin for robust data sampling decreases while the clock skew (variation in clock arrival times) and propagation delay (time taken for signals to propagate through the circuit) remain constant. Moreover, synchronous clocking architecture is sensitive to delay variations. For these reasons, synchronous clocking is of limited use in interfaces that require high data rates.

Fig. 2.3 Block diagram of synchronous clocking architecture - forwarded clocking

Fig. 2.4 Block diagram of mesochronous clocking architecture - forwarded clocking

A mesochronous clocking architecture is characterized by having only a phase difference between the receiver clock and the transmitted data. This clocking scheme is commonly associated with systems using a forwarded clocking architecture. Fig. 2.4 shows an example of a typical mesochronous SERDES system using a forwarded clocking architecture. In a mesochronous clocking scheme, the receiver clock frequency is the same as the transmitter (TX) frequency. Therefore, only a delay adjustment is required to achieve optimal sampling timing. This can be done manually or automatically, depending on the implementation. If a SERDES system includes a pretraining sequence, clock and data recovery (CDR) is not necessary because the controller can perform the delay adjustment. In such cases, a variable delay line or a phase interpolator (PI) is sufficient. However, if the system lacks a controller, CDR becomes essential. It can be achieved using a simple deskewing circuit such as a delay-locked loop (DLL) or a PI.

A mesochronous system can alternatively use a common clock architecture instead of a forwarded clocking architecture, as shown in Fig. 2.5. In this case, the receiver employs a phase-locked loop (PLL) to generate a clock signal with the same frequency as the transmitter. The clock phase is then adjusted using CDR with the deskewing circuit mentioned earlier.



Fig. 2.5 Block diagram of mesochronous clocking architecture - common reference clock



Fig. 2.6 Block diagram of plesiochronous clocking architecture

Despite the need for an additional clock channel or external reference clock generator, mesochronous clocking architecture is widely used in certain applications due to its simple CDR structure.

In plesiochronous clocking architecture, the transmitted data and the receiver clock exhibit a slight frequency difference. Unlike synchronous systems with a common clock, the transmitter and receiver in this architecture generate their own clocks from their respective reference clocks, as shown in Fig. 2.6. As a result, the clocks cannot be perfectly synchronized, leading to a slight frequency disparity. This frequency difference gives rise to a continuous drift in the relative phase of the clocks. While an elastic buffer can be employed to mitigate this problem, it is effective only when data attenuation is minimal. In most cases, the receiver must regenerate attenuated data, which necessitates clock and data recovery (CDR). Unlike mesochronous systems, plesiochronous systems demand tracking of both phase and frequency. Consequently, the design of the receiver becomes more challenging in plesiochronous systems.
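The continuous phase drift of a plesiochronous link can be quantified with a short calculation. The sketch below is illustrative only: the 100 ppm offset is an assumed value and the function names are hypothetical, not taken from this work.

```python
def drift_ui(offset_ppm, n_bits):
    """Accumulated phase drift, in unit intervals (UI), after n_bits.

    offset_ppm is the TX/RX frequency offset in parts per million
    (an assumed input, not a value reported in this thesis).
    """
    return offset_ppm * 1e-6 * n_bits


def bits_per_ui_slip(offset_ppm):
    """Number of bits after which the sampling phase has slipped by one full UI."""
    return 1.0 / (abs(offset_ppm) * 1e-6)


if __name__ == "__main__":
    ppm = 100.0           # assumed offset, for illustration only
    data_rate_bps = 10e9  # 10 Gb/s per lane
    bits = bits_per_ui_slip(ppm)
    print(f"one UI of drift every {bits:.0f} bits "
          f"({bits / data_rate_bps * 1e6:.2f} us at 10 Gb/s)")
```

Under these assumptions the sampling point slips by a full UI roughly every ten thousand bits, which is why a one-time phase adjustment is not sufficient and the receiver has to track frequency as well as phase.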

Therefore, clocking architectures should be chosen based on the target application, taking into consideration their respective advantages and disadvantages.

In addition to the clocking architecture between the transmitter and receiver, the selection of the appropriate clocking scheme in TX and RX should also be taken into consideration.

There are three main clocking schemes commonly used in serial links: full-rate, half-rate, and quarter-rate. The full-rate scheme aligns data on the rising edge of the clock, as shown in Fig. 2.7 (a). Retiming the data in the final stage is advantageous in terms of jitter, but the scheme requires a fast clock and consumes a lot of power. To overcome these disadvantages, the half-rate architecture has been proposed, as shown in Fig. 2.7 (b). It is commonly used because it consumes less power and relaxes the timing budget. However, if the duty cycle of the clock is distorted, horizontal (timing) jitter may occur. The quarter-rate scheme, shown in Fig. 2.7 (c), transmits high-speed data by using the rising edges of four low-speed clocks. However, maintaining a constant 90° phase difference between the four clocks can be challenging. Because each clocking structure has its own pros and cons, the clocking architecture should be selected based on factors such as data rate, power requirements, and other relevant considerations.
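The relation between data rate and clock frequency for the three schemes can be sketched as follows. The function and dictionary names are hypothetical; the only assumption is that full-, half-, and quarter-rate clocking launch one, two, and four bits per clock period, respectively.

```python
# Bits launched per clock period in each scheme described above.
BITS_PER_CLOCK_PERIOD = {"full-rate": 1, "half-rate": 2, "quarter-rate": 4}


def clock_plan(data_rate_gbps):
    """Return the clock frequency and sampling-edge spacing for each scheme."""
    plan = {}
    for scheme, bits in BITS_PER_CLOCK_PERIOD.items():
        plan[scheme] = {
            "clock_ghz": data_rate_gbps / bits,
            # Phase spacing between consecutive launch edges within one period.
            "edge_spacing_deg": 360.0 / bits,
        }
    return plan


if __name__ == "__main__":
    for scheme, info in clock_plan(10.0).items():
        print(scheme, info)
```

For a 10 Gb/s lane, this gives a 10 GHz full-rate clock, a 5 GHz half-rate clock, and a 2.5 GHz quarter-rate clock, which illustrates the trade between clock speed and the number of edges or phases that must be kept accurate.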



Fig. 2.7 Timing diagram of clocking scheme: (a) full-rate (b) half-rate (c) quarter-rate

#### **2.2 Internal Display Interface**

#### 2.2.1 Overview

The internal display interface primarily serves as the interface for communication between the main system-on-a-chip (SoC) and the timing controller (T-CON). It is divided into different types depending on the target application. Among them, LVDS was first introduced in 1994 by ANSI/TIA/EIA-644 and has been used in a wide variety of applications. Fig. 2.8 shows the description and specification of LVDS [28]. It transmits serial data with a low voltage and a small differential swing, which makes it resistant to noise and reduces power consumption. However, the data rate per lane is limited, so more lanes are needed as the required data rate increases. As a result, different interfaces have been developed as alternatives for each application.

One of these, iDP, was developed by VESA in 2010 for DisplayPort applications [11]. The iDP standard defines an internal link between a digital TV system-on-chip (SoC) controller and the display panel's timing controller. Compared to LVDS, iDP offers the advantage of higher data rates per lane, fewer wires, and better EMI characteristics. Fig. 2.9 shows a basic block diagram of iDP. In iDP, the TX utilizes pre-emphasis to compensate for the frequency-dependent insertion loss of the channel, which helps clock recovery and symbol lock at the sink device. The RX, in turn, employs optional equalization (EQ) and CDR to recover the received signal.

Unlike large displays such as TVs and monitors, mobile applications utilize the MIPI standard, established by the MIPI Alliance (ARM, Intel, Nokia, Samsung, STMicroelectronics, TI) in 2003. The alliance established interfaces between processors, including the AP, and peripheral devices, with the objectives of achieving cost-effective flexibility through component compatibility among manufacturers, high speed, and low power. Each device utilizes a different physical layer (PHY) for communication with the AP. Fig. 2.10 shows the protocol and physical layer for each peripheral device [9]. Specifically, the display interface in smartphones employs the display serial interface (DSI). Fig. 2.11 illustrates the conceptual view and description of the layers of DSI [10]. Based on the conceptual diagram, the host device or AP sends information related to pixels or commands to the peripheral devices, and receives status or pixel information back from the devices. During this process, DSI serializes all pixel data, commands, and events that, in traditional or legacy interfaces, are typically conveyed to or from the peripheral device using a parallel data bus with additional control signals. As mentioned earlier, the amount of data being transmitted from the AP has been increasing, necessitating higher speeds for the links handled by the DSI. Fig. 2.12 illustrates the per-lane data rate of the D-PHY according to different versions. D-PHY 3.0 has been developed to support a maximum transmission speed of 9.0 Gbps, and up to 11 Gbps for short channels. Additionally, as the data rate increases, a continuous-time linear equalizer (CTLE) has been implemented to compensate for channel loss.

Fig. 2.8 Description and specification of LVDS (low voltage differential signaling) [28]

Fig. 2.9 Simplified block diagram of internal DisplayPort (iDP) PHY Electrical Sub-Layer [12]

The future direction of internal interfaces for mobile applications can be summarized into two main aspects. Firstly, as the amount of information that needs to be processed continues to increase, it is crucial to implement high data rates. Secondly, low-power design should be employed for long battery life. Considering these factors, it is essential to appropriately design the clocking scheme, TX, RX, and EQ.



Fig. 2.10 MIPI Multimedia specification [9]



Fig. 2.11 A conceptual view and description of the layers of DSI [10]



Fig. 2.12 The per-pin data rate versus version for D-PHY

#### 2.2.2 AP-to-TED interface

In high-resolution displays, a large amount of data must be forwarded from the T-CON to the source driver IC (SD) through a so-called intra-panel interface. Meeting this requirement calls for more flexible printed circuits and printed circuit boards, resulting in higher costs. To address these challenges, many studies have been conducted to achieve both high data rates and power efficiency [2]-[8], [23]-[27]. One approach is the T-CON embedded driver (TED), which combines the T-CON and SD into a single chip to reduce the number of channels and the power consumption [25]. Accordingly, the data rate of the AP-to-TED interface must be increased to keep up with the speed of the intra-panel interface.

Fig. 2.13 describes the overall architecture of the AP-to-TED interface, which is composed of an AP, a TED, and a flexible printed circuit board (FPCB). To determine the architecture of the TX in the AP and the receiver in the TED, it is essential to have a precise understanding of the characteristics of the FPCB.



Fig. 2.13 Overall architecture of AP-to-TED interface

#### **2.2.2.1 Flexible Printed Circuit Board (FPCB)**

Fig. 2.14 depicts an FPCB [29]. For security reasons, the figure illustrates a different FPCB from the one actually used. As with other channels, the loss of the FPCB increases with the data rate. This is because the loss is frequency-dependent, caused by the skin effect and dielectric loss. Compensating for higher channel losses requires higher circuit complexity and more power consumption.

Therefore, the objective of this work is to implement a low-power system with a data rate of 10 Gb/s per lane, surpassing the previously published D-PHY standard. The channel losses measured at the Nyquist frequency of 5 GHz for channel lengths of 7 cm, 10 cm, 15 cm, and 32 cm are 6.5 dB, 10.4 dB, 10.9 dB, and 16.9 dB, respectively.
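To put the measured losses in perspective, the following sketch converts each loss figure into the fraction of signal amplitude that survives the channel at the Nyquist frequency. The loss values are the ones quoted above; the helper names are hypothetical, and the conversion is the usual voltage-ratio definition of the decibel.

```python
# Measured FPCB loss at 5 GHz, keyed by channel length in cm (values from the text).
MEASURED_LOSS_DB = {7: 6.5, 10: 10.4, 15: 10.9, 32: 16.9}


def nyquist_ghz(data_rate_gbps):
    """Nyquist frequency of an NRZ stream: half the bit rate."""
    return data_rate_gbps / 2.0


def amplitude_ratio(loss_db):
    """Fraction of the voltage amplitude remaining after loss_db of attenuation."""
    return 10.0 ** (-loss_db / 20.0)


if __name__ == "__main__":
    print(f"Nyquist frequency at 10 Gb/s: {nyquist_ghz(10.0)} GHz")
    for length_cm, loss_db in MEASURED_LOSS_DB.items():
        print(f"{length_cm:>2} cm: {loss_db:>4} dB -> "
              f"{amplitude_ratio(loss_db) * 100:.1f} % of the amplitude remains")
```

The 7-cm channel leaves a little under half of the amplitude at 5 GHz, whereas the 32-cm channel leaves roughly one seventh, which indicates how much more equalization a longer FPCB would demand.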



Fig. 2.14 Example illustration of Flexible Printed Circuit Board [29]

# **Chapter 3**

# Design of High-Speed and Low-Power Transceiver

## **3.1 The Design of Transmitter**

#### **3.1.1 Proposed Pseudo Serializer**

The purpose of the serializer is to convert parallel data generated by a pattern generator into a serial data format. Fig. 3.1 shows a widely used 2:1 serializer structure primarily employed in the transmitter (TX) [30]. It consists of five latches and a 2:1 multiplexer (MUX). Using five latches ensures sufficient timing margin, thereby preventing glitches at the final 2:1 MUX stage.



Fig. 3.1 Overall structure of 2:1 serializer

However, each latch requires a clock, leading to increased power consumption. Therefore, as data rates have increased, schemes have been proposed to reduce the number of latches and shorten the clock distribution path [31], [32]. In this chapter, a pseudo serializer scheme is proposed as a solution to reduce the number of latches.

The proposed pseudo serializer's overall structure is illustrated in Fig. 3.2. Each 20-to-1 SER comprises four 5-to-1 MUXs and three 2-to-1 MUXs. Fig. 3.3 depicts the structure of the 2-to-1 MUX, which comprises two tri-state inverters and one switch.

Both the positive and negative edges of the 5 GHz signal are employed to create divided 2.5 GHz and 1.25 GHz signals across all phases. During the low level of the $CLK_{2.5G}$ signal, IN0 is transmitted to SER0, while IN1 is sent to $IN_{SW}$ through the activated switch. When $CLK_{2.5G}$ is high, the transmission of IN0 is blocked, and the value stored in $IN_{SW}$ is transmitted to SER0. However, it is crucial to prevent any transition in the data during this process. During the high level of $\text{CLK}_{2.5G}$, the tri-state inverter and the switch that are directly connected to IN0 and IN1 are turned off, so changes in IN0 and IN1 cannot affect SER0. As a result, SER0 is synchronized only by $\text{CLK}_{2.5G}$, and the use of flip-flops and latches is minimized to reduce clock distribution and achieve low power consumption.
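The clock-phase behavior described above can be captured in a small behavioral model. This is a logic-level sketch only: the class and function names are hypothetical, the logical inversion of the tri-state inverters is ignored, and analog effects such as charge leakage on $IN_{SW}$ are not modeled.

```python
class PseudoMux2to1:
    """Logic-level model of the 2-to-1 MUX built from tri-state inverters and a switch."""

    def __init__(self):
        self.in_sw = 0  # value held on the IN_SW node

    def step(self, clk, in0, in1):
        """Return the MUX output (SER0) for one clock phase."""
        if clk == 0:
            # Clock low: IN0 drives the output and the switch passes IN1 to IN_SW.
            self.in_sw = in1
            return in0
        # Clock high: both inputs are isolated and the stored IN_SW value drives the output.
        return self.in_sw


def serialize(pairs):
    """Serialize (in0, in1) pairs, one low phase and one high phase per pair."""
    mux = PseudoMux2to1()
    stream = []
    for in0, in1 in pairs:
        stream.append(mux.step(0, in0, in1))
        stream.append(mux.step(1, in0, in1))
    return stream


if __name__ == "__main__":
    print(serialize([(1, 0), (0, 1), (1, 1)]))
```

The model also shows the isolation property stated in the text: calling `step(1, ...)` with different `in0` and `in1` values returns the previously stored `in_sw`, so input changes during the high phase do not reach the output.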



Fig. 3.2 Overall structure of proposed pseudo serializer



Fig. 3.3 Structure of 2-to-1 MUX and timing diagram

#### 3.1.2 Proposed ISI-Mitigating MUX

Fig. 3.4 shows the schematic of a conventional MUX that consists of two tri-state inverters. The voltages of the internal nodes of the tri-state inverters ($P_{INIT0}$, $N_{INIT0}$, $P_{INIT1}$, $N_{INIT1}$) can vary depending on the transmitted data, causing inter-symbol interference (ISI) that results in different slew rates at the MUX output. To address this issue, we incorporate additional switches to pre-discharge or pre-charge each node, thereby ensuring a consistent value at the internal nodes.

Fig. 3.5 illustrates the proposed ISI-mitigating MUX. The proposed structure is designed to pre-charge or pre-discharge the internal nodes, depending on the level of the clock used. When the clock at the gate of the tri-state inverter is at a low level, Q0 becomes the output of the MUX, while Q1 is sent to the output of the MUX when the clock is at a high level. Fig. 3.6 shows a timing diagram for one case where Q0 and Q1 are intended to transmit 0 and 1, respectively. While Q0 is being transmitted, the flip-flop in front of the MUX holds a value of 1 for Q1. This value causes $P_{INIT1}$ and $N_{INIT1}$ of the tri-state inverter of Q1 to be pre-charged or pre-discharged, respectively. As a result, before Q1 is transmitted, $P_{INIT1}$ is pre-charged to 0, thus maintaining a constant internal node value. This mitigates ISI by ensuring the same slew rate.

Fig. 3.4 The schematic of conventional MUX

Fig. 3.5 Proposed ISI-mitigating MUX

Fig. 3.6 Timing diagram of ISI-Mitigating MUX

Fig. 3.7 (a) and (b) show simulation results of conventional MUX and ISI-mitigating MUX, respectively. Compared to the conventional MUX with an eye width of 0.82 UI, the ISI-mitigating MUX shows an improvement with a width of 0.89 UI, which is an increase of 0.07 UI.



Fig. 3.7 Simulation results of (a) conventional MUX (b) ISI-Mitigating MUX

#### **3.1.3 Overall Structure of Transmitter**

Each transmitter utilizes two differential lanes to eliminate common noise interference from a display panel, with each lane responsible for a 10-Gb/s data rate. Fig. 3.8 shows the overall structure of the transmitter. The clock-embedded architecture utilizes a ring-based charge-pump PLL (CP-PLL) with a 312-MHz reference to provide a 5-GHz clock for the transmitter. The transmitter's divider generates divided clock signals (2.5, 1.25, 0.25 GHz) for serialization.

The transmitter consists of a TX-Digital block including a PRBS generator and the modules used in the logical PHY, the proposed 40-to-2 pseudo SER, the proposed ISI-mitigating MUX, a single-to-differential (S-to-D) predriver that sends the signal through the differential lane, and source-series-terminated (SST) output drivers. For impedance matching with the FPCB, the MOS resistance of the SST driver can be tuned with a 4-bit control signal.





### **3.2 The Design of Receiver**

#### 3.2.1 Overview

As seen in Chapter 2, only the synchronous clocking architecture exhibits a relationship between the received data and the sampling clock, while other structures lack synchronization. However, as the data rate increases in a source synchronous system, there are various limitations, such as delay generations, variations in delay due to PVT variations, and other design complexities. To address these issues, embedded clocking has been considered.

Referenceless CDR is widely used in wireline applications because it eliminates the need for an additional reference clock, resulting in cost savings. However, since there is no reference clock, the frequency and phase must be determined from the received input data. In referenceless CDR, frequency acquisition is the most critical task, and numerous studies have been conducted on achieving high performance [33]-[43], but they have been limited by the trade-off between capture range, acquisition time, and power. To mitigate the trade-off between power and frequency range, a frequency acquisition scheme based on a stochastic PFD has been proposed [48]. It achieves both wide-range frequency detection and phase detection with high power efficiency because it needs no additional clock phases or data levels. Therefore, in this thesis, stochastic frequency detection is employed with a hybrid loop CDR to enhance power efficiency.

#### 3.2.1 Proposed Referenceless Hybrid Loop CDR

CDRs are classified based on the type of loop filter used. A digital loop filter (DLF) offers several advantages over an analog loop filter [44], [45]. It is built from digital logic cells, which use the available area more efficiently than analog components, and it can be easily programmed to adjust to process, voltage, and temperature (PVT) variations. It can also be turned on and off easily by adding a simple logic gate at its output, whereas an analog loop filter requires complex circuits such as an analog-to-digital converter (ADC) to store the analog voltage information, increasing power consumption and circuit complexity [46], [47]. However, the DLF is limited in jitter performance because of its latency. A hybrid loop filter that combines the advantages of analog and digital loop filters can therefore achieve better area efficiency, controllability, power efficiency, and jitter performance.

Fig. 3.9 shows the tail currents of the hybrid oscillator (HOSC) controlled by the digital and analog loops. There are four types of tail currents: the analog integral path current (I<sub>1</sub>), the direct proportional path current (I<sub>2</sub>), the band-select current (I<sub>3</sub>), and the digital-code current (I<sub>4</sub>) set by D<sub>CTRL</sub>.

The initial frequency locking procedure is illustrated in Fig. 3.10. First, the band-select current ($I_3$) is adjusted to match the desired frequency band. Second, the digital loop for frequency acquisition is enabled to determine the digital-code current ($I_4$), as shown in Fig. 3.10 (a). During this phase, the output of the analog loop filter ($A_{CTRL}$) is fixed to 0.7 V using V<sub>INIT</sub>. When the CDR locks, we obtain the D<sub>CTRL</sub> value corresponding to the target frequency. After obtaining the lock code, the lock detector generates a flock signal indicating that the process has been completed. This signal blocks the clock path to the DLF and edge DES to deactivate them, and the DLF output stays at the fixed $D_{CTRL}$, as shown in Fig. 3.10 (b). Then the analog loop is activated to remove the residual frequency error and obtain phase locking by adjusting I<sub>1</sub> and I<sub>2</sub>.

Fig. 3.9 Proposed hybrid oscillator whose frequency is controlled by 4 different types of tail currents

Since the DLF and edge DES are turned off, the clocking path is reduced, which saves power, and the fixed $D_{CTRL}$ and $A_{CTRL}$ allow rapid mode switching. Additionally, the hybrid loop design helps achieve the desired jitter performance.

Fig. 3.11 shows the simulation result of the frequency locking procedure. In the first stage, D<sub>CTRL</sub> is found. In the second stage, the lock detector generates the flock signal, which turns off the DLF and activates the analog loop to remove the remaining frequency and phase error. Finally, phase locking is achieved in the third stage.
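The coarse-then-fine sequence of the hybrid loop can be illustrated with a toy behavioral model. Everything numeric here is an assumption made for illustration: the linear oscillator model, the gains `k_d` and `k_a`, the lock criterion, and the function names are hypothetical, and a simple sign-based search stands in for the stochastic frequency detector, which is not modeled.

```python
V_INIT = 0.7  # A_CTRL is held at this voltage while the digital loop runs


def osc_freq_ghz(f_band, k_d, d_ctrl, k_a, a_ctrl):
    """Assumed linear HOSC model: band offset + digital code + analog control."""
    return f_band + k_d * d_ctrl + k_a * (a_ctrl - V_INIT)


def lock_sequence(f_target=5.0, f_band=3.5, k_d=0.013, k_a=0.5, max_code=128):
    """Return (d_ctrl, a_ctrl, final frequency in GHz) after both loops settle."""
    d_ctrl, a_ctrl = 0, V_INIT

    # Stage 1: digital loop. Step D_CTRL until the residual error is below one LSB.
    while (d_ctrl < max_code and
           f_target - osc_freq_ghz(f_band, k_d, d_ctrl, k_a, a_ctrl) >= k_d):
        d_ctrl += 1

    # flock asserted here: D_CTRL is frozen and the digital path is clock-gated.

    # Stage 2: analog loop. Integrate the residual error onto A_CTRL.
    for _ in range(200):
        err = f_target - osc_freq_ghz(f_band, k_d, d_ctrl, k_a, a_ctrl)
        a_ctrl += 0.5 * err / k_a
    return d_ctrl, a_ctrl, osc_freq_ghz(f_band, k_d, d_ctrl, k_a, a_ctrl)


if __name__ == "__main__":
    print(lock_sequence())
```

With these assumed gains the digital stage leaves a residual error smaller than one code step, and the analog stage removes it by moving `a_ctrl` slightly away from `V_INIT`, mirroring the division of labor between the two loops described above.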





Fig. 3.10 Frequency locking procedure of the proposed CDR: (a) digital loop activation, and (b) analog loop activation



Fig. 3.11 Simulation result of frequency locking procedure

#### 3.2.2 Overall Structure of Receiver

Fig. 3.12 illustrates the overall architecture of the proposed receiver for the AP-to-TED interface. The receiver consists of two differential lanes, which effectively eliminate common noise interference caused by a display panel. Half-rate clocking is employed to alleviate timing constraints, and a referenceless CDR is implemented to reduce power consumption by eliminating the need for a dedicated clock lane. A continuous-time linear equalizer (CTLE) compensates for channel loss by boosting the incoming signals passing through the FPCB by up to 6 dB at the Nyquist frequency. After the CTLE, four StrongARM latch comparators perform 2x oversampling at half rate. The resulting data and edge samples are transmitted through two separate paths using SR latches.

In the first path, the sampled data is parallelized using a 2-to-40 deserializer (DES) to reduce the data rate in the digital loop filter (DLF). The second path contains a bang-bang phase detector (BBPD). The UP and DN signals generated by the BBPD are fed into the integral and direct proportional paths to control the HOSC.
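As a reminder of how a bang-bang phase detector derives its decision from data and edge samples, the following sketch implements the textbook early/late rule on three consecutive samples. It is not the circuit of this work: the function name is hypothetical, and the mapping of "early" and "late" onto UP and DN depends on the loop polarity, which is not specified here.

```python
def bbpd(d_prev, edge, d_next):
    """Textbook bang-bang decision from two data samples and the edge sample between them.

    Returns "early", "late", or "hold" (no data transition, so no phase information).
    """
    if d_prev == d_next:
        return "hold"
    # If the edge sample still equals the previous bit, the edge clock sampled
    # before the data transition, i.e. the clock is early.
    return "early" if edge == d_prev else "late"


if __name__ == "__main__":
    for samples in [(0, 0, 1), (0, 1, 1), (1, 1, 1)]:
        print(samples, bbpd(*samples))
```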

A two-stage differential ring oscillator is employed for low-power implementation. During the initial frequency tracking phase, the frequency is adjusted using a 128-bit thermometer code generated from the DLF. Once  $D_{CTRL}$  is set, the DLF is deactivated, and the frequency and phase locking is achieved through integral and proportional paths.
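A thermometer code turns on one additional unit element per code step, which keeps the frequency control monotonic. The sketch below shows the encoding for a 128-bit word; the function name is hypothetical and the bit ordering is an assumption.

```python
def thermometer(code, width=128):
    """Return a thermometer-coded word with `code` ones followed by zeros."""
    if not 0 <= code <= width:
        raise ValueError("code must be between 0 and width")
    return [1] * code + [0] * (width - code)


if __name__ == "__main__":
    word = thermometer(75)
    print(sum(word), "of", len(word), "unit current cells enabled")
```

Because adjacent codes differ in exactly one bit, only one unit cell switches per step, which avoids the large transient errors that a binary-weighted control could produce at major code transitions.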



#### 3.3.3 Circuit Implementation

#### **3.3.3.1** Continuous-Time Linear Equalizer (CTLE)

Fig. 3.13 depicts the schematic of the CTLE. It employs resistive and capacitive source degeneration to provide a boost of up to 6 dB at the Nyquist frequency. The termination resistance at the differential input defaults to 100 ohms and can be controlled externally to account for variations in the measurement environment. Furthermore, the values of $I_{BIAS}$ and $R_S$ can be adjusted. Fig. 3.14 illustrates the post-simulation results of the designed CTLE.
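The boost of a source-degenerated CTLE can be estimated with the standard first-order small-signal model, in which $R_S$ and $C_S$ are connected between the two source nodes of the differential pair. The sketch below uses that textbook model; all component values are assumed for illustration and are not the values used in this design.

```python
import math


def ctle_gain(f_hz, gm, r_d, r_s, c_s, c_p):
    """Complex gain of the first-order CTLE model at frequency f_hz.

    gm: transconductance, r_d: load resistance, r_s/c_s: degeneration network,
    c_p: load capacitance. One zero at 1/(Rs*Cs) and two poles are modeled.
    """
    s = 2j * math.pi * f_hz
    w_z = 1.0 / (r_s * c_s)
    w_p1 = (1.0 + gm * r_s / 2.0) / (r_s * c_s)
    w_p2 = 1.0 / (r_d * c_p)
    dc_gain = gm * r_d / (1.0 + gm * r_s / 2.0)
    return dc_gain * (1 + s / w_z) / ((1 + s / w_p1) * (1 + s / w_p2))


def ideal_peaking_db(gm, r_s):
    """Ideal peaking (ratio of peak gain to DC gain) when the load pole is far away."""
    return 20.0 * math.log10(1.0 + gm * r_s / 2.0)


if __name__ == "__main__":
    gm, r_d, r_s = 10e-3, 100.0, 200.0  # assumed values
    c_s, c_p = 1e-12, 10e-15            # assumed values
    print(f"ideal peaking: {ideal_peaking_db(gm, r_s):.2f} dB")
    print(f"|H| at DC: {abs(ctle_gain(0.0, gm, r_d, r_s, c_s, c_p)):.3f}")
    print(f"|H| at 5 GHz: {abs(ctle_gain(5e9, gm, r_d, r_s, c_s, c_p)):.3f}")
```

In this model, choosing $g_m R_S / 2 = 1$ gives an ideal peaking of about 6 dB, which suggests why adjusting $R_S$ and $I_{BIAS}$ (and hence $g_m$) changes the amount of boost.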



Fig. 3.13 The circuit description of CTLE



Fig. 3.14 Frequency response of CTLE [post-simulation results]

#### 3.3.3.2 Digital Loop Filter (DLF)

Fig. 3.15 shows the components of the DLF. The DLF comprises a pattern decoder, an accumulator, a delta-sigma modulator (DSM), a lock detector, and a loop gain controller. Based on the received data ($D_{40}$) and edge ($E_{40}$) information, the DLF performs stochastic frequency-phase detection using the pattern decoder [48], accumulator, and DSM. The output pattern is analyzed, and if a certain number of occurrences of specific patterns ($S_1$, $S_3$) are detected, the lock detector triggers a signal (flock). This signal determines which loop is activated. The loop gain controller allows the integral gain and direct proportional gain to be adjusted through the I<sup>2</sup>C interface.



Fig. 3.15 The components of a digital loop filter

# **3.4 Measurement Results**

The prototype chip is fabricated in 28-nm CMOS technology. The photomicrograph of the chip is shown in Fig. 3.16. The total active area of the chip is 0.196 mm<sup>2</sup>, where each TX, RX, and PLL occupies 0.026 mm<sup>2</sup>, 0.066 mm<sup>2</sup>, and 0.012 mm<sup>2</sup>, respectively. Fig. 3.17 shows the power breakdown of the transceiver. The total power consumption of the TX and PLL is 7 mW, and the total power consumption of the RX is 5.3 mW, resulting in an energy efficiency of 1.23 pJ/b.

Fig. 3.16 Die microphotograph

Fig. 3.17 Power breakdown of transceiver

Fig. 3.18 Measurement setup of proposed transceiver
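The quoted efficiency follows directly from the power figures and the 10 Gb/s lane rate. The short check below reproduces the arithmetic; the function name is hypothetical, and the per-block powers are the ones listed in the comparison table of this chapter.

```python
def energy_efficiency_pj_per_bit(power_mw, data_rate_gbps):
    """Energy per bit: mW divided by Gb/s is numerically equal to pJ/b."""
    return power_mw / data_rate_gbps


if __name__ == "__main__":
    power_mw = {"TX": 4.3, "PLL": 2.7, "RX": 5.3}  # TX + PLL = 7 mW
    total_mw = sum(power_mw.values())
    print(f"total power: {total_mw:.1f} mW")
    print(f"efficiency: {energy_efficiency_pj_per_bit(total_mw, 10.0):.2f} pJ/b")
```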

Fig. 3.18 describes three measurement setups of the proposed transceiver. The performance of transceiver is evaluated with the 7-cm channel on the FPCB as shown in Fig. 3.18(Bottom). PRBS-7 input data is used for the TRX measurement. In the case of TX, the data output from TX is verified for its eye pattern using BERT and an oscilloscope, as described at the top of Fig. 3.18. Fig. 3.19 shows the TX measurement results. The vertical eye opening is 75 mV, and the horizontal timing margin is 0.48 UI.



Fig. 3.19 Measured eye diagram of transmitter output

The following are the RX measurement results. To measure jitter tolerance, a PRBS-7 pattern with added random jitter is first generated on the Signal Quality Analyzer. The recovered data at 250 MHz is then fed back into the error detector of the Signal Quality Analyzer to measure the BER. The measured jitter tolerance curve shows a jitter tolerance of 0.29 UI at the 30 MHz corner frequency, as shown in Fig. 3.20. Fig. 3.21 shows the measured bathtub curve obtained by shifting the external clock in the external clocking mode, exhibiting a 0.32 UI opening at BER < $10^{-12}$.

To measure the frequency locking behavior, the arstb signal of the DLF inside the prototype chip is used as the trigger for the oscilloscope. When arstb is asserted through I<sup>2</sup>C, the frequency transient response after the trigger is monitored with the oscilloscope. Fig. 3.22 shows the resulting waveform. It takes 5.2 µs of lock time to recover the target frequency of 5 GHz from the initial frequency of 3.5 GHz. After stochastic frequency tracking, it is confirmed that the DLF and edge DES are deactivated while the analog loop is activated. The jitter histogram of the recovered clock is also measured with the oscilloscope, showing a jitter of 2.68 ps<sub>rms</sub> and 25.33 ps<sub>p-p</sub>, as shown in Fig. 3.23.
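For comparison with the UI-based figures reported elsewhere in this section, the measured lock time and jitter can be normalized to the 100-ps unit interval of a 10 Gb/s lane. The helper names below are hypothetical; the inputs are the measured values quoted above.

```python
def ui_ps(data_rate_gbps):
    """Unit interval in picoseconds for a given bit rate in Gb/s."""
    return 1000.0 / data_rate_gbps


def to_ui(time_ps, data_rate_gbps):
    """Express a time interval in unit intervals."""
    return time_ps / ui_ps(data_rate_gbps)


def lock_time_in_bits(lock_time_us, data_rate_gbps):
    """Number of bit periods that elapse during the lock time."""
    return lock_time_us * data_rate_gbps * 1e3


if __name__ == "__main__":
    rate = 10.0  # Gb/s per lane
    print(f"UI: {ui_ps(rate):.0f} ps")
    print(f"RMS jitter: {to_ui(2.68, rate):.4f} UI")
    print(f"peak-to-peak jitter: {to_ui(25.33, rate):.4f} UI")
    print(f"lock time: {lock_time_in_bits(5.2, rate):.0f} bit periods")
```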

Table 3.1 compares the proposed transceiver with other transceivers; the proposed design achieves an energy efficiency of 1.23 pJ/b.



Fig. 3.20 Measured jitter tolerance curve (BER  $< 10^{-12}$ )



Fig. 3.21 Measured BER bathtub curve



Fig. 3.22 Measured frequency acquisition behavior



Fig. 3.23 Measured 5-GHz recovered clock

| Reference | Technology (nm) | Application | Interface Architecture | Clocking Scheme | Referenceless CDR | CDR Architecture | RX Sampling Rate | Equalizer | Supply Voltage (V) | Area (mm²) | Channel Loss @ Nyquist (dB) | Data Rate (Gb/s/lane) | Power (mW) | Energy Efficiency (pJ/b) | PLL Clock Jitter<sub>RMS</sub> (ps) | Recovered Clock Jitter<sub>RMS</sub> (ps) | JTOL Corner Frequency (MHz) | Min. JTOL (UI<sub>PP</sub>) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| JSSC 2018 [23] | 28 | Display Panel I/F | TX | N/A | N/A | N/A | N/A | FFE, PE | 1.0 | TX + PLL: 0.675 | 12 | 2.1 | TX: 12.7 | TX: 6.04 |  | N/A | N/A | N/A |
| JSSC 2022 [24] | 180 | Display Panel I/F | RX | Embedded | Yes | Analog | Octa-Rate | CTLE, DFE | 1.8 | RX: 0.75 | 6-29 | 5.2 | RX: 216 | RX: 41.5 | N/A | 5.73 (Ring) | 3 | 14 |
| ISSCC 16 [7] | TX: 65, RX: 180 | Display Panel I/F | TX & RX | Forwarded | No | Analog | Quad-Rate | FFE, CTLE, DFE | - | TX + PLL: 2.1<br>RX: 2.7<br>Total: 4.8 | 24 | 6 |  |  | 0.7 (LC) | 3 (LC) | - | - |
| ISSCC 17 [26] | 65 | High Performance I/F | TX & RX | Source Synchronous | No | Digital | Half-Rate | CTLE | 0.9 | TX: 0.335<br>Clk Gen: 0.328<br>RX: 0.382<br>Total: 1.045 |  | 3-10 | TX: 27.6<br>Clk Gen: 4<br>RX: 25.9<br>Total: 57.5 | TX: 2.76<br>Clk Gen: 0.4<br>RX: 2.59<br>Total: 5.75 | 0.36 (LC) | 0.76 (LC) | 15 | 35 |
| VLSI 21 [27] | 65 | High Performance I/F | TX & RX | Source Synchronous | No | N/A | Half-Rate | FFE, CTLE | - | TX + RX: 0.7 | 9 | 27 | TX: 89.77<br>RX: 55.85<br>Total: 145.62 | TX: 3.32<br>RX: 2.07<br>Total: 5.39 | - | N/A | N/A | N/A |
| This Work | 28 | AP-to-TED I/F | TX & RX | Embedded | Yes | Hybrid | Half-Rate | CTLE | TX + PLL: 0.95<br>RX: 0.9 | TX: 0.052<br>PLL: 0.012<br>RX: 0.132<br>Total: 0.196 | 9.6 | 10 | TX: 4.3<br>PLL: 2.7<br>RX: 5.3<br>Total: 12.3 | TX: 0.43<br>PLL: 0.27<br>RX: 0.53<br>Total: 1.23 | 1.61 (Ring) | 2.68 (Ring) | 30 | 29 |

Table 3.1 Comparison of the proposed transceiver with prior designs

# **Chapter 4**

# Receiver with Fast Frequency Acquisition in Active mode and Fast Recovery from Sleep mode under Voltage Drift

# 4.1 Overview

The receiver operation can be divided into three stages: 1) initial active mode, 2) sleep mode, and 3) wake-up. This chapter explains the design techniques that improve the performance of the receiver at each stage. Fig. 4.1 shows a conceptual diagram of the proposed receiver for each stage.

The first stage is the initial active mode, in which the receiver is first powered on and must acquire the frequency during a training period. Shortening this activation time requires fast frequency tracking, so a new frequency detection method is used to achieve a faster lock time in the initial active mode.

The second stage is the sleep mode, in which the module is turned off when not in use, making it an attractive option for saving power. To support sleep mode, we add switches that completely turn off the bias and the oscillator while still allowing the relevant voltage information to be restored easily.

The third stage is wake-up, in which the module is reactivated from sleep mode. There are two primary considerations for this stage. First, retracking the frequency during a training period after wake-up wastes power, so the frequency information must be stored before entering sleep mode so that retracking can be skipped. Second, the stored information must remain valid at wake-up even if voltage and temperature (VT) drift occurs.

Many studies have addressed rapid transitions between sleep mode and active mode, focusing on different standards [11]-[22]. Previous works such as [13] and [14] achieve instant locking with a gated voltage-controlled oscillator (VCO) based architecture, which is well suited to rapid on/off applications. However, they are sensitive to VT variations due to the delay difference and require additional settling time if VT drift occurs in the off state. In [16], a fast lock time is achieved by sweeping the oscillator phase over 1100 repetitive patterns, but it is difficult to obtain an appropriate proportional gain given the VCO frequency drift that occurs in the off state.

In view of these drawbacks, we propose a receiver with a supply-insensitive hybrid CDR for the AP-to-TED interface to achieve fast frequency recovery from sleep mode under supply voltage drift.

In summary, the proposed design provides faster frequency acquisition, and thus a shorter lock time, in the initial active mode, together with low-power and VT-robust operation in the sleep mode and wake-up stages.



Fig. 4.1 Conceptual diagram of proposed receiver

## 4.2 Proposed Frequency Detector

#### 4.2.1 Prior Work

Previous studies have shown a trade-off among frequency capture range, power, and frequency acquisition time. A frequency acquisition scheme based on a stochastic frequency-phase detector (FPD) has been proposed to alleviate this trade-off to some extent [48]. The stochastic methodology observes sequential data-edge-data patterns under various phase and frequency conditions, as shown in Fig. 4.2. These patterns are classified to collect histograms, and representative histograms are selected. Weights are then calculated using Bayes' theorem, as described in Fig. 4.3. Through this iterative process, the PD gain curve and FD gain curve shown in Fig. 4.4 are obtained.

This method enables wide-range frequency acquisition with high power efficiency. However, when frequency tracking starts from an initial frequency lower than the operating frequency, the lock time becomes longer. Therefore, in this thesis, we propose a fast frequency acquisition scheme that exploits the characteristics of the frequency gain curve. As a result, the proposed design achieves a faster lock time than conventional frequency locking.



Fig. 4.2 Concept of stochastic FPD [48]



Fig. 4.3 Flow chart of design techniques of stochastic frequency-phase detector [48]



Fig. 4.4 Achieved (a) phase detection gain curve (b) frequency detection gain curve [48]

### 4.2.2 Fast Frequency Acquisition Using Linearity Function

The FD gain curve in Fig. 4.4 (b) shows a linear characteristic over a specific range around the point where Pr(LATE) - Pr(EARLY) = 0. Therefore, we propose a frequency detection method that coarsely adjusts the code to achieve faster frequency acquisition, using the error [= Pr(LATE) - Pr(EARLY)] measured at each code. From the measured error values, internal division (interpolation) is used to move the initial code closer to the lock point. A finite state machine (FSM) is employed to adjust the codes. The FSM has three main stages, as shown in Fig. 4.5.

Fig. 4.5 The main stages of FSM

In the first state, $D_{CTRL}$ is adjusted to the frequency corresponding to Late ($D_{CTRL\_LATE}$), and the corresponding error value *L* is calculated. The FSM then proceeds to the next state.

In the second state, $D_{CTRL}$ is adjusted to the frequency corresponding to Early ($D_{CTRL\_EARLY}$), and the corresponding error value *E* is determined. In the final state, the internal division is performed using these values, and $D_{CTRL}$ is adjusted to $D_{CTRL\_UPDATE}$ based on the result. When the last stage of the FSM is completed, it generates a completion signal (flock). This signal deactivates the digital loop and immediately activates the analog loop, which removes the remaining frequency and phase errors through the direct proportional path and the integral path.

Because of noise factors such as voltage and thermal effects, a single error value is not reliable, so the error must be accumulated over an averaging window. To evaluate this, assuming the equalizer sufficiently compensates for the channel loss, we simulate with noise and obtain new frequency gain curves for different averaging times. Fig. 4.6 shows the results for averaging times of 500 UI, 1000 UI, 1500 UI, and 2000 UI. In all four cases, a linear trend similar to the noise-free frequency gain curve is observed, and the linear characteristic becomes more pronounced as the averaging time increases. Based on these results, the simulations and measurements in this thesis are conducted with varying averaging times.
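The effect of a longer averaging window can be illustrated with a small Monte Carlo sketch. It assumes, purely for illustration, that every UI yields an independent LATE (+1) or EARLY (-1) decision with a fixed probability; the function names `averaged_error` and `error_spread` and the probability value are hypothetical and are not taken from the design.

```python
import random
import statistics

def averaged_error(p_late: float, n_ui: int, rng: random.Random) -> float:
    """Estimate Pr(LATE) - Pr(EARLY) from n_ui independent +1/-1 decisions."""
    total = sum(1 if rng.random() < p_late else -1 for _ in range(n_ui))
    return total / n_ui

def error_spread(p_late: float, n_ui: int, trials: int = 200, seed: int = 1) -> float:
    """Standard deviation of the averaged error over repeated trials."""
    rng = random.Random(seed)
    samples = [averaged_error(p_late, n_ui, rng) for _ in range(trials)]
    return statistics.pstdev(samples)

if __name__ == "__main__":
    # Hypothetical operating point: Pr(LATE) = 0.55, i.e. a true error of 0.10.
    for n_ui in (500, 1000, 1500, 2000):
        print(n_ui, round(error_spread(0.55, n_ui), 4))
```

Under this simplified model the spread of the estimate shrinks roughly as the inverse square root of the window length, which is in line with the more pronounced linearity observed for longer averaging times.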

Fig. 4.7 maps the three stages of the FSM onto the FD gain curve. To reduce computation, the internal division uses the midpoint code instead of the entire code range, under the assumption that the lock code corresponding to the target frequency lies between 7'd0 and the midpoint code. Therefore, the simulation uses half of the whole code, which is 64, rather than the entire code of 127.



Fig. 4.6 Simulated frequency detection gain curve by varying the averaging time



Fig. 4.7 The concept of the frequency tracking in the FD gain curve

Additionally, the code corresponding to *L* is fixed at 7'd0 so that the size of the code interval can be determined without a subtraction operation. The number of accumulations is set to 16 (640 UI).

In the first state, $D_{CTRL}$ is set to 7'd0, and the error is accumulated to obtain *L*. In the next state, $D_{CTRL}$ is set to 7'd64, and the error is accumulated to obtain *E*. The total code length in this case is 64, which is the length of the $D_{CTRL}$ interval used.

Based on the values of *L* and *E*, the code from which frequency tracking starts is calculated by interpolation. Once the calculation is complete, $D_{CTRL}$ is updated to this code. Fig. 4.8 shows the simulation results. The accumulated error is 72 for *L* and -80 for *E*. Substituting these values into (4.1) gives a rounded value of 30, which matches the value of 30 obtained from the simulation.

$$D_{CTRL\_UPDATE} = D_{CTRL\_LATE} + \left( D_{CTRL\_EARLY} - D_{CTRL\_LATE} \right) \times \frac{L}{L + |E|}$$
(4.1)

where $D_{CTRL\_EARLY} = 64$, $D_{CTRL\_LATE} = 0$, $L = 72$, and $E = -80$.
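The three FSM states and the interpolation of (4.1) can be sketched in a few lines of Python. This is a behavioral illustration only, not the synthesized logic: `interpolate_code`, `coarse_acquire`, and the `measure_error` callback are hypothetical names, integer round-half-up is an assumed rounding rule, and *L* is assumed to be non-negative. The error values are the ones reported for Fig. 4.8.

```python
from typing import Callable

def interpolate_code(code_late: int, code_early: int, err_late: int, err_early: int) -> int:
    """Eq. (4.1): internal division between the LATE and EARLY codes, rounded."""
    span = code_early - code_late
    denom = err_late + abs(err_early)
    if denom == 0:
        raise ValueError("both accumulated errors are zero; cannot interpolate")
    # Integer round-half-up of span * L / (L + |E|), avoiding floating point.
    return code_late + (2 * span * err_late + denom) // (2 * denom)

def coarse_acquire(measure_error: Callable[[int], int],
                   code_late: int = 0, code_early: int = 64) -> int:
    """Three states: accumulate L, accumulate E, then interpolate the start code."""
    err_late = measure_error(code_late)      # state 1: error at D_CTRL_LATE
    err_early = measure_error(code_early)    # state 2: error at D_CTRL_EARLY
    return interpolate_code(code_late, code_early, err_late, err_early)  # state 3

# Stand-in for the accumulator read-out, using the values reported for Fig. 4.8.
reported_error = {0: 72, 64: -80}
d_ctrl_update = coarse_acquire(lambda code: reported_error[code])
print(d_ctrl_update)  # prints 30
```

Because $D_{CTRL\_LATE}$ is fixed at 7'd0 and the interval length is 64, the multiplication by the span reduces to a 6-bit left shift in hardware, which is presumably part of the motivation for these choices.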

Fig. 4.9 shows the simulated frequency transient behavior. The frequency is 4.35 GHz at $D_{CTRL\_LATE}$, and the value of *L* is obtained while the frequency is held fixed. Next, *E* is obtained at a frequency of 5.74 GHz, corresponding to $D_{CTRL\_EARLY}$. After about 100 ns of calculation time, $D_{CTRL\_UPDATE}$ is output, and frequency and phase tracking begins immediately.

Fig. 4.10 compares the locking time with the proposed mode turned off and on. With the proposed mode turned off, it takes 1.83 µs to lock from the initial frequency of 4.35 GHz to 5 GHz, while the proposed frequency tracking mode achieves a faster lock time of 0.42 µs.

Fig. 4.8 Simulation results of error accumulation and calculation







Fig. 4.10 The simulation results of the locking time between the conventional frequency tracking and proposed frequency tracking


# 4.3 Proposed Hybrid CDR with Fast Recovery from Sleep Mode under Voltage Drift

#### 4.3.1 Hybrid Loop CDR with SVDC

#### 4.3.1.1 Motivation

Storing the frequency information is important, but it is also essential to reproduce the same operating conditions at wake-up. If the supply voltage and temperature (VT) are not the same, different currents flow even with the same $D_{CTRL}$ and $A_{CTRL}$. This difference in current causes a frequency error and requires additional settling time at wake-up.

Also, to minimize bit errors after wake-up, the CDR must correct frequency errors and prevent phase errors from accumulating. This requires a higher proportional gain ($K_P$). The minimum $K_P$ required to correct the frequency error can be expressed as follows [16]:

$$K_P > \left(\frac{1}{\rho}\right) \times \frac{|F_{OSC} - F_{DATA}|}{F_{DATA}} = \frac{|\beta|}{\rho}$$
(4.2)

where

$$\beta = \frac{|F_{OSC} - F_{DATA}|}{F_{DATA}}$$
(4.3)

and $\rho$ is the update rate, assumed to be 0.5. A larger $K_P$ can overcome initial frequency errors but can increase cycle/hunting jitter [49], which in turn increases the complexity of the CDR design.
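As a worked illustration of (4.2) and (4.3), the sketch below evaluates the lower bound on $K_P$ for a given frequency offset. The function names and the 50 MHz offset are hypothetical and chosen only to show the arithmetic; the update rate of 0.5 is the value assumed in the text.

```python
def relative_frequency_error(f_osc_hz: float, f_data_hz: float) -> float:
    """Eq. (4.3): beta = |F_OSC - F_DATA| / F_DATA."""
    return abs(f_osc_hz - f_data_hz) / f_data_hz

def min_proportional_gain(f_osc_hz: float, f_data_hz: float, update_rate: float = 0.5) -> float:
    """Eq. (4.2): K_P must exceed beta / rho (returned value is the bound itself)."""
    return relative_frequency_error(f_osc_hz, f_data_hz) / update_rate

# Hypothetical case: oscillator wakes up 50 MHz above a 5 GHz target.
kp_bound = min_proportional_gain(5.05e9, 5.0e9)
print(f"K_P must exceed {kp_bound:.3f}")
```

The bound scales linearly with the initial frequency error, so reducing the error at wake-up directly relaxes the required proportional gain and the associated jitter penalty.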

In this work, a supply-insensitive hybrid CDR is proposed to reduce the initial frequency error. It offers an efficient approach for fast frequency recovery from sleep mode by minimizing the initial frequency error, even in the presence of voltage variations.

#### 4.3.1.2 Analysis of Prior Design

In Chapter 3, we determined the digital code value that corresponds to an $A_{CTRL}$ of 0.7 V. This value is used to restore the frequency when the receiver wakes up from sleep mode. However, this information is valid only when operating at the nominal voltage (1 V). Fig. 4.11 (a) shows the supply voltage sensitivity, illustrating how the oscillation frequency varies when the supply voltage changes while $A_{CTRL}$ is fixed at 0.7 V.

Despite using the same $D_{CTRL}$ and $A_{CTRL}$, the operating frequency can fall outside the desired range due to supply voltage drift. This is because the total current flowing through the HOSC changes, as illustrated in Fig. 4.12. Around the nominal voltage of 1 V, the band-select current and the digital-code current change more than the integral-path current. Therefore, to flatten the five frequency gain curves of the HOSC as depicted in Fig. 4.11 (b), the SVDC is applied exclusively to the band-select and digital-code currents, which account for most of the variation, as shown in Fig. 4.13. Thus, even with supply voltage drift during sleep mode, a constant current flows for the same $D_{CTRL}$, reducing the frequency drift so that no additional acquisition time is required.



Fig. 4.11 Simulated oscillator frequency sensitivity to A<sub>CTRL</sub> for various supply voltages (a) without SVDC, (b) with SVDC



Fig. 4.12 Simulated current changes versus a change in VDD<sub>OSC</sub>



Fig. 4.13 Hybrid oscillator with SVDC

#### **4.3.1.3** Supply Voltage Drift Cancellation (SVDC)

The TED includes various circuits such as the source driver IC and T-CON, resulting in many sources of supply noise. Therefore, dynamic supply noise rejection should be considered in addition to supply voltage drift. Numerous studies have been conducted to reduce supply sensitivity [50]-[56]. For rapid on/off applications, the bias must settle quickly and the system must be low power with simple hardware. Consequently, a low-dropout regulator (LDO) is not suitable from a power perspective, and schemes relying on background operations are not suitable due to their long calibration time [53]. The inclusion of additional active devices within the oscillator also leads to increased power consumption and noise [55]. Therefore, we employ the supply-noise-compensating (SNC) technique, which does not require a separate loop and enables fast bias settling [56].

The circuit implementation of the supply voltage drift cancellation is shown in Fig. 4.14. It consists of a frequency-tuning unit cell that adjusts the frequency, a bias voltage generator (BVG) that sets the gate voltages of the MOSFETs in the frequency unit cell by combining two Nagata current sources [50], and a hybrid oscillator. The bias voltage generator consists of three MOSFETs and two series resistors. To support sleep mode, MOSFETs M<sub>1</sub> and M<sub>2</sub> are used to turn off the bias voltage generator completely. When the sleep mode signal ($\overline{WK_{UP1}}$) is received, V<sub>1</sub> and V<sub>2</sub> become logically low, causing the NMOS devices in the frequency unit current cells to be completely turned off.

Fig. 4.14 Circuit implementation of supply voltage drift cancellation

Fig. 4.15 The simulation results of (a) bias voltage change in a BVG (b) current of frequency tuning cell change versus a change in VDD<sub>OSC</sub>

The frequency unit current cell comprises three MOSFETs, $M_4$, $M_5$, and $M_6$, where $M_4$ and $M_5$ function as current sources that receive their gate voltages from the bias voltage generator, and $M_6$ turns the current source on and off according to the output of the DLF. When the supply voltage increases, I<sub>BIAS</sub> increases, which changes $V_1$ and $V_2$. $V_2$ is affected only by $R_1$, whereas for $V_1$ a larger IR drop is generated by ($R_1$+$R_2$). Accordingly, the change in $V_1$ is greater than the change in VDD<sub>OSC</sub>, and the change in $V_2$ is smaller than the change in VDD<sub>OSC</sub>. Therefore, the overdrive voltages of $M_4$ and $M_5$ change with opposite polarity and the same magnitude, so a constant current flows in the vicinity of the nominal voltage.
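As a first-order sketch of why opposite overdrive changes flatten the cell current, assume (this is not stated in the text and depends on the topology in Fig. 4.14) that the currents of $M_4$ and $M_5$ add in the unit cell and that both devices remain in saturation. The symbols $\Delta I_{cell}$, $\Delta V_{ov}$, $g_{m4}$, and $g_{m5}$ are introduced here only for illustration:

$$\Delta I_{cell} \approx g_{m4}\,\Delta V_{ov4} + g_{m5}\,\Delta V_{ov5}$$

$$\Delta V_{ov4} = -\Delta V_{ov5} = \Delta V_{ov} \;\Rightarrow\; \Delta I_{cell} \approx \left( g_{m4} - g_{m5} \right) \Delta V_{ov}$$

Under these assumptions the first-order term vanishes when $g_{m4} \approx g_{m5}$, which is consistent with a flat region around the nominal voltage. Higher-order terms would explain why the current is constant only in the vicinity of that point.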

Fig. 4.15 (a) shows the simulated bias voltage change in the BVG, and Fig. 4.15 (b) shows the current change in the frequency tuning cell versus VDD<sub>OSC</sub>.

Fig. 4.16 shows the simulated current change versus VDD<sub>OSC</sub> over temperature. Simulations are performed at 0 °C, 20 °C, 40 °C, 60 °C, and 80 °C. Although the optimal point differs slightly for each temperature, I<sub>OSC</sub> remains constant regardless of the VDD<sub>OSC</sub> variation.

Fig. 4.17 shows the simulation results conducted with corner variation. The simulations are performed for SS, TT, and FF corners. Although each optimal point is slightly different, it can be observed that there is a flat region for each of them.



Fig. 4.16 The simulation results of current change versus a change in $VDD_{OSC}$ for temperature



Fig. 4.17 The simulation results of current change versus a change in $VDD_{OSC}$ for corner variation

## 4.4 Analog Front-End Supporting Sleep Mode

To save power during sleep mode, both the data path and the clock path must be disabled. To block the data path, a switch is added to the current bias of the CTLE, as shown in Fig. 4.18 (a), so that the CTLE can be fully deactivated during sleep mode. To block the clock path, the frequency of the hybrid oscillator must be set to zero, which requires turning off the two loops that adjust the hybrid oscillator. Fig. 4.18 (b) shows the current switch on the tail current of the hybrid oscillator. For the digital loop, AND gates are added to the output of the DLF to disable it during sleep mode. For the analog loop, the gate of the MOSFET responsible for the integral-path current controlled by $A_{CTRL}$ is connected to ground to cut off the current. When a wake-up signal is received, the AND gates are enabled to restore the digital codes, and $A_{CTRL}$ is reactivated through $V_{INIT}$, which is also used to fix $A_{CTRL}$ during initial locking. Furthermore, to prevent leakage caused by floating nodes, the clock and data paths are intentionally tied to either ground or VDD.

Fig. 4.18 Current switch on a) current bias of CTLE b) tail current of hybrid oscillator

Fig. 4.19 shows the simulated change in frequency and current for two cases: 1) entering sleep mode from active mode and 2) waking up from sleep mode. According to post-simulation, 1.2 µA and 222 µA of current flow in the CTLE and DES, respectively, and 343 µA of current flows in the AFE and HOSC. The process of restoring $D_{CTRL}$ and $A_{CTRL}$ is shown in Fig. 4.20. In response to WK<sub>UP1</sub>, $D_{CTRL}$ is restored to the initially locked code and $A_{CTRL}$ is restored to 0.7 V. Because activating the analog loop before $A_{CTRL}$ is restored can result in a false lock caused by erroneous UP and DN detector outputs, two signals are used: WK<sub>UP1</sub> turns on all biases, and WK<sub>UP2</sub> activates the loop. Accordingly, it can be confirmed that the frequency is recovered to 5 GHz.



Fig. 4.19 Simulation result for current variation on active mode, sleep mode, wake-up



### **4.5 Circuit Implementation**

#### 4.5.1 Overall Structure

Fig. 4.21 shows the overall architecture of the proposed receiver for the AP-to-TED interface. It comprises a receiver and a command controller, which allows external control of the module via an $I^2C$ interface with a sleep-mode signal. To alleviate timing constraints, half-rate clocking is employed, and a referenceless CDR is implemented to reduce power consumption by eliminating the clock lane. The receiver consists of a CTLE, strong-ARM samplers, SR latches, a BBPD, a deserializer (DES), a charge pump, a DLF, and a hybrid oscillator. Signals passing through the FPCB are applied to the CTLE, which employs resistive and capacitive source degeneration to provide up to 6 dB of boost at the Nyquist frequency. In addition, a switch is attached to the current mirror to support the sleep mode, allowing the bias to be turned on and off. After the CTLE, four strong-ARM latch comparators perform 2x oversampling at half rate. The data and edge samples are passed through SR latches into two paths. In the first path, the DES reduces the data rate for the DLF. The second path contains the BBPD. The UP and DN signals generated by the BBPD are applied to the integral and direct proportional paths to control the HOSC. During initial frequency tracking, the frequency is adjusted by a 128-bit thermometer code generated by the DLF. Once $D_{CTRL}$ is fixed, the DLF is deactivated, and frequency and phase locking are accomplished with the integral and direct proportional paths.
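The thermometer-coded frequency control can be pictured with the short sketch below. It assumes that a control value $D_{CTRL} = n$ enables the $n$ lowest unit current cells of a 128-cell array; the mapping direction and the function name `to_thermometer` are illustrative assumptions, not details given in the text.

```python
from typing import List

def to_thermometer(d_ctrl: int, width: int = 128) -> List[int]:
    """Return a thermometer code with the d_ctrl lowest cells enabled."""
    if not 0 <= d_ctrl <= width:
        raise ValueError("d_ctrl out of range")
    return [1 if i < d_ctrl else 0 for i in range(width)]

code_30 = to_thermometer(30)
code_31 = to_thermometer(31)
# Number of cells that toggle when the control value steps by one.
changed = sum(a != b for a, b in zip(code_30, code_31))
print(sum(code_30), changed)  # prints: 30 1
```

Since only one unit cell toggles per code step, a thermometer code keeps the frequency change monotonic in the control value, which suits both the coarse jump to $D_{CTRL\_UPDATE}$ and the subsequent fine tracking.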



Fig. 4.22 shows the schematic of the HOSC with the SVDC, including the simulation results of the SVDC. A two-stage differential ring oscillator is utilized for low-power implementation. The frequency range is selected with $B_{CTRL}$ for the desired band. The tail current is partitioned into two parts: one for which supply voltage drift is canceled and one for which it is not. The band-select current (I<sub>3</sub>) and the digital-code current (I<sub>4</sub>) are kept constant in the event of supply voltage drift.

The digital block comprises a pattern decoder, an accumulator, a DSM for stochastic frequency-phase detection, a lock detector to determine which loop is activated, a loop gain controller to adjust the integral gain and the direct proportional gain, and an FSM for fast frequency acquisition. The en\_linear signal, set through I<sup>2</sup>C, turns the fast frequency tracking mode on or off.





#### 4.5.2 Command Controller

The command controller is composed of synthesized digital gates. The controller takes a reference clock ($CLK_{EXT}$) from a bit error rate tester (BERT). Each module is turned on or off using the signals PLL\_WK<sub>UP</sub> for the PLL, TX\_WK<sub>UP</sub> for the TX, and RX\_WK<sub>UP1</sub> and RX\_WK<sub>UP2</sub> for the RX.

It functions in two modes. The first mode activates and deactivates the receiver externally. The second mode is a sequential mode in which each signal is generated with a certain delay. The delays between the signals (delay<sub>1</sub>, delay<sub>2</sub>, delay<sub>3</sub>) can be adjusted by changing the number of delay stages and the frequency of the reference clock. When all wake-up signals are high, Flag\_WK<sub>UP</sub> goes high, indicating that all signals have been applied. Fig. 4.23 shows the simulation results of the behavioral model of the command controller.
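The sequential mode can be modeled behaviorally as follows. This sketch assumes that each delay equals an integer number of reference-clock periods, that the signals rise in the order in which they are listed above, and that the flag rises together with the last signal; these are illustrative assumptions, as are the function name `wakeup_schedule` and the stage counts. The 312.5 MHz clock is the value used in the measurement setup.

```python
from typing import Dict, Sequence

SIGNALS = ("PLL_WK_UP", "TX_WK_UP", "RX_WK_UP1", "RX_WK_UP2")

def wakeup_schedule(delay_stages: Sequence[int], clk_hz: float = 312.5e6) -> Dict[str, float]:
    """Return the rising-edge time (ns) of each wake-up signal and of Flag_WK_UP."""
    if len(delay_stages) != len(SIGNALS) - 1:
        raise ValueError("expected one delay per gap between signals")
    period_ns = 1e9 / clk_hz           # 3.2 ns at 312.5 MHz
    t_ns = 0.0
    times = {SIGNALS[0]: t_ns}
    for name, stages in zip(SIGNALS[1:], delay_stages):
        t_ns += stages * period_ns     # delay_k = stages_k * clock period
        times[name] = t_ns
    times["Flag_WK_UP"] = t_ns         # asserted once every wake-up signal is high
    return times

# Hypothetical stage counts for delay1, delay2, delay3.
schedule = wakeup_schedule((4, 4, 8))
for name, t in schedule.items():
    print(f"{name}: {t:.1f} ns")
```

Keeping RX\_WK<sub>UP2</sub> last reflects the requirement that the biases are restored before the loop is activated.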



Fig. 4.23 The simulation results of behavior modeling of command controller

## 4.6 Measurement

The prototype chip is fabricated in 28-nm CMOS technology. The photomicrograph of the chip is shown in Fig. 4.24. The total active area of the chip is 0.089 mm<sup>2</sup>, where the RX and the command controller occupy 0.082 mm<sup>2</sup> and 0.007 mm<sup>2</sup>, respectively. Fig. 4.25 describes two measurement setups: one for verifying the receiver performance and the other for measuring the SVDC performance. Fig. 4.26 shows the detailed measurement setup for verifying the receiver's performance. The Signal Quality Analyzer (Anritsu Mu1800A) generates a PRBS-7 pattern for the input data. To measure jitter tolerance, the Signal Quality Analyzer generates data with added random jitter, and the 250 MHz recovered data is fed to the error detector (ED). To measure the power consumption of the module in both sleep mode and active mode, the BERT applies a 312.5 MHz clock to operate the command controller inside the chip, and the I<sup>2</sup>C signal selects whether the chip is in active mode or sleep mode.

Fig. 4.24 Die microphotograph

Fig. 4.25 Measurement setup for (a) receiver performance (b) SVDC

Fig. 4.26 Detailed measurement setup for verifying receiver's performance

For measuring the SVDC, the UXA Signal Analyzer N9040B is used to measure the free-running frequency in open loop. An oscilloscope (Tektronix MSO73304DX) is also used to measure the frequency locking behavior and the jitter histograms of the recovered clock.

The proposed receiver achieves error-free operation (BER < $10^{-12}$) when recovering 10 Gb/s PRBS-7 data at the nominal voltage of 1 V. Fig. 4.28 shows the measured jitter tolerance with PRBS-7 and BER < $10^{-11}$. Fig. 4.27 shows the power breakdown of the proposed receiver. The total power consumption of the receiver in active mode is 9.9 mW, of which the CTLE, AFE, DES & DLF, and HOSC consume 0.784 mW, 3.145 mW, 1.24 mW, and 4.731 mW, respectively. In sleep mode, the total power consumption is 1.962 mW, an 80% reduction compared to active mode, with the CTLE consuming 0.043 mW, the AFE 0.422 mW, the DES & DLF 0.385 mW, and the HOSC 1.112 mW.
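The reported totals can be cross-checked from the per-block figures. The sketch below only reuses the numbers quoted above; the helper names are hypothetical. It also derives the energy per bit at 10 Gb/s from the active-mode total, using the fact that mW divided by Gb/s gives pJ/b.

```python
active_mw = {"CTLE": 0.784, "AFE": 3.145, "DES & DLF": 1.24, "HOSC": 4.731}
sleep_mw = {"CTLE": 0.043, "AFE": 0.422, "DES & DLF": 0.385, "HOSC": 1.112}

def total_mw(breakdown: dict) -> float:
    """Sum the per-block power figures in mW."""
    return sum(breakdown.values())

def reduction_percent(active: float, sleep: float) -> float:
    """Power saved in sleep mode relative to active mode, in percent."""
    return 100.0 * (1.0 - sleep / active)

def energy_per_bit_pj(power_mw: float, data_rate_gbps: float) -> float:
    """mW / (Gb/s) = pJ/b."""
    return power_mw / data_rate_gbps

active_total = total_mw(active_mw)
sleep_total = total_mw(sleep_mw)
print(round(active_total, 3), round(sleep_total, 3))
print(round(reduction_percent(active_total, sleep_total), 1))
print(round(energy_per_bit_pj(active_total, 10.0), 2))
```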



Fig. 4.27 Power breakdown of active mode and sleep mode



Fig. 4.28 Measured jitter tolerance curve

#### 4.6.1 Fast frequency tracking

To measure the frequency acquisition behaviors for 10 Gb/s, the reset signal of the DLF is used as the trigger signal. After the trigger signal, the oscillator frequency is measured, and the lock time is defined as the point where the frequency average in the observed window is 5 GHz. The measurements are conducted in two directions.
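The lock-time definition above can be expressed as a simple post-processing routine on a sampled frequency trace. The sketch assumes a moving-average window counted in samples and a small tolerance around 5 GHz; the window length, the tolerance, the function name `lock_time`, and the synthetic trace are all illustrative assumptions rather than the settings used in the measurement.

```python
from typing import Optional, Sequence

def lock_time(times: Sequence[float], freqs_ghz: Sequence[float],
              target_ghz: float = 5.0, window: int = 4,
              tol_ghz: float = 0.01) -> Optional[float]:
    """Return the first time at which the windowed average frequency reaches the target."""
    if len(times) != len(freqs_ghz):
        raise ValueError("times and freqs_ghz must have the same length")
    for end in range(window, len(freqs_ghz) + 1):
        avg = sum(freqs_ghz[end - window:end]) / window
        if abs(avg - target_ghz) <= tol_ghz:
            return times[end - 1]   # time of the last sample in the window
    return None                     # the trace never settles within the tolerance

# Synthetic trace: 20 samples at the initial frequency, then 20 samples at lock.
trace_times = list(range(40))
trace_freqs = [4.35] * 20 + [5.0] * 20
print(lock_time(trace_times, trace_freqs))
```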

Firstly, to confirm the linearity of the gain curve, the frequency range of the HOSC is adjusted using  $B_{CTRL}$ . As the HOSC operates at different frequency ranges, the code values from the FSM are different, and this will be validated by observing the updated frequency.

Secondly, to verify that the frequency gain curve remains consistent regardless of the changing averaging time, the error is accumulated by varying the number of accumulations and measuring the transient response. Finally, all these results are postprocessed to compare the fast acquisition mode with the conventional mode on the same plot.

Fig. 4.29 shows the measured conventional frequency transient response before post-processing. The data rate is 10 Gb/s with PRBS-7, and the initial HOSC frequencies are (a) 4.55 GHz, (b) 4.37 GHz, and (c) 4.87 GHz. The corresponding lock times are 3.02 µs, 3.4 µs, and 1.2 µs, respectively.

Fig. 4.29 Measured frequency acquisition behaviors before post-processing @ 10 Gb/s PRBS-7 with varying initial HOSC frequency (a) 4.55 GHz, (b) 4.37 GHz, (c) 4.87 GHz

Fig. 4.30 shows the measured frequency behaviors of the conventional tracking mode and the fast tracking mode with varying initial HOSC frequency. In all cases, the averaging time is set to 640 UI, and the updated control code ($D_{CTRL\_UPDATE}$) corresponding to the updated frequency is observed. In Fig. 4.30 (a), the operating range of the HOSC is 4.55 to 5.83 GHz, and the lock time is about 0.37 µs. In Fig. 4.30 (b), the HOSC operates within the range of 4.37 to 5.65 GHz, with a lock time of 0.38 µs. In Fig. 4.30 (c), the HOSC has an operating range of 4.87 to 6.15 GHz, and a lock time of 0.41 µs is achieved.

From the results of cases (a), (b), and (c), it can be observed that the calculated values of *L* and *E* vary with the operating frequency range. Owing to the linearity of the frequency gain curve, the code calculated by the FSM consistently results in a frequency near 5 GHz even when the values of *L* and *E* change.

Fig. 4.30 Measured frequency acquisition behaviors with varying initial DCO frequency (a) 4.55 ~ 5.83 GHz, (b) 4.37 ~ 5.65 GHz, (c) 4.87 ~ 6.15 GHz

To confirm the frequency gain curve with varying averaging time, we adjust the number of accumulations. The operating range of the HOSC is set to 4.55 ~ 5.83 GHz, and we measure the frequency behavior with the averaging time changed to 1000 UI and 2520 UI, following the previous measurement at 640 UI. Fig. 4.31 (a) and (b) illustrate the frequency acquisition behavior measured at 1000 UI and 2520 UI, with lock times of 0.48 µs and 0.79 µs, respectively. Because more errors are accumulated than in the 640 UI case, lock acquisition takes longer. Calculating *L* and *E* and observing the frequency updated by the interpolated value, we can see that frequency tracking starts around 5 GHz. This confirms that the frequency gain curve remains linear and consistent even when the averaging time changes.
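A rough time budget helps explain why the lock time grows with the averaging time. The sketch below assumes that the *L* and *E* measurements each use one full averaging window and that 1 UI is 100 ps at 10 Gb/s; the two-window assumption and the function name `accumulation_time_us` are illustrative and not stated explicitly in the text.

```python
def accumulation_time_us(avg_ui: int, n_windows: int = 2, data_rate_gbps: float = 10.0) -> float:
    """Time spent accumulating errors, in microseconds."""
    ui_ps = 1000.0 / data_rate_gbps            # 100 ps per UI at 10 Gb/s
    return n_windows * avg_ui * ui_ps * 1e-6   # ps -> us

measured_lock_us = {640: 0.37, 1000: 0.48, 2520: 0.79}  # values reported above
for avg_ui, lock_us in measured_lock_us.items():
    acc_us = accumulation_time_us(avg_ui)
    print(f"{avg_ui} UI: accumulation {acc_us:.3f} us of {lock_us:.2f} us lock time")
```

Under these assumptions the accumulation windows account for a growing share of the measured lock time as the averaging time increases, with the remainder attributable to calculation and the final analog-loop tracking.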

The proposed receiver achieved a faster lock time compared to the existing structure based on the stochastic frequency-detection method by utilizing the linearity characteristics of the gain curve.



Fig. 4.31 Measured frequency acquisition behaviors with varying averaging time (a) 1000 UI, (b) 2520 UI

#### 4.6.2 Frequency recovery from sleep mode under supply voltage drift

Fig. 4.32 shows the measured frequency of the free-running HOSC when $A_{CTRL}$ is fixed at 0.7 V. As illustrated, the frequency does not follow a linear trend and tends to be flat with respect to changes in VDD<sub>OSC</sub>. A table of the results for a ±50 mV variation from the nominal voltage is also provided, showing frequency pushing of up to 811 MHz/V.
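To put the frequency-pushing figure in perspective, the sketch below converts it into a worst-case frequency offset for a 50 mV drift and into a relative error with respect to the 5 GHz target. It treats the 811 MHz/V value as an upper bound over the whole drift range, which is a simplifying assumption; the function names are hypothetical.

```python
def worst_case_offset_mhz(pushing_mhz_per_v: float, drift_v: float) -> float:
    """Upper bound on the frequency offset caused by a supply drift."""
    return pushing_mhz_per_v * abs(drift_v)

def relative_error(offset_mhz: float, f_target_mhz: float = 5000.0) -> float:
    """Offset expressed as a fraction of the target frequency."""
    return offset_mhz / f_target_mhz

offset = worst_case_offset_mhz(811.0, 0.05)
print(round(offset, 2), round(100 * relative_error(offset), 3))  # prints: 40.55 0.811
```

An initial error below 1% of the target frequency is small enough for the analog loop to remove quickly, which is consistent with the short recovery times reported in this section.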



Fig. 4.32 Measured frequency of free-running HOSC

To verify recovery to the same operating frequency even with supply voltage drift during sleep mode, the following three steps are performed: (1) operate the CDR at the nominal voltage to lock the frequency, (2) enter the sleep mode with a command from the command controller through the I<sup>2</sup>C interface, and (3) change the supply voltage and turn on the module again with a wake-up signal through the I<sup>2</sup>C interface.



Fig. 4.33 Post-processed waveform of frequency recovery after sleep mode. (a) Initial active mode voltage : 1 V and (b) 0.95 V, 1.05 V

The post-processed waveforms obtained through these three steps are shown in Fig. 4.33. After entering the sleep mode, frequency recovery is observed in four cases: 0.95 V, 1 V, 1.05 V, and 1.1 V. The original operating frequency of 5 GHz is recovered within 34 ns, 32 ns, 32 ns, and 36 ns, respectively. Fig. 4.33 (b) depicts the measured frequency when the voltage in the first step is changed from the nominal voltage. This is done with a variation of ±100 mV during the sleep mode, and the frequency is recovered within 33 ns and 32 ns.
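For comparison with other designs, the recovery times can be converted into unit intervals. The sketch below uses the four recovery times reported above and the 10 Gb/s data rate; the helper name `ns_to_ui` is hypothetical.

```python
def ns_to_ui(time_ns: float, data_rate_gbps: float = 10.0) -> float:
    """Convert a time in ns to unit intervals (1 UI = 1 / data rate)."""
    return time_ns * data_rate_gbps

recovery_ns = {"0.95 V": 34, "1 V": 32, "1.05 V": 32, "1.1 V": 36}
recovery_ui = {supply: ns_to_ui(t) for supply, t in recovery_ns.items()}
print(max(recovery_ui.values()))  # prints 360.0
```

The slowest case, 36 ns, corresponds to 360 UI at 10 Gb/s.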

Table 4.1 compares the proposed receiver with other receivers supporting the sleep mode. Our design shows low power consumption and offers fast frequency lock at wake-up in the presence of supply voltage drift during the sleep mode.

|                                  | JSSC<br>2020 [16] | ISSCC<br>2018 [15] | ESSCIRC<br>2016 [13] | JSSC<br>2015 [14] | This Work             |
|----------------------------------|-------------------|--------------------|----------------------|-------------------|-----------------------|
| Technology                       | 65 nm             | 14 nm              | 65 nm SOI            | 90 nm             | 28 nm                 |
| CDR Type                         | Digital           | Digital            | Digital              | Digital           | Hybrid                |
| Supply Voltage (V)               | 1.1               | 0.9                | 1.2                  | 1.2               | 1.0                   |
| Data Rate (Gb/s)                 | 12                | 56                 | 1.2 - 2.3            | 2.2               | 10                    |
| Channel Loss (dB)                | 20 @ 6 GHz        | -                  | -                    | -                 | 9.6 @ 5 GHz           |
| Active-mode Power (mW)           | 45.7              | \*126              | 24.6                 | 11.6              | 6.6                   |
| Sleep-mode Power (mW)            | 3.7               | \*8                | 0.04                 | N/A               | 1.77                  |
| Energy Efficiency<br>(pJ/b)      | 3.8               | \*2.2              | 10.7                 | 5.27              | 0.99                  |
| VT Tolerant<br>during Sleep-mode | Yes               | N/A                | N/A                  | Yes               | Yes                   |
| Wake-up Time (UI), w/o drift     | 120 (10 ns)       | 384 (6.8 ns)       | 4 (2 ns)             | 1 (0.5 ns)        | < 360 (36 ns)         |
| Wake-up Time (UI), w/ drift      | -                 |                    | -                    | 1                 | $(V_{NOM} \pm 50 mV)$ |

\*Excludes oscillator power consumption

Table 4.1 Comparison table with other receivers supporting the sleep mode


## Chapter 5

## Conclusions

This thesis presents design techniques for a high-speed, low-power internal display interface targeting next-generation mobile applications.

First, a high-speed and low-power transceiver is proposed for an AP-to-TED interface. In the TX, a pseudo SER and a 2:1 ISI-mitigating MUX are used to achieve high-speed operation and alleviate inter-symbol interference (ISI) while maintaining good power efficiency. In the RX, a hybrid-loop CDR is employed. The digital loop enables unlimited frequency tracking. Once the frequency lock code is found, the digital loop is deactivated and the analog loop is activated to achieve better power efficiency. The prototype chip is fabricated in 28-nm CMOS technology and occupies an active area of 0.196 mm<sup>2</sup>, of which the TX, RX, and PLL occupy 0.026 mm<sup>2</sup>, 0.066 mm<sup>2</sup>, and 0.012 mm<sup>2</sup>, respectively. The performance of the transceiver is evaluated over a 7-cm channel on an FPCB. Thanks to the low-power schemes in both the TX and RX, the proposed design achieves 1.23 pJ/b, the best energy efficiency among the compared transceiver designs.

Second, a receiver supporting a sleep mode is proposed for an AP-to-TED interface. Its operation is divided into an initial active mode, a sleep mode, and a wake-up state, and a scheme is proposed for each to enhance performance.

In the initial active mode, a fast-locking scheme is proposed that exploits the linearity of the frequency gain curve of the previously published stochastic frequency detection. The FSM internally divides the entire code range and adjusts the initial frequency of the HOSC using error values that reflect the frequency difference. Once all stages of the FSM are completed, the digital loop is deactivated and the analog loop is immediately activated to remove the remaining frequency and phase errors. Through the FSM, the frequency tracking code is coarsely moved close to the target frequency code, achieving a lock time of 0.37 µs, significantly faster than the 3.02 µs of the conventional scheme.
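
To illustrate why a linear frequency-detector characteristic shortens acquisition, the following Python sketch models a staged coarse search. It is only a behavioral illustration, not the FSM implemented in the chip: the linear oscillator model, the detector gain, the 512-code range, the four stages, and the 25% over-estimate of the oscillator slope are all assumed values chosen for the example.

```python
def osc_freq_ghz(code, f_min=3.0, k_osc=0.01):
    """Assumed linear oscillator model: output frequency (GHz) for a control code."""
    return f_min + k_osc * code


def fd_error(f_target, f_osc, k_fd=100.0):
    """Assumed frequency detector: averaged output proportional to the frequency difference."""
    return k_fd * (f_target - f_osc)


def coarse_lock(f_target, n_codes=512, stages=4, k_fd=100.0, k_osc_est=0.0125):
    """Jump the code by error / (estimated gain) at each stage.

    k_osc_est deliberately differs from the true slope (0.01 GHz/code) to show
    that a gain mismatch only leaves a shrinking residual for later stages.
    Returns the final code and the code reached after every stage.
    """
    code = n_codes // 2  # start from mid-code
    history = []
    for _ in range(stages):
        err = fd_error(f_target, osc_freq_ghz(code), k_fd)
        step = round(err / (k_fd * k_osc_est))  # proportional jump, in codes
        code = max(0, min(n_codes - 1, code + step))
        history.append(code)
    return code, history


final_code, history = coarse_lock(f_target=5.0)
print("codes after each stage:", history)
print("unit steps a +/-1 search would need:", abs(512 // 2 - final_code))
```

In this toy model the proportional jumps reach the target code within three stages, whereas stepping the code by one per decision would need 56 steps from the same starting point.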

For the sleep mode, a switch is added to deactivate the clock distribution path and all biases in the module. In addition, the SVDC scheme is applied solely to the digitally controlled current of the hybrid oscillator to keep that current constant. Thanks to the SVDC, the operating frequency can be recovered with the same $D_{CTRL}$ and $A_{CTRL}$ regardless of supply voltage drift during the sleep mode.

The receiver and command controller are fabricated in a 28-nm CMOS process. The receiver achieves a BER $< 10^{-12}$ when recovering 10-Gb/s data with an energy efficiency of 0.99 pJ/b in the active mode. Thanks to the above schemes, the power consumption in the sleep mode is 80% lower than in the active mode, and the frequency is recovered within 36 ns even when supply voltage drift occurs during the sleep mode.
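
The headline figures above can be related through simple unit conversions. The Python sketch below is only a sanity check on the reported numbers: it converts the 36 ns recovery time and the 0.37 µs lock time into unit intervals at 10 Gb/s, converts 0.99 pJ/b into average active power, and compares that with the 1.77 mW sleep-mode power listed in Table 4.1. The helper names are illustrative.

```python
def time_to_ui(time_s, rate_bps):
    """Number of unit intervals (bit periods) spanned by time_s at rate_bps."""
    return time_s * rate_bps


def active_power_mw(energy_pj_per_bit, rate_gbps):
    """Average power in mW: pJ/b multiplied by Gb/s gives mW."""
    return energy_pj_per_bit * rate_gbps


RATE_BPS = 10e9  # 10 Gb/s

wake_ui = time_to_ui(36e-9, RATE_BPS)    # wake-up time in UI
lock_ui = time_to_ui(0.37e-6, RATE_BPS)  # initial lock time in UI
p_active = active_power_mw(0.99, 10.0)   # active-mode power in mW
saving = 1.0 - 1.77 / p_active           # fractional sleep-mode power saving

print(f"wake-up: {wake_ui:.0f} UI, lock: {lock_ui:.0f} UI")
print(f"active power: {p_active:.1f} mW, sleep saving: {saving:.0%}")
```

The 36 ns recovery corresponds to 360 UI, and the saving computed this way (about 82%) is consistent with the roughly 80% reduction quoted above.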

## **Bibliography**

- [1] Available: http://olednet.com/galaxy/
- [2] Y. -U. Jeong et al., "A 9Gb/s Wide Output Range Transmitter With 2D Binary-Segmented Driver and Dual-Loop Calibration for Intra-Panel Interfaces," in IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 67, no. 9, pp. 1589-1593, Sept. 2020.
- [3] J. Park, J. -H. Chae, Y. -U. Jeong, J. -W. Lee and S. Kim, "A 2.1Gbps 12-channel transmitter with phase emphasis embedded serializer for UHD intra-panel interface," 2017 IEEE Asian Solid-State Circuits Conference (A-SSCC), Seoul, Korea (South), 2017, pp. 257-260.
- [4] Y. Lee et al., "29.5 12Gb/s over four balanced lines utilizing NRZ braid clock signaling with 100% data payload and spread transition scheme for 8K UHD intra-panel interfaces," 2017 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2017, pp. 490-491.
- [5] H. K. Jeon, Y. H. Moon, J. K. Kang and L. S. Kim, "An Intra-Panel Interface With Clock-Embedded Differential Signaling for TFT-LCD Systems," in Journal of Display Technology, vol. 7, no. 10, pp. 562-571, Oct. 2011.
- [6] Y. -H. Kim, T. Lee, H. -K. Jeon, D. Lee and L. -S. Kim, "An Input Data and Power Noise Inducing Clock Jitter Tolerant Reference-Less Digital CDR for LCD Intra-Panel Interface," in IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 64, no. 4, pp. 823-835, April 2017.

- [7] M. Hekmat et al., "23.3 A 6Gb/s 3-tap FFE transmitter and 5-tap DFE receiver in 65nm/0.18µm CMOS for next-generation 8K displays," 2016 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2016, pp. 402-403.
- [8] D. H. Baek, B. Kim, H. -J. Park and J. -Y. Sim, "2.6 A 5.67mW 9Gb/s DLL-based reference-less CDR with pattern-dependent clock-embedded signaling for intra-panel interface," 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), San Francisco, CA, USA, 2014, pp. 48-49.
- [9] "MIPI Multimedia specification," Available: https://www.mipi.org/mobile
- [10] Mobile Industry Processor Interface (MIPI) Specification for Display Serial Interface (DSI) Version 1.3., MIPI Alliance, March, 2015.
- [11] DisplayPort (DP) Standard Version 1.4a, Video Electronics Standards Association (VESA), San Jose, CA, April 2018.
- [12] "New Generation Large Screen Display Internal Interface iDP (Internal DisplayPort) Technology Overview," DisplayPort Developer Conference, Video Electronics Standards Association (VESA), Westin Taipei, December 2010.
- [13] T. Iizuka, N. Tohge, S. Miura, Y. Murakami, T. Nakura and K. Asada, "A 4-cycle-start-up reference-clock-less all-digital burst-mode CDR based on cycle-lock gated-oscillator with frequency tracking," ESSCIRC Conference 2016: 42nd European Solid-State Circuits Conference, 2016, pp. 301-304.

- [14] W. -S. Choi, T. Anand, G. Shu, A. Elshazly and P. K. Hanumolu, "A Burst-Mode Digital Receiver with Programmable Input Jitter Filtering for Energy Proportional Links," in IEEE Journal of Solid-State Circuits, vol. 50, no. 3, pp. 737-748, March 2015.
- [15] I. Ozkaya et al., "A 56Gb/s burst-mode NRZ optical receiver with 6.8ns power-on and CDR-Lock time for adaptive optical links in 14nm FinFET CMOS," 2018 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2018, pp. 266-268.
- [16] D. Kim, M. G. Ahmed, W. -S. Choi, A. Elkholy and P. K. Hanumolu, "A 12-Gb/s 10-ns Turn-On Time Rapid ON/OFF Baud-Rate DFE Receiver in 65-nm CMOS," in IEEE Journal of Solid-State Circuits, vol. 55, no. 8, pp. 2196-2205, Aug. 2020.
- [17] S. Shekhar, R. Inti, J. Jaussi, T. -C. Hsueh and B. K. Casper, "A Low-Power Bidirectional Link With a Direct Data-Sequencing Blind Oversampling CDR," in IEEE Journal of Solid-State Circuits, vol. 54, no. 6, pp. 1669-1681, June 2019.
- [18] A. I. Abbas and G. E. R. Cowan, "Fast-Locking Burst-Mode Clock and Data Recovery for Parallel VCSEL-Based Optical Link Receivers," in IEEE Access, vol. 10, pp. 34306-34320, 2022.

- [19] G. Shu et al., "23.1 A 16Mb/s-to-8Gb/s 14.1-to-5.9pJ/b source synchronous transceiver using DVFS and rapid on/off in 65nm CMOS," 2016 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2016, pp. 398-399.
- [20] D. Wei, T. Anand, G. Shu, J. E. Schutt-Ainé and P. K. Hanumolu, "A 10-Gb/s/ch, 0.6-pJ/bit/mm Power Scalable Rapid-ON/OFF Transceiver for On-Chip Energy Proportional Interconnects," in IEEE Journal of Solid-State Circuits, vol. 53, no. 3, pp. 873-883, March 2018.
- [21] T. Anand, M. Talegaonkar, A. Elkholy, S. Saxena, A. Elshazly and P. K. Hanumolu, "3.7 A 7Gb/s rapid on/off embedded-clock serial-link transceiver with 20ns power-on time, 740µW off-state power for energy-proportional links in 65nm CMOS," 2015 IEEE International Solid-State Circuits Conference - (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 2015, pp. 1-3.
- [22] A. Rylyakov et al., "22.1 A 25Gb/s burst-mode receiver for rapidly reconfigurable optical networks," 2015 IEEE International Solid-State Circuits Conference - (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 2015, pp. 1-3.
- [23] J. Park, J. -H. Chae, Y. -U. Jeong, J. -W. Lee and S. Kim, "A 2.1-Gb/s 12-Channel Transmitter With Phase Emphasis Embedded Serializer for 55-in UHD Intra-Panel Interface," in IEEE Journal of Solid-State Circuits, vol. 53, no. 10, pp. 2878-2888, Oct. 2018, doi: 10.1109/JSSC.2018.2859808.

- [24] T. Wang et al., "A 5.2 Gb/s Receiver for Next-Generation 8K Displays in 180 nm CMOS Process," in IEEE Journal of Solid-State Circuits, vol. 57, no. 8, pp. 2521-2531, Aug. 2022.
- [25] T. -J. Kim, C. Baek, S. Chun, K. -H. Lee, J. -I. Hwang, K. Kwon, Y. -H. Kim, H. -S. Park, Y. Shin, S. Ryu, J. -Y. Lee, G. Hwang and G. Kim, "A timing controller embedded driver IC with 3.24-Gbps eDP interface for chip-on-glass TFT-LCD applications," in Journal of the Society for Information Display, vol. 24, pp. 299–306, 2016.
- [26] R. K. Nandwana et al., "29.6 A 3-to-10Gb/s 5.75pJ/b transceiver with flexible clocking in 65nm CMOS," 2017 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2017, pp. 492-493.
- [27] M. Megahed, Y. Chun, Z. Wang and T. Anand, "A 27 Gb/s 5.39 pJ/bit 8-ary Modulated Wireline Transceiver Using Pulse Width and Amplitude Modulation Achieving 9.5 dB SNR Improvement over PAM-8," 2021 Symposium on VLSI Circuits, Kyoto, Japan, 2021, pp. 1-2.
- [28] "Deep dive about the Basic Principle of LVDS SerDes, Taking advantage of its features – high speed, long distance, low noise," Available: https://www.thine.co.jp/en/contents/detail/serdes-lvds.html
- [29] Available: https://m.blog.naver.com/PostView.naver?isHttpsRedirect=true&blogId=zpdlsjtm0&logNo=220718179482

- [30] B. Razavi, "Design Techniques for High-Speed Wireline Transmitters," in IEEE Open Journal of the Solid-State Circuits Society, vol. 1, pp. 53-66, 2021.
- [31] Y. Chang, A. Manian, L. Kong and B. Razavi, "An 80-Gb/s 44-mW Wireline PAM4 Transmitter," in IEEE Journal of Solid-State Circuits, vol. 53, no. 8, pp. 2214-2226, Aug. 2018.
- [32] A. A. Hafez, M. -S. Chen and C. -K. K. Yang, "A 32–48 Gb/s Serializing Transmitter Using Multiphase Serialization in 65 nm CMOS Technology," in IEEE Journal of Solid-State Circuits, vol. 50, no. 3, pp. 763-775, March 2015.
- [33] R. Inti, W. Yin, A. Elshazly, N. Sasidhar and P. K. Hanumolu, "A 0.5-to-2.5 Gb/s Reference-Less Half-Rate Digital CDR With Unlimited Frequency Acquisition Range and Improved Input Duty-Cycle Error Tolerance," in IEEE Journal of Solid-State Circuits, vol. 46, no. 12, pp. 3150-3162, Dec. 2011.
- [34] M. S. Jalali, R. Shivnaraine, A. Sheikholeslami, M. Kibune and H. Tamura, "An 8mW frequency detector for 10Gb/s half-rate CDR using clock phase selection," Proceedings of the IEEE 2013 Custom Integrated Circuits Conference, San Jose, CA, USA, 2013, pp. 1-8.
- [35] A. Pottbacker, U. Langmann and H. -U. Schreiber, "A Si bipolar phase and frequency detector IC for clock extraction up to 8 Gb/s," in IEEE Journal of Solid-State Circuits, vol. 27, no. 12, pp. 1747-1751, Dec. 1992.
- [36] D. Messerschmitt, "Frequency Detectors for PLL Acquisition in Timing and Carrier Recovery," in IEEE Transactions on Communications, vol. 27, no. 9, pp. 1288-1295, September 1979.

- [37] M. S. Jalali, A. Sheikholeslami, M. Kibune and H. Tamura, "A Reference-Less Single-Loop Half-Rate Binary CDR," in IEEE Journal of Solid-State Circuits, vol. 50, no. 9, pp. 2037-2047, Sept. 2015.
- [38] W. Rahman et al., "A 22.5-to-32-Gb/s 3.2-pJ/b Referenceless Baud-Rate Digital CDR With DFE and CTLE in 28-nm CMOS," in IEEE Journal of Solid-State Circuits, vol. 52, no. 12, pp. 3517-3531, Dec. 2017.
- [39] G. Shu et al., "A 4-to-10.5 Gb/s Continuous-Rate Digital Clock and Data Recovery With Automatic Frequency Acquisition," in IEEE Journal of Solid-State Circuits, vol. 51, no. 2, pp. 428-439, Feb. 2016.
- [40] J. Jin, X. Jin, J. Jung, K. Kwon, J. Kim and J. -H. Chun, "A 0.75–3.0-Gb/s Dual-Mode Temperature-Tolerant Referenceless CDR With a Deadzone-Compensated Frequency Detector," in IEEE Journal of Solid-State Circuits, vol. 53, no. 10, pp. 2994-3003, Oct. 2018.
- [41] K. Park, W. Bae, J. Lee, J. Hwang and D. -K. Jeong, "A 6.7–11.2 Gb/s, 2.25 pJ/bit, Single-Loop Referenceless CDR With Multi-Phase, Oversampling PFD in 65-nm CMOS," in IEEE Journal of Solid-State Circuits, vol. 53, no. 10, pp. 2982-2993, Oct. 2018.
- [42] C. Yu, E. Sa, S. Jin, H. Park, J. Shin and J. Burm, "A 6.5–12.5-Gb/s Half-Rate Single-Loop All-Digital Referenceless CDR in 28-nm CMOS," in IEEE Journal of Solid-State Circuits, vol. 55, no. 10, pp. 2831-2841, Oct. 2020.

- [43] J. Jin et al., "A 4.0-10.0-Gb/s Referenceless CDR with Wide-Range, Jitter-Tolerant, and Harmonic-Lock-Free Frequency Acquisition Technique," ESSCIRC 2018 - IEEE 44th European Solid State Circuits Conference (ESSCIRC), Dresden, Germany, 2018, pp. 146-149.
- [44] J. L. Sonntag and J. Stonick, "A Digital Clock and Data Recovery Architecture for Multi-Gigabit/s Binary Links," in IEEE Journal of Solid-State Circuits, vol. 41, no. 8, pp. 1867-1875, Aug. 2006.
- [45] M. Talegaonkar, R. Inti and P. K. Hanumolu, "Digital clock and data recovery circuit design: Challenges and tradeoffs," 2011 IEEE Custom Integrated Circuits Conference (CICC), San Jose, CA, USA, 2011.
- [46] B. Razavi, "Challenges in the design of high-speed clock and data recovery circuits," in IEEE Communications Magazine, vol. 40, no. 8, pp. 94-101, Aug. 2002.
- [47] Shoujun Wang, Haitao Mei, M. Baig, W. Bereza, T. Kwasniewski and R. Patel,
  "Design considerations for 2nd-order and 3rd-order bang-bang CDR loops,"
  Proceedings of the IEEE 2005 Custom Integrated Circuits Conference, 2005.,
  San Jose, CA, USA, 2005, pp. 317-320.
- [48] K. Park, M. Shim, H. -G. Ko, B. Nikolić and D. -K. Jeong, "Design Techniques for a 6.4–32-Gb/s 0.96-pJ/b Continuous-Rate CDR With Stochastic Frequency–Phase Detector," in IEEE Journal of Solid-State Circuits, vol. 57, no. 2, pp. 573-585, Feb. 2022.

- [49] M. -J. Park and J. Kim, "Pseudo-Linear Analysis of Bang-Bang Controlled Timing Circuits," in IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 60, no. 6, pp. 1381-1394, June 2013.
- [50] M. Nagata, "Constant current circuits," (in Japanese) Japanese Patent 628 228: Japanese Examined Patent Pub. 46-16463B, May 6, 1971.
- [51] T. Abe, H. Tanimoto and S. Yoshizawa, "A simple current reference with low sensitivity to supply voltage and temperature," 2017 MIXDES - 24th International Conference "Mixed Design of Integrated Circuits and Systems", Bydgoszcz, Poland, 2017, pp. 67-72.
- [52] M. Hirano, N. Tsukiji and H. Kobayashi, "Simple reference current source insensitive to power supply voltage variation - improved Minoru Nagata current source," 2016 13th IEEE International Conference on Solid-State and Integrated Circuit Technology (ICSICT), Hangzhou, 2016, pp. 87-89.
- [53] Y. -C. Huang, C. -F. Liang, H. -S. Huang and P. -Y. Wang, "15.3 A 2.4GHz ADPLL with digital-regulated supply-noise-insensitive and temperature-self-compensated ring DCO," 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), San Francisco, CA, USA, 2014, pp. 270-271.

- [54] M. Mansuri and C. -K. K. Yang, "A low-power adaptive bandwidth PLL and clock buffer with supply-noise compensation," in IEEE Journal of Solid-State Circuits, vol. 38, no. 11, pp. 1804-1812, Nov. 2003.
- [55] P. -H. Hsieh, J. Maxey and C. -K. K. Yang, "Minimizing the Supply Sensitivity of a CMOS Ring Oscillator Through Jointly Biasing the Supply and Control Voltages," in IEEE Journal of Solid-State Circuits, vol. 44, no. 9, pp. 2488-2495, Sept. 2009.
- [56] Y. Song, H. -G. Ko, C. Kim and D. -K. Jeong, "A 1.05-to-3.2 GHz All-Digital PLL for DDR5 Registering Clock Driver With a Self-Biased Supply-Noise-Compensating Ring DCO," in IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 69, no. 3, pp. 759-763, March 2022.

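## Abstract (Korean)

This thesis proposes a new architecture for an internal display interface for mobile applications. Considering the growing amount of data required by high-specification displays and the limited battery capacity of smartphones, the interface must be designed for both high-speed and low-power operation.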

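In the first prototype design, a transceiver operating at 10 Gb/s/lane is presented. In the transmitter, a pseudo serializer and a 2:1 ISI-mitigating MUX are used to achieve good power efficiency and mitigate inter-symbol interference (ISI). The proposed serializer saves power by reducing the clock distribution relative to a conventional serializer, and the proposed MUX pre-charges or pre-discharges the floating nodes inside its tri-state inverters to erase the information of previously transmitted data and thereby remove ISI. In the receiver, a hybrid loop is used. The digital loop is first activated to perform frequency tracking. Once frequency tracking is finished, the analog loop is activated to remove the remaining frequency and phase errors. Adopting the digital loop enables unlimited frequency tracking, and with the analog loop the edge deserializer and the digital loop filter are deactivated to obtain good power efficiency. The prototype chip is fabricated in a 28-nm CMOS process and occupies 0.196 mm<sup>2</sup>. Of this, the transmitter, receiver, and phase-locked loop occupy 0.026 mm<sup>2</sup>, 0.066 mm<sup>2</sup>, and 0.012 mm<sup>2</sup>, respectively. The overall transceiver achieves an energy efficiency of 1.23 pJ/b.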

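In the second prototype design, a receiver is proposed that achieves faster frequency tracking in the active mode, supports a sleep mode, and quickly recovers its original frequency even when the supply voltage changes during the sleep mode. Exploiting the linearity of the frequency gain curve, the initial digital code is adjusted using a finite-state machine. In addition, a hybrid-loop clock recovery circuit is used, in which an AND gate added at the output of the digital loop filter allows fast entry into and recovery from the sleep mode, while the analog loop provides good jitter performance. A circuit that cancels supply voltage variation is also added so that a constant current always flows. With the hybrid clock recovery circuit and the supply-variation-cancelling circuit, the frequency can be recovered quickly without re-acquisition even if the supply voltage changes during the sleep mode. The prototype chip is fabricated in a 28-nm CMOS process, occupies 0.089 mm<sup>2</sup>, and achieves an energy efficiency of 0.99 pJ/bit in the active mode. The proposed fast frequency tracking scheme achieves a frequency lock time of 0.37 µs, faster than the conventional lock time of 3.02 µs. In the sleep mode, power is reduced by 80% compared with the active mode. Furthermore, even under the worst-case supply voltage change during the sleep mode, the frequency is recovered within 36 ns upon reactivation.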

**Keywords**: internal display interface, transmitter, receiver, clock recovery circuit, fast frequency tracking, supply voltage variation, sleep mode, wake-up, fast frequency recovery

Student Number: 2019-25389