



PH.D. DISSERTATION

## A DESIGN OF SINGLE-ENDED VOLTAGE-MODE PAM-4 TRANSMITTER FOR MEMORY INTERFACES


BY

CHANGHO HYUN

FEBRUARY 2023

DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING COLLEGE OF ENGINEERING SEOUL NATIONAL UNIVERSITY

### A DESIGN OF SINGLE-ENDED VOLTAGE-MODE PAM-4 TRANSMITTER FOR MEMORY INTERFACES

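Advisor: Professor 김수환

Submitted as a dissertation for the degree of Doctor of Philosophy in Engineering

February 2023

Graduate School of Seoul National University

Department of Electrical and Computer Engineering

Changho Hyun

The Ph.D. dissertation of Changho Hyun is hereby approved

February 2023

| Role       | Name   | Seal   |
|------------|--------|--------|
| Chair      | 정덕균 | (seal) |
| Vice Chair | 김수환 | (seal) |
| Member     | 김재하 | (seal) |
| Member     | 최우석 | (seal) |
| Member     | 채주형 | (seal) |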

### ABSTRACT

## A DESIGN OF SINGLE-ENDED VOLTAGE-MODE PAM-4 TRANSMITTER FOR MEMORY INTERFACES

CHANGHO HYUN Department of Electrical and Computer Engineering College of Engineering Seoul National University

As the demand for high-bandwidth memory increases, the data rate per pin is also increasing. In this thesis, single-ended voltage-mode PAM-4 transmitters are proposed to increase memory bandwidth. Since PAM-4 signaling has four voltage levels, output level distortion occurs in a single-ended PAM-4 transmitter due to drain-source voltage ( $V_{DS}$ ) variation when the output signal changes from one level to another.

To address this issue, the proposed PAM-4 transmitter has additional pull-up driver units. The additional pull-up driver units can reduce the eye height difference between four levels of PAM-4 signal by raising two intermediate voltage levels. Implemented in 65nm CMOS technology, the active area of the transmitter is 0.06mm<sup>2</sup>. It draws 61.5mW at a data rate of 20 Gb/s/pin.

In addition, we propose another PAM-4 transmitter that prevents output level distortion while matching its impedance with the channel. ZQ codes for all four output signal levels are obtained through ZQ calibration and saved in the ZQ code table. The ZQ code generator then adaptively selects the appropriate codes depending on the data pattern and delivers them to the output driver. To validate the effectiveness of our approach, a prototype chip with an active area of 0.035mm<sup>2</sup> was fabricated in a 65nm CMOS technology. It achieved an energy efficiency of 3.09 pJ/bit/pin at 18 Gb/s/pin, and its level separation mismatch ratio (RLM) is 0.971 while matching the channel impedance.

**Keywords**: Single-ended voltage-mode transmitter, PAM-4 signaling, Memory interface, Level separation mismatch ratio (RLM), Impedance matching, ZQ calibration.

Student Number: 2016-20987.

# CONTENTS

| Section    | Title                                                                                           |
|------------|-------------------------------------------------------------------------------------------------|
|            | ABSTRACT                                                                                        |
|            | CONTENTS                                                                                        |
|            | LIST OF FIGURES                                                                                 |
|            | LIST OF TABLES                                                                                  |
| CHAPTER 1  | INTRODUCTION                                                                                    |
| 1.1        | MOTIVATION                                                                                      |
| 1.2        | THESIS ORGANIZATION                                                                             |
| CHAPTER 2  | DESIGN CONSIDERATIONS OF PAM-4 TRANSMITTER FOR MEMORY INTERFACE                                 |
| 2.1        | WHAT IS PAM-4 SIGNALING?                                                                        |
| 2.2        | VOLTAGE-MODE AND CURRENT-MODE DRIVER                                                            |
| 2.3        | LEVEL SEPARATION MISMATCH RATIO                                                                 |
| 2.4        | IMPEDANCE MATCHING WITH TRANSMISSION LINE                                                       |
| 2.5        | PREVIOUS ARTS                                                                                   |
| CHAPTER 3  | DUAL-MODE PAM-4/NRZ SINGLE-ENDED TRANSMITTER WITH RLM COMPENSATION                              |
| 3.1        | OVERALL ARCHITECTURE                                                                            |
| 3.1.1      | INTERNAL CLOCK PATH                                                                             |
| 3.1.2      | 32:4 SERIALIZER, AND ENCODER & RETIMER                                                          |
| 3.2        | DUAL-MODE PAM-4/NRZ SINGLE-ENDED TRANSMITTER WITH RLM COMPENSATION                              |
| 3.3        | MEASUREMENTS AND RESULTS                                                                        |
| 3.3.1      | CHIP MICROGRAPH AND MEASUREMENT SETUP                                                           |
| 3.3.2      | MEASUREMENT RESULTS                                                                             |
| CHAPTER 4  | A SINGLE-ENDED PAM-4 TRANSMITTER WITH ADAPTIVE IMPEDANCE MATCHING AND OUTPUT LEVEL COMPENSATION |
| 4.1        | OVERALL ARCHITECTURE                                                                            |
| 4.2        | ADAPTIVE IMPEDANCE MATCHING WITH MANUAL ZQ CALIBRATION                                          |
| 4.2.1      | MANUAL ZQ CALIBRATION                                                                           |
| 4.2.2      | ZQ CODE TABLE AND ADAPTIVE ZQ CODE GENERATOR                                                    |
| 4.3        | PROPOSED OUTPUT DRIVER                                                                          |
| 4.4        | MEASUREMENTS AND RESULTS                                                                        |
| 4.4.1      | CHIP MICROGRAPH AND MEASUREMENT SETUP                                                           |
| 4.4.2      | MEASUREMENT RESULTS                                                                             |
| CHAPTER 5  | CONCLUSION                                                                                      |
| APPENDIX A | RON VARIATION REDUCING ZQ CALIBRATION SCHEME FOR MEMORY INTERFACES                              |
|            | BIBLIOGRAPHY                                                                                    |
|            | ABSTRACT IN KOREAN                                                                              |

# **LIST OF FIGURES**

| Figure        | Caption |
|---------------|---------|
| Fig. 1.1.1    | Various applications of DRAM |
| Fig. 1.1.2    | 2021-2027 evolution of the stand-alone memory market |
| Fig. 1.1.3    | DRAM data bandwidth growth |
| Fig. 1.1.4    | Data rate per pin trends for DRAM |
| Fig. 2.1.1    | Basic characteristics of NRZ and PAM-4 signals |
| Fig. 2.2.1    | Driver types for PAM-4 signaling: (a) Single-ended voltage-mode driver, (b) Differential current-mode driver |
| Fig. 2.3.1    | Concept of a level separation mismatch ratio (RLM) |
| Fig. 2.3.2    | Single-ended voltage-mode driver structure for PAM-4 signaling and its level configuration: (a) Level 3, (b) Level 2, (c) Level 1, and (d) Level 0 |
| Fig. 2.3.3    | $V_{DS}$ variation of a single-ended voltage-mode PAM-4 driver: (a) Level 3, (b) Level 2, (c) Level 1, and (d) Level 0 |
| Fig. 2.3.4    | Output eye diagram of PAM-4 driver considering $V_{DS}$ variation |
| Fig. 2.4.1    | Impedance discontinuities causing reflection along a transmission line |
| Fig. 3.1.1    | Overall block diagram of proposed dual-mode PAM-4/NRZ transmitter |
| Fig. 3.1.1.1  | Internal clock path |
| Fig. 3.2.1    | A schematic of basic source-series terminated (SST) driver unit |
| Fig. 3.2.2    | Overall structure of the output driver |
| Fig. 3.2.3    | The simulated eye diagram of the output driver: (a) without additional PU driver units, (b) with additional PU driver units |
| Fig. 3.3.1.1  | Measurement setup and chip micrograph |
| Fig. 3.3.2.1  | Measured PAM-4 eye diagrams at: (a) 16Gb/s, (b) 20Gb/s |
| Fig. 3.3.2.2  | Measured PAM-4 eye diagrams at 20Gb/s: (a) without equalization and level compensation, (b) with equalization alone, and (c) with both equalization and level compensation |
| Fig. 3.3.2.3  | Measured eye diagram of NRZ operation at 10Gb/s |
| Fig. 3.3.2.4  | Power breakdown of the proposed transmitter |
| Fig. 4.1.1    | Overall architecture of the proposed PAM-4 transmitter |
| Fig. 4.2.1.1  | An example of the manual ZQ calibration using a simplified driver circuit diagram when both current and 1-UI delayed data are "11" |
| Fig. 4.2.2.1  | The ZQ code table and the adaptive ZQ code generator |
| Fig. 4.3.1    | (a) Output driver using a data and driver code encoder [1.1.5, 2.5.5] and (b) the proposed output driver |
| Fig. 4.3.2    | (a) Operation for conventional [1.1.5] and proposed PAM-4 driver and (b) one example of the adaptive PD code generation in the proposed PAM-4 transmitter |
| Fig. 4.4.1.1  | Measurement setup |
| Fig. 4.4.1.2  | Die micrograph with magnified layout |
| Fig. 4.4.2.1  | Measured PAM-4 eye diagrams (a) with the conventional and (b) proposed PAM-4 transmitter, at 18Gb/s/pin |
| Fig. 4.4.2.2  | Measured BER bathtub curves of PAM-4 operation (a) with the conventional and proposed PAM-4 transmitter, at 18Gb/s/pin |
| Fig. 4.4.2.3  | Power breakdown of the PAM-4 transmitter at 18Gb/s/pin |
| Fig. 4.4.2.4  | Measured output return loss of the proposed PAM-4 transmitter |
| Fig. A.1      | (a) Block diagram of a conventional ZQ calibration scheme, (b) timing diagram of a conventional ZQ calibration scheme |
| Fig. A.2      | The voltage difference ( $\Delta V$ ) between $V_{ZQ}$ and $V_{REF}$ |
| Fig. A.3      | Conventional DQ driver configuration |
| Fig. A.4      | Block diagram of the proposed ZQ calibration scheme |
| Fig. A.5      | Proposed DQ driver configuration |
| Fig. A.6      | Error ratio according to the difference with the target impedance |
| Fig. A.7      | PU/PD lock detector circuits |
| Fig. A.8      | (a) Operation and timing diagram of PD lock detector, (b) operation timing diagram of PU lock detector |
| Fig. A.9      | Overall architecture of the transmitter with proposed ZQ calibration scheme |
| Fig. A.10     | (a) Block diagram of the 4-phase clock generator, and (b) block diagram of the clock generator |
| Fig. A.11     | Simulated error ratio with the target impedance (a) PU-DRV calibration, (b) PD-DRV calibration |
| Fig. A.12     | Location of ZQ pin and DQ pins |
| Fig. A.13     | Simulated calibration result per DQ: (a) PU-DRV calibration, (b) PD-DRV calibration |

# LIST OF TABLES

| Table           | Caption |
|-----------------|---------|
| Table 2.1.1     | Comparison of NRZ and PAM-4 signal |
| Table 2.2.1     | Comparison of voltage-mode driver and current-mode driver |
| Table 3.3.2.1   | Performance summary and comparison table with other PAM-4/NRZ dual-mode transmitters |
| Table 4.3.1     | Active PU/PD path and output impedance |
| Table 4.4.2.1   | Transmitter performance summary |
| Table 4.4.2.2   | Comparison with other recent voltage-mode PAM-4 transmitters |
| Table A.1       | Error ratio of the conventional ZQ calibration |
| Table A.2       | Error ratio of the proposed ZQ calibration |

### **CHAPTER 1**

### INTRODUCTION

### **1.1 MOTIVATION**



Fig. 1.1.1. Various applications of DRAM.

Dynamic random-access memory (DRAM), whose cell mainly consists of one transistor and one capacitor, is a type of RAM: a temporary storage device that stores digital data in cell capacitors. The demand for high-bandwidth DRAM is constantly increasing as the amount of data to be processed grows with the development of various applications such as the internet of things (IoT), 5G communication, artificial intelligence (AI), smart cars, and personal computers (PC), as shown in Fig. 1.1.1 [1.1.1]-[1.1.3]. Fig. 1.1.2 shows the evolution of the stand-alone memory market. The market is expected to grow steadily; in particular, DRAM revenue is expected to grow from \$94 billion in 2021 to \$158 billion in 2027, which also supports the need for high-bandwidth DRAM.

Fig. 1.1.2. 2021-2027 evolution of the stand-alone memory market.

Fig. 1.1.3 shows memory bandwidth growth since 2008. Data bandwidth has been continuously increasing in all types of memory. High-bandwidth DRAM can be achieved by using a dual in-line memory module (DIMM), in which multiple DRAM chips are mounted on a circuit board. However, a DIMM requires a multi-drop structure, which degrades signal integrity through reflection and signal attenuation. Another method for high-bandwidth DRAM is to increase the number of input/output (I/O) pins, as in high-bandwidth memory (HBM) [1.1.4]. However, a large number of I/O pins is accompanied by high chip cost and complex signal routing. A third method is to use a higher data rate per pin. Fig. 1.1.4 shows data rate per pin trends for DRAM. The data rate per pin increases gradually in memories such as double data rate (DDR), low power double data rate (LPDDR), and graphics double data rate (GDDR). Increasing the data rate per pin requires increasing the clock frequency. However, since power consumption and frequency-dependent channel loss increase at higher frequencies, it is difficult to increase the data rate using non-return-to-zero (NRZ) signaling. Therefore, four-level pulse amplitude modulation (PAM-4) signaling is the most promising way of addressing this problem [1.1.5].

Fig. 1.1.3. DRAM data bandwidth growth.

Fig. 1.1.4. Data rate per pin trends for DRAM.

#### **1.2 THESIS ORGANIZATION**

The thesis is organized as follows. Chapter 2 introduces design considerations of PAM-4 transmitters for memory interfaces. Chapter 3 presents the proposed dual-mode PAM-4/NRZ transmitter with RLM compensation, together with measurement results. Chapter 4 introduces the proposed single-ended PAM-4 transmitter with adaptive impedance matching and output level compensation, together with measurement results. Chapter 5 summarizes the thesis and discusses its contributions.

### **CHAPTER 2**

## **DESIGN CONSIDERATIONS OF PAM-4 TRANSMITTER FOR MEMORY INTERFACE**

### 2.1 WHAT IS PAM-4 SIGNALING?



Fig. 2.1.1. Basic characteristics of NRZ and PAM-4 signals.

Fig. 2.1.1 shows the basic characteristics of NRZ and PAM-4 signals. NRZ, also called 2-level pulse amplitude modulation, is a modulation method that uses two signal levels to represent 0 or 1. Its data transfer speed is limited because channel loss causes inter-symbol interference (ISI) at high speeds. In addition, the continuing trend towards reduced supply voltage makes it more difficult to increase the data rate with NRZ signaling. PAM-4 signaling, which has four voltage levels instead of two, is emerging to overcome these limitations. Because PAM-4 transmits 2 bits per symbol, it is more suitable for higher speeds: its clock frequency is only half of that required for NRZ signaling at the same data rate [2.1.1]. However, a PAM-4 signal has only one third of the eye height of an NRZ signal, resulting in a signal-to-noise ratio (SNR) degradation of about 9.5dB [2.1.1, 2.1.2]. In addition, further SNR degradation may occur due to the non-linearity of the transmitter output [2.1.3], which causes an imbalance between the PAM-4 signal levels. Therefore, a PAM-4 transmitter should be designed considering both the reduced eye height and the non-linearity of the transmitter.

| Signal Type                           | NRZ                                      | PAM-4                                 |
|---------------------------------------|------------------------------------------|---------------------------------------|
| Signal Bandwidth                      | 1/T                                      | 1/(2T)                                |
| Pin Efficiency                        | 100%                                     | 200%                                  |
| Clock Frequency                       | F                                        | (1/2)*F                               |
| PAM-4 characteristics compared to NRZ | High pin efficiency; low clock frequency | Non-linearity issue; small eye height |

Table. 2.1.1. Comparison of NRZ and PAM-4 signal.

The details to be considered when designing the PAM-4 transmitter are discussed in the following sections.
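The 9.5dB SNR penalty quoted in this section follows directly from the one-third eye height. The short sketch below reproduces that arithmetic, assuming the same peak-to-peak swing and the same noise for both signaling schemes; the function name is illustrative and not from the thesis.

```python
import math


def pam_snr_penalty_db(levels: int) -> float:
    """SNR penalty of PAM-N relative to NRZ, assuming equal swing and noise.

    A PAM-N signal stacks (N - 1) eyes in the same swing, so each eye
    is 1/(N - 1) of the NRZ eye height.
    """
    eyes = levels - 1
    return 20 * math.log10(eyes)


print(f"PAM-4 penalty: {pam_snr_penalty_db(4):.2f} dB")  # 9.54 dB
```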

### 2.2 VOLTAGE-MODE AND CURRENT-MODE DRIVER






Fig. 2.2.1. Driver types for PAM-4 signaling: (a) Single-ended voltage-mode driver, (b) Differential current-mode driver.

| Driver Type        | Voltage-mode                | Current-mode |
|--------------------|-----------------------------|--------------|
| Impedance matching | Bad                         | Good         |
| Power consumption  | Low                         | High         |
| Linearity          | Susceptible                 | Robust       |
| Area               | Small                       | Large        |
| Signaling          | Single-ended / Differential | Differential |

Table. 2.2.1. Comparison of voltage-mode driver and current-mode driver.

The drivers used in transmitters are typically either voltage-mode or current-mode drivers. Fig. 2.2.1 shows the two driver types for PAM-4 signaling. Current-mode drivers are used in high-performance serial links and have the advantage of easy output impedance control. However, they require 2 or 4 times more current than a voltage-mode driver for a given output swing, so their power consumption is high [2.2.1]. Also, since the current-mode driver requires a differential structure, it is not appropriate for a memory interface, which has restrictions on the number of pins, cost, and power consumption [2.2.2]. Table 2.2.1 summarizes the characteristics of the voltage-mode driver and the current-mode driver. The voltage-mode driver is more suitable for memory interfaces considering power consumption, area, and signaling type.

#### 2.3 LEVEL SEPARATION MISMATCH RATIO



Fig. 2.3.1. Concept of a level separation mismatch ratio (RLM).

As mentioned in Section 2.1, PAM-4 has four levels: the highest voltage level is level 3, the lowest is level 0, and the two middle levels are level 2 and level 1. The level separation mismatch ratio (RLM) is an indicator of the linearity of a PAM-4 signal [2.3.1]. It is determined by the smallest gap between the four signal levels. If the gaps are A, B, and C from the top, then the RLM is defined as shown in Fig. 2.3.1. RLM takes a value between 0 and 1. When the gaps between levels are similar, it approaches 1, which means that the PAM-4 signal is more linear. Conversely, when RLM is close to 0, one of the three eyes is very small and the PAM-4 linearity is poor.

Fig. 2.3.2. Single-ended voltage-mode driver structure for PAM-4 signaling and its level configuration: (a) Level 3, (b) Level 2, (c) Level 1, and (d) Level 0.
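Because Fig. 2.3.1 is not reproduced here, the sketch below assumes the commonly used form RLM = 3·min(A, B, C)/(A + B + C), which matches the description above (set by the smallest gap, bounded by 0 and 1). This formula is an assumption rather than a quotation from the figure. As a sanity check, applying it to the three eye heights reported in Section 3.3.2 (used here as a proxy for the level gaps) gives values close to the RLM figures stated there.

```python
def rlm(gap_top: float, gap_mid: float, gap_bot: float) -> float:
    """Level separation mismatch ratio from the three PAM-4 gaps A, B, C.

    Assumed definition: 3 * min(A, B, C) / (A + B + C).
    Equal gaps give 1.0; a collapsed eye gives 0.0.
    """
    gaps = (gap_top, gap_mid, gap_bot)
    if min(gaps) < 0 or sum(gaps) <= 0:
        raise ValueError("gaps must be non-negative with a positive sum")
    return 3 * min(gaps) / sum(gaps)


# Eye heights in mV from Section 3.3.2, listed from the top eye down.
print(round(rlm(60.5, 58.2, 39.7), 2))  # equalization only -> 0.75
print(round(rlm(56.1, 53.4, 54.3), 2))  # with level compensation -> 0.98
```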

The structure of a conventional single-ended voltage-mode driver for PAM-4 signaling and its level configuration are shown in Fig. 2.3.2. Each level is determined by the most significant bit (MSB) and least significant bit (LSB) data. For example, assuming VSSQ termination, if both MSB and LSB data are 0, both PMOS transistors are turned on and the output voltage ( $V_{OUT}$ ) becomes VDDQ/2.

However, in reality the output voltage levels deviate from the ideal levels shown in Fig. 2.3.2. Fig. 2.3.3 shows the drain-source voltage ( $V_{DS}$ ) variation of the single-ended voltage-mode PAM-4 driver. If the currents flowing through the PMOS and NMOS transistors are $I_{D,PU}$ and $I_{D,PD}$, respectively, then the total impedance of the pull-up path ( $R_{ON,PU}$ ) and the pull-down path ( $R_{ON,PD}$ ) can be expressed as follows:

$$R_{ON,PU} = \frac{V_{DS,PU}}{\left|I_{D,PU}\right|} = \frac{1}{\frac{1}{2}\mu_{p}C_{OX}\frac{W}{L}[2(V_{GS,PU} - V_{TH,PU}) - V_{DS,PU}]}, \tag{2.3.1}$$

$$R_{ON,PD} = \frac{V_{DS,PD}}{\left|I_{D,PD}\right|} = \frac{1}{\frac{1}{2}\mu_{n}C_{OX}\frac{W}{L}[2(V_{GS,PD} - V_{TH,PD}) - V_{DS,PD}]}, \tag{2.3.2}$$

where $\mu_p$ and $\mu_n$ are the mobilities of holes and electrons, respectively, $C_{OX}$ is the gate oxide capacitance per unit area, $W/L$ is the transistor aspect ratio, $V_{GS}$ is the gate-source voltage, $V_{DS}$ is the drain-source voltage, and $V_{TH}$ is the threshold voltage of the transistor.

Fig. 2.3.3. *V<sub>DS</sub>* variation of a single-ended voltage-mode PAM-4 driver: (a) Level 3, (b) Level 2, (c) Level 1, and (d) Level 0.

Fig. 2.3.4. Output eye diagram of PAM-4 driver considering $V_{DS}$ variation.
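The dependence of the on-resistance on $V_{DS}$ in (2.3.1) and (2.3.2) can be checked numerically. The sketch below evaluates the triode-region expression using magnitudes only. The device constant `k` (standing for $\mu C_{OX} W/L$) and the overdrive voltage are hypothetical values chosen for illustration, not parameters from the thesis.

```python
def r_on(v_ov: float, v_ds: float, k: float) -> float:
    """On-resistance of a triode-region MOSFET, following (2.3.1)/(2.3.2).

    v_ov: overdrive |V_GS - V_TH| in volts
    v_ds: |V_DS| in volts (must not exceed v_ov to stay in triode)
    k:    mu * C_OX * W / L in A/V^2 (hypothetical device constant)
    """
    if not 0 <= v_ds <= v_ov:
        raise ValueError("triode-region model requires 0 <= v_ds <= v_ov")
    return 1.0 / (0.5 * k * (2 * v_ov - v_ds))


K = 0.02      # A/V^2, illustrative only
V_OV = 0.6    # V, illustrative only
for v_ds in (0.1, 0.2, 0.3, 0.4):
    print(f"V_DS = {v_ds:.1f} V -> R_ON = {r_on(V_OV, v_ds, K):.1f} ohm")
```

With these assumed values the resistance rises monotonically as $V_{DS}$ grows, which is the mechanism behind the level distortion.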

The variation of $V_{DS,PU}$ and $V_{DS,PD}$ in the pull-up and pull-down transistors occurs as the output changes from one level to another; this changes the transistors' on-resistance and distorts the output signal level. For example, when there is a transition from level 3 to level 2, $V_{DS}$ of the pull-up path increases and $V_{DS}$ of the pull-down path decreases. Therefore, according to (2.3.1) and (2.3.2), $R_{ON,PU}$ increases and $R_{ON,PD}$ decreases. The output voltage can be expressed in terms of the impedances of the pull-up and pull-down paths as follows:

$$V_{OUT} = \frac{R_{ON,PD} \parallel R_{TERM}}{R_{ON,PU} + R_{ON,PD} \parallel R_{TERM}} \cdot VDDQ \tag{2.3.3}$$

The change in $V_{DS}$ causes the voltage corresponding to level 2 to drop below VDDQ/3 and the voltage corresponding to level 1 to drop below VDDQ/6. Fig. 2.3.4 shows the output eye diagram of the PAM-4 driver considering $V_{DS}$ variation. The output signal levels deviate from the ideal levels, deteriorating the RLM. Since the overall performance depends on the smallest eye height of the transmitter [2.3.2], it is important to equalize the voltage differences between signal levels, taking the $V_{DS}$ fluctuation into account.
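A small numerical sketch of (2.3.3) illustrates this drop. It assumes a 50 Ω termination to VSSQ and hypothetical branch resistances of 75 Ω (MSB) and 150 Ω (LSB), chosen only so that the ideal levels come out at VDDQ/2, VDDQ/3, and VDDQ/6. The ±10% shift in on-resistance is likewise an illustrative assumption, not a measured value.

```python
from typing import Optional


def parallel(a: float, b: float) -> float:
    """Resistance of two resistors in parallel."""
    return a * b / (a + b)


def v_out(r_pu: float, r_pd: Optional[float],
          r_term: float = 50.0, vddq: float = 1.0) -> float:
    """Output voltage per (2.3.3); r_pd=None means the pull-down path is off."""
    load = r_term if r_pd is None else parallel(r_pd, r_term)
    return load / (r_pu + load) * vddq


# Ideal levels (hypothetical 75/150 ohm branches, normalized to VDDQ = 1).
print("level 3:", v_out(50.0, None))     # both pull-ups on: 75 || 150 = 50
print("level 2:", v_out(75.0, 150.0))    # MSB pull-up, LSB pull-down
print("level 1:", v_out(150.0, 75.0))    # LSB pull-up, MSB pull-down

# Assumed V_DS effect: pull-up resistance +10%, pull-down resistance -10%.
print("level 2 distorted:", v_out(75.0 * 1.1, 150.0 * 0.9))
print("level 1 distorted:", v_out(150.0 * 1.1, 75.0 * 0.9))
```

Under these assumptions both middle levels fall below their ideal values while levels 3 and 0 stay anchored, so the gaps become unequal.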

#### 2.4 IMPEDANCE MATCHING WITH TRANSMISSION LINE



Fig. 2.4.1. Impedance discontinuities causing reflection along a transmission line.

Fig. 2.4.1 shows how impedance mismatch and reflection are related. When an electromagnetic signal travels along a transmission line, also called a channel, signal reflection occurs if the impedance at the transmitter differs from the line impedance. In other words, when the impedance seen from the output of the transmitter ( $Z_{TX}$ ) is different from the transmission line impedance ( $Z_0$ ), signal reflection occurs, which degrades signal integrity. A current-mode driver has a large output impedance, close to infinity, so impedance matching can be easily achieved with a parallel termination resistor. However, the current-mode driver is difficult to use in a memory interface because it requires differential signaling and consumes a lot of power. On the other hand, a voltage-mode driver requires series termination because it has a relatively small output impedance. Therefore, the sum of the series termination resistance and the on-resistance of the metal-oxide-semiconductor field-effect transistors (MOSFETs) should equal $Z_0$ to prevent reflection.
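The amount of reflection caused by a mismatch between $Z_{TX}$ and $Z_0$ can be quantified with the standard reflection coefficient $\Gamma = (Z_{TX} - Z_0)/(Z_{TX} + Z_0)$. The sketch below evaluates it for a few transmitter impedances; the 50 Ω line impedance and the example values are assumptions for illustration.

```python
def reflection_coefficient(z_tx: float, z0: float = 50.0) -> float:
    """Fraction of an incident wave reflected at a termination z_tx on a line z0."""
    return (z_tx - z0) / (z_tx + z0)


for z_tx in (40.0, 50.0, 60.0):
    gamma = reflection_coefficient(z_tx)
    print(f"Z_TX = {z_tx:.0f} ohm -> gamma = {gamma:+.3f}")
```

A matched transmitter gives zero reflection, while a deviation of 10 Ω in either direction reflects roughly a tenth of the incident wave.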

#### 2.5 **PREVIOUS ARTS**

There have been PAM-4 transmitters that adopt a current-mode driver [2.1.2, 2.5.1]. With a current-mode driver, the simplest way to achieve similar gaps between PAM-4 levels is to increase the supply voltage [2.5.1], but this increases power consumption. Moreover, a current-mode driver consumes more power than a voltage-mode driver. Therefore, PAM-4 transmitters that use voltage-mode drivers while improving PAM-4 signal linearity have recently been reported [1.1.5, 2.5.3, 2.5.4, 2.5.5]. However, in previous designs [1.1.5, 2.5.3, 2.5.4], either the signal level distortion cannot be alleviated while matching the channel impedance, or the output impedance is calibrated based on only one output level, so impedance mismatch can occur at the other signal levels. A PAM-4 transmitter with three-point ZQ calibration [2.5.5] can match the impedance at the +3, +1, and -1 levels considering the $V_{DS}$ variation, but cannot match the channel impedance at the -3 level.

### **CHAPTER 3**

## **DUAL-MODE PAM-4/NRZ SINGLE-ENDED TRANSMITTER WITH RLM COMPENSATION**

To alleviate the issues of the PAM-4 transmitter mentioned in Sections 2.2 and 2.3, we present a dual-mode PAM-4/NRZ single-ended transmitter with RLM compensation. Its output driver is composed of 60 basic source-series terminated (SST) driver units and 12 additional pull-up (PU) driver units. The additional PU driver units are used to reduce the eye height difference between the four voltage levels of PAM-4 signaling.

#### **3.1 OVERALL ARCHITECTURE**

Fig. 3.1.1 shows the overall block diagram of the proposed dual-mode PAM-4/NRZ transmitter. An on-chip pseudo-random bit sequence (PRBS) generator creates 32-bit parallel data, which are sent to the 32:4 serializer (SER). A mode selection signal is used in the mode selector to choose between PAM-4 and NRZ data. The signal then passes to a retimer and an encoder, which contains an equalization selector and logic that applies PAM-4 level compensation. The driver consists of 60 basic segments and 12 additional segments, including four 4:1 multiplexers (MUXs), a 4:1 SER, and an SST driver with a shared resistor.



Fig. 3.1.1. Overall block diagram of proposed dual-mode PAM-4/NRZ transmitter.

#### **3.1.1 INTERNAL CLOCK PATH**



Fig. 3.1.1.1. Internal clock path.

The internal clock path is depicted in Fig. 3.1.1.1. Two external clock signals, CLKP and CLKN, pass through the clock buffer, the IQ generator, and inverter chains to generate clock signals with a 90° phase difference. 3-bit MOS capacitors compensate for the skew among the 4-phase clocks by controlling the delay time.

#### 3.1.2 32:4 SERIALIZER, AND ENCODER & RETIMER

The 32:4 serializer has three internal stages, which perform 32:16, 16:8, and 8:4 serialization, and each stage is made up of 2:1 SERs. In PAM-4 mode, 8 parallel data are sent to the encoder side; in NRZ mode, 4 parallel data are transmitted. A retimer aligns both the PAM-4 and NRZ data using one of the 4-phase clock signals from the internal clock path. After the data are aligned with one clock signal, four parallel data with one unit-interval (UI) delay are generated using all four clock phases. The data that pass through the retimer are transmitted to the 4:1 MUX via the encoder, and the feed-forward equalization (FFE) coefficient is determined by control signals in the encoder. Level compensation logic is included in the encoder and retimer block.
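The three-stage structure of the 32:4 serializer can be sketched behaviorally as a tree of 2:1 interleavers. The lane pairing used below (lane i merged with lane i + N/2 at each stage) and the resulting bit order are assumptions made for illustration; the thesis states only that each stage is built from 2:1 SERs.

```python
from typing import List

Lane = List[int]


def ser_2to1(a: Lane, b: Lane) -> Lane:
    """2:1 serializer: interleave two lanes into one lane at twice the rate."""
    out: Lane = []
    for x, y in zip(a, b):
        out += [x, y]
    return out


def ser_stage(lanes: List[Lane]) -> List[Lane]:
    """One serializer stage: halve the lane count (assumed pairing i, i + N/2)."""
    half = len(lanes) // 2
    return [ser_2to1(lanes[i], lanes[i + half]) for i in range(half)]


def ser_32to4(lanes: List[Lane]) -> List[Lane]:
    """32:16, 16:8, and 8:4 stages applied in sequence."""
    if len(lanes) != 32:
        raise ValueError("expected 32 input lanes")
    for _ in range(3):
        lanes = ser_stage(lanes)
    return lanes


# One 32-bit word, with each bit labeled by its index so the order is visible.
word_lanes = [[bit] for bit in range(32)]
out_lanes = ser_32to4(word_lanes)
for j, lane in enumerate(out_lanes):
    print(f"lane {j}: {lane}")
```

With this pairing, output lane j carries bits j, j+4, ..., j+28 in order, so a following 4:1 stage that cycles through the four lanes would restore the original bit sequence.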
### 3.2 DUAL-MODE PAM-4/NRZ SINGLE-ENDED TRANSMITTER WITH RLM COMPENSATION



Fig. 3.2.1. A schematic of basic source-series terminated (SST) driver unit.

To support various channels and memory standards, a dual-mode PAM-4/NRZ single-ended voltage-mode transmitter is proposed. Fig. 3.2.1 shows the schematic of the basic SST driver unit. The basic SST driver unit contains two PMOS transistors and two NMOS transistors to control the impedance of the output driver. A passive resistor is used to improve the linearity of the driver [2.5.1].

Fig. 3.2.2. Overall structure of the output driver.

The overall structure of the output driver is shown in Fig. 3.2.2. The output driver is composed of 60 basic driver units and 12 additional PU driver units. Since output level distortion occurs due to $V_{DS}$ variation, as shown in Fig. 2.3.4, it is necessary to raise the two middle levels to equalize the gaps between levels. When the output is at level 2 or level 1, turning on the additional PU driver units reduces the total impedance of the pull-up path and raises the voltage level. Fig. 3.2.3 is a simulation result showing the effect of the additional PU driver units. In the conventional structure, the bottom eye is small due to $V_{DS}$ variation, as shown in Fig. 3.2.3(a), but the RLM can be improved with the additional PU driver units, as shown in Fig. 3.2.3(b). However, impedance matching can be broken in this case.

Fig. 3.2.3. The simulated eye diagram of the output driver: (a) without additional PU driver units, (b) with additional PU driver units.
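The trade-off described above, raising a middle level at the cost of impedance matching, can be illustrated with a resistive model. All resistance values below are hypothetical: a distorted level-2 state with an 82.5 Ω pull-up path and a 135 Ω pull-down path, a 50 Ω termination, and a 600 Ω equivalent resistance for the additional PU units.

```python
from typing import Optional, Tuple


def parallel(a: float, b: float) -> float:
    """Resistance of two resistors in parallel."""
    return a * b / (a + b)


def level_and_impedance(r_pu: float, r_pd: float,
                        r_extra_pu: Optional[float] = None,
                        r_term: float = 50.0,
                        vddq: float = 1.0) -> Tuple[float, float]:
    """Return (output voltage, driver output impedance) for one PAM-4 level.

    r_extra_pu models the additional pull-up units as one resistor placed
    in parallel with the pull-up path (a simplification).
    """
    if r_extra_pu is not None:
        r_pu = parallel(r_pu, r_extra_pu)
    load = parallel(r_pd, r_term)
    v = load / (r_pu + load) * vddq
    z_out = parallel(r_pu, r_pd)  # both paths seen in parallel from the pad
    return v, z_out


v0, z0 = level_and_impedance(82.5, 135.0)
v1, z1 = level_and_impedance(82.5, 135.0, r_extra_pu=600.0)
print(f"without extra PU: V = {v0:.4f} * VDDQ, Z_out = {z0:.1f} ohm")
print(f"with extra PU:    V = {v1:.4f} * VDDQ, Z_out = {z1:.1f} ohm")
```

In this model the extra pull-up moves level 2 back toward VDDQ/3, but the output impedance moves further from the assumed 50 Ω target, consistent with the remark that impedance matching can be broken.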

#### **3.3 MEASUREMENTS AND RESULTS**

#### **3.3.1 CHIP MICROGRAPH AND MEASUREMENT SETUP**



Fig. 3.3.1.1. Measurement setup and chip micrograph.

A prototype of the dual-mode PAM-4/NRZ transmitter was fabricated in a 65nm CMOS process. Its total active area is 0.06mm<sup>2</sup>, including the internal clock path. The measurement setup and die micrograph are shown in Fig. 3.3.1.1. A single-ended clock signal generated by the clock generator passes through a single-to-differential converter to produce a differential clock pair, and 4-phase quarter-rate clock signals are then generated inside the chip. All experiments were performed using a PRBS7 pattern, and output waveforms were viewed and measured on an oscilloscope.

#### **3.3.2 MEASUREMENT RESULTS**



Fig. 3.3.2.1. Measured PAM-4 eye diagrams at: (a) 16Gb/s, (b) 20Gb/s.

Fig. 3.3.2.1 shows eye diagrams of PAM-4 operation at 16Gb/s and 20Gb/s. The eye at 16Gb/s is larger because the timing margin is greater than at 20Gb/s. The effect of the equalization and level compensation on the eye diagram at 20Gb/s is shown in Fig. 3.3.2.2. Comparing Fig. 3.3.2.2(a) and Fig. 3.3.2.2(b) shows that the eye hardly opens without equalization. With a 2-tap FFE using tap coefficients of 0.8 and -0.2, the three eye heights are 60.5mV, 58.2mV, and 39.7mV from the top, as shown in Fig. 3.3.2.2(b). However, the three eye heights become 56.1mV, 53.4mV, and 54.3mV from the top by



Fig. 3.3.2.2. Measured PAM-4 eye diagrams at 20Gb/s: (a) without equalization and level compensation, (b) with equalization alone, and (c) with both equalization and level compensation


applying the level compensation method with the same FFE coefficients, as shown in Fig. 3.3.2.2(c). A comparison of Fig. 3.3.2.2(b) and Fig. 3.3.2.2(c) shows that the RLM improves from 0.75 to 0.98 when level compensation is applied. Fig. 3.3.2.3 shows the eye diagram for the NRZ signal at 10Gb/s. The FFE coefficients are 0.9 and -0.1 in this measurement, and the eye height is 247.8mV.

Our dual-mode transmitter consumes 61.5mW at 20Gb/s in PAM-4 operation, and 72mW at 10Gb/s in NRZ operation. Fig. 3.3.2.4 shows the power breakdown of the



Fig. 3.3.2.3. Measured eye diagram of NRZ operation at 10Gb/s.


proposed transmitter. The 4:1 MUX and SER account for the largest share, 49%, followed in decreasing order by the output driver, the encoder & retimer, and the clock tree. Table 3.3.2.1 compares the performance of our transmitter with that of other dual-mode transmitters. Our transmitter is the only one that adopts single-ended signaling for memory interfaces.



Fig. 3.3.2.4. Power breakdown of the proposed transmitter.


| Parameter                 | [2.5.4]             |       | [3.3.2.1]           |       | [3.3.2.2]    |       | This Work           |       |
|---------------------------|---------------------|-------|---------------------|-------|--------------|-------|---------------------|-------|
| Technology (nm)           | 65                  |       | 14                  |       | 22           |       | 65                  |       |
| Supply voltage (V)        | 1.2                 |       | N/A                 |       | 1.2          |       | 1.0                 |       |
| Modulation                | PAM-4               | NRZ   | PAM-4               | NRZ   | PAM-4        | NRZ   | PAM-4               | NRZ   |
| Signaling                 | Differential        |       | Differential        |       | Differential |       | Single-ended        |       |
| Driver topology           | Voltage-mode<br>SST |       | Voltage-mode<br>SST |       | CML          |       | Voltage-mode<br>SST |       |
| Data-rate (Gb/s)          | 32                  | 16    | 40                  | 40    | 56           | 28    | 20                  | 10    |
| Equalization<br>(TX FFE)  | 2-tap               | 4-tap | No<br>EQ            | 4-tap | No<br>EQ     | 4-tap | 2-tap               | 4-tap |
| Power consumption<br>(mW) | 176.3               | 173.7 | 167.5               | 518   | N/A          | N/A   | 61.5                | 72    |
| RLM                       | 0.967               |       | N/A                 |       | N/A          |       | 0.98                |       |
| Area (mm <sup>2</sup> )   | 0.06                |       | 0.0279              |       | N/A          |       | 0.06                |       |

Table. 3.3.2.1. Performance summary and comparison table with other PAM-4/NRZ dual-mode transmitters.

### **CHAPTER 4**

## A SINGLE-ENDED PAM-4 TRANSMITTER WITH ADAPTIVE IMPEDANCE MATCHING AND OUTPUT LEVEL COMPENSATION

To alleviate the issues of the PAM-4 transmitter mentioned in Sections 2.2, 2.3, and 2.4, this chapter presents a method for preventing output level distortion while matching the channel impedance in the single-ended PAM-4 transmitter for memory interfaces. ZQ codes for all four output signal levels are obtained through ZQ calibration and saved in the ZQ code table. The ZQ code generator then adaptively selects the appropriate codes depending on the data pattern and delivers them to the output driver; this can improve the RLM while matching the channel impedance.

#### 4.1 OVERALL ARCHITECTURE

Fig. 4.1.1 shows the proposed PAM-4 transmitter with four-level impedance matching. The four-phase clock signals (CLK0, CLK90, CLK180, and CLK270) are generated from an internal clock path, which is composed of a clock buffer (CLK BUF), an IQ divider (DIV), and a single-to-differential converter (S-to-D). The 32-bit parallel data generated by a PRBS generator (Gen.) are transmitted to the output driver through a 32:8 serializer (SER), a data aligner, and a 4:1 serializer. The 32:8 serializer is composed of several 2:1 serializers. The most significant bit (MSB) and least significant bit (LSB) drivers are divided into 20 and 10 segments, respectively. In each MSB/LSB driver segment, a 2:1 multiplexer (MUX) selects the current or 1-UI delayed data to implement two-tap feed-forward equalization (FFE). The FFE strength is controlled by adjusting the number of segments operating with the current or 1-UI delayed data, and its coefficient can be adjusted up to 13.98dB. After the ZQ code table is filled by the ZQ calibration, an adaptive ZQ code generator changes the ZQ codes depending on the data pattern and transmits them to the output driver.
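The relation between the segment allocation and the FFE strength can be illustrated with a short sketch. It is not taken from the prototype; it assumes the usual two-tap de-emphasis definition, $20\log_{10}\big((c_0+|c_1|)/(c_0-|c_1|)\big)$, and further assumes that the post-cursor weight $|c_1|$ equals the fraction of equal-strength segments driven by the 1-UI delayed data. The function name `deemphasis_db` is hypothetical. Under these assumptions, placing 1 of 10 segments on the delayed data gives about 1.94dB, and 4 of 10 gives about 13.98dB.

```python
import math


def deemphasis_db(total_segments: int, delayed_segments: int) -> float:
    """De-emphasis of a 2-tap FFE built from equal driver segments.

    Assumed model: segments driven by the 1-UI delayed data form the
    post-cursor tap, and the remaining segments form the main tap.
    """
    post = delayed_segments / total_segments   # post-cursor weight |c1|
    main = 1.0 - post                          # main-cursor weight c0
    if main <= post:
        raise ValueError("main tap must be larger than the post-cursor tap")
    return 20.0 * math.log10((main + post) / (main - post))


if __name__ == "__main__":
    for k in range(5):
        print(f"{k}/10 segments on delayed data: {deemphasis_db(10, k):.2f} dB")
```

The same ratios apply to the 20-segment MSB driver when twice as many segments are moved to the delayed data.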



Fig. 4.1.1. Overall architecture of the proposed PAM-4 transmitter.

### 4.2 ADAPTIVE IMPEDANCE MATCHING WITH MANUAL ZQ CALIBRATION

#### 4.2.1 MANUAL ZQ CALIBRATION

To select the appropriate ZQ codes in the adaptive ZQ code generator according to the output signal level, the ZQ calibration is first performed for all PAM-4 levels, reflecting the FFE strength, and the corresponding ZQ codes are then stored in the ZQ code table. In this prototype, the ZQ calibration is performed manually.

Fig. 4.2.1.1 shows an example of the manual ZQ calibration using a simplified driver circuit diagram when both the current and 1-UI delayed data are "11". First, we determine the target FFE strength, $\alpha$, considering the channel loss; this example chooses $\alpha$ as 1. Since the on-resistance ($R_{ON}$) of each transistor should be $1.5k\Omega$ to match the channel impedance ($Z_O$) of 50$\Omega$, the output signal level is 50mV. We then fix OUT to 50mV in order to find the PU codes at this output level. With the FFE strength of 1, the pull-up resistance should be 500$\Omega$; thus, the current through the pull-up path should be 1.9mA. After finding the PU codes that achieve this current, we store them in the ZQ code table. Finally, we perform the same process to find the PD codes and store them in the ZQ code table. The ZQ calibration is then repeated for the other PAM-4 signal levels, so that the ZQ codes for all signal levels are obtained and the ZQ code table is filled. This calibration can be implemented on-chip and performed automatically during the training sequence when applied to a memory system.



③ After fixing OUT to 50mV, find PU codes that allow the current of 1.9mA to flow and store them in the ZQ code table.

④ Find PD codes that allow the current of 0.9mA to flow and store them in the ZQ code table.

Fig. 4.2.1.1. An example of the manual ZQ calibration using a simplified driver circuit diagram when both current and 1-UI delayed data are "11".
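The current targets in this example can be checked with a few lines of arithmetic. The sketch below is only a consistency check under assumptions that are not stated explicitly in the text: a 1.0V driver supply and a 50$\Omega$ termination to ground at the far end. The function name `calibration_targets` is hypothetical. With OUT fixed at 50mV and a 500$\Omega$ pull-up path, Kirchhoff's current law at OUT gives the 1.9mA pull-up current and the 0.9mA pull-down current of steps ③ and ④, and the parallel combination of the two paths comes out at 50$\Omega$.

```python
def calibration_targets(vdd: float, v_out: float, r_pu: float, z0: float):
    """Return (PU current, PD current, PD resistance, output impedance).

    Assumed setup: the pull-up path connects OUT to vdd, while the
    pull-down path and a z0 termination both connect OUT to ground.
    """
    i_pu = (vdd - v_out) / r_pu      # current sourced by the pull-up path
    i_term = v_out / z0              # current into the termination
    i_pd = i_pu - i_term             # remainder sunk by the pull-down path
    r_pd = v_out / i_pd              # pull-down resistance that sinks i_pd
    z_out = r_pu * r_pd / (r_pu + r_pd)
    return i_pu, i_pd, r_pd, z_out


if __name__ == "__main__":
    i_pu, i_pd, r_pd, z_out = calibration_targets(1.0, 0.05, 500.0, 50.0)
    print(f"PU current: {i_pu * 1e3:.2f} mA")
    print(f"PD current: {i_pd * 1e3:.2f} mA")
    print(f"PD resistance: {r_pd:.2f} ohm")
    print(f"Output impedance: {z_out:.2f} ohm")
```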



#### 4.2.2 ZQ CODE TABLE AND ADAPTIVE ZQ CODE GENERATOR

Fig. 4.2.2.1. The ZQ code table and the adaptive ZQ code generator.

The ZQ code table has two 4 × 5 structures, one for the PU codes and one for the PD codes, as shown in Fig. 4.2.2.1. During the ZQ calibration, the ZQ code table is filled with 0 or 1 for all output signal levels. The adaptive ZQ code generator consists of ten 4:1 MUXs. This generator uses the MSB/LSB data to adaptively select codes from the ZQ code table and transmits the corresponding codes to the output driver as PU<4:0> and PD<4:0>. A mismatch in propagation delay between the transmitted codes and the data can degrade the overall performance of our structure. Therefore, a ZQ code generator replica is placed at each driver segment as a delay matching component, as shown in Fig. 4.1.1.
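The selection performed by the adaptive ZQ code generator can be modeled behaviorally as a table lookup built from 4:1 multiplexers. The sketch below is illustrative only: the PD<4:0> entries are the example codes quoted in Section 4.3, while the names `PD_TABLE`, `mux4`, and `select_code` are hypothetical and no PU values are assumed. Five MUXs, one per code bit, produce PD<4:0>; a second identical group would produce PU<4:0>, giving ten MUXs in total.

```python
from typing import Dict, List, Tuple

# PD<4:0> per (MSB, LSB) data pattern, from the example in Section 4.3.
PD_TABLE: Dict[Tuple[int, int], str] = {
    (1, 1): "01010",
    (1, 0): "10010",
    (0, 1): "10011",
    (0, 0): "10111",
}


def mux4(inputs: List[str], msb: int, lsb: int) -> str:
    """4:1 multiplexer with MSB/LSB as the two select lines."""
    return inputs[(msb << 1) | lsb]


def select_code(table: Dict[Tuple[int, int], str], msb: int, lsb: int) -> str:
    """Model five 4:1 MUXs, one for each bit of the 5-bit code."""
    bits = []
    for bit in range(5):
        # Column of the table for this bit, ordered 00, 01, 10, 11.
        column = [table[(m, l)][bit] for m in (0, 1) for l in (0, 1)]
        bits.append(mux4(column, msb, lsb))
    return "".join(bits)


if __name__ == "__main__":
    for msb, lsb in [(1, 1), (1, 0), (0, 1), (0, 0)]:
        print(f"data {msb}{lsb}: PD<4:0> = {select_code(PD_TABLE, msb, lsb)}")
```

Modeling one MUX per bit, rather than a single dictionary lookup, mirrors the hardware structure in which each code bit has its own select path and therefore its own delay to match.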

#### 4.3 PROPOSED OUTPUT DRIVER



Fig. 4.3.1. (a) Output driver using a data and driver code encoder [1.1.5, 2.5.5] and (b) the proposed output driver.



(a)



(b)

Fig. 4.3.2. (a) Operation for conventional [1.1.5] and proposed PAM-4 driver and (b) one example of the adaptive PD code generation in the proposed PAM-4 transmitter.

Fig. 4.3.1(a) shows an output driver circuit diagram using a data and driver code encoder [1.1.5, 2.5.5]. When the encoder is placed in front of the output driver, the data and the driver code are encoded, and the encoded data are then sent to the output driver; this keeps the output driver configuration simple. However, the propagation delay of the data path increases by the delay of the encoder, which increases the power-supply-induced jitter and deteriorates the output drift characteristics in memory interfaces. The proposed output driver in Fig. 4.3.1(b) mitigates these issues by removing the encoder. Each driver segment has a source-series terminated structure. All segments share a 71$\Omega$ passive resistor to improve linearity, and 5-bit PU/PD codes from the adaptive ZQ code generator control the driver's on-resistance.

The operation of the conventional [1.1.5] and proposed PAM-4 output drivers is shown in Fig. 4.3.2(a). In the conventional driver for memory interfaces [1.1.5], after ZQ calibration based on one signal level, the PU/PD driver codes are fixed during the burst operation; thus, $V_{DS}$ fluctuation can vary the transistors' on-resistance, leading to RLM degradation and impedance mismatch. Although the previous PAM-4 transmitter with three-point ZQ calibration [2.5.5] changes the driver codes according to the data pattern, this structure uses the driver codes obtained from the -1 output level when transmitting the -3 output level; this leads to signal reflection. To alleviate these issues, the PU/PD codes of our driver adaptively change in real time for each output signal level after the ZQ calibration is performed at all four signal levels. Fig. 4.3.2(b) shows one example of PD code generation. When the data pattern changes to '11', '10', '01', and '00', PD<4:0> changes to '01010', '10010', '10011', and '10111', respectively (however, in the

| Output level | Active PU Path                                | Active PD Path                                                  | Output<br>Impedance |
|--------------|-----------------------------------------------|-----------------------------------------------------------------|---------------------|
| Level 3      | $M_{PU4} \parallel M_{PU2}$                   | $M_{PD4} \parallel M_{PD2} \parallel M_{PD1} \parallel M_{PD0}$ | Zo                  |
| Level 2      | $M_{PU4} \parallel M_{PU1}$                   | $M_{PD4} \parallel M_{PD1} \parallel M_{PD0}$                   | Zo                  |
| Level 1      | $M_{PU4} \parallel M_{PU1} \parallel M_{PU0}$ | $M_{PD4} \parallel M_{PD1}$                                     | Zo                  |
| Level 0      | $M_{PU4} \parallel M_{PU0}$                   | $M_{PD3} \parallel M_{PD1}$                                     | Zo                  |

Table. 4.3.1. Active PU/PD path and output impedances.

conventional structure [1.1.5], PD<4:0> is fixed to '01010' regardless of the data pattern). PU<4:0> changes in the same way. Table 4.3.1 shows the PU and PD transistors that are turned on and output impedances at each PAM-4 level.

#### 4.4 MEASUREMENTS AND RESULTS

#### 4.4.1 CHIP MICROGRAPH AND MEASUREMENT SETUP



Fig. 4.4.1.1 Measurement setup.

A prototype of the proposed PAM-4 transmitter was fabricated in a 65nm CMOS process. Fig. 4.4.1.1 shows the measurement setup and Fig. 4.4.1.2 shows a die micrograph



Fig. 4.4.1.2 Die micrograph with magnified layout.

with magnified layout. The total active area of the transmitter is 0.035mm<sup>2</sup>. The differential clock signals CLKP and CLKN generated by the signal quality analyzer (Anritsu MP1800A) are transmitted to the printed circuit board (PCB). The output signal passes through a channel with an insertion loss of -3.02 dB at 4.5GHz and is measured on an oscilloscope (Tektronix MSO73304DX). The operation of the transmitter can be controlled externally by means of an inter-integrated circuit (I2C) interface.

#### 4.4.2 MEASUREMENT RESULTS

Fig. 4.4.2.1 shows the measured PAM-4 eye diagrams at 18Gb/s/pin with the conventional and proposed PAM-4 transmitters. In this measurement, the PRBS-7 data pattern and an FFE coefficient of 1.94dB were used. Since the conventional PAM-4 transmitter uses fixed PU/PD codes, eye distortion occurs due to the $V_{DS}$ fluctuation; the voltage gaps between signal levels from top to bottom are 195 mV, 174 mV, and 113 mV, resulting in an RLM of 0.703 and a worst eye height of 32.4 mV (Fig. 4.4.2.1(a)). When the proposed method is applied, the eye distortion is compensated; the voltage gaps between signal levels are 164 mV, 160 mV, and 155 mV, respectively, achieving an improved RLM of 0.971 (Fig. 4.4.2.1(b)). With this method, the worst eye height also increases to 75.9 mV. Although this method has better RLM performance, its single-ended signaling may make the PAM-4 eye diagram look somewhat less symmetrical than an eye diagram based on differential signaling [2.5.3, 2.5.4].
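The RLM values quoted above can be reproduced from the measured level gaps. The sketch below assumes the common definition of RLM as the smallest gap between adjacent levels divided by the average gap, which is not restated in this section; the function name `rlm_from_gaps` is hypothetical.

```python
from typing import Sequence


def rlm_from_gaps(gaps_mv: Sequence[float]) -> float:
    """Level separation mismatch ratio from the three adjacent level gaps.

    Assumed definition: minimum gap divided by the average gap, which
    equals 3 * min(gaps) / (top level - bottom level).
    """
    if len(gaps_mv) != 3:
        raise ValueError("PAM-4 has exactly three gaps between four levels")
    return 3.0 * min(gaps_mv) / sum(gaps_mv)


if __name__ == "__main__":
    print(f"conventional: {rlm_from_gaps([195, 174, 113]):.3f}")
    print(f"proposed:     {rlm_from_gaps([164, 160, 155]):.3f}")
```

With this definition, perfectly equal gaps give an RLM of 1, and the smallest gap alone determines how far the ratio falls below 1.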



(a)

w/ Proposed PAM-4 Transmitter



(b)

Fig. 4.4.2.1 Measured PAM-4 eye diagrams (a) with the conventional and (b) proposed PAM-4 transmitter, at 18 Gb/s/pin.

Fig. 4.4.2.2 shows the measured BER bathtub curves of the upper, middle, and lower eyes at 18Gb/s/pin with the conventional and proposed PAM-4 transmitters, using the PRBS-7 data pattern and an FFE coefficient of 1.94dB. When the proposed method is not applied, the minimum horizontal margin at a BER of $10^{-12}$ is only 0.05UI (Fig. 4.4.2.2(a)). With the proposed PAM-4 transmitter, the minimum horizontal margin increases to 0.22UI at a BER of $10^{-12}$ (Fig. 4.4.2.2(b)).



(b)

Fig. 4.4.2.2 Measured BER bathtub curves of PAM-4 operation (a) with the conventional and (b) proposed PAM-4 transmitter, at 18 Gb/s/pin.



Total Power Consumption : 55.67mW @ 18Gb/s/pin

Fig. 4.4.2.3 Power breakdown of the PAM-4 transmitter at 18Gb/s/pin.

Fig. 4.4.2.3 shows the power breakdown of the proposed PAM-4 transmitter at 18Gb/s/pin. The 4:1 SER and pre-driver account for the largest proportion, 51.2% of the total power consumption, followed by the driver at 25.3%.
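The energy efficiency reported for this prototype follows directly from the total power and the data rate, since 1 mW divided by 1 Gb/s equals 1 pJ/bit. A minimal sketch of the arithmetic, using the 55.67mW total from Fig. 4.4.2.3 and hypothetical helper names:

```python
def energy_per_bit_pj(power_mw: float, data_rate_gbps: float) -> float:
    """Energy efficiency in pJ/bit: mW divided by Gb/s."""
    return power_mw / data_rate_gbps


def block_power_mw(total_mw: float, share_percent: float) -> float:
    """Power of one block given its share of the total."""
    return total_mw * share_percent / 100.0


if __name__ == "__main__":
    total = 55.67  # mW at 18 Gb/s/pin
    print(f"energy efficiency: {energy_per_bit_pj(total, 18.0):.2f} pJ/bit/pin")
    print(f"4:1 SER + pre-driver: {block_power_mw(total, 51.2):.1f} mW")
    print(f"driver: {block_power_mw(total, 25.3):.1f} mW")
```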

Fig. 4.4.2.4 shows the measured output return loss of the proposed PAM-4 transmitter. Since the return loss specification of a typical transmitter is under 8dB at the Nyquist frequency [4.4.2.1], it can be seen that the proposed transmitter has good impedance matching.
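For reference, the link between driver impedance and return loss can be sketched with the standard reflection coefficient formula. This is a generic, purely resistive illustration and not a model of the measured curve in Fig. 4.4.2.4; the function name `return_loss_db` is hypothetical, and return loss is reported here as a positive number of dB, so a larger value means a better match.

```python
import math


def return_loss_db(z_driver: float, z0: float = 50.0) -> float:
    """Return loss in dB (positive) for a resistive driver impedance."""
    gamma = abs((z_driver - z0) / (z_driver + z0))  # reflection coefficient
    if gamma == 0.0:
        return math.inf                              # perfect match
    return -20.0 * math.log10(gamma)


if __name__ == "__main__":
    for z in (40.0, 45.0, 55.0, 60.0):
        print(f"{z:.0f} ohm driver into 50 ohm: {return_loss_db(z):.1f} dB")
```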

In Tables 4.4.2.1 and 4.4.2.2, the performance of our prototype is summarized and compared with that of other recent voltage-mode PAM-4 transmitters. Our transmitter can match the channel impedance at all four PAM-4 signal levels while achieving a good RLM



Fig. 4.4.2.4 Measured output return loss of the proposed PAM-4 transmitter.

performance by adaptively adjusting the ZQ code according to the data pattern. The ZQ codes are transmitted to the output driver in parallel with the data. Thus, the encoder block in front of the output driver can be removed, decreasing the data path delay and improving the output drift characteristic. Although the differential PAM-4 transmitters [2.1.3, 2.5.3] have better energy efficiency, the differential architecture with the on-chip voltage regulator cannot be adopted in memory interfaces. The previous single-ended PAM-4 transmitter [2.5.5] has better RLM and energy efficiency; however, the impedance values of the NMOS-only driver are vulnerable to voltage and temperature variations, and the circuits for the ZQ calibration are needed even in the main path. Even though the PAM-4 transmitter using the differential ternary R-2R DAC [4.4.2.2] also shows better RLM performance, the impedance calibration is not performed at all signal levels. Furthermore, the bootstrapping method of the R-2R switch is susceptible to PVT variations, which can vary the output driver's on-resistance.

| Parameter              | Value                                                   |
|------------------------|---------------------------------------------------------|
| Process                | 65nm CMOS                                               |
| Data Rate              | 18Gb/s/pin                                              |
| Supply                 | 1.0V / 1.2V                                             |
| Modulation             | PAM-4                                                   |
| Signaling              | Single-ended                                            |
| Driver Type            | Voltage-mode<br>with shared resistor                    |
| Equalization           | 2-tap FFE (De-emphasis)                                 |
| Channel Loss @ Nyquist | -3.02dB                                                 |
| BER                    | <10<sup>-12</sup> (PRBS-7)                              |
| RLM                    | 0.975 (Simulated)<br>0.971 (Measured)                   |
| Energy efficiency      | 3.05pJ/bit/pin (Simulated)<br>3.09pJ/bit/pin (Measured) |
| Area                   | 0.035mm<sup>2</sup>                                     |

Table. 4.4.2.1 Transmitter performance summary.

|                                         | [2.1.3]                          | [2.5.3]                          | [2.5.4]                                                   | [2.5.5]                    | [1.1.5]                    | [4.4.2.2]                        | This<br>Work                |
|-----------------------------------------|----------------------------------|----------------------------------|-----------------------------------------------------------|----------------------------|----------------------------|----------------------------------|-----------------------------|
| Process (nm)                            | 16nm<br>FinFET                   | 65nm<br>CMOS                     | 65nm<br>CMOS                                              | 65nm<br>CMOS               | 1y nm<br>DRAM              | 65nm<br>CMOS                     | 65nm<br>CMOS                |
| Data Rate<br>(Gb/s/pin)                 | 29                               | 14                               | 16                                                        | 28                         | 22                         | 5                                | 18                          |
| Supply (V)                              | 0.85/0.9/<br>1.2/1.8             | 1.0/0.91/<br>0.5                 | 1.2                                                       | 1.0/0.6                    | 1.35                       | 1.0                              | 1.0/1.2                     |
| Signaling                               | Differential                     | Differential                     | Differential                                              | Single-<br>ended           | Single-<br>ended           | Differential                     | Single-<br>ended            |
| Level<br>Compensation                   | Impedance<br>Control<br>Loop     | Impedance<br>Control<br>Loop     | LUT<sup>1</sup> +<br>Pseudo<br>Analog<br>Control<br>Loop | Three-<br>Point<br>ZQ Cal. | ZQ Cal.<br>(Fixed<br>Code) | Impedance<br>Control<br>Loop     | Adaptive<br>ZQ Code<br>Gen. |
| Impedance<br>Matching<br>at Four Levels | No                               | No                               | No                                                        | No                         | No                         | No                               | Yes                         |
| RLM                                     | >0.98                            | 0.947                            | 0.967                                                     | 0.993                      | N/A                        | 0.994                            | 0.971                       |
| Power<br>consumption<br>(pJ/bit/pin)    | 1.46                             | 1.82                             | 9.9                                                       | 0.64                       | N/A                        | 1.97                             | 3.09                        |
| Area (mm <sup>2</sup> )                 | N/A                              | 0.06                             | 0.06                                                      | 0.0333                     | 54.7 <sup>2</sup>          | 0.073                            | 0.035                       |

Table. 4.4.2.2. Comparison with other recent voltage-mode PAM-4 transmitters.

### **CHAPTER 5**

### CONCLUSION

In this thesis, two single-ended voltage-mode PAM-4 transmitters for memory interfaces have been proposed. Since single-ended signaling requires half the number of pins compared to differential signaling, it is more suitable for memory interfaces with limited pin counts such as DDR, LPDDR, and GDDR. In addition, we adopt a voltage-mode structure that consumes less power than a current-mode structure. We also adopt PAM-4 signaling, which can achieve a higher data rate than NRZ signaling for high-bandwidth memory.

The first proposed single-ended voltage-mode PAM-4 transmitter with RLM enhancement for memory interfaces is implemented in a 65nm CMOS process. Its output driver is composed of 60 basic SST driver units and 12 additional pull-up driver units. The RLM can be improved from 0.75 to 0.98 by controlling the two intermediate levels of the PAM-4 signal with the additional drivers. The total active area is 0.06mm<sup>2</sup>, including the internal clock path, and the proposed transmitter consumes 61.5mW at a data rate of 20 Gb/s. However, if additional drivers are used to enlarge the smallest eye height, impedance mismatch occurs between the output driver and the channel.

To alleviate the impedance mismatch issue, we proposed another single-ended voltage-mode PAM-4 transmitter for memory interfaces with adaptive impedance matching

and output level compensation. The driver codes for all PAM-4 signal levels are stored in a ZQ code table after a ZQ calibration. Using an adaptive ZQ code generator, the output driver adjusts the output signal level and impedance for the four signal levels to compensate for the $V_{DS}$ variation caused by the output level change. This transmitter achieves an RLM of 0.971 with an energy efficiency of 3.09 pJ/bit/pin at 18 Gb/s/pin.

# APPENDIX A. R<sub>on</sub> Variation Reducing ZQ Calibration Scheme for Memory Interfaces

As the amount of data to be processed increases due to the development of the internet of things (IoT), 5G communication, and artificial intelligence (AI), the operating speed of DRAMs such as double data-rate (DDR), low-power double data-rate (LPDDR), and graphics double data-rate (GDDR) is increasing [A.1]. As the speed of these DRAMs increases, signal integrity (S/I) is deteriorated by reflection caused by the impedance mismatch between the channel and the output driver [A.2]. At low speeds and with short channels, impedance mismatch is not very critical because it causes little S/I degradation, but it becomes critical at higher operating speeds [A.3]. To address this issue, impedance calibration is used [A.4].

In fact, a 10% impedance mismatch between the channel and the output driver results in a 2.4% amplitude reduction [2.2.1]. It is therefore important to make the output driver impedance (R<sub>ON</sub>) equal to the channel impedance through ZQ calibration to prevent S/I degradation. The ZQ calibration process is performed at the ZQ pin, and the results obtained through this process are delivered to the DQ side. However, since the PVT conditions of the ZQ pin and the DQ pins differ depending on their location, this may also cause R<sub>ON</sub> variation [A.5]. Therefore, ZQ calibration that considers PVT variation is also required.
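One simple reading that yields a figure of this size is a resistive divider between the driver and a matched 50$\Omega$ termination. This is an assumption made for illustration and is not necessarily the derivation used in [2.2.1]; the function name `received_fraction` is hypothetical. With a driver that is 10% too high (55$\Omega$), the terminated amplitude falls from 50.0% to about 47.6% of the open-circuit swing, a reduction of roughly 2.4 percentage points.

```python
def received_fraction(r_on: float, z0: float = 50.0) -> float:
    """Fraction of the open-circuit swing seen across a z0 termination."""
    return z0 / (r_on + z0)


if __name__ == "__main__":
    matched = received_fraction(50.0)
    mismatched = received_fraction(55.0)   # driver impedance 10% too high
    reduction = (matched - mismatched) * 100.0
    print(f"matched:    {matched * 100:.1f}% of the open-circuit swing")
    print(f"mismatched: {mismatched * 100:.1f}% of the open-circuit swing")
    print(f"reduction:  {reduction:.2f} percentage points")
```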

ZQ calibration is impedance calibration performed in the training process to prevent

signal integrity degradation due to reflection caused by impedance discontinuity [A.3]. Fig. A.1(a) and Fig. A.1(b) show the conventional ZQ calibration operation, which has two steps. When the ZQ calibration enable signal is asserted, the 1<sup>st</sup> loop operates and the pull-down (PD) code calibration is performed. The PD code counter is incremented until the comparator detects that the reference voltage $V_{REF}$ exceeds the voltage of the ZQ pin node ($V_{ZQ}$), which is set by the PD driver and an external resistor ($R_{EXT}$). The PD code (PD<N:0>) is settled when the 1<sup>st</sup> loop operation is finished, as shown in Fig. A.1(b). After the 1<sup>st</sup> loop operation, the 2<sup>nd</sup> loop operation is started by the pull-up (PU) calibration enable (PU\_EN) signal. This loop calibrates the PU code (PU<N:0>) in a similar way to the PD code. However, the PU code calibration is performed using the PD driver calibrated in the 1<sup>st</sup> loop operation instead of the external resistor. The PU/PD codes obtained after ZQ calibration are transmitted to the DQ driver.
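The two-loop procedure can be sketched as a behavioral model. The sketch below is illustrative only and rests on several assumptions that are not given in the text: the driver conductance grows linearly with the code, $V_{REF}$ is half the supply, the unit resistances (7000$\Omega$ and 7600$\Omega$) are arbitrary example values, and the comparator polarity of the 2nd loop is chosen so that the PU code stops when the PU driver becomes stronger than the PD replica. All function names are hypothetical.

```python
import math


def leg_resistance(code: int, r_unit: float) -> float:
    """Driver resistance for a code (assumed: conductance proportional to code)."""
    return math.inf if code == 0 else r_unit / code


def calibrate_pd(r_ext=240.0, r_unit=7000.0, n_bits=6, vdd=1.0):
    """1st loop: increment the PD code until V_REF exceeds V_ZQ."""
    v_ref = vdd / 2
    for code in range(2 ** n_bits):
        r_pd = leg_resistance(code, r_unit)
        # R_EXT pulls the ZQ pin up to VDD; the PD driver pulls it down.
        v_zq = vdd if math.isinf(r_pd) else vdd * r_pd / (r_ext + r_pd)
        if v_ref > v_zq:
            return code, r_pd
    raise RuntimeError("PD code saturated before lock")


def calibrate_pu(r_pd, r_unit=7600.0, n_bits=6, vdd=1.0):
    """2nd loop: increment the PU code against the calibrated PD replica."""
    v_ref = vdd / 2
    for code in range(1, 2 ** n_bits):
        r_pu = leg_resistance(code, r_unit)
        v_rep = vdd * r_pd / (r_pu + r_pd)   # PU on top, PD replica below
        if v_rep > v_ref:
            return code, r_pu
    raise RuntimeError("PU code saturated before lock")


if __name__ == "__main__":
    pd_code, r_pd = calibrate_pd()
    pu_code, r_pu = calibrate_pu(r_pd)
    print(f"PD code {pd_code}: {r_pd:.1f} ohm (target 240 ohm)")
    print(f"PU code {pu_code}: {r_pu:.1f} ohm")
```

Because the code moves in discrete steps, the settled resistance in this model lands a few ohms away from the 240$\Omega$ target, and the PU loop inherits the PD error since it calibrates against the PD replica rather than the external resistor.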


Fig. A.1. (a) Block diagram of a conventional ZQ calibration scheme, (b) timing diagram of a conventional ZQ calibration scheme.



Fig. A.2. The voltage difference ( $\Delta V$ ) between  $V_{ZQ}$  and  $V_{REF}$ .

The PU/PD codes are transmitted to the DQ side when the calibration is completed. $V_{ZQ}$ is determined by the PD code, and $V_{ZQ\_REP}$ is determined by the PU code, as shown in Fig. A.1(b). Taking $V_{ZQ}$ as an example, $V_{ZQ}$ settles at one of the two toggle points, where the impedance of the PD driver is $R_{ZQ}$, as shown in Fig. A.2. The voltage at the other toggle point is $V_{OPP}$, and the impedance at that point is $R_{OPP}$. Fig. A.2 is an enlarged view of the toggling voltage around $V_{REF}$ in Fig. A.1(b). When the $V_{ZQ}$ value is determined, the voltage difference ($\Delta V$) from $V_{REF}$ may be large, as on the left, or small, as on the right. A large voltage difference means that $R_{ZQ}$ has a large error relative to the target impedance ($R_{TARGET}$). Since it cannot be guaranteed that the result of the ZQ calibration corresponds to the case on the right in Fig. A.2, the voltage difference from the target can be large with the conventional ZQ calibration method.

Fig. A.3 shows the conventional configuration of DQ driver. When one DQ driver is composed of 6 PU drivers (PU DRV) and 6 PD drivers (PD DRV) calibrated to  $240\Omega$  each,

(Figure: conventional configuration in which PU<5:0> drives six PU drivers of 240Ω each and PD<5:0> drives six PD drivers of 240Ω each.)

Fig. A.3. Conventional DQ driver configuration.

the impedance of the DQ driver becomes  $(240/N)\Omega$  according to the number of turned-on PU/PD drivers (N). The same PU/PD codes are transmitted to all PU/PD drivers, as shown in Fig. A.3.

If the voltage set by the PU/PD codes differs greatly from the target voltage, as shown on the left of Fig. A.2, the difference from the target impedance is also large, which means that the quantization error is large. If the target resistance is $240\Omega$ when the voltage is V<sub>REF</sub>, R<sub>ZQ</sub> can be assumed to be $(240+X)\Omega$, where the impedance difference (X) is a random value. The PD driver impedance (R<sub>OPP</sub>) when the voltage is V<sub>OPP</sub> can be assumed to be $(240-Y)\Omega$, where the other impedance difference (Y) is also a random value, as shown in Fig. A.2. Then, the error ratio of one driver can be expressed as follows:

| Impedance<br>Difference | Conventional ZQ<br>Calibration Error<br>Ratio (%) | Impedance<br>Difference | Conventional ZQ<br>Calibration Error<br>Ratio (%) |
|-------------------------|---------------------------------------------------|-------------------------|---------------------------------------------------|
| X = 2.5                 | 1.042                                             | X = 12.5                | 5.208                                             |
| X = 5                   | 2.083                                             | X = 15                  | 6.25                                              |
| X = 7.5                 | 3.125                                             | X = 17.5                | 7.292                                             |
| X = 10                  | 4.167                                             | X = 20                  | 8.333                                             |

Table. A.1. Error ratio of the conventional ZQ calibration.

$$\text{Error Ratio (ER)} = \frac{\left|R_{ZQ} - R_{TARGET}\right|}{R_{TARGET}} \bullet 100\ (\%) \tag{A.1}$$

In the conventional ZQ calibration, the error ratio is independent of N and can be expressed as follows:

$$ER_{Conventional} = \frac{\left| (240 + X) / N - 240 / N \right|}{240 / N} \bullet 100 = 0.417 \bullet X(\%) \tag{A.2}$$

Table A.1 shows that the error ratio increases in proportion to the X value in the

conventional ZQ calibration. Also, since calibration is based on the ZQ pin, the PVT condition of the ZQ pin and DQ pin differs depending on the location within the chip. Therefore, R<sub>ON</sub> variation may occur due to the mismatch between ZQ-DQ pins [A.5].
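The entries of Table A.1 follow from Eq. (A.2) for X from 2.5 to 20 in steps of 2.5. The short sketch below evaluates the error ratio from the driver resistances directly rather than from the simplified factor 0.417, and shows that the result does not depend on the number of turned-on drivers N; the function name `er_conventional` is hypothetical.

```python
def er_conventional(x: float, n: int = 1, r_target: float = 240.0) -> float:
    """Error ratio (%) when all n parallel drivers share the code of R_ZQ."""
    r_actual = (r_target + x) / n    # n identical drivers of (240 + X) ohm
    r_ideal = r_target / n           # n ideal drivers of 240 ohm
    return abs(r_actual - r_ideal) / r_ideal * 100.0


if __name__ == "__main__":
    for step in range(1, 9):
        x = 2.5 * step
        print(f"X = {x:4.1f}: N=1 -> {er_conventional(x, 1):.3f}%, "
              f"N=6 -> {er_conventional(x, 6):.3f}%")
```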



Fig. A.4. Block diagram of the proposed ZQ calibration scheme.



Fig. A.5. Proposed DQ driver configuration.

The block diagram of the proposed ZQ calibration scheme is shown in Fig. A.4. As in the conventional ZQ calibration, the 1st loop operates using an external resistor, and the 2nd loop operates using the calibrated PD driver. A ZQ calibration control block determines the overall timing of signals such as the PU/PD enable and lock signals. The ZQ codes (PU<5:0> and PD<5:0>) are obtained through the ZQ calibration scheme, and the adjacent PU/PD fixed codes (PU\_FIXED<5:0> and PD\_FIXED<5:0>) are also generated using the PU/PD lock signals (LOCK\_PU\_PRE and LOCK\_PD\_PRE). When the ZQ codes pass through D-flip-flops (D-FFs) that use the lock signals as a clock signal, the adjacent PU/PD fixed codes are generated.

Fig. A.5 shows the proposed DQ configuration. The 6-bit selection signal (SEL<5:0>) determines which code to select between ZQ codes and adjacent fixed codes in 2:1 multiplexers (MUXs). If the PD code of  $R_{ZQ}$  and PD code of  $R_{OPP}$  are used together, the

error ratio of the proposed ZQ calibration can be expressed by the following equation when N=2:

$$ER_{\text{Proposed}} = \frac{\left|\frac{(240+X)\times(240-Y)}{480+X-Y} - 240/2\right|}{240/2} \cdot 100(\%) \tag{A.3}$$

Equation (A.3) can be simplified and expressed as follows:

$$ER_{\text{Proposed}} = \frac{5|120(X-Y) - XY|}{6(480 + X - Y)} (\%) \tag{A.4}$$
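The intermediate step of this simplification can be written out explicitly. Expanding the product in the numerator and subtracting $240/2 = 120$ over the common denominator gives

$$\frac{(240+X)(240-Y)}{480+X-Y} - 120 = \frac{57600 + 240(X-Y) - XY - 57600 - 120(X-Y)}{480+X-Y} = \frac{120(X-Y) - XY}{480+X-Y}$$

and dividing by 120 and multiplying by 100 introduces the factor $100/120 = 5/6$ that appears in the simplified expression.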

N=2 means that one PD code representing $R_{ZQ}$ and one PD code representing $R_{OPP}$ are used together. Since X and Y are random variables, assuming that X-Y=5, as in the situation on the left of Fig. A.2, Equation (A.4) can be simplified as follows:

$$ER_{\text{Proposed}} = \frac{5\left|120(X-Y) - XY\right|}{6(480 + X - Y)} = 0.0017 \cdot \left|600 + 5X - X^2\right|(\%)$$
(A.5)

| Impedance<br>Difference | Proposed ZQ<br>Calibration Error<br>Ratio (%) | Impedance<br>Difference | Proposed ZQ<br>Calibration Error<br>Ratio (%) |
|-------------------------|---------------------------------------------------|-------------------------|---------------------------------------------------|
| X = 2.5                 | 1.042                                             | X = 2.5                 | 0.87                                              |
| X = 5                   | 1.031                                             | X = 5                   | 0.773                                             |
| X = 7.5                 | 0.999                                             | X = 7.5                 | 0.655                                             |
| X = 10                  | 0.945                                             | X = 10                  | 0.515                                             |

Table A.2. Error ratio of the proposed ZQ calibration.
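
As a numerical cross-check, the short Python sketch below evaluates Equation (A.3) directly and compares it with the simplified form of Equation (A.4) under the assumption X-Y=5 stated in the text. The function names `er_proposed` and `er_closed_form` are illustrative only and do not come from the design itself.

```python
def er_proposed(x: float, y: float, r_unit: float = 240.0) -> float:
    """Error ratio (%) of two parallel legs, (r_unit + x) and (r_unit - y),
    against the target r_unit / 2, following Eq. (A.3)."""
    r_parallel = (r_unit + x) * (r_unit - y) / (2 * r_unit + x - y)
    target = r_unit / 2
    return abs(r_parallel - target) / target * 100.0


def er_closed_form(x: float, y: float) -> float:
    """Simplified form of Eq. (A.4); valid only for r_unit = 240."""
    return 5.0 * abs(120.0 * (x - y) - x * y) / (6.0 * (480.0 + x - y))


for x in (2.5, 5.0, 7.5, 10.0):
    y = x - 5.0  # assumption taken from the text: X - Y = 5
    print(f"X = {x:4.1f}  ER = {er_proposed(x, y):.3f} %")
```

With X-Y=5 this prints 1.042, 1.031, 0.999, and 0.945 for X = 2.5, 5, 7.5, and 10, which agrees with the left-hand columns of Table A.2.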

Table A.2 shows the error ratio of the proposed ZQ calibration. Unlike in the conventional ZQ calibration, the error ratio does not increase in proportion to the value of X and stays at about 1% or below, as shown in Fig. A.6. In the conventional ZQ calibration, the error ratio can be lowered by increasing the number of driver code bits to reduce  $\Delta V$, but one extra bit doubles the number of pre-drivers required, increasing area and power. In contrast, the proposed ZQ calibration can maintain a low error ratio with low driver resolution.



Fig. A.6. Error ratio according to the difference with the target impedance.



Fig. A.7. PU/PD lock detector circuits.

Fig. A.7 shows the PU/PD lock detector circuits, which are used to generate the adjacent PU/PD fixed codes. Each lock detector consists of a 2-bit counter, a D-FF, and a logic part. Fig. A.8(a) and Fig. A.8(b) show the detailed operation and timing diagrams of the PU/PD lock detectors. The process begins with the assertion of the reset signal, which initializes the value of the 2-bit counter (CNT<1:0>) to 00. Then, when the PU/PD enable signals (PU\_EN and PD\_EN) from the ZQ calibration control block are asserted, the PU lock counter operates with COMP\_OUT=1, and the PD lock counter operates with COMP\_OUT=0. CNT<1:0> advances through 00, 01, 10, and 11, and the lock signals (LOCK\_PU and LOCK\_PD) are generated when CNT<1:0> becomes 11.

Fig. A.8. (a) Operation and timing diagram of the PD lock detector, and (b) operation and timing diagram of the PU lock detector.

When the voltage toggles up and down around V<sub>REF</sub>, as shown in Fig. A.8(a) and Fig. A.8(b), CNT<1:0> takes the successive values 00, 01, 10, and 11; the lock signals are then generated and the ZQ codes are fixed. The clock signal used for the counters and lock detectors (CLK\_CNT) has the opposite phase to CLK\_COMP, which is used for the comparator, and it runs only while the enable signal is applied. Since the counter unit in Fig. A.7 uses the comparator output signal (COMP\_OUT) as its clock, the LOCK\_PU\_PRE and LOCK\_PD\_PRE signals are aligned with CLK\_COMP, while the LOCK\_PU and LOCK\_PD signals are aligned with CLK\_CNT. Therefore, there is a half-cycle delay between LOCK\_PU\_PRE and LOCK\_PU, and likewise between LOCK\_PD\_PRE and LOCK\_PD. The adjacent fixed codes are obtained by using the LOCK\_PU\_PRE and LOCK\_PD\_PRE signals with D-FFs, as shown in Fig. A.4.
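
The locking behavior described above can be illustrated with a simplified behavioral model in Python. This is a sketch under stated assumptions, not the actual circuit: the driver is modeled as a resistance `r_lsb / code`, the comparator is reduced to a resistance comparison, and one counter step is taken per comparator toggle. The names `zq_search`, `r_lsb`, and `toggles_to_lock` are hypothetical.

```python
def zq_search(r_lsb: float, r_target: float = 240.0, max_code: int = 63,
              toggles_to_lock: int = 3):
    """Bang-bang code search with a toggle-counting lock detector.

    Assumed model: driver resistance = r_lsb / code (conductance
    proportional to the code). Returns (zq_code, adjacent_code).
    """
    code, prev_code, prev_comp, cnt = 1, 1, None, 0
    for _ in range(4 * max_code):  # bounded loop, ample for a 6-bit code
        # Comparator stand-in: 1 means the driver is still too weak.
        comp_out = 1 if r_lsb / code > r_target else 0
        if prev_comp is not None and comp_out != prev_comp:
            cnt += 1  # counter advances 00 -> 01 -> 10 -> 11
            if cnt == toggles_to_lock:
                # Locked code, plus the neighboring code held one step earlier.
                return code, prev_code
        prev_comp, prev_code = comp_out, code
        code = min(max_code, code + 1) if comp_out else max(1, code - 1)
    raise RuntimeError("calibration did not lock")


zq_code, adjacent_code = zq_search(r_lsb=9720.0)
print(zq_code, adjacent_code)
```

In this illustrative setup the ideal code would be 40.5, so the search dithers between 40 and 41 and returns the two codes that bracket the target, which is the pair the proposed scheme uses together.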



Fig. A.9. Overall architecture of the transmitter with the proposed ZQ calibration scheme.

Fig. A.9 shows the overall architecture of the transmitter with the proposed ZQ calibration scheme. A detailed description of the ZQ calibration scheme and its circuits is given in Section III. The internal clock path receives an external differential clock signal (CLKP/CLKN) and generates the 4-phase clock signals (CLK0, CLK90, CLK180, and CLK270) through the clock aligner and buffers. These signals are used in the pseudo-random bit sequence (PRBS) generator (GEN.) and the serializers (SERs). CLK0 is also used in the ZQ calibration scheme. The 32-bit parallel data generated by the PRBS GEN. is serialized and transmitted to the PU/PD drivers. SEL<5:0> selects between the ZQ codes and the adjacent fixed codes in the 2:1 MUXs, and the selected codes are delivered to the PU/PD drivers. Quantization error and ZQ-DQ mismatch can be reduced by controlling how many drivers use the ZQ codes and how many use the adjacent fixed codes. Each driver is controlled by a 6-bit driver code and has a binary configuration, as shown in Fig. A.9.
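
The code selection described above can be sketched in Python as follows. The model is an assumption made for illustration: one SEL bit per driver unit, a SEL bit of 1 selecting the adjacent fixed code (the actual polarity is not stated), and a binary-weighted unit whose resistance is `r_lsb / code`. All function names and numeric values are hypothetical.

```python
def mux_codes(sel: int, zq_code: int, adj_code: int, n_units: int = 6):
    """Per-unit 2:1 MUX: bit i of sel = 1 picks adj_code for unit i,
    otherwise zq_code (polarity is an assumption)."""
    return [adj_code if (sel >> i) & 1 else zq_code for i in range(n_units)]


def unit_resistance(code: int, r_lsb: float) -> float:
    """Assumed binary-weighted unit: conductance proportional to the code."""
    return r_lsb / code


def total_resistance(sel: int, zq_code: int, adj_code: int, r_lsb: float) -> float:
    """Parallel combination of all driver units after code selection."""
    codes = mux_codes(sel, zq_code, adj_code)
    return 1.0 / sum(1.0 / unit_resistance(c, r_lsb) for c in codes)


# Illustrative numbers only: six units, target 240 / 6 = 40 ohms.
print(total_resistance(0b000000, 41, 40, 9720.0))  # all units use the ZQ code
print(total_resistance(0b000111, 41, 40, 9720.0))  # three ZQ, three adjacent
```

With these example values, mixing the two codes moves the combined impedance from about 39.5 Ω to about 40 Ω, which illustrates why using both codes can reduce the quantization error.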






Fig. A.10. (a) Block diagram of the 4-phase clock generator, and (b) block diagram of the clock aligner.

Fig. A.10 shows the internal clock path of the proposed transmitter. When the external differential clock signals (CLKP and CLKN) are applied, they pass through a 2-stage CML amplifier and are converted to CMOS voltage levels by AC-coupled resistive-feedback inverters, which improve duty-cycle performance [A.6]. The clock signals then pass through a 4-phase clock generator composed of an IQ divider and a single-to-differential converter, which generates the 4-phase clocks (ICK, QCK, IBCK, and QBCK), as shown in Fig. A.10(a). Fig. A.10(b) shows the block diagram of the clock aligner. In the clock aligner, the skew of the 4-phase clocks is reduced by 4-bit binary-weighted MOSCAPs, and phase alignment is improved by cross-coupled inverters.
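
To illustrate how a divide-by-2 IQ divider yields four phases, the following Python sketch models an idealized two-latch divider at the logic level. It is a generic textbook model, not the circuit of Fig. A.10: one latch is assumed to update on rising input edges and the other on falling edges, and the function name `iq_divider` is hypothetical.

```python
def iq_divider(n_edges: int):
    """Idealized divide-by-2 I/Q generator.

    Step t is one input-clock edge: even t = rising, odd t = falling.
    Returns sampled waveforms for (ICK, QCK, IBCK, QBCK).
    """
    i = q = 0
    ick, qck = [], []
    for t in range(n_edges):
        if t % 2 == 0:
            i = 1 - q  # first latch takes the inverted Q on a rising edge
        else:
            q = i      # second latch follows I on the falling edge
        ick.append(i)
        qck.append(q)
    ibck = [1 - v for v in ick]  # complementary phases
    qbck = [1 - v for v in qck]
    return ick, qck, ibck, qbck


ick, qck, ibck, qbck = iq_divider(8)
print(ick)  # [1, 1, 0, 0, 1, 1, 0, 0]
print(qck)  # [0, 1, 1, 0, 0, 1, 1, 0]
```

Each output has a period of four input edges (half the input frequency), and QCK lags ICK by one input edge, that is, a quarter of the output period or 90 degrees.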



Fig. A.11. Simulated error ratio with respect to the target impedance: (a) PU-DRV calibration, (b) PD-DRV calibration.



Fig. A.12. Location of ZQ pin and DQ pins.

The quantization error is smaller when ZQ codes and adjacent fixed codes are used together than when only ZQ codes are used. Fig. A.11(a) and Fig. A.11(b) show the simulated error ratio with respect to the target impedance for the PU and PD drivers as a function of the number of turned-on drivers. In both cases, the error ratio is lower when adjacent fixed codes are used together with ZQ codes. The proposed method is more effective in PU driver calibration, which has relatively low resolution. With the current 6-bit ZQ code, its effect appears insignificant for the PD driver, which has high resolution. However, it can be more effective in a structure with fewer ZQ code bits. Since using fewer ZQ code bits reduces the number of pre-drivers and drivers, the proposed scheme can also provide benefits in area and power consumption.

Fig. A.12 shows the locations of the ZQ pin and the DQ pins. ZQ calibration is performed through the ZQ pin, and the resulting codes are transmitted to the DQ drivers. Since the ZQ pin and the DQ pins are at different locations in the chip, as shown in Fig. A.12, their PVT conditions differ. As a result, the ZQ codes delivered after ZQ calibration produce a different impedance on the DQ side than on the ZQ side. In addition, since the conditions also differ among the DQ pins, the same codes can produce different impedances at different DQs.

Fig. A.13(a) and Fig. A.13(b) show the simulated calibration results of the PU and PD drivers for each DQ when all drivers are turned on. The red curve shows the case in which the ZQ codes are applied to three drivers and the adjacent fixed codes are applied to the remaining three drivers. The blue curve shows the case in which the number of drivers receiving the ZQ codes and the adjacent fixed codes is adjusted, using the selection signal (SEL<5:0>) shown in Fig. A.5, to get closer to the target impedance.
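
The per-DQ adjustment can be illustrated with a small Python sketch that searches for the number of drivers that should receive the adjacent fixed code. Everything here is an assumed model for illustration: each unit is 240+X Ω with the ZQ code and 240-Y Ω with the adjacent code, ZQ-DQ mismatch is represented by a single multiplicative factor `mismatch`, and the function name `best_adjacent_count` is hypothetical.

```python
def best_adjacent_count(r_zq: float, r_adj: float, mismatch: float = 1.0,
                        n_units: int = 6, r_unit: float = 240.0):
    """Pick how many units should use the adjacent code.

    r_zq / r_adj: unit resistance with the ZQ code / adjacent code.
    mismatch: assumed multiplicative ZQ-DQ resistance mismatch at this DQ.
    Returns (k, sel_mask, total_resistance) for the best choice of k.
    """
    target = r_unit / n_units

    def total(k: int) -> float:
        # (n_units - k) units with the ZQ code in parallel with k adjacent units
        g = (n_units - k) / (mismatch * r_zq) + k / (mismatch * r_adj)
        return 1.0 / g

    k_best = min(range(n_units + 1), key=lambda k: abs(total(k) - target))
    return k_best, (1 << k_best) - 1, total(k_best)


# Illustrative values: X = Y = 2.5, i.e. 242.5 and 237.5 ohms per unit.
print(best_adjacent_count(242.5, 237.5, mismatch=1.00))
print(best_adjacent_count(242.5, 237.5, mismatch=1.01))
```

In this example a DQ with no mismatch is best served by a three-and-three split, while a DQ that is 1% more resistive is closer to the 40 Ω target when all six units use the lower-resistance adjacent code.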



Fig. A.13. Simulated calibration result per DQ: (a) PU-DRV calibration (b) PD-DRV calibration.

We have presented a ZQ calibration scheme that reduces quantization error and ZQ-DQ mismatch by using ZQ codes and adjacent fixed codes together. Using the lock signals of the lock detectors, fixed codes adjacent to the ZQ codes obtained from the ZQ calibration are generated and delivered to the output driver. When the two codes are used together, the impedance mismatch can be reduced to less than 1%, compared with using only one code. The mismatch between the ZQ and DQ pins can also be reduced by adjusting the number of adjacent fixed codes applied to the driver.

If this method is applied to single-ended PAM-4 signaling, it can help with impedance matching at each level of the PAM-4 signal, since the structure of the PAM-4 output driver is similar to that of the NRZ output driver.

## **BIBLIOGRAPHY**

- [1.1.1] H. Ko, et al., "A controller PHY for managed DRAM solution with damping resistor-aided pulse-based feed-forward equalizer," in *IEEE Journal of Solid-State Circuits*, vol. 56, no. 8, pp. 2563-2573, Aug. 2021.
- [1.1.2] W. Bae, "Supply-Scalable High-Speed I/O Interfaces," in *MDPI Electronics*, no. 9: 1315, Aug. 2020.
- [1.1.3] T. M. Hollis, et al., "Recent Evolution in the DRAM Interface: Mile-Markers Along Memory Lane," *IEEE Solid-State Circuits Magazine*, vol. 11, no. 2, pp. 14-30, Jun. 2019.
- [1.1.4] T. M. Hollis, et al., "Achieving 16 Gb/s Single-Ended Signaling in High-Performance Graphics Memory," in *IEEE Workshop on Microelectronics and Electron Devices (WMED)*, Apr. 2018, pp. 1–5.
- [1.1.5] T. M. Hollis, et al., "An 8Gb GDDR6X DRAM achieving 22Gb/s/pin with single-ended PAM-4 signaling," 2021 IEEE International Solid-State Circuits Conference (ISSCC), pp. 348-349, Feb. 2021.
- [2.1.1] N. Dikhaminjia, et al., "High-speed serial link challenges using multi-level signaling," in 2015 IEEE 24<sup>th</sup> Electrical Performance of Electronic Packaging and System (EPEPS), San Jose, CA, USA, 2015, pp. 57–60.
- [2.1.2] X. Zheng, et al., "A 4-40Gb/s PAM4 transmitter with output linearity optimization in 65nm CMOS," 2017 IEEE Custom Integrated Circuits Conference (CICC), Austin, TX, USA, 2017, pp. 1–4.

- [2.1.3] P. Upadhyaya, et al., "A fully adaptive 19–58 Gb/s PAM-4 and 9.5–29 Gb/s NRZ wireline transceiver with configurable ADC in 16-nm FinFET," *IEEE J. Solid-State Circuits*, vol. 53, no. 1, pp. 18–28, Jan. 2019.
- [2.2.1] K-S. Kwak, et al., "A Low-Power Two-Tap Voltage-Mode Transmitter With Precisely Matched Output Impedance Using an Embedded Calibration Circuit," in *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 63, no. 6, pp. 573-577, Jun. 2016.
- [2.2.2] J.-H. Chae, H. Ko, J. H. Park, and S. Kim, "A 12.8Gb/s quarter-rate transmitter using a 4:1 overlapped multiplexing driver combined with an adaptive clock phase aligner," in *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 66, no. 3, pp. 372-376, Mar. 2019.
- [2.3.1] C. Hyun, H. Ko, J.-H. Chae, H. Park and S. Kim, "A 20Gb/s dual-mode PAM4/NRZ single-ended transmitter with RLM compensation," 2019 *IEEE International Symposium on Circuits and Systems (ISCAS)*, May 2019, pp. 1–4.
- [2.3.2] P. Upadhyaya, et al., "A fully adaptive 19-to-56Gb/s PAM-4 wireline transceiver with a configurable ADC in 16nm FinFET," 2018 IEEE International Solid-State Circuits Conference (ISSCC), pp. 108-109, Feb. 2018.
- [2.5.1] M. Bassi, F. Radice, M. Broccoleri, S. Erba, and K. A. Mazaanti, "A High-swing 45 Gb/s hybrid voltage and current-mode PAM-4 transmitter in 28 nm CMOS FDSOI," *IEEE J. Solid-State Circuits*, vol. 51, no. 11, pp. 2702–2715, Nov. 2016.

- [2.5.2] A. Nazemi, et al., "A 36Gb/s PAM4 transmitter using an 8b 18GS/S DAC in 28nm CMOS," 2015 IEEE International Solid-State Circuits Conference (ISSCC), pp. 58-59, Feb. 2015.
- [2.5.3] H.-W. Yang, A. Roshan-Zamir, Y.-H. Song, and S. Palermo, "A low-power dual-mode 20-Gb/s NRZ and 28-Gb/s PAM-4 voltage-mode transmitter," 2017 *IEEE Asian Solid-State Circuits Conference (ASSCC)*, pp. 261-264, Nov. 2017.
- [2.5.4] A. Roshan-Zamir, O. Elhadidy, H.-W. Yang, and S. Palermo, "A reconfigurable 16/32 Gb/s dual-mode NRZ/PAM4 SerDes in 65nm CMOS," *IEEE J. Solid-State Circuits*, vol. 52, no. 9, pp. 2430–2447, Sep. 2017.
- [2.5.5] Y. -U. Jeong, H. Park, C. Hyun, J. -H. Chae, S. -H. Jeong and S. Kim, "A 0.64pJ/Bit 28-Gb/s/Pin High-Linearity Single-Ended PAM-4 Transmitter With an Impedance-Matched Driver and Three-Point ZQ Calibration for Memory Interface," *IEEE Journal of Solid-State Circuits*, vol. 56, no. 4, pp. 1278-1287, Apr. 2021.
- [3.3.2.1] J. Kim, et al., "A 16-to-40Gb/s quarter-rate NRZ/PAM4 dual-mode transmitter in 14nm CMOS," 2015 IEEE International Solid-State Circuits Conference (ISSCC), pp. 60-61, Feb. 2015.
- [3.3.2.2] H. Liu, L. Ding, J. Jin, and J. Zhou, "A reconfigurable 28/56 Gb/s PAM4/NRZ dual-mode SerDes with hardware-reuse," in *IEEE Int. Symposium on Circuits and Systems*, Florence, Italy, May 2018, pp. 1-5.

- [4.3.1] K. Kim, et al., "A 24Gb/s/pin 8Gb GDDR6 with a half-rate daisy-chain-based clocking architecture and IO circuitry for low-noise operation," 2021 IEEE International Solid-State Circuits Conference (ISSCC), pp. 344-346, Feb. 2021.
- [4.4.2.1] M. Kossel, et al., "A T-coil-enhanced 8.5 Gb/s high-swing SST transmitter in 65nm Bulk CMOS with <<-16dB return loss over 10 GHz bandwidth," *IEEE J. Solid-State Circuits*, vol. 43, no. 12, pp. 2905-2920, Dec. 2008.
- [4.4.2.2] B. Lim, D. Kim, and C. Yoo, "Voltage-mode PAM4 driver with differential ternary R-2R DAC architecture," *Electron. Letters*, 2020, 56, pp. 431–432.
- [A.1] J.-H. Chae, Y.-U. Jeong and S. Kim, "Data-Dependent Selection of Amplitude and Phase Equalization in a Quarter-Rate Transmitter for Memory Interfaces," *IEEE Trans. on Circuits and Systems I: Regular Papers*, vol. 67, no. 9, pp. 2972-2983, Sep. 2020.
- [A.2] T. Kim *et al.*, "A Hybrid ZQ Calibration Design for High-Density Flash Memory Toggle 5.0 High-speed Interface," in *IEEE Asian Solid-State Circuits Conference (ASSCC)*, Nov. 2021, pp. 1-2.
- [A.3] C.-K. Lee *et al.*, "Dual-Loop Two-Step ZQ Calibration for Dynamic Voltage-Frequency Scaling in LPDDR4 SDRAM," *IEEE J. Solid-State Circuits*, vol. 53, no. 10, pp. 2906-2916, Oct. 2018.
- [A.4] T. Na *et al.*, "A Heterogeneous Dual DLL and Quantization Error Minimized ZQ Calibration for 30nm 1.2V 4Gb 3.2Gb/s/pin DDR4 SDRAM," Symposium on VLSI Circuits, Jun. 2013, pp. 242-243.

- [A.5] J. Koo *et al.*, "Small-Area High-Accuracy ODT/OCD by calibration of Global On-Chip for 512M GDDR5 application," in *IEEE Custom Integrated Circuits Conference (CICC)*, Sep. 2009, pp. 717-720.
- [A.6] Y.-H. Song, R. Bai, K. Hu, H.-W. Yang, P. Y. Chiang, and S. Palermo, "A 0.47– 0.66 pJ/bit, 4.8–8 Gb/s I/O transceiver in 65 nm CMOS," IEEE J. Solid-State Circuits, vol. 48, no. 5, pp. 1276–1289, May 2013.

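## KOREAN ABSTRACT

As the demand for high-bandwidth memory increases, the data rate per pin is also increasing. In this thesis, single-ended voltage-mode PAM-4 transmitters are proposed to increase memory bandwidth. Since a PAM-4 signal has four voltage levels, output level distortion occurs in a single-ended PAM-4 transmitter due to drain-source voltage (V<sub>DS</sub>) variation when the output signal level changes from one level to another. To address this issue, the proposed PAM-4 transmitter has additional pull-up drivers. The additional pull-up drivers raise the two intermediate voltage levels, reducing the eye height difference among the four levels of the PAM-4 signal. Implemented in 65nm CMOS technology, the transmitter occupies an area of 0.06mm<sup>2</sup>. It consumes 61.5mW when operating at 20 Gb/s/pin.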

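In addition, another PAM-4 transmitter is proposed that prevents output level distortion while matching its impedance to the channel. ZQ codes for all four output signal levels are obtained through ZQ calibration and stored in a ZQ code table. The ZQ code generator then selects the appropriate codes according to the data pattern and delivers them to the output driver. To validate the effectiveness of our approach, a prototype chip with an area of 0.035mm<sup>2</sup> was fabricated in 65nm CMOS technology. It achieves an energy efficiency of 3.09 pJ/bit/pin at 18 Gb/s/pin and a level separation mismatch ratio (RLM) of 0.971 while matching the channel impedance.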

Keywords: single-ended voltage-mode transmitter, PAM-4 signaling, memory interface, level separation mismatch ratio, impedance matching, ZQ calibration.

Student Number: 2016-20987