



**Ph.D. Dissertation** 

# Design of Simultaneous Bidirectional Transceivers with PAM-4 Signaling

(Korean title: Design of Simultaneous Bidirectional Transceivers Using PAM-4 Signaling)

by

Yunhee Lee

August, 2023

Department of Electrical and Computer Engineering, College of Engineering, Seoul National University

# Design of Simultaneous Bidirectional Transceivers with PAM-4 Signaling

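Advisor: Professor Deog-Kyoon Jeong

Submitted as a dissertation for the degree of Doctor of Philosophy in Engineering, August 2023

Graduate School of Seoul National University, Department of Electrical and Computer Engineering, Yunhee Lee

The doctoral dissertation of Yunhee Lee is approved, August 2023

| Role       | Name             | Seal   |
|------------|------------------|--------|
| Chair      | Jaeha Kim        | (seal) |
| Vice-Chair | Deog-Kyoon Jeong | (seal) |
| Member     | Yong Moon        | (seal) |
| Member     | Woo-Seok Choi    | (seal) |
| Member     | Kwanseo Park     | (seal) |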

# Design of Simultaneous Bidirectional Transceivers with PAM-4 Signaling

by

Yunhee Lee

A Dissertation Submitted to the Department of Electrical and Computer Engineering in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

at

#### SEOUL NATIONAL UNIVERSITY

August, 2023

Committee in Charge:

Professor Jaeha Kim, Chairman

Professor Deog-Kyoon Jeong, Vice-Chairman

Professor Yong Moon

Professor Woo-Seok Choi

Professor Kwanseo Park

### Abstract

This thesis proposes the design of simultaneous bidirectional (SBD) transceivers using four-level pulse amplitude modulation (PAM-4) for wireline communication. The transceiver structures and novel hybrid techniques are proposed for both asymmetric and symmetric prototype chips.

In the first prototype design, an asymmetric SBD transceiver for next-generation automotive camera links exceeding 10 Gb/s is presented. PAM-4 signaling is employed to overcome the limited cable bandwidth, and SBD operation with PAM-4 is realized with a wide linear range (WLR) hybrid. A two-step hybrid strategy that bypasses the feed-forward equalizer (FFE) in the transmitter reduces power and significantly simplifies the hybrid design. The hybrid removes only the four primary DC levels with a coefficient $\Sigma\alpha$, and a second-order transconductor-capacitor (gm-C) low-pass filter (LPF) filters out the remaining components from the hybrid and the reflections from the channel. An echo canceller (EC) is also utilized to eliminate the reflections of the PAM-2 back channel (BC). The highly asymmetric transceiver, with a 12-Gb/s PAM-4 forward channel (FC) and a 125-Mb/s PAM-2 BC, exhibits eye margins of 0.15 UI and 0.57 UI, respectively, at a bit error rate (BER) < 10<sup>-12</sup> over a 5-m automotive cable under SBD communication. Fabricated in 40-nm CMOS, the prototype transceiver achieves an energy efficiency of 6.5 pJ/b, corresponding to a figure of merit (FoM) of 0.41 pJ/b/dB.

The second prototype chip presents a symmetric SBD transceiver with PAM-4 that employs a novel hybrid adaptation scheme. The possibility of extending bandwidth is explored by applying PAM-4 signaling to SBD. Furthermore, a mismatch compensation method is proposed for the hybrid circuit, which is essential for removing the outbound signal in SBD. The hybrid adaptation is easily implemented by applying the locked condition of the Mueller-Müller phase detector (MMPD) together with the data level adaptation. By sharing one error sampler between the MMPD and the adaptation engine, the presented SBD transceiver is efficient in terms of clocking power consumption and the bandwidth of the receiver front-end. A wide linear range hybrid and a data alignment technique guarantee robust PAM-4 SBD operation. Fabricated in 28-nm CMOS, the 80-Gb/s SBD transceiver achieves a BER of less than 10<sup>-12</sup> with an energy efficiency of 2.65 pJ/b.

**Keywords:** Simultaneous bidirectional (SBD), four-level pulse amplitude modulation (PAM-4), asymmetric, symmetric, automotive camera link, hybrid, two-step hybrid, hybrid adaptation, transceiver.

Student Number: 2019-26929

## Contents

| Section   | Title                                                          |
|-----------|----------------------------------------------------------------|
|           | Abstract                                                       |
|           | Contents                                                       |
|           | List of Figures                                                |
|           | List of Tables                                                 |
| Chapter 1 | Introduction                                                   |
| 1.1       | Motivation                                                     |
| 1.2       | Thesis Organization                                            |
| Chapter 2 | Background of Simultaneous Bidirectional Transceiver           |
| 2.1       | Overview                                                       |
| 2.2       | Basic Architectures                                            |
| 2.3       | Hybrid                                                         |
| 2.3.1     | Replica Hybrid                                                 |
| 2.3.2     | Resistor-Transconductor (R-gm) Hybrid                          |
| 2.3.3     | Wide Linear Range (WLR) Hybrid                                 |
| Chapter 3 | Design of Asymmetric SBD Transceiver with Two-Step Hybrid      |
| 3.1       | Overview                                                       |
| 3.2       | Analysis on Wide Linear Range (WLR) Hybrid                     |
| 3.3       | Proposed Two-Step Hybrid Strategy                              |
| 3.4       | SBD Transceiver Implementation                                 |
| 3.4.1     | Serializer Chip (SER)                                          |
| 3.4.2     | Deserializer Chip (DES)                                        |
| 3.5       | Measurement                                                    |
| Chapter 4 | Design of Symmetric SBD Transceiver with Hybrid Adaptation     |
| 4.1       | Overview                                                       |
| 4.2       | Proposed Hybrid Adaptation Scheme                              |
| 4.3       | SBD Transceiver Implementation                                 |
| 4.3.1     | Transmitter                                                    |
| 4.3.2     | Receiver                                                       |
| 4.3.3     | Clock Distribution                                             |
| 4.4       | Measurement                                                    |
| Chapter 5 | Conclusion                                                     |
|           | Bibliography                                                   |
|           | Abstract (in Korean)                                           |

## **List of Figures**

| Figure    | Caption |
|-----------|---------|
| Fig. 1.1  | Number of published papers on wireline SBD transceivers by year |
| Fig. 1.2  | Per-pin data rates of SBD transceivers by year |
| Fig. 2.1  | Comparison of unidirectional and bidirectional communications |
| Fig. 2.2  | Examples of symmetric SBD signaling applications [59], [51] |
| Fig. 2.3  | Examples of asymmetric SBD signaling applications [49], [60] |
| Fig. 2.4  | (a) Operational principle of PAM-4 signaling and (b) power spectral density of NRZ and PAM-4 signals with same data rate [61] |
| Fig. 2.5  | Eye diagrams of overlapped signals in symmetrical SBD communication using (a) NRZ and (b) PAM-4 signaling |
| Fig. 2.6  | Concept of time division duplexing (TDD) |
| Fig. 2.7  | Concept of frequency division duplexing (FDD) |
| Fig. 2.8  | (a) Conceptual diagram of SBD transceiver using active cancellation. (b) Separation process of active cancellation |
| Fig. 2.9  | Near-end/far-end echoes in SBD transceiver |
| Fig. 2.10 | Basic architecture of wireline SBD transceiver (single side) |
| Fig. 2.11 | Structure of conventional replica hybrid |
| Fig. 2.12 | Implementation example of a subtraction stage |
| Fig. 2.13 | Structure of switched-capacitor hybrid |
| Fig. 2.14 | (a) Conceptual structure and (b) separation process of R-gm hybrid |
| Fig. 2.15 | Implementation example of (a) conventional replica hybrid and (b) R-gm hybrid |
| Fig. 2.16 | (a) Structure of WLR hybrid. (b) Operational principle of WLR hybrid (when inbound signal is off) |
| Fig. 3.1  | Overall architecture of automotive camera link system |
| Fig. 3.2  | Characteristic of hybrid operation in the asymmetric SBD transceivers |
| Fig. 3.3  | (a) WLR hybrid including an interconnect model and (b) small-signal model of the hybrid network |
| Fig. 3.4  | Power overhead and frequency characteristic of WLR hybrid to $R_{\text{HYB}}$ |
| Fig. 3.5  | RLM comparison of hybrid structures to the input voltage swing |
| Fig. 3.6  | Conceptual diagram of SBD structure that emulates PAM-4 drivers including TX FFE using replica |
| Fig. 3.7  | (a) Conceptual diagram of proposed two-step hybrid structure in SER and (b) operational principle of the hybrid structure with a single-bit model |
| Fig. 3.8  | FFT simulation results of FC signal with different hybrid coefficients |
| Fig. 3.9  | Simulated received eye-opening of BC signal to the timing difference between FC driver and hybrid |
| Fig. 3.10 | Simulated FC driver RLM vs. output voltage swing |
| Fig. 3.11 | Simulated eye openings of received FC and BC vs. signal swing ratio of FC and BC |
| Fig. 3.12 | SER block diagram with two-step hybrid architecture |
| Fig. 3.13 | Configuration of 6:1 serializer |
| Fig. 3.14 | Schematic of PAM-4 FC driver (including pre-driver) and hybrid |
| Fig. 3.15 | Structure of 2<sup>nd</sup>-order gm-C LPF |
| Fig. 3.16 | Signal spectra of FC and BC after hybrid in SER |
| Fig. 3.17 | Eye diagrams of received BC signal with 1<sup>st</sup>- and 2<sup>nd</sup>-order LPF |
| Fig. 3.18 | Frequency response of gm-C LPF |
| Fig. 3.19 | Vertical/horizontal margins of received BC signal to $Q$ factor |
| Fig. 3.20 | DES block diagram with WLR hybrid and echo canceller |
| Fig. 3.21 | Schematic of BC driver and WLR hybrid network |
| Fig. 3.22 | (a) Overlapped hybrid input under SBD operation. Outputs of WLR hybrid (b) without attenuator, (c) with passive attenuator, and (d) with active PGA instead of passive attenuator |
| Fig. 3.23 | Configuration of FC RX data path |
| Fig. 3.24 | Frequency response of PGA including passive attenuator |
| Fig. 3.25 | Frequency response of CTLE (DC gain control) |
| Fig. 3.26 | (a) A single-bit response of BC with 5-m automotive cable and (b) echo waveforms with and without echo canceller |
| Fig. 3.27 | Measurement setup of SBD transceiver |
| Fig. 3.28 | Measured channel characteristics of 5-m STQ cable |
| Fig. 3.29 | Measured transmitter output waveform of PAM-4 FC signal |
| Fig. 3.30 | Measured transmitter output waveform of PAM-2 BC signal |
| Fig. 3.31 | Measured spectra of SER intermediate nodes |
| Fig. 3.32 | Measured spectra of DES intermediate nodes |
| Fig. 3.33 | Measured bathtub curve of FC data under SBD operation |
| Fig. 3.34 | Measured bathtub curves of BC data under SBD operation with two hybrid coefficients |
| Fig. 3.35 | Measured JTOL curves of 12-Gb/s PAM-4 FC data |
| Fig. 3.36 | Measured JTOL curves of 125-Mb/s PAM-2 BC data |
| Fig. 3.37 | Chip photomicrographs of SER and DES |
| Fig. 3.38 | Power breakdown of SBD transceiver |
| Fig. 4.1  | Conceptual block diagram of SBD transceiver with hybrid adaptation engine |
| Fig. 4.2  | Operational principle of proposed hybrid adaptation scheme |
| Fig. 4.3  | Truth table of (a) pattern filter for hybrid adaptation and (b) hybrid weight adaptation logic |
| Fig. 4.4  | Simulated results of (a) normalized VEO by the normalized hybrid weight and (b) eye diagrams by the hybrid mismatches |
| Fig. 4.5  | Block diagram of overall PAM-4 SBD transceiver architecture |
| Fig. 4.6  | Schematic implementation of final 2:1 serializer |
| Fig. 4.7  | Schematic of driver and hybrid |
| Fig. 4.8  | Simulated waveforms of PAM-4 hybrid operation |
| Fig. 4.9  | Simulated eye diagrams of hybrid (a) input and (b) output |
| Fig. 4.10 | Simulated output impedance of driver including hybrid |
| Fig. 4.11 | (a) Schematic and (b) frequency response of CTLE |
| Fig. 4.12 | Logical operation of DLEV adaptation |
| Fig. 4.13 | Logical operations of SS-MMPD in PAM-4 using one error sampler |
| Fig. 4.14 | Simulated locking transients of adaptation loops |
| Fig. 4.15 | Simulated PI locking behaviors with different hybrid weights |
| Fig. 4.16 | (a) Schematic of PI and (b) interpolated phase by control code |
| Fig. 4.17 | Conceptual diagram of data alignment technique |
| Fig. 4.18 | Chip photomicrograph and power/area breakdown |
| Fig. 4.19 | Measured frequency response of channel |
| Fig. 4.20 | Output waveform of PAM-4 transmitter |
| Fig. 4.21 | Measurement setup of SBD transceiver |
| Fig. 4.22 | Measured bathtub curves of 40-Gb/s UD and 80-Gb/s SBD transceiver |
| Fig. 4.23 | Measured JTOL curves of 40-Gb/s UD and 80-Gb/s SBD transceiver |
| Fig. 4.24 | Measured BER of SBD transceiver by hybrid weight |
| Fig. 4.25 | Measured BER of SBD transceiver by phase difference of SBD signals |

## **List of Tables**

| Table     | Caption |
|-----------|---------|
| Table 3.1 | Performance summary and comparison with prior SBD transceivers |
| Table 4.1 | Performance summary and comparison with prior symmetric SBD transceivers |

## Chapter 1

### Introduction

### **1.1 Motivation**

Simultaneous bidirectional (SBD) signaling is a communication method that transmits signals in both directions over a single channel at the same time. This communication topology can increase data throughput under legacy channel conditions and is required for applications that need real-time transmission of both data and control commands. While SBD signaling has primarily been used in wireless communication, there is a growing demand to utilize its benefits in wireline communication as bandwidth requirements continue to increase. Fig. 1.1 shows the number of published papers on wireline SBD transceivers by year in IEEE international conferences and journals [1]-[54], and the per-pin data rates of silicon-proven SBD transceivers are shown in Fig. 1.2. In the 2000s, SBD signaling was prevalent in processors, with data rates of around 1 to 2 Gb/s. SBD research, which had slowed down for a while, regained momentum around 2020, and interest has rapidly increased since then. The development of advanced devices and circuit techniques has significantly increased bandwidth, and the application fields have become diverse, including Ethernet, HDMI, automotive links, and next-generation silicon-interposer chiplets. To take full advantage of the throughput gain of SBD signaling, studies have also emerged that apply four-level pulse-amplitude modulation (PAM-4) instead of the conventional two-level non-return-to-zero (NRZ) signaling.

Fig. 1.1 Number of published papers on wireline SBD transceivers by year.

Fig. 1.2 Per-pin data rates of SBD transceivers by year.

Various studies have explored PAM-4 techniques to increase bandwidth [55]-[58]; however, only a few studies have applied PAM-4 to SBD communication. In SBD, the desired inbound signal must be extracted from the overlapped signal using a circuit known as a hybrid. PAM-4 presents unique challenges, including linearity issues and differences in the TX driver structure compared to NRZ. Thus, an appropriate SBD transceiver structure is required when using PAM-4 signaling. Furthermore, compensation methods for hybrid circuits are necessary because mismatches degrade the signal-to-noise ratio (SNR). This thesis proposes design strategies suitable for SBD transceivers using PAM-4 signaling. An efficient hybrid technique for a PAM-4 TX with a feed-forward equalizer (FFE) is proposed in an asymmetric SBD transceiver, and an adaptation loop that compensates for the hybrid mismatch is presented for more robust SBD operation in a symmetric PAM-4 SBD transceiver design. The proposed SBD transceivers achieve high data rates and power efficiency, and measurement results demonstrate the applicability of PAM-4 signaling to SBD transceivers.

### **1.2 Thesis Organization**

This thesis is organized as follows. In Chapter 2, the background of simultaneous bidirectional (SBD) transceivers is presented. A comparison of SBD signaling with unidirectional (UD) signaling and the basic architectures of SBD communication are provided. Moreover, the hybrid structures that remove the outbound signal are discussed with regard to their suitability for SBD transceivers utilizing four-level pulse-amplitude modulation (PAM-4).

In Chapter 3, a design of an asymmetric SBD transceiver with a two-step hybrid is presented. An analysis of the wide linear range (WLR) hybrid is performed to ensure precise operation, and a two-step hybrid strategy is proposed for a low-complexity hybrid design with a PAM-4 transmitter that employs a feed-forward equalizer (FFE). Then, the implementation details of the SBD transceiver are explained, followed by the measurement results of the prototype chip.

In Chapter 4, a design of a symmetric SBD transceiver with a hybrid adaptation scheme is presented. A hybrid adaptation methodology is proposed for accurate hybrid operation in a PAM-4 SBD transceiver, with a low-cost design that reuses legacy circuit components. Next, the implementation of the SBD transceiver is explained, and the performance of the prototype chip is verified with the measurement results.

Chapter 5 summarizes the proposed works and concludes this thesis.

## Chapter 2

### Background of Simultaneous Bidirectional Transceiver

### **2.1 Overview**

Bidirectional communication is best understood in contrast to unidirectional communication. Unidirectional communication has a single directionality, with a transmitter (TX) on one side of a channel and a receiver (RX) on the other. In contrast, bidirectional communication has a TX and RX pair on both sides, allowing data transfer in both directions. Bidirectional communication can be categorized into half-duplex and full-duplex modes. In half-duplex, such as radio communication, two devices connected to a single communication channel can transmit or receive data in only one direction at a time. In contrast, full-duplex allows data transmission in both directions simultaneously. Fig. 2.1 illustrates these communication modes, and this thesis focuses on simultaneous bidirectional (SBD) signaling, which corresponds to full duplex.

Fig. 2.1 Comparison of unidirectional and bidirectional communications.

Based on the data rates of the forward channel (FC) and back channel (BC), SBD signaling is divided into two types, symmetric and asymmetric. In symmetric SBD signaling, both directions have the same transmission speed and generally use the same voltage swing. As a result, the total throughput of a single channel can be doubled compared to unidirectional communication. Various Ethernet standards are representative applications, and efforts are being made to adopt SBD signaling for next-generation chiplets. In contrast, asymmetric SBD signaling refers to a situation where the data rates of the forward and back channels differ. It is primarily utilized in wireline applications where control commands must be issued in response to real-time data. Examples of symmetric and asymmetric SBD signaling applications are shown in Fig. 2.2 and Fig. 2.3, respectively [59], [51], [49], [60].



Fig. 2.2 Examples of symmetric SBD signaling applications [59], [51].



Fig. 2.3 Examples of asymmetric SBD signaling applications [49], [60].

In general, research on high-speed wireline SBD transceivers has primarily focused on non-return-to-zero (NRZ) signals. However, as interfaces require higher data rates, four-level pulse-amplitude modulation (PAM-4) is becoming more important in SBD transceivers. In specific applications, the PAM-4 SBD transceiver has emerged as a standard transceiver structure. The operating principle and characteristics of PAM-4 signaling are illustrated in Fig. 2.4 [61]. This signaling maps two bits onto four levels, resulting in half the Nyquist frequency of NRZ signaling for the same data rate. As a result, the channel insertion loss that must be compensated is reduced, and the timing constraint is relaxed because the unit interval is doubled. Ultimately, PAM-4 is more advantageous for achieving high-speed transmission that is difficult to achieve with NRZ. However, there is a 9.5-dB signal-to-noise ratio (SNR) penalty because the vertical eye size decreases to 1/3 for the same signal swing. Additionally, linearity becomes more important, and encoding/decoding of the most significant bit (MSB) and least significant bit (LSB) data is required, which results in a more complex circuit configuration. Thus, these aspects must be considered when applying PAM-4 to SBD transceivers. Fig. 2.5 depicts the eye diagrams of overlapping signals in symmetrical SBD communication using NRZ and PAM-4 signaling, assuming no channel loss.

Fig. 2.4 (a) Operational principle of PAM-4 signaling and (b) power spectral density of NRZ and PAM-4 signals with same data rate [61].



Fig. 2.5 Eye diagrams of overlapped signals in symmetrical SBD communication using (a) NRZ and (b) PAM-4 signaling.
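The Nyquist-frequency and SNR figures quoted in this section can be reproduced with a few lines of arithmetic. The following is a minimal sketch rather than part of the original analysis: the function names are illustrative, and the 12-Gb/s rate is borrowed from the forward channel of the first prototype only as an example.

```python
import math


def nyquist_ghz(data_rate_gbps: float, levels: int) -> float:
    """Nyquist frequency in GHz, i.e. half the symbol rate."""
    bits_per_symbol = math.log2(levels)
    symbol_rate_gbaud = data_rate_gbps / bits_per_symbol
    return symbol_rate_gbaud / 2.0


def eye_height_penalty_db(levels: int) -> float:
    """Vertical-eye penalty relative to two-level signaling at equal swing.

    With `levels` equally spaced levels, the swing is shared by
    (levels - 1) stacked eyes, so each eye is 1/(levels - 1) as tall.
    """
    return 20.0 * math.log10(levels - 1)


if __name__ == "__main__":
    rate = 12.0  # Gb/s, example value only
    print("NRZ   Nyquist [GHz]:", nyquist_ghz(rate, 2))
    print("PAM-4 Nyquist [GHz]:", nyquist_ghz(rate, 4))
    print("PAM-4 eye penalty [dB]:", round(eye_height_penalty_db(4), 2))
```

For four levels the penalty evaluates to 20·log10(3), approximately 9.5 dB, which matches the value stated above.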

### **2.2 Basic Architectures**

In SBD communication, the primary challenge is ensuring that signals in both directions do not interfere. SBD architectures can be classified into several categories depending on the approach taken to address this. One such approach is time division duplexing (TDD) (Fig. 2.6). This method alternates the direction of data transmission at fixed time-slot intervals, so it is not simultaneous bidirectional communication in the strict sense. The switching between transmission modes is repeated fast enough to be imperceptible, so the link behaves like a simultaneous bidirectional one. Because the bidirectional signals do not overlap, this method does not require a circuit to separate them; for the same reason, however, it cannot obtain the throughput increase achievable with SBD signaling. TDD also requires additional handshaking and control logic.

Another approach is frequency division duplexing (FDD) (Fig. 2.7), which separates the communication frequency bands to prevent data collision. This method is commonly used in cellular networks, where two carrier frequencies (or a baseband and a carrier) modulate the forward and back channel data, resulting in complete spectral separation. However, implementing the same modulation scheme in wireline communication presents practical difficulties. Therefore, in wireline applications, FDD is feasible only for asymmetric SBD transceivers with a significant data rate difference between the forward and back channels, whereas TDD can be applied regardless of symmetry. On the other hand, FDD transmits bidirectional signals simultaneously, eliminating the overhead caused by TDD switching.



Fig. 2.6 Concept of time division duplexing (TDD).



Fig. 2.7 Concept of frequency division duplexing (FDD).
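The throughput argument for TDD can be made concrete with a simple slot model. This is a hedged sketch: the equal-slot schedule, the turnaround time, and all numeric values are hypothetical and chosen only to illustrate why TDD cannot exceed half the line rate per direction, whereas an ideal SBD link carries the full line rate in each direction.

```python
def tdd_throughput_per_direction(line_rate_gbps: float,
                                 slot_ns: float,
                                 turnaround_ns: float) -> float:
    """Average one-way throughput when two directions alternate in equal slots.

    One full cycle holds one slot per direction plus one turnaround
    (handshaking / mode switching) after each slot.
    """
    cycle_ns = 2.0 * (slot_ns + turnaround_ns)
    return line_rate_gbps * slot_ns / cycle_ns


def sbd_throughput_per_direction(line_rate_gbps: float) -> float:
    """Ideal SBD: both directions use the channel all the time."""
    return line_rate_gbps


if __name__ == "__main__":
    line_rate = 10.0   # Gb/s, hypothetical
    slot = 100.0       # ns, hypothetical
    turnaround = 10.0  # ns, hypothetical
    print("TDD per direction [Gb/s]:",
          tdd_throughput_per_direction(line_rate, slot, turnaround))
    print("SBD per direction [Gb/s]:",
          sbd_throughput_per_direction(line_rate))
```

Even with zero turnaround time, the TDD result is capped at half the line rate, which is the throughput limitation noted for TDD above.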

Due to the limitations and constraints of TDD and FDD, active cancellation is generally used in wireline communication. This method sends bidirectional signals without separate frequency or phase modulation and obtains the desired inbound signal by subtracting the outbound signal from the overlapped signal through a subtraction circuit called a hybrid. Because signals are sent in both directions without dividing the frequency band, direct removal of the outbound signal is required. The most conventional method is to use a replica circuit to generate the signal to be subtracted and remove it from the overlapped signal. Fig. 2.8(a) shows the conceptual diagram of an SBD transceiver that performs active cancellation, and Fig. 2.8(b) shows the active cancellation process. If the subtraction circuit is not precisely designed or if a mismatch occurs, a residual error remains in the separated inbound signal, leading to a deterioration of the SNR.

Fig. 2.8 (a) Conceptual diagram of SBD transceiver using active cancellation. (b) Separation process of active cancellation.
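The effect of a hybrid mismatch on the separated inbound signal can be illustrated numerically. The sketch below is a behavioral model under stated assumptions, not the circuit of this thesis: the mismatch is modeled as a pure gain error in the replica, both directions send random PAM-4 symbols at normalized levels with equal swing, and channel loss, noise, and timing skew are ignored. All names are illustrative.

```python
import math
import random

# Normalized, equally spaced PAM-4 levels (assumption of this sketch).
PAM4_LEVELS = (-1.0, -1.0 / 3.0, 1.0 / 3.0, 1.0)


def residual_snr_db(gain_mismatch: float,
                    n_symbols: int = 10000,
                    seed: int = 0) -> float:
    """Ratio of inbound signal power to residual outbound power after the hybrid."""
    rng = random.Random(seed)
    signal_power = 0.0
    error_power = 0.0
    for _ in range(n_symbols):
        v_ib = rng.choice(PAM4_LEVELS)            # inbound symbol
        v_ob = rng.choice(PAM4_LEVELS)            # outbound symbol
        v_line = v_ib + v_ob                      # overlapped signal at the terminal
        v_rep = (1.0 + gain_mismatch) * v_ob      # replica with a gain error
        v_rx = v_line - v_rep                     # hybrid output
        signal_power += v_ib ** 2
        error_power += (v_rx - v_ib) ** 2         # leftover outbound component
    if error_power == 0.0:
        return math.inf
    return 10.0 * math.log10(signal_power / error_power)


if __name__ == "__main__":
    for eps in (0.01, 0.05, 0.10):
        print(f"gain mismatch {eps:.2f}: residual SNR = {residual_snr_db(eps):.1f} dB")
```

In this model the residual is the gain error times the outbound signal, so for equal-power streams the SNR ceiling set by the hybrid is roughly 20·log10(1/ε) for a relative gain error ε.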

SBD operation is valid for both symmetric and asymmetric links, but additional considerations exist. First, the increased operating range must be considered. Since bidirectional signals overlap, the signal range at the terminal increases. The driver and hybrid must therefore maintain their performance over the increased dynamic range under a limited supply voltage. Another important consideration is reflection. During SBD operation, mismatches can occur between the output impedance of the driver and the characteristic impedance of the channel, resulting in near-end/far-end echoes, as shown in Fig. 2.9. If the impact of the echoes is significant, an echo canceller for the outbound signal may be required at the receiver end. The echoes are often defined as the reflected outbound signals, which are distinguished from the reflected inbound signals. In cases where the inbound signal undergoes multiple reflections and affects the received inbound signal after some latency, a floating decision feedback equalizer (DFE) may be necessary [62]. However, because a reflected inbound signal undergoes at least two reflections and additional channel loss before reaching the receiver, a floating DFE is usually unnecessary for high-speed signals.

Fig. 2.9 Near-end/far-end echoes in SBD transceiver.
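The relative size of these reflections can be estimated from the standard voltage reflection coefficient, Γ = (Z − Z0)/(Z + Z0). The following sketch rests on simplifying assumptions that are not from the source: purely resistive terminations, a frequency-independent channel loss, and hypothetical values for the impedances and loss.

```python
def reflection_coefficient(z_load_ohm: float, z0_ohm: float = 50.0) -> float:
    """Voltage reflection coefficient of a line of impedance z0 terminated in z_load."""
    return (z_load_ohm - z0_ohm) / (z_load_ohm + z0_ohm)


def db_to_linear(loss_db: float) -> float:
    """Convert a positive loss in dB to a linear voltage gain below 1."""
    return 10.0 ** (-loss_db / 20.0)


def far_end_echo(z_far_ohm: float, z0_ohm: float, one_way_loss_db: float) -> float:
    """Outbound signal reflected at the far end and seen back at the near end,
    relative to the launched outbound amplitude."""
    gamma_far = reflection_coefficient(z_far_ohm, z0_ohm)
    return gamma_far * db_to_linear(2.0 * one_way_loss_db)


def double_reflected_inbound(z_near_ohm: float, z_far_ohm: float,
                             z0_ohm: float, one_way_loss_db: float) -> float:
    """Inbound signal reflected at the near end, then at the far end, arriving
    again at the near end, relative to the directly received inbound amplitude."""
    gamma_near = reflection_coefficient(z_near_ohm, z0_ohm)
    gamma_far = reflection_coefficient(z_far_ohm, z0_ohm)
    return gamma_near * gamma_far * db_to_linear(2.0 * one_way_loss_db)


if __name__ == "__main__":
    z0, z_term, loss_db = 50.0, 55.0, 6.0  # hypothetical values
    print("far-end echo ratio:", far_end_echo(z_term, z0, loss_db))
    print("double-reflected inbound ratio:",
          double_reflected_inbound(z_term, z_term, z0, loss_db))
```

Because the doubly reflected inbound term contains the product of two reflection coefficients in addition to the round-trip loss, it is much smaller than the single-reflection echo of the outbound signal, which is consistent with the remark that a floating DFE is usually unnecessary.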

Fig. 2.10 illustrates the basic architecture of the wireline SBD transceiver (single side), which comprises a hybrid and an echo canceller. For symmetric SBD transceivers, identical structures are located on both sides, whereas the structures on the two sides may differ for asymmetric SBD transceivers. Compared to unidirectional communication, a symmetric SBD transceiver achieves half the Nyquist frequency for the same throughput, similar to the case of PAM-4 relative to NRZ. Thus, it incurs relatively low channel insertion loss and rarely requires a TX feed-forward equalizer (FFE). When channel loss compensation is necessary, equalization is usually done through a continuous-time linear equalizer (CTLE) or a DFE at the RX. A TX FFE is also avoided because it reduces the voltage swing of the main tap and makes the driver bulky, which makes the hybrid design more challenging.

Fig. 2.10 Basic architecture of wireline SBD transceiver (single side).

### 2.3 Hybrid

The hybrid is a critical block for active cancellation and reliable communication in the SBD transceiver. It extracts the received inbound signal by removing the outbound signal at the terminal shared by the transmitter and receiver. However, the design of such a hybrid is challenging, as it must satisfy several design goals simultaneously [46]. Specifically, the hybrid should be linear over the overlapped range of the bidirectional signals and avoid affecting the impedance matching or the behavior of the driver while minimizing the power/area overhead. In addition, as data rates increase and process nodes scale down, mismatches become more significant, and a methodology to compensate for them becomes necessary.

Several hybrid structures have been proposed to meet these requirements, utilizing different techniques, from the replica generation and subtraction shown in Fig. 2.8(b) to methods that exploit the relationship between intermediate nodes with different ratios of inbound/outbound signals. However, numerous trade-offs preclude a universally optimal solution for all designs. Therefore, it is crucial to use a hybrid structure suitable for each specific situation. This section explains the operating principles and characteristics of several representative hybrid structures.

#### 2.3.1 Replica Hybrid

The replica hybrid is the most basic hybrid method, characterized by its intuitive approach. It generates a replica of the outbound signal with a replica circuit and removes it from the overlapped bidirectional signals. Fig. 2.11 shows a conventional hybrid structure that includes a replica driver and a subtraction stage to emulate and cancel the outbound signal [1], [4]. To match the line driver, a dummy *RC* is required at the output node of the replica (REP) driver. As the driver accounts for a significant portion of the total TX power, the replica driver is often scaled down to reduce power consumption. The subtraction stage can be implemented as a differential amplifier with a current-mode logic (CML) structure, and Fig. 2.12 provides an implementation example. The suitability of a subtractor design with a transconductance ($g_m$) cell and a load resistor for PAM-4 SBD transceivers can be examined as follows.



Fig. 2.11 Structure of conventional replica hybrid.



Fig. 2.12 Implementation example of a subtraction stage.

The maximum differential input voltage of the CML subtraction stage for linear operation is expressed as

$$\Delta V = \sqrt{\frac{2I_{\rm SUB}}{\mu C_{\rm ox} \left(\frac{W}{L}\right)}}. \tag{2.1}$$

The relationship shows that, when the source current $I_{\rm SUB}$ is constant, $(W/L)$ must be reduced to 1/4 for the input swing to be doubled. Here, the load resistor and current source can be assumed to be constant since they are determined by the bandwidth of the circuit and the common-mode level of the next stage. As a result, this subtraction structure in the SBD halves $g_m$, thereby reducing the gain and degrading the SNR. More importantly, since $g_m$ is voltage-dependent and non-linear, robust hybrid operation is hard to ensure over a wide input range, which makes the structure poorly suited to PAM-4 signaling.
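The scaling stated above can be checked directly from Eq. (2.1). The short derivation below assumes the long-channel square-law MOS model with each input device carrying $I_{\rm SUB}/2$ at balance; this model is an assumption of the sketch and is not stated explicitly in the text. From Eq. (2.1), with $I_{\rm SUB}$ fixed,

$$\Delta V \propto \left(\frac{W}{L}\right)^{-1/2} \quad\Rightarrow\quad \Delta V \to 2\,\Delta V \ \ \text{requires} \ \ \frac{W}{L} \to \frac{1}{4}\,\frac{W}{L}.$$

Under the same model, the small-signal transconductance of each input device is

$$g_m = \sqrt{2\mu C_{\rm ox}\left(\frac{W}{L}\right)\frac{I_{\rm SUB}}{2}} = \sqrt{\mu C_{\rm ox}\left(\frac{W}{L}\right) I_{\rm SUB}} = \frac{\sqrt{2}\, I_{\rm SUB}}{\Delta V},$$

so at constant $I_{\rm SUB}$ the transconductance is inversely proportional to the linear input range, and doubling $\Delta V$ gives $g_m \to g_m/2$. With a fixed load resistor, the small-signal gain of the subtraction stage is halved accordingly.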



Fig. 2.13 Structure of switched-capacitor hybrid.

Fig. 2.13 illustrates a switched-capacitor hybrid that employs a replica driver, while the subtraction stage operates in discrete time using switched capacitors to further reduce power consumption [10], [15]. The voltages of the LINE and REP are sampled at the rising edge of $\phi_1$ and stored in $C_{\text{LINE}}$ and $C_{\text{REP}}$, respectively. These two capacitors are then connected in series at the rising edge of $\phi_2$ for subtraction. However, the reduction in hybrid power is not significant despite the use of a scaled replica driver. Moreover, as the data rate increases, this structure is less likely to be adopted due to the reduced timing margin for sample and hold. In particular, when the inbound and outbound signals have different frequencies, the timing of subtracting the outbound signal may conflict with the desired sampling time of the inbound signal. The phase relation between the two clock phases and the duty cycle should also be carefully handled.

#### 2.3.2 Resistor-Transconductor (R-gm) Hybrid

The resistor-transconductor (R-gm) hybrid is an improved active cancellation method that utilizes the characteristics of intermediate nodes with different ratios of inbound ($V_{ib}$) and outbound ($V_{ob}$) signals [28]. This design uses a sensing resistor, *r*, and transconductor cells to obtain the inbound signal, thereby eliminating the need for a replica driver. Fig. 2.14 illustrates the structure and separation process of the R-gm hybrid. Unlike the conventional replica hybrid, which generates a copy of the outbound signal by the replica driver, the R-gm hybrid generates a signal corresponding to the subtraction of the inbound and outbound signals ($V = V_{ib} - V_{ob}$). By subtracting the generated signal from the overlapped signal ($V = V_{ib} + V_{ob}$) at the terminal, a signal proportional to the inbound signal is obtained.

Fig. 2.14 (a) Conceptual structure and (b) separation process of R-gm hybrid.

The R-gm hybrid offers several advantages over conventional replica hybrid circuits since it eliminates the requirement for a replica path, which in turn resolves mismatch issues caused by gain or frequency characteristics of drivers in the conventional hybrid structure. Furthermore, in TX designs incorporating pre-emphasis or multi-level signaling, the R-gm hybrid simplifies the design by eliminating the requirement to emulate all slices for pre-emphasis or multi-level signaling. This convenience enables its application to a wide range of signal technologies.

However, it also has some drawbacks. For instance, since the TX current is split between the shunt and series paths, the driver's power cannot be fully utilized, which halves the voltage swing at the channel. Fig. 2.15 illustrates the implementation of the conventional replica hybrid and the R-gm hybrid with a current-mode (CM) driver, comparing the nodal voltages when the same driver current is consumed. In this example, a current-mode driver is assumed for both hybrid structures, and $Z_0/2$ is assumed as the sensing resistor of the R-gm hybrid. To achieve the same voltage of $V_{ib}$ at the RX, the R-gm hybrid requires $g_m$ cells with gains of 2x and 3x compared to the conventional replica hybrid. Considering the maximum differential input voltage discussed in Section 2.3.1, the R-gm hybrid encounters more challenging design issues than the conventional replica hybrid for the subtraction circuit. Additionally, mismatches between the two $g_m$ cells or resistors still limit the performance of the hybrid, even if the mismatch factors from the replica are eliminated.



Fig. 2.15 Implementation example of (a) conventional replica hybrid and (b) R-gm hybrid.

The R-gm hybrid can be implemented with a voltage-mode (VM) driver, preventing current splitting and enhancing energy efficiency [44]. The VM driver can also assist in designing the subtraction stage. However, there is a significant concern when adopting the VM driver. In a VM driver, the instantaneous output impedance changes with variations in input and output voltages. The dynamic variation in output impedance generates large echoes, which considerably deteriorate signal integrity. Consequently, the use of an echo canceller becomes necessary, with sufficient coverage range in delays and weights for many taps. As the echo canceller, including delay stages, consumes considerable power, the design approach should be carefully determined.

#### 2.3.3 Wide Linear Range (WLR) Hybrid

The wide linear range (WLR) hybrid is a hybrid method presented to address the linearity issue [50]. It utilizes a passive resistor ($R_{HYB}$) and a current source ($I_{HYB}$) for subtraction, as shown in Fig. 2.16(a). The hybrid does not suffer from linearity degradation since the overlapped inbound/outbound signal does not pass through the voltage-dependent non-linear $g_m$ of the active elements. This feature allows for the uniform cancellation of the outbound signal regardless of voltage levels and maintains the linearity of the inbound PAM-4 signal over a wide voltage range. Consequently, the high SNR of the PAM-4 signal can be preserved without reducing the amplitude due to the linearity issue. Moreover, since the emulation and subtraction stages are combined, it does not require extra power for the subtraction stage.

Fig. 2.16 (a) Structure of WLR hybrid. (b) Operational principle of WLR hybrid (when inbound signal is off).

Fig. 2.16(b) depicts the operational principle of the WLR hybrid when the inbound signal is turned off. The driver current ( $I_{DRV}$ ) flows through the termination resistor ( $R_{TERM}$ ) to generate the outbound signal, while  $I_{HYB}$  flows through  $R_{HYB}$  and  $R_{TERM}$  in the opposite direction, cancelling the outbound signal. The WLR hybrid equation can be expressed as

$$(I_{\rm DRV} - I_{\rm HYB})(R_{\rm TERM} || Z_0) = I_{\rm HYB}R_{\rm HYB}$$
(2.2)

where  $Z_0$  denotes the characteristic impedance of the channel. To reduce power consumption,  $I_{\text{HYB}}$  can be scaled down by a factor of *M* compared to the LINE driver. The ratio *M* between  $I_{\text{DRV}}$  and  $I_{\text{HYB}}$  can be expressed as

$$M = \frac{I_{\rm DRV}}{I_{\rm HYB}} = \frac{R_{\rm HYB}}{R_{\rm TERM} \parallel Z_0} + 1 \cong \frac{2R_{\rm HYB}}{R_{\rm TERM}} + 1$$
(2.3)

assuming  $R_{\text{TERM}} \cong Z_0$ . To minimize mismatches, a ratio-metric design of resistors and current sources can be accomplished between the LINE driver and REP hybrid path.
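The step from (2.2) to (2.3) can be written out as follows. This is a short sketch that only substitutes $I_{\rm DRV} = M I_{\rm HYB}$ into (2.2) and, in the last line, applies the assumption $R_{\rm TERM} \cong Z_0$ stated above.

$$\begin{aligned}
(M - 1)\, I_{\rm HYB} \left(R_{\rm TERM} \parallel Z_0\right) &= I_{\rm HYB} R_{\rm HYB} \\
M &= \frac{R_{\rm HYB}}{R_{\rm TERM} \parallel Z_0} + 1 \\
R_{\rm TERM} \parallel Z_0 \cong \frac{R_{\rm TERM}}{2} \;&\Rightarrow\; M \cong \frac{2R_{\rm HYB}}{R_{\rm TERM}} + 1
\end{aligned}$$

As an illustration, an $R_{\rm HYB}$ of 400 $\Omega$ (the value selected in Section 3.2) with an assumed 50-$\Omega$ termination and channel gives $M = 17$.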

However, the WLR hybrid has some drawbacks and considerations. Firstly, the current of the driver is not fully utilized for generating the driver output. The effective driver current decreases due to the hybrid current in the opposite direction, resulting in power overhead. Additionally, as the inverted input is required in the REP path, it is more suitable for differential implementation than single-ended. In the case of single-ended implementation, a timing mismatch between the LINE and REP paths may occur due to the inversion, but this can be easily resolved by connecting plus and minus nodes in reverse for differential implementation. As the REP path is utilized, the mismatch caused by the replica must also be considered.

Applying the VM driver instead of the CM driver is feasible for reducing driver power consumption. However, there are additional considerations beyond the echoes induced by changes in output impedance. The most crucial point is that the current of the VM driver must remain constant even during data transitions, as continuous subtraction is performed using the current relationship. If the current is not maintained constant, the hybrid equation does not hold, resulting in residual errors and SNR degradation. This residual error is particularly critical when the frequencies in both directions are different. In addition, the ratio-metric design with the REP path, which uses current for subtraction, becomes impossible.

## Chapter 3

# Design of Asymmetric SBD Transceiver with Two-Step Hybrid

## **3.1 Overview**

This thesis deals with asymmetric SBD communication focusing on automotive interface applications. The need for high-resolution CMOS image sensors (CISs) and high-bandwidth automotive interfaces has grown rapidly with the increased popularity of object detection for advanced driver-assistance systems. Recently enacted automotive Ethernet standards of 2.5 Gb/s, 5 Gb/s, and 10 Gb/s also support the demand for the high-speed interface [63]. However, the channel bandwidth of automotive links is limited to about 2 to 3 GHz due to the multiple cables with in-line connectors and medium-dependent interface (MDI) connectors. To support the next-generation interface with a data rate above 10 Gb/s, multi-level signaling, such as four-level pulse-amplitude modulation (PAM-4), is necessary to provide a cost-effective solution with legacy channel components. In addition, to reduce the weight of the link network and achieve lower fuel consumption, control data and DC power must be delivered to the CIS via the same cable, requiring SBD transceivers that are asymmetric in data rate and modulation method. The overall architecture of the automotive CIS link system involves a CIS-side serializer chip (SER) and an electronic control unit-side deserializer chip (DES), with the forward channel (FC) delivering image data from the CIS in PAM-4 and the back channel (BC) carrying control data in PAM-2. Fig. 3.1 depicts the overall architecture of the automotive camera link system.

Fig. 3.1 Overall architecture of automotive camera link system.

Since SBD transceivers have commonly been configured with equal data rates and signal amplitudes in both directions to double the throughput per pin, an appropriate structure for an asymmetric SBD transceiver needs to be explored. Considering the asymmetric SBD characteristics in automotive camera links (Fig. 3.2), several FC sampling points fall within one symbol of the low-speed BC data. Due to the difference in data rate, glitches or transition errors from discontinuous hybrid operation become sampling noise for the high-speed receiver. Therefore, hybrids operating in discrete time, such as switched-capacitor-based hybrids, are unsuitable for asymmetric SBD.

Fig. 3.2 Characteristic of hybrid operation in the asymmetric SBD transceivers.

Although the R-gm hybrid cancels the outbound signal continuously, the TX swing is reduced by the ratio of the sensing resistor to the characteristic impedance of the channel. The sensing resistors are usually designed to be half of the characteristic impedance, reducing the TX output swing by half; thus, it is inappropriate for automotive links with high loss. The subtraction stage consisting of a $g_m$ cell and a load resistor is also a limiting factor. The combined FC and BC signal becomes the input of the voltage-dependent $g_m$ cell, making it hard to operate linearly for a wide input range. This linearity issue is particularly critical for PAM-4 signaling. One asymmetric bidirectional transceiver implemented a common-mode signaling approach to the BC using asymmetric features to avoid the hybrid constraints [49]. Nonetheless, this method necessitates doubled power for the BC swing and a high common-mode rejection ratio for the FC receiver, consequently rendering the design of the linear receiver front-end more challenging in PAM-4.

This chapter presents a novel SBD transceiver suitable for a highly asymmetric automotive camera interface. A wide linear range (WLR) hybrid is employed, and several other circuit-level design techniques are proposed to effectively extract the incoming signal with a different data rate and modulation method [50]. In particular, a two-step hybrid strategy suitable for a PAM-4 TX feed-forward equalizer (FFE) is proposed in the SER, which significantly reduces design complexity. By presenting adequate hybrid design methodologies for the SER and DES, the prototype chips exhibit a figure of merit (FoM) of 0.41 pJ/b/dB for the 12-Gb/s PAM-4 FC and 125-Mb/s PAM-2 BC.

## 3.2 Analysis on Wide Linear Range (WLR) Hybrid

A WLR hybrid is employed in this work since it satisfies both continuous-time and wide-range linear operations. From a large-signal perspective, the WLR hybrid can remove the outbound signal by satisfying the DC relation (2.2) between the current and resistance of the driver and hybrid. However, a small-signal analysis is required for smooth removal in asymmetric situations. The analysis also helps determine its bandwidth and evaluate the feasibility of the hybrid when an interconnection model is considered. Fig. 3.3(a) illustrates the WLR hybrid including a bondwire interconnect model, and Fig. 3.3(b) shows the corresponding small-signal model of the hybrid network. Parameters $g_{m,DRV}$, $r_{o,DRV}$, $g_{m,HYB}$, and $r_{o,HYB}$ represent the $g_m$ and output impedance ($r_o$) of the driver and the hybrid, satisfying $g_{m,DRV}/g_{m,HYB} = r_{o,HYB}/r_{o,DRV} = M$ (scale factor). Applying Kirchhoff's current law to nodes $V_x$ and $V_{out}$ yields

$$\frac{V_{\rm x}}{R_{\rm TERM} \parallel Z_{\rm IN}} + g_{m, \rm DRV} V_{\rm in} + \frac{V_{\rm x}}{r_{o, \rm DRV}} + \frac{V_{\rm x} - V_{\rm out}}{R_{\rm HYB}} = 0$$
(3.1)

$$\frac{V_{\text{out}} - V_x}{R_{\text{HYB}}} - g_{m,\text{HYB}}V_{\text{in}} + \frac{V_{\text{out}}}{r_{o,\text{HYB}}} + sC_LV_{\text{out}} = 0,$$
(3.2)

where the interconnect impedance $Z_{IN}$ appears as [11]

$$Z_{\rm IN} = \frac{Z_0 + sL_{\rm B} + s^2 L_{\rm B} C_{\rm B} Z_0}{1 + s(C_{\rm B} + C_{\rm PAD}) Z_0 + s^2 L_{\rm B} C_{\rm PAD} + s^3 L_{\rm B} C_{\rm PAD} C_{\rm B} Z_0}.$$
(3.3)

Fig. 3.3 (a) WLR hybrid including an interconnect model and (b) small-signal model of the hybrid network.

If the effects of bonding inductance and capacitance can be considered negligible, $Z_{\rm IN}$ reduces to $Z_0 \parallel (1/sC_{\rm PAD})$. Using (2.3) to write $1/(R_{\rm TERM} \parallel Z_0) = (M-1)/R_{\rm HYB}$, (3.1) can then be expressed as

$$\left(\frac{M}{R_{\rm HYB}} + sC_{\rm PAD} + \frac{1}{r_{o,\rm DRV}}\right) V_{\rm x} + g_{m,\rm DRV}V_{\rm in} - \frac{V_{\rm out}}{R_{\rm HYB}} = 0.$$
(3.4)

By combining (3.2) and (3.4), the transfer function of the hybrid can be represented as the following two-pole one-zero system:

$$\begin{aligned}
\frac{V_{\text{out}}}{V_{\text{in}}}(s) &= \frac{g_{m,\text{HYB}}\left(sC_{\text{PAD}} + \frac{M}{r_{o,\text{HYB}}}\right)}{s^{2}C_{\text{L}}C_{\text{PAD}} + s\left(MC_{\text{L}} + C_{\text{PAD}}\right)\left(\frac{1}{R_{\text{HYB}}} + \frac{1}{r_{o,\text{HYB}}}\right) + \left(\frac{M-1}{R_{\text{HYB}}^{2}} + \frac{2M}{r_{o,\text{HYB}}R_{\text{HYB}}} + \frac{M}{r_{o,\text{HYB}}^{2}}\right)} \\
&\approx \frac{g_{m,\text{HYB}}\left(sC_{\text{PAD}} + \frac{M}{r_{o,\text{HYB}}}\right)}{\left(sC_{\text{L}} + \frac{1}{R_{\text{HYB}}} + \frac{1}{r_{o,\text{HYB}}}\right)\left(sC_{\text{PAD}} + \frac{M}{R_{\text{HYB}}} + \frac{M}{r_{o,\text{HYB}}}\right)}.
\end{aligned}$$
(3.5)

Assuming $M - 1 \cong M$ (that is, $M \gg 1$), the two poles can be simplified as $w_{p1} = (1/C_L)(1/R_{HYB} + 1/r_{o,HYB})$ and $w_{p2} = (1/C_{PAD})(M/R_{HYB} + M/r_{o,HYB})$. The locations of the poles are primarily determined by the hybrid resistance ($R_{HYB}$), as observed from the derivation. Which of the two poles is dominant depends on the load capacitance ($C_L$) and the pad capacitance ($C_{PAD}$). In this design, the dominant pole is determined by the load capacitance.
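The quality of the pole approximation can be checked numerically. The sketch below solves the exact denominator of (3.5) and compares its roots with $w_{p1}$ and $w_{p2}$. Only $R_{HYB}$ = 400 $\Omega$ and $C_L$ = 50 fF come from the text; the termination, channel impedance, output resistance `R_O_HYB`, and pad capacitance `C_PAD` are assumed values chosen for illustration.

```python
import math

# Values quoted in the text
R_HYB = 400.0       # hybrid resistor (ohm)
C_L = 50e-15        # load capacitance (F)

# Assumed values, chosen only for illustration
R_TERM = 50.0       # termination resistor (ohm)
Z0 = 50.0           # channel characteristic impedance (ohm)
R_O_HYB = 20e3      # output resistance of the hybrid current source (ohm)
C_PAD = 300e-15     # pad capacitance (F)

# Scale factor from (2.3)
M = R_HYB / (R_TERM * Z0 / (R_TERM + Z0)) + 1

# Exact denominator of (3.5): a*s^2 + b*s + c
g = 1 / R_HYB + 1 / R_O_HYB
a = C_L * C_PAD
b = (M * C_L + C_PAD) * g
c = (M - 1) / R_HYB**2 + 2 * M / (R_O_HYB * R_HYB) + M / R_O_HYB**2

# Both poles are real and negative here; keep their magnitudes in rad/s
root = math.sqrt(b * b - 4 * a * c)
exact_poles = sorted([(b - root) / (2 * a), (b + root) / (2 * a)])

# Approximate poles obtained with M - 1 ~ M
w_p1 = g / C_L
w_p2 = M * g / C_PAD
approx_poles = sorted([w_p1, w_p2])

for exact, approx in zip(exact_poles, approx_poles):
    print(f"exact {exact / (2 * math.pi) / 1e9:.2f} GHz, "
          f"approx {approx / (2 * math.pi) / 1e9:.2f} GHz")
```

With these assumed values the lower pole is the one set by $C_L$, in line with the statement that the load capacitance determines the dominant pole in this design.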

Meanwhile, the scale factor M is determined by the trade-off between power overhead and frequency-dependent loss of the WLR hybrid. The hybrid current flows through $R_{\rm HYB}$ and $R_{\rm TERM}$, reducing the driver output swing of the outbound signal. Therefore, selecting a higher $R_{\rm HYB}$ value results in lower power consumption. However, a large value of $R_{\rm HYB}$ can cause high frequency-dependent loss since the bandwidth of the hybrid is determined by $R_{\rm HYB}$. Fig. 3.4 demonstrates the simulated power overhead and frequency-dependent loss of the WLR hybrid, assuming a Nyquist frequency of 3 GHz and a load capacitance of 50 fF. From the simulation with a simple *RC* model, a value of 400 $\Omega$ is selected for $R_{\rm HYB}$, with a power overhead of 6.25% and a frequency-dependent loss of less than 0.6 dB at the Nyquist frequency.

Fig. 3.4 Power overhead and frequency characteristic of WLR hybrid to $R_{HYB}$.
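The quoted numbers can be reproduced with a few lines of arithmetic. The sketch below is not the simulation behind Fig. 3.4. It assumes a 50-$\Omega$ termination and channel, defines the power overhead as the hybrid current relative to the effective driver current, $I_{HYB}/(I_{DRV} - I_{HYB}) = 1/(M-1)$, and models the loss with a single pole formed by $R_{HYB}$ and the load capacitance. All three choices are assumptions that happen to match the stated 6.25% and the loss below 0.6 dB.

```python
import math

# Values quoted in the text
C_L = 50e-15        # load capacitance (F)
F_NYQ = 3e9         # Nyquist frequency (Hz)

# Assumed values
R_TERM = 50.0       # termination resistor (ohm)
Z0 = 50.0           # channel characteristic impedance (ohm)


def scale_factor(r_hyb):
    """Scale factor M from (2.3)."""
    return r_hyb / (R_TERM * Z0 / (R_TERM + Z0)) + 1


def power_overhead(r_hyb):
    """Assumed definition: I_HYB / (I_DRV - I_HYB) = 1 / (M - 1)."""
    return 1 / (scale_factor(r_hyb) - 1)


def loss_db(r_hyb, freq):
    """Attenuation of a single-pole RC low-pass (R_HYB, C_L) at freq, in dB."""
    x = 2 * math.pi * freq * r_hyb * C_L
    return 10 * math.log10(1 + x * x)


for r_hyb in (100.0, 200.0, 400.0, 800.0):
    print(f"R_HYB = {r_hyb:5.0f} ohm: overhead = {100 * power_overhead(r_hyb):5.2f} %, "
          f"loss at Nyquist = {loss_db(r_hyb, F_NYQ):.2f} dB")
```

Sweeping `r_hyb` shows the trade-off described above: the overhead falls and the loss rises as $R_{HYB}$ increases.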

The level separation mismatch ratio (RLM) after the WLR hybrid circuit is shown in Fig. 3.5, compared with the conventional replica hybrid and R-gm hybrid with the same load capacitance. For small input swings, such as 400 mV<sub>pp</sub>, the hybrids show uniform PAM-4 eyes. However, as the input swing increases, the RLM degrades significantly in the conventional and R-gm structures, whereas the WLR hybrid maintains a value close to 1.0. Specifically, for the 850-mV<sub>pp</sub> swing, which is the target voltage in this work, the conventional hybrid shows an RLM of only 0.86, and the R-gm hybrid performs even worse. Hence, for PAM-4 systems that require large input swings to secure sufficient voltage margin, the WLR hybrid can maintain the RLM and is advantageous for increasing the swing compared to other structures.
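For readers less familiar with the metric, the sketch below computes RLM from four PAM-4 levels using the definition found in the IEEE 802.3 PAM-4 clauses. The thesis does not state which definition it uses, so this choice is an assumption, and the example levels are hypothetical.

```python
def rlm(v0, v1, v2, v3):
    """Level separation mismatch ratio for PAM-4 levels v0 < v1 < v2 < v3."""
    v_mid = (v0 + v3) / 2
    es1 = (v1 - v_mid) / (v0 - v_mid)   # normalized position of the lower inner level
    es2 = (v2 - v_mid) / (v3 - v_mid)   # normalized position of the upper inner level
    return min(3 * es1, 3 * es2, 2 - 3 * es1, 2 - 3 * es2)


# Equally spaced levels
ideal = rlm(-3.0, -1.0, 1.0, 3.0)

# Hypothetical compressed outer eyes, as a saturating gm cell might produce
compressed = rlm(-3.0, -1.15, 1.15, 3.0)

print(ideal, compressed)
```

Equally spaced levels give an RLM of 1, while compression of the outer eyes pulls the value below 1.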



Fig. 3.5 RLM comparison of hybrid structures to the input voltage swing.

### **3.3 Proposed Two-Step Hybrid Strategy**

In addition to the hybrid operation, there are considerations regarding the overall architecture since the 3-tap TX FFE is employed to compensate for lossy channel characteristics. A PAM-4 driver that contains an FFE requires many unit slices for MSB, LSB, and FFE tap allocation. A hybrid typically utilizes a replica driver to subtract outbound signals, but emulating all the slices significantly increases design complexity (Fig. 3.6). Moreover, the replica is often scaled for power/area saving, making the unit slice smaller and more vulnerable to mismatch. As the complexity increases, the timing difference also increases due to physical differences between drivers and hybrid slices, which makes it difficult to expect accurate hybrid behavior.

Fig. 3.6 Conceptual diagram of SBD structure that emulates PAM-4 drivers including TX FFE using replica.

Instead, a simplified hybrid that removes only the four primary DC levels of the PAM-4 signal is proposed, combined with the use of a low-pass filter (LPF). To be specific, a hybrid is designed to remove the major components among the FFE taps, with the corresponding value of $\Sigma \alpha$ (= $\alpha_{-1} + \alpha_0 + \alpha_1$), where $|\alpha_{-1}| + |\alpha_0| + |\alpha_1| = 1$ ($\alpha_{-1} < 0$, $\alpha_0 > 0$, $\alpha_1 < 0$). Then, an LPF filters out the residual high-frequency components from the hybrid output and the reflections due to channel imperfection. Fig. 3.7(a) shows the conceptual diagram of the proposed two-step hybrid structure in the SER, and Fig. 3.7(b) illustrates the operational principle of the hybrid structure with a single-bit model. The hybrid coefficient $\Sigma \alpha$ suppresses low-frequency components of the FC signal, making it easier to separate the overlapped signals with different speeds. This can be observed from the fast Fourier transform (FFT) simulation results of the FC signal with different hybrid coefficients in Fig. 3.8. The coefficient $\Sigma \alpha$ is optimal with the LPF and consumes less power than using $\alpha_0$, which removes all major components.

Fig. 3.7 (a) Conceptual diagram of proposed two-step hybrid structure in SER and (b) operational principle of the hybrid structure with a single-bit model.

The mathematical analysis is as follows. The FC signal spectrum after passing the FFE can be expressed as

$$S_o(s) = (\alpha_{-1}e^{-sT_f} + \alpha_0 + \alpha_1 e^{sT_f})S_i(s)$$
(3.6)



Fig. 3.8 FFT simulation results of FC signal with different hybrid coefficients.

where $S_i(s)$ and $T_f$ refer to the FC signal spectrum before hybrid and the unit interval of the FC signal, respectively. Then, the spectrum $S_f(s)$ after the hybrid, which subtracts the main tap with a coefficient of *x*, appears as

$$S_{f}(s) = (\alpha_{-1}e^{-sT_{f}} + (\alpha_{0} - x) + \alpha_{1}e^{sT_{f}})S_{i}(s).$$
(3.7)

Assuming that the LPF sufficiently suppresses high-frequency components above the cut-off frequency ($w_c$), the optimum *x* is the value that minimizes the coefficient term of $S_f(jw)$ in the low-frequency band. In the frequency range below $w_c$ satisfying $wT_f < 1/10$, the coefficient term can be expressed as

$$\begin{aligned}
&\alpha_{-1}e^{-iwT_{f}} + (\alpha_{0} - x) + \alpha_{1}e^{iwT_{f}} \\
&\quad= \alpha_{-1}(\cos(-wT_{f}) + i\sin(-wT_{f})) + (\alpha_{0} - x) + \alpha_{1}(\cos(wT_{f}) + i\sin(wT_{f})) \\
&\quad\cong \alpha_{-1} + (\alpha_{0} - x) + \alpha_{1}.
\end{aligned}$$
(3.8)

From the derived equation, it is noted that a hybrid coefficient of $\Sigma \alpha$ (= $\alpha_{-1} + \alpha_0 + \alpha_1$) makes the outbound signal approximately zero in the frequency range of interest.
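A quick numerical check of this result is sketched below. It evaluates the magnitude of the coefficient term for the two candidate hybrid coefficients, using the FFE taps (-0.1, 0.8, -0.1) reported in Section 3.5 and the 6-GBaud symbol rate implied by 12-Gb/s PAM-4. The evaluation frequency of 62.5 MHz is an assumption chosen to lie inside the BC band and to satisfy $wT_f < 1/10$.

```python
import cmath
import math

ALPHA = (-0.1, 0.8, -0.1)   # (pre, main, post) FFE taps reported in Section 3.5
T_F = 1 / 6e9               # FC unit interval: 12 Gb/s PAM-4 -> 6 GBaud


def residual(x, freq):
    """Magnitude of the coefficient term after subtracting x from the main tap."""
    theta = 2 * math.pi * freq * T_F
    a_pre, a_main, a_post = ALPHA
    term = a_pre * cmath.exp(-1j * theta) + (a_main - x) + a_post * cmath.exp(1j * theta)
    return abs(term)


freq = 62.5e6                          # assumed evaluation frequency (Hz)
r_sum = residual(sum(ALPHA), freq)     # hybrid coefficient = sum of the taps
r_main = residual(ALPHA[1], freq)      # hybrid coefficient = main tap only

print(f"residual with sum of taps: {r_sum:.2e}")
print(f"residual with main tap:    {r_main:.2e}")
```

In this idealized model the residual with $\Sigma \alpha$ is far smaller than with $\alpha_0$. A fabricated circuit with mismatch and finite bandwidth will show a smaller difference.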

As a result, the burden on the hybrid emulating multiple driver slices with the FFE can be reduced. In addition, this method is relatively robust to hybrid timing mismatch because it filters the outbound signal through LPF after the hybrid. As shown in Fig. 3.9, the received eye-opening of the BC signal remains almost unchanged, even for a timing difference of up to 60 ps (0.36 UI) between the FC driver and hybrid.



Fig. 3.9 Simulated received eye-opening of BC signal to the timing difference between FC driver and hybrid.

### **3.4 SBD Transceiver Implementation**

The WLR hybrid is employed in both SER and DES chips to suppress outbound signals. Nonetheless, distinctive design strategies are applied for each chip due to the difference in data speed and signaling methods, and their respective descriptions contain details of such strategies. Employing a source synchronous forwarded clocking structure, a phase interpolator (PI)-based baud-rate clock and data recovery (CDR) and a 2x oversampling CDR are utilized for FC and BC, respectively.

In SBD transceivers, determining the magnitude of the signal in both directions for a given supply voltage is important, as signals in both directions overlap each other. Since it is first required to maintain the linearity of a PAM-4 driver even for the combined output swing level, the output swing limit is determined through the RLM simulation of the FC driver as shown in Fig. 3.10. The output swing is limited to a total of 850 mV, and each swing is determined by the received eye opening trade-off according to the swing ratio between the FC and BC as illustrated in Fig. 3.11. Increasing the FC swing enhances the SNR of the PAM-4 signal through the channel, but the improvement gradually diminishes due to the linearity limit of the receiver front-end. Also considering that the BC signal, which is separated by filtering, becomes more difficult to recover as the FC swing increases, the signal swing ratio is set to about 5.5, which ensures an eye opening of 100 mV for the BC.

Fig. 3.10 Simulated FC driver RLM vs. output voltage swing
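The individual swings implied by the 850-mV budget and the 5.5:1 ratio follow from simple arithmetic, sketched below. The function name `split_swing` is illustrative only.

```python
TOTAL_SWING = 0.850   # combined FC + BC output swing (V), from the text
SWING_RATIO = 5.5     # FC swing divided by BC swing, from the text


def split_swing(total, ratio):
    """Split a total swing into (fc, bc) such that fc / bc == ratio."""
    bc = total / (1 + ratio)
    fc = total - bc
    return fc, bc


fc_swing, bc_swing = split_swing(TOTAL_SWING, SWING_RATIO)
print(f"FC swing = {1e3 * fc_swing:.0f} mV, BC swing = {1e3 * bc_swing:.0f} mV")
```

The resulting values, about 0.72 V for the FC and 0.13 V for the BC, agree with the transmitter swings reported in Section 3.5.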



Fig. 3.11 Simulated eye openings of received FC and BC vs. signal swing ratio of FC and BC.

#### 3.4.1 Serializer Chip (SER)

Fig. 3.12 depicts the block diagram of the SER, which is composed of the PAM-4 FC TX and PAM-2 BC RX. MSB and LSB data are individually 6:1 serialized for the FC signal and transferred to a PAM-4 driver. Fig. 3.13 shows the configuration of the 6:1 serializer, which comprises a tap generation block for the 3-tap FFE and a single-to-differential (S2D) block. To prevent jitter from propagating to the driver stage, the S2D is placed before the final 2:1 serializer. For better impedance matching and lower power consumption, a push-pull current-mode logic (CML) type is employed for the driver [64], [65]. In particular, impedance matching is vital in SBD since the reflected signal with the amplitude ratio of $(Z_{DRV} - Z_0)/(Z_{DRV} + Z_0)$, where $Z_{\text{DRV}}$ is the output impedance of the driver, becomes deterministic noise to the inbound signal at the receiver. In this respect, a CML driver is advantageous in SBD since it shows lower impedance variation at the data transitions compared to a voltage-mode driver.

Fig. 3.12 SER block diagram with two-step hybrid architecture.

Fig. 3.13 Configuration of 6:1 serializer.

Fig. 3.14 Schematic of PAM-4 FC driver (including pre-driver) and hybrid.

A pre-driver preceding the main driver improves the linearity of the PAM-4 signal by keeping the input transistors of the main driver in saturation. It utilizes Miller compensation with feed-forward capacitors to enhance the bandwidth and power efficiency [66]. The pre-driver and driver consist of 30 unit slices/LSB to realize a 3-tap ($\alpha_{-1}$, $\alpha_0$, $\alpha_1$) FFE. The schematic of the PAM-4 FC driver, including the pre-driver, is presented in Fig. 3.14, which also shows the schematic of the WLR hybrid utilized in the design. Only the main tap of the FFE taps (pre, main, post) is received as the hybrid input, with the coefficient corresponding to $\Sigma \alpha$, since the two-step hybrid approach is utilized. A distinctive feature is that the driver and hybrid are of different types, whereas the same type is usually used for both to avoid transition errors. The unmatched design is possible thanks to the LPF, which sufficiently separates the high-frequency FC transition errors from the relatively low-frequency BC signal.



Fig. 3.15 Structure of 2<sup>nd</sup> order gm-C LPF.

The gm-C LPF is implemented with a fully differential structure [67], which is advantageous in noise immunity and halves the required capacitances. Fig. 3.15 illustrates the structure of the  $2^{nd}$  order gm-C LPF, where each  $G_m$  cell is constructed with an operational transconductance amplifier (OTA). The transfer function of the gm-C LPF can be expressed as

$$H(s) = \frac{G_{m0}G_{m1}}{C_1 C_2 s^2 + C_1 G_{m2} s + G_{m1} G_{m3}}.$$
(3.9)

The cut-off frequency of the LPF is set near the frequency at which the FC and BC signal spectra after the hybrid have the same magnitude (Fig. 3.16). A lower cut-off frequency allows for greater suppression of the outbound signal; however, it also leads to an increased attenuation ratio of the inbound signal to the outbound signal. The order of the LPF is determined by how much the outbound signal should be suppressed compared to the inbound signal. Fig. 3.17 shows the simulated eye diagram of the received BC signal with the first- and second-order LPF. While a higher-order LPF can remove more outbound components, it also suppresses high-frequency components of the inbound signal, resulting in little improvement in overall SNR beyond a certain LPF order. This occurs because the signal spectra overlap near the corner frequency. Considering this trade-off, a second-order LPF is employed in this work, and Fig. 3.18 shows its frequency response. Tunable transconductance and capacitance of the LPF adjust not only the gain and bandwidth but also the Q factor, which can compensate for the channel loss by frequency peaking at the cost of slight noise boosting. In the prototype, considering both vertical and horizontal margins, a Q factor of about 1.0 is chosen based on the analysis in Fig. 3.19.

Fig. 3.16 Signal spectra of FC and BC after hybrid in SER.

Fig. 3.17 Eye diagrams of received BC signal with 1<sup>st</sup> and 2<sup>nd</sup> order LPF.
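The dependence of gain, bandwidth, and Q factor on the tunable elements can be made explicit by rewriting (3.9) in the standard second-order form. The expressions below follow directly from (3.9); the symbols $\omega_0$, $Q$, and $H(0)$ are introduced here for illustration and do not appear in the original derivation.

$$H(s) = \frac{H(0)\,\omega_0^2}{s^2 + \frac{\omega_0}{Q}s + \omega_0^2}, \qquad H(0) = \frac{G_{m0}}{G_{m3}}, \qquad \omega_0 = \sqrt{\frac{G_{m1}G_{m3}}{C_1 C_2}}, \qquad Q = \frac{1}{G_{m2}}\sqrt{\frac{G_{m1}G_{m3}C_2}{C_1}}$$

$$\left|\frac{H(j\omega)}{H(0)}\right|_{\max} = \frac{Q}{\sqrt{1 - \frac{1}{4Q^2}}} \quad \left(Q > \tfrac{1}{\sqrt{2}}\right)$$

Under these expressions, $G_{m2}$ sets $Q$ without moving $\omega_0$, and a $Q$ of 1.0 corresponds to a peaking of about 1.25 dB above the DC gain.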



Fig. 3.18 Frequency response of gm-C LPF.



Fig. 3.19 Vertical/horizontal margins of received BC signal to Q factor.

#### **3.4.2 Deserializer Chip (DES)**

Fig. 3.20 illustrates the block diagram of the DES, which consists of the PAM-4 FC RX and the PAM-2 BC TX. The BC TX is configured to send Manchester-encoded data for DC balancing, and the encoded data are delivered directly to the pre-driver and the driver without serialization. Here, the use of Manchester coding makes CDR easier since a transition is guaranteed in every symbol. At the same time, the high-frequency energy of the BC signal increases from the encoding for the same reason, which may interfere with frequency separation in the BC RX. However, compared to the unencoded NRZ signal, it is not detrimental to filtering because the energy becomes more concentrated in a narrow band.
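The two properties relied on here, a guaranteed transition in every symbol and DC balance, can be seen in a minimal encoder sketch. The polarity convention (1 maps to high-then-low) is an assumption, since the text does not specify which of the two common conventions is used.

```python
def manchester_encode(bits):
    """Map each bit to two half-symbol chips: 1 -> (1, 0), 0 -> (0, 1)."""
    chips = []
    for bit in bits:
        chips.extend((1, 0) if bit else (0, 1))
    return chips


data = [1, 1, 1, 0, 0, 0, 0, 1]           # long runs in the raw data
chips = manchester_encode(data)

# Every symbol contains a mid-symbol transition
has_transition = all(chips[i] != chips[i + 1] for i in range(0, len(chips), 2))

# Equal numbers of high and low chips, regardless of the data
is_balanced = sum(chips) * 2 == len(chips)

print(chips, has_transition, is_balanced)
```

The cost is that two chips are sent per data bit, which is the source of the additional high-frequency energy mentioned above.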



Fig. 3.20 DES block diagram with WLR hybrid and echo canceller.

In the DES, it is not appropriate to use a filter as in SER because the FC signal suffers from high-frequency attenuation due to the high-loss cable. In addition, the loss of energy when filtering the overlapped signal is also a problem for recovering the high-speed PAM-4 signal. Therefore, it is suitable for the DES to remove the outbound signal using a hybrid circuit, which requires more precise operation because error in the hybrid output can also be fatal to FC RX during BC transition due to asymmetric data rates. Consequently, the WLR hybrid is utilized for accurate operation, and a pre-driver is added to minimize the error at the transition by limiting the input voltage of the BC CML driver. Fig. 3.21 illustrates the schematic of the BC driver and the WLR hybrid network. Since the BC is slow in speed, a slew is intentionally added to reduce the harmonics of the output BC driver, which also reduces the hybrid error slightly.



Fig. 3.21 Schematic of BC driver and WLR hybrid network.

On the other hand, since the hybrid output is still a PAM-4 signal to be recovered, the swing of the received signal is adjusted using a passive attenuator to alleviate the nonlinearity. The passive attenuator can be easily implemented with the WLR hybrid as shown in Fig. 3.21 with the gain of $R_{\text{ATT}}/(R_{\text{HYB}}+R_{\text{ATT}})$, and the hybrid output is then forwarded to the analog front-end (AFE). The effectiveness of the combined WLR hybrid and the passive attenuator is illustrated in Fig. 3.22(a)-(c), which shows that linear hybrid operation is conducted for the overlapped 850-mV swing, and swing control is performed without degradation of the RLM using the passive attenuator. For comparison, Fig. 3.22(d) shows the degraded RLM obtained when an active PGA is used instead of the passive attenuator.

Fig. 3.22 (a) Overlapped hybrid input under SBD operation. Outputs of WLR hybrid (b) without attenuator, (c) with passive attenuator, and (d) with active PGA instead of passive attenuator.

Fig. 3.23 shows the FC RX data path configuration after the hybrid, which is composed of a programmable gain amplifier (PGA), a continuous-time linear equalizer (CTLE), a half-rate decision feedback equalizer (DFE), and an echo canceller (EC). Cooperating with the preceding passive attenuator, the PGA adjusts the overall gain of the signal to cover various cable lengths. The frequency response of the PGA, including the passive attenuator, is shown in Fig. 3.24 and has an adjustable range of over 15 dB. Then, the CTLE with an *RC*-degenerated pair boosts up to 7 dB at the Nyquist frequency, compensating for the channel loss, and the 3-tap DFE is employed to compensate for the post-cursor intersymbol interferences (ISIs) not removed by the FFE and the CTLE. The DFE is designed in a CML structure with a degenerated resistor for linearity and a common-mode compensation circuit to maintain performance with various tap coefficients. The samplers are configured with a strong-arm latch structure consisting of 3 data samplers and 2 error samplers. The sampled data after the RS latches become the feedback inputs of the DFE taps. A fixed dtLev scheme is applied for receiver adaptation [68], which first sets the target received swing and conducts offset cancellation for analog circuits within the given voltage range. After that, the entire equalizer adaptation is performed by adjusting the CTLE DC gain to make $h_4$ zero and removing the remaining $h_1 \sim h_3$ by the DFE [69], utilizing the sign-sign least-mean-square (SSLMS) algorithm. Fig. 3.25 shows the frequency response of the CTLE with DC gain control.

Fig. 3.23 Configuration of FC RX data path.

Fig. 3.24 Frequency response of PGA including passive attenuator
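To illustrate how a sign-sign LMS loop drives DFE taps toward the post-cursor values, a minimal behavioral sketch follows. It is not the adaptation logic of the prototype: it uses PAM-2 symbols instead of PAM-4, omits the CTLE loop, keeps the data level fixed at the main cursor, and runs on a hypothetical noiseless channel `h`. The function name `adapt_dfe` and all parameter values are assumptions.

```python
import random


def sign(value):
    """Two-level sign; zero is treated as positive."""
    return 1.0 if value >= 0 else -1.0


def adapt_dfe(h, n_symbols=20000, mu=1e-3, d_lev=1.0, seed=1):
    """Adapt DFE taps with sign-sign LMS.

    h[0] is the main cursor and h[1:] are the post-cursors of a
    hypothetical channel. Returns the adapted tap weights.
    """
    rng = random.Random(seed)
    n_taps = len(h) - 1
    taps = [0.0] * n_taps
    tx_hist = [1.0] * n_taps     # previously transmitted symbols
    rx_hist = [1.0] * n_taps     # previous decisions fed back by the DFE
    for _ in range(n_symbols):
        d = rng.choice((-1.0, 1.0))
        received = h[0] * d + sum(hk * dk for hk, dk in zip(h[1:], tx_hist))
        equalized = received - sum(wk * dk for wk, dk in zip(taps, rx_hist))
        decision = sign(equalized)
        error = equalized - d_lev * decision      # error against the fixed data level
        for k in range(n_taps):
            # Only the signs of the error and of the past decision are used
            taps[k] += mu * sign(error) * rx_hist[k]
        tx_hist = [d] + tx_hist[:-1]
        rx_hist = [decision] + rx_hist[:-1]
    return taps


channel = [1.0, 0.3, 0.15, 0.05]     # hypothetical cursor values h0..h3
weights = adapt_dfe(channel)
print([round(w, 3) for w in weights])
```

Because each update moves every tap by a fixed step `mu`, the weights settle into a small dither around the post-cursor values rather than converging exactly.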



Fig. 3.25 Frequency response of CTLE (DC gain control).

Additionally, five EC taps are added to the CTLE output to remove the near-end/far-end echoes from the MDI or in-line connectors [44]. The coarse/fine delay stages are constructed, and the pre-drivers are employed to match the slew of the BC output driver. Each delay stage of the taps covers the round-trip delay of up to a 10-m cable and has a fine delay resolution of 0.33 ns, which corresponds to the round-trip delay of 3.3 cm. Fig. 3.26(a) shows the response of the BC with echoes when a single bit is transmitted to the 5-m cable, and Fig. 3.26(b) shows echo waveforms with and without the echo canceller. When the EC is applied, about 20 mV of echoes can be removed.
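The delay figures quoted above can be cross-checked with a short calculation. The sketch below infers the propagation velocity implied by the stated pair (0.33 ns, 3.3 cm) and uses it to estimate the round-trip delay of a 10-m cable. The velocity is an inference from those two numbers, not a measured cable parameter, and the function name is illustrative.

```python
# Cross-check of the EC delay figures quoted in the text.
FINE_STEP_S = 0.33e-9   # fine delay resolution (from the text)
STEP_LENGTH_M = 0.033   # cable length whose round trip equals one fine step (from the text)

# A round trip covers the length twice, so v = 2 * length / delay.
# This velocity is inferred from the two values above (assumption).
velocity = 2 * STEP_LENGTH_M / FINE_STEP_S


def round_trip_delay(length_m, v=velocity):
    """Round-trip delay in seconds for a cable of the given length."""
    return 2 * length_m / v


delay_10m = round_trip_delay(10.0)
steps_10m = delay_10m / FINE_STEP_S

print(f"implied velocity : {velocity:.2e} m/s")
print(f"10-m round trip  : {delay_10m * 1e9:.0f} ns (about {steps_10m:.0f} fine steps)")
```

Under this inferred velocity, covering a 10-m cable requires a delay range on the order of a few hundred fine steps, which is consistent with splitting the delay line into coarse and fine stages.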



Fig. 3.26 (a) A single-bit response of BC with 5-m automotive cable and (b) echo waveforms with and without echo canceller.

#### **3.5 Measurement**

The SER and the DES are separately fabricated in 40-nm CMOS technology and tested together in a chip-on-board assembly, and Fig. 3.27 illustrates the measurement setup of the SBD transceiver. The clocks for both SER and DES are generated using the Agilent J-BERT N4903 signal analyzer with a source synchronous system. The intermediate nodes of FC and BC AFE data are read out through 50  $\Omega$  CML drivers and measured using the Keysight N9040B spectrum analyzer to demonstrate



Fig. 3.27 Measurement setup of SBD transceiver.

the effectiveness and characteristics of the proposed hybrid structures. Fig. 3.28 shows the measured channel characteristics of the 5-m STQ cable, which has an insertion loss of 15.9 dB at 3 GHz.



Fig. 3.28 Measured channel characteristics of 5-m STQ cable.

Fig. 3.29 shows the measured transmitter output waveform of the 12-Gb/s PAM-4 FC signal, which exhibits a differential 0.72-V swing with an RLM of 0.99. On the other hand, the 125-Mb/s Manchester-encoded BC signal has a differential swing of 0.13 V, whose measured transmitter output waveform is shown in Fig. 3.30. The measured spectra of the SER intermediate nodes are shown in Fig. 3.31, illustrating the characteristics of the hybrid and LPF. The hybrid with coefficient  $\Sigma \alpha$  reduces the outbound FC signal by about 20 dB compared to the coefficient  $\alpha_0$  at the Nyquist frequency of the BC signal. Furthermore, the second-order gm-C LPF reduces the high-frequency energy of the outbound FC spectrum that is not removed by the hybrid by more than 30 dB. The FFE coefficients used for the measurement are  $(\alpha_{-1}, \alpha_0, \alpha_1) = (-0.1, 0.8, -0.1)$ . The measured spectra of the DES intermediate nodes are presented in Fig. 3.32, depicting the characteristics of the hybrid and EC. The WLR hybrid and the analog EC reduce the outbound BC signal by about 12 dB each. While the hybrid reduces overall energy, the EC reduces only low-frequency components induced by the BC signal.


Fig. 3.29 Measured transmitter output waveform of PAM-4 FC signal.



Fig. 3.30 Measured transmitter output waveform of PAM-2 BC signal.



Fig. 3.31 Measured spectra of SER intermediate nodes.



Fig. 3.32 Measured spectra of DES intermediate nodes.

The bit error rate (BER) measurement is performed using the on-chip eye monitoring logic, which compares the sampled data output to the sampled reference data output by adjusting the clock phase of the reference data. Fig. 3.33 shows the measured bathtub curve of the PAM-4 FC data under SBD operation, which exhibits an eye margin of 0.15 UI for a symbol error rate of  $10^{-12}$ . For the PAM-2 BC data, the measured bathtub curves are compared with two hybrid coefficients,  $\Sigma \alpha$  and  $\alpha_0$ , as illustrated in Fig. 3.34. In the case of the proposed coefficient  $\Sigma \alpha$ , a 0.57-UI eye margin for BER<10<sup>-12</sup> is achieved, whereas, in the case of coefficient  $\alpha_0$ , the eye is completely closed for BER $<10^{-8}$ . Fig. 3.35 and Fig. 3.36 present the measured jitter tolerance (JTOL) curves for the 12-Gb/s PAM-4 FC and 125-Mb/s PAM-2 BC data for BER $<10^{-9}$ , respectively. The FC exhibits a JTOL bandwidth of about 1 MHz, and the BC JTOL exceeds the equipment limit over the entire frequency range. Fig. 3.37 shows chip photomicrographs of the SER and the DES with an active chip area of 0.24 mm<sup>2</sup>, and the power breakdown of the prototype chips is illustrated in Fig. 3.38. Total power consumption is 78.4 mW when the SBD is active. More than half of the power is used for receiving the FC signal, including PAM-4 equalizers, clocking, and the EC. The FoM, which accounts for both power efficiency and channel loss, is 0.41 pJ/b/dB for the 5-m automotive cable. Table 3.1 shows the performance summary and comparison with recent SBD transceivers. Only this design incorporates both PAM-4 and PAM-2 signaling with large eye margins and an FoM less than 1.0 pJ/b/dB.
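The reported efficiency figures can be reproduced with a few lines of arithmetic. The sketch below assumes that the SBD energy efficiency divides the total power by the aggregate FC+BC data rate and that the FoM divides that result by the channel loss at Nyquist; these definitions are inferred from the reported numbers, and the function names are illustrative.

```python
def energy_efficiency_pj_per_bit(power_mw, rate_gbps):
    """mW divided by Gb/s gives pJ/b."""
    return power_mw / rate_gbps


def fom_pj_per_bit_per_db(power_mw, rate_gbps, loss_db):
    """Energy efficiency normalized by channel loss (assumed FoM definition)."""
    return energy_efficiency_pj_per_bit(power_mw, rate_gbps) / loss_db


FC_RATE_GBPS = 12.0    # PAM-4 forward channel
BC_RATE_GBPS = 0.125   # PAM-2 back channel
LOSS_DB = 15.9         # 5-m cable insertion loss at 3 GHz

# SBD active: 78.4 mW over the aggregate rate (assumption: FC + BC).
sbd_eff = energy_efficiency_pj_per_bit(78.4, FC_RATE_GBPS + BC_RATE_GBPS)
sbd_fom = fom_pj_per_bit_per_db(78.4, FC_RATE_GBPS + BC_RATE_GBPS, LOSS_DB)

# FC only: 67.2 mW (Table 3.1) over the FC rate.
fc_eff = energy_efficiency_pj_per_bit(67.2, FC_RATE_GBPS)
fc_fom = fom_pj_per_bit_per_db(67.2, FC_RATE_GBPS, LOSS_DB)

print(f"FC+BC  : {sbd_eff:.1f} pJ/b, FoM {sbd_fom:.2f} pJ/b/dB")
print(f"FC only: {fc_eff:.1f} pJ/b, FoM {fc_fom:.2f} pJ/b/dB")
```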



Fig. 3.33 Measured bathtub curve of FC data under SBD operation.



Fig. 3.34 Measured bathtub curves of BC data under SBD operation with two hybrid coefficients.



Fig. 3.35 Measured JTOL curves of 12-Gb/s PAM-4 FC data.



Fig. 3.36 Measured JTOL curves of 125-Mb/s PAM-2 BC data.



Fig. 3.37 Chip photomicrographs of SER and DES.



Fig. 3.38 Power breakdown of SBD transceiver.

|                                            | JSSC '07 [28]             | TCAS-I '20 [46]          | JSSC '20 [44]          | ISSCC '21 [48]          | JSSC '21 [49]              | This work                                       |
|--------------------------------------------|---------------------------|--------------------------|------------------------|-------------------------|----------------------------|-------------------------------------------------|
| Technology                                 | 110-nm CMOS               | 65-nm CMOS               | 28-nm CMOS             | 14-nm FinFET            | 130-nm BiCMOS              | 40-nm CMOS                                      |
| SBD Type                                   | Symmetric                 | Symmetric                | Symmetric              | Symmetric               | Asymmetric                 | Asymmetric                                      |
| Signaling                                  | NRZ                       | NRZ                      | NRZ                    | PAM-4                   | PAM-2                      | PAM-4 & PAM-2                                   |
| Data Rate (FC / BC)                        | 10 Gb/s                   | 7.5 Gb/s                 | 16 Gb/s                | 56 Gb/s                 | 24 Gb/s / 312.5 Mb/s       | 12 Gb/s / 125 Mb/s                              |
| Driver Type                                | Current mode              | Voltage mode             | Voltage mode           | Voltage mode            | Voltage mode               | Current mode                                    |
| SBD Architecture (FC / BC)                 | R-gm hybrid               | Resistor-bridge hybrid   | R-gm hybrid & EC       | Passive hybrid & EC     | Differential / Common-mode | WLR hybrid, EC / Hybrid ( $\Sigma \alpha$ ) & LPF |
| $V_{DD}$                                   | 1.2 V                     | 1.2 V                    | 0.9 V                  | 1.2 V / 0.9 V           | 2.5 V                      | 0.9 V / 1.1 V / 1.2 V                           |
| Amplitude (FC+BC)                          | $0.80~\mathrm{V_{pp}}$    | $0.40~\mathrm{V_{pp}}$   | $0.40~\mathrm{V_{pp}}$ | -                       | $0.88~\mathrm{V_{pp}}$     | $0.85~\mathrm{V_{pp}}$                          |
| Channel Loss @ Nyquist                     | 5.0 dB (3-m twisted pair) | 2.5 dB (10-mm PCB trace) | 10.2 dB (6" PCB trace) | 35 dB (backplane trace) | 18.3 dB (20-m coaxial)     | 15.9 dB (5-m STQ cable)                         |
| Power Consumption (FC only / FC+BC)        | 266 mW                    | 10.1 mW*                 | 29.2 mW                | 994 mW*                 | - / 456 mW                 | 67.2 mW / 78.4 mW                               |
| Energy Efficiency (FC only / FC+BC)        | 26.6 pJ/b                 | 1.35 pJ/b*               | 1.83 pJ/b              | 17.75 pJ/b*             | - / 18.76 pJ/b             | 5.6 pJ/b / 6.5 pJ/b                             |
| FoM [pJ/b/dB] (FC only / FC+BC)            | 5.32                      | 0.54*                    | 0.18                   | 0.51*                   | - / 1.03                   | 0.35 / 0.41                                     |
| Eye Margin for BER<10<sup>-12</sup> (FC / BC) | 0.41 UI                | 0.03 UI                  | 0.06 UI                | -                       | 0.7 UI / 0.23 UI           | 0.15 UI / 0.57 UI                               |
| Area                                       | $1.02 \text{ mm}^2$       | $0.012 \text{ mm}^{2*}$  | $0.182 \text{ mm}^2$   | -                       | $0.23 \text{ mm}^{2*}$     | $0.24 \text{ mm}^2$                             |

\* Includes only analog front-ends. A single value in a split row applies to both entries.

Table 3.1 Performance summary and comparison with prior SBD transceivers

## Chapter 4

# Design of Symmetric SBD Transceiver with Hybrid Adaptation

## 4.1 Overview

Simultaneous bidirectional (SBD) communication with symmetric data rate and amplitude using non-return-to-zero (NRZ) signaling has been proposed repeatedly to double the per-pin bandwidth. Furthermore, a PAM-4 SBD architecture can quadruple the throughput of a unidirectional (UD) NRZ architecture by simultaneously delivering four-level pulse-amplitude modulation (PAM-4) signals in both directions over one channel. Thanks to this advantage, a PAM-4 SBD transceiver has recently been explored [48]. However, applying PAM-4 to SBD also raises issues. In addition to the general linearity problem, PAM-4 is three times more sensitive than NRZ to the same amount of



Fig. 4.1 Conceptual block diagram of SBD transceiver with hybrid adaptation engine.

noise. In short-reach SBD communication, errors caused by hybrid mismatches become the dominant noise source, unlike in UD communication, where intersymbol interference (ISI) is the primary concern.

Fig. 4.1 illustrates a conceptual block diagram of the SBD transceiver that employs a hybrid adaptation engine. The essential component of the SBD transceiver is the hybrid circuit, which eliminates the outbound signal. Because of its significance, various hybrid structures have been investigated, and one representative structure is the R-gm hybrid, which excludes a replica driver to minimize the mismatch factors from the replica [28]. However, even with fewer mismatch factors, residual hybrid errors occur when the resistance or transconductance deviates from the desired value, necessitating hybrid trimming to enhance the signal-to-noise ratio (SNR) of the received signal. The need for an automatic search of the hybrid coefficient grows because the mismatch is aggravated as the technology scales down for higher bandwidth. This chapter presents an SBD transceiver with PAM-4, employing a novel hybrid adaptation scheme [70]. The possibility of extending bandwidth is explored by applying PAM-4 signaling to SBD. Furthermore, a mismatch compensation method for the hybrid circuit is proposed to guarantee robust SBD operation. The hybrid adaptation scheme is cost-efficient because it shares one error sampler with the clock and data recovery (CDR) and data level (dLev) adaptation logic and utilizes the locking points of those loops. Fabricated in 28-nm CMOS, the 80-Gb/s SBD transceiver achieves a bit error rate (BER) of less than 10<sup>-12</sup> with an energy efficiency of 2.65 pJ/b.

#### 4.2 Proposed Hybrid Adaptation Scheme

While compensating for the hybrid mismatch is the first priority, excessive area or power overhead is unacceptable. Furthermore, severe trade-offs such as bandwidth reduction due to additional circuits should be avoided. Thus, employing additional samplers at the receiver is not recommended for high-speed operation, and as much information as possible must be extracted from the existing hardware for low energy consumption. This thesis presents a low-cost adaptation methodology compatible with the widely used dLev adaptation and Mueller-Müller phase detector (MMPD). The dLev adaptation moves the reference level of the error sampler to the desired point, generally the center of the main cursor level, until "UP" and "DN" decisions occur at the same ratio. The MMPD, on the other hand, determines phase



Fig. 4.2 Operational principle of proposed hybrid adaptation scheme ( $h_{-1} = h_1$  applied from the MMPD characteristic).

| D[n-1] | D[n] | D[n+1] | P[n] |
|--------|------|--------|------|
| -3     | +3   | +3     | 1    |
| -1     | +3   | +1     | 1    |
| +1     | +3   | -1     | 1    |
| +3     | +3   | -3     | 1    |
| Other cases |  |       | 0    |

(a)

| P[n] | D'[n]   | E[n] | $w_{HYB}$ |
|------|---------|------|-----------|
| 1    | -3 / -1 | -1   | UP        |
| 1    | -3 / -1 | +1   | DN        |
| 1    | +3 / +1 | -1   | DN        |
| 1    | +3 / +1 | +1   | UP        |
| Other cases |  |      | Hold      |

(b)
Fig. 4.3 Truth table of (a) pattern filter for hybrid adaptation and (b) hybrid weight adaptation logic.

error from the consecutive error and data samples and produces a lock point where  $h_{-1} = h_1$ . The logical operations are described in Chapter 4.3.2.

The proposed hybrid adaptation begins with the spread of possible ISIs from the dLev (=  $3h_0$ ). Fig. 4.2 illustrates the process under the assumption that only  $h_{-1}$ ,  $h_0$ , and  $h_1$  components are present. Four ISIs according to the PAM-4 levels appear for each cursor, and 16 kinds of ISIs exist when both the pre-cursor and post-cursor ISIs are considered. Here, applying the characteristic of MMPD,  $h_{-1} = h_1$ , the spreads from the ISIs appear in seven ways. Among them, a case of  $3h_0$  that corresponds to the dLev value suggests that the data level is maintained for specific patterns even after considering ISIs. Therefore, using these particular patterns enables hybrid adaptation without additional samplers. Such a pattern filter P[n] can be expressed as

$$P[n] = D_{\text{MSB}}[n] \cdot D_{\text{LSB}}[n] \cdot (D_{\text{MSB}}[n-1] \oplus D_{\text{MSB}}[n+1]) \cdot (D_{\text{LSB}}[n-1] \oplus D_{\text{LSB}}[n+1])$$
(4.1)

where  $D_{\text{MSB}}[n]$  and  $D_{\text{LSB}}[n]$  represent the MSB and LSB of the inbound signal, respectively. The corresponding truth table of the pattern filter also appears in Fig. 4.3(a) with PAM-4 levels.
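The ISI counting argument and the pattern filter of Eq. (4.1) can be checked by brute-force enumeration. The sketch below assumes a natural binary bit mapping (-3 → 00, -1 → 01, +1 → 10, +3 → 11), which is consistent with the truth table in Fig. 4.3(a) but is not stated explicitly in the text; the cursor values passed to `isi_spread` are arbitrary illustrative numbers.

```python
from itertools import product

LEVELS = (-3, -1, 1, 3)
# Assumed (MSB, LSB) mapping, chosen to agree with the truth table in Fig. 4.3(a).
BITS = {-3: (0, 0), -1: (0, 1), 1: (1, 0), 3: (1, 1)}


def pattern_filter(d_prev, d_cur, d_next):
    """Eq. (4.1): 1 when D[n] = +3 and both bits of D[n-1] and D[n+1] differ."""
    msb_p, lsb_p = BITS[d_prev]
    msb_c, lsb_c = BITS[d_cur]
    msb_n, lsb_n = BITS[d_next]
    return msb_c & lsb_c & (msb_p ^ msb_n) & (lsb_p ^ lsb_n)


def isi_spread(h_pre, h_post):
    """Distinct ISI offsets h_pre*D[n+1] + h_post*D[n-1] over all 16 neighbor pairs."""
    return sorted({round(h_pre * d_next + h_post * d_prev, 12)
                   for d_prev, d_next in product(LEVELS, repeat=2)})


# Unequal cursors: all 16 neighbor combinations give different offsets.
general = isi_spread(0.05, 0.2)
# MMPD lock condition h_pre == h_post: the offsets collapse to seven values.
locked = isi_spread(0.1, 0.1)

# Patterns passed by the filter; their ISI cancels, so the level stays at 3*h0.
selected = [(p, c, n) for p, c, n in product(LEVELS, repeat=3)
            if pattern_filter(p, c, n)]

print(len(general), len(locked))
print(selected)
```

Every selected pattern has D[n-1] + D[n+1] = 0, which is why the pre-cursor and post-cursor contributions cancel once the MMPD has enforced equal cursors.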

The last step is to consider the residual hybrid errors from the outbound signal and to determine the output of the error sampler based on the hybrid weight. When the hybrid weight ( $w_{HYB}$ ) is the same as the outbound main cursor ( $h'_0$ ), that is,  $h'_0 = w_{HYB}$ , the hybrid output level is maintained at  $3h_0$ . However, when the weight deviates, the level splits into four equally spaced levels, one for each outbound PAM-4 symbol. Therefore, the "UP"/"DN" of  $w_{HYB}$  can be determined in the direction where ( $h'_0 - w_{HYB}$ ) becomes zero, which is found by detecting the correlation between the outbound signal (D'[n]) and the error sampler output (E[n]). The "UP"/"DN" cases of  $w_{HYB}$  are shown in Fig. 4.3(b). Consequently, using the sign-sign least-mean-square (SSLMS) algorithm, the hybrid adaptation logic can be expressed as

$$w_{\text{HYB},i+1} = w_{\text{HYB},i} + \mu \cdot \text{sign}(E[n]) \cdot D'_{\text{MSB}}[n] \cdot P[n]$$
(4.2)

where  $D'_{\text{MSB}}[n]$  represents the MSB of the outbound signal. The outbound MSB data is obtained by delaying and transmitting data without serialization. As a result, the proposed hybrid adaptation is performed without additional analog hardware in the main path by sharing the one error sampler with both the CDR and dLev adaptation loops. With the same principle, this method can be extended to multiple error samplers to increase the density of usable patterns.
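The update rule of Eq. (4.2) can be exercised with a small behavioral model. The sketch below is a simplified illustration under stated assumptions: ISI is ignored on the selected patterns, the residual outbound signal is modeled as  $(h'_0 - w_{HYB}) \cdot D'[n]$  added to the  $3h_0$  level, the error sampler reference is already locked at  $3h_0$ , and the outbound MSB is represented as ±1. All names and numeric values are hypothetical.

```python
import random


def sign(x):
    """Comparator-style sign: ties resolve to +1."""
    return 1 if x >= 0 else -1


def adapt_hybrid_weight(h0_out, w_init, mu, n_symbols, h0_in=1.0, seed=0):
    """Behavioral SSLMS loop for the hybrid weight, following Eq. (4.2)."""
    rng = random.Random(seed)
    levels = (-3, -1, 1, 3)
    d_in = [rng.choice(levels) for _ in range(n_symbols)]    # inbound D[n]
    d_out = [rng.choice(levels) for _ in range(n_symbols)]   # outbound D'[n]
    d_lev = 3 * h0_in                                        # error sampler reference
    w = w_init
    for n in range(1, n_symbols - 1):
        # Pattern filter P[n]: D[n] = +3 with opposite neighbors (Fig. 4.3(a)).
        if not (d_in[n] == 3 and d_in[n - 1] + d_in[n + 1] == 0):
            continue  # "Hold"
        # Hybrid output: inbound level plus residual outbound signal.
        level = 3 * h0_in + (h0_out - w) * d_out[n]
        e = sign(level - d_lev)             # E[n]
        w += mu * e * sign(d_out[n])        # sign(D'[n]) plays the role of D'_MSB[n]
    return w


w_final = adapt_hybrid_weight(h0_out=1.0, w_init=0.7, mu=0.01, n_symbols=20000)
print(f"final hybrid weight: {w_final:.3f}")
```

In this noiseless model the weight walks toward  $h'_0$  in steps of  $\mu$  and then dithers within one step of it, which reproduces the "UP"/"DN" directions listed in Fig. 4.3(b).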



Fig. 4.4 Simulated results of (a) normalized VEO by the normalized hybrid weight and (b) eye diagrams by the hybrid mismatches.

The importance of the hybrid adaptation scheme can be gauged by how much a hybrid mismatch degrades bidirectional communication. A hybrid mismatch is simulated using a model of a high-speed SBD transceiver, and the resulting normalized vertical eye opening (VEO) versus the normalized hybrid weight appears in Fig. 4.4(a). A Monte Carlo simulation with 1000 samples is also conducted for the hybrid circuit, and the standard deviation of the normalized hybrid weight is shown in the same graph. Combining the two results shows that, owing to the hybrid mismatch, the normalized VEO can fall below 0.79 with a 32% probability and below 0.52 with a 5% probability. Fig. 4.4(b) shows the eye diagrams for three cases: ideal, 1-sigma mismatch, and 2-sigma mismatch. The compensation scheme is therefore valuable: the VEO degrades with high probability when a mismatch occurs, and the eye can even close in the worst case. These results may be used to determine criteria such as the control range or the resolution of the logic.
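The 32% and 5% figures match the two-sided tail probabilities of a normal distribution beyond 1 sigma and 2 sigma. The short check below assumes the Monte Carlo hybrid-weight distribution is approximately Gaussian, which the text does not state explicitly.

```python
import math


def prob_outside(k_sigma):
    """Two-sided probability that a normal variable deviates by more than k sigma."""
    return math.erfc(k_sigma / math.sqrt(2))


p_1sigma = prob_outside(1)
p_2sigma = prob_outside(2)

print(f"P(|x| > 1 sigma) = {p_1sigma:.3f}")
print(f"P(|x| > 2 sigma) = {p_2sigma:.3f}")
```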

## **4.3 SBD Transceiver Implementation**

Fig. 4.5 shows the overall architecture of the presented PAM-4 SBD transceiver. Assuming a clock-forwarding structure, a polyphase filter (PPF) converts the differential external clock into four phases, which are used for the transmitter and receiver after passing through their respective phase interpolators (PIs) and duty-cycle correctors (DCCs).



Fig. 4.5 Block diagram of overall PAM-4 SBD transceiver architecture.

In addition, the SBD transceiver includes PAM-4 hybrid circuit and adaptation logic while supporting data alignment logic to ensure smooth two-way communication.

#### 4.3.1 Transmitter

The transmitter is configured so that the MSB and LSB are each delivered from the on-chip pattern generator to a 32:1 serializer and finally to the PAM-4 driver stage. The 32-bit data is serialized using cascaded 2:1 serializers, and single-ended data is converted to differential data just before the final 2:1 serializer to reduce power consumption while preventing single-to-differential (S2D) jitter propagation to the driver output. Fig. 4.6 shows the schematic implementation of the final 2:1 serializer. Five latches align data with the clock to guarantee the timing



Fig. 4.6 Schematic implementation of final 2:1 serializer.

margin of the 2:1 MUX, and a clocked CMOS latch structure is employed to enhance the circuit bandwidth. In addition, the 2:1 MUX is designed with pre-charge and pre-discharge MOSFETs, which help to remove the jitter caused by the charge remaining in the internal nodes depending on the data pattern during high-speed operation [71].

The current-mode (CM) type is employed for the driver instead of the voltage-mode (VM) type, which shows a large impedance variation during data transitions. A wide linear range (WLR) hybrid is adopted for the hybrid structure to ensure PAM-4 SBD communication [50]. Since the WLR hybrid cancels the outbound signal by the ratio of resistance and current, it neither degrades the linearity of the PAM-4 signal nor under-subtracts particular levels, and it can operate over a wide range. Fig. 4.7 shows the network of the driver and hybrid. The resistance and current are set based on the relation  $I_{\text{DRV}} / I_{\text{HYB}} = 2R_{\text{HYB}} / R_{\text{TERM}} + 1$ , which makes the hybrid subtraction equal to the outbound driver swing. The hybrid resistance is set to 100  $\Omega$ , and the current



Fig. 4.7 Schematic of driver and hybrid.

is set to 4 mA, which is 1/5 of the PAM-4 driver current.
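The stated current ratio follows directly from the hybrid relation. The sketch below evaluates it for the quoted hybrid resistance and current, assuming a 50-Ω termination resistance; that value is not given alongside the relation and is taken here as an assumption consistent with the 50-Ω output impedance of the driver.

```python
def driver_to_hybrid_current_ratio(r_hyb_ohm, r_term_ohm):
    """I_DRV / I_HYB = 2 * R_HYB / R_TERM + 1."""
    return 2 * r_hyb_ohm / r_term_ohm + 1


R_HYB_OHM = 100.0   # hybrid resistance (from the text)
R_TERM_OHM = 50.0   # termination resistance (assumption)
I_HYB_A = 4e-3      # hybrid current (from the text)

ratio = driver_to_hybrid_current_ratio(R_HYB_OHM, R_TERM_OHM)
i_drv_a = ratio * I_HYB_A

print(f"I_DRV / I_HYB = {ratio:.0f}")
print(f"implied driver current = {i_drv_a * 1e3:.0f} mA")
```

With these values the ratio is 5, which agrees with the hybrid current being 1/5 of the PAM-4 driver current.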



Fig. 4.8 Simulated waveforms of PAM-4 hybrid operation.



Fig. 4.9 Simulated eye diagrams of hybrid (a) input and (b) output.

Fig. 4.8 shows the simulated waveforms of the PAM-4 hybrid operation. The hybrid input exhibits seven levels that correspond to the combinations of outbound and inbound data. The hybrid extracts the PAM-4 signal corresponding to the inbound data by removing the outbound data from the overlapped waveform. The eye diagrams for the hybrid input and output are presented in Fig. 4.9. Meanwhile, Fig. 4.10 shows the simulated output impedance of the driver (including the hybrid) over time and indicates a variation of  $\pm 1.5 \Omega$  around 50  $\Omega$  in SBD. The impedance in SBD stays at about the same value as in UD signaling with a half swing.



Fig. 4.10 Simulated output impedance of driver including hybrid.

#### 4.3.2 Receiver

The receiver employs a programmable continuous-time linear equalizer (CTLE) at the front-end, instead of a power-hungry decision feedback equalizer (DFE). The CTLE comprises two stages to maximize ISI cancellation by controlling not only peak gain but also mid-band boosting, and Fig. 4.11(a) illustrates the schematic of the two-stage CTLE. The transfer function of the CTLE can be expressed as

$$H(s) = \frac{g_{m1}R_{L1}}{1 + g_{m1}R_{S1}/2} \frac{(1 + sR_{S1}C_{S1})}{\left(1 + \frac{sR_{S1}C_{S1}}{1 + g_{m1}R_{S1}/2}\right)(1 + sR_{L1}C_{L1})} \cdot \frac{g_{m2}R_{L2}}{1 + g_{m2}R_{S2}/2} \frac{(1 + sR_{S2}C_{S2})}{\left(1 + \frac{sR_{S2}C_{S2}}{1 + g_{m2}R_{S2}/2}\right)(1 + sR_{L2}C_{L2})}$$
(4.3)



Fig. 4.11 (a) Schematic and (b) frequency response of CTLE.

where  $C_{L1}$  and  $C_{L2}$  represent the load capacitance of each CTLE stage. The *RC*-degenerated CTLE provides AC boosting up to 10 dB at the Nyquist frequency, and its frequency response is shown in Fig. 4.11(b). The restored PAM-4 signal from the hybrid and CTLE is sampled using four samplers per UI with half-rate sampling. Each phase comprises three data samplers and one error sampler. With a total of eight samplers in the receiver, the system reduces clocking power while alleviating the bandwidth requirements of the receiver front-end. The sampled information is then passed through a 2:32 deserializer and utilized for CDR, dLev adaptation, and hybrid adaptation in the synthesized digital logic. Notably, all loops operate using only one error sampler.
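The transfer function of Eq. (4.3) can be evaluated numerically to see how the degeneration network sets the boost. The sketch below implements one *RC*-degenerated stage and cascades two of them. All component values are hypothetical placeholders chosen only to produce a plausible response; they are not the design values of the prototype, and the 10-GHz evaluation point is taken from the channel-loss frequency quoted in the measurement section.

```python
import math


def ctle_stage(f_hz, gm, r_l, r_s, c_s, c_l):
    """One RC-degenerated stage of Eq. (4.3), evaluated at s = j*2*pi*f."""
    s = 2j * math.pi * f_hz
    degen = 1 + gm * r_s / 2                 # degeneration factor
    dc_gain = gm * r_l / degen
    zero = 1 + s * r_s * c_s                 # zero from the degeneration network
    pole_src = 1 + s * r_s * c_s / degen     # pole pushed out by the degeneration factor
    pole_load = 1 + s * r_l * c_l            # output pole from the load capacitance
    return dc_gain * zero / (pole_src * pole_load)


def ctle(f_hz, stage1, stage2):
    """Two-stage CTLE: product of the individual stage responses."""
    return ctle_stage(f_hz, **stage1) * ctle_stage(f_hz, **stage2)


def to_db(h):
    return 20 * math.log10(abs(h))


# Hypothetical component values (assumptions, not from the thesis).
STAGE1 = dict(gm=10e-3, r_l=200.0, r_s=400.0, c_s=100e-15, c_l=20e-15)  # peaking near Nyquist
STAGE2 = dict(gm=10e-3, r_l=200.0, r_s=400.0, c_s=400e-15, c_l=20e-15)  # mid-band boost
F_NYQ_HZ = 10e9

dc_db = to_db(ctle(0.0, STAGE1, STAGE2))
nyq_db = to_db(ctle(F_NYQ_HZ, STAGE1, STAGE2))
boost_db = nyq_db - dc_db

print(f"DC gain {dc_db:.1f} dB, gain at Nyquist {nyq_db:.1f} dB, boost {boost_db:.1f} dB")
```

Increasing the degeneration resistance lowers the DC gain while leaving the high-frequency gain roughly unchanged, and a larger  $C_S$  moves the zero to a lower frequency, which is how one stage can be assigned to mid-band boosting.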

The synthesized digital logic consists of dLev adaptation, CDR, and data alignment, together with the proposed hybrid adaptation described in Chapter 4.2. First, the dLev adaptation operates as shown in Fig. 4.12, adjusting the reference level of the error sampler to the center of the  $3h_0$  dispersion using the SSLMS algorithm [72]. Next, the CDR utilizes a sign-sign MMPD (SS-MMPD), which is commonly



Fig. 4.12 Logical operation of dLev adaptation.



Fig. 4.13 Logical operations of SS-MMPD in PAM-4 using one error sampler.

adopted in PAM-4 receivers for its power efficiency and clock overhead advantages [73]. Fig. 4.13 depicts the operation principle of the SS-MMPD, which is configured using only one error sampler. "Early" or "Late" is determined through consecutive data and error samples for the "0110" pattern of the DH sampler output, and the PD provides the locking point of  $h_{-1} = h_1$  [74].
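The reason an MMPD settles where  $h_{-1} = h_1$  can be illustrated with the textbook baud-rate timing function  $e[n]\,d[n-1] - e[n-1]\,d[n]$ . The sketch below is a generic linear version of that function applied to random PAM-4 data; it is not the pattern-filtered, single-error-sampler SS-MMPD of Fig. 4.13, and the cursor values are hypothetical.

```python
import random


def mm_timing_error(h_pre, h_post, n_symbols=20000, seed=1):
    """Average of e[n]*d[n-1] - e[n-1]*d[n] for a three-cursor channel."""
    rng = random.Random(seed)
    d = [rng.choice((-3, -1, 1, 3)) for _ in range(n_symbols)]
    # Received sample: y[n] = h_pre*d[n+1] + h0*d[n] + h_post*d[n-1].
    # Error after removing the main cursor: e[n] = y[n] - h0*d[n].
    e = [0.0] * n_symbols
    for n in range(1, n_symbols - 1):
        e[n] = h_pre * d[n + 1] + h_post * d[n - 1]
    acc = 0.0
    for n in range(2, n_symbols - 1):
        acc += e[n] * d[n - 1] - e[n - 1] * d[n]
    return acc / (n_symbols - 3)


late = mm_timing_error(h_pre=0.05, h_post=0.2)     # post-cursor dominates
early = mm_timing_error(h_pre=0.2, h_post=0.05)    # pre-cursor dominates
locked = mm_timing_error(h_pre=0.1, h_post=0.1)    # balanced cursors

print(f"h1 > h-1: {late:+.3f}   h1 < h-1: {early:+.3f}   h1 = h-1: {locked:+.3f}")
```

For uncorrelated data the average is proportional to  $h_1 - h_{-1}$ , so the detector output changes sign around the point where the two cursors are equal, which is the lock point used by the hybrid adaptation.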

The loop bandwidths decrease in the order of dLev adaptation, CDR, and hybrid adaptation to prevent instability from interaction among the loops. The hybrid adaptation logic has the lowest bandwidth since it relies on the locked condition of the MMPD, and the simulated locking transients of the adaptation loops are shown in Fig. 4.14. The adaptation loops bring all control words to the desired lock point from incorrect initial values, but a significant deviation in the hybrid weight can affect CDR behavior, which requires attention. Fig. 4.15 shows the simulated PI locking behaviors with different hybrid weights. When the deviation is small ( $w_{HYB1}$ ), the settling time increases, but the loop still reaches the same optimum lock point. However, as the deviation becomes larger, a lock point different from the optimum is observed ( $w_{HYB2}$ ), and false locks can even occur when the deviation is too large ( $w_{HYB3}$ ).



Fig. 4.14 Simulated locking transients of adaptation loops.



Fig. 4.15 Simulated PI locking behaviors with different hybrid weights.

#### **4.3.3 Clock Distribution**

The clock distribution comprises the PI and DCC, used in both the transmitter and receiver, with a 4-phase clock (I, Q, IB, QB) input generated by a PPF. Phase interpolation is achieved by a weighted summation of the quadrature clock inputs using a current summation with a shared resistive load [75]. The schematic of the current-mode PI is presented in Fig. 4.16(a), where the sum of weighting factors in the I path and Q path is constant, and their ratio represents the tangent value of the interpolated phase. Fig. 4.16(b) shows the interpolated phase of a 10-GHz clock using a 7-bit control code. The PI output clock employs the DCC to optimize the eye margins in the transmitter and receiver. The DCC utilizes a starved inverter structure with a control range of  $44.7^{\circ}$  to  $54.3^{\circ}$ .

Fig. 4.16 (a) Schematic of PI and (b) interpolated phase by control code.
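The relation between the weighting factors and the interpolated phase can be sketched numerically. The model below assumes ideal sinusoidal quadrature clocks, linear weights whose sum is constant, and a 7-bit code split into four quadrants of 32 steps each; the quadrant mapping is an assumption for illustration and is not taken from the schematic.

```python
import math

CODES = 128                 # 7-bit control code
STEPS_PER_QUADRANT = 32     # assumption: codes split evenly over four quadrants


def interpolated_phase_deg(code):
    """Phase of w_I*cos + w_Q*sin with w_I + w_Q = 1 inside each quadrant."""
    quadrant, k = divmod(code % CODES, STEPS_PER_QUADRANT)
    w_q = k / STEPS_PER_QUADRANT
    w_i = 1.0 - w_q
    # The weight ratio w_Q / w_I is the tangent of the phase within the quadrant.
    return 90.0 * quadrant + math.degrees(math.atan2(w_q, w_i))


def ideal_phase_deg(code):
    return (code % CODES) * 360.0 / CODES


worst_error_deg = max(abs(interpolated_phase_deg(c) - ideal_phase_deg(c))
                      for c in range(CODES))

print(f"phase at code 16: {interpolated_phase_deg(16):.1f} deg")
print(f"worst deviation from a linear phase ramp: {worst_error_deg:.2f} deg")
```

Because the phase follows the arctangent of the weight ratio, linearly stepped weights produce a slightly nonlinear phase curve within each quadrant, with exact agreement only at the quadrant boundaries and midpoints.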

In a symmetric SBD transceiver, the alignment between the inbound and outbound data affects the signal quality of the hybrid output. In a high-speed hybrid, the subtraction may fluctuate while the outbound signal is being removed. Accordingly, if the received and transmitted data are aligned, the sampling margin increases and a lower BER can be obtained. Fig. 4.17 describes this difference, and it can be confirmed that the hybrid output is not disturbed by the



Aligned condition:  $\text{clk}_{TX} = \text{clk}_{RX} \pm 0.5\,\text{UI} - (t_{d1} + t_{d2} + t_{d3})$ 

Fig. 4.17 Conceptual diagram of data alignment technique.

blip when the aligned condition is satisfied. The phase alignment is done by the PI of the transmitter and is manually controlled during measurement.

## 4.4 Measurement

The proposed PAM-4 SBD transceiver is fabricated in 28-nm CMOS technology, and Fig. 4.18 shows a photomicrograph and power/area breakdown of the prototype chip. The transceiver consumes a total power of 106 mW, including clock distribution. An energy efficiency of 2.65 pJ/b is achieved with an active area of 0.089 mm<sup>2</sup>. The measured channel response (1.3-in PCB trace and 36-in coaxial cable) is represented in Fig. 4.19, which has an insertion loss of 5.6 dB at 10 GHz. To confirm that



|      | Block                   | Power   | Area                  |
|------|-------------------------|---------|-----------------------|
| A    | TX (including hybrid)   | 41.0 mW | 0.018 mm<sup>2</sup>  |
| B, C | RX, DAC & bias gen.     | 53.3 mW | 0.061 mm<sup>2</sup>  |
| D    | Clock dist.             | 11.6 mW | 0.01 mm<sup>2</sup>   |

Fig. 4.18 Chip photomicrograph and power/area breakdown.

the PAM-4 data generated by the on-chip pattern generator is being transmitted correctly, the operation of the transmitter is verified using the Tektronix



Fig. 4.19 Measured frequency response of channel.



Fig. 4.20 Output waveform of PAM-4 transmitter.

MSO73304DX oscilloscope. Fig. 4.20 shows the PAM-4 eye diagram of the transmitter without channel loss, which shows a level separation mismatch ratio (RLM) of 0.96. For the bidirectional test, a data stream generated by the Anritsu MU183020A Pulse Pattern Generator is applied to the receiver through a passive PAM-4 combiner with a differential swing of 400 mV<sub>p-p</sub>. Fig. 4.21 illustrates the measurement setup, and the UD case is measured with the transmitter turned off under the same conditions.



Fig. 4.21 Measurement setup of SBD transceiver.

The BER measurement is conducted using the Anritsu MU183040B Error Detector, comparing the deserialized recovered data with the input data of the receiver. Fig. 4.22 and Fig. 4.23 show the measured bathtub curves and jitter tolerance (JTOL) curves for 40-Gb/s UD and 80-Gb/s SBD operation. At a BER of 10<sup>-6</sup>, horizontal eye margins of 0.39 UI and 0.14 UI are achieved for UD and SBD, respectively. SBD shows a reduced eye margin compared to UD due to additional noise sources such as echoes, inexact cancellation by the hybrid, and increased power-supply noise. Nonetheless, a BER of less than  $10^{-12}$  is achieved in SBD operation.

Fig. 4.24 illustrates the measured BER versus the hybrid weight, where the adaptation lock point is 1.04 relative to the ideal normalized hybrid weight. The BER trend over the hybrid weight mirrors the simulated reduction of the normalized VEO. A 20% hybrid error degrades the BER from 10<sup>-12</sup> to 10<sup>-5</sup> or higher, regardless of other optimum settings. Meanwhile, the data alignment shows the best SNR performance when the PIs of the transmitter and receiver differ by 0.31 UI, and the measured BER versus the phase difference of the SBD signals is shown in Fig. 4.25. The SNR deteriorates outside this range, but the eyes are not entirely closed even for the worst alignment. Table 4.1 summarizes the SBD transceiver performance and compares it to prior symmetric SBD transceivers with throughput over 10 Gb/s. It is the only SBD transceiver with hybrid adaptation, and its 80-Gb/s throughput is more than twice that of the transceivers that use NRZ signaling. It also consumes less energy per bit than the prior PAM-4 SBD transceiver [48].
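The energy efficiency and FoM entries of Table 4.1 can be reproduced from the single-side power. The sketch below assumes that both ends of the link consume the single-side power and that the total is divided by the aggregate SBD data rate; this definition is inferred from the tabulated numbers rather than stated in the text.

```python
def sbd_energy_efficiency(single_side_power_mw, sbd_rate_gbps):
    """Assumed definition: power of both link ends over the aggregate SBD rate (pJ/b)."""
    return 2 * single_side_power_mw / sbd_rate_gbps


def sbd_fom(single_side_power_mw, sbd_rate_gbps, loss_db):
    """Energy efficiency normalized by channel loss at Nyquist (pJ/b/dB)."""
    return sbd_energy_efficiency(single_side_power_mw, sbd_rate_gbps) / loss_db


# This work: 106 mW per side, 80 Gb/s SBD, 5.6 dB channel loss.
this_eff = sbd_energy_efficiency(106.0, 80.0)
this_fom = sbd_fom(106.0, 80.0, 5.6)

# Prior PAM-4 SBD transceiver [48]: 994 mW per side, 112 Gb/s SBD.
prior_eff = sbd_energy_efficiency(994.0, 112.0)

print(f"this work: {this_eff:.2f} pJ/b, FoM {this_fom:.2f} pJ/b/dB")
print(f"[48]     : {prior_eff:.2f} pJ/b")
```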



Fig. 4.22 Measured bathtub curves of 40-Gb/s UD and 80-Gb/s SBD transceiver.



Fig. 4.23 Measured JTOL curves of 40-Gb/s UD and 80-Gb/s SBD transceiver.



Fig. 4.24 Measured BER of SBD transceiver by hybrid weight.



Fig. 4.25 Measured BER of SBD transceiver by phase difference of SBD signals.

|                                       | JSSC '07 [28]                    | TCAS-I '20 [46]                                 | JSSC '20 [44]                    | ISSCC '21 [48]                      | This work                       |
|---------------------------------------|----------------------------------|-------------------------------------------------|----------------------------------|-------------------------------------|---------------------------------|
| Technology                            | 110-nm CMOS                      | 65-nm CMOS                                      | 28-nm CMOS                       | 14-nm FinFET                        | 28-nm CMOS                      |
| Signaling                             | NRZ                              | NRZ                                             | NRZ                              | PAM-4                               | PAM-4                           |
| Data rate (SBD)                       | 20 Gb/s                          | 15 Gb/s                                         | 32 Gb/s                          | 112 Gb/s                            | 80 Gb/s                         |
| Architecture                          | Driver: CM<br>Sub: R-gm hybrid | Driver: VM<br>Sub: Resistor-bridge hybrid | Driver: VM<br>Sub: R-gm hybrid | Driver: VM<br>Sub: Passive hybrid | Driver: CM<br>Sub: WLR hybrid |
| Hybrid<br>Adaptation                  | No                               | No                                              | No                               | No                                  | Yes                             |
| Channel Loss @<br>Nyquist             | 5 dB                             | 2.5 dB                                          | 4.4 dB, 10.2 dB                  | 35 dB                               | 5.6 dB                          |
| Area                                  | $1.02 \text{ mm}^2$              | $0.012 \text{ mm}^2$                            | $0.182 \text{ mm}^2$             | -                                   | $0.089 \text{ mm}^2$            |
| Power<br>Consumption<br>(Single side) | 266 mW                           | 10.1 mW*                                        | 29.2 mW                          | 994 mW*                             | 106 mW                          |
| Energy Efficiency                     | 26.6 pJ/b                        | 1.35 pJ/b*                                      | 1.83 pJ/b                        | 17.75 pJ/b*                         | 2.65 pJ/b                       |
| FoM [pJ/bit/dB]                       | 5.32                             | 0.54*                                           | 0.42, 0.18                       | 0.51*                               | 0.47                            |
| BER                                   | $< 10^{-12}$                     | $< 10^{-12}$                                    | $< 10^{-12}$                     | $< 10^{-6}$                         | $< 10^{-12}$                    |

Table 4.1 Performance summary and comparison with prior symmetric SBD transceivers

\* Includes only analog front-ends
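
The derived rows of Table 4.1 can be cross-checked with a short calculation. The sketch below assumes, since the table does not state it explicitly, that the energy efficiency is the single-side power divided by the per-direction data rate (half of the SBD data rate), and that the FoM is the energy efficiency divided by the channel loss at Nyquist. The function and variable names are illustrative only.

```python
# Cross-check of the derived rows of Table 4.1 from its raw entries.
# ASSUMPTION (inferred, not stated in the table): energy efficiency is
# single-side power divided by the per-direction rate (SBD rate / 2).

# name: (SBD data rate [Gb/s], single-side power [mW], channel loss(es) [dB])
rows = {
    "JSSC '07 [28]": (20.0, 266.0, [5.0]),
    "TCAS-I '20 [46]": (15.0, 10.1, [2.5]),
    "JSSC '20 [44]": (32.0, 29.2, [4.4, 10.2]),
    "ISSCC '21 [48]": (112.0, 994.0, [35.0]),
    "This work": (80.0, 106.0, [5.6]),
}


def energy_efficiency_pj_per_bit(sbd_rate_gbps, power_mw):
    """Single-side power over per-direction rate; mW / (Gb/s) equals pJ/b."""
    per_direction_gbps = sbd_rate_gbps / 2.0
    return power_mw / per_direction_gbps


def fom_pj_per_bit_per_db(efficiency_pj_per_bit, loss_db):
    """Energy efficiency normalized by the channel loss at Nyquist."""
    return efficiency_pj_per_bit / loss_db


def summarize(table):
    """Return {name: (energy efficiency, [FoM per listed channel loss])}."""
    out = {}
    for name, (rate, power, losses) in table.items():
        ee = energy_efficiency_pj_per_bit(rate, power)
        out[name] = (ee, [fom_pj_per_bit_per_db(ee, loss) for loss in losses])
    return out


if __name__ == "__main__":
    for name, (ee, foms) in summarize(rows).items():
        fom_text = ", ".join(f"{f:.2f}" for f in foms)
        print(f"{name:16s} {ee:6.2f} pJ/b   FoM = {fom_text} pJ/bit/dB")
```

Under these assumptions, the recomputed values agree with the energy-efficiency and FoM entries of Table 4.1 to within 0.01, the difference being rounding of the intermediate energy efficiency.
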
## Chapter 5

## Conclusion

In this thesis, design techniques and structures for SBD transceivers utilizing PAM-4 signaling are proposed. The first prototype design presents an asymmetric SBD transceiver with a 12-Gb/s PAM-4 FC and a 125-Mb/s PAM-2 BC. The highly asymmetric communication is achieved by adopting a different hybrid structure suited to each chip. A WLR hybrid structure, which operates linearly over a wide range, is employed for reliable communication. Furthermore, a method of performing the hybrid in two stages is proposed in the SER to effectively separate the overlapped spectra. The hybrid with coefficient  $\Sigma \alpha$  minimizes the DC components, and the secondary LPF suppresses the residual high-frequency energy, which reduces the hybrid design complexity of the PAM-4 drivers, including the FFE. In the DES, the WLR hybrid is designed as finely as possible, which, with the help of the equalizers and the EC, provides a sufficient eye margin for the PAM-4 FC signal from the CIS.

The second chip presents an 80-Gb/s PAM-4 SBD transceiver. A WLR hybrid ensures reliable PAM-4 hybrid behavior without linearity issues, and the proposed hybrid adaptation scheme effectively compensates for the vulnerability of SBD transceivers to hybrid mismatch. The hybrid adaptation is achieved through a simplified analysis utilizing the lock point of the MMPD, which is cost-effective because it requires no additional analog hardware. Additionally, incorporating data alignment logic enhances the robustness of the SBD behavior. Measurements of the prototype chip demonstrate the effectiveness of these techniques, achieving a BER of less than 10<sup>-12</sup> at a data rate of 80 Gb/s for PAM-4 SBD operation. The transceiver is designed to be NRZ/PAM-4 reconfigurable and also supports unidirectional communication, ensuring high compatibility.

## **Bibliography**

- [1] R. Mooney, C. Dike and S. Borkar, "A 900 Mb/s bidirectional signaling scheme," *IEEE J. Solid-State Circuits*, vol. 30, no. 12, pp. 1538-1543, Dec. 1995.
- [2] T. Takahashi *et al.*, "A CMOS gate array with 600 Mb/s simultaneous bidirectional I/O circuits," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 1995, pp. 40-41.
- [3] T. Takahashi *et al.*, "A CMOS gate array with 600 Mb/s simultaneous bidirectional I/O circuits," *IEEE J. Solid-State Circuits*, vol. 30, no. 12, pp. 1544-1546, Dec. 1995.
- [4] D. A. Johns and D. Essig, "Integrated circuits for data transmission over twisted-pair channels," *IEEE J. Solid-State Circuits*, vol. 32, no. 3, pp. 398-406, Mar. 1997.
- [5] S. A. Jackson and B. J. Blalock, "A CMOS mixed signal simultaneous bidirectional signaling I/O," in *Proc. IEEE Midwest Symp. on Circuits and Systems*, Aug. 1998, pp. 37-40.
- [6] T. Takahashi *et al.*, "110 GB/s simultaneous bi-directional transceiver logic synchronized with a system clock," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 1999, pp. 176-177.

- [7] T. Takahashi *et al.*, "110-GB/s simultaneous bidirectional transceiver logic synchronized with a system clock," *IEEE J. Solid-State Circuits*, vol. 34, no. 11, pp. 1526-1533, Nov. 1999.
- [8] E. Yeung and M. Horowitz, "A 2.4 Gb/s/pin simultaneous bidirectional parallel link with per pin skew compensation," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2000, pp. 256-257.
- [9] E. Yeung and M. A. Horowitz, "A 2.4 Gb/s/pin simultaneous bidirectional parallel link with per-pin skew compensation," *IEEE J. Solid-State Circuits*, vol. 35, no. 11, pp. 1619-1628, Nov. 2000.
- [10] H. Tamura *et al.*, "5 Gb/s bidirectional balanced-line link compliant with plesiochronous clocking," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2001, pp. 64-65.
- [11] D. Cecchi, C. Hanson and C. Preuss, "A 2 GB/s high speed link with differential simultaneous bi-directional IO," in *Proc. IEEE Custom Integr. Circuits Conf. (CICC)*, May 2001, pp. 505-508.
- [12] H. Wilson and M. Haycock, "A six-port 30-GB/s nonblocking router component using point-to-point simultaneous bidirectional signaling for high-bandwidth interconnects," *IEEE J. Solid-State Circuits*, vol. 36, no. 12, pp. 1954-1963, Dec. 2001.
- [13] Y. Fujimura, T. Takahashi, S. Toyoshima, K. Nagashima, J. Baba and T. Matsumoto, "1.2 Gbps/pin simultaneous bidirectional transceiver logic with bit deskew technique," in *Symp. VLSI Circuits Dig. Tech. Papers*, Jun. 2002, pp. 58-59.

- [14] A. Martin, B. Casper, J. Kennedy, J. Jaussi and R. Mooney, "8Gb/s differential simultaneous bidirectional link with 4mV 9ps waveform capture diagnostic capability," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2003, pp. 478-479.
- [15] B. Casper, A. Martin, J. E. Jaussi, J. Kennedy and R. Mooney, "An 8-Gb/s simultaneous bidirectional link with on-die waveform capture," *IEEE J. Solid-State Circuits*, vol. 38, no. 12, pp. 2111-2120, Dec. 2003.
- [16] J.-H. Kim *et al.*, "A 4Gb/s/pin 4-level simultaneous bidirectional I/O using a 500MHz clock for high-speed memory," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2004, pp. 248-249.
- [17] W.-S. Kim *et al.*, "A 4 Gb/s/pin dual-reference simultaneous bidirectional I/O circuit for memory-bus interface," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2004, pp. 412-536.
- [18] J.-K. Kim *et al.*, "A 3.6 Gb/s/pin simultaneous bidirectional (SBD) I/O interface for high-speed DRAM," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2004, pp. 414-415.
- [19] M.-T. Hsieh and G. E. Sobelman, "Simultaneous bidirectional signaling with adaptive pre-emphasis," in *Proc. IEEE Int. Symp. on Circuits and Systems (ISCAS)*, May 2004, pp. 397-400.

- [20] R. J. Drost and B. A. Wooley, "An 8-Gb/s/pin simultaneously bidirectional transceiver in 0.35-µm CMOS," *IEEE J. Solid-State Circuits*, vol. 39, no. 11, pp. 1894-1908, Nov. 2004.
- [21] G. Yang, Y. Kim and S.-M. Kang, "Current mode multi-level simultaneous bidirectional I/O scheme for chip-to-chip communications," in *Proc. IEEE Int. Symp. on Circuits and Systems (ISCAS)*, May 2005, pp. 5493-5496.
- [22] J.-H. Kim *et al.*, "A 4-Gb/s/pin low-power memory I/O interface using 4-level simultaneous bi-directional signaling," *IEEE J. Solid-State Circuits*, vol. 40, no. 1, pp. 89-101, Jan. 2005.
- [23] J. Kennedy *et al.*, "A 3.6-Gb/s point-to-point heterogeneous-voltage-capable DRAM interface for capacity-scalable memory subsystems," *IEEE J. Solid-State Circuits*, vol. 40, no. 1, pp. 233-24, Jan. 2005.
- [24] V. Stojanovic *et al.*, "Autonomous dual-mode (PAM2/4) serial link transceiver with adaptive equalization and data recovery," *IEEE J. Solid-State Circuits*, vol. 40, no. 4, pp. 1012-1026, Apr. 2005.
- [25] Y. Tomita, H. Tamura, M. Kibune, J. Ogawa, K. Gotoh and T. Kuroda, "A 20-Gb/s bidirectional transceiver using a resistor-transconductor hybrid," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2006, pp. 2102-2111.
- [26] Y. S. Kim, S. Shin and S.-M. Kang, "A 4-Gb/s/pin current mode 4-level simultaneous bidirectional I/O with current mismatch calibration," in *Proc. IEEE Int. Symp. on Circuits and Systems (ISCAS)*, May 2006, pp. 1007-1010.

- [27] Y. S. Kim and S.-M. Kang, "Programmable high speed multi-level simultaneous bidirectional I/O," in *IEEE Int. Symp. on Quality Electronic Design (ISQED)*, Mar. 2007, pp. 1007-1010.
- [28] Y. Tomita, H. Tamura, M. Kibune, J. Ogawa, K. Gotoh and T. Kuroda, "A 20-Gb/s simultaneous bidirectional transceiver using a resistor-transconductor hybrid in 0.11-µm CMOS," *IEEE J. Solid-State Circuits*, vol. 42, no. 3, pp. 627-636, Mar. 2007.
- [29] C. J. Akl and M. A. Bayoumi, "Wiring-area efficient simultaneous bidirectional point-to-point link for inter-block on-chip signaling," in *IEEE Int. Conf. on VLSI Design (VLSID)*, Jan. 2008, pp. 195-200.
- [30] Y. S. Kim and S.-M. Kang, "A 8-Gb/s/pin current mode multi-level simultaneous bidirectional I/O," in *Proc. IEEE Int. Symp. on Circuits and Systems (ISCAS)*, May 2008, pp. 3069-3072.
- [31] H.-Y. Huang, R.-I. Pu and M.-T. Lee, "Simultaneous bidirectional transceiver with impedance matching," in *IEEE Int. Conf. on Electronics, Circuits and Systems (ICECS)*, Sep. 2008, pp. 312-315.
- [32] S. Kojima *et al.*, "8Gbps CMOS pin electronics hardware macro with simultaneous bi-directional capability," in *IEEE Int. Test Conference*, Nov. 2012, pp. 1-9.

- [33] H. Pan *et al.*, "A full-duplex line driver for gigabit Ethernet with rail-to-rail class-AB output stage in 28nm CMOS," *IEEE J. Solid-State Circuits*, vol. 49, no. 12, pp. 3141-3155, Dec. 2014.
- [34] N. Wary and P. Mandal, "Current-mode simultaneous bidirectional transceiver for on-chip global interconnects," in *IEEE Asia Symp. on Quality Electronic Design (ASQED)*, Aug. 2015, pp. 19-24.
- [35] H. Pan *et al.*, "An analog front-end for 100BASE-T1 automotive Ethernet in 28nm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2016, pp. 186-187.
- [36] D. Duvvuri, S. Agarwal and V. S. R. Pasupureddi, "A new hybrid circuit topology for simultaneous bidirectional signaling over on-chip interconnects," in *Proc. IEEE Int. Symp. on Circuits and Systems (ISCAS)*, May 2016, pp. 2342-2345.
- [37] N. Wary and P. Mandal, "Current-mode full-duplex transceiver for lossy on-chip global interconnects," *IEEE J. Solid-State Circuits*, vol. 52, no. 8, pp. 2026-2037, Aug. 2017.
- [38] A. Manian, A. Rane, Y. Koh *et al.*, "A simultaneous bidirectional single-ended coaxial link with 24-Gb/s forward and 312.5-Mb/s back channels," in *Proc. IEEE 44th Eur. Solid State Circuits Conf. (ESSCIRC)*, Sep. 2018, pp. 178-181.
- [39] A. R. Chowdhury, N. Wary and P. Mandal, "Energy efficient bidirectional equalized transceiver with PVT insensitive active termination," in *IEEE Int. Conf. on VLSI Design (VLSID)*, Jan. 2019, pp. 25-30.

- [40] G. W. den Besten, "Single-pair automotive PHY solutions from 10Mb/s to 10Gb/s and beyond," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2019, pp. 474-475.
- [41] Y.-H. Fan *et al.*, "A 32-Gb/s simultaneous bidirectional source-synchronous transceiver with adaptive echo cancellation in 28nm CMOS," in *Proc. IEEE Custom Integr. Circuits Conf. (CICC)*, Apr. 2019, pp. 1-4.
- [42] P. Venuturupalli, P. K. Govindaswamy and V. S. R. Pasupureddi, "An adaptive hybrid with residue monitor for full-duplex on-chip interconnects," in *Proc. IEEE Int. Symp. on Circuits and Systems (ISCAS)*, Oct. 2020, pp. 1-5.
- [43] S. Goyal, G. A. Parulekar and S. Gupta, "A true full-duplex IO for high-density high-speed interconnects," in *IEEE Int. Conf. on Electronics, Circuits and Systems (ICECS)*, Nov. 2020, pp. 1-4.
- [44] Y.-H. Fan *et al.*, "A 32-Gb/s simultaneous bidirectional source-synchronous transceiver with adaptive echo cancellation techniques," *IEEE J. Solid-State Circuits*, vol. 55, no. 2, pp. 439-451, Feb. 2020.
- [45] J.-K. Kim and D.-W. Jee, "Current/voltage dual-mode single-wire simultaneous bidirectional interface architecture for sensor system," *IEEE Trans. Biomed. Circuits Syst.*, vol. 14, no. 1, pp. 12-19, Feb. 2020.

- [46] C. Yuan, A. Naguib and S. Shekhar, "On the design of low-power hybrids for full-duplex simultaneous bidirectional signaling links," *IEEE Trans. Circuits Syst. I: Reg. Papers*, vol. 67, no. 4, pp. 1413-1422, Apr. 2020.
- [47] S. Mukherjee, A. Das, S. Seth and S. Saxena, "An energy-efficient 3 Gb/s PAM4 full-duplex transmitter with 2-tap feed forward equalizer," *IEEE Trans. Circuits Syst. II: Exp. Briefs*, vol. 67, no. 5, pp. 916-920, May 2020.
- [48] R. Farjadrad *et al.*, "An echo-cancelling front-end for 112Gb/s PAM-4 simultaneous bidirectional signaling in 14nm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2021, pp. 194-196.
- [49] A. Manian, A. Rane, Y. Koh, H. K. Nat and M. Lu, "A simultaneous bidirectional single-ended coaxial link with 24-Gb/s forward and 312.5-Mb/s back channels," *IEEE J. Solid-State Circuits*, vol. 56, no. 3, pp. 972-987, Mar. 2021.
- [50] Y. Lee, W. Lee, M. Shim, S. Shin, W.-S. Choi and D.-K. Jeong, "0.41pJ/b/dB asymmetric simultaneous bidirectional transceivers with PAM-4 forward and PAM-2 back channels for 5-m automotive camera link," in *Proc. IEEE Symp. VLSI Technology and Circuits*, Jun. 2022, pp. 30-31.
- [51] Y. Nishi *et al.*, "0.297-pJ/bit 50.4-Gb/s/wire inverter-based short-reach simultaneous bidirectional transceiver for die-to-die interface in 5nm CMOS," in *Proc. IEEE Symp. VLSI Technology and Circuits*, Jun. 2022, pp. 154-155.

- [52] P. K. Govindaswamy, N. Wary and V. S. R. Pasupureddi, "Power efficient echo-cancellation based hybrid for full-duplex chip-to-chip interconnects," in *Proc. IEEE Int. Symp. on Circuits and Systems (ISCAS)*, May 2022, pp. 852-856.
- [53] P. K. Govindaswamy, N. Wary and V. S. R. Pasupureddi, "A low-power half-rate charge-steering hybrid for full-duplex chip-to-chip interconnects," in *Proc. IEEE Int. Symp. on Circuits and Systems (ISCAS)*, May 2022, pp. 857-861.
- [54] S. Goyal, G. Parulekar and S. Gupta, "A true full-duplex IO (TFD-IO) with background SI cancellation for high-density interfaces," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 30, no. 5, pp. 615-624, May 2022.
- [55] P.-J. Peng, J.-F. Li, L.-Y. Chen, and J. Lee, "A 56Gb/s PAM-4/NRZ transceiver in 40nm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2017, pp. 110-111.
- [56] E. Depaoli *et al.*, "A 64 Gb/s low-power transceiver for short-reach PAM-4 electrical links in 28-nm FDSOI CMOS," *IEEE J. Solid-State Circuits*, vol. 54, no. 1, pp. 6-17, Jan. 2019.
- [57] R. Yousry *et al.*, "A 1.7pJ/b 112Gb/s XSR transceiver for intra-package communication in 7nm FinFET technology," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2021, pp. 180-181.
- [58] B. Ye *et al.*, "A 2.29pJ/b 112Gb/s wireline transceiver with RX 4-tap FFE for medium-reach applications in 28nm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2022, pp. 118-119.

- [59] Ethernet Alliance, "Ethernet Roadmap 2019", Online (Accessed Feb. 01, 2019), Available: https://ethernetalliance.org/wp-content/uploads/2019/08/EthernetRoadmap-2019-Side2-ToPrint.pdf
- [60] Marvell, "Marvell Unveils 802.3ch 10G Automotive Multi-Gigabit Ethernet PHY", Online (Accessed Apr. 27, 2021), Available: https://www.marvell.com/content/dam/marvell/en/company/media-kit/802-3ch-10g-automotive-ethernet-phy/88q4346-media-presentation.pdf
- [61] Intel, "AN 835: PAM4 Signaling Fundamentals", Online (Accessed Mar. 12, 2019), Available: https://www.intel.com/content/www/us/en/docs/programmable/683852/current/introduction.html
- [62] F. Zhong *et al.*, "A 1.0625 ~ 14.025 Gb/s multi-media transceiver with full-rate source-series-terminated transmit driver and floating-tap decision-feedback equalizer in 40 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 46, no. 12, pp. 3126-3139, Dec. 2011.
- [63] IEEE Standard for Ethernet-Amendment 8: Physical Layer Specifications and Management Parameters for 2.5 Gb/s, 5 Gb/s, and 10 Gb/s Automotive Electrical Ethernet, IEEE Standard 802.3ch-2020.
- [64] J. Hwang *et al.*, "A 32 Gb/s, 201 mW, MZM/EAM cascode push-pull CML driver in 65 nm CMOS," *IEEE Trans. Circuits Syst. II: Exp. Briefs*, vol. 65, no. 4, pp. 436-440, Apr. 2018.

- [65] J. He *et al.*, "A 56-Gb/s reconfigurable silicon-photonics transmitter using high-swing distributed driver and 2-tap in-segment feed-forward equalizer in 65-nm CMOS," *IEEE Trans. Circuits Syst. I: Reg. Papers*, vol. 69, no. 3, pp. 1159-1170, Mar. 2022.
- [66] J. M. Algueta-Miguel, J. Ramirez-Angulo, E. Mirazo, A. J. Lopez-Martin and R. G. Carvajal, "A simple miller compensation with essential bandwidth improvement," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 25, no. 11, pp. 3186-3192, Nov. 2017.
- [67] Y. P. Tsividis, "Integrated continuous-time filter design-An overview," *IEEE J. Solid-State Circuits*, vol. 29, no. 3, pp. 166-176, Mar. 1994.
- [68] W. Lee *et al.*, "0.37-pJ/b/dB PAM-4 transmitter and adaptive receiver with fixed data and threshold levels for 12-m automotive camera link," in *Proc. IEEE 47th Eur. Solid State Circuits Conf. (ESSCIRC)*, Sep. 2021, pp. 475-478.
- [69] J. Lee, K. Lee, H. Kim, B. Kim, K. Park and D.-K. Jeong, "A 0.1pJ/b/dB 1.62-to-10.8Gb/s video interface receiver with jointly adaptive CTLE and DFE using biased data-level reference," *IEEE J. Solid-State Circuits*, vol. 55, no. 8, pp. 2186-2195, Aug. 2020.
- [70] Y. Lee, M. Shim, S. Roh, W. Lee and D.-K. Jeong, "An 80-Gb/s PAM-4 simultaneous bidirectional transceiver with hybrid adaptation scheme," *IEEE Trans. Circuits Syst. II: Exp. Briefs*, early access, Mar. 07, 2023, doi: 10.1109/TCSII.2023.3253679.

- [71] H. Ju, M.-C. Choi, G.-S. Jeong, W. Bae and D.-K. Jeong, "A 28 Gb/s 1.6 pJ/b PAM-4 transmitter using fractionally spaced 3-tap FFE and G<sub>m</sub>-regulated resistive-feedback driver," *IEEE Trans. Circuits Syst. II: Exp. Briefs*, vol. 64, no. 12, pp. 1377-1381, Sep. 2017.
- [72] S. Roh, K. Lee, M. Shim, M.-C. Choi and D.-K. Jeong, "A 64-Gb/s PAM-4 receiver with transition-weighted phase detector," *IEEE Trans. Circuits Syst. II: Exp. Briefs*, vol. 69, no. 9, pp. 3704-3708, Sep. 2022.
- [73] F. Spagna *et al.*, "A 78mW 11.8Gb/s serial link transceiver with adaptive RX equalization and baud-rate CDR in 32nm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2010, pp. 366-367.
- [74] M.-C. Choi, H.-G. Ko, J. Oh, H.-Y. Joo, K. Lee and D.-K. Jeong, "A 0.1pJ/b/dB 28-Gb/s maximum-eye tracking, weight-adjusting MM CDR and adaptive DFE with single shared error sampler," in *Proc. IEEE Symp. VLSI Technology and Circuits*, Jun. 2020, pp. 1-2.
- [75] R. Nonis, W. Grollitsch, T. Santa, D. Cherniak and N. Da Dalt, "digPLL-Lite: A low-complexity, low-jitter fractional-N digital PLL architecture," *IEEE J. Solid-State Circuits*, vol. 48, no. 12, pp. 3134-3145, Dec. 2013.

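## Abstract (Korean)

This thesis proposes the design of simultaneous bidirectional (SBD) transceivers using four-level pulse amplitude modulation (PAM-4) for wireline communication. Transceiver structures and novel hybrid techniques are proposed for both asymmetric and symmetric prototype chips.

In the first prototype design, an asymmetric SBD transceiver for next-generation automotive camera links over 10 Gb/s is presented. PAM-4 signaling is employed to overcome the limited cable bandwidth, and SBD operation with PAM-4 is realized with a WLR hybrid. A two-step hybrid strategy of bypassing the FFE in the transmitter reduces power consumption and significantly simplifies the hybrid design. The hybrid removes only the four primary DC levels with coefficient Σα, and a second-order transconductor-capacitor (gm-C) low-pass filter (LPF) filters out the residual components of the hybrid and the reflections from the channel. An echo canceller (EC) is also used to remove the reflections of the PAM-2 back channel (BC). Over a 5-m automotive cable, the highly asymmetric transceiver with a 12-Gb/s PAM-4 forward channel (FC) and a 125-Mb/s PAM-2 BC exhibits eye margins of 0.15 UI and 0.57 UI, respectively, at BER < 10<sup>-12</sup> under SBD communication. Fabricated in 40-nm CMOS, the prototype transceiver achieves an energy efficiency of 6.5 pJ/b and an FoM of 0.41 pJ/b/dB.

The second prototype chip presents a symmetric SBD transceiver with PAM-4 signaling that uses a new hybrid adaptation scheme. The feasibility of extending the bandwidth by applying PAM-4 signaling to SBD is explored. In addition, a mismatch compensation method is proposed for the hybrid circuit, which is essential for removing the outbound signal in SBD. The hybrid adaptation is easily implemented by applying the lock condition of the Mueller-Müller phase detector (MMPD) to data-level adaptation. By sharing a single error sampler with the MMPD and the adaptation engine, the presented SBD transceiver is efficient in terms of the clocking power consumption and bandwidth of the receiver front-end. The wide linear range hybrid and the data alignment technique ensure robust PAM-4 SBD operation. Fabricated in 28-nm CMOS, the 80-Gb/s SBD transceiver achieves a BER below 10<sup>-12</sup> with an energy efficiency of 2.65 pJ/b.

**Keywords**: simultaneous bidirectional, four-level pulse amplitude modulation, asymmetric, symmetric, automotive camera link, hybrid, two-step hybrid, hybrid adaptation, transceiver.

**Student Number**: 2019-26929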