PAPER Special Section on Papers Selected from AP-ASIC 2004

# A Fully Integrated 1.7–3.125 Gbps Clock and Data Recovery Circuit Using a Gated Frequency Detector

Rong-Jyi YANG<sup>†</sup>, Nonmember and Shen-Iuan LIU<sup>†a)</sup>, Member

SUMMARY A fully integrated clock and data recovery circuit with the proposed gated frequency detector (GFD) is presented. It has been realized in a standard 0.25-$\mu$m CMOS technology. The proposed voltage-controlled oscillator (VCO) achieves a wide operation range and a reasonable conversion gain by employing an analog/digital dual-loop architecture. The small VCO gain helps to reduce the loop bandwidth without enlarging the capacitors and relaxes the constraints on the loop parameters, which reduces the size of the on-chip capacitor. The proposed GFD makes the frequency lock time fixed and avoids the harmonic locking problem in the digital domain for wide data-rate operation. All measured BERs are less than $10^{-12}$ for data rates from 1.7 Gbps to 3.125 Gbps.

key words: DLL, CDR, dual loop

## 1. Introduction

The requirement for faster information exchange has led to the demand for high-speed data communications. Much research focuses on high-speed transceivers, especially on clock and data recovery (CDR) circuits [1]–[6]. For a conventional charge-pump CDR with a loop filter of $R_P$ in series with $C_P$, the natural frequency, $\omega_n$, and the damping ratio, $\zeta$, of the closed-loop transfer function can be represented as [7]

$$\omega_n = \sqrt{\frac{I_P D_T K_{VCO}}{2\pi C_P}} \quad \text{and} \tag{1}$$

$$\zeta = \frac{R_P}{2} \sqrt{\frac{I_P D_T C_P K_{VCO}}{2\pi}},\tag{2}$$

where $I_P$ is the charge pump current, $D_T$ is the input data transition density and $K_{VCO}$ is the gain of the voltage-controlled oscillator (VCO). Usually, the tuning range of a VCO must be large enough to cover process, voltage and temperature variations, so $K_{VCO}$ will be large. A large $C_P$ might then be needed to reduce the loop bandwidth for jitter suppression. When $C_P$ is large enough, the -3 dB bandwidth of the loop can be expressed as [7]

$$\omega_{-3dB} \cong \frac{R_P I_P D_T K_{VCO}}{2\pi}. \tag{3}$$

For a fixed $K_{VCO}$, the loop bandwidth can be reduced by decreasing $I_P$ and $R_P$. A smaller $I_P$, however, causes serious mismatch between the charging and discharging currents.

Manuscript received November 3, 2004.

Manuscript revised February 15, 2005.

<sup>†</sup>The authors are with the Graduate Institute of Electronics Engineering and Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan, 10617, R.O.C.

a) E-mail: lsi@cc.ee.ntu.edu.tw

DOI: 10.1093/ietele/e88–c.8.1726

Reducing the loop bandwidth through $R_P$ requires a further increase in $C_P$ to maintain the same damping ratio. Moreover, for a fixed -3 dB bandwidth, if $K_{VCO}$ is reduced by a factor of $k$, $R_P$ can be increased by the same factor, and the damping ratio is then increased by a factor of $\sqrt{k}$. The jitter tolerance can also be improved [7]. If $C_P$ is also reduced by a factor of $k$, the natural frequency and damping ratio remain the same. The required die area for the on-chip loop filter is reduced, so the filter is easier to integrate. In order to enlarge the tuning range and reduce $K_{VCO}$, an analog/digital dual-controlled VCO (DVCO) with a 56% tuning range and a reasonable $K_{VCO}$ is proposed.
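The scaling argument above can be checked numerically from Eqs. (1)–(3). The following is a minimal sketch; the baseline component values and the factor `k` are hypothetical numbers chosen only for illustration and are not taken from the fabricated design.

```python
import math

def loop_params(i_p, d_t, k_vco, r_p, c_p):
    """Return (omega_n, zeta, omega_3db) according to Eqs. (1)-(3)."""
    omega_n = math.sqrt(i_p * d_t * k_vco / (2 * math.pi * c_p))
    zeta = (r_p / 2) * math.sqrt(i_p * d_t * c_p * k_vco / (2 * math.pi))
    omega_3db = r_p * i_p * d_t * k_vco / (2 * math.pi)
    return omega_n, zeta, omega_3db

# Hypothetical baseline loop (illustrative values only).
base = dict(i_p=100e-6, d_t=0.5, k_vco=2 * math.pi * 500e6, r_p=500.0, c_p=1e-9)
k = 4.0

# Step 1: K_VCO / k and R_P * k keep the -3 dB bandwidth, zeta grows by sqrt(k).
step1 = dict(base, k_vco=base["k_vco"] / k, r_p=base["r_p"] * k)

# Step 2: additionally C_P / k restores the original omega_n and zeta.
step2 = dict(step1, c_p=step1["c_p"] / k)

for name, params in (("base", base), ("step1", step1), ("step2", step2)):
    omega_n, zeta, omega_3db = loop_params(**params)
    print(f"{name}: omega_n={omega_n:.3e} rad/s, zeta={zeta:.3f}, "
          f"omega_3dB={omega_3db:.3e} rad/s")
```

In this sketch the second step ends with the same $\omega_n$, $\zeta$ and $\omega_{-3dB}$ as the baseline while using a capacitor that is $k$ times smaller, which is the area saving the text describes.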

Owing to the small lock range of phase detectors (PDs), an auxiliary frequency-tracking loop is needed for wide-tuning-range VCOs in CDR circuits. Since the whole frequency range of the proposed DVCO is divided into several bands, a frequency detector in the digital domain is necessary. Quadricorrelators [8]–[10] are widely used for frequency acquisition, but their narrow capture range and probabilistic nature [10] make them impossible to use in this work. To avoid this predicament, a digital gated frequency detector (GFD) is presented to acquire wide-range data without the harmonic locking issue.

## 2. Circuit Description

The proposed fully integrated CDR circuit is shown in Fig. 1. It consists of a digital GFD, an analog/digital DVCO, a linear half-rate phase detector [1] and a charge pump (CP). To receive the data, the signal *Start* is first set low to clear the GFD. When *Start* changes to high, the GFD outputs a 5-bit control code to select the proper frequency band of the DVCO according to the input data rate. After the digital loop is stopped, the analog loop is activated, and the linear PD and CP take over until the phase is locked.

Fig. 1 Half-rate CDR architecture.

**Fig. 2** The proposed GFD.

Fig. 3 Delay cell of DVCO and DCDL.

The proposed GFD consists of two gated oscillators (GOs), a dual-controlled delay line (DCDL), a phase comparator (PC), a 5-bit successive approximation register (SAR) controller [11] and a differential-to-single-ended (DTS) buffer, as shown in Fig. 2. The two GOs are triggered by the first rising and falling edges of the preamble signal, respectively, and generate the signals $V_{GO1}$ and $V_{GO2}$. The signal $V_D$ is generated by the DCDL from $V_{GO1}$. The PC compares the phase relation between $V_D$ and $V_{GO2}$ and outputs the signal *Lead/Lag*, which the 5-bit SAR controller uses to adjust the DCDL. The clock that triggers the SAR controller is generated from $V_{GO2}$ through the DTS buffer. After the 5-bit binary search is finished, the controller outputs the signal *PowerDown* to turn off the GOs and the DCDL, reducing the power consumption. The delay cells in the DCDL, shown in Fig. 3, are identical to those in the DVCO. The only difference is that the *PowerDown* signal of the delay cells in the DVCO is connected to the power supply, whereas that in the DCDL is controlled by the 5-bit SAR controller.
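The 5-bit binary search performed by the SAR controller can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the delay line is modeled as linear in the code, a larger code is assumed to give a longer delay, and `t_min_ps` and `t_lsb_ps` are hypothetical values, not the measured characteristics of the DCDL.

```python
def sar_search(t_bit_ps, t_min_ps=300.0, t_lsb_ps=10.0, n_bits=5):
    """Behavioral SAR search for the DCDL code whose delay is just below t_bit.

    Assumed (hypothetical) delay model: delay = t_min_ps + code * t_lsb_ps.
    Returns (code, delay_ps). The loop always runs n_bits iterations,
    so the search time does not depend on the data rate.
    """
    code = 0
    for bit in reversed(range(n_bits)):      # test the MSB first
        trial = code | (1 << bit)
        delay = t_min_ps + trial * t_lsb_ps
        # Phase comparator decision: keep the bit only if the delayed
        # edge V_D does not arrive later than V_GO2 (delay <= one bit time).
        if delay <= t_bit_ps:
            code = trial
    return code, t_min_ps + code * t_lsb_ps


if __name__ == "__main__":
    for rate_gbps in (1.7, 2.5, 3.125):
        t_bit = 1e3 / rate_gbps              # bit time in ps
        code, delay = sar_search(t_bit)
        print(f"{rate_gbps} Gbps: T_bit={t_bit:.1f} ps, code={code:05b}, "
              f"delay={delay:.1f} ps")
```

Because every search takes exactly `n_bits` comparator decisions, the model reflects the fixed frequency lock time of the GFD, and the residual error stays within one LSB of delay as long as the bit time lies inside the modeled delay range.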

For correct operation, a preamble signal is needed, as shown in Fig. 4. Suppose that the first two data bits are "10," i.e., a logic one followed by a logic zero; the remaining preamble can be random data or the periodic pattern 1010... Assume that the two GOs are identical, so that they oscillate at the same frequency and keep a constant phase difference. Since the time difference between the first rising and falling edges of the preamble is equal to one bit time, the time difference between $V_{GO1}$ and $V_{GO2}$ is also one bit time. If $V_D$ is aligned with $V_{GO2}$, the delay time of the DCDL is close to one bit time. In other words, if these delay cells are connected as a ring oscillator, it will oscillate at close to half the data rate.

Fig. 4 Timing diagram of digital GFD.

When the most significant bit is switched from low to high during the binary search, the delay time of the DCDL can change by as much as half of its tuning range. Such a large step might result in harmonic locking in the GFD. The oscillating frequency of the two GOs should therefore be chosen carefully to prevent this issue. Considering the timing relation for a given data rate, as shown in Fig. 4, the delay time, $T_D$, of the DCDL has to satisfy the following constraint:

$$T_{bit} - \frac{T_{GO}}{2} \le T_D \le T_{bit} + \frac{T_{GO}}{2},\tag{4}$$

where $T_{bit}$ is the bit time of the data and $T_{GO}$ is the oscillating period of the two GOs. Since this constraint has to be satisfied for all data rates in wide-range applications, Eq. (4) can be rewritten as

$$T_{bit,\max} - \frac{T_{GO}}{2} \le T_{D,\min} \quad \text{and} \tag{5}$$

$$T_{D,\max} \le T_{bit,\min} + \frac{T_{GO}}{2},\tag{6}$$

where $T_{bit,\max}$ and $T_{bit,\min}$ represent the maximum and minimum bit times, respectively. Ideally, $T_{GO}$ should be greater than twice the delay range, so that the constraint of Eq. (6) is satisfied for any value of $T_{bit}$. For example, if a CDR circuit must cover data rates from 1.7 Gbps to 3.125 Gbps, $T_D$ should cover the range from 320 ps to 588 ps and $T_{GO}$ has to be greater than 536 ps. One can simply choose $T_{GO}$ to be larger than $2 \times T_{bit,\max}$, i.e., 1.17 ns. The detection range can be enlarged by increasing $T_{GO}$. In this paper, $T_{GO}$ is set to 2 ns to ensure that the GFD has an adequate operation range.
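The numbers in this example follow directly from Eqs. (5) and (6). The sketch below reproduces them under the assumption, implied by the example, that the DCDL delay range coincides with the range of bit times; the function names are hypothetical.

```python
def bit_time_range_ps(rate_min_gbps, rate_max_gbps):
    """Return (T_bit_min, T_bit_max) in ps for the given data-rate range."""
    return 1e3 / rate_max_gbps, 1e3 / rate_min_gbps


def min_t_go_ps(rate_min_gbps, rate_max_gbps):
    """Smallest T_GO allowed by Eqs. (5)-(6), assuming T_D spans [T_bit_min, T_bit_max]."""
    t_bit_min, t_bit_max = bit_time_range_ps(rate_min_gbps, rate_max_gbps)
    return 2 * (t_bit_max - t_bit_min)


def satisfies_constraints(t_go, t_d_min, t_d_max, t_bit_min, t_bit_max):
    """Check Eq. (5) and Eq. (6) for a candidate gated-oscillator period (all in ps)."""
    eq5 = t_bit_max - t_go / 2 <= t_d_min
    eq6 = t_d_max <= t_bit_min + t_go / 2
    return eq5 and eq6


if __name__ == "__main__":
    t_min, t_max = bit_time_range_ps(1.7, 3.125)
    print(f"T_bit range: {t_min:.0f} ps to {t_max:.0f} ps")
    print(f"Minimum T_GO: {min_t_go_ps(1.7, 3.125):.0f} ps")
    print("T_GO = 2 ns OK:", satisfies_constraints(2000.0, t_min, t_max, t_min, t_max))
```

With the chosen $T_{GO}$ of 2 ns, both inequalities hold with a wide margin, whereas a period shorter than about 536 ps would violate them.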

The frequency resolution of the GFD is determined by the variable delay time, $T_{LSB}$, corresponding to the least significant bit (LSB) in the DCDL. Because the binary search algorithm can only guarantee that the steady-state error is less than $\pm 1$ LSB, the value of $T_{LSB}$ should be small enough that the residual frequency error does not exceed the capture range of the PD. The maximum frequency error, $\Delta f$, of the GFD can be expressed as

$$\Delta f = \frac{1}{2(T_{bit} - T_{LSB})} - \frac{1}{2T_{bit}} \cong \frac{T_{LSB}}{2T_{bit}^2}. \tag{7}$$



Fig. 5 Phase detector with MUX in [1].



Fig. 6 Current mode XOR gates [1] with modified charge pump.

If $T_{LSB}$ is 30 ps, the maximum frequency error is 43 MHz at a data rate of 1.7 Gbps, and the analog tuning range of the DVCO has to cover it. The accuracy of the GFD can be improved by reducing $T_{LSB}$, but the number of bits in the digital loop must then be increased to maintain the same tuning range, and the digital lock time becomes longer. In wide data-rate applications, the frequency accuracy is better at low data rates.
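The 43 MHz figure and the dependence on data rate can be reproduced from Eq. (7). The sketch below evaluates both the exact difference and the first-order approximation; the function name is hypothetical, and the 30 ps value of $T_{LSB}$ is the one quoted in the text.

```python
def freq_error_hz(rate_gbps, t_lsb_ps=30.0):
    """Return (exact, approximate) maximum frequency error from Eq. (7), in Hz."""
    t_bit = 1e-9 / rate_gbps        # bit time in seconds
    t_lsb = t_lsb_ps * 1e-12        # LSB delay step in seconds
    exact = 1 / (2 * (t_bit - t_lsb)) - 1 / (2 * t_bit)
    approx = t_lsb / (2 * t_bit ** 2)
    return exact, approx


if __name__ == "__main__":
    for rate in (1.7, 2.5, 3.125):
        exact, approx = freq_error_hz(rate)
        print(f"{rate} Gbps: exact={exact / 1e6:.1f} MHz, "
              f"approx={approx / 1e6:.1f} MHz")
```

Since the approximate error scales as $1/T_{bit}^2$, the same delay step produces a smaller frequency error at 1.7 Gbps than at 3.125 Gbps, which is why the accuracy is better at low data rates.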

A linear half-rate PD [1] is employed in this work, as shown in Fig. 5. Conventional static and dynamic logic cannot perform phase detection at such a high speed. To extract high-speed phase information, all logic components in Fig. 5 are implemented in current mode logic (CML) [12]. The current mode XOR gates [1] with the modified charge pump are implemented as shown in Fig. 6 to achieve high-speed operation. The symmetric configuration reduces the loading mismatch of the phase detector.

## 3. Experimental Results

The proposed CDR has been fabricated in a standard 0.25-$\mu$m CMOS technology and occupies a chip area of 1.5 × 1.8 mm² including the on-chip loop filter. Figure 7 shows the die photograph. The CDR consumes 200 mW from a single 2.5 V supply at a data rate of 3.125 Gbps. To make sure that the proposed GFD captures the correct period of one data bit, the differential input data is held at logic zero while the signal *Start* is switched from low to high by off-chip manual control. After *Start* is enabled, the data pattern is changed from the fixed logic zero to the pre-programmed $2^7 - 1$ PRBS pattern. The first two bits of the chosen data pattern are a logic one followed by a logic zero. The pattern of the other bits does not affect the operation of the GFD. Figure 8 illustrates the measured transient response of the GFD at a data rate of 2.5 Gbps. The operating frequency of the digital GFD is 500 MHz. The *PowerDown* signal of the GFD switches from logic one to logic zero, and thus turns off the GOs and the DCDL, after 5 cycles of the GO.

Fig. 7 Die photo of the proposed CDR.

**Fig. 8** Measured transient response for the GFD at the data rate of 2.5 Gbps.

Figures 9(a) and 9(b) show the retimed data and clock when the CDR locks to 1.7 Gbps and 3.125 Gbps NRZ data with a $2^7 - 1$ PRBS, respectively. To reduce the number of bonding pads required for measurement, the differential retimed clock and data are converted to single-ended signals by differential-to-single-ended buffers. These buffers degrade the CMRR and PSRR performance and contribute noise and jitter. Furthermore, the rail-to-rail input signals of the open-drain buffers cause large ripples when the output signals are at the maximum or minimum voltage. The measured jitter histograms are shown in Fig. 10. The measured rms and peak-to-peak jitters from 1.7 Gbps to 3.125 Gbps are below 7.4 ps and 62.2 ps, respectively. Figure 11 shows the measured jitter transfer function at a data rate of 2.5 Gbps. The measured loop bandwidth is around 1 MHz without large off-chip capacitors, and the measured jitter transfer functions for all data rates are almost the same, because $K_{VCO}$ is small and nearly constant under different digital control codes. The bit-error rate (BER) testing time was set to 20 minutes, and no error occurred. The measured BERs are all smaller than $10^{-12}$ for data rates from 1.7 Gbps to 3.125 Gbps with the $2^7 - 1$ PRBS. The maximum tolerable length of consecutive 1's or 0's with the BER lower than $10^{-12}$ is 15. Table 1 gives the performance summary of this work.

Fig. 9 Retimed data and clock at the data rate of (a) 1.7 Gbps and (b) 3.125 Gbps.
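To put the 20-minute error-free BER test in perspective, the sketch below counts the bits received at each data rate and applies the standard zero-error confidence estimate, $CL = 1 - e^{-N \cdot BER}$. This statistical estimate is an assumption added here for illustration and is not part of the reported measurement procedure.

```python
import math

def ber_test(rate_gbps, minutes=20.0, ber_target=1e-12):
    """Return (bits observed, confidence that BER < ber_target) for an error-free run.

    Uses the zero-error estimate CL = 1 - exp(-N * BER), which assumes
    independent bit errors (an assumption of this sketch).
    """
    n_bits = rate_gbps * 1e9 * minutes * 60
    confidence = 1 - math.exp(-n_bits * ber_target)
    return n_bits, confidence


if __name__ == "__main__":
    for rate in (1.7, 2.5, 3.125):
        n_bits, conf = ber_test(rate)
        print(f"{rate} Gbps: {n_bits:.2e} bits, confidence {conf:.1%}")
```

Under this estimate, a 20-minute run observes more than $2 \times 10^{12}$ bits even at the lowest data rate, so an error-free result is consistent with a BER below $10^{-12}$.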

## 4. Conclusions

A 1.7–3.125 Gbps CDR circuit is realized in a standard 0.25-$\mu$m CMOS technology, including the passive on-chip loop filter. The DVCO, combined with the proposed wide-range GFD, achieves both a small $K_{VCO}$ and a wide operation range without the harmonic locking issue. This dual-loop architecture reduces $K_{VCO}$ and provides another way to decrease the loop bandwidth. It relaxes the constraints on the loop parameters and further reduces the area of the on-chip capacitor. Large off-chip capacitors are not required in this design, which makes it suitable for system-on-a-chip (SoC) design. The GFD has a fixed lock time that is independent of the data pattern. All the measured BERs are less than $10^{-12}$ for data rates from 1.7 Gbps to 3.125 Gbps.

**Fig. 10** Measured jitter histograms at the data rate of (a) 1.7 Gbps and (b) 3.125 Gbps.

**Fig. 11** Measured jitter transfer at the data rate of 2.5 Gbps.

Table 1 Performance summary.

| Parameter         | Item            | Value                      |
|-------------------|-----------------|----------------------------|
| Technology        |                 | Standard 0.25-$\mu$m 1P5M CMOS |
| Power supply      |                 | Single 2.5 V               |
| Chip area         |                 | 1.5 mm × 1.8 mm            |
| Power consumption | DVCO            | 43–87 mW                   |
|                   | PD + CP         | 15 mW                      |
|                   | Digital buffers | 57–98 mW                   |
|                   | GFD             | 125 mW                     |
|                   | Total (GFD off) | 115 mW @ 1.7 Gbps          |
|                   |                 | 157 mW @ 2.5 Gbps          |
|                   |                 | 200 mW @ 3.125 Gbps        |
| VCO range         |                 | 0.78–1.8 GHz               |
| Loop BW           |                 | 1 MHz                      |
| $K_{VCO}$         |                 | -50 MHz/V                  |
| CP current        |                 | 50 $\mu$A                  |
| Loop filter       | $R_P$           | 1.5 kΩ                     |
|                   | $C_P$           | 255 pF                     |
|                   | $C_S$           | 60 pF                      |
| RMS jitter        |                 | 7.4 ps @ 1.7 Gbps          |
|                   |                 | 7.5 ps @ 2.5 Gbps          |
|                   |                 | 6.7 ps @ 3.125 Gbps        |
| Peak-to-peak jitter |               | 62.2 ps @ 1.7 Gbps         |
|                   |                 | 61 ps @ 2.5 Gbps           |
|                   |                 | 60 ps @ 3.125 Gbps         |

## Acknowledgement

The authors would like to thank Chip Implementation Center (CIC), Taiwan, for fabricating this chip. This work was supported in part by MediaTek Inc.

## References

- [1] J. Savoj and B. Razavi, "A 10-Gb/s CMOS clock and data recovery circuit with a half-rate linear phase detector," IEEE J. Solid-State Circuits, vol.36, no.5, pp.761–767, May 2001.
- [2] S.H. Lee, M.S. Hwang, Y. Choi, S. Kim, Y. Moon, B.J. Lee, D.K. Jeong, W. Kim, Y.J. Park, and G. Ahn, "A 5 Gb/s 0.25 μm CMOS jitter-tolerant variable-interval oversampling clock/data recovery circuit," IEEE J. Solid-State Circuits, vol.37, no.12, pp.1822–1830, Dec. 2002.
- [3] J.E. Rogers and J.R. Long, "A 10 Gb/s CDR/DEMUX with LC delay line VCO in 0.18-μm CMOS," IEEE J. Solid-State Circuits, vol.37, no.5, pp.1781–1789, May 2002.
- [4] S.J. Song, S.M. Park, and H.J. Yoo, "A 4-Gb/s CMOS clock and data recovery circuit using 1/8-rate clock technique," IEEE J. Solid-State Circuits, vol.38, no.6, pp.1213–1219, July 2003.
- [5] M. Ramezani and C.A.T. Salama, "A 10 Gb/s CDR with a half-rate bang-bang phase detector," IEEE International Symposium on Circuits and Systems, vol.II, pp.181–184, May 2003.
- [6] J. Savoj and B. Razavi, "A 10-Gb/s CMOS clock and data recovery circuit with a half-rate binary phase/frequency detector," IEEE J. Solid-State Circuits, vol.38, no.1, pp.13–21, Jan. 2003.
- [7] B. Razavi, Design of Integrated Circuits for Optical Communications, International Edition, McGraw-Hill, New York, 2003.
- [8] A. Pottbacker, U. Langmann, and H. Schreiber, "A Si bipolar phase and frequency detector IC for clock extraction up to 8 Gb/s," IEEE J. Solid-State Circuits, vol.27, no.12, pp.1747–1751, Dec. 1992.
- [9] B. Stilling, "Bit rate and protocol independent clock and data recovery," Electron. Lett., vol.36, pp.824–825, April 2000.
- [10] R.-J. Yang, S.-P. Chen, and S.-I. Liu, "A 3.125 Gbps clock and data recovery circuit for 10-Gbase-LX4 Ethernet," IEEE J. Solid-State Circuits, vol.39, no.8, pp.1356–1360, Aug. 2004.

- [11] A. Rossi and G. Fucilli, "Nonredundant successive approximation register for A/D converters," Electron. Lett., vol.32, pp.1055–1057, June 1996.
- [12] M.M. Green and U. Singh, "Design of CMOS CML circuits for high-speed broadband communications," IEEE International Symposium on Circuits and Systems, vol.II, pp.204–207, May 2003.



Rong-Jyi Yang was born in Taipei, Taiwan, R.O.C., in 1973. He received the B.S. degree in electrical engineering from National Central University, Jhongli, Taiwan, R.O.C., in 1998. He is currently working toward the Ph.D. degree in electronics engineering, National Taiwan University, Taipei, Taiwan, R.O.C. His research interests include phase-locked loops, delay-locked loops, and high-speed CMOS data recovery circuits for multiple gigabit communication.



Shen-Iuan Liu was born in Keelung, Taiwan, Republic of China, in 1965. He received both the B.S. and Ph.D. degrees in electrical engineering from National Taiwan University (NTU), Taipei, in 1987 and 1991, respectively. During 1991–1993 he served as a second lieutenant in the Chinese Air Force. During 1991–1994, he was an Associate Professor in the Department of Electronic Engineering of National Taiwan Institute of Technology. He joined the Department of Electrical Engineering, NTU, Taipei, Taiwan, in 1994 and has been a Professor since 1998. He received the Engineering Paper Award from the Chinese Institute of Engineers in 2003. He received the Young Professor Teaching Award from MXIC Inc., the Research Achievement Award from NTU, and the Outstanding Research Award from the National Science Council in 2004. His research interests are in analog and digital integrated circuits and systems. Dr. Liu has served as chair of the IEEE SSCS Taipei Chapter since 2004. He served as general chair of the 15th VLSI Design/CAD Symposium, Taiwan, 2004, and as Program Co-chair of the Fourth IEEE Asia-Pacific Conference on Advanced System Integrated Circuits, Japan, 2004. He is a senior member of IEEE.