# Design and Analysis of a 20-GHz Clock Multiplication Unit in 0.18-µm CMOS Technology

Jri Lee and Shanghann Wu

National Taiwan University, Taipei, Taiwan

# Abstract

A 20-GHz clock multiplication unit for SONET OC-768 systems employs dual loops and a third-order loop filter to suppress jitter. Realized in 0.18-$\mu$m CMOS technology, the circuit achieves an output jitter of 0.2 ps,rms and 4.5 ps,pp while consuming 40 mW from a 1.8-V supply.

## I. INTRODUCTION

The continuous growth of broadband data communications has been driving optical systems to 40 Gb/s. Recently, multiplexer/clock multiplication unit (MUX/CMU) circuits for OC-768 systems have been realized in SiGe technology [1], [2], but they usually require high supply voltages and a few watts of power.

This paper presents the design, analysis, and experimental verification of a 20-GHz PLL-based CMU capable of providing clocks for a 40-Gb/s MUX. Fabricated in 0.18-$\mu$m CMOS technology, this circuit achieves an output jitter of 0.2 ps,rms and 4.5 ps,pp, and a power dissipation of 40 mW from a 1.8-V supply.

The next section describes the architecture and building blocks of the CMU circuit. Section III examines design considerations and Section IV summarizes the experimental results.

## II. ARCHITECTURE AND BUILDING BLOCKS

### A. Architecture

Figure 1 depicts the proposed CMU architecture. In contrast to a design based on a type-IV phase/frequency detector (PFD), phase and frequency detection are split into separate loops to minimize jitter while maintaining a wide acquisition range. Here, the phase detector (PD) is merged with its V-to-I converter. The frequency detector drives the VCO frequency toward the desired value and disables itself when the loop is locked. A third-order loop filter is employed to suppress the ripple on the control line. The VCO is followed by a chain of frequency dividers ($\div 32$), which generates the 625-MHz quadrature clocks used for phase and frequency comparison. The loop bandwidth is chosen such that the total jitter caused by the reference and VCO noise reaches a minimum [2].
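As a quick check on the frequency plan described above, the short Python sketch below divides the nominal VCO frequency by the feedback ratio. The names are illustrative, and 20 GHz is the nominal value quoted in the text rather than a measured one.

```python
# Nominal frequency plan of the CMU (values quoted in the text).
F_VCO_HZ = 20e9      # nominal VCO output frequency
DIVIDE_RATIO = 32    # total division in the feedback path


def reference_frequency(f_vco_hz: float, divide_ratio: int) -> float:
    """Frequency at the divider output, which equals the reference frequency in lock."""
    return f_vco_hz / divide_ratio


f_ref_hz = reference_frequency(F_VCO_HZ, DIVIDE_RATIO)
print(f"{f_ref_hz / 1e6:.0f} MHz")
```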



Fig. 1. CMU architecture.

### B. Phase Detector/V-to-I Converter

As shown in Fig. 2(a), the phase detector and V-to-I converter are merged to save power and reduce jitter. The quadrature clocks $CK_i$ and $CK_q$ create quarter-period reference pulses, while $CK_q$ and the input reference $CK_{ref}$ provide pulses whose widths are proportional to the phase error [Fig. 2(b)]. As a result, the characteristic of Fig. 2(c) is obtained, and $CK_{ref}$ aligns with $CK_i$ upon lock.

It can be shown that skew between the $I_{up}$ and $I_{down}$ paths ($M_5$-$M_6$ and $M_1$-$M_4$, respectively) disturbs the VCO control line periodically, and that the channel-length modulation of $M_1$-$M_6$ causes control-line ripple as well. In this design, the dimensions of $M_1$-$M_6$ are chosen as a compromise between these two effects so that the deterministic output jitter is minimized.

### C. Frequency Detector

As shown in Fig. 3, the frequency detector (FD) determines the polarity of the beat frequency by comparing the phase relationship between $Q_1$ and $Q_2$ [3]. Here, the V-to-I converter $(V/I)_{FD}$ carries a pump current four times that of the PD to ensure that the FD loop dominates during frequency acquisition. Similar to [4], the FD automatically disables $(V/I)_{FD}$ when the loop is locked, so that it injects no current into the loop filter.
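The polarity decision can be illustrated with a purely behavioral Python sketch. It assumes that $Q_1$ and $Q_2$ are ideal square-wave beat signals in quadrature and that $Q_2$ is sampled on the rising edges of $Q_1$, which is one common way to compare their phase relationship; the circuit of Fig. 3 may differ in detail. The function name `beat_polarity` and the sampling parameters are hypothetical.

```python
import math


def beat_polarity(delta_f_hz, f_sample_hz=1e9, n_samples=20000):
    """Return +1, -1, or 0 for a positive, negative, or zero beat frequency.

    Behavioral model only: Q1 = sign(sin(theta)) and Q2 = sign(cos(theta)),
    where theta is the beat phase. Q2 is read on each rising edge of Q1.
    """
    dt = 1.0 / f_sample_hz
    prev_q1 = None
    votes = 0
    for n in range(n_samples):
        theta = 2.0 * math.pi * delta_f_hz * n * dt
        q1 = math.sin(theta) >= 0.0
        q2 = math.cos(theta) >= 0.0
        if prev_q1 is not None and q1 and not prev_q1:  # rising edge of Q1
            votes += 1 if q2 else -1
        prev_q1 = q1
    return (votes > 0) - (votes < 0)


print(beat_polarity(+1e6), beat_polarity(-1e6), beat_polarity(0.0))
```

With no frequency error the model sees no beat edges and returns 0, which loosely mirrors the FD injecting no current once the loop is locked.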

### D. VCO and Divider

As illustrated in Fig. 4(a), the 20-GHz VCO is realized as an LC oscillator with a differentially-stacked inductor that achieves a higher self-resonance frequency ($f_{SR}$) and quality factor ($Q$) simultaneously [5]. To reduce the capacitive coupling to the substrate, a ground shield made of polysilicon sticks with minimum gap width is placed underneath the spirals in the direction perpendicular to the current flow.

Fig. 2. (a) Phase detector and V-to-I converter, (b) timing diagram, and (c) its characteristic.

Fig. 3. Frequency detector.

The first divide-by-2 circuit is implemented as a Miller divider with inductive loads [6], as depicted in Fig. 4(b). Simulation shows that this topology achieves an operation range of 7 GHz, well exceeding the VCO tuning range.

## III. CONSIDERATIONS

### A. Reference Feedthrough

The sources of control-line ripple include current mismatch and pulse skew in the V-to-I converter. Synchronized with the input reference clock, this ripple modulates the VCO frequency and translates directly into clock jitter.

Consider a periodic ripple, $V_m \cos \omega_{ref} t$, superimposed on the control voltage of a locked loop [Fig. 5(a)]. The excess phase caused by the ripple is given by

$$\Delta\phi(t) = \int_0^t K_{VCO} V_m \cos\omega_{ref} \tau \, d\tau = \frac{K_{VCO} V_m}{\omega_{ref}} \sin\omega_{ref} t \tag{1}$$

Fig. 5. Jitter due to control-line ripple.

Noting that (absolute) jitter is defined as the deviation of the zero-crossing point of the output clock, we arrive at

$$\Delta T(t) = \frac{\Delta \phi(t)}{M\omega_{ref}} = \frac{K_{VCO}V_m}{M\omega_{ref}^2} \sin \omega_{ref} t, \tag{2}$$

where $M$ denotes the divide ratio. As illustrated in Fig. 5(b), the zero-crossing point "waggles" around the *average* point with a frequency of $\omega_{ref}/2\pi$. For a large divide ratio $M$, the rms jitter can be obtained as

$$(\Delta T)_{rms}^2 = \frac{\omega_{ref}}{2\pi} \int_0^{2\pi/\omega_{ref}} \frac{K_{VCO}^2 V_m^2}{M^2 \omega_{ref}^4} \sin^2 \omega_{ref} t \, dt. \tag{3}$$

It follows that

$$(\Delta T)_{rms} = \frac{K_{VCO}V_m}{\sqrt{2}M\omega_{ref}^2}. \tag{4}$$

Since the excess phase reaches its extremes at $t = (2k + 1)\pi/(2\omega_{ref})$, where $k = 0, 1, 2, \ldots$, the peak-to-peak jitter can be calculated as

$$(\Delta T)_{pp} = \frac{2K_{VCO}V_m}{M\omega_{ref}^2}. \tag{5}$$

Equations (4) and (5) reveal that the jitter caused by reference feedthrough is proportional to the ripple amplitude $V_m$, which shows the advantage of higher-order loop filters that attenuate the control-line disturbance without degrading stability.
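To make the scaling in Eqs. (4) and (5) concrete, the following Python sketch evaluates both closed forms and cross-checks the rms value by integrating Eq. (3) numerically. The VCO gain and ripple amplitude are hypothetical values chosen only for illustration; only $M = 32$ and the 625-MHz reference come from the text.

```python
import math


def ripple_jitter(k_vco, v_m, m, omega_ref):
    """Return (rms, peak-to-peak) jitter in seconds from Eqs. (4) and (5).

    k_vco is in rad/s/V, v_m in volts, omega_ref in rad/s.
    """
    amplitude = k_vco * v_m / (m * omega_ref ** 2)  # peak value of Eq. (2)
    return amplitude / math.sqrt(2.0), 2.0 * amplitude


def rms_by_integration(k_vco, v_m, m, omega_ref, steps=1000):
    """Evaluate Eq. (3) with a midpoint rule over one reference period."""
    period = 2.0 * math.pi / omega_ref
    dt = period / steps
    amplitude = k_vco * v_m / (m * omega_ref ** 2)
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += (amplitude * math.sin(omega_ref * t)) ** 2 * dt
    return math.sqrt(total / period)


# Hypothetical operating point (K_VCO and V_M are assumptions, not paper data).
K_VCO = 2.0 * math.pi * 1e9        # rad/s/V, assumed
V_M = 1e-3                         # 1-mV ripple amplitude, assumed
M = 32                             # divide ratio from the text
OMEGA_REF = 2.0 * math.pi * 625e6  # reference frequency from the text

rms, pp = ripple_jitter(K_VCO, V_M, M, OMEGA_REF)
rms_check = rms_by_integration(K_VCO, V_M, M, OMEGA_REF)
print(f"rms = {rms * 1e15:.2f} fs, pp = {pp * 1e15:.2f} fs, "
      f"numeric rms = {rms_check * 1e15:.2f} fs")
```

Halving `V_M` halves both results, which is the proportionality that motivates attenuating the ripple with a higher-order filter.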

### B. Acquisition Range

The acquisition (or capture) range is defined as the maximum frequency deviation that a PLL can tolerate and still relock after a period of settling. As shown in Section II, the proposed phase detector exhibits a periodic characteristic [Fig. 2(c)], implying a finite capture range. This is because, during loop settling, a large frequency deviation may lead to a phase error greater than $\pi$. The V-to-I converter then produces a current of opposite polarity, further exacerbating the deviation. As a result, the loop falls out of lock and a frequency acquisition loop must be activated to restore lock.

Fig. 6. Acquisition range calculation.

To further investigate the acquisition behavior, we model the PLL as a second-order system with a linear PD characteristic, as shown in Fig. 6(a). The transfer function is thus given by

$$\frac{\phi_{out}}{\phi_{in}}(s) = \frac{M(2\zeta\omega_n s + \omega_n^2)}{s^2 + 2\zeta\omega_n s + \omega_n^2},\tag{6}$$

where $\omega_n = [K_{VCO}I_P/(2\pi CM)]^{1/2}$ and $\zeta = (R/2)[K_{VCO}I_PC/(2\pi M)]^{1/2}$. Suppose the loop is locked properly for $t < 0$, and the reference frequency $\omega_{ref}$ jumps abruptly to $\omega_{ref} + \Delta \omega$ at $t = 0$. The output phase would "track" the curve of $M(\omega_{ref} + \Delta \omega)t$ to minimize the phase error [Fig. 6(b)]. However, for the loop to relock, the maximum phase deviation, $\Delta \phi_{max}$, must not exceed $M\pi$. With $\phi_{in}(t) = (\omega_{ref} + \Delta \omega)t$, $\phi_{out}(t)$ can be derived as

$$\phi_{out}(t) = \frac{M\Delta\omega}{2\omega_n\sqrt{\zeta^2 - 1}}(e^{k_1t} - e^{k_2t}) + M(\omega_{ref} + \Delta\omega)t, \quad (7)$$

where  $k_1 = -\omega_n(\zeta + \sqrt{\zeta^2 - 1})$  and  $k_2 = -\omega_n(\zeta - \sqrt{\zeta^2 - 1})$ . It can be proven that  $\Delta \phi_{max}$  occurs at  $t = t_1$ , where  $\phi'_{out}(t_1) = M(\omega_{ref} + \Delta \omega)$ . It follows that

$$t_1 = \frac{1}{2\omega_n \sqrt{\zeta^2 - 1}} \ln \frac{\zeta + \sqrt{\zeta^2 - 1}}{\zeta - \sqrt{\zeta^2 - 1}}. \tag{8}$$
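The location of the peak phase error can be checked numerically. The Python sketch below evaluates the phase error implied by Eq. (7), $M(\omega_{ref} + \Delta\omega)t - \phi_{out}(t)$, on a fine time grid and compares the time of its maximum with Eq. (8). The loop parameters are hypothetical and serve only to exercise the formulas for an overdamped loop ($\zeta > 1$).

```python
import math


def phase_error(t, d_omega, omega_n, zeta, m):
    """Output-referred phase error M*phi_in(t) - phi_out(t), taken from Eq. (7)."""
    root = math.sqrt(zeta ** 2 - 1.0)
    k1 = -omega_n * (zeta + root)
    k2 = -omega_n * (zeta - root)
    return m * d_omega * (math.exp(k2 * t) - math.exp(k1 * t)) / (2.0 * omega_n * root)


def t_peak(omega_n, zeta):
    """Time of maximum phase error according to Eq. (8)."""
    root = math.sqrt(zeta ** 2 - 1.0)
    return math.log((zeta + root) / (zeta - root)) / (2.0 * omega_n * root)


# Hypothetical loop parameters (assumptions, not values from the paper).
OMEGA_N = 2.0 * math.pi * 1e6   # natural frequency, rad/s
ZETA = 2.0                      # damping factor (overdamped)
M = 32                          # divide ratio from the text
D_OMEGA = 2.0 * math.pi * 1e5   # reference frequency step, rad/s

t1 = t_peak(OMEGA_N, ZETA)

# Brute-force scan of the phase error over ten times the predicted peak time.
N_POINTS = 200000
t_end = 10.0 * t1
times = [i * t_end / N_POINTS for i in range(N_POINTS + 1)]
t_scan = max(times, key=lambda t: phase_error(t, D_OMEGA, OMEGA_N, ZETA, M))

print(f"Eq. (8): {t1 * 1e9:.2f} ns, grid search: {t_scan * 1e9:.2f} ns")
```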

To ensure relocking, we must have

$$M(\omega_{ref} + \Delta\omega)t_1 - \phi_{out}(t_1) < M\pi, \qquad (9)$$

and the capture range is obtained as

$$|\Delta\omega| < \pi\omega_n (\zeta - \sqrt{\zeta^2 - 1}) \left(\frac{\zeta - \sqrt{\zeta^2 - 1}}{\zeta + \sqrt{\zeta^2 - 1}}\right)^{-\frac{\zeta + \sqrt{\zeta^2 - 1}}{2\sqrt{\zeta^2 - 1}}}. \tag{10}$$

For a heavily overdamped system,  $\zeta \gg 1$  and Eq. (10) becomes

$$|\Delta\omega| < 2\pi\zeta\omega_n. \tag{11}$$

As expected, this value is commensurate with the loop bandwidth. In practice, the proposed PD achieves a smaller capture range simply due to the nonlinear region from  $\pi/2$  to  $\pi$  of the  $I_{av}$ - $\Delta\phi$  characteristic.
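How quickly Eq. (10) approaches the overdamped limit of Eq. (11) can be seen by evaluating both for increasing $\zeta$. The Python sketch below does this with $\omega_n$ normalized to 1 rad/s; the function names and the chosen damping factors are illustrative only, and the model is the linear-PD approximation used in the derivation.

```python
import math


def capture_range_exact(omega_n, zeta):
    """Right-hand side of Eq. (10), valid for zeta > 1."""
    root = math.sqrt(zeta ** 2 - 1.0)
    lo = zeta - root
    hi = zeta + root
    return math.pi * omega_n * lo * (lo / hi) ** (-hi / (2.0 * root))


def capture_range_overdamped(omega_n, zeta):
    """Right-hand side of Eq. (11), the zeta >> 1 approximation."""
    return 2.0 * math.pi * zeta * omega_n


OMEGA_N = 1.0  # normalized natural frequency, rad/s
ratios = {}
for zeta in (1.5, 3.0, 10.0, 30.0):
    exact = capture_range_exact(OMEGA_N, zeta)
    approx = capture_range_overdamped(OMEGA_N, zeta)
    ratios[zeta] = exact / approx
    print(f"zeta = {zeta:5.1f}: Eq. (10) / Eq. (11) = {ratios[zeta]:.4f}")
```

The ratio falls toward unity as $\zeta$ grows, so Eq. (11) slightly underestimates the linear-model bound for moderately damped loops.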

## IV. EXPERIMENTAL RESULTS

The CMU circuit has been fabricated in 0.18-$\mu$m CMOS technology. Figure 7(a) shows the die photograph, which measures $0.8 \times 0.8 \text{ mm}^2$ including pads. The loop filter is built on chip to avoid possible sources of noise due to wirebonding. The chip is tested on a high-speed probe station. The on-chip output lines are designed as 50-$\Omega$ transmission lines to absorb the routing capacitance, and skew and jitter are minimized through symmetric layout and balanced routing. The circuit consumes 40 mW (excluding buffers) from a 1.8-V supply. Figure 8 shows the 20-GHz VCO tuning characteristic, which indicates a tuning range of 1.6 GHz.<sup>1</sup> The output clock is plotted in Fig. 9, suggesting an rms and peak-to-peak jitter of 0.87 and 4.5 ps, respectively. After de-embedding the jitter contributed by the oscilloscope (shown in the inset), the rms jitter is 0.2 ps and the peak-to-peak jitter is less than 4.5 ps. A 50% duty cycle is observed on the output clock. The output spectrum under locked condition is depicted in Fig. 10. Integration of the spectrum verifies the jitter measurement. Table I summarizes the performance of this circuit.

Fig. 7. Chip micrograph.
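The text does not spell out how the oscilloscope jitter is de-embedded, nor does it quote the instrument's own jitter. A common approach, assumed here, is to treat the two rms contributions as uncorrelated so that they add in quadrature. The Python sketch below applies that assumption to the two rms figures quoted above and reports the instrument jitter they would imply; that implied number is a consequence of the assumption, not a measured value from the paper. Peak-to-peak values do not subtract this way and are left out.

```python
import math


def deembed_rms(measured_ps: float, instrument_ps: float) -> float:
    """Remove instrument jitter from a measured rms value.

    Assumes the two contributions are uncorrelated and add in quadrature.
    """
    if instrument_ps > measured_ps:
        raise ValueError("instrument jitter exceeds the measured jitter")
    return math.sqrt(measured_ps ** 2 - instrument_ps ** 2)


MEASURED_RMS_PS = 0.87  # raw rms reading quoted in the text
REPORTED_RMS_PS = 0.2   # de-embedded rms value quoted in the text

# Instrument jitter implied by those two numbers under the quadrature assumption.
implied_scope_ps = math.sqrt(MEASURED_RMS_PS ** 2 - REPORTED_RMS_PS ** 2)
print(f"implied oscilloscope jitter: {implied_scope_ps:.2f} ps rms")
```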





Fig. 9. Clock jitter measurement (horizontal scale: 2 ps/div, vertical scale: 10 mV/div).

<sup>1</sup> In a redesign, the VCO frequency should be raised by 5%.

Fig. 10. Output spectrum under locked condition.

| Parameter                   | Value           |
|-----------------------------|-----------------|
| Output Freq.                | 20 GHz          |
| Multiply Ratio              | 32              |
| Clock Jitter (rms)          | 0.2 ps          |
| Clock Jitter (peak-to-peak) | < 4.5 ps        |
| Power Diss.                 | 40 mW           |
| Supply                      | 1.8 V           |
| Chip Area                   | 0.8 mm × 0.8 mm |
| Technology                  | 0.18-μm CMOS    |

Table I. Performance summary.

# REFERENCES

- [1] Mounir Meghelli et al., "A 0.18-μm SiGe BiCMOS Receiver and Transmitter Chipset for SONET OC-768 Transmission Systems," *IEEE J. Solid-State Circuits*, vol. 38, pp. 2147-2154, Dec. 2003.
- [2] Hai Tao et al., "40–43-Gb/s OC-768 16:1 MUX/CMU Chipset with SFI-5 Compliance," *IEEE J. Solid-State Circuits*, vol. 38, pp. 2169-2180, Dec. 2003.
- [3] A. Pottbacker et al., "A Si Bipolar Phase and Frequency Detector IC for Clock Extraction up to 8 Gb/s," *IEEE J. Solid-State Circuits*, vol. 27, pp. 1747-1751, Dec. 1992.
- [4] Remco C. H. van de Beek et al., "A 2.5–10-GHz Clock Multiplier Unit with 0.22-ps RMS Jitter in Standard 0.18-μm CMOS," *IEEE J. Solid-State Circuits*, vol. 39, pp. 1862-1872, Nov. 2004.
- [5] Jri Lee et al., "A 20-Gb/s 2-to-1 MUX and a 40-GHz VCO in 0.18-μm CMOS Technology," submitted to *the 2005 Symposium on VLSI Circuits*.
- [6] Jri Lee and Behzad Razavi, "A 40-GHz Frequency Divider in 0.18-μm CMOS Technology," *IEEE J. Solid-State Circuits*, vol. 39, pp. 594-601, April 2004.