### ECEN720: High-Speed Links Circuits and Systems Spring 2025

### Lecture 11: Clocking Architectures & PLLs



### Sam Palermo Analog & Mixed-Signal Center Texas A&M University

## Announcements

- Lab Report 6 due Apr 3
- Project Preliminary Report due Apr 15
- Project Final Report due Apr 29



## Agenda

- Clocking Architectures
- PLLs
  - Modeling
  - Noise transfer functions

# References

- High-speed link clocking tutorial paper, PLL analysis paper, and PLL thesis posted on website
- Posted PLL models in project section
- Website has additional links on PLL and jitter tutorials
- Majority of today's PLL material comes from Fischette tutorial and M. Mansuri's PhD thesis (UCLA)

## High-Speed Electrical Link System



# Clocking Terminology



#### Synchronous

- Every chip gets same frequency AND phase
- Used in low-speed busses

#### Mesochronous

- Same frequency, but unknown phase
- Requires phase recovery circuitry
  - Can do with or without full CDR
- Used in fast memories, internal system interfaces, MAC/Packet interfaces

#### Plesiochronous

- Almost the same frequency, resulting in slowly drifting phase
- Requires CDR
- Widely used in high-speed links

#### Asynchronous

- No clocks at all
- Request/acknowledge handshake procedure
- Used in embedded systems, Unix, Linux

# I/O Clocking Architectures

- Three basic I/O architectures
  - Common Clock (Synchronous)
  - Forward Clock (Source Synchronous)
  - Embedded Clock (Clock Recovery)
- These I/O architectures are used for varying applications that require different levels of I/O bandwidth
- A processor may have one or all of these I/O types
- Often the same circuitry can be used to emulate different I/O schemes for design reuse

# Common Clock I/O Architecture



- Common in original computer systems
- Synchronous system by design (no active deskew)
- Common bus clock controls chip-to-chip transfers
- Requires equal length routes to chips to minimize clock skew
- Data rates typically limited to ~100Mb/s

# Common Clock I/O Cycle Time

Cycle time to meet setup time

$$\max(T_{clk-A}+T_{Aclk}+T_{drive}+T_{tof}+T_{receive}+T_{setup}) - \min(T_{Bclk}-T_{clk-B}) < T_{cycle}$$

[Krauter]

# Common Clock I/O Limitations

- Difficult to control clock skew and propagation delay
- Need to have tight control of absolute delay to meet a given cycle time
- Sensitive to delay variations in on-chip circuits and board routes
- Hard to compensate for delay variations due to low correlation between on-chip and off-chip delays
- While commonly used in on-chip communication, offers limited speed in I/O applications

# Forward Clock I/O Architecture



- Common high-speed reference clock is forwarded from TX chip to RX chip
  - Mesochronous system
- Used in processor-memory interfaces and multi-processor communication
  - Intel QPI
  - Hypertransport
- Requires one extra clock channel
- "Coherent" clocking allows low-to-high frequency jitter tracking
- Need good clock receive amplifier as the forwarded clock is attenuated by the channel

# Forward Clock I/O Limitations



- Clock skew can limit forward clock I/O performance
  - Driver strength and loading mismatches
  - Interconnect length mismatches
- Low pass channel causes jitter amplification
- Duty-Cycle variations of forwarded clock

# Forward Clock I/O De-Skew



- Per-channel de-skew allows for significant data rate increases
  - Sample clock adjusted to center clock on the incoming data eye
  - Implementations
    - Delay-Locked Loop and Phase Interpolators
    - Injection-Locked Oscillators
  - Phase acquisition can be
    - BER based: no additional input phase samplers
    - Phase detector based: implemented with additional input phase samplers that are periodically powered on

# Forward Clock I/O Circuits



- TX PLL
- TX Clock Distribution
- Replica TX Clock Driver
- Channel
- Forward Clock Amplifier
- RX Clock Distribution
- De-Skew Circuit
  - DLL/PI
  - Injection-Locked Oscillator

# Embedded Clock I/O Architecture



- Can be used in mesochronous or plesiochronous systems
- Clock frequency and optimum phase position are extracted from incoming data stream
- Phase detection continuously running
- CDR Implementations
  - Per-channel PLL-based
  - Dual-loop w/ Global PLL &
    - Local DLL/PI
    - Local Phase-Rotator PLLs

# Embedded Clock I/O Limitations



- Jitter tracking limited by CDR bandwidth
  - Technology scaling allows CDRs with higher bandwidths which can achieve higher frequency jitter tracking
- Generally more hardware than forward clock implementations
  - Extra input phase samplers

# Embedded Clock I/O Circuits



- TX PLL

- TX Clock Distribution
- CDR
  - Per-channel PLL-based
  - Dual-loop w/ Global PLL &
    - Local DLL/PI
    - Local Phase-Rotator PLLs
    - Global PLL requires RX clock distribution to individual channels

## Xilinx 0.5-32Gb/s Transceiver Clocking



| Technology                        | CMOS 16nm FinFET        |
|-----------------------------------|-------------------------|
| Power Supply (Vavcc, Vavtt, Vaux) | 0.9 V, 1.2 V, 1.8 V     |
| Frequency range                   | 500 Mb/s - 32.75 Gb/s   |
| Transceiver Quad area             | 2.625 mm × 2.218 mm     |
| LC PLL range                      | 8-16.375 GHz            |
| Ring PLL range                    | 2-6.25 GHz              |
| TX PRBS7 jitter at 32.75Gb/s      | TJ: 5.39 ps, RJ: 190 fs |
| 32.75Gb/s RX JTOL @ 30MHz         | 0.45 UI                 |
| @ 100MHz                          | 0.6 UI                  |
| Channel loss at 32.75Gb/s         | 30 dB                   |
| Measured BER at 32.75Gb/s         | < 10<sup>-15</sup>      |
| Power at 32.75Gb/s with DFE       | 577mW/ch (17.6pJ/b)     |

### [Upadhyaya VLSI 2016]

- LC-PLL with 2 LC-VCOs used to cover high data rates (8-32Gb/s)
- Ring-PLL used for lower data rates
- CML clock distribution with active inductive loads used for low jitter



## Agenda

- PLL modeling
- PLL noise transfer functions

# PLL Block Diagram



 A phase-locked loop (PLL) is a negative feedback system where an oscillator-generated signal is phase AND frequency locked to a reference signal

# **PLL Applications**

- PLL applications
  - Frequency synthesis
    - Multiplying a 100MHz reference clock to 10GHz
  - Skew cancellation
    - Phase aligning an internal clock to an I/O clock
  - Clock recovery
    - Extract from incoming data stream the clock frequency and optimum phase of high-speed sampling clocks
  - Modulation/De-modulation
    - Wireless systems
    - Spread-spectrum clocking

# PLL Design Challenges

- Board-level reference clock frequencies don't scale often
  - 156MHz is a common frequency
- RX CDR bandwidth is hard to scale with PAM4 signaling and ADC-based front-ends
  - Typically 2-4MHz
- PLL bandwidth must be kept less than 10MHz for stability and to filter reference jitter
- VCO phase noise at low-frequency offsets due to flicker noise must be suppressed

#### 32.75Gbps Transceiver PLL Simulated Jitter Numbers

| Receiver Type   | PLL PN @1MHz | CDR BW  | RJ      | RJ in UI |
|-----------------|--------------|---------|---------|----------|
| Analog based RX | -92.4dBc/Hz  | 12.7MHz | 160.7fs | 5.26mUI  |
| ADC based RX    | -92.4dBc/Hz  | 2MHz    | 407fs   | 13.3mUI  |

### [Turker ISSCC 2019]

# Charge Pump PLL



- Charge pump PLL is a common implementation
- Type-2 (2 integrators) allows for ideally zero phase error between the input and feedback phase
- Requires a stabilizing zero that is realized with the filter resistor
- A secondary capacitor C<sub>2</sub> is often added for additional filtering to reduce reference spurs
- Modeled as a third-order system

## Linear PLL Model



- Phase is the key variable of interest
  - Output phase response to a stimulus injected at a given point in the loop
  - Phase error response is also informative
- Linear "small-signal" analysis is useful for understanding PLL dynamics if
  - PLL is locked (or near lock)
  - Input phase deviation amplitude is small enough to maintain operation in lock range

### Understanding PLL Frequency Response



- Frequency domain analysis can tell us how well the PLL tracks the input phase as it changes at a certain frequency
- PLL transfer function is different depending on which point in the loop the output is responding to

## Linear PLL Model



### 14GHz PLL Closed-Loop Transfer Function

| Parameter        | Value     |
|------------------|-----------|
| F<sub>ref</sub>  | 156.25MHz |
| N                | 90        |
| F<sub>VCO</sub>  | 14GHz     |
| f<sub>u</sub>    | 2MHz      |
| $\Phi_m$         | 60°       |
| f<sub>3dB</sub>  | 3.1MHz    |
| K<sub>VCO</sub>  | 2π*1GHz/V |
| R                | 4kΩ       |
| C<sub>1</sub>    | 74pF      |
| C<sub>2</sub>    | 5.8pF     |
| I<sub>cp</sub>   | 310uA     |



$$H(s) = \frac{\phi_{out}(s)}{\phi_{in}(s)} = \frac{\frac{K_{PD}K_{VCO}}{C_2}\left(s + \frac{1}{RC_1}\right)}{s^3 + \left(\frac{C_1 + C_2}{RC_1C_2}\right)s^2 + \left(\frac{K_{PD}K_{VCO}}{NC_2}\right)s + \frac{K_{PD}K_{VCO}}{NRC_1C_2}}$$
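The closed-loop expression can be checked numerically against the parameter table. The sketch below assumes $K_{PD} = I_{cp}/2\pi$ (the averaged PFD and charge-pump gain) and $K_{VCO}$ in rad/s/V; the function names `closed_loop` and `find_f3db` are illustrative only. With the rounded table values the -3dB point should land close to the listed 3.1MHz, with small differences expected from rounding.

```python
import math

# Values from the 14GHz PLL parameter table
N = 90                        # feedback divider ratio
KVCO = 2 * math.pi * 1e9      # VCO gain, rad/s/V
R, C1, C2 = 4e3, 74e-12, 5.8e-12
ICP = 310e-6
KPD = ICP / (2 * math.pi)     # assumption: averaged PFD + charge-pump gain, A/rad


def closed_loop(f_hz):
    """Evaluate H(s) = phi_out/phi_in at s = j*2*pi*f."""
    s = 2j * math.pi * f_hz
    k = KPD * KVCO
    num = (k / C2) * (s + 1 / (R * C1))
    den = (s**3
           + (C1 + C2) / (R * C1 * C2) * s**2
           + k / (N * C2) * s
           + k / (N * R * C1 * C2))
    return num / den


def find_f3db(f_start=1e4, f_stop=1e8, points=20000):
    """Return the first log-spaced frequency where |H| drops 3dB below the DC gain N."""
    target = N / math.sqrt(2)
    for i in range(points + 1):
        f = f_start * (f_stop / f_start) ** (i / points)
        if abs(closed_loop(f)) < target:
            return f
    return None


if __name__ == "__main__":
    print(f"DC gain ~ {abs(closed_loop(1e3)):.1f}")
    print(f"f_3dB   ~ {find_f3db() / 1e6:.2f} MHz")
```

The DC gain equals N because the output runs at N times the reference frequency, so reference phase deviations are multiplied by the divider ratio.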


## **Common PLL Noise Sources**



## Noise Transfer Functions



- Input reference and charge pump noise is low-pass filtered
- Loop filter noise (VCO input noise) is band-pass filtered
- VCO output phase noise is high-pass filtered
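The three filtering behaviours listed here can be reproduced from the linear model. This is a minimal sketch that reuses the 14GHz example values and assumes $K_{PD} = I_{cp}/2\pi$; the function names are hypothetical, and the loop-filter path is modeled as a voltage noise source at the VCO control input.

```python
import math

# Values from the 14GHz PLL parameter table
N = 90
KVCO = 2 * math.pi * 1e9          # rad/s/V
R, C1, C2 = 4e3, 74e-12, 5.8e-12
KPD = 310e-6 / (2 * math.pi)      # assumption: I_cp / (2*pi), A/rad


def loop_gain(f_hz):
    """LG(s) = K_PD * F(s) * K_VCO / (N * s) with the R, C1, C2 passive filter."""
    s = 2j * math.pi * f_hz
    f_s = (s + 1 / (R * C1)) / (C2 * s * (s + (C1 + C2) / (R * C1 * C2)))
    return KPD * f_s * KVCO / (N * s)


def ntf_input(f_hz):
    """Reference phase to output phase: low-pass with DC gain N."""
    lg = loop_gain(f_hz)
    return N * lg / (1 + lg)


def ntf_loop_filter(f_hz):
    """Voltage noise at the VCO control input to output phase: band-pass."""
    s = 2j * math.pi * f_hz
    return (KVCO / s) / (1 + loop_gain(f_hz))


def ntf_vco(f_hz):
    """VCO output phase noise to output phase: high-pass."""
    return 1 / (1 + loop_gain(f_hz))


if __name__ == "__main__":
    for f in (1e4, 2e6, 1e8):
        print(f"{f:11.0f} Hz  in={abs(ntf_input(f)):8.3f}  "
              f"lf={abs(ntf_loop_filter(f)):8.2f}  vco={abs(ntf_vco(f)):.5f}")
```

Evaluating well below, near, and well above the 2MHz unity-gain frequency is enough to see the low-pass, band-pass, and high-pass shapes.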

## PLL Phase Noise & Jitter

#### [Turker ISSCC 2018]



• PLL time-domain jitter is obtained by integrating the output phase noise

$$\sigma_{j,Total}^{2} = \frac{2}{\omega_{0}^{2}} \int_{f_{start}}^{f_{stop}} S_{\phi_{out}}^{Total}(f) df$$

|                                             | [3]                      | [4]                 | [5]                | [6]                     | This Work              |
|---------------------------------------------|--------------------------|---------------------|--------------------|-------------------------|------------------------|
| PLL Architecture                            | Integer N,<br>SSPD based | FracN,<br>SSPD DPLL | FracN,<br>DPLL     | Integer N,<br>SPD based | Integer N,<br>CP based |
| VCO                                         | LC                       | LC                  | LC                 | LC                      | LC                     |
| Technology                                  | 180nm                    | 28nm                | 14nm FinFET        | 16nm FinFET             | 16nm FinFET            |
| Reference Freq.(MHz)                        | 55.25                    | 40                  | 26                 | 450                     | 500                    |
| Frequency Range (GHz)                       | 2.21                     | 2.7 – 4.3           | 5.38               | 9 - 18                  | 7.4 – 14               |
| Measurement Frequency<br>(GHz)              | 2.21                     | 5.82                | 2.69               | 18                      | 6.25                   |
| Phase Noise @100kHz<br>(dBc/Hz)             | -125<br>(@200kHz)        | -105.5              | -113.6             | -104.1<br>(@200kHz)     | -120                   |
| Phase Noise @1MHz<br>(dBc/Hz)               | -125<br>(from figure)    | -115.4              | -122.45            | -107.3                  | -123.2                 |
| Phase Noise @100kHz<br>(normalized to 1GHz) | - 131.9<br>(@200kHz)     | -120.8              | -122.2             | - 129.2<br>(@200kHz)    | -135.9                 |
| Phase Noise @1MHz<br>(normalized to 1GHz)   | - 131.9                  | -130.7              | - 131              | - 132.4                 | -139.1                 |
| RMS Jitter (fs)                             | 160<br>(10k – 100M)      | 159<br>(10k – 40M)  | 137<br>(10k – 10M) | 164<br>(1k – 100M)      | 53.6<br>(10k – 10M)    |
| Reference Spur (dBc)                        | -56                      | -78                 | -87.6              | N.A                     | -75.5 *                |
| Power (mW)                                  | 2.5                      | 8.2                 | 13.4               | 29.2                    | 45                     |
| Area (mm <sup>2</sup> )                     | 0.2                      | 0.3                 | 0.257              | 0.39                    | 0.35                   |
| FOM <sub>T</sub> (dB)                       | N.A                      | -243.4              | N.A                | -239.3                  | -246.8                 |

\* Including DAC, measured at 1.052GHz DAC output

• We can model an individual noise source's contribution

$$\sigma_{j,i}^2 = \frac{2}{\omega_0^2} \int_{f_{start}}^{f_{stop}} S_i(f) |NTF_i(f)|^2 df$$

$$\sigma_{j,Total}^2 = \sum_i \sigma_{j,i}^2$$
  
RMS Jitter  $\sigma_j = \sqrt{\sigma_{j,Total}^2}$ 
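The integration can be sketched numerically. The example below treats $S(f)$ as a linear phase-noise density in rad²/Hz obtained from a dBc/Hz value, and uses a hypothetical flat -120dBc/Hz profile on a 14GHz carrier; neither the profile nor the function names come from the slides. A per-source noise transfer function can be folded in by passing a callable that returns $S_i(f)|NTF_i(f)|^2$.

```python
import math


def rms_jitter(noise_psd, f0_hz, f_start, f_stop, points=4000):
    """sigma_j = sqrt(2 / w0^2 * integral of S(f) df), trapezoidal rule on a log grid."""
    w0 = 2 * math.pi * f0_hz
    freqs = [f_start * (f_stop / f_start) ** (i / points) for i in range(points + 1)]
    integral = 0.0
    for fa, fb in zip(freqs, freqs[1:]):
        integral += 0.5 * (noise_psd(fa) + noise_psd(fb)) * (fb - fa)
    return math.sqrt(2 * integral) / w0


def flat_noise(f_hz, level_dbc_hz=-120.0):
    """Hypothetical flat phase-noise profile, converted from dBc/Hz to linear units."""
    return 10 ** (level_dbc_hz / 10)


# Integrate from 10kHz to 10MHz on a 14GHz carrier
sigma = rms_jitter(flat_noise, 14e9, 10e3, 10e6)

if __name__ == "__main__":
    print(f"RMS jitter ~ {sigma * 1e15:.1f} fs")
```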

# Wireline Transceiver Jitter Modeling



- Relative jitter (dynamic phase error) between the RX CDR-generated sampling clock and input data sets the system timing margin
- This CDR high-pass response provides additional filtering
- Modeled as a 4MHz 1<sup>st</sup>-order response (IEEE 802.3 & OIF-CEI)

$$\sigma_{jSYS,i}^{2} = \frac{2}{\omega_{0}^{2}} \int_{0}^{\frac{f_{0}}{2}} S_{i}(f) |NTF_{i}(f)|^{2} |CDR(f)|^{2} df$$
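To illustrate why the CDR high-pass matters, the sketch below applies a first-order 4MHz high-pass to a hypothetical $1/f^2$ phase-noise profile (-100dBc/Hz at 1MHz offset on a 14GHz carrier) and compares the numerical result with the closed form for that profile. The profile, the 100Hz lower limit (a log grid cannot start at 0), and all names are assumptions made for this example.

```python
import math


def cdr_highpass_mag2(f_hz, f_cdr_hz=4e6):
    """|CDR(f)|^2 for a first-order high-pass with corner frequency f_cdr."""
    return f_hz**2 / (f_hz**2 + f_cdr_hz**2)


def system_jitter(noise_psd, f0_hz, f_start, f_stop, f_cdr_hz=4e6, points=8000):
    """sqrt(2 / w0^2 * integral of S(f) |CDR(f)|^2 df) using a log-spaced trapezoid."""
    w0 = 2 * math.pi * f0_hz
    total = 0.0
    f_prev = f_start
    g_prev = noise_psd(f_prev) * cdr_highpass_mag2(f_prev, f_cdr_hz)
    for i in range(1, points + 1):
        f = f_start * (f_stop / f_start) ** (i / points)
        g = noise_psd(f) * cdr_highpass_mag2(f, f_cdr_hz)
        total += 0.5 * (g_prev + g) * (f - f_prev)
        f_prev, g_prev = f, g
    return math.sqrt(2 * total) / w0


S_1MHZ = 1e-10  # hypothetical: -100 dBc/Hz at 1MHz offset


def vco_like_noise(f_hz):
    """Hypothetical 1/f^2 phase-noise profile."""
    return S_1MHZ * (1e6 / f_hz) ** 2


F0, F_CDR, F_START = 14e9, 4e6, 100.0
sigma_sys = system_jitter(vco_like_noise, F0, F_START, F0 / 2, F_CDR)

# Closed form for this profile: integral of c / (f^2 + fc^2) is (c / fc) * atan(f / fc)
c = S_1MHZ * 1e12
closed = c / F_CDR * (math.atan((F0 / 2) / F_CDR) - math.atan(F_START / F_CDR))
sigma_closed = math.sqrt(2 * closed) / (2 * math.pi * F0)
```

Without the high-pass term the same profile integrates to a much larger value, because the $1/f^2$ noise is dominated by low offsets that the CDR tracks out.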

# **Input Reference Noise**



#### Phase Noise at 156.25MHz

• Reference jitter  $\sigma_{j,in} = 226 fs_{rms} (10 kHz - 10 MHz)$ 

## **Input Reference Noise**



- After PLL:  $\sigma_{j,in} = 217 fs_{rms} (10 kHz - 10 MHz)$
- Including CDR:  $\sigma_{j,in} = 45 fs_{rms} (100 Hz - 7 GHz)$

# **Charge Pump Noise**





- Charge pump noise current is injected into the loop filter during the PFD reset time
- Transistor noise PSD convolved with pulse frequency spectrum
- White noise scaled by  $(T_{rst}/T_{ref})$ and 1/f noise scaled by  $(T_{rst}/T_{ref})^2$

# Charge Pump Noise



- After PLL:  $\sigma_{j,CP} = 61 fs_{rms} (10 kHz - 10 MHz)$
- Including CDR:  $\sigma_{j,CP} = 22 fs_{rms} (100 Hz - 7 GHz)$

# Loop Filter R Noise



- Trade-off between resistor noise, loop filter capacitor size, and charge pump noise
  - Smaller resistor results in larger capacitors (higher area) and larger charge pump current (higher S<sub>iCP</sub>)

### Loop Filter R Noise



# VCO Noise



LC-VCO phase noise sources

- Finite tank quality factor
- Cross-coupled pair
- Tail current source

# VCO Noise



- After PLL:  $\sigma_{j,VCO} = 257 fs_{rms} (10 kHz - 10 MHz)$
- Including CDR:  $\sigma_{j,VCO} = 125 fs_{rms} (100 Hz - 7 GHz)$

# **Total Noise**



- After PLL:  $\sigma_{j,Total} = 365 fs_{rms} (10 kHz - 10 MHz)$
  - Reference clock noise dominates at low frequency
  - VCO dominates near loop bandwidth and higher
- Including CDR:  $\sigma_{j,Total} = 157 fs_{rms} (100 Hz - 7 GHz)$
  - Now VCO noise clearly dominates total
  - Loop resistor noise is a larger percentage
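Because uncorrelated contributions add in variance, the per-source numbers quoted on the preceding slides can be combined as a root-sum-square. The text gives no jitter figure for the loop-filter resistor, so the sketch below only estimates what remainder the quoted totals would imply, assuming the total is the RSS of independent sources and taking the 125fs figure on the VCO slide as the VCO contribution. This is a consistency check, not a value from the slides.

```python
import math


def rss(values_fs):
    """Root-sum-square of uncorrelated jitter contributions."""
    return math.sqrt(sum(v * v for v in values_fs))


def implied_remainder(total_fs, listed_fs):
    """Jitter not accounted for by the listed sources, assuming an RSS total."""
    return math.sqrt(total_fs**2 - sum(v * v for v in listed_fs))


# Figures quoted in the slides, in fs rms
AFTER_PLL = {"reference": 217.0, "charge_pump": 61.0, "vco": 257.0}
WITH_CDR = {"reference": 45.0, "charge_pump": 22.0, "vco": 125.0}
TOTAL_AFTER_PLL = 365.0
TOTAL_WITH_CDR = 157.0

rest_after_pll = implied_remainder(TOTAL_AFTER_PLL, AFTER_PLL.values())
rest_with_cdr = implied_remainder(TOTAL_WITH_CDR, WITH_CDR.values())

if __name__ == "__main__":
    print(f"listed RSS after PLL: {rss(AFTER_PLL.values()):.1f} fs, "
          f"implied remainder: {rest_after_pll:.1f} fs")
    print(f"listed RSS with CDR:  {rss(WITH_CDR.values()):.1f} fs, "
          f"implied remainder: {rest_with_cdr:.1f} fs")
```

Under these assumptions the remainder grows as a fraction of the total once the CDR is included, which agrees with the remark that loop resistor noise becomes a larger percentage.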

### PLL Noise Transfer Function Take-Away Points

- The way a PLL shapes phase noise depends on where the noise is introduced in the loop
- Optimizing the loop bandwidth for one noise source may enhance other noise sources
- Generally, the PLL low-pass shapes input phase noise, band-pass shapes VCO input voltage noise, and high-pass shapes VCO/clock buffer output phase noise

### **Oscillator Noise**



### **Oscillator Phase Noise Model**



• For improved model see Hajimiri papers

# **Open-Loop VCO Jitter**



- Measure distribution of clock threshold crossings
- Plot  $\sigma$  as a function of delay  $\Delta T$

### **Open-Loop VCO Jitter**



- Jitter  $\sigma$  is proportional to sqrt( $\Delta T$ )
- K is VCO time domain figure of merit

### VCO in Closed-Loop PLL Jitter



• PLL limits $\sigma$ for delays longer than the loop time constant $\tau_L$

$$\tau_L = \frac{1}{2\pi f_L}$$

### Ref Clk-Referenced vs Self-Referenced



- Generally, we care about the jitter w.r.t. the ref. clock ( $\sigma_x$ )
- However, may be easier to measure w.r.t. delayed version of output clk
  - Due to noise on both edges, this will be increased by a sqrt(2) factor relative to the reference clock-referred jitter

### **Converting Phase Noise to Jitter**

[Mansuri]



- RMS jitter for  $\Delta T$  accumulation  $\sigma_{\Delta T}^2 = \frac{8}{\omega_o^2} \int_0^\infty S_{\phi}(f) \sin^2(\pi f \Delta T) df$
- As  $\Delta T$  goes to  $\infty$   $\sigma_T^2 = \frac{2}{\omega_o^2} R_\phi(0) = \frac{2}{\omega_o^2} \int_0^\infty S_\phi(f) df$
- Actual integration range depends on application bandwidth
  - f<sub>min</sub> set by assumed CDR tracking bandwidth
  - $f_{max}$  set by Nyquist frequency ( $f_0/2$ )
- Most exact approach

$$\sigma_T^2 = \frac{2}{\omega_o^2} \int_0^{f_0/2} S_\phi(f) \left| H_{sys}(f) \right|^2 df$$

where $|H_{sys}(f)|^2$ is the system jitter transfer function
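As a worked check of how the accumulation formula relates to the open-loop $\sqrt{\Delta T}$ behaviour, assume a pure $1/f^2$ phase-noise profile $S_\phi(f) = c/f^2$. This profile and the constant $c$ are assumptions introduced for the example. Using the standard integral $\int_0^\infty \sin^2(ax)/x^2\,dx = \pi a/2$ with $a = \pi \Delta T$ and $\omega_o = 2\pi f_0$:

$$\sigma_{\Delta T}^2 = \frac{8}{\omega_o^2} \int_0^\infty \frac{c}{f^2} \sin^2(\pi f \Delta T)\, df = \frac{8}{\omega_o^2} \cdot c \cdot \frac{\pi^2 \Delta T}{2} = \frac{c\, \Delta T}{f_0^2}$$

so $\sigma_{\Delta T} = \frac{\sqrt{c}}{f_0}\sqrt{\Delta T}$, which under this assumption corresponds to a time-domain figure of merit $K = \sqrt{c}/f_0$.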

# Time Domain Model

- Time-domain models capture the discrete-time operation of the PLL architectures
  - Interaction between charge pump and loop filter
  - Cycle slipping behavior
- Allows modeling of non-linear control systems
  - Dynamic loop bandwidth control
  - Automatic frequency band selection
- Potential implementation tools
  - Matlab Simulink
  - CppSim
  - Cadence

### Simulink Model

PLL FREQUENCY SYNTHESIZER MODEL









# Frequency Step w/ Simulink Model

VCO control voltage response to input frequency step



- Voltage spikes due to charge pump current driving loop filter resistor
- Cycle slipping occurs during lock acquisition due to large initial frequency difference

# CppSim Model

#### [Perrott/Meninger]





- C++ based allows for rapid simulation of advanced architectures
- Many useful building blocks included






### Cadence Verilog-A Model



#### VCO (Square Wave) Verilog-A Code Snippet

    module vco_advanced_backup(in, out);
      input in;
      output out;
      voltage in, out;
      parameter real Vamp = 0.425;
      parameter Fmax = 14.3125G;
      parameter Fmin = 13.5625G;
      //... (lines omitted)
      real phase;
      real ideal_phase;
      real dPhase;
      //... (lines omitted)
      analog begin
        if (V(in) < Vmin)
          inst_freq = Fmin;
        else if (V(in) > Vmax)
          inst_freq = Fmax;
        else
          inst_freq = ((V(in)-Vmin)*(Fmax-Fmin)/(Vmax-Vmin)) + Fmin;
        ideal_phase = 2*`PI*idtmod(inst_freq, 0.0, 1.0, -0.5);
        phase = ideal_phase + dPhase;
        //... (lines omitted)
        n = (phase >= -`PI/2) & (phase < `PI/2);
        end
        V(out) <+ transition(n?Vamp:0, 0, tran_time);
      end
    endmodule



### Next Time

- CDRs

The following slides provide more details on PLL circuits. This 620 material may be useful for the project, but won't be covered in detail on Exam 2.

### PLL Loop Gain



$$LG(s) = \frac{K_{PD}F(s)K_{VCO}}{Ns} = \frac{K_{PD}K_{VCO}\left(s + \frac{1}{R_{1}C_{1}}\right)}{NC_{2}s^{2}\left(s + \frac{C_{1} + C_{2}}{R_{1}C_{1}C_{2}}\right)}$$

$$\omega_z = \frac{1}{R_1 C_1}, \qquad \omega_{p1} = \omega_{p2} = 0, \qquad \omega_{p3} = \frac{C_1 + C_2}{R_1 C_1 C_2}$$

### Loop Gain Response



# Design Procedure for Max $\Phi_{\rm m}$



• Design procedure maximizes phase margin for a given  $f_u$  and  $\Phi_m$  specification [Hanumolu TCAS1 2004]

# Design Procedure for Max $\Phi_{\rm m}$

**1.** Set loop filter capacitor ratio based on  $\Phi_m$ 

$$K_C = \frac{C_1}{C_2} = 2\left(\tan^2(\Phi_m) + \tan(\Phi_m)\sqrt{\tan^2(\Phi_m) + 1}\right)$$

$$\Phi_m = 60^\circ \to K_C = 12.9$$

2. Set loop filter values based on  $\omega_u$, with R set for low noise

$$\omega_z = \frac{\omega_u}{\sqrt{1 + K_C}}$$
$$C_1 = \frac{1}{\omega_z R}, \ C_2 = \frac{C_1}{K_C}$$

$$\omega_u = 2\pi * 2MHz \rightarrow \omega_z = 2\pi * 536kHz$$
  
Set  $R = 4k\Omega \rightarrow C_1 = 74pF \& C_2 = 5.8pF$ 

3. Set  $I_{cp}$  to achieve required loop gain

$$I_{cp} = \frac{NC_2 \omega_u^2}{K_{VCO}} \sqrt{\frac{\omega_{p3}^2 + \omega_u^2}{\omega_z^2 + \omega_u^2}} \qquad \qquad \omega_{p3} = 2\pi * 7.45 MHz \to I_{cp} = 310 \mu A$$
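The three steps can be sketched as a small calculation. This version assumes $K_{PD} = I_{cp}/2\pi$ and solves $|LG(j\omega_u)| = 1$ with $K_{VCO}$ in rad/s/V, which is equivalent to using $K_{VCO}$ in Hz/V in the $I_{cp}$ expression; the function and key names are illustrative. With unrounded capacitor values the result is a few percent below the 310µA quoted for the rounded design.

```python
import math


def design_loop_filter(f_u_hz, phase_margin_deg, r_ohm, n_div, kvco_hz_per_v):
    """Follow the three design steps for a passive R, C1, C2 loop filter."""
    # Step 1: capacitor ratio from the phase margin target
    t = math.tan(math.radians(phase_margin_deg))
    k_c = 2 * (t**2 + t * math.sqrt(t**2 + 1))

    # Step 2: zero location and capacitor values for the chosen resistor
    w_u = 2 * math.pi * f_u_hz
    w_z = w_u / math.sqrt(1 + k_c)
    c1 = 1 / (w_z * r_ohm)
    c2 = c1 / k_c
    w_p3 = (c1 + c2) / (r_ohm * c1 * c2)

    # Step 3: charge-pump current for unity loop gain at w_u
    # Assumption: K_PD = I_cp / (2*pi) and K_VCO expressed in rad/s/V
    kvco = 2 * math.pi * kvco_hz_per_v
    k_pd = (n_div * c2 * w_u**2 / kvco
            * math.sqrt((w_p3**2 + w_u**2) / (w_z**2 + w_u**2)))
    i_cp = 2 * math.pi * k_pd

    return {
        "K_C": k_c,
        "f_z": w_z / (2 * math.pi),
        "C1": c1,
        "C2": c2,
        "f_p3": w_p3 / (2 * math.pi),
        "I_cp": i_cp,
    }


design = design_loop_filter(f_u_hz=2e6, phase_margin_deg=60, r_ohm=4e3,
                            n_div=90, kvco_hz_per_v=1e9)

if __name__ == "__main__":
    for name, value in design.items():
        print(f"{name:5s} = {value:.4g}")
```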

### Simulated Responses



- Design achieves  $f_u = 2MHz$  and  $\Phi_m = 60^{\circ}$
- Closed loop response has f<sub>3dB</sub>=3.1MHz

# Charge-Pump PLL Circuits

- Phase Detector
- Charge-Pump
- Loop Filter
- VCO
- Divider



### Phase Detector





- Detects phase difference between feedback clock and reference clock
- The loop filter will filter the phase detector output, thus to characterize phase detector gain, extract average output voltage
- The K<sub>PD</sub> factor can change depending on the specific phase detector circuit

- $K_{PD}$ units are V/rad when used with a dimensionless filter
- $K_{PD}$ units are rad<sup>-1</sup> (averaged), or A/rad when combined with the charge pump, when used with an impedance filter

### Analog Multiplier Phase Detector

$$A_{1} \cos(\omega_{1} t) \times A_{2} \cos(\omega_{2} t + \Delta \phi) \longrightarrow y(t) = \frac{\alpha A_{1} A_{2}}{2} \cos[(\omega_{1} + \omega_{2})t + \Delta \phi] + \frac{\alpha A_{1} A_{2}}{2} \cos[(\omega_{1} - \omega_{2})t - \Delta \phi]$$

where $\alpha$ is the mixer gain

• If  $\omega_1 = \omega_2$  and filtering out high-frequency term

$$\overline{y(t)} = \frac{\alpha A_1 A_2}{2} \cos \Delta \phi$$

• Near  $\Delta \phi$  lock region of  $\pi/2$ :  $\overline{y(t)} \approx \frac{\alpha A_1 A_2}{2} \left( \frac{\pi}{2} - \Delta \phi \right)$ 



### **XOR Phase Detector**



- Assuming logic 1="+1" and 0="-1", the XOR PD will lock when the average output is 0
  - Generally,  $\pi/2$  is a stable lock point and  $-\pi/2$  is a metastable point
- Sensitive to clock duty cycle

### **XOR Phase Detector**



# Cycle Slipping

- If there is a frequency difference between the input reference and PLL feedback signals the phase detector can jump between regions of different gain
  - PLL is no longer acting as a linear system



# Cycle Slipping



• If frequency difference is too large the PLL may not lock

# Phase Frequency Detector (PFD)



- Phase Frequency Detector allows for wide frequency locking range, potentially entire VCO tuning range
- 3-state operation w/ UP & DN outputs
  - Rising edge-triggered results in duty cycle insensitivity



### Averaged PFD Transfer Characteristic



- Constant slope and polarity asymmetry about zero phase allows for wide frequency range operation
- The averaged PFD gain is  $1/(2\pi)$  with units of rad<sup>-1</sup>
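The $1/(2\pi)$ averaged gain can be reproduced with a small behavioral model. The sketch below is an idealized three-state PFD with zero reset delay, driven by two equal-frequency clocks with a fixed phase offset; it is an illustrative model written for this note (the function name and event representation are assumptions), not a circuit from the slides.

```python
import math


def pfd_average(phase_error_rad, cycles=100, period=1.0):
    """Average (UP - DN) of an ideal three-state PFD for a fixed phase error.

    A positive phase error means the feedback edge lags the reference edge.
    """
    dt = phase_error_rad / (2 * math.pi) * period   # feedback lag in time
    start = period                                  # keeps leading edges at t > 0
    events = [(start + k * period, "ref") for k in range(cycles)]
    events += [(start + k * period + dt, "fb") for k in range(cycles)]
    events.sort()

    up = dn = 0
    area = 0.0
    t_prev = events[0][0]
    for t, kind in events:
        area += (up - dn) * (t - t_prev)   # integrate the state held since last edge
        t_prev = t
        if kind == "ref":
            up = 1                         # reference rising edge sets UP
        else:
            dn = 1                         # feedback rising edge sets DN
        if up and dn:                      # both set: ideal reset, no delay
            up = dn = 0
    return area / (cycles * period)


if __name__ == "__main__":
    for err in (-math.pi, -math.pi / 2, 0.0, math.pi / 2, math.pi):
        print(f"phase error {err:+.3f} rad -> average {pfd_average(err):+.3f}")
```

The average output divided by the phase error is $1/(2\pi)$ on both sides of zero, which is the constant slope referred to above.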

### **Phase Detector**



- Detects phase difference between feedback clock and reference clock
- The loop filter will filter the phase detector output, thus to characterize phase detector gain, extract average output voltage (or current for charge-pump PLLs)

### PFD Deadzone





- If phase error is small, then short output pulses are produced by PFD
- Cannot effectively propagate these pulses to switch charge pump
- Results in phase detector "dead zone" which causes low loop gain and increased jitter



## PFD Operation w/ Reset Delay

- Solution is to add delay in PFD reset path to force a minimum UP and DN pulse length
- In locked state both UP and DN current sources are on for T<sub>rst</sub>, but ideally no net current is delivered to loop filter





## Problems Near $2\pi$

- PFD cannot react to input rising edges during reset
- This can result in the next rising edge driving the loop in the wrong direction
- Reset delay can increase acquisition time and sets a max PFD operating frequency





#### PFD Transfer Characteristic w/ Reset Delay



- PFD reset delay generates wrong frequency information
- If this becomes a large percentage of the reference cycle, then the PFD can fail to acquire frequency lock

Max 
$$T_{rst} = \frac{T_{ref}}{2}$$
  
Max PFD Frequency  $= \frac{1}{2T_{rst}}$ 

## Charge-Pump PLL Circuits

- Phase Detector
- Charge-Pump
- Loop Filter
- VCO
- Divider



## Charge Pump



- Converts PFD output signals to charge
- Charge is proportional to PFD pulse widths
- Un-averaged charge-pump gain = $I_{CP}$ (Amps)
- Averaged charge-pump gain = $\frac{I_{CP}}{2\pi}$ (Amps/rad)
- Total PFD & charge-pump gain = $\frac{I_{CP}}{2\pi}$ (Amps/rad)
- This gain can vary if a different phase detector is used

## Simple Charge Pump



- Issues
  - Skew between UPB and DN control signals
  - Matching of UP/DN current sources
  - Clock feedthrough and charge injection from switches onto V<sub>ctrl</sub>
  - Charge sharing between current source drain nodes' capacitance and  $V_{ctrl}$

#### Simple Charge Pump Skew Compensation



3/2 Inverter Path



- Adding a transmission gate in the DN signal path helps to equalize the delay with the UPB signal for better overlap between the UP and DN current sources
- Poor matching of UPB and  $\text{DN}_{\!\Delta}$  edge rates

 Utilizing a 3-inverter UP path and a 2-inverter DN path with a higher fanout provides good matching of both delay and edge rates

## Charge Pump Mismatch





- Extra "ripple" on V<sub>ctrl</sub>
  - Results in frequency domain spurs at the reference clock frequency offset from the carrier



# Charge Pump w/ Improved Matching



- Parallel path keeps current sources always on
- Amplifier keeps current source V<sub>DS</sub> voltages constant resulting in reduced transient current mismatch (charge sharing)

[Young JSSC 1992]

# **Digital Leakage Compensation**

- Charge pump off-state leakage causes PLL to lock with static phase error
- Compensated by additional digitally-controlled charge pump current pulses
- TDC detects phase error between input reference clock and feedback clock





Static Phase Error due to leakage from supply



Two pulse digital leakage compensation for leakage from supply

## Charge Pump w/ Reversed Switches

- Swapping switches reduces charge injection
  - MOS caps (Md1-4) provide extra clock feedthrough cancellation
- Helper transistors Mx and My quickly turn-off current sources
- Dummy branch helps to match PFD loading
- Helps with charge injection, but charge sharing is still an issue



[Ingino JSSC 2001]

## Charge-Pump PLL Circuits

- Phase Detector
- Charge-Pump
- Loop Filter
- VCO
- Divider



#### Charge Pump PLL Passive PI Loop Filter



- Simple passive filter is most commonly used
- Integrates low-frequency phase errors onto C1 to set average frequency
- Resistor (proportional gain) isolates phase correction from frequency correction
- Primary capacitor C1 affects PLL bandwidth
- Zero frequency affects PLL stability
- Resistor adds thermal noise which is band-pass filtered by PLL

## Loop Filter Transfer Function

Neglecting secondary capacitor, C<sub>2</sub>



## Loop Filter Transfer Function

• With secondary capacitor, C<sub>2</sub>
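The transfer functions themselves are not reproduced in this text. A reconstruction consistent with the loop-gain expression $LG(s) = K_{PD}F(s)K_{VCO}/(Ns)$ used elsewhere in these slides, treating the filter as the impedance seen by the charge-pump current and writing $R$ for the filter resistor ($R_1$ in the loop-gain expression), would be:

$$F(s)\Big|_{C_2 = 0} = R + \frac{1}{sC_1} = \frac{R\left(s + \frac{1}{RC_1}\right)}{s}$$

$$F(s) = \left(R + \frac{1}{sC_1}\right) \Big\| \frac{1}{sC_2} = \frac{s + \frac{1}{RC_1}}{C_2\, s\left(s + \frac{C_1 + C_2}{RC_1C_2}\right)}$$

Both forms share the zero at $1/(RC_1)$; adding $C_2$ introduces the extra pole at $(C_1 + C_2)/(RC_1C_2)$.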



# Why have C<sub>2</sub>?

- Secondary capacitor smoothes control voltage ripple
- Can't make too big or loop will go unstable
  - $C_2 < C_1/10$  for stability
  - $C_2 > C_1/50$  for low jitter

PLL Synthesizing a 380MHz Signal



## Loop Filter Capacitors

- To minimize area, we would like to use highest density caps
- Thin oxide MOS cap gate leakage can be an issue
  - Similar to adding a non-linear parallel resistor to the capacitor
  - Leakage is voltage and temperature dependent
  - Will result in excess phase noise and spurs
- Metal caps or thick oxide caps are a better choice
  - Trade-off is area
- Metal cap density can be <1/10 thin oxide caps
- Filter cap frequency response can be relatively low, as PLL loop bandwidths are typically 1-50MHz

## Charge-Pump PLL Circuits

- Phase Detector
- Charge-Pump
- Loop Filter
- VCO
- Divider



### Voltage-Controlled Oscillator



 $\omega_{out}(t) = \omega_0 + \Delta \omega_{out}(t) = \omega_0 + K_{VCO} v_{ctrl}(t)$ 

Time-domain phase relationship
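Phase is the integral of frequency, so the relationship can be sketched as follows, assuming a constant $K_{VCO}$ (linear tuning) and measuring phase relative to the free-running carrier at $\omega_0$:

$$\phi_{out}(t) = \int_0^t \Delta\omega_{out}(\tau)\, d\tau = K_{VCO} \int_0^t v_{ctrl}(\tau)\, d\tau \quad \Rightarrow \quad \frac{\phi_{out}(s)}{v_{ctrl}(s)} = \frac{K_{VCO}}{s}$$

This integration is the source of the $1/s$ term that the VCO contributes to the PLL loop gain.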

# Voltage-Controlled Oscillators (VCO)

- Ring Oscillator
  - Easy to integrate
  - Wide tuning range (5x)
  - Higher phase noise



- LC Oscillator
  - Large area
  - Narrow tuning range (20-30%)
  - Lower phase noise



## Barkhausen's Oscillation Criteria



Closed-loop transfer function:

$$\frac{H(j\omega)}{1-H(j\omega)}$$

- Sustained oscillation occurs if  $H(j\omega)=1$
- 2 conditions:
  - Gain = 1 at oscillation frequency  $\omega_0$
  - Total phase shift around loop is n360° at oscillation frequency  $\omega_0$

### Ring Oscillator Example



$$H(s) = -\frac{A_0^4}{\left(1 + \frac{s}{\omega_o}\right)^4}$$

Phase Condition: 
$$\tan^{-1}\left(\frac{\omega_{osc}}{\omega_o}\right) = 45^\circ \rightarrow \omega_{osc} = \omega_o = \frac{1}{RC}$$

Gain Condition: 
$$\frac{A_0^4}{\left[\sqrt{1 + \left(\frac{\omega_{osc}}{\omega_o}\right)^2}\right]^4} = 1 \rightarrow A_0 = \sqrt{2} = g_{m1}R$$
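The two Barkhausen conditions for the four-stage example can be verified directly. This is a minimal sketch in which the stage pole frequency is a hypothetical value; the result does not depend on it because the loop is evaluated at $\omega_{osc} = \omega_o$.

```python
import math


def ring_loop_gain(omega, a0, omega_p, stages=4):
    """H(jw) = -A0^n / (1 + jw/w_p)^n for n identical single-pole stages."""
    return -(a0 ** stages) / (1 + 1j * omega / omega_p) ** stages


OMEGA_P = 2 * math.pi * 1e9   # hypothetical stage pole, rad/s

# At w = w_p each stage adds 45 degrees of lag and attenuates by sqrt(2),
# so A0 = sqrt(2) should give a loop gain of exactly +1.
h_osc = ring_loop_gain(OMEGA_P, math.sqrt(2), OMEGA_P)

# A smaller stage gain meets the phase condition but not the gain condition.
h_low_gain = ring_loop_gain(OMEGA_P, 1.2, OMEGA_P)

if __name__ == "__main__":
    print(f"A0 = sqrt(2): H = {h_osc:.4f}")
    print(f"A0 = 1.2:     |H| = {abs(h_low_gain):.4f}")
```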


# LC Oscillator Example



 Oscillation phase shift condition satisfied at the frequency when the LC (and R) tank load displays a purely real impedance, i.e. 0° phase shift

LC tank impedance

$$Z_{eq}(s) = \frac{R_{s} + L_{1}s}{1 + L_{1}C_{1}s^{2} + R_{s}C_{1}s}$$
$$\left|Z_{eq}(s = j\omega)\right|^{2} = \frac{R_{s}^{2} + L_{1}^{2}\omega^{2}}{\left(1 - L_{1}C_{1}\omega^{2}\right)^{2} + R_{s}^{2}C_{1}^{2}\omega^{2}}$$

## LC Oscillator Example



## LC Oscillator Example





• Phase condition satisfied at  $\omega_1 = \frac{1}{\sqrt{L_P C_P}}$ 

• Gain condition satisfied when  $(g_m R_p)^2 \ge 1$ 

- Can also view this circuit as a parallel combination of a tank with loss resistance 2R<sub>P</sub> and negative resistance of -2/g<sub>m</sub>
- Oscillation is satisfied when

$$\frac{1}{g_m} \le R_P$$

### Supply-Tuned Ring Oscillator



### **Current-Starved Ring Oscillator**



Current-starved VCO.

## Capacitive-Tuned Ring Oscillator



# Symmetric Load Ring Oscillator

[Maneatis JSSC 1996 & 2003]



- Symmetric load provides frequency tuning with excellent supply noise rejection
- See Maneatis papers for self-biased techniques to obtain constant damping factor and loop bandwidth (% of ref clk)

## LC Oscillator

- A variable capacitor (varactor) is often used to adjust oscillation frequency
- Total capacitance includes both tuning capacitance and fixed capacitances which reduce the tuning range

$$\omega_{osc} = \frac{1}{\sqrt{L_P C_P}} = \frac{1}{\sqrt{L_P (C_{tune} + C_{fixed})}}$$
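The effect of fixed capacitance on tuning range follows directly from this expression. The sketch below uses hypothetical tank values (200pH inductor, 200-400fF varactor, 300fF fixed capacitance) chosen only to land in the mid-teens of GHz; none of them come from the slides.

```python
import math


def lc_frequency(l_h, c_tune_f, c_fixed_f):
    """Oscillation frequency in Hz for a tank with tuning and fixed capacitance."""
    return 1 / (2 * math.pi * math.sqrt(l_h * (c_tune_f + c_fixed_f)))


def tuning_ratio(l_h, c_tune_min, c_tune_max, c_fixed):
    """f_max / f_min across the varactor range."""
    f_max = lc_frequency(l_h, c_tune_min, c_fixed)
    f_min = lc_frequency(l_h, c_tune_max, c_fixed)
    return f_max / f_min


# Hypothetical tank values
L_TANK = 200e-12
C_TUNE_MIN, C_TUNE_MAX = 200e-15, 400e-15
C_FIXED = 300e-15

ratio_ideal = tuning_ratio(L_TANK, C_TUNE_MIN, C_TUNE_MAX, 0.0)
ratio_real = tuning_ratio(L_TANK, C_TUNE_MIN, C_TUNE_MAX, C_FIXED)

if __name__ == "__main__":
    print(f"f_max/f_min without fixed capacitance: {ratio_ideal:.3f}")
    print(f"f_max/f_min with fixed capacitance:    {ratio_real:.3f}")
```

A 2:1 varactor range gives a frequency ratio of $\sqrt{2}$ on its own, and the fixed capacitance compresses it toward 1.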



## Varactors

- pn junction varactor
  - Avoid forward bias region to prevent oscillator nonlinearity



• Accumulation-mode devices have better Q than inversion-mode



## Charge-Pump PLL Circuits

- Phase Detector
- Charge-Pump
- Loop Filter
- VCO
- Divider



## Loop Divider



• Time-domain model

$$\omega_{fb}(t) = \frac{1}{N} \omega_{out}(t)$$

$$\phi_{fb}(t) = \int \frac{1}{N} \omega_{out}(t) dt = \frac{1}{N} \phi_{out}(t)$$

 The loop divider is dimension-less in the PLL linear model

### Basic Divide-by-2



- Divide-by-2 can be realized by a flip-flop in "negative feedback"
- Divider should operate correctly up to the maximum output clock frequency of interest PLUS some margin



## Divide-by-2 with TSPC FF

#### **True Single Phase Clock Flip-Flop**



- Advantages
  - Reasonably fast, compact size, and no static power
  - Requires only one phase of the clock
- Disadvantages
  - Signal needs to propagate through three gates per input cycle
  - Need full swing CMOS inputs
  - Dynamic flip-flop can fail at low frequency (test mode) due to leakage, as various nodes are floating during different CLK phases & output states
    - Ex: Q\_bar is floating when CLK is low

#### **Divider Equivalent Circuit**

Note: output inverter not in left schematic



## Divide-by-2 with CML FF



- Advantages
  - Signal only propagates through two CML gates per input cycle
  - Accepts CML input levels
- Disadvantages
  - Larger size and dissipates static power
  - Requires differential input
  - Need tail current biasing
- Additional speedup (>50%) can be achieved with shunt peaking inductors

#### Binary Dividers: Asynchronous vs Synchronous

#### **Asynchronous Divider**

- Advantages
  - Each stage runs at lower frequency, resulting in reduced power
  - Reduced high frequency clock loading
- Disadvantage
  - Jitter accumulation

#### **Synchronous Divider**

- Advantage
  - Reduced jitter
- Disadvantages
  - All flip-flops work at maximum frequency, resulting in high power
  - Large loading on high frequency clock

#### Jitter in Asynchronous vs Synchronous Dividers

#### Asynchronous



- Jitter accumulates with the clock-to-Q delays through the divider
- Extra divider delay can also degrade PLL phase margin

#### **Synchronous**



- Divider output is "sampled" with high frequency clock
- Jitter on divider clock is similar to VCO output
- Minimal divider delay

### **Dual Modulus Prescalers**



• For /15, first prescaler circuit divides by 3 once and 4 three times during the 15 cycles
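The /15 case generalizes to any total that a /3-/4 prescaler can reach in a fixed number of prescaler cycles. The sketch below computes the modulus schedule and counts input cycles by stepping a counter; the function names and the choice to place the low-modulus cycles first are assumptions for illustration.

```python
def modulus_schedule(n_total, prescaler_cycles, low=3, high=4):
    """Return the per-cycle moduli (low ones first) whose sum equals n_total."""
    excess = high * prescaler_cycles - n_total
    lows, rem = divmod(excess, high - low)
    if rem or not 0 <= lows <= prescaler_cycles:
        raise ValueError("n_total is not reachable with this prescaler")
    return [low] * lows + [high] * (prescaler_cycles - lows)


def count_input_cycles(schedule):
    """Step an input-edge counter through each modulus and return the total."""
    total = 0
    for modulus in schedule:
        count = 0
        while count < modulus:   # prescaler counts input edges up to the active modulus
            count += 1
            total += 1
    return total


schedule_15 = modulus_schedule(15, prescaler_cycles=4)

if __name__ == "__main__":
    print(schedule_15, "->", count_input_cycles(schedule_15), "input cycles")
```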

## **Injection-Locked Frequency Dividers**



Ring-oscillator type (/3)



[Verma JSSC 2003, Rategh JSSC 1999]

[Lo CICC 2009]

- Superharmonic injection-locked oscillators (ILOs) can realize frequency dividers
- Faster and lower power than flip-flop based dividers
- Injection locking range can be limited