# EQUALIZERS FOR HIGH-SPEED SERIAL LINKS

#### PAVAN KUMAR HANUMOLU

School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, Oregon 97331, U.S.A.
hanumolu@ece.orst.edu

#### GU-YEON WEI

Electrical Engineering and Computer Science, Harvard University, Cambridge, Massachusetts 02138, U.S.A.
guyeon@eecs.harvard.edu

#### UN-KU MOON

School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, Oregon 97331, U.S.A.
moon@ece.orst.edu

In this tutorial paper we present equalization techniques to mitigate inter-symbol interference (ISI) in high-speed communication links. Both transmit and receive equalizers are analyzed, and high-speed circuits implementing them are presented. It is shown that a digital transmit equalizer is the simplest to design, while a continuous-time receive equalizer generally provides better performance. The decision feedback equalizer (DFE) is described and the loop latency problem is addressed. Finally, techniques to set the equalizer parameters adaptively are presented.

Keywords: Serial Link; Eye Diagram; ISI; Equalizer; Jitter; BER; Transceiver; Noise; DFE; Pre-emphasis.

# 1. Introduction

Recent advances in integrated circuit (IC) fabrication technology, coupled with innovative circuit and architectural techniques, have led to the design of high-performance digital systems. These complex systems are built by combining several ICs, each consisting of millions of transistors operating at multi-gigahertz frequencies. These systems require efficient communication between multiple chips for proper functioning of the whole system. However, the off-chip bandwidth scales<sup>1</sup> at a much lower rate than the on-chip bandwidth,<sup>2</sup> thus making the communication link (also referred to as a serial link) between chips the major bottleneck for the overall performance.
For example, present-day microprocessors run at clock rates of several gigahertz, while the speed of the front-side bus is limited to less than a gigahertz. For these reasons, there is great research interest in reducing the gap between the on-chip and off-chip bandwidth.

A representative depiction of a communication link between two chips is shown in Fig. 1. Dedicated circuits designed for high-speed operation are used as the transmitter and receiver, to transmit and receive the data, respectively. The medium of transmission is called the channel, which in the ideal case is a wire representing a short circuit. However, as the data rates increase, these wires behave as lossy transmission lines, severely degrading the transmitted data symbols. Equalization is a well-known technique used to overcome non-idealities introduced by the channel. In this paper we present several equalization techniques that are amenable to high-speed operation.

Fig. 1. A typical serial link block diagram.

The organization of the paper is as follows. Section 2 briefly discusses different aspects of channel modeling, while different metrics used to evaluate the performance of the serial link are summarized in Section 3. Some equalizer background is presented in Section 4, and transmitter and receiver equalizer designs are considered in Section 5 and Section 6, respectively. The advantages of combined transmitter and receiver equalizers are presented in Section 7. Finally, the process of adapting the filter tap weights is discussed in Section 8.

# 2. Channel Modeling

There are several types of channels used in high-speed interconnects, primarily based on the target application. These channels can be broadly classified into three categories. First, for chip-to-chip communication on a printed circuit board (PCB), short well-controlled copper traces are used.
Second, for systems such as a local-area network (LAN), which require a high-speed connection between two computers, coaxial cable is used as the transmission medium. Finally, copper traces along with backplane connectors are used for high-speed board-to-board communication systems such as routers. In this paper, we focus on the copper traces used for chip-to-chip communication. However, the analysis and the design techniques are easily applicable to a wide range of other channels.

The copper traces commonly used on PCBs behave as lossy transmission lines in the multi-gigahertz frequency range. The distributed nature of these transmission lines can be captured by a cascade of infinitesimal-length RLGC sections, as shown in Fig. 2.<sup>3</sup> Accurate modeling of transmission lines with RLGC sections requires quantizing both space and time into sections that are small compared to the shortest wavelength of interest. Therefore, for high-speed designs a large number of RLGC sections is required to capture all the transmission line effects.

Fig. 2. RLGC section of a transmission line.

The dominant sources of loss in these channels are the skin effect and dielectric loss.<sup>4,5,6</sup> The loss due to skin effect is proportional to $\sqrt{f}$ and typically dominates the total loss at low frequencies. On the other hand, dielectric loss is proportional to $f$ and therefore determines the total loss at high frequencies. The lumped RLGC sections are modified as shown in Fig. 3 to capture the frequency-dependent nature of these loss mechanisms.<sup>7,8</sup> However, the increased number of nodes in each lumped section, coupled with the large number of cascaded sections required to model the full channel, results in a tremendous increase in the overall simulation time. These frequency-dependent loss mechanisms also depend greatly on several factors such as the geometry of the traces, making the development of a generic channel model impractical.
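The frequency dependence of the two loss mechanisms described above can be illustrated with a minimal sketch. It assumes a simple additive model, loss in dB equal to a skin-effect term proportional to $\sqrt{f}$ plus a dielectric term proportional to $f$. The coefficients `K_SKIN` and `K_DIEL` and the function names are hypothetical, chosen only for illustration and not fitted to any channel in this paper.

```python
import math

# Hypothetical loss coefficients (illustrative only, not fitted to a measured trace).
K_SKIN = 1.0e-4   # skin-effect coefficient, dB per sqrt(Hz)
K_DIEL = 2.0e-9   # dielectric-loss coefficient, dB per Hz


def skin_loss_db(f_hz, k_skin=K_SKIN):
    """Skin-effect loss, proportional to sqrt(f)."""
    return k_skin * math.sqrt(f_hz)


def dielectric_loss_db(f_hz, k_diel=K_DIEL):
    """Dielectric loss, proportional to f."""
    return k_diel * f_hz


def total_loss_db(f_hz, k_skin=K_SKIN, k_diel=K_DIEL):
    """Total loss under the assumed additive model."""
    return skin_loss_db(f_hz, k_skin) + dielectric_loss_db(f_hz, k_diel)


def crossover_hz(k_skin=K_SKIN, k_diel=K_DIEL):
    """Frequency at which the two terms are equal: k_skin*sqrt(f) = k_diel*f."""
    return (k_skin / k_diel) ** 2


if __name__ == "__main__":
    print("crossover frequency (Hz):", crossover_hz())
    for f in (1e8, 1e9, 2.5e9, 1e10):
        print(f, skin_loss_db(f), dielectric_loss_db(f), total_loss_db(f))
```

Below the crossover frequency the skin-effect term is the larger of the two, and above it the dielectric term takes over, which is the qualitative behavior the text describes.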
Due to these issues, channel models are most commonly developed by fitting measured data for each of the components. For example, the s-parameters of a PCB trace with a given geometry are determined using field solvers such as ADS<sup>9</sup> and are combined with connector models supplied by the vendor to obtain the s-parameters for the complete channel. The $S_{21}$ of a 20" differential micro-strip line on an FR4 board with two connectors, referred to as the server channel, and of a 6" differential micro-strip line on the same FR4 board, referred to as the desktop channel, are shown in Fig. 4. The s-parameters can be used directly in circuit-level simulators such as SPECTRE, or equivalently an impulse response can be derived for system-level simulators such as MATLAB. The second approach is more amenable to fast system-level simulations that evaluate the performance of various equalization and clock recovery schemes. For this reason, the impulse response approach is used in this paper. The derived impulse responses for the server and the desktop channels are shown in Fig. 5. Since most practical channels are linear time-invariant systems, an impulse response is sufficient to completely characterize the channel. Before presenting various equalizer designs, we first briefly discuss the metrics used to evaluate the performance of high-speed serial links.

Fig. 3. RLGC section of a transmission line modeling frequency dependent loss.

Fig. 4. $S_{21}$ of a $20^{\prime\prime}$ (server) and $6^{\prime\prime}$ (desktop) FR4 trace with two connectors.

Fig. 5. Impulse response of server and desktop channels.

# 3. Performance Metrics

The primary performance metric in all applications employing serial links is bit error rate (BER). Most systems of interest require almost error-free operation (BER $< 10^{-12}$). However, direct evaluation of such a low BER in simulation is not a trivial task.
So we will employ an indirect method in which we calculate the noise margins and relate them to error-free operation. The biggest noise source in high-speed serial links is inter-symbol interference (ISI)<sup>a</sup> caused by the frequency-dependent attenuation of the channel. The noise margin degradation due to ISI is best quantified by an eye diagram. The eye diagrams at the receive end of the server channel, obtained by transmitting 500 pseudo-random bits at 2Gbps and 5Gbps data rates, are shown in Fig. 6(a) and Fig. 6(b), respectively. Note that the pseudo-random eye approaches the worst-case eye only with a very large number of transmitted bits. Even though these eye diagrams clearly indicate the noise margin degradation due to ISI at higher data rates, this approach has two main drawbacks. First, the accurate estimation of worst-case noise margin degradation requires transmitting several thousands of bits, thus increasing the simulation times drastically. Second, this approach does not provide any design insight. We, therefore, employ an analytical method based on the pulse response to evaluate the noise margins.

<sup>a</sup>Even though ISI is fully deterministic, we will refer to it as *noise* here without adhering to the definition of noise in the strict sense.

Fig. 6. (a) 2Gbps eye diagram. (b) 5Gbps eye diagram.

Since ISI is completely deterministic in nature, it is possible to calculate the worst-case noise margin degradation due to ISI. Consider the representative positive pulse response shown in Fig. 7. The leading and trailing ISI terms are referred to as pre-cursors and post-cursors, respectively. The worst-case effect of these ISI terms on the overall voltage margin at a given sampling instant is obtained simply by adding them in an absolute sense, as shown in Eq. (1):<sup>11</sup>

Fig. 7. Conceptual pulse response illustrating ISI terms.

$$
\begin{aligned}
\text{Worst-case ISI noise} &= \sum |ISI_{+}| + \sum |ISI_{-}| \\
&= \sum ISI_{+} - \sum ISI_{-} \\
&= \sum_{\substack{k=-\infty \\ k\neq 0}}^{\infty} p(t-kT)\Big|_{p(t-kT)>0} - \sum_{\substack{k=-\infty \\ k\neq 0}}^{\infty} p(t-kT)\Big|_{p(t-kT)<0}
\end{aligned}
\tag{1}
$$

where $ISI_{-}$ and $ISI_{+}$ are the negative and positive ISI terms, $p(t)$ is the pulse response, and $T$ is the bit period. The complete worst-case ISI eye diagram can be obtained by sweeping the sampling instant across the whole bit period. Fig. 8 displays the worst-case ISI eye diagram calculated by the described method and the eye diagram obtained by transmitting 500 pseudo-random bits. Even though the worst-case eye results in a pessimistic BER prediction, we will continue using it in this paper for its simplicity. Interested readers can refer to the statistical approaches outlined in Refs.<sup>11,12,13,14,15</sup> for more accurate prediction of BER.

Fig. 8. Pseudo-random data and worst-case eye diagrams.

Other noise sources of concern include circuit noise, clock-jitter-induced noise,<sup>15,16</sup> and power supply noise.<sup>17</sup> These noise sources are implementation dependent, so we will discuss them separately for each equalizer. In addition to the noise margins, other metrics of interest are circuit area, power consumption and ease of design.

# 4. Equalizers Background

The magnitude response of the server channel, shown again in Fig. 9, illustrates high-frequency attenuation due to both skin effect and dielectric loss. For example, the loss for 5Gbps operation is approximately 12dB, resulting in an almost closed eye (see Fig. 6(b)). The frequency-shaping filters that flatten the channel response up to the Nyquist frequency are called equalizers. These equalizers, therefore, reduce ISI and can increase the achievable data rates tremendously. The conceptual diagram illustrating two ways of performing equalization is shown in Fig. 9.

Fig. 9. Two ways of equalization: (a) Attenuate low frequency. (b) Boost high frequency.

In the first
method, denoted by A, the low frequencies of the signal spectrum are attenuated, while in the second method, denoted by B, the high-frequency signal spectrum is boosted in order to mitigate ISI. We now present several techniques and design tradeoffs in implementing these two types of equalizers.

# 5. Transmit Equalizers

Equalization can be performed at the transmitter, at the receiver, or at both. In this section we focus on transmitter-side equalization. The transmit equalizer shown in Fig. 10 is a symbol-spaced (the symbol period is denoted by $\Delta$) finite impulse response (FIR) filter that pre-shapes or pre-distorts the transmitted data so as to attenuate the low-frequency portion of the signal spectrum while keeping the high-frequency part intact. Because of this, transmit equalizers are also referred to as de-emphasis, pre-emphasis, pre-distortion or pre-coding filters.<sup>18,19,20,21,22,23,24</sup> Fig. 11 depicts the eye diagram at 5Gbps equalized with a transmit pre-emphasis filter $C = [-0.13\ 0.66\ -0.21]$. The post-equalization worst-case ISI eye shown in Fig. 11 displays 80ps of timing margin with at least 100mV of voltage margin.

Fig. 10. Transmit pre-emphasis FIR filter.

Fig. 11. 5Gbps eye diagram with 3-tap transmit pre-emphasis.

It is instructive to view the time-domain response of this technique to better visualize the concept. The raw and equalized sampled pulse responses are shown in Fig. 12. Note that the cursor tap is attenuated while at the same time the pre-cursor and post-cursor ISI is greatly reduced. This cursor-tap attenuation is due to the peak transmit power constraint.

Fig. 12. Sampled 5Gbps pulse response: (a) Raw channel. (b) Transmit pre-emphasis.

Consider a practical implementation of a 2-tap pre-emphasis filter shown in Fig. 13, where the tap weights are implemented by scaled tail current sources. These current sources are adjusted digitally by a current-mode digital-to-analog converter (DAC), not shown in the figure.
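The time-domain view of transmit pre-emphasis can be sketched numerically. The example below convolves the 3-tap filter $C$ quoted above with a symbol-spaced sampled pulse response and evaluates the worst-case voltage margin at the cursor by summing the non-cursor samples in an absolute sense. The pulse response samples in `RAW_PULSE` are hypothetical (they are not the server channel of this paper), and the helper names are assumptions for illustration, so the numbers only show the mechanism, not the results in Fig. 11 or Fig. 12.

```python
# Sketch: symbol-spaced 3-tap transmit pre-emphasis applied to a sampled pulse response.
# The pulse response samples are HYPOTHETICAL; only the tap weights come from the text.


def convolve(taps, pulse):
    """Discrete convolution of FIR taps with a symbol-spaced pulse response."""
    out = [0.0] * (len(taps) + len(pulse) - 1)
    for i, c in enumerate(taps):
        for j, p in enumerate(pulse):
            out[i + j] += c * p
    return out


def worst_case_isi(samples, cursor):
    """Sum of magnitudes of all non-cursor samples (worst case at one sampling instant)."""
    return sum(abs(s) for k, s in enumerate(samples) if k != cursor)


def voltage_margin(samples, cursor):
    """Cursor amplitude minus worst-case ISI; a negative value means a closed eye."""
    return samples[cursor] - worst_case_isi(samples, cursor)


TAPS = [-0.13, 0.66, -0.21]                       # C from the text; main tap at index 1
RAW_PULSE = [0.05, 0.40, 0.25, 0.12, 0.06, 0.03]  # hypothetical samples; cursor at index 1
RAW_CURSOR = 1

EQ_PULSE = convolve(TAPS, RAW_PULSE)
EQ_CURSOR = RAW_CURSOR + 1  # the main tap sits at index 1, shifting the cursor by one symbol

if __name__ == "__main__":
    print("sum of |C_i|     :", sum(abs(c) for c in TAPS))
    print("raw cursor       :", RAW_PULSE[RAW_CURSOR])
    print("raw margin       :", voltage_margin(RAW_PULSE, RAW_CURSOR))
    print("equalized cursor :", EQ_PULSE[EQ_CURSOR])
    print("equalized margin :", voltage_margin(EQ_PULSE, EQ_CURSOR))
```

With these made-up samples the raw worst-case margin is negative (a closed eye), while the equalized margin is positive even though the cursor drops from 0.40 to about 0.22. The tap magnitudes sum to one, which is how the peak transmit power constraint shows up in the coefficients.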
It is important to maintain the tail current sources in saturation to prevent reflections introduced by imperfect source termination. Therefore, the maximum output swing of the current-mode driver is limited by the voltage headroom needed to maintain high output impedance. Hence, extra taps can be added only at the expense of reducing the cursor tap weight. In other words, since the maximum voltage drop across the termination resistor, $I \cdot R_T$, is determined by the voltage headroom, the coefficients should satisfy:

$$\left(I \cdot \sum |C_i|\right) \cdot R_T = I \cdot R_T \Rightarrow \sum |C_i| = 1. \tag{2}$$

Fig. 13. Two-tap pre-emphasis filter implementation.

There are several limitations of transmit pre-emphasis. First, due to the signal attenuation, transmit pre-emphasis cannot improve SNR. Second, it is essential to maximize the transmitted signal swing to incorporate a large amount of equalization, which results in excessive crosstalk.<sup>25</sup> Third, high-resolution DACs are required to implement pre-emphasis filters that equalize channels containing a large number of ISI terms.<sup>26</sup> Finally, despite transmit pre-emphasis there is considerable residual ISI, which reduces both timing and voltage margins, particularly at higher data rates.

# 6. Receive Equalizers

Receive-side equalization offers an alternate method to mitigate ISI without any peak power constraint. The loss in the channel is compensated by boosting the high-frequency signal spectrum rather than attenuating the low-frequency content. Due to the inherent gain in the system, this method often results in larger noise margins. We now present different receive equalizer architectures.

# 6.1. Digital FIR equalizer

A linear transversal filter similar to the one used for transmit pre-emphasis can be used on the receive side to perform equalization.
Unlike transmit pre-emphasis, where the input to the filter is a binary signal, the input to the receive filter is the channel output, which is analog in nature. An analog-to-digital converter (ADC) is required to interface the channel output to the filter, as shown in Fig. 14. The symbol-spaced delay is implemented by a register. Fig. 15 depicts the 5Gbps eye diagram equalized by an ideal 3-tap receive FIR equalizer. The higher voltage margin obtained by receive equalization is clearly evident.

Fig. 14. Digital FIR equalizer.

Fig. 15. 5Gbps eye diagram equalized with ideal receiver equalizer.

However, there are two major bottlenecks in the practical implementation of this equalizer. First, the critical path shown in Fig. 14 limits the maximum operating frequency to only a few hundred megahertz. Well-known techniques such as transposition<sup>28</sup> and parallelism<sup>32</sup> can be used to shorten the critical path. Nevertheless, these transposed filters are still speed-limited to less than a gigabit data rate. Second, the practical usefulness of this equalizer is severely limited by the high-speed ADC required at the front-end. Even though high-speed ADCs are possible to design,<sup>27</sup> they add large power and area overhead. Due to these constraints, digital FIR equalizers are employed only in medium-rate interfaces such as broadband modems,<sup>29</sup> disk-drive read channels,<sup>30,31,32,33,34</sup> and gigabit ethernet.<sup>35,36</sup> The price paid for high-speed operation using a digital FIR is excessive power consumption. For example, the equalizer in Ref.<sup>34</sup> consumes 1.2W for 2.3Gbps operation in a $0.18\mu m$ CMOS process. This amount of power consumption is unacceptable in most serial-link applications, which require integrating hundreds of equalizers on a single chip.

# 6.2. Analog FIR equalizer

An analog FIR equalizer obviates the need for a high-speed ADC and is therefore attractive for high-speed operation with potentially lower power consumption.
A conceptual block diagram of an analog FIR equalizer is shown in Fig. 16. Note that the high-speed ADC is replaced by a relatively simple sample-and-hold amplifier (SHA) circuit. As opposed to the digital delay of a digital FIR, an analog delay chain is required to implement the analog FIR. This analog delay can be implemented using a replica delay line whose delay is locked to a delay-locked loop or a phase-locked loop operating at the data rate.<sup>37,38</sup>

Fig. 16. An analog FIR equalizer.

However, the analog FIR equalizer in its most primitive form suffers from many implementation difficulties. First, the settling time of the front-end SHA limits the overall operating speed. Second, the sampled signal experiences considerable attenuation due to the limited bandwidth of the delay elements in the delay chain. Moreover, this bandwidth-induced error accumulates along the delay chain, thus limiting this technique to FIR filters with few taps. Finally, at high data rates, the precise generation of analog delay consumes excessive power, thus negating the primary benefit of an analog FIR equalizer.

One alternate way to generate the analog delay is by using multi-phase clocks.<sup>39</sup> The delay in the sampling clocks translates to the tap delay. Two time-interleaved architectures employing multi-phase clocks to implement the tap delay, referred to here as Rotating Input Samples (RIS)<sup>41</sup> and Rotating Tap Weights (RTW),<sup>42</sup> are presented. In the RIS method, a time-interleaved $N$-tap FIR filter is implemented by using $M > N$ front-end SHAs clocked by multiple phases of a clock, as shown in Fig. 17. The outputs of the $M$ SHAs are routed through a switch matrix controlled by multi-phase clocks in a circular manner.

Fig. 17. An analog FIR equalizer based on rotating input samples.

This circular buffer architecture increases the minimum settling time of the SHA to more than a clock period.
However, the complexity of the input-sample rotating array and mismatches among the various SHAs limit the maximum speed of this architecture to less than a 1Gbps data rate.<sup>39,41</sup>

Alternately, in the RTW method, the tap weights are rotated instead of the input samples.<sup>42</sup> The conceptual operation of the RTW method is illustrated in Fig. 18, where SR denotes a digital shift register. As in the RIS method, the circuit employs time-interleaved SHAs, but instead shifts the coefficients in a counter-clockwise manner to achieve the FIR filter functionality. By implementing the tap weights using digital words, the RTW method has a distinct advantage over the RIS method because the time-interleaving is achieved by rotating digital tap weights instead of analog input samples. However, the RTW method still suffers from mismatches among the SHAs and does not offer any advantage in high-speed designs employing analog tap coefficients.

Fig. 18. An analog FIR equalizer based on rotating tap weights.

Parallelism along with time-interleaving can obviate the need for rotating input samples and rotating tap weights, and hence permit very high data rates.<sup>43</sup> The conceptual block diagram of a parallelized architecture is shown in Fig. 19. In this example,<sup>43</sup> the high-speed front-end samplers track the input for two bit periods and hold the sampled value for the next six bit periods. It is well known that current-mode signal processing can achieve higher speeds, is more efficient, and is easier to implement than voltage-mode processing. Therefore, the front-end samplers are typically followed by voltage-to-current (V2I) converters, and the operations required for equalization (addition and multiplication) are performed in the current domain. Each V2I output is replicated into four interleaved equalizers by simple current mirroring, thus accomplishing an effective input sample rotation.
The tap-weight multiplication is performed by a current-mode DAC, while the summation is achieved by simply shorting the DAC current outputs. By employing parallelism, time-interleaving and current-mode signal processing, this architecture is suitable for equalizing multi-gigabit serial links, albeit at the expense of increased power and area incurred due to massive parallelism.

Fig. 19. A parallelized and time-interleaved analog FIR equalizer.

# 6.3. Continuous time equalizers

The discrete-time receive equalizers discussed thus far need a sampling front-end to perform equalization. This requirement results in two drawbacks. First, the sampling clock jitter reduces the effectiveness of the equalization. Second, in a truly serial communication system, the clock is recovered from the incoming data. However, due to the sampling front-end, the clock recovery loop needs to operate on the raw channel output, resulting in excessive jitter in the recovered clock.<sup>44,45</sup> In order to circumvent the clock recovery problem, practical serial links employing discrete-time FIR equalizers are limited to source-synchronous interfaces<sup>43</sup> containing a separate clock channel, as shown in Fig. 20.

Fig. 20. A source synchronous interface employing discrete-time equalizer. $\Delta\Phi$ compensates for the delay mismatch between clock and data channels.

A continuous-time circuit that can provide high-frequency boost is a very attractive alternative to the transversal filters employing sampling front-ends. A continuous-time equalizer is a simple one-tap continuous-time circuit with a high-frequency gain-boosting transfer function that effectively flattens the channel response. As an example, the required frequency shaping can be achieved by a simple RC network, as shown in Fig. 21. The resistor attenuates the low-frequency signals while the capacitor passes the high-frequency signal content, thus resulting in high-frequency gain boosting. The transfer function and the pole and zero frequencies are given by:

Fig. 21. Continuous-time passive equalizer.

$$H(s) = \frac{R_2}{R_1 + R_2} \cdot \frac{1 + R_1 C_1 s}{1 + \frac{R_1 R_2}{R_1 + R_2} (C_1 + C_2) s} \tag{3}$$

$$\omega_z = \frac{1}{R_1 C_1} \tag{4}$$

$$\omega_p = \frac{1}{\frac{R_1 R_2}{R_1 + R_2} (C_1 + C_2)} \tag{5}$$

$$DC\ gain = \frac{R_2}{R_1 + R_2} \tag{6}$$

The gain-boost factor is proportional to the ratio of the pole and zero frequencies, $\frac{\omega_p}{\omega_z}$, so reasonable amounts of equalization can be achieved by choosing component values that set the required gain boost. For example, the equalizer obtained with $R_1 = 200\Omega$, $C_1 = 1pF$, $R_2 = 65\Omega$ and $C_2 = 0.1pF$ results in considerable eye opening at 5Gbps on the server channel, as shown in Fig. 22.

There are two main disadvantages of simple passive RC equalizers. First, the RC network introduces a large impedance discontinuity at the interface between the channel and the equalizer. Impedance matching networks,<sup>54</sup> often employing inductors, can be used to prevent the discontinuity. However, the large inductors make this approach less suitable for on-chip integration. Second, this method cannot improve SNR, since equalization is performed by attenuating the low-frequency signal spectrum, much like transmit pre-emphasis. For these reasons, this technique has limited use in high-speed serial links.

It is desirable to have a gain greater than one at all frequencies to maximize the benefit from receiver-side equalization. Therefore, equalizers using active circuit elements rather than passive components are required. Active filters with the desired frequency response can be designed using standard filter design techniques.<sup>46</sup> Such standard filters are typically implemented either with operational amplifiers in negative feedback or with a Gm-C filter topology.
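As a numerical check on the passive RC equalizer of Eqs. (3)-(6), the sketch below evaluates the zero frequency, the pole frequency, the DC gain, and the magnitude response for the component values quoted in the text ($R_1 = 200\Omega$, $C_1 = 1pF$, $R_2 = 65\Omega$, $C_2 = 0.1pF$). The function names are assumptions made for this illustration.

```python
import math


def passive_eq_params(r1, c1, r2, c2):
    """Zero frequency, pole frequency (both in rad/s) and DC gain, per Eqs. (4)-(6)."""
    w_z = 1.0 / (r1 * c1)
    r_par = r1 * r2 / (r1 + r2)          # R1 in parallel with R2
    w_p = 1.0 / (r_par * (c1 + c2))
    dc_gain = r2 / (r1 + r2)
    return w_z, w_p, dc_gain


def passive_eq_response(f_hz, r1, c1, r2, c2):
    """Complex transfer function H(j*2*pi*f) of Eq. (3)."""
    w_z, w_p, dc_gain = passive_eq_params(r1, c1, r2, c2)
    s = 2j * math.pi * f_hz
    return dc_gain * (1 + s / w_z) / (1 + s / w_p)


def to_db(x):
    """Magnitude in dB."""
    return 20.0 * math.log10(abs(x))


# Component values from the text.
R1, C1, R2, C2 = 200.0, 1e-12, 65.0, 0.1e-12
W_Z, W_P, DC_GAIN = passive_eq_params(R1, C1, R2, C2)

if __name__ == "__main__":
    print("zero frequency (MHz):", W_Z / (2 * math.pi) / 1e6)
    print("pole frequency (GHz):", W_P / (2 * math.pi) / 1e9)
    print("DC gain (dB)        :", to_db(DC_GAIN))
    print("max boost (dB)      :", to_db(W_P / W_Z))
    print("gain at 2.5GHz (dB) :", to_db(passive_eq_response(2.5e9, R1, C1, R2, C2)))
```

For these values the zero lies near 0.8GHz and the pole near 2.9GHz, the DC gain is about $-12$dB, and the maximum boost $\omega_p/\omega_z$ is about 11dB. The gain never exceeds $C_1/(C_1+C_2) < 1$, which illustrates why a passive network cannot improve SNR.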
However, the negative feedback used in these systems greatly degrades the maximum operating frequency, thus limiting the usefulness of such equalizers to only a few hundred megahertz.<sup>47,48,49</sup> Recent attempts, however, focus on open-loop equalizer architectures to overcome the bandwidth penalty due to negative feedback.<sup>50</sup>

Fig. 22. 5Gbps eye diagram using passive equalizer.

There are several wide-band amplifier design techniques that can provide the high-frequency boost required for equalization. These techniques include bandwidth enhancement by zeros, and tuned and/or cascaded amplifiers.<sup>54</sup> As noted earlier in the design of the passive equalizer, the parallel RC combination introduces a real zero in the transfer function, potentially providing gain peaking. The active equivalent of the passive equalizer can be designed by degenerating a source-coupled pair with the parallel RC network, as shown in Fig. 23.<sup>51,52</sup> The transfer function and the associated pole-zero locations are given by:

$$H(s) = \frac{g_m}{C_L} \cdot \frac{s + \frac{1}{R_D C_D}}{\left(s + \frac{g_m R_D + 1}{R_D C_D}\right)\left(s + \frac{1}{R_L C_L}\right)} \tag{7}$$

$$\omega_z = \frac{1}{R_D C_D} \tag{8}$$

$$\omega_{p1} = \frac{g_m R_D + 1}{R_D C_D} \tag{9}$$

$$\omega_{p2} = \frac{1}{R_L C_L} \tag{10}$$

$$DC\ gain = \frac{g_m R_L}{g_m R_D + 1} \tag{11}$$

Fig. 23. Continuous-time equalizer using capacitive degeneration. $R_L$ and $C_L$ represent the load.

By designing the zero frequency to be lower than the dominant pole, considerable high-frequency gain boosting can be achieved. The amount of this gain boost can be controlled by the ratio of the dominant pole and zero frequencies $(\omega_{p1}/\omega_z)$.

Channel response: (a) Raw. (b) With continuous-time equalizer.
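The behavior of the capacitively degenerated stage can be explored with a short numerical sketch of Eqs. (7)-(11). All component values below ($g_m$, $R_D$, $C_D$, $R_L$, $C_L$) are hypothetical and chosen only to give round numbers. They do not describe the equalizer reported in this paper, and the function names are assumptions for illustration.

```python
import math


def ctle_params(gm, rd, cd, rl, cl):
    """Zero, poles (rad/s) and DC gain of the degenerated stage, per Eqs. (8)-(11)."""
    w_z = 1.0 / (rd * cd)
    w_p1 = (gm * rd + 1.0) / (rd * cd)
    w_p2 = 1.0 / (rl * cl)
    dc_gain = gm * rl / (gm * rd + 1.0)
    return w_z, w_p1, w_p2, dc_gain


def ctle_response(f_hz, gm, rd, cd, rl, cl):
    """Complex transfer function H(j*2*pi*f) of Eq. (7)."""
    w_z, w_p1, w_p2, _ = ctle_params(gm, rd, cd, rl, cl)
    s = 2j * math.pi * f_hz
    return (gm / cl) * (s + w_z) / ((s + w_p1) * (s + w_p2))


def to_db(x):
    """Magnitude in dB."""
    return 20.0 * math.log10(abs(x))


# Hypothetical component values (illustrative only).
GM = 10e-3     # transconductance, S
RD = 200.0     # degeneration resistor, ohm
CD = 1e-12     # degeneration capacitor, F
RL = 400.0     # load resistor, ohm
CL = 100e-15   # load capacitor, F

W_Z, W_P1, W_P2, DC_GAIN = ctle_params(GM, RD, CD, RL, CL)

if __name__ == "__main__":
    print("DC gain (dB)        :", to_db(DC_GAIN))
    print("pole/zero ratio     :", W_P1 / W_Z)
    print("gain at 2.5GHz (dB) :", to_db(ctle_response(2.5e9, GM, RD, CD, RL, CL)))
    boost = to_db(ctle_response(2.5e9, GM, RD, CD, RL, CL)) - to_db(DC_GAIN)
    print("boost at 2.5GHz (dB):", boost)
```

In this sketch the pole-to-zero ratio equals $1 + g_m R_D$, so increasing the degeneration resistance increases the available boost at the cost of DC gain. The boost actually realized at a given frequency is smaller than this ratio because the load pole $\omega_{p2}$ rolls off the response.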
The continuous-time equalizer, implemented in a $0.18\mu m$ CMOS process operating with a 1.8V supply and consuming less than 10mW of power, provides a gain boost of 8dB at 2.5GHz, as depicted in the channel response figure. The equalized eye diagram at a 5Gbps data rate shown in Fig. 24 displays 120ps of timing margin with at least 100mV of voltage margin. Thus, compared with transmit pre-emphasis, the continuous-time receive equalizer provides 50% more timing margin (120ps versus 80ps), reinforcing the benefit of receive-side equalization.

The maximum gain boosting achieved by this method is limited by the bandwidth of the amplifier due to the load capacitance $(\omega_{p2})$. Inductive peaking, shown in Fig. 25(a),<sup>53</sup> or neutralization, shown in Fig. 25(b),<sup>54</sup> can be used to increase the amplifier bandwidth and hence improve the gain-boost factor. Also, a cascade of these equalizer stages can provide higher gain without sacrificing the bandwidth.<sup>54,55</sup>

It is worth mentioning that gain peaking can also be achieved by zeros introduced by a load inductor, as shown in Fig. 26.<sup>56</sup> The equalizer output is the weighted sum of the flat-gain amplifier output and the gain-peaked amplifier output. Implemented in a 150GHz $f_T$ BiCMOS process, this equalizer provides almost 30dB of gain boost at 7GHz and achieves a 10Gbps data rate on a channel consisting of 15 feet of coaxial cable. The large gain boost using a single stage was possible due to the large (9GHz) amplifier bandwidth. However, this equalizer consumes 200mW of power, making it less attractive for mainstream serial links.

Fig. 24. Continuous-time receive equalized 5Gbps eye diagram.

Fig. 25. Continuous-time equalizer bandwidth enhancement: (a) Inductive peaking. (b) Neutralization.

Fig. 26. Continuous-time equalizer using inductor load.

Finally, continuous-time transversal filters, as opposed to discrete-time filters, can be implemented if one can design high-bandwidth analog-delay elements.
A 10Gbps continuous-time analog FIR equalizer using distributed techniques to generate the analog delay was recently proposed.<sup>57</sup> The precise delay is generated by using the transmission-line sections shown in Fig. 27. Even though this method has a potential high-speed advantage, it is not practical at medium to high data rates (5 to 10Gbps) due to the requirement of very long, well-controlled on-chip transmission lines or a large number of area-consuming inductors.<sup>57</sup>

Fig. 27. Transmission-line delay element.

# 6.4. Noise enhancement

We have thus far presented techniques for suppressing ISI, without alluding to other noise sources in the system. In this section we focus on the noise introduced by the continuous-time equalizer itself. The gain-peaking transfer function of the equalizer amplifies the high-frequency noise, potentially degrading the noise margin. Also, in equalizers employing multiple stages, the first stage generally dominates the overall noise.<sup>b</sup> We now estimate the noise contribution of the single equalizer stage shown in Fig. 23. The output noise is typically dominated by the input transistor pair. The one-sided voltage noise power spectral density of the input transistor, given by $\overline{V_{in}^2} = 4kT\gamma/g_m$, is amplified by the equalizer transfer function, resulting in a total output noise

$$
\begin{aligned}
\overline{V_{on,Total}^{2}} &= \int_{0}^{\infty} |H(s)|^{2}\, \overline{V_{in}^{2}}\, df \\
&= \frac{g_{m}}{C_{L}} \frac{\sqrt{\omega_{p1} \cdot \omega_{p2}}}{2(\omega_{p1} + \omega_{p2})} \left[ 1 + \frac{\omega_{z}}{\sqrt{\omega_{p1} \cdot \omega_{p2}}} \right] \overline{V_{in}^{2}}
\end{aligned}
\tag{12}
$$

<sup>b</sup>In an equalizer employing cascaded stages, low frequency input referred noise can be amplified if the DC gain is less than one.

For the equalizer used to achieve about 8dB of gain boost at 2.5GHz in $0.18\mu m$ CMOS technology, the output-referred rms noise voltage $\sqrt{\overline{V_{on,Total}^2}}$ is less than $1mV_{rms}$. For links operating with at least several tens of millivolts of signal swing, this noise enhancement does not limit the overall performance.

# 6.5. Decision feedback equalization

The problem of noise enhancement can be completely eliminated by using the Decision Feedback Equalizer (DFE) shown in Fig. 28. Unlike the aforementioned equalizers, the DFE utilizes the previous decisions to estimate and cancel the ISI introduced by the lossy channel. The feedback filter estimates the ISI based on previous decisions and, therefore, can only cancel post-cursor ISI (i.e. ISI caused by previous symbols). Since the ISI cancellation is based on previous decisions, without any high-frequency boost, it is inherently immune to noise enhancement.

There are three design issues with the DFE. First, the effectiveness of ISI cancellation is based on the assumption that all previous decisions are correct, and therefore bit errors can exacerbate ISI instead of cancelling it. This problem is referred to as error propagation. However, in the case of serial links with a required BER $< 10^{-12}$, error propagation does not degrade the performance.<sup>58</sup> Second, the DFE can cancel only post-cursor ISI, and therefore a separate feed-forward filter is required to cancel pre-cursor ISI. An analog FIR equalizer or transmit pre-emphasis<sup>25,64</sup> optimized to cancel pre-cursor ISI can be used. Finally, the DFE implementation suffers from the feedback loop latency illustrated as the critical timing path in Fig. 28. The loop latency due to the input slicer regeneration time and the coefficient DAC settling time should be less than the bit period in order for the feedback to cancel the first post-cursor ISI. However, at high data rates, this loop delay is more than several bit periods.

Fig. 28. Decision feedback equalizer.

Fig. 29. 1-bit decision look-ahead feedback equalizer.

Decision look-ahead schemes, as shown for a single-bit case in Fig. 29, are
used to circumvent the latency issue.<sup>59,60,61,62,63</sup> Two parallel receivers resolve the channel output for the two possible previous outputs (+1,-1). The correct output is then selected by the previous bit using a simple multiplexor. Note that the hardware to implement this look-ahead scheme grows exponentially with the number of taps, thus limiting it to only few taps in practical implementations. # 7. Transmit and Receive Equalizers We have considered transmit and receive equalizers independently thus far. As mentioned earlier, transmit pre-emphasis suffers from peak power constraint and the receive equalizer performance is constrained by several non-idealities such as the limited amplifier bandwidth, noise enhancement and amplifier non-linearity. However, some of these issues can be circumvented by using both transmit and receive equalizers together. Employing a 3-tap transmit pre-emphasis filter and a single stage receive equalizer the eye diagram shown in Fig. 30 displays 50ps of Fig. 30. 8Gbps eye diagram with both transmit pre-emphasis and receive equalization. timing margin with at least 100mV voltage margin. The amount of equalization performed by receiver and the transmitter can be evaluated by the pulse responses shown in Fig. 31. Fig. 31(a) displays the raw 8Gbps pulse response clearly illustrating both pre-cursor and post-cursor ISI. Fig. 31(b) depicts the receive equalized pulse response. There are two important things to note. First, the receive equalizer amplifies the cursor tap while attenuating post-cursor ISI. Second, the dominant left-over ISI is due to first pre-cursor and post-cursor terms. Therefore, a 3-tap pre-emphasis is used to reduce these ISI terms as shown in Fig. 31(c). Fig. 31. 8Gbps pulse response: (a) Raw. (b) Receive equalized. (c) Receive and transmit equalized. # 8. Adaptation In a practical transmission system, the exact channel characteristics are not known a priori. 
Therefore, a pre-designed equalizer can be grossly sub-optimal. For example, the channel length can vary from one application to another, or the loss profile of the channel may vary due to variations in the PCB fabrication process. For these reasons, the equalizer coefficients are set adaptively. The conceptual block diagram illustrating the operation of an adaptive equalizer is shown in Fig. 32. In this generic block diagram the equalizer could be of any type: discrete-time FIR, continuous-time FIR, or continuous-time analog equalizer. The adaptive engine automatically adjusts the coefficients by measuring the equalizer performance so as to improve the performance on average.

Fig. 32. Adaptive equalizer concept.

There are several algorithms<sup>65</sup> that can be used for adapting the equalizer. Of these, the most popular ones for compact hardware implementation are the Least Mean Squares (LMS) and the Zero-Forcing (ZF) algorithms or their variants. The LMS algorithm optimizes the filter coefficients by minimizing the mean squared error. The coefficient update equation in the LMS algorithm is given by Eq. (13):

$$c_{(k+1,n)} = c_{(k,n)} + \mu \cdot e_k \cdot x_{k-n} \quad \text{for } n = 0 \cdots N \tag{13}$$

where $c_{(k+1,n)}$ represents the $n^{th}$ filter coefficient with N taps at the $(k+1)^{th}$ update, $\mu$ is the update step size, $e_k = \widehat{d_k} - y_k$ is the error in the equalizer output $y_k$, $\widehat{d_k}$ is the best estimate of the transmitted bit, and $x_k$ is the channel output. With the error defined in this way, the positive sign in the update moves each coefficient in the direction that reduces the squared error. Analog multipliers are required to realize the update equation ($e_k$ and $x_k$ are analog in nature), making its hardware implementation difficult. The sign-sign LMS update algorithm is given by:

$$c_{(k+1,n)} = c_{(k,n)} + \mu \cdot sign(e_k) \cdot sign(x_{k-n}) \quad \text{for } n = 0 \cdots N \tag{14}$$

Eq. (14) obviates the need for an analog multiplier, thus making it more amenable to on-chip integration.<sup>58,66,67</sup> The adaptation engine consists of UP/DOWN counters that count up or down according to the product of the error sign and the data sign. Since a quantized error is used to update the tap weights, the convergence time of sign-sign LMS is generally longer than that of the traditional LMS algorithm. In most serial-link applications, this increased convergence time is not a problem. Training sequences, as shown in Fig. 33, are used to ease the convergence of the tap weights. After training for a time period sufficient for tap weight convergence, the training sequence is replaced by the decision of the receiver for adaptation to continue.

Fig. 33. Adaptive equalizer with a training sequence.

Continuous-time analog equalizers can also be adapted using similar concepts but require different error detection mechanisms. Interested readers can refer to Refs. 47, 48, 49, 52 and 68 for more information.

# 9. Conclusions

We have presented different equalizer architectures suitable for medium-to-high data rate serial-link applications. Design trade-offs in both transmit-side and receive-side equalizers were presented. Transmit pre-emphasis suffers from a peak power constraint, while receive equalizer performance is limited by amplifier bandwidth. The availability of high $f_T$ transistors, coupled with the need for large amounts of equalization, will make receive equalization a very attractive alternative in the future. Finally, techniques to adaptively set equalizer settings were described.

#### References

- 1. D. Schmidt, "Circuit pack parameter estimation using Rent's rule," *IEEE Trans. on Computer Aided Design of Integrated Circuits and Systems*, pp. 186-192, Oct. 1982.
- 2. G. Moore, "Cramming more components onto integrated circuits," *Electronics*, pp. 114-117, April 1965.
- 3. D. Pozar, *Microwave Engineering*, Second edition, John Wiley & Sons, Inc., 1998.
- 4. W. Dally and J.
Poulton, *Digital Systems Engineering*, Cambridge University Press, 1998.
- 5. H. Bakoglu, *Circuits, Interconnections, and Packaging for VLSI*, Addison-Wesley Publishing Company, 1990.
- 6. H. Johnson and M. Graham, *High-Speed Digital Design*, Prentice Hall, 1993.
- 7. S. Kim and D. P. Neikirk, "Compact equivalent circuit model for the skin effect," *IEEE MTTS Dig. Tech. Papers*, pp. 17-21, Jun. 1996.
- 8. H. Wheeler, "Formulas for the skin-effect," *Proc. Institute of Radio Engineers*, pp. 412-424, 1942.
- 9. "Advanced Design Systems," *ADS Design Guides*, Agilent Technologies, 2000.
- 10. E. Lee and D. Messerschmitt, *Digital Communication*, Kluwer Academic Publishers, 1994.
- 11. B. Casper, M. Haycock, and R. Mooney, "An accurate and efficient analysis method for multi-Gb/s chip-to-chip signaling schemes," *IEEE VLSI Circuits Sym. Tech. Papers*, pp. 54-57, Jun. 2002.
- 12. B. Ahmad, "Performance specification of interconnects," *DesignCon*, 2003.
- 13. R. Kollipara, G. Yeh, B. Chia, A. Agarwal, "Design, modeling and characterization of high-speed backplane interconnects," *DesignCon*, 2003.
- 14. S. Sercu and J. De Geest, "BER Link Simulations," *DesignCon*, 2003.
- 15. V. Stojanovic, M. Horowitz, "Modeling and analysis of high-speed links," *Proc. of IEEE CICC*, pp. 589-594, Sep. 2003.
- 16. P. Hanumolu, B. Casper, R. Mooney, G. Wei and U. Moon, "Analysis of PLL clock jitter in high-speed serial links," *IEEE Trans. Circuits Syst. II*, vol. 50, no. 11, pp. 879-886, Nov. 2003.
- 17. E. Alon, V. Stojanovic, M. Horowitz, "Circuits and techniques for high-resolution measurement of on-chip power supply noise," *IEEE VLSI Circuits Sym. Tech. Papers*, pp. 102-105, Jun. 2004.
- 18. W. Dally and J. Poulton, "Transmitter Equalization for 4-Gbps Signaling," *IEEE Micro*, pp. 48-56, 1997.
- 19. A. Fiedler, R. Mactaggart, J. Welch, and S. Krishnan, "A 1.0625Gbps transceiver with 2x-oversampling and transmit signal pre-emphasis," *ISSCC Dig. Tech. Papers*, pp. 238-239, Feb. 1997.
- 20. R. Gu, J. Tran, H.
Lin, A. Yee, and M. Izzard, "A 0.5-3.5Gb/s low-power low-jitter serial data CMOS transceiver," *ISSCC Dig. Tech. Papers*, pp. 352-353, Feb. 1999.
- 21. B. Casper, A. Martin, J. Jaussi, J. Kennedy, and R. Mooney, "8Gb/s differential simultaneous bidirectional link with 4mV 9ps waveform capture diagnostic capability," *ISSCC Dig. Tech. Papers*, pp. 78-79, Feb. 2003.
- 22. R. Farjad-Rad, C. Yang, and M. Horowitz, "A $0.4 \mu m$ CMOS 10-Gb/s 4-PAM preemphasis serial link transmitter," *IEEE J. Solid-State Circuits*, vol. 34, no. 5, pp. 580-585, May 1999.
- 23. M. Lee, W. Dally, P. Chiang, "A 90 mW 4 Gb/s equalized I/O circuit with input offset cancellation," *ISSCC Dig. Tech. Papers*, Feb. 2000.
- 24. B. Lee, M. Hwang, S. Lee, and D. Jeong, "A 2.5-10Gb/s CMOS transceiver with alternating edge sampling phase detection for loop characteristic stabilization," *ISSCC Dig. Tech. Papers*, pp. 76-77, Feb. 2003.
- 25. J. Zerbe, P. Chau, C. Werner, W. Stonecypher, H. Liaw, G. Yeh, T. Thrush, S. Best, K. Donnelly, "A 2Gb/s/pin 4-PAM parallel bus interface with transmit crosstalk cancellation, equalization, and integrating receivers," *ISSCC Dig. Tech. Papers*, pp. 66-67, Feb. 2001.
- 26. J. Zerbe, C. Werner, V. Stojanovic, F. Chen, J. Wei, G. Tsang, D. Kim, W. Stonecypher, A. Ho, T. Thrush, R. Kollipara, M. Horowitz, K. Donnelly, "Equalization and clock recovery for a 2.5-10-Gb/s 2-PAM/4-PAM backplane transceiver cell," *IEEE J. Solid-State Circuits*, pp. 2121-2130, Dec. 2003.
- 27. C. Yang, V. Stojanovic, S. Modjtahedi, M. Horowitz, W. Ellersick, "A serial-link transceiver based on 8-Gsamples/s A/D and D/A converters in $0.25 \mu m$ CMOS," *IEEE J. Solid-State Circuits*, pp. 1684-1692, Nov. 2001.
- 28. K. Azadet, C. Nicol, "Low-power equalizer architectures for high speed modems," *IEEE Communications Magazine*, pp. 118-126, Oct. 1998.
- 29. C. Nicol, P. Larsson, K. Azadet, J. O'Neill, "A low-power 128-tap digital adaptive equalizer for broadband modems," *IEEE J. Solid-State Circuits*, pp.
1777-1789, Nov. 1997.
- 30. D. Moloney, J. O'Brien, E. O'Rourke, F. Brianti, "Low-power 200-Msps, area-efficient, five-tap programmable FIR filter," *IEEE J. Solid-State Circuits*, vol. 33, pp. 1134-1138, July 1998.
- 31. C. Wong, J. Rudell, G. Uehara, P. Gray, "A 50 MHz eight-tap adaptive equalizer for partial-response channel," *IEEE J. Solid-State Circuits*, vol. 30, pp. 228-234, Mar. 1995.
- 32. L. Thon, P. Sutardja, F. Lai, G. Coleman, "A 240 MHz 8-tap programmable FIR filter for disk-drive read channels," *ISSCC Dig. Tech. Papers*, pp. 82-83, Feb. 1995.
- 33. R. Staszewski, K. Muhammad, P. Balsara, "550-MSample/s 8-Tap FIR digital filter for magnetic recording read channels," *IEEE J. Solid-State Circuits*, vol. 35, pp. 1205-1210, Aug. 2000.
- 34. S. Rylov et al., "A 2.3 GSample/s 10-tap digital FIR filter for magnetic recording read channels," *ISSCC Dig. Tech. Papers*, pp. 190-191, Feb. 2001.
- 35. E. Haratsch, K. Azadet, "A 1-Gb/s joint equalizer and trellis decoder for 1000BASE-T gigabit Ethernet," *IEEE J. Solid-State Circuits*, vol. 36, pp. 374-384, July 2000.
- 36. K. Azadet, M. Yu, P. Larsson, D. Inglis, "A gigabit transceiver chip set for UTP CAT-6 cables in digital CMOS technology," *ISSCC Dig. Tech. Papers*, pp. 306-307, Feb. 2000.
- 37. J. Buckwalter, A. Hajimiri, "An active analog delay and the delay reference loop," *Proc. of IEEE RFIC Symposium*, pp. 17-20, Jun. 2004.
- 38. J. Yang, J. Kim, S. Byun, C. Conroy, B. Kim, "A quad-channel 3.125Gb/s/ch serial-link transceiver with mixed-mode adaptive equalizer in $0.18\mu m$ CMOS," *ISSCC Dig. Tech. Papers*, pp. 176-177, Feb. 2004.
- 39. D. Xu, Y. Song, G. Uehara, "A 200MHz 9-tap analog equalizer for magnetic disk read channels in 0.6µm CMOS," *ISSCC Dig. Tech. Papers*, pp. 74-75, Feb. 1996.
- 40. R. Farjad-Rad, C. Yang, and M. Horowitz, "A 0.3-μm CMOS 8-Gb/s 4-PAM serial link transceiver," *IEEE J. Solid-State Circuits*, vol. 35, pp. 757-764, May 2000.
- 41. X. Wang, R.
Spencer, "A low-power 170-MHz discrete-time analog FIR filter," *IEEE J. Solid-State Circuits*, vol. 35, pp. 417-426, March 1998.
- 42. T. Lee, B. Razavi, "A 125-MHz CMOS mixed-signal equalizer for Gigabit Ethernet on copper wire," *Proc. of IEEE CICC*, pp. 131-134, May 2001.
- 43. J. Jaussi, G. Balamurugan, D. Johnson, B. Casper, A. Martin, J. Kennedy, N. Shanbhag, R. Mooney, "An 8Gb/s Source-Synchronous I/O Link with adaptive receiver equalization, offset cancellation and clock deskew," *ISSCC Dig. Tech. Papers*, pp. 242-243, Feb. 2004.
- 44. J. Buckwalter, A. Hajimiri, "A 10Gb/s data-dependent jitter equalizer," *Proc. of IEEE CICC*, pp. 39-42, Oct. 2004.
- 45. J. Buckwalter, B. Analui, A. Hajimiri, "Data-dependent jitter and crosstalk-induced bounded uncorrelated jitter in copper interconnects," *IEEE MTTS Dig. Tech. Papers*, pp. 1627-1630, Jun. 2004.
- 46. R. Schaumann, M. Van Valkenburg, *Design of Analog Filters*, Oxford University Press.
- 47. J. Babanezhad, "A 3.3-V Analog adaptive line-equalizer for fast ethernet data connection," *Proc. of IEEE CICC*, pp. 343-346, May 1998.
- 48. G. Hartman, K. Martin, A. McLaren, "Continuous-time adaptive-analog coaxial cable equalizer in $0.5\mu m$ CMOS," *Proc. of IEEE ISCAS*, pp. 97-100, May 1999.
- 49. O. Shoaei et al., "A 3V Low-Power 0.25μm CMOS 100Mb/s receiver for fast ethernet," *ISSCC Dig. Tech. Papers*, pp. 308-309, Feb. 1996.
- 50. Y. Kudoh, M. Fukaishi, M. Mizuno, "A 0.13μm CMOS 5-Gb/s 10-meter 28AWG Cable transceiver with no-feedback-loop continuous-time post-equalizer," *IEEE VLSI Circuits Sym. Tech. Papers*, pp. 64-67, Jun. 2002.
- 51. R. Farjad-Rad et al., "0.622-8.0 Gbps 150 mW serial IO macrocell with fully flexible pre-emphasis and equalization," *IEEE VLSI Circuits Sym. Tech. Papers*, pp. 63-66, Jun. 2003.
- 52. J. Choi, M. Hwang, D. Jeong, "A 0.18μm CMOS 3.5-Gb/s continuous-time adaptive cable equalizer using enhanced low-frequency gain control method," *IEEE J. Solid-State Circuits*, vol. 39, pp. 419-425, March 2004.
- 53. S.
Mohan, M. Hershenson, S. Boyd, T. Lee, "Bandwidth extension in CMOS with optimized on-chip inductors," *IEEE J. Solid-State Circuits*, vol. 35, pp. 346-355, March 2000.
- 54. T. Lee, *The Design of CMOS Radio-Frequency Integrated Circuits*, Cambridge University Press, 1998.
- 55. S. Galal, B. Razavi, "10Gb/s Limiting Amplifier and Laser/Modulator Driver in 0.18µm CMOS technology," *ISSCC Dig. Tech. Papers*, pp. 188-189, Feb. 2003.
- 56. G. Zhang, P. Chaudhari, M. Green, "A BiCMOS 10Gb/s adaptive cable equalizer," *ISSCC Dig. Tech. Papers*, pp. 482-483, Feb. 2004.
- 57. H. Wu, J. Tierno, P. Pepeljugoski, J. Schaub, S. Gowda, J. Kash, A. Hajimiri, "Integrated transversal equalizers in high-speed fiber-optic systems," *IEEE J. Solid-State Circuits*, vol. 38, pp. 2131-2137, Dec. 2003.
- 58. V. Balan, J. Caroselli, J. Chem, C. Desai, C. Liu, "A 4.8-6.4 Gbps serial link for backplane applications using decision feedback equalization," *Proc. of IEEE CICC*, pp. 31-34, Oct. 2003.
- 59. K. Parhi, "High-speed architectures for algorithms with quantizer loops," *Proc. of IEEE ISCAS*, pp. 2357-2360, May 1990.
- 60. S. Kasturia, J. Winters, "Techniques for high-speed implementation of nonlinear cancellation," *IEEE J. Selected Areas in Communications*, vol. 38, pp. 711-717, Jun. 1991.
- 61. Y. Sohn, S. Bae, H. Park, C. Kim, S. Cho, "A 2.2 Gbps CMOS look-ahead DFE receiver for multidrop channel with pin-to-pin time skew compensation," *Proc. of IEEE CICC*, pp. 473-476, Sep. 2003.
- 62. R. Kajley, P. Hurst, J. Brown, "A mixed-signal decision-feedback equalizer that uses a look-ahead architecture," *IEEE J. Solid-State Circuits*, vol. 32, pp. 450-459, Mar. 1997.
- 63. V. Stojanovic *et al.*, "Adaptive equalization and data recovery in a dual-mode (PAM2/4) serial link transceiver," *IEEE VLSI Circuits Sym. Tech. Papers*, pp. 348-351, Jun. 2004.
- 64. M. Tomlinson, "New automatic equalizer employing modulo arithmetic," *Electr. Let.*, pp. 138-139, March 1971.
- 65. J.
Proakis, *Digital Communications*, McGraw-Hill Education, 2000.
- 66. J. Stonick, G. Wei, J. Sonntag, D. Weinlader, "An adaptive PAM-4 5-Gb/s backplane transceiver in 0.25μm CMOS," *IEEE J. Solid-State Circuits*, vol. 38, pp. 436-443, Mar. 2003.
- 67. G. Balamurugan et al., "Receiver adaptation and system characterization of an 8Gbps source-synchronous I/O link using on-die circuits in 0.13μm CMOS," *IEEE VLSI Circuits Sym. Tech. Papers*, pp. 356-359, Jun. 2004.
- 68. A. Baker, "An adaptive cable equalizer for serial digital video rates to 400 Mb/s," *ISSCC Dig. Tech. Papers*, pp. 174-175, Feb. 1996.
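
As an illustration of the sign-sign LMS update of Eq. (14) in Section 8, the following is a minimal behavioral sketch and not the implementation of any of the cited transceivers. Everything specific in it is an assumption made for illustration: the two-tap channel `[1.0, 0.4]`, the four-tap equalizer, the step size, the random ±1 training data, and the names `sign` and `run_sign_sign_lms`. The error is taken as the training symbol minus the equalizer output, so that the positive sign in the update reduces the error. Each tap moves by a fixed step of ±μ per symbol, which mirrors the UP/DOWN counter behavior described in the text.

```python
import random


def sign(v):
    """Return +1.0, -1.0, or 0.0 according to the sign of v."""
    return float((v > 0) - (v < 0))


def run_sign_sign_lms(num_symbols=20000, num_taps=4, mu=1e-3, seed=1):
    """Adapt an FIR equalizer with sign-sign LMS on known training data.

    Returns the final tap weights and the mean squared error measured
    over the last 2000 symbols. Setting mu=0.0 freezes the taps and
    gives the unequalized baseline.
    """
    rng = random.Random(seed)
    channel = [1.0, 0.4]                    # assumed channel: cursor + one post-cursor
    taps = [1.0] + [0.0] * (num_taps - 1)   # start as a pass-through filter
    d_hist = [0.0] * len(channel)           # recent transmitted symbols
    x_hist = [0.0] * num_taps               # recent channel outputs x_{k-n}
    sq_err = []
    for _ in range(num_symbols):
        d = rng.choice((-1.0, 1.0))         # training symbol d_k
        d_hist = [d] + d_hist[:-1]
        x = sum(h * s for h, s in zip(channel, d_hist))   # channel output x_k
        x_hist = [x] + x_hist[:-1]
        y = sum(c * v for c, v in zip(taps, x_hist))      # equalizer output y_k
        e = d - y                           # error: desired minus output
        sq_err.append(e * e)
        # Eq. (14): every tap steps by +/- mu; no analog multiplier is needed.
        taps = [c + mu * sign(e) * sign(v) for c, v in zip(taps, x_hist)]
    window = 2000
    return taps, sum(sq_err[-window:]) / window


if __name__ == "__main__":
    _, baseline_mse = run_sign_sign_lms(mu=0.0)
    taps, adapted_mse = run_sign_sign_lms()
    print("MSE without adaptation:", baseline_mse)
    print("MSE after adaptation:  ", adapted_mse)
    print("adapted taps:", [round(c, 3) for c in taps])
```

For this assumed channel the taps should drift toward the truncated inverse of the channel (roughly 1, -0.4, 0.16, ...) and then dither around it by about one step size. After training, the training symbol `d` could be replaced by the receiver decision `sign(y)` to continue adaptation in decision-directed mode, as described in Section 8.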