Carry-chain propagation delay impacts on resolution of FPGA-based TDC

NUCLEAR ELECTRONICS AND INSTRUMENTATION

DONG Lei, YANG Jun-Feng, SONG Ke-Zhui

Nuclear Science and Techniques

Vol.25, No.3

Article number 030401

Published in print 20 Jun 2014

Available online 20 Jun 2014

DOI: 10.13538/j.1001-8042/nst.25.030401

This paper introduces the architecture of carry chains in Field-Programmable Gate Arrays (FPGAs). The propagation delay times of rising and falling edges in the carry chain are calculated from this architecture and are predicted to be unequal in most cases. Tests show that the propagation delay times measured in an Altera EP3C120F484C8N FPGA agree with this prediction. The difference in propagation delay time leads to different accuracies of the Time-to-Digital Converter (TDC), so it should be taken into account when designing a TDC in an FPGA in order to obtain better accuracy.

FPGA Firmware; Carry chains; Propagation delay time; TDC

I. INTRODUCTION

Time-to-Digital Converters (TDCs) are widely used in scientific applications. High-resolution TDCs are indispensable in many physics experiments, such as the time-of-flight (TOF) systems for target recoil-ion momentum spectroscopy at the Institute of Modern Physics (IMP), Chinese Academy of Sciences [1]. Two interpolation methods used in high-resolution applications, the vernier method and the tapped delay line (TDL) method, have been successfully implemented in field-programmable gate arrays (FPGAs) [2].

At present, TDCs implemented in FPGAs with the tapped delay line method, which take advantage of the reduced delays of the dedicated arithmetic (carry-chain) routing structures, are an efficient and successful way to achieve low dead-time measurements.

In this work, to further study TDC implementation in FPGAs, we analyzed the carry-chain routing structures in detail and found that rising and falling edges should propagate at different speeds. To verify this prediction, we tested the relationship between propagation speed and TDC accuracy. The results show that the propagation speed affects the TDC accuracy, which makes the choice of edge important in TDC design.

II. THE ARCHITECTURE OF THE CARRY CHAIN

The most common FPGA architecture consists of an array of logic blocks, I/O pads, and routing channels. A logic block is called a Configurable Logic Block (CLB) or a Logic Array Block (LAB), depending on the vendor. In general, it consists of a few logic cells (called ALM, LE, Slice, etc.) [3-5]. A typical cell has a 4-input Lookup Table (LUT), a Full Adder (FA) and a D-type flip-flop, as shown in Fig. 1, where one tap of the carry chain is marked. The carry-in signal is the bit carried in from the next less significant stage, and the carry-out signal represents an overflow into the next digit of a multi-digit addition.

Fig. 1.

Simplified example illustration of a logic cell.

A logic circuit that adds N-bit numbers can be built from multiple full adders. Each full adder takes the carry-out signal of the previous adder as its carry-in signal. This kind of adder is called a ripple-carry adder; a 4-bit example is shown in Fig. 2. When an N-bit ripple-carry adder is implemented in an FPGA, N-1 taps of carry chain are obtained. The carry chain is commonly used as the tapped delay line in FPGA-based TDC designs.

Fig. 2.

4-bit ripple-carry adder.
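The ripple-carry structure can be sketched in a few lines of Python. This is only a behavioural model (no timing), and the function names are illustrative rather than taken from the paper. Besides ordinary addition, it shows the configuration used for the delay chain in the next section: with A = 1 and B = 0 in every stage, each carry-out simply repeats the carry-in, so an edge applied at the first carry-in ripples through every tap.

```python
def full_adder(a, b, c_in):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ c_in
    c_out = (a & b) | (c_in & (a ^ b))
    return s, c_out


def ripple_carry(a_bits, b_bits, c_in=0):
    """Add two equal-length bit lists, least significant bit first.

    Returns the sum bits and the carry-out of every stage, i.e. the
    logic level seen at each tap of the carry chain.
    """
    sums, carries = [], []
    carry = c_in
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sums.append(s)
        carries.append(carry)
    return sums, carries


# Ordinary 4-bit addition: 6 + 7 (bits listed LSB first).
print(ripple_carry([0, 1, 1, 0], [1, 1, 1, 0]))

# Delay-chain configuration: A = 1, B = 0 in every stage.
# Every carry-out follows the carry-in applied to the first stage.
print(ripple_carry([1] * 4, [0] * 4, c_in=1)[1])
print(ripple_carry([1] * 4, [0] * 4, c_in=0)[1])
```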

III. PROPAGATION TIMES OF RISING AND FALLING EDGES IN THE CARRY CHAIN

As the dedicated arithmetic routing structure in FPGAs, a carry chain consists of a series of full adders. Its propagation time is the sum of the propagation time in the full adders and the propagation time in the interconnect between them. The interconnect is generally short and contributes equally to the propagation time of rising and falling edges, so we consider only the propagation time in the full adder.

By analyzing a full adder circuit, the propagation time can be calculated. As most FPGAs are manufactured in CMOS process technology, the typical CMOS full adder in Fig. 3 is analyzed in this paper. To make the adder work as a delay chain, input A is set to 1 and input B is set to 0. With C_out logically equal to C_in, the propagation time in the full adder is the time from C_in to C_out. The circuit analysis follows Ref. [6]. Fig. 4 shows the circuit with the FET resistances and the capacitances. For convenience, the circuit is divided into two stages, whose propagation times are calculated separately.

Fig. 3.

A CMOS full adder schematic.

Fig. 4.

The circuit for calculating the propagation time.

Figure 5 shows the sub-circuits for the output transients of Stage 1. R_n and R_p are the parasitic resistances of the NMOS and PMOS transistors, respectively. C_out1 is the output capacitance of Stage 1. Cx is the parasitic capacitance between the NMOS transistors, and Cy is the parasitic capacitance between the PMOS transistors. The propagation time of the falling edge is calculated using the circuit in Fig. 5(a). The output voltage can be written as

Fig. 5.

Discharge circuit (a) and Charging circuit (b) for stage 1.

V_{out} (t) = V_{dd} e^{- t / τ_{n}} .

(1)

The two discharge paths are shown as i_dis.1 and i_dis.2. According to the Elmore formula, the time constant of the main discharge path i_dis.1 is

τ_{n 1} = C_{out1} (2 R_{n}) .

(2)

Since input A is set to 1 and input B is set to 0, NMOS1 and PMOS1 remain in the conducting state. Cx therefore does not discharge during this event, and the time constant of discharge path i_dis.2 is τ_n2 = 0. The total time constant is obtained by superposition:

\begin{aligned} τ_{na} & = τ_{n1} + τ_{n2} \\ & = C_{out1} (2 R_{n}) . \end{aligned}

(3)

Then the propagation time of the falling edge in Stage 1, according to Ref. [6], is

t_{pf1} = \ln 2 τ_{na} .

(4)
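The factor ln 2 in Eq. (4) follows directly from Eq. (1) if the propagation time is taken as the instant at which the output crosses V_dd/2. This 50% convention is the usual one and is assumed here, since the text does not state it explicitly:

```latex
V_{out}(t_{pf1}) = V_{dd} e^{- t_{pf1} / τ_{na}} = \frac{V_{dd}}{2}
\;\Longrightarrow\;
t_{pf1} = τ_{na} \ln 2 \approx 0.69 \, τ_{na} .
```

The same argument applied to a charging transient, V_out(t) = V_dd (1 - e^{-t/τ}), gives the same ln 2 factor for the rising-edge expressions.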

The propagation time of the rising edge is computed using the circuit in Fig. 5(b). The time constant of the main charging path i_ch.1 is

τ_{p1} = C_{out1} (2 R_{p}) .

(5)

As PMOS1 remains in the conducting state, the time constant of charging path i_ch.2 is τ_p2 = 0. The total time constant is

\begin{aligned} τ_{pa} & = τ_{p1} + τ_{p2} \\ & = C_{out1} (2 R_{p}) . \end{aligned}

(6)

Therefore, the propagation time of rising edge in Stage 1 is

t_{pr1} = \ln 2 τ_{pa} .

(7)

Figure 6 shows the sub-circuit for output transients of Stage 2. The time constant of discharge path i_dis is

Fig. 6.

Discharge circuit (a) and Charging circuit (b) for stage 2.

τ_{n b} = C_{out2} R_{n} .

(8)

where C_out2 is the output capacitance of Stage 2. The propagation time of the falling edge in Stage 2 is

t_{pf2} = \ln 2 τ_{nb} .

(9)

The time constant of charging path i_ch is

τ_{pb} = C_{out2} R_{p} .

(10)

So, the propagation time of rising edge in Stage 2 is

t_{pr2} = \ln 2 τ_{pb} .

(11)

Then, the propagation time of the rising edge from C_in to C_out is

\begin{aligned} t_{pr} & = t_{pf1} + t_{pr2} \\ & = \ln 2 (2 C_{out1} R_{n} + C_{out2} R_{p}) . \end{aligned}

(12)

The propagation time of the falling edge from C_in to C_out is

\begin{aligned} t_{pf} & = t_{pr1} + t_{pf2} \\ & = \ln 2 (2 C_{out1} R_{p} + C_{out2} R_{n}) . \end{aligned}

(13)

The values of R_p and R_n can be calculated according to Ref. [6]:

R_{p} = \frac{1}{k_{p}^{'} {(W / L)}_{p} (V_{dd} - | V_{Tp} |)},

(14)

R_{n} = \frac{1}{k_{n}^{'} {(W / L)}_{n} (V_{dd} - | V_{Tn} |)},

(15)

where $k_{n}^{'}$ and $k_{p}^{'}$ are the nFET and pFET process transconductances, respectively (typically $k_{n}^{'} / k_{p}^{'} = 2 \sim 3$ ); (W/L)_p and (W/L)_n are the width-to-length ratios of the p-channel and n-channel MOSFETs, respectively; and |V_Tp| and |V_Tn| are the threshold voltages of the p-channel and n-channel MOSFETs, respectively.

A large width-to-length ratio increases the circuit speed, but it also increases the circuit area. Balancing speed against area, the R_p to R_n ratio is typically

\frac{R_{p}}{R_{n}} \approx 1 \sim 3.

(16)

The smaller the R_p/R_n ratio, the larger the area of the circuit.
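Dividing Eq. (14) by Eq. (15) makes this area trade-off explicit. The last step below additionally assumes |V_Tp| ≈ |V_Tn|, which the text does not state:

```latex
\frac{R_{p}}{R_{n}}
= \frac{k_{n}^{'} {(W / L)}_{n} (V_{dd} - | V_{Tn} |)}{k_{p}^{'} {(W / L)}_{p} (V_{dd} - | V_{Tp} |)}
\approx \frac{k_{n}^{'}}{k_{p}^{'}} \cdot \frac{{(W / L)}_{n}}{{(W / L)}_{p}} .
```

Under that assumption, and with $k_{n}^{'} / k_{p}^{'} = 2 \sim 3$, equal width-to-length ratios give R_p/R_n ≈ 2~3, while R_p/R_n ≈ 1 requires (W/L)_p ≈ 2~3 times (W/L)_n, that is, wider p-channel devices and hence more area.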

Figure 7 shows t_pr and t_pf as functions of C_out2 for different R_p/R_n ratios, assuming R_n = 300 Ω and C_out1 = 50 fF. A larger R_p/R_n ratio results in a greater difference between t_pr and t_pf. The two are equal only at C_out2 = 2C_out1 (or when R_p = R_n); otherwise t_pr differs from t_pf. Which case applies depends on the process used to fabricate the FPGA.

Fig. 7.

Propagation time for different R_p to R_n ratios and C_out2 values (R_n = 300 Ω, C_out1 = 50 fF).
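Eqs. (12) and (13) are simple enough to evaluate directly. The sketch below uses the stated R_n = 300 Ω and C_out1 = 50 fF; the sampled C_out2 values and the function names are illustrative choices, not taken from the paper. Subtracting the two equations gives t_pr - t_pf = ln 2 (R_p - R_n)(C_out2 - 2C_out1), so the sign of the difference flips at C_out2 = 2C_out1.

```python
import math

LN2 = math.log(2)


def t_pr(r_n, r_p, c_out1, c_out2):
    """Rising-edge propagation time from C_in to C_out, Eq. (12), in seconds."""
    return LN2 * (2 * c_out1 * r_n + c_out2 * r_p)


def t_pf(r_n, r_p, c_out1, c_out2):
    """Falling-edge propagation time from C_in to C_out, Eq. (13), in seconds."""
    return LN2 * (2 * c_out1 * r_p + c_out2 * r_n)


R_N = 300.0        # ohm, value stated in the text
C_OUT1 = 50e-15    # farad, value stated in the text

# Illustrative sample points for C_out2 (below, at, and above 2 * C_out1).
C_OUT2_SAMPLES = (50e-15, 100e-15, 150e-15)

for ratio in (1, 2, 3):
    r_p = ratio * R_N
    for c_out2 in C_OUT2_SAMPLES:
        rise = t_pr(R_N, r_p, C_OUT1, c_out2)
        fall = t_pf(R_N, r_p, C_OUT1, c_out2)
        print(
            f"Rp/Rn={ratio}  C_out2={c_out2 * 1e15:5.0f} fF  "
            f"t_pr={rise * 1e12:6.1f} ps  t_pf={fall * 1e12:6.1f} ps"
        )
```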

IV. MEASUREMENT OF THE PROPAGATION TIME OF RISING AND FALLING EDGES IN THE CARRY CHAIN

There are at least two approaches to measuring the propagation time in the carry chain: the double registration approach [7] and the statistical approach [8]. Both are usually used as digital calibration methods for TDCs implemented in FPGAs. The double registration approach measures only the average cell delay. When the bin widths differ and a bin-by-bin measurement is needed, the statistical approach is preferable [9], so we use it here.

The FPGA TDC was used to measure the propagation time in the carry chain. A simplified block diagram of the TDC implemented in the FPGA is shown in Fig. 8 [10]. It includes two crucial parts: the time delay lines (the carry chain) and the coarse time counters. The TDC is based on the counter and interpolation method; a detailed description can be found in Ref. [11].

Fig. 8.

(Color online) Block diagram of the time-to-digital converter implemented in a single FPGA device.

The method of measurement is as follows [9]. After power up or system reset, the TDC input is fed with calibration hits. The timing of these hits should have no correlation with the clock signal driving the TDC, so the hits are generated from an independent oscillator.

A Differential Non-Linearity (DNL) histogram is booked in the FPGA internal memory. Once all hits are booked into the histogram, a sequence controller starts to build a lookup table (LUT) in the FPGA internal memory. The LUT is integrated from the DNL histogram so that it outputs the actual time of the center of the addressed bin.
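A minimal sketch of how such a LUT could be built from the histogram. It assumes the calibration hits are uniformly distributed over one clock period, so that each bin's count is proportional to its width, and that the occupied bins together span exactly one period. The function name and the example counts are hypothetical, and the paper's firmware may differ in detail.

```python
def build_calibration_lut(counts, clock_period_ps):
    """Convert a code-density histogram into bin widths and bin centers.

    counts[i] is the number of calibration hits recorded in bin i.
    Returns (widths, centers), both in picoseconds.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")

    # Uniform hits: the width of a bin is its share of the clock period.
    widths = [clock_period_ps * c / total for c in counts]

    # Integrate the widths; the LUT entry is the time at the bin center.
    centers = []
    left_edge = 0.0
    for width in widths:
        centers.append(left_edge + width / 2)
        left_edge += width
    return widths, centers


# Hypothetical 4-bin histogram and a 10 ns clock period.
widths, centers = build_calibration_lut([100, 300, 200, 400], 10_000)
print(widths)
print(centers)
```

The slope of `centers` against the bin index is the average bin width, which is the quantity used below to compare rising and falling edges.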

The LUTs of the rising edge and falling edge TDCs are shown in Fig. 9. The slope of each LUT can be interpreted as the average bin width; for the EP3C120F484C8N, the slope for the falling edge TDC is about half of that for the rising edge TDC. The bin width is the propagation time of the rising edge or falling edge through one tap of the carry chain. The propagation time of the falling edge is shorter than that of the rising edge, which is due to the difference between the N and P transistors in CMOS integrated circuits. The experiment was done on FPGAs from several Altera and Xilinx families.

Fig. 9.

(a) LUT of EP3C120F484C8N (Altera); (b) LUT of EP2S60F1020I4 (Altera); (c) LUT of XC3S500EFG320-4 (Xilinx); (d) LUT of XC5VFX70TFF1136-1(Xilinx).

V. THE RELATIONSHIP BETWEEN PROPAGATION TIME AND TDC ACCURACY

It is intuitive that a shorter propagation delay time will result in higher accuracy in a TDC design. The rising edge TDC and the falling edge TDC are therefore tested separately. The "rising edge TDC" is the TDC that uses the rising edge information to convert time to digital, and the "falling edge TDC" is the TDC that uses the falling edge information.

The input to the TDC is a periodic pulse train generated by an external phase-locked loop. The TDC is driven by a crystal oscillator. The time between successive rising edges of the pulse train is measured by the rising edge TDC, and the time between successive falling edges by the falling edge TDC. Digital calibration is used in the measurement [9].

The test results for successive rising edges and falling edges are shown in Table 1. For the EP3C120F484C8N, the root mean square (RMS) resolutions of the rising edge TDC and the falling edge TDC are 95 ps and 52 ps, respectively. The falling edge TDC has the shorter average propagation delay time, so according to the test the shorter propagation delay time brings better RMS resolution. The long propagation delay time of the rising edge TDC limits its resolution.

Table 1.

The RMS resolution and average bin width of rising edge TDC and falling edge TDC

Device	RMS resolution (rising edge TDC)	RMS resolution (falling edge TDC)	Average bin width (rising edge TDC)	Average bin width (falling edge TDC)
EP3C120F484C8N	95 ps	52 ps	169 ps	91 ps
EP2S60F1020I4	31 ps	29 ps	45 ps	44 ps
XC3S500EFG320-4	69 ps	62 ps	98 ps	96 ps
XC5VFX70TFF1136-1	28 ps	25 ps	57 ps	55 ps
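The rising-to-falling ratios implied by Table 1 can be tabulated with a short script. The numbers are copied from the table; the dictionary layout and function name are illustrative. In every device the rising edge has both the wider average bin and the larger RMS value, and the effect is by far the strongest in the EP3C120F484C8N.

```python
# Device: (RMS rising, RMS falling, bin width rising, bin width falling), in ps.
TABLE_1 = {
    "EP3C120F484C8N": (95, 52, 169, 91),
    "EP2S60F1020I4": (31, 29, 45, 44),
    "XC3S500EFG320-4": (69, 62, 98, 96),
    "XC5VFX70TFF1136-1": (28, 25, 57, 55),
}


def edge_ratios(table):
    """Return {device: (bin width ratio, RMS ratio)}, rising over falling."""
    ratios = {}
    for device, (rms_r, rms_f, bin_r, bin_f) in table.items():
        ratios[device] = (bin_r / bin_f, rms_r / rms_f)
    return ratios


for device, (bin_ratio, rms_ratio) in edge_ratios(TABLE_1).items():
    print(f"{device:20s} bin width ratio {bin_ratio:.2f}   RMS ratio {rms_ratio:.2f}")
```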

VI. CONCLUSION

The difference in propagation time in carry chains stems from the process used to fabricate the FPGAs. Since SRAM-based FPGAs are fabricated in CMOS-like processes, the phenomenon is common among them. Being aware of it helps in understanding FPGA-based TDCs.

The test results show that this phenomenon has a clear impact on TDC accuracy, most pronounced in the EP3C120F484C8N (95 ps versus 52 ps). The effect is present in TDCs implemented in FPGAs with the tapped delay line method. Taking it into account helps to exploit the reduced delays provided by the dedicated arithmetic (carry-chain) routing structures.

References

[1]