Carry-chain propagation delay impacts on resolution of FPGA-based TDC

NUCLEAR ELECTRONICS AND INSTRUMENTATION

DONG Lei
YANG Jun-Feng
SONG Ke-Zhui
Nuclear Science and Techniques, Vol. 25, No. 3, Article number 030401. Published in print 20 Jun 2014; available online 20 Jun 2014.

This paper introduces the architecture of carry chains in Field-Programmable Gate Arrays (FPGAs). The propagation delay times of rising and falling edges in the carry chain are calculated from the architecture and are predicted to be unequal in most cases. Tests show that the propagation delay times measured in the Altera EP3C120F484C8N FPGA agree with this prediction. The difference in propagation delay time leads to different accuracies of a Time-to-Digital Converter (TDC), depending on which edge is used. This phenomenon should be considered in the design of FPGA-based TDCs in order to achieve better accuracy.

Key words: FPGA Firmware, Carry chains, Propagation delay time, TDC

I. INTRODUCTION

Time-to-Digital Converters (TDCs) are widely used in scientific applications. High-resolution TDCs are indispensable in many physics experiments, such as time-of-flight (TOF) systems for target recoil-ion momentum spectroscopy at the Institute of Modern Physics (IMP), Chinese Academy of Sciences [1]. Two interpolation methods used in high-resolution applications, the vernier method and the tapped delay line (TDL) method, have been successfully implemented in field-programmable gate arrays (FPGAs) [2].

At present, TDCs that implement the tapped-delay-line method in an FPGA, taking advantage of the reduced delays provided by dedicated arithmetic (carry-chain) routing structures, are an efficient and successful way to achieve low-dead-time measurements.

In this work, for a further study on TDC implementation in FPGA, we did a detailed analysis of the carry-chain routing structures and found that the propagation speeds of rising and falling edges should differ from each other. To verify the prediction, the relationship between the propagation speeds and TDC accuracy was tested. The result shows that the propagation speed affects the TDC accuracy, hence the importance of edge choice in TDC design.

II. THE ARCHITECTURE OF THE CARRY CHAIN

The most common FPGA architecture consists of an array of logic blocks, I/O pads, and routing channels. A logic block is called a Configurable Logic Block (CLB) or a Logic Array Block (LAB), depending on the vendor. In general, it consists of a few logic cells (called ALM, LE, Slice, etc.) [3-5]. A typical cell has a 4-input Lookup Table (LUT), a Full Adder (FA) and a D-type flip-flop, as shown in Fig. 1. One tap of the carry chain is marked in the figure. The carry-in signal is a bit carried in from the next less significant stage, and the carry-out signal represents an overflow into the next digit of a multi-digit addition.

Fig. 1. Simplified example illustration of a logic cell.

A logic circuit that adds N-bit numbers can be built from multiple full adders. Each full adder takes as its carry-in the carry-out of the previous adder. This kind of adder is called a ripple-carry adder. A 4-bit ripple-carry adder is shown in Fig. 2. When an N-bit ripple-carry adder is implemented in an FPGA, N-1 taps of carry chain are obtained. The carry chain is usually used as the tapped delay line in FPGA-based TDC designs.
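The ripple-carry structure can be sketched in a few lines of Python. This is a purely logical model (no timing), and the function names `full_adder` and `ripple_carry_add` are illustrative choices, not taken from the paper. It shows how the carry-out of each stage feeds the next stage, which is the path a TDC later reuses as a delay line.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists, least significant bit first.

    Returns (sum_bits, carries), where carries[i] is the carry-out of
    stage i, i.e. the signal that ripples into stage i + 1.
    """
    if len(a_bits) != len(b_bits):
        raise ValueError("operands must have the same width")
    sum_bits, carries = [], []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
        carries.append(carry)
    return sum_bits, carries


# 4-bit example, LSB first: 0b0111 (7) + 0b0001 (1) = 0b1000 (8).
sum_bits, carries = ripple_carry_add([1, 1, 1, 0], [1, 0, 0, 0])
print(sum_bits, carries)
```

In this example the carry generated in the least significant stage ripples through the next two stages before the sum settles, which is the behaviour that makes the chain usable as a delay line.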

Fig. 2. 4-bit ripple-carry adder.

III. PROPAGATION TIMES OF RISING AND FALLING EDGES IN THE CARRY CHAIN

As the dedicated arithmetic routing structure in FPGAs, a carry chain consists of a series of full adders. The propagation time through it is composed of the propagation time in the full adders and the propagation time in the interconnect between them. Generally, the interconnects are short and contribute equally to the propagation time of rising and falling edges, so we consider only the propagation time in the full adder.

The propagation time can be calculated by analyzing a full adder circuit. As most FPGAs are manufactured using CMOS process technology, the typical CMOS full adder in Fig. 3 is analyzed in this paper. To make the adder work as a delay element, input A is set to 1 and input B is set to 0. Cout is then logically equal to Cin, and the propagation time in the full adder is the time from Cin to Cout. The circuit analysis follows the method of Ref. [6]. Fig. 4 shows the circuit with the FET resistances and the capacitances. For convenience, the circuit is divided into two stages, and the propagation times of Stages 1 and 2 are calculated separately.
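As a quick logical check of the A = 1, B = 0 configuration, the sketch below evaluates the standard full-adder carry equation and confirms that the carry-out simply follows the carry-in at every tap. The helper name `delay_chain_taps` is hypothetical, and the model covers steady-state logic values only, not delays.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def delay_chain_taps(cin, n_taps):
    """Settled logic value at each carry-out when every stage has A=1, B=0."""
    taps = []
    carry = cin
    for _ in range(n_taps):
        _, carry = full_adder(1, 0, carry)  # A=1, B=0: carry passes through
        taps.append(carry)
    return taps


print(delay_chain_taps(1, 4))
print(delay_chain_taps(0, 4))
```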

Fig. 3. A CMOS full adder schematic.
Fig. 4. The circuit for calculating the propagation time.

Figure 5 shows the sub-circuits for the output transients of Stage 1. Rn and Rp are the parasitic resistances of the NMOS and PMOS transistors, respectively. Cout1 is the output capacitance of Stage 1. Cx is the parasitic capacitance between the NMOSs, and Cy is the parasitic capacitance between the PMOSs. The propagation time of the falling edge is calculated using the circuit in Fig. 5(a). The output voltage can be written as

$V_{out}(t) = V_{dd}\,e^{-t/\tau_n}$. (1)

Fig. 5. Discharge circuit (a) and charging circuit (b) for Stage 1.

The two discharge paths are shown as idis.1 and idis.2. According to the Elmore formula, the time constant of the main discharge path idis.1 is

$\tau_{n1} = C_{out1}(2R_n)$. (2)

Since input A is set to 1 and input B is set to 0, NMOS1 and PMOS1 remain in the conducting state. Cx therefore does not discharge during this event, and the time constant of discharge path idis.2 is τn2 = 0. The total time constant is obtained by superposition:

$\tau_{na} = \tau_{n1} + \tau_{n2} = C_{out1}(2R_n)$. (3)

Then the propagation time of falling edge in Stage 1, according to Ref. [6], shall be

$t_{pf1} = \ln 2 \cdot \tau_{na}$. (4)
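The factor ln 2 follows from the exponential transient if the propagation time is taken as the moment the output crosses half the supply voltage. That 50% threshold is the usual textbook convention and is stated here as an assumption about how Ref. [6] defines the delay. Written out for the falling edge:

```latex
V_{out}(t_{pf1}) = V_{dd}\, e^{-t_{pf1}/\tau_{na}} = \frac{V_{dd}}{2}
\quad\Longrightarrow\quad
e^{-t_{pf1}/\tau_{na}} = \frac{1}{2}
\quad\Longrightarrow\quad
t_{pf1} = \tau_{na} \ln 2 .
```

The same threshold applied to a charging transient of the form $V_{dd}\,(1 - e^{-t/\tau})$ gives the identical factor, which is why the rising-edge expressions carry ln 2 as well.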

The rise time tr is computed using the circuit in Fig. 5(b). The time constant of main charging path ich.1 is

$\tau_{p1} = C_{out1}(2R_p)$. (5)

As the PMOS1 remains in conduction state, the time constant of charging path ich.2 is τp2=0. The total time constant is

$\tau_{pa} = \tau_{p1} + \tau_{p2} = C_{out1}(2R_p)$. (6)

Therefore, the propagation time of rising edge in Stage 1 is

$t_{pr1} = \ln 2 \cdot \tau_{pa}$. (7)

Figure 6 shows the sub-circuits for the output transients of Stage 2. The time constant of discharge path idis is

$\tau_{nb} = C_{out2} R_n$, (8)

Fig. 6. Discharge circuit (a) and charging circuit (b) for Stage 2.

where Cout2 is the output capacitance of Stage 2. The propagation time of falling edge in Stage 2 is

$t_{pf2} = \ln 2 \cdot \tau_{nb}$. (9)

The time constant of charging path ich is

$\tau_{pb} = C_{out2} R_p$. (10)

So, the propagation time of rising edge in Stage 2 is

$t_{pr2} = \ln 2 \cdot \tau_{pb}$. (11)

Then, the propagation time of the rising edge from Cin to Cout is

$t_{pr} = t_{pf1} + t_{pr2} = \ln 2\,(2C_{out1}R_n + C_{out2}R_p)$. (12)

The propagation time of the falling edge from Cin to Cout is

$t_{pf} = t_{pr1} + t_{pf2} = \ln 2\,(2C_{out1}R_p + C_{out2}R_n)$. (13)
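Subtracting Eq. (13) from Eq. (12) makes the condition for equal delays explicit. The factorization below uses only the symbols already defined and is plain algebra, offered as a worked step rather than as part of the original derivation:

```latex
t_{pr} - t_{pf}
  = \ln 2 \left( 2C_{out1}R_n + C_{out2}R_p - 2C_{out1}R_p - C_{out2}R_n \right)
  = \ln 2 \,(R_p - R_n)\,(C_{out2} - 2C_{out1}) .
```

The two edges therefore propagate equally fast only when $R_p = R_n$ or $C_{out2} = 2C_{out1}$. For $R_p > R_n$, the rising edge is the slower one when $C_{out2} > 2C_{out1}$ and the faster one when $C_{out2} < 2C_{out1}$.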

The value of Rp and Rn can be calculated according to Ref. [6]:

$R_p = \dfrac{1}{k_p'\,(W/L)_p\,(V_{dd} - |V_{Tp}|)}$, (14)

$R_n = \dfrac{1}{k_n'\,(W/L)_n\,(V_{dd} - |V_{Tn}|)}$, (15)

where kn' and kp' are the nFET and pFET process transconductances, respectively (typically kn'/kp' = 2 to 3); (W/L)p and (W/L)n are the width-to-length ratios of the p-channel and n-channel MOSFETs, respectively; and |VTp| and |VTn| are the threshold voltages of the p-channel and n-channel MOSFETs, respectively.
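To see how the two resistance formulas relate device sizing to the Rp/Rn ratio, the following sketch evaluates them numerically. All numeric values (supply voltage, thresholds, process transconductances, W/L) are made-up illustrative numbers, not taken from the paper or from any device data, and the helper name `fet_resistance` is hypothetical.

```python
def fet_resistance(k_prime, w_over_l, vdd, vt_abs):
    """Linearized FET resistance R = 1 / (k' (W/L) (Vdd - |VT|)), in ohms."""
    overdrive = vdd - vt_abs
    if overdrive <= 0:
        raise ValueError("Vdd must exceed |VT|")
    return 1.0 / (k_prime * w_over_l * overdrive)


# Illustrative values only (assumptions, not from the paper).
VDD = 1.2          # supply voltage, V
VT_ABS = 0.4       # |VTn| = |VTp|, V
KN_PRIME = 100e-6  # nFET process transconductance, A/V^2
KP_PRIME = 40e-6   # pFET process transconductance, A/V^2 (kn'/kp' = 2.5)

rn = fet_resistance(KN_PRIME, 10.0, VDD, VT_ABS)
rp_same_size = fet_resistance(KP_PRIME, 10.0, VDD, VT_ABS)
rp_widened = fet_resistance(KP_PRIME, 25.0, VDD, VT_ABS)  # PMOS 2.5x wider

print(f"Rp/Rn with equal W/L:        {rp_same_size / rn:.2f}")
print(f"Rp/Rn with 2.5x wider PMOS:  {rp_widened / rn:.2f}")
```

With equal sizing and equal thresholds the ratio reduces to kn'/kp'. Widening the PMOS pulls the ratio toward 1 at the cost of area, which is the speed versus area trade-off.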

A large MOSFET width-to-length ratio increases circuit speed, but it also increases circuit area. Balancing speed against area, the ratio of Rp to Rn is chosen as

$R_p / R_n \approx 1 \sim 3$. (16)

The smaller the Rp/Rn ratio, the larger the area of the circuit.

Figure 7 shows tpr and tpf as functions of Cout2 for different Rp/Rn ratios, assuming Rn = 300 Ω and Cout1 = 50 fF. A larger Rp/Rn ratio results in a greater difference between tpr and tpf. Except at Cout2 = 2Cout1, where tpr = tpf, the two propagation times are not equal. How large the difference is depends on the process used to fabricate the FPGA.
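The trend described for Fig. 7 can be reproduced numerically from Eqs. (12) and (13). The sketch below uses the Rn and Cout1 values quoted in the text. The particular Rp/Rn ratios and Cout2 values swept are assumptions chosen for illustration, and `carry_delays` is a hypothetical helper name.

```python
import math


def carry_delays(rn, rp, cout1, cout2):
    """Return (t_pr, t_pf) in seconds, following Eqs. (12) and (13)."""
    t_pr = math.log(2) * (2 * cout1 * rn + cout2 * rp)
    t_pf = math.log(2) * (2 * cout1 * rp + cout2 * rn)
    return t_pr, t_pf


RN = 300.0      # ohms, value quoted in the text
COUT1 = 50e-15  # farads, value quoted in the text

# Sweep values below are illustrative assumptions.
for ratio in (1.0, 2.0, 3.0):
    for cout2 in (50e-15, 100e-15, 200e-15):
        t_pr, t_pf = carry_delays(RN, ratio * RN, COUT1, cout2)
        print(f"Rp/Rn={ratio:.0f}  Cout2={cout2 * 1e15:5.0f} fF  "
              f"t_pr={t_pr * 1e12:6.2f} ps  t_pf={t_pf * 1e12:6.2f} ps")
```

The rows with Cout2 = 100 fF (that is, 2Cout1) give equal rising and falling delays for every ratio, and the gap between the two delays widens as Rp/Rn grows.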

Fig. 7. Propagation time for different Rp to Rn ratios and Cout2 values (Rn = 300 Ω, Cout1 = 50 fF).

IV. MEASUREMENT OF THE PROPAGATION TIME OF RISING AND FALLING EDGES IN THE CARRY CHAIN

There are at least two approaches to measuring the propagation time in the carry chain: the double registration approach [7] and the statistical approach [8]. Both are usually used as digital calibration methods for FPGA-based TDCs. The double registration approach measures only the average cell delay. When the bin widths differ and a bin-by-bin measurement is needed, the statistical approach is preferable [9], and we use it here.

The FPGA TDC was used to measure the propagation time in the carry chain. A simplified block diagram of the TDC implemented in the FPGA is shown in Fig. 8 [10]. It includes two crucial parts: the time delay lines (the carry chain) and the coarse time counters. The TDC is based on the counter and interpolating method; a detailed description can be found in Ref. [11].

Fig. 8. (Color online) Block diagram of the time-to-digital converter implemented in a single FPGA device.

The method of measurement is as follows [9]. After power up or system reset, the TDC input is fed with calibration hits. The timing of these hits should have no correlation with the clock signal driving the TDC, so the hits are generated from an independent oscillator.

A Differential Non-Linearity (DNL) histogram is booked in the FPGA internal memory. Once all hits are booked into the histogram, a sequence controller starts to build a lookup table (LUT) in the FPGA internal memory. The LUT is integrated from the DNL histogram so that it outputs the actual time of the center of the addressed bin.
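The histogram-to-LUT step can be sketched as follows. The sketch assumes that calibration hits are uniformly distributed over one clock period, so that each bin's share of the hits equals its share of the period. The function name, the 4-bin histogram and the 1000 ps period are hypothetical, and the code is a software illustration rather than the firmware described in Ref. [9].

```python
def build_calibration_lut(hit_counts, clock_period_ps):
    """Turn a per-bin hit histogram into (bin_widths, bin_centers), in ps.

    Assumes hits are uniformly distributed over one clock period, so a
    bin that collects a fraction f of the hits is f * period wide.
    """
    total = sum(hit_counts)
    if total == 0:
        raise ValueError("histogram is empty")
    widths = [clock_period_ps * n / total for n in hit_counts]
    centers = []
    edge = 0.0  # time at the start of the current bin
    for w in widths:
        centers.append(edge + w / 2.0)  # LUT entry: center of this bin
        edge += w                       # integrate to the next bin edge
    return widths, centers


# Hypothetical 4-bin histogram and 1000 ps clock period, for illustration.
widths, centers = build_calibration_lut([100, 300, 200, 400], 1000.0)
print(widths)
print(centers)
```

Each width is an estimate of one tap's propagation time, so the same histogram yields both the calibration table and the bin-by-bin delay measurement.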

The LUTs of the rising edge TDC and falling edge TDC are shown in Fig. 9. The slope can be interpreted as average bin width for falling edge TDC, and about half of the slope is for rising edge TDC. The bin width is the propagation time of the rising or falling edge through one tap of the carry chain. The propagation time of the falling edge is shorter than that of the rising edge, which is due to the difference between the N and P transistors in CMOS integrated circuits. The experiment was carried out on several FPGA series from Altera and Xilinx.

Fig. 9. (a) LUT of EP3C120F484C8N (Altera); (b) LUT of EP2S60F1020I4 (Altera); (c) LUT of XC3S500EFG320-4 (Xilinx); (d) LUT of XC5VFX70TFF1136-1 (Xilinx).

V. THE RELATIONSHIP BETWEEN PROPAGATION TIME AND TDC ACCURACY

It is intuitive that a shorter propagation delay time will result in higher accuracy in a TDC design. The rising edge TDC and the falling edge TDC are therefore tested separately. The "rising edge TDC" is the TDC that uses the rising edge information to convert time to digital, and the "falling edge TDC" is the TDC that uses the falling edge information.

The input to the TDC is a pulse train whose repetition rate is generated by an external phase-locked loop. The TDC is driven by a crystal oscillator. The time between successive rising edges of the pulse train is measured by the rising edge TDC and falling edge TDC, respectively. Digital calibration is used in the measurement [9].

The test results of successive rising edges and falling edges are shown in Table 1. For the EP3C120F484C8N, the root mean square (RMS) resolutions of the rising edge TDC and falling edge TDC are 95 ps and 52 ps, respectively. The falling edge TDC has the shorter average propagation delay time, so according to the test the shorter propagation delay time brings better RMS resolution. The long propagation delay time of the rising edge TDC limits its resolution.

TABLE 1. The RMS resolution of rising edge TDC and falling edge TDC

Device               Resolution (ps)         Average bin width (ps)
                     Rising     Falling      Rising     Falling
EP3C120F484C8N       95         52           169        91
EP2S60F1020I4        31         29           45         44
XC3S500EFG320-4      69         62           98         96
XC5VFX70TFF1136-1    28         25           57         55
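To compare the devices on a common footing, the rising-to-falling ratios of bin width and RMS resolution can be computed directly from Table 1. The snippet below only re-expresses the tabulated numbers; the variable and function names are illustrative.

```python
# (device, rising RMS, falling RMS, rising bin width, falling bin width),
# all in ps, copied from Table 1.
TABLE_1 = [
    ("EP3C120F484C8N", 95, 52, 169, 91),
    ("EP2S60F1020I4", 31, 29, 45, 44),
    ("XC3S500EFG320-4", 69, 62, 98, 96),
    ("XC5VFX70TFF1136-1", 28, 25, 57, 55),
]


def edge_ratios(rows):
    """Return {device: (bin_width_ratio, rms_ratio)}, rising over falling."""
    return {
        dev: (bw_r / bw_f, rms_r / rms_f)
        for dev, rms_r, rms_f, bw_r, bw_f in rows
    }


ratios = edge_ratios(TABLE_1)
for dev, (bw_ratio, rms_ratio) in ratios.items():
    print(f"{dev:20s} bin width ratio {bw_ratio:.2f}   RMS ratio {rms_ratio:.2f}")
```

In every row both ratios exceed 1, so the falling edge TDC has the narrower bins and the better resolution on all four devices, with by far the largest gap on the EP3C120F484C8N.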

VI. CONCLUSION

The difference in propagation time in carry chains stems from the process used to fabricate the FPGAs. Since SRAM-based FPGAs are fabricated using processes similar to CMOS, the phenomenon is common in SRAM-based FPGAs. Being aware of it helps in understanding FPGA TDCs better.

The test results show that this phenomenon has an obvious impact on TDC accuracy. The effect is present in FPGA-based TDCs that use the tapped-delay-line method. Knowing this helps designers take advantage of the reduced delays provided by dedicated arithmetic (carry-chain) routing structures.

References
[1] Zhou J, Liu S, Yin C, et al. Nucl Sci Tech, 2011, 22: 372-377.
[2] Eugen B and Michael T. IEEE T Nucl Sci, 2011, 58: 1547-1552.
[3] Altera Corporation. Documentation: Cyclone II Device Handbook, Chapter 2. Cyclone II Architecture. http://www.altera.com/literature/hb/cyc2/cyc2_cii51002.pdf (Feb 2007).
[4] Altera Corporation. Documentation: Stratix IV Device Handbook. http://www.altera.com.cn/literature/hb/stratix-iv/stratix4_handbook.pdf (Sep 2012).
[5] Xilinx Corporation. Virtex-4 FPGA User Guide. http://www.xilinx.com/support/documentation/user_guides/ug070.pdf (Dec 2008).
[6] John P U, Introduction to VLSI circuits and systems, Hoboken, J. Wiley, 2002: 250-294.
[7] Wu J, Shi Z, Wang I Y. Firmware-only implementation of time-to-digital converter (TDC) in field programmable gate array (FPGA), in Proc. IEEE Nuclear Science Symposium, Portland, OR, USA, Oct. 2003, 177-181. NSSMIC.2003.1352025.
[8] Pelka R, Kalisz J, Szplet R. IEEE T Instrum Meas, 1997, 46: 449-453.
[9] Wu J, Shi Z. The 10-ps wave union TDC: Improving FPGA TDC resolution beyond its cell delay, in Proc. IEEE Nuclear Science Symposium, Dresden, Germany, Oct 2008, 3440-3446.
[10] Wang J, Liu S, Qi S, et al. IEEE T Nucl Sci, 2009, 57: 446-450.
[11] Song J, An Q, Liu S. IEEE T Nucl Sci, 2006, 53: 236-241.