1 Introduction
The TDC (Time-to-Digital Converter) implemented in FPGAs (Field-Programmable Gate Arrays)[1,2,3,4,5] is a flexible, low-cost tool for measuring time of flight (TOF) in particle physics and plasma experiments. In SRAM FPGAs, a resolution of 50 ps and a dead time of 10 ns can be achieved by the time interpolating method, which employs dedicated carry lines as the delay elements[6,7]. In Actel Flash-based FPGAs, a resolution of 540 ps can be obtained by employing buffers as delay elements, and the resolution can be improved to 130 ps by eliminating the buffers from the delay line and using only the routing lines[8].
Single-Event-Effects (SEE) test results showed that the Actel Flash-based FPGAs performed better than the SRAM FPGAs in single-event-upset (SEU) immunity[9]. Thus the Flash-based FPGAs can be applied in some space missions to meet radiation-tolerance requirements. However, unlike SRAM FPGAs, the Flash-based FPGAs have no dedicated carry lines, and the shortest delay through a logic block is hundreds of picoseconds. It is therefore difficult to measure time with a resolution of tens of picoseconds in Flash-based FPGAs using the time interpolating method.
The "Vernier Delay Line" (VDL) method is utilized by some time-to-digital converters in Application-Specific Integrated Circuits (ASICs)[10,11,12,13]. Compared with vernier TDCs using crystal oscillators of different frequencies[14,15], a TDC utilizing vernier tapped delay lines can provide higher resolution, larger dynamic range and shorter dead time. Because the propagation delay of the vernier elements depends on temperature and voltage, a voltage control circuit and a delay-locked loop (DLL) are integrated to ensure the stability of the TDC bin size. However, such circuits are not integrated in FPGAs, so FPGA-based vernier delay line TDCs have rarely been reported.
In this paper, a high-resolution TDC based on both the VDL method and the time interpolating method is implemented in the Actel Flash-based FPGA A3PE1500. A "double delay lines" method is utilized to cut the dead time in half. A temperature compensation algorithm allows operation over a wide temperature range. The architecture design and performance tests are described.
2 Architecture
2.1 Vernier time interpolating architecture
The FPGA core of the A3PE1500 consists of 38400 VersaTiles, each of which can be configured as a three-input logic function, a D-flip-flop, or a latch by programming the appropriate flash switch interconnections[16]. A VersaTile exhibits different propagation delays when it is used as different combinatorial cells. In our design, the propagation delay difference between the AND3 and MUX2 cells was used as the VDL element, so that a vernier TDC with an average bin size of less than 50 ps was theoretically formed.
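The vernier principle can be illustrated with a small numerical model. This is a minimal sketch, not the implemented logic: the cell delays `SLOW_PS` and `FAST_PS` are hypothetical values chosen only so that their difference gives a 42 ps bin, and the model assumes that the leading (hit) signal travels through the slower cells while the lagging signal travels through the faster ones.

```python
SLOW_PS = 700.0  # hypothetical delay of one cell in the leading (hit) chain
FAST_PS = 658.0  # hypothetical delay of one cell in the lagging chain


def vernier_code(interval_ps, slow_ps=SLOW_PS, fast_ps=FAST_PS, n_cells=300):
    """Return the index of the first stage at which the lagging signal has
    caught up with the leading signal, or None if it never does."""
    for stage in range(1, n_cells + 1):
        lead_arrival = stage * slow_ps               # hit enters at t = 0
        lag_arrival = interval_ps + stage * fast_ps  # lagging enters later
        if lag_arrival <= lead_arrival:
            return stage
    return None


bin_ps = SLOW_PS - FAST_PS   # effective bin size is the delay difference
print(bin_ps)
print(vernier_code(420.0))   # a 420 ps interval is resolved at stage 10
```

In this model the bin size is set by the delay difference only, which is why it can be much smaller than the delay of any single cell.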
The VDL block diagram is shown in Fig.1. The architecture is a time stamp vernier TDC based on a coarse counter and interpolator units formed by the VDL. Simulation results show that the propagation delay difference between the AND3 and MUX2 units leads to a minimum bin size of about 50 ps. The leading signal for the vernier lines is the "hit" pulse itself. The lagging signal is generated by a D-flip-flop fed by the hit pulse and the master clock. The time measured by the vernier delay line is therefore the interval between the hit event and the proximal clock rising edge. As the TDC is driven by a master clock with a period of 10 ns, the 16-bit coarse time counter provides a wide time measurement range of about 655.36 μs, and the range can be further expanded by increasing the number of counter bits. The encoder unit transforms the fine time information output by the VDL into 9-bit data. The Read-Out FIFO stores the integrated TDC data, including the 16-bit coarse time data, the 9-bit fine time data, and the TDC channel ID. "Read En" is the read enable signal for the coarse counter latches and the encoder unit.
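The relation between the coarse and fine data can be sketched as follows. The dynamic range follows directly from the counter width and the clock period given above. The way the two fields are combined is an assumption made for illustration: `coarse` is taken to count the clock edge that generates the lagging signal, so the fine interval is subtracted. The function names and the default bin size are hypothetical.

```python
CLOCK_PERIOD_PS = 10_000   # 10 ns master clock, from the text
COARSE_BITS = 16           # coarse counter width, from the text


def dynamic_range_us(bits=COARSE_BITS, period_ps=CLOCK_PERIOD_PS):
    """Measurement range of the coarse counter in microseconds."""
    return (2 ** bits) * period_ps / 1e6


def hit_time_ps(coarse, fine, lsb_ps=42.0, period_ps=CLOCK_PERIOD_PS):
    """Combine coarse and fine data into one time stamp.

    Assumption: `coarse` counts the clock edge that generates the lagging
    signal, and `fine` is the VDL code for the interval between the hit
    and that edge, so the fine time is subtracted.
    """
    return coarse * period_ps - fine * lsb_ps


print(dynamic_range_us())               # range of a 16-bit counter at 10 ns
print(hit_time_ps(coarse=3, fine=100))  # example time stamp in ps
```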
Figure 2 is the timing diagram for generating the "Read En" pulse. Once a hit signal arrives, a D flip-flop generates the lagging signal at the proximal clock rising edge. At the next clock rising edge, the lagging signal is latched by another D flip-flop, which outputs an inverted pulse. Both the lagging signal and the inverted pulse are fed to an AND2 gate, which outputs the "Read En" signal. The generated signal is one clock period wide and is synchronous to the hit signal. "Read En" is used as the read enable signal for the coarse counter and, after being delayed for several clock periods, is fed to the encoder unit as the read enable for the VDL fine data and as the write enable for the FIFO.
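The pulse generation described above can be checked with a cycle-based behavioural model. This is only a sketch under simplifying assumptions: the hit level is sampled once per clock rising edge, the hit stays high for a few clock periods, and gate delays are ignored. The function name `read_en_trace` is hypothetical.

```python
def read_en_trace(hit_samples):
    """Clocked model: hit_samples[i] is the hit level seen at clock edge i.
    Returns the Read En level after each edge."""
    lagging = False   # output of the first D flip-flop
    delayed = False   # output of the second D flip-flop
    trace = []
    for hit in hit_samples:
        # Both flip-flops update on the same edge from the old values.
        lagging, delayed = bool(hit), lagging
        # AND2 of the lagging signal and the inverted second-stage output.
        trace.append(lagging and not delayed)
    return trace


print(read_en_trace([0, 1, 1, 1, 0, 0]))
# [False, True, False, False, False, False]
```

The output is high for exactly one clock period after the edge that captures the hit, regardless of how long the hit stays high.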
2.2 Double delay lines method
The dead time of the vernier delay line TDC is the maximum time needed for the lagging signal to overtake the leading signal; it depends on the number of delay cells and the propagation delay of a single cell. A single vernier delay line covering a 10 ns clock period contains 300 combinatorial delay cells, and its dead time is close to 200 ns.
A "double delay lines" method is employed to obtain a shorter dead time[17]. Fig.3 shows the double delay lines TDC structure. Because the clocks fed to the two VDLs have the same frequency but inverted phases, the lagging signals for the two delay lines are generated with a time difference of half a clock period. Each vernier delay line covers half a clock period, so the number of cells required in each line is halved and the dead time decreases to 100 ns. Each hit signal is measured by both delay lines, but only one of the two output codes is selected as the time information. The code selection is based on the fine time data from the two vernier delay lines. The clocks for the double delay lines are generated by two methods: two internal PLL cores of the FPGA with inverted phases, or the rising and falling edges of the master clock. The TDC performance with both clocking methods is shown in the test results.
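The dead time figures can be reproduced with a simple estimate in which the dead time is the number of cells multiplied by the propagation delay of one cell. The per-cell delay used here is not a measured value; it is inferred from the numbers quoted above (about 200 ns over 300 cells), and the names are hypothetical.

```python
def dead_time_ns(n_cells, cell_delay_ns):
    """Worst-case time for a signal to traverse the whole delay line."""
    return n_cells * cell_delay_ns


# Inferred, not measured: ~200 ns dead time spread over 300 cells.
CELL_DELAY_NS = 200.0 / 300

single = dead_time_ns(300, CELL_DELAY_NS)       # one line, full clock period
double = dead_time_ns(300 // 2, CELL_DELAY_NS)  # each line, half a period

print(round(single, 1), round(double, 1))
```

Under this estimate, halving the number of cells per line halves the dead time, which matches the 100 ns reported for the double delay lines structure.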
3 Test results
In 2012, a prototype with a "vernier delay line" TDC implemented in an Actel Flash-based FPGA was designed and tested. Fig.4(a) shows the TDC board, which is read out through a Universal Serial Bus (USB) port. Fig.4(b) shows its testing block diagram.
In Fig.4(b), the test hit signals for the TDCs are output by a pulse generator, and the discriminators produce an LVDS-level signal for each hit. The time information of the input hits is measured by the Actel FPGA-based TDC, and the output time codes are transmitted to the data processing computer through the USB port.
3.1 Bin size and differential non-linearity
In this design, the TDC bin size equals the propagation delay difference between the AND3 and MUX2 units. The differential non-linearity is mainly caused by the master clock skew and the unequal widths of the vernier delay chain cells. DNL (differential non-linearity) is defined as the deviation of a bin size from its ideal LSB (least significant bit) value, and INL (integral non-linearity) is the deviation of the input/output curve from the ideal transfer characteristic, which is the straight line that best fits the curve. To obtain the non-linearity of the TDC, the code-density test method was adopted[18]. Fig.5 shows the code-density tests for the vernier TDC with a single chain. Because the non-linearity repeats every 10 ns, a look-up table (LUT) can be constructed from the INL information to compensate the TDC outputs.
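A code-density analysis of this kind can be sketched as follows. The sketch assumes that the hits are uncorrelated with the clock and that the used codes span exactly one clock period, so each bin width is proportional to its hit count. As a simplification, the INL is computed as the running sum of the DNL (an end-point definition) rather than against a best-fit line. The function name is hypothetical.

```python
def code_density(counts, period_ps=10_000.0):
    """Estimate bin widths, DNL and INL from a code-density histogram.

    counts[i] is the number of random hits that produced code i.
    """
    total = sum(counts)
    widths = [period_ps * c / total for c in counts]  # bin widths in ps
    lsb = period_ps / len(counts)                     # average (ideal) bin
    dnl = [w / lsb - 1.0 for w in widths]             # in units of LSB
    inl, acc = [], 0.0
    for d in dnl:                                     # running sum of DNL
        acc += d
        inl.append(acc)
    return widths, dnl, inl


# Toy histogram: the first bin collects twice as many hits as the others.
widths, dnl, inl = code_density([2, 1, 1], period_ps=4.0)
print(widths, dnl, inl)
```

A bin that collects more hits than average, like the first bin in the toy histogram, shows up as a positive DNL and shifts the INL of all later codes.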
Figure 5(a) shows the bin sizes for a single vernier delay chain implemented in the A3PE1500; the average is about 42 ps. The first bin, which is larger than the others, leads to the worst non-linearity of the TDC. The large bin results from the D-flip-flop that outputs the lagging signal. When the hit signal arrives together with the clock rising edge, an ambiguous (metastable) state appears, which leads to a slower rising edge of the lagging signal. In the code-density test, the slower lagging signals accumulate in one bin, thus generating an ultra-wide bin. Due to this 135 ps bin, the DNL of the TDC is –1/+2.2 LSB, and the INL is –1.4/+3.7 LSB.
Fig.6 shows the waveforms of the slower lagging signal.
The bin size and non-linearity of the vernier TDC using the "double delay lines" method are shown in Fig.7. Because the two lines cover slightly more than half a clock period, a hit signal arriving at the clock rising edge is tapped at the start of one chain and at the end of the other. By selecting the correct time information, the ultra-wide bin no longer exists in the TDC channel. The DNL of the vernier double delay lines TDC is in the range of –1 to +0.9 LSB, and the INL is –1 to +3.4 LSB. The INL non-uniformity is caused by the difference in average bin size between the two delay lines.
3.2 Time measurement resolution
The resolution, one of the most important parameters of a time measurement system, can be obtained by the "cable delay test". However, the cable length is limited, because long cables attenuate the input signal and slow its leading edge, thus introducing measurement error. Therefore, to measure different time intervals, a dual-channel arbitrary function generator, Tektronix AFG3252, is employed to obtain the resolution[19]. Because the two channels of the generator output correlated signals with adjustable delays, a wide range of time intervals can be applied in the test.
An INL look-up table is used to correct the INL error. The TDC utilizing a single vernier delay line exhibits a resolution of more than 50 ps RMS before INL compensation and about 20 ps RMS after INL compensation. Fig.8 shows the time resolution of the vernier delay line TDC.
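One common way to realise such a correction is a bin-by-bin table that maps each fine code to the centre of its measured bin. The sketch below uses this form as an assumption; the text does not specify the exact table layout. It also assumes that both hits of a pair fall in the same clock cycle, so the coarse counter is left out. The function names and the example bin widths are hypothetical.

```python
from statistics import pstdev


def build_lut(widths_ps):
    """Map each code to the centre of its bin, using measured bin widths."""
    lut, edge = [], 0.0
    for w in widths_ps:
        lut.append(edge + w / 2.0)  # centre of this bin
        edge += w                   # left edge of the next bin
    return lut


def interval_rms_ps(codes_a, codes_b, lut):
    """RMS spread of repeated interval measurements after LUT correction."""
    diffs = [lut[b] - lut[a] for a, b in zip(codes_a, codes_b)]
    return pstdev(diffs)


# Example with one ultra-wide first bin followed by two regular bins.
lut = build_lut([135.0, 40.0, 40.0])
print(lut)
print(interval_rms_ps([0, 0, 1], [1, 2, 2], lut))
```

With a uniform-bin assumption, a hit in the wide first bin would be placed far from its most likely arrival time; the table removes this systematic part of the error.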
Because the double vernier delay lines eliminate the ultra-wide bin from the TDC channel, an improvement of about 3.7 ps in RMS is obtained compared with the single delay line TDC (Fig.8(c)).
The measurement resolution depends on the time interval between the paired input pulses and varies periodically with the clock cycle. Fig.9 shows the time resolutions for intervals from 0 ns to 20 ns in steps of 1 ns. All RMS curves repeat every 10 ns, which equals the clock period.
When the measured time interval equals N×T (where T is one clock period), the double delay lines TDC with clocks of inverted phases exhibits an RMS of less than 20 ps after INL compensation; however, the RMS becomes worse when other time intervals are measured. In other words, the TDC provides the best precision when the output time codes of the paired input hits are selected from the same delay line, and the RMS becomes worse when the time codes come from different delay lines. The two clocks are generated by the integrated PLLs of the FPGA, and the accumulation of clock noise results in worse resolution. Meanwhile, for the double delay lines TDC employing the rising and falling edges of the master clock, the RMS stays around 20 ps. The performance of the single delay line vernier TDC is similar, but its RMS is a few picoseconds higher. The double vernier delay lines TDC using both clock edges therefore has the best performance in time measurements.
3.3 Bin size drifts
Because the delay time of the vernier elements drifts when the ambient temperature changes, tests of the bin size were performed in a temperature control box (Fig.10).
When the ambient temperature changes from –5°C to +55°C, the averaged bin size of the TDC increases from 39.8 ps to 44.5 ps. The cell delay varies linearly with temperature, and the calculated slope is about 0.0807 ps/°C. A functional relation of LSB = 39.6 ps + 0.0807 ps/°C × (T + 5°C) is obtained. In an environment where the temperature changes rapidly, the random error increases to hundreds of picoseconds without temperature drift correction, so a compensation mechanism that corrects the tap delay for a given operating temperature is necessary. Usually, the look-up tables are regenerated whenever the ambient temperature changes, which requires much more memory space. Our temperature compensation algorithm uses only one look-up table, generated at –5°C.
The algorithm proceeds as follows. Firstly, the time information for each hit is corrected by the –5°C look-up table. Secondly, since the TDC bin size changes with temperature, the LSB at the operating temperature is calculated, and a ratio is obtained by dividing the calculated LSB by its value at –5°C. Finally, the time information is further corrected by multiplying it by the LSB ratio from the previous step.
Figure 11 shows the resolutions at different temperatures for a 0 ns time interval. The RMS stays below 18 ps when a look-up table is generated at each temperature, while the resolution becomes slightly worse when the temperature drift compensation algorithm is used with the –5°C look-up table. At –5°C, more vernier elements are occupied than at higher temperatures because the bin sizes are smaller, so this table includes more bin information. The RMS increases slightly as the operating temperature departs from –5°C, and the worst RMS remains below 22 ps. Thus this temperature drift compensation algorithm is preferable for its accuracy and convenience.
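The three steps can be sketched in a few lines. The constants are taken from the fitted relation LSB = 39.6 ps + 0.0807 ps/°C × (T + 5°C); the function names are hypothetical, and the input `lut_time_ps` is assumed to be the fine time already corrected by the –5°C look-up table (step one).

```python
REF_TEMP_C = -5.0        # temperature at which the look-up table was made
LSB_REF_PS = 39.6        # fitted LSB at the reference temperature
SLOPE_PS_PER_C = 0.0807  # fitted temperature slope of the LSB


def lsb_ps(temp_c):
    """Step two: LSB at the operating temperature from the linear fit."""
    return LSB_REF_PS + SLOPE_PS_PER_C * (temp_c - REF_TEMP_C)


def compensate(lut_time_ps, temp_c):
    """Step three: scale the LUT-corrected time by the LSB ratio."""
    ratio = lsb_ps(temp_c) / lsb_ps(REF_TEMP_C)
    return lut_time_ps * ratio


print(round(lsb_ps(55.0), 3))              # fitted LSB at +55 degC
print(round(compensate(1000.0, 25.0), 2))  # 1000 ps fine time at +25 degC
```

Only the reference table and two fit constants need to be stored, which is the memory saving compared with regenerating a table for every temperature.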
4 Discussion
4.1 Coarse counter
The 16-bit coarse counter, working with a 100 MHz clock, provides a dynamic range of 655.36 μs. As no dedicated carry lines are integrated in Actel Flash-based FPGAs, counters in these devices are formed by combinatorial cells. The carry delay of the counter, which consists of the cell propagation delays and the routing delays, must be confined within one clock cycle. Layout constraints are therefore applied to shorten the routing delays, and all combinatorial cells of the counter are placed in a small region of the FPGA.
4.2 Dead time
To suit practical applications, the double delay lines method is employed to decrease the dead time of the vernier delay line TDC. Using both edges of the master clock, the maximum dead time can be decreased to 100 ns. The TDC channel is disabled while it is measuring a hit. The dead time also depends on temperature: it becomes about 10% longer when the ambient temperature increases by 50°C.
4.3 Bin size and logic resource occupancy
The bin size of the TDC depends on the delay difference between the two macros chosen as the vernier cell. A bin size of less than 40 ps can be obtained by selecting other delay elements. A smaller bin size, however, leads to more elements, a longer dead time, more complex encoder logic, and a higher clock frequency. More elements in turn occupy more logic resources, and a higher clock frequency increases the risk to timing integrity. Working with a 100 MHz clock, two channels of vernier TDCs using the AND3 and MUX2 macros cost about 25 percent of the A3PE1500 logic resources.
5 Conclusion
A vernier delay line TDC based on the time interpolating method and implemented in an Actel ProASIC3E FPGA is reported. A "double delay lines" method was employed to cut the dead time in half and improve the time measurement performance, and a temperature drift compensation algorithm was employed to correct for temperature changes. The TDC achieves a dead time of 100 ns, a dynamic range of 655.36 μs, a resolution of 16.4 ps RMS and an averaged bin size of 42 ps in a temperature range of –5°C to +55°C.