1 Introduction
Switched capacitor array (SCA) ASICs have been widely used in low to medium event-rate physics experiments because of their low power consumption and high channel density at an affordable cost in comparison with analog-to-digital converters (ADCs). SCA chips provide signal waveforms, so more detailed information can be extracted by further digital signal processing. Various SCA ASICs have been developed with sampling rates ranging from MS/s to GS/s. The GS/s sampling SCA chips have mostly been used for fast scintillation or Cherenkov light detectors, for example, the ARS chip for H.E.S.S. and IACT array [1], the SAM chip for H.E.S.S.-II [2], and the DRS chip for MAGIC-II telescopes [3,4], as well as PSEC4 [5], LABRADOR [6], and TARGET [7]. The MS/s sampling SCA chips, however, have a much wider range of applications. They can be used for both semiconductor and gas detectors, for example, the CERN-49 IC for STAR [8], the DTMROC and HAMAC chips for LHC ATLAS [9,10], the APV series chips for LHC CMS [11,12], the Beetle chip for LHCb [13,14], the AFTER chip for T2K, and the AGET chip generically for Time Projection Chambers (TPCs) [15–17].
In our previous work, the CASCA ASIC was successfully developed for TPC-based X-ray polarimetry using a 0.18 μm CMOS process [18]. The emission direction of the photoelectron is modulated by the X-ray polarization, and it can be estimated by measuring the two-dimensional trajectories of the photoelectrons with a precision of one tenth of a millimeter. In a TPC-based polarimeter, one dimension of the photoelectron track was determined by the readout strips and the other dimension was estimated from the signal waveforms. An SCA was adopted for the waveform sampling in the CASCA chip, where each channel consisted of analog front-ends and an SCA with a depth of 64 cells. The prototype chip was tested with a GEM (gas electron multiplier)-TPC to measure the photoelectron tracks generated by 8 keV X-rays [19]. However, the SCA in the CASCA chip had two limitations. First, the energy of the photoelectron (namely the maximum track length) was limited to below 10 keV by the depth of 64 cells. Second, it could not work in ping-pong mode, resulting in a systematic dead time during the readout.
A new version of the SCA chip, called GERO, was designed to solve these problems and to provide a configurable memory depth for more generic readout solutions. Besides the generic readout requirements of micro-pattern gas detectors (MPGDs), TPC applications were given the utmost importance. Here, the architecture of a two-stage SCA was adopted [7], in which the SCA was divided into sampling and storage arrays. The input signal was sampled continuously by the sampling SCA array, and once a trigger was asserted, the sampling was stopped and the analog samples were transmitted to a storage SCA block. A depth of 32 cells was chosen for the sampling block, and each channel consisted of two sampling blocks working in ping-pong mode. The storage block had the same depth as the sampling block, and each channel was integrated with 32 storage blocks. Hence, a total depth of 1024 cells was implemented in such a way that the event size could be flexibly configured by an external trigger pattern. In addition, an on-chip Wilkinson type ADC was also integrated to improve the readout speed and dynamic range [1,2].
The first prototype chip integrated 16 channels of SCA and was fabricated in a 0.18 µm CMOS process. The detailed design and test results are described in the following sections.
2 Architecture and Specifications
A simplified schematic of the 16-channel GERO is shown in Fig.1. Each channel consisted of sample and storage SCA arrays along with 32 Wilkinson ADCs. The sample SCA array consisted of two sample SCA blocks (Block A and Block B), and each block consisted of 32 cells of switched capacitors, namely 32 sample cells. The input signal was sampled continuously by the 32 sample cells in one of the sampling blocks, for example Block A. When a trigger was asserted, the analog samples were held in Block A waiting for transmission to one of the 32 storage SCA blocks. At the same time, the other sampling block (Block B in this case) started sampling immediately. The systematic dead time was reduced dramatically by operating the sampling blocks in ping-pong mode. With the two-stage (sample and storage) SCA, the input signal did not have to drive the long wires and large capacitive load that come with an increasing number of storage cells. As a result, a readout buffer was needed for each sample cell to transmit the analog voltage to the storage cell with a certain precision and speed. To support consecutive triggers, the total time for the transmission and the reset of the readout buffer should be within 32 sample clock periods, that is, 320 ns at a sampling rate of 100 MS/s. A low-power, high-precision, and high-speed readout buffer was therefore mandatory. In addition, the maximum trigger latency was limited by the depth of the sample block.
The storage SCA block had a depth of 32 cells, the same as that of the sample block. Each channel was integrated with 32 storage blocks, resulting in a buffer depth of 1024 cells. The 1024-cell memory can be easily split by sending a sequence of consecutive triggers with an interval of 32 sample clocks. For example, the 1024 storage cells can be used as a two-event buffer of 512 cells each, or a four-event buffer of 256 cells each, and so on. In this way, the storage SCA can work as a multi-event buffer with a reconfigurable event number and buffer depth, that is, from a 32-cell event buffer recording 32 consecutive events to a 1024-cell event buffer recording one event. Thanks to the two-stage SCA architecture, GERO could be configured to meet the stringent demands of event size and rate for different experiments.
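The splitting rule can be illustrated with a short sketch. It assumes that every event occupies a whole number of 32-cell storage blocks and that all events in one configuration have the same size; the function name and the returned dictionary are illustrative and are not taken from the chip documentation.

```python
BLOCK_DEPTH = 32   # cells per storage block (from the text)
NUM_BLOCKS = 32    # storage blocks per channel (from the text)


def event_buffer_config(triggers_per_event: int) -> dict:
    """Return event size and event count for k consecutive triggers per event.

    Assumption: each trigger moves one 32-cell sampling block into one
    storage block, so an event built from k consecutive triggers uses k blocks.
    """
    if not 1 <= triggers_per_event <= NUM_BLOCKS:
        raise ValueError("triggers_per_event must be between 1 and 32")
    cells_per_event = triggers_per_event * BLOCK_DEPTH
    events = NUM_BLOCKS // triggers_per_event  # whole events that fit
    return {"cells_per_event": cells_per_event, "events": events}


if __name__ == "__main__":
    for k in (1, 8, 16, 32):
        print(k, event_buffer_config(k))
```

With 16 consecutive triggers per event, the sketch gives the two-event buffer of 512 cells mentioned above; with 8 triggers it gives the four-event buffer of 256 cells.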
In the storage array, the data are stored in the sequence in which they are generated and wait for the digitization, the so-called read phase. The analog samples in the selected storage block are then digitized by the 32 Wilkinson ADCs in parallel. These digitized samples are then latched into the output registers and can be shifted out through 4 LVDS data outputs. All the control signals for the sample and storage SCA blocks and the ADC were generated by the control logic module from a few external signals. In addition, the control logic was designed with flexibility so that one or several ASICs can be controlled and read out by a single companion field programmable gate array (FPGA).
The TPCs in different experiments require readout ASICs with different sampling speeds (around 10 MS/s) and resolutions of 8–10 bits. Moreover, the event length varies widely between experiments. For example, the TPC-based X-ray polarimetry needs a readout ASIC with approximately 100 sampling cells when sampling at 20 MS/s, whereas for the ALICE TPC, the MALICE readout ASIC [20] integrates 1024 switched capacitor cells in one channel and has a sampling speed of 1 MS/s. The aim of the prototype GERO is to meet these different requirements of TPCs. The major specifications of GERO are listed in Table I.
Table I. Major specifications of GERO
Parameter | Value |
Number of channels | 16 |
Input signal range | 0.3–1.3 V |
Sampling frequency | 1–100 MS/s |
Readout bandwidth | 800 Mbps max. |
Buffer latency | 10 µs @ 100 MS/s sampling |
ADC ENOB | 10 bit |
Power consumption | <4.5 mW/ch |
3 Circuit Design
3.1 The Sampling Cell
As shown in Fig.1, the sample cell consists of two switched capacitors (one each in Block A and B) and a readout buffer. In total, 32 sampling cells are integrated for each channel.
Switches
The structure of the readout buffer is also shown in Fig.1. A simple two-stage structure was used to save power and area while meeting the speed and precision requirements. The settling time and input referred noise of the buffer were simulated to be 250 ns and 0.49 mV, respectively.
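The simulated settling time can be checked against the transfer window of 32 sample clock periods stated in Section 2. The following is a minimal sketch of that arithmetic; the function names are illustrative, and treating the remainder of the window as the time left for the buffer reset is an assumption.

```python
SAMPLE_BLOCK_DEPTH = 32  # cells per sampling block (from the text)


def transfer_window_ns(sampling_rate_msps: float) -> float:
    """Time available for transfer plus buffer reset: 32 sample clock periods."""
    period_ns = 1e3 / sampling_rate_msps  # one sample clock period in ns
    return SAMPLE_BLOCK_DEPTH * period_ns


def reset_margin_ns(sampling_rate_msps: float, settling_ns: float = 250.0) -> float:
    """Time left in the window after the simulated settling time (assumption)."""
    return transfer_window_ns(sampling_rate_msps) - settling_ns


if __name__ == "__main__":
    for rate in (20.0, 50.0, 100.0):
        print(rate, transfer_window_ns(rate), reset_margin_ns(rate))
```

At 100 MS/s the window is 320 ns, so a settling time of 250 ns leaves 70 ns under this reading; at lower sampling rates the margin grows.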
3.2 The Storage Cell
The schematic of the storage cell with the common current mirrors and comparator in ADC is shown in Fig.3. It consists of a switched capacitor and two common drain (source follower) transistors (
The storage cells are divided into two groups, corresponding to the sampling block A and B. The write of the storage cell (from sampling cell to storage cell) is then selected by the Block A/B selection line (
3.3 The Wilkinson ADC
The Wilkinson type ADC has been adopted because of its low power consumption and circuit compactness. Each channel integrates 32 Wilkinson ADCs and hence, a whole storage block can be digitized simultaneously.
The ramp signal generator is shared by all the 32 ADCs, as is the 11-bit counter. As described above, the ramp signal is sent to each storage cell as the input of one source follower. The counter starts counting as the ramp signal rises, and its value is latched into a local 11-bit latch when the ramp signal crosses the stored voltage in the corresponding cell. The ramp signal sweeps from 0.2 V to 1.4 V in 12 μs. Here, an input voltage range of 0.3–1.3 V is used for better linearity. After digitization, the data are loaded into the output shift registers. The maximum output data bandwidth for the ADCs is 29.3 Mbps per channel.
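The numbers quoted for the ADC can be cross-checked with a few lines of arithmetic. The sketch below assumes an ideal linear ramp whose full sweep is covered by the full range of the 11-bit counter; the resulting step size and counter clock are consequences of that assumption, not values given in the text, and all function names are illustrative.

```python
RAMP_START_V = 0.2     # from the text
RAMP_STOP_V = 1.4      # from the text
RAMP_TIME_S = 12e-6    # from the text
COUNTER_BITS = 11      # from the text
ADCS_PER_CHANNEL = 32  # from the text


def lsb_volts() -> float:
    """Ideal step size if the 11-bit counter spans the whole ramp (assumption)."""
    return (RAMP_STOP_V - RAMP_START_V) / 2**COUNTER_BITS


def counter_clock_hz() -> float:
    """Counter clock needed to reach full scale within one ramp (assumption)."""
    return 2**COUNTER_BITS / RAMP_TIME_S


def code_to_volts(code: int) -> float:
    """Map a latched counter value to a voltage on the ideal linear ramp."""
    return RAMP_START_V + code * lsb_volts()


def adc_bandwidth_bps() -> float:
    """Data produced per channel: 32 ADCs x 11 bits per ramp period."""
    return ADCS_PER_CHANNEL * COUNTER_BITS / RAMP_TIME_S


if __name__ == "__main__":
    print(f"LSB: {lsb_volts() * 1e3:.3f} mV")
    print(f"counter clock: {counter_clock_hz() / 1e6:.1f} MHz")
    print(f"ADC data rate: {adc_bandwidth_bps() / 1e6:.1f} Mbps per channel")
```

The data rate evaluates to about 29.3 Mbps per channel, which agrees with the figure quoted above when one 12 μs ramp digitizes 32 samples of 11 bits each.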
3.4 The Control Logic
The digital control circuit is developed using a standard digital IC design process. Only a few external signals are used to generate all internal control signals for sampling and storage of SCA, for example, sync for multi-ASIC synchronizing, trigger, read, move, and global reset. Three clocks are required: the sampling clock, ADC clock, and readout clock.
Two finite state machines (FSMs), SA and ST, are designed to control the sampling and storage SCA, respectively. The simplified state diagrams are shown in Fig.4.
The FSM SA has 4 different states:
SA1 – Block A sampling, Block B idle
SA2 – Block A transmitting data, Block B sampling
SA3 – Block A idle, Block B sampling
SA4 – Block A sampling, Block B transmitting data
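The four states above can be read as a ping-pong sequence. The sketch below shows one possible transition function and is not a transcription of Fig.4: the `transfer_done` input is a hypothetical signal marking the end of a sampling-to-storage transfer, `full` stands for the storage-full indicator, and the rule that a trigger is accepted only while one block is sampling and the other is idle is an assumption.

```python
from enum import Enum


class SA(Enum):
    SA1 = "A sampling, B idle"
    SA2 = "A transmitting, B sampling"
    SA3 = "A idle, B sampling"
    SA4 = "A sampling, B transmitting"


def next_state(state: SA, trigger: bool, transfer_done: bool, full: bool) -> SA:
    """One step of a hypothetical SA transition function (see assumptions above)."""
    if state is SA.SA1 and trigger and not full:
        return SA.SA2   # hold Block A for transfer, start sampling with Block B
    if state is SA.SA2 and transfer_done:
        return SA.SA3   # Block A has been copied to storage and is idle
    if state is SA.SA3 and trigger and not full:
        return SA.SA4   # hold Block B for transfer, start sampling with Block A
    if state is SA.SA4 and transfer_done:
        return SA.SA1   # Block B has been copied to storage and is idle
    return state        # otherwise stay in the current state


if __name__ == "__main__":
    s = SA.SA1
    for trig, done in [(True, False), (False, True), (True, False), (False, True)]:
        s = next_state(s, trig, done, full=False)
        print(s.name, "-", s.value)
```

Two accepted triggers, each followed by a completed transfer, bring this model back to SA1, which mirrors the alternating use of Block A and Block B.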
The state transitions are driven by the trigger and full signals. The latter indicates whether the storage SCA blocks are full. External triggers are ignored once all the storage blocks are full. In each state, the corresponding control signals are generated for the switches in the sampling cells (Fig.1). For instance, in State SA1, switches
The FSM ST also has 4 different states:
ST1 – Waiting data from Block A
ST2 – Writing in data of Block A
ST3 – Waiting data from Block B
ST4 – Writing in data of Block B
In each state, the corresponding control signals are generated for the switches
Control signals read and move are used for digitization and output data buffering. Signal read is used to start the AD conversion of the storage block that is currently selected, and signal move is used to shift the address to the next storage block. The digitized data is also loaded into the output shift registers at the end of the read operation. Flexible readout schemes can be implemented by an independent combination of these two signals. After digitization (read + move) or abandonment (move), the flag of the corresponding block is cleared, which otherwise may alter the indicator signals full and empty. The data can be shifted out by enabling the readout clock and signal read enable.
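The interplay of read, move, and the full and empty indicators can be modeled with a small bookkeeping class. This is a toy model under stated assumptions, not the chip's logic: the write and read pointers, the in-order circular addressing, and the rule that a write is refused when the target block still holds data are all hypothetical.

```python
NUM_BLOCKS = 32  # storage blocks per channel (from the text)


class StorageFlags:
    """Toy model of the per-block flags behind the full and empty indicators."""

    def __init__(self) -> None:
        self.flags = [False] * NUM_BLOCKS  # True: block holds undigitized data
        self.write_addr = 0                # hypothetical write pointer
        self.read_addr = 0                 # block currently selected for read

    @property
    def full(self) -> bool:
        return all(self.flags)

    @property
    def empty(self) -> bool:
        return not any(self.flags)

    def write_block(self) -> bool:
        """Store one sampling block; return False if the trigger is ignored.

        Assumption: the write is refused when the target block is still
        occupied, which coincides with `full` for in-order operation.
        """
        if self.flags[self.write_addr]:
            return False
        self.flags[self.write_addr] = True
        self.write_addr = (self.write_addr + 1) % NUM_BLOCKS
        return True

    def read(self) -> int:
        """Start the AD conversion of the selected block; return its address."""
        return self.read_addr

    def move(self) -> None:
        """Clear the flag of the selected block and select the next block."""
        self.flags[self.read_addr] = False
        self.read_addr = (self.read_addr + 1) % NUM_BLOCKS
```

In this model, calling `read()` followed by `move()` corresponds to digitizing a block, while calling `move()` alone corresponds to abandoning it; either way the flag is cleared and a further trigger can be accepted.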
3.5 Layout design
The layout of the chip, with dimensions of 4960 µm × 3980 µm, is shown in Fig.5. The control logic is at the top and the analog bias generator is at the bottom. Each channel consists, from left to right, of the sampling SCA, the storage SCA, and the Wilkinson ADC with the output data buffer, as shown in Fig.5. The storage array occupies the largest area because of its large number of MIM capacitors. The layout was designed carefully to suppress the interference of the digital circuits with the analog circuits.
4 Test Results and Discussion
4.1 The Evaluation System
A dedicated evaluation board and an FPGA test system were developed to characterize the GERO prototype chip. The GERO chip was mounted on the evaluation board, with input signals for 3 channels fed in via SMA connectors. The external control signals were connected to the FPGA board through an adapter board for 1.8 V to 3.3 V logic level conversion. The sampling frequency, as well as the event size and depth, could be programmed through the remote bus control protocol (RBCP). The output data from the GERO chip were buffered in the FPGA before being transferred to the computer via Ethernet. Qt-based data acquisition software was developed to configure the FPGA and to collect the data.
4.2 Functional Test
Configurations with sampling rates from 25 MS/s to 100 MS/s and event sizes from 32 to 1024 cells were tested. A maximum data bandwidth of 800 Mbps was also verified with a 200 MHz readout clock. All the functions of the GERO chip, namely sampling, storage, data digitization, and data output, worked well, which means that the two-stage SCA architecture, Wilkinson ADC, output shift registers, and control logic were running correctly. Fig.6 shows two examples of waveforms with two different memory depths of 32 cells and 64 cells. The waveforms were far from satisfactory, but they verified the flexible split of the memory depth.
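The verified bandwidth is consistent with the 4 LVDS data outputs described in Section 2 if each line carries one bit per readout clock cycle. The sketch below works through that budget; the one-bit-per-clock rule, the absence of framing overhead, and the 11-bit word per sample are assumptions, and the function names are illustrative.

```python
LVDS_LINES = 4        # from the text
BITS_PER_SAMPLE = 11  # width of the ADC latch (from the text)
CELLS_PER_BLOCK = 32  # from the text
CHANNELS = 16         # from the text


def link_rate_bps(readout_clock_hz: float) -> float:
    """Aggregate rate, assuming one bit per clock cycle on each LVDS line."""
    return LVDS_LINES * readout_clock_hz


def block_shift_time_s(readout_clock_hz: float) -> float:
    """Time to shift out one digitized storage block of all 16 channels."""
    bits = CHANNELS * CELLS_PER_BLOCK * BITS_PER_SAMPLE
    return bits / link_rate_bps(readout_clock_hz)


if __name__ == "__main__":
    clk = 200e6
    print(f"link rate: {link_rate_bps(clk) / 1e6:.0f} Mbps")
    print(f"shift-out time per block: {block_shift_time_s(clk) * 1e6:.2f} us")
```

Under these assumptions, one block of all channels is shifted out in about 7 µs at 200 MHz, so the output link would not be the bottleneck relative to the ADC ramp time.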
However, nearly half of the storage SCA blocks were found to have failed because of mistakes in the layout. The outputs of these blocks were noisy and independent of the inputs, and thus were discarded in the following analysis.
4.3 The Power Consumption
The power consumption of GERO was measured at room temperature with a main bias current of 50 µA. Two supply voltages of 1.8 V and 2.5 V were used. The power consumption was measured to be 2.3 mW/ch for the 1.8 V power supply, and 7.23 mW/ch for the 2.5 V power supply. The latter was much higher than the simulation result of 1.8 mW/ch. The ramp signals for the Wilkinson ADCs were found to be considerably steeper than designed, and their start points were approximately 100 mV higher than the designed values, accounting for the additional current of 0.5 mA per channel. Abnormal parasitic current paths were found from the 2.5 V supply to ground, probably through some switch transistors in the ramp generator. The test results shown below are for sampling at less than 100 MS/s and a 1024-cell event size.
4.4 The Static Noise
The output noise was first tested by measuring the output variance while sampling different DC voltages. A typical histogram of the digitized outputs for a certain storage cell sampling a DC level is shown in Fig.7(a). In total, 5000 samples were collected for each histogram. The input referred noise was then estimated to be 1.2 mV from the standard deviation of the major peak, which is consistent with the simulation result of 1.17 mV. However, a minor peak could also be clearly seen, which was common to quite a number of storage cells. The distance between the major and minor peaks varied with the input voltage. The twin-peak phenomenon was most probably caused by a disturbance on the ramp signal. Only the major peaks were used for the static performance evaluation.
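Estimating the noise from the major peak only can be sketched as follows. The text does not say how the major peak was separated from the minor one, so the selection rule used here (keep the codes within a fixed window around the most frequent code) is an assumption, as are the function name and the window width; the data in the example are synthetic.

```python
from collections import Counter
from statistics import pstdev


def major_peak_sigma(codes, half_window=5):
    """Standard deviation, in ADC codes, of the samples around the histogram mode.

    Assumption: the major peak is isolated by keeping only the codes within
    `half_window` of the most frequent code.
    """
    mode = Counter(codes).most_common(1)[0][0]
    major = [c for c in codes if abs(c - mode) <= half_window]
    return pstdev(major)


if __name__ == "__main__":
    # Synthetic codes: a major peak near 1000 and a minor peak near 1020.
    synthetic = [999] * 20 + [1000] * 50 + [1001] * 20 + [1020] * 10
    sigma_codes = major_peak_sigma(synthetic)
    lsb_mv = 1200 / 2**11  # ideal ramp step in mV for 0.2-1.4 V over 11 bits
    print(f"sigma = {sigma_codes:.3f} codes, about {sigma_codes * lsb_mv:.3f} mV")
```

Converting codes to millivolts with the ideal ramp step ignores the gain of the signal path and is only a rough scale for the example.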
4.5 The Linearity
The mean values of the major peaks for different DC input voltages are shown in Fig.7(b), along with the linear fit curve. The linear range was reduced to 0.4–1.25 V, because of the exceptional leakage current in the ramp signal. The maximum integral non-linearity (INL) was typically around 3% for the input range. Besides the issue of the ramp signal, the major source of the non-linearity was from the pair of the source followers, according to the circuit simulation. This could be improved in the future by optimizing the size of the source follower transistors (e.g. 8 u/1 u) and by changing the output point to the other side of the switch
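A linear fit and a maximum INL of this kind can be computed as in the sketch below. The exact INL definition used for the measurement is not stated in the text; here it is assumed to be the largest deviation from a least-squares line, expressed as a percentage of the fitted output span. The function names and the sample data are illustrative.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + offset."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def max_inl_percent(x, y):
    """Largest deviation from the fit, as a percentage of the fitted output span."""
    slope, offset = linear_fit(x, y)
    residuals = [abs(yi - (slope * xi + offset)) for xi, yi in zip(x, y)]
    span = abs(slope) * (max(x) - min(x))
    return 100.0 * max(residuals) / span


if __name__ == "__main__":
    # Illustrative points only, not measured data.
    volts = [0.0, 1.0, 2.0]
    codes = [0.0, 1.3, 2.0]
    print(linear_fit(volts, codes), max_inl_percent(volts, codes))
```

The same `linear_fit` helper could also be applied cell by cell to compare offsets and gains between storage cells.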
4.6 The Leakage Current
The charge loss caused by the leakage current could be significant, especially for a large SCA. The major contribution to the leakage current was from the switch transistors and was voltage dependent. The change of the ASIC output versus time for 1.2 V input voltage is shown in Fig.7(c). The leakage current was calculated to be 260 fA. At the readout frequency of 200 MHz, the difference caused by the leakage current between the first and the last block is less than 1 mV.
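The relation between leakage current and voltage droop on a storage capacitor is dV = I * t / C. The sketch below evaluates it for the measured leakage current; the storage capacitance is not given in the text, so the value used in the example is purely hypothetical, and the function names are illustrative.

```python
def droop_mv(leak_current_a: float, hold_time_s: float, cap_f: float) -> float:
    """Voltage change on a storage capacitor, dV = I * t / C, in mV."""
    return 1e3 * leak_current_a * hold_time_s / cap_f


def max_hold_time_s(leak_current_a: float, cap_f: float, limit_mv: float = 1.0) -> float:
    """Longest hold time that keeps the droop below `limit_mv`."""
    return limit_mv * 1e-3 * cap_f / leak_current_a


if __name__ == "__main__":
    I_LEAK = 260e-15  # measured leakage current (from the text)
    C_STORE = 1e-12   # hypothetical capacitance; the text does not give it
    print(f"droop after 1 ms: {droop_mv(I_LEAK, 1e-3, C_STORE):.2f} mV")
    print(f"hold time for 1 mV: {max_hold_time_s(I_LEAK, C_STORE) * 1e3:.2f} ms")
```

For a given capacitance, the second function gives the time by which the last block has to be digitized for the droop to stay below the chosen limit.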
4.7 The Non-Uniformity
The DC responses of the different storage cells in one channel were measured for an input range of 0.7–0.9 V. In total, 544 cells were tested, and for each cell three different input voltages were each sampled 1000 times. The averaged sample points with their linear fits are shown in Fig.7(d). The standard deviation of the offsets was 20.4 mV. This is almost one order of magnitude larger than the simulation result of 2.43 mV, indicating abnormal contributions, probably from the unstable ramp signal. In the next version, the mistakes in the ramp generator circuit will be corrected, and a bandgap will be used instead of a current mirror to generate a more stable and reliable ramp signal.
4.8 The Transient Performance
A quantitative measurement of the dynamic performance was extremely difficult because of unexpected problems such as the failed storage blocks and the twin-peak phenomenon. However, to prove the functionality of the GERO chip, Lorentz waves with two different amplitudes were sampled at 100 MS/s, as shown in Fig.7(e). The baseline of the Lorentz waves was 550 mV, and the peak values were 700 mV and 1000 mV, respectively. The FWHM of the Lorentz waves was approximately 575 ns, and the waves occupied seven consecutive storage blocks.
5 Conclusion
The GERO chip has been designed for the generic readout of MPGDs with a sampling frequency of up to 100 MHz, an event buffer depth of up to 1024 cells, and on-chip digitization. A two-stage SCA architecture was implemented, along with the corresponding control logic. In this way, the event size and buffer depth could be easily reconfigured, which provides great flexibility to meet the stringent demands of various applications. Although the performance of the prototype is not satisfactory and does not meet the design specifications, the full functionality of the GERO chip has been verified, including the sampling and storage SCAs, the on-chip ADCs, the data output, and the corresponding control logic. The performance of the first prototype chip was severely affected by the failed storage blocks and the unstable ramp signal. An upgraded version will be designed and tested in future work.
References
[1] Application specific integrated circuits for ANTARES offshore front-end electronics. Nucl. Instrum. Meth. A, 442(1-3):99-104 (2000). doi: 10.1016/s0168-9002(99)01205-x
[2] SAM: A new GHz sampling ASIC for the H.E.S.S.-II front-end electronics. Nucl. Instrum. Meth. A, 567(1):21-26 (2005). doi: 10.1016/j.nima.2006.05.052
[3] Design and performance of the 6 GHz waveform digitizing chip DRS4.
[4] The DRS chip: cheap waveform digitizing in the GHz range. Nucl. Instrum. Meth. A, 518(1):470-471 (2004). doi: 10.1016/j.nima.2003.11.059
[5] A 15 GSa/s, 1.5 GHz Bandwidth Waveform Digitizing ASIC. Nucl. Instrum. Meth. A, 735(1):452-461 (2014). doi: 10.1016/j.nima.2013.09.042
[6] The large analog bandwidth recorder and digitizer with ordered readout (LABRADOR) ASIC. Nucl. Instrum. Meth. A, 583(2):447-460 (2005). doi: 10.1016/j.nima.2007.09.013
[7] TARGET: A multi-channel digitizer chip for very-high-energy gamma-ray telescopes. J. Astroparticle Physics, 36:156-165 (2012). doi: 10.1016/j.astropartphys.2012.05.016
[8] Front end electronics for the STAR TPC.
[9] Design and implementation of the ATLAS TRT front end electronics. Nucl. Instrum. Meth. A, 563(2):306-309 (2006). doi: 10.1016/j.nima.2006.02.168
[10] HAMAC, a rad-hard high dynamic range analog memory for ATLAS calorimetry. CERN (2000). doi: 10.5170/CERN-2000-010.203
[11] Beam test of a prototype readout system for precision tracking detectors at LHC. Nucl. Instrum. Meth. A, 369(1):79-91 (1996). doi: 10.1016/0168-9002(95)00769-5
[12] The CMS silicon strip tracker and its electronic readout. Nucl. Instrum. Meth. A, 518(2):331-335 (2001). doi: 10.1016/j.nima.2003.11.012
[13] Development and Characterisation of a Radiation Hard Readout Chip for the LHCb-Experiment. J. Philosophy, 518(1-2):468-469 (2003). doi: 10.11588/heidok.00003128
[14] Beetle - a radiation hard readout chip for the LHCb experiment. Nucl. Instrum. Meth. A, 518:468-469. doi: 10.1016/j.nima.2003.11.058
[15] AFTER, an ASIC for the readout of the large T2K time projection chambers. IEEE T. Nuclear Science, 55(3):1744-1752 (2008). doi: 10.1109/NSSMIC.2007.4436521
[16] AGET, the GET front-end ASIC, for the readout of the Time Projection Chambers used in nuclear physic experiments.
[17] Architecture and Implementation of the Front-End Electronics of the Time Projection Chambers in the T2K Experiment. IEEE Transactions on Nuclear Science, 57(2):406-411 (2010). doi: 10.1109/tns.2009.2035313
[18] CASCA: A readout ASIC for a TPC based X-ray polarimeter. IEEE, 1-4 (2016).
[19] Dissertation.
[20] MALICE: A Full Custom Analog Memory for ALICE.