1. Introduction
The parameter estimation problem for nuclear pulse signals is an important aspect of radioactivity measurement. Goulding [1], Gerardi [2], Noulis [3] et al. pointed out that Gaussian pulse signals perform well in improving the signal-to-noise ratio and the energy resolution. The Gaussian pulse is therefore usually taken as the target shape for nuclear pulse shaping; however, a true Gaussian pulse has an anti-causal part, making it difficult to implement in analog systems [4]. With the development of digital signal processing methods and technologies, digital nuclear pulse shaping has become widely used. Chen Shi-Guo et al. [5] proposed a recursive implementation of Gaussian pulse shaping based on wavelet analysis, while Kaiming Jiang et al. [6] proposed a pulse parameter extraction method based on Sallen–Key (S–K) digital shaping and a population technique. Hongquan Huang et al. [7] used a genetic algorithm to estimate the parameters of S–K Gaussian-shaped overlapping pulse signals.
Pulse signals have complex and varied characteristics, which means that very large amounts of data are required for pulse parameter extraction. In addition, because of the simplicity of its mathematical model, the extraction efficiency and accuracy of a traditional search algorithm fall off rapidly once the number of overlapping pulses increases or the degree of overlap deepens.
In recent years, deep learning technology has developed continuously [8]. Deep networks contain hidden layers with many nonlinear transformation structures, which enhances their ability to fit complex models when trained on large amounts of data [9-12]. As part of this development, the recurrent neural network (RNN) has proven effective for time-series problems [13-15]. Unfortunately, an RNN may suffer from vanishing or exploding gradients during network training; this can be solved by the Long Short-Term Memory (LSTM) neural network, which replaces each hidden layer of the RNN with a memory cell controlled by three gate units. Following the general principle of deep learning, a multi-layer LSTM model composed of stacked LSTM layers can map the abstract features of the data to higher-dimensional network layers, giving it more powerful learning and expressive abilities for nonlinear sequences [16-18].
At present, research on introducing deep learning technology into nuclear pulse parameter extraction is still at a preliminary stage, making it urgent to bring this new technology into the field. In this paper, continuous pulse signals were discretized, and classical S–K Gaussian shaping was then applied to the discrete exponential pulses to produce a data set with the characteristics of a time series. Then, based on the characteristics of the Gaussian overlapping pulses after shaping, an efficient and stable LSTM model for parameter extraction of Gaussian overlapping nuclear pulses (ONP) was explored.
Our experimental results showed that the proposed method can effectively overcome the difficulty of extracting parameters from noise-containing overlapping pulses. The extracted parameters showed high precision, demonstrating the good performance of the proposed method for estimating pulse parameters.
2. Principles and Algorithms
2.1 A shaping model for nuclear pulses
The transmission of nuclear signals through the detection channel is characterized by pulses that can exhibit exponential, double-exponential, triangular, step, trapezoidal, or Gaussian form. The following takes the extraction of overlapping pulse parameters after S–K Gaussian shaping as an example, where the inputs to the S–K circuit are multiple exponential pulses.
2.1.1 Superposition model of multiple exponential pulses
For overlapping pulses formed by the superposition of N exponentially decaying nuclear pulses, the mathematical model is as follows:

$V_e(t) = \sum_{i=1}^{N} A_i \, e^{-(t - T_i)/\tau} \, u(t - T_i)$  (1)

In Eq. (1), u(t) represents the unit step signal, $A_i$ and $T_i$ are the amplitude and starting time of the i-th pulse, and $\tau$ is the decay time constant.
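For illustration, a minimal NumPy sketch of Eq. (1) might look as follows. The function name and time grid are illustrative, and the value tau = 25 sampling periods is an assumption; the amplitudes and start times are those of Example 1 (Table 1).

```python
import numpy as np

def overlapping_pulses(t, amplitudes, start_times, tau):
    """Evaluate Eq. (1): a superposition of N exponentially decaying
    pulses A_i * exp(-(t - T_i)/tau) * u(t - T_i)."""
    v = np.zeros_like(t, dtype=float)
    for a_i, t_i in zip(amplitudes, start_times):
        u = (t >= t_i).astype(float)             # unit step u(t - T_i)
        v += a_i * np.exp(-(t - t_i) / tau) * u
    return v

# Four pulses with the amplitudes and start times of Example 1;
# tau = 25 sampling periods is an assumed decay constant.
t = np.arange(0.0, 1024.0)
v_e = overlapping_pulses(t, [300, 150, 200, 250], [1, 100, 200, 300], tau=25.0)
```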
2.1.2 S-K Gaussian shaping
The S–K circuit is a common Gaussian shaping circuit. Assume that the circuit resistance is R and the capacitance is C; the shaping parameter is then K = RC/Ts, where Ts is the sampling period.
From the above, Eq. (4) can be obtained:
In this formula,
2.2 Theories and techniques relevant to the LSTM model
The LSTM model involves forward propagation, back propagation through time (BPTT), and the Adam parameter optimization algorithm [19,20]. For a given input sequence, a standard LSTM model iterates the hidden-layer and output sequences through a forget gate, an input gate, a candidate information gate, and an output gate. The mathematical models of these structures are shown in Eqs. (5)–(10):

$f_m = \sigma(W_f h_{m-1} + U_f x_m + b_f)$  (5)
$g_m = \sigma(W_g h_{m-1} + U_g x_m + b_g)$  (6)
$\tilde{c}_m = \tanh(W_c h_{m-1} + U_c x_m + b_c)$  (7)
$C_m = f_m \odot C_{m-1} + g_m \odot \tilde{c}_m$  (8)
$o_m = \sigma(W_o h_{m-1} + U_o x_m + b_o)$  (9)
$h_m = o_m \odot \tanh(C_m)$  (10)

where $x_m$ is the input at step m, $\sigma$ is the sigmoid function, $\odot$ denotes element-wise multiplication, $W$ and $U$ are weight matrices, $b$ are bias vectors, $h_m$ is the hidden state, and $C_m$ is the memory cell state.
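As a concrete illustration of Eqs. (5)–(10), the following NumPy sketch performs one forward step of a single LSTM memory cell; the parameter container and toy dimensions are assumptions for demonstration, not the paper's trained model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell_forward(x_m, h_prev, c_prev, U, W, b):
    """One forward step of an LSTM memory cell (Eqs. (5)-(10)).
    U, W, b map the gate names 'f', 'g', 'c', 'o' to weights/biases."""
    f = sigmoid(W['f'] @ h_prev + U['f'] @ x_m + b['f'])        # forget gate, Eq. (5)
    g = sigmoid(W['g'] @ h_prev + U['g'] @ x_m + b['g'])        # input gate, Eq. (6)
    c_tilde = np.tanh(W['c'] @ h_prev + U['c'] @ x_m + b['c'])  # candidate info, Eq. (7)
    c = f * c_prev + g * c_tilde                                # state update, Eq. (8)
    o = sigmoid(W['o'] @ h_prev + U['o'] @ x_m + b['o'])        # output gate, Eq. (9)
    h = o * np.tanh(c)                                          # hidden state, Eq. (10)
    return h, c

# Toy dimensions: 1-dimensional input, 4 hidden units.
rng = np.random.default_rng(0)
U = {k: rng.normal(size=(4, 1)) for k in 'fgco'}
W = {k: rng.normal(size=(4, 4)) for k in 'fgco'}
b = {k: np.zeros(4) for k in 'fgco'}
h, c = lstm_cell_forward(np.array([0.5]), np.zeros(4), np.zeros(4), U, W, b)
```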
The LSTM training process uses the BPTT algorithm, which consists of four main steps. First, the output value of each LSTM memory cell is calculated by forward propagation; second, the error term of each memory cell is calculated backwards; then, the gradient of each weight is calculated from the error terms; finally, a gradient-based optimization algorithm is applied to update the weights. Common gradient optimization algorithms include stochastic gradient descent (SGD) [21], adaptive gradient (AdaGrad) [22], root mean square propagation (RMSProp) [23], and adaptive momentum estimation (Adam) [24].
2.3 Parameter estimation for overlapping pulses
Overlapping pulse parameter estimation after Gaussian shaping mainly involves producing the data set, training by forward propagation and back propagation based on the BPTT algorithm, and saving the model after training is complete.
2.3.1 Data set production
A data set with n samples was constructed. The matrix representation of the data set is as follows:
Each row in Eq. (11) represented the data of one sample, with the first M entries of each sample being the sampled values of Vo(mTs). Here Vo(mTs) is the signal obtained from the overlapping pulses after S–K digital Gaussian shaping, corresponding to this sample, and so Vo(mTs) is also known as the Gaussian overlapping pulse signal. The parameters of the input signal Ve(mTs) were set to
Next, the data set was divided into the Training Set, the Test Set, and the Validation Set, according to a certain ratio. The Training Set was used to train the LSTM model, the Test Set was used to verify the generalization ability of the model after training, and the Validation Set was used to check whether the trained model suffered from over-fitting. In an over-fitting situation, the loss value on the training data is small and the prediction accuracy correspondingly high, but the loss on the validation data is large and the prediction accuracy low, meaning that the trained model has lost the ability to generalize.
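As an illustration, a data set held as a NumPy array could be divided as follows; the 8:1:1 ratio and the shuffling seed are assumptions, since the paper only specifies division "according to a certain ratio".

```python
import numpy as np

def split_dataset(data, train_ratio=0.8, val_ratio=0.1, seed=0):
    """Shuffle and split the n-sample data set (one sample per row)
    into Training, Validation, and Test Sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(data))
    n_train = int(train_ratio * len(data))
    n_val = int(val_ratio * len(data))
    train = data[idx[:n_train]]
    val = data[idx[n_train:n_train + n_val]]
    test = data[idx[n_train + n_val:]]
    return train, val, test

# e.g. train, val, test = split_dataset(dataset_matrix)
```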
Traditional machine learning models often use L1 and L2 regularization to modify the loss function. For large deep neural networks, however, modifying the loss function alone cannot meet actual needs, so the Dropout algorithm was used in our study to solve this difficulty [25-27]. When the Dropout algorithm is applied during forward propagation, a certain number of memory cells in the network stop processing sequence information with a certain probability, so that training is effectively carried out on different combinations of the LSTM architecture. This method reduces the dependence of the neural network on particular local features, enhances the generalization ability of the LSTM model, and thus improves the performance of the neural network. The mathematical model of the Dropout algorithm is shown in Eqs. (12) and (13):

$r_m \sim \mathrm{Bernoulli}(1 - p)$  (12)
$\tilde{h}_m = r_m \odot h_m$  (13)

where p is the probability that an LSTM memory cell stops propagating, and the function $\mathrm{Bernoulli}(1 - p)$ generates the random 0-1 mask vector $r_m$ that zeroes the hidden outputs of the dropped cells.
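A minimal sketch of Eqs. (12) and (13) in NumPy follows; the inverted-dropout rescaling by 1/(1 − p) is a common implementation convention and an assumption here, not a detail stated above.

```python
import numpy as np

def dropout(h, p, rng=None):
    """Eqs. (12)-(13): keep each memory-cell output with probability
    1 - p and zero it with probability p (training time only)."""
    if rng is None:
        rng = np.random.default_rng()
    r = rng.binomial(1, 1.0 - p, size=h.shape)  # r ~ Bernoulli(1 - p), Eq. (12)
    return (r * h) / (1.0 - p)                  # h~ = r (.) h, Eq. (13), with rescaling

# e.g. h_dropped = dropout(h, p=0.2)
```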
2.3.2 Forward propagation calculation for the pulse sampling value sequence
The forward propagation calculation for the pulse sampling value sequence uses the sequence [Vo(mTs)]i in the Training Set as the input data; after iterating through the multi-layer LSTM model, the process finally delivers the estimated nuclear pulse parameter set.
Calculation of the forget gate structure
The forget gate structure determines the retention probability of the memory cell state information, and is calculated as shown in Eq. (14):

$f_m = \sigma\!\left(W_f h_{m-1} + U_f [V_o(mT_s)]_i + b_f\right)$  (14)

In Eq. (14), $h_{m-1}$ is the hidden state information of the previous memory cell, $\sigma$ is the sigmoid function, and $U_f$, $W_f$, and $b_f$ are the weight and bias parameters of the forget gate.
Calculation of the input gate structure
The input gate structure is used to calculate new state information inside the memory cell, and its structure is similar to that of the forget gate. Its weight and bias parameters are $U_g$, $W_g$, and $b_g$, and its mathematical model is described by Eq. (15):

$g_m = \sigma\!\left(W_g h_{m-1} + U_g [V_o(mT_s)]_i + b_g\right)$  (15)

In Eq. (15), $g_m$ is the output value of the input gate.
Status update of the memory cell
First, the candidate information vector $\tilde{c}_m$ is calculated with the tanh function, and then the memory cell state is updated, as shown in Eqs. (16) and (17):

$\tilde{c}_m = \tanh\!\left(W_c h_{m-1} + U_c [V_o(mT_s)]_i + b_c\right)$  (16)
$C_m = f_m \odot C_{m-1} + g_m \odot \tilde{c}_m$  (17)

In these equations, $C_m$ represents the state value of the memory cell at the current moment, while $f_m$ represents the output value of the forget gate. $C_{m-1}$ represents the state value of the memory cell at the previous moment, $g_m$ represents the output value of the input gate, and $\tilde{c}_m$ is the candidate information vector; $U_c$, $W_c$, and $b_c$ are the corresponding weight and bias parameters.
Calculation of the output gate structure
The output gate structure determines the hidden state information, hm. First, the vector containing the hidden state information hm-1 from the previous memory cell and the vector containing the current pulse sequence information [Vo(mTs)]i were processed using the sigmoid function. Then the cell state information Cm of the memory cell was transformed using the tanh function. Next, the output of the tanh function was multiplied by the output value, om, of the sigmoid function to obtain the hidden state information, hm. Finally, the hidden state information hm was transmitted to the next layer of the network, while hm and the cell state information Cm were also transmitted to the next memory cell of the same layer.
Therefore, the mathematical model of the output gate is as shown in Eqs. (18) and (19):

$o_m = \sigma\!\left(W_o h_{m-1} + U_o [V_o(mT_s)]_i + b_o\right)$  (18)
$h_m = o_m \odot \tanh(C_m)$  (19)

In this mathematical model, $U_o$, $W_o$, and $b_o$ are the weight and bias parameters of the output gate.
The ONP parameter estimation model based on the multi-layer LSTM neural network can be considered a stack of multiple single-layer LSTM models, with the hidden state information hm used as the information transmitted between neural network layers. The extracted abstract information is therefore passed from layer to layer until the last layer of the LSTM network estimates the parameter set of the ONP, at which point forward propagation finishes.
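To make this stacked architecture concrete, the following Keras sketch maps a length-M sampled sequence to the 2N pulse parameters; the layer widths and the values of M and N are illustrative assumptions, not the paper's exact configuration.

```python
from tensorflow import keras
from tensorflow.keras import layers

M, N = 1024, 4  # sequence length and pulse count: assumed example values

model = keras.Sequential([
    layers.Input(shape=(M, 1)),              # sampled sequence [Vo(mTs)], one value per step
    layers.LSTM(64, return_sequences=True),  # hidden states h_m passed to the layer above
    layers.LSTM(64),                         # final layer returns the last hidden state
    layers.Dense(2 * N),                     # parameter set theta: (A_1..A_N, T_1..T_N)
])
```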
2.3.3 Back propagation training process for nuclear pulse sequences
Because the weights and biases of each LSTM memory cell are randomly assigned when the neural network is defined, a loss function also needs to be designed. With this loss function, the error between the estimated pulse parameters and the true pulse parameters can be quantified and fed back to the network during training.
The Gaussian ONP parameter estimation model proposed in this paper solves the problem of estimating specific values, which belongs to the category of regression models, so it is appropriate to use the mean square error (MSE) function as the loss function. For a Training Set with q samples, the error value between the estimated pulse parameter set $\hat{\theta}_i$ and the true parameter set $\theta_i$ is calculated as shown in Eq. (20):

$L_{\mathrm{MSE}} = \frac{1}{q} \sum_{i=1}^{q} \left\| \hat{\theta}_i - \theta_i \right\|^2$  (20)
Then, using the gradient-based Adam algorithm [24], the loss value and the gradient of the loss function were fed back to the network to update the weights; as the weights were updated by backpropagation, the nuclear pulse parameter set estimated in the next forward propagation became more accurate. By alternating forward propagation, which estimates the nuclear pulse parameter set, with backpropagation, which corrects the weights of the LSTM network model, the loss value between the ONP parameter set estimated by the LSTM model and the true ONP parameter set gradually decreased, thereby achieving the purpose of training the network.
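Continuing the model sketch above, a training setup with the MSE loss of Eq. (20) and the Adam optimizer might look as follows; the learning rate, epoch count, batch size, and the array names x_train, y_train, x_val, and y_val are assumptions.

```python
from tensorflow import keras

# 'model' is the stacked-LSTM sketch defined above; x_train/y_train hold
# the sampled sequences and true parameter sets, x_val/y_val the
# Validation Set (all names assumed).
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
              loss="mse",        # Eq. (20)
              metrics=["mae"])   # Eq. (21), monitored per epoch
history = model.fit(x_train, y_train,
                    validation_data=(x_val, y_val),
                    epochs=100, batch_size=32)
```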
Finally, in order to improve training efficiency and to avoid oscillation of the LSTM model loss value in the later stages of training, a criterion was needed to decide after how many rounds the model should stop training. Because the Mean Absolute Error (MAE) function, shown in Eq. (21), avoids mutual cancellation between deviations, it was used to determine the number of training rounds:

$L_{\mathrm{MAE}} = \frac{1}{q} \sum_{i=1}^{q} \left| \hat{\theta}_i - \theta_i \right|$  (21)
In practice, a threshold was set according to the actual situation, and training ended when the MAE fell below that threshold. At this point, the data in the Test Set were input into the model to test the generalization ability of the ONP parameter estimation model.
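One way to realize this MAE-threshold stopping rule is a Keras callback like the sketch below; the threshold value and the monitored key "val_mae" (produced when "mae" is among the compile metrics) are assumptions.

```python
from tensorflow import keras

class MAEThresholdStopping(keras.callbacks.Callback):
    """Stop training once the validation MAE (Eq. (21)) falls below
    a user-chosen threshold, as described above."""
    def __init__(self, threshold):
        super().__init__()
        self.threshold = threshold

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        if logs.get("val_mae", float("inf")) < self.threshold:
            self.model.stop_training = True

# e.g. model.fit(..., callbacks=[MAEThresholdStopping(threshold=0.5)])
```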
2.3.4 Saving the model after training
After the LSTM model with the ability to estimate the Gaussian ONP parameter set θ was trained, important information such as the model structure, weights, training configuration, and optimizer state was saved as a Hierarchical Data Format 5 (HDF5) file. When ONP need to be estimated in the future, the program first loads the trained model from the HDF5 file; next, the sampled values of the Gaussian ONP whose shaping parameters are to be estimated are used as the input data of the LSTM model; finally, the LSTM model outputs the required set of nuclear pulse parameters. The structure of the Gaussian ONP parameter estimation algorithm based on the deep learning LSTM model is shown in Fig. 1.
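With Keras, saving and reloading the trained estimator as HDF5 reduces to a few lines; the file name and the input array vo_samples are illustrative, and 'model' is the trained estimator from the sketches above.

```python
from tensorflow import keras

model.save("onp_lstm_estimator.h5")  # structure, weights, training config, optimizer state

# Later: reload the trained model and estimate parameters for new ONP.
reloaded = keras.models.load_model("onp_lstm_estimator.h5")
theta_hat = reloaded.predict(vo_samples)  # vo_samples: sampled Gaussian ONP, shape (n, M, 1)
```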
[Fig. 1. Structure diagram of the Gaussian ONP parameter estimation algorithm based on the deep learning LSTM model]
3. Experimental verification and discussion
In order to verify the feasibility of the proposed method for estimating ONP parameters, three examples were prepared and are reviewed in this paper. First, Example 1 is a contrast experiment in which the proposed method is compared with the traditional optimization algorithm for estimating nuclear pulse parameters proposed by Hong-Quan Huang, Xiao-Feng Yang, et al. [7].
Second, as the time interval between adjacent exponential pulses becomes smaller, their overlap becomes more severe, which further increases the difficulty of estimating nuclear pulse parameters. Example 2 was therefore used to verify the performance of the proposed algorithm in estimating ONP parameters when the time interval between adjacent nuclear pulses was short.
Finally, considering the difficulty of estimating nuclear pulse parameters caused by the overlap of multiple pulses, Example 3 was a parameter extraction experiment using nine overlapping nuclear pulses. This example was used to verify the parameter extraction performance of the algorithm under conditions where multiple nuclear pulses overlap.
In addition, in order to ensure calculation accuracy while maximizing the improvement in computational efficiency, an adaptive algorithm was designed to determine the number M of sampled values of the ONP Vo(mTs) and the number of LSTM model neural network layers. Thus, for the ONP Vo(mTs), if the k-th sampled value Vo(kTs) satisfies the condition expressed in Eq. (22), the number M of sampled values of the ONP Vo(mTs) could be obtained by applying Eq. (23).
Combined with the above conditions, the number of LSTM model neural network layers could be determined by applying Eq. (24):
In Eq.(24), s is an arbitrary positive integer, Cnum1 is the number of memory cells in the first layer,
Example 1
Exponential pulses Input 1, Input 2, Input 3, and Input 4 were inputted into an S–K shaping circuit, with the characteristic time of
[Fig. 2. Loss curves of the LSTM model on the Training Set and the Validation Set]
As shown in Fig. 2, the loss value of the model on both the Training Set and the Validation Set decreased monotonically, showing that there was no over-fitting; consequently, the Dropout algorithm was not required for this example, which saved computational cost. Finally, the parameters and errors of the Gaussian ONP estimated by the deep learning LSTM model and by the traditional optimization algorithm used by Huang et al. [7] are shown in Table 1.
Table 1. Parameters and errors of the Gaussian ONP estimated by the LSTM model and the traditional optimization algorithm [7].

| Items | A1 | A2 | A3 | A4 | T1 | T2 | T3 | T4 |
|---|---|---|---|---|---|---|---|---|
| True value | 300 | 150 | 200 | 250 | 1 | 100 | 200 | 300 |
| Calculated value (Huang) | 305.62 | 150.85 | 189.33 | 256.69 | 1.17 | 99.52 | 198.63 | 299.9 |
| Calculated value (LSTM) | 294.44 | 150.06 | 199.95 | 248.92 | 0.9995 | 100.1 | 199.89 | 294.29 |
| Error (Huang) | 5.62 (1.87%) | 0.85 (0.57%) | 10.67 (5.34%) | 6.69 (2.68%) | 0.17 | 0.48 | 1.37 | 0.1 |
| Error (LSTM) | 5.56 (1.85%) | 0.06 (0.04%) | 0.05 (0.03%) | 1.08 (0.43%) | 0.0005 | 0.1 | 0.11 | 5.71 |
Figure 3 shows the results for nuclear pulse parameter estimation based on the LSTM model. Here, Figure 3a shows the exponential pulses before shaping, the Gaussian pulses after shaping, and the Gaussian ONP obtained using the LSTM parameter estimation method. Figure 3b shows the true exponential pulses and the exponential pulses obtained using the LSTM model, while Fig. 3c shows both the true Gaussian pulses and the Gaussian pulses whose parameters were estimated using the LSTM model.
[Fig. 3. Nuclear pulse parameter estimation results based on the LSTM model: (a) exponential pulses before shaping, Gaussian pulses after shaping, and the Gaussian ONP obtained by the LSTM method; (b) true and LSTM-estimated exponential pulses; (c) true and LSTM-estimated Gaussian pulses]
From the experimental results, the relative errors for amplitude parameter,
Example 2
When the time interval between adjacent exponential nuclear pulses is short, the overlap between the pulses becomes more serious, making it very difficult to estimate ONP parameters. The purpose of this example was to verify the ability of the proposed method to estimate ONP parameters when the time interval between adjacent exponential nuclear pulses was small.
Exponential pulses Input 1, Input 2, Input 3, and Input 4 were inputted into an S–K shaping circuit, with the characteristic time of
Table 2. Parameters and errors of the Gaussian ONP estimated by the LSTM model in Example 2.

| Items | A1 | A2 | A3 | A4 | T1 | T2 | T3 | T4 |
|---|---|---|---|---|---|---|---|---|
| True value | 300 | 150 | 200 | 250 | 1 | 100 | 150 | 200 |
| Calculated value | 300.13 | 150.72 | 199.29 | 250.23 | 0.997 | 98.95 | 149.29 | 199.34 |
| Error | 0.13 (0.043%) | 0.72 (0.48%) | 0.71 (0.36%) | 0.23 (0.09%) | 0.003 | 1.05 | 0.71 | 0.66 |
[Fig. 4. Parameter estimation results for Example 2]
According to the experimental results, the relative amplitude parameter (
Example 3
To test the LSTM model's ability to estimate the parameters of multiple ONP, this example investigated LSTM model parameter estimation for nine overlapping pulses. Exponential pulses Input 1, Input 2, Input 3, Input 4, Input 5, Input 6, Input 7, Input 8, and Input 9 were inputted into an S–K shaping circuit, with the characteristic time of
Taking RC = 250 ns, the parameter K for the S–K digital Gaussian pulse shaping algorithm was K = RC/Ts = 50 (implying a sampling period Ts = 5 ns). Substituting the above conditions into Eqs. (1)–(4) gave k = 382, which, when input into Eqs. (22)–(24), allowed the number of network layers,
The Adam algorithm was used to update the weights, and the Adam parameters were set to
Table 3. Parameters and errors of the nine Gaussian ONP estimated by the LSTM model in Example 3.

| Items | A1 | A2 | A3 | A4 | A5 | A6 | A7 | A8 | A9 | T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 | T9 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| True value | 300 | 150 | 200 | 250 | 650 | 100 | 550 | 350 | 50 | 1 | 100 | 200 | 300 | 400 | 500 | 600 | 700 | 800 |
| Calculated value | 300.44 | 150.86 | 200.54 | 250.62 | 649.9 | 98.07 | 549.72 | 350.14 | 49.11 | 1 | 100.61 | 199.57 | 300.04 | 400.26 | 500.55 | 600.05 | 700.3 | 799.78 |
| Absolute error | 0.44 | 0.86 | 0.54 | 0.62 | 0.1 | 1.93 | 0.28 | 0.14 | 0.89 | 0 | 0.61 | 0.43 | 0.04 | 0.26 | 0.55 | 0.05 | 0.3 | 0.22 |
| Relative error (%) | 0.15 | 0.57 | 0.27 | 0.248 | 0.015 | 1.93 | 0.051 | 0.04 | 1.78 |  |  |  |  |  |  |  |  |  |
Examining the experimental results, it can be seen that the relative amplitude parameter,
4. Conclusion
The LSTM deep learning model has been used to estimate the parameters of ONP signals after S–K Gaussian shaping, satisfactorily overcoming the influence of noise in the pre-shaping exponential signal on the estimated parameters. Taking the measured ONP signal as a sample, the sampled values of the ONP signal were fed into the LSTM model step by step, and the processing and transfer of the sampled-value information relied on the memory cell structure peculiar to the LSTM model. Finally, after sufficient training, when sampled ONP values were input into the LSTM model, it was able to estimate the shaping parameters of these nuclear pulses quickly and accurately. Because the network learns from the entire sample, all features in the sample are recorded by the network, indicating that this method can overcome the local-convergence defect of traditional methods and achieve optimal estimation of nuclear pulse parameters in the global sense. The model proposed herein has thus been demonstrated to be a good method for estimating ONP parameters.
Full-sample learning, however, sharply increased the amount of data to be processed during model training compared with traditional methods, so the training time of the current LSTM model was longer than that of the traditional method. In addition, the increased model size caused by the larger amount of data meant that some additional aspects could not be explored on the currently available hardware.
Network structure optimization and improvement of computational efficiency will therefore be the primary focus of our future research, in which, for example, the use of multiple types of deep neural networks and the re-optimization of the memory cells inside the LSTM could be explored. In addition, the examples presented here only considered parameter estimation for ONP based on S–K digital shaping. In future research, the problem of pulse parameter extraction for digital trapezoidal (triangular) shaping could be addressed, and the issue of baseline estimation could also be taken into account.
References
[1] Pulse-shaping in low-noise nuclear amplifiers: A physical approach to noise analysis. Nucl. Instrum. Meth. 100, 493-504 (1972). doi: 10.1016/0029-554X(72)90828-2
[2] Digital filtering and analysis for a semiconductor X-ray detector data acquisition. Nucl. Instrum. Meth. A 571, 378-380 (2007). doi: 10.1016/j.nima.2006.10.113
[3] Particle detector tunable monolithic Semi-Gaussian shaping filter based on transconductance amplifiers. Nucl. Instrum. Meth. A 589, 330-337 (2008). doi: 10.1016/j.nima.2008.02.048
[4] Gaussian pulse shaping of exponential decay signal based on wavelet analysis. Acta Phys. Sin. 57, 2882-2887 (2008). doi: 10.3321/j.issn:1000-3290.2008.05.041 (in Chinese)
[5] Recursive implementation of Gaussian pulse shaping based on wavelet analysis. Acta Phys. Sin. 58(5), 3041-3046 (2009). doi: 10.7498/aps.58.3041 (in Chinese)
[6] Pulse parameter extraction method based on S-K digital shaping and population technique. Nuclear Electronics & Detection Technology 37, 2 (2017). (in Chinese)
[7] Estimation method for parameters of overlapping nuclear pulses signal. Nucl. Sci. Tech. 28, 12 (2017). doi: 10.1007/s41365-016-0161-z
[8] Reducing the dimensionality of data with neural networks. Science 313, 5786 (2006). doi: 10.1126/science.1127647
[9] Neural networks for time series processing. Neural Netw. World 6, 1 (1996).
[10] Deep learning. Nature 521, 7553 (2015). doi: 10.1038/nature14539
[11] Gradient-based learning applied to document recognition. Proceedings of the IEEE 86, 2278-2324 (1998). doi: 10.1109/5.726791
[12] Study on quality identification of macadamia nut based on convolutional neural networks and spectral features. Spectrosc. Spect. Anal. 38, 1514 (2018). doi: 10.3964/j.issn.1000-0593(2018)05-1514-06
[13] Generating sequences with recurrent neural networks (2013). arXiv:1308.0850
[14] Speech recognition with deep recurrent neural networks (2013). arXiv:1303.5778
[15] How to construct deep recurrent neural networks (2013). arXiv:1312.6026
[16] Long short-term memory. Neural Comput. 9, 1735-1780 (1997). doi: 10.1162/neco.1997.9.8.1735
[17] A novel connectionist system for unconstrained handwriting recognition. IEEE Trans. Pattern Anal. 31, 855-868 (2009). doi: 10.1109/tpami.2008.137
[18] Artificial Neural Networks - ICANN 2001.
[19] Backpropagation through time: what it does and how to do it. Proceedings of the IEEE 78, 1550-1560 (1990). doi: 10.1109/5.58337
[20] Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks 18, 602-610 (2005). doi: 10.1016/j.neunet.2005.06.042
[21] Backpropagation and stochastic gradient descent method. Neurocomputing 5, 185-196 (1993). doi: 10.1016/0925-2312(93)90006-O
[22] Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12, 2121-2159 (2011).
[23] Every moment counts: Dense detailed labeling of actions in complex videos. Int. J. Comput. Vision 126, 375-389 (2018). doi: 10.1007/s11263-017-1013-y
[24] Adam: A method for stochastic optimization (2014). arXiv:1412.6980
[25] Improving neural networks by preventing co-adaptation of feature detectors (2012). arXiv:1207.0580
[26] ImageNet classification with deep convolutional neural networks. Commun. ACM 60, 84-90 (2017). doi: 10.1145/3065386
[27] Dropout as data augmentation (2015). arXiv:1506.08700