1 Introduction
Neural networks are powerful tools for making predictions after being trained on data, and in the past few years they have enabled exciting achievements in nuclear physics [1-6]. The earliest work on neural networks in nuclear physics dates back to 1993, when a phenomenological approach to many-body systems based on multilayer feedforward neural networks was introduced to learn the systematics of atomic masses and nuclear spins and parities [7]. Since then, various types of neural networks have been applied to study nuclear mass systematics [8, 9], β-decay systematics [10], and binding energies [11].
Recently, physical insights have been used to improve neural networks and realize more of their potential. Known physics has been explicitly embedded in the Bayesian neural network (BNN), yielding a novel method for accurately predicting β-decay half-lives [12]. Preprocessing the input data and including correlations among the inputs reduces the problem of multiple solutions and yields more stable extrapolations [13]. The combination of a three-parameter formula and the BNN results in a novel approach for describing nuclear charge radii [14]. Neural networks can also reveal physical laws. For example, a feed-forward neural network model was trained to calculate nuclear charge radii, and a correlation between the symmetry energy and the charge radii of Ca isotopes was suggested [15]. A convolutional neural network algorithm was applied to determine the impact parameters in heavy-ion collisions using constrained molecular-dynamics model simulations [16]. To date, neural networks have been broadly used in nuclear physics, including for signal identification [17-19], data restoration [20, 21], and regression analysis [5].
In studies of nuclear masses and nuclear charge radii, Utama et al. showed that the physics can be included in the initial prediction using physics-motivated models and that the BNN can fine-tune these models by modeling the residuals [22, 23]. Based on this residual approach, the predictions of several physics-motivated models in nuclear physics have been improved using neural networks. For example, the BNN approach was employed to improve the nuclear mass predictions of various physics models [24] and the fission yield predictions of the TALYS model [2]. To study the isotopic cross sections in proton-induced spallation reactions, BNN predictions obtained by learning the experimental data directly were compared with those obtained by learning the residuals of the SPACS parametrization [25]. The latter were shown to be better, which indicates the significance of a robust physics-motivated model when using a neural network.
A neural network is a type of numerical algorithm. In cases where a physics-motivated model is not available, attempts have been made to provide physics guidance in neural networks. A multitask neural network was applied to learn the experimental data of the giant dipole resonance parameters directly, and its predictions were better than those calculated by the Goldhaber–Teller model [26]. In a study for determining the impact parameters of heavy-ion collisions using convolutional neural networks, no initial predictions were made [16]. Besides the physics-motivated model, physics guidance in neural networks has also been provided from the input layer [27] or by an empirical formula [14]. In this study, an attempt is made to improve the BNN for studying photoneutron yield cross sections, where both the improvement from the input layer and that by the empirical formula are considered.
Photonuclear reactions were first observed more than 70 years ago [28]. Their cross-section data are important for analyzing radiation transport, studying nuclear waste transmutation [29], and calculating nucleosynthesis [30, 31]. The underlying mechanisms of photonuclear reactions are significant in fundamental nuclear physics [32, 33]. More than 27,000 data points for σxn have been collected in the EXFOR database [34] for nuclei from 6Li to 239Pu at incident energies above the neutron separation energy, and new facilities have been developed to measure additional data [35, 36]. Such a large dataset makes machine learning both feasible and worthwhile.
This study focuses on improving the BNN for studying photoneutron yield cross sections. The remainder of this paper is organized as follows. In Sect. 2, the model is described. In Sect. 3, the results are presented and discussed. Finally, a summary is given in Sect. 4.
2 Model
The fundamentals of the BNN approach were established in the last century [37]. It is now a commonly used method for pattern recognition and numerical regression; the latter is the case in the present study. In the standard BNN algorithm, a neural network with hidden layers is built to map the input layer X to the output layer Y. For a single hidden layer with H nodes, the mapping reads

Y = f(X; θ) = a + Σj bj φ(cj + Σi dji Xi),  (1)

where θ = {a, bj, cj, dji} denotes the weights and biases of the network.
The activation function φ realizes the nonlinearity between the input and output. Typical choices, compared in Sect. 3, are

φ(t) = 1/(1 + e^(−t)) (sigmoid), φ(t) = tanh(t), or φ(t) = ln(1 + e^t) (softplus).  (2)
The parameters in Eq. (1) are determined by learning the dataset D, which provides the outputs Y for given inputs X. According to Bayes' theorem, the posterior distribution of the parameters is

P(θ|D) = P(D|θ) P(θ) / P(D),  (3)

where P(θ) is the prior distribution and P(D|θ) is the likelihood of the data.
With the posterior distribution of the parameters θ, the expected value of the output is

⟨Y⟩ = ∫ f(X; θ) P(θ|D) dθ.  (4)
Monte Carlo techniques are applied to calculate the above integral: samples θk are drawn from the (approximate) posterior, and ⟨Y⟩ is estimated as the average of f(X; θk) over the samples.
The analytic computation of the posterior distribution P(θ|D) is intractable owing to the high dimensionality of the parameters. In the BNN approach, variational inference is applied to obtain an approximation of P(θ|D): a family of tractable distributions q(θ; κ) parametrized by κ is posited, and variational inference attempts to find the κ that minimizes the Kullback–Leibler divergence to the true posterior,

κ* = argmin over κ of KL[q(θ; κ) ‖ P(θ|D)].  (5)
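In practice, Eqs. (3)–(5) can be realized with standard probabilistic machine-learning tools. The following is a minimal sketch of mean-field variational inference ("Bayes by backprop") for a small regression network in PyTorch; the factorized Gaussian family q(θ; κ), the unit Gaussian prior, the fixed observation noise, and the training schedule are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BayesLinear(nn.Module):
    """Linear layer with a factorized Gaussian posterior q(theta; kappa),
    where kappa = (mu, rho) and sigma = softplus(rho)."""
    def __init__(self, n_in, n_out):
        super().__init__()
        self.w_mu = nn.Parameter(torch.randn(n_out, n_in) * 0.1)
        self.w_rho = nn.Parameter(torch.full((n_out, n_in), -3.0))
        self.b_mu = nn.Parameter(torch.zeros(n_out))
        self.b_rho = nn.Parameter(torch.full((n_out,), -3.0))

    def forward(self, x):
        # Reparameterization trick: sample weights from the posterior.
        w = self.w_mu + F.softplus(self.w_rho) * torch.randn_like(self.w_mu)
        b = self.b_mu + F.softplus(self.b_rho) * torch.randn_like(self.b_mu)
        return x @ w.t() + b

    def kl(self):
        # Closed-form KL[q || N(0, 1)] for a factorized Gaussian posterior.
        def term(mu, rho):
            sig = F.softplus(rho)
            return (0.5 * (sig**2 + mu**2 - 1.0) - torch.log(sig)).sum()
        return term(self.w_mu, self.w_rho) + term(self.b_mu, self.b_rho)

class BNN(nn.Module):
    def __init__(self, n_in, n_hidden=30, n_layers=3):
        super().__init__()
        sizes = [n_in] + [n_hidden] * n_layers + [1]
        self.layers = nn.ModuleList(
            [BayesLinear(a, b) for a, b in zip(sizes[:-1], sizes[1:])])

    def forward(self, x):
        for layer in self.layers[:-1]:
            x = torch.sigmoid(layer(x))   # sigmoid activation, cf. Eq. (2)
        return self.layers[-1](x)

    def kl(self):
        return sum(layer.kl() for layer in self.layers)

def train(net, x, y, steps=4000, lr=1e-2):
    """Minimize NLL + KL, i.e., the negative evidence lower bound."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = 0.5 * ((net(x) - y) ** 2).sum() + net.kl()
        loss.backward()
        opt.step()

def predict(net, x, n_samples=100):
    """Monte Carlo estimate of Eq. (4) and its uncertainty."""
    with torch.no_grad():
        samples = torch.stack([net(x) for _ in range(n_samples)])
    return samples.mean(0), samples.std(0)
```

Each forward pass draws a fresh weight sample, so predict() realizes the Monte Carlo average of Eq. (4) and yields the prediction uncertainty from the spread of the samples.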
The traditional method of solving a regression problem in physics is based on an empirical formula with parameters, where an appropriate formula can avoid both misconvergence and overfitting. Similarly, when applying a BNN in physics, it is important to select appropriate input nodes for a specific output. In this study, the output is the photoneutron yield cross section σxn, for which, as noted above, more than 27,000 data points are available in EXFOR [34] for nuclei from 6Li to 239Pu at incident energies above the neutron separation energy. The minimal input nodes of a BNN for σxn are the charge number Z of the target, the mass number A of the target, and the energy ε of the incident photon. The BNN in this case is abbreviated as BNN-ZAE and is illustrated in Fig. 1(a).
[Fig. 1. Schematic structures of the BNNs: (a) the BNN-ZAE model with input nodes Z, A, and ε; (c) the LBNN model, in which the Lorentzian function maps the hidden nodes to the output cross section.]
The BNN is a numerical algorithm. In physics applications, the BNN is typically used to learn the residuals of a physics-motivated model and thereby fine-tune it, so that the main physics information is included in the initial prediction of the physics-motivated model. In our previous work [38], we illustrated a new method that provides physics guidance in the BNN from the input layer, without an initial prediction from a physics-motivated model. This method is applied here to select the optimal ground-state properties as neurons of the input layer of the BNN for predicting the photoneutron yield cross sections σxn. Details of the method are provided in Ref. [38]. In brief, various combinations of ground-state properties are considered as input nodes, and the optimal combination is selected according to the smallest deviation between the data and the predictions, as sketched below. The optimal input nodes obtained from this search form the input layer of the BNN-OPT model.
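As an illustration of this selection procedure, the sketch below enumerates combinations of candidate ground-state properties and keeps the combination with the smallest deviation. The candidate names and the train_and_score callback are hypothetical placeholders for illustration, not the actual quantities used in Ref. [38].

```python
from itertools import combinations
import numpy as np

# Hypothetical candidate ground-state properties (illustrative names only);
# each would be looked up per nucleus from standard mass/deformation tables.
CANDIDATES = ("binding_energy", "S_n", "beta2", "pairing_gap", "shell_gap")

def select_inputs(train_and_score, max_extra=3):
    """Try each combination of extra input nodes on top of (Z, A, eps).
    train_and_score(features) must train a BNN with those input nodes
    and return its RMS deviation from the data."""
    best_rms, best_combo = np.inf, ("Z", "A", "eps")
    for k in range(max_extra + 1):
        for combo in combinations(CANDIDATES, k):
            features = ("Z", "A", "eps") + combo
            rms = train_and_score(features)
            if rms < best_rms:
                best_rms, best_combo = rms, features
    return best_combo, best_rms
```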
We further show that the BNN can be improved by incorporating known physics. The photoneutron yield cross sections σxn as a function of incident energy can be characterized by a Lorentzian shape with two components, with peak energies Ei, widths Γi, and strengths si:

σxn(ε) = Σi=1,2 (2 si σTRK/π) ε²Γi / [(ε² − Ei²)² + ε²Γi²].  (6)
The subscripts i = 1 and 2 denote the two components, and σTRK = 60NZ/A MeV·mb expresses the cross section in terms of the Thomas–Reiche–Kuhn sum rule. The Lorentzian function is applied before mapping the hidden nodes to the output cross sections, as shown in Fig. 1(c). In addition to the Lorentzian shape, the known physics includes empirical formulas that relate the Lorentzian parameters Ei, Γi, and si to the ground-state properties of the nucleus.
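For concreteness, Eq. (6) can be written as a short function. The normalization used here, with si the fraction of the TRK sum rule exhausted by component i, is one common convention and may differ in detail from the parametrization adopted in the paper.

```python
import numpy as np

def sigma_trk(Z, A):
    """Thomas-Reiche-Kuhn sum rule: 60*N*Z/A in MeV*mb."""
    return 60.0 * (A - Z) * Z / A

def sigma_xn(eps, Z, A, components):
    """Two-component Lorentzian of Eq. (6).
    components: [(E1, Gamma1, s1), (E2, Gamma2, s2)] with E_i, Gamma_i in MeV
    and s_i the fraction of the TRK sum rule; returns the cross section in mb."""
    eps = np.asarray(eps, dtype=float)
    total = np.zeros_like(eps)
    for E_i, G_i, s_i in components:
        total += (2.0 * s_i * sigma_trk(Z, A) / np.pi) \
                 * eps**2 * G_i / ((eps**2 - E_i**2)**2 + eps**2 * G_i**2)
    return total

# Illustrative call for a deformed, 165Ho-like nucleus (made-up parameters):
# sigma_xn(np.linspace(8, 25, 200), 67, 165, [(12.2, 2.5, 0.55), (15.8, 2.7, 0.45)])
```

With this convention, integrating each component over all energies recovers si·σTRK, so the strengths directly measure how much of the sum rule each resonance component exhausts.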
3 Results and discussion
There are three types of BNN in this work: BNN-ZAE, BNN-OPT, and LBNN. In the latter two, physics guidance is provided by improving the input nodes and by considering the Lorentzian shape, respectively. In contrast, BNN-ZAE is a purely numerical algorithm without physical input. In the following, we evaluate these three models by comparing their predictions. The root-mean-square (RMS) deviation between the predictions and the data was calculated as

σRMS = √[(1/n) Σ (log σp − log σd)²],  (7)

where σp and σd are the predicted and experimental cross sections, respectively, and n is the number of data points.
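Consistent with the logarithmic differences discussed below for Fig. 3(b), Eq. (7) is evaluated in base-10 logarithms; a minimal numpy sketch:

```python
import numpy as np

def rms_log_dev(sigma_pred, sigma_data):
    """RMS deviation in log10 space, Eq. (7)."""
    d = np.log10(sigma_pred) - np.log10(sigma_data)
    return np.sqrt(np.mean(d**2))
```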
The RMS deviation as a function of the iteration step was used to test the convergence of the predictions. The cases of BNN-ZAE predictions with one, two, and three hidden layers and 10 nodes per hidden layer are shown in Fig. 2(a). All RMS deviations converge quickly, within 1000 iteration steps. The final RMS deviations are 0.226, 0.219, and 0.214 for one, two, and three hidden layers, respectively, indicating that additional hidden layers help reproduce the training data, although the effect is weak. The effect of the number of hidden nodes was also tested by setting one hidden layer with 10, 30, or 100 nodes. The RMS deviations are compared in Fig. 2(b). The rate of convergence is slower for a larger number of hidden nodes: the RMS deviation falls below 0.25 within 1000 iteration steps for 10 and 30 hidden nodes, but 2000 steps are needed for 100 hidden nodes. The RMS deviations at 4000 iteration steps are 0.226, 0.229, and 0.235 for 10, 30, and 100 nodes, respectively; they are similar for 10 and 30 hidden nodes but larger for 100 hidden nodes. The role of the activation function was tested with one hidden layer of 30 nodes using the sigmoid, tanh, or softplus activation function of Eq. (2). The RMS deviations are compared in Fig. 2(c); the sigmoid activation function performs best. In the following calculations, three hidden layers with 30 nodes each and the sigmoid activation function were applied.
[Fig. 2. RMS deviations versus iteration step for the BNN-ZAE model with (a) different numbers of hidden layers, (b) different numbers of hidden nodes, and (c) different activation functions.]
The RMS deviations as a function of the iteration step for the three types of BNNs are shown in Fig. 3(a). The rates of convergence are similar in all cases, and 4000 iteration steps are sufficient for the calculations. The RMS deviations at 4000 iteration steps are 0.198 for the BNN-OPT model and 0.206 for the LBNN model, both smaller than the 0.214 obtained with BNN-ZAE. Figure 2(a) and (b) shows that changing the numerical parameters (the numbers of hidden layers and hidden nodes) does not improve the neural network; here, it is indicated that the key to improving the neural network for physics is the incorporation of known physics of the observables. For instance, several effects in photoneutron reactions, such as the isospin dependence and the shape effect, have been found [39, 40, 3], which indicates that the photoneutron cross sections depend on the ground-state properties of the nuclei. Another piece of known physics is the Lorentzian shape of the photoneutron cross sections. The ground-state properties are applied to the input layer of the BNN-OPT model, whereas the Lorentzian shape is built into the LBNN model. These two aspects are responsible for the smaller RMS deviations compared with those of the BNN-ZAE algorithm.
[Fig. 3. (a) RMS deviations versus iteration step for the three types of BNNs; (b) distributions of the differences log σp − log σd between the predictions and the training data.]
To compare the errors of the predictions by the three neural networks, the distributions of the differences between the predictions and the training data, log σp − log σd, are shown in Fig. 3(b). A value of 1 for log σp − log σd means that the predicted cross section is 10 times the experimental value, whereas a value of −1 means it is one tenth of it; such cases hardly ever occur. The sample size for
We further evaluated the three types of BNNs by comparing their predictions of the photoneutron yield cross sections for the spherical nuclei 92Zr, 112Sn, and 206Pb. The cross sections as a function of the incident energy are shown in Fig. 4. In general, for spherical nuclei, only one main Lorentzian component appears in the excitation function of the photoneutron reaction, and the experimental cross sections as a function of the incident energy ε for 92Zr, 112Sn, and 206Pb display this behavior. The position and height of the Lorentzian peak are reproduced well by the LBNN model. In contrast, the peak position for 92Zr and the peak height for 112Sn are underestimated by both the BNN-ZAE and BNN-OPT models.
[Fig. 4. Photoneutron yield cross sections of the spherical nuclei 92Zr, 112Sn, and 206Pb as functions of incident energy, compared with the predictions of the three models.]
The data for these three nuclei are abundant. The experimental errors, including statistical and systematic errors, are shown as error bars in the figure. These errors were used in training the BNN: the data were resampled with multiplicities inversely proportional to the experimental errors, so data with small errors are resampled many times for training, whereas those with large errors are used only a few times. After 4000 iteration steps, 100 posterior samples were used to calculate the standard deviations and uncertainties of the predictions, shown as shaded bands in the figure. In the energy region covered by data, the uncertainties of the predictions by the BNN-ZAE and BNN-OPT models are small, but large uncertainties appear in the extrapolations. Because the Lorentzian function is built into the neural network, the LBNN predictions and their uncertainties are both constrained by the Lorentzian shape, and the uncertainties are essentially the same for interpolation and extrapolation in logarithmic coordinates. Furthermore, the uncertainties are small because the data for these three nuclei are abundant. The uncertainties shown in Fig. 4 originate from the Monte Carlo sampling. Note that the Lorentzian function applied in the LBNN model is only an approximate expression for the photoneutron yield cross sections. A threshold exists at ε = Sn, where Sn is the neutron separation energy. The Lorentzian shape is the most important known feature of the photoneutron yield cross sections, but it does not account for this threshold, and the experimental data near the threshold deviate from the formula. Thus, predictions by the LBNN model below the threshold are meaningless.
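A minimal sketch of this error-weighted resampling, assuming absolute experimental errors and an illustrative total sample count:

```python
import numpy as np

def resample_by_error(x, y, yerr, n_total=20000, rng=None):
    """Draw a training set in which each point appears with multiplicity
    inversely proportional to its experimental error, so precise points
    dominate training (n_total is an illustrative choice)."""
    if rng is None:
        rng = np.random.default_rng()
    w = 1.0 / np.asarray(yerr, dtype=float)
    idx = rng.choice(len(y), size=n_total, p=w / w.sum())
    return x[idx], y[idx]
```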
For axially deformed nuclei, the photoneutron yield cross sections as a function of incident energy display two main Lorentzian components. The separation between the two Lorentzian peaks has been found to be positively correlated with the deformation parameter β2 in time-dependent Hartree–Fock calculations [40]. The data for the deformed nuclei 31P, 75As, and 165Ho are shown in Fig. 5; their quadrupole deformation parameters β2 are -0.22, -0.24, and 0.28, respectively. The data clearly show two peaks for 165Ho, but only faintly for 75As. The abundant but large-error data for 31P make it difficult to distinguish the two peaks; however, the single wide peak does not contradict the two-component Lorentzian picture, because the two peaks may be too close to be resolved.
[Fig. 5. Photoneutron yield cross sections of the deformed nuclei 31P, 75As, and 165Ho as functions of incident energy, compared with the predictions of the three models.]
The curves and shaded bands show the predictions with confidence intervals from the neural networks. The BNN-ZAE model reproduces the overall trend of the data for 31P and 75As and slightly overestimates the cross sections of 31P over most of the energy region. However, it misses the two clear peaks of 165Ho and predicts a cross section with only one peak. At ε = 14 MeV, where the data are approximately 0.25 b, the BNN-ZAE model grossly overestimates the cross section. This behavior is consistent with the deviations shown in Fig. 3. The BNN-OPT model yields a smaller RMS deviation than the BNN-ZAE model, a point reiterated by comparing the predictions of the two models. However, the confidence intervals provided by the BNN-OPT model are wider than those of the BNN-ZAE model. When the cross section is extrapolated to the low-energy region (ε < 10 MeV for 31P and 75As), the BNN-OPT model shows overfitting: the confidence intervals indicate a cross section that increases with decreasing energy, which is not observed experimentally.
The LBNN model reproduces the data better than the other two models. Two Lorentzian components related to the quadrupole deformation parameter β2 are considered in the LBNN model, as shown in Fig. 1(c); hence, its predictions show two peaks for deformed nuclei. Admittedly, the LBNN model underestimates the cross section of 31P at ε = 21.5 MeV, where the calculation shows a valley between the two Lorentzian peaks but the data display a weak peak. This weak peak may reveal a substructure beyond the main Lorentzian shapes, which can also be observed in the data of 206Pb near ε = 25 MeV. However, its origin has not been explained by physics-motivated models, and hence it has not been considered in the neural network. The improvement from BNN-OPT to LBNN supports the idea that a substructure beyond the main Lorentzian shapes can be incorporated into the neural network once its properties have been revealed.
4 Conclusion
In conclusion, the photoneutron yield cross sections as a function of the charge number Z, mass number A, and incident energy ε were studied using a BNN, abbreviated as BNN-ZAE. The numerical parameters of the neural network were varied to test the model, and the influence of the activation function on the predictions was examined. The sigmoid activation function was the best at realizing the nonlinearity between input and output, and hence it provided the smallest deviations between predictions and data. However, the predictions of the BNN-ZAE model were not substantially improved by increasing the number of hidden layers from 1 to 3 or the number of hidden nodes from 10 to 100.
In the method proposed in Ref. [38], physics guidance is provided to the BNN from the input layer. Several effects in photoneutron reactions, such as the isospin dependence and the shape effect, have been observed [39, 40, 3], which indicates that the photoneutron cross sections depend on the ground-state properties of the nuclei. Based on this knowledge, the optimal ground-state properties were selected as input neurons, resulting in the BNN-OPT model. The deviations between the predictions of the BNN-OPT model and the data were shown to be smaller than those of the BNN-ZAE model.
The BNN was further improved by exploiting the Lorentzian shape of the photoneutron yield cross sections. The Lorentzian function was applied to map the hidden nodes to the output cross sections, and the empirical formulas for the Lorentzian parameters were used to link the input nodes to the output cross sections. This new algorithm is called the Lorentzian function-based BNN (LBNN). We evaluated the BNN-ZAE, BNN-OPT, and LBNN models by comparing their predictions of the photoneutron yield cross sections for the spherical nuclei 92Zr, 112Sn, and 206Pb, as well as the deformed nuclei 31P, 75As, and 165Ho. In general, for spherical nuclei, only one main Lorentzian component exists; all three models reproduced the main trend of the data, but the predictions of the LBNN model were the best. For axially deformed nuclei, the photoneutron yield cross sections display two main Lorentzian components, and only the LBNN model reproduced the two peaks of the cross sections in the deformed nuclei 31P, 75As, and 165Ho. This is because two Lorentzian components related to the quadrupole deformation parameter β2 are considered in the LBNN model.
References
[1] Nuclear mass predictions based on Bayesian neural network approach with pairing and shell effects. Physics Letters B 778, 48–53 (2018). doi: 10.1016/j.physletb.2018.01.002
[2] Bayesian evaluation of incomplete fission yields. Physical Review Letters 123, 122501 (2019). doi: 10.1103/PhysRevLett.123.122501
[3] Constraining the in-medium nucleon-nucleon cross section from the width of nuclear giant dipole resonance. Physics Letters B 807, 135532 (2020). doi: 10.1016/j.physletb.2020.135532
[4] Nuclear mass based on the multi-task learning neural network method. Nuclear Science and Techniques 33, 48 (2022). doi: 10.1007/s41365-022-01031-z
[5] Machine learning the nuclear mass. Nuclear Science and Techniques 32, 109 (2021). doi: 10.1007/s41365-021-00956-1
[6] Precise machine learning models for fragment production in projectile fragmentation reactions using Bayesian neural networks. Chinese Physics C 46, 074104 (2022). doi: 10.1088/1674-1137/ac5efb
[7] Neural network models of nuclear systematics. Physics Letters B 300, 1–7 (1993). doi: 10.1016/0370-2693(93)90738-4
[8] Nuclear mass systematics using neural networks. Nuclear Physics A 743, 222–235 (2004). doi: 10.1016/j.nuclphysa.2004.08.006
[9] Bethe–Weizsäcker semiempirical mass formula coefficients 2019 update based on AME2016. Nuclear Science and Techniques 31, 9 (2020). doi: 10.1007/s41365-019-0718-8
[10] Decoding β-decay systematics: A global statistical model for β− half-lives. Physical Review C 80, 044332 (2009). doi: 10.1103/PhysRevC.80.044332
[11] A study on ground-state energies of nuclei by using neural networks. Annals of Nuclear Energy 63, 172–175 (2014). doi: 10.1016/j.anucene.2013.07.039
[12] Predictions of nuclear β-decay half-lives with machine learning and their impact on r-process nucleosynthesis. Physical Review C 99, 064307 (2019). doi: 10.1103/PhysRevC.99.064307
[13] Extrapolation of nuclear structure observables with artificial neural networks. Physical Review C 100, 054326 (2019). doi: 10.1103/PhysRevC.100.054326
[14] Novel Bayesian neural network based approach for nuclear charge radii. Physical Review C 105, 014308 (2022). doi: 10.1103/PhysRevC.105.014308
[15] Calculation of nuclear charge radii with a trained feed-forward neural network. Physical Review C 102, 054323 (2020). doi: 10.1103/PhysRevC.102.054323
[16] Determining impact parameters of heavy-ion collisions at low-intermediate incident energies using deep learning with convolutional neural networks. Physical Review C 105, 034611 (2022). doi: 10.1103/PhysRevC.105.034611
[17] Fast nuclide identification based on a sequential Bayesian method. Nuclear Science and Techniques 32, 143 (2021). doi: 10.1007/s41365-021-00982-z
[18] Discrimination of neutrons and gamma rays in plastic scintillator based on pulse-coupled neural network. Nuclear Science and Techniques 32, 82 (2021). doi: 10.1007/s41365-021-00915-w
[19] Hybrid windowed networks for on-the-fly Doppler broadening in RMC code. Nuclear Science and Techniques 32, 62 (2021). doi: 10.1007/s41365-021-00901-2
[20] Robust restoration of low-dose cerebral perfusion CT images using NCS-Unet. Nuclear Science and Techniques 33, 30 (2022). doi: 10.1007/s41365-022-01014-0
[21] Sinogram denoising via attention residual dense convolutional neural network for low-dose computed tomography. Nuclear Science and Techniques 32, 41 (2021). doi: 10.1007/s41365-021-00874-2
[22] Nuclear mass predictions for the crustal composition of neutron stars: A Bayesian neural network approach. Physical Review C 93, 014311 (2016). doi: 10.1103/PhysRevC.93.014311
[23] Nuclear charge radii: density functional theory meets Bayesian neural networks. Journal of Physics G: Nuclear and Particle Physics 43, 114002 (2016). doi: 10.1088/0954-3899/43/11/114002
[24] Bayesian approach to model-based extrapolation of nuclear observables. Physical Review C 98, 034318 (2018). doi: 10.1103/PhysRevC.98.034318
[25] Isotopic cross-sections in proton induced spallation reactions based on the Bayesian neural network method. Chinese Physics C 44, 014104 (2020). doi: 10.1088/1674-1137/44/1/014104
[26] The description of giant dipole resonance key parameters with multitask neural networks. Physics Letters B 815, 136147 (2021). doi: 10.1016/j.physletb.2021.136147
[27] Providing physics guidance in Bayesian neural networks from the input layer: The case of giant dipole resonance predictions. Physical Review C 104, 034317 (2021). doi: 10.1103/PhysRevC.104.034317
[28] Photo-fission in heavy elements. Physical Review 71, 3 (1947). doi: 10.1103/PhysRev.71.3
[29] Yield of long-lived fission product transmutation using proton-, deuteron-, and alpha particle-induced spallation. Nuclear Science and Techniques 32, 96 (2021). doi: 10.1007/s41365-021-00933-8
[30] Giant dipole resonance parameters of ground-state photoabsorption: Experimental values with uncertainties. Atomic Data and Nuclear Data Tables 123, 1–85 (2018). doi: 10.1016/j.adt.2018.03.002
[31] IAEA photonuclear data library 2019. Nuclear Data Sheets 163, 109–162 (2020). doi: 10.1016/j.nds.2019.12.002
[32] Experimental studies of the pygmy dipole resonance. Progress in Particle and Nuclear Physics 70, 210–245 (2013). doi: 10.1016/j.ppnp.2013.02.003
[33] Isoscalar and isovector dipole excitations: Nuclear properties from low-lying states and from the isovector giant dipole resonance. Progress in Particle and Nuclear Physics 106, 360–433 (2019). doi: 10.1016/j.ppnp.2019.02.001
[34] EXFOR – a global experimental nuclear reaction data repository: Status and new developments. Vol. 146.
[35] Back-n white neutron source at CSNS and its applications. Nuclear Science and Techniques 32, 11 (2021). doi: 10.1007/s41365-021-00846-6
[36] Development of a low-background neutron detector array. Nuclear Science and Techniques 33, 41 (2022). doi: 10.1007/s41365-022-01030-0
[37] A practical Bayesian framework for backpropagation networks. Neural Computation 4, 448–472 (1992). doi: 10.1162/neco.1992.4.3.448
[38] Providing physics guidance in Bayesian neural networks from the input layer: The case of giant dipole resonance predictions. Physical Review C 104, 034317 (2021). doi: 10.1103/PhysRevC.104.034317
[39] Constraining symmetry energy at subnormal density by isovector giant dipole resonances of spherical nuclei. Chinese Physics C 43, 064109 (2019). doi: 10.1088/1674-1137/43/6/064109
[40] Giant dipole resonance and shape evolution in Nd isotopes within TDHF method. Physica Scripta 95, 065301 (2020). doi: 10.1088/1402-4896/ab73d8