Introduction
Nuclear mass is a fundamental quantity involved in many areas of nuclear science and engineering. Accurate masses are crucial not only for deriving nuclear shell information of great current interest but also for quantifying nuclear reaction processes [1-4]. Thus, much effort has been devoted over the past several decades to obtaining and improving nuclear mass values to meet the requirements of contemporary nuclear studies.
Nuclear researchers have been active in this field since the 1950s, and international cooperation established the well-known atomic mass evaluation (AME), which aims to provide a reliable public database built on direct measurements and empirical extrapolation [5-7]. The AME has achieved significant success, with over 3500 nuclei evaluated; nevertheless, a gap remains between the number of evaluated nuclei and the actual requirements of high-fidelity simulations of complex nuclear physics environments. Moreover, the uncertainties of the AME still deserve attention for further improvement.
Therefore, many theoretical calculations based on microscopic mean-field models [8-11] and macroscopic-microscopic models [12-21] have been developed to obtain global nuclear masses. The macroscopic-microscopic models start from the liquid drop model (LDM) and add correction energy terms based on a specific single-particle potential, which makes the calculation relatively simple compared with microscopic mean-field models. Meanwhile, the macroscopic-microscopic models show better performance for the global nuclear mass [5, 22]; therefore, they are normally considered more applicable in practical evaluations.
In the scheme of macroscopic-microscopic models, the theoretical determination of the shell correction energy from the single-particle potential is complicated [23]. Normally, in practical calculations, smoothing methods are needed to treat the single-particle spectrum, which influences the final results [22, 24]. To simplify this problem, a so-called "simple nuclear mass formula" was proposed in Ref. [25], where linear polynomial functions replace the residual correction energy; the global root mean square (RMS) deviation reaches 0.266 MeV, compared with 2.456 MeV for the original LDM. Artificial neural networks (ANNs) have proven to be excellent tools in many research areas [24]. They appear to be a better choice here than simple mathematical functions because of their powerful capability to deal with complex problems.
The application of neural networks to predicting nuclear masses can be traced back to the 1990s [5, 26]. The input layer of such networks is designed around basic nuclear properties, such as the proton and neutron numbers Z and N of the target nucleus and the corresponding nearest magic numbers Z0 and N0, and the application of ANNs to mass model calculations has been validated by many excellent results. Most previous ANN nuclear mass studies were constructed with the single-task-learning (STL) technique at the output layer, where the fluctuation of the nuclear binding energy (δLDM) was the only task guiding the entire training and testing procedure. Consequently, a better global RMS is obtained via STL; for example, the RMS was reduced to 0.235 MeV in Ref. [24]. Other nuclear properties, such as the single proton separation energy (Sp), single neutron separation energy (Sn), nuclear charge radii, and β-decay half-lives, have also been studied with various artificial intelligence (AI) tools [27-31]. In most of the above-mentioned studies, the AI methods are trained to learn a single nuclear property. These properties are naturally correlated; therefore, novel AI methods that can study more than one property simultaneously should be developed. Accordingly, we attempted to involve more tasks in a deep neural network and further reduce the global RMS.
In this study, an improved multi-task-learning (MTL) technique was created to integrate more crucial nuclear physics knowledge into the neural network. A total of 2095 nuclei with fully evaluated nuclear properties in the AME were adopted in the new MTL-ANN. To include more tasks, Sn and Sp are added to provide information on the nuclear shell. In the input layer, we adopt neurons representing the proton number, the mass number, and the numbers of residual particles or holes relative to the closest magic shell for protons and for neutrons, as applied in our previous study [24]. Moreover, we expand the input layer by adding a pairing term with the expression of Ref. [18].
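Under these definitions, the five input features can be sketched as follows. This is a minimal illustration: the pairing expression of Ref. [18] is not reproduced here, and the simple (-1)^Z + (-1)^N form is an illustrative stand-in.

```python
import numpy as np

MAGIC = np.array([8, 20, 28, 50, 82, 126])  # conventional magic numbers

def features(Z, N):
    """Five input features for the network: Z, A, the distances of Z and N
    from their nearest magic numbers, and a simple pairing indicator.
    The pairing term here is a stand-in, not the expression of Ref. [18]."""
    A = Z + N
    nu_p = np.min(np.abs(MAGIC - Z))   # particles/holes to the closest proton shell
    nu_n = np.min(np.abs(MAGIC - N))   # same for neutrons
    pairing = (-1) ** Z + (-1) ** N    # +2 even-even, -2 odd-odd, 0 otherwise
    return np.array([Z, A, nu_p, nu_n, pairing], dtype=float)
```

For example, the doubly magic nucleus (Z = 50, N = 82) yields zero shell distances and a pairing indicator of +2.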
The remainder of this paper is organized as follows. In Sect. 2, a general LDM formula is first introduced; the formalism and structure of the applied neural network with the MTL technique, called MTL-ANN, are then described, and a new mass method incorporating MTL-ANN is proposed. The results of the global analysis of the binding energies of 2095 nuclei with the novel MTL-ANN are presented in Sect. 3, together with detailed discussions of the MTL-ANN parameters and optimization procedures. Finally, a summary of this study is provided in Sect. 4.
Macroscopic-microscopic model with Multi-Task-Learning technique
In the macroscopic-microscopic model scheme, the binding energy E(Z, A) of a nucleus with mass number A and proton number Z can be written as the sum of the macroscopic LDM binding energy ELDM(Z, A) and a fluctuating part δLDM(Z, A) [32]:

E(Z, A) = ELDM(Z, A) + δLDM(Z, A).
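As an illustration of this decomposition, the following sketch evaluates δLDM as the difference between an experimental binding energy and a textbook Bethe-Weizsäcker LDM. The coefficients shown are common literature values, not the parameters fitted in this work.

```python
def e_ldm(Z, A):
    """Liquid-drop (Bethe-Weizsacker) binding energy in MeV.
    Coefficients are textbook values, not the LDM fit used in the paper."""
    N = A - Z
    a_v, a_s, a_c, a_a, a_p = 15.75, 17.8, 0.711, 23.7, 11.18
    if Z % 2 == 0 and N % 2 == 0:          # even-even: pairing bonus
        delta = a_p / A ** 0.5
    elif Z % 2 == 1 and N % 2 == 1:        # odd-odd: pairing penalty
        delta = -a_p / A ** 0.5
    else:
        delta = 0.0
    return (a_v * A - a_s * A ** (2.0 / 3.0)
            - a_c * Z * (Z - 1) / A ** (1.0 / 3.0)
            - a_a * (N - Z) ** 2 / A + delta)

def delta_ldm(Z, A, E_exp):
    """Fluctuating part delta_LDM(Z, A) = E_exp(Z, A) - E_LDM(Z, A),
    the quantity the MTL-ANN is trained to reproduce."""
    return E_exp - e_ldm(Z, A)
```

With these coefficients, e_ldm(28, 56) lies within a few MeV of the measured binding energy of 56Ni, and the residual is what the network learns.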
Correspondingly, the fluctuating part of the binding energy (δLDM) is a necessary compensation in the macroscopic-microscopic model. Multitask learning has a strong ability to improve generalization by using the domain information contained in the training signals of related tasks, and the knowledge learned for one task can help other tasks be learned better [33]. Therefore, a novel MTL artificial neural network (MTL-ANN) was designed in this study to mimic δLDM more accurately using additional related nuclear properties.
The MTL-ANN is a feedforward neural network. Its structure, as used in this study, is illustrated in Fig. 1; the architecture consists of three layers: input, hidden, and output.
[Fig. 1: Structure of the MTL-ANN, with five features in the input layer]
As shown in Fig. 1, two hidden layers were defined to share information adequately, with 20 neurons in each hidden layer. The multitask outputs are obtained through training iterations that pass information back and forth between the hidden and output layers. Generally, the input vector is denoted x = (x1, x2, …, xn) and the output vector y = (y1, y2, …, ym), where n and m denote the total numbers of inputs and tasks, respectively. The l-th task yl can be written as

yl = f(Σj wlj^(3) hj^(2) + bl^(3)), with h^(2) = f(W^(2) h^(1) + b^(2)) and h^(1) = f(W^(1) x + b^(1)),

where the h^(k) are the hidden-layer activations, the W^(k) and b^(k) are the trainable weights and biases, and f is the activation function.
In addition, it should be noted that a hard-sharing approach is employed for all network neurons to build full connections between the input, hidden, and output layers, which is believed to incorporate more nuclear mass physics into the training process and to efficiently avoid overfitting [34]. Moreover, the limited-memory BFGS (L-BFGS), a quasi-Newton method suited to training with large numbers of parameters [35], is used in the backpropagation learning procedure to efficiently minimize the loss.
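The shared-trunk architecture and L-BFGS training described above can be approximated with off-the-shelf tools. The following sketch uses scikit-learn's MLPRegressor (an assumption for illustration, not the authors' implementation) with two 20-neuron hidden layers shared by three output tasks, and synthetic data standing in for the AME-derived training set.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Hard parameter sharing: one network whose two 20-neuron hidden layers are
# shared by all output tasks (delta_LDM, Sn, Sp), trained with L-BFGS.
# The data below are synthetic stand-ins for the real AME-derived features.
rng = np.random.default_rng(0)
X = rng.uniform(8, 120, size=(200, 5))        # five input features per nucleus
Y = np.column_stack([X.sum(axis=1) * 0.01,    # task 1: stand-in for delta_LDM
                     X[:, 0] * 0.02,          # task 2: stand-in for Sn
                     X[:, 1] * 0.015])        # task 3: stand-in for Sp

net = MLPRegressor(hidden_layer_sizes=(20, 20), solver="lbfgs",
                   max_iter=500, random_state=0)
net.fit(X, Y)                  # multi-output fit: all hidden weights are shared
pred = net.predict(X)          # shape (200, 3): one column per task
```

Because the hidden weights are common to all outputs, each task's error signal updates the same shared representation, which is exactly the hard-sharing idea.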
Results and Discussion
The newly designed MTL-ANN for the nuclear mass was used to analyze 2095 nuclei (Z ≥ 8, N ≥ 8) from the AME2020 database. In our calculation, the 2095 nuclei were divided into three datasets for training, validation, and testing. All data were sampled from a uniform distribution. In practice, 95 nuclei were first sampled from the data pool as a test set that did not participate in network training. Subsequently, 1400 training nuclei were sampled stochastically from the remaining 2000 nuclei, and the remaining 600 nuclei were used for validation.
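The sampling scheme described above can be sketched as follows; the seed and shuffling routine are illustrative assumptions, since the paper specifies only the set sizes and uniform sampling.

```python
import numpy as np

rng = np.random.default_rng(42)      # seed chosen arbitrarily for reproducibility
idx = rng.permutation(2095)          # uniform random ordering of the 2095 nuclei

test_idx = idx[:95]                  # 95 nuclei held out, never seen in training
train_idx = idx[95:95 + 1400]        # 1400 nuclei for training
valid_idx = idx[95 + 1400:]          # remaining 600 nuclei for validation

assert len(test_idx) == 95 and len(train_idx) == 1400 and len(valid_idx) == 600
```

A single permutation guarantees the three sets are disjoint and together cover all 2095 nuclei.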
In our training process, the loss normally reaches a stable value after several hundred iterations. The convergence of the training is illustrated in Fig. 2. The training loss reached a minimum after 200 iterations, and the validation loss decreased at a similar rate. The stability of these two curves supports the soundness of the network.
[Fig. 2: Convergence of the training and validation loss]
To examine the validity of the proposed model, four types of networks were designed according to the tasks in the output layer. The RMS deviations of the binding energy (EMTL), neutron separation energy (Sn), and proton separation energy (Sp) in the training, validation, and testing processes are listed in Table 1. The RMS is calculated as

σRMS = [ (1/N) Σi (xi,calc − xi,exp)^2 ]^(1/2),

where N is the number of nuclei considered and x stands for the quantity being evaluated.
Table 1. RMS deviations of EMTL, Sn, and Sp for the four task groups (* marks a quantity that was not a training task for that network).

Training and Validation

| Task group | Tasks | RMS of EMTL (MeV) | RMS of Sn (MeV) | RMS of Sp (MeV) |
|---|---|---|---|---|
| TYPE 1 | TB | 0.23 | 0.28* | 0.32* |
| TYPE 2 | TB and TSn | 0.21 | 0.21 | 0.24* |
| TYPE 3 | TB and TSp | 0.21 | 0.23* | 0.23 |
| TYPE 4 | TB, TSn, and TSp | 0.22 | 0.22 | 0.24 |

Testing

| Task group | Tasks | RMS of EMTL (MeV) | RMS of Sn (MeV) | RMS of Sp (MeV) |
|---|---|---|---|---|
| TYPE 1 | TB | 0.31 | 0.34* | 0.33* |
| TYPE 2 | TB and TSn | 0.23 | 0.20 | 0.24* |
| TYPE 3 | TB and TSp | 0.21 | 0.22* | 0.23 |
| TYPE 4 | TB, TSn, and TSp | 0.25 | 0.23 | 0.23 |
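The RMS deviation reported in Table 1 can be computed as in the following minimal sketch:

```python
import numpy as np

def rms(calc, exp):
    """Root-mean-square deviation between calculated and experimental values,
    sigma_RMS = sqrt( (1/N) * sum_i (x_calc,i - x_exp,i)^2 )."""
    calc = np.asarray(calc, dtype=float)
    exp = np.asarray(exp, dtype=float)
    return float(np.sqrt(np.mean((calc - exp) ** 2)))
```

For instance, rms applied to identical arrays returns 0, and larger systematic deviations increase it monotonically.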
Compared with the simple LDM model, the RMS deviation of the binding energy between calculation and experiment is reduced sharply, from 2.4027 MeV to the current 0.20-0.24 MeV. Moreover, the multi-task networks (TYPE 2: TB and TSn; TYPE 3: TB and TSp; TYPE 4: TB, TSn, and TSp) all outperform the single-task network TYPE 1 (TB only), which demonstrates the stronger capability of the MTL approach to improve the mass model; TYPE 3 obtains the best RMS for the binding energy EMTL not only in training and validation but also in testing. However, adding more tasks can also worsen the learning performance: in TYPE 4, the RMS values of the network with tasks TB, TSn, and TSp are even larger than those of TYPE 2 and TYPE 3. This phenomenon is called "negative transfer" in neural network training and may be caused by internal contradictions among the experimental information of TSn, TSp, and TB.
The prediction power of the MTL model is also verified by the testing results in Table 1. For the randomly selected 95 nuclei, the predictions of the multitask networks performed similarly to training and validation. Moreover, when we repeated the experiment with different 95-nucleus test sets, the changes in Table 1 were negligible. In conclusion, TYPE 3 improves the current mass model more than the other variants.
To investigate further, we compare Sn and Sp with the related experimental data in Figs. 3-10 for the selected nuclear chains Z = 8, 22, 61, 84 and N = 8, 22, 61, 84; the absolute deviations δSn and δSp between the calculations and experimental data are plotted for each nucleus of interest. These figures show that the model description of Sn and Sp is satisfactory for each nucleus, and they confirm that all current MTL networks describe the nuclear mass better than STL. The four types of MTL fit the experimental data almost equally well, although the global RMS values of EMTL, Sn, and Sp in testing show that the TYPE 3 task group (TB and TSp) is the best choice.
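The separation energies compared in these figures follow the standard definitions as binding-energy differences; the sketch below assumes B is simply a table mapping (Z, N) to binding energies in MeV.

```python
def sn(B, Z, N):
    """Single-neutron separation energy Sn(Z, N) = B(Z, N) - B(Z, N-1),
    with B a dict mapping (Z, N) to the binding energy in MeV."""
    return B[(Z, N)] - B[(Z, N - 1)]

def sp(B, Z, N):
    """Single-proton separation energy Sp(Z, N) = B(Z, N) - B(Z-1, N)."""
    return B[(Z, N)] - B[(Z - 1, N)]
```

Because Sn and Sp are finite differences of the binding energy, they probe the same shell structure that the MTL tasks share, which is why they are natural auxiliary tasks.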
[Figs. 3-10: Calculated versus experimental Sn and Sp, and the deviations δSn and δSp, for the isotopic and isotonic chains Z = 8, 22, 61, 84 and N = 8, 22, 61, 84]
Across the nuclear mass regions, it can be observed that the prediction ability improves significantly with increasing Z and N. The absolute values of δSn and δSp vary from approximately 1.0 MeV for the very light nuclei (Z = 8, N = 8) to approximately 0.2 MeV for the heavy nuclei, illustrating the clearly better fits in the heavier mass region.
In addition, the current network predictions are significantly influenced by the status of the data reported in AME2020. For example, in the case of Z = 61, unusually large fluctuations occur within the N = 90-93 range because the pattern of the experimental data there visibly deviates from that of the neighboring nuclei. We also investigated the reported errors in this mass region. The current MTL predictions fall well outside the experimental error band: the data for Z = 61 and N = 90-93 are recommended in AME2020 as 5.604 ± 0.02 MeV, 7.860 ± 0.02 MeV, 5.939 ± 0.03 MeV, and 7.465 ± 0.03 MeV, whereas the deviations of our MTL-based predictions all reach approximately 0.4 MeV, as shown in Fig. 5. These large inconsistencies between the experimental data and the model predictions deserve further attention to examine the correctness of both the measured points and our models.
Conclusion
In summary, a newly designed MTL-ANN method was introduced into the global macroscopic-microscopic mass model. This method has been shown to increase the accuracy of mass models and to effectively reduce the risk of network overfitting.
Five essential nuclear properties related to the neutron number, mass number, nearest magic numbers, and pairing were adopted as inputs to the network.
All of these results verify the impressive predictive capability of the MTL-ANN mass model within the known nuclear region. Moreover, the model can provide important hints for examining the correctness of experimental data available in the future.
References

Improved macroscopic-microscopic mass formula. Chinese Physics C 45(7), 074108 (2021). doi: 10.1088/1674-1137/abfaf2
Density dependence of the symmetry energy probed by β-decay energies of odd-A nuclei. Phys. Rev. C 88, 014302 (2013). doi: 10.1103/PhysRevC.88.014302
A new view of nuclear shells. Phys. Scr. T152, 014002 (2013). doi: 10.1088/0031-8949/2013/t152/014002
X-ray binaries. Nucl. Phys. A 777, 601-622 (2006). Special Issue on Nuclear Astrophysics. doi: 10.1016/j.nuclphysa.2005.05.200
Nuclear mass predictions based on Bayesian neural network approach with pairing and shell effects. Phys. Lett. B 778, 48-53 (2018). doi: 10.1016/j.physletb.2018.01.002
The AME 2020 atomic mass evaluation (I). Chinese Physics C 45(3), 030002 (2021). doi: 10.1088/1674-1137/abddb0
The AME 2020 atomic mass evaluation (II). Chinese Physics C 45(3), 030003 (2021). doi: 10.1088/1674-1137/abddaf
Further explorations of Skyrme-Hartree-Fock-Bogoliubov mass formulas. XIII. The 2012 atomic mass evaluation and the symmetry coefficient. Phys. Rev. C 88, 024308 (2013). doi: 10.1103/PhysRevC.88.024308
Further explorations of Skyrme-Hartree-Fock-Bogoliubov mass formulas. IX. Constraint of pairing force to 1S0 neutron-matter gap. Nucl. Phys. A 812(1), 72-98 (2008). doi: 10.1016/j.nuclphysa.2008.08.015
New parametrization for the nuclear covariant energy density functional with a point-coupling interaction. Phys. Rev. C 82, 054319 (2010). doi: 10.1103/PhysRevC.82.054319
Ground-state and pairing properties of Pr isotopes in relativistic mean-field theory. Phys. Rev. C 65, 064305 (2002). doi: 10.1103/PhysRevC.65.064305
Deformation and shell effects in nuclear mass formulas. Nucl. Phys. A 874, 81-97 (2012). doi: 10.1016/j.nuclphysa.2011.11.005
Modification of nuclear mass formula by considering isospin effects. Phys. Rev. C 81, 044322 (2010). doi: 10.1103/PhysRevC.81.044322
Mirror nuclei constraint in nuclear mass formula. Phys. Rev. C 82, 044304 (2010). doi: 10.1103/PhysRevC.82.044304
New finite-range droplet mass model and equation-of-state parameters. Phys. Rev. Lett. 108, 052501 (2012). doi: 10.1103/PhysRevLett.108.052501
Nuclear ground-state masses and deformations: FRDM(2012). At. Data Nucl. Data Tables 109-110, 1-204 (2016). doi: 10.1016/j.adt.2015.10.002
Microscopic mass formulas. Phys. Rev. C 52, R23-R27 (1995). doi: 10.1103/PhysRevC.52.R23
The anatomy of the simplest Duflo-Zuker mass formula. Nucl. Phys. A 843(1), 14-36 (2010). doi: 10.1016/j.nuclphysa.2010.05.055
Macro-microscopic mass formulae and nuclear mass predictions. Nucl. Phys. A 847(1), 24-41 (2010). doi: 10.1016/j.nuclphysa.2010.06.014
Coefficients of different macro-microscopic mass formulae from the AME2012 atomic mass evaluation. Nucl. Phys. A 917, 1-14 (2013). doi: 10.1016/j.nuclphysa.2013.09.003
Nuclear properties according to the Thomas-Fermi model. Nucl. Phys. A 601(2), 141-167 (1996). doi: 10.1016/0375-9474(95)00509-9
Theoretical studies of nuclear mass formula and nuclear spontaneous fission.
Shell effects in nuclear masses and deformation energies. Nucl. Phys. A 95(2), 420-442 (1967). doi: 10.1016/0375-9474(67)90510-6
Performance of the Levenberg-Marquardt neural network approach in nuclear mass prediction. J. Phys. G: Nucl. Part. Phys. 44(4), 045110.
Simple nuclear mass formula. Phys. Rev. C 90, 064306.
Learning and prediction of nuclear stability by neural networks. Nucl. Phys. A 540 (1992). doi: 10.1016/0375-9474(92)90191-L
Machine learning the nuclear mass. Nucl. Sci. Tech. 32, 109 (2021). doi: 10.1007/s41365-021-00956-1
Nuclear mass predictions for the crustal composition of neutron stars: A Bayesian neural network approach. Phys. Rev. C 93, 014311 (2016). doi: 10.1103/PhysRevC.93.014311
Refining mass formulas for astrophysical applications: A Bayesian neural network approach. Phys. Rev. C 96, 044308 (2017). doi: 10.1103/PhysRevC.96.044308
Nuclear charge radii: Density functional theory meets Bayesian neural networks. J. Phys. G 43, 114002 (2016). doi: 10.1088/0954-3899/43/11/114002
Predictions of nuclear β-decay half-lives with machine learning and their impact on r-process nucleosynthesis. Phys. Rev. C 99, 064307 (2019). doi: 10.1103/PhysRevC.99.064307
An improved nuclear mass formula with a unified prescription for the shell and pairing corrections. Nucl. Phys. A 929, 38-53 (2014). doi: 10.1016/j.nuclphysa.2014.05.019
Multitask learning. Machine Learning 28, 41-75 (1997). doi: 10.1023/A:1007379606734
Updating quasi-Newton matrices with limited storage. Math. Comput. 35, 773-782 (1980). doi: 10.1090/S0025-5718-1980-0572855-7