NUCLEAR PHYSICS AND INTERDISCIPLINARY RESEARCH

Machine learning the nuclear mass

Ze-Peng Gao
Yong-Jia Wang
Hong-Liang Lü
Qing-Feng Li
Cai-Wan Shen
Ling Liu
Nuclear Science and Techniques, Vol. 32, No. 10, Article number 109. Published in print: 01 Oct 2021. Available online: 07 Oct 2021.

Background: The masses of ~2500 nuclei have been measured experimentally; however, >7000 isotopes are predicted to exist in the nuclear landscape from H (Z=1) to Og (Z=118) based on various theoretical calculations. Exploring the masses of the remaining isotopes is a popular topic in nuclear physics. Machine learning has served as a powerful tool for learning complex representations of big data in many fields. Purpose: We use the Light Gradient Boosting Machine (LightGBM), a highly efficient machine learning algorithm, to predict the masses of unknown nuclei and to explore the nuclear landscape on the neutron-rich side by learning from the measured nuclear masses. Methods: Several characteristic quantities (e.g., mass number and proton number) are fed into the LightGBM algorithm to mimic the patterns of the residual δ(Z,A) between the experimental binding energy and the theoretical one given by the liquid-drop model (LDM), the Duflo–Zucker (DZ, also dubbed DZ28) mass model, the finite-range droplet model (FRDM, also dubbed FRDM2012), and the Weizsäcker–Skyrme (WS4) model, so as to refine these mass models. Results: Using the experimental data of 80% of the known nuclei as the training dataset, the root mean square deviations (RMSDs) between the predicted and experimental binding energies of the remaining 20% are approximately 0.234 ± 0.022, 0.213 ± 0.018, 0.170 ± 0.011, and 0.222 ± 0.016 MeV for the LightGBM-refined LDM, DZ model, WS4 model, and FRDM, respectively. These values are approximately 90%, 65%, 40%, and 60% smaller than those of the corresponding original mass models. The RMSD for the 66 newly measured nuclei that appeared in AME2020 is also significantly improved. The one-neutron and two-neutron separation energies predicted by these refined models are consistent with several theoretical predictions based on various physical models. In addition, the two-neutron separation energies of several newly measured nuclei (e.g., some isotopes of Ca, Ti, Pm, and Sm) predicted with the LightGBM-refined mass models also agree well with the latest experimental data. Conclusions: LightGBM can be used to refine theoretical nuclear mass models and to predict the binding energies of unknown nuclei. Moreover, the correlation between the input characteristic quantities and the output can be interpreted with SHapley Additive exPlanations (SHAP, a popular explainable-artificial-intelligence tool), which may provide new insights for developing theoretical nuclear mass models.


Keywords: Nuclear mass, Machine learning, Binding energy, Separation energy

1 Introduction

The mass of nuclei, which is of fundamental importance for exploring the nuclear landscape and the properties of the nuclear force, plays a crucial role in understanding many issues in both nuclear physics and astrophysics [1-6]. It is known that >7000 nuclei in the nuclear landscape from H (Z=1) to Og (Z=118) are predicted to exist according to various theoretical models, while ~3000 nuclei have been found or synthesized experimentally and ~2500 nuclei have been measured accurately [7, 8]. Exploring the masses of the remaining nuclei is of particular interest to both the experimental and theoretical nuclear physics communities. On the experimental side, facilities such as HIRFL-CSR in China, RIBF at RIKEN in Japan, the cooler-storage ring ESR and SHIPTRAP at GSI in Germany, CPT at Argonne and LEBIT at Michigan State University in the USA, ISOLTRAP at CERN, JYFLTRAP at Jyväskylä in Finland, and TITAN at TRIUMF in Canada are partly dedicated to measuring nuclear masses, especially for nuclei around the drip lines. On the theoretical side, various models have been developed to study nuclear mass by considering different physics, such as the finite-range droplet model (FRDM) [9, 10], the Weizsäcker–Skyrme (WS) model [11], Hartree–Fock–Bogoliubov mass models [12-14], the relativistic mean-field (RMF) model [15], and the relativistic continuum Hartree–Bogoliubov theory [16]. Although tremendous progress has been made from both experimental and theoretical perspectives, exploring the masses of nuclei around the drip lines remains a great challenge.

Machine learning (ML), a subset of artificial intelligence, has been widely applied to data analysis in many branches of science, including physics (see, e.g., Refs. [17-24]). In nuclear physics, a Bayesian neural network (BNN) has been applied to reduce the mass residuals between theory and experiment, and a significant improvement in the mass predictions of several theoretical models was obtained after BNN refinement [25-27]; for example, the root mean square deviation (RMSD) of the liquid-drop model (LDM) was reduced from ~3 to 0.8 MeV. Later, the BNN approach was also applied to study nuclear charge radii [28], β-decay half-lives [32], fission product yields [29], and fragment production in spallation reactions [30, 31]. In addition to the BNN, other machine learning or deep learning algorithms have been employed in the study of nuclear reactions (see, e.g., Refs. [33-38]). Focusing on nuclear mass, in addition to the BNN of Refs. [25-27], the Levenberg–Marquardt neural network approach [39], Gaussian processes [42, 43], the decision tree algorithm [44], and the multilayer perceptron (MLP) algorithm [45] have also been applied to refine nuclear mass models.

Indeed, studying nuclear mass with machine learning algorithms is not a new topic; it can be traced back to at least 1993 (see Refs. [46-48] and references therein). Ref. [46] demonstrated the capability of multilayer feedforward neural networks to learn the systematics of atomic masses and nuclear spins and parities with high accuracy. This topic has flourished again because of the rapid development of computer science and artificial intelligence. In 2016, the Light Gradient Boosting Machine (LightGBM), a tree-based learning algorithm, was released by Microsoft [65]. It is a state-of-the-art machine learning algorithm that has achieved excellent performance in many machine learning tasks. It is therefore interesting to explore whether LightGBM can achieve better accuracy than the BNN for predicting nuclear mass. The BNN combines a statistical model with a neural network: it is a neural network with probability distributions placed over its weights and biases. It can thus produce uncertainties in its predictions and yields distributions of the parameters learned from the data; consequently, the overfitting problem in the small-data regime can be partially avoided. LightGBM is a tree-based learning algorithm whose framework offers (1) faster training speed and higher efficiency, (2) lower memory usage, (3) better accuracy, (4) support for parallel and graphics-processing-unit learning, and (5) the capability of handling large-scale data. More importantly for the present work, because of its tree-based nature, LightGBM has an excellent degree of explainability, which is important for studying physical problems. It is therefore worth using LightGBM to obtain physical insight into the nuclear mass.

The remainder of this paper is organized as follows. In Sect. 2, we will introduce the LightGBM model and 10 input features. The predicted binding energy and neutron separation energy obtained with LightGBM are discussed in detail in Sect. 3. The conclusions and outlook are presented in Sect. 4.

2 LightGBM and the input features

LightGBM is a recent improvement of the gradient boosting decision tree that provides an efficient implementation of gradient boosting algorithms. It is becoming increasingly popular because of its efficiency and its capability of handling large amounts of data. LightGBM grows trees leaf-wise rather than level-wise: after the first partition, the next split is performed only on the leaf node that yields the largest information gain.

The primary advantage of LightGBM is a change in the training algorithm that speeds up the optimization process dramatically and, in many cases, results in a more effective model. More specifically, to speed up the training process, LightGBM uses a histogram-based method to select the best split: the values of any continuous variable are grouped into discrete bins rather than being used individually, which accelerates training and reduces memory usage. In addition, LightGBM employs two novel techniques: gradient-based one-side sampling, which keeps all instances with large gradients and randomly samples the instances with small gradients, and exclusive feature bundling, which bundles multiple mutually exclusive features into a single feature without losing information. Furthermore, as a decision-tree-based algorithm, LightGBM has a high level of interpretability, allowing the results obtained by the machine learning model to be checked against previous knowledge of nuclear mass. For example, one can identify which features are most important for predicting nuclear mass, which is helpful in further improving nuclear mass models.

In this work, the binding energies of 2408 nuclei between 16O and 270Ds from the atomic mass evaluation AME2016 [8] were employed as the training and test datasets. LightGBM was trained to learn the residual between the theoretical prediction and the experimental binding energy, δ(Z,A) = Bth(Z,A) - Bexp(Z,A). Four theoretical mass models were adopted to obtain Bth: the LDM [39], the Duflo–Zucker (DZ) mass model [57], the FRDM [9, 10], and the WS4 model [11]. Once LightGBM has learned the behavior of the residual δ(Z,A), the binding energy of a nucleus with unknown mass can be obtained as BLightGBM(Z,A) = Bth(Z,A) - δ(Z,A). The RMSD of each of these four mass models is significantly improved after the LightGBM refinement.
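To make this workflow concrete, the following minimal sketch shows how such a residual model can be trained and evaluated with the lightgbm Python package; the file names and array shapes are illustrative assumptions, not artifacts of this work.

```python
import numpy as np
import lightgbm as lgb
from sklearn.model_selection import train_test_split

# Assumed inputs: `features` holds the 10 quantities of Table 2 for each of
# the 2408 nuclei, and `delta` the residual B_th - B_exp in MeV; the file
# names are placeholders for illustration.
features = np.load("features.npy")   # shape (2408, 10)
delta = np.load("delta_ldm.npy")     # shape (2408,)

# 80%/20% split, as used for most results in this paper
X_train, X_test, y_train, y_test = train_test_split(
    features, delta, test_size=0.2, random_state=42)

model = lgb.LGBMRegressor(n_estimators=50000, num_leaves=10, max_depth=-1)
model.fit(X_train, y_train, eval_set=[(X_test, y_test)],
          callbacks=[lgb.early_stopping(stopping_rounds=100)])

# Refined binding energy: B_LightGBM(Z,A) = B_th(Z,A) - predicted residual
delta_pred = model.predict(X_test)
rmsd = np.sqrt(np.mean((delta_pred - y_test) ** 2))
print(f"test RMSD: {rmsd:.3f} MeV")
```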

In the LDM, the nucleus is regarded as an incompressible droplet, and its binding energy contains the volume energy, the surface energy, the Coulomb energy of proton repulsion, the symmetry energy related to the neutron-to-proton ratio, and the pairing energy arising from the neutron–proton pairing effect. It can be described as follows:

$$B_{\mathrm{LDM}}(Z,A)=a_v\left(1+\frac{4k_v}{A^2}|T_z|\left(|T_z|+1\right)\right)A+a_s\left(1+\frac{4k_s}{A^2}|T_z|\left(|T_z|+1\right)\right)A^{2/3}+a_c\frac{Z^2}{A^{1/3}}+f_p\frac{Z^2}{A}+E_p, \qquad (1)$$

where Ep is the pairing energy given by the following expression:

$$E_p=\begin{cases}d_nN^{-1/3}+d_pZ^{-1/3}+d_{np}A^{-2/3}, & \text{for }Z\text{ and }N\text{ odd},\\ d_pZ^{-1/3}, & \text{for }Z\text{ odd and }N\text{ even},\\ d_nN^{-1/3}, & \text{for }Z\text{ even and }N\text{ odd},\\ 0, & \text{for }Z\text{ and }N\text{ even}.\end{cases} \qquad (2)$$

In the above formulas, A, Z, N, and T_z are the mass number, proton number, neutron number, and third component of isospin ($T_z=\frac{1}{2}(Z-N)$), respectively, and a_v, k_v, a_s, k_s, a_c, f_p, d_n, d_p, and d_np are adjustable parameters, whose values are listed in Table 1. With these parameters, the RMSD between the theoretical and experimental binding energies of the 2408 nuclei is 2.463 MeV. A code sketch implementing Eqs. (1) and (2) is given after Table 1.

Table 1
Parameters of the LDM. All values are in MeV, except for the dimensionless kv and ks.
Parameter Value
av -15.4963
as 17.7937
kv -1.8232
ks -2.2593
ac 0.7093
fp -1.2739
dn 4.6919
dp 4.7230
dnp -6.4920
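For illustration, Eqs. (1) and (2) with the Table 1 parameters translate directly into code. This is a sketch: the negative exponents in the pairing term follow the standard form of this parametrization, and the sign convention follows the tabulated parameters.

```python
def binding_energy_ldm(Z: int, N: int) -> float:
    """Liquid-drop binding energy of Eqs. (1)-(2) with Table 1 parameters."""
    A = Z + N
    Tz = abs(Z - N) / 2.0                     # |T_z|, third isospin component

    # Table 1 parameters (MeV, except the dimensionless kv and ks)
    av, as_, kv, ks = -15.4963, 17.7937, -1.8232, -2.2593
    ac, fp = 0.7093, -1.2739
    dn, dp, dnp = 4.6919, 4.7230, -6.4920

    iso = 4.0 * Tz * (Tz + 1.0) / A**2        # isospin-dependent factor
    B = (av * (1.0 + kv * iso) * A
         + as_ * (1.0 + ks * iso) * A ** (2.0 / 3.0)
         + ac * Z**2 / A ** (1.0 / 3.0)
         + fp * Z**2 / A)

    # Pairing energy E_p of Eq. (2)
    if Z % 2 == 1 and N % 2 == 1:             # odd-odd
        B += (dn * N ** (-1.0 / 3.0) + dp * Z ** (-1.0 / 3.0)
              + dnp * A ** (-2.0 / 3.0))
    elif Z % 2 == 1:                          # odd Z, even N
        B += dp * Z ** (-1.0 / 3.0)
    elif N % 2 == 1:                          # even Z, odd N
        B += dn * N ** (-1.0 / 3.0)
    return B
```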

This work mainly aims to find the relationship between the feature quantities of each nucleus and δ(Z,A) with the LightGBM model. For each nucleus, we selected 10 physical quantities (cf. Table 2) as input features, which are thought to be related to nuclear properties [49-56]. Because nuclear binding energy and nuclear structure are linked, four of the selected quantities are related to the shell structure. Among them, Zm and Nm index the shells in which the last proton and neutron are located, with the shell boundaries given by the magic numbers: proton numbers between 8, 20, 28, 50, 82, and 126 correspond to Zm = 1, 2, 3, 4, 5, and neutron numbers between 8, 20, 28, 50, 82, 126, and 184 correspond to Nm = 1, 2, 3, 4, 5, 6. In addition, |Z-m| and |N-m| are the absolute differences between the proton (neutron) number and the nearest magic number, that is, the distances of the proton and neutron numbers from the nearest magic number. Npair is an index accounting for the proton–neutron pairing effect: 0 for odd-odd nuclei, 1 for odd-even or even-odd nuclei, and 2 for even-even nuclei. A code sketch assembling these features follows Table 2.

Table 2
Selection of characteristic quantities
Feature Description
A mass number
Z proton number
N neutron number
N/Z ratio of neutrons to protons
BLDM theoretical value from the LDM
Npair = 0, 1, 2 pairing index: 0 for odd–odd, 1 for odd–even or even–odd, 2 for even–even nuclei
Zm = 1, 2, ... shell of the last proton: 1 for 8 ≤ Z < 20, 2 for 20 ≤ Z < 28, ...
Nm = 1, 2, ... shell of the last neutron: 1 for 8 ≤ N < 20, 2 for 20 ≤ N < 28, ...
|Z-m| distance between the proton number and the nearest magic number, m ∈ {8, 20, 28, 50, 82, 126}
|N-m| distance between the neutron number and the nearest magic number, m ∈ {8, 20, 28, 50, 82, 126, 184}
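As an illustration, the feature vector of Table 2 can be assembled as follows. This is a sketch; `binding_energy_ldm` refers to the LDM expression sketched above, and the shell-index convention follows Table 2.

```python
MAGIC_Z = [8, 20, 28, 50, 82, 126]        # proton magic numbers
MAGIC_N = [8, 20, 28, 50, 82, 126, 184]   # neutron magic numbers

def features_for(Z: int, N: int) -> list:
    """The 10 input features of Table 2 for a single nucleus."""
    A = Z + N
    # Shell index: count the magic numbers at or below Z (N), so that
    # 8 <= Z < 20 gives Zm = 1, 20 <= Z < 28 gives Zm = 2, and so on.
    Zm = sum(m <= Z for m in MAGIC_Z)
    Nm = sum(m <= N for m in MAGIC_N)
    # Distance to the nearest magic number
    dZ = min(abs(Z - m) for m in MAGIC_Z)
    dN = min(abs(N - m) for m in MAGIC_N)
    # Pairing index: 0 for odd-odd, 1 for odd-even/even-odd, 2 for even-even
    Npair = int(Z % 2 == 0) + int(N % 2 == 0)
    return [A, Z, N, N / Z, binding_energy_ldm(Z, N), Npair, Zm, Nm, dZ, dN]
```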

In this study, the value of num_boost_round (the maximum number of decision trees allowed) was 50,000, num_leaves (the maximum number of leaves allowed per tree) was 10, and the corresponding total number of parameters per tree was 19; max_depth (the maximum depth allowed per tree) was set to -1 (no limit), and the other parameters were kept at the default values of the LightGBM model. Varying these parameters did not significantly alter the results. During training, LightGBM generates decision trees based on the correlations between the features of the training set and δ(Z,A). Depending on the training set and the learning rate, a total of 10,000 to 25,000 decision trees are generated, and the overall model contains 190,000 to 475,000 parameters. Ten-fold cross-validation, a technique that evaluates models by partitioning the original dataset into 10 equal-sized subsamples, was also applied to prevent overfitting and selection bias. After training, the model makes predictions on the test set: each nucleus in the test set traverses the decision trees grown during training, each tree contributes to the predicted value according to the feature values of the nucleus, and the sum of the contributions of all trees is the final model prediction.
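A sketch of this configuration with the lightgbm cross-validation API follows; the early-stopping setting and the seed are illustrative assumptions, and `X_train`, `y_train` are as in the residual-learning sketch above.

```python
import lightgbm as lgb

params = {
    "objective": "regression",
    "metric": "rmse",
    "num_leaves": 10,    # at most 10 leaves per tree, as in the text
    "max_depth": -1,     # no depth limit
}

train_set = lgb.Dataset(X_train, label=y_train)

# 10-fold cross-validation with num_boost_round = 50000 as the ceiling on
# the number of trees; early stopping keeps far fewer trees in practice.
cv_results = lgb.cv(params, train_set, num_boost_round=50000, nfold=10,
                    stratified=False,  # stratification is for classification
                    callbacks=[lgb.early_stopping(stopping_rounds=200)],
                    seed=42)
best_rounds = len(next(iter(cv_results.values())))  # surviving rounds
booster = lgb.train(params, train_set, num_boost_round=best_rounds)
```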

3 Results

3.1  Predictions of the binding energy based on the LDM

In this section, LightGBM is trained to learn the residual δ(Z,A) between the LDM and the experimental binding energies. For this purpose, the binding energies of the 2408 nuclei between 16O and 270Ds from AME2016 were split into training and test datasets. We note that nuclei with proton (neutron) numbers smaller than 8 or with relatively large experimental uncertainties in AME2016 were not used. First, the influence of the training size on the predicted binding energy was examined, as shown in Fig. 1. We randomly selected 482 nuclei (~20% of the 2408) to constitute the test set. The RMSD of the LDM for these 482 nuclei is ~2.458 MeV; after LightGBM refinement, the RMSD is reduced to 0.496, 0.272, and 0.233 MeV when 482, 1204, and 1926 nuclei, respectively, are used to train LightGBM. This means that LightGBM can capture the missing physics of the LDM and decode the correlation between the input features and the residual, thereby further improving the agreement with the experimental data.

Fig. 1
(Color online) Upper panels: Locations of training datasets with (a) 20%, (b) 50%, and (c) 80% of the 2408 nuclei from AME2016 in the N-Z plane. Lower panels: Deviation between the experimental and LightGBM-refined LDM predicted binding energies for the test set (20% of the 2408 nuclei). Results obtained with the LightGBM-refined LDM trained with (d) 20%, (e) 50%, and (f) 80% of the 2408 nuclei, respectively. σpre is the RMSD of the original LDM, and σpost is the RMSD of the LightGBM-refined LDM.

In addition, it can be seen that the deviation between the experimental values and the LightGBM-refined LDM predictions is usually larger for nuclei with small proton and neutron numbers; this may be because the microscopic structure effects in light nuclei are strong and there are fewer data on light nuclei in the training set. The RMSD fluctuates when the training and test datasets are randomly selected, because δ(Z,A) is large for some nuclei (e.g., nuclei around the magic numbers) and small for others. To evaluate this issue, we randomly split the 2408 nuclei into training and test datasets 500 times for each ratio (i.e., 4:1, 1:1, and 1:4); the RMSD and its density distribution are plotted in Figs. 2 and 3. As observed in Fig. 2, the fluctuation in the RMSD is largest when the ratio of training to test size is 1:4. The RMSD for the 1926 test nuclei predicted by the LightGBM-refined LDM trained on the binding energies of 482 nuclei is ~0.508 ± 0.035 MeV, which is comparable to many physical mass models. With a training dataset of 1926 nuclei and the remaining 482 nuclei as the test dataset, the RMSD is 0.234 ± 0.022 MeV, which is better than that of many physical mass models.
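This split experiment can be reproduced along the following lines (a sketch; `features` and `delta` are as in the earlier sketch, and `n_estimators` is reduced from the paper's ceiling so that 1500 fits finish in reasonable time).

```python
import numpy as np
import lightgbm as lgb
from sklearn.model_selection import train_test_split

def rmsd_for_split(features, delta, train_frac, seed):
    """Test RMSD for one random split at the given training fraction."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, delta, train_size=train_frac, random_state=seed)
    model = lgb.LGBMRegressor(num_leaves=10, n_estimators=5000)
    model.fit(X_tr, y_tr)
    return float(np.sqrt(np.mean((model.predict(X_te) - y_te) ** 2)))

# 500 random splits for each ratio 4:1, 1:1, and 1:4, as in Figs. 2 and 3
for frac in (0.8, 0.5, 0.2):
    rmsds = [rmsd_for_split(features, delta, frac, s) for s in range(500)]
    print(f"train fraction {frac}: "
          f"{np.mean(rmsds):.3f} +/- {np.std(rmsds):.3f} MeV")
```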

Fig. 2
(Color online) RMSD for the test data from 100 runs. In each run, the 2408 nuclei were randomly split into training and test datasets at a ratio of 4:1 (blue), 1:1 (orange), and 1:4 (green).
Fig. 3
(Color online) Density distribution of RMSD between the experimental and predicted binding energy. The results from 500 runs for each set are displayed. Dashed lines denote a Gaussian fit to the distribution. The mean values and the standard deviation of the RMSD values are 0.508, 0.303, and 0.224 MeV and 0.035, 0.020, and 0.022 MeV for the three sets with different ratios of training to test size, respectively.

Figure 4 shows the residual δ(Z,A) obtained from the LDM and the LightGBM-refined LDM. The results from nine runs, each with a randomly selected 80% of the 2408 nuclei as the training set and the remaining 20% as the test set, are displayed. It can be seen that the residual δ(Z,A) obtained with the original LDM is large, especially for nuclei around the magic numbers, owing to the absence of shell effects in the LDM. After the LightGBM refinement, δ(Z,A) is considerably reduced, especially for nuclei with mass numbers larger than 60. The performance of LightGBM for nuclei with mass numbers smaller than 60 is not as good as that for heavier nuclei, as already observed in Fig. 1; this could be improved by feeding more relevant features to LightGBM.

Fig. 4
(Color online) Residual δ(Z,A) plotted as a function of mass number. Nine runs with a random splitting of the 2408 nuclei into training and test groups with a ratio of 4:1 are displayed. Blue and orange points denote δ(Z,A) for the test data obtained with the LDM and LightGBM-refined LDM, respectively. σpre is the RMSD of the original LDM, and σpost is the RMSD of the LightGBM-refined LDM.
3.2  Predictions of the binding energy based on different mass models

In the previous section, the capability of LightGBM to refine the LDM was demonstrated. In this section, in addition to the LDM, three popular mass models, namely the DZ, WS4, and FRDM, are tested as well. To do so, the δ(Z,A) between the experimental binding energy and that obtained from each mass model was fed to LightGBM; the 2408 nuclei were randomly split into training and test groups at a ratio of 4:1, and 500 runs were performed for each mass model. The distributions of the RMSD on the training and test datasets are shown in Fig. 5. In Table 3, the performance of several ML-refined mass models is compared. The typical RMSD on the training dataset is only ~0.05-0.1 MeV, which is, to the best of our knowledge, the smallest among the refined mass models. The typical RMSD on the test dataset is ~0.2 MeV, which is also smaller than the others. In general, significant improvements of approximately 90%, 65%, 40%, and 60% after the LightGBM refinement of the LDM, DZ, WS4, and FRDM were obtained, indicating the strong capability of LightGBM to improve theoretical nuclear mass models. In addition, other approaches, such as the radial basis function (RBF) [11], two combinatorial radial basis functions (RBFs) [59], kernel ridge regression (KRR) [40], and the RBF approach with odd-even corrections (RBFoe) [41], can also be used to improve the performance of nuclear mass models.

Fig. 5
(Color online) Density distribution of RMSD for training and test datasets. Results from 500 runs for each mass model (LDM, DZ, WS4, and FRDM) are displayed. Dashed lines denote a Gaussian fit to the distribution. The corresponding mean values and standard deviations are listed in Table 3. In each run, the 2408 nuclei were randomly split into training and test datasets at a ratio of 4:1.
Table 3
Comparison of the RMSDs for the ML-refined mass models. σpre denotes the RMSD of the original mass models; the other rows give the RMSDs obtained with the various ML-refined mass models. All values are in units of MeV.
Method LDM DZ WS4 FRDM
σpre 2.462 ± 0.023 0.613 ± 0.007 0.302 ± 0.003 0.599 ± 0.009
RBF by Wang [11] 0.170
KRR by Wu [40] 0.199
RBFs by Ma [59] 0.130 0.209
Training set:
LMNN by Zhang [39] 0.235 0.325 0.348
BNN by Niu [27] 0.176 0.187
RBFoe by Niu [41] 0.171 0.140 0.182
NN by Utama [25] 0.466 0.274 0.342
NN by Pastore [58] 0.324
Trees by Carnini [44] 2.070 0.471
LightGBM in this work 0.058 ± 0.011 0.066 ± 0.010 0.055 ± 0.011 0.077 ± 0.013
Test set:
LMNN by Zhang [39] 0.256 0.329 0.368
BNN by Niu 0.212 0.252
RBFoe by Niu 0.344 0.337 0.218
NN by Utama 0.486 0.278 0.352
NN by Pastore 0.358
Trees by Carnini 2.881 0.569
LightGBM in this work 0.234 ± 0.022 0.213 ± 0.018 0.170 ± 0.011 0.222 ± 0.016

In principle, if there existed a function that could precisely predict nuclear masses and LightGBM could find this function by learning the mapping between the mass and the input features, the refined versions of different mass models would reach the same accuracy. However, the correlations between nuclear mass and the input features are indeed very complicated. LightGBM captures some of the ingredients missing from these mass models, thereby improving their performance in predicting nuclear mass. In general, the improvement in the RMSD is significant for nuclear mass models with large RMSD values and relatively modest for those with small RMSD values.

Very recently, AME2020 was published; thus, it is interesting to see whether the LightGBM-refined mass models also work well for the newly measured nuclei appearing in the AME2020 mass evaluation. A comparison of the binding energies obtained with the LDM, DZ, WS4, and FRDM and with the LightGBM-refined mass models for the 66 newly measured nuclei that appeared in AME2020 is illustrated in Fig. 6. The RMSDs of the original mass models for these newly measured nuclei are 2.468, 0.821, 0.350, and 0.778 MeV for the LDM, DZ, WS4, and FRDM, respectively. After the LightGBM refinement, the RMSDs of these four mass models are significantly reduced to 0.452, 0.320, 0.222, and 0.292 MeV.

Fig. 6
(Color online) Difference between the theoretical and experimental binding energies (red horizontal line) obtained using the LDM, DZ, WS4, and FRDM (open diamonds) and the LightGBM-refined mass models (solid squares). The results for the 66 newly measured nuclei that appeared in the AME2020 mass evaluation are displayed. σpre and σpost denote the RMSDs of the original and LightGBM-refined mass models for the newly measured nuclei, respectively. The error of the predictions obtained using the LightGBM-refined mass models is the standard deviation of the predicted binding energies, obtained by running LightGBM 500 times with the AME2016 data randomly split into training and test sets at a ratio of 4:1 in each run.

By increasing the size of the training set, the model can learn more information, and the RMSD is reduced. However, the uncertainty of the RMSD increases with the percentage of the training set, because the fewer the test nuclei, the larger the fluctuations. In addition, the RMSD for the 66 newly measured nuclei that appeared in AME2020 is also displayed in Fig. 7. When the percentage of the training set is larger than 90%, the RMSD for the 66 newly measured nuclei increases slightly, indicating that overfitting starts to occur. However, this overfitting problem is not severe, because LightGBM has a strong built-in ability to prevent overfitting. In the present work, to avoid either a large RMSD value or a large uncertainty of the RMSD, the percentage of the training set was chosen as 80% in most cases.

Fig. 7
(Color online) RMSD of the LightGBM-refined LDM plotted as a function of the percentage of the training set.
3.3  Extrapolation of neutron separation energy

Single- and two-neutron separation energies are of particular interest because they provide information relevant to shell and subshell structures, nuclear deformation, pairing effects, and the boundary of the nuclear landscape. They can be calculated using the following formulas:

$$\begin{cases}S_n(Z,A)=B(Z,A)-B(Z,A-1),\\ S_{2n}(Z,A)=B(Z,A)-B(Z,A-2).\end{cases} \qquad (3)$$
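In code, given any binding-energy function B(Z, A) in MeV (for example, a LightGBM-refined model), Eq. (3) reads:

```python
def separation_energies(B, Z, A):
    """Single- and two-neutron separation energies of Eq. (3).
    `B` is any callable returning the binding energy B(Z, A) in MeV."""
    Sn = B(Z, A) - B(Z, A - 1)
    S2n = B(Z, A) - B(Z, A - 2)
    return Sn, S2n

# usage with a hypothetical mass model `my_mass_model(Z, A)`:
# Sn, S2n = separation_energies(my_mass_model, Z=50, A=132)
```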

The good performance of the LightGBM-refined mass models in predicting nuclear binding energies has been shown above, and it is interesting to see whether the single- and two-neutron separation energies can be reproduced well on the same footing. Based on the calculations with the LightGBM-refined models, the RMSDs of Sn for 2255 nuclei and of S2n for 2140 nuclei are displayed in Table 4. Figure 8 compares the single-neutron separation energies of the Ca, Zr, Sn, and Pb isotopic chains given by different theoretical models with the experimental data from AME2016. All predictions are in good agreement with the experimental data where data exist, while discrepancies appear as the neutron number increases. The general trends of Sn as a function of neutron number obtained with the LightGBM-refined LDM and WS4 are similar to those obtained with other nuclear mass models; for example, the odd-even staggering is also reproduced.

Fig. 8
(Color online) Single-neutron separation energies of the Ca, Zr, Sn, and Pb isotopic chains given by different models. The results obtained using the LightGBM-refined LDM and WS4 are compared with those from the FRDM and WS4, as well as with recent theoretical calculations by Xia et al. [16], Ma et al. [59], and Yu et al. [60].
Table 4
RMSD of Sn and S2n obtained with the LightGBM-refined mass models. All values are in units of MeV
LDM_lgb DZ_lgb WS4_lgb FRDM_lgb
Sn 0.093 0.141 0.121 0.156
S2n 0.094 0.164 0.126 0.170

The latest experimental measurements of the two-neutron separation energies for four elements (Ca, Ti, Pm, and Sm) are compared with various theoretical calculations in Fig. 9. It can be seen that the newly measured S2n are well reproduced by the LightGBM-refined LDM and WS4 models; in particular, S2n obtained with the LightGBM-refined LDM is much closer to the experimental data than that obtained with the LDM. For example, the sharp decrease in S2n around the magic number cannot be reproduced by the LDM, while this issue is fixed after the LightGBM refinement. The good performance of the LightGBM-refined mass models on both Sn and S2n again indicates the strong capability of LightGBM in refining nuclear mass models.

Fig. 9
(Color online) Two-neutron separation energies of neutron-rich nuclei in the Ca, Ti, Pm, and Sm isotopic chains given by different models. The red and green dots represent the experimental data from AME2016 and the latest measurements from Refs. [61, 62], respectively. The neutron numbers of the predicted drip-line isotopes (Ca and Ti) for each nuclear mass model are also listed in the figure. Note that the S2n obtained with WS4 and the LightGBM-refined WS4 almost completely overlap.
3.4  Prediction of the residual δ(Z,A)=BLDM-Bexp

It is known that the residual δ(Z,A)=BLDM-Bexp, that is, the difference between the binding energy calculated with the LDM and the experimental one, is usually large in the vicinity of the magic numbers, and δ(Z,A) can therefore reflect quantum-mechanical shell effects. It is always interesting to know whether new magic numbers exist for exotic nuclei. For this purpose, the residual δ(Z,A) predicted with LightGBM, based on learning the pattern of δ(Z,A) for the nuclei appearing in the AME2016 database, is displayed in Fig. 10. As can be seen, δ(Z,A) for nuclei with Z=126 and N=184 is relatively large, which might suggest that these are magic numbers. However, it should be noted that they were already used in the input features, namely |Z-m| and |N-m|. When the four magic-number-related features (i.e., Zm, Nm, |Z-m|, and |N-m|) are removed from the training, the predicted results are those displayed in the upper subfigure of Fig. 10. It is interesting to note that δ(Z,A) has local minima around (Z=20, N=50), (Z=28, N=82), and (Z=50, N=126), but there is no shell structure in δ(Z,A) for nuclei with Z ≥ 82 or N ≥ 126. This is understandable because ML algorithms have a strong ability to handle interpolation tasks but become less efficient and reliable for extrapolation tasks, particularly for samples far away from the training samples. In this context, one does not expect new magic numbers to be discovered by ML algorithms. Nevertheless, high accuracy in predicting the masses of nuclei close to those with experimental data has been demonstrated, for example, by the RMSD of the LightGBM-refined mass models for the 66 newly measured nuclei that appeared in AME2020.

Fig. 10
(Color online) Residual δ(Z,A)=BLDM-Bexp predicted with LightGBM. The 10 input features are listed in Table 2. The training set consisted of 80% of the nuclear masses appearing in the AME2016 database, which are displayed in the area encircled in red. The neutron drip lines obtained using the LDM, WS4, LDM_lgb, and Ma [59] are shown with different symbols. The upper subfigure shows the residual δ(Z,A) predicted with LightGBM but with only six input features, that is, excluding the four magic-number-related features.
Fig. 11
(Color online) Importance ranking for the input features obtained with the SHAP package. Each row represents a feature, and the x axis is the SHAP value, which shows the importance of a feature for a particular prediction. Each point represents a nucleus, and the color represents the feature value (with red being high and blue being low).
3.5  Interpretability of the model

As a decision-tree-based algorithm, one advantage of LightGBM is its excellent degree of explainability. This is important because, as physicists, we expect the ML algorithm not only to perform well in refining nuclear mass models but also to provide some of the underlying physics that is absent from the original models. Understanding what happens when ML algorithms make predictions can help us further improve our knowledge of the relationship between the input features and the predicted value. One possible way to understand how the LightGBM algorithm arrives at a particular prediction is to identify the most important features that drive the model. For this purpose, SHapley Additive exPlanations (SHAP) [63], one of the most popular feature attribution methods, was applied to obtain the contribution of each feature value to the prediction. Figure 11 illustrates the importance ranking of the 10 input features: the top row is the most important feature, while the bottom row is the least relevant one for predicting the residual δ(Z,A) between the experimental and theoretical binding energies. It can be seen that the importance ranking of the input features differs among the mass models. Because shell effects are not included in the LDM, the residual δ(Z,A) around the magic numbers is usually larger (as can also be seen in Fig. 4); as a result, |N-m| and |Z-m| are more important for predicting δ(Z,A) between the LDM calculations and the experimental data. To demonstrate the meaning of the SHAP value, the residual δ(Z,A) obtained from the LDM and the corresponding SHAP values are shown in Fig. 12. In the upper panel of Fig. 12, around the magic numbers (i.e., where |N-m| is close to 0), a larger difference between the LDM-calculated and experimental binding energies is observed, especially for nuclei with larger neutron numbers. A very similar behavior of the SHAP value can be seen in the lower panel. This implies that by adding a |N-m|-related term to the LDM, the accuracy of the LDM in calculating nuclear binding energies could be improved to some extent. For the FRDM, the neutron number N is the most relevant feature, and the SHAP value for smaller N is usually larger. Indeed, the fact that the residual δ(Z,A) is larger for nuclei with smaller neutron numbers has already been observed in the FRDM paper (see Fig. 6 of Ref. [64]). In addition, we see that Npair, Zm, and Nm are three of the least relevant features for predicting the residual δ(Z,A); we have checked that the RMSD increases by only ~5% when they are excluded. It is worth noting that the distributions of Bth-Bexp for nuclei with different Zm (Nm) are slightly different, implying that the shell corrections to nuclear masses differ in different regions, which deserves further study with the inclusion of more relevant features.
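A minimal sketch of such an analysis with the shap Python package, assuming the trained `booster` and the feature matrix `X_test` from the earlier sketches:

```python
import shap

# TreeExplainer supports LightGBM boosters directly
explainer = shap.TreeExplainer(booster)
shap_values = explainer.shap_values(X_test)   # one SHAP value per feature

feature_names = ["A", "Z", "N", "N/Z", "B_LDM",
                 "Npair", "Zm", "Nm", "|Z-m|", "|N-m|"]
# Beeswarm summary: the feature-importance ranking shown in Fig. 11
shap.summary_plot(shap_values, X_test, feature_names=feature_names)
```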

Fig. 12
(Color online) Upper panel: Residual δ(Z,A) obtained from the LDM plotted against |N-m| colored by neutron number. Each point represents a nucleus, and the color represents the number of neutrons in the nucleus. Lower panel: Same as the upper one, but with the SHAP value plotted instead of the residual δ(Z,A).

4 Conclusion and outlook

To summarize, several features were fed into the LightGBM algorithm to study the residual δ(Z,A) between the theoretical and experimental binding energies, and it was found that the LightGBM algorithm can mimic the patterns of δ(Z,A) with high accuracy, thereby refining theoretical mass models. In this study, significant reductions in the RMSD of approximately 90%, 65%, 40%, and 60% were obtained after the LightGBM refinement of the LDM, DZ, WS4, and FRDM, respectively, indicating the strong capability of LightGBM to improve theoretical nuclear mass models. In addition, the RMSDs of the various mass models with respect to the 66 newly measured nuclei that appeared in AME2020 (compared with AME2016) were reduced to the same level as well. Furthermore, it was found that the single- and two-neutron separation energies obtained with the LightGBM-refined mass models are in good agreement with the newly measured experimental data. Using the SHAP package, the most relevant input features for predicting the residual δ(Z,A) for each mass model were determined, which may provide guidance for the further development of nuclear mass models.

The good performance of the machine learning method in refining nuclear mass models provides a new tool for investigating other properties of nuclei of interest, such as superheavy nuclei, halo nuclei, and nuclei around the drip lines. In addition, with the development of interpretable machine learning methods, more physical hints can be obtained, thereby improving our understanding of present nuclear models.

References
[1] D. Lunney, J.M. Pearson, C. Thibault, Recent trends in the determination of nuclear masses. Rev. Mod. Phys. 75, 1021-1082 (2003). doi: 10.1103/RevModPhys.75.1021
[2] K. Blaum, High-accuracy mass spectrometry with stored ions. Phys. Rep. 425, 1 (2006). doi: 10.1016/j.physrep.2005.10.011
[3] F. Wienholtz, D. Beck, K. Blaum et al., Masses of exotic calcium isotopes pin down nuclear forces. Nature 498, 346-349 (2013). doi: 10.1038/nature12431
[4] K. Blaum, J. Dilling, W. Nörtershäuser, Precision atomic physics techniques for nuclear physics with radioactive beams. Phys. Scr. T152, 014017 (2013). doi: 10.1088/0031-8949/2013/T152/014017
[5] Z. Niu, H. Liang, B. Sun et al., High precision nuclear mass predictions towards a hundred kilo-electron-volt accuracy. Sci. Bull. 63, 759-764 (2018). doi: 10.1016/j.scib.2018.05.009
[6] M. Wang, Y.H. Zhang, X.H. Zhou, Nuclear mass measurements. Sci. China Phys. Mech. Astron. 62, 052006 (2020). doi: 10.1360/SSPMA-2019-0308
[7] C. Ma, M. Bao, Z.M. Niu et al., New extrapolation method for predicting nuclear masses. Phys. Rev. C 101, 045204 (2020). doi: 10.1103/PhysRevC.101.045204
[8] M. Wang, W.J. Huang, F.G. Kondev et al., The AME 2020 atomic mass evaluation (II). Tables, graphs, and references. Chin. Phys. C 45, 030003 (2021). doi: 10.1088/1674-1137/abddaf
[9] P. Möller, W.D. Myers, H. Sagawa et al., New finite-range droplet mass model and equation-of-state parameters. Phys. Rev. Lett. 108, 052501 (2012). doi: 10.1103/PhysRevLett.108.052501
[10] P. Möller, J.R. Nix, W.D. Myers et al., Nuclear ground state masses and deformations. At. Data Nucl. Data Tables 59, 185-381 (1995). doi: 10.1006/adnd.1995.1002
[11] N. Wang, M. Liu, X. Wu et al., Surface diffuseness correction in global mass formula. Phys. Lett. B 734, 215-219 (2014). doi: 10.1016/j.physletb.2014.05.049
[12] S. Goriely, N. Chamel, J.M. Pearson, Further explorations of the Skyrme-Hartree-Fock-Bogoliubov mass formulas. XVI. Inclusion of self-energy effects in pairing. Phys. Rev. C 93, 034337 (2016). doi: 10.1103/PhysRevC.93.034337
[13] S. Goriely, N. Chamel, J.M. Pearson, Skyrme-Hartree-Fock-Bogoliubov nuclear mass formulas: Crossing the 0.6 MeV threshold with microscopically deduced pairing. Phys. Rev. Lett. 102, 152503 (2009). doi: 10.1103/PhysRevLett.102.152503
[14] Y. Aboussir, J.M. Pearson, A.K. Dutta et al., Nuclear mass formula via an approximation to the Hartree-Fock method. At. Data Nucl. Data Tables 61, 127-176 (1995). doi: 10.1016/S0092-640X(95)90014-4
[15] L.S. Geng, H. Toki, J. Meng, Masses, deformations, and charge radii: Nuclear ground-state properties in the relativistic mean field model. Prog. Theor. Phys. 113, 785-800 (2005). doi: 10.1143/PTP.113.785
[16] X.W. Xia, Y. Lim, P.W. Zhao et al., The limits of the nuclear landscape explored by the relativistic continuum Hartree-Bogoliubov theory. At. Data Nucl. Data Tables 121-122, 1-215 (2018). doi: 10.1016/j.adt.2017.09.001
[17] G. Carleo, I. Cirac, K. Cranmer et al., Machine learning and the physical sciences. Rev. Mod. Phys. 91, 045002 (2019). doi: 10.1103/RevModPhys.91.045002
[18] A. Radovic, M. Williams, D. Rousseau et al., Machine learning at the energy and intensity frontiers of particle physics. Nature 560, 41 (2018). doi: 10.1038/s41586-018-0361-2
[19] J. Xu, Constraining isovector nuclear interactions with giant dipole resonance and neutron skin in 208Pb from a Bayesian approach. Chin. Phys. Lett. 38, 042101 (2021). doi: 10.1088/0256-307X/38/4/042101
[20] H.B. Ren, L. Wang, X. Dai, Machine learning kinetic energy functional for a one-dimensional periodic system. Chin. Phys. Lett. 38, 050701 (2021). doi: 10.1088/0256-307X/38/5/050701
[21] X.H. Wu, L.H. Guo, P.W. Zhao, Nuclear masses in extended kernel ridge regression with odd-even effects. Phys. Lett. B 819, 136387 (2021). doi: 10.1016/j.physletb.2021.136387
[22] S.J. Lei, D. Bai, Z.Z. Ren et al., Finding short-range parity-time phase-transition points with a neural network. Chin. Phys. Lett. 38, 051101 (2021). doi: 10.1088/0256-307X/38/5/051101
[23] W.J. Rao, Machine learning for many-body localization transition. Chin. Phys. Lett. 37, 080501 (2020). doi: 10.1088/0256-307X/37/8/080501
[24] H.L. Liu, D.D. Han, P. Ji et al., Reaction rate weighted multilayer nuclear reaction network. Chin. Phys. Lett. 37, 112601 (2020). doi: 10.1088/0256-307X/37/11/112601
[25] R. Utama, J. Piekarewicz, H.B. Prosper, Nuclear mass predictions for the crustal composition of neutron stars: A Bayesian neural network approach. Phys. Rev. C 93, 014311 (2016). doi: 10.1103/PhysRevC.93.014311
[26] R. Utama, J. Piekarewicz, Refining mass formulas for astrophysical applications: A Bayesian neural network approach. Phys. Rev. C 96, 044308 (2017). doi: 10.1103/PhysRevC.96.044308
[27] Z.M. Niu, H.Z. Liang, Nuclear mass predictions based on Bayesian neural network approach with pairing and shell effects. Phys. Lett. B 778, 48-53 (2018). doi: 10.1016/j.physletb.2018.01.002
[28] R. Utama, W.C. Chen, J. Piekarewicz, Nuclear charge radii: Density functional theory meets Bayesian neural networks. J. Phys. G 43, 114002 (2016). doi: 10.1088/0954-3899/43/11/114002
[29] Z.A. Wang, J.C. Pei, Y. Liu et al., Bayesian evaluation of incomplete fission yields. Phys. Rev. Lett. 123, 122501 (2019). doi: 10.1103/PhysRevLett.123.122501
[30] C.W. Ma, D. Peng, H.L. Wei et al., A Bayesian-neural-network prediction for fragment production in proton induced spallation reaction. Chin. Phys. C 44, 124107 (2020). doi: 10.1088/1674-1137/abb657
[31] C.W. Ma, D. Peng, H.L. Wei et al., Isotopic cross-sections in proton-induced spallation reactions based on the Bayesian neural network method. Chin. Phys. C 44, 014104 (2020). doi: 10.1088/1674-1137/44/1/014104
[32] Z.M. Niu, H.Z. Liang, B.H. Sun et al., Predictions of nuclear β-decay half-lives with machine learning and their impact on r-process nucleosynthesis. Phys. Rev. C 99, 064307 (2019). doi: 10.1103/PhysRevC.99.064307
[33] L.G. Pang, K. Zhou, N. Su et al., An equation-of-state meter of quantum chromodynamics transitions from deep learning. Nat. Commun. 9, 210 (2018). doi: 10.1038/s41467-017-02726-3
[34] Y.L. Du, K. Zhou, J. Steinheimer et al., Identifying the nature of the QCD transition in relativistic collisions of heavy nuclei with deep learning. Eur. Phys. J. C 80, 516 (2020). doi: 10.1140/epjc/s10052-020-8030-7
[35] J. Steinheimer, L. Pang, K. Zhou et al., A machine learning study to identify spinodal clumping in high energy nuclear collisions. JHEP 12, 122 (2019). doi: 10.1007/JHEP12(2019)122
[36] Y.D. Song, R. Wang, Y.G. Ma et al., Determining the temperature in heavy-ion collisions with multiplicity distribution. Phys. Lett. B 814, 136084 (2021). doi: 10.1016/j.physletb.2021.136084
[37] R. Wang, Y.G. Ma, R. Wada et al., Nuclear liquid-gas phase transition with machine learning. Phys. Rev. Res. 2, 043202 (2020). doi: 10.1103/PhysRevResearch.2.043202
[38] F.P. Li, Y.J. Wang, H.L. Lü et al., Application of artificial intelligence in the determination of impact parameters in heavy-ion collisions at intermediate energies. J. Phys. G 47, 115104 (2020). doi: 10.1088/1361-6471/abb1f9
[39] H.F. Zhang, L.H. Wang, J.P. Yin et al., Performance of the Levenberg-Marquardt neural network approach in nuclear mass prediction. J. Phys. G 44, 045110 (2017). doi: 10.1088/1361-6471/aa5d78
[40] X.H. Wu, P.W. Zhao, Predicting nuclear masses with the kernel ridge regression. Phys. Rev. C 101, 051301 (2020). doi: 10.1103/PhysRevC.101.051301
[41] Z.M. Niu, J.Y. Fang, Y.F. Niu, Comparative study of radial basis function and Bayesian neural network approaches in nuclear mass predictions. Phys. Rev. C 100, 054311 (2019). doi: 10.1103/PhysRevC.100.054311
[42] M. Shelley, A. Pastore, A new mass model for nuclear astrophysics: crossing 200 keV accuracy. arXiv:2102.07497 (2021).
[43] L. Neufcourt, Y. Cao, W. Nazarewicz et al., Bayesian approach to model-based extrapolation of nuclear observables. Phys. Rev. C 98, 034318 (2018). doi: 10.1103/PhysRevC.98.034318
[44] M. Carnini, A. Pastore, Trees and forests in nuclear physics. J. Phys. G 47, 082001 (2020). doi: 10.1088/1361-6471/ab92e3
[45] E. Yüksel, D. Soydaner, H. Bahtiyar, Nuclear mass predictions using neural networks: application of the multilayer perceptron. arXiv:2101.12117 (2021).
[46] K.A. Gernoth, J.W. Clark, J.S. Prater et al., Neural network models of nuclear systematics. Phys. Lett. B 300, 1-7 (1993). doi: 10.1016/0370-2693(93)90738-4
[47] S. Athanassopoulos, E. Mavrommatis, K.A. Gernoth et al., Nuclear mass systematics using neural networks. Nucl. Phys. A 743, 222-235 (2004). doi: 10.1016/j.nuclphysa.2004.08.006
[48] J.W. Clark, H. Li, Application of support vector machines to global prediction of nuclear properties. Int. J. Mod. Phys. B 20, 5015-5029 (2006). doi: 10.1142/S0217979206036053
[49] W. Liu, J.L. Lou, Y.L. Ye et al., Experimental study of intruder components in light neutron-rich nuclei via a single-nucleon transfer reaction. Nucl. Sci. Tech. 31, 20 (2020). doi: 10.1007/s41365-020-0731-y
[50] M. Ji, C. Xu, Quantum anti-Zeno effect in nuclear β decay. Chin. Phys. Lett. 38, 032301 (2021). doi: 10.1088/0256-307X/38/3/032301
[51] Y.J. Wang, F.H. Guan, X.Y. Diao et al., CSHINE for studies of HBT correlation in heavy ion reactions. Nucl. Sci. Tech. 32, 4 (2021). doi: 10.1007/s41365-020-00842-2
[52] D.Z. Chen, D.L. Fang, C.L. Bai, Impact of finite-range tensor terms in the Gogny force on the β decay of magic nuclei. Nucl. Sci. Tech. 32, 74 (2021). doi: 10.1007/s41365-021-00908-9
[53] C.J. Jiang, Y. Qiang, D.W. Guan et al., From finite nuclei to neutron stars: the essential role of the high-order density dependence in effective forces. Chin. Phys. Lett. 38, 052101 (2021). doi: 10.1088/0256-307X/38/5/052101
[54] X. Zhou, M. Wang, Y.H. Zhang et al., Charge resolution in the isochronous mass spectrometry and the mass of 51Co. Nucl. Sci. Tech. 32, 37 (2021). doi: 10.1007/s41365-021-00876-0
[55] W. Nan, B. Guo, C.J. Lin et al., First proof-of-principle experiment with the post-accelerated isotope separator on-line beam at BRIF: measurement of the angular distribution of 23Na + 40Ca elastic scattering. Nucl. Sci. Tech. 32, 53 (2021). doi: 10.1007/s41365-021-00889-9
[56] H. Yu, D.Q. Fang, Y.G. Ma, Investigation of the symmetry energy of nuclear matter using isospin-dependent quantum molecular dynamics. Nucl. Sci. Tech. 31, 61 (2020). doi: 10.1007/s41365-020-00766-x
[57] J. Duflo, A.P. Zuker, Microscopic mass formulae. Phys. Rev. C 52, R23 (1995). doi: 10.1103/PhysRevC.52.R23
[58] A. Pastore, D. Neill, H. Powell et al., Impact of statistical uncertainties on the composition of the outer crust of a neutron star. Phys. Rev. C 101, 035804 (2020). doi: 10.1103/PhysRevC.101.035804
[59] N.N. Ma, H.F. Zhang, X.J. Bao et al., Basic characteristics of the nuclear landscape by improved Weizsäcker-Skyrme-type nuclear mass model. Chin. Phys. C 43, 044105 (2019). doi: 10.1088/1674-1137/43/4/044105
[60] H.C. Yu, M.Q. Lin, M. Bao et al., Empirical formulas for nuclear separation energies. Phys. Rev. C 100, 014314 (2019). doi: 10.1103/PhysRevC.100.014314
[61] S. Michimasa, M. Kobayashi, Y. Kiyokawa et al., Mapping of a new deformation region around 62Ti. Phys. Rev. Lett. 125, 122501 (2020). doi: 10.1103/PhysRevLett.125.122501
[62] M. Vilen, J.M. Kelly, A. Kankainen et al., Precision mass measurements on neutron-rich rare-earth isotopes at JYFLTRAP: Reduced neutron pairing and implications for r-process calculations. Phys. Rev. Lett. 120, 262701 (2018). doi: 10.1103/PhysRevLett.120.262701
[63] S. Lundberg, S.I. Lee, A unified approach to interpreting model predictions. arXiv:1705.07874 (2017).
[64] P. Möller, A.J. Sierk, T. Ichikawa et al., Nuclear ground-state masses and deformations: FRDM(2012). At. Data Nucl. Data Tables 109-110, 1-204 (2016). doi: 10.1016/j.adt.2015.10.002
[65] G.L. Ke, Q. Meng, T. Finley et al., LightGBM: A highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems 30 (NIPS 2017), pp. 3149-3157.