Introduction
The study of excited states in atomic nuclei is a significant area of research in nuclear physics, aimed at elucidating the internal structure and interactions of nuclei. The energy level scheme of nuclear states is crucial for understanding nuclear structure and explaining nuclear reaction processes [1, 2]. It also provides essential insights into nucleosynthesis processes in stellar environments [3, 4]. In particular, the properties of the first 2+ states of even-even nuclei yield valuable information on the evolution of nuclear characteristics and shell structure. Accurate knowledge of these properties is vital for the continued advancement of nuclear model calculations and for the theoretical understanding of many intriguing quantum many-body phenomena. However, the study of excited states presents a series of challenges. Experimentally, the scarcity of data on certain nuclear states and the complexity of the measurements require overcoming technical limitations and enhancing measurement accuracy; careful data processing and analysis are also essential to ensure reliable results. Theoretically, the accurate modeling of many-body interactions remains a complex issue that requires balancing accuracy against computational cost. Thus far, various models incorporating different physical effects have been developed to investigate the excited-state properties of atomic nuclei, such as the Shell Model (SM) [5, 6], the Collective Model (CM) [7-10], the Collective Shell Model [11], Density Functional Theory (DFT) [12, 13], and the Interacting Shell Model (ISM) [14]. Although these models offer various perspectives on the excited-state properties of atomic nuclei, each has its limitations under different conditions. Consequently, predicting the energies of the first 2+ states across the nuclear chart remains a significant challenge.
Machine learning can extract valuable features and patterns from diverse types of data and has been widely applied across various fields. As important branches of machine learning, neural networks and decision trees have long played crucial roles in predictive modeling, providing effective solutions for numerous tasks. In nuclear physics, machine learning holds significant potential for addressing both theoretical and experimental challenges [15-19]. For instance, such methods have been successfully applied to nuclear masses [20-30], binding energies [31], particle physics [32], phase transitions [33], neutron star observables [34], fission fragments [35-38], half-lives [39-42], ground-state energies [43, 44], charge radii [45-47], giant resonance parameters [48], and reaction cross-sections [49, 50].
In fact, studying the first 2+ states using machine-learning algorithms is not a new topic. In 2020, Akkoyun et al. [51] employed neural networks and found that their trained models achieved slightly more accurate energy values than those obtained from the shell model (SM). In 2022, Wang et al. [52] utilized a Bayesian neural network (BNN) to obtain more precise low-lying excitation energies over a broad energy range, reproducing the experimental data to within a factor of approximately 1.12. With the rapid advancement of computer science and artificial intelligence, a variety of sophisticated machine learning algorithms have emerged. Among these, LightGBM [53], released by Microsoft in 2016, is an efficient gradient boosting framework based on decision tree algorithms. It has been widely adopted in machine learning tasks, demonstrating strong performance. Its advantages include (1) rapid training, (2) improved accuracy in capturing nonlinear relationships in the data, (3) efficient handling of large-scale datasets, and (4) significantly reduced memory usage. Following Gao et al. [20], who used LightGBM to refine theoretical nuclear mass models, applying LightGBM to predict the energies of the first 2+ states is a promising direction of research.
This study investigates the prediction of the first 2+ state energies of nuclei using LightGBM-based machine learning. In Sect. 2, we describe the construction of the LightGBM algorithm and how it works. Section 3 presents the training process of the LightGBM algorithm and its predictions of the first 2+ state properties of 642 nuclei. Additionally, we conduct a detailed comparison of the LightGBM predictions with available experimental data and shell model calculations, validating both the robustness of the LightGBM algorithm and its predictive accuracy. A summary and outlook are provided in Sect. 4.
Methodology
In this section, we discuss the working principle of the LightGBM algorithm and how it can be utilized to construct a model for predicting the first 2+ states.
LightGBM is an efficient gradient boosting framework based on decision tree algorithms. It uses gradient boosting decision trees (GBDT) as the base model: multiple decision trees are constructed sequentially, and their predictions are combined to improve the overall performance of the model. The specific architecture of LightGBM is illustrated in Fig. 1. Once the organized training data are input, features from different groups are separated, recombined, and bundled, and histograms are built over the bundled features for efficient split finding in the GBDT. After data-parallel training over the specified number of boosting iterations under the defined hyperparameter settings, the final model is evaluated via its root mean square error (RMSE).
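The gradient-boosting principle described here can be sketched in a few lines of plain Python: each new "tree" (reduced to a single-split stump for brevity) is fitted to the residuals of the current ensemble, and the final prediction is the learning-rate-scaled sum of all stumps. This is an illustrative toy under made-up data, not the LightGBM implementation itself.

```python
# Toy illustration of the GBDT principle behind LightGBM:
# each stump fits the residuals of the current ensemble.

def fit_stump(x, residuals):
    """Find the single threshold split on x that best fits the residuals."""
    best = None
    for t in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda xi: lm if xi <= t else rm

def boost(x, y, n_rounds=50, learning_rate=0.1):
    """Gradually build an additive ensemble of stumps on the residuals."""
    pred = [0.0] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * stump(xi) for pi, xi in zip(pred, x)]
    return lambda xi: sum(learning_rate * s(xi) for s in stumps)

# made-up step-like data, loosely analogous to energies dropping past a magic number
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.2, 1.1, 1.0, 1.1, 0.4, 0.3, 0.4, 0.3]
model = boost(x, y)
rmse = (sum((model(xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(y)) ** 0.5
```

Real GBDT frameworks grow full trees with many leaves and use histogram-based split finding, but the residual-fitting loop is the same idea.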
[Figure 1]
For the dataset, the energies of the first 2+ states of 642 atomic nuclei, ranging from 4He to 256Rf, as reported in Ref. [54] (see Fig. 2), were used for training and testing LightGBM to identify the functional patterns between the energy and various nuclear characteristics. For each nucleus, we selected five physical quantities (see Table 1) as input features. Given the relationship between excitation energy and nuclear shell structure, the distances of the proton and neutron numbers from the nearest magic numbers were included among the features.
[Figure 2]
| Features | Description |
|---|---|
| Z | Proton number |
| N | Neutron number |
| β2 | Quadrupole deformation parameter |
| \|Z-m\| | Distance between the proton number and the nearest magic number |
| \|N-m\| | Distance between the neutron number and the nearest magic number |
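The five features of Table 1 can be assembled per nucleus as follows. This is a minimal sketch assuming the conventional magic numbers 2, 8, 20, 28, 50, 82, 126; β2 would be supplied externally (e.g., from a deformation table), and the function name `features` is ours, not from the paper.

```python
# Build the five input features of Table 1 for one nucleus.
MAGIC = [2, 8, 20, 28, 50, 82, 126]  # conventional nuclear magic numbers

def features(Z, N, beta2):
    """Return the feature dict for a nucleus with Z protons, N neutrons."""
    d_magic = lambda n: min(abs(n - m) for m in MAGIC)  # distance to nearest magic number
    return {
        "Z": Z,
        "N": N,
        "beta2": beta2,
        "|Z-m|": d_magic(Z),
        "|N-m|": d_magic(N),
    }

f = features(Z=20, N=32, beta2=0.0)  # e.g. 52Ca: Z magic, N four above the N=28 shell
```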
For the current LightGBM parameter settings, num_leaves (the maximum number of leaves allowed per tree; more leaves increase model complexity) is set to 48. The learning_rate (the step size for each iteration; a lower learning rate typically enhances the model's generalization ability) is set to 0.05, and num_round (the maximum number of boosting iterations for LightGBM model training, indicating the intensity of training) is 500. Other parameters are kept at their default values, as altering them does not significantly affect the results. During the training process, LightGBM builds decision trees based on the relationships between the features of the training set and the corresponding first 2+ state energies, with each new tree fitted to the residuals of the current ensemble.
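These settings can be collected in a parameter dictionary; the sketch below assumes the standard `lightgbm` Python API, with the training call shown only in comments since the exact pipeline used in the paper is not specified.

```python
# Configuration fragment matching the hyperparameters stated in the text;
# everything not listed here is left at the LightGBM defaults.
params = {
    "objective": "regression",
    "metric": "rmse",
    "num_leaves": 48,       # maximum leaves per tree; more leaves = more complex model
    "learning_rate": 0.05,  # step size per boosting iteration
}
num_round = 500             # maximum number of boosting iterations

# Training would then look like (assuming the lightgbm package is installed):
# import lightgbm as lgb
# train_set = lgb.Dataset(X_train, label=y_train)
# booster = lgb.train(params, train_set, num_boost_round=num_round)
```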
Results and discussions
In this section, LightGBM is trained to learn the functional patterns between the first 2+ state energies and various nuclear properties. The excitation energies of 646 nuclei between 4He and 256Rf, obtained in Ref. [54], are utilized in this study. These nuclei are divided into training and testing datasets. It is important to note that four data points with energy values exceeding 5000 keV are excluded from the analysis, as these unusually high values are rare in the dataset. Consequently, the total number of samples used in the present LightGBM study is 642.
First, the effect of training size on the predicted excitation energy was examined, as depicted in Fig. 2. 128, 321, and 514 nuclei (approximately 20%, 50%, and 80% of the total 642 nuclei, respectively) were randomly selected to construct the training sets. It can be observed that for atomic nuclei with proton and neutron numbers less than the magic number 50, the average excitation energy was higher, at approximately 1115 keV. In contrast, for atomic nuclei with proton and neutron numbers greater than the magic number 50, the average excitation energy was significantly lower, around 399 keV. This difference may be attributed to the filling of lower energy levels in the shell structure when the number of protons and neutrons is below 50, requiring higher energy to excite the nucleus to higher energy levels. Conversely, when the number of protons and neutrons exceeds 50, the deformation and collective motion modes of the nucleus increase, resulting in decreased shell spacing and consequently lower excitation energies.
Based on the aforementioned analysis, we adopted three uniformly random training sets with different segmentation ratios to train the LightGBM model and calculated the RMSE for each group over the training rounds. The results are presented in Fig. 3. As the number of training rounds increased, the RMSE gradually converged to a certain value, and as the training ratio increased, the RMSE tended to decrease. During the initial stages of model training, optimizing model parameters using only the training set ensures a gradual decrease in error on the training data; however, this does not accurately reflect the model's generalization to unseen data. To address this, we introduced a validation set to monitor the model's performance on unseen data in real time. By plotting the RMSE curves for both the training and validation sets, we can intuitively evaluate the model's learning behavior and identify potential overfitting, thereby optimizing its overall performance and generalization ability. As shown in Fig. 4, during the initial stage of training, the prediction accuracy on the validation set continued to improve, as indicated by a gradual decrease in the RMSE. However, once the number of training rounds reached roughly 500 to 1000, the loss curve for the validation set began to increase slowly, indicating a slight deviation from optimal generalization. Consequently, we recommend 500 training rounds as a suitable setting for the current model, effectively avoiding both overfitting and underfitting.
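The round-selection logic described above amounts to tracking the validation RMSE after each boosting round and stopping near the minimum of the curve. A minimal sketch, with a made-up validation curve standing in for the curves of Figs. 3-4:

```python
# Pick the boosting-round count at the bottom of the validation-RMSE curve.

def rmse(pred, truth):
    """Root mean square error between two equal-length sequences."""
    return (sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)) ** 0.5

def best_round(val_rmse_per_round):
    """Return the 1-based round whose validation RMSE is lowest."""
    return min(range(len(val_rmse_per_round)),
               key=val_rmse_per_round.__getitem__) + 1

# illustrative curve: decreases, bottoms out, then slowly rises (overfitting)
curve = [0.50, 0.30, 0.20, 0.15, 0.14, 0.145, 0.16, 0.18]
stop = best_round(curve)  # round at which to stop training
```

In practice LightGBM exposes this directly via early-stopping callbacks on a validation set; the sketch only makes the selection criterion explicit.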
[Figure 3]
[Figure 4]
To further evaluate the predictive capability of LightGBM across different nuclear regions, Fig. 5 presents the predicted energies of the first 2+ states for the Mg, Ca, Kr, Sm, and Pb isotope chains. While the BNN [52] offers a reasonable description of these isotopes, its quantitative accuracy can still be improved: it tends to overestimate the first 2+ state energies of the Mg isotopes while underestimating those of the Ca isotopes. In contrast, the LightGBM predictions not only reproduce the trends along the isotope chains but also capture the abrupt changes at magic numbers caused by shell effects more accurately, aligning closely with the experimental measurements.
[Figure 5]
The Mg isotope chain is a typical light isotope series that transitions from spherical to deformed nuclei. When the neutron number N reaches 20, the "disappearance" of the traditional magic number begins to manifest, leading to the formation of the so-called "island of inversion". Because the significant deformation of these nuclei deviates from spherical symmetry, traditional shell models can no longer accurately describe their structure. In this context, LightGBM offers a more flexible and effective description of nuclei in the island of inversion, outperforming the BNN method. For the Mg isotope chain, 40Mg, at the traditional magic number N = 28, is situated close to the neutron drip line. The predicted energies of the first 2+ states in this isotope chain are illustrated in Fig. 5(a), with recent experimental data [55] provided for comparison. The LightGBM results demonstrate better agreement with the experimental values than those of the BNN method. In the Ca isotope chain, LightGBM effectively reproduces the shell effects at N = 20 and 28, as well as the sub-shell effect at N = 32, as shown in Fig. 5(b). For the Kr and Pb isotopes, shape coexistence leads to very low first 2+ state energies in the neutron-deficient region; in addition, LightGBM accurately predicts the experimental values in the neutron-rich region, particularly near the magic numbers N = 50 and 126. For the Sm isotopes in the medium-heavy region, when N approaches 90, the observed energy drops to as low as 0.04 MeV owing to deformation, which is successfully reproduced by LightGBM. These comparisons validate the robustness and accuracy of LightGBM's predictions.
Beyond the qualitative comparisons above, a more specific and quantitative measure of the differences between the models is needed. We therefore present the differences between the first 2+ state energies predicted by LightGBM and the corresponding experimental data in Fig. 6. For comparison, we also include the differences for the shell model calculations [54] and the BNN results [52]. It is evident that shell model results are available only for a limited set of nuclei, up to charge numbers around Z = 30. Overall, LightGBM accurately reproduces the experimental results in both the light and medium-heavy nucleus regions. The calculated results for transitional and magic nuclei align well with the experimental data, demonstrating more consistent and stable outcomes than the BNN.
[Figure 6]
To better assess the accuracy and stability of LightGBM in predicting the first 2+ state energies, we plotted histograms of the differences between the predictions of the shell model, the BNN, and LightGBM and the experimental values, as shown in Fig. 7. The average difference between the shell model calculations and the experimental values for 90 nuclei (see Ref. [54]) is 0.091 MeV, with a standard deviation of 0.17 MeV. The average difference between the BNN calculations and the experimental values for 630 nuclei is 0.007 MeV, with a standard deviation of 0.12 MeV. For LightGBM, the average difference from the experimental values across 642 nuclei is 0.005 MeV, with a standard deviation of 0.10 MeV. Given that LightGBM exhibits both the lowest average difference and the lowest standard deviation, we conclude that it provides the most accurate and consistent results. This further reinforces the reliability of LightGBM as a machine learning method for predicting the first 2+ state energies.
[Figure 7]
In addition to predicting the first 2+ state energies, we investigated the sensitivity of the LightGBM model to its input parameters, which enhances both the interpretability [20] and transparency of the model. To achieve this, we employed the popular SHAP (SHapley Additive exPlanations) [56] feature attribution method to interpret the LightGBM model. SHAP is an explanatory method grounded in cooperative game theory, designed to quantify the contribution of each feature to the predictions made by machine learning models. The core principle of SHAP is to allocate feature contributions by calculating Shapley values, thereby ensuring fairness and consistency in interpretation. Key characteristics of SHAP include the provision of unique and interpretable contribution values for each feature, as well as its compatibility with various model types, including tree-based models and neural networks.
Given a feature set F and a model f, the contribution of feature i to a prediction is its Shapley value, φ_i = Σ_{S ⊆ F\{i}} [|S|! (|F| − |S| − 1)! / |F|!] [f(S ∪ {i}) − f(S)], i.e., the marginal contribution of feature i averaged over all subsets S of the remaining features.
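The Shapley allocation can be made concrete by brute force for a toy model; production SHAP implementations (e.g., TreeSHAP) compute the same quantity efficiently for tree ensembles. The two-feature additive "model" below is purely illustrative, and for an additive model the Shapley values reduce to the individual feature contributions.

```python
from itertools import combinations
from math import factorial

def shapley(features, value):
    """Exact Shapley values: value(S) is the model's expected output
    when only the features in subset S are known."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                # Shapley weight |S|! (n - |S| - 1)! / n!
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                total += w * (value(set(S) | {i}) - value(set(S)))
        phi[i] = total
    return phi

# toy additive model: f = 2*x1 + 1*x2 at x1 = 3, x2 = 5, baseline 0
contrib = {"x1": 6.0, "x2": 5.0}
value = lambda S: sum(contrib[f] for f in S)
phi = shapley(["x1", "x2"], value)
```

The brute-force sum over subsets is exponential in the number of features, which is exactly why tree-specific algorithms are used for models like LightGBM.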
Figure 8 shows the importance ranking of the five features. N and |N-m| are the most critical parameters for predicting the first 2+ state energies. The red and blue points correspond to high and low feature values, respectively. The points cluster mainly between SHAP values of -0.5 and 0.3, with only approximately 2% of the nuclei, all in the light region of the dataset, showing a larger bias. Similar behavior is observed in the experimental data and the BNN predictions. In the future, we expect to enlarge the prediction horizon either when sufficient experimental data become available for the light nuclear region or by treating different nuclear regions separately.
[Figure 8]
Summary
The properties of the first 2+ excited states are crucial for understanding nuclear structure. This study employed a LightGBM-based machine learning model to investigate the first 2+ states of 642 nuclei. Several features of atomic nuclei were used as inputs to predict the first 2+ state energies. The LightGBM predictions were explicitly compared with available experimental data, shell model calculations, and BNN predictions. Notably, the average difference between the LightGBM predictions and the experimental data was 18 times smaller than that of the shell model and only 70% of that of the BNN. We demonstrated that LightGBM effectively reproduces the abrupt changes at magic numbers caused by shell effects, as well as energies as low as 0.04 MeV associated with deformation and shape coexistence. The results for transitional and magic nuclei showed excellent agreement with the experimental data. These findings not only enhance existing predictive models but also pave the way for future machine learning applications in nuclear physics, allowing a more nuanced understanding of nuclear structure and excitation energies.
References
Evolution of nuclear shells due to the tensor force. Phys. Rev. Lett. 95.
Self-consistent mean-field models for nuclear structure. Rev. Mod. Phys. 75, 121-180 (2003). https://doi.org/10.1103/RevModPhys.75.121
Thermonuclear reaction rates. Annu. Rev. Astron. Astrophys. 5, 525-570 (1967). https://doi.org/10.1146/annurev.aa.05.090167.002521
On closed shells in nuclei. II. Phys. Rev. 75, 1969-1970 (1949). https://doi.org/10.1103/PhysRev.75.1969
On the "magic numbers" in nuclear structure. Phys. Rev. 75, 1766 (1949). https://doi.org/10.1103/PhysRev.75.1766.2
Collective and individual-particle aspects of nuclear structure, in Proceedings (1953). https://api.semanticscholar.org/CorpusID:118820787
Nuclear structure, Volume II: Nuclear deformations (1975). https://api.semanticscholar.org/CorpusID:117271202
Nuclear structure (In 2 Volumes).
Possible analogy between the excitation spectra of nuclei and those of the superconducting metallic state. Phys. Rev. 110, 936-938 (1958). https://doi.org/10.1103/PhysRev.110.936
Inhomogeneous electron gas. Phys. Rev. 136, B864-B871 (1964). https://doi.org/10.1103/PhysRev.136.B864
Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, A1133-A1138 (1965). https://doi.org/10.1103/PhysRev.140.A1133
The shell model as a unified view of nuclear structure. Rev. Mod. Phys. 77, 427-488 (2005). https://doi.org/10.1103/RevModPhys.77.427
Machine learning in nuclear physics at low and intermediate energies. Sci. China Phys. Mech. Astron. 66.
Machine learning transforms the inference of the nuclear equation of state. Front. Phys. 18.
Machine learning and the physical sciences. Rev. Mod. Phys. 91.
Colloquium: Machine learning in nuclear physics. Rev. Mod. Phys. 94.
Machine learning in nuclear physics at low and intermediate energies. Sci. China Phys. Mech. Astron. 66.
Machine learning the nuclear mass. Nucl. Sci. Tech. 32, 109 (2021). https://doi.org/10.1007/s41365-021-00956-1
Physically interpretable machine learning for nuclear masses. Phys. Rev. C 106.
Nuclear masses learned from a probabilistic neural network. Phys. Rev. C 106.
Performance of the Levenberg-Marquardt neural network approach in nuclear mass prediction. J. Phys. G Nucl. Part. Phys. 44.
Nuclear mass predictions based on Bayesian neural network approach with pairing and shell effects. Phys. Lett. B 778, 48-53 (2018). https://doi.org/10.1016/j.physletb.2018.01.002
Refining mass formulas for astrophysical applications: A Bayesian neural network approach. Phys. Rev. C 96.
Deep learning approach to nuclear masses and α-decay half-lives. Phys. Rev. C 105.
Nuclear mass predictions with machine learning reaching the accuracy required by r-process studies. Phys. Rev. C 106.
High precision nuclear mass predictions towards a hundred kilo-electron-volt accuracy. Sci. Bull. 63, 759-764 (2018). https://doi.org/10.1016/j.scib.2018.05.009
Comparative study of radial basis function and Bayesian neural network approaches in nuclear mass predictions. Phys. Rev. C 100.
Trees and forests in nuclear physics. J. Phys. G Nucl. Part. Phys. 47.
Reliable calculations of nuclear binding energies by the Gaussian process of machine learning. Nucl. Sci. Tech. 35, 105 (2024). https://doi.org/10.1007/s41365-024-01463-9
Boosted decision trees in the era of new physics: a smuon analysis case study. J. High Energy Phys. 2022, 15 (2022). https://doi.org/10.1007/JHEP04(2022)015
Phase Transition Study Meets Machine Learning. Chin. Phys. Lett. 40.
Bayesian inference of neutron-skin thickness and neutron-star observables based on effective nuclear interactions. Sci. China Phys. Mech. Astron. 67.
Bayesian evaluation of charge yields of fission fragments of 239U. Phys. Rev. C 103.
Bayesian evaluation of incomplete fission yields. Phys. Rev. Lett. 123.
Bayesian approach to heterogeneous data fusion of imperfect fission yields for augmented evaluations. Phys. Rev. C 106.
Quantifying uncertainties on fission fragment mass yields with mixture density networks. J. Phys. G Nucl. Part. Phys. 47.
β-delayed one-neutron emission probabilities within a neural network model. Phys. Rev. C 104.
Predictions of nuclear β-decay half-lives with machine learning and their impact on r-process nucleosynthesis. Phys. Rev. C 99.
Random forest-based prediction of decay modes and half-lives of superheavy nuclei. Nucl. Sci. Tech. 34, 204 (2023). https://doi.org/10.1007/s41365-023-01354-5
Investigation of β--decay half-life and delayed neutron emission with uncertainty analysis. Nucl. Sci. Tech. 34, 9 (2023). https://doi.org/10.1007/s41365-022-01153-4
Extrapolation of nuclear structure observables with artificial neural networks. Phys. Rev. C 100.
Deep learning: A tool for computational nuclear physics. arXiv:1803.03215 (2018).
Nuclear charge radii: density functional theory meets Bayesian neural networks. J. Phys. G Nucl. Part. Phys. 43.
Calculation of nuclear charge radii with a trained feed-forward neural network. Phys. Rev. C 102.
Predictions of nuclear charge radii and physical interpretations based on the naive Bayesian probability classifier. Phys. Rev. C 101.
The description of giant dipole resonance key parameters with multitask neural networks. Phys. Lett. B 815.
Isotopic cross-sections in proton induced spallation reactions based on the Bayesian neural network method. Chin. Phys. C 44.
Constraining the Woods-Saxon potential in fusion reactions based on the neural network. Phys. Rev. C 109.
Estimations of first 2+ energy states of even-even nuclei by using artificial neural networks. Indian J. Phys. 96, 1791-1797 (2020). https://doi.org/10.1007/s12648-021-02099-w
Study of nuclear low-lying excitation spectra with the Bayesian neural network approach. Phys. Lett. B 830.
LightGBM: a highly efficient gradient boosting decision tree. In Advances in Neural Information Processing Systems (2017).
Tables of E2 transition probabilities from the first 2+ states in even-even nuclei. At. Data Nucl. Data Tables 107, 1-139 (2016). https://doi.org/10.1016/j.adt.2015.10.001
First spectroscopy of the near drip-line nucleus 40Mg. Phys. Rev. Lett. 122.
A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems (2017).

The authors declare that they have no competing interests.