Introduction
The study of excited states in atomic nuclei is a significant area of research in nuclear physics, aimed at elucidating the internal structure and interactions of nuclei. The energy level scheme of nuclear states is crucial for understanding nuclear structure and explaining nuclear reaction processes [1, 2]. It also provides essential insights into nucleosynthesis processes in stellar environments [3, 4]. In particular, the properties of the first 2+ states of even-even nuclei yield valuable information on the evolution of nuclear characteristics and shell structure. Accurate knowledge of these properties is vital for the continued advancement of nuclear model calculations and for the theoretical understanding of many intriguing quantum many-body phenomena. However, the study of excited states presents a series of challenges. Experimentally, the scarcity of data on certain nuclear states and the complexity of the measurements require overcoming technical limitations and enhancing measurement accuracy; careful data processing and analysis are also essential to ensure reliable results. Theoretically, the accurate modeling of many-body interactions remains a complex issue that requires balancing accuracy against computational cost. Thus far, various models incorporating different physical effects have been developed to investigate the excited-state properties of atomic nuclei, such as the Shell Model (SM) [5, 6], the Collective Model (CM) [7-10], the Collective Shell Model [11], Density Functional Theory (DFT) [12, 13], and the Interacting Shell Model (ISM) [14]. Although these models offer various perspectives on the excited-state properties of atomic nuclei, each has its limitations under different conditions. Consequently, predicting the energies of the first 2+ states across the nuclear chart remains a significant challenge.
Machine learning can extract valuable features and patterns from diverse types of data and has been widely applied across various fields. As important branches of machine learning, neural networks and decision trees have long played crucial roles in predictive modeling, providing effective solutions for numerous tasks. In nuclear physics, machine learning holds significant potential for addressing both theoretical and experimental challenges [15-19]. For instance, such methods have been successfully applied to nuclear masses [20-30], binding energies [31], particle physics [32], phase transitions [33], neutron star observables [34], fission fragments [35-38], half-lives [39-42], ground-state energies [43, 44], charge radii [45-47], giant resonance parameters [48], and reaction cross-sections [49, 50].
In fact, studying the first 2+ states using machine-learning algorithms is not a new topic. In 2020, Akkoyun et al. [51] employed neural networks and found that their trained models achieved slightly more accurate energy values than those obtained from the shell model (SM). In 2022, Wang et al. [52] utilized a Bayesian neural network (BNN) to obtain more precise low-lying excitation energies over a broad energy range, reproducing the experimental data to within a factor of approximately 1.12. With the rapid advancement of computer science and artificial intelligence, a variety of sophisticated machine learning algorithms have emerged. Among these, LightGBM [53], released by Microsoft in 2016, is an efficient gradient boosting framework based on decision tree algorithms. It has been widely adopted in machine learning tasks, demonstrating strong performance. Its advantages include (1) rapid training, (2) improved accuracy in capturing nonlinear relationships in the data, (3) efficient handling of large-scale datasets, and (4) significantly reduced memory usage. Following Gao et al. [20], who used LightGBM to refine theoretical nuclear mass models, applying LightGBM to predict the energies of the first 2+ states is a promising direction of research.
This study investigates the prediction of the first 2+ state energies of nuclei using LightGBM-based machine learning. In Sect. 2, we describe the construction of the LightGBM algorithm and how it works. Section 3 presents the training process of the LightGBM algorithm and its predictions of the first 2+ state properties of 642 nuclei. Additionally, we conduct a detailed comparison of the LightGBM predictions with available experimental data and shell model calculations, validating both the robustness of the LightGBM algorithm and its predictive accuracy. A summary and outlook are provided in Sect. 4.
Methodology
In this section, we discuss the working principle of the LightGBM algorithm and how it can be utilized to construct a model for predicting the first 2+ states.
LightGBM is an efficient gradient boosting framework based on decision tree algorithms. It uses gradient boosting decision trees (GBDT) as the base model: multiple decision trees are constructed sequentially, and their predictions are combined to improve the overall performance of the model. The specific architecture of LightGBM is illustrated in Fig. 1. Once the organized training data are input, features from different groups are separated, recombined, and bundled, and histograms are built over the bundled features for efficient split finding in the GBDT. After data-parallel training over the specified number of boosting iterations under the defined hyperparameter settings, the final model is evaluated via its root mean square error (RMSE).
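The gradient-boosting principle described here can be sketched in a few lines of plain Python: each new "tree" (reduced to a single-split stump for brevity) is fitted to the residuals of the current ensemble, and the final prediction is the learning-rate-scaled sum of all stumps. This is an illustrative toy under made-up data, not the LightGBM implementation itself.

```python
# Toy illustration of the GBDT principle behind LightGBM:
# each stump fits the residuals of the current ensemble.

def fit_stump(x, residuals):
    """Find the single threshold split on x that best fits the residuals."""
    best = None
    for t in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda xi: lm if xi <= t else rm

def boost(x, y, n_rounds=50, learning_rate=0.1):
    """Gradually build an additive ensemble of stumps on the residuals."""
    pred = [0.0] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * stump(xi) for pi, xi in zip(pred, x)]
    return lambda xi: sum(learning_rate * s(xi) for s in stumps)

# made-up step-like data, loosely analogous to energies dropping past a magic number
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.2, 1.1, 1.0, 1.1, 0.4, 0.3, 0.4, 0.3]
model = boost(x, y)
rmse = (sum((model(xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(y)) ** 0.5
```

Real GBDT frameworks grow full trees with many leaves and use histogram-based split finding, but the residual-fitting loop is the same idea.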
[Figure 1]
For the dataset, the energies of the first 2+ states of 642 atomic nuclei, ranging from 4He to 256Rf, as reported in Ref. [54] (see Fig. 2), were used for training and testing LightGBM to identify the functional patterns between the energy and various nuclear characteristics. For each nucleus, we selected five physical quantities (see Table 1) as input features. Given the relationship between excitation energy and nuclear shell structure, the distances of the proton and neutron numbers from the nearest magic numbers were included among the features.
[Figure 2]
| Features | Description |
|---|---|
| Z | Proton number |
| N | Neutron number |
| β2 | Quadrupole deformation parameter |
| \|Z-m\| | Distance between the proton number and the nearest magic number |
| \|N-m\| | Distance between the neutron number and the nearest magic number |
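The five features of Table 1 can be assembled per nucleus as follows. This is a minimal sketch assuming the conventional magic numbers 2, 8, 20, 28, 50, 82, 126; β2 would be supplied externally (e.g., from a deformation table), and the function name `features` is ours, not from the paper.

```python
# Build the five input features of Table 1 for one nucleus.
MAGIC = [2, 8, 20, 28, 50, 82, 126]  # conventional nuclear magic numbers

def features(Z, N, beta2):
    """Return the feature dict for a nucleus with Z protons, N neutrons."""
    d_magic = lambda n: min(abs(n - m) for m in MAGIC)  # distance to nearest magic number
    return {
        "Z": Z,
        "N": N,
        "beta2": beta2,
        "|Z-m|": d_magic(Z),
        "|N-m|": d_magic(N),
    }

f = features(Z=20, N=32, beta2=0.0)  # e.g. 52Ca: Z magic, N four above the N=28 shell
```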
For the current LightGBM parameter settings, num_leaves (the maximum number of leaves allowed per tree; more leaves increase model complexity) is set to 48. The learning_rate (the step size for each iteration; a lower learning rate typically enhances the model's generalization ability) is set to 0.05, and num_round (the maximum number of boosting iterations for LightGBM model training, indicating the intensity of training) is 500. Other parameters are kept at their default values, as altering them does not significantly affect the results. During the training process, LightGBM builds decision trees based on the relationships between the features of the training set and the corresponding first 2+ state energies, with each new tree fitted to the residuals of the current ensemble.
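These settings can be collected in a parameter dictionary; the sketch below assumes the standard `lightgbm` Python API, with the training call shown only in comments since the exact pipeline used in the paper is not specified.

```python
# Configuration fragment matching the hyperparameters stated in the text;
# everything not listed here is left at the LightGBM defaults.
params = {
    "objective": "regression",
    "metric": "rmse",
    "num_leaves": 48,       # maximum leaves per tree; more leaves = more complex model
    "learning_rate": 0.05,  # step size per boosting iteration
}
num_round = 500             # maximum number of boosting iterations

# Training would then look like (assuming the lightgbm package is installed):
# import lightgbm as lgb
# train_set = lgb.Dataset(X_train, label=y_train)
# booster = lgb.train(params, train_set, num_boost_round=num_round)
```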
Results and discussions
In this section, LightGBM is trained to learn the functional patterns between the first 2+ state energies and various nuclear properties. The excitation energies of 646 nuclei between 4He and 256Rf, obtained in Ref. [54], are utilized in this study. These nuclei are divided into training and testing datasets. It is important to note that four data points with energy values exceeding 5000 keV are excluded from the analysis, as these unusually high values are rare in the dataset. Consequently, the total number of samples used in the present LightGBM study is 642.
First, the effect of training size on the predicted excitation energy was examined, as depicted in Fig. 2. 128, 321, and 514 nuclei (approximately 20%, 50%, and 80% of the total 642 nuclei, respectively) were randomly selected to construct the training sets. It can be observed that for atomic nuclei with proton and neutron numbers less than the magic number 50, the average excitation energy was higher, at approximately 1115 keV. In contrast, for atomic nuclei with proton and neutron numbers greater than the magic number 50, the average excitation energy was significantly lower, around 399 keV. This difference may be attributed to the filling of lower energy levels in the shell structure when the number of protons and neutrons is below 50, requiring higher energy to excite the nucleus to higher energy levels. Conversely, when the number of protons and neutrons exceeds 50, the deformation and collective motion modes of the nucleus increase, resulting in decreased shell spacing and consequently lower excitation energies.
Based on the aforementioned analysis, we adopted three uniformly random training sets with different segmentation ratios to train the LightGBM model and calculated the RMSE for each group over the training rounds. The results are presented in Fig. 3. As the number of training rounds increased, the RMSE gradually converged to a certain value, and as the training ratio increased, the RMSE tended to decrease. During the initial stages of model training, optimizing model parameters using only the training set ensures a gradual decrease in error on the training data; however, this does not accurately reflect the model's generalization to unseen data. To address this, we introduced a validation set to monitor the model's performance on unseen data in real time. By plotting the RMSE curves for both the training and validation sets, we can intuitively evaluate the model's learning behavior and identify potential overfitting, thereby optimizing its overall performance and generalization ability. As shown in Fig. 4, during the initial stage of training, the prediction accuracy on the validation set continued to improve, as indicated by a gradual decrease in the RMSE. However, once the number of training rounds reached roughly 500 to 1000, the loss curve for the validation set began to increase slowly, indicating a slight deviation from optimal generalization. Consequently, we recommend 500 training rounds as a suitable setting for the current model, effectively avoiding both overfitting and underfitting.
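The round-selection logic described above amounts to tracking the validation RMSE after each boosting round and stopping near the minimum of the curve. A minimal sketch, with a made-up validation curve standing in for the curves of Figs. 3-4:

```python
# Pick the boosting-round count at the bottom of the validation-RMSE curve.

def rmse(pred, truth):
    """Root mean square error between two equal-length sequences."""
    return (sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)) ** 0.5

def best_round(val_rmse_per_round):
    """Return the 1-based round whose validation RMSE is lowest."""
    return min(range(len(val_rmse_per_round)),
               key=val_rmse_per_round.__getitem__) + 1

# illustrative curve: decreases, bottoms out, then slowly rises (overfitting)
curve = [0.50, 0.30, 0.20, 0.15, 0.14, 0.145, 0.16, 0.18]
stop = best_round(curve)  # round at which to stop training
```

In practice LightGBM exposes this directly via early-stopping callbacks on a validation set; the sketch only makes the selection criterion explicit.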
[Figure 3]
[Figure 4]
To further evaluate the predictive capability of LightGBM across different nuclear regions, Fig. 5 presents the predicted energies of the first 2+ states for the Mg, Ca, Kr, Sm, and Pb isotope chains. While the BNN [52] offers a reasonable description of these isotopes, its quantitative accuracy can still be improved: it tends to overestimate the first 2+ state energies of the Mg isotopes while underestimating those of the Ca isotopes. In contrast, the LightGBM predictions not only reproduce the trends along the isotope chains but also capture the abrupt changes at magic numbers caused by shell effects more accurately, aligning closely with the experimental measurements.
[Figure 5]
The Mg isotope chain is a typical light isotope series that transitions from spherical to deformed nuclei. When the neutron number N reaches 20, the "disappearance" of the traditional magic number begins to manifest, leading to the formation of the so-called "island of inversion". Because the significant deformation of these nuclei deviates from spherical symmetry, traditional shell models can no longer accurately describe their structure. In this context, LightGBM offers a more flexible and effective description of nuclei in the island of inversion, outperforming the BNN method. For the Mg isotope chain, 40Mg, at the traditional magic number N = 28, is situated close to the neutron drip line. The predicted energies of the first 2+ states in this isotope chain are illustrated in Fig. 5(a), with recent experimental data [55] provided for comparison. The LightGBM results demonstrate better agreement with the experimental values than those of the BNN method. In the Ca isotope chain, LightGBM effectively reproduces the shell effects at N = 20 and 28, as well as the sub-shell effect at N = 32, as shown in Fig. 5(b). For the Kr and Pb isotopes, shape coexistence leads to very low first 2+ state energies in the neutron-deficient region; in addition, LightGBM accurately predicts the experimental values in the neutron-rich region, particularly near the magic numbers N = 50 and 126. For the Sm isotopes in the medium-heavy region, when N approaches 90, the observed energy drops to as low as 0.04 MeV owing to deformation, which is successfully reproduced by LightGBM. These comparisons validate the robustness and accuracy of LightGBM's predictions.
Beyond the qualitative comparisons above, a more specific and quantitative measure of the differences between the models is needed. We therefore present the differences between the first 2+ state energies predicted by LightGBM and the corresponding experimental data in Fig. 6. For comparison, we also include the differences for the shell model calculations [54] and the BNN results [52]. It is evident that shell model results are available only for a limited set of nuclei, up to charge numbers around Z = 30. Overall, LightGBM accurately reproduces the experimental results in both the light and medium-heavy nucleus regions. The calculated results for transitional and magic nuclei align well with the experimental data, demonstrating more consistent and stable outcomes than the BNN.
[Figure 6]
To better assess the accuracy and stability of LightGBM in predicting the first 2+ state energies, we plotted histograms of the differences between the predictions of the shell model, the BNN, and LightGBM and the experimental values, as shown in Fig. 7. The average difference between the shell model calculations and the experimental values for 90 nuclei (see Ref. [54]) is 0.091 MeV, with a standard deviation of 0.17 MeV. The average difference between the BNN calculations and the experimental values for 630 nuclei is 0.007 MeV, with a standard deviation of 0.12 MeV. For LightGBM, the average difference from the experimental values across 642 nuclei is 0.005 MeV, with a standard deviation of 0.10 MeV. Given that LightGBM exhibits both the lowest average difference and the lowest standard deviation, we conclude that it provides the most accurate and consistent results. This further reinforces the reliability of LightGBM as a machine learning method for predicting the first 2+ state energies.
[Figure 7]
In addition to predicting the first 2+ state energies, we investigated the sensitivity of the LightGBM model to its input parameters, which enhances both the interpretability [20] and transparency of the model. To achieve this, we employed the popular SHAP (SHapley Additive exPlanations) [56] feature attribution method to interpret the LightGBM model. SHAP is an explanatory method grounded in cooperative game theory, designed to quantify the contribution of each feature to the predictions made by machine learning models. The core principle of SHAP is to allocate feature contributions by calculating Shapley values, thereby ensuring fairness and consistency in interpretation. Key characteristics of SHAP include the provision of unique and interpretable contribution values for each feature, as well as its compatibility with various model types, including tree-based models and neural networks.
Given a feature set F and a model f, the contribution of feature i to a prediction is its Shapley value, φ_i = Σ_{S ⊆ F\{i}} [|S|! (|F| − |S| − 1)! / |F|!] [f(S ∪ {i}) − f(S)], i.e., the marginal contribution of feature i averaged over all subsets S of the remaining features.
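The Shapley allocation can be made concrete by brute force for a toy model; production SHAP implementations (e.g., TreeSHAP) compute the same quantity efficiently for tree ensembles. The two-feature additive "model" below is purely illustrative, and for an additive model the Shapley values reduce to the individual feature contributions.

```python
from itertools import combinations
from math import factorial

def shapley(features, value):
    """Exact Shapley values: value(S) is the model's expected output
    when only the features in subset S are known."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                # Shapley weight |S|! (n - |S| - 1)! / n!
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                total += w * (value(set(S) | {i}) - value(set(S)))
        phi[i] = total
    return phi

# toy additive model: f = 2*x1 + 1*x2 at x1 = 3, x2 = 5, baseline 0
contrib = {"x1": 6.0, "x2": 5.0}
value = lambda S: sum(contrib[f] for f in S)
phi = shapley(["x1", "x2"], value)
```

The brute-force sum over subsets is exponential in the number of features, which is exactly why tree-specific algorithms are used for models like LightGBM.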
Figure 8 shows the importance ranking of the five features. N and |N-m| are the most critical parameters for predicting the first 2+ state energies. The red and blue points correspond to high and low feature values, respectively. The points cluster mainly between SHAP values of -0.5 and 0.3, with only approximately 2% of the nuclei, all in the light region of the dataset, showing a larger bias. Similar behavior is observed in the experimental data and the BNN predictions. In the future, we expect to enlarge the prediction horizon either when sufficient experimental data become available for the light nuclear region or by treating different nuclear regions separately.
[Figure 8]
Summary
The properties of the first 2+ excited states are crucial for understanding nuclear structure. This study employed a LightGBM-based machine learning model to investigate the first 2+ states of 642 nuclei. Several features of atomic nuclei were used as inputs to predict the first 2+ state energies. The LightGBM predictions were explicitly compared with available experimental data, shell model calculations, and BNN predictions. Notably, the average difference between the LightGBM predictions and the experimental data was 18 times smaller than that of the shell model and only 70% of that of the BNN. We demonstrated that LightGBM effectively reproduces the abrupt changes at magic numbers caused by shell effects, as well as energies as low as 0.04 MeV associated with deformation and shape coexistence. The results for transitional and magic nuclei showed excellent agreement with the experimental data. These findings not only enhance existing predictive models but also pave the way for future machine learning applications in nuclear physics, allowing a more nuanced understanding of nuclear structure and excitation energies.
References
Evolution of nuclear shells due to the tensor force. Phys. Rev. Lett. 95.
Self-consistent mean-field models for nuclear structure. Rev. Mod. Phys. 75, 121-180 (2003). https://doi.org/10.1103/RevModPhys.75.121
Thermonuclear reaction rates. Annu. Rev. Astron. Astrophys. 5, 525-570 (1967). https://doi.org/10.1146/annurev.aa.05.090167.002521
On closed shells in nuclei. II. Phys. Rev. 75, 1969-1970 (1949). https://doi.org/10.1103/PhysRev.75.1969
On the "magic numbers" in nuclear structure. Phys. Rev. 75, 1766 (1949). https://doi.org/10.1103/PhysRev.75.1766.2
Collective and individual-particle aspects of nuclear structure, in Proceedings (1953). https://api.semanticscholar.org/CorpusID:118820787
Nuclear structure, Volume II: Nuclear deformations (1975). https://api.semanticscholar.org/CorpusID:117271202
Nuclear structure (In 2 Volumes).
Possible analogy between the excitation spectra of nuclei and those of the superconducting metallic state. Phys. Rev. 110, 936-938 (1958). https://doi.org/10.1103/PhysRev.110.936
Inhomogeneous electron gas. Phys. Rev. 136, B864-B871 (1964). https://doi.org/10.1103/PhysRev.136.B864
Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, A1133-A1138 (1965). https://doi.org/10.1103/PhysRev.140.A1133
The shell model as a unified view of nuclear structure. Rev. Mod. Phys. 77, 427-488 (2005). https://doi.org/10.1103/RevModPhys.77.427
Machine learning in nuclear physics at low and intermediate energies. Sci. China Phys. Mech. Astron. 66.
Machine learning transforms the inference of the nuclear equation of state. Front. Phys. 18.
Machine learning and the physical sciences. Rev. Mod. Phys. 91.
Colloquium: Machine learning in nuclear physics. Rev. Mod. Phys. 94.
Machine learning in nuclear physics at low and intermediate energies. Sci. China Phys. Mech. Astron. 66.
Machine learning the nuclear mass. Nucl. Sci. Tech. 32, 109 (2021). https://doi.org/10.1007/s41365-021-00956-1
Physically interpretable machine learning for nuclear masses. Phys. Rev. C 106.
Nuclear masses learned from a probabilistic neural network. Phys. Rev. C 106.
Performance of the Levenberg-Marquardt neural network approach in nuclear mass prediction. J. Phys. G Nucl. Part. Phys. 44.
Nuclear mass predictions based on Bayesian neural network approach with pairing and shell effects. Phys. Lett. B 778, 48-53 (2018). https://doi.org/10.1016/j.physletb.2018.01.002
Refining mass formulas for astrophysical applications: A Bayesian neural network approach. Phys. Rev. C 96.
Deep learning approach to nuclear masses and α-decay half-lives. Phys. Rev. C 105.
Nuclear mass predictions with machine learning reaching the accuracy required by r-process studies. Phys. Rev. C 106.
High precision nuclear mass predictions towards a hundred kilo-electron-volt accuracy. Sci. Bull. 63, 759-764 (2018). https://doi.org/10.1016/j.scib.2018.05.009
Comparative study of radial basis function and Bayesian neural network approaches in nuclear mass predictions. Phys. Rev. C 100.
Trees and forests in nuclear physics. J. Phys. G Nucl. Part. Phys. 47.
Reliable calculations of nuclear binding energies by the Gaussian process of machine learning. Nucl. Sci. Tech. 35, 105 (2024). https://doi.org/10.1007/s41365-024-01463-9
Boosted decision trees in the era of new physics: a smuon analysis case study. J. High Energy Phys. 2022, 15 (2022). https://doi.org/10.1007/JHEP04(2022)015
Phase Transition Study Meets Machine Learning. Chin. Phys. Lett. 40.
Bayesian inference of neutron-skin thickness and neutron-star observables based on effective nuclear interactions. Sci. China Phys. Mech. Astron. 67.
Bayesian evaluation of charge yields of fission fragments of 239U. Phys. Rev. C 103.
Bayesian evaluation of incomplete fission yields. Phys. Rev. Lett. 123.
Bayesian approach to heterogeneous data fusion of imperfect fission yields for augmented evaluations. Phys. Rev. C 106.
Quantifying uncertainties on fission fragment mass yields with mixture density networks. J. Phys. G Nucl. Part. Phys. 47.
β-delayed one-neutron emission probabilities within a neural network model. Phys. Rev. C 104.
Predictions of nuclear β-decay half-lives with machine learning and their impact on r-process nucleosynthesis. Phys. Rev. C 99.
Random forest-based prediction of decay modes and half-lives of superheavy nuclei. Nucl. Sci. Tech. 34, 204 (2023). https://doi.org/10.1007/s41365-023-01354-5
Investigation of β--decay half-life and delayed neutron emission with uncertainty analysis. Nucl. Sci. Tech. 34, 9 (2023). https://doi.org/10.1007/s41365-022-01153-4
Extrapolation of nuclear structure observables with artificial neural networks. Phys. Rev. C 100.
Deep learning: A tool for computational nuclear physics. arXiv:1803.03215 (2018).
Nuclear charge radii: density functional theory meets Bayesian neural networks. J. Phys. G Nucl. Part. Phys. 43.
Calculation of nuclear charge radii with a trained feed-forward neural network. Phys. Rev. C 102.
Predictions of nuclear charge radii and physical interpretations based on the naive Bayesian probability classifier. Phys. Rev. C 101.
The description of giant dipole resonance key parameters with multitask neural networks. Phys. Lett. B 815.
Isotopic cross-sections in proton induced spallation reactions based on the Bayesian neural network method. Chin. Phys. C 44.
Constraining the Woods-Saxon potential in fusion reactions based on the neural network. Phys. Rev. C 109.
Estimations of first 2+ energy states of even-even nuclei by using artificial neural networks. Indian J. Phys. 96, 1791-1797 (2020). https://doi.org/10.1007/s12648-021-02099-w
Study of nuclear low-lying excitation spectra with the Bayesian neural network approach. Phys. Lett. B 830.
LightGBM: a highly efficient gradient boosting decision tree. In Advances in Neural Information Processing Systems (2017).
Tables of E2 transition probabilities from the first 2+ states in even-even nuclei. At. Data Nucl. Data Tables 107, 1-139 (2016). https://doi.org/10.1016/j.adt.2015.10.001
First spectroscopy of the near drip-line nucleus 40Mg. Phys. Rev. Lett. 122.
A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems (2017).

The authors declare that they have no competing interests.