Introduction
The atomic nucleus is a complex many-body quantum system. Conventionally, a many-body Hamiltonian or Lagrangian must be constructed based on reliable interactions to investigate this complex system. However, such a task is usually challenging, as the interactions in many-body problems are strongly entangled with the structures. Thus, the self-consistency requirement under a certain ansatz leads to a vague or inaccurate many-body Hamiltonian. Fortunately, if only the regularity and robust properties of many-body systems that are independent of the interaction details are required, the vagueness of the Hamiltonian offers an alternative perspective: random numbers can serve as the parameters of nuclear interactions; that is, random interactions can be used to statistically probe the robust regularity of nuclei.
The study of random interactions can be traced back to the investigation of Wigner’s random matrix theory (RMT) [1], where random numbers were used as the matrix elements of the many-body Hamiltonian. The diagonalization of these random matrices yields spectral statistical properties that agree with experimental data. The spectral properties of the RMT were further linked to quantum chaos [2]. In the 1970s, Wong, Bohigas, et al. [3-5] randomized the two-body interaction matrix elements in shell-model calculations [6, 7] to quantitatively demonstrate the phenomenon of quantum chaos in nuclei [5, 8-11]. The shell-model calculations with random interactions generate an ensemble of virtual nuclei, which is known as the two-body random ensemble (TBRE). A study based on TBRE revealed that certain robust features of nuclei may be independent of the specific details of the interaction.
Following this ensemble, Johnson, Bertsch, et al. [12, 13] reported a series of robust and interaction-independent statistical properties of low-lying states in nuclei. A significant finding was the "predominance of the spin-zero ground state" in even-even nuclei: even-even nuclei exhibited a considerably higher probability of having spin-zero ground states than the fraction of spin-zero configurations in the entire shell-model space. Subsequently, this phenomenon was observed in the interacting boson model (IBM) [14-16]. The spin-zero ground states of even-even nuclei are conventionally attributed to the short-range nature of the nuclear force. However, in the TBRE, the interactions are entirely random and no specific force is predominant. The predominance of the spin-zero ground state in the TBRE therefore contradicts the conventional understanding of how spin-zero ground states emerge in even-even systems. Considerable effort has been devoted to understanding this robust property of the TBRE, which has proven to be significantly challenging and reflects the complexity of the quantum many-body problem. Phenomenological attempts include studies of the distribution of the lowest eigenvalues for each spin [14] and its width [17], geometric chaos of spin coupling [18], maximum and minimum diagonal matrix elements [19], the IBM-limit of the spin distribution in the IBM with the TBRE [20-22], wave-function properties of different spin ground states [23, 24], energy-scale features of different spin ground states [25], and the correlation between the probability of spin-zero ground states and the central values of the distribution of two-body matrix elements [26]. To explain this phenomenon, the probability distributions of various spin states as ground states must be calculated mathematically. However, nuclear models are typically nonlinear systems, to which statistical theories are difficult to apply.
Therefore, several empirical rules have been proposed to predict the probability distribution of ground-state spins. For example, Kusnezov et al. applied the random polynomial method [24] to determine a priori the probability distribution for sp bosons and obtained results consistent with those obtained by Bijker et al. using mean-field methods [21, 22]. Chau et al. discussed the cases of d-boson systems and four fermions in the f7/2 shell, demonstrating the correlation between specific ground states and the geometric shapes determined by nuclear observables, and predicting the probabilities for the ground-state spin [27]. Zhao et al. suggested that the spins of ground states in the TBRE may be associated with specific two-body interaction matrix elements and thus proposed an empirical approach [28] to predict the distribution of ground-state spins. The correlation between the ground-state spin and two-body interaction matrix elements in this empirical approach is also crucial in our work.
Because the nonlinearity of the nuclear model is too complex to overcome, one can take an indirect approach to explain the origin of the predominance of spin-zero ground states. This can be achieved by using a sufficiently simple nonlinear model to simulate the behavior of the shell model and studying the spin-determination mechanism therein, which may provide more insight from a different perspective. The neural network (NN) model is a potential candidate for such simulations owing to its powerful learning, prediction, and adaptation capabilities, which have been successfully applied in diverse fields such as language translation, speech recognition, computer vision, and even complex physical systems [29-32]. Specifically, NN models have been extensively used in nuclear structure studies to predict unknown nuclear properties using existing experimental data. These properties include the mass [33-35], charge radii [36, 37], low-lying excitation spectra [38, 39], and β decay lifetimes [40]. However, most of these studies only utilized the fitting capacity of the NN without fully exploring its classification capability for nuclear structure research.
In this study, we attempted to distinguish between samples with different ground-state spins in the TBRE by adopting the classification capability of an NN with supervised learning. The adopted NN was trained using the interaction matrix elements from the TBRE samples as features and the ground-state spin as the label. In this process, the NN learned the behavior of the ground-state spin in the TBRE, as well as the specific correlations between the interaction elements and the ground-state spin, as described in the empirical approach [28]. A significant advantage of using the NN in the TBRE study lies in the ability of the TBRE to provide nearly infinite independent samples for NN training, thereby avoiding overfitting. This enhanced the generalization ability of the NN and facilitated the simulation of the shell-model production of the ground-state spin. We present the performance of the NN in predicting the ground-state spins and reproducing their distribution in the TBRE. The proposed NN architecture may serve as a valuable benchmark for other classification-based applications.
MODEL FRAMEWORK
Two-Body Random Ensemble (TBRE)
In the TBRE, the nuclear Hamiltonian includes only two-body interactions and is expressed as follows:

H = Σ_{j1≤j2, j3≤j4, J} G_J(j1 j2; j3 j4) Σ_M A†_{JM}(j1 j2) A_{JM}(j3 j4).

In the TBRE, the matrix elements G_J(j1 j2; j3 j4) are independent random numbers drawn from a Gaussian distribution with zero mean and unit variance. Here, A†_{JM}(j1 j2) creates a nucleon pair in the orbits j1 and j2 coupled to angular momentum J with projection M, and A_{JM}(j3 j4) is the corresponding pair-annihilation operator.
Classification neural network
The classification model presented in this paper utilizes an NN consisting of an input layer, one or more hidden layers, and an output layer. The structure is illustrated in Fig. 1 (with one hidden layer shown as an example). The input layer receives the matrix elements of the two-body interactions in the shell model, specifically the G_J values that define each TBRE sample.
Table 2. Prediction accuracy (%) of the NN model with different activation functions.

| Activation function | (h11/2)4 | 18Ne | 20Ne | 22Ne | 46Ca |
|---|---|---|---|---|---|
| Sigmoid | 95.36 | 79.21 | 66.95 | 77.22 | 55.34 |
| Tanh | 96.15 | 85.10 | 67.69 | 78.39 | 55.67 |
| ReLU | 96.69 | 86.36 | 67.88 | 78.62 | 55.62 |
[Fig. 1. Structure of the classification NN, with one hidden layer shown as an example; image omitted.]
The output layer introduces the softmax function [42], which transforms the unnormalized output values into non-negative probability values whose sum is one.
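As a minimal illustration (not the authors' implementation), the softmax transformation can be sketched in a few lines of Python:

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    """Map unnormalized scores to non-negative probabilities that sum to one."""
    z = z - z.max()          # shift by the maximum for numerical stability
    e = np.exp(z)
    return e / e.sum()

# the largest score receives the largest probability
p = softmax(np.array([2.0, 1.0, 0.1]))
```

Shifting by the maximum leaves the result unchanged (the common factor cancels) while preventing overflow in `np.exp`.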
To train the NN model, we first prepared a training set consisting of N samples, each comprising the two-body interaction matrix elements of one TBRE sample as the input features and the corresponding shell-model ground-state spin as the label.
Shell Model Spaces
We performed approximately 100,000 TBRE calculations in six model spaces. These included four valence nucleons in the f7/2 orbital of a virtual nucleus (simply expressed as (f7/2)4), four valence nucleons in the h11/2 orbital ((h11/2)4), and the 18Ne, 20Ne, 22Ne, and 46Ca model spaces.
In the (f7/2)4 space, the eigenvalues from the shell model are simple linear combinations of the two-body interaction matrix elements, as shown in Eq. (1.81) of Ref. [44]. The ground-state spin is the spin associated with the lowest eigenvalue. An NN without a hidden layer likewise forms linear combinations of the input two-body interaction matrix elements and then applies the softmax operation to identify the smallest combination. Both models therefore employ similar calculation processes to determine the ground-state spin: the weight parameters d_ji in Eq. (4) of the NN correspond to the cfp coefficients [44] in the shell model, and the softmax input of the NN is equivalent to the energy eigenvalues of the shell model. Because a hidden layer would complicate the NN model and violate this correspondence, we excluded hidden layers from our NN model for the (f7/2)4 space.
For the (h11/2)4 and 18Ne spaces, the complexity increases beyond that of the (f7/2)4 space. Some eigenvalues remain linear combinations of the two-body interaction matrix elements, whereas others must be obtained through diagonalization. Although diagonalizations of matrices with dimensions below five are analytical, the weight parameters d_ji can no longer be related to the cfp coefficients. Therefore, hidden layers can enhance the adaptability of NN models to the nonlinear diagonalization [45]. In the 20,22Ne and 46Ca spaces, the relationship between the eigenvalues and the cfp coefficients is completely nonlinear, and the transcendental nature of some of these relationships further necessitates hidden layers. Table 1 presents the input and output settings of our NN models for the different spaces.
Table 1. Input (number of two-body matrix elements) and output (number of output classes) settings of the NN models for each model space.

| Model space | Input | Output |
|---|---|---|
| (f7/2)4 | 4 | 5 |
| (h11/2)4 | 6 | 10 |
| 18Ne | 30 | 5 |
| 20Ne | 30 | 7 |
| 22Ne | 30 | 8 |
| 46Ca | 94 | 13 |
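A TBRE sample itself is simple to generate. The following sketch (our illustration, not the authors' code) draws the independent two-body matrix elements from a standard Gaussian, with the element counts taken from the Input column of Table 1:

```python
import numpy as np

def sample_tbre(n_elements: int, rng: np.random.Generator) -> np.ndarray:
    """One TBRE sample: independent two-body matrix elements G_J drawn
    from a Gaussian distribution with zero mean and unit variance."""
    return rng.standard_normal(n_elements)

rng = np.random.default_rng(seed=0)
sample = sample_tbre(4, rng)   # 4 independent elements in the (f7/2)^4 space
```

Feeding each such sample to the shell model yields the ground-state spin label used for training.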
Optimization of the network architecture
For the (f7/2)4 space, the shell-model eigenvalues were linear combinations of the two-body interaction matrix elements, as was the softmax input of the NN model. Therefore, both models employed similar calculation processes for determining the ground-state spin, and, as mentioned in Sect. 2.3, hidden layers were not required. In fact, an NN model without a hidden layer already predicted ground-state spins in the (f7/2)4 space with up to 98% accuracy.
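Such a hidden-layer-free classifier is simply softmax regression on the matrix elements. Below is a self-contained sketch with synthetic stand-in data (random features and labels, not the actual TBRE training set):

```python
import numpy as np

rng = np.random.default_rng(1)
n_features, n_classes, n_samples = 4, 5, 1000  # cf. the (f7/2)^4 row of Table 1

X = rng.standard_normal((n_samples, n_features))   # stand-in G_J features
y = rng.integers(0, n_classes, n_samples)          # stand-in spin labels

W = np.zeros((n_features, n_classes))
b = np.zeros(n_classes)
lr = 0.1

for _ in range(200):
    logits = X @ W + b
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)              # row-wise softmax
    grad = p.copy()
    grad[np.arange(n_samples), y] -= 1.0           # d(cross-entropy)/d(logits)
    W -= lr * (X.T @ grad) / n_samples
    b -= lr * grad.mean(axis=0)

pred = (X @ W + b).argmax(axis=1)                  # predicted spin class
```

With real TBRE data, the weights W would play the role of the cfp coefficients discussed in Sect. 2.3.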
For the (h11/2)4 space, because certain eigenvalues displayed nonlinear correlations with the two-body interaction matrix elements, incorporating hidden layers into the model was essential for enhancing the prediction accuracy. We first added one hidden layer and empirically selected 64 hidden nodes for the test run. The results demonstrated an accuracy of 97%, which was a satisfactory outcome.
For the remaining four spaces, the accuracies achieved with an arbitrary architecture were not always optimal. Therefore, we attempted to improve the prediction accuracy of our NN classification model by including more hidden layers and increasing the number of neural nodes in the 18Ne, 20Ne, 22Ne, and 46Ca model spaces.
First, the prediction accuracy improved when the number of neural nodes was doubled, as indicated by the difference between the prediction accuracies with N/2 and N neural nodes shown in Fig. 2. The absence of negative differences (Fig. 2) suggests that doubling the number of neural nodes consistently improves the results, as expected. Furthermore, the differences peaked at N = 32 for all four model spaces; for N > 32, the prediction accuracy improved by only 0-2%. Considering that more nodes entail additional computational overhead, we consider 32 nodes an optimal and balanced choice for this study.
[Fig. 2. Difference in prediction accuracy between networks with N/2 and N neural nodes; image omitted.]
Furthermore, we investigated the impact of the number of hidden layers on the prediction accuracy. Employing 32 neural nodes in each layer, we present in Fig. 3 the difference in prediction accuracy between networks with different numbers of hidden layers; the differences are small, indicating that additional hidden layers do not substantially improve the accuracy.
[Fig. 3. Difference in prediction accuracy between networks with different numbers of hidden layers; image omitted.]
Activation functions [46] play a crucial role in NNs by enabling the learning of abstract features via nonlinear transformations. Common activation functions are the sigmoid (also called the logistic function), tanh (hyperbolic tangent), and ReLU functions. Table 2 presents the impact of different activation functions on the prediction accuracy of our NN model. The tanh and ReLU functions exhibit similar prediction accuracies for the five model spaces. However, because the tanh activation function involves an exponential operation, its computational overhead may be larger; therefore, we used the ReLU function throughout this study.
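For reference, the three activation functions compared in Table 2 have the following standard definitions (our illustration, not code from this work):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))    # maps to (0, 1)

def tanh(x):
    return np.tanh(x)                  # maps to (-1, 1)

def relu(x):
    return np.maximum(0.0, x)          # zero for negative inputs, identity otherwise
```

ReLU avoids the exponential evaluations required by sigmoid and tanh, which is the computational argument made above.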
In summary, the optimal network configuration for the 18Ne, 20Ne, 22Ne, and 46Ca model spaces considered in our analysis comprised one hidden layer with 32 ReLU neural nodes.
RESULTS AND ANALYSIS
Model comparison
As shown in Fig. 1, we adopted a fully connected NN model. However, considering the recent application of Bayesian NNs (BNNs) in nuclear physics [33, 34, 37, 39] and the great success of convolutional NNs (CNNs) [47-50] and recurrent NNs (RNNs) [51-53], we compared these four networks in terms of accuracy, as shown in Fig. 4.
[Fig. 4. Accuracy comparison of the BNN, CNN, RNN, and the adopted network; image omitted.]
The BNN was implemented via Bayesian sampling of the weights and biases, with a variational inference algorithm used to optimize training; the model ran for 1000 iterations to update the loss and accuracy, and during prediction we sampled 1000 times to obtain stable probability estimates. The CNN comprised convolutional, pooling, fully connected, and softmax layers. In the convolutional layer, the input channel count equaled the number of input features, the output channel count was set to 16, and the convolution kernel size was set to three to facilitate feature extraction. In the pooling layer, the pooling kernel size and stride were set to two to reduce the dimensions of the feature map. The fully connected layer mapped the features extracted by the convolutional layer to the final classification result according to the feature dimensions and the number of categories. The model parameters were iteratively adjusted throughout training to ensure effective feature extraction and classification. A notable characteristic of the RNN is its ability to transmit and share information through recurrent connections; its forward propagation function generates the output from the input and produces the prediction through the fully connected layer and softmax function. All four models, including the adopted classic softmax model, shared consistent hyperparameters: a training-to-test set ratio of 2:1, a learning rate of 0.01, over 1000 training epochs, the ReLU activation function, and the Adam optimizer.
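A minimal sketch of a CNN along the lines described above (assuming PyTorch; here the input is treated as a single-channel sequence of matrix elements, and the sizes are illustrative, e.g. 30 features and 7 spin classes as in the 20Ne space):

```python
import torch
import torch.nn as nn

n_features, n_classes = 30, 7

cnn = nn.Sequential(
    nn.Conv1d(in_channels=1, out_channels=16, kernel_size=3),  # feature extraction
    nn.ReLU(),
    nn.MaxPool1d(kernel_size=2, stride=2),                     # halve the feature map
    nn.Flatten(),
    # Conv1d shortens the sequence to n_features - 2; pooling then halves it
    nn.Linear(16 * ((n_features - 2) // 2), n_classes),        # class scores
)

x = torch.randn(8, 1, n_features)   # batch of 8 TBRE-like samples
scores = cnn(x)                     # shape: (8, n_classes)
```

Class probabilities follow by applying softmax to `scores`; during training, `nn.CrossEntropyLoss` applies it internally.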
According to Fig. 4, the CNN performs the worst, whereas the accuracies of the BNN and RNN are similar to that of the adopted network. However, the adopted network trained faster and required fewer computational resources. Therefore, we believe that the adopted network was the optimal choice for our study.
Feature selection
Feature selection is crucial in machine learning and data analysis because it enhances model performance, mitigates the risk of overfitting, boosts computational efficiency, streamlines model interpretation, and addresses problems related to noise and redundant information. It involves conducting a correlation analysis to assess the relationship between each feature and the target variable; the features exhibiting a strong correlation with the target variable are then selected, whereas the others are excluded from further training. Considering the nonlinear nature of both our feature and label data, we utilized the Spearman correlation coefficient ρ [54] for feature selection, ρ = 1 - 6 Σ_i d_i² / (n(n² - 1)), where d_i is the difference between the ranks of the i-th sample's feature value and label, and n is the number of samples.
We calculated the ρ coefficients for the four high-dimensional model spaces, namely the 18-22Ne and 46Ca model spaces, and used different thresholds to select the input features with strong correlations. Only features with |ρ| values greater than the threshold were retained for further training. Table 3 lists the number of two-body matrix elements, that is, the number of input features above a given threshold, and the corresponding accuracy.
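A sketch of this threshold-based selection (assuming SciPy and stand-in random data; `0.01` mirrors one of the thresholds used in Table 3):

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 30))   # 500 TBRE-like samples, 30 matrix elements
y = rng.integers(0, 7, 500)          # stand-in ground-state spin labels

# Spearman rho between each feature column and the label
rho = np.array([spearmanr(X[:, j], y)[0] for j in range(X.shape[1])])

threshold = 0.01
keep = np.abs(rho) > threshold       # retain only strongly correlated features
X_selected = X[:, keep]
```

The retained columns of `X_selected` are then passed to the network in place of the full feature set.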
Table 3. Prediction accuracy (%) and number of retained input features for different Spearman-ρ thresholds.

| Threshold | 0.1 | 0.01 | 0.001 | 0 |
|---|---|---|---|---|
| 18Ne accuracy (%) | 70 | 77 | 85 | 86 |
| 18Ne input number | 5 | 14 | 28 | 30 |
| 20Ne accuracy (%) | 60 | 65 | 66 | 68 |
| 20Ne input number | 6 | 17 | 25 | 30 |
| 22Ne accuracy (%) | 71 | 74 | 76 | 80 |
| 22Ne input number | 4 | 16 | 25 | 30 |
| 46Ca accuracy (%) | 53 | 56 | 56 | 56 |
| 46Ca input number | 1 | 30 | 80 | 94 |
According to Table 3, the number of inputs after feature selection decreases with increasing threshold values, as anticipated. However, this reduction in the input number is accompanied by declining performance. Consequently, each input two-body matrix element within the four model spaces significantly affects the output; thus, excluding any of them from network training is not recommended.
Accuracy
Figure 5 presents the evolution of the loss function during training. As expected, the loss functions of the six model spaces converge, indicating that the network parameters have been optimized. The loss values of the two single-j model spaces, namely (f7/2)4 and (h11/2)4, decrease most dramatically because the single-j spaces are simpler than the other four model spaces. Furthermore, 46Ca, 20Ne, 22Ne, and 18Ne all converge to large loss values, corresponding to the unsatisfactory accuracies listed in Table 4. Thus, increasing the number of training epochs does not improve the accuracy.
Table 4. Dimension of each model space, prediction accuracy (%) of the NN, and consistency (%) of the G-I correlations between the SM and NN.

| Model space | (f7/2)4 | (h11/2)4 | 18Ne | 20Ne | 22Ne | 46Ca |
|---|---|---|---|---|---|---|
| Dimension | 8 | 23 | 14 | 81 | 142 | 3952 |
| Accuracy (%) | 98 | 97 | 86 | 68 | 80 | 56 |
| Consistency (%) | 100 | 100 | 100 | 60 | 80 | 74 |
[Fig. 5. Evolution of the loss function during training for the six model spaces; image omitted.]
Table 4 shows the correlation between the prediction accuracy (%) of the NN for the ground-state spin and the dimensions of the six model spaces under investigation. The shell-model eigenvalues of the (f7/2)4 space are linear combinations of the two-body interaction matrix elements; therefore, the NN model is equivalent to a linear (softmax-regression) model and achieves a high prediction accuracy of up to 98%. The accuracy is 97% for the (h11/2)4 space with one hidden layer, even though some eigenvalues in the (h11/2)4 space exhibit nonlinear relationships with the two-body interaction matrix elements.
For the remaining four spaces, the accuracy decreased significantly with increasing dimensions. We obtained a Pearson correlation coefficient [55] of -0.753 between the prediction accuracy (%) and the dimensions on a logarithmic scale, indicating a negative correlation between the two variables. As the dimensions of the space and the shell complexity increased, the ability of the NN model to predict the ground-state spin diminished. As shown in Figs. 2 and 3, introducing additional hidden layers or neural nodes does not significantly improve the performance of general classification NNs. Thus, to accurately predict the ground-state spin in the TBRE, the generalization capability of the NN is strongly challenged by the complexity of the quantum many-body system, and a more specialized NN architecture and activation function should be designed according to the cfp coefficient properties and the diagonalization process.
To obtain a more detailed picture of the prediction performance of the NN model for TBRE samples with a specific spin, Fig. 6 presents the confusion matrix for the NN models of the six model spaces. In the confusion matrices, the y- and x-axes represent the ground-state spin predicted by the NN (INN) and that obtained from the shell model calculations (ISM), respectively. The gray scale indicates the probability of the shell model calculation yielding a ground-state spin of ISM in the samples for which the NN predicts a ground-state spin of INN. The main diagonal of the confusion matrix appears predominantly dark, indicating a reasonably high degree of consistency between the NN and shell model for a specific ground-state spin. From a statistical perspective, the NN captures some correlations between the ground-state spin and two-body interaction matrix elements of the TBRE.
[Fig. 6. Confusion matrices of the NN models for the six model spaces; image omitted.]
The data in Table 4 indicates that the prediction accuracy for the ground-state spin of the 20Ne nucleus is lower than that for higher-dimensional 22Ne. This finding is consistent with the observations shown in Fig. 6. Specifically, for 20Ne, the difference in colors between the main diagonal and other regions is less pronounced than in other nuclei. This suggests that the prediction of the ground-state spin in 20Ne space poses greater challenges to the NN, which may be related to some special properties of the cfp coefficients of 20Ne. Further exploration of specific multibody complexity features in 20Ne space is desirable.
To further evaluate the statistical performance of the NN model, Fig. 7 presents the distributions of the ground-state spins I (P_I) obtained using both the shell model and the well-trained NN model with random interactions. The NN model is consistent with the shell model for all model spaces and partially succeeds in capturing the robust statistical properties of the TBRE.
[Fig. 7. Distributions P_I of the ground-state spins from the shell model and the NN model; image omitted.]
G-I correlation
To predict P_I in the TBRE, Zhao et al. proposed a general empirical approach [19]. In their approach, one of the two-body interaction matrix elements was set to -1, whereas the rest were set to 0. This interaction was then input into the shell model, and the output ground-state spin I was recorded. If N independent two-body interaction matrix elements existed in the model space, the process was repeated N times, setting a different matrix element equal to -1 each time. Finally, the number of times the spin I was observed as the ground-state spin in the N numerical experiments was denoted N_I. The probability of spin I in the ground state was then estimated as P_I = N_I / N.
The empirical approach [19] attributes "specific spin I as the ground-state spin" to a few two-body interaction matrix elements. If more two-body interaction matrix elements are responsible for spin I = 0 than for any other spin, the empirical rule provides a phenomenological explanation for the dominance of the spin-zero ground state.
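The empirical procedure is easy to express in code. The sketch below uses a toy stand-in for the shell-model (or NN) ground-state-spin call, which in practice would be the actual solver:

```python
import numpy as np

def empirical_spin_distribution(n_elements, ground_state_spin):
    """Zhao et al.'s empirical rule: set one matrix element to -1 (rest to 0),
    record the resulting ground-state spin I, and normalize the counts N_I."""
    counts = {}
    for k in range(n_elements):
        g = np.zeros(n_elements)
        g[k] = -1.0                          # only the k-th element is active
        spin = ground_state_spin(g)          # stand-in for the shell model / NN
        counts[spin] = counts.get(spin, 0) + 1
    return {spin: n / n_elements for spin, n in counts.items()}  # P_I = N_I / N

# Toy stand-in: the 'ground-state spin' is the active element's index mod 3
P = empirical_spin_distribution(6, lambda g: int(np.argmin(g)) % 3)
```

Replacing the toy callable with the trained NN gives the NN-based P_I estimates discussed below in the G-I correlation analysis.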
Note that the empirical approach hints at a correlation between the two-body interaction matrix elements and the ground-state spin, and this correlation determines the ground-state spin distribution shown in Fig. 7. Thus, an NN model that predicts the TBRE ground-state spin distribution well should also produce a similar correlation between the two-body interaction matrix elements (G_J) and the ground-state spin I, which we denote the G-I correlation.
In Tables 5, 6, and 7, the G-I correlations between the two-body interaction matrix elements (G_J) and the ground-state spins are listed for both the shell model (SM) and the NN model.
Table 5. G-I correlations in the (f7/2)4 and (h11/2)4 spaces: the ground-state spin I obtained when only G_J = -1, from the shell model (SM) and the NN.

| G_J | (f7/2)4 SM | (f7/2)4 NN | (h11/2)4 SM | (h11/2)4 NN |
|---|---|---|---|---|
| G0 | 0 | 0 | 0 | 0 |
| G2 | 4 | 4 | 4 | 4 |
| G4 | 2 | 2 | 0 | 0 |
| G6 | 8 | 8 | 4 | 4 |
| G8 | | | 8 | 8 |
| G10 | | | 16 | 16 |
18Ne | 20Ne | 22Ne | ||||
---|---|---|---|---|---|---|
SM | NN | SM | NN | SM | NN | |
0 | 0 | 0~4 | 0 | 0~6 | 0 | |
0 | 0 | 0,2,4 | 0 | 0,2,4 | 0 | |
0 | 0 | 0 | 0 | 0,2 | 0 | |
0 | 0 | 0,2~4 | 0 | 0~5 | 0 | |
0 | 0 | 0 | 0 | 0 | 0 | |
0 | 0 | 0 | 0 | 0~2 | 0 | |
1 | 1 | 1 | 0 | 0 | 0 | |
1 | 1 | 2 | 0 | 0 | 0 | |
1 | 1 | 0 | 0 | 0 | 3 | |
2 | 2 | 0,2 | 0 | 0 | 0 | |
2 | 2 | 2 | 0 | 2 | 2 | |
2 | 2 | 1~4 | 0 | 0~6 | 0 | |
2 | 2 | 0 | 0 | 0 | 2 | |
2 | 2 | 0 | 0 | 0 | 0 | |
2 | 2 | 4 | 2 | 0,2,4 | 2 | |
2 | 2 | 0 | 0 | 0 | 0 | |
2 | 2 | 0 | 2 | 0 | 0 | |
2 | 2 | 2 | 2 | 0~4 | 2 | |
2 | 2 | 0 | 0 | 0,2~4 | 0 | |
2 | 2 | 2 | 0 | 2,3 | 0 | |
2 | 2 | 0 | 0 | 0 | 0 | |
2 | 2 | 2 | 0 | 0 | 0 | |
2 | 2 | 0 | 0 | 0 | 0 | |
2 | 2 | 2 | 0 | 0 | 0 | |
3 | 3 | 5 | 2 | 0,2,4 | 3 | |
3 | 3 | 4 | 0 | 3 | 0 | |
3 | 3 | 0 | 0 | 0 | 0 | |
4 | 4 | 6 | 6 | 6 | 6 | |
4 | 4 | 4 | 0 | 2,3 | 0 | |
4 | 4 | 4 | 4 | 0 | 0 |
SM | NN | SM | NN | ||
---|---|---|---|---|---|
0~10 | 0 | 0 | 0 | ||
0~10 | 0 | 0 | 0 | ||
0,2~6 | 0 | 4 | 0 | ||
0~4 | 0 | 0 | 0 | ||
0~10 | 0 | 2 | 2 | ||
0,2,4,6 | 0 | 0,2,4 | 3 | ||
0 | 0 | 3 | 0 | ||
0~6 | 0 | 3 | 0 | ||
0 | 0 | 4 | 0 | ||
0~4 | 0 | 4 | 0 | ||
0 | 0 | 0,2,4~8 | 2 | ||
0 | 0 | 0 | 0 | ||
0,9 | 0 | 0 | 0 | ||
0 | 0 | 0 | 0 | ||
0 | 0 | 3 | 0 | ||
1,8 | 0 | 0 | 2 | ||
0 | 0 | 0 | 0 | ||
2 | 0 | 0 | 0 | ||
0~10 | 0 | 0 | 0 | ||
0,4,6 | 0 | 0,10 | 0 | ||
0 | 0 | 0,2,4~6,8 | 8 | ||
0 | 0 | 2 | 0 | ||
0,9 | 0 | 6 | 0 | ||
0 | 0 | 0 | 0 | ||
0,2,4 | 2 | 1 | 0 | ||
0,2,4,6 | 0 | 0~4 | 0 | ||
0 | 0 | 6 | 0 | ||
0 | 0 | 2 | 0 | ||
0~8 | 0 | 1~6 | 0 | ||
2 | 0 | 0,9 | 0 | ||
2 | 0 | 4 | 0 | ||
0~6 | 0 | 0 | 0 | ||
1,2,4,5 | 0 | 0 | 0 | ||
0 | 0 | 3 | 0 | ||
0,2~4,6 | 0 | 0,2~4 | 0 | ||
0 | 0 | 0 | 0 | ||
0,2~4 | 0 | 0 | 0 | ||
0 | 0 | 0,10 | 0 | ||
0 | 0 | 0 | 0 | ||
0,2~4,6 | 0 | 0 | 0 | ||
0,10 | 0 | 4 | 4 | ||
0 | 0 | 10 | 9 | ||
0,9 | 0 | 0 | 0 | ||
0 | 0 | 1 | 0 | ||
0 | 0 | 12 | 10 | ||
0 | 0 | 0 | 0 | ||
0 | 0 | 6 | 6 |
According to Table 5, in the (f7/2)4 and (h11/2)4 model spaces, the NN model produces G-I correlations that are perfectly consistent with those of the shell model. This explains the agreement between the shell and NN models shown in Figs. 6(a, b) and 7(a, b). In Table 6, this perfect consistency can also be observed for the 18Ne space, in accordance with Figs. 6(c) and 7(c). However, as the dimensions increase, the consistency for 20,22Ne in Table 6 and for 46Ca in Table 7 gradually decreases. In the 20Ne space, 12 of the 30 G-I correlations (40%) are inconsistent between the SM and NN; in 22Ne, six out of 30 (20%) are inconsistent; and in 46Ca, 24 out of 94 (~26%) are inconsistent. These inconsistency rates correlate with the prediction accuracies for the different model spaces listed in Table 4.
Furthermore, the empirical approach is also applicable to the trained NN model: one of the inputs of the NN is set to -1 and the rest to 0, and the ground-state spin (I) output by the network is recorded. This approach also reveals the correlation between the interaction matrix elements and the predicted ground-state spin, as well as the P_I distribution, of the well-trained NN model. Table 5 presents the correlations between the matrix elements and the spin obtained from the shell and NN models. The correlations from both models are identical, indicating that our NN model successfully learns the G-I correlation, as suggested by the empirical approach. Thus, it accurately reproduces the ground-state spin of the shell model for the simple model spaces.
With the G-I correlation from such an NN model, we counted the ground-state spins I emerging in the G-I correlation and then normalized the counts to obtain the P_I distribution, as guided by the empirical approach. The P_I distributions based on the empirical approach using the NN model are shown in Fig. 7. The empirical approach applied to the SM and NN models yields reasonably consistent P_I distributions for all model spaces, suggesting that the NN effectively captures the correlation between the two-body interaction matrix elements and the ground-state spin, which further explains its remarkable performance in reproducing the statistical properties of the ground-state spin in the TBRE.
CONCLUSION
This study used an NN model to investigate the distribution of the ground-state spin in the TBRE. Using a softmax classification NN model, we attempted to reproduce the correlation between the matrix elements of the interaction and ground-state spin, as labeled by the shell model, for the TBRE. The reliability of the NN model was analyzed based on its prediction accuracy and consistency with the empirical rules of the PI distribution.
Previous applications of NN models in nuclear physics primarily focused on their strong fitting capabilities. However, the analysis of the ground-state spin distribution in TBRE demonstrated the classification ability of the NN, which is rare in the literature. Furthermore, TBRE provided extensive samples for training NNs, thereby potentially enhancing the performance of the NN model.
In our investigation, we adopted various strategies to enhance network performance, including the introduction of BNN, CNN, and RNN, feature selection, and adjusting the number of neural nodes and hidden layers. However, none of these approaches yielded significant improvements with limited computational resources. Therefore, we must acknowledge that the quantum many-body problem remains a formidable challenge for NN models. Addressing this challenge may necessitate the further development of NN architectures tailored for analyzing the nuclear ground-state spin in the TBRE.
However, NN models still offer some insights into the specific robust statistical properties of the ground-state spin. For example, they effectively capture the distribution of the ground-state spin, as shown in Fig. 7. Moreover, the resulting confusion matrix exhibits dominant diagonal elements, indicating the consistency between the ground-state spin from the shell model and that predicted by the NN model, as shown in Fig. 6. This success can be attributed to the capacity of the NN to replicate the correlation between the ground-state spin and the two-body interaction matrix elements in the shell model, as shown in Tables 5, 6, and 7.
References

Random matrices and chaos in nuclear physics: Nuclear structure. Rev. Mod. Phys. 81(2), 539-589 (2009). https://doi.org/10.1103/RevModPhys.81.539
Spectral properties of the Laplacian and random matrix theories. J. Phys. Lett. 45(21), 1015-1022 (1984). https://doi.org/10.1051/jphyslet:0198400450210101500
Level-density fluctuations and two-body versus multi-body interactions. Nucl. Phys. A 198(1), 188-208 (1972). https://doi.org/10.1016/0375-9474(72)90779-8
Two-body random hamiltonian and level density. Phys. Lett. B 34(4), 261-263 (1971). https://doi.org/10.1016/0370-2693(71)90598-3
Validity of random matrix theories for many-particle systems. Phys. Lett. B 33(7), 449-452 (1970). https://doi.org/10.1016/0370-2693(70)90213-3
On closed shells in nuclei. Phys. Rev. 74(3), 235-239 (1948). https://doi.org/10.1103/PhysRev.74.235
On the "Magic Numbers" in nuclear structure. Phys. Rev. 75(11), 1766 (1949). https://doi.org/10.1103/PhysRev.75.1766.2
The nuclear shell model as a testing ground for many-body quantum chaos. Phys. Rep. 276, 85-176 (1996). https://doi.org/10.1016/S0370-1573(96)00007-5
Random-matrix theories in quantum physics: common concepts. Phys. Rep. 299(4-6), 189-425 (1998). https://doi.org/10.1016/s0370-1573(97)00088-4
Embedded random matrix ensembles for complexity and chaos in finite interacting particle systems. Phys. Rep. 347(3), 223-288 (2001). https://doi.org/10.1016/S0370-1573(00)00113-7
Nuclear structure, random interactions and mesoscopic physics. Phys. Rep. 391(3), 311-352 (2004). https://doi.org/10.1016/j.physrep.2003.10.008
Orderly spectra from random interactions. Phys. Rev. Lett. 80(13), 2749 (1998). https://doi.org/10.1103/PhysRevLett.80.2749
Generalized seniority from random Hamiltonians. Phys. Rev. C 61, 014311 (1999). https://doi.org/10.1103/PhysRevC.61.014311
Band structure from random interactions. Phys. Rev. Lett. 84(3), 420-422 (2000). https://doi.org/10.1103/PhysRevLett.84.420
Robust nuclear observables and constraints on random interactions. Phys. Rev. Lett. 85(7), 1396 (2000). https://doi.org/10.1103/PhysRevLett.85.1396
The interacting boson model. Ann. Phys. 84, 211-231 (1974). https://doi.org/10.1016/0003-4916(74)90300-5
On the dominance of J(P)=0(+) ground states in even-even nuclei from random two-body interactions. Phys. Rev. C 60(2), 021302 (1999). https://doi.org/10.1103/PhysRevC.60.021302
Geometric chaoticity leads to ordered spectra for randomly interacting fermions. Phys. Rev. Lett. 85(19), 4016-4019 (2000). https://doi.org/10.1103/PhysRevLett.85.4016
Towards understanding the probability of 0+ ground states in even-even many-body systems. Phys. Rev. C 64(4), 041301 (2001). https://doi.org/10.1103/PhysRevC.64.041301
Two-body random ensembles: From nuclear spectra to random polynomials. Phys. Rev. Lett. 85(18), 3773 (2000). https://doi.org/10.1103/PhysRevLett.85.3773
Mean-field analysis of interacting boson models with random interactions. Phys. Rev. C 64(6), 061303 (2001). https://doi.org/10.1103/PhysRevC.64.061303
Regular spectra in the vibron model with random interactions. Phys. Rev. C 65(4), 044316 (2002). https://doi.org/10.1103/PhysRevC.65.044316
Spin structure of many-body systems with two-body random interactions. Phys. Rev. C 63, 014307 (2000). https://doi.org/10.1103/physrevc.63.014307
Wave Function Structure in Two-Body Random Matrix Ensembles. Phys. Rev. Lett. 84(20), 4553-4556 (2000). https://doi.org/10.1103/PhysRevLett.84.4553
Nature of order from random two-body interactions. Physica A 301(1), 291-300 (2001). https://doi.org/10.1016/S0378-4371(01)00403-4
Correlation between the probability of spin-zero ground state and TBME in the presence of random interactions. Nucl. Phys. Rev. 37(3), 523-529 (2020). https://doi.org/10.11804/NuclPhysRev.37.2019CNPC15
Geometry of random interactions. Phys. Rev. C 66(6), 061302 (2002). https://doi.org/10.1103/PhysRevC.66.061302
Regularities of many-body systems interacting by a two-body random ensemble. Phys. Rep. 400(1), 1-66 (2004). https://doi.org/10.1016/j.physrep.2004.07.004
Learning and prediction of nuclear stability using neural networks
. Nucl. Phys. A. 540(1-2), 1-26 (1992). https://doi.org/10.1016/0375-9474(92)90191-LPhase transition study meets machine-learning requirements
. Chinese Phys. Lett. 40, 122101 (2023). https://doi.org/10.1088/0256-307X/40/12/122101Machine learning in nuclear physics at low and intermediate energies
. Sci. China Phys. Mech. Astron. 66(8), 282001 (2023). https://doi.org/10.1007/s11433-023-2116-0Machine learning is required in high-energy nuclear physics
. Nucl. Sci. Tech. 34(6), 88 (2023). https://doi.org/10.1007/s41365-023-01233-zNuclear mass predictions for the crustal composition of neutron stars: A Bayesian neural network approach
. Phys. Rev. C. 93, 014311 (2016). https://doi.org/10.1103/physrevc.93.014311Nuclear mass predictions based on Bayesian neural network approach with pairing and shell effects
. Phys. Lett. B. 778, 48-53 (2018). https://doi.org/10.1016/j.physletb.2018.01.002Nuclear mass based on the multi-task learning neural network method
. Nucl. Sci. Tech. 33(4), 48 (2022). https://doi.org/10.1007/s41365-022-01031-zPrediction of nuclear charge density distribution with feedback neural network
. Nucl. Sci. Tech. 33(12), 153 (2022). https://doi.org/10.1007/s41365-022-01140-9Nuclear charge radii: Density functional theory meets Bayesian neural networks
. J. Phys. G. Nucl. Partic. 43(11), 114002 (2016). https://doi.org/10.1088/0954-3899/43/11/114002Studies of nuclear low-lying excitation spectra with multi-task neural network
. Nucl. Phys. Rev. 39(3), 273-280 (2022). https://doi.org/10.11804/NuclPhysRev.39.2022043Study of nuclear low-lying excitation spectra with the Bayesian neural network approach
. Phys. Lett. B. 830, 137-154. (2022). https://doi.org/10.1016/j.physletb.2022.137154Predictions of nuclear β-decay half-lives with machine learning and their impact on r-process nucleosynthesis
. Phys. Rev. C. 99(6), 064307 (2019). https://doi.org/10.1103/PhysRevC.99.064307Deep sparse rectifier neural networks
. Journal of Machine Learning Research 15, 315-323 (2011).Bayesian classification using Gaussian processes
. IEEE Transactions on Pattern Analysis and Machine Intelligence. 20(12), 1342-1351 (1999). https://doi.org/10.1109/34.735807Adam: A method for stochastic optimization
. Computer Science (2014). https://doi.org/10.48550/arXiv.1412.6980Theory of the nuclear shell model
. Phys. Today. 35(1), 73-75 (1980). https://doi.org/10.1016/B978-1-4832-3064-1.50016-4Deep learning
. Nature, 521(7553), 436 (2015). https://doi.org/10.1038/nature14539Application of machine learning for the determination of impact parameters in the 132Sn + 124Sn system
. Phys. Rev. C 104, 034608 (2021). https://doi.org/10.1103/PhysRevC.104.034608Using deep learning to study the equation of state of nuclear matter
. Nucl. Phys. Rev. 37(4), 825-832 (2020). https://doi.org/10.11804/NuclPhysRev.37.2020017Prediction of nuclear charge radii based on a convolutional neural network
, Nucl. Sci. Tech. 34(10), 152 (2023). https://doi.org/10.1007/s41365-023-01308-xStatistics corner: A guide to appropriate use of correlation coefficient in medical research
. Malawi Medical Journal. 24(3), 69-71 (2012). https://doi.org/10.2166/wh.2012.000Pearson’s correlation coefficient
. BMJ (online). 345(jul041), e4483-e4483 (2012). https://doi.org/10.1136/bmj.e4483The authors declare that they have no competing interests.