
NUCLEAR PHYSICS AND INTERDISCIPLINARY RESEARCH

Neural network study of the nuclear ground-state spin distribution within a random interaction ensemble

Deng Liu
Alam Noor A
Zhen‑Zhen Qin
Yang Lei
Nuclear Science and Techniques, Vol. 35, No. 3, Article number 64. Published in print Mar 2024; available online 03 May 2024

The distribution of the nuclear ground-state spin in a two-body random ensemble (TBRE) was studied using a general classification neural network (NN) model with two-body interaction matrix elements as input features and the corresponding ground-state spins as labels or output predictions. Accurately predicting the ground-state spin of each individual sample in the TBRE exceeds the capability of our optimized NNs, reflecting the difficulty of the quantum many-body problem. However, our NN model effectively captured the statistical properties of the ground-state spin, because it learned the empirical regularity of the ground-state spin distribution in the TBRE, as discovered by physicists.

Keywords: Neural network; Two-body random ensemble; Spin distribution of nuclear ground state
1

Introduction

The atomic nucleus is a complex many-body quantum system. Conventionally, a many-body Hamiltonian or Lagrangian must be constructed based on reliable interactions to investigate this complex system. However, such a task is usually challenging, as the interactions in many-body problems are strongly entangled with the structures. Thus, the self-consistency requirement under a certain ansatz leads to a vague or inaccurate many-body Hamiltonian. Fortunately, if only the regularity and robust properties of many-body systems that are independent of the interaction details are required, the vagueness of the Hamiltonian provides an alternative perspective, with random numbers as parameters of nuclear interactions, that is, by using random interactions to statistically probe the robust regularity of nuclei.

The study of random interactions can be traced back to the investigation of Wigner’s random matrix theory (RMT) [1], where random numbers were used as the matrix elements of the many-body Hamiltonian. The diagonalization of these random matrices yields spectral statistical properties that agree with experimental data. The spectral properties of the RMT were further linked to quantum chaos [2]. In the 1970s, Wong, Bohigas, et al. [3-5] randomized the two-body interaction matrix elements in shell-model calculations [6, 7] to quantitatively demonstrate the phenomenon of quantum chaos in nuclei [5, 8-11]. The shell-model calculations with random interactions generate an ensemble of virtual nuclei, which is known as the two-body random ensemble (TBRE). A study based on TBRE revealed that certain robust features of nuclei may be independent of the specific details of the interaction.

Following this ensemble, Johnson, Bertsch, et al. [12, 13] reported a series of robust and interaction-independent statistical properties of low-lying states in nuclei. A significant finding was the "predominance of the spin-zero ground state" in even-even nuclei. Even-even nuclei exhibited a considerably higher probability of having spin-zero ground states compared to the fraction of spin-zero configurations in the entire shell model space. Subsequently, this phenomenon was observed in the interacting boson model (IBM) [14-16]. The spin-zero ground states of even-even nuclei are conventionally attributed to the short-range nature of the nuclear force. However, in the TBRE, the interactions are entirely random and no specific force is predominant. The predominance of the spin-zero ground state in the TBRE contradicts the conventional understanding of how spin-zero ground states emerge from even-even systems. Therefore, considerable effort has been devoted to understanding this robust property of TBRE, which has proven to be significantly challenging and reflects the complexity of the quantum many-body problem. Phenomenological attempts include studies of the distribution of the lowest eigenvalues for each spin [14] and its width [17], geometric chaos of spin coupling [18], maximum and minimum diagonal matrix elements [19], IBM-limit of spin distribution in the IBM with TBRE [20-22], wave-function properties of different spin ground states [23, 24], energy scale features of different spin ground states [25], and correlation between the probability of spin-zero ground states and the central values of the distribution of two-body matrix elements [26]. The probability distributions of various spin states as ground states must be mathematically calculated to explain this phenomenon. However, nuclear models are typically nonlinear systems, which are difficult to apply to statistical theories. Therefore, several empirical rules have been proposed to predict the probability distribution of ground-state spins. For example, Kusnezov et al. applied the random polynomial method [24] to determine a priori the probability distribution for sp bosons and obtained results that were consistent with those obtained by Bijker et al. using mean-field methods [21, 22]. Chau et al. discussed the cases of d boson systems and four fermions in the f7/2 shell, demonstrating the correlation between specific ground states and the geometric shapes determined by nuclear observables, and predicting the probabilities for the ground-state spin [27]. Zhao et al. suggested that the spins of ground states in the TBRE may be associated with specific two-body interaction matrix elements and thus proposed an empirical approach [28] to predict the distribution of ground-state spins. The correlation between the ground-state spin and two-body interaction matrix elements in this empirical approach is also crucial in our work.

Because the nonlinearity of the nuclear model is too complex to overcome, one can take an indirect approach to explain the origin of the predominance of spin-zero ground states. This can be achieved by using a sufficiently simple nonlinear model to simulate the behavior of the shell model and studying the spin-determination mechanism therein, which may provide more insight from a different perspective. The neural network (NN) model is a potential candidate for such simulations owing to its powerful learning, prediction, and adaptation capabilities, which have been successfully applied in diverse fields such as language translation, speech recognition, computer vision, and even complex physical systems [29-32]. Specifically, NN models have been extensively used in nuclear structure studies to predict unknown nuclear properties from existing experimental data. These properties include the mass [33-35], charge radii [36, 37], low-lying excitation spectra [38, 39], and β-decay lifetimes [40]. However, most of these studies only utilized the fitting capacity of the NN without fully exploring its classification capability for nuclear structure research.

In this study, we attempted to distinguish between samples with different ground-state spins in the TBRE by adopting the classification capability of an NN with supervised learning. The adopted NN was trained using the interaction matrix elements of the TBRE samples as features and the ground-state spin as the label. In this process, the NN learned the behavior of the ground-state spin in the TBRE, as well as the specific correlations between the interaction matrix elements and the ground-state spin, as described in the empirical approach [28]. A significant advantage of using the NN in the TBRE study lies in the ability of the TBRE to provide nearly infinite independent samples for NN training, thereby avoiding overfitting. This enhanced the generalization ability of the NN and facilitated the simulation of the shell-model production of the ground-state spin. We present the performance of the NN in predicting the ground-state spins and reproducing their distribution in the TBRE. The proposed NN architecture may serve as a valuable benchmark for other classification-based applications.

2

MODEL FRAMEWORK

2.1
Two-Body Random Ensemble (TBRE)

In the TBRE, the nuclear Hamiltonian includes only two-body interactions and is expressed as follows:
$$H=\sum_{J}\sum_{j_1 j_2 j_3 j_4} G^{J}_{j_1 j_2;j_3 j_4}\, A^{J\dagger}(j_1 j_2)\, A^{J}(j_3 j_4). \quad (1)$$
In Eq. (1), $G^{J}_{j_1 j_2;j_3 j_4}$ represents the matrix elements of the two-body interaction; $A^{J\dagger}(j_1 j_2)$ denotes the creation operator of the nucleon pair with two nucleons on the $j_1$ and $j_2$ orbits coupled to total angular momentum $J$; and $A^{J}(j_3 j_4)$ represents the annihilation operator of the nucleon pair.

In the TBRE, the matrix elements $G^{J}_{j_1 j_2;j_3 j_4}$ in Eq. (1) are independent random numbers that follow a Gaussian distribution with probability density
$$f\left(G^{J}_{j_1 j_2;j_3 j_4}\right)=\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{\left(G^{J}_{j_1 j_2;j_3 j_4}\right)^{2}}{2\sigma^{2}}\right\}. \quad (2)$$

Here,
$$\sigma^{2}=\frac{1}{2}\left(1+\delta_{j_1 j_3}\delta_{j_2 j_4}\right), \quad (3)$$
to maintain the invariance of the statistical distribution of the interaction matrix elements under an arbitrary single-particle transformation.
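As a minimal sketch of this sampling prescription (in Python/NumPy; the pair-label encoding below is our own illustrative choice, not taken from the paper), one TBRE sample can be drawn as follows. Per Eq. (3), diagonal elements (bra pair equal to ket pair) have variance 1 and off-diagonal elements have variance 1/2.

```python
import numpy as np

def sample_tbre(index_pairs, rng=None):
    """Draw one TBRE sample of two-body matrix elements G^J_{j1j2;j3j4}.

    `index_pairs` lists, for each independent matrix element, its bra and ket
    pair labels ((j1, j2, J), (j3, j4, J)); diagonal elements (bra == ket)
    are drawn with variance 1, off-diagonal ones with variance 1/2 (Eqs. (2)-(3)).
    """
    rng = rng or np.random.default_rng()
    sigma2 = np.array([1.0 if bra == ket else 0.5 for bra, ket in index_pairs])
    return rng.normal(loc=0.0, scale=np.sqrt(sigma2))

# Example: the four independent elements G^J_{jj;jj} (J = 0, 2, 4, 6) of the
# (f7/2)^4 space are all diagonal, so each is drawn with unit variance.
pairs_f7 = [((7, 7, J), (7, 7, J)) for J in (0, 2, 4, 6)]
print(sample_tbre(pairs_f7))
```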

2.2
Classification neural network

The classification model presented in this paper utilizes an NN consisting of an input layer, one or more hidden layers, and an output layer. The structure is illustrated in Fig. 1 (with one hidden layer shown as an example). The input layer receives the matrix elements of the two-body interactions in the shell model, specifically the $G^{J}_{j_1 j_2;j_3 j_4}$ values in Eq. (1); the number of inputs is equal to the number of independent two-body interaction matrix elements in a specific shell-model space. Based on the corresponding input interactions, the output layer provides the probabilities of different spin states being the ground state. The number of outputs equals the number of possible ground-state spins. The activation function used in this model was the rectified linear unit (ReLU) function [41], which is compared with alternative choices in Table 2. Assuming that the vector $\mathbf{x}=\{x_i\}$ represents the network input, that is, the two-body interaction matrix elements $G$ in Eq. (1), and $\mathbf{y}$ is the network output, whose elements correspond to the probability of each spin being the ground-state spin, the relationship (with one hidden layer) can be expressed analytically as follows:
$$y_k(\mathbf{x};\omega)=a_k+\sum_j b_{kj}\,\mathrm{ReLU}\Big(c_j+\sum_i d_{ji}x_i\Big). \quad (4)$$
Here, $\omega=\{a_k,b_{kj},c_j,d_{ji}\}$ denotes the NN parameter vector.
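For concreteness, a minimal sketch of the network in Eq. (4) is given below, assuming a PyTorch implementation (the paper does not specify the framework); `build_classifier` and its arguments are illustrative names.

```python
import torch.nn as nn

def build_classifier(n_inputs, n_spins, hidden=32):
    """One-hidden-layer network of Eq. (4):
    y_k = a_k + sum_j b_kj * ReLU(c_j + sum_i d_ji * x_i).
    The softmax of Eq. (5) is applied afterwards (or folded into the loss)."""
    return nn.Sequential(
        nn.Linear(n_inputs, hidden),   # weights d_ji, biases c_j
        nn.ReLU(),
        nn.Linear(hidden, n_spins),    # weights b_kj, biases a_k
    )

# Table 1 settings for 22Ne: 30 matrix elements in, 8 candidate spins out.
model = build_classifier(n_inputs=30, n_spins=8)
```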

Table 2
Prediction accuracy (%) with three different activation functions, namely, sigmoid, tanh, and ReLU
Activation function (h11/2)4 18Ne 20Ne 22Ne 46Ca
Sigmoid 95.36 79.21 66.95 77.22 55.34
Tanh 96.15 85.10 67.69 78.39 55.67
ReLU 96.69 86.36 67.88 78.62 55.62
All the calculations are performed with a single 32-node hidden layer NN model
Fig. 1
Schematic of the adopted NN classification model

The output layer applies the softmax function [42], which transforms the unnormalized output values into non-negative probability values that sum to one:
$$P_k=\mathrm{Softmax}(\mathbf{y})\big|_k=\frac{e^{y_k}}{\sum_{k'}e^{y_{k'}}}. \quad (5)$$
This operation preserves the differentiability of the model and the relative order of the unnormalized output values. It also allows the model output to be interpreted as a probability for each class, facilitating the direct interpretation and utilization of these probabilities for classification decisions. Therefore, the softmax function is frequently employed in NN models to solve classification problems. Here, $P_k$ is the probability that the $k$th spin is the ground-state spin; the maximum of $P_k$ determines the ground-state spin predicted from the feature vector $\mathbf{x}$. All the elements $P_k$ together constitute the predicted probability vector $P$ of the NN model.

To train the NN model, we first prepared a training set of N samples, $D=\{(\mathbf{x}_1,S_1),(\mathbf{x}_2,S_2),\ldots,(\mathbf{x}_N,S_N)\}$, out of ~100,000 shell-model calculations. Here, $\mathbf{x}_i$ comprises the two-body interaction matrix elements of a single shell-model calculation, and $S_i$ is the corresponding ground-state spin obtained from that calculation. Second, for each spin $S_i$, we created the label vector $\hat{P}_i$, a one-hot vector with a single non-zero element of value "1," corresponding to a 100% probability for the ground-state spin $S_i$ and 0% probabilities for all other spins. Third, we defined the loss function to evaluate the similarity between the label vector $\hat{P}_i$ and the NN-predicted vector $P_i$ from Eq. (5):
$$L(P_i,\hat{P}_i)=-\sum_m \hat{P}_{mi}\log P_{mi}, \quad (6)$$
which is the cross-entropy, a common loss function for training an NN model for classification problems. Using the training samples and this loss function, we trained our network by adjusting the network parameter vector $\omega$ with the adaptive moment estimation (Adam) optimization algorithm [43] to minimize the sum of the loss functions over all training samples. Consequently, an NN model with predictive capabilities was developed.
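A hedged sketch of this training procedure follows, again assuming PyTorch. `CrossEntropyLoss` folds the softmax of Eq. (5) and the loss of Eq. (6) into one call, and the learning rate and epoch count are the hyperparameters quoted later in Sect. 3.1; the tensor names are illustrative.

```python
import torch
import torch.nn as nn

def train(model, features, spin_labels, epochs=1000, lr=0.01):
    """Minimize the summed loss of Eq. (6) with the Adam optimizer [43].

    `features` holds the G matrix elements, shape (N, n_inputs);
    `spin_labels` holds the shell-model ground-state spin class indices,
    which encode the same information as the one-hot vectors P-hat.
    CrossEntropyLoss combines the softmax of Eq. (5) with Eq. (6)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(features), spin_labels)
        loss.backward()
        optimizer.step()
    return model
```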

2.3
Shell Model Spaces

We performed approximately 100,000 TBRE calculations in six model spaces. These included four valence nucleons in the f7/2 orbital of a virtual nucleus (expressed simply as (f7/2)4), four valence nucleons in the h11/2 orbital of a virtual nucleus ((h11/2)4), two, four, and six valence neutrons in the sd shell (corresponding to the 18Ne, 20Ne, and 22Ne nuclei, respectively), and six valence neutrons in the pf shell (the 46Ca nucleus). These six model spaces represent various levels of many-body complexity.

In the (f7/2)4 space, the eigenvalues from the shell model are simple linear combinations of the two-body interaction matrix elements, as shown in Eq. (1.81) of Ref. [44]. The ground-state spin corresponds to the lowest eigenvalue associated with a specific spin. An NN without a hidden layer likewise computes linear combinations of the input two-body interaction matrix elements, followed by the softmax operation to identify the smallest linear combination. Both models thus employ similar calculation processes to determine the ground-state spin: the weight parameters $d_{ji}$ in Eq. (4) of the NN correspond to the cfp coefficients [44] of the shell model, and the softmax input of the NN is equivalent to the energy eigenvalues of the shell model. Because a hidden layer complicates the NN model and breaks this correspondence between the NN and shell models, we excluded the hidden layer in our NN model for the (f7/2)4 space.
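In this linear case the entire network reduces to a single affine map; a one-line sketch (PyTorch assumed, dimensions taken from Table 1) is:

```python
import torch.nn as nn

# For (f7/2)^4 the eigenvalues are linear in the G's, so the network reduces
# to one affine map followed by softmax: 4 matrix elements in, 5 candidate
# ground-state spins out (Table 1). Softmax is applied by the loss or at inference.
linear_model = nn.Linear(4, 5)
```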

For the (h11/2)4 and 18Ne spaces, the complexity increases beyond that of the (f7/2)4 space. Some eigenvalues remain linear combinations of the two-body interaction matrix elements, whereas others must be obtained through diagonalization. Although diagonalizations in subspaces with dimensions smaller than five are still analytical, the weight parameters $d_{ji}$ can no longer be related to the cfp coefficients. Therefore, hidden layers can enhance the adaptability of NN models to nonlinear diagonalization [45]. In the 20Ne, 22Ne, and 46Ca spaces, the relationship between the eigenvalues and cfp coefficients is completely nonlinear. The transcendental nature of some of these relationships further necessitates hidden layers. Table 1 presents the TBRE sample sizes and input-output settings of our NN models for the different spaces.

Table 1
Input-output settings for the six model spaces
Model Space Input Output
(f7/2)4 4 5
(h11/2)4 6 10
18Ne 30 5
20Ne 30 7
22Ne 30 8
46Ca 94 13
The input and output numbers correspond to the number of two-body interaction matrix elements and the number of possible ground state spins, respectively
2.4
Optimization of the network architecture

For the (f7/2)4 space, the shell-model eigenvalues are linear combinations of the two-body interaction matrix elements, as is the softmax input in the NN model. Therefore, both models employ similar calculation processes for determining the ground-state spin. As mentioned in Sect. 2.3, hidden layers were not required; in fact, an NN model without a hidden layer already predicted ground-state spins in the (f7/2)4 space with up to 98% accuracy.

For the (h11/2)4 space, because certain eigenvalues displayed nonlinear correlations with the two-body interaction matrix elements, incorporating hidden layers into the model was essential for enhancing the prediction accuracy. We first added one hidden layer and empirically selected 64 hidden nodes for the test run. The results demonstrated an accuracy of 97%, which was a satisfactory outcome.

For the remaining four spaces, the accuracies obtained with an arbitrarily chosen architecture were not always optimal. Therefore, we attempted to improve the prediction accuracy of our NN classification model by including more hidden layers and increasing the number of neural nodes in the 18Ne, 20Ne, 22Ne, and 46Ca model spaces.

First, the prediction accuracy improved when the number of neural nodes was doubled, as indicated by the difference between the prediction accuracies with N/2 and N neural nodes, as shown in Fig. 2. The absence of negative differences (Fig. 2) suggests that doubling the number of neural nodes consistently improves results, as expected. Furthermore, the differences reached a peak when N=32 for all four model spaces. For N>32, the prediction accuracy improved by only 0~2%. Considering that more nodes entail additional computational overhead, we believe that 32 nodes may be an optimal and balanced choice for this study.

Fig. 2
(Color online) Difference in the prediction accuracies of models employing N/2 and N neural nodes for 18Ne, 20Ne, 22Ne, and 46Ca model spaces with a single hidden layer. Thirty-two nodes are recommended

Furthermore, we investigated the impact of hidden layers on the prediction accuracy. Employing 32 neural nodes in each layer, as recommended in Fig. 2, we present in Fig. 3 the difference in prediction accuracy between networks with $n-1$ and $n$ hidden layers against the layer number $n$ for the 18Ne, 20Ne, 22Ne, and 46Ca model spaces. The accuracy improved significantly with a single hidden layer, that is, n=1. However, these improvements diminished with the introduction of additional layers. Because additional layers also consume computational resources, a single hidden layer may be the optimal choice.
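The scans behind Figs. 2 and 3 amount to retraining the same architecture over a grid of node and layer counts. The sketch below (PyTorch assumed; the paper does not publish its code, and `train`/`evaluate` are hypothetical helpers) illustrates the procedure.

```python
import torch.nn as nn

def build_mlp(n_in, n_out, n_layers, n_nodes):
    """Stack `n_layers` hidden layers of `n_nodes` ReLU units each."""
    sizes = [n_in] + [n_nodes] * n_layers
    layers = []
    for a, b in zip(sizes[:-1], sizes[1:]):
        layers += [nn.Linear(a, b), nn.ReLU()]
    layers.append(nn.Linear(sizes[-1], n_out))
    return nn.Sequential(*layers)

# Hypothetical scans (train/evaluate are placeholders), e.g. for 22Ne (30 in, 8 out):
# for n_nodes in (8, 16, 32, 64, 128):      # node-doubling scan (Fig. 2)
#     acc[n_nodes] = evaluate(train(build_mlp(30, 8, 1, n_nodes)))
# for n_layers in (1, 2, 3, 4):             # layer scan at 32 nodes (Fig. 3)
#     acc[n_layers] = evaluate(train(build_mlp(30, 8, n_layers, 32)))
```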

Fig. 3
(Color online) Difference between the prediction accuracies of networks employing $n$ and $n-1$ hidden layers for the 18Ne, 20Ne, 22Ne, and 46Ca model spaces. Thirty-two neural nodes in each hidden layer are used, as recommended in Fig. 2. A single hidden layer is recommended

Activation functions [46] play a crucial role in NNs by learning abstract features via nonlinear transformations. Common activation functions are the sigmoid (also called the logistic function), tanh (hyperbolic tangent), and ReLU functions. Table 2 presents the impact of different activation functions on the prediction accuracy of our NN model. The tanh and ReLU functions exhibit similar prediction accuracies for the five model spaces. However, because the tanh activation function includes an exponential operation, its computational overhead may be larger; therefore, we used the ReLU function throughout this study.

In summary, the optimal network configuration for the 18Ne, 20Ne, 22Ne, and 46Ca model spaces considered in our analysis comprised one hidden layer with 32 ReLU neural nodes.

3

RESULTS AND ANALYSIS

3.1
Model comparison

As shown in Fig. 1, we adopted a fully connected NN model. However, considering the recent application of Bayesian NNs (BNNs) in nuclear physics [33, 34, 37, 39] and the great success of convolutional NNs (CNNs) [47-50] and recurrent NNs (RNNs) [51-53], we compared these four networks in terms of accuracy, as shown in Fig. 4.

Fig. 4
(Color online) Prediction accuracies with different NN models. We adopt the classic fully connected NN as demonstrated in Fig. 1. BNN stands for Bayesian NN; CNN for convolutional NN; RNN for recurrent NN

The BNN was implemented via Bayesian sampling of the weights and biases, facilitated by a variational inference algorithm to optimize model training. The model adopted 1000 iterations to update the loss and accuracy. During the prediction process, we sampled 1000 times to obtain precise probability predictions. The CNN included convolutional, pooling, fully connected, and softmax layers. In the convolutional layer, the input channel was the number of input features, and the output channel was set to 16. The convolution kernel size was specified as three to facilitate feature extraction. Subsequently, in the pooling layer, the pooling kernel size and stride were set to two to reduce the dimensions of the feature map. The fully connected layer mapped the features extracted by the convolutional layer to the final classification result based on the feature dimensions and category counts of the task. The model parameters were adjusted and iteratively optimized throughout the construction process to ensure effective feature extraction and classification. A remarkable characteristic of the RNN is its ability to continuously transmit and share information through recurrent connections. In addition, the network calculation process defines the forward-propagation function, which encompasses the output generated after the input calculation and the subsequent prediction through the fully connected layer and softmax function. All four models, including the adopted classic softmax model, shared consistent parameters: a training-to-test set ratio of 2:1, a learning rate of 0.01, over 1000 training epochs, the ReLU activation function, and the Adam optimizer.
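A sketch of the described CNN follows, assuming PyTorch and treating the matrix elements as a one-channel 1-D sequence; the text fixes only the 16 output channels, kernel size 3, and pooling kernel/stride 2, so the padding choice and class name here are our assumptions.

```python
import torch
import torch.nn as nn

class TbreCNN(nn.Module):
    """1-D CNN with the hyperparameters quoted above: 16 output channels,
    convolution kernel of size 3, pooling kernel and stride of 2."""
    def __init__(self, n_features, n_spins):
        super().__init__()
        self.conv = nn.Conv1d(1, 16, kernel_size=3, padding=1)  # padding=1 is our assumption
        self.act = nn.ReLU()
        self.pool = nn.MaxPool1d(kernel_size=2, stride=2)
        self.fc = nn.Linear(16 * (n_features // 2), n_spins)

    def forward(self, x):                      # x: (batch, n_features)
        h = self.pool(self.act(self.conv(x.unsqueeze(1))))
        return self.fc(h.flatten(1))           # softmax folded into the loss

model = TbreCNN(n_features=30, n_spins=8)      # 22Ne settings from Table 1
out = model(torch.randn(4, 30))                # -> (4, 8) unnormalized scores
```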

According to Fig. 4, the CNN performs the worst, whereas the accuracies of both the BNN and RNN are similar to that of the adopted network. However, the adopted network has a faster training speed and requires fewer computational resources. Therefore, we believe that the adopted network is the optimal choice for our study.

3.2
Feature selection

Feature selection is crucial in machine learning and data analysis because it enhances model performance, mitigates the risk of overfitting, boosts computational efficiency, streamlines model interpretation, and addresses problems related to noise and redundant information. This involves conducting a correlation analysis to assess the relationship between each feature and the target variable. Subsequently, the features exhibiting a strong correlation with the target variable are selected, whereas the others are excluded from further training. Considering the nonlinear nature of both our feature and label data, we utilized the Spearman correlation coefficient ρ [54] for feature selection:
$$\rho = 1-\frac{6\sum_i d_i^{2}}{n(n^{2}-1)}, \quad (7)$$
where $d_i$ and $n$ represent the difference in the rank values of the $i$-th data pair and the total number of observed samples, respectively.

We calculated the ρ coefficients for the four high-dimensional model spaces, namely, the 18-22Ne and 46Ca model spaces, and used different thresholds to select input features with strong correlations. Only features with ρ values greater than the threshold were retained for further training. In Table 3, we list the number of two-body matrix elements, that is, the number of input features surviving each threshold, together with the corresponding accuracy.

Table 3
Model accuracies (%) and input numbers under different feature selection thresholds
Threshold 0.1 0.01 0.001 0
18Ne
  Accuracy 70 77 85 86
  Input number 5 14 28 30
20Ne
  Accuracy 60 65 66 68
  Input number 6 17 25 30
22Ne
  Accuracy 71 74 76 80
  Input number 4 16 25 30
46Ca
  Accuracy 53 56 56 56
  Input number 1 30 80 94
Threshold 0 indicates no feature selection

According to Table 3, the number of inputs after feature selection decreases with increasing threshold values, as anticipated. However, this reduction in the input number implies declining performance. Consequently, each input two-body matrix element within the four model spaces significantly affects the output and thus, the exclusion of any of these from network training is not recommended.
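The selection step itself can be sketched as follows (Python with SciPy assumed; whether the threshold is applied to ρ or |ρ| is not stated in the text, so |ρ| is used here as an assumption, and `select_features` is an illustrative name).

```python
import numpy as np
from scipy.stats import spearmanr

def select_features(X, spins, threshold):
    """Keep the matrix elements whose Spearman rho (Eq. (7)) with the
    ground-state spin exceeds `threshold`; X is (n_samples, n_features)."""
    rho = np.array([spearmanr(X[:, i], spins).correlation
                    for i in range(X.shape[1])])
    keep = np.abs(rho) > threshold             # |rho| thresholding: our assumption
    return X[:, keep], np.flatnonzero(keep)

# Example: per Table 3, threshold 0.01 retains 14 of the 30 18Ne inputs.
```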

3.3
Accuracy

Figure 5 presents the evolution of the loss function during training. As expected, the loss functions of the six model spaces converge, indicating that the network parameters have been optimized. The loss values of the two single-j model spaces, namely, (f7/2)4 and (h11/2)4, decrease the most dramatically because the single-j spaces are simpler than the other four model spaces. Furthermore, 46Ca, 20Ne, 22Ne, and 18Ne all converge with large loss values, corresponding to the unsatisfactory accuracies listed in Table 4. Thus, increasing the number of training epochs does not improve the accuracy.

Table 4
Model space dimensions, prediction accuracy of the NN, and consistency rate of the G-I correlations between the SM and NN (see Subsect. 3.4 for the definition)
Model Space (f7/2)4 (h11/2)4 18Ne 20Ne 22Ne 46Ca
Dimension 8 23 14 81 142 3952
Accuracy (%) 98 97 86 68 80 56
Consistency (%) 100 100 100 60 80 74
Fig. 5
(Color online) Evolution of the loss functions during training

Table 4 shows the correlation between the prediction accuracy (%) of the NN for the ground-state spin and the dimensions of the six model spaces under investigation. The shell-model eigenvalues of the (f7/2)4 space are linear combinations of the two-body interaction matrix elements; therefore, the NN model is equivalent to linear regression and achieves a high prediction accuracy of 98%. The accuracy is 97% for the (h11/2)4 space with one hidden layer, even though some eigenvalues in the (h11/2)4 space exhibit nonlinear relationships with the two-body interaction matrix elements.

For the remaining four spaces, the accuracy decreased significantly with increasing dimension. We obtained a Pearson correlation coefficient [55] of -0.753 between the prediction accuracy (%) and the dimension on a logarithmic scale, indicating a negative correlation between the two variables. As the dimension of the space and the shell complexity increased, the ability of the NN model to predict the ground-state spin diminished. As shown in Figs. 2 and 3, introducing additional hidden layers or neural nodes does not significantly improve the performance of general classification NNs. Thus, to accurately predict the ground-state spin in the TBRE, the generalization capability of the NN is strongly challenged by the complexity of the quantum many-body system, and a more specialized NN architecture and activation function should be designed according to the properties of the cfp coefficients and the diagonalization process.

To obtain a more detailed picture of the prediction performance of the NN model for TBRE samples with a specific spin, Fig. 6 presents the confusion matrix for the NN models of the six model spaces. In the confusion matrices, the y- and x-axes represent the ground-state spin predicted by the NN (INN) and that obtained from the shell model calculations (ISM), respectively. The gray scale indicates the probability of the shell model calculation yielding a ground-state spin of ISM in the samples for which the NN predicts a ground-state spin of INN. The main diagonal of the confusion matrix appears predominantly dark, indicating a reasonably high degree of consistency between the NN and shell model for a specific ground-state spin. From a statistical perspective, the NN captures some correlations between the ground-state spin and two-body interaction matrix elements of the TBRE.

Fig. 6
Confusion matrices for predicting the ground-state spin using the NN model in the (f7/2)4, (h11/2)4, 18Ne, 20Ne, 22Ne, and 46Ca TBRE calculations. The y- and x-axes represent the ground-state spin predicted by the NN (INN) and that obtained from the shell model calculations (ISM), respectively. The gray scale represents the probability of the shell model calculation yielding a ground-state spin of ISM in the samples for which the NN predicts a ground-state spin of INN

The data in Table 4 indicate that the prediction accuracy for the ground-state spin of the 20Ne nucleus is lower than that for the higher-dimensional 22Ne. This finding is consistent with the observations shown in Fig. 6. Specifically, for 20Ne, the difference in shading between the main diagonal and other regions is less pronounced than in the other nuclei. This suggests that predicting the ground-state spin in the 20Ne space poses greater challenges to the NN, which may be related to some special properties of the cfp coefficients of 20Ne. Further exploration of the specific many-body complexity features of the 20Ne space is desirable.

To further evaluate the statistical performance of the NN model, Fig. 7 presents the distribution of the ground-state spin I (PI) obtained using both the shell model and the well-trained NN model with random interactions. The NN model is consistent with the shell model for all model spaces and thus partially succeeds in capturing the robust statistical properties of the TBRE.

Fig. 7
(Color online) Distribution of the ground-state spin I (PI) for (f7/2)4, (h11/2)4, 18Ne, 20Ne, 22Ne, and 46Ca. The black square and red circle represent the PI obtained from shell model calculations with random interactions and that predicted by the NN model, respectively. The blue triangle and olive star represent the PI obtained with the empirical approach [28] applied to the shell model and NN model, respectively
3.4
G-I correlation

To predict PI in the TBRE, Zhao et al. proposed a general empirical approach [19]. In their approach, one of the two-body interaction matrix elements was set to -1, whereas the rest were set to 0. This interaction was then input into the shell model, and the output ground-state spin I was recorded. If N independent two-body interaction matrix elements existed in the model space, the process was repeated N times, setting a different matrix element to -1 each time. Finally, the number of times spin I was observed as the ground-state spin in the N numerical experiments is denoted NI. The probability of spin I being the ground-state spin was then estimated as follows:
$$P_I = N_I/N. \quad (8)$$
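The same probing procedure carries over verbatim to a trained network (as done later in this subsection): set one input to -1, the rest to 0, and tally the predicted spins. A minimal sketch, assuming a PyTorch model, is:

```python
import torch

def empirical_spin_distribution(model, n_elements):
    """Empirical approach [19] applied to a trained NN: set one input matrix
    element to -1 (all others 0), record the predicted ground-state spin,
    and estimate P_I = N_I / N as in Eq. (8)."""
    counts = {}
    with torch.no_grad():
        for i in range(n_elements):
            x = torch.zeros(1, n_elements)
            x[0, i] = -1.0
            spin = model(x).argmax(dim=1).item()   # most probable spin class
            counts[spin] = counts.get(spin, 0) + 1
    return {spin: n / n_elements for spin, n in counts.items()}
```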

The empirical approach [19] attributes the "specific spin I as the ground-state spin" to a few two-body interaction matrix elements. If there are more two-body interaction matrix elements responsible for the spin I=0, the empirical rule provides a phenomenological explanation for the dominance of the ground state with zero spin.

Note that the empirical approach hints at a correlation between the two-body interaction matrix elements and the ground-state spin, and this correlation determines the ground-state spin distribution, as shown in Fig. 7. Thus, an NN model that predicts the TBRE ground-state spin distribution well should also produce a correlation between the two-body interaction matrix elements ($G^{J}_{j_1 j_2;j_3 j_4}$) and the ground-state spin (I) similar to that of the shell model. Therefore, the element-spin (G-I) correlations in the shell model must be compared with those in the NN model.

In Tables 5, 6 and 7, the G-I correlations between the two-body interaction matrix elements (Gj1j2;j3j4J defined in Eq. (1)) and the ground-state spin (I), obtained from the empirical approach applied to both the shell and NN models, are listed for the (f7/2)4 and (h11/2)4 model spaces, Ne isotopes, and 46Ca, respectively.

Table 5
Ground-state spin (I) from the shell and NN models for the (f7/2)4 and (h11/2)4 model spaces, with input GJ=-1 for one specific J and all other GJs equal to 0. Here, GJ denotes the two-body interaction matrix element Gjj;jjJ, as defined in Eq. (1). This table summarizes the correlation between the two-body interaction matrix elements and the ground-state spin in the empirical approach
GJ (f7/2)4 (h11/2)4
  SM NN SM NN
G0 0 0 0 0
G2 4 4 4 4
G4 2 2 0 0
G6 8 8 4 4
G8     8 8
G10     16 16
Table 6
Same as Table 5, except for 18Ne, 20Ne, and 22Ne, with Gj1j2;j3j4J as the matrix elements of the two-body interaction
Gj1j2;j3j4J 18Ne 20Ne 22Ne
  SM NN SM NN SM NN
G11110 0 0 0~4 0 0~6 0
G11220 0 0 0,2,4 0 0,2,4 0
G11330 0 0 0 0 0,2 0
G22220 0 0 0,2~4 0 0~5 0
G22330 0 0 0 0 0 0
G33330 0 0 0 0 0~2 0
G12121 1 1 1 0 0 0
G12231 1 1 2 0 0 0
G23231 1 1 0 0 0 3
G12122 2 2 0,2 0 0 0
G12132 2 2 2 0 2 2
G12222 2 2 1~4 0 0~6 0
G12232 2 2 0 0 0 2
G12332 2 2 0 0 0 0
G13132 2 2 4 2 0,2,4 2
G13222 2 2 0 0 0 0
G13232 2 2 0 2 0 0
G13332 2 2 2 2 0~4 2
G22222 2 2 0 0 0,2~4 0
G22232 2 2 2 0 2,3 0
G22332 2 2 0 0 0 0
G23232 2 2 2 0 0 0
G23332 2 2 0 0 0 0
G33332 2 2 2 0 0 0
G13133 3 3 5 2 0,2,4 3
G13233 3 3 4 0 3 0
G23233 3 3 0 0 0 0
G23234 4 4 6 6 6 6
G23334 4 4 4 0 2,3 0
G33334 4 4 4 4 0 0
The subscripts j1, j2, j3, j4 are equal to 1, 2, 3, corresponding to s1/2, d3/2, and d5/2 orbits in the sd shell, respectively. I=0~4 in this table represents the degenerate states with spin 0, 1, 2, 3, and 4 from the shell model. The inconsistency between the NN and shell models is highlighted in bold
Table 7
Same as Table 6, except for 46Ca
Gj1j2;j3j4J SM NN Gj1j2;j3j4J SM NN
G11110 0~10 0 G33342 0 0
G11220 0~10 0 G33442 0 0
G11330 0,2~6 0 G34342 4 0
G11440 0~4 0 G34442 0 0
G22220 0~10 0 G44442 2 2
G22330 0,2,4,6 0 G13133 0,2,4 3
G22440 0 0 G13143 3 0
G33330 0~6 0 G13233 3 0
G33440 0 0 G13243 4 0
G44440 0~4 0 G13343 4 0
G12121 0 0 G14143 0,2,4~8 2
G12231 0 0 G14233 0 0
G12341 0,9 0 G14243 0 0
G23231 0 0 G14343 0 0
G23341 0 0 G23233 3 0
G34341 1,8 0 G23243 0 2
G12122 0 0 G23343 0 0
G12132 2 0 G24243 0 0
G12222 0~10 0 G24343 0 0
G12232 0,4,6 0 G34343 0,10 0
G12242 0 0 G14144 0,2,4~6,8 8
G12332 0 0 G14234 2 0
G12342 0,9 0 G14244 6 0
G12442 0 0 G14334 0 0
G13132 0,2,4 2 G14344 1 0
G13222 0,2,4,6 0 G14444 0~4 0
G13232 0 0 G23234 6 0
G13242 0 0 G23244 2 0
G13332 0~8 0 G23334 1~6 0
G13342 2 0 G23344 0,9 0
G13442 2 0 G23444 4 0
G22222 0~6 0 G24244 0 0
G22232 1,2,4,5 0 G24334 0 0
G22242 0 0 G24344 3 0
G22332 0,2~4,6 0 G24444 0,2~4 0
G22342 0 0 G33334 0 0
G22442 0,2~4 0 G33344 0 0
G23232 0 0 G33444 0,10 0
G23242 0 0 G34344 0 0
G23332 0,2~4,6 0 G34444 0 0
G23342 0,10 0 G44444 4 4
G23442 0 0 G24245 10 9
G24242 0,9 0 G24345 0 0
G24332 0 0 G34345 1 0
G24342 0 0 G34346 12 10
G24442 0 0 G34446 0 0
G33332 0 0 G44446 6 6
The subscripts j1, j2, j3, j4 are equal to 1, 2, 3, 4, corresponding to the p1/2, p3/2, f5/2, and f7/2 orbits in the pf shell, respectively

According to Table 5, in the (f7/2)4 and (h11/2)4 model spaces, the NN model produces G-I correlations that are perfectly consistent with those of the shell model. This explains the agreement between the shell and NN models shown in Figures 6(a,b) and 7(a,b). In Table 6, this perfect consistency can also be observed for the 18Ne space, in accordance with Figures 6(c) and 7(c). However, as the dimensions increase, the consistency for 20Ne and 22Ne in Table 6 and for 46Ca in Table 7 gradually decreases. In the 20Ne space, 12 of the 30 G-I correlations (40%) are inconsistent between the SM and NN; in 22Ne, six out of 30 (20%) are inconsistent; and in 46Ca, 24 out of 94 (~26%) are inconsistent. These inconsistency rates are also correlated with the prediction accuracies for the different model spaces, as shown in Table 4.

Furthermore, the empirical approach is also applicable to the trained NN model, by setting one of the inputs of the NN to -1 and the rest to 0 and recording the ground-state spin (I) output by the network. This approach also reveals the correlation between the interaction matrix elements and the predicted ground-state spin, as well as the PI distribution, of the well-trained NN model. Table 5 presents the correlations between the matrix elements and the spin obtained from the shell and NN models. The correlations from both models are identical, indicating that our NN model successfully learns the G-I correlation suggested by the empirical approach. Thus, it accurately reproduces the ground-state spin of the shell model for simple model spaces.

With the G-I correlation from such an NN model, we counted the ground-state spins I emerging in the G-I correlation and then normalized them to the PI distribution, as guided by the empirical approach. The PI distributions based on the empirical approach using the NN model are shown in Fig. 7. The empirical approach applied to both the SM and NN models yields reasonably consistent PI distributions for all model spaces, suggesting that the NN may effectively capture the correlation between the two-body interaction matrix elements and the ground-state spin, which further explains its remarkable performance in reproducing the statistical properties of the ground-state spin in the TBRE.

4

CONCLUSION

This study used an NN model to investigate the distribution of the ground-state spin in the TBRE. Using a softmax classification NN model, we attempted to reproduce the correlation between the matrix elements of the interaction and ground-state spin, as labeled by the shell model, for the TBRE. The reliability of the NN model was analyzed based on its prediction accuracy and consistency with the empirical rules of the PI distribution.

Previous applications of NN models in nuclear physics primarily focused on their strong fitting capabilities. However, the analysis of the ground-state spin distribution in TBRE demonstrated the classification ability of the NN, which is rare in the literature. Furthermore, TBRE provided extensive samples for training NNs, thereby potentially enhancing the performance of the NN model.

In our investigation, we adopted various strategies to enhance network performance, including the introduction of BNN, CNN, and RNN, feature selection, and adjusting the number of neural nodes and hidden layers. However, none of these approaches yielded significant improvements with limited computational resources. Therefore, we must acknowledge that the quantum many-body problem remains a formidable challenge for NN models. Addressing this challenge may necessitate the further development of NN architectures tailored for analyzing the nuclear ground-state spin in the TBRE.

However, NN models still offer some insights into the specific robust statistical properties of the ground-state spin. For example, they effectively capture the distribution of the ground-state spin, as shown in Fig. 7. Moreover, the resulting confusion matrix exhibits dominant diagonal elements, indicating the consistency between the ground-state spin from the shell model and that predicted by the NN model, as shown in Fig. 6. This success can be attributed to the capacity of the NN to replicate the correlation between the ground-state spin and two-body interaction matrix element in the shell model, as shown in Tables 5, 6, and 7.

References
1. H.A. Weidenmüller, G.E. Mitchell, Random matrices and chaos in nuclear physics: Nuclear structure. Rev. Mod. Phys. 81(2), 539-589 (2009). https://doi.org/10.1103/RevModPhys.81.539
2. O. Bohigas, M.J. Giannoni, C. Schmit, Spectral properties of the Laplacian and random matrix theories. J. Phys. Lett. 45(21), 1015-1022 (1984). https://doi.org/10.1051/jphyslet:0198400450210101500
3. S.S.M. Wong, J.B. French, Level-density fluctuations and two-body versus multi-body interactions. Nucl. Phys. A 198(1), 188-208 (1972). https://doi.org/10.1016/0375-9474(72)90779-8
4. O. Bohigas, J. Flores, Two-body random hamiltonian and level density. Phys. Lett. B 34(4), 261-263 (1971). https://doi.org/10.1016/0370-2693(71)90598-3
5. J.B. French, S.S.M. Wong, Validity of random matrix theories for many-particle systems. Phys. Lett. B 33(7), 449-452 (1970). https://doi.org/10.1016/0370-2693(70)90213-3
6. M.G. Mayer, On closed shells in nuclei. Phys. Rev. 74(3), 235-239 (1948). https://doi.org/10.1103/PhysRev.74.235
7. O. Haxel, H.J.D. Jensen, H.E. Suess, On the "Magic Numbers" in nuclear structure. Phys. Rev. 75(11), 1766 (1949). https://doi.org/10.1103/PhysRev.75.1766.2
8. V. Zelevinsky, B.A. Brown, N. Frazier et al., The nuclear shell model as a testing ground for many-body quantum chaos. Phys. Rep. 276, 85-176 (1996). https://doi.org/10.1016/S0370-1573(96)00007-5
9. T. Guhr, A. Müller-Groeling, H.A. Weidenmüller, Random-matrix theories in quantum physics: common concepts. Phys. Rep. 299(4-6), 189-425 (1998). https://doi.org/10.1016/s0370-1573(97)00088-4
10. V.K.B. Kota, Embedded random matrix ensembles for complexity and chaos in finite interacting particle systems. Phys. Rep. 347(3), 223-288 (2001). https://doi.org/10.1016/S0370-1573(00)00113-7
11. V. Zelevinsky, A. Volya, Nuclear structure, random interactions and mesoscopic physics. Phys. Rep. 391(3), 311-352 (2004). https://doi.org/10.1016/j.physrep.2003.10.008
12. C.W. Johnson, G.F. Bertsch, D.J. Dean, Orderly spectra from random interactions. Phys. Rev. Lett. 80(13), 2749 (1998). https://doi.org/10.1103/PhysRevLett.80.2749
13. C.W. Johnson, G.F. Bertsch, D.J. Dean et al., Generalized seniority from random Hamiltonians. Phys. Rev. C 61, 014311 (1999). https://doi.org/10.1103/PhysRevC.61.014311
14. R. Bijker, A. Frank, Band structure from random interactions. Phys. Rev. Lett. 84(3), 420-422 (2000). https://doi.org/10.1103/PhysRevLett.84.420
15. D. Kusnezov, N.V. Zamfir, R.F. Casten, Robust nuclear observables and constraints on random interactions. Phys. Rev. Lett. 85(7), 1396 (2000). https://doi.org/10.1103/PhysRevLett.85.1396
16. H. Feshbach, F. Iachello, The interacting boson model. Ann. Phys. 84, 211-231 (1974). https://doi.org/10.1016/0003-4916(74)90300-5
17. R. Bijker, A. Frank, S. Pittel, On the dominance of J(P)=0(+) ground states in even-even nuclei from random two-body interactions. Phys. Rev. C 60(2), 021302 (1999). https://doi.org/10.1103/PhysRevC.60.021302
18. D. Mulhall, A. Volya, V. Zelevinsky, Geometric chaoticity leads to ordered spectra for randomly interacting fermions. Phys. Rev. Lett. 85(19), 4016-4019 (2000). https://doi.org/10.1103/PhysRevLett.85.4016
19. Y.M. Zhao, A. Arima, Towards understanding the probability of 0+ ground states in even-even many-body systems. Phys. Rev. C 64(4), 041301 (2001). https://doi.org/10.1103/PhysRevC.64.041301
20. D. Kusnezov, Two-body random ensembles: From nuclear spectra to random polynomials. Phys. Rev. Lett. 85(18), 3773 (2000). https://doi.org/10.1103/PhysRevLett.85.3773
21. R. Bijker, A. Frank, Mean-field analysis of interacting boson models with random interactions. Phys. Rev. C 64(6), 061303 (2001). https://doi.org/10.1103/PhysRevC.64.061303
22. R. Bijker, A. Frank, Regular spectra in the vibron model with random interactions. Phys. Rev. C 65(4), 044316 (2002). https://doi.org/10.1103/PhysRevC.65.044316
23. L. Kaplan, T. Papenbrock, C.W. Johnson, Spin structure of many-body systems with two-body random interactions. Phys. Rev. C 63, 014307 (2000). https://doi.org/10.1103/physrevc.63.014307
24. L. Kaplan, T. Papenbrock, Wave function structure in two-body random matrix ensembles. Phys. Rev. Lett. 84(20), 4553-4556 (2000). https://doi.org/10.1103/PhysRevLett.84.4553
25. S. Drozdz, M. Wojcik, Nature of order from random two-body interactions. Physica A: Statistical Mechanics and its Applications 301(1), 291-300 (2001). https://doi.org/10.1016/S0378-4371(01)00403-4
26. J.J. Shen, Correlation between the probability of spin-zero ground state and TBME in the presence of random interactions. Nucl. Phys. Rev. 37(3), 523-529 (2020). https://doi.org/10.11804/NuclPhysRev.37.2019CNPC15
27. P. Chau Huu-Tai, A. Frank, N.A. Smirnova et al., Geometry of random interactions. Phys. Rev. C 66(6), 061302 (2002). https://doi.org/10.1103/PhysRevC.66.061302
28. Y.M. Zhao, A. Arima, N. Yoshinaga, Regularities of many-body systems interacting by a two-body random ensemble. Phys. Rep. 400(1), 1-66 (2004). https://doi.org/10.1016/j.physrep.2004.07.004
29. S. Gazula, J.W. Clark, H. Bohr, Learning and prediction of nuclear stability using neural networks. Nucl. Phys. A 540(1-2), 1-26 (1992). https://doi.org/10.1016/0375-9474(92)90191-L
30. Y.G. Ma, L.G. Pang, R. Wang et al., Phase transition study meets machine learning. Chinese Phys. Lett. 40, 122101 (2023). https://doi.org/10.1088/0256-307X/40/12/122101
31. W. He, Q. Li, Y. Ma et al., Machine learning in nuclear physics at low and intermediate energies. Sci. China Phys. Mech. Astron. 66(8), 282001 (2023). https://doi.org/10.1007/s11433-023-2116-0
32. W.B. He, Y.G. Ma, L.G. Pang et al., Machine learning in high-energy nuclear physics. Nucl. Sci. Tech. 34(6), 88 (2023). https://doi.org/10.1007/s41365-023-01233-z
33. R. Utama, J. Piekarewicz, H.B. Prosper, Nuclear mass predictions for the crustal composition of neutron stars: A Bayesian neural network approach. Phys. Rev. C 93, 014311 (2016). https://doi.org/10.1103/physrevc.93.014311
34. Z.M. Niu, H.Z. Liang, Nuclear mass predictions based on Bayesian neural network approach with pairing and shell effects. Phys. Lett. B 778, 48-53 (2018). https://doi.org/10.1016/j.physletb.2018.01.002
35. X.C. Ming, H.F. Zhang, R.R. Xu et al., Nuclear mass based on the multi-task learning neural network method. Nucl. Sci. Tech. 33(4), 48 (2022). https://doi.org/10.1007/s41365-022-01031-z
36. T.S. Shang, J. Li, Z.M. Niu, Prediction of nuclear charge density distribution with feedback neural network. Nucl. Sci. Tech. 33(12), 153 (2022). https://doi.org/10.1007/s41365-022-01140-9
37. R. Utama, W.C. Chen, J. Piekarewicz, Nuclear charge radii: Density functional theory meets Bayesian neural networks. J. Phys. G: Nucl. Part. Phys. 43(11), 114002 (2016). https://doi.org/10.1088/0954-3899/43/11/114002
38. Y.F. Wang, Z.M. Niu, Studies of nuclear low-lying excitation spectra with multi-task neural network. Nucl. Phys. Rev. 39(3), 273-280 (2022). https://doi.org/10.11804/NuclPhysRev.39.2022043
39. Y.F. Wang, X.Y. Zhang, Z.M. Niu et al., Study of nuclear low-lying excitation spectra with the Bayesian neural network approach. Phys. Lett. B 830, 137154 (2022). https://doi.org/10.1016/j.physletb.2022.137154
40. Z.M. Niu, H.Z. Liang, B.H. Sun et al., Predictions of nuclear β-decay half-lives with machine learning and their impact on r-process nucleosynthesis. Phys. Rev. C 99(6), 064307 (2019). https://doi.org/10.1103/PhysRevC.99.064307
41. X. Glorot, A. Bordes, Y. Bengio, Deep sparse rectifier neural networks. Journal of Machine Learning Research 15, 315-323 (2011).
42. C.K.I. Williams, D. Barber, Bayesian classification using Gaussian processes. IEEE Trans. Pattern Anal. Mach. Intell. 20(12), 1342-1351 (1998). https://doi.org/10.1109/34.735807
43. D. Kingma, J. Ba, Adam: A method for stochastic optimization. arXiv:1412.6980 (2014). https://doi.org/10.48550/arXiv.1412.6980
44. R.D. Lawson, Theory of the Nuclear Shell Model (Clarendon Press, Oxford, 1980).
45. Y. LeCun, Y. Bengio, G. Hinton, Deep learning. Nature 521(7553), 436-444 (2015). https://doi.org/10.1038/nature14539
46. S.R. Dubey, S.K. Singh, B.B. Chaudhuri, Activation functions in deep learning: A comprehensive survey and benchmark. arXiv:2109.14545 (2021). https://doi.org/10.48550/arXiv.2109.14545
47. F.P. Li, Y.J. Wang, Z.P. Gao et al., Application of machine learning for the determination of impact parameters in the 132Sn + 124Sn system. Phys. Rev. C 104, 034608 (2021). https://doi.org/10.1103/PhysRevC.104.034608
48. J. Bouvrie, Notes on convolutional neural networks (2006).
49. F.P. Li, Y.J. Wang, Q.F. Li, Using deep learning to study the equation of state of nuclear matter. Nucl. Phys. Rev. 37(4), 825-832 (2020). https://doi.org/10.11804/NuclPhysRev.37.2020017
50. Y.Y. Cao, J.Y. Guo, B. Zhou, Prediction of nuclear charge radii based on a convolutional neural network. Nucl. Sci. Tech. 34(10), 152 (2023). https://doi.org/10.1007/s41365-023-01308-x
51. H. Salehinejad, S. Sankar, J. Barfett et al., Recent advances in recurrent neural networks. arXiv:1801.01078 (2017). https://doi.org/10.48550/arXiv.1801.01078
52. L.R. Medsker, L.C. Jain, Recurrent Neural Networks: Design and Applications (CRC Press, Boca Raton, 1999).
53. R. Engelken, F. Wolf, L.F. Abbott, Lyapunov spectra of chaotic recurrent neural networks. arXiv:2006.02427 (2020). https://doi.org/10.48550/arXiv.2006.02427
54. M. Mukaka, Statistics corner: A guide to appropriate use of correlation coefficient in medical research. Malawi Medical Journal 24(3), 69-71 (2012).
55. P. Sedgwick, Pearson's correlation coefficient. BMJ 345, e4483 (2012). https://doi.org/10.1136/bmj.e4483
Footnote

The authors declare that they have no competing interests.