
NUCLEAR PHYSICS AND INTERDISCIPLINARY RESEARCH

Neural network study of the nuclear ground-state spin distribution within a random interaction ensemble

Deng Liu
Alam Noor A
Zhen‑Zhen Qin
Yang Lei
Nuclear Science and Techniques, Vol. 35, No. 3, Article number 64. Published in print Mar 2024; available online 03 May 2024

The distribution of the nuclear ground-state spin in a two-body random ensemble (TBRE) was studied using a general classification neural network (NN) model with two-body interaction matrix elements as input features and the corresponding ground-state spins as labels or output predictions. Accurately predicting the ground-state spin of each individual sample in the TBRE exceeds the capability of our optimized NNs, reflecting the difficulty of the quantum many-body problem. However, our NN model effectively captured the statistical properties of the ground-state spin, because it learned the empirical regularity of the ground-state spin distribution in the TBRE, as discovered by physicists.

Keywords: Neural network; Two-body random ensemble; Spin distribution of nuclear ground state
1

Introduction

The atomic nucleus is a complex many-body quantum system. Conventionally, a many-body Hamiltonian or Lagrangian must be constructed based on reliable interactions to investigate this complex system. However, such a task is usually challenging, as the interactions in many-body problems are strongly entangled with the structures. Thus, the self-consistency requirement under a certain ansatz leads to a vague or inaccurate many-body Hamiltonian. Fortunately, if only the regularity and robust properties of many-body systems that are independent of the interaction details are required, the vagueness of the Hamiltonian provides an alternative perspective, with random numbers as parameters of nuclear interactions, that is, by using random interactions to statistically probe the robust regularity of nuclei.

The study of random interactions can be traced back to the investigation of Wigner’s random matrix theory (RMT) [1], where random numbers were used as the matrix elements of the many-body Hamiltonian. The diagonalization of these random matrices yields spectral statistical properties that agree with experimental data. The spectral properties of the RMT were further linked to quantum chaos [2]. In the 1970s, Wong, Bohigas, et al. [3-5] randomized the two-body interaction matrix elements in shell-model calculations [6, 7] to quantitatively demonstrate the phenomenon of quantum chaos in nuclei [5, 8-11]. The shell-model calculations with random interactions generate an ensemble of virtual nuclei, which is known as the two-body random ensemble (TBRE). A study based on TBRE revealed that certain robust features of nuclei may be independent of the specific details of the interaction.

Following this ensemble, Johnson, Bertsch, et al. [12, 13] reported a series of robust and interaction-independent statistical properties of low-lying states in nuclei. A significant finding was the "predominance of the spin-zero ground state" in even-even nuclei. Even-even nuclei exhibited a considerably higher probability of having spin-zero ground states compared to the fraction of spin-zero configurations in the entire shell model space. Subsequently, this phenomenon was observed in the interacting boson model (IBM) [14-16]. The spin-zero ground states of even-even nuclei are conventionally attributed to the short-range nature of the nuclear force. However, in the TBRE, the interactions are entirely random and no specific force is predominant. The predominance of the spin-zero ground state in the TBRE contradicts the conventional understanding of how spin-zero ground states emerge from even-even systems. Therefore, considerable effort has been devoted to understanding this robust property of TBRE, which has proven to be significantly challenging and reflects the complexity of the quantum many-body problem. Phenomenological attempts include studies of the distribution of the lowest eigenvalues for each spin [14] and its width [17], geometric chaos of spin coupling [18], maximum and minimum diagonal matrix elements [19], IBM-limit of spin distribution in the IBM with TBRE [20-22], wave-function properties of different spin ground states [23, 24], energy scale features of different spin ground states [25], and correlation between the probability of spin-zero ground states and the central values of the distribution of two-body matrix elements [26]. The probability distributions of various spin states as ground states must be mathematically calculated to explain this phenomenon. However, nuclear models are typically nonlinear systems, which are difficult to apply to statistical theories. Therefore, several empirical rules have been proposed to predict the probability distribution of ground-state spins. For example, Kusnezov et al. applied the random polynomial method [24] to determine a priori the probability distribution for sp bosons and obtained results that were consistent with those obtained by Bijker et al. using mean-field methods [21, 22]. Chau et al. discussed the cases of d boson systems and four fermions in the f7/2 shell, demonstrating the correlation between specific ground states and the geometric shapes determined by nuclear observables, and predicting the probabilities for the ground-state spin [27]. Zhao et al. suggested that the spins of ground states in the TBRE may be associated with specific two-body interaction matrix elements and thus proposed an empirical approach [28] to predict the distribution of ground-state spins. The correlation between the ground-state spin and two-body interaction matrix elements in this empirical approach is also crucial in our work.

Because the nonlinearity of the nuclear model is too complex to overcome, one can take an indirect approach to explain the origin of the predominance of spin-zero ground states. This can be achieved by using a sufficiently simple nonlinear model to simulate the behavior of the shell model and studying the spin-determination mechanism therein, which may provide more insight from a different perspective. The neural network (NN) model is a potential candidate for such simulations owing to its powerful learning, prediction, and adaptation capabilities, which have been successfully applied in diverse fields such as language translation, speech recognition, computer vision, and even complex physical systems [29-32]. Specifically, NN models have been extensively used in nuclear structure studies to predict unknown nuclear properties from existing experimental data. These properties include the mass [33-35], charge radii [36, 37], low-lying excitation spectra [38, 39], and β-decay lifetimes [40]. However, most of these studies only utilized the fitting capacity of the NN without fully exploring its classification capability for nuclear structure research.

In this study, we attempted to distinguish between samples with different ground-state spins in the TBRE by adopting the classification capability of an NN with supervised learning. The adopted NN was trained using the interaction matrix elements of the TBRE samples as features and the ground-state spin as the label. In this process, the NN learned the behavior of the ground-state spin in the TBRE, as well as the specific correlations between the interaction matrix elements and the ground-state spin, as described in the empirical approach [28]. A significant advantage of using the NN in the TBRE study lies in the ability of the TBRE to provide nearly infinite independent samples for NN training, thereby avoiding overfitting. This enhanced the generalization ability of the NN and facilitated the simulation of the shell-model production of the ground-state spin. We present the performance of the NN in predicting the ground-state spins and reproducing their distribution in the TBRE. The proposed NN architecture may serve as a valuable benchmark for other classification-based applications.

2

MODEL FRAMEWORK

2.1
Two-Body Random Ensemble (TBRE)

In the TBRE, the nuclear Hamiltonian includes only two-body interactions and is expressed as follows:
$$H=\sum_{J}\sum_{j_1 j_2 j_3 j_4} G^{J}_{j_1 j_2;j_3 j_4}\, A^{J\dagger}(j_1 j_2)\, A^{J}(j_3 j_4). \quad (1)$$
In Eq. (1), $G^{J}_{j_1 j_2;j_3 j_4}$ represents the matrix elements of the two-body interaction; $A^{J\dagger}(j_1 j_2)$ denotes the creation operator of the nucleon pair with two nucleons on the $j_1$ and $j_2$ orbits coupled to total angular momentum $J$; and $A^{J}(j_3 j_4)$ represents the annihilation operator of the nucleon pair.

In the TBRE, the matrix elements $G^{J}_{j_1 j_2;j_3 j_4}$ in Eq. (1) are independent random numbers that follow a Gaussian distribution with probability density
$$f\left(G^{J}_{j_1 j_2;j_3 j_4}\right)=\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{\left(G^{J}_{j_1 j_2;j_3 j_4}\right)^{2}}{2\sigma^{2}}\right\}. \quad (2)$$

Here,
$$\sigma^{2}=\frac{1}{2}\left(1+\delta_{j_1 j_3}\delta_{j_2 j_4}\right), \quad (3)$$
to maintain the invariance of the statistical distribution of the interaction matrix elements under an arbitrary single-particle transformation.
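As a minimal sketch of this sampling prescription (in Python/NumPy; the pair-label encoding below is our own illustrative choice, not taken from the paper), one TBRE sample can be drawn as follows. Per Eq. (3), diagonal elements (bra pair equal to ket pair) have variance 1 and off-diagonal elements have variance 1/2.

```python
import numpy as np

def sample_tbre(index_pairs, rng=None):
    """Draw one TBRE sample of two-body matrix elements G^J_{j1j2;j3j4}.

    `index_pairs` lists, for each independent matrix element, its bra and ket
    pair labels ((j1, j2, J), (j3, j4, J)); diagonal elements (bra == ket)
    are drawn with variance 1, off-diagonal ones with variance 1/2 (Eqs. (2)-(3)).
    """
    rng = rng or np.random.default_rng()
    sigma2 = np.array([1.0 if bra == ket else 0.5 for bra, ket in index_pairs])
    return rng.normal(loc=0.0, scale=np.sqrt(sigma2))

# Example: the four independent elements G^J_{jj;jj} (J = 0, 2, 4, 6) of the
# (f7/2)^4 space are all diagonal, so each is drawn with unit variance.
pairs_f7 = [((7, 7, J), (7, 7, J)) for J in (0, 2, 4, 6)]
print(sample_tbre(pairs_f7))
```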

2.2
Classification neural network

The classification model presented in this paper utilizes an NN consisting of an input layer, one or more hidden layers, and an output layer. The structure is illustrated in Fig. 1 (with one hidden layer shown as an example). The input layer receives the matrix elements of the two-body interactions in the shell model, specifically the $G^{J}_{j_1 j_2;j_3 j_4}$ values in Eq. (1); the number of inputs is equal to the number of independent two-body interaction matrix elements in a specific shell-model space. Based on the corresponding input interactions, the output layer provides the probabilities of different spin states being the ground state. The number of outputs equals the number of possible ground-state spins. The activation function used in this model was the rectified linear unit (ReLU) function [41], which is compared with alternative choices in Table 2. Assuming that the vector $\mathbf{x}=\{x_i\}$ represents the network input, that is, the two-body interaction matrix elements $G$ in Eq. (1), and $\mathbf{y}$ is the network output, whose elements correspond to the probability of each spin being the ground-state spin, the relationship (with one hidden layer) can be expressed analytically as follows:
$$y_k(\mathbf{x};\omega)=a_k+\sum_j b_{kj}\,\mathrm{ReLU}\Big(c_j+\sum_i d_{ji}x_i\Big). \quad (4)$$
Here, $\omega=\{a_k,b_{kj},c_j,d_{ji}\}$ denotes the NN parameter vector.
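For concreteness, a minimal sketch of the network in Eq. (4) is given below, assuming a PyTorch implementation (the paper does not specify the framework); `build_classifier` and its arguments are illustrative names.

```python
import torch.nn as nn

def build_classifier(n_inputs, n_spins, hidden=32):
    """One-hidden-layer network of Eq. (4):
    y_k = a_k + sum_j b_kj * ReLU(c_j + sum_i d_ji * x_i).
    The softmax of Eq. (5) is applied afterwards (or folded into the loss)."""
    return nn.Sequential(
        nn.Linear(n_inputs, hidden),   # weights d_ji, biases c_j
        nn.ReLU(),
        nn.Linear(hidden, n_spins),    # weights b_kj, biases a_k
    )

# Table 1 settings for 22Ne: 30 matrix elements in, 8 candidate spins out.
model = build_classifier(n_inputs=30, n_spins=8)
```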

Table 2
Prediction accuracy (%) with three different activation functions, namely, sigmoid, tanh, and ReLU
Activation function (h11/2)4 18Ne 20Ne 22Ne 46Ca
Sigmoid 95.36 79.21 66.95 77.22 55.34
Tanh 96.15 85.10 67.69 78.39 55.67
ReLU 96.69 86.36 67.88 78.62 55.62
All the calculations are performed with a single 32-node hidden layer NN model
Fig. 1
Schematic of the adopted NN classification model

The output layer applies the softmax function [42], which transforms the unnormalized output values into non-negative probability values that sum to one:
$$P_k=\mathrm{Softmax}(\mathbf{y})\big|_k=\frac{e^{y_k}}{\sum_{k'}e^{y_{k'}}}. \quad (5)$$
This operation preserves the differentiability of the model and the relative order of the unnormalized output values. It also allows the model output to be interpreted as a probability for each class, facilitating the direct interpretation and utilization of these probabilities for classification decisions. Therefore, the softmax function is frequently employed in NN models to solve classification problems. Here, $P_k$ is the probability that the $k$th spin is the ground-state spin; the maximum of $P_k$ determines the ground-state spin predicted from the feature vector $\mathbf{x}$. All the elements $P_k$ together constitute the predicted probability vector $P$ of the NN model.

To train the NN model, we first prepared a training set of N samples, $D=\{(\mathbf{x}_1,S_1),(\mathbf{x}_2,S_2),\ldots,(\mathbf{x}_N,S_N)\}$, out of ~100,000 shell-model calculations. Here, $\mathbf{x}_i$ comprises the two-body interaction matrix elements of a single shell-model calculation, and $S_i$ is the corresponding ground-state spin obtained from that calculation. Second, for each spin $S_i$, we created the label vector $\hat{P}_i$, a one-hot vector with a single non-zero element of value "1," corresponding to a 100% probability for the ground-state spin $S_i$ and 0% probabilities for all other spins. Third, we defined the loss function to evaluate the similarity between the label vector $\hat{P}_i$ and the NN-predicted vector $P_i$ from Eq. (5):
$$L(P_i,\hat{P}_i)=-\sum_m \hat{P}_{mi}\log P_{mi}, \quad (6)$$
which is the cross-entropy, a common loss function for training an NN model for classification problems. Using the training samples and this loss function, we trained our network by adjusting the network parameter vector $\omega$ with the adaptive moment estimation (Adam) optimization algorithm [43] to minimize the sum of the loss functions over all training samples. Consequently, an NN model with predictive capabilities was developed.
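A hedged sketch of this training procedure follows, again assuming PyTorch. `CrossEntropyLoss` folds the softmax of Eq. (5) and the loss of Eq. (6) into one call, and the learning rate and epoch count are the hyperparameters quoted later in Sect. 3.1; the tensor names are illustrative.

```python
import torch
import torch.nn as nn

def train(model, features, spin_labels, epochs=1000, lr=0.01):
    """Minimize the summed loss of Eq. (6) with the Adam optimizer [43].

    `features` holds the G matrix elements, shape (N, n_inputs);
    `spin_labels` holds the shell-model ground-state spin class indices,
    which encode the same information as the one-hot vectors P-hat.
    CrossEntropyLoss combines the softmax of Eq. (5) with Eq. (6)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(features), spin_labels)
        loss.backward()
        optimizer.step()
    return model
```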

2.3
Shell Model Spaces

We performed approximately 100,000 TBRE calculations in six model spaces. These included four valence nucleons in the f7/2 orbital of a virtual nucleus (expressed simply as (f7/2)4), four valence nucleons in the h11/2 orbital of a virtual nucleus ((h11/2)4), two, four, and six valence neutrons in the sd shell (corresponding to the 18Ne, 20Ne, and 22Ne nuclei, respectively), and six valence neutrons in the pf shell (the 46Ca nucleus). These six model spaces represent various levels of many-body complexity.

In the (f7/2)4 space, the eigenvalues from the shell model are simple linear combinations of the two-body interaction matrix elements, as shown in Eq. (1.81) of Ref. [44]. The ground-state spin corresponds to the lowest eigenvalue associated with a specific spin. An NN without a hidden layer likewise computes linear combinations of the input two-body interaction matrix elements, followed by the softmax operation to identify the smallest linear combination. Both models thus employ similar calculation processes to determine the ground-state spin: the weight parameters $d_{ji}$ in Eq. (4) of the NN correspond to the cfp coefficients [44] of the shell model, and the softmax input of the NN is equivalent to the energy eigenvalues of the shell model. Because a hidden layer complicates the NN model and breaks this correspondence between the NN and shell models, we excluded the hidden layer in our NN model for the (f7/2)4 space.
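In this linear case the entire network reduces to a single affine map; a one-line sketch (PyTorch assumed, dimensions taken from Table 1) is:

```python
import torch.nn as nn

# For (f7/2)^4 the eigenvalues are linear in the G's, so the network reduces
# to one affine map followed by softmax: 4 matrix elements in, 5 candidate
# ground-state spins out (Table 1). Softmax is applied by the loss or at inference.
linear_model = nn.Linear(4, 5)
```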

For the (h11/2)4 and 18Ne spaces, the complexity increases beyond that of the (f7/2)4 space. Some eigenvalues remain linear combinations of the two-body interaction matrix elements, whereas others must be obtained through diagonalization. Although diagonalizations in subspaces with dimensions smaller than five are still analytical, the weight parameters $d_{ji}$ can no longer be related to the cfp coefficients. Therefore, hidden layers can enhance the adaptability of NN models to nonlinear diagonalization [45]. In the 20Ne, 22Ne, and 46Ca spaces, the relationship between the eigenvalues and cfp coefficients is completely nonlinear. The transcendental nature of some of these relationships further necessitates hidden layers. Table 1 presents the TBRE sample sizes and input-output settings of our NN models for the different spaces.

Table 1
Input-output settings for the six model spaces
Model Space Input Output
(f7/2)4 4 5
(h11/2)4 6 10
18Ne 30 5
20Ne 30 7
22Ne 30 8
46Ca 94 13
The input and output numbers correspond to the number of two-body interaction matrix elements and the number of possible ground state spins, respectively
2.4
Optimization of the network architecture

For the (f7/2)4 space, the shell-model eigenvalues are linear combinations of the two-body interaction matrix elements, as is the softmax input in the NN model. Therefore, both models employ similar calculation processes for determining the ground-state spin. As mentioned in Sect. 2.3, hidden layers were not required; in fact, an NN model without a hidden layer already predicted ground-state spins in the (f7/2)4 space with up to 98% accuracy.

For the (h11/2)4 space, because certain eigenvalues displayed nonlinear correlations with the two-body interaction matrix elements, incorporating hidden layers into the model was essential for enhancing the prediction accuracy. We first added one hidden layer and empirically selected 64 hidden nodes for the test run. The results demonstrated an accuracy of 97%, which was a satisfactory outcome.

For the remaining four spaces, the accuracies obtained with an arbitrarily chosen architecture were not always optimal. Therefore, we attempted to improve the prediction accuracy of our NN classification model by including more hidden layers and increasing the number of neural nodes in the 18Ne, 20Ne, 22Ne, and 46Ca model spaces.

First, the prediction accuracy improved when the number of neural nodes was doubled, as indicated by the difference between the prediction accuracies with N/2 and N neural nodes, as shown in Fig. 2. The absence of negative differences (Fig. 2) suggests that doubling the number of neural nodes consistently improves results, as expected. Furthermore, the differences reached a peak when N=32 for all four model spaces. For N>32, the prediction accuracy improved by only 0~2%. Considering that more nodes entail additional computational overhead, we believe that 32 nodes may be an optimal and balanced choice for this study.

Fig. 2
(Color online) Difference in the prediction accuracies of models employing N/2 and N neural nodes for 18Ne, 20Ne, 22Ne, and 46Ca model spaces with a single hidden layer. Thirty-two nodes are recommended

Furthermore, we investigated the impact of hidden layers on the prediction accuracy. Employing 32 neural nodes in each layer, as recommended in Fig. 2, we present in Fig. 3 the difference in prediction accuracy between networks with $n-1$ and $n$ hidden layers against the layer number $n$ for the 18Ne, 20Ne, 22Ne, and 46Ca model spaces. The accuracy improved significantly with a single hidden layer, that is, n=1. However, these improvements diminished with the introduction of additional layers. Because additional layers also consume computational resources, a single hidden layer may be the optimal choice.
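The scans behind Figs. 2 and 3 amount to retraining the same architecture over a grid of node and layer counts. The sketch below (PyTorch assumed; the paper does not publish its code, and `train`/`evaluate` are hypothetical helpers) illustrates the procedure.

```python
import torch.nn as nn

def build_mlp(n_in, n_out, n_layers, n_nodes):
    """Stack `n_layers` hidden layers of `n_nodes` ReLU units each."""
    sizes = [n_in] + [n_nodes] * n_layers
    layers = []
    for a, b in zip(sizes[:-1], sizes[1:]):
        layers += [nn.Linear(a, b), nn.ReLU()]
    layers.append(nn.Linear(sizes[-1], n_out))
    return nn.Sequential(*layers)

# Hypothetical scans (train/evaluate are placeholders), e.g. for 22Ne (30 in, 8 out):
# for n_nodes in (8, 16, 32, 64, 128):      # node-doubling scan (Fig. 2)
#     acc[n_nodes] = evaluate(train(build_mlp(30, 8, 1, n_nodes)))
# for n_layers in (1, 2, 3, 4):             # layer scan at 32 nodes (Fig. 3)
#     acc[n_layers] = evaluate(train(build_mlp(30, 8, n_layers, 32)))
```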

Fig. 3
(Color online) Difference between the prediction accuracies of networks employing $n$ and $n-1$ hidden layers for the 18Ne, 20Ne, 22Ne, and 46Ca model spaces. Thirty-two neural nodes in each hidden layer are used, as recommended in Fig. 2. A single hidden layer is recommended

Activation functions [46] play a crucial role in NNs by learning abstract features via nonlinear transformations. Common activation functions are the sigmoid (also called the logistic function), tanh (hyperbolic tangent), and ReLU functions. Table 2 presents the impact of different activation functions on the prediction accuracy of our NN model. The tanh and ReLU functions exhibit similar prediction accuracies for the five model spaces. However, because the tanh activation function includes an exponential operation, its computational overhead may be larger; therefore, we used the ReLU function throughout this study.

In summary, the optimal network configuration for the 18Ne, 20Ne, 22Ne, and 46Ca model spaces considered in our analysis comprised one hidden layer with 32 ReLU neural nodes.

3

RESULTS AND ANALYSIS

3.1
Model comparison

As shown in Fig. 1, we adopted a fully connected NN model. However, considering the recent application of Bayesian NNs (BNNs) in nuclear physics [33, 34, 37, 39] and the great success of convolutional NNs (CNNs) [47-50] and recurrent NNs (RNNs) [51-53], we compared these four networks in terms of accuracy, as shown in Fig. 4.

Fig. 4
(Color online) Prediction accuracies with different NN models. We adopt the classic fully connected NN as demonstrated in Fig. 1. BNN stands for Bayesian NN; CNN for convolutional NN; RNN for recurrent NN

The BNN was implemented via Bayesian sampling of the weights and biases, facilitated by a variational inference algorithm to optimize model training. The model adopted 1000 iterations to update the loss and accuracy. During the prediction process, we sampled 1000 times to obtain precise probability predictions. The CNN included convolutional, pooling, fully connected, and softmax layers. In the convolutional layer, the input channel was the number of input features, and the output channel was set to 16. The convolution kernel size was specified as three to facilitate feature extraction. Subsequently, in the pooling layer, the pooling kernel size and stride were set to two to reduce the dimensions of the feature map. The fully connected layer mapped the features extracted by the convolutional layer to the final classification result based on the feature dimensions and category counts of the task. The model parameters were adjusted and iteratively optimized throughout the construction process to ensure effective feature extraction and classification. A remarkable characteristic of the RNN is its ability to continuously transmit and share information through recurrent connections. In addition, the network calculation process defines the forward-propagation function, which encompasses the output generated after the input calculation and the subsequent prediction through the fully connected layer and softmax function. All four models, including the adopted classic softmax model, shared consistent parameters: a training-to-test set ratio of 2:1, a learning rate of 0.01, over 1000 training epochs, the ReLU activation function, and the Adam optimizer.
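A sketch of the described CNN follows, assuming PyTorch and treating the matrix elements as a one-channel 1-D sequence; the text fixes only the 16 output channels, kernel size 3, and pooling kernel/stride 2, so the padding choice and class name here are our assumptions.

```python
import torch
import torch.nn as nn

class TbreCNN(nn.Module):
    """1-D CNN with the hyperparameters quoted above: 16 output channels,
    convolution kernel of size 3, pooling kernel and stride of 2."""
    def __init__(self, n_features, n_spins):
        super().__init__()
        self.conv = nn.Conv1d(1, 16, kernel_size=3, padding=1)  # padding=1 is our assumption
        self.act = nn.ReLU()
        self.pool = nn.MaxPool1d(kernel_size=2, stride=2)
        self.fc = nn.Linear(16 * (n_features // 2), n_spins)

    def forward(self, x):                      # x: (batch, n_features)
        h = self.pool(self.act(self.conv(x.unsqueeze(1))))
        return self.fc(h.flatten(1))           # softmax folded into the loss

model = TbreCNN(n_features=30, n_spins=8)      # 22Ne settings from Table 1
out = model(torch.randn(4, 30))                # -> (4, 8) unnormalized scores
```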

According to Fig. 4, the CNN performs the worst, whereas the accuracies of both the BNN and RNN are similar to that of the adopted network. However, the adopted network has a faster training speed and requires fewer computational resources. Therefore, we believe that the adopted network is the optimal choice for our study.

3.2
Feature selection

Feature selection is crucial in machine learning and data analysis because it enhances model performance, mitigates the risk of overfitting, boosts computational efficiency, streamlines model interpretation, and addresses problems related to noise and redundant information. This involves conducting a correlation analysis to assess the relationship between each feature and the target variable. Subsequently, the features exhibiting a strong correlation with the target variable are selected, whereas the others are excluded from further training. Considering the nonlinear nature of both our feature and label data, we utilized the Spearman correlation coefficient ρ [54] for feature selection:
$$\rho = 1-\frac{6\sum_i d_i^{2}}{n(n^{2}-1)}, \quad (7)$$
where $d_i$ and $n$ represent the difference in the rank values of the $i$-th data pair and the total number of observed samples, respectively.

We calculated the ρ coefficients for the four high-dimensional model spaces, namely, the 18-22Ne and 46Ca model spaces, and used different thresholds to select input features with strong correlations. Only features with ρ values greater than the threshold were retained for further training. In Table 3, we list the number of two-body matrix elements, that is, the number of input features surviving each threshold, together with the corresponding accuracy.

Table 3
Model accuracies (%) and input numbers under different feature selection thresholds
Threshold 0.1 0.01 0.001 0
18Ne
  Accuracy 70 77 85 86
  Input number 5 14 28 30
20Ne
  Accuracy 60 65 66 68
  Input number 6 17 25 30
22Ne
  Accuracy 71 74 76 80
  Input number 4 16 25 30
46Ca
  Accuracy 53 56 56 56
  Input number 1 30 80 94
Threshold 0 indicates no feature selection

According to Table 3, the number of inputs after feature selection decreases with increasing threshold values, as anticipated. However, this reduction in the input number implies declining performance. Consequently, each input two-body matrix element within the four model spaces significantly affects the output and thus, the exclusion of any of these from network training is not recommended.
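The selection step itself can be sketched as follows (Python with SciPy assumed; whether the threshold is applied to ρ or |ρ| is not stated in the text, so |ρ| is used here as an assumption, and `select_features` is an illustrative name).

```python
import numpy as np
from scipy.stats import spearmanr

def select_features(X, spins, threshold):
    """Keep the matrix elements whose Spearman rho (Eq. (7)) with the
    ground-state spin exceeds `threshold`; X is (n_samples, n_features)."""
    rho = np.array([spearmanr(X[:, i], spins).correlation
                    for i in range(X.shape[1])])
    keep = np.abs(rho) > threshold             # |rho| thresholding: our assumption
    return X[:, keep], np.flatnonzero(keep)

# Example: per Table 3, threshold 0.01 retains 14 of the 30 18Ne inputs.
```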

3.3
Accuracy

Figure 5 presents the evolution of the loss function during training. As expected, the loss functions of the six model spaces converge, indicating that the network parameters have been optimized. The loss values of the two single-j model spaces, namely, (f7/2)4 and (h11/2)4, decrease the most dramatically because the single-j spaces are simpler than the other four model spaces. Furthermore, 46Ca, 20Ne, 22Ne, and 18Ne all converge with large loss values, corresponding to the unsatisfactory accuracies listed in Table 4. Thus, increasing the number of training epochs does not improve the accuracy.

Table 4
Model space dimensions, prediction accuracy of the NN, and consistency rate of the G-I correlations between the SM and NN (see Subsect. 3.4 for the definition)
Model Space (f7/2)4 (h11/2)4 18Ne 20Ne 22Ne 46Ca
Dimension 8 23 14 81 142 3952
Accuracy (%) 98 97 86 68 80 56
Consistency (%) 100 100 100 60 80 74
Fig. 5
(Color online) Evolution of the loss functions during training

Table 4 shows the correlation between the prediction accuracy (%) of the NN for the ground-state spin and the dimensions of the six model spaces under investigation. The shell-model eigenvalues of the (f7/2)4 space are linear combinations of the two-body interaction matrix elements; therefore, the NN model is equivalent to linear regression and achieves a high prediction accuracy of 98%. The accuracy is 97% for the (h11/2)4 space with one hidden layer, even though some eigenvalues in the (h11/2)4 space exhibit nonlinear relationships with the two-body interaction matrix elements.

For the remaining four spaces, the accuracy decreased significantly with increasing dimension. We obtained a Pearson correlation coefficient [55] of -0.753 between the prediction accuracy (%) and the dimension on a logarithmic scale, indicating a negative correlation between the two variables. As the dimension of the space and the shell complexity increased, the ability of the NN model to predict the ground-state spin diminished. As shown in Figs. 2 and 3, introducing additional hidden layers or neural nodes does not significantly improve the performance of general classification NNs. Thus, to accurately predict the ground-state spin in the TBRE, the generalization capability of the NN is strongly challenged by the complexity of the quantum many-body system, and a more specialized NN architecture and activation function should be designed according to the properties of the cfp coefficients and the diagonalization process.

To obtain a more detailed picture of the prediction performance of the NN model for TBRE samples with a specific spin, Fig. 6 presents the confusion matrix for the NN models of the six model spaces. In the confusion matrices, the y- and x-axes represent the ground-state spin predicted by the NN (INN) and that obtained from the shell model calculations (ISM), respectively. The gray scale indicates the probability of the shell model calculation yielding a ground-state spin of ISM in the samples for which the NN predicts a ground-state spin of INN. The main diagonal of the confusion matrix appears predominantly dark, indicating a reasonably high degree of consistency between the NN and shell model for a specific ground-state spin. From a statistical perspective, the NN captures some correlations between the ground-state spin and two-body interaction matrix elements of the TBRE.

Fig. 6
Confusion matrices for predicting the ground-state spin using the NN model in the (f7/2)4, (h11/2)4, 18Ne, 20Ne, 22Ne, and 46Ca TBRE calculations. The y- and x-axes represent the ground-state spin predicted by the NN (INN) and that obtained from the shell model calculations (ISM), respectively. The gray scale represents the probability of the shell model calculation yielding a ground-state spin of ISM in the samples for which the NN predicts a ground-state spin of INN

The data in Table 4 indicate that the prediction accuracy for the ground-state spin of the 20Ne nucleus is lower than that for the higher-dimensional 22Ne. This finding is consistent with the observations shown in Fig. 6. Specifically, for 20Ne, the difference in shading between the main diagonal and other regions is less pronounced than in the other nuclei. This suggests that predicting the ground-state spin in the 20Ne space poses greater challenges to the NN, which may be related to some special properties of the cfp coefficients of 20Ne. Further exploration of the specific many-body complexity features of the 20Ne space is desirable.

To further evaluate the statistical performance of the NN model, Fig. 7 presents the distribution of the ground-state spin I (PI) obtained using both the shell model and the well-trained NN model with random interactions. The NN model is consistent with the shell model for all model spaces and thus partially succeeds in capturing the robust statistical properties of the TBRE.

Fig. 7
(Color online) Distribution of the ground-state spin I (PI) for (f7/2)4, (h11/2)4, 18Ne, 20Ne, 22Ne, and 46Ca. The black square and red circle represent the PI obtained from shell model calculations with random interactions and that predicted by the NN model, respectively. The blue triangle and olive star represent the PI obtained with the empirical approach [28] applied to the shell model and NN model, respectively
3.4
G-I correlation

To predict PI in the TBRE, Zhao et al. proposed a general empirical approach [19]. In their approach, one of the two-body interaction matrix elements was set to -1, whereas the rest were set to 0. This interaction was then input into the shell model, and the output ground-state spin I was recorded. If N independent two-body interaction matrix elements existed in the model space, the process was repeated N times, setting a different matrix element to -1 each time. Finally, the number of times spin I was observed as the ground-state spin in the N numerical experiments is denoted NI. The probability of spin I being the ground-state spin was then estimated as follows:
$$P_I = N_I/N. \quad (8)$$
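The same probing procedure carries over verbatim to a trained network (as done later in this subsection): set one input to -1, the rest to 0, and tally the predicted spins. A minimal sketch, assuming a PyTorch model, is:

```python
import torch

def empirical_spin_distribution(model, n_elements):
    """Empirical approach [19] applied to a trained NN: set one input matrix
    element to -1 (all others 0), record the predicted ground-state spin,
    and estimate P_I = N_I / N as in Eq. (8)."""
    counts = {}
    with torch.no_grad():
        for i in range(n_elements):
            x = torch.zeros(1, n_elements)
            x[0, i] = -1.0
            spin = model(x).argmax(dim=1).item()   # most probable spin class
            counts[spin] = counts.get(spin, 0) + 1
    return {spin: n / n_elements for spin, n in counts.items()}
```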

The empirical approach [19] attributes the "specific spin I as the ground-state spin" to a few two-body interaction matrix elements. If there are more two-body interaction matrix elements responsible for the spin I=0, the empirical rule provides a phenomenological explanation for the dominance of the ground state with zero spin.

Note that the empirical approach hints at a correlation between the two-body interaction matrix elements and the ground-state spin, and this correlation determines the ground-state spin distribution, as shown in Fig. 7. Thus, an NN model that predicts the TBRE ground-state spin distribution well should also produce a correlation between the two-body interaction matrix elements ($G^{J}_{j_1 j_2;j_3 j_4}$) and the ground-state spin (I) similar to that of the shell model. Therefore, the element-spin (G-I) correlations in the shell model must be compared with those in the NN model.

In Tables 5, 6 and 7, the G-I correlations between the two-body interaction matrix elements (Gj1j2;j3j4J defined in Eq. (1)) and the ground-state spin (I), obtained from the empirical approach applied to both the shell and NN models, are listed for the (f7/2)4 and (h11/2)4 model spaces, Ne isotopes, and 46Ca, respectively.

Table 5
Ground-state spin (I) from the shell and NN models for the (f7/2)4 and (h11/2)4 model spaces, with input GJ=-1 for one specific J and all other GJs equal to 0. Here, GJ denotes the two-body interaction matrix element Gjj;jjJ, as defined in Eq. (1). This table summarizes the correlation between the two-body interaction matrix elements and the ground-state spin in the empirical approach
GJ (f7/2)4 (h11/2)4
  SM NN SM NN
G0 0 0 0 0
G2 4 4 4 4
G4 2 2 0 0
G6 8 8 4 4
G8     8 8
G10     16 16
Table 6
Same as Table 5, except for 18Ne, 20Ne, and 22Ne, with Gj1j2;j3j4J as the matrix elements of the two-body interaction
Gj1j2;j3j4J 18Ne 20Ne 22Ne
  SM NN SM NN SM NN
G11110 0 0 0~4 0 0~6 0
G11220 0 0 0,2,4 0 0,2,4 0
G11330 0 0 0 0 0,2 0
G22220 0 0 0,2~4 0 0~5 0
G22330 0 0 0 0 0 0
G33330 0 0 0 0 0~2 0
G12121 1 1 1 0 0 0
G12231 1 1 2 0 0 0
G23231 1 1 0 0 0 3
G12122 2 2 0,2 0 0 0
G12132 2 2 2 0 2 2
G12222 2 2 1~4 0 0~6 0
G12232 2 2 0 0 0 2
G12332 2 2 0 0 0 0
G13132 2 2 4 2 0,2,4 2
G13222 2 2 0 0 0 0
G13232 2 2 0 2 0 0
G13332 2 2 2 2 0~4 2
G22222 2 2 0 0 0,2~4 0
G22232 2 2 2 0 2,3 0
G22332 2 2 0 0 0 0
G23232 2 2 2 0 0 0
G23332 2 2 0 0 0 0
G33332 2 2 2 0 0 0
G13133 3 3 5 2 0,2,4 3
G13233 3 3 4 0 3 0
G23233 3 3 0 0 0 0
G23234 4 4 6 6 6 6
G23334 4 4 4 0 2,3 0
G33334 4 4 4 4 0 0
The subscripts j1, j2, j3, j4 are equal to 1, 2, 3, corresponding to s1/2, d3/2, and d5/2 orbits in the sd shell, respectively. I=0~4 in this table represents the degenerate states with spin 0, 1, 2, 3, and 4 from the shell model. The inconsistency between the NN and shell models is highlighted in bold
Table 7
Same as Table 6, except for 46Ca
Gj1j2;j3j4J SM NN Gj1j2;j3j4J SM NN
G11110 0~10 0 G33342 0 0
G11220 0~10 0 G33442 0 0
G11330 0,2~6 0 G34342 4 0
G11440 0~4 0 G34442 0 0
G22220 0~10 0 G44442 2 2
G22330 0,2,4,6 0 G13133 0,2,4 3
G22440 0 0 G13143 3 0
G33330 0~6 0 G13233 3 0
G33440 0 0 G13243 4 0
G44440 0~4 0 G13343 4 0
G12121 0 0 G14143 0,2,4~8 2
G12231 0 0 G14233 0 0
G12341 0,9 0 G14243 0 0
G23231 0 0 G14343 0 0
G23341 0 0 G23233 3 0
G34341 1,8 0 G23243 0 2
G12122 0 0 G23343 0 0
G12132 2 0 G24243 0 0
G12222 0~10 0 G24343 0 0
G12232 0,4,6 0 G34343 0,10 0
G12242 0 0 G14144 0,2,4~6,8 8
G12332 0 0 G14234 2 0
G12342 0,9 0 G14244 6 0
G12442 0 0 G14334 0 0
G13132 0,2,4 2 G14344 1 0
G13222 0,2,4,6 0 G14444 0~4 0
G13232 0 0 G23234 6 0
G13242 0 0 G23244 2 0
G13332 0~8 0 G23334 1~6 0
G13342 2 0 G23344 0,9 0
G13442 2 0 G23444 4 0
G22222 0~6 0 G24244 0 0
G22232 1,2,4,5 0 G24334 0 0
G22242 0 0 G24344 3 0
G22332 0,2~4,6 0 G24444 0,2~4 0
G22342 0 0 G33334 0 0
G22442 0,2~4 0 G33344 0 0
G23232 0 0 G33444 0,10 0
G23242 0 0 G34344 0 0
G23332 0,2~4,6 0 G34444 0 0
G23342 0,10 0 G44444 4 4
G23442 0 0 G24245 10 9
G24242 0,9 0 G24345 0 0
G24332 0 0 G34345 1 0
G24342 0 0 G34346 12 10
G24442 0 0 G34446 0 0
G33332 0 0 G44446 6 6
The subscripts j1, j2, j3, j4 are equal to 1, 2, 3, 4, corresponding to the p1/2, p3/2, f5/2, and f7/2 orbits in the pf shell, respectively

According to Table 5, in the (f7/2)4 and (h11/2)4 model spaces, the NN model produces G-I correlations that are perfectly consistent with those of the shell model. This explains the agreement between the shell and NN models shown in Figures 6(a,b) and 7(a,b). In Table 6, this perfect consistency can also be observed for the 18Ne space, in accordance with Figures 6(c) and 7(c). However, as the dimensions increase, the consistency for 20Ne and 22Ne in Table 6 and for 46Ca in Table 7 gradually decreases. In the 20Ne space, 12 of the 30 G-I correlations (40%) are inconsistent between the SM and NN; in 22Ne, six out of 30 (20%) are inconsistent; and in 46Ca, 24 out of 94 (~26%) are inconsistent. These inconsistency rates are also correlated with the prediction accuracies for the different model spaces, as shown in Table 4.

Furthermore, the empirical approach is also applicable to the trained NN model, by setting one of the inputs of the NN to -1 and the rest to 0 and recording the ground-state spin (I) output by the network. This approach also reveals the correlation between the interaction matrix elements and the predicted ground-state spin, as well as the PI distribution, of the well-trained NN model. Table 5 presents the correlations between the matrix elements and the spin obtained from the shell and NN models. The correlations from both models are identical, indicating that our NN model successfully learns the G-I correlation suggested by the empirical approach. Thus, it accurately reproduces the ground-state spin of the shell model for simple model spaces.

With the G-I correlation from such an NN model, we counted the ground-state spins I emerging in the G-I correlation and then normalized them to the PI distribution, as guided by the empirical approach. The PI distributions based on the empirical approach using the NN model are shown in Fig. 7. The empirical approach applied to both the SM and NN models yields reasonably consistent PI distributions for all model spaces, suggesting that the NN may effectively capture the correlation between the two-body interaction matrix elements and the ground-state spin, which further explains its remarkable performance in reproducing the statistical properties of the ground-state spin in the TBRE.

4

CONCLUSION

This study used an NN model to investigate the distribution of the ground-state spin in the TBRE. Using a softmax classification NN model, we attempted to reproduce the correlation between the matrix elements of the interaction and ground-state spin, as labeled by the shell model, for the TBRE. The reliability of the NN model was analyzed based on its prediction accuracy and consistency with the empirical rules of the PI distribution.

Previous applications of NN models in nuclear physics primarily focused on their strong fitting capabilities. However, the analysis of the ground-state spin distribution in TBRE demonstrated the classification ability of the NN, which is rare in the literature. Furthermore, TBRE provided extensive samples for training NNs, thereby potentially enhancing the performance of the NN model.

In our investigation, we adopted various strategies to enhance network performance, including the introduction of BNN, CNN, and RNN, feature selection, and adjusting the number of neural nodes and hidden layers. However, none of these approaches yielded significant improvements with limited computational resources. Therefore, we must acknowledge that the quantum many-body problem remains a formidable challenge for NN models. Addressing this challenge may necessitate the further development of NN architectures tailored for analyzing the nuclear ground-state spin in the TBRE.

However, NN models still offer some insights into the specific robust statistical properties of the ground-state spin. For example, they effectively capture the distribution of the ground-state spin, as shown in Fig. 7. Moreover, the resulting confusion matrix exhibits dominant diagonal elements, indicating the consistency between the ground-state spin from the shell model and that predicted by the NN model, as shown in Fig. 6. This success can be attributed to the capacity of the NN to replicate the correlation between the ground-state spin and two-body interaction matrix element in the shell model, as shown in Tables 5, 6, and 7.

References
1. H.A. Weidenmüller, G.E. Mitchell, Random matrices and chaos in nuclear physics: Nuclear structure. Rev. Mod. Phys. 81(2), 539-589 (2009). https://doi.org/10.1103/RevModPhys.81.539
2. O. Bohigas, M.J. Giannoni, C. Schmit, Spectral properties of the Laplacian and random matrix theories. J. Phys. Lett. 45(21), 1015-1022 (1984). https://doi.org/10.1051/jphyslet:0198400450210101500
3. S.S.M. Wong, J.B. French, Level-density fluctuations and two-body versus multi-body interactions. Nucl. Phys. A 198(1), 188-208 (1972). https://doi.org/10.1016/0375-9474(72)90779-8
4. O. Bohigas, J. Flores, Two-body random hamiltonian and level density. Phys. Lett. B 34(4), 261-263 (1971). https://doi.org/10.1016/0370-2693(71)90598-3
5. J.B. French, S.S.M. Wong, Validity of random matrix theories for many-particle systems. Phys. Lett. B 33(7), 449-452 (1970). https://doi.org/10.1016/0370-2693(70)90213-3
6. M.G. Mayer, On closed shells in nuclei. Phys. Rev. 74(3), 235-239 (1948). https://doi.org/10.1103/PhysRev.74.235
7. O. Haxel, H.J.D. Jensen, H.E. Suess, On the "Magic Numbers" in nuclear structure. Phys. Rev. 75(11), 1766 (1949). https://doi.org/10.1103/PhysRev.75.1766.2
8. V. Zelevinsky, B.A. Brown, N. Frazier et al., The nuclear shell model as a testing ground for many-body quantum chaos. Phys. Rep. 276, 85-176 (1996). https://doi.org/10.1016/S0370-1573(96)00007-5
9. T. Guhr, A. Müller-Groeling, H.A. Weidenmüller, Random-matrix theories in quantum physics: common concepts. Phys. Rep. 299(4-6), 189-425 (1998). https://doi.org/10.1016/s0370-1573(97)00088-4
10. V.K.B. Kota, Embedded random matrix ensembles for complexity and chaos in finite interacting particle systems. Phys. Rep. 347(3), 223-288 (2001). https://doi.org/10.1016/S0370-1573(00)00113-7
11. V. Zelevinsky, A. Volya, Nuclear structure, random interactions and mesoscopic physics. Phys. Rep. 391(3), 311-352 (2004). https://doi.org/10.1016/j.physrep.2003.10.008
12. C.W. Johnson, G.F. Bertsch, D.J. Dean, Orderly spectra from random interactions. Phys. Rev. Lett. 80(13), 2749 (1998). https://doi.org/10.1103/PhysRevLett.80.2749
13. C.W. Johnson, G.F. Bertsch, D.J. Dean et al., Generalized seniority from random Hamiltonians. Phys. Rev. C 61, 014311 (1999). https://doi.org/10.1103/PhysRevC.61.014311
14. R. Bijker, A. Frank, Band structure from random interactions. Phys. Rev. Lett. 84(3), 420-422 (2000). https://doi.org/10.1103/PhysRevLett.84.420
15. D. Kusnezov, N.V. Zamfir, R.F. Casten, Robust nuclear observables and constraints on random interactions. Phys. Rev. Lett. 85(7), 1396 (2000). https://doi.org/10.1103/PhysRevLett.85.1396
16. H. Feshbach, F. Iachello, The interacting boson model. Ann. Phys. 84, 211-231 (1974). https://doi.org/10.1016/0003-4916(74)90300-5
17. R. Bijker, A. Frank, S. Pittel, On the dominance of J(P)=0(+) ground states in even-even nuclei from random two-body interactions. Phys. Rev. C 60(2), 021302 (1999). https://doi.org/10.1103/PhysRevC.60.021302
18. D. Mulhall, A. Volya, V. Zelevinsky, Geometric chaoticity leads to ordered spectra for randomly interacting fermions. Phys. Rev. Lett. 85(19), 4016-4019 (2000). https://doi.org/10.1103/PhysRevLett.85.4016
19. Y.M. Zhao, A. Arima, Towards understanding the probability of 0+ ground states in even-even many-body systems. Phys. Rev. C 64(4), 041301 (2001). https://doi.org/10.1103/PhysRevC.64.041301
20. D. Kusnezov, Two-body random ensembles: From nuclear spectra to random polynomials. Phys. Rev. Lett. 85(18), 3773 (2000). https://doi.org/10.1103/PhysRevLett.85.3773
21. R. Bijker, A. Frank, Mean-field analysis of interacting boson models with random interactions. Phys. Rev. C 64(6), 061303 (2001). https://doi.org/10.1103/PhysRevC.64.061303
22. R. Bijker, A. Frank, Regular spectra in the vibron model with random interactions. Phys. Rev. C 65(4), 044316 (2002). https://doi.org/10.1103/PhysRevC.65.044316
23. L. Kaplan, T. Papenbrock, C.W. Johnson, Spin structure of many-body systems with two-body random interactions. Phys. Rev. C 63, 014307 (2000). https://doi.org/10.1103/physrevc.63.014307
24. L. Kaplan, T. Papenbrock, Wave function structure in two-body random matrix ensembles. Phys. Rev. Lett. 84(20), 4553-4556 (2000). https://doi.org/10.1103/PhysRevLett.84.4553
25. S. Drozdz, M. Wojcik, Nature of order from random two-body interactions. Physica A: Statistical Mechanics and its Applications 301(1), 291-300 (2001). https://doi.org/10.1016/S0378-4371(01)00403-4
26. J.J. Shen, Correlation between the probability of spin-zero ground state and TBME in the presence of random interactions. Nucl. Phys. Rev. 37(3), 523-529 (2020). https://doi.org/10.11804/NuclPhysRev.37.2019CNPC15
27. P. Chau Huu-Tai, A. Frank, N.A. Smirnova et al., Geometry of random interactions. Phys. Rev. C 66(6), 061302 (2002). https://doi.org/10.1103/PhysRevC.66.061302
28. Y.M. Zhao, A. Arima, N. Yoshinaga, Regularities of many-body systems interacting by a two-body random ensemble. Phys. Rep. 400(1), 1-66 (2004). https://doi.org/10.1016/j.physrep.2004.07.004
29. S. Gazula, J.W. Clark, H. Bohr, Learning and prediction of nuclear stability using neural networks. Nucl. Phys. A 540(1-2), 1-26 (1992). https://doi.org/10.1016/0375-9474(92)90191-L
30. Y.G. Ma, L.G. Pang, R. Wang et al., Phase transition study meets machine learning. Chinese Phys. Lett. 40, 122101 (2023). https://doi.org/10.1088/0256-307X/40/12/122101
31. W. He, Q. Li, Y. Ma et al., Machine learning in nuclear physics at low and intermediate energies. Sci. China Phys. Mech. Astron. 66(8), 282001 (2023). https://doi.org/10.1007/s11433-023-2116-0
32. W.B. He, Y.G. Ma, L.G. Pang et al., Machine learning in high-energy nuclear physics. Nucl. Sci. Tech. 34(6), 88 (2023). https://doi.org/10.1007/s41365-023-01233-z
33. R. Utama, J. Piekarewicz, H.B. Prosper, Nuclear mass predictions for the crustal composition of neutron stars: A Bayesian neural network approach. Phys. Rev. C 93, 014311 (2016). https://doi.org/10.1103/physrevc.93.014311
34. Z.M. Niu, H.Z. Liang, Nuclear mass predictions based on Bayesian neural network approach with pairing and shell effects. Phys. Lett. B 778, 48-53 (2018). https://doi.org/10.1016/j.physletb.2018.01.002
35. X.C. Ming, H.F. Zhang, R.R. Xu et al., Nuclear mass based on the multi-task learning neural network method. Nucl. Sci. Tech. 33(4), 48 (2022). https://doi.org/10.1007/s41365-022-01031-z
36. T.S. Shang, J. Li, Z.M. Niu, Prediction of nuclear charge density distribution with feedback neural network. Nucl. Sci. Tech. 33(12), 153 (2022). https://doi.org/10.1007/s41365-022-01140-9
37. R. Utama, W.C. Chen, J. Piekarewicz, Nuclear charge radii: Density functional theory meets Bayesian neural networks. J. Phys. G: Nucl. Part. Phys. 43(11), 114002 (2016). https://doi.org/10.1088/0954-3899/43/11/114002
38. Y.F. Wang, Z.M. Niu, Studies of nuclear low-lying excitation spectra with multi-task neural network. Nucl. Phys. Rev. 39(3), 273-280 (2022). https://doi.org/10.11804/NuclPhysRev.39.2022043
39. Y.F. Wang, X.Y. Zhang, Z.M. Niu et al., Study of nuclear low-lying excitation spectra with the Bayesian neural network approach. Phys. Lett. B 830, 137154 (2022). https://doi.org/10.1016/j.physletb.2022.137154
40. Z.M. Niu, H.Z. Liang, B.H. Sun et al., Predictions of nuclear β-decay half-lives with machine learning and their impact on r-process nucleosynthesis. Phys. Rev. C 99(6), 064307 (2019). https://doi.org/10.1103/PhysRevC.99.064307
41. X. Glorot, A. Bordes, Y. Bengio, Deep sparse rectifier neural networks. Journal of Machine Learning Research 15, 315-323 (2011).
42. C.K.I. Williams, D. Barber, Bayesian classification using Gaussian processes. IEEE Trans. Pattern Anal. Mach. Intell. 20(12), 1342-1351 (1998). https://doi.org/10.1109/34.735807
43. D. Kingma, J. Ba, Adam: A method for stochastic optimization. arXiv:1412.6980 (2014). https://doi.org/10.48550/arXiv.1412.6980
44. R.D. Lawson, Theory of the Nuclear Shell Model (Clarendon Press, Oxford, 1980).
45. Y. LeCun, Y. Bengio, G. Hinton, Deep learning. Nature 521(7553), 436-444 (2015). https://doi.org/10.1038/nature14539
46. S.R. Dubey, S.K. Singh, B.B. Chaudhuri, Activation functions in deep learning: A comprehensive survey and benchmark. arXiv:2109.14545 (2021). https://doi.org/10.48550/arXiv.2109.14545
47. F.P. Li, Y.J. Wang, Z.P. Gao et al., Application of machine learning for the determination of impact parameters in the 132Sn + 124Sn system. Phys. Rev. C 104, 034608 (2021). https://doi.org/10.1103/PhysRevC.104.034608
48. J. Bouvrie, Notes on convolutional neural networks (2006).
49. F.P. Li, Y.J. Wang, Q.F. Li, Using deep learning to study the equation of state of nuclear matter. Nucl. Phys. Rev. 37(4), 825-832 (2020). https://doi.org/10.11804/NuclPhysRev.37.2020017
50. Y.Y. Cao, J.Y. Guo, B. Zhou, Prediction of nuclear charge radii based on a convolutional neural network. Nucl. Sci. Tech. 34(10), 152 (2023). https://doi.org/10.1007/s41365-023-01308-x
51. H. Salehinejad, S. Sankar, J. Barfett et al., Recent advances in recurrent neural networks. arXiv:1801.01078 (2017). https://doi.org/10.48550/arXiv.1801.01078
52. L.R. Medsker, L.C. Jain, Recurrent Neural Networks: Design and Applications (CRC Press, Boca Raton, 1999).
53. R. Engelken, F. Wolf, L.F. Abbott, Lyapunov spectra of chaotic recurrent neural networks. arXiv:2006.02427 (2020). https://doi.org/10.48550/arXiv.2006.02427
54. M. Mukaka, Statistics corner: A guide to appropriate use of correlation coefficient in medical research. Malawi Medical Journal 24(3), 69-71 (2012).
55. P. Sedgwick, Pearson's correlation coefficient. BMJ 345, e4483 (2012). https://doi.org/10.1136/bmj.e4483
Footnote

The authors declare that they have no competing interests.