Research on inversion method for complex source-term distributions based on deep neural networks

NUCLEAR ELECTRONICS AND INSTRUMENTATION

Yi-Sheng Hao
Zhen Wu
Yan-Heng Pu
Rui Qiu
Hui Zhang
Jun-Li Li
Nuclear Science and Techniques, Vol. 34, No. 12, Article number 195. Published in print Dec 2023. Available online 09 Dec 2023.

This study proposes a source distribution inversion convolutional neural network (SDICNN), which is a deep neural network model for the inversion of complex source distributions, to solve inversion problems involving fixed-source distributions. A function is developed to obtain the distribution information of complex source terms from radiation parameters at individual sampling points in space. The SDICNN comprises two components: a fully connected network and a convolutional neural network. The fully connected network mainly extracts the parameter measurement information from the sampling points, whereas the convolutional neural network mainly completes the fine inversion of the source-term distribution. Finally, the SDICNN obtains a high-resolution source-term distribution image. In this study, the proposed source-term inversion method is evaluated based on typical geometric scenarios. The results show that, unlike the conventional fully connected neural network, the SDICNN model can extract the two-dimensional distribution features of the source terms, and its inversion results are better. In addition, the effects of the shielding mechanism and the number of sampling points on the inversion process are examined. In summary, the results of this study can facilitate the accurate assessment of dose distributions in nuclear facilities.

Source Term Inversion, Monte Carlo, Artificial Intelligence, Neural Network
1

Introduction

“As low as reasonably achievable” is one of the most important principles of radiation protection systems. The key idea is to rationally use resources to reduce radiation hazard. Economic and social factors are considered to reduce the dose and risk levels as much as possible. Currently, many nuclear facilities are or will be decommissioned, whereas many new nuclear facilities, such as nuclear power plants, are under construction. The assessment and control of occupational exposure dose during the maintenance and decommissioning of nuclear facilities are important topics in radiation protection research. In addition, the radiation dose levels in the environment must be assessed accurately to protect people’s lives and property.

To evaluate the radiation dose level in the vicinity of a nuclear facility, information regarding the source terms in the radiation field, such as the energy spectrum, spatial distribution, angular distribution and other parameters, is indispensable. However, in actual operation, these parameters cannot be accurately measured or calculated. Therefore, the source-term information must be obtained in space via inversion methods, followed by the calculation of radiation dose levels in space using source-term parameters to comprehensively evaluate the radiation dose level in the entire space as well as to provide reference for reducing the radiation dose and risk level to operators. Depending on the type of source term to be inverted, source-term inversion algorithms can be classified into source-term inversion under nuclear and non-nuclear accident conditions.

In the case of a nuclear accident, the type and amount of radioactive material released into the environment must be determined. Considerable research has been conducted to determine the total amount and composition of radioactive material released into the atmosphere. As presented in Table 1, the methods used for source-term inversion can be classified into four categories.

Table 1
Comparison of typical source-term inversion algorithms
Method | Type of Inversion | Advantages | Disadvantages
Optimal Interpolation Method | Source term location inversion | Simple application, widely used in the field of assimilation | Less effective for complex situations
Genetic Algorithm | Source term location inversion | Can be combined with other diffusion models, which makes it versatile | Relatively weak ability to handle constrained optimization
Kalman Filter | Source term location inversion | The variance matrix is developed explicitly | Huge computational overhead
Artificial Neural Network | Source term location and distribution inversion | Some qualitative data needed in traditional models are not required | A large amount of reliable training data is needed, and its interpretability is poor

(1) The optimal interpolation method [1]. The optimal interpolation method, which was first proposed in 1963 [2], is based on the theory of linear least-squares estimation, the basic principle of which is to minimize the sum of squared errors between n observations and the corresponding theoretical calculations. Its primary advantage is its ability to “automatically” process various observations with varying accuracies.

(2) Genetic algorithms [3]. A genetic algorithm is a type of combinatorial optimization algorithm. It is a bionic algorithm that mainly adheres to the law of “survival of the fittest” in biological evolution, using a certain coding technology to alter a binary string known as the chromosome. It begins with an initial population, repeats the processes of selection, hybridization, and mutation, and thereby causes the population to evolve closer to a certain goal to obtain an optimal solution to a practical problem.

(3) Kalman filters [4, 5]. The Kalman filter is an optimal recursive data-processing algorithm that integrates all possible observations and statistical characteristics of errors in a model as well as observations to estimate a specific variable, thus minimizing the statistical error. As an important optimal estimation tool, the Kalman filter is widely used in various fields, such as inertial navigation [6], global positioning systems [7], target tracking [8], and weather forecasting [9].

(4) Artificial neural network (ANN) [10]. An ANN is an abstract model of the human brain. It is a complex network formed by numerous simple processing units (or neurons) that are extensively interconnected, thus reflecting many basic properties of the human brain. A neural network forms a model by learning several examples in a process known as “training” and stores the knowledge obtained in the connections between processing units, i.e., weighted data. ANNs can be used to solve a series of problems related to nonlinear regression, data prediction, and dynamics. Currently, artificial intelligence methods based on neural networks are widely used in nuclear facility fault diagnosis [11, 12], radiation imaging [13, 14], nuclear data measurement [15, 16], computational fluid dynamics [17], reactor safety [1], multiphase flow measurement [18], radiation field reconstruction [19-21], and other applications.

This paper mainly discusses the practical application of source-term inversion to non-accident situations. Depending on the type of information, source-term inversion can be classified into source-term location inversion and source-term distribution inversion. The main purpose of source-term location inversion is to locate unknown source terms that are important for determining source-term loss, maintaining equipment, and decommissioning nuclear facilities. A source-term inversion method based on the maximum likelihood estimation was proposed by researchers at Nagoya University, Japan [22]. The results showed that inversion accuracy was significantly affected by the detection parameters and measurement data. Using the angular response property of multiple detectors, researchers at South China University proposed a method for locating radioactive sources of gamma-rays, which is mainly applied to locate the position of a lost source term. Researchers at Tsinghua University implemented a source-term inversion algorithm based on the least-squares method and performed experiments to prove its feasibility; the algorithm is mainly applicable to scenarios in which the source-term location is known but the source-term distribution is unknown. Recently, researchers at the South China University of Technology proposed a radiation-field inversion method based on a grid interpolation function that can restore lost data in a gridded radiation field, thereby achieving a certain level of source-term distribution restoration capability [23]. However, because the grid interpolation function method requires several input parameters, it presents a few limitations in practical applications.

In practice, operators at nuclear facility sites are aware of hotspots in key areas, source-term locations, and quantity parameters. In addition, the nuclides of the source term at the operation site are unchanged and can be obtained by measuring the characteristic energy of the detector. Hence, the distribution of the source term can be calculated using the measured value of the radiation field dose rate. Subsequently, the calculated source-term distribution is used to calculate the dose rate at other locations. This method is relatively more accurate and reliable than using data obtained via simple interpolation and extrapolation.

The inversion of the source-term distribution from the measured dose rate of the radiation field involves solving an equation. The accuracy of an equation directly determines the accuracy of the results. Having more information regarding parameters other than the source distribution allows one to obtain more accurate equations and source distributions. Currently, studies related to inversion methods for complex source-term distributions are few, and such methods are imposed with stringent requirements for spatial dose data or exhibit low inversion accuracy, which hinder their portability and accuracy in engineering applications. Therefore, a more general, efficient, and accurate inversion model for complex source-term distributions must be developed.

To satisfy the requirements above, an ANN is used in this study to investigate the problem of inverting a complex source-term distribution under non-accident conditions. For the inversion of a complex source-term distribution, using a neural network offers three advantages: (1) In an actual operation scenario, the measured values of the radiation parameters at each sampling point are the result of the combined effect of each discrete position of the source term on the position of the sampling point. In addition, this result exhibits strong nonlinear characteristics owing to the effect of the scattering term in space. Neural networks exhibit strong nonlinear mapping capability owing to their activation function [24]. (2) A convolutional neural network (CNN) can effectively extract the features of the source-term distribution using the receptive field to efficiently complete the inversion of the complex source-term distribution [25,26]. (3) The training and validation process of the neural network is mainly based on data and does not require consideration of the specific particle transport process or complex geometric structure; therefore, the network can rapidly perform complex source-term inversion based on the radiation parameter measurements of the sampling points.

In summary, based on the ANN method, this study investigates the inversion of complex source-term distribution information obtained from the measured values of radiation parameters at the sampling points of radiation fields. Subsequently, a neural network structure suitable for the inversion of fixed complex source-term distributions, named the source distribution inversion convolutional neural network (SDICNN), is proposed. In addition, an experimental validation example is established with reference to an actual scenario of spent fuel storage, and the dataset required for training and validating the proposed neural network method is obtained via Monte Carlo simulation.

2

Complex Source-Term Inversion Method

2.1
Principle of Source-Term Inversion

If the source-term geometry is known at the time of inversion, then it can be discretized into several sub-geometries comprising a set $G=[g_1,g_2,\ldots,g_n]$. Each sub-geometry has a corresponding activity, the set of which is $S=[S_1,S_2,\ldots,S_n]$, and the dose rate in space is expressed as follows: $D=F(S,G,P)$, where $F$ represents an operator for calculating the dose, $P=[P_1,P_2,\ldots,P_m]\in\mathbb{R}^m$ represents a matrix containing the positions of $m$ sampling points in space, and $D=[D_1,D_2,\ldots,D_m]$ is the dose vector corresponding to $P$. Subsequently, the process of inverting the source-term activity distribution can be expressed as follows: $S=I(P,G,D)+\epsilon$, where $I$ represents the inversion operator and $\epsilon\in\mathbb{R}^n$ represents the inversion error. The equation above is a general expression for the main source-term inversion method. The general expression of a neural network with input $X$ and output $O$ is $O=f(X)$.

In summary, the general form of the neural network used to reconstruct the source term is as follows: $S=f(P,G,D)+\epsilon$.

For regression problems, the most commonly used loss function is the mean squared error (MSE) function, which is expressed as follows: $\mathrm{MSE}=\frac{1}{n}\sum_{i=1}^{n}(S_i-O_i)^2$.

Thus, the method of applying a neural network for source-term distribution inversion can be described as follows: A predictive model is constructed using a neural network dedicated to minimizing the loss function; subsequently, a model with the smallest possible error is determined by training the parameters of the network model.
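As a minimal sketch of the loss described above, the MSE between a true source vector S and a network output O can be computed as follows. The function name `mse` and the example values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def mse(s_true, s_pred):
    """Mean squared error between true activities S and network outputs O."""
    s_true = np.asarray(s_true, dtype=float)
    s_pred = np.asarray(s_pred, dtype=float)
    return float(np.mean((s_true - s_pred) ** 2))

# Hypothetical three-cell source: only the last cell is mispredicted (by 2).
example_loss = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Training then amounts to adjusting the network parameters so that this value, averaged over the training cases, becomes as small as possible.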

2.2
Method and Process

In this study, deep neural networks and Monte Carlo methods were used to calculate examples with different source-term parameters to form a dataset for model training and to realize source-term inversion. For a series of detectors at different positions in space, the two-dimensional distribution of the source term must be calculated based on the measured parameters of the detectors. Conceptually, this process comprises two aspects:

(1) Extraction of measurement parameter information for the sampling points. The two-dimensional distribution information of the source term is encoded using the relative magnitudes of the measured parameters at the sampling points. Therefore, the measured parameters of each sampling point must be considered as independent variables and information from the relative sizes between the sampling points must be extracted.

(2) Reconstruction of the source-term distribution. The two-dimensional distribution of the source term is reconstructed based on the information extracted in the previous step. Multiple source-term distributions can yield the same measured parameters at a given set of sampling points; therefore, this problem is ill-posed: it is an underdetermined inverse problem that has no unique solution and typically requires constraining the solution space using the corresponding samples.
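The non-uniqueness mentioned in item (2) can be illustrated with a toy example. The sketch below assumes a purely linear response matrix `A` (a simplification introduced here; the paper's dose operator F includes scattering and is not stated to be linear) with fewer sampling points than source cells, and constructs two different sources that produce identical readings.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 4, 6                      # hypothetical: 4 sampling points, 6 source cells
A = rng.random((m, n))           # assumed linear response matrix (illustration only)

S1 = rng.random(n)               # one candidate source distribution

# With m < n, A has a non-trivial null space. The last right-singular
# vector from a full SVD lies in that null space.
_, _, vt = np.linalg.svd(A)
null_vec = vt[-1]

S2 = S1 + 0.5 * null_vec         # a different source ...
D1, D2 = A @ S1, A @ S2          # ... with the same detector readings

print(np.allclose(D1, D2), np.allclose(S1, S2))
```

In this toy setting the measurements alone cannot distinguish S1 from S2, which is why additional constraints, such as representative training samples, are needed to select a plausible solution.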

In this study, a neural network model dedicated to the inversion of complex source-term distributions based on the idea above is proposed. The radiation parameters measured at n sampling points in the computing space are used to form a 1×n one-dimensional array as the input parameters for the neural network, and the sampling points are independent of each other. Similarly, two-dimensional images of the source-term distribution are used as the output parameters of the neural network. In this study, the SDICNN comprises two components: a fully connected network and a CNN. The extraction of the measurement parameter information of the sampling points is mainly completed in the fully connected network. The low-resolution source-term distribution parameters are obtained from individual sampling points in the radiation field via the fully connected neural network. The one-dimensional feature representation output by the fully connected neural network is converted into a two-dimensional feature representation to obtain a low-resolution source-term distribution image, which typically exhibits 1/2 the resolution of the source term. The CNN mainly uses a structure similar to that of the Super Resolution CNN (SRCNN) to complete the fine reconstruction of the source-term distribution, i.e., to obtain a high-resolution image from a low-resolution image. The structure of the SDICNN is shown in Figure 1.

Fig. 1
(Color online) Diagram of the SDICNN model

In summary, the inversion process for the source-term distribution shown in Figure 2 is as follows:

Fig. 2
Flowchart for fixed complex source-term inversion

(1) The number and location of sampling points in space, as characteristic variables of the neural network input, are determined based on the geometry of the space; simultaneously, the resolution of the complex source-term distribution is determined and its two-dimensional image is used as the target signal for the output of the network.

(2) A certain amount of data is required to train, test, and verify the neural network. Therefore, different source-term distributions are constructed, and the parameters of the sampling points under the source-term distribution condition are used as datasets and calculated via Monte Carlo simulation. In addition, the input and output of the network are standardized to accelerate the convergence of the neural network.

(3) Based on the input and output parameters of the neural network combined with the complexity of the actual radiation device space, hyperparameter settings such as the number of hidden layers and the number of nodes of this neural network are determined.

(4) A preprocessed dataset is used to train the neural network. First, the linear layer is trained, and the parameters of the previous linear layer are fixed; subsequently, the convolution and deconvolution layers are trained.

(5) Test samples are used to evaluate the source-term inversion performance of the trained deep neural network model.

(6) The fixed complex source term is inverted using the deep neural network model obtained after testing.
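The two-stage order in step (4), fitting the first stage and then freezing it while the refinement stage is fitted, can be sketched with a deliberately simplified stand-in. The example below swaps the paper's gradient-trained nonlinear layers for linear least-squares fits, and all sizes, the synthetic data, and the names `W1` and `W2` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cases, n_points = 200, 12      # hypothetical dataset size and detector count
H, W = 6, 4                      # toy high-resolution source grid

# Synthetic stand-in data: random sources and a random linear detector response.
S_high = rng.random((n_cases, H, W))
A = rng.random((H * W, n_points))
X = S_high.reshape(n_cases, -1) @ A          # detector readings

def downsample(img):
    """2x2 average pooling: (n, h, w) -> (n, h/2, w/2)."""
    n, h, w = img.shape
    return img.reshape(n, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def upsample(img):
    """Nearest-neighbour upsampling: (n, h, w) -> (n, 2h, 2w)."""
    return img.repeat(2, axis=1).repeat(2, axis=2)

Y_low = downsample(S_high).reshape(n_cases, -1)
Y_high = S_high.reshape(n_cases, -1)

# Stage 1: fit the "linear layer" against the low-resolution targets.
W1, *_ = np.linalg.lstsq(X, Y_low, rcond=None)
W1_frozen = W1.copy()

# Stage 2: W1 is held fixed; only the refinement map W2 is fitted.
low_pred = (X @ W1).reshape(n_cases, H // 2, W // 2)
Z = upsample(low_pred).reshape(n_cases, -1)
W2, *_ = np.linalg.lstsq(Z, Y_high, rcond=None)

baseline_mse = float(np.mean((Z - Y_high) ** 2))        # upsampling only
refined_mse = float(np.mean((Z @ W2 - Y_high) ** 2))    # with stage-2 refinement
```

Because the identity map is one admissible choice for `W2`, the stage-2 fit cannot be worse than plain upsampling on the fitting data. In the actual method, the second stage would be the convolution and deconvolution layers trained by backpropagation.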

3

Experiments

3.1
Calculation Example

Many nuclear power plants are beginning to use the dry storage of spent fuel, which requires the construction of storage system facilities for the away-from-reactor storage of spent fuel. Using a dry spent fuel storage system as a benchmark, we constructed a spent fuel transfer container with a simple geometry to test the inversion method described above.

The calculated area in this example was a cube-shaped room measuring 10 m × 10 m × 10 m. The surrounding walls were 0.20-m-thick concrete walls. The entire room was partitioned into two areas by a 0.20-m-thick wall. The spent fuel transfer container was placed upright inside the room. The dimensions of the container were as follows: outer diameter, 2.34 m; inner diameter, 2.20 m; height, 5.00 m; bottom thickness, 0.20 m. The container cover, which was made of stainless steel, featured a diameter of 2.40 m and a height of 0.20 m. A cylindrical beam with a diameter of 1.00 m and a height of 5.80 m was placed above the inner room to increase the complexity of the geometry. The positions and coordinates of the main geometric components in this study are listed in Table 2, and a geometric diagram is shown in Fig. 3a.

Table 2
Positions and coordinates of main geometric components in examples
No. | Component Name | Component Coordinates (mm) | Component Dimensions | Material
1 | Transfer Container | (0, 0, 0) | Outer Diameter: 2.34 m, Inner Diameter: 2.20 m, Height: 5.00 m | Stainless Steel
2 | Cylindrical Room Beam | (1000, -2800, 7000) | Diameter: 1.00 m, Height: 5.80 m | Stainless Steel
3 | Surrounding Walls | - | Thickness: 0.20 m | Concrete
4 | Wall Inside the Room | (-2800, 3000, 0) | Thickness: 0.20 m | Stainless Steel
Fig. 3
(Color online) Simple example of geometric diagram (a); complex surface diagram of the cross-section (b); and circumferential angles (c)
3.2
Dataset

In this study, the source term refers to a neutron source and the energy spectrum is the Watt fission spectrum. The outside of the container is regarded as the surface source term, which is abstracted as a two-dimensional array. The two dimensions are the circumferential discrete angle of the source term and the axial discrete mesh of the source term. The complex surface source term in the example is discretized into a series of sub-surface sources, and the particles are emitted uniformly outward in each region with a certain intensity, based on the assumption that the energy spectrum remains unchanged. In the example described herein, the full 360° circumference of the cylindrical container is segmented into 30 angles, and each angle is segmented into 20 sampling regions along the axis. Therefore, the number of parameters of the source term to be inverted is 30 × 20 = 600, as shown in Figs. 3b and 3c.

In this study, the abovementioned two-dimensional arrays represent the intensity of each source-term discrete mesh. The intensity distribution of the source term in the example assumes the form of a two-dimensional trigonometric function. The parameters of the function are randomly sampled, including the number of terms (order), the amplitude, the frequency, and the phase of the trigonometric function. The circumferential and axial coordinates of the source term are substituted into the sampled two-dimensional distribution function to generate a set of source-term parameters. To enable the intensity distribution of the generated source term to change continuously in both the circumferential and axial directions, we use bivariate functions to represent the intensity of the source term in each mesh and trigonometric functions with multiple random parameters to represent the variation in the intensity of the source term, as shown in the formula below: $F(x,y)=\sum_{\mathrm{order}}\left[\mathrm{amp}_1\sin(\mathrm{freq}_1 x+\varphi_1)+\mathrm{amp}_2\cos(\mathrm{freq}_2 x+\varphi_2)+\mathrm{amp}_3\sin(\mathrm{freq}_3 y+\varphi_3)+\mathrm{amp}_4\cos(\mathrm{freq}_4 y+\varphi_4)\right]$,

where $F(x,y)$ represents the source-term intensity of the mesh at position $x$ in the circumferential direction and $y$ in the axial direction, and order, amp, freq, and $\varphi$ are randomly sampled parameters. Figure 4 shows examples of the two-dimensional distributions of different source terms in the dataset.
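A source-term generator of this kind could be sketched as follows. The sketch reads order as the number of summed trigonometric terms, which is an interpretation of the formula; the sampling ranges, the integer frequencies, the normalized axial coordinate, and the function name `random_source` are all assumptions, since the paper does not list them.

```python
import numpy as np

def random_source(n_theta=30, n_z=20, max_order=3, seed=None):
    """Generate one 2D source-intensity array from random trigonometric terms."""
    rng = np.random.default_rng(seed)
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)  # circumferential
    z = np.linspace(0.0, 1.0, n_z)             # axial, normalized (assumption)
    x, y = np.meshgrid(theta, z, indexing="ij")

    order = int(rng.integers(1, max_order + 1))  # number of summed terms
    field = np.zeros((n_theta, n_z))
    for _ in range(order):
        amp = rng.uniform(0.0, 1.0, size=4)            # assumed range
        freq = rng.integers(1, 4, size=4)              # assumed range
        phase = rng.uniform(0.0, 2.0 * np.pi, size=4)  # assumed range
        field += (amp[0] * np.sin(freq[0] * x + phase[0])
                  + amp[1] * np.cos(freq[1] * x + phase[1])
                  + amp[2] * np.sin(freq[2] * y + phase[2])
                  + amp[3] * np.cos(freq[3] * y + phase[3]))
    return field

source = random_source(seed=42)
print(source.shape)  # (30, 20)
```

The raw sum can take negative values, so a physical intensity map would need an offset or rescaling; how the authors handled this is not stated in the text.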

Fig. 4
(Color online) Two-dimensional distributions of source terms of different generated datasets
3.3
Monte Carlo Calculation

After a series of source-term distributions are generated using the method above, the Monte Carlo method is used to calculate the example. The Monte Carlo method can be used not only to flexibly model the geometry of source terms, but also to accurately describe and analyze their distribution, thus rendering it suitable for rapid calculations and generating datasets required for training [27]. To improve the computational efficiency of Monte Carlo simulations, Dr. Pan [28] from Shanghai Jiao Tong University proposed a single-step Monte Carlo criticality algorithm, which has been used to generate transplutonium isotopes [29]. Similar methods include the DeGVR [30] and PDMC methods [31].

In this study, a Monte Carlo program known as MCShield was used for calculations. MCShield is a Monte Carlo program developed by the Radiation Protection and Environmental Protection Laboratory of Tsinghua University for shielding calculations. The custom source term function in MCShield was used in this study to sample the source terms. After normalizing the activity of the source particles at different locations, energies, and angles, the distribution of particles at different locations, energies, and angles was obtained. Information such as the location, energy, and angle of individual source particles was obtained via sampling.
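The normalize-then-sample step described above can be illustrated generically. The sketch below is not MCShield's interface, which is not documented in this text; it only shows how a discrete intensity map might be turned into a probability distribution and sampled. The function name `sample_source_cells` is hypothetical.

```python
import numpy as np

def sample_source_cells(intensity, n_particles, seed=None):
    """Sample (circumferential, axial) cell indices in proportion to intensity."""
    rng = np.random.default_rng(seed)
    intensity = np.asarray(intensity, dtype=float)
    if np.any(intensity < 0.0) or intensity.sum() <= 0.0:
        raise ValueError("intensities must be non-negative with a positive sum")
    prob = intensity.ravel() / intensity.sum()      # normalize to probabilities
    flat = rng.choice(prob.size, size=n_particles, p=prob)
    return np.unravel_index(flat, intensity.shape)  # tuple of index arrays

# Toy 2x2 intensity map: two cells emit nothing and must never be sampled.
i_theta, i_z = sample_source_cells([[0.0, 1.0], [3.0, 0.0]], 1000, seed=0)
```

Energy and emission angle would be sampled in the same way from their own normalized distributions.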

Subsequently, two cases of the same source term, without and with a shield, were computed separately, and 10000 cases were computed to create a dataset for training and validating the neural network. In this study, the MESH method was used to calculate the neutron flux over the entire calculation area. Each MESH cell measured 160 mm × 160 mm × 160 mm. In total, 64 × 64 × 64 MESH cells were configured. After the calculation was completed, the MESH parameters at the sampling points were extracted to obtain the measured values of the sampling points. In the case with shielding, the number of simulated particles was 1E7, and the average statistical error was approximately 0.07. In the case without shielding, the number of simulated particles was 1E6, and the average statistical error was approximately 0.10. Figure 5 shows the calculation results obtained with and without shielding under the same source-term distribution.
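Extracting the value at a sampling point from the flux mesh reduces to mapping a coordinate to a cell index. The sketch below uses the cell size and cell count given above, but the mesh origin is an assumption (the mesh is taken to be centered on the coordinate origin), and the name `mesh_index` is hypothetical.

```python
import numpy as np

CELL_MM = 160.0                           # cell edge length, from the text
N_CELLS = 64                              # cells per axis, from the text
LOWER_MM = -0.5 * CELL_MM * N_CELLS       # assumed lower corner: -5120 mm

def mesh_index(point_mm):
    """Return the (ix, iy, iz) index of the cell containing a point in mm."""
    idx = np.floor((np.asarray(point_mm, dtype=float) - LOWER_MM) / CELL_MM).astype(int)
    if np.any(idx < 0) or np.any(idx >= N_CELLS):
        raise ValueError("point lies outside the mesh")
    return tuple(int(i) for i in idx)

# Placeholder flux array standing in for a Monte Carlo result.
flux = np.zeros((N_CELLS, N_CELLS, N_CELLS))
reading = flux[mesh_index((0.0, 0.0, 0.0))]
```

Applying `mesh_index` to each detector position and collecting the corresponding flux values would yield the one-dimensional input array for the network.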

Fig. 5
(Color online) Calculation results for radiation field with shielding
3.4
Inversion calculation results

When training neural networks, normalization preprocessing is necessary to transform different feature ranges to achieve a mean of 0 and a variance of 1. The source-term parameters and flux values are transformed from being several orders of magnitude apart to the same order of magnitude. Subsequently, different features in the dataset are standardized, which allows the network model to learn the fitting relationship between the two more effectively and accelerate convergence. During the training of the neural network in this study, the linear layer was trained first. Subsequently, the parameters of the previous linear layer were fixed, and the convolution and deconvolution layers were trained. During this training process, the dropout technique was used to randomly turn off a certain proportion of neurons to avoid overfitting the model. The dataset used for the training contained 10000 examples, 150 training epochs were performed, and the MSE was selected as the loss function. In this example, different numbers of sampling points were selected on the wall around the spent fuel transfer container to simulate the actual detector arrangement. Owing to the different numbers of sampling points, the training times differed slightly. The time for one training epoch was 20–50 s, and the total training time was 1–2 h.
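The standardization step can be sketched as a per-feature z-score transform. The helper names and the synthetic flux values below are assumptions for illustration; the statistics would normally be fitted on the training split only and reused for validation and test data.

```python
import numpy as np

def fit_standardizer(data):
    """Return per-feature mean and standard deviation of a (cases, features) array."""
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    std = np.where(std == 0.0, 1.0, std)   # avoid division by zero for constant features
    return mean, std

def standardize(data, mean, std):
    return (data - mean) / std

def unstandardize(z, mean, std):
    return z * std + mean

rng = np.random.default_rng(0)
# Hypothetical flux readings spanning several orders of magnitude.
flux = rng.lognormal(mean=10.0, sigma=2.0, size=(500, 96))

mu, sigma = fit_standardizer(flux)
flux_z = standardize(flux, mu, sigma)
flux_back = unstandardize(flux_z, mu, sigma)
```

The same inverse transform is applied to the network output to recover source-term intensities in physical units.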

A total of 96 sampling points were selected around the side of the container as the input to each neural network, and the output source-term distribution was segmented into 30 angles in the circumferential direction and 20 regions in the axial direction. The fully connected neural network in the SDICNN contained three hidden layers, and the CNN in the SDICNN model performed two convolution operations and one upsampling operation, followed by two additional convolution operations. Figure 7a shows the inversion results of the SDICNN model for source terms outside the dataset.
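The layer sequence described above can be traced with a small forward-pass sketch using random, untrained weights. Only the layer order (three hidden layers, two convolutions, one upsampling, two convolutions), the 96 inputs, and the 30 × 20 output come from the text; the hidden width, 3 × 3 kernels, single channel, ReLU activations, and nearest-neighbour upsampling in place of a learned deconvolution are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def conv2d_same(img, kernel):
    """Single-channel 2D cross-correlation with zero padding (same output size)."""
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def upsample2x(img):
    return img.repeat(2, axis=0).repeat(2, axis=1)

N_POINTS = 96
LOW_SHAPE = (15, 10)     # half the 30 x 20 source resolution
HIDDEN = 256             # assumed hidden width

sizes = [N_POINTS, HIDDEN, HIDDEN, HIDDEN, LOW_SHAPE[0] * LOW_SHAPE[1]]
weights = [rng.normal(0.0, 0.1, (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]
kernels = [rng.normal(0.0, 0.1, (3, 3)) for _ in range(4)]

def sdicnn_forward(x):
    h = x
    for w, b in zip(weights[:-1], biases[:-1]):      # three hidden layers
        h = relu(h @ w + b)
    low = (h @ weights[-1] + biases[-1]).reshape(LOW_SHAPE)  # low-resolution image
    img = relu(conv2d_same(low, kernels[0]))
    img = relu(conv2d_same(img, kernels[1]))
    img = upsample2x(img)                            # 15 x 10 -> 30 x 20
    img = relu(conv2d_same(img, kernels[2]))
    return conv2d_same(img, kernels[3])

output = sdicnn_forward(rng.random(N_POINTS))
print(output.shape)  # (30, 20)
```

The sketch only checks that the shapes fit together; a practical model would use multiple channels and a deep-learning framework with trained parameters.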

Fig. 7
(Color online) Comparison of predicted source distribution (left) output by SDICNN model (a) and fully connected neural network (b), and actual source distribution (right)
4

Analysis of Neural Network Hyperparameters

First, to verify the effectiveness of the SDICNN model, an experiment was performed to compare a fully connected neural network model with the proposed SDICNN model. Second, to investigate the effects of different factors on the source-inversion performance of the SDICNN model, the relevant parameters were modified, and the corresponding network training results obtained from the source-inversion process were compared.

4.1
Comparison of Neural Network Inversion Results

In this study, an experiment was performed to compare a fully connected neural network model with the SDICNN model, both to verify the effectiveness of the neural network method for source-term inversion and to analyze the performance of the SDICNN model relative to that of a conventional fully connected neural network model. To be consistent with the SDICNN model, a typical fully connected neural network containing three hidden layers was used, whose structure is the same as that of the fully connected network of the SDICNN model, as shown in Fig. 6.

Fig. 6
(Color online) Schematic diagram of conventional fully connected neural network

For comparison purposes, all results are presented herein based on the same validation case. Different methods (such as the fully connected neural network and SDICNN model) and parameters (such as different numbers of sampling points) were used for source inversion, and the results are presented. Figures 7 and 8 show the inversion results of the fully connected neural network and SDICNN model for source terms outside the same dataset and the relative deviations from the actual source terms. The average absolute error of the source-term inversion results of the fully connected neural network was 87.41%, whereas that of the SDICNN model was 22.16%.
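The text does not define how the percentage error is computed. One plausible reading, shown here only as an assumption, is the mean of the cell-wise relative deviations between the predicted and actual source terms; the function name and the small `eps` guard are illustrative.

```python
import numpy as np

def mean_relative_error_percent(s_true, s_pred, eps=1e-12):
    """Mean of |pred - true| / |true| over all source cells, in percent."""
    s_true = np.asarray(s_true, dtype=float)
    s_pred = np.asarray(s_pred, dtype=float)
    rel = np.abs(s_pred - s_true) / (np.abs(s_true) + eps)  # eps guards zero cells
    return float(100.0 * np.mean(rel))

# Hypothetical three-cell example in which every cell is off by 10%.
err = mean_relative_error_percent([1.0, 2.0, 4.0], [1.1, 1.8, 4.4])
```

Under this definition, cells with intensities close to zero can dominate the average, which is worth keeping in mind when comparing the reported percentages.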

Fig. 8
(Color online) Relative errors of (a) fully connected neural network and (b) SDICNN model

The prediction results of the fully connected neural network and SDICNN model indicate that the fully connected network could not learn the two-dimensional features of the source-term distribution. This is mainly because a fully connected neural network regards the source terms as separate discrete points, whereas in reality, the source-term distribution is a continuous two-dimensional distribution with different continuous features in both the circumferential and axial directions. By contrast, the SDICNN model can extract features of the source-term distribution more effectively and obtain better inversion results.

Figure 9 shows a comparison of the MSEs obtained on the validation set during the training processes of the two networks. The fully connected neural network exhibited slower convergence and a plateau phase during the source-term inversion, whereas the SDICNN model converged faster and achieved better training results on the validation set.

Fig. 9
(Color online) MSEs obtained on validation set by two neural network methods

Figure 10 shows the flux values calculated for 96 sampling points based on the source-term prediction results of the fully connected network and SDICNN model, as well as the relative deviations from the flux values under the conditions of the actual source terms. The average relative deviation under the predicted source term conditions of the fully connected network was 16.29%, whereas the average relative deviation under the predicted source-term conditions of the SDICNN model was 14.92%. The SDICNN model performed slightly better than the fully connected network in predicting the source-term results.

Fig. 10
(Color online) (a) Flux results and (b) relative flux deviations obtained for 96 sampling points by two neural networks

These results indicate that the fully connected neural network is unsuitable for this application. Compared with the conventional fully connected neural network, the SDICNN model achieves better source prediction performance on the same dataset and is more suitable for predicting two-dimensional complex source distributions.

4.2
Shielding Effect

The calculation space typically contains numerous shielding structures (e.g., the beams and walls in the calculation example presented herein), which can reduce the dose level and significantly increase the complexity of the calculation space because scattering terms are added. Therefore, the effect of shielding in the radiation field on the source-term inversion must be investigated. This section is based on the SDICNN model, where 30 sampling points at fixed positions on the wall are selected as network inputs. The model was trained and tested for two cases, i.e., with and without shielding (for the case without shielding, the shielding material is set as air in the Monte Carlo simulation). Figures 11 and 12 show the source-term inversion results obtained for the cases with and without shielding, and the relative deviations from the actual source-term distribution. The results of the two schemes show minimal difference between the inversion results obtained for the source terms with and without shielding.

Fig. 11
(Color online) Comparison of predicted (left) and actual (right) distributions for source terms (a) with and (b) without shielding
Fig. 12
(Color online) Relative error (a) with and (b) without shielding

In the cases without and with shielding, the average absolute errors of the source-term inversion were 78.82% and 64.73%, respectively. The relative deviation results show that the extreme errors of the inversion results were relatively low with shielding and relatively high without shielding. In general, the inversion results obtained with shielding were slightly better than those obtained without shielding.

Figure 13 compares the MSE values obtained on the validation set when the network was trained under two conditions: with and without shielding. On the same dataset, the network converged faster without shielding and achieved better results on the validation set.
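The comparison of convergence speed can be made concrete with a short Python sketch. It assumes that the MSE is the mean of the squared differences between predicted and actual source-term values, and it measures convergence as the first epoch at which the validation MSE falls below a chosen threshold. The two loss histories and the threshold are hypothetical and are not the values plotted in Fig. 13.

```python
def mse(predicted, target):
    """Mean squared error between two equally sized flat sequences."""
    if len(predicted) != len(target):
        raise ValueError("inputs must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def epochs_to_reach(history, threshold):
    """First 1-based epoch whose validation MSE is <= threshold, else None."""
    for epoch, value in enumerate(history, start=1):
        if value <= threshold:
            return epoch
    return None


# Hypothetical validation-MSE histories for the two training conditions.
val_mse_with_shielding = [0.90, 0.60, 0.42, 0.31, 0.25, 0.22]
val_mse_without_shielding = [0.85, 0.48, 0.29, 0.21, 0.18, 0.17]

threshold = 0.30
print("with shielding:   ", epochs_to_reach(val_mse_with_shielding, threshold))
print("without shielding:", epochs_to_reach(val_mse_without_shielding, threshold))
```

A threshold-based measure is only one possible choice; comparing the final validation MSE after a fixed number of epochs would be an equally reasonable summary.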

Fig. 13
(Color online) MSEs obtained on validation set under two shielding conditions

The inversion results obtained with and without shielding each have advantages and disadvantages. Without shielding, fewer scattering terms are present and the training process of the network model is simpler, which results in faster convergence. However, without shielding, a given sampling point is not shielded from the source, and the radiation parameters measured at that position may not accurately reflect the source-term distribution.

4.3
Number of Sampling Points

To investigate the effect of the number of sampling points on the source-term inversion results of the SDICNN model, we selected 10 and 40 sampling points at fixed positions on the wall as network inputs, and then trained and tested the network model. Figures 14 and 15 show the source-term inversion results obtained with 10 and 40 sampling points and the relative deviations from the actual values. The average absolute errors of the source-term inversion results based on 10 and 40 sampling points were 66.44% and 39.35%, respectively. Combined with the source-term inversion results for 30 and 96 sampling points (Figs. 11 and 7, respectively), these results show that the source-term inversion improved gradually as the number of sampling points increased. A greater number of sampling points provides more source-term information and thus better source-term inversion results.
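One way to prepare network inputs with fewer sampling points is sketched below in Python. It assumes, purely for illustration, that the reduced sets are drawn as evenly spaced, fixed columns from a 96-point measurement table; the text does not state how the 10, 30, and 40 positions were chosen, and the function names and toy data are hypothetical.

```python
def evenly_spaced_indices(n_total, n_keep):
    """Pick n_keep fixed, distinct indices spread evenly over range(n_total)."""
    if not 1 <= n_keep <= n_total:
        raise ValueError("n_keep must be between 1 and n_total")
    if n_keep == 1:
        return [0]
    # Integer arithmetic keeps the first and last points and avoids duplicates.
    return [(i * (n_total - 1)) // (n_keep - 1) for i in range(n_keep)]


def select_columns(measurements, indices):
    """Keep only the chosen sampling-point columns of every sample row."""
    return [[row[i] for i in indices] for row in measurements]


N_TOTAL = 96  # sampling points in the largest configuration mentioned in the text

# Toy dataset: two samples, each with one reading per sampling point.
dataset = [[float(k) for k in range(N_TOTAL)],
           [float(2 * k) for k in range(N_TOTAL)]]

for n_keep in (10, 30, 40):
    idx = evenly_spaced_indices(N_TOTAL, n_keep)
    reduced = select_columns(dataset, idx)
    print(n_keep, "points -> input width", len(reduced[0]))
```

Keeping the selected indices fixed across training and testing matters more than the particular spacing rule, because the network learns a mapping tied to specific measurement positions.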

Fig. 14
(Color online) Predicted source distribution (left) with (a) 10 and (b) 40 sampling points and actual source distribution (right)
Fig. 15
(Color online) Relative deviation with (a) 10 and (b) 40 sampling points

Figure 16 compares the MSE values obtained on the validation set when the network was trained with 10, 30, 40, and 96 sampling points. On the same dataset, increasing the number of sampling points resulted in faster network convergence and better results on the validation set.

Fig. 16
(Color online) MSEs obtained on validation set with different numbers of sampling points

Figure 17 shows the flux values calculated at the sampling points based on the prediction results of networks trained on data from 10 and 40 sampling points, together with the relative deviations from the flux values under the actual source-term conditions. The average relative deviations for the networks based on 10 and 40 sampling points were 25.27% and 25.64%, respectively; as mentioned above, the average relative deviation for the network based on 96 sampling points was 14.92%. The deviations for 10 and 40 sampling points were thus nearly identical, whereas the deviation for 96 sampling points was markedly lower. In general, the prediction results became more accurate as the number of sampling points increased.

Fig. 17
(Color online) Flux results and relative deviations obtained with (a) 10 and (b) 40 sampling points
4.4
Discussion

The effectiveness of the SDICNN model for the source-term distribution inversion problem was verified by comparing it with the fully connected neural network. In addition, the effects of shielding and the number of sampling points on source-term inversion were tested. The findings were as follows: (1) The effect of shielding on the source-term inversion results must be considered comprehensively. Shielding introduces complexity into the radiation field, which is not conducive to neural network training. However, without shielding, the distances between the sampling points and the source term must increase accordingly, and the ability of the radiation parameters at the sampling point positions to resolve the source-term distribution is weakened. Therefore, in practical applications, the effects of shielding and sampling point positions on inversion must be considered together. (2) The test results show that more sampling points in the space yield a better source-term inversion effect. Therefore, in actual application scenarios, the number of sampling points should be increased as much as possible to achieve better results.

Based on the verification results above, the proposed SDICNN model offers two advantages:

(1) In radiation protection scenarios, the radiation parameter values measured at each sampling point result from the combined effect of all positions in the fixed complex source, and they exhibit strong nonlinear characteristics. The proposed network model is based on a deep neural network and is trained on a dataset composed of different source scenarios, so that specific nonlinear processes and complex geometric structures need not be considered explicitly. It can perform complex source inversions rapidly based on the radiation parameter values measured at the sampling points.

(2) The proposed neural network comprises a fully connected neural network and a CNN. The fully connected neural network obtains a higher-dimensional feature representation of the input, whereas the CNN performs convolution and deconvolution operations to extract more spatial information and high-level features. This network design combines the advantages of both types of neural networks, thus effectively solving the source-inversion problem with a certain degree of flexibility.
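The two-stage structure described in (2) can be sketched in plain Python as follows. This is only an illustration of the data flow (dense layer, reshape to a two-dimensional map, then one transposed convolution, often called deconvolution), not the layer configuration of the SDICNN: the layer sizes, the single channel, the ReLU activation, and the untrained random weights are all assumptions.

```python
import random


def dense(x, weights, bias):
    """Fully connected layer followed by ReLU: y = max(0, W x + b)."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def to_image(vector, height, width):
    """Reshape a flat feature vector into a height x width 2D map."""
    if len(vector) != height * width:
        raise ValueError("vector length must equal height * width")
    return [vector[r * width:(r + 1) * width] for r in range(height)]


def conv_transpose2d(image, kernel, stride):
    """Single-channel transposed convolution with a square kernel, no padding.

    Each input pixel scatters a scaled copy of the kernel into the output,
    so an n x m input yields ((n-1)*stride + k) x ((m-1)*stride + k).
    """
    n, m, k = len(image), len(image[0]), len(kernel)
    out_h, out_w = (n - 1) * stride + k, (m - 1) * stride + k
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(n):
        for j in range(m):
            for a in range(k):
                for b in range(k):
                    out[i * stride + a][j * stride + b] += image[i][j] * kernel[a][b]
    return out


# Hypothetical sizes: 96 sampling points -> 4 x 4 coarse map -> 8 x 8 fine map.
rng = random.Random(0)
n_inputs, coarse = 96, 4
weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
           for _ in range(coarse * coarse)]
bias = [0.0] * (coarse * coarse)
kernel = [[0.25, 0.25], [0.25, 0.25]]  # untrained placeholder kernel

measurements = [rng.random() for _ in range(n_inputs)]
coarse_map = to_image(dense(measurements, weights, bias), coarse, coarse)
fine_map = conv_transpose2d(coarse_map, kernel, stride=2)
print(len(coarse_map), "x", len(coarse_map[0]), "->", len(fine_map), "x", len(fine_map[0]))
```

In a trained model the weights and kernels would be learned from the dataset of source scenarios, and several convolution and transposed-convolution layers with multiple channels would typically be stacked.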

5

Conclusion

To address the inverse problem of fixed source distributions, we proposed a deep neural network model, the SDICNN, for the inversion of complex source distributions. The model successfully obtained fixed complex-source distribution information from the radiation parameters at individual sampling points in space. The SDICNN comprises two components: a fully connected network and a CNN. The fully connected network extracts the measurement parameter information at the individual sampling points in the radiation field and obtains low-resolution source distribution parameters. Subsequently, the one-dimensional feature representation output by the fully connected network is converted into a two-dimensional feature representation, thereby yielding a low-resolution source distribution image. The CNN then completes the fine inversion of the source distribution and obtains high-resolution source distribution images from the low-resolution images.

In this study, the source-term inversion method was tested based on typical geometric scenarios in practical operations. The results showed that, compared with the conventional fully connected neural network trained on the same datasets, the SDICNN successfully extracted the two-dimensional features of the source-term distribution, converged faster, and achieved better inversion results. In addition, we evaluated the effects of shielding and the number of sampling points on source-term inversion. Using more sampling points resulted in a better source-term inversion effect. Shielding introduced significant complexity to the radiation field; its effect on the source-term inversion results must be considered comprehensively, together with the distance between the sampling points and the source term.

The results of this study can serve as a reference for monitoring a source's status in radiation-field operation environments and for accurately evaluating the spatial dose distribution in nuclear facilities. Additionally, the findings of this study may contribute to the accurate assessment of the radiation protection level of on-site personnel in nuclear facilities, thereby reducing the radiation exposure of workers and decreasing the collective dose during the lifecycle of the nuclear facility. Furthermore, owing to the increasing prevalence of machine-learning methods in recent years, neural networks for source inversion may become the next research focus in the field of radiation protection because of their potential significance for optimizing real-world radiation sites.

Future endeavors pertaining to this study may include the following: (1) Identifying more effective source-inversion neural network models, determining more effective neural network structures, and further improving the accuracy of source parameter inversion. (2) Establishing more complex geometric models based on real-world application scenarios to validate neural network models and investigating source-inversion methods under complex geometric conditions. (3) Combining other measurable radiation field parameters to invert the source parameters. In this study, source inversion was considered based only on individual sampling points, which implies insufficient input information. (4) Exploring inversion methods for time-series source data. In this study, source inversion was examined only under fixed source distribution conditions, i.e., without considering variations in the source parameters over time. (5) Extending the method to gamma radiation fields and dose measurements. Currently, the dose is the physical quantity measured by many typical detectors, whereas the source term used in the calculation example presented herein is a neutron source. In the future, we will attempt to identify methods that allow us to perform source-term distribution inversion based on the gamma radiation field and dose measurement values. (6) Investigating the effect of the sampling point location distribution on the source-term inversion performance for a specified number of sampling points to design a reasonable distribution of sampling point locations.

Footnote

The authors declare that they have no competing interests.