
Physics-informed neural network with equation adaption for 220Rn progeny concentration prediction

NUCLEAR CHEMISTRY, RADIOCHEMISTRY, AND NUCLEAR MEDICINE


Shao-Hua Hu
Qi Qiu
De-Tao Xiao
Xiang-Yuan Deng
Xiang-Yu Xu
Peng-Hao Fan
Lei Dai
Zhi-Wen Hu
Tao Zhu
Qing-Zhi Zhou
Nuclear Science and Techniques, Vol. 37, No. 2, Article number 23. Published in print: Feb 2026. Available online: 03 Jan 2026.

Physics-informed neural networks (PINNs) are vital for machine learning and exhibit significant advantages when handling complex physical problems. Rapid prediction of the 220Rn progeny concentration via the PINN method is highly valuable for the regulation and measurement of this quantity. To construct a PINN model, training data are typically preprocessed; however, this approach changes the physical characteristics of the data, and the preprocessed data may no longer directly conform to the original physical equations. As a result, the original physical equations cannot be directly employed in the PINN. Consequently, an effective method for transforming physical equations is crucial for accurately constraining PINNs to model the 220Rn progeny concentration. This study presents an equation adaption approach for neural networks, which is designed to improve prediction of the 220Rn progeny concentration. Five neural network models based on three architectures are established: a classical network, a physics-informed network without equation adaption, and a physics-informed network with equation adaption. The transport equation of the 220Rn progeny concentration is transformed via equation adaption and integrated with the PINN model. The compatibility and robustness of the model with equation adaption are then analyzed. The results show that PINNs with equation adaption converge consistently with classical neural networks in terms of the training and validation loss, and achieve the same level of prediction accuracy. This outcome indicates that the proposed method can be integrated into the neural network architecture. Moreover, the prediction performance of classical neural networks declines significantly when interference data are encountered, whereas the PINNs with equation adaption exhibit stable prediction accuracy.
This performance demonstrates that the proposed method successfully harnesses the constraining power of physical equations, significantly enhancing the robustness of the resultant PINN models. Thus, the use of a physics-informed network with equation adaption can guarantee accurate prediction of 220Rn progeny concentration.

Machine learning; Physics-informed neural networks; Equation adaption; 220Rn progeny
1

Introduction

Deep learning has profoundly impacted many areas of modern society [1-4], with significant applications in the fields of image recognition [1], natural language processing [2], cognitive science [3], and genomics [3, 4]. As a core technology of machine learning, neural networks play key roles in those fields. However, traditional neural network methods require considerable volumes of training data when analyzing complex physical, biological, or engineering systems. In complex, specialized cases, the cost of data collection is often high, and uncertainties exist regarding the data collection accuracy; these problems pose many challenges for deep learning applications [5]. Further, when supplied with only partial datasets, most advanced machine learning techniques lack robustness and cannot draw reliable conclusions and make decisions. In recent years, a new deep neural network (DNN) framework, physics-informed neural networks (PINNs), has been developed. A PINN incorporates physical laws into a neural network (e.g., an artificial, recurrent, or convolutional neural network (ANN, RNN or CNN, respectively)), and constitutes a data–physics dual-drive approach. This feature differentiates PINNs from traditional neural networks, which rely solely on data-driven methods [1, 5]. That is, through the use of physical information as prior knowledge, PINNs can be trained with very few or no labeled data as alternative models for accurately solving partial differential equations [5, 6], while also incorporating complex physical laws (that are difficult to describe via theoretical equations) in data form. Thus, PINNs have physics- and data-driven components.

The main traditional neural network models are ANNs, RNNs, and CNNs [5]. Many advanced algorithms have been derived to optimize the performance of these models, such as TRANSFORM-ANN [7], which can simultaneously fine-tune the neural network architecture, adjust the training dataset size, and select the appropriate activation function. To mitigate the risk of over-fitting, TRANSFORM-ANN integrates three strategies for determining the training set size, all of which are based on Sobol sampling. TRANSFORM-ANN is suitable for constructing an accurate and concise ANN model. However, limitations exist. Because TRANSFORM-ANN employs multi-objective optimization, it has a high computational cost, especially when applied to high-dimensional datasets, for which the computational complexity increases significantly. In addition to TRANSFORM-ANN, progressive neural architecture search [8] and one-shot neural architecture search (OSNAS) [9] are important methods in the field of model optimization. Boundary-integrated neural networks (BINNs) [10] are among the networks similar to the PINN model. A BINN is a numerical method combining a boundary integral equation (BIE) and a neural network, and is employed to solve acoustic radiation and scattering problems efficiently and accurately. A BINN requires boundary-node information only as input, which greatly reduces the calculation cost and makes this approach particularly suitable for infinite-domain problems. The semi-analytical characteristics of the BIE improve the prediction accuracy of the BINN. However, the application of BINNs to more challenging problems, such as those arising in the fields of high-frequency and nonlinear acoustics and complex geometry, requires further exploration.

The training data supplied to neural networks usually span physical quantities with multiple dimensions, which often exhibit significantly different orders of magnitude; therefore, appropriate data preprocessing is crucial [11]. For example, the physical quantities encountered in the fields of medical microdosimetry [12] and radioactive detection [13] may span hundreds of orders of magnitude or beyond. In those applications, even datasets with narrower ranges contain key information. To improve the sensitivity of neural networks to data with wide ranges of orders of magnitude and ensure that models can fully capture the key information in those data, the data must be processed appropriately using preprocessing functions. For example, logarithmic functions can be used to scale data effectively to an appropriate range. Overall, preprocessing functions are essential and universal for neural network training, but their application changes the dimensions of various features of the training data [1, 14]. Such alterations adversely affect PINNs. That is, the key concept of a PINN is that physical equations guide the training process of the neural network, ensuring that the model predictions not only conform to the data distribution but also follow specific physical laws. When a preprocessing function changes the dimensions of the data features, the physical equations in the PINN are no longer directly applicable, because the equation parameters and variables are usually closely related to the dimensions of the original data.
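As a concrete illustration of such preprocessing, the sketch below applies logarithmic scaling followed by min-max normalization to data spanning many orders of magnitude (the values and variable names are hypothetical, not from the paper):

```python
import numpy as np

# Hypothetical samples spanning 15 orders of magnitude
c = np.array([1e-12, 1e-6, 1.0, 1e3])

# Logarithmic scaling compresses the dynamic range ...
log_c = np.log10(c)

# ... and min-max normalization then maps it onto [0, 1]
c_norm = (log_c - log_c.min()) / (log_c.max() - log_c.min())

# c_norm preserves the ordering of c, but its values no longer carry the
# original physical dimension, so the original transport equation cannot
# be applied to the transformed data directly
```

The last comment is exactly the problem that equation adaption addresses: the preprocessed quantities are dimensionless and rescaled, so the physical equation must be transformed accordingly.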

A 220Rn chamber is an essential scientific device for accurately measuring the radiation dose levels of 220Rn and its progeny [15-17]. The exhaust pipe is a core component of this device. When the concentrations of 220Rn and its progeny in the 220Rn chamber must be reduced, clean air is often injected into the chamber to dilute the indoor radioactive gas concentration. The excess radioactive gas is released into the atmosphere via the exhaust pipe. The emitted radioactive-gas concentration can reach several thousand becquerels per cubic meter; thus, the concentration distribution of the emitted radioactive gas must be monitored effectively. To achieve precise control and accurate measurement of the 220Rn progeny concentration [15], a rapid prediction model must be established; this is feasible using the PINN approach. However, when a 220Rn concentration prediction model is developed using training data subjected to a preprocessing function, the preprocessing function alters the characteristics of physical quantities such as time, space, and concentration in the training data. For the physical equations to function properly within the PINN, they must be transformed according to the specific preprocessing function.

Since PINNs were proposed in 2019 [18], these methods have been applied to various fields. In fluid mechanics [19-25], PINNs have proven to be a valuable tool for overcoming the limitations of traditional numerical simulation methods, particularly for noisy data, complex grid generation challenges, and high-dimensional flow problems. In medical diagnostics [26, 27], PINNs precisely simulate biomechanics and biofluid mechanics, elucidating complex biological fluid phenomena and aiding disease diagnosis, treatment optimization, and medical device design. In materials science [28-35], PINNs have greatly enhanced the prediction accuracy of key physical quantities such as material stress and strain, especially in cases with limited data resources. In the power industry [36-41], PINNs have been used for power-system optimization and stability analysis, combining physical laws with data analysis to accurately predict system behavior, optimize energy distribution, enhance grid stability, and improve overall energy efficiency. These applications demonstrate the excellent adaptability and reliability of PINNs. However, in such previous studies, conventional data processing methods were generally adopted and the issue of physical-equation deformation and incorporation into the neural network after preprocessing was not explored. In particular, to achieve effective 220Rn progeny concentration prediction, the development of a PINN with equation adaption is highly significant.

This study introduces an equation adaption approach for neural networks, which can accurately deform physical equations for application in PINN model training. The compatibility of this method with neural networks and the robustness of the resultant model are explored. The remainder of this paper is organized as follows. In Sect. 2, five neural network models are established based on three architectures: a classical network, a physics-informed network without equation adaption, and a physics-informed network with equation adaption. The equation adaption process is then applied and the deformation of the physical equations based on specific preprocessing functions is demonstrated. Section 3 examines the compatibility of the neural-network equation adaption approach and the robustness of the PINN network after equation adaption, based on the five models established previously. Section 4 concludes the work.

2

Methodology

When training data are processed by a preprocessing function, the physical equations must be transformed before they can be incorporated into the PINN architecture. This section first establishes five prediction models for 220Rn concentration based on three network architectures. Then, the proposed equation adaptation method is introduced, with the equations being incorporated into the PINN model.

2.1
Physical object

In this study, the exhaust pipe of the 220Rn chamber was taken as the research object, and the concentration distribution of the emitted radioactive gas was predicted. The device was cylindrical, with a diameter Φ of 10 cm and a length L of 40 cm (Fig. 1). The gas entered through the inlet and exited through the outlet, and primarily comprised a mixture of 220Rn progeny and air. The inlet wind speed was taken as the model boundary condition, with an adjustment range of 0-0.1 m/s [15]. The main decay products of 220Rn are 216Po, 212Pb, and 212Bi. Because 216Po has a short half-life of only 0.145 s, its migration and diffusion capabilities are minimal. In contrast, the half-lives of the latter two are considerably longer, at 10.64 h and 60.55 min, respectively, and their migration and diffusion are more impactful. Thus, 212Pb and 212Bi are the focus of research attention [15, 16]. As 212Pb and 212Bi exhibit highly similar migration and diffusion patterns in this context, only the 212Pb concentration distribution was considered in this study.
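The half-lives above translate into the decay constants that enter the transport model via λ = ln 2 / T½. A quick check for the two uncontested values from the text (the code itself is only an illustration):

```python
import math

# Decay constant: lambda = ln(2) / T_half
T_half_pb212 = 10.64 * 3600.0   # 212Pb half-life in seconds (10.64 h)
T_half_po216 = 0.145            # 216Po half-life in seconds

lam_pb212 = math.log(2.0) / T_half_pb212   # ~1.81e-5 s^-1: slow decay,
                                           # so 212Pb migrates appreciably
lam_po216 = math.log(2.0) / T_half_po216   # ~4.78 s^-1: decays almost
                                           # immediately, negligible transport
```

The roughly five-orders-of-magnitude gap between the two decay constants is why 216Po can be neglected while 212Pb dominates the transport problem.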

Fig. 1
(Color online) Exhaust pipeline structure
2.2
Establishment of neural network

Computational fluid dynamics (CFD) was used to establish a numerical simulation of the physical object (i.e., the exhaust pipe) discussed in Sect. 2.1; hence, the database needed to train the neural network was obtained. Note that this database can be used to train and validate subsequent neural network models. The physical equations were incorporated into the loss function to jointly constrain the training of the neural network model. Finally, the data and physical-equation loss function were used in combination to train the neural network model. The model construction flowchart is shown in Fig. 2.

Fig. 2
(Color online) Framework for establishing neural network model

The physical structure considered in this study was a cylinder, which is highly symmetrical. Therefore, in experiments, often only the 212Pb concentration distribution in the xoy plane is required. In addition, because the physical structure is highly symmetrical and the fluid velocity is minimal, the Reynolds number is far less than 2000; thus, a laminar flow is established, which is characterized by a linear and uniform velocity distribution with a stable pressure gradient. Considerable similarity exists between the two- and three-dimensional flows; that is, the characteristics of the three-dimensional flow field can be described by a two-dimensional simulation, avoiding the difficulty of establishing a three-dimensional model [42]. Therefore, the neural network model established in this study was designed to predict the 212Pb concentration distribution on the xoy plane.

(1) Data collection PINN models exhibit high robustness and can effectively handle data with significant errors [43, 44]. In this study, to test the robustness of the PINN model established using the proposed method, two databases were established: one without and one with interference (labeled Data-01 and Data-02, respectively). Data-02 comprised Data-01 with the addition of a small volume of data containing significant errors. The databases contained 212Pb concentrations spanning an extensive range, from 10^-50 Bq/m^3 to 10^3 Bq/m^3. Data-01 comprised N_data = 987,135 normal data points, whereas Data-02 comprised N_data = 987,135 normal data points and N_error = 50 erroneous data points. A random 0.2% of the data from Data-01 was selected as the validation set (N_validation = 2,000 data points), labeled "Data-validation." Finally, these two databases were used to separately train three networks: NN, PINN-EA, and PINN-f. Hence, five models were obtained: NN, NN-ERR, PINN-EA, PINN-EA-ERR, and PINN-f. The correspondence between the different training databases and training models is shown in Fig. 3. NN, PINN-EA, and PINN-f are introduced in detail in the following subsection. Note that Data-02 had 50 additional data points compared to Data-01, accounting for 0.005% of the total data, which is a tiny proportion. Therefore, the data volumes of Data-02 and Data-01 were considered identical, and the influence of the additional 50 data points on the model training was ignored.
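The database split described above can be sketched as follows. This is a schematic with random placeholder values, not the actual CFD data; only the array shapes follow the counts given in the text:

```python
import numpy as np

rng = np.random.default_rng(0)
n_data, n_error, n_val = 987_135, 50, 2_000

# Placeholder for the CFD samples: columns x, y, t, q, u, v, c
data_01 = rng.random((n_data, 7))

# Data-02 = Data-01 plus a few grossly erroneous points
errors = rng.random((n_error, 7)) * 1e3
data_02 = np.vstack([data_01, errors])

# A random 0.2% of Data-01 serves as the validation set
val_idx = rng.choice(n_data, size=n_val, replace=False)
data_validation = data_01[val_idx]
```

The 50 erroneous rows amount to 0.005% of Data-02, which is why the text treats the two databases as having identical volume.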

Fig. 3
(Color online) Correspondence between training databases and trained models

(2) Neural networks A PINN is essentially a DNN that can approximate a solution determined from data and PDEs [45]; its architecture is shown in Fig. 4. In this study, three neural networks were constructed: NN, PINN-EA, and PINN-f. NN was a classical neural network without physical laws, whereas PINN-EA and PINN-f were PINNs with physical laws. PINN-EA and PINN-f differed in terms of their preprocessing functions. The equation adaption approach proposed in this study was adopted for PINN-EA; that is, the physical law F(X, Y, U, V, C) was used in the network. No preprocessing function was used for PINN-f, with the physical law f(x, y, u, v, c) being directly integrated into the network. The physical laws F(X, Y, U, V, C) and f(x, y, u, v, c) are explained in detail in Sect. 2.3.

Fig. 4
(Color online) NN, PINN-EA and PINN-f architectures

A residual neural network [46, 47] was adopted in this study, for which the relationship between the inputs and outputs can be expressed as
$$ (u, v, c) = \mathcal{N}(x, y, t, q; \Theta). \tag{1} $$
Here, $\mathcal{N}$ represents the neural network, the inputs of which are the space coordinates (x and y), time (t), and wind speed (q). The neural-network outputs are the velocity vectors (u and v) and concentration (c). The parameter Θ represents the trainable variables. Through this network, the relationship between the inputs and outputs is constructed. The k-th hidden layer for the residual neural network is expressed as
$$ H^{k} = \sigma\left(W^{k} H^{k-1} + b^{k}\right) + H^{k-1}, \tag{2} $$
where W and b are the weights and biases, respectively; H represents the output of each hidden layer; and σ is the activation function.
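Equations (1) and (2) can be sketched as a small forward pass. The snippet below is a NumPy illustration with arbitrary layer sizes and untrained random weights (the paper itself trains the network in PyTorch, and tunes N_cell and N_layer as discussed in Sect. 3.3):

```python
import numpy as np

rng = np.random.default_rng(0)
n_cell, n_layer = 32, 4   # hypothetical sizes for illustration

# Input layer lifts (x, y, t, q) to the hidden width
w_in = rng.normal(size=(4, n_cell)) * 0.1
b_in = np.zeros(n_cell)
# Residual hidden layers, Eq. (2): H_k = sigma(W_k H_{k-1} + b_k) + H_{k-1}
w_hid = [rng.normal(size=(n_cell, n_cell)) * 0.1 for _ in range(n_layer)]
b_hid = [np.zeros(n_cell) for _ in range(n_layer)]
# Output layer produces (u, v, c)
w_out = rng.normal(size=(n_cell, 3)) * 0.1
b_out = np.zeros(3)

def network(xytq):
    h = np.tanh(xytq @ w_in + b_in)
    for w, b in zip(w_hid, b_hid):
        h = np.tanh(h @ w + b) + h   # skip connection of Eq. (2)
    return h @ w_out + b_out         # (u, v, c), as in Eq. (1)

uvc = network(np.array([[0.5, 0.5, 0.1, 0.05]]))
```

The skip connection keeps gradients well-behaved in deeper stacks, which is the usual motivation for choosing a residual architecture here.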

In this work, the partial derivatives with respect to t, x, and y were computed based on the chain rule, which has previously been implemented via automatic differentiation in both TensorFlow and PyTorch [45]. In this study, PyTorch was employed for this computation. Note that higher-order derivatives can be estimated through multiple calls of this function. Based on the 220Rn progeny transport equation, which is detailed in Eqs. (9) and (18) of Sect. 2.3, the residual equations of the transport equation can be derived as follows:
$$ e_f = \frac{\partial c}{\partial t} + u\frac{\partial c}{\partial x} + v\frac{\partial c}{\partial y} - D_e\left(\frac{\partial^2 c}{\partial x^2} + \frac{\partial^2 c}{\partial y^2}\right) + \lambda c, \tag{3} $$
$$ e_F = F(X, Y, U, V, C), \tag{4} $$
where e_f and e_F pertain to PINN-f and PINN-EA, respectively, and F is the adapted transport operator derived in Sect. 2.3. Further, D_e is the diffusion coefficient, D_e = 2.88×10^-5 m^2/s [15], and λ is the decay constant of 212Pb. To ensure that the network adheres to the transport equations, the loss function within the PINN is defined as follows:
$$ \mathrm{Loss} = \mathrm{Loss}_{\mathrm{data}} + \alpha\,\mathrm{Loss}_{e}, \tag{5} $$
where α is a weighting coefficient, the value of which is discussed in Sect. 3.3. Further, Loss_data and Loss_e are computed as
$$ \mathrm{Loss}_{\mathrm{data}} = \frac{1}{N_{\mathrm{data}}} \sum_{i=1}^{N_{\mathrm{data}}} \left( \left|\hat{u}_i - u_i\right|^2 + \left|\hat{v}_i - v_i\right|^2 + \left|\hat{c}_i - c_i\right|^2 \right), \tag{6} $$
$$ \mathrm{Loss}_{e} = \frac{1}{N_{\mathrm{data}}} \sum_{i=1}^{N_{\mathrm{data}}} \left|e_i\right|^2. \tag{7} $$
Here, u_i, v_i, and c_i are the measured data; û_i, v̂_i, and ĉ_i are the predicted data; Loss_data represents the loss between the measured and predicted data; and e represents e_f or e_F. Recall that, in PINN-EA, e = e_F, whereas in PINN-f, e = e_f. The variables Θ are optimized by minimizing the loss function. Here, the training variables were updated using the adaptive moment estimation (Adam) optimizer with an initial learning rate of 0.001. The learning rate decreased stepwise every 40 epochs to 0.9 times the previous rate. Five models were obtained by training the three neural networks with different databases. The hyperparameter settings of the five models are detailed in Table 1. Note that the specified number of iterations was sufficient to decrease the model training error to a stable condition. The numbers of hidden units and layers (N_cell and N_layer, respectively) were determined as discussed in Sect. 3.3.
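The composite loss of Eqs. (5)-(7) can be written compactly as below. This is a sketch: in the paper, the residual array comes from automatic differentiation of the network outputs, which is omitted here and replaced by a toy array:

```python
import numpy as np

def pinn_loss(pred_uvc, meas_uvc, residual, alpha):
    # Eq. (6): mean squared data mismatch over u, v and c
    loss_data = np.mean((pred_uvc - meas_uvc) ** 2)
    # Eq. (7): mean squared transport-equation residual (e_f or e_F)
    loss_eq = np.mean(residual ** 2)
    # Eq. (5): weighted sum, with alpha balancing data and physics terms
    return loss_data + alpha * loss_eq

# Toy check: perfect data fit, residuals of magnitude 1, alpha = 0.5
total = pinn_loss(np.ones((4, 3)), np.ones((4, 3)),
                  np.array([1.0, -1.0]), alpha=0.5)
```

With a perfect data fit, the total loss reduces to α times the mean squared residual, illustrating how the physics term alone can still drive the optimization.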

Table 1
Hyperparameters used in five models

Hyperparameter    NN               NN-ERR           PINN-EA          PINN-EA-ERR      PINN-f
Network           NN               NN               PINN-EA          PINN-EA          PINN-f
Learning rate     Stepped descent  Stepped descent  Stepped descent  Stepped descent  Stepped descent
Optimizer         Adam             Adam             Adam             Adam             Adam
Iterations        1000             2000             1000             2000             3000
Training data     Data-01          Data-02          Data-01          Data-02          Data-01
Validation data   Data-validation  Data-validation  Data-validation  Data-validation  Data-validation

(3) Activation functions The activation function is pivotal to the neural network's ability to approximate data. Without an activation function, the network would only perform linear transformations [45]. Given that a PINN incorporates a differentiation process, selection of an appropriate activation function is essential for practical model training. Five representative activation functions were used to construct and optimize the model training process in this study: Sigmoid, Tanh, ReLU, Leaky ReLU, and Hardswish. These activation functions exhibit distinct nonlinear mapping characteristics, such as the sharp contrast between the saturated (Sigmoid, Tanh) and unsaturated (ReLU, Leaky ReLU, Hardswish) types, and also span parameterized (e.g., the negative-slope parameter of Leaky ReLU) and nonparametric designs. These five activation functions are representative and widely used in the field of neural networks [48, 49].

2.3
Equation adaption

The migration and diffusion behaviors of 212Pb within the device follow the transport equation. That is,
$$ \frac{\partial c}{\partial t} + u\frac{\partial c}{\partial x} + v\frac{\partial c}{\partial y} = D_e\left(\frac{\partial^2 c}{\partial x^2} + \frac{\partial^2 c}{\partial y^2}\right) - \lambda c, \tag{8} $$
where x and y are spatial coordinates; u and v are the flow-field velocities at these coordinates; c is the 212Pb concentration at these coordinates; D_e is the diffusion coefficient; and λ is the 212Pb decay constant. First, the f(x, y, u, v, c) expression is established:
$$ f(x, y, u, v, c) = \frac{\partial c}{\partial t} + u\frac{\partial c}{\partial x} + v\frac{\partial c}{\partial y} - D_e\left(\frac{\partial^2 c}{\partial x^2} + \frac{\partial^2 c}{\partial y^2}\right) + \lambda c = 0. \tag{9} $$
Here, the preprocessing function and its inverse are represented by N and F_N, respectively. Further, X, Y, T, Q, U, V, and C are the parameters obtained by transforming x, y, t, q, u, v, and c through the preprocessing function. Their relationships can be expressed as
$$ X = N(x),\; Y = N(y),\; T = N(t),\; Q = N(q),\; U = N(u),\; V = N(v),\; C = N(c); \quad x = F_N(X),\; \ldots,\; c = F_N(C). \tag{10} $$
From Eq. (10), the physical Eq. (9) can be transformed as follows:
$$ \frac{\partial F_N(C)}{\partial t} + F_N(U)\frac{\partial F_N(C)}{\partial x} + F_N(V)\frac{\partial F_N(C)}{\partial y} - D_e\left(\frac{\partial^2 F_N(C)}{\partial x^2} + \frac{\partial^2 F_N(C)}{\partial y^2}\right) + \lambda F_N(C) = 0. \tag{11} $$
To improve the transparency of the subsequent equation derivation, Eq. (11) is decomposed into four parts: A, B, E, and G:
$$ A = \frac{\partial F_N(C)}{\partial t}, \quad B = F_N(U)\frac{\partial F_N(C)}{\partial x} + F_N(V)\frac{\partial F_N(C)}{\partial y}, \quad E = -D_e\left(\frac{\partial^2 F_N(C)}{\partial x^2} + \frac{\partial^2 F_N(C)}{\partial y^2}\right), \quad G = \lambda F_N(C). \tag{12} $$
Then, Eq. (11) can be transformed into the following form:
$$ A + B + E + G = 0. \tag{13} $$
By applying the chain rule for the differentiation of composite functions [50], Eq. (12) can be expanded as follows:
$$ A = F_N'(C)\,N'(t)\,\frac{\partial C}{\partial T}, \quad B = F_N(U)\,F_N'(C)\,N'(x)\,\frac{\partial C}{\partial X} + F_N(V)\,F_N'(C)\,N'(y)\,\frac{\partial C}{\partial Y}, $$
$$ E = -D_e\left[ F_N''(C)\,N'(x)^2\left(\frac{\partial C}{\partial X}\right)^2 + F_N'(C)\,N'(x)^2\,\frac{\partial^2 C}{\partial X^2} + F_N'(C)\,N''(x)\,\frac{\partial C}{\partial X} + F_N''(C)\,N'(y)^2\left(\frac{\partial C}{\partial Y}\right)^2 + F_N'(C)\,N'(y)^2\,\frac{\partial^2 C}{\partial Y^2} + F_N'(C)\,N''(y)\,\frac{\partial C}{\partial Y} \right], \quad G = \lambda F_N(C). \tag{14} $$
Eq. (14) is a transformed form of the physical Eq. (9). If the preprocessing function is determined, a further precise transformation of Eq. (14) can be performed. The normalization function was employed for preprocessing in this study. The normalization function and its inverse are, respectively, expressed as
$$ N(\eta) = \frac{\eta - \eta_{\min}}{\eta_{\max} - \eta_{\min}}, \tag{15} $$
$$ F_N(H) = H\left(\eta_{\max} - \eta_{\min}\right) + \eta_{\min}, \tag{16} $$
where η represents the parameters x, y, t, q, u, v, and c; η_min and η_max represent the minimum and maximum parameter values, respectively; and H represents the normalized parameters X, Y, T, Q, U, V, and C. Therefore, based on Eqs. (15) and (16), for which N'' = 0 and F_N'' = 0, Eq. (14) can be transformed to obtain the following:
$$ A = \frac{c_{\max} - c_{\min}}{t_{\max} - t_{\min}}\frac{\partial C}{\partial T}, \quad B = \left[U(u_{\max} - u_{\min}) + u_{\min}\right]\frac{c_{\max} - c_{\min}}{x_{\max} - x_{\min}}\frac{\partial C}{\partial X} + \left[V(v_{\max} - v_{\min}) + v_{\min}\right]\frac{c_{\max} - c_{\min}}{y_{\max} - y_{\min}}\frac{\partial C}{\partial Y}, $$
$$ E = -D_e\left(c_{\max} - c_{\min}\right)\left[\frac{1}{(x_{\max} - x_{\min})^2}\frac{\partial^2 C}{\partial X^2} + \frac{1}{(y_{\max} - y_{\min})^2}\frac{\partial^2 C}{\partial Y^2}\right], \quad G = \lambda\left[C(c_{\max} - c_{\min}) + c_{\min}\right]. \tag{17} $$
Then, Eq. (13) can be transformed into the following form:
$$ F(X, Y, U, V, C) = A + B + E + G = 0, \tag{18} $$
with A, B, E, and G given by Eq. (17). Therefore, after the preprocessing function is determined, Eq. (14) can be transformed into Eq. (17), which means that Eq. (9) is transformed into Eq. (18), following implementation of the equation adaption technique.
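The core of the adaption is the chain-rule rescaling of each derivative. The numerical check below uses finite differences on an arbitrary smooth test profile (all ranges are illustrative) to confirm that ∂c/∂x computed in physical variables matches (c_max − c_min)/(x_max − x_min) · ∂C/∂X computed in normalized variables, as in the scaling used in Eq. (17):

```python
import numpy as np

x_min, x_max = 0.0, 0.4      # pipe length L = 0.4 m
c_min, c_max = 0.0, 1000.0   # illustrative concentration range

def c_of_x(x):
    # Arbitrary smooth test profile spanning [c_min, c_max]
    return c_min + (c_max - c_min) * np.sin(np.pi * x / x_max) ** 2

def C_of_X(X):
    # The same profile expressed in normalized variables
    x = x_min + X * (x_max - x_min)
    return (c_of_x(x) - c_min) / (c_max - c_min)

x, h = 0.1, 1e-6

# Derivative in physical variables (central difference)
dc_dx = (c_of_x(x + h) - c_of_x(x - h)) / (2 * h)

# Derivative in normalized variables, then rescaled by the chain rule
X = (x - x_min) / (x_max - x_min)
dC_dX = (C_of_X(X + h) - C_of_X(X - h)) / (2 * h)
scaled = dC_dX * (c_max - c_min) / (x_max - x_min)
```

The two numbers agree to finite-difference accuracy, which is exactly the consistency that lets the adapted operator F constrain a network trained on normalized data.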

2.4
Model evaluation indexes

In this study, three key indicators were used to gauge the performance of the trained model: the training loss, validation loss, and relative standard deviation (RSD). The RSD measures the relative discrepancy between the predicted and true values and is calculated as
$$ \mathrm{RSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\frac{y_{\mathrm{pre},i} - y_{\mathrm{tr},i}}{y_{\mathrm{tr},i}}\right)^2} \times 100\%, \tag{19} $$
where y_tr and y_pre are the true and predicted values, respectively.
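One way to realize Eq. (19) in code is shown below, assuming an RMS aggregation of the pointwise relative errors (the aggregation choice is an assumption here, not spelled out by the text):

```python
import numpy as np

def rsd(y_tr, y_pre):
    # Relative standard deviation in percent: RMS of the pointwise
    # relative errors (aggregation form assumed, see text)
    y_tr = np.asarray(y_tr, dtype=float)
    y_pre = np.asarray(y_pre, dtype=float)
    rel = (y_pre - y_tr) / y_tr
    return float(np.sqrt(np.mean(rel ** 2)) * 100.0)

value = rsd([1.0, 2.0, 4.0], [1.1, 2.2, 4.4])   # every point 10% high
```

A uniform 10% overprediction yields an RSD of 10%, matching the intuition that the metric reports a typical relative error in percent.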

To facilitate analysis and comparison of different models, the final training and validation losses (FT and FV, respectively) were adopted as metrics. Consequently, the FT and FV for the NN and PINN models were denoted as FT_NN, FV_NN, FT_PINN, and FV_PINN, respectively. The FT and FV ratios between the two models were K_FT = FT_NN/FT_PINN and K_FV = FV_NN/FV_PINN, respectively.

3

Results and Discussion

Section 2 presented the basic conditions for establishing the PINN model. This section details the optimization of the model parameters and the subsequent performance analysis of the optimal model. The performance analysis of the optimal model is reported first, in Sect. 3.1 and 3.2. The basis for determining the model parameters is discussed subsequently, in Sect. 3.3.

3.1
Convergence and predictive performance of PINNs without equation adaption

This section discusses the convergence and predictive performance of the PINN network with no equation adaption trained on the Data-01 database. The PINN model discussed in this section corresponds to the PINN-f model in Sect. 2.2. Figure 5 shows the training- and validation-loss convergence during training of the PINN-f model. With the Leaky ReLU and Hardswish activation functions, the model training and validation losses exceeded the orders of e^23 and e^8, respectively, and did not converge to smaller values. However, for the models with the ReLU, Sigmoid, and Tanh activation functions, these values converged to the order of e^-3. Therefore, compared to the Leaky ReLU and Hardswish activation functions, the ReLU, Sigmoid, and Tanh activation functions yielded better convergence of the model training and validation losses.

Fig. 5
(Color online) Training and validation losses of PINN-f model for different activation functions

Although the training and validation losses are fundamental metrics for evaluating the effectiveness of model training, the accuracy with which physical quantities are predicted is also a crucial indicator. In this study, the model accuracy was assessed using the predictive RSD, which measures the RSD between the predicted and true values, as expressed in Eq. (19). Figure 6 illustrates the RSD between the predicted and true values for the x- and y-velocity components and the 212Pb concentration for the PINN-f model. For the models employing the Leaky ReLU and Hardswish activation functions, the RSDs of the predicted values of the three considered physical quantities significantly exceeded e^6; in particular, the RSD of the predicted 212Pb concentration was as high as the order of e^27. The models using the ReLU, Sigmoid, and Tanh activation functions exhibited smaller predictive RSDs. However, the values for the three considered physical quantities were still relatively high, all exceeding 1, with the predictive RSD for the 212Pb concentration reaching the order of 10^11. Therefore, the PINN-f models trained with these five activation functions did not achieve satisfactory prediction accuracy.

Fig. 6
(Color online) RSDs between predicted and true values for PINN-f models with different activation functions
3.2
Convergence and predictive performance of PINNs with equation adaption

As indicated in Sect. 3.1, the neural network method without equation adaption failed to accurately predict the x- and y-velocity components and the 212Pb concentration. This section compares the neural network model with equation adaption to that without equation adaption and verifies the compatibility and robustness of the proposed method within neural networks.

3.2.1
Comparative analysis of PINN and NN models without interference data

(1) Neural-network convergence

This section discusses the training and predictive performance of the PINN network with equation adaption and the classical neural network, both of which were trained without interference data (i.e., on the Data-01 database). These models correspond to the PINN-EA and NN models described in Sect. 2.2, respectively. Figure 7 shows the training- and validation-loss convergence for both models under different activation function conditions. Without interference data, the training- and validation-loss convergence patterns of the two models were consistent and reached the order of 10^-6. Compared to the results reported in Sect. 3.1, these results indicate that preprocessing of the database effectively reduces the model training difficulty and facilitates training- and validation-loss convergence. This outcome further demonstrates the necessity of equation adaption in the training of PINN models.

Fig. 7
(Color online) Training and validation losses of NN and PINN-EA models for different activation functions without interference data

Figure 8 compares the magnitudes to which the two models' training and validation losses converged after 1000 training epochs under different activation function conditions. As the activation function changed, the two models exhibited consistent trends in both FT and FV, with the K values fluctuating around 1. This indicates that there was almost no difference between the FT values of the two models and, similarly, that the FV values of the two models were almost identical. Therefore, without interference data, a comprehensive analysis of the FT, FV, and K values suggests that the neural network model with equation adaption (PINN-EA) and the classical neural network model (NN) exhibit good consistency.

Fig. 8
(Color online) FT, FV, and K values of NN and PINN-EA models for different activation functions without interference data

(2) Model prediction accuracy

Figure 9 illustrates the change pattern of the predictive RSD for the x- and y-velocity components and the 212Pb concentration as the NN and PINN-EA models underwent continuous training. When the ReLU activation function was used, the RSD for the x-velocity component exceeded 100%. For the other activation functions, however, the RSD values for these properties remained below 100%. The lowest RSD values were obtained for the y-velocity component, with all values below 10%, whereas those for the x-velocity component and the 212Pb concentration fell between 10% and 100%. Figure 9 also displays the RSD change pattern for the same three quantities as the PINN-EA model underwent continuous training. The RSD change patterns for the three physical properties were essentially the same for the PINN-EA and NN models.

Fig. 9
(Color online) RSD between predicted and true values for NN and PINN-EA models for different activation functions without interference data

To more intuitively illustrate the relationship between the 212Pb concentrations predicted by the NN and PINN-EA models and the true values, a scatter plot of the predicted and true values is shown in Fig. 10. The true and predicted values were normalized, with the maximum and minimum values in the normalization being the η_max and η_min of Eq. (15), respectively. The true and predicted 212Pb concentration values are plotted on the horizontal and vertical axes, respectively. The closer the scatter points are to the y = x line, the closer the predicted values are to the true values, which indicates better prediction accuracy. The scatter points in Fig. 10 lie essentially on the y = x line, suggesting that both the NN and PINN-EA models had good prediction accuracy.

Fig. 10
(Color online) Scatter plots of predicted and true values for NN (a–e) and PINN-EA (f–j) models without interference data

In summary, analysis of the training- and validation-loss convergence patterns, as well as the model predictive accuracy, revealed that, without interference data, the training and validation losses of the NN and PINN-EA models exhibited consistent convergence patterns during training. Moreover, compared to the models in Sect. 3.1, the two models in this section achieved higher prediction accuracy for the 212Pb concentration. These results demonstrate that the equation adaption technique can be integrated into neural networks without conflict, exhibiting good compatibility.

3.2.2
Comparative analysis of PINN and NN models with interference data

To verify the robustness of the PINN model following adoption of the equation adaption technique, both the classical neural network and the PINN network were trained using data containing interference (the Data-02 database). The effects of training were examined for the two models. The models discussed in this section correspond to the NN-ERR and PINN-EA-ERR models described in Sect. 2.2.

(1) Convergence of neural networks

Figure 11 shows the training- and validation-loss convergence with epochs for the two models under different activation functions. When the model training was stable, although the NN-ERR training loss was considerably smaller than that of PINN-EA-ERR, the PINN-EA-ERR validation loss was approximately half that of NN-ERR. Further, both the NN-ERR and PINN-EA-ERR models exhibited overfitting at approximately 200 epochs. Table 2 reports the magnitudes to which the training and validation losses of the classical neural network and PINN network converged after 2000 training epochs. From Table 2, the FT values of both models were within the range of 1×10^-4 to 1×10^-3, and the FV values exceeded 1×10^-1. Under the condition of no interference data, as shown in Fig. 8, the FT and FV values of the two models were within the range of 1×10^-6 to 1×10^-5. This indicates that, under the influence of interference data, the FT increased by two to three orders of magnitude, and the FV increased by approximately six orders of magnitude. Therefore, interference data have a significant adverse effect on model training.

Fig. 11
(Color online) Training and validation losses of NN-ERR and PINN-EA-ERR models for different activation functions with interference data
Table 2
Comparison of training and validation losses of NN-ERR and PINN-EA-ERR models for different activation functions
Model         Leaky ReLU   ReLU        Sigmoid     Tanh        Hardswish
NN     FT     1.2×10⁻³     9.8×10⁻⁴    1.7×10⁻³    6.5×10⁻⁴    9.1×10⁻⁴
       FV     1.90         2.14        1.55        2.05        2.10
PINN   FT     2.5×10⁻³     3.3×10⁻³    3.7×10⁻³    2.8×10⁻³    2.6×10⁻³
       FV     1.09         1.13        0.46        1.11        1.24

Figure 12 shows the ratios of the FT and FV values of NN-ERR to those of PINN-EA-ERR under different activation functions. The FT ratios of the two models were less than 1, whereas the FV ratios were larger than 1. This result indicates that the PINN-EA-ERR model, with the equation adaption constraint, can recognize erroneous data, further enhancing the robustness of the neural network.
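Assuming K denotes the ratio of an NN-ERR metric to the corresponding PINN-EA-ERR metric, the trends shown in Fig. 12 can be reproduced directly from the Table 2 values:

```python
# Converged losses from Table 2 (NN-ERR vs. PINN-EA-ERR, per activation function).
ft_nn   = {"leaky_relu": 1.2e-3, "relu": 9.8e-4, "sigmoid": 1.7e-3,
           "tanh": 6.5e-4, "hardswish": 9.1e-4}
ft_pinn = {"leaky_relu": 2.5e-3, "relu": 3.3e-3, "sigmoid": 3.7e-3,
           "tanh": 2.8e-3, "hardswish": 2.6e-3}
fv_nn   = {"leaky_relu": 1.90, "relu": 2.14, "sigmoid": 1.55,
           "tanh": 2.05, "hardswish": 2.10}
fv_pinn = {"leaky_relu": 1.09, "relu": 1.13, "sigmoid": 0.46,
           "tanh": 1.11, "hardswish": 1.24}

# K ratios: the training-loss ratios fall below 1, the validation-loss
# ratios above 1, matching the pattern described in the text.
k_ft = {act: ft_nn[act] / ft_pinn[act] for act in ft_nn}
k_fv = {act: fv_nn[act] / fv_pinn[act] for act in fv_nn}
```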

Fig. 12
(Color online) K values of NN-ERR and PINN-EA-ERR models for different activation functions with interference data

(2) Model prediction accuracy

Figure 13 shows the RSD values between the predicted and true values of the x- and y-velocity components and the 212Pb concentration for the NN-ERR and PINN-EA-ERR models under different activation functions. For the x- and y-velocity components, the RSD values of the PINN-EA-ERR predictions were approximately half those of the NN-ERR model; for the 212Pb concentration, the PINN-EA-ERR RSD values were approximately 1/400th those of the NN-ERR model. Thus, the PINN-EA-ERR model achieved higher prediction accuracy than the NN-ERR model, especially for the 212Pb concentration. The main reason for this performance is that the physical equation used in this study is the 212Pb concentration transport equation, which directly and effectively constrained the 212Pb concentration prediction. The equation also incorporates the flow-field quantities, that is, the x- and y-velocity components, so some improvement in their prediction accuracy was observed; however, the constraint on these quantities was weaker, producing a stark contrast in accuracy between the concentration and velocity predictions.
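As a minimal sketch of the comparison metric, assuming the RSD is the root-mean-square deviation of the predictions normalized by the mean magnitude of the true values (the paper's exact definition is given earlier in the text):

```python
import math

def rsd(pred, true):
    """Relative standard deviation between predictions and true values:
    RMS deviation normalized by the mean magnitude of the true values
    (an assumed definition, for illustration only)."""
    n = len(true)
    rmsd = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / n)
    return rmsd / (sum(abs(t) for t in true) / n)
```

Under this definition, a perfect prediction gives RSD = 0, and the factor-of-400 gap reported above corresponds to the PINN-EA-ERR RSD being 400 times smaller on the same test set.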

Fig. 13
(Color online) RSD between predicted and true values of NN-ERR and PINN-EA-ERR models for different activation functions with interference data

Figure 14 depicts scatter plots of the 212Pb concentrations predicted by the NN-ERR and PINN-EA-ERR models against the true values. The true and predicted values were uniformly normalized in the same manner as for the results shown in Fig. 10. Figures 14(a)–(e) and (f)–(j) show the 212Pb concentration predictions of the NN-ERR and PINN-EA-ERR models, respectively. In Figs. 14(a)–(e), most of the scatter points lie on the y = x line, except for those close to 0. This result indicates a significant deviation in the NN-ERR predictions for values close to 0, which markedly decreased the prediction accuracy. In contrast, the scatter points for the PINN-EA-ERR model (Figs. 14(f)–(j)) lie mainly on the y = x line, and the model maintained good prediction accuracy even for values close to 0. This further demonstrates the effectiveness of the practical constraint imposed on the neural network model by the proposed equation adaption.
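Assuming a simple min–max scheme for the uniform normalization, the agreement with the y = x line can be quantified as the fraction of points lying within a small tolerance of the diagonal; both helpers below are illustrative sketches, not the authors' procedure:

```python
def minmax_normalize(values):
    """Scale values linearly to [0, 1] (one plausible normalization)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def fraction_on_diagonal(pred, true, tol=0.05):
    """Fraction of normalized (true, predicted) points within `tol`
    of the y = x line."""
    return sum(abs(p - t) <= tol for p, t in zip(pred, true)) / len(pred)
```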

Fig. 14
Scatter plot of predicted and true values for NN-ERR (a–e) and PINN-EA-ERR (f–j) models with interference data
3.3
Ndata, Ncell, and α parameters

Parameter optimization is typically performed for neural networks with different activation functions. From Sects. 3.1 and 3.2, the best performance was obtained with the tanh activation function. Therefore, in this section, we take the neural network model with the tanh activation function as an example and discuss the parameter-optimization process in detail, primarily considering the influence of Ndata, Ncell, and α on the prediction accuracy without interference data. The 212Pb concentrations predicted by the PINN for different values of Ndata are presented in Fig. 15, with the RSD of the 212Pb concentration prediction used to characterize the model prediction accuracy. The number of layers was set to 5 in all cases, and the number of neurons in each layer was varied from 16 to 128.

Fig. 15
(Color online) RSD of 212Pb concentration for varying Ndata and Ncell. Nlayer was fixed at 5. In (a)–(e), α = 1×10²⁴, 1×10²⁵, 1×10²⁶, 1×10²⁷, and 1×10²⁸, respectively
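The sweep over Ndata and Ncell can be sketched as a plain grid search, where `train_and_score` is a hypothetical callback that trains one configuration and returns the RSD of its 212Pb concentration prediction:

```python
def sweep(train_and_score,
          n_data_grid=(10**2, 10**3, 10**4, 10**5),
          n_cell_grid=(16, 32, 64, 128),
          n_layer=5):
    """Grid search over (Ndata, Ncell) with Nlayer fixed; returns the
    configuration with the lowest score and the full score table."""
    results = {}
    for n_data in n_data_grid:
        for n_cell in n_cell_grid:
            results[(n_data, n_cell)] = train_and_score(n_data, n_cell, n_layer)
    best = min(results, key=results.get)
    return best, results
```

With the RSD as the score, the reported outcome corresponds to accuracy saturating beyond Ndata = 10⁴ and Ncell = 64.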

Three points can be summarized from Fig. 15. ① Dependence of PINN prediction accuracy on data volume: with increasing Ndata, the prediction accuracy for the 212Pb concentration improved significantly; once Ndata reached 10⁴, the accuracy no longer changed considerably. ② Determination of Ncell: from Figs. 15(a)–(e), the prediction accuracy for the 212Pb concentration increased gradually with increasing Ncell, and the accuracies for Ncell = 64 and 128 were close. Considering the computational resources, the optimal Ncell for this model was 64. ③ Determination of α: in the training experiments for the PINN model, the order of magnitude of the equation loss was approximately 10⁻³, whereas that of Ldata was approximately 10⁻⁴⁰. To keep both terms at similar orders of magnitude when supervising the training, α was varied from 1×10²⁴ to 1×10²⁸ and the resulting change in prediction accuracy was analyzed. As is apparent from Fig. 15, given sufficient training data, the model prediction accuracy first increased rapidly with increasing α and then decreased slightly, with optimal performance at α = 1×10²⁶. Therefore, Ncell = 64 and α = 1×10²⁶ were selected for the optimal model.
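The balancing rationale behind α can be illustrated with a hypothetical helper (not the authors' procedure) that picks the power-of-ten weight lifting the much smaller loss term to the scale of the larger one:

```python
import math

def balancing_alpha(l_small, l_large):
    """Power-of-ten weight that brings l_small up to the scale of l_large."""
    return 10.0 ** round(math.log10(l_large / l_small))

def total_loss(l_small, l_large, alpha):
    """Composite loss with the far smaller term alpha-weighted so that
    both terms supervise training at a similar order of magnitude."""
    return l_large + alpha * l_small
```

With illustrative magnitudes, e.g. l_small = 10⁻⁸ and l_large = 10⁻², the helper returns α = 10⁶, after which both terms contribute equally to the total loss.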

This study had some limitations. A high level of accuracy was not achieved for the flow-field prediction when interference data were considered (Sect. 3.2.2). This indicates that the constraint conditions of the physical equations should be adjusted to better constrain the model and enhance its prediction accuracy. Regarding noise in the training stage, the training data were preprocessed to some extent, yielding a good signal-to-noise ratio.

4

Conclusion

In this study, a PINN with equation adaption was established for 220Rn progeny concentration prediction. A PINN without equation adaption was also examined; this model failed to yield the desired outcomes, underscoring the critical role of equation adaption in training neural networks. For training without interference data, the PINN with equation adaption exhibited performance consistent with that of a classical neural network model, achieving high accuracy when predicting 220Rn progeny concentrations. This outcome emphasizes the excellent compatibility of the equation adaption technique with neural networks. When interference data were considered, the PINN with equation adaption retained good prediction accuracy, especially for the 220Rn progeny concentration. This outcome highlights the effectiveness of equation adaption in constraining neural networks with physical equations, thereby improving the robustness of the neural network model.

In future work, different types of noise will be added to the model, based on factors such as the background radioactivity level and detection method. Additionally, the equation adaption technique will be used to model specific physical objects with PINNs; for example, precise and rapid prediction of 220Rn and its progeny concentrations will be explored. Overall, the equation adaptation approach presented in this study has good universality and provides a theoretical foundation for the widespread application of neural networks in various fields.

Footnote

The authors declare that they have no competing interests.