Physics-informed neural network with equation adaption for 220Rn progeny concentration prediction

NUCLEAR CHEMISTRY, RADIOCHEMISTRY, AND NUCLEAR MEDICINE

Physics-informed neural network with equation adaption for ²²⁰Rn progeny concentration prediction

Shao-Hua Hu，

Qi Qiu，

De-Tao Xiao，

Xiang-Yuan Deng，

Xiang-Yu Xu，

Peng-Hao Fan，

Lei Dai，

Zhi-Wen Hu，

Tao Zhu，

Qing-Zhi Zhou

Nuclear Science and Techniques

Vol.37, No.2

Article number 23

Published in print Feb 2026

Available online 03 Jan 2026

DOI：10.1007/s41365-025-01836-8

CSTR：32136.14.NST.2026.0223

65102

Physics-informed neural networks (PINNs) are vital for machine learning and exhibit significant advantages when handling complex physical problems. The PINN method can rapidly predict ²²⁰Rn progeny concentration and is very important for regulating and measuring this property. To construct a PINN model, training data are typically preprocessed; however, this approach changes the physical characteristics of the data, with the preprocessed data potentially no longer directly conforming to the original physical equations. As a result, the original physical equations cannot be directly employed in the PINN. Consequently, an effective method for transforming physical equations is crucial for accurately constraining PINNs to model the ²²⁰Rn progeny concentration prediction. This study presents an equation adaptation approach for neural networks, which is designed to improve prediction of ²²⁰Rn progeny concentration. Five neural network models based on three architectures are established: a classical network, a physics-informed network without equation adaptation, and a physics-informed network with equation adaptation. The transport equation of the ²²⁰Rn progeny concentration is transformed via equation adaption and integrated with the PINN model. The compatibility and robustness of the model with equation adaption is then analyzed. The results show that PINNs with equation adaption converge consistently with classical neural networks in terms of the training and validation loss, and achieve the same level of prediction accuracy. This outcome indicates that the proposed method can be integrated into the neural network architecture. Moreover, the prediction performance of classical neural networks declines significantly when interference data are encountered, whereas the PINNs with equation adaption exhibit stable prediction accuracy. This performance demonstrates that the proposed method successfully harnesses the constraining power of physical equations, significantly enhancing the robustness of the resultant PINN models. Thus, the use of a physics-informed network with equation adaption can guarantee accurate prediction of ²²⁰Rn progeny concentration.

Machine learningPhysics-informed neural networksEquation adaption220Rn progeny

Introduction

Deep learning has profoundly impacted many areas of modern society [1-4], with significant applications in the fields of image recognition [1], natural language processing [2], cognitive science [3], and genomics [3, 4]. As a core technology of machine learning, neural networks play key roles in those fields. However, traditional neural network methods require considerable volumes of training data when analyzing complex physical, biological, or engineering systems. In complex, specialized cases, the cost of data collection is often high, and uncertainties exist regarding the data collection accuracy; these problems pose many challenges for deep learning applications [5]. Further, when supplied with only partial datasets, most advanced machine learning techniques lack robustness and cannot draw reliable conclusions and make decisions. In recent years, a new deep neural network (DNN) framework, physics-informed neural networks (PINNs), has been developed. A PINN incorporates physical laws into a neural network (e.g., an artificial, recurrent, or convolutional neural network (ANN, RNN or CNN, respectively)), and constitutes a data–physics dual-drive approach. This feature differentiates PINNs from traditional neural networks, which rely solely on data-driven methods [1, 5]. That is, through the use of physical information as prior knowledge, PINNs can be trained with very few or no labeled data as alternative models for accurately solving partial differential equations [5, 6], while also incorporating complex physical laws (that are difficult to describe via theoretical equations) in data form. Thus, PINNs have physics- and data-driven components.

The main traditional neural network models are ANNs, RNNs, and CNNs [5]. Many advanced algorithms have been derived to optimize the performance of these models, such as TRANSFORM-ANN [7], which can simultaneously fine-tune the neural network architecture, adjust the training dataset size, and select the appropriate activation function. To mitigate the risk of over-fitting, TRANSFORM-ANN integrates three strategies for determining the training set size, all of which are based on Sobol sampling. TRANSFORM-ANN is suitable for construction of an accurate and concise ANN model. However, limitations exist. Because TRANSFORM-ANN employs multi-objective optimization, it has a high computational cost, especially when applied to high-dimensional datasets, for which the computational complexity increases significantly. In addition to TRANSFORM-ANN, progressive neural architecture search [8] and one-shot neural architecture search (OSNAS) [9] are important methods in the field of model optimization. Boundary-integrated neural networks (BINNs) [10] are among the networks similar to the PINN model. A BINN is a numerical method combining a boundary integral equation (BIE) and neural network, and is employed to solve acoustic radiation and scattering problems efficiently and accurately. A BINN requires boundary-node information only as input, which greatly reduces the calculation cost and makes this approach particularly suitable for infinite domain problems. The semi-analytical characteristics of the BIE improves the prediction accuracy of the BINN. However, the application of BINNs to more challenging problems, such as those arising in the fields of high-frequency and nonlinear acoustics and complex geometry, requires further exploration.

The training data supplied to neural networks usually span physical quantities with multiple dimensions, which often exhibit significantly different orders of magnitude; therefore, appropriate data preprocessing is crucial [11]. For example, the physical quantities encountered in the fields of medical microdosimetry [12] and radioactive detection [13] may span hundreds of orders of magnitude or beyond. In those applications, even datasets with narrower ranges contain key information. To improve the sensitivity of neural networks to data with wide ranges of orders of magnitude and ensure that models can fully capture the key information in those data, the data must be processed appropriately using preprocessing functions. For example, logarithmic functions can be used to scale data effectively to an appropriate range. Overall, preprocessing functions are essential and universal for neural network training, but their application changes the dimensions of various features of the training data [1, 14]. Such alterations adversely affect PINN networks. That is, the key concept of the PINN network is that physical equations are combined to guide the training process of the neural network, ensuring that the model predictions not only conform to the data distribution but also follow specific physical laws. When the preprocessing function changes the dimensions of the data features, the physical equations in the PINN network are no longer directly effective, because the equation parameters and variables are usually closely related to the dimensions of the original data.

An ²²⁰Rn chamber is an essential scientific device for accurately measuring the radiation dose levels of ²²⁰Rn and its progeny [15-17]. The exhaust pipe is a core component of this device. When the concentrations of ²²⁰Rn and its progeny in the ²²⁰Rn chamber must be reduced, clean air is often injected into the chamber to dilute the indoor radioactive gas concentration. The excess radioactive gas is released into the atmosphere via the exhaust pipe. The emitted radioactive-gas concentration can reach several thousand becquerel per meter cubed; thus, the concentration distribution of the emitted radioactive gas must be monitored effectively. To achieve precise control and accurate measurement of ²²⁰Rn progeny concentration [15], a rapid prediction model must be established; this is feasible using the PINN approach. However, when a ²²⁰Rn concentration prediction model is developed using training data subjected to a preprocessing function, the preprocessing function alters the characteristics of physical quantities such as time, space, and concentration in the training data. To achieve normal functionality of the physical equations in the PINN network, the physical equations must be deformed according to specific preprocessing functions.

Since PINNs were proposed in 2019 [18], these methods have been applied to various fields. In fluid mechanics [19-25], PINNs have proven to be a valuable tool for overcoming the limitations of traditional numerical simulation methods, particularly for noisy data, complex grid generation challenges, and high-dimensional flow problems. In medical diagnostics [26, 27], PINNs precisely simulate biomechanics and biofluid mechanics, elucidating complex biological fluid phenomena and aiding disease diagnosis, treatment optimization, and medical device design. In materials science [28-35], PINNs have greatly enhanced the prediction accuracy of key physical quantities such as material stress and strain, especially in cases with limited data resources. In the power industry [36-41], PINNs have been used for power-system optimization and stability analysis, combining physical laws with data analysis to accurately predict system behavior, optimize energy distribution, enhance grid stability, and improve overall energy efficiency. These applications demonstrate the excellent adaptability and reliability of PINNs. However, in such previous studies, conventional data processing methods were generally adopted and the issue of physical-equation deformation and incorporation into the neural network after preprocessing was not explored. In particular, to achieve effective ²²⁰Rn progeny concentration prediction, the development of a PINN with equation adaption is highly significant.

This study introduces an equation adaption approach for neural networks, which can accurately deform physical equations for application in PINN model training. The compatibility of this method with neural networks and the robustness of the resultant model are explored. The remainder of this paper is organized as follows. In Sect. 2, five neural network models are established based on three architectures: a classical network, a physics-informed network without equation adaptation, and a physics-informed network with equation adaptation. The equation adaption process is then applied and the deformation of the physical equations based on specific preprocessing functions is demonstrated. Section 3 focuses on the compatibility of the neural-network equation adaption approach and the robustness of the PINN network after equation adaption, based on the five models established previously. Section 4 concludes the work.

Methodology

When training data are processed by a preprocessing function, the physical equations must be transformed before they can be incorporated into the PINN architecture. This section first establishes five prediction models for ²²⁰Rn concentration based on three network architectures. Then, the proposed equation adaptation method is introduced, with the equations being incorporated into the PINN model.

2.1

Physical object

In this study, the exhaust pipe of the ²²⁰Rn chamber was taken as the research object and the concentration distribution of the emitted radioactive gas was predicted. The device was cylindrical, with a diameter of Φ = 10 cm and a length L of 40 cm (Fig. 1). The gas entered through the inlet and exited through the outlet, and primarily comprised a mixture of ²²⁰Rn progeny and air. The inlet wind speed was taken as the model boundary condition, and the wind-speed adjustment range was 0 - 0.1 m/s [15]. The main decay products of ²²⁰Rn are ²¹⁶Po, ²¹²Pb, and ²¹²Bi. Because ²¹⁶Po has a short half-life of only 0.145 s, its migration and diffusion capabilities are minimal. In contrast, the half-lives of the latter two are more prolonged, at 10.64 h and 50.55 min, respectively, and their migration and diffusion are more impactful. Thus, ²¹²Pb, and ²¹²Bi are the focus of research attention [15, 16]. As ²¹²Pb and ²¹²Bi exhibit highly similar migration and diffusion patterns in this context, in this study, only the ²¹²Pb concentration distribution was considered.

Fig. 1

(Color online) Exhaust pipeline structure

2.2

Establishment of neural network

Computational fluid dynamics (CFD) was used to establish a numerical simulation of the physical object (i.e., the exhaust pipe) discussed in Sect. 2.1; hence, the database needed to train the neural network was obtained. Note that this database can be used to train and validate subsequent neural network models. The physical equations were incorporated into the loss function to jointly constrain the training of the neural network model. Finally, the data and physical-equation loss function were used in combination to train the neural network model. The model construction flowchart is shown in Fig. 2.

Fig. 2

(Color online) Framework for establishing neural network model

The physical structure considered in this study was a cylinder, which is highly symmetrical. Therefore, in experiment, only the ²¹²Pb concentration distribution in the xoy plane area is often required. In addition, because the physical structure is highly symmetrical and the fluid velocity is minimal, the Reynolds number is far less than 2000; thus, a laminar flow is established, which is characterized by a linear and uniform velocity distribution with a stable pressure gradient. Considerable similarity exists between the two- and three-dimensional flows; that is, the characteristics of a three-dimensional flow field can be described by a two-dimensional simulation. Thus, the difficulty of establishing a three-dimensional model can be avoided [42]. Therefore, the neural network model established in this study was designed to predict the ²¹²Pb concentration distribution on the xoy plane.

(1) Data collection PINN models exhibit high robustness and can effectively handle data with significant errors [43, 44]. In this study, to test the robustness of the PINN model established using the proposed method, two databases were established: one without and one with interference (labeled Data-01 and Data-02, respectively). Data-02 comprised Data-01 with the addition of a small volume of data containing significant errors. The databases contained ²¹²Pb concentrations spanning an extensive range, from 10^-50 Bq/m³ to 10³ Bq/m³. Data-01 comprised N_data = 987,135 normal data points, whereas Data-02 comprised N_data = 987,135 normal data points and N_error=50 erroneous data points. A random 0.2% of the data from Data-01 was selected as the validation set (N_validation=2,000 data points), labeled “Data-validation.” Finally, these two databases were used to separately train three networks: NN, PINN-EA, and PINN-f. Hence, five models were obtained: NN, NN-ERR, PINN-EA, PINN-EA-ERR, and PINN-f. The correspondence between the different training databases and training models is shown in Fig. 3. NN, PINN-EA, and PINN-f are introduced in detail in the following subsection. Note that Data-02 had 50 additional data points compared to Data-01, accounting for 0.005% of the total data, which was a tiny proportion. Therefore, the data volumes of Data-02 and Data-01 were considered identical, and the influence of the additional 50 data points on the model training was ignored.

Fig. 3

(Color online) Correspondence between training databases and trained models

(2) Neural networks A PINN is essentially a DNN that can approximate a solution determined from data and PDEs [45]; its architecture is shown in Fig. 4. In this study, three neural networks were constructed: NN, PINN-EA, and PINN-f. NN was a classical neural network without physical laws, whereas PINN-EA and PINN-f were PINNs with physical laws. PINN-EA and PINN-f differed in terms of their preprocessing functions. The equation adaption approach proposed in this study was adopted for PINN-EA; that is, the physical law F(X, Y, U, V, C) was used in the network. No preprocessing function was used for PINN-f, with the physical law f(x, y, u, v, c) being directly integrated into the network. The physical laws F(X, Y, U, V, C) and f(x, y, u, v, c) are explained in detail in Sect. 2.3.

Fig. 4

(Color online) NN, PINN-EA and PINN-f architectures

A residual neural network [46, 47] was adopted in this study, for which the relationship between the inputs and outputs can be expressed as(1)Here, represents the neural network, the inputs of which are the space coordinates (x and y), time (t), and wind speed (q). The neural-network outputs are the velocity vectors (u and v) and concentration (c). The parameter Θ represents the trainable variables. Through this network, the relationship between the inputs and outputs is constructed. The k-th hidden layer for the residual neural network is expressed as(2)where W and b are the weights and biases, respectively; H represents the output of each hidden layer; and σ is the activation function.

In this work, the partial derivatives , , and were computed based on the chain rule, which has previously been implemented via automatic differentiation in both TensorFlow and PyTorch [45]. In this study, PyTorch was employed for this computation. Note that higher-order derivatives can be estimated through multiple calling of this function. Based on the ²²⁰Rn progeny transport equation, which is detailed in Eqs. (9) and (18) of Sect. 2.3, the residual equation of the transport equation can be derived as follows:(3)(4)where e^f and e^F pertain to PINN-f and PINN-F, respectively. Further, D_e is the diffusion coefficient, D_e = 2.88×10^-5 m²/s [15], and λ is the decay constant of ²¹²Pb. To ensure the network adheres to the transport equations, the loss function within the PINN is defined as follows:(5)where α is a weighting coefficient, the value of which is discussed in Sect. 3.3. Further, and are computed as(6)(7)Here, , , and are the measured data; , , and are the predicted data; represents the loss between the measured and predicted data; and e represents e^f or e^F. Recall that, in PINN-F, , whereas in PINN-f, . The variables Θ are optimized by minimizing the loss function. Here, the training variables were updated using the adaptive moment estimation (Adam) optimizer with an initial learning rate of 0.001. The learning rate decreased stepwise every 40 epochs to 0.9 times the original learning rate. Five models were obtained by training the three neural networks with different databases. The hyperparameter settings of the five models are detailed in Table 1. Note that the specified number of iterations was sufficient to decrease the model training error to a stable condition. The numbers of hidden units and layers (N_cell and N_layer, respectively) were determined as discussed in Sect. 3.3.

Hyperparameters used in five models

Hyperparameters	Model
Hyperparameters	NN	NN-ERR	PINN-EA	PINN-EA-ERR	PINN-f
Network	NN	NN	PINN-EA	PINN-EA	PINN-f
Learning rate	Stepped descent	Stepped descent	Stepped descent	Stepped descent	Stepped descent
Optimizer	Adam	Adam	Adam	Adam	Adam
Iteration	1000	2000	1000	2000	3000
Training data	Data-01	Data-02	Data-01	Data-02	Data-01
Validation data	Data-validation	Data-validation	Data-validation	Data-validation	Data-validation

(3) Activation functions The activation function is pivotal to the neural-network ability to approximate data. Without an activation function, the network would perform linear transformations [45]. Given that a PINN incorporates a derivation process, selection of an appropriate activation function is essential for practical model training. Five representative activation functions were used to construct and optimize our model training process in this study: Sigmoid, Tanh, ReLU, Leaky ReLU, and Hardswish. These activation functions exhibit unique nonlinear mapping characteristics, such as a sharp contrast between the saturated (Sigmoid, Tanh) and unsaturated (ReLU, Leaky ReLU, Hardswish) types, and also span the diversity of parameterized (e.g., the negative slope parameter of Leaky ReLU) and nonparametric design. These five activation functions are representative and widely used in the field of neural networks [48, 49].

2.3

Equation adaption

The migration and diffusion behaviors of ²¹²Pb within the device follow the transport equation. That is,(8)where x and y are spatial coordinates; u and v are the flow-field velocities at these coordinates; c is the ²¹²Pb concentration at these coordinates; D_e is the diffusion coefficient; and λ is the ²¹²Pb decay constant. First, the f(x, y, u, v, c) expression is established:(9)Here, the preprocessing function and its inverse are represented by N and FN, respectively. Further, X, Y, T, Q, U, V, and C are the parameters obtained by transforming x, y, t, q, u, v, and c through the preprocessing function. Their relationships can be expressed as(10)From Eq. (10), the physical Eq. (9) can be transformed as follows:(11)To improve the transparency of the subsequent equation derivation, Eq. (11) is decomposed into four parts: A, B, E, and G:(12)Then, Eq. (11) can be transformed into the following form:(13)By applying the chain rule for the differentiation of composite functions [50], Eq. (12) can be expanded as follows:(14)Eq. (14) is a transformed form of the physical Eq. (9). If the preprocessing function is determined, a further precise transformation of Eq. (14) can be performed. The normalization function was employed for preprocessing in this study. The normalization function and its inverse are, respectively, expressed as(15)(16)where represents the parameters x, y, t, q, u, v and c; and represent the minimum and maximum parameter values, respectively; and represents the normalized parameters X, Y, T, Q, U, V, and C. Therefore, based on Eqs. (15) and (16), Eq. (14) can be transformed to obtain the following:(17)Then, Eq. (13) can be transformed into the following form:(18)Therefore, after the preprocessing function is determined, Eq. (12) can be transformed into Eq. (18), which means that Eq. (9) is transformed into Eq. (18), following implementation of the equation adaption technique.

2.4

Model evaluation indexes

In this study, three key indicators were used to gauge the performance of the trained model: the training loss, validation loss, and relative standard deviation (RSD). The RSD measures the relative discrepancy between the predicted and true values. The RSD calculation formula is as follows:(19)where y_tr and y_pre are the true and predicted values, respectively.

To facilitate analysis and comparison of different models, the final training and validation losses (FT and FV, respectively) were adopted as metrics. Consequently, the FT and FV for the NN and PINN models were denoted as FT_NN, FV_NN, FT_PINN and FV_PINN, respectively. The FT and FV ratios between the two models were and , respectively.

Results and Discussion

Section 2 presented the basic conditions for establishing the PINN model. This section details the optimization of the model parameters and the subsequent performance analysis of the optimal model. The performance analysis of the optimal model is reported first, in Sect. 3.1 and 3.2. The basis for determining the model parameters is discussed subsequently, in Sect. 3.3.

3.1

Convergence and predictive performance of PINNs without equation adaption

This section discusses the convergence and predictive performance of the PINN network with no equation adaption trained on the Data-01 database. The PINN model discussed in this section corresponds to the PINN-f model in Sect. 2.2. Figure 5 shows the training- and validation-loss convergence during training of the PINN-f model. Following application of the Leaky ReLU and Hardswish activation functions, the model training and validation losses exceeded orders of magnitude of e23 and e8, respectively, and convergence to smaller values did not occur. However, for the models with the ReLU, Sigmoid, and Tanh activation functions, these values converged to e-3. Therefore, compared to the Leaky ReLU and Hardswish activation functions, the ReLU, Sigmoid, and Tanh activation functions yielded better convergence of the model training and validation losses.

Fig. 5

(Color online) Training and validation losses of PINN-f model for different activation functions

Although the training and validation losses are fundamental metrics for evaluating the effectiveness of model training, the accuracy with which physical quantities are predicted is also a crucial indicator. In this study, the model accuracy was assessed using the predictive RSD, which measures the RSD between the predicted and true values, as expressed in Eq. (19). Figure 6 illustrates the RSD between the predicted and true values for the x- and y-velocity components and the ²¹²Pb concentration for the PINN-f model. The models employing the Leaky ReLU and Hardswish activation functions exhibited RSDs for the predictive values of the three considered physical quantities that significantly exceeded e6. In particular, the RSD for the predictive ²¹²Pb concentration was as high as the order of e27. The models using the ReLU, Sigmoid, and Tanh activation functions exhibited smaller predictive RSDs. However, the values for the three considered physical quantities were relatively high, all exceeding 1, with the predictive RSD for the ²¹²Pb concentration reaching the order of 10¹¹. Therefore, the PINN-f models trained with these five activation functions did not achieve satisfactory prediction accuracy.

Fig. 6

(Color online) RSDs between predicted and true values for PINN-f models with different activation functions

3.2

Convergence and predictive performance of PINNs with equation adaption

As indicated in Sect. 3.1, the neural network method without equation adaption failed to accurately predict the x- and y-velocity components and the ²¹²Pb concentration. This section compares the neural network model with equation adaption to that without equation adaption and verifies the compatibility and robustness of the proposed method within neural networks.

3.2.1

Comparative analysis of PINN and NN models without interference data

(1) Neural-network convergence

This section discusses the training and predictive performance of the PINN network with the equation adaption and the classical neural network, both of which were trained without interference data (i.e., on the Data-01 database). These models correspond to the PINN-EA and NN models described in Sect. 2.2, respectively. Figure 7 shows the training and validation loss convergence for both models under different activation function conditions. Without interference data, the training- and validation-loss convergence patterns of the two models were consistent and reached the order of 10^-6. Compared to the results reported in Sect. 3.1, these results indicate that preprocessing of the database effectively reduce the model training difficulty and facilitates training and validation loss convergence. This outcome further demonstrates the necessity of equation adaption in the training of PINN models.

Fig. 7

(Color online) Training and validation losses of NN and PINN-EA models for different activation functions without interference data

Figure 8 compares the magnitudes to which the two models training and validation losses converged after 1000 training epochs under different activation function conditions. As the activation function changed, the two models exhibited a consistent trend in both FT and FV, with the K values fluctuating around 1. This indicates that there was almost no difference between the FT values of the two models, and similarly, that the FV values of the two models were almost identical. Therefore, without interference data, a comprehensive analysis of the FT, FV, and K values suggests that the neural network model with equation adaption (PINN-EA) and the classical neural network model (NN) exhibit good consistency.

Fig. 8

(Color online) FT, FV, and K values of NN and PINN-EA models for different activation functions without interference data

(2) Model prediction accuracy

Figure 9 illustrates the change pattern of the predictive RSD for the x-velocity and y-velocity components and the ²¹²Pb concentration as the NN and PINN-EA models underwent continuous training. When the ReLU activation function was used, the RSD for the x-velocity component exceeded 100%. For the other activation functions, however, the RSD values for these properties remained below 100%. The lowest RSD values were obtained for the y-velocity component, with all values below 10%, whereas those for the x-velocity and ²¹²Pb concentration fell between 10% and 100%. Figure 9 also displays the RSD change pattern for the x-velocity and y-velocity components and the ²¹²Pb concentration as the PINN-EA model underwent continuous training. The RSD change pattern for the three physical properties were essentially the same for both the PINN-EA and NN models.

Fig. 9

(Color online) RSD between predicted and true values for NN and PINN-EA models for different activation functions without interference data

To more intuitively illustrate the relationship between the ²¹²Pb concentrations predicted by the NN and PINN-EA models and the true values, a scatter plot of the predicted and true values is shown in Fig. 10. The true and predicted values were normalized, with the maximum and minimum values in the normalization being the η_min and η_max of Eq. (15). The true and predicted ²¹²Pb concentration values are plotted on the horizontal and vertical axes, respectively. The closer the scatter points are to the y=x line, the closer the predicted values are to the true values, which indicates better prediction accuracy. The scatter points in Fig. 10 are essentially on the y=x line, suggesting that both the NN and PINN-EA models had good prediction accuracy.

Fig. 10

(Color online) Scatter plots of predicted and true values for NN (a–e) and PINN-EA (f–j) models without interference data

In summary, analysis of the training- and validation-loss convergence patterns, as well as the model predictive accuracy revealed that, without interference data, the training and validation losses of the NN and PINN-EA models exhibit consistent convergence patterns during training. Moreover, compared to the models in Sect. 3.1, the two models in this section achieved higher prediction accuracy for the 212^Pb concentration. These results demonstrate that the equation adaption technique can be effectively integrated into neural networks without conflict, with good compatibility.

3.2.2

Comparative analysis of PINN and NN models with interference data

To verify the robustness of the PINN model following adoption of the equation adaption technique, both the classical neural network and the PINN network were trained using data containing interference (the Data-02 database). The effects of training were examined for the two models. The models discussed in this section correspond to the NN-ERR and PINN-EA-ERR models described in Sect. 2.2.

(1) Convergence of neural networks

Figure 11 shows the training- and validation-loss convergence with epochs for the two models under different activation functions. When the model training was stable, although the NN-ERR training loss was considerably smaller than that of PINN-EA-ERR, the PINN-EA-ERR validation loss was smaller than that of NN-ERR, by approximately half. Further, both the NN-ERR and PINN-EA-ERR models exhibited overfitting at approximately 200 epochs. Table 2 reports the magnitudes to which the training and validation losses of the classic neural network and PINN network converged after 2000 training epochs. From Table 2, the FT values of both models were within the range of 1×10^-4 to 1×10^-3, and the FV values exceeded 1×10^-1. Under the condition of no interference data, as shown in Fig. 8, the FT and FV values of the two models were within the range of 1×10^-6 to 1×10^-5. This indicates that, under the influence of interference data, the FT increased by two to three orders of magnitude, and the FV increased by 6. Therefore, interference data has a significant and adverse effect on model training.

Fig. 11

(Color online) Training and validation losses of NN-ERR and PINN-EA-ERR models for different activation functions with interference data

Comparison of training and validation losses of NN-ERR and PINN-EA-ERR models for different activation functions

		Activation functions
		Leaky ReLU	ReLU	Sigmoid	Tanh	Hardswish
NN	FT_NN	1.2×10^-3	9.8×10^-4	1.7×10^-3	6.5×10^-4	9.1×10^-4
	FV_NN	1.90	2.14	1.55	2.05	2.10
PINN	PT_PINN	2.5×10^-3	3.3×10^-3	3.7×10^-3	2.8×10^-3	2.6×10^-3
	FV_PINN	1.09	1.13	0.46	1.11	1.24

Figure 12 shows the ratios of the FT or FV values between NN-ERR and PINN-EA-ERR under different activation functions. The FT ratios of the two models were less than 1. In contrast, the FV ratios were larger than 1. This result indicates that the PINN-EA-ERR model with the equation adaption constraint, can recognize erroneous data, further enhancing the robustness of the neural network.

Fig. 12

(Color online) K values of NN-ERR and PINN-EA-ERR models for different activation functions with interference data

(2) Model prediction accuracy

Figure 13 shows the RSD values between the predicted and true values for the x- and y-velocity components and the 212^Pb concentration for the NN-ERR and PINN-EA-ERR models under different activation function conditions. Comparison of the NN-ERR and PINN-EA-ERR models reveals that, for the x and y fluid velocity vectors, the RSD of the PINN-EA-ERR model predictions were half those of the NN-ERR model; as regards the RSD of the predicted 212^Pb concentration values, those of the PINN-EA-ERR model were approximately 1/400th that of the NN-ERR model. This indicates that the PINN-EA-ERR model had higher prediction accuracy than the NN-ERR model, especially for the 212^Pb concentration. The main reason for this performance is that the physical equation used in this study is the ²¹²Pb concentration transport equation, which effectively constrained the ²¹²Pb concentration prediction. This equation also incorporates flow-field physical quantities; that is, the x- and y-velocity components. Some improvement in prediction accuracy was observed for those quantities; however, the constraining force was insufficient and there was a stark contrast in accuracy.

Fig. 13

(Color online) RSD between predicted and true values of NN-ERR and PINN-EA-ERR models for different activation functions with interference data

Figure 14 depicts scatter plots of the ²¹²Pb concentrations predicted by the NN-ERR and PINN-EA-ERR models against the true values. The true and predicted values were uniformly normalized in the same manner as for the results shown in Fig. 10. Figure 14(a)–(e) and (f)–(j) show scatter plots of the 212^Pb concentration predictions of the NN-ERR and PINN-EA-ERR models, respectively. In Figs. 14(a)–(e), most of the scatter points lie on the y=x, except for those close to 0. This result indicates a significant deviation in the NN-ERR model predictions for values close to 0, which markedly decreased the prediction accuracy. Moreover, compared to the NN-ERR model, the predicted scatter points (for the PINN-EA-ERR model (Figs. 14(f)-(j)) are mainly on the y=x line. and the model maintained good prediction accuracy for values close to 0. This further demonstrates the effectiveness of the practical constraint of the proposed equation adaption on the neural network model.

Fig. 14

Scatter plot of predicted and true values for NN-ERR (a–e) and PINN-EA-ERR (f–j) models with interference data

3.3

N_data, N_cell, and α parameters

Parameter optimization is typically performed for neural networks with different activation functions. From Sect. 3.1 and 3.2, the optimal performance was obtained for the tanh activation function. Therefore, in this section, we take the neural network model with the tanh activation function as an example and discuss the parameter optimization process in detail. We primarily consider the influence of N_data, N_cell, and α on the prediction accuracy without interference data. The ²¹²Pb concentration predicted by the PINN for different N_data are presented in Fig. 15. The RSD of the ²¹²Pb concentration prediction was used to characterize the model prediction accuracy. The layer number was set to 5 for all cases, and the number of neurons in each layer was varied from 16 to 128.

Fig. 15

(Color online) RSD of 212^Pb concentration for varying N_data and N_cell. N_layer was fixed at 5. In (a)–(e), α = 1×10²⁴, 1×10²⁵, 1×10²⁶, 1×10²⁷, and 1×10²⁸, respectively

Three points can be summarized from Fig. 15. ① Data dependence on PINN prediction accuracy: With increasing N_data, the prediction accuracy for the ²¹²Pb concentration improved significantly. When N_data reached 10⁴, the prediction accuracy did not change considerably. ② Determination of N_cell: From Figs. 15(a)–(e), with increasing N_cell, the prediction accuracy for the ²¹²Pb concentration increased gradually; the prediction accuracies for N_cell = 64 and 128 were close. Considering the calculation resources, the optimal N_cell for this model was 64. ③ Determination of α: In the training experiment for the PINN neural network model, the order of magnitude of was approximately ×10^-3, and the order of magnitude of L_data was approximately ×10^-40. To facilitate supervision of the training of neural network models for both, in this paper study similar orders of magnitude were set for and L_data, so the change in prediction accuracy was analyzed when α was ×10²⁴–²⁸. As apparent from Fig. 15, for sufficient training data and with increasing α, the model prediction accuracy first increased rapidly and then decreased slightly, and optimal performance was obtained when α was ²⁶. Therefore, in this study, N_cell = 64 and α = 10²⁸ were selected for the optimal model.

This study had some limitations. A high level of accuracy was not achieved for the flow field prediction when interference data were considered (Sect. 3.2.2). This indicates that the conditions of the physical equations should be adjusted to constrain the model and enhance its prediction accuracy. As regards noise in the training stage of this research model, the training data were preprocessed to a certain extent and a good signal-to-noise ratio was obtained.

Conclusion

In this study, a PINN with equation adaption was established for ²²⁰Rn progeny concentration prediction. A PINN without equation adaption was examined; this model failed to yield the desired outcomes. This failure underscores the critical role of the equation adaption in training neural networks. For training without interference data, the PINN with equation adaption exhibited performance consistent with that of a classical neural network model, achieving high accuracy when predicting ²²⁰Rn concentrations. This outcome emphasizes the excellent compatibility of the equation adaption technique with neural networks. When interference data were considered, the PINN model with equation adaption retained good prediction accuracy, especially for ²²⁰Rn concentration prediction. This outcome highlights the effectiveness of equation adaption in constraining neural networks with physical equations, thereby improving the robustness of the neural network model.

In future work, different types of noise will be added to the model, based on factors such as the background radioactivity level and detection method. Additionally, the equation adaption technique will be used to model specific physical objects with PINNs; for example, precise and rapid prediction of ²²⁰Rn and its progeny concentrations will be explored. Overall, the equation adaptation approach presented in this study has good universality and provides a theoretical foundation for the widespread application of neural networks in various fields.

References

A. Krizhevsky, I. Sutskever, G.E. Hinton et al.,

ImageNet classification with deep convolutional neural networks

. Commun. ACM 60, 84–90 2017). https://doi.org/10.1145/3065386