Fault prediction method for nuclear power machinery based on Bayesian PPCA recurrent neural network model

NUCLEAR ENERGY SCIENCE AND ENGINEERING

Jun Ling, Gao-Jun Liu, Jia-Liang Li, Xiao-Cheng Shen, Dong-Dong You

Nuclear Science and Techniques

Vol.31, No.8

Article number 75

Published in print 01 Aug 2020

Available online 29 Jul 2020

DOI: 10.1007/s41365-020-00792-9

135101

Early fault warning for nuclear power machinery is conducive to timely troubleshooting and reductions in safety risks and unnecessary costs. This paper presents a novel intelligent fault prediction method for nuclear power machinery that integrates probabilistic principal component analysis (PPCA), multi-resolution wavelet analysis, Bayesian inference, and a recurrent neural network (RNN) model, and that accounts for data uncertainty and chaotic time series. After denoising the source data, the Bayesian PPCA method is employed for dimensional reduction to obtain a refined data group. An RNN prediction model is constructed, and a Bayesian statistical inference approach is developed to quantitatively assess the prediction reliability of the model. By modeling and analyzing the data collected on the steam turbine and components of a nuclear power plant, the results of the goodness of fit, mean square error distribution, and Bayesian confidence indicate that the proposed RNN model can implement early warning in the fault creep period. The accuracy and reliability of the proposed model are quantitatively verified.

Key words: Fault prediction; Nuclear power machinery; Steam turbine; Recurrent neural network; Probabilistic principal component analysis; Bayesian confidence

1 Introduction

Using a real-time monitoring system to collect the operations data of mechanical equipment in nuclear power plants (NPPs) enables early warning in the early stage of equipment failure and timely troubleshooting, which avoids major safety accidents, reduces unplanned shutdown maintenance of units, and reduces costs [1–3]. The establishment of data-driven prediction models for mechanical equipment fault prediction has become an important means of predictive maintenance, and research on nuclear power machinery has gradually increased because of the recent and rapid development of artificial intelligence algorithms and big data technology [4–6]. Xie et al. [7] designed an online early warning system to track and predict the critical reaction of a nuclear reactor through two independent online simulation systems. Qian et al. [8] presented a hierarchical multi-dimensional method for fault detection of an NPP main pipeline. Min et al. [9] used a pattern recognition early warning system developed with AAKR technology to demonstrate the effectiveness of a real-time monitoring and early warning system for NPPs. Peng et al. [10] utilized the feature selection ability of association analysis and the deep belief network (DBN) method to detect faults in nuclear power machinery. Yao et al. [11] introduced a fault diagnosis method for NPP full range simulators based on state information imaging. By using machine learning and image processing technology, historical data and synthetic grey image data are analyzed, and the system learns to extract and classify image features to perform fault diagnosis.

A variety of data-driven approaches are used in the fault prediction of large mechanical equipment. Qin et al. [12] employed an approach based on time series and Bayesian discriminant analyses to solve the problems of type identification and diagnosis of concurrent faults without characteristic parameters in rotating machinery. Mehrdad et al. [13] presented a non-parametric single spline regression approach to construct the power curve model of the generator set. Aye and Heyns [14] proposed an optimal Gaussian process regression, based on a combination of simple mean value and variance, to predict the remaining service life of low-speed bearings with a low error rate. The judgment methods for regression model accuracy are continuously developing. Jiang and Yin [15] proposed and applied a recursive total principal component regression-based design and implementation approach for efficient data-driven fault detection for vehicular cyber-physical systems. Gao et al. [16] established a partial least squares-aided data-driven model predictive control approach to improve prediction accuracy. Herp et al. [17] developed a statistical method of online extraction and prediction of turbine state based on Bayesian inference. The residuals of bearing temperature measurement are inferred online, and the prediction probability is calculated by the sample model and the risk function describing the state transition probability to predict the fault state in advance. Li et al. [18] proposed a method for rolling bearing fault identification based on the multifractal and grey system theories, aiming at the non-equilibrium and non-linear characteristics of bearing vibration signals and the complexity of the distribution of state indication information in the signal. Liu et al. [19] presented a thermal component fault prediction method based on the convolutional neural network (CNN) to address the disastrous consequences caused by thermal component faults of gas turbines. This work shows that the CNN is a feasible method for thermal component fault detection. Liu and Karimi [20] established two machine learning models based on an artificial neural network and a high-dimensional model representation to predict the operation characteristics of steam turbines and air compressors and provided a basis for continuous health monitoring and fault diagnosis.

Deep learning can learn the characteristics of the data directly from the training samples, which reduces the influence of assumptions and simplifications on the calculation results, and has recently been widely used for mechanical fault identification [21–23]. The RNN model shows good performance in capturing temporal correlations in data and can store and transmit the sequence information multiple times. Liu et al. [24] developed an RNN-based fault identification approach that uses a denoising auto-encoder based on a gated recurrent unit to predict multiple vibration values of rolling bearings in subsequent time series. Hadi and Shahnazari [25] proposed a fault detection and isolation (FDI) method. Based on the RNN, this method models and inverts the nonlinear system, establishes a plant prediction model, and makes use of the residual generated from the model for fault identification. Wang et al. [26] developed an RNN-based algorithm to effectively handle the multi-classification fault diagnosis for wind power systems. Palau et al. [27] employed a Weibull time-to-event RNN algorithm for distributed collaborative prognostics. Industrial gas turbine unit data and the C-MAPSS engine degradation data set are used in the experiment. Wang et al. [28] analyzed the motor vibration signal and multi-scale stator current signal and presented a multi-resolution and multi-sensor fusion network model for motor fault diagnosis based on RNN.

To refine and simplify the analysis of the collected data, the dimension of the data must be reduced so that only the most informative components are used. Principal component analysis (PCA) can effectively reduce the dimensionality of data while retaining the original feature information and resolve the multi-variable correlation problem, which reduces the complexity of the problem analysis; it is widely applied in big data processing, pattern recognition, and image processing. Li et al. [29] established an optimized PCA model to perform fault detection of sensors in NPPs and verified by simulation that the model can detect and reconstruct the faulty sensors well. Prusty et al. [30] employed PCA to reduce the dimension of a large number of plant signals transmitted by the Prototype Fast Breeder Reactor (PFBR) in an NPP, improving the decision-making capability of the operator in catastrophic conditions. Wu et al. [31] constructed a fault detection model of a pressurized water reactor in an NPP based on the BN-FDD system framework. PCA, fuzzy theory, and data fusion were used to promote data accuracy, and multiple sensor data were combined into one node data. Sharifi and Langari [32] divided the measurement space into several local linear regions associated with a PPCA model and presented a sensor fault diagnosis method for a nonlinear system by considering the data uncertainty. Xiang et al. [33] utilized the PPCA denoising model for rolling bearing fault prediction. In this model, the principal component subspace retains the more useful original information and fault signal, and the noise and related linear information are projected into the remaining subspace.

The above literature introduces the analysis methods of fault prediction for different mechanical equipment. The following problems regarding the application of fault detection and early warning for nuclear power rotating mechanical equipment must still be addressed: 1) imperfections in the source data and multiple variable redundancies; 2) monitoring and early warning during the creep period of equipment failure; and 3) the quantitative evaluation of the reliability of the prediction model.

To address these issues, this study combines pattern recognition technology and deep learning to present a fault prediction approach for steam turbines, pumps, and other mechanical equipment in NPPs based on a Bayesian PPCA RNN. After wavelet packet threshold denoising, the signal data are dimensionally reduced by using a Bayesian PPCA method. A fully connected RNN prediction model is established and verified by using the goodness of fit and mean square error. The model reliability is quantified by calculating the Bayesian factor and confidence. Combined with the prior information in the historical data set, the proposed method calculates the residual between the prediction and hypothetical health values to find the unit failure during the creep period.

2 Data integration analysis

For the monitoring data of rotating machinery in NPPs, the discrete wavelet packet transform (DWPT) method is first used to denoise the data. The number of monitoring data prediction variables must be reduced to simplify the research and reduce the prediction time and calculation cost. Because the variables with a certain correlation cannot be directly ignored, a Bayesian PPCA method [34] is employed to extract the principal component signal with the highest contribution rate for analysis and prediction. Considering the data uncertainty, PPCA estimates the discarded information as Gaussian noise.

There are N samples with dimension d, where d is the number of variables, and $X = (x_{1}, x_{2}, \dots, x_{N})$ ; thus, each variable contains N denoised time series points. PPCA assumes the existence of q-dimensional (q≤d) hidden variables β. The hidden variable model relates the matrix X of correlated sample data to the matrix β of uncorrelated hidden variables through formula (1):

X = W β + μ + ε

(1)

where $W$ is the d×q weight matrix that describes the relationship between $X$ and $β$ . Here, $μ = {(μ_{1}, μ_{2}, \dots, μ_{d})}^{T}$ is the mean vector that describes the mean value of each variable in the data matrix, with $μ_{i} = (1 / N) \sum_{j = 1}^{N} x_{i j}$ . $β = (β_{1}, β_{2}, \dots, β_{N})$ is composed of q latent variable factors. Each latent variable factor contains N corresponding points in the hidden space, obeys $β \sim N (0, I_{q})$ , and is the result of dimension reduction. Here, $ε$ is the possible error or noise that cannot be eliminated by the DWPT method combined with Bayesian theory; it is an independently distributed Gaussian variable $ε \sim N (0, σ^{2} I_{d})$ ( $I$ is the unit matrix). PPCA is therefore a special case of the factor analysis model in which the noise is isotropic. Combining these conditions with formula (1), we obtain

X \sim N (μ, W W^{T} + σ^{2} I_{d})

(2)

The specific parameters are solved by the maximum likelihood estimation. From the assumption $β \sim N (0, I_{q})$ , the prior distribution of the hidden variable $β$ is

p (β) = {(2 π)}^{- \frac{q}{2}} \exp \left( - \frac{1}{2} β^{T} β \right)

(3)

Given the hidden variable $β$ , from Eq. (1), the conditional probability distribution of the sample data $X$ is

p (X | β) = {(2 π σ^{2})}^{- \frac{d}{2}} \exp \left( - \frac{1}{2 σ^{2}} {‖ X - W β - μ ‖}_{2}^{2} \right) .

(4)

By substituting formula (4) into the Bayesian formula, the posterior probability density distribution of the hidden variable $β$ concerning sample data $X$ can be obtained as follows:

p (β | X) = {(2 π)}^{- \frac{q}{2}} {| σ^{2} M |}^{- \frac{1}{2}} \exp \left( - \frac{1}{2} {(β - 〈 β 〉)}^{T} {(σ^{2} M)}^{- 1} (β - 〈 β 〉) \right) ,

(5)

where $M = {(W^{T} W + σ^{2} I_{q})}^{- 1}$ and $〈 β 〉 = M W^{T} (X - μ)$ . PPCA takes the q components of the posterior mean $〈 β 〉$ of $β$ as the result of the dimension reduction of $X$ , namely, the q principal components. $W$ and $σ^{2}$ are estimated by maximizing the likelihood function:

\overset{⌢}{W} = U_{q} {(Λ_{q} - σ^{2} I_{q})}^{1 / 2} R

(6)

{\overset{⌢}{σ}}^{2} = \frac{1}{d - q} \sum_{j = q + 1}^{d} λ_{j}

(7)

In formulas (6) and (7), $λ_{j}$ and $υ_{j}$ are obtained from the eigenvalue decomposition of the covariance matrix $C$ of the sample data $X$ , that is, $C υ_{j} = λ_{j} υ_{j}$ , where $υ_{j}$ is the eigenvector that corresponds to the eigenvalue $λ_{j}$ and the eigenvalues are sorted in descending order. $U_{q}$ is the matrix of eigenvectors and $Λ_{q}$ is the diagonal matrix of eigenvalues that correspond to the first q eigenvalues, with $U_{q} = (υ_{1}, υ_{2}, \dots, υ_{q})$ and $Λ_{q} = diag (λ_{1}, λ_{2}, \dots, λ_{q})$ ; $R$ is an arbitrary q×q orthogonal rotation matrix. When the cumulative variance contribution rate of the first k of the q principal components reaches a required level (as determined by the actual situation), only the k-dimensional data after the dimension reduction need to be used for the subsequent data processing.
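The closed-form estimates of formulas (6) and (7) and the posterior mean of formula (5) can be illustrated with a short NumPy sketch. This is not the implementation used in the study: the function names `ppca_ml` and `posterior_mean`, the choice $R = I$ , and the synthetic three-channel data are assumptions made only for illustration.

```python
import numpy as np

def ppca_ml(X, q):
    """Closed-form maximum likelihood PPCA (Eqs. 6 and 7), taking R = I.

    X : (N, d) array whose rows are samples; q : latent dimension, q < d.
    Returns mu (d,), W (d, q), sigma2 and the eigenvalues in descending order.
    """
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    C = np.cov(X - mu, rowvar=False, bias=True)      # sample covariance (d, d)
    eigvals, eigvecs = np.linalg.eigh(C)             # returned in ascending order
    order = np.argsort(eigvals)[::-1]                # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    sigma2 = eigvals[q:].mean()                      # Eq. (7): mean of discarded eigenvalues
    scale = np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    W = eigvecs[:, :q] * scale                       # Eq. (6): U_q (Lambda_q - sigma2 I)^(1/2)
    return mu, W, sigma2, eigvals

def posterior_mean(X, mu, W, sigma2):
    """Posterior mean <beta> = M W^T (x - mu), with M = (W^T W + sigma2 I_q)^(-1)."""
    q = W.shape[1]
    M = np.linalg.inv(W.T @ W + sigma2 * np.eye(q))
    return (X - mu) @ W @ M                          # M is symmetric; one row per sample

# Synthetic stand-in for three correlated speed channels (hypothetical data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 1))
X_demo = 1500.0 + latent @ np.array([[0.4, 0.65, 0.65]]) + 0.05 * rng.normal(size=(500, 3))

mu, W, sigma2, eigvals = ppca_ml(X_demo, q=1)
beta = posterior_mean(X_demo, mu, W, sigma2)         # one-dimensional reduced series
contribution = eigvals / eigvals.sum()               # variance contribution rates
```

Here `contribution` plays the role of the contribution rate used to decide how many principal components to keep.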

3 Fully Connected RNN Prediction Model

3.1 RNN Model Construction

In this study, an RNN suitable for sequence data modeling is used to predict a time series. The neurons with a cyclic structure retain and apply the state information of the previous moment as memory to current output calculation; thus, the nodes between the same hidden layers are connected. The RNN can transmit information laterally among neurons and partially express correlations within the data. This information transmission mode matches well with the state process of operational nuclear power machinery. The running state at a given moment will have a certain impact on the running state at the next moment, and the collected data also correlates.

According to the embedding dimension m and time delay τ, an RNN structure prediction model is constructed (Fig. 1), where $x_{t} (x_{1}, x_{2}, \dots x_{τ})$ is the network input, the corresponding hidden layer is $s_{t} (s_{1}, s_{2}, \dots s_{τ})$ , and the output layer is ${\hat{y}}_{t + (n + 1) τ} (y_{1}, y_{2}, \dots y_{τ})$ . V is the weight matrix from the hidden layer to the output layer. W is the weight matrix applied to the hidden layer state of the previous time step when it is fed back as input to the current step. U is the weight matrix from the input layer to the hidden layer.

Fig. 1

Prediction model with RNN structure

In the model, the same weight parameters are used at different times, and the rectified linear unit (ReLU) is used uniformly as the activation function. A total of m units are connected, and the last unit provides the output value. The initial hidden layer state $s_{0}$ is a random value, and the prediction value for the next $τ$ time point, outputted by the last unit, is ${\hat{y}}_{t + (n + 1) τ}$ . For time $t + n τ$ , formula (8) is used to calculate the forward propagation from input to output.

\begin{matrix} c_{t + n τ} = W s_{t + (n - 1) τ} + U x_{t + n τ} + a \\ s_{t + n τ} = \mathrm{ReLU} (c_{t + n τ}) \\ o_{t + n τ} = V s_{t + n τ} + b \\ {\hat{y}}_{t + (n + 1) τ} = \mathrm{ReLU} (o_{t + n τ}) \end{matrix}

(8)

In formula (8), $c_{t}$ and $o_{t}$ are used as intermediate variables to participate in the backpropagation calculation; $a$ and $b$ represent the bias terms of the hidden layer neurons and output layer neurons, respectively.
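The forward propagation of formula (8) can be sketched in a few lines of NumPy. The layer sizes, weight initialization, and input values below are hypothetical and are chosen only to make the example runnable; the function name `rnn_forward` is likewise an assumption.

```python
import numpy as np

def relu(z):
    """Rectified linear unit applied element-wise."""
    return np.maximum(z, 0.0)

def rnn_forward(x_seq, U, W, V, a, b, s0):
    """Forward pass of Eq. (8) through m connected units with shared weights.

    x_seq : (m, n_in) inputs spaced by the time delay; s0 : initial hidden state.
    Returns the prediction of the last unit and the list of hidden states.
    """
    s = s0
    states = []
    for x in x_seq:
        c = W @ s + U @ x + a      # c = W s_prev + U x + a
        s = relu(c)                # s = ReLU(c)
        states.append(s)
    o = V @ s + b                  # o = V s + b, evaluated at the last unit only
    y_hat = relu(o)                # y_hat = ReLU(o)
    return y_hat, states

# Hypothetical sizes: m = 4 input units, 8 hidden neurons, scalar input and output.
rng = np.random.default_rng(0)
hidden = 8
U = rng.normal(scale=0.3, size=(hidden, 1))
W = rng.normal(scale=0.3, size=(hidden, hidden))
V = rng.normal(scale=0.3, size=(1, hidden))
a, b = np.zeros(hidden), np.zeros(1)
s0 = rng.random(hidden)            # random initial hidden state, as described in the text
x_seq = np.array([[0.2], [0.4], [0.6], [0.8]])

y_hat, states = rnn_forward(x_seq, U, W, V, a, b, s0)
```

Because the same `U`, `W`, and `V` are reused at every step, the number of trainable parameters does not grow with the embedding dimension m.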

After constructing the model, a backpropagation through time (BPTT) algorithm [35] is employed to train the RNN model. Based on formula (8), the loss function is established, and the minimum value of the loss function is calculated. The negative log likelihood function is used to establish the loss function:

L_{t} = - \sum_{i = 1}^{n} y_{t}^{(i)} \log ({\hat{y}}_{t}^{(i)})

(9)

where $y_{t}^{(i)}$ is the i-th element of the output $y_{t}$ , ${\hat{y}}_{t}^{(i)}$ is the corresponding predicted value, and n is the number of data points in each group. After the loss function is determined, the partial derivatives in Eq. (10) are accumulated over the time steps and used by an optimization strategy, such as stochastic gradient descent, to update the weights and biases.

[\frac{\partial L_{t}}{\partial V}, \frac{\partial L_{t}}{\partial b}, \frac{\partial L_{t}}{\partial W}, \frac{\partial L_{t}}{\partial U}, \frac{\partial L_{t}}{\partial a}]

(10)

3.2 Model Reliability Verification

The reliability represents the ability of the model to accurately reflect the characteristics of the data set and to predict the data information of future time nodes. To verify the reliability of the model, three methods are introduced: 1) goodness of fit $R^{2}$ ; 2) mean square error MSE; and 3) Bayesian confidence. For $R^{2} \in (0, 1)$ , the closer $R^{2}$ is to 1 and the smaller the MSE, the higher the precision and the better the reliability of the model. The Bayesian hypothesis test method [36–38] considers the data uncertainty and intuitively verifies the model reliability.
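The first two indexes can be computed as in the following sketch. The paper does not state the formula used for $R^{2}$ ; the standard coefficient of determination, $1 - SS_{res} / SS_{tot}$ , is assumed here, and the sample values are invented for illustration.

```python
def mse(y, y_hat):
    """Mean square error between monitoring values y and predictions y_hat."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat)) / len(y)

def r_squared(y, y_hat):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical monitoring and predicted values.
y_obs = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]

demo_mse = mse(y_obs, y_pred)
demo_r2 = r_squared(y_obs, y_pred)
```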

The model prediction value is denoted as $\hat{y} = ({\hat{y}}_{1}, {\hat{y}}_{2}, \dots {\hat{y}}_{n})$ , and the monitoring value is $y = (y_{1}, y_{2}, \dots y_{n})$ . The residuals are expressed by formula (11), where $e_{j}$ is the residual between the j-th monitoring and predictive values, obeying a normal distribution $N (μ, σ_{1}^{2})$ , and $\bar{ε}$ is the mean value of the n residuals, obeying a normal distribution $N (μ, σ_{1}^{2} / n)$ .

\begin{matrix} e_{j} = y_{j} - {\hat{y}}_{j}, j = 1, 2, \dots, n \\ \bar{ε} = \frac{1}{n} \sum_{j = 1}^{n} e_{j} \end{matrix}

(11)

The null hypothesis is defined as $H_{0} : μ = 0$ , and the alternative hypothesis is $H_{1} : μ \neq 0$ . Under $H_{1}$ , the prior probability density of $μ$ is assumed as $N (0, σ_{0}^{2})$ . The mean value of residuals is computed in sections, and the mean variance is taken as $σ_{0}^{2}$ . Thus, the prior probability density function of $μ$ is:

g (μ) = \sqrt{\frac{1}{2 π σ_{0}^{2}}} \exp \left( - \frac{μ^{2}}{2 σ_{0}^{2}} \right) .

(12)

The Bayes factor is the primary evaluation index of the Bayesian hypothesis test; it is the ratio of the likelihood of the observed mean residual under $H_{0}$ to its marginal likelihood under $H_{1}$ (equivalently, the ratio of the posterior odds to the prior odds). If the Bayes factor is significantly greater than 1, the sample information supports the null hypothesis $H_{0}$ . It is expressed as:

B^{π} (\bar{ε}) = \sqrt{1 + \frac{n σ_{0}^{2}}{σ_{1}^{2}}} \exp \left[ \frac{n {\bar{ε}}^{2}}{2} \left( - \frac{1}{σ_{1}^{2}} + \frac{1}{n σ_{0}^{2} + σ_{1}^{2}} \right) \right] .

(13)

The posterior probability of the null hypothesis $H_{0}$ can be further obtained by formula (14):

λ = π (H_{0} | \bar{ε}) = {[1 + \frac{1 - π_{0}}{π_{0}} \frac{1}{B^{π} (\bar{ε})}]}^{- 1},

(14)

where $π_{0}$ is the prior probability of $H_{0}$ and $λ$ reflects the confidence degree of the prediction model. When $λ \to 0$ , the confidence degree in the model is 0% and the model reliability is low. When $λ \to 1$ , the confidence degree in the model is 100%, and the reliability is high.
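Formulas (13) and (14) translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the default prior probability $π_{0} = 0.5$ is an assumption, since the value used in the study is not stated.

```python
import math

def bayes_factor(eps_bar, n, sigma1_sq, sigma0_sq):
    """Bayes factor of Eq. (13) for H0: mu = 0 against H1: mu ~ N(0, sigma0_sq).

    eps_bar   : mean of the n residuals
    sigma1_sq : variance of a single residual
    sigma0_sq : prior variance of mu under H1
    """
    scale = math.sqrt(1.0 + n * sigma0_sq / sigma1_sq)
    exponent = 0.5 * n * eps_bar ** 2 * (
        -1.0 / sigma1_sq + 1.0 / (n * sigma0_sq + sigma1_sq)
    )
    return scale * math.exp(exponent)

def bayes_confidence(B, pi0=0.5):
    """Posterior probability of H0 from Eq. (14); pi0 = 0.5 is an assumed prior."""
    return 1.0 / (1.0 + (1.0 - pi0) / pi0 / B)

# Hand-checkable case: zero mean residual, n = 3, unit variances.
B_demo = bayes_factor(eps_bar=0.0, n=3, sigma1_sq=1.0, sigma0_sq=1.0)   # sqrt(1 + 3) = 2
lam_demo = bayes_confidence(B_demo)                                     # 1 / (1 + 1/2) = 2/3
```

A larger absolute mean residual makes the exponent more negative, so the Bayes factor and the confidence $λ$ both decrease, which matches the intended reading of $λ$ as support for an unbiased model.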

3.3 Fault Prediction

In this study, part of the data is used as a training set to construct the RNN prediction model. The abnormal signal is identified by setting the threshold in advance. The threshold $e_{\max}$ is the value of the maximum residual in the training set. In the testing set, residuals between the monitoring value of each time point and the predicted value are expressed as follows:

Θ_{test} = \{ e_{j} \mid e_{j} = y_{j} - {\hat{y}}_{j}, j = 1, 2, \dots, n \} .

(15)

When a residual in $Θ_{test}$ exceeds $e_{\max}$ , an alarm is given. In the real-time condition monitoring of the steam turbine unit, when the residual exceeds the set threshold for a long time, the unit is considered to have failed.
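The alarm rule can be sketched as follows. The parameter `min_run`, which requires several consecutive exceedances before an alarm is raised, is a hypothetical way of expressing "for a long time"; the text does not specify a duration, and the residual values are invented.

```python
def first_alarm(residuals, e_max, min_run=1):
    """Index of the first point that starts min_run consecutive residuals above e_max.

    Returns None when no such run exists. min_run = 1 alarms on the first exceedance.
    """
    run = 0
    for j, e in enumerate(residuals):
        run = run + 1 if e > e_max else 0
        if run >= min_run:
            return j - min_run + 1
    return None

# Hypothetical residuals; the threshold is the maximum residual of the training set.
train_residuals = [0.01, -0.02, 0.03]
e_max = max(train_residuals)
test_residuals = [0.00, 0.04, 0.01, 0.05, 0.06, 0.07]

single_point_alarm = first_alarm(test_residuals, e_max)             # isolated exceedance
sustained_alarm = first_alarm(test_residuals, e_max, min_run=3)     # sustained exceedance
```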

4 Illustration

This study uses rotating speed signal data of a pressure cylinder of a nuclear power turbine in April 2019 to explain the algorithm flow and model building. The rated rotating speed of the sampled turbine is 1500 rpm. The data set consists of two rotor speeds, one bearing group speed, and 720 time points. The data before April 20 is used as the training set to build the model and train the weights, and the data after April 20 is used as the verification set to verify the model reliability.

Figure 2 is a flow chart of fault prediction by the RNN model. After the original three-dimensional rotational speed signal is denoised by DWPT and reduced by PPCA, a one-dimensional time series signal is obtained. The signal data with the delay time and embedding dimension optimized by the enumeration method are set as a training dataset. The RNN model after training characteristic parameters is used to predict the value of the next time point, and the residual between the prediction and monitoring values is calculated for early warning.

Fig. 2

Fault prediction by the RNN model

4.1 Data Denoising and Dimensional Reduction

The three speed signals are denoised by the DWPT. Each time series signal is decomposed into three levels by using the db8 wavelet packet, and the wavelet coefficients of each point are obtained. The wavelet coefficients are filtered according to the Bayesian threshold approach, and the signal is reconstructed. Figure 3 shows the noisy and denoised data of the bearing group speed signal. The denoised signal is very similar to the original signal, and the trend is consistent. The feature information of the original signal is retained.

Fig. 3

(Color online) Noise reduction effect of the bearing group signal

For each type of data, PPCA retains the leading dimensions whose cumulative variance contribution rate exceeds 70%. Thus, the proportion of information retained after dimensional reduction is more than 70%. Three speed signals are dimensionally reduced in this case. Table 1 shows the results of dimensional reduction, where $w_{i}$ is the weight value of each component signal to the principal component after Bayesian PPCA of the rotating speed signals. The greater the absolute value of the weight parameter, the more the component signal contributes to the result. By reducing the three-dimensional rotating speed signal to the one-dimensional signal corresponding to PC1, the cost of training calculation can be decreased, and the one-dimensional data retain most of the information from the original data (a contribution rate of 0.717) without any signal distortion (Table 1).

Table 1

PCA weights and contribution rates of rotating speeds

Weight	PC1	PC2	PC3
w₁	0.364	-0.931	-0.005
w₂	0.658	0.261	-0.706
w₃	0.659	0.254	0.708
Contribution rate	0.717	0.275	0.008
Cumulative contribution rate	0.717	0.992	1.000

4.2 Determining the embedding dimension and time delay of the input layer

The time series data after dimensional reduction by PPCA are prone to chaos; therefore, the input layer of the prediction model must be determined by phase space reconstruction. By using the enumeration method with various time delay and embedding dimension combinations, the changes in the $R^{2}$ and MSE of the RNN model training are analyzed. Figures 4 and 5 show the changing trends of $R^{2}$ and MSE with different time delays and embedding dimensions, respectively. As the time delay increases, $R^{2}$ slowly decreases and MSE slowly increases in the training set, whereas in the verification set $R^{2}$ descends steeply and MSE increases (Fig. 4). MSE and $R^{2}$ fluctuate as the embedding dimension increases (Fig. 5). $R^{2}$ reaches a maximum at $m = 4$ in both the training and verification sets, and MSE reaches a minimum at $m = 4, 7, 9$ in the training set and at $m = 4, 14$ in the verification set. Therefore, to ensure the maximum value of $R^{2}$ and the minimum value of MSE, the optimal embedding dimension $m = 4$ is chosen. For the time delay, $t = 1$ and $t > 1$ indicate single-step and multi-step training, respectively. Larger t values predict data at later times, but a t of 2 or 3 provides the optimum accuracy (Fig. 4).

Fig. 4

(Color online) R² (a) and MSE (b) under different time delays

Fig. 5

(Color online) R² (a) and MSE (b) under different embedding dimensions
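The phase space reconstruction that turns the one-dimensional series into RNN inputs can be sketched as below. The helper `delay_embed` is hypothetical; it assumes that the m inputs are spaced by the time delay and that the target is the point one further delay ahead of the last input, in line with the forward propagation of formula (8).

```python
def delay_embed(series, m, tau):
    """Build (input, target) pairs from a 1-D series by delay embedding.

    Each input is [x_t, x_{t+tau}, ..., x_{t+(m-1)tau}] and its target is x_{t+m*tau}.
    """
    inputs, targets = [], []
    for t in range(len(series) - m * tau):
        inputs.append([series[t + k * tau] for k in range(m)])
        targets.append(series[t + m * tau])
    return inputs, targets

# Toy series 0..11 with the parameters selected in the text (m = 4, delay of 2).
toy_series = list(range(12))
toy_inputs, toy_targets = delay_embed(toy_series, m=4, tau=2)
# First pair: inputs [0, 2, 4, 6] -> target 8
```

Enumerating candidate (m, delay) pairs then amounts to rebuilding these pairs for each combination, retraining the model, and comparing the resulting $R^{2}$ and MSE.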

4.3 RNN model prediction and verification

The fully connected RNN model is constructed according to the optimized time delay and embedding dimension. For the rotating speed principal component, the time delay is the interval time between two adjacent input data, and the embedding dimension is the number of RNN input units. After model training, the $R^{2}$ , MSE, and Bayesian confidence $λ$ indexes are computed for the training and verification datasets to validate the accuracy and reliability of the model. The closer $R^{2}$ is to 1, the smaller the MSE, and the higher $λ$ , the higher the accuracy and the better the reliability of the model. Table 2 lists the values of $R^{2}$ , MSE, and $λ$ for time delay $τ_{d} = 2$ and embedding dimensions $m = 3, 4, \dots, 8$ . Under the parameters $τ_{d} = 2$ and $m = 4$ , the $R^{2}$ values of the training set and verification set are both above 0.94, the MSE values remain on the order of 0.001, and the $λ$ value of the verification set is 93.64% (Table 2, Figures 4 and 5).

Table 2

Values of R², MSE and λ

Time delay	Embedding dimension	R² (training)	MSE (training)	R² (verification)	MSE (verification)	λ (verification)
τ_d=2	m=3	0.965	0.001855	0.935	0.003154	93.55%
	m=4	0.975	0.000951	0.944	0.002177	93.64%
	m=5	0.952	0.002946	0.942	0.003242	92.37%
	m=6	0.962	0.002127	0.934	0.002755	93.27%
	m=7	0.956	0.002099	0.917	0.002046	93.45%
	m=8	0.949	0.003509	0.918	0.003326	91.93%

Figure 6 shows the comparison between the monitoring and predictive values of the rotating speed signal under three different conditions. Figure 6a illustrates the results of the proposed model and shows that the two curves coincide well. To analyze the effect of noise reduction, the rotating speed signals without DWPT are trained and predicted (Fig. 6b). The difference between two curves in the figure indicates that noise reduction of the source data is necessary. Furthermore, the traditional artificial neural network (ANN) model is applied for comparative analysis, and the predictive results are described in Fig. 6c. The degree of agreement between the two curves is lower than that of the RNN model in Fig. 6a. The $R^{2}$ values of the training and verification sets of the ANN model are 0.963 and 0.933, respectively, and the MSE values of the training and verification sets of the ANN model are 0.017104 and 0.022082, respectively. The results of these two parameters of the RNN model under $m = 4, τ_{d} = 2$ in Table 2 are better than those of the ANN model. Thus, the accuracy and reliability of the proposed model are validated.

Fig. 6

(Color online) Prediction and monitoring values of rotating speed signal. (a) denoised data in RNN model; (b) noisy data in RNN model; (c) denoised data in ANN model.

4.4 Fault prediction of a two-stage impeller

A data set with fault points is applied to further verify the model reliability and test the early warning function for faults with the model. Figure 7 shows cracks (found in late May 2017 during major maintenance) in a two-stage impeller of a turbine unit in an NPP. The flaw detection results show that five blades have cracks, of which the shortest and longest measure approximately 29 and 45 mm, respectively.

Fig. 7

(Color online) Cracks in a two-stage impeller

The rotating speed and vibration signals from February 10 to March 16, 2017 are extracted from the monitoring system as the experimental data set. Both the speed and vibration datasets consist of 34 days of monitoring data with 24 time points per day and a total of 816 time points. The signal data are divided into training (Feb. 10 to Mar. 2), verification (Mar. 2–4), and testing (after Mar. 5) data sets for modeling and fault prediction. Similar to the previous case, the two data sets are employed for the RNN model prediction after noise reduction and reconstruction of the embedding dimension and time delay. Table 3 lists some model prediction parameters and fault early warning parameters. The results of $R^{2}$ and MSE in the training set indicate good RNN model reliability. The positive threshold, which is the maximum positive residual allowed in the early warning system, is set to 1.2 times the maximum residual in the training set. The negative threshold, which is the maximum negative residual allowed in the early warning system, is set to 1.2 times the minimum residual in the training set. Table 3 shows the values of the positive and negative thresholds. Two warning lines are drawn in orange (Figs. 8 and 9). When the green residual curve exceeds the warning line, the blue monitoring signal curve still fluctuates normally (Fig. 8). The monitoring system, which sets fixed alarm thresholds for specific signal values instead of residuals, was recorded to give an alarm at 17:00 on March 7, whereas the alarm time of the RNN model for the rotating speed signal is 5:00 on March 5 (60 h in advance). Similarly, Figure 9 shows the early warning effect for the vibration signal, and Table 3 lists the corresponding alarm times of the RNN model and monitoring system; for this signal, the RNN model produces an alarm 44 h in advance. The experimental results indicate that the cracks in the two-stage impeller cause the abnormal signals of turbine speed and vibration and that the RNN model can predict the anomaly well. The model is further validated as reliable.

Fig. 8

(Color online) Fault early warning effect of rotating speed signal

Fig. 9

(Color online) Fault early warning effect of vibration signal

Table 3

RNN model parameters and fault early warning parameters

Fault data set (m=4, τ_d=2)	R² (training)	MSE (training)	Positive threshold	Negative threshold	System alarm time	RNN model alarm time
Rotating speed	0.967	0.001248	0.043382	-0.173924	3/7 17:00	3/5 05:00
Vibration	0.944	0.001619	0.139959	-0.205049	3/7 17:00	3/5 21:00
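The lead times quoted in the text follow directly from the alarm times in Table 3. The short check below uses only the Python standard library; the year 2017 is taken from the stated data collection period, and the helper name `lead_hours` is hypothetical.

```python
from datetime import datetime

def lead_hours(system_alarm, model_alarm):
    """Hours by which the model alarm precedes the monitoring system alarm."""
    return (system_alarm - model_alarm).total_seconds() / 3600.0

system_alarm = datetime(2017, 3, 7, 17, 0)                              # 3/7 17:00
speed_lead = lead_hours(system_alarm, datetime(2017, 3, 5, 5, 0))       # rotating speed
vibration_lead = lead_hours(system_alarm, datetime(2017, 3, 5, 21, 0))  # vibration
```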

5 Conclusion

Because of multiple variable redundancies and turbine data imperfections, the Bayesian PPCA method is used to preprocess the DWPT denoising data and obtain a data set with a high signal-to-noise ratio and low dimension. The rotating speed signal is reduced from a three- to one-dimensional principal component, and the contribution rate is more than 70%.

A fully connected RNN prediction model is established. The goodness of fit of each signal data is calculated to be higher than 0.93, and MSE fluctuates on the order of 0.001, which verifies the model reliability. Furthermore, a Bayesian hypothesis testing method, which considers the data uncertainty and prior information of the training set, is employed to quantify the model confidence. The Bayesian confidence values of the verification set under different embedding dimensions are calculated at more than 90%. In the comparison case study of the two-stage impeller cracking and the monitoring system, the RNN prediction model produces alarms 60 and 44 h in advance for the rotating speed and vibration signals, respectively. The prediction results indicate that the RNN model can effectively identify faults during the creep period.

References:

J.P. Ma, J. Jiang, Applications of fault detection and diagnosis methods in nuclear power plants: A review. Prog. Nucl. Energ. 53, 255-266 (2011). https://doi.org/10.1016/j.pnucene.2010.12.001