Hybrid reconstruction algorithm for computed tomography based on diagonal total variation

NUCLEAR ELECTRONICS AND INSTRUMENTATION

Lu-Zhen Deng, Peng He, Shang-Hai Jiang, Mian-Yi Chen, Biao Wei, Peng Feng

Nuclear Science and Techniques

Vol.29, No.3

Article number 45

Published in print 01 Mar 2018

Available online 28 Feb 2018

DOI: 10.1007/s41365-018-0376-2

103401

Inspired by total variation (TV), this paper presents a new iterative algorithm based on diagonal total variation (DTV) to address the computed tomography (CT) image reconstruction problem. To improve the quality of the reconstructed image, we used DTV to sparsely represent the image once the iterative convergence of the TV-constrained reconstruction algorithm had stalled. To evaluate the proposed algorithm, numerical and experimental studies were performed, and the root mean square error (RMSE) and structural similarity (SSIM) were used to assess the reconstructed image quality. The results demonstrated that the proposed method could effectively reduce noise, suppress artifacts, and reconstruct high-quality images from incomplete projection data.

Computed tomography (CT); Sparse view reconstruction; Diagonal total variation (DTV); Compressive sensing (CS)

1. Introduction

X-ray computed tomography (CT) has been widely used in clinical and preclinical applications and plays a central role in the examination of diseases and procedures [1, 2]. Because X-ray radiation is harmful to patients, low-dose CT reconstruction is an active research topic in medical CT. To reduce the radiation dose, medical CT systems can decrease either the X-ray intensity or the number of scanning projection angles. However, both strategies yield a reconstructed image with a low signal-to-noise ratio (SNR), that is, an image degraded by noise and artifacts.

In comparison to traditional CT reconstruction algorithms [3-5], iterative algorithms based on compressive sensing (CS) [6] can reconstruct high-quality CT images from incomplete or low-SNR projection data. Several advanced algorithms have been employed in CT reconstruction, such as total variation (TV) minimization [7, 8], soft-thresholding algorithms [9, 10], adaptive-weighted total variation (AWTV) [11], dictionary learning [12], multi-direction anisotropic total variation (MDATV) [13], and split-Bregman reconstruction [14]. Among them, TV, which uses the x- and y-coordinate gradient operators as the sparse representation during the iteration process, is one of the most popular. However, it can still be improved by combining it with diagonal total variation (DTV) [15-19], which accelerates the iterative convergence and reconstructs high-quality images from incomplete projection data.

In this paper, our goal was to reconstruct high-quality CT images at a low dose. There are many ways to reduce the delivered dose in CT scanning; we focused on a sparse view reconstruction strategy. To handle the noise and artifacts in images reconstructed from few-view projections, we propose a hybrid reconstruction approach that combines the TV- and DTV-constraints. We introduce the methodology, including the proposed algorithm, in Sect. 2, analyze the reconstructed results in Sect. 3, and discuss the issues related to our reconstruction process and the corresponding results in Sect. 4.

2. Methodology

2.1 Total Variation and Diagonal Total Variation

The reconstruction algorithm with TV-constraint can be defined as:

\min {‖ \vec{f} ‖}_{TV} \quad \text{s.t.} \quad {‖ A \vec{f} - \vec{p} ‖}_{2}^{2} < σ^{2} .

(1)

where $\vec{f}$ is the reconstructed image, $A$ is the projection matrix, $\vec{p}$ is the projection data, σ is the permissible error, and the TV of $\vec{f}$ can be expressed as:

{‖ \vec{f} ‖}_{TV} = \sum_{s, t} | \vec{\nabla} f_{s, t} | = \sum_{s, t} \sqrt{{[D_{s} (f_{s, t})]}^{2} + {[D_{t} (f_{s, t})]}^{2}} .

(2)

where $\vec{\nabla}$ represents the local gradient operator, $f_{s, t}$ is the pixel value of $\vec{f}$ at position (s,t), and $D_{s} (•)$ and $D_{t} (•)$ are the discrete differential operators along the s- and t-axes, respectively, defined as:

D_{s} (f_{s, t}) = f_{s, t} - f_{s - 1, t}

(3)

D_{t} (f_{s, t}) = f_{s, t} - f_{s, t - 1} .

(4)

The positional relationship between $f_{s, t}$ and its neighboring pixels is shown in Figure 1.

Fig. 1.

Illustration of the pixel positions in a reconstructed image
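As an illustration of Eqs. (2)-(4), the following minimal Python sketch evaluates the TV of a small image stored as a list of rows. The function name `tv_norm` and the boundary handling (pixels in the first row or first column, which lack a backward neighbor, are skipped) are assumptions made for this example and are not specified in the text.

```python
import math

def tv_norm(f):
    """Isotropic TV of a 2-D image f[s][t], following Eqs. (2)-(4).

    Assumption: pixels with s == 0 or t == 0 are skipped because their
    backward neighbours do not exist.
    """
    total = 0.0
    for s in range(1, len(f)):
        for t in range(1, len(f[0])):
            ds = f[s][t] - f[s - 1][t]   # D_s, Eq. (3)
            dt = f[s][t] - f[s][t - 1]   # D_t, Eq. (4)
            total += math.sqrt(ds * ds + dt * dt)
    return total

# A constant image has zero TV; a single vertical edge contributes
# one unit per row that crosses it.
flat = [[1.0] * 4 for _ in range(4)]
step = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
print(tv_norm(flat), tv_norm(step))
```

A piecewise-constant image therefore has a small TV, which is why TV serves as a sparse representation for CT images.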

The DTV of $\vec{f}$, denoted ${‖ \vec{f} ‖}_{DTV}$, is defined as:

{‖ \vec{f} ‖}_{DTV} = \sum_{s, t} | {\vec{\nabla}}_{D} f_{s, t} | = \sum_{s, t} \sqrt{{[D_{d s} (f_{s, t})]}^{2} + {[D_{d t} (f_{s, t})]}^{2}}

(5)

where ${\vec{\nabla}}_{D}$ represents the local diagonal gradient operator and $D_{d s} (•)$ and $D_{d t} (•)$ are the discrete differential operators along the two diagonal directions, given by

D_{d s} (f_{s, t}) = f_{s, t} - f_{s - 1, t - 1}

(6)

D_{d t} (f_{s, t}) = f_{s, t} - f_{s - 1, t + 1} .

(7)
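The diagonal differences of Eqs. (5)-(7) can be sketched in the same way. The function name `dtv_norm` and the boundary rule (pixels whose diagonal neighbors fall outside the image are skipped) are assumptions for this example only.

```python
import math

def dtv_norm(f):
    """Diagonal TV of a 2-D image f[s][t], following Eqs. (5)-(7).

    Assumption: pixels in the first row, first column, or last column are
    skipped because one of their diagonal neighbours does not exist.
    """
    rows, cols = len(f), len(f[0])
    total = 0.0
    for s in range(1, rows):
        for t in range(1, cols - 1):
            dds = f[s][t] - f[s - 1][t - 1]   # D_ds, Eq. (6)
            ddt = f[s][t] - f[s - 1][t + 1]   # D_dt, Eq. (7)
            total += math.sqrt(dds * dds + ddt * ddt)
    return total

# A checkerboard is constant along both diagonals, so its DTV is zero,
# whereas a vertical edge is crossed by both diagonal differences.
checker = [[float((s + t) % 2) for t in range(4)] for s in range(4)]
step = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
print(dtv_norm(checker), dtv_norm(step))
```

The checkerboard case shows that DTV and TV measure sparsity along different directions, which is the property the hybrid scheme relies on.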

2.2 Proposed Algorithm

In this paper, we used DTV to sparsely represent the image once the iterative convergence of the TV-constrained reconstruction algorithm showed no further change. To solve the optimization problem, we employed the steepest descent method [20]. Our proposed algorithm can be defined as follows:

\min {‖ \vec{f} ‖}_{TV} \quad \text{s.t.} \quad {‖ A \vec{f} - \vec{p} ‖}_{2}^{2} < σ_{1}^{2}

(8)

\min {‖ \vec{f} ‖}_{DTV} \quad \text{s.t.} \quad {‖ A \vec{f} - \vec{p} ‖}_{2}^{2} < σ_{2}^{2}

(9)

where $σ_{1}$ and $σ_{2}$ are the permissible errors of TV and DTV, respectively.

The steepest descent method was applied to solve Eqs. (8) and (9), and we obtained the following formulas:

{\vec{f}}^{l + 1} = {\vec{f}}^{l} - α {\overset{⌢}{g}}_{TV}^{l}

(10)

{\vec{f}}^{k + 1} = {\vec{f}}^{k} - β {\overset{⌢}{g}}_{DTV}^{k}

(11)

where $l$ and $k$ denote the iteration indices of TV and DTV in the steepest descent method, respectively, and α and β are the gradient descent step sizes of TV and DTV, respectively. ${\overset{⌢}{g}}_{TV}^{}$ is the normalized TV gradient and ${\vec{g}}_{TV}^{}$ is the TV gradient; they are related by ${\overset{⌢}{g}}_{TV}^{} = {\vec{g}}_{TV}^{} / {‖ {\vec{g}}_{TV}^{} ‖}_{2}$ . The individual elements of ${\vec{g}}_{TV}^{}$ can be defined as follows:

\begin{matrix} \frac{\partial {‖ \vec{f} ‖}_{T V}}{\partial f_{s, t}} \approx \frac{(f_{s, t} - f_{s - 1, t}) + (f_{s, t} - f_{s, t - 1})}{\sqrt{ε + {(f_{s, t} - f_{s - 1, t})}^{2} + {(f_{s, t} - f_{s, t - 1})}^{2}}} \\ - \frac{(f_{s + 1, t} - f_{s, t})}{\sqrt{ε + {(f_{s + 1, t} - f_{s, t})}^{2} + {(f_{s + 1, t} - f_{s + 1, t - 1})}^{2}}} \\ - \frac{(f_{s, t + 1} - f_{s, t})}{\sqrt{ε + {(f_{s, t + 1} - f_{s, t})}^{2} + {(f_{s, t + 1} - f_{s - 1, t + 1})}^{2}}} . \end{matrix}

(12)

${\overset{⌢}{g}}_{DTV}^{}$ is the normalized DTV gradient and ${\vec{g}}_{DTV}^{}$ is the DTV gradient; they are related by ${\overset{⌢}{g}}_{DTV}^{} = {\vec{g}}_{DTV}^{} / {‖ {\vec{g}}_{DTV}^{} ‖}_{2}$ . The individual elements of ${\vec{g}}_{DTV}^{}$ can be defined as follows:

\begin{array}{l} \frac{\partial {‖ \vec{f} ‖}_{DTV}}{\partial f_{s, t}} \approx \frac{(f_{s, t} - f_{s - 1, t - 1}) + (f_{s, t} - f_{s - 1, t + 1})}{\sqrt{ε + {(f_{s, t} - f_{s - 1, t - 1})}^{2} + {(f_{s, t} - f_{s - 1, t + 1})}^{2}}} \\ - \frac{(f_{s + 1, t - 1} - f_{s, t})}{\sqrt{ε + {(f_{s + 1, t - 1} - f_{s, t})}^{2} + {(f_{s + 1, t - 1} - f_{s, t - 2})}^{2}}} \\ - \frac{(f_{s + 1, t + 1} - f_{s, t})}{\sqrt{ε + {(f_{s + 1, t + 1} - f_{s, t})}^{2} + {(f_{s + 1, t + 1} - f_{s, t + 2})}^{2}}} \end{array}

(13)

where ε is a small positive constant that prevents division by zero. In our study, we selected ε = 10^-8.
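The gradients in Eqs. (12) and (13) can be sketched by accumulating, for every pixel, the derivative of its smoothed gradient magnitude with respect to the pixel itself and its two neighbors. For interior pixels this accumulation reproduces the three terms of Eq. (12) with the TV offsets and of Eq. (13) with the DTV offsets. The names `smoothed_norm_and_grad`, `normalize`, `TV_OFFSETS`, and `DTV_OFFSETS`, and the rule of skipping pixels that lack a neighbor, are assumptions of this example.

```python
import math

EPS = 1e-8  # smoothing constant ε quoted in the text

TV_OFFSETS = ((-1, 0), (0, -1))     # neighbours used by D_s and D_t
DTV_OFFSETS = ((-1, -1), (-1, 1))   # neighbours used by D_ds and D_dt

def smoothed_norm_and_grad(f, offsets):
    """Return the ε-smoothed (D)TV of f[s][t] and its gradient image."""
    rows, cols = len(f), len(f[0])
    grad = [[0.0] * cols for _ in range(rows)]
    total = 0.0
    (a_s, a_t), (b_s, b_t) = offsets
    for s in range(rows):
        for t in range(cols):
            sa, ta, sb, tb = s + a_s, t + a_t, s + b_s, t + b_t
            inside = (0 <= sa < rows and 0 <= ta < cols
                      and 0 <= sb < rows and 0 <= tb < cols)
            if not inside:
                continue  # assumption: skip pixels lacking a neighbour
            da = f[s][t] - f[sa][ta]
            db = f[s][t] - f[sb][tb]
            mag = math.sqrt(EPS + da * da + db * db)
            total += mag
            grad[s][t] += (da + db) / mag   # first term of Eq. (12)/(13)
            grad[sa][ta] -= da / mag        # term seen from neighbour a
            grad[sb][tb] -= db / mag        # term seen from neighbour b
    return total, grad

def normalize(grad):
    """Divide the gradient image by its l2 norm (unchanged if the norm is 0)."""
    norm = math.sqrt(sum(v * v for row in grad for v in row))
    if norm == 0.0:
        return grad
    return [[v / norm for v in row] for row in grad]

image = [[math.sin(1.3 * s + 0.7 * t * t) for t in range(5)] for s in range(5)]
_, g_tv = smoothed_norm_and_grad(image, TV_OFFSETS)
g_hat = normalize(g_tv)
```

Passing `DTV_OFFSETS` instead of `TV_OFFSETS` gives the DTV gradient, so a single routine can serve both phases of the hybrid method.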

We now describe the iterative steps of the proposed algorithm. Each iteration contains two loops: the outer loop performs the Algebraic Reconstruction Technique (ART) [4] update, and the inner loop performs the TV gradient descent. DTV is substituted for TV once the convergence of the reconstruction algorithm has stabilized, that is, after the $N_{TV}$th iteration. The procedure is listed in Table 1, where $m = 1, 2, ..., M$ denotes the projection index, ${\vec{A}}_{m}$ is the $m$th row vector of the projection matrix $A =({\vec{A}}_{1}, {\vec{A}}_{2}, ..., {\vec{A}}_{m}, ..., {\vec{A}}_{M})$ , $\vec{p} =(p_{1}, p_{2}, ..., p_{m}, ..., p_{M})$ is the projection data, and $λ$ is the convergence parameter of the ART method. The inner loop is indexed by $k$, and $K$ is the number of descent iterations for the TV and DTV minimizations.

Table 1. Implementation steps of the TV+DTV reconstruction

Algorithm TV+DTV
%Initialization
Given $M$, ${\vec{A}}_{m}$ $(m = 1, 2, ..., M)$, $p_{m}$ $(m = 1, 2, ..., M)$, $λ$, $α$, $β$, $σ$, $K$, $N_{TV}$, and $N_{iter}$
${\vec{f}}^{0} = 0$
%Main iteration loop
for $n = 1, 2, ..., N_{iter}$ do
	%ART updating
	${\vec{f}}^{n, 0} = {\vec{f}}^{n - 1}$
	for $m = 1, 2, ..., M$ do
		${\vec{f}}^{n, m} = {\vec{f}}^{n, m - 1} + λ \frac{{\vec{A}}_{m} (p_{m} - {\vec{A}}_{m} \cdot {\vec{f}}^{n, m - 1})}{{\vec{A}}_{m} \cdot {\vec{A}}_{m}}$
	end
	%Positivity constraint
	${(f_{s, t})}^{n} = \begin{cases} {(f_{s, t})}^{n, M}, & {(f_{s, t})}^{n, M} \geq 0 \\ 0, & {(f_{s, t})}^{n, M} < 0 \end{cases}$
	%TV+DTV
	$d (n) = {‖ {\vec{f}}^{n - 1} - {\vec{f}}^{n} ‖}_{2}$
	for $k = 1, 2, ..., K$ do
		if $n \leq N_{TV}$ then
			%TV gradient, Eq. (12)
			${\vec{g}}_{s, t}^{n, k - 1} = {\vec{g}}_{TV (s, t)}^{n, k - 1} = \partial {‖ {\vec{f}}^{n, M + k - 1} ‖}_{TV} / \partial f_{s, t}$
		else
			%DTV gradient, Eq. (13)
			${\vec{g}}_{s, t}^{n, k - 1} = {\vec{g}}_{DTV (s, t)}^{n, k - 1} = \partial {‖ {\vec{f}}^{n, M + k - 1} ‖}_{DTV} / \partial f_{s, t}$
		end
		${\overset{⌢}{g}}^{n, k - 1} = {\vec{g}}^{n, k - 1} / {‖ {\vec{g}}^{n, k - 1} ‖}_{2}$
		%Descent step; per Eqs. (10) and (11), β takes the place of α when $n > N_{TV}$
		${\vec{f}}^{n, M + k} = {\vec{f}}^{n, M + k - 1} - α d (n) {\overset{⌢}{g}}^{n, k - 1}$
	end
	%Image updating
	${\vec{f}}^{n} = {\vec{f}}^{n, M + K}$
	%Exit criterion
	if ${‖ A {\vec{f}}^{n} - \vec{p} ‖}_{2}^{2} < σ^{2}$ then
		exit
	end
end
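The ART update and the positivity constraint of Table 1 can be sketched on a toy linear system. The function name `art_sweep` and the 3×3 system below are illustrative assumptions; a real CT system matrix is far larger and sparse, and the TV or DTV descent steps would follow each sweep.

```python
def art_sweep(A, p, f, lam=1.0):
    """One ART pass over all rows of A, followed by the positivity constraint.

    A is a list of rows, p the measured projections, f the current image
    (flattened to a list), and lam the ART parameter λ.
    """
    f = list(f)
    for row, p_m in zip(A, p):
        denom = sum(a * a for a in row)
        if denom == 0.0:
            continue  # a ray that touches no pixel carries no information
        residual = p_m - sum(a * x for a, x in zip(row, f))
        f = [x + lam * residual * a / denom for x, a in zip(f, row)]
    return [max(x, 0.0) for x in f]

# Toy consistent system (assumed for illustration only).
A = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0]]
true_f = [1.0, 2.0, 3.0]
p = [sum(a * x for a, x in zip(row, true_f)) for row in A]

f = [0.0, 0.0, 0.0]
for _ in range(100):
    f = art_sweep(A, p, f, lam=1.0)
print(f)
```

With a square, well-conditioned system the sweeps alone recover the image; the TV and DTV steps matter when, as in sparse-view CT, the system is underdetermined.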

3. Numerical and experimental studies

In this section, we present our numerical and experimental studies. There were two sets of experiments: the FORBILD head phantom [21] was used in the numerical study, and real projection data were used to reconstruct the aspirin images in the experimental study. The image quality was assessed with the root mean square error (RMSE) and structural similarity (SSIM) [22].

The RMSE is defined as

RMSE = \sqrt{\frac{1}{M \times N} \sum_{0 \leq s < N} \sum_{0 \leq t < M} {(f_{s, t} - f^{*}_{s, t})}^{2}}

(14)

where $f_{s, t}$ is the pixel value of the original image at position (s,t), $f^{*}_{s, t}$ is the pixel value of the reconstructed image at position (s,t), and $N \times M$ is the image size in pixels.
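A direct transcription of Eq. (14) is sketched below for images stored as lists of rows; the function name `rmse` and the 2×2 test images are assumptions of this example.

```python
import math

def rmse(ref, rec):
    """Root mean square error between two equally sized images, Eq. (14)."""
    n_pixels = len(ref) * len(ref[0])
    squared = sum((a - b) ** 2
                  for row_ref, row_rec in zip(ref, rec)
                  for a, b in zip(row_ref, row_rec))
    return math.sqrt(squared / n_pixels)

ref = [[0.0, 1.0], [1.0, 0.0]]
rec = [[0.0, 1.0], [1.0, 0.2]]
print(rmse(ref, rec))  # one pixel off by 0.2 out of four pixels
```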

SSIM is defined as

SSIM = l {(\vec{f}, {\vec{f}}^{*})}^{δ} \cdot c {(\vec{f}, {\vec{f}}^{*})}^{γ} \cdot v {(\vec{f}, {\vec{f}}^{*})}^{η}

(15)

where

l (\vec{f}, {\vec{f}}^{*}) = (2 \bar{\vec{f}} \times {\bar{\vec{f}}}^{*} + c_{1}) / ({(\bar{\vec{f}})}^{2} + {({\bar{\vec{f}}}^{*})}^{2} + c_{1})

(16)

c (\vec{f}, {\vec{f}}^{*}) = (2 σ_{\vec{f}}^{} σ_{{\vec{f}}^{*}}^{} + c_{2}) / (σ_{\vec{f}}^{2} + σ_{{\vec{f}}^{*}}^{2} + c_{2})

(17)

v (\vec{f}, {\vec{f}}^{*}) = (σ_{\vec{f} {\vec{f}}^{*}} + c_{3}) / (σ_{\vec{f}}^{} σ_{{\vec{f}}^{*}}^{} + c_{3})

(18)

where $\bar{\vec{f}}$ and ${\bar{\vec{f}}}^{*}$ are the means of $\vec{f}$ and ${\vec{f}}^{*}$ , respectively, $σ_{\vec{f}}^{}$ and $σ_{{\vec{f}}^{*}}^{}$ are the standard deviations of $\vec{f}$ and ${\vec{f}}^{*}$ , respectively, $σ_{\vec{f} {\vec{f}}^{*}}$ is the covariance of $\vec{f}$ and ${\vec{f}}^{*}$ , and $c_{1}$ , $c_{2}$ , and $c_{3}$ are three small positive constants used to avoid instability when a denominator in Eqs. (16)-(18) is very close to zero. $δ$ , $γ$ , and $η$ adjust the weights of the luminance $l (\vec{f}, {\vec{f}}^{*})$ , contrast $c (\vec{f}, {\vec{f}}^{*})$ , and structure $v (\vec{f}, {\vec{f}}^{*})$ terms. In our study, we selected $δ = γ = η = 1$ , $c_{1} = 2 \times 10^{-8}$ , $c_{2} = 1 \times 10^{-8}$ , and $c_{3} = 0.5 \times c_{2} = 0.5 \times 10^{-8}$ . The SSIM value lies between -1 and 1 and equals 1 when the two images are identical.
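Equations (15)-(18) with $δ = γ = η = 1$ can be sketched as follows. This example assumes that the statistics are taken over the whole image with a 1/n (population) normalization; SSIM is often computed over local windows and averaged instead, and the text does not say which variant was used. The function name `ssim_global` is likewise an assumption.

```python
import math

def ssim_global(f, g, c1=2e-8, c2=1e-8):
    """SSIM of Eqs. (15)-(18) from whole-image statistics, with c3 = c2 / 2."""
    c3 = 0.5 * c2
    x = [v for row in f for v in row]
    y = [v for row in g for v in row]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x, std_y = math.sqrt(var_x), math.sqrt(var_y)
    luminance = (2 * mean_x * mean_y + c1) / (mean_x ** 2 + mean_y ** 2 + c1)  # Eq. (16)
    contrast = (2 * std_x * std_y + c2) / (var_x + var_y + c2)                 # Eq. (17)
    structure = (cov + c3) / (std_x * std_y + c3)                              # Eq. (18)
    return luminance * contrast * structure

image = [[0.0, 1.0], [1.0, 0.0]]
inverted = [[1.0, 0.0], [0.0, 1.0]]
print(ssim_global(image, image), ssim_global(image, inverted))
```

Identical images give a value of 1, while the inverted image, whose covariance with the original is negative, gives a value close to -1, matching the range stated above.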

The selection of the optimal parameters is very important. The algorithm in Table 1 has three tunable parameters: λ, α, and β. Previous studies reported a range of [0, 2] for λ, and the steepest-descent step size (α or β) and the number of descent iterations (K) should be set so that α*K (or β*K) is of the order of, but larger than, one [7]. In the cases considered here, K was 20, and the ranges of α and β were both [0.05, 1]. The optimal parameters were selected from these ranges one at a time, using the FORBILD head phantom with 30 projections as an example. First, we varied λ, reconstructed images with the ART method, and calculated the reconstruction error. Figure 2 shows that λ=1 was optimal, giving the lowest RMSE and the SSIM closest to 1. With λ fixed, we varied α, reconstructed images with the TV method, and calculated the reconstruction error. Figure 3 shows that α=0.55 was optimal. With λ and α fixed, we varied β, reconstructed images with the TV+DTV method, and calculated the reconstruction error. Figure 4 shows that β=0.28 was optimal. We therefore used λ=1, α=0.55, and β=0.28 in the numerical simulation with the FORBILD head phantom. A similar search was conducted for the real data, and the optimal values of these parameters are listed in Table 2.

Table 2. Optimum parameter selections for each dataset

Data	λ	α	β
FORBILD head phantom	1	0.55	0.28
Real data	0.5	0.15	0.1

Fig. 2.

RMSE and SSIM of the analyses to find the optimum regularization parameter λ.

Fig. 3.

RMSE and SSIM of the analyses to find the optimum regularization parameter α.

Fig. 4.

RMSE and SSIM of the analyses to find the optimum regularization parameter β.

3.1 Numerical Simulation Study

In the numerical simulation, a FORBILD head phantom (256×256 pixels, 8 bits), shown in Figure 5(a), was used to analyze the performance of the proposed algorithm. The scanning range of the CT system was from 0 to 2π with an angular increment θ. The number of projections was set to 30, so that θ = π/15, and the ith scanning angle was ψ = θ*i (1≤i≤30). The number of iterations was set to 1000; for the proposed algorithm (TV+DTV), this comprised 600 TV iterations followed by 400 DTV iterations. Figures 5(b) and 5(c) show the images reconstructed by the TV and TV+DTV methods, respectively.

Fig. 5.

Reconstructed FORBILD head phantom for comparison.

Although the images reconstructed by the TV and TV+DTV methods do not differ visibly in Figure 5, the positions marked by arrows in Figure 6 show that the profile of the image reconstructed by the TV+DTV method was more stable than that of the TV method. Furthermore, magnified parts of the reconstructed images are shown in Figure 7; Figure 7(c) has fewer artifacts, clearer edges, and a more uniform distribution than Figure 7(b), which was reconstructed by the TV method.

Fig. 6.

Profiles along line 180 of the FORBILD head phantom reconstructed by the different methods.

Fig. 7.

Magnified part of the reconstructed FORBILD head phantom for comparison.

In Figure 8, the RMSE and SSIM are plotted against the iteration number. Both criteria showed a similar trend: they improved after the TV-constraint was replaced by the DTV-constraint at iteration 600. Table 3 lists the RMSE and SSIM calculated from the FORBILD head phantom reconstructed with the TV and TV+DTV methods. With the TV+DTV method, the RMSE was lower (0.0143 versus 0.0159) and the SSIM was slightly higher (0.9989 versus 0.9987) than with the TV method. Both metrics indicate that the TV+DTV method reconstructs images of higher quality.

Table 3. RMSE and SSIM of the reconstructed images

Metric	FORBILD head phantom, TV	FORBILD head phantom, TV+DTV	Real aspirin data, TV	Real aspirin data, TV+DTV
RMSE	0.0159	0.0143	0.1586	0.1359
SSIM	0.9987	0.9989	0.9745	0.9816

Fig. 8.

RMSE and SSIM curves of the reconstructed FORBILD head phantom versus iteration number (from 100 to 1000) for the TV and TV+DTV methods.

3.2 Real Projection Data Study

We applied the proposed algorithm to a set of real projection data acquired by scanning aspirin. The source-to-object distance was 48.1551 cm, and the object-to-detector distance was 100.4449 cm. The detector pixel size was 0.0098 cm, and the number of detector elements was 1472. X-ray CT geometrical calibration via locally linear embedding [23] was used to calibrate the geometry; the calibrated rotation center and detector offsets were -0.3318 cm and -0.5364 cm, respectively. The full scan comprised 360 projection angles equally spaced over the angular range [0, 2π]. To demonstrate the performance of our proposed approach, we reduced the number of views to 90, which corresponds to an angular increment of θ = π/45. The image reconstructed from the full set of views, shown in Figure 9(a), was taken as the reference image. Figures 9(b) and 9(c) show the images reconstructed from 90 projection views by the TV and TV+DTV methods, respectively. The TV-constraint was replaced by the DTV-constraint after 40 iterations, and the RMSE and SSIM curves versus iteration number are shown in Figure 10. The RMSE and SSIM of the reconstructed images are listed in Table 3: relative to the TV method, the TV+DTV method lowered the RMSE from 0.1586 to 0.1359 and raised the SSIM from 0.9745 to 0.9816.
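The view reduction described above can be sketched as a simple index selection. This example assumes that the 90 views were obtained by keeping every fourth of the 360 acquired views, starting from the first; the text states only the resulting angular increment, so the variable names and the selection rule are illustrative.

```python
import math

full_views = 360   # acquired projection angles over [0, 2π]
kept_views = 90    # sparse-view subset used for reconstruction

stride = full_views // kept_views                  # keep every 4th view (assumed)
kept_indices = list(range(0, full_views, stride))
theta = 2 * math.pi / kept_views                   # angular increment of the subset
angles = [2 * math.pi * i / full_views for i in kept_indices]

print(len(kept_indices), theta)
```

The increment between consecutive retained angles equals π/45, in agreement with the value quoted in the text.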

Fig. 9.

Reconstructed aspirin images for comparison.

Fig. 10.

RMSE and SSIM curves of the images reconstructed from the aspirin projection data versus iteration number for the TV and TV+DTV methods.

4. Discussions and Conclusion

Several issues are worth further discussion. Although the proposed hybrid method can reconstruct high-quality images from sparse-view data, the DTV-constraint has no obvious advantage over the TV-constraint when the number of iterations is small, owing to the gradient operators used for the sparse representation. Therefore, it is vital to find the appropriate iteration at which to replace the TV-constraint with the DTV-constraint in order to accelerate the iterative convergence. In our reconstructions, the switch was made after 600 iterations in the numerical study and after 40 iterations in the experimental study, according to the characteristics of the low- and high-frequency components in the reconstructed image.

Another issue regarding the feasibility of the reconstruction algorithm is whether the run-time is acceptable. The run-time depends on the computational environment. Here we used MATLAB R2014b on a computer with an Intel(R) Core(TM) i5-4590 CPU @ 3.30 GHz, 8.00 GB of RAM, and a 64-bit OS. The TV+DTV algorithm took 179 s for the FORBILD head phantom (1000 iterations in total) and 285 s for the real aspirin projection data (90 iterations in total). The run-time was acceptable but could be further reduced by methods such as parallel computing and CUDA acceleration.

For our proposed algorithm, the selection of the weights of the TV- and DTV-constraints was also an important issue in the reconstruction. If the weights are too small, the algorithm cannot reduce the artifacts and noise in the reconstructed image. If they are too large, the TV- and DTV-constraints over-smooth the CT images. Thus, the weight selection for the TV- and DTV-constraints depends on the levels of artifacts and noise. The parameters used in this paper, chosen according to the experimental analysis, are shown in Table 2.

In conclusion, we proposed a hybrid reconstruction approach that combines the TV- and DTV-constraints: the DTV-constraint is used to sparsely represent the image once the iterative convergence of the TV-constrained reconstruction algorithm no longer varies. The numerical and experimental studies demonstrated that the proposed hybrid method can reconstruct high-quality images from sparse-view data, and that the RMSE and SSIM improved when the TV-constraint was replaced by the DTV-constraint after a set number of iterations. Further research will be performed to explore the directional TV optimization problem in CT reconstruction.

References

[1] E.J. Hall, D.J. Brenner, Cancer risks from diagnostic radiology. British Journal of Radiology 81(965), 362-378 (2008). doi: 10.1259/bjr/01948454