Static coded aperture in robotic X-ray tomography systems

Tianyi Mao; Tianyi Mao; Xu Ma; Angela P. Cuadros; XiuBin Dai; Zhiteng Wang; Xin Zhang; Shujin Zhu; Jianjian Zhu; Gonzalo R. Arce; Gonzalo R. Arce

doi:10.1364/OE.449505

1. Introduction

Computed tomography (CT) is a computational imaging technique used to reconstruct high-resolution three-dimensional (3D) volumetric images. It has been extensively used in medical/biological imaging, security inspection, non-destructive testing among others [1–3]. CT technology has continuously evolved since its introduction several decades ago. Most common CT systems, in either cone-beam CT or spiral CT architectures, utilize a circular scanning trajectory of the source-detector gantry around the subject under inspection. More recently, robotic CT imaging systems have been introduced as a means to improve targeting accuracy and to reduce radiation exposure to both patient and physician [4,5]. Unlike conventional CT systems, robotic CT systems can adjust their trajectories to maximize dose reduction and image reconstruction accuracy. A twin robotic X-ray system for projection and measurement is, for instance, shown in Fig. 1, where the source and detector are moved on a coordinated trajectory to ensure comfortable scan conditions and lower radiation dose. Given the set of projections and the geometry of the scanning trajectory, algebraic reconstruction techniques (ART) are used to reconstruct the volumetric images with limited-view projections [6].

Fig. 1. The schematic diagram of the twin robotic X-ray system with a static coded aperture. The source-detector pair scans the object with multi degree-of-freedom robotic arms along the pre-programmed trajectory. The plane of the source is designed parallel to that of the detector and thus the coded aperture has one-to-one correspondence to the pixels of the 2D detector.

Download Full Size | PDF

In order to further reduce the radiation dose, this work explores the use of coded illumination in robotic CT imaging systems. The use of binary “on-off” coded apertures placed in front of the source, blocking a significant portion of the X-rays transmitted to the subject, has been studied in conventional CT [7–10]. It has been shown that structured illumination in CT imaging reduces the radiation dose yet maintains the reconstruction image quality. Inspired by the recent work in [11,12], our work is based on coded illumination where only a single coded aperture is used in the robotic CT imaging system. The implementation of the proposed system with static coded projections can be obtained by following similar strategies as well as other feasible implementation of structured illumination [13,14]. The coded aperture is fixed and is static with respect to the source as depicted in Fig. 1. We thus explore the concept of coded illumination in robotic CT and explore novel scanning trajectories in robotic CT that can provide significant gains in reconstruction image quality. As in [11,12], our work focuses on the use of a single coded aperture in the robotic CT imaging gantry as depicted in Fig. 1. Our goal is then to design the scanning trajectory of the source-detector pair in a subject independent manner. The approach is to constrain the degrees of freedom in the gantry’s motion to only allow rotational scans with varying angles with respect to a fixed coordinate system. The plane of the source is designed parallel to that of the detector for maximum use of the X-ray projections, and the robotic source-detector pair scans the object with a non-zero pitch angle around a fixed axis. The irregular scanning trajectory driven by the robotic system significantly improves the sampling efficiency in a tilted orbit which is not possible with standard circular orbits in X-ray tomosynthesis.

The irregular projection structure and trajectories used in the proposed coded aperture robotic tomography system (CARTS) leads to complex structures for the sensing matrices at different view angles and positions. We thus introduce an efficient forward model to attain a diversity of view angles through rotational sensing by fixing the coded aperture to remain static. The two structures are equivalent and could be converted based on relative motion. As depicted in Fig. 2, the position and orientation of the X-ray source, the flat detector and the coded aperture are all fixed and the scans are performed by rotating the object. The object is placed on a plane rotating around the $r-$axis which has a pitch angle $\theta$ with the normal vector of the $x_1 - z_1$ plane. Again, this model is used to simplify the analysis of the robotic scanning system depicted in Fig. 1. Note that the proposed method is equivalent to the general cone-beam X-ray CT when $\theta = 0^{\circ }$ and a single coded aperture is used. The imaging performance in this $\theta =0^{\circ }$ case is poor as documented by Kaganovsky et. al. The proposed geometry has two advantages compared with the conventional one. First, the effect of source and coded aperture when $\theta >0^{\circ }$ at different view angles is replaced by the rotation of the object. Furthermore, the generation of multiple structure matrices at different view angles is replaced by that of the rotation matrix, leading to more efficient calculation. Second, the object uniformly rotates around the axis which is misaligned with the coordinate axes leading to more efficient sensing of the inner structure. In the forward model, the rotation matrix is numerically computed with two basic sub-matrices and the compressive reconstruction is performed at a sub-sampling rate. Due to the computational complexity, the existing coded aperture optimization methods cannot be used for large structured matrices in X-ray CT. This paper develops a more efficient algorithm based on the Pearson product-moment correlation coefficient (PPMCC) analysis, which can efficiently optimize the coded aperture associated with the high-dimensional sensing matrix that models the system in Fig. 1.

Fig. 2. Equivalent sensing mechanism of the CARTS system shown in Fig. 1. The 3D object illuminated by a stationary cone-beam X-ray source rotates around the $r-$axis with pitch angle $\theta$. The $p$th coded projection is measured by a stationary 2D detector, $p = 1, 2, \ldots, P$. A single block-unblock coded aperture is placed in front of the X-ray source and part of the X-ray radiation is blocked by the blocking elements.

Download Full Size | PDF

This paper is organized as follows. Section 2 introduces the forward model in matrix notation. The methods to generate the rotation matrix and optimize the coded aperture are presented in Section 3 and Section 4, respectively. The results using 3D simulated datasets and simulated projections of real datasets at different sampling rates are provided in Section 5. Conclusions and future work are presented in Section 6.

2. Coded aperture robotic tomography system

2.1 Forward model

As depicted in Fig. 2, a binary coded aperture is placed in front of a stationary X-ray source with a fixed orientation. The elements on the coded aperture have one-to-one correspondence to the pixels on a flat 2D detector. The 3D object rotates around the $r-$axis with pitch angle $\theta$ and part of the 3D object is illuminated by the coded X-ray patterns. The $p$th coded X-ray projection is measured by the 2D detector where $p = 1,2,\ldots,P$, and $P$ is the number of projections. Measurements corresponding to the blocking elements on the coded aperture are discarded. Then the linear attenuation map of the 3D object is reconstructed using the set of tomographic measurements and the projection geometries.

According to the Beer-Lambert law, the transmission model of a monochromatic X-ray path $l$ to a particular pixel on the detector is given by [15]

(1)$$I^{l}= I_0^{l} \exp\{-\int_{0}^{ \infty }{\alpha(l)dl}\},$$

where $I^{l}$ is the measurement on the detector, $I_0^{l}$ is the intensity of the monochromatic X-ray generated by an X-ray source, and $\alpha$ is the linear attenuation coefficient varying in the $l$ path. Suppose $\vec {s}$ and $\vec {\theta }$ represent the location of the X-ray source and the direction of the projection, respectively, the logarithmic measurement of the detector can be re-written as

(2)$$y(\vec{s},\vec{\theta}) = \ln{\frac{I_0^{l}}{I^{l}}}=\int_{0}^{ \infty }f(\vec{s}+l\vec{\theta})dl,$$

where $f$ represent the linear attenuation coefficient map of the 3D object.

Due to the pixelated nature of the 2D detector, the 3D object is gridded into $N = N_x \times N_y \times N_z$ voxels where $N_x$, $N_y$ and $N_z$ are the dimensions along the $x_1$, $y_1$ and $z_1$ axes in Fig. 2(b), respectively. The 2D detector is gridded into $M = M_x \times M_y$ pixels where $M_x$ and $M_y$ are the dimensions along the $x_1$ and $y_1$ axes, respectively. Then, the discrete logarithmic measurements in matrix notation can be formulated as

(3)$$\textbf{y} = \textbf{Wf},$$

where $\textbf {y}\in \mathbb {R}^{M}$ denotes the vectorized representation of the logarithmic measurement $y(x_0,y_0)$ where $x_0$ and $y_0$ are the spatial coordinates of the 2D detector, respectively. $\textbf {f} \in \mathbb {R}^{N}$ denotes the vectorized representation of the 3D object $f(x_1,y_1,z_1)$ where $x_1$, $y_1$, $z_1$ represent its spatial coordinates. $\textbf {W} \in \mathbb {R}^{M \times N}$ denotes the structure matrix which represents the X-ray path characteristics of the ray projections from the source at a particular view angle to the corresponding detector. The $(m,n)$th element in the structure matrix $\textbf {W}$ represents the volume portion of the $n$th voxel that is irradiated by the X-ray source associated with the $m$th element on the 2D detector.

The binary on-off coded aperture $T(x_0,y_0)$, $x_0 = 1, 2, \ldots, M_x$, $y_0 = 1, 2, \ldots, M_y$, is placed in front of the cone-beam X-ray source, and the coded aperture pitch is fixed to have one-to-one correspondence to the pixels of the 2D detector. The equivalent transmission function of the coded aperture can then be formulated as

(4)$$\widetilde{\textbf{T}} = \textbf{T} \odot \boldsymbol{\Lambda}$$

where $\odot$ represents the element-by-element multiplication, $\textbf {T}$ and $\boldsymbol {\Lambda }$ are the transmission function corresponding to the coded aperture and source orientation, respectively. Thus, the vectorized representation of the coded logarithmic measurements is given by

(5)$$\textbf{y} = \textbf{CWf},$$

where the coded aperture matrix $\textbf {C} = \text {diag}(\text {vec} (\widetilde {\textbf {T}}))$ is a diagonal matrix with the diagonal vector equal to the vectorized representation of the equivalent transmission function, $\widetilde {\textbf {T}}$.

Furthermore, sampling in X-ray CT can be considered as cutting the object using X-rays and thus more irregular voxels can lead to higher sampling efficiency in the absolute coordinate (3D free space). In our previous work, the number of intersections of X-rays in the object is theoretically demonstrated to determine the amount of information acquisition [16,17]. Limited by the projective geometry, the intersections of projections in X-ray CT are finite. However, misalignment between the voxels and the coordinates during the rotation can lead to more intersections and thus more details can be resolved. As depicted in Fig. 2, the 3D object rotates around the fixed $r-$axis with the pitch-roll angle representation $\theta$ and $\omega$, where $\theta$ is the angle of $r-$axis direction and $\omega$ is the magnitude of the rotation, respectively. Therefore, the coded measurements in the proposed CARTS system with the $r-$axis and the angle representation $\omega$ can be written as

(6)$$\textbf{y}(\textbf{r},\omega) = \textbf{C}(\textbf{r},\omega)\textbf{W}\textbf{R}(\textbf{r},\omega)\textbf{f},$$

where $\textbf {R}(\textbf {r},\omega ) \in \mathbb {R}^{N\times N}$ is the rotation matrix. The $(m,n)$th element in the rotation matrix $\textbf {R}$ represents the volume portion of the $n$th voxel of the 3D object $\textbf {f}$ that is rotated associated with the $m$th element of the 3D free space $\Omega (\mu,\nu,\kappa )$ where $\mu$, $\nu$ and $\kappa$ represent its spatial coordinates.

An example of the sensing matrix with pitch angle $\theta = 0$ and $\theta = 45^{\circ }$ for $N_x = N_y = N_z = 8$ and $M_x = M_y = 32$, is depicted in Fig. 3(a) and (b), respectively. The number of unblocking elements on the coded aperture $D$ is 256 and thus the transmittance is 25%. It is seen that the sensing matrix with pitch angle $\theta = 0$ is highly structured and the regions marked in blue are not sensed. The distribution of elements in the sensing matrix with pitch angle $\theta = 45^{\circ }$ is more uniform and the corresponding regions marked in red are sensed. The singular value decomposition (SVD) analysis which has been reportedly used as a tool to compare the performance of coded apertures is depicted in Fig. 3(c). The measurement strategy with larger non-zero singular value components is considered to capture more orthogonal components of the object and thus, it leads to less ill-conditioned reconstruction. The condition numbers of sensing matrix with $\theta = 0$ and $\theta = 45^{\circ }$ are $1.03\times 10^{4}$ and $4.04\times 10^{3}$, respectively. It is seen that the misalignment implementation captures more orthogonal components.

Fig. 3. (a) Example of a sensing matrix for X-ray CT using rotational coding with pitch angle $\theta = 0$, $N_x = N_y = N_z = 8$ and $M_x = M_y = 32$. The coded aperture has 25% transmittance. (b) Example of a sensing matrix for X-ray CT using rotational coding with pitch angle $\theta = 45^{\circ }$, $N_x = N_y = N_z = 8$ and $M_x = M_y = 32$. The coded aperture has 25% transmittance. (c) The plot of the singular values as a function of component numbers. The highest and lowest singular values are highlighted in each case.

Download Full Size | PDF

It is difficult to reconstruct the linear attenuation coefficient map of the volumetric object from a single measurement. Multiple projections are measured at a series of $P$ rotations, and thus, the vectorized representation of the coded logarithmic measurements is given by

(7)$$\left [ \begin{matrix} \textbf{y}_1 \\ \textbf{y}_2 \\ \cdots \\ \textbf{y}_P \end{matrix} \right ] = \left[ \begin{matrix} \text{diag}(\textbf{c}_1) \textbf{W} \\ \text{diag}(\textbf{c}_2) \textbf{W} \\ \cdots \\ \text{diag}(\textbf{c}_P) \textbf{W} \end{matrix} \right ] \left[ \begin{matrix} \textbf{R}_1(\textbf{r},\omega_1) \\ \textbf{R}_2(\textbf{r},\omega_2) \\ \cdots \\ \textbf{R}_P(\textbf{r},\omega_P) \end{matrix} \right ] \textbf{f},$$

where $\textbf {c}_i = \text {vec}(\widetilde {\textbf {T}}_i)$ is the vectorized representation of the equivalent transmission function in the $i$th projection, $i = 1, 2, \ldots, P$ and $\textbf {R}_i(\textbf {r},\omega _i)$ is the $i$th rotation matrix with the $i$th angle representation $(\theta,\omega _i)$, respectively. It is noted that the coded aperture and the source orientation are fixed in the proposed CARTS system, that is, $\textbf {c}_i = \textbf {c}_0$, $i = 1, 2, \ldots, P$.

For simplicity, Eq. (7) can be re-written in matrix notation as

(8)$$\textbf{Y} = \widetilde{\textbf{C}}\widetilde{\textbf{W}}\widetilde{\textbf{R}}\textbf{f},$$

where $\textbf {Y} = \left [ \textbf {y}_1^{T}, \textbf {y}_2^{T}, \ldots, \textbf {y}_P^{T} \right ]^{T}$, $\widetilde {\textbf {W}} = \left [ \textbf {W}, \textbf {W}, \ldots, \textbf {W} \right ]^{T}$ and $\widetilde {\textbf {R}} = \left [ \textbf {R}_1, \textbf {R}_2, \ldots, \textbf {R}_P \right ]^{T}$, respectively. The coded aperture matrix $\widetilde {\textbf {C}}$ is defined as

(9)$$\widetilde{\textbf{C}} = \left [ \begin{matrix} \text{diag}(\textbf{c}_0) & \textbf{0} & \cdots & \textbf{0} \\ \textbf{0} & \text{diag}(\textbf{c}_0) & \cdots & \textbf{0} \\ \cdots & \cdots & \cdots & \cdots \\ \textbf{0} & \textbf{0} & \cdots & \text{diag}(\textbf{c}_0) \end{matrix} \right ],$$

where $\textbf {0}$ represents an $M \times M$ all zero matrix.

2.2 Rotation matrix

As shown in Eq. (8), the forward model of the proposed CARTS is determined by the coded aperture matrix $\widetilde {\textbf {C}}$, the structure matrix $\widetilde {\textbf {W}}$, and the rotation matrix $\widetilde {\textbf {R}}$. The closed-form solution of the coded aperture matrix $\widetilde {\textbf {C}}$ is provided in Eq. (9) and the structure matrix related to the geometric projections can be numerically calculated by using the open-access ASTRA Tomography Toolbox (“All Scale Tomographic Reconstruction Antwerp”) [18].

In this section, an approach to obtain the rotation matrix $\textbf {R}$ with the angle representation $(\theta,\omega )$ is proposed. Similar to the generation of the structure matrix, it is also difficult to obtain the closed-form solution of the volume portion of a rotated voxel associated with a particular element of the free space. Physically, the $(m,n)$th element in the rotation matrix $\textbf {R}$ represents the intersection volume of the $n$th voxel in the 3D object $\textbf {f}$ and the $m$th element in the space $\Omega$. Therefore, a numerical approach is proposed to estimate the intersection volume. As shown in Fig. 4, every voxel in the 3D object $f(x_1,y_1,z_1)$ is divided into $N_0\times N_0\times N_0$ smaller cubes and the coordinates of the centroids in the right-handed Cartesian coordinate system can be calculated as $x_{c}$, $y_{c}$ and $z_{c}$. Then, the number of smaller cubes for the 3D object is $NN_0^{3}$. Given a rotation axis $\boldsymbol {\mu } = (\mu _x, \mu _y, \mu _z)^{T}$ where $\mu _x^{2} + \mu _y^{2} + \mu _z^{3} = 1$ and a rotation angle $\delta$, the rotation transformation $\textbf {Q}_{\delta }(\boldsymbol {\mu },\delta )$ in the right-handed Cartesian coordinate system can be given by

(10)$$\begin{aligned} & \textbf{Q}_{\delta}(\mu,\delta) =\\ & \left[ \begin{array}{ccc} \text{cos}\delta + \mu_x^{2}(1-\text{cos}\delta) & \mu_x\mu_y(1-\text{cos}\delta) - \mu_z \text{sin}\delta & \mu_x\mu_y(1-\text{cos}\delta) + \mu_y \text{sin}\delta \\ \mu_y\mu_z(1-\text{cos}\delta) + \mu_z \text{sin}\delta & \text{cos}\delta + \mu_y^{2}(1-\text{cos}\delta) & \mu_y\mu_z(1-\text{cos}\delta) - \mu_x \text{sin}\delta \\ \mu_z\mu_x(1-\text{cos}\delta) + \mu_y \text{sin}\delta & \mu_z\mu_y(1-\text{cos}\delta) + \mu_x \text{sin}\delta & \text{cos}\delta + \mu_z^{2}(1-\text{cos}\delta) \end{array} \right ]. \end{aligned}$$

Fig. 4. (a) Free space and the 3D object before rotation; (b) Free space and the 3D object after the rotation around the $r-$axis with angle representation $(\theta,\omega )$. Note that the coordinates are in the right-handed Cartesian coordinate system.

Download Full Size | PDF

Thus, the coordinates of the centroid $x_{c}$, $y_{c}$ and $z_{c}$ with the rotation axis $\boldsymbol {\mu }$ and the rotation angle $\delta$ can be calculated as

(11)$$(x'_c, y'_c, z'_c)^{T} = \textbf{Q}_{\delta}(\boldsymbol{\mu},\delta) \cdot (x_c, y_c, z_c)^{T}.$$

All of the $N_0 \times N_0 \times N_0$ coordinates of the centroids of the smaller cubes in the $m$th cube can be calculated and thus the number of smaller cubes in the $n$th space can be counted, that is the coefficient of the $(m,n)$th element in the rotation matrix $\textbf {R}(\boldsymbol {\mu },\delta )$.

Since the projections are measured at a series of $P$ rotations, it is computationally inefficient to numerically estimate $P$ rotation matrices, $\textbf {R}(\textbf {r},\omega _i)$, $i = 1, 2, \ldots, P$. Here, the rotation represented by $\textbf {R}(\textbf {r},\omega )$ is decomposed as $\textbf {R}_{\theta }(\boldsymbol {\mu _1},\theta )\textbf {R}_{\omega }(\boldsymbol {\mu _2},\omega )$ where $\boldsymbol {\mu _1}$ and $\boldsymbol {\mu _2}$ are the normalized vectors along $z-$axis direction and $x-$ or $y-$axis direction, respectively. That is, $\boldsymbol {\mu _1} = (0,0,1)$, and $\boldsymbol {\mu _2} = (0,1,0)$ or $\boldsymbol {\mu _2} = (1,0,0)$. Besides, the $(i+1)$th rotation around the $r-$axis can be formulated by the $i$th rotation around the $r-$axis when the distribution of the angle, $\omega$, is uniform. That is, $\textbf {R}_{\omega }(\boldsymbol {\mu _2},\omega _{i+1}) = \textbf {R}_{\omega }(\boldsymbol {\mu _2},\omega _0)\textbf {R}_{\omega }(\boldsymbol {\mu _2},\omega _i)$, where $\textbf {R}_{\omega }(\boldsymbol {\mu _2},\omega _1)=\textbf {R}_{\omega }(\boldsymbol {\mu _2},\omega _0)$ is the initial rotation matrix with $\omega _0 = 2\pi / P$. Therefore, the $P$ rotation matrix can be calculated by $\textbf {R}_{\theta }(\boldsymbol {\mu _1},\theta )$ and $\textbf {R}_{\omega }(\boldsymbol {\mu _2},\omega _0)$. The numerical estimation aims at obtaining the rotation matrix is detailed in Algorithm 1. It is noted that the rotation matrix in this manuscript is pre-computed and stored with the CPU implementation. However, step 12 to step 18 in Algorithm 1 is a loop with large amount of independent computation requiring few memory. Then a GPU workstation with numerous cores can be used to significantly accelerate the algorithm and make an arbitrary acquisition scan possible.

oe-30-5-7677-i001

2.3 Coded aperture optimization

Since the sensing matrix $\widetilde {\textbf {C}}\widetilde {\textbf {W}}\widetilde {\textbf {R}}$ is highly structured, the random coded aperture is not optimal in the reconstruction of CS. Recently, coded aperture optimization methods have been developed based on the RIP condition, the mutual coherence and information theory [10,16,17]. However, these methods cannot be directly used in the optimization of the proposed CARTS system due to two reasons. First, these methods are developed for the conventional CARTS systems where the coded apertures and source orientations at different view angles are different. However, the coded aperture and source orientation are stationary at a series of $P$ rotations in the proposed model. Second, the requirement of memory to calculate the Gram matrix in [10] or correlation matrix in [16] and [17] far exceeds the limit of regular computing platforms.

It has been demonstrated that the variables in the structure matrix represent the observations of the attenuation characteristics of X-ray projections and the coded apertures characterizes reducing the measurements of the structure matrix. The purpose of coded aperture design is to obtain the maximal non-overlappling information carried by unblocking X-ray projections. Thus the coded aperture optimization problem is analogous to the feature selection problem or the dimensionality reduction problem [17]. In [16], the PPMCC matrix is used as a metric to optimize the coded aperture and the cost function is given by

(12)$$\arg \min \sum_{i \neq j} B_{i j}^{2} \text{ s.t. } C_{i i}=C_{j j}=1 \quad \forall i, j,$$

and

(13)$$B_{i j} = \frac{E[(\textbf{H}(i)-\mu _i)(\textbf{H}(j)-\mu _j)]}{ \sigma _i \sigma _j}$$

where ${B}_{ij}$ is the $(i,j)\text {th}$ coefficient of the PPMCC matrix of $\textbf {H}^{T}$, $E[\cdot ]$ is the expected value, $\mu$ is the mean, $\sigma$ is the standard deviation, and $\textbf {H}(i)$ and $\textbf {H}(j)$ are the columns of the transpose of the structure matrix $\textbf {H}^{T} = (\widetilde {\textbf {W}}\widetilde {\textbf {R}})^{T}$, respectively. The nature of the cost function is to minimize the standardized correlation of the transpose of the structure matrix. It is noted that the PPMCC of all columns of the transpose of the structure matrix are calculated and the dense large PPMCC matrix is stored. Cuadros et al. proposed a gradient descent approach based on restricted isometry property (RIP) principles to obtain state-of-art optimization quality [10]. The computation and memory complexity are $\sim \mathcal {O}(\varrho N^{4})$ and $\sim \mathcal {O}(DN^{3})$, respectively, where $\varrho$ is the number of iterations in the gradient approach and $D$ is the number of unblocking elements on the coded aperture. Mao et al. proposed a more efficient approach based on sparse principle component analysis with the minimum memory complexity $\mathcal {O}(N^{2})$ [17]. However, the memory requirements ($\sim$ 2TB for a $128\times 128\times 128$ cube with a $512\times 512$ coded aperture) in [17] far exceed our computing platform. Inspired by the connection between the mutual coherence and the RIP condition where a subset of correlation coefficients are used to estimate the entire set of correlation coefficients, a subset of columns of the transpose of the structure matrix are considered.

Since the system matrix in the proposed architecture is highly structured, the subset of columns cannot be chosen randomly. In practice, neighboring X-ray projections, with a much higher probability, illuminate a particular voxel at the same time and thus carry more overlapping information. Therefore, for a particular pixel on the coded aperture, the neighboring pixels in the diameter of 9 pixels are chosen as a subset. Then the PPMCC of the rows in the sensing matrix $\textbf {H}$ that correspond to the X-ray projections are calculated. The PPMCC of all pixels on the coded aperture with 9 neighboring pixels are calculated, and the average results are depicted in Fig. 5. Figure 5(a)-(d) illustrate the average coefficients at $\omega = 0^{\circ }$, $\omega = 90^{\circ }$, $\omega = 180^{\circ }$ and $\omega = 270^{\circ }$, respectively. It is seen that the PPMCC of the pixels with shorter distance to the neighboring pixel are much larger than that with longer distance. The PPMCC of the pixel pairs decreases much fast as the radius increases and the coefficients are almost zero at the diameter of 9 pixel. Therefore, the distance of the unblocking pixel on the coded aperture is used as the metric and the purpose of the optimization is to make the distance of any two neighboring unblocking elements as far away as possible. It is equivalent to the design where the unblocking elements are spread as homogeneously as possible and such distribution of binary pixels are blue noise coded apertures [19]. The direct binary search (DBS) method in our previous work can be directly used to efficiently obtain the blue noise coded aperture [20].

Fig. 5. Average PPMCC distribution of a particular element in the coded aperture and the neighboring elements in the radius of 9 elements at (a) $\omega = 0^{\circ }$, (b)$\omega = 90^{\circ }$, (c) $\omega = 180^{\circ }$ and (d) $\omega = 270^{\circ }$ for $M_x = 256$ and $M_y = 256$.

Download Full Size | PDF

2.4 Reconstruction

The discrete datacube for reconstruction often has a coarse resolution along the $z$ axial direction in that $N_z \ll min(N_x, N_y)$. With such resolution, the correlations between transverse slices are small and Discrete Cosine Transform (DCT) basis is used in $\ell _1$ norm to estimate the sparsity. Then, Kronecker CS reconstruction is often used in these cases [8,21]. In the cases where the resolution along the $z$ axial direction is equal to that of slices, Total Variation (TV) is often used as an implicit sparse basis (or dictionary) to minimize the sum of non-zero gradients. Then, the cost function for reconstruction is given by

(14)$$\hat{\textbf{f}}=\arg \min _{\textbf{f}}\|\textbf{Y}-\widetilde{\textbf{C}}\widetilde{\textbf{W}}\widetilde{\textbf{R}}\textbf{f}\|_{2}^{2}+\lambda\|\textbf{f}\|_{T V}$$

where $\lambda$ is the regularization constant and the TV norm $\| \textbf {f} \|_{T V}$ is defined by

(15)$$\|\textbf{f}\|_{T V}=\sum_{x_1=1}^{N_{z}} \sum_{y_1=1}^{N_{x}} \sum_{z_1=1}^{N_{y}}\left|\nabla \textbf{f}_{x_1, y_1, z_1}\right|.$$

3. Simulation results

3.1 Results of simulated datasets

In this section, we present simulations to illustrate the performance of the proposed CARTS system with an optimal coded aperture. The simulations are performed with a cone-beam X-ray source and a flat imager with $256 \times 256$ pixels. The coded aperture with $256 \times 256$ pixels, having one-to-one correspondence to the element on the detector, is placed in front of the X-ray source and the source orientation is disk-shaped in the center of the space $\Omega$. The side length of a pixel on the detector is 0.5 cm and thus the geometric length of the imaging system is 128 cm. The source-to-object and object-to-detector distances are 80 cm and 55 cm, respectively. The object rotates around the $r-$axis with pitch angle $\theta$ and the distribution of angle $\omega$ is uniform at the range of $0 \sim 2\pi$. The imaging system captures the coded X-ray projections at a series of 8 rotations and the 3D object is discretized into $64 \times 64 \times 64$ voxels of dimensions $0.5 \text {cm} \times 0.5 \text {cm} \times 0.5 \text {cm}$. That is, $M = 256^{2}$, $N = 64^{3}$ and $P = 8$. The discrete-to-discrete structure matrix is numerically obtained using the ASTRA Tomography Toolbox [18] and the discrete-to-discrete rotation matrix is calculated using the proposed algorithm in Section II-B. The blue noise coded aperture is generated utilizing the DBS algorithm proposed in [22]. The conventional CARTS where $\theta = 0^{\circ }$ is used to compared with the proposed geometry. The source-plane-to-object-plane and object-to-detector distances are 80 cm and 55 cm, respectively. The simulations are performed in a desktop architecture with an Intel Core i3 3.7GHz processor, 16G RAM, using Matlab 2015b. The object used in the simulations is generated by the 3D Shepp-Logan phantom [8]. To quantitatively evaluate the reconstructed images, the Peak Signal-to-Noise Ratio (PSNR) is used as

(16)$$\operatorname{PSNR}=20\cdot {{\log }_{10}}\left( \frac{{{\textbf{f}}_{\max }}}{\sqrt{\operatorname{MSE}}} \right),$$

where $\operatorname {MSE} = \frac {\sum \sum \sum { (\textbf {f} - \hat {\textbf {f}})^{2}}}{N_xN_yN_z}$, $\hat {\textbf {f}}$ and ${\textbf {f}}_{\max }$ denote the reconstructed cube and its maximum value, respectively. Note that the PSNR is the average of 10 different realizations in the random coded aperture cases.

The pitch angle $\theta$ is determined numerically. An optimal first-order method for strongly convex TV regularization is used to reconstruct the 3D cubes with binary random coding and blue noise coding, respectively [23]. Since the X-ray projections through the $x_1-z_1$ plane and the $y_1-z_1$ plane are different, the simulations of the two cases are both performed. The results of the PSNR with 26% transmittance at the range of $0 \sim 90^{\circ }$ are shown in Table 1. Therefore, the average PSNR using optimized coding at $\theta = 0^{\circ }$, $15^{\circ }$, $30^{\circ }$, $45^{\circ }$, $60^{\circ }$, $75^{\circ }$ and $90^{\circ }$ are 29.2dB, 32.8dB, 38.1dB, 44.7dB, 38.1dB, 32.8dB and 29.2dB, respectively. The average PSNR using random coding at $\theta = 0^{\circ }$, $15^{\circ }$, $30^{\circ }$, $45^{\circ }$, $60^{\circ }$, $75^{\circ }$ and $90^{\circ }$ are 27.6dB, 31.3dB, 38.7dB, 42.7dB, 38.7dB, 31.3dB and 27.6dB, respectively. The strategy referred to as $\theta = 45^{\circ }$ yields the best performance. Thus, the pitch angle between the $r-$axis and the normal vector is $45^{\circ }$ in the following simulations.

Table 1. PSNR as a function of angle $\theta$ using optimized coding and random coding in the cases of $x_1-z_1$ plane and $y_1-z_1$ plane

View Table

The reconstructed tomographic images using random coding and optimized coding with 26% transmittance are depicted in Fig. 6. The execution time for the optimization of coded aperture with 26% transmittance is $\sim$198 seconds. That is, the sub-sampling rate in the CS reconstruction is 52%. As described in Section II-C, the proposed optimization method is not based on the structure matrix $\textbf {W}$ which is the computational burden of the other methods, the runtime of the optimization method is reduced by four orders of magnitude compared with the gradient approach utilized in [10] and one order of magnitude compared with the method based on information acquisition utilized in [16]. Note that the number of the elements in the structure matrix of the proposed method is more than $10^{11}$ while that of the methods in [10] and [16] are $\sim 10^{9}$. Furthermore, the memory requirements in the cases for the methods in [10] and [16] are more than 1TB. The convergence curves of $64\times 64\times 64$ reconstructed images using random coding and optimized coding are depicted in Fig. 7 (a) and (b), respectively.

Fig. 6. (a1) - (a5) are the reconstructed images with 26% transmittance of the 12th layer, 22th layer, 32th layer, 42th layer and 52th layer using the conventional system. (b1) - (b5) are the reconstructed images with 26% transmittance of the corresponding layers using random coding, respectively at pitch angle $\theta = 45^{\circ }$. (c1) - (c5) are the reconstructed images with 26% transmittance of the corresponding layers using optimized coding at pitch angle $\theta = 45^{\circ }$. (d1) - (d5) are the absolute error maps of (b1) - (b5) and (e1) - (e5) are the absolute error maps of (c1) - (c5), respectively.

Download Full Size | PDF

Fig. 7. The convergence curves of $64\times 64\times 64$ reconstructed images using (a) random coding and (b) optimized coding.

Download Full Size | PDF

The reconstructed images of the 12th layer, 22th layer, 32th layer, 42th layer and 52th layer, based on the conventional CARTS are shown in Fig. 6 (a1) - (a5), respectively. The corresponding PSNR are 17.3dB, 22.3dB, 19.4dB, 14.2dB and 8.6dB, respectively. It is seen that the structures of the upper layers ($12$th and $22$th layers) are basically reconstructed and the details are drowned out by artifacts. The structures of the lower layers ($42$th and $52$th layers) are not even reconstructed. The reconstructed images of the 12th layer, 22th layer, 32th layer, 42th layer and 52th layer, using stationary random coded aperture are shown in Fig. 6 (b1) - (b5), respectively. The corresponding PSNR are 35.2dB, 48.7dB, 52.0dB, 50.6dB and 49.8dB, respectively. Compared with the conventional system, the intensity distributions of the 3D Shepp-Logan phantom varying along the slices are reconstructed without many artifacts at both upper layers and lower layers. The PSNR gain is $\sim$30.9dB. Besides, the reconstructed images of the 12th layer, 22th layer, 32th layer, 42th layer and 52th layer, using stationary optimized coded aperture are shown in Fig. 6 (c1) - (c5), respectively. The corresponding PSNR are 37.3dB, 49.4dB, 53.4dB, 52.0dB and 50.6dB, respectively. Compared with random coding, the PSNR gains of using optimized coding are 2.1dB, 0.7dB, 1.4dB, 1.4dB and 0.8dB, respectively. The absolute error maps of using random coding and optimized coding are depicted in Fig. 6 (d1) - (d5) and (e1) - (e5), respectively. Note that the maximum value of the colorbar for a tomographic image is the maximum value of the reconstructed cubes which is 0.15 and the minimum value is 0. The red squares in the absolute error maps are used to highlight the difference in the reconstructed images using random coding and optimized coding. It is seen that the optimized coded aperture leads to fewer artifacts in the absolute error maps and the intensity of artifacts using optimized coding is less than that using random coding.

Furthermore, the plot of average PSNR of a $64\times 64\times 64$ 3D phantom as a function of transmittance for optimized coding and random coding is depicted in Fig. 8. It is seen that the proposed method performs much better than the conventional coded aperture X-ray tomosynthesis system. The $64 \times 64 \times 64$ 3D Shepp-Logan phantom cube can be reconstructed at high quality ($\gt$30dB) at only 16% transmittance. Besides, the reconstruction quality is improved by using the optimized coded aperture compared with that of random coded aperture. Note that the results of random coding are the average of 10 realizations in the simulations. The PSNR gain is up to 3dB at the range of 14% - 32% transmittance.

Fig. 8. The plot of average PSNR as a function of transmittance for optimized and random coding for the 3D Shepp-Logan phantom cube with $64 \times 64 \times 64$ voxels.

Download Full Size | PDF

3.2 Simulated results of simulated projections from real datasets

In this section, we present simulations using simulated projections of real datasets obtained in [21]. Experimental data was obtained at Chesapeake Testing Inc., with a Nikon metrology 225/450kV Vault CT scanning system with a 450kV micro-focus X-ray source capable of producing a spot-size down to 80$\mu$m. Multiple X-ray projections over 360 degrees around an object, in this particular case a vivofit watch, were acquired. These projection images were then reconstructed into a full 3D volumetric data set. The 3D data cube was re-sampled to the a cube of size $128 \times 128 \times 128$. The simulations are performed with a cone-beam X-ray source and a flat imager with $512 \times 512$ pixels. The coded aperture with $512 \times 512$ pixels is placed in front of the X-ray source. Note that the elements on the coded aperture have one-to-one correspondence to that on the detector and the source orientation is disk-shaped in the center of the space $\Omega$. The side length of a pixel on the detector is 0.5 cm and thus the geometric length of the imaging system is 256 cm. The source-to-object and object-to-detector distances are 160 cm and 110 cm, respectively. The object rotates around the $r-$axis with pitch angle $\theta = 45^{\circ }$ and the distribution of angle $\omega$ is uniform at the range of $0 \sim 2\pi$. The imaging system captures the coded X-ray projections at a series of 16 rotations and the 3D object is discretized into $128 \times 128\times 128$ voxels of dimensions $0.5 \text {cm} \times 0.5 \text {cm} \times 0.5 \text {cm}$. That is, $M = 512^{2}$, $N = 128^{3}$, $P = 16$ and the dimension of the sensing matrix is $4194304 \times 2097152$ (more than $8.8 \times 10^{12}$ elements in the matrix). It is noted that the content of the object out of the region of interest is truncated to avoid truncation artifacts. The simulations are performed in a desktop architecture with an Intel Core i3 3.7GHz processor, 64G RAM, using Matlab 2019b.

The reconstructed images using random coding and optimized coding with 12% transmittance are depicted in Fig. 9. The execution time for the coded aperture optimization with 12% transmittance is $\sim 5.5\times 10^{3}$ seconds. That is, the sub-sampling rate in the CS reconstruction is 24%. The convergence curves of $128\times 128\times 128$ reconstructed images using random coding and optimized coding are depicted in Fig. 10 (a) and (b), respectively. On the contrast, only two basic sub-matrices and a large structure matrix are required to be numerically calculated in the proposed method. It is seen that the intensity distributions of the 3D data cube varying along the slices are reconstructed without many artifacts using both random coding and optimized coding. The plot of average PSNR as a function of slices for optimized coding and random coding with $128\times 128\times 128$ voxels is depicted in Fig. 11. The PSNR gain is up to 4.8dB and the average PSNR gain for optimized coding is 1.5dB. It is seen that the proposed method outperforms the general CARTS system where $\theta = 0^{\circ }$ by using both optimized coding and random coding.

Fig. 9. The reconstructed tomographic images of the “watch” with $128 \times 128 \times 128$ voxels. The number “4”, “25”, “48”, “76”, “93” and “110” represent the $4$th, $25$th, $48$th, $76$th, $93$th and $110$th slices of the data cube, respectively.

Download Full Size | PDF

Fig. 10. The convergence curves of $128\times 128\times 128$ reconstructed images using (a) random coding and (b) optimized coding.

Download Full Size | PDF

Fig. 11. The plot of average PSNR as a function of slices for optimized coding and random coding with $128\times 128\times 128$ voxels. Conventional CARTS is shown at $\theta = 0^{\circ }$.

Download Full Size | PDF

4. Conclusion

Coded aperture robotic tomography system is introduced to obtain both flexible scanning and lower radiation dose. A static binary “on-off” coded aperture is placed in front of the X-ray source and the robotic source-detector pair scans the object with a pitch angle around a fixed axis. The degrees of freedom in the gantry’s motion are constrained to only rotational scans with respect to a fixed coordinate system. Therefore, the complex structured matrices of the robotic CT systems are simplified using an efficient rotational sensing model based on two basic sub-matrices. The stationary coded aperture is designed based on the PPMCC analysis and the DBS algorithm is used to obtain the optimized coding with one order of magnitude faster computation than state-of-the-art optimization algorithms. Computational experiments using simulated datasets show that the proposed method can reconstruct high-quality tomographic images (>30dB) at only 16% transmittance while that of conventional X-ray tomosynthesis is less than 15dB. Furthermore, the optimized coded aperture leads up to $\sim$3dB PSNR gains in the reconstruction quality compared with that attained by a random coded aperture.

Funding

National Natural Science Foundation of China (61875088, 62005128); National Science Foundation (CIF 1717578); Scientific Research Foundation of Nanjing University of Posts and Telecommunications (NY220210); Natural Science Foundation for Colleges and Universities of Jiangsu Province (20KJB510022); Natural Science Foundation of Jiangsu Province (BK20200745).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. W. A. Kalender, Computed tomography: fundamentals, system technology, image quality, applications (Wiley-VCH, 2000).

2. W. A. Kalender, “X-ray computed tomography,” Phys. Med. Biol. 51(13), R29–R43 (2006). [CrossRef]

3. V. Cnudde and M. N. Boone, “High-resolution x-ray computed tomography in geosciences: A review of the current technology and applications,” Earth-Sci. Rev. 123, 1–17 (2013). [CrossRef]

4. S. B. Solomon, A. Patriciu, M. E. Bohlman, L. R. Kavoussi, and D. Stoianovici, “Robotically driven interventions: a method of using ct fluoroscopy without radiation exposure to the physician,” Radiology 225(1), 277–282 (2002). [CrossRef]

5. E. Ben-David, M. Shochat, I. Roth, I. Nissenbaum, J. Sosna, and S. N. Goldberg, “Evaluation of a ct-guided robotic system for precise percutaneous needle insertion,” J. Vasc. Interv. Radiol. 29(10), 1440–1446 (2018). [CrossRef]

6. A. Fieselmann, J. Steinbrener, A. K. Jerebko, J. M. Voigt, R. Scholz, L. Ritschl, and T. Mertelmeier, “Twin robotic x-ray system for 2d radiographic and 3d cone-beam ct imaging,” in Medical Imaging 2016: Physics of Medical Imaging, vol. 9783 (International Society for Optics and Photonics, 2016), p. 97830G.

7. Y. Kaganovsky, D. Li, A. Holmgren, H. Jeon, K. P. MacCabe, D. G. Politte, J. A. O’Sullivan, L. Carin, and D. J. Brady, “Compressed sampling strategies for tomography,” J. Opt. Soc. Am. A 31(7), 1369–1394 (2014). [CrossRef]

8. X. Ma, Q. Zhao, A. Cuadros, T. Mao, and G. R. Arce, “Source and coded aperture joint optimization for compressive x-ray tomosynthesis,” Opt. Express 27(5), 6640–6659 (2019). [CrossRef]

9. A. Cuadros, X. Ma, and G. R. Arce, “Compressive spectral x-ray tomography based on spatial and spectral coded illumination,” Opt. Express 27(8), 10745–10764 (2019). [CrossRef]

10. A. P. Cuadros and G. R. Arce, “Coded aperture optimization in compressive x-ray tomography: a gradient descent approach,” Opt. Express 25(20), 23833–23849 (2017). [CrossRef]

11. A. P. Cuadros, X. Ma, C. M. Restrepo, and G. R. Arce, “Staticcodect: single coded aperture tensorial x-ray ct,” Opt. Express 29(13), 20558–20576 (2021). [CrossRef]

12. A. P. Cuadros, X. Liu, P. E. Parsons, X. Ma, and G. R. Arce, “Experimental demonstration and optimization of x-ray staticcodect,” Appl. Opt. 60(30), 9543–9552 (2021). [CrossRef]

13. B. Chen, E. Kobler, M. J. Muckley, A. D. Sodickson, T. O’Donnell, T. Flohr, B. Schmidt, D. K. Sodickson, and R. Otazo, “Sparsect: System concept and design of multislit collimators,” Med. Phys. 46(6), 2589–2599 (2019). [CrossRef]

14. T. Koesters, F. Knoll, A. Sodickson, D. K. Sodickson, and R. Otazo, “Sparsect: interrupted-beam acquisition and sparse reconstruction for radiation dose reduction,” in Medical Imaging 2017: Physics of Medical Imaging, vol. 10132 (SPIE, 2017), pp. 174–180.

15. Y. Hasegawa, Y. Yamada, M. Tamura, and Y. Nomura, “Monte carlo simulation of light transmission through living tissues,” Appl. Opt. 30(31), 4515–4520 (1991). [CrossRef]

16. T. Mao, A. P. Cuadros, X. Ma, W. He, Q. Chen, and G. R. Arce, “Fast optimization of coded apertures in x-ray computed tomography,” Opt. Express 26(19), 24461–24478 (2018). [CrossRef]

17. T. Mao, A. Cuadros, X. Ma, W. He, Q. Chen, and G. Arce, “Coded aperture optimization in x-ray tomography via sparse principal component analysis,” IEEE Trans. Comput. Imaging 6, 73–86 (2020). [CrossRef]

18. W. van Aarle, W. J. Palenstijn, J. Cant, E. Janssens, F. Bleichrodt, A. Dabravolski, J. De Beenhouwer, K. J. Batenburg, and J. Sijbers, “Fast and flexible x-ray tomography using the astra toolbox,” Opt. Express 24(22), 25129–25147 (2016). [CrossRef]

19. D. L. Lau and G. R. Arce, Modern digital halftoning (CRC Press, 2001).

20. D. L. Lau, R. Ulichney, and G. R. Arce, “Blue and green noise halftoning models,” IEEE Signal Process. Mag. 20(4), 28–38 (2003). [CrossRef]

21. A. P. Cuadros, C. Peitsch, H. Arguello, and G. R. Arce, “Coded aperture optimization for compressive x-ray tomosynthesis,” Opt. Express 23(25), 32788–32802 (2015). [CrossRef]

22. J. B. Rodriguez, G. R. Arce, and D. L. Lau, “Blue-noise multitone dithering,” IEEE Signal Process. Mag. 17(8), 1368–1382 (2008). [CrossRef]

23. T. L. Jensen, J. H. Jørgensen, P. C. Hansen, and S. H. Jensen, “Implementation of an optimal first-order method for strongly convex total variation regularization,” Bit Numer Math 52(2), 329–356 (2012). [CrossRef]

The angle $θ$	$0^{\circ}$	$15^{\circ}$	$30^{\circ}$	$45^{\circ}$	$60^{\circ}$	$75^{\circ}$	$90^{\circ}$
$Optimized (x_{1} - z_{1})$	12.5dB	22.3dB	31.5dB	44.7dB	48.6dB	43.3dB	46.0dB
$Optimized (y_{1} - z_{1})$	46.0dB	43.3dB	48.6dB	44.7dB	31.5dB	22.3dB	12.5dB
${Average}_{o p t}$	29.2dB	32.8dB	38.1dB	44.7dB	38.1dB	32.8dB	29.2dB
$Random (x_{1} - z_{1})$	12.5dB	21.3dB	30.0dB	42.7dB	47.4dB	41.4dB	42.7dB
$Random (y_{1} - z_{1})$	42.7dB	41.4dB	47.4dB	42.7dB	30.0dB	21.3dB	12.5dB
${Average}_{r a n d o m}$	27.6dB	31.3dB	38.7dB	42.7dB	38.7dB	31.3dB	27.6dB

Static coded aperture in robotic X-ray tomography systems

Abstract

1. Introduction

2. Coded aperture robotic tomography system

2.1 Forward model

2.2 Rotation matrix

2.3 Coded aperture optimization

2.4 Reconstruction

3. Simulation results

3.1 Results of simulated datasets

3.2 Simulated results of simulated projections from real datasets

4. Conclusion

Funding

Disclosures

Data availability

References

Data availability

Cited By

Figures (11)

Tables (1)

Equations (16)

Optics Express