Computational integral imaging reconstruction of perspective and orthographic view images by common patches analysis

Zhiqiang Yan; Xingpeng Yan; Xiaoyu Jiang; Lingyu Ai

doi:10.1364/OE.25.021887

1. Introduction

Integral imaging (II), proposed by Lippmann in 1908 [1], is a promising technique for practical three-dimensional (3-D) display applications. Compared with other 3-D display techniques such as stereoscopy [2], multi-view image display [3] and holography [4], II has comprehensive advantages, such as continuous motion parallax, full color, no visual fatigue, no necessity of coherent illumination and easy implementation of real-time display [5, 6]. II records the irradiance of directional rays emitted from real scenes by a lens array (LA) (or pinhole array) on recording mediums such as films or CCD in the form of elemental image array (EIA), and each elemental image (EI) is a perspective image of the scenes imaged by corresponding elemental lens. When the EIA is displayed combined with LA (or pinhole array), directional rays with the same irradiance are reproduced, which means the recorded light field [7] is reconstructed, and 3-D scene that can be observed by naked eyes is reconstructed.

Both the recording and reconstruction process of II can be implemented optically and computationally [8]. For optical reconstruction of II, the quality of the displayed images is limited by the physical capabilities of devices. For example, the viewing resolution of integrated images is affected by pixel pitch of display device, elemental lens pitch, diffraction and aberration of elemental lens [9–11], and thus the viewing resolution that the human eye can perceive when observing II display is generally low. Meanwhile, optical reconstruction is not always more preferable than computational reconstruction. For instance, in some special fields like II microscopy and II based sensing etc., it is not suitable or necessary to optically reconstruct the scene, and computational II reconstruction (CIIR) which generates view images [12–16] or depth plane images [17, 18] is more practical. By CIIR, high resolution images can be generated beyond the physical limitations of devices, which is very beneficial to fields like image based hologram generation [19], object recognition and identification. Moreover, image processing techniques can be introduced in CIIR to enhance the quality of reconstructed images [16].

The reconstruction of high resolution view images remains an important issue, and several novel methods has been presented. H. Arimoto et al. proposed to generate orthogonal view images by extracting a pixel from the same location of each EI [16]. The number of orthogonal view images equals to the pixel number in an EI, and the resolution of the view images equals to the number of elemental lens, which is commonly low. To increase the resolution of computationally reconstructed view images, micro lens array (MLA) shifting technique is introduced so that the spatial sampling rate of the scene is enhanced, which helps to reconstruct view images with high resolution [14]. Y.-T. Lim et al. introduced MLA shifting technique to II based microscope, and sequential integral photographs were captured at slightly different positions [12]. Using MLA shifting technique, the resolution of computationally generated viewing images can be enhanced N² times if the one-dimensional spatial sampling rate is enhanced N times. J.-H. Park et al. proposed a novel perspective and orthographic view image generation method by estimating the depth of several points on the object and the view image is interpolated by patches among these points [13]. The method is capable of generating view images with full resolution of the recording device, and the reconstruction quality is closely related to the precision of the calculated depth map. Based on the above method, J. Chen et al. further improved the resolution of generated view images by processing multiple image patches that correspond to the same area of object [20].

In this paper, we propose to computationally reconstruct perspective and orthographic view images with full resolution of recording device from a single recorded integral photograph, and without the necessity of calculating depth map. The proposed method is implemented by firstly generating a set of image slices which contain full but redundant information that is needed to reconstruct the view image at current viewpoint or view angle, and then the redundancy is excluded by the proposed common patches analysis (CPA), and finally the view image is synthesized by tiling and resampling the image slices. The common patches are images of the same area on object by different elemental lenses, and their position difference in EIs indicates the 3-D nature, while their irradiance difference indicates the directional radiation characteristic of the object area. The irradiance difference is redundant for the reconstruction of view images and must be excluded, since the direction of radiation is settled for a given viewpoint or view angle. The principles of the method will be explained, and the feasibility of the method will be verified by simulations and experiments.

2. Principles

2.1 Perspective view images reconstruction

Perspective view images are generated in perspective projection in which directional light rays converge at a given viewpoint. The perspective projection is very similar to the way human eyes observe the real scenes. The content of the perspective view image changes with the variation of viewing positions, and the size of the objects in perspective view image varies as the distance between the objects and viewing positions changes.

The concept of the proposed method of reconstructing perspective view images is illustrated in Fig. 1. The object is integral photographed by MLA, and the EIA is formed on the image plane of lens. Firstly, the images slices that contribute to the current view image is extracted from the captured EIA according to the perspective projection geometry, as depicted in Fig. 1(a). A reference plane is chosen between MLA and object, and its location is denoted as $z = l$ . The EIs are projected onto the reference plane through the corresponding elemental lens, and the image slices (marked red, green and blue on reference plane in Fig. 1(a) that can be observed at the viewpoint are the cross section between the reference plane and the pyramid formed by the viewpoint and elemental lens. The shape of the image slices on the reference plane is square regardless of the shape of elemental lens, and the side length of the square, denoted as $w_{1}$ , can be calculated as

w_{1} = \frac{L - l}{L} p_{l}

where

p_{l}

is the elemental lens pitch,

z = L

is the plane where the viewpoint locates. And the size of corresponding image slices on EIA plane, denoted as

w_{2}

, can be calculated as

w_{2} = \frac{g}{l} w_{1}

where

g

is the gap between MLA plane and EIA plane. And the centers of these image slices on EIA plane, denoted as

E_{n} : (x_{E_{n}}, y_{E_{n}}, - g), (1 \leq n \leq N)

(

N

is the number of elemental lens in one dimension), can be determined following

\begin{array}{l} x_{E_{n}} = \frac{g}{L} (x_{C_{n}} - x_{V}) + x_{C_{n}} \\ y_{E_{n}} = \frac{g}{L} (y_{C_{n}} - y_{V}) + y_{C_{n}} \end{array}

where

(x_{C_{n}}, y_{C_{n}}, 0)

is the coordinate of the optical center of the n_th elemental lens, and

(x_{V}, y_{V}, L)

is the coordinate of the viewpoint.

Fig. 1 Schematic diagram of the proposed method of reconstructing perspective view images: the generation of (a) image slices which contains redundant information, and (b) perspective view image with redundancy excluded.

Download Full Size | PDF

The perspective view image plane is parallel to the MLA plane, as shown in Fig. 1(a). The image slices are extracted from EIA following Eqs. (2) and (3), rotated 180°, and tiled to synthesize the perspective view image. Aliasing will happen in such synthesized view images, because there are common patches between adjacent image slices. For example, the common patches between the n_th and (n + 1)_th image slices is the image of the object area between point $O_{n}$ and point $O'_{n + 1}$ on the object, which are simultaneously imaged by the n_th and (n + 1)_th elemental lens. And the common patches emerge repeatedly between adjacent image slices, which causes aliasing.

The way of generating image slices is similar to that discussed by J.-S. Jang, et al. [21], by setting the viewpoint as human eye pupil, and the reference plane as the image plane of lens. The size of image slice should not exceed that of EIs, i.e. $g < l$ . In our method, the reference plane can be arbitrary planes between plane $z = g$ and the minimal depth plane of the object, i.e.

g < l \leq l_{a}

where

l_{a} \leq z \leq l_{b}

is depth range of the object. The reference plane should never exceed the minimal depth plane of the object to avoid information loss.

Secondly, the common patches between any two image slices are found, and each image slice is subdivided into several patches, and the perspective image is integrated by tiling the subdivided patches sequentially with common patches excluded, as depicted in Fig. 1(b). The common patches are the recorded images of the same area of the object by different elemental lens, and reflect the radiance variation in different directions. But they are redundant information for the generation of view images, which cause aliasing, and should be removed to get correct view images.

2.2 Orthographic view image reconstruction

Orthographic view images are generated in orthographic projection geometry in which directional light rays emitted from objects are parallel. The orthographic view images varies only with directions rather than observing positions.

The concept of the method to reconstruct orthographic view images is illustrated in Fig. 2. The process is the same with that of perspective view image generation by setting $w_{1} = p_{l}$ . Firstly, the group of image slices which includes complete and redundant information to reconstruct orthographic view images is generated according to the geometrical relations depicted in Fig. 2(a). The side length of the image slices on EIA plane is

w_{2} = \frac{g}{l} p_{l}

and the coordinates of the centers of them are

\begin{array}{l} x_{E_{n}} = - g \tan θ + x_{C_{n}} \\ y_{E_{n}} = - g \tan θ + y_{C_{n}} \end{array}

where

θ

is the view angle. Secondly, the information redundancy embedded in the group of image slices is excluded. CPA is conducted between any two image slices to find the common patches, and each image slice is subdivided into several common patches and particular patches, as shown in Fig. 2(b). The common patches are shared by at least two image slices, while the particular patches are included in only one image slice. The existence of particular patches depend on geometric parameters, for example, the distance between the reference plane and the depth of local area of the object, and the size of image slice on reference plane. For instance, the n_th image slice generated in orthographic projection geometry is subdivided into two common patches and one particular patch in one dimension, as shown in Fig. 2(b). While the n_th image slice generated in perspective projection geometry is subdivided into three common patches because of a smaller

w_{1}

, as shown in Fig. 1(b), and particular patch will emerge if the reference plane is moved nearer to the object. However, the synthesized view images will not be affected by the compositional structure of images slices, and the position of reference plane is arbitrary within the restriction set by Eq. (4).

Fig. 2 Schematic diagram of the proposed method of reconstructing orthographic view images: the generation of (a) image slices which contains redundant information, and (b) orthographic view image with redundancy excluded.

Download Full Size | PDF

2.3 Common patches analysis

When the reference plane is set not to exceed the minimal depth plane of the object, information redundancy contained in the image slices is guaranteed, and the information loss of consecutive object surface is prevented. The common patches between two adjacent image slices always exist according to the geometrical relations depicted in Figs. 1 and 2, which can be figured out by the proposed CPA operation. CPA in this paper refers to the process of analyzing and finding the image patches that correspond to the same area of the object in two different image slices, and such image patches are called common patches, which are images of the area through different elemental lens and express the irradiance variation in different directions. CPA is similar to the correspondence analysis used in depth extraction, which determines the two pixels that correspond to the same point of object in different EIs [13, 22]. And CPA can be considered as the correspondence analysis of a group of pixels rather than a single pixel in different EIs.

CPA can be implemented specifically in the proposed method by considering the unique features of the generated image slices. As illustrated in Figs. 1(b) and 2(b), considering the n_th and (n + 1)_th image slices on the reference plane, the common patches are the images of the object area between $O_{n}$ and $O'_{n + 1}$ , and $O'_{n + 1}$ corresponds to the bottom pixel in the (n + 1)_th image slice. If the pixel that corresponds to $O'_{n + 1}$ in the n_th image slice is found, the common patch is the area of pixels between it and the upmost pixel in the n_th image slice. The common patch in the (n + 1)_th image slice can be determined in the same way.

The two dimensional implementation of CPA is illustrated in Fig. 3, in which four adjacent EIs are considered. The EIs (gray squares) here are rotated 180° from the EIs on EIA plane, so are the image slices (pink squares). The center (dark cross in Fig. 3) of the (m, n)_th image slices on reference plane is denote as $E'_{m, n} : (x_{E'_{m, n}}, y_{E'_{m, n}}, l)$ , and the coordinate of it can be calculated as

\begin{array}{l} x_{E'_{m, n}} = \frac{l}{g} (x_{C_{m, n}} - x_{E_{m, n}}) + x_{C_{m, n}} \\ y_{E'_{m, n}} = \frac{l}{g} (y_{C_{m, n}} - y_{E_{m, n}}) + y_{C_{m, n}} \end{array}

where

(x_{C_{m, n}}, y_{C_{m, n}}, 0)

is the coordinate of optical center of the (m, n)_th elemental lens, and

(x_{E_{m, n}}, y_{E_{m, n}}, - g)

is the coordinate of center of the (m, n)_th image slice on EIA plane, which can be calculated following Eqs. (3) and (6) according to the specific projection geometry.

Fig. 3 Schematic diagram of CPA between adjacent image slices.

Download Full Size | PDF

The common patches of horizontally adjacent image slices are determined as follows: firstly, conduct the correspondence analysis of the rightmost column of pixels of the (m, n)_th image slice in the (m, n + 1)_th image slice, i.e. find in the (m, n + 1)_th image slice the column of pixels that correspond to the same set of points of the object as the rightmost column of pixels in the (m, n)_th image slice do. The obtained column of pixels is depicted in Fig. 3 as vertical dashed line in the (m, n + 1)_th image slice, and the common patch is the group of pixels between the obtained column and the leftmost column in the (m, n + 1)_th image slice, the width of which is $a_{2}$ pixels. Secondly, conduct the correspondence analysis of the leftmost column of pixels of the (m, n + 1)_th image slice in the (m, n)_th image slice, and the common patch is the group of pixels between the obtained column (marked as vertical dashed line) and the rightmost column of pixels in the (m, n)_th image slice, and width of which is $a_{1}$ pixels.

The correspondence analysis of the column of pixels in column $T_{0}$ and row from $S_{1}$ to $S_{2}$ , can be implemented by finding $t$ which makes $s s d (t)$ minimal, and $s s d (t)$ is expressed as

s s d (t) = {\sum_{s \in Q} [I_{m, n} (s, T_{0}) - I_{m, n + 1} (s, t)]}^{2}, (t \in N_{+}, t_{1} \leq t \leq t_{2})

where

I_{m, n} (s, t)

denotes the value of the (s, t)_th pixel in the (m, n)_th EI,

t_{1}

and

t_{2}

are the left row and right row of the image slice,

Q = {s | s \in N_{+}, S_{1} - ⌊ \frac{W}{2} ⌋ \leq s \leq S_{2} + ⌊ \frac{W}{2} ⌋}

, and

W

is the size of window used to get more precise correspondence calculation,

⌊ x ⌋

gets the floor of

x

.

For orthographic projection geometry, the optimum $t$ exists in the range of image slices, as shown in Fig. 2. While for perspective projection geometry, if the leftmost (rightmost) light rays from the two adjacent image slices on reference plane (e.g. line $C_{n + 1} O_{n + 1}$ and line $C_{n} O_{n}$ ) do not intersect in the depth range of object, the optimum $t$ exists within the range of image slices, as shown in Fig. 1, which requires $l_{b} < L$ , and it generally holds. Therefore, the searching range of $t$ is within image slices.

The common patches of vertically adjacent image slices can be determined by analogy with that of horizontal ones. CPA is conducted for any two adjacent image slices, horizontally and vertically, and thus the common patches are found. The common patches of four adjacent image slices are illustrated in Fig. 3, and the common patches of two adjacent image slices are marked as the same color, and the width of them are recorded as $a_{1}$ , $a_{2}$ , $a_{3}$ , $a_{4}$ pixels for horizontal common patches, and $b_{1}$ , $b_{2}$ , $b_{3}$ , $b_{4}$ pixels for vertical common patches.

2.4 Merging and resampling strategy

After conducting CPA between any two adjacent image slices, the compositional structure of each image slices are figured out, and the information redundancy embedded in the images slices can be removed. The image slices are subdivided into patches by CPA, and the strategy of merging such patches into view images should be seriously considered in case of image distortions brought by geometrical mismatching. We propose to merge the patches within a square denoted as $S_{m, n}, (1 \leq m \leq M - 1, 1 \leq n \leq N - 1)$ , with the centers of four adjacent image slices as vertexes, as shown in Fig. 4, and geometrical mismatching can be avoided by taking the four centers as the benchmark. A quarter of each image slice is included in the square, as depicted in Fig. 4(a). The width of horizontal (vertical) common patches should be filtered to decrease the correspondence calculation errors in CPA, and mean filter is used for simplicity, i.e.

\begin{array}{l} A_{m, n} = mean (a_{1}, a_{2}, a_{3}, a_{4}) \\ B_{m, n} = mean (b_{1}, b_{2}, b_{3}, b_{4}) \end{array}

where

A_{m, n}

and

B_{m, n}

denotes the average width of horizontal and vertical common patches in

S_{m, n}

. The width of horizontal (vertical) common patches is a constant after filtering, as depicted in Fig. 4(b). The patches within the square are merged by discarding any one of the two common patches in the two horizontally (vertically) adjacent image slices, as depicted in Fig. 4(c), and a merged image without information redundancy is generated. The width and height of the merged image are

w - A_{m, n}

and

w - B_{m, n}

pixels respectively, here

w

is the one-dimensional pixel number in each image slice. The width and height of the merged images are inversely proportional to the distance between LA and the current area on object, which can be known by considering the pixelated structure of recording devices such as CCD. For instance, a smaller distance means the current area is imaged more detailedly, and results in a larger number of pixels contained in the merged image, and vice versa. For view images, the maximum sampling number on the reference plane within the square is

H \times V = \max [(w - A_{m, n}) \times (w - B_{m, n})], (1 \leq m \leq M - 1, 1 \leq n \leq N - 1)

where H and V denotes the horizontal and vertical pixel number of the merged image when maximum sampling number is achieved. In order to generate view images with maximum resolution, the sampling rate should be maximum, i.e. all the square

S_{m, n}

should be sampled

H \times V

times, which means the resolution of the merged image in each square should be modified to

H \times V

. Linear interpolation is used to implement resampling process, and each merged image is linearly interpolated from resolution of

(w - A_{m, n}) \times (w - B_{m, n})

to

H \times V

, as depicted in Fig. 4(d).

Fig. 4 Merging and resampling strategy when common patches do not exceed the centers of image slices. (a) Compositional structure of image slices in square $S_{m, n}$ , (b) mean filtering of the width of common patches, (c) the merged image, (d) interpolation of merged image to the resolution of $H \times V$ .

Download Full Size | PDF

The view images can be generated by tiling the resampled merged images which are composed by 9 different patches, as depicted in Fig. 4(d). The situation happens when the horizontal and vertical common patches do not exceed the corresponding centers of image slices. However, there are occasions when the common patches exceed the centers of image slices, as depicted in Fig. 5(a), which may happen when the distance between the reference plane and current area of object surface is very large. For simplicity, only horizontal circumstance is shown in Fig. 5. The common patches within the square is depicted in Fig. 5(b), which are marked with the same fringe pattern and color, and the width of them are $w - a_{i}$ , (i = 2,1,4,3). The area (surround by dashed rectangle) in each image slice corresponds to object surface that is out of the scope that square $S_{m, n}$ corresponds to, and it should not be included in the merged image. Therefore, one common patch of two adjacent image slice within square $S_{m, n}$ is chosen as the merged image, as depicted in Fig. 5(c). The width of the four horizontal common patches should be filtered to decrease correspondence calculation error following Eq. (11). After generating the merged image, linear interpolation is conducted to extend the resolution to $H \times V$ .

Fig. 5 Merging strategy of image slices when common patches exceed the centers of image slices. (a) Compositional structure of image slices in square $S_{m, n}$ , (b) merging strategy, (c) the merged image.

Download Full Size | PDF

A_{m, n} = w - mean (a_{1}, a_{2}, a_{3}, a_{4})

By using the above merging strategy, CPA is necessary only between adjacent images slices, rather than any two image slices as described in subsection 2.1-2.2, which makes the method easy and timesaving to implement.

2.5 Essence of the method

The proposed method subdivides the object surface into small pieces, and each is composed by the area between $E''_{n}$ and $E''_{n + 1}$ , $(1 \leq n \leq N - 1)$ , here $E''_{n}$ is the intersection of the object surface and line $E_{n} E'_{n}$ , as the geometry depicted in Fig. 6. We illustrate the geometrical relations in one dimension, and that in two dimension can be easily deduced by analogy. The image slice in square $S_{n}$ contains complete but redundant information to represent the object area between $E''_{n}$ and $E''_{n + 1}$ , and the redundancy should be excluded to achieve exactly right description of current object area, which is implemented by CPA and the merging strategy. Each piece of the object surface is detailedly represented with 9 image patches at most from 4 adjacent EIs, as depicted in Fig. 4(c). The view image is composed of the images of pieces of object surface, which are all exactly right represented, so the view image contains complete and nonredundant information for current viewpoint or view angle.

Fig. 6 Subdivision of object surface into pieces and exactly right description of each piece for (a) perspective projection geometry and (b) orthographic projection geometry.

Download Full Size | PDF

The object surface is imaged by elemental lens with a resolution of $l_{o} p / g$ , where $p$ is the pixel pitch of recording medium, $l_{o}$ is distance between object and MLA. Since the merged image is composed of multiple patches, and each patch is the image of part of the local area of object surface by different elemental lens, with full resolution, it can be deduced that each piece of object surface is represented with maximum resolution. Using the resampling strategy described in subsection 2.3, the view image can achieve maximum resolution because of no information loss during interpolation.

3. Simulation

We verify the proposed method and illustrate the implementation process by simulation. A virtual strawberry (Fig. 7(a)) is used as the 3-D model, and EIA (Fig. 7(b)) is digitally generated by tiling the perspective images captured by 19 × 19 virtual pinhole array, which simulates the process of optical integral photography. The pitch of pinholes is $p_{l} = 1$ mm, and the gap between the pinhole array and captured EIs is $g =$ 3.3 mm. The pixel pitch of EIs is 12.3 μm and each EI contains 81 × 81 pixels. The pinhole array locates on z = 0 cm, and the minimal and maximum distance between pinhole array and the 3-D model is l_a = 2 cm and l_b = 4 cm, respectively. The reference plane can be arbitrary between z = 0.33 cm and z = 2 cm, here it is set as z = 1.5 cm. The upper left pixel in EIA locates at (0, 0, −0.33) cm, and the bottom right pixel locates at (1.9, 1.9, −0.33) cm. The reconstruction process of view images can be divided to three steps as discussed in subsection 2.1-2.4, i.e.

Fig. 7 (a) The 3-D model used for simulation, (b) the generated EIA, (c) the view image generated with information redundancy, and aliasing occurs.

Download Full Size | PDF

(1) Generating image slices according to the specific projection geometry;
(2) CPA of any two adjacent image slices to figure out the compositional structure of each image slice;
(3) Merging the image slices with redundancy excluded, and interpolating them to generate the view image.

The image slices are determined following Eqs. (2) and (3) for perspective projection geometry, and Eqs. (5) and (6) for orthographic projection geometry. Figure 7(b) shows the image slices and the centers of them generated at viewpoint (1, 1, 40) cm in perspective projection geometry, and the resolution of each image slice is 17 × 17. The view image shown in Fig. 7(c) is generated by rotating each image slice 180° and tiling them sequentially, which contains full but redundant information to construct the correct view image, and the information redundancy causes the aliasing effect. The view image is seen from an inverse direction with the original image (Fig. 7(a)), so it is left and right reversed compared with the original image.

After generating the image slices, CPA is conducted between any two adjacent image slices to find out the redundant information, as shown in Fig. 8. CPA between horizontally (vertically) adjacent image slices is implemented by double correspondence analysis of a column (row) of pixels, i.e. finding in the (m, n + 1)_th ((m + 1, n)_th) image slice the column (row) of pixels that correspond to the rightmost (bottom) column (row) of pixels in the (m, n)_th image slice, and finding in the (m, n)_th image slice the column (row) of pixels that correspond to the leftmost (uppermost) column (row) of pixels in the (m, n + 1)_th ((m + 1, n)_th) image slice. Extra 30 pixels that is in the same column (row) in EI is introduced to decrease error of correspondence analysis, i.e. the size of window is W = 30 pixels. After CPA, the compositional structure, i.e. the common patches that are shared by other image slices and the particular patches, of image slices are figured out, as depicted in Fig. 8.

Fig. 8 CPA of horizontal and vertical image slices.

Download Full Size | PDF

The information redundancy is excluded in the merging step, in which a quarter of each image slice in the 4 adjacent image slice is considered, and the merging strategy are described in section 2.3. The merged image is an assembled image of the local area of object surface by multiple elemental lenses with full resolution, and all such local areas forms the whole object surface at current viewpoint (view angle) without empty holes, so all the merged images contain the image of the whole object surface with full resolution. The pixels in each merged image slice is different according to the distance between LA and the local area of object surface, and a nearer distance means more pixels in the merged image slice. For view image generation, the spatial sampling rate should be constant, which means the pixel number in each merged image should be equal. To achieve maximum resolution, the maximum pixel number of merged image should be chosen as the sampling rate, with which all the merged images should be resampled. The maximum pixel number of merged image in the simulation is 13 × 13, and all merged images are resampled, which can be implemented by linear interpolation, and different interpolation strategies such as nearest interpolation, bilinear interpolation, bicubic interpolation can be adopted. The final view image at the viewpoint or view angle can be generated by tilling the resampled images.

Various perspective view images from different lateral positions and depth are reconstructed, as illustrated in Figs. 9(a) and 9(b). The reconstructed view images are pseudoscopic since no pseudoscopic to orthoscopic conversion is applied to the EIA. The pseudoscopic phenomenon can be evidenced by such examples as the image (e.g. the green leaves) that should be observed at the leftmost viewpoint (1, −4.1, 40) cm is observed at the rightmost viewpoint (1, 4.9, 40) cm, and vice versa. The size of the object in view images decreases as the distance between viewpoint and object increases, as shown in Fig. 9(b), which is the typical feature for perspective projection geometry. The computationally reconstructed perspective view images by the proposed method gain very high resolution of 234 × 234, and the image composed of central pixels of images slices is generated for comparison, as shown in Fig. 9(c), the resolution of which equals to the number of EIs, i.e. 19 × 19, and the resolution is the physical resolution for optical reconstruction when the pupil is infinitely small. Therefore, view images with much higher resolution can be reconstructed by our computational method compared with the optical manner, which is beneficial for the recognition and identification of the captured 3-D object. Visualization 1 presents the reconstructed perspective view images from various viewpoints and depth.

Fig. 9 The reconstructed perspective view images from (a) different viewpoints and (b) depth, and (c) view images with resolution equal to the number lens of LA. Visualization 1 presents the reconstructed perspective view images from various viewpoints and depth.

Download Full Size | PDF

The reconstruction of orthographic view images is similar to that of perspective view images, except that the image slices and their centers are different, which can be determined by Eqs. (5) and (6) respectively. The reconstructed orthographic view images with resolution of 234 × 234 from different view angles are shown in Fig. 10(a), and the sub-images with low resolution of 19 × 19 are shown in Fig. 10(b) for comparison, and pseudoscopic phenomenon also exist in these images. Visualization 2 is a video of the reconstructed orthographic view images and corresponding sub-images from various view angles.

Fig. 10 The reconstructed (a) orthographic view images and (b) sub-images. Visualization 2 presents the reconstructed orthographic view images and sub-images from various view angles.

Download Full Size | PDF

4. Experiment and results

We also conduct experiment to verify the proposed method. EIA is generated using synthetic aperture method [23, 24], as depicted in Fig. 11. A digital camera (Cannon EOS 550d) is shifted horizontally and vertically with a constant step of 3 mm and sampling number of 21 × 21, to capture perspective images of the 3-D scene which is composed of a toy car and a green plant. The focal length of the camera lens is 18 mm, and F number is 22. The camera focuses on z = 900 mm, and then the gap g is about 18 mm. The depth of field is large enough to capture sharp images of the 3-D scene positioned 19 cm away from the camera. The pixel pitch of the camera is 4.3 μm, and the size of each EI is 3 mm × 3 mm, and thus EIs are generated by extracting the central 700 × 700 pixels from the captured perspective images. The EIs are resized to 200 × 200 for computation simplicity, and the resolution of EIA is 4200 × 4200. The front view of the 3-D scene is shown in Fig. 12(a), and the generated EIA is shown is Fig. 12(b).

Fig. 11 Schematic of capturing EIs using synthetic aperture method.

Download Full Size | PDF

Fig. 12 (a) Front view of the scene, and (b) the generated EIA.

Download Full Size | PDF

Perspective and orthographic view images are reconstructed using the proposed method, as illustrated in Figs. 13 and 14 respectively. The reference plane is fixed at z = 12 cm for both projection modes, and a window with size of W = 30 pixels is set for more precise correspondence analysis. For perspective view image reconstruction, when the viewpoints are on z = 200 cm, the resolution of each image slice is 28 × 28, and the maximum resolution of the merged images is 24 × 24, and then the resolution of the reconstructed perspective view images is 480 × 480. While for orthographic view image reconstruction, the resolution of each image slice is 30 × 30, and the maximum resolution of the merged images is 27 × 27, and then the resolution of the reconstructed orthographic view images is 540 × 540. Since larger area of the model is covered in orthographic projection geometry than that in perspective projection geometry, the resolution of orthographic view images is higher. Visualization 3 and Visualization 4 present animations of view images reconstructed at various viewpoints and view angles. Some parts in the reconstructed images are a little distorted, which is caused by the calculation errors in correspondence analysis. The quality of the reconstructed view images can be further improved by employing more proper algorithms in correspondence analysis calculation.

Fig. 13 The reconstructed perspective view images from (a) different viewpoints and (b) depth. Visualization 3 presents the reconstructed perspective view images from various viewpoints and depth.

Download Full Size | PDF

Fig. 14 The reconstructed orthographic view images. Visualization 4 presents the reconstructed orthographic view images from various view angles.

Download Full Size | PDF

5. Conclusions

In conclusion, a novel method to computationally reconstruct perspective and orthographic view images from a single integral photograph is proposed. The core idea of the method is to exclude the redundancy embedded in the information that is needed to reconstruct the view image, and exact description of the object surface is achieved. The view images can be reconstructed at arbitrary positions or view angles (within the field of view of II system) with full resolution of recording device, which is beneficial for the recognition and identification of the object. The method may have potential applications in such areas as hologram generation, 3-D object sensing and recognition. Both simulation and experiment results verify the validity of the proposed method.

Funding

National Natural Science Foundation of China (61775240); Foundation for the Author of National Excellent Doctoral Dissertation of the People’s Republic of China (FANEDD) (201432); Natural Science Foundation of Beijing Municipality (4152049); Beijing NOVA program (Z1511000003150119).

References and links

1. G. Lippmann, “La photographic integrale,” CR Acad. Sci. 146, 446–451 (1908).

2. J.-Y. Son, B. Javidi, S. Yano, and K.-H. Choi, “Recent developments in 3-D imaging technologies,” J. Disp. Technol. 6(10), 394–403 (2010). [CrossRef]

3. J.-Y. Son and B. Javidi, “Three-dimensional imaging methods based on multiview images,” J. Disp. Technol. 1(1), 125–140 (2005). [CrossRef]

4. P. Su, W. Cao, J. Ma, B. Cheng, X. Liang, L. Cao, and G. Jin, “Fast computer-generated hologram generation method for three-dimensional point cloud model,” J. Disp. Technol. 12(12), 1688–1694 (2016). [CrossRef]

5. Y. Kim, K. Hong, and B. Lee, “Recent researches based on integral imaging display method,” 3D Research 1(1), 17–27 (2010). [CrossRef]

6. J.-H. Park, K. Hong, and B. Lee, “Recent progress in three-dimensional information processing based on integral imaging,” Appl. Opt. 48(34), H77–H94 (2009). [CrossRef] [PubMed]

7. M. Levoy, “Light fields and computational imaging,” Computer 39(8), 46–55 (2006). [CrossRef]

8. A. Stern and B. Javidi, “Three-dimensional image sensing, visualization, and processing using integral imaging,” Proc. IEEE 94(3), 591–607 (2006). [CrossRef]

9. H. Hoshino, F. Okano, H. Isono, and I. Yuyama, “Analysis of resolution limitation of integral photography,” J. Opt. Soc. Am. A 15(8), 2059–2065 (1998). [CrossRef]

10. Z. E. Ashari, Z. Kavehvash, and K. Mehrany, “Diffraction influence on the field of view and resolution of three-dimensional integral imaging,” J. Disp. Technol. 10(7), 553–559 (2014). [CrossRef]

11. Z. Kavehvash, M. Martinez-Corral, K. Mehrany, S. Bagheri, G. Saavedra, and H. Navarro, “Three-dimensional resolvability in an integral imaging system,” J. Opt. Soc. Am. A 29(4), 525–530 (2012). [CrossRef] [PubMed]

12. Y.-T. Lim, J.-H. Park, K.-C. Kwon, and N. Kim, “Resolution-enhanced integral imaging microscopy that uses lens array shifting,” Opt. Express 17(21), 19253–19263 (2009). [CrossRef] [PubMed]

13. J.-H. Park, G. Baasantseren, N. Kim, G. Park, J.-M. Kang, and B. Lee, “View image generation in perspective and orthographic projection geometry based on integral imaging,” Opt. Express 16(12), 8800–8813 (2008). [CrossRef] [PubMed]

14. A. Stern and B. Javidi, “Three-dimensional image sensing and reconstruction with time-division multiplexed computational integral imaging,” Appl. Opt. 42(35), 7036–7042 (2003). [CrossRef] [PubMed]

15. S. Kishk and B. Javidi, “Improved resolution 3D object sensing and recognition using time multiplexed computational integral imaging,” Opt. Express 11(26), 3528–3541 (2003). [CrossRef] [PubMed]

16. H. Arimoto and B. Javidi, “Integral three-dimensional imaging with digital reconstruction,” Opt. Lett. 26(3), 157–159 (2001). [CrossRef] [PubMed]

17. Y. Piao and E.-S. Kim, “Resolution-enhanced reconstruction of far 3-D objects by using a direct pixel mapping method in computational curving-effective integral imaging,” Appl. Opt. 48(34), H222–H230 (2009). [CrossRef] [PubMed]

18. L.-Y. Ai, X.-B. Dong, J.-Y. Jang, and E.-S. Kim, “Optical full-depth refocusing of 3-D objects based on subdivided-elemental images and local periodic δ-functions in integral imaging,” Opt. Express 24(10), 10359–10375 (2016). [CrossRef] [PubMed]

19. N. Chen, J.-H. Park, and N. Kim, “Parameter analysis of integral Fourier hologram and its resolution enhancement,” Opt. Express 18(3), 2152–2167 (2010). [CrossRef] [PubMed]

20. J. Chen, Q. H. Wang, S. L. Li, Z. L. Xiong, and H. Deng, “Multiple elemental image mapping for resolution‐enhanced orthographic view image generation based on integral imaging,” J. Soc. Inf. Disp. 22(9), 487–492 (2014). [CrossRef]

21. J.-S. Jang and B. Javidi, “Improved viewing resolution of three-dimensional integral imaging by use of nonstationary micro-optics,” Opt. Lett. 27(5), 324–326 (2002). [CrossRef] [PubMed]

22. G. Passalis, N. Sgouros, S. Athineos, and T. Theoharis, “Enhanced reconstruction of three-dimensional shape and texture from integral photography images,” Appl. Opt. 46(22), 5311–5320 (2007). [CrossRef] [PubMed]

23. J.-S. Jang and B. Javidi, “Three-dimensional synthetic aperture integral imaging,” Opt. Lett. 27(13), 1144–1146 (2002). [CrossRef] [PubMed]

24. H. Navarro, R. Martínez-Cuenca, G. Saavedra, M. Martínez-Corral, and B. Javidi, “3D integral imaging display by smart pseudoscopic-to-orthoscopic conversion (SPOC),” Opt. Express 18(25), 25573–25583 (2010). [CrossRef] [PubMed]

Name	Description
Visualization 1	The video is an animation of the reconstructed perspective view images of virtual object from various viewpoints.
Visualization 2	The video is an animation of the reconstructed orthographic view images of virtual object from various view angles.
Visualization 3	The video is an animation of the reconstructed perspective view images of real object from various viewpoints.
Visualization 4	The video is an animation of the reconstructed orthographic view images of real object from various view angles.

Computational integral imaging reconstruction of perspective and orthographic view images by common patches analysis

Abstract

1. Introduction

2. Principles

2.1 Perspective view images reconstruction

2.2 Orthographic view image reconstruction

2.3 Common patches analysis

2.4 Merging and resampling strategy

2.5 Essence of the method

3. Simulation

4. Experiment and results

5. Conclusions

Funding

References and links

Supplementary Material (4)

Cited By

Figures (14)

Equations (11)

Optics Express