
Performance improvement for compressive light field display based on the depth distribution feature


Abstract

Compressive light field (CLF) display using multi-layer spatial light modulators (SLMs) is a promising technique for three-dimensional (3D) display. However, conventional CLF displays usually use a reference plane with a fixed depth, which ignores the relationship between the depth distribution of the object and the image quality. To improve the quality of the reconstructed image, we analyze this relationship in detail in this paper. The theoretical analysis reveals that an object closer to a physical layer has better reconstruction quality when the SLM layers have the same pixel density. To minimize the deviation between the reconstructed light field and the original light field, we propose a method based on the depth distribution feature that automatically guides the light field optimization without increasing the number of layers or the refresh rate. When applied to a different scene, it detects the dense regions of depth information and maps them as close to the physical layers as possible by offsetting the depth of the reference plane. Simulation and optical experiments with the CLF display verify the proposed method. We implement a CLF display consisting of four stacked display panels with a 5 cm gap between adjacent layers. When the proposed method is applied, the peak signal-to-noise ratio (PSNR) is improved by 2.4 dB in simulations and 1.8 dB in experiments.

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Three-dimensional (3D) display has attracted considerable attention as an essential next-generation display because of its ability to provide 3D perception. In most commercial 3D displays, physiological depth perception is constructed on the principle of binocular disparity. However, providing only binocular disparity brings about a non-negligible drawback called the vergence-accommodation conflict (VAC) [1–3], making the observer suffer from visual discomfort and fatigue. Aiming to mitigate the VAC, integral imaging [4–8] and holographic display [9–12] have been put forward. Integral imaging displays mitigate the VAC by utilizing a two-dimensional micro-lens array and a high-resolution display. Although this approach provides binocular parallax and motion parallax, the trade-off between spatial resolution and angular resolution remains to be solved. Holographic display based on ray-wavefront conversion can provide all depth cues by reconstructing the wavefront information of the 3D scene. Though it can efficiently resolve the VAC problem, some challenges remain to be overcome, such as speckle noise and the great computational demand.

In recent years, an emerging approach called compressive light field (CLF) display [13–18], based on multi-layer spatial light modulators (SLMs), has been introduced to solve the VAC problem and relieve the spatio-angular resolution trade-off. In this structure, the light field (LF) information is compressed into layered images and displayed on multi-layer SLMs. When viewed from different directions, it presents slightly different scenes without sacrificing resolution. Though these prototypes can easily provide a high-resolution light field and continuous depth perception, the poor image quality still limits the display performance. Cao et al. and Liu et al. analyzed the relationship between the average overload rate and image quality [19,20]. The average overload rate is defined as the ratio of the number of light rays to the number of pixels. A lower overload rate means each pixel is responsible for fewer light rays, resulting in better image quality. Encouraged by this conclusion, increasing the number of layers has been adopted to improve image fidelity. It is usually implemented with additional optical elements such as Pancharatnam-Berry phase lenses (PBLs) [21,22], half mirrors [23], reflective polarizers [24], holographic optical elements (HOEs) [25], polymer stabilized cholesteric texture (PSCT) shutters [26] and Savart plates [27]. While this is a straightforward way to enhance image fidelity, adding physical layers or optical elements brings its own issues, such as increased system volume or the requirement for high refresh rates of the display panel. One way to overcome these problems is to encode the layered images into a hologram [28]. Nevertheless, some issues remain to be solved for commercialization: speckle noise and the requirement for high-specification equipment.

Recently, other approaches utilizing human visual features or the structural characteristics of CLF displays have been proposed to enhance the quality of the perceived image with a limited number of layers. Chen et al. proposed a CLF display with a viewing-position-dependent weight distribution [29]. It achieves a high-performance display effect and enlarges the field of view (FOV), but the eye tracker limits the number of viewers. Lee et al. utilized the fall-off of visual acuity in the periphery of the retina to enhance image quality and enlarge the eye-box [30]. One disadvantage of approaches based on human visual features is that they can only be applied to near-eye displays. Wang et al. first proposed a salience-guided depth calibration for perceptually optimized CLF displays [31]. Different from existing CLF displays that only use a reference plane of fixed depth, that method automatically calibrates the depth of the reference plane based on the salience results. However, obtaining a salient region efficiently and robustly remains a challenge in computer vision. Inspired by Wang et al., we consider the depth distribution of the 3D scene and guide the depth calibration for the light field optimization to realize a CLF display with excellent performance.

In this paper, we establish a theoretical model to analyze the relationship between the depth distribution of the object and the image quality. The theoretical analysis reveals that an object closer to the physical layers has better reconstruction quality when the SLM layers have the same pixel density. Then, a method based on the depth distribution feature to guide the light field optimization automatically is proposed. In Section 2, the concept of the conventional CLF display is introduced, and the relationship between the depth distribution of the object and the image quality is analyzed in detail. In Section 3, a method of offsetting the depth of the reference plane is introduced to minimize the deviation between the target light field and the reconstructed light field. In Section 4, the feasibility of the proposed method is verified with a photorealistic optical experiment.

2. Principle

2.1 Conventional compressive light field display

Figure 1(a) illustrates the light field rendering. For simplicity of analysis, the two-dimensional case is considered. The reference plane is defined as the plane on which the camera is focused when capturing the viewpoint image. The direction of a light ray can be described by a point on the reference plane and the location of the camera. Figure 1(b) shows the structure of the CLF display, where a few SLMs are stacked at uniform intervals. When reconstructing the light field with a CLF display, the reference plane can theoretically be located anywhere [31]. By moving the reference plane, the location of the object relative to the physical layers can be changed. However, in the conventional CLF display, the reference plane is usually located at a fixed depth, such as the display volume center [31–33]. Therefore, the conventional CLF display does not consider the relationship between the depth distribution of the object and the image quality. As depicted in Fig. 1(b), any light ray in 2D space can be parameterized by $x_o$ and $\theta$, where $x_o$ is a point on the reference plane and $\theta$ is a directional angle. Assume the light ray $l(x_o,\theta)$ passes through the stacked SLMs and intersects each layer. According to this parameterization, the $n^{th}$ intersection coordinate satisfies the following equation:

$$x_n = x_o + [(n - 1)\cdot \Delta d - h_o] \cdot \tan \theta \tag{1}$$

where $h_o$ is the distance from the reference plane to the $1^{st}$ SLM and $\Delta d$ is the gap between adjacent SLMs. The reconstructed light ray $l'(x_o,\theta)$ generated by the N SLM layers can be represented in two ways. One is the multiplicative CLF display, described by the following equation:

$$l'(x_o,\theta) = \prod_{n = 1}^{N} f_n \{ x_o + [(n - 1)\cdot \Delta d - h_o] \cdot \tan \theta \} \tag{2}$$

Fig. 1. (a) The light field rendering. The red dotted line represents the reference plane. (b) The structure of the CLF display.

The other is the additive CLF display:

$$l'(x_o,\theta) = \sum_{n = 1}^{N} f_n \{ x_o + [(n - 1)\cdot \Delta d - h_o] \cdot \tan \theta \} \tag{3}$$
where $f_n$ is the intensity of the image on the $n^{th}$ SLM.
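To make the forward model concrete, the following minimal sketch evaluates Eqs. (1)–(3) for a toy two-dimensional setup; the names (layers, dd, s0) and the nearest-pixel sampling are illustrative assumptions, not the paper's implementation.

```python
# A minimal sketch of the forward model in Eqs. (1)-(3), assuming each layer
# image f_n is a 1-D array sampled at pixel pitch s0 (toy 2D case).
import numpy as np

def reconstruct_ray(layers, x_o, theta, h_o, dd, s0, mode="additive"):
    """Evaluate l'(x_o, theta) through N stacked SLM layers.

    layers : list of 1-D arrays, image intensity on each SLM
    h_o    : distance from the reference plane to the 1st SLM
    dd     : gap between adjacent SLMs (Delta d)
    s0     : pixel size, used to map a position to a pixel index
    """
    vals = []
    for n, f_n in enumerate(layers, start=1):
        # Eq. (1): intersection of the ray with the n-th layer
        x_n = x_o + ((n - 1) * dd - h_o) * np.tan(theta)
        idx = int(np.clip(round(x_n / s0), 0, len(f_n) - 1))  # nearest pixel
        vals.append(f_n[idx])
    # Eq. (2) multiplies the sampled values; Eq. (3) sums them
    return np.prod(vals) if mode == "multiplicative" else np.sum(vals)
```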

To compress the light field into N layer images for the corresponding SLMs, the following constrained optimization problem should be solved:

$$\arg \min \left\| l(x_o,\theta) - l'(x_o,\theta) \right\|^2 \tag{4}$$

Equation (4) minimizes the deviation between the target light field and the reconstructed light field using the nonnegative least squares method [21,28,34]. After the decomposed images are presented on each SLM, the light field scene is generated and observed.
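As a hedged illustration of Eq. (4) for the additive model, the sketch below assembles a toy-sized ray-to-pixel system and solves it with SciPy's nonnegative least squares; the function and argument names are our own assumptions.

```python
# Sketch of the additive decomposition in Eq. (4): the linear system A f = l
# is solved subject to f >= 0. Sizes are kept small; a practical solver would
# exploit the sparsity of A.
import numpy as np
from scipy.optimize import nnls

def decompose_additive(target_rays, ray_params, n_layers, n_px, h_o, dd, s0):
    """target_rays: l(x_o, theta) per ray; ray_params: list of (x_o, theta)."""
    A = np.zeros((len(target_rays), n_layers * n_px))
    for r, (x_o, theta) in enumerate(ray_params):
        for n in range(1, n_layers + 1):
            x_n = x_o + ((n - 1) * dd - h_o) * np.tan(theta)   # Eq. (1)
            idx = int(np.clip(round(x_n / s0), 0, n_px - 1))
            A[r, (n - 1) * n_px + idx] = 1.0       # additive model, Eq. (3)
    f, _ = nnls(A, np.asarray(target_rays, float)) # arg min ||A f - l||, f >= 0
    return f.reshape(n_layers, n_px)               # one row per SLM image
```

For the multiplicative model of Eq. (2), the system becomes linear in the logarithms of the layer values, which is how such decompositions are commonly handled.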

In experiments, it is found that a better reconstruction quality is obtained when the object point is closer to a physical layer. Liu et al. briefly analyzed the relationship between the depth of the object point and the reconstruction quality [20], assuming that the reconstruction quality is determined by the pixel with the lowest overload rate. However, that analysis does not consider that each object point is reconstructed by multiple pixels. Therefore, we further analyze the relationship between object point depth and quality in the next section.

2.2 Theoretical analysis

This section further analyzes the relationship between the depth distribution of 3D scenes and the image quality. The theoretical analysis reveals that an object closer to the physical layers has better reconstruction quality when the SLMs have the same pixel density. In particular, the best reconstruction quality is obtained when the object is located exactly on a physical layer. Conversely, the worst reconstruction quality is obtained when the object lies midway between adjacent layers.

As shown in Fig. 2, we suppose that a planar object is displayed at a constant depth $h_1$ and the object point O lies on this plane. The viewing distance is H and the viewing area is D. As depicted in Fig. 2(a), the number of pixels $Z_i$ on the $i^{th}$ layer traversed by the target light rays emitted from object point O is expressed by the following formula:

$$Z_i = |h_1 - (i - 1)\cdot \Delta d| \cdot \frac{D}{(H + h_1) \cdot S_0} \tag{5}$$
where $S_0$ is the pixel size. Since the number of light rays emitted from O is constant, a smaller $Z_i$ means each pixel on the $i^{th}$ layer controls more of the light rays emitted by O. In other words, the number of light rays controlled by a single pixel on the $i^{th}$ layer is inversely proportional to $Z_i$. Meanwhile, the light rays emitted by the same object point are highly coherent: their intensities are almost the same, while the intensities of rays emitted from different object points generally differ. Thus, the more light rays emitted by the same object point O pass through a single pixel, the more that pixel contributes to the reconstructed object point O. The weight ratio of pixels on different layers can be written as the following expression:
$$w_1 : w_2 : w_3 : w_4 = \frac{1}{h_1} : \frac{1}{|h_1 - \Delta d|} : \frac{1}{|h_1 - 2\Delta d|} : \frac{1}{|h_1 - 3\Delta d|} \tag{6}$$
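A short sketch of Eqs. (5) and (6) under assumed geometry values (the defaults for dd, D, H and s0 are placeholders, not the experimental settings); a small floor on the point-to-layer distance guards the on-layer case where Eq. (6) diverges.

```python
# Evaluate Eq. (5) (pixels traversed per layer) and Eq. (6) (weight ratio)
# for an object point at depth h1 in a 4-layer stack.
import numpy as np

def pixel_counts_and_weights(h1, dd=1.0, D=0.5, H=10.0, s0=0.01, n_layers=4):
    i = np.arange(1, n_layers + 1)
    dist = np.abs(h1 - (i - 1) * dd)            # point-to-layer distances
    Z = dist * D / ((H + h1) * s0)              # Eq. (5)
    w = 1.0 / np.maximum(dist, 1e-6)            # Eq. (6), guarded at dist = 0
    return Z, w / w.sum()                       # normalized weights, cf. Fig. 3
```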

Fig. 2. Illustration of the relationship between the depth distribution of 3D scenes and the image quality. (a) The number of pixels on each layer involved in the reconstruction of object point O. (b) The number of object points whose reconstruction involves each pixel.

Figure 3 shows how the weight ratio of pixels on different layers changes as the depth of the object point O varies. Four SLMs are located at distances 1, 2, 3 and 4, respectively. It can be seen from Fig. 3 that a pixel closer to the object point has a higher weight. When the object point is located on any of the layers, the weight of the pixel on that layer is almost 1.

Fig. 3. The weights of pixels on different layers as the depth of the object point O varies. Four SLMs are located at distances 1, 2, 3 and 4, respectively.

However, the value of a pixel depends on the light rays passing through that pixel according to Eqs. (1)–(4). As depicted in Fig. 2(b), the number of object points $P_i$ that emit light rays passing through a given pixel on the $i^{th}$ layer can be represented as the following equation:

$$P_i = \frac{|h_1 - (i - 1)\cdot \Delta d|}{[H + (i - 1)\cdot \Delta d] \cdot S_0} \cdot D \tag{7}$$

Since a pixel with a larger $P_i$ is responsible for more object points with low mutual coherence, its value must account for more light rays with different intensities, so it contributes less to each reconstructed object point. Conversely, a pixel with a smaller $P_i$ contributes more to each reconstructed point. Therefore, the contribution of a pixel on the $i^{th}$ layer to the reconstructed object point at depth $h_1$ is inversely proportional to $P_i$, and the effective weight ratio of pixels on different layers for object point O deviates from Eq. (6). The reconstruction quality of the object point O can be expressed as the cumulative contribution of all pixels. Therefore, we set an evaluation index R to characterize the relationship between the depth of point O and the reconstruction accuracy:

$$R = \log_{10} \sum_{i = 1}^{4} (1 + \alpha \cdot w_i / P_i) \tag{8}$$
where $\alpha$ is a constant. Equation (8) is plotted in Fig. 4(a). The physical layers are located at distance = 1, 2, 3, 4, respectively. As presented in Fig. 4(a), an object point closer to a physical layer has better reconstruction accuracy. In particular, the reconstruction accuracy is best when the object point is located on any physical layer. Conversely, the worst reconstruction accuracy is obtained when the object point lies midway between adjacent layers.
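The qualitative shape of Fig. 4(a) follows directly from Eqs. (6)–(8). The sketch below sweeps the point depth with illustrative geometry values and an assumed $\alpha$; R peaks whenever the depth coincides with a layer.

```python
# Reproduce the shape of the reconstruction-accuracy curve R(h) of Fig. 4(a)
# from Eqs. (6)-(8); all parameter values are illustrative assumptions.
import numpy as np

def accuracy_curve(dd=1.0, D=0.5, H=10.0, s0=0.01, alpha=1.0, n_layers=4):
    h = np.linspace(0.0, (n_layers + 1) * dd, 500)[:, None]   # depth sweep
    i = np.arange(1, n_layers + 1)[None, :]
    dist = np.maximum(np.abs(h - (i - 1) * dd), 1e-6)
    w = 1.0 / dist
    w /= w.sum(axis=1, keepdims=True)                         # Eq. (6)
    P = dist / ((H + (i - 1) * dd) * s0) * D                  # Eq. (7)
    R = np.log10(np.sum(1.0 + alpha * w / P, axis=1))         # Eq. (8)
    return h.ravel(), R       # R peaks where h coincides with a layer depth
```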

Fig. 4. (a) The relationship between the depth of the object point O and reconstruction accuracy R. The physical layers are located at distance = 1, 2, 3, 4, respectively. (b) Change of PSNR as the depth of the 2D image varies. (c) The original 2D image. (d)-(h) The reconstructed images when the 2D image is located at distance = 1, 1.2, 1.5, 1.8, 2, respectively. The quality gradually decreases as the 2D image moves away from all physical layers. The best reconstruction quality is obtained when the 2D image is located on a physical layer, such as distance = 1, 2. The worst occurs when the object plane lies midway between adjacent layers, as shown in Fig. 4(f).

To verify this conclusion, a two-dimensional image, shown in Fig. 4(c), is placed at different depths and the light field is reproduced based on the additive CLF. Although we only present the simulation images based on the additive CLF, the conclusion also applies to the multiplicative CLF. When the depth of the image $h_1$ is varied from 0 to 5, the PSNR of the reconstructed light field is as shown in Fig. 4(b). As expected from the proposed theory, the image quality degrades as the 2D image diverges from the physical layers. Figures 4(d)-(h) show the reconstructed images when the 2D image is located at distance = 1, 1.2, 1.5, 1.8, 2, respectively. The quality gradually decreases as the 2D image moves away from all physical layers. The best reconstruction quality is obtained when the object plane is exactly located on a physical layer, as shown in Figs. 4(d) and (h). Conversely, the worst is obtained when the object plane lies midway between adjacent layers, as shown in Fig. 4(f).
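For reference, the PSNR used throughout the paper can be computed directly from its standard definition; the data_range argument (the nominal pixel peak, e.g. 255 or 1.0) is an assumption of this sketch.

```python
# PSNR between a reference light-field view and its reconstruction.
import numpy as np

def psnr(reference, reconstructed, data_range=1.0):
    err = np.asarray(reference, float) - np.asarray(reconstructed, float)
    mse = np.mean(err ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)
```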

In the previous paragraphs, the derivation is based on the assumption that the SLM layers have the same size and therefore the same pixel density. This structure suits larger CLF displays. However, when the CLF display is applied to a near-eye display, the separation between virtual planes can be very large. This leads to a different magnification of the image, and therefore a different pixel density, for each plane. For this structure, we also made a brief analysis, as shown in Fig. 5. A CLF display with 2-layer virtual planes was implemented in the simulation, and the distance between the two adjacent layers was 1 m. The virtual layers are located at distance = 1, 2, respectively, and the virtual plane at distance = 1 has a higher pixel density. We can see that the reconstruction is best when the object is located on a virtual plane. However, the worst results are not obtained at the intermediate depth between adjacent layers. Therefore, a different pixel density for each plane can also influence the image quality. We will conduct more detailed research on the CLF display for near-eye displays in the future.

Fig. 5. (a) The relationship between the depth of the object point O and reconstruction accuracy R when the pixel density differs between planes. The physical layers are located at distance = 1, 2, respectively. (b) Change of PSNR as the depth of the 2D image varies.

3. Improving the quality of light field via offsetting the depth of the reference plane

Unlike the conventional CLF display, which uses a fixed reference plane such as the display volume center, the proposed method automatically calibrates the depth initialization based on the depth distribution feature. The proposed method assumes that the SLM layers of the CLF display have the same size and therefore the same pixel density. When applied to different scenes, this method first detects the dense regions of depth information and then maps these regions as close as possible to the physical layers by offsetting the depth of the reference plane. When the offset reference plane is used to guide the light field optimization, the image quality is noticeably improved.

When reconstructing the light field, the location of the physical layers is fixed while the reference plane can theoretically be located anywhere [31]. By moving the reference plane, the relative location of the physical layers and the object can be changed. To improve the quality of the reconstructed image, we offset the location of the reference plane to minimize the distance between the object and the physical layers. Assume that $\Delta T$ is the offset depth of the reference plane and $h_0$ is the original depth. The depth of the reference plane after the offset is:

$$h_0' = h_0 + \Delta T \tag{9}$$

Firstly, we obtain the depth map with a depth camera. Each pixel represents an object point, and its value is the depth information, a non-negative integer from 0 to M. Then, we transform the depth map into a histogram to summarize the depth distribution. Let $k_v$ denote the number of object points with depth value $v$. As the reference plane is offset, the object points are offset accordingly. The depth of an object point after the offset is:

$$v' = v + \Delta T \tag{10}$$

Since the gap between adjacent layers is a fixed value $\Delta d$, the depths of adjacent layers should satisfy the following equation:

$$d_{i + 1} - d_i = \Delta d \tag{11}$$
where $d_i$ is the depth of the $i^{th}$ layer ($i = 1,2, \cdots ,N$).

The squared Euclidean distance $E_{v'i}$ between the offset object point with depth value $v'$ and the $i^{th}$ layer is:

$$E_{v'i} = |v' - d_i|^2 \tag{12}$$

Since an object point closer to a physical layer has better reconstruction accuracy, we only need to consider the distance from each object point to its nearest physical layer. Therefore, we define a binary matrix W to mark the nearest physical layer for each object point. $W = [w_{v'i}]$ is a binary matrix with M rows and N columns:

$$w_{v'i} = \begin{cases} 1, & E_{v'i} = \min \{ E_{v'1}, E_{v'2}, \cdots, E_{v'N} \} \\ 0, & \text{otherwise} \end{cases} \tag{13}$$
where M is the maximum depth value and N is the number of physical layers. $w_{v'i}$ indicates whether the $i^{th}$ layer is the nearest layer to the object point with value $v'$: $w_{v'i}$ is 1 when the distance from the offset object point with value $v'$ to the $i^{th}$ layer is the shortest, and 0 otherwise. In each row of the matrix W, only one value is 1 and the rest are 0. For example, suppose the depth of an object point is 70, and the depths of the $1^{st}$, $2^{nd}$, $3^{rd}$ and $4^{th}$ layers are 1, 85, 170 and 255, respectively. The distance from the object point to the $2^{nd}$ layer is the shortest, so $w_{(70,1)} = 0$, $w_{(70,2)} = 1$, $w_{(70,3)} = 0$, $w_{(70,4)} = 0$.
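The worked example maps directly to a few lines of code; the layer depths below are taken from the example itself.

```python
# One row of the binary matrix W for the worked example above.
import numpy as np

d = np.array([1, 85, 170, 255])        # layer depths d_i
v_prime = 70                           # offset object-point depth
E = np.abs(v_prime - d) ** 2           # Eq. (12)
w_row = (E == E.min()).astype(int)     # Eq. (13)
print(w_row)                           # -> [0 1 0 0]: the 2nd layer is nearest
```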

To calculate the offset depth of the reference plane, the objective is to minimize the Euclidean distance between each object point and its nearest physical layer, described as follows:

$$\mathop{\mathrm{argmin}}\limits_{\Delta T} \left\| \sum_{v' = \Delta T}^{M + \Delta T} \sum_{i = 1}^{N} w_{v'i} \cdot E_{v'i} \cdot k_v \right\| \tag{14}$$

Because the depth of field (DOF) of the CLF display is twice the distance between the outer layers [31], we use an exhaustive search to achieve the minimum average distance between reconstructed points and the physical depths. First, we move the reference plane in unit steps within the DOF range and calculate the sum of the distances from the object points to their nearest layers for each candidate depth. The depth at which this sum is minimal is the desired depth of the reference plane.
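A minimal sketch of this calibration, assuming integer depth values and a unit-step exhaustive sweep; depth_map, layer_depths and search_range are illustrative names, and the search range should be chosen consistently with the DOF limit above.

```python
# Exhaustive search for the reference-plane offset Delta T, following
# Eqs. (10)-(14): minimize the histogram-weighted squared distance from each
# offset depth value to its nearest physical layer.
import numpy as np

def calibrate_offset(depth_map, layer_depths, search_range):
    M = int(depth_map.max())
    k = np.bincount(depth_map.ravel(), minlength=M + 1)  # histogram k_v
    v = np.arange(M + 1)
    d = np.asarray(layer_depths, float)
    best_dT, best_cost = 0, np.inf
    for dT in search_range:                              # unit-step sweep
        E = ((v + dT)[:, None] - d[None, :]) ** 2        # E_{v'i}, Eqs. (10), (12)
        cost = np.sum(k * E.min(axis=1))                 # Eqs. (13)-(14)
        if cost < best_cost:
            best_cost, best_dT = cost, dT
    return best_dT                                       # then h0' = h0 + Delta T

# e.g. calibrate_offset(depth, layer_depths=[1, 85, 170, 255],
#                       search_range=range(-255, 256))
```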

Then, the depth $h_0'$ of the reference plane after the offset is found by substituting $\Delta T$ into Eq. (9). Therefore, Eqs. (1)–(3) can be modified as follows:

$$x_n = x_o + [(n - 1)\cdot \Delta d - h_0'] \cdot \tan \theta \tag{15}$$
$$l'(x_o,\theta) = \prod_{n = 1}^{N} f_n \{ x_o + [(n - 1)\cdot \Delta d - h_0'] \cdot \tan \theta \} \tag{16}$$
$$l'(x_o,\theta) = \sum_{n = 1}^{N} f_n \{ x_o + [(n - 1)\cdot \Delta d - h_0'] \cdot \tan \theta \} \tag{17}$$

Next, simulations comparing the conventional method, based on a fixed initialization plane, with the proposed method, based on the offset reference plane, were performed. Figure 6(a) shows 5 × 5 parallax images of the 3D scene captured by a camera array; the interval between adjacent parallax images is 5 degrees. The parallax images are used to generate compressed layer images based on the additive CLF. Figure 6(d) shows the relationship between the location of the 3D scene and the physical layers in the conventional algorithm with a fixed reference plane. The physical layers are stacked at 5 cm intervals, and the reference plane of the 3D scene is in the middle of the display volume. The red dotted line represents the reference plane and the blue solid lines represent the physical layers. The four compressed layer images for the conventional method are shown in Fig. 6(f). Figure 6(b) shows the depth information of the 3D scene, which is used to adjust the reference plane to be as close to the physical layers as possible. Figure 6(c) shows the histogram describing the depth distribution. After the proposed method with the offset reference plane was applied, the adjusted depth of the target light field and the calibrated reference plane $h_0' = 5.25$ cm are shown in Fig. 6(e). It can be seen from Fig. 6(e) that the dense regions of objects in the 3D scene are all close to the physical layers. The four compressed layer images for the proposed method are shown in Fig. 6(g). To demonstrate that the offset reference plane is optimal, we calculated the PSNR values when the reference plane is placed at different depths. Figure 7 shows the change of PSNR as the depth of the reference plane varies. As expected, the quality of the reconstructed light field is best when the depth of the reference plane is $h_0' = 5.25$ cm. After the depth map was used to guide the light field optimization, the image quality improved from 27.55 dB to 29.95 dB. The simulations with the different methods are shown in Fig. 8. The first, second and third rows are the reconstructed images simulated at different viewpoints. The red regions, located at the intersection of the red fish and the cyan fish, present significant parallax change. The magenta regions, located on the eyes of the blue fish, show that the image quality with the offset reference plane is much better than the result with a fixed reference plane.

Fig. 6. (a) 5 × 5 parallax images of the 3D scene consisting of three fishes. (b) Depth map of the 3D scene. (c) The depth histogram. (d) The conventional method with a fixed initialization plane. The red dotted line represents the reference plane and the blue solid lines represent the physical layers. (e) The proposed method with the offset reference plane guides the light field optimization. (f) Four compressed layer images with 5 cm intervals for the conventional method. (g) Four compressed layer images with 5 cm intervals for the proposed method.

Fig. 7. The change of PSNR as the depth of the reference plane varies. A negative depth value means that the reference plane is in front of the display, a depth value greater than 15 means that the reference plane is behind the display, and the rest means that it is inside the display volume. The solid red line marks the depth of the reference plane $h_0 = 7.5$ cm used by the conventional method. The solid yellow line marks the depth of the offset reference plane $h_0' = 5.25$ cm after the proposed method was applied. As expected, the quality of the reconstructed light field is best when the depth of the reference plane is $h_0' = 5.25$ cm.

Fig. 8. Simulation images for the conventional method with a fixed reference plane and the proposed method with the offset reference plane. The PSNR value for each image is marked in red letters. After the depth map was used to guide the light field optimization, the average image quality over 25 viewpoints improved from 27.55 dB to 29.95 dB. The red regions, located at the intersection of the red fish and the cyan fish, present significant parallax change. The magenta regions, located on the eyes of the blue fish, show that the image quality with the offset reference plane is much better than the result with a fixed reference plane.

4. Implementation

4.1 Hardware

This section experimentally demonstrates that the proposed method, using the depth distribution to guide the light field optimization, can significantly improve the image quality compared with the conventional method using a fixed depth initialization. We built a prototype consisting of four stacked display panels and a digital micro-mirror device (DMD) [25], as shown in Fig. 9. The DMD supports a 1024 × 768 resolution with a refresh rate of 32.5 kHz. Since each display panel of size 80 cm × 60 cm is composed of four polymer stabilized cholesteric texture (PSCT) shutters of 40 cm × 30 cm, there are stitching gaps that would affect the viewing quality of the image. To avoid this problem, we only use a quarter of each panel to display the image. Therefore, the physical size of the reconstructed image is 40 cm × 30 cm and the resolution is 512 × 384. The PSCT shutter works in two states: a transparent state with applied voltage and a scattering state in the absence of voltage [35]. The distance between adjacent PSCT shutters is 5 cm. When the DMD projects a sequence of decomposed images onto the PSCT shutters and the driving board switches the on/off state of the shutters in synchronization with the image frames, a flicker-free 3D image is reconstructed on these shutters.

Fig. 9. (a) The design diagram of the CLF display with PSCT shutters. (b) The experimental setup.

To demonstrate the proposed method, a camera was placed at the desired viewer positions (directly in front of the display at a distance of 1.7 m) and captured the perspective images. The interval between adjacent viewer positions is 5 degrees. Figure 10 shows the optically reconstructed images of the fish model with the conventional method and the proposed method. The first, second and third rows show photographs from different viewpoints. The PSNR values of the conventional algorithm at the different viewpoints are 12.0 dB, 12.6 dB and 12.1 dB, respectively. With the proposed algorithm, the PSNR values are 13.7 dB, 14.8 dB and 13.8 dB, respectively, an average improvement of 1.8 dB; the improvement is easily confirmed from the enlarged details in Fig. 10. The red regions, located at the intersection of the cyan fish and the yellow fish, present significant parallax change. The cyan region in the first column, located on the blue fish, exhibits noticeable halo artifacts. With the proposed method, the same region in the second column shows a better result and most details are preserved. Thus, the experimental results verify that our approach provides a more distinct and natural scene than the conventional method.

Fig. 10. Perspectives from different directions of the optically reconstructed 3D image for the conventional method and the proposed method. The PSNR value for each image is marked in red letters. The red regions, located at the intersection of the cyan fish and the yellow fish, present significant parallax change. The magenta regions, located on the fins of the blue fish, show that the image quality of the CLF display using the proposed method with the offset reference plane (the second column) is much better than that of the classical method with a fixed reference plane (the first column).

Although the proposed method yields a clear improvement, the PSNR obtained in the experiments is noticeably lower than in the simulations, which is caused by the following factors. Firstly, the camera is difficult to align perfectly with the expected viewpoint. Secondly, the image is distorted due to the characteristics of the PSCT shutters (e.g., different transmittance at different wavelengths and nonlinear changes in transmittance when the state is switched). When the compensation algorithm [26] and the grayscale correction method [36] are adopted, the image distortion will be alleviated. Thirdly, stray light generated inside the display volume reduces the image quality.

5. Limitation and future work

Although the proposed method can significantly improve the reconstruction quality of a static light field, the quality improvement is limited when reconstructing a dynamic light field video, because the depth distribution of the object is likely to change during the video stream. When determining the offset reference plane, the change of depth over the whole sequence must be considered, so the image quality improvement for a video stream may be smaller than for a static light field. In future work, we hope to propose a new method to solve this problem, such as changing the depths of the virtual planes based on the depth distribution feature rather than moving the reference plane.

6. Conclusion

In conclusion, we demonstrated that an object closer to a physical layer has better image quality. Moreover, an efficient algorithm based on the depth distribution feature to guide the light field optimization is introduced to improve the image quality without increasing the number of layers or the refresh rate. To verify the proposed method, a CLF display consisting of four stacked display panels with a 5 cm gap between adjacent layers is implemented. Compared with the classical reconstruction method, the simulations and optical experiments show that the reconstruction quality of the proposed method is improved by 2.4 dB and 1.8 dB, respectively. We hope the proposed method can provide a novel and helpful insight for future 3D displays.

Funding

Anhui Science and Technology Department (201903a05020057); National Natural Science Foundation of China (61805065); Fundamental Research Funds for the Central Universities (JZ2021HGTB0077).

Acknowledgments

The authors thank anonymous reviewers for their thoughtful and helpful comments.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. D. M. Hoffman, A. R. Girshick, K. Akeley, and M. S. Banks, “Vergence-accommodation conflicts hinder visual performance and cause visual fatigue,” J. Vision 8(3), 33 (2008). [CrossRef]  

2. N. Matsuda, A. Fix, and D. Lanman, “Focal surface displays,” ACM Trans. Graph. 36(4), 1–14 (2017).

3. Z.-B. Fan, H.-Y. Qiu, H.-L. Zhang, X.-N. Pang, L.-D. Zhou, L. Liu, H. Ren, Q.-H. Wang, and J.-W. Dong, “A broadband achromatic metalens array for integral imaging in the visible,” Light: Sci. Appl. 8(1), 67 (2019). [CrossRef]  

4. Z. Wang, A. Wang, S. Wang, X. Ma, and H. Ming, “Resolution-enhanced integral imaging using two micro-lens arrays with different focal lengths for capturing and display,” Opt. Express 23(22), 28970–28977 (2015). [CrossRef]  

5. H.-L. Zhang, H. Deng, J.-J. Li, M.-Y. He, D.-H. Li, and Q.-H. Wang, “Integral imaging-based 2D/3D convertible display system by using holographic optical element and polymer dispersed liquid crystal,” Opt. Lett. 44(2), 387–390 (2019). [CrossRef]  

6. Q.-H. Wang, C.-C. Ji, L. Li, and H. Deng, “A dual-view integral imaging 3D display by using orthogonal polarizer array and polarization switcher,” Opt. Express 24(1), 9–16 (2016). [CrossRef]  

7. Y. Xing, Q.-H. Wang, H. Ren, L. Luo, H. Deng, and D.-H. Li, “Optical arbitrary-depth refocusing for large-depth scene in integral imaging display based on reprojected parallax image,” Opt. Commun. 433, 209–214 (2019). [CrossRef]  

8. W.-X. Zhao, Q.-H. Wang, A.-H. Wang, and D.-H. Li, “An autostereoscopic display based on two-layer lenticular lens,” Opt. Lett. 35(24), 4127–4129 (2010). [CrossRef]  

9. Z. Wang, G. Lv, Q. Feng, A. Wang, and H. Ming, “Simple and fast calculation algorithm for computer-generated hologram based on integral imaging using look-up table,” Opt. Express 26(10), 13322–13330 (2018). [CrossRef]  

10. Z. Wang, G. Q. Lv, Q. B. Feng, A. T. Wang, and H. Ming, “Resolution priority holographic stereogram based on integral imaging with enhanced depth range,” Opt. Express 27(3), 2689–2702 (2019). [CrossRef]  

11. Z. Wang, G. Q. Lv, and Q. B. Feng, “Enhanced resolution of holographic stereograms by moving or diffusing a virtual pinhole array,” Opt. Express 28(15), 22755–22766 (2020). [CrossRef]  

12. D. Wang, C. Liu, C. Shen, Y. Xing, and Q.-H. Wang, “Holographic capture and projection system of real object based on tunable zoom lens,” PhotoniX 1(1), 6 (2020). [CrossRef]  

13. G. Wetzstein, D. Lanman, M. Hirsch, W. Heidrich, and R. Raskar, “Compressive Light Field Displays,” IEEE Computer Graphics and Applications 32(5), 6–11 (2012). [CrossRef]  

14. G. Wetzstein, D. Lanman, M. Hirsch, and R. Raskar, “Tensor displays: compressive light field synthesis using multilayer displays with directional backlighting,” ACM Trans. Graph. 31(4), 1–11 (2012). [CrossRef]  

15. G. Wetzstein, D. Lanman, W. Heidrich, and R. Raskar, “Layered 3D: tomographic image synthesis for attenuation-based light field and high dynamic range displays,” ACM Trans. Graph. 30(4), 1–12 (2011). [CrossRef]  

16. F.-C. Huang, K. Chen, and G. Wetzstein, “The light field stereoscope: immersive computer graphics via factored near-eye light field displays with focus cues,” ACM Trans. Graph. 34(4), 1–12 (2015). [CrossRef]  

17. D. Lanman, G. Wetzstein, M. Hirsch, W. Heidrich, and R. Raskar, “Polarization fields: dynamic light field display using multi-layer LCDs,” ACM Trans. Graph. 30(6), 1–10 (2011). [CrossRef]  

18. A. Maimone, G. Wetzstein, M. Hirsch, D. Lanman, R. Raskar, and H. Fuchs, “Focus 3D: Compressive accommodation display,” ACM Trans. Graph. 32(5), 1–13 (2013). [CrossRef]  

19. X. Cao, Z. Geng, M. Zhang, and X. Zhang, “Load-balancing multi-LCD light field display,” Proc. SPIE 9391, 93910F (2015). [CrossRef]  

20. L. Mali, L. Chihao, L. Haifeng, and L. Xu, “Bifocal computational near eye light field displays and Structure parameters determination scheme for bifocal computational display,” Opt. Express 26(4), 4060 (2018). [CrossRef]  

21. T. Zhan, Y.-H. Lee, and S.-T. Wu, “High-resolution additive light field near-eye display by switchable Pancharatnam–Berry phase lenses,” Opt. Express 26(4), 4863–4872 (2018). [CrossRef]  

22. G. Tan, T. Zhan, Y.-H. Lee, J. Xiong, and S.-T. Wu, “Polarization-multiplexed multiplane display,” Opt. Lett. 43(22), 5651–5654 (2018). [CrossRef]  

23. D. Kim, S. Lee, S. Moon, J. Cho, Y. Jo, and B. Lee, “Hybrid multi-layer displays providing accommodation cues,” Opt. Express 26(13), 17170–17184 (2018). [CrossRef]  

24. N.-Y. Jo, H.-G. Lim, S.-K. Lee, Y.-S. Kim, and J.-H. Park, “Depth enhancement of multi-layer light field display using polarization dependent internal reflection,” Opt. Express 21(24), 29628–29636 (2013). [CrossRef]  

25. S. Lee, C. Jang, S. Moon, J. Cho, and B. Lee, “Additive light field displays: realization of augmented reality with holographic optical elements,” ACM Trans. Graph. 35(4), 1–13 (2016). [CrossRef]  

26. L. Zhu, G. Du, G. Lv, Z. Wang, and Q. Feng, “Performance improvement for compressive light field display with multi-plane projection,” Optics and Lasers in Engineering 142, 106609 (2021). [CrossRef]  

27. C.-K. Lee, S. Moon, S. Lee, D. Yoo, J.-Y. Hong, and B. Lee, “Compact three-dimensional head-mounted display system with Savart plate,” Opt. Express 24(17), 19531–19544 (2016). [CrossRef]  

28. Z. Wang, L. M. Zhu, X. Zhang, P. Dai, G. Q. Lv, Q. B. Feng, A. T. Wang, and H. Ming, “Computer-generated photorealistic hologram using ray-wavefront conversion based on the additive compressive light field approach,” Opt. Lett. 45(3), 615–618 (2020). [CrossRef]  

29. D. Chen, X. Sang, X. Yu, X. Zeng, S. Xie, and N. Guo, “Performance improvement of compressive light field display with the viewing-position-dependent weight distribution,” Opt. Express 24(26), 29781–29793 (2016). [CrossRef]  

30. S. Lee, J. Cho, B. Lee, Y. Jo, C. Jang, D. Kim, and B. Lee, “Foveated Retinal Optimization for See-Through Near-Eye Multi-Layer Displays,” IEEE Access 6, 2170–2180 (2018). [CrossRef]  

31. S. Wang, W. Liao, P. Surman, Z. Tu, Y. Zheng, and J. Yuan, “Salience Guided Depth Calibration for Perceptually Optimized Compressive Light Field 3D Display,” 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018, pp. 2031–2040, doi: 10.1109/CVPR.2018.00217.

32. K. Takahashi, Y. Kobayashi, and T. Fujii, “From Focal Stack to Tensor Light-Field Display,” IEEE Trans. on Image Process. 27(9), 4571–4584 (2018). [CrossRef]  

33. K. Maruyama, K. Takahashi, and T. Fujii, “Comparison of Layer Operations and Optimization Methods for Light Field Display,” IEEE Access 8, 38767–38775 (2020). [CrossRef]  

34. T. F. Coleman and Y. Li, “A reflective Newton method for minimizing a quadratic function subject to bounds onsome of the variables,” SIAM J. Optim. 6(4), 1040–1058 (1996). [CrossRef]  

35. H. Lu, J. Zhang, Z. Song, G. Zhang, X. Wang, L. Qiu, and G. Lv, “Submillisecond-response light shutter for solid-state volumetric 3D display based on polymer-stabilized cholesteric texture,” J. Disp. Technol. 10(5), 396–401 (2014). [CrossRef]  

36. Y. Fang, Y. Lu, H. Wu, G. Lv, and Y. Hu, “Study On GrayScale Correction in Solid-State Three Dimensional Volumetric Display,” Acta Optica Sinica 36(11), 1133001 (2016).
