
Fast shadow casting algorithm in analytical polygon-based computer-generated holography

Open Access

Abstract

Shadow casting is essential in computer graphics and can significantly enhance the realism of rendered images. However, shadow casting is rarely studied in polygon-based computer-generated holography (CGH), because state-of-the-art triangle-based occlusion handling methods are too complicated for shadow casting and infeasible for complex mutual occlusion handling. We propose a novel drawing method based on the analytical polygon-based CGH framework and achieve Z-buffer-based occlusion handling instead of the traditional Painter’s algorithm, as well as shadow casting for parallel and point light sources. Our framework generalizes to N-edge polygon (N-gon) rendering and can be accelerated with CUDA hardware, which significantly enhances the rendering speed.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Computer-generated holography (CGH) is one of the most promising techniques among 3D virtual reality (VR) and augmented reality (AR) display systems, since holography can record and reconstruct an arbitrary wavefront without information loss [1,2]. Based on the type of input data, CGH algorithms can be divided into two classes: image-based and model-based. Image-based CGH, also known as RGBD-based (red, green, blue & depth) CGH, converts RGBD images into holograms directly using iterative methods [3], non-iterative methods [4], or learning-based methods [5]. Model-based CGH records 3D models and scenes. It is more complicated than image-based CGH, but it is promising since it can directly reconstruct virtual 3D scenes with full details, which is the ultimate goal of 3D display techniques. Model-based CGH rendering algorithms have been studied for decades, and there are mainly three kinds: point-based, layer-based, and polygon-based [6,7].

(i) The point-based algorithm treats a model as a massive set of illuminating points and superposes the spherical wavefronts of all points [8]. The traditional point-based algorithm directly calculates the wavefront of every point, which is straightforward but computationally inefficient. Lucente improved the point-based algorithm by utilizing look-up tables (LUTs) that store the wavefronts of several assigned depths beforehand [9], which significantly speeds up the computation; however, the computable depth of the model is limited, since otherwise the required memory would be huge. Yang et al. optimized the original LUT method by storing only the radial wavefront pixels, reducing the memory usage [10]. Ito et al. designed a series of FPGA-based hardware accelerators, named HORN, achieving real-time 1080p rendering with enormous numbers of points [11,12]. However, neither these algorithms nor the hardware solves the problem of the limited computable depth range.

(ii) The layer-based algorithm first renders the 2D image of the scene using computer graphics (CG) techniques, then slices the image into several layers according to the depth distribution of the scene. Finally, the hologram is rendered by superposing the wavefronts of all layers propagated to the hologram plane [13]. The layer-based algorithm is fast, and the reconstructed image quality is relatively high. Since state-of-the-art neural image-based CGH algorithms can render high-quality holograms in real time [14,15], the layer-based method is popular nowadays. However, the computable depth of the scene is also limited, since a more extensive depth range means more layers to compute. Moreover, the viewing angle of holograms rendered by layer-based algorithms is narrow, because the gaps between layers become visible if the viewing direction is not parallel to the optical axis [6]. In addition, the resolution of the holograms rendered by neural network-based CGH algorithms is typically the same as that of modern video frames, such as 1920 by 1080 or 3840 by 2160; no training data is available for larger-scale high-definition holograms (e.g., 128 K by 128 K), and only very few computing platforms can train such a huge network.

(iii) Polygon-based algorithms treat a model as multiple polygons, and the final hologram is the superposition of the wavefronts emitted by each polygon [16]. The computable depth of polygon-based algorithms is unlimited, and the computational efficiency depends on the number of rendered polygons and the hologram’s resolution [6]. Moreover, the polygon-based method can render HD holograms via the partitional rendering proposed by Muffoletto et al. and utilized by Matsushima et al. [17,18]. Because polygons (especially triangles) are the most commonly used primitives in CG rendering, rasterization rendering algorithms can be reused in polygon-based CGH, making it one of the candidates for the most practical CGH rendering method in the future.

There are mainly two polygon-based algorithms: the numerical and the analytical methods [19]. The numerical method first calculates the surface function, i.e., the complex amplitude distribution on the polygon, and then rotates and propagates it onto the temporal plane. The temporal plane is located in the middle of the object and reduces the size of the array storing the surface function, since the propagation distance is reduced. After all the polygons are computed, the total object wave is propagated to the hologram plane [16]. The numerical method is complicated, since several fast Fourier transforms (FFTs) and a bicubic interpolation are performed. The analytical method uses the affine relationship between the reference and the target triangles and directly computes the target spectrum using an analytical formula; therefore, it is much faster than the numerical method [20]. Nowadays, speed optimization is the main research topic for the analytical method. Pan et al. and Zhang et al. improved the computational efficiency by simplifying the relationship between the reference and the target triangles into 3D affine transformations [21–23]. Wang et al. utilized LUTs and GPU acceleration and reduced the rendering load by principal component analysis, achieving an extremely high rendering speed [24]. Wang et al. also proposed a wavefront recording plane-like method, scaling down the calculated area and significantly enhancing the rendering speed [25]. However, visual effects (VFX) such as texture mapping, occlusion handling, and shadow rendering are relatively less discussed in the analytical polygon-based method.

Texture mapping is fundamental in traditional CG rendering but complicated in polygon-based CGH rendering. For the numerical method, Matsushima et al. achieved texture mapping by adding texture information during the synthesis of the surface function [26], which is convenient because surface function generation is naturally a sampling process. However, the numerical method’s framework is inefficient due to the time-consuming wavefront rotation [27]. As for the analytical method, texture mapping is challenging since the texture is a posteriori information; therefore, it is impossible to account for the texture when deducing the formula [19]. Lee et al. proposed a method that transforms the texture information into Fourier series coefficients, used as weights in the superposition [28]. The convolution algorithm can be used in this method, but it is unsuitable for UV texture mapping since the textures of different triangles generally differ. Ji et al. proposed the sub-triangle decomposition method, dividing the rendered triangle into four self-similar sub-triangles until the color in each sub-triangle is uniform [29]. This method maintains the framework of the full-analytical method, but it is unsuitable for multi-tonal (grayscale) textures since the colors of the sub-triangles are discrete.

Occlusion handling is another obstacle to rendering complex models with the polygon-based method. Concerning the numerical method, Matsushima et al. proposed the silhouette method, which generates a mask of the rendered polygon to occlude the previously calculated light field, and later the switch-back method based on Babinet’s principle to accelerate it [30]. Liu et al. further optimized the speed using the slice-by-slice silhouette method [31]. However, noise may occur when the hologram is viewed obliquely, because the silhouette is perpendicular to the optical axis. Askari et al. considered the rendered triangle as an aperture [32]: the previously computed spectrum is utilized as the backlight, and the spectrum of the occluded wavefront can be obtained using their previous work [33]. This method is efficient and gives high quality when the hologram is viewed from multiple directions. However, the above methods are all similar to the Painter’s algorithm, i.e., they process the occlusion in depth order, triangle by triangle; they cannot handle complex occlusion situations, e.g., multiple triangles at the same depth occluding each other (see Fig. 1). Besides the Painter’s algorithm, Wang et al. proposed an octree-based method, sorting the triangles into sub-divided octants and storing them in an octree [34]. While processing occlusion, depth calculation is substituted with searching the octree, achieving high speed. However, this method’s precision is determined by the depth of the octree, and it cannot handle several kinds of partial occlusion.

Fig. 1. An example of complex occlusion. The three triangles’ barycenters are on the same depth plane. (a) Front view; (b) side view.

Shadow casting can significantly enhance the realism of a virtual scene by imitating natural light behaviors [35]. However, as far as we know, it has rarely been studied in polygon-based CGH, because the occlusion algorithms are time-consuming and unsuitable for determining shadow areas in complicated scenes. Ezura et al. proposed a patch projection method that calculates the shadow area by traversing all triangles [36], which is complicated and even infeasible for complex scenes containing multiple objects. We will discuss the infeasibility of triangle-based occlusion handling methods for shadow casting in the following section.

In this paper, we propose a novel semi-analytical method that substitutes a buffer for the analytical formula of the reference triangle while maintaining the rest of the analytical method’s framework. Customizing the reference triangle makes it feasible to use the texture-based Z-buffer algorithm for complex occlusion handling. The Z-buffer algorithm refines the occlusion handling resolution from the triangle level to the fragment level [35], and shadow computation for parallel and point light sources can be achieved. With our proposed N-gon rendering and CUDA acceleration, the rendering speed can exceed 2500 triangles per second with occlusion handling and shadow casting.

2. Methods

2.1 Traditional analytical polygon-based CGH rendering

Figure 2 illustrates the computation framework of the analytical method, where two coordinate systems are defined: (i) the global coordinate, where the objects are located; (ii) the reference coordinate, which contains the reference triangle. We suppose the illuminating wavefront is a planar wave $u({\mathbf r}) = a\exp [j2\pi {{\mathbf k}^T}{\mathbf r}]$, where a is the amplitude, j is the imaginary unit, ${\mathbf k} = {[{k_x},{k_y},{k_z}]^T}$ is the wave vector with ${k_z} = \sqrt {{\lambda ^{ - 2}} - k_x^2 - k_y^2} $, $\lambda $ is the wavelength of the illuminating wavefront, and ${\mathbf r} = {[x,y,z]^T}$ is a position vector in the global space. First, we compute the $3 \times 3$ rotation matrix R and the $3 \times 1$ shift vector c, which satisfy ${{\mathbf r}_l} = {[{x_l},{y_l},{z_l}]^T} = {\mathbf {Rr}} + {\mathbf c}$, transforming the rendered triangle onto the xy-plane of the reference coordinate with one of its vertices coinciding with the origin. Then, we compute the $2 \times 2$ affine transformation matrix A, which satisfies $\mathbf{r}_s = [x_s, y_s]^T = \mathbf{A}[x_l, y_l]^T$. Using R, c and A, an arbitrary triangle can be transformed into the reference triangle. The same affine relationship also holds between the global frequency vector ${\mathbf f} = {[{f_x},{f_y},{f_z}]^T}$ and the reference frequency vector ${{\mathbf f}_l} = {[{f_{xl}},{f_{yl}},{f_{zl}}]^T}$ in the frequency domain, i.e., ${{\mathbf f}_l} = {\mathbf {Rf}}$. The relationship between the rendered and the reference triangles is given by Eqs. (1) and (2) [32].

$$G({{\mathbf f}_{xy}};a,{{\mathbf k}_{xy}}) = B({{\mathbf f}_{xy}};{{\mathbf k}_{xy}})\,a\exp ({j2\pi {{\mathbf k}^T}{{\mathbf r}_o}} )\frac{{\exp ({j\pi {\mathbf f}_l^T{\mathbf c}} )}}{{\det ({\mathbf A})}}\frac{{{f_{zl}}}}{{{f_z}}},$$
$$B({{\mathbf f}_{xy}};{{\mathbf k}_{xy}}) = {G_r}\left( {{{({{\mathbf A}^T})}^{ - 1}}\begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0 \end{bmatrix}{\mathbf R}({\mathbf f} - {\mathbf k})} \right),$$
where ${G_r}$ is the spectrum of the reference triangle, det(A) is the determinant of A, ${{\mathbf f}_{xy}} = {[{f_x},{f_y}]^T}$, ${{\mathbf k}_{xy}} = {[{k_x},{k_y}]^T}$, and ${{\mathbf r}_o}$ is the position of the vertex that will coincide with the origin of the reference coordinate. ${G_r}$ can be solved directly since the reference triangle is explicitly defined, customarily as in Eq. (3).
$$g_r(s,t) = \begin{cases} 1, & s \in [0,1],\; t \in [0,1],\; s > t\\ 0, & \text{otherwise} \end{cases}$$
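For concreteness, the sketch below shows how R, c and A might be assembled with Eigen (the math library our implementation uses). The vertex-to-vertex correspondence with the reference triangle of Section 2.2 and all function names are illustrative assumptions, not our exact implementation.

```cpp
#include <Eigen/Dense>
#include <iostream>

// Sketch: build R, c and A of Eqs. (1)-(2) for one rendered triangle,
// assuming v1 maps to (0,1) and v2 to (1,1) of the reference triangle.
struct TriangleTransform {
    Eigen::Matrix3d R;  // rotation: global -> reference coordinate
    Eigen::Vector3d c;  // shift so that v0 lands on the origin
    Eigen::Matrix2d A;  // in-plane affine map onto the reference triangle
};

TriangleTransform buildTransform(const Eigen::Vector3d& v0,
                                 const Eigen::Vector3d& v1,
                                 const Eigen::Vector3d& v2) {
    TriangleTransform T;
    // Orthonormal basis with the triangle normal as the new z-axis, so the
    // rotated triangle lies in the plane z_l = 0.
    Eigen::Vector3d e1 = (v1 - v0).normalized();
    Eigen::Vector3d n  = (v1 - v0).cross(v2 - v0).normalized();
    Eigen::Vector3d e2 = n.cross(e1);
    T.R.row(0) = e1.transpose();
    T.R.row(1) = e2.transpose();
    T.R.row(2) = n.transpose();
    T.c = -T.R * v0;                     // r_l = R r + c maps v0 to (0,0,0)

    // In-plane 2D coordinates of v1 and v2 in the reference coordinate.
    Eigen::Vector3d p1 = T.R * v1 + T.c;
    Eigen::Vector3d p2 = T.R * v2 + T.c;

    // Solve A [p1 p2] = [q1 q2] with q1 = (0,1), q2 = (1,1).
    Eigen::Matrix2d P, Q;
    P << p1.head<2>(), p2.head<2>();
    Q << 0, 1,
         1, 1;
    T.A = Q * P.inverse();
    return T;
}

int main() {
    Eigen::Vector3d v0(0, 0, 0.200), v1(0.001, 0, 0.2005), v2(0.001, 0.001, 0.200);
    TriangleTransform T = buildTransform(v0, v1, v2);
    std::cout << "det(A) = " << T.A.determinant() << "\n";  // used in Eq. (1)
}
```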

Fig. 2. (a) The relationship between the reference and the target triangles. The rendered triangle (blue) is first rotated and shifted by R and c into the reference coordinate (yellow), then affine transformed by A to coincide with the reference triangle (red). (b) The Phong illumination model. The illuminating direction l and the viewing direction v are drawn reversed to match intuition.

It is feasible to synthesize continuous and specular shading information into the analytical formula by changing the vertex properties and solving the integral in Eqs. (4) and (5). Here, the Phong illumination model is incorporated, as in Eq. (5) (illustrated in Fig. 2(b)) [37,38].

$${G_r}({{\mathbf f}_{l,xy}}) = \int\!\!\!\int {{g_r}(s,t)\exp [{ - j2\pi ({f_{xl}}s + {f_{yl}}t)} ]\,\textrm{d}s\,\textrm{d}t}, $$
$${g_r}(s,t) = {g_a} + {\mathbf n}{(s,t)^T}{\mathbf l}(s,t) + {a_s}{[{{{\mathbf r}_{reflect}}{{(s,t)}^T}{\mathbf v}(s,t)} ]^{{N_s}}}, $$
where ${{\mathbf f}_{l,xy}} = {[{f_{xl}},{f_{yl}}]^T}$, n(s, t) is the normal vector at (s, t) in the reference coordinate, obtained by interpolation, v(s, t) is the inverse viewing direction vector, l(s, t) is the inverse illuminating direction vector, ${g_a}$ is the ambient light’s intensity, ${a_s}$ is the specular intensity, ${{\mathbf r}_{reflect}}(s,t)$ is the reflecting direction at (s, t), and ${N_s}$ is the shininess coefficient. According to Eqs. (1) and (2), it is difficult to synthesize texture information into the formula, since the texture makes the integral unsolvable. Our drawing framework, introduced in the following subsections, solves this problem.

2.2 Proposed semi-analytical framework

In Eq. (2), the Fourier spectrum of the reference triangle ${G_r}$ can be calculated by solving the integral in Eq. (4); however, adding texture information to the formula is complicated. Our drawing framework is illustrated in Fig. 3 [39]. We first draw a reference triangle in a buffer array, named the reference buffer, whose pixel pitch equals that of the spatial light modulator (SLM) used in the optical reconstruction. The vertices of the reference triangle are (0, 0), (0, 1) and (1, 1) (unit: mm). The reference buffer is then Fourier transformed to compute the reference spectrum ${G_r}$, and the spectrum B is obtained after determining the sampling grid from the relationship in Eq. (2). The final result is the superposition of all rendered triangles’ spectra. It should be noted that the reference triangle can be set arbitrarily in our proposed method; however, it is more convenient to keep the setting of the classic analytical method, since fixed data can easily be hard-coded.
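A condensed sketch of this drawing stage follows, assuming FFTW for the transform; the buffer size, the rasterization loop, and all names are illustrative rather than our exact implementation.

```cpp
#include <complex>
#include <vector>
#include <fftw3.h>

// Sketch of the drawing stage of Fig. 3: rasterize the reference triangle
// into the reference buffer, then FFT it to get the spectrum G_r that
// Eq. (2) samples for each rendered triangle.
int main() {
    const int N = 1024;  // reference buffer resolution (illustrative)
    std::vector<std::complex<double>> buf(N * N, {0.0, 0.0});

    // Rasterize g_r of Eq. (3): 1 inside the reference triangle, 0 outside.
    // Shading (Eq. (5)), texture colors, and depth tests plug in here.
    for (int y = 0; y < N; ++y)
        for (int x = 0; x < N; ++x) {
            double s = (x + 0.5) / N, t = (y + 0.5) / N;
            if (s > t) buf[y * N + x] = {1.0, 0.0};
        }

    // Reference spectrum G_r = FFT(reference buffer).
    std::vector<std::complex<double>> Gr(N * N);
    fftw_plan plan = fftw_plan_dft_2d(
        N, N,
        reinterpret_cast<fftw_complex*>(buf.data()),
        reinterpret_cast<fftw_complex*>(Gr.data()),
        FFTW_FORWARD, FFTW_ESTIMATE);
    fftw_execute(plan);
    fftw_destroy_plan(plan);

    // Eq. (2) then samples Gr at (A^T)^{-1} [I|0] R (f - k) for every
    // hologram frequency; the per-triangle spectra are superposed and one
    // inverse FFT yields the final wavefront.
}
```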

Fig. 3. The framework of the proposed semi-analytical drawing method. First, the reference triangle is drawn in the reference buffer. Then the reference spectrum is obtained by a Fourier transform. Sampling the reference spectrum yields the rendered triangle’s spectrum, and finally the wavefront of all target triangles is obtained by an inverse Fourier transform. The second row shows the process with texture mapping: the color of each pixel is multiplied by the color sampled from the texture image. These intermediate images are output by OpenCV, whose image y-axis is opposite to the mathematical definition.

Several primary shading effects can be achieved in the drawing stage, e.g., continuous shading and specular shading, which were studied by Park et al. and Shimobaba et al. within the full-analytical framework [37,38]. Besides shading, texture mapping is achieved naturally, since sampling already exists in the drawing stage: the color of a fragment is the illumination calculated by Eq. (5) multiplied by the color sampled from the texture, as shown in the second row of Fig. 3. Moreover, occlusion handling can be performed in this stage, as introduced in the following subsection.

Though we use interpolation in the proposed method, we classify it as semi-analytical rather than numerical. The classic numerical method must first define the shape of the surface function and calculate its exact values; the surface function is then rotated and propagated to the temporal plane in separate steps. Our proposed method keeps the framework of the analytical method, which simplifies the computing process and is more efficient: we merely substitute an FFT and an interpolation for the analytical spectrum formula. Though this adds some complexity, it adds freedom, since texture mapping and shadow casting become achievable. The efficiency of our proposed method is discussed in detail in Section 4.

2.3 Complex occlusion handling using modified Z-buffer algorithm

The occlusion handling methods in polygon-based CGH so far are mainly similar to the Painter’s algorithm: they sort the triangles in depth order before calculation and assume that triangles within a small depth range do not occlude each other, so that the triangles can be divided into groups to reduce redundant computation. However, mutual occlusion of triangles at the same depth is not rare and should not be ignored by a robust algorithm. Since the Z-buffer algorithm is a texture-based occlusion handling algorithm [35], it can be utilized, with modifications, in the drawing stage of our proposed semi-analytical method. As far as we know, the Z-buffer algorithm is rarely used in polygon-based CGH.

The traditional Z-buffer algorithm in CG contains the following steps (a minimal code sketch follows the list):

  • 1. Calculate the depth distribution of the rendered scene, and store it in a 2D buffer (Z-buffer).
  • 2. When shading a fragment, interpolate its depth using the triangle’s three vertices.
  • 3. Compare the interpolated depth value (${z_i}$) with the value of the corresponding position in the Z-buffer (${z_b}$). If ${z_i} \le {z_b}$, shade this fragment. Otherwise, discard this fragment.
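In code, the classic test reduces to a per-fragment compare-and-store; the sketch below is the standard CG formulation, not yet our modified version.

```cpp
#include <limits>
#include <vector>

// Classic Z-buffer test: keep a fragment only if it is nearer than
// everything drawn at that pixel so far.
struct ZBuffer {
    int W, H;
    std::vector<float> z;  // nearest depth seen so far, per pixel
    ZBuffer(int w, int h)
        : W(w), H(h), z(w * h, std::numeric_limits<float>::infinity()) {}

    // zi is the fragment depth interpolated from the triangle's vertices.
    bool testAndSet(int x, int y, float zi) {
        float& zb = z[y * W + x];
        if (zi <= zb) { zb = zi; return true; }  // shade this fragment
        return false;                            // discard this fragment
    }
};
```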

The Z-buffer is an efficient rasterization algorithm that refines the minimal occlusion-handling unit from an object to a fragment. However, it cannot be used directly in our proposed method. In CG rendering, the Z-buffer is aligned with the rendered scene, but here the reference and the rendered triangles live in different spaces, so a rendered fragment’s positions in the reference space and the global space differ. Therefore, the interpolated depth value cannot be used for the depth comparison; otherwise, sampling aliasing errors occur (see Fig. 4(a)). To solve this problem, we modified the second step:

Fig. 4. Simulations of different versions of our Z-buffer modification. (a) The original Z-buffer, with severe aliasing errors. (b) The first modification: seams occur. (c) The final modification: the correct result.

(Modified) 2. Calculate the depth distribution of the rendered triangle alone and store it in another buffer (Z-tmp). When shading a fragment, interpolate the UV coordinate of this fragment, then sample the depth value from Z-tmp using the UV coordinate.

After this modification, the depth distributions of the rendered triangle in the reference and the global spaces are aligned, and the aliasing error is eliminated in the final hologram. However, when calculating the edge fragments of a triangle, the sampled depth value may be infinity, since Z-tmp only contains the depth distribution of the rendered triangle and its background is initially set to infinity. Seams will then occur (see Fig. 4(b)); these differ from the dark lines caused by the mismatch of each triangle’s initial phase [40]. To fix this problem, we further modified the second step of the Z-buffer algorithm:

(Further Modified) 2. Copy Z-buffer to Z-tmp. Calculate the depth distribution of the rendered triangle and overwrite Z-tmp with new values. When shading a fragment, interpolate the UV coordinate of this fragment, then sample the depth value from Z-tmp using the UV coordinate.
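A sketch of the resulting test is given below, with hypothetical names. Both buffers live in the global screen space; the fragment being shaded in the reference buffer addresses them through its interpolated UV coordinate.

```cpp
#include <algorithm>
#include <vector>

struct Buffer2D {
    int W, H;
    std::vector<float> v;
    float sample(float u, float t) const {  // nearest-pixel fetch by UV
        int x = std::clamp(static_cast<int>(u * W), 0, W - 1);
        int y = std::clamp(static_cast<int>(t * H), 0, H - 1);
        return v[y * W + x];
    }
};

// zTmp starts as a copy of the scene Z-buffer and is then overwritten
// with the rendered triangle's own depth distribution (not shown).
// Because both depths are fetched through the same UV, the reference
// and global spaces stay aligned, avoiding the aliasing and the seams.
bool fragmentVisible(const Buffer2D& zBuffer, const Buffer2D& zTmp,
                     float u, float t) {
    float zi = zTmp.sample(u, t);     // this triangle's depth at the fragment
    float zb = zBuffer.sample(u, t);  // whole scene's depth at the same spot
    return zi <= zb;                  // shade if not occluded
}
```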

The whole depth-testing process is summarized in Fig. 5. After this modification, the seams are removed from the final result (see Fig. 4(c)). Our modified Z-buffer algorithm can handle complex occlusion problems such as the end-to-end overlapping illustrated in Fig. 1; the results are presented in the next section. It should be noted that in CG rasterization rendering, the grid is often shifted by half a pixel in both dimensions to enhance the sampling precision [41]. However, we have already defined the origin of the reference coordinate at the center of a pixel, so this half-pixel shift is not needed in our method.

Fig. 5. The modified Z-buffer depth-testing process. The total Z-buffer is first calculated, and the depth distribution of the rendered triangle is then drawn into Z-tmp. After comparing the depth values at the corresponding positions in the Z-buffer and Z-tmp, the fragment is either rendered or discarded. Note that the reference triangle here is filled with 1 to show its shape for demonstration.

2.4 Shadow casting in CGH

A shadow is the area occluded from the view of a light source. Most occlusion handling methods nowadays are triangle-based; with such methods, a trivial shadow-casting procedure can be used (${T_i}$ is the i-th rendered triangle, and T is the number of rendered triangles):

  • 1. Project ${T_i}$ onto the first triangle’s plane, orthographically or perspectively depending on the light source type.
  • 2. Remove the occluded part of the triangle in the global space.
  • 3. Repeat the above procedure for the previous $i - 1$ triangles.

There are several problems with this method. First, the computational load is tremendous: it requires $(T - 1)T/2$ occlusion calculations for a single light source and T triangles, which far exceeds the number of rendered triangles. Second, determining the occluded part is complicated. For instance, if the rendering order is random and a nearer triangle is rendered earlier, a farther triangle’s projection will wrongly land on the nearer triangle’s plane, causing an incorrect shadow-casting result (see Fig. 6(a)). Third, if a triangle is larger than the hologram, the overflowing part will invade the hologram because of the periodic property of the FFT. Therefore, if the projected triangle is too large or too slender, part of the hologram will be corrupted by the invasion from the neighboring period (see Fig. 6(b)). Truncating the projected triangle might fix this; however, the intersected part may be a quadrilateral, a pentagon, or a more complex polygon, requiring even more triangle-based occlusion calculations.

Fig. 6. Problems in triangle-based shadow casting. (a) A random rendering order may cause wrong results: if a nearer triangle is rendered earlier, a farther triangle rendered later will cast its shadow area wrongly onto the nearer plane (red); the green projection area is correct. (b) An oversized projection (red) may damage the wavefront: because of the periodic property of the FFT, the exceeding area invades the neighboring period, damaging the previously rendered light field (green).

On the other hand, the modified Z-buffer algorithm can also be utilized for shadow casting (see Fig. 7). First, we transform the vertices from the global space to the light source’s space using affine transformation matrices. For parallel light sources, e.g., sunlight, a “lookat” matrix ${{\mathbf M}_{LookAt}}$, an orthographic projection matrix ${{\mathbf M}_{ortho}}$, and a viewport matrix ${{\mathbf M}_{view}}$ are required, defined in Eq. (6) [42]. The functions of these matrices are illustrated in Fig. 7(a),

$$\mathbf{M}_{LookAt} = \begin{bmatrix} \mathbf{r}_x & \mathbf{r}_y & \mathbf{r}_z & 0\\ \mathbf{u}_x & \mathbf{u}_y & \mathbf{u}_z & 0\\ \mathbf{g}_x & \mathbf{g}_y & \mathbf{g}_z & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 & -\mathbf{p}_x\\ 0 & 1 & 0 & -\mathbf{p}_y\\ 0 & 0 & 1 & -\mathbf{p}_z\\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad \mathbf{M}_{ortho} = \begin{bmatrix} \frac{2}{r - l} & 0 & 0 & -\frac{r + l}{r - l}\\ 0 & \frac{2}{t - b} & 0 & -\frac{t + b}{t - b}\\ 0 & 0 & \frac{2}{n - f} & -\frac{n + f}{n - f}\\ 0 & 0 & 0 & 1 \end{bmatrix},$$
$$\mathbf{M}_{view} = \begin{bmatrix} \frac{W}{2} & 0 & 0 & \frac{W - 1}{2}\\ 0 & \frac{H}{2} & 0 & \frac{H - 1}{2}\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad \mathbf{M}_{pers} = \begin{bmatrix} n & 0 & 0 & 0\\ 0 & n & 0 & 0\\ 0 & 0 & n + f & -fn\\ 0 & 0 & 1 & 0 \end{bmatrix}$$
where g is the inverse viewing direction of the light source, r is the right direction, typically defined as ${\mathbf r} = \frac{{{{\mathbf e}_y} \times {\mathbf g}}}{{||{{{\mathbf e}_y} \times {\mathbf g}} ||}}$ (${{\mathbf e}_y} = {[0,1,0]^T}$), u is the up direction, defined as ${\mathbf u} = \frac{{{\mathbf g} \times {\mathbf r}}}{{||{{\mathbf g} \times {\mathbf r}} ||}}$, and p is the position of the light source. In ${{\mathbf M}_{ortho}}$, r, l, t, b, n, and f are the right, left, top, bottom, near, and far boundaries of the rendered space; n and f are defined in the occlusion handling stage, with $n ={-} 0.1$ and $f ={-} 100$ as typical values. In ${{\mathbf M}_{view}}$, H and W are the resolution of the hologram. The total transformation matrix is ${{\mathbf M}_{trans}} = {{\mathbf M}_{view}}{{\mathbf M}_{ortho}}{{\mathbf M}_{LookAt}}$. After all the vertices of the rendered triangles are transformed, the Z-buffer from the view of the parallel light source can be calculated, which we name the “shadow texture.” In the drawing stage of our proposed method, we render the individual shadow texture of the rendered triangle, i.e., the Z-tmp from the light source’s view, and compare the depth values of each pixel in the whole and the individual shadow textures to judge whether the pixel is in the shadow area, similar to the occlusion handling process. For point light sources, e.g., light bulbs, a perspective projection matrix ${{\mathbf M}_{pers}}$ is added to ${{\mathbf M}_{trans}}$, i.e., ${{\mathbf M}_{trans}} = {{\mathbf M}_{view}}{{\mathbf M}_{ortho}}{{\mathbf M}_{pers}}{{\mathbf M}_{LookAt}}$, and the other processes remain unchanged. After traversing all the light sources, the shadow areas are cast correctly. Figure 8 illustrates examples of the shadow casting results.
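The matrices of Eq. (6) translate directly into code. The sketch below uses Eigen, assumes right-handed coordinates as in the text, and its function names are illustrative.

```cpp
#include <Eigen/Dense>

Eigen::Matrix4d lookAt(const Eigen::Vector3d& g,    // inverse light direction
                       const Eigen::Vector3d& p) {  // light source position
    Eigen::Vector3d ey(0, 1, 0);
    Eigen::Vector3d r = ey.cross(g).normalized();
    Eigen::Vector3d u = g.cross(r).normalized();
    Eigen::Matrix4d M = Eigen::Matrix4d::Identity();
    M.block<1, 3>(0, 0) = r.transpose();
    M.block<1, 3>(1, 0) = u.transpose();
    M.block<1, 3>(2, 0) = g.transpose();
    M.block<3, 1>(0, 3) = M.block<3, 3>(0, 0) * (-p);  // rotate, then shift
    return M;
}

Eigen::Matrix4d ortho(double l, double r, double b, double t,
                      double n, double f) {
    Eigen::Matrix4d M = Eigen::Matrix4d::Identity();
    M(0, 0) = 2 / (r - l); M(0, 3) = -(r + l) / (r - l);
    M(1, 1) = 2 / (t - b); M(1, 3) = -(t + b) / (t - b);
    M(2, 2) = 2 / (n - f); M(2, 3) = -(n + f) / (n - f);
    return M;
}

Eigen::Matrix4d viewport(int W, int H) {
    Eigen::Matrix4d M = Eigen::Matrix4d::Identity();
    M(0, 0) = W / 2.0; M(0, 3) = (W - 1) / 2.0;
    M(1, 1) = H / 2.0; M(1, 3) = (H - 1) / 2.0;
    return M;
}

Eigen::Matrix4d perspective(double n, double f) {  // point sources only
    Eigen::Matrix4d M = Eigen::Matrix4d::Zero();
    M(0, 0) = n; M(1, 1) = n;
    M(2, 2) = n + f; M(2, 3) = -f * n;
    M(3, 2) = 1;
    return M;
}

// Parallel source: M_trans = viewport(W,H) * ortho(...) * lookAt(g, p)
// Point source:    M_trans = viewport(W,H) * ortho(...) * perspective(n, f) * lookAt(g, p)
```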

Fig. 7. Occlusion-based shadow-casting algorithm. (a) Functions of the transformation matrices, taking the yz-plane as an example: y, z are the y- and z-axes of the global coordinate, and ${y_c},\textrm{ }{z_c}$ are the corresponding axes in the camera’s view, defined in a right-handed coordinate system. (b) Shadow casting for a parallel light source. (c) Shadow casting for a point light source. The occluded part is determined from the viewer’s view (front view), while the shadow area is determined from the light source’s view; orthographic and perspective transformations are used for the parallel and the point light sources, respectively.

Fig. 8. The two reference triangles of the background rectangle of “Overlap” in Fig. 13, after texture mapping, occlusion handling and shadow casting.

It should be noted that the y-axis of the light source’s view should be flipped when using OpenCV because of its coordinate definition; otherwise, sampling errors may occur. Note also that the matrices in Eq. (6) are defined in a right-handed coordinate system. If the rendered scene is originally defined in a left-handed coordinate system, the x-axis of the light source’s view should also be flipped to avoid mismatch.
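A minimal sketch of this fix, wrapping OpenCV’s cv::flip (the wrapper name and the flag argument are ours):

```cpp
#include <opencv2/core.hpp>

// Align a shadow texture with the mathematical axis conventions before
// sampling. In cv::flip, flipCode 0 flips around the x-axis (i.e., flips
// the y-axis) and flipCode 1 flips around the y-axis (i.e., flips the x-axis).
void alignShadowTexture(cv::Mat& shadowTex, bool leftHandedScene) {
    cv::flip(shadowTex, shadowTex, 0);       // fix OpenCV's downward y-axis
    if (leftHandedScene)
        cv::flip(shadowTex, shadowTex, 1);   // also flip x for left-handed scenes
}
```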

2.5 Rendering accelerations

A modern polygon mesh usually consists not only of triangles but also of quadrilaterals, because of their excellent properties in interpolation, animation, etc. [43,44]. To render an N-gon using the traditional full-analytical method, or our proposed method as described in the previous subsections, it must first be divided into N − 2 triangles. As shown in Eqs. (1) and (2), once R, c, A, ${{\mathbf r}_0}$ and ${G_r}$ are determined, the result of the sampling is determined. However, when rendering an N-gon, A and ${G_r}$ cannot be determined analytically: no reference N-gon can be set that an affine matrix transforms to coincide with an arbitrary N-gon, since A in Eq. (7) is unsolvable. Here P holds the vertices of the rendered N-gon except the vertex that coincides with the origin (0, 0).

$${\mathbf {AP}} = {\mathbf A}\begin{bmatrix} x_1 & x_2 & x_3\\ y_1 & y_2 & y_3 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 0\\ 0 & 1 & 1 \end{bmatrix}$$

However, it is feasible to render an N-gon directly by modifying our proposed semi-analytical method. Here we take a quadrilateral as an example (see Fig. 9). The rendered quadrilateral is divided into two triangles, T1 and T2. First, R, c, A, and ${{\mathbf r}_0}$ are determined from T1 and the original reference triangle, named RefT1. Next, we transform T2 using R, c, A, and ${{\mathbf r}_0}$, which determines the corresponding “reference triangle” of T2, named RefT2. The reference quadrilateral, i.e., the combination of RefT1 and RefT2, can therefore be drawn in the reference buffer, and ${G_r}$ can be obtained with one FFT. After determining ${G_r}$, the spectrum of the rendered quadrilateral is obtained by sampling ${G_r}$ only once; reducing the number of FFTs and sampling passes enhances the rendering efficiency. It should be noted that, when judging whether two triangles belong to the same polygon, measuring the difference of the triangles’ normals may be applicable, since the sub-triangles of a quadrangle are usually neighbors. However, the precision of this comparison must be high enough to avoid floating-point errors, because the normals of the polygons of a very complex model may be hard to distinguish. To eliminate this problem entirely, a dedicated data structure that stores triangles and quadrangles directly may be a better choice; in this case, the normal of a quadrangle should be the average of its sub-triangles’ normals to suppress the floating-point error.
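A sketch of deriving RefT2 follows, reusing the hypothetical TriangleTransform struct from the sketch in Section 2.1.

```cpp
#include <array>
#include <Eigen/Dense>

// RefT2 is T2 pushed through the maps derived from T1: rotate and shift
// into the reference coordinate, then apply the in-plane affine map A.
std::array<Eigen::Vector2d, 3> referenceTriangleOfT2(
    const TriangleTransform& T,                 // built from T1
    const std::array<Eigen::Vector3d, 3>& t2) { // vertices of T2
    std::array<Eigen::Vector2d, 3> ref;
    for (int i = 0; i < 3; ++i) {
        Eigen::Vector3d rl = T.R * t2[i] + T.c;  // into the reference coordinate
        ref[i] = T.A * rl.head<2>();             // in-plane affine map
    }
    return ref;  // RefT1 + RefT2 are rasterized together, then one FFT
}
```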

Fig. 9. The proposed N-gon rendering method. The quadrilateral is divided into two triangles, T1 and T2. R, c and A are determined utilizing T1 and the regular reference triangle, RefT1. Then by transforming T2 into the reference coordinate using R, c and A, the corresponding reference triangle, RefT2, is determined, and the reference quadrilateral can be drawn in the reference buffer.

Compared to triangle-based rendering, T2 may contain fewer pixels than T1. However, the reference spectrum’s resolution is determined by the total pixel count and the pixel pitch: if T2 is smaller than T1, T2 contains less texture information (illumination, image textures, etc.), and RefT2 needs fewer pixels than RefT1. Figure 10 illustrates the simulation results of both methods; the difference between the two results is negligible.

Fig. 10. Comparison of the (a) triangle-based and (b) quadrilateral-based rendering.

The values of different fragments in the reference buffer, and the sampled values at different positions of the spectrum, are uncorrelated. Therefore, it is feasible to parallelize the drawing framework. Graphics processing units (GPUs) were originally designed for graphics rendering, and many modern GPUs support general-purpose parallel programming. Compute Unified Device Architecture (CUDA) is a high-performance programmable parallel computing platform developed by NVIDIA, and most modern NVIDIA GPUs support it. CUDA provides texture-sampling hardware, which can significantly enhance the speed of sampling textures. In this study, we programmed a CUDA-based software fixed rendering pipeline and utilized the texture memory of the CUDA device to accelerate the sampling process.
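As an illustration of the hardware-sampled step, the hypothetical kernel below fetches the reference spectrum through CUDA texture objects configured for bilinear filtering (cudaFilterModeLinear). The affine frequency mapping of Eq. (2) is condensed to six scalars here, and the phase factors of Eq. (1) are omitted; this is a sketch, not our exact pipeline.

```cuda
#include <cuda_runtime.h>

// One thread per hologram frequency sample: map (x, y) to reference-
// spectrum coordinates via the affine relationship of Eq. (2), then let
// the texture hardware perform the bilinear fetch.
__global__ void sampleSpectrum(cudaTextureObject_t GrRe,   // Re{G_r}
                               cudaTextureObject_t GrIm,   // Im{G_r}
                               float2* out, int W, int H,
                               float m11, float m12, float m21, float m22,
                               float ox, float oy) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= W || y >= H) return;
    float u = m11 * x + m12 * y + ox;   // affine map of the frequency grid
    float v = m21 * x + m22 * y + oy;
    out[y * W + x] = make_float2(tex2D<float>(GrRe, u, v),
                                 tex2D<float>(GrIm, u, v));
}
```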

3. Results

To test the performance of our proposed semi-analytical drawing method, we performed both simulations and optical reconstructions; the results are illustrated in this section. Our rendering engine is programmed in C++ and CUDA C with the Eigen math library, FFTW, and the cuFFT library. It runs on an AMD Ryzen 5 5600X CPU and an NVIDIA GeForce RTX 3090 Ti GPU with 16 GByte memory. The resolution of the rendered holograms is 1024 by 1024 and the pixel pitch is 6.4 µm unless otherwise specified. We built the rendered scenes and rendered the CG benchmarks using the Blender 3D modeling software. The light path used for the optical reconstruction is illustrated in Fig. 11. The laser’s wavelength is 532 nm. The SLM is a CAS Microstar FSLM-2K55-P with FHD resolution. The camera is a Fujifilm X-A7 with a Laowa CF 65 mm F2.8 CA-Dreamer Macro 2X lens. All the results are illustrated in Figs. 12–14.

Fig. 11. Configuration of the optical reconstruction. The objects are set 200 mm away from the SLM.

Fig. 12. The results of complex occlusion handling using the modified Z-buffer method. Since the three triangles cannot be sorted in depth order, the Painter’s algorithm cannot produce the correct result, while the modified Z-buffer algorithm can.

Fig. 13. CG benchmarks and simulation results of our proposed method. The background is set 200 mm from the hologram, and all the foreground objects are set 1 mm farther than the background, i.e., at 201 mm, to make sure the shadow’s details are visible. The details on the background are precise, and the Chinese characters on the “Teapot” are also distinct. All the shadow-casting results are correct compared to the CG benchmarks. Note that the self-shadow casting in “Teapot” is also correct and distinct.

Fig. 14. Optical reconstructions of our proposed method. Better optical results would be achieved if the holograms were loaded on a better SLM, independent of our proposed algorithm.

Figure 12 illustrates the occlusion handling. The traditional Painter’s algorithm cannot handle this mutually occluding situation, but our modified Z-buffer algorithm gives the correct result. It should be noted that a few sparkles may occur because of sampling; however, these flaws do not affect the final hologram in most cases. In the following results, two light sources are set in the scenes: a point light source at (0, 0, 205 mm) and a parallel light source whose illuminating direction is ${[1, - 1, - 1]^T}$. Figure 13 illustrates the CG benchmarks and the simulation results. The image quality is high: the textures are distinct, and no rendering errors such as seams or noise occur. The shadow-casting results are identical to the CG benchmarks, and the edges of the shadow areas are sharp, proving the correctness of our method. The shadow cast by the handle of the “Teapot” is also distinct, proving that self-shadow casting is handled correctly as well.

Figure 14 shows photographs of the optical reconstructions. The optical results are identical to the simulations: the multi-tone wicker image texture in the background can be recognized, and the shadow-casting results are distinct and identical to the simulations. All the results above prove the correctness of our proposed method. However, because of the low resolution and non-configurability of our SLM, the photos are relatively noisy. Better optical results would be achieved with a better SLM, but optical noise suppression is not the main problem of this study, and we will address it in future work.

4. Discussion

4.1 Speed analysis

To measure the rendering speed, we rendered “Statue” and “Rabbit” in Fig. 13, which consist of 10398 and 79230 triangles, respectively. The rendering speed is listed in Fig. 15. The time cost of texture mapping is negligible. After adding the occlusion handling process, the time consumed by our drawing method did not increase much, which is better than the traditional Painter’s algorithm, since the latter needs an additional convolution when occlusion handling is enabled in the analytical method, and every convolution requires 3 FFTs [32]. The time consumed by shadow calculation is non-negligible but acceptable. The accelerations described in Section 2.5 are utilized: our quadrilateral-based rendering is faster than the traditional triangle-based rendering when the models are highly quadrilateralized, and the rendering time is further reduced by N-gon rendering and CUDA acceleration. The maximal rendering speed is around 3000 triangles per second (Tri.s/s) using the triangle-based method and exceeds 5000 Tri.s/s using the quadrilateral-based method without occlusion handling and shadow calculation. The typical speed is around 3600 Tri.s/s with all visual effects and N-gon rendering enabled. We also rendered all the holograms at a resolution of 2048 by 2048; the rendering time is illustrated in Fig. 15(d). The speed decreases proportionally as the resolution of the hologram increases. It should be noted that CUDA’s performance peaks only when the rendering load is large enough, and the speed enhancement of N-gon rendering depends on the degree of quadrilateralization of the models.

Fig. 15. Speed statistics. (a) The full-analytical method and the proposed semi-analytical drawing method. (b) Comparison of the triangle-based and the quadrilateral-based rendering. (c) Results after CUDA acceleration. (d) Results of higher-resolution rendering with CUDA acceleration. Texture mapping, occlusion handling and shadow casting are added successively; log axes are used since the range of rendered triangle counts is vast.

As for comparing the different methods: the traditional full-analytical method needs no FFT for triangle rendering and only 1 inverse FFT (iFFT) to obtain the final wavefront from the spectrum, whereas our proposed method needs 1 FFT and 1 bilinear interpolation for every rendered primitive (i.e., a triangle or a polygon). Therefore, the full-analytical method is theoretically faster than our proposed method. However, the absolute rendering time is determined by the implementations: the original full-analytical program is not open-source and was written in MATLAB, while we wrote the proposed program in C++ with Eigen and FFTW. Therefore, the measured rendering time of our proposed method might be shorter than that of the full-analytical method.

On the other hand, our proposed semi-analytical drawing method is theoretically faster than the classic numerical method, which needs 4 FFTs and 1 bicubic interpolation for every primitive (2 FFTs and 1 bicubic interpolation in the rotation transform and 2 FFTs in the angular spectrum method propagating the rotated wavefront to the temporal plane).

Moreover, the interpolation is not always time-consuming. Equation (8) is the 1D definition of interpolation in the convolution form.

$$f(x) = \sum\limits_{n \in N} {h(x,n)f(n)}$$

Several interpolation algorithms exist, e.g., nearest-neighbor, linear, and cubic interpolation. We first list the definitions of these interpolation methods in 1D form, given as Eqs. (9)–(11), respectively. Here n is the grid point’s index, and a typically equals −0.5.

$$h_{nearest}(x,n) = \begin{cases} 1, & n - \frac{1}{2} \le x < n + \frac{1}{2}\\ 0, & \text{otherwise} \end{cases}$$
$$h_{linear}(x,n) = \begin{cases} 1 - |x - n|, & n - 1 \le x < n + 1\\ 0, & \text{otherwise} \end{cases}$$
$$h_{cubic}(x,n) = \begin{cases} (a + 2){{|x - n|}^3} - (a + 3){{|x - n|}^2} + 1, & |x - n| < 1\\ a{{|x - n|}^3} - 5a{{|x - n|}^2} + 8a|x - n| - 4a, & 1 \le |x - n| < 2\\ 0, & \text{otherwise} \end{cases}$$
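These kernels transcribe directly to code; the functions below are plain transcriptions of Eqs. (9)–(11), with a = −0.5 as in the text.

```cpp
#include <cmath>

double hNearest(double x, double n) {                 // Eq. (9)
    return (x >= n - 0.5 && x < n + 0.5) ? 1.0 : 0.0;
}

double hLinear(double x, double n) {                  // Eq. (10)
    double d = std::fabs(x - n);
    return d < 1.0 ? 1.0 - d : 0.0;
}

double hCubic(double x, double n, double a = -0.5) {  // Eq. (11)
    double d = std::fabs(x - n);
    if (d < 1.0) return (a + 2) * d * d * d - (a + 3) * d * d + 1;
    if (d < 2.0) return a * d * d * d - 5 * a * d * d + 8 * a * d - 4 * a;
    return 0.0;
}
// A 2D kernel is the product of two 1D kernels: h(x, n) * h(y, m).
```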

In the 1D case, nearest-neighbor interpolation samples 1 pixel, linear interpolation samples 2 pixels, and cubic interpolation samples 4 pixels. The 2D versions of the kernels are simply the products of the 1D formulas along both axes. Bilinear interpolation can also be written in matrix form, as in Eq. (12).

$$f(x,y) = \begin{bmatrix} 1 - \delta y & \delta y \end{bmatrix}\begin{bmatrix} p_1 & p_2\\ p_3 & p_4 \end{bmatrix}\begin{bmatrix} 1 - \delta x\\ \delta x \end{bmatrix}$$
where x and y are the coordinates of the sampling point, $p_1$–$p_4$ are the values of the four surrounding pixels, and $\delta x,\textrm{ }\delta y$ are the decimal parts of x and y, respectively.
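Written out as code (a minimal sketch without bounds checks), Eq. (12) makes the multiplication count used in the speed analysis below explicit: two per row mix and two for the column mix, six in total.

```cpp
#include <cmath>
#include <vector>

// Bilinear fetch of a W-wide row-major grid at the fractional position (x, y).
double bilinear(const std::vector<double>& img, int W, double x, double y) {
    int x0 = static_cast<int>(std::floor(x));
    int y0 = static_cast<int>(std::floor(y));
    double dx = x - x0, dy = y - y0;
    double p1 = img[y0 * W + x0],       p2 = img[y0 * W + x0 + 1];
    double p3 = img[(y0 + 1) * W + x0], p4 = img[(y0 + 1) * W + x0 + 1];
    double top = (1 - dx) * p1 + dx * p2;  // mix along x (2 multiplications)
    double bot = (1 - dx) * p3 + dx * p4;  // mix along x (2 multiplications)
    return (1 - dy) * top + dy * bot;      // mix along y (2 multiplications)
}
```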

The time complexity of a 2D FFT is $O({HW{{\log }_2}({HW} )} )$ for an $H \times W$ array. The time complexity of interpolation is $O({\beta HW} )$, where $\beta $ is a constant determined by the interpolation method, since the computational burden is fixed and equal for all pixels. Six multiplications are needed to perform one bilinear interpolation; therefore, we assume $\beta = 6$ for the bilinear interpolation, which is similar in cost to computing one frequency value with the full-analytical formula. The bicubic interpolation needs 16 surrounding pixels and power calculations; we assume $\beta = 64$, since 4 complex multiplications are needed for each of the pixels. Since H and W both equal 1024 in this paper and ${\log _2}1024 = 10$, the bilinear interpolation is faster than the FFT and the bicubic interpolation is much slower. Even for $H = 8192,\textrm{ }W = 4096$, the coefficient of an FFT is ${\log _2}(HW) = 25$, which still makes the FFT faster than a bicubic interpolation. This coincides with the time analysis of Matsushima et al. in [27]. The quality of the different interpolations is discussed in the following subsection.

4.2 Influence of interpolation

One advantage of the traditional full-analytical method is that there is no sampling process; therefore, the spectrum value is exact and the quality of the reconstructions is high. In our method, the interpolation algorithm determines the accuracy of the spectrum. We chose bilinear interpolation in the previous subsection; in this subsection, the influence of different interpolation algorithms is discussed. Note that a 2D interpolation kernel can be treated as the product of two 1D interpolation kernels, so this subsection mainly discusses 1D interpolations, and the conclusions generalize.

The nearest-neighbor interpolation samples the value of the nearest grid point to the interpolating point. Equation (13) is the frequency response of the nearest-neighbor interpolation.

$${H_{nearest}}(\omega ) = \textrm{sinc}\left( {\frac{\omega }{2}} \right)$$

The linear interpolation samples the two grid points surrounding the interpolating point. Equation (14) is the frequency response of the linear interpolation.

$${H_{linear}}(\omega ) = \textrm{sin}{\textrm{c}^2}\left( {\frac{\omega }{2}} \right)$$

The cubic interpolation samples the four grid points surrounding the sampling point. The cubic kernel’s frequency response is Eq. (15), where a typically equals −0.5 [45].

$${H_{cubic}}(\omega ) = \frac{{12}}{{{\omega ^2}}}\left[ {\textrm{sin}{\textrm{c}^2}\left( {\frac{\omega }{2}} \right) - \textrm{sinc}(\omega )} \right] + a\frac{8}{{{\omega ^2}}}[{3\textrm{sin}{\textrm{c}^2}(\omega ) - 2\textrm{sinc}(\omega ) - \textrm{sinc}(2\omega )} ]$$

All the kernel functions and their frequency responses are illustrated in Fig. 16. The nearest-neighbor interpolation is the simplest, the linear interpolation is slightly more complicated, and the cubic interpolation has the highest complexity. In the frequency domain, all three sampling kernels are low-pass filters (see Fig. 16(b)). The frequency response of the nearest-neighbor kernel is a sinc function with many prominent side lobes, meaning that high-frequency noise exceeding the sampling rate of the reference buffer will occur; in the space domain, artifacts from other periods will therefore invade. Both the linear and the cubic filters have smooth side lobes. Though the cubic filter gives a better result, the cost of the linear filter is lower. To determine the influence of the different interpolation algorithms, we rendered a blank cube using the above-mentioned algorithms; the results are illustrated in Fig. 17, and the rendering times are listed in Table 1.

Table 1. Rendering time with different interpolation algorithms (6 triangles rendered; unit: seconds).

Fig. 16. (a) The interpolation kernels, where dx is the pixel pitch of the reference buffer. (b) The frequency responses. The ideal interpolation kernel is a sinc function, illustrated as a reference. The prominent side lobes of the nearest-neighbor kernel’s frequency response are distinct; therefore, the nearest-neighbor kernel gives poor results. The low-pass properties of the linear and the cubic kernels are better than those of the nearest-neighbor kernel.

Fig. 17. Simulations and optical reconstructions of different choices of interpolation algorithms. The bilinear and the bicubic results are almost identical to the full-analytic result, while the quality of the nearest-neighbor result is poor, with visible artifacts.

Compared to the result of the full-analytical method, both the bilinear and the bicubic algorithms give good results, but the nearest-neighbor method induces many artifacts. Moreover, the differences between Fig. 17(c) and (d) are negligible, and CUDA devices provide sampling hardware only for the nearest-neighbor and bilinear modes. Therefore, we chose the bilinear method to balance rendering speed and image quality. In the future, bicubic or other advanced interpolation algorithms may be the better choice for generating higher-quality CGHs, if they are supported by GPUs or other high-performance computing hardware.

5. Conclusion

In this paper, we proposed a semi-analytical drawing framework supporting texture mapping, complex occlusion handling and shadow casting, which are rarely discussed in previous works. We utilized the N-gon rendering method and CUDA to accelerate the rendering process, and illustrated both simulations and optical reconstructions. We also discussed the choice of sampling algorithms, considering the rendering speed and the image quality from the viewpoint of the Fourier domain. Furthermore, our proposed framework can be executed on high-performance servers with multiple GPUs, although only one GPU was utilized in this study; the potential for parallel computing comes from the fact that the primitives’ rendering order can be set arbitrarily, as the rendering of one triangle does not affect the others. It should be noted that since our semi-analytical drawing method is rasterized, the resolution of the triangles may be lower than in the full-analytical method, and sampling errors may occur; floating-point errors also require caution.

In the future, we will further optimize our rendering framework to achieve more parallelization, higher rendering speed, and more visual effects with higher robustness and reconstruction quality. Nowadays, ray-tracing hardware in state-of-the-art GPUs accelerates the ray-tracing process; we hope that our CGH rendering pipeline can likewise be supported by specialized computing hardware in the future.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. C. Chang, K. Bang, G. Wetzstein, B. Lee, and L. Gao, “Toward the next-generation VR/AR optics: a review of holographic near-eye displays from a human-centric perspective,” Optica 7(11), 1563–1578 (2020).

2. M. Yamaguchi, “Light-field and holographic three-dimensional displays [Invited],” J. Opt. Soc. Am. A 33(12), 2348–2364 (2016).

3. R. W. Gerchberg and W. O. Saxton, “A Practical Algorithm for the Determination of Phase from Image and Diffraction Plane Pictures,” Optik 35(2), 237–246 (1972).

4. Y. Qi, C. Chang, and J. Xia, “Speckleless holographic display by complex modulation based on double-phase method,” Opt. Express 24(26), 30368–30378 (2016).

5. D. Yang, W. Seo, H. Yu, S. I. Kim, B. Shin, C. K. Lee, S. Moon, J. An, J. Y. Hong, G. Sung, and H. S. Lee, “Diffraction-engineered holography: Beyond the depth representation limit of holographic displays,” Nat. Commun. 13(1), 6012 (2022).

6. E. Sahin, E. Stoykova, J. Mäkinen, and A. Gotchev, “Computer-Generated Holograms for 3D Imaging,” ACM Computing Surveys 53(2), 1–35 (2021).

7. D. Pi, J. Liu, and Y. Wang, “Review of computer-generated hologram algorithms for color dynamic holographic three-dimensional display,” Light: Sci. Appl. 11(1), 231 (2022).

8. P. W. M. Tsang, T. C. Poon, and Y. M. Wu, “Review of fast methods for point-based computer-generated holography [Invited],” Photonics Res. 6(9), 837–846 (2018).

9. M. E. Lucente, “Interactive computation of holograms using a look-up table,” J. Electron. Imaging 2(1), 28–34 (1993).

10. Z. Yang, Q. Fan, Y. Zhang, J. Liu, and J. Zhou, “A new method for producing computer generated holograms,” J. Opt. 14(9), 095702 (2012).

11. T. Nishitsuji, Y. Yamamoto, T. Sugie, T. Akamatsu, R. Hirayama, H. Nakayama, T. Kakue, T. Shimobaba, and T. Ito, “Special-purpose computer HORN-8 for phase-type electro-holography,” Opt. Express 26(20), 26722–26733 (2018).

12. Y. Yamamoto, T. Shimobaba, and T. Ito, “HORN-9: Special-purpose computer for electroholography with the Hilbert transform,” Opt. Express 30(21), 38115–38127 (2022).

13. M. Bayraktar and M. Ozcan, “Method to calculate the far field of three-dimensional objects for computer-generated holography,” Appl. Opt. 49(24), 4647–4654 (2010).

14. S. Choi, M. Gopakumar, Y. Peng, J. Kim, and G. Wetzstein, “Neural 3D holography,” ACM Trans. Graph. 40(6), 1–12 (2021).

15. Y. F. Peng, S. Choi, N. Padmanaban, and G. Wetzstein, “Neural Holography with Camera-in-the-loop Training,” ACM Trans. Graph. 39(6), 1–14 (2020).

16. K. Matsushima and S. Nakahara, “Extremely high-definition full-parallax computer-generated hologram created by the polygon-based method,” Appl. Opt. 48(34), H54–63 (2009).

17. K. Matsushima and N. Sonobe, “Full-color digitized holography for large-scale holographic 3D imaging of physical and nonphysical objects,” Appl. Opt. 57(1), A150–A156 (2018).

18. R. P. Muffoletto, J. M. Tyler, and J. E. Tohline, “Shifted Fresnel diffraction for computational holography,” Opt. Express 15(9), 5631–5640 (2007).

19. Y. Zhang, H. Fan, F. Wang, X. Gu, X. Qian, and T. C. Poon, “Polygon-based computer-generated holography: a review of fundamentals and recent progress [Invited],” Appl. Opt. 61(5), B363–B374 (2022).

20. L. Ahrenberg, P. Benzie, M. Magnor, and J. Watson, “Computer generated holograms from three dimensional meshes using an analytic light transport model,” Appl. Opt. 47(10), 1567–1574 (2008).

21. Y. Pan, Y. Wang, J. Liu, X. Li, and J. Jia, “Fast polygon-based method for calculating computer-generated holograms in three-dimensional display,” Appl. Opt. 52(1), A290–299 (2013).

22. Y. P. Zhang, F. Wang, T. C. Poon, S. Fan, and W. Xu, “Fast generation of full analytical polygon-based computer-generated holograms,” Opt. Express 26(15), 19206–19224 (2018).

23. H. X. Fan, B. Zhang, Y. P. Zhang, F. Wang, W. L. Qin, Q. Y. Fu, and T. C. Poon, “Fast 3D Analytical Affine Transformation for Polygon-Based Computer-Generated Holograms,” Appl. Sci. 12(14), 6873 (2022).

24. F. Wang, T. Shimobaba, Y. Zhang, T. Kakue, and T. Ito, “Acceleration of polygon-based computer-generated holograms using look-up tables and reduction of the table size via principal component analysis,” Opt. Express 29(22), 35442–35455 (2021).

25. F. Wang, D. Blinder, T. Ito, and T. Shimobaba, “Wavefront recording plane-like method for polygon-based holograms,” Opt. Express 31(2), 1224–1233 (2023).

26. K. Matsushima, M. Nakamura, and S. Nakahara, “Novel Techniques Introduced into Polygon-Based High-Definition CGHs,” in Biomedical Optics and 3-D Imaging, OSA Technical Digest (CD) (Optica Publishing Group, 2010), JMA10.

27. K. Matsushima, “Performance of the Polygon-Source Method for Creating Computer-Generated Holograms of Surface Objects,” in Proceedings of ICO Topical Meeting on Optoinfomatics/Information Photonics 2006, (Springer, St. Petersburg, Russia, 2006), pp. 99–100.

28. W. Lee, D. Im, J. Paek, J. Hahn, and H. Kim, “Semi-analytic texturing algorithm for polygon computer-generated holograms,” Opt. Express 22(25), 31180–31191 (2014).

29. Y. M. Ji, H. Yeom, and J. H. Park, “Efficient texture mapping by adaptive mesh division in mesh-based computer generated hologram,” Opt. Express 24(24), 28154–28169 (2016).

30. K. Matsushima, M. Nakamura, and S. Nakahara, “Silhouette method for hidden surface removal in computer holography and its acceleration using the switch-back technique,” Opt. Express 22(20), 24450–24465 (2014).

31. J. P. Liu and H. K. Liao, “Fast occlusion processing for a polygon-based computer-generated hologram using the slice-by-slice silhouette method,” Appl. Opt. 57(1), A215–A221 (2018).

32. M. Askari, S. B. Kim, K. S. Shin, S. B. Ko, S. H. Kim, D. Y. Park, Y. G. Ju, and J. H. Park, “Occlusion handling using angular spectrum convolution in fully analytical mesh based computer generated hologram,” Opt. Express 25(21), 25867–25878 (2017).

33. H. J. Yeom and J. H. Park, “Calculation of reflectance distribution using angular spectrum convolution in mesh-based computer generated hologram,” Opt. Express 24(17), 19801–19813 (2016).

34. F. Wang, T. Ito, and T. Shimobaba, “High-speed rendering pipeline for polygon-based holograms,” Photonics Res. 11(2), 313–328 (2023).

35. T. Akenine-Möller, E. Haines, N. Hoffman, A. Pesce, M. Iwanicki, and S. Hillaire, Real-Time Rendering (A K Peters/CRC Press, 2018), Chap. 7.

36. K. Ezura, Y. Sakamoto, and Y. Aoki, “Computer-generated holograms considering multiple reflection, transmission, and shadow on object surfaces,” Proc. SPIE 5290, Practical Holography XVIII: Materials and Applications (2004).

37. J. H. Park, S. B. Kim, H. J. Yeom, H. J. Kim, H. Zhang, B. Li, Y. M. Ji, S. H. Kim, and S. B. Ko, “Continuous shading and its fast update in fully analytic triangular-mesh-based computer generated hologram,” Opt. Express 23(26), 33893–33901 (2015).

38. F. Wang, H. Shiomi, T. Ito, T. Kakue, and T. Shimobaba, “Fully analytic shading model with specular reflections for polygon-based hologram,” Opt. Lasers Eng. 160, 107235 (2023).

39. J. Dong, J. Hu, B.-R. Yang, and Z. Qin, “Normal Texture Mapping in Analytical Polygon-Based Computer-Generated Holography,” in Frontiers in Optics + Laser Science 2022 (FIO, LS), Technical Digest Series (Optica Publishing Group, 2022), JW4B.50.

40. J. H. Park, H. J. Yeom, H. J. Kim, H. Zhang, B. Li, Y. M. Ji, and S. H. Kim, “Removal of line artifacts on mesh boundary in computer generated hologram by mesh phase matching,” Opt. Express 23(6), 8006–8013 (2015).

41. S. Marschner and P. Shirley, Fundamentals of Computer Graphics, Fourth Edition (A K Peters/CRC Press, 2016), Chap. 3.

42. S. Marschner and P. Shirley, Fundamentals of Computer Graphics, Fourth Edition (A K Peters/CRC Press, 2016), Chap. 7.

43. X. Fang, H. Bao, Y. Tong, M. Desbrun, and J. Huang, “Quadrangulation through morse-parameterization hybridization,” ACM Trans. Graph. 37(4), 1–15 (2018).

44. N. Pietroni, S. Nuvoli, T. Alderighi, P. Cignoni, and M. Tarini, “Reliable feature-line driven quad-remeshing,” ACM Trans. Graph. 40(4), 1–17 (2021).

45. S. E. Reichenbach and F. Geng, “Two-dimensional cubic convolution,” IEEE Trans. on Image Process. 12(8), 857–865 (2003).

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Figures (17)

Fig. 1. An example of complex occlusion. The three triangles' barycenters lie on the same depth plane. (a) Front view; (b) side view.
Fig. 2. (a) The relationship between the reference and the target triangles. The rendered triangle (blue) is first rotated and shifted by R and c into the reference coordinate system (yellow), then affine-transformed by A so that it coincides with the reference triangle (red). (b) The Phong illumination model. The illumination direction l and the viewing direction v are drawn reversed for intuitive reading.
Fig. 3. The framework of the proposed semi-analytical drawing method. First, the reference triangle is drawn in the reference buffer, and its spectrum is obtained by a Fourier transform. Sampling the reference spectrum yields the rendered triangle's spectrum, and finally the wavefront of all target triangles is obtained by one inverse Fourier transform. The second row shows the process with texture mapping: the color of each pixel is multiplied by the color sampled from the texture image. The intermediate images are output by OpenCV, whose y-axis is opposite to the mathematical convention.
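The pipeline in Fig. 3 is compact enough to sketch numerically. The following Python fragment is a minimal illustration under our own assumptions, not the paper's implementation: all names (ref_buffer, sample_spectrum, the buffer size N, the example matrix A) are ours, nearest-neighbor lookup stands in for the interpolation choices compared later, and the amplitude and phase factors of Eq. (1) are omitted.

```python
import numpy as np

N = 512                                   # reference buffer resolution (assumed)

# 1. Rasterize the reference right triangle g_r(s,t) of Eq. (3): 1 inside, 0 outside.
s, t = np.meshgrid(np.linspace(0, 1, N), np.linspace(0, 1, N), indexing="xy")
ref_buffer = (s > t).astype(np.complex128)

# 2. Reference spectrum via one FFT, reused for every triangle.
ref_spectrum = np.fft.fftshift(np.fft.fft2(ref_buffer))

def sample_spectrum(ref_spectrum, A):
    """Sample the reference spectrum at frequencies mapped through (A^T)^{-1},
    in the spirit of Eq. (2); nearest-neighbor lookup for simplicity."""
    n = ref_spectrum.shape[0]
    f = np.fft.fftshift(np.fft.fftfreq(n))            # ascending frequency axis
    fx, fy = np.meshgrid(f, f, indexing="xy")
    fl = np.linalg.inv(A).T @ np.vstack([fx.ravel(), fy.ravel()])
    ix = np.clip(np.round((fl[0] - f[0]) * n).astype(int), 0, n - 1)
    iy = np.clip(np.round((fl[1] - f[0]) * n).astype(int), 0, n - 1)
    return ref_spectrum[iy, ix].reshape(n, n)

A = np.array([[1.2, 0.3], [0.0, 0.8]])    # example 2x2 affine matrix (hypothetical)
target_spectrum = sample_spectrum(ref_spectrum, A)

# 3. Superpose all triangle spectra, then one inverse FFT yields the wavefront.
wavefront = np.fft.ifft2(np.fft.ifftshift(target_spectrum))
```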
Fig. 4. Simulations of different versions of our Z-buffer modification. (a) The original Z-buffer, with severe aliasing errors. (b) The first modification, in which seams occur. (c) The final modification, giving the correct result.
Fig. 5. The modified Z-buffer depth-testing process. The total Z-buffer is calculated first. The depth distribution of the rendered triangle is then drawn into Z-tmp. After comparing the depth values at corresponding positions in the Z-buffer and Z-tmp, each fragment is either rendered or discarded. Note that the reference triangle here is filled with 1 to show its shape for demonstration.
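A hedged sketch of the depth test in Fig. 5, assuming the per-triangle depth maps have already been rasterized; the names z_buffer, z_tmp, and fragment are ours, and the anti-aliasing and seam fixes illustrated in Fig. 4 are not reproduced here.

```python
import numpy as np

def depth_test(z_buffer, z_tmp, fragment, eps=1e-6):
    """Modified Z-buffer test (sketch).

    z_buffer : global depth map resolved over all triangles (smaller = nearer).
    z_tmp    : depth map of the triangle currently being rendered.
    fragment : complex wavefront contribution of that triangle.
    Returns the fragment with occluded pixels discarded.
    """
    # A pixel survives only where the current triangle is the nearest surface,
    # i.e. its temporary depth matches the winning depth in the global buffer.
    visible = np.abs(z_tmp - z_buffer) < eps
    return np.where(visible, fragment, 0)

# Typical use: compute z_buffer as the elementwise minimum over all triangles'
# z_tmp maps in a first pass, then draw each triangle through depth_test().
```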
Fig. 6. Problems in triangle-based shadow casting. (a) A random rendering order may produce wrong results: if a nearer triangle is rendered earlier, a farther triangle rendered later will cast its shadow onto the nearer plane (red), which is incorrect; the green projection area is correct. (b) An oversized projection (red) may damage the wavefront: because of the periodicity of the FFT, the area exceeding the frame invades the neighboring period and corrupts the previously rendered light field (green).
Fig. 7. Occlusion-based shadow-casting algorithm. (a) Functions of the transform matrices, taking the yz-plane as an example: y and z are axes of the global coordinate system, and $y_c$, $z_c$ are the corresponding axes in the camera's view, all defined in a right-handed coordinate system. (b) Shadow casting for a parallel light source. (c) Shadow casting for a point light source. In occlusion handling, the occluded part is determined from the viewer's view (front view), while the shadow area is determined from the light source's view. Orthographic and perspective transformations are used for the parallel and point light sources, respectively.
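For the parallel-light case in Fig. 7(b), the light's view is obtained by composing the LookAt and orthographic matrices of Eq. (6). Below is a small numpy sketch under that assumption; the eye position, gaze, and up vectors are hypothetical examples, and the sign conventions follow the right-handed definition of Fig. 7(a).

```python
import numpy as np

def look_at(p, gaze, up):
    """M_LookAt of Eq. (6): express world points in the light's (r, u, g) basis at p."""
    g = np.asarray(gaze, float)
    g = g / np.linalg.norm(g)
    r = np.cross(g, up)
    r = r / np.linalg.norm(r)
    u = np.cross(r, g)
    rot = np.eye(4)
    rot[0, :3], rot[1, :3], rot[2, :3] = r, u, g
    trans = np.eye(4)
    trans[:3, 3] = -np.asarray(p, float)
    return rot @ trans

def ortho(l, r, b, t, n, f):
    """M_ortho of Eq. (6): map the view volume [l,r]x[b,t]x[f,n] to [-1,1]^3."""
    m = np.eye(4)
    m[0, 0] = 2 / (r - l); m[0, 3] = -(r + l) / (r - l)
    m[1, 1] = 2 / (t - b); m[1, 3] = -(t + b) / (t - b)
    m[2, 2] = 2 / (n - f); m[2, 3] = -(n + f) / (n - f)
    return m

# Parallel light: transform vertices into the light's view, render a depth map
# there, and mark shadowed fragments by depth comparison (all values hypothetical).
M_light = ortho(-1, 1, -1, 1, -0.1, -10) @ look_at(
    p=[0, 5, 0], gaze=[0, -1, 0], up=[0, 0, 1])
```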
Fig. 8. The two reference triangles of the background rectangle of “Overlap” in Fig. 13, after texture mapping, occlusion handling, and shadow casting.
Fig. 9. The proposed N-gon rendering method. The quadrilateral is divided into two triangles, T1 and T2. R, c, and A are determined from T1 and the regular reference triangle RefT1. T2 is then transformed into the reference coordinate system using R, c, and A, which determines the corresponding reference triangle RefT2, so the reference quadrilateral can be drawn in the reference buffer.
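The step of carrying T2 into the reference coordinate system reduces to a small linear solve once the rigid motion (R, c) has been applied. A sketch in the 2D triangle plane, with hypothetical vertex arrays; the paper's exact factorization in Eq. (7) may differ.

```python
import numpy as np

def affine_from_triangles(src, dst):
    """2x2 linear part A mapping triangle src onto dst via its edge vectors.

    src, dst: 2x3 arrays whose columns are in-plane vertices. Translation is
    assumed to have been removed already by the rigid motion (R, c).
    """
    E = src[:, 1:] - src[:, :1]          # two edge vectors of the source
    W = dst[:, 1:] - dst[:, :1]          # corresponding edges of the target
    return W @ np.linalg.inv(E)

# T1 of the quadrilateral (hypothetical) and the reference triangle of Eq. (7).
T1    = np.array([[0.0, 0.9, 0.2], [0.0, 0.1, 0.8]])
RefT1 = np.array([[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]])
A = affine_from_triangles(T1, RefT1)

# Map the second triangle T2 through the same A to obtain RefT2, so the whole
# reference quadrilateral can be drawn in one reference buffer.
T2 = np.array([[0.0, 0.2, -0.5], [0.0, 0.8, 0.4]])   # hypothetical vertices
RefT2 = A @ (T2 - T1[:, :1]) + RefT1[:, :1]
```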
Fig. 10. Comparison of the (a) triangle-based and (b) quadrilateral-based rendering.
Fig. 11. Configuration of the optical reconstruction. The objects are set 200 mm away from the SLM.
Fig. 12. Results of complex occlusion handling using the modified Z-buffer method. Since the three triangles cannot be sorted in depth order, the Painter's algorithm cannot produce the correct result, whereas the modified Z-buffer algorithm can.
Fig. 13. CG benchmarks and simulation results of the proposed method. The background is set 200 mm from the hologram, and all foreground objects are set 1 mm farther, i.e., 201 mm, so that the shadows' details are visible. The details on the background are precise, and the Chinese characters on the “Teapot” are distinct. All shadow-casting results agree with the CG benchmarks; note that the self-shadowing in “Teapot” is also correct and distinct.
Fig. 14. Optical reconstructions of the proposed method. Better optical results could be obtained with a higher-quality SLM; this is independent of the proposed algorithm.
Fig. 15. Speed statistics. (a) The full-analytical method versus the proposed semi-analytical drawing method. (b) Comparison of triangle-based and quadrilateral-based rendering. (c) Results after CUDA acceleration. (d) Results of higher-resolution rendering with CUDA acceleration. Texture mapping, occlusion handling, and shadow casting are added successively; logarithmic axes are used because the number of rendered triangles spans a wide range.
Fig. 16. (a) The interpolation kernels, where dx is the pixel pitch of the reference buffer. (b) Their frequency responses. The ideal interpolation kernel, a sinc function, is shown as a reference. The frequency response of the nearest-neighbor kernel has prominent side lobes, so it yields poor results; the low-pass behavior of the linear and cubic kernels is much better.
Fig. 17. Simulations and optical reconstructions with different interpolation algorithms. The bilinear and bicubic results are almost identical to the full-analytical result, while the nearest-neighbor result is poor, with visible artifacts.

Tables (1)

Table 1. Rendering time with different interpolation algorithms (six triangles rendered; unit: seconds)

Equations (15)

Equations on this page are rendered with MathJax.

$$G(\mathbf{f}_{xy}; a, \mathbf{k}_{xy}) = B(\mathbf{f}_{xy}; \mathbf{k}_{xy})\, a \exp\!\left(j 2\pi \mathbf{k}^{T}\mathbf{r}_{o}\right) \exp\!\left(j \pi \mathbf{f}_{l}^{T}\mathbf{c}\right) \det(\mathbf{A})\, \frac{f_{z}^{l}}{f_{z}}, \tag{1}$$

$$B(\mathbf{f}_{xy}; \mathbf{k}_{xy}) = G_{r}\!\left( \left(\mathbf{A}^{T}\right)^{-1} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \mathbf{R}\, (\mathbf{f} - \mathbf{k}) \right), \tag{2}$$

$$g_{r}(s, t) = \begin{cases} 1, & s \in [0, 1],\; t \in [0, 1],\; s > t \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

$$G_{r}(\mathbf{f}_{l,xy}) = \iint g_{r}(s, t) \exp\!\left[ j 2\pi \left( f_{x}^{l} s + f_{y}^{l} t \right) \right] \mathrm{d}s\, \mathrm{d}t, \tag{4}$$

$$g_{r}(s, t) = g_{a} + \mathbf{n}(s, t)^{T}\, \mathbf{l}(s, t) + a_{s} \left[ \mathbf{r}_{\mathrm{reflect}}(s, t)^{T}\, \mathbf{v}(s, t) \right]^{N_{s}}, \tag{5}$$

$$M_{\mathrm{LookAt}} = \begin{bmatrix} r_{x} & r_{y} & r_{z} & 0 \\ u_{x} & u_{y} & u_{z} & 0 \\ g_{x} & g_{y} & g_{z} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & -p_{x} \\ 0 & 1 & 0 & -p_{y} \\ 0 & 0 & 1 & -p_{z} \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad M_{\mathrm{ortho}} = \begin{bmatrix} \frac{2}{r-l} & 0 & 0 & -\frac{r+l}{r-l} \\ 0 & \frac{2}{t-b} & 0 & -\frac{t+b}{t-b} \\ 0 & 0 & \frac{2}{n-f} & -\frac{n+f}{n-f} \\ 0 & 0 & 0 & 1 \end{bmatrix},$$
$$M_{\mathrm{view}} = \begin{bmatrix} \frac{W}{2} & 0 & 0 & \frac{W-1}{2} \\ 0 & \frac{H}{2} & 0 & \frac{H-1}{2} \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad M_{\mathrm{pers}} = \begin{bmatrix} n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ 0 & 0 & n+f & -fn \\ 0 & 0 & 1 & 0 \end{bmatrix} \tag{6}$$

$$\mathbf{A}\mathbf{P} = \mathbf{A} \begin{bmatrix} x_{1} & x_{2} & x_{3} \\ y_{1} & y_{2} & y_{3} \end{bmatrix} = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix} \tag{7}$$

$$f(x) = \sum_{n \in N} h(x, n)\, f(n) \tag{8}$$

$$h_{\mathrm{nearest}}(x, n) = \begin{cases} 1, & n - \tfrac{1}{2} \le x < n + \tfrac{1}{2} \\ 0, & \text{otherwise} \end{cases} \tag{9}$$

$$h_{\mathrm{linear}}(x, n) = \begin{cases} 1 - |x - n|, & n - 1 \le x < n + 1 \\ 0, & \text{otherwise} \end{cases} \tag{10}$$

$$h_{\mathrm{cubic}} = \begin{cases} (a+2)|x-n|^{3} - (a+3)|x-n|^{2} + 1, & |x-n| < 1 \\ a|x-n|^{3} - 5a|x-n|^{2} + 8a|x-n| - 4a, & 1 \le |x-n| < 2 \\ 0, & \text{otherwise} \end{cases} \tag{11}$$

$$f(x, y) = \begin{bmatrix} 1-\delta_{y} & \delta_{y} \end{bmatrix} \begin{bmatrix} p_{1} & p_{2} \\ p_{3} & p_{4} \end{bmatrix} \begin{bmatrix} 1-\delta_{x} \\ \delta_{x} \end{bmatrix} \tag{12}$$

$$H_{\mathrm{nearest}}(\omega) = \operatorname{sinc}\!\left(\frac{\omega}{2}\right) \tag{13}$$

$$H_{\mathrm{linear}}(\omega) = \operatorname{sinc}^{2}\!\left(\frac{\omega}{2}\right) \tag{14}$$

$$H_{\mathrm{cubic}}(\omega) = \frac{12}{\omega^{2}}\left[\operatorname{sinc}^{2}\!\left(\frac{\omega}{2}\right) - \operatorname{sinc}(\omega)\right] + a\,\frac{8}{\omega^{2}}\left[3\operatorname{sinc}^{2}(\omega) - 2\operatorname{sinc}(\omega) - \operatorname{sinc}(2\omega)\right] \tag{15}$$
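Equations (8)–(11) translate directly into code. Below is a minimal Python rendition of the three kernels and the 1D resampling sum of Eq. (8); the function names are ours, and a = −0.5 is the conventional choice for the cubic kernel. The 2D bilinear and bicubic forms (Eq. (12)) follow by separability.

```python
import numpy as np

def h_nearest(x):
    """Eq. (9): box kernel (written for n = 0)."""
    return np.where((x >= -0.5) & (x < 0.5), 1.0, 0.0)

def h_linear(x):
    """Eq. (10): triangle (tent) kernel."""
    ax = np.abs(x)
    return np.where(ax < 1.0, 1.0 - ax, 0.0)

def h_cubic(x, a=-0.5):
    """Eq. (11): Keys' cubic convolution kernel."""
    ax = np.abs(x)
    inner = (a + 2) * ax**3 - (a + 3) * ax**2 + 1          # |x| < 1
    outer = a * ax**3 - 5 * a * ax**2 + 8 * a * ax - 4 * a  # 1 <= |x| < 2
    return np.where(ax < 1, inner, np.where(ax < 2, outer, 0.0))

def interpolate(samples, x, kernel, support):
    """Eq. (8): f(x) = sum_n h(x - n) f(n), over integer sample positions n."""
    n0 = int(np.floor(x))
    ns = np.arange(n0 - support + 1, n0 + support + 1)
    ns = ns[(ns >= 0) & (ns < len(samples))]
    return np.sum(kernel(x - ns) * samples[ns])

# Example: resample a coarse signal at a fractional position with the cubic kernel.
f = np.sin(np.linspace(0, 2 * np.pi, 16))
print(interpolate(f, 3.37, h_cubic, support=2))
```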