
Automotive augmented reality 3D head-up display based on light-field rendering with eye-tracking

Open Access

Abstract

We explore the feasibility of implementing stereoscopy-based 3D images with an eye-tracking-based light-field display and actual head-up display optics for automotive applications. We translate the driver’s eye position into the virtual eyebox plane via a “light-weight” equation to replace the actual optics with an effective lens model, and we implement a light-field rendering algorithm using the model-processed eye-tracking data. Furthermore, our experimental results with a prototype closely match our ray-tracing simulations in terms of designed viewing conditions and low-crosstalk margin width. The prototype successfully delivers virtual images with a field of view of 10° × 5° and static crosstalk of <1.5%.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Recent years have witnessed rapid developments in augmented reality (AR) technologies in terms of both software and hardware, particularly in the case of head-mounted displays and mobile devices. In the automotive sector, AR technologies can improve the driver comfort and safety and also act as an infotainment platform when applied in the form of next-generation head-up displays (HUDs).

Cars today are typically equipped with HUDs with a limited horizontal field of view (FoV) of up to 5°. In these systems, the virtual image is implemented as a conventional 2D projection on the car windshield with a virtual image distance of up to 3 m. Even when the size of the virtual image is enlarged, the information typically represented on such HUDs is a replication of the instrument cluster with a limited navigation extension. A recent key improvement in the automotive sector is the development of HUDs with AR support, which affords the ability to merge driving-related information with the actual (“real”) scene. Such information can be obtained from various sensors to help the driver avoid collisions or to provide additional driver-assistance information.

In this regard, many studies have focused on implementing AR HUDs from the perspectives of different design criteria. Thus, the FoV and image brightness can be improved by using microelectromechanical-systems-based laser scanning [1], digital light processing [2], or 2D computer-generated holography (CGH) technologies [3]. AR HUD systems can also be made more compact by applying waveguide [4], holographic optical element [5], or metasurface [6] technologies instead of mirror-based projection. However, the proposed solutions do not address improved matching of AR content with the real scene to avoid visual conflicts. One effective approach to address this issue involves implementing multi-depth images while merging a set of virtual images with the real scene. Several such methods have been implemented to address the problem [7–9]; however, the number of image planes is currently limited and therefore insufficient for natural overlap. Additionally, flickering can occur when mechanical steering is adopted. These disadvantages may be overcome by using varifocal optical elements as described in Refs. [10–12]; however, it is difficult to practically apply these components considering factors such as the required size/dimensions, compatibility with curved windshields, and large eyebox size. In the future, CGH-based fully dynamic holographic 3D displays [13] could be an ultimate solution to address all the cues of natural multi-depth image perception. However, CGH application is complex in terms of both hardware and rendering and further requires additional computational time; thus, this technology is currently impractical for high-speed automotive solutions.

One approach that may bridge the gap between conventional systems and holographic 3D displays is the integration of an autostereoscopic 3D display [14] into the HUD. In essence, stereoscopy works based on binocular disparity: the illusion of depth can be created from two 2D images whose features are slightly offset from each other, and the brain merges these two images into a single 3D perspective within an acceptable depth range [15]. Autostereoscopic 3D displays allow for the formation of several viewing positions at which the viewer can observe valid stereo-pair images. As a drawback, the resolution of the perceived 3D image decreases with the number of viewing points. However, a method to improve pixel-resource utilization for a large number of viewing points has been proposed based on a light-field model [16]. When combined with eye-tracking, direct light-field rendering of valid stereo pairs is possible without a significant loss of 3D image resolution [17]. Although it ultimately exploits binocular disparity, this type of display can be considered a light-field display because the stereo-pair images are assigned to light rays uniformly distributed according to the light-field model. In this regard, Table 1 compares the currently available technologies that can be used as potential car solutions.

Table 1. Head-up display (HUD) technology comparison.

Against this backdrop, we propose the adoption of stereoscopy-based 3D images with continuous depth by integrating a light-field display into the HUD optics. With our approach, such virtual images can be observed with low crosstalk when the HUD optics are matched with the viewing zones formed by the light-field display [18]. First, we derive certain equations to replace the complex optics with an effective lens model, and subsequently, we translate the driver’s actual eye position to the corresponding virtual one. Based on the results of this substitution, we proceed with our lenticular lens design and the corresponding light-field rendering. Next, we use simulations with ray-tracing software and conduct experiments with a developed prototype to demonstrate how closely the simulations and experimental results match in terms of the crosstalk level and low-crosstalk margin based on our initial assumptions.

2. Configuration and operating principle

Figure 1 shows the schematic of our AR 3D HUD with an eye-tracking camera. The camera acquires information on the eye position and tracks it continuously. In front of the driver, the HUD generates a virtual image through the windshield by means of mirror projection optics according to the perceived eye position. The 3D depth is adjusted within a suitable range based on environmental information from external sensors. Therefore, virtual objects are seamlessly integrated into the real world and made visible at various selected depths.

Fig. 1. Schematics of (a) light-field virtual image projection and (b) depth creation of augmented reality (AR) 3D head-up display (HUD) images with eye-tracking camera.

Figure 2(a) shows the principle of light-field rendering with eye-tracking. By placing a lenticular lens with a suitably chosen pitch on the display panel, uniformly directed light-field rays can be generated, with each sub-pixel’s spatial position transformed into a ray direction. Each sub-pixel can be allocated to the left or right eye to form stereo-pair images based on the viewer position obtained from eye-tracking. When the viewer/driver moves their head, the sub-pixel allocation is updated in real time with smooth steering, as each ray can be manipulated separately. Figure 2(b) shows the light-field view distribution across the eyebox after allocation; two separate viewing zones are formed for the left and right eyes. Here, we note that to simultaneously observe a clear image with both eyes, two conditions should be satisfied. First, for the static case, i.e., without viewer movement, the crosstalk level should be minimized to ensure comfortable viewing conditions [19]. Second, the area with low static crosstalk, referred to as the low-crosstalk margin, should have a width wM that is sufficiently large to ensure a comfortable view when the viewer moves while driving. A low-crosstalk margin is required to compensate for system latency (which may reach 50 ms) during sub-pixel re-assignment. Margin wM can be expressed by Eq. (1) as follows:

$$w_{M}=\left\{\begin{array}{ll} w_{I P D}-w_{S} & \text { if } w_{I P D} \leq w_{V} / 2 \\ w_{V}-w_{I P D}-w_{S} & \text { if } w_{I P D}>w_{V} / 2 \end{array}\right.,$$
where wIPD denotes the driver inter-pupil distance (which may vary), wS the width of the high-crosstalk slope area, and wV the viewing width formed by the lenticular lens. According to Eq. (1), assuming that wS varies insignificantly across various designs, to maximize wM for all drivers with wIPD up to 70 mm (which corresponds to 96.7% of drivers [20]), we need to define viewing width wV as 140 mm for our application.
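
For illustration, the margin calculation of Eq. (1) can be sketched as follows in C++ (the language of our rendering implementation); the slope width wS = 33 mm is taken from the ray-tracing simulation in Section 3.4, and the snippet is illustrative rather than part of the HUD software.

```cpp
#include <cstdio>

// Low-crosstalk margin width w_M as per Eq. (1).
// w_ipd: driver inter-pupil distance, w_s: high-crosstalk slope width,
// w_v: viewing width formed by the lenticular lens (all in mm).
double marginWidth(double w_ipd, double w_s, double w_v) {
    if (w_ipd <= w_v / 2.0)
        return w_ipd - w_s;        // first branch of Eq. (1)
    return w_v - w_ipd - w_s;      // second branch of Eq. (1)
}

int main() {
    const double w_v = 140.0;      // designed viewing width, mm
    const double w_s = 33.0;       // slope width from the simulation in Section 3.4, mm
    const double ipds[] = {60.0, 65.0, 70.0};
    for (double w_ipd : ipds)
        std::printf("w_IPD = %.0f mm -> w_M = %.0f mm\n",
                    w_ipd, marginWidth(w_ipd, w_s, w_v));
    return 0;
}
```

With these inputs, the sketch reproduces the margins of 27, 32, and 37 mm quoted in Section 3.4.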

Fig. 2. (a) Principle of lenticular-lens-based light-field 3D display rendering with eye-tracking and (b) result of sub-pixel allocation leading to the formation of two low-crosstalk viewing zones for the left and right eyes.

3. System design and methods

In this section, we describe the optimization and image-quality analysis of the HUD optics as well as the calculation procedure required for the lenticular lens design and light-field rendering.

3.1 Projection optics

Figure 3 shows the schematic of the proposed layout for the projection optics. We consider a windshield-type HUD, wherein light rays interact with the curved windshield as the last element with optical power before reaching the eyebox. To compensate for the curved windshield shape and to magnify the virtual image to achieve a wide FoV, we apply two free-form off-axis mirrors as the projection system. The choice of free-form mirrors is motivated by their high degree of freedom in terms of optical design and immunity to chromatic aberration [21]. Prior to the mirrors, we position a picture generation unit (PGU) consisting of a light-field display, i.e., an LCD panel with an integrated lenticular lens, and an LED-powered backlight unit, which emits light in a narrow cone to concentrate light around the eyebox area, ensuring high brightness and the suppression of stray light. The final component of the optical system is a dust cover with optionally applied sun-load and stray-light reduction technology (polarizer, dichroic coating, hot mirror, etc.). The presence of the cover has a negligible influence on the ray paths and image quality, and therefore, we do not consider it here. With this setup, the light rays from the PGU, split to the left and right eyes by the lenticular lens, are reflected by the two mirrors and the windshield to the driver’s eyebox.

Fig. 3. Schematic optical layout of augmented reality (AR) 3D head-up display (HUD).

To improve the viewer comfort, we selected the following parameters for our projection optics: virtual image distance dIM = 7 m with FoV of 10° × 5° and eyebox size of 140 mm × 80 mm. The optical system arrangement and aberration suppression were carried out by using the Zemax OpticStudio package with the focus on achieving an angular spot size of <1 arcmin, i.e., matching the human eye resolution to >60 pixels per degree (ppd). Windshield-induced aberrations such as astigmatism and free-form distortions were successfully mitigated. Additionally, the specific parameters for HUD systems, such as binocular misalignments (BMs) of various kinds (convergence, divergence, dipvergence) were controlled to <1 mrad as per the criteria described in Ref. [22]. Table 2 lists the optimized design results.

Table 2. Optimized design results of head-up display (HUD) projection optics.

An important aspect needs to be considered when implementing light-field-based technology for the single-mirror combiner-type HUD (Fig. 4). We note that the viewer does not observe the actual display D, but its virtual image D′ magnified by the mirror, and therefore, it is required to perform viewing-zone calculations and light-field rendering taking into account the magnifying properties of this mirror. This can be achieved by performing the inverse transform of the actual eye position (x, y, z) to estimate the virtual eye position (x′, y′, z′) through this mirror. In this case, the viewing-zone calculation and light-field rendering can be performed with reference to (abbreviated hereon as w.r.t.) this virtual eye position and not the actual one [18].

Fig. 4. Light-field display rendering for single-mirror combiner-type head-up display (HUD).

Thus, the accuracy of the calculation of the virtual eye position will determine the possible errors and viewing comfort because any mismatch between the actual and calculated values will reduce the low-crosstalk margin width wM and increase the error in the viewing-zone positioning. The presence of latency can also increase the probability of the actual eye position falling within the high-crosstalk slope area, particularly with dynamic movement. In our HUD system, we refer to the plane of the inverse-transformed eye positions as the virtual eyebox, which is formed by the complex free-form mirror optics underlying the actual display (Fig. 5). The distance between the actual display and the virtual eyebox plane is the viewing distance d, which is an essential parameter for viewing-zone calculation and light-field rendering. When considering the transform of the actual eyebox to the plane of the virtual eyebox, we introduce MVEB, the magnification of the virtual eyebox, which defines the amount by which the HUD optics magnify the eyebox. We apply this concept in Section 3.2 for the lenticular lens design.

Fig. 5. Virtual eyebox formation with head-up display (HUD) projection optics.

To implement a facilitated algorithm for light-field rendering for high-speed car applications, it is desirable to avoid the complex matrix calculations required for each free-form surface for each eye position. Accordingly, we replace the actual free-form-mirror optical system with the effective lens model shown in Fig. 6. We note here that a similar concept was described by Takaki et al. [23] with the use of an actual Fresnel lens. In particular, the transformation induced by the optical system with the two free-form mirrors and windshield can be calculated by this type of effective lens model considering the sign transformation from the eyebox’s right-handed Cartesian coordinate origin (X, Y, Z) to the display’s left-handed Cartesian coordinate origin (X′, Y′, Z′), and further, this approach can be extended to optical systems of all complexity levels. To derive the transformation equation, we first need to determine the mutual arrangement of the effective lens-model components. For this, we may start with the display object distance, aD, from the effective lens, which can be calculated from the thin-lens formula by using a set of substitutions for known parameters:

$${a_{D}}=\frac{f_{EL}\left(w_{I M}-w_{D}\right)}{w_{I M}},$$
where fEL denotes the focal length of the effective lens representing the complex optical system, wD the width of the actual display, and wIM the width of its virtual image. The display object distance aD depends on the effective lens focal length, and therefore, it requires an exact evaluation to avoid uncertainty. To proceed with the evaluation, we set our optical system in reverse order with the eyebox at its initial plane as the object in the Zemax OpticStudio environment. When using the ray transfer matrix, i.e., for paraxial approximations, we obtain the focal length, averaged for 9 examined object fields, as fEL = 481.8 mm, which corresponds to the display object distance aD = 446.75 mm. Taking into account the calculated display object distance aD, we proceed further with the component arrangement in our effective lens optical system. Thus, the distance from the effective lens to the virtual image can be calculated as per Eq. (3):
$${a_{IM}}={a_{D}} M,$$
where M = wIM/wD denotes the magnification of the virtual image of display by the projection optics. Next, we set the eyebox object distance as aEB = dIMaIM and calculate the virtual eyebox image distance, aVEB, from the effective lens as per Eq. (4):
$${a_{VEB}}=\frac{{a_{EB}} {f_{EL}}}{{a_{EB}}-{f_{EL}}}.$$
In our model, we consider the viewing distance d w.r.t. the display origin, which can be expressed by Eq. (5):
$${d} ={a_{D}}-{a_{VEB}}.$$
Finally, we calculate the viewing distance as d = -651.1 mm, where the (-) sign refers to the “virtual” eyes located behind the display. Consequently, virtual eyebox magnification MVEB is simply the ratio defined by Eq. (6):
$${M_{VEB}} = - \frac{{{a_{VEB}}}}{{{a_{EB}}}}.$$
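
To illustrate Eqs. (2)–(6), a minimal numerical sketch in C++ is given below. Here, wIM is derived from the 10° horizontal FoV and dIM = 7 m, which is an assumption made purely for illustration; with these inputs, the output closely reproduces the values quoted above (aD = 446.75 mm, d ≈ −651 mm, MVEB ≈ −1.28) to within rounding.

```cpp
#include <cmath>
#include <cstdio>

int main() {
    const double pi    = 3.14159265358979323846;
    // Known system parameters (mm, degrees).
    const double f_el  = 481.8;    // effective lens focal length (paraxial, averaged)
    const double w_d   = 89.1;     // actual display width
    const double d_im  = 7000.0;   // virtual image distance
    const double fov_h = 10.0;     // horizontal FoV, assumed here as the source of w_IM

    // Virtual image width derived from the FoV (illustrative assumption).
    const double w_im  = 2.0 * d_im * std::tan(0.5 * fov_h * pi / 180.0);

    const double a_d   = f_el * (w_im - w_d) / w_im;    // Eq. (2): display object distance
    const double mag   = w_im / w_d;                    // magnification of the virtual image
    const double a_im  = a_d * mag;                     // Eq. (3): lens-to-virtual-image distance
    const double a_eb  = d_im - a_im;                   // eyebox object distance
    const double a_veb = a_eb * f_el / (a_eb - f_el);   // Eq. (4): virtual eyebox image distance
    const double d     = a_d - a_veb;                   // Eq. (5): viewing distance (negative: behind the display)
    const double m_veb = -a_veb / a_eb;                 // Eq. (6): virtual eyebox magnification

    std::printf("a_D = %.1f mm, d = %.1f mm, M_VEB = %.2f\n", a_d, d, m_veb);
    return 0;
}
```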
To match the HUD’s FoV with the real scene, we introduce the normal look-over angle αLO and look-down angle αLD. Thus, to consider these angles in our procedure for virtual-eye position specification, we calculate the coordinate relationship along both the z- and x-directions, as shown in Fig. 7, as well as the y-direction. This calculation is performed in the Zemax OpticStudio model, wherein we insert the plane of the virtual eyebox behind the actual display with viewing distance d and trace field (0, 0) from the real eyebox plane to the plane of the introduced virtual eyebox. The coordinates of the chief ray at the virtual eyebox plane are measured w.r.t. the display origin and denoted as Δx and Δy for the x- and y-directions, respectively. In this case, the off-axis shift of the display center, xD, can be calculated as per Eq. (7):
$${x_D} = \frac{{{a_{VEB}}{a_D}{x_{LO}} - {a_{EB}}{a_D}\Delta x}}{{{a_{EB}}{a_D} + {a_{VEB}}{a_{IM}}}}, $$
where xLO = dIM tan(αLO) denotes the shift due to αLO between the center of the real eyebox and the center of the virtual image of the display. Moreover, to calculate the off-axis shift of the display center in the vertical direction, yD, the same equation can be used, but Δx must be replaced by Δy and the look-over angle αLO by the look-down angle αLD. Accordingly, based on xD and the virtual eyebox off-axis shift xVEB = xD + Δx, we calculate the off-axis shift of the real eyebox center, xEB, using Eq. (8):
$${x_{EB}} = \frac{{{x_{VEB}}}}{{{M_{VEB}}}}.$$
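
Continuing the same sketch, Eqs. (7) and (8) can be evaluated once the ray-traced chief-ray offset Δx and the look-over angle αLO are known; both inputs below are placeholders (they are not listed in the text), so the printed shifts are purely illustrative.

```cpp
#include <cmath>
#include <cstdio>

int main() {
    const double pi    = 3.14159265358979323846;
    // Rounded distances from the effective lens arrangement above (mm).
    const double d_im  = 7000.0, a_d = 446.75, a_im = 6141.0, a_eb = 859.0, a_veb = 1097.0;
    const double m_veb = -a_veb / a_eb;
    // Placeholder inputs: look-over angle and ray-traced chief-ray offset at the virtual eyebox plane.
    const double alpha_lo_deg = 2.0;   // hypothetical, degrees
    const double delta_x      = 5.0;   // hypothetical, mm

    const double x_lo  = d_im * std::tan(alpha_lo_deg * pi / 180.0);
    const double x_d   = (a_veb * a_d * x_lo - a_eb * a_d * delta_x)
                       / (a_eb * a_d + a_veb * a_im);   // Eq. (7): display-center off-axis shift
    const double x_veb = x_d + delta_x;                 // virtual eyebox off-axis shift
    const double x_eb  = x_veb / m_veb;                 // Eq. (8): real eyebox off-axis shift

    std::printf("x_D = %.2f mm, x_EB = %.2f mm\n", x_d, x_eb);
    return 0;
}
```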

Fig. 6. Illustration of effective lens model.

Fig. 7. Illustration of off-axis shifts due to look-over angle.

The parameters required to construct our effective lens model are listed in Table 3.

Table 3. Effective lens model estimation results.

All the calculated values are applied to transform the eye position (x, y, z) in the space of the real eyebox with its coordinate origin at the center of the eyebox (as perceived by the eye-tracking camera) to position (x′, y′, z′) in the space of the virtual eyebox w.r.t. the display origin following Eq. (9); here, we neglect the z-direction assuming that the actual eyes are located at z = 0 mm and translated into virtual eyes at viewing distance z′ = d. To minimize any possible errors in the magnification between the actual optics and the ideal model for various fields, we define compensation coefficients kx and ky along the x- and y-directions, respectively:

$$\left[ \begin{array}{l} x^{\prime}\\ y^{\prime} \end{array} \right] = {M_{VEB}}\left[ \begin{array}{c} {k_x}x + {x_{EB}}\\ {k_y}y + {y_{EB}} \end{array} \right] - \left[ \begin{array}{c} {x_D}\\ {y_D} \end{array} \right].$$
To obtain kx and ky, we trace 25 fields, shown in Fig. 8(a), and we compare the coordinates of the chief rays in the plane of the virtual eyebox with our model output. By averaging measurements along each direction, we estimate kx = 0.992 and ky = 0.981. Consequently, the x-position mismatch for each field can be calculated as δx = x′xRT, where x′ denotes the calculated value and xRT the ray-traced value obtained with Zemax OpticStudio; the same procedure can be applied for the y-position. These results are illustrated in Fig. 8(c), wherein the average value of the absolute x-position mismatch in the virtual-eyebox plane is 0.186 mm and the average of the absolute y-position mismatch is 0.261 mm. While the y-position mismatch exceeds 1 mm for an exceptional field (Field #21 in Fig. 8(c)), the total mismatch, which affects the viewing-zone position, i.e., its error, can be calculated as δ = δx + δytan(α), where α denotes the slanted angle of the viewing zones and is equal to the slanted angle of the lenticular lens. This parameter is described in detail in Sections 3.2 and 3.3. Thus, the maximum total mismatch in this case is <0.6 mm and the average total mismatch is 0.207 mm in the virtual-eyebox plane, which is sufficiently small to ensure that the eye position lies within the low-crosstalk margin area.
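
At runtime, the translation of Eq. (9) reduces to a handful of multiply-adds per tracked eye, as sketched below; the off-axis shift constants are placeholders standing in for the Table 3 values.

```cpp
#include <cstdio>

struct Vec2 { double x, y; };

// Eq. (9): translate a tracked eye position (x, y) in the real eyebox (mm, origin at
// the eyebox center) to the virtual eyebox plane (x', y', origin at the display).
// The constants below are placeholders standing in for the Table 3 values.
Vec2 toVirtualEyebox(const Vec2& eye) {
    const double m_veb = -1.28;             // virtual eyebox magnification
    const double kx = 0.992, ky = 0.981;    // compensation coefficients
    const double x_eb = 0.0, y_eb = 0.0;    // real eyebox off-axis shifts (placeholders)
    const double x_d  = 0.0, y_d  = 0.0;    // display off-axis shifts (placeholders)
    return { m_veb * (kx * eye.x + x_eb) - x_d,
             m_veb * (ky * eye.y + y_eb) - y_d };
}

int main() {
    const Vec2 real_eye{32.5, 10.0};            // example tracked position, mm
    const Vec2 virt = toVirtualEyebox(real_eye);
    // z' is fixed at the viewing distance d behind the display.
    std::printf("virtual eye: (%.2f, %.2f) mm\n", virt.x, virt.y);
    return 0;
}
```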

Fig. 8. Simulation results for driver eye-position matching. (a) Twenty-five analyzed eyebox fields, (b) translation from y-position mismatch to x-position mismatch for total mismatch value, (c) mismatch values between positions x′ and y′ (calculated with our model) and xRT and yRT (ray-traced with Zemax OpticStudio). The values are indicated for the virtual-eyebox plane.

3.2 Lenticular lens

In our light-field display, we use a lenticular lens to form viewing zones at the eyebox plane. As shown in Fig. 9, for lenticular lens design, we take into account the viewing angle for the virtual eye position, which can be expressed by Eq. (10):

$$\theta = \tan^{-1}(w_{VV}/d),$$
where wVV denotes the virtual viewing width (MVEBwV). Thus, with viewing angle θ = 15.65°, MVEB = -1.28, and the number of sub-pixels per lens N [16], we calculate the lenticular lens design parameters such as the thickness of the multi-layer lens material, radius, sag, and horizontal pitch. The actual lens pitch is adjusted by using the slanted angle, whose value is selected to reduce Moiré patterns [24].
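
As a quick numerical check of Eq. (10), the sketch below evaluates θ from the rounded values of MVEB, wV, and d quoted above; with these rounded inputs it yields ≈15.4° rather than the exact 15.65° obtained with the unrounded design values.

```cpp
#include <cmath>
#include <cstdio>

int main() {
    const double pi    = 3.14159265358979323846;
    const double m_veb = -1.28;     // virtual eyebox magnification (rounded)
    const double w_v   = 140.0;     // viewing width, mm
    const double d     = -651.1;    // viewing distance, mm
    const double w_vv  = m_veb * w_v;                        // virtual viewing width
    const double theta = std::atan(w_vv / d) * 180.0 / pi;   // Eq. (10)
    std::printf("viewing angle theta = %.2f deg\n", theta);  // ~15.4 deg with rounded inputs
    return 0;
}
```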

Fig. 9. Schematic representation of view distribution used for lenticular lens design.

3.3 Light-field rendering

Light-field rendering includes the tasks of pixel assignment for the full number of light-field views [25], which is defined based on the number of sub-pixels per lens, and the actual content image rendering; in our study, all these steps were implemented together using C++ with NVIDIA CUDA under conditions optimized for calculation speed. Here, we describe the simplified two-view (i.e., left- and right-eye) allocation algorithm for each pixel (sub-pixel), which is implemented for our model evaluation with the use of ray-tracing software. The algorithm is adjusted for virtual eyes, which are located behind the actual display panel [18]. Conceptually, we need to assign each sub-pixel to the left or right eye according to the viewing position. First, we calculate each sub-pixel position w.r.t. the coordinate origin at the corner of the display panel, which is a common approach to defining the origin in rendering (Fig. 10). The virtual-eye positions (xL, yL) and (xR, yR) are also adjusted to this origin taking into account the display width wD and height hD (we do not use the prime symbol to indicate the virtual-eye coordinates in this section). Thus, each sub-pixel position xSP in the horizontal direction and ySP in the vertical direction can be expressed by Eq. (11):

$${x_{SP}} = {P_{SP}}({n_x} - 1/2)\quad \textrm{and}\quad {y_{SP}} = {P_P}({n_y} - 1/2),$$
where PSP denotes the horizontal sub-pixel pitch and PP the vertical pixel pitch, and nx and ny the sub-pixel count numbers in the horizontal and vertical directions, respectively.

Fig. 10. Schematic of (a) sub-pixel distribution behind lenticular lens and (b) panel view with coordinate origin.

When the light-field display is observed by the virtual eyes located at a finite viewing distance z = d, the center position of the corresponding lenticular lens curvature yl is not equal to ySP, as there is an additional difference due to the non-zero lens thickness (Fig. 11(a)). To minimize the rendering error, we take this difference into account and calculate the adjusted lenticular lens position in the vertical direction as per Eq. (12):

$${y_l} = {y_{SP}} - t({y_{SP}} - {y_L})/d, $$
where we assume yL = yR as both eyes are located at the same height when the driver’s head is not inclined, and t denotes the lenticular lens thickness adjusted to air. In this case, the lenticular lens position in the horizontal direction can be calculated by considering the horizontal pitch PH of the lenticular lens as per Eq. (13):
$$x_l = P_H\left(\operatorname{floor}\{[x_{SP} + y_l\tan(\alpha)]/P_H\} + 1/2\right) - y_l\tan(\alpha),$$
where floor() is a function that outputs the greatest integer less than or equal to its argument. Here, we assume that the first row of the lenticular lens begins at the coordinate origin (0, 0). Consequently, the offset between the sub-pixel position and the corresponding lenticular lens can be calculated as per Eq. (14):
$$\Delta {x_l} = {x_{SP}} - {x_l}.$$
To proceed with the final sub-pixel allocation, we need to trace the virtual rays from each sub-pixel and the corresponding lenticular lens element to the virtual-eye plane; this tracing can also be performed in the reverse direction, as shown in Fig. 11(b). After ray tracing to the lenticular lens plane, the distance ΔR from the intersection point of the ray from the right eye to the lens center should be compared with ΔL for the ray from the left eye to accordingly assign the sub-pixel to the left or right eye. Each sub-pixel is assigned based on the “shorter distance” criterion; for example, if ΔR < ΔL, the sub-pixel is assigned to the right eye, where ΔR and ΔL can be calculated as:
$${\Delta _R} = \Delta {x_l} + t({x_{SP}} - {x_R})/d\quad \textrm{and}\quad {\Delta _L} = \Delta {x_l} + t({x_{SP}} - {x_L})/d.$$
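
The two-view allocation described by Eqs. (11)–(15) can be summarized in a few lines of C++, as sketched below. The display and lens parameters in main() are placeholders rather than the actual design values, and the “shorter distance” criterion is interpreted here as the smaller absolute offset; this sketch is intended for model evaluation only and is not the production C++/CUDA renderer.

```cpp
#include <cmath>
#include <cstdio>

// Simplified two-view sub-pixel allocation following Eqs. (11)-(15).
struct LightFieldParams {
    double p_sp;   // horizontal sub-pixel pitch, mm
    double p_p;    // vertical pixel pitch, mm
    double p_h;    // lenticular lens horizontal pitch, mm (placeholder)
    double alpha;  // lens slant angle, rad (placeholder)
    double t;      // lens thickness adjusted to air, mm (placeholder)
    double d;      // viewing distance to the virtual eyes (negative: behind the panel), mm
};

// Returns true if sub-pixel (n_x, n_y) should display right-eye content.
bool assignToRightEye(int n_x, int n_y, const LightFieldParams& p,
                      double x_left, double x_right, double y_eye) {
    const double x_sp = p.p_sp * (n_x - 0.5);                        // Eq. (11)
    const double y_sp = p.p_p  * (n_y - 0.5);
    const double y_l  = y_sp - p.t * (y_sp - y_eye) / p.d;           // Eq. (12)
    const double shear = y_l * std::tan(p.alpha);
    const double x_l  = p.p_h * (std::floor((x_sp + shear) / p.p_h) + 0.5) - shear;  // Eq. (13)
    const double dx_l = x_sp - x_l;                                  // Eq. (14)
    const double delta_r = dx_l + p.t * (x_sp - x_right) / p.d;      // Eq. (15)
    const double delta_l = dx_l + p.t * (x_sp - x_left)  / p.d;
    // "Shorter distance" criterion, interpreted as the smaller absolute offset.
    return std::fabs(delta_r) < std::fabs(delta_l);
}

int main() {
    // Sub-pixel pitches follow from the 89.1 mm x 44.55 mm, 1800 x 900 panel;
    // the lens pitch, slant, and thickness are hypothetical values.
    const LightFieldParams p{0.0165, 0.0495, 0.5, 0.15, 1.8, -651.1};
    // Placeholder virtual-eye positions (mm) relative to the panel-corner origin.
    const double x_left = 0.0, x_right = 90.0, y_eye = 22.0;
    for (int n_x = 2700; n_x < 2710; ++n_x)
        std::printf("sub-pixel (%d, 450) -> %s eye\n", n_x,
                    assignToRightEye(n_x, 450, p, x_left, x_right, y_eye) ? "right" : "left");
    return 0;
}
```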

Fig. 11. Schematic showing the pixel allocation principle between left and right eyes. Transformation of viewer’s eyes to lenticular lens plane according to (a) vertical position and (b) horizontal position.

3.4 Ray-tracing simulation

To confirm the low level of crosstalk and sufficient margin width wM for our system, we construct a LightTools ray-tracing setup based on previously calculated data. Our setup includes a light source, light-field display, mirror optics, windshield, and eyebox with receiver. The light source is implemented as an ideal plane source with spectrum-matched LEDs, which are selected for our backlight design. The light-field display consists of two components: the panel itself and the lenticular lens. The panel has an active area of 89.1 mm × 44.55 mm, and default RGB color filters are distributed on its bottom surface to mimic an actual LCD panel. To the top surface, we apply a user-defined coating as a grayscale filter; the filter is actually binary and corresponds to only 1 or 0 values. These values are assigned for each sub-pixel as per the sub-pixel allocation algorithm. The optical surfaces of the mirrors and windshield are imported from Zemax OpticStudio in the CAD data format. The eyebox is implemented as a transparent dummy plane with an applied illuminance receiver and the option to save ray data for further analysis.

Fig. 12. Ray-tracing simulation results. (a) Eyebox illuminance with two-view images resulting in wV = 140 mm at normalized illuminance of 0.5, (b) crosstalk chart calculated based on illuminance before normalization resulting in wS = 33 mm at crosstalk 1.7%, (c) eyebox illuminance with 127-view images demonstrating light-field, which confirms designed viewing width wV = 140 mm as the distance between V1 and V1(Repeated).

We generate light-field images to be applied via the grayscale filter by using a simplified two-view algorithm for sub-pixel allocation for the left and right eyes, as described in Section 3.3. Additionally, we use an actual algorithm (C++ with NVIDIA CUDA), which allows for the display of 127 separated views of the actual light-field before merging. The 127-view images are useful to measure the resulting viewing width wV as the distance between the same view’s main and repeating peaks of illuminance distribution at the eyebox plane. Measurement with the two-view images at the normalized illuminance level of 0.5 results in a viewing width wV= 140 mm, as shown in Fig. 12(a). Additionally, measurement with the 127-view images confirms this value, as shown in Fig. 12(c). In practical situations, the high-crosstalk slope width wS should be measured by using a crosstalk chart instead of normalized illuminance because the intersection point with the limiting crosstalk level is required, as shown in Fig. 12(b). The baseline crosstalk in our simulations is negligible, and therefore, we select the limiting crosstalk level as 1.7% (3.0% – 1.3%), where 3.0% is our limiting goal with the prototyped device and 1.3% is the baseline crosstalk measured during experiments (Section 4). The limiting goal of 3.0% is selected based on a combination of two criteria, allowing for a sufficient depth resolution and acceptable discomfort [19]. Thus, we measure wS = 33 mm, as shown in Fig. 12(b). Applying this data to Eq. (1), we calculate the margin width as wM = 27 mm for the driver wIPD = 60 mm, wM = 32 mm for wIPD = 65 mm, and wM = 37 mm for wIPD = 70 mm. Within these margins, we can ensure a stable 3D performance of the HUD regardless of any sudden movement of the driver’s eyes.

Additionally, we visualize the left- and right-view separation with a combined red-blue image, where red content corresponds to the right eye and blue content to the left eye, in Fig. 13(a), and with a combined black-white content (signs and grid) image in Fig. 13(b). The grid is introduced to visualize the low distortion of the virtual image, while the signs “LEFT EYE IMAGE” and “RIGHT EYE IMAGE” indicate low light leakage between views.

Fig. 13. Ray-tracing simulation results. Left- and right-view separation (a) with red and blue images and (b) with black-white content (signs and grid) images corresponding to virtual image with dIM = 7 m.

4. Experiment

To implement the 3D HUD and validate the ray-tracing and simulation results described in Section 3, we constructed a prototype, as shown in Fig. 14(a). In the prototype, the backlight unit was powered by 50 LEDs with a maximum total power of 23 W, resulting in a brightness of 13,398 nit at the eyebox with a white homogeneity of 97% and black homogeneity of 83% measured over 25 points. Additionally, a 3.92-inch (89.1 mm × 44.55 mm) LCD panel with a pixel resolution of 1,800 × 900 pixels was used to provide 80 ppd of 3D resolution. The relevant mechanics were developed to minimize position errors during both assembly and operation. The tolerance values were in the range previously defined with the use of Zemax OpticStudio to prevent any significant image degradation. The stand-alone HUD prototype assembly was prepared with a part of the windshield as the optical surface, and the eye-tracking camera was installed on top. As implementing a receiver with the same size as the eyebox is challenging under laboratory conditions, here, we introduce a method to measure the crosstalk and low-crosstalk margin width by acquiring a crosstalk map and crosstalk chart along the eyebox horizontal direction. Figure 14(b) depicts the measurement setup for this method. A face mask is placed at the eyebox plane to mimic the driver’s eye positions for light-field rendering. A camera is installed at the position of the left eye to capture virtual images, and a motorized stage is used for horizontal scanning with a step of 1 mm. All measurements are performed inside a “dark” box to avoid the influence of additional noise.

Fig. 14. (a) Stand-alone head-up display (HUD) prototype assembly with (b) crosstalk measurement setup.

Images for crosstalk measurement are rendered in the same manner as with the simplified two-view algorithm, but adjusted to a 24-bit bitmap. Therefore, each sub-pixel “holds” a value from 0 through 255. Thus, the LBRW image is assigned 0 (black) for sub-pixels allocated to the left eye and 255 (white) for those allocated to the right eye, whereas the LWRB image is assigned 255 (white) for the left eye and 0 (black) for the right eye. A completely black image, i.e., one with 0 assigned to all sub-pixels, is rendered as well for background subtraction. Next, while projecting these images for each ith step, we calculate the crosstalk value ηi(x, y) at each x, y point of the virtual image (indicated in Fig. 15 as a red circle) as per Eq. (16):

$${\eta _i}(x,y) = \frac{{{I_{LBRWi}}(x,y) - {I_{Bi}}(x,y)}}{{{I_{LWRBi}}(x,y) - {I_{Bi}}(x,y)}} \times 100, $$
where ILBRWi(x, y) denotes the illuminance of the virtual image plane at the x, y point with the LBRW image, ILWRBi(x, y) the illuminance with the LWRB image, and IBi(x, y) the illuminance with the black image. Thus, the crosstalk map for the ith step can be constructed as shown in Fig. 15(a).
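
For reference, a minimal sketch of the per-point crosstalk evaluation of Eq. (16) is given below; it assumes that the three captured images have already been sampled into equally sized arrays of illuminance values.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Eq. (16): per-point crosstalk map (in percent) for one stage position.
// i_lbrw, i_lwrb, i_black hold illuminance samples of the virtual image plane
// captured with the LBRW, LWRB, and black test images (same size, row-major).
std::vector<double> crosstalkMap(const std::vector<double>& i_lbrw,
                                 const std::vector<double>& i_lwrb,
                                 const std::vector<double>& i_black) {
    std::vector<double> eta(i_lbrw.size(), 0.0);
    for (std::size_t k = 0; k < eta.size(); ++k) {
        const double denom = i_lwrb[k] - i_black[k];
        if (denom > 0.0)
            eta[k] = (i_lbrw[k] - i_black[k]) / denom * 100.0;
    }
    return eta;
}

int main() {
    // Tiny synthetic example with three sample points.
    const std::vector<double> lbrw{2.0, 3.5, 2.4}, lwrb{100.0, 98.0, 101.0}, black{1.0, 1.1, 0.9};
    for (double eta : crosstalkMap(lbrw, lwrb, black))
        std::printf("crosstalk = %.2f %%\n", eta);
    return 0;
}
```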

Fig. 15. Construction of crosstalk map. (a) Crosstalk map at ith step, (b) captured LBRW image, (c) captured LWRB image, and (d) captured black image.

To plot the crosstalk chart along the eyebox and measure margin width wM, we calculate the crosstalk value of each ith crosstalk map by averaging the values over all x, y points as per Eq. (17):

$$\eta_i = \sum_{x,y\,\in\,Reg_i}\frac{\eta_i(x,y)}{N_{Pi}},$$
where Regi denotes the ith region and NPi the number of examined points in this region. The resulting crosstalk chart is shown in Fig. 16(a). The baseline for crosstalk is 1.3%, and this value is used to calculate the low-crosstalk margin width with the simulation results. With the prototype, we measure viewing width wV = 125 mm and slope width wS = 35 mm at the limiting crosstalk level of 3.0%. In this case, the margin width can be calculated as wM = 25 mm for a driver with wIPD = 60 mm, wM = 25 mm for wIPD = 65 mm, and wM = 20 mm for wIPD = 70 mm. As can be observed, the low-crosstalk margin width is smaller than that previously measured with the simulation (Section 3) owing to the reduction of wV from the design value of 140 mm to the measured value of 125 mm. Moreover, wS increases from 33 mm to 35 mm, although it should decrease. This is because the prototype display panel is thicker than that considered at the design stage owing to the increased thickness of one of its internal layers. This thickness difference Δt = 50 µm leads to a corresponding reduction in the viewing width wV, as wV is proportional to the viewing angle θ, and the adjusted θadj can be calculated as per Eq. (18):
$$\theta_{adj} = \tan^{-1}[P_{SP}N/(t + \Delta t/n_{DP})],$$
where nDP denotes the refractive index of the display panel layer. Simultaneously, this implies a mismatch between the lenticular lens focal length fl (designed based on the initial t) and the actual distance to the sub-pixel plane (t + Δt/nDP), thereby affecting the width of each view distribution at the eyebox plane and eventually slope width wS. This mismatch can be compensated by changing the lenticular lens radius of curvature R, which requires lens reconstruction. Any mismatch in the lenticular lens pitch PH is not considered as a source of possible errors because it can be compensated by the light-field rendering algorithm. Figure 16(b) illustrates the simulation results with the applied display thickness difference, wherein we consider two scenarios: First, for reference, we consider only the effects of Δt, i.e., the mismatch in fl is compensated by ΔR, which affords a reduction in wV. Secondly, as in the case of our prototype, we consider Δt and the initial design value R, which eventually affords a reduction in wV and increase in wS. The simulation results correlate well with the measured values for the actual prototype. Thus, the difference between the simulation and the prototype for both wV and wS is 1 mm, which is comparable with our measurement accuracy. These simulation results confirm that satisfactory viewing conditions can be realized with negligible errors if the lenticular lens thickness is matched with the design value.
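
A short sketch of Eq. (18) is given below to illustrate how the extra layer thickness reduces the viewing width; the air-adjusted lens thickness t and the layer refractive index nDP are hypothetical values, as they are not listed in the text, and only Δt = 50 µm and the designed θ are taken from above.

```cpp
#include <cmath>
#include <cstdio>

int main() {
    const double pi = 3.14159265358979323846;
    // Designed viewing angle (Section 3.2) and the panel-thickness error (Section 4).
    const double theta_design = 15.65 * pi / 180.0;  // rad
    const double dt   = 0.050;                       // extra display-layer thickness, mm
    // Hypothetical values, not listed in the text.
    const double t    = 1.8;                         // air-adjusted lens thickness, mm
    const double n_dp = 1.5;                         // refractive index of the thicker layer

    // Eq. (18): tan(theta) = P_SP * N / t, so the lumped numerator P_SP * N can be
    // recovered from the designed angle and re-divided by the increased thickness.
    const double psp_n     = std::tan(theta_design) * t;
    const double theta_adj = std::atan(psp_n / (t + dt / n_dp));

    // w_V is proportional to tan(theta), so this ratio estimates the viewing-width reduction.
    std::printf("theta_adj = %.2f deg, w_V scaling = %.3f\n",
                theta_adj * 180.0 / pi, std::tan(theta_adj) / std::tan(theta_design));
    return 0;
}
```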

Fig. 16. Crosstalk chart: (a) measured, and (b) simulated with display-thickness difference.

Even with the actual prototyped lenticular lens and increased display thickness, our HUD satisfies the initial criteria for a low-crosstalk margin, given a sudden driver eye-movement velocity of 200 mm/s and the latency time of the entire system. Furthermore, to visualize the image-quality results, we acquire pictures with left- and right-view separation for the red and blue images and the black-white content (signs and grid) images, as shown in Fig. 17. The results closely match the ray-tracing simulations and confirm that the HUD affords high image quality with low crosstalk.

Fig. 17. Visualized image quality results. Left- and right-view separation (a) with red and blue virtual images and (b) with black-white content (signs and grid) virtual images.

Figure 18 shows the representative images of the AR 3D HUD concept displayed at the prototype assembly. A scene with navigation signs and warning signals can represent one of the scenarios of AR-3D-HUD-assisted driving in the near future.

Fig. 18. Demonstration images displayed by augmented reality (AR) 3D head-up display (HUD) (see Visualization 1, Visualization 2).

5. Conclusion

In this study, we proposed and demonstrated a stereoscopy-based light-field 3D display system with eye-tracking integrated into an HUD to improve the driver experience. We developed and evaluated a light-weight approach to translate the driver’s eye position (based on eye-tracking) to a virtual eye position by replacing the actual complex optics with an effective lens model. Using a prototype, we experimentally verified that the HUD could display high-quality 3D images with a low-crosstalk margin, which closely matched our simulation results. In the study, we acquired low-distortion virtual images at human-eye-limiting resolution quality, with an FoV of 10° × 5° and static crosstalk of <1.5%. We believe that the presented 3D HUD system can become a mainstream technology for various AR HUD applications.

Acknowledgments

The representative image sources for the AR 3D HUD concept were produced by the New eXperience Design (NXD) Group at the Samsung Corporate Design Center.

Disclosures

The authors declare no conflicts of interest.

References

1. V. Milanovic, A. Kasturi, and V. Hachtel, “High brightness MEMS mirror based head-up display (HUD) modules with wireless data streaming capability,” Proc. SPIE 9375, 93750A (2015). [CrossRef]  

2. G. Pettitt, J. Ferri, and J. Thompson, “Practical application of TI DLP® technology in the next generation head-up display system,” SID Symp. Dig. Tech. 46(1), 700–703 (2015). [CrossRef]  

3. J. Christmas and N. Collings, “Realizing automotive holographic head up displays,” SID Symp. Dig. Tech. 47(1), 1017–1020 (2016). [CrossRef]  

4. P. Richter, W. von Spiegel, and J. Waldern, “Volume optimized and mirror-less holographic waveguide augmented reality head-up display,” SID Symp. Dig. Tech. 49(1), 725–728 (2018). [CrossRef]  

5. K. Bang, C. Jang, and B. Lee, “Curved holographic optical elements and applications for curved see-through displays,” J. Inf. Disp. 20(1), 9–23 (2019). [CrossRef]  

6. G. Lee, J. Hong, S. Hwang, S. Moon, H. Kang, S. Jeon, H. Kim, J.-H. Jeong, and B. Lee, “Metasurface eyepiece for augmented reality,” Nat. Commun. 9(1), 4562 (2018). [CrossRef]  

7. Z. Qin, S.-M. Lin, K.-T. Luo, C.-H. Chen, and Y.-P. Huang, “Dual-focal-plane augmented reality head-up display using a single picture generation unit and a single freeform mirror,” Appl. Opt. 58(20), 5366–5374 (2019). [CrossRef]  

8. J. H. Seo, C. Y. Yoon, J. H. Oh, S. B. Kang, C. Yang, M. R. Lee, and Y. H. Han, “A study on multi-depth head-up display,” SID Symp. Dig. Tech. 48(1), 883–885 (2017). [CrossRef]  

9. Konica Minolta, Inc., “Konica Minolta develops world’s first automotive 3D augmented reality head-up display,” (2017). http://newsroom.konicaminolta.eu/konica-minolta-develops-the-worlds-first-automotive-3d-augmented-reality-head-up-display

10. S. Liu, H. Hua, and D. Cheng, “A novel prototype for an optical see-through head-mounted display with addressable focus cues,” IEEE Trans. Vis. Comput. Graphics 16(3), 381–393 (2010). [CrossRef]  

11. T. Zhan, Y.-H. Lee, G. Tan, J. Xiong, K. Yin, F. Gou, J. Zou, N. Zhang, D. Zhao, J. Yang, S. Liu, and S.-T. Wu, “Pancharatnam-Berry optical elements for head-up and near-eye displays,” J. Opt. Soc. Am. B 36(5), D52–D65 (2019). [CrossRef]  

12. S. Lee, Y. Jo, D. Yoo, J. Cho, D. Lee, and B. Lee, “Tomographic near-eye displays,” Nat. Commun. 10(1), 2497 (2019). [CrossRef]  

13. K. Wakunami, P.-Y. Hsieh, R. Oi, T. Senoh, H. Sasaki, Y. Ichihashi, M. Okui, Y.-P. Huang, and K. Yamamoto, “Projection-type see-through holographic three-dimensional display,” Nat. Commun. 7(1), 12954 (2016). [CrossRef]  

14. J. Hong, Y. Kim, H.-J. Choi, J. Hahn, J.-H. Park, H. Kim, S.-W. Min, N. Chen, and B. Lee, “Three-dimensional display technologies of recent interest: principles, status, and issues,” Appl. Opt. 50(34), H87–H115 (2011). [CrossRef]  

15. N. Broy, S. Höckh, A. Frederiksen, M. Gilowski, J. Eichhorn, F. Naser, H. Jung, J. Niemann, M. Schell, A. Schmid, and F. Alt, “Exploring design parameters for a 3D head-up display,” Proc. PerDis ‘14, pp. 38–43, ACM (2014).

16. D. Nam, J.-H. Lee, Y. H. Cho, Y. J. Jeong, H. Hwang, and D. S. Park, “Flat panel light-field 3-D display: concept, design, rendering, and calibration,” Proc. IEEE 105(5), 876–891 (2017). [CrossRef]  

17. D. Nam and D. Park, “Light field reconstruction,” Proc. 14th Workshop Inf. Opt. (WIO), Th2-2, pp. 1–3 (2015).

18. J. Park, D. Nam, and K. Choi, “Three-dimensional (3D) image rendering method and apparatus,” U.S. patent 10521953B2 (2019).

19. A. J. Woods, “Crosstalk in stereoscopic displays: a review,” J. Electron. Imaging 21(4), 040902 (2012). [CrossRef]  

20. N. Dodgson, “Variation and extrema of human interpupillary distance,” Proc. SPIE 5291, 36–46 (2004). [CrossRef]  

21. S. Wei, Z. Fan, Z. Zhu, and D. Ma, “Design of a head-up display based on freeform reflective systems for automotive applications,” Appl. Opt. 58(7), 1675–1681 (2019). [CrossRef]  

22. M. Hagino, H. Miura, N. Kimura, and K. Watanuki, “Driver’s recognition of head-up display (HUD) as information provision system,” SA'15: SIGGRAPH Asia, HUDs and their Applications, Article no. 6, pp. 1–3 (2015).

23. Y. Takaki, Y. Urano, S. Kashiwada, H. Ando, and K. Nakamura, “Super multi-view windshield display for long-distance image information presentation,” Opt. Express 19(2), 704–716 (2011). [CrossRef]  

24. Y. J. Jeong and K. Choi, “Three-dimensional display optimization with measurable energy model,” Opt. Express 25(9), 10500–10514 (2017). [CrossRef]  

25. S. Lee, J. Park, J. Heo, B. Kang, D. Kang, H. Hwang, J. Lee, Y. Choi, K. Choi, and D. Nam, “Autostereoscopic 3D display using directional subpixel rendering,” Opt. Express 26(16), 20233–20247 (2018). [CrossRef]  

Supplementary Material (2)

Visualization 1: Demonstration video of the AR 3D HUD.
Visualization 2: Demonstration video of the AR 3D HUD.
