
Event-based asynchronous HDR imaging by temporal incident light modulation

Open Access

Abstract

Dynamic range (DR) is a pivotal characteristic of imaging systems. Current frame-based cameras struggle to achieve high dynamic range imaging due to the conflict between globally uniform exposure and spatially variant scene illumination. In this paper, we propose AsynHDR, a pixel-asynchronous HDR imaging system, based on key insights into the challenges in HDR imaging and the unique event-generating mechanism of dynamic vision sensors (DVS). Our proposed AsynHDR system integrates the DVS with a set of LCD panels. The LCD panels modulate the irradiance incident upon the DVS by altering their transparency, thereby triggering the pixel-independent event streams. The HDR image is subsequently decoded from the event streams through our temporal-weighted algorithm. Experiments under the standard test platform and several challenging scenes have verified the feasibility of the system in HDR imaging tasks.

© 2024 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

An ideal imaging system is expected to efficaciously capture luminance and contrast information under various lighting conditions, encompassing a vast luminance range from approximately $10^{-2}$ lux in nocturnal starlight environments to $10^8$ lux in scenes illuminated by midday sunlight. In extremely low-light conditions, effective scene imaging can be achieved by enlarging the aperture size and prolonging exposure durations, while in brightly illuminated environments, sensor overexposure can be avoided by using a smaller aperture and a shorter exposure time. However, in high dynamic range scenarios, imaging systems that use globally uniform sampling, exposure, and light input control face challenges due to limitations in sampling bit depth and electron well capacity. During each imaging process, only a limited number of pixels on the sensor can achieve optimal exposure, while other pixels fail to perceive the scene accurately due to inappropriate illumination parameter settings.

Current mainstream methods for high dynamic range imaging can be divided into multi-exposure fusion (MEF) and spatial light modulator-based (SLM-based) approaches. As the most widely applied HDR imaging technique, MEF captures multiple frames with varying exposure parameters on CMOS/CCD sensors, then carefully selects and fuses the optimally exposed regions across the frames to generate an HDR image [1–9]. However, the demand for repetitive sampling of frames poses several challenges to MEF. Single-sensor MEF methods [10–12] suffer from ghosting artifacts caused by temporal misalignment. Multi-sensor methods [13–15] face challenges such as sensor registration and structural complexity. Single-shot MEF methods that obtain multiple exposures by reusing the Bayer matrix sacrifice sensor spatial resolution [9,16]. In contrast to the MEF approaches, SLM-based methods modulate the irradiance incident upon the sensor pixel by pixel. By utilizing SLMs [17] such as DMDs [18–26] or LCDs [27,28], the incident light of each pixel is independently attenuated according to its intensity, ensuring that it falls within the effective working range of the sensor. However, the introduction of high-cost SLMs degrades imaging quality, and the SLM parameters are scene-dependent, requiring real-time feedback adjustments for different scenarios.

The advent of asynchronous sensors introduces the potential to develop HDR imaging systems with pixel-independent sampling. By leveraging asynchronous sensors such as Dynamic Vision Sensors (DVS), imaging systems can break free from the constraints of globally uniform pixel sampling, constructing an HDR system with pixel-independent triggering. Previous work has extensively explored DVS-based image reconstruction, such as estimating the scene radiance by leveraging motion-triggered event streams [29,30], or constructing motion-independent DVS imaging systems using actively controlled light sources to modulate scene brightness [31–36]. However, the former approaches are compromised by the limited scene information contained in the sparse, motion-triggered events, and the systems incorporating active light sources are not applicable to outdoor HDR scenarios or HDR scenes containing light sources. In contrast, our proposed method leverages the asynchronous sensing features of the dynamic vision sensor to achieve HDR imaging, enabling operation in various HDR scenarios.

In this paper, we develop the Asynchronous HDR imaging system (AsynHDR), which triggers event streams by introducing temporal variations in the system’s incident light intensity. Compared to active light-triggered imaging systems [31,33], the AsynHDR system achieves HDR scene imaging by combining proportional attenuation of the incident light with the DVS’s independent pixel-triggering mechanism. The optical architecture of the AsynHDR system consists of a DVS, two LCD panels, a beam splitter, and a signal generator. The LCD panels dynamically modulate their transmittance to control the incident light in the system. The DVS in the system triggers event streams on a per-pixel basis within suitable exposure ranges. Building upon the hardware system, we further propose a temporal-weighted algorithm to replace the direct integration method for reconstructing scene radiance from event streams. Combined with subsequent threshold correction processing, it significantly enhances imaging signal-to-noise ratio (SNR) and quality.

Our contributions can be summarized as follows:

  • First, we discern the efficacy of sensor pixels operating independently in tackling HDR challenges. Combining this observation with the operating principles of DVS, we propose a construction methodology for DVS-based HDR imaging systems.
  • Second, by modulating the incident light using LCD panels, the AsynHDR system we constructed can recover scene radiance from the triggered event stream, and we propose a temporal-weighted method to enhance imaging quality.
  • Third, experiments under challenging HDR scenarios, both outdoors and containing light sources, validate the system’s high-quality HDR imaging capability and confirm the viability of DVS performing passive imaging without the aid of frame-based cameras or active light sources.

2. Principle and method

In the following three subsections, we will introduce the principles for constructing an asynchronous HDR imaging system, the optical architecture of our asynchronous HDR imaging system, and the temporal-weighted algorithm for reconstructing HDR images from event streams.

2.1 Methodology for AsynHDR imaging system

Constructing a pixel-independent HDR imaging system requires selecting an asynchronous sampling sensor as the sensing component. This paper outlines the construction methodology for an asynchronous HDR imaging system using DVS. Unlike frame-based cameras, each pixel in the DVS array operates independently, asynchronously triggering events. Each event is defined as:

$$e_{i}:=\left(x_{i}, y_{i}, t_{i}, p_{i}\right),$$
where $(x_{i}, y_{i})$ represents the pixel coordinates, $t_{i}$ is the timestamp of the event, and $p_{i} \in \{-1,+1\}$ indicates the polarity of the event. An event is triggered when the change in the logarithmic intensity of the pixel, $\log\mathbf{I}(x, y, t):=\overline{\mathbf{I}}(x,y,t)$, exceeds the triggering threshold $c$:
$$\left|\Delta \overline{\mathbf{I}}(x_{i}, y_{i},t_{i})\right|=\left|\overline{\mathbf{I}}\left(x_{i}, y_{i}, t_{i}\right)-\overline{\mathbf{I}}\left(x_{i}, y_{i},t_{i}-\Delta t_{i}\right)\right| \geq c,$$
where $\Delta t_{i}$ represents the time elapsed since the previous event at pixel $(x_i, y_i)$. To construct an HDR imaging system based on DVS, it is essential to obtain sufficiently informative event streams. In addition to utilizing changes in scene illumination and object motion to generate events, we can also dynamically alter the camera’s incident light to trigger DVS event streams by incorporating devices such as light valves into the optical path of the system. The incident light at the sensor pixel $(x, y)$ can be modeled as follows:
$$\mathbf{I}(x,y,t) = f(t)\mathbf{L}(x,y,t),$$
where $f(t)$ is the temporal modulation factor for the imaging system’s incident light, and $\mathbf{L}(x,y,t)$ is the scene radiance component incident on the pixel.

Assuming nearly constant scene radiance over a short period, the event-triggering mode is as follows:

$$\begin{aligned} \left|\Delta \overline{\mathbf{I}}\left(x,y,t\right)\right| = &\left| \log f(t)\mathbf{L}(x, y, t) - \log f(t-\Delta t)\mathbf{L}(x, y, t-\Delta t) \right|\\ &=\left| \log f(t)\mathbf{L}(x, y) - \log f(t-\Delta t)\mathbf{L}(x, y) \right|\\ &=\left|\log f(t)-\log f(t-\Delta t)\right|\geq c. \end{aligned}$$
We can observe that the logarithmic threshold event-triggering characteristic of DVS, coupled with the separable form of the incident light modulation function, makes the triggering timestamps of all pixels depend solely on the light modulation function $f(t)$. These timestamps are independent of the scene radiance component $\mathbf{L}$. Uniformly adjusting the system’s incident light therefore triggers events with identical timestamps at every pixel, and these events carry no scene radiance information.
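
To make this concrete, the following minimal Python sketch (not part of the original system; the ramp $f(t)$, threshold $c$, and radiance values are assumed for illustration) simulates the triggering condition of Eq. (4) and shows that, under a purely multiplicative modulation, pixels with very different radiance produce events at identical timestamps.

```python
# Minimal sketch: purely multiplicative modulation I(t) = f(t) * L yields
# event timestamps that are independent of the scene radiance L (Eq. (4)).
import numpy as np

def event_times(intensity, t, c=0.1):
    """Timestamps at which the log-intensity change since the last event reaches c."""
    log_i = np.log(intensity)
    ref = log_i[0]
    times = []
    for k in range(1, len(t)):
        if abs(log_i[k] - ref) >= c:
            times.append(t[k])
            ref = log_i[k]
    return np.array(times)

t = np.linspace(0.0, 1.0, 10_000)
f = 0.05 + 0.95 * t                       # assumed monotonically increasing modulation f(t)
for L in (1.0, 10.0, 1000.0):             # three pixels with very different radiance
    print(L, event_times(f * L, t)[:3])   # the first event times are identical for every L
```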

Therefore, to encode scene radiance information into the timestamps of the event streams, the temporal variation component in Eq. (3) needs to be designed in a form that is inseparable from the scene radiance $\mathbf {L}$,

$$\mathbf{I}(x,y,t) = f_{inseparable}(t,\mathbf{L}(x,y)).$$

For example, our system adopts the form

$$\mathbf{I}(x,y,t) = f(t)\mathbf{L}(x,y) + g(t).$$

Under this setting, the event-triggering mode is as follows:
$$\left|\Delta \overline{\mathbf{I}}\left(x,y,t\right)\right| =\left| \log [f(t)\mathbf{L}(x, y)+g(t)] - \log [f(t-\Delta t)\mathbf{L}(x, y)+g(t)] \right|\geq c.$$

Under this modulation, the scene component is not cancelled out as it is in Eq. (4); the information about the scene radiance $\mathbf {L}$ can thus be encoded into the temporal characteristics of the event streams.

2.2 Construction of the AsynHDR imaging system

With the theoretical foundation from the previous subsection, we constructed an AsynHDR system in which the event-triggering incident light is modulated by LCD panels, as shown in Fig. 1. The system consists of a DVS, LCD panels, a signal generator, a beam splitter, and lenses. The sensor irradiance ($\mathbf {I}$) incident on the DVS pixel array is obtained by proportionally attenuating the environmental scene radiance ($\mathbf {L}$) through the LCD panels in the optical path and then transmitting it through the lenses. The mathematical expression for this process is:

$$\mathbf{I}(x,y,t) = T(t)k_{lens}\mathbf{L}(x,y,t).$$

Here, $T(t)\geq 0$ represents the transmittance of the LCD panels, $k_{lens}$ is the attenuation coefficient of the lens, and $\mathbf {I}(x,y,t)$ represents the irradiance component projected onto the DVS sensor pixel $(x,y)$ at time $t$ in this optical path.

Fig. 1. (a) Optical schematic diagram of our AsynHDR system. (b) Event triggering demonstration. The point cloud diagram illustrates events triggered by the dynamic modulation of LCD panels, where red represents positive events and blue represents negative ones. (c) Physical demonstration of the system.


In the AsynHDR system, the irradiance projected onto the sensor is composed of two beams modulated by LCD panels:

$$\mathbf{I}\left(x, y,t\right)=\mathbf{I}_{1}(x,y,t)+\mathbf{I}_{2}(t),$$
where $\mathbf{I}_{1}(x,y,t)$ represents the sensor irradiance component of the scene incident light, while $\mathbf{I}_{2}(t)$ is a uniform, weak incident light component that varies only with time and provides a consistent starting sampling value for all pixels,
$$\left\{ \begin{array}{c} \mathbf{I}_{1}(x,y,t) = T_{1}(t)k_{lens}\mathbf{L}(x, y),t_0<t<t_1 \\ \mathbf{I}_{2}(t) = T_{2}(t)k_{lens}\mathbf{L}_{const},t_0<t<t_1\\ \end{array} \right..$$

The critical formula for system event triggering is as follows:

$$\begin{aligned} \left|\Delta \overline{\mathbf{I}}\left(x,y,t\right)\right| = &\left| \log[T_{1}(t)k_{lens}\mathbf{L}(x, y) + T_{2}(t)k_{lens}\mathbf{L}_{const}] - \right.\\ &\left. \log[T_{1}(t-\Delta t)k_{lens}\mathbf{L}(x, y) + T_{2}(t-\Delta t)k_{lens}\mathbf{L}_{const}] \right| \geq c. \end{aligned}$$

The information of the scene radiance component $\mathbf {L}(x, y)$ corresponding to pixel point $(x, y)$ is encoded in the event stream, and HDR image reconstruction can be achieved through appropriate processing.
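
As an illustration of the triggering condition in Eq. (11), the sketch below (with assumed waveforms $T_1$, $T_2$, threshold $c$, and constants rather than the actual system parameters) computes the time at which a pixel accumulates its $k$-th threshold crossing; brighter pixels reach it earlier, which is the property exploited in the next subsection.

```python
# Sketch of Eq. (11): with T1(t) increasing and T2(t) decreasing more slowly,
# the k-th event of a brighter pixel occurs earlier than that of a darker pixel.
import numpy as np

c, L_const, k_lens = 0.1, 1.0, 1.0        # assumed threshold and constants
t = np.linspace(0.0, 1.0, 20_000)
T1 = t                                     # assumed: T1(0) = 0, monotonically increasing
T2 = 1.0 - 0.5 * t                         # assumed: T2(0) = 1, slower, decreasing

def kth_event_time(L, k):
    """Time at which the cumulative log-intensity change first reaches k*c."""
    log_i = np.log(T1 * k_lens * L + T2 * k_lens * L_const)
    idx = np.argmax(log_i - log_i[0] >= k * c)   # first index reaching k*c (reached for these L)
    return t[idx]

for L in (2.0, 10.0, 100.0):
    print(f"L = {L:6.1f}   t of 3rd event = {kth_event_time(L, 3):.4f}")
```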

2.3 Reconstruction of HDR intensity images from the event streams

Previous approaches have restored scene radiance by directly integrating events [32]. However, this method results in very few gray levels and is severely degraded by noise. We incorporate temporal information into the reconstruction process to achieve a low noise level and a nuanced gray-scale response.

Consider any two pixels $\left (x_{1}, y_{1}\right )$ and $\left (x_{2}, y_{2}\right )$. On the first optical path, the LCD transmission function $T_{1}(t)$ is monotonically increasing; on the second optical path, the LCD transmission function $T_{2}(t)$ is monotonically decreasing, with $|\frac {dT_{2}(t)}{dt}|<|\frac {dT_{1}(t)}{dt}|, \forall t\geq 0$. For the convenience of demonstrating the temporal sequence of events for different pixels, we introduce $\mathbf {L}_{const}$ at the initial moment ($t=0$) as an intermediary variable that is identical for every pixel. Assuming both pixels can trigger more than $k$ events, the critical condition for event triggering is:

$$\begin{array}{r} \log[T_{1}(t^{k}_{1})\mathbf{L}\left(x_{1}, y_{1}\right) + T_{2}(t^{k}_{1})\mathbf{L}_{const}] - \log[T_{1}(0)\mathbf{L} \left(x_{1},y_{1}\right) + T_{2}(0)\mathbf{L}_{const}] = \\ \log[T_{1}(t^{k}_{2})\mathbf{L} \left(x_{2},y_{2}\right) + T_{2}(t^{k}_{2})\mathbf{L}_{const}] - \log[T_{1}(0)\mathbf{L}\left(x_{2},y_{2}\right) + T_{2}(0)\mathbf{L}_{const}]=kc, \end{array}$$
where $kc$ represents the $k$-th order event triggering threshold. Substituting $T_{2}(0)=1,T_{1}(0)=0$ into the above equation yields:
$$T_{1}(t^{k}_{1})\mathbf{L}\left(x_{1}, y_{1}\right) + T_{2}(t^{k}_{1})\mathbf{L}_{const}= T_{1}(t^{k}_{2})\mathbf{L}\left(x_{2},y_{2}\right) + T_{2}(t^{k}_{2})\mathbf{L}_{const} =\exp(kc+\log \mathbf{L}_{const}).$$
We can deduce the relationship between the triggering moments $t^{k}_{1},t^{k}_{2}$ and the scene radiances $\mathbf{L}\left (x_{1}, y_{1}\right ),\mathbf{L}\left (x_{2}, y_{2}\right )$:
$$\begin{aligned}&t^{k}_{1}<t^{k}_{2},\\ s.t.\quad &\mathbf{L}\left(x_{1}, y_{1}\right)>\mathbf{L}\left(x_{2}, y_{2}\right)\geq\mathbf{L}_{const} . \end{aligned}$$

This implies that the relative brightness information of pixels is encoded in the temporal information of event triggering. As shown in Fig. 2, we utilized the AsynHDR system to capture the gray-scale gradient test card, showcasing event streams recorded in different scene radiance regions. Specifically, brighter pixels reach the triggering threshold earlier, resulting in smaller event timestamps.

Fig. 2. Pixel-wise presentation of events triggered at different light intensities. (a) Sampling points chosen from continuous stepped radiance levels on a gray-scale test card. (b) The events are triggered at different points along the timeline, where blue lines represent the positive events ($p_{i}$=+1), and colored triangles indicate different order events for each pixel.


Based on the above conclusions, we designed a temporal-weighted algorithm to extract information from the event stream and map it into an intensity image:

$$img(x,y) = \mathbf{L}_{const}*exp(\sum_{\substack{x_i=x,y_i=y}}f(t_i)*p_i*c),$$
where $img(x, y)$ represents the intensity of the pixel at coordinates $(x, y)$ in the recovered image, and $f(t)$ is the temporal weighting function, assigning a weight to each event based on its triggering timestamp. Considering $\mathbf {L}\left (x_{1}, y_{1}\right )> \mathbf {L}\left (x_{2}, y_{2}\right )\geq \mathbf {L}_{const}$, we aim to ensure the monotonicity of the system, i.e., that the pixel intensities $img\left (x_{1},y_{1}\right )$ and $img\left (x_{2}, y_{2}\right )$ recovered from the corresponding event streams preserve the same ordering as $\mathbf {L}\left (x_{1}, y_{1}\right )$ and $\mathbf {L}\left (x_{2}, y_{2}\right )$. Combining the previous derivation, the weighting function $f(t)$ only needs to be monotonically decreasing in the time domain to guarantee a monotonically consistent system. We next show how the introduction of $f(t)$ enhances imaging quality.
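
A minimal sketch of the reconstruction in Eq. (15) is given below; the event arrays, image shape, threshold $c$, $\mathbf{L}_{const}$, and the default weight function are placeholders rather than the released implementation. Setting the weight to a constant recovers the raw integral baseline, and $f(t)=1-t$ gives the linear variant discussed later.

```python
# Sketch of the temporal-weighted reconstruction (Eq. (15)): per-pixel weighted
# summation of events followed by exponentiation. Events are given as arrays
# (x, y, t, p) with timestamps normalized to [0, 1].
import numpy as np

def reconstruct(x, y, t, p, shape, c=0.1, L_const=1.0,
                weight=lambda t: 2.0 ** (-10.0 * t)):        # assumed default f(t)
    """Map an event stream to an intensity image by weighted event summation."""
    acc = np.zeros(shape, dtype=np.float64)
    np.add.at(acc, (y, x), weight(t) * p * c)                 # sum of f(t_i) * p_i * c per pixel
    return L_const * np.exp(acc)

# Usage sketch: img = reconstruct(ev_x, ev_y, ev_t, ev_p, shape=(720, 1280))
```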

In our DVS-HDR imaging system, noise mainly originates from two sources: pseudo-events triggered by fluctuations in the sensor dark current, and inconsistency in the event thresholds among pixels [37]. We mitigate the impact of pseudo-events on imaging by introducing and optimizing $f(t)$. Simultaneously, we estimate an event threshold correction map to eliminate the multiplicative fixed pattern noise (FPN) caused by threshold inconsistency, further enhancing the imaging signal-to-noise ratio (SNR).

We define pseudo-events and valid events as $e^{ps}_{i}$ and $e^{val}_{i}$. Substituting them into the reconstruction formula, the value of the reconstructed image at position $(x, y)$ can be expressed as follows:

$$\begin{aligned}img(x,y) &= \mathbf{L}_{const}*exp[(\sum_{\substack{x^{val}_i=x,y^{val}_i=y}}f(t^{val}_{i})*p_i^{val}+\sum_{\substack{x^{ps}_{i}=x,y^{ps}_{i}=y}}f(t^{ps}_{i})*p^{ps}_{i})*c]\\ &=\mathbf{L}_{const}*(E_{val}(x,y)*E_{pseudo}(x,y)). \end{aligned}$$

Assuming $\mathbf {L}\left (x_{1}, y_{1}\right )> \mathbf {L}\left (x_{2}, y_{2}\right )$, we define the difference term as:

$$\begin{aligned} \delta(x_1,y_1,x_2,y_2) =& \frac{img(x_1,y_1)}{img(x_2,y_2)}=\frac{E_{val}(x_1,y_1)}{E_{val}(x_2,y_2)}*\frac{E_{pseudo}(x_1,y_1)}{E_{pseudo}(x_2,y_2)}\\ =& \delta_{val}(x_1,y_1,x_2,y_2) * \delta_{ps}(x_1,y_1,x_2,y_2). \end{aligned}$$

We aim to identify a method that amplifies $\delta _{val}$ while only slightly affecting $\delta _{ps}$, thereby enhancing imaging quality. According to the mechanism of Eq. (14), among valid events of the same order, pixels with higher intensity trigger earlier than those with lower intensity, whereas pseudo-events do not possess this characteristic because they are triggered with equal probability across the time domain. The relationship between $\delta _{val}$ with and without the weighting function $f(t)$ can be expressed as:

$$\frac{E_{val}(x_1,y_1)}{E_{val}(x_2,y_2)}=\frac{\sum_{\substack{x_i=x_1,y_i=y_1}}f(t_i)*p_i}{\sum_{\substack{x_i=x_2,y_i=y_2}}f(t_i)*p_i}> \frac{\sum_{\substack{x_i=x_1,y_i=y_1}}p_i}{\sum_{\substack{x_i=x_2,y_i=y_2}}p_i} .$$

Considering $\frac {df(t)}{dt}<0$, Eq. (18) can be proven, and $\delta _{val}$ is amplified. The most straightforward monotonically decreasing function is linear, $f(t) = 1-t\ (0<t<1)$. In the experiments section, we analyze the enhancement achieved by different temporal weighting approaches on a standard test, and replace the linear function with the exponential function $f(t) = 2^{-k t}, \quad k \in \mathbb {Z}^{+}\ (0 < t < 1)$ to further amplify the SNR of the results. The final reconstruction formula is as follows:

$$img(x,y) = \mathbf{L}_{const}*exp(\sum_{\substack{x_i=x,y_i=y}}2^{{-}k*t_i}*p_i*c).$$

To address the noise introduced by varying pixel event-triggering thresholds, we introduce a calibration step to correct the imaging system, reducing fixed pattern noise (FPN) of a multiplicative nature. We adapt flat-field correction methods [38] used for removing FPN from image sensors, treating the AsynHDR hardware and algorithm as a single entity. We acquired the correction tensor ($c-map$) by repeatedly capturing images of a uniformly illuminated lightbox with a fixed LCD waveform and averaging the results:

$$c-map(x, y)=\frac{1}{n} \sum_{i=1}^n I_i(x, y).$$
We can significantly reduce the impact of FPN on imaging quality by using the $c-map$. Our experiments have confirmed the effectiveness of this approach (see Supplement 1). The adjusted formula is as follows:
$$img_{adjusted}(x,y) = \frac{img(x,y)*\overline{c-map}}{c-map(x,y)}.$$

The overline in the formula denotes the spatial average of the $c-map$.
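
The calibration and correction of Eqs. (20) and (21) amount to a flat-field-style procedure. A sketch is given below, assuming the reconstructions of the uniformly lit lightbox are available as a stack of images.

```python
# Sketch of the c-map estimation (Eq. (20)) and the multiplicative FPN
# correction (Eq. (21)); inputs are reconstructions of a uniform lightbox
# captured with a fixed LCD waveform.
import numpy as np

def estimate_cmap(flat_field_imgs):
    """Average repeated uniform-scene reconstructions into a correction map."""
    return np.mean(np.stack(flat_field_imgs, axis=0), axis=0)

def apply_cmap(img, c_map):
    """Divide out the multiplicative FPN and rescale by the map's spatial mean."""
    return img * c_map.mean() / c_map
```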

3. Experiments and analysis

3.1 System dynamic range test and imaging quality test

We employ a standard testing platform to evaluate the dynamic range of the AsynHDR system and showcase the performance of the temporal-weighted algorithm. The platform consists of a high-intensity uniform lightbox (160,000 lux illuminance) and a density filter array. By imaging a set of neutral density filters varying in transmittance on the array, we tested the dynamic range of the system. As shown in Fig. 3(a), the filter array exhibits uniformity in each region with varying transmittance density ($Dt$), which is computed as follows:

$$Dt = \lg \frac{P_{o}}{P_{t}},$$
where $P_{o}$ represents incident light and $P_{t}$ represents transmitted light.

Fig. 3. Dynamic range test and denoise algorithm experiment. (a) The stepped transmission brightness test card. (b) Illustration of the dynamic range test curve for the system. The table at the bottom displays the transmittance density of different filters for the filter array.


The HDR test results are displayed in Fig. 3(b), revealing that the AsynHDR system exhibits perceptual sensitivity to brightness variations across filter levels 2 through 29. The values of $Dt$ for the 2nd-order filter ($Dt_{0}$) and the 28th-order filter ($Dt_{1}$) are 0.1 and 5.23, respectively. The dynamic range of the AsynHDR system is calculated as follows:

$$DR = 20\lg \frac{L_{max}}{L_{min}} = 20(Dt_{1}-Dt_{0}) =102.6\,\mathrm{dB},$$
where $L_{max}$ is the maximum radiance distinguishable by the system, while $L_{min}$ is the minimum radiance distinguishable by the system.
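
As a quick arithmetic check of Eq. (23) (a trivial sketch using the densities reported in the text):

```python
# Dynamic range from the reported transmittance densities (Dt is a base-10 log).
Dt0, Dt1 = 0.1, 5.23           # 2nd- and 28th-order filter densities
DR_dB = 20 * (Dt1 - Dt0)       # 20 * lg(L_max / L_min)
print(f"{DR_dB:.1f} dB")       # 102.6 dB
```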

We conducted experiments on the same testing platform to validate the denoising capability of the temporal-weighted algorithm. As shown in Fig. 4(a), by calculating the Signal-to-Noise Ratio (SNR) in different filters of the array, we demonstrate the denoising ability of various event processing methods at different brightness levels. The SNR is calculated as follows:

$$SNR =10 \lg \left(\frac{\mu^{2}}{\sigma^{2}}\right),$$
where $\mu$ represents the mean pixel irradiance within the region, and $\sigma$ represents the standard deviation of the noise. In the event encoding step described in the previous section, we employ a weighting method to suppress noise. Considering that the reconstructed image combines information from both event timestamps and the accumulated number of events, increasing the value of $k$ in the weighting method indefinitely does not guarantee improved reconstruction. We explore the effect of $k$ in the AsynHDR system by comparing signal enhancement performance and ultimately choose $k=10$ based on the results, as shown in Fig. 4. Additionally, other types of temporal weighting functions, such as quadratic or higher-order polynomial functions, can also be employed as temporal weighting strategies to enhance the imaging. The denoising results of these other weighting functions are measured and compared in our SNR test, as shown in Table 1. It can be observed that among the numerous strategies, exponential weighting achieves the best results. Therefore, we use the exponential function in this context and optimize its parameters.
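
For reference, the sketch below evaluates Eq. (24) on one nominally uniform filter region of the reconstructed image, assuming a boolean mask per filter step (the segmentation of the test card is not detailed here).

```python
# Sketch of the per-region SNR measurement (Eq. (24)) used to compare
# weighting strategies; `mask` selects one uniform filter step of the card.
import numpy as np

def region_snr_db(img, mask):
    """SNR of a nominally uniform region: 10 * log10(mean^2 / variance)."""
    vals = img[mask]
    mu, sigma = vals.mean(), vals.std()
    return 10.0 * np.log10(mu ** 2 / sigma ** 2)

# Usage sketch: snrs = [region_snr_db(img, m) for m in filter_masks]
```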

Fig. 4. Analysis of Algorithm SNR. (a) The SNR curves of different uniform radiance regions on the test card under various temporal weighting enhancement strategies with/without c-map adjustment. (b) The average step-radiance SNR under different k-factor exponential temporal weighting.


Table 1. The $\overline {SNR}$ (mean SNR) results for different temporal weighting methods; the h-poly entry gives the SNR obtained with the best high-order polynomial weighting ($f(t)=(1-t)^5$).

Additionally, we present the imaging results of different temporal weighting methods, referring to the histograms of the results in Fig. 5. It can be observed that the image produced by our method in Fig. 5(b) has more gray levels compared to the raw integral in Fig. 5(a), indicating better imaging quality.

Fig. 5. Imaging results of event flow using different reconstruction methods. For each algorithm, we provide the zoom-in images of the orange block. (a) Directly accumulate events to image (raw integral + c-map adjust). (b) Imaging by exponentially temporal weighted summation of events (ours + c-map adjust). (c) The linear temporal weighted summation of events to image (linear + c-map adjust). (d) Imaging results of the frame camera with identical resolution and sensor size under the same optical system (GT).


3.2 Modulation frequency test

In the AsynHDR system, the response speed of LCD panels can reach up to 500 Hz, while the temporal resolution of the event camera (Metavision EVK4) can reach 1 $\mathrm {\mu }$s. We investigated the imaging speed performance boundaries of the AsynHDR system by adjusting the LCD trigger frequency. The transmission function used in the experiment is as follows:

$$\left\{ \begin{array}{l} T_{1}(t) = 0, \quad 0<t<0.3 \\ T_{1}(t) = 1-2^{{-}20(t-0.3)}, \quad 0.3<t<1\\ \end{array} \right.,$$
$$T_{2}(t) = 0.5*(1-T_{1}(t)).$$

Equations (25) and (26) represent the general case with a period of 1 s. The waveforms used for the experiments at other frequencies were obtained by temporally rescaling these functions. We modulate the light input with periodic signals at frequencies from 1 to 20 Hz and observe the temporal response of the event stream and the reconstruction quality of the algorithms. Experimental results indicate that the system images well at modulation frequencies up to 5 Hz; however, as the incident light modulation frequency increases, imaging quality degrades significantly. A comparison between low-frequency modulation (1 Hz, 2 Hz) and high-frequency modulation (10 Hz, 20 Hz) in Fig. 6 shows that, at high modulation frequencies, the refractory period of the event stream (also called the tailing effect [31] or smearing effect [39]), caused by the recovery of the light valve, has a much larger impact within each modulation period, so some valid events are missed and the grayscale mapping of the reconstructed images becomes misaligned. Additionally, the influence of the sampling "gaps" in the event stream caused by limited signal transmission bandwidth is magnified at high frequencies, further degrading imaging quality.
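
For reproducibility, a sketch of these transmission waveforms, rescaled in time to an arbitrary modulation frequency, is given below; the actual drive signals and the process of selecting them are described in Supplement 1.

```python
# Sketch of the LCD transmission waveforms, periodically extended and rescaled
# from the 1 s base period to a chosen modulation frequency (assumed form).
import numpy as np

def T1(t, freq=1.0):
    """First LCD: zero for 30% of the period, then an exponential rise toward 1."""
    phase = (t * freq) % 1.0                                  # normalize to one period
    return np.where(phase < 0.3, 0.0, 1.0 - 2.0 ** (-20.0 * (phase - 0.3)))

def T2(t, freq=1.0):
    """Second LCD: a weak, complementary waveform."""
    return 0.5 * (1.0 - T1(t, freq))

t = np.linspace(0.0, 1.0, 1000)
print(T1(t, freq=5.0).max(), T2(t, freq=5.0).max())           # e.g. 5 Hz modulation
```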

Fig. 6. Modulation frequency test. Temporal experiments were conducted to explore the temporal imaging speed boundaries of the AsynHDR system. The figures respectively depict the triggering event streams (the x-axis represents the time axis, while the y-axis represents pixel coordinates) and imaging results under different incident light modulation frequencies. At a frequency of 20 Hz, the sampling "gap" in the event stream, the refractory period effect, and grayscale misalignment in the reconstructed image can be observed. Additionally, a waveform diagram of the LCD panel’s transmittance is provided as a reference. We described the process of selecting the LCD transmission function in Supplement 1.


3.3 Experiment in real HDR scenes

We selected two challenging scenes for real-scene evaluation of the system: the first involves simultaneously capturing a dark box and an incandescent lamp to assess performance in extreme HDR scenarios (Fig. 7(a)); the second depicts an outdoor setting with a bright afternoon sky as the background (Fig. 7(b)), demonstrating the system’s HDR performance in open outdoor environments. Considering the inherent limitation of active light-triggered methods [31,33] in imaging outdoor scenes and scenes containing light sources, this set of experiments further validates the superiority of our system.

Fig. 7. HDR imaging through the AsynHDR system. After carefully selecting different exposure times to capture 10 images using the frame camera in the system, the MEF method [1] is employed to obtain the real scene radiance as a reference (GT). Our method is compared with frame-based cameras under long and short exposures to showcase HDR performance. Additionally, the directly integrating events (raw integral) method is also included for reference, to demonstrate the enhancement achieved by our algorithm. (a) The imaging results in the scene with a light source. (b) The imaging results under outdoor scenarios.


4. Conclusion

In this paper, we proposed an approach for constructing HDR imaging systems using asynchronous sensors, addressing HDR challenges through asynchronous sampling. Our experiments in HDR scenarios validate that the DVS can independently serve as the sensor of a robust multi-scene imaging system. This implies that, using the approach presented in this paper, the DVS could replace frame-based cameras as the sensor in devices such as mobile phones and autonomous vehicles, rather than serving only as an imaging auxiliary.

Although the AsynHDR system effectively addresses HDR challenges, its frame rate is constrained to 20 fps by the bandwidth limitations of the DVS sensor, and it is limited in handling fast-moving scenes because scene radiance information is temporally coded. However, with advancements in DVS sensors, we anticipate future improvements in the system’s frame rate and plan to explore solutions for motion scenes in future work. Moreover, using a DVS sensor designed with a Bayer matrix, AsynHDR could achieve color HDR imaging, similar to a frame-based RGB camera.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (NSFC) under Grants 62225207 and 62306295.

Disclosures

The authors declare that there are no conflicts of interest related to this article.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Supplemental document

See Supplement 1 for supporting content.

References

1. P. E. Debevec and J. Malik, “Recovering high dynamic range radiance maps from photographs,” in ACM SIGGRAPH 2008 classes, (2008).

2. T. Jinno and M. Okuda, “Multiple exposure fusion for high dynamic range image acquisition,” IEEE Trans. on Image Process. 21(1), 358–365 (2012). [CrossRef]  

3. S. B. Kang, M. Uyttendaele, S. Winder, et al., “High dynamic range video,” ACM Trans. Graph. 22(3), 319–325 (2003). [CrossRef]  

4. M. Mase, S. Kawahito, M. Sasaki, et al., “A wide dynamic range cmos image sensor with multiple exposure-time signal outputs and 12-bit column-parallel cyclic a/d converters,” IEEE J. Solid-State Circuits 40(12), 2787–2795 (2005). [CrossRef]  

5. S. W. Hasinoff, D. Sharlet, R. Geiss, et al., “Burst photography for high dynamic range and low-light imaging on mobile cameras,” ACM Trans. Graph. 35(6), 1–12 (2016). [CrossRef]  

6. S. W. Hasinoff and K. N. Kutulakos, “Multiple-aperture photography for high dynamic range and post-capture refocusing,” IEEE Transactions on Pattern Analysis and Machine Intelligence 1, 3 (2009).

7. M. A. Martínez, E. M. Valero, and J. Hernández-Andrés, “Adaptive exposure estimation for high dynamic range imaging applied to natural scenes and daylight skies,” Appl. Opt. 54(4), B241–B250 (2015). [CrossRef]  

8. M. D. Tocci, C. Kiser, N. Tocci, et al., “A versatile hdr video production system,” ACM Trans. Graph. 30(4), 1–10 (2011). [CrossRef]  

9. S. Hajisharif, J. Kronander, and J. Unger, “Adaptive dualiso hdr reconstruction,” J Image Video Proc. 2015(1), 41 (2015). [CrossRef]  

10. A. Srikantha and D. Sidibé, “Ghost detection and removal for high dynamic range images: Recent advances,” Signal Process. Image Commun. 27(6), 650–662 (2012). [CrossRef]  

11. S. Silk and J. Lang, “High dynamic range image deghosting by fast approximate background modelling,” Comput. & Graph. 36(8), 1060–1071 (2012). [CrossRef]  

12. K. Karaduzovic-Hadžiabdic, J. H. Telalovic, and R. K. Mantiuk, “Assessment of multi-exposure hdr image deghosting methods,” Comput. & Graph. 63, 1–17 (2017). [CrossRef]  

13. T. Yamashita and Y. Fujita, “Hdr video capturing system with four image sensors,” ITE Trans. on Media Technol. Appl. 5(4), 141–146 (2017). [CrossRef]  

14. K. Seshadrinathan and O. Nestares, “High dynamic range imaging using camera arrays,” in International Conference on Image Processing (IEEE, 2017), pp. 725–729.

15. T. T. Huynh, T.-D. Nguyen, M.-T. Vo, et al., “High dynamic range imaging using a 2×2 camera array with polarizing filters,” in 19th International Symposium on Communications and Information Technologies (IEEE, 2019), pp. 183–187.

16. S. Nayar and T. Mitsunaga, “High dynamic range imaging: spatially varying pixel exposures,” in Conference on Computer Vision and Pattern Recognition (IEEE, 2002).

17. M. Saxena, G. Eluru, and S. S. Gorthi, “Structured illumination microscopy,” Adv. Opt. Photonics 7(2), 241–275 (2015). [CrossRef]  

18. A. A. Adeyemi, N. Barakat, and T. E. Darcie, “Applications of digital micro-mirror devices to digital optical microscope dynamic range enhancement,” Opt. Express 17(3), 1831–1843 (2009). [CrossRef]  

19. N. A. Riza and J. P. La Torre, “Demonstration of 136 db dynamic range capability for a simultaneous dual optical band caos camera,” Opt. Express 24(26), 29427–29443 (2016). [CrossRef]  

20. W. Feng, F. Zhang, X. Qu, et al., “Per-pixel coded exposure for high-speed and high-resolution imaging using a digital micromirror device camera,” Sensors 16(3), 331 (2016). [CrossRef]  

21. M. A. Mazhar and N. A. Riza, “96 db linear high dynamic range caos spectrometer demonstration,” IEEE Photonics Technol. Lett. 32(23), 1497–1500 (2020). [CrossRef]  

22. X. Guan, X. Qu, B. Niu, et al., “Pixel-level mapping method in high dynamic range imaging system based on dmd modulation,” Opt. Commun. 499, 127278 (2021). [CrossRef]  

23. S. K. Nayar, V. Branzoi, and T. E. Boult, “Programmable imaging: Towards a flexible camera,” Int. J. Comput. Vis. 70(1), 7–22 (2006). [CrossRef]  

24. Y. Qiao, X. Xu, T. Liu, et al., “Design of a high-numerical-aperture digital micromirror device camera with high dynamic range,” Appl. Opt. 54(1), 60–70 (2015). [CrossRef]  

25. J. Zhou, Y. Qiao, Z. Sun, et al., “Design of a dual dmds camera for high dynamic range imaging,” Opt. Commun. 452, 140–145 (2019). [CrossRef]  

26. W. Feng, F. Zhang, W. Wang, et al., “Digital micromirror device camera with per-pixel coded exposure for high dynamic range imaging,” Appl. Opt. 56(13), 3831–3840 (2017). [CrossRef]  

27. H. Mannami, R. Sagawa, Y. Mukaigawa, et al., “Adaptive dynamic range camera with reflective liquid crystal,” J. Vis. Commun. Image Represent. 18(5), 359–365 (2007). [CrossRef]  

28. S. K. Nayar and V. Branzoi, “Adaptive dynamic range imaging: Optical control of pixel exposures over space and time,” in Ninth International Conference on Computer Vision (IEEE, 2003), pp. 1168–1175.

29. H. Rebecq, R. Ranftl, V. Koltun, et al., “Events-to-video: Bringing modern computer vision to event cameras,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (2019).

30. Y. Yang, J. Han, J. Liang, et al., “Learning event guided high dynamic range video reconstruction,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (2023), pp. 13924–13934.

31. J. Han, Y. Asano, B. Shi, et al., “High-fidelity event-radiance recovery via transient event frequency,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (2023), pp. 20616–20625.

32. M. Muglikar, G. Gallego, and D. Scaramuzza, “Esl: Event-based structured light,” in International Conference on 3D Vision (IEEE, 2021), pp. 1165–1174.

33. T. Takatani, Y. Ito, A. Ebisu, et al., “Event-based bispectral photometry using temporally modulated illumination,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (2021), pp. 15638–15647.

34. X. Huang, Y. Zhang, and Z. Xiong, “High-speed structured light based 3d scanning using an event camera,” Opt. Express 29(22), 35864–35876 (2021). [CrossRef]  

35. X. Liu, J. D. Rego, S. Jayasuriya, et al., “Event-based dual photography for transparent scene reconstruction,” Opt. Lett. 48(5), 1304–1307 (2023). [CrossRef]  

36. J. Fu, Y. Zhang, Y. Li, et al., “Fast 3d reconstruction via event-based structured light with spatio-temporal coding,” Opt. Express 31(26), 44588–44602 (2023). [CrossRef]  

37. Z. Wang, Y. Ng, P. van Goor, et al., “Event camera calibration of per-pixel biased contrast threshold,” arXiv:2012.09378 (2020). [CrossRef]  

38. J. A. Seibert, J. M. Boone, and K. K. Lindfors, “Flat-field correction technique for digital detectors,” in Medical Imaging 1998: Physics of Medical Imaging, vol. 3336 (SPIE, 1998), pp. 348–354.

39. M. Ng, Z. M. Er, G. S. Soh, et al., “Aggregation functions for simultaneous attitude and image estimation with event cameras at high angular rates,” IEEE Robot. Autom. Lett. 7(2), 4384–4391 (2022). [CrossRef]  

Supplementary Material (1)

Supplement 1: Supplementary Material (PDF)

