Abstract
We present a new method for measuring camera vibrations such as camera shake and shutter shock. This method successfully detects the vibration trajectory and transient waveforms from the camera image itself. We employ a time-varying pattern as the camera test chart over the conventional static pattern. This pattern is implemented using a specially developed blinking light-emitting-diode array. We describe the theoretical framework and pattern analysis of the camera image for measuring camera vibrations. Our verification experiments show that our method has a detection accuracy and sensitivity of 0.1 pixels, and is robust against image distortion. Measurement results of camera vibrations in commercial cameras are also demonstrated.
© 2017 Optical Society of America
1. Introduction
Camera images inevitably exhibit blurring owing to various vibrations that occur while a photograph is captured. A primary cause of image blurring is camera shake, i.e., the unintended motion of the camera during exposure in hand-held shooting. Camera shake significantly affects the image quality, and consequently, many commercial cameras are equipped with image stabilizers that reduce camera shake. The use of a tripod is another popular way to address camera shake. However, it is difficult to completely suppress camera vibrations.
Vibrations that lead to image blur are also caused by mirror slap and shutter shocks that are produced within the body of the camera. These vibrations are also inevitable even with the use of a heavy tripod and either a remote shutter release or a self-timer. Mirror slap refers to the vibration that is caused by the flipping movement of the mirror upward and away from the light path immediately before exposure; the phenomenon is inherent in single lens reflex (SLR) cameras. This effect is avoidable by using mirror lock-up or mirror-up delay features. Moreover, the recently developed mirrorless camera structurally eliminates mirror slap.
However, even mirrorless cameras are subject to shutter shock. Many cameras have a shutter curtain that prevents the exposure of the image sensor to light; this curtain is positioned immediately in front of the focal plane of the image sensor. The exposure process involves the opening and closing of the shutter curtain, and this movement induces a slight vibration in the image sensor, which is termed shutter shock.
The magnitude of camera vibrations such as mirror slap and shutter shock is very small compared with that of camera shake; however, these types of vibrations still affect the image quality. Addressing this problem involves exact quantification of the vibrations occurring in the shuttering process. Consequently, a detailed analysis of camera vibrations is necessary for developing high-definition cameras.
Various efforts to measure vibrations have been made over the past few decades. Popular vibration-measurement techniques include laser Doppler vibrometry [1], holographic interferometry [2], and speckle photography [3,4], which have been in use for nearly 30 years. Moreover, gyro sensors and accelerometers can effectively detect relatively large vibrations, and therefore, these devices are also used in image stabilization [5] and deblurring [6] to suppress camera shake.
The standard method for assessing image quality involves the use of a test chart as the photographic subject; e.g., optical blur is often estimated using a test chart. A test chart with elements dispersed across the image allows the estimation of spatially variant blur and deblurring with the estimated blur kernels [7]. Further, a simple point light source can be used, since in this case the camera image corresponds to the point spread function (PSF) representing blur and camera shake in the optical imaging system [8]. The modulation transfer function (MTF) of the system is derived by the Fourier transform of the PSF, and it can be used for evaluating the residual vibration after image stabilization [9]. The edge information of the image can also be utilized to measure the efficacy of the image stabilizer. These days, the evaluation results of many newly developed cameras are posted as reviews on product websites [10].
An index based on the slope angle of the edge has also been proposed to address camera vibration measurements [11]. In this context, researchers have recently established the performance evaluation criteria of image stabilizers [12]. However, it is difficult to separate only the camera-shake component from the “complex” blurred image (which may also contain defocusing components) and evaluate it independently. A method for computing the MTF of blurring owing to vibrations by using charts with different resolutions has also been proposed [13,14].
Other related methods for vibration measurement include the tracking of a point light source, which allows the detection of the camera shake trajectory [15]. Target patterns [16,17] and feature points in the test object [18] and distributed light-emitting diodes (LEDs) [19] have been used for investigating vibrations in buildings and bridges. Focusing on small changes in the pixel values in video images allows detection of vibrations with sub-pixel accuracy [20]. This class of techniques is based on analyzing video images. The camera motion with relatively large vibrations can also be estimated using the inter-frame displacement of video images. A method for suppressing motion blur with a video camera has also been proposed [21]. The spectroscopical stereo photogrammetry method enables measurement of full-field surface vibrations [22]. A method for accurately measuring the in-plane displacement distribution by analyzing repeated patterns that are contained in the subject has also been proposed [23].
However, none of the above methods can be used for the measurement of camera vibrations. Video cameras or specialized still cameras cannot be used, since the vibrations need to be detected from the image taken by the camera in question. By placing mirrors on the camera body and observing the reflection of laser beams from a long distance, minute vibrations can be detected with very high accuracy and sensitivity. However, we want to detect the vibration of the image sensor, which directly affects the image, and not vibrations of the camera body or other parts that do not necessarily affect the image. Since the image sensor is located inside the camera, it is difficult to attach mirrors to it and observe the reflection of a laser beam. Further, vibration sensors placed on the camera do not provide exact measurements of the vibrations affecting the image sensor. There are no existing methods that enable measurement of minute camera vibrations directly from the image itself. Camera and tripod manufacturers thus have no means of identifying the detailed behavior of camera vibrations and have not adequately incorporated such knowledge into camera design.
With this background, our proposed method makes it possible to detect the trajectory and transient waveforms of camera vibrations from the camera image itself. The most distinctive feature of our approach is the use of a time-varying pattern instead of the conventional static one as the camera test chart. When the conventional static pattern is used, image blur due to defocusing becomes dominant. Therefore, the effect of camera vibrations is buried in the defocusing blur and cannot be extracted from the camera image. Employing the time-varying pattern resolves this defect and also enables the detection of micro-vibrations such as mirror slap and shutter shock upon analysis of the camera image. The time-varying pattern is implemented using a specially developed blinking LED array. The basic concept and certain experimental examples have been presented in previous conference proceedings [24,25]; however, the approach has not yet been subjected to theoretical analysis, and the accuracy of the method has not been verified.
In this study, we present a new pattern analysis method to detect vibrations more accurately. This paper is organized as follows: The theoretical interpretation of vibration detection using the time-varying pattern and the pattern analysis of the camera image are described in Section 2. The analysis demonstrates that vibration in the direction of the roll axis can also be detected. In Section 3, the detection accuracy, sensitivity, and limitations are verified and compared with results when using the previous method. The measurement results of various camera vibrations, such as camera shakes, residual vibrations of the image stabilizer, mirror slaps, and shutter shocks in commercial cameras, are presented. In Section 4, the robustness of the method against image distortion and some possible expansions are discussed. Section 5 presents our main conclusions.
2. Principle of vibration measurement with time-varying pattern
2.1 Formulation of blurring process
The camera image is obtained by shooting a test chart, as illustrated in Fig. 1. Let us consider the geometric relationship between the test chart and the camera image, where any distortion of the optical system is neglected for simplicity. As shown in Fig. 2, an arbitrary point in the test chart is projected to a corresponding point on the image sensor under the in-focus condition. This relationship can be described using a planar homography [26]. As illustrated in Appendix A, it leads to the relationship given in Eq. (1).
Camera vibrations continuously vary this geometric relationship from moment to moment; therefore, the parameters of Eq. (1) become functions of time. Over the exposure interval, the camera vibration traces a three-dimensional (3D) trajectory, and the test chart is accordingly transformed on the image sensor during camera motion. Since the camera vibration occurs during exposure, between the shutter opening and closing, the camera image is given by accumulating the transformed chart over the shutter aperture time, as expressed by Eq. (3). This expression implies that multiple exposures occur over the shutter aperture time. If image blur due to defocusing exists together with vibrations, we can assume that this blur is contained within the test image itself from the beginning. The abovementioned process is illustrated in Fig. 1. The inverse problem of Eq. (3) needs to be solved in order to estimate the vibration trajectory from the blurred image; however, this process is very complex in general.
2.2 Introduction of time-varying pattern
The key idea for simplifying the inverse problem is to employ a time-varying pattern as the test chart, i.e., a pattern that changes over the exposure interval instead of remaining static. For easy implementation and computation, we choose the discrete-time form given in Eq. (4).
This form implies that each frame appears (“blinks”) for just one instant, and the next frame appears after a fixed interval. The number of frames is chosen so that the full sequence spans the shutter aperture time. Consequently, Eq. (3) can be rewritten as Eq. (5), in which the integral operation is transformed into a summation. Here, we choose the frames such that any two distinct frames are mutually orthogonal, i.e., the frames do not overlap even when displaced. Each frame can then be separated individually from the camera image because there is no interference between frames. By comparison with the corresponding frame of the original pattern, the displacement contained within each separated frame can be estimated. By means of the above procedure, the vibration trajectory can be obtained as a data series sampled at discrete-time intervals.
The use of a time-varying pattern allows an arbitrary choice of patterns as long as the chosen pattern satisfies the orthogonality condition. For easy implementation and computation, we choose a square lattice pattern in which each frame is an M × M matrix of points. Figures 3(a) and 4 illustrate the relationship between the time-varying pattern (N = 4, M = 3) and the camera image. The sequence of frames is obtained by shifting the lattice pattern sequentially from top left to bottom right. The frame sequence corresponds to multiple exposures and produces a single camera image. Therefore, the camera image represented by Eq. (5) is recorded as an “accumulated” lattice pattern containing the lattice points of all frames. Each per-frame lattice pattern, termed a partial lattice, is subjected to the motion occurring during its frame interval. Figure 4 illustrates the estimation of the displacements from the camera image: the camera image can be decomposed into N partial lattices with displacements, and the displacements can be computed by matching the decomposed partial lattices to the original ones without displacements.
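The interleaving of frames and the orthogonality condition can be illustrated with a short sketch (a minimal simulation; the function and variable names are ours, not the paper's):

```python
import numpy as np

def partial_lattice(n, S, M):
    """Lattice sites lit in frame n of an N = S*S frame sequence.

    Each frame lights an M x M partial lattice; shifting the lighting
    position from top left to bottom right interleaves the frames on an
    (S*M) x (S*M) grid, so any two distinct frames occupy disjoint
    sites (the orthogonality condition of Section 2.2)."""
    r, c = divmod(n, S)                   # shift position of frame n
    rows = r + S * np.arange(M)           # every S-th site, offset by r
    cols = c + S * np.arange(M)
    return {(int(i), int(j)) for i in rows for j in cols}

# N = 4 frames (S = 2), M = 3  ->  a 6 x 6 accumulated lattice
S, M = 2, 3
frames = [partial_lattice(n, S, M) for n in range(S * S)]
accumulated = set().union(*frames)
assert len(accumulated) == (S * M) ** 2                  # all 36 sites covered
assert sum(len(f) for f in frames) == len(accumulated)   # no overlap
```

Because the frames occupy disjoint sites, summing them reproduces the full accumulated lattice, and each frame can be peeled off again without interference.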
For comparison, Fig. 3(b) shows the case in which a conventional static pattern is used. The pattern remains unchanged during the time interval over which the shutter is open. The resulting camera image is blurred due to vibrations. Since the vibrations are in effect “buried” in the blurred image, it is very difficult to extract the vibration trajectory from the image.
2.3 Detection of vibration trajectory
To extract the vibration trajectory from a camera image such as the one in Fig. 3(a), the translational and rotational displacements of each partial lattice need to be estimated accurately. The process consists of the following steps, and as an example, Figs. 6 and 7 depict the process for a pattern with N = 4, M = 3.
(1) Acquisition of reference pattern
First, the original lattice pattern including all partial lattices without vibrations, i.e., the reference pattern, needs to be prepared, because the displacement of each partial lattice is computed with respect to the reference pattern placed at the home position. The reference pattern is acquired by mounting the test camera on a tripod and photographing the lattice pattern under vibration-free conditions. To prevent even a slight vibration such as a shutter shock, the lattice pattern is displayed only for a moment after the shutter opens, once any vibration has diminished sufficiently. The lattice pattern can then be recorded with no vibration and cropped from the camera image. Figure 5 illustrates this process.
(2) Estimation of lattice point positions
Estimating the displacements due to vibrations requires calculating the position of each partial lattice precisely. We focus on the positional relationship of the lattice points rather than on the image itself. As shown in Figs. 6(a)–6(d), a lattice point is defined as the center position of each filled circle in the accumulated lattice pattern. The filled circles are gray-scale blobs with rough-edged contours. Each rough-edged circle is matched to an exact circle with a smooth contour, and the center position is given by the fitted circle. The circle contour and center position can be estimated with subpixel accuracy by using a circle-fitting technique; here, we use MVTec's machine vision tool HALCON.
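As a rough stand-in for the circle-fitting step, a subpixel blob center can be sketched with an intensity-weighted centroid (an illustrative assumption; the HALCON fitting used in the paper is more sophisticated):

```python
import numpy as np

def subpixel_center(patch, thresh=0.2):
    """Estimate a bright blob's center with subpixel accuracy via an
    intensity-weighted centroid (an illustrative stand-in for the
    circle-fitting tool used in the paper)."""
    p = patch.astype(float)
    p = np.where(p >= thresh * p.max(), p, 0.0)   # suppress background
    ys, xs = np.mgrid[0:p.shape[0], 0:p.shape[1]]
    w = p.sum()
    return (xs * p).sum() / w, (ys * p).sum() / w

# synthetic blob centered at (10.3, 7.8)
ys, xs = np.mgrid[0:16, 0:21]
blob = np.exp(-((xs - 10.3) ** 2 + (ys - 7.8) ** 2) / (2 * 2.0 ** 2))
cx, cy = subpixel_center(blob)
assert abs(cx - 10.3) < 0.05 and abs(cy - 7.8) < 0.05
```

The thresholding keeps the centroid from being pulled by background intensity; with a symmetric blob, the estimate lands well inside a tenth of a pixel.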
(3) Grouping into partial lattices
The accumulated lattice pattern consists of N partial lattices, each comprising the points of one frame of the time-varying pattern. We group the calculated lattice points into the corresponding partial lattices, i.e., we assign each lattice point to the partial lattice to which it belongs. This process is illustrated in Figs. 6(d) and 6(e). The lattice points belonging to the same partial lattice can be selected from all lattice points in the image, because the point-to-point interval can be determined beforehand from the size of the outer frame recorded in the camera image (e.g., see Fig. 8), from which the location and size of the target partial lattice can be estimated. Similarly, the reference pattern can also be grouped into partial lattices.
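The grouping step can be sketched as follows, assuming the point-to-point interval (pitch) and the lattice origin are already known from the outer frame; the names and the phase-based frame indexing are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def group_points(points, origin, pitch, S):
    """Assign detected lattice points to their partial lattices.

    `pitch` is the point-to-point interval in pixels (known from the
    outer frame) and `origin` the top-left lattice site.  The frame
    index is read off from the nearest grid site's (row mod S,
    col mod S) phase -- hypothetical bookkeeping that matches the
    frame interleaving of Section 2.2."""
    groups = {n: [] for n in range(S * S)}
    for x, y in points:
        col = round((x - origin[0]) / pitch)   # nearest grid column
        row = round((y - origin[1]) / pitch)   # nearest grid row
        groups[(row % S) * S + (col % S)].append((x, y))
    return groups

# demo: 6 x 6 grid (S = 2, M = 3), 40 px pitch, jitter below pitch/2
rng = np.random.default_rng(0)
pts = [(40.0 * j + rng.uniform(-3, 3), 40.0 * i + rng.uniform(-3, 3))
       for i in range(6) for j in range(6)]
groups = group_points(pts, origin=(0.0, 0.0), pitch=40.0, S=2)
assert all(len(g) == 9 for g in groups.values())   # M*M points per frame
```

The nearest-site assignment is exactly why the vibration amplitude must stay below half the lattice spacing: larger displacements would snap a point to the wrong site.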
(4) Calculation of translational and rotational displacements of partial lattices
We calculate the translational and rotational displacements of each partial lattice in the accumulated lattice pattern. From Eq. (1), the relationship between a displaced lattice point and the corresponding reference point is represented by Eq. (6) at the m-th point of the partial lattice corresponding to the n-th frame. The geometric relationship is depicted in Fig. 7. Our goal is to estimate the horizontal/vertical/roll displacements at each frame on the basis of the lattice point positions given by Steps (2) and (3). Since every lattice point belonging to the same frame undergoes the same displacement, Eq. (6) holds simultaneously for every point of the partial lattice; the points therefore satisfy the simultaneous equations of Eq. (7). In practice, Eq. (7) has no consistent solution because disturbances are inevitable. In this case, the least-squares estimation of Eq. (8) is more suitable for solving Eq. (7), and the solution is given by Eq. (9), which can be calculated at each frame n. The time-series solution represents the vibration trajectory, i.e., our final goal. The higher the number of frames N, the higher the number of sampling points that can be obtained. Moreover, averaging over the M × M points of each partial lattice reduces the estimation error relative to the estimate from a single lattice point in the sense of the mean-square deviation; thus, an increase in the matrix order M of each partial lattice reduces the estimation error. For the actual implementation, the total size of the lattice pattern is constrained only by manufacturing limitations.
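A minimal sketch of the least-squares step, assuming a small-roll-angle linearization of the rigid displacement (the exact form of Eqs. (6)–(9) is not reproduced here; names are ours):

```python
import numpy as np

def fit_frame_displacement(ref, obs):
    """Least-squares estimate of the horizontal/vertical/roll
    displacement (u, v, theta) of one partial lattice, linearized for
    small roll angles:  dx = u - theta*y,  dy = v + theta*x."""
    ref, obs = np.asarray(ref, float), np.asarray(obs, float)
    x, y = ref[:, 0], ref[:, 1]
    A = np.zeros((2 * len(ref), 3))
    A[0::2, 0] = 1.0      # dx rows depend on u ...
    A[0::2, 2] = -y       # ... and on theta via -theta*y
    A[1::2, 1] = 1.0      # dy rows depend on v ...
    A[1::2, 2] = x        # ... and on theta via +theta*x
    b = (obs - ref).reshape(-1)            # [dx0, dy0, dx1, dy1, ...]
    (u, v, theta), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v, theta

# demo: 3 x 3 partial lattice, 40 px pitch, applied (u, v, theta)
ref = [(40.0 * j, 40.0 * i) for i in range(3) for j in range(3)]
th = 0.01  # rad
obs = [(x * np.cos(th) - y * np.sin(th) + 1.2,
        x * np.sin(th) + y * np.cos(th) - 0.7) for x, y in ref]
u, v, theta = fit_frame_displacement(ref, obs)
assert abs(u - 1.2) < 0.05 and abs(v + 0.7) < 0.05 and abs(theta - th) < 1e-3
```

Averaging over all nine points of the partial lattice is what pushes the per-frame estimate below a single point's localization error.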
3. Experiments
3.1 Implementation with blinking LED array
We implemented the time-varying pattern by developing a specially designed LED display, shown in Fig. 8(a). It is composed of an array of LEDs with a spacing of 7 mm between the columns and rows, and a square outer frame bordered by a white line of thickness 1 mm. Each LED has a uniform white-light-emitting face with a diameter of 1.5 mm. The square outer frame is used as a marker for locating the LED array in the camera image. It is also used for determining the correspondence between the physical length of a side of the square outer frame and the number of pixels in the image; this relationship facilitates mutual conversion between millimeters and pixels.
Each frame of the time-varying pattern corresponds to a partial lattice with a 3 × 3 LED array, and a total of 64 frames is obtained by shifting the lighting position in sequence from the top left to the bottom right, i.e., N = 64, M = 3. After the lighting position reaches the bottom right, it returns to the top left and repeats cyclically. The switching interval between frames can be selected arbitrarily to fit the shutter speed. As shown in Fig. 8(b), the interval needs to be adjusted such that the shutter closes just before the lighting sequence repeats, so that the captured image has a small blank space in which lattice points are not recorded. This blank space enables determination of the partial lattices corresponding to the instants of shutter closing and opening from the fore-and-aft positions of the blank space. The light-emitting period is set at about 100 μs, which yields sufficient brightness but causes no image lag. Using this LED display, we experiment with the detection of various vibrations as follows.
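The interval adjustment above amounts to a simple calculation; the number of blank frames chosen here is an illustrative assumption, not the paper's actual setting:

```python
def frame_interval(exposure_s, n_frames=64, blank_frames=2):
    """Choose the LED switching interval so that the shutter closes
    just before the 64-frame cycle repeats, leaving `blank_frames`
    unlit positions as the gap marking shutter open/close (Fig. 8(b)).
    The blank-frame count is illustrative."""
    return exposure_s / (n_frames - blank_frames)

dt = frame_interval(0.5)            # 0.5 s exposure
assert dt * 62 <= 0.5 + 1e-9        # 62 lit frames fit in the exposure
assert dt * 64 > 0.5                # the full cycle does not repeat
```

The unlit gap then marks where the exposure ended within the cycle, which is how the shutter-close frame is identified.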
3.2 Verification of detection accuracy and limitation
We verify the detection accuracy and limitations using the two following experimental setups. Using a vibration exciter as shown in Fig. 9(a), a test camera is moved along a circular path on the two-dimensional surface described by the pitch and yaw axes. The radius of the circular motion is 0.05°, and the rotation period is 10 s per cycle. Since the rotation is very slow, the moment of inertia is negligible, and the vibration exciter is able to trace an exact circle. The distance between the camera and the LED display is 2.0 m, and the exposure time is 10 s. We initially confirmed that the test camera itself does not generate any vibrations such as shutter shock.
Figure 10(c) shows the trajectories calculated by means of the method described in Section 2.3. As shown in Fig. 10(a), the lattice pattern is significantly distorted by the applied motion. Since the square outer frame of the undeformed lattice pattern corresponds to 1659 pixels per side, the diameter of the applied circular motion is estimated to be 30 pixels in the image. From Fig. 10(c), we observe that the trajectory detected from the image nearly coincides with the applied one. The detection error, i.e., the difference between the applied trajectory and the estimated one, is about 0.20 pixels in terms of the root-mean-square deviation. As described in Section 2.3, the least-squares estimation over the lattice points yields sub-pixel accuracy.
For comparison, the result detected by the previous method [25] is shown in Fig. 10(b). In that method, a pattern matching technique is used for calculating the displacements of each partial lattice; the other experimental conditions are the same as those in Fig. 10(c). The detection error is about 0.44 pixels in terms of the root-mean-square deviation, i.e., more than twice the error of the current method. The use of lattice points estimated at sub-pixel accuracy and of the least-squares method as an averaging procedure provides a high-accuracy measurement.
In these experiments, the vibration exciter is assumed to produce only yaw/pitch rotations without any vertical/horizontal motions. However, since the rotation center does not necessarily coincide with the center of the camera lens, the vibration exciter involves a slight vertical/horizontal motion. This is why the circular motions in Figs. 10(b) and 10(c) are slightly distorted in the diagonal direction.
Note that the maximum allowable amplitude for vibration detection is limited to half the spacing between lattice points, under the condition that the positional relation of the lattice points remains constant even when subject to vibrations. This limitation was verified by examining the relationship between vibration amplitudes and detection success rates, as shown in Fig. 11(a). In this experiment, the rotation speed of the circular motion produced by the vibration exciter is 5 Hz, the distance between the camera and the LED display is 1.0 m, and the exposure time is 0.1 s. When the radius of the circular motion, i.e., the vibration amplitude, exceeds approximately 0.2°, the success rate drops rapidly; this is because 0.2° corresponds to half the spacing between lattice points. In the case of successful detection, the captured image is similar to that in Fig. 10(a) in that it contains the same LED-array sequence as the original, even if it is deformed. However, as shown in Fig. 11(b), detection failure disarranges the formation of the LED array. Since the largest camera shake caused by hand-held shooting is known to be about 0.2°, the present system sometimes fails to detect camera shake, especially during telephoto shooting. The allowable amplitude for vibration detection can be expanded by redesigning the LED display.
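The quoted 0.2° limit is consistent with the half-spacing criterion, as a quick check shows (using the 7 mm LED spacing of Section 3.1 and the 1.0 m shooting distance of this experiment):

```python
import math

# Maximum detectable amplitude = half the lattice-point spacing.
# At a shooting distance of 1.0 m, a camera rotation of 0.2 deg sweeps
# the line of sight across the chart by z * tan(0.2 deg):
spacing_mm = 7.0      # LED spacing from Section 3.1
z_mm = 1000.0         # shooting distance in this experiment
sweep_mm = z_mm * math.tan(math.radians(0.2))
assert abs(sweep_mm - spacing_mm / 2) < 0.1   # ~3.49 mm vs 3.5 mm
```

A rotation of 0.2° thus displaces the pattern by almost exactly half the 7 mm spacing, which is where nearest-site assignment begins to fail.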
Using the same vibration exciter as in Fig. 9(a), we also examine the detection capability of our approach by applying a sinusoidal motion with a fixed deflection angle along the roll axis at a frequency of 0.1 Hz. As shown in Fig. 10(d), the detected motion path nearly coincides with the applied one, with an error of approximately 0.01°. These results indicate that the proposed method allows detection of 3D vibration along the yaw, pitch, and roll axes.
Next, we investigate the minimum detectable sensitivity. The vibration exciter used in this experiment is shown in Fig. 9(b). This device can generate micro-vibrations along the vertical direction by means of a piezoelectric actuator. The test camera is subjected to sinusoidal vibration with a frequency of 5 Hz. The distance between the camera and the LED display is 1.0 m, and the exposure time is 1 s. Because of the small amplitude of the applied vibration, the pattern distortion is too minute to be observed. However, the vibrational waveforms can be detected from the captured image. When the driving amplitude is 0.68 pixels, as shown in Fig. 12(a), the detected vibration corresponds closely to the applied one, and the error is only about 0.1 pixels. Indeed, the spectrum distribution calculated by means of the fast Fourier transform (FFT) shows a salient peak at the corresponding frequency, as shown in Fig. 12(b). When the driving amplitude is considerably smaller, i.e., 0.14 pixels as shown in Fig. 12(c), the detected vibration is relatively distorted. However, from Fig. 12(d), we observe that the corresponding peak still emerges in the spectrum distribution. The minimum detectable sensitivity is therefore estimated to be approximately 0.1 pixels. Various other vibration amplitudes are tested as well. Figure 13 shows a comparison of the estimation error at various vibration amplitudes. While the estimation error of the previous method varies between 0.15 and 0.3 pixels, the current method maintains an error of about 0.1 pixels at all amplitudes.
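The FFT-based spectrum check can be sketched as follows (the sampling parameters are illustrative, not the paper's exact settings):

```python
import numpy as np

def dominant_frequency(samples, dt):
    """Frequency of the largest peak in the amplitude spectrum of a
    detected vibration waveform (mean removed to drop the DC bin)."""
    spec = np.abs(np.fft.rfft(samples - np.mean(samples)))
    freqs = np.fft.rfftfreq(len(samples), dt)
    return freqs[np.argmax(spec)]

# 5 Hz sinusoid sampled at the frame interval over a 1 s exposure
rng = np.random.default_rng(1)
dt = 1.0 / 64                  # 64 frames in a 1 s exposure (illustrative)
t = np.arange(64) * dt
wave = 0.14 * np.sin(2 * np.pi * 5.0 * t) + 0.02 * rng.standard_normal(64)
assert abs(dominant_frequency(wave, dt) - 5.0) < 0.5
```

Even when the time-domain waveform is visibly distorted by noise, the applied frequency still stands out in the spectrum, which mirrors the behavior in Figs. 12(c) and 12(d).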
The abovementioned results enable us to conclude that the proposed method can detect vibrations with sub-pixel accuracy over a wide range of motions, from large motions caused by camera shake to small vibrations caused by shutter shock.
For practical use, we also need to investigate the influence of background illumination on the vibration estimation error. A camera image is taken under background illumination of a certain brightness. The brightness ratio of the LED lights to the interstitial background, hereinafter called the “pattern contrast,” is expected to affect the estimation error. Various pattern contrasts are realized by varying the brightness of the LEDs, and the vibration is estimated at each pattern contrast. Figure 14 shows the relationship between the pattern contrast and the estimation error; the experimental conditions are the same as those for Fig. 12(a). When the pattern contrast is low, the estimation error increases rapidly or the detection fails. This is because the lit LEDs are hard to separate from the interstitial background and from unlit LEDs that appear slightly bright owing to reflections, as shown in Fig. 15(a). This situation occurs under bright ambient illumination or when dim LEDs are used. On the other hand, when the pattern contrast is relatively high, as shown in Fig. 15(b), the estimation error remains low, because the positions of the lit LEDs can be estimated easily and precisely. Therefore, we recommend capturing images with a high pattern contrast.
3.3 Measurement of camera shake and residual vibrations of image stabilizer
Our method can be practically applied to detect camera shakes caused by hand-held shooting. To demonstrate this capability, we use a commercial SLR camera equipped with an optical image stabilizer. The distance between the camera and the LED display is 2.5 m, the focal length of the lens is 135 mm, and the exposure time is 0.1 s.
Figure 16(a) shows an example of the detected camera shake with the image stabilizer turned off. The starting point of the trajectory is located at the origin. Figure 16(b) shows the trajectories obtained over 10 trials plotted together in the same graph. Since camera shake is a random process, the cumulative distribution is suitable for examining its average trend.
Residual vibrations with the image stabilizer turned on can also be detected. From Fig. 17(a), we observe that the detected trajectory is considerably smaller than that obtained for camera shake without the use of the image stabilizer. The cumulative distribution in Fig. 17(b) exhibits a drastic reduction in scale when compared with that shown in Fig. 16(b). This result indicates the effectiveness of the image stabilizer. A quantitative evaluation of the image stabilizer performance can also be obtained by estimating the extent of the distribution.
3.4 Measurement of mirror slaps and shutter shocks
Mirror slaps and shutter shocks are vibrations produced within the camera at the instant of the clicking of the shutter. Although these vibrations are small in terms of magnitude, they can be detected via our method. In the experiments described below, we set the distance between the camera and the LED display to 4.0 m and the exposure time to 0.5 s.
First, we detect mirror slap with a commercial SLR camera; as mentioned previously, mirror slap is the vibration caused by the flipping of the mirror upward and out of the light path immediately before the shutter opens. In the experiments, the camera is mounted on a tripod weighing approximately 3 kg and positioned at a height of 1.5 m. The focal length of the lens is 200 mm. Figure 18 shows an example of the detected mirror slaps. From a comparison of Figs. 18(a) and 18(c), we observe that the vertical component of the mirror slap is larger than the horizontal one; this is because the mirror flips up along the vertical direction. The vertical component exhibits a typical damped vibration, such as that observed in the simple harmonic motion of a spring, with a vibration frequency of about 20 Hz, as shown in Fig. 18(b). The horizontal vibration consists of two major frequency components of approximately 18 Hz and 25 Hz, as shown in Fig. 18(d), and is thus more complex than the vertical vibration.
Next, we used our method to detect shutter shock with a commercial mirrorless camera; as mentioned previously, shutter shock is the vibration generated by the movement of the shutter curtain that prevents light from reaching the image sensor. As the camera exposure occurs with the opening of the shutter curtain, a slight vibration is induced in the image sensor, and this vibration can be detected from the camera image itself via our method. In this experiment, the mirrorless camera is mounted on the same tripod as in the previous experiment. The focal length of the lens is 210 mm. Figure 19 shows an example of the detected shutter shocks. The vertical component is too small to be detected, but a weak vibration emerges in the horizontal direction. This result seems to be in conflict with the fact that the shutter curtain runs in the vertical direction. The vibration frequency is about 26 Hz, as shown in Fig. 19(d).
In the final phase of the study, we also applied our method to another mirrorless camera equipped with an electronic shutter; however, no vibrations could be detected. This was because the camera had no components that were subjected to mechanical vibration during its operation.
In conclusion, our method provides a highly sensitive approach for the detection of vibrations such as camera shake, mirror slap, and shutter shock. Our approach can significantly contribute to the development of cameras.
4. Discussion
4.1 Robustness against image distortion
Although image distortion is neglected in Section 2.1, it is actually inevitable. The current method is nevertheless expected to be robust against image distortion, because the distortion occurs in both the reference pattern and the pattern displaced by camera vibrations, and its effect is largely offset by the differencing operation on the left-hand side of Eq. (7). Figure 20 shows the verification of robustness against trapezoidal distortion. Here, the test chart is intentionally tilted by about 2° relative to the image sensor, and the estimation errors are calculated as in Fig. 13. Despite the camera image being subjected to trapezoidal distortion, the current method maintains almost the same estimation error as that shown in Fig. 13, which means that it is robust against image distortion. However, the estimation accuracy of the previous method deteriorates, as shown in Fig. 20, because it uses a computed pattern as the reference rather than a pattern captured from the camera image itself.
4.2 Separation between yaw/pitch rotations and horizontal/vertical translations
As shown in Eq. (1), the horizontal/vertical translations and the yaw/pitch rotations coexist as translational motions in the image. Since they produce the same effect in the image, it is not easy to separate them from each other without additional knowledge. The experiments in Section 3.2 are cases in which either the translations or the rotations exist individually, and therefore each could be estimated accurately. However, in practical cases such as those in Sections 3.3 and 3.4, both motions coexist simultaneously, and only the mixture motion can be estimated. Nevertheless, it is noteworthy that the camera vibration can be detected directly from the camera image itself, because the first priority in camera development is to investigate the extent to which vibrations affect the image.
There are several ways to separate each motion from the mixture motion. The simplest is to set the shooting distance to be sufficiently long or short, because the displacement caused by the yaw/pitch rotations is proportional to the shooting distance, whereas that caused by the translations is not. In another approach [27], the rotations were measured by a gyro sensor attached to the camera body, the mixture motion was detected through a prototype version of our method [24], and the translations were finally obtained by subtracting the effect of the rotations from the mixture motion. Moreover, the third term of Eq. (15) could also be used to separate the two, because this term includes only the effect of the rotations and causes the trapezoidal distortion. Therefore, if the degree of distortion can be accurately estimated from the image, the rotations can be estimated, and the translations can then be calculated from the mixture motion and the rotations.
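The gyro-based subtraction of [27] can be sketched with a minimal small-angle model in which the in-chart-plane displacement mixes a translation with a rotation contribution that scales with the shooting distance, d = t + L·θ. The model and all numbers here are illustrative assumptions, not values or equations from the paper.

```python
import numpy as np

# Hypothetical small-angle mixture model (not the paper's equations):
# in-chart-plane displacement d = t + L * theta, where the rotation's
# contribution scales with the shooting distance L but the translation's
# contribution does not.
L = 1.0                     # shooting distance [m] (assumed)
theta = np.deg2rad(0.02)    # yaw angle reported by a gyro sensor [rad]
d_mix = 4.5e-4              # mixture displacement measured from the image [m]

t = d_mix - L * theta       # translation recovered by subtraction
print(t)                    # ≈ 1.01e-4 m
```

The same subtraction applies per frame of the detected waveform, provided the gyro and image measurements are synchronized.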
4.3 Mechanical coupling of lens and image sensor
In the formulation of Section 2.1, the lens and the image sensor have implicitly been assumed to be rigidly combined. In the experiments of Section 3.2, the camera body fitted with the lens and image sensor was vibrated by the vibration exciters, and the detected vibrations were verified to match the given motions, which supports the validity of the rigidity assumption. In practice, however, the mechanical coupling might be somewhat loose; the shutter shocks mentioned in Section 3.4 primarily vibrate the image sensor and may hardly affect the lens. The non-rigid case can be expressed by extending Eq. (13) with additional matrices representing the yaw/pitch/roll rotations of the image sensor around the camera lens center, together with terms for the horizontal/vertical translations of the image sensor with respect to the camera body. Equation (1) is then rewritten accordingly: each of the rotations and translations is replaced by motions of both the lens and the image sensor.
Since only the combined motion can be detected, as described in Step (4) of Section 2.3, the individual motions of the lens and image sensor cannot be separated without additional knowledge. However, no conventional method has been able to detect the minute vibrations induced inside the camera, even as combined motions. If other sensors can also be used, e.g., if gyro sensors are attached to the lens body, it may become possible to separate the combined motion.
Note that the proposed system is not intended solely for vibration estimation of the camera body and lens. For residual vibrations of the image stabilizer, mirror slaps, and shutter shocks, what needs to be examined is their impact on the camera image: the image stabilizer ultimately reduces vibrations in the camera image rather than those of the camera body and lens, and mirror slaps and shutter shocks do not fully propagate to the camera body. Our method is the only way to detect such vibrations directly from the camera image itself.
4.4 Other possible expansions
Our proposed system needs special equipment for displaying a time-varying pattern. The custom-made LED display used in the experiments contains a large number of LEDs and supports a high-speed blinking rate, allowing the detection of camera vibrations with high temporal and spatial resolution. If such high performance is not required, a commonly used liquid crystal display (LCD) is a useful substitute for the LED display, as it can display various time-varying patterns; this idea was introduced in our previous study [24]. However, since the blinking speed of an LCD is limited to 60 Hz, camera vibrations cannot be measured at a high temporal resolution.
Although the LED display used in the experiments emits only white light, the use of LEDs of three or more colors, e.g., red, green, and blue, would be effective for improving the orthogonality between the partial lattices. Even when the camera vibration is so large that the partial lattices overlap and cannot be separated, a lattice pattern with multicolor LEDs might facilitate their separation.
The idea of the time-varying pattern can potentially be extended beyond the camera itself to other vibration measurements. For example, if a camera is firmly attached to a vibrating object such as the ground or a structure, the object's vibration can be detected from the camera image. As another example, if many discrete LEDs, instead of a customized LED display, are attached at various points on a target and photographed by a camera fixed on a static stage, the vibration of the target is recorded as the positions of the point light sources in the camera image. A procedure similar to that described in Section 2.3 would then reveal the details of the vibration.
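A minimal sketch of this discrete-LED variant, assuming each LED appears as a compact bright blob: threshold the frame, group bright pixels into connected components, and take intensity-weighted centroids, so that the centroid positions tracked across frames trace the target's vibration. The thresholding and grouping choices below are one simple option, not the procedure of Section 2.3.

```python
import numpy as np

def led_centroids(frame, threshold):
    """Intensity-weighted centroids of bright blobs (one per LED).

    Pixels above `threshold` are grouped into 4-connected components;
    each component's centroid is returned as (row, col)."""
    mask = frame > threshold
    labels = np.zeros(frame.shape, dtype=int)
    n = 0
    for seed in zip(*np.nonzero(mask)):
        if labels[seed]:
            continue
        n += 1
        stack = [seed]
        while stack:                      # flood-fill one component
            r, c = stack.pop()
            if (0 <= r < frame.shape[0] and 0 <= c < frame.shape[1]
                    and mask[r, c] and not labels[r, c]):
                labels[r, c] = n
                stack += [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    out = []
    for k in range(1, n + 1):
        rs, cs = np.nonzero(labels == k)
        w = frame[rs, cs]
        out.append((float(np.sum(rs * w) / w.sum()),
                    float(np.sum(cs * w) / w.sum())))
    return out

# Synthetic frame with two bright spots; tracking these centroids
# frame by frame would give the target's vibration trajectory.
frame = np.zeros((64, 64))
frame[10:13, 20:23] = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
frame[40:43, 50:53] = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
print(led_centroids(frame, 0.5))   # → [(11.0, 21.0), (41.0, 51.0)]
```

The intensity weighting gives sub-pixel centroid positions, which is what makes point-source tracking usable at the 0.1-pixel scale discussed in this study.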
5. Conclusion
In this paper, we proposed and demonstrated a method for measuring camera vibrations from the camera image itself. A unique feature of our method is the use of a time-varying pattern as the camera test chart instead of the conventional static test pattern; the time-varying pattern was realized with a specially designed LED display that we developed. We also presented the theoretical framework underlying the approach and the pattern analysis of the camera image for detecting camera vibrations. Verification experiments using vibration exciters demonstrated the efficacy of the proposed method, which exhibited sub-pixel detection accuracy, a sensitivity of 0.1 pixels, and robustness against image distortion. We also measured and compared camera shakes and the residual vibrations of an image stabilizer, thereby enabling the quantitative evaluation of image stabilizers. The proposed method further demonstrated that even small vibrations such as mirror slaps and shutter shocks can be detected.
Our system is limited to the estimation of XY and roll motions, where the XY motion is a mixture that includes both the horizontal/vertical translations and the yaw/pitch rotations of the lens and image sensor. In the current system, it is difficult to separate each component from this combined motion. Nevertheless, our system can aid in developing high-definition cameras, because camera vibrations, especially the residual vibrations of image stabilizers, mirror slaps, and shutter shocks, need to be detected directly from the camera image itself. Our method can thus be a unique and useful tool for evaluating various types of camera vibrations.
In our upcoming studies, we plan to establish a method for the quantitative evaluation of image stabilizers and a method of measuring camera shakes in video images as an extension of this study.
Appendix A: Geometric relationship between test chart and image sensor
In the simplest case, the planes of both the test chart and the image sensor are perpendicular to the optical axis, as shown in Fig. 2(a), i.e., the image sensor plane is parallel to the test chart plane. This results in a proportional relationship in which the image point is a scaled copy of the chart point.
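For the parallel-plane case, this proportional relationship is the familiar pinhole scaling. The following is a generic sketch in illustrative notation (the symbols b and L are assumptions, not the paper's), relating a chart point (x0, y0) at shooting distance L to its image at image distance b:

```latex
% Pinhole scaling between parallel chart and sensor planes
% (illustrative notation, not the paper's symbols; image inversion ignored):
(x, y) = s\,(x_0, y_0), \qquad s = \frac{b}{L}.
```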
The proportionality constant is the scaling factor of the projection, and the lens center is defined as the origin. Camera vibrations cause the camera body and lens to rotate and translate slightly with respect to the test chart, as shown in Fig. 2(b). In this case, the relationship between the test chart and the image sensor can be represented by Eq. (13), which combines rotation matrices for the yaw/pitch/roll rotations of the camera body around the camera lens center with the horizontal/vertical translations of the camera body with respect to the test chart. If the camera vibrations are assumed to be very small, Eq. (13) can be simplified to Eq. (14) by Taylor's first-order approximation. The scaling factor can be determined from the third line of Eq. (14), and the projected point can be computed by substituting this factor into the first and second lines. Applying Taylor's first-order approximation again, Eq. (14) leads to Eq. (15). Note that even if the order of the rotation and translation operators in Eq. (13) is changed, the same result holds, because the operation order affects only second- and higher-order terms.
Equation (15) can be interpreted as follows. The first term on the right-hand side indicates a rotational displacement in the x0-y0 plane. The second term indicates horizontal and vertical displacements consisting of two translational components: the translations themselves and the components derived from the yaw and pitch rotations. The third term indicates a trapezoidal distortion caused by the skewed relationship between the test chart and the image sensor. If the distance between the test chart and the camera lens is sufficiently long compared with the test chart size, the third term can be neglected. For simplicity, the discussion in this study is based on this assumption; Eq. (15) then reduces to Eq. (1).
Funding
The Japan Society for the Promotion of Science (JSPS) KAKENHI (25330188).
Acknowledgments
The authors acknowledge support from Tsubosaka Electric Co., Ltd. for their cooperation in formulating and implementing the experimental system used in this study.
References and links
1. Y. Yeh and H. Z. Cummins, “Localized fluid flow measurements with an He–Ne laser spectrometer,” Appl. Phys. Lett. 4(10), 176–178 (1964). [CrossRef]
2. R. L. Powell and K. A. Stetson, “Interferometric vibration analysis by wavefront reconstruction,” J. Opt. Soc. Am. 55(12), 1593–1598 (1965). [CrossRef]
3. S. Ueha, N. Magome, and J. Tsujiuchi, “Comments on speckle interferometry with regard to displacements of the camera during an exposure,” Opt. Commun. 27(3), 324–326 (1978). [CrossRef]
4. H. Kondo, H. Imazeki, and T. Tsuruta, “Measurement of image blur due to intrinsic vibration of the camera body by applying speckle photography,” J. Appl. Photogr. Eng. 9(5), 138–142 (1983).
5. K. Sato, S. Ishizuka, A. Nikami, and M. Sato, “Control techniques for optical image stabilizing system,” IEEE Trans. Consum. Electron. 39(3), 461–466 (1993). [CrossRef]
6. N. Joshi, S. B. Kang, C. L. Zitnick, and R. Szeliski, “Image deblurring using inertial measurement sensors,” ACM Trans. Graph. 29(4), 30 (2010).
7. E. Kee, S. Paris, S. Chen, and J. Wang, “Modeling and removing spatially-varying optical blur,” in Proceedings of IEEE Conference on Computational Photography (IEEE, 2011), pp. 1–8.
8. F. Xiao, A. Silverstein, and J. Farrell, “Camera-motion and effective spatial resolution,” in Proceedings of International Congress of Imaging Science (Society for Imaging Science and Technology, 2006), pp. 33–36.
9. H. Choi, J.-P. Kim, M.-G. Song, W.-C. Kim, N.-C. Park, Y.-P. Park, and K.-S. Park, “Effects of motion of an imaging system and optical image stabilizer on the modulation transfer function,” Opt. Express 16(25), 21132–21141 (2008). [CrossRef] [PubMed]
10. Digital Photography Review (DPReview), http://www.dpreview.com/.
11. S. C. Shin, W.-S. Ohm, S.-M. Kim, and S. Kang, “A method for evaluation of an optical image stabilizer in an image sensor module,” Int. J. Precis. Eng. Manuf. 12(2), 367–370 (2011). [CrossRef]
12. N. Aoki, H. Kusaka, and H. Otsuka, “Measurement and description method for image stabilization performance of digital cameras,” Proc. SPIE 8659, 86590O (2013). [CrossRef]
13. P. Xu, Q. Hao, C. Huang, and Y. Wang, “Degradation of modulation transfer function in push-broom camera caused by mechanical vibration,” Opt. Laser Technol. 35(7), 547–552 (2003). [CrossRef]
14. B. Golik and D. Wueller, “Measurement method for image stabilizing systems,” Proc. SPIE 6502, 65020O (2007). [CrossRef]
15. R. Safaee-Rad and M. Aleksic, “Handshake characterization and image stabilization for cell-phone cameras,” Proc. SPIE 7241, 72410V (2009). [CrossRef]
16. H. S. Choi, J. H. Cheung, S. H. Kim, and J. H. Ahn, “Structural dynamic displacement vision system using digital image processing,” NDT Int. 44(7), 597–608 (2011). [CrossRef]
17. J. J. Lee and M. Shinozuka, “A vision-based system for remote sensing of bridge displacement,” NDT Int. 39(5), 425–431 (2006). [CrossRef]
18. H.-S. Jeon, Y.-C. Choi, J.-H. Park, and J. W. Park, “Multi-point measurement of structural vibration using pattern recognition from camera image,” Nucl. Eng. Technol. 42(6), 704–711 (2010). [CrossRef]
19. A. M. Wahbeh, J. P. Caffrey, and S. F. Masri, “A vision-based approach for the direct measurement of displacements in vibrating systems,” Smart Mater. Struct. 12(5), 785–794 (2003). [CrossRef]
20. B. Ferrer, J. Espinosa, A. B. Roig, J. Perez, and D. Mas, “Vibration frequency measurement using a local multithreshold technique,” Opt. Express 21(22), 26198–26208 (2013). [CrossRef] [PubMed]
21. M. Ben-Ezra and S. K. Nayar, “Motion-based motion deblurring,” IEEE Trans. Pattern Anal. Mach. Intell. 26(6), 689–698 (2004). [CrossRef] [PubMed]
22. K. Yue, Z. Li, M. Zhang, and S. Chen, “Transient full-field vibration measurement using spectroscopical stereo photogrammetry,” Opt. Express 18(26), 26866–26871 (2010). [CrossRef] [PubMed]
23. S. Ri, S. Hayashi, S. Ogihara, and H. Tsuda, “Accurate full-field optical displacement measurement technique using a digital camera and repeated patterns,” Opt. Express 22(8), 9693–9706 (2014). [CrossRef] [PubMed]
24. K. Nishi and R. Ogino, “3D camera-shake measurement and analysis,” in Proceedings of IEEE Conference on Multimedia and Expo (IEEE, 2007), pp. 1271–1274.
25. K. Nishi and T. Onda, “Evaluation system for camera shake and image stabilizers,” in Proceedings of IEEE Conference on Multimedia and Expo (IEEE, 2010), pp. 926–931. [CrossRef]
26. R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision (Cambridge University Press, 2003).
27. K. Hayashi, M. Tanaka, H. Kusaka, and H. Hashi, “New approach on multi-axial analysis of camera shake,” in Proceedings of IEEE Conference on Consumer Electronics (IEEE, 2010), pp. 39–40. [CrossRef]