Speckle-learning-based object recognition through scattering media

Takamasa Ando; Ryoichi Horisaki; Jun Tanida

doi:10.1364/OE.23.033902

1. Introduction

Imaging through scattering media has recently generated intense interest in optics, with applications such as biomedicine and security [1]. Studies on this topic can be categorized roughly into three approaches. The first approach is based on removal or reduction of diffusion components by deblur processing [2–11]. This approach assumes weaker scattering than the other two. The second approach employs inversion of the transmission matrix of the scattering media [12, 13]. The third approach is based on image retrieval from the autocorrelation of speckle patterns by assuming that the speckle patterns are shift-invariant, a property called the memory effect [14, 15]. The last two descattering approaches are applicable to strong scattering but have several issues, including the limited observable object size and the complexity of the optical setup needed to introduce reference light or to exploit the memory effect. Also, the computational costs of image reconstruction in those two approaches are generally high.

Object recognition through scattering media, which is an important issue in various applications including biomedicine and security, has also been demonstrated, mainly based on the first approach [16, 17]. In these studies, weakly scattered images are deblurred or descattered and are then provided to a classifier, as shown in Fig. 1(a). Research related to this topic includes non-invasive analysis of fruit ripening and medical examination using speckle patterns [18–21]. In these studies, feature descriptors, such as autocorrelation and Fourier coefficients of spatial and/or temporal speckle patterns from an object, are used in a classifier, also as shown in Fig. 1(a). Selection of suitable feature descriptors is an important but difficult manual task with this approach.

Fig. 1 Schematic diagrams of (a) conventional and (b) proposed methods for object recognition through scattering media.

Here we show that machine learning of a number of speckle patterns is sufficient for binary classification through scattering media. Binary classification, which is the task of separating elements into two classes based on a rule, is simple but fundamental in object recognition [22]. As shown in Fig. 1(b), our approach directly learns speckles and does not require descattering or selection of feature descriptors. Therefore, it eliminates the limitations of the descattering algorithms, the complexity of the optical setup, and the heuristic selection of feature descriptors in the previous approaches. Furthermore, our approach is applicable to highly scattering media, like those assumed in [12–15].

2. Methods

In our experiments, the objects f_i ∈ ℂ^{N_x×N_y}, where N_x and N_y are the numbers of pixels along the x- and y-dimensions and i is the index of the object, are displayed on a spatial light modulator (SLM) placed between two scattering media and are illuminated by coherent laser light. The light propagation process from an input complex field u_in ∈ ℂ^{N_x×N_y} to an output complex field u_out ∈ ℂ^{N_x×N_y} is written with the Fresnel operator 𝒫(•) as

u_{out} = \mathcal{P}(u_{in}) = \frac{\exp(j k z)}{j \lambda z} \iint u_{in}(x_{in}, y_{in}) \exp\left\{ \frac{j k}{2 z} \left[ (x_{out} - x_{in})^{2} + (y_{out} - y_{in})^{2} \right] \right\} dx_{in} \, dy_{in}, \qquad (1)

where k is the wavenumber, λ is the wavelength, and z is the distance between the input and output planes [23]. Then the speckle imaging process in our experiments is described as

g_{i} = \left| \mathcal{P}\left( r_{2} \circ \mathcal{P}(r_{1} \circ f_{i}) \right) \right|^{2}, \qquad (2)

where g_i ∈ ℝ^{N_x×N_y} is the i-th speckle intensity image captured by a camera, r₁ ∈ ℂ^{N_x×N_y} is the random complex field propagated from the first (front) scattering medium, and r₂ ∈ ℂ^{N_x×N_y} is the random complex field generated by the second (rear) scattering medium. Here the operator ○ denotes the element-wise product. As shown in Eq. (2), no reference light is introduced and only the intensity of the field is recorded, and thus the imaging process is nonlinear with respect to the object f_i.
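To make Eqs. (1) and (2) concrete, the following is a minimal numerical sketch in Python with NumPy. It is not the authors' code. It evaluates the Fresnel convolution with FFTs through the Fresnel transfer function (so the field is treated as periodic), and it assumes that r₁ and r₂ can be modeled as unit-amplitude random phase masks. The grid size, the propagation distances `z1` and `z2`, and the use of the SLM pixel pitch as the simulation grid pitch are illustrative assumptions; only the 650 nm wavelength comes from the experimental description.

```python
import numpy as np

def fresnel_propagate(u_in, wavelength, z, pitch):
    """Apply the Fresnel operator of Eq. (1) to a sampled complex field.

    The convolution in Eq. (1) is evaluated with FFTs using the Fresnel
    transfer function, so the boundaries are treated as periodic.
    """
    n_y, n_x = u_in.shape
    k = 2.0 * np.pi / wavelength                  # wavenumber
    f_x = np.fft.fftfreq(n_x, d=pitch)            # spatial frequencies along x
    f_y = np.fft.fftfreq(n_y, d=pitch)            # spatial frequencies along y
    f_xx, f_yy = np.meshgrid(f_x, f_y)
    transfer = np.exp(1j * k * z) * np.exp(
        -1j * np.pi * wavelength * z * (f_xx**2 + f_yy**2))
    return np.fft.ifft2(np.fft.fft2(u_in) * transfer)

def speckle_image(f, r1, r2, wavelength, z1, z2, pitch):
    """Intensity model of Eq. (2): g = |P(r2 o P(r1 o f))|^2."""
    at_rear_plate = fresnel_propagate(r1 * f, wavelength, z1, pitch)
    at_camera = fresnel_propagate(r2 * at_rear_plate, wavelength, z2, pitch)
    return np.abs(at_camera) ** 2

rng = np.random.default_rng(0)
n = 128                                           # assumed grid size
# Wavelength from the experiment; distances and pitch are assumptions.
params = dict(wavelength=650e-9, z1=0.05, z2=0.05, pitch=36e-6)

# Assumed model of the scattering fields: unit-amplitude random phase masks.
r1 = np.exp(2j * np.pi * rng.random((n, n)))
r2 = np.exp(2j * np.pi * rng.random((n, n)))

# Two arbitrary amplitude objects (random textures, for illustration only).
f_a = rng.random((n, n))
f_b = rng.random((n, n))

g_a = speckle_image(f_a, r1, r2, **params)
g_b = speckle_image(f_b, r1, r2, **params)
g_sum = speckle_image(f_a + f_b, r1, r2, **params)

# For a linear system g_sum would equal g_a + g_b; the interference
# cross term of the intensity measurement makes them differ.
relative_deviation = np.linalg.norm(g_sum - (g_a + g_b)) / np.linalg.norm(g_sum)
print(f"relative deviation from superposition: {relative_deviation:.2f}")
```

The printed deviation is far from zero, which illustrates why superposition fails for the intensity measurement even though each propagation step is linear in the field.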

The i-th object f_i is assigned a binary label l_i ∈ {−1, 1}; for example, in our experiments the labels of the face and non-face classes are 1 and −1, respectively. The data set D, which is given to the classifier for object recognition in Fig. 1(b), is composed of N_g captured speckle images from the labeled objects and is represented as [24, 25]

D = \left\{ \left( \mathcal{S}(g_{i}), l_{i} \right) \mid \mathcal{S}(g_{i}) \in \mathbb{R}^{N_{s}}, \ l_{i} \in \{-1, 1\} \right\}_{i=1}^{N_{g}}, \qquad (3)

where 𝒮(•) denotes the sampling of N_s (≤ N_x × N_y) pixels from each captured image g_i of the interfering laser light through the scattering media. The locations of the sampled pixels are chosen randomly and are then fixed during the N_g measurements. The operator 𝒮 subsamples each captured image to reduce its data size for the learning process. Each sampled speckle vector 𝒮(g_i) is labeled according to the label of the i-th object f_i. In this case, linear classifiers are not applicable to recognition of the measurement data, even if the object data are linearly separable in object space, because of the nonlinear imaging process shown in Eq. (2).
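The sampling operator 𝒮 can be sketched as follows in Python with NumPy. This is an illustrative sketch rather than the authors' implementation: the helper name `make_sampler`, the seed, and the placeholder images are assumptions. The point it shows is that the pixel locations are drawn once and then reused for every captured image.

```python
import numpy as np

def make_sampler(image_shape, n_samples, seed=0):
    """Draw n_samples distinct pixel locations once and return a function
    that extracts those same pixels from any image of the given shape."""
    rng = np.random.default_rng(seed)
    n_pixels = image_shape[0] * image_shape[1]
    locations = rng.choice(n_pixels, size=n_samples, replace=False)

    def sample(image):
        # Flatten the image and pick the fixed locations.
        return np.asarray(image).reshape(-1)[locations]

    return sample, locations

# 768 x 768 is the captured image size reported in the experiments.
sample, locations = make_sampler((768, 768), n_samples=100)

# Placeholder images standing in for captured speckle images g_i.
rng = np.random.default_rng(1)
images = rng.random((3, 768, 768))

# One row per image: the sampled vector S(g_i) with N_s = 100 entries.
features = np.stack([sample(g) for g in images])
print(features.shape)
```

Fixing the locations matters because each entry of 𝒮(g_i) must refer to the same camera pixel across the training and test images.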

We used the support vector machine (SVM) [26–28] for recognition of amplitude/phase objects with speckles. The SVM is a supervised pattern recognition method: it is trained in advance using data with known labels and then classifies new data according to that training. The SVM separates the data into two classes by maximizing the width of the margin m between the separating hyperplane and the data. Maximizing the margin can improve the classification of test data that were not seen during training.

The hyperplane is calculated by solving the following optimization problem, which maximizes the margin m = 1/‖w‖₂ [26]:

\min_{w, b} \frac{1}{2} \| w \|_{2}^{2} \quad \text{subject to} \quad l_{i} \left( w \cdot \mathcal{S}(g_{i}) - b \right) \geq 1, \quad \forall i, \qquad (4)

where w ∈ ℝ^{N_s} is the normal vector to the hyperplane, and b is the intercept of the hyperplane. The SVM in Eq. (4) is linear and has been extended to a nonlinear classifier with the kernel trick [29, 30]. With the kernel trick, input data that are not linearly separable by a hyperplane in the original space can become linearly separable after being mapped into a feature space. This nonlinear classifier can be applied to our measurements. The input speckle data 𝒮(g_i) are mapped into a high-dimensional feature space with a nonlinear kernel function, such as a radial basis function, which is used in our experiments, a polynomial function, or a multilayer perceptron function [31].
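For reference, the kernelized classifier can be written in the standard textbook form below. This form is not spelled out in the text above; the dual coefficients α_i ≥ 0 and the kernel width parameter γ > 0 are symbols introduced here for illustration, and the sign convention for b follows Eq. (4).

```latex
\hat{l}(g) = \operatorname{sign}\left( \sum_{i=1}^{N_{g}} \alpha_{i} \, l_{i} \, K\left( \mathcal{S}(g_{i}), \mathcal{S}(g) \right) - b \right),
\qquad
K(x, x') = \exp\left( -\gamma \, \| x - x' \|_{2}^{2} \right)
```

Here K is the Gaussian radial basis kernel. Only the training samples with α_i > 0 (the support vectors) contribute to the sum, and the feature-space mapping never has to be computed explicitly because it enters only through K.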

3. Experiments

The experimental setup for object recognition through scattering plates is shown in Fig. 2. The target objects were face data and non-face data, which were displayed on a transmissive SLM (LC 2012 manufactured by Holoeye, pixel pitch: 36 μm, active area: 36.9×27.6 mm²). We verified that object recognition could be performed with both the amplitude and phase modes of the SLM. In the amplitude mode, two polarizers were used, one located on each side of the SLM, in the parallel Nicol arrangement. In the phase mode, the bare SLM was used. The target objects on the SLM were illuminated by diffused light generated through the front scattering plate shown in Fig. 2 using a diode laser (DM-6935HA manufactured by Neoark, wavelength: 650 nm, output power: 17 mW). The light that passed through the SLM was diffused again by the rear scattering plate, which had the same scattering properties as the front one. Finally the speckle intensity image of the scattered light was captured by a monochrome CCD (PL-B953 manufactured by PixeLink).

Fig. 2 Setup for the experiment.

The face and non-face data used in the experiments were selected from the CBCL Face Database provided by the Center for Biological and Computational Learning at MIT [32]. The face and non-face images originally had 19 × 19 gray-scale pixels, a pixel count defined as N_o (= 361), and they were resized into 640 × 640 pixel images for display on the SLM in the experiments. The face images were obtained from different persons of various races, ages, and genders, and with/without glasses and facial hair. The non-face images were different structured random textures. Figures 3(a)–3(j) and Figs. 3(k)–3(t) are ten examples of the face and non-face data, respectively. Individuals are recognizable from the images shown in those figures. In the experiments, the images and people in the test data were different from those in the training data. The training and test phases of the SVM were coded using the Statistics Toolbox in MATLAB. A Gaussian radial basis function was chosen as the kernel, and the predictor data were standardized before machine learning.
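The experiments used MATLAB; the following is an analogous sketch in Python with scikit-learn, not the authors' code. It standardizes the predictors and trains an SVM with a Gaussian radial basis kernel. The toy data (a disk surrounded by a ring) are an assumption that merely stands in for sampled speckle vectors that are not linearly separable, and the kernel width setting `gamma="scale"` is a scikit-learn default, not a value taken from the experiments.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def ring_data(n_per_class):
    """Toy two-class data: a disk (label 1) enclosed by a ring (label -1).

    No straight line separates the two classes, so a linear classifier fails.
    """
    radius = np.concatenate([rng.uniform(0.0, 1.0, n_per_class),
                             rng.uniform(2.0, 3.0, n_per_class)])
    angle = rng.uniform(0.0, 2.0 * np.pi, 2 * n_per_class)
    x = np.column_stack([radius * np.cos(angle), radius * np.sin(angle)])
    labels = np.concatenate([np.ones(n_per_class), -np.ones(n_per_class)])
    return x, labels

x_train, l_train = ring_data(200)
x_test, l_test = ring_data(100)      # drawn separately from the training data

# Standardize each predictor, then fit an SVM with a Gaussian RBF kernel.
classifier = make_pipeline(StandardScaler(), SVC(kernel="rbf", gamma="scale"))
classifier.fit(x_train, l_train)

predictions = classifier.predict(x_test)
accuracy = np.mean(predictions == l_test)   # fraction of correct test labels
print(f"test accuracy: {accuracy:.3f}")
```

Placing the scaler inside the pipeline means the standardization parameters are estimated from the training data only and are then applied unchanged to the test data.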

Fig. 3 Examples of (a)–(j) face and (k)–(t) non-face images.

Two kinds of scattering plates, scattering plates 1 and 2 (FV-104 and FV-107 acrylic resin plates manufactured by Shoei Chemical, respectively), shown in Fig. 4, were used in the experiments. The total light transmittance and the collimated light transmittance of scattering plate 1 were 51.2 % and 1.1 %, and those of scattering plate 2 were 30.1 % and 0.1 %, respectively. Two configurations were used in the experiments: one with scattering plate 1 and the other with scattering plate 2 as both the front and rear plates. The pixel count of the captured speckle images was 768 × 768 (= N_x × N_y). Examples of the captured speckle images of the face and non-face data in Figs. 3(a) and 3(k) are shown in Figs. 5(a)–5(d). Figures 5(a) and 5(b) are the speckles obtained with the amplitude mode of the SLM for the face and non-face data, respectively, and Figs. 5(c) and 5(d) are the speckles obtained with the phase mode for the face and non-face data, respectively. As shown in the figure, the textures of the face and non-face data disappeared due to the scattering plates.

Fig. 4 Scattering plates for the experiment. Two sets of each scattering plate were prepared as front and rear scattering plates, as shown in Fig. 2.

Fig. 5 Examples of experimentally captured speckle intensity images of the face and non-face images shown in Figs. 3(a) and 3(k). Speckles captured through scattering plate 2 for the amplitude targets: (a) the face image and (b) the non-face image. Speckles captured through scattering plate 2 for the phase targets: (c) the face image and (d) the non-face image. The subfigures at the right in (a)–(d) are close-ups of the central 50 × 50 pixels of each left figure.

Normalized correlations between the speckle intensity images from Figs. 3(a)–3(e), belonging to the face class, and Figs. 3(k)–3(o), belonging to the non-face class, are shown in Fig. 6. Here, the correlation 𝒞(v, w) between the v-th and w-th speckle intensity images g_v and g_w is calculated as 𝒞(v, w) = ‖(g_v/‖g_v‖_ℓ₂) ○ (g_w/‖g_w‖_ℓ₂)‖_ℓ₁, where ‖•‖_{ℓ_p} is the ℓ_p-norm. The correlations calculated with scattering plate 1 for the amplitude and phase targets are shown in Figs. 6(a) and 6(b), respectively, and those with scattering plate 2 are shown in Figs. 6(c) and 6(d), respectively. These correlations show no bias associated with the face and non-face classes in the speckle intensity images.
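The correlation 𝒞(v, w) defined above can be computed as in the following Python sketch with NumPy. The function name and the synthetic test images are assumptions made for illustration; the synthetic images use exponentially distributed intensities as a simple stand-in for speckle, not measured data.

```python
import numpy as np

def normalized_correlation(g_v, g_w):
    """C(v, w): l1-norm of the element-wise product of two images,
    each first normalized by its own l2-norm."""
    a = np.asarray(g_v, dtype=float).ravel()
    b = np.asarray(g_w, dtype=float).ravel()
    a = a / np.linalg.norm(a)        # l2 normalization
    b = b / np.linalg.norm(b)
    return np.abs(a * b).sum()       # l1-norm of the element-wise product

# Synthetic stand-ins for two independent speckle intensity images.
rng = np.random.default_rng(0)
g_1 = rng.exponential(size=(256, 256))
g_2 = rng.exponential(size=(256, 256))

c_self = normalized_correlation(g_1, g_1)
c_cross = normalized_correlation(g_1, g_2)
print(c_self, c_cross)
```

For non-negative intensity images this quantity equals the cosine similarity of the flattened images, so an image correlated with itself gives 1. Under the assumed exponential intensity statistics, two independent images give a value of about 0.5 rather than 0, because intensities are never negative.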

Fig. 6 Correlations between speckle intensity images from Figs. 3(a)–3(e) and Figs. 3(k)–3(o). Results with the scattering plate 1 from the (a) amplitude and (b) phase targets. Results with the scattering plate 2 from the (c) amplitude and (d) phase targets. The indices of the row and column in (a) show the subindices of Figs. 3(a)–3(e) and Figs. 3(k)–3(o).

In the first experiment, the effect of the pixel sampling 𝒮 in Eq. (3) was verified. The SVM classifier was trained using 2000 sets of captured speckle data (1000 face and 1000 non-face data sets) with known labels in each experiment, using scattering plate 1 or 2. Two hundred sets of captured speckle data (100 face and 100 non-face data sets, different from the training data sets) were used as the test data. For classification by the SVM in the training and test phases, 25, 50, 100, 200, 400, and 800 (= N_s) pixels were sampled randomly from the 589824 (= 768 × 768 = N_x × N_y) pixels in each of the captured speckle images. These sampling conditions include both undersampling (N_s < N_o = 19 × 19 = 361, which is the pixel count of the original face and non-face images) and oversampling (N_s > N_o). The locations of the sampling pixels were fixed for all speckle images.
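The sampling conditions above can be checked with a few lines of Python. This is only an arithmetic illustration using the numbers stated in the text; the variable names are chosen here for readability.

```python
# Numbers taken from the experimental description.
n_x, n_y = 768, 768                 # captured speckle image size
n_o = 19 * 19                       # pixel count of the original images
sampling_counts = [25, 50, 100, 200, 400, 800]

total_pixels = n_x * n_y

# Classify each N_s relative to N_o and report the fraction of camera pixels used.
regimes = {}
for n_s in sampling_counts:
    regime = "undersampling" if n_s < n_o else "oversampling"
    regimes[n_s] = regime
    print(f"N_s = {n_s:3d}: {regime}, {100.0 * n_s / total_pixels:.4f} % of the pixels")
```

Four of the six conditions (25 to 200 pixels) are undersampling and two (400 and 800 pixels) are oversampling, and even the largest case uses well under 1 % of the captured pixels.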

Figure 7 shows the accuracy rate, which is the ratio of the number of correctly classified test data sets to the total number of test data sets, for the face and non-face recognition of the test data with different numbers of sampling pixels. Each plotted value is the average over five different pixel samplings. The amplitude targets had high accuracy rates, at more than 85 %, even with a small number of sampling pixels, such as 25 pixels, and with scattering plate 2. The accuracy rates with 800 sampling pixels for the amplitude targets through scattering plates 1 and 2 were 97.6 % and 94.8 %, respectively. The phase targets were also recognized, with accuracy rates of 92.3 % and 77.0 % through scattering plates 1 and 2, respectively, with 800 pixels. The amplitude targets are considered to have resulted in higher accuracy rates than the phase targets because of their lower nonlinearity: the light intensity scattering process can be roughly approximated by a large blur, which is modeled linearly. The result shows that the SVM works not only with oversampling but also with undersampling. The tendency of the graph in Fig. 7 indicates that more sampling pixels might provide a higher accuracy rate.

Fig. 7 Accuracy rates with different numbers of sampling pixels.

Figure 7 also includes the accuracy rates for amplitude targets without the scattering plates. The amplitude targets without scattering plates were used as a substitute for the conventional scattered object recognition, which employs a descattering process, shown in Fig. 1(a). The captured images without the scattering plates are considered as results after perfect descattering.

The same SVM process as that used with the scattering plates was applied to the captured images. The accuracy rates of the substitute were almost equivalent to those of the amplitude targets with scattering plate 1, as shown in Fig. 7. Thus, the SVM alleviates the disturbance caused by the scattering media in object recognition. Note that phase targets without the scattering plates are not observable by the camera because they are transparent in this case.

In the second experiment, the effect of the number of training images was verified. Figure 8 plots the accuracy rates with 50, 100, 200, 500, 1000, and 2000 sets of captured training images from the face and non-face images. The number of sampling pixels was fixed at 100, which is undersampling (N_s < N_o). Two hundred sets of captured images, which were different from the ones used for training, were used as the test data. The same SVM process as in the first experiment was used here. The result in Fig. 8 indicates that a larger number of training images increases the accuracy rate. This effect is more significant for the phase targets than for the amplitude targets. The plots of the substitute case, which is amplitude targets without the scattering plates, show accuracy rates similar to those of the amplitude targets with scattering plate 2.

Fig. 8 Accuracy rates with different numbers of training images.

4. Conclusion

We demonstrated object recognition of amplitude and phase targets of face and non-face images through scattering media using speckle patterns with the SVM classifier. The targets displayed on the SLM were illuminated with laser light via the front scattering plate. The light passing through the SLM was diffused again by the rear scattering plate. Then the speckle intensity images, in which the targets were not recognizable by human perception, were captured by the camera. A number of the captured speckle intensity images were used for machine learning. Object recognition was realized with high accuracy rates for both the amplitude and phase targets, showing the effectiveness of our proposed method. This demonstration showed that the SVM can be applied to object recognition through scattering media, where the speckle intensity measurements are not linearly related to the object. In the experiments, larger numbers of sampling pixels and training images resulted in higher accuracy rates for object recognition. However, larger numbers also increase the computational cost, especially in the training process. This issue should be analyzed further, and it may be solvable by using another machine learning algorithm and/or dimensionality reduction.

This work shows the possibility of bypassing the descattering and feature description in conventional object recognition through scattering media. Our approach alleviates limitations, such as the object size, of the descattering algorithms and simplifies the optical setup and the feature descriptions used in the conventional approaches. It can be applied to various scattering processes, such as those in more highly scattering media compared with the media used here and back/multiple scattering configurations. Future issues to be addressed include object recognition with temporal/spectral/polarimetric speckles, decryption of optically encrypted data, multidimensional object recognition, multiclass classification, and unsupervised machine learning. Our approach will be practically useful in the fields of biomedicine and security because scattering phenomena often degrade the performance of optical systems in those fields.

References and links

1. A. P. Mosk, A. Lagendijk, G. Lerosey, and M. Fink, “Controlling waves in space and time for imaging and focusing in complex media,” Nat. Photonics 6, 283–292 (2012).

2. G. D. Gilbert and J. C. Pernicka, “Improvement of underwater visibility by reduction of backscatter with a circular polarization technique,” Appl. Opt. 6, 741–746 (1967).

3. Y. Y. Schechner, S. G. Narasimhan, and S. K. Nayar, “Polarization-based vision through haze,” Appl. Opt. 42, 511–525 (2003).

4. D. J. Cuccia, F. Bevilacqua, A. J. Durkin, and B. J. Tromberg, “Modulated imaging: quantitative analysis and tomography of turbid media in the spatial-frequency domain,” Opt. Lett. 30, 1354–1356 (2005).

5. S. K. Nayar, G. Krishnan, M. D. Grossberg, and R. Raskar, “Fast separation of direct and global components of a scene using high frequency illumination,” ACM Trans. Graph. 25, 935–944 (2006).

6. T. Treibitz and Y. Y. Schechner, “Active polarization descattering,” IEEE Trans. Pattern Anal. Mach. Intell. 31, 385–399 (2009).

7. I. Moon and B. Javidi, “Three-dimensional visualization of objects in scattering medium by use of computational integral imaging,” Opt. Express 16, 13080–13089 (2008).

8. D. Shin and B. Javidi, “Visualization of 3D objects in scattering medium using axially distributed sensing,” J. Disp. Technol. 8, 317–320 (2012).

9. T. Ando, R. Horisaki, T. Nakamura, and J. Tanida, “Single-shot acquisition of optical direct and global components using single coded pattern projection,” Jpn. J. Appl. Phys. 54, 042501 (2015).

10. T. Ando, R. Horisaki, and J. Tanida, “Three-dimensional imaging through scattering media using three-dimensionally coded pattern projection,” Appl. Opt. 54, 7316–7322 (2015).

11. V. Durán, F. Soldevila, E. Irles, P. Clemente, E. Tajahuerce, P. Andrés, and J. Lancis, “Compressive imaging in scattering media,” Opt. Express 23, 14424–14433 (2015).

12. A. Liutkus, D. Martina, S. Popoff, G. Chardon, O. Katz, G. Lerosey, S. Gigan, L. Daudet, and I. Carron, “Imaging with nature: compressive imaging using a multiply scattering medium,” Sci. Rep. 4, 5552 (2014).

13. M. Kim, W. Choi, Y. Choi, C. Yoon, and W. Choi, “Transmission matrix of a scattering medium and its applications in biophotonics,” Opt. Express 23, 12648–12668 (2015).

14. J. Bertolotti, E. G. van Putten, C. Blum, A. Lagendijk, W. L. Vos, and A. P. Mosk, “Non-invasive imaging through opaque scattering layers,” Nature 491, 232–234 (2012).

15. O. Katz, P. Heidmann, M. Fink, and S. Gigan, “Non-invasive single-shot imaging through scattering layers and around corners via speckle correlations,” Nat. Photonics 8, 784–790 (2014).

16. J. S. Tyo, M. P. Rowe, E. N. Pugh, and N. Engheta, “Target detection in optically scattering media by polarization-difference imaging,” Appl. Opt. 35, 1855–1870 (1996).

17. D. Shin, J.-J. Lee, and B.-G. Lee, “Recognition of a scattering 3D object using axially distributed image sensing technique,” ARPN J. Eng. Appl. Sci. 9, 2085–2088 (2014).

18. A. Zdunek, A. Adamiak, P. M. Pieczywek, and A. Kurenda, “The biospeckle method for the investigation of agricultural crops: a review,” Opt. Lasers Eng. 52, 276–285 (2014).

19. R. Nassif, C. A. Nader, C. Afif, F. Pellen, G. Le Brun, B. Le Jeune, and M. Abboud, “Detection of golden apples’ climacteric peak by laser biospeckle measurements,” Appl. Opt. 53, 8276–8282 (2014).

20. Z. Zalevsky, Y. Beiderman, I. Margalit, S. Gingold, M. Teicher, V. Mico, and J. Garcia, “Simultaneous remote extraction of multiple speech sources and heart beats from secondary speckles pattern,” Opt. Express 17, 21566–21580 (2009).

21. Y. Bishitz, N. Ozana, Y. Beiderman, F. Tenner, M. Schmidt, V. Mico, J. Garcia, and Z. Zalevsky, “Noncontact optical sensor for bone fracture diagnostics,” Biomed. Opt. Express 6, 651–657 (2015).

22. C. M. Bishop, Pattern Recognition and Machine Learning (Springer-Verlag New York, Inc., 2006).

23. J. W. Goodman, Introduction to Fourier Optics (Roberts & Co, 2004).

24. T. G. Dietterich, “Machine-learning research,” AI Mag. 18, 97–136 (1997).

25. J. A. K. Suykens, “Nonlinear modelling and support vector machines,” in Proceedings of Instrumentation and Measurement Technology (IEEE, 2001), pp. 287–294.

26. C. J. C. Burges, “A tutorial on support vector machines for pattern recognition,” Data Min. Knowl. Discov. 2, 121–167 (1998).

27. V. Blanz, B. Schölkopf, H. Bülthoff, C. Burges, V. Vapnik, and T. Vetter, “Comparison of view-based object recognition algorithms using realistic 3D models,” Icann 1112, 251–256 (1996).

28. E. Osuna, R. Freund, and F. Girosi, “Training support vector machines: an application to face detection,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 1997), pp. 130–136.

29. O. Chapelle, V. Vapnik, O. Bousquet, and S. Mukherjee, “Choosing multiple parameters for support vector machines,” Mach. Learn. 46, 131–159 (2002).

30. D. Fradkin and I. Muchnik, “Support vector machines for classification,” Discret. Methods Epidemiol. 70, 13–20 (2006).

31. M. G. Genton, “Classes of kernels for machine learning: a statistics perspective,” J. Mach. Learn. Res. 2, 299–312 (2001).

32. “MIT CBCL Face Database,” http://cbcl.mit.edu/software-datasets/FaceData2.html.

Optics Express