
Engineering pupil function for optical adversarial attacks


Abstract

Adversarial attacks inject imperceptible noise into images to degrade the performance of deep image classification models. However, most existing studies consider attacks in the digital (pixel) domain, i.e., on images that have already been acquired by an image sensor through sampling and quantization. This paper, for the first time, introduces a scheme for optical adversarial attack, which physically alters the light field information arriving at the image sensor so that the classification model yields misclassification. We modulate the phase of the light in the Fourier domain using a spatial light modulator placed in the photographic system. The operating parameters of the modulator for the adversarial attack are obtained by gradient-based optimization that maximizes the cross-entropy loss while minimizing image distortion. Experiments based on both simulation and a real optical system demonstrate the feasibility of the proposed optical attack. We show that our attack can conceal perturbations in the image more effectively than the existing pixel-domain attack. It is also verified that the proposed attack is completely different from common optical aberrations, such as spherical aberration, defocus, and astigmatism, in terms of both perturbation patterns and classification results.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Injecting small perturbations into input images, known as adversarial attacks, can significantly degrade the performance of deep image classifiers and thus raises security concerns for deep learning-based applications [1,2]. Since the publication by Szegedy et al. [3], numerous attack examples have been proposed; representative ones are surveyed in Ref. [4]. Most studies obtain adversarial examples in the digital domain by altering the pixel values of digital images. Attacks can also be performed by modifying the target object or by projecting patterns onto the scene. For instance, a few studies demonstrate that adversarial examples found in the digital domain remain effective when implemented in the physical domain, e.g., as printed objects [5,6]. Such applicability of adversarial examples to real objects raises more serious security concerns in practical applications (e.g., autonomous vehicles [7], person detectors [8]).

Orthogonal to these attempts, this paper introduces a novel approach for white-box, non-targeted adversarial attack by modifying the light field information inside the optical imaging system. The idea is to modulate the phase information of the light in the pupil plane of an optical imaging system. Spatially varying phase modulations are found by minimizing image distortion and maximizing cross-entropy, which are realized by a spatial light modulator (SLM) in the optical system. The change in the digital image obtained by the image sensor due to this phase modulation is hardly perceptible, but can significantly deteriorate the performance of the classifier.

The main contribution of our work can be summarized as follows. 1) We propose a scheme for optical adversarial attack that is implemented on the phase information inside the optical system, which deteriorates the performance of the deep models performing classification using the images acquired from the optical system. 2) We show the feasibility of our attack by conducting experiments on a simulated optical system for various images from the ImageNet dataset. It is shown that the attacked optical system produces output images that have similar quality as the unattacked outputs but fool the subsequent image classification models. 3) We conduct real experiments on an actual system implementing our attack to demonstrate the feasibility of the proposed idea in the real world. Our attack is also compared to common phase distortions such as spherical aberration, defocus, and astigmatism, which verifies the significant superiority of our method as an attack. 4) Most attack and defense algorithms have been demonstrated in the digital domain. Our work shifts this paradigm from the digital to the optical domain, introducing an optical architecture that can dynamically impose attack in the optical system.

2. Related work

2.1 Adversarial attack

Various adversarial attack methods against image classification models have been developed. Goodfellow et al. [1] proposed the fast gradient sign method (FGSM) that obtains a perturbation for a given image from the sign of the gradients of a target image classification model. Kurakin et al. [9] extended FGSM to find a more powerful perturbation, called I-FGSM. Carlini and Wagner [10] developed an attack method that minimizes the amount of deterioration and the distance of logits between the original predicted class label and the target label. Some recent approaches including trust-region-based adversarial attack [11] and query-efficient attacks [12,13] successfully decreased the computational cost.

While the aforementioned methods focus on injecting a perturbation into a given digital image that is directly input to a target image classification model, some researchers have also investigated adversarial examples that are applicable to physical objects. Kurakin et al. [5] demonstrated the feasibility of adversarial examples even when the attacked images are printed and captured again with a phone camera. Eykholt et al. [14] showed that physically tampering with real objects such as road signs can attack image classification models. Athalye et al. [6] further provided adversarial showcases with 3D-printed objects that make the classification model misclassify images taken from various viewpoints. Duan et al. [15] proposed a form of physical attack, the adversarial laser beam, which uses a laser beam illuminating the scene as the adversarial perturbation.

Previous research has focused on attacking images or objects themselves; to the best of our knowledge, no existing approach attacks the optical system that acquires images of real objects.

2.2 SLM-based optical systems

In Fourier optics, a lens is regarded as a Fourier transform engine. That is, for a given object in the front focal plane of the lens, its Fourier transform is obtained in the back focal plane. This plane is referred to as the Fourier plane, where one has access to the spatial frequency spectrum of the object. By placing an SLM in the Fourier plane, one can alter the phase delay of each spatial frequency component and thus modify the transfer function of an optical imaging system. Various applications of SLMs are reviewed in Ref. [16].

Learning-assisted SLM-based optical systems have recently been reported. In [17,18], convolutional neural networks (CNNs) were used to obtain optimal SLM patterns for optical aberration correction. In [19,20], SLMs corrected wavefront distortions caused by atmospheric turbulence, which were estimated by CNNs from the acquired images. When SLMs are used for computer-generated holography, CNNs can calculate SLM patterns that produce desired illumination patterns [21–23]. In optical encryption, a pair of an SLM and a CNN can be used for encryption and decryption, respectively [24,25]; similarly, they can be used for encoding and decoding, respectively, in optical communications [26].

A recent study [27] considered deep image classification for images captured by an optical system. This study proposed an image data augmentation strategy to improve classification accuracy, where two SLMs were used to generate several images with various geometric transformations. Kravets et al. [28] introduced a defense technique using an SLM to defend against adversarial attacks applied in the digital domain. Our work, in contrast, presents the first physics- and deep learning-integrated design and experimental implementation of an adversarial attack in an SLM-based vision system, which may pose security concerns for the aforementioned imaging applications. It should be noted that Zhou et al. [29,30] recently proposed learning-based attacks to demonstrate the vulnerability of computer-generated holography and diffractive encryption systems that employ SLMs.

3. Proposed system

Figure 1 illustrates a schematic of our optical imaging setup. A camera lens (CL) collects the object field and forms an image in the intermediate image plane. To gain direct access to the Fourier plane, we construct a 4-f system using two lenses to relay the information onto the image sensor plane. A phase-only SLM is placed in the Fourier plane. Since the SLM is polarization-dependent, a linear polarizer is placed before the SLM. The light phase-modulated by the SLM is then reflected by the beam splitter, Fourier-transformed by one of the relay lenses, and finally captured by an image sensor. More details on the system, including hardware specifications, can be found in Appendix A.


Fig. 1. Overview of the proposed optical adversarial attack system. A phase modulation module consisting of a polarizer (P), relay lens (RL), beam-splitter (BS), and SLM is implemented to the photography system. The unattacked image is obtained without phase modulation, while the attacked image is obtained by adversarial phase perturbation in the pupil plane. When the acquired images are classified by the deep model, the unattacked image is classified appropriately, but the attacked image is misclassified. CL: camera lens.


The obtained digital image is then input to a deep neural network that classifies the object in the image. We consider three widely known models, namely, ResNet50 [31], VGG16 [32], and MobileNetV3 [33], which are pre-trained on the ImageNet dataset [34].

Our optical adversarial attack aims to find an adversarial perturbation, realized as an SLM pattern, that leads the classifier to misclassify the resulting image while no significant visible differences appear between the unattacked and attacked images. Note that, in our adversarial attack scenario, it is assumed that the system user does not have access to the SLM pattern and can only see the captured image.

3.1 Imaging model of optical adversarial attack

Typical machine vision scenarios consider an incoherent imaging system in which the object is illuminated by incoherent light sources (e.g., sunlight, lamps, LEDs). Incoherent imaging systems are regarded as linear in intensity, meaning that the measured intensity ($I_{img}$) in the image plane is the convolution of the object intensity ($I_{obj}$) with the squared absolute value of the coherent point-spread function ($h$) [35]:

$$I_{img} = |h|^2 \otimes I_{obj}. \tag{1}$$

Taking the Fourier transform of Eq. (1), one obtains

$$\tilde{I}_{img} = \mathcal{F}\{|h|^2\} \cdot \tilde{I}_{obj}, \tag{2}$$
where $\mathcal{F}\{\cdot\}$ is the Fourier transform operator. Here, $\tilde{I}_{img}$ and $\tilde{I}_{obj}$ are the Fourier transforms of the intensity distributions in the image and object planes, respectively. $\mathcal{F}\{|h|^2\}$ is defined as the optical transfer function (OTF), which can be evaluated as $\mathrm{OTF} = H \star H$, where $H = \mathcal{F}\{h\}$ is called the pupil function and $\star$ denotes the two-dimensional cross-correlation operator.
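For illustration, the incoherent imaging model of Eqs. (1)–(2) can be simulated with a few lines of NumPy. This is a minimal sketch under the assumptions of a single monochromatic channel, a square grid, and a pupil $H$ and object $I_{obj}$ of the same size; the function name `incoherent_image` and the sampling conventions are ours, not part of the paper.

```python
import numpy as np

def incoherent_image(I_obj: np.ndarray, H: np.ndarray) -> np.ndarray:
    """Apply the incoherent imaging model: OTF = F{|h|^2} = H (cross-correlated with) H."""
    h = np.fft.ifft2(np.fft.ifftshift(H))      # coherent PSF from the (centered) pupil function
    otf = np.fft.fft2(np.abs(h) ** 2)          # OTF as the Fourier transform of |h|^2
    otf = otf / otf[0, 0]                      # normalize the DC gain to 1
    I_img = np.fft.ifft2(np.fft.fft2(I_obj) * otf)  # convolution via the Fourier domain, Eq. (2)
    return np.real(I_img)
```

The forward model is differentiable when written with an automatic-differentiation library, which is what the optimization in Section 3.2 relies on.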

By placing an SLM in the Fourier plane, one can modulate the phase of the pupil function ($H$) and thus modify the transfer function of the imaging system. Since a single SLM modulates the phase in the Fourier plane for all color channels, once the phase map and pupil are specified for one color, the corresponding information for the other channels must be evaluated by taking the wavelength-dependent pupil size and phase delay into account when obtaining the adversarial phase map. We assume that the SLM in the pupil plane is a non-dispersive medium, so that both the pupil size and the phase delay vary inversely with the wavelength. In this case, the pupil size and phase delay of the blue channel are computed first, and the information for the other channels is then approximated as:

$$H_i(\vec{u}) = \mathrm{circ}\!\left(\frac{NA}{\lambda_i}\right) \cdot \exp\!\left(i\,\phi_B(\vec{u})\,\frac{\lambda_B}{\lambda_i}\right), \tag{3}$$
where $\vec{u}$ denotes the spatial frequency coordinates, $\mathrm{circ}(\cdot)$ is the circle function that models a circular pupil, $NA$ is the numerical aperture of the imaging system, $\phi_B(\vec{u})$ is the phase map of the blue channel, and $i = R, G, B$. Note that $(\lambda_R, \lambda_G, \lambda_B) = (610\,\mathrm{nm}, 530\,\mathrm{nm}, 470\,\mathrm{nm})$ and $\phi_B(\vec{u}) = 0$ in the case of no phase modulation.
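The construction of the three channel pupils in Eq. (3) can be sketched as follows. Pupil radii are normalized so that the blue channel fills the grid; the grid size and the normalization are our assumptions for illustration, not specifications from the paper.

```python
import numpy as np

def rgb_pupils(phi_B: np.ndarray, n: int = 224) -> dict:
    """Return pupil functions H_i for i = R, G, B on an n x n grid (sketch of Eq. (3))."""
    lam = {"R": 610e-9, "G": 530e-9, "B": 470e-9}
    y, x = np.mgrid[-n // 2:n // 2, -n // 2:n // 2] / (n / 2)
    rho = np.sqrt(x**2 + y**2)                      # normalized pupil radius (blue pupil edge = 1)
    pupils = {}
    for c in lam:
        cutoff = lam["B"] / lam[c]                  # pupil radius scales as NA / lambda_i
        aperture = (rho <= cutoff).astype(float)    # circ(NA / lambda_i)
        phase = phi_B * (lam["B"] / lam[c])         # phase delay scales as lambda_B / lambda_i
        pupils[c] = aperture * np.exp(1j * phase)
    return pupils
```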

3.2 Finding adversarial perturbation

Our attack finds a non-targeted adversarial phase perturbation $\Phi(\vec{u})$ via gradient-based $l_2$-norm optimization that maximizes the classification loss while minimizing image distortion. Here, $\Phi = (\phi_R, \phi_G, \phi_B)$, where $\lambda_R \phi_R = \lambda_G \phi_G = \lambda_B \phi_B$. The optimization problem for finding the attacked version of $I_{img}$ under phase modulation $\Phi$ (denoted by $\hat{I}_\Phi$) is written as

$$\min_{\Phi} J(\Phi) = \min_{\Phi}\; \alpha \,\big\|\hat{I}_\Phi - I_{img}\big\|_2 - L\big(y, f(\hat{I}_\Phi)\big). \tag{4}$$

In this formulation, the objective function to be minimized, $J(\Phi)$, consists of two terms. The first term minimizes the distortion caused by the attack in the final image, expressed as the $l_2$-norm distance between the attacked image intensity ($\hat{I}_\Phi$) and the unattacked image intensity ($I_{img}$). The second term induces misclassification by maximizing the classification loss ($L$); here, we use the cross-entropy between the ground-truth class label $y$ and the output of the classification model for the attacked image, $f(\hat{I}_\Phi)$. These two terms have a trade-off relationship, and the constant $\alpha$ balances them.
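A hedged PyTorch sketch of the objective in Eq. (4) is given below. The helper `forward_imaging` stands for a differentiable implementation of the imaging model of Section 3.1 and is assumed rather than provided by the paper; `classifier` can be, e.g., a pretrained torchvision model.

```python
import torch
import torch.nn.functional as F

def attack_objective(phi, I_img, label, classifier, forward_imaging, alpha):
    """Evaluate J(phi) of Eq. (4): alpha * ||I_hat - I_img||_2 - cross_entropy(y, f(I_hat))."""
    I_attacked = forward_imaging(phi)                    # image produced with SLM phase phi (C x H x W)
    distortion = torch.norm(I_attacked - I_img, p=2)     # l2 distance to the unattacked image
    logits = classifier(I_attacked.unsqueeze(0))
    ce = F.cross_entropy(logits, label.view(1))          # classification loss to be maximized
    return alpha * distortion - ce, I_attacked
```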

The above optimization problem is solved by the popular gradient-descent approach as follows.

$$\Phi \leftarrow \Phi - \eta \frac{\partial J}{\partial \Phi}, \tag{5}$$
where $\eta$ denotes the learning rate. Using Eq. (5), the phase perturbation is iteratively updated. Note that the image is transformed to the frequency domain via the Fourier transform according to the imaging model explained in Section 3.1, where only the phase is altered as specified in Eq. (5) without changing the magnitude.

To find an appropriate value of $\alpha$, we adopt an iterative approach [10] that starts with a large value (ensuring a small amount of image distortion) and gradually decreases it until the classification result becomes incorrect.

4. Simulation experiments

Prior to applying our adversarial attack to a real optical system, we first conduct experiments in a simulation environment using the forward imaging model explained in the previous section. This allows us to examine the feasibility of our attack method on a relatively large number of images containing diverse objects.

4.1 Implementation details

We employ 1,000 test images of the NeurIPS 2017 Adversarial Attacks and Defences Competition dataset [36]. This dataset contains images associated with each of the 1,000 ImageNet classes; these images are not included in the training set of the original ImageNet dataset.

The classification accuracy is used as the primary evaluation metric. In addition, we employ the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) to measure the amount of deterioration in the attacked images compared to their corresponding unattacked images.
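For reference, both distortion metrics can be evaluated with the scikit-image implementations, as in the sketch below (function names follow recent scikit-image versions; images are assumed to be HxWx3 float arrays in [0, 1]).

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def image_quality(unattacked: np.ndarray, attacked: np.ndarray):
    """PSNR (dB) and SSIM between the unattacked and attacked images."""
    psnr = peak_signal_noise_ratio(unattacked, attacked, data_range=1.0)
    ssim = structural_similarity(unattacked, attacked, channel_axis=-1, data_range=1.0)
    return psnr, ssim
```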

To find an adversarial example ${\hat{I}_\mathrm{\Phi }}$ for a given image ${I_{img}}$ via Eq. (4), we employ the Adam optimizer [37] for the gradient descent algorithm [Eq. (5)] with ${\beta _1} = 0.9,\,\,{\beta _2} = 0.999$ and $\varepsilon = {10^{ - 8}}$, which is known to be effective in quickly finding adversarial examples [10]. We use a learning rate of $5 \times {10^{ - 3}}$ and a weight decay factor of $5 \times {10^{ - 6}}$. Note that using weight decay led to quicker convergence and prevented further image quality degradation. We initially set $\alpha$ to ${10^{ - 1}}$ and reduce it by a factor of 10 whenever a valid ${\hat{I}_\mathrm{\Phi }}$ is not found within the maximum number of iterations, which is set to 300. The optimization process stops once we obtain a valid ${\hat{I}_\mathrm{\Phi }}$ for which the model yields misclassification.
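The overall search procedure can be sketched as follows, reusing the hypothetical `attack_objective` helper from Section 3.2 and the hyperparameters stated above. Whether the phase map is reset when $\alpha$ is reduced is not specified in the paper, so this sketch simply keeps updating the same phase map.

```python
import torch

def find_perturbation(I_img, label, classifier, forward_imaging,
                      max_iters=300, lr=5e-3, weight_decay=5e-6):
    phi = torch.zeros(224, 224, requires_grad=True)        # blue-channel phase map, phi_B = 0 initially
    alpha = 1e-1
    while alpha >= 1e-6:                                    # minimum alpha, as in the text
        optimizer = torch.optim.Adam([phi], lr=lr, betas=(0.9, 0.999),
                                     eps=1e-8, weight_decay=weight_decay)
        for _ in range(max_iters):
            loss, I_attacked = attack_objective(phi, I_img, label, classifier,
                                                forward_imaging, alpha)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            pred = classifier(I_attacked.detach().unsqueeze(0)).argmax(dim=1)
            if pred.item() != label.item():                 # valid adversarial example found
                return phi.detach(), I_attacked.detach()
        alpha *= 0.1                                        # relax the distortion penalty and retry
    return None, None                                       # attack failed for this image
```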

We observe that both accuracy and PSNR tend to converge to certain values as $\alpha$ decreases, as shown in Fig. 2(a). When $\alpha$ reaches ${10^{ - 5}}$, the accuracy and PSNR are 0.126 and 35.39 dB, respectively, and neither changes significantly as $\alpha$ decreases further. Thus, we set the minimum value of $\alpha$ to ${10^{ - 6}}$. Figure 2(b) depicts a showcase of the obtained images for different values of $\alpha$. The three cases do not show significant perceptual differences, while the classification result becomes wrong when $\alpha = {10^{ - 3}}$ and a larger amount of phase perturbation is applied.


Fig. 2. (a) Classification accuracy of the unattacked and attacked images with respect to $\alpha $ for VGG16. PSNR between the unattacked and attacked images are also shown. (b) A showcase of the attacked images (top) and their phase perturbations (bottom) for different values of $\alpha $.


4.2 Results

Table 1 shows the performance comparison for the three classification models. For the unattacked images, all models achieve classification accuracy above 0.840. However, when our adversarial attack is employed, the accuracy is significantly reduced. This proves that the optical system for image classification is highly vulnerable to our proposed attack. In addition, the PSNR values of the images obtained from the attacked optical system are high (above 30 dB) and the SSIM values are close to 1, implying that differences between the unattacked and attacked images are hardly noticeable.


Table 1. Accuracy, PSNR, and SSIM for different image classification models. Standard deviations across images are also shown. Note that relatively large standard deviations of PSNR are due to the images with failed attack despite severe phase perturbations and the images with little changes despite the attack.

As a baseline, we test a so-called “random phase attack” by constructing a random phase pattern ${\mathrm{\Phi }_{random}}$. We obtain images with an average PSNR of 33.12 dB, which is slightly lower than that of our attack (Table 1). However, the accuracy barely drops when these images are input to the models: 0.894, 0.828, and 0.860 for ResNet50, VGG16, and MobileNetV3, respectively. This shows that the perturbations found by our attack are very different from random perturbations, and our method successfully deteriorates the classification performance while preserving the quality of the obtained images.

Figure 3 shows example images with and without the adversarial attack. Here, the confidence level (%) corresponds to the probability produced by the model’s final softmax layer for the predicted class, which indicates how confident the model is about its prediction. The absolute differences between the unattacked and attacked images in the pixel domain and the optimized phase modulation patterns ($\mathrm{\Phi }$) are also shown. It can be seen that differences between the unattacked and attacked images are not significant, which is also manifested by high PSNR and SSIM values. However, the classification models misclassify all the attacked images. The classified labels differ depending on the models. For instance, the starfish image is misclassified as honeycomb, flatworm, and mask by the three models, respectively. The pixel-domain changes also differ depending on the models. For example, the differences appear mostly in the red channel for ResNet50, but mostly in the green channel for MobileNetV3. The amount of distortion in terms of PSNR and SSIM is also model-dependent. For example, the PSNR values of the airliner image for ResNet50 and MobileNetV3 are 28.42 dB and 38.53 dB, respectively. These model-dependent characteristics of the perturbations are also reflected in the low transferability of the attacked images between models, as shown in Table 2.


Fig. 3. Visual showcases of the simulation experiments. Classified labels and their confidence levels (denoted in %) are also reported. The third column shows the absolute pixel value differences between the unattacked and attacked images, which are magnified 10 times for better visualization. The last column shows the modulated phase in the Fourier domain.



Table 2. Transferability of attacked images between models in terms of accuracy

However, we also observe some consistent characteristics of the phase modulation patterns for different images and classifiers. First, a wider range of phase modulations tends to yield a more distorted image having a lower PSNR value. For instance, for ResNet50, the phase patterns of both the starfish and airliner images contain larger values (appearing as more yellow colors) than that of the cabbage butterfly image, and thus the former yields lower PSNR and SSIM values than the latter. Second, the phase patterns of the same image appear similar to some extent across different models. For example, the phases of starfish exhibit wave-like patterns, while those of cabbage butterfly contain more grain-like textures.

The overall patterns of the pixel value changes are largely different from those obtained from many existing adversarial attacks in the pixel domain [1,9,10]. The former preserve the textures of the unattacked images, whereas the latter typically resemble random noise and largely distort the original textures. This is because our attack method manipulates the imaging system in the phase domain instead of the pixel domain.

To verify this, we compare our method with I-FGSM [9]. In I-FGSM, the maximum magnitude of perturbations is set to ${5 / {255}}$ for ResNet50 and VGG16, and ${4 / {255}}$ for MobileNetV3, so that the resulting PSNR values are as close as possible to those of our attack (Table 1). The number of iterations is set to 50. As a result, I-FGSM drops the accuracy to nearly zero. However, its perturbations are clearly perceptible, unlike those of our attack, even though the PSNR is similar. We conduct a subjective test on 20 randomly selected images that are misclassified by both our attack and I-FGSM. Following the paired comparison test methodology [38], 15 participants are asked to judge which image of each pair (I-FGSM vs. our attack) looks more degraded. They judge I-FGSM to degrade the image quality more for all 20 images and for every model (Fig. 4). This confirms that the perturbation produced by our attack can be concealed more effectively, which makes it a crucial threat as an attack.
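The I-FGSM baseline used in this comparison is a standard pixel-domain attack [9]; a minimal PyTorch sketch is given below. The per-step size is our choice for illustration and is not specified in the paper.

```python
import torch
import torch.nn.functional as F

def i_fgsm(image, label, classifier, epsilon=5 / 255, steps=50):
    """Iterative FGSM on a 1 x C x H x W image tensor in [0, 1], l_inf budget epsilon."""
    alpha_step = epsilon / steps                    # per-step pixel change (illustrative choice)
    adv = image.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(classifier(adv), label)
        grad, = torch.autograd.grad(loss, adv)
        adv = adv.detach() + alpha_step * grad.sign()
        adv = torch.min(torch.max(adv, image - epsilon), image + epsilon)  # project to l_inf ball
        adv = adv.clamp(0.0, 1.0)
    return adv
```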


Fig. 4. Subjective test results for 20 images, where 15 participants chose more degraded images between I-FGSM and our attack.


5. Real experiments

We physically implement our attack in an optical system in order to demonstrate the vulnerability of real optical systems to the proposed optical attack.

5.1 Implementation details

We build a vision system equipped with an SLM in the pupil plane and acquire images of “actual” objects. Considering practical constraints, we use ten real objects corresponding to ten ImageNet classes: bath towel, computer keyboard, lighter, paintbrush, ping-pong ball, plate rack, ruler, screwdriver, syringe, and toilet tissue. We place each object 100–120 cm away from the camera lens, which yields images with a field of view of about $250 \times 250$ mm. Phase modulation is performed with the SLM using a resolution of $224 \times 224$ pixels and a pixel size of 30 µm. Experimental details are provided in Appendix A.

We employ the pre-trained ResNet50 and VGG16 models. MobileNetV3 is excluded here due to its relatively poor performance on the actual objects. The same gradient descent-based optimization method used in the simulation is employed to compute perturbation patterns, which are then realized by the SLM.

In addition to our attack method, we also investigate the impact of other optical aberrations that are commonly found in real optical systems: spherical aberration, defocus, and astigmatism. The amounts of these distortions are determined so that the resulting images have SSIM values similar to those of the images perturbed by our attack.
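For illustration, these reference aberrations correspond to pupil-phase maps that can be written as low-order Zernike-like polynomials over the normalized pupil coordinate, as sketched below (the polynomial normalization and the amplitude parameter `amp` are our assumptions; in the experiment the amplitude is tuned to match the SSIM of the attacked images, as described above).

```python
import numpy as np

def aberration_phase(kind: str, amp: float, n: int = 224) -> np.ndarray:
    """Pupil phase (rad) for defocus, astigmatism, or spherical aberration on an n x n grid."""
    y, x = np.mgrid[-n // 2:n // 2, -n // 2:n // 2] / (n / 2)
    rho2 = x**2 + y**2
    if kind == "defocus":
        phase = 2 * rho2 - 1                  # Zernike Z(2, 0) up to normalization
    elif kind == "astigmatism":
        phase = x**2 - y**2                   # Zernike Z(2, 2) up to normalization
    elif kind == "spherical":
        phase = 6 * rho2**2 - 6 * rho2 + 1    # Zernike Z(4, 0) up to normalization
    else:
        raise ValueError(f"unknown aberration: {kind}")
    return amp * phase * (rho2 <= 1.0)        # confine the phase to the circular pupil
```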

5.2 Results

Table 3 shows the performance of our attack method and the three optical distortions. We also report the classification accuracy of the unattacked images and that of the attacked images obtained in the simulation environment. Both ResNet50 and VGG16 successfully classify the ten real objects when no distortion is involved. However, when our attack is applied, all objects are classified incorrectly by both models. Furthermore, unlike our attack, the optical aberrations hardly affect the classification performance: all ten objects are still classified correctly by ResNet50 and nine by VGG16. These results demonstrate that the real optical system is highly vulnerable to the proposed optical adversarial attack.


Table 3. Accuracy, PSNR, and SSIM for the unattacked images, attacked images in simulation, attacked images in the real experiment, and images with optical aberrations in the real experiment. Standard deviations across images are also shown.

Figure 5 shows three visual showcases of our attack and the optical distortions for ResNet50 (more results can be found in Appendix C). When the unattacked and attacked images are compared in the digital domain, there are no obvious visual differences, which is also reflected in the high PSNR values in Table 3. Nevertheless, our attack successfully fools the classification model. For example, the unattacked image #3 is correctly classified as paintbrush, whereas the attacked versions are misclassified as mortar in both the simulation and real environments. The optical aberrations hardly affect the classification performance because they are not specifically designed to change the classification results; they only reduce the confidence levels slightly (e.g., from 30.7% to 26.6% under spherical aberration) without causing misclassification. These results show that our attack is completely different from the optical aberrations from the viewpoint of fooling classification models.


Fig. 5. Visual showcases of the real experiments for ResNet50. Classified labels and their confidence levels (in %) are also reported.


The phase modulation patterns for our attack and those of the optical distortions also show significant differences. The range of the phase values for our attack is significantly smaller than that for the optical distortions: about [-0.5, 0.5] rad vs. [-3, 3] rad. In addition, the phase patterns of our attack are highly distinguishable across different objects, while those of the optical distortions are not.

6. Conclusion

We demonstrated the feasibility of attacking optical systems within the optical imaging pipeline instead of attacking images in the digital domain. For a given real object, our attack method finds a spatially varying phase modulation in the pupil plane that minimizes the amount of image distortion while significantly degrading the performance of an image classification model. The results of the simulation and real experiments showed that the specially designed pupil modulation can make the optical system and classification model misclassify the object.

Our work has a few important implications. First, to the best of our knowledge, our work is the first to demonstrate the possibility of implementing adversarial attacks by altering the light information inside the optical system. The work by Li et al. [39] proposes a physical attack by putting a sticker on the camera lens, which involves physical intervention (i.e., placing a sticker) outside the optical system; our attack, in contrast, takes place inside the image formation hardware. Second, we demonstrated the feasibility of an adversarial attack implemented in the frequency domain through phase perturbation. Some previous studies analyzed attacks in the frequency domain. For instance, Yin et al. [40] analyzed patterns of image-domain perturbations in the Fourier domain; Guo et al. [41] and Sharma et al. [42] investigated the effect of frequency-domain filtering of image-domain perturbations. However, there has been no attack method performed directly in the frequency domain. Third, we raise a new vulnerability issue of SLM-based imaging systems. In various state-of-the-art imaging systems, SLMs are employed to tailor the focusing behavior [43] and to enhance image contrast [44,45]. Furthermore, they serve as key elements for compressive sensing [46] and aberration correction [47]. Our work implies that, in such systems, malicious attempts can also be made by implementing optical attacks.

As with typical adversarial attack methods, the phase modulation for an adversarial attack is object-dependent. In our experiments, the SLM was employed to realize the scene-dependent adversarial phase perturbation. Recently, universal adversarial perturbations, i.e., single perturbations that cause most images to be misclassified, have been developed [48]. Similar studies could be performed with our method to obtain a universal adversarial phase perturbation, and the obtained pattern could then be implemented with an SLM or fabricated using conventional lithography techniques to attack the imaging system over a broad range of objects.

Appendix A. Detailed specification of optical system

Figure 6 shows the experimental setup for our optical adversarial attack. Two high-power lamps are utilized as the light sources to acquire high-SNR images. The optical system is built on an inclined optical breadboard to acquire perspective images. Table 4 details the components used in our optical system.


Fig. 6. Setup of our optical system for the real experiments.



Table 4. Specification of our optical system

Our optical adversarial attack system is built considering both the deep classification models and the phase modulation device, i.e., the SLM. Since typical classification models, including ResNet50 and VGG16, are trained on images with a resolution of $224 \times 224$ pixels, our experiments are conducted on image data with the same resolution. Most commercial image sensors have a large number of pixels (typically 4M pixels or more); in our case, an image sensor with $2,048 \times 2,048$ pixels is employed. Thus, images acquired in the real experiments are resized to $224 \times 224$ pixels.

Our perturbation pattern is designed to have a resolution of $224 \times 224$ pixels, identical to that of the acquired image. The SLM employed in our experiments, on the other hand, has a resolution of $512 \times 512$ pixels. To display the perturbation pattern on the SLM, it is resized to $448 \times 448$ pixels with nearest-neighbor interpolation, and $448 \times 448$ SLM pixels are used. Note that a scaling factor of 2 is chosen to reduce interpolation artifacts while maximizing the size of the Fourier plane. In other words, a $2 \times 2$ block of SLM pixels is treated as one super-pixel, so the displayed perturbation pattern effectively has a resolution of $224 \times 224$ pixels.
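The mapping from the $224 \times 224$ perturbation pattern to the SLM frame can be sketched as follows; centering the upscaled pattern and leaving the unused SLM pixels at zero phase are our assumptions for illustration.

```python
import numpy as np

def to_slm_pattern(phi: np.ndarray, slm_shape=(512, 512)) -> np.ndarray:
    """Map a 224 x 224 phase map (rad) onto the SLM using 2 x 2 super-pixels."""
    upscaled = np.kron(phi, np.ones((2, 2)))         # nearest-neighbor upscale to 448 x 448
    frame = np.zeros(slm_shape, dtype=phi.dtype)     # unused SLM pixels left at zero phase (assumption)
    r0 = (slm_shape[0] - upscaled.shape[0]) // 2
    c0 = (slm_shape[1] - upscaled.shape[1]) // 2
    frame[r0:r0 + upscaled.shape[0], c0:c0 + upscaled.shape[1]] = upscaled
    return frame
```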

Appendix B. More results of simulation experiments

Figure 2(a) shows the classification accuracy and PSNR of the attacked images with respect to the value of $\alpha$ only for VGG16. The results for ResNet50 and MobileNetV3 are shown here in Fig. 7; the trends are similar to those observed for VGG16.


Fig. 7. Classification accuracy of the unattacked and attacked images with respect to $\alpha $ for (a) ResNet50 and (b) MobileNetV3. PSNR between the unattacked and attacked images are also shown.


Appendix C. More results of real experiments

In Fig. 5, we presented the results of the real experiments for only three images with ResNet50. The remaining results are shown here. Figures 8 and 9 show the results for the other seven images with ResNet50, and Figs. 10–12 show the results for all ten images with VGG16. The observations made in Sec. 5.2 also hold for these results.


Fig. 8. Visual showcases of the real experiments for ResNet50. Classified labels and their confidence levels are also reported.



Fig. 9. Visual showcases of the real experiments for ResNet50. Classified labels and their confidence levels are also reported.



Fig. 10. Visual showcases of the real experiments for VGG16. Classified labels and their confidence levels are also reported.



Fig. 11. Visual showcases of the real experiments for VGG16. Classified labels and their confidence levels are also reported.



Fig. 12. Visual showcases of the real experiments for VGG16. Classified labels and their confidence levels are also reported.


Funding

National Research Foundation of Korea (2015R1A5A1037668, 2020R1A2C2012061); Yonsei University (2020-0-01361, Artificial Intelligence Graduate School Program).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples,” arXiv preprint arXiv:1412.6572 (2014).

2. D. Su, H. Zhang, H. Chen, J. Yi, P.-Y. Chen, and Y. Gao, “Is Robustness the Cost of Accuracy?–A Comprehensive Study on the Robustness of 18 Deep Image Classification Models,” in Proceedings of the European Conference on Computer Vision (ECCV), 631–648 (2018).

3. C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, “Intriguing properties of neural networks,” arXiv preprint arXiv:1312.6199 (2013).

4. N. Akhtar and A. Mian, “Threat of adversarial attacks on deep learning in computer vision: A survey,” IEEE Access 6, 14410–14430 (2018). [CrossRef]  

5. A. Kurakin, I. Goodfellow, and S. Bengio, “Adversarial examples in the physical world,” arXiv preprint arXiv:1607.02533 (2016).

6. A. Athalye, L. Engstrom, A. Ilyas, and K. Kwok, “Synthesizing robust adversarial examples,” in International Conference on Machine Learning, 284–293 (PMLR, 2018).

7. D. Nassi, R. Ben-Netanel, Y. Elovici, and B. Nassi, “MobilBye: attacking ADAS with camera spoofing,” arXiv preprint arXiv:1906.09765 (2019).

8. K. Xu, G. Zhang, S. Liu, Q. Fan, M. Sun, H. Chen, P.-Y. Chen, Y. Wang, and X. Lin, “Adversarial t-shirt! evading person detectors in a physical world,” in European Conference on Computer Vision, 665–681 (Springer, 2020).

9. A. Kurakin, I. Goodfellow, and S. Bengio, “Adversarial machine learning at scale,” arXiv preprint arXiv:1611.01236 (2016).

10. N. Carlini and D. Wagner, “Towards evaluating the robustness of neural networks,” in 2017 IEEE Symposium on Security and Privacy (SP), 39–57 (IEEE, 2017).

11. Z. Yao, A. Gholami, P. Xu, K. Keutzer, and M. W. Mahoney, “Trust region based adversarial attack on neural networks,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 11350–11359 (2019).

12. B. Ru, A. Cobb, A. Blaas, and Y. Gal, “Bayesopt adversarial attack,” in International Conference on Learning Representations (2019).

13. J. Du, H. Zhang, J. T. Zhou, Y. Yang, and J. Feng, “Query-efficient meta attack to deep neural networks,” arXiv preprint arXiv:1906.02398 (2019).

14. K. Eykholt, I. Evtimov, E. Fernandes, B. Li, A. Rahmati, C. Xiao, A. Prakash, T. Kohno, and D. Song, “Robust physical-world attacks on deep learning visual classification,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1625–1634 (2018).

15. R. Duan, X. Mao, A. K. Qin, Y. Chen, S. Ye, Y. He, and Y. Yang, “Adversarial Laser Beam: Effective Physical-World Attack to DNNs in a Blink,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 16062–16071 (2021).

16. C. Maurer, A. Jesacher, S. Bernet, and M. Ritsch-Marte, “What spatial light modulators can do for optical microscopy,” Laser Photonics Rev. 5(1), 81–101 (2011). [CrossRef]  

17. B. Zhang, J. Zhu, K. Si, and W. Gong, “Deep learning assisted zonal adaptive aberration correction,” Front. Phys. 8, 634 (2021). [CrossRef]  

18. Y. Jin, Y. Zhang, L. Hu, H. Huang, Q. Xu, X. Zhu, L. Huang, Y. Zheng, H.-L. Shen, and W. Gong, “Machine learning guided rapid focusing with sensor-less aberration corrections,” Opt. Express 26(23), 30162–30171 (2018). [CrossRef]  

19. J. Liu, P. Wang, X. Zhang, Y. He, X. Zhou, H. Ye, Y. Li, S. Xu, S. Chen, and D. Fan, “Deep learning based atmospheric turbulence compensation for orbital angular momentum beam distortion and communication,” Opt. Express 27(12), 16671–16688 (2019). [CrossRef]  

20. K. Wang, M. Zhang, J. Tang, L. Wang, L. Hu, X. Wu, W. Li, J. Di, G. Liu, and J. Zhao, “Deep learning wavefront sensing and aberration correction in atmospheric turbulence,” PhotoniX 2(1), 1–11 (2021). [CrossRef]  

21. M. H. Eybposh, N. W. Caira, M. Atisa, P. Chakravarthula, and N. C. Pégard, “DeepCGH: 3D computer-generated holography using deep learning,” Opt. Express 28(18), 26636–26650 (2020). [CrossRef]  

22. H. Goi, K. Komuro, and T. Nomura, “Deep-learning-based binary hologram,” Appl. Opt. 59(23), 7103–7108 (2020). [CrossRef]  

23. Y. Peng, S. Choi, N. Padmanaban, and G. Wetzstein, “Neural holography with camera-in-the-loop training,” ACM Trans. Graph. 39(6), 1–14 (2020). [CrossRef]  

24. H. Hai, S. Pan, M. Liao, D. Lu, W. He, and X. Peng, “Cryptanalysis of random-phase-encoding-based optical cryptosystem via deep learning,” Opt. Express 27(15), 21204–21213 (2019). [CrossRef]  

25. L. Zhou, Y. Xiao, and W. Chen, “Machine-learning attacks on interference-based optical encryption: experimental demonstration,” Opt. Express 27(18), 26143–26154 (2019). [CrossRef]  

26. Y. Na and D.-K. Ko, “Deep-learning-based high-resolution recognition of fractional-spatial-mode-encoded data for free-space optical communications,” Sci. Rep. 11(1), 1–11 (2021). [CrossRef]  

27. B. Li, O. K. Ersoy, C. Ma, Z. Pan, W. Wen, and Z. Song, “A 4F optical diffuser system with spatial light modulators for image data augmentation,” Opt. Commun. 488, 126859 (2021). [CrossRef]  

28. V. Kravets, B. Javidi, and A. Stern, “Compressive imaging for defending deep neural networks from adversarial attacks,” Opt. Lett. 46(8), 1951–1954 (2021). [CrossRef]  

29. L. Zhou, Y. Xiao, and W. Chen, “Learning-based attacks for detecting the vulnerability of computer-generated hologram based optical encryption,” Opt. Express 28(2), 2499–2510 (2020). [CrossRef]  

30. L. Zhou, Y. Xiao, and W. Chen, “Vulnerability to machine learning attacks of optical encryption based on diffractive imaging,” Optics and Lasers in Engineering 125, 105858 (2020). [CrossRef]  

31. K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770–778 (2016).

32. K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556 (2014).

33. A. Howard, M. Sandler, G. Chu, L.-C. Chen, B. Chen, M. Tan, W. Wang, Y. Zhu, R. Pang, and V. Vasudevan, “Searching for mobilenetv3,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 1314–1324 (2019).

34. O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, and M. Bernstein, “Imagenet large scale visual recognition challenge,” Int J Comput Vis 115(3), 211–252 (2015). [CrossRef]  

35. J. W. Goodman, Introduction to Fourier Optics, 4th ed. (W. H. Freeman & Company, 2017).

36. A. Kurakin, I. Goodfellow, S. Bengio, Y. Dong, F. Liao, M. Liang, T. Pang, J. Zhu, X. Hu, and C. Xie, “Adversarial attacks and defences competition,” in The NIPS'17 Competition: Building Intelligent Systems, 195–231 (Springer, 2018). [CrossRef]  

37. D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980 (2014).

38. B. Series, “Methodology for the subjective assessment of the quality of television pictures,” Recommendation ITU-R BT, 500–513 (2012).

39. J. Li, F. Schmidt, and Z. Kolter, “Adversarial camera stickers: A physical camera-based attack on deep learning systems,” in International Conference on Machine Learning, 3896–3904 (PMLR, 2019).

40. D. Yin, R. G. Lopes, J. Shlens, E. D. Cubuk, and J. Gilmer, “A fourier perspective on model robustness in computer vision,” arXiv preprint arXiv:1906.08988 (2019).

41. C. Guo, J. S. Frank, and K. Q. Weinberger, “Low Frequency Adversarial Perturbation,” arXiv preprint arXiv:1809.08758 (2018).

42. Y. Sharma, G. W. Ding, and M. Brubaker, “On the effectiveness of low frequency perturbations,” arXiv preprint arXiv:1903.00073 (2019).

43. C. Y. Lin, J. H. Chieh, and Y. Luo, “Multi-focal imaging system by using a programmable spatial light modulator,” in Biomedical Imaging and Sensing Conference,107111P (International Society for Optics and Photonics, 2018).

44. S. Fürhapter, A. Jesacher, S. Bernet, and M. Ritsch-Marte, “Spiral phase contrast imaging in microscopy,” Opt. Express 13(3), 689–694 (2005). [CrossRef]  

45. J.-Y. Zheng, R. M. Pasternack, and N. N. Boustany, “Optical scatter imaging with a digital micromirror device,” Opt. Express 17(22), 20401–20414 (2009). [CrossRef]  

46. J. P. Dumas, M. A. Lodhi, W. U. Bajwa, and M. C. Pierce, “From modeling to hardware: an experimental evaluation of image plane and Fourier plane coded compressive optical imaging,” Opt. Express 25(23), 29472–29491 (2017). [CrossRef]  

47. C. Lingel, T. Haist, and W. Osten, “Spatial-light-modulator-based adaptive optical system for the use of multiple phase retrieval methods,” Appl. Opt. 55(36), 10329–10334 (2016). [CrossRef]  

48. S.-M. Moosavi-Dezfooli, A. Fawzi, O. Fawzi, and P. Frossard, “Universal adversarial perturbations,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1765–1773 (2017).
