
Passive imaging at 250 GHz for detection of face presentation attacks


Abstract

Face presentation attacks are becoming more effective as new 3D facial masks come into use. Passive terahertz imaging offers specific physical properties that may improve presentation attack detection capabilities. The non-zero transmission through a variety of non-metallic materials may provide the information necessary for presentation attack detection. The aim of this paper is to present the outcomes of a study on face presentation attack detection using passive imaging at 250 GHz. An analysis of presentation attacks on facial recognition systems using custom displayed and printed photographs, 3D-printed masks and full-face flexible latex masks is provided, together with a spectral characterization of the various presentation attack instruments. A set of experiments with various instruments and various sets of clothing is described and discussed. Finally, two presentation attack detection methods are proposed. The first method is based on a threshold corresponding to the difference between mean intensities of selected regions of interest, while the second method uses eight different deep learning classifiers to detect presentation attacks. Results of two validation schemes are presented.

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Presentation attack detection (PAD) remains a serious challenge for biometric recognition systems. The variety of attacks on different biometric modalities requires new detection methods to be developed. Presentation attack detection systems are expected to determine whether a subject being screened is authentic. Presentation attacks (PA) on face recognition systems are among the most popular and presumably the easiest to perform, using a screen replay, a printed photo, or a 3D model.

For the last decade, terahertz (THz) radiation within the range of 0.1–3 THz has seen a growing number of applications, including explosive detection [1], concealed object detection [2,3], non-destructive evaluation [4], telecommunication [5], and biomedical investigations [6–13]. Terahertz radiation is used in several different ways, including passive, active and time-domain imaging. Security applications of terahertz imaging include the detection of dangerous materials and the stand-off detection of concealed objects. Terahertz radiation penetrates a variety of non-polar, non-conducting materials and is inherently safe for living tissues and DNA, because its low photon energy makes it non-ionizing. Due to the high attenuation of water, penetration of living tissue is limited to 0.5 mm at the lower frequencies of the THz band [14,15].

The ability of THz radiation to penetrate a variety of non-metallic materials may open new possibilities for detecting facial presentation attacks. The aim of this paper is to present the outcomes of an investigation on presentation attack detection in the passive terahertz imaging domain at 250 GHz. The investigation evaluated the capability of passive terahertz imaging to detect a variety of presentation attacks, including printed-photo attacks, displayed-image attacks, and customized 3D masks of two types.

The paper presents the methodology and results of an analysis of images of subjects collected with a passive imager operating at 250 GHz (1.2 mm). Energy emissions are analyzed with regard to different body parts. A variety of experiments has been performed showing different aspects of detecting presentation attacks, including attacks with different presentation attack instruments (PAIs) and different clothing. The spectral properties of the PAIs have been characterized over a broad spectrum of terahertz waves. Based on the collected database, a numerical analysis has been conducted and a set of deep learning methods for presentation attack detection based on the distribution of energy emissions has been evaluated.

The contributions are summarized as follows:

  • First reported investigations on presentation attack detection in passive terahertz imaging;
  • Theoretical considerations on the possibility of face PAD in the passive terahertz/millimeter-wave imaging domain;
  • Spectral characterization of PAIs, including two types of novel 3D facial masks;
  • Spectral characterization of the clothing used during the experiment sessions;
  • Study on the influence of different clothing on PAD performance;
  • Long-lasting experiments with PAIs to investigate the impact of heat transfer on PAD;
  • Evaluation of existing deep-learning classifiers for PAD;
  • Passive terahertz face spoofing dataset (PTFSD) presenting attacks performed using 2D paper-printed photographs, 2D photographs displayed using a tablet, 3D full-face flexible masks and 3D-printed facial masks.

The manuscript is organized as follows. Section 2 provides a brief review of related works. The theoretical analysis and the measurement protocol are described in Section 3 and Section 4, respectively. The image analysis is presented in Section 5. Finally, an automatic algorithm is presented and validated in Section 6, with a summary presented in Section 7.

2. Related Works

2.1 Types of PAD methods

PA detection has gained a lot of interest during the last decade and, as a result, a variety of approaches have been proposed. PAD methods can be classified as intrusive or non-intrusive, depending on whether additional actions have to be performed by the subject under screening. While intrusive presentation attack detection methods require a subject to cooperatively complete specific live actions, such as blinking, head turning, or mouth opening, non-intrusive PA detection methods require no action from the subject. Non-intrusive PAD technologies rely on the sensor providing sufficient information about a subject, with the decision based on information extracted from the images.

2.2 Face presentation attack datasets

A variety of works have been proposed, taking advantage of publicly available datasets of presentation attacks, including NUAA [16], ZJU Eyeblink [17], Idiap Print-attack [18], Idiap Replay-attack [19], CASIA FASD [20], MSU-MFSD [21], MSU RAFS [22], UVAD [24], 3DMA [23], MLFP [25], CSMAD [26], and 3DMAD [31]. No dataset of THz images with presentation attacks has been introduced.

2.3 Algorithms

Several algorithmic approaches have been proposed for the detection of presentation attacks. Many algorithms use dedicated features focused on detecting a specific attack, based on texture analysis [27], motion analysis [28], life sign indicators [29], or 3D properties [30]. Erdogmus et al. [31] released the first 3D mask attack dataset (3DMAD) and proposed a detection method based on an LDA classification of block-based uniform LBP features. Liu et al. [32] presented a remote photoplethysmography (rPPG) based method to detect 3D masks. In another work, Li et al. [33] proposed a 3D mask detection method based on pulse detection. The methods presented in [32] and [33] rely on the difference in heartbeat signals between real faces and 3D masks and are sensitive to camera settings and variable light conditions.

A variety of machine learning and deep learning methods have also been proposed. Bhattacharjee et al. [26] explored the vulnerability of convolutional neural network (CNN) based face-recognition systems to presentation attacks performed using custom-made silicone masks. George et al. [46] presented a PA detection algorithm based on a multi-channel convolutional neural network (MC-CNN). Shao et al. [34] presented a 3D mask detection method based on motion optical flow features extracted from a pre-trained VGG-net. Manjani et al. [36] presented a silicone mask detection method based on deep dictionary learning with a greedy learning algorithm for the optimization problem. Ramachandra [37] evaluated the effectiveness of five state-of-the-art presentation attack detection (PAD) techniques for detecting 3D masks.

PAD methods based on deep learning share a common disadvantage: they provide no easily interpretable indicator, since the decision mechanism of deep learning models is not straightforward. This study aims to propose a set of interpretable indicators for presentation attack detection.

2.4 Physical features

Detection of presentation attacks often relies on specific spectral properties. A variety of methods based on visible, near infrared and thermal infrared imaging have been proposed. Li et al. [35] proposed a new method to discriminate the reflectance difference between real faces and 3D face masks. Sun et al. [38] proposed a liveness detection approach based on the thermal infrared and visible spectra; the detection method uses a canonical correlation analysis between a visible and a thermal face. Kant and Sharma [39] proposed a PA detection method based on the fusion of thermal imaging and skin elasticity. The technique analyzes a sequence of visible range and thermal infrared images during cooperative actions performed by a subject. Seo proposed a thermal face convolutional neural network (Thermal Face-CNN) that incorporates external knowledge about human face temperature [45]. George et al. [46] proposed the multi-channel convolutional neural network (MC-CNN) tested with grayscale, depth, infrared, and thermal infrared images. Kowalski [47] investigated thermal infrared imaging for the detection of various presentation attacks, taking into account the impact of a variable subject state and of the heat transfer between the subject’s face and the mask on the PAD performance. He proposed a method composed of a head detection algorithm and a deep neural network classifier.

There is a limited number of works on spoofing detection in the terahertz domain, including a study on detecting fingerprint spoofing using time-domain spectroscopy [40]. The authors of this work used an active THz time-domain spectroscopy (TDS) setup, operating in a contact mode, to acquire signals reflected from pads attached to the finger. They presented two algorithms for detecting fingerprint spoofing attacks based on the analysis of time- and frequency-domain signals; both algorithms achieve impressive results.

The terahertz domain can be explored for presentation attack detection in different ways, starting from active and passive intensity imaging, through time-domain imaging and, finally, terahertz tomography. Active intensity imaging may be used to increase spatial resolution [41]. THz TDS could potentially provide interesting results for detecting face presentation attacks, as it may reveal the presence of an additional layer on the subject’s face. This method, however, seems very difficult to apply at stand-off distances of several meters. Nevertheless, previous works [42–44] show significant progress in the field of free-space TDS.

Passive terahertz imaging has not been yet reported for a face presentation attack detection. However, since terahertz imagers are commonly used at airports and other security checkpoints for concealed objects detection, it is justified to investigate their possible second role as presentation attack detectors.

This work concerns PAD research using passive terahertz imaging at 250 GHz. This type of sensor has not previously been reported for PAD. Various PAIs were investigated, including printed or displayed photos, 3D-printed masks, facial silicone masks and full-face latex masks.

3. Theoretical considerations

Terahertz radiation has the unique property of penetrating non-metallic materials. Passive terahertz imaging shares some properties with thermal infrared imaging. Imagers operating in the passive terahertz and thermal infrared ranges acquire radiation emitted by objects and do not require additional illuminators. Thermal infrared and passive THz cameras acquire radiation proportional to the relative distribution of the apparent temperature of objects placed in the camera lens field of view. The projection of objects depends on camera and lens parameters, as well as on the temperature difference between objects and their emissivity. However, passive THz imagers provide a much lower thermal resolution than state-of-the-art thermal infrared cameras. Imagers operating in the terahertz range also provide images with a low spatial resolution due to the long wavelength of the recorded radiation. The low spatial resolution limits facial recognition capability but might be sufficient for presentation attack detection, since no detailed face shapes are required.

Terahertz radiation penetrates a variety of non-polar, non-conducting materials including clothing, plastics and other non-metals; therefore, it allows seeing through various materials. Passive terahertz imaging is inherently safe for living tissues and DNA. This part of the spectrum is highly attenuated by water; thus, penetration of living tissue is limited to a few millimeters at the lower frequencies of the THz band.

In this paper, we consider various face presentation attacks performed using different presentation attack instruments, including printed papers and masks mounted on a human face, as presented in Fig. 1.

Fig. 1. Radiation model of the face covered with a facial mask.

Since the human face and the mask are in direct contact, energy (heat) is transferred between them. Every body in thermal equilibrium emits radiation in a broad spectral range [48]. In the low-frequency (Rayleigh–Jeans) limit, the spectral radiance of a body at temperature T is described by Eq. (1) [48]:

$${B_\nu }(T )= \frac{{2 \cdot {\nu ^2}}}{{{c^2}}}k \cdot T, $$
where k is the Boltzmann constant and T is the temperature; at 300 GHz and T ≈ 300 K, $B_\nu(T) = 8.28 \cdot 10^{-15}\ \mathrm{W\,sr^{-1}\,m^{-2}\,Hz^{-1}}$. The heat conduction is described by Fourier’s law, which states that the time rate of heat transfer through a material is proportional to the negative gradient of the temperature and to the area, at right angles to that gradient, through which the heat flows [49]:
$$\vec{q} = -k\nabla T, $$
where $\vec{q}$ is the local heat flux density, k is the thermal conductivity of the material (which depends on temperature) and $\nabla T$ is the temperature gradient. Passive cameras map differences in the energy radiated per unit surface within their spectral range to temperature differences. The adhesion between face and mask is of great importance, since thermalization depends strongly on direct contact between the two bodies. During this study, two types of PAIs have been considered. The first type concerns masks directly attached to the subject’s face, while the second type corresponds to PAIs held in the subject’s hand, for example displayed or printed photographs.
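As a quick numerical check of Eq. (1), the short Python sketch below (not part of the original study) evaluates the Rayleigh–Jeans spectral radiance at the frequencies discussed in the text; the temperature values are assumptions.

```python
# Minimal sketch: evaluating the Rayleigh-Jeans spectral radiance of Eq. (1).
from scipy.constants import c, k  # speed of light [m/s], Boltzmann constant [J/K]

def rayleigh_jeans_radiance(nu_hz: float, temp_k: float) -> float:
    """Spectral radiance B_nu(T) in W / (sr * m^2 * Hz)."""
    return 2.0 * nu_hz**2 * k * temp_k / c**2

# At 300 GHz and an assumed temperature of 300 K this reproduces the
# ~8.28e-15 W sr^-1 m^-2 Hz^-1 value quoted above.
print(rayleigh_jeans_radiance(300e9, 300.0))
# At the imager frequency of 250 GHz (skin temperature of ~310 K assumed):
print(rayleigh_jeans_radiance(250e9, 310.0))
```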

For the purpose of the theoretical considerations, facial masks are treated as a porous material that partly transmits terahertz radiation. Depending on the structure, type of material, and mask thickness, terahertz radiation can be partly transmitted as through an optical filter [49]. The filter is characterized by the emissivity ε, the reflectance coefficient ρ and the transmittance τ. The amount of energy radiated also depends on the temperature difference between the inner and outer sides of the mask [49]. To estimate a limiting value of the contrast, it has been assumed that for any mask its temperature equalled the human face temperature. This temperature lies, in fact, between the temperatures of the inner and outer sides of the facial mask.

The radiant exitance ΦW of the area W differs from the radiant exitance ΦS of the area S (Fig. 1). Temperatures of these ROIs have been used to calculate the values necessary to determine the effectiveness of PAI detection. The components of the total radiant exitance from both the W and S surfaces are presented in Fig. 1.

In order to estimate the thermal contrast based on the image analysis, a simplified model has been adopted in which the radiant exitances ΦS and ΦW (of areas without and with a hidden object underneath, respectively) are given by the following equations:

$${\Phi _S} = {\tau _C}{\Phi _B}, $$
$${\Phi _W} = {\tau _C}{\Phi _B} + {\Phi _C} + {\rho _C}{\Phi _A}, $$
where $\rho_C$ is the mask reflection coefficient, $\tau_C$ is the mask transmittance coefficient, $\Phi_B$ is the radiant exitance of the human skin, $\Phi_C$ is the mask radiant exitance and $\Phi_A$ is the mask surface irradiance.

The components presented in Eqs. (3) and (4) correspond to different physical phenomena. The component $\rho_C\Phi_A$ describes the ambient radiation reflected from the mask surface, while $\tau_C\Phi_B$ is the portion of the skin radiation transmitted through a mask characterized by the transmittance coefficient $\tau_C$. In both Eqs. (3) and (4) it is assumed that the mask temperature, which results in the radiation component $\Phi_C$, is close to the ambient temperature $T_A$. It has also been assumed that the mask temperature was approximately the same for all the PAIs considered during the study. The energetic contrast K between the neighbouring areas W and S in thermal equilibrium can be described as:

$$K = \frac{\Phi_S - \Phi_W}{\Phi_S} = \frac{\tau_C\Phi_B - ({\tau_C\Phi_B + \Phi_C + \rho_C\Phi_A})}{\tau_C\Phi_B} = -\frac{\Phi_C + \rho_C\Phi_A}{\tau_C\Phi_B}. $$
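As an aside, the following Python sketch evaluates the contrast K of Eq. (5) for two hypothetical masks; all numerical values are illustrative assumptions, not measured data from this study.

```python
# Illustrative sketch of the energetic contrast K of Eq. (5).
# All numerical values are placeholder assumptions.

def energetic_contrast(tau_c, rho_c, phi_b, phi_c, phi_a):
    """K = (Phi_S - Phi_W) / Phi_S, with Phi_S = tau_c * Phi_B and
    Phi_W = tau_c * Phi_B + Phi_C + rho_c * Phi_A."""
    phi_s = tau_c * phi_b
    phi_w = tau_c * phi_b + phi_c + rho_c * phi_a
    return (phi_s - phi_w) / phi_s

# Hypothetical high-transmittance mask (latex-like) vs. a low-transmittance
# one (3D-printed-like); exitances in arbitrary but consistent units.
print(energetic_contrast(tau_c=0.57, rho_c=0.05, phi_b=1.0, phi_c=0.3, phi_a=0.8))
print(energetic_contrast(tau_c=0.20, rho_c=0.10, phi_b=1.0, phi_c=0.3, phi_a=0.8))
```

As Eq. (5) implies, the lower the mask transmittance, the larger the magnitude of K, i.e. the stronger the contrast between the covered and uncovered areas.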
The above analysis does not take into consideration the part of the radiation coming from the human face that is absorbed and re-emitted by the mask. The analysis can be translated into the spectral domain. The spectral contrast $cp({\lambda ,T})$ is used to connect the wavelength $\lambda$ and the temperature T. The relationship can be described by the following equation:
$$cp({\lambda ,T} )= \frac{{{\varphi _B}({\lambda ,T} )- {\varphi _C}({\lambda ,T} )}}{{{\varphi _B}({\lambda ,T} )}},$$
where ${\varphi _B}({\lambda ,T} )$ is the human body spectral radiant exitance, and ${\varphi _c}({\lambda ,T} )$ is the mask spectral radiant exitance. The values of the body and mask spectral radiant exitances can be described by:
$${\varphi _B}({\lambda ,T} )= {\varepsilon _B}({\lambda ,T} )\cdot {m_{BB}}({\lambda ,T} ), $$
$${\varphi _c}({\lambda ,T} )= {\varepsilon _c}({\lambda ,T} )\cdot {m_{BB}}({\lambda ,T} ), $$
where ${\varepsilon _B}({\lambda ,T} ),$ ${\varepsilon _c}({\lambda ,T} )$ and ${m_{BB}}({\lambda ,T} )$ are the human body spectral emissivity, the mask spectral emissivity and the blackbody spectral radiant exitance, respectively.

Imagers have a finite spectral range $\varDelta \lambda = {\lambda _{max}} - {\lambda _{min}}$ that is determined by the spectral response of the detection unit and the transmittance characteristics of the applied optics. However, it is possible, for a given spectral band $\varDelta \lambda$, ambient temperature ${T_A}$, body temperature ${T_B}$ and mask temperature ${T_c}$, to define the power contrast $CP({{\lambda_{min}},{\lambda_{max}},{T_A},{T_B},{T_c}} )$ given by the following:

$$CP({{\lambda_{min}},{\lambda_{max}},{T_A},{T_B},{T_c}} )= 1 - \frac{{\mathop \smallint \nolimits_{{\lambda _{min}}}^{{\lambda _{max}}} {\tau _c}({\lambda ,{T_A}} ){\varepsilon _c}({\lambda ,{T_C}} ){m_{bb}}({\lambda ,{T_c}} )d\lambda }}{{\mathop \smallint \nolimits_{{\lambda _{min}}}^{{\lambda _{max}}} {\tau _c}({\lambda ,{T_A}} ){\varepsilon _B}({\lambda ,{T_B}} ){m_{bb}}({\lambda ,{T_B}} )d\lambda }},$$
where ${\tau _c}({\lambda ,T} )$ is the clothing spectral transmittance coefficient.
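A hedged numerical sketch of Eq. (9) is given below; it assumes, purely for illustration, spectrally flat emissivities and transmittance over the imager band, and none of the parameter values come from the measurements in this paper.

```python
# Sketch: numerical evaluation of the power contrast CP of Eq. (9) under the
# simplifying assumption of spectrally flat emissivities and transmittance.
import numpy as np
from scipy.constants import h, c, k

def m_bb(lam, temp):
    """Blackbody spectral radiant exitance [W m^-3] (Planck's law)."""
    return (2.0 * np.pi * h * c**2 / lam**5) / np.expm1(h * c / (lam * k * temp))

def power_contrast(lam_min, lam_max, t_b, t_c, eps_b, eps_c, tau, n=2000):
    lam = np.linspace(lam_min, lam_max, n)
    num = np.trapz(tau * eps_c * m_bb(lam, t_c), lam)  # radiation of the mask
    den = np.trapz(tau * eps_b * m_bb(lam, t_b), lam)  # radiation of the bare skin
    return 1.0 - num / den

# Band roughly around 250 GHz (1.0-1.5 mm); skin at 310 K, mask at 300 K,
# assumed emissivities and clothing transmittance.
print(power_contrast(1.0e-3, 1.5e-3, t_b=310.0, t_c=300.0,
                     eps_b=0.95, eps_c=0.60, tau=0.70))
```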

The subject’s face and the mask are in direct contact; therefore, they tend toward thermodynamic equilibrium and their temperatures equalize within a certain time. Since the human body has a constant temperature and the mask is initially at the lower, ambient temperature, the mask is heated by drawing energy from the body. Following the above analysis, the experiment should be conducted over several time intervals to illustrate the energy flow between the two bodies.

Measurement of real values of the factors mentioned in the above analysis is a complex process as shown in recent works [50,51] and extends beyond the scope of this study as the latter is focused on conducting practical measurements of PAIs in real applications. However, real application of the theoretical analysis is of high importance and is considered for further investigations in the future.

4. Measurement environment

4.1 Measurement of PAIs

During this study, various instruments have been used to perform face presentation attacks. The focus has been put on detecting 2D and 3D PAIs presenting realistic faces. These PAIs include 2D photographs printed on paper or foil sheets, 2D photographs displayed using a tablet, 3D full-face flexible masks and 3D-printed facial masks. 2D attacks are simple and use printed or displayed photographs as PAIs, while 3D attacks using customized masks are more complex.

During this study, two types of 3D facial masks have been used. The first type is a full-face flexible mask made of foam latex. These masks can be customized and are manufactured in various sizes fitted to the subject. These PAIs cover the subject’s head and neck, and the mask’s inner surface is adhesive. The masks are manufactured with holes at the eye, nose, and mouth locations. During the study, seven different latex masks of various sizes and different appearance have been used.

The second type of mask used during the experiment is a 3D-printed mask manufactured by Thatsmyface. These masks are made of a hard material which does not fully adhere to facial shapes. The shape of each mask is customized based on facial images of the subject’s face. Each mask covers the front of the face and is attached using tape. The mask size is standardized. During the study, three different masks presenting four subjects have been used. Sample images of 3D masks are shown in Fig. 2.

Fig. 2. Images of sample (a) latex masks, and (b) 3D-printed masks.

Since the difference between a genuine subject and a presentation attack may rely on the intensity difference between specific regions of interest, the facial masks used during the study have been characterized in the range of 0.15–2 THz to estimate the transmissive capabilities of the materials. The characterization has been performed using a TeraView TDS TPS 300 [52] spectrometer operating in a transmission mode. Figure 3 presents the graphs of the transmittance through the 3D-printed and full-face latex masks within the 150 GHz–2.0 THz range.

Fig. 3. Graphs of the transmittance through all facial masks used during the study.

All presented graphs confirm the trend that transmittance through masks in the terahertz range decreases with increasing frequency. The highest transmittance at the imager operation frequency is reported for paper and foil sheets, with values of 92% and 88.1%, respectively. All the full-face latex masks offer a higher transmission than the 3D-printed masks across the full investigated spectrum, with a maximum transmittance of 57.44% at 250 GHz. The mean transmission of the full-face latex masks is more than two times higher than that of the 3D-printed masks. Since the study is performed at a frequency of 250 GHz, the transmission through different masks is relatively high. Full statistical values at the imager frequency are provided in Table 1.


Table 1. Values of the transmission through facial masks at 250 GHz.

The experiment has been divided into two parts depending on the spoofing instruments used for presentation attacks. During the first part, spoofing instruments with no direct contact with the face have been investigated; this applies to printed or displayed attacks. The second part of the experiment concerns subjects wearing PAIs that are directly attached to the face. During this part of the study, 3D facial masks have been investigated in long-lasting experiments to evaluate the impact of heat transfer on the PAD performance. For reference, images of genuine subjects have been registered.

4.2 Imager

The measurement methodology comprised the measurement procedures and algorithms, as well as the hardware setup. To investigate the possibility of detecting a presentation attack in the terahertz range, a TS4 passive imager (Digital Barriers, ThruVision) [53] operating at 250 GHz has been used. The operation frequency has been selected as a compromise between imaging distance and transmittance through various materials. A higher imaging frequency would provide a more detailed image due to a higher spatial resolution; however, the transmission through materials would drop significantly, and the benefit of seeing through materials and objects would be lost.

This imager is commonly used at airports and other security checkpoints. Such an imager, installed at an airport security checkpoint, could play the additional role of a presentation attack detector. The measurement equipment parameters are presented in Table 2.


Table 2. Imager parameters.

It should be noted that terahertz imagers operate at shorter distances than typical visible range and thermal infrared cameras due to the low energy of the recorded signals. Moreover, compared to thermal infrared cameras, passive terahertz cameras provide images of lower pixel resolution and lower thermal resolution. However, the main advantage of passive THz imaging explored in this study is the transmission through various non-metallic materials. The intensity analysis is expected to reveal changes of intensity introduced by the presence of a PAI.

The operating distance of the THz camera used during the experiment ranges between 3 m and 9 m. The field of view at a distance of 3 m corresponds to an area of around 60 cm x 120 cm. The camera captures the subject’s head and torso. At smaller distances, images are blurred.

During the experiments, the distance between the subject being screened and the camera was constant (3 m), the ambient temperature was constant (294 K, controlled by the air conditioning) and the relative humidity varied by no more than 3%. Measurement data (images, air temperature, humidity and pressure, values of the body and object temperatures) were collected every minute.

4.3 Clothing

The amount of energy reaching the imager depends on the subject’s clothing and on the presentation attack instrument used during the experiment. The measurement methodology assumes that the relation between intensities of different body parts should reveal a presentation attack. Since both clothing and PAI introduce an emission change, relative intensities between the head region and another selected part of the body (torso) are analyzed. However, the subject may be wearing different clothing of various thickness, material, and basis weight. Different clothing may strongly influence PA detection capabilities; therefore, the impact of different clothing has been carefully investigated. All the measurements have been performed with three sets of clothing of different terahertz transmittance. The first set was composed of loose and light clothing, while the second and third sets consisted of thicker and warmer garments. Moreover, the measurement methodology also takes into account the impact of the heat transfer between a PAI and the subject’s face on the PAD capability. All the experiments have been performed in 30-min sessions. It has been observed that 30 minutes is a sufficient period for the temperature to reach equilibrium, which for most of the PAIs was achieved after about 10 minutes. By temperature equilibrium we mean the state in which the PAI temperature, initially lower than that of the subject’s face, becomes constant after a certain period of time.
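As an illustration of this criterion, the sketch below (an assumption-based example, not the authors' procedure) estimates when a once-per-minute intensity trace of a PAI stabilizes; the tolerance and window values are arbitrary.

```python
# Sketch: estimating when the normalized PAI intensity stabilizes during a
# 30-minute session sampled once per minute. Tolerance/window are assumptions.
import numpy as np

def minutes_to_equilibrium(intensity, tol=0.01, window=3):
    """Return the first minute after which the intensity varies by less than
    `tol` over the following `window` samples, or None if it never does."""
    intensity = np.asarray(intensity, dtype=float)
    for t in range(len(intensity) - window):
        if np.ptp(intensity[t:t + window + 1]) < tol:
            return t
    return None

# Synthetic trace: exponential heating toward a constant value.
minutes = np.arange(30)
trace = 0.8 - 0.2 * np.exp(-minutes / 4.0)
print(minutes_to_equilibrium(trace))  # ~10 minutes, in line with the text
```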

Terahertz radiation transmission through clothing has been reported previously [2,54,55]. Transmittance values depend not only on the type of material but also on the basis weight, thickness and arrangement of fibres. The transmittance of the clothing used in this study has been characterized to provide complete and comparable information. Transmittance graphs are presented in Fig. 4. Clothing transmittance values at a frequency of 250 GHz are presented in Table 3.

Fig. 4. Graphs of the transmittance through clothing.


Table 3. Values of transmittance through the selected clothing at 250 GHz.

All the clothing has been selected to reflect real-life situations. Transmittance measurements have been performed in a wide spectrum from 150 GHz to 2.5 THz.

5. Image analysis

The face presentation attack detection process relies on the analysis of the 2D distribution of the energy radiated by objects in the camera lens field of view. The intensity distribution in an image depends on the clothing transmittance, as well as on the transmittance through spoofing instruments.

Since the camera captures not only the head but also the torso, for the numerical analysis, statistical values of pixels from two regions of interest (ROIs) are compared, as shown in Fig. 5(e). The first ROI is located in the head area and covers a square of 6 × 6 pixels. The second ROI represents the rest of the body and is located on the torso; this region covers an area of 40 × 60 pixels. For the statistics, mean values of the pixels inside the ROIs are calculated. Samples of THz images presenting genuine subjects are presented in Figs. 5(a)–5(d).
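A minimal sketch of this per-image statistic is shown below; the ROI coordinates and the image itself are placeholders, not values from the paper.

```python
# Sketch: mean pixel intensity inside a 6x6 head ROI and a 40x60 torso ROI.
# ROI positions are hypothetical; a real pipeline would locate them per image.
import numpy as np

def roi_means(img, head_xy, torso_xy):
    """img: 2-D array of normalized THz intensities.
    head_xy / torso_xy: (row, col) of each ROI's top-left corner."""
    hr, hc = head_xy
    tr, tc = torso_xy
    head = img[hr:hr + 6, hc:hc + 6]        # 6 x 6 pixel head ROI
    torso = img[tr:tr + 60, tc:tc + 40]     # 40 x 60 pixel torso ROI (assumed orientation)
    return head.mean(), torso.mean()

# Usage on a synthetic frame.
frame = np.random.rand(240, 160)
head_mean, torso_mean = roi_means(frame, head_xy=(20, 75), torso_xy=(90, 60))
print(head_mean - torso_mean)  # the quantity later compared against a threshold
```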

Fig. 5. THz images presenting genuine subjects: (a) wearing a T-shirt, (b) wearing a T-shirt and a sweater, (c) wearing a T-shirt, a sweater and a jacket, (d) wearing a T-shirt, a sweater and a leather jacket, (e) image with the selected ROIs.

The intensity analysis aims to reveal whether the presence of PAIs introduces any change into the image intensity which could be distinguished from other body parts. A very important physical aspect that may affect the imager’s ability to detect an object is the contact between a presentation attack instrument and the human body. The heat exchange described by Fourier’s law occurs only when two bodies are in direct contact with each other. Loose masks may not perfectly adhere to the face, and then the energy transfer may not be complete. This is highly relevant for the 3D-printed facial masks used during the experiment, since they are not flexible and may not adhere perfectly.

In order to assess the impact of heat transfer on the attack detection capability, all the measurements have been performed in 30-min sessions. During each session, a single configuration with a set of clothes and a PAI was investigated. For the purpose of algorithm development, 4000 images have been collected, including 2000 images of genuine subjects and 2000 images with presentation attacks. Each dataset corresponding to a set of clothing is composed of five subsets presenting five different PAIs. We collected 400 images of subjects with each of the PAIs.

Graphs of intensity of various PAIs are presented in Fig. 6. The graphs present normalized mean pixel intensities of various PAIs during 30-minute-long recording sessions.

Fig. 6. Normalized pixel intensities of different PAIs across time for the subject wearing a T-shirt.

It can be noticed that the range of pixel intensities is very wide. Based on Fig. 6, the PAIs can be grouped according to common intensity values: the 3D-printed mask and the tablet correspond to low pixel intensities, while foil sheets, paper sheets and latex masks correspond to high pixel intensities. A correlation can be found between the pixel intensity values and the transmittance through the PAIs. High-transmission PAIs introduce a small change to the radiated energy, and thus the change in pixel intensity they introduce is smaller. It should also be highlighted that the PAI pixel intensities are almost constant during the experiment.

5.1 Printed or Displayed Attacks

The set of spoofing instruments used during the first part of the experiment included facial photographs printed on a paper sheet (80 g/m2), facial photographs printed on a transparent photographic foil sheet (120 g/m2), and facial photographs displayed on a tablet screen (Microsoft Surface). All the instruments were held in the hand, close in front of the face. Sample images of subjects holding each of the spoofing instruments are presented in Fig. 7.

Fig. 7. THz images presenting subjects and presentation attacks using: (a) PET foil sheet, subject wearing a T-shirt, (b) paper sheet, subject wearing a T-shirt, (c) tablet, subject wearing a T-shirt, (d) PET foil sheet, subject wearing a T-shirt and a sweater, (e) paper sheet, subject wearing a T-shirt and a sweater, (f) tablet, subject wearing a T-shirt and a sweater, (g) PET foil sheet, subject wearing a T-shirt, a sweater and a jacket, (h) paper sheet, subject wearing a T-shirt, a sweater and a jacket, (i) tablet, subject wearing a T-shirt, a sweater and a jacket.

The values of mean normalized pixel intensities over the two regions of interest are provided in Table 4. The values for the foil sheet, paper sheet and tablet have been calculated for RoI 1 (head), while the reference corresponds to RoI 2 (torso).


Table 4. Mean normalized intensities of RoIs with paper sheets or foil sheets.

The intensity analysis revealed that the tablet presence introduces the largest change of pixel intensity. This is a result of the tablet’s zero transmission. The foil and paper sheets introduce smaller changes; however, there is still a noticeable intensity difference between the subject and the object. The paper sheet and foil sheet have a very high transmittance (around 90% at 250 GHz), thus they introduce a small loss of energy and a small change of pixel intensities. It needs to be highlighted that the intensities remained quite constant during the experiment, because the PAIs were held close to, but not attached to, the face. Moreover, the intensity of the reference RoI also depends on the set of clothing used during the experiment and decreases for thicker clothes. Since there was no direct contact between the PAIs and the face, there was no direct heat transfer between these two bodies.

5.2 Attacks using 3-dimensional masks

The second part of the experiment involved all the spoofing instruments directly interacting with faces (facial masks). The relative temperature between the face and the facial mask changes during the experiment as a result of the energy transfer between them. When the mask is attached to the face for a specified period of time, its initial temperature is lower than the body temperature; it is heated by the body, and the object and the body reach a thermal equilibrium after some period of time. In order to assess the impact of temperature changes on the detection capability, we performed the measurements in 30-min sessions. During each session, a single configuration with a set of clothes and a PAI was investigated.

Intensity changes caused by the presence of facial masks are shown in the images presented in Fig. 8.

Fig. 8. THz images presenting a subject wearing facial masks: (a) 3D printed – subject wearing a T-shirt looking left, (b) 3D printed – subject wearing a T-shirt looking right, (c) 3D printed – subject wearing a T-shirt, frontal face position, (d) full-face latex mask, subject wearing a T-shirt, (e) 3D printed – subject wearing a T-shirt and a sweater, (f) full-face latex mask, subject wearing a T-shirt and a sweater, (g) 3D printed – subject wearing a T-shirt, a sweater and a jacket, (h) full-face latex mask, subject wearing a T-shirt, a sweater and a jacket.

The values of mean pixel intensities over the two regions of interest are provided in Table 5.


Table 5. Mean normalized intensities of RoIs with latex and 3D-printed masks.

Pixel intensities of the reference area (torso) strongly depend on the clothing covering the body. The presence of clothing decreases the amount of energy reaching the camera; therefore, the intensities corresponding to this region of the image are lower. The intensity decrease depends on the thickness and type of clothing. When comparing pixel intensities of the same mask type for different sets of clothing, it is noticeable that the mask intensity does not change significantly when the clothing at the reference area changes. It has been noticed that the mean pixel intensity of the facial area increases during the experiments with facial masks. The temperature of the facial masks increases due to the direct contact with the human face; therefore, the normalized intensities at the end of each experiment are higher than at the beginning.

A full-face latex mask introduces a small change of intensity, since the transmission through this type of PAI is very high at 250 GHz. However, the mean normalized intensities are still lower than the pixel intensities of the bare face and the reference area.

The difference between the mean pixel intensities for 3D-printed masks and the reference area is large. 3D-printed masks have a very low transmission and thus block the radiation coming from the human face.

6. Automatic presentation attack detection

As a result of the analysis presented in Section 5, two presentation attack detection methods have been proposed. The first method uses a simple intensity threshold to distinguish attacks from genuine images. The second method uses several deep learning classification models for automatic PAD. Since no datasets of face presentation attacks with passive THz imaging are publicly available, the presented PAD methods were developed and validated based on an in-house dataset. In total, 4000 images were acquired, including 2000 images of genuine subjects and 2000 images with presentation attacks.

For the validation purpose, two metrics have been used, namely the attack presentation classification error rate (APCER) and the bona fide presentation classification error rate (BPCER). APCER corresponds to the proportion of attack presentations using the same PAI species that are incorrectly classified as bona fide presentations in a specific scenario, while BPCER corresponds to the proportion of bona fide presentations incorrectly classified as attack presentations in a specific scenario.
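For reference, a minimal sketch of how these two error rates can be computed from binary decisions is given below; the labels and their coding are assumptions, and while the paper reports APCER per PAI species, all attacks are pooled here for brevity.

```python
# Sketch: APCER and BPCER from binary decisions (1 = attack, 0 = bona fide).
import numpy as np

def apcer_bpcer(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    attacks = y_true == 1
    bona_fide = y_true == 0
    apcer = np.mean(y_pred[attacks] == 0)     # attacks accepted as bona fide
    bpcer = np.mean(y_pred[bona_fide] == 1)   # bona fide rejected as attacks
    return apcer, bpcer

print(apcer_bpcer([1, 1, 1, 0, 0, 0], [1, 0, 1, 0, 1, 0]))  # (0.33, 0.33)
```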

6.1 Threshold based method

As a result of the intensity analysis, a threshold-based method was proposed. The threshold for distinguishing genuine subjects from imposters was calculated based on the difference between the mean normalized intensities of the two RoIs and of a bare face. During validation, the mean intensities of the two RoIs presented in Fig. 5(e) have been calculated for each sample (image) and compared with the threshold.

Thresholds have been calculated to obtain an APCER of 1% based on 70% randomly selected samples of the whole dataset. The mean BPCER calculated for the given APCER is 29%. Analysis of the results showed that most of the incorrectly classified samples come from attacks performed using a photograph printed on paper.
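A hedged sketch of this calibration step is shown below: the decision statistic is the head-minus-torso mean-intensity difference, and the threshold is chosen on a calibration split so that roughly 1% of attacks are accepted. The distributions and variable names are synthetic assumptions, so the printed BPCER will not match the 29% reported above.

```python
# Sketch: calibrating an intensity-difference threshold to ~1% APCER.
# The data below are synthetic; real statistics come from the THz images.
import numpy as np

def calibrate_threshold(diff_attack, target_apcer=0.01):
    """Threshold such that only `target_apcer` of attack samples lie above it
    (i.e. would be accepted as bona fide)."""
    return np.quantile(np.asarray(diff_attack), 1.0 - target_apcer)

def is_attack(diff, threshold):
    # A PAI blocks part of the facial radiation, lowering the head-minus-torso
    # difference, so small differences are classified as attacks.
    return diff < threshold

rng = np.random.default_rng(0)
diff_attack = rng.normal(-0.25, 0.08, 1400)   # 70% calibration split (synthetic)
diff_genuine = rng.normal(0.05, 0.08, 1400)
thr = calibrate_threshold(diff_attack)
print("BPCER at ~1% APCER:", np.mean(is_attack(diff_genuine, thr)))
```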

An unknown-attack validation scheme has been applied as a secondary validation method to verify the generalization of the proposed method. During the unknown-attack validation, the training and testing split was based on known and unknown PAIs. In each unknown-attack scenario, images presenting attacks with one type of PAI were used to calculate the threshold, and the remaining images of “unknown” attacks were kept aside for testing. In this manner, the generalization capability of the PA detection method was thoroughly assessed. All the thresholds have been calculated to obtain an APCER of 1% for the known PAIs. The mean results of the unknown-attack validation are presented in Table 6.


Table 6. Results of the unknown-attack validation.
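The following sketch illustrates one way such known/unknown splits can be generated; the PAI labels and the 50/50 sharing of genuine samples between the calibration and test sides are assumptions made for illustration, not the exact protocol of the paper.

```python
# Sketch: leave-all-but-one-PAI-out ("unknown attack") splits. PAI names and
# the handling of genuine samples are assumptions, not the paper's protocol.
import random

PAI_TYPES = ["paper", "foil", "tablet", "latex_mask", "printed_3d_mask"]

def unknown_attack_splits(attack_samples, genuine_samples, seed=0):
    """attack_samples: dict mapping PAI type -> list of per-image statistics.
    genuine_samples: list of statistics of bona fide images."""
    rng = random.Random(seed)
    genuine = list(genuine_samples)
    rng.shuffle(genuine)
    half = len(genuine) // 2
    for known in PAI_TYPES:
        calib_attacks = attack_samples[known]
        test_attacks = [s for p in PAI_TYPES if p != known for s in attack_samples[p]]
        # Calibrate the threshold on (calib_attacks, genuine[:half]);
        # evaluate APCER/BPCER on (test_attacks, genuine[half:]).
        yield known, (calib_attacks, genuine[:half]), (test_attacks, genuine[half:])
```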

The method validated separately on 3D-printed masks and displayed photographs obtained very good results. The error rate of 1% was generated by situations when the subject was wearing very thick clothing and a leather jacket. Although this situation is not realistic in a typical airport scenario, it is included in this study.

Thresholds calibrated on 2D attacks with paper or foil sheets, as well as on latex masks, generalize very well to 3D-printed masks and displayed photographs. 3D-printed masks and displayed photographs are shown to introduce large intensity differences between the RoIs. For these two types of PAIs, almost all tested samples have been correctly classified. Validation on the other PAIs has shown that the rate of incorrectly classified samples is much higher. This is a result of the smaller intensity difference between the RoIs introduced by each of the remaining PAIs. Moreover, it has been noticed that clothing introduces large intensity variations in the reference RoI. Some sets of clothing introduce an intensity change similar to the change of intensity introduced by a PAI.

6.2 Deep learning based method

The second PAD method investigated in this study is based on deep learning classifiers. Terahertz images were fed to a deep neural network for presentation attack detection. Eight state-of-the-art CNNs have been investigated during this study, including ResNet-18 [56], ResNet-50 [56], ResNet-101 [56], GoogLeNet [57], Inception v3 [58], MobileNet-2 [59], DenseNet-201 [60], and Xception [61]. Each CNN has been trained to classify presented subjects as genuine or imposter. The selected networks have been first pre-trained on the ImageNet dataset and then re-trained on datasets of images presenting genuine subjects and attacks. For training and validation purposes, the in-house dataset was used.
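A minimal transfer-learning sketch of this kind of re-training is shown below (PyTorch); the dataset path, preprocessing, and hyperparameters are placeholders, not the configuration actually used in the study.

```python
# Sketch: fine-tuning an ImageNet-pretrained ResNet-18 as a binary
# genuine/imposter classifier. Paths and hyperparameters are placeholders.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

tfm = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),  # single-channel THz images
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
train_set = datasets.ImageFolder("ptfsd/train", transform=tfm)  # hypothetical layout
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)     # two classes: genuine / imposter
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(10):                           # placeholder epoch count
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```

Any of the other listed architectures could be swapped in by replacing the backbone and its final classification layer in the same way.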

Two training and validation approaches have been applied. First, each classifier has been validated in a ten-fold cross-validation scheme. The dataset has been divided into training, test and validation sets with a split ratio of 70%, 10% and 20%, respectively. The mean results of the ten-fold cross-validation are presented in Table 7.


Table 7. Results of the 10-fold cross-validation.

The ten-fold cross-validation revealed that all 8 investigated neural networks obtained an APCER of at least 12% and a BPCER of at least 7.21%. The investigated methods achieve high performance when the tested PAIs are present during training. However, none of the studied methods achieves the state-of-the-art performance presented by some of the presentation attack detection methods operating in the visible or thermal infrared domains. The best performing neural network, GoogLeNet, obtained an APCER of 12.01% with a BPCER of 8.01%. These results are worse than those of the best performing PAD methods operating in the visible and thermal infrared ranges. However, during this study, known image classification algorithms have been used instead of algorithms tailored specifically for terahertz image analysis. As for the second validation approach, the unknown-attack validation scheme was applied to verify the generalization of the proposed method. During each iteration of the validation, the training dataset included images presenting attacks using PAIs not present in the validation dataset. Results of the unknown-attack validation are presented in Tables 8 and 9.


Table 8. Results of the unknown-attack validationa.


Table 9. Results of the unknown-attack validationa.

The unknown-attack experiments on the deep learning models revealed that PAD performance is not uniform across all presentation attack instruments. Models trained on 2D attacks with paper or foil sheets, as well as on latex masks, generalize very well to 3D-printed masks and displayed photographs. The intensity analysis revealed that thermalization influences only the 3D-printed and latex masks; however, the impact of heating on PAD performance for attacks using these masks is low. Classification errors for 3D-printed masks and latex masks are the same at the beginning and at the end of the experiment, independent of the acquisition moment.

The highest error rates have been noticed for models trained on 3D-printed masks, latex masks and displayed photographs and validated on 2D attacks with paper or foil sheets. Paper photographs and 2D foil sheets introduce a very small intensity change due to their very high transmission at 250 GHz. It seems that attacks performed using these two PAIs are the most difficult to detect due to the small intensity differences they introduce. It should also be noticed that models validated on latex masks obtained error rates of at least 10.81%; the transmission of latex masks is relatively high. Generally, GoogLeNet and Inception v3 are the best performing classifiers among the tested methods. On the other hand, ResNet-18 and MobileNet-2 achieved the lowest performance.

7. Discussion

All the presentation attack instruments introduce a change in the amount of radiation reaching the imager. As a result, the intensity of the pixels corresponding to the presentation attack instrument changes.

The long-lasting experiments aimed to reproduce real-life situations. From the results shown in the previous subsections it can be seen that the performance of the PAD algorithms is constant over time. Heating of the presentation attack instruments does not influence the PAD performance. This is an important difference between thermal infrared and terahertz imagers, since results reported in [47] show that in thermal infrared the heat transfer between PAIs and the subject’s face influences PAD performance. That work reported that PAD methods achieve an impressive performance at the beginning of the experiment, but it may decrease by up to 5% after several minutes.

Results presented in Section 6 show disproportions between PAD performance for different experiments in the unknown attack scenario. Attacks using PAIs made of materials of a high transmittance are most difficult to detect, because they introduce a small intensity change.

The presented methods do not achieve the state-of-the-art performance presented by some of the works in the visible range and thermal infrared. The best performing methods reported in the literature combine various imaging techniques, including visible range, depth, infrared and thermal infrared imaging. The multi-channel convolutional neural network (MC-CNN) [46] obtained impressive results in a k-fold validation, as well as in an unknown-attack scenario. This method was reported to obtain an ACER of 2.59% and 2.61% for thermal and visible imaging, respectively. The combination of grayscale, depth, near infrared and thermal infrared imaging resulted in an ACER of 0.84%.

Our previous studies [47] have shown that methods based on the analysis of thermal infrared images may perform very well. The methods based on known neural networks are reported to show an impressive performance; six of the eight classifiers achieved an APCER and BPCER of 0%. However, the performance of thermal infrared PAD methods decreases over time due to mask heating.

Taking into account that acceptable error rates in biometric recognition systems are on the order of 0.1%, the presented methods need to be improved to achieve an acceptable performance.

8. Summary

This paper presents the first study on the detection of various presentation attacks with terahertz passive imaging at 250 GHz. For the purpose of this study, a database of attacks and genuine subjects has been collected and analyzed. The collected images compose the passive terahertz face spoofing dataset (PTFSD) that will be published along with this manuscript. The dataset contains images presenting attacks performed using five different PAIs. The set of PAIs includes 2D photographs printed on paper or foil sheets, 2D photographs displayed using a tablet, 3D full-face flexible masks and 3D-printed facial masks.

The collected images presenting various attacks have been analyzed to reveal dependencies between intensities at different parts of the subject’s body. The study examines the impact of the heat flow between the subject and the PAI on PAD performance. Long-lasting experiments with all the PAIs and different types of clothing have been performed.

Finally, two presentation attack detection methods are proposed. Both methods allow detecting all types of attacks considered in this study, including attacks using simple 2D photographs and customized masks. Both methods have been validated using two different validation schemes.

The first method is based on a threshold corresponding to the difference between mean intensities of selected regions of interest. This method, validated across all the PAIs, obtains a BPCER of 29% at an APCER of 1%. This PAD method is based on a simple thresholding, thus all the results are easily interpretable.

The second method employs eight different deep learning classifiers to detect presentation attacks. The ten-fold cross-validation revealed that all 8 investigated neural networks provided an APCER of at least 12% and a BPCER of at least 7.21%. However, the unknown-attack validation revealed disproportions between the detection performance of the deep learning models across different attacks. It seems that attacks performed using 2D printed paper or foil sheets are the most difficult to detect, as they are characterized by a high THz transmission. For these two PAIs, the models achieved the highest error rates. Paper photographs and 2D foil sheets introduce a very small intensity change due to their very high transmission at 250 GHz. Models trained on 2D attacks with paper or foil sheets, as well as on latex masks, generalize very well to 3D-printed masks and displayed photographs. All the deep learning models validated on 3D-printed masks and displayed photographs in the unknown-attack scheme achieved a very high performance, with almost zero classification error rates. For latex masks, error rates were at least 10.81%.

It has been noticed that different types of clothing introduce large variations of intensity in the reference RoI. Some sets of clothing introduce an intensity change similar to the intensity change introduced by a PAI. Transmission through clothing, which is an advantage for concealed object detection, might be a fundamental disadvantage for the detection of presentation attacks in this particular imaging modality. The study revealed that the heat flow between the face and the PAI does not significantly influence the performance of the PAD algorithms.

The presented methods do not achieve the state-of-the-art performance of some of the works in the visible range and thermal infrared. Both presented PAD methods perform poorly for very simple attacks; however, more sophisticated attacks, including attacks with 3D facial masks, are very well detectable with passive THz imaging. Exploration of other imaging modalities of the THz spectrum may provide further interesting findings.

There is room for improvement, since the imager used during the experiments had relatively poor parameters (NETD and spatial resolution) compared to state-of-the-art imagers operating in the visible or thermal infrared domains. It is expected that an imager with a better NETD and a higher spatial resolution may provide much higher performance. One possible direction for improving performance is to use a blackbody or another object of known temperature to calibrate the imaging setup. This approach has been initially considered and requires more experiments and investigations. Moreover, since only general-purpose classification algorithms have been evaluated, algorithms tailored to terahertz images should also improve performance.

As a point for future studies, it is expected that the use of active THz time-domain imaging would provide interesting results. However, the development of a THz time-domain setup operating in free space is very challenging. The author expects that such a setup would be able to detect an additional layer on the face by analyzing time-domain signal responses.

Funding

Wojskowa Akademia Techniczna (UGB/22-783/2020); H2020 Societal Challenges (833704).

Acknowledgement

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 833704. This research was funded by Military University of Technology, grant no. UGB/22-783/2020.

Disclosures

The author declares that there are no conflicts of interest related to this article.

References

1. M. C. Kemp, “Explosives Detection by Terahertz Spectroscopy—A Bridge Too Far?” IEEE Trans. Terahertz Sci. Technol. 1(1), 282–292 (2011). [CrossRef]  

2. M. Kowalski and M. Kastek, “Comparative studies of passive imaging in terahertz and mid-wavelength infrared ranges for object detection,” IEEE T Inf. Foren. Sec. 11(9), 2028–2035 (2016). [CrossRef]  

3. M. Kowalski, “Real-time concealed object detection and recognition in passive imaging at 250 GHz,” Appl. Opt. 58(12), 3134–3140 (2019). [CrossRef]  

4. P. Lopato and T. Chady, “Terahertz detection and identification of defects in layered polymer composites and composite coatings,” Nondestruct. Test. Eva. 28(1), 28–43 (2013). [CrossRef]  

5. J. F. O’Hara, S. Ekin, W. Choi, and I. Song, “A Perspective on Terahertz Next-Generation Wireless Communications,” Technol. 7(2), 43–61 (2019). [CrossRef]  

6. J. Wang, R. I. Stantchev, Q. Sun, T. W. Chiu, A. T. Ahuja, and E. Pickwell MacPherson, “THz in vivo measurements: the effects of pressure on skin reflectivity,” Biomed. Opt. Express 9(12), 6467–6476 (2018). [CrossRef]  

7. D. B. Bennett, W. Li, Z. D. Taylor, W. S. Grundfest, and E. R. Brown, “Stratified Media Model for Terahertz Reflectometry of the Skin,” IEEE Sens. J. 11(5), 1253–1262 (2011). [CrossRef]  

8. S. R. Tripathi, E. Miyata, P. B. Ishai, and K. Kawase, “Morphology of human sweat ducts observed by optical coherence tomography and their frequency of resonance in the terahertz frequency region,” Sci. Rep. 5(1), 9071 (2015). [CrossRef]  

9. Y. B. Ji, E. S. Lee, S. H. Kim, J. H. Son, and T. I. Jeon, “A miniaturized fiber-coupled terahertz endoscope system,” Opt. Express 17(19), 17082–17087 (2009). [CrossRef]  

10. G. G. Hernandez-Cardoso, S. C. Rojas-Landeros, M. Alfaro-Gomez, A. I. Hernandez-Serrano, I. Salas-Gutierrez, E. Lemus-Bedolla, A. R. Castillo-Guzman, H. L. Lopez-Lemus, and E. Castro-Camus, “Terahertz imaging for early screening of diabetic foot syndrome: A proof of concept,” Sci. Rep. 7(1), 42124–42129 (2017). [CrossRef]  

11. I. Ozheredov, M. Prokopchuk, M. Mischenko, T. Safonova, P. Solyankin, A. Larichev, A. Angeluts, A. Balakin, and A. Shkurinov, “In vivo THz sensing of the cornea of the eye,” Laser Phys. Lett. 15(5), 055601 (2018). [CrossRef]  

12. P. C. Ashworth, E. Pickwell-MacPherson, E. Provenzano, S. E. Pinder, A. D. Purushotham, M. Pepper, and V. P. Wallace, “Terahertz pulsed spectroscopy of freshly excised human breast cancer,” Opt. Express 17(15), 12444–12454 (2009). [CrossRef]  

13. C. S. Joseph, R. Patel, V. A. Neel, R. H. Giles, and A. N. Yaroslavsky, “Imaging of ex vivo nonmelanoma skin cancers in the optical and terahertz spectral regions optical and terahertz skin cancers imaging,” J Biophotonics 7(5), 295–303 (2014). [CrossRef]  

14. I. I. V. Prozheev, O. A. Smolyanskaya, M. V. Duka, A. A. Ezerskaya, V. V. Orlov, E. A. Strepitov, N. S. Balbekin, and M. K. Khodzitsky, “Study of Penetration Depth Dispersion of THz Radiation in Human Pathological Tissues,” Progress in Electromagnetics Research Symposium, (2014) pp. 1536–1539.

15. L. Yu, L. Hao, T. Meiqiong, H. Jiaoqi, L. Wei, D. Jinying, C. Xueping, F. Weiling, and Z. Yang, “The medical application of terahertz technology in non-invasive detection of cells and tissues: opportunities and challenges,” RSC Adv. 9(17), 9354–9363 (2019). [CrossRef]  

16. X. Tan, Y. Li, and J. Liu, Face liveness detection from a single image with sparse low rank bilinear discriminative model (Springer, 2010).

17. G. Pan, L. Sun, L. Z. Wu, and S. Lao, “Eyeblink-Based Anti-Spoofing in Face Recognition from a Generic Webcamera,” in Proceedings of the 11th IEEE International Conference on Computer Vision(ICCV) (2007), pp. 1–8.

18. A. Anjos and S. Marcel, “Counter-measures to photo attacks in face recognition: A public database and a baseline,” in Proceedings of the International Joint Conference on Biometrics (IJCB) (2011), pp. 1–7.

19. I. Chingovska, A. Anjos, and S. Marcel, “On the effectiveness of local binary patterns in face anti-spoofing,” in Proceedings of the International Conference of Biometrics Special Interest Group (BIOSIG) (2012).

20. Z. Zhang, J. Yan, S. Liu, Z. Lei, D. Yi, and S. Z. Li, “A face anti-spoofing database with diverse attacks,” in Proceedings of the 2012 5th IAPR International Conference on Biometrics (ICB) (2012), pp. 26–31.

21. D. Wen, H. Han, and A. K. Jain, “Face spoof detection with image distortion analysis,” IEEE Trans. Inf. Forensics Secur. 10(4), 746–761 (2015). [CrossRef]  

22. K. Patel, H. Han, A. K. Jain, and G. Ott, “Live face video vs. spoof face video: Use of moiré patterns to detect replay video attacks,” in Proceedings of the International Conference on Biometrics (ICB) (2015), pp. 98–105.

23. J. Xiao, Y. Tang, J. Guo, Y. Yang, X. Zhu, Z. Lei, and S. Z. Li, “3DMA: A Multi-modality 3D Mask Face Anti-spoofing Database,” in 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS) (2019).

24. A. Pinto, W. R. Schwartz, H. Pedrini, and A. D. Rocha, “Using visual rhythms for detecting video-based facial spoof attacks,” IEEE Trans. Inf. Forensics Secur. 10(5), 1025–1038 (2015). [CrossRef]  

25. A. Agarwal, D. Yadav, N. Kohli, R. Singh, M. Vatsa, and A. Noore, “Face presentation attack with latex masks in multispectral videos,” in IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (2017), pp. 275–283.

26. S. Bhattacharjee, A. Mohammadi, and S. Marcel, “Spoofing Deep Face Recognition with Custom Silicone Masks,” in IEEE 9th International Conference on Biometrics Theory, Applications and Systems (BTAS) (2018).

27. S. R. Arashloo, J. Kittler, and W. Christmas, “Face Spoofing Detection Based on Multiple Descriptor Fusion Using Multiscale Dynamic Binarized Statistical Image Features,” IEEE Trans. Inf. Forensics Secur. 10(11), 2396–2407 (2015). [CrossRef]  

28. A. Anjos, M. M. Chakka, and S. Marcel, “Motion-based counter-measures to photo attacks in face recognition,” IET Biometrics 3(3), 147–158 (2014). [CrossRef]  

29. M. Killioǧlu, M. Taşkiran, and N. Kahraman, “Anti-spoofing in face recognition with liveness detection using pupil tracking,” in IEEE 15th International Symposium on Applied Machine Intelligence and Informatics (SAMI) (2017), pp. 87–92.

30. A. Asaduzzaman, A. Mummidi, M. F. Mridha, and F. N. Sibai, “Improving facial recognition accuracy by applying liveness monitoring technique,” in Proceedings of the 3rd International Conference on Advances in Electrical Engineering (ICAEE) (2015), pp. 133–136.

31. N. Erdogmus and S. Marcel, “Spoofing face recognition with 3d masks,” IEEE Trans. Inf. Forensics Secur. 9(7), 1084–1097 (2014). [CrossRef]  

32. S. Liu, P. C. Yuen, S. Zhang, and G. Zhao, “3d mask face anti-spoofing with remote photoplethysmography,” in European Conference on Computer Vision (Springer, 2016).

33. X. Li, J. Komulainen, G. Zhao, P. C. Yuen, and M. Pietikainen, “Generalized face anti-spoofing by detecting pulse from face videos,” in International Conference on Pattern Recognition (ICPR) (2016), pp. 4244–4249.

34. R. Shao, X. Lan, and P. C. Yuen, “Deep convolutional dynamic texture learning with adaptive channel-discriminability for 3d mask face anti-spoofing,” in IEEE International Joint Conference on Biometrics (IJCB) (2017), pp. 748–755.

35. L. Li, Z. Xia, X. Jiang, Y. Ma, F. Roli, and X. Feng, “3D Face Mask Presentation Attack Detection Based on Intrinsic Image Analysis,” arXiv:1903.11303 (2019), pp. 1–24.

36. I. Manjani, S. Tariyal, and M. Vatsa, “Detecting Silicone Mask based Presentation Attack via Deep Dictionary Learning,” IEEE Trans. Inf. Forensics Secur. 12(7), 1713–1723 (2017). [CrossRef]  

37. R. Ramachandra, S. Venkatesh, K. B. Raja, S. Bhattacharjee, P. Wasnik, S. Marcel, and C. Busch, “Custom Silicone Face Masks: Vulnerability of Commercial Face Recognition Systems & Presentation Attack Detection,” in 7th International Workshop on Biometrics and Forensics (IWBF) (2019).

38. L. Sun, W. B. Huang, and M. H. Wu, “TIR/VIS Correlation for Liveness Detection in Face Recognition,” in International Conference on Computer Analysis of Images and Patterns (CAIP) (2011), pp. 114–121.

39. C. Kant and N. Sharma, “Fake Face Recognition using Fusion of Thermal Imaging and Skin Elasticity,” IJCSCIJ 1(4), 65–72 (2013).

40. N. Pałka and M. Kowalski, “Towards Fingerprint Spoofing Detection in the Terahertz Range,” Sensors 20(12), 3379 (2020). [CrossRef]  

41. M. Kowalski, D. Coquillat, P. Zagrajek, and W. Knap, “Improvement of terahertz imaging using lock-in techniques,” in 40th International Conference on Infrared, Millimeter, and Terahertz Waves (IRMMW-THz), Hong Kong (2015), pp. 1–2.

42. P. Y. Han and X.-C. Zhang, “Free-space coherent broadband terahertz time-domain spectroscopy,” Meas. Sci. Technol. 12(11), 1747–1756 (2001). [CrossRef]  

43. J. Dai, J. Liu, and X.-C. Zhang, “Terahertz Wave Air Photonics: Terahertz Wave Generation and Detection With Laser-Induced Gas Plasma,” IEEE J. Sel. Top. Quantum Electron. 17(1), 183–190 (2011). [CrossRef]  

44. K. Liu and X. C. Zhang, “Toward Standoff Sensing of CBRN with THz Waves,” NATO Science for Peace and Security Series B: Physics and Biophysics. Springer, Dordrecht (2017) pp. 3–10.

45. J. Seo and I. J. Chung, “Face Liveness Detection Using Thermal Face-CNN with External Knowledge,” Symmetry 11(3), 360 (2019). [CrossRef]  

46. A. George, Z. Mostaani, D. Geissenbuhler, O. Nikisins, A. Anjos, and S. Marcel, “Biometric Face Presentation Attack Detection with Multi-Channel Convolutional Neural Network,” IEEE Trans. Inf. Forensics Secur. 15, 42–55 (2020). [CrossRef]  

47. M. Kowalski, “A Study on Presentation Attack Detection in Thermal Infrared,” Sensors 20(14), 3988 (2020). [CrossRef]  

48. M. Planck, The Theory of Heat Radiation (P. Blakiston’s Son & Co., Philadelphia, 1914).

49. J. H. Lienhard and J. V. Lienhard, A Heat Transfer Textbook, 3rd ed. (Phlogiston Press, Cambridge, 2008).

50. A. Y. Owda, N. Salmon, N. D. Rezgui, and S. Shylo, “Millimetre wave radiometers for medical diagnostics of human skin,” in IEEE SENSORS, Glasgow (2017), pp. 1–3.

51. A. Y. Owda, N. Salmon, A. J. Casson, and M. Owda, “The Reflectance of Human Skin in the Millimeter-Wave Band,” Sensors 20(5), C1 (2020). [CrossRef]  

52. Teraview website, https://teraview.com/, accessed 21/04/2020.

53. ThruVision website, https://thruvision.com/, accessed 21/04/2020.

54. J. E. Bjarnason, T. L. J. Chan, A. W. M. Lee, M. A. Celis, and E. R. Brown, “Millimeter-wave, terahertz, and mid-infrared transmission through common clothing,” Appl. Phys. Lett. 85(4), 519–521 (2004). [CrossRef]  

55. R. Knipper, A. Brahm, E. Heinz, T. May, G. Notni, H.-G. Meyer, A. Tünnermann, and J. Popp, “THz Absorption in Fabric and Its Impact on Body Scanning for Security Application,” IEEE Trans. THz Sci. Technol. 5(6), 999–1004 (2015). [CrossRef]  

56. K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016), pp. 770–778.

57. C. Szegedy, W. Liu, Y. Jia, et al., “Going Deeper with Convolutions,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015).

58. C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the Inception Architecture for Computer Vision,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016), pp. 2818–2826.

59. M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L. C. Chen, “MobileNetV2: Inverted Residuals and Linear Bottlenecks,” arXiv:1801.04381v4 (2018).

60. F. Chollet, “Xception: Deep Learning with Depthwise Separable Convolutions,” arXiv:1610.02357 (2017).

61. G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger, “Densely Connected Convolutional Networks,” arXiv:1608.06993v4 (2017).



Figures (8)

Fig. 1. Radiation model of the face covered with a facial mask.
Fig. 2. Images of sample (a) latex masks, and (b) 3D-printed masks.
Fig. 3. Graphs of the transmittance through all facial masks used during the study.
Fig. 4. Graphs of the transmittance through clothing.
Fig. 5. THz images presenting genuine subjects: (a) wearing a T-shirt, (b) wearing a T-shirt and a sweater, (c) wearing a T-shirt, a sweater and a jacket, (d) wearing a T-shirt, a sweater and a leather jacket, (e) image with the selected ROIs.
Fig. 6. Normalized pixel intensities of different PAIs across time for the subject wearing a T-shirt.
Fig. 7. THz images presenting subjects and presentation attacks using: (a) PET foil sheet, subject wearing a T-shirt, (b) paper sheet, subject wearing a T-shirt, (c) tablet, subject wearing a T-shirt, (d) PET foil sheet, subject wearing a T-shirt and a sweater, (e) paper sheet, subject wearing a T-shirt and a sweater, (f) tablet, subject wearing a T-shirt and a sweater, (g) PET foil sheet, subject wearing a T-shirt, a sweater and a jacket, (h) paper sheet, subject wearing a T-shirt, a sweater and a jacket, (i) tablet, subject wearing a T-shirt, a sweater and a jacket.
Fig. 8. THz images presenting a subject wearing facial masks: (a) 3D printed – subject wearing a T-shirt looking left, (b) 3D printed – subject wearing a T-shirt looking right, (c) 3D printed – subject wearing a T-shirt, frontal face position, (d) full-face latex mask, subject wearing a T-shirt, (e) 3D printed – subject wearing a T-shirt and a sweater, (f) full-face latex mask, subject wearing a T-shirt and a sweater, (g) 3D printed – subject wearing a T-shirt, a sweater and a jacket, (h) full-face latex mask, subject wearing a T-shirt, a sweater and a jacket.

Tables (9)

Table 1. Values of the transmission through facial masks at 250 GHz.
Table 2. Imager parameters.
Table 3. Values of transmittance through the selected clothing at 250 GHz.
Table 4. Mean normalized intensities of RoIs with paper sheets or foil sheets.
Table 5. Mean normalized intensities of RoIs with latex and 3D-printed masks.
Table 6. Results of the unknown-attack validation.
Table 7. Results of the 10-fold cross-validation.
Table 8. Results of the unknown-attack validation.
Table 9. Results of the unknown-attack validation.

Equations (9)

(1) $B_\nu(T) = \dfrac{2\nu^2}{c^2}\,kT$

(2) $q = kT$

(3) $\Phi_S = \tau_C \Phi_B$

(4) $\Phi_W = \tau_C \Phi_B + \Phi_C + \rho_C \Phi_A$

(5) $K = \dfrac{\Phi_S - \Phi_W}{\Phi_S} = \dfrac{\tau_C \Phi_B - (\tau_C \Phi_B + \Phi_C + \rho_C \Phi_A)}{\tau_C \Phi_B} = -\dfrac{\Phi_C + \rho_C \Phi_A}{\tau_C \Phi_B}$

(6) $c_p(\lambda, T) = \dfrac{\varphi_B(\lambda, T) - \varphi_C(\lambda, T)}{\varphi_B(\lambda, T)}$

(7) $\varphi_B(\lambda, T) = \varepsilon_B(\lambda, T)\, m_{BB}(\lambda, T)$

(8) $\varphi_C(\lambda, T) = \varepsilon_C(\lambda, T)\, m_{BB}(\lambda, T)$

(9) $C_P(\lambda_{min}, \lambda_{max}, T_A, T_B, T_C) = 1 - \dfrac{\int_{\lambda_{min}}^{\lambda_{max}} \tau_C(\lambda, T_A)\, \varepsilon_C(\lambda, T_C)\, m_{bb}(\lambda, T_C)\, d\lambda}{\int_{\lambda_{min}}^{\lambda_{max}} \tau_C(\lambda, T_A)\, \varepsilon_B(\lambda, T_B)\, m_{bb}(\lambda, T_B)\, d\lambda}$
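
The listed relations can be checked numerically. The following Python sketch (not part of the original article) evaluates the relative contrast K of Eq. (5) using the Rayleigh–Jeans brightness of Eq. (1); the frequency, transmittance, reflectance, and temperature values are illustrative assumptions only, chosen to mimic a masked face imaged at 250 GHz.

# Minimal numerical sketch (not from the paper): evaluating the relative
# contrast K of Eq. (5) under the Rayleigh-Jeans approximation of Eq. (1).
# All material parameters and temperatures below are illustrative assumptions.

K_B = 1.380649e-23   # Boltzmann constant [J/K]
C = 2.998e8          # speed of light [m/s]

def rayleigh_jeans(nu_hz: float, temp_k: float) -> float:
    """Spectral brightness B_nu(T) = 2 nu^2 k T / c^2 (Rayleigh-Jeans limit)."""
    return 2.0 * nu_hz**2 * K_B * temp_k / C**2

NU = 250e9                 # imaging frequency: 250 GHz
tau_c, rho_c = 0.6, 0.1    # assumed covering (mask) transmittance and reflectance

phi_b = rayleigh_jeans(NU, 310.0)                          # body (skin) radiation, ~310 K
phi_c = (1.0 - tau_c - rho_c) * rayleigh_jeans(NU, 305.0)  # assumed self-emission of the covering
phi_a = rayleigh_jeans(NU, 293.0)                          # ambient radiation reflected by the covering

phi_s = tau_c * phi_b                          # Eq. (3): body seen through the covering only
phi_w = tau_c * phi_b + phi_c + rho_c * phi_a  # Eq. (4): total flux with emission and reflection
k_contrast = (phi_s - phi_w) / phi_s           # Eq. (5): relative contrast

print(f"K = {k_contrast:.3f}")

With these assumed values K is negative, reflecting that the covering adds its own emission and reflected ambient flux on top of the attenuated body signal.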