Characterisation of multi-layered structures using a vector-based gradient descent algorithm at terahertz frequencies

Amlan kusum Mukherjee; Sven Wassmann; Konstantin Wenzel; Björn Globisch; Robert Kohlhaas; Lars Liebermeister; Sascha Preu

doi:10.1364/OE.486562

1. Introduction

The Terahertz (THz) frequency range spans between $0.1-10$ THz. Despite of the relative scarcity of table-top, non-cryogenic THz systems a few decades back, a multitude of demonstrated applications in the fields of astronomy [1,2], spectroscopy [3,4], imaging [5,6], non-destructive testing [7,8], and many others has accentuated the potential of the THz frequency range. Compact THz sources with up to milliwatts of emitted power [9–11] and sensitive semiconductor-based receivers, offering systems with $120-140$ dB peak dynamic range [12], have facilitated development of modular and portable or breadboard-sized THz systems [13], e.g., photonic spectrum analysers [14,15], and photonic vector network analysers (VNAs) [16,17]. Applications pertaining to material characterisation [18,19] and thereby, imaging [7,20,21] using THz radiation has gained pace due to its industrial applicability where it offers solutions that established X-ray or ultrasound solutions face challenges. Examples of such challenges are low density materials that offer little X-ray contrast or layer stacks thereof, or materials or assemblies with air inclusions where ultra-sound can hardly be used [22–24]. In this work, we further explore the potential of THz frequencies in characterising multilayered (optically) thin films and detecting nanometric structural variations using a VNA-like setup. In the process, we develop and demonstrate a scattering parameter (S-parameter)-based modelling technique for multilayered structures and a vector-based gradient descent algorithm to fit measured Fabry-Pérot transmission and reflection coefficients to the multilayered model.

In a free-space setup, a plane-plane, homogeneous dielectric slab of thickness $d_s$ and complex refractive index $\underline {n}$ behaves as a Fabry-Pérot Etalon for the THz beam incident on it. The field transmission coefficient $t_\text {FP}$ and reflection coefficient $r_\text {FP}$ of the Etalon read as [25]

(1)$$t_\text{FP} = \frac{(1-R)\cdot\exp(j\Phi)}{1 - R\cdot\exp(2j\Phi)}$$

(2)$$r_\text{FP} = {-}\sqrt{R} + \frac{\sqrt{R}(1-R)\cdot\exp(2j\Phi)}{1-R\cdot\exp(2j\Phi)}$$

where $R = |(\underline {n} - 1)/(\underline {n} + 1)|^2$ is the reflection coefficient at the air-sample interface, the phase $\Phi = k_0\underline {n}d_s\cos {\theta _r}$, wherein $\theta _r$ is the angle of refraction and $\underline {n} = n + j\kappa$. $n$ and $\kappa$ are the real refractive index and the extinction coefficient, respectively. Fitting Eqs. (1) and (2) to the measured transmission through and reflections from the planar dielectric slab, henceforth considered as a device under test (DUT), gives precise estimates of its thickness, refractive index and absorption coefficient as a function of frequency [16,17]. We have previously estimated thickness variations on a highly resistive silicon (HR-Si) DUT with a precision of $28$ nm by fitting the phase of $t_\text {FP}$ and thereby imaged nanometric structures as small as $\lambda /15000$ [21]. An individual pixel scan required $\sim 24$ s (and an additional $30$ s for the stabilisation of the frequency raster) for a measurement bandwidth of only $0.2$ THz with a resolution of $50$ MHz. A $25\times 25$ mm$^2$ Siemens star with $0.5$ mm of lateral resolution required $\sim 43$ hours of measurement time which is orders of magnitude too long for industrial applications.

Recent developments in fast, broadband, continuous wave (CW) spectrometers [26] can expedite this scanning process significantly. Figure 1 shows a scan conducted using the “T-Sweeper” system, a fast frequency-domain spectrometer [27], developed by Fraunhofer Heinrich Hertz Institute. Using a p-i-n diode-based THz transmitter and a photoconductive receiver, the system features a maximum scan rate of $\sim 500$ THz/s and an available scanning bandwidth of $\sim 4$ THz. “T-Sweeper” measures both amplitude and phase of the THz electric field incident on the receiver. As DUT, we etch a $10.7\pm 0.1\,\mathrm {\mu }\textrm {m}$ deep Siemens star into a $525\pm 5\,\mathrm {\mu }\textrm {m}$ thick HR-Si wafer. The images are generated using a bandwidth of $1.1$ THz with a frequency step size of $1$ GHz while averaging over $100$ scans to reduce the noise floor, resulting in an overall measurement time of $1$ s per pixel. Figure 1(a) and (c) show the physical etch-depth of the star and the refractive index values, respectively, estimated by fitting the the phase of the measured $t_\text {FP}$ to its analytical expression. Figure 1(b) and (d) show the corresponding estimates obtained from the amplitude of the Fabry-Pérot transmission coefficient, where we use a novel vector-based fitting algorithm discussed later in this manuscript.

Fig. 1. Terahertz image of a $10.7\pm 0.1\,\mathrm {\mu }\textrm {m}$ deep Siemens star etched into a $525\pm 5\,\mathrm {\mu }\textrm {m}$ thick HR-Si ($n=3.416$, $\sigma <0.01$ S/m) wafer. Fig. (a) and (b) show the physical etched depth obtained from the phase and the amplitude of the Fabry-Pérot transmission coefficient, respectively. Fig. (c) and (d) show the corresponding estimated refractive indices.

Download Full Size | PDF

As compared to the standard CW spectrometer, the “T-Sweeper” features three major differences: (i) the overall scan-time reduces by a factor of $50$ with the used settings. Higher speeds are possible by using less averaging, however, at the cost of dynamic range. The high sweep-rate of the “T-Sweeper” makes it apt for industrial deployments. (ii) The $5$ times larger scanning bandwidth of the “T-Sweeper” facilitates a higher fitting accuracy and a more precise resolution in the time-domain. As each interface produces a peak in the time domain, the $5$ times higher time domain resolution as compared to Ref. [21] eases the de-embedding of the characteristics parameters of multilayered DUTs. (iii) Even though the thickness estimates using the “T-Sweeper” can be as precise as $\sim 100$ nm [26], the lower frequency resolution of $1$ GHz and the increased incoherence between the laser signal and the incident THz signal at the receiver deteriorates the estimation precision for the physical thickness of single-layered samples to $\sim 2\,\mathrm {\mu }\textrm {m}$ in this work, compared to the demonstrated $28$ nm with the standard CW spectrometer [21].

2. Characterisation of multilayered structures

2.1 Modelling

It is cumbersome to analytically formulate Fabry-Pérot equations for multilayered structures using the transfer matrix method [28] and fitting these equations to measured data is computationally expensive. Alternatively in this work, we model the multilayered DUT as a cascade of individual single-layered DUTs surrounded by air with vanishing thickness and subsequently, use S-parameter formulations to calculate the overall transmission and reflection coefficients. Figure 2 shows the cascaded model of a multilayered DUT. We consider the left and the right side of the individual layers as ports $1$ and $2$, respectively. The complex power waves [29] $a_i^{(m)}$ and $b_i^{(m)}$ (alternatively, voltage waves or complex electric field waves) represent the incident and reflected electric field from the $m^\text {th}$ layer at port $i$ ($i=1,2$), respectively. Power waves $a_{1}$, $a_{2}$, $b_{1}$ and $b_{2}$ correspond to the overall multilayered DUT. The S-parameters relate the power waves of an individual DUT as

(3)$$\begin{bmatrix} b_1^{(m)}\\ b_2^{(m)} \end{bmatrix} = \begin{bmatrix} S^{(m)} \end{bmatrix} \begin{bmatrix} a_1^{(m)}\\ a_2^{(m)} \end{bmatrix}.$$

We can formulate the S-parameter-martrix (S-matrix) of a single layered DUT from the complex Fabry-Pérot transmission and reflection coefficients as

(4)$$\begin{bmatrix}S\end{bmatrix} = \begin{bmatrix} r_\text{FP} & t_\text{FP}\\ t_\text{FP} & r_\text{FP} \end{bmatrix}.$$

Since, the S-matrix relate the incident power waves at all ports to the reflected power waves from the corresponding ports, they cannot be directly multiplied for cascaded structures. Hence, we convert S-matrices of individual layers to the corresponding transfer parameter matrices (T-matrices), which relate the incident and reflected signal of one port to the other as

(5)$$\begin{bmatrix} a_2^{(m)}\\ b_2^{(m)} \end{bmatrix} = \begin{bmatrix}T^{(m)}\end{bmatrix} \begin{bmatrix} b_1^{(m)}\\ a_1^{(m)} \end{bmatrix}.$$

When the continuity of the electric field is maintained at the layer boundaries, the complex power waves of the consecutive cascaded DUT layers satisfy $a^{(m)}_2 = b^{(m+1)}_1$ and $b^{(m)}_2 = a^{(m+1)}_1$. Consequently, the T-matrix of the overall multilayered DUT [$T_\text {DUT}$] becomes the product of the T-matrices of the individual layers

(6)$$\begin{bmatrix} T_\text{DUT} \end{bmatrix} = \prod_{m = 1}^{N} \begin{bmatrix} T^{(m)} \end{bmatrix}.$$

The transformation back of $[T_\text {DUT}]$ to S-parameters gives the overall transmission and reflection coefficients of the modelled structure. We note that the normal electric field component is discontinuous at the layer interfaces [30] and hence, the simplistic model described above is only valid when the electric field component is tangential to the layer boundaries. This incorporates the case of normally incident THz beam on the DUT or the s-polarised beam for oblique incidences.

Fig. 2. A plane-plane multilayered DUT is modelled as a cascaded structure of multiple single-layered DUTs. In free-space setups, the matched load $Z_L$ is implicitly implemented as no reflections occur after the DUT. $V_s$ and $Z_s$ are the source voltage and impedance, respectively.

Download Full Size | PDF

2.2 Vector-based gradient descent algorithm (VGDA)

In a multilayered DUT, if individual layers are optically thick and a large enough scanning bandwidth is available to temporally resolve the reflections at the individual layer interfaces, the delay between the recurring reflections provides estimates of refractive index and thickness of the constituting layers [31,32]. When these reflections are temporally overlapping but are discernible, algorithms fit analytical Fabry-Pérot functions to the measured data, either in time or frequency domain [7,21,26]. When the peaks are indiscernible or merged, neural networks have been used to extract the layer thicknesses non-analytically [33]. Rather than formulating an error function for the multilayered DUT, we show a vector-based gradient descent technique to fit the measured data to the S-parameters-based model and determine the best set of parameters fitting the measured data.

The vector-based implementation is similar to Nelder-Mead simplex algorithm (NMA) for minima finding [34], where the error between measured and the estimated models is assumed to be a continuous hyper-plane. Figure 3(a) visualises the simplistic case of two optimisation parameters $x$ and $y$, where the vertical $z$-axis indicates the error between measured and modelled S-parameters of the DUT. The area vector ${\mathop {{V}}\limits^{\rightarrow}}$ of $\Delta ABC$ created by any three points on the error plane is normal to the plane of the triangle and always tilts towards the direction of lower local error. In each iteration, we swap the point of maximum error in $\Delta ABC$ with a newly calculated point along the tilt of ${\mathop {{V}}\limits^{\rightarrow}}$ and recalculate the tilt of the area vector. Iterative update of points of $\Delta ABC$, with adaptive displacements, moves the triangle to the minimum of the error plane. The process is analogous to a ball located at the point with the minimum error on $\Delta ABC$ rolling down the error plane. Figure 3(b) shows an optimisation trace, in which the point with minimum error in $\Delta ABC$ heuristically moves from a random starting point (red cross) to the position of minimum error (green cross). The minima-seeker point moves through the error plane heuristically, where the direction of movement is determined only by the three-dimensional location of points constituting $\Delta ABC$.

Fig. 3. Visualisation of a two-parameter optimisation process. (a) The axes $x$ and $y$ plot the parameters to be optimised and the $z$-axis represents the 2-norm error between the estimated model and the actual measurements. The area vector ${\mathop {{V}}\limits^{\rightarrow}}$ of triangle $\Delta ABC$, calculated at the point with minimum error C, tilts towards lower local error. An imaginary ball placed on the error-plane would tend to move towards the projection of ${\mathop {{V}}\limits^{\rightarrow}}$ on the $x-y$ plane. (b) The trace of a minima-seeker point after a successful optimisation run. Even though the minima-seeker point is oblivious to the entire error plane, it starts from the red cross on the error plane and reaches the point with minimum error (green cross).

Download Full Size | PDF

The optimisation algorithm can be extended to N parameters by substituting vector cross-product by the wedge product [35,36] to compute the higher dimensional analogue of the area vector. Figure 4 shows a detailed flow-diagram of the algorithm. The images depict a two-parameter case for the simplicity. The algorithm takes three $N\times 1$ arrays as inputs for an $N$-parameter optimisation, namely upper and lower limits of the $N$ optimisation parameters and their resolution accuracy of estimates, in order. Additionally, the functions for parameter normalisation and error calculation are modular and their handles are also provided as inputs.

1. In the initialisation step, the algorithm computes three parameters:
- (a) an initial displacement value $D = L_{diag}/8$,
- (b) a running displacement threshold, $D_{th}=D/2$ and
- (c) the minimum threshold displacement $D_{min}$.
2. We choose $N+1$ points on the $N$-dimensional plane created by the optimisation parameters within the fitting bounds. Note that each point represents a set of parameters describing the analytical model. We set the $(N+1)$^th dimension of each point as the error between the measured data and the analytical model. $P_{set}$ is a matrix of order $(N+1)\times (N+1)$ containing all the $N+1$ points. $(7)$$\begin{bmatrix} P_{set} \end{bmatrix}= \begin{bmatrix} P_1\\P_2\\\vdots\\P_{N+1} \end{bmatrix}= \begin{bmatrix} P_{i,j} \end{bmatrix}_{(N+1)\times(N+1)},$$$ where $P_{i,j}$ refers to the $j$^th parameter of the $i$^th point and all $P_{i,(N+1)}$ are the errors between measured data and model generated with the $i$^th parameter-set $P_{i,q},\ \forall \ q\in [1,N]$.
For instance, assuming two optimisation parameters $x$ and $y$, the three initial point vectors $P_1$, $P_2$ and $P_3$ are set to ($x_{max}, y_{max}$), ($x_{max}$, $y_{max} - \Delta y_{lim}/4$) and ($x_{max} - \Delta x_{lim}/4$), respectively. $\Delta x_{lim}$ and $\Delta y_{lim}$ are the optimisation spans for the parameters $x$ and $y$, respectively.
3. Iterative optimisation involving the following steps continues until either $D_{th}$ becomes less than $D_{min}$, i.e., the expected optimisation accuracy is attained or maximum number of iterations is reached.
- (a) We find ${\mathop {{P}}\limits^{\rightarrow}}_{min}$ as the point with minimum error in $P_{set}$.
- (b) $N$ line vectors (${\mathop {{V}}\limits^{\rightarrow}}_{1}$, ${\mathop {{V}}\limits^{\rightarrow}}_{2}$,…, ${\mathop {{V}}\limits^{\rightarrow}}_{N}$) join other $N$ points to the ${\mathop {{P}}\limits^{\rightarrow}}_{min}$.
- (c) The wedge product ${\mathop {{V}}\limits^{\rightarrow}}_{p}$ of all $N$ vectors at ${\mathop {{P}}\limits^{\rightarrow}}_{min}$ is computed as $(8)$${\mathop{{V}}\limits^{\rightarrow}}_{p} = \begin{vmatrix} \hat{e_{1}} & \hat{e_{2}} & \ldots & \hat{e}_{N+1}\\ V_{1,1} & V_{1,2} & \ldots & V_{1,N+1} \\ V_{2,1} & V_{2,2} & \ldots & V_{2,N+1} \\ \vdots & \vdots & \ddots & \vdots \\ V_{N,1} & V_{N,2} & \ldots & V_{N,N+1} \end{vmatrix}, $$$ where $\hat {e}_i$ is the unit vector along the $i$^th dimension, $|M|$ is denotes the determinant of the matrix $M$, $V_{j, k}$ is the component in $k$-th dimension for $j$-th vector. For $N = 2$, eqn. (8) is the cross-product of ${\mathop {{V}}\limits^{\rightarrow}}_{1}$ and ${\mathop {{V}}\limits^{\rightarrow}}_{2}$.
- (d) The projection of ${\mathop {{V}}\limits^{\rightarrow}}_{p}$ on the $N$-dimensional plane formed by the $N$ optimisation parameters ${\mathop {{V}}\limits^{\rightarrow}}_{p,N}$ indicates the direction of lower local error. A new point vector ${\mathop {{P}}\limits^{\rightarrow}}_{new}$ is estimated as ${\mathop {{P}}\limits^{\rightarrow}}_{min} + D\cdot \hat {V}_{p,N}$, where $\hat {V}_{p,N}$ is the unit vector along ${\mathop {{V}}\limits^{\rightarrow}}_{p,N}$.
- (e) If ${\mathop {{P}}\limits^{\rightarrow}}_{new}$ exceeds the limits along any dimensions, it is reflected inside the limiting bounds along those dimensions. This keeps the optimisation algorithm within its limits.
- (f) ${\mathop {{P}}\limits^{\rightarrow}}_{new}$ replaces the point vector with maximum error in $P_{set}$.
- (g) $D$ and $D_{th}$ is additively increased by $5\%$ in each iteration. However, when $|{\mathop {{P}}\limits^{\rightarrow}}_{new} - {\mathop {{P}}\limits^{\rightarrow}}_{min}| < D_{th}$ or the distance of current ${\mathop {{P}}\limits^{\rightarrow}}_{new}$ from a previously calculated ${\mathop {{P}}\limits^{\rightarrow}}_{new}$ is less than $D_{th}$, both $D$ and $D_{th}$ is halved, i.e., they undergo a multiplicative decrease. The multiplicative decrease helps in accurately finding the error minimum, whereas, the additive increase maintains an optimal displacement rate to reach the minimum faster. Similar tactics are used in transfer control protocol (TCP) for probing network traffic during congestion control [37].
4. The final $P_{new}$ is returned as the optimal point with minimum error when the iteration ends.

Fig. 4. Flow diagram of the VGDA. After initialisation, the points $P_1$, $P_2$ and $P_3$ create a triangle, where $P_1$ features the largest error and $P_3$, the smallest. ${\mathop {{R}}\limits^{\rightarrow}}$, the projection of area vector of $\Delta P_1P_2P_3$, points towards a lower local error. The new point ${\mathop {{P}}\limits^{\rightarrow}}_{new}$ is calculated at a distance $D$ from $P_3$ along ${\mathop {{R}}\limits^{\rightarrow}}$. Steps $1-3$ are iterated with additive increase and multiplicative decrease of $D$ to move the $\Delta P_1P_2P_3$ and thereby, find the point with minimum error.

Download Full Size | PDF

Since the wedge product in Eq. (8) is anti-commutative, we calculate errors along both $\hat {V}_{p,N}$ and $-\hat {V}_{p,N}$ in each iteration and we assume the one with lower error as ${\mathop {{P}}\limits^{\rightarrow}}_{new}$. The optimisation must be restarted from another starting point in case the algorithm optimises to a local minimum. In this work, we have used “standard score” normalisation for all the optimisation parameters and use 2-norm distance between the modelled and the measured datasets as error. The optimisation time depends on the complexity of the error calculating function and expected accuracy of the fitting parameters, and hence, on the number of iterations. Additionally, in some instances, optimisation results can be further improved by making the normalisation function non-linear, for instance, by using “Simgoid” function. Use cases of standard score and Simgoid normalisation functions are briefly discussed in Section 3 of Supplement 1.

As a proof of concept, we fit a dataset measured with optical ellipsometry using a wavelength range between $790-1700$ nm with s-polarised illumination to S-parameters-based model of the $11$-layered DUT using the VGDA. The DUT consists of a $11$ alternating layers of SiN and SiO chemical vapour-deposited on a p-doped silicon wafer to be used as a DBR for vertical-cavity surface emitting lasers (VCSELs). The set angle of incidence is $\sim 40^\circ$. Since, the refractive indices of SiN, SiO and p-doped silicon substrate are known, we model the DBR with the angle of incidence, thicknesses of the SiN and SiO layers as optimisation parameters. Figure 5(a) depicts the schematic of the model. The thicknesses of all the SiN (also, SiO) layers are assumed to be equal. Figure 5(b) shows an excellent agreement between the measured and modelled DBRs, wherein we estimate the angle of incidence to be $41.94^\circ$ and thickness of SiN and SiO layers to be $202.0$ nm and $230.1$ nm, respectively. Further details of the optimisation run are available in Table 1. These thickness estimates deviates from the values obtained from the ellipsometric modelling by $2\%$ and probably results from the inaccurate angle of incidence input used for the model of the ellipsometer data.

Fig. 5. Ellipsometric measurements with s-polarisation of a distributed Bragg reflector (DBR) used in VCSELs for $1550$ nm operation. (a) The DBR is modelled as $11$ alternating layers of SiN ($n=1.925$) and SiO ($n=1.451$), supported by a p-doped silicon substrate. (b) Comparison between the measured and the estimated reflection coefficient of the DBR from the fitted model. The approximate thickness estimates obtained from the ellipsometer for the silicon nitride (SiN) and silicon oxide (SiO) layers are $198$ nm and $234$ nm, respectively.

Download Full Size | PDF

Table 1. Parameter space, termination criterion and number of required iterations for the optimisation run of the DBR structure shown in Fig. 5(b).

View Table | View all tables in this article

3. Applications

The demonstrated VGDA is a generic minima finding algorithm useful for cases where it is difficult to analytically formulate of the optimisation problem. We demonstrate two further use cases of the algorithm in the terahertz domain by characterising a multilayered DUT comprised of optically thin layers and imaging of a $50$ nm high silicon carbide (SiC) Siemens star deposited on a HR-Si substrate from the amplitude information of the Fabry-Pérot transmission coefficient, which was previously not possible due to high noise floor of the amplitude estimates [21].

3.1 Characterisation of thin films

In the first THz application case, the DUT is a three layered structure where a block of ambient air is sandwiched between a layer of adhesive tape and cling wrap (alternatively known as food wrap or plastic wrap), manufactured from polyvinyl chloride (PVC) and low-density polyethylene (LDPE), respectively. Figure 6(a) shows a schematic of the DUT. The thickness of the adhesive tape and Cling wrap measured with a Dektak profilometer are $52.37\pm 0.01\,\mathrm {\mu }\textrm {m}$ and $11.45\pm 0.01\,\mathrm {\mu }\textrm {m}$, respectively. The air-gap in between measures to $5.01 \pm 0.005$ mm with digital calipers. The refractive index and $\tan \delta$ of PVC at 1 THz are 1.62 and 0.05 [16], whereas they are $\sim 1.5$ and $\sim 0$ for LDPE, respectively [38]. The DUT is laterally probed over 25 mm with a resolution of 0.5 mm. Figure 6(b) shows the estimated thicknesses of individual layers and their corresponding refractive indices. The results are obtained from fitting the amplitude of the Fabry-Pérot transmission coefficient, scanned over a bandwidth of $1$ THz with the “T-Sweeper” system.

Fig. 6. (a) Schematic of the triple-layered DUT consisting of a air-gap sandwiched between the adhesive tape and Cling wrap layers. The transmitted THz signal through the DUT undergoes multiple reflections at the layer interfaces. A spatial scan is conducted over a length of $25$ mm with a $0.5$ mm resolution across the sample to quantify the estimation precision. (b)Thickness and refractive index estimates of a triple-layered DUT constituting adhesive tape, air and cling wrap obtained using VGDA. A spatial scan over $25$ mm is conducted with a $0.5$ mm spatial resolution.

Download Full Size | PDF

Table 2 lists the estimated thicknesses and refractive indices obtained from the VGDA. Estimates using constrained NMA are added for comparison, where we use the same function to calculate error. We further calculate the relative standard error (RSE) of the estimates with respect to the measured physical thickness of each layer from the $51$ measured points. The RSE for the vector-based algorithm for the adhesive tape and air layer is below $2\%$, whereas, the RSE of the NMA-estimates are slightly higher. For the cling warp layer, the vector based gradient descent algorithm produces more accurate thickness estimates than the NMA, however RSE for both the algorithms is higher than $4.5\%$. The estimated refractive index of the adhesive tape is higher than that of PVC which is due to the adhesive present on its surface. The estimated refractive index of the Cling wrap has a higher error margin and can be attributed to the low index contrast between air and the LDPE layer. In this example, the VGDA optimised the measured data to the multilayered model faster than the NMA, clocking at $90.50$ s for $51$ pixels, whereas the NMA required $191.42$ s. We note that the Cling wrap is optically too thin to be accurately characterised using $\leq 1$ THz bandwidth. Larger frequency scans may improve the precision (and accuracy) of the estimates.

Table 2. Mean estimates and RSE of the thicknesses and refractive indices of the multilayered DUT formed by adhesive tape, air and Cling wrap. Estimates using NMA and VGDA are compared to the mechanically measured thicknesses of the constituent layers.

View Table | View all tables in this article

A frequency scan of $1$ THz results in a Fourier-limited spatial resolution of $300\,\mathrm {\mu }\textrm {m}$, which is significantly larger than the optical thicknesses of both the adhesive tape and the cling wrap. The standard time-of-flight-based estimates in TDS would require at least $20$ THz of bandwidth to resolve these layers. Additionally, the refractive index contrast between PVC, LDPE and air is very low and hence, reflections at the air-PVC (or LDPE) interfaces are minimal. To our knowledge, this exemplifies the first characterisation of a multilayered DUT, consisting of layers with optical thicknesses $\leq 15\,\mathrm {\mu }\textrm {m}$, using $1$ THz bandwidth or less and without using any high refractive index substrate.

In our models we have considered a constant value of refractive index for simplicity and hence, the estimated refractive indices shown in Table 2 are an average value over the measurement frequencies. To estimate frequency-dependent refractive indices of dispersive media using VGDA, we propose the following two ways:

1. Considering the average estimated refractive index over the whole frequency span, further iterations of VGDA optimise the model over discrete smaller frequency spans to obtain their corresponding averaged refractive indices. A frequency dependent refractive index is obtained by an interpolation of these discrete values.
2. The frequency-dependent refractive index is modelled as a function of frequency and the model-coefficients are used a input parameters to the VGDA instead of the refractive index.

3.2 Imaging a 50 nm thick Siemens star

Precise thickness estimates can be used to image surface-structured DUTs. In Fig. 1(b) and (d), we obtained the refractive index and thickness of the etched HR-Si wafer simultaneously using the VGDA. Note that the refractive index estimates of silicon are more accurate in the amplitude-based estimates in Fig. 1(d) than the phase-based estimates in Fig. 1(b). However, since the precision of the measurements conducted by employing the “T-Sweeper” system is $\sim 2\,\mathrm {\mu }\textrm {m}$, we image the $50$ nm-thick Siemens star using a standard CW system at a scanning bandwidth of $0.2$ THz (i.e., $\sim 0.6-0.8$ THz).

The DUT is a $525\pm 5\,\mathrm {\mu }\textrm {m}$-thick HR-Si wafer with a $50\pm 10$ nm thick SiC layer atop, deposited in the shape of a Siemens star using chemical vapour deposition. Since, the SiC layer is extremely thin ($\sim \lambda /8500$), the layer can be considered as a perturbation of the substrate height [21]. Here, the VGDA optimises the refractive index and optical thickness of a single-layered DUT to the measured amplitude of the Fabry-Pérot transmission coefficient. Figure 7 shows the THz image of the $50$ nm star, where we follow the same image enhancing techniques in post-processing as detailed in our previous works [21,39]. Figure 7(d) is the THz image of the $50$ nm high Siemens star and Fig. 7(f) shows the corresponding optical micrograph. Figure 7(e) plots the optical thickness of a line-scan, marked by the white arrow in Fig. 7(d), and depicts an average optical thickness difference of $\sim 160$ nm. The corresponding refractive index of $\sim 3.2$ for the SiC layer is in agreement with the literature value of $n_o = 3.15\pm 0.02$ and $n_e = 3.24\pm 0.02$ of 4H-SiC between $0.6-0.8$ THz [40]. The large error bar reflects the fact that chemical-vapor deposited materials may differ slightly in refractive index. Section 1 of the Supplement 1 document shows a further imaging example of a 250 nm-thick Siemens star using VGDA.

Fig. 7. Terahertz image of a $50$ nm thick SiN Siemens star on a $525\pm 5\,\mathrm {\mu }\textrm {m}$ thick HR-Si wafer. (a) Raw data of the heights estimated from the amplitude of the Fabry-Pérot transmission coefficients. (b) The surface warping of the HR-Si substrate. The warping is calculated from the $5$^th order plane fit to (a). (c) Subtracting (b) from (a) makes the star visible. (d) The contrast of (c) is enhanced by removing outlier estimates. (e) A linescan at $x=-10$ mm shows clearly an optical height difference of $160$ nm. (f) The microscopic image of the Siemens star.

Download Full Size | PDF

The achieved precision of the VGDA calculated from the standard deviation of the height data is $22$ nm which is better than the precision of the phase-fitting algorithm employed in [21]. Further, the VGDA algorithm is able to image the $50$ nm star from the amplitude information of $t_\text {FP}$, which was not previously possible. From optical thickness and fitted refractive index the physical thickness is determined to $50$ nm and thus in excellent agreement with the true height measured with a Dektak profilometer that features an accuracy and precision of $\sim 10$ nm. We therefore conclude that not only the precision but also the accuracy is of the order of $10-20$ nm.

4. Conclusion

In this work, we presented a novel vector-based implementation of the gradient descent algorithm and explored its applications in the characterisation of multilayered structures and imaging nanometric thickness variations in planar DUTs. Enabled by the fast CW THz spectrometer, we were able to measure absolute optical thicknesses as low as $15\,\mathrm {\mu }\textrm {m}$ in a multilayered DUT. Using a standard CW spectrometer, we demonstrated measurements of relative thickness variations of $50$ nm for the (quasi) single-layered DUT. The algorithm is practical when analytically formulating the error (or optimisation) function is cumbersome. The convergence of the VGDA is more robust than the NMA, where the convergence significantly depends on the starting point. The VGDA performs on-par with the constrained NMA, implemented using the “interior-point” algorithm [41], and Section 2 of the Supplement 1 document, compares the performance of VGDA and NMA for evaluating a DBR structure at THz frequencies that the former optimises more accurately as the number of optimisation parameters increases. However, standard NMA finds the minima of the Rosenbrock function [42] faster than the current implementation of VGDA. The algorithm is currently implemented in MATLAB and can be further sped-up when implemented in Python, C or even hard-coded in an FPGA. Furthermore, the VGDA can be integrated with other modelling software, e.g., CST microwave studio, or used to optimise ellipsometric measurements. We remark that the algorithm can be applied to more general cases and topics. For instance, for a desired ideal band-pass characteristics of a DBR with predefined pass and stop-band specifications, the algorithm can predict the number of layers necessary and their corresponding thicknesses. An off-topic example is optimal parameter predictions, e.g., in stock markets, given a certain initial condition and access to historical data.

Funding

European Research Council (713780).

Acknowledgments

We thank Alonso Ingar Romero for his help with the data acquisition and Dr. Irina Harder and the group for Nanofabrication TDSU1 from the Max Planck Institute for the Science of Light (MPL), Erlangen for their expertise on the DRIE process, along with providing access to the MPL cleanroom and dry etching facilities, MPL for the Siemens stars.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Supplemental document

See Supplement 1 for supporting content.

References

1. G. J. Stacey, “THz Low Resolution Spectroscopy for Astronomy,” IEEE Trans. Terahertz Sci. Technol. 1(1), 241–255 (2011). [CrossRef]

2. G. Bubnov, A. Gunbina, I. Lesnov, M. Mansfeld, F. Kovalev, R. Alekseev, A. Korobeynikova, S. Vdovichev, and V. Vdovin, “Development and research of sub-terahertz astronomy and telecommunication equipment,” AIP Conf. Proc. 2304, 020013 (2020). [CrossRef]

3. Y. Peng, C. Shi, Y. Zhu, M. Gu, and S. Zhuang, “Terahertz spectroscopy in biomedical field: a review on signal-to-noise ratio improvement,” PhotoniX 1(1), 12 (2020). [CrossRef]

4. M. C. Beard, G. M. Turner, and C. A. Schmuttenmaer, “Terahertz Spectroscopy,” J. Phys. Chem. B 106(29), 7146–7159 (2002). [CrossRef]

5. E. Castro-Camus, M. Koch, and D. M. Mittleman, “Recent advances in terahertz imaging: 1999 to 2021,” Appl. Phys. B 128(1), 12 (2022). [CrossRef]

6. P. Hillger, J. Grzyb, R. Jain, and U. R. Pfeiffer, “Terahertz Imaging and Sensing Applications With Silicon-Based Technologies,” IEEE Trans. Terahertz Sci. Technol. 9(1), 1–19 (2019). [CrossRef]

7. D. Molter, K.-S. Ellenberger, J. Klier, S. Duran, J. Jonuscheit, G. von Freymann, N. Vieweg, and A. Deninger, “Kilohertz pixel-rate multilayer terahertz imaging of subwavelength coatings,” Appl. Sci. 12(10), 4964 (2022). [CrossRef]

8. D. Nuessler and J. Jonuscheit, “Terahertz based non-destructive testing (ndt),” tm - Technisches Messen 88, 1 (2020). [CrossRef]

9. E. Peytavit, P. Latzel, F. Pavanello, G. Ducournau, and J. Lampin, “CW Source Based on Photomixing With Output Power Reaching 1.8 mW at 250 GHz,” IEEE Electron Device Lett. 34(10), 1277–1279 (2013). [CrossRef]

10. A. Al-Khalidi, K. H. Alharbi, J. Wang, R. Morariu, L. Wang, A. Khalid, J. M. L. Figueiredo, and E. Wasige, “Resonant Tunneling Diode Terahertz Sources With up to 1 mW Output Power in the J-Band,” IEEE Trans. Terahertz Sci. Technol. 10(2), 150–157 (2020). [CrossRef]

11. “Frequency multipliers (wr and d series),” https://www.vadiodes.com/en/frequency-multipliers. Accessed: 2021-01-26.

12. P.-K. Lu, D. Turan, and M. Jarrahi, “High-sensitivity telecommunication-compatible photoconductive terahertz detection through carrier transit time reduction,” Opt. Express 28(18), 26324–26335 (2020). [CrossRef]

13. K. Sengupta, T. Nagatsuma, and D. M. Mittleman, “Terahertz integrated electronic and hybrid electronic–photonic systems,” Nat. Electron. 1(12), 622–635 (2018). [CrossRef]

14. A. D. J. Fernandez Olvera, B. L. Krause, and S. Preu, “A true optoelectronic spectrum analyzer for millimeter waves with hz resolution,” IEEE Access 9, 114339–114347 (2021). [CrossRef]

15. B. L. Krause, A. D. J. F. Olvera, and S. Preu, “Photonic spectrum analyzer for wireless signals in the thz range,” IEEE Access 10, 42061–42068 (2022). [CrossRef]

16. F. R. Faridi and S. Preu, “Pulsed free space two-port photonic vector network analyzer with up to 2 THz bandwidth,” Opt. Express 29(8), 12278–12291 (2021). [CrossRef]

17. A. D. J. Fernandez Olvera, A. K. Mukherjee, and S. Preu, “A Fully Optoelectronic Continuous-Wave 2-Port Vector Network Analyzer Operating From 0.1 THz to 1 THz,” IEEE J. Microw. 1(4), 1015–1022 (2021). [CrossRef]

18. M. Naftaly, N. Vieweg, and A. Deninger, “Industrial applications of terahertz sensing: State of play,” Sensors 19(19), 4203 (2019). [CrossRef]

19. M. Mueh, M. Maasch, R. A. Knieß, H. U. Göringer, and C. Damm, “Detection of african trypanosomes using asymmetric double-split ring based thz sensors,” IEEE J. Electromagn. RF Microwaves Med. Biol. 1(2), 66–73 (2017). [CrossRef]

20. K. Ahi, “A method and system for enhancing the resolution of terahertz imaging,” Measurement 138, 614–619 (2019). [CrossRef]

21. A. Ingar Romero, A. K. Mukherjee, A. Fernandez Olvera, M. Méndez Aller, and S. Preu, “Visualizing nanometric structures with sub-millimeter waves,” Nat. Commun. 12(1), 7091 (2021). [CrossRef]

22. M. A. Anderson, W. L. Newmeyer, and E. S. Kilgore, “Diagnosis and treatment of retained foreign bodies in the hand,” Am. J. Surg. 144(1), 63–67 (1982). [CrossRef]

23. D. Driskell, J. Gillum, J. Monti, and A. Cronin, “Ultrasound evaluation of soft-tissue foreign bodies by US army medics,” J. Med. Ultrasound 26(3), 147–152 (2018). [CrossRef]

24. J. Davis, B. Czerniski, A. Au, S. Adhikari, I. Farrell, and J. M. Fields, “Diagnostic accuracy of ultrasonography in retained soft tissue foreign bodies: A systematic review and meta-analysis,” Acad. Emerg. Med. 22(7), 777–787 (2015). [CrossRef]

25. G. Hernandez, Fabry-Perot Interferometers (Cambridge University Press, 1988), chap. 2, Cambridge Studies in Modern Optics.

26. L. Liebermeister, S. Nellen, R. B. Kohlhaas, S. Lauck, M. Deumer, S. Breuer, M. Schell, and B. Globisch, “Terahertz multilayer thickness measurements: Comparison of optoelectronic time and frequency domain systems,” J. Infrared, Millimeter, Terahertz Waves 42, 1153–1167 (2021). [CrossRef]

27. J. Kutz, L. Liebermeister, N. Vieweg, K. Wenzel, R. Kohlhaas, and M. Naftaly, “A terahertz fast-sweep optoelectronic frequency-domain spectrometer: Calibration, performance tests, and comparison with tds and fds,” Appl. Sci. 12(16), 8257 (2022). [CrossRef]

28. M. Born, E. Wolf, A. B. Bhatia, P. C. Clemmow, D. Gabor, A. R. Stokes, A. M. Taylor, P. A. Wayman, and W. L. Wilcock, Principles of Optics: Electromagnetic Theory of Propagation, Interference and Diffraction of Light (Cambridge University Press, 1999), 7th ed.

29. K. Kurokawa, “Power Waves and the Scattering Matrix,” IEEE Trans. Microwave Theory Tech. 13(2), 194–202 (1965). [CrossRef]

30. C. Yeh and F. I. Shimabukuro, Fundamental Electromagnetic Field Equations (Springer US, Boston, MA, 2008), pp. 11–53.

31. S. Krimi, J. Klier, J. Jonuscheit, G. von Freymann, R. Urbansky, and R. Beigang, “Highly accurate thickness measurement of multi-layered automotive paints using terahertz technology,” Appl. Phys. Lett. 109(2), 021105 (2016). [CrossRef]

32. D. G. A. Ibrahim, S. K. H. Khalil, H. H. A. Sherif, and M. M. Eloker, “Multilayer film thickness measurement using ultrafast terahertz pulsed imaging,” J. Phys. Commun. 3(3), 035013 (2019). [CrossRef]

33. N. S. Schreiner, W. Sauer-Greff, R. Urbansky, G. von Freymann, and F. Friederich, “Multilayer thickness measurements below the rayleigh limit using fmcw millimeter and terahertz waves,” Sensors 19(18), 3910 (2019). [CrossRef]

34. J. A. Nelder and R. Mead, “A Simplex Method for Function Minimization,” The Comput. J. 7(4), 308–313 (1965). [CrossRef]

35. P. Lounesto, Clifford Algebras and Spinors (Springer Netherlands, Dordrecht, 1986).

36. H. Grassmann, Die Lineale Ausdehnungslehre ein neuer Zweig der Mathematik: Dargestellt und durch Anwendungen auf die übrigen Zweige der Mathematik, wie auch auf die Statik, Mechanik, die Lehre vom Magnetismus und die Krystallonomie erläutert, Cambridge Library Collection - Mathematics (Cambridge University Press, 2012).

37. D.-M. Chiu and R. Jain, “Analysis of the increase and decrease algorithms for congestion avoidance in computer networks,” Comput. Networks ISDN Syst. 17(1), 1–14 (1989). [CrossRef]

38. W. R. Folks, S. K. Pandey, and G. Boreman, “Refractive index at thz frequencies of various plastics,” in Optical Terahertz Science and Technology, (Optica Publishing Group, 2007), p. MD10.

39. A. K. Mukherjee and S. Preu, “Emulating a broadband frequency sweep for terahertz thin-film imaging,” in 2022 47th International Conference on Infrared, Millimeter and Terahertz Waves (IRMMW-THz), (2022), pp. 1–4.

40. M. Naftaly, J. F. Molloy, B. Magnusson, Y. M. Andreev, and G. V. Lanskii, “Silicon carbide – a high-transparency nonlinear material for thz applications,” Opt. Express 24(3), 2590–2595 (2016). [CrossRef]

41. “fmincon – find minimum of constrained nonlinear multivariable function,” https://de.mathworks.com/help/optim/ug/fmincon.html. Accessed: 2023-01-03.

42. H. H. Rosenbrock, “An Automatic Method for Finding the Greatest or Least Value of a Function,” The Comput. J. 3(3), 175–184 (1960). [CrossRef]

Parameters	Bounds	Termination criterion	Req. Iterations
SiO thickness	[ $200, 250$ ] nm	Normalised $D_{t h} < 10^{- 4}$ or iterations $> 1000$	67
SiN thickness	[ $180, 230$ ] nm
Angle of incidence	[ $35^{\circ}, 45^{\circ}$ ]

Layer	Thickness (Mech.)	Algo.	Thickness		Refractive Index
Layer	Thickness (Mech.)	Algo.	Est.	RSE	Est.	RSE
Adhesive tape	$53.27 \pm 0.01 μ m$	VGDA	56.23 $μ m$	1.43%	1.83	0.60%
Adhesive tape	$53.27 \pm 0.01 μ m$	NMA	55.79 $μ m$	1.70%	1.82	0.85%
Air	5.01 $\pm$ 0.005 mm	VGDA	5.05 mm	0.35%	–	–
Air	5.01 $\pm$ 0.005 mm	NMA	5.01 mm	0.47%	–	–
Cling wrap	$11.45 \pm 0.09 μ m$	VGDA	12.22 $μ m$	3.04%	1.31	1.14%
Cling wrap	$11.45 \pm 0.09 μ m$	NMA	8.01 $μ m$	5.42%	1.35	1.16%

Parameters	Bounds	Termination criterion	Req. Iterations
SiO thickness	[ $200, 250$ ] nm	Normalised $D_{t h} < 10^{- 4}$ or iterations $> 1000$	67
SiN thickness	[ $180, 230$ ] nm
Angle of incidence	[ $35^{\circ}, 45^{\circ}$ ]

Layer	Thickness (Mech.)	Algo.	Thickness		Refractive Index
Layer	Thickness (Mech.)	Algo.	Est.	RSE	Est.	RSE
Adhesive tape	$53.27 \pm 0.01 μ m$	VGDA	56.23 $μ m$	1.43%	1.83	0.60%
Adhesive tape	$53.27 \pm 0.01 μ m$	NMA	55.79 $μ m$	1.70%	1.82	0.85%
Air	5.01 $\pm$ 0.005 mm	VGDA	5.05 mm	0.35%	–	–
Air	5.01 $\pm$ 0.005 mm	NMA	5.01 mm	0.47%	–	–
Cling wrap	$11.45 \pm 0.09 μ m$	VGDA	12.22 $μ m$	3.04%	1.31	1.14%
Cling wrap	$11.45 \pm 0.09 μ m$	NMA	8.01 $μ m$	5.42%	1.35	1.16%

Characterisation of multi-layered structures using a vector-based gradient descent algorithm at terahertz frequencies

Abstract

1. Introduction

2. Characterisation of multilayered structures

2.1 Modelling

2.2 Vector-based gradient descent algorithm (VGDA)

3. Applications

3.1 Characterisation of thin films

3.2 Imaging a 50 nm thick Siemens star

4. Conclusion

Funding

Acknowledgments

Disclosures

Data availability

Supplemental document

References

Supplementary Material (1)

Data availability

Cited By

Figures (7)

Tables (2)

Equations (8)

Optics Express

Amlan kusum Mukherjee	https://orcid.org/0000-0001-9239-6118
Konstantin Wenzel	https://orcid.org/0009-0000-7145-7874
Lars Liebermeister	https://orcid.org/0000-0001-9415-5051
Sascha Preu	https://orcid.org/0000-0003-0818-1865