Optica Publishing Group

Characterization of the displacement response in chromatic confocal microscopy with a hybrid radial basis function network

Open Access

Abstract

Characterization of the displacement response is critical for accurate chromatic confocal measurement. Current characterization methods usually provide a linear or polynomial relationship between the extracted peak wavelengths of the spectral signal and displacement. However, these methods are susceptible to errors in the peak extraction algorithms and errors in the selected model. In this paper, we propose a hybrid radial basis function network method to characterize the displacement response. With this method, the peak wavelength of the spectral signal is first extracted with a state-of-the-art peak extraction algorithm, after which a higher-accuracy chromatic dispersion model is applied to determine the displacement-wavelength relationship. Lastly, a radial basis function network is optimized to provide a mapping between the spectral signals and the residual fitting errors of the chromatic dispersion model. Experimental tests show that the hybrid radial basis function network method significantly improves the measurement accuracy compared with existing characterization methods.

© 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Confocal microscopy has an increasing number of applications in biological science [1,2], materials characterization [3], microstructure topography measurement [4–6], and shape measurement [7–9]. In order to achieve its depth discrimination capability [10,11], confocal microscopy needs to scan the sample layer by layer in the axial direction with a high-accuracy vertical actuator, such as a piezoelectric (PZT) actuator. The scan produces a Gaussian-like intensity curve on the photodetector, referred to as the axial response signal (ARS) [12]. By extracting the peak position of the ARS, the sample surface height can be determined.

As a variant on confocal microscopy, chromatic confocal microscopy (CCM) can achieve axial scanning with a chromatically dispersed objective, i.e., a hyperchromatic objective [13,14]. More specifically, the hyperchromatic objective allows different spectral components of a broadband light source to be focused on different heights [15,16]. With the combination of the confocal illumination/detection setup and a hyperchromatic objective, a spectral ARS is obtained in CCM during the measurement process. Also, different heights or displacements correspond to spectral ARSs with different peak wavelengths [17,18]. Thus, the surface height or displacement can be decoded from the characterized response relationship between the peak wavelengths of the spectral ARSs and known displacements [19]. Therefore, an accurate displacement-wavelength response model is essential to chromatic confocal measurement [20].

Fainman et al. [21–23], Shi et al. [24], Kuang et al. [25] and Gao et al. [26] have constructed CCMs with either hybrid diffractive-refractive objectives or a single lens as the chromatic aberration provider for point, line and multiple-point measurement. In each of these CCMs, a linear function is utilized to characterize the displacement-wavelength relationship. Polynomials of orders higher than unity [27–30] and theoretical ray tracing data [19] have also been used to describe the displacement-wavelength relationship.

These explicit function models of the displacement-wavelength relationship for CCM can lead to several errors. First, there are height-dependent peak extraction errors in confocal microscopy ARS processing [31,32]. Similar wavelength extraction errors will occur in CCM since, from a mathematical perspective, the ARSs in both confocal modalities are similar [33]. Furthermore, peak extraction errors can degrade the accuracy of the displacement-wavelength relationship [25]. Second, there is still a lack of theoretical support for linear or higher-order polynomial models in CCM. Third, there are influence factors, such as uncorrected monochromatic aberrations of the hyperchromatic objective [33], measurement noise in the spectral ARSs [34,35] and the spectrally dependent influences of the light source or spectrometer [29]. These influence factors cause the nominal displacement-wavelength model to deviate from the ideal, as is confirmed by the comparison between the real displacement-wavelength relationship and theoretical ray tracing data [19]. Explicit function models can therefore cause significant measurement errors in some local regions [24,27,28], since the influence factors mentioned above can lead to non-negligible nonlinearity in the displacement-wavelength relationship [36]. For these reasons, an explicit function model is often only suitable for short-range CCM with small nonlinearity [37], which is problematic for CCMs with measurement ranges of hundreds or even thousands of micrometers.

Artificial neural networks (ANNs) are popular methods for non-linear modeling [38,39] because they can approximate an arbitrary non-linear function with arbitrary accuracy [40]. In essence, an ANN is a black-box model that depends on the input and output data [41]. The ANN approach has been widely utilized in the optical community to determine input-output response relationships [41–47], such as in the calibration of spectrometers, digital cameras, or star sensors. To our knowledge, the ANN approach has not been utilized with CCM.

In this paper, we propose a hybrid radial basis function network to achieve accurate characterization of the displacement response relationship in CCM. Instead of an ANN-based direct mapping, a hybrid ANN is constructed because of its relatively small network size, short training time, high mapping accuracy, and strong generalization ability [48]. Specifically, the hybrid network consists of a chromatic dispersion model and a radial basis function network (RBFN).

The rest of the paper is organized as follows. Section 2 explains the details of the hybrid model and discusses its two major aspects: the chromatic dispersion model for accurate displacement-wavelength relationship characterization, and an RBFN that constructs a mapping between the spectral ARSs and the characterization errors of the chromatic dispersion model. In order to eliminate the influence of signal noise [49] on the response characterization, spectral ARSs from repeated measurement runs are collected as the training data. Section 3 demonstrates the advantages of our hybrid network through experimental comparisons with a conventional polynomial model. Section 4 presents the concluding remarks.

2. Hybrid RBFN model

2.1 CCM characterization principle

Figure 1 is a schema of a CCM, which has a broadband light source (MWWHF2, Thorlabs, USA), a hyperchromatic objective lens with approximately 400 µm of axial chromatic dispersion over the visible spectrum, a commercial spectrometer (Maya Pro 2000, Ocean Optics, USA) and a high-accuracy PZT (P721.CDQ, Physik Instrumente, Germany). The objective lens focuses each wavelength of the broadband source at a different axial position. Thus, only one wavelength is focused on the sample, and the axial distance, or sample height, determines which wavelength is best focused. Upon reflection from the sample, the light is refocused onto the end of an optical fiber with a diameter of 50 µm, which acts as the detection aperture. Only the wavelength that is well focused on the sample is well focused on the fiber end [30]. Therefore, the signal level is highest for the wavelength corresponding to the sample height [33].

Fig. 1. Schema of the CCM characterization procedure.

In the CCM characterization process, a flat mirror is located within the CCM’s measurement range. A spectral ARS is recorded with the spectrometer, as shown in the right of Fig. 1. The mirror is moved by the PZT to acquire different spectral ARSs. The essence of the method is to determine a continuous mapping between the spectral ARSs and the displacement from the PZT.

In the conventional response characterization process, the peak wavelengths of the spectral ARSs are extracted with a centroid algorithm or a non-linear fitting algorithm, such as parabolic fitting, Gaussian fitting, or sinc² fitting [34,35]. Following the peak extraction process, a linear or polynomial model is developed to describe the displacement-wavelength relationship in CCM. In the measurement process, the measured sample produces a unique spectral ARS, similar to that shown in Fig. 1, and the surface height is decoded from the displacement-wavelength response model with the extracted peak wavelength as the input. Therefore, both peak wavelength extraction errors and displacement-wavelength relationship fitting errors affect the response characterization accuracy, which in turn reduces the chromatic confocal measurement accuracy.
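As a concrete illustration of these two extraction approaches, the sketch below implements a thresholded centroid and a linearized Gaussian fit (a parabola fit to the log-intensity); the threshold fraction, wavelength grid, and peak position are invented for the example and are not parameters from the cited algorithms.

```python
import numpy as np

def centroid_peak(wl, intensity, frac=0.5):
    """Centroid of the samples above a threshold (fraction of the maximum)."""
    mask = intensity >= frac * intensity.max()
    return np.sum(wl[mask] * intensity[mask]) / np.sum(intensity[mask])

def gaussian_peak(wl, intensity, frac=0.5):
    """Gaussian fitting linearized as a parabola fit to the log-intensity."""
    mask = intensity >= frac * intensity.max()
    a, b, _ = np.polyfit(wl[mask], np.log(intensity[mask]), 2)
    return -b / (2.0 * a)  # vertex of the fitted parabola = Gaussian center

# noise-free Gaussian ARS with a known peak at 551.3 nm
wl = np.linspace(540.0, 560.0, 100)
ars = np.exp(-(wl - 551.3) ** 2 / (2.0 * 2.5 ** 2))
```

On a noise-free Gaussian ARS, the log-parabola fit recovers the peak essentially exactly, while the centroid is limited by the sampling grid and, on real signals, by asymmetry of the ARS.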

Different from the conventional characterization process, our hybrid network is implemented with three procedures, as shown in Fig. 2: extracting the peak wavelengths, characterizing the displacement-wavelength relationship with the chromatic dispersion model, and constructing the RBFN mapping between the spectral ARSs and the characterization errors of the chromatic dispersion model. In the following section, we discuss the latter two procedures in detail, since the accurate extraction of the peak wavelength is described in detail elsewhere [32,50–52].

Fig. 2. Structure of the hybrid RBFN.

2.2 Chromatic dispersion model

The hyperchromatic objective can be classified into three types: a conventional refractive objective, a diffractive objective, or a hybrid diffractive-refractive objective [53]. The chromatic dispersion properties of a diffractive lens can be characterized by a simple equation given elsewhere [21]. Here, we are more concerned with the chromatic dispersion model for the conventional refractive objective, which is also the prerequisite for the displacement-wavelength characterization of the hybrid diffractive-refractive objective.

In the previous literature, a linear or polynomial function is used to describe the displacement-wavelength relationship of a refractive hyperchromatic objective [30,37]. However, there is a lack of theoretical support for such models. The essence of the chromatic dispersion model is to characterize the dispersion properties of the refractive materials with a suitable dispersion formula. The refractive index of a material can be described with several dispersion formulae, such as the Cauchy equation, the empirical Schott formula, and the Sellmeier dispersion formula [54]. Among these, the Cauchy equation is simple but inaccurate, while the Schott and Sellmeier formulae are both complex for characterizing the refractive index-wavelength relationship; indeed, the Schott formula has largely been superseded by the Sellmeier formula in optical design. The Sellmeier formula is sufficiently accurate, but has a complex expression with wavelength terms in the denominator, which can cause convergence problems [55]. Moreover, the focusing strength of a refractive lens is a complex expression of the refractive index, which means that a displacement-wavelength relationship based on the Sellmeier formula can be even more complex [26]. Here, we propose a chromatic dispersion model based on Buchdahl's dispersion formula to characterize the chromatic focal shift of the hyperchromatic objective. Our dispersion model is a polynomial function, which is both simple and accurate.

Buchdahl introduced a change of variables from wavelength $\lambda $ to a chromatic coordinate $\omega $, to express the refractive index-wavelength relationship as a power series in the chromatic coordinate. Buchdahl's model is adopted thus [56]

$$N(\lambda )= {N_0} + {\nu _1}\omega + {\nu _2}{\omega ^2} + \ldots + {\nu _q}{\omega ^q}, $$
where ${N_0}$ and the coefficients ${\nu _q}$ are uniquely determined by the material, q is the order of the polynomial, and the chromatic coordinate $\omega $ is defined as
$$\omega = \frac{{\lambda /1000 - 0.5876}}{{1 + 2.5({\lambda /1000 - 0.5876} )}}. $$
It should be noted that the wavelength in Buchdahl’s model is defined in micrometers, which is different from the wavelength unit in the spectrometer, which is given in nanometers.

The refractive index fitting errors against wavelength are shown in Fig. 3 for two typical Schott glasses: N-BK7 and N-SF6. The fitted indices are measured indices at wavelengths in the visible range [57]. With a fifth-order Buchdahl expression, the root-mean-square (RMS) values of the residual fitting errors are $1.4 \times {10^{ - 6}}$ and $1.0 \times {10^{ - 6}}$ for N-BK7 and N-SF6, respectively. Thus, Eq. (1) is accurate enough to describe the refractive index-wavelength relationship, since its performance is comparable with that of the Sellmeier formula [58]. More terms can be adopted if a more accurate characterization is needed.
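The fifth-order fit behind Fig. 3 can be reproduced numerically. The sketch below fits Eq. (1) to N-BK7 index values generated from Schott's published Sellmeier coefficients; the wavelength sampling grid is an assumption of the sketch.

```python
import numpy as np

def sellmeier_nbk7(lam_um):
    """Refractive index of Schott N-BK7 from its published Sellmeier coefficients
    (wavelength in micrometers)."""
    B = (1.03961212, 0.231792344, 1.01046945)
    C = (0.00600069867, 0.0200179144, 103.560653)
    l2 = lam_um ** 2
    return np.sqrt(1.0 + sum(b * l2 / (l2 - c) for b, c in zip(B, C)))

def chromatic_coordinate(lam_um):
    """Buchdahl's change of variables, Eq. (2), with the wavelength in micrometers."""
    d = lam_um - 0.5876
    return d / (1.0 + 2.5 * d)

lam = np.linspace(0.40, 0.70, 200)        # visible range, micrometers
n = sellmeier_nbk7(lam)

# fifth-order Buchdahl fit, Eq. (1): N = N0 + nu1*omega + ... + nu5*omega^5
omega = chromatic_coordinate(lam)
coeffs = np.polyfit(omega, n, 5)
rms_residual = np.sqrt(np.mean((n - np.polyval(coeffs, omega)) ** 2))
```

The residual RMS lands at or below the 10⁻⁶ level noted above; raising the order of `np.polyfit` adds terms for a tighter fit.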

Fig. 3. Refractive index fitting error of Buchdahl’s dispersion formula against wavelength.

Assuming the paraxial approximation, the focal length $f(\lambda )$ of a thin lens in vacuum, at a wavelength $\lambda $, is given by [24]

$$f(\lambda )= \frac{1}{{N(\lambda )- 1}} \cdot \frac{{{r_1}{r_2}}}{{{r_2} - {r_1}}},$$
where ${r_1}$ and ${r_2}$ are the radii of the two spherical refractive surfaces. It should be noted that, for a given lens shape, the quantity $f(\lambda )({N(\lambda )- 1} )$ is constant for all wavelengths [59]. Therefore, the focal length $f(\lambda )$ is expressed as
$$f(\lambda )= \frac{{{f_{nom}}({{N_{nom}} - 1} )}}{{N(\lambda )- 1}} = \frac{{{C_{nom}}}}{{N(\lambda )- 1}}, $$
where ${f_{nom}}$ and ${N_{nom}}$ are the nominal focal length and refractive index at the design wavelength, and ${C_{nom}} = {f_{nom}}({{N_{nom}} - 1} )$.

Using the Taylor series expansion, the focal length $f(\lambda )$ is approximated as

$$f(\lambda )\approx \frac{{{C_{nom}}}}{{{N_0} - 1}} \cdot \left[ {1 - \frac{1}{{{N_0} - 1}}({{\nu_1}\omega + {\nu_2}{\omega^2} + \ldots + {\nu_q}{\omega^q}} )} \right], $$
since the dispersion sum ${\nu _1}\omega + {\nu _2}{\omega ^2} + \ldots + {\nu _q}{\omega ^q}$ is much smaller than ${N_0} - 1$.

Before Eq. (5) is applied as the chromatic dispersion model, several points need to be emphasized. First, Eq. (5) is not used directly as the chromatic dispersion model; rather, it indicates that the model should be a polynomial function of the chromatic coordinate $\omega $. Although Eq. (5) is derived for the focal length-wavelength relationship of a single lens, the polynomial expression also applies to a complex hyperchromatic objective [60], since the focal power (reciprocal of the focal length) of a complex hyperchromatic objective with a multi-lens configuration can be calculated as a linear combination of the focal powers of the individual optical components [61]. In other words, the complex objective can be modeled as an equivalent thin lens. Second, the expression in Eq. (5) only describes the paraxial dispersion properties of the hyperchromatic objective; other monochromatic aberrations are not considered in the chromatic dispersion model.

The characterization of the paraxial chromatic focal shift using the chromatic dispersion model is shown in Fig. 4, in comparison with ray tracing data from the commercial software ZEMAX. In Fig. 4(a), the chromatic dispersion model and the ray tracing data are in good agreement. In Fig. 4(b), the chromatic dispersion model outperforms the conventional polynomial model in characterizing the paraxial chromatic focal shift, with much smaller residual fitting errors. The RMS value of the residual fitting errors of the fifth-order chromatic dispersion model is approximately 1.2 nm over a 400 µm measurement range, which is sufficiently accurate for the CCM response characterization process. This also indicates that the chromatic dispersion model can replace the ZEMAX modeling data for the paraxial characterization of the CCM.
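The same recipe extends from the refractive index to the focal shift of Eq. (4). In this sketch the "true" focal length is synthesized from an N-BK7-like Sellmeier index with a hypothetical 10 mm nominal focal length (the real objective's prescription is not reproduced here), and a fifth-order polynomial in $\omega$ is fitted to it.

```python
import numpy as np

def sellmeier_index(lam_um):
    # Schott N-BK7 Sellmeier coefficients (wavelength in micrometers)
    B = (1.03961212, 0.231792344, 1.01046945)
    C = (0.00600069867, 0.0200179144, 103.560653)
    l2 = lam_um ** 2
    return np.sqrt(1.0 + sum(b * l2 / (l2 - c) for b, c in zip(B, C)))

def chromatic_coordinate(lam_um):
    d = lam_um - 0.5876
    return d / (1.0 + 2.5 * d)

lam = np.linspace(0.40, 0.70, 300)        # visible range, micrometers
n = sellmeier_index(lam)

# "true" focal length from Eq. (4): f = C_nom / (N - 1), f_nom = 10 mm
f_nom = 10000.0                            # micrometers (hypothetical)
n_nom = sellmeier_index(0.5876)
f = f_nom * (n_nom - 1.0) / (n - 1.0)

# chromatic dispersion model: fifth-order polynomial in omega
omega = chromatic_coordinate(lam)
coeffs = np.polyfit(omega, f, 5)
rms = np.sqrt(np.mean((f - np.polyval(coeffs, omega)) ** 2))  # micrometers
```

With these assumptions, the chromatic focal shift spans a few hundred micrometers, and the polynomial-in-$\omega$ model fits it to well below the micrometer level.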

Fig. 4. Accuracy of the chromatic dispersion model for a hyperchromatic objective lens with multiple lenses. (a) Displacement-wavelength relationships for the ray tracing data and the chromatic dispersion model; (b) variation of the residual errors of the chromatic dispersion model with wavelength.

2.3 Implementation of the RBFN

2.3.1 Introduction to the RBFN

Although our proposed chromatic dispersion model provides an accurate characterization of the paraxial displacement-wavelength relationship in CCM, the real displacement-wavelength relationship can be influenced by several troublesome factors, such as the uncorrected monochromatic aberrations (occurring in the optical design, manufacturing or assembly process) [33], different amounts of monochromatic aberrations at different wavelengths [24], the non-uniform intensity spectrum of the broadband light source [25] and the wavelength-dependent response characteristics of the spectrometer [29].

It is challenging to design a hyperchromatic objective with the desired chromatic aberration and without monochromatic aberrations. Monochromatic aberrations that exceed a certain limit can lead to a significantly distorted spectral signal, which causes significant peak extraction errors [62] and a distorted displacement-wavelength relationship. Also, different amounts of monochromatic aberration at different wavelengths distort the displacement-wavelength relationship differently. In addition, the spectrum of the light source and the spectral response characteristics of the spectrometer can distort the spectral signal, which eventually introduces distortions into the displacement-wavelength relationship. However, all of these factors are difficult to quantify accurately in practice, which implies that the paraxial characterization of the CCM using the chromatic dispersion model is not enough. An ANN mapping between the spectral signals and calibrated displacements is effective, since the overall effects of the monochromatic aberrations and the spectral characteristics of the light source and the spectrometer are reflected in the spectral signals. Here, we choose an ANN mapping between the spectral signals and the characterization errors.

As a feedforward ANN, an RBFN has one hidden layer with non-sigmoidal activation functions [63]. The hidden layer can contain several neurons, and each neuron has a center vector of the same length as the input vector. The topological structure of an RBFN is shown in Fig. 5. The mapping from the input layer to the hidden layer is non-linear, while the mapping from the hidden space to the output space is linear, that is, a linearly weighted sum of the hidden layer's output. Thus, the network output is a linear combination of a series of activation functions, a characteristic that is advantageous for training speed. In an RBFN, the type of activation function does not significantly affect network performance [64]. Thus, our RBFN output with a Gaussian activation function is defined as [44]

$$\varphi (X )= \sum\limits_{i = 1}^{{N_h}} {{w_i}\exp \left( { - \frac{{{{||{X - {k_i}} ||}^2}}}{{2\sigma_i^2}}} \right)}, $$
where $||{\ldots } ||$ is the Euclidean norm, ${N_h}$ is the number of neurons in the hidden layer, ${{\boldsymbol{k}}_i}$ is the center vector of the ith hidden-layer neuron, ${\sigma _i}$ is the kernel width of the ith hidden-layer neuron, ${w_i}$ is the weight of the output layer, ${\boldsymbol{X}}$ is the input vector pattern of the network and $\varphi$ is the network output.

Fig. 5. The topological structure of an RBFN.

Essential factors in the implementation of the RBFN are the selection of the neurons' center vectors, the determination of the kernel widths, and the optimization of the linear weights. In general, the mean squared error (MSE) is adopted as the performance measure of the network [41]. Here, we take the square root of the MSE, i.e., the RMSE, as the performance measure, which has the same units as the displacement being predicted. The RMSE, i.e., the RMS of the deviations between the desired output and the network output, is given by

$$RMSE = \sqrt {\frac{1}{{{N_t}}}\sum\limits_{t = 1}^{{N_t}} {{{({Y_t^{des} - Y_t^{net}} )}^2}} }, $$
where $Y_t^{des}$ and $Y_t^{net}$ denote the desired output and the network output for the input pattern t respectively, and ${N_t}$ is the number of training patterns.

In the network, the number of neurons in the hidden layer needs to be determined before the optimization of the network architecture. Too many neurons can easily lead to overfitting, while too few neurons make it difficult to complete the mapping task [65]. In a classical feedforward ANN, the architecture does not change during training, and it is difficult to determine the optimal number of hidden layers and neurons. Our RBFN is constructed to self-adaptively adjust the number of neurons: the number of neurons in the hidden layer is first assumed, then the neurons' centers are determined with a classical k-means clustering algorithm [66], and finally the linear weights are optimized with a least-squares method [67]. The number of neurons is increased automatically by checking the RMSE measure, and the training process is repeated until either the requested precision or the maximum number of neurons is reached.
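The three training steps (k-means centers, a heuristic kernel width, least-squares output weights) can be sketched end to end. The toy target curve, the shared-width heuristic, and all sizes below are assumptions for illustration, not the settings used for the CCM data.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(X, k, iters=50):
    """Plain Lloyd's k-means for selecting the RBF centers."""
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers

def design_matrix(X, centers, sigma):
    dist2 = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) ** 2
    return np.exp(-dist2 / (2.0 * sigma ** 2))  # Gaussian activations, Eq. (6)

def train_rbfn(X, y, n_hidden):
    centers = kmeans(X, n_hidden)
    # heuristic shared kernel width: max center spacing over sqrt(2 * n_hidden)
    d_max = max(np.linalg.norm(a - b) for a in centers for b in centers)
    sigma = d_max / np.sqrt(2.0 * n_hidden)
    weights, *_ = np.linalg.lstsq(design_matrix(X, centers, sigma), y, rcond=None)
    return centers, sigma, weights

def rbfn_predict(X, centers, sigma, weights):
    return design_matrix(X, centers, sigma) @ weights

# toy residual-error curve standing in for the dispersion-model fitting errors
X = np.linspace(0.0, 1.0, 200)[:, None]
y = 0.05 * np.sin(4.0 * np.pi * X[:, 0])

centers, sigma, weights = train_rbfn(X, y, n_hidden=20)
rmse = np.sqrt(np.mean((rbfn_predict(X, centers, sigma, weights) - y) ** 2))
```

In the self-adaptive scheme described above, `train_rbfn` would be called with an increasing `n_hidden` until the RMSE of Eq. (7) meets the requested precision.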

2.3.2 Dimensionality reduction

It was emphasized in Section 2.1 that the RBFN is constructed with the spectral ARSs as the input and the residual fitting errors of the chromatic dispersion model as the output. Using the spectral ARSs as input vectors has the advantage of suppressing the wavelength-dependent influence factors described in Section 1. However, it also raises a new issue. In Fig. 6, three spectral ARSs are illustrated with a wavelength resolution of 0.46 nm. The spectral ARSs cover an effective spectral bandwidth of approximately 100 nm, which corresponds to a 100 µm measurement range. The input vector, i.e., the vector of intensity values, has approximately 200 dimensions. If the CCM has a larger measurement range with a broader spectral bandwidth, the dimensionality of the input vector can be higher, up to thousands. A direct RBFN mapping between the original spectral ARSs and the residual fitting errors can cause dimensionality-related issues, such as the “curse of dimensionality” and long training times [41]. Therefore, dimensionality reduction is needed before network construction.

Fig. 6. Spectral ARSs at different axial positions.

Dimensionality reduction provides a feature projection where the data, i.e., the spectral ARSs in the high-dimensional space, are projected to a space of fewer dimensions, without loss of valuable information. Several classical dimensionality reduction methods [68] have been proposed, such as principal component analysis (PCA), kernel PCA, and autoencoder. Among these methods, PCA can perform a linear data transform to a lower-dimensional space in such a way that the variance of the data in the low-dimensional representation is maximized [69].

The essence of PCA is to perform the linear transform by left-multiplying a matrix. Suppose we have a collection of $m$ training patterns (i.e., $m$ input vectors), each of which is ($n \times 1$)-dimensional. If the variables in the ($n \times 1$)-dimensional input vector are written as $\{{{x_1},{x_2},\ldots,{x_n}} \}$, the new representative variables $\{{{z_1},{z_2},\ldots,{z_l}} \}\ (l \le n)$ are expressed as

$$\left\{ {\begin{array}{c} {{z_1} = {\alpha _{11}}{x_1} + {\alpha _{12}}{x_2} + \ldots + {\alpha _{1n}}{x_n}}\\ {{z_2} = {\alpha _{21}}{x_1} + {\alpha _{22}}{x_2} + \ldots + {\alpha _{2n}}{x_n}}\\ \vdots \\ {{z_l} = {\alpha _{l1}}{x_1} + {\alpha _{l2}}{x_2} + \ldots + {\alpha _{ln}}{x_n}} \end{array}} \right.$$

The transformation matrix ${\boldsymbol{\alpha}}$ is chosen so that any two variables in the new representation $\{{{z_1},{z_2},\ldots,{z_l}} \}$ are uncorrelated and the new variables are sorted in order of decreasing variance [70]. The matrix ${\boldsymbol{\alpha}}$ can be determined from an eigenvalue decomposition of the covariance matrix constructed from the original data sets [41]. In CCM, this covariance matrix is derived from a collection of spectral ARSs. With the PCA operation, the input vector can be reduced from an ($n \times 1$)-dimensional vector to an ($l \times 1$)-dimensional vector, where $l$ can be much smaller than $n$.
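The PCA step can be sketched with synthetic spectral ARSs standing in for real spectrometer frames (Gaussian peaks at varying positions plus noise; the pixel count, peak range, and noise level are invented for the example):

```python
import numpy as np

rng = np.random.default_rng(1)

# synthetic spectral ARSs: Gaussian peaks at varying positions plus noise
wavelengths = np.linspace(500.0, 600.0, 200)        # 200 spectrometer pixels
peak_positions = rng.uniform(520.0, 580.0, size=300)
patterns = np.exp(-(wavelengths[None, :] - peak_positions[:, None]) ** 2
                  / (2.0 * 8.0 ** 2))
patterns += 0.01 * rng.standard_normal(patterns.shape)

# PCA via eigenvalue decomposition of the covariance matrix of the centered data
mean = patterns.mean(axis=0)
cov = np.cov(patterns - mean, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)              # ascending eigenvalues
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# keep just enough components to explain 99% of the variance
explained = np.cumsum(eigvals) / np.sum(eigvals)
n_keep = int(np.searchsorted(explained, 0.99)) + 1
reduced = (patterns - mean) @ eigvecs[:, :n_keep]   # (m x l) representation
```

For this synthetic set, a small number of components (far fewer than the 200 pixels) suffices for 99% of the variance, mirroring the kind of reduction reported in Section 3.2.3.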

3. Experiments

3.1 Experimental procedures

In our response characterization process, the experimental setup shown in Fig. 1 is used to measure a flat mirror, which is actuated by a closed-loop PZT. During the movement of the flat mirror, the same point on the mirror is measured at different axial positions, which are determined from the PZT's displacement sensor output. The spectral ARS is recorded with a fixed displacement interval. At each axial position, twenty-five frames of spectral ARSs with 7.2 ms integration time are obtained. The acquired spectral ARSs are processed, and a mapping between the extracted peak wavelengths and the PZT displacements is carried out using the chromatic dispersion model. The RBFN is trained, with the PCA-processed spectral ARSs as the input, and the residual errors of the chromatic dispersion model as the output.

After the response characterization process, we test the measurement performance of our hybrid RBFN by measuring a moving flat mirror. The measurement process is started directly after the response characterization process, with as little change in the external conditions as possible. Data set 1 is collected for the training/validation of the hybrid RBFN, and data set 2 is collected for testing the predictive performance of the hybrid RBFN, as shown in Fig. 7. In the network optimization process, 95% of data set 1 is used for network training, while the remaining 5% is used for network validation to avoid under-fitting or over-fitting of the network [41].

Fig. 7. Illustrations of the training/validation and testing data sets.

3.2 Network training/validation performance

In this section, the influences of the peak extraction algorithms, the displacement-wavelength fitting models, and the dimensionality reduction on the network performance are investigated. All comparisons in this section use the same data sets, i.e., the same spectral ARSs and PZT displacements. Data set 1 is large when the spectral ARS is recorded with a 100 nm displacement interval (25,000 patterns from 1000 intervals within the 100 µm measurement range), and the training time can reach several tens of hours if all of these data are utilized. Therefore, we select a subset of data set 1 for the demonstrations in this section: the spectral ARSs are selected with a 300 nm interval, giving approximately 8,000 patterns (333 intervals, each with twenty-five frames of spectral signals).

3.2.1 Peak wavelength extraction algorithms

In this section, the network training performance of different peak extraction algorithms, including the centroid algorithm and the Gaussian fitting algorithm, is compared based on the same hybrid network topology, as shown in Fig. 2. The variation of the RMSE with the number of RBFN neurons for the centroid algorithm and the Gaussian fitting is shown in Fig. 8. The hybrid RBFN with a more accurate peak wavelength extraction algorithm, such as Gaussian fitting, reaches the convergence limit with far fewer neurons than with a less accurate algorithm, such as the centroid algorithm. Moreover, the RMSE of the hybrid RBFN based on Gaussian fitting is much smaller than that based on the centroid algorithm. In other words, the wavelength extraction performance has an important effect on the hybrid network performance.

Fig. 8. Network training performance of different peak extraction algorithms.

3.2.2 Displacement-wavelength fitting models

The network training performance of a polynomial model and the chromatic dispersion model is shown in Fig. 9, based on the same hybrid network topology as shown in Fig. 2. Provided that the polynomial model and the chromatic dispersion model have the same number of constants (four constants in Fig. 9), the chromatic dispersion model achieves a smaller initial characterization error than the polynomial model. Thus, the chromatic dispersion model is superior to the conventional polynomial model and is more suitable for a direct displacement-wavelength characterization for CCMs with a measurement range of several micrometers [26]. In addition, our proposed dispersion model reaches the convergence limit with fewer neurons.

Fig. 9. Network training performance of different displacement-wavelength fitting models.

3.2.3 PCA based dimensionality reduction

PCA is effective at reducing the dimensionality of the original data because the PCA-processed data represent most of the variance of the original data. The individual proportions of the total variance accounted for by each principal component, and their cumulative sum, are shown in Fig. 10. The first twenty-five principal components account for over 99% of the total variance; in other words, these twenty-five principal components can represent the original spectral signals without significant loss of information. The network training performance with and without the PCA-based dimensionality reduction is compared in Fig. 11. The PCA operation does not deteriorate the network performance, while the dimensionality reduction reduces the training time from 200 s to around 90 s (for 8,000 training patterns). It should be noted that the training data used for this illustration are only a small part of data set 1; the PCA operation can improve the training speed even more significantly for larger data sets [70].

Fig. 10. The individual and cumulative proportions of the data variance in the PCA.

Fig. 11. Network training performance with and without PCA based dimensionality reduction.

3.3 Measurement results

After training the hybrid RBFN with data set 1 (Fig. 7), the measurement performance of the hybrid network is evaluated with data set 2. The polynomial models with conventional peak wavelength extraction algorithms are also developed in advance, where the polynomial order is determined to minimize the RMS of the residual fitting errors. A piecewise linear interpolation method, discussed elsewhere [32,50,52], is also used for comparison. Two measures are used to evaluate the measurement performance of these methods: the systematic measurement error, i.e., the average deviation of the repeated measured displacements from the ideal PZT value, and the standard deviation, i.e., the spread of the repeated measured displacements at each PZT displacement [71].
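The two measures can be computed directly from the repeated runs; the sketch below uses synthetic data (array sizes, bias, and noise level are invented), not the paper's measurements.

```python
import numpy as np

rng = np.random.default_rng(2)

def error_measures(measured, reference):
    """measured: (n_repeats, n_positions) decoded displacements;
    reference: (n_positions,) PZT displacement sensor readings."""
    deviations = measured - reference[None, :]
    systematic = deviations.mean(axis=0)       # mean deviation at each position
    spread = measured.std(axis=0, ddof=1)      # repeatability at each position
    return systematic, spread

# synthetic test: 25 repeats over 50 positions with a 0.02 um bias + 0.03 um noise
reference = np.linspace(0.0, 100.0, 50)        # micrometers
measured = reference[None, :] + 0.02 + 0.03 * rng.standard_normal((25, 50))

systematic, spread = error_measures(measured, reference)
rms_systematic = np.sqrt(np.mean(systematic ** 2))
```

`systematic` captures the average deviation at each position and `spread` the repeatability; their RMS values over all positions give summary numbers of the kind reported in this section.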

The systematic measurement errors and standard deviations of the CCM with the different response characterization methods, such as the polynomial models and the hybrid RBFN, are shown in Figs. 12(a) and 12(b). The RMS values of the systematic measurement errors are 0.079 µm for the polynomial model using the centroid algorithm, 0.034 µm for the polynomial model using Gaussian fitting, 0.008 µm for the hybrid RBFN using the corrected parabolic fitting, and 0.007 µm for the piecewise linear interpolation method using the corrected parabolic fitting. The corresponding RMS values of the standard deviations are 0.057 µm, 0.030 µm, 0.026 µm and 0.028 µm, respectively. Compared with the conventional polynomial model using Gaussian fitting, the hybrid model improves the measurement accuracy by 76% and the measurement precision by 14%. Compared to our previously proposed piecewise linear interpolation method [52], the hybrid RBFN achieves comparable performance in terms of the RMS measures of the systematic measurement errors and standard deviations; however, the piecewise linear interpolation method can produce significant gross errors in both measures, as shown in Figs. 12(a) and 12(b).
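The 76% accuracy figure follows directly from the reported RMS values, as a quick check shows:

```python
# RMS values of the systematic measurement errors reported in the text (µm).
rms_gaussian_poly = 0.034   # polynomial model + Gaussian fitting
rms_hybrid_rbfn = 0.008     # hybrid RBFN + corrected parabolic fitting

# Relative accuracy improvement of the hybrid RBFN over the
# conventional polynomial model with Gaussian fitting.
improvement = (rms_gaussian_poly - rms_hybrid_rbfn) / rms_gaussian_poly
print(f"{improvement:.0%}")
```

The precision figure is obtained the same way from the two standard-deviation RMS values (0.030 µm and 0.026 µm).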

Fig. 12. Measurement results with different calibration models. (a) Systematic errors at different PZT displacements and (b) standard deviations at different PZT displacements.

The displacement response characterization is also affected by other factors, such as the sample color [20], surface texture [72], and the magnitudes of the measured signals [73]. The hybrid network offers a potential path to suppressing these factors within the response characterization process. For example, specimens of different colors can be measured with the setup shown in Fig. 2 to collect spectral ARSs as the training data set, from which a hybrid network-based displacement relationship can be obtained. Since the generalization ability of an ANN depends strongly on the size of the training data set, constructing such data sets for chromatic confocal measurement applications would be a promising way to further enhance the measurement performance of chromatic confocal microscopy.

4. Conclusions

A hybrid radial basis function network is proposed for accurate displacement response characterization in chromatic confocal microscopy. The hybrid model combines several elements: an accurate and efficient peak wavelength extraction algorithm, an accurate chromatic dispersion model, and a self-adjusting radial basis function network with principal component analysis-based dimensionality reduction. Our chromatic confocal measurement results demonstrate the advantage of the hybrid model in terms of the systematic measurement error and standard deviation, showing improvements of at least 76% in measurement accuracy and 14% in measurement precision when compared with conventional polynomial models. Future work will focus on developing a hybrid network with stronger generalization ability and on suppressing the sample-dependent influences on the displacement response characterization.

Funding

National Natural Science Foundation of China (NSFC) (51475190, 51675167, 51705178); National Instrument Development Specific Project of China (2011YQ160013); Key Grant Project of Science and Technology Program of Hubei Province of PR China (2017AAA001); Shenzhen Basic Scientific Research Project (JCYJ2017030717134710).

Acknowledgment

The authors thank Dr. Xuzhan Chen, Miss Yin Zhou, and Mr. Kai Zhang at HUST for insightful discussions on neural networks and dimensionality reduction. Sincere thanks are also given to Dr. Xuzhan Chen for going cycling with Mr. Cheng Chen, which offered him much strength, perseverance, and determination to carry out this work.

Disclosures

The authors declare that there are no conflicts of interest related to this article.

References

1. M. Gu, Principles of three-dimensional imaging in confocal microscopes (World Scientific, 1996).

2. J. B. Pawley, Handbook of biological confocal microscopy (Springer, 2006).

3. B.V.R. Tata and B. Raj, “Confocal laser scanning microscopy: Applications in material science and technology,” Bull. Mater. Sci. 21(4), 263–278 (1998). [CrossRef]  

4. L. Qiu, D. Liu, W. Zhao, H. Cui, and Z. Sheng, “Real-time laser differential confocal microscopy without sample reflectivity effects,” Opt. Express 22(18), 21626–21640 (2014). [CrossRef]  

5. J. Chesna, B. Wiedmaier, J. Wang, A. Samara, R. Leach, T. Her, and S. Smith, “Aerial wetting contact angle measurement using confocal microscopy,” Meas. Sci. Technol. 27(12), 125202 (2016). [CrossRef]  

6. L. Li, J. Liu, Y. Liu, C. Liu, H. Zhang, X. You, K. Gu, Y. Wang, and J Tan, “A promising solution to the limits of microscopes for smooth surfaces: fluorophore-aided scattering microscopy,” Nanoscale 10(20), 9484–9488 (2018). [CrossRef]  

7. J. Yang, L. Qiu, W. Zhao, Y. Shen, and H. Jiang, “Laser differential confocal paraboloidal vertex radius measurement,” Opt. Lett. 39(4), 830–833 (2014). [CrossRef]  

8. J. Liu, Y. Wang, K. Gu, X. You, M. Zhang, M. Li, and J. Tan, “Measuring profile of large hybrid aspherical diffractive infrared elements using confocal profilometer,” Meas. Sci. Technol. 27(12), 125011 (2016). [CrossRef]  

9. J. Yang, L. Qiu, W. Zhao, and H. Wu, “Laser differential reflection-confocal focal-length measurement,” Opt. Express 20(23), 26027–26036 (2012). [CrossRef]  

10. R.K. Leach, Optical Measurement of Surface Topography (Springer, 2011).

11. J. Yang, L. Qiu, W. Zhao, R. Shao, and Z. Li, “Measuring the lens focal length by laser reflection-confocal technology,” Appl. Opt. 52(16), 3812–3817 (2013). [CrossRef]  

12. T. R. Corle, C. H. Chou, and G. S. Kino, “Depth response of confocal optical microscope,” Opt. Lett. 11(12), 770–772 (1986). [CrossRef]  

13. A. Ruprecht, K. Koerner, T. Wiesendanger, H. Tiziani, and W. Osten, “Chromatic confocal detection for high-speed microtopography measurements,” Proc. SPIE 5302, 53–60 (2004). [CrossRef]  

14. Q. Yu, K. Zhang, C. Cui, R. Zhou, F. Cheng, R. Ye, and Y. Zhang, “Method of thickness measurement for transparent specimens with chromatic confocal microscopy,” Appl. Opt. 57(33), 9722–9728 (2018). [CrossRef]  

15. M. Hillenbrand, L. Lorenz, R. Kleindienst, A. Grewe, and S. Sinzinger, “Spectrally multiplexed chromatic confocal multipoint sensing,” Opt. Lett. 38(22), 4694–4697 (2013). [CrossRef]  

16. K. Ang, Z. Fang, and A. Tay, “Note: Real-time three-dimensional topography measurement of microfluidic devices with pillar structures using confocal microscope,” Rev. Sci. Instrum. 85(2), 026108 (2014). [CrossRef]  

17. L. Chen, Y. Chang, and H. Li, “Full-field chromatic confocal surface profilometry employing digital micromirror device correspondence for minimizing lateral cross talks,” Opt. Eng. 51(8), 081507 (2012). [CrossRef]  

18. B. Jiao, X. Li, Q. Zhou, K. Ni, and X. Wang, “Improved chromatic confocal displacement-sensor based on a spatial-bandpass-filter and an X-shaped fiber-coupler,” Opt. Express 27(8), 10961–10973 (2019). [CrossRef]  

19. B. S. Chun, K. Kim, and D. Gweon, “Three-dimensional surface profile measurement using a beam scanning chromatic confocal microscope,” Rev. Sci. Instrum. 80(7), 073706 (2009). [CrossRef]  

20. Q. Yu, K. Zhang, R. Zhou, C. Cui, F. Cheng, S. Fu, and R. Ye, “Calibration of a chromatic confocal microscope for measuring a colored specimen,” IEEE Photonics J. 10(6), 1–9 (2018). [CrossRef]  

21. S.L. Dobson, P. Sun, and Y. Fainman, “Diffractive lenses for chromatic confocal imaging,” Appl. Opt. 36(20), 4744–4748 (1997). [CrossRef]  

22. P. C. Lin, P. Sun, L. Zhu, and Y. Fainman, “Single-shot depth-section imaging through chromatic slit-scan confocal microscopy,” Appl. Opt. 37(28), 6764–6770 (1998). [CrossRef]  

23. S. Cha, P. C. Lin, L. Zhu, P. Sun, and Y. Fainman, “Nontranslational three-dimensional profilometry by chromatic confocal microscopy with dynamically configurable micromirror scanning,” Appl. Opt. 39(16), 2605–2613 (2000). [CrossRef]  

24. K. Shi, P. Li, S. Yin, and Z. Liu, “Chromatic Confocal Microscopy using supercontinuum light,” Opt. Express 12(10), 2096–2101 (2004). [CrossRef]  

25. D. Luo, C. F. Kuang, and X. Liu, “Fiber-based chromatic confocal microscope with Gaussian fitting method,” Opt. Laser Technol. 44(4), 788–793 (2012). [CrossRef]  

26. X. Chen, T. Nakamura, Y. Shimizu, C. Chen, Y. Chen, H. Matsukuma, and W. Gao, “A chromatic confocal probe with a mode-locked femtosecond laser source,” Opt. Laser Technol. 103, 359–366 (2018). [CrossRef]  

27. U. Minoni, G. Manili, S. Bettoni, E. Varrenti, D. Modotto, and C. De Angelis, “Chromatic confocal setup for displacement measurement using a supercontinuum light source,” Opt. Laser Technol. 49, 91–94 (2013). [CrossRef]  

28. C. Olsovsky, R. Shelton, O. Carrasco-Zevallos, B. E. Applegate, and K. C. Maitland, “Chromatic confocal microscopy for multi-depth imaging of epithelial tissue,” Biomed. Opt. Express 4(5), 732–740 (2013). [CrossRef]  

29. J. Garzón, T. Gharbi, and J. Meneses, “Real-time determination of the optical thickness and topography of tissues by chromatic confocal microscopy,” J. Opt. A: Pure Appl. Opt. 10(10), 104028 (2008). [CrossRef]  

30. D. W. Sesko, “Intensity compensation for interchangeable chromatic point sensor components,” U.S. patent 7,876,456 B2 (Jan. 25, 2011).

31. A. K. Ruprecht, T. F. Wiesendanger, and H. J. Tiziani, “Signal evaluation for high-speed confocal measurements,” Appl. Opt. 41(35), 7410–7415 (2002). [CrossRef]  

32. C. Chen, J. Wang, X. J. Liu, W. L. Lu, H. Zhu, and X. Q. Jiang, “Influence of sample surface height for evaluation of peak extraction algorithms in confocal microscopy,” Appl. Opt. 57(22), 6516–6526 (2018). [CrossRef]  

33. M. Hillenbrand, B. Mitschunas, F. Brill, A. Grewe, and S. Sinzinger, “Spectral characteristics of chromatic confocal imaging systems,” Appl. Opt. 53(32), 7634–7642 (2014). [CrossRef]  

34. C. Liu, Y. Liu, T. Zheng, J. Tan, and J. Liu, “Monte Carlo based analysis of confocal peak extraction uncertainty,” Meas. Sci. Technol. 28(10), 105016 (2017). [CrossRef]  

35. J. Liu and J. Tan, Confocal Microscopy (Morgan & Claypool, 2016).

36. H. Nouira, N. E. Hayek, X. Yuan, and N. Anwer, “Characterization of the main error sources of chromatic confocal probes for dimensional measurement,” Meas. Sci. Technol. 25(4), 044011 (2014). [CrossRef]  

37. G. Zhuo, C. Hsu, Y. Wang, and M. Chan, “Chromatic confocal microscopy to rapidly reveal nanoscale surface/interface topography by position-sensitive detection,” Appl. Phys. Lett. 113(8), 083106 (2018). [CrossRef]  

38. M. J. Baker, J. T. Xi, and J. F. Chicharo, “Neural Network digital fringe calibration technique for structured light profilometers,” Appl. Opt. 46(8), 1233–1243 (2007). [CrossRef]  

39. X. Chen, Y. Chen, K. Gupta, J. Zhou, and H. Najjaran, “SliceNet: A proficient model for real-time 3D shape-based recognition,” Neurocomputing 316, 144–155 (2018). [CrossRef]  

40. K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward networks are universal approximators,” Neural Networks 2(5), 359–366 (1989). [CrossRef]  

41. I. Goodfellow, Y. Bengio, A. Courville, and Y. Bengio, Deep learning (MIT press, Cambridge, 2016).

42. H. Zhao, S. Shi, H. Jiang, Y. Zhang, and Z. Xu, “Calibration of AOTF-based 3D measurement system using multiplane model based on phase fringe and BP neural network,” Opt. Express 25(9), 10413–10433 (2017). [CrossRef]  

43. D. Luo and M. W. Kudenov, “Neural network calibration of a snapshot birefringent Fourier transform spectrometer with periodic phase errors,” Opt. Express 24(10), 11266 (2016). [CrossRef]  

44. L. Fan, W. Li, A. Dahlback, J. J. Stamnes, S. Stamnes, and K. Stamnes, “New neural-network-based method to infer total ozone column amounts and cloud effects from multi-channel, moderate bandwidth filter instruments,” Opt. Express 22(16), 19595–19609 (2014). [CrossRef]  

45. M. Grunwald, P. Laube, M. Schall, G. Umlauf, and M.O. Franz, “Radiometric calibration of digital cameras using neural networks,” Proc. SPIE 10395, 1039505 (2017). [CrossRef]  

46. C. Zhang, Y. Niu, H. Zhang, and J. Lu, “Optimized star sensors laboratory calibration method using a regularization neural network,” Appl. Opt. 57(5), 1067–1074 (2018). [CrossRef]  

47. W. J. Xiang, Z. X. Zhou, D. Y. Ge, Q. Y. Zhang, and Q. H. Yao, “Camera calibration by hybrid Hopfield network and self-adaptive genetic algorithm,” Meas. Sci. Rev. 12(6), 302–308 (2012). [CrossRef]  

48. S. Xie, X. Zhang, S. Chen, and C. Zhu, “Hybrid neural network models of transducers,” Meas. Sci. Technol. 22(10), 105201 (2011). [CrossRef]  

49. L. C. Chen, D.T. Nguyen, and Y.W. Chang, “Precise optical surface profilometry using innovative chromatic differential confocal microscopy,” Opt. Lett. 41(24), 5660–5663 (2016). [CrossRef]  

50. C. Chen, W. Yang, J. Wang, W. Lu, X. Liu, and X. Jiang, “Accurate and efficient height extraction in chromatic confocal microscopy using corrected fitting of the differential signal,” Precis. Eng. 56, 447–454 (2019). [CrossRef]  

51. M. Rahlves, B. Roth, and E. Reithmeier, “Confocal signal evaluation algorithms for surface metrology: uncertainty and numerical efficiency,” Appl. Opt. 56(21), 5920–5926 (2017). [CrossRef]  

52. C. Chen, J. Wang, R. Leach, W. Lu, X. Liu, and X. Jiang, “Corrected parabolic fitting for height extraction in confocal microscopy,” Opt. Express 27(3), 3682–3697 (2019). [CrossRef]  

53. M. Rayer and D. Mansfield, “Chromatic confocal microscopy using staircase diffractive surface,” Appl. Opt. 53(23), 5123–5130 (2014). [CrossRef]  

54. W. J. Smith, Modern Optical Engineering: The Design of Optical Systems (McGraw-Hill, New York, 1990).

55. B. Tatian, “Fitting Refractive-Index Data With The Sellmeier Dispersion Formula,” Appl. Opt. 23(24), 4477–4485 (1984). [CrossRef]  

56. P. N. Robb and R. I. Mercado, “Calculation of refractive indices using Buchdahl’s chromatic coordinate,” Appl. Opt. 22(8), 1198–1215 (1983). [CrossRef]  

57. Schott glass catalog (2011), http://www.us.schott.com.

58. C.L. Li and J. Sasián, “Adaptive dispersion formula for index interpolation and chromatic aberration correction,” Opt. Express 22(1), 1193–1202 (2014). [CrossRef]  

59. D. N. Fuller, A. L. Kellner, and J. H. Price, “Exploiting chromatic aberration for image-based microscope autofocus,” Appl. Opt. 50(25), 4967–4976 (2011). [CrossRef]  

60. J. Novak and A. Miks, “Hyperchromats with linear dependence of longitudinal chromatic aberration on wavelength,” Optik 116(4), 165–168 (2005). [CrossRef]  

61. M. Hillenbrand, B. Mitschunas, C. Wenzel, A. Grewe, X. Ma, P. Feßer, M. Bichra, and S. Sinzinger, “Hybrid hyperchromats for chromatic confocal sensor systems,” Adv. Opt. Technol. 1(3), 187 (2012). [CrossRef]  

62. C. Chen, J. Wang, C. Zhang, W. Lu, X. Liu, Z. Lei, and X. Jiang, “Influence of optical aberrations on the peak extraction in confocal microscopy,” Opt. Commun. 449, 24–32 (2019). [CrossRef]  

63. M. J. L. Orr, "Introduction to radial basis function networks," University of Edinburgh, Edinburgh, Scotland, 1996.

64. S. Chen, C. F. N. Cowan, and P. M. Grant, “Orthogonal least squares learning algorithm for radial basis function networks,” IEEE Trans. Neural Netw. 2(2), 302–309 (1991). [CrossRef]  

65. D. Manrique, J. Rios, and A. Rodriguez-Paton, “Evolutionary system for automatically constructing and adapting radial basis function networks,” Neurocomputing 69(16-18), 2268–2283 (2006). [CrossRef]  

66. J. A. Hartigan and M. A. Wong, “Algorithm AS 136: A K-means clustering algorithm,” J. R. Stat. Soc. Series C 28(1), 100–108 (1979). [CrossRef]  

67. F. A. Tobar, S. Y. Kung, and D. P. Mandic, “Multikernel least mean square algorithm,” IEEE Trans. Neural Netw. Learning Syst. 25(2), 265–277 (2014). [CrossRef]  

68. S. T. Roweis and L. K. Saul, “Nonlinear dimensionality reduction by locally linear embedding,” Science 290(5500), 2323–2326 (2000). [CrossRef]  

69. H. Abdi and L. J. Williams, “Principal component analysis,” Wiley Interdiscip. Rev. Comput. Stat. 2(4), 433–459 (2010). [CrossRef]  

70. I. T. Jolliffe, Principal Component Analysis (Springer, 1986).

71. R. K. Leach, Fundamental Principles of Engineering Nanometrology (Elsevier, 2014).

72. D. Claus, G. Pedrini, T. Boettcher, M. Taphanel, W. Osten, and R. Hibst, “Development of a realistic wave propagation-based chromatic confocal microscopy model,” Proc. SPIE 10677, 106770X (2018). [CrossRef]  

73. D. Duque and J. Garzon, “Effects of both diffractive element and fiber optic based detector in a chromatic confocal system,” Opt. Laser Technol. 50, 182–189 (2013). [CrossRef]  



Figures (12)

Fig. 1. Schema of the CCM characterization procedure.
Fig. 2. Structure of the hybrid RBFN.
Fig. 3. Refractive index fitting error of Buchdahl’s dispersion formula against wavelength.
Fig. 4. Accuracy of the chromatic dispersion model for a hyperchromatic objective lens with multiple lenses. (a) Displacement-wavelength relationships for the ray tracing data and the chromatic dispersion model and (b) variation of the residual errors of the chromatic dispersion model with wavelength.
Fig. 5. The topological structure of an RBFN.
Fig. 6. Spectral ARSs at different axial positions.
Fig. 7. Illustrations of the training/validation and testing data sets.
Fig. 8. Network training performance of different peak extraction algorithms.
Fig. 9. Network training performance of different displacement-wavelength fitting models.
Fig. 10. The individual and cumulative proportions of the data variance in the PCA.
Fig. 11. Network training performance with and without PCA-based dimensionality reduction.
Fig. 12. Measurement results with different calibration models. (a) Systematic errors at different PZT displacements and (b) standard deviations at different PZT displacements.

Equations (8)

$$N(\lambda) = N_0 + \nu_1\omega + \nu_2\omega^2 + \cdots + \nu_q\omega^q,$$

$$\omega = \frac{\lambda/1000 - 0.5876}{1 + 2.5\,(\lambda/1000 - 0.5876)},$$

$$f(\lambda) = \frac{1}{N(\lambda) - 1}\cdot\frac{r_1 r_2}{r_2 - r_1},$$

$$f(\lambda) = \frac{f_{nom}(n_{nom} - 1)}{N(\lambda) - 1} = \frac{C_{nom}}{N(\lambda) - 1},$$

$$f(\lambda) \approx \frac{C_{nom}}{N_0 - 1}\left[1 - \frac{1}{N_0 - 1}\left(\nu_1\omega + \nu_2\omega^2 + \cdots + \nu_q\omega^q\right)\right],$$

$$\varphi(X) = \sum_{i=1}^{N_h} w_i \exp\!\left(-\frac{\|X - k_i\|^2}{2\sigma_i^2}\right),$$

$$RMSE = \sqrt{\frac{1}{N_t}\sum_{t=1}^{N_t}\left(Y_t^{des} - Y_t^{net}\right)^2},$$

$$\begin{cases}
z_1 = \alpha_{11}x_1 + \alpha_{12}x_2 + \cdots + \alpha_{1n}x_n \\
z_2 = \alpha_{21}x_1 + \alpha_{22}x_2 + \cdots + \alpha_{2n}x_n \\
\quad\vdots \\
z_l = \alpha_{l1}x_1 + \alpha_{l2}x_2 + \cdots + \alpha_{ln}x_n ,
\end{cases}$$
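The RBFN output, a weighted sum of Gaussian basis functions centred at the hidden-unit centres $k_i$, can be sketched as a short forward pass; the centres, widths and weights below are toy values, not the trained network's parameters:

```python
import numpy as np

def rbfn_forward(X, centers, sigmas, weights):
    """Evaluate phi(X) = sum_i w_i * exp(-||X - k_i||^2 / (2 * sigma_i^2))."""
    # Squared Euclidean distance from the input X to each hidden-unit center k_i.
    sq_dist = np.sum((centers - X) ** 2, axis=1)
    return float(np.dot(weights, np.exp(-sq_dist / (2.0 * sigmas ** 2))))

# Toy network: two hidden units in a 3-dimensional input space (illustrative).
centers = np.array([[0.0, 0.0, 0.0],
                    [1.0, 1.0, 1.0]])
sigmas = np.array([1.0, 0.5])   # basis-function widths sigma_i
weights = np.array([0.7, 0.3])  # output weights w_i

print(rbfn_forward(np.zeros(3), centers, sigmas, weights))
```

In the paper's setting the input would be a PCA-reduced spectral signal and the output the residual error of the chromatic dispersion model, with the centres, widths and weights fixed during training.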