## Abstract

We propose a polarization-based probabilistic discriminative model for deriving a set of new sigmoid-transformed polarimetry feature parameters, which not only enables accurate and quantitative characterization of cancer cells at the pixel level, but also accomplishes the task with a simple and stable model. By taking advantage of polarization imaging techniques, these parameters enable a low-magnification, wide-field imaging system to separate cells into more specific categories that previously could be distinguished only under high magnification. Instead of choosing the model blindly, the L0 regularization method is used to obtain simplified and stable polarimetry feature parameters. We demonstrate the viability of the model using pathological tissues of breast cancer and liver cancer; for each tissue type, two derived parameters characterize the cells and the cancer cells, respectively, with satisfactory accuracy and sensitivity. The stability of the final model opens the possibility of physical interpretation and analysis. This technique may bypass the typically labor-intensive and subjective tumor evaluation process, and could be used as a blueprint for an objective and automated procedure for cancer cell screening.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

## 1. Introduction

Cancer is one of the leading causes of death in 127 countries [1]. Trends indicate that cancer will soon be the leading cause of premature death, given the decreasing rate of premature death due to cardiovascular diseases. Among all types of cancer, female breast cancer was the most commonly diagnosed cancer in 2021, surpassing lung cancer [2]. Liver cancer is the third leading cause of cancer death worldwide. For cancer diagnosis, pathologists examine pathological tissues under high-resolution microscopes to look for textural indications of cancer. While this method will remain the gold standard for the coming decades, such manual examination is qualitative, subjective, and labor-intensive [3]. There is therefore a high clinical demand for a method capable of wide-field quantitative evaluation of pathological slides.

In recent years, optical imaging techniques combined with machine learning models have been gradually emerging in digital pathology and diagnosis, which enriches the forms of input data available to the models and expands their ability to acquire microstructural information [4–6]. The Mueller matrix is a comprehensive description of a sample’s polarization properties, containing abundant microstructural information and optical properties [7–9]. To quantitatively decode the Mueller matrix, sets of polarization parameters with physical meanings have been derived by experiments and simulations [10–13] and have demonstrated promising prospects in microstructural characterization of complex biological specimens [14–17]. It has been shown that the contrast mechanisms of 2D images of polarization parameters depend on the samples’ polarization characteristics and less on the imaging resolution [18,19], making it possible to characterize microstructures that may not be visible to the human eye under a low-resolution system. These polarization parameters can be used alone, or as polarimetry basis parameters (PBPs) to construct polarimetry feature parameters (PFPs), which have explicit connections to the sample’s microstructural characteristics, by employing various machine learning methods. For example, the Mueller matrix polar decomposition parameters $D,\; \; P,$ and $\mathrm{\Delta }$ can be associated with diattenuation, polarizance, and depolarization [13]. Previously, we proposed a linear discriminant analysis (LDA) based approach for deriving PFPs as simplified linear functions of the PBPs, for quantitative characterization of cells and fiber collagen in various breast pathological tissues [5].

Inspired by this, in this paper we propose a novel framework for deriving complex sigmoid-transformed PFPs for quantitative characterization of a more specific and finer microstructure than cells – cancer cells. The polarization-based probabilistic discriminative model (P-PDM), designed according to the degrees of complexity of the pathological features and their corresponding polarization characteristics, aims at objective cancer cell identification. Cancerous cells are identified by first identifying all the cells, and then separating the normal cells from the cancerous cells. The nodes in P-PDM are built using L0 regularized linear logistic regression (LR) classifiers [20] and connected by using conditional probability and Bayes’ theorem. Specifically, we measured the PBPs of pathological tissue samples using the Mueller matrix microscope [21,22] as input data of the P-PDM, and the model outputs sigmoid-transformed PFPs whose forms depend on the number of nodes used, the edges connecting the nodes, and the probability formulas describing the nodes, which can be dissected to extract the physical meanings. A comparison is made between the L0 [20] and L1 [23] regularization methods, and L0 regularization is selected for model stability. We then demonstrate the viability of the proposed P-PDM on hematoxylin and eosin (H&E) sections of breast cancer tissues and liver cancer tissues; for each tissue type, two derived final PFPs characterize the cells and the cancer cells, respectively, with satisfactory accuracy and sensitivity of 85% and above. The pixel-based model learns the polarization features of single pixels, which may be measured even when the corresponding microstructure is not visible to the human eye, and then decides which structure type each pixel in the polarization images belongs to. Therefore, the derived final PFPs may accomplish these tasks in a low-resolution and wide-field system, which paves the way for quantitative and rapid screening of cancer cells in pathological sections.

## 2. Related work

Polarization imaging-based machine learning methods for computer-aided cancer diagnosis in pathology have been emerging in recent years. Dremin *et al.* introduced a high-resolution polarization imaging technique combined with the k-means method for automatically clustering the pixels of breast cancer tissues into three types: fat tissue, benign fibrosis, and epithelial carcinoma located inside the milk duct [24]. This work shows the possibility of classification, but the performance of the proposed model on a large number of samples is not reported. Sindhoora *et al.* calculated 19 gray level co-occurrence matrix features of polarization parameter images measured under a 40× objective lens and input them into a support vector machine classifier to detect the tumor regions of ductal cancer tissues [25]. Zhou *et al.* processed high-resolution polarization hyperspectral images using multiple machine learning classifiers, such as random forest, support vector machine, Gaussian naive Bayes, and logistic regression [26]. They explored the possibility of classifiers based on polarization hyperspectral features for automatic detection of head and neck squamous cell carcinoma on H&E-stained tissue sections. Christian *et al.* constructed a decision-theoretic framework for biological tissue classification using Mueller polarimetry, introduced a preprocessing method involving superpixels, and then tested different machine learning models using *ex vivo* specimens of uterine cervix [27]. Ivanov *et al.* measured Mueller matrix images of *ex vivo* human colon samples and decomposed them using symmetric decomposition and depolarization metric calculus. With the help of supervised and unsupervised machine learning methods, including logistic regression, random forest, support vector machine, and principal component analysis, they showed promising results in distinguishing healthy and tumor human colon samples [28]. Wang *et al.* combined principal component scores with polarimetric parameters for structure classification, proposing the term digital staining of H&E slides. The method was validated on intestinal metaplasia and normal glands [29].

In addition to machine learning models based on handcrafted features, some researchers are also investigating the use of deep convolutional neural networks (CNNs) to extract features from polarization images of cancer tissues for automatic aided diagnosis of cancer cells. Zhao *et al.* established a Mueller matrix image data set of pathological tissues of giant cell tumor of bone at 40× magnification [30]. The data set was fed into a CNN to extract microstructural information as deep features, and Mueller matrix derived parameters were calculated as manual features. A multi-parameter fusion network was then proposed to fuse the two kinds of features for automatic detection of pathological samples of giant cell tumor of bone. Xia *et al.* and Ma *et al.* proposed ReSE Net [31] and MuellerNet [32], respectively, for the classification of different breast cancer cells. High-resolution Mueller matrix images of single cancer cells were measured and used as input data of the networks. ReSE Net adds a SENet on top of ResNet, obtaining the weight of each polarization feature channel according to its importance for classification. MuellerNet includes a normal stream composed of a ResNet for processing light intensity images and a polarization stream composed of a CNN with an attention mechanism for processing polarization images. The accuracies of the two networks are about 88% and 86%, respectively. Roa *et al.* used a ResNet-based deep learning model to segment cervical collagen and elastin, where the model performance was compared with second harmonic generation and two-photon excitation fluorescence. The authors also proposed a CNN-K-NN hybrid model, or the use of U-net if data are abundant [33].

Based on the above, to our knowledge, this is the first work to derive sigmoid-transformed PFPs with physical meanings and feature specificity for quantitative characterization of cancer cells at the pixel level in breast and liver pathological tissues in a low-resolution and wide-field system.

## 3. Methods

#### 3.1 Pathological samples

To test our method, we used two kinds of H&E-stained pathological tissue slides (4-μm-thick) – breast cancer and liver cancer. Pathological breast cancer tissue slices from 9 patients were provided by University of Chinese Academy of Sciences Shenzhen Hospital. In addition, the liver cancer pathological tissue samples used in this study were acquired from 7 clinical patients of Fujian Medical University Cancer Hospital. The size of the regions-of-interest (ROIs) in this study is 1000×1000 pixels. For each pathological tissue sample, pathologists selected 6 ROIs, forming three groups of two ROIs each. The three groups respectively contain a large number of non-cell tissues (fiber structures, background, or other tissues, denoted as N tissues), non-cancer cells (N cells), and cancer cells (C cells). Meanwhile, the target microstructure in each ROI was labelled manually by pathologists using a MATLAB graphical user interface to produce the mask used as ground truth for cross validation of the models. More details about the pathological features and how pathologists label target microstructures in H&E images are provided in Section 1 of Supplement 1. Thus, a total of 54 ROIs of breast cancer tissues and 42 ROIs of liver cancer tissues were analyzed. This work was approved by the Ethics Committees of University of Chinese Academy of Sciences Shenzhen Hospital and Fujian Medical University Cancer Hospital.

#### 3.2 Data acquisition

### 3.2.1 Experimental setup

The H&E-stained sections from biopsy were placed in the Mueller matrix microscope to obtain the sample’s Mueller matrix image under a 4× objective lens and its H&E pathological image under a 20× objective lens. Figure 1(a) shows a photograph and schematic of the Mueller matrix microscope. The microscope was constructed by adding a polarization state generator and analyzer to a commercial transmission-light microscope [21], and works by adopting the typical dual rotating retarder method [22]. This instrument has already been described in previous publications [5,6,15,19]. Details on the construction and optimization of the system can be found in [34–36]. For the sake of completeness, more information about the instrument is provided in Section 2 of Supplement 1. Figure 1(b) is an example of the measured Mueller matrix of a breast cancer pathological section under a 4× objective lens. The element m11 is the sample’s intensity image, and the other 15 elements represent the sample’s complete polarization features and are all normalized by m11. Figure 1(c) is the corresponding H&E image obtained under a 20× objective lens, in which several target microstructures are labelled by pathologists with solid lines in different colors.
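The normalization by m11 mentioned above can be illustrated with a minimal sketch. The array layout `(H, W, 4, 4)`, the function name `normalize_mueller`, and the `eps` guard against division by zero are assumptions made for illustration and are not taken from the instrument software.

```python
import numpy as np

def normalize_mueller(mueller, eps=1e-12):
    """Divide every element of a per-pixel Mueller matrix image by m11.

    mueller: array of shape (H, W, 4, 4), with mueller[..., 0, 0] holding m11.
    Returns (m11, normalized); normalized[..., 0, 0] equals 1 wherever m11 > eps.
    """
    mueller = np.asarray(mueller, dtype=float)
    m11 = mueller[..., 0, 0]                      # intensity image
    safe = np.maximum(m11, eps)                   # avoid division by zero in dark pixels
    normalized = mueller / safe[..., None, None]  # broadcast over the 4x4 elements
    return m11, normalized
```

The intensity image `m11` is returned separately because it is later used as the fixed image for registration with the H&E image.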

### 3.2.2 Polarimetry basis parameters

Although the Mueller matrix contains a sample’s complete polarization properties and abundant microstructural information, it is often inconvenient to use the matrix elements directly, since they lack explicit connections to the microstructures and are sensitive to sample orientation. Recently, multiple techniques based on physical theory have been adopted to derive several sets of PBPs, which have clear physical meanings and are either insensitive to or explicitly related to the orientation angle of the sample. The input data of the designed machine learning model are single pixels, each with a series of polarization features. If the polarization features are affected by the sample orientation, the convergence and robustness of the trained model may be impaired, making the extracted PFPs unstable. Therefore, we use the azimuthal-invariant PBPs, rather than Mueller matrix elements or polarization intensity images (which are affected by the light source intensity), as input features for deriving new feature-specific polarization parameters (PFPs). Table S1 (Section 3 in Supplement 1) summarizes the computing formulas and physical meanings of PBPs commonly used in polarimetry. These PBPs are all used as the input polarization features of samples to derive PFPs for the target microstructure characterization in this study. Lu and Chipman proposed the Mueller matrix polar decomposition (MMPD) method and derived linear retardation $\delta$, diattenuation $D$, depolarization $\Delta$, and optical rotation $\psi$ [10]. The Mueller matrix transformation (MMT) method proposed in our previous study extracted anisotropy degree $t_1$, polarizance $b$, and circular birefringence $\beta$ [11]. The Mueller matrix rotation invariant (MMRI) parameters also decode effective information from the Mueller matrix, including linear polarizance $P_L$, linear diattenuation $D_L$, and the linear birefringence related parameters $r_L$ and $q_L$ [12].

In addition to the above parameters, a Mueller matrix asymmetry parameter (MMAP), namely $P_{TMS}$, has been proposed. For a transversal mirror symmetric (TMS) sample, the Mueller matrix elements should be symmetric, specifically $\textrm{m24} = -\textrm{m42}$ and $\textrm{m34} = -\textrm{m43}$ [13]. The parameter $P_{TMS}$ measures the breaking of such symmetry, and its value for a TMS sample is 0. Previous studies of various pathological tissues have shown that the parameters of MMPD, MMT, MMRI, and MMAP in transmission polarimetry have good potential for probing microstructures and facilitating medical diagnosis [5–7,15,21,31].

#### 3.3 Overview

In this study, we proposed a P-PDM for quantitative characterization of cancer cells under a low-resolution and wide-field system. First, we took the Mueller matrix images of pathological tissues under a 4× objective and H&E images under a 20× objective. The element m11 of the Mueller matrix represents the intensity image of the sample, which can be used as the fixed image for pixel-level registration with the H&E image of the sample (Fig. 2(a)). The pixel-level registration adopts the affine transformation method [37] and generates the transformation matrix **T**. In the H&E image, the target microstructures were manually labelled by experienced pathologists. The ROIs were selected in the overlapping area of the two images (Fig. 2(b)). Then sets of polarization parameters of the ROI, derived in earlier studies, were calculated from the Mueller matrix as the PBPs (Fig. 2(c)). The labels were transformed by the matrix **T** to produce label masks, which were mapped onto the PBP images to select target pixels of the three microstructures (Fig. 2(d)). Then, we selected the best PBP groups from MMPD, MMT, MMRI, and MMAP as the input features of the selected classifiers (Fig. 2(e)). The classifiers used in this study include P-PDM based on L0 regularization, P-PDM based on L1 regularization, artificial neural network (ANN), and LDA. The P-PDM outputs two sigmoid-transformed PFPs, which can be used for quantitative characterization of the cells and C cells in the ROI (Fig. 2(f)).

#### 3.4 Algorithm architecture

### 3.4.1 Image registration

In this study, the input data of the P-PDM were the target pixel values in the PBPs. These pixels were selected by the label maps in the corresponding H&E images. The H&E images were obtained under a 20× objective, which is consistent with the objective lens used in clinical observation of cell morphology for evaluation and diagnosis. Therefore, the resolution of the H&E images is sufficient for pathologists to label the target microstructures, and the labelling is rarely affected by error sources such as saturated pixels and mixed typology information. To produce the mask that is mapped directly onto the sample’s PBP images to select pixels, we need pixel-by-pixel registration between the sample’s Mueller matrix image and H&E image, which yields the transformation matrix **T**. As shown in Fig. 2(a), we adopted the affine transformation method to transform the H&E image (moving image) to match the m11 image (fixed image). Specifically, we called the *cpselect* function to start the Control Point Selection Tool in MATLAB. After several feature points were selected in the m11 image and the corresponding H&E image, the *fitgeotrans* function was called to produce the transformation matrix **T**, with the *transformation type* set to “affine”. The pathologists’ markings of the target microstructure on the H&E image and the matrix **T** were passed to the *imwarp* function, which produced the mask used to select target pixels in the PBPs as input of the classifiers.
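For readers working outside MATLAB, the registration step can be sketched in Python with NumPy. This is not the *cpselect*/*fitgeotrans*/*imwarp* workflow itself but a rough stand-in under stated assumptions: control points are given as `(x, y)` pairs, the affine matrix is estimated by ordinary least squares, and the label mask is resampled with nearest-neighbour interpolation. The function names `fit_affine`, `apply_affine`, and `warp_mask` are hypothetical.

```python
import numpy as np

def fit_affine(moving_pts, fixed_pts):
    """Least-squares 3x3 affine matrix T mapping moving (x, y) points onto fixed ones."""
    moving_pts = np.asarray(moving_pts, dtype=float)
    fixed_pts = np.asarray(fixed_pts, dtype=float)
    A = np.column_stack([moving_pts, np.ones(len(moving_pts))])  # (n, 3) homogeneous
    params, *_ = np.linalg.lstsq(A, fixed_pts, rcond=None)       # (3, 2)
    T = np.eye(3)
    T[:2, :] = params.T
    return T

def apply_affine(T, pts):
    """Apply a 3x3 affine matrix to an (n, 2) array of (x, y) points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.column_stack([pts, np.ones(len(pts))])
    return (homog @ T.T)[:, :2]

def warp_mask(mask, T, out_shape):
    """Nearest-neighbour warp of a boolean label mask into the fixed (m11) frame."""
    rows, cols = np.indices(out_shape)
    fixed_xy = np.column_stack([cols.ravel(), rows.ravel()])
    # Look up, for every fixed-image pixel, the moving-image pixel it came from.
    src = np.rint(apply_affine(np.linalg.inv(T), fixed_xy)).astype(int)
    inside = ((src[:, 0] >= 0) & (src[:, 0] < mask.shape[1]) &
              (src[:, 1] >= 0) & (src[:, 1] < mask.shape[0]))
    flat = np.zeros(len(fixed_xy), dtype=bool)
    flat[inside] = mask[src[inside, 1], src[inside, 0]]
    return flat.reshape(out_shape)
```

At least three non-collinear control point pairs are needed for the affine fit to be well determined; additional pairs are averaged in the least-squares sense.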

### 3.4.2 Polarization-based probabilistic discriminative model

Here, the designed P-PDM consists of two L0 regularized linear LR classifiers connected by prior knowledge. The model is designed to first extract all the cell pixels, and then isolate the cancerous-cell pixels from the normal-cell pixels. The LR classifiers were adopted as nodes in P-PDM for two reasons: (i) the LR classifier is a discriminative model that does not assume any prior distribution of the input data [38], which matters because the probability distributions of the input features, i.e., the PBPs, in certain classes may not be Gaussian; and (ii) some studies have preliminarily demonstrated the ability of the LR classifier to characterize different biological tissue samples based on polarization data [39].

The L0 regularization is imposed using orthogonal matching pursuit (OMP) [40]. Given the number of allowed parameters to be used in the linear model, OMP logistic regression finds the sparse solution to the classification problem using a greedy approach [40]. The hyperparameter for this model is the number of parameters allowed.

Note that the output of each node should resemble a probability, or posterior probability, to enable classification post-processing. The output of standard linear regression is simply a combination of input features rather than a calibrated probability. Therefore, we introduced the sigmoid function, a bounded differentiable real function ranging from 0 to 1, to map the outputs of the nodes into probabilities [41]. After that, Bayes’ theorem is applied to derive the sigmoid-transformed PFPs by multiplying these probabilities. The PFP value of a pixel is then the probability that the pixel belongs to the class of interest. Specifically:

Given a pixel from the ROI image, the aim of the P-PDM is to predict the class that this pixel belongs to according to its series of PBPs. The P-PDM estimates the probability that a certain pixel belongs to the cells class or the C cells class, and outputs two sigmoid-transformed PFPs for the characterization of the respective target microstructures. The prior knowledge that cancer cells are a subclass of cells is fully utilized when designing the graphical model. Specifically, given a randomly sampled pixel ${\textrm{x}_{\textrm{i,j}}}$ from the ROI image, $\textrm{P(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}{\; )}$ is the probability that ${\textrm{x}_{\textrm{i,j}}}$ belongs to the cells class (denoted by ${\textrm{S}_{\textrm{Cells}}}$). Given a pixel ${\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}$, the conditional probability that the pixel belongs to the C cells class (denoted by ${\textrm{S}_{\textrm{C cells}}}$) is $\textrm{P(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{C cells}}}|{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$. The probability formulas of interest are sigmoid functions of PBP combinations, shown as

$$\begin{aligned} \textrm{P(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)} &= \frac{1}{1 + \exp [{ - \textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}} ]}, \\ \textrm{P(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{C cells}}}|{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)} &= \frac{1}{1 + \exp [{ - \textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{C cells}}}|{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}} ]}, \end{aligned} \tag{1}$$

where $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$ in (1) is the linear combination of PBPs learnt by using an L0 regularized LR classifier (named as ${\textrm{L}_{\textrm{Cells}}}(\textrm{C} )$, where *C* is the hyperparameter in the classifier) for distinguishing cells from other microstructures in pathological tissues. $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{C cells}}}|{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$ in (1) is the linear combination of PBPs learnt by using an L0 regularized LR classifier (named as ${\textrm{L}_{\textrm{C cells}}}(\textrm{C} )$) for the recognition of cancer cells given that the pixel belongs to the cell class.

As mentioned in Section 3.1, pathologists labelled three target microstructures in ROIs – N tissues, N cells, and C cells – in breast cancer and liver cancer pathological tissues. During the training process, at least 10000 pixels were randomly sampled from each target microstructure, containing the feature information in the labelled area. Even if a small number of pixels contained artifacts or the target region contained some noise, these pixels were included in the training process to improve the robustness of the model. Here, grid search based on cross validation was used to determine the hyperparameter *C* in the LR classifiers, which is the number of parameters used in the OMP model.

In each fold of the cross-validation, 1 ROI is selected as the testing set, and the rest form the training set. For the ${\textrm{L}_{\textrm{Cells}}}(\textrm{C} )$ node, which differentiates cells from other structures, the L0 regularized LR classifier is trained by using the selected PBPs of different structures, with non-cell tissues (N tissues) labelled as the negative class, and non-cancer cells and cancer cells (N cells and C cells) labelled as the positive class. As for ${\textrm{L}_{\textrm{C cells}}}(\textrm{C} )$, which differentiates C cells from N cells, the classifier is trained by using the N cells and C cells data set, labelled as normal and cancerous respectively.
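A leave-one-ROI-out loop with a grid search over the number of allowed parameters might be organized as in the sketch below. The callbacks `fit` and `predict`, the per-pixel `roi_ids` array, and the use of mean held-out accuracy as the selection criterion are assumptions for illustration; the study's actual scoring may differ.

```python
import numpy as np

def leave_one_roi_out(roi_ids):
    """Yield (held-out ROI, train indices, test indices), one fold per ROI."""
    roi_ids = np.asarray(roi_ids)
    for roi in np.unique(roi_ids):
        test = roi_ids == roi
        yield roi, np.flatnonzero(~test), np.flatnonzero(test)

def grid_search_n_params(X, y, roi_ids, candidates, fit, predict):
    """Return the candidate with the best mean held-out accuracy, plus all scores.

    fit(X_train, y_train, k) -> model and predict(model, X_test) -> labels
    are caller-supplied (hypothetical) callbacks.
    """
    mean_acc = {}
    for k in candidates:
        accs = []
        for _, train, test in leave_one_roi_out(roi_ids):
            model = fit(X[train], y[train], k)
            accs.append(np.mean(predict(model, X[test]) == y[test]))
        mean_acc[k] = float(np.mean(accs))
    best = max(mean_acc, key=mean_acc.get)
    return best, mean_acc
```

Splitting by ROI rather than by pixel keeps neighbouring, highly correlated pixels from appearing in both the training and testing sets.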

For ${\textrm{L}_{\textrm{Cells}}}(\textrm{C} )$ training, we randomly sampled 10000 pixels from the N cells class and 10000 pixels from the C cells class, and labelled them as the positive class. Meanwhile, 20000 pixels were sampled from the N tissues class and labelled as the negative class. The input data of ${\textrm{L}_{\textrm{Cells}}}(\textrm{C} )$ are the PBPs of these pixels, and the output is $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$, which is a polarization parameter for quantitative characterization of cells (vs non-cell tissues) in pathological tissues. For the identification of cells in breast cancer and liver cancer, the number of parameters used was chosen as 2, based on grid search with the OMP model, cross validation, and prior research [5].

For ${\textrm{L}_{\textrm{C cells}}}(\textrm{C} )$ training, we labelled 10000 pixels sampled from the C cells class as the positive class, and labelled 10000 pixels sampled from the N cells class as the negative class. After training, ${\textrm{L}_{\textrm{C cells}}}(\textrm{C} )$ produces $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{C cells}}}|{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$, which can be used to predict if a pixel belongs to the C cells class or not, given that the pixel belongs to the cell class. After grid search based on the cross validation, the number of parameters used was determined as 2 in breast cancer and liver cancer tissues.
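The sampling and labelling scheme for the two nodes can be sketched as below. The array layout (a PBP stack of shape `(H, W, n_pbps)` with one boolean mask per class) and the helper names are assumptions for illustration; in the study, pixels are pooled over the training ROIs rather than drawn from a single image.

```python
import numpy as np

def sample_pixels(pbp_stack, mask, n, rng):
    """Randomly draw n labelled pixels; returns an (n, n_pbps) feature array."""
    rows, cols = np.nonzero(mask)
    pick = rng.choice(len(rows), size=n, replace=False)
    return pbp_stack[rows[pick], cols[pick]]

def build_node_datasets(n_tissue, n_cell, c_cell):
    """Assemble (X, y) for the cells node and for the C cells node."""
    # Node 1: all cells (N cells + C cells) are positive, N tissues are negative.
    X_cells = np.vstack([n_cell, c_cell, n_tissue])
    y_cells = np.r_[np.ones(len(n_cell) + len(c_cell)), np.zeros(len(n_tissue))]
    # Node 2: within cells only, C cells are positive and N cells are negative.
    X_cancer = np.vstack([c_cell, n_cell])
    y_cancer = np.r_[np.ones(len(c_cell)), np.zeros(len(n_cell))]
    return (X_cells, y_cells), (X_cancer, y_cancer)
```

With 10000 N cells, 10000 C cells, and 20000 N tissues pixels, both training sets are class-balanced, which matches the pixel counts described above.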

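By implementing the linear LR classifiers, the PBP combinations $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$ and $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{C cells}}}|{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$ can be obtained, which are used to calculate the two probability formulas in (1). The output of the P-PDM is the final sigmoid-transformed PFPs, which are defined as the products of probability formulas, calculated as

$$\begin{aligned} \textrm{PF}{\textrm{P}_{\textrm{Cells}}} &= \textrm{P(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}, \\ \textrm{PF}{\textrm{P}_{\textrm{C cells}}} &= \textrm{P(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)} \cdot \textrm{P(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{C cells}}}|{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}. \end{aligned} \tag{2}$$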
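A minimal numerical sketch of how the two node outputs could be combined into the final parameters is given here. The weight vectors and intercepts are hypothetical placeholders for the coefficients learnt by the two LR nodes, and whether the learnt linear combinations include an intercept is an assumption.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ppdm_outputs(pbps, w_cells, b_cells, w_cancer, b_cancer):
    """Return (PFP_Cells, PFP_C_cells) for an (n_pixels, n_pbps) array of PBPs."""
    # Node 1: probability that a pixel belongs to the cells class.
    p_cells = sigmoid(pbps @ w_cells + b_cells)
    # Node 2: probability of a cancer cell, given that the pixel is a cell.
    p_cancer_given_cells = sigmoid(pbps @ w_cancer + b_cancer)
    # Cancer cells are a subclass of cells, so the joint probability is the product.
    return p_cells, p_cells * p_cancer_given_cells
```

Since the second factor never exceeds 1, the cancer-cell parameter of a pixel can never exceed its cell parameter, which reflects the subclass relation built into the model.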

### 3.4.3 Artificial neural network and linear discriminant analysis

Linear discriminant analysis (LDA) is one of the most basic machine learning models for classification tasks. Given two groups of data to be separated and classified, the LDA algorithm finds a linear hyperplane in the feature space that separates the groups, assuming that each group follows a Gaussian distribution [42]. The viability of LDA for PFP extraction was presented and discussed in our previous work, where it provided clear interpretability and satisfactory performance for simple structure identification tasks [5]. However, as we move on to identifying more complex pathological structures such as cancerous cells, the performance of LDA suffers because of its linearity. To implement LDA in our work, we used the discriminant analysis package in the Scikit-learn library.
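To make the linear decision rule concrete, a two-class LDA can be written in a few lines of NumPy. This is a textbook sketch and not the Scikit-learn implementation used in the study; the function names are hypothetical, and labels are assumed to be 0 and 1.

```python
import numpy as np

def fit_lda(X, y):
    """Two-class LDA with a pooled covariance; returns weights w and offset b."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class covariance, shared by both classes under the LDA assumption.
    pooled = (np.cov(X0, rowvar=False) * (len(X0) - 1)
              + np.cov(X1, rowvar=False) * (len(X1) - 1)) / (len(X) - 2)
    w = np.linalg.solve(np.atleast_2d(pooled), mu1 - mu0)
    b = -0.5 * w @ (mu0 + mu1) + np.log(len(X1) / len(X0))
    return w, b

def predict_lda(w, b, X):
    """Assign class 1 where the linear discriminant is positive."""
    return (X @ w + b > 0).astype(int)
```

The quantity `X @ w` is a single linear combination of the input features, which is what makes an LDA-derived parameter easy to interpret but limits it to linear decision boundaries.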

The artificial neural network (ANN) is at the other end of the model-complexity spectrum. Given an input data point, an ANN processes it with a layered directed graph of hidden units, where each hidden unit is essentially a tunable linear model followed by a non-linear activation function, designed to mimic biological neural activity [43]. The coefficients in the hidden units are fine-tuned using gradient descent during training. It is often the baseline model for supervised learning tasks. The ANN benefits from its nonlinear decision boundary and is capable of performing complex classification tasks, but it is analogous to a black box and lacks model interpretability. For implementation, we first normalized the input features, and then trained the model using the *MLPClassifier* function in scikit-learn.

Our proposed P-PDM sits between these two ends of the spectrum; it has good interpretability owing to its model structure, which is inspired by prior knowledge, while sustaining performance comparable to that of the ANN. For implementation details of LDA and ANN, see [5].

For comparison with P-PDM, we adopted a three-class ANN classifier, which can be used to identify cells, C cells, and N tissues in pathological tissues. Cross validation was used to determine the hyperparameters of the three-class ANN classifier. After grid search, we obtained the parameter settings of the two optimized ANN classifiers for the cancer cell recognition tasks in breast and liver cancer tissues respectively. For breast cancer, the *Hidden_layer_size* was set to (50) and the *Learning_rate_init* to 0.00001. For liver cancer, the *Hidden_layer_size* was set to (75,75) and the *Learning_rate_init* to 0.01. In addition, to compare with our previous study, we also employed four two-class LDA classifiers for the quantitative characterization of the cells and C cells in breast and liver cancer tissues respectively. The data set and labels of the ANN and LDA classifiers are the same as those of P-PDM. ANN and LDA were implemented through the open-source library Scikit-learn in Python version 3.7.6 on an Intel Core i7-9700 CPU @3.00GHz.
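A configuration along these lines could look as follows in Scikit-learn, where the corresponding arguments are named `hidden_layer_sizes` and `learning_rate_init`. The use of `StandardScaler` for the feature normalization, the `max_iter` value, and the fixed `random_state` are assumptions added for this sketch and are not reported settings.

```python
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def build_ann(hidden_layer_sizes, learning_rate_init):
    """Feature normalization followed by a multi-class MLP classifier."""
    return make_pipeline(
        StandardScaler(),  # assumed form of the input normalization
        MLPClassifier(
            hidden_layer_sizes=hidden_layer_sizes,
            learning_rate_init=learning_rate_init,
            max_iter=500,      # assumption, not a reported setting
            random_state=0,    # fixed only to make the sketch reproducible
        ),
    )

# Settings quoted in the text for the two tissue types.
ann_breast = build_ann((50,), 0.00001)
ann_liver = build_ann((75, 75), 0.01)
```

Each pipeline is trained with `fit(X, y)`, where `X` holds the PBPs of the sampled pixels and `y` the three class labels.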

## 4. Results

Before analyzing the experimental results, recall the three types involved: non-cell tissues (N tissues), non-cancer cells (N cells), and cancer cells (C cells), with N cells and C cells belonging to cells. The output of P-PDM is two sigmoid-transformed PFPs – $\textrm{PF}\textrm{P}_\textrm{Cells}$ and $\textrm{PF}\textrm{P}_\textrm{C cells}$, which have great potential for specific and quantitative characterization of cells and C cells in pathological tissues, respectively.

#### 4.1 Selection of polarimetry basis parameters

Here, we introduced the four PBP groups commonly adopted in polarimetry (Section 3 in Supplement 1). However, there are important correlations between such observables. To ensure the robustness and convergence of the model, we studied the selection of input PBPs to improve the linear independence of the input variables. Specifically, PBP groups were selected before being used as input to the machine learning classifiers.

During cross validation, the four groups of PBPs, i.e., MMPD, MMT, MMRI, and MMAP, were used as input to the proposed P-PDM. We calculated and analyzed the average accuracy of classifying different cells when different PBP groups were treated as input data. Based on the classification performance, we determined which PBP groups should be selected, and demonstrated the necessity and advantages of combining them. As shown in Table 1, we can observe that (i) in breast tissues, the combination of MMT and MMRI yields the best overall classification accuracy for cell and cancer cell detection; and (ii) in liver tissues, the classifier employing the MMPD, MMT, and MMAP groups as input data has the most balanced performance for the identification of cells and cancer cells. Therefore, MMT and MMRI were selected as the input parameter groups in breast tissues, while the MMPD, MMT, and MMAP parameter groups were selected in liver tissues.

#### 4.2 Physical interpretability of nodes

Figures 3 and 4 summarize the output results of ${\textrm{L}_{\textrm{C cells}}}(\textrm{C} )$ and ${\textrm{L}_{\textrm{Cells}}}(\textrm{C} )$, respectively, in breast cancer and liver cancer pathological tissues. These are simplified linear functions of the PBPs and can be used to calculate the probability formulas by Eq. (1). After obtaining the probability formulas describing the nodes in P-PDM, the final sigmoid-transformed PFPs can be derived by Eq. (2). For L0 regularization, the hyperparameter is the number of nonzero parameters, and the corresponding results are shown in Fig. 3(a) and (c). For L1 regularization, the hyperparameter *C* is the inverse of the regularization strength, and the grid search results are shown in Fig. 3(b) and (d). In Fig. 3(e)–(h), the *x*-axis shows the input PBPs and the *y*-axis shows the linear combinations of PBPs with optimized coefficients obtained in each round of cross-validation. The color bar represents the coefficient values of the PBPs. Because the training set and test set vary between rounds of cross-validation, the resultant PFPs vary as well. For L0 regularization using the OMP method, the selected PBPs appear stable: the same PBPs are selected with similar coefficients across rounds. In comparison, L1 regularization produces unstable results: in each round of cross-validation, different PBPs are selected with varying coefficients. The experimental results indicate that the P-PDM based on L0 regularization can derive the simplest form of $\textrm{PF}{\textrm{P}_{\textrm{C cells}}}$ as the product of sigmoid functions of simplified and stable PBP combinations. These simplified and stable parameters not only quantitatively characterize cancer cells in complex pathological tissues, but also pave the way for physical interpretation.
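One simple way to put a number on the stability described above is to compare the sets of PBPs selected in each round of cross-validation. The sketch below is not part of the reported analysis: the mean pairwise Jaccard similarity is an assumed stability measure, and the per-fold coefficients and PBP names are hypothetical values chosen only to mimic a stable (L0-like) and an unstable (L1-like) selection pattern.

```python
from itertools import combinations


def support(coefs, tol=1e-12):
    """Return the set of PBP names whose fitted coefficient is nonzero."""
    return {name for name, value in coefs.items() if abs(value) > tol}


def mean_jaccard(folds):
    """Mean pairwise Jaccard similarity of the selected-PBP sets across folds.

    A value of 1.0 means every fold selected exactly the same PBPs.
    """
    sets = [support(fold) for fold in folds]
    pairs = list(combinations(sets, 2))
    if not pairs:
        raise ValueError("need at least two folds to compare")
    sims = []
    for a, b in pairs:
        union = a | b
        # Two empty supports are treated as identical.
        sims.append(len(a & b) / len(union) if union else 1.0)
    return sum(sims) / len(sims)


# Hypothetical per-fold coefficients (illustrative only, not the paper's results).
stable_folds = [
    {"t1": 1.2, "rL": -0.8, "b": 0.0},
    {"t1": 1.1, "rL": -0.9, "b": 0.0},
    {"t1": 1.3, "rL": -0.7, "b": 0.0},
]
unstable_folds = [
    {"t1": 1.2, "rL": 0.0, "b": 0.4},
    {"t1": 0.0, "rL": -0.9, "b": 0.0},
    {"t1": 0.5, "rL": -0.2, "b": 0.0},
]

stable_score = mean_jaccard(stable_folds)
unstable_score = mean_jaccard(unstable_folds)
```

A score near 1 corresponds to the behavior reported for the OMP-based L0 method, whereas a low score corresponds to the fold-to-fold variation reported for L1 regularization. This measure looks only at which PBPs are selected and ignores how similar their coefficients are.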

The physical interpretability of the nodes in the P-PDM based on L0 regularization comes from the PBPs, whose physical meanings are clear. The simplified linear functions of the PBPs describe the correlation between polarization characteristics and the pathological features of interest, and can explain the nodes of the model to an extent. Fig. 4 presents the first node of P-PDM, $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$, whose sigmoid function $\textrm{PF}{\textrm{P}_{\textrm{Cells}}}$ is used to identify cells in tissues. From it we can conclude the following. (i) In breast cancer tissues, pixels with high anisotropy ($t_1$) and low linear birefringence ($r_L$) are more likely to be cells. Of note, both $r_L$ and $q_L$ are linear-birefringence-related parameters, and their values are equal in the case of transverse mirror symmetry. It can be observed from Table 1 that the accuracy obtained using the transverse-mirror-symmetry-related parameter as input is about 40%, indicating that the mirror symmetry may not be broken in some breast cells. Therefore, $r_L$ and $q_L$ are used interchangeably in a few ROIs in Fig. 4(a). (ii) In liver cancer tissues, the coefficient of $t_1$ is high and that of $\delta$ is low, meaning that cells have strong anisotropy and low linear birefringence.

In addition, $\textrm{PF}{\textrm{P}_{\textrm{C cells}}}$ can be obtained by multiplying the sigmoid functions of the first node, $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$, and the second node, $\textrm{PFP(}{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{C cells}}}|{\textrm{x}_{\textrm{i,j}}} \in {\textrm{S}_{\textrm{Cells}}}\textrm{)}$. This product may explain the variation in polarization features from normal cells to cancer cells: (i) in breast cancer tissues, a decrease in linear ($q_L$) and circular ($\beta$) birefringence signals cancer progression; (ii) in liver tissues, cancer cells have strong transverse mirror asymmetry ($P_{\textrm{TMS}}$) and polarizance ($b$).
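The way the two nodes combine can be sketched for a single pixel as follows. This is a minimal illustration under stated assumptions, not the fitted model: the weights, biases, and pixel values are hypothetical, and only their signs are chosen to match the breast-tissue trends described above (high $t_1$ and low $r_L$ favor cells; lower $q_L$ and $\beta$ favor cancer cells). The function and variable names are likewise introduced here for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function mapping a score to (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def linear_score(pbps, weights, bias):
    """Linear combination of the selected PBPs for one pixel."""
    return bias + sum(weights[name] * pbps[name] for name in weights)


def pfp_pair(pbps, node1, node2):
    """Return (PFP_Cells, PFP_C_cells) for one pixel.

    node1 models P(pixel in S_Cells); node2 models
    P(pixel in S_C_cells | pixel in S_Cells). Their product is the
    probability that the pixel belongs to a cancer cell.
    """
    p_cells = sigmoid(linear_score(pbps, *node1))
    p_cancer_given_cells = sigmoid(linear_score(pbps, *node2))
    return p_cells, p_cells * p_cancer_given_cells


# Hypothetical (weights, bias) per node; NOT the coefficients from Fig. 3 or Fig. 4.
NODE1 = ({"t1": 4.0, "rL": -3.0}, -0.5)
NODE2 = ({"qL": -2.0, "beta": -1.5}, 0.3)

# Hypothetical PBP values for one pixel.
pixel = {"t1": 0.6, "rL": 0.1, "qL": 0.05, "beta": 0.02}
p_cells, p_c_cells = pfp_pair(pixel, NODE1, NODE2)
```

Because the second factor is a probability, the cancer-cell value can never exceed the cell value at the same pixel, which matches the conditional structure of the two nodes.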

#### 4.3 Quantitative characterization results

The performance of the two final PFPs can be validated on the ROIs from the test samples. The PFPs are calculated from the PBPs of each ROI. Using the PFPs, the cells in each ROI can be identified and their types predicted quantitatively. Fig. 5 summarizes the quantitative characterization results for different types of cells obtained from the final PFPs in breast cancer and liver cancer pathological tissues. The ROIs presented in Fig. 5 were selected randomly from the test set. They do not represent the performance on all cases and are shown only to illustrate part of the PFPs’ characterization results. In Fig. 5, the H&E images provide the ground truth for the PFPs’ characterization results; the corresponding cell areas, labelled by pathologists, lie inside the black solid line and outside the blue solid line. From Fig. 5, we can observe the following patterns. First, the two ROIs in Fig. 5(a) and (c) are composed of N cells and N tissues. In these ROIs, high values of $\textrm{PF}{\textrm{P}_{\textrm{Cells}}}$ indicate the positions of cells. Meanwhile, $\textrm{PF}{\textrm{P}_{\textrm{C cells}}}$, which is sensitive to cancer cells, has almost no high values in these regions of healthy breast and liver tissue. Second, most cells in Fig. 5(b) and (d) are C cells. Therefore, in these ROIs composed of C cells and N tissues, the 2D images of both $\textrm{PF}{\textrm{P}_{\textrm{Cells}}}$ and $\textrm{PF}{\textrm{P}_{\textrm{C cells}}}$ show obvious contrast at the cell positions. Third, there are also some misclassified pixels. For example, in Fig. 5(c), which shows healthy liver tissue, there are a few highlighted pixels in the 2D image of $\textrm{PF}{\textrm{P}_{\textrm{C cells}}}$, meaning that the PFP predicts pixels belonging to non-cancer cells as cancer cells.

Based on the above analysis, we can conclude that (i) $\textrm{PF}{\textrm{P}_{\textrm{Cells}}}$ has great potential for identifying all kinds of cells, and $\textrm{PF}{\textrm{P}_{\textrm{C cells}}}$, with its stronger specificity, may be considered a powerful tool for quantitative recognition of cancer cells in the two pathological tissues; and (ii) by taking full advantage of polarization imaging, the PFPs’ characterization ability depends less on image resolution, which makes cancer cell screening possible under a 4× objective lens.

#### 4.4 Performance of classifiers

The performance of the two PFPs derived by P-PDM based on L0 regularization, the two LDA classifiers, and the three-class ANN classifier was evaluated at pixel level on the test set. The training and test processes are described in Section 3.4. The average values of accuracy, precision, and recall were calculated on test data that do not overlap with the training set. Table 2 summarizes the performance of cell classification in pathological tissues of breast cancer and liver cancer using P-PDM, LDA, and ANN, from which we can observe that: (i) in both pathological tissues, the performance differences between P-PDM and ANN are moderate, indicating that a model with only a few nodes can perform comparably to a complex ANN; (ii) for the identification of cancer cells, LDA has limited ability to recognize the target microstructures. For example, for the quantitative characterization of C cells in liver cancer, P-PDM achieves an accuracy of 0.854 while LDA achieves 0.738; (iii) P-PDM achieves high performance with a very simple model, using no more than four polarization basis parameters for the classification of cancer cells. It is much simpler than the other machine learning methods considered, which opens the possibility of physically interpreting the result. Of note, although the preliminary experimental results point in a positive direction, a larger database is still needed to validate them.
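For reference, the three pixel-level scores can be computed as in the sketch below. It assumes binary labels per target class (1 for the class of interest, 0 otherwise) and a simple average over evaluation rounds; the helper names and the example labels are hypothetical and are not taken from Table 2.

```python
def binary_metrics(y_true, y_pred):
    """Pixel-level accuracy, precision, and recall for binary labels (1 = target class)."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


def mean_metrics(per_round):
    """Average (accuracy, precision, recall) tuples over evaluation rounds."""
    n = len(per_round)
    return tuple(sum(m[i] for m in per_round) / n for i in range(3))


# Hypothetical labels for eight pixels (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 1, 1]
acc, prec, rec = binary_metrics(y_true, y_pred)
```

In this toy example, precision and recall differ because the two false positives and the single false negative enter different denominators. The same asymmetry underlies any comparison in which one classifier has higher precision and another has higher recall.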

## 5. Discussion and concluding remarks

The PFPs derived by the proposed P-PDM show potential as indicators for quantitative characterization of cancer cells in pathological tissues, with a high degree of physical interpretability. We designed the P-PDM by integrating prior knowledge, which keeps the model simple, with only a few nodes. The nodes are built using L0-regularized LR classifiers based on orthogonal matching pursuit and are defined by linear combinations of PBPs with physical meanings. We connected the nodes by employing conditional probability and Bayes’ theorem. Therefore, each node in this model can be dissected to extract its physical meaning and its correspondence to microstructural features. Such a P-PDM allows us to analyze the variations in polarization features between healthy and cancerous cells. We demonstrated the viability of P-PDM using pathological tissues of breast cancer and liver cancer; for each, two derived PFPs characterize the cells and the cancer cells, respectively, with satisfactory performance scores, in the simplest form possible, using no more than four PBPs. We also compared the proposed model with a three-class ANN: while P-PDM’s recall is lower than that of the ANN, its precision is higher, and thus its overall accuracy is on par with that of the ANN. The proposed model therefore has performance comparable to the ANN but is computationally much simpler, since its number of parameters is orders of magnitude smaller. Notably, the PFPs work under a 4× objective and separate the cells into more specific categories that are otherwise distinguishable only under high magnification, since the contrast of polarization imaging depends less on imaging resolution. This may pave the way for rapid scanning and quantitative analysis of whole pathological sections in a low-resolution, wide-field system, laying a cornerstone for primary screening of cancer cells in clinical practice.

This model has two limitations. First, the manual labels provided by pathologists are expensive and labor-intensive to obtain. Second, the performance of the P-PDM decreases as the model becomes deeper, because the error in each node propagates through the entire model. To address the first limitation and improve the P-PDM, one could consider semi-supervised learning methods, which use unlabeled data to improve model performance.
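The second limitation can be made concrete with a rough estimate. Assume, purely for illustration, that a cancer-cell pixel is detected only if it passes each of $K$ chained nodes, and that the nodes err independently with per-node recall $r_k$. The symbols $K$, $r_k$, and $R_K$ are introduced here and are not part of the model definition. Under these assumptions the recall of the whole chain is approximately

$$
R_K \approx \prod_{k=1}^{K} r_k , \qquad \text{e.g. } r_1 = r_2 = 0.9 \;\Rightarrow\; R_2 \approx 0.81, \qquad r_1 = r_2 = r_3 = 0.9 \;\Rightarrow\; R_3 \approx 0.73 .
$$

These numbers are illustrative rather than measured, and node errors in practice are unlikely to be fully independent, but the product form shows why each additional node can only lower the end-to-end recall.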

In summary, the proposed P-PDM leverages the strength of polarization imaging to classify cancer cells at pixel level, making the identification quantitative, objective, interpretable, and less dependent on imaging resolution.

## Funding

National Natural Science Foundation of China (11974206, 61527826); Shenzhen Bureau of Science and Innovation (JCYJ20170412170814624); Beijing Municipal Administration of Hospitals’ Youth Programme (QML20191206).

## Disclosures

The authors declare no conflicts of interest.

## Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

## Supplemental document

See Supplement 1 for supporting content.

## References

**1. **F. Bray, M. Laversanne, E. Weiderpass, and I. Soerjomataram, “The ever-increasing importance of cancer as a leading cause of premature death worldwide,” Cancer **127**(16), 3029–3030 (2021).

**2. **H. Sung, J. Ferlay, R. L. Siegel, M. Laversanne, I. Soerjomataram, A. Jemal, and F. Bray, “Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries,” CA Cancer J. Clin. **71**(3), 209–249 (2021).

**3. **B. Saikia, K. Gupta, and U. N. Saikia, “The modern histopathologist: in the changing face of time,” Diagn. Pathol. **3**(1), 25–29 (2008).

**4. **Y. Rivenson, H. Wang, Z. Wei, K. D. Haan, Y. Zhang, Y. Wu, H. Günaydın, J. E. Zuckerman, T. Chong, A. E. Sisk, L. M. Westbrook, W. D. Wallace, and A. Ozcan, “Virtual histological staining of unlabelled tissue-autofluorescence images via deep learning,” Nat. Biomed. Eng. **3**(6), 466–477 (2019).

**5. **Y. Dong, J. Wan, L. Si, Y. Meng, Y. Dong, S. Liu, H. He, and H. Ma, “Deriving polarimetry feature parameters to characterize microstructural features in histological sections of breast tissues,” IEEE Trans. Biomed. Eng. **68**(3), 881–892 (2021).

**6. **Y. Dong, J. Wan, X. Wang, J. H. Xue, J. Zou, H. He, P. Li, A. Hou, and H. Ma, “A Polarization-imaging-based machine learning framework for quantitative pathological diagnosis of cervical precancerous lesions,” IEEE Trans. Med. Imaging **40**(12), 3728–3738 (2021).

**7. **C. He, H. He, J. Chang, B. Chen, H. Ma, and M. J. Booth, “Polarisation optics for biomedical and clinical applications: a review,” Light: Sci. Appl. **10**(1), 194 (2021).

**8. **N. Ghosh and I. A. Vitkin, “Tissue polarimetry: concepts, challenges, applications, and outlook,” J. Biomed. Opt. **16**(11), 110801 (2011).

**9. **V. V. Tuchin, “Polarized light interaction with tissues,” J. Biomed. Opt. **21**(7), 071114 (2016).

**10. **S. Y. Lu and R. A. Chipman, “Interpretation of Mueller matrices based on polar decomposition,” J. Opt. Soc. Am. A **13**(5), 1106–1113 (1996).

**11. **H. He, R. Liao, N. Zeng, P. Li, Z. Chen, X. Liu, and H. Ma, “Emerging new tool for characterizing the microstructural feature of complex biological specimen,” J. Lightwave Technol. **37**(11), 2534–2548 (2019).

**12. **P. Li, D. Lv, H. He, and H. Ma, “Separating azimuthal orientation dependence in polarization measurements of anisotropic media,” Opt. Express **26**(4), 3791–3800 (2018).

**13. **P. Li, Y. Dong, J. Wan, H. He, T. Aziz, and H. Ma, “Polaromics: deriving polarization parameters from a Mueller matrix for quantitative characterization of biomedical specimen,” J. Phys. D: Appl. Phys. **55**(3), 034002 (2022).

**14. **P. Schucht, H. R. Lee, M. H. Mezouar, E. Hewer, A. Raabe, M. Murek, I. Zubak, J. Goldberg, E. Kövari, A. Pierangelo, and T. Novikova, “Visualization of white matter fiber tracts of brain tissue sections with wide-field imaging Mueller polarimetry,” IEEE Trans. Med. Imaging **39**(12), 4376–4382 (2020).

**15. **Y. Dong, J. Qi, H. He, C. He, S. Liu, J. Wu, D. S. Elson, and H. Ma, “Quantitatively characterizing the microstructural features of breast ductal carcinoma tissues in different progression stages by Mueller matrix microscope,” Biomed. Opt. Express **8**(8), 3643–3655 (2017).

**16. **Y. Dong, H. He, W. Sheng, J. Wu, and H. Ma, “A quantitative and non-contact technique to characterize microstructural variations of skin tissues during photo-damaging process based on Mueller matrix polarimetry,” Sci. Rep. **7**(1), 14702 (2017).

**17. **N. Ghosh, M. Wood, and I. A. Vitkin, “Mueller matrix decomposition for extraction of individual polarization parameters from complex turbid media exhibiting multiple scattering, optical activity, and linear birefringence,” J. Biomed. Opt. **13**(4), 044036 (2008).

**18. **Y. Liu, Y. Dong, S. Lu, R. Meng, and H. Ma, “Comparison between image texture and polarization features in histopathology,” Biomed. Opt. Express **12**(3), 1593–1608 (2021).

**19. **Y. Shen, R. Huang, H. He, S. Liu, Y. Dong, J. Wu, and H. Ma, “Comparative study of the influence of imaging resolution on linear retardance parameters derived from the Mueller matrix,” Biomed. Opt. Express **12**(1), 211–225 (2021).

**20. **R. Rubinstein, M. Zibulevsky, and M. Elad, “Efficient implementation of the K-SVD algorithm using batch orthogonal matching pursuit,” Tech. Rep. 40 (Computer Science Department, Israel Institute of Technology, 2008).

**21. **Y. Wang, H. He, J. Chang, N. Zeng, S. Liu, M. Li, and H. Ma, “Differentiating characteristic microstructural features of cancerous tissues using Mueller matrix microscope,” Micron **79**, 8–15 (2015).

**22. **D. H. Goldstein, “Mueller matrix dual-rotating retarder polarimeter,” Appl. Opt. **31**(31), 6676–6683 (1992).

**23. **S. I. Lee, H. Lee, P. Abbeel, and A. Y. Ng, “Efficient L1 regularized logistic regression,” in 21th National Conference on Artificial Intelligence Conference (Association for the Advance of Artificial Intelligence) (2014), paper 06.

**24. **V. Dremin, O. Sieryi, M. Borovkova, J. Näpänkangas, I. Meglinski, and A. Bykov, “Histological imaging of unstained cancer tissue samples by circularly polarized light,” in European Conferences on Biomedical Optics 2021 (ECBO) (2021), paper EM3A.3.

**25. **K. M. Sindhoora, K. U. Spandana, D. Ivanov, E. Borisova, U. Raghavendra, S. Rai, S. P. Kabekkodu, K. K. Mahato, and N. Mazumder, “Machine-learning-based classification of Stokes-Mueller polarization images for tissue characterization,” in Journal of Physics: Conference Series (2021), paper 012045.

**26. **X. Zhou, L. Ma, W. Brown, J. V. Little, A. Y. Chen, L. L. Myers, B. D. Sumer, and B. Fei, “Automatic detection of head and neck squamous cell carcinoma on pathologic slides using polarized hyperspectral imaging and machine learning,” Proc. SPIE **11603**, 116030Q (2021).

**27. **C. Heinrich, J. Rehbinder, A. Nazac, B. Teig, A. Pierangelo, and J. Zallat, “Mueller polarimetric imaging of biological tissues: classification in a decision-theoretic framework,” J. Opt. Soc. Am. A **35**(12), 2046–2057 (2018).

**28. **I. Deyan, D. Viktor, G. Tsanislava, B. Alexander, N. Tatiana, O. Razvigor, and M. Igor, “Polarization-Based Histopathology Classification of Ex Vivo Colon Samples Supported by Machine Learning,” Front. Phys. **9**, 814787 (2022).

**29. **W. Wang, L. G. Lim, S. Srivastava, J. Bok-Yan So, A. Shabbir, and Q. Liu, “Investigation on the potential of Mueller matrix imaging for digital staining,” J. Biophotonics **9**(4), 364–375 (2016).

**30. **Y. Zhao, J. Zang, M. Reda, K. Feng, G. Cheng, Z. Ren, S. G. Kong, S. Su, H. X. Huang, and H. Huang, “Detecting giant cell tumor of bone lesions using Mueller matrix polarization microscopic imaging and multi-parameters fusion network,” IEEE Sens. J. **20**(13), 7208–7215 (2020).

**31. **L. Xia, Y. Yao, Y. Dong, M. Wang, H. Ma, and L. Ma, “Mueller polarimetric microscopic images analysis based classification of breast cancer cells,” Opt. Commun. **475**, 126194 (2020).

**32. **D. Ma, Z. Lu, L. Xia, Q. Liao, W. Yang, H. Ma, R. Liao, L. Ma, and Z. Liu, “MuellerNet: a hybrid 3D–2D CNN for cell classification with Mueller matrix images,” Appl. Opt. **60**(22), 6682–6694 (2021).

**33. **C. Roa, V. N. Du Le, M. Mahendroo, I. Saytashev, and J. Ramella-Roman, “Auto-detection of cervical collagen and elastin in Mueller matrix polarimetry microscopic images using K-NN and semantic segmentation classification,” Biomed. Opt. Express **12**(4), 2236–2249 (2021).

**34. **R. M. A. Azzam, “Photopolarimetric measurement of the Mueller matrix by Fourier analysis of a single detected signal,” Opt. Lett. **2**(6), 148–150 (1978).

**35. **D. H. Goldstein and R. A. Chipman, “Error analysis of a Mueller matrix polarimeter,” J. Opt. Soc. Am. A **7**(4), 693–700 (1990).

**36. **K. M. Twietmeyer, R. A. Chipman, A. E. Elsner, Y. Zhao, and D. Vannasdale, “Mueller matrix retinal imager with optimized polarization conditions,” Opt. Express **16**(26), 21339–21354 (2008).

**37. **B. Zitová and J. Flusser, “Image registration methods: a survey,” Image Vision Comput. **21**(11), 977–1000 (2003).

**38. **J. H. Xue and D. M. Titterington, “Comment on “On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes”,” Neural Process Lett. **28**(3), 169–187 (2008).

**39. **C. Rodríguez, A. V. Eeckhout, L. Ferrer, E. G. Caurel, E. G. Arnay, J. Campos, and A. Lizana, “Polarimetric data-based model for tissue recognition,” Biomed. Opt. Express **12**(8), 4852–4872 (2021).

**40. **Y. C. Pati, R. Rezaiifar, and P. S. Krishnaprasad, “Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition,” in Proceedings of 27th Asilomar Conference on Signals, Systems and Computers (1993), pp. 40–44 vol. 1.

**41. **J. Platt, *Advances in Large Margin Classifiers: Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods* (Massachusetts Institute of Technology, 2000), pp. 61–75.

**42. **T. Hastie, R. Tibshirani, and J. Friedman, *The Elements of Statistical Learning* (Springer, 2009), vol. 2, pp. 106–119.

**43. **T. Hastie, R. Tibshirani, and J. Friedman, *The Elements of Statistical Learning* (Springer, 2009), vol. 2, pp. 389–395.