Characterization of a medium by estimating the constituent components of its coherency matrix in polarization optics

Abstract

In this work, an alternative route to analyzing a set of coherency matrices associated with a medium is addressed by means of the Independent Component Analysis (ICA) technique. We highlight the possibility of extracting an underlying structure of the medium in relation to a model of constituent components. The medium is considered as a mixture of unknown constituent components weighted by unknown but statistically independent random thickness coefficients. The ICA technique can determine the number of components necessary to characterize a set of samples of the medium. An estimate of the value of these components and of their respective weights is also determined. The analysis of random matrices generated by multiplying random diattenuators and depolarizers is presented to illustrate the proposed approach and demonstrate its capabilities.

©2011 Optical Society of America

1 – Introduction

The Mueller matrix, as the linear mapping between the input and output Stokes vectors of light interacting with a medium, is a very powerful tool for the polarimetric characterization of linear media. Although the Mueller matrix may be viewed as a full representation of the polarimetric properties of a material medium, extracting these properties from the raw Mueller matrix is not straightforward. This is the reason why Mueller matrices are usually interpreted by decomposing them into elementary components related to basic optical quantities (diattenuation, retardance and depolarizance for instance).

The most popular technique is a factorization based on the polar decomposition. This decomposition of an arbitrary Mueller matrix was first proposed by Lu and Chipman [1]. Since this seminal article, many papers have been published completing this approach. Some limitations of this decomposition have been underlined [2], and the concept of forward and reverse decompositions has been proposed [3] to address this issue. Other works have tried to reduce the “asymmetry” of the initial polar decomposition. A symmetric decomposition has been proposed [4]. This approach leads to a straightforward Mueller matrix interpretation in terms of an arrangement of five factors: a diagonal depolarizer sandwiched between two retarder and diattenuator pairs. Most of these works are in fact related to the question of the influence of the order in which the elementary matrices are multiplied, since they do not commute. Finding different values for the basic optical properties estimated from a single Mueller matrix is a straightforward consequence of this noncommutativity. It follows that this set of decompositions is well suited to synthesizing an optical system with a given Mueller matrix from classically available optical components. Analyzing the basic optical properties with this approach seems more problematic: there is no reason to presume that the birefringence effect, for instance, comes first, or even that it is localized at a specific place in the medium, without a priori knowledge of the medium under analysis.

A second approach to decomposing a Mueller matrix into elementary components may be related to the work of Cloude [5], who demonstrated the possibility of decomposing such a matrix as a convex sum of up to four Mueller-Jones matrices. This demonstration relies on the spectral decomposition theorem applied to the coherency matrix associated with the Mueller matrix. The coherency matrix H is related to the Mueller matrix elements mij and to the Kronecker products of the Pauli matrices (σi) by:

H = \sum_{ij} m_{ij} \, (\sigma_i \otimes \sigma_j^*)    (1)

It is worth noticing that we refer to H as the coherency matrix although the classical definition involves a matrix that differs from H by a unitary transformation. In fact, as demonstrated by Aiello [6], the definition of H depends on the basis of the space of 4×4 matrices that is considered.
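
As a minimal numerical sketch of Eq. (1) (not taken from the paper), the coherency matrix can be assembled in NumPy from the Kronecker products of the Pauli matrices; the 1/4 normalization factor and the Pauli ordering used below are common conventions and are assumptions of this sketch rather than statements of the text.

```python
# Sketch: coherency matrix H of Eq. (1) from a 4x4 Mueller matrix M.
# Assumed conventions: sigma_0 = I, sigma_1 = diag(1,-1), sigma_2 = [[0,1],[1,0]],
# sigma_3 = [[0,-i],[i,0]], and a 1/4 normalization factor.
import numpy as np

SIGMA = [np.array([[1, 0], [0, 1]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex),
         np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex)]

def mueller_to_coherency(M):
    """H = (1/4) * sum_ij m_ij (sigma_i kron sigma_j^*)."""
    H = np.zeros((4, 4), dtype=complex)
    for i in range(4):
        for j in range(4):
            H += M[i, j] * np.kron(SIGMA[i], np.conj(SIGMA[j]))
    return H / 4.0
```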

The coherency matrix is thus another possible representation of the polarimetric properties of a material medium. But this parallel incoherent decomposition (the emerging light is a sum of several incoherent contributions) of coherency matrices is not the only one of its kind; other parallel decompositions may be found in the review paper of Gil [7] for instance. Once again, this tool does not lead to a straightforward characterization of the optical properties of the medium, since we only have access to an equivalent parallel model.

The third approach to analyzing the polarimetric properties of a medium is chronologically the first proposed method, since it goes back to the layered-medium interpretation proposed by Jones in the seventh paper of his series [8]. Introducing the notion of N matrices based on this layered-medium interpretation, Jones described the state of the light at every point z in an optical system along the light path, but also characterized the optical properties of the medium. Later, Azzam extended this approach [9] to partially polarized light propagating through non-depolarizing media and introduced the differential Mueller matrix m, related to the Mueller matrix at distance z into the medium by:

\frac{dM(z)}{dz} = m \, M(z)    (2)

Azzam also derived the relations between the entries of the N and m differential matrices for non-depolarizing media. However, the formal relation between these two matrices was established by Barakat [10] from the concept of exponential versions of the Mueller-Jones matrices.

In a recent paper [11], we addressed the question of the interpolation of the coherency matrix. For the so-called Log-Euclidean (LE) process, we demonstrated that the physical meaning of this interpolation is founded on an approach very similar to the layered-medium interpretation proposed by Jones. The medium associated with the coherency matrix H(t) is considered as a sandwich of thin laminae. Interpolating from H1 = H(0) to H2 = H(1) consists in changing the proportions of the two elements D1 = log(H1) and D2 = log(H2) constituting the thin laminae. The basic point that we wish to stress is that D1 and D2 may be considered as constituent components of the set of matrices H(t), and thus of the related media. This point is developed in the following, and the questions of how to determine these components and how many components are needed to describe a set of coherency matrices are addressed by means of the Independent Component Analysis (ICA) technique.

The paper is organized as follows. After a reminder of the definitions and relations associated with the coherency matrix and the LE process, we state the hypotheses of our approach and describe the ICA processing over the space of coherency matrices. The case of random matrices generated by multiplying random diattenuators and depolarizers is finally presented to illustrate the proposed approach.

2 – Coherency matrix space and LE metrics

2.1. Mueller and coherency matrix definition

Following [12,13], we define a Mueller matrix M as a convex sum of so-called Mueller-Jones matrices. From this definition, it is possible to extract from M a positive semi-definite Hermitian matrix H that is related to the Mueller matrix elements mij by Eq. (1). If the set of coherency matrices is restricted to HPD(4), the manifold of Hermitian Positive Definite matrices of dimension 4, a Lie group structure can be associated with HPD(4) (see [14] for a demonstration). The group product is defined by:

H_1 \oplus H_2 = \exp[\log(H_1) + \log(H_2)]    (3)
where exp and log are the classical matrix exponential and logarithm mappings respectively. It is clear that neither the Mueller-Jones matrices nor the Mueller matrices corresponding to a coherency matrix with 1 or 2 null eigenvalues are elements of HPD(4). So, we do not take them into account in this paper.

2.2. Log-Euclidean distance on coherency matrix space

When considering HPD(4) matrices as a smooth manifold, it is also possible to define a Riemannian framework on this set of matrices. This is the approach used by Arsigny [14] to define the LE distance on Symmetric Positive Definite matrices. It was shown in [11] that this distance can easily be extended to HPD(4) matrices and is well adapted to coherency matrix interpolation.

Following the definition of the LE distance, interpolating from H1 = H(0) to H2 = H(1) consists in changing the proportions of the two elements D1 = log(H1) and D2 = log(H2) constituting the thin laminae (see [11] for more details). The medium is thus characterized by an infinitesimal generator D = (1-t)·D1 + t·D2: it is a sandwich of the two previous media with proportions (1-t) and t, respectively. It is worth noticing that this result is independent of the order in which the sections are placed, since matrix summation is commutative. Thus the constituent components are not localized in the medium but may be regarded as contributions distributed along the light path.
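
A minimal sketch of this interpolation, assuming SciPy's matrix logarithm and exponential (the function name is ours): the interpolated medium is generated by D = (1-t)·D1 + t·D2, and since the matrix sum commutes, the order in which the laminae are stacked plays no role.

```python
# Sketch: Log-Euclidean interpolation between two coherency matrices H1, H2.
import numpy as np
from scipy.linalg import expm, logm

def le_interpolate(H1, H2, t):
    """H(t) = exp((1-t) log(H1) + t log(H2)); the sum being commutative,
    the 'order of the laminae' does not matter."""
    D1, D2 = logm(H1), logm(H2)
    return expm((1.0 - t) * D1 + t * D2)
```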

This approach can be extended by defining the logarithmic scalar multiplication of an HPD matrix H by a scalar α ∈ ℝ, in the same way as Arsigny proposed for symmetric matrices [14]. This multiplication is defined by:

\alpha \bullet H = \exp[\alpha \log(H)]    (4)

With the two operations defined by Eqs. (3) and (4), V = (HPD(4), ⊕, •) can be viewed as a vector space when an HPD matrix is identified with its logarithm. The reader is referred to [14] for the demonstration of this property for Symmetric Positive Definite matrices.
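
A short sketch of these two operations, assuming SciPy (the function names are ours): once an HPD matrix is identified with its logarithm, ⊕ and • reduce to the ordinary sum and scalar multiplication in the log domain.

```python
# Sketch: the vector-space operations of Eqs. (3) and (4) on HPD(4).
import numpy as np
from scipy.linalg import expm, logm

def le_add(H1, H2):
    """Logarithmic addition: H1 (+) H2 = exp(log(H1) + log(H2))."""
    return expm(logm(H1) + logm(H2))

def le_scale(alpha, H):
    """Logarithmic scalar multiplication: alpha . H = exp(alpha * log(H))."""
    return expm(alpha * logm(H))
```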

A straightforward consequence is that H can be written as a linear mixture of vectors {G1, G2, …, GN} of this space V.

H = (\alpha_1 \bullet G_1) \oplus (\alpha_2 \bullet G_2) \oplus \cdots \oplus (\alpha_N \bullet G_N) = \exp\left[\sum_{i=1}^{N} \alpha_i \log(G_i)\right]    (5)

Following the same approach used to exhibit the physical meaning of the LE interpolation process, H may be considered as a sandwich of constituent components Di = log(Gi) with corresponding thicknesses αi. Decomposing H over a set of basis vectors of V could be a first available solution. It would also be possible to choose the constituent components Di in such a way that the corresponding coefficients αi are homogeneous to optical quantities, as for the N matrix defined by Jones.

However, we propose to explore an alternative route by exhibiting the polarimetric structure of the medium directly from the data, in relation to a predefined analysis model. If we consider the medium to be described by the coherency matrix H at any point in space (or any instant of time for temporal variations), we wish to be able to identify the presence of a set of constituent components Di. This set of constituent components may be considered as a signature of the medium. The fluctuations of H are thus supposed to be related only to variations of the thicknesses αi of these components. The problems are then: how many constituent components Di are needed to characterize the medium, and what are these constituent components and their weighting coefficients?

3 – Analysis of a medium by a constituent components model

Without any supplementary information, we can only say that N = 16 components are needed to represent the most general medium, since H has 16 degrees of freedom.

Nevertheless, one approach to solving this problem could be to assume some statistical properties of the quantities to estimate. If the medium may be considered as a mixture of unknown constituent components weighted by unknown but statistically independent random coefficients of thickness αi, the Independent Component Analysis (ICA) technique can be used to estimate the Di.

3.1. ICA model used for coherency matrix of a medium

It is beyond the scope of this paper to give a general presentation of this method, and the reader is referred to the literature on this topic, see [15] for instance. We simply recall the classical formalism of the ICA technique and explain how this approach can be used for our problem.

From Eq. (5), at any point in space or at any instant of time, we have a coherency matrix H and we can write:

L_H = \log(H) = \sum_{j=1}^{N} \alpha_j D_j    (6)

It is worth noticing that it is equivalent to speak of H, the coherency matrix; of LH, the logarithm of this matrix; or of M, the associated Mueller matrix. There is no ambiguity between these three notations.

Equation (6) defines αj as the realization of a random variable. The random thicknesses αj are the latent variables of our problem, since they cannot be directly observed, and the Dj may be considered as mixing elements. It is possible to derive a more explicit expression of these mixtures. Since H ∈ HPD(4), Dj is necessarily a Hermitian matrix and we have only 16 unknown entries for each Dj matrix:

D_j = \begin{bmatrix} d_1^j & d_5^j + i d_6^j & d_7^j + i d_8^j & d_9^j + i d_{10}^j \\ \cdot & d_2^j & d_{11}^j + i d_{12}^j & d_{13}^j + i d_{14}^j \\ \cdot & \cdot & d_3^j & d_{15}^j + i d_{16}^j \\ \cdot & \cdot & \cdot & d_4^j \end{bmatrix}    (7)

A similar description holds for the LH matrices, with 16 estimated entries lhi. The matrix A = (aij), where aij is the i-th real parameter of component Dj, is thus a 16 × N matrix; X = (lhi) is a 16 × 1 column vector and S = (αj) an N × 1 column vector. Using vector-matrix notation, we may write X = AS. In the classical formalism of the ICA technique, A stands for the mixing matrix. The problem is then to recover the components S from the input data X. When A is known, this linear equation can be solved by classical methods. If the mixing matrix is assumed to be unknown too, the problem can nevertheless be solved by assuming that the random components αj are statistically independent variables. This is the starting point of the ICA method. All we observe is the random vector X, and the ICA algorithm performs an unsupervised estimation of N, the number of independent components, together with a joint estimation of A and S. The way the algorithm works can be related to the minimization of mutual information, a natural measure of the independence of random variables. The ICA of the random vector X is defined as the invertible transformation S = WX, where the matrix W is determined so that the mutual information of the components of S is minimized [15]. Various algorithms have been proposed in the literature to perform this task. FastICA [16] is one of them and will be used below. This algorithm assumes that the data are preprocessed by centering. Consequently, we have the following relation between the input data X and the estimated S and A:

X = A(S - \langle S \rangle) + \langle X \rangle    (8)

(see Appendix A for the proof), where < . > denotes the ensemble average.
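
A rough sketch of this processing chain, assuming NumPy, SciPy and scikit-learn's FastICA [16] (helper names such as hermitian_to_real16 are ours): each coherency matrix is mapped to the 16 real parameters of its logarithm, the samples are stacked into a data matrix, and FastICA estimates the mixing matrix A and the sources S of Eq. (8), with ica.mean_ playing the role of <X>. Unlike the procedure described above, the number of components is supplied here by the user (e.g. after a preliminary PCA) rather than estimated automatically.

```python
# Sketch: ICA on a set of coherency matrices, following Eqs. (6)-(8).
import numpy as np
from scipy.linalg import logm
from sklearn.decomposition import FastICA

def hermitian_to_real16(L):
    """16 real parameters of a 4x4 Hermitian matrix: diagonal, then Re/Im of the
    upper triangle (the exact ordering of the parameters is immaterial here)."""
    iu = np.triu_indices(4, k=1)
    return np.concatenate([np.real(np.diag(L)), np.real(L[iu]), np.imag(L[iu])])

def real16_to_hermitian(v):
    """Inverse mapping: rebuild the 4x4 Hermitian matrix from its 16 real parameters."""
    L = np.zeros((4, 4), dtype=complex)
    iu = np.triu_indices(4, k=1)
    L[iu] = v[4:10] + 1j * v[10:16]
    L = L + L.conj().T
    L[np.diag_indices(4)] = v[:4]
    return L

def ica_on_coherency_set(H_list, n_components):
    """Estimate constituent components D_j and thicknesses alpha_j from a set of H."""
    X = np.array([hermitian_to_real16(logm(H)) for H in H_list])  # shape (K, 16)
    ica = FastICA(n_components=n_components, random_state=0)
    S = ica.fit_transform(X)              # estimated alpha_j, up to scale and sign
    D = [real16_to_hermitian(ica.mixing_[:, j]) for j in range(n_components)]
    mean_log_H = real16_to_hermitian(ica.mean_)   # <X>, the log of the LE mean
    return S, D, mean_log_H
```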

It is worth noticing that <X> represents the mean value of the LH matrices and corresponds exactly to the mean value of the coherency matrices according to the LE distance:

\langle H \rangle_{LE} = \exp\left( \frac{1}{K} \sum_{j=1}^{K} \log(H_j) \right)    (9)

See [11,14] for more details on this definition. The A matrix of Eq. (8) and the corresponding constituent components Di may be viewed as a characterization of the fluctuations around this mean value. We are thus exhibiting the polarimetric structure of the medium directly from the data themselves, so that the feature extraction process can be regarded as well suited to the data being processed.
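
The LE mean itself is a short computation, sketched here with SciPy (not the authors' code):

```python
# Sketch: LE mean of Eq. (9), the exponential of the Euclidean mean of the logarithms.
import numpy as np
from scipy.linalg import expm, logm

def le_mean(H_list):
    """<H>_LE = exp((1/K) * sum_j log(H_j))."""
    return expm(sum(logm(H) for H in H_list) / len(H_list))
```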

Since both S and A are unknown, any scalar multiplying one of the thicknesses αj could be cancelled by dividing the corresponding column of A by the same scalar. This is a well-known ambiguity of the ICA model. Normalizing the Dj matrices of the constituent components is the way we choose to fix the magnitudes of the thicknesses αj. After this normalization step, Tr(Dj Dj†) = 1, where † stands for the Hermitian conjugate and Tr(·) denotes the trace of the matrix. Nevertheless, there remains a sign ambiguity. Since the αj are presumed to be homogeneous to thicknesses, we can always multiply αj and the corresponding Dj matrix by −1, without affecting the model, in order to obtain positive values for the thicknesses.
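
A possible implementation of this normalization convention (the helper name and the "positive on average" sign rule are our choices, not prescribed by the text):

```python
# Sketch: rescale each D_j to Tr(D_j D_j^dagger) = 1 and fix the sign ambiguity.
import numpy as np

def normalize_components(D_list, S):
    """Return unit-norm components and correspondingly rescaled, sign-fixed thicknesses."""
    D_out, S_out = [], S.astype(float).copy()
    for j, D in enumerate(D_list):
        norm = np.sqrt(np.real(np.trace(D @ D.conj().T)))   # Frobenius norm of D_j
        sign = 1.0 if np.mean(S_out[:, j]) >= 0 else -1.0   # thicknesses positive on average
        D_out.append(sign * D / norm)
        S_out[:, j] *= sign * norm                          # keeps alpha_j * D_j unchanged
    return D_out, S_out
```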

Obviously, the major assumption of this approach is the actual existence of statistically independent coefficients αj. This is not such an unrealistic assumption in many cases. Considering a Mueller matrix as a mixture of independent physical quantities (birefringence or dichroism for instance) is one example. Since these quantities can take different values from one moment to another or from one location to another, a set of matrices with independent latent variables can be measured and analyzed according to the proposed approach. Illustrating this point is exactly what we do with the example below.

3.2. The example of matrices with random diattenuation and depolarization values

We consider the example of a medium characterized by its Mueller matrix M. According to Lu and Chipman [1], M can be decomposed as M = MR MD MDEP, where MR stands for the retardance matrix, MD for the diattenuation matrix and MDEP for the depolarizing one. For the sake of simplicity we suppose MR to be the identity matrix in order to limit the number of independent components, but this is absolutely not a limitation of the approach.

The MDEP matrix can be expressed as:

M_{DEP} = \begin{bmatrix} 1 & p & q & r \\ 0 & a & 0 & 0 \\ 0 & 0 & b & 0 \\ 0 & 0 & 0 & c \end{bmatrix}    (10)

We assume a, b and p to be uniformly distributed random variables with <a> = 0.3, <b> = 0.4 and <p> = 0.1, while c, q and r are fixed at 0.12, 0 and 0, respectively. Similarly, an MD matrix is completely defined by its diattenuation vector DV = [f g h]. f and g are fixed to 0.2 and −0.14, respectively, and h is assumed to be a uniformly distributed random variable with <h> = 0.12. All these values are chosen arbitrarily, but such that the Mueller matrices remain physical. A set of random Mueller matrices is then generated by multiplying these random diattenuators and depolarizers.
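
A sketch of this simulation, assuming NumPy: the diattenuator is built from its diattenuation vector with the standard Lu-Chipman construction (unit transmittance assumed), the depolarizer follows Eq. (10), and the ±50% spread of the uniform distributions around the stated mean values is our assumption, since the text does not specify it.

```python
# Sketch: generation of one random Mueller matrix M = MD * MDEP (Section 3.2 statistics).
import numpy as np

rng = np.random.default_rng(0)

def uniform_around(mean, spread=0.5):
    """Uniform draw in [mean*(1-spread), mean*(1+spread)] (assumed spread)."""
    return rng.uniform(mean * (1 - spread), mean * (1 + spread))

def diattenuator(d_vec):
    """Pure diattenuator with diattenuation vector d_vec (Lu-Chipman form, unit transmittance)."""
    d = np.asarray(d_vec, dtype=float)
    D = np.linalg.norm(d)
    dhat = d / D if D > 0 else d
    mD = np.sqrt(1 - D**2) * np.eye(3) + (1 - np.sqrt(1 - D**2)) * np.outer(dhat, dhat)
    M = np.eye(4)
    M[0, 1:] = d
    M[1:, 0] = d
    M[1:, 1:] = mD
    return M

def depolarizer(a, b, c, p, q=0.0, r=0.0):
    """Depolarizer of Eq. (10): first row [1 p q r], diagonal block diag(a, b, c)."""
    M = np.zeros((4, 4))
    M[0] = [1.0, p, q, r]
    M[1, 1], M[2, 2], M[3, 3] = a, b, c
    return M

MD = diattenuator([0.2, -0.14, uniform_around(0.12)])
MDEP = depolarizer(a=uniform_around(0.3), b=uniform_around(0.4), c=0.12, p=uniform_around(0.1))
M = MD @ MDEP
```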

Two examples of the diattenuation and depolarization matrices are listed in Tab. 1 with the corresponding generated matrices M.


Table 1. Examples of Generated Mueller Matrices. First Column: MD Values, Second Column: MDEP Values, Third Column: M = MD MDEP

All four random variables are supposed to be independent. This is not an unrealistic assumption, since diattenuation and depolarization can be regarded as arising from different physical properties, and their components are specifically introduced as degrees of freedom of the Mueller matrix in the Lu and Chipman decomposition.

A set of 1000 matrices is built following the procedure described above. The corresponding coherency matrices are computed, and this set of matrices is the input data analyzed by the ICA method. The algorithm estimates the number of independent components to be 4. The mean value of the LH matrices and the four components are also estimated. The Mueller matrix Mmean corresponding to this mean value of LH is listed in Tab. 2 with its polar decomposition. This is exactly the Mueller matrix obtained with the mean values of the random variables used to generate the set of input data. It could be argued that, if a Lu and Chipman decomposition were first applied to each random matrix and the classical Euclidean means computed on the sets of MD and MDEP respectively, the correct mean value would then be obtained by multiplying these two mean matrices. This is true, but it is only possible because the order of the decomposition is known a priori, which is generally not the case in experimental measurements.


Table 2. Estimated Mean Mueller Matrix. First Column: Mmean, Second Column: MD Values, Third Column: MDEP Values

In order to analyze the constituent components estimated by the algorithm, we compute the Mueller matrices, denoted Mc(i), associated with the estimated constituent components when all the αj except one are set to 0. These matrices are thus computed from the N coherency matrices given, for i ∈ {1, …, N}, by:

H_i = \exp[\alpha_i D_i + \log(\langle H \rangle_{LE})]    (11)
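
A sketch of this reconstruction, assuming SciPy; coherency_to_mueller is the inverse of Eq. (1) under the same 1/4 normalization convention assumed earlier, and H_le_mean stands for the LE mean of Eq. (9).

```python
# Sketch: Mueller matrix Mc(i) of Eq. (11), obtained when only component i is active.
import numpy as np
from scipy.linalg import expm, logm

SIGMA = [np.array([[1, 0], [0, 1]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex),
         np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex)]

def coherency_to_mueller(H):
    """m_ij = Tr[H (sigma_i kron sigma_j^*)], inverse of the 1/4-normalized Eq. (1)."""
    M = np.empty((4, 4))
    for i in range(4):
        for j in range(4):
            M[i, j] = np.real(np.trace(H @ np.kron(SIGMA[i], np.conj(SIGMA[j]))))
    return M

def single_component_mueller(alpha_i, D_i, H_le_mean):
    """Mc(i): all alpha_j set to 0 except alpha_i, around the LE mean."""
    H_i = expm(alpha_i * D_i + logm(H_le_mean))
    return coherency_to_mueller(H_i)
```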

Tab. 3 shows an example of these N = 4 Mueller matrices, computed when the αj are the estimated values corresponding to the Mueller matrix in the first row of Tab. 1. For each of these Mc(i) matrices, the polar decomposition is computed and the corresponding MD and MDEP matrices are shown.


Table 3. Estimated Mc Mueller Matrix Associated to Each Component. First Column: Mc(i), Second Column: MD(i) Values, Third Column: MDEP(i) Values

For each of the MD(i) and MDEP(i) matrices of Tab. 3, the matrix elements that differ from the corresponding elements of the mean matrices of Tab. 2 are set in bold. It is thus clear that each matrix Mc(i) carries the information corresponding to one and only one random variable. Mc(1), Mc(2) and Mc(3) are related to the three random variables introduced in the depolarization matrix, and Mc(4) to the fourth random variable introduced in the diattenuation matrix. The αj used in this example are the estimated values corresponding to the Mueller matrix in the first row of Tab. 1. It is worth noticing that the entries (in bold font) of the MD(i) and MDEP(i) matrices corresponding to the action of the random variables take the values of the entries (in bold font) of the depolarization and diattenuation matrices in the first row of Tab. 1. Similar results are of course obtained with all the matrices of the original set. The constituent components Di therefore characterize the polarimetric properties of the set of matrices under analysis. With the ICA approach, it is possible to isolate these polarimetric features, each being carried by one constituent component Di. This example demonstrates the potential of the method in terms of feature extraction from a set of Mueller or coherency matrices.

4 – Conclusion

In this work, an alternative route for analyzing a set of coherency matrices associated with a medium has been addressed by means of the Independent Component Analysis technique. We have highlighted the possibility of extracting an underlying structure of the medium in relation to a model of constituent components. If we consider the medium to be described by the coherency matrix at any point in space (or any instant of time for temporal variations), we are able to identify the presence of a set of constituent components. This set of constituent components may be considered as a signature of the medium. The fluctuations of the coherency matrix are thus supposed to be related only to variations of the thicknesses of these components. The samples are considered as a mixture of unknown constituent components weighted by unknown but statistically independent random coefficients. The ICA technique can determine the number of components necessary to characterize this set of samples of the medium. An estimate of the value of these components and of their respective weights is also determined.

The case of random matrices generated by multiplying random diattenuators and depolarizers was presented to illustrate the proposed approach and demonstrate its capabilities. Applications of this method to coherency matrices obtained from experimental data are currently being analyzed.

Appendix A

From the ICA technique, we have estimates of A and W such that WA = Id, but it is worth noticing that AW is not equal to the identity matrix. The algorithm assumes that the data are preprocessed by centering, so we actually estimate:

X - \langle X \rangle = A S_1    (A1)

then

S = S_1 + W \langle X \rangle \quad \text{or} \quad S_1 = S - W \langle X \rangle    (A2)

But from (A1), by left multiplication by W, we have:

S_1 = W X - W \langle X \rangle    (A3)

From (A3) and (A2) we have S = WX, and then, taking the ensemble average:

\langle S \rangle = W \langle X \rangle    (A4)

Thus, from (A1), (A2) and (A4), we finally have X − <X> = A(S − <S>), which gives Eq. (8).
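
A small numerical check of this relation with scikit-learn's FastICA, whose mixing_ and mean_ attributes play the roles of A and <X> after the internal centering step (toy data, not the paper's):

```python
# Sketch: verify X - <X> = A (S - <S>) on synthetic low-rank mixtures.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(1)
X = rng.standard_normal((1000, 4)) @ rng.standard_normal((4, 16)) + 5.0  # toy mixtures

ica = FastICA(n_components=4, random_state=0)
S = ica.fit_transform(X)                        # estimated sources
lhs = X - ica.mean_                             # X - <X>
rhs = (S - S.mean(axis=0)) @ ica.mixing_.T      # A (S - <S>)
print(np.allclose(lhs, rhs, atol=1e-6))         # should print True, up to numerical precision
```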

References and Links

1. S. Y. Lu and R. A. Chipman, “Interpretation of Mueller matrices based on polar decomposition,” J. Opt. Soc. Am. A 13(5), 1106–1113 (1996).

2. J. Morio and F. Goudail, “Influence of the order of diattenuator, retarder, and polarizer in polar decomposition of Mueller matrices,” Opt. Lett. 29(19), 2234–2236 (2004).

3. R. Ossikovski, A. De Martino, and S. Guyot, “Forward and reverse product decompositions of depolarizing Mueller matrices,” Opt. Lett. 32(6), 689–691 (2007).

4. R. Ossikovski, “Analysis of depolarizing Mueller matrices through a symmetric decomposition,” J. Opt. Soc. Am. A 26(5), 1109–1118 (2009).

5. S. R. Cloude, “Group theory and polarisation algebra,” Optik (Stuttg.) 75, 26–36 (1986).

6. A. Aiello and J. P. Woerdman, arXiv e-print math-ph/0412061 (2004).

7. J. J. Gil, “Polarimetric characterization of light and media,” Eur. Phys. J. Appl. Phys. 40(1), 1–47 (2007).

8. R. C. Jones, “A new calculus for the treatment of optical systems. VII. Properties of the N-matrices,” J. Opt. Soc. Am. 38(8), 671–685 (1948).

9. R. M. A. Azzam, “Propagation of partially polarized light through anisotropic media with or without depolarization: A differential 4x4 matrix calculus,” J. Opt. Soc. Am. 68(12), 1756–1767 (1978).

10. R. Barakat, “Exponential versions of the Jones and Mueller-Jones polarization matrices,” J. Opt. Soc. Am. A 13(1), 158–163 (1996).

11. V. Devlaminck, “Mueller matrix interpolation in polarization optics,” J. Opt. Soc. Am. A 27(7), 1529–1534 (2010).

12. K. Kim, L. Mandel, and E. Wolf, “Relationship between Jones and Mueller matrices for random media,” J. Opt. Soc. Am. A 4(3), 433–437 (1987).

13. B. N. Simon, S. Simon, N. Mukunda, F. Gori, M. Santarsiero, R. Borghi, and R. Simon, “A complete characterization of pre-Mueller and Mueller matrices in polarization optics,” J. Opt. Soc. Am. A 27(2), 188–199 (2010).

14. V. Arsigny, P. Fillard, X. Pennec, and N. Ayache, “Geometric means in a novel vector space structure on symmetric positive-definite matrices,” SIAM J. Matrix Anal. Appl. 29(1), 328–347 (2007).

15. A. Hyvärinen and E. Oja, “Independent component analysis: algorithms and applications,” Neural Netw. 13(4-5), 411–430 (2000).

16. A. Hyvärinen, “Fast and robust fixed-point algorithms for independent component analysis,” IEEE Trans. Neural Netw. 10(3), 626–634 (1999).
