Retrieving space-dependent polarization transformations via near-optimal quantum process tomography

Francesco Di Colandrea; Francesco Di Colandrea; Lorenzo Amato; Lorenzo Amato; Roberto Schiattarella; Alexandre Dauphin; Filippo Cardano; Filippo Cardano

doi:10.1364/OE.491518

1. Introduction

Quantum process tomography (QPT) consists of determining all the parameters characterizing a quantum evolution from a set of experimental measurements [1]. This represents a crucial task for most quantum applications. If the process under investigation is a genuine quantum black box, i.e., no preliminary information is available, QPT unveils the mathematical structure of the unknown transformation, thus enabling the prediction of how a specific input state will change under its action. On the other hand, QPT is key to certifying if a quantum device is properly working, for instance, validating the security of a quantum communication channel [2] or reconstructing a transfer matrix within a quantum circuit [3]. In the past decades, this technique has been employed to characterize different quantum architectures, such as nuclear magnetic resonances [4], atoms in optical lattices [5], trapped ions [6,7], superconducting circuits [8,9] and photonic setups [10–18].

In the simplest picture of linear and unitary processes, in principle QPT could be accomplished by extracting analytical relations between the transformation parameters and the outcomes of suitable projective measurements [19,20]. However, realistic experimental noise compromises the feasibility of such approach, typically yielding non-physical results. This limitation can be overcome by mapping the reconstruction of the process into an optimization problem, as first proposed for the tomography of quantum states [21–27]. The typical benchmark is the maximum-likelihood approach [12], consisting of the minimization of the negative log-likelihood. The latter represents a notion of distance between a set of experimental outcomes and the corresponding theoretical predictions, based on the estimated quantum process.

In this framework, the most elementary scenario is the characterization of an SU(2) gate $U$ acting on a two-level system, encoding a qubit of quantum information. In photonic setups, qubits can be encoded into optical polarization, with $U$ implemented via one or multiple birefringent waveplates. Accordingly, the characterization of devices acting on light polarization can be nicely solved within the mathematical paradigm of QPT [28,29]. In this analysis, we focus on fully polarized states of light, described within the Jones formalism [29]. The broader subject of depolarizing optical transformations, that are described within the Mueller matrix formalism, will be addressed in a subsequent study.

While standard waveplates rotate uniformly the polarization of a light beam, several photonic applications involve light propagation through inhomogeneous devices or materials that implement space-dependent polarization transformations, possibly exhibiting a complex spatial structure. These applications include polarization imaging [30] and multiplexing [31], or the generation and manipulation of structured light [32–34] through spin-orbit metasurfaces [35]. In these cases, reconstructing the whole transformation requires the determination of a two-dimensional (2D) spatially-varying unitary operator $U(x,y)$, which in turn implies that the QPT procedure is iterated over multiple positions. Performing the full tomography by relying upon numerical minimization routines may become significantly lengthy when increasing the number of iterations, depending on the desired spatial resolution and the system size.

Here we investigate the optimization of numerical techniques assisting the tomography of SU(2) gates, experimentally encoded into unitary polarization rotations. By optimization, we mean achieving a satisfactory level of accuracy in the characterization process while using minimal resources, including the number of measurements and the time consumed by the numerical routines.

Considering a collimated light beam propagating along the $z$ axis, we aim at determining the parameters of a space-dependent polarization rotation $U(x,y)$ implemented by a thin birefringent optical element, lying in a transverse plane $z=z_0$. We specifically consider two complementary approaches to the reconstruction of $U$ at each transverse position $(x,y)$, namely a genetic algorithm (GA) and supervised machine learning (ML). In the first case, we devise a genetic search of the minimum of the cost function, emulating the principle of Darwinian natural selection [36–39]. GAs have been largely employed in the contexts of quantum simulation [40], quantum annealing [41], as well as for diverse optical applications, both for general optimization [42–48] and purely tomographic [49,50] purposes. The second approach relies on training a neural network (NN) to predict the process generating a limited number of experimental outcomes. Nowadays, ML is ubiquitous in physics [51], playing a special role in quantum applications [52]. Significant computational advantages have been recently reported in machine-learning-assisted quantum tomographic experiments [53,54]. Photonic setups represent an important testbed to probe its potentialities [55], and several applications can actually benefit from an efficient characterization, ranging from ultra-fast microscopy to optical communications.

We first test the algorithms with synthetic experimental data, generated via numerical simulations carried out for single unitary processes. In the context of space-dependent transformations, this would correspond to the polarization rotation associated with a sufficiently small spatial region. By keeping the number of input measurements to six, we compare the performances of these approaches with the popular NMinimize routine from Mathematica [56], taking into account both the timing and the accuracy of the prediction. Finally, we adapt these routines to characterize the entire polarization transformation generated by chosen combinations of inhomogeneous optical waveplates, such as liquid-crystal (LC) metasurfaces [57,58].

2. Polarization rotations as SU(2) processes

Within Jones formalism, the polarization state of a fully-polarized light beam, or a single photon, can be expressed as a two-component complex vector $\psi =(\alpha,\beta )^T$, where $|\alpha |^2+|\beta |^2=1$. Here and in the following, we are considering left and right circular polarizations as basis states $(1,0)^T$ and $(0,1)^T$, respectively. Building on a one-to-one correspondence between this representation and the ket notation for quantum states, we label circular basis states as $|{L}\rangle$ and $|{R}\rangle$, respectively, and represent $\psi$ as a qubit $|{\psi }\rangle =\alpha |{L}\rangle +\beta |{R}\rangle$, that can be visualized as a point on the unit-radius Bloch (or Poincaré) sphere [see Fig. 1]. Within this formalism, the action of a unitary rotation of light polarization, or equivalently a single-qubit unitary gate, is captured by an SU(2) operator. This can be expressed as

(1)$$U=\exp{-i\Theta(\boldsymbol{n}\cdot\boldsymbol{\sigma})},$$

whose matrix form reads

(2)$$U=\begin{pmatrix} \cos \Theta -i \sin \Theta \,n_z &-i\sin \Theta \,(n_x-i n_y)\\ -i\sin \Theta \,(n_x+i n_y) &\cos \Theta + i \sin \Theta \,n_z \end{pmatrix}.$$

In Eqs. (1)–(2), $\Theta \in [0,\pi ]$, $\boldsymbol {n}=(n_x,n_y,n_z)$ is a real-valued unit-norm vector and $\boldsymbol {\sigma }=\left ({\sigma _x,\sigma _y,\sigma _z}\right )$ is a vector whose components are the three Pauli operators.

Fig. 1. Geometric representation of the parametrization $\left (\Theta, \boldsymbol {n} \right )$ of SU(2) gates. Polarization qubits are represented as points on the Poincaré sphere (cyan sphere). Two qubits are connected via an SU(2) map $U$, $|{\psi '}\rangle =U|{\psi }\rangle$, represented as a point at $\boldsymbol {d}=\Theta \boldsymbol {n}$, lying on a sphere of radius $\Theta$ (yellow sphere).

Download Full Size | PDF

This representation allows one to map any SU(2) matrix into a point on a sphere of radius $\Theta$, whose direction is given by $\boldsymbol {n}$. In this picture, the action of the operator defined in Eq. (2) corresponds to a counterclockwise rotation of the Poincaré sphere around the ${\boldsymbol {n}\text {-axis}}$ through an angle $2\Theta$ [see Fig. 1]. Besides the operator in Eq. (2), a generic polarization rotation would also include a global phase $\Phi$, which we are not considering here. Its possible role is discussed in detail below.

From the SU(2) group property, it follows that the cascaded action of multiple waveplates can also be written as in Eq. (2). In this representation, the unit vector $\boldsymbol {n}$ gives the position on the Poincaré sphere of the polarization eigenstates associated with the optical transformation. These are optical modes whose polarization state is not altered after passing through the optical sequence, as these simply acquire a phase factor $e^{\pm i\Theta }$.

In the tomographic reconstruction of $U$, we aim at determining all the parameters $\left (\Theta,\boldsymbol {n}\right )$ describing the whole transformation, by investigating how certain input polarizations are modified when passing through the chosen optical elements. This information is retrieved via projective measurements, illustrated in the next section.

3. Projective measurements

The basic idea behind the tomographic approach is that each projective measurement conveys some information about the transformation parameters. To exploit this idea, we repeatedly prepare a sequence of $N_\text {in}$ input polarization states $|{\psi _\text {in}^i}\rangle$ ($i=1,\ldots,N_\text {in}$), let them evolve under the action of $U$, and project them on a sequence of $N_\text {out}$ states $|{\psi _\text {out}^j}\rangle$ ($j=1,\ldots,N_\text {out}$). Finally, we use a power meter to record the optical power

(3)$$I_{ij}=I_0|{\langle{\psi_\text{out}^j}|{U}|{\psi_\text{in}^i}}\rangle|^2,$$

where $I_0$ is the total power of the beam, and we are assuming that both $U$ and the detection process are not affected by losses.

As anticipated above, it is clear that global phase factors $\Phi$ in the unitary map $U$ do not affect any measurement of the form of Eq. (3). Accordingly, any tomography based only on projective measurements cannot detect them. While this is not an issue in the case of spatially-uniform rotations, it could limit our ability to characterize entirely the action of space-dependent transformations where also $\Phi$ is inhomogeneous. If needed, this quantity could be obtained with standard interferometric measurements. Our ignorance of global phase factors also implies that an SU(2) process $U$ with parameters $(\Theta,\boldsymbol {n})$ cannot be distinguished from the one corresponding to $(\pi -\Theta,-\boldsymbol {n})$, since the latter is simply $-U$ [59–62]. The implications of this ambiguity will be further handled when analyzing complex polarization transformations in Sec. 6.

Our projective measurements involve the six polarization states forming the mutually unbiased bases of SU(2): $|{L}\rangle$ and $|{R}\rangle$, $|{H}\rangle =\left (|{L}\rangle +|{R}\rangle \right )/\sqrt {2}$ and $|{V}\rangle =\left (|{L}\rangle -|{R}\rangle \right )/\sqrt {2}$ (horizontal and vertical polarizations, respectively), and $|{D}\rangle =\left (|{L}\rangle +i|{R}\rangle \right )/\sqrt {2}$ and $|{A}\rangle =\left (|{L}\rangle -i|{R}\rangle \right )/\sqrt {2}$ (diagonal and antidiagonal polarizations, respectively). Considering other sets of states, as shown for instance in Ref. [63] in the case of quantum state tomography of polarization qubits, is out of the scope of the present work. We plan to consider this aspect in the future for further optimization of the proposed techniques.

4. Optimized process tomography

The set of Eqs. (3) contains analytical relations between the parameters $(\Theta,\boldsymbol {n})$ and the projective measurements $I_{ij}$. As such, these could be solved exactly or numerically in order to reconstruct the unknown process. Although allowed, this straightforward approach proves to be highly unreliable because of the uncorrelated experimental noise between different polarimetric measurements, which often entails the prediction of non-physical results (for example, the parameter $\Theta$ or some components of the vector $\boldsymbol {n}$ feature an imaginary part) [28]. This inconvenience can be avoided by casting the tomography as an optimization problem, which allows retrieving the "most likely" SU(2) process compatible with the experimental results. The reconstruction is thus accomplished by minimizing the distance between a set of experimental outcomes and the corresponding theoretical predictions.

4.1 Minimization approach

The "distance" mentioned above is quantitatively estimated in terms of a cost function, whose minimization is typically achieved via automatized routines.

Assuming that the light detection system is only affected by Gaussian noise, the log-likelihood cost function is expressed as [25]

(4)$$\mathcal{L}=\sum_{\alpha} \frac{(I^\text{th}_{\alpha}-I^\text{exp}_{\alpha})^2}{I^\text{th}_{\alpha}},$$

where $\alpha$ runs over the chosen input-output combinations of Eq. (3). For each projective measurement, $I^\text {th}_{\alpha }$ and $I^\text {exp}_{\alpha }$ represent the theoretical and experimental normalized light intensities, respectively. The normalization is performed with respect to the total optical power $I_0$. Equation (4) is essentially the extension to coherent light beams of the formula originally provided in Ref. [25].

To keep both experimental and computational time at a minimum, one would consider a reduced set of projective measurements to be carried out and processed. However, decreasing the amount of data comes at the expense of the accuracy of the tomography. In Appendix A, we prove that a minimum of five measurements is needed to reconstruct a generic SU(2) operator. Here we focus on the near-optimal case corresponding to six of these, as it proved more robust to experimental noise. Even in this case, we find that process tomography based on GAs and NNs reconstructs SU(2) matrices with significantly large fidelities (on average). Benchmarking these results against those obtained through all available minimization routines is out of the scope of the present work. Among common choices, we employ NMinimize (NMin) from Mathematica [12,14,25]. This routine embeds several optimization methods. By default, it automatically picks the most convenient one based on the type of cost function [56].

4.2 Genetic algorithm approach

Our genetic algorithm evolves real-valued individuals (or chromosomes)

(5)$$\boldsymbol{x} = (\Theta,n_x,n_y,n_z).$$

Each individual is a candidate to provide an optimal parametrization of the unknown process in terms of $\Theta$ and $\boldsymbol {n}$. By means of operators mimicking the natural selection mechanism, the GA selects for reproduction those individuals better minimizing the Mean Squared Error (MSE) cost function

(6)$$\mathcal{L}_{\text{MSE}}=\sum_{\alpha} (I^\text{th}_{\alpha}-I^\text{exp}_{\alpha})^2$$

within the current generation. Their genetic inheritance is transmitted to the next generations, thus driving future populations toward the optimal solution. In our implementation, the reproduction occurs in the form of blend crossover [64]. Furthermore, to explore a wider region of the parameter landscape, genetic mutations are included in our workflow in the form of Gaussian noise on single genes. The maximum number of generations is used as a termination criterion. The complete sequence of operators implementing our genetic reconstruction is detailed in Appendix B.

4.3 Neural network approach

Neural networks constitute our second alternative. We apply supervised learning to train a feedforward NN to predict the optimal $(\Theta,\boldsymbol {n})$ parametrization reproducing a set of experimental outcomes. To perform QPT with our NN, the training set consists of a suitable set of projective measurements, as prescribed by Eq. (3), while the outputs are related to the transformation parameters $\left (\Theta,n_x,n_y\right )$, with $n_z$ obtained from the normalization condition. Interestingly, the learning procedure is proficiently achieved only when using half of the sphere associated with the SU(2) space [see Fig. 1], restricting the training samples to the northern hemisphere $n_z>0$ (or equivalently the other hemisphere). This is necessary to remove the ambiguity between $U$ and $-U$ discussed in Sec. 3. [59–62].

Starting from a random initialization, the learning process dynamically refines all the weights of the NN via the optimization of a cost function (loss function), measuring the distance between the prediction with the current settings and the correct outputs (labels). The learning process is divided into different time blocks (epochs), and it is completed when the cost function converges to a global minimum. The MSE function is chosen as a cost function to be minimized during supervised learning. Further details on the structure of the NN and the learning phase are reported in Appendix C.

5. Numerical experiments

To validate our methods for reconstructing a generic SU(2) transformation, we carry out a series of numerical experiments, comparing the performance of the GA and ML approaches with the minimization of the likelihood function in Eq. (4) performed with NMin, when processing $N=6$ measurements. We refer to Appendix D for the explicit declaration of all hyper-parameters used in our routines.

First, we build a set of $10^3$ random unitary operators, sampling uniformly the SU(2) space. Then, as discussed in Sec. 3, we compute the outcomes of a set of selected projective measurements (further details below). We also study the effect of various levels of disorder on the synthetic experiments. This is introduced as a zero-mean Gaussian noise with increasing standard deviation $\Delta$ affecting the angular settings of the optical waveplates used to realize each projective measurement, thus mimicking non-ideal optical measurements. The chosen input-output combinations are

(7)$$\begin{aligned} I_{LL}&=n_z^2 \sin ^2(\Theta)+\cos ^2(\Theta),\\ I_{HH}&=n_x^2 \sin ^2(\Theta)+\cos ^2(\Theta),\\ I_{LH}&=\frac{1}{2}\left(1+2 n_x n_z \sin^2 (\Theta)+n_y \sin (2 \Theta) \right),\\ I_{LD}&=\frac{1}{2}\left(1+2 n_y n_z \sin^2 (\Theta )-n_x \sin (2 \Theta) \right),\\ I_{HL}&=\frac{1}{2}\left(1+2 n_x n_z \sin^2 (\Theta )-n_y \sin (2 \Theta) \right),\\ I_{HD}&=\frac{1}{2}\left(1+2 n_x n_y \sin^2 (\Theta)+n_z \sin (2 \Theta) \right), \end{aligned}$$

where $I_{ij}= |{\langle {j}|{U}|{i}\rangle }|^2/I_0$ denotes the normalized light power recorded when we prepare the polarization state $|{i}\rangle$ and project onto $|{j}\rangle$ after the evolution $U$.

In order to quantify the agreement between the theoretical model and the prediction resulting from each reconstruction algorithm, we compute the fidelity [65–67]

(8)$$F=\frac{1}{2}\,\left|\textrm{Tr}(U_\text{th}^{{\dagger}}U_\text{exp})\right|,$$

where $U_\text {th}$ and $U_\text {exp}$ are the target and predicted processes, respectively. We note that the fidelity is not sensitive to a global phase, hence it cannot distinguish operators $U$ and $-U$. Figure 2(a) shows the average infidelities $\left (1-F\right )$ obtained for the different approaches as the level of noise increases. The GA and the NN always outperform the standard minimization routine. The advantage is also preserved in presence of moderate noise. Figure 2(b) shows the standard deviation of the fidelity distributions for the three approaches with the same noise levels. Histograms in Figs. 2(c)-(f) depict the infidelities distributions for different levels of noise. Surprisingly, we find that for a significant fraction ($\sim 20{\%}$) of synthetic processes the prediction resulting from the NMin routine is quite poor ($1-F>0.1$). We suspect that a similar behavior escaped previous numerical investigations, which is possibly emerging here because of the reduced number of input measurements that are processed, and the remarkably large set of randomly generated unitaries to be tested.

Fig. 2. (a) Log-plot of the average infidelities of the three optimization algorithms (see legend) for different levels of Gaussian noise. The average is computed over $10^3$ numerical experiments. (b) Log-plot of the standard deviations of the fidelity distributions for the same synthetic realizations as in panel (a). (c)-(f) Infidelity distributions are represented as histograms for different levels of noise: (c) $\Delta =0^\circ$, (d) $\Delta =1^\circ$, (e) $\Delta =2^\circ$, (f) $\Delta =5^\circ$.

Download Full Size | PDF

Besides the accuracy in retrieving arbitrary SU(2) processes, we analyze the time consumption of these algorithms. The algorithms are executed on a machine with an Intel Core i7-10700 CPU ($2.90$ GHz), $32$ GB of RAM and a NVIDIA Quadro P2200 GPU (used for training the NN). On average, NMin requires $\sim 0.2$ s for a single reconstruction, the GA requires $\sim 0.1$ s, while, remarkably, the NN only requires $\sim 1 \,\mu$s. The time consumption reported for the NN does not include the time required for the training stage ($\sim 25$ seconds per epoch, amounting to a total of $\sim$ 20 minutes), which has to be performed only once.

As is typical for non-deterministic algorithms, the predictions of the GA have been averaged over 10 runs for each unitary reconstruction. However, statistical fluctuations of the final prediction are marginal for all levels of noise. Accordingly, in the following applications of the GA, we will only consider single executions.

6. Space-dependent polarization transformations

The techniques previously described address single unitary polarization transformations that are spatially uniform, but can be extended to characterize space-dependent transformations $U(x,y)$. Spatially-inhomogeneous optical waveplates are such an example: they modify the transverse polarization profile of a collimated light beam, propagating along the $z$ axis of a reference frame. The action of a thin waveplate with its optic axis oriented at an angle $\alpha$ with respect to the $x$ axis can be put in the matrix form

(9)$$R_{\delta}(x,y)=\begin{pmatrix} \cos{(\delta/2)} & i\sin{(\delta/2)}e^{{-}2i \alpha(x,y)}\\ i\sin{(\delta/2)}e^{2i \alpha(x,y)} & \cos{(\delta/2)} \end{pmatrix},$$

where $\delta$ is the optical birefringence (that we assume to be uniform) and $\alpha (x,y)$ is space-dependent, that is, the optic axis takes different orientations across the transverse plane $(x,y)$. While the operator in Eq. (9) has no global phase factor, other optical devices could implement polarization transformations featuring a space-dependent global phase profile. To detect it, our technique should be accompanied by an interferometric measurement. The complex transformation $U$ can be decomposed as a collection of local transformations, acting at different positions (pixels):

(10)$$U=\sum_{(x,y)}U(x,y)|{x,y}{\rangle}{\langle}{x,y}|.$$

We stress here that this decomposition is exact only in the near field, where each pixel can still be addressed independently. In the far field (or upon some propagation), the transformation is non-diagonal in the position space, as each pixel is strongly correlated to all the others. This challenging scenario will be addressed in a subsequent study.

We generalize our algorithms to achieve the characterization of the full 2D maps by means of an iterative approach, i.e., via a pixel-by-pixel reconstruction. A straightforward iteration of the procedures described above could lead to some ambiguities, resulting in turn into the determination of discontinuous patterns $(\Theta (x,y),\boldsymbol {n}(x,y))$. This would be related to at least two issues. First, we cannot discriminate between processes $U$ and $-U$, leading to random jumps between the two families of solutions $(\Theta,\boldsymbol {n})$ and $(\pi -\Theta,-\boldsymbol {n})$. Second, physical imperfections in the sample or inaccuracies in the experimental procedures may lead to the reconstruction of an erroneous polarization transformation. To enforce a continuous modulation of the reconstructed parameters, we adopt different strategies, depending on the optimization algorithm. Motivated by these arguments, for each numerical technique (GA and NN) we identified a strategy to retrieve patterns that do not feature these artificial jumps.

6.1 Continuity for the GA

The imposition of the continuity constraint in the GA approach is performed a priori, by properly preparing the starting population in each pixel. As long as the optical transformation does not feature singularities, we expect that the solution found in a given pixel cannot be too different from the solution associated with neighbouring pixels. Continuity can therefore be chased by initializing the population of a pixel from the solutions found by the GA in neighbouring pixels, possibly perturbed with uniform noise.

Formally, let us denote with $(i,j)$ the $i$-th row and $j$-th column of the pixel grid and with $s_{(i,j)}$ the best individual found by the GA performed in $(i,j)$. The initial population of the GA in $(i, j)$ is obtained by randomly selecting $N$ times (where $N$ is the population size) individuals from the set $\mathcal {S}$ of the best solutions found by the GA in all the neighbouring pixels within a distance $d$ from $(i, j)$ [see Fig. 3(a)]. Obviously, since the grid is spanned from left to right, the current pixel gets information only from neighbouring pixels where the GA has been already performed. As for the starting pixel $(0,0)$, no previous information is available, and the initial population is randomly generated [see Fig. 3(b)]. Figures 3(c)-(d) exemplify the choice of neighbours for pixels $(1,1)$ and $(3,3)$. After performing $N$ samplings from the corresponding set $\mathcal {S}_{(i,j)}$ to prepare the initial population, a uniformly-extracted noise in the range $\left [ -\epsilon, \epsilon \right ]$ is applied to each sampled solution. To respect the physical constraints, after the application of noise $\Theta$ is rescaled to the range $[ 0, \pi ]$, and the vector $\boldsymbol {n}$ is normalized. Finally, we observe that this procedure allows initializing the GA very close to the actual solution, so we expect fewer iterations to be needed to obtain an adequate convergence. Therefore, to decrease the computational time of the entire reconstruction, the number of generations for pixels other than $(0,0)$ is reduced. With a clever choice of the number of generations, this trick reduces the total time up to a factor of $10$. In particular, the parameter configuration chosen in our experiments is: $d=2$, $\epsilon =0.2$, $N_{\text {gen}}^{(0,0)} = 60$, $N_{\text {gen}}= 10$, where $N_{\text {gen}}^{(0,0)}$ indicates the number of iterations of the GA reconstructing the transformation associated with pixel $(0,0)$, while $N_{\text {gen}}$ refers to all the other pixels. The remaining hyper-parameters are the same as those reported in Appendix D.

Fig. 3. (a) Set of possible neighbouring pixels $\mathcal {S}$ (gray) for a certain grid position $(i,j)$ (green) within a distance ${d=2}$. (b) The genetic optimization starts from pixel $(0,0)$, for which the initial population is completely random. Sets of neighbouring pixels used in our algorithm for pixels $(1,1)$ (c) and $(3,3)$ (d).

Download Full Size | PDF

6.2 Continuity for the NN

The imposition of the continuity constraint in the NN approach is performed a posteriori, as feedforward networks do not possess a memory of previous inputs. After the transformation has been reconstructed pixel by pixel, the global phase ambiguity is gauged away by checking for sudden jumps in the predicted parameters $(\Theta (i,j),n_{x}(i,j),n_{y}(i,j))$ in correspondence with neighbouring pixels. If an inconsistency is found, the pixel parameters are switched to $\left (\pi - \Theta (i,j),-n_{x}(i,j),-n_{y}(i,j)\right )$ and, accordingly, $n_{z}(i,j)\rightarrow -n_{z}(i,j)$.

7. Experimental results

We realize complex polarization transformations with the setup sketched in Fig. 4. We expand the light beam produced by a He-Ne laser source (with wavelength ${\lambda =633}$ nm), so as to have the final beam waist $w_0\simeq 5$ mm. To implement different realizations of Eq. (3), we prepare the desired input polarization state with a half-wave plate (HWP) and a quarter-wave plate (QWP). The output projection is performed with a QWP and a linear polarizer (LP). Light intensity profiles are successively captured by a CCD camera placed immediately after the projection stage. In our implementation, the sum of Eq. (10) extends over a discrete grid of $73\times 73$ pixels, obtained by compressing the experimental pictures captured by the camera. This allows minimizing the errors due to local intensity fluctuations in the image area, which appear when projecting onto different bases.

Fig. 4. Experimental setup to reconstruct complex polarization transformations, implemented via LC metasurfaces. The CCD camera records the light intensity distributions resulting from different projective measurements, realized by adjusting the preparation and projection stages.

Download Full Size | PDF

In our setup, the actual polarization transformation $U$ is realized via one or multiple LC metasurfaces, as those originally described in Refs. [57,58]. LC technology allows us to fabricate plates with a patterned local optic-axis orientation $\alpha (x,y)$, enabling us to implement several transformations within the same setup. In our experiments, we specifically employ linear polarization gratings, referred to as $g$-plates [58], and LC plates with uniform optic-axis orientation, acting as ordinary waveplates. Furthermore, the birefringence $\delta$ of these devices is electrically controlled and can therefore be tuned [68]. In the case of a $g$-plate deflecting the beam along the $x$ direction, $\alpha (x,y)=\pi x/\Lambda$ ($\Lambda =5$ mm), while for ordinary waveplates we set $\alpha (x,y)=0$.

To test the performances of our approaches when processing data from real experiments, we start implementing the simple polarization transformation realized by a single $g$-plate $T_x(\delta )$, oriented along the $x$ direction, tuned at $\delta =\pi$. Reconstructions obtained via GA and ML are compared with the theoretical parameters in Fig. 5(a). An excellent agreement is observed for this preliminary experiment: $\bar {F}_{GA}=98.6{\%}$ and $\bar {F}_{NN}=94.6{\%}$, where $\bar {F}$ denotes the average fidelity computed over all the pixels. In addition to the intrinsic precision limit of our routines, the non-perfect agreement has to be also ascribed to possible deviations of our setup from the ideal behaviour, especially to fabrication defects in our LC devices.

Fig. 5. Reconstruction of the parameters $\left (\Theta, \boldsymbol {n} \right )$ associated with selected space-dependent polarization transformations: (a) ${U=T_x(\pi )}$, where $T_x$ denotes the $g$-plate operator acting along $x$, (b) ${U=T_y(\pi /4)T_x(\pi )W(\pi /2)}$, (c) ${U=T_y(\pi /2)T_x(\pi /6)W(\pi )}$. (d)-(f) Infidelities histograms for the pixels of processes (a)-(c), respectively. Average fidelity and computation time: (a) $\bar {F}_{GA}=98.6{\%}$, $t_{GA}=106$ s, $\bar {F}_{NN}=94.6{\%}$, $t_{NN}=180$ ms, (b) $\bar {F}_{GA}=97.0{\%}$, $t_{GA}=109$ s, $\bar {F}_{NN}=96.6{\%}$, $t_{NN}=180$ ms, (c) $\bar {F}_{GA}=95.3{\%}$, $t_{GA}=106$ s, $\bar {F}_{NN}=94.7{\%}$, $t_{NN}=180$ ms.

Download Full Size | PDF

We then proceed to characterize more complex polarization transformations, realized by stacking three metasurfaces. We specifically reconstruct the two processes ${U=T_y(\pi /4)T_x(\pi )W(\pi /2)}$ and ${U=T_y(\pi /2)T_x(\pi /6)W(\pi )}$ [see Figs. 5(b)-(c)], where $T_x$ ($T_y$) denotes the $g$-plate operator acting along the $x$ ($y$) direction, while $W(\delta )$ represents an ordinary waveplate. For $\delta =\pi /2$ and $\delta =\pi$, the latter acts as a QWP and HWP, respectively. In the first case, we find $\bar {F}_{GA}=97.0{\%}$ and $\bar {F}_{NN}=96.6{\%}$, while in the second realization we obtain $\bar {F}_{GA}=95.3{\%}$ and $\bar {F}_{NN}=94.7{\%}$. The times required for the full reconstruction are indicated in the captions for each process. While we typically get a lower fidelity in the case of the NN, this method is about $600$ times faster than the one based on the GA. Histograms in Figs. 5(d)-(f) depict the infidelities distributions over all the pixels for the three processes described above.

In Appendix E, we report the tomographic reconstructions retrieved with the standard maximum-likelihood approach, for which we find a good agreement with the theory as well. This is expected as the continuity constraint critically restricts the parameter space, mitigating the possibility of getting stuck in local minima.

In general, the high values of the recorded average fidelities confirm that our routines are suitable for reconstructing optical unitary operations from a very limited amount of experimental data. The good performances of these tomographic approaches in the case of a real experimental scenario also represent a crucial demonstration of their robustness against realistic sources of noise in the detection apparatus, which are more general and less controllable than our synthetic estimates. Finally, we observe that only in Fig. 5(a) the NN fails to reconstruct the correct values of $(n_x,n_y,n_z)$ at isolated pixels, while this is not taking place in Figs. 5(b)-(c). As such, we believe that the imperfections in Fig. 5(a) are strictly related to the special case $n_z=0$.

8. Conclusion

We presented two routines that can assist experimentalists in characterizing optical setups used to manipulate light polarization. When adopting a GA and a NN, we obtain noteworthy performances in reconstructing complex polarization transformations from a set of input measurements close to the minimal case, both in terms of the accuracy of the reconstruction and the required computational resources. In future implementations, these approaches could be used in a complementary way. ML appears more fitting for a preliminary scanning of the experimental platform, while the GA produces the most reliable predictions, even if more time is required to converge to the correct solution. This work demonstrates two innovative techniques for polarization tomography, specifically adapted to space-dependent transformations. We plan to optimize these routines even further and extend future studies to other algorithms. For instance, ML predictions could substantially improve by processing the whole experimental image at once, by means of convolutional neural networks [69,70]. This approach would facilitate the reconstruction of the optical process even in the presence of singularities, and without relying upon post-processing adjustments, as we did in this work. The final predictions might be further processed to retrieve geometrical and topological features of unitary evolutions which can be simulated within our photonic setup [58,62,71,72]. At the same time, it would be interesting to extend the same approaches to more general physical scenarios, for example exploring non-unitary evolutions [73] and de-polarizing channels, probing multi-photon regimes [74] in quantum photonics experiments, or employing these techniques for biomedical applications based on polarization optics [75].

A. Minimal set of measurements

In this section, we provide two arguments explaining why five is the minimal number of projective measurements needed to reconstruct a generic SU(2) operator.

We first start with a geometric argument. Reconstructing a generic $SU(2)$ operator $U$ means predicting its action on any qubit. Accordingly, given two different input states, we need to determine the obtained output states. Let us assume that our first input state is $|{\psi _{\text {in},1}}\rangle$. The output state $U|{\psi _{\text {in},1}}\rangle$ is a point on the Bloch sphere. To unambiguously identify a point on a sphere, the distance from three other points must be determined (as long as these do not lie on a great circle). "Measuring distances" is equivalent to extracting the magnitude of the scalar product. Therefore, we perform three projective measurements of the state $U|{\psi _{\text {in},1}}\rangle$. In this way, we fully reconstruct $U|{\psi _{\text {in},1}}\rangle$. We now choose a second input state $|{\psi _{\text {in},2}}\rangle$ that, crucially, is not orthogonal to the first one. Accordingly, to reconstruct $U|{\psi _{\text {in},2}}\rangle$ only two projective measurements are needed, as the knowledge of the distance from $U|{\psi _{\text {in},1}}\rangle$ is already given by the unitarity condition.

This gives us five measurements. However, when reconstructing the output state, it may happen that $U|{\psi _{\text {in},1}}\rangle$ and the two other states chosen as reference points all lie on a great circle on the Bloch sphere. There is actually only a zero-measure set of states for which this may happen. However, with data coming from real experiments, this possibility compromises the robustness of the reconstruction. This effect can be tackled by introducing a sixth measurement, which always ensures an accurate reconstruction.

We also provide an algebraic argument. A generic SU(2) operator $U$ can be put in the matrix form

(11)$$U= \begin{pmatrix} A e^{i \varphi} & Be^{i \psi}\\ -B e^{{-}i \psi} &A e^{{-}i\varphi} \end{pmatrix},$$

where $A$ and $B$ are positive real numbers satisfying ${A^2+B^2=1}$, and $\varphi$ and $\psi$ represent real phases.

In the basis of circular polarizations, we can determine both $A$ and $B$ with a single projective measurement:

(12)$$I_{LL}=|{\langle{L}|{U}|{L}\rangle}|^2=A^2.$$

From the above equation, we determine $A=\sqrt {I_{LL}}$ and $B=\sqrt {1-I_{LL}}$.

As for the phases $\varphi$ and $\psi$, four additional measurements are required. We consider the combination

(13)$$\begin{aligned}I_{LH}&=|{\langle{H}|{U}|{L}\rangle}|^2=\dfrac{1}{2}-AB \cos\left(\varphi+\psi \right),\\ I_{LD}&=|{\langle{D}|{U}|{L}\rangle}|^2=\dfrac{1}{2}+AB \sin\left(\varphi+\psi \right), \end{aligned}$$

from which we can uniquely determine the phase sum $\varphi +\psi$, and

(14)$$\begin{aligned}I_{HL}&=|{\langle{L}|{U}|{H}\rangle}|^2=\dfrac{1}{2}+AB \cos\left(\varphi-\psi \right),\\ I_{HD}&=|{\langle{D}|{U}|{H}\rangle}|^2=\dfrac{1}{2}+A^2 \sin\left(2\varphi\right) + B^2\sin\left(2\psi\right) , \end{aligned}$$

from which we can uniquely determine the phase difference $\varphi -\psi$.

B. Genetic tomography

Here follows the detailed sequence of operators used in our GA. First, the well-known tournament selection mechanism [77] is used as selection operator. This consists in repeating $N$ times (where $N$ is the population size) the following steps:

(1) randomly select a subset of $k$ individuals;
(2) choose the fittest individual among them to be inserted in the mating pool, where reproduction takes place.

For our purposes, "fittest" are those individuals for which the cost function is minimum. Second, the blend crossover [64] is used to mate individuals in the mating pool. Accordingly, when two individuals ${\boldsymbol {x}_A=(\Theta _A,n_{xA},n_{yA},n_{zA})}$ and ${\boldsymbol {x}_B=(\Theta _B,n_{xB},n_{yB},n_{zB})}$ reproduce, two newborn individuals $\boldsymbol {x}_1$ and $\boldsymbol {x}_2$ originates, with each gene $x_{c,i}$ given by a random number belonging to the interval

(15)$$\begin{aligned} \left[ x_{A,i}-\alpha_c (x_{B,i}-x_{A,i}),x_{B,i}+\alpha_c (x_{B,i}-x_{A,i}) \right]\Theta(x_{B,i}-x_{A,i})+\\ \left[ x_{B,i}-\alpha_c (x_{A,i}-x_{B,i}),x_{A,i}+\alpha_c (x_{A,i}-x_{B,i}) \right]\Theta(x_{A,i}-x_{B,i}), \end{aligned}$$

where $\alpha _c$ tunes the crossover, with $c=\left \lbrace 1,2\right \rbrace$ and $i=\left \lbrace 1,2,3,4 \right \rbrace$, and $\Theta (x)$ is the Heaviside step function. Finally, a Gaussian mutation [78] with mean $\mu$ and standard deviation $\sigma$ can mutate individual genes of offspring solutions. Our GA is also provided with an elitism mechanism. In other words, the best chromosome from the old population is carried over to the next one, replacing the worst individual of the offspring. This mechanism pushes the algorithm to a fast convergence toward the best solution. To preserve the physical validity of the final prediction, after each operation carried out on an individual, a modulo-$\pi$ operation is performed on its first gene, while normalization is enforced on its last three genes. The maximum number of generations is used as a termination criterion.

Table 1 reports the pseudo-code of the implemented GA. Our GA is performed relying on the DEAP library [79].

Table 1. Pseudo-code of the implemented genetic algorithm.

View Table | View all tables in this article

C. Machine learning tomography

The network scheme we consider is a dense network, as each neuron in a given layer is connected to each neuron in the subsequent layer. Figure 6 shows the structure of our network. We choose the rectified linear unit ("ReLu") as activation function for all the layers, except for the output layer, for which we choose the Sigmoid function. Training data are generated at run-time, so as to avoid importing huge data-sets and possibly decreasing the issue of overfitting, which can result from a too limited number of training samples. Several standard loss functions have been tested for the supervised learning. Among these, the MSE has always shown the fastest convergence. The learning process is divided into $50$ epochs. During each epoch, the NN learns about a training set of $2^{20}$ randomly generated SU(2) processes, divided into $2^{12}$ batches. At the end of each epoch, the performance of the network is evaluated on a different set of data, the validation set, consisting of $2^{18}$ processes, divided into $2^{10}$ batches. The validation phase works similarly to the training, but after the evaluation of the cost function the weights are not adjusted. To improve the learning performance, we employ the callback "ReduceLROnPlateau". This function allows us to keep track of the loss during the various epochs and reduce the learning rate in case of stagnation. To ensure good performances in case of noisy data, we also employ Gaussian Dropout, where noise is randomly applied to chosen layers during training [80,81]. Finally, the Adam optimization algorithm [82] is used to adjust the network hyper-parameters during the supervised learning.

Fig. 6. Structure of the dense neural network. For the intermediate layers (hidden layers), the number of neurons has been written explicitly.

Download Full Size | PDF

The ML approach is implemented with the help of the TENSORFLOW library [83].

D. Hyper-parameters

For the sake of reproducibility, we declare the configurations employed in our approaches. After a careful tuning procedure, the best setting of the GA and NN hyper-parameters is reported in Table 2 and 3, respectively.

Table 2. GA hyper-parameters

View Table | View all tables in this article

Table 3. NN hyper-parameters

View Table | View all tables in this article

E. Maximum-likelihood tomography

Figure 7 shows the tomographic reconstructions of the space-dependent polarization transformations explored in Sec. 7, as obtained by minimizing the log-likelihood cost function with NMin. In this case, the continuity constraint is a priori enforced, as in the GA application. Fidelity and timing values are reported in the caption for each process.

Fig. 7. Maximum-Likelihood reconstruction of the parameters $\left (\Theta, \boldsymbol {n} \right )$ associated with the same experimental processes as in Fig. 5: (a) $\bar {F}=98.7{\%}$, $t=1303$ s, (b) $\bar {F}=96.9{\%}$, $t=1273$ s, (c) $\bar {F}=95.0{\%}$, $t=1176$ s.

Download Full Size | PDF

Funding

'la Caixa' Foundation (LCF/BQ/PR20/11770012); Marie-Sklodowska-Curie (101029393, 847648); Narodowe Centrum Nauki (2016/20/W/ST4/00314); EU Horizon (899794); Barcelona Supercomputing Center (FI-2022-1-0042); European Social Fund (2017 SGR 134, QuantumCAT U16-011424); Fundació Mir-Puig; Fundació Cellex; European Union NextGenerationEU (PRTR); Ministerio de Ciencia y Innovation Agencia Estatal de Investigaciones (CEX2019-000910-S/10.13039/501100011033, FIDEUA PID2019-106901GB-I00, FPI, PGC2018-097027-B-I00/10.13039/501100011033, QUANTERA DYNAMITE PCI2022-132919, QUANTERA MAQS PCI2019-111828-2, QUSPIN RTC2019-007196-7); European Research Council (694683, AdG NOQIA); Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (204801021001); Ministero dell'Università e della Ricerca (PE0000023-NQSTI).

Acknowledgments

FDC and FC acknowledge financial support from the European Union Horizon 2020 program, under European Research Council (ERC) Grant No. 694683 (PHOSPhOR). FC acknowledges financial support from PNRR MUR project PE0000023-NQSTI. LA acknowledges financial support from the Swiss National Science Foundation (SNSF) under grant No. 204801021001. A.D. acknowledges financial support from ERC AdG NOQIA; Ministerio de Ciencia y Innovation Agencia Estatal de Investigaciones (PGC2018-097027-B-I00/10.13039/501100011033, CEX2019-000910-S/10.13039/501100011033, Plan National FIDEUA PID2019-106901GB-I00, FPI, QUANTERA MAQS PCI2019-111828-2, QUANTERA DYNAMITE PCI2022-132919, Proyectos de I+D+I "Retos Colaboración" QUSPIN RTC2019-007196-7); European Union NextGenerationEU (PRTR); Fundació Cellex; Fundació Mir-Puig; Generalitat de Catalunya (European Social Fund FEDER and CERCA program (AGAUR Grant No. 2017 SGR 134, QuantumCAT U16-011424, co-funded by ERDF Operational Program of Catalonia 2014-2020); Barcelona Supercomputing Center MareNostrum (FI-2022-1-0042); EU Horizon 2020 FET-OPEN OPTOlogic (Grant No 899794); National Science Centre, Poland (Symfonia Grant No. 2016/20/W/ST4/00314); European Union’s Horizon 2020 research and innovation programme under the Marie-Skøodowska-Curie grant agreement No 101029393 (STREDCH) and No 847648 ("La Caixa" Junior Leaders fellowships ID100010434: LCF/BQ/PI19/11690013, LCF/BQ/PI20/11760031, LCF/BQ/PR20/11770012, LCF/BQ/PR21/11840013). A.D. further acknowledges the financial support from a fellowship granted by la Caixa Foundation (ID 100010434, fellowship code LCF/BQ/PR20/11770012).

Disclosures

The authors declare no conflicts of interest.

Data availability

The source code and data underlying the results presented in this paper are available in Ref. [76].

References

1. I. L. Chuang and M. A. Nielsen, “Prescription for experimental determination of the dynamics of a quantum black box,” J. Mod. Opt. 44(11-12), 2455–2467 (1997). [CrossRef]

2. F. Bouchard, F. Hufnagel, D. Koutný, A. Abbas, A. Sit, K. Heshami, R. Fickler, and E. Karimi, “Quantum process tomography of a high-dimensional quantum communication channel,” Quantum 3, 138 (2019). [CrossRef]

3. S. Goel, S. Leedumrongwatthanakun, N. H. Valencia, W. McCutcheon, C. Conti, P. W. H. Pinkse, and M. Malik, “Inverse-design of high-dimensional quantum optical circuits in a complex medium,” arXiv, arXiv:2204.00578 (2022). [CrossRef]

4. A. M. Childs, I. L. Chuang, and D. W. Leung, “Realization of quantum process tomography in NMR,” Phys. Rev. A 64(1), 012314 (2001). [CrossRef]

5. S. H. Myrskog, J. K. Fox, M. W. Mitchell, and A. M. Steinberg, “Quantum process tomography on vibrational states of atoms in an optical lattice,” Phys. Rev. A 72(1), 013615 (2005). [CrossRef]

6. C. F. Roos, G. P. T. Lancaster, M. Riebe, H. Häffner, W. Hänsel, S. Gulde, C. Becher, J. Eschner, F. Schmidt-Kaler, and R. Blatt, “Bell states of atoms with ultralong lifetimes and their tomographic state analysis,” Phys. Rev. Lett. 92(22), 220402 (2004). [CrossRef]

7. M. Riebe, K. Kim, P. Schindler, T. Monz, P. O. Schmidt, T. K. Körber, W. Hänsel, H. Häffner, C. F. Roos, and R. Blatt, “Process tomography of ion trap quantum gates,” Phys. Rev. Lett. 97(22), 220407 (2006). [CrossRef]

8. T. Yamamoto, M. Neeley, E. Lucero, R. C. Bialczak, J. Kelly, M. Lenander, M. Mariantoni, A. D. O’Connell, D. Sank, H. Wang, M. Weides, J. Wenner, Y. Yin, A. N. Cleland, and J. M. Martinis, “Quantum process tomography of two-qubit controlled-z and controlled-not gates using superconducting phase qubits,” Phys. Rev. B 82(18), 184515 (2010). [CrossRef]

9. R. C. Bialczak, M. Ansmann, M. Hofheinz, E. Lucero, M. Neeley, A. D. O’Connell, D. Sank, H. Wang, J. Wenner, M. Steffen, A. N. Cleland, and J. M. Martinis, “Quantum process tomography of a universal entangling gate implemented with Josephson phase qubits,” Nat. Phys. 6(6), 409–413 (2010). [CrossRef]

10. M. W. Mitchell, C. W. Ellenor, S. Schneider, and A. M. Steinberg, “Diagnosis, prescription, and prognosis of a bell-state filter by quantum process tomography,” Phys. Rev. Lett. 91(12), 120402 (2003). [CrossRef]

11. J. B. Altepeter, D. Branning, E. Jeffrey, T. C. Wei, P. G. Kwiat, R. T. Thew, J. L. O’Brien, M. A. Nielsen, and A. G. White, “Ancilla-Assisted Quantum Process Tomography,” Phys. Rev. Lett. 90(19), 193601 (2003). [CrossRef]

12. J. L. O’Brien, G. J. Pryde, A. Gilchrist, D. F. V. James, N. K. Langford, T. C. Ralph, and A. G. White, “Quantum process tomography of a controlled-not gate,” Phys. Rev. Lett. 93(8), 080502 (2004). [CrossRef]

13. M. Lobino, D. Korystov, C. Kupchak, E. Figueroa, B. C. Sanders, and A. I. Lvovsky, “Complete Characterization of Quantum-Optical Processes,” Science 322(5901), 563–566 (2008). [CrossRef]

14. I. Bongioanni, L. Sansoni, F. Sciarrino, G. Vallone, and P. Mataloni, “Experimental quantum process tomography of non-trace-preserving maps,” Phys. Rev. A 82(4), 042307 (2010). [CrossRef]

15. S. Rahimi-Keshari, M. A. Broome, R. Fickler, A. Fedrizzi, T. C. Ralph, and A. G. White, “Direct characterization of linear-optical networks,” Opt. Express 21(11), 13450 (2013). [CrossRef]

16. B. Ndagano, B. Perez-Garcia, F. S. Roux, M. McLaren, C. Rosales-Guzman, Y. Zhang, O. Mouane, R. I. Hernandez-Aranda, T. Konrad, and A. Forbes, “Characterizing quantum channels with non-separable states of classical light,” Nat. Phys. 13(4), 397–402 (2017). [CrossRef]

17. C. Antón, P. Hilaire, C. A. Kessler, J. Demory, C. Gómez, A. Lemaître, I. Sagnes, N. D. Lanzillotti-Kimura, O. Krebs, N. Somaschi, P. Senellart, and L. Lanco, “Tomography of the optical polarization rotation induced by a single quantum dot in a cavity,” Optica 4(11), 1326 (2017). [CrossRef]

18. K. V. Jacob, A. E. Mirasola, S. Adhikari, and J. P. Dowling, “Direct characterization of linear and quadratically nonlinear optical systems,” Phys. Rev. A 98(5), 052327 (2018). [CrossRef]

19. F. Le Roy-Brehonnet and B. Le Jeune, “Utilization of Mueller matrix formalism to obtain optical targets depolarization and polarization properties,” Prog. Quantum Electron. 21(2), 109–151 (1997). [CrossRef]

20. A. Laing and J. L. O’Brien, “Super-stable tomography of any linear optical device,” arXiv, arXiv:1208.2868 (2012). [CrossRef]

21. Z. Hradil, “Quantum-state estimation,” Phys. Rev. A 55(3), R1561–R1564 (1997). [CrossRef]

22. S. M. Tan, “An inverse problem approach to optical homodyne tomography,” J. Mod. Opt. 44(11-12), 2233–2259 (1997). [CrossRef]

23. K. Banaszek, G. M. D’Ariano, M. G. A. Paris, and M. F. Sacchi, “Maximum-likelihood estimation of the density matrix,” Phys. Rev. A 61(1), 010304 (1999). [CrossRef]

24. Z. Hradil, J. Summhammer, G. Badurek, and H. Rauch, “Reconstruction of the spin state,” Phys. Rev. A 62(1), 014101 (2000). [CrossRef]

25. D. F. V. James, P. G. Kwiat, W. J. Munro, and A. G. White, “Measurement of qubits,” Phys. Rev. A 64(5), 052312 (2001). [CrossRef]

26. J. Reháček, Z. Hradil, and M. Ježek, “Iterative algorithm for reconstruction of entangled states,” Phys. Rev. A 63(4), 040303 (2001). [CrossRef]

27. M. S. Kaznady and D. F. V. James, “Numerical strategies for quantum tomography: Alternatives to full optimization,” Phys. Rev. A 79(2), 022109 (2009). [CrossRef]

28. A. Aiello, G. Puentes, D. Voigt, and J. P. Woerdman, “Maximum-likelihood estimation of Mueller matrices,” Opt. Lett. 31(6), 817 (2006). [CrossRef]

29. R. C. Jones, “A new calculus for the treatment of optical systems i. description and discussion of the calculus,” J. Opt. Soc. Am. 31(7), 488–493 (1941). [CrossRef]

30. J. E. Solomon, “Polarization imaging,” Appl. Opt. 20(9), 1537 (1981). [CrossRef]

31. J. A. Davis, G. H. Evans, and I. Moreno, “Polarization-multiplexed diffractive optical elements with liquid-crystal displays,” Appl. Opt. 44(19), 4049 (2005). [CrossRef]

32. H. Rubinsztein-Dunlop, A. Forbes, M. V. Berry, et al., “Roadmap on structured light,” J. Opt. 19(1), 013001 (2017). [CrossRef]

33. A. Forbes, M. de Oliveira, and M. R. Dennis, “Structured light,” Nat. Photonics 15(4), 253–262 (2021). [CrossRef]

34. M. Piccardo, V. Ginis, A. Forbes, et al., “Roadmap on multimode light shaping,” J. Opt. 24(1), 013001 (2022). [CrossRef]

35. D. Neshev and I. Aharonovich, “Optical metasurfaces: new generation building blocks for multi-functional optics,” Light: Sci. Appl. 7(1), 58 (2018). [CrossRef]

36. J. H. Holland, Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control and Artificial Intelligence (MIT Press, 1992).

37. D. Whitley, “A genetic algorithm tutorial,” Stat. Comput. 4(2), 65–85 (1994). [CrossRef]

38. M. Mitchell, An Introduction to Genetic Algorithms (MIT press, 1998).

39. L. M. Schmitt, “Theory of genetic algorithms,” Theor. Comput. Sci. 259(1-2), 1–61 (2001). [CrossRef]

40. U. Las Heras, U. Alvarez-Rodriguez, E. Solano, and M. Sanz, “Genetic algorithms for digital quantum simulations,” Phys. Rev. Lett. 116(23), 230504 (2016). [CrossRef]

41. P. R. Hegde, G. Passarelli, A. Scocco, and P. Lucignano, “Genetic optimization of quantum annealing,” Phys. Rev. A 105(1), 012612 (2022). [CrossRef]

42. R. I. Woodward and E. J. R. Kelleher, “Towards ’smart lasers’: self-optimisation of an ultrafast pulse source using a genetic algorithm,” Sci. Rep. 6(1), 37616 (2016). [CrossRef]

43. R. I. Woodward and E. J. R. Kelleher, “Genetic algorithm-based control of birefringent filtering for self-tuning, self-pulsing fiber lasers,” Opt. Lett. 42(15), 2952–2955 (2017). [CrossRef]

44. L. Michaeli and A. Bahabad, “Genetic algorithm driven spectral shaping of supercontinuum radiation in a photonic crystal fiber,” J. Opt. 20(5), 055501 (2018). [CrossRef]

45. N. Spagnolo, E. Maiorino, C. Vitelli, M. Bentivegna, A. Crespi, R. Ramponi, P. Mataloni, R. Osellame, and F. Sciarrino, “Learning an unknown transformation via a genetic approach,” Sci. Rep. 7(1), 14316 (2017). [CrossRef]

46. G. Pu, L. Yi, L. Zhang, and W. Hu, “Genetic algorithm-based fast real-time automatic mode-locked fiber laser,” IEEE Photonics Technol. Lett. 32(1), 7–10 (2020). [CrossRef]

47. M. Bielak, R. Stárek, V. Krčmarský, M. Mičuda, and M. Ježek, “Accurate polarization preparation and measurement using twisted nematic liquid crystals,” Opt. Express 29(21), 33037 (2021). [CrossRef]

48. D. Karpov and P. Horak, “Evolutionary algorithm to design high-cooperativity optical cavities,” New J. Phys. 24(7), 073028 (2022). [CrossRef]

49. U. Mahlab, J. Shamir, and H. J. Caulfield, “Genetic algorithm for optical pattern recognition,” Opt. Lett. 16(9), 648–650 (1991). [CrossRef]

50. K. D. Kihm and D. P. Lyons, “Optical tomography using a genetic algorithm,” Opt. Lett. 21(17), 1327–1329 (1996). [CrossRef]

51. M. I. Jordan and T. M. Mitchell, “Machine learning: Trends, perspectives, and prospects,” Science 349(6245), 255–260 (2015). [CrossRef]

52. A. Dawid, J. Arnold, B. Requena, et al., “Modern applications of machine learning in quantum sciences,” arXiv, arXiv:2204.04198 (2022). [CrossRef]

53. A. M. Palmieri, E. Kovlakov, F. Bianchi, D. Yudin, S. Straupe, J. D. Biamonte, and S. Kulik, “Experimental neural network enhanced quantum tomography,” npj Quantum Inf. 6(1), 20 (2020). [CrossRef]

54. Y. Quek, S. Fort, and H. K. Ng, “Adaptive quantum state tomography with neural networks,” npj Quantum Inf. 7(1), 105 (2021). [CrossRef]

55. G. Genty, L. Salmela, J. M. Dudley, D. Brunner, A. Kokhanovskiy, S. Kobtsev, and S. K. Turitsyn, “Machine learning and applications in ultrafast photonics,” Nat. Photonics 15(2), 91–101 (2021). [CrossRef]

56. Wolfram Research, Numerical Nonlinear Global Optimization (2021), https://reference.wolfram.com/language/tutorial/ConstrainedOptimizationGlobalNumerical.html.

57. A. Rubano, F. Cardano, B. Piccirillo, and L. Marrucci, “Q-plate technology: a progress review [Invited],” J. Opt. Soc. Am. B 36(5), D70 (2019). [CrossRef]

58. A. D’Errico, F. Cardano, M. Maffei, A. Dauphin, R. Barboza, C. Esposito, B. Piccirillo, M. Lewenstein, P. Massignan, and L. Marrucci, “Two-dimensional topological quantum walks in the momentum space of structured light,” Optica 7(2), 108 (2020). [CrossRef]

59. N. Fläschner, B. S. Rem, M. Tarnowski, D. Vogel, D.-S. Lühmann, K. Sengstock, and C. Weitenberg, “Experimental reconstruction of the Berry curvature in a Floquet Bloch band,” Science 352(6289), 1091–1094 (2016). [CrossRef]

60. T. Li, L. Duca, M. Reitter, F. Grusdt, E. Demler, M. Endres, M. Schleier-Smith, I. Bloch, and U. Schneider, “Bloch state tomography using Wilson lines,” Science 352(6289), 1094–1097 (2016). [CrossRef]

61. M. Tarnowski, F. N. Ünal, N. Fläschner, B. S. Rem, A. Eckardt, K. Sengstock, and C. Weitenberg, “Measuring topology from dynamics by obtaining the Chern number from a linking number,” Nat. Commun. 10(1), 1728 (2019). [CrossRef]

62. C.-R. Yi, J. Yu, H. Yuan, R.-H. Jiao, Y.-M. Yang, X. Jiang, J.-Y. Zhang, S. Chen, and J.-W. Pan, “Extracting the Quantum Geometric Tensor of an Optical Raman Lattice by Bloch State Tomography,” arXiv, arXiv:2301.06090 (2023). [CrossRef]

63. N. Bent, H. Qassim, A. A. Tahir, D. Sych, G. Leuchs, L. L. Sánchez-Soto, E. Karimi, and R. W. Boyd, “Experimental realization of quantum tomography of photonic qudits via symmetric informationally complete positive operator-valued measures,” Phys. Rev. X 5(4), 041006 (2015). [CrossRef]

64. L. J. Eshelman and J. D. Schaffer, “Real-coded genetic algorithms and interval-schemata,” in Foundations of Genetic Algorithms, vol. 2 (Elsevier, 1993), pp. 187–202.

65. A. Gilchrist, N. K. Langford, and M. A. Nielsen, “Distance measures to compare real and ideal quantum processes,” Phys. Rev. A 71(6), 062310 (2005). [CrossRef]

66. X. Wang, Z. Sun, and Z. D. Wang, “Operator fidelity susceptibility: An indicator of quantum criticality,” Phys. Rev. A 79(1), 012105 (2009). [CrossRef]

67. R. Cabrera, O. M. Shir, R. Wu, and H. Rabitz, “Fidelity between unitary operators and the generation of robust gates against off-resonance perturbations,” J. Phys. A Math. Theor. 44(9), 095302 (2011). [CrossRef]

68. B. Piccirillo, V. D’Ambrosio, S. Slussarenko, L. Marrucci, and E. Santamato, “Photon spin-to-orbital angular momentum conversion via an electrically tunable q-plate,” Appl. Phys. Lett. 97(24), 241104 (2010). [CrossRef]

69. K. O’Shea and R. Nash, “An introduction to convolutional neural networks,” arXiv, arXiv:1511.08458 (2015). [CrossRef]

70. F. Marquardt, “Machine Learning and Quantum Devices,” SciPost Phys. Lect. Notes, p. 29 (2021).

71. F. Cardano, A. D’Errico, A. Dauphin, M. Maffei, B. Piccirillo, C. de Lisio, G. D. Filippis, V. Cataudella, E. Santamato, L. Marrucci, M. Lewenstein, and P. Massignan, “Detection of zak phases and topological invariants in a chiral quantum walk of twisted photons,” Nat. Commun. 8(1), 15516 (2017). [CrossRef]

72. F. Di Colandrea, A. Babazadeh, A. Dauphin, P. Massignan, L. Marrucci, and F. Cardano, “Ultra-long quantum walks via spin–orbit photonics,” Optica 10(3), 324 (2023). [CrossRef]

73. X. Zhan, L. Xiao, Z. Bian, K. Wang, X. Qiu, B. C. Sanders, W. Yi, and P. Xue, “Detecting topological invariants in nonunitary discrete-time quantum walks,” Phys. Rev. Lett. 119(13), 130501 (2017). [CrossRef]

74. H.-S. Zhong, H. Wang, Y.-H. Deng, et al., “Quantum computational advantage using photons,” Science 370(6523), 1460–1463 (2020). [CrossRef]

75. C. He, H. He, J. Chang, B. Chen, H. Ma, and M. J. Booth, “Polarisation optics for biomedical and clinical applications: a review,” Light: Sci. Appl. 10(1), 194 (2021). [CrossRef]

76. F. Di Colandrea, L. Amato, R. Schiattarella, A. Dauphin, and F. Cardano, “1234534253 / QPT,” Github (2022), https://github.com/1234534253/QPT.

77. D. E. Goldberg and K. Deb, “A comparative analysis of selection schemes used in genetic algorithms,” in Foundations of Genetic Algorithms, vol. 1 (Elsevier, 1991), pp. 69–93.

78. O. Kramer, “Genetic algorithms,” in Genetic Algorithm Essentials, (Springer, 2017), pp. 11–19.

79. F.-A. Fortin, F.-M. De Rainville, M.-A. Gardner, M. Parizeau, and C. Gagné, “DEAP: Evolutionary algorithms made easy,” J. Mach. Learn. Res. 13(1), 2171–2175 (2012). [CrossRef]

80. A. Labach, H. Salehinejad, and S. Valaee, “Survey of dropout methods for deep neural networks,” arXiv, arXiv:1904.13310 (2019). [CrossRef]

81. N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: A simple way to prevent neural networks from overfitting,” J. Mach. Learn. Res. 15(1), 1929–1958 (2014). [CrossRef]

82. D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv, arXiv:2204.00578 (2014). [CrossRef]

83. M. Abadi, A. Agarwal, P. Barham, et al., “TensorFlow: Large-scale machine learning on heterogeneous systems,” (2015), available from tensorflow.org.

Population size	$N = 40$
Number of generations	$N_{gen} = 60$
Selection parameters
Tournament size	$k = 3$
Crossover parameters
Crossover probability	$p_{c} = 0.8$
Blend crossover	$α_{1} = α_{2} = 0.5$
Mutation parameters
Mutation probability	$p_{m} = 0.1$
Gaussian mutation	$μ = 0$ , $σ = 0.2$

Number of hidden layers	$N = 8$
Neurons in input layer	$N_{in} = 6$
Neurons in layer $1$	$N_{1} = 128$
Neurons in layer $2$	$N_{2} = 128$
Neurons in layer $3$	$N_{3} = 64$
Neurons in layer $4$	$N_{4} = 64$
Neurons in layer $5$	$N_{5} = 64$
Neurons in layer $6$	$N_{6} = 64$
Neurons in layer $7$	$N_{7} = 32$
Neurons in layer $8$	$N_{8} = 16$
Neurons in output layer	$N_{out} = 3$
Activations
Layers 1, 2, 3, 4, 5, 6, 7, 8	“ReLu”
Output layer	“Sigmoid”
Data structure
Batch size	$S_{batch} = 256$
Batches per epoch (training)	$N_{T} = 2^{12}$
Batches per epoch (validation)	$N_{V} = 2^{10}$
Number of epochs	$N_{E} = 50$
Optimization parameters
Optimizer	“Adam”
Loss function	MSE
Learning rate (initial)	$η = 10^{- 3}$
Gaussian Dropout rate	$rate = 0.01$
ReduceLROnPlateau:
Patience	$5$ epochs
Reduction factor	$μ = 0.1$

Population size	$N = 40$
Number of generations	$N_{gen} = 60$
Selection parameters
Tournament size	$k = 3$
Crossover parameters
Crossover probability	$p_{c} = 0.8$
Blend crossover	$α_{1} = α_{2} = 0.5$
Mutation parameters
Mutation probability	$p_{m} = 0.1$
Gaussian mutation	$μ = 0$ , $σ = 0.2$

Number of hidden layers	$N = 8$
Neurons in input layer	$N_{in} = 6$
Neurons in layer $1$	$N_{1} = 128$
Neurons in layer $2$	$N_{2} = 128$
Neurons in layer $3$	$N_{3} = 64$
Neurons in layer $4$	$N_{4} = 64$
Neurons in layer $5$	$N_{5} = 64$
Neurons in layer $6$	$N_{6} = 64$
Neurons in layer $7$	$N_{7} = 32$
Neurons in layer $8$	$N_{8} = 16$
Neurons in output layer	$N_{out} = 3$
Activations
Layers 1, 2, 3, 4, 5, 6, 7, 8	“ReLu”
Output layer	“Sigmoid”
Data structure
Batch size	$S_{batch} = 256$
Batches per epoch (training)	$N_{T} = 2^{12}$
Batches per epoch (validation)	$N_{V} = 2^{10}$
Number of epochs	$N_{E} = 50$
Optimization parameters
Optimizer	“Adam”
Loss function	MSE
Learning rate (initial)	$η = 10^{- 3}$
Gaussian Dropout rate	$rate = 0.01$
ReduceLROnPlateau:
Patience	$5$ epochs
Reduction factor	$μ = 0.1$

Retrieving space-dependent polarization transformations via near-optimal quantum process tomography

Abstract

1. Introduction

2. Polarization rotations as SU(2) processes

3. Projective measurements

4. Optimized process tomography

4.1 Minimization approach

4.2 Genetic algorithm approach

4.3 Neural network approach

5. Numerical experiments

6. Space-dependent polarization transformations

6.1 Continuity for the GA

6.2 Continuity for the NN

7. Experimental results

8. Conclusion

A. Minimal set of measurements

B. Genetic tomography

C. Machine learning tomography

D. Hyper-parameters

E. Maximum-likelihood tomography

Funding

Acknowledgments

Disclosures

Data availability

References

Data availability

Cited By

Figures (7)

Tables (3)

Equations (15)

Optics Express