Probability theory for 3-layer remote sensing radiative transfer model: univariate case

Avishai Ben-David; Charles E. Davidson

doi:10.1364/OE.20.010004

1. Introduction

The radiative transfer model with simple 3-layer geometry has been the primary physical model for long wave infrared (LWIR) or “thermal infrared” passive remote sensing for many years [1–7]. This model has been used for detecting vapor and aerosol targets in the atmosphere in conjunction with detection algorithms such as matched filters and anomaly detectors. A probability density function (pdf) model for the radiative transfer geometry is of great interest and importance for predicting the stochastic nature of the measured signals, evaluating performance, and for developing detection algorithms.

The traditional three-layer remote sensing model is based on the deterministic radiative transfer equation [8,9] where radiative transfer parameters are non-fluctuating. In 1991, Stephens et al. [10] gave a review of efforts to include stochastic components into the radiative transfer equation, and proposed a method for determining pdfs for the optical properties of stratiform clouds. Stephens’s model was based on the two-stream solution [8,9] to the radiative transfer equation for a single layer, where the optical depth and single scattering albedo were treated as random variables with empirically measured pdfs (the asymmetry parameter was fixed). Using transformation of variables and numerical techniques, Stephens obtained output distributions for the albedo, diffuse transmission, and absorption of the cloud. Other work [11,12] has incorporated fluctuations by characterizing a few radiative transfer parameters by a mean and variance (second-order statistics) while holding others fixed; these papers do not address the underlying probability distribution of the predicted spectral radiance. The goal of our work is to predict the pdf of spectral radiance not only due to a fluctuating vapor cloud, but also due to fluctuations in the atmosphere in front and behind the cloud, where all parameters are stochastic and we employ higher order statistics.

In the literature there is a seemingly related area of research regarding radiative transfer through random or stochastic media [13–15], sometimes referred to as stochastic radiative transfer. However, as Byrne states on page 386 of [14], “The common term ‘stochastic radiative transfer’ is, strictly, a misnomer. The situation we discuss is the problem of finding the average (emphasis added) radiative properties of an ‘ensemble’ (a generally infinite set) of problems, by means other than actually solving each problem and averaging results”. Thus, so-called stochastic radiative transfer does not provide information regarding the distribution of spectral radiance. In our paper when we use the term stochastic radiative transfer or stochastic model, we do not intend to refer to ensemble averages, but to complete probability distribution information.

In this work we develop a stochastic 3-layer radiative transfer model that consists of a foreground layer between the sensor and the target, a non-scattering target layer (e.g., a vapor cloud), a background layer at the far end of the target, and an external radiation source (e.g., a mountain, building, etc.) at the end of the line of sight (LOS). The external radiation source can also be taken as atmospheric radiation, in which case it can be thought of as a fourth layer. Each layer is characterized by a transmittance (which attenuates the radiation from previous layers) and serves as a source of radiation (due to its intrinsic temperature). Our main objective is to derive a pdf for the at-sensor spectral radiance (i.e., the spectral radiance—hereafter just radiance—input to a hypothetical sensor) due to fluctuations in transmittance and the radiation produced by the layers. These fluctuations contribute to variance in sensor measurements that is typically referred to as “clutter” (variation from the scene that potentially interferes with the task of detecting a particular signal of interest). Technological advancements have led to very low sensor noise, and typical remote sensing applications are limited by clutter noise. Thus, since it is of primary importance to understand the stochastic nature of the physical environment, and less important to model sensor noise, we neglect modeling sensor characteristics in this paper. At-sensor radiance statistics, when combined with specific sensor models and detection algorithms, can produce accurate end-to-end performance predictions. Predictions derived from the at-sensor radiance only (as in this paper) may be interpreted as upper bounds on observed performance from an actual sensor.

We focus mainly on modeling each of the three layers as homogeneously composed layers at thermal equilibrium (the target layer is always modeled this way), in which case the layer radiance is due to self emission, and the stochastic nature of the radiance is due to fluctuations in temperature and density (which in turn causes the layer’s transmittance and emissivity to be stochastic). This type of model is likely applicable to near-horizon LOS (e.g., LOS₁ in Fig. 1 ), but not up- or down-looking LOS (e.g., LOS₂ in Fig. 1) since the vertical stratification of the atmosphere makes it less likely for a layer to be homogeneously composed (local thermal equilibrium is always assumed). Therefore, we also allow the layer radiance to be taken with an assumed pdf intended to capture all sources of fluctuations in a layer without assuming homogeneity. We will show that a lognormal distribution is a reasonable distribution to assume for the radiance from a layer (as well as for the transmission). In fact, normal and lognormal distributions play a central role in our model due to the central limit theorem and the role these distributions play in sampling from additive and multiplicative processes [16–21].

Fig. 1 Geometry and radiative transfer quantities for a 3-layer radiative transfer model for horizontal line of sight (LOS₁) and up-looking slant path (LOS₂). Transmissions and temperatures for the different layers are denoted by $t_{i}$ and $T_{i}$ , respectively. For LOS₁ the foreground and background layer radiances $(L_{f}, L_{b})$ are given as emissions $(1 - t_{i}) B (T_{i})$ from a blackbody $B (T_{i})$ . The external source radiance, $L_{s}$ , at the end of LOS can be a function of temperature $T_{s}$ and emissivity $ε_{s}$ , i.e., $L_{s} = ε_{s} B (T_{s})$ .

Download Full Size | PDF

To derive the pdf of the at-sensor radiance, we allow any or all of the radiative transfer model inputs to become random variables (RVs), assume a particular distribution and population parameters for the variates in each layer (transmission and temperature), and assume that the layers are independently distributed. Then, the pdf of the at-sensor radiance is determined by finding population parameters for a distribution that matches the first four moments (mean, variance, skewness, and kurtosis, denoted E, V, S, and K, respectively) computed analytically through the 3-layer radiative transfer model equation. A pdf that matches all moments is the correct (true) analytical distribution, however, in practice one does not have access to reliable higher moments, and furthermore, fitting routines that include higher moments are extremely complex. In our statistical model we make use of the first 4 moments, only; in detection theory the sought-after information (the presence of the target) resides in the tail of the distribution [22–24], hence the importance to describe a pdf with arbitrary skewness and kurtosis values that may be used as an indicator for detection [25].

There are several systems of distributions designed to exactly match any combination of values for the first four moments, including Pearson distributions, Burr distributions, and systems based on logistic and Laplace distributions, respectively [18,26]. In our work we chose the Johnson system [18,26–28] for simplicity (it includes only four distributions), convenience (only the Johnson S_U distribution is typically needed for our model), and the central role the lognormal distribution plays in the system. The ability of the Johnson S_U distribution to model the at-sensor radiance will be established by comparing to stochastic simulations. The S_U distribution is also useful to model intermediate terms in the model (such as the radiance incident on the back of the cloud due to the source and background layers), although sometimes less well.

Our statistical model is a “forward model”. We do not address the general question of how to choose and determine the statistical properties of the radiative transfer variates (transmissions and temperatures) for a given physical scenario, nor do we address the “inverse problem” of estimating statistical properties of the radiative transfer variates from radiance measurements. These are both challenging problems deserving further research.

Our statistical model is for a single wavelength (i.e., the radiance is a scalar) hence, we develop a univariate pdf model. We reserve the extension from univariate to multivariate model (where radiance is specified at multiple wavelengths—and thus is a vector quantity—as is required for multi- or hyperspectral applications) for a future publication. Our model can be extended from 3 layers to n layers (though computing analytical moments for n layers is very cumbersome).

The remainder of the paper is organized as follows. In section 2 we present the radiative transfer model. In section 3 we present the details of the pdf model. In section 4 we detail the simulations that are used to demonstrate and investigate our statistical model. In the simulations we use turbulence theory to set the population parameters for the layer temperatures. In section 5 we show results of simulations and we discuss the performance of our model. We conclude with a summary. We include four appendices to aid the reader with different technical aspects of our presentation. Definitions and properties of the lognormal and Johnson S_U distributions (which play a key role in the stochastic model) appear in Appendices A and B, respectively. Moments and properties of the expectation operator appear in Appendices C and D.

2. Radiative transfer model

We use a standard layer model for passive LWIR remote sensing, using the following notation (for simplicity we omit the wavelength dependence of all variables in the model). Let L_f, L_b, L_s denote radiances originating from the foreground, background, and source layers, respectively; let t_f, t_c, t_b denote transmissions of the foreground, cloud (target), and background layers, and let B(T_c) denote the Plank blackbody radiance at the cloud temperature, T_c. Cloud transmission may also be written in terms of optical depth, $τ_{c}$ , as $t_{c} = \exp (- τ_{c})$ , and due to local thermal equilibrium and Kirchoff’s law, the emissivity of the (non-scattering) cloud layer is $ε_{c} = 1 - t_{c}$ .

Using terminology from detection theory [29]—where H₀ denotes the null hypothesis (target cloud is absent) and H₁ is the alternative hypothesis (target is present)—the total at-sensor radiance given by the model in the presence of the target is denoted M|H₁ or M⁽¹⁾ and is given by

M^{(1)} = L_{f} + t_{f} (1 - t_{c}) B (T_{c}) + t_{f} t_{c} L_{b} + t_{f} t_{c} t_{b} L_{s} .

The total at-sensor radiance in the absence of the cloud, denoted M|H₀ or M⁽⁰⁾, can be obtained from Eq. (1) by setting t_c to 1, resulting in

M^{(0)} = L_{f} + t_{f} L_{b} + t_{f} t_{b} L_{s} .

Equations (1-2) are good for any LOS. However, for LOS₁ or for when foreground and background layers are well-approximated as homogeneous layers at thermal equilibrium, the foreground and background layer radiances may be written as

{\begin{cases} L_{f} = (1 - t_{f}) B (T_{f}) \\ L_{b} = (1 - t_{b}) B (T_{b}) \end{cases}} .

The terms $1 - t_{f} = ε_{f}$ and $1 - t_{b} = ε_{b}$ are emissivities for the foreground and background layers, respectively (in the LWIR most of the attenuation is due to molecular absorption, and therefore scattering may oftentimes be neglected). In many cases one assumes $T_{b} = T_{f}$ . Figure 1 diagrams the radiative transfer model and physical parameters for each layer for both horizontal and slant-path lines of sight (LOS₁ and LOS₂, respectively).

It is also convenient to define at-sensor radiance terms corresponding to each layer, i.e., layer-radiances after attenuation by subsequent layers, as follows: $M_{f} = L_{f}$ , $M_{c} = t_{f} (1 - t_{c}) B (T_{c})$ , $M_{b}^{(1)} = t_{f} t_{c} L_{b}$ , $M_{s}^{(1)} = t_{f} t_{c} t_{b} L_{s}$ , $M_{b}^{(0)} = t_{f} L_{b}$ , and $M_{s}^{(0)} = t_{f} t_{b} L_{s}$ . Parenthetical superscript notation is necessary for M_b and M_s to differentiate between H₀ and H₁ since the presence of the cloud affects them (whereas M_f is unaffected by presence of the cloud and M_c only exists under H₁, and thus does not need additional notation). Using these quantites, the total at sensor radiances in Eqs. (1-2) are just the sum of the appropriate individual at-sensor radiance terms: $M^{(1)} = M_{f} + M_{c}^{(1)} + M_{b}^{(1)} + M_{s}^{(1)}$ and $M^{(0)} = M_{f} + M_{b}^{(0)} + M_{s}^{(0)}$ , respectively.

The information about the vapor cloud’s presence, ΔM, can be seen by subtracting Eq. (2) from Eq. (1) to obtain

{\begin{matrix} Δ M = M^{(1)} - M^{(0)} \\ = - t_{f} (1 - t_{c}) Δ T \\ Δ T = L_{i n} - B (T_{c}) \\ L_{i n} = L_{b} + t_{b} L_{s} \end{matrix}}

where ΔT, known as thermal contrast [1,4], is the difference between the input radiance on the back of the cloud, L_in, and the blackbody radiance of the cloud. If the thermal contrast is negative,

Δ T < 0

, the cloud is observed in emission: the cloud is “warmer” than the input radiance and adds photons to the LOS (

Δ M > 0

). If the thermal contrast is positive,

Δ T > 0

, the cloud is observed in absorption: the cloud is “colder” than the input radiance and removes photons from the LOS (

Δ M < 0

). A non-zero value of ΔT is required for vapor detection, and thus is the “engine” behind thermal infrared remote sensing. It is sometimes convenient to present the thermal contrast in temperature units (Kelvin) as

Δ T = B^{- 1} (L_{i n}) - T_{c}

where

B^{- 1} (\cdot)

is the inverse of the blackbody function.

It is essential to note that Eqs. (1-4) describe the traditional deterministic model (henceforth called the physical model): they predict the value of the at-sensor radiance given constant (non-fluctuating) values of the physical parameters. The goal of this paper is to extend the physical model by allowing any (or all) of the model inputs to be RVs; the result is that model outputs (M⁽⁰⁾, M⁽¹⁾, and ΔM, as well as intermediate quantities such as ΔT, L_in, etc.) are themselves RVs and characterized by appropriate pdfs. The details of how this is done are presented in the next section, however there are immediate implications that can be introduced here. For example, in the physical model, $Δ T = 0$ means no information about the cloud reaches the sensor. However, in the stochastic model, even when $E (Δ T) = 0$ , the cloud may still be detectable based on fluctuations around the mean. Additionally, it is possible for the cloud to be observed in emission and in absorption at the “same time”, i.e., at a given moment there will be a non-zero probability for the cloud to be in absorption and simultaneously a non-zero probability for the cloud to be in emission. Furthermore, the polarity (sign) of $Δ M$ is affected by the foreground layer (that is stochastic in our model), thus, a cloud that is in a given radiation mode, say emission, for which the probability for ΔT to be less than zero is one—i.e., $\Pr (Δ T < 0) = 1$ —may appear in the measurements in the reverse mode with $\Pr (Δ M < 0) > 0$ . Only with a probability model can one study these phenomena, which we demonstrate and discuss further in Section 5.

3. Probability model

In this section we describe how initial distributions for the input RVs are chosen (in subsection 3.1), and detail how the pdfs of the output RVs are determined (subsection 3.2). Input RVs are assumed independent (this includes RVs from different layers, e.g., T_f and T_c, as well as RVs within a single layer, e.g., t_c and T_c). In general this choice is reasonable, however, in adiabatic conditions air temperature and density are not independent, causing correlation between temperature and transmission (proportional to density). This dependence could, in principle, be handled using the ideal gas law.

H₀ and H₁ radiances along a fixed LOS are (necessarily) measured at different times, and therefore we assume that all the population parameters in the radiative transfer model remain unchanged (stationary) during the measurement period and that the sampling is independent. For example, we assume that there is no drift in the population parameters or distribution of the foreground temperature, hence, the mean and variance (and all higher moments) for $T_{f}$ are fixed. In a recent paper [30] it was shown that for a short period ~3-5 s temperature drift is small and the stationary assumption for temperature is reasonable.

3.1 Distribution of input terms

3.1.1 Temperature – Planck blackbody function

The Planck blackbody function $B (T)$ is given by

B (T) = \frac{k_{1}}{e^{k_{2} / T} - 1}

where, for a given wavelength

λ

in microns and radiance units W/cm²/sr/μm,

k_{1} = 1.1910 \times 10^{4} λ^{- 5}

and

k_{2} = 1.4388 \times 10^{4} λ^{- 1}

; for λ in wavenumbers (cm⁻¹) and radiance units W/cm²/sr/cm⁻¹,

k_{1} = 1.1910 \times 10^{- 12} λ^{3}

and

k_{2} = 1.4388 λ

.

Assume that the temperature, T, is a normally distributed random variable (a commonly made assumption, supported by the central limit theorem) with mean $μ_{T}$ and variance $σ_{T}^{2}$ , i.e., $T ~ N (μ_{T}, σ_{T}^{2})$ . Deriving the exact pdf of B(T) via transformation of variables is straight-forward, however exact moments of the distribution (required in our model) are not computable analytically. Instead, an extremely good approximation is that B(T) is lognormally distributed, $L N (μ_{B}, σ_{B}^{2})$ (for which all moments are easily computed), which is shown as follows.

Typical terrestrial values of μ_T are on the order of 300 K with variances at least an order of magnitude smaller. Under these conditions, 1/T is approximately linear (e.g., the points $1 / (μ_{T} \pm 3 σ)$ are close to equidistant from the median $1 / μ_{T}$ and therefore the mean is close to $1 / μ_{T}$ ). Therefore, a first-order Taylor expansion of $1 / T$ around $T_{0} = μ_{T}$ , giving $1 / T \approx μ_{T}^{- 1} - μ_{T}^{- 2} (T - μ_{T})$ , leads to the excellent approximation $1 / T ~ N (μ_{T}^{- 1}, μ_{T}^{- 4} σ_{T}^{2})$ . It follows (Appendix A) that $\exp (k_{2} / T) ~ L N (k_{2} μ_{T}^{- 1}, k_{2}^{2} μ_{T}^{- 4} σ_{T}^{2})$ .

Were it not for the “-1” term in the denominator of (6), the inverse and multiplicative properties of the lognormal distribution (see Appendix A) could be used to show that $B (T) ~ L N (\ln k_{1} - k_{2} μ_{T}^{- 1}, k_{2}^{2} μ_{T}^{- 4} σ_{T}^{2})$ . Although it is tempting to simply ignore the “minus one” term—on the grounds that typically $\exp (k_{2} / T) > > 1$ in the LWIR—it creates a bias (shift) in the distribution that while small, is significant compared to the variance of the distribution. A better method is to approximate the denominator as a standard 2-parameter lognormal distribution. That is, let $x = \exp (k_{2} / T) - 1$ be the denominator of (6). The goal is to find population parameters μ_x and $σ_{x}^{2}$ for a lognormal distribution that matches the mean and variance of x without neglecting the minus one. The mean and variance of x are computed (see Appendix A) as $E (x) = \exp (k_{2} μ_{T}^{- 1} + 0.5 k_{2}^{2} μ_{T}^{- 4} σ_{T}^{2}) - 1$ and $V (x) = [\exp (k_{2}^{2} μ_{T}^{- 4} σ_{T}^{2}) - 1] E {(x)}^{2}$ , respectively, and population parameters are computed with Eq. (A2) from Appendix A. Finally, the inverse and multiplicative properties of the lognormal distribution can now be used to show that the blackbody radiance is lognormally distributed, $B (T) ~ L N (μ_{B}, σ_{B}^{2})$ , where $μ_{B} = \ln k_{1} - μ_{x}$ and $σ_{B}^{2} = σ_{x}^{2}$ . The fit is so good that B(T) may essentially be considered an “input term” to the model.

3.1.2 Transmission

We choose to represent the transmission of a layer in the model as a lognormally-distributed variate, $t ~ L N (μ_{t}, σ_{t}^{2})$ . Partially this is due to the convenience of working with lognormal distributions. More importantly, however, is a physical argument based on the statistics of multiplicative processes.

Consider the transmission of monochromatic radiation through a general layer composed of j sub-layers. The total transmission will be a product of the sub-layer transmissions, $t = \prod_{i = 1}^{j} t_{i}$ . Each sub-layer transmission is a non-negative random variable, $0 \leq t_{i} \leq 1$ . By the central limit theorem for multiplicative processes (see property (x) of Appendix A) t approaches a lognormal distribution, $t ~ L N (μ_{t}, σ_{t}^{2})$ . Note that values for μ_t and $σ_{t}^{2}$ for distributions generated in this manner will naturally enforce the constraint that the probability for $t > 1$ must be zero, i.e., $\Pr (t > 1) = 0$ . Note also that due to the multiplicative property of the lognormal distribution, if the transmissions of the sub-layers are themselves lognormally distributed, $t_{i} ~ L N (μ_{i}, σ_{i}^{2})$ , then $t ~ L N (μ_{t}, σ_{t}^{2})$ with $μ_{t} = \sum μ_{i}$ and $σ_{t}^{2} = \sum σ_{i}^{2}$ .

The above central-limit argument is likely very appropriate for foreground and background layer transmissions, since these tend to be due to the ambient atmosphere along a path that may be long (e.g., a background layer for LOS to space). The transmission of the cloud is qualitatively different than that of the foreground or background layers in the sense that the cloud is localized and (in our remote sensing applications) it is usually due to an external source (e.g., a man-made cloud). An additional justification for the cloud transmission t_c to be lognormally distributed is based on the assumption of Gaussian statistics for concentration that is commonly used in plume dispersion models [31, Ch. 18]. Since the optical depth is proportional to concentration, this implies $t_{c} = \exp (- τ_{c}) = L N (- μ_{τ c}, σ_{τ c}^{2})$ , where μ_τc and $σ_{τ c}^{2}$ are the mean and variance of the normally-distributed optical depth.

As already mentioned, a real (i.e., physical) transmission is always bounded $0 \leq t \leq 1$ . An arbitrary lognormal distribution will always be positive but may not have a non-zero probability for $t > 1$ . Because the user may choose population parameters for the layer transmissions where $\Pr (t > 1) > 0$ , we actually represent all transmissions in the model as truncated lognormal distributions that enforce an upper bound at 1 (if the probability that $t > 1$ is negligible—e.g., $0 \leq μ_{t} - η σ_{t}$ where η is a factor large enough, for example $η = 4$ , to ensure a negligible probability for the normally-distributed optical depth to be negative—then truncation will have no effect). Moments of the truncated lognormal distribution are given in Appendix A, property (ix).

3.1.3 Radiance

We assume that the ambient atmospheric radiance $L ~ L N (μ_{L}, σ_{L}^{2})$ is lognormally distributed (e.g., foreground and background layers for a slant-path LOS—LOS₂ in Fig. 1—where the layers are modeled generically without assuming homogeneity). This choice is partially motivated by the fact that radiance values must be non-negative (the lognormal distribution naturally enforces this constraint) and that the lognormal distribution is easy to work with. More importantly, however, is that there is a physical argument to be made that is based on radiative transfer theory.

The formal solution of the classical radiative transfer differential equation [32] [Ch. 11, Eq. (3)] for an atmosphere in local thermodynamic equilibrium is given [32] [Ch. 11, Eq. (4)] as an integral for the multiplication of a blackbody function along a slant path with a slant path transmission to an observer. Approximating the integral as a sum shows that the radiance is given by a summation of t × B(T) terms. We already showed that the blackbody and transmission functions are both lognormally distributed. With the addition property of lognormals (see property (vi) in Appendix A), the result is lognormally distributed radiance.

If the source radiance, L_s, is an atmospheric radiance, it is reasonable to assume that it is lognormally distributed, as the above argument would apply. For source radiance that is due to emission from an opaque object, we assume a lognormal distribution also. If the object is a blackbody, then this is justifiable based on section 3.1.1. For graybody radiation, where $L_{s} = ε_{s} B (T_{s})$ , choosing a lognormal distribution for the radiance implies either (i) ε_s is constant and property (iii), Apendix A, establishes $L_{s} ~ L N (\ln ε_{s} + μ_{B}, σ_{B}^{2})$ , or (ii) $ε_{s} ~ L N (μ_{ε s}, σ_{ε s}^{2})$ and property (v), Appendix A, establishes $L_{s} ~ L N (μ_{ε s} + μ_{B}, σ_{ε s}^{2} + σ_{B}^{2})$ . In the latter case, it must be that the population parameters of ε_s are such that Pr(ε_s>1) = 0 for the quantity to be strictly physical.

3.2. Distribution of output terms using Johnson S_U

The primary output terms of the pdf model are the total at-sensor radiances M⁽⁰⁾ and M⁽¹⁾, as well as the differential radiance ΔM; these are the terms that a sensor has access to directly (i.e., can measure). The thermal contrast, ΔT, is also fundamentally important due to the key role it plays in enabling LWIR detection. We model these terms with the Johnson S_U distribution. In this section we explain the process and the reasons why the S_U distribution is successful as a single distribution to describe these terms. We also discuss the use of the Johnson S_U distribution to describe intermediate terms in the model—terms of secondary importance such as at-sensor radiance terms for the individual layers, M_f, M_c, M_b, and M_s—which can be more challenging for the S_U distribution to represent.

The previous section showed that it is reasonable to model all of the input RVs in Eq. (1) as lognormal distributions. While physically justifiable, this is also very convenient since a product of lognormal distributions is (analytically) lognormally distributed, and a sum of lognormal distributions is (approximately) lognormally distributed. Ignoring the presence of emission terms such as $ε_{c} = 1 - t_{c}$ (e.g., assuming t_c was constant or that ε_c was lognormally distributed) would lead to a lognormal distribution for the total at-sensor radiance (where correlation between terms in the sum—caused by t_f, t_c, and t_b appearing in multiple places—can be handled via property (viii) in Appendix A). However, the problem is that the emissivity $ε = 1 - t$ when $t ~ L N (μ_{t}, σ_{t}^{2})$ does not follow the standard 2-parameter lognormal distribution, and is instead described by a lognormal distribution that has been reversed (reflected about zero) and shifted. While the moments of this distribution are easily computed in terms of the moments of t (e.g., $E (ε^{2}) = 1 - 2 E (t) + E (t^{2})$ ), it is difficult to determine the distribution of the product $ε \times B (t)$ , for example. Equivalently, expanding terms such as $M_{c} = t_{f} (1 - t_{c}) B (T_{c})$ to become $t_{f} - t_{f} t_{c} B (T_{c})$ shows that a difference of (correlated) lognormally-distributed variates enters into the model. Theoretically, the standard 2-parameter lognormal distribution is a poor candidate to approximate a difference of lognormals since there is no guarantee that values would be positive. This fact, combined with the fact that transmissions in our model may actually be truncated lognormal distributions (for which the nice multiplicative and additive properties no longer apply), motivated us to use the Johnson system to fit the moments of the model outputs. A secondary advantage to this is that one may relax the need to assume a specific distribution for the model inputs, and instead may simply specify the first four moments of the input variates.

The Johnson system contains three 4-parameter distributions, S_L, S_U, and S_B, which, along with the normal distribution, allows fitting any combination of values for the set of moments, {E, V, S, K}. All distributions in the system may be considered transformations of a normal variate. Typically distribution systems are characterized by the Pearsonian coefficients $β_{1} = S^{2}$ and $β_{2} = K$ , where each distribution in the system corresponds to a region, curve, or point in the β₁, β ₂ plane. The S_L distribution is a lognormal distribution with the added flexibility of scale and location parameters (and thus the distribution may be shifted and the skewness may become negative); despite being defined as a four-parameter distribution only three of the parameters are identifiable. Both the Johnson S_L and the standard two-parameter lognormal distributions are described by the same curve in the β₁, β ₂ plane (see Fig. 2 ). Any point below the lognormal curve is in the S_U region, which is an unbounded (and unimodal) distribution. Any point above the lognormal curve is in the S_B region, a bounded distribution that may be unimodal or bimodal. The normal distribution is a point in Fig. 2 and is a limiting form of the S_L (and S_U and S_B) distributions; the S_L is a limiting form of the S_U (and S_B). The fact that the lognormal distribution appears as a special case of each distribution in the Johnson system is an important property given that (as stated above) the model predicts lognormally distributed radiance under certain conditions.

Fig. 2 Standard $β_{1}, β_{1}$ chart, where the y-axis is reversed (by common convention). The lognormal curve (corresponding to both LN and S_L distributions) is given by Eq. (A1) and is shown in red. The normal distribution is a point in blue at coordinates (0,3). The area “below” the lognormal curve contains all S_U distributions. The area above the lognormal curve corresponds to the S_B region. All probability distributions must lie below the shaded gray area in the upper right corner of the plot.

Download Full Size | PDF

The S_U distribution is particularly important because: (i) we observed that most of our simulations produced (β₁, β ₂) values that—if not on the lognormal curve—were below it, and thus in the S_U region; (ii) analytical moments are relatively easy to derive and there are good procedures for method-of-moments estimation (details are in Appendix B); (iii) the S_L and the LN distributions are limiting cases of the S_U, and thus the S_U distribution can represent these distributions with an error that is as small as desired—this means that the model outputs may always be considered to be S_U thereby avoiding the complexity of having to switch to or from an S_L representation.

There are times when the moments of model outputs are above the lognormal curve in the S_B region (generally caused when the truncated lognormal distribution is necessary for transmission variates). However, dealing with the S_B distribution is difficult due to the complexity of the moments; estimation must be performed via quantiles or even frequencymoments [28]. Therefore, even when the distribution to be fit is in the S_B region, we use an S_U approximation (we project the point in the S_B region to the closest point on the lognormal curve and then use the S_U—see details in Appendix B, property (ii)—generally the Euclidean distance to the lognormal curve is small, and the S_U fit is reasonably good). Therefore, the S_U distribution is the only member of the Johnson system that we need to consider.

The general procedure for determining the pdf of model outputs is to analytically compute the first four moments (E, V, S, and K) for each model output that we desire to fit with a pdf, say, M⁽¹⁾ under LOS₂. E, V, S, and K are a function of the raw moments of M⁽¹⁾, which are in turn a function of the population parameters of the model inputs. The raw moments are found by, e.g., expanding the function ${(L_{f} + t_{f} (1 - t_{c}) B (T_{c}) + t_{f} t_{c} L_{b} + t_{f} t_{c} t_{b} L_{s})}^{n}$ (where n is 1 through 4; obtaining 5 terms for $n = 1$ , 15 terms for $n = 2$ , 35 terms for $n = 3$ , and 70 terms for $n = 4$ ) and taking the expectation of each expression, using properties of the expectation operator given in Appendix D. Each expression contains terms that are a product of up to six RVs (in this example) appearing to powers up to 4. Because of independence, the expectation operator distributes, e.g., $E (t_{f}^{3} t_{c}^{2} L_{b}) = E (t_{f}^{3}) E (t_{c}^{2}) E (L_{b})$ . E, V, S, and K, are then computed as a function of the raw moments using Appendix C. This procedure, though tedious, is nevertheless straight-forward and may be applied to all of the model outputs. Given the analytical moments, then the estimation procedure in Appendix B is used to determine the four population parameters (a, b, c, d) of the S_U distribution, e.g., $M^{(1)} ~ S_{U} (a, b, c, d)$ . If the moments are located in the S_U region (or on the lognormal curve), then the corresponding moments of the S_U fit (see Eq. (B4) in Appendix B) will match exactly. If the distribution to be fit is in the S_B region, then the skewness and kurtosis will be slightly off, and only the first two moments will be fitted exactly.

In the remainder of this section, we highlight several individual model outputs, how well they are modeled with the S_U distribution, and conditions for them to be lognormally distributed.

3.2.1 Homogeneous layer radiance and at-sensor radiances

As previously mentioned, the possibility of truncated lognormal distributions for the transmission terms and the presence of emission terms $ε = 1 - t$ are the primary reasons for model outputs to deviate from the standard 2-parameter lognormal distribution. Therefore, terms such as M_c—or L_f and L_b using the homogeneity assumption (Eq. (3))—are potentially the most problematic since both transmission and emission terms appear. Our observations indicate that terms such as M_c are the most likely to be poorly fit by the S_U distribution, i.e., it is possible for them to be in the S_B region. However, even when the component at-sensor radiances are poorly fit, the total at-sensor radiance tends to be remarkably well fit by the S_U, and usually M⁽¹⁾ is even better fit than M⁽⁰⁾. A reason for this may stem from the fact that the S_B region (see Fig. 2) is characterized by smaller kurtosis values for a given skewness than the S_U (or alternatively larger skewness for the same kurtosis). M⁽⁰⁾ and M⁽¹⁾ are a convolution (sum) of the component at-sensor radiances where convolution is a smoothing and broadening operation. We theorize that the convolution of terms helps increase the value of the kurtosis relative to the skewness, making it more likely for total at-sensor radiance terms to be in the S_U region. M⁽¹⁾ is a convolution of more terms, and thus would be more likely to be in the S_U region than M⁽⁰⁾. Regardless of the reason, M⁽⁰⁾ and M⁽¹⁾—the primary outputs of the model—are particularly well fit by the S_U distribution.

The ability to approximate at-sensor radiances as lognormals is attractive as it eliminates the necessity of using a fitting routine for the S_U due to the elegance of lognormal distributions that are easily manipulated. Component at-sensor radiances such as M_s (or M_b under LOS₂ when the homogeneity assumption is not used) will be lognormally distributed as long as the transmissions are not significantly truncated. On the other hand, M_c will only be lognormal if the transmissions are not truncated and if $1 - t_{c}$ may be well approximated by a lognormal distribution (which occurs when t_c is nearly symmetric), in which case $M_{c} = t_{f} (1 - t_{c}) B (T_{c})$ is a product of three lognormal variates. If all of the component at-sensor radiances are lognormally distributed, then the total at-sensor radiance will also be (approximately) lognormally distributed (based on the additive property of correlated lognormals in Appendix A—where the correlation is caused by the transmissions appearing in multiple terms).

We note that a further simplification for the total at-sensor radiance (or any other quantity) that is lognormally distributed is the normal distribution. The normal is the limiting form of the lognormal as $σ^{2} \to 0$ . Thus, if the skewness and kurtosis of the lognormal distribution are not too far from 0 and 3, respectively, then a normal distribution may be used instead. For example, at wavenumber of 1000 cm⁻¹, $μ_{T} = 300 K$ and $σ_{T} = 5 K$ , the blackbody lognormal function $B (T) ~ L N (μ_{B}, σ_{B}^{2})$ (see Section 3.1.1) is given with $μ_{B} = - 11.5$ and $σ_{B}^{2} = 0.0064$ , for which the skewness and kurtosis (Appendix A) are 0.24 and 3.1, respectively. Hence, $B (T)$ can be well-approximated by a normal distribution $B (T) ~ N (μ_{B}, σ_{B}^{2})$ .

3.2.2 Differential radiance

The differential radiance, $Δ M = M^{(1)} - M^{(0)}$ , is particularly well-suited to a Johnson S_U distribution. We can theoretically prove that a difference of independent S_U variates, $z = x - y$ , always remains in the S_U region. Although this does not establish that a difference of S_U variates is analytically an S_U distribution, it does say that an S_U fit will always be able to exactly match the first 4 moments. Additionally, even when one or both of M⁽⁰⁾ and M⁽¹⁾ are in the S_B region (and therefore outside of the S_U), it is possible for ΔM to be in the S_U region. Thus, the Johnson S_U is very appropriate for fitting ΔM.

In the physical model, the differential radiance given by Eq. (4) is the information in the measured radiance that is due to the cloud, i.e., it captures only those photons that interact with the cloud. In the stochastic model the interpretation of ΔM is not as straight-forward: $Δ M = M^{(1)} - M^{(0)}$ is a difference of RVs that are assumed to be (independently) sampled at different times, and fluctuations in the input RVs will cause a differential radiance between terms that do not involve the cloud. Thus, in the stochastic model ΔM captures both “informative” cloud-photons, as well as “uninformative” photons that are due to clutter fluctuations. For example, consider the foreground radiance, L_f, which is distributed identically and independently under H₀ and H₁ conditions. The mean $E (L_{f} | H_{1} - L_{f} | H_{0}) = 0$ , but the variance $var (L_{f} | H_{1} - L_{f} | H_{0})$ is the sum of the variances of L_f |H₀ and L_f |H₁; therefore, the differential radiance will include some photons that originated in the foreground layer and never interacted with the cloud. Note that correlations between H₀ and H₁ quantities may be included in this analysis, but here we assume independent temporal sampling.

When M⁽⁰⁾ and M⁽¹⁾ analytically follow the lognormal distribution, the differential radiance ΔM is a difference of independent lognormals. We can show analytically that a difference between two independent lognormal distributions is always in the S_U region (below the lognormal line in the β₁,β₂ plane), and in fact, the fit is so fantastic it is tempting to think that the S_U distribution might even be the analytical pdf for the difference of lognormals.

In the stochastic model, we use binary hypothesis testing [19] with respect to the radiance distributions M⁽⁰⁾ and M⁽¹⁾ to determine the detection potential of the remote sensing scenario that is due to scene noise (e.g., the likelihood that at-sensor radiance may be attributed to the presence of a cloud along with the corresponding probability of false alarm). We are aware of the fact that detecting a presence of a cloud with only one wavelength is problematic as it is susceptible to drift, atmospheric disturbances, and interferences (using multiple wavelengths, these effects may be mitigated due to the wavelength dependence of the different events).

3.2.3 Thermal contrast

Determining the distribution of the thermal contrast in radiance units—defined in (4)—is qualitatively similar to determining the distribution of differential radiance, and raises no additional challenges. However, determining the distribution of ΔT in units of brightness temperature (Eq. (5)) is difficult because of the appearance of the inverse Planck blackbody function, B⁻¹(·), that appears in the expression $Δ T = B^{- 1} (L_{i n}) - T_{c}$ . If L_in is a lognormally distributed variate, then the steps in showing that B(T) is approximately lognormal when T is normal can be reversed to show that $T_{i n} = B^{- 1} (L_{i n})$ is approximately normal, where

B^{- 1} (L) = \frac{k_{2}}{\ln (k_{1} / L + 1)} .

However, in general L_in will be described (or approximated) by an S_U distribution. Transformation of variables applied to Eq. (7) is too complicated when the input distribution is the Johnson S_U (scaling and translation relationships for the S_U distribution exist—see Appendix B—but more complicated transformations generally do not lead to densities that can be integrated); computing moments of T_in analytically from the moments of L_in is also intractable due to the nonlinearity of the transform. However, moments can be computed easily when the inverse blackbody function is linearized (approximated) with a first order Taylor expansion. Therefore, we will describe two complementary methods to determine the pdf of ΔT: (i) approximating L_in as a lognormal distribution and using the exact functional form of B⁻¹; or (ii) propagating exact moments of L_in using an approximated form of B⁻¹.

In the first method, $k_{1} / L ~ L N (\ln k_{1} - μ_{L}, σ_{L}^{2})$ , and $y = k_{1} / L + 1$ may be approximated very well as a 2-parameter lognormal distribution with population parameters {μ_y,σ_y} found from fitting $E (y) = k_{1} \exp (- μ_{L} + σ_{L}^{2} / 2) + 1$ and $V (y) = [\exp (σ_{L}^{2}) - 1] E {(y)}^{2}$ using Eq. (A2). Then, $\ln y ~ N (μ_{y}, σ_{y}^{2})$ by definition. The previously-used first order Taylor approximation for the reciprocal of a normal variate (Section 3.1.1) establishes $B^{- 1} (L)$ to be approximately $N (μ_{y}^{- 1}, μ_{y}^{- 4} σ_{y}^{2})$ . T_c is also normally distributed, and thus the distribution of ΔT is approximated by $N (μ_{y}^{- 1} - μ_{T c}, μ_{y}^{- 4} σ_{y}^{2} + σ_{T c}^{2})$ . These steps are simply the reverse of the steps in Section 3.1.1, and are appropriate whenever the S_U distribution for L_in is not “too far” from the lognormal line. However, as the kurtosis and skewness of the S_U gets further from the lognormal, the output distribution for T_in becomes skewed and the normal approximation (which is necessarily symmetric) will begin to provide a poor fit.

In the second method, we approximate Eq. (7) with a first order Taylor expansion around the mean of L. Let L₀ denote E(L), then

B^{- 1} (L) = B^{- 1} (L_{0}) + (L - L_{0}) {\frac{\partial B^{- 1}}{\partial L} |}_{L = L_{0}} + O ({(L - L_{0})}^{2}) .

The result is that $B^{- 1} (L)$ now has a linear relationship with random variate L, $B^{- 1} (L) \approx α_{1} L + α_{0}$ , with constant gain α₁ and offset α₀ given by

\begin{array}{l} α_{1} = {\frac{\partial B^{- 1}}{\partial L} |}_{L = L_{0}}, \\ α_{0} = B^{- 1} (L_{0}) - ({\frac{\partial B^{- 1}}{\partial L} |}_{L = L_{0}}) L_{0}, \end{array}

respectively, where

{\frac{\partial B^{- 1}}{\partial L} |}_{L = L_{0}} = \frac{B^{- 1} (L_{0})}{L_{0} (k_{1} + L_{0}) \ln (\frac{k 1}{L_{0}} + 1)} .

The moments of $B^{- 1} (L)$ and $Δ T = B^{- 1} (L_{i n}) - T_{c}$ may now be easily computed using Appendices C and D, and fit with an S_U via property (ii) of Appendix B.

Note that if L_in analytically follows an S_U distribution, $L_{i n} ~ S_{U} (a, b, c, d)$ , then $B^{- 1} (L_{i n}) ~ S_{U} (α_{1} a + α_{0}, b α_{1}, c, d)$ via property (iii) in Appendix B, where the absolute value and signum functions are not needed since $α_{1} > 0$ . Subsequently, moments of $B^{- 1} (L)$ can be found from Eqs. (B3) and (B4) instead of using Appendices C and D. Additionally, if T_c is a constant, then neither Appendices C and D nor the S_U fitting are needed since Appendix B property (iii) establishes $Δ T ~ S_{U} (α_{1} a + α_{0} - T_{c}, b α_{1}, c, d)$ .

The first order Taylor expansion is appropriate when the variance of L_in is small enough such that the inverse blackbody function is approximately linear over the domain of L_in. Which method is better will be determined by the relative error in approximating L_in as a lognormal distribution compared to the error in assuming the blackbody function is linear. In all simulations shown in Section 5, the blackbody function is close to linear, and so the second method works well. Thus, in the remainder of the paper we only focus on the second method.

4. Simulations

In this section we describe the details of simulations for the detection of triethyl phosphate (TEP) vapor, including how population parameters were chosen. All simulations are run at 1049.3 cm⁻¹ (which corresponds to peak absorption for TEP) and radiance values are reported in units of $W / c m^{2} / s r / c m^{- 1}$ . With given population parameters $(μ, σ^{2})$ for the input variates to the model, we sample values from the appropriate random number generator (10⁶ samples of each variate, creating 10⁶ sets of values). For example, layer transmissions are sampled from (possibly truncated) lognormal distributions, temperatures are sampled from normal random number generators, and layer radiances (e.g., L_s) are sampled from the lognormal distribution. Samples for the intermediate model terms and the primary outputs are computed via applying the model equations to each set of random values, creating 10⁶ values for each output term. We ensure that all RVs for H₀ are sampled independently (i.e., sampled separately) from H₁, thus we have an additive signal model where H₀ and H₁ are uncorrelated (in field measurements H₀ and H₁ are measured at different times, and therefore are assumed independent). From the sampled RV data we compute histograms (sampled pdfs) that we compare with the analytical pdfs predicted by our model. Predicted pdfs are solely a function of the population parameters for the input variates.

We present two types of simulations. The first is a physical scenario of a sensor situated 1 m above the ground in a horizontal LOS geometry that is viewing a vapor cloud located at a horizontal distance of 1 km. At the end of the LOS (10km away from the sensor), there is a topographical target (a mountain). The second type of simulation is for a hypothetical scenario with a challenging set of parameters for the RVs (e.g., a combination of transmissions, temperatures and radiance values) in order to test our model’s accuracy and to explore its limitations. The parameters chosen for the hypothetical scenario can be associated with a physical scenario, but this is not our intention. For both types of simulations we chose horizontal LOS geometry, a geometry that is more challenging than a slant path due to the added complexity of the foreground and background emission terms (discussed in Section 3.2). Simulation results for slant path geometry (not shown) were very good. Table 1 summarizes population parameters as well as means and variances for all simulations.

Table 1. Parameters for Simulations

View Table | View all tables in this article

4.1 Physical simulation

The 3-layer model geometry in the physical simulation is similar to a previously used scenario [4]: a LOS near the horizon (1.7 degree up-look angle), foreground pathlength of 1 km at temperature of 288 K; a cloud at temperature of 288 K (i.e., at equilibrium with the ambient atmosphere); background layer of 9 km path at the cloud temperature; a mountain (that serves as an external blackbody source) with emissivity of 0.85 (typical of many clay soils in the IR) at temperature of 278 K (i.e., 10 K colder than the ambient air temperature, a scenario that can occur in the morning hours). Note that a 1.7 degree up-look angle is shallow enough that horizontal distances and slant-path distances are virtually identical. We use previous stochastic simulations [6] as guidelines for setting population parameters for transmission and temperature functions in this work.

The standard deviation of the transmission, $\sqrt{V (t)}$ , for IR wavelengths and horizontal LOS (elevation angle < 5°) is between $2 \cdot 10^{- 5}$ to $8 \cdot 10^{- 5}$ for which the mean transmission values $E (t)$ are between 0.7 to 0.97 (see Figs. 4 and 5 in Reference [6]). Note that the transmission of the atmosphere is highly wavelength dependent. At wavelength 993 $c m^{- 1}$ , a relatively clear atmospheric transmission, the ground level extinction coefficient is 0.07 $k m^{- 1}$ ; for 1053 $c m^{- 1}$ , due to ozone absorption, the extinction coefficient is 0.12 $k m^{- 1}$ ; for 853 $c m^{- 1}$ , due to water vapor absorption, the ground level extinction coefficient is 0.3 $k m^{- 1}$ .

In our “physical” simulation we choose the foreground transmission for horizontal path of 1 km to be lognormally distributed with $E (t_{f}) = 0.9$ and $\sqrt{V (t_{f})} = 5 \cdot 10^{- 5}$ for which the population parameters $(μ_{t}, σ_{t}^{2})$ can be computed from $V (t), E (t)$ with Eq. (A2) to be $μ_{t f} = - 0.105$ and $σ_{t f}^{2} = 3.09 \cdot 10^{- 9}$ . Thus, the foreground transmission is $t_{f} ~ L N (μ_{t f}, σ_{t f}^{2})$ and the foreground optical depth is $τ_{f} ~ N (- μ_{t f}, σ_{t f}^{2})$ . The background layer is 9 times longer in pathlength than the foreground layer; population parameters are derived from the relation $τ_{b} = 9 τ_{f}$ . Thus, $τ_{b} ~ N (- μ_{t b}, σ_{t b}^{2})$ and $t_{b} = \exp (- τ_{b}) = L N (μ_{t b}, σ_{t b}^{2})$ with $μ_{t b} = 9 μ_{t f}$ and $σ_{t b}^{2} = 81 σ_{t f}^{2}$ . Using property (i) of Appendix A, the mean and standard deviation of the background transmission is $E (t_{b}) = 0.387$ and $\sqrt{V (t_{b})} = 1.94 \cdot 10^{- 4}$ . Note that this produces a variablilty for the background transmission that is a little larger (factor of 2.5) than observed in [6]. Truncation is not required for any of the transmission variates in the physical simulation.

The variance of the atmospheric temperature is computed with a turbulence theory [6]. The variance of temperature due to turbulence between two points separated by distance Δx is $V (T) = C_{T}^{2} Δ x^{2 / 3}$ , where $C_{T}^{2}$ is the temperature structure constant. $C_{T}^{2}$ is proportional to the refractive index structure constant, $C_{n}^{2}$ (Eq. (2) in [6] where a typo needs correction: the left-hand side should have been $C_{n}^{2}$ and the right-hand side should have been $C_{T}^{2}$ ). Both $C_{T}^{2}$ and $C_{n}^{2}$ are a function of elevation. The Tatarski model (see [33] Eq. (15).4.35) gives the height dependence of the refractive index structure constant as $C_{n}^{2} (h) = C_{n}^{2} (1) h^{- 4 / 3}$ where $h > 1 m$ (supported by experiments for h up to at least 100m). We use a value of $C_{n}^{2} (1) = 10^{- 13} m^{- 2 / 3}$ which is characteristic of moderately strong turbulence; the corresponding value of the temperature structure constant (for the 1976 US standard atmosphere [34] with surface temperature of 288K and 46% relative humidity) is $C_{T}^{2} = 0.088 K^{2} m^{- 2 / 3}$ . Using the Tatarski height dependence and assuming (for convenience) that wind speed $u = 1 m s^{- 1}$ and sampling rate $d t = 1 s$ (and hence $Δ x^{2 / 3} = {(u d t)}^{2 / 3} = 1$ ), the variance of temperature over a slant-path extending from height $h_{1}$ above the ground to height $h_{2}$ is estimated by $V (T) = \int_{h 1}^{h 2} 0.088 h^{- 4 / 3} d h = 0.264 (h_{1}^{- 1 / 3} - h_{2}^{- 1 / 3})$ . For LOS near the horizon at 1.7° elevation angle, the variance of the foreground layer (1km path from $h_{1} = 1 m$ to $h_{2} = 30.7 m$ ) is 0.180 K² and for the background layer (9km path from $h_{1} = 30.7 m$ to $h_{2} = 298 m$ ) is 0.0448 K². The variance of the cloud temperature is simply set to 0.088 K². Temperatures in the model are normally distributed, and as explained in 3.1.1, corresponding distributions for the blackbody radiance are very close to lognormal distributions; Table 1 gives population parameters and moments for these quantities.

The radiance from the external source, L_s, is modeled as a lognormal distribution with $E (L_{s}) = 5.15 \cdot 10^{- 6}$ and $V (L_{s}) = 1.79 \cdot 10^{- 15}$ . This choice is arbitrary but is consistent with graybody radiation $L_{s} = ε_{s} B (T_{s})$ with $T_{s} ~ N (278, 0.088)$ and $ε_{s}$ distributed according to a lognormal distribution with $E (ε_{s}) = 0.85$ and $σ_{ε s}^{2} = σ_{B T s}^{2}$ ; population parameters and moments are given in Table 1.

For the vapor cloud population parameters we used field data from an experiment where a TEP vapor “cloud” was confined to a closed chamber [3] of size $3 m \times 3 m \times 6 m$ . In the experiment, concentration-pathlength values inferred from LWIR remote sensing measurements when 1 cm³ of TEP was vaporized in the chamber were fit with a normal distribution. We extract the mean and standard deviation of the TEP concentration-pathlength ( $ρ = 56.60$ mg/m² and $σ = 2.762$ mg/m², respectively; see [3], Table 1, $i = 4$ ) and convert the values to optical depth ( $τ = α ρ$ , where $α = 8.389 \cdot 10^{- 4} m^{2} m g^{- 1}$ is the absorption coefficient at 1049.3 cm⁻¹). Thus, the mean optical depth is $μ_{τ_{c}} = α_{} \times 56.60 = 0.0475$ with variance $σ_{τ c}^{2} = α^{2} \times {2.762}^{2} = 5.369 \cdot 10^{- 6}$ , which means that the TEP cloud’s transmission in the physical simulation is distributed as $t_{c} ~ L N (μ_{t c}, σ_{t c}^{2})$ where $μ_{t_{c}} = - μ_{τ_{c}}$ and $σ_{t c}^{2} = σ_{τ c}^{2}$ . Note that in our model the dimension of the cloud is immaterial, only the optical depth is important.

4.2 Additional simulations

For the second type of simulations (intended to challenge our model) we increase the variance of the temperature of the three layers ( $T_{f}, T_{c}, T_{b}$ ) by ten-fold, $σ_{T f}^{2} \to 10 σ_{T f}^{2} = 1.80 K^{2}$ , $σ_{T c}^{2} \to 10 σ_{T c}^{2} = 0.88 K^{2}$ , and $σ_{T b}^{2} \to 10 σ_{T b}^{2} = 0.448 K^{2}$ . The source radiance is altered by increasing the variance of T_s by a factor of 100, from $0.088 K^{2} \to 8.8 K^{2}$ . The variances of all layer transmissions are set to the large value of 0.01 (standard deviation of 0.1 transmission unit). For “Simulation 1”, the mean transmission value for all layers is set to 0.95 (i.e., optically thin layers), which results in a lognormal distribution with ~29% of values greater than 1 (and thus the effects of truncation are significant). For “Simulation 2”, the mean transmission values for all three layers are set to 0.1, such that each layer is closer to being optically thick. Although less than 0.1% of transmission values are greater than 1 for Simulation 2 (and thus truncation is not significant), the transmission pdfs are highly skewed (they are sharply peaked with a mode close to zero and a long tail) which challenges the model. Population parameters and corresponding means and variances for the various distributions for Simulations 1 and 2 are summarized in Table 1. Note that the mean and variance for the transmission variates are reported prior to truncation. The actual mean and variance of the truncated lognormal distributions in Simulations 1 and 2 are $E (t) = 0.900$ , $V (t) = 4.20 \cdot 10^{- 3}$ , and $E (t) = 0.0991$ , $V (t) = 8.89 \cdot 10^{- 3}$ , respectively.

5. Results and discussion

We compare the sampled data histograms of the radiative transfer quantities (Eqs. (1-5)) to those predicted with the Johnson S_U distribution. To assess how close the two are, we compute a symmetric Kullback-Leibler (KL) distance [35], $K L = (K L (α, β) + K L (β, α)) / 2$ . The KL distances are noted in the figure captions. The smaller the distance the better the agreement. The computed value of the KL will depend on how the histograms are computed (e.g., the bin spacing used). To get an idea of the significance of the KL value, we computed the 99th percentile for KL distances between a histogram computed from 10⁶ samples drawn from N(0,1) and the theoretical density; the result of 0.0013 means that KL values must be greater than this to establish that the two densities are not identical. In Figs. 3 -4 we show the comparison for the physical simulations and in Figs. 5 -8 for the two hypothetical scenarios. In Table 2 we summarize the Johnson S_U population parameters, ${a, b, c, d}$ , for the primary model outputs M⁽⁰⁾, M⁽¹⁾, and $Δ M$ , as well as for $Δ T$ . The pdf of $Δ T$ is computed using the second method outlined in section 3.2.3, using Eq. (8).

Fig. 3 Comparison of data histogram (solid red) and model-predicted pdfs (thick dotted green) for several intermediate model outputs for the physical simulation. The L and M radiances (in units of W/cm²/sr/cm⁻¹) are modeled with the Johnson S_U distribution, and $B (T_{c})$ is modeled with a lognormal. Top row: emission radiances for the foreground and background layers, $L_{f} = M_{f}$ , $L_{b}$ , and cloud layer emission radiance $L_{c} = (1 - t_{c}) B (T_{c})$ . Middle row: At-sensor radiances for the external source, $M_{s}^{(0)}$ , and the background layer, $M_{b}^{(0)}$ , under the H₀ scenario and the at-sensor radiance from the cloud layer, $M_{c}$ . Bottom row: At-sensor radiances for the external source, $M_{s}^{(1)}$ , and background layer, $M_{b}^{(1)}$ , for the H₁ scenario and the Planck blackbody radiance for the cloud, $B (T_{c})$ .The KL distance between theory and sampled data for all plots is very small (<0.002), hence excellent agreement between the data and the model.

Download Full Size | PDF

Fig. 4 Comparison of data histograms (solid lines) and model pdfs (thick dotted lines) for the primary model outputs for the physical simulation. Top left: at-sensor radiance under H₀ and H₁ hypotheses, $M^{(0)}$ and $M^{(1)}$ , respectively. Top right: Differential radiance, $Δ M = M^{(1)} - M^{(0)}$ . Bottom left: Thermal contrast (in Kelvin), $Δ T = B^{- 1} (L_{i n}) - T_{c}$ , computed using Eq. (8). Bottom right: Receiver operating characteristics where the line of equal probability of false alarm and detection is shown in gray. The agreement between model predications and the data is excellent (KL distances for $M^{(0)}$ , $M^{(1)}$ , $Δ M$ , and $Δ T$ of 0.0014, 0.0015, 0.0014, 0.0016, respectively).

Download Full Size | PDF

Fig. 5 Same as Fig. 3 but for Simulation 1. The agreement between the model and the data for L and M radiances is poor (large KL values of ~1.3) with the exception of $M_{s}^{(0)}$ and $M_{s}^{(1)}$ for which the fit is good and the KL distance is small (0.005 and 0.008 respectively).

Download Full Size | PDF

Fig. 8 Same as Fig. 4 but for Simulation 2. The KL distance between theory and sampled data for $M^{(0)}$ , $M^{(1)}$ , and $Δ M$ is very small (0.002, 0.0014, and 0.0013, respectively) indicating good agreement between the model and data. For $Δ T$ the fit is reasonably good with a KL of 0.007. Detection performance for this scenario is reduced in comparison to Simulation 1: although the optical depth of the target cloud is much greater in Simulation 2, the thermal contrast is reduced and there is greater attenuation from the foreground layer.

Download Full Size | PDF

Table 2. Parameters for Johnson Distributions, $x ~ S_{U} (a, b, c, d)$

View Table | View all tables in this article

In the first panel of Fig. 3 we show (top row) the layer emission radiances $L_{f} = M_{f}$ , $L_{b}$ , and the cloud layer emission radiance $L_{c} = (1 - t_{c}) B (T_{c})$ . In the middle row we show the $H_{0}$ at-sensor radiances for the external source and background layers $(M_{s}^{(0)}, M_{b}^{(0)})$ , and the cloud at-sensor radiance $M_{c}$ ; in the bottom row we show $H_{1}$ at-sensor radiance $(M_{s}^{(1)}, M_{b}^{(1)})$ and the cloud blackbody radiance, $B (T_{c})$ . The layer radiances and at-sensor radiances are modeled with Johnson S_U statistics and the blackbody $B (T_{c})$ is modeled with lognormal statistics. Figure 3 shows that the approximation of a blackbody with a lognormal pdf is excellent. Other blackbodies (foreground and background) in Table 1 (all simulations) also produced practically an exact match between sampled data and a lognormal.

The agreement between our model and the data for the emission terms and the at-sensor radiances is excellent (KL distances for all RVs in the physical simulation are less than 0.002).

In Fig. 4 (top left panel) we show the pdf for the total at-sensor radiance under H₀ and H₁. In the top right panel we show the pdf for the differential radiance $Δ M$ (Eq. (4)). The pdfs are in execellent agreement with the data (KL distances for M⁽⁰⁾, M⁽¹⁾, and $Δ M$ are less than 0.002). In the bottom left panel we show the pdf for the thermal contrast (in Kelvin) $Δ T = B^{- 1} (L_{i n}) - T_{c}$ where $L_{i n} = L_{b} + t_{b} L_{s}$ . The agreement with the data is excellent, KL distance of 0.0016. The negative $Δ T$ indicates that the cloud is always in emission mode, $\int_{- \infty}^{0} p_{Δ T} (d T) d T = 1$ , and the probability for $Δ T > 0$ is zero. Nevertheless, the pdf for $Δ M$ shows that the cumulative probability for $Δ M < 0$ is $\int_{- \infty}^{0} p_{Δ M} (d M) d M = 0.13$ . Thus, due to the fluctuations in the foreground layer, the cloud that is “physically” in emission mode will appear in the measurements with $M^{(1)} < M^{(0)}$ , about 13% of the time, and the cloud may be erroneously interpreted as a cloud in “absorption” mode. This also has implications towards averaging spectra in order to increase the signal to noise ratio: fluctuations in the polarity of $Δ M$ means that naively averaging instances of $Δ M$ measured by a sensor will decrease noise but will also decrease the mean (signal). In the bottom right panel we show the performance of detection, with the standard receiver operation characteristics (ROC) curves (the probability of detection, P_D, as a function of the probability of false alarm, P_FA). The ROC curve can be used to assess the theoretical performance of detecting the cloud due to scene fluctuations.

In Figs. 5-8 we show the two hypothetical scenarios of Table 1 with challenging parameters. We first discuss the S_U fit for several of the intermediate terms in the model to illuminate the limitations of the S_U distribution, and then we show that in spite of these limitations, the S_U is an excellent fit for the important quantities that determine performance, namely, M⁽¹⁾, M⁽⁰⁾, $Δ M$ , and $Δ T$ . In Fig. 5 (simulation 1) the KL distances for L_f, L_b, L_c, $M_{b}^{(0)}$ , $M_{b}^{(1)}$ , and M_c are large (~1.3), which indicates poor agreement with the S_U model. The histograms for these parameters all have a large probability for radiance values near zero, and a discontinuity at zero that reflects the fact that radiance values may not be negative (the shape of the histograms is caused by the severe truncation necessary to model the transmission terms on the interval 0 to 1; many transmission values are close to 1, meaning that the emission values for each layer $ε = 1 - t$ are near zero). The S_U distribution, however, is a continuous distribution defined on the interval $- \infty$ to $\infty$ with no restriction regarding non-negativity. Thus, the S_U fits for these terms are poor, and include (nonphysical) negative radiances. At-sensor radiances for the source layer, on the other hand, are reasonably well-fit by the S_U distribution (KL distances for $M_{s}^{(0)}$ and $M_{s}^{(1)}$ are 0.005 and 0.008, respectively). The source layer is not modeled with its own transmission term, and the effect of attenuating the source radiance through truncated lognormal transmission terms (where a majority of values are close to 1) is small.

Despite the fact that intermediate model terms for simulation 1 pose problems for the Johnson S_U fit, the primary model outputs, M⁽¹⁾, M⁽⁰⁾, and $Δ M$ —as well as $Δ T$ —are very well fit by the S_U distribution, with KL distances around 0.0013. These quantities, as well as the ROC curve for simulation 1, are shown in Fig. 6 . The ROC curve shows the challenging detection scenario posed by this simulation; despite the large thermal contrast and the same vapor optical depth as the physical simulation, it is only modestly above the 45 degree line (equal probability of false alarm and detection—a “coin-flip” detection scenario). Thus, scene fluctuations are large compared to the difference in average radiance between H₁ and H₀. In simulation 1 (Fig. 6) the cloud is in emission mode nearly always (the probability of $Δ T > 0$ is negligible, around $1 \cdot 10^{- 5}$ ), but nevertheless the cloud appears in the measurements as an “absorbing” cloud, $Δ M < 0$ ,with large (37%) probability, $\int_{- \infty}^{0} p_{Δ M} (d M) d M = 0.37$ .

Fig. 6 Same as Fig. 4 but for Simulation 1. The KL distance between theory and sampled data for $M^{(0)}$ , $M^{(1)}$ , $Δ M$ , and $Δ T$ is very small (0.0014, 0.0013, 0.0013, 0.003, respectively), hence excellent agreement between the model and the data. Although the optical depth is similar and the thermal contrast larger for Simulation 1 in comparison to the physical scenario, the performance is reduced for Simulation 1 due to larger fluctuations.

Download Full Size | PDF

In Fig. 7 (simulation 2), layer radiances L_f, L_b, and L_c show a reasonable S_U fit (KL values between 0.03 and 0.06). In each case, the long tail to the left of the S_U peak causes a small probability of non-physical negative values (~0.02%). At-sensor radiances $M_{b}^{(0)}$ and $M_{c}$ are also reasonable (KL values around 0.75) but have a higher probability of negative radiances (3%). Fits for $M_{s}^{(0)}$ , $M_{s}^{(1)}$ , and $M_{b}^{(1)}$ are poor with large KL distances (greater than 2) and significant probabilities of negative radiances (13%, 27%, and 14%, respectively). The amount of truncation for transmission variates is small (<0.1%), thus we do not think the main reason for the poor fits. Regardless of the cause, the S_U distribution is just not appropriate to describe the shape (rise and fall) of the histograms. The shape of the S_U distribution is controlled by the skewness and kurtosis of the distribution; as shown in Fig. 2 the S_U distribution only covers a subset of the possible values. The histograms in Fig. 7 are all in the S_B region: L_f, L_b, L_c, $M_{b}^{(0)}$ , and $M_{c}$ are all relatively close to the lognormal line (~2 units or less) such that the S_U approximation is still reasonable; $M_{s}^{(0)}$ , $M_{s}^{(1)}$ , and $M_{b}^{(1)}$ are significantly far from the lognormal line (10-100 units away) such that the S_U approximations are poor. Evidently, certain types of very sharply peaked distributions that rise very quick onone side with a long tail on the other (as in Fig. 7) are difficult for the S_U to represent. Despite the poor fit for some of the intermediate model outputs, Fig. 8 shows excellent agreement between the model and the histograms for M⁽¹⁾, M⁽⁰⁾, and $Δ M$ . In fact, these histograms are actually in the S_U region and KL values are small (≤0.002). For $Δ T$ the fit is good, though not perfect (KL of 0.008). The discrepancy between the model and the histogram for $Δ T$ is not due to the validity of the linearization of the inverse blackbody function described in Section 3.2.3, but is due instead to the fact that the L_in radiance is actually in the S_B region. The distance to the lognormal line is small, however, and the S_U approximation for $Δ T$ is reasonable.

Fig. 7 Same as Fig. 3 but for Simulation 2. The KL distance for the L radiances are small (0.03-0.06) and are larger (~0.75) for $M_{b}^{(0)}$ and $M_{c}$ . The agreement for $M_{s}^{(0)}$ , $M_{s}^{(1)}$ , and $M_{b}^{(1)}$ is poor (KL of 2.7, 6, and 12, respectively).

Download Full Size | PDF

Figure 8 also shows the detection performance (ROC curve) for Simulation 2, which is even poorer than for Simulation 2. The larger optical depth of the three layers in Simulation 2 causes more emission from the cloud layer, but also more attenuation of cloud photons due to the foreground layer. Additionally, the thermal contrast is reduced (compare the magnitude of $Δ T$ in Figs. 6 and 8) due to the fact that fewer photons from the mountain reach the cloud. In fact, in Simulation 2 the cloud will exhibit both radiation modes: 83% of the time in emission ( $\int_{- \infty}^{0} p_{Δ T} (d T) d T = 0.83$ , computed from the histogram), and 17% in absorption (in Simulation 1 the cloud was always in emission). In the measurements, however, the cloud in Simulation 2 will appear in emission mode with probability of 47% ( $\int_{- \infty}^{0} p_{Δ M} (d M) d M = 0.47$ ). Thus, inferences made from $Δ M$ regarding $Δ T$ (that are valid in the physical model) can be misleading in practice, where variation in the radiative transfer terms causes a mismatch between $Δ M$ and $Δ T$ .

5. Summary

A probability model for a 3-layer radiative transfer model (foreground layer, cloud layer, background layer, and an external source at the end of LOS) has been developed. The 3-layer model is fundamentally important as the primary physical model in passive LWIR remote sensing, but it has not been useful for predicting performance due to clutter (scene variations).

This is (to our knowledge) the first probability model for a 3-layer radiative transfer model where all quantities are allowed to be stochastic and higher order statistics are included. The model provides statistical characterization of at-sensor spectral radiance due to scene fluctuations, that is, it can predict the detection potential in clutter-noise limited scenarios. Note that the probability model could be combined with sensor models and detection algorithms to predict the performance of a particular sensor.

The model is based on computing analytical moments (mean, variance, skewness, and kurtosis) of the output variates based on the moments of the input variates. Our choices for the input statistics are made based on physical theory, and we expect that they can accurately model real stochastic quantities. However, any distribution that has four computable moments can be chosen. Moments for input variates could also be measured empirically.

We assumed Gaussian (normal) statistics for temperature fluctuations, $T ~ N (μ_{T}, σ_{T}^{2})$ , which results in the excellent approximation that the Planck blackbody radiance is lognormally distributed, $B (T) ~ L N (μ_{B}, σ_{B}^{2})$ . Layer radiances and transmission variates are modeled as lognormal variates, both of which are justifiable based on physical arguments. The lognormal distribution has the nice property that it describes non-negative random variables (it is defined on the domain 0 to ∞), and thus is naturally applicable to describe radiance values. Transmission values are non-negative but also less than or equal to 1; therefore it is sometimes necessary (depending on population parameters) to represent transmission variates by truncated lognormal distributions. Other choices are possible (for example, the beta distribution is naturally bounded between 0 and 1 and can describe a wide range of shapes), however, the lognormal distribution is convenient to work with since products and sums of lognormal distributions are themselves lognormally distributed (or approximately so).

In special cases the model predicts that the total at-sensor radiance is lognormally distributed, however the lognormal distribution is not flexible enough to represent the general case. Therefore, we use the Johnson system of distributions to fit the moments of the output variates. There are other systems of curves that may be used for distribution fitting; we chose the Johnson system due to central role the logarithmic distribution plays in the system and the small number of distributions that appear in the system. The Johnson S_U distribution is a particularly useful and flexible distribution that is appropriate for representing the at-sensor radiance under H₀ and H₁ hypotheses, M⁽⁰⁾ and M⁽¹⁾, respectively, and the differential radiance, $Δ M = M^{(1)} - M^{(0)}$ . These are the primary quantities that a sensor would be able to measure. The thermal contrast, ΔT (see Eqs. (4) and (5)), is also typically well-fit by the S_U distribution. The thermal contrast is of interest as the principle physical quantity that enables LWIR vapor detection. Other intermediate output terms of the model may be well fit with the S_U distribution; at times they would be better modeled with the Johnson S_B distribution, but method-of-moments estimation is intractable with the S_B, which precludes its use in our model. This is a disadvantage to the Johnson system, but in practice it is of little impact since the pimary model outputs are fit so well by the S_U distribution, even when the intermediate terms would require the S_B to be “optimally” fit.

In the traditional (non-stochastic) physical model, $Δ M$ gives only information about the vapor cloud. In the stochastic model, due to scene fluctuations, ΔM includes photons that never interacted with the cloud.

Detection potential may be quantified by ROC curves derived from binary hypothesis testing on the M⁽⁰⁾ and M⁽¹⁾ distributions. Fluctuations (in the temperature of the cloud and in the input radiance on the back of the cloud, Lin) can also cause the cloud to be in emission and in absorption at the “same time”, i.e., at a given moment there is a non-zero probability for the cloud to be in absorption and a non-zero probability for the cloud to be in emission. This has potential impacts towards processing spectral time-series data: it may be dangerous to attempt to increase the signal-to-noise ratio (SNR) via averaging spectra, since it would be possible for an emission and absorption spectrum to cancel. It is also possible that due to the fluctuations in the foreground layer, the polarity of $Δ M$ will be reversed, hence, a cloud that is in emission (i.e., the cloud output radiance into the foreground layer is greater than the incident radiance on its back side) may appear in the measurements with $Δ M < 0$ . Only with a probability model for radiative transfer one can study these types of phenomena.

In the probability model we implicitly assume that the population parameters for the three layers (and the external source) are stationary (i.e., are fixed) during the measurement period. Our probability model is a univariate model and thus can only address the radiative transfer model at a single wavelength. This is the first step in developing a multivariate model for hyperspectral remote sensing where the covariance between wavelengths plays an important role. The challenge in extending our univariate model to a multivariate model is mainly in extending Johnson S_U to multivariate statistics where the covariance needs to be computed and estimated from Eq. (1) that is written for vector quantities. This research may lead to new detection algorithms tailored to exploit the Johnson S_U statistics. Our pdf model is a “forward model” where with given population parameters for the radiative transfer model we predict performance. The challenge of estimating the population parameters form sampled data (i.e., an “inverse model”) is great and remains open.

Appendix A. The lognormal distribution

Let x be a normal distribution, $x ~ N (μ, σ^{2})$ , with density

p_{x} (x) = {(2 π σ^{2})}^{- 1 / 2} \exp (- \frac{1}{2} {(x - μ)}^{2} σ^{- 2}) .

By transformation of variables, $y = \exp (x)$ follows the standard two-parameter lognormal distribution, $y ~ L N (μ, σ^{2})$ , with density

p_{y} (y) = {(2 π σ^{2} y^{2})}^{- 1 / 2} \exp (- \frac{1}{2} {(\ln (y) - μ)}^{2} σ^{- 2}) .

The population parameters $(μ, σ^{2})$ are the mean and variance of the associated normal distribution. We will now present some useful properties of the lognormal distribution.

(i) Statistical properties (moments)
The raw n^th moments of the lognormal distribution are given by:
$E (y^{n}) = \exp (n μ + \frac{1}{2} n^{2} σ^{2})$
from which the following moments may be derived (using Appendix C) to be
$\begin{array}{l} E (y) = \exp (μ + \frac{1}{2} σ^{2}), \\ V (y) = [\exp (σ^{2}) - 1] E (^{y) 2}, \\ S (y) = [\exp (σ^{2}) + 2] \sqrt{\exp (σ^{2}) - 1,} \\ K (y) = \exp (4 σ^{2}) + 2 \exp (3 σ^{2}) + 3 \exp (2 σ^{2}) - 3. \end{array}$
The median of the lognormal distribution is simply $m e d i a n (y) = \exp (μ)$ , and the mode is given by $mode (y) = \exp (μ - σ^{2})$ . Defining Pearsonian coefficients $β_{1} = S {(y)}^{2}$ and $β_{2} = K (y)$ , the equation for the lognormal curve in the β₁, β ₂ plane (see Fig. 2) is given by
${\begin{matrix} β_{2} = β_{1} + \exp (4 σ^{2}) + \exp (3 σ^{2}) + 1 \\ β_{1} = \exp (3 σ^{2}) + 3 \exp (2 σ^{2}) - 4 \end{matrix}} .$
(ii) Estimation using the method of moments
The population parameters for a lognormal distribution $y ~ L N (μ, σ^{2})$ that matches the mean $E (x)$ and variance $V (x)$ of a random variable x, can be found by solving the simultaneous equations ${E (y) = E (x), V (y) = V (x)}$ to be
${\begin{matrix} σ^{2} = \ln (1 + \frac{V (x)}{E {(x)}^{2}}) \\ μ = \ln [E (x)] - \frac{1}{2} σ^{2} \end{matrix}} .$
Note that when attempting to fit a distribution to sampled data, other methods are usually preferred to the method of moments (maximum likelihood, for example) due to higher efficiencies [28]. Hybrid techniques may also be used, for example, replacing the mean by the median in the method of moments gives
${\begin{matrix} σ^{2} = \ln (\frac{1}{2} + \frac{\sqrt{m e d i a n {(x)}^{2} + 4 V (x)}}{2 m e d i a n (x)}) \\ μ = \ln [m e d i a n (x)] \end{matrix}} .$
However, for fitting theoretical distributions, where the moments $E (x)$ and $V (x)$ are known exactly, there are no concerns in using the method of moments.
In the properties below, let $y_{i} ~ L N (μ_{i}, σ_{i}^{2})$ be independent lognormal variates (i.e., $E (y_{i}^{n} y_{j}^{k}) = E (y_{i}^{n}) E (y_{j}^{k})$ ) and $c$ a positive constant. We have analytically the following properties:
(iii) Scaling: $z = c y ~ L N (c + μ, σ^{2}),$
(iv) Inverse: $z = 1 / y = \exp (- x) ~ L N (- μ, σ^{2}),$
(v) Multiplication: $z = \prod y_{i} ~ L N (\sum μ_{i}, \sum σ_{i}^{2}),$
(vi) Division: $z = y_{1} / y_{2} ~ L N (μ_{1} - μ_{2}, σ_{1}^{2} + σ_{2}^{2}) .$
The following two properties are approximately true, where in property (viii) the independence assumption is dropped:
(vii) Addition (approximation with Fenton-Wilkinson [19], ): $\begin{matrix} z = \sum_{i} y_{i} ~ L N (μ_{z}, σ_{z}^{2}), \\ μ_{z} = \ln (\sum_{i} E (y_{i})) - \frac{1}{2} σ_{z}^{2}, \\ σ_{z}^{2} = \ln (1 + \frac{\sum_{i} V (y_{i})}{{(\sum_{i} E (y_{i}))}^{2}}) . \end{matrix}$
For two lognormal variables $z = y_{1} + y_{2}$ it was stated [20,21] that Fenton-Wilkinson moment-matching method breaks down if $σ_{y} < 4 d B$ for either y₁ or y₂. In terms of signal to noise ratio, defined as $S N R_{y} = \sqrt{E {(y)}^{2} / V (y)}$ , this restriction can be shown (by solving $σ_{y} = \ln (1 + S N R_{y}^{- 2}) < 4 d B$ ) to give the condition that a minimum $S N R_{y} > 0.3$ is required for the Fenton-Wilkinson approximation to hold. This property may be combined with (iii) and (v) to show that a linear combination of lognormal variates (with positive coefficients) is lognormally distributed, as long as the SNR restriction is met.
(viii) Addition of correlated lognormals
For two lognormals $y_{1} ~ L N (μ_{1}, σ_{1}^{2})$ and $y_{2} ~ L N (μ_{2}, σ_{2}^{2})$ that are correlated with correlation coefficient
$ρ_{12} = c o r r (y_{1,} y_{2}) = \frac{E (y_{1} y_{2}) - E (y_{1}) E (y_{2})}{\sqrt{V (y_{1}) V (y_{2})}},$
we follow the spirit of Fenton-Wilkinson and use the method of moments—property (ii)—to find $z = y_{1} + y_{2} ~ L N (μ_{z}, σ_{z}^{2})$ such that $E (z) = E (y_{1}) + E (y_{2})$ and $V (z) = V (y_{1}) + V (y_{2}) + 2 ρ_{12} \sqrt{V (y_{1}) V (y_{2})}$ are matched exactly. A sum of three correlated lognormals with correlations $ρ_{12}$ , $ρ_{13}$ , and $ρ_{23}$ is easily computed in the same manner:
$\begin{matrix} z = y_{1} + y_{2} + y_{3} ~ L N (μ_{z}, σ_{z}^{2}), \\ E (z) = E (y_{1}) + E (y_{2}) + E (y_{3}), \end{matrix}$
and
$\begin{matrix} V (z) = V (y_{1}) + V (y_{2}) + V (y_{3}) + 2 ρ_{12} \sqrt{V (y_{1}) V (y_{2})} \\ + 2 ρ_{13} \sqrt{V (y_{1}) V (y_{3})} + 2 ρ_{23} \sqrt{V (y_{2}) V (y_{3})} . \end{matrix}$
(ix) Moments of truncated lognormal
For a lognormal variate $y ~ L N (μ, σ^{2})$ that is truncated to be within $0 \leq y \leq 1$ (e.g., when y is a transmission that may not be greater than 1) the n^th raw moment may be written in terms of the untruncated moments:
$E (y^{n}) = \frac{e r f c (\frac{μ + n σ^{2}}{\sqrt{2 σ^{2}}})}{e r f c (\frac{μ}{\sqrt{2 σ^{2}}})} \exp (n μ + \frac{1}{2} n^{2} σ^{2})$
where $e r f c (\cdot) = 1 - e r f (\cdot)$ is the complementary error function. If the associated normal parameters $(μ, σ^{2})$ are such that the probability Pr(y>1) is negligible (i.e., no truncation is required), the ratio
$\frac{e r f c (\frac{μ + n σ^{2}}{\sqrt{2 σ^{2}}})}{e r f c (\frac{μ}{\sqrt{2 σ^{2}}})} \to 1,$
and the expression for untruncated raw moments is recovered.
(x) Central limit theorem for multiplicative processes
Consider a product of independent non-negative variates, $y = \prod z_{i}$ . Taking the logarithm of this expression leads to a sum of (real-valued) independent variates, $x = \ln y = \sum \ln z_{i}$ . If, by the central limit theorem $x \to N (μ, σ^{2})$ , then the relationship $y = \exp (x)$ immediately implies that $y \to L N (μ, σ^{2})$ .

Appendix B. The Johnson S_U distribution

The S_U distribution is based on the following transformation of a normally distributed variate $x ~ N (0, 1)$ :

y = a + b \sinh (\frac{x - c}{d})

where the parameters a and $b > 0$ are location and scale parameters, respectively, and c and $d > 0$ are shape parameters. The density for $y ~ S_{U} (a, b, c, d)$ is given by

p_{y} (y) = \frac{d \exp {- \frac{1}{2} {[c + d \sinh^{- 1} (\frac{y - a}{b})]}^{2}}}{b \sqrt{2 π [{(\frac{y - a}{b})}^{2} + 1]}} .

The S_U distribution can describe the lognormal and normal distributions as limiting cases: $S_{U} \to S_{L}$ when $c \to \pm \infty$ , and $S_{U} \to N$ when $d \to \infty$ . The S_U distribution is symmetric (skewness is zero) when $c = 0$ . Some useful properties of the S_U distribution are presented below.

(i) Statistical properties (moments)
The n^th moments around arbitrary location $θ$ are given by
$E [{(y - θ)}^{n}] = \sum_{j = 0}^{n} (\begin{matrix} n \\ j \end{matrix}) {(- \frac{b}{2})}^{j} {(a - θ)}^{n - j} \sum_{r = 0}^{j} (\begin{matrix} j \\ r \end{matrix}) {(- 1)}^{r} e^{(j - 2 r) Ω} ω^{{(j - 2 r)}^{2}}$
where $ω = \exp (2^{- 1} d^{- 2})$ and $Ω = c d^{- 1}$ . Raw moments may be found from this equation by setting $θ = 0$ , central moments are found by setting $θ = E (y)$ . Using Appendix C, the following moments may be derived:
${\begin{matrix} E = a - b ω \sinh (Ω), \\ V = \frac{1}{2} b^{2} (ω^{2} - 1) (1 + ω^{2} \cosh 2 Ω), \\ S = - ω {(\frac{ω^{2} - 1}{2})}^{1 / 2} \frac{ω^{2} (ω^{2} + 2) \sinh 3 Ω + 3 \sinh Ω}{{(1 + ω^{2} \cosh 2 Ω)}^{3 / 2}}, \\ K = \frac{1}{2} \frac{ω^{4} (ω^{8} + 2 ω^{6} + 3 ω^{4} - 3) \cosh 4 Ω + 4 ω^{4} (ω^{2} + 2) \cosh 2 Ω + 3 (2 ω^{2} + 1)}{{(1 + ω^{2} \cosh 2 Ω)}^{2}} \end{matrix}} .$
Note that references [28, p42, Eq. (2).62] and [18, Eq. (6).74] both have errors in their expressions for moments of the S_U distribution. In the β₁, β ₂ plane, the S_U distribution defines an unbounded region “below” the lognormal curve (see Fig. 2).
(ii) Estimation using method of moments
We follow the iterative numerical procedure suggested by Elderton and Johnson [36] for finding the $(c, d)$ parameters that match the desired $β_{1} = S^{2}$ and $β_{2} = K$ coordinates of the distribution to be fit (in our model these are computed analytically; for fitting sampled data alternative methods such as quantiles [37] may be preferred). For the S_U distribution the sign of the skewness is determined by the sign of c ( $c < 0$ means $S > 0$ ), but because the fitting routine operates on β₁ (thereby losing the sign of the skewness), the computed value of c will always be non-positive. Thus, c should be multiplied by $- 1$ if the skewness to be fit was negative. Given values for c and d, it is then straightforward to determine the value of b that matches the variance, and then the value of a that matches the mean. This is the same procedure used in [38].
If the distribution to be fit is in the S_U region or on the lognormal curve, then the first four moments will be matched exactly by this fitting procedure. If the distribution to be fit lies in the S_B region (above the lognormal curve in Fig. 2) then the skewness and kurtosis will not match exactly. To ensure that the “best” possible S_U approximation is returned by the iterative procedure, we first find the point on the lognormal curve with the smallest Euclidean distance in the β₁,β₂ plane to the S_B point to be fit. This can be found using Newton’s method [39] to minimize the distance $D = \sqrt{{(β_{1} - S^{2})}^{2} + {(β_{2} - K)}^{2}}$ where β₁ and β₂ are given by Eq. (A1) and S² and K are the coordinates of the point in the S_B region. D is only a function of σ; the solution value σ_min is plugged back into Eq. (A1) to find the coordinates on the lognormal curve closest to the S_B point. Then, applying Elderton and Johnson’s routine to this point finds population parameters {a,b,c,d} corresponding to the optimal S_U approximation (and is numerically equivalent the best possible S_L approximation). This procedure is necessary to return the optimal S_U fit because the Elderton and Johnson procedure approaches the solution while preserving the correct value of β₂. Thus, it would return an S_U approximation that matched the mean, variance, and kurtosis, but not the skewness. The optimal solution accepts some error in the kurtosis in order to reduce the error in the skewness to achieve minimum total error.
Note that there is a non-uniqueness in approximating a point on the lognormal curve with the S_U distribution. Since the lognormal is a limiting form of the S_U, the S_U can get indistinguishably close to the lognormal distribution without actually reaching it; for a given tolerance there will be a circular arc in the S_U region centered on the lognormal point in the β₁,β₂ plane defining particular S_U distributions that are equidistant to the lognormal point (and therefore have the same value for the total error and are thus “equally good” at approximating the lognormal point). In practice, however, for a fixed tolerance, using the iterative fitting routine will always produce the same set of S_U population parameters since the point is approached from a specific direction (along a line of constant β₂).
(iii) Translation and Scaling:
Let k and λ be constants and $y ~ S_{U} (a, b, c, d)$ . Then, it can be shown that $z = k y + λ ~ S_{U} (k a + λ, | k | b, sgn (k) c, d)$ . The absolute value and signum (sign) functions are needed because by convention, b and d are always nonnegative. In (B1), since sinh(·) is an odd function and x is a zero-mean symmetric distribution, the following transformations are all equivalent, $- \sinh (x - c) = \sinh (- x + c) = \sinh (x + c)$ . Therefore there is an invariance in the Johnson S_U distribution to simultaneously flipping the sign of the c and b parameters, and no loss in generality in always taking b to be positive.

Appendix C. [E, V, S, K] as a function of moments

Let z be a random variable whose raw moments $E (z^{n})$ exist for n =1, 2, 3, 4. Then the mean, variance, skewness, and kurtosis may be written as a function of the raw moments as

\begin{array}{l} E (z), \\ V (z) = E ({[z - E (z)]}^{2}) = E (z^{2}) - E (z), \\ S (z) = \frac{E ({[z - E (z)]}^{3})}{V {(z)}^{3 / 2}} = \frac{E (z^{3}) - 3 E (z^{2}) E (z) + 2 E {(z)}^{3}}{V {(z)}^{3 / 2}}, \\ K (z) = \frac{E ({[z - E (z)]}^{4})}{V {(z)}^{2}} = \frac{E (z^{4}) - 4 E (z^{3}) E (z) + 6 E (z^{2}) E {(z)}^{2} - 3 E {(z)}^{4}}{V {(z)}^{2}}, \end{array}

respectively. These expressions may be derived using the properties of the expectation operator, which are summarized in Appendix D.

Appendix D. Properties of the expectation operator

The definition of the expectation operator is

E [f (x)] = \int f (x) p_{x} (x) d x

where p_x(x) is the pdf describing the random variable x, and the integral is computed over the entire domain of x. It follows directly from the definition that $E (k x + λ) = k E (x) + λ$ where k and λ are constants. It can also be established, regardless of independence between variates x and y, that $E [f (x) + g (y)] = E [f (x)] + E [g (y)]$ . Additionally, if x and y are independent—such that the joint probability density $p_{x, y} (x, y) = p_{x} (x) p_{y} (y)$ —then $E [f (x) g (y)] = E [f (x)] E [g (y)]$ .

For convenience we give the moments of $z = x \pm y$ and of $z = x y$ (with x and y independent), which are frequently used in deriving moments of the model outputs. The moments of $z = x \pm y$ (for example, $Δ M = M^{(1)} - M^{(0)}$ and $Δ T = B^{- 1} (L_{i n}) - T_{c}$ ) are given by

{\begin{cases} E (z) = E (x) \pm E (y) \\ E (z^{2}) = E (x^{2}) \pm 2 E (x) E (y) + E (y^{2}) \\ E (z^{3}) = E (x^{3}) \pm 3 E (x^{2}) E (y) + 3 E (x) E (y^{2}) \pm E (y^{3}) \\ E (z^{4}) = E (x^{4}) \pm 4 E (x^{3}) E (y) + 6 E (x^{2}) E (y^{2}) \pm 4 E (x) E (y^{3}) + E (y^{4}) \end{cases}} .

For emissivity terms $ε = (1 - t)$ that are of the form $z = 1 - y$ , (D1) may be used with $E (x^{n}) = 1$ . The moments of $z = x y$ (with x and y independent) are given by

{\begin{cases} E (z) = E (x) E (y) \\ E (z^{2}) = E (x^{2}) E (y^{2}) \\ E (z^{3}) = E (x^{3}) E (y^{3}) \\ E (z^{4}) = E (x^{4}) E (y^{4}) \end{cases}} .

References and links

1. M. L. Polak, J. L. Hall, and K. C. Herr, “Passive Fourier-transform infrared spectroscopy of chemical plumes: an algorithm for quantitative interpretation and real-time background removal,” Appl. Opt. 34(24), 5406–5412 (1995). [CrossRef] [PubMed]

2. D. F. Flanigan, “Prediction of the limits of detection of hazardous vapors by passive infrared with the use of modtran,” Appl. Opt. 35(30), 6090–6098 (1996). [CrossRef] [PubMed]

3. A. Ben-David and H. Ren, “Detection, identification, and estimation of biological aerosols and vapors with a Fourier-transform infrared spectrometer,” Appl. Opt. 42(24), 4887–4900 (2003). [CrossRef] [PubMed]

4. A. Ben-David, C. E. Davidson, and J. F. Embury, “Radiative transfer model for aerosols at infrared wavelengths for passive remote sensing applications: revisited,” Appl. Opt. 47(31), 5924–5937 (2008). [CrossRef] [PubMed]

5. D. Manolakis, “Signal processing algorithms for hyperspectral remote sensing of chemical plumes”, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 1857 – 1860, Digital Object Identifier: 10.1109/ICASSP.2008.4517995 (2008). [CrossRef]

6. A. Ifarraguerri and A. Ben-David, “Impact of atmospheric boundary layer turbulent temperature fluctuations on remote detection of vapors by passive infrared spectroscopy,” Opt. Express 16(22), 17366–17382 (2008). [CrossRef] [PubMed]

7. A. Ben-David, “Remote detection of biological aerosols at a distance of 3 km with a passive Fourier transform infrared (FTIR) sensor,” Opt. Express 11(5), 418–429 (2003). [CrossRef] [PubMed]

8. H. C. van de Hulst, Multiple Light Scattering Tables, Formulas and Application (Academic, 1980).

9. K. N. Liou, An Introduction to Atmospheric Radiation, 2nd ed. (Academic, 2002).

10. G. L. Stephens, P. M. Gabriel, and S.-C. Tsay, “Statistical radiative transport in one-dimensional media and its application to the terrestrial atmosphere,” Transp. Theory Stat. Phys. 20(2-3), 139–175 (1991). [CrossRef]

11. J. Kerekes and J. Baum, “Full spectrum spectral imaging system analytical model,” IEEE Trans. Geosci. Rem. Sens. 43(3), 571–580 (2005). [CrossRef]

12. C. Quang, F. Dalaudier, A. Roblin, V. Rialland, and P. Chervet, “Statistical model for atmospheric limb radiance structure: application to airborne infrared surveillance systems,” Proc. SPIE 7108, 710805, 710805-9 (2008). [CrossRef]

13. A. Ishimaru, Wave Propagation and Scattering in Random Media (Academic, 1978).

14. N. Byrne, “3D Radiative transfer in stochastic media,” in 3D Radiative Transfer in Cloudy Atmospheres, A. Marshak and A. B. Davis, eds. (Springer, 2005).

15. D. E. Lane-Veron and C. J. Somerville, “Stochastic theory of radiative transfer through generalized cloud fields,” J. Geophys. Res. 109(D18), D18113 (2004). [CrossRef]

16. J. Aitchison and J. A. C. Brown, The Lognormal Distribution (Cambridge University Press, 1957).

17. E. Limpert, W. A. Stahel, and M. Abbt, “Lognormal distributions across the sciences: keys and clues,” Bioscience 51(5), 341–352 (2001). [CrossRef]

18. N. L. Johnson, S. Kotz, and N. Balakrishnan, Continuous univariate distributions, 2nd ed. (Wiley, 1994), Vol. 1.

19. L. F. Fenton, “The sum of lognormal probability distributions in scatter transmission systems,” IRE Trans. Commun. Syst. CS-8, 57–67 (1960).

20. G. L. Stuber, Principles of Mobile Communications (Kluwer Academic Publishers, 1996).

21. N. B. Mehta, J. Wu, A. F. Molisch, and J. Zhang, “Approximating a sum of random variables with a lognormal,” IEEE Trans. Wirel. Comm. 6(7), 2690–2699 (2007). [CrossRef]

22. D. Manolakis, D. Marden, and G. A. Shaw, “Hyperspectral image processing for automatic target detection applications,” Lincoln Lab. J. 14, 79–116 (2003).

23. A. Schaum, “Adapting to change: the CFAR problem in advanced hyperspectral detection,” in Proceedings of the 36th Applied Imagery Pattern Recognition Workshop (IEEE Computer Society Washington, DC, 2007), pp. 15–21.

24. D. Manolakis, R. Lockwood, T. Cooley, and J. Jacobson, “Is there a best hyperspectral detection algorithm?” Proc. SPIE 7334, 733402, 733402-16 (2009). [CrossRef]

25. H. Ren, “Anomaly detection in hyperspectral imagery: second and higher-order statistics-based algorithms,” in Recent Advances in Hyperspectral Signal And Image Processing, C–I Chang, ed. (Transworld Research Network, 2006), Chap. 12.

26. A. Stuart and K. Ord, Kendall’s Advanced Theory of Statistics: Distribution Theory, 6th ed. (Oxford University, 2007), Vol. 1.

27. N. L. Johnson, “Systems of frequency curves generated by methods of translation,” Biometrika 36(1-2), 149–176 (1949). [CrossRef] [PubMed]

28. J. K. Ord, Families of Frequency Distributions (Griffin, 1972), Chap. 2.

29. S. M. Kay, Fundamentals of statistical signal processing: detection theory (Prentice Hall PTR, 1998), Vol. II.

30. A. Ben-David, S. K. Holland, G. Laufer, and J. D. Baker, “Measurements of atmospheric brightness temperature fluctuations and their implications on passive remote sensing,” Opt. Express 13(22), 8781–8800 (2005). [CrossRef] [PubMed]

31. J. H. Seinfeld and S. N. Pandis, Atmospheric Chemistry and Physics (Wiley, 1998).

32. S. Chandrasekhar, Radiative Transfer (Dover, 1960).

33. N. S. Kopeika, A System Engineering Approach to Imaging (SPIE Optical Engineering Press, 1998), Chap. 15.

34. A. Berk, G. P. Anderson, P. K. Acharya, L. S. Bernstein, L. Muratov, J. Lee, M. J. Fox, S. M. Adler-Golden, J. H. Chetwynd, M. L. Hoke, R. B. Lockwood, T. W. Cooley, and J. A. Gardner, “MODTRAN5: A Reformulated Atmospheric Band Model with Auxiliary Species and Practical Multiple Scattering Options,” Proc. SPIE 5655, 88–95 (2005). [CrossRef]

35. R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. (Wiley, 2001).

36. W. P. Elderton and N. L. Johnson, Systems of Frequency Curves (Cambridge University Press, 1969).

37. R. E. Wheeler, “Quantile estimators of Johnson curve parameters,” Biometrika 67(3), 725–728 (1980). [CrossRef]

38. I. D. Hill, R. Hill, and R. L. Holder, “Algorithm AS99: fitting Johnson curves by moments,” J. R. Stat. Soc., Ser. C, Appl. Stat. 25(2), 180–189 (1976).

39. F. S. Acton, Numerical Methods That Work (The Mathematical Association of America, 1990).

	RV	pdf	Parameters	Physical Simulation	Simulation 1	Simulation 2
Foreground	$T_{f}$	N	$(μ_{T f}, σ_{T f}^{2})$	$(288 K, 0.180 K^{2})$	$(288 K, 1.80 K^{2})$
	$B (T_{f})$	LN	$(μ_{B T f}, σ_{B T f}^{2})$	$(- 11.83, 6.02 \cdot 10^{- 5})$	$(- 11.83, 6.02 \cdot 10^{- 4})$
	$B (T_{f})$	LN	(E, V)	$(7.32 \cdot 10^{- 6}, 3.22 \cdot 10^{- 15})$	$(7.32 \cdot 10^{- 6}, 3.22 \cdot 10^{- 14})$
	$t_{f}$	LN	$(μ_{t f}, σ_{t f}^{2})$	$(- 0.105, 3.09 \cdot 10^{- 9})$	$(- 0.057, 0.011)$	$(- 2.65, 0.693)$
			(E, V)	$(0.9, 2.5 \cdot 10^{- 9})$	$(0.95, 0.01)$	$(0.1, 0.01)$
			% truncated	0%	29.4%	0.07%
Target Layer (Cloud)	$T_{c}$	N	$(μ_{T c}, σ_{T c}^{2})$	$(288 K, 0.088 K^{2})$	$(288 K, 0.88 K^{2})$
	$B (T_{c})$	LN	$(μ_{B T c}, σ_{B T c}^{2})$	$(- 11.83, 2.95 \cdot 10^{- 5})$	$(- 11.83, 2.95 \cdot 10^{- 4})$
	$B (T_{c})$	LN	(E, V)	$(7.32 \cdot 10^{- 6}, 1.58 \cdot 10^{- 15})$	$(7.32 \cdot 10^{- 6}, 1.58 \cdot 10^{- 14})$
	$τ_{c}$	N	$(μ_{τ c}, σ_{τ c}^{2})$	$(0.047, 5.37 \cdot 10^{- 6})$	$(0.057, 0.011)$	$(2.65, 0.693)$
	$t_{c}$	LN	$(μ_{t c}, σ_{t c}^{2})$	$(- 0.047, 5.37 \cdot 10^{- 6})$	$(- 0.057, 0.011)$	$(- 2.65, 0.693)$
			(E, V)	$(0.954, 4.88 \cdot 10^{- 6})$	$(0.95, 0.01)$	$(0.1, 0.01)$
			% truncated	0%	29.4%	0.07%
Background	$T_{b}$	N	$(μ_{T b}, σ_{T b}^{2})$	$(288 K, 0.0448 K^{2})$	$(288 K, 0.448 K^{2})$
	$B (T_{b})$	LN	$(μ_{B T b}, σ_{B T b}^{2})$	$(- 11.83, 1.50 \cdot 10^{- 5})$	$(- 11.83, 1.50 \cdot 10^{- 4})$
			(E, V)	$(7.32 \cdot 10^{- 6}, 8.03 \cdot 10^{- 16})$	$(7.32 \cdot 10^{- 6}, 8.03 \cdot 10^{- 15})$
	$t_{b}$	LN	$(μ_{t b}, σ_{t b}^{2})$	$(- 0.948, 2.5 \cdot 10^{- 7})$	$(- 0.057, 0.011)$	$(- 2.65, 0.693)$
			(E, V)	$(0.387, 3.75 \cdot 10^{- 8})$	$(0.95, 0.01)$	$(0.1, 0.01)$
			% truncated	0%	29.4%	0.07%
External Source	$T_{s}$	N	$(μ_{T s}, σ_{T s}^{2})$	$(278 K, 0.088 K^{2})$	$(278 K, 8.80 K^{2})$
	$B (T_{s})$	LN	$(μ_{B T s}, σ_{B T s}^{2})$	$(- 12.02, 3.388 \cdot 10^{- 5})$	$(- 12.02, 3.391 \cdot 10^{- 3})$
	$B (T_{s})$	LN	(E, V)	$(6.05 \cdot 10^{- 6}, 1.24 \cdot 10^{- 15})$	$(6.06 \cdot 10^{- 6}, 1.25 \cdot 10^{- 13})$
	$ε_{s}$	LN	$(μ_{ε s}, σ_{ε s}^{2})$	$(- 0.1625, 3.388 \cdot 10^{- 5})$	$(- 0.164, 3.391 \cdot 10^{- 3})$
	$ε_{s}$	LN	(E, V)	$(0.85, 2.45 \cdot 10^{- 5})$	$(0.85, 2.45 \cdot 10^{- 3})$
	$L_{s}$	LN	$(μ_{L s}, σ_{L s}^{2})$	$(- 12.18, 6.78 \cdot 10^{- 5})$	$(- 12.18, 6.78 \cdot 10^{- 3})$
	$L_{s}$	LN	(E, V)	$(5.15 \cdot 10^{- 6}, 1.79 \cdot 10^{- 15})$	$(5.15 \cdot 10^{- 6}, 1.81 \cdot 10^{- 13})$

	x	a	b	c	D
Physical Simulation	$M^{(0)}$	1.5238e-7	4.4926e-11	−3619.2	288.123
	$M^{(1)}$	6.5954e-6	1.0159	0	4.7453e7
	$Δ M$	3.5104e-8	1.4649	0	4.7453e7
	$Δ T$	−9.2118	37.544	−7.6107	106.41
Simulation 1	$M^{(0)}$	5.5634e-6	18.517	0	4.7453e7
	$M^{(1)}$	1.9563e-5	3.0743e-11	511.80	37.333
	$Δ M$	4.6307e-5	1.7045e-5	157.35	91.417
	$Δ T$	−65.227	59.266	−15.212	20.131
Simulation 2	$M^{(0)}$	7.2382e-6	7.1219e-7	−0.35504	4.3966
	$M^{(1)}$	7.0853e-6	1.1756e-6	−1.4319	7.3955
	$Δ M$	−4.4556e-9	1.6652e-6	−0.10131	7.1994
	$Δ T$	0.50780	1.9143	1.3834	1.6689

	RV	pdf	Parameters	Physical Simulation	Simulation 1	Simulation 2
Foreground	$T_{f}$	N	$(μ_{T f}, σ_{T f}^{2})$	$(288 K, 0.180 K^{2})$	$(288 K, 1.80 K^{2})$
	$B (T_{f})$	LN	$(μ_{B T f}, σ_{B T f}^{2})$	$(- 11.83, 6.02 \cdot 10^{- 5})$	$(- 11.83, 6.02 \cdot 10^{- 4})$
	$B (T_{f})$	LN	(E, V)	$(7.32 \cdot 10^{- 6}, 3.22 \cdot 10^{- 15})$	$(7.32 \cdot 10^{- 6}, 3.22 \cdot 10^{- 14})$
	$t_{f}$	LN	$(μ_{t f}, σ_{t f}^{2})$	$(- 0.105, 3.09 \cdot 10^{- 9})$	$(- 0.057, 0.011)$	$(- 2.65, 0.693)$
			(E, V)	$(0.9, 2.5 \cdot 10^{- 9})$	$(0.95, 0.01)$	$(0.1, 0.01)$
			% truncated	0%	29.4%	0.07%
Target Layer (Cloud)	$T_{c}$	N	$(μ_{T c}, σ_{T c}^{2})$	$(288 K, 0.088 K^{2})$	$(288 K, 0.88 K^{2})$
	$B (T_{c})$	LN	$(μ_{B T c}, σ_{B T c}^{2})$	$(- 11.83, 2.95 \cdot 10^{- 5})$	$(- 11.83, 2.95 \cdot 10^{- 4})$
	$B (T_{c})$	LN	(E, V)	$(7.32 \cdot 10^{- 6}, 1.58 \cdot 10^{- 15})$	$(7.32 \cdot 10^{- 6}, 1.58 \cdot 10^{- 14})$
	$τ_{c}$	N	$(μ_{τ c}, σ_{τ c}^{2})$	$(0.047, 5.37 \cdot 10^{- 6})$	$(0.057, 0.011)$	$(2.65, 0.693)$
	$t_{c}$	LN	$(μ_{t c}, σ_{t c}^{2})$	$(- 0.047, 5.37 \cdot 10^{- 6})$	$(- 0.057, 0.011)$	$(- 2.65, 0.693)$
			(E, V)	$(0.954, 4.88 \cdot 10^{- 6})$	$(0.95, 0.01)$	$(0.1, 0.01)$
			% truncated	0%	29.4%	0.07%
Background	$T_{b}$	N	$(μ_{T b}, σ_{T b}^{2})$	$(288 K, 0.0448 K^{2})$	$(288 K, 0.448 K^{2})$
	$B (T_{b})$	LN	$(μ_{B T b}, σ_{B T b}^{2})$	$(- 11.83, 1.50 \cdot 10^{- 5})$	$(- 11.83, 1.50 \cdot 10^{- 4})$
			(E, V)	$(7.32 \cdot 10^{- 6}, 8.03 \cdot 10^{- 16})$	$(7.32 \cdot 10^{- 6}, 8.03 \cdot 10^{- 15})$
	$t_{b}$	LN	$(μ_{t b}, σ_{t b}^{2})$	$(- 0.948, 2.5 \cdot 10^{- 7})$	$(- 0.057, 0.011)$	$(- 2.65, 0.693)$
			(E, V)	$(0.387, 3.75 \cdot 10^{- 8})$	$(0.95, 0.01)$	$(0.1, 0.01)$
			% truncated	0%	29.4%	0.07%
External Source	$T_{s}$	N	$(μ_{T s}, σ_{T s}^{2})$	$(278 K, 0.088 K^{2})$	$(278 K, 8.80 K^{2})$
	$B (T_{s})$	LN	$(μ_{B T s}, σ_{B T s}^{2})$	$(- 12.02, 3.388 \cdot 10^{- 5})$	$(- 12.02, 3.391 \cdot 10^{- 3})$
	$B (T_{s})$	LN	(E, V)	$(6.05 \cdot 10^{- 6}, 1.24 \cdot 10^{- 15})$	$(6.06 \cdot 10^{- 6}, 1.25 \cdot 10^{- 13})$
	$ε_{s}$	LN	$(μ_{ε s}, σ_{ε s}^{2})$	$(- 0.1625, 3.388 \cdot 10^{- 5})$	$(- 0.164, 3.391 \cdot 10^{- 3})$
	$ε_{s}$	LN	(E, V)	$(0.85, 2.45 \cdot 10^{- 5})$	$(0.85, 2.45 \cdot 10^{- 3})$
	$L_{s}$	LN	$(μ_{L s}, σ_{L s}^{2})$	$(- 12.18, 6.78 \cdot 10^{- 5})$	$(- 12.18, 6.78 \cdot 10^{- 3})$
	$L_{s}$	LN	(E, V)	$(5.15 \cdot 10^{- 6}, 1.79 \cdot 10^{- 15})$	$(5.15 \cdot 10^{- 6}, 1.81 \cdot 10^{- 13})$

	x	a	b	c	D
Physical Simulation	$M^{(0)}$	1.5238e-7	4.4926e-11	−3619.2	288.123
	$M^{(1)}$	6.5954e-6	1.0159	0	4.7453e7
	$Δ M$	3.5104e-8	1.4649	0	4.7453e7
	$Δ T$	−9.2118	37.544	−7.6107	106.41
Simulation 1	$M^{(0)}$	5.5634e-6	18.517	0	4.7453e7
	$M^{(1)}$	1.9563e-5	3.0743e-11	511.80	37.333
	$Δ M$	4.6307e-5	1.7045e-5	157.35	91.417
	$Δ T$	−65.227	59.266	−15.212	20.131
Simulation 2	$M^{(0)}$	7.2382e-6	7.1219e-7	−0.35504	4.3966
	$M^{(1)}$	7.0853e-6	1.1756e-6	−1.4319	7.3955
	$Δ M$	−4.4556e-9	1.6652e-6	−0.10131	7.1994
	$Δ T$	0.50780	1.9143	1.3834	1.6689

Probability theory for 3-layer remote sensing radiative transfer model: univariate case

Abstract

Corrections

1. Introduction

2. Radiative transfer model