Expand this Topic clickable element to expand a topic
Skip to content
Optica Publishing Group

Probability theory for 3-layer remote sensing radiative transfer model: univariate case

Open Access Open Access

Abstract

A probability model for a 3-layer radiative transfer model (foreground layer, cloud layer, background layer, and an external source at the end of line of sight) has been developed. The 3-layer model is fundamentally important as the primary physical model in passive infrared remote sensing. The probability model is described by the Johnson family of distributions that are used as a fit for theoretically computed moments of the radiative transfer model. From the Johnson family we use the SU distribution that can address a wide range of skewness and kurtosis values (in addition to addressing the first two moments, mean and variance). In the limit, SU can also describe lognormal and normal distributions. With the probability model one can evaluate the potential for detecting a target (vapor cloud layer), the probability of observing thermal contrast, and evaluate performance (receiver operating characteristics curves) in clutter-noise limited scenarios. This is (to our knowledge) the first probability model for the 3-layer remote sensing geometry that treats all parameters as random variables and includes higher-order statistics.

©2012 Optical Society of America

Corrections

Avishai Ben-David and Charles E. Davidson, "Probability theory for 3-layer remote sensing radiative transfer model: errata," Opt. Express 21, 11852-11852 (2013)
https://opg.optica.org/oe/abstract.cfm?uri=oe-21-10-11852

1. Introduction

The radiative transfer model with simple 3-layer geometry has been the primary physical model for long wave infrared (LWIR) or “thermal infrared” passive remote sensing for many years [17]. This model has been used for detecting vapor and aerosol targets in the atmosphere in conjunction with detection algorithms such as matched filters and anomaly detectors. A probability density function (pdf) model for the radiative transfer geometry is of great interest and importance for predicting the stochastic nature of the measured signals, evaluating performance, and for developing detection algorithms.

The traditional three-layer remote sensing model is based on the deterministic radiative transfer equation [8,9] where radiative transfer parameters are non-fluctuating. In 1991, Stephens et al. [10] gave a review of efforts to include stochastic components into the radiative transfer equation, and proposed a method for determining pdfs for the optical properties of stratiform clouds. Stephens’s model was based on the two-stream solution [8,9] to the radiative transfer equation for a single layer, where the optical depth and single scattering albedo were treated as random variables with empirically measured pdfs (the asymmetry parameter was fixed). Using transformation of variables and numerical techniques, Stephens obtained output distributions for the albedo, diffuse transmission, and absorption of the cloud. Other work [11,12] has incorporated fluctuations by characterizing a few radiative transfer parameters by a mean and variance (second-order statistics) while holding others fixed; these papers do not address the underlying probability distribution of the predicted spectral radiance. The goal of our work is to predict the pdf of spectral radiance not only due to a fluctuating vapor cloud, but also due to fluctuations in the atmosphere in front and behind the cloud, where all parameters are stochastic and we employ higher order statistics.

In the literature there is a seemingly related area of research regarding radiative transfer through random or stochastic media [1315], sometimes referred to as stochastic radiative transfer. However, as Byrne states on page 386 of [14], “The common term ‘stochastic radiative transfer’ is, strictly, a misnomer. The situation we discuss is the problem of finding the average (emphasis added) radiative properties of an ‘ensemble’ (a generally infinite set) of problems, by means other than actually solving each problem and averaging results”. Thus, so-called stochastic radiative transfer does not provide information regarding the distribution of spectral radiance. In our paper when we use the term stochastic radiative transfer or stochastic model, we do not intend to refer to ensemble averages, but to complete probability distribution information.

In this work we develop a stochastic 3-layer radiative transfer model that consists of a foreground layer between the sensor and the target, a non-scattering target layer (e.g., a vapor cloud), a background layer at the far end of the target, and an external radiation source (e.g., a mountain, building, etc.) at the end of the line of sight (LOS). The external radiation source can also be taken as atmospheric radiation, in which case it can be thought of as a fourth layer. Each layer is characterized by a transmittance (which attenuates the radiation from previous layers) and serves as a source of radiation (due to its intrinsic temperature). Our main objective is to derive a pdf for the at-sensor spectral radiance (i.e., the spectral radiance—hereafter just radiance—input to a hypothetical sensor) due to fluctuations in transmittance and the radiation produced by the layers. These fluctuations contribute to variance in sensor measurements that is typically referred to as “clutter” (variation from the scene that potentially interferes with the task of detecting a particular signal of interest). Technological advancements have led to very low sensor noise, and typical remote sensing applications are limited by clutter noise. Thus, since it is of primary importance to understand the stochastic nature of the physical environment, and less important to model sensor noise, we neglect modeling sensor characteristics in this paper. At-sensor radiance statistics, when combined with specific sensor models and detection algorithms, can produce accurate end-to-end performance predictions. Predictions derived from the at-sensor radiance only (as in this paper) may be interpreted as upper bounds on observed performance from an actual sensor.

We focus mainly on modeling each of the three layers as homogeneously composed layers at thermal equilibrium (the target layer is always modeled this way), in which case the layer radiance is due to self emission, and the stochastic nature of the radiance is due to fluctuations in temperature and density (which in turn causes the layer’s transmittance and emissivity to be stochastic). This type of model is likely applicable to near-horizon LOS (e.g., LOS1 in Fig. 1 ), but not up- or down-looking LOS (e.g., LOS2 in Fig. 1) since the vertical stratification of the atmosphere makes it less likely for a layer to be homogeneously composed (local thermal equilibrium is always assumed). Therefore, we also allow the layer radiance to be taken with an assumed pdf intended to capture all sources of fluctuations in a layer without assuming homogeneity. We will show that a lognormal distribution is a reasonable distribution to assume for the radiance from a layer (as well as for the transmission). In fact, normal and lognormal distributions play a central role in our model due to the central limit theorem and the role these distributions play in sampling from additive and multiplicative processes [1621].

 figure: Fig. 1

Fig. 1 Geometry and radiative transfer quantities for a 3-layer radiative transfer model for horizontal line of sight (LOS1) and up-looking slant path (LOS2). Transmissions and temperatures for the different layers are denoted by ti and Ti, respectively. For LOS1 the foreground and background layer radiances (Lf,Lb) are given as emissions (1ti)B(Ti) from a blackbody B(Ti). The external source radiance, Ls, at the end of LOS can be a function of temperature Ts and emissivity εs, i.e., Ls=εsB(Ts).

Download Full Size | PDF

To derive the pdf of the at-sensor radiance, we allow any or all of the radiative transfer model inputs to become random variables (RVs), assume a particular distribution and population parameters for the variates in each layer (transmission and temperature), and assume that the layers are independently distributed. Then, the pdf of the at-sensor radiance is determined by finding population parameters for a distribution that matches the first four moments (mean, variance, skewness, and kurtosis, denoted E, V, S, and K, respectively) computed analytically through the 3-layer radiative transfer model equation. A pdf that matches all moments is the correct (true) analytical distribution, however, in practice one does not have access to reliable higher moments, and furthermore, fitting routines that include higher moments are extremely complex. In our statistical model we make use of the first 4 moments, only; in detection theory the sought-after information (the presence of the target) resides in the tail of the distribution [2224], hence the importance to describe a pdf with arbitrary skewness and kurtosis values that may be used as an indicator for detection [25].

There are several systems of distributions designed to exactly match any combination of values for the first four moments, including Pearson distributions, Burr distributions, and systems based on logistic and Laplace distributions, respectively [18,26]. In our work we chose the Johnson system [18,2628] for simplicity (it includes only four distributions), convenience (only the Johnson SU distribution is typically needed for our model), and the central role the lognormal distribution plays in the system. The ability of the Johnson SU distribution to model the at-sensor radiance will be established by comparing to stochastic simulations. The SU distribution is also useful to model intermediate terms in the model (such as the radiance incident on the back of the cloud due to the source and background layers), although sometimes less well.

Our statistical model is a “forward model”. We do not address the general question of how to choose and determine the statistical properties of the radiative transfer variates (transmissions and temperatures) for a given physical scenario, nor do we address the “inverse problem” of estimating statistical properties of the radiative transfer variates from radiance measurements. These are both challenging problems deserving further research.

Our statistical model is for a single wavelength (i.e., the radiance is a scalar) hence, we develop a univariate pdf model. We reserve the extension from univariate to multivariate model (where radiance is specified at multiple wavelengths—and thus is a vector quantity—as is required for multi- or hyperspectral applications) for a future publication. Our model can be extended from 3 layers to n layers (though computing analytical moments for n layers is very cumbersome).

The remainder of the paper is organized as follows. In section 2 we present the radiative transfer model. In section 3 we present the details of the pdf model. In section 4 we detail the simulations that are used to demonstrate and investigate our statistical model. In the simulations we use turbulence theory to set the population parameters for the layer temperatures. In section 5 we show results of simulations and we discuss the performance of our model. We conclude with a summary. We include four appendices to aid the reader with different technical aspects of our presentation. Definitions and properties of the lognormal and Johnson SU distributions (which play a key role in the stochastic model) appear in Appendices A and B, respectively. Moments and properties of the expectation operator appear in Appendices C and D.

2. Radiative transfer model

We use a standard layer model for passive LWIR remote sensing, using the following notation (for simplicity we omit the wavelength dependence of all variables in the model). Let Lf, Lb, Ls denote radiances originating from the foreground, background, and source layers, respectively; let tf, tc, tb denote transmissions of the foreground, cloud (target), and background layers, and let B(Tc) denote the Plank blackbody radiance at the cloud temperature, Tc. Cloud transmission may also be written in terms of optical depth, τc, as tc=exp(τc), and due to local thermal equilibrium and Kirchoff’s law, the emissivity of the (non-scattering) cloud layer is εc=1tc.

Using terminology from detection theory [29]—where H0 denotes the null hypothesis (target cloud is absent) and H1 is the alternative hypothesis (target is present)—the total at-sensor radiance given by the model in the presence of the target is denoted M|H1 or M(1) and is given by

M(1)=Lf+tf(1tc)B(Tc)+tftcLb+tftctbLs.

The total at-sensor radiance in the absence of the cloud, denoted M|H0 or M(0), can be obtained from Eq. (1) by setting tc to 1, resulting in

M(0)=Lf+tfLb+tftbLs.

Equations (1-2) are good for any LOS. However, for LOS1 or for when foreground and background layers are well-approximated as homogeneous layers at thermal equilibrium, the foreground and background layer radiances may be written as

{Lf=(1tf)B(Tf)Lb=(1tb)B(Tb)}.

The terms 1tf=εf and 1tb=εb are emissivities for the foreground and background layers, respectively (in the LWIR most of the attenuation is due to molecular absorption, and therefore scattering may oftentimes be neglected). In many cases one assumes Tb=Tf. Figure 1 diagrams the radiative transfer model and physical parameters for each layer for both horizontal and slant-path lines of sight (LOS1 and LOS2, respectively).

It is also convenient to define at-sensor radiance terms corresponding to each layer, i.e., layer-radiances after attenuation by subsequent layers, as follows: Mf=Lf, Mc=tf(1tc)B(Tc), Mb(1)=tftcLb, Ms(1)=tftctbLs, Mb(0)=tfLb, and Ms(0)=tftbLs. Parenthetical superscript notation is necessary for Mb and Ms to differentiate between H0 and H1 since the presence of the cloud affects them (whereas Mf is unaffected by presence of the cloud and Mc only exists under H1, and thus does not need additional notation). Using these quantites, the total at sensor radiances in Eqs. (1-2) are just the sum of the appropriate individual at-sensor radiance terms: M(1)=Mf+Mc(1)+Mb(1)+Ms(1) and M(0)=Mf+Mb(0)+Ms(0), respectively.

The information about the vapor cloud’s presence, ΔM, can be seen by subtracting Eq. (2) from Eq. (1) to obtain

{ΔM=M(1)M(0)=tf(1tc)ΔTΔT=LinB(Tc)Lin=Lb+tbLs}
where ΔT, known as thermal contrast [1,4], is the difference between the input radiance on the back of the cloud, Lin, and the blackbody radiance of the cloud. If the thermal contrast is negative, ΔT<0, the cloud is observed in emission: the cloud is “warmer” than the input radiance and adds photons to the LOS (ΔM>0). If the thermal contrast is positive, ΔT>0, the cloud is observed in absorption: the cloud is “colder” than the input radiance and removes photons from the LOS (ΔM<0). A non-zero value of ΔT is required for vapor detection, and thus is the “engine” behind thermal infrared remote sensing. It is sometimes convenient to present the thermal contrast in temperature units (Kelvin) as
ΔT=B1(Lin)Tc
where B1() is the inverse of the blackbody function.

It is essential to note that Eqs. (1-4) describe the traditional deterministic model (henceforth called the physical model): they predict the value of the at-sensor radiance given constant (non-fluctuating) values of the physical parameters. The goal of this paper is to extend the physical model by allowing any (or all) of the model inputs to be RVs; the result is that model outputs (M(0), M(1), and ΔM, as well as intermediate quantities such as ΔT, Lin, etc.) are themselves RVs and characterized by appropriate pdfs. The details of how this is done are presented in the next section, however there are immediate implications that can be introduced here. For example, in the physical model, ΔT=0 means no information about the cloud reaches the sensor. However, in the stochastic model, even when E(ΔT)=0, the cloud may still be detectable based on fluctuations around the mean. Additionally, it is possible for the cloud to be observed in emission and in absorption at the “same time”, i.e., at a given moment there will be a non-zero probability for the cloud to be in absorption and simultaneously a non-zero probability for the cloud to be in emission. Furthermore, the polarity (sign) of ΔM is affected by the foreground layer (that is stochastic in our model), thus, a cloud that is in a given radiation mode, say emission, for which the probability for ΔT to be less than zero is one—i.e., Pr(ΔT<0)=1—may appear in the measurements in the reverse mode with Pr(ΔM<0)>0. Only with a probability model can one study these phenomena, which we demonstrate and discuss further in Section 5.

3. Probability model

In this section we describe how initial distributions for the input RVs are chosen (in subsection 3.1), and detail how the pdfs of the output RVs are determined (subsection 3.2). Input RVs are assumed independent (this includes RVs from different layers, e.g., Tf and Tc, as well as RVs within a single layer, e.g., tc and Tc). In general this choice is reasonable, however, in adiabatic conditions air temperature and density are not independent, causing correlation between temperature and transmission (proportional to density). This dependence could, in principle, be handled using the ideal gas law.

H0 and H1 radiances along a fixed LOS are (necessarily) measured at different times, and therefore we assume that all the population parameters in the radiative transfer model remain unchanged (stationary) during the measurement period and that the sampling is independent. For example, we assume that there is no drift in the population parameters or distribution of the foreground temperature, hence, the mean and variance (and all higher moments) for Tf are fixed. In a recent paper [30] it was shown that for a short period ~3-5 s temperature drift is small and the stationary assumption for temperature is reasonable.

3.1 Distribution of input terms

3.1.1 Temperature – Planck blackbody function

The Planck blackbody functionB(T)is given by

B(T)=k1ek2/T1
where, for a given wavelength λ in microns and radiance units W/cm2/srm, k1=1.1910×104λ5 and k2=1.4388×104λ1; for λ in wavenumbers (cm−1) and radiance units W/cm2/sr/cm−1, k1=1.1910×1012λ3 and k2=1.4388λ.

Assume that the temperature, T, is a normally distributed random variable (a commonly made assumption, supported by the central limit theorem) with mean μT and variance σT2, i.e., T~N(μT,σT2). Deriving the exact pdf of B(T) via transformation of variables is straight-forward, however exact moments of the distribution (required in our model) are not computable analytically. Instead, an extremely good approximation is that B(T) is lognormally distributed, LN(μB,σB2) (for which all moments are easily computed), which is shown as follows.

Typical terrestrial values of μT are on the order of 300 K with variances at least an order of magnitude smaller. Under these conditions, 1/T is approximately linear (e.g., the points 1/(μT±3σ) are close to equidistant from the median 1/μT and therefore the mean is close to 1/μT). Therefore, a first-order Taylor expansion of 1/T around T0=μT, giving 1/TμT1μT2(TμT), leads to the excellent approximation 1/T~N(μT1,μT4σT2). It follows (Appendix A) that exp(k2/T)~LN(k2μT1,k22μT4σT2).

Were it not for the “-1” term in the denominator of (6), the inverse and multiplicative properties of the lognormal distribution (see Appendix A) could be used to show that B(T)~LN(lnk1k2μT1,k22μT4σT2). Although it is tempting to simply ignore the “minus one” term—on the grounds that typically exp(k2/T)>>1 in the LWIR—it creates a bias (shift) in the distribution that while small, is significant compared to the variance of the distribution. A better method is to approximate the denominator as a standard 2-parameter lognormal distribution. That is, let x=exp(k2/T)1 be the denominator of (6). The goal is to find population parameters μx and σx2 for a lognormal distribution that matches the mean and variance of x without neglecting the minus one. The mean and variance of x are computed (see Appendix A) as E(x)=exp(k2μT1+0.5k22μT4σT2)1 and V(x)=[exp(k22μT4σT2)1]E(x)2, respectively, and population parameters are computed with Eq. (A2) from Appendix A. Finally, the inverse and multiplicative properties of the lognormal distribution can now be used to show that the blackbody radiance is lognormally distributed, B(T)~LN(μB,σB2), where μB=lnk1μx and σB2=σx2. The fit is so good that B(T) may essentially be considered an “input term” to the model.

3.1.2 Transmission

We choose to represent the transmission of a layer in the model as a lognormally-distributed variate, t~LN(μt,σt2). Partially this is due to the convenience of working with lognormal distributions. More importantly, however, is a physical argument based on the statistics of multiplicative processes.

Consider the transmission of monochromatic radiation through a general layer composed of j sub-layers. The total transmission will be a product of the sub-layer transmissions, t=i=1jti. Each sub-layer transmission is a non-negative random variable, 0ti1. By the central limit theorem for multiplicative processes (see property (x) of Appendix A) t approaches a lognormal distribution, t~LN(μt,σt2). Note that values for μt and σt2 for distributions generated in this manner will naturally enforce the constraint that the probability for t>1 must be zero, i.e., Pr(t>1)=0. Note also that due to the multiplicative property of the lognormal distribution, if the transmissions of the sub-layers are themselves lognormally distributed, ti~LN(μi,σi2), then t~LN(μt,σt2) with μt=μi and σt2=σi2.

The above central-limit argument is likely very appropriate for foreground and background layer transmissions, since these tend to be due to the ambient atmosphere along a path that may be long (e.g., a background layer for LOS to space). The transmission of the cloud is qualitatively different than that of the foreground or background layers in the sense that the cloud is localized and (in our remote sensing applications) it is usually due to an external source (e.g., a man-made cloud). An additional justification for the cloud transmission tc to be lognormally distributed is based on the assumption of Gaussian statistics for concentration that is commonly used in plume dispersion models [31, Ch. 18]. Since the optical depth is proportional to concentration, this implies tc=exp(τc)=LN(μτc,στc2), where μτc and στc2 are the mean and variance of the normally-distributed optical depth.

As already mentioned, a real (i.e., physical) transmission is always bounded 0t1. An arbitrary lognormal distribution will always be positive but may not have a non-zero probability for t>1. Because the user may choose population parameters for the layer transmissions where Pr(t>1)>0, we actually represent all transmissions in the model as truncated lognormal distributions that enforce an upper bound at 1 (if the probability that t>1 is negligible—e.g., 0μtησt where η is a factor large enough, for example η=4, to ensure a negligible probability for the normally-distributed optical depth to be negative—then truncation will have no effect). Moments of the truncated lognormal distribution are given in Appendix A, property (ix).

3.1.3 Radiance

We assume that the ambient atmospheric radiance L~LN(μL,σL2) is lognormally distributed (e.g., foreground and background layers for a slant-path LOS—LOS2 in Fig. 1—where the layers are modeled generically without assuming homogeneity). This choice is partially motivated by the fact that radiance values must be non-negative (the lognormal distribution naturally enforces this constraint) and that the lognormal distribution is easy to work with. More importantly, however, is that there is a physical argument to be made that is based on radiative transfer theory.

The formal solution of the classical radiative transfer differential equation [32] [Ch. 11, Eq. (3)] for an atmosphere in local thermodynamic equilibrium is given [32] [Ch. 11, Eq. (4)] as an integral for the multiplication of a blackbody function along a slant path with a slant path transmission to an observer. Approximating the integral as a sum shows that the radiance is given by a summation of t × B(T) terms. We already showed that the blackbody and transmission functions are both lognormally distributed. With the addition property of lognormals (see property (vi) in Appendix A), the result is lognormally distributed radiance.

If the source radiance, Ls, is an atmospheric radiance, it is reasonable to assume that it is lognormally distributed, as the above argument would apply. For source radiance that is due to emission from an opaque object, we assume a lognormal distribution also. If the object is a blackbody, then this is justifiable based on section 3.1.1. For graybody radiation, where Ls=εsB(Ts), choosing a lognormal distribution for the radiance implies either (i) εs is constant and property (iii), Apendix A, establishes Ls~LN(lnεs+μB,σB2), or (ii) εs~LN(μεs,σεs2) and property (v), Appendix A, establishes Ls~LN(μεs+μB,σεs2+σB2). In the latter case, it must be that the population parameters of εs are such that Pr(εs>1) = 0 for the quantity to be strictly physical.

3.2. Distribution of output terms using Johnson SU

The primary output terms of the pdf model are the total at-sensor radiances M(0) and M(1), as well as the differential radiance ΔM; these are the terms that a sensor has access to directly (i.e., can measure). The thermal contrast, ΔT, is also fundamentally important due to the key role it plays in enabling LWIR detection. We model these terms with the Johnson SU distribution. In this section we explain the process and the reasons why the SU distribution is successful as a single distribution to describe these terms. We also discuss the use of the Johnson SU distribution to describe intermediate terms in the model—terms of secondary importance such as at-sensor radiance terms for the individual layers, Mf, Mc, Mb, and Ms—which can be more challenging for the SU distribution to represent.

The previous section showed that it is reasonable to model all of the input RVs in Eq. (1) as lognormal distributions. While physically justifiable, this is also very convenient since a product of lognormal distributions is (analytically) lognormally distributed, and a sum of lognormal distributions is (approximately) lognormally distributed. Ignoring the presence of emission terms such as εc=1tc (e.g., assuming tc was constant or that εc was lognormally distributed) would lead to a lognormal distribution for the total at-sensor radiance (where correlation between terms in the sum—caused by tf, tc, and tb appearing in multiple places—can be handled via property (viii) in Appendix A). However, the problem is that the emissivity ε=1t when t~LN(μt,σt2) does not follow the standard 2-parameter lognormal distribution, and is instead described by a lognormal distribution that has been reversed (reflected about zero) and shifted. While the moments of this distribution are easily computed in terms of the moments of t (e.g., E(ε2)=12E(t)+E(t2)), it is difficult to determine the distribution of the product ε×B(t), for example. Equivalently, expanding terms such as Mc=tf(1tc)B(Tc) to become tftftcB(Tc) shows that a difference of (correlated) lognormally-distributed variates enters into the model. Theoretically, the standard 2-parameter lognormal distribution is a poor candidate to approximate a difference of lognormals since there is no guarantee that values would be positive. This fact, combined with the fact that transmissions in our model may actually be truncated lognormal distributions (for which the nice multiplicative and additive properties no longer apply), motivated us to use the Johnson system to fit the moments of the model outputs. A secondary advantage to this is that one may relax the need to assume a specific distribution for the model inputs, and instead may simply specify the first four moments of the input variates.

The Johnson system contains three 4-parameter distributions, SL, SU, and SB, which, along with the normal distribution, allows fitting any combination of values for the set of moments, {E, V, S, K}. All distributions in the system may be considered transformations of a normal variate. Typically distribution systems are characterized by the Pearsonian coefficients β1=S2 and β2=K, where each distribution in the system corresponds to a region, curve, or point in the β1, β 2 plane. The SL distribution is a lognormal distribution with the added flexibility of scale and location parameters (and thus the distribution may be shifted and the skewness may become negative); despite being defined as a four-parameter distribution only three of the parameters are identifiable. Both the Johnson SL and the standard two-parameter lognormal distributions are described by the same curve in the β1, β 2 plane (see Fig. 2 ). Any point below the lognormal curve is in the SU region, which is an unbounded (and unimodal) distribution. Any point above the lognormal curve is in the SB region, a bounded distribution that may be unimodal or bimodal. The normal distribution is a point in Fig. 2 and is a limiting form of the SL (and SU and SB) distributions; the SL is a limiting form of the SU (and SB). The fact that the lognormal distribution appears as a special case of each distribution in the Johnson system is an important property given that (as stated above) the model predicts lognormally distributed radiance under certain conditions.

 figure: Fig. 2

Fig. 2 Standard β1,β1 chart, where the y-axis is reversed (by common convention). The lognormal curve (corresponding to both LN and SL distributions) is given by Eq. (A1) and is shown in red. The normal distribution is a point in blue at coordinates (0,3). The area “below” the lognormal curve contains all SU distributions. The area above the lognormal curve corresponds to the SB region. All probability distributions must lie below the shaded gray area in the upper right corner of the plot.

Download Full Size | PDF

The SU distribution is particularly important because: (i) we observed that most of our simulations produced (β1, β 2) values that—if not on the lognormal curve—were below it, and thus in the SU region; (ii) analytical moments are relatively easy to derive and there are good procedures for method-of-moments estimation (details are in Appendix B); (iii) the SL and the LN distributions are limiting cases of the SU, and thus the SU distribution can represent these distributions with an error that is as small as desired—this means that the model outputs may always be considered to be SU thereby avoiding the complexity of having to switch to or from an SL representation.

There are times when the moments of model outputs are above the lognormal curve in the SB region (generally caused when the truncated lognormal distribution is necessary for transmission variates). However, dealing with the SB distribution is difficult due to the complexity of the moments; estimation must be performed via quantiles or even frequencymoments [28]. Therefore, even when the distribution to be fit is in the SB region, we use an SU approximation (we project the point in the SB region to the closest point on the lognormal curve and then use the SU—see details in Appendix B, property (ii)—generally the Euclidean distance to the lognormal curve is small, and the SU fit is reasonably good). Therefore, the SU distribution is the only member of the Johnson system that we need to consider.

The general procedure for determining the pdf of model outputs is to analytically compute the first four moments (E, V, S, and K) for each model output that we desire to fit with a pdf, say, M(1) under LOS2. E, V, S, and K are a function of the raw moments of M(1), which are in turn a function of the population parameters of the model inputs. The raw moments are found by, e.g., expanding the function (Lf+tf(1tc)B(Tc)+tftcLb+tftctbLs)n (where n is 1 through 4; obtaining 5 terms for n=1, 15 terms for n=2, 35 terms for n=3, and 70 terms for n=4) and taking the expectation of each expression, using properties of the expectation operator given in Appendix D. Each expression contains terms that are a product of up to six RVs (in this example) appearing to powers up to 4. Because of independence, the expectation operator distributes, e.g., E(tf3tc2Lb)=E(tf3)E(tc2)E(Lb). E, V, S, and K, are then computed as a function of the raw moments using Appendix C. This procedure, though tedious, is nevertheless straight-forward and may be applied to all of the model outputs. Given the analytical moments, then the estimation procedure in Appendix B is used to determine the four population parameters (a, b, c, d) of the SU distribution, e.g., M(1)~SU(a,b,c,d). If the moments are located in the SU region (or on the lognormal curve), then the corresponding moments of the SU fit (see Eq. (B4) in Appendix B) will match exactly. If the distribution to be fit is in the SB region, then the skewness and kurtosis will be slightly off, and only the first two moments will be fitted exactly.

In the remainder of this section, we highlight several individual model outputs, how well they are modeled with the SU distribution, and conditions for them to be lognormally distributed.

3.2.1 Homogeneous layer radiance and at-sensor radiances

As previously mentioned, the possibility of truncated lognormal distributions for the transmission terms and the presence of emission terms ε=1t are the primary reasons for model outputs to deviate from the standard 2-parameter lognormal distribution. Therefore, terms such as Mc—or Lf and Lb using the homogeneity assumption (Eq. (3))—are potentially the most problematic since both transmission and emission terms appear. Our observations indicate that terms such as Mc are the most likely to be poorly fit by the SU distribution, i.e., it is possible for them to be in the SB region. However, even when the component at-sensor radiances are poorly fit, the total at-sensor radiance tends to be remarkably well fit by the SU, and usually M(1) is even better fit than M(0). A reason for this may stem from the fact that the SB region (see Fig. 2) is characterized by smaller kurtosis values for a given skewness than the SU (or alternatively larger skewness for the same kurtosis). M(0) and M(1) are a convolution (sum) of the component at-sensor radiances where convolution is a smoothing and broadening operation. We theorize that the convolution of terms helps increase the value of the kurtosis relative to the skewness, making it more likely for total at-sensor radiance terms to be in the SU region. M(1) is a convolution of more terms, and thus would be more likely to be in the SU region than M(0). Regardless of the reason, M(0) and M(1)—the primary outputs of the model—are particularly well fit by the SU distribution.

The ability to approximate at-sensor radiances as lognormals is attractive as it eliminates the necessity of using a fitting routine for the SU due to the elegance of lognormal distributions that are easily manipulated. Component at-sensor radiances such as Ms (or Mb under LOS2 when the homogeneity assumption is not used) will be lognormally distributed as long as the transmissions are not significantly truncated. On the other hand, Mc will only be lognormal if the transmissions are not truncated and if 1tc may be well approximated by a lognormal distribution (which occurs when tc is nearly symmetric), in which case Mc=tf(1tc)B(Tc) is a product of three lognormal variates. If all of the component at-sensor radiances are lognormally distributed, then the total at-sensor radiance will also be (approximately) lognormally distributed (based on the additive property of correlated lognormals in Appendix A—where the correlation is caused by the transmissions appearing in multiple terms).

We note that a further simplification for the total at-sensor radiance (or any other quantity) that is lognormally distributed is the normal distribution. The normal is the limiting form of the lognormal as σ20. Thus, if the skewness and kurtosis of the lognormal distribution are not too far from 0 and 3, respectively, then a normal distribution may be used instead. For example, at wavenumber of 1000 cm−1, μT=300K and σT=5K, the blackbody lognormal function B(T)~LN(μB,σB2) (see Section 3.1.1) is given withμB=11.5 and σB2=0.0064, for which the skewness and kurtosis (Appendix A) are 0.24 and 3.1, respectively. Hence, B(T) can be well-approximated by a normal distribution B(T)~N(μB,σB2).

3.2.2 Differential radiance

The differential radiance, ΔM=M(1)M(0), is particularly well-suited to a Johnson SU distribution. We can theoretically prove that a difference of independent SU variates, z=xy, always remains in the SU region. Although this does not establish that a difference of SU variates is analytically an SU distribution, it does say that an SU fit will always be able to exactly match the first 4 moments. Additionally, even when one or both of M(0) and M(1) are in the SB region (and therefore outside of the SU), it is possible for ΔM to be in the SU region. Thus, the Johnson SU is very appropriate for fitting ΔM.

In the physical model, the differential radiance given by Eq. (4) is the information in the measured radiance that is due to the cloud, i.e., it captures only those photons that interact with the cloud. In the stochastic model the interpretation of ΔM is not as straight-forward: ΔM=M(1)M(0) is a difference of RVs that are assumed to be (independently) sampled at different times, and fluctuations in the input RVs will cause a differential radiance between terms that do not involve the cloud. Thus, in the stochastic model ΔM captures both “informative” cloud-photons, as well as “uninformative” photons that are due to clutter fluctuations. For example, consider the foreground radiance, Lf, which is distributed identically and independently under H0 and H1 conditions. The mean E(Lf|H1Lf|H0)=0, but the variance var(Lf|H1Lf|H0) is the sum of the variances of Lf |H0 and Lf |H1; therefore, the differential radiance will include some photons that originated in the foreground layer and never interacted with the cloud. Note that correlations between H0 and H1 quantities may be included in this analysis, but here we assume independent temporal sampling.

When M(0) and M(1) analytically follow the lognormal distribution, the differential radiance ΔM is a difference of independent lognormals. We can show analytically that a difference between two independent lognormal distributions is always in the SU region (below the lognormal line in the β1,β2 plane), and in fact, the fit is so fantastic it is tempting to think that the SU distribution might even be the analytical pdf for the difference of lognormals.

In the stochastic model, we use binary hypothesis testing [19] with respect to the radiance distributions M(0) and M(1) to determine the detection potential of the remote sensing scenario that is due to scene noise (e.g., the likelihood that at-sensor radiance may be attributed to the presence of a cloud along with the corresponding probability of false alarm). We are aware of the fact that detecting a presence of a cloud with only one wavelength is problematic as it is susceptible to drift, atmospheric disturbances, and interferences (using multiple wavelengths, these effects may be mitigated due to the wavelength dependence of the different events).

3.2.3 Thermal contrast

Determining the distribution of the thermal contrast in radiance units—defined in (4)—is qualitatively similar to determining the distribution of differential radiance, and raises no additional challenges. However, determining the distribution of ΔT in units of brightness temperature (Eq. (5)) is difficult because of the appearance of the inverse Planck blackbody function, B−1(·), that appears in the expression ΔT=B1(Lin)Tc. If Lin is a lognormally distributed variate, then the steps in showing that B(T) is approximately lognormal when T is normal can be reversed to show that Tin=B1(Lin) is approximately normal, where

B1(L)=k2ln(k1/L+1).

However, in general Lin will be described (or approximated) by an SU distribution. Transformation of variables applied to Eq. (7) is too complicated when the input distribution is the Johnson SU (scaling and translation relationships for the SU distribution exist—see Appendix B—but more complicated transformations generally do not lead to densities that can be integrated); computing moments of Tin analytically from the moments of Lin is also intractable due to the nonlinearity of the transform. However, moments can be computed easily when the inverse blackbody function is linearized (approximated) with a first order Taylor expansion. Therefore, we will describe two complementary methods to determine the pdf of ΔT: (i) approximating Lin as a lognormal distribution and using the exact functional form of B−1; or (ii) propagating exact moments of Lin using an approximated form of B−1.

In the first method, k1/L~LN(lnk1μL,σL2), and y=k1/L+1 may be approximated very well as a 2-parameter lognormal distribution with population parameters {μy,σy} found from fitting E(y)=k1exp(μL+σL2/2)+1 and V(y)=[exp(σL2)1]E(y)2 using Eq. (A2). Then, lny~N(μy,σy2) by definition. The previously-used first order Taylor approximation for the reciprocal of a normal variate (Section 3.1.1) establishes B1(L) to be approximately N(μy1,μy4σy2). Tc is also normally distributed, and thus the distribution of ΔT is approximated by N(μy1μTc,μy4σy2+σTc2). These steps are simply the reverse of the steps in Section 3.1.1, and are appropriate whenever the SU distribution for Lin is not “too far” from the lognormal line. However, as the kurtosis and skewness of the SU gets further from the lognormal, the output distribution for Tin becomes skewed and the normal approximation (which is necessarily symmetric) will begin to provide a poor fit.

In the second method, we approximate Eq. (7) with a first order Taylor expansion around the mean of L. Let L0 denote E(L), then

B1(L)=B1(L0)+(LL0)B1L|L=L0+O((LL0)2).

The result is that B1(L) now has a linear relationship with random variate L, B1(L)α1L+α0, with constant gain α1 and offset α0 given by

α1=B1L|L=L0,α0=B1(L0)(B1L|L=L0)L0,
respectively, where

B1L|L=L0=B1(L0)L0(k1+L0)ln(k1L0+1).

The moments of B1(L) and ΔT=B1(Lin)Tc may now be easily computed using Appendices C and D, and fit with an SU via property (ii) of Appendix B.

Note that if Lin analytically follows an SU distribution, Lin~SU(a,b,c,d), then B1(Lin)~SU(α1a+α0,bα1,c,d) via property (iii) in Appendix B, where the absolute value and signum functions are not needed since α1>0. Subsequently, moments of B1(L) can be found from Eqs. (B3) and (B4) instead of using Appendices C and D. Additionally, if Tc is a constant, then neither Appendices C and D nor the SU fitting are needed since Appendix B property (iii) establishes ΔT~SU(α1a+α0Tc,bα1,c,d).

The first order Taylor expansion is appropriate when the variance of Lin is small enough such that the inverse blackbody function is approximately linear over the domain of Lin. Which method is better will be determined by the relative error in approximating Lin as a lognormal distribution compared to the error in assuming the blackbody function is linear. In all simulations shown in Section 5, the blackbody function is close to linear, and so the second method works well. Thus, in the remainder of the paper we only focus on the second method.

4. Simulations

In this section we describe the details of simulations for the detection of triethyl phosphate (TEP) vapor, including how population parameters were chosen. All simulations are run at 1049.3 cm−1 (which corresponds to peak absorption for TEP) and radiance values are reported in units of W/cm2/sr/cm1. With given population parameters (μ,σ2) for the input variates to the model, we sample values from the appropriate random number generator (106 samples of each variate, creating 106 sets of values). For example, layer transmissions are sampled from (possibly truncated) lognormal distributions, temperatures are sampled from normal random number generators, and layer radiances (e.g., Ls) are sampled from the lognormal distribution. Samples for the intermediate model terms and the primary outputs are computed via applying the model equations to each set of random values, creating 106 values for each output term. We ensure that all RVs for H0 are sampled independently (i.e., sampled separately) from H1, thus we have an additive signal model where H0 and H1 are uncorrelated (in field measurements H0 and H1 are measured at different times, and therefore are assumed independent). From the sampled RV data we compute histograms (sampled pdfs) that we compare with the analytical pdfs predicted by our model. Predicted pdfs are solely a function of the population parameters for the input variates.

We present two types of simulations. The first is a physical scenario of a sensor situated 1 m above the ground in a horizontal LOS geometry that is viewing a vapor cloud located at a horizontal distance of 1 km. At the end of the LOS (10km away from the sensor), there is a topographical target (a mountain). The second type of simulation is for a hypothetical scenario with a challenging set of parameters for the RVs (e.g., a combination of transmissions, temperatures and radiance values) in order to test our model’s accuracy and to explore its limitations. The parameters chosen for the hypothetical scenario can be associated with a physical scenario, but this is not our intention. For both types of simulations we chose horizontal LOS geometry, a geometry that is more challenging than a slant path due to the added complexity of the foreground and background emission terms (discussed in Section 3.2). Simulation results for slant path geometry (not shown) were very good. Table 1 summarizes population parameters as well as means and variances for all simulations.

Tables Icon

Table 1. Parameters for Simulations

4.1 Physical simulation

The 3-layer model geometry in the physical simulation is similar to a previously used scenario [4]: a LOS near the horizon (1.7 degree up-look angle), foreground pathlength of 1 km at temperature of 288 K; a cloud at temperature of 288 K (i.e., at equilibrium with the ambient atmosphere); background layer of 9 km path at the cloud temperature; a mountain (that serves as an external blackbody source) with emissivity of 0.85 (typical of many clay soils in the IR) at temperature of 278 K (i.e., 10 K colder than the ambient air temperature, a scenario that can occur in the morning hours). Note that a 1.7 degree up-look angle is shallow enough that horizontal distances and slant-path distances are virtually identical. We use previous stochastic simulations [6] as guidelines for setting population parameters for transmission and temperature functions in this work.

The standard deviation of the transmission, V(t), for IR wavelengths and horizontal LOS (elevation angle < 5°) is between 2105 to 8105 for which the mean transmission values E(t) are between 0.7 to 0.97 (see Figs. 4 and 5 in Reference [6]). Note that the transmission of the atmosphere is highly wavelength dependent. At wavelength 993 cm1, a relatively clear atmospheric transmission, the ground level extinction coefficient is 0.07 km1; for 1053cm1, due to ozone absorption, the extinction coefficient is 0.12 km1; for 853 cm1, due to water vapor absorption, the ground level extinction coefficient is 0.3 km1.

In our “physical” simulation we choose the foreground transmission for horizontal path of 1 km to be lognormally distributed with E(tf)=0.9 and V(tf)=5105 for which the population parameters (μt,σt2) can be computed from V(t),E(t) with Eq. (A2) to be μtf=0.105 and σtf2=3.09109. Thus, the foreground transmission is tf~LN(μtf,σtf2) and the foreground optical depth is τf~N(μtf,σtf2). The background layer is 9 times longer in pathlength than the foreground layer; population parameters are derived from the relation τb=9τf. Thus, τb~N(μtb,σtb2) and tb=exp(τb)=LN(μtb,σtb2) with μtb=9μtf and σtb2=81σtf2. Using property (i) of Appendix A, the mean and standard deviation of the background transmission is E(tb)=0.387 and V(tb)=1.94104. Note that this produces a variablilty for the background transmission that is a little larger (factor of 2.5) than observed in [6]. Truncation is not required for any of the transmission variates in the physical simulation.

The variance of the atmospheric temperature is computed with a turbulence theory [6]. The variance of temperature due to turbulence between two points separated by distance Δx is V(T)=CT2Δx2/3, where CT2 is the temperature structure constant. CT2 is proportional to the refractive index structure constant, Cn2 (Eq. (2) in [6] where a typo needs correction: the left-hand side should have been Cn2 and the right-hand side should have been CT2). Both CT2 and Cn2 are a function of elevation. The Tatarski model (see [33] Eq. (15).4.35) gives the height dependence of the refractive index structure constant as Cn2(h)=Cn2(1)h4/3 where h>1m (supported by experiments for h up to at least 100m). We use a value of Cn2(1)=1013m2/3 which is characteristic of moderately strong turbulence; the corresponding value of the temperature structure constant (for the 1976 US standard atmosphere [34] with surface temperature of 288K and 46% relative humidity) is CT2=0.088K2m2/3. Using the Tatarski height dependence and assuming (for convenience) that wind speed u=1ms1 and sampling rate dt=1s(and hence Δx2/3=(udt)2/3=1), the variance of temperature over a slant-path extending from height h1 above the ground to height h2 is estimated by V(T)=h1h20.088h4/3dh=0.264(h11/3h21/3). For LOS near the horizon at 1.7° elevation angle, the variance of the foreground layer (1km path from h1=1m to h2=30.7m) is 0.180 K2 and for the background layer (9km path from h1=30.7m to h2=298m) is 0.0448 K2. The variance of the cloud temperature is simply set to 0.088 K2. Temperatures in the model are normally distributed, and as explained in 3.1.1, corresponding distributions for the blackbody radiance are very close to lognormal distributions; Table 1 gives population parameters and moments for these quantities.

The radiance from the external source, Ls, is modeled as a lognormal distribution with E(Ls)=5.15106 and V(Ls)=1.791015. This choice is arbitrary but is consistent with graybody radiation Ls=εsB(Ts) with Ts~N(278,0.088) and εs distributed according to a lognormal distribution with E(εs)=0.85 and σεs2=σBTs2; population parameters and moments are given in Table 1.

For the vapor cloud population parameters we used field data from an experiment where a TEP vapor “cloud” was confined to a closed chamber [3] of size 3m×3m×6m. In the experiment, concentration-pathlength values inferred from LWIR remote sensing measurements when 1 cm3 of TEP was vaporized in the chamber were fit with a normal distribution. We extract the mean and standard deviation of the TEP concentration-pathlength (ρ=56.60mg/m2 and σ=2.762 mg/m2, respectively; see [3], Table 1, i=4) and convert the values to optical depth (τ=αρ, where α=8.389104m2mg1is the absorption coefficient at 1049.3 cm−1). Thus, the mean optical depth is μτc=α×56.60=0.0475 with variance στc2=α2×2.7622=5.369106, which means that the TEP cloud’s transmission in the physical simulation is distributed as tc~LN(μtc,σtc2) where μtc=μτc and σtc2=στc2. Note that in our model the dimension of the cloud is immaterial, only the optical depth is important.

4.2 Additional simulations

For the second type of simulations (intended to challenge our model) we increase the variance of the temperature of the three layers (Tf,Tc,Tb) by ten-fold, σTf210σTf2=1.80K2, σTc210σTc2=0.88K2, and σTb210σTb2=0.448K2. The source radiance is altered by increasing the variance of Ts by a factor of 100, from 0.088K28.8K2. The variances of all layer transmissions are set to the large value of 0.01 (standard deviation of 0.1 transmission unit). For “Simulation 1”, the mean transmission value for all layers is set to 0.95 (i.e., optically thin layers), which results in a lognormal distribution with ~29% of values greater than 1 (and thus the effects of truncation are significant). For “Simulation 2”, the mean transmission values for all three layers are set to 0.1, such that each layer is closer to being optically thick. Although less than 0.1% of transmission values are greater than 1 for Simulation 2 (and thus truncation is not significant), the transmission pdfs are highly skewed (they are sharply peaked with a mode close to zero and a long tail) which challenges the model. Population parameters and corresponding means and variances for the various distributions for Simulations 1 and 2 are summarized in Table 1. Note that the mean and variance for the transmission variates are reported prior to truncation. The actual mean and variance of the truncated lognormal distributions in Simulations 1 and 2 are E(t)=0.900, V(t)=4.20103, and E(t)=0.0991, V(t)=8.89103, respectively.

5. Results and discussion

We compare the sampled data histograms of the radiative transfer quantities (Eqs. (1-5)) to those predicted with the Johnson SU distribution. To assess how close the two are, we compute a symmetric Kullback-Leibler (KL) distance [35], KL=(KL(α,β)+KL(β,α))/2. The KL distances are noted in the figure captions. The smaller the distance the better the agreement. The computed value of the KL will depend on how the histograms are computed (e.g., the bin spacing used). To get an idea of the significance of the KL value, we computed the 99th percentile for KL distances between a histogram computed from 106 samples drawn from N(0,1) and the theoretical density; the result of 0.0013 means that KL values must be greater than this to establish that the two densities are not identical. In Figs. 3 -4 we show the comparison for the physical simulations and in Figs. 5 -8 for the two hypothetical scenarios. In Table 2 we summarize the Johnson SU population parameters, {a,b,c,d}, for the primary model outputs M(0), M(1), and ΔM, as well as for ΔT. The pdf of ΔT is computed using the second method outlined in section 3.2.3, using Eq. (8).

 figure: Fig. 3

Fig. 3 Comparison of data histogram (solid red) and model-predicted pdfs (thick dotted green) for several intermediate model outputs for the physical simulation. The L and M radiances (in units of W/cm2/sr/cm−1) are modeled with the Johnson SU distribution, andB(Tc)is modeled with a lognormal. Top row: emission radiances for the foreground and background layers, Lf=Mf, Lb, and cloud layer emission radiance Lc=(1tc)B(Tc). Middle row: At-sensor radiances for the external source, Ms(0), and the background layer, Mb(0), under the H0 scenario and the at-sensor radiance from the cloud layer,Mc. Bottom row: At-sensor radiances for the external source, Ms(1), and background layer, Mb(1), for the H1 scenario and the Planck blackbody radiance for the cloud, B(Tc).The KL distance between theory and sampled data for all plots is very small (<0.002), hence excellent agreement between the data and the model.

Download Full Size | PDF

 figure: Fig. 4

Fig. 4 Comparison of data histograms (solid lines) and model pdfs (thick dotted lines) for the primary model outputs for the physical simulation. Top left: at-sensor radiance under H0 and H1 hypotheses, M(0) and M(1), respectively. Top right: Differential radiance, ΔM=M(1)M(0). Bottom left: Thermal contrast (in Kelvin), ΔT=B1(Lin)Tc, computed using Eq. (8). Bottom right: Receiver operating characteristics where the line of equal probability of false alarm and detection is shown in gray. The agreement between model predications and the data is excellent (KL distances for M(0),M(1), ΔM, and ΔT of 0.0014, 0.0015, 0.0014, 0.0016, respectively).

Download Full Size | PDF

 figure: Fig. 5

Fig. 5 Same as Fig. 3 but for Simulation 1. The agreement between the model and the data for L and M radiances is poor (large KL values of ~1.3) with the exception of Ms(0)and Ms(1)for which the fit is good and the KL distance is small (0.005 and 0.008 respectively).

Download Full Size | PDF

 figure: Fig. 8

Fig. 8 Same as Fig. 4 but for Simulation 2. The KL distance between theory and sampled data for M(0), M(1), and ΔM is very small (0.002, 0.0014, and 0.0013, respectively) indicating good agreement between the model and data. For ΔT the fit is reasonably good with a KL of 0.007. Detection performance for this scenario is reduced in comparison to Simulation 1: although the optical depth of the target cloud is much greater in Simulation 2, the thermal contrast is reduced and there is greater attenuation from the foreground layer.

Download Full Size | PDF

Tables Icon

Table 2. Parameters for Johnson Distributions, x~SU(a,b,c,d)

In the first panel of Fig. 3 we show (top row) the layer emission radiances Lf=Mf, Lb, and the cloud layer emission radiance Lc=(1tc)B(Tc). In the middle row we show the H0 at-sensor radiances for the external source and background layers (Ms(0),Mb(0)), and the cloud at-sensor radiance Mc; in the bottom row we show H1 at-sensor radiance (Ms(1),Mb(1))and the cloud blackbody radiance, B(Tc). The layer radiances and at-sensor radiances are modeled with Johnson SU statistics and the blackbody B(Tc) is modeled with lognormal statistics. Figure 3 shows that the approximation of a blackbody with a lognormal pdf is excellent. Other blackbodies (foreground and background) in Table 1 (all simulations) also produced practically an exact match between sampled data and a lognormal.

The agreement between our model and the data for the emission terms and the at-sensor radiances is excellent (KL distances for all RVs in the physical simulation are less than 0.002).

In Fig. 4 (top left panel) we show the pdf for the total at-sensor radiance under H0 and H1. In the top right panel we show the pdf for the differential radiance ΔM(Eq. (4)). The pdfs are in execellent agreement with the data (KL distances for M(0), M(1), and ΔM are less than 0.002). In the bottom left panel we show the pdf for the thermal contrast (in Kelvin) ΔT=B1(Lin)Tc where Lin=Lb+tbLs. The agreement with the data is excellent, KL distance of 0.0016. The negative ΔT indicates that the cloud is always in emission mode, 0pΔT(dT)dT=1, and the probability for ΔT>0 is zero. Nevertheless, the pdf for ΔM shows that the cumulative probability for ΔM<0 is 0pΔM(dM)dM=0.13. Thus, due to the fluctuations in the foreground layer, the cloud that is “physically” in emission mode will appear in the measurements with M(1)<M(0), about 13% of the time, and the cloud may be erroneously interpreted as a cloud in “absorption” mode. This also has implications towards averaging spectra in order to increase the signal to noise ratio: fluctuations in the polarity of ΔM means that naively averaging instances of ΔMmeasured by a sensor will decrease noise but will also decrease the mean (signal). In the bottom right panel we show the performance of detection, with the standard receiver operation characteristics (ROC) curves (the probability of detection, PD, as a function of the probability of false alarm, PFA). The ROC curve can be used to assess the theoretical performance of detecting the cloud due to scene fluctuations.

In Figs. 5-8 we show the two hypothetical scenarios of Table 1 with challenging parameters. We first discuss the SU fit for several of the intermediate terms in the model to illuminate the limitations of the SU distribution, and then we show that in spite of these limitations, the SU is an excellent fit for the important quantities that determine performance, namely, M(1), M(0),ΔM, and ΔT. In Fig. 5 (simulation 1) the KL distances for Lf, Lb, Lc, Mb(0), Mb(1), and Mc are large (~1.3), which indicates poor agreement with the SU model. The histograms for these parameters all have a large probability for radiance values near zero, and a discontinuity at zero that reflects the fact that radiance values may not be negative (the shape of the histograms is caused by the severe truncation necessary to model the transmission terms on the interval 0 to 1; many transmission values are close to 1, meaning that the emission values for each layer ε=1t are near zero). The SU distribution, however, is a continuous distribution defined on the interval to with no restriction regarding non-negativity. Thus, the SU fits for these terms are poor, and include (nonphysical) negative radiances. At-sensor radiances for the source layer, on the other hand, are reasonably well-fit by the SU distribution (KL distances for Ms(0) and Ms(1) are 0.005 and 0.008, respectively). The source layer is not modeled with its own transmission term, and the effect of attenuating the source radiance through truncated lognormal transmission terms (where a majority of values are close to 1) is small.

Despite the fact that intermediate model terms for simulation 1 pose problems for the Johnson SU fit, the primary model outputs, M(1), M(0), and ΔM—as well as ΔT—are very well fit by the SU distribution, with KL distances around 0.0013. These quantities, as well as the ROC curve for simulation 1, are shown in Fig. 6 . The ROC curve shows the challenging detection scenario posed by this simulation; despite the large thermal contrast and the same vapor optical depth as the physical simulation, it is only modestly above the 45 degree line (equal probability of false alarm and detection—a “coin-flip” detection scenario). Thus, scene fluctuations are large compared to the difference in average radiance between H1 and H0. In simulation 1 (Fig. 6) the cloud is in emission mode nearly always (the probability of ΔT>0is negligible, around 1105), but nevertheless the cloud appears in the measurements as an “absorbing” cloud, ΔM<0,with large (37%) probability, 0pΔM(dM)dM=0.37.

 figure: Fig. 6

Fig. 6 Same as Fig. 4 but for Simulation 1. The KL distance between theory and sampled data for M(0), M(1), ΔM, and ΔT is very small (0.0014, 0.0013, 0.0013, 0.003, respectively), hence excellent agreement between the model and the data. Although the optical depth is similar and the thermal contrast larger for Simulation 1 in comparison to the physical scenario, the performance is reduced for Simulation 1 due to larger fluctuations.

Download Full Size | PDF

In Fig. 7 (simulation 2), layer radiances Lf, Lb, and Lc show a reasonable SU fit (KL values between 0.03 and 0.06). In each case, the long tail to the left of the SU peak causes a small probability of non-physical negative values (~0.02%). At-sensor radiances Mb(0) and Mc are also reasonable (KL values around 0.75) but have a higher probability of negative radiances (3%). Fits for Ms(0), Ms(1), and Mb(1)are poor with large KL distances (greater than 2) and significant probabilities of negative radiances (13%, 27%, and 14%, respectively). The amount of truncation for transmission variates is small (<0.1%), thus we do not think the main reason for the poor fits. Regardless of the cause, the SU distribution is just not appropriate to describe the shape (rise and fall) of the histograms. The shape of the SU distribution is controlled by the skewness and kurtosis of the distribution; as shown in Fig. 2 the SU distribution only covers a subset of the possible values. The histograms in Fig. 7 are all in the SB region: Lf, Lb, Lc, Mb(0), and Mc are all relatively close to the lognormal line (~2 units or less) such that the SU approximation is still reasonable; Ms(0), Ms(1), and Mb(1)are significantly far from the lognormal line (10-100 units away) such that the SU approximations are poor. Evidently, certain types of very sharply peaked distributions that rise very quick onone side with a long tail on the other (as in Fig. 7) are difficult for the SU to represent. Despite the poor fit for some of the intermediate model outputs, Fig. 8 shows excellent agreement between the model and the histograms for M(1), M(0), and ΔM. In fact, these histograms are actually in the SU region and KL values are small (≤0.002). For ΔT the fit is good, though not perfect (KL of 0.008). The discrepancy between the model and the histogram for ΔT is not due to the validity of the linearization of the inverse blackbody function described in Section 3.2.3, but is due instead to the fact that the Lin radiance is actually in the SB region. The distance to the lognormal line is small, however, and the SU approximation for ΔT is reasonable.

 figure: Fig. 7

Fig. 7 Same as Fig. 3 but for Simulation 2. The KL distance for the L radiances are small (0.03-0.06) and are larger (~0.75) for Mb(0) and Mc. The agreement forMs(0), Ms(1), and Mb(1) is poor (KL of 2.7, 6, and 12, respectively).

Download Full Size | PDF

Figure 8 also shows the detection performance (ROC curve) for Simulation 2, which is even poorer than for Simulation 2. The larger optical depth of the three layers in Simulation 2 causes more emission from the cloud layer, but also more attenuation of cloud photons due to the foreground layer. Additionally, the thermal contrast is reduced (compare the magnitude of ΔT in Figs. 6 and 8) due to the fact that fewer photons from the mountain reach the cloud. In fact, in Simulation 2 the cloud will exhibit both radiation modes: 83% of the time in emission (0pΔT(dT)dT=0.83, computed from the histogram), and 17% in absorption (in Simulation 1 the cloud was always in emission). In the measurements, however, the cloud in Simulation 2 will appear in emission mode with probability of 47% (0pΔM(dM)dM=0.47). Thus, inferences made from ΔM regarding ΔT (that are valid in the physical model) can be misleading in practice, where variation in the radiative transfer terms causes a mismatch between ΔM and ΔT.

5. Summary

A probability model for a 3-layer radiative transfer model (foreground layer, cloud layer, background layer, and an external source at the end of LOS) has been developed. The 3-layer model is fundamentally important as the primary physical model in passive LWIR remote sensing, but it has not been useful for predicting performance due to clutter (scene variations).

This is (to our knowledge) the first probability model for a 3-layer radiative transfer model where all quantities are allowed to be stochastic and higher order statistics are included. The model provides statistical characterization of at-sensor spectral radiance due to scene fluctuations, that is, it can predict the detection potential in clutter-noise limited scenarios. Note that the probability model could be combined with sensor models and detection algorithms to predict the performance of a particular sensor.

The model is based on computing analytical moments (mean, variance, skewness, and kurtosis) of the output variates based on the moments of the input variates. Our choices for the input statistics are made based on physical theory, and we expect that they can accurately model real stochastic quantities. However, any distribution that has four computable moments can be chosen. Moments for input variates could also be measured empirically.

We assumed Gaussian (normal) statistics for temperature fluctuations, T~N(μT,σT2), which results in the excellent approximation that the Planck blackbody radiance is lognormally distributed, B(T)~LN(μB,σB2). Layer radiances and transmission variates are modeled as lognormal variates, both of which are justifiable based on physical arguments. The lognormal distribution has the nice property that it describes non-negative random variables (it is defined on the domain 0 to ∞), and thus is naturally applicable to describe radiance values. Transmission values are non-negative but also less than or equal to 1; therefore it is sometimes necessary (depending on population parameters) to represent transmission variates by truncated lognormal distributions. Other choices are possible (for example, the beta distribution is naturally bounded between 0 and 1 and can describe a wide range of shapes), however, the lognormal distribution is convenient to work with since products and sums of lognormal distributions are themselves lognormally distributed (or approximately so).

In special cases the model predicts that the total at-sensor radiance is lognormally distributed, however the lognormal distribution is not flexible enough to represent the general case. Therefore, we use the Johnson system of distributions to fit the moments of the output variates. There are other systems of curves that may be used for distribution fitting; we chose the Johnson system due to central role the logarithmic distribution plays in the system and the small number of distributions that appear in the system. The Johnson SU distribution is a particularly useful and flexible distribution that is appropriate for representing the at-sensor radiance under H0 and H1 hypotheses, M(0) and M(1), respectively, and the differential radiance, ΔM=M(1)M(0). These are the primary quantities that a sensor would be able to measure. The thermal contrast, ΔT (see Eqs. (4) and (5)), is also typically well-fit by the SU distribution. The thermal contrast is of interest as the principle physical quantity that enables LWIR vapor detection. Other intermediate output terms of the model may be well fit with the SU distribution; at times they would be better modeled with the Johnson SB distribution, but method-of-moments estimation is intractable with the SB, which precludes its use in our model. This is a disadvantage to the Johnson system, but in practice it is of little impact since the pimary model outputs are fit so well by the SU distribution, even when the intermediate terms would require the SB to be “optimally” fit.

In the traditional (non-stochastic) physical model, ΔM gives only information about the vapor cloud. In the stochastic model, due to scene fluctuations, ΔM includes photons that never interacted with the cloud.

Detection potential may be quantified by ROC curves derived from binary hypothesis testing on the M(0) and M(1) distributions. Fluctuations (in the temperature of the cloud and in the input radiance on the back of the cloud, Lin) can also cause the cloud to be in emission and in absorption at the “same time”, i.e., at a given moment there is a non-zero probability for the cloud to be in absorption and a non-zero probability for the cloud to be in emission. This has potential impacts towards processing spectral time-series data: it may be dangerous to attempt to increase the signal-to-noise ratio (SNR) via averaging spectra, since it would be possible for an emission and absorption spectrum to cancel. It is also possible that due to the fluctuations in the foreground layer, the polarity of ΔMwill be reversed, hence, a cloud that is in emission (i.e., the cloud output radiance into the foreground layer is greater than the incident radiance on its back side) may appear in the measurements with ΔM<0. Only with a probability model for radiative transfer one can study these types of phenomena.

In the probability model we implicitly assume that the population parameters for the three layers (and the external source) are stationary (i.e., are fixed) during the measurement period. Our probability model is a univariate model and thus can only address the radiative transfer model at a single wavelength. This is the first step in developing a multivariate model for hyperspectral remote sensing where the covariance between wavelengths plays an important role. The challenge in extending our univariate model to a multivariate model is mainly in extending Johnson SU to multivariate statistics where the covariance needs to be computed and estimated from Eq. (1) that is written for vector quantities. This research may lead to new detection algorithms tailored to exploit the Johnson SU statistics. Our pdf model is a “forward model” where with given population parameters for the radiative transfer model we predict performance. The challenge of estimating the population parameters form sampled data (i.e., an “inverse model”) is great and remains open.

Appendix A. The lognormal distribution

Let x be a normal distribution, x~N(μ,σ2), with density

px(x)=(2πσ2)1/2exp(12(xμ)2σ2).

By transformation of variables, y=exp(x) follows the standard two-parameter lognormal distribution, y~LN(μ,σ2), with density

py(y)=(2πσ2y2)1/2exp(12(ln(y)μ)2σ2).

The population parameters (μ,σ2) are the mean and variance of the associated normal distribution. We will now present some useful properties of the lognormal distribution.

  • (i) Statistical properties (moments)

    The raw nth moments of the lognormal distribution are given by:

    E(yn)=exp(nμ+12n2σ2)

    from which the following moments may be derived (using Appendix C) to be

    E(y)=exp(μ+12σ2),V(y)=[exp(σ2)1]E(y)2,S(y)=[exp(σ2)+2]exp(σ2)1,K(y)=exp(4σ2)+2exp(3σ2)+3exp(2σ2)3.

    The median of the lognormal distribution is simply median(y)=exp(μ), and the mode is given by mode(y)=exp(μσ2). Defining Pearsonian coefficients β1=S(y)2 and β2=K(y), the equation for the lognormal curve in the β1, β 2 plane (see Fig. 2) is given by

    {β2=β1+exp(4σ2)+exp(3σ2)+1β1=exp(3σ2)+3exp(2σ2)4}.

  • (ii) Estimation using the method of moments

    The population parameters for a lognormal distribution y~LN(μ,σ2) that matches the mean E(x) and variance V(x) of a random variable x, can be found by solving the simultaneous equations {E(y)=E(x),V(y)=V(x)} to be

    {σ2=ln(1+V(x)E(x)2)μ=ln[E(x)]12σ2}.

    Note that when attempting to fit a distribution to sampled data, other methods are usually preferred to the method of moments (maximum likelihood, for example) due to higher efficiencies [28]. Hybrid techniques may also be used, for example, replacing the mean by the median in the method of moments gives

    {σ2=ln(12+median(x)2+4V(x)2median(x))μ=ln[median(x)]}.

    However, for fitting theoretical distributions, where the moments E(x) and V(x) are known exactly, there are no concerns in using the method of moments.

    In the properties below, let yi~LN(μi,σi2) be independent lognormal variates (i.e., E(yinyjk)=E(yin)E(yjk)) and c a positive constant. We have analytically the following properties:

  • (iii) Scaling: z=cy~LN(c+μ,σ2),
  • (iv) Inverse: z=1/y=exp(x)~LN(μ,σ2),
  • (v) Multiplication: z=yi~LN(μi,σi2),
  • (vi) Division: z=y1/y2~LN(μ1μ2,σ12+σ22).

    The following two properties are approximately true, where in property (viii) the independence assumption is dropped:

  • (vii) Addition (approximation with Fenton-Wilkinson [19], ):
    z=iyi~LN(μz,σz2),μz=ln(iE(yi))12σz2,σz2=ln(1+iV(yi)(iE(yi))2).

    For two lognormal variables z=y1+y2 it was stated [20,21] that Fenton-Wilkinson moment-matching method breaks down if σy<4dB for either y1 or y2. In terms of signal to noise ratio, defined as SNRy=E(y)2/V(y), this restriction can be shown (by solving σy=ln(1+SNRy2)<4dB) to give the condition that a minimumSNRy>0.3 is required for the Fenton-Wilkinson approximation to hold. This property may be combined with (iii) and (v) to show that a linear combination of lognormal variates (with positive coefficients) is lognormally distributed, as long as the SNR restriction is met.

  • (viii) Addition of correlated lognormals

    For two lognormals y1~LN(μ1,σ12) and y2~LN(μ2,σ22) that are correlated with correlation coefficient

    ρ12=corr(y1,y2)=E(y1y2)E(y1)E(y2)V(y1)V(y2),

    we follow the spirit of Fenton-Wilkinson and use the method of moments—property (ii)—to find z=y1+y2~LN(μz,σz2) such that E(z)=E(y1)+E(y2) and V(z)=V(y1)+V(y2)+2ρ12V(y1)V(y2) are matched exactly. A sum of three correlated lognormals with correlations ρ12, ρ13, and ρ23 is easily computed in the same manner:

    z=y1+y2+y3~LN(μz,σz2),E(z)=E(y1)+E(y2)+E(y3),

    and

    V(z)=V(y1)+V(y2)+V(y3)+2ρ12V(y1)V(y2)+2ρ13V(y1)V(y3)+2ρ23V(y2)V(y3).

  • (ix) Moments of truncated lognormal

    For a lognormal variate y~LN(μ,σ2) that is truncated to be within 0y1 (e.g., when y is a transmission that may not be greater than 1) the nth raw moment may be written in terms of the untruncated moments:

    E(yn)=erfc(μ+nσ22σ2)erfc(μ2σ2)exp(nμ+12n2σ2)

    where erfc()=1erf() is the complementary error function. If the associated normal parameters (μ,σ2) are such that the probability Pr(y>1) is negligible (i.e., no truncation is required), the ratio

    erfc(μ+nσ22σ2)erfc(μ2σ2)1,

    and the expression for untruncated raw moments is recovered.

  • (x) Central limit theorem for multiplicative processes

    Consider a product of independent non-negative variates, y=zi. Taking the logarithm of this expression leads to a sum of (real-valued) independent variates, x=lny=lnzi. If, by the central limit theorem xN(μ,σ2), then the relationship y=exp(x) immediately implies that yLN(μ,σ2).

Appendix B. The Johnson SU distribution

The SU distribution is based on the following transformation of a normally distributed variate x~N(0,1):

y=a+bsinh(xcd)

where the parameters a and b>0 are location and scale parameters, respectively, and c and d>0 are shape parameters. The density for y~SU(a,b,c,d) is given by

py(y)=dexp{12[c+dsinh1(yab)]2}b2π[(yab)2+1].

The SU distribution can describe the lognormal and normal distributions as limiting cases: SUSL when c±, and SUN when d. The SU distribution is symmetric (skewness is zero) when c=0. Some useful properties of the SU distribution are presented below.

  • (i) Statistical properties (moments)

    The nth moments around arbitrary location θ are given by

    E[(yθ)n]=j=0n(nj)(b2)j(aθ)njr=0j(jr)(1)re(j2r)Ωω(j2r)2

    where ω=exp(21d2) and Ω=cd1. Raw moments may be found from this equation by setting θ=0, central moments are found by setting θ=E(y). Using Appendix C, the following moments may be derived:

    {E=abωsinh(Ω),V=12b2(ω21)(1+ω2cosh2Ω),S=ω(ω212)1/2ω2(ω2+2)sinh3Ω+3sinhΩ(1+ω2cosh2Ω)3/2,K=12ω4(ω8+2ω6+3ω43)cosh4Ω+4ω4(ω2+2)cosh2Ω+3(2ω2+1)(1+ω2cosh2Ω)2}.

    Note that references [28, p42, Eq. (2).62] and [18, Eq. (6).74] both have errors in their expressions for moments of the SU distribution. In the β1, β 2 plane, the SU distribution defines an unbounded region “below” the lognormal curve (see Fig. 2).

  • (ii) Estimation using method of moments

    We follow the iterative numerical procedure suggested by Elderton and Johnson [36] for finding the (c,d) parameters that match the desired β1=S2 and β2=K coordinates of the distribution to be fit (in our model these are computed analytically; for fitting sampled data alternative methods such as quantiles [37] may be preferred). For the SU distribution the sign of the skewness is determined by the sign of c (c<0 means S>0), but because the fitting routine operates on β1 (thereby losing the sign of the skewness), the computed value of c will always be non-positive. Thus, c should be multiplied by 1 if the skewness to be fit was negative. Given values for c and d, it is then straightforward to determine the value of b that matches the variance, and then the value of a that matches the mean. This is the same procedure used in [38].

    If the distribution to be fit is in the SU region or on the lognormal curve, then the first four moments will be matched exactly by this fitting procedure. If the distribution to be fit lies in the SB region (above the lognormal curve in Fig. 2) then the skewness and kurtosis will not match exactly. To ensure that the “best” possible SU approximation is returned by the iterative procedure, we first find the point on the lognormal curve with the smallest Euclidean distance in the β1,β2 plane to the SB point to be fit. This can be found using Newton’s method [39] to minimize the distance D=(β1S2)2+(β2K)2 where β1 and β2 are given by Eq. (A1) and S2 and K are the coordinates of the point in the SB region. D is only a function of σ; the solution value σmin is plugged back into Eq. (A1) to find the coordinates on the lognormal curve closest to the SB point. Then, applying Elderton and Johnson’s routine to this point finds population parameters {a,b,c,d} corresponding to the optimal SU approximation (and is numerically equivalent the best possible SL approximation). This procedure is necessary to return the optimal SU fit because the Elderton and Johnson procedure approaches the solution while preserving the correct value of β2. Thus, it would return an SU approximation that matched the mean, variance, and kurtosis, but not the skewness. The optimal solution accepts some error in the kurtosis in order to reduce the error in the skewness to achieve minimum total error.

    Note that there is a non-uniqueness in approximating a point on the lognormal curve with the SU distribution. Since the lognormal is a limiting form of the SU, the SU can get indistinguishably close to the lognormal distribution without actually reaching it; for a given tolerance there will be a circular arc in the SU region centered on the lognormal point in the β1,β2 plane defining particular SU distributions that are equidistant to the lognormal point (and therefore have the same value for the total error and are thus “equally good” at approximating the lognormal point). In practice, however, for a fixed tolerance, using the iterative fitting routine will always produce the same set of SU population parameters since the point is approached from a specific direction (along a line of constant β2).

  • (iii) Translation and Scaling:

    Let k and λ be constants and y~SU(a,b,c,d). Then, it can be shown that z=ky+λ~SU(ka+λ,|k|b,sgn(k)c,d). The absolute value and signum (sign) functions are needed because by convention, b and d are always nonnegative. In (B1), since sinh(·) is an odd function and x is a zero-mean symmetric distribution, the following transformations are all equivalent, sinh(xc)=sinh(x+c)=sinh(x+c). Therefore there is an invariance in the Johnson SU distribution to simultaneously flipping the sign of the c and b parameters, and no loss in generality in always taking b to be positive.

Appendix C. [E, V, S, K] as a function of moments

Let z be a random variable whose raw moments E(zn) exist for n =1, 2, 3, 4. Then the mean, variance, skewness, and kurtosis may be written as a function of the raw moments as

E(z),V(z)=E([zE(z)]2)=E(z2)E(z),S(z)=E([zE(z)]3)V(z)3/2=E(z3)3E(z2)E(z)+2E(z)3V(z)3/2,K(z)=E([zE(z)]4)V(z)2=E(z4)4E(z3)E(z)+6E(z2)E(z)23E(z)4V(z)2,

respectively. These expressions may be derived using the properties of the expectation operator, which are summarized in Appendix D.

Appendix D. Properties of the expectation operator

The definition of the expectation operator is

E[f(x)]=f(x)px(x)dx

where px(x) is the pdf describing the random variable x, and the integral is computed over the entire domain of x. It follows directly from the definition that E(kx+λ)=kE(x)+λ where k and λ are constants. It can also be established, regardless of independence between variates x and y, that E[f(x)+g(y)]=E[f(x)]+E[g(y)]. Additionally, if x and y are independent—such that the joint probability density px,y(x,y)=px(x)py(y)—then E[f(x)g(y)]=E[f(x)]E[g(y)].

For convenience we give the moments of z=x±y and of z=xy (with x and y independent), which are frequently used in deriving moments of the model outputs. The moments of z=x±y (for example, ΔM=M(1)M(0) and ΔT=B1(Lin)Tc) are given by

{E(z)=E(x)±E(y)E(z2)=E(x2)±2E(x)E(y)+E(y2)E(z3)=E(x3)±3E(x2)E(y)+3E(x)E(y2)±E(y3)E(z4)=E(x4)±4E(x3)E(y)+6E(x2)E(y2)±4E(x)E(y3)+E(y4)}.

For emissivity terms ε=(1t) that are of the form z=1y, (D1) may be used with E(xn)=1. The moments of z=xy (with x and y independent) are given by

{E(z)=E(x)E(y)E(z2)=E(x2)E(y2)E(z3)=E(x3)E(y3)E(z4)=E(x4)E(y4)}.

References and links

1. M. L. Polak, J. L. Hall, and K. C. Herr, “Passive Fourier-transform infrared spectroscopy of chemical plumes: an algorithm for quantitative interpretation and real-time background removal,” Appl. Opt. 34(24), 5406–5412 (1995). [CrossRef]   [PubMed]  

2. D. F. Flanigan, “Prediction of the limits of detection of hazardous vapors by passive infrared with the use of modtran,” Appl. Opt. 35(30), 6090–6098 (1996). [CrossRef]   [PubMed]  

3. A. Ben-David and H. Ren, “Detection, identification, and estimation of biological aerosols and vapors with a Fourier-transform infrared spectrometer,” Appl. Opt. 42(24), 4887–4900 (2003). [CrossRef]   [PubMed]  

4. A. Ben-David, C. E. Davidson, and J. F. Embury, “Radiative transfer model for aerosols at infrared wavelengths for passive remote sensing applications: revisited,” Appl. Opt. 47(31), 5924–5937 (2008). [CrossRef]   [PubMed]  

5. D. Manolakis, “Signal processing algorithms for hyperspectral remote sensing of chemical plumes”, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 1857 – 1860, Digital Object Identifier: 10.1109/ICASSP.2008.4517995 (2008). [CrossRef]  

6. A. Ifarraguerri and A. Ben-David, “Impact of atmospheric boundary layer turbulent temperature fluctuations on remote detection of vapors by passive infrared spectroscopy,” Opt. Express 16(22), 17366–17382 (2008). [CrossRef]   [PubMed]  

7. A. Ben-David, “Remote detection of biological aerosols at a distance of 3 km with a passive Fourier transform infrared (FTIR) sensor,” Opt. Express 11(5), 418–429 (2003). [CrossRef]   [PubMed]  

8. H. C. van de Hulst, Multiple Light Scattering Tables, Formulas and Application (Academic, 1980).

9. K. N. Liou, An Introduction to Atmospheric Radiation, 2nd ed. (Academic, 2002).

10. G. L. Stephens, P. M. Gabriel, and S.-C. Tsay, “Statistical radiative transport in one-dimensional media and its application to the terrestrial atmosphere,” Transp. Theory Stat. Phys. 20(2-3), 139–175 (1991). [CrossRef]  

11. J. Kerekes and J. Baum, “Full spectrum spectral imaging system analytical model,” IEEE Trans. Geosci. Rem. Sens. 43(3), 571–580 (2005). [CrossRef]  

12. C. Quang, F. Dalaudier, A. Roblin, V. Rialland, and P. Chervet, “Statistical model for atmospheric limb radiance structure: application to airborne infrared surveillance systems,” Proc. SPIE 7108, 710805, 710805-9 (2008). [CrossRef]  

13. A. Ishimaru, Wave Propagation and Scattering in Random Media (Academic, 1978).

14. N. Byrne, “3D Radiative transfer in stochastic media,” in 3D Radiative Transfer in Cloudy Atmospheres, A. Marshak and A. B. Davis, eds. (Springer, 2005).

15. D. E. Lane-Veron and C. J. Somerville, “Stochastic theory of radiative transfer through generalized cloud fields,” J. Geophys. Res. 109(D18), D18113 (2004). [CrossRef]  

16. J. Aitchison and J. A. C. Brown, The Lognormal Distribution (Cambridge University Press, 1957).

17. E. Limpert, W. A. Stahel, and M. Abbt, “Lognormal distributions across the sciences: keys and clues,” Bioscience 51(5), 341–352 (2001). [CrossRef]  

18. N. L. Johnson, S. Kotz, and N. Balakrishnan, Continuous univariate distributions, 2nd ed. (Wiley, 1994), Vol. 1.

19. L. F. Fenton, “The sum of lognormal probability distributions in scatter transmission systems,” IRE Trans. Commun. Syst. CS-8, 57–67 (1960).

20. G. L. Stuber, Principles of Mobile Communications (Kluwer Academic Publishers, 1996).

21. N. B. Mehta, J. Wu, A. F. Molisch, and J. Zhang, “Approximating a sum of random variables with a lognormal,” IEEE Trans. Wirel. Comm. 6(7), 2690–2699 (2007). [CrossRef]  

22. D. Manolakis, D. Marden, and G. A. Shaw, “Hyperspectral image processing for automatic target detection applications,” Lincoln Lab. J. 14, 79–116 (2003).

23. A. Schaum, “Adapting to change: the CFAR problem in advanced hyperspectral detection,” in Proceedings of the 36th Applied Imagery Pattern Recognition Workshop (IEEE Computer Society Washington, DC, 2007), pp. 15–21.

24. D. Manolakis, R. Lockwood, T. Cooley, and J. Jacobson, “Is there a best hyperspectral detection algorithm?” Proc. SPIE 7334, 733402, 733402-16 (2009). [CrossRef]  

25. H. Ren, “Anomaly detection in hyperspectral imagery: second and higher-order statistics-based algorithms,” in Recent Advances in Hyperspectral Signal And Image Processing, C–I Chang, ed. (Transworld Research Network, 2006), Chap. 12.

26. A. Stuart and K. Ord, Kendall’s Advanced Theory of Statistics: Distribution Theory, 6th ed. (Oxford University, 2007), Vol. 1.

27. N. L. Johnson, “Systems of frequency curves generated by methods of translation,” Biometrika 36(1-2), 149–176 (1949). [CrossRef]   [PubMed]  

28. J. K. Ord, Families of Frequency Distributions (Griffin, 1972), Chap. 2.

29. S. M. Kay, Fundamentals of statistical signal processing: detection theory (Prentice Hall PTR, 1998), Vol. II.

30. A. Ben-David, S. K. Holland, G. Laufer, and J. D. Baker, “Measurements of atmospheric brightness temperature fluctuations and their implications on passive remote sensing,” Opt. Express 13(22), 8781–8800 (2005). [CrossRef]   [PubMed]  

31. J. H. Seinfeld and S. N. Pandis, Atmospheric Chemistry and Physics (Wiley, 1998).

32. S. Chandrasekhar, Radiative Transfer (Dover, 1960).

33. N. S. Kopeika, A System Engineering Approach to Imaging (SPIE Optical Engineering Press, 1998), Chap. 15.

34. A. Berk, G. P. Anderson, P. K. Acharya, L. S. Bernstein, L. Muratov, J. Lee, M. J. Fox, S. M. Adler-Golden, J. H. Chetwynd, M. L. Hoke, R. B. Lockwood, T. W. Cooley, and J. A. Gardner, “MODTRAN5: A Reformulated Atmospheric Band Model with Auxiliary Species and Practical Multiple Scattering Options,” Proc. SPIE 5655, 88–95 (2005). [CrossRef]  

35. R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. (Wiley, 2001).

36. W. P. Elderton and N. L. Johnson, Systems of Frequency Curves (Cambridge University Press, 1969).

37. R. E. Wheeler, “Quantile estimators of Johnson curve parameters,” Biometrika 67(3), 725–728 (1980). [CrossRef]  

38. I. D. Hill, R. Hill, and R. L. Holder, “Algorithm AS99: fitting Johnson curves by moments,” J. R. Stat. Soc., Ser. C, Appl. Stat. 25(2), 180–189 (1976).

39. F. S. Acton, Numerical Methods That Work (The Mathematical Association of America, 1990).

Cited By

Optica participates in Crossref's Cited-By Linking service. Citing articles from Optica Publishing Group journals and other participating publishers are listed here.

Alert me when this article is cited.


Figures (8)

Fig. 1
Fig. 1 Geometry and radiative transfer quantities for a 3-layer radiative transfer model for horizontal line of sight (LOS1) and up-looking slant path (LOS2). Transmissions and temperatures for the different layers are denoted by t i and T i , respectively. For LOS1 the foreground and background layer radiances ( L f , L b ) are given as emissions (1 t i )B( T i ) from a blackbody B( T i ) . The external source radiance, L s , at the end of LOS can be a function of temperature T s and emissivity ε s , i.e., L s = ε s B( T s ) .
Fig. 2
Fig. 2 Standard β 1 , β 1 chart, where the y-axis is reversed (by common convention). The lognormal curve (corresponding to both LN and SL distributions) is given by Eq. (A1) and is shown in red. The normal distribution is a point in blue at coordinates (0,3). The area “below” the lognormal curve contains all SU distributions. The area above the lognormal curve corresponds to the SB region. All probability distributions must lie below the shaded gray area in the upper right corner of the plot.
Fig. 3
Fig. 3 Comparison of data histogram (solid red) and model-predicted pdfs (thick dotted green) for several intermediate model outputs for the physical simulation. The L and M radiances (in units of W/cm2/sr/cm−1) are modeled with the Johnson SU distribution, and B( T c ) is modeled with a lognormal. Top row: emission radiances for the foreground and background layers, L f = M f , L b , and cloud layer emission radiance L c =(1 t c )B( T c ) . Middle row: At-sensor radiances for the external source, M s (0) , and the background layer, M b (0) , under the H0 scenario and the at-sensor radiance from the cloud layer, M c . Bottom row: At-sensor radiances for the external source, M s (1) , and background layer, M b (1) , for the H1 scenario and the Planck blackbody radiance for the cloud, B( T c ) .The KL distance between theory and sampled data for all plots is very small (<0.002), hence excellent agreement between the data and the model.
Fig. 4
Fig. 4 Comparison of data histograms (solid lines) and model pdfs (thick dotted lines) for the primary model outputs for the physical simulation. Top left: at-sensor radiance under H0 and H1 hypotheses, M (0) and M (1) , respectively. Top right: Differential radiance, ΔM= M (1) M (0) . Bottom left: Thermal contrast (in Kelvin), ΔT= B 1 ( L in ) T c , computed using Eq. (8). Bottom right: Receiver operating characteristics where the line of equal probability of false alarm and detection is shown in gray. The agreement between model predications and the data is excellent (KL distances for M (0) , M (1) , ΔM , and ΔT of 0.0014, 0.0015, 0.0014, 0.0016, respectively).
Fig. 5
Fig. 5 Same as Fig. 3 but for Simulation 1. The agreement between the model and the data for L and M radiances is poor (large KL values of ~1.3) with the exception of M s (0) and M s (1) for which the fit is good and the KL distance is small (0.005 and 0.008 respectively).
Fig. 8
Fig. 8 Same as Fig. 4 but for Simulation 2. The KL distance between theory and sampled data for M (0) , M (1) , and ΔM is very small (0.002, 0.0014, and 0.0013, respectively) indicating good agreement between the model and data. For ΔT the fit is reasonably good with a KL of 0.007. Detection performance for this scenario is reduced in comparison to Simulation 1: although the optical depth of the target cloud is much greater in Simulation 2, the thermal contrast is reduced and there is greater attenuation from the foreground layer.
Fig. 6
Fig. 6 Same as Fig. 4 but for Simulation 1. The KL distance between theory and sampled data for M (0) , M (1) , ΔM , and ΔT is very small (0.0014, 0.0013, 0.0013, 0.003, respectively), hence excellent agreement between the model and the data. Although the optical depth is similar and the thermal contrast larger for Simulation 1 in comparison to the physical scenario, the performance is reduced for Simulation 1 due to larger fluctuations.
Fig. 7
Fig. 7 Same as Fig. 3 but for Simulation 2. The KL distance for the L radiances are small (0.03-0.06) and are larger (~0.75) for M b (0) and M c . The agreement for M s (0) , M s (1) , and M b (1) is poor (KL of 2.7, 6, and 12, respectively).

Tables (2)

Tables Icon

Table 1 Parameters for Simulations

Tables Icon

Table 2 Parameters for Johnson Distributions, x~ S U (a,b,c,d)

Equations (31)

Equations on this page are rendered with MathJax. Learn more.

M ( 1 ) = L f + t f (1 t c )B( T c )+ t f t c L b + t f t c t b L s .
M ( 0 ) = L f + t f L b + t f t b L s .
{ L f =(1 t f )B( T f ) L b =(1 t b )B( T b ) }.
{ ΔM= M ( 1 ) M (0) = t f (1 t c )ΔT ΔT= L in B( T c ) L in = L b + t b L s }
ΔT= B 1 ( L in ) T c
B(T)= k 1 e k 2 /T 1
B 1 ( L )= k 2 ln( k 1 /L +1 ) .
B 1 ( L )= B 1 ( L 0 )+( L L 0 ) B 1 L | L= L 0 +O( (L L 0 ) 2 ).
α 1 = B 1 L | L= L 0 , α 0 = B 1 ( L 0 )( B 1 L | L= L 0 ) L 0 ,
B 1 L | L= L 0 = B 1 ( L 0 ) L 0 ( k 1 + L 0 )ln( k1 L 0 +1 ) .
p x (x)= (2π σ 2 ) 1/2 exp( 1 2 (xμ) 2 σ 2 ).
p y (y)= (2π σ 2 y 2 ) 1/2 exp( 1 2 (ln(y)μ) 2 σ 2 ).
E( y n )=exp(nμ+ 1 2 n 2 σ 2 )
E(y)=exp(μ+ 1 2 σ 2 ), V(y)=[exp( σ 2 )1]E (y) 2 , S(y)=[exp( σ 2 )+2] exp( σ 2 )1, K(y)=exp(4 σ 2 )+2exp(3 σ 2 )+3exp(2 σ 2 )3.
{ β 2 = β 1 +exp(4 σ 2 )+exp(3 σ 2 )+1 β 1 =exp(3 σ 2 )+3exp(2 σ 2 )4 }.
{ σ 2 =ln( 1+ V(x) E (x) 2 ) μ=ln[ E(x) ] 1 2 σ 2 }.
{ σ 2 =ln( 1 2 + median (x) 2 +4V(x) 2median(x) ) μ=ln[ median(x) ] }.
z= i y i ~LN( μ z , σ z 2 ), μ z =ln( i E( y i ) ) 1 2 σ z 2 , σ z 2 =ln( 1+ i V( y i ) ( i E( y i ) ) 2 ).
ρ 12 =corr( y 1, y 2 )= E( y 1 y 2 )E( y 1 )E( y 2 ) V( y 1 )V( y 2 ) ,
z= y 1 + y 2 + y 3 ~LN( μ z , σ z 2 ), E(z)=E( y 1 )+E( y 2 )+E( y 3 ),
V(z)=V( y 1 )+V( y 2 )+V( y 3 )+2 ρ 12 V( y 1 )V( y 2 ) +2 ρ 13 V( y 1 )V( y 3 ) +2 ρ 23 V( y 2 )V( y 3 ) .
E( y n )= erfc( μ+n σ 2 2 σ 2 ) erfc( μ 2 σ 2 ) exp(nμ+ 1 2 n 2 σ 2 )
erfc( μ+n σ 2 2 σ 2 ) erfc( μ 2 σ 2 ) 1,
y=a+bsinh( xc d )
p y ( y )= dexp{ 1 2 [c+d sinh 1 ( ya b )] 2 } b 2π[ ( ya b ) 2 +1] .
E[ (yθ) n ]= j=0 n ( n j ) ( b 2 ) j ( aθ ) nj r=0 j ( j r ) ( 1 ) r e (j2r)Ω ω (j2r) 2
{ E=abωsinh(Ω), V= 1 2 b 2 ( ω 2 1)(1+ ω 2 cosh2Ω), S=ω ( ω 2 1 2 ) 1/2 ω 2 ( ω 2 +2)sinh3Ω+3sinhΩ ( 1+ ω 2 cosh2Ω ) 3/2 , K= 1 2 ω 4 ( ω 8 +2 ω 6 +3 ω 4 3)cosh4Ω+4 ω 4 ( ω 2 +2)cosh2Ω+3(2 ω 2 +1) ( 1+ ω 2 cosh2Ω ) 2 }.
E(z), V(z)=E( [ zE(z) ] 2 )=E( z 2 )E(z), S(z)= E( [zE(z)] 3 ) V (z) 3/2 = E( z 3 )3E( z 2 )E(z)+2E (z) 3 V (z) 3/2 , K(z)= E( [zE(z)] 4 ) V (z) 2 = E( z 4 )4E( z 3 )E(z)+6E( z 2 )E (z) 2 3E (z) 4 V (z) 2 ,
E[f(x)]= f(x) p x (x)dx
{ E(z)=E(x)±E(y) E( z 2 )=E( x 2 )±2E(x)E(y)+E( y 2 ) E( z 3 )=E( x 3 )±3E( x 2 )E(y)+3E(x)E( y 2 )±E( y 3 ) E( z 4 )=E( x 4 )±4E( x 3 )E(y)+6E( x 2 )E( y 2 )±4E(x)E( y 3 )+E( y 4 ) }.
{ E(z)=E(x)E(y) E( z 2 )=E( x 2 )E( y 2 ) E( z 3 )=E( x 3 )E( y 3 ) E( z 4 )=E( x 4 )E( y 4 ) }.
Select as filters


Select Topics Cancel
© Copyright 2024 | Optica Publishing Group. All rights reserved, including rights for text and data mining and training of artificial technologies or similar technologies.