Abstract
Characteristic functionals are one of the main analytical tools used to quantify the statistical properties of random fields and generalized random fields. The viewpoint taken here is that a random field is the correct model for the ensemble of objects being imaged by a given imaging system. In modern digital imaging systems, random fields are not used to model the reconstructed images themselves since these are necessarily finite dimensional. After a brief introduction to the general theory of characteristic functionals, many examples relevant to imaging applications are presented. The propagation of characteristic functionals through both a binned and list-mode imaging system is also discussed. Methods for using characteristic functionals and image data to estimate population parameters and classify populations of objects are given. These methods are based on maximum likelihood and maximum a posteriori techniques in spaces generated by sampling the relevant characteristic functionals through the imaging operator. It is also shown how to calculate a Fisher information matrix in this space. These estimators and classifiers, and the Fisher information matrix, can then be used for image quality assessment of imaging systems.
© 2016 Optical Society of America
1. INTRODUCTION
If we are imaging an ensemble of objects, then it is natural to describe this ensemble as a random field on some support region in . (Other terms used for this concept are random function and stochastic process.) For example, in medical imaging, the object is described by a function of three variables that varies unpredictably from one patient to the next. In x-ray computed tomography, this function gives the x-ray attenuation at each point, while in positron emission tomography and single photon emission computed tomography imaging, this function gives the activity distribution inside the body. While there are anatomical structures that are common to all patients, there are also unpredictable variations in these structures throughout the patient population. As another example, for the x-ray imaging of luggage, there is an almost unlimited variety of objects that can be packed in a bag and also great variation in the arrangement of these objects in the bag. The corresponding attenuation maps then vary in an unpredictable way from one bag to the next. In each of these examples, any given attenuation map or activity distribution can be regarded as a realization of a random field defined on a region of space. In medical imaging, this support region can be thought of as a cylinder large enough to contain all of the patients being imaged. In a luggage scanner, the support set may be a box large enough to contain all of the bags being scanned. In all imaging situations, there will be unpredictable variations in the function being imaged. If there were no such variations, then the object function would be known already and there would be no need to create an image. Therefore, we will consider the object functions being imaged to be realizations of a random field defined on a support set.
Describing the statistical properties of a random field can be difficult since we are dealing with essentially a random vector in an infinite-dimensional space. The concept of a probability density function (PDF), which is used to describe the statistics of continuous finite-dimensional random vectors, is not defined in general for a random field. For finite-dimensional random vectors, an equivalent description of the statistics is given by the characteristic function, which is the Fourier transform of the PDF. Indeed, many concepts in probability theory and multivariate statistics are easier to understand if we use the characteristic function as our description of the random vector. Fortunately, the concept of the characteristic function generalizes easily to random fields and is referred to as the characteristic functional in this context.
After a brief discussion of probability models and random variables in Section 2 to establish notation, we introduce the idea of a random field in Section 3. In Section 4, we define the characteristic functional for a random field and discuss some of its properties. Sections 5 and 6 introduce generalized random fields and their characteristic functionals; these ideas are needed to describe point processes, for example. In Section 7, we show how certain mathematical operations with random fields translate to relations among the corresponding characteristic functionals. Section 8 consists of a myriad of examples of random fields and their characteristic functionals that are relevant to imaging applications. We then show how characteristic functionals are propagated through noisy imaging systems in Section 9. This discussion includes both binned and list-mode imaging systems. Finally, in Section 10, we show how characteristic functionals and image data can be used to estimate population parameters, classify populations of objects, and compute a Fisher information matrix for the estimation of population parameters. In all of these applications, an analytic expression for the characteristic functional is essential to carrying out the task. Fortunately, in Section 8, we have a long list of such analytic expressions. We hope that these results will convince readers that modeling object populations as random fields and describing their statistical properties with characteristic functionals is both useful and important for the analysis of imaging systems and the data generated by them.
Since this paper is intended as a tutorial on the use of characteristic functionals in image science, most of the results have been derived elsewhere. Exceptions to this rule are Subsection 8.L on dynamically evolving random fields and Sections 10 and 11 on using characteristic functionals to generate mathematical observers and figures of merit for imaging systems.
2. PROBABILITY MODELS AND RANDOM VARIABLES
To define the characteristic functional of a random field, we must first discuss the definition of a random field. This definition in turn necessitates a brief discussion of probability spaces and random variables.
To define a probability space [1–3] we start with a set , sometimes called the sample space, the space of outcomes, or the set of elementary events. This set may be finite. For example, if photons strike a detector and of them are detected, then because these are the possible outcomes from this experiment. The outcome space may be infinite but discrete. An example of this situation is operating a photon-counting detector being illuminated by a source for a fixed amount of time. In this case, it is standard practice to take . If an imaging system performs real-valued measurements of an object, then . In the theory of random fields, we have to allow for the possibility that is an infinite-dimensional space. The set of all real-valued functions of variables is an example of such a space. In general, we will use the symbol to represent a point in . Thus, may represent an integer, a real number, a finite-dimensional vector, a function, or some other more abstract way of representing outcomes.
The next step in defining a probability model is to define the collection of subsets of that will be assigned a probability. These subsets are usually called events. When is a discrete set, is usually simply the collection of all possible subsets of . Thus, in this case, we will assign a probability to any subset of outcomes. When is not a discrete set, then normally does not consist of all possible subsets of . In general, all that we require of the collection is three conditions. The first condition is that itself is one of the subsets in the collection. The second condition is that, for any subset in the collection, the complement of , which is the subset of points in that are not in , is also in the collection. The third condition is that, if the subsets are in the collection for , then the subset consisting of the union of these subsets,
is also in the collection. A collection of subsets satisfying these properties is called a σ-algebra.

The final ingredient for a probability model is a probability measure that assigns a probability to each event in . We always have . To state the only other condition on the probability measure, we begin with a sequence of subsets that are in for . We assume that the events are mutually disjoint, i.e., is empty for . Then, if is the union event as just described, the probability measure must satisfy
This is just the usual condition that probabilities add when the events are disjoint, and is called countable additivity.

A probability model is specified by the triple consisting of a sample space, a σ-algebra of subsets of this space, and a probability measure on this collection of subsets. The concept of a probability model forms the foundation of probability theory, which in turn is essential for any discussion of random fields and characteristic functionals. For those not familiar with σ-algebras and probability measures, we will discuss a few examples.
In the first example, we will consider the photon-counting system where and an event will be any subset of . Since the one-element set is an event, it has an associated probability . For any event, we then have
This is just the usual way to assign a probability to a subset of a discrete outcome space.

In the second example, we will take to be the set of all real numbers. If the outcome of an experiment is a measurement of a continuous quantity, then this is the appropriate outcome space. If the real-valued function on satisfies for all and
then we say that this function is a PDF. The corresponding probability measure is defined by when the integral exists. This last caveat is important, since there are subsets of for which this integral does not exist. The existence of these non-measurable sets has been known for some time and was a motivation for restricting the subsets to which we want to assign a probability, the events, to all be members of a σ-algebra. For the interested reader it is not difficult to find descriptions of such sets, but they are extremely unlikely to arise in applications. We say that a σ-algebra is generated by a collection of subsets of if it is the smallest σ-algebra containing all of those subsets. When is the real line , then we normally use the σ-algebra generated by the intervals, the Borel σ-algebra, to define a probability measure. For events in the Borel σ-algebra, the integral in Eq. (5) is always defined.

As a final example of a situation where we need to restrict the subsets to which we want to assign a probability, consider to be the space of all real-valued functions on . We may think of an experiment where the outcome is an oscilloscope trace. In a typical measurement, we want to assign a probability to an event of the form . This event corresponds to a finite number of measurements of the function at the points , with the results falling into the given intervals. We can then take to be the σ-algebra generated by these subsets for our probability model. In this case, however, the subset is not a member of and cannot be assigned a probability since it requires an uncountably infinite number of measurements to verify that a given function is in . We are usually not bothered by this, since we cannot make an uncountably infinite number of measurements anyway. However, this example does show that the natural σ-algebra for a probability model may be very far from containing all of the subsets of .
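The second example can be sketched numerically. In this toy sketch (a standard Gaussian PDF and an interval event, both chosen arbitrarily for illustration), the probability the measure assigns to the interval is the integral of the PDF over it, which we compare with the empirical frequency of outcomes landing in the interval:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sketch of the second example: pr(x) is a standard Gaussian PDF and the
# event A is the interval (a, b); the measure assigns Pr(A) = integral of pr
# over A, compared here against an empirical frequency of outcomes.
a, b = -1.0, 0.5
n_grid = 20_000
xm = a + (np.arange(n_grid) + 0.5) * (b - a) / n_grid   # midpoint grid on A
pdf = np.exp(-xm**2 / 2.0) / np.sqrt(2.0 * np.pi)
pr_quadrature = pdf.sum() * (b - a) / n_grid            # integral of pr over A

samples = rng.standard_normal(500_000)
pr_frequency = np.mean((samples > a) & (samples < b))
print(pr_quadrature, pr_frequency)
```

The two numbers agree to Monte Carlo accuracy, illustrating that the PDF induces a well-defined probability measure on interval (Borel) events.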
In any case, we will only use the general concept of a probability model as just outlined to define what a random field is and what the characteristic functional of a random field is. Once that is accomplished, we will not have to worry very much about the details of the probability model used to define any given random field.
A real random variable is a real-valued function on the sample space that satisfies the following measurability condition. For any real number , the subset of defined by is in the collection of events, and may thus be assigned a probability. A complex random variable is defined by insisting that and are both real random variables. We can also have real random vectors of any dimension by insisting that each component of the vector is a real random variable. This definition is also valid for complex random vectors.
For a real random variable , the PDF is defined by the condition
for any real number . We will use the angle-bracket notation for expectations of functions of a random variable: The condition for this integral to exist is that is itself a random variable. This notation will also be used for expectations of functions of finite-dimensional random vectors, in which case there is a multivariate PDF that can be used to compute them.

3. RANDOM FIELDS
Let be some -dimensional support set for all functions in this section. This may be a subset of or a -dimensional manifold such as a sphere. The support set may even be a discrete set of points in , such as a regular grid in this space. A random field [1–3] is a function of points in and in such that is a random variable for each . We may also describe a random field as a collection of random variables for each in . For a fixed sample point , the function is called a realization of the random field. Sometimes further constraints are placed on the random field, such as assuming that all of the realizations are continuous functions of . In general, however, we cannot expect the realizations to be continuous. We will use the common notation for a random field. This notation has the potential for ambiguity, but it will be clear from the context whether we are referring to a random field or an ordinary function of .
Given an ordered set of points in , the vector is a random vector with a corresponding multivariate PDF on or . The Kolmogorov extension theorem [3] provides two sets of consistency conditions that this collection of PDFs has to satisfy. The two sets of conditions arise from permutations of the points in and marginalization over any of the variables . Conversely, if the conditions in the theorem are satisfied, then there is a random field corresponding to the collection of PDFs. This theorem is often used to derive the existence of a random field from a suitable collection of PDFs.
For any realization of a random field, we will need to consider the inner product defined by
For a discrete support set , the integral is replaced with a sum here and anywhere below where an integral over appears. Any linear combination of random variables is also a random variable. Therefore, if we think of the integral in this expression as a limit of Riemann sums, where , then it is a limit of a sequence of random variables. As long as this limit exists for all , the integral defines a random variable , which will also be written as . We will also write

If we have two functions and that satisfy this condition and , then is also a random variable. Thus, the set of functions such that is a well-defined random variable forms a vector space. The details of the problem of determining which functions are in this space depend on the random field itself, and we will not be discussing that issue in this work. This issue is discussed in detail elsewhere and involves a considerable amount of mathematics [4,5]. In all of the equations that follow, we will simply assume that is a function in this space.
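The Riemann-sum construction above can be sketched numerically. In this toy example (a white-noise field sampled on a grid over an assumed support [0, 1], with an arbitrary fixed function u), each realization of the field yields one value of the inner product, so the inner product is itself a random variable:

```python
import numpy as np

rng = np.random.default_rng(1)

# Sketch: sample realizations of a toy random field f on a grid over an
# assumed support S = [0, 1], and form the scalar <u, f> as a Riemann sum;
# each realization yields one number, so <u, f> is itself a random variable.
n_grid, n_realizations = 256, 10_000
r = np.linspace(0.0, 1.0, n_grid, endpoint=False)
dr = r[1] - r[0]

f = rng.standard_normal((n_realizations, n_grid))  # toy white-noise field
u = np.sin(2.0 * np.pi * r)                        # a fixed function u(r)

inner = (f * u).sum(axis=1) * dr                   # <u, f> per realization
print(inner.shape, inner.mean(), inner.var())
```

For this toy field the resulting random variable has mean zero and variance equal to the Riemann sum of u squared times the grid spacing, which the printed sample statistics reproduce.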
4. CHARACTERISTIC FUNCTIONALS OF RANDOM FIELDS
Another way to think of a random field is as a map that assigns the function to each in the sample space. Since the set of functions on forms a vector space, this is analogous to a random vector in an infinite-dimensional space. However, there is no easy way to generalize the concept of a PDF for a finite-dimensional random vector to a PDF for an infinite-dimensional random vector. The main reason for this is that there is no way to easily generalize an integral of the form
to the case where is replaced with an infinite-dimensional function space. Thus, the concept of a PDF is not very useful for random fields. On the other hand, the characteristic function for a random vector, which is given by is also very useful for problems involving random vectors and is easily generalized to random fields. In this equation, the dagger superscript indicates the conjugate transpose. Of course, for random vectors, the characteristic function is just the Fourier transform of the PDF and is therefore an equivalent description of the statistics of the random variable. The same is true for the characteristic functional of a random field even though there may be no PDF.

The characteristic functional for a random field is defined by
As noted previously, we always assume that the function in the argument of the characteristic functional satisfies the condition that the scalar is a well-defined random variable. Thus, the expectation in the definition of the characteristic functional can be computed from the one-dimensional PDF for . There are certain conditions that a characteristic functional must satisfy. The normalization condition is . The Hermitian symmetry condition is . The magnitude of the characteristic functional always satisfies . The last condition is called non-negative definiteness, which must be true for any , any set of functions for which is defined for all , and any complex vector . The condition in this equation is equivalent to constraining the matrix given by to be non-negative definite. A generalization of Bochner’s theorem [3] to this setting shows that, if we add in a continuity condition, then these constraints on a functional ensure that it is the characteristic functional of a random field.

Suppose that is a real random field. Since is a random variable, we may use its PDF to compute a mean value: . Similarly, is a random vector and its PDF may be used to compute the correlation function . The covariance function for the random field is then given by . For a real random field, we have when is a real-valued function, which will be the assumption made in this paragraph. For a parameter , we then have
So the mean of the random field is determined by the characteristic functional. Taking another derivative, we get We will write this equation in the form where the integral operator is given by the inner integral in the previous equation. The characteristic functional therefore determines the correlation function and hence the covariance function also. There are similar expressions for higher-order moments of the random field. There is also an integral operator whose kernel is the covariance function . Spectral theory for random fields is the study of the spectrum of this operator, which can also be derived from the characteristic functional in the obvious way.

For complex random fields, the correlation function is usually defined by , with the corresponding covariance function . If we introduce the complex parameter , then this correlation function is given by
Since is a Hermitian operator, the quantity is real. To complete the description of the second-order statistics for a complex random process, we also need another correlation function defined by , with the corresponding covariance function . In terms of the characteristic functional, we then have and

Note that is, in general, not a Hermitian operator.
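The derivative relations of this section can be checked numerically in finite dimensions. This sketch assumes the convention Ψ(ξ) = ⟨exp(−iξ·u)⟩ (the stripped formulas here may use a different sign or 2π convention), with a toy Gaussian random vector standing in for the field; the first β-derivative of Ψ(βξ) at β = 0 recovers −i times the projection of the mean onto ξ:

```python
import numpy as np

rng = np.random.default_rng(2)

# Finite-dimensional sketch of the derivative relation: under my assumed
# convention Psi(xi) = <exp(-i xi . u)>, the first derivative of Psi(beta*xi)
# in beta at beta = 0 equals -i * <xi, mean>. All numbers are toy choices.
dim, n = 4, 400_000
mean = np.array([0.5, -0.2, 0.1, 0.3])
u = mean + 0.3 * rng.standard_normal((n, dim))

xi = np.array([1.0, 2.0, -1.0, 0.5])
proj = u @ xi                                   # samples of <xi, u>

def psi(beta):
    return np.exp(-1j * beta * proj).mean()     # empirical Psi(beta * xi)

h = 1e-3
deriv = (psi(h) - psi(-h)) / (2.0 * h)          # central difference in beta
recovered = (1j * deriv).real                   # should equal <xi, mean>
print(recovered, xi @ mean)
```

A second derivative in the same spirit would recover the projected correlation matrix, mirroring how the characteristic functional determines the correlation function.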
5. GENERALIZED RANDOM FIELDS
Some physically important random fields, such as Poisson point processes, are actually generalized random fields. For a generalized random field we do not insist that is a random variable since, for any given realization, may not be well defined as an ordinary function. Instead, we restrict to be a test function and define
Here, is taken to be a generalized function [1,6] for all sample points . The requirement for a generalized random field is that be a well-defined random variable for all test functions . We will also write and think of for a particular test function as the output of a random distribution on the space of test functions defined on . This notation emphasizes the fact that the space of test functions on plays the same role for a generalized random process as the set itself does for an ordinary random process. Thus, we can also write as defining the random variable , and as defining a realization of the generalized random field. If we have a vector of test functions , then we have a random vector with corresponding PDF .

6. CHARACTERISTIC FUNCTIONALS OF GENERALIZED RANDOM FIELDS
Just as with ordinary random fields, extending the concept of a PDF to a generalized random field is difficult. We would need the PDF to be defined on the space of distributions on , which is an infinite-dimensional space that includes, for example, the space of locally integrable functions on . However, the characteristic functional for a generalized random field is easily defined by , where the argument of the functional is taken to be a test function. The expectation in this expression is computed using the PDF for the random variable . All of the statistical information about the generalized random field is contained in this functional.
A generalized random field is real-valued if is real whenever is a real-valued test function. In this case, we may define the mean of the generalized random field as a distribution by using . Since, for any given test function , is a random variable, the expectation in this definition may be computed by using the PDF for that random variable. We may also write this definition as . Similarly, for a pair of test functions , the random vector has a PDF that can be used to compute the correlation operator via . Notice that we do not attempt to define a correlation function, which in many cases would be a generalized function itself. The covariance operator is then defined by .
For complex generalized random fields, we have the Hermitian correlation operator defined by , and the corresponding covariance operator given by . As with ordinary random fields, to complete the description of the second-order statistics of a complex generalized random field, we need the operator defined by . The corresponding covariance operator is then given by . The relations between the characteristic functional and these first- and second-order statistics are the same as those for ordinary random fields given previously.
7. OPERATIONS WITH RANDOM FIELDS
Various mathematical operations with random fields or generalized random fields give rise to relations between characteristic functionals. If and are two random fields on with corresponding probability models and , then there is a probability model on with a probability measure that satisfies . We can then define a random field on by . We usually write in this case and say that the two random fields and are statistically independent. The characteristic functionals for these random fields are related by the equation . Alternatively, we may take this equation as the definition of statistical independence for two random fields that share a probability model . The obvious condition for this equation to be valid is that it must be permissible to use in the argument of all of the characteristic functionals involved.
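The factorization for independent fields can be verified on a sampled grid. This toy sketch assumes the convention Ψ(ξ) = ⟨exp(−i⟨ξ, u⟩)⟩ with a Riemann-sum inner product; the grid, test function, and field models are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(3)

# Sketch on a sampled grid: for statistically independent fields u and v, the
# characteristic functional of the sum factors:
# Psi_{u+v}(xi) = Psi_u(xi) * Psi_v(xi).
n_grid, n = 64, 200_000
dr = 1.0 / n_grid
r = np.linspace(0.0, 1.0, n_grid, endpoint=False)
xi = np.cos(2.0 * np.pi * r)                    # an arbitrary test function

u = rng.standard_normal((n, n_grid))            # toy Gaussian field
v = rng.exponential(1.0, size=(n, n_grid))      # independent non-Gaussian field

def psi(field):
    # Empirical characteristic functional, assumed convention
    # Psi(xi) = <exp(-i <xi, field>)> with a Riemann-sum inner product.
    return np.exp(-1j * (field @ xi) * dr).mean()

lhs = psi(u + v)
rhs = psi(u) * psi(v)
print(abs(lhs - rhs))                           # small Monte Carlo residual
```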
In contrast to the technical issues for summing two random fields, multiplying a random field by a non-random function is straightforward and leads to the relation . The one constraint here is that the function must be in the space of functions that can be used in the argument of the functional . For example, if is a generalized random field and is infinitely differentiable, then is also a generalized random field. If is now a test function, then is also a test function and the equality is valid.
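The scaling relation is a direct consequence of moving the fixed function across the inner product, since ⟨ξ, au⟩ = ⟨aξ, u⟩. A toy grid sketch (assumed convention and arbitrary choices of a and ξ) makes this explicit:

```python
import numpy as np

rng = np.random.default_rng(10)

# Sketch on a sampled grid: multiplying a field by a fixed function a(r) moves
# a into the argument, since <xi, a*u> = <a*xi, u>, so Psi_{a u}(xi) = Psi_u(a xi).
n_grid, n = 64, 100_000
dr = 1.0 / n_grid
r = np.linspace(0.0, 1.0, n_grid, endpoint=False)
a = 1.0 + 0.5 * np.cos(2.0 * np.pi * r)         # a fixed, non-random function
xi = np.sin(2.0 * np.pi * r)                    # an arbitrary test function

u = rng.standard_normal((n, n_grid))            # toy field realizations

def psi(field, test_fn):
    return np.exp(-1j * (field @ test_fn) * dr).mean()

lhs = psi(a * u, xi)                            # Psi_{a u}(xi)
rhs = psi(u, a * xi)                            # Psi_u(a xi)
print(abs(lhs - rhs))                           # identical up to floating point
```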
Now suppose that is a linear integral operator given by
If we can show that the integral defining converges for all in and all sample points , then this equation defines a random field on . In this case, we usually write as a relation between two random fields. The characteristic functionals of these two random fields are related by , where the adjoint operator is defined by . In the relation just given between characteristic functionals, must be in the space of functions for which both sides of the equation are well defined. Note that this relation between characteristic functionals makes sense even if is a differential operator. In this case we can, if necessary, restrict to be a test function so that can be treated as a generalized random field defined by the relation .

Now suppose that we have a set of random fields on such that they are all defined with the same probability model for . Also assume that we have a set of probabilities . We can create a new probability model such that the samples are pairs and . The mixture random field for this situation is defined by . We can think of the realizations of this random field as being equal to with probability . We then have that the characteristic functional for the mixture random field is given by the convex combination
of the characteristic functionals of the component random fields. Again, the condition for this equation to be valid is that it must be permissible to use in the argument of all of the characteristic functionals involved. With this constraint, we can see that any convex combination of characteristic functionals is also a characteristic functional.

8. EXAMPLES
We now look at some examples of characteristic functionals that are relevant for imaging. In most cases, we do not present a derivation of the characteristic functionals since they are readily available elsewhere.
A. Real Gaussian
A random field on is Gaussian or normal if its characteristic functional has the form [1 (page 410),2]. In this equation, is the mean of the random field and is the covariance operator as described above. If the mean is zero and the covariance operator is a multiple of the identity operator, then we have a zero-mean white noise Gaussian random field. This noise field is often used in the description of Brownian motion. Note that a white noise Gaussian random field is actually a generalized random field since the kernel of the covariance operator is given by , which is a generalized function of two variables. Due to the central limit theorem, Gaussian random fields often occur in applications as an approximation for a random field that is a sum of many independent random fields.
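The Gaussian form can be checked in finite dimensions, where a sampled grid stands in for the support set. This sketch assumes the convention Ψ(ξ) = ⟨exp(−iξ·u)⟩ = exp(−iξ·μ − ½ξ·Kξ) (the stripped formula may use a different sign or 2π convention); μ, K, and ξ are arbitrary toy choices:

```python
import numpy as np

rng = np.random.default_rng(5)

# Finite-dimensional sketch of the Gaussian characteristic functional, under
# my assumed convention Psi(xi) = exp(-i xi . mu - 0.5 * xi . K xi).
dim, n = 3, 400_000
mu = np.array([0.2, -0.1, 0.4])                 # toy mean
A = rng.standard_normal((dim, dim))
K = A @ A.T / dim + 0.5 * np.eye(dim)           # a valid covariance matrix

u = rng.multivariate_normal(mu, K, size=n)      # Gaussian samples
xi = np.array([0.3, -0.5, 0.2])                 # an arbitrary test vector

empirical = np.exp(-1j * (u @ xi)).mean()
analytic = np.exp(-1j * (xi @ mu) - 0.5 * (xi @ K @ xi))
print(abs(empirical - analytic))
```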
Suppose that we have a random field that satisfies for all realizations and all in . If the random field is a Gaussian random field, then we say that is a log-normal random field [1 (page 1462),7]. There is no analytic form for the characteristic functional of a log-normal random field. However, in this case, the functional
is well defined whenever is in the domain of the characteristic functional of the random field . Note that the operator in this expression is the covariance operator for and not for . If and are statistically independent positive random fields and , then . Therefore, due to the central limit theorem, log-normal random fields often show up as an approximation for a random field that is a product of many independent positive random fields.

B. Circular Gaussian
A complex random field on is a circular Gaussian if its characteristic functional is given by , where the covariance operator is a Hermitian operator. It is often assumed that the electric field in partially coherent light is a circular Gaussian random field. We can also have complex random fields that are Gaussian but not circular Gaussian. The characteristic functional in this case involves both covariance functions for the complex random field.
C. Poisson Random Field
A random point field is a generalized random field that satisfies
where is a random integer with a distribution , and, for a given , the points are also randomly selected [1 (page 649)]. A typical restriction we could place on for this expression to be well defined is that it be continuous.

For a Poisson random point field, we have the Poisson distribution for :
We also require that the points be chosen independently from a PDF . The mean of this generalized random field satisfies and therefore

If we define the function by , then the characteristic functional of a Poisson random field is given by . Poisson random fields are often used to describe the distribution of photon locations on a photon-counting detector. In this case the mean of the field is called the photon fluence, and it determines the statistics of the field completely. Poisson random fields are also used to describe photon-emitting objects that are being imaged.
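The Poisson characteristic functional can be verified by simulation. This sketch assumes the standard form Ψ(ξ) = exp(∫ b(r)(e^{−iξ(r)} − 1) dr) for intensity b(r) (the stripped formula here may differ in convention), with a uniform intensity on [0, 1] and an arbitrary test function:

```python
import numpy as np

rng = np.random.default_rng(6)

# Monte Carlo sketch of the Poisson point-process characteristic functional on
# S = [0, 1]: with intensity b(r) = lam (uniform), my assumed convention gives
# Psi(xi) = exp( integral_S b(r) * (exp(-i * xi(r)) - 1) dr ).
lam, n_trials = 5.0, 100_000
xi = lambda r: 0.7 * np.sin(2.0 * np.pi * r)    # an arbitrary test function

counts = rng.poisson(lam, size=n_trials)        # N for each realization
pts = rng.random(counts.sum())                  # all point locations, pooled
trial = np.repeat(np.arange(n_trials), counts)  # trial index of each point
sums = np.bincount(trial, weights=xi(pts), minlength=n_trials)
empirical = np.exp(-1j * sums).mean()

rm = (np.arange(20_000) + 0.5) / 20_000         # midpoint grid on [0, 1]
analytic = np.exp(lam * np.mean(np.exp(-1j * xi(rm)) - 1.0))
print(abs(empirical - analytic))
```

Realizations with zero points correctly contribute a factor of one to the empirical average.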
D. Random Fields Related to Poisson Random Fields
Suppose that the mean of a Poisson random field is itself a random field. Then we have immediately that the characteristic functional of is given by . In this case, we say that is a doubly stochastic Poisson random field [1 (page 659),8]. If we are imaging an ensemble of objects with a photon-counting detector, then a doubly stochastic random field is the appropriate model of the statistics of the emitted or detected photon locations.
If we convolve a Poisson random field with a continuous function , then we have a filtered Poisson random field [1 (page 662)]:
This type of field is sometimes used to model textures in objects being imaged. If we define the Hermitian conjugate of the filter function via and define , then we have for the characteristic functional of the filtered Poisson random field.

We can have a doubly stochastic filtered Poisson random field also. In this case, the characteristic functional is given by . Other generalizations are also possible. For example, the filter could have the more general form , i.e., a space-variant filter. It is straightforward to work out the characteristic functional in these cases also.
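A filtered Poisson field is easy to simulate, and its first moment offers a simple check: by Campbell's theorem, the mean of the filtered field at a point is the intensity convolved with the kernel. This sketch uses a uniform intensity and an assumed Gaussian kernel, both arbitrary:

```python
import numpy as np

rng = np.random.default_rng(7)

# Filtered-Poisson (shot-noise) sketch on S = [0, 1]: u(r) = sum_n p(r - r_n)
# with points from a uniform-intensity Poisson process. Kernel and rate are
# arbitrary choices. Campbell's theorem: <u(r)> = integral b(r') p(r - r') dr',
# verified here at a single point r0.
lam, n_trials = 8.0, 50_000
p = lambda r: np.exp(-0.5 * (r / 0.05)**2)      # assumed Gaussian blur kernel
r0 = 0.5                                        # evaluation point

counts = rng.poisson(lam, size=n_trials)
pts = rng.random(counts.sum())
trial = np.repeat(np.arange(n_trials), counts)
u_at_r0 = np.bincount(trial, weights=p(r0 - pts), minlength=n_trials)

rm = (np.arange(20_000) + 0.5) / 20_000
campbell = lam * np.mean(p(r0 - rm))            # b(r') = lam on [0, 1]
print(u_at_r0.mean(), campbell)
```

The realizations in `u_at_r0` are samples of the filtered field at one point; texture models evaluate the same sums on a whole grid of points.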
E. Partitioned Random Fields
We will say that the random field is partitioned if we can write the support set as a disjoint union
and we have mutually independent random fields on each set such that Here, is the restriction of to the set [1 (page 428)]. The coefficients in this sum are the components of a random vector with PDF . We can think of the random fields as generating random textures in each region that are statistically independent of each other, but the random vector introduces correlations in the amplitudes of these textures between the regions. This can be used, for example, to model the distribution of a drug or radiotracer in the organs of each member of an ensemble of subjects being imaged. If each is a filtered Poisson random field with filter , then we define the functions by and the vector by . The characteristic functional for the partitioned random field is then given by .

F. Square of a Real Gaussian Random Field
First we consider a finite-dimensional, zero-mean Gaussian random vector with covariance matrix . Define a random vector by . If we use to denote the diagonal matrix with the components along the main diagonal, then we may write the characteristic function of as
This integral can be performed analytically and the result is [1 (page 1255)]

Now consider a zero-mean Gaussian random field with covariance operator . We define the corresponding intensity field by . This is called the intensity field in analogy with optical applications where would be the (real) electric field from a finite-bandwidth source, and the detector responds to the intensity field. If we define the diagonal operator for acting on a function by the equation , then the obvious generalization of the finite-dimensional formula is
This determinant is well defined if is a trace-class operator since, in this case, is also a trace-class operator and the determinant can be defined as an infinite product of eigenvalues. However, just because this functional is well defined does not mean it is necessarily the characteristic functional of the random field .
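The finite-dimensional determinant formula that motivates this expression can be checked directly by Monte Carlo. This sketch assumes the convention Ψ_z(ξ) = ⟨exp(−iξ·z)⟩ = det(I + 2i D_ξ K)^{−1/2} for z the componentwise square of a zero-mean Gaussian vector with covariance K; ξ is kept small so the principal branch of the complex square root applies:

```python
import numpy as np

rng = np.random.default_rng(8)

# Monte Carlo check of the determinant formula in finite dimensions, under my
# assumed convention Psi_z(xi) = <exp(-i xi . z)> for z = x**2 (componentwise),
# x a zero-mean Gaussian vector with covariance K:
# Psi_z(xi) = det(I + 2i * D_xi * K)**(-1/2), with D_xi = diag(xi).
dim, n = 3, 500_000
A = rng.standard_normal((dim, dim))
K = A @ A.T / dim + 0.5 * np.eye(dim)           # an arbitrary valid covariance

x = rng.multivariate_normal(np.zeros(dim), K, size=n)
z = x**2
xi = np.array([0.2, -0.1, 0.15])                # small xi: principal branch is safe

empirical = np.exp(-1j * (z @ xi)).mean()
analytic = np.linalg.det(np.eye(dim) + 2j * np.diag(xi) @ K) ** (-0.5)
print(abs(empirical - analytic))
```

For larger ξ the square root must be continued continuously in ξ rather than taken on the principal branch, which is one reason the infinite-dimensional statement requires the careful argument sketched below.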
To see why we might believe that the given expression for is correct, we consider the following special cases:
We will assume that the Gaussian random field is an ordinary, as opposed to a generalized, random field. In this case, these sums are allowable for because which is a well-defined random variable. We then have which shows that is a finite-rank operator with rank no greater than . Therefore, there are at most nonzero eigenvalues for this operator and we can find them by assuming that an eigenfunction has the form . After substituting this into the eigenvalue equations, we arrive at
Thus, the nonzero eigenvalues of for this are the eigenvalues of . Therefore we can say that, if there is a random field for which (as defined previously) is the characteristic functional, then this random field will have the same finite-dimensional PDFs as . By the Kolmogorov theorem, we would then have that this random field is . By Bochner’s theorem, as just given is the characteristic functional of a random field if it satisfies the constraints given in Section 4. The constraint of non-negative definiteness is the hard one to check, but we can argue as follows. Let the sampled function be defined by where the form a regular grid in . Then the operator is a good approximation to , and this approximation gets better as we refine the grid. Since is a finite-rank operator, we know that satisfies all of the constraints if we restrict to sampled functions. By continuity, we then have that the constraints are satisfied for all . This argument is really a sketch of a proof that we have the correct expression for and needs work to make it rigorous. Filling this sketch out is, however, beyond the scope of this paper.
G. Square Magnitude of a Circular Gaussian
Now suppose that is a circular Gaussian complex random field and . Using the results of the previous subsection, we have
This is the characteristic functional for a commonly used model of speckle in coherent imaging. The characteristic functions for the finite-dimensional PDFs are given by . For the case, the inverse 1D Fourier transform integral can be computed analytically to give a chi-square PDF for the random variable at a fixed . The case can also be computed analytically in terms of the Bessel function [1 (page 1258),9].
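For the single-point case, the exponential (chi-square with two degrees of freedom) intensity PDF and the corresponding characteristic function 1/(1 + 2*pi*1j*xi*sigma2) can be verified directly by simulation. The sketch below uses a hypothetical mean intensity sigma2 = 2 and the convention psi(xi) = <exp(-2*pi*1j*xi*I)>:

```python
import numpy as np

rng = np.random.default_rng(1)

# Circular Gaussian complex amplitude at a single point, with total variance
# sigma2 split equally between real and imaginary parts; the intensity |u|^2
# is then exponentially distributed -- the classic fully developed speckle law.
sigma2 = 2.0                     # hypothetical mean intensity
n = 500_000
u = np.sqrt(sigma2 / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
I = np.abs(u) ** 2

# Empirical single-point characteristic function versus the analytic one,
# psi(xi) = <exp(-2*pi*1j*xi*I)> = 1 / (1 + 2*pi*1j*xi*sigma2).
xi = 0.07
emp = np.mean(np.exp(-2j * np.pi * xi * I))
ana = 1.0 / (1.0 + 2j * np.pi * xi * sigma2)
print(abs(emp - ana))            # small Monte Carlo error
```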
H. Complex-Amplitude Poisson Random Field
An example of a complex-amplitude Poisson random field has the form
where, for a given , the amplitudes , the phases , and the locations are all independent of each other. The number is a Poisson random variable with mean . Conditioned on , we have that the locations are i.i.d. with PDF and the amplitudes are i.i.d. with PDF . A common assumption that we will follow is that the phases are i.i.d. and uniformly distributed on . This is a common model for a scattered radiation field from a random collection of point scatterers, and is also known as a Cox process [10]. We define the function and
where is the zero-order Bessel function. We then have that the characteristic functional of the scattered field is given by [1 (page 1317)]. If we have an ensemble of objects, each consisting of a collection of point scatterers, then is itself a random field and the characteristic functional of is given by . If the propagation of the field through space or an imaging system is defined by an operator , then . If the propagation is implemented by convolving with a point spread function, we define, as before, and have for a fixed collection of scatterers and for an ensemble of objects.
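A realization of this scattered-field model is straightforward to simulate, which is often the quickest sanity check on the analytic characteristic functional. The sketch below uses hypothetical choices (a Gaussian point spread function, uniform scatterer positions on [0, 1], and Rayleigh amplitudes) and verifies that the uniform phases force the mean field to vanish:

```python
import numpy as np

rng = np.random.default_rng(2)

# One realization of the scattered-field model: a Poisson number of point
# scatterers with i.i.d. positions, amplitudes, and uniform phases, each
# contributing a shifted copy of a point spread function p.  All specific
# choices here (Gaussian psf, uniform positions, Rayleigh amplitudes, nbar)
# are hypothetical stand-ins.
nbar = 40.0                              # mean number of scatterers

def psf(r):
    return np.exp(-0.5 * (r / 0.05) ** 2)

def field(r_grid):
    n = rng.poisson(nbar)
    pos = rng.uniform(0.0, 1.0, n)
    amp = rng.rayleigh(1.0, n)
    phase = rng.uniform(0.0, 2 * np.pi, n)
    return (amp * np.exp(1j * phase)) @ psf(r_grid[None, :] - pos[:, None])

r = np.linspace(0, 1, 101)
# Average over many object realizations: uniform phases force <u(r)> = 0.
m = np.mean([field(r) for _ in range(2000)], axis=0)
print(np.max(np.abs(m)))                 # near zero
```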
I. Innovation Random Field
Sparsity of a random field is a property often assumed in compressed sensing that allows better object reconstructions. One model for a sparse random field assumes that where is a stationary white noise field, sometimes called an innovation field. We then have . To describe the statistical properties of an innovation field, we introduce the translation operators and the random variables . The stationarity assumption is that for any and functions , the -dimensional PDF for the -dimensional random vector with components is independent of the translation vector . For a stationary random field, we necessarily have . The white noise assumption is that whenever for all in , the random variables and are statistically independent. In this case it can be shown that [4,5]
where, for some and , The differential is called the Lévy measure and must satisfy the constraint . Therefore any random field that is the result of a linear operator acting on an innovation field must have a characteristic functional of the form , where the functional is as just given. If the PDFs for the random variables for some basis sequence are highly kurtotic with long tails, then this sequence can be thought of as a sparsity basis for the random field . This requirement on these PDFs can then be related to properties of the Lévy measure for the innovation process.
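A discretized illustration of this sparsity mechanism: if the innovation samples are heavy-tailed (here, hypothetically, i.i.d. Laplace variables), the filtered field inherits excess kurtosis, whereas a Gaussian innovation would give exactly zero. The short filter below is an arbitrary stand-in for the linear operator acting on the innovation field:

```python
import numpy as np

rng = np.random.default_rng(5)

def excess_kurtosis(x):
    # Sample excess kurtosis: 0 for Gaussian data, positive for long tails.
    x = x - x.mean()
    return np.mean(x**4) / np.mean(x**2) ** 2 - 3.0

# Discretized innovation: i.i.d. Laplace samples (excess kurtosis 3),
# a hypothetical heavy-tailed, infinitely divisible choice.
n = 200_000
w = rng.laplace(0.0, 1.0, n)

# Filtered field u = L w, with a short exponential smoothing kernel standing
# in for the operator L; filtering reduces, but does not remove, the excess
# kurtosis inherited from the non-Gaussian innovation.
kernel = np.exp(-np.arange(8) / 2.0)
u = np.convolve(w, kernel, mode="same")
print(excess_kurtosis(w), excess_kurtosis(u))
```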
J. Plane-Wave Random Field
Suppose that the random field is a superposition of a random number of plane waves with random amplitudes, phases, and spatial frequencies:
This would be the case, for example, if was the inverse Fourier transform of a point process with complex amplitudes in frequency space. For a given , we will assume that the amplitudes and spatial frequencies are independent of each other and are i.i.d. with PDFs and , respectively. If is the Fourier transform of and is Poisson distributed with mean , then the characteristic functional of this random field is given by . If we assume that the spatial frequencies follow a circular Gaussian distribution, with variance , then
The single-point characteristic function for the random variable is derived by setting , which results in
There is no analytic form for the PDF corresponding to this characteristic function.
K. Infinite-Series Random Field
Suppose that the function describes a random field on a discrete set of integer vectors . A Markov random field [1 (page 428)] is an example of this situation. If the functions form a set of functions on indexed by these integer vectors, then we may define a random field on via
as long as the series converges for all in . A wavelet series expansion, for example, would have this form. We then write where is understood to be a random field on the index set. We also have, for appropriate functions , the series which defines a random variable. We will write this equation as . If we define a vector on the index set via , then the characteristic functional for the random field is given by . This type of model can be used for compressed-sensing applications where the functions could form a wavelet basis or some similar basis or frame.
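A minimal numerical sketch of such a series model, with a hypothetical cosine basis and independent coefficients whose variances decay like 1/k**2, so that the truncated series converges in mean square:

```python
import numpy as np

rng = np.random.default_rng(6)

# Random field built as a series over a (hypothetical) cosine basis with
# independent Gaussian coefficients of variance 1/k^2; the summable variances
# make the truncated series converge in mean square.
def sample_field(r, n_terms=200):
    k = np.arange(1, n_terms + 1)
    c = rng.standard_normal(n_terms) / k          # var(c_k) = 1/k^2
    return np.cos(np.outer(r, k) * np.pi) @ c

r = np.linspace(0, 1, 101)
u = sample_field(r)                               # one realization on a grid

# Pointwise variance check at r = 0.5: only even k survive there, so the
# variance is sum over even k of 1/k^2 = pi^2/24 (up to truncation).
var_est = np.var([sample_field(np.array([0.5]))[0] for _ in range(5000)])
print(var_est)
```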
L. Dynamically Evolving Random Field
Suppose that we define a bounded spatial integral operator on spatiotemporal functions via
We may then consider the dynamic equation with initial condition . The solution to this equation can be written as [11] where the exponential of an operator is defined using the power-series expansion. We will assume that the input function and the initial distribution are independent random fields. The random fields and are then defined on . Now fix a time and consider the random field on . We will denote the characteristic functional for this field by . For a function in the domain of this functional, we define and . Then we have the relation , assuming that all of the functional values in this equation are well defined. If the operator is a differential operator, we may have a problem defining the exponential . One case where this exponential is well defined is when , i.e., is anti-Hermitian. For example, if , then . This operator appears in the radiative transport equation [1]. In some applications, we may also want a vector-valued version of this type of dynamically evolving random field. A model for the distribution of various molecules in an organ or a tumor, for example, may take this form. In this case, we define a spatial integral operator on spatiotemporal vector-valued functions via
where is now a matrix-valued kernel function. Then our dynamic equation is given by with initial condition . The solution to the dynamic equation is given by [11] where the exponential of the operator can still be defined by using the power-series expansion. We again assume that the input function and the initial distribution are independent vector-valued random fields. The vector-valued random fields and are then defined on . Now fix time and consider the random field on . We define inner products by and the characteristic functional for this field by . For a vector-valued function in the domain of this functional, we define and . We then have the relation , assuming that all of the functional values in this equation are well defined. This model, perhaps in combination with Example 8.E, may be useful for describing pharmacokinetic processes in medical imaging applications.
M. Cascaded Poisson Process
For a cascaded Poisson process, we have
The idea here is that there are primary interactions that take place at positions , but that what is measured are the secondary interactions at positions . This is a model for scintillation detectors, for example [1 (page 670)]. We will examine the simplest situation from the statistical point of view, but more complex cases can also be considered. The random variables and vectors are , , , and . These integers and vectors are all assumed to be independent of each other. The are i.i.d. with PDF , and the are i.i.d. with PDF . The are i.i.d. Poisson random variables with mean , and is Poisson with mean . We define the functions and . In order to present the characteristic functional for this process, we start with the functions defined by . We then compute the function by using . Finally, we have . If is itself a random field, as would happen if we were imaging an ensemble of objects using a scintillation camera, then we have . Note that the cascading of the Poisson process is reflected in the recursive chain that leads to the characteristic functional.
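The first two moments of this cascaded model are easy to verify by simulation: the mean total number of secondary interactions is the product of the mean primary count and the mean number of secondaries per primary. The sketch below uses hypothetical choices (uniform primary positions on [0, 1], Gaussian secondary spread):

```python
import numpy as np

rng = np.random.default_rng(4)

# One realization of a cascaded Poisson process: a Poisson number of primary
# interactions, each spawning a Poisson number of secondaries scattered about
# it.  All parameter values and distributions here are hypothetical.
nbar_primary = 20.0     # mean number of primary interactions
mbar_secondary = 3.0    # mean number of secondaries per primary
spread = 0.01           # std of secondary displacement about its primary

def realization():
    n = rng.poisson(nbar_primary)
    primaries = rng.uniform(0.0, 1.0, n)
    counts = rng.poisson(mbar_secondary, n)
    secondaries = np.repeat(primaries, counts) + rng.normal(0.0, spread, counts.sum())
    return secondaries

# Mean total secondary count should be nbar_primary * mbar_secondary = 60.
total = np.mean([realization().size for _ in range(20_000)])
print(total)
```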
9. IMAGING SYSTEMS
For our purposes, an imaging system collects a finite amount of image data about an unknown object function . We will assume that the object function is a realization from a random field with characteristic functional . We will consider two types of systems corresponding to binned data and list-mode data.
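Before specializing to the two data types, it may help to fix a concrete discretized picture of such a system. The sketch below is a hypothetical stand-in: eight detector bins, each integrating the object against a Gaussian sensitivity function on a quadrature grid, with Poisson counting noise added to the binned means:

```python
import numpy as np

rng = np.random.default_rng(9)

# Discretized surrogate of a binned linear imaging system: each of M = 8
# detector bins integrates the object f against a sensitivity function h_m
# (here a hypothetical Gaussian aperture), approximated by a quadrature sum.
r = np.linspace(0.0, 1.0, 400)
dr = r[1] - r[0]
centers = np.linspace(0.1, 0.9, 8)
H = np.exp(-0.5 * ((r[None, :] - centers[:, None]) / 0.05) ** 2)

f = 1.0 + 0.5 * np.sin(2 * np.pi * 3 * r)   # one object realization
g_mean = H @ f * dr                         # mean (noise-free) data vector
g = rng.poisson(1000.0 * g_mean)            # Poisson-noisy binned counts
print(g_mean.shape, g.shape)
```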
A. Binned Data
For a linear imaging system with a binned -dimensional data vector , we write , where is given by
and is a zero-mean random vector that describes the measurement noise. In general, we do not make any further assumptions about the measurement noise, so is not necessarily statistically independent of the object . If we define the mean data vector by , then we have . By the defining equation of the imaging system, the function is in the domain of the characteristic functional .For additive measurement noise, where is statistically independent of the object , we have the simple relation . For Poisson measurement noise, we define
and we then have [1 (page 666)]. If the measurement noise has a Poisson component plus an additive component, then we have , where is the characteristic function for the additive component. The data in all of these cases is one high-dimensional sample from for each object.
B. List-Mode Data
The data for a list-mode imaging system can be viewed as a Poisson random field on a -dimensional space of attribute vectors. The mean of this random process is given by
which we write as . The attributes assigned to each photon can include position, time, and energy. Such systems are also called photon-processing systems [12]. From our previous discussion, we know that the characteristic functional for the generalized random field with a fixed object function is given by . When the object is a random field, we then have . If we define the mean number of photons as and the attribute space PDF as , then the data for a given object is a set of low-dimensional independent samples from this PDF, where is a Poisson random number with mean .
10. MATHEMATICAL OBSERVERS FOR POPULATIONS OF OBJECTS
In this section, we will discuss how characteristic functionals can be used to create mathematical observers for estimation and classification tasks of populations of objects. For estimation, the task will be to estimate parameters of a model that generated a sample of object fields that are being imaged. Therefore, we are trying to estimate population parameters as opposed to parameters that describe a particular object function. Similarly, for classification, we are trying to use image data to classify a sample of object fields as belonging to one of a finite set of object models. In both cases, the strategy is to use the sample of data vectors to estimate the characteristic function for the data, and then compare this estimate to the analytical form for this characteristic function derived from the characteristic functional of the object model and the model for the measurement noise. We will consider only binned data, but a similar procedure can be implemented for list-mode data also.
A. Estimation Tasks
In this subsection, we assume that the random field is specified by a parameter vector , although more general parameter spaces are certainly possible. We will use the notation for the random field and emphasize again that is a population parameter and does not determine any particular realization of the random field. Our goal is to produce an estimate of this parameter from the data generated by a binned-data imaging system. Define the imaging system operator by for . We will also write and note that we are considering noise-free measurements in this section for simplicity. Generalizing these results to additive or Poisson noise is straightforward. If the parameterized characteristic functional is given by , then we have the parameterized characteristic function of the data: .
Consider a sample of independent realizations of the object field and the corresponding data matrix . We can form an estimate of by using a sample mean:
Since the terms in the sum are independent complex random variables of unit magnitude, the central-limit theorem applies and we may assume that the joint distribution of the random variables for is normal. The mean of this distribution is given by . Our goal is to use maximum-likelihood (ML) or maximum a posteriori (MAP) estimation using the vector as our data. Note that we may choose the number of frequencies to be any positive integer. To perform the estimation, we need the covariance matrix for . To compute this covariance matrix, we start with the quadratic moments and . Now if we define , there are two covariance matrices for this complex vector:
and . Now we are ready to formulate an estimator for the population parameter vector . From the data matrix , we follow the steps just mentioned to create the -dimensional vector :
From the characteristic functional of the object model, the system operator, and the noise model for noisy data, we compute the mean vector and the Hermitian matrix . From the central-limit theorem we then have, to a good approximation for large , . We may then compute the MAP estimate of via . Alternatively, we can take a logarithm and write as . A similar formula can be used for the ML estimate if we have no prior on . See also [13] for another approach to estimating population parameters from characteristic functionals.
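A toy version of this estimation procedure can be written in a few lines. The sketch below replaces the imaging pipeline with a hypothetical scalar stand-in: objects drawn from a zero-mean Gaussian population with unknown variance theta, an empirical characteristic function formed at a few frequencies, and theta recovered by least-squares matching against the model characteristic function (a deliberate simplification of the ML/MAP machinery described above):

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical population model: each object yields one scalar measurement
# g ~ N(0, theta); the task is to estimate the population parameter theta.
theta_true = 2.5
J = 50_000
g = rng.normal(0.0, np.sqrt(theta_true), J)

# Empirical characteristic function of the data at a few chosen frequencies,
# using the convention psi(f) = <exp(-2*pi*1j*f*g)>.
freqs = np.array([0.02, 0.05, 0.08])
emp = np.array([np.mean(np.exp(-2j * np.pi * f * g)) for f in freqs])

def model_cf(theta, f):
    # Gaussian population model: psi(f) = exp(-2*pi^2*f^2*theta).
    return np.exp(-2 * np.pi**2 * f**2 * theta)

# Least-squares matching of empirical and model characteristic functions
# over a grid of candidate theta values (stand-in for the ML/MAP search).
grid = np.linspace(0.5, 5.0, 451)
err = [np.sum(np.abs(emp - model_cf(t, freqs)) ** 2) for t in grid]
theta_hat = grid[int(np.argmin(err))]
print(theta_hat)
```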
B. Classification Tasks
We may use a similar approach for classification tasks as we did for estimation tasks. We suppose that the data is generated from one of a finite set of random fields. If we define , then the classification task comes down to estimating the number from the data. If the prior probability for class is , then we can follow the same processing steps as we described for the estimation task and get a MAP classifier
where the maximum is taken over the set . With the obvious notation, this can also be written as . When the central-limit theorem is applicable, this classifier minimizes the probability of error among all classifiers that use the processed data vector .
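The same matching idea yields a toy classifier. In the sketch below, each class is a hypothetical zero-mean Gaussian object model with its own variance and equal priors, and a data sample is assigned to the class whose model characteristic function is closest to the empirical one:

```python
import numpy as np

rng = np.random.default_rng(8)

# Two hypothetical object-model classes, distinguished only by the variance
# of a scalar Gaussian measurement; equal priors are assumed.
variances = [1.0, 3.0]                       # class 0 and class 1
freqs = np.array([0.03, 0.06, 0.09])

def classify(g):
    # Empirical characteristic function of the sample, compared against each
    # class's model characteristic function exp(-2*pi^2*f^2*variance).
    emp = np.array([np.mean(np.exp(-2j * np.pi * f * g)) for f in freqs])
    errs = [np.sum(np.abs(emp - np.exp(-2 * np.pi**2 * freqs**2 * v)) ** 2)
            for v in variances]
    return int(np.argmin(errs))

g1 = rng.normal(0.0, np.sqrt(variances[1]), 10_000)   # data from class 1
print(classify(g1))    # 1
```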
11. FISHER INFORMATION MATRIX
The Fisher information matrix (FIM) is often used as a figure of merit for an imaging system. The matrix inverse of the FIM is related to estimation performance via the well-known Cramér–Rao lower bound on the variance of an unbiased estimator. The FIM is also directly related to the performance, as measured by the area under the receiver operating characteristic (ROC) curve, of the ideal observer in detecting a small change in the population parameter vector [14,15]. In our case, we can compute the FIM of the processed data vector using
Assuming that the central-limit theorem is valid for , we have
Therefore, this matrix can be computed from the characteristic functional of the object random field, the system operator, and the characteristic function of the noise.
12. CONCLUSIONS
We have provided a brief introduction to the concept of a characteristic functional for a random field and described some of its properties. Our primary purpose for doing this in an imaging context is that characteristic functionals provide a way to describe the statistical properties of object models that represent the ensemble of objects being imaged. We have provided examples of random fields and their characteristic functionals that are useful in imaging, and shown how they are propagated through a noisy imaging system. Finally, we have described methods for using characteristic functionals for estimation and classification tasks, and for computing a Fisher information matrix for an imaging system.
We have seen that there are analytic expressions for many random field models that are useful in imaging. Furthermore, many of these expressions are not very complicated. This is in contrast to the other method of describing the statistics of a random field, the finite-dimensional PDFs for the random vectors consisting of samples of the field at a finite number of points. These PDFs are often complicated or unknown for the examples we have discussed. Since the characteristic functional contains all of the statistical information about the field and can be used for estimating population parameters and classifying populations, it is often a more useful description of a random object model.
Funding
National Institutes of Health (NIH) (P41-EB002035, R01-EB000803); U.S. Department of Homeland Security (DHS) (HSHQDC-14-C-BOOIO).
REFERENCES
1. H. H. Barrett and K. J. Myers, Foundations of Image Science (Wiley, 2004).
2. E. Parzen, Stochastic Processes (SIAM, 1999).
3. A. N. Shiryaev, Probability (Springer, 2000).
4. A. Amini and M. Unser, “Sparsity and infinite divisibility,” IEEE Trans. Inf. Theory 60, 2346–2358 (2014). [CrossRef]
5. J. Fageot, A. Amini, and M. Unser, “On the continuity of characteristic functionals and sparse stochastic modeling,” J. Fourier Anal. Appl. 20, 1179–1211 (2014). [CrossRef]
6. W. Rudin, Functional Analysis (McGraw-Hill, 1973).
7. J. Møller, A. R. Syversveen, and R. P. Waagepetersen, “Log Gaussian Cox processes,” Scand. J. Stat. 25, 451–482 (1998). [CrossRef]
8. P. R. Bouzas, M. J. Valderrama, and A. M. Aguilera, “On the characteristic functional of a doubly stochastic Poisson process: application to a narrow-band process,” Appl. Math. Model. 30, 1021–1032 (2006). [CrossRef]
9. J. W. Goodman, Statistical Optics (Wiley, 2015).
10. P. R. Bouzas, N. Ruiz-Fuentes, and F. M. Ocaña, “Functional approach to the random mean of a compound Cox process,” Comput. Statist. 22, 467–479 (2007). [CrossRef]
11. M. W. Hirsch and S. Smale, Differential Equations, Dynamical Systems, and Linear Algebra (Academic, 1974).
12. L. Caucci and H. H. Barrett, “Objective assessment of image quality V: photon counting detectors and list-mode data,” J. Opt. Soc. Am. A 29, 1003–1016 (2012). [CrossRef]
13. M. A. Kupinski, E. Clarkson, J. Hoppin, and H. H. Barrett, “Experimental determination of object statistics from noisy images,” J. Opt. Soc. Am. A 20, 421–429 (2003). [CrossRef]
14. E. Clarkson and F. Shen, “Fisher information and surrogate figures of merit for the task-based assessment of image quality,” J. Opt. Soc. Am. A 27, 2313–2326 (2010). [CrossRef]
15. F. Shen and E. Clarkson, “Using Fisher information to approximate ideal observer performance on detection tasks for lumpy background images,” J. Opt. Soc. Am. A 23, 2406–2414 (2006). [CrossRef]