Abstract
Characteristic functionals are one of the main analytical tools used to quantify the statistical properties of random fields and generalized random fields. The viewpoint taken here is that a random field is the correct model for the ensemble of objects being imaged by a given imaging system. In modern digital imaging systems, random fields are not used to model the reconstructed images themselves since these are necessarily finite dimensional. After a brief introduction to the general theory of characteristic functionals, many examples relevant to imaging applications are presented. The propagation of characteristic functionals through both a binned and list-mode imaging system is also discussed. Methods for using characteristic functionals and image data to estimate population parameters and classify populations of objects are given. These methods are based on maximum likelihood and maximum a posteriori techniques in spaces generated by sampling the relevant characteristic functionals through the imaging operator. It is also shown how to calculate a Fisher information matrix in this space. These estimators and classifiers, and the Fisher information matrix, can then be used for image quality assessment of imaging systems.
© 2016 Optical Society of America
1. INTRODUCTION
If we are imaging an ensemble of objects, then it is natural to describe this ensemble as a random field on some support region in . (Other terms used for this concept are random function and stochastic process.) For example, in medical imaging, the object is described by a function of three variables that varies unpredictably from one patient to the next. In x-ray computed tomography, this function gives the x-ray attenuation at each point, while in positron emission tomography and single photon emission computed tomography imaging, this function gives the activity distribution inside the body. While there are anatomical structures that are common to all patients, there are also unpredictable variations in these structures throughout the patient population. As another example, for the x-ray imaging of luggage, there is an almost unlimited variety of objects that can be packed in a bag and also great variation in the arrangement of these objects in the bag. The corresponding attenuation maps then vary in an unpredictable way from one bag to the next. In each of these examples, any given attenuation map or activity distribution can be regarded as a realization of a random field defined on a region of space. In medical imaging, this support region can be thought of as a cylinder large enough to contain all of the patients being imaged. In a luggage scanner, the support set may be a box large enough to contain all of the bags being scanned. In all imaging situations, there will be unpredictable variations in the function being imaged. If there were no such variations, then the object function would be known already and there would be no need to create an image. Therefore, we will consider the object functions being imaged to be realizations of a random field defined on a support set.
Describing the statistical properties of a random field can be difficult since we are dealing with essentially a random vector in an infinite-dimensional space. The concept of a probability density function (PDF), which is used to describe the statistics of continuous finite-dimensional random vectors, is not defined in general for a random field. For finite-dimensional random vectors, an equivalent description of the statistics is given by the characteristic function, which is the Fourier transform of the PDF. Indeed, many concepts in probability theory and multivariate statistics are easier to understand if we use the characteristic function as our description of the random vector. Fortunately, the concept of the characteristic function generalizes easily to random fields and is referred to as the characteristic functional in this context.
After a brief discussion of probability models and random variables in Section 2 to establish notation, we introduce the idea of a random field in Section 3. In Section 4, we define the characteristic functional for a random field and discuss some of its properties. Sections 5 and 6 introduce generalized random fields and their characteristic functionals; these ideas are needed to describe point processes, for example. In Section 7, we show how certain mathematical operations with random fields translate to relations among the corresponding characteristic functionals. Section 8 consists of a myriad of examples of random fields and their characteristic functionals that are relevant to imaging applications. We then show how characteristic functionals are propagated through noisy imaging systems in Section 9. This discussion includes both binned and list-mode imaging systems. Finally, in Section 10, we show how characteristic functionals and image data can be used to estimate population parameters, classify populations of objects, and compute a Fisher information matrix for the estimation of population parameters. In all of these applications, an analytic expression for the characteristic functional is essential to carrying out the task. Fortunately, in Section 8, we have a long list of such analytic expressions. We hope that these results will convince readers that modeling object populations as random fields and describing their statistical properties with characteristic functionals is both useful and important for the analysis of imaging systems and the data generated by them.
Since this paper is intended as a tutorial on the use of characteristic functionals in image science, most of the results have been derived elsewhere. Exceptions to this rule are Subsection 8.L on dynamically evolving random fields and Sections 10 and 11 on using characteristic functionals to generate mathematical observers and figures of merit for imaging systems.
2. PROBABILITY MODELS AND RANDOM VARIABLES
To define the characteristic functional of a random field, we must first discuss the definition of a random field. This definition in turn necessitates a brief discussion of probability spaces and random variables.
To define a probability space [1–3] we start with a set , sometimes called the sample space, the space of outcomes, or the set of elementary events. This set may be finite. For example, if photons strike a detector and of them are detected, then because these are the possible outcomes from this experiment. The outcome space may be infinite but discrete. An example of this situation is operating a photon-counting detector being illuminated by a source for a fixed amount of time. In this case, it is standard practice to take . If an imaging system performs real-valued measurements of an object, then . In the theory of random fields, we have to allow for the possibility that is an infinite-dimensional space. The set of all real-valued functions of variables is an example of such a space. In general, we will use the symbol to represent a point in . Thus, may represent an integer, a real number, a finite-dimensional vector, a function, or some other more abstract way of representing outcomes.
The next step in defining a probability model is to define the collection of subsets of that will be assigned a probability. These subsets are usually called events. When is a discrete set, is usually simply the collection of all possible subsets of . Thus, in this case, we will assign a probability to any subset of outcomes. When is not a discrete set, then normally does not consist of all possible subsets of . In general, all that we require of the collection is three conditions. The first condition is that itself is one of the subsets in the collection. The second condition is that, for any subset in the collection, the complement of , which is the subset of points in that are not in , is also in the collection. The third condition is that, if the subsets are in the collection for , then the subset consisting of the union of these subsets,
is also in the collection. A collection of subsets satisfying these properties is called a σ-algebra.

The final ingredient for a probability model is a probability measure that assigns a probability to each event in . We always have . To state the only other condition on the probability measure, we begin with a sequence of subsets that are in for . We assume that the events are mutually disjoint, i.e., is empty for . Then, if is the union event as just described, the probability measure must satisfy
This is just the usual condition that probabilities add when the events are disjoint, and is called countable additivity.

A probability model is specified by the triple consisting of a sample space, a σ-algebra of subsets of this space, and a probability measure on this collection of subsets. The concept of a probability model forms the foundation of probability theory, which in turn is essential for any discussion of random fields and characteristic functionals. For those not familiar with σ-algebras and probability measures, we will discuss a few examples.
In the first example, we will consider the photon-counting system where and an event will be any subset of . Since the one-element set is an event, it has an associated probability . For any event, we then have
This is just the usual way to assign a probability to a subset of a discrete outcome space.

In the second example, we will take to be the set of all real numbers. If the outcome of an experiment is a measurement of a continuous quantity, then this is the appropriate outcome space. If the real-valued function on satisfies for all and
then we say that this function is a PDF. The corresponding probability measure is defined by when the integral exists. This last caveat is important, since there are subsets of for which this integral does not exist. The existence of these non-measurable sets has been known for some time and was a motivation for restricting the subsets to which we want to assign a probability, the events, to all be members of a σ-algebra. For the interested reader it is not difficult to find descriptions of such sets, but they are extremely unlikely to arise in applications. We say that a σ-algebra is generated by a collection of subsets of if it is the smallest σ-algebra containing all of those subsets. When is the real line , then we normally use the σ-algebra generated by the intervals, the Borel σ-algebra, to define a probability measure. For events in the Borel σ-algebra, the integral in Eq. (5) is always defined.

As a final example of a situation where we need to restrict the subsets to which we want to assign a probability, consider to be the space of all real-valued functions on . We may think of an experiment where the outcome is an oscilloscope trace. In a typical measurement, we want to assign a probability to an event of the form . This event corresponds to a finite number of measurements of the function at the points , with the results falling into the given intervals. We can then take to be the σ-algebra generated by these subsets for our probability model. In this case, however, the subset is not a member of and cannot be assigned a probability since it requires an uncountably infinite number of measurements to verify that a given function is in . We are usually not bothered by this, since we cannot make an uncountably infinite number of measurements anyway. However, this example does show that the natural σ-algebra for a probability model may be very far from containing all of the subsets of .
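The second example can be sketched numerically. In this toy sketch (a standard Gaussian PDF and an interval event, both chosen arbitrarily for illustration), the probability the measure assigns to the interval is the integral of the PDF over it, which we compare with the empirical frequency of outcomes landing in the interval:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sketch of the second example: pr(x) is a standard Gaussian PDF and the
# event A is the interval (a, b); the measure assigns Pr(A) = integral of pr
# over A, compared here against an empirical frequency of outcomes.
a, b = -1.0, 0.5
n_grid = 20_000
xm = a + (np.arange(n_grid) + 0.5) * (b - a) / n_grid   # midpoint grid on A
pdf = np.exp(-xm**2 / 2.0) / np.sqrt(2.0 * np.pi)
pr_quadrature = pdf.sum() * (b - a) / n_grid            # integral of pr over A

samples = rng.standard_normal(500_000)
pr_frequency = np.mean((samples > a) & (samples < b))
print(pr_quadrature, pr_frequency)
```

The two numbers agree to Monte Carlo accuracy, illustrating that the PDF induces a well-defined probability measure on interval (Borel) events.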
In any case, we will only use the general concept of a probability model as just outlined to define what a random field is and what the characteristic functional of a random field is. Once that is accomplished, we will not have to worry very much about the details of the probability model used to define any given random field.
A real random variable is a real-valued function on the sample space that satisfies the following measurability condition. For any real number , the subset of defined by is in the collection of events, and may thus be assigned a probability. A complex random variable is defined by insisting that and are both real random variables. We can also have real random vectors of any dimension by insisting that each component of the vector is a real random variable. This definition is also valid for complex random vectors.
For a real random variable , the PDF is defined by the condition
for any real number . We will use the angle-bracket notation for expectations of functions of a random variable: The condition for this integral to exist is that is itself a random variable. This notation will also be used for expectations of functions of finite-dimensional random vectors, in which case there is a multivariate PDF that can be used to compute them.

3. RANDOM FIELDS
Let be some -dimensional support set for all functions in this section. This may be a subset of or a -dimensional manifold such as a sphere. The support set may even be a discrete set of points in , such as a regular grid in this space. A random field [1–3] is a function of points in and in such that is a random variable for each . We may also describe a random field as a collection of random variables for each in . For a fixed sample point , the function is called a realization of the random field. Sometimes further constraints are placed on the random field, such as assuming that all of the realizations are continuous functions of . In general, however, we cannot expect the realizations to be continuous. We will use the common notation for a random field. This notation has the potential for ambiguity, but it will be clear from the context whether we are referring to a random field or an ordinary function of .
Given an ordered set of points in , the vector is a random vector with a corresponding multivariate PDF on or . The Kolmogorov extension theorem [3] provides two sets of consistency conditions that this collection of PDFs has to satisfy. The two sets of conditions arise from permutations of the points in and marginalization over any of the variables . Conversely, if the conditions in the theorem are satisfied, then there is a random field corresponding to the collection of PDFs. This theorem is often used to derive the existence of a random field from a suitable collection of PDFs.
For any realization of a random field, we will need to consider the inner product defined by
For a discrete support set , the integral is replaced with a sum here and anywhere below where an integral over appears. Any linear combination of random variables is also a random variable. Therefore, if we think of the integral in this expression as a limit of Riemann sums, where , then it is a limit of a sequence of random variables. As long as this limit exists for all , the integral defines a random variable , which will also be written as . We will also write

If we have two functions and that satisfy this condition and , then is also a random variable. Thus, the set of functions such that is a well-defined random variable forms a vector space. The details of the problem of determining which functions are in this space depend on the random field itself, and we will not be discussing that issue in this work. This issue is discussed in detail elsewhere and involves a considerable amount of mathematics [4,5]. In all of the equations that follow, we will simply assume that is a function in this space.
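The Riemann-sum construction above can be sketched numerically. In this toy example (a white-noise field sampled on a grid over an assumed support [0, 1], with an arbitrary fixed function u), each realization of the field yields one value of the inner product, so the inner product is itself a random variable:

```python
import numpy as np

rng = np.random.default_rng(1)

# Sketch: sample realizations of a toy random field f on a grid over an
# assumed support S = [0, 1], and form the scalar <u, f> as a Riemann sum;
# each realization yields one number, so <u, f> is itself a random variable.
n_grid, n_realizations = 256, 10_000
r = np.linspace(0.0, 1.0, n_grid, endpoint=False)
dr = r[1] - r[0]

f = rng.standard_normal((n_realizations, n_grid))  # toy white-noise field
u = np.sin(2.0 * np.pi * r)                        # a fixed function u(r)

inner = (f * u).sum(axis=1) * dr                   # <u, f> per realization
print(inner.shape, inner.mean(), inner.var())
```

For this toy field the resulting random variable has mean zero and variance equal to the Riemann sum of u squared times the grid spacing, which the printed sample statistics reproduce.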
4. CHARACTERISTIC FUNCTIONALS OF RANDOM FIELDS
Another way to think of a random field is as a map that assigns the function to each in the sample space. Since the set of functions on forms a vector space, this is analogous to a random vector in an infinite-dimensional space. However, there is no easy way to generalize the concept of a PDF for a finite-dimensional random vector to a PDF for an infinite-dimensional random vector. The main reason for this is that there is no way to easily generalize an integral of the form
to the case where is replaced with an infinite-dimensional function space. Thus, the concept of a PDF is not very useful for random fields. On the other hand, the characteristic function for a random vector, which is given by is also very useful for problems involving random vectors and is easily generalized to random fields. In this equation, the dagger superscript indicates the conjugate transpose. Of course, for random vectors, the characteristic function is just the Fourier transform of the PDF and is therefore an equivalent description of the statistics of the random variable. The same is true for the characteristic functional of a random field even though there may be no PDF.

The characteristic functional for a random field is defined by
As noted previously, we always assume that the function in the argument of the characteristic functional satisfies the condition that the scalar is a well-defined random variable. Thus, the expectation in the definition of the characteristic functional can be computed from the one-dimensional PDF for . There are certain conditions that a characteristic functional must satisfy. The normalization condition is . The Hermitian symmetry condition is . The magnitude of the characteristic functional always satisfies . The last condition is called non-negative definiteness, which must be true for any , any set of functions for which is defined for all , and any complex vector . The condition in this equation is equivalent to constraining the matrix given by to be non-negative definite. A generalization of Bochner’s theorem [3] to this setting shows that, if we add in a continuity condition, then these constraints on a functional ensure that it is the characteristic functional of a random field.

Suppose that is a real random field. Since is a random variable, we may use its PDF to compute a mean value: . Similarly, is a random vector and its PDF may be used to compute the correlation function . The covariance function for the random field is then given by . For a real random field, we have when is a real-valued function, which will be the assumption made in this paragraph. For a parameter , we then have
So the mean of the random field is determined by the characteristic functional. Taking another derivative, we get We will write this equation in the form where the integral operator is given by the inner integral in the previous equation. The characteristic functional therefore determines the correlation function and hence the covariance function also. There are similar expressions for higher-order moments of the random field. There is also an integral operator whose kernel is the covariance function . Spectral theory for random fields is the study of the spectrum of this operator, which can also be derived from the characteristic functional in the obvious way.

For complex random fields, the correlation function is usually defined by , with the corresponding covariance function . If we introduce the complex parameter , then this correlation function is given by
Since is a Hermitian operator, the quantity is real. To complete the description of the second-order statistics for a complex random process, we also need another correlation function defined by , with the corresponding covariance function . In terms of the characteristic functional, we then have and

Note that is, in general, not a Hermitian operator.
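The derivative relations of this section can be checked numerically in finite dimensions. This sketch assumes the convention Ψ(ξ) = ⟨exp(−iξ·u)⟩ (the stripped formulas here may use a different sign or 2π convention), with a toy Gaussian random vector standing in for the field; the first β-derivative of Ψ(βξ) at β = 0 recovers −i times the projection of the mean onto ξ:

```python
import numpy as np

rng = np.random.default_rng(2)

# Finite-dimensional sketch of the derivative relation: under my assumed
# convention Psi(xi) = <exp(-i xi . u)>, the first derivative of Psi(beta*xi)
# in beta at beta = 0 equals -i * <xi, mean>. All numbers are toy choices.
dim, n = 4, 400_000
mean = np.array([0.5, -0.2, 0.1, 0.3])
u = mean + 0.3 * rng.standard_normal((n, dim))

xi = np.array([1.0, 2.0, -1.0, 0.5])
proj = u @ xi                                   # samples of <xi, u>

def psi(beta):
    return np.exp(-1j * beta * proj).mean()     # empirical Psi(beta * xi)

h = 1e-3
deriv = (psi(h) - psi(-h)) / (2.0 * h)          # central difference in beta
recovered = (1j * deriv).real                   # should equal <xi, mean>
print(recovered, xi @ mean)
```

A second derivative in the same spirit would recover the projected correlation matrix, mirroring how the characteristic functional determines the correlation function.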
5. GENERALIZED RANDOM FIELDS
Some physically important random fields, such as Poisson point processes, are actually generalized random fields. For a generalized random field we do not insist that is a random variable since, for any given realization, may not be well defined as an ordinary function. Instead, we restrict to be a test function and define
Here, is taken to be a generalized function [1,6] for all sample points . The requirement for a generalized random field is that be a well-defined random variable for all test functions . We will also write and think of for a particular test function as the output of a random distribution on the space of test functions defined on . This notation emphasizes the fact that the space of test functions on plays the same role for a generalized random process as the set itself does for an ordinary random process. Thus, we can also write as defining the random variable , and as defining a realization of the generalized random field. If we have a vector of test functions , then we have a random vector with corresponding PDF .

6. CHARACTERISTIC FUNCTIONALS OF GENERALIZED RANDOM FIELDS
Just as with ordinary random fields, extending the concept of a PDF to a generalized random field is difficult. We would need the PDF to be defined on the space of distributions on , which is an infinite-dimensional space that includes, for example, the space of locally integrable functions on . However, the characteristic functional for a generalized random field is easily defined by , where the argument of the functional is taken to be a test function. The expectation in this expression is computed using the PDF for the random variable . All of the statistical information about the generalized random field is contained in this functional.
A generalized random field is real-valued if is real whenever is a real-valued test function. In this case, we may define the mean of the generalized random field as a distribution by using . Since, for any given test function , is a random variable, the expectation in this definition may be computed by using the PDF for that random variable. We may also write this definition as . Similarly, for a pair of test functions , the random vector has a PDF that can be used to compute the correlation operator via . Notice that we do not attempt to define a correlation function, which in many cases would be a generalized function itself. The covariance operator is then defined by .
For complex generalized random fields, we have the Hermitian correlation operator defined by , and the corresponding covariance operator given by . As with ordinary random fields, to complete the description of the second-order statistics of a complex generalized random field, we need the operator defined by . The corresponding covariance operator is then given by . The relations between the characteristic functional and these first- and second-order statistics are the same as those for ordinary random fields given previously.
7. OPERATIONS WITH RANDOM FIELDS
Various mathematical operations with random fields or generalized random fields give rise to relations between characteristic functionals. If and are two random fields on with corresponding probability models and , then there is a probability model on with a probability measure that satisfies . We can then define a random field on by . We usually write in this case and say that the two random fields and are statistically independent. The characteristic functionals for these random fields are related by the equation . Alternatively, we may take this equation as the definition of statistical independence for two random fields that share a probability model . The obvious condition for this equation to be valid is that it must be permissible to use in the argument of all of the characteristic functionals involved.
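The factorization for independent fields can be verified on a sampled grid. This toy sketch assumes the convention Ψ(ξ) = ⟨exp(−i⟨ξ, u⟩)⟩ with a Riemann-sum inner product; the grid, test function, and field models are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(3)

# Sketch on a sampled grid: for statistically independent fields u and v, the
# characteristic functional of the sum factors:
# Psi_{u+v}(xi) = Psi_u(xi) * Psi_v(xi).
n_grid, n = 64, 200_000
dr = 1.0 / n_grid
r = np.linspace(0.0, 1.0, n_grid, endpoint=False)
xi = np.cos(2.0 * np.pi * r)                    # an arbitrary test function

u = rng.standard_normal((n, n_grid))            # toy Gaussian field
v = rng.exponential(1.0, size=(n, n_grid))      # independent non-Gaussian field

def psi(field):
    # Empirical characteristic functional, assumed convention
    # Psi(xi) = <exp(-i <xi, field>)> with a Riemann-sum inner product.
    return np.exp(-1j * (field @ xi) * dr).mean()

lhs = psi(u + v)
rhs = psi(u) * psi(v)
print(abs(lhs - rhs))                           # small Monte Carlo residual
```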
In contrast to the technical issues for summing two random fields, multiplying a random field by a non-random function is straightforward and leads to the relation . The one constraint here is that the function must be in the space of functions that can be used in the argument of the functional . For example, if is a generalized random field and is infinitely differentiable, then is also a generalized random field. If is now a test function, then is also a test function and the equality is valid.
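The scaling relation is a direct consequence of moving the fixed function across the inner product, since ⟨ξ, au⟩ = ⟨aξ, u⟩. A toy grid sketch (assumed convention and arbitrary choices of a and ξ) makes this explicit:

```python
import numpy as np

rng = np.random.default_rng(10)

# Sketch on a sampled grid: multiplying a field by a fixed function a(r) moves
# a into the argument, since <xi, a*u> = <a*xi, u>, so Psi_{a u}(xi) = Psi_u(a xi).
n_grid, n = 64, 100_000
dr = 1.0 / n_grid
r = np.linspace(0.0, 1.0, n_grid, endpoint=False)
a = 1.0 + 0.5 * np.cos(2.0 * np.pi * r)         # a fixed, non-random function
xi = np.sin(2.0 * np.pi * r)                    # an arbitrary test function

u = rng.standard_normal((n, n_grid))            # toy field realizations

def psi(field, test_fn):
    return np.exp(-1j * (field @ test_fn) * dr).mean()

lhs = psi(a * u, xi)                            # Psi_{a u}(xi)
rhs = psi(u, a * xi)                            # Psi_u(a xi)
print(abs(lhs - rhs))                           # identical up to floating point
```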
Now suppose that is a linear integral operator given by
If we can show that the integral defining converges for all in and all sample points , then this equation defines a random field on . In this case, we usually write as a relation between two random fields. The characteristic functionals of these two random fields are related by , where the adjoint operator is defined by . In the relation just given between characteristic functionals, must be in the space of functions for which both sides of the equation are well defined. Note that this relation between characteristic functionals makes sense even if is a differential operator. In this case we can, if necessary, restrict to be a test function so that can be treated as a generalized random field defined by the relation .

Now suppose that we have a set of random fields on such that they are all defined with the same probability model for . Also assume that we have a set of probabilities . We can create a new probability model such that the samples are pairs and . The mixture random field for this situation is defined by . We can think of the realizations of this random field as being equal to with probability . We then have that the characteristic functional for the mixture random field is given by the convex combination
of the characteristic functionals of the component random fields. Again, the condition for this equation to be valid is that it must be permissible to use in the argument of all of the characteristic functionals involved. With this constraint, we can see that any convex combination of characteristic functionals is also a characteristic functional.

8. EXAMPLES
We now look at some examples of characteristic functionals that are relevant for imaging. In most cases, we do not present a derivation of the characteristic functionals since they are readily available elsewhere.
A. Real Gaussian
A random field on is Gaussian or normal if its characteristic functional has the form [1 (page 410),2]. In this equation, is the mean of the random field and is the covariance operator as described above. If the mean is zero and the covariance operator is a multiple of the identity operator, then we have a zero-mean white noise Gaussian random field. This noise field is often used in the description of Brownian motion. Note that a white noise Gaussian random field is actually a generalized random field since the kernel of the covariance operator is given by , which is a generalized function of two variables. Due to the central limit theorem, Gaussian random fields often occur in applications as an approximation for a random field that is a sum of many independent random fields.
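The Gaussian form can be checked in finite dimensions, where a sampled grid stands in for the support set. This sketch assumes the convention Ψ(ξ) = ⟨exp(−iξ·u)⟩ = exp(−iξ·μ − ½ξ·Kξ) (the stripped formula may use a different sign or 2π convention); μ, K, and ξ are arbitrary toy choices:

```python
import numpy as np

rng = np.random.default_rng(5)

# Finite-dimensional sketch of the Gaussian characteristic functional, under
# my assumed convention Psi(xi) = exp(-i xi . mu - 0.5 * xi . K xi).
dim, n = 3, 400_000
mu = np.array([0.2, -0.1, 0.4])                 # toy mean
A = rng.standard_normal((dim, dim))
K = A @ A.T / dim + 0.5 * np.eye(dim)           # a valid covariance matrix

u = rng.multivariate_normal(mu, K, size=n)      # Gaussian samples
xi = np.array([0.3, -0.5, 0.2])                 # an arbitrary test vector

empirical = np.exp(-1j * (u @ xi)).mean()
analytic = np.exp(-1j * (xi @ mu) - 0.5 * (xi @ K @ xi))
print(abs(empirical - analytic))
```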
Suppose that we have a random field that satisfies for all realizations and all in . If the random field is a Gaussian random field, then we say that is a log-normal random field [1 (page 1462),7]. There is no analytic form for the characteristic functional of a log-normal random field. However, in this case, the functional
is well defined whenever is in the domain of the characteristic functional of the random field . Note that the operator in this expression is the covariance operator for and not for . If and are statistically independent positive random fields and , then . Therefore, due to the central limit theorem, log-normal random fields often show up as an approximation for a random field that is a product of many independent positive random fields.

B. Circular Gaussian
A complex random field on is a circular Gaussian if its characteristic functional is given by , where the covariance operator is a Hermitian operator. It is often assumed that the electric field in partially coherent light is a circular Gaussian random field. We can also have complex random fields that are Gaussian but not circular Gaussian. The characteristic functional in this case involves both covariance functions for the complex random field.
C. Poisson Random Field
A random point field is a generalized random field that satisfies
where is a random integer with a distribution , and, for a given , the points are also randomly selected [1 (page 649)]. A typical restriction we could place on for this expression to be well defined is that it be continuous.

For a Poisson random point field, we have the Poisson distribution for :
We also require that the points be chosen independently from a PDF . The mean of this generalized random field satisfies and therefore

If we define the function by , then the characteristic functional of a Poisson random field is given by . Poisson random fields are often used to describe the distribution of photon locations on a photon-counting detector. In this case the mean of the field is called the photon fluence, and it determines the statistics of the field completely. Poisson random fields are also used to describe photon-emitting objects that are being imaged.
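The Poisson characteristic functional can be verified by simulation. This sketch assumes the standard form Ψ(ξ) = exp(∫ b(r)(e^{−iξ(r)} − 1) dr) for intensity b(r) (the stripped formula here may differ in convention), with a uniform intensity on [0, 1] and an arbitrary test function:

```python
import numpy as np

rng = np.random.default_rng(6)

# Monte Carlo sketch of the Poisson point-process characteristic functional on
# S = [0, 1]: with intensity b(r) = lam (uniform), my assumed convention gives
# Psi(xi) = exp( integral_S b(r) * (exp(-i * xi(r)) - 1) dr ).
lam, n_trials = 5.0, 100_000
xi = lambda r: 0.7 * np.sin(2.0 * np.pi * r)    # an arbitrary test function

counts = rng.poisson(lam, size=n_trials)        # N for each realization
pts = rng.random(counts.sum())                  # all point locations, pooled
trial = np.repeat(np.arange(n_trials), counts)  # trial index of each point
sums = np.bincount(trial, weights=xi(pts), minlength=n_trials)
empirical = np.exp(-1j * sums).mean()

rm = (np.arange(20_000) + 0.5) / 20_000         # midpoint grid on [0, 1]
analytic = np.exp(lam * np.mean(np.exp(-1j * xi(rm)) - 1.0))
print(abs(empirical - analytic))
```

Realizations with zero points correctly contribute a factor of one to the empirical average.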
D. Random Fields Related to Poisson Random Fields
Suppose that the mean of a Poisson random field is itself a random field. Then we have immediately that the characteristic functional of is given by . In this case, we say that is a doubly stochastic Poisson random field [1 (page 659),8]. If we are imaging an ensemble of objects with a photon-counting detector, then a doubly stochastic random field is the appropriate model of the statistics of the emitted or detected photon locations.
If we convolve a Poisson random field with a continuous function , then we have a filtered Poisson random field [1 (page 662)]:
This type of field is sometimes used to model textures in objects being imaged. If we define the Hermitian conjugate of the filter function via and define , then we have for the characteristic functional of the filtered Poisson random field.

We can have a doubly stochastic filtered Poisson random field also. In this case, the characteristic functional is given by . Other generalizations are also possible. For example, the filter could have the more general form , i.e., a space-variant filter. It is straightforward to work out the characteristic functional in these cases also.
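A filtered Poisson field is easy to simulate, and its first moment offers a simple check: by Campbell's theorem, the mean of the filtered field at a point is the intensity convolved with the kernel. This sketch uses a uniform intensity and an assumed Gaussian kernel, both arbitrary:

```python
import numpy as np

rng = np.random.default_rng(7)

# Filtered-Poisson (shot-noise) sketch on S = [0, 1]: u(r) = sum_n p(r - r_n)
# with points from a uniform-intensity Poisson process. Kernel and rate are
# arbitrary choices. Campbell's theorem: <u(r)> = integral b(r') p(r - r') dr',
# verified here at a single point r0.
lam, n_trials = 8.0, 50_000
p = lambda r: np.exp(-0.5 * (r / 0.05)**2)      # assumed Gaussian blur kernel
r0 = 0.5                                        # evaluation point

counts = rng.poisson(lam, size=n_trials)
pts = rng.random(counts.sum())
trial = np.repeat(np.arange(n_trials), counts)
u_at_r0 = np.bincount(trial, weights=p(r0 - pts), minlength=n_trials)

rm = (np.arange(20_000) + 0.5) / 20_000
campbell = lam * np.mean(p(r0 - rm))            # b(r') = lam on [0, 1]
print(u_at_r0.mean(), campbell)
```

The realizations in `u_at_r0` are samples of the filtered field at one point; texture models evaluate the same sums on a whole grid of points.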
E. Partitioned Random Fields
We will say that the random field is partitioned if we can write the support set as a disjoint union
and we have mutually independent random fields on each set such that Here, is the restriction of to the set [1 (page 428)]. The coefficients in this sum are the components of a random vector with PDF . We can think of the random fields as generating random textures in each region that are statistically independent of each other, but the random vector introduces correlations in the amplitudes of these textures between the regions. This can be used, for example, to model the distribution of a drug or radiotracer in the organs of each member of an ensemble of subjects being imaged. If each is a filtered Poisson random field with filter , then we define the functions by and the vector by . The characteristic functional for the partitioned random field is then given by .

F. Square of a Real Gaussian Random Field
First we consider a finite-dimensional, zero-mean Gaussian random vector with covariance matrix . Define a random vector by . If we use to denote the diagonal matrix with the components along the main diagonal, then we may write the characteristic function of as
This integral can be performed analytically and the result is [1 (page 1255)]

Now consider a zero-mean Gaussian random field with covariance operator . We define the corresponding intensity field by . This is called the intensity field in analogy with optical applications where would be the (real) electric field from a finite-bandwidth source, and the detector responds to the intensity field. If we define the diagonal operator for acting on a function by the equation , then the obvious generalization of the finite-dimensional formula is
This determinant is well defined if is a trace-class operator since, in this case, is also a trace-class operator and the determinant can be defined as an infinite product of eigenvalues. However, just because this functional is well defined does not mean it is necessarily the characteristic functional of the random field .
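The finite-dimensional determinant formula that motivates this expression can be checked directly by Monte Carlo. This sketch assumes the convention Ψ_z(ξ) = ⟨exp(−iξ·z)⟩ = det(I + 2i D_ξ K)^{−1/2} for z the componentwise square of a zero-mean Gaussian vector with covariance K; ξ is kept small so the principal branch of the complex square root applies:

```python
import numpy as np

rng = np.random.default_rng(8)

# Monte Carlo check of the determinant formula in finite dimensions, under my
# assumed convention Psi_z(xi) = <exp(-i xi . z)> for z = x**2 (componentwise),
# x a zero-mean Gaussian vector with covariance K:
# Psi_z(xi) = det(I + 2i * D_xi * K)**(-1/2), with D_xi = diag(xi).
dim, n = 3, 500_000
A = rng.standard_normal((dim, dim))
K = A @ A.T / dim + 0.5 * np.eye(dim)           # an arbitrary valid covariance

x = rng.multivariate_normal(np.zeros(dim), K, size=n)
z = x**2
xi = np.array([0.2, -0.1, 0.15])                # small xi: principal branch is safe

empirical = np.exp(-1j * (z @ xi)).mean()
analytic = np.linalg.det(np.eye(dim) + 2j * np.diag(xi) @ K) ** (-0.5)
print(abs(empirical - analytic))
```

For larger ξ the square root must be continued continuously in ξ rather than taken on the principal branch, which is one reason the infinite-dimensional statement requires the careful argument sketched below.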
To see why we might believe that the given expression for is correct, we consider the following special cases:
We will assume that the Gaussian random field is an ordinary, as opposed to a generalized, random field. In this case, these sums are allowable for because which is a well-defined random variable. We then have which shows that is a finite-rank operator with rank no greater than . Therefore, there are at most nonzero eigenvalues for this operator and we can find them by assuming that an eigenfunction has the form . After substituting this into the eigenvalue equations, we arrive at
Thus, the nonzero eigenvalues of for this are the eigenvalues of . Therefore we can say that, if there is a random field for which (as defined previously) is the characteristic functional, then this random field will have the same finite-dimensional PDFs as . By the Kolmogorov theorem, we would then have that this random field is . By Bochner’s theorem, as just given is the characteristic functional of a random field if it satisfies the constraints given in Section 4. The constraint of non-negative definiteness is the hard one to check, but we can argue as follows. Let the sampled function be defined by where the form a regular grid in . Then the operator is a good approximation to , and this approximation gets better as we refine the grid. Since is a finite-rank operator, we know that satisfies all of the constraints if we restrict to sampled functions. By continuity, we then have that the constraints are satisfied for all . This argument is really a sketch of a proof that we have the correct expression for and needs work to make it rigorous. Filling this sketch out is, however, beyond the scope of this paper.
G. Square Magnitude of a Circular Gaussian
Now suppose that is a circular Gaussian complex random field and . Using the results of the previous subsection, we have
This is the characteristic functional for a commonly used model of speckle in coherent imaging. The characteristic functions for the finite-dimensional PDFs are given by . For the case, the inverse 1D Fourier transform integral can be computed analytically to give a chi-square PDF for the random variable at a fixed . The case can also be computed analytically in terms of the Bessel function [1 (page 1258),9].
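For the single-point case, the exponential (chi-square with two degrees of freedom) intensity PDF and the corresponding characteristic function 1/(1 + 2*pi*1j*xi*sigma2) can be verified directly by simulation. The sketch below uses a hypothetical mean intensity sigma2 = 2 and the convention psi(xi) = <exp(-2*pi*1j*xi*I)>:

```python
import numpy as np

rng = np.random.default_rng(1)

# Circular Gaussian complex amplitude at a single point, with total variance
# sigma2 split equally between real and imaginary parts; the intensity |u|^2
# is then exponentially distributed -- the classic fully developed speckle law.
sigma2 = 2.0                     # hypothetical mean intensity
n = 500_000
u = np.sqrt(sigma2 / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
I = np.abs(u) ** 2

# Empirical single-point characteristic function versus the analytic one,
# psi(xi) = <exp(-2*pi*1j*xi*I)> = 1 / (1 + 2*pi*1j*xi*sigma2).
xi = 0.07
emp = np.mean(np.exp(-2j * np.pi * xi * I))
ana = 1.0 / (1.0 + 2j * np.pi * xi * sigma2)
print(abs(emp - ana))            # small Monte Carlo error
```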
H. Complex-Amplitude Poisson Random Field
An example of a complex-amplitude Poisson random field has the form
where, for a given , the amplitudes , the phases , and the locations are all independent of each other. The number is a Poisson random variable with mean . Conditioned on , we have that the locations are i.i.d. with PDF and the amplitudes are i.i.d. with PDF . A common assumption that we will follow is that the phases are i.i.d. and uniformly distributed on . This is a common model for a scattered radiation field from a random collection of point scatterers, and is also known as a Cox process [10]. We define the function and
where is the zero-order Bessel function. We then have that the characteristic functional of the scattered field is given by [1 (page 1317)]. If we have an ensemble of objects, each consisting of a collection of point scatterers, then is itself a random field and the characteristic functional of is given by . If the propagation of the field through space or an imaging system is defined by an operator , then . If the propagation is implemented by convolving with a point spread function, we define, as before, and have for a fixed collection of scatterers and for an ensemble of objects.
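A realization of this scattered-field model is straightforward to simulate, which is often the quickest sanity check on the analytic characteristic functional. The sketch below uses hypothetical choices (a Gaussian point spread function, uniform scatterer positions on [0, 1], and Rayleigh amplitudes) and verifies that the uniform phases force the mean field to vanish:

```python
import numpy as np

rng = np.random.default_rng(2)

# One realization of the scattered-field model: a Poisson number of point
# scatterers with i.i.d. positions, amplitudes, and uniform phases, each
# contributing a shifted copy of a point spread function p.  All specific
# choices here (Gaussian psf, uniform positions, Rayleigh amplitudes, nbar)
# are hypothetical stand-ins.
nbar = 40.0                              # mean number of scatterers

def psf(r):
    return np.exp(-0.5 * (r / 0.05) ** 2)

def field(r_grid):
    n = rng.poisson(nbar)
    pos = rng.uniform(0.0, 1.0, n)
    amp = rng.rayleigh(1.0, n)
    phase = rng.uniform(0.0, 2 * np.pi, n)
    return (amp * np.exp(1j * phase)) @ psf(r_grid[None, :] - pos[:, None])

r = np.linspace(0, 1, 101)
# Average over many object realizations: uniform phases force <u(r)> = 0.
m = np.mean([field(r) for _ in range(2000)], axis=0)
print(np.max(np.abs(m)))                 # near zero
```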
I. Innovation Random Field
Sparsity of a random field is a property often assumed in compressed sensing that allows better object reconstructions. One model for a sparse random field assumes that where is a stationary white noise field, sometimes called an innovation field. We then have . To describe the statistical properties of an innovation field, we introduce the translation operators and the random variables . The stationarity assumption is that for any and functions , the -dimensional PDF for the -dimensional random vector with components is independent of the translation vector . For a stationary random field, we necessarily have . The white noise assumption is that whenever for all in , the random variables and are statistically independent. In this case it can be shown that [4,5]
where, for some and , The differential is called the Lévy measure and must satisfy the constraint . Therefore any random field that is the result of a linear operator acting on an innovation field must have a characteristic functional of the form , where the functional is as just given. If the PDFs for the random variables for some basis sequence are highly kurtotic with long tails, then this sequence can be thought of as a sparsity basis for the random field . This requirement on these PDFs can then be related to properties of the Lévy measure for the innovation process.
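A discretized illustration of this sparsity mechanism: if the innovation samples are heavy-tailed (here, hypothetically, i.i.d. Laplace variables), the filtered field inherits excess kurtosis, whereas a Gaussian innovation would give exactly zero. The short filter below is an arbitrary stand-in for the linear operator acting on the innovation field:

```python
import numpy as np

rng = np.random.default_rng(5)

def excess_kurtosis(x):
    # Sample excess kurtosis: 0 for Gaussian data, positive for long tails.
    x = x - x.mean()
    return np.mean(x**4) / np.mean(x**2) ** 2 - 3.0

# Discretized innovation: i.i.d. Laplace samples (excess kurtosis 3),
# a hypothetical heavy-tailed, infinitely divisible choice.
n = 200_000
w = rng.laplace(0.0, 1.0, n)

# Filtered field u = L w, with a short exponential smoothing kernel standing
# in for the operator L; filtering reduces, but does not remove, the excess
# kurtosis inherited from the non-Gaussian innovation.
kernel = np.exp(-np.arange(8) / 2.0)
u = np.convolve(w, kernel, mode="same")
print(excess_kurtosis(w), excess_kurtosis(u))
```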
J. Plane-Wave Random Field
Suppose that the random field is a superposition of a random number of plane waves with random amplitudes, phases, and spatial frequencies:
This would be the case, for example, if was the inverse Fourier transform of a point process with complex amplitudes in frequency space. For a given , we will assume that the amplitudes and spatial frequencies are independent of each other and are i.i.d. with PDFs and , respectively. If is the Fourier transform of and is Poisson distributed with mean , then the characteristic functional of this random field is given by . If we assume that the spatial frequencies follow a circular Gaussian distribution, with variance , then
The single-point characteristic function for the random variable is derived by setting , which results in
There is no analytic form for the PDF corresponding to this characteristic function.
K. Infinite-Series Random Field
Suppose that the function describes a random field on a discrete set of integer vectors . A Markov random field [1 (page 428)] is an example of this situation. If the functions form a set of functions on indexed by these integer vectors, then we may define a random field on via
as long as the series converges for all in . A wavelet series expansion, for example, would have this form. We then write where is understood to be a random field on the index set. We also have, for appropriate functions , the series which defines a random variable. We will write this equation as . If we define a vector on the index set via , then the characteristic functional for the random field is given by . This type of model can be used for compressed-sensing applications where the functions could form a wavelet basis or some similar basis or frame.
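A minimal numerical sketch of such a series model, with a hypothetical cosine basis and independent coefficients whose variances decay like 1/k**2, so that the truncated series converges in mean square:

```python
import numpy as np

rng = np.random.default_rng(6)

# Random field built as a series over a (hypothetical) cosine basis with
# independent Gaussian coefficients of variance 1/k^2; the summable variances
# make the truncated series converge in mean square.
def sample_field(r, n_terms=200):
    k = np.arange(1, n_terms + 1)
    c = rng.standard_normal(n_terms) / k          # var(c_k) = 1/k^2
    return np.cos(np.outer(r, k) * np.pi) @ c

r = np.linspace(0, 1, 101)
u = sample_field(r)                               # one realization on a grid

# Pointwise variance check at r = 0.5: only even k survive there, so the
# variance is sum over even k of 1/k^2 = pi^2/24 (up to truncation).
var_est = np.var([sample_field(np.array([0.5]))[0] for _ in range(5000)])
print(var_est)
```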
L. Dynamically Evolving Random Field
Suppose that we define a bounded spatial integral operator on spatiotemporal functions via
We may then consider the dynamic equation with initial condition . The solution to this equation can be written as [11] where the exponential of an operator is defined using the power-series expansion. We will assume that the input function and the initial distribution are independent random fields. The random fields and are then defined on . Now fix a time and consider the random field on . We will denote the characteristic functional for this field by . For a function in the domain of this functional, we define and . Then we have the relation , assuming that all of the functional values in this equation are well defined. If the operator is a differential operator, we may have a problem defining the exponential . One case where this exponential is well defined is when , i.e., is anti-Hermitian. For example, if , then . This operator appears in the radiative transport equation [1]. In some applications, we may also want a vector-valued version of this type of dynamically evolving random field. A model for the distribution of various molecules in an organ or a tumor, for example, may take this form. In this case, we define a spatial integral operator on spatiotemporal vector-valued functions via
where is now a matrix-valued kernel function. Then our dynamic equation is given by with initial condition . The solution to the dynamic equation is given by [11] where the exponential of the operator can still be defined by using the power-series expansion. We again assume that the input function and the initial distribution are independent vector-valued random fields. The vector-valued random fields and are then defined on . Now fix time and consider the random field on . We define inner products by and the characteristic functional for this field by . For a vector-valued function in the domain of this functional, we define and . We then have the relation , assuming that all of the functional values in this equation are well defined. This model, perhaps in combination with Example 8.E, may be useful for describing pharmacokinetic processes in medical imaging applications.
M. Cascaded Poisson Process
For a cascaded Poisson process, we have
The idea here is that there are primary interactions that take place at positions , but that what is measured are the secondary interactions at positions . This is a model for scintillation detectors, for example [1 (page 670)]. We will examine the simplest situation from the statistical point of view, but more complex cases can also be considered. The random variables and vectors are , , , and . These integers and vectors are all assumed to be independent of each other. The are i.i.d. with PDF , and the are i.i.d. with PDF . The are i.i.d. Poisson random variables with mean , and is Poisson with mean . We define the functions and . In order to present the characteristic functional for this process, we start with the functions defined by . We then compute the function by using . Finally, we have . If is itself a random field, as would happen if we were imaging an ensemble of objects using a scintillation camera, then we have . Note that the cascading of the Poisson process is reflected in the recursive chain that leads to the characteristic functional.
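The first two moments of this cascaded model are easy to verify by simulation: the mean total number of secondary interactions is the product of the mean primary count and the mean number of secondaries per primary. The sketch below uses hypothetical choices (uniform primary positions on [0, 1], Gaussian secondary spread):

```python
import numpy as np

rng = np.random.default_rng(4)

# One realization of a cascaded Poisson process: a Poisson number of primary
# interactions, each spawning a Poisson number of secondaries scattered about
# it.  All parameter values and distributions here are hypothetical.
nbar_primary = 20.0     # mean number of primary interactions
mbar_secondary = 3.0    # mean number of secondaries per primary
spread = 0.01           # std of secondary displacement about its primary

def realization():
    n = rng.poisson(nbar_primary)
    primaries = rng.uniform(0.0, 1.0, n)
    counts = rng.poisson(mbar_secondary, n)
    secondaries = np.repeat(primaries, counts) + rng.normal(0.0, spread, counts.sum())
    return secondaries

# Mean total secondary count should be nbar_primary * mbar_secondary = 60.
total = np.mean([realization().size for _ in range(20_000)])
print(total)
```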
9. IMAGING SYSTEMS
For our purposes, an imaging system collects a finite amount of image data about an unknown object function . We will assume that the object function is a realization from a random field with characteristic functional . We will consider two types of systems corresponding to binned data and list-mode data.
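Before specializing to the two data types, it may help to fix a concrete discretized picture of such a system. The sketch below is a hypothetical stand-in: eight detector bins, each integrating the object against a Gaussian sensitivity function on a quadrature grid, with Poisson counting noise added to the binned means:

```python
import numpy as np

rng = np.random.default_rng(9)

# Discretized surrogate of a binned linear imaging system: each of M = 8
# detector bins integrates the object f against a sensitivity function h_m
# (here a hypothetical Gaussian aperture), approximated by a quadrature sum.
r = np.linspace(0.0, 1.0, 400)
dr = r[1] - r[0]
centers = np.linspace(0.1, 0.9, 8)
H = np.exp(-0.5 * ((r[None, :] - centers[:, None]) / 0.05) ** 2)

f = 1.0 + 0.5 * np.sin(2 * np.pi * 3 * r)   # one object realization
g_mean = H @ f * dr                         # mean (noise-free) data vector
g = rng.poisson(1000.0 * g_mean)            # Poisson-noisy binned counts
print(g_mean.shape, g.shape)
```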
A. Binned Data
For a linear imaging system with a binned -dimensional data vector , we write , where is given by
and is a zero-mean random vector that describes the measurement noise. In general, we do not make any further assumptions about the measurement noise, so is not necessarily statistically independent of the object . If we define the mean data vector by , then we have . By the defining equation of the imaging system, the function is in the domain of the characteristic functional .For additive measurement noise, where is statistically independent of the object , we have the simple relation . For Poisson measurement noise, we define
and we then have [1 (page 666)]. If the measurement noise has a Poisson component plus an additive component, then we have , where is the characteristic function for the additive component. The data in all of these cases is one high-dimensional sample from for each object.
B. List-Mode Data
The data for a list-mode imaging system can be viewed as a Poisson random field on a -dimensional space of attribute vectors. The mean of this random process is given by
which we write as . The attributes assigned to each photon can include position, time, and energy. Such systems are also called photon-processing systems [12]. From our previous discussion, we know that the characteristic functional for the generalized random field with a fixed object function is given by . When the object is a random field, we then have . If we define the mean number of photons as and the attribute space PDF as , then the data for a given object is a set of low-dimensional independent samples from this PDF, where is a Poisson random number with mean .
10. MATHEMATICAL OBSERVERS FOR POPULATIONS OF OBJECTS
In this section, we will discuss how characteristic functionals can be used to create mathematical observers for estimation and classification tasks of populations of objects. For estimation, the task will be to estimate parameters of a model that generated a sample of object fields that are being imaged. Therefore, we are trying to estimate population parameters as opposed to parameters that describe a particular object function. Similarly, for classification, we are trying to use image data to classify a sample of object fields as belonging to one of a finite set of object models. In both cases, the strategy is to use the sample of data vectors to estimate the characteristic function for the data, and then compare this estimate to the analytical form for this characteristic function derived from the characteristic functional of the object model and the model for the measurement noise. We will consider only binned data, but a similar procedure can be implemented for list-mode data also.
A. Estimation Tasks
In this subsection, we assume that the random field is specified by a parameter vector , although more general parameter spaces are certainly possible. We will use the notation for the random field and emphasize again that is a population parameter and does not determine any particular realization of the random field. Our goal is to produce an estimate of this parameter from the data generated by a binned-data imaging system. Define the imaging system operator by for . We will also write and note that we are considering noise-free measurements in this section for simplicity. Generalizing these results to additive or Poisson noise is straightforward. If the parameterized characteristic functional is given by , then we have the parameterized characteristic function of the data: .
Consider a sample of independent realizations of the object field and the corresponding data matrix . We can form an estimate of by using a sample mean:
Since the terms in the sum are independent complex random variables of unit magnitude, the central-limit theorem applies and we may assume that the joint distribution of the random variables for is normal. The mean of this distribution is given by . Our goal is to use maximum-likelihood (ML) or maximum a posteriori (MAP) estimation using the vector as our data. Note that we may choose the number of frequencies to be any positive integer. To perform the estimation, we need the covariance matrix for . To compute this covariance matrix, we start with the quadratic moments and . Now if we define , there are two covariance matrices for this complex vector:
and . Now we are ready to formulate an estimator for the population parameter vector . From the data matrix , we follow the steps just mentioned to create the -dimensional vector :
From the characteristic functional of the object model, the system operator, and the noise model for noisy data, we compute the mean vector and the Hermitian matrix . From the central-limit theorem we then have, to a good approximation for large , . We may then compute the MAP estimate of via . Alternatively, we can take a logarithm and write as . A similar formula can be used for the ML estimate if we have no prior on . See also [13] for another approach to estimating population parameters from characteristic functionals.
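A toy version of this estimation procedure can be written in a few lines. The sketch below replaces the imaging pipeline with a hypothetical scalar stand-in: objects drawn from a zero-mean Gaussian population with unknown variance theta, an empirical characteristic function formed at a few frequencies, and theta recovered by least-squares matching against the model characteristic function (a deliberate simplification of the ML/MAP machinery described above):

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical population model: each object yields one scalar measurement
# g ~ N(0, theta); the task is to estimate the population parameter theta.
theta_true = 2.5
J = 50_000
g = rng.normal(0.0, np.sqrt(theta_true), J)

# Empirical characteristic function of the data at a few chosen frequencies,
# using the convention psi(f) = <exp(-2*pi*1j*f*g)>.
freqs = np.array([0.02, 0.05, 0.08])
emp = np.array([np.mean(np.exp(-2j * np.pi * f * g)) for f in freqs])

def model_cf(theta, f):
    # Gaussian population model: psi(f) = exp(-2*pi^2*f^2*theta).
    return np.exp(-2 * np.pi**2 * f**2 * theta)

# Least-squares matching of empirical and model characteristic functions
# over a grid of candidate theta values (stand-in for the ML/MAP search).
grid = np.linspace(0.5, 5.0, 451)
err = [np.sum(np.abs(emp - model_cf(t, freqs)) ** 2) for t in grid]
theta_hat = grid[int(np.argmin(err))]
print(theta_hat)
```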
B. Classification Tasks
We may use a similar approach for classification tasks as we did for estimation tasks. We suppose that the data is generated from one of a finite set of random fields. If we define , then the classification task comes down to estimating the number from the data. If the prior probability for class is , then we can follow the same processing steps as we described for the estimation task and get a MAP classifier
where the maximum is taken over the set . With the obvious notation, this can also be written as . When the central-limit theorem is applicable, this classifier minimizes the probability of error among all classifiers that use the processed data vector .
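The same matching idea yields a toy classifier. In the sketch below, each class is a hypothetical zero-mean Gaussian object model with its own variance and equal priors, and a data sample is assigned to the class whose model characteristic function is closest to the empirical one:

```python
import numpy as np

rng = np.random.default_rng(8)

# Two hypothetical object-model classes, distinguished only by the variance
# of a scalar Gaussian measurement; equal priors are assumed.
variances = [1.0, 3.0]                       # class 0 and class 1
freqs = np.array([0.03, 0.06, 0.09])

def classify(g):
    # Empirical characteristic function of the sample, compared against each
    # class's model characteristic function exp(-2*pi^2*f^2*variance).
    emp = np.array([np.mean(np.exp(-2j * np.pi * f * g)) for f in freqs])
    errs = [np.sum(np.abs(emp - np.exp(-2 * np.pi**2 * freqs**2 * v)) ** 2)
            for v in variances]
    return int(np.argmin(errs))

g1 = rng.normal(0.0, np.sqrt(variances[1]), 10_000)   # data from class 1
print(classify(g1))    # 1
```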
11. FISHER INFORMATION MATRIX
The Fisher information matrix (FIM) is often used as a figure of merit for an imaging system. The matrix inverse of the FIM is related to estimation performance via the well-known Cramér–Rao lower bound on the variance of an unbiased estimator. The FIM is also directly related to the performance, as measured by the area under the receiver operating characteristic (ROC) curve, of the ideal observer in detecting a small change in the population parameter vector [14,15]. In our case, we can compute the FIM of the processed data vector using
Assuming that the central-limit theorem is valid for , we have
Therefore, this matrix can be computed from the characteristic functional of the object random field, the system operator, and the characteristic function of the noise.
12. CONCLUSIONS
We have provided a brief introduction to the concept of a characteristic functional for a random field and described some of its properties. Our primary purpose for doing this in an imaging context is that characteristic functionals provide a way to describe the statistical properties of object models that represent the ensemble of objects being imaged. We have provided examples of random fields and their characteristic functionals that are useful in imaging, and shown how they are propagated through a noisy imaging system. Finally, we have described methods for using characteristic functionals for estimation and classification tasks, and for computing a Fisher information matrix for an imaging system.
We have seen that there are analytic expressions for many random field models that are useful in imaging. Furthermore, many of these expressions are not very complicated. This is in contrast to the other method of describing the statistics of a random field, the finite-dimensional PDFs for the random vectors consisting of samples of the field at a finite number of points. These PDFs are often complicated or unknown for the examples we have discussed. Since the characteristic functional contains all of the statistical information about the field and can be used for estimating population parameters and classifying populations, it is often a more useful description of a random object model.
Funding
National Institutes of Health (NIH) (P41-EB002035, R01-EB000803); U.S. Department of Homeland Security (DHS) (HSHQDC-14-C-BOOIO).
REFERENCES
1. H. H. Barrett and K. J. Myers, Foundations of Image Science (Wiley, 2004).
2. E. Parzen, Stochastic Processes (SIAM, 1999).
3. A. N. Shiryaev, Probability (Springer, 2000).
4. A. Amini and M. Unser, “Sparsity and infinite divisibility,” IEEE Trans. Inf. Theory 60, 2346–2358 (2014). [CrossRef]
5. J. Fageot, A. Amini, and M. Unser, “On the continuity of characteristic functionals and sparse stochastic modeling,” J. Fourier Anal. Appl. 20, 1179–1211 (2014). [CrossRef]
6. W. Rudin, Functional Analysis (McGraw-Hill, 1973).
7. J. Møller, A. R. Syversveen, and R. P. Waagepetersen, “Log Gaussian Cox processes,” Scand. J. Stat. 25, 451–482 (1998). [CrossRef]
8. P. R. Bouzas, M. J. Valderrama, and A. M. Aguilera, “On the characteristic functional of a doubly stochastic Poisson process: application to a narrow-band process,” Appl. Math. Model. 30, 1021–1032 (2006). [CrossRef]
9. J. W. Goodman, Statistical Optics (Wiley, 2015).
10. P. R. Bouzas, N. Ruiz-Fuentes, and F. M. Ocaña, “Functional approach to the random mean of a compound Cox process,” Comput. Statist. 22, 467–479 (2007). [CrossRef]
11. M. W. Hirsch and S. Smale, Differential Equations, Dynamical Systems, and Linear Algebra (Academic, 1974).
12. L. Caucci and H. H. Barrett, “Objective assessment of image quality V: photon counting detectors and list-mode data,” J. Opt. Soc. Am. A 29, 1003–1016 (2012). [CrossRef]
13. M. A. Kupinski, E. Clarkson, J. Hoppin, and H. H. Barrett, “Experimental determination of object statistics from noisy images,” J. Opt. Soc. Am. A 20, 421–429 (2003). [CrossRef]
14. E. Clarkson and F. Shen, “Fisher information and surrogate figures of merit for the task-based assessment of image quality,” J. Opt. Soc. Am. A 27, 2313–2326 (2010). [CrossRef]
15. F. Shen and E. Clarkson, “Using Fisher information to approximate ideal observer performance on detection tasks for lumpy background images,” J. Opt. Soc. Am. A 23, 2406–2414 (2006). [CrossRef]