
An Efficient Estimation Scheme for Phase-Diversity Time Series Data

Johnathan M. Bardsley

Abstract

We present a two-stage method for obtaining both phase and object estimates from phase-diversity time series data. In the first stage, the phases are estimated for each time frame using the limited memory BFGS method. In the second stage, an algorithm that incorporates a nonnegativity constraint as well as prior knowledge of data noise statistics is used to obtain an estimate of the object being observed. The approach is tested on real phase-diversity data with 32 time frames, and a comparison is made between it and a previously developed approach. Also, the image deblurring algorithm in stage two is tested against other standard methods and is shown to be the best for our problem.

Index Terms

phase-diversity, image deblurring, nonlinear and nonnegatively constrained optimization

I. INTRODUCTION

In astronomy, image formation can be modeled by the linear operator equation

d = Sf + e, (1)

where d is the blurred, noisy image, f is the unknown true image, or object, e is additive noise, and S is the blurring operator. In the case of spatially invariant blurs, Sf can be written as a convolution of the associated point spread function (PSF) s and the object f; that is,

$$ \mathcal{S}f(u,v) = (s \star f)(u,v) := \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} s(u-\xi,\, v-\eta)\, f(\xi,\eta)\, d\xi\, d\eta. $$

In practice, discrete representations of the data d and the blurring operator S are often assumed to be known, yielding a linear system

d = Sf + e,

where d ∈ R^n is the collected data, f ∈ R^n is the unknown discrete object, e ∈ R^n is the random vector characterizing the noise in the image formation process, and S is the n × n blurring matrix.
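For a spatially invariant blur, the discrete product Sf can be evaluated as a convolution with the fast Fourier transform. The following is a minimal sketch under assumptions not stated in the paper: periodic boundary conditions, a hypothetical Gaussian PSF, a hypothetical two-point object, and Gaussian noise standing in for e.

```python
import numpy as np

def blur(psf, f):
    """Circular convolution of psf with f, i.e. the product S f under
    periodic boundary conditions (an assumption of this sketch).

    psf is assumed to be centered in its array; ifftshift moves the
    center to index (0, 0) before the transform.
    """
    otf = np.fft.fft2(np.fft.ifftshift(psf))
    return np.real(np.fft.ifft2(otf * np.fft.fft2(f)))

rng = np.random.default_rng(0)
n = 16

# Hypothetical Gaussian PSF, normalized to unit sum.
x = np.arange(n) - n // 2
psf = np.exp(-(x[:, None] ** 2 + x[None, :] ** 2) / 4.0)
psf /= psf.sum()

# Hypothetical object: two point sources.
f = np.zeros((n, n))
f[5, 5] = 100.0
f[9, 10] = 60.0

# Data model d = S f + e with illustrative Gaussian noise.
d = blur(psf, f) + rng.normal(0.0, 0.1, size=(n, n))
```

Because the PSF has unit sum, the blur redistributes intensity without changing the total flux of the object.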

However, in important instances the blurring operator S and its discrete counterpart S are unknown. In such cases, if the light emanating from the object is assumed to be incoherent, Fourier optics [7] tells us that the PSF takes the form

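$$ s(\phi) = \left|\mathcal{F}^{-1}\!\left(p\,e^{\imath\phi}\right)\right|^2, \tag{2} $$

where p denotes the pupil, or aperture, function, ı = √−1, and F denotes the 2-D Fourier transform,

$$ (\mathcal{F}h)(y) = \iint_{\mathbb{R}^2} h(x)\, e^{-\imath 2\pi\, x\cdot y}\, dx, \qquad y \in \mathbb{R}^2. \tag{3} $$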

The pupil function p = p(x1, x2) is determined by the extent of the telescope’s primary mirror. The phase φ characterizes image blur due to aberrations in incoming wavefronts caused by atmospheric turbulence.
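The relation (2) between phase and PSF can be sketched numerically. The grid, the circular aperture, and the quadratic aberration below are illustrative assumptions, not the telescope configuration used in the paper; the discrete inverse FFT stands in for the continuous inverse Fourier transform.

```python
import numpy as np

def psf_from_phase(pupil, phi):
    """Discrete analogue of (2): s(phi) = |F^{-1}(p * exp(i * phi))|^2."""
    field = np.fft.ifft2(pupil * np.exp(1j * phi))
    return np.abs(field) ** 2

n = 32
# Hypothetical normalized pupil-plane grid and circular aperture.
x = np.linspace(-1.0, 1.0, n, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")
pupil = (X**2 + Y**2 <= 0.25).astype(float)

phi_flat = np.zeros((n, n))        # unaberrated wavefront
phi_ab = 3.0 * (X**2 + Y**2)       # hypothetical quadratic aberration

s_flat = psf_from_phase(pupil, phi_flat)
s_ab = psf_from_phase(pupil, phi_ab)
```

In this sketch the aberrated PSF carries the same total energy as the unaberrated one but has a lower peak, which is the blurring effect that the phase describes. Adding a constant to the phase leaves the PSF unchanged, so the phase is determined only up to an additive constant.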

When (2) is used, the corresponding discrete equation, which is both underdetermined and nonlinear, has the form

d = S(φ)f + e. (4)

Department of Mathematical Sciences, the University of Montana. This work was done during the author’s visit to the University of Helsinki, Finland, under the University of Montana Faculty Exchange Program. Email: [email protected]


Here both φ and f are unknown and have dimension n × 1. To make the problem of estimating both φ and f well-determined, a second image, known as the diversity image, is collected that satisfies the discrete equation

ddiv = S(φ + θ)f + ediv. (5)

Both d and d_div are collected at the same instant, and hence f and φ are the same in both (4) and (5). The vector θ characterizes a known defocus, and guarantees that d and d_div are sufficiently different. For the phase-diversity data that we will analyze in this paper, the array θ has components

$$ \theta_{jk} = -2.38\,\pi\,(x_j^2 + y_k^2). \tag{6} $$

Finally, the random vector e_div characterizes the noise in the formation of the diversity image d_div.

In order to improve the accuracy of the estimate of f, as well as to estimate the phase φ as it changes in time (this is of interest, for example, in adaptive optics), T diversity images of the form (4)-(5) can be collected at regular time intervals, leading to a sequence of images {d_i, d_{div,i}}_{i=1}^T satisfying the discrete, nonlinear system of equations

$$ \begin{bmatrix} \mathbf{d}_i \\ \mathbf{d}_{\mathrm{div},i} \end{bmatrix} = \begin{bmatrix} \mathbf{S}(\boldsymbol{\phi}_i) \\ \mathbf{S}(\boldsymbol{\phi}_i + \boldsymbol{\theta}) \end{bmatrix} \mathbf{f} + \begin{bmatrix} \mathbf{e}_i \\ \mathbf{e}_{\mathrm{div},i} \end{bmatrix}, \tag{7} $$

for i = 1, . . . , T.

The focus of this paper is the problem of estimating f and the phases {φ_i}_{i=1}^T from {d_i, d_{div,i}}_{i=1}^T given model (7), for large values of T. Our interest in this problem stems from the fact that we have a real, 32-frame phase-diversity data set that we wish to analyze.

In [9], an effective approach is outlined for estimating f and φ given a single phase-diversity data set d, d_div. This approach is combined with regularization in [13]. The regularization approach in the T > 1 case is studied in [5], where an all-in-one approach is taken for the phase and object estimates. By all-in-one we mean that estimates of the object f and the phases {φ_i}_{i=1}^T are sought via the solution of a single optimization problem. This approach is reasonably effective for small values of T, but breaks down when T is large. This is due to the large-scale nature of the problem, where for 128 × 128 data there are (T + 1) · 128^2 unknowns to be estimated. Thus as T increases, the computational burden quickly becomes excessive, and the smoothing of estimates becomes more pronounced.

In this work, we present a two-stage approach for estimating f and the phases {φ_i}_{i=1}^T from {d_i, d_{div,i}}_{i=1}^T. In the first stage, phase estimates are obtained by solving (7) individually for each i. The approach taken is that of [13] and the computational method used is presented in [5]. In the second stage, an estimate of the object is sought that utilizes the full time series of the data. With the phase estimates φ_1, . . . , φ_T from stage one, the blurring matrices S_1, . . . , S_T are computed via S_i = S(φ_i); note that S(φ_i) is defined in terms of the PSF s(φ_i), which is a discretization of (2). An approximate solution of

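$$ \begin{bmatrix} \mathbf{S}_1 \\ \vdots \\ \mathbf{S}_T \end{bmatrix} \mathbf{f} = \begin{bmatrix} \mathbf{d}_1 \\ \vdots \\ \mathbf{d}_T \end{bmatrix} + \begin{bmatrix} \mathbf{e}_1 \\ \vdots \\ \mathbf{e}_T \end{bmatrix} \tag{8} $$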

is then computed. We note that multiplication by the coefficient matrix in (8) and its transpose can be efficiently computed using the fast Fourier transform. Also, if a least squares, or penalized least squares, solution of (8) is sought, the problem of multiple images is efficiently dealt with. A least squares formulation also allows for the inclusion of both nonnegativity constraints and a priori information about the noise statistics. We use the nonnegatively constrained and statistically motivated computational method for least squares estimation presented in [2].
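The FFT-based products with the stacked coefficient matrix in (8) and with its transpose can be sketched as follows. This is an illustration under assumed periodic boundary conditions; the function name `make_stacked_operator` and the array layout (T frames of size n × n, PSF centers at index (0, 0)) are choices made here, not taken from the paper.

```python
import numpy as np

def make_stacked_operator(psfs):
    """Return (forward, adjoint) for the block-column matrix [S_1; ...; S_T].

    psfs: real array of shape (T, n, n), each PSF centered at index (0, 0).
    Periodic boundaries are assumed, so each S_i is diagonalized by the FFT.
    """
    otfs = np.fft.fft2(psfs, axes=(-2, -1))

    def forward(f):
        # (n, n) object -> (T, n, n) stack of blurred frames S_i f.
        return np.real(np.fft.ifft2(otfs * np.fft.fft2(f), axes=(-2, -1)))

    def adjoint(d):
        # (T, n, n) stack -> (n, n) image sum_i S_i^T d_i.
        d_hat = np.fft.fft2(d, axes=(-2, -1))
        back = np.fft.ifft2(np.conj(otfs) * d_hat, axes=(-2, -1))
        return np.real(back).sum(axis=0)

    return forward, adjoint

# Small random example (hypothetical data) exercising both products.
rng = np.random.default_rng(0)
psfs = rng.random((3, 8, 8))
f = rng.random((8, 8))
d = rng.random((3, 8, 8))
forward, adjoint = make_stacked_operator(psfs)
lhs = np.sum(forward(f) * d)   # <S f, d>
rhs = np.sum(f * adjoint(d))   # <f, S^T d>
```

The two inner products agree up to round-off, which is a convenient check that the transpose has been implemented consistently with the forward product.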

The paper is organized as follows. In Section II, we present the algorithm used for the phase estimation step. Noise statistics and the computational method for the object estimation step are the focus of Section III. We present numerical results in Section IV and state conclusions in Section V.


II. STAGE 1: PHASE ESTIMATION

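In this section, we formulate the phase estimation minimization problem and present the computational method that we will use to solve it. First, consider the least-squares likelihood function

$$ \ell_i(\mathbf{f}, \boldsymbol{\phi}) = \left\| \begin{bmatrix} \mathbf{d}_i \\ \mathbf{d}_{\mathrm{div},i} \end{bmatrix} - \begin{bmatrix} \mathbf{S}[\boldsymbol{\phi}] \\ \mathbf{S}[\boldsymbol{\phi} + \boldsymbol{\theta}] \end{bmatrix} \mathbf{f} \right\|^2, \tag{9} $$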

where || · || denotes the standard L2 norm. Since deconvolution and phase retrieval are both ill-posed problems, any minimizer of ℓ_i is unstable with respect to noise in the data. Hence some form of regularization must be used. In the approach of [13], Tikhonov regularization is used, and (9) is replaced by

$$ J_i(\mathbf{f}, \boldsymbol{\phi}) = \ell_i(\mathbf{f}, \boldsymbol{\phi}) + \frac{\gamma}{2}\|\mathbf{f}\|^2 + \frac{\alpha}{2}\,\boldsymbol{\phi}^T \mathrm{cov}(\boldsymbol{\phi})^{-1} \boldsymbol{\phi}, \tag{10} $$

where γ and α are the positive regularization parameters. The phase covariance matrix cov(φ) is given by the von Karman model for turbulence (turbulence in the temperature distribution within the earth’s atmosphere gives rise to phase error). This model assumes that atmospheric turbulence is second order, wide sense stationary, and isotropic with zero mean, which gives rise, in the continuum domain (cf. [10]), to the operator

$$ \mathrm{cov}(\phi) = \mathcal{F}^{-1}\,\mathrm{diag}(\Phi)\,\mathcal{F}, $$

where

$$ \Phi(\omega) = \frac{C_1}{(C_2 + |\omega|^2)^{11/6}}. \tag{11} $$

Here ω = (ωx, ωy) represents spatial frequency. The matrix cov(φ) is then obtained by an appropriate discretization of this continuum operator.
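Since the covariance operator is diagonal in the Fourier domain, the regularization term φ^T cov(φ)^{-1} φ in (10) can be evaluated with a single FFT. The sketch below uses the discrete FFT frequency grid and placeholder constants C1 = C2 = 1; the paper does not state the discretization or the values of C1 and C2, so both are assumptions made for illustration.

```python
import numpy as np

def von_karman_spectrum(n, C1=1.0, C2=1.0):
    """Phi(omega) = C1 / (C2 + |omega|^2)^(11/6) on an n-by-n FFT grid.

    The frequency units and the constants C1, C2 are placeholders.
    """
    w = np.fft.fftfreq(n)
    WX, WY = np.meshgrid(w, w, indexing="ij")
    return C1 / (C2 + WX**2 + WY**2) ** (11.0 / 6.0)

def phase_penalty(phi, Phi):
    """Evaluate phi^T cov^{-1} phi for cov = F^{-1} diag(Phi) F.

    With the unnormalized FFT this equals sum(|fft2(phi)|^2 / Phi) / N,
    where N is the number of pixels.
    """
    phi_hat = np.fft.fft2(phi)
    return np.sum(np.abs(phi_hat) ** 2 / Phi) / phi.size

n = 16
Phi = von_karman_spectrum(n)

# Two test phases with the same Euclidean norm: constant and checkerboard.
smooth = np.ones((n, n))
idx = np.arange(n)
rough = (-1.0) ** (idx[:, None] + idx[None, :])

p_smooth = phase_penalty(smooth, Phi)
p_rough = phase_penalty(rough, Phi)
```

Because Φ decays with |ω|, the penalty weights high spatial frequencies more heavily, so the rough phase is penalized more than the smooth one even though both have the same norm.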

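In order to express (10) entirely as a function of φ, we take the approach of [9], [13], where it is noted that for fixed φ, J_i has a unique minimizer in f that can be explicitly computed and that depends only on φ. In the Fourier domain, this expression is given by

$$ F = \frac{P_i[\boldsymbol{\phi}]^*}{Q_i[\boldsymbol{\phi}]}, \tag{12} $$

where

$$ P_i(\boldsymbol{\phi}) = (\hat{\mathbf{d}}_i + \hat{\mathbf{d}}_{\mathrm{div},i})^*\, \hat{\mathbf{s}}(\boldsymbol{\phi}), \qquad Q_i(\boldsymbol{\phi}) = \gamma + |\hat{\mathbf{s}}(\boldsymbol{\phi})|^2. \tag{13} $$

Here the hat denotes the discrete Fourier transform (cf. [13]). By substituting (12) back into (10), one obtains the reduced cost functional,

$$ J_{i,\mathrm{red}}(\boldsymbol{\phi}) = \ell_{i,\mathrm{red}}(\boldsymbol{\phi}) + \frac{\alpha}{2}\,\boldsymbol{\phi}^T \mathrm{cov}(\boldsymbol{\phi})^{-1} \boldsymbol{\phi}, \tag{14} $$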

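where (cf. [13])

$$ \ell_{i,\mathrm{red}}(\boldsymbol{\phi}) = \left(\|\mathbf{d}_i\|^2 + \|\mathbf{d}_{\mathrm{div},i}\|^2\right) - \left\langle \frac{P_i(\boldsymbol{\phi})}{Q_i(\boldsymbol{\phi})},\, P_i(\boldsymbol{\phi}) \right\rangle. \tag{15} $$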

The phase estimate at time i is then given by

$$ \boldsymbol{\phi}_i \overset{\mathrm{def}}{=} \arg\min_{\boldsymbol{\phi}} J_{i,\mathrm{red}}(\boldsymbol{\phi}). \tag{16} $$

In [5], [9], [13], a number of methods are applied to problem (16). The most efficient approach, set forth in [5], is to solve (16) using the limited memory BFGS method (L-BFGS) [8], which we briefly describe now.

A generic quasi-Newton algorithm with line search globalization is presented below. For simplicity in presentation, we use J to denote the reduced cost functional J_{i,red} minimized in (16). We denote the gradient of J at φ by ∇J(φ) and the Hessian of J at φ by ∇²J(φ).

Quasi-Newton / Line Search Algorithm

    ν := 0;
    φ_0 := initial guess;
    begin quasi-Newton iterations
        g_ν := ∇J(φ_ν);                        % compute gradient
        B_ν := SPD approximation to ∇²J(φ_ν);
        v_ν := −B_ν^{-1} g_ν;                   % compute quasi-Newton step
        τ_ν := arg min_{τ>0} J(φ_ν + τ v_ν);    % line search
        φ_{ν+1} := φ_ν + τ_ν v_ν;               % update approximate solution
        ν := ν + 1;
    end quasi-Newton iterations

In practice, the line search subproblem is solved inexactly [8]. To obtain the quasi-Newton matrix B_ν^{-1} at each iteration, we use the limited memory BFGS (L-BFGS) recursion [8]. The BFGS method generates a new Hessian approximation B_{ν+1} in terms of the differences in the successive approximations to the solution and its gradient,

$$ \mathbf{s}_\nu \overset{\mathrm{def}}{=} \boldsymbol{\phi}_{\nu+1} - \boldsymbol{\phi}_\nu, \tag{17} $$

$$ \mathbf{y}_\nu \overset{\mathrm{def}}{=} \nabla J(\boldsymbol{\phi}_{\nu+1}) - \nabla J(\boldsymbol{\phi}_\nu). \tag{18} $$

L-BFGS is based on a recursion for the inverse of the Bν’s,

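$$ \mathbf{B}_{\nu+1}^{-1} = \left(\mathbf{I} - \frac{\mathbf{s}_\nu \mathbf{y}_\nu^T}{\mathbf{y}_\nu^T \mathbf{s}_\nu}\right) \mathbf{B}_\nu^{-1} \left(\mathbf{I} - \frac{\mathbf{y}_\nu \mathbf{s}_\nu^T}{\mathbf{y}_\nu^T \mathbf{s}_\nu}\right) + \frac{\mathbf{s}_\nu \mathbf{s}_\nu^T}{\mathbf{y}_\nu^T \mathbf{s}_\nu}, \tag{19} $$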

for ν = 0, 1, . . . . Multiplication by B_{ν+1}^{-1} requires a sequence of inner products involving the s_ν’s and y_ν’s, together with the application of B_0^{-1}. If B_0 is symmetric positive definite (SPD) and the “curvature condition” y_ν^T s_ν > 0 holds for each ν, then each of the B_ν’s is also SPD, thereby guaranteeing that −B_ν^{-1} ∇J(φ_ν) is a descent direction. The curvature condition can be maintained by implementing the line search correctly [8].
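The generic algorithm and the inverse update can be combined into a short dense-matrix sketch. It is meant only to illustrate the structure: it stores the full inverse approximation (so it is BFGS rather than L-BFGS), uses a simple Armijo backtracking line search as the inexact line search, and is run on a hypothetical two-variable quadratic rather than on the phase-diversity functional.

```python
import numpy as np

def quasi_newton_bfgs(J, grad, phi0, max_iter=200, tol=1e-8):
    """Quasi-Newton iteration with a dense BFGS inverse-Hessian update.

    H plays the role of B^{-1}; the update is skipped whenever the
    curvature condition y^T s > 0 fails numerically.
    """
    phi = np.asarray(phi0, dtype=float).copy()
    H = np.eye(phi.size)
    I = np.eye(phi.size)
    g = grad(phi)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        v = -H @ g                               # quasi-Newton step
        tau, J0, slope = 1.0, J(phi), g @ v
        # Inexact line search: halve tau until the Armijo condition holds.
        while J(phi + tau * v) > J0 + 1e-4 * tau * slope and tau > 1e-12:
            tau *= 0.5
        phi_new = phi + tau * v
        g_new = grad(phi_new)
        s, y = phi_new - phi, g_new - g
        if y @ s > 1e-10 * np.linalg.norm(s) * np.linalg.norm(y):
            rho = 1.0 / (y @ s)
            H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
                + rho * np.outer(s, s)
        phi, g = phi_new, g_new
    return phi

# Hypothetical test problem: J(x) = 0.5 x^T A x - b^T x with A SPD.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_min = quasi_newton_bfgs(lambda x: 0.5 * x @ A @ x - b @ x,
                          lambda x: A @ x - b,
                          np.zeros(2))
```

For large problems such as the phase estimation step, forming H explicitly is not practical, which is the motivation for the limited memory variant.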

“Limited memory” means that at most N vector pairs {(s_ν, y_ν), . . . , (s_{ν−N+1}, y_{ν−N+1})} are stored and at most N steps of the recursion are taken, i.e., if ν ≥ N, apply the recursion (19) for ν, ν − 1, . . . , ν − N, and set B_{ν−N}^{-1} equal to an SPD matrix M_ν^{-1}. We will refer to M_ν as the preconditioning matrix. In standard implementations, M_ν is taken to be a multiple of the identity [8]. For our application, however, taking

M_ν = cov(φ),

yields an algorithm with much better convergence properties.
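In practice the limited memory recursion is usually applied without forming any matrix, using the standard two-loop recursion described in [8]. The sketch below is one such implementation; the function name and the `apply_M_inv` callback, which applies the inverse of the preconditioning matrix, are hypothetical names introduced for illustration.

```python
import numpy as np

def lbfgs_direction(g, pairs, apply_M_inv=lambda q: q):
    """Return -B^{-1} g via the L-BFGS two-loop recursion.

    pairs: list of (s, y) vector pairs, oldest first (at most N of them).
    apply_M_inv: applies the initial inverse Hessian approximation,
                 i.e. the inverse of the preconditioning matrix;
                 the identity is used by default.
    """
    q = np.array(g, dtype=float)
    alphas = []
    for s, y in reversed(pairs):                 # newest to oldest
        a = (s @ q) / (y @ s)
        alphas.append(a)
        q = q - a * y
    r = apply_M_inv(q)
    for (s, y), a in zip(pairs, reversed(alphas)):   # oldest to newest
        b = (y @ r) / (y @ s)
        r = r + (a - b) * s
    return -r

# Hypothetical example: pairs generated from an SPD matrix so y^T s > 0.
rng = np.random.default_rng(0)
m = 6
M = rng.normal(size=(m, m))
A = M @ M.T + m * np.eye(m)
pairs = []
for _ in range(4):
    s = rng.normal(size=m)
    pairs.append((s, A @ s))
g = rng.normal(size=m)
h0 = rng.uniform(0.5, 2.0, size=m)   # diagonal initial inverse Hessian
v = lbfgs_direction(g, pairs, apply_M_inv=lambda q: h0 * q)
```

The cost per direction is a handful of inner products plus one application of the preconditioner, so a preconditioner that can be applied with FFTs keeps the whole step inexpensive.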

III. STAGE 2: OBJECT ESTIMATION

The phase estimates from Stage 1 yield estimates of the blurring matrices S_1, . . . , S_T, which can be used in (8) to obtain an estimate of f that incorporates the full time series {d_1, . . . , d_T}. Since it is possible to incorporate prior knowledge of both the additive noise random vectors e_1, . . . , e_T and of the nonnegativity of the object f, we present an algorithm that does both of these things.

In order to simplify our presentation in what follows, we will use the notation

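$$ \mathbf{S} = \begin{bmatrix} \mathbf{S}_1 \\ \vdots \\ \mathbf{S}_T \end{bmatrix}, \qquad \mathbf{d} = \begin{bmatrix} \mathbf{d}_1 \\ \vdots \\ \mathbf{d}_T \end{bmatrix}, \qquad \text{and} \qquad \mathbf{e} = \begin{bmatrix} \mathbf{e}_1 \\ \vdots \\ \mathbf{e}_T \end{bmatrix}. \tag{20} $$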

Then (8) can be expressed Sf = d.

A. Incorporating CCD Camera Noise Statistics

The following statistical model (see Refs. [11], [12]) applies to image data from a CCD detector array:

d = Poiss(Sf) + Poiss(β · 1) + N(0, σ2I), (21)

where 1 is a vector of all ones. The relation described by (21) means that each element d_i of the vector d is a random variable with distribution

di = nobj(i) + n0(i) + g(i), i = 1, . . . , n. (22)

Equation (22) can be described as follows:

• n_obj(i) is the number of object dependent photoelectrons measured by the ith detector in the CCD array. It is a Poisson random variable with Poisson parameter [Sf]_i.

• n_0(i) is the number of background photoelectrons, which arise from both natural and artificial sources, measured by the ith detector in the CCD array. It is a Poisson random variable with a fixed positive Poisson parameter β.

• g(i) is the so-called readout noise, which is due to random errors caused by the CCD electronics and errors in the analog-to-digital conversion of measured voltages. It is a Gaussian random variable with mean 0 and fixed variance σ².

The random variables n_obj(i), n_0(i), and g(i) are assumed to be independent of one another and of n_obj(j), n_0(j), and g(j) for i ≠ j.
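The noise model (21)-(22) is straightforward to simulate, which can help build intuition for the statistics used in what follows. The intensity, background level, and readout standard deviation below are hypothetical values chosen for illustration, not parameters of the data analyzed in the paper.

```python
import numpy as np

def simulate_ccd(Sf, beta, sigma, rng):
    """Draw one realization of (21): Poiss(Sf) + Poiss(beta) + N(0, sigma^2)."""
    n_obj = rng.poisson(Sf)                      # object photoelectrons
    n_bg = rng.poisson(beta, size=Sf.shape)      # background photoelectrons
    g = rng.normal(0.0, sigma, size=Sf.shape)    # readout noise
    return n_obj + n_bg + g

rng = np.random.default_rng(1)
Sf = np.full(200_000, 50.0)    # hypothetical noise-free blurred intensity
beta, sigma = 10.0, 5.0        # hypothetical background and readout std
d = simulate_ccd(Sf, beta, sigma, rng)

sample_mean = d.mean()
sample_var = d.var()
```

By independence, each pixel has mean [Sf]_i + β and variance [Sf]_i + β + σ², so with the values above the sample mean and variance should be close to 60 and 85, respectively.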

We simplify statistical model (21) using the independence properties of the random variables in (22) together with the approximation (cf. [4, pp. 190 and 245])

N(λ, λ) ≈ Poiss(λ) for λ >> 1. (23)

This yields the approximation

d + σ² · 1 = N(Sf + β · 1 + σ² · 1, diag(Sf + β · 1 + σ² · 1)),

which can be rewritten as

d − β · 1 = Sf + N(0, cov(e)), (24)

where

cov(e) = diag(Sf + β · 1 + σ² · 1)

is the approximate covariance of the error random vector e.

In applications, since f is not known a priori, cov(e) must be estimated. For the data set that we are analyzing, d_i >> σ² + β for most pixel coordinates i. Furthermore, d ≈ Sf. Thus the following approximations should be reasonably accurate:

d ≈ d− β,

max(d,1) ≈ Sf + β · 1 + σ2 · 1.

This leads to the following linear, stochastic system

d = Sf + e

e = N(0, diag(max(d,1))),

which we will use in our computations.

B. An Iterative Method for Estimating f

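We take the approach of [2] and compute the best linear unbiased estimator of f by solving

$$ \min_{\mathbf{f} \ge 0} J(\mathbf{f}) \overset{\mathrm{def}}{=} \|\mathbf{S}\mathbf{f} - \mathbf{d}\|^2_{\mathrm{cov}(\mathbf{e})^{-1}} \tag{25} $$

via the nonlinear fixed point iteration

$$ \mathbf{f}_{k+1} = \mathbf{f}_k - \tau_k\, \mathbf{f}_k \odot \mathbf{S}^T \mathrm{cov}(\mathbf{e})^{-1} (\mathbf{S}\mathbf{f}_k - \mathbf{d}), \tag{26} $$

where ⊙ denotes component-wise multiplication.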

Iteration (26) is known as the covariance-preconditioned modified residual norm steepest descent method (CPMRNSD). The line search parameter τ_k in (26) is given by

τk = min{τuc, τbd}, (27)

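where, with v_k = −f_k ⊙ ∇J(f_k) denoting the search direction, so that the update in (26) is f_{k+1} = f_k + τ_k v_k,

$$ \tau_{uc} = -\frac{\langle \mathbf{v}_k, \nabla J(\mathbf{f}_k) \rangle}{\langle \mathbf{v}_k, \mathbf{S}^T \mathrm{cov}(\mathbf{e})^{-1} \mathbf{S} \mathbf{v}_k \rangle}, \tag{28} $$

and

$$ \tau_{bd} = \min \left\{ -[\mathbf{f}_k]_i / [\mathbf{v}_k]_i \;\middle|\; [\mathbf{v}_k]_i < 0 \right\}. \tag{29} $$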

We note that (27) ensures that the CPMRNSD iterates will satisfy the nonnegativity constraint f_k ≥ 0 for all k. CPMRNSD requires two fast Fourier transforms and two inverse fast Fourier transforms per iteration.

It is shown in [2] that CPMRNSD is closely related to the well-known and oft-used Richardson-Lucy iteration. However, in the experiments of [2] CPMRNSD exhibits faster convergence than does Richardson-Lucy.

It should also be mentioned that the use of CPMRNSD in this context constitutes the straightforward extension of the method to multiple-image deblurring. Recall that S, d and e are as defined in (20).
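The iteration of this subsection can be sketched for the stacked multi-frame system as follows. This is an illustrative implementation, not the code used for the experiments: it assumes periodic boundary conditions, takes the gradient to be S^T cov(e)^{-1}(Sf − d) with cov(e) = diag(max(d, 1)), writes the search direction as v = −f ⊙ ∇J so that the update is f + τv, and runs on hypothetical synthetic data. The helper names are introduced here for illustration.

```python
import numpy as np

def gaussian_psf(n, width):
    """Hypothetical unit-sum Gaussian PSF centered at index (0, 0)."""
    x = np.fft.fftfreq(n) * n
    psf = np.exp(-(x[:, None] ** 2 + x[None, :] ** 2) / (2.0 * width**2))
    return psf / psf.sum()

def make_ops(psfs):
    """FFT-based products with the stacked matrix S and its transpose."""
    otfs = np.fft.fft2(psfs, axes=(-2, -1))
    def S(f):
        return np.real(np.fft.ifft2(otfs * np.fft.fft2(f), axes=(-2, -1)))
    def St(r):
        r_hat = np.fft.fft2(r, axes=(-2, -1))
        return np.real(np.fft.ifft2(np.conj(otfs) * r_hat, axes=(-2, -1))).sum(axis=0)
    return S, St

def weighted_misfit(S, d, w, f):
    """0.5 * ||S f - d||^2 in the norm weighted by w = diag(cov(e)^{-1})."""
    r = S(f) - d
    return 0.5 * np.sum(w * r * r)

def cpmrnsd(psfs, d, f0, n_iter=50):
    S, St = make_ops(psfs)
    w = 1.0 / np.maximum(d, 1.0)             # diagonal of cov(e)^{-1}
    f = np.array(f0, dtype=float)
    for _ in range(n_iter):
        g = St(w * (S(f) - d))               # gradient of the weighted misfit
        v = -f * g                           # scaled descent direction
        Sv = S(v)
        curv = np.sum(w * Sv * Sv)
        if curv == 0.0:
            break
        tau_uc = -np.sum(v * g) / curv       # unconstrained minimizer along v
        neg = v < 0
        tau_bd = np.min(-f[neg] / v[neg]) if np.any(neg) else np.inf
        # The clip only guards against round-off; tau_bd enforces f >= 0.
        f = np.maximum(f + min(tau_uc, tau_bd) * v, 0.0)
    return f

# Hypothetical synthetic test: three frames of a two-point object.
rng = np.random.default_rng(2)
n = 16
psfs = np.stack([gaussian_psf(n, wd) for wd in (1.0, 1.5, 2.0)])
f_true = np.full((n, n), 5.0)
f_true[6, 6] = 2000.0
f_true[9, 10] = 1200.0
S, St = make_ops(psfs)
d = rng.poisson(S(f_true)).astype(float)
f0 = np.full((n, n), d.mean())
f_est = cpmrnsd(psfs, d, f0, n_iter=100)
```

Because the step length is the exact minimizer along v truncated at the boundary of the nonnegative orthant, each iteration reduces the weighted misfit while keeping the iterate nonnegative.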

C. Incorporating the Object Estimates from Stage 1

Given the phase estimate φ_i obtained in Stage 1, an associated object estimate f_i is given via the inverse Fourier transform of expression (12). In practice, during the phase estimation step, the object regularization parameter is chosen to be slightly larger than optimal in order to increase the efficiency of the individual phase estimation problems. Thus these object estimates are not of a high resolution, but they can provide important prior information about f. A straightforward way of incorporating the object estimates {f_i}_{i=1}^T is to take the initial guess in the CPMRNSD iterations to be given by

$$ \mathbf{f}_0 = \max(\bar{\mathbf{f}}, \mathbf{1}), \quad \text{where} \quad \bar{\mathbf{f}} \overset{\mathrm{def}}{=} \frac{1}{T} \sum_{i=1}^{T} \mathbf{f}_i. \tag{30} $$

Note that \bar{f} is the mean of the f_i’s.

IV. NUMERICAL RESULTS

In this section, we test the two-stage method presented above on data obtained by a 2-channel phase diversity system incorporated into a 1.6 meter telescope at the US Air Force’s Maui Space Surveillance Complex on Mount Haleakala on the island of Maui, Hawaii. These data consist of 32 10-millisecond exposure images of a binary star. Although we do not have the discrete object in hand, since we know that it is a binary star, we can test the effectiveness of our approach.

Fig. 1. Phase reconstruction for frames 4, 12, 20, and 28 of the 32-frame phase time series.

In the phase estimation stage, problem (16) is solved for i = 1, . . . , 32 using L-BFGS with N = 20 stored vectors. The regularization parameters were taken to be α = 0.1 and γ = 0.05. These values balanced computational efficiency with accuracy of estimation as measured by the reconstruction of the object in the second stage. The L-BFGS algorithm was iterated until either the step norm or the gradient norm had decreased six orders of magnitude from their values at the first iteration. We note that with a simple gradient descent or EM algorithm, such as is used in [9], this is not realizable in a computationally efficient manner. In Figure 1, the reconstructed phases for i = 4, 12, 20, and 28 are shown. Using MATLAB on a Dell Latitude Laptop with a 2.13 GHz processor and 2 GB RAM, the estimation of all 32 phases took just under 8 minutes.

With the phase estimates in hand, an approximate solution of (8) can be computed using the CPMRNSD algorithm of Section III-B. Using initial guess (30) reduced computational time by 20 percent. The reconstruction shown in Figure 2 was obtained after 250 CPMRNSD iterations, which took approximately 2 minutes in MATLAB using the computer mentioned in the previous paragraph.

Fig. 2. Object reconstruction obtained using the two-stage approach with CPMRNSD initial guess given by (30).

Given the fact that the blurring matrices S_1, . . . , S_32 were not known exactly in this problem, these results suggest that CPMRNSD is very robust with respect to errors in the blurring matrices. We believe that this is due in large part to the inclusion of the nonnegativity constraints, but the incorporation of a priori knowledge of noise statistics may also have an effect.

A. A Comparison With Other Approaches

Using the all-in-one approach of [5] with α = 0.5, γ = 0.005, and T = 4 (the values of these parameters used in numerical tests in [5]) and using the L-BFGS method to minimize the associated reduced cost function, we obtain the reconstruction in Figure 3. The two intensity peaks corresponding to the positions of the stars are clear, but the reconstruction is of a much lower resolution than that found in Figure 2. We note that this reconstruction was obtained after the gradient had been reduced by approximately 7 orders of magnitude. It is very similar to the reconstruction presented in [5].

In order to test the effectiveness of CPMRNSD, we also tried the two-stage approach using different methods for the object estimation step.

Object estimates using standard Tikhonov regularization are given by

fα = (STS + αI)−1STd, (31)

where S and d are defined as in (20). This was done for many different regularization parameters with uniformly poor results. In Figure 4, we show a reconstruction for α = 0.05, which is in the range of regularization parameter values that yield the “best”, i.e., most binary-star-like, results.
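When each block S_i is a convolution with periodic boundary conditions (an assumption of this sketch), the Tikhonov estimate (31) for the stacked system has a closed form in the Fourier domain: the transform of f_α is the sum over frames of conj(ŝ_i) d̂_i divided by the sum of |ŝ_i|² plus α. The function name and the synthetic PSFs and data below are hypothetical.

```python
import numpy as np

def tikhonov_multiframe(psfs, d, alpha):
    """Solve (S^T S + alpha I) f = S^T d for stacked circular convolutions.

    psfs, d: arrays of shape (T, n, n); PSF centers at index (0, 0).
    """
    otfs = np.fft.fft2(psfs, axes=(-2, -1))
    d_hat = np.fft.fft2(d, axes=(-2, -1))
    num = np.sum(np.conj(otfs) * d_hat, axis=0)      # transform of S^T d
    den = np.sum(np.abs(otfs) ** 2, axis=0) + alpha  # symbol of S^T S + alpha I
    return np.real(np.fft.ifft2(num / den))

# Hypothetical synthetic frames.
rng = np.random.default_rng(3)
n, T, alpha = 16, 3, 0.05
x = np.fft.fftfreq(n) * n
psfs = np.stack([np.exp(-(x[:, None] ** 2 + x[None, :] ** 2) / (2.0 * w**2))
                 for w in (1.0, 1.5, 2.0)])
psfs /= psfs.sum(axis=(-2, -1), keepdims=True)
f_true = np.zeros((n, n))
f_true[6, 6] = 2000.0
f_true[9, 10] = 1200.0
otfs = np.fft.fft2(psfs, axes=(-2, -1))
d = np.real(np.fft.ifft2(otfs * np.fft.fft2(f_true), axes=(-2, -1)))
d = d + rng.normal(0.0, 5.0, size=d.shape)

f_alpha = tikhonov_multiframe(psfs, d, alpha)
```

Nothing in (31) enforces nonnegativity, so an estimate of this kind can take negative values.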

We also applied the MRNSD algorithm, obtained by setting cov(e) equal to the identity matrix in (26), to the object estimation sub-problem in order to test the effect of removing the noise statistics information from CPMRNSD. The reconstruction obtained after 250 MRNSD iterations with initial guess (30) is given in Figure 5. It is evident that the nonnegativity constraints are effective, but a comparison with the reconstruction obtained using CPMRNSD suggests that incorporating the a priori noise statistics information yields improved reconstructions.

Fig. 3. Object reconstruction obtained using the all-in-one approach of [5].

Fig. 4. Object reconstruction obtained using the two-stage approach with Tikhonov regularization for stage two.

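Finally, in Figure 6 we present the reconstruction obtained by applying 250 iterations of the Richardson-Lucy algorithm [3] with initial guess (30) in stage two. Richardson-Lucy has the form

$$ \mathbf{f}_{k+1} = (\mathbf{f}_k / \mathbf{S}^T \mathbf{1}) \odot \mathbf{S}^T \left( \max(\mathbf{d}, \mathbf{0}) / \mathbf{S}\mathbf{f}_k \right), $$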

where division is done component-wise. It can also be motivated statistically if a Poisson, rather than a Gaussian, distribution for the data is assumed (cf. [2]). The results seem to be slightly better than those obtained using MRNSD, but are inferior to the CPMRNSD estimate. Note, in particular, the height of the tallest peaks, as well as the number and height of the other, erroneous peaks.
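For comparison, the Richardson-Lucy iteration for the stacked system can be sketched in the same FFT-based setting. The ratio is taken between the data and the current model prediction S f_k; the small constant `eps` guarding the division, the periodic boundary conditions, and the synthetic data are assumptions of this sketch.

```python
import numpy as np

def richardson_lucy(psfs, d, f0, n_iter=50, eps=1e-12):
    """Multi-frame Richardson-Lucy iteration with FFT-based products.

    psfs, d: arrays of shape (T, n, n); PSF centers at index (0, 0).
    """
    otfs = np.fft.fft2(psfs, axes=(-2, -1))

    def S(f):
        return np.real(np.fft.ifft2(otfs * np.fft.fft2(f), axes=(-2, -1)))

    def St(r):
        r_hat = np.fft.fft2(r, axes=(-2, -1))
        return np.real(np.fft.ifft2(np.conj(otfs) * r_hat, axes=(-2, -1))).sum(axis=0)

    norm = St(np.ones_like(d))               # S^T 1
    d_pos = np.maximum(d, 0.0)
    f = np.array(f0, dtype=float)
    for _ in range(n_iter):
        ratio = d_pos / np.maximum(S(f), eps)    # data over model prediction
        f = (f / norm) * St(ratio)
    return f

# Hypothetical synthetic frames with unit-sum Gaussian PSFs.
rng = np.random.default_rng(4)
n, T = 16, 3
x = np.fft.fftfreq(n) * n
psfs = np.stack([np.exp(-(x[:, None] ** 2 + x[None, :] ** 2) / (2.0 * w**2))
                 for w in (1.0, 1.5, 2.0)])
psfs /= psfs.sum(axis=(-2, -1), keepdims=True)
f_true = np.full((n, n), 20.0)
f_true[6, 6] = 2000.0
f_true[9, 10] = 1200.0
otfs = np.fft.fft2(psfs, axes=(-2, -1))
clean = np.real(np.fft.ifft2(otfs * np.fft.fft2(f_true), axes=(-2, -1)))
d = rng.poisson(clean).astype(float)

f_rl = richardson_lucy(psfs, d, np.full((n, n), d.mean()), n_iter=30)
```

With unit-sum PSFs the vector S^T 1 is constant and equal to T, and each iterate then satisfies T · sum(f) = sum(max(d, 0)), so the iteration conserves total flux in addition to preserving nonnegativity.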

We note that the costs of implementing CPMRNSD, MRNSD and Richardson-Lucy are roughly the same, as an iteration of each requires four fast Fourier transforms.

Fig. 5. Object reconstruction obtained using the two-stage approach with 250 iterations of MRNSD in stage two.

Fig. 6. Object reconstruction obtained using the two-stage approach with 250 iterations of Richardson-Lucy in stage two.

V. CONCLUSIONS

We have presented a two-stage approach for reconstructing the object f and phases {φ_i}_{i=1}^T from phase-diversity time series data {d_i, d_{div,i}}_{i=1}^T, assuming the discrete nonlinear sequence of models (7).

In the first stage, the objective is to obtain accurate phase estimates. This is done by applying the L-BFGS method to the minimization problem (16), as is done in [5].

In the second stage, problem (8) is approximately solved using the CPMRNSD algorithm of [2], where S_i = S(φ_i) with {φ_i}_{i=1}^T obtained in stage one. CPMRNSD incorporates a nonnegativity constraint as well as prior knowledge of the statistics of the data noise.

Comparisons of this two-stage algorithm with the all-in-one approach of [5] indicate that the two-stage approach is more effective. We believe that this is due to the fact that the two-stage approach was able to incorporate the full 32 time frame phase-diversity time series; due to computational issues, only 4 time frames were considered in [5]. Also, by breaking the problem into two stages, we were able to use a more sophisticated and robust algorithm in the second stage, resulting in superior object reconstructions.


The effectiveness of CPMRNSD for the multiple-image deblurring problem of stage two was tested against Tikhonov regularization, MRNSD and Richardson-Lucy, and was found to be the most effective of these four methods for this problem, indicating that incorporating both the nonnegativity constraint and the a priori noise statistics information has a positive effect on results. Finally, the implementation of CPMRNSD in this context can be viewed as a straightforward extension of MRNSD-type algorithms to multiple-image deblurring.

VI. ACKNOWLEDGEMENTS

The author would like to thank Dr. David Tyler of the Maui High Performance Computing Center, Professor Todd Torgersen of the Department of Computer Science at Wake Forest University and Professor Curt Vogel of the Mathematical Sciences Department at Montana State University. Dr. Tyler and his staff provided the phase-diversity data; Professor Torgersen provided data processing software; and Dr. Vogel wrote the MATLAB code for the phase estimation step. Finally, the author would like to thank the University of Montana’s International Exchange Program for the opportunity to conduct research at the University of Helsinki during the 2006-07 academic year.

REFERENCES

[1] Johnathan M. Bardsley, A Nonnegatively Constrained Trust Region Algorithm for the Restoration of Images with an Unknown Blur, Electronic Transactions in Numerical Analysis, 20, 2005, pp. 139-153.

[2] Johnathan M. Bardsley and James G. Nagy, Covariance-Preconditioned Iterative Methods for Nonnegatively Constrained Image Reconstruction, SIAM Journal on Matrix Analysis and Applications, 27 (4), 2006, pp. 1184-1198.

[3] M. Bertero and P. Boccacci, Introduction to Inverse Problems in Imaging, IOP Publishing Ltd., London, 1998.

[4] W. Feller, An Introduction to Probability Theory and Its Applications, Wiley, New York, 1971.

[5] L. Gilles, C. R. Vogel, and J. Bardsley, Computational Methods for a Large-Scale Inverse Problem Arising in Atmospheric Optics, Inverse Problems, 18, 2002, pp. 237-252.

[6] R. A. Gonsalves, Phase diversity in adaptive optics, Opt. Eng. 12, 1982, pp. 829-832.

[7] Joseph W. Goodman, Introduction to Fourier Optics, McGraw-Hill, 1996.

[8] J. Nocedal and S. J. Wright, Numerical Optimization, Springer-Verlag, 1999.

[9] R. Paxman, T. Schulz, and J. Fienup, Joint estimation of object and aberrations by using phase diversity, J. Opt. Soc. Am. A, 9 (1992), pp. 1072-1085.

[10] M. Roggemann and B. Welsh, Imaging Through Turbulence, CRC Press, 1996.

[11] D. L. Snyder, A. M. Hammoud, and R. L. White, Image recovery from data acquired with a charge-coupled-device camera, Journal of the Optical Society of America A, 10 (1993), pp. 1014-1023.

[12] D. L. Snyder, C. W. Helstrom, A. D. Lanterman, M. Faisal, and R. L. White, Compensation for readout noise in CCD images, Journal of the Optical Society of America A, 12 (1995), pp. 272-283.

[13] C. R. Vogel, T. Chan, and R. Plemmons, Fast Algorithms for Phase Diversity-Based Blind Deconvolution, in Adaptive Optical System Technologies, SPIE Proceedings Vol. 3353 (1998).
