
Multispectral Compressive Imaging Strategies using Fabry-Pérot Filtered Sensors

Kevin Degraux, Valerio Cambareri, Bert Geelen, Laurent Jacques, Gauthier Lafruit∗

February 7, 2018

Abstract

This paper introduces two acquisition device architectures for multispectral compressive imaging. Unlike most existing methods, the proposed computational imaging techniques do not include any dispersive element, as they use a dedicated sensor which integrates narrowband Fabry-Pérot spectral filters at the pixel level. The first scheme leverages joint inpainting and super-resolution to fill in those voxels that are missing due to the device's limited pixel count. The second scheme, linked to compressed sensing, introduces spatial random convolutions, but is more complex and may be affected by diffraction. In both cases we solve the associated inverse problems by using the same signal prior. Specifically, we propose a redundant analysis signal prior in a convex formulation. Through numerical simulations, we explore different realistic setups. Our objective is also to highlight some practical guidelines and discuss the complexity trade-offs of integrating these schemes into actual computational imaging systems. Our conclusion is that the second technique performs best at high compression levels, in a properly sized and calibrated setup. Otherwise, the first, simpler technique should be favored.

Multispectral imaging, compressed sensing, spectral filters, Fabry-Pérot, random convolution, generalized inpainting.

1 Introduction

Multispectral (MS) imaging consists in capturing the light intensity, X0(u, v, λ), of an object or scene as it varies along its 2-D spatial coordinates (u, v) and over different wavelengths λ, i.e., the light spectrum as measured in a few intervals or bands. This information is sampled in a 3-D data volume, which allows for accurate classification or segmentation of the constituents of an object or scene from their spectral profile. Hence, MS imaging finds diverse applications in remote sensing [1], optical sorting [2], astronomy [3], food science [4], medical imaging [5] and precision agriculture [6].

A classic approach is to spatially or spectrally multiplex the MS cube over a 2-D Focal Plane Array (FPA). This is done by scanning the cube, so that specific slices are sequentially acquired by the sensor in several snapshots (for a review see, e.g., [7]). Such systems require either tunable spectral filters or dispersive elements with mechanical parts to scan the object or scene. These approaches entail trade-offs between complexity and cost, spectral and spatial resolution, and acquisition time.

Recently, single-snapshot MS imagers were developed to rapidly acquire a MS cube, thus avoiding motion artifacts and enabling video acquisition rates [8]. Among such imagers, we focus on those using Fabry-Pérot (FP) filtered sensors [9, 10], i.e., standard CMOS imaging sensors on top of which an array of spectral filters is deposited. This technique generalizes RGB filter arrays [11] to filter banks using an arbitrary number of narrowband profiles [9], e.g., a few tens. The array thus imposes a reduction in spatial resolution, as the sensor's pixels are partitioned between bands.

∗KD, VC, and LJ are with the ISPGroup, ICTEAM/ELEN, Université catholique de Louvain, Belgium. LJ is funded by the F.R.S.-FNRS; part of this work has been funded by the Pôle Mecatech Walloon region project ADRIC (e-mail: {kevin.degraux, valerio.cambareri, laurent.jacques}@uclouvain.be). GL is with LISA, Université libre de Bruxelles (e-mail: [email protected]). BG is with imec, Belgium (e-mail: [email protected]). Computational resources have been provided by the CISM/UCL and the CÉCI in FWB, funded by the F.R.S.-FNRS under convention 2.5020.11.


arXiv:1802.02040v1 [cs.CV] 6 Feb 2018


This paper investigates MS imaging strategies based on Compressed Sensing (CS) (see, e.g., [12, 13]), an established signal processing paradigm that has inspired several computational imaging frameworks [14–17]. After acquisition by a compressive device, the measurements are fed into a recovery algorithm along with the sensing operator and signal prior. Under broad theoretical conditions [18, 19], this method recovers a high-resolution approximation of the target scene, even if the sensing was performed below the scene's Nyquist rate. The complexity of the sensing operation (e.g., resolution, time) is therefore balanced against the complexity of the signal with respect to a given prior.

1.1 Main Contributions

Our work contributes to advancing the field of MS compressive imaging in the following ways:

(i) We propose two MS snapshot imaging strategies: Multispectral Volume Inpainting (MSVI) and Multispectral Random Convolution (MSRC). Both maintain a relatively low system-level complexity without any dispersive element. Using CS principles, they are designed around a low-pixel-count FP sensor.

(ii) MSVI leverages a generalized inpainting procedure, as discussed in [20], to provide a simple integration of the FP sensor in a computational imaging scheme. This architecture performs a spatio-spectral subsampling of the MS cube and relies on its redundancy to obtain a high-resolution recovery. It is fairly simple and works best at lower compression levels.

(iii) MSRC leverages random convolution, as discussed in [21], to provide spatial-domain CS by means of an out-of-focus random Coded Aperture (CA), i.e., an array of square apertures randomly placed on an opaque screen. It preserves the spectral resolution, fixed by the low number of narrowband FP filters (e.g., 16) on the FPA. In an ideally sized, low-noise setup, this more complex architecture clearly improves the recovered quality, especially at higher compression levels. However, it entails some optical design challenges, as discussed in Section 4.2.

(iv) Our analysis is paired with a discussion of the analysis-sparse signal prior, the associated convex optimization formulation and a fine-tuned ADMM algorithm [22, 23] for the large-scale recovery of MS cubes.

(v) Both architectures are numerically compared in terms of achievable recovery performance. We also discuss their complexity trade-offs and design guidelines, identifying unavoidable adverse optical effects, so as to integrate these schemes into realistic imaging systems.

Our findings and numerical results corroborate that a substantial reduction in the number of measurements w.r.t. the Nyquist-rate representation of X0(u, v, λ) is made possible by both architectures while preserving a high Peak Signal-to-Noise Ratio (PSNR). Table 1 summarizes some pros and cons of each strategy, which are detailed and clarified throughout the paper.

The paper is organized as follows. The rest of Section 1 places our contribution in perspective with respect to related work in the literature and introduces useful notations. Section 2 presents the FP filter technology, the proposed MS analysis sparsity prior and the inverse problem formulation, common to both strategies, as well as the associated reconstruction algorithm. Section 3 details the specifics of the MSVI strategy, i.e., its image formation model and sensing matrix. Section 4 similarly provides the details of the MSRC strategy and discusses some associated non-idealities and practical considerations. Section 5 presents numerical reconstruction results. We demonstrate the MSVI performance with experimental data and compare MSVI and MSRC using simulated acquisitions. The final section gives a brief conclusion.

1.2 Related Work

1.2.1 Compressive Spectral Imaging

The use of CS for MS imaging schemes dates back to [16, 24]. The most popular application of CS to spectral imaging is the Coded Aperture Snapshot Spectral Imaging (CASSI) framework [15, 16, 25–27], with its many variants summarized hereafter. Single-disperser CASSI uses a random CA to partially occlude the focused scene. A refractive prism or grating then shears the spatio-spectral information, and the processed light is recorded by a standard imaging sensor. The system introduced by [26] shows high spectral accuracy after image recovery, at the expense of lower spatial accuracy. Double-disperser CASSI [16] achieves the opposite performance in terms of



                               MSVI          MSRC
Optical system complexity      Very simple   Simple (simpler than [16])
Optical calibration            Simple        Complex
Robustness to noise            Good          Good
Robustness to miscalibration   Acceptable    Low (see, e.g., [44, 45])
PSNR at 1:16 compression       33 dB         37 dB (with or without PSF)
PSNR at 1:2 compression        48 dB         50 dB (41 dB with PSF)
Acquisition speed              Fast          Fast
Initialization quality (7)     Acceptable    Low

Table 1: Comparison of the proposed imaging strategies. Peak Signal-to-Noise Ratio (PSNR) based on Fig. 6. See MSVI initialization on Fig. 7.

spatial versus spectral accuracy, but requires non-trivial calibration and geometric alignment of its optical components. A close line of work in [28, 29] proposes a snapshot spectral imaging architecture. It is based on CASSI but features wide-band spectral filters, which provide random spatio-spectral subsampling after the shearing element. Non-snapshot spectral imaging architectures based on CS were also recently proposed [24, 30, 31].

CASSI and its variants target a relatively large number of bands and are intrinsically capable of achieving high spectral resolution thanks to dispersive optics. However, when the spectrum is well represented by fewer bands, spectral mixing is less effective, for CS purposes, than spatial mixing, especially for FP-filtered sensors with only a few tens of narrowband filters (e.g., [32]) whose high selectivity excludes spectral super-resolution. In this work, we focus on FP filtered schemes that target a few bands without using any dispersive element.

1.2.2 Compressive Imaging by Random Convolution

Since its introduction, CS has been envisioned to provide image acquisition at reduced sensor resolution [33] or shorter acquisition times [34] (see also the tutorial in [35] and references therein). In particular, the second strategy proposed in this paper is related to CS by random convolution: the sensing operation acts as a spatial-domain convolution with a random filter, e.g., a random CA as in CA imaging [36, 37]. More recently, the subsampled random convolution operation was shown, in [19, 38, 39], to comply with theoretical results of CS. This operation was also featured in recent imaging architectures [40–42]. Convolution-based schemes are appealing because they allow for a fast sensing operation. Indeed, the compressive measurements can be formed in one frame of a full imaging sensor, as opposed to single-pixel camera designs [24, 33, 43], where the compressive measurements are multiplexed in time. Moreover, Fast Fourier Transform (FFT) implementations of the convolutions drastically reduce the computational cost of reconstruction compared to unstructured sensing operations. However, the snapshot capability and numerical efficiency of random convolution architectures come at the price of a higher correlation between adjacent compressive measurements, because of their spatial adjacency and of optical-level non-idealities such as the Point Spread Function (PSF) of the optical elements. In this work we propose a MS extension of the low-complexity snapshot imaging architecture introduced by Bjorklund and Magli [42]. In particular, their architecture uses a CA placed out-of-focus to provide random convolution.
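As a minimal illustration of this sensing model (an abstract sketch, not the MSRC optics; all sizes and the binary filter are arbitrary choices of ours), a subsampled random convolution can be computed with the FFT in O(n log n) rather than O(n²):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256                               # signal length (illustrative)
m = 64                                # number of kept measurements

x = rng.standard_normal(n)            # stand-in for a scene
h = rng.choice([0.0, 1.0], size=n)    # random binary filter (e.g., a coded aperture)

# Circular convolution implemented with the FFT.
Hx = np.fft.ifft(np.fft.fft(h) * np.fft.fft(x)).real

# Subsampling: keep m randomly chosen entries of the convolved signal.
keep = rng.choice(n, size=m, replace=False)
y = Hx[keep]

# Check against the direct O(n^2) circular convolution.
direct = np.array([sum(h[k] * x[(i - k) % n] for k in range(n)) for i in range(n)])
assert np.allclose(Hx, direct)
```

The FFT route is what makes both the sensing and the per-iteration cost of reconstruction cheap for such structured operators.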

1.3 Notations

Vectors are noted x ∈ Rn (bold); matrices, X ∈ Rn1×n2 (upper-case bold); and 3-D arrays, 𝒳 ∈ Rn1×n2×n3 (calligraphic bold). vec(X) is the vectorization of X in row-major ordering, Id is the identity matrix and Φ∗ is the adjoint of Φ. We note [n] ≜ {1, …, n}; ‖x‖p ≜ (∑_{i=1}^{n} |xi|^p)^{1/p} is the ℓp-norm of x and ‖Φ‖ is the operator ℓ2-norm. The proximal operator associated to f is defined by proxf(v) ≜ argminx f(x) + ½‖x − v‖2^2. The full (2-D discrete) convolution between X and Y is X ∗ Y, and its valid part (MATLAB and NumPy terminology, i.e., fully overlapping inputs without zero-padded edges) is X ⊛ Y. The indicator ιC(x) ≜ 0 if x is inside the set C and +∞ otherwise. We note diag(x) the diagonal matrix with diagonal x; bdiag(A, B, …) the block-diagonal matrix with blocks A, B, … arranged without overlap; and bdiagn(A) ≜ bdiag(A, A, …), repeating A n times.




Figure 1: FP filtered sensors: (a) transmittance profiles (arbitrary units) of a 16-band FP filter bank in the VIS range; depiction of (b) mosaic, (c) random, and (d) tiled filters as deposited on a CMOS sensor array (gray).

2 Preliminaries

In this section, we introduce the FP filtered sensors used in both strategies. We then propose an MS image analysis sparsity prior. This prior is used to regularize an inverse problem through a convex optimization program, again common to both strategies. This section then describes the convex formulation and the associated recovery algorithm used in Section 5.

2.1 Fabry-Pérot Filtered Sensors

The class of imaging sensors at the core of this work comprises a standard CMOS sensor designed to operate in the visible light (VIS) range of wavelengths (i.e., 400–700 nm), on which a layer of Fabry-Pérot interferometers [46] is deposited. The latter, whose physics is well described in [47], act on the spectrum of incoming light as band-pass filters whose center wavelength and width are designed to yield narrowband profiles (about 10 nm wide).

Once the filter profiles are designed, the FP filters can be manufactured to cover the area of a single pixel, either in a mosaic layout [10], where a group of different filters is repeated in a 4 × 4 or 5 × 5 mosaic pattern (Fig. 1b), or in a tiled layout, where the sensor is partitioned into large areas with the spectral filter for a specific wavelength deposited on top of each of them [48] (Fig. 1d). While it is possible to envision architectures that use tiled layouts [10, 21], we here focus on mosaic designs, as they allow a reduction of the correlation between measurements taken on adjacent sensor pixels. Such a sensor [10] was designed and prototyped at imec and will be referred to as imec's sensor in the following. The use of an external spectral cutoff filter, removing anything outside the VIS range, allows one to obtain a filter bank such as the one depicted in Fig. 1a. These profiles were generated for illustration purposes based on calibration measurements taken at imec. The raw data was post-processed to keep only the main lobe of each filter response. In particular, smaller secondary modes, which can appear at harmonic wavelengths (see [47] and [10] for another example), were removed for clarity. In this work, we consider an idealized situation, ignoring secondary modes. Furthermore, in a real situation, we must compensate for the attenuation coefficients, either before, during or after reconstruction.

Hereafter, for the sake of simplicity, we approximate the spectral responses by a Dirac delta, δ(λ − λ`), at each filter's centre wavelength λ`, with equal gain. We consider a sensor featuring a 16-band filter bank with uniformly spaced center wavelengths between 470 and 620 nm, either placed in a 4 × 4 mosaic pattern (Fig. 1b), or with a randomly-assigned arrangement called random pattern (Fig. 1c). The latter has not been manufactured in practice but should not pose any major difficulty compared to the mosaic pattern. In simulations (Section 5.2.1), the random pattern is generated by permuting the assigned locations of the filters over the entire FPA.
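Both layouts can be sketched in a few lines of NumPy; the 8 × 8 FPA size and the random seed below are our illustrative choices, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(1)
n_bands = 16
mu = mv = 8          # toy FPA (a multiple of the 4x4 mosaic cell)

# Mosaic layout: a fixed 4x4 cell of band indices 0..15, tiled over the FPA.
cell = np.arange(n_bands).reshape(4, 4)
mosaic = np.tile(cell, (mu // 4, mv // 4))

# Random layout: permute the assigned filter locations over the entire FPA,
# keeping the same number of pixels per band as the mosaic.
random_pattern = rng.permutation(mosaic.ravel()).reshape(mu, mv)

# Both layouts assign each FPA pixel exactly one band, with a balanced count.
assert np.bincount(mosaic.ravel(), minlength=n_bands).tolist() == [4] * n_bands
assert np.bincount(random_pattern.ravel(), minlength=n_bands).tolist() == [4] * n_bands
```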

2.2 Forward model and analysis sparsity prior

Let X0 ∈ Rnu×nv×nλ represent a discretized MS cube in its 2-D spatial and 1-D spectral domains, equivalently represented by its vectorization x0 ≜ vec(X0) ∈ Rn, n = nunvnλ. Both studied architectures entail a noisy linear acquisition process, summarized by the following generic forward model,

y = Φx0 + w. (1)



In this model, the linear sensing operator is represented in matrix form by Φ ∈ Rm×n, where m ≜ mumv. It yields a set of compressive measurements that are captured by the sensor array, Y ∈ Rmu×mv, or in vectorized form y ≜ vec(Y) ∈ Rm. The noise vector w ∈ Rm is bounded in ℓ2-norm by τ.

As any computational imaging system based on regularized inverse problems, our schemes must leverage a prior model for the signal being acquired. We here choose an analysis sparsity prior (see, e.g., [49–51]). Specifically, we separately apply linear transforms to the spatial and spectral domains, denoted by Auv and Aλ. This amounts to constructing a separable transform by the Kronecker product A ≜ Auv ⊗ Aλ. For the spatial domain transform, Auv, we chose a 2-D Daubechies-4 Undecimated Discrete Wavelet Transform (UDWT), which forms a shift-invariant, separable, and overcomplete wavelet transform [52, 53]. The approximation level (scaling coefficients) is inherently not sparse, as it contains the low-pass approximation of the image. We found, however, that this slowly varying spatial information helps in leveraging the redundancy between bands. We thus apply a 2-D Discrete Cosine Transform (DCT) to the approximation level, which concentrates the low-pass information in a few coefficients, making it consistent with our sparsity prior. The wavelet filters are chosen with length 8 and 3 levels, resulting in an analysis transform Auv ∈ R10nunv×nunv (3 levels × 3 directions + 1 approximation level). The DCT is also chosen for the 1-D spectral domain transform, Aλ ∈ Rnλ×nλ, given that we focus on MS cubes from natural scenes with smooth spectral profiles.
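The separable transform A = Auv ⊗ Aλ never needs to be formed explicitly: with row-major vectorization (Section 1.3), applying a Kronecker product reduces to two small matrix products. A minimal NumPy sketch with generic random stand-ins for Auv and Aλ (our example, not the paper's transforms):

```python
import numpy as np

rng = np.random.default_rng(2)
p, q = 6, 4                         # toy spatial / spectral sizes
B = rng.standard_normal((10, p))    # stand-in for Auv (overcomplete, as the UDWT is)
C = rng.standard_normal((q, q))     # stand-in for Aλ (e.g., a DCT matrix)
X = rng.standard_normal((p, q))     # MS cube flattened to (space, bands)

# Row-major vectorization identity: (B ⊗ C) vec(X) = vec(B X C^T),
# so the full Kronecker matrix never has to be stored.
lhs = np.kron(B, C) @ X.ravel()     # ravel() is row-major by default
rhs = (B @ X @ C.T).ravel()
assert np.allclose(lhs, rhs)
```

This identity is what keeps the analysis prior tractable at the scale of full MS cubes.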

2.3 Recovery Method

The recovery method consists in inverting (1) to find an estimate x̂ of the MS cube, using the analysis-sparsity prior. We use the ℓ1-analysis formulation from [50], with an additional range constraint R ≜ [xmin, xmax]^n, which reads

x̂ ≜ argmin_{x∈R} ‖Ax‖1 s.t. ‖y − Φx‖2 ≤ τ. (2)

A good noise estimate can be used for setting the parameter τ. We solve the non-smooth convex optimization program (2) using the Alternating Directions Method of Multipliers (ADMM) introduced in [54, 55]. Specifically, we use the version from [23, Algorithm 2], recast to solve problems of the form

min_z ∑_{j=1}^{J} gj(Hjz), (3)

where the gj are convex lower semicontinuous functions and the Hj are linear operators such that (H1∗, …, HJ∗)∗ has full column rank. A practical implementation requires efficient computation of the proximal operators [56, 57] associated to the functions gj, as well as the matrix-vector products Hjz and Hj∗wj for arbitrary z and wj. A crucial step of the algorithm is the matrix inversion, (∑_{j=1}^{J} µjHj∗Hj)⁻¹z, for some µj > 0. Any property of the matrices Hj that can simplify that step should be exploited. In particular, the tight frame property, i.e., Hj∗Hj = Id; Fourier diagonalization, i.e., Hj = F∗ΣF, where F is the Discrete Fourier Transform (DFT) and Σ is diagonal; or the sparsity and separability of Hj, are used in the following.

For their blind deconvolution problem, [23] proposes to handle the boundary conditions by adding and then masking the missing rows of the block-circulant sensing operator. This stabilizes the estimation, while recovering the block-circulant structure of the convolution operator, allowing Fourier diagonalization. Building on these ideas, as detailed for both architectures in Sections 3.2 and 4.3, we can arbitrarily add rows and columns to Φ in order to exploit one of the properties cited above. We define an extended sensing matrix Φ̄ ∈ Rm̄×n̄ (with m̄ ≥ m and n̄ ≥ n) and restriction operators Rm ∈ {0, 1}m×m̄ and Rn ∈ {0, 1}n×n̄, i.e., operators that restrict input vectors (of length m̄ and n̄) to some arbitrarily chosen index sets (of size m and n), such that

Φ = RmΦ̄Rn∗. (4)
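As an illustrative sketch of the factorization (4), restrictions and their zero-padding adjoints can be realized by plain index selection; the sizes and index sets below are arbitrary examples of ours, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(3)
n_bar, n = 8, 5
m_bar, m = 6, 3
idx_n = np.sort(rng.choice(n_bar, size=n, replace=False))  # kept columns
idx_m = np.sort(rng.choice(m_bar, size=m, replace=False))  # kept rows

# A restriction as a 0/1 matrix: R x keeps the chosen entries of x.
Rn = np.eye(n_bar)[idx_n]      # shape (n, n_bar)
Rm = np.eye(m_bar)[idx_m]      # shape (m, m_bar)

# The adjoint of a restriction is the corresponding zero-padding operator.
x = rng.standard_normal(n)
x_pad = Rn.T @ x
assert np.allclose(x_pad[idx_n], x) and np.count_nonzero(x_pad) <= n

# An extended matrix Phi_bar and the factorization Phi = Rm Phi_bar Rn*.
Phi_bar = rng.standard_normal((m_bar, n_bar))
Phi = Rm @ Phi_bar @ Rn.T      # extracts the (idx_m, idx_n) submatrix
assert np.allclose(Phi, Phi_bar[np.ix_(idx_m, idx_n)])
```

In practice one never forms these 0/1 matrices; fancy indexing and zero-filling implement Rm, Rn and their adjoints directly.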

Note that the adjoint, Rn∗, of the restriction, Rn, is the corresponding zero-padding operator. In addition to that factorization of Φ, it happens that the analysis transform A introduced above is actually a scaled tight frame, i.e., there exists a diagonal weighting matrix Ω such that A ≜ ΩÃ and Ã∗Ã = Id. In order to make use of the tight frame property of Ã and the factorization (4) of Φ, we define Ω̄ ≜ bdiag(Ω, 0n̄−n) and Ā ≜ (ÃRn; Rcn) (vertical concatenation), where Rcn ∈ {0, 1}(n̄−n)×n̄ is the complementary restriction of Rn, such that Rn∗Rn + Rcn∗Rcn = Id, and 0n̄−n ∈ R(n̄−n)×(n̄−n) is the zero matrix. The tight frame property, Ā∗Ā = Id, is thus preserved. Let x̄ ≜ Rn∗x, i.e., a zero-padded version of x, and let ᾱ ≜ Āx̄ and z̄ ≜ Φ̄x̄. Note that ‖Ω̄Āx̄‖1 = ‖Ax‖1 and ‖y − RmΦ̄x̄‖2 = ‖y − Φx‖2. By imposing Rcnx̄ = 0, we obtain a problem equivalent to (2),

x̂ = Rn argmin_{x̄∈Rn̄} ‖Ω̄Āx̄‖1 s.t. ‖y − RmΦ̄x̄‖2 ≤ τ, Rnx̄ ∈ R and Rcnx̄ = 0. (5)

Figure 2: MSVI forward model. [Diagram: the target X0 ∈ Rnu×nv×nλ is upscaled by the interpolation operator Up to Xup ∈ Rmu×mv×nλ; the FP filter masks M` then select one band per pixel to form Y ∈ Rmu×mv on the focal plane.]

Let ι‖y−·‖≤τ, ιR and ι0 be the indicators of the sets {z | ‖y − z‖ ≤ τ}, R and {0}. In order to match the required form, (3), the problem (5) is then split into J = 3 functions,

g1(z̄) ≜ ι‖y−·‖≤τ(Rmz̄), with H1 ≜ Φ̄;
g2(ᾱ) ≜ ‖Ω̄ᾱ‖1, with H2 ≜ Ā;
g3(x̄) ≜ ιR(Rnx̄) + ι0(Rcnx̄), with H3 ≜ Id.
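Each of these terms admits a standard proximal map: entrywise soft-thresholding for the weighted ℓ1-norm, projection onto the ℓ2-ball for the data-fidelity indicator, and clipping for the box constraint. A hedged NumPy sketch of these three maps (our illustration, not the paper's code):

```python
import numpy as np

def prox_weighted_l1(a, omega, gamma):
    """prox of gamma * ||diag(omega) a||_1: entrywise soft-thresholding."""
    t = gamma * np.abs(omega)
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def project_l2_ball(z, y, tau):
    """Projection onto {z : ||y - z||_2 <= tau} (prox of its indicator)."""
    r = z - y
    nrm = np.linalg.norm(r)
    return z if nrm <= tau else y + (tau / nrm) * r

def project_box(x, xmin, xmax):
    """Projection onto the range constraint R = [xmin, xmax]^n."""
    return np.clip(x, xmin, xmax)

# Quick sanity checks.
y = np.zeros(4)
assert np.linalg.norm(y - project_l2_ball(np.array([3.0, 0, 0, 0]), y, 1.0)) <= 1.0 + 1e-12
assert np.allclose(prox_weighted_l1(np.array([2.0, -0.5]), np.ones(2), 1.0), [1.0, 0.0])
assert np.allclose(project_box(np.array([-1.0, 2.0]), 0.0, 1.0), [0.0, 1.0])
```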

The corresponding proximal operators all admit a very simple closed-form expression that can be efficiently evaluated (see, e.g., [56, 57] and references therein). Moreover, we have

∑_{j=1}^{3} µjHj∗Hj = µ1(Φ̄∗Φ̄ + ((µ2 + µ3)/µ1) Id), (6)

which, as we will see, is easily invertible for both architectures.

For initializing the algorithm, we use a 3-D linear interpolation of Y in the (u, v, λ) space to get an estimate Ylin ∈ Rmu×mv×nλ. Let ylin = vec(Ylin), so that y = Rmylin (but ylin ≠ Rm∗y). We then use Tikhonov regularization,

xinit ≜ ((Φ̄Rn∗)∗Φ̄Rn∗ + τ²Id)⁻¹(Φ̄Rn∗)∗ylin, (7)

that we solve in practice with the conjugate gradients algorithm, i.e., without explicit matrix inversion.
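Such a matrix-free Tikhonov step can be sketched with SciPy's conjugate gradients on the normal equations; the small dense operator, sizes and names below are our illustrative stand-ins, not the paper's implementation:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(4)
m, n = 40, 25
A = rng.standard_normal((m, n))      # stand-in for the operator applied in (7)
y_lin = rng.standard_normal(m)       # stand-in for the interpolated estimate
tau = 0.1

# Solve (A* A + tau^2 Id) x = A* y_lin with conjugate gradients,
# i.e., without ever forming or inverting the normal matrix.
def normal_op(x):
    return A.T @ (A @ x) + tau**2 * x

H = LinearOperator((n, n), matvec=normal_op)
x_init, info = cg(H, A.T @ y_lin)
assert info == 0                     # CG reached its tolerance
assert np.allclose(normal_op(x_init), A.T @ y_lin, atol=1e-3)
```

Only matrix-vector products with A and A∗ are needed, which is what makes the initialization cheap for the structured operators of both architectures.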

3 Multispectral Compressive Imaging by Generalized Inpainting

The first architecture, coined Multispectral Volume Inpainting (MSVI), is presented in this section. We describe the formation and recording of measurements on the snapshot FP sensor and the corresponding sensing matrix implementation. The description below is aligned with Fig. 2.

3.1 Image Formation Model

Our scheme allows us to choose, as a free parameter, the target spatial resolution, nu × nv, of the target MS volume, X0 ∈ Rnu×nv×nλ (see Fig. 2). We choose a target resolution smaller than the sensor resolution, mu × mv, i.e., nu ≤ mu and nv ≤ mv, even though mumv = m ≤ n = nunvnλ. We assume that a spatial low-pass filter at appropriate cutoff frequency has been placed before the device, so that the chosen resolution, nu × nv, achieves the Nyquist rate of the resulting low-pass scene, X0(u, v, λ). This practice is common for stabilizing demosaicking [58], e.g., using birefringent filters [59] or by slightly defocusing the objective lens.

Let Xup ∈ Rmu×mv×nλ be an upscaled version of the scene X0, i.e., matching the FPA pixel count mu × mv. We can obtain this Xup by using a smooth and separable interpolation function, e.g., Lanczos, represented here by the linear operator, Up ∈ Rm×nunv, applied to each band. Since nu × nv achieves the Nyquist rate, both Xup and X0 are lossless representations of X0(u, v, λ). This decouples the number of FPA pixels, mumv, from the target scene resolution, nunv; we may choose the subsampling rate m/n by changing one or the other. In order to model the sensing operation and its relation with the upsampling Up, we introduce the diagonal mask operator, M` ∈ {0, 1}m×m, masking all FPA pixels but the ones corresponding to the FP filters of index ` ∈ [nλ]. Since every FPA pixel samples exactly one band, we have

∑_{`∈[nλ]} M` = Id and M`M`′ = 0 for ` ≠ `′,

so that the concatenation, M ≜ (M1, …, Mnλ) ∈ {0, 1}m×nλm, is a restriction operator in Rnλm such that MM∗ = Id. The sensing matrix in (1) finally reads Φ ≜ (Φ1, …, Φnλ) with Φ` ≜ M`Up. This forward model is schematized on Fig. 2.

Setting aside the clear affiliation with the inpainting problem in computer vision [60–62], we can draw links with the random basis ensembles (see, e.g., [63, Chapter 12] and references therein) in CS. In the spectral direction, the sparsity basis is the DCT, which is maximally incoherent with the canonical basis and is therefore an optimal choice. In the spatial dimension, on the other hand, loosely speaking and ignoring upsampling, the sparsity "basis" is a wavelet transform, which is not maximally incoherent with the canonical basis. This intuitively justifies the study of the second method in Section 4. Rigorously extending the analogy to redundant wavelet analysis with the upsampling would require further work.
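The MSVI forward model can be mimicked in a few lines; the sketch below uses nearest-neighbour upscaling as a crude stand-in for the Lanczos kernel Up, and a 4 × 4 mosaic for the masks M` (all sizes are our illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(5)
nu = nv = 4          # target scene resolution (toy)
mu = mv = 8          # FPA resolution (toy)
n_bands = 16

X0 = rng.random((nu, nv, n_bands))             # target MS cube
pattern = np.tile(np.arange(n_bands).reshape(4, 4), (mu // 4, mv // 4))

# Upscaling operator Up (nearest-neighbour stand-in for Lanczos), band by band.
Xup = np.repeat(np.repeat(X0, mu // nu, axis=0), mv // nv, axis=1)

# FP-filter masking: each FPA pixel records exactly one band of Xup.
uu, vv = np.meshgrid(np.arange(mu), np.arange(mv), indexing="ij")
Y = Xup[uu, vv, pattern]                       # shape (mu, mv): one snapshot

# Equivalent statement: sum over bands of M_ell applied to band ell.
Y_check = sum((pattern == l) * Xup[:, :, l] for l in range(n_bands))
assert np.allclose(Y, Y_check)
```

The per-pixel selection and the sum of masked bands are the same operation, which is why the M` sum to the identity in the model above.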

3.2 Sensing matrix implementation

As explained in Section 2.3, we can ease the computations by adding rows and columns to Φ (see (4)). One natural choice is Rn = Id and Rm = M, so that Φ̄ = bdiagnλ(Up). Therefore, Φ̄∗Φ̄ + µId is a separable sparse matrix which is easily pre-computed and fast to invert, for example with the conjugate gradients algorithm. Even though the gain is less obvious than in the MSRC case discussed in Section 4.3, we found, empirically, that using this trick speeds up ADMM's convergence compared to the direct use of Φ. Note that in the case mu = nu and mv = nv, i.e., a subsampling rate of 1/nλ, we have Up = Id, which makes the inversion step as trivial as a scalar multiplication by (1 + µ)⁻¹.

4 Multispectral Compressive Imaging by Out-of-Focus Random Convolution

This section describes the Multispectral Random Convolution (MSRC) device. First, we discuss the image formation model, then some implementation aspects linked to important non-idealities, such as attenuation and diffraction. Finally, we discuss the numerical implementation of the sensing matrix, to be used in the recovery method of Section 2.3.

4.1 Image Formation Model

We give here a description of the MSRC device, based on geometrical optics, as depicted in Fig. 3. It follows the ideas originally introduced by [42]; the difference here is that we use the FP filtered sensor instead of a panchromatic sensor.

4.1.1 Continuous model

For a precise description, it is easier to use the continuous-domain representation, X0(u, v, λ), of the object of interest. In order to lighten the notations, we consider only one spatial dimension and use a simplified X0(u, λ) instead of X0(u, v, λ). Since everything is separable, the two-dimensional extension is straightforward. The objective lens images the scene onto an intermediate image plane, from which we consider that a virtual flipped source X0(−u, λ) radiates in all directions allowed by the aperture of the objective lens. It illuminates a random CA with su elements, at a distance d along the optical axis. The CA,



[Figure 3 labels the optical path: scene, objective lens, image plane (virtual target X0(u, λ)), coded aperture (CA) with su symbols at distance d, focusing lens of focal length f, FP filters on the focal plane (sampled image Y(u, λ), mu pixels), and the ray offsets d tan(ϑ) and f tan(ϑ).]

Figure 3: Geometry of the MSRC optical path.

with aperture pitch ∆s, is modeled by its transmittance function¹,

    S(u) = \sum_{i=1}^{s_u} S_i \, \mathrm{rect}^{s_u}_i\!\big(\tfrac{u}{\Delta_s}\big).    (8)

Its su known symbols Si, drawn with equal probability, model either transparent (Si = 1) or opaque (Si = 0) pixels. We assume that the CA has negligible effect in the spectral domain. The CA is illuminated by replicas of the source, X0(u′(ϑ) − u, λ), shifted by u′(ϑ) = d tan(ϑ), as bundles of parallel rays that propagate in the same directions, defined by the angle ϑ ∈ [−π/2, π/2] w.r.t. the optical axis. An ideal thin lens with focal length f, placed in front of the CA, then focuses the modulated light on the sensor. All the rays with direction ϑ converge on the focal plane at u″(ϑ) = f tan(ϑ), giving

    Y(u''(\vartheta), \lambda) = \int S(u)\, X_0(u'(\vartheta) - u, \lambda)\, du = [S * X_0](u'(\vartheta), \lambda) = [S * X_0]\big(\tfrac{d}{f}\, u''(\vartheta), \lambda\big),    (9)

with ∗ denoting, here, a continuous convolution. This defines the relationship between ∆s, f, d and the pixel pitch of the imaging sensor, ∆m = ∆s f/d. For i ∈ [mu], we choose to model the sampling function corresponding to the ith detector by Mi(u, λ) = δ^{mu}_i(u/∆m) δ(λ − λi). In this notation, we highlight the fact that the sensor is spectrally-filtered, i.e., we assign a different wavelength λi depending on the pixel index i (see Section 2.1). The ith measurement is obtained as

    y_i = \iint M_i(u, \lambda)\, [S * X_0]\big(\tfrac{d}{f}\, u, \lambda\big)\, du\, d\lambda,    (10)

forming the discrete measurements vector, y. Note that this spectral filtering step is the continuous equivalent of the one described in Section 3 and represented on the right part of Fig. 2, but where Y(u, λ) = [S ∗ X0]((d/f) u, λ) replaces X0(u, λ). There are mu sensor pixels, i.e., mu shifts of the target on the CA, which covers nu CA elements. The CA must therefore have su = nu + mu − 1 elements to cover all recorded angles.

¹In order to lighten the notations, we introduce the two following sampling functions, for any grid length n ∈ N and sampling rate (or pitch) ∆ > 0,

    \delta^n_i\!\big(\tfrac{u}{\Delta}\big) \triangleq \delta\big(\tfrac{u}{\Delta} - (i - \tfrac{n+1}{2})\big), \qquad \mathrm{rect}^n_i\!\big(\tfrac{u}{\Delta}\big) \triangleq \mathrm{rect}\big(\tfrac{u}{\Delta} - (i - \tfrac{n}{2})\big),

where δ(u) is Dirac's delta function and rect(u) is the boxcar function, i.e., 1 if u ∈ [0, 1) and 0 elsewhere. They are defined so that for sampling indices i ∈ [n], the sampling grid is always centered around u = 0.



As explained in [42], we can alter the CA pattern and measurements vector so that the symbols of S(u) effectively become Si ∈ {−1, 1} instead of Si ∈ {0, 1}. We propose to either use two complementary patterns S+(u) and S−(u), where transparent pixels (Si = 1) become opaque (Si = 0) and vice versa, and subtract the corresponding measurements vectors, y = y+ − y−, or to subtract the measurement made with a fully transparent aperture, Son(u) (i.e., Si = 1, ∀i), from 2y+, i.e., y = 2y+ − yon. This implies the use of a programmable CA or a fixed mask that can easily be removed for a full, non-coded acquisition (see Section 4.2). In the rest, we consider that the equivalent Si ∈ {−1, 1} pattern is used.
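The equivalence of the two subtraction strategies follows directly from linearity in the aperture symbols. A small numpy check, where the matrix A below is a generic stand-in for the whole linear optical chain (not the paper's model):

```python
# Sketch of the {-1,+1} symbol trick: for any linear acquisition A @ s
# with s in {0,1}, both y+ - y- (complementary patterns) and 2*y+ - y_on
# (fully transparent reference) equal the measurement made with the
# signed pattern 2s - 1. A and the sizes are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 12))     # stand-in linear optical operator
s_plus = rng.integers(0, 2, 12)      # random 0/1 aperture pattern
s_minus = 1 - s_plus                 # complementary pattern
s_on = np.ones(12)                   # fully transparent aperture

y_pm = A @ s_plus - A @ s_minus      # two complementary snapshots
y_on = 2 * (A @ s_plus) - A @ s_on   # one coded + one open snapshot
y_signed = A @ (2 * s_plus - 1)      # ideal {-1,+1} measurement
```

Both differences cancel the constant (fully transparent) component, which is why the recorded pattern behaves as if its symbols were ±1.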

4.1.2 Discrete model

The discrete linear forward model of the optical processing chain stems from a particular discretization of the target volume. The most natural choice, in this instance, is to replace X0(u, λ), in (10), by its Dirac-sampled version,

    X_0^d(u, \lambda) \triangleq \sum_{i=1}^{n_u} \sum_{\ell=1}^{n_\lambda} x_{0,i,\ell}\, \delta^{n_u}_i\!\big(\tfrac{u}{\Delta_s}\big)\, \delta(\lambda - \lambda_\ell),    (11)

where x_{0,i,ℓ} is a sample of the discrete target volume. Note that the sampling functions of Mi(u, λ) and [S ∗ X_0^d]((d/f) u, λ) are nicely aligned with each other, so that (10) directly translates to the discrete model. Coming back to two discrete spatial dimensions, the discrete forward model thus reads

    Y = \sum_{\ell=1}^{n_\lambda} \mathcal{M}_\ell(S * X_{0,\ell}),    (12)

where Y ∈ R^{mu×mv} is the array of recorded measurements; Mℓ(·) : R^{mu×mv} → R^{mu×mv} are the mask linear operators modeling the effect of the FP filters (they correspond to the matrices Mℓ introduced in Section 3); the filter S ∈ {−1, 1}^{su×sv} represents the discrete, 2-D version of S(u); and X_{0,ℓ} ∈ R^{nu×nv} is the band of index ℓ of the full cube, X0 ∈ R^{nu×nv×nλ}. The size, su × sv, of the CA is chosen so that the valid convolution (noted ∗, see Section 1.3) matches the size of the sensor, i.e., (su − nu + 1, sv − nv + 1) = (mu, mv).
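A minimal numpy sketch of the forward model (12), with toy sizes and a random per-pixel band assignment standing in for the actual FP filter layout:

```python
# Hedged sketch of the discrete MSRC forward model (12):
# Y = sum_l M_l (S * X_l), with * the "valid" 2-D convolution and
# M_l keeping only sensor pixels filtered at band l. Toy sizes.
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(2)
nu = nv = 8                          # target spatial size
n_bands = 4                          # n_lambda
mu_, mv = 12, 12                     # sensor (FPA) size
su, sv = mu_ + nu - 1, mv + nv - 1   # CA size for a valid convolution

S = rng.choice([-1.0, 1.0], size=(su, sv))           # +/-1 coded aperture
X = rng.random((n_bands, nu, nv))                    # discrete target cube
band_of_pixel = rng.integers(0, n_bands, (mu_, mv))  # assumed FP layout

Y = np.zeros((mu_, mv))
for l in range(n_bands):
    M_l = (band_of_pixel == l)                        # mask for band l
    Y += M_l * convolve2d(S, X[l], mode="valid")      # M_l(S * X_l)
```

The "valid" mode enforces exactly the size relation (su − nu + 1, sv − nv + 1) = (mu, mv) stated above.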

4.1.3 Multi-snapshot mode

Since a total of mu × mv measurements is recorded by the sensor, the latter produces mu mv / nλ measurements per band. We consider the possibility of partitioning the acquisition of y ∈ R^m by taking multiple snapshots, {Yp}_{p∈[mS]}, with mS different aperture patterns, i.e., {Sp}_{p∈[mS]}. Therefore, the total number of measurements becomes m = mu mv mS. Taking multiple snapshots with different aperture patterns is expected to reduce the correlation between measurements. As the size of the FPA decreases, while keeping m constant, the multi-snapshot device resembles more and more the single-pixel camera [33], equivalent to setting mu = mv = nλ = 1.

4.2 Non-idealities and practical considerations

The parallel compressive MSRC scheme entails some additional concerns for its actual implementation. Hereafter, we explain the effect of diffraction and a few other non-idealities.

4.2.1 Diffraction and Point Spread Function

As anticipated by [42], the main optical-level limitation of this scheme is the impact of diffraction that occurs at the CA. A single small square aperture, followed by a lens and illuminated by a plane wave, forms a diffraction pattern at the focal plane [64, Chapter 4]. The effect of diffraction at the CA is modeled as an optical filter whose Point Spread Function (PSF) is that pattern. The 2-D, wavelength-dependent, diffraction kernel has the following expression at the focal plane,

    H(u, v, \lambda) = a\, \mathrm{sinc}^2\!\Big(\frac{u\,\Delta_s}{\lambda f}\Big)\, \mathrm{sinc}^2\!\Big(\frac{v\,\Delta_s}{\lambda f}\Big),

where a > 0 is an unimportant energy conservation constant (normalized afterwards). This PSF has a low-pass effect that limits the spatial bandwidth of the system, causing more correlation between measurements and a decrease of performance. We again simplify the discussion to one spatial dimension.
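The discrete PSF samples can be sketched as follows; here the rect-windowed integral of (16) is approximated by point samples of the sinc² kernel at the pixel pitch, and all numerical values are illustrative assumptions:

```python
# Hedged sketch of the discrete diffraction PSF samples h_i(lambda):
# point-sample approximation of the sinc^2 Fraunhofer pattern of a
# square aperture of pitch delta_s behind a lens of focal length f,
# normalized to sum to one (the factor b). Values are illustrative.
import numpy as np

def psf_samples(lam, f, delta_s, delta_m, m_h):
    """m_h centered samples of the sinc^2 kernel at pixel pitch delta_m."""
    i = np.arange(m_h) - (m_h - 1) / 2           # centered sample grid
    u = i * delta_m                              # focal-plane coordinates
    h = np.sinc(u * delta_s / (lam * f)) ** 2    # np.sinc(x)=sin(pi x)/(pi x)
    return h / h.sum()                           # normalization (factor b)

h = psf_samples(lam=620e-9, f=40e-3, delta_s=80e-6, delta_m=55e-6, m_h=11)
```

With these (assumed) parameters the kernel is symmetric, peaks at the central sample, and sums to one, as required of a normalized low-pass filter.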



Right before being sampled by the sensor, as in (10), the ideal function Y(u, λ) is spatially convolved with H̄(u, λ) as

    Y(u, \lambda) = [\bar{H} * Y](u, \lambda) = [\bar{H} * S * X_0]\big(\tfrac{d}{f}\, u, \lambda\big),    (13)

with H̄(u, λ) ≜ H((f/d) u, λ), the kernel as equivalently viewed at the CA scale. Note that this rescaled H̄(u, λ) does not physically appear at the CA (since the diffraction pattern is only observed in the focal plane) but is only a notational trick allowing mathematical simplification. The expression for the discrete measurements (10) becomes

    y_i = \iint M_i(u, \lambda)\, [\bar{H} * S * X_0]\big(\tfrac{d}{f}\, u, \lambda\big)\, du\, d\lambda.    (14)

In order to define the discrete sensing model, we inject (11) in (14) by replacing X0 by X_0^d. After expanding the expression of yi (the details are omitted for brevity), one can verify that the result is completely equivalent to replacing the continuous kernel H̄ in (14) by a discretized version defined by

    \bar{H}^d(u, \lambda) = \sum_{i=1}^{m_h} h_i(\lambda)\, \delta^{m_h}_i\!\big(\tfrac{u}{\Delta_s}\big),    (15)

where the mh PSF samples hi(λ) are given by

    h_i(\lambda) \triangleq b \int \bar{H}(u, \lambda)\, \mathrm{rect}^{m_h}_i\!\big(\tfrac{u}{\Delta_s}\big)\, du.    (16)

The number, mh, of kernel samples is determined by the size of the window in which they are significantly bigger than zero, and b is a normalization factor such that \sum_i h_i(\lambda) = 1. Note that sampling H̄(u, λ) with steps ∆s is equivalent to sampling H(u, λ) with steps ∆m. Also note that the sampling function, rect^{mh}_i(u/∆s), in (16) comes from the expression of S(u) (see (8)) but also depends on the chosen discretization X_0^d and sampling functions Mi, modeling the sensor pixels.

Let Hℓ ∈ R^{mh×mh} be the 2-D discrete PSF, obtained by evaluating the 2-D generalization of (16) at wavelength λℓ. Using (15), we can now adapt the 2-D multi-snapshot discrete model as

    Y_p = \sum_{\ell=1}^{n_\lambda} \mathcal{M}_\ell\big((H_\ell * S_p) * X_{0,\ell}\big),    (17)

where we can pre-compute the diffracted, wavelength-dependent aperture patterns, S_{p,ℓ} ≜ Hℓ ∗ Sp. Since the size, mu × mv, of the focal plane matches the valid convolution with a CA of size su × sv, we can safely truncate the diffracted aperture pattern, S_{p,ℓ}, to an effective size of su × sv.
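Pre-computing a diffracted pattern might look as follows, with toy sizes and a generic normalized nonnegative kernel standing in for the true sinc² PSF; scipy's mode="same" performs the central truncation back to su × sv:

```python
# Sketch of pre-computing S_{p,l} = H_l * S_p, truncated to the CA
# size su x sv. Sizes and the kernel are illustrative assumptions.
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(3)
su = sv = 15
m_h = 5                                        # discrete PSF support

S_p = rng.choice([-1.0, 1.0], size=(su, sv))   # one aperture snapshot
H_l = rng.random((m_h, m_h))
H_l /= H_l.sum()                               # normalized 2-D PSF

S_pl = convolve2d(S_p, H_l, mode="same")       # diffracted, truncated pattern
```

Since the PSF weights are nonnegative and sum to one, the diffracted symbols stay in [−1, 1]: the kernel only blurs, it never amplifies.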

Keep in mind that the modeled diffraction kernel is an approximation based on assumptions such as the use of a perfect thin lens, the fact that the illumination is an incoherent plane wave, etc. The actual PSF of the system could be measured, for instance, using a pre-defined CA along with a point-like target light source to estimate the PSF with a regularized inverse problem (see, e.g., [65, 66]). Spatially-dependent PSFs could also be estimated with similar techniques. We leave this subject open to future investigation.

4.2.2 Sizing example

We can compute the size of the diffraction kernel as a function of the pixel pitches ∆m and ∆s and the focal length, f, which is constrained by the size of the lens and thus the size of the CA. Specifically, the diameter Dlens of the focusing lens must be bigger than the CA, i.e., Dlens ≥ √2 max(su, sv) ∆s. Moreover, practical lenses should have a sufficiently high F-number to avoid aberrations, i.e., f/Dlens ≥ 0.5. We can characterize the width of the diffraction kernel on the sensor by the location of its first zeros, where the argument of the sinc²(·) is 1, i.e., in pixels,

    D_{\mathrm{PSF},\lambda} = \frac{2 \lambda f}{\Delta_m \Delta_s},

so that H(½ D_{PSF,λ} ∆m, λ) = H̄(½ D_{PSF,λ} ∆s, λ) = 0. We now apply this to the simulation parameters of Section 5.2.2, i.e., nu = nv = 256, mu = mv = 256, so that su = sv = 511. Notice that the largest PSF width corresponds to the longest wavelength, λmax = 620 nm. Let ∆s = 80 µm, so that the CA is about 41 mm wide and we must choose a lens of at least 58 mm in diameter, with focal length f ≥ 29 mm; e.g., we can arbitrarily




Figure 4: 11 pixels wide diffraction PSF (∆m = 55µm).

choose f = 40 mm. All these parameters being fixed, the PSF width on the focal plane (at λmax = 620 nm) is 620 µm, and the number of pixels it spans is determined by the pixel pitch ∆m of the sensor, which also determines the distance d. These parameters are incompatible with the standard CMOS technology of ∆m = 5.5 µm used in [10]: in that case, we get an impractical width of D_{PSF,λmax} = 112 pixels. The workaround, proposed in [42], of binning pixels together in macro-pixels is wasteful and defeats the purpose of the compressive architecture. Another way of modifying the equivalent ∆m, requiring further investigation, is to magnify the sensor as viewed from the focusing lens. For the simulations, intended as a proof of concept, we assume an effective magnification of 10 or 20 times the 5.5 µm CMOS sensor. This leads to a width of respectively D_{PSF,λmax} = 11 and 5 pixels (∆m = 55 and 110 µm). This PSF is illustrated on Fig. 4.

4.2.3 Other practical considerations

We mention here a few other important challenges of the MSRC design. Besides the spectral differences mentioned in Section 2.1, manufacturing variability may introduce unknown gains in the sensor. Because each measurement provides information about the entire scene, this can severely limit the quality of the MSRC reconstruction. In comparison, the effect of a corrupted measurement in the MSVI design would be localized. Blind calibration techniques [44, 45] may help when direct calibration is not possible.

Optical alignment is another important issue. For instance, the distances d and f and the roll angle between the sensor and the CA must be precisely set. The choice of the lenses must also minimize chromatic and spherical aberrations.

Narrowband filtering considerably decreases the system's light throughput. Therefore, the implementation must limit further light attenuation. For instance, several possible choices exist for the CA. A manufactured mask with physical holes provides the best light throughput but is not programmable. Pixels of a semi-transparent LCD Spatial Light Modulator (SLM) have imperfect opacity or transparency. The same goes for reflective Liquid Crystal on Silicon (LCoS) devices [67], which must be paired with a polarizing beam splitter [68] that further dims the light. Despite their excellent light transmittance, Digital Micro-mirror Devices (DMD), such as the one used in the single-pixel camera [33], are not suitable for being used out of focus, as uncertainty in the deflection angles (see, e.g., [69]) would translate into systematic error.

4.3 Sensing matrix implementation

Based on the discrete model (17), we can write the sensing matrix corresponding to the vectorized forward model (1). Let S_{p,ℓ} ∈ R^{mumv×nunv} be the partial block circulant matrix which defines the valid convolution operator with S_{p,ℓ}, and let Mℓ ∈ R^{mumv×mumv} be the matrix equivalent to Mℓ(·) (i.e., the same as in Section 3.1). First, notice that every band and every snapshot can be processed separately by the submatrices Φ_{p,ℓ} ≜ Mℓ S_{p,ℓ}, such that the vectorized form of (17) is

    y_p = \sum_{\ell=1}^{n_\lambda} \Phi_{p,\ell}\, x_{0,\ell}.

The sensing matrix is thus the block matrix Φ = (Φ_{p,ℓ})_{p∈[mS], ℓ∈[nλ]}. This follows from the natural order in which the yp and x_{0,ℓ} elements are stacked in the vectorized y ∈ R^m and x0 ∈ R^n. Note that each Φ_{p,ℓ} is a masked (some rows are zeroed by Mℓ) random convolution which enjoys good CS properties, as explained in Section 1.2. Let R_{mumv} ∈ R^{mumv×susv} be the restriction operator that selects the valid part, of size mu × mv, of a circular convolution of size su × sv. Similarly, let R*_{nunv} ∈ R^{susv×nunv} be the zero-padding operator (adjoint of the restriction) whose output matches the size su × sv of the circular convolution. Let F ∈ C^{susv×susv} be the 2-D DFT and let Σ_{p,ℓ} = diag(F vec(S_{p,ℓ})), i.e.,



the diagonal matrix of the DFT of S_{p,ℓ}. With all these ingredients, we can factorize

    \Phi_{p,\ell} = \mathbf{M}_\ell R_{m_u m_v} F^* \Sigma_{p,\ell} F R^*_{n_u n_v}.

The factors composing the full matrix Φ thus read R*_n = bdiag_{nλ}(R*_{nunv}) and R_m = bdiag_{mS}(R̄_{mumv}), with

    \bar{R}_{m_u m_v} = \big(\mathbf{M}_1 R_{m_u m_v}, \ldots, \mathbf{M}_{n_\lambda} R_{m_u m_v}\big),

and, denoting F_{nλ} = bdiag_{nλ}(F), F_{mSnλ} = bdiag_{mSnλ}(F) and the diagonal matrices Σp = bdiag(Σ_{p,1}, ..., Σ_{p,nλ}), the extended matrix

    \bar{\Phi} = F^*_{m_S n_\lambda} \begin{pmatrix} \Sigma_1 \\ \vdots \\ \Sigma_{m_S} \end{pmatrix} F_{n_\lambda}.

With this factorization, Φ̄∗Φ̄ + µId is easily invertible. Indeed, since F_{mSnλ} and F_{nλ} are unitary and Σp is diagonal, noting Σ̄² = \sum_{p=1}^{m_S} Σ*_p Σ_p, we have Φ̄∗Φ̄ + µId = F*_{nλ}(Σ̄² + µId)F_{nλ}. Therefore, inverting Φ̄∗Φ̄ + µId is just equivalent to inverting the diagonal matrix Σ̄² + µId. Note that computing Φx for some input vector x ∈ R^n requires computing nλ DFTs and mS nλ inverse DFTs of size su × sv. Similarly, computing Φ∗z for some input vector z ∈ R^m requires mS nλ DFTs and nλ inverse DFTs. Comparatively, applying the inverse of Φ̄∗Φ̄ + µId is cheaper, since it only requires nλ DFTs and nλ inverse DFTs.
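The key identity behind this factorization, namely that a valid convolution equals zero-padding (R*_{nunv}), circular FFT-domain convolution, then restriction (R_{mumv}), can be checked numerically with toy sizes and random data:

```python
# Sketch verifying the FFT factorization of Phi_{p,l}: a valid 2-D
# convolution equals circular convolution of the zero-padded input
# followed by restriction to the valid part. Toy sizes, random data.
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(4)
nu = nv = 8                           # one spectral band of the target
mu_, mv = 9, 9                        # sensor size
su, sv = mu_ + nu - 1, mv + nv - 1    # circular convolution size

S = rng.standard_normal((su, sv))     # (diffracted) aperture pattern
X = rng.standard_normal((nu, nv))

Xpad = np.zeros((su, sv))
Xpad[:nu, :nv] = X                    # zero-padding operator R*_{nu nv}
Sigma = np.fft.fft2(S)                # diagonal factor: DFT of the pattern
C = np.fft.ifft2(Sigma * np.fft.fft2(Xpad)).real
Y = C[nu - 1:, nv - 1:]               # restriction R_{mu mv}: valid part
```

No wrap-around contaminates the retained indices, so Y matches scipy's direct valid convolution exactly (up to floating-point error), while costing only two FFTs and one inverse FFT.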

5 Numerical Experiments

In this section, we present numerical results with experimental data for the MSVI, and with simulated acquisitions to compare both MSVI and MSRC strategies. Experiments were performed in MATLAB with the code provided in the supplementary material². In all the recoveries, we used µ1 = 50 ρ/‖Φ‖², µ2 = µ3 = ρ, with ρ = 40. The dynamic range is normalized to xmin = 0 and xmax = 1. The tolerance of tol = 5·10⁻⁵ for the relative ℓ2-distance between two iterations was always reached in less than niter = 2000 iterations. The µj parameters were manually tuned in order to reach a reasonably fast convergence, but they do not critically affect the recovery quality. The ρ/‖Φ‖² factor in µ1 is a heuristic normalization of Φ∗Φ compared to Id in (6). For the initialization, we use a tolerance of 10⁻³ and a maximum of ten iterations.

5.1 MSVI Experiment

On Fig. 5, we present the result based on experimental measurements of a test scene, observed with imec's mosaic sensor. This imager has a resolution of 1024×2048 pixels organized in a mosaic of 256×512 identical 4×4 macro-pixels, each with nλ = 16 different FP filters at wavelengths of visible light (as in Fig. 1b). For this experiment we restricted the measurements to a 512×512 region, depicted by the bigger white square on the false color image. The subsampling rate m/n here is 1/4, i.e., we recover a volume with 256×256×16 voxels. For setting τ, we target a minimum measurements-to-residual ratio of 20 log10(‖y‖/τ) = 40 dB. The naive, super-pixel based demosaicking method (top row), used in [10], is clearly the worst. The middle row shows the result of the linear interpolation and Tikhonov regularization initialization method (7). Though we observe a clear improvement, a grid artifact, already observed in [20], appears and is particularly visible on the 551 nm band. This artifact is removed with the proposed method (bottom row). Without the exact filter calibration profiles and the ground truth spectra, we cannot, here, evaluate the spectral accuracy.

5.2 Comparison of MSVI and MSRC on Synthetic Simulations

In the following, we use a MS dataset to compare both strategies, quantitatively and qualitatively, on a series of controlled, synthetic simulations. It comprises eight 256 × 256 × 16 ROI selected from the 32 multispectral 512 × 512 × 31 volumes of the CAVE dataset [70]. The spectral ROI is 470 nm through 620 nm, matching imec's sensor. The spatial ROI was manually chosen in each image to capture the most interesting features. The chart and stuffed toys sample (ROI centered at (230, 280)), used for qualitative comparisons, is shown on Fig. 6 (left). The

²This paper has supplementary downloadable material available at https://github.com/kevd42/hsics_tci.




Figure 5: Result of an experiment on real data acquired with imec's sensor. Qualitative comparison between naive demosaicking (nearest neighbor), the initialization (7), and the proposed method. The top image is a false color nearest neighbor preview. The points (A), (B) and (C) are pixels whose spectra are represented in the 4th column.

other samples, chosen to produce average PSNR curves, were balloons (255, 128), feathers (256, 256), jelly beans (256, 256), glass tiles (256, 256), stuffed toys (256, 256), superballs (200, 236), and beads (256, 256). The middle and right plots on Fig. 6 show the average (over the eight dataset samples) reconstruction Peak Signal-to-Noise Ratio (PSNR ≜ −10 log10(MSE), where MSE stands for Mean Squared Error) as a function of the subsampling rate m/n for five different sensor configurations: two MSVI and three MSRC setups, each with two levels, 40 and 20 dB (Signal-to-Noise Ratio, SNR), of additive white Gaussian noise on the measurements. The value of τ is determined by an oracle. Each simulation uses the corresponding sensing operator, Φ, both for generating the measurements and for reconstruction, i.e., there is no model mismatch. Fig. 7 shows the qualitative results for the chosen sample at m/n = 1/16, 40 dB SNR. We chose this extremely low sampling rate under low input noise to expose the most obvious differences between sensing strategies and configurations.

5.2.1 MSVI

Since n is fixed by the dataset, we explore four MSVI sensor sizes: mu = mv ∈ {256, 512, 768, 1024}. We test two different FP filter configurations: the mosaic pattern (Fig. 1b) and the random pattern (Fig. 1c). At lower subsampling ratios, the random arrangement outperforms the mosaic sensor, particularly at high input SNR. This indicates that randomness mitigates aliasing caused by extreme subsampling. At Nyquist rate, mosaic beats random sampling, but both results are above 50 dB and look visually perfect (not shown).

On Fig. 7, we first show the initialization point as defined by (7), i.e., since Φ = Id, xinit = (1 + τ²)⁻¹ ylin. Despite being much faster than our iterative method, it gives visually scrambled results, particularly bad on the outer 470 nm and 620 nm bands, where fewer data points are available. The spectral error is particularly large for pixel (A), where the highly textured region destabilizes linear interpolation. The results obtained by the proposed method, denoted x, are more accurate: edges and textures, e.g., the horizontal bars of the chart and the stripes on the toy's sleeve, are well resolved. The mosaic arrangement leads to a grid artifact as observed in Section 5.1, whereas the random arrangement leads to seemingly unstructured artifacts and smaller spectral error areas, which




Figure 6: Results of the synthetic experiment. (Left) Ground truth in false color of the CAVE [70] sample, chart and stuffed toy. The bigger white square region indicates the part of the cube (ROI) that was used in the experiment. The smaller white squares labelled (A), (B) and (C) correspond to zooms on features depicted on Fig. 7. (Middle and right) Average PSNR over the whole dataset for two levels of input noise: 40 dB (middle) and 20 dB (right). The curves correspond to the MSVI system (dotted lines) with mosaic (square purple) and random (diamond orange) layouts, and to the MSRC (plain lines) when diffraction is not modeled (red dots) and when DPSF,λmax is 5 (black triangles) or 11 pixels (green circles).

concurs with the PSNR curves. In practice, a higher sampling rate, e.g., m/n = 1/4, is preferable to mitigate those effects.

5.2.2 MSRC

For testing the MSRC strategy, the size of the FPA is fixed to mu = mv = 256, so that su = sv = 511. To vary the sampling rate m/n, the number of snapshots is increased as mS ∈ {1, 4, 9, 16}. Simulations with a unique snapshot and increasing FPA size, omitted for the sake of conciseness, gave very close but slightly inferior results. We compare the performance of a diffraction-free case with two cases where the diffraction kernel was respectively DPSF,λmax = 5 and 11 pixels wide. As expected, the global trend indicates that increasing the size of the PSF decreases quality. Interestingly, at m/n = 1/16, the reconstruction PSNR of the diffraction-free case is on par with the 5 pixel case.

Fig. 7 suggests that the MSRC method is suitable for extreme subsampling (compression). For example, the digits on the chart (first column) of the diffraction-free reconstruction are legible and artifacts are barely noticeable. The spectral error is also impressively small. Performance rapidly decreases with diffraction and its spatial low-pass effect. As expected, the redundant wavelet prior is not able to recover the lost high-pass information. However, where the 11 pixel case gives rather poor spectral accuracy, especially near edges, the 5 pixel case remains reasonably good at spectral reconstruction.

5.2.3 Comparison

In the ideal diffraction-free case under low noise, the MSRC device provides a performance improvement of up to 4 dB (for the 1/16 subsampling rate) over the MSVI. This justifies the present study on the feasibility of the MSRC design. However, at higher sampling rates, diffraction decreases quality, even with a 5 pixel PSF. Note that the gap between MSVI and the ideal MSRC falls to zero at 20 dB of input SNR. For the qualitative comparison, we focus on the case of Fig. 7, where MSRC outperforms MSVI on average, even with diffraction. MSRC gives better spectral accuracy than MSVI on the selected pixels, in particular pixels (A) and (B). Regarding spatial accuracy, noisy patterns appear between the stripes on the toy's sleeve in the MSVI reconstruction. However, the spatial high-frequency content, particularly visible on the chart patterns and digits, is affected by the diffraction kernel.

6 Conclusion

Both strategies proposed in this paper use a MS sensor with integrated FP filters. Despite using the principles of CS, they do not involve dispersive elements. Along with the conceptual optical design, for each device, we proposed an accurate forward model and a unified reconstruction procedure, formulated as a regularized inverse problem with an original spatio-spectral prior. The particularity of MSRC, compared to MSVI, lies in the spatial mixing provided by an out-of-focus CA, which allows higher compression ratios but, if not properly sized, entails adverse effects such as diffraction.




Figure 7: Qualitative results for the five sensor setups at m/n = 1/16, 40 dB input SNR. The patches correspond to zooms on regions of the cube (Fig. 6). The points (A), (B) and (C) are pixels whose spectra are represented in the 4th column. The light gray lines recall the ground truth and the red areas represent the error. The 2nd and 3rd rows show the results of linear interpolation of the mosaic and random MSVI measurements. The last five rows are the recovery results corresponding to each tested setup.

Through extensive numerical simulations, we explored different setups. We devised practical guidelines and highlighted limitations for both methods, allowing one to proceed towards an informed implementation. In an ideally sized, low-noise, calibrated setup, MSRC performs better at high compression. In other situations, factoring in the cost of implementation and calibration, MSVI should be preferred.

Acknowledgments

The authors thank the Integrated Imagers group of imec (Leuven, Belgium) for supporting part of this design exploration during the second author's visit between February and August 2014.



References

[1] P. Shippert, "Why Use Hyperspectral Imagery?" Photogrammetric Engineering and Remote Sensing, vol. 70, pp. 377–379, 2004.

[2] P. Tatzer, T. Panner, M. Wolf, and G. Traxler, "Inline sorting with hyperspectral imaging in an industrial environment," N. Kehtarnavaz and P. A. Laplante, Eds., vol. 5671. International Society for Optics and Photonics, Feb. 2005, p. 162.

[3] E. K. Hege, D. O'Connell, W. Johnson, S. Basty, and E. L. Dereniak, "Hyperspectral imaging for astronomy and space surveillance," in Optical Science and Technology, SPIE's 48th Annual Meeting, S. S. Shen and P. E. Lewis, Eds., vol. 5159. International Society for Optics and Photonics, Jan. 2004, p. 380.

[4] A. A. Gowen, C. P. O'Donnell, P. J. Cullen, G. Downey, and J. M. Frias, "Hyperspectral imaging – an emerging process analytical tool for food quality and safety control," Trends in Food Science & Technology, vol. 18, pp. 590–598, 2007.

[5] G. Lu and B. Fei, "Medical hyperspectral imaging: a review," Journal of Biomedical Optics, vol. 19, p. 010901, 2014.

[6] M. L. Whiting, S. L. Ustin, P. Zarco-Tejada, A. Palacios-Orueta, and V. C. Vanderbilt, "Hyperspectral mapping of crop and soils for precision agriculture," W. Gao and S. L. Ustin, Eds., vol. 6298. International Society for Optics and Photonics, Aug. 2006, p. 62980B.

[7] R. G. Sellar and G. D. Boreman, "Comparison of relative signal-to-noise ratios of different classes of imaging spectrometer," Applied Optics, vol. 44, pp. 1614–1624, 2005.

[8] N. Hagen and M. W. Kudenov, "Review of snapshot spectral imaging technologies," Optical Engineering, vol. 52, pp. 090901–090901, 2013.

[9] A. Lambrechts, P. Gonzalez, B. Geelen, P. Soussan, K. Tack, and M. Jayapala, "A CMOS-compatible, integrated approach to hyper- and multispectral imaging," 2014 IEEE International Electron Devices Meeting, pp. 10.5.1–10.5.4, 2014.

[10] B. Geelen, N. Tack, and A. Lambrechts, "A compact snapshot multispectral imager with a monolithically integrated per-pixel filter mosaic," in SPIE MOEMS-MEMS. International Society for Optics and Photonics, 2014.

[11] B. E. Bayer, "Color Imaging Array," U.S. Patent 3,971,065, 1976.

[12] D. L. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, pp. 1289–1306, 2006.

[13] E. J. Candès and M. B. Wakin, "An introduction to compressive sampling," IEEE Signal Processing Magazine, vol. 25, pp. 21–30, 2008.

[14] R. M. Willett, R. F. Marcia, and J. M. Nichols, "Compressed sensing for practical optical imaging systems: a tutorial," Optical Engineering, vol. 50, 2011.

[15] G. Arce, D. Brady, L. Carin, H. Arguello, and D. Kittle, "Compressive coded aperture spectral imaging: An introduction," IEEE Signal Processing Magazine, vol. 31, pp. 105–115, 2014.

[16] M. Gehm, R. John, D. Brady, R. Willett, and T. Schulz, "Single-shot compressive spectral imaging with a dual-disperser architecture," Optics Express, vol. 15, pp. 14013–14027, 2007.

[17] M. E. Gehm and D. J. Brady, "Compressive sensing in the EO/IR," Applied Optics, vol. 54, pp. C14–C22, Mar. 2015.

[18] E. J. Candès, "The restricted isometry property and its implications for compressed sensing," Comptes Rendus Mathematique, vol. 346, pp. 1–4, 2008.

[19] H. Rauhut, "Compressive Sensing and Structured Random Matrices," in Radon Series Comp. Appl. Math, M. Fornasier, Ed. De Gruyter, Jan. 2010, pp. 1–94.



[20] K. Degraux, V. Cambareri, L. Jacques, B. Geelen, C. Blanch, and G. Lafruit, “Generalized inpainting methodfor hyperspectral image acquisition,” in 2015 IEEE International Conference on Image Processing (ICIP),vol. 2015-Decem. IEEE, sep 2015, pp. 315–319.

[21] K. Degraux, V. Cambareri, B. Geelen, L. Jacques, G. Lafruit, and G. Setti, “Compressive HyperspectralImaging by Out-of-Focus Modulations and Fabry-Perot Spectral Filters,” in Proceedings of the second ”in-ternational Traveling Workshop on Interactions between Sparse models and Technology” (iTWIST’14), 2014,pp. 21–23.

[22] S. Boyd, “Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multi-pliers,” Foundations and Trends R© in Machine Learning, vol. 3, pp. 1–122, 2010.

[23] M. S. C. Almeida and M. A. T. Figueiredo, “Deconvolving Images With Unknown Boundaries Using theAlternating Direction Method of Multipliers,” IEEE Transactions on Image Processing, vol. 22, pp. 3074–3086, aug 2013.

[24] T. Sun and K. Kelly, “Compressive Sensing Hyperspectral Imager,” in Frontiers in Optics 2009/Laser ScienceXXV/Fall 2009 OSA Optics & Photonics Technical Digest. Optical Society of America, 2009, p. CTuA5.

[25] A. A. Wagadarikar, N. P. Pitsianis, X. Sun, and D. J. Brady, “Video rate spectral imaging using a coded aperture snapshot spectral imager,” Optics Express, vol. 17, pp. 6368–6388, 2009.

[26] A. Wagadarikar, R. John, R. Willett, and D. Brady, “Single disperser design for coded aperture snapshot spectral imaging,” Applied Optics, vol. 47, pp. B44–B51, 2008.

[27] Y. Wu, I. O. Mirza, G. R. Arce, and D. W. Prather, “Development of a digital-micromirror-device-based multishot snapshot spectral imaging system,” Optics Letters, vol. 36, pp. 2692–2694, 2011.

[28] C. V. Correa, H. Arguello, and G. R. Arce, “Compressive spectral imaging with colored-patterned detectors,” in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. IEEE, 2014, pp. 7789–7793.

[29] ——, “Snapshot colored compressive spectral imager,” JOSA A, vol. 32, pp. 1754–1763, 2015.

[30] Y. August, C. Vachman, Y. Rivenson, and A. Stern, “Compressive hyperspectral imaging by random separable projections in both the spatial and the spectral domains,” Applied Optics, vol. 52, pp. D46–D54, 2013.

[31] J. E. Fowler, “Compressive pushbroom and whiskbroom sensing for hyperspectral remote-sensing imaging,” in Image Processing (ICIP), 2014 IEEE International Conference on. IEEE, 2014, pp. 684–688.

[32] B. Geelen, C. Blanch, P. Gonzalez, K. Tack, and A. Lambrechts, “A tiny VIS-NIR snapshot multispectral camera,” in SPIE OPTO, no. 937414. International Society for Optics and Photonics, 2015.

[33] J. K. Romberg, M. F. Duarte, M. A. Davenport, D. Takhar, J. N. Laska, T. Sun, K. F. Kelly, and R. G. Baraniuk, “Single-Pixel Imaging via Compressive Sampling,” IEEE Signal Processing Magazine, vol. 25, pp. 83–91, Mar. 2008.

[34] M. Lustig, D. Donoho, and J. M. Pauly, “Sparse MRI: The application of compressed sensing for rapid MR imaging,” Magnetic Resonance in Medicine, vol. 58, pp. 1182–1195, 2007.

[35] R. M. Willett, R. F. Marcia, and J. M. Nichols, “Compressed sensing for practical optical imaging systems: a tutorial,” Optical Engineering, vol. 50, p. 072601, Jul. 2011.

[36] R. H. Dicke, “Scatter-Hole Cameras for X-Rays and Gamma Rays,” The Astrophysical Journal, vol. 153, p. L101, Aug. 1968.

[37] E. E. Fenimore and T. M. Cannon, “Coded aperture imaging with uniformly redundant arrays,” Applied Optics, vol. 17, p. 337, Feb. 1978.

[38] J. Romberg, “Compressive Sensing by Random Convolution,” SIAM Journal on Imaging Sciences, vol. 2, pp. 1098–1128, Jan. 2009.


[39] H. Rauhut, J. Romberg, and J. A. Tropp, “Restricted isometries for partial random circulant matrices,” Applied and Computational Harmonic Analysis, vol. 32, pp. 242–254, Mar. 2012.

[40] L. Jacques, P. Vandergheynst, A. Bibet, V. Majidzadeh, A. Schmid, and Y. Leblebici, “CMOS compressed imaging by Random Convolution,” in 2009 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, Apr. 2009, pp. 1113–1116.

[41] R. F. Marcia, Z. T. Harmany, and R. M. Willett, “Compressive coded aperture imaging,” in IS&T/SPIE Electronic Imaging, C. A. Bouman, E. L. Miller, and I. Pollak, Eds., Feb. 2009.

[42] T. Bjorklund and E. Magli, “A parallel compressive imaging architecture for one-shot acquisition,” in 2013 Picture Coding Symposium (PCS), Nov. 2013, pp. 65–68.

[43] W. L. Chan, K. Charan, D. Takhar, K. F. Kelly, R. G. Baraniuk, and D. M. Mittleman, “A single-pixel terahertz imaging system based on compressed sensing,” Applied Physics Letters, vol. 93, p. 121105, Sep. 2008.

[44] V. Cambareri and L. Jacques, “A Greedy Blind Calibration Method for Compressed Sensing with Unknown Sensor Gains,” ArXiv e-prints arXiv:1610.02851, pp. 1–6, 2016.

[45] S. Ling and T. Strohmer, “Self-calibration and biconvex compressive sensing,” Inverse Problems, vol. 31, p. 115002, Nov. 2015.

[46] C. Fabry and A. Perot, “On a New Form of Interferometer,” The Astrophysical Journal, vol. 13, p. 265, May 1901.

[47] G. Hernandez, Fabry–Perot Interferometers: Cambridge Studies in Modern Optics. Cambridge University Press, May 1986, vol. 3.

[48] B. Geelen, N. Tack, and A. Lambrechts, “A snapshot multispectral imager with integrated tiled filters and optical duplication,” in SPIE MOEMS-MEMS, G. von Freymann, W. V. Schoenfeld, and R. C. Rumpf, Eds., vol. 8613, Mar. 2013, p. 861314.

[49] M. Elad, P. Milanfar, and R. Rubinstein, “Analysis versus synthesis in signal priors,” European Signal Processing Conference, vol. 23, pp. 947–968, 2006.

[50] E. J. Candes, Y. C. Eldar, D. Needell, and P. Randall, “Compressed sensing with coherent and redundant dictionaries,” Applied and Computational Harmonic Analysis, vol. 31, pp. 59–73, 2011.

[51] S. Nam, M. Davies, M. Elad, and R. Gribonval, “The cosparse analysis model and algorithms,” Applied and Computational Harmonic Analysis, vol. 34, pp. 30–56, Jan. 2013.

[52] S. Mallat, A Wavelet Tour of Signal Processing. Elsevier, 2009.

[53] J. L. Starck, J. Fadili, and F. Murtagh, “The undecimated wavelet decomposition and its reconstruction,” IEEE Transactions on Image Processing, vol. 16, pp. 297–309, 2007.

[54] R. Glowinski and A. Marroco, “Sur l’Approximation, par Elements d’Ordre Un, et la Resolution, par Penalisation-Dualite, d’une Classe de Problemes de Dirichlet non Lineaires,” Revue Française d’Automatique, Informatique, et Recherche Opérationnelle, vol. 9 (R-2), pp. 41–76, 1975.

[55] D. Gabay and B. Mercier, “A dual algorithm for the solution of nonlinear variational problems via finite element approximation,” Computers & Mathematics with Applications, vol. 2, pp. 17–40, 1976.

[56] P. L. Combettes and J.-C. Pesquet, “Proximal Splitting Methods in Signal Processing,” in Fixed-Point Algorithms for Inverse Problems in Science and Engineering, H. H. Bauschke, R. S. Burachik, P. L. Combettes, V. Elser, D. R. Luke, and H. Wolkowicz, Eds. Springer New York, 2011, pp. 185–212.

[57] N. Parikh and S. P. Boyd, “Proximal Algorithms,” Foundations and Trends in Optimization, vol. 1, pp. 123–231, 2013.

[58] B. W. Keelan, Handbook of Image Quality: Characterization and Prediction. New York: Marcel Dekker, 2002.


[59] T. Acharya, K. Bhattacharya, and A. Ghosh, “Optical processing using a birefringence-based spatial filter,” Journal of Modern Optics, vol. 41, pp. 979–986, 1994.

[60] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester, “Image inpainting,” in Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques - SIGGRAPH ’00. New York, NY, USA: ACM Press, 2000, pp. 417–424.

[61] C. Ballester, M. Bertalmio, V. Caselles, G. Sapiro, and J. Verdera, “Filling-in by joint interpolation of vector fields and gray levels,” IEEE Transactions on Image Processing, vol. 10, pp. 1200–1211, 2001.

[62] M. Elad, J.-L. Starck, P. Querre, and D. L. Donoho, “Simultaneous cartoon and texture image inpainting using morphological component analysis (MCA),” Applied and Computational Harmonic Analysis, vol. 19, pp. 340–358, 2005.

[63] S. Foucart and H. Rauhut, “A mathematical introduction to compressive sensing,” Appl. Numer. Harmon. Anal. Birkhäuser, Boston, 2013.

[64] J. W. Goodman, Introduction to Fourier Optics. Roberts and Company Publishers, 2005.

[65] A. Gonzalez, V. Delouille, and L. Jacques, “Non-parametric PSF estimation from celestial transit solar images using blind deconvolution,” Journal of Space Weather and Space Climate, vol. 6, p. A1, Jan. 2016.

[66] S. Guerit, A. Gonzalez, A. Bol, J. A. Lee, and L. Jacques, “Blind Deconvolution of PET Images using Anatomical Priors,” Proceedings of the Third “international Traveling Workshop on Interactions between Sparse models and Technology” (iTWIST’16), Aug. 2016.

[67] H. Nagahara, C. Zhou, T. Watanabe, H. Ishiguro, and S. K. Nayar, “Programmable aperture camera using LCoS,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 6316 LNCS, 2010, pp. 337–350.

[68] G. R. Fowles, “Introduction to Modern Optics,” American Journal of Physics, vol. 36, p. 770, 1968.

[69] Texas Instruments, “DLP4500 0.45 WXGA DMD,” part datasheet, Apr. 2013 (revised Jan. 2016).

[70] F. Yasuma, T. Mitsunaga, D. Iso, and S. K. Nayar, “Generalized assorted pixel camera: Postcapture control of resolution, dynamic range, and spectrum,” IEEE Transactions on Image Processing, vol. 19, pp. 2241–2253, 2010.
