This received the “best paper” award at the 1999 IEEE Conf. on Computer Vision and Pattern Recognition

Robust Hierarchical Algorithm for Constructing a Mosaic from Images of the Curved Human Retina

Ali Can¹, Charles V. Stewart², Badrinath Roysam¹

¹ Dept. of Elec. Comp. and Sys. Eng., Rensselaer Polytechnic Institute, Troy, NY 12180-3590, {cana,roysab}@rpi.edu
² Dept. of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180-3590, [email protected]

Abstract

This paper describes computer vision algorithms to assist in retinal laser surgery, which is widely used to treat leading blindness-causing conditions but only has a 50% success rate, mostly due to a lack of spatial mapping and reckoning capabilities in current instruments. The novel technique described here automatically constructs a composite (mosaic) image of the retina from a sequence of incomplete views. This mosaic will be useful to ophthalmologists for both diagnosis and surgery. The new technique goes beyond published methods in both the medical and computer vision literatures because it is fully automated, models the patient-dependent curvature of the retina, handles large interframe motions, and does not require calibration. At the heart of the technique is a 12-parameter image transformation model derived by modeling the retina as a quadratic surface and assuming a weak perspective camera and rigid motion. Estimating the parameters of this transformation model requires robustness to unmatchable image features and mismatches between features caused by large interframe motions. The described estimation technique is a hierarchy of models and methods: the initial match set is pruned based on a 0th-order transformation estimated using a similarity-weighted histogram; a 1st-order, affine transformation is estimated using the reduced match set and least-median of squares; and the final, 2nd-order, 12-parameter transformation is estimated using an M-estimator initialized from the 1st-order results. Initial experimental results show the method to be robust and accurate in accounting for the unknown retinal curvature in a fully automatic manner, while preserving image details.

The authors would like to thank Dr. Howard L. Tanenbaum of the Center for Sight, Albany, NY, Dr. James N. Turner of the Wadsworth Center for Laboratories and Research, NY State Dept. of Health, and Amitha Perera, Dept. of Computer Science, Rensselaer, for their contributions to this work.

1 Introduction

Medical applications of computer vision have received increasing attention over the past several years. Many of these applications are in MRI and CT image analysis [20], while others are in robotic surgery [27], and in surgical planning and visualization [11]. One application area where the development of 3-D computer vision techniques has been less substantial, but where the need is great, is in ophthalmology. This paper considers the particular problem of retinal laser surgery.

Retinal laser surgery, currently done manually, is a proven treatment for leading blindness-causing conditions [17, 22], including age-related macular degeneration [19], degenerative myopia, and diabetic retinopathy, affecting over 24 million people in the US alone [21, 31]. When diagnosing these conditions, the retina is imaged in two spectral channels. One channel images the retina in the visible spectrum, while the other images the layer beneath the retina in the near-infrared using a fluorescent dye (Indocyanine Green [14]). Using both types of imagery, physicians are able to delineate pathological regions for subsequent surgical laser photocoagulation. This surgical procedure must be done accurately, without damaging vital healthy tissue, without missing pathological tissue, and without overexposure [22]. Although laser retinal surgery is the most widely used treatment for the above conditions, the current success rate is below 50% for the first treatment, with a recurrence and/or persistence rate of about 50%. Each re-treatment also has a 50% failure rate, with visual recovery declining with successive treatments [31]. These failures can be traced to the manual nature of the surgery, and the lack of spatial reckoning aids in the clinical instrumentation. It is difficult for the physician to develop a complete view of the retina, to respond quickly to patient eye movements which cannot be completely stopped during surgery, and to measure optical dosage quantitatively. The compelling need to minimize invasiveness implies that computer vision methods are preferable for addressing the above problems [29].



Figure 1: Three example 1024 × 1024 images of the retina surface showing relatively little overlap between them.

We are developing a two-phase computer vision system to assist physicians during diagnosis and retinal laser surgery. Phase I, coinciding with the physician’s exploration for diagnosis, involves constructing a high-resolution wide-area mosaic of the retina. Phase II, coinciding with surgical treatment, involves real-time tracking of the projected position of the laser on the retina and comparing this against physician-delineated regions of surgical interest. Tracking results will be used to calculate dosimetry and to turn the laser on or off as appropriate. The focus of the current paper is on the mosaic constructed during Phase I and used as a spatial reference map during Phase II. The image matching technique developed for mosaic construction will also be used as part of a multistage, hierarchical tracking algorithm.

Although the problem of constructing a composite or “mosaic” image from image sequences has been studied heavily over the past decade in both the computer vision [15, 26, 25] and biomedical literatures [18, 5, 9], the problem of automatically constructing a high-resolution mosaic from a retinal image sequence raises two major issues requiring novel techniques:

• An appropriate inter-image transformation model must be found for mapping individual images onto the mosaic image frame. This model must account for the curved surface of the retina, the relatively general rigid motion possible, and the inherent lack of calibration because the optics of each individual retina change the effective optics of the camera. We will demonstrate the need for a second-order transformation model which we derive here.

• A new technique is needed to estimate parameters of the transformation model. Because large eye movements are possible, there may be as little as 40% overlap between images. This implies that feature-based techniques [28, 30] should be used instead of intensity-based methods [3, 6], but more importantly it indicates that robustness to large numbers of mismatches and unmatchable features must be central to the estimation technique.

Investigation of these issues forms the heart of the paper.

Figure 2: Two views of the retina. (Diagram labels: cameras $C_p$ and $C_q$ with coordinate axes X, Y, Z and X', Y', Z'; image coordinates x, y and x', y'; image points p and q; Retina, Lens, Iris, Cornea, Optic Disk, Choroid, Vitreous Humour.)

2 Image Transformation Model

The retinal surface is almost but not quite spherical, making a quadric or even a quadratic model, intuitively, a good approximation to its shape. During diagnosis and surgery, a patient’s pupil is dilated and his/her forehead is held against a harness. Small shifts in head position are likely, inducing translations and rotations of the eye. Eye movements themselves, not significantly constrained during diagnosis or surgery, are almost exclusively rotational and can occur about two axes at rates of up to 180° per second. Significantly, neither axis of rotation is the optical axis. The fundus camera through which the retina is viewed is fixed and produces high-resolution, 1024 × 1024 images (Fig. 1). Together, the foregoing observations imply that the apparent motion of the retina should be modeled as a general rigid motion. (They also imply, however, that some components of the motion, rotation about the camera’s optical axis in particular, will be small.) This generality suggests that the curved nature of the retina surface should be taken into account in the inter-image transformation model and that transformation models based on a planar surface [1, 6] will be insufficient. We confirm these intuitions by deriving a transformation model for a quadratic surface imaged by a weak perspective camera and by comparing it to models for planar surfaces.

Figure 3: Examples showing visually the transformation error obtained using a 0th-order (image translation only) model (left), the affine model (center), and the quadratic model (right).

Consider two images $I_p(\mathbf{p})$ and $I_q(\mathbf{q})$ of a quadratic surface (the retina) taken by weak perspective cameras $C_p$ and $C_q$ (Fig. 2). Let the $3 \times 3$ rotation matrix $\mathbf{R}$ and the $3 \times 1$ vector $\mathbf{t}$ define the rigid transformation between camera viewpoints. The transformation may be due to movement of the surface. Let $(X, Y, Z)^T$ be a point on the surface in the coordinate system of camera $C_p$ having the $Z$ axis aligned with the optical axis, so that $\mathbf{p} = (x, y)^T = s\,(X, Y)^T$. Write the quadratic surface equation as

$$Z = (1, X, Y)\,\mathbf{M}\,(1, X, Y)^T, \qquad (1)$$

with $\mathbf{M}$ a $3 \times 3$ symmetric parameter matrix. Also, let $(X', Y', Z')^T$ describe the same point as $(X, Y, Z)^T$ but in the coordinate system of $C_q$, so that $\mathbf{q} = (x', y')^T = s'\,(X', Y')^T$. Then, because of the rigid transformation,

$$\begin{pmatrix} X' \\ Y' \end{pmatrix} = \begin{pmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \end{pmatrix} + \begin{pmatrix} t_X \\ t_Y \end{pmatrix},$$

where the $r_{ij}$ are components of $\mathbf{R}$. Substituting $(1, X, Y)\,\mathbf{M}\,(1, X, Y)^T$ for $Z$, then substituting $x/s$ for $X$, $y/s$ for $Y$, $x'/s'$ for $X'$ and $y'/s'$ for $Y'$, and finally manipulating the equation yields

$$\mathbf{q} = \Theta\,\mathbf{x}(\mathbf{p}), \qquad (2)$$

where $\mathbf{x}(\mathbf{p}) = (1, x, y, x^2, xy, y^2)^T$ and $\Theta$ is a $2 \times 6$ parameter matrix. We only estimate the 12 parameters of $\Theta$ rather than extracting the unknown quadratic surface, camera, and motion parameters represented by $\Theta$. Equation 2 generalizes the 2D affine transformation model induced by the rigid motion of a planar surface [6], which is given by:

$$\mathbf{q} = \mathbf{A}\,\mathbf{p} + \mathbf{t}. \qquad (3)$$

Notice that (2) is the second-order Taylor series expansion of the general image transformation equation, while the affine model (3) is the first-order expansion. The instantaneous motion of a planar surface under a full-perspective model induces an 8-parameter model, whereas that of a quadratic surface induces a 17-parameter, non-linear model.
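A minimal NumPy sketch may make the parameterization of Eqn. 2 concrete. The function names (`quad_basis`, `apply_quadratic`, `affine_to_theta`) are illustrative and do not come from the paper; the last one only demonstrates that the affine model of Eqn. 3 is the special case in which the three second-order columns of Θ are zero.

```python
import numpy as np

def quad_basis(p):
    """Return x(p) = (1, x, y, x^2, xy, y^2) for each row of p (N x 2)."""
    p = np.atleast_2d(np.asarray(p, dtype=float))
    x, y = p[:, 0], p[:, 1]
    return np.stack([np.ones_like(x), x, y, x * x, x * y, y * y], axis=1)

def apply_quadratic(theta, p):
    """Map points p (N x 2) through q = Theta x(p); theta has shape 2 x 6."""
    return quad_basis(p) @ np.asarray(theta, dtype=float).T

def affine_to_theta(A, t):
    """Embed q = A p + t as a 2 x 6 Theta whose second-order columns are zero."""
    theta = np.zeros((2, 6))
    theta[:, 0] = t        # constant column holds the translation
    theta[:, 1:3] = A      # linear columns hold the 2 x 2 affine matrix
    return theta
```

With this layout each row of Θ holds the six coefficients for one output coordinate, which is where the count of 12 parameters comes from.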

The two planar models and the new quadratic model may be compared both numerically and visually. To do so, we manually locate and match features (bifurcations of the retinal vasculature) in pairs of images. We then find the image transformation parameters for each model minimizing the sum-squared error between a transformed feature and its match, using the standard deviation of the residual error to measure transformation model accuracy. Accuracies in the range 4-5 pixels are typical for the 6- and 8-parameter models induced by planar surface motion whereas accuracies of around 1.1 pixels are typical of transformations induced by quadratic surface motion. This is particularly encouraging because error in manual feature position localization is on the order of a pixel. We therefore choose the 12-parameter model, which we refer to as the “quadratic model”, for constructing retinal mosaics. This choice is confirmed by visual comparison between the affine and quadratic models shown in Fig. 3 (center) and (right). The advantage of the quadratic model can clearly be seen near the white center (the optic disk). The transformation estimated under simple image translation is also shown. The significance of this will be explained in the next section.
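The numerical comparison described above can be sketched as an ordinary least-squares fit of each model to manually matched points. This is only an illustration under stated assumptions: `basis` and `fit_model` are hypothetical names, only the 6-parameter affine and 12-parameter quadratic models are covered, and the root-mean-square error distance stands in for the paper's "standard deviation of the residual error", whose exact definition is not given.

```python
import numpy as np

def basis(p, order):
    """Monomials of each image point: order 1 -> (1, x, y); order 2 adds (x^2, xy, y^2)."""
    p = np.asarray(p, dtype=float)
    x, y = p[:, 0], p[:, 1]
    cols = [np.ones_like(x), x, y]
    if order == 2:
        cols += [x * x, x * y, y * y]
    return np.stack(cols, axis=1)

def fit_model(p, q, order):
    """Least-squares fit of q ~ coef @ basis(p). Returns (coef, rms_error).

    coef is 2 x 3 for the affine model and 2 x 6 for the quadratic model.
    rms_error is the root-mean-square error distance (an assumed accuracy measure).
    """
    q = np.asarray(q, dtype=float)
    X = basis(p, order)
    coef, *_ = np.linalg.lstsq(X, q, rcond=None)   # minimizes the sum-squared error
    dist = np.linalg.norm(q - X @ coef, axis=1)    # per-feature error distance
    return coef.T, float(np.sqrt(np.mean(dist ** 2)))
```

Calling `fit_model(p, q, 1)` and `fit_model(p, q, 2)` on the same matched points gives the two error figures that would be compared.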

3 Estimating the Transformation

The next step is developing a technique to estimate the quadratic model parameters for any pair of image frames. The major challenge is handling potentially small overlap between images. This necessitates use of feature-based matching as the basis for estimating the transformation and suggests the need for substantial robustness to unmatched and mismatched features.

3.1 Detecting Vasculature Bifurcations

Looking at the images in Fig. 1, the most prominent features are the blood vessels and their bifurcations/crossovers. Other image regions are usually relatively featureless. Additional features appear in pathological retinas, but vascular structure usually continues to be prominent. We therefore detect bifurcations and match them as the basis for estimating the inter-image transformation.¹

Figure 4: Landmarks detected using the recursive vasculature tracing algorithm.

We have developed a recursive vasculature tracing algorithm to map out the blood vessel structure and detect bifurcations (“landmarks”) in an image. The algorithm, which has been described in detail elsewhere [8], starts by analyzing the image along a coarse grid to do 1D edge detection and gather image intensity statistics. Seed points for tracing are identified and prioritized from a detailed image analysis around detected edges. Subsequent recursive tracing is straightforward and rapid, with special case checks to detect bifurcations. An example result is shown in Figure 4. The algorithm runs at frame rate on a dual processor SGI.

3.2 Hierarchical Estimation Algorithm

Matching bifurcations and estimating the quadratic transformation requires special care. The bifurcations are not particularly distinguishable from each other locally because many have similar vessel orientations and backgrounds. Therefore, the large motion between frames with resulting small image overlap implies that there will be many different possible matches for each bifurcation, but many bifurcations may have no correct matches at all. This makes the matching and transformation estimation problem appear more difficult than related robust matching problems such as in fundamental matrix estimation [28, 30].

To motivate the estimation technique, we revisit the discussion and results of Sec. 2. First, the retinal surface, while approximately quadratic, is roughly planar over large regions. This explains why the affine transformation model yields small error, and suggests that an affine model, involving only 6 parameters and requiring only 3 correspondences, could be estimated as a preliminary step to estimating the full quadratic model. Second, the lack of rotation about the (eye’s) optical axis implies that the apparent rotation in the image will not be substantial: rotations about the other two axes, each parallel to the image plane, produce apparent motions that are difficult to distinguish from image translation except over broad image regions. Third, translational motion of the head, especially along the optical axis, is likely to be small enough that convergent and divergent motions in the image are insignificant. These second and third points suggest that a 0th-order model involving only image translation may be used as an even cruder initial approximation to the transformation (Fig. 3 (left)). Fourth and finally, the features are relatively sparse and the feature sets are relatively small, involving 30-50 bifurcations. This makes practical the techniques described below.

¹ In ongoing work we are studying the possibility of detecting and matching pathologies such as drusen, lipid exudates, angioid streaks, hemorrhages, laser scars, and cotton-wool spots to augment matching of bifurcations.

We have developed a hierarchical matching and transformation estimation algorithm based on these observations. Each level of the hierarchy involves a different transformation model and estimation technique, and the results of one level are used as initial conditions for subsequent levels.

0th Order, Translation: Initially, just a translation vector, $\mathbf{t}_0$, is estimated. This represents the 0th-order transformation model

$$T_0(\mathbf{p}; \mathbf{t}_0) = \mathbf{p} + \mathbf{t}_0. \qquad (4)$$

Let the initial vasculature bifurcation sets from the two images be denoted $P$ and $Q$, with $N_p$ and $N_q$ points respectively. Form a 2D weighted histogram of translation vectors with large histogram bin sizes (e.g. 15-20 pixels on a side), and establish an initial correspondence set $C_0 = P \times Q$. Restrictions could be placed on the matches $C_0$, but thus far this has not been necessary. For each $(\mathbf{p}_i, \mathbf{q}_j) \in C_0$, calculate $\mathbf{t}_{i,j} = \mathbf{q}_j - \mathbf{p}_i$ and enter a weighted vote in the histogram bin containing $\mathbf{t}_{i,j}$. The weight is a similarity measure on the matches based on similarity in intersection angles and vessel thickness. Figure 5 shows an example histogram. The peak of this histogram, after low-pass filtering, provides the estimate $\hat{\mathbf{t}}_0$. A new correspondence set $C_1$ is formed by all matches such that $\mathbf{t}_{i,j}$ is within twice the histogram bin width of $\hat{\mathbf{t}}_0$. Some features will have several correspondences in $C_1$, while others will have none.²
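The 0th-order step can be sketched as follows. Several details here are assumptions rather than the paper's choices: the function name `estimate_translation`, the 3 × 3 binomial kernel used as the low-pass filter, the use of the peak bin's center as $\hat{\mathbf{t}}_0$, and the per-axis reading of "within twice the histogram bin width". The similarity weights `sim` are taken as given, since the paper does not spell out their formula.

```python
import numpy as np

def estimate_translation(P, Q, sim, bin_size=16.0):
    """Similarity-weighted histogram vote over all pairs in C0 = P x Q.

    P: (Np, 2) and Q: (Nq, 2) bifurcation coordinates.
    sim: (Np, Nq) similarity weight of each candidate match (assumed given).
    Returns (t0_hat, C1) where C1 is a boolean (Np, Nq) mask of retained matches.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    T = Q[None, :, :] - P[:, None, :]              # t_ij = q_j - p_i, shape (Np, Nq, 2)
    flat = T.reshape(-1, 2)
    lo = flat.min(axis=0)                          # lower corner of the histogram
    idx = np.floor((flat - lo) / bin_size).astype(int)
    h, w = idx.max(axis=0) + 1
    hist = np.zeros((h, w))
    np.add.at(hist, (idx[:, 0], idx[:, 1]), np.asarray(sim, dtype=float).ravel())

    # Low-pass filter: 3 x 3 binomial kernel (an assumption; the paper names no filter).
    kernel = np.outer([1, 2, 1], [1, 2, 1]) / 16.0
    padded = np.pad(hist, 1)
    smooth = sum(kernel[i, j] * padded[i:i + h, j:j + w]
                 for i in range(3) for j in range(3))

    peak = np.array(np.unravel_index(np.argmax(smooth), smooth.shape))
    t0 = lo + (peak + 0.5) * bin_size              # center of the peak bin
    C1 = np.all(np.abs(T - t0) <= 2.0 * bin_size, axis=2)
    return t0, C1
```

Because `C1` is a mask over all pairs, a row may contain several `True` entries or none, mirroring the remark that some features have several correspondences and others have none.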

² Points $\mathbf{p}$ are considered to be coming from the image to be incorporated into the mosaic, hence the asymmetric treatment of point sets $P$ and $Q$ in matching.

Figure 5: Plot of the 2D similarity-weighted histogram for one image pair.

1st Order, Affine Model: The next level of the hierarchy estimates an affine transformation

$$T_1(\mathbf{p}; \mathbf{A}, \mathbf{t}_1) = \mathbf{A}\,\mathbf{p} + \mathbf{t}_1$$

using a minor variation on the least-median of squares algorithm (LMS) [23]. Let $P_1 \subseteq P$ contain the features from $I_p$ having at least one match in $C_1$, and for each $\mathbf{p} \in P_1$ let $C_1(\mathbf{p}) = \{\mathbf{q} \mid (\mathbf{p}, \mathbf{q}) \in C_1\}$. Note that if the overlap between images is small, $P_1$ will be much smaller than $P$. The LMS estimate of the affine parameters is

$$(\hat{\mathbf{A}}, \hat{\mathbf{t}}_1) = \arg\min_{\mathbf{A}, \mathbf{t}_1} \; \operatorname*{median}_{\mathbf{p} \in P_1} \; \min_{\mathbf{q} \in C_1(\mathbf{p})} \|\mathbf{q} - \mathbf{A}\,\mathbf{p} - \mathbf{t}_1\|^2. \qquad (5)$$

In other words, the objective function for a given $(\mathbf{A}, \mathbf{t}_1)$ is calculated by finding the minimum distance match for each transformed $\mathbf{p} \in P_1$ and then taking the median of the resulting squared (error) distances. This differs from other well-known uses of LMS, e.g. in estimating the fundamental matrix [28, 30], because uniqueness in the correspondence set is not (and should not be) enforced yet. Minimization is accomplished through a random sampling technique [10, 23]. To form each sample, a triple of points is selected randomly from $P_1$ and then a match for each point $\mathbf{p}$ in the triple is randomly chosen from $C_1(\mathbf{p})$. After minimization, a robust scale value, $\hat{\sigma}_1$, is calculated from $(\hat{\mathbf{A}}, \hat{\mathbf{t}}_1)$ using the median absolute deviation (MAD) scale estimator [24]. Note that both uncertainty in feature position and modeling error in using the affine transformation contribute to the magnitude of $\hat{\sigma}_1$.
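The random-sampling minimization of Eqn. 5 might look like the sketch below. The name `lms_affine`, the number of samples, and the collinearity threshold are assumptions. The scale is computed as 1.4826 times the median error distance, which treats the residuals as centered at zero; 1.4826 is the usual Gaussian consistency factor, and the paper's exact MAD variant is not stated.

```python
import numpy as np

def lms_affine(P1, cands, n_samples=500, seed=0):
    """Least-median-of-squares affine fit with non-unique candidate matches.

    P1: (N, 2) features from I_p; cands[i]: (k_i, 2) array holding C1(p_i).
    Returns (A, t1, sigma1).
    """
    P1 = np.asarray(P1, dtype=float)
    rng = np.random.default_rng(seed)

    def min_sq_dists(A, t):
        # For each transformed p, squared distance to its closest candidate match.
        proj = P1 @ A.T + t
        return np.array([np.min(np.sum((c - m) ** 2, axis=1))
                         for c, m in zip(cands, proj)])

    best_med, best_A, best_t = np.inf, None, None
    for _ in range(n_samples):
        idx = rng.choice(len(P1), size=3, replace=False)        # random triple from P1
        src = P1[idx]
        dst = np.array([cands[i][rng.integers(len(cands[i]))] for i in idx])
        X = np.hstack([src, np.ones((3, 1))])
        if abs(np.linalg.det(X)) < 1e-9:                        # skip collinear triples
            continue
        M = np.linalg.solve(X, dst)                             # rows of M: A^T, then t
        A, t = M[:2].T, M[2]
        med = np.median(min_sq_dists(A, t))                     # objective of Eqn. 5
        if med < best_med:
            best_med, best_A, best_t = med, A, t

    r = np.sqrt(min_sq_dists(best_A, best_t))
    sigma1 = 1.4826 * np.median(r)     # MAD-style robust scale (assumed form)
    return best_A, best_t, sigma1
```

Three point pairs determine the six affine parameters exactly, which is why each sample is a triple.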

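2nd Order, Quadratic Model: The final level of the hierarchy estimates the quadratic transformation

$$T_2(\mathbf{p}; \Theta) = \Theta\,\mathbf{x}(\mathbf{p})$$

(see Eqn. 2 and discussion) using an M-estimator [12]. We describe first a straightforward instantiation, and then show several important modifications. For each $\mathbf{p}_i \in P_1$, let

$$\mathbf{q}_i = \arg\min_{\mathbf{q} \in C_1(\mathbf{p}_i)} \|\mathbf{q} - \hat{\mathbf{A}}\,\mathbf{p}_i - \hat{\mathbf{t}}_1\|^2,$$

i.e. $\mathbf{q}_i$ is the best match for $\mathbf{p}_i$ based on the estimated affine transformation. Then the M-estimate of $\Theta$ is

$$\hat{\Theta} = \arg\min_{\Theta} \sum_{\mathbf{p}_i \in P_1} \rho\left(\|\mathbf{q}_i - \Theta\,\mathbf{x}(\mathbf{p}_i)\| / \hat{\sigma}\right), \qquad (6)$$

where $\rho$ is a “robust loss function” which grows subquadratically, and $\hat{\sigma}$ is a scale estimate. Here we use the Tukey biweight function [4]:

$$\rho(u) = \begin{cases} \dfrac{a^2}{6}\left[1 - \left(1 - \left(\dfrac{u}{a}\right)^2\right)^3\right] & |u| \le a \\[2ex] \dfrac{a^2}{6} & |u| > a \end{cases}$$

where $u = \|\mathbf{q} - \Theta\,\mathbf{x}(\mathbf{p})\| / \hat{\sigma}$ is a “scale normalized residual.” (Typically $a \approx 4.0$ [13].) Equation 6 may be solved using iteratively-reweighted least-squares (IRLS) [13], with weight function $w(u) = \rho'(u)/u$, in our case

$$w(u) = \begin{cases} \left[1 - \left(\dfrac{u}{a}\right)^2\right]^2 & |u| \le a \\[2ex] 0 & |u| > a. \end{cases}$$

The value $u_i$ for each match in each iteration is calculated using the estimate of $\Theta$ obtained in the previous iteration. A robust starting point for IRLS is crucial, as has been confirmed experimentally in vision applications [28].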
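The straightforward instantiation of Eqn. 6 can be sketched with a basic IRLS loop. The names `tukey_weight`, `quad_basis` and `irls_quadratic` are illustrative, the iteration count is arbitrary, and for simplicity this sketch keeps one fixed scale `sigma` and one match per feature; it does not include the modifications the paper goes on to describe.

```python
import numpy as np

def tukey_weight(u, a=4.0):
    """Tukey biweight IRLS weight: (1 - (u/a)^2)^2 for |u| <= a, else 0."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= a, (1.0 - (u / a) ** 2) ** 2, 0.0)

def quad_basis(p):
    """x(p) = (1, x, y, x^2, xy, y^2) for each row of p (N x 2)."""
    p = np.asarray(p, dtype=float)
    x, y = p[:, 0], p[:, 1]
    return np.stack([np.ones_like(x), x, y, x * x, x * y, y * y], axis=1)

def irls_quadratic(p, q, theta0, sigma, a=4.0, n_iter=20):
    """Estimate the 2 x 6 matrix Theta by IRLS, starting from theta0."""
    q = np.asarray(q, dtype=float)
    X = quad_basis(p)
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_iter):
        # Scale-normalized residuals from the previous estimate of Theta.
        u = np.linalg.norm(q - X @ theta.T, axis=1) / sigma
        w = tukey_weight(u, a)
        if np.count_nonzero(w) < 6:        # too few supporting matches to refit
            break
        sw = np.sqrt(w)[:, None]           # weighted least squares via row scaling
        coef, *_ = np.linalg.lstsq(sw * X, sw * q, rcond=None)
        theta = coef.T
    return theta
```

Matches whose normalized residual exceeds `a` receive zero weight, so they drop out of the least-squares solve in that iteration without being removed from the data.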

Several modifications of the estimation equation (6) and IRLS search technique are important here:

1. The M-estimator is initialized from the affine estimate $(\hat{\mathbf{A}}, \hat{\mathbf{t}}_1)$ and scale $\hat{\sigma}_1$. These are used to compute initial scale normalized residuals

$$u_i = \|\mathbf{q}_i - \hat{\mathbf{A}}\,\mathbf{p}_i - \hat{\mathbf{t}}_1\| / \hat{\sigma}_1,$$

and these are used to compute initial weights to be used in IRLS.

2. A new (MAD) scale, $\hat{\sigma}_2$, is estimated from the residuals of the first IRLS estimate of $\Theta$ and then fixed for the remaining iterations.

3. Because $w(u) = 0$ for residuals (error distances) greater than about $4\hat{\sigma}_2$, there is no need to restrict the match set in (6). The entire original set $C_0$ may be used, since matches with large error distances simply contribute zero weight. (In practice, loose restrictions are placed purely for computational reasons.) This allows recovery from earlier mistakes in reducing the match set.

4. The robust weights $w(u)$ are augmented in IRLS by a correspondence similarity measure and, when more than one match for a given $\mathbf{p} \in P$ has non-zero weight, these weights are normalized. For example, if $(\mathbf{p}, \mathbf{q})$ and $(\mathbf{p}, \mathbf{q}')$ are two matches for $\mathbf{p}$, with robust weights $w$ and $w'$ and similarity measures $s$ and $s'$, then the actual IRLS weights will be:

$$w^* = s\,w\,\frac{s\,w}{s\,w + s'\,w'}; \qquad w'^* = s'\,w'\,\frac{s'\,w'}{s\,w + s'\,w'}$$
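As a small worked illustration of modification 4, the sketch below assumes the normalization reads $w^* = s\,w \cdot s\,w / (s\,w + s'\,w')$ and generalizes it to any number of candidate matches for one feature. The function name `combine_weights` and the generalization beyond two matches are assumptions, not taken from the paper.

```python
import numpy as np

def combine_weights(s, w):
    """Combine similarity measures s and robust weights w for all matches of one p.

    Each product s*w is scaled by its share of the total, so a feature with a
    single match keeps weight s*w and competing matches are down-weighted.
    """
    sw = np.asarray(s, dtype=float) * np.asarray(w, dtype=float)
    total = sw.sum()
    if total <= 0.0:          # no match has non-zero weight
        return sw
    return sw * sw / total
```

For two equally similar matches with robust weights 0.8 and 0.2, this gives 0.64 and 0.04, so the better-fitting match dominates without the other being discarded outright.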


Figure 6: The combination of two images having relatively small overlap based on the estimated quadratic transformation. Non-linear warping from the quadratic model can be seen at the edge of the top frame.

In effect, the last two modifications allow decisions about correspondence to be deferred until the final transformation is estimated. This differs significantly from other robust matching and estimation algorithms [28, 30], and is a major reason why the algorithm is successful. Correct correspondences are determined to be those having non-zero weights. More than one such correspondence for any $\mathbf{p}$ indicates a lingering ambiguity, which is usually caused by merging of blood vessel intersections into a single feature in one image but not in the other. These are resolved and final correspondences for all features are established by minimizing normalized SSD measures between a template centered on $\mathbf{p}_i$ in $I_p$ and regions surrounding $T_2(\mathbf{p}_i; \hat{\Theta})$ in $I_q$. Final estimation of $\Theta$ is based on these correspondences.

An example result showing one image transformed onto a second based on the estimated quadratic transformation is shown in Fig. 6.

4 Constructing the Complete Mosaic

The interimage transform estimation technique forms the core of the mosaic construction algorithm. In the retinal surgery application, mosaic construction will be done on-line and interactively, with a physician specifying images to be added to the mosaic one frame at a time. This ensures complete coverage and use of good quality images. The data set shown here was acquired under similar circumstances.

Figure 7: A mosaic combining 10 images each 1024 × 1024 pixels using the quadratic transformation model and hierarchical estimation technique.

Given an image, $I_p$, to be added to the mosaic image ($I_q$ in the notation that has been used thus far), the first step is estimation of the quadratic transformation and especially the final correspondence set. Next, the intensities in $I_p$ are normalized. Then, the region of $I_q$ covered by the transformation of $I_p$ is determined. For each pixel $\mathbf{q}$ in this region, $\mathbf{q}$ is inverse mapped onto $\mathbf{p}$, the intensity is interpolated in $I_p$ and, back in the mosaic frame, the resulting value is averaged with pixels from other images that map onto $\mathbf{q}$. Since inverting the quadratic transform directly is difficult, a quadratic transform is estimated from the final correspondences between $I_p$ and the mosaic frame with roles of $\mathbf{p}$ and $\mathbf{q}$ reversed. This is then used for the inverse mapping. Incorrect transform estimates may be identified by comparing the (transformed) vasculature structure of $I_p$ output by the recursive tracing algorithm (Sec. 3.1) to that of the mosaic using a digital distance map [7]. An example mosaic constructed from 10 images is shown in Figure 7.
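The inverse-mapping and averaging step can be sketched as below, assuming grayscale images stored as 2D arrays and a separately estimated 2 × 6 matrix `theta_inv` that maps mosaic coordinates to coordinates in $I_p$. The names `add_to_mosaic`, `acc` and `cnt` are hypothetical, bilinear interpolation is an assumed choice of interpolant, and the covered region is found by brute force over every mosaic pixel rather than by any method from the paper. Intensity normalization is omitted.

```python
import numpy as np

def quad_basis(pts):
    """x(q) = (1, x, y, x^2, xy, y^2) for each row of pts (N x 2)."""
    x, y = pts[:, 0], pts[:, 1]
    return np.stack([np.ones_like(x), x, y, x * x, x * y, y * y], axis=1)

def add_to_mosaic(acc, cnt, image, theta_inv):
    """Accumulate `image` into the mosaic in place.

    acc, cnt: float arrays of the mosaic's shape holding running sums and counts;
    the averaged mosaic is acc / cnt wherever cnt > 0.
    theta_inv: 2 x 6 quadratic transform from mosaic (x, y) to image (x, y).
    """
    H, W = acc.shape
    h, w = image.shape
    ys, xs = np.mgrid[0:H, 0:W]
    q = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
    p = quad_basis(q) @ np.asarray(theta_inv, dtype=float).T    # inverse map q -> p
    inside = ((p[:, 0] >= 0) & (p[:, 0] <= w - 1) &
              (p[:, 1] >= 0) & (p[:, 1] <= h - 1))
    x, y = p[inside, 0], p[inside, 1]
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    fx, fy = x - x0, y - y0
    # Bilinear interpolation of the image intensity at p.
    val = (image[y0, x0] * (1 - fx) * (1 - fy) + image[y0, x0 + 1] * fx * (1 - fy)
           + image[y0 + 1, x0] * (1 - fx) * fy + image[y0 + 1, x0 + 1] * fx * fy)
    rows, cols = ys.ravel()[inside], xs.ravel()[inside]
    acc[rows, cols] += val
    cnt[rows, cols] += 1
    return acc, cnt
```

Keeping separate sum and count arrays lets frames be added one at a time, in line with the interactive, frame-by-frame construction described in this section.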

5 Discussion and Conclusions

We have presented a quadratic image transformation model and a robust, hierarchical technique for estimating the transformation between two images of the retinal surface. These form the core of a method for constructing a wide-area, high-resolution mosaic of the human retina, improving on the mostly 2-D methods developed to date for this application [29, 5], and going beyond current mosaicing methods in the computer vision literature [15, 26, 25]. Improvements to our technique are still possible, however, and we are currently considering indexing methods [16] as a complement to the first two levels of the hierarchy.

High quality retinal mosaics such as those shown in Fig. 7 are directly valuable for diagnosing a variety of diseases, especially those involving the retinal periphery [18]. Other intended uses of these mosaics as part of our long-term goal of developing a two-phase computer vision system to assist physicians during diagnosis and retinal laser surgery include: treatment planning, on-line treatment monitoring, spatial dosimetry, error detection, alarms and safety shutoffs when the laser strays from the intended target, change analysis between patient visits, and even virtual reality tools for surgical simulation. The mosaic construction algorithm, in modified form, can also enhance emerging alternatives to classic laser surgery [2] and be used to detect mismatches between small image regions to flag retinal bumps and detachments.

References

[1] G. Adiv. Determining 3-d motion and structure from optical flow generated by several moving objects. IEEE Trans. on PAMI, 7(4):384–401, 1985.

[2] S. Asrani, S. Zou, G. D’Anna, S. Lutty, S. Vinores, M. Goldberg, and R. Zeimer. Feasibility of laser-targeted photoocclusion of the choriocapillary layer in rats. Invest. Ophthalmol. Vis. Sci., 38(13):2702–2710, Dec 1997.

[3] J. Barron, D. Fleet, and S. Beauchemin. Performance of optical flow techniques. Int. J. of Computer Vision, 12(1):43–77, February 1994.

[4] A. E. Beaton and J. W. Tukey. The fitting of power series, meaning polynomials, illustrated on band-spectroscopic data. Technometrics, 16:147–185, 1974.

[5] D. E. Becker, A. Can, H. L. Tanenbaum, J. N. Turner, and B. Roysam. Image processing algorithms for retinal montage synthesis, mapping, and real-time location determination. IEEE Trans. on Biomed. Eng., 45(1), Jan. 1998.

[6] J. Bergen, P. Anandan, K. Hanna, and R. Hingorani. Hierarchical model-based motion estimation. In Proc. 2nd ECCV, pages 237–252, 1992.

[7] G. Borgefors. Distance transformations in digital images. CVGIP, 34(3):344–371, June 1986.

[8] A. Can, J. N. Turner, H. L. Tanenbaum, and B. Roysam. Rapid automated tracing and feature extraction from live high-resolution retinal fundus images using direct exploratory algorithms. IEEE Trans. on Biomed. Eng., 1999. Accepted, to appear.

[9] P. Dani and S. Chaudhuri. Automated assembling of images - image montage preparation. Pattern Recognition, 28(1):431–445, March 1995.

[10] M. A. Fischler and R. C. Bolles. Random Sample Consensus: A paradigm for model fitting with applications to image analysis and automated cartography. CACM, 24:381–395, 1981.

[11] W. Grimson, T. Lozano-Perez, W. Wells, III, G. Ettinger, S. White, and R. Kikinis. An automatic registration method for frameless stereotaxy, image guided surgery and enhanced reality visualization. IEEE Trans. on Med. Img., 15, 1996.

[12] F. R. Hampel, P. J. Rousseeuw, E. Ronchetti, and W. A. Stahel. Robust Statistics: The Approach Based on Influence Functions. John Wiley & Sons, 1986.

[13] P. W. Holland and R. E. Welsch. Robust regression using iteratively reweighted least-squares. Commun. Statist.-Theor. Meth., A6:813–827, 1977.

[14] L. Hyvarinen and R. W. Flower. Indocyanine green fluorescence angiography. Acta Ophthalmologica, 58(4):528–538, 1980.

[15] M. Irani, P. Anandan, and S. Hsu. Mosaic based representations of video sequences and their applications. In Proc. IEEE Int. Conf. on Computer Vision, pages 605–611, 1995.

[16] D. Jacobs. Matching 3-d models to 2-d images. Int. J. of Computer Vision, 21(1-2):123–153, January 1997.

[17] J. M. Krauss and C. A. Puliafito. Lasers in ophthalmology. Lasers In Surgery And Medicine, 17:102–159, 1995.

[18] A. A. Mahurkar, M. A. Vivino, B. L. Trus, E. M. Kuehl, M. B. Datiles, and M. I. Kaiser-Kupfer. Constructing retinal fundus photomontages. Investigative Ophthalmology and Visual Science, 37(8):1675–1683, July 1996.

[19] R. Murphy. Age-related macular degeneration. Ophthalmology, 9:696–971, 1986.

[20] W. Niessen, K. Vincken, J. Weickert, and M. Viergever. Three dimensional MR brain segmentation. In Proc. IEEE Int. Conf. on Computer Vision, pages 53–58, 1998.

[21] Research to Prevent Blindness Foundation. RPB: Research to prevent blindness annual report. 598 Madison Avenue, New York, NY 10022, 1992.

[22] J. Roider and H. Laqua. Laser coagulation of age-related macular degeneration. Klinische Monatsblatter Fur Augenheilkunde, 206:428–437, June 1995.

[23] P. J. Rousseeuw. Least median of squares regression. J. Amer. Stat. Assoc., 79:871–880, 1984.

[24] P. J. Rousseeuw and A. M. Leroy. Robust Regression and Outlier Detection. John Wiley & Sons, 1987.

[25] H. Sawhney, S. Hsu, and R. Kumar. Robust video mosaicing through topology inference and local to global alignment. In Proc. 5th ECCV, volume II, pages 103–119, 1998.

[26] R. Szeliski. Video mosaics for virtual environments. IEEE CGA, 16(2):22–30, 1996.

[27] R. Taylor et al. An image-directed robotic system for precise orthopedic-surgery. IEEE Trans. on Robotics and Automation, 10(3):261–275, 1994.

[28] P. Torr and D. Murray. The development and comparison of robust methods for estimating the fundamental matrix. Int. J. of Computer Vision, 24(3):271–300, 1997.

[29] C. H. Wright, R. D. Ferguson, H. G. Rylander III, A. J. Welch, and S. F. Barrett. Hybrid approach to retinal tracking and laser aiming for photocoagulation. Journal of Biomedical Optics, 2(2):195–203, 1997.

[30] Z. Zhang, R. Deriche, O. Faugeras, and Q. Luong. A robust technique for matching two uncalibrated images through the recovery of the unknown epipolar geometry. Artificial Intelligence, 78(1-2):87–119, 1995.

[31] I. E. Zimmergaller, N. M. Bressler, and S. B. Bressler. Treatment of choroidal neovascularization - updated information from recent macular photocoagulation study group reports. International Ophthalmology Clinics, 35:37–57, 1995.