BlindedEvaluationofPlanarTechnetium-99m- …jnm.snmjournals.org/content/33/5/668.full.pdf ·...

8
This paper reports the results ofan entirely blind reading of 44 planar sestamibi studies performed at Hotel Dieu de Montreal and mailed to the University of Virginia for computer quantitation and interpretation. Quantitative computer processing was performed using a program de signed specifically for sestamibi (15). The interpreters were completely unaware of any clinical or demographic data that could indicate pretest likelihood of disease and there fore potentially bias scan readings. The purpose of this study was to determine how well interpreters experienced in 201Tl imaging could read ses tamibi studies with the aid ofquantitative image processing in terms of inter-interpreter agreement, agreement with coronary angiographic findings and 201T1perfusion studies. Studies were also analyzed according to purely quantitative criteria. This allowed us to generate receiver operating characteristic (ROC) curves of sensitivity and specificity, as a function of threshold for defect detection. MATERIALSANDMETHODS Patient Population The study cohort comprised 15 volunteers and 29 patients with suspected CAD and abnormal perfusion by 20'fl exercise scintigraphy. The volunteers had <5% likelihood of CAD based on Bayesian analysis of age, sex, symptom classification, and normal exercise201Tl scintigraphy.Catheterizationwas not per formed on the volunteers. Ofthe 29 patients with suspected CAD, one had angiographically normal coronary arteries and 28 had significant CAD defined as >70% luminal diameter narrowing of one or more coronary arteries. Of these, 15 had a clinical history ofprior myocardial infarction. The patient with angiographically normal coronary arteries was added to the normal volunteer group, yielding 16 subjects in the normal group. All studies were performed at the Hotel-Dieu d'Montreal in Quebec. The infor mation on the composition of the patient population was corn pletely unknown to the interpreters of the sestamibi images at the University of Virginia. StressandRestSestamibi ImagingProtocol Two separate injections were used to perform the sestamibi studies with 1—3days between injections. For stress images, patients followed the Bruce protocol and were injected intrave nously with 20—25 mCi of sestamibi at peak exercise followed by I mm of additional exercise. Imaging commenced 60—90mm after injection in the 45 left anterior oblique (LAO) projection, Sestamibi planar myocardial perfusion studies were per formed at Hotel-Dieu de Montreal on 28 patients with docu mented coronary artery disease and 16 normal subjects. Stress and rest studies were performed on separate days. These studieswere sent to Virginiafor interpretationwhile blinded as to age, sex, and other clinical information. Studies were quantitatedindependentlyby two operators (using a computer program modified for Sestamibi), and interpreted independently by two experienced interpreters. Computer quantitation of 2816 segments gave an average interoperator deviation of 2.2%. Pure quantitative criteria were applied for computer interpretation. By varying the detection threshold, we producedthe entire ROC curve relatingsensitivityand specificity as a function of detection threshold. Using only computer criteria for normal or abnormal, interoperator agree ment by patient was 98% and 93% by view. The computer could achieve equal positive and negative predictive accuracy of87%. Interpreters,allowedbothquantitativeandsubjective judgment, agreed on 91% of 44 patients, 90% of 132 views, and 92% of 660 segments. Interpreters averaged 94% posi tive and 86% negative predictive accuracy. J Nucl Med 1992; 33:668—675 ew 99mTclabeled myocardial perfusion agents have been long awaited and are now released for clinical use in the United States. Clinical trials of sestamibi (Cardio litee@DuPont, N. Billerica, MA), a 99mTc..labeledisoni tnle, have been conducted and early experience appears promising for its utility to detect coronary artery disease (CAD) and delineate myocardial infarction. Some results of multicenter clinical trials of this agent have been pub lished (1,2). Other investigations have indicated the myo cardial extraction and kinetic transport of sestamibi (3— 6); the potential use of this agent for differentiation of viable from infarcted myocardium (7—10);the use of ECG gating with sestamibi (1 1, 12); and preliminary experience for detection of CAD (13,14). ReceivedJul. 30, 1991;revisionaccepted Dec. 13, 1991. For reprints contact: Denny D. Watson, NuclearCardiology,Box 468-65, Lkiiversityof Virginia Health Sciences Center, Charlottesville, VA 22908. 668 The Journal of Nuclear Medicine • Vol. 33 • No. 5 • May 1992 Blinded Evaluation of Planar Technetium-99m- Sestamibi Myocardial Perfusion Studies Denny D. Watson, William H. Smith, George A. Beller, Ellen L. Vinson, and Raymond Taillefer Division ofMedical Imaging, Department ofRadiology, and Division ofCardiology, Department oflnternal Medicine, at the Universityof VirginiaHealth Sciences Center, Charlottesville, Virginia and Department ofNuclear Medicine, Hotel-Dieu de Montreal, Montreal, Canada

Transcript of BlindedEvaluationofPlanarTechnetium-99m- …jnm.snmjournals.org/content/33/5/668.full.pdf ·...

Page 1: BlindedEvaluationofPlanarTechnetium-99m- …jnm.snmjournals.org/content/33/5/668.full.pdf · Anotherfeatureusedbytheinterpretersisthe“flashback display‚(15).Thisdisplaycapabilityisusefulforthedetection

This paper reports the results ofan entirely blind readingof 44 planar sestamibi studies performed at Hotel Dieu deMontreal and mailed to the University of Virginia forcomputer quantitation and interpretation. Quantitativecomputer processing was performed using a program designed specifically for sestamibi (15). The interpreters werecompletely unaware of any clinical or demographic datathat could indicate pretest likelihood of disease and therefore potentially bias scan readings.

The purpose of this study was to determine how wellinterpreters experienced in 201Tl imaging could read sestamibi studies with the aid ofquantitative image processingin terms of inter-interpreter agreement, agreement withcoronary angiographic findings and 201T1perfusion studies.Studies were also analyzed according to purely quantitativecriteria. This allowed us to generate receiver operatingcharacteristic (ROC) curves of sensitivity and specificity,as a function of threshold for defect detection.

MATERIALSANDMETHODS

Patient PopulationThe study cohort comprised 15 volunteers and 29 patients

with suspected CAD and abnormal perfusion by 20'fl exercisescintigraphy. The volunteers had <5% likelihood of CAD basedon Bayesian analysis of age, sex, symptom classification, andnormal exercise201Tlscintigraphy.Catheterizationwas not performed on the volunteers. Ofthe 29 patients with suspected CAD,one had angiographically normal coronary arteries and 28 hadsignificant CAD defined as >70% luminal diameter narrowing ofone or more coronary arteries. Of these, 15 had a clinical historyofprior myocardial infarction. The patient with angiographicallynormal coronary arteries was added to the normal volunteergroup, yielding 16 subjects in the normal group. All studies wereperformed at the Hotel-Dieu d'Montreal in Quebec. The information on the composition of the patient population was cornpletely unknown to the interpreters of the sestamibi images atthe University of Virginia.

StressandRestSestamibiImagingProtocolTwo separate injections were used to perform the sestamibi

studies with 1—3days between injections. For stress images,patients followed the Bruce protocol and were injected intravenously with 20—25mCi of sestamibi at peak exercise followed byI mm of additional exercise. Imaging commenced 60—90mmafter injection in the 45 left anterior oblique (LAO) projection,

Sestamibi planar myocardial perfusion studies were performed at Hotel-Dieu de Montreal on 28 patients with documented coronary artery disease and 16 normal subjects.Stress and rest studies were performed on separate days.These studieswere sent to Virginiafor interpretationwhileblinded as to age, sex, and other clinical information. Studieswere quantitatedindependentlyby two operators (using acomputer program modified for Sestamibi), and interpretedindependently by two experienced interpreters. Computerquantitation of 2816 segments gave an average interoperatordeviation of 2.2%. Pure quantitative criteria were applied forcomputer interpretation. By varying the detection threshold,we producedthe entire ROC curve relatingsensitivityandspecificity as a function of detection threshold. Using onlycomputer criteria for normal or abnormal, interoperator agreement by patient was 98% and 93% by view. The computercould achieve equal positive and negative predictive accuracyof 87%. Interpreters,allowedbothquantitativeandsubjectivejudgment, agreed on 91% of 44 patients, 90% of 132 views,and 92% of 660 segments. Interpreters averaged 94% positive and 86% negative predictive accuracy.

J Nucl Med 1992; 33:668—675

ew 99mTclabeled myocardial perfusion agents havebeen long awaited and are now released for clinical use inthe United States. Clinical trials of sestamibi (Cardiolitee@DuPont, N. Billerica, MA), a 99mTc..labeledisonitnle, have been conducted and early experience appearspromising for its utility to detect coronary artery disease(CAD) and delineate myocardial infarction. Some resultsof multicenter clinical trials of this agent have been published (1,2). Other investigations have indicated the myocardial extraction and kinetic transport of sestamibi (3—6); the potential use of this agent for differentiation ofviable from infarcted myocardium (7—10);the use of ECGgating with sestamibi (1 1,12); and preliminary experiencefor detection of CAD (13,14).

ReceivedJul. 30, 1991; revisionaccepted Dec. 13, 1991.For reprintscontact: DennyD. Watson, NuclearCardiology,Box 468-65,

Lkiiversityof Virginia Health Sciences Center, Charlottesville, VA 22908.

668 The Journal of Nuclear Medicine •Vol. 33 •No. 5 •May 1992

Blinded Evaluation of Planar Technetium-99m-Sestamibi Myocardial Perfusion StudiesDenny D. Watson, William H. Smith, George A. Beller, Ellen L. Vinson, and Raymond Taillefer

Division ofMedical Imaging, Department ofRadiology, and Division ofCardiology, Department oflnternal Medicine, at theUniversityof VirginiaHealth Sciences Center, Charlottesville, Virginia and Department ofNuclear Medicine, Hotel-Dieu deMontreal, Montreal, Canada

Page 2: BlindedEvaluationofPlanarTechnetium-99m- …jnm.snmjournals.org/content/33/5/668.full.pdf · Anotherfeatureusedbytheinterpretersisthe“flashback display‚(15).Thisdisplaycapabilityisusefulforthedetection

Another feature used by the interpreters is the “flashbackdisplay―(15). This display capability is useful for the detectionof subtle defect reversibility.The background-correctedimagesare brought into alignment using image registration techniquespreviously described (15,16,23) and each scaled to its maximumcount. With a single keystroke the interpreter can cause theexercise image to appear in the position occupied by the restimage. The shift between images is extremely swift and smooth,and since the images are precisely aligned, there is no apparentmotion ofthe myocardium. Ifa defect is reversible, that myocardial segment will have increased relative uptake of sestamibi,which will appear as a sudden shift in the intensity ofthe ischemicregion when the flashback is employed. The flashback displayexploits the eye's superb ability to detect sudden changes in anotherwise static image, thus enabling an intuitive and sensitivemethod of visually identifying reversibility.

Two expert @°‘TIinterpreters at the University of Virginia(DDW, GAB) independently read the quantitated sestamibi images from video monitors using the display program describedabove. The readers were blinded to age, sex, other clinical information and the prevalence of CAD in this population. Eachinterpreter was isolated and readings were recorded on a formthat was signed and mailed before the clinical data was sent toVirginia from Montreal. Fifteen segments were graded by the twointerpreters, five per view. Each view was divided into five equalsegments with the apex in the middle and two segments to eachside ofthe apex. Each segment was graded as either normal (NL),persistent defect (PD), or reversible defect (RD). A view wasautomatically graded as RD if any segment showed reversibility.If a viewshoweda defectwith no reversiblesegments,then theview was graded as PD. If the view had only normal segments,then it was graded as NL. Patients were automatically graded ina similar fashion. Also, a patient or view was determined to beabnormal if the resulting grade was PD or RD.

Automatic Computer MeasurementsPeakactivityfor selectedmyocardialsegmentswas automati

cally measured by the computer along four count profiles foreach view (18). Each profile was divided into two segments atthe vertical centroid line through the heart yielding eight segmentsper view. Profiles through the valve planes were excluded, yielding16segments for analysis. There were five segments in the anteriorview, including proximal anterolateral, distal anterolateral, apical,inferior and inferobasilar, five segments in the 70' LAO view,including proximal and distal anterior, apical, inferior and posterior; and six segments in the 45°LAO view, including upperand midseptum, inferoseptal, inferolateral, mid and upper posterolateral. The peak activity for a given segment was found bysearching from centroid to edge for the maximum value (in thebackground-corrected image). The segmental activity values werethen normalized by dividing each segment in a particular viewby the averageof the highestfive percentileof profilevalues inthe corresponding view. This is done by first sorting the numericalvalues from all myocardial samples into descending order andaveraging the top five percentile. This avoids normalizing to thesingle highest value, which would be normalizing to the mostextremestatisticaloutlier from a largenumber of samples.Normalized values greater than 100% were set to 100%. The normalized values were then converted to defect percentages bysubtracting from 100%.

Interoperator variability was measured by computing the root

followed sequentially by anterior and 70°LAO projections. Following a resting injection of 20—25mCi, imaging began 75—100mm later using the same imaging sequence. All sestamibi imageswere recorded for a preset time of 10 mm, using a 64 X 64 matrixsize in a 20% window centered on the l40-keV gamma ray peakof99mTc. Images were acquired on a Medical Data System (MDS)

computer.

Quantitative Image AnalysisImages were mailed on diskettes to the University of Virginia

for quantitative analysis and visual interpretation. The Universityof Virginia investigators were given only the patient code andwere blinded to all other data, including the patient's sex or thepercentage of normals in the population. A quantitative programoriginally developed for 201Tl(16) was modified for sestamibiimaging as previously described (15). Important features of theoriginal program include display normalization and calibration,minimal operator dependence, background subtraction, and image registration. The following modifications were made for usewith sestamibi:

1. The interpolative background subtraction algorithm wasmodified because of the difference in heart-to-backgroundratio and shape of the extracardiac background betweenresting and exercise sestamibi images (15—20).

2. With sestamibi,suboptimal display of the heart is causedby excessive 99mTcactivity in the viscera. The programautomatically suppresses extracardiac activity if it exceedsthe maximum activity in the heart (15).

3. Sincetwo separateinjectionsare necessaryin sestamibiimaging, the rest and exercise profiles must be scaled differently. Exercise profiles are displayed with the maximumvalue at full scale. The resting profiles are then displayed sothe maximum value ofthe resting profiles will be at 75% offull scale. Stress and rest profiles can then be overlapped forcomparison without the two curves being on top of eachother. The overlapped curves are ideally scaled to showregionsof reversibility.

All patient studies were independently quantitated by twoexperienced operators on a Sophy P 32-bit microcomputer (Sopha Medical—Columbia, MD). The resulting images and countprofiles were independently interpreted by two experienced readcr5.

Image Display and InterpretationThe reproducibility of image interpretation depends on a

standardized method ofimage display. Since image interpretationwas to be performed by reading from video monitors, we usedthe test pattern developed by the Society of Motion Picture andTelevision Engineers (SMPTE) (21) to set the video contrast andbrightness levels. We also used a standardized intensity scale thatwe designed (15,22) to relate image count density linearly toperceived brightness.

The computer program saves the quantitated results on diskfor later recall using a special display program designed for easeofinteraction with the interpreter. Figure 1illustrates the differentdisplay formats that are available to the interpreters. Figure lAand 1B are the standard formats for exercise and rest images.Four count profiles(each @lcm wide)span the heart. In FigureIB, the exercise and rest profiles are overlapped for better comparison.

669Blinded Evaluationof Sestamibi Studies •Watson et al

Page 3: BlindedEvaluationofPlanarTechnetium-99m- …jnm.snmjournals.org/content/33/5/668.full.pdf · Anotherfeatureusedbytheinterpretersisthe“flashback display‚(15).Thisdisplaycapabilityisusefulforthedetection

A B

D

—STRESS —RE@T REST 7ØLAO

_s@k@_@ :---i@-@@\@

HEART SIZE64x64 matrix

17 pixels wide

12 pixels higl@ ii_

C

REST 7LAO

STRESS

V IEWS

REST

VIEWS

mean square (RMS) deviation of the defect percentage from themean of the operatorsover the entire set of 2816segments.

To investigatecomputer operator variability, the followingquantitativecriteriawereestablishedforcomputerinterpretation.A significant initial defect was defined as >25% for all exercisesegments except >30% for the inferior sample of the anteriorview, the upper septal sample of the 4Y LAO view, and theposteriorsampleofthe 70°LAOview;and >40% for the inferobasilar sample ofthe anterior view. These thresholds are the sameas our thallium standards (16) but with the upper septal sampleadjustedto compensatefor slightlygreaterbase-to-apexgradienttypical ofsestamibi (18,19).

Reversibility was arbitrarily defined as a significant initialdefect, within any of the 16 sample regions, that improved atleast 10percentagepoints in the same sample region in the reststudy. Using the above criteria, the computer automaticallygradedeach myocardialsegmentas eitherNL, PD, or RD. Views

670

and patients were then automatically graded as described in theprevious section.

Statistical and ROC AnalysisThe results of the expert and computer interpretations were

mailed to Montreal to signify completion of the image analysisphase. The blind was then broken and the composition of thepatient population and the catheterization results were sent byfacsimile to Virginia for final statistical analysis.

The mean and standard deviation was computed for eachmyocardial segment in the patients identified as normal. Thesevalues are shown in Figure 2 for stress studies (solid line) and reststudies (dashed line). The normal values for rest and stress studieswerenearlyidentical(r = 0.97)and wereaveragedtogether.Thisresult indicates that the background subtraction (1 7—19)hadcorrectly compensated for the different extracardiac backgroundfound in the rest studies compared to stress. The Z-score for each

The Journal of Nuclear Medicine •Vol. 33 •No. 5 •May1992

STRESS @ØLAO

@ .@

STRESS 7ØLAO REST 7SLAO

@ a

FIGURE1. (A)Thestandarddisplayforexerciseimagesisshown.Therawimageisdisplayed(automaticallyrescaledifnecessary)in the upper left quadrant.The background-correctedimageis shown in the upper right. Fourcount profiles(each-@-1cm wide)spanthe heart and are displayedin the lower right quadrant. (B) The standard display for rest images is similar to that of the exerciseimages.In addition,in the lower left quadrantthe exerciseand rest profilesare overlappedfor better comparison(rememberthat thereStprofileshavebeenautomaticallyscaledto 75% of the exerciseprofiles).(C) Side-by-sideimagedisplayshowsthe stress-restpair of raw imagesabovewith the correspondingbackground-correctedimagesbelow. (D)Surveydisplayshows all six raw imagesbelow with a user-selectablestress-restbackground-correctedpair above.

STRESS 7SLAO

Page 4: BlindedEvaluationofPlanarTechnetium-99m- …jnm.snmjournals.org/content/33/5/668.full.pdf · Anotherfeatureusedbytheinterpretersisthe“flashback display‚(15).Thisdisplaycapabilityisusefulforthedetection

PatientTABLEI

Summary of Expert and ComputerInterpretationsPatientPrevious

ComputerComputerObserverObservercodeinfarct#1 #2 #1 #2Group

SAMPLE SEGMENT

PredictiveaccuracyObserver#1Observer#2All

patientsPositiveNegative0.96 0.790.930.93ExdudingPositive0.910.86Prior

MINegative0.830.93

ALLRE No NL NL NLNLNLBACNOYes RD RD RDRDCADBAYGIYes RD RD RDRDCADBEAFENo RD RD RDRDCADBELLONo NL NL NLNLNLBOICLNo NL NL NLNLNLBROANNo NL NL RD*RD*NLCHACLYes PD PD PD@RDCADCHAJONo RD RD RDRDCADCHEANNo NL NL NLNLNLCLOCAYes PD PD RIYPDCADDAOBEYes RD RD RDRDCADDUBGINo NL PD NLtNLNLDUPJEYes RD RD RDRDCADGIGJENo NL NL NL*NL*CADHAMMANo NL NL RDRDCADHEBGINo NL NL NLNLNLJEAALNo NL NL NLNLNLJOLROYes RD@ PD RDRDCADLADYVNo NL NL NLNLNLLAMRANo NL NL NLNLNLLANRI

No RD RD RDRDCADLARSENo NL NL NLNLNLLAUYVYes RD RD RDRDCADLECCHNo RD*t PD* NLNLNLLEMFRNo NL NL NLNLNLLERALNo RD RD RDRDCADLOPHUYes RD RD RDRDCADMAHHAYes NL* NL* RDRDCADMALANNo RD RD RDRDCADMAUA

Yes PD PD RDRDCADMONPIYes RD RD RDRDCADOUEMANo NL NL* RDRDCADPARFRNo NL NL NL**RDCADPOIHENo NL NL NLNLNLQUEMIYes NL* NL NL*@RDCADRICROYes PD PD RDRDCADRICYVNo PD PD RDRDCADROYDANo RD RD RDRDCADROYEGNo NL NL NL**RDCADSTHANYes RD RD RDRDCADTAIRANo NL NL NLtRD*NLTETGANo RD RD RDRDCADVILDANo NL NL NLNLNL*

Denotes disagreement withgroup.t

Denotes disagreement between computeroperators.t

Denotes disagreement between expert interpreters.

U.U.

w

0I-w-J

0.

C,)

FIGURE 2. ShowsthepercentagedifferencecomparingsampIesegmentto averagepeakactivityfor all24 segmentssampledby the computer in the normal population.The solid line is forstress studies and the dashed line for rest studies. There is nosignificantdifference,indicatingthat the backgroundsubtractionhascorrectlycompensatedfor differencesinextracardiacuptake.

segment was then computed using the average and standarddeviation obtained from the normal data base. For every subject,the average of the highest and next highest segment Z-score wasobtained. Averaging the two highest defect scores is done toreduce statistical variation that would result from selecting thesingle most deviant sample from many samples. The Z-scoreindicates the number of standard deviations each patient variesfrom the normal average based on the two most deviant segments.We then set thresholds that varied from 0.5 to 10 standarddeviations. For each threshold, a patient was classified as abnormal ifthe exercise study varied from the normal average by morethan the threshold deviation. For each different threshold, thefalse-positive rate and true-positive rate was determined. Thisdata is plotted to create the ROC curve. For low thresholds, thesensitivity is high but the specificity is low. For high thresholds,the specificity is high but the sensitivity is low. The ROC analysisindicates the tradeoffbetween sensitivity and specificity.

RESULTSObserverInterpretation

Table I is a summary of the expert and computerinterpretations broken down by patient. As shown inTable 2, for Observer 1, the predictive accuracy of apositive test was 96% (24/25) and the predictive accuracyof a negative test was 79% (15/19). Observer 2 had conesponding predictive accuracies of 93% (27/29) and 93%(14/1 5). Table 2 also shows the predictive accuracy obtamed after excluding all patients with history of priormyocardial infarction. Table 3 indicates the characteristicsof the patient population. In Table 4, the agreement between the expert interpreters is broken down by category.The interpreters agreed at least 90% ofthe time on patient,view, and segment. The concordance by segment is shownin Figure 3. Interpreters disagreed most often on apicalsegments.Disagreementwasgreateronthejudgmentofredistribution than on classification of normal versus abnormal.

TABLE2Predictive Accuracy of Observers

671Blinded Evaluationof Sestamibi Studies •Watson et al

Page 5: BlindedEvaluationofPlanarTechnetium-99m- …jnm.snmjournals.org/content/33/5/668.full.pdf · Anotherfeatureusedbytheinterpretersisthe“flashback display‚(15).Thisdisplaycapabilityisusefulforthedetection

C@JEXPERT#

1

NL PD RD

NL 484 215%

I—a_

><PD2138wRD32698

Normal group(n=16)CAD

group(n=28)Age(yr)41.1

±7.353.9 ±8.6Sex(%male)88%79%Previous

Ml015(54%)Peakheartrate(bpm)165.5 ±11.7132.9 ±16.8PeaksystolicBP(mmHg)146.7 ±I 7.5143.3 ±22.9

By segment

Category By patient By view Exactmatch NL/ABmatch

Total

By segment

Category By patient By view Exactmatch NL/ABmatch

Total

Eachcolumnshowsthe agreementdividedby the total possiblewiththe percentagreementshownbelow.Thecomputerscoredtheoperatorson theirdeterminationof whethera patient(or view)wasnormal (NL) or abnormal (PD, AD). In the exact match by segmentcolumn,the computerhad to give a segmentthe samegradeforeachoperator,eitherNL,PD,or AD,for anexactagreement.In theNL/ABmatchby segmentcolumn,thecomputerhadonlyto concurthat a segment was normal (NL) or abnormal (PD, AD) for eachoperatorfor agreement.

Eachcolumnshowsthe agreementdividedby the total possiblewith the percent agreement shown below. The interpreters werescored on their determination of whether a patient (or view) wasnormal(NL)or abnormal(PD,RD).In the exact matchby segmentcolumn, both interpreters had to give a segment the same grade,either NL, PD, or RD, for an exact agreement. In the NL/AB matchby segmentcolumn,the interpretershad only to concur that asegmentwasnormal(NL or abnormal(PD AD for agreement.

TABLE3PatientCharacteristics

FIGURE 3. Illustrates the concordance between theexpert interpretersfor the 660 segments graded inthe study. The exactagreementwas90.2% (595/660).

Computer InterpretationThe patient summary of the computer interpretations

is shown in Table 1. The two different operators obtaineddifferent computer classification of normal versus abnormal on only one patient. In only two patients did the twocomputer operators disagree with respect to PD versusRD. Table 5 tabulates the agreement between operatorsby category. The agreements were uniformly high. Theinteroperator RMS deviation for 28 16 segments averaged2.2%.

Thresholds for Defect DetectionFigure 4 shows the sensitivity for defect detection as a

function of the threshold level set to detect a defect. Thiscurve is derived from the subset of patients with CAD.Also shown in Figure 4 is the specificity or false-positiverate as a function of the threshold for defect detection.This curve shows a rapid loss of specificity since thethreshold is reduced below about two standard deviationsfrom the normal population. The sensitivity curve showsa gradual falloffofsensitivity since the threshold for defectdetection is increased. This graph does not extend to show

TABLE4Interpreter Agreement

all the defects. The largest defects in the CAD populationranged to over 40 standard deviations. This shows a widedeviation of defects in patients with CAD ranging fromextremely abnormal to where there are no defects statistically distinguishable from the normal population. Figure5A includes all the data used to produce a conventionalROC curve. This curve shows how sensitivity and specificity vary by using different thresholds for detection of CAD.Also plotted on this graph is the sensitivity and specificitycombination obtained from the two interpreters. The interpreters scored somewhat higher than the computeralone, presumably by exercising judgment that was notprogrammed into the computer. Figure 5B is the ROCcurve and interpreter scores obtained after excluding allpatients with history of prior myocardial infarction.

TABLE 5Computer Agreement

40/441 19/132595/660609/66098%93%97%98%91°h90%90%92%Normals15/1646/48253/256254/256Normals

Abnormals

Anterior

450 LAO

70°LAO1

5/1694%25/2889%

—

—

—47/48

98%72/8486%41/4493%42/4495%36/4482%238/240

99%357/420

85%199/220

90%195/220

89%201/22091%238/240

99%371/420

88%205/220

93%200/220

91%204/22093%Abnormals

Anterior

45°LAO

70°LAO94%

28/28100%

—

—

—96%

77/8492%40/4491%

41/4493%42/4495%99%

428/44896%

211/22096%

257/26497%

213/22097%99%

436/44897%

213/22097%

260/26498%

217/22099%

672 The Journalof NuclearMedicine•Vol. 33 •No. 5 •May 1992

Page 6: BlindedEvaluationofPlanarTechnetium-99m- …jnm.snmjournals.org/content/33/5/668.full.pdf · Anotherfeatureusedbytheinterpretersisthe“flashback display‚(15).Thisdisplaycapabilityisusefulforthedetection

w@ @pIw@fl-@

@@-@ -tNO@MALULATIOI—r-i-@-@

—“.-

—....•@1-@:•*@@-@-:::@

@:-@@..@‘.1-,-

,tJ::@

.--:@

-EX -,-LU

@iUNj:j@S@BJCT;:

—,@...Co@[email protected]——At@[email protected]@TE@+-———ImER-

0.4@ —

03 —

0.2@ —ElC@MIUT@R.•——- . -iPiE:A- - —@I@r — -E@ - — —

n

. .. Al ALfSl;.

—,-

,@-,--

I

?-@-@-

@::@-.-.;.:.@.

.@-@--1

[email protected]@JTIR•

@

0.4

0.3

0.2

FIGURE4. Sensitivityandspecificityareshownasafunctionof the threshold (horizontalaxis) used to separatenormal fromabnormal. @hevalues(Z-scores)on the horizontalaxis representthe numberof standarddeviationsfrom normal.Thecurvesshowthat10%to I 5%oftheCADgroupwerestatisticallyindistinguishable from the normal group and can be identified only at theexpenseof increasingfalse positives.

DISCUSSIONThis study demonstrates that sensitivity and specificity

can both be maintained at reasonably high levels withplanar sestamibi imaging. The interpreters were experi

A1° B 1.0

0.9

0.8

0.9

0.8

enced readers of 201'fl but not sestamibi scintigrams. Differences exist between the two radionuclides that requiredsignificant modification ofthe computer program used forquantitativeanalysis.However,afterthesemodificationswere made, little adjustment was required by the interpreters for correct interpretation. The population in thisstudy is clearly biased by the inclusion of normal volunteers and a moderately high incidence of patients withhistory ofprior myocardial infarction. However, a notablefeature was the completely blinded interpretation of thesestamibi studies. The data were collected in one institution and sent to a second institution for interpretation.Observers were blinded even to whether a study was of amale or female and had no idea of the percentage ofnormal and abnormal subjects included in the cohort.

Interpretation entirely based on computer criteria wasperformed to generate the ROC curves. ROC analysis isessential to properly characterize sensitivity and specificitybecause it defines the entire range of these variables anddefines the tradeoff of specificity for sensitivity. A knowledge of the entire curve is necessary to judge the power ofthe test to discriminate between normal and abnormalpatients.

Figure 4 is an illuminating alternative to the conventional ROC plot. In this figure, the specificity curve isderived from the normal sub-population. This curve indicates that there were few normal subjects with defectsmore than three standard deviation units away from thenormal mean but about 15% of normals that deviated by

. SENSITIVITY0 SPECIFICITY1.00@@@ _________ I—,_________

0.900.800.700.600.500.400.300.200.100.00

0 1 2 3 4 5

THRESHOLD FOR DEFECT DETECTION

0.10.1

0.

wI—

w>I—C,)0a-w

I.

wI—

w>I-Cl)0a-w

I-

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

FALSE POSITIVE RATE FALSE POSITIVE RATE

FIGURE5. (A)StandardROCcurveshowingthetrue-positiverate(specificity)ontheverticalaxisandthefalse-positiverate(1-specificity) on the horizontal axis. The solid dots represent computer results at various detection thresholds. The open squares arethe result of the interpretationby two experiencedobservers. (B) ROC curve Obtainedafter exduding all patients with history ofmyocardialinfarction.

Blinded Evaluationof Sestamibi Studies •Watson et al 673

0.7 0.7

0.6

0.5

0.6

Page 7: BlindedEvaluationofPlanarTechnetium-99m- …jnm.snmjournals.org/content/33/5/668.full.pdf · Anotherfeatureusedbytheinterpretersisthe“flashback display‚(15).Thisdisplaycapabilityisusefulforthedetection

two standard deviation units. This is expected, assumingnormally distributed deviations from the mean. The curveof sensitivity is obtained from the sub-population withCAD and this group has what might seem to be anunexpected, extremely broad distribution. The most foridly abnormal subjects (not plotted) extended to morethan 40 standard deviations away from normal. At theother end of this spectrum, slightly more than 10% of theCAD group had a myocardial uptake pattern that wasstatistically indistinguishable from the normal population.Thus, we find about 10% of the population with angiographically significant coronary artery abnormalities, whonevertheless do not have a significant maldistribution ofsestamibi. While this subgroup of patients has CAD byangiographic criteria, it is reasonable that most of thesepatients do not have significant maldistribution of myocardial blood flow. In that sense, some of these “falsenegatives―may be correct in the sense of indicating homogeneity of myocardial perfusion. The only way to increase the sensitivity so as to “detect―disease in thissubgroup is to lower the threshold into the range of normalpatients and accept a higher false-positive rate.

Note that we could produce a 95% sensitivity by accepting a 40% false-positive rate. We prefer to maintain abalanced sensitivity and specificity and can achieve anequal sensitivity and specificity of about 85% by purelyquantitative criteria or about 90% with expert interpretation added. As discussed above, it would be unrealistic toexpect much better separation of normals from patientswith hemodynamically significant coronary disease, giventhat there will be some cases where coronary anatomydoes not exactly predict myocardial perfusion. Moreover,our normal population was biased with normal volunteers,which probably produced a normal distribution with lessspread than from normal patients referred for evaluation.

Other studies have looked at the sensitivity and specificity ofsestamibi for detection ofCAD. The clinical efficacyofsestamibiplanarstressandrestimagingwasassessedina multicenter Phase II clinical trial involving 38 patientsfrom several institutions(1). 0f36 patients with significantCAD, 35 (97%) had abnormal 2OVflstress images and 32(89%)hadabnormalsestamibiimages.Theslightincreasein 20Tl sensitivity (not statistically significant) may beattributed to the fact that most patients in the trial wereenrolled based on having an abnormal 2oVflscan on theimaging study performed for clinical indications. Otherstudies employing planar or SPECT imaging techniqueshave also demonstrated that exercise and rest sestamibiimaging is comparable to stress-redistribution 20―fl @ntig..raphy for detection and localization ofCAD (14,24,25).

A unique feature of the present study is the totallyblinded nature of the scintigraphic interpretation employing quantitative image analysis. The generation of ROCcurves for sensitivity and specificity as a function of threshold for sestamibi defect detection is also unique to thisstudy.

674 The Journal of Nuclear Medicine •Vol. 33 •No. 5 •May1992

ACKNOWLEDGMENT

The assistance of DuPont Pharma in converting the data fromMDSto Sophacompatiblediskettesis gratefullyacknowledged.

REFERENCES

1. Wackers F@f,Berman D5, Maddahi J, et al. Technetium.99m hexakis-2-methoxyisobutyl isonitrile: human biodistribution, dosimetry, safety andpreliminary compalison to thallium-201 for myocardial perfusion imagingJNucIMed 1989;30:30!—311.

2. TailleferR,Dupras0, Spomv, etal.Myocardialperfusionimagingwitha newradiotracer,technetium-99m-hexamibi(methoxyisobutylisonitrile):comparisonwith thallium.201imaging C/in Nuc/Med 1989;14:97—100.

3. OkadaRD.GloverD, GaffneyT, WilliamsS.Myocardialkineticsoftechnetium-99m-hexakis-2-methoxy-2-methylpropyl-isonitrile.Circulalion 1988;77:491—498.

4. Li Q, Frank TL, Franceschi D, Wagner HN, Jr., Becker LC. Technetium99m-methoxyisobutyl isonitrile (RP-30) for quantification of myocardialischemia and reperfusion in dogs. J Nuc/Med 1988;29:1539—1548.

5. Leppo JA, Meerdink DJ. Comparison of the myocardial uptake of atechnetium-labeled isonitrile analogue and thallium. Circ Res 1989;65:632—639.

6. SinusasAJ,WatsonDD, CannonJM, BelierGA. Effectof ischemiaandpostischemic dysfunction on myocardial uptake of technetium-99m-labeled methoxy-isobutylisonitrile and thallium-201. J Am Coil Cardio/1989;14:1785—1793.

7. Wackers F, Gibbons R, Verani M, et al. Serial quantitative planar Technetium-99misonitrileimagingin acute myocardialinfarction:efficacyfornon-invasive assessment ofthrombolytic therapy. JAm CoilCardiol 1989;14:861—873.

8. Gibbons Ri, Verani MS. Behrenbeck T, ci al. Feasibility of tomographictechnetium-99mhexakis.2-methoxy-2-methylpropylisonitrileimagingforthe assessmentof myocardialarea at risk and the effect of treatment inacute myocardialinfarction.Circulation1989;80:1277—1286.

9. RoccoRP,DilsizianV. StraussHW, BoucherCA. Technetium-99misonitrite myocardial uptake at rest. II. Relation to clinical markers of potentialviability. JAm Coil Cardio/ l989;14:l678—l684.

10. Tatum JL, Rehr RB, DiSciascioG, Romhilt DW, Fratkin Mi. Thallium.201/technetium-99m-RP-30a disparity in the course ofmyocardial infarc.non after auempted reperfusion. J Nuc/Med l988;29: 1283—1286.

11. Najm YC, Timmis AD, Maisey MN, et al. The evaluation of ventricularfunction using gated myocardial imaging with Tc-99m-MIBI. Eur Heart)l989;lO:142—148.

12. Marcassa C, Marzullo P. Parodi 0, Sambuceti G, L'Abbate A. A newmethod for noninvasive quantitation of segmental myocardial wall thickening using technetium-99m-methoxy.isobutyl-isonitrile scintigraphy—resuIts in normal subjects. J Nuc/Med 1990;31:173—177.

13. Taillefer R, Lambert R, Dupras G, et al. Clinical comparison betweenthallium-201 and Tc.99m.methoxy isobutyl isonitrile (hexamibi) myocar.dial perfusion imaging for detection ofcoronary artery disease. EurJ Nuc!Med 1989;15:280—286.

14. Kiat H, Maddahi J, Roy L, ci al. Comparison oftechnetium-99m methoxy.isobutylisonitrileandthallium-20l forevaluationofcoronaryarterydiseaseby planar and tomographicmethods.Am Heart I 1989;117:1—11.

15. Smith WH, Watson DD. Technical aspects of myocardial planar imagingwith Tc-99msestamibi.AmJ Cardiol 1990;66:16E—22E.

16. Watson DD, Campbell NP, Read EK, Gibson RS, Testes CD, Beller GA.Spatialand temporal quantitation of planar thallium myocardialimages.J Nuc/Med 198l;22:577—584.

17. Sinusas AJ, Smith WH, Brookeman V, Vinson E, Beller GA, Watson DD.Quantitative imaging with technetium-99m.methoxyisobutylisonitrile(RP.30): comparison with 11-201 using a new background subtractionalgorithm lAbstract]. Circulation 1987;76:2l7.

18. Sinusas AJ, Beller GA, Smith WH, vinson EL, Brookeman V, WatsonDD. Quantitative planar imagingwith technetium-99m-methoxyisobutylisonitrile: comparison of uptake patterns with thallium.201. I Nuci Med1989;30:1456—l463.

19. Koster K, Wackers FJ, Matters JA, Fetterman RC. Quantitative analysisof planar technetium-99m-isonitrilemyocardial perfusion images usingmodified background subtraction. J NuclMed l990;3 1:1400—1408.

20. Watson DD, Smith WH. Sestamibi and the issue of tissue crosstalk IEditorialj.JNuclMedl990;3l:1409—l4ll.

Page 8: BlindedEvaluationofPlanarTechnetium-99m- …jnm.snmjournals.org/content/33/5/668.full.pdf · Anotherfeatureusedbytheinterpretersisthe“flashback display‚(15).Thisdisplaycapabilityisusefulforthedetection

2 1. Gray JE, Lisk KG, Haddick DH, HarshbargerJH, OosterhofA, SchwenkerR, Membersofthe SMPTESubcommitteeon RecommendedPracticesforMedical Diagnostic Display Devices. Test pattern for video displays andhard.copycameras.Radiologyl985;l54:519—527.

22. Watson DD, Leidholtz E, BelIer GA, Testes CD. Defect perception inmyocardial perfusion images [Abstract). J NuclMed l980;2l:P61—P62.

23. Read ME, Watson DD, Read EK, Leidholtz E. A method for automaticoverlapping of sequential scintiphoto images [Abstract]. J Nuc/ Med 1980;2 I P6 I.

(continuedfrom page 5A)

24. Kahn JK, McGhie I, Akers MS. ci al. Quantitative rotational tomographywith @°‘Tland [email protected] direct comparison innormal individuals and patients with coronary artery disease. Circulationl989;79: l282—l293.

25. Iskandrian AS, Heo J, Kong B, et al. Use of technetium.99m isonitrile(RP-30A)in assessingleft ventricularperfusionand function at rest andduring exercise in coronary artery disease, and comparison with coronaryarteriography and exercise thallium-20l SPECT imaging Am J Cardio/1989;64:270—275.

PURPOSEA 55-yr-old patient with recurrent rectal bleedingwas imaged with 9@'Tc-red cells. Images wererecorded at 5, 10, 15, 30, 60, 140, 180 mm and9 hr after tracer administration (Fig. 1). At140 mm, a well-defined area in the right colonappeared, clearly demonstrating a new episode ofbleeding. After surgical exploration, histologicaldata confirmed pseudolymphomatosis (Fig. 2)of the ileum with superficial erosion of themucosa.

_,4@

TRACER99mTc@Cs

Figure1 ROUTE OF ADMINISTRATIONIntravenously

TIME AFTER INJECTION5, 10, 15, 30, 60, 140, 180 mm and 9 hrpostinjection

Gamma camera (Elscint)

CONTRIBUTORSSergio Baldari, MD, Giorgio Restifo, MDand Prof. Nunzio Bonanno, MD

INSTITUTIONInstitute of Radiological Sciences, University ofMessina, Messina, Italy

Figure2

675Blinded Evaluationof Sestamibi Studies •Watson et al

FIRST IMPRESSIONS