Juvenile Idiopathic Arthritis of Peripheral Joints



Review Article

Juvenile Idiopathic Arthritis of Peripheral Joints: Quality of Reporting of Diagnostic Accuracy of Conventional MRI¹

Elka Miller, MD, Andreas Roposch, MD, MSc, Elizabeth Uleryk, BA, MLS, Andrea S. Doria, MD, PhD, MSc

Rationale and Objectives. The aim of this study was to systematically review the quality of papers on the clinimetric properties of magnetic resonance imaging for the diagnosis of juvenile idiopathic arthritis in peripheral joints.

Materials and Methods. A review of Medline, EMBASE, the Database of Abstracts of Reviews of Effects, and the Cochrane Library was performed by using a systematic search strategy. Two independent reviewers evaluated selected articles by using Standards for Reporting of Diagnostic Accuracy (STARD) and Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tools. Items were reported independently for STARD and QUADAS.

Results. Eighteen studies (validity, n = 18; reliability, n = 3; responsiveness, n = 3) were included. Their overall quality of reporting of methods was fair. Methodological problems with the STARD system included a lack of reporting of exclusion criteria (n = 14), partial or no information on operators’ expertise (n = 14) or blinding (n = 18), and deficient information on study time frames (n = 12), treatments (n = 10), or indeterminate results (n = 18). The distribution of QUADAS scores was heterogeneous, with overall scores ranging between 3.5 (poor) and 16.5 (excellent) (maximum score, 17.5).

Conclusions. The quality of reporting of methods in studies on the magnetic resonance imaging assessment of juvenile idiopathic arthritis is heterogeneous and fair overall. Further methodological refinement of research design should be sought in future studies to provide stronger evidence for the value of novel techniques in clinical settings.

Key Words. Juvenile idiopathic arthritis; magnetic resonance imaging; systematic review; peripheral joints; children.

© AUR, 2009

Acad Radiol 2009; 16:739–757

¹ From the Department of Diagnostic Imaging, Hamilton Health Science, McMaster University, 1200 Main Street West, Room 2SRAD, Hamilton, ON L8N 3Z5, Canada (E.M.; work developed during Dr. Miller’s fellowship at the Department of Diagnostic Imaging, Hospital for Sick Children, University of Toronto, Toronto, Canada); the Department of Diagnostic Imaging, Hospital for Sick Children, University of Toronto, Toronto, Canada (E.U., A.S.D.); and the Department of Orthopaedic Surgery, Great Ormond Street Hospital for Children, Institute of Child Health, University College London, London, United Kingdom (A.R.). This study was partially funded by a Career Development Award from the Canadian Child Health Clinician-Scientist Program and from the Department of Medical Imaging, University of Toronto, to Dr Doria. Received October 3, 2008; accepted January 8, 2009. Address correspondence to: E.M. e-mail: [email protected] or [email protected]

doi:10.1016/j.acra.2009.01.012

Juvenile idiopathic arthritis (JIA) is the most common rheumatic disease of childhood, with six to 19.6 incident cases per 100,000 children yearly in North America (1). Although the prognosis of this disease in children is generally favorable, with a majority of patients having no active synovitis in adulthood, persistent disease or progression of the disease is not uncommon, affecting children’s growth and development (2,3).

Laboratory indices of synovial inflammation are measures that are easily quantifiable but fail to provide accurate information about functional joint outcomes in JIA (4). Radiographs are usually nonspecific in the early stages of the disease, while magnetic resonance imaging (MRI) is a sensitive imaging tool for the detection of synovial hypertrophy and cartilage degeneration, which can be evaluated only indirectly by plain radiography (5).

For a diagnostic test such as MRI to be helpful, it must fulfill basic diagnostic test standards, including accuracy (reliability and validity) and responsiveness (sensitivity to change) (6). Only when these basic measurement properties are established can the further assessment of results on clinical outcomes, decision making, cost-effectiveness, and risks associated with a test be evaluated (7–10).

MILLER ET AL Academic Radiology, Vol 16, No 6, June 2009

Because of the increasing interest in the role of anatomic MRI as an outcome measure in clinical trials and in functional MRI as a predictor in arthritis, it becomes essential to determine the current quality of reporting of diagnostic studies of MRI in JIA. A recent unstructured review (5) assessed the role of MRI for the diagnosis of JIA, but to our knowledge, no prior systematic review has been conducted to assess the current status of knowledge on the measurement properties of MRI for the diagnostic assessment of JIA. The goal of the present systematic review was to assess the quality of diagnostic accuracy reporting of studies with regard to the clinimetric (reliability, validity, and responsiveness) properties of MRI to diagnose JIA.

MATERIALS AND METHODS

Data Sources and Search

An electronic search of the literature was performed by three investigators (E.M., E.U., A.S.D.), who identified studies in which the authors reported the diagnostic accuracy of MRI for the assessment of JIA. Medline (January 1966 to June 2008), EMBASE (January 1980 to June 2008), the Database of Abstracts of Reviews of Effects of the National Health Service Center for Reviews and Dissemination, and the Cochrane Library were searched through OVID using a validated search strategy (8) that combined Medical Subject Headings and EMBASE terms with free-text words. These terms included “juvenile idiopathic arthritis,” “juvenile rheumatoid arthritis,” “arthritis,” “cartilage degeneration,” “magnetic resonance imaging,” “T2 mapping,” “molecular imaging,” “diagnostic sensitivity,” “treatment,” and “outcome.”

Two reviewers (E.M., A.S.D.) independently read the abstracts of all articles with relevant titles. If the content of a study was not obvious from the title, key words, and abstract, the original article was retrieved and evaluated by both reviewers for eligibility. Subsequently, all original articles that were found to be eligible for inclusion were reviewed independently. At any stage, disagreements were discussed and resolved by consensus.

Inclusion Criteria

Our systematic review included studies on clinimetric (evaluative, discriminative, and predictive) measurement properties of any MRI methods that were reported for the diagnosis of JIA in patients aged <18 years. Specifically, we included studies in which the conversion criteria (ie, how to analyze and interpret an MRI examination) used to evaluate the reliability or validity of MRI assessment of peripheral joints (knees, hips, ankles and feet, wrists and hands, and shoulders) in children with JIA were reasonably described.

The minimum criterion for the inclusion of articles with regard to process criteria (ie, how to perform the MRI scanning) was a one-paragraph report on the proposed MRI protocol and scanner information. Eligible studies reported primary data obtained at a health care center or research institute.

Eligible studies reported the methodologic concepts of either construct or criterion validity, regardless of whether these terms were or were not mentioned in the article.

Studies excluded from our review included those not providing patients’ demographic information, case reports, pictorial essays, opinion letters, and reviews. Studies that specifically reported axially located joint disease (sacroiliac joint or temporomandibular joint) were also excluded. Articles written in languages other than English, French, German, Italian, Spanish, and Portuguese were excluded.

Data Extraction and Outcome Measures

Quality assessment of the reporting of included articles was performed using both the Standards for Reporting of Diagnostic Accuracy (STARD; 25 items) (11) and Quality Assessment of Diagnostic Accuracy Studies (QUADAS; 14 items) (12) criteria for diagnostic test reporting. The STARD tool items were rated as adequately described, not described, or partially described (6). The quality of conversion methodology of each study was evaluated using the QUADAS criteria (12); for this tool, we indicated whether the items were or were not adequately described (yes or no). If it was unclear from the information provided in the article, this item was rated as “unclear” (6). Reliability studies were not assessed with the QUADAS criteria, because only two items of this system were applicable (6).

Statistical Analysis

The level of agreement between the two reviewers in scoring the STARD and QUADAS criteria was assessed using κ statistics with 95% confidence intervals (CIs) (13,14). Total agreement between the two readers was considered if both readers scored a given item as 1 (complete information), 0 (lack of information), or 0.5 (partial information). Partial interreader agreement was considered if one reader scored a given item as 1 and the other reader as 0.5, or if one reader scored the item as 0 and the other reader as 0.5. No agreement meant that one reader scored a given item as 0 and the other reader as 1. Kappa coefficients ≤0.40 indicated poor agreement, ≥0.40 and ≤0.60 moderate agreement, ≥0.60 and ≤0.80 good agreement, and ≥0.80 excellent agreement (13). We used SAS version 8.2 (SAS Institute Inc, Cary, NC) for all analyses.
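As an illustration of the agreement analysis described above, the following minimal Python sketch classifies a pair of item scores as total, partial, or no agreement and computes an unweighted Cohen's kappa for two readers. It is not the authors' SAS code: the function names and the example scores are hypothetical, the 95% CI is not computed, and because the reported bands (0.40, 0.60, 0.80) share their boundary values, the sketch assumes a boundary value falls in the higher band.

```python
from collections import Counter


def agreement_level(score_a, score_b):
    """Classify one pair of item scores (each 0, 0.5, or 1)."""
    gap = abs(score_a - score_b)
    if gap == 0:
        return "total"      # both readers gave the same score
    if gap == 0.5:
        return "partial"    # 1 vs 0.5, or 0 vs 0.5
    return "none"           # 0 vs 1


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two readers rating the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both readers must rate the same non-empty set of items")
    n = len(ratings_a)
    # Observed proportion of items on which the readers agree exactly.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each reader's category frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used one identical category throughout
    return (observed - expected) / (1.0 - expected)


def interpret_kappa(kappa):
    """Map kappa to the bands used in the text (tie-break is an assumption)."""
    if kappa < 0.40:
        return "poor"
    if kappa < 0.60:
        return "moderate"
    if kappa < 0.80:
        return "good"
    return "excellent"


# Hypothetical scores for ten checklist items, not data from the review.
reader_1 = [1, 1, 0, 0.5, 1, 0, 0.5, 1, 0, 1]
reader_2 = [1, 0.5, 0, 0.5, 1, 0, 1, 1, 0, 1]
kappa = cohen_kappa(reader_1, reader_2)
print(round(kappa, 3), interpret_kappa(kappa))
```

The separate κ values reported for total and partial agreement may have been obtained with a weighted statistic; the sketch above covers only the unweighted case.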

RESULTS

Search and Selection

We retrieved 1,782 citations; 18 studies were found to be eligible for this systematic review (Fig 1). Reasons for the exclusion of papers included a lack of information on conversion criteria for MRI (15,16), pictorial essays (17,18), review articles (5), a lack of correlation with MRI (19,20), and axially located joint disease (21–24). Table 1 shows the demographic characteristics of patients and MRI clinimetric properties of the selected articles. Details on methodologic design according to the STARD and QUADAS criteria are available in Appendices A (STARD criteria) and B (QUADAS criteria).

Qualitative Assessment of Quality of Reporting (STARD Tool)

The overall interreader agreement for STARD was good for both total (κ = 0.70; 95% CI, 0.64–0.76) and partial (κ = 0.77; 95% CI, 0.34–1.0) agreement. Disagreements were the result of vague description of key issues, such as participant sampling (item 5), study design (item 6), reference standard (items 7 and 19), reproducibility (item 13), and statistics (item 24). Only four studies (22.2%) (25–28) reported the Medical Subject Heading term “sensitivity,” “specificity,” or “predictive values.”

The objectives of the selected studies were (1) assessment of the role of contrast-enhanced MRI in the diagnosis of long-standing (25,27,29–33) and short-standing disease (<1 year in duration) (28); (2) evaluation of unaffected knees as an attempt to identify patients at risk for developing polyarticular JIA (26); (3) evaluation of novel MRI techniques, including three-dimensional fat-saturated contrast MRI (34), T2 relaxation (35), synovium volume quantification (36), and synovium pharmacokinetic characteristic analysis (37–39); (4) determination of the efficacy and toxicity of intra-articular steroids using conventional MRI as the outcome measure (33,40,41); and (5) examination of the role of ultrasound in assessing joint inflammation in JIA compared with MRI (42).

Participants (Items 3–6)

In 11 studies (61.1%), the authors mentioned the criteria used for the diagnosis of JIA (International League of Associations for Rheumatology or American College of Rheumatology). Only four studies (22.2%) reported exclusion criteria, such as previous intra-articular injection of steroids (28,42), motion artifacts (35), and need for sedation for imaging (28,35,36).

Fifteen studies (83.3%) (25,26,28–30,33,35–42) included information on the recruitment process of patients, and only eight studies (44.4%) reported sampling of consecutive cases. Data were collected prospectively in 11 studies (61.1%) (26,28,33,35–42).

Reference Standard (Item 7)

Nine studies (50%) used physical examination of the patient as a reference standard for MRI examinations (25,28,30,33,35,36,38,39,41). Four studies (22.2%) used MRI for discrimination between diseased and healthy joints (25,26,35,42). In five studies (27.8%), MRI was used as a reference standard: as an internal reference (contrast-enhanced MRI) for unenhanced MRI (31,32,42), as an internal reference for different pharmacokinetic models of enhanced MRI (37), or as an outcome measure for clinical or laboratory evaluation (27). In two studies (11.1%) (31,32), arthroscopy was used as the reference standard for MRI.

Test Methods (Items 8–11)

Process criteria were reasonably well described in all but two studies (11.1%) (25,33), which partially described the imaging acquisition methods and the criteria for interpreting magnetic resonance images. Only two studies (11.1%) (27,28) stated the rationale for using specific units and cutoff values of the index test (MRI). Four studies (22.2%) (26–29) provided information on the expertise of the professionals who executed and rated the MRI and reference standards. In only four studies (22.2%) (26–28,35) was clear information about the blinding of assessors available.

Figure 1. Flow diagram revealing the search and selection process used for the identification and quality assessment of articles on the diagnostic accuracy of juvenile idiopathic arthritis. DARE, Database of Abstracts of Reviews of Effects.

Table 1. Demographic Characteristics of Patients and Corresponding MRI Clinimetric Properties of the Selected Articles in This Review

Study | Number of Patients, Number of Joints, Joint Types | Mean (Range) Age (y) | % Female | Research Design | Construct Validity / Reliability / Responsiveness
Herve-Somma et al (31) | 24 (24 knees) | 10 (3–18) | 17 | Retrospective | Yes
Eich et al (40) | 15 (11 knees, 4 hips) | 6.3 (3.5–11.8) | 7 | Prospective | Yes, Yes
Huppertz et al (41) | 21 (18 knees, 2 ankles, 1 elbow) | 10.1 (1.4–18.9) | 14 | Prospective | Yes, Yes
Murray et al (32) | 7 (14 hips) | 11 (7–17) | 4 | Not stated | Yes
Remedios et al (33) | 11 (13 ankles) | 9.7 (5–14) | 9 | Prospective | Yes
Ramsey et al (29) | 21 (21 knees) | 13 (2–17) | 11 | Retrospective |
Uhl et al (25) | 21 (42 knees) | 9.5 (5–13) | 8 | Not stated | Yes, Yes
Cakmakci et al (34) | 38 (38 knees) | 8 (2–17) | 25 | Prospective | Yes
Gylys-Morin et al (28) | 30 (30 knees) | 10.2 (5–16) | 21 | Prospective | Yes, Yes
El-Miedany et al (42) | 40 (40 knees) JIA patients, 40 control patients | 11 (3–17) | 32 | Not stated |
Argyropoulou et al (30) | 28 (56 hips) | 12.5 (2–24) | 14 | Not stated | Yes
Kight et al (35) | 18 (18 knees) JIA patients, 21 (21 knees) healthy children | 8.1 (4.9–10.8) | 39 | Prospective |
Workie et al (38) | 13 (13 knees) | 10.2 (6–16) | 9 | Prospective | Yes
Workie and Dardzinski (37) | 10 (10 wrists) | 11.1 (5.2–15.7) | 9 | Prospective | Yes, Yes
Graham et al (36) | 8 (8 knees) | 11 (6–15) | 7 | Not stated | Yes
Gardner-Medwin (26) | 10 (10 knees) | 9.4 (5.2–14.2) | 7 | Prospective | Yes
Nistala et al (27) | 34 (68 hips) | 14.4 (4.3–19.7) | No information | Retrospective |
Workie et al (39) | 17 (17 knees) | 10.3 (6.4–15.5) | 13 | Prospective | Yes

JIA, juvenile idiopathic arthritis; MRI, magnetic resonance imaging.

Statistics (Items 12 and 13)

Of three studies (16.7%) (27,28,36) that assessed reliability, κ statistics were used in two (11.1%) and coefficients of variation were used in one (5.6%). In two additional studies (11.1%) (27,34), κ statistics were inappropriately applied for agreement between MRI and clinical results. Only three studies (16.7%) (25,27,28) provided receiver-operating characteristic (ROC) curves as measures of validity.

Only one study (5.6%) (27) modeled the data using regression analysis. Four studies (22.2%) (29,33,40,42) did not apply any statistical methods at all. None of the three studies on responsiveness (34,40,41) used sound statistical methods to report changes over time.

Time Frame of Study (Item 14)

Only six studies (33.3%) included information on when the studies had been performed.

Characteristics of Participants (Items 15 and 16)

There was satisfactory reporting of demographic and clinical characteristics of participants (age, sex, and anatomic localization of arthritis) in all studies.

Test Results (Items 17–20)

Six of 18 diagnostic accuracy studies (33.3%) and two of three responsiveness studies (66.7%) adequately reported the treatments used and the time intervals between evaluation of the index test and the corresponding reference standard. The distribution of disease severity was well described in seven studies (38.9%), partially described in seven (38.9%), and not described at all in four (22.2%).

In nine diagnostic accuracy studies (50%), the authors reported their results in raw data tables that enabled the recalculation of results. Two of the studies (66.7%) on responsiveness (40) reported patients’ adverse events with the use of steroids. No studies reported adverse events with gadolinium administration or sedation, if used in the study.

Estimates (Items 21–24)

Precision values, such as 95% CIs, were reported in only three studies (16.7%) (28,32,41), and standard deviation or standard error measures were reported in five studies (25,28,30,35,41). Indeterminate results were not reported in any validity study. Estimates of interreader agreement for MRI findings reported as κ coefficients denoted good (0.40–0.75) or excellent (>0.75) overall reliability (27,28,43). In the study of Graham et al (36), the intrareader and interreader reliability of MRI measured as coefficients of variation ranged between 7.6% and 13.9% for individual readers and was 11.2% for interreader assessment.
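For readers unfamiliar with the coefficient of variation used as a reliability measure above, a minimal Python sketch follows. It assumes the usual definition (sample standard deviation divided by the mean, expressed as a percentage); the function name and the repeated readings are hypothetical and are not taken from Graham et al.

```python
from statistics import mean, stdev


def coefficient_of_variation(measurements):
    """Return the sample standard deviation as a percentage of the mean."""
    if len(measurements) < 2:
        raise ValueError("at least two measurements are needed")
    average = mean(measurements)
    if average == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return stdev(measurements) / average * 100.0


# Hypothetical repeated synovial volume readings (mL) by one reader.
repeated_readings = [10.0, 12.0, 11.0, 9.0, 13.0]
print(f"{coefficient_of_variation(repeated_readings):.1f}%")
```

A lower percentage indicates that repeated measurements of the same joint vary less relative to their mean.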

Semiquantitative Assessment of Quality of Reporting (QUADAS Tool)

Overall, the interreader reliability of interpretation of the QUADAS items was moderate for total agreement (κ = 0.57; 95% CI, 0.47–0.67) and good for partial agreement (κ = 0.64; 95% CI, 0.52–0.77). Disagreements resulted from unclear information regarding the type of reference standard used (item 3), the timing (item 4) and independence (item 7) between the performance of the reference standard and the index test, and verification bias (item 10).

There was substantial heterogeneity in the quality of methods used for reporting results in the studies included in this review, as noted by the distribution of QUADAS scores (Appendix B). Scores for single studies (maximum score, 17.5) ranged between 3.5 (poor) and 16.5 (excellent) (median score, 10.3 [fair quality]). Scores for individual studies (maximum score, 14) ranged between 1 to 5 (poor) and 11 to 14 (excellent) (27,28,35) (median score, 6–10 points, 8.3 [fair quality]).

None of the studies reported the method used for sample size calculation (item 1). In 16 studies (88.9%), the reference standard measure was assessed regardless of the index test result (item 6). In 12 studies (66.7%), the authors reported whether the reference standard measure was assessed in all patients or in only some patients (item 5). Information on imaging acquisition was consistently reported in 15 studies (83.3%) (item 8). In 17 studies (94.4%), clinical data were available by the time the test results were interpreted (item 12). Uninterpretable test results (item 13) (score, 1.5), withdrawals from the study (item 14) (score, 3.5), and interpretation of reference standard results without knowledge of index test results (item 11) (score, 6) were items that were poorly scored.

DISCUSSION

The results of this review indicate that diagnostic test standards were fairly fulfilled in studies of the validity, reliability, and responsiveness of MRI in JIA.

Ideally, more clarity in the description of research designs is desirable in upcoming studies. Suggestions to improve the methodologic report of studies include the use of the medical subject terms “sensitivity,” “specificity,” or “positive [or negative] likelihood [or predictive value]” and the use of ROC curves as a measure of concurrent validity. The standardization of protocols and the validation of an MRI scale for the interpretation of JIA findings are topics to be addressed in future studies. Although MRI is able to discriminate different types of cartilage (articular, epiphyseal, and physeal) at distinct stages of development of growing joints (44), so far, no MRI scale has been validated for use in children. The investigation of the Outcome Measures in Rheumatology group has been focused on the definition and testing of novel imaging tools for the assessment of rheumatoid arthritis of adults (45). This limitation with regard to the unavailability of MRI scales targeted to the pediatric population is demonstrated in Gylys-Morin et al’s (28) study, which provides cutoff values for Pettersson radiographic scores but fails to report cutoff values for MRI findings. This limitation makes it difficult to assess the reproducibility of ROC curves, which are considered standard methods for descriptions of diagnostic accuracy (46,47).
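To make concrete why reported cutoff values matter for reproducing an ROC curve, the following minimal Python sketch derives one (false-positive rate, sensitivity) point per candidate cutoff and estimates the area under the curve with the trapezoidal rule. The scores, labels, and function names are hypothetical illustrations, not data or methods from any of the reviewed studies; a test is treated as positive when the score is at or above the cutoff.

```python
def roc_points(scores, labels):
    """Return (false-positive rate, sensitivity) pairs, one per cutoff.

    labels use 1 for diseased and 0 for healthy joints; a joint tests
    positive when its score is greater than or equal to the cutoff.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both diseased and healthy cases are required")
    points = [(0.0, 0.0)]  # cutoff above every score: nothing tests positive
    for cutoff in sorted(set(scores), reverse=True):
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        false_pos = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((false_pos / negatives, true_pos / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as points ordered by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical semiquantitative MRI scores and reference-standard labels.
mri_scores = [1, 2, 3, 4, 5, 6]
disease_present = [0, 0, 1, 0, 1, 1]
curve = roc_points(mri_scores, disease_present)
print(curve)
print(round(area_under_curve(curve), 3))
```

Without the cutoffs (or the underlying scores), a reader cannot recompute these points, which is the reproducibility problem noted above.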

In this review, only two studies (31,32) provided information on the criterion validity of MRI compared with arthroscopic results. In Gylys-Morin et al’s (28) study, only one patient had femoral and tibial cartilage thinning confirmed at arthroscopy. The evaluation of criterion validity of MRI in JIA is a challenging issue, because in vivo reference standards are typically unavailable for the determination of inactive or remission disease (48).

Clinical examinations cannot be considered as reference standards to determine the diagnostic accuracy of an imaging test, because they do not provide an equivalent or superior quantity of information compared to the diagnostic test (6). Nevertheless, clinical findings can be used as constructs for comparison with MRI. Future studies should explain the rationale for choosing a reference standard.

With regard to construct validity, most of the selected studies reported internal correlations between MRI findings or correlations between MRI findings and clinical and laboratory results. In this review, although several studies evaluated the correlation between synovial contrast enhancement and disease activity (38,39), only one study (38) compared MRI findings to functional outcomes (total joint scores and Childhood Health Assessment Questionnaire). The Childhood Health Assessment Questionnaire is the most widely used instrument for the assessment of functional status during childhood (age 1–19 years) for musculoskeletal disorders, and it has demonstrated high validity, reliability, and responsiveness to changes over time (49). Further investigation of correlations between MRI findings and Childhood Health Assessment Questionnaire constructs should be encouraged.


Very few studies (three of 18) evaluated the responsiveness of MRI. The fact that responsiveness studies typically require a long time frame for the investigation of changes over time and for monitoring adverse drug events (40,41) may have contributed to the shortage of studies that addressed this clinimetric property of MRI. Likewise, only three studies (16.7%) assessed the interreader reliability of the interpretation of MRI.

Our review showed that the process criteria were reasonably well reported in most of the selected articles. However, both in our review and in other reviews (6), the rationale for using specific cutoff values and units was omitted. The results of our review also showed that the articles that failed to report statistical methods were usually published in the 1990s, likely reflecting a growing focus in the scientific community on following standard guidelines for the publication of articles on diagnostic tests (50,51). On the other hand, with regard to the descriptions of the conversion criterion (interpretation of findings) in the selected studies, <30% of the articles reported blinding of reviewers. Blinding is important to generate less biased results.

With regard to reliability, none of the studies in this review reported the causes of disagreement between readers or how the authors handled indeterminate results, similar to what has been noted in other reviews (6). Reasons for disagreement should be reported in an effort to provide unequivocal information for clinical practice (52). None of the studies included in this review reported the methods used for sample size calculation or defined the spectrums of patients selected for their studies. The lack of this information impaired the external validity of their results.

Although the earliest changes in JIA are seen in the small joints of the feet and hands (53,54), most of the studies of this review (16 [88.9%]) evaluated large joints (the knees and hips). The limited availability of high-resolution coils for imaging the small joints may have accounted for the preferential investigation of large joints.

The chief limitation of this systematic review is the heterogeneity of scores on research methodology items noted with the QUADAS tool. The use of MRI scanners of different field strengths (0.5 vs 1.5 T), coils, and sequences in the selected studies contributed to the heterogeneity. Finally, neither the STARD nor the QUADAS tool incorporates a quality score; therefore, the scores in this review tended to ignore the importance of individual items and the direction of potential bias (11,12).

In conclusion, although most studies had prospective designs, which enable better planning of research design, the overall reporting of the diagnostic accuracy of MRI in assessing JIA was fair, with several methodologic flaws noted. The standardization of MRI protocols and scales for the interpretation of findings in growing joints with JIA is clearly needed. This may facilitate future reporting of sound diagnostic test statistics providing sensible cutoff values for the calculation of ROC curves. Future studies in JIA should focus on the assessment of small joints (hands, wrists, and feet) rather than large joints for the detection of early changes in JIA. Improvement of reporting should emphasize conversion criteria. Reports should include information on the blinding of reviewers, the number and expertise of readers, indeterminate results, diagnostic accuracy in terms of ROC curves, precise estimates (95% CIs, standard deviations, and standard errors) of values, assessments of the reliability of interpretation of MRI results and responsiveness of MRI, and investigation of the value of MRI as a predictive index.

GLOSSARY

QUADAS (Quality Assessment of DiagnosticAccuracy Studies) (12): A scoring system developed in an

attempt to produce standards for reporting evaluations of

diagnostic tests (55).

Reliability: Obtaining the same result when a phenome-

non is measured by the same clinician or by different clini-

cians on the same occasion or on different occasions (56).
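
Agreement between two readers who score the same examinations on a categorical scale is often summarized with Cohen's κ, which corrects observed agreement for agreement expected by chance. The sketch below is a minimal illustration under that assumption. The function name `cohen_kappa` and the reader scores are hypothetical and do not come from the studies reviewed.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same cases on a nominal scale."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(ratings_a)
    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical synovitis calls (1 = present, 0 = absent) by two readers.
    reader_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
    reader_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
    print(f"kappa = {cohen_kappa(reader_1, reader_2):.2f}")
```

This statistic applies to two readers rating the same measure. It is not a substitute for comparing an index test with a different reference measure.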

Responsiveness: The ability of a scale to detect change in

outcomes when change is present (sensitivity to change)

(57,58).
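
One commonly used index of responsiveness is the standardized response mean, defined as the mean change in score divided by the standard deviation of the change. The sketch below assumes that index purely for illustration; it is not a method taken from the studies reviewed, and the function name and scores are hypothetical.

```python
import statistics


def standardized_response_mean(baseline, follow_up):
    """Mean of paired changes divided by the sample SD of those changes."""
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need at least two paired observations")
    changes = [after - before for before, after in zip(baseline, follow_up)]
    spread = statistics.stdev(changes)
    if spread == 0:
        raise ValueError("all changes are identical; the index is undefined")
    return statistics.mean(changes) / spread


if __name__ == "__main__":
    # Hypothetical MRI synovitis scores before and after treatment.
    before = [8, 6, 7, 9, 5]
    after = [5, 4, 6, 5, 3]
    print(f"SRM = {standardized_response_mean(before, after):.2f}")
```

A value far from zero indicates that the scale registers change consistently across patients; the sign reflects the direction of change.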

STARD (Standards for Reporting of Diagnostic Accuracy) (11): A scoring system developed in an attempt to

produce standards for reporting evaluations of diagnostic

tests (55).

Validity: The degree to which the result of a measure-

ment corresponds to the true state of the phenomenon being

measured (56). Four types of validity are recognized: face,

content, construct, and criterion (59). Construct validity

specifies the factors, or constructs, that account for variance

in the proposed measures as well as the hypothesized relations

among them (ie, convergent or discriminant relationships)

(59). Criterion validity is the correspondence

between a proposed measure and a reference standard

variable (59).
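
When the proposed measure is an ordinal or continuous score and the reference standard is binary, criterion validity is often summarized by the area under the ROC curve. The area equals the probability that a randomly chosen affected joint scores higher than a randomly chosen unaffected joint, with ties counted as one half (46). The following sketch computes the area directly from that definition. The function name `roc_auc` and the scores are hypothetical.

```python
def roc_auc(scores_affected, scores_unaffected):
    """Area under the ROC curve from all affected/unaffected score pairs."""
    if not scores_affected or not scores_unaffected:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for affected in scores_affected:
        for unaffected in scores_unaffected:
            if affected > unaffected:
                wins += 1.0
            elif affected == unaffected:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(scores_affected) * len(scores_unaffected))


if __name__ == "__main__":
    # Hypothetical MRI grades for joints classified by a reference standard.
    affected = [3, 4, 5, 5]
    unaffected = [1, 2, 3, 4]
    print(f"AUC = {roc_auc(affected, unaffected):.3f}")
```

The pairwise loop is quadratic in the number of joints, which is acceptable for the sample sizes typical of the studies discussed here.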

REFERENCES

1. Cassidy JT, Petty RE. Juvenile rheumatoid arthritis. In: Textbook of pe-

diatric rheumatology. Philadelphia, PA: Saunders; 1995; 135.

2. Oen K. Long-term outcomes and predictors of outcomes for patients

with juvenile idiopathic arthritis. Best Pract Res Clin Rheumatol 2002; 16:

347–360.

3. Oen K, Malleson PN, Cabral DA, et al. Disease course and outcome of

juvenile rheumatoid arthritis in a multicenter cohort. J Rheumatol 2002;

29:1989–1999.

4. Pispati PK. Evidence-based practice in rheumatology. APLAR J

Rheumatol 2003; 6:44–49.

Academic Radiology, Vol 16, No 6, June 2009 JIA: QUALITY OF REPORTING OF DIAGNOSTIC ACCURACY OF MRI

5. Graham TB. Imaging in juvenile arthritis. Curr Opin Rheumatol 2005; 17:

574–578.

6. Roposch A, Moreau NM, Uleryk E, Doria AS. Developmental dysplasia of

the hip: quality of reporting of diagnostic accuracy for US. Radiology

2006; 241:854–860.

7. Lijmer JG, Mol BW, Heisterkamp S, et al. Empirical evidence of design-

related bias in studies of diagnostic tests. JAMA 1999; 282:1061–1066.

8. Deville WL, Bezemer PD, Bouter LM. Publications on diagnostic test

evaluation in family medicine journals: an optimal search strategy. J Clin

Epidemiol 2000; 53:65–69.

9. Smidt N, Rutjes AW, van der Windt DA, et al. Quality of reporting of di-

agnostic accuracy studies. Radiology 2005; 235:347–353.

10. Reid MC, Lachs MS, Feinstein AR. Use of methodological standards in

diagnostic test research. Getting better but still not good. JAMA 1995;

274:645–651.

11. Bossuyt PM, Reitsma JB, Bruns DE, et al. Towards complete and ac-

curate reporting of studies of diagnostic accuracy: the STARD initiative.

Clin Chem 2003; 49:1–6.

12. Whiting P, Rutjes AW, Reitsma JB, et al. The development of QUADAS:

a tool for the quality assessment of studies of diagnostic accuracy in-

cluded in systematic reviews. BMC Med Res Methodol 2003; 3:25.

13. Altman DG. Practical statistics for medical research. London: Chapman

& Hall; 1991.

14. Fleiss JL. Statistical methods for rates and proportions. New York: John

Wiley; 1981.

15. Tynjala P, Honkanen V, Lahdenne P. Intra-articular steroids in radiologi-

cally confirmed tarsal and hip synovitis in juvenile idiopathic arthritis. Clin

Exp Rheumatol 2004; 22:643–648.

16. Johnson K, Wittkop B, Haigh F, et al. The early magnetic resonance im-

aging features of the knee in juvenile idiopathic arthritis. Clin Radiol 2002;

57:466–471.

17. Yulish BS, Lieberman JM, Newman AJ, et al. Juvenile rheumatoid

arthritis: assessment with MR imaging. Radiology 1987; 165:149–152.

18. Senac MO Jr, Deutsch D, Bernstein BH, et al. MR imaging in juvenile

rheumatoid arthritis. AJR Am J Roentgenol 1988; 150:873–878.

19. Adib N, Silman A, Thomson W. Outcome following onset of juvenile

idiopathic inflammatory arthritis: I. frequency of different outcomes.

Rheumatology (Oxford) 2005; 44:995–1001.

20. Adib N, Silman A, Thomson W. Outcome following onset of juvenile

idiopathic inflammatory arthritis: II. predictors of outcome in juvenile

arthritis. Rheumatology (Oxford) 2005; 44:1002–1007.

21. Kuseler A, Pedersen TK, Gelineck J, Herlin T. A 2 year followup study of

enhanced magnetic resonance imaging and clinical examination of the

temporomandibular joint in children with juvenile idiopathic arthritis.

J Rheumatol 2005; 32:162–169.

22. Arabshahi B, Cron RQ. Temporomandibular joint arthritis in juvenile

idiopathic arthritis: the forgotten joint. Curr Opin Rheumatol 2006; 18:

490–495.

23. Scolozzi P, Bosson G, Jaques B. Severe isolated temporomandibular

joint involvement in juvenile idiopathic arthritis. J Oral Maxillofac Surg

2005; 63:1368–1371.

24. Neidel J, Boehnke M, Kuster RM. The efficacy and safety of intraarticular

corticosteroid therapy for coxitis in juvenile rheumatoid arthritis. Arthritis

Rheum 2002; 46:1620–1628.

25. Uhl M, Krauss M, Kern S, et al. The knee joint in early juvenile idiopathic

arthritis. An ROC study for evaluating the diagnostic accuracy of con-

trast-enhanced MR imaging. Acta Radiologica 2001; 42:6–9.

26. Gardner-Medwin JM, Killeen OG, Ryder CAJ, et al. Magnetic resonance

imaging identifies features in clinically unaffected knees predicting ex-

tension of arthritis in children with monoarthritis. J Rheumatol 2006; 33:

2337–2343.

27. Nistala K, Babar J, Johnson K, et al. Clinical assessment and core

outcome variables are poor predictors of hip arthritis diagnosed by

MRI in juvenile idiopathic arthritis. Rheumatology (Oxford) 2007; 46:

699–702.

28. Gylys-Morin VM, Graham TB, Blebea JS, et al. Knee in early juvenile

rheumatoid arthritis: MR imaging findings. Radiology 2001; 220:

696–706.

29. Ramsey SE, Cairns RA, Cabral DA, et al. Knee magnetic resonance im-

aging in childhood chronic monarthritis. J Rheumatol 1999; 26:

2238–2243.

30. Argyropoulou MI, Fanis SL, Xenakis T, et al. The role of MRI in the eval-

uation of hip joint disease in clinical subtypes of juvenile idiopathic ar-

thritis. Br J Radiol 2002; 75:229–233.

31. Herve-Somma CM, Sebag GH, Prieur AM, et al. Juvenile rheumatoid arthritis

of the knee: MR evaluation with Gd-DOTA. Radiology 1992; 182:93–98.

32. Murray JG, Ridley NTF, Mitchell N, Rooney M. Juvenile chronic arthritis of

the hip: Value of contrast-enhanced MR imaging. Clin Radiol 1996; 51:

99–102.

33. Remedios D, Martin K, Kaplan G, et al. Juvenile chronic arthritis: diag-

nosis and management of tibio-talar and sub-talar disease. Br J Rheu-

matol 1997; 36:1214–1217.

34. Cakmakci H, Kovanlikaya A, Unsal E. Short-term follow-up of the juvenile

rheumatoid knee with fat-saturated 3D MRI. Pediatr Radiol 2001; 31:

189–195.

35. Kight AC, Dardzinski BJ, Laor T, Graham TB. Magnetic resonance im-

aging evaluation of the effects of juvenile rheumatoid arthritis on distal

femoral weight-bearing cartilage. Arthritis Rheum 2004; 50:901–905.

36. Graham TB, Laor T, Dardzinski BJ. Quantitative magnetic resonance

imaging of the hands and wrists of children with juvenile rheumatoid ar-

thritis. J Rheumatol 2005; 32:1811–1820.

37. Workie DW, Dardzinski BJ. Quantifying dynamic contrast-enhanced MRI

of the knee in children with juvenile rheumatoid arthritis using an arterial

input function (AIF) extracted from popliteal artery enhancement, and the

effect of the choice of the AIF on the kinetic parameters. Magn Reson

Med 2005; 54:560–568.

38. Workie DW, Dardzinski BJ, Graham TB, et al. Quantification of dynamic

contrast-enhanced MR imaging of the knee in children with juvenile

rheumatoid arthritis based on pharmacokinetic modeling. Magn Reson

Imaging 2004; 22:1201–1210.

39. Workie DW, Graham TB, Laor T, et al. Quantitative MR characterization of

disease activity in the knee in children with juvenile idiopathic arthritis:

a longitudinal pilot study. Pediatr Radiol 2007; 37:535–543.

40. Eich GF, Halle F, Hodler J, et al. Juvenile chronic arthritis: imaging of the

knees and hips before and after intraarticular steroid injection. Pediatr

Radiol 1994; 24:558–563.

41. Huppertz HI, Tschammler A, Horwitz AE, Schwab KO. Intraarticular

corticosteroids for chronic arthritis in children: efficacy and effects on

cartilage and growth. J Pediatr 1995; 127:317–321.

42. El-Miedany YM, Housny IH, Mansour HM, et al. Ultrasound versus MRI in

the evaluation of juvenile idiopathic arthritis of the knee. Joint Bone Spine

2001; 68:222–230.

43. Rosner B. Fundamentals of biostatistics. Belmont, CA: Duxbury; 1995.

44. Doria AS, Babyn PS, Feldman B. A critical appraisal of radiographic

scoring systems for assessment of juvenile idiopathic arthritis. Pediatr

Radiol 2006; 36:759–772.

45. Ostergaard M, McQueen F, Bird P, et al. The OMERACT Magnetic Res-

onance Imaging Inflammatory Arthritis Group—advances and priorities.

J Rheumatol 2007; 34:852–853.

46. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver

operating characteristic (ROC) curve. Radiology 1982; 143:29–36.

47. Moses LE, Shapiro D, Littenberg B. Combining independent studies of

a diagnostic test into a summary ROC curve: data-analytic approaches

and some additional considerations. Stat Med 1993; 12:1293–1316.

48. Wallace CA, Ravelli A, Huang B, Giannini EH. Preliminary validation of

clinical remission criteria using the OMERACT filter for select categories

of juvenile idiopathic arthritis. J Rheumatol 2006; 33:789–795.

49. Singh G, Athreya BH, Fries JF, Goldsmith DP. Measurement of health

status in children with juvenile rheumatoid arthritis. Arthritis Rheum 1994;

37:1761–1769.

50. Jaeschke R, Guyatt G, Sackett DL. Evidence-Based Medicine Working

Group. Users’ guides to the medical literature. III. How to use an article

about a diagnostic test. A. Are the results of the study valid? JAMA 1994;

271:389–391.

51. Jaeschke R, Guyatt GH, Sackett DL. Evidence-Based Medicine Working

Group. Users’ guides to the medical literature. III. How to use an article

about a diagnostic test. B. What are the results and will they help me in

caring for my patients? JAMA 1994; 271:703–707.

52. O’Dowd TC. Informing patients about clinical disagreement. Lancet

1989; 2:744.

53. Scott DL, Coulton BL, Popert AJ. Long term progression of joint damage

in rheumatoid arthritis. Ann Rheum Dis 1986; 45:373–378.

54. Sharp JT. An overview of radiographic analysis of joint damage in rheumatoid

arthritis and its use in metaanalysis. J Rheumatol 2000; 27:254–260.

55. Rennie D. Improving reports of studies of diagnostic tests: the STARD

initiative. JAMA 2003; 289:89–90.

56. Feinstein AR. Clinimetric perspectives. J Chronic Dis 1987; 40:

635–640.

57. Kirshner B, Guyatt G. A methodological framework for assessing health

indices. J Chronic Dis 1985; 38:27–36.

58. Nunnally J. Psychometric theory. New York: McGraw-Hill; 1978.

59. Patrick DL, Erickson P. Health status and health policy allocation

resources to health care. New York: Oxford University Press; 1993:

198–202.

MILLER ET AL Academic Radiology, Vol 16, No 6, June 2009

APPENDIX A. REPORT OF QUALITY IN REPORTING USING STARD ITEMS FOR 18 STUDIES OF JIA OF THE PERIPHERAL JOINTS USING MRI

Quality Study Response

1. Identify the article as a study of

diagnostic accuracy

Herve-Somma et al (31) Yes (construct validity)

Eich et al (40) Yes (construct validity and responsiveness)

Huppertz et al (41) Yes (construct validity and responsiveness)

Murray et al (32) Yes (construct validity)

Remedios et al (33) Yes (construct validity)

Ramsey et al (29) Yes (criterion validity)

Uhl et al (25) Yes (construct validity)

Cakmakci et al (34) Yes (construct validity and responsiveness)

Gylys-Morin et al (28) Yes (construct validity and reliability); one case of

cartilage thinning was confirmed at arthroscopy

(criterion validity)

El-Miedany et al (42) Yes (construct validity)

Argyropoulou et al (30) Yes (construct validity)

Kight et al (35) Yes (construct validity)

Workie et al (38) Yes (construct validity)

Workie and Dardzinski (37) Yes (construct validity); parameters (Ktrans, Vp, Kep)

of three models for arterial input function were

evaluated

Graham et al (36) Yes (construct validity and reliability)

Gardner-Medwin (26) Yes (construct validity)

Nistala et al (27) Yes (construct validity and reliability)

Workie et al (39) Yes (construct validity)

2. State the research questions or study

aims, such as estimating diagnostic

accuracy or comparing accuracy

between tests or across participant

groups

Herve-Somma et al (31) To evaluate the role of contrast MRI in diagnosis,

staging, and planning treatment of JRA

Eich et al (40) To evaluate the use of radiography, ultrasound, and

MRI in assessment of affected knee and hips before

and after intra-articular joint injection

Huppertz et al (41) To assess efficacy and potential toxicity of

intra-articular corticosteroid therapy

Murray et al (32) To determine the value of contrast enhancement in

MRI diagnosis of hip joint disease with JIA

Remedios et al (33) To compare clinical evaluation of hind foot synovitis

with enhanced MRI in children with JIA

Ramsey et al (29) To compare clinical evaluation of hind foot synovitis

with contrast-enhanced MR and evaluate the

efficacy of intra-articular steroid injection

Uhl et al (25) To determine sensitivity and specificity of MR in

diagnosis of JIA

Cakmakci et al (34) To establish correlation between clinical status and

three-dimensional fast-spin contrast MRI in

response to treatment

Gylys-Morin et al (28) To determine MRI findings in early JRA

El-Miedany et al (42) To assess the role of ultrasound vs MRI in inflammation

in JIA of the knee

Argyropoulou et al (30) To establish the role of MRI in the assessment of hip

joint involvement in clinical subtypes with JIA


Kight et al (35) To examine MRI T2 relaxation times in the

weight-bearing cartilage of the distal femur in

healthy children and children with JRA

Workie et al (38) Not reported

Workie and Dardzinski (37) Not reported

Graham et al (36) To assess feasibility of measuring synovial volume in

the hand/wrist in polyarticular JIA

Gardner-Medwin (26) To evaluate if MRI of clinically unaffected joints is more

sensitive than clinical assessment in identifying risk

patients

Nistala et al (27) To assess diagnostic performance of clinical vs hip

MRI and to determine clinical and serological

predictors of MRI diagnosed hip arthritis

Workie et al (39) Utility of dynamic contrast-enhanced MRI based on

pharmacokinetic modeling to evaluate disease

activity in the knee and to correlate with clinical

findings

3. The study population: inclusion and

exclusion criteria, setting, and location

where data were collected

Herve-Somma et al (31) Attending pediatric rheumatology clinic as outpatient;

study population was well reported (age and

gender); setting and location were not reported

Eich et al (40) EULAR criteria for JIA and one of the following: (1)

failure of systemic treatment and physiotherapy at 3

mo, (2) local growth, (3) popliteal cyst; study

population was well reported (age and gender);

setting and location were not reported

Huppertz et al (41) Consent for intra-articular corticosteroids

administration; population age and gender were

reported; location was partially reported

Murray et al (32) Diagnosed according to juvenile chronic arthritis criteria;

age and gender reported; location not reported

Remedios et al (33) Age and gender reported; no inclusion criteria or

location reported

Ramsey et al (29) Country hospital in Canada; included all patients

referred with presumptive diagnosis of clinical

monoarthritis; study population was well described

(age, gender)

Uhl et al (25) Children included were followed for ≥1 y and

diagnosed as having JIA of the knee by an experienced

pediatric rheumatologist; age and gender reported;

location of the study not included

Cakmakci et al (34) Patients attending pediatric rheumatology clinic as

outpatients; study population (age, gender)

included; location of the study was not reported

Gylys-Morin et al (28) Included according to ACR criteria, clinically evident

arthritis in at least one knee, disease duration of <1 y;

children were excluded if they needed sedation for

imaging or had received intra-articular steroid injection in the

affected knee; age and gender reported; referred

from rheumatology clinic


El-Miedany et al (42) Included according to ILAR criteria; excluded if

intra-articular steroids had been injected in the affected

knee; study population (age and gender) and location of

the study reported

Argyropoulou et al (30) Included according to ILAR criteria; study population

(age and sex) reported; location not reported

Kight et al (35) Included according to ACR criteria, active arthritis

documented by rheumatologist in at least one

examination, disease duration 2–7 y, girls, and age

4.9–10.8 y; excluded girls who needed sedation for

imaging and if motion; location reported

Workie et al (38) Inclusion criteria: history of JRA; study population (age

and gender) reported; location not reported

Workie and Dardzinski (37) Included children with history of JRA and good

popliteal artery signal enhancement; study

population (age and gender) reported; location not

reported

Graham et al (36) Inclusion criteria: polyarticular disease, ACR criteria (at

least three active joints, one hand); excluded

children who needed sedation; study population

(age and gender) and location of the study reported

Gardner-Medwin (26) Arthritis criteria for monoarthritis; age and gender

reported; tertiary center

Nistala et al (27) ILAR criteria; disease duration >6 mo; study population (age and

gender) reported; location not reported

Workie et al (39) ILAR criteria for active arthritis; study population (age

and gender) reported; location not reported

4. Participant recruitment: was

recruitment based on presenting symptoms,

results from previous test, or the fact that the

participants had received the index test or the

reference standard?

Herve-Somma et al (31) Not reported

Eich et al (40) Presenting symptoms

Huppertz et al (41) Need for administration of intra-articular steroids

Murray et al (32) Not reported

Remedios et al (33) Presenting symptoms (pain and swelling)

Ramsey et al (29) Presenting symptoms

Uhl et al (25) Partially reported

Cakmakci et al (34) Not reported

Gylys-Morin et al (28) Presenting symptoms

El-Miedany et al (42) Presenting symptoms

Argyropoulou et al (30) Partially reported

Kight et al (35) Patients with JIA recruited by chart review and mail;

controls were children of hospital personnel and

healthy children who responded to an advertisement

Workie et al (38) History of JRA

Workie and Dardzinski (37) Not reported

Graham et al (36) Presenting symptoms

Gardner-Medwin (26) Presenting symptoms (monoarthritis)

Nistala et al (27) Presenting symptoms (hip pain)

Workie et al (39) Presenting symptoms


5. Participant sampling: was the study population

a consecutive series of participants defined by

selection criteria in items 3 and 4?

Herve-Somma et al (31) Not reported

Eich et al (40) Consecutive patients

Huppertz et al (41) Consecutive patients

Murray et al (32) Not reported

Remedios et al (33) Consecutive patients

Ramsey et al (29) Consecutive patients

Uhl et al (25) Not reported

Cakmakci et al (34) Consecutive patients

Gylys-Morin et al (28) Not mentioned but likely consecutive patients

El-Miedany et al (42) Consecutive patients

Argyropoulou et al (30) Consecutive patients

Kight et al (35) Not reported

Workie et al (38) Not reported

Workie and Dardzinski (37) Not reported

Graham et al (36) Not reported

Gardner-Medwin (26) Consecutive patients

Nistala et al (27) Partial (no information about consecutive patients)

Workie et al (39) Partial (no information about consecutive patients)

6. Data collection: was data collection planned

before the index test and reference standard

were performed (prospective study) or after

(retrospective study)?

Herve-Somma et al (31) Retrospective

Eich et al (40) Prospective

Huppertz et al (41) Prospective

Murray et al (32) Not stated

Remedios et al (33) Prospective

Ramsey et al (29) Retrospective

Uhl et al (25) Not stated

Cakmakci et al (34) Not stated

Gylys-Morin et al (28) Prospective

El-Miedany et al (42) Prospective

Argyropoulou et al (30) Not stated

Kight et al (35) Prospective

Workie et al (38) Prospective

Workie and Dardzinski (37) Not stated

Graham et al (36) Prospective

Gardner-Medwin (26) Prospective

Nistala et al (27) Retrospective

Workie et al (39) Prospective

7. The reference standard and its rationale

Herve-Somma et al (31) Unenhanced MRI, test measure; enhanced MRI,

reference standard; or x-ray, test measure;

unenhanced and enhanced MRI, reference

standards

Eich et al (40) MRI, US, and clinical findings, tests; follow-up,

reference standard

Huppertz et al (41) For responsiveness: MRI and clinical assessment

(tests), follow-up (reference standard); for construct

validity: MRI (test), clinical assessment (reference

standard)

Murray et al (32) Unenhanced MRI, test measure; enhanced MRI,

reference standard


Remedios et al (33) MRI, test measure (scoring system); clinical findings,

reference standard

Ramsey et al (29) Responsiveness of clinical findings to NSAIDs and

intra-articular steroids: clinical response confirmed

by MRI n = 1/4; accuracy of MRI: MRI diagnosis

confirmed by arthroscopy n = 5/5; accuracy of x-rays

compared with MRI: false-negatives = 6/20;

false-positives = 1/20

Uhl et al (25) MRI, test measure; clinical findings, reference standard

for discrimination of JIA vs non-JIA knees

Cakmakci et al (34) MRI and clinical findings, tests; follow-up, reference

standard

Gylys-Morin et al (28) Evaluative role among internal items: MRI, test

measure (40 items) and between MRI items and

clinical synovitis; discriminative role: MRI for

discrimination between JIA and control knees

El-Miedany et al (42) Evaluative role comparing unenhanced (test) vs

enhanced MRI (reference standard): extent of

pannus, joint effusion; evaluative role comparing

x-rays, US, unenhanced MRI (tests) and enhanced

MRI (reference standard): cartilage destruction;

discriminative role: MRI vs US for discrimination

between JIA and control subjects: 9 structural items

Argyropoulou et al (30) MRI, test measure (scoring system); clinical findings,

reference standard

Kight et al (35) Discriminative role of MRI (test); clinical assessment

(JIA vs healthy children): reference standard

Workie et al (38) Evaluative role of MRI, test measure (signal

enhancement patterns) compared with clinical

measures (CHAQ-Disability by parents and total

knee scores [swelling, tenderness and limitation]):

functional measure, reference standard, and

evaluative role (internal relationship) of MRI

components (enhancement rates of synovium vs

femoral physis)

Workie and Dardzinski (37) Not applicable: assessment of differences/relationship

between parameters of three models

Graham et al (36) MRI, test measure (total synovial volume); clinical

scores, reference standard

Gardner-Medwin (26) Follow-up: reference standard

Nistala et al (27) MRI, outcome measure, reference standard; clinical

and laboratory findings, predictor measures

Workie et al (39) MRI, test measure (scoring system); clinical findings,

reference standard

8. Technical specifications of material and

methods involved, including how and when

measurements were taken, and/or cite

references for index test and reference

standard

Herve-Somma et al (31) Methods (MRI) well described

Eich et al (40) Methods (MRI) well described

Huppertz et al (41) Methods (MRI) well described

Murray et al (32) Methods (MRI) well described

Remedios et al (33) Methods (MRI) partially described


Ramsey et al (29) Methods (MRI) well described

Uhl et al (25) Methods (MRI) partially described

Cakmakci et al (34) Methods (MRI) well described

Gylys-Morin et al (28) Methods (MRI) well described

El-Miedany et al (42) Methods (MRI) well described

Argyropoulou et al (30) Methods (MRI) well described

Kight et al (35) Methods (MRI) well described

Workie et al (38) Methods (MRI) well described

Workie and Dardzinski (37) Methods (MRI) well described

Graham et al (36) Methods (MRI) well described

Gardner-Medwin (26) Methods (MRI) well described

Nistala et al (27) Methods (MRI) well described

Workie et al (39) Methods (MRI) well described

9. Definition of and rationale for the units, cutoffs,

and/or categories of the results of the index

tests and the reference standard

Herve-Somma et al (31) Not reported

Eich et al (40) Not applicable

Huppertz et al (41) Not applicable

Murray et al (32) Not reported

Remedios et al (33) Not reported

Ramsey et al (29) Not reported

Uhl et al (25) Not reported

Cakmakci et al (34) Not applicable

Gylys-Morin et al (28) Pettersson score (radiographs)

El-Miedany et al (42) Not reported

Argyropoulou et al (30) Not reported

Kight et al (35) Not reported

Workie et al (38) Not reported

Workie and Dardzinski (37) Not reported

Graham et al (36) Not reported

Gardner-Medwin (26) Not reported

Nistala et al (27) Global assessment of overall disease activity (VAS-

PGA), CHAQ, and VAS global

Workie et al (39) Not reported

10. The number, training, and expertise of the

persons executing and reading the index test

and the reference standard

Herve-Somma et al (31) An independent observer reviewed the MRI (no further

information was given)

Eich et al (40) Not reported

Huppertz et al (41) Not reported

Murray et al (32) Two radiologists reviewed all the images; no information

about the clinicians; no details on reviewers'

expertise given

Remedios et al (33) Two radiologists reviewed all the images; no information

about the clinicians; no details on reviewers'

expertise given

Ramsey et al (29) Two pediatric radiologists with experience in

musculoskeletal MRI, one blinded to clinical history;

differences were reviewed in consensus

Uhl et al (25) Five independent radiologists without knowledge of

the clinical findings; no information about the

clinicians

Cakmakci et al (34) An independent observer; no further information given


Gylys-Morin et al (28) Two radiologists blinded to clinical details read the

MRI; in case of disagreement, a third radiologist

provided independent interpretation; no further

information given

El-Miedany et al (42) Not reported

Argyropoulou et al (30) One radiologist (specified) reviewed the images blinded to clinical

evaluation; no further information given

Kight et al (35) Operators involved in data analysis and image

evaluation were blinded to study group; no further

information given

Workie et al (38) Not reported

Workie and Dardzinski (37) Not reported

Graham et al (36) Rheumatologist for clinical inclusion, radiologist for

MRI interpretation

Gardner-Medwin (26) MRI was reviewed by two consultant pediatric

radiologists (expertise specified)

Nistala et al (27) Patients examined by a pediatric rheumatologist or an

experienced pediatric rheumatology trainee (not

otherwise specified); MRI reviewed by two pediatric

radiologists (names specified)

Workie et al (39) Not reported

11. Whether or not the readers of the index

tests and reference standard were blind to

the results of the other test and describe any

other clinical information available to

the readers

Herve-Somma et al (31) Not reported

Eich et al (40) Not reported

Huppertz et al (41) Not reported

Murray et al (32) Partial

Remedios et al (33) Not reported

Ramsey et al (29) Partial

Uhl et al (25) Partial

Cakmakci et al (34) Not reported

Gylys-Morin et al (28) Radiologists were blinded to details of the clinical data

El-Miedany et al (42) Not reported

Argyropoulou et al (30) One radiologist reviewed the images blinded to clinical evaluation

Kight et al (35) Operators involved in data analysis and image

evaluation were blinded to clinical information

Workie et al (38) Not reported

Workie and Dardzinski (37) Not reported

Graham et al (36) Rheumatologists were blinded to results of imaging;

radiologists were blinded to the results of clinical

assessment

Gardner-Medwin (26) Clinical examiners were blinded to MRI results

Nistala et al (27) Pediatric radiologists were blinded to clinical details

Workie et al (39) Not mentioned

12. Methods for calculating or comparing

measures of diagnostic accuracy, and the

statistical methods used to quantify uncertainty

Herve-Somma et al (31) Degree of cartilage destruction (MRI); χ² and

Wilcoxon tests

Eich et al (40) Changes over time; no statistical methods reported

Huppertz et al (41) Not applicable

Murray et al (32) Visualization of pannus; Wilcoxon test

Remedios et al (33) Correlation between MRI signal and clinical outcome

measures; no statistical methods reported


| Study | Response |
| --- | --- |
| Ramsey et al (29) | Clinical (MRI: reference standard; n = 4), x-rays (MRI: reference standard; n = 20), and MRI (arthroscopy and biopsy: reference standard; n = 5); no statistical methods reported |
| Uhl et al (25) | Not mentioned, partial, presence or absence of JIA; terms sensitivity, specificity, positive and negative likelihood; ROC curves (no cutoff values reported); standard error |
| Cakmakci et al (34) | Both MRI and clinical scores over time; Spearman correlation: association between MRI and clinical results |
| Gylys-Morin et al (28) | Pettersson score (radiographs), MRI findings; t tests (continuous variables); nonparametric tests (ordinal variables); Spearman rank correlation: if at least one variable was ordinal; Pearson correlation: two continuous variables; ROC curves: nonparametric methods (Hanley and McNeil method): cutoffs reported only for x-rays (Pettersson scores), not for MRI |
| El-Miedany et al (42) | US and MRI findings in affected and control subjects; no statistical methods reported |
| Argyropoulou et al (30) | Clinical criteria for disease activity; test of normality of distribution; two-tailed t tests: differences in MRI grades between active and inactive patients; ANOVA: differences in MRI grades between patients with oligo, poly, and systemic forms |
| Kight et al (35) | T2 values of JIA vs healthy knees; standard deviations for normalized distances; two-tailed t tests for analyses of differences in mean T2 relaxation times |
| Workie et al (38) | Correlation between MRI signal and clinical outcome measures; Pearson correlation coefficient for parameter between distal femoral physis and synovium; unpaired t tests |
| Workie and Dardzinski (37) | Poorly described; t tests for differences in signal enhancement using Ktrans, Kep, and Vp |
| Graham et al (36) | Clinical measures (total hand swelling scores); Pearson correlation (parametric data) and Spearman correlation (nonparametric data) |
| Gardner-Medwin (26) | Future clinical outcome; Fisher’s exact test and Mann-Whitney test |
| Nistala et al (27) | MRI, core outcome variables; Pearson correlation: association between total MRI scores; Mann-Whitney test: comparison between hip MRI scores and clinician-defined active and inactive groups; χ² test: effect of damage on concordance between clinical and MRI scores |
| Workie et al (39) | MRI parameters (Ktrans, Kep, Vp); Spearman’s rank correlation: between MRI and clinical/laboratory parameters; Wilcoxon test: comparison of results at different time points |

13. Methods for calculating test reproducibility

| Study | Response |
| --- | --- |
| Herve-Somma et al (31) | Not applicable |
| Eich et al (40) | Not done |
| Huppertz et al (41) | Not applicable |
| Murray et al (32) | Not done |
| Remedios et al (33) | Not done |
| Ramsey et al (29) | Not done |
| Uhl et al (25) | Not done |
| Cakmakci et al (34) | Kappa statistics: not appropriate for agreement between MRI and clinical results |
| Gylys-Morin et al (28) | κ (interobserver reliability): cutoff values for excellent, good, and marginal reliability |
| El-Miedany et al (42) | Not applicable |
| Argyropoulou et al (30) | Not done |
| Kight et al (35) | Not done |
| Workie et al (38) | Not applicable |
| Workie and Dardzinski (37) | Not done |
| Graham et al (36) | Interoperator reliability of synovial volume calculation; coefficient of variation (variation within and between observers) |
| Gardner-Medwin (26) | Not done |
| Nistala et al (27) | κ: intraobserver agreement; not appropriate for concordance between clinician’s assessment and MRI results |
| Workie et al (39) | Not applicable |

ACR, American College of Radiology; ANOVA, analysis of variance; CHAQ, Childhood Health Assessment Questionnaire; EULAR, European League Against Rheumatism; ILAR, International League of Associations for Rheumatology; JIA, juvenile idiopathic arthritis; JRA, juvenile rheumatoid arthritis; MRI, magnetic resonance imaging; NSAID, nonsteroidal anti-inflammatory drug; PGA, patient global assessment; ROC, receiver-operating characteristic; STARD, Standards for Reporting of Diagnostic Accuracy; VAS, visual analog scale.
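
Several responses under item 13 report κ statistics for observer agreement. As a minimal sketch of what that statistic measures, the following Python function computes Cohen's κ for two raters. The function name `cohens_kappa` and the example ratings are hypothetical illustrations and are not taken from any of the reviewed studies.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items (hypothetical helper)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same category by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: derived from each rater's marginal category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


# Made-up ratings: two readers grading a finding as present (1) or absent (0).
reader_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
reader_2 = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0]
kappa = cohens_kappa(reader_1, reader_2)
```

In this made-up example the readers agree on 8 of 10 items, yet κ is approximately 0.6 because half of that agreement would be expected by chance given each reader's marginal frequencies.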


APPENDIX B. REPORT OF THE QUADAS CRITERIA OF 18 ARTICLES ON ASSESSMENT OF JIA OF PERIPHERAL JOINTS USING MRI

| Item | Herve-Somma et al (31) | Eich et al (40) | Huppertz et al (41) | Murray et al (32) | Remedios et al (33) | Ramsey et al (29) | Uhl et al (25) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1. Was the spectrum of patients representative of the patients who will receive the test in practice? | U/C | U/C | Yes | No | U/C | U/C | No |
| 2. Were selection criteria clearly described? | U/C | Yes | Yes | U/C | No | Yes | No |
| 3. Is the reference standard likely to correctly classify the target condition? | Yes | Yes | No | Yes | Yes | Yes | Yes |
| 4. Is the time period between reference standard and index test short enough to be reasonably sure that the target condition did not change between the two tests? | U/C | U/C | Yes | U/C | No | Yes | N/A |
| 5. Did the whole sample, or a random selection of the sample, receive verification using a reference standard of diagnosis? | Yes | Yes | Yes | Yes | Yes | Yes | No |
| 6. Did patients receive the same reference standard regardless of the index test result? | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 7. Was the reference standard independent of the index test? | No | No | N/A | Yes | Yes | Yes | Yes |
| 8. Was the execution of the index test described in sufficient detail to permit replication of the test? | Yes | Yes | Yes | Yes | No | No | U/C |
| 9. Was the execution of the reference standard described in sufficient detail to permit replication of the test? | Yes | Yes | N/A | Yes | U/C | Yes | No |
| 10. Were the index test results interpreted without knowledge of the results of the reference standard? | No | No | N/A | Yes | U/C | U/C | Yes |
| 11. Were the reference standard results interpreted without knowledge of the results of the index test? | No | No | N/A | No | No | U/C | U/C |
| 12. Were the same clinical data available when test results were interpreted as would be available when the test is used in practice? | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 13. Were uninterpretable test results reported? | No | U/C | No | No | No | No | No |
| 14. Were withdrawals from the study explained? | N/A | Yes | U/C | No | No | No | No |

JIA, juvenile idiopathic arthritis; MRI, magnetic resonance imaging; N/A, not available; QUADAS, Quality Assessment of Studies of Diagnostic Accuracy Included in Systematic Reviews; U/C, unclear.

Items were rated “yes” if adequately described, “no” if not adequately described, “unclear” if the information in the article was unclear, and “not available” if no information was given in the article.

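APPENDIX B. (Continued)

| Item | Cakmakci et al (34) | Gylys-Morin et al (28) | El-Miedany et al (42) | Argyropoulou et al (30) | Kight et al (35) | Workie et al (38) | Workie and Dardzinski (37) | Graham et al (36) | Gardner-Medwin (26) | Nistala et al (27) | Workie et al (39) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | U/C | U/C | U/C | U/C | U/C | U/C | No | U/C | U/C | U/C | U/C |
| 2 | U/C | Yes | U/C | No | Yes | No | No | Yes | No | Yes | Yes |
| 3 | Yes | Yes | Yes | No | Yes | No | N/A | Yes | Yes | Yes | Yes |
| 4 | U/C | U/C | No | No | N/A | N/A | N/A | Yes | U/C | Yes | No |
| 5 | Yes | Yes | U/C | No | No | No | No | No | Yes | Yes | Yes |
| 6 | Yes | Yes | Yes | Yes | Yes | N/A | N/A | Yes | Yes | Yes | Yes |
| 7 | No | Yes | No | U/C | Yes | No | N/A | Yes | Yes | Yes | Yes |
| 8 | Yes | Yes | U/C | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 9 | Yes | Yes | Yes | Yes | N/A | N/A | N/A | Yes | Yes | Yes | Yes |
| 10 | No | Yes | No | No | Yes | No | N/A | U/C | Yes | U/C | U/C |
| 11 | No | Yes | No | No | Yes | No | No | Yes | Yes | Yes | No |
| 12 | Yes | Yes | Yes | U/C | Yes | Yes | No | Yes | Yes | Yes | Yes |
| 13 | No | No | No | No | Yes | No | No | No | No | No | No |
| 14 | No | N/A | No | No | Yes | No | No | No | No | Yes | No |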
