
Scientific Studies of Reading
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/hssr20

Measuring Reading Comprehension
Jack M. Fletcher

Published online: 19 Nov 2009.

To cite this article: Jack M. Fletcher (2006) Measuring Reading Comprehension, Scientific Studies of Reading, 10:3, 323-330, DOI: 10.1207/s1532799xssr1003_7

To link to this article: http://dx.doi.org/10.1207/s1532799xssr1003_7


    Measuring Reading Comprehension

Jack M. Fletcher
Department of Psychology, University of Houston

The five articles in this special issue are a blend of experimental and correlational approaches that exemplify advances in contemporary approaches to assessment of reading comprehension. They illustrate how inferences about reading comprehension are determined in part by the material presented for comprehending and the format that is used for assessing comprehension of the material. In the future, the approaches to measuring reading comprehension in these articles could be further integrated by perspectives that cut across particular approaches to measurement and begin to utilize multimethod modeling of the latent variables that underlie different tests of reading comprehension.

How is reading comprehension best measured? The five articles in this special issue each address a set of problems that affect how reading comprehension is measured, utilizing an interesting blend of experimental and correlational methodologies. The clear consensus across these articles is that the measurement issues are complicated, reflecting the complex, multidimensional nature of reading comprehension. Historically, education and psychology researchers have identified multiple approaches to measurement, with different decades characterized by an emphasis on different approaches to assessment. As Pearson and Hamm (2005) summarized, early research identified that reading comprehension involved multiple components that would appear depending on the formats used to present the material to be read and the manner in which the person was asked to indicate their understanding of the material that was read. Despite this historical emphasis, many modern approaches to the assessment of reading comprehension are one dimensional, with little variation in the material the person reads and relatively narrow response formats that do not vary within the test. Thus, some tests rely almost exclusively on multiple choice, others on fill-in-the-blank (cloze), and others on retells. The drive for high reliability, especially on high-stakes assessments, often leads to significant restrictions of both the type of material that must be read and the response formats. Yet, as the article by Cutting and Scarborough (2006/this issue) clearly demonstrates, the inferences that are made about how well an individual person comprehends written material vary depending on how it is assessed.

Inferring how well a person comprehends is the real problem in measuring reading (and language) comprehension (RAND Reading Study Group, 2002). In contrast to word recognition, where the behavior of interest is fairly overt and different methods give similar outcomes at the latent variable level (accounting for method variance and error of measurement), the assessment of reading comprehension is difficult because it is not an overt process that can be directly observed. Rather, only the products of the process of comprehending are observed, and an inference is made about the nature of the processes and the quality of the comprehension. As the article by Francis et al. (2006/this issue) demonstrates, any single, one-dimensional attempt to assess reading comprehension is inherently imperfect. From the psychometric view of Francis et al., differences across methods used to measure reading comprehension can be interpreted as degrees of imperfection in how well different indicators identify one or more latent variables that make up reading comprehension. In this commentary I examine the five articles from this admittedly psychometric perspective.

This psychometric perspective does help place the five articles in this issue in context. Each explicitly focuses on issues involving measurement, specifically, what is read and how comprehension of what is read is measured. Three articles are essentially experimental in their origins and address both of these determinants of reading comprehension in different ways. Millis, Magliano, and Todaro (2006/this issue) focused on the comprehension of expository text and asked for responses that represented think-alouds. These responses were then analyzed at the level of discourse using latent semantic analysis (LSA). The validity of this approach was evaluated by correlating the products of LSA with a passage taken from the Nelson–Denny reading comprehension test (Brown, Fishco, & Hanna, 1993). Questions were developed to represent literal (text-based) and inferential understanding of the passages. They found good convergence between the LSA methods of assessing reading comprehension and the indices obtained from the Nelson–Denny.
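To make the LSA step concrete, the sketch below (Python with scikit-learn) scores hypothetical think-aloud protocols by their cosine similarity to the source passage in a reduced semantic space. The passage, the protocols, the dimensionality, and the similarity-based scoring rule are all illustrative assumptions, not Millis et al.'s actual pipeline, which was trained on a large corpus and tied to discourse-level codings.

```python
# Minimal LSA sketch (illustrative assumptions throughout): score
# think-aloud protocols by semantic similarity to the passage.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

passage = ("Glaciers form where winter snowfall exceeds summer melting, "
           "so ice accumulates and slowly flows downhill under its own weight.")
think_alouds = [  # hypothetical reader protocols
    "So snow piles up faster than it melts, and the ice starts moving downhill.",
    "I think this is about cold weather, maybe skiing in the mountains.",
]

# Build a term-document matrix over the passage plus the protocols, then
# project it into a low-rank latent semantic space with truncated SVD.
docs = [passage] + think_alouds
tfidf = TfidfVectorizer(stop_words="english").fit_transform(docs)
n_dims = min(2, tfidf.shape[1] - 1)  # toy corpus; real LSA uses ~100-300 dims
space = TruncatedSVD(n_components=n_dims, random_state=0).fit_transform(tfidf)

# Cosine similarity to the passage in LSA space is a crude comprehension
# index that could then be correlated with standardized test scores.
for text, sim in zip(think_alouds, cosine_similarity(space[1:], space[:1])):
    print(f"{sim[0]:.2f}  {text}")
```

In this toy setup, the paraphrase-like protocol should score higher than the off-topic one, which is the property that makes similarity-based indices candidates for validation against tests like the Nelson–Denny.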

Although the difficulty level of the passages was controlled in Millis et al.'s (2006/this issue) article, Deane, Sheehan, Sabatini, Futagi, and Kostin (2006/this issue) explicitly demonstrated that variations in the lexical and discourse elements of different passages accounted for variations in how comprehension was tested at different grade levels. They measured text-based characteristics of lexiled Grade 3 and Grade 6 expository and narrative books. Based in part on LSA, they identified a variety of text characteristics that were factor analyzed to identify different dimensions of text variability. Comparisons of the factor structure of Grade 3 and Grade 6 texts showed differences in a variety of indices of complexity that accounted for differences in readability and undoubtedly would explain differences in levels of comprehension depending on the level of reading ability.
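As a toy illustration of this strategy (the features, passages, and single-factor model are invented for the example; Deane et al. derived a much richer feature set from lexiled corpora), one can compute surface features per passage and then factor analyze them:

```python
# Illustrative only: extract simple text features per passage, then factor
# analyze them to look for dimensions of text variability.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

passages = [
    "The cat sat. It was warm. The cat slept.",
    "Photosynthesis converts light energy into chemical energy stored in glucose.",
    "Dogs bark. Birds sing. Fish swim in the pond.",
    "Mitochondrial respiration oxidizes substrates to regenerate adenosine triphosphate.",
]

def features(text: str) -> list:
    words = text.replace(".", "").split()
    sentences = [s for s in text.split(".") if s.strip()]
    return [
        len(words) / len(sentences),                   # mean sentence length
        float(np.mean([len(w) for w in words])),       # mean word length
        len({w.lower() for w in words}) / len(words),  # type-token ratio
    ]

X = StandardScaler().fit_transform(np.array([features(p) for p in passages]))
fa = FactorAnalysis(n_components=1, random_state=0).fit(X)
print("feature loadings on factor 1:", fa.components_.round(2))
```

With realistic corpora, comparing loadings estimated separately for Grade 3 and Grade 6 texts is what reveals the grade-level differences in complexity the authors report.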

The third experimental article (Rayner, Chace, Slattery, & Ashby, 2006/this issue) examined the role of eye movements as indicators of comprehension processes. In a series of studies, the authors monitored eye movements and showed that variations in eye movements reflect the difficulty of the passage as well as difficulties dealing with inconsistencies in the text. When the text is more difficult, comprehension is inconsistent, and there is a much higher probability of a regressive eye movement. The nature of regressive eye movements when reading text is an indicator of the operation of different reading comprehension processes. Thus, the presence of a long regression usually indicates that the reader has found a part of the text in which his or her understanding is not adequate, leading him or her to reread earlier sections of the passage.
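The core logic of flagging a regressive eye movement is simple enough to sketch; the fixation-log format and the rule below are assumptions for illustration, not the authors' recording or analysis software.

```python
# Illustrative heuristic: a regression is any fixation landing on an earlier
# word than the furthest word reached so far in the scan path.
from typing import List, Tuple

def count_regressions(fixations: List[Tuple[int, int]]) -> Tuple[int, List[int]]:
    """Return the number of regressive saccades and their target word indices.

    Each fixation is a (word_index, duration_ms) pair in temporal order.
    """
    furthest = -1
    targets = []
    for word_idx, _duration in fixations:
        if word_idx < furthest:
            targets.append(word_idx)  # reader jumped back: a likely repair
        furthest = max(furthest, word_idx)
    return len(targets), targets

# A hypothetical scan path: the reader backtracks after word 3, as might
# happen when an inconsistency in the text is detected.
scanpath = [(0, 210), (1, 180), (2, 250), (3, 300), (1, 260), (2, 220), (4, 190)]
n, targets = count_regressions(scanpath)
print(f"{n} regressions, back to words {targets}")
```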

The two correlational studies explicitly show that the inferences about reading comprehension depend on how reading comprehension is measured. Cutting and Scarborough (2006/this issue) addressed this issue explicitly by examining the contributions of different measures of word decoding and oral language ability to three different reading comprehension tests: the Wechsler Individual Achievement Test (WIAT; Wechsler, 1992), the Gates–MacGinitie (MacGinitie, Maria, & Dreyer, 2000), and the Gray Oral Reading Test—Third Edition (Wiederholt & Bryant, 1992). These three tests vary in terms of the material the person reads and the format used to indicate his or her understanding. The WIAT presents short passages of 2 to 3 sentences that are read silently. The participant then answers open-ended questions about each passage. One question is literal, the other is inferential, and a short-answer approach is used. In contrast, the Gates–MacGinitie involves passages of up to 15 sentences that are also read silently. The response format is multiple choice. Finally, the Gray Oral Reading Test—Third Edition involves passages of about 6 to 7 sentences that are read aloud. The response format involves multiple-choice questions, but in contrast to the Gates–MacGinitie and WIAT, the participant must respond with the passage out of view. Not surprisingly, Cutting and Scarborough found that the pattern of correlations of these different assessments with different measures of decoding and oral language skills varies across tests and is not constant.
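A small simulation can show why such patterns must arise; the loadings below are invented and serve only to illustrate that when two tests weight decoding and oral language differently, their correlations with those predictors diverge.

```python
# Invented loadings, not study data: two comprehension tests that weight
# decoding differently show different predictor correlation patterns.
import numpy as np

rng = np.random.default_rng(0)
n = 2000
decoding = rng.standard_normal(n)   # latent word-recognition skill
language = rng.standard_normal(n)   # latent oral language skill

test_a = 0.7 * decoding + 0.3 * language + 0.3 * rng.standard_normal(n)
test_b = 0.3 * decoding + 0.7 * language + 0.3 * rng.standard_normal(n)

for name, test in [("decoding-heavy test", test_a), ("language-heavy test", test_b)]:
    r_dec = np.corrcoef(decoding, test)[0, 1]
    r_lang = np.corrcoef(language, test)[0, 1]
    print(f"{name}: r(decoding) = {r_dec:.2f}, r(language) = {r_lang:.2f}")
```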

Francis et al. (2006/this issue) explored similar issues but used a latent variable approach to test specific models about the relation of decoding, language ability, and comprehension. They also found that the relation of decoding and language abilities depends on how comprehension is assessed. Two standardized tests of reading comprehension were used: Passage Comprehension from the Woodcock–Johnson–III (Woodcock, McGrew, & Mather, 2001) and the Diagnostic Assessment of Reading Comprehension (DARC). The Passage Comprehension subtest is a widely used measure that involves reading sentences or short passages, with a cloze procedure as a response format. The DARC is an experimental test that involves reading passages of three sentences, but it is specifically designed to control for the level of decoding that is required. The passages have been carefully constructed to manipulate components that involve text memory and inferencing as well as knowledge access and integration. They also measure comprehension using linguistic discourse methods. Not surprisingly, the decoding and phonological awareness measures are more strongly related to the Woodcock–Johnson–III Passage Comprehension test and are not strongly related to the DARC.

These five articles are all noteworthy for the different components of reading comprehension that are assessed. In the remainder of this commentary, I highlight three issues that seem particularly important for the measurement of reading comprehension: the nature of the text, how reading comprehension is assessed, and individual differences.

    NATURE OF THE TEXT

All five of these articles demonstrate that the material the participant is asked to read is a major determinant of the inference that is made about the quality of comprehension. Clearly, different inferences will be made depending not only on the difficulty of the text but also on the semantic, syntactic, and related characteristics of the text. This is most clearly demonstrated in Deane et al.'s (2006/this issue) article but is also inherent in the eye movement study by Rayner et al. (2006/this issue) and the explicit effort to minimize the role of certain text characteristics in Francis et al. (2006/this issue). Deane et al.'s findings are particularly interesting, showing that variations in text characteristics are related to standard ways of assessing the readability of the passages. This research is consistent with other studies of text characteristics, most notably by Hiebert (2002) and Foorman, Francis, Davidson, Harm, and Griffin (2004). Foorman et al. developed a computer program for identifying lexical, semantic, and syntactic features of text in six different commercial basal reading programs. They found considerable variability not only in the composition of the text but also in the frequency of specific words taught in the program. Such studies illustrate the importance of understanding text variability as a determinant of the inferences made about reading comprehension. Certainly, having participants read text that is relatively restricted in composition limits the inferences that can be made. Examinations of text characteristics should be linked with methods derived from discourse analysis, which forms the basis for Deane et al.'s approach to analyzing text characteristics. Millis et al. (2006/this issue) nicely demonstrate the value of LSA not only for understanding the nature of the passage but also for linking comprehension of the passage to the text.


    RESPONSE FORMATS

All of the articles in this series demonstrate the importance of considering the format by which reading comprehension is assessed. This is most explicitly addressed by Cutting and Scarborough (2006/this issue), who show differential discriminant validity for a set of predictive variables as they predicted reading comprehension outcomes. Similarly, the latent variable analyses of Francis et al. (2006/this issue) also show differential forms of discriminant validity. The assessments of reading comprehension in these two studies included standardized tests but expanded to experimental and discourse-level procedures. Francis et al. and Rayner et al. (2006/this issue) utilized a method that is independent of the participant's actual response to assess reading comprehension, showing that regressive eye movements are good indicators of difficulties in comprehension and that these difficulties are related to the complexity of the text. The use of LSA for analyzing think-alouds in Millis et al.'s (2006/this issue) article seems particularly promising. It would be interesting to know more about how think-alouds are correlated with performance on standardized tests of reading comprehension like those used by Cutting and Scarborough. The use of the passages in the Nelson–Denny is interesting, but the response format was somewhat modified. Across studies, the finding of positive correlations among different measures of reading comprehension that varied in the format by which comprehension was measured was interesting. Clearly, there are one or more latent variables that represent the different components of reading comprehension; these are imperfectly indexed by individual measures, but the indices are correlated. Method bias, measurement error, and the operation of general factors must be addressed in understanding these correlations, but these components can be modeled.
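One classical tool for separating measurement error from substantive differences is Spearman's correction for attenuation, which estimates the latent correlation from the observed correlation and the reliabilities of the two measures; the numbers below are invented for illustration.

```python
# Correction for attenuation: r_latent = r_observed / sqrt(r_xx * r_yy).
import math

def disattenuate(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimate the latent correlation of two imperfectly reliable measures."""
    return r_xy / math.sqrt(r_xx * r_yy)

# Two comprehension tests correlate .55 as observed; with reliabilities of
# .85 and .75, the estimated latent correlation is noticeably higher.
print(round(disattenuate(0.55, 0.85, 0.75), 2))  # -> 0.69
```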

    INDIVIDUAL DIFFERENCES

Virtually all of the articles mentioned the need for diagnostic tests of reading comprehension. A diagnostic test of reading comprehension would provide information on the strengths and weaknesses of individual people and their ability to comprehend reading material. Such assessments may suggest interventions at the classroom or individual level that could be applied to enhance reading comprehension. Much is known about teaching reading comprehension, so knowing more about how to link specific forms of instruction with the needs of individual readers would be helpful.

Interestingly, the samples in each of these studies tended to be relatively restricted in terms of the range of individual differences, with the possible exceptions of the samples examined by Cutting and Scarborough (2006/this issue) and Francis et al. (2006/this issue). The expansion of these methods into more diverse populations would be interesting, but larger samples would be needed. In the end, it is likely that what one would conclude about diagnostic tests of reading comprehension would depend on how reading comprehension is assessed. The notion that individual differences can be understood with a single procedure is probably questionable. Adequate assessments of reading comprehension will have to rely on multiple procedures that manipulate both the material that is read and the response format. It is likely that the assessment will need to be expanded beyond just what might happen in a psychoeducational assessment or a group-based high-stakes assessment and take into account comprehension as it actually occurs in the classroom. Far too much may have been made of the idea that reading comprehension is an interaction between the reader and the text and occurs in a situated model, but these are nonetheless relevant considerations when it comes to understanding individual differences. It may be useful to incorporate observational techniques that involve systematic querying of teachers about the quality of a particular student's reading comprehension.

This need highlights the importance of beginning to integrate research across studies of the sort observed in this special issue and to begin to approach reading comprehension from a multimethod perspective, in which a full range of underlying latent variables can be assessed and analytic methods like those used in Francis et al. (2006/this issue) are applied to understand the relations of the constructs at the level of the latent variable and not simply at the level of the observed variables.

    CONCLUSIONS

If the goal is to develop diagnostic tests, which was mentioned by the authors of each of the five articles, then approaches to the assessment of reading comprehension need to incorporate multiple indicators to enhance the precision with which the underlying latent variables are measured. Otherwise, the results may have what the construct validity world has termed a mono-operation bias (Cook & Campbell, 1979). Although some argue that issues in assessing reading comprehension are so complex that no psychometric approach can ever be adequate, the real issue is attempting to model the complexity through experimental and correlational techniques. We need multimethod research that moves beyond the unidimensional approaches that characterize most contemporary approaches to measurement and looks at relations at the latent variable level. Cook and Campbell identified construct underrepresentation, in which a single variable does not adequately index the underlying constructs, as a major factor limiting inferences about complex human behaviors. They noted that the fundamental problem in construct validity research was "that the operations which are meant to represent a particular cause or effect construct can be construed in terms of more than one construct" (p. 59), or that a construct is underidentified because of mono-operation bias. Research that isolates specific factors in reading comprehension is important. What is also needed is research that integrates across methods and specifically attempts to identify different constructs that are specific to reading comprehension and their relation to other constructs that make up reading and other cognitive skills. Such approaches will move beyond mono-operation bias and lead to assessments that capture the richness of reading comprehension. Tests based on such analyses not only will permit better inferences about reading comprehension but also will be diagnostic, because variability within the test will be apparent for some readers. This variability may be tied to differential instruction, which is the ultimate purpose of assessing reading comprehension.
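The precision gained from multiple indicators can be made concrete with a short simulation (all numbers invented): a composite of several imperfect indicators of the same latent variable tracks that variable more closely than any single indicator does, which is the psychometric rationale for multimethod assessment.

```python
# Invented simulation: k imperfect indicators of one latent comprehension
# variable; the composite outperforms any single indicator.
import numpy as np

rng = np.random.default_rng(1)
n, k = 5000, 4
latent = rng.standard_normal(n)
indicators = 0.6 * latent[:, None] + 0.8 * rng.standard_normal((n, k))

r_single = np.corrcoef(latent, indicators[:, 0])[0, 1]
r_composite = np.corrcoef(latent, indicators.mean(axis=1))[0, 1]
print(f"single indicator r = {r_single:.2f}; composite of {k} r = {r_composite:.2f}")
```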

    ACKNOWLEDGMENTS

This research was supported in part by a grant from the National Institute of Child Health and Human Development, HD052117, Texas Center for Learning Disabilities. I gratefully acknowledge the contributions of Rita Taylor to manuscript preparation.

Correspondence should be sent to Jack M. Fletcher, Department of Psychology, University of Houston, 2151 West Holcombe Boulevard, 222 TMC Annex, Houston, TX 77204–5053. E-mail: [email protected]

    REFERENCES

Brown, J. I., Fishco, V. V., & Hanna, G. S. (1993). Nelson–Denny Reading Test. Chicago: Riverside.

Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Chicago: Rand McNally.

Cutting, L. E., & Scarborough, H. S. (2006). Prediction of reading comprehension: Relative contributions of word recognition, language proficiency, and other cognitive skills can depend on how comprehension is measured. Scientific Studies of Reading, 10, 277–299.

Deane, P., Sheehan, K. M., Sabatini, J., Futagi, Y., & Kostin, I. (2006). Differences in text structure and its implications for assessment of struggling readers. Scientific Studies of Reading, 10, 257–275.

Foorman, B. R., Francis, D. J., Davidson, K. C., Harm, M. W., & Griffin, J. (2004). Variability in text features in six grade 1 basal reading programs. Scientific Studies of Reading, 8, 167–197.

Francis, D. J., Snow, C. E., August, D., Carlson, C. D., Miller, J., & Iglesias, A. (2006). Measures of reading comprehension: A latent variable analysis of the Diagnostic Assessment of Reading Comprehension. Scientific Studies of Reading, 10, 301–322.

Hiebert, E. H. (2002). Standards, assessments, and text difficulty. In A. E. Farstrup & S. J. Samuels (Eds.), What research has to say about reading instruction (pp. 337–391). Newark, DE: International Reading Association.

Millis, K., Magliano, J., & Todaro, S. (2006). Measuring discourse-level processes with verbal protocols and latent semantic analysis. Scientific Studies of Reading, 10, 225–240.

Pearson, P. D., & Hamm, D. N. (2005). The assessment of reading comprehension: A review of practices—Past, present, and future. In S. G. Paris & S. A. Stahl (Eds.), Children's reading comprehension and assessment (pp. 13–69). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.

RAND Reading Study Group. (2002). Reading for understanding. Washington, DC: RAND Education.

Rayner, K., Chace, K. H., Slattery, T. J., & Ashby, J. (2006). Eye movements as reflections of comprehension processes in reading. Scientific Studies of Reading, 10, 241–255.

Wechsler, D. (1992). Wechsler Individual Achievement Test. San Antonio, TX: Psychological Corporation.

Wiederholt, J. L., & Bryant, B. R. (1992). Gray Oral Reading Test—3. Austin, TX: PRO-ED.

Woodcock, R. W., McGrew, K. S., & Mather, N. (2001). Woodcock–Johnson III Tests of Achievement. Itasca, IL: Riverside.