Torrance 2011 Assessment

    This article was downloaded by: [Oxford Brookes University]
    On: 01 July 2013, At: 04:44
    Publisher: Routledge
    Informa Ltd Registered in England and Wales Registered Number: 1072954
    Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

    British Journal of Educational Studies

    Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/rbje20

    Using Assessment to Drive the Reform of Schooling: Time to Stop Pursuing the Chimera?

    Harry Torrance, Manchester Metropolitan University

    Published online: 08 Dec 2011.

    To cite this article: Harry Torrance (2011): Using Assessment to Drive the Reform of Schooling: Time to Stop Pursuing the Chimera?, British Journal of Educational Studies, 59:4, 459-485

    To link to this article: http://dx.doi.org/10.1080/00071005.2011.620944

    British Journal of Educational Studies

    Vol. 59, No. 4, December 2011, pp. 459-485

    USING ASSESSMENT TO DRIVE THE REFORM OF SCHOOLING: TIME TO STOP PURSUING THE CHIMERA?

    by HARRY TORRANCE, Manchester Metropolitan University

    ABSTRACT: Internationally, over the last 20-30 years, changing the procedures and processes of assessment has come to be seen, by many educators as well as policy-makers, as a way to frame the curriculum and drive the reform of schooling. Such developments have often been manifested in large scale, high stakes testing programmes. At the same time educational arguments have been made about the need to provide students with good quality formative feedback, and informative reports about what they have achieved. The chimera of a perfectly integrated and functioning curriculum and assessment system has been pursued, but such ambition far outstretches systemic capacity; it is neither feasible nor desirable. The national testing and examination system in England is an exemplar case. As national results have improved, much evidence suggests that, if anything, actual standards of achievement are falling, and grade inflation is undermining public confidence in the whole system. The paper will review these issues and tensions, and argue that a different model for developing curriculum and assessment is urgently needed.

    Keywords: assessment, school reform, examination results, England

    1. INTRODUCTION¹

    Internationally, over the last 20-30 years, various developments have taken place in the field of curriculum and assessment that have led governments around the world to look to assessment policy and practice as a way of exerting pressure on their school systems. Changing the procedures and processes of assessment has come to be seen, by many educators as well as policy-makers, as a way to frame the curriculum and drive the reform of schooling. There is not a single explanation for how and why this has happened. Many tributaries have contributed to the current torrent of policy initiatives. But the unintended consequences, or at least I assume they are unintended, are becoming very apparent, and if they are not addressed then they are likely to undermine the validity and legitimacy of the whole enterprise.

    In this paper I review some of the general factors contributing to the current mainstream intellectual and policy consensus; explore some of the consequences of current practice; and identify key elements which must be changed in order to develop a curriculum and assessment system which is fit for purpose. The argument of the paper is that in the increasingly frantic search for a perfectly integrated and functioning assessment system, our ambition has far outstretched our capacity to deliver in terms of both what is feasible and what is desirable. The compromises that have been struck between educational aspiration and political purpose have distorted both, with very negative consequences for the educational experience of students and the credibility of the overall system. The first part of the paper briefly summarises where we are now with respect to both the educational and political arguments which focus on using assessment to reform schooling. Secondly, I examine evidence of the effects of such developments in England, including steadily rising pass rates in national tests and public examinations, but increasing scepticism about their validity and reliability, i.e. about what they actually mean in terms of educational standards. Finally, I go on to think about what can be salvaged and what has to change if the appropriateness and quality of schooling in the twenty-first century is to be improved.

    ISSN 0007-1005 (print)/ISSN 1467-8527 (online)
    © 2011 Society for Educational Studies

    2. HOW DID WE GET HERE?

    Changes in assessment do not simply arise from technical developments in the field, though these certainly contribute. Rather, such change reflects developments in the social and economic aspirations which we hold for the education system, and thus what it is that we are trying to design assessment to accomplish. This section of the paper briefly reviews the following issues and developments with respect to their diverse contributions to the current settlement:

    • selection, certification and norm referencing;
    • human resource development and education for all;
    • criterion referencing, clarity of outcomes and the development of content standards;
    • social justice and educational inclusion;
    • summative and formative assessment.

    Selection, Certification and Norm Referencing

    One of the most profound intellectual and policy shifts over the last 30 years or so has been the move from seeing only a small percentage of a student cohort as being both educable and worth educating, to seeing education as an investment in the social and economic potential of the whole cohort. We now take this aspiration for granted, at least at the level of policy rhetoric, but it was not always so: quite the reverse.

    Historically, education was a scarce commodity, access to educational opportunities was limited, and educational assessment was largely concerned with selecting individuals for those limited opportunities; for access to an elite secondary education for example, and access to university. In turn grades and certificates were awarded to individuals at the end of particular courses of study, as they progressed through the education system. So the focus of assessment was on identifying individual achievement, and particularly on selecting and certificating individuals. In so doing, this process functioned to identify, and legitimate on grounds of educational merit, the next cohort of suitably qualified and socialised personnel for economic and social leadership roles in society. Selection and certification was done by relatively small elite groups, of relatively small elite groups, for relatively small elite groups, and was underpinned by reference to the idea that innate intellectual ability was distributed along a normal distribution curve within a population. The most obvious example in England is the work of Cyril Burt in producing the intellectual justification for mental testing and in turn the 11+ selection test for grammar school entrance (Torrance, 1981).

    Selection, and grading for certification, produced the need for assessments to generate a rank order, with norm-referencing being used for such purposes. What mattered was where an individual's score came in relation to their peers, rather than any absolute level of achievement that it might signal in itself. Of course absolute levels of achievement were important in terms of determining grade boundaries, but such conceptions of achievement remained largely within the tacit knowledge of examiners and were not reported explicitly. What mattered publicly was the norm-referenced rank-order and grades awarded. Such practices were a product of their time, largely determined by the imperative to create a small social and economic elite, to lead and manage a largely unskilled manual workforce.

    Human Resource Development and Education for All

    Such times have changed. Without rehearsing all the changes in the international terms of trade, and the conditions of production, that have occurred over the last 30-40 years, it is apparent that we now live in a world of intense global economic competition with mass movements of capital and labour. Unskilled production has virtually vanished from the UK and other similar economies, and the emphasis now is on education for the so-called knowledge economy and as a form of investment in human capital. The focus is now on education for all, or at least the large majority, and the development of a fit-for-purpose assessment system as a system, i.e. as part of an integrated approach to national human resource development. The imperative now is to treat education as an economic investment, both on the part of the individual student, and on the part of government. Instead of needing a legitimate reason to dispense with the intellectual capabilities of most of the population, governments now need to cultivate these capabilities.

    Criterion Referencing, Clarity of Outcomes and the Development of Content Standards

    In parallel with such developments the need for assessment to produce more useful information about student achievement has also become apparent: useful information for teachers, for the students themselves, and also for other stakeholders such as parents, employers and government. Norm-referenced, rank ordered grades do not communicate what students have achieved and, over time, we have seen a move toward more criterion-referenced assessment. Initial interest in criterion-referencing derived from the development of mastery learning programmes and evaluation studies, in the 1960s and 1970s, which sought to delineate and identify what students should know and be able to do after following a particular course of study (e.g. Bloom, 1974; Ebel, 1972; Glaser, 1963). Such early work was very much internal, so to speak, to the curriculum development and evaluation research community, but the idea of reporting learning outcomes, rather than norm-referenced grades, became more widely disseminated as demands for utility and accountability developed in the 1970s and 1980s. Employers wanted more information about what school leavers could do and governments wanted more information about what the school system was producing. Moreover, demands also grew for the school system to produce different things: a wider range of more relevant skills and understandings for the knowledge economy. This in turn required a wider range of assessment methods to be developed to identify and report a wider range of learning outcomes (practical work, coursework and extended project work, for example) to test practical competences and the application, rather than simply the memorisation and regurgitation, of knowledge. Thus a concern for what we might term content standards, and the production of more useful information about what school students know, understand and can do, has merged with debates about how best to measure and report such content standards, and indeed enforce them.

    Social Justice and Educational Inclusion

    Various types of social justice arguments have also contributed to this nexus of change, partly linked to the human capital development arguments outlined above, but also partly driven by arguments about promoting social inclusion and social mobility through equal access to educational opportunities. Thus advocates argue that the majority of the population should not be abandoned to comparative, norm-referenced, failure. Rather, we need our assessment systems to identify and report what students can do, rather than what they cannot, including the many social and attitudinal outcomes of education which are just as important as academic outcomes (e.g. Broadfoot et al., 1988; ILEA Hargreaves Report, 1984). Thus we want our children not just to be able to do maths, or science, or whatever, but to enjoy them and understand their importance. Equally we wish to value achievements in other domains, including social and political understanding, and ensure that students can contribute to civil society. In tandem with such general arguments about widening the scope and inclusiveness of assessment have come specific technical developments incorporating graded tests, the modularisation of the curriculum and the possibility of accumulating better final results through the assessment of coursework and even re-sits of modular papers to improve grades (Murphy and Torrance, 1988 review some of the original advocacy for these sorts of developments; Hayward and McNichol, 2007 report some of the problems).


    Summative and Formative Assessment

    A more specifically educational variant of some of these arguments is manifested in the debate about summative and formative assessment, and the role that formative assessment could play in improving the quality and outcomes of teaching and learning. Summative assessment reports the outcomes of an assessment process. It largely takes place at the end of a course of study, and results are reported after the course of study has finished, to the student and to interested others. Even if the assessment has involved some form of criterion-referencing, and is informative and positive (all big ifs of course), reporting after the fact about what has been achieved leaves little scope for using such information for improvement. Moreover narrow forms of summative assessment, focusing largely on testing academic achievement, can have a very narrowing backwash effect on the curriculum and the quality of teaching and the student experience (as will be reviewed in more detail below). Advocates of formative assessment argue that using a wider range of classroom-based tasks to assess student progress, and providing good quality feedback to students during a course on what they have achieved but also how they might improve, can facilitate learning and improve outcomes. Many issues arise of course, with respect to the nature and quality of the feedback and the support provided for students (which, again, will be reviewed below) but for the purposes of this introductory discussion it is sufficient to note that this major educational aspiration and endeavour is played out in the context of, and plays into the development of, the wider debate about the purpose, validity and reliability of assessment.

    3. WHERE ARE WE NOW?

    So, a wide variety of interacting elements, deriving from long term social and economic change, and from educational arguments about the role of assessment in facilitating learning, seem to have produced the current consensus. It is not that there is some sort of simple progression here, such that norm referencing has been completely superseded, or that there is any particularly conscious orchestration and integration of these different elements. Rather, all elements are in play at the present time, but the major influences currently driving developments in curriculum and assessment derive from human capital theory coupled with the demand for clarity of objectives and the prescription of content standards.

    Thus governments around the world are looking to produce integrated curriculum and assessment systems to drive up standards, and they are supported by many educational advocates of greater integration. Perhaps the two most visible examples of change are the National Curriculum and Assessment system in England (DES, 1987), and the No Child Left Behind legislation in the United States (NCLB, 2001), now morphing into Obama's standards-oriented Race for the Top and the State-level Common Core Standards Initiative. Meanwhile, other countries are adopting similar programmes, including New Zealand, which has been developing national standards linked to a testing system since 2002 (NZQA, 2011), and Australia (cf. Wyatt-Smith et al., 2010).


    A key problem, however, is that there remains a schism between the educational arguments for changes in assessment to enhance learning, and the policy demands for school improvement and accountability. This schism produces tensions in developing an integrated system, and can result in significant unintended consequences. The arguments in favour of using assessment to change teaching essentially fall into two related, but nevertheless distinct, categories. One argument derives from educational issues and values, the other is much more oriented towards accountability and the use of political pressure to bring about change.

    The educational arguments revolve around the role that assessment plays in determining the curriculum, using the so-called backwash effect of assessment, noted earlier, in a positive way; as Resnick and Resnick (1992) have put it:

    You get what you assess; you don't get what you don't assess; you should build assessment towards what you want . . . to teach . . . (p. 59)

    This is very much the thinking that influenced the Measurement-Driven Instruction movement in the USA in the late 1980s and 1990s. Put desired objectives into testing programmes and teachers will teach those desired objectives (Airasian, 1988; Popham, 1987). Thus the intention is to use changes in assessment directly to influence curriculum content and the process of teaching and learning. More recently such arguments have developed to incorporate the notion of a standards-based curriculum, whereby standards are set in terms of curriculum content and achievement levels and tests are aligned with the curriculum to reinforce the teaching of those standards and to measure whether and to what extent such standards have been achieved.

    A rather more complex interpretation of the same broad insight focuses much more at classroom level and on the quality of teacher-student interaction. Thus, to reiterate, it is also recognised that routine, informal assessment can play a key role in underpinning or undermining the quality of teaching and learning in the classroom. How teachers assess students' work, what sort of positive or negative feedback is given, and whether or not advice on how to improve is provided can make a great deal of difference to what is learned and how it is learned. This is the thinking which underlies the formative assessment movement in England, where such approaches are perhaps most developed (Black and Wiliam, 1998; Black et al., 2006; Torrance and Pryor, 1998), though it has also been very influential in Australia and New Zealand (Cowie and Bell, 1999; Sadler, 1989, 1998; Wyatt-Smith et al., 2010) and acknowledged as potentially important for developments in Hong Kong, the USA and elsewhere (Carless, 2006; Hargreaves, 2007; Shepard, 2000).

    The political accountability arguments for using assessment to drive school reform are much simpler and more clear cut. Here the claim is that education systems in general, schools in particular, must have their efficiency and effectiveness measured by the outcomes produced. Expected standards of achievement must be prescribed and tests regularly employed to identify whether or not these expectations have been met. In publicly maintained school systems such prescription is controlled by government and the quality of teaching and learning in the classroom is assumed to rise if results improve. Essentially, in this model, testing is used as a lever to affect the system qua system; the detail at classroom level is assumed to look after itself. If results are improving, the quality of students' educational experience and achievement is assumed to be improving. However, just as with measurement-driven instruction or a standards-based curriculum, it is crucial to the logic and practice of such an accountability system that the tests employed do indeed genuinely sample the curriculum, and reliably measure student achievement. The tests must be valid indicators of quality across the system as a whole; otherwise they will drive the system in the wrong direction.

    In England, we are very much dealing with this politically driven, accountability-oriented analysis of the nature of the problem of educational standards and what to do about them; though some elements of the arguments about standards-based instruction and formative assessment, or assessment for learning as it is now more commonly known, also feature in debate. In this respect we can note once again that change in the education system is unlikely to occur only as a result of educational arguments, or indeed simply to comply with government pressure and legislation. It is the interaction of the two which produces particular practices at particular points in time. In the systemic social and institutional space of education, educational arguments are likely to be modified and adapted to fit the prevailing political context, while at one and the same time such arguments are deployed in policy debates in order to increase the rhetorical and symbolic legitimacy of policy and to mobilise action within local educational contexts and institutions.

    It is this interaction of policy and educational aspiration that seems to have produced the current educational orthodoxy of trying to combine formative approaches to classroom assessment with large-scale summative accountability systems. The aspiration to combine formative and summative assessment in a single system was first articulated in the Task Group on Assessment and Testing Report (The TGAT Report, DES, 1987), which provided the educational rationale for the original national system of testing. The report argued that:

    It is possible to build up a comprehensive picture of the overall achievements of a pupil by aggregating, in a structured way, the separate results of a set of assessments designed to serve formative purposes. (para. 25)

    The TGAT Report was also known as the Black Report since Paul Black chaired the Task Group, and he has been a significant advocate of merging formative and summative assessment over the intervening years (e.g. Black, 1998).

    More recently, Shavelson et al. (2005), reviewing developments in England, Australia and the United States, argue that:

    The potential for summative and formative assessment to work at cross purposes . . . is enormous. However . . . [i]f left in conflict the summative function will overpower the formative and . . . [t]he goal of teaching . . . becomes improving scores on these tests. (p. 7)


    Gilmore (2002), reviewing the National Education Monitoring Project in New Zealand (NEMP), argues:

    The vision that underlies large-scale assessment programmes . . . provides dependable assessment information for accountability (evaluative) purposes while at the same time supporting and sustaining exemplary teaching and learning. (p. 345)

    Bennett (2011), in a substantial review of current international perspectives, concludes that:

    The effectiveness of formative assessment will be limited by the nature of the larger system in which it is embedded . . .. Ultimately we have to change the system . . . if we want to have maximum impact on learning and instruction. [This] means remaking our accountability tests . . .[ ]. . . we have to rethink assessment from the ground up as a coherent system. (pp. 19-20)

    Thus the rationale seems to be that accountability is here to stay; summative assessment will always drive out formative assessment if they are set in opposition to one another, therefore we need to splice them together in an attempt to create the perfect chimera, the perfect genetically modified assessment system. In this ambition our reach has far outstretched our grasp. The understandable felt need of educators to pursue their educational aspirations in the context of particular policy demands has produced a tendency towards making over-ambitious claims for what can be accomplished. And, certainly in England, the system that has been produced is starting to collapse under its own weight.

    4. WHAT IS THE IMPACT OF THESE CHANGES? POLICY AND PRACTICE IN ENGLAND

    I now move on to provide some illustrations of the problems that have become apparent in England, whereby the very improvements in results that the creation of a national system has brought about have arguably been accomplished by too much coaching and practising for the tests, and are now undermining the public credibility of the whole enterprise. England is chosen for illustrative purposes because in many respects it represents the paradigm case of the sorts of trends we have seen over the last 30 years or so. Moreover England now has over 20 years of experience of developing a national curriculum and assessment system so any initial teething troubles should have been long overcome. If any problems remain (and they do) then it is likely that they are intractable and require a different approach.

    England has a statutory National Curriculum and Testing system, introduced in 1988 (DES, 1987), which tests all students in a cohort at regular intervals. Originally all students were tested at age seven in English and Maths, at ages 11 and 14 in English, Maths and Science, and sat national public examinations, the General Certificate of Secondary Education (GCSE), at age 16. Currently all students are tested only at 11, with GCSE retained at 16. Since 2005 classroom-based teacher assessment has been used to report results at age seven, while tests at 14 were abolished in 2009 following a fiasco of lost papers, unmarked papers and wrongly marked papers, which demonstrated just how overwhelming testing whole cohorts had become. Thus two stages in the process have been dropped as the undesirability of testing very young children, and the impossibility of maintaining any pretence of quality and reliability in such a mass system, became apparent. England also retains a subject-based examination system at the formal secondary school leaving age of 16 years, and a subject-based Advanced level (A-level) qualification, normally taken at 18 years of age, for entrance to university. So, over the last 25 years or so, if English policy makers could squeeze something into the assessment system, they did, though subsequently elements of what they squeezed in popped out again, as it transpired that there was not enough space in the system for all the testing that was envisaged. Two full levels of whole-cohort testing at ages seven and 14 have been dropped in an attempt to ameliorate the worst effects of testing.

    The explicit use of assessment to drive educational change in England dates back to the introduction of a single system of secondary school examinations, the General Certificate of Secondary Education (GCSE), for 16-year-olds (the minimum school leaving age), by the then Conservative government of Margaret Thatcher in 1986 (with the first new exams taken in 1988). In the 1960s and 1970s England operated with two parallel secondary school examination systems: GCE O-level² for those students considered to be in the top 20 per cent of the ability range; and CSE³ for those considered to be in the next 40 per cent of the ability range. The bottom 40 per cent were not considered capable of taking examinations at all. Selection of the top 20 per cent for entry to grammar schools was based on the 11+ intelligence test and, overall, such a selective system represented the epitome of a norm-referenced system. The creation of a single system of examining, GCSE, in the mid-1980s, might be said to mark the point at which governments in England fully began to buy into human capital theory and treat education as an investment in the population as a whole, rather than as a way to select a social and economic elite. Of course education, and particularly assessment, still does play a major role in selecting and legitimating the selection of a social and economic elite, but it is at least now arguable that this outcome is an unintended effect of policy, rather than an overt intention.

    One effect of selectivity was that, precisely because it was not thought appro-

    priate for all children to take secondary school examinations, there were no overall

    data in the system about how well schools were doing and what standards were

    being attained across the system as a whole. Moreover, the selection test for sec-

    ondary school allocation (the 11+) could only provide evidence that 80 per cent

    of pupils failed their 11+ and therefore failed primary education. Even this

    test was largely phased out with the introduction of comprehensive secondary

    schools, so that by the mid-1980s there were virtually no data whatsoever on

    the output of primary schools. The Labour government of the 1970s launched

    the Assessment of Performance Unit (APU) to try to provide evidence of stan-

    dards achieved. However, the APU did not provide unequivocal and easily usable

    evidence about national standards (Gipps and Goldstein, 1983); nor, because of its

    sampling strategy, could it reach into and influence every classroom. First GCSE

    (1986), then the National Curriculum and National Testing (1988) were introduced

    in order to control directly what was taught and how it was taught, and to measure

    whether or not it was being taught effectively.

    There has been extensive detailed argument about the scale and scope of the

    National Curriculum and Testing system and many modifications have been put in

    place since it was first introduced (cf. Daugherty, 1995; Torrance, 1995, 2003).

    However, the issue of educational accountability has remained the key policy

    problem for over 20 years now, and the development of a standards-based, test-

    driven education system has remained the key policy solution up to the present.

    The commitment to a testing regime has remained completely taken-for-granted,
    as elements of policy have been built up, layer by layer, and then stripped away

    again, in successive attempts by both Conservative and Labour governments to try

    finally to realise the vision of an integrated national curriculum and testing system

    and render it operational. So we have had 20+ years of a natural experiment with

    the educational provision of our children.

    5. IMPACT ON RESULTS: ARE STANDARDS RISING?

    Before moving on to review some of the detail of the system in operation, and
    some of the educational consequences of this experiment, I will review some of

    the results produced so we have a sense of what this move from a selective to

    a mass system actually looks like. The increasingly feverish activity of govern-

    ments of both complexions since the mid-1990s, both with respect to the National

    Curriculum and testing regime, and with respect to GCSE and A-level, indicates

    that educational standards are still considered to be a political issue about which

    something must be done, or at least be seen to be done.

    National Curriculum test scores have risen since national testing was first

    introduced but have plateaued since around 2000 and insofar as they indicate
    anything meaningful about educational standards this suggests that progress

    in primary education has stalled, or appears to have stalled. However, figures

    for GCSE and A-level indicate that examination scores have risen consistently

    since the 1970s, irrespective of which government is in power or which spe-

    cific curriculum interventions have been pursued. These results tend to indicate

    that it is the general trend towards human resource development and crite-

    rion referencing, combined with the general pressure to succeed, that has seen

    scores rise.

    To take national test results first, at age seven, in Table 1 and Figure 1, we
    can see that results started high, improved a little, and have stayed high, but there
    remains a stubborn 15–20 per cent or so of children who are not reaching level 2,
    the expected level, in maths and English by the age of seven. At age 11 (Table 1
    and Figure 2) the results start low, rapidly improve, but again, have plateaued since
    2000 with around 20 per cent of children not achieving the expected level, level 4,
    in maths or English.

    TABLE 1: Percentage of pupils gaining National Curriculum assessment level 2
    or above at age seven and level 4 or above at age 11, England

            Age 7             Age 11
    Year    Eng      Maths    Eng    Maths    Science
    1992    77       78       -      -        -
    1995    76       78       48     44       -
    1996    80       80       58     54       62
    1997    81/80    83       63     61       69
    2000    81/84    90       75     72       85
    2002    84/82    90       75     73       86
    2005    85/82    91       79     75       86
    2007    84/80    90       80     77       88
    2008    85/80    90       81     78       88
    2009    84/81    89       80     78       88
    2010    85/81    89       81     80       85

    Source: http://www.dcsf.gov.uk/rsgateway/
    Notes:
    1992 first full run of KS1 tests; 1995 first full run of KS2 English and Maths; 1996 first
    full run of KS2 Science.
    1997: New Labour government elected. KS1 English results now being reported separately in
    terms of attainment targets (81 per cent gained level 2 in Reading, 80 per cent in Writing). Such
    details had been available previously but results were routinely reported as whole subject
    levels. The Writing score is averaged across the writing test and the spelling test. The scores
    are averaged in Figure 1.
    2005: KS1 tests now conducted as teacher assessment so results 2005–2010 are no longer
    directly comparable with previous results at KS1, though interestingly, teacher assessment does
    not seem to diverge from the established trend in test results.
    2010: Results for Age 11 Science now derive from teacher assessment only; a national sample of 5
    per cent took tests with 81 per cent reaching level 4. NB 2010 also saw c. 25 per cent of primary
    schools boycotting the English and Maths tests (c. 4000 schools), leading to government
    agreement to review national testing for the future.

    Not every year's results are recorded in Table 1 and Figures 1 and 2; rather, sufficient
    years are recorded to indicate trends over time along with key dates which

    government has variously used and dropped as indicators of progress. Progress

    since 1997 was the measure routinely deployed by the New Labour government

    at national level. And at first sight progress since 1997 seems significant. But

    closer scrutiny indicates significant improvements occurred in results prior to

    1997. Thus, for example, in the first two years after National Testing was first
    introduced at KS2 under a Conservative government (1995–1997) results improved by
    15 percentage points in English (from 48 per cent to 63 per cent) and 17 percentage
    points in maths (from 44 per cent to 61 per cent). In the ten years after 1997,
    results improved by 16 percentage points in English (63 per cent to 79 per cent)
    and 14 percentage points in maths (61 per cent to 75 per cent; 1997–2006), but
    with most of this improvement being achieved by 2000. The plateau effect since
    2000 has continued through to the most recent results available in 2010.

    Figure 1. Percentage of pupils gaining National Curriculum Assessment level 2 or above at
    age 7 (KS1), England [series: Age 7 English, Age 7 Maths; y-axis 0–100 per cent; x-axis 1992–2010]

    Figure 2. Percentage of pupils gaining National Curriculum Assessment level 4 or above at
    age 11 (KS2), England [series: Age 11 English, Age 11 Maths, Age 11 Science; y-axis 0–100 per cent; x-axis 1992–2010]
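
    The percentage-point comparisons quoted above for KS2 can be checked with a few lines of
    arithmetic. The following is only an illustrative sketch, not part of the original analysis:
    the names KS2_LEVEL4, point_change and points_per_year are hypothetical, the figures are
    those quoted in the text, and keying the end-of-period values to 2006 simply follows the
    "1997–2006" range given there.

```python
# Percentage of pupils reaching level 4 at KS2, as quoted in the text.
# Assumption: the end-of-period figures (79 and 75) are keyed to 2006,
# following the "1997-2006" range given in the text.
KS2_LEVEL4 = {
    "English": {1995: 48, 1997: 63, 2006: 79},
    "Maths": {1995: 44, 1997: 61, 2006: 75},
}


def point_change(series, start, end):
    """Return the percentage-point change between two recorded years."""
    return series[end] - series[start]


def points_per_year(series, start, end):
    """Return the average percentage-point change per year over the period."""
    return point_change(series, start, end) / (end - start)


if __name__ == "__main__":
    for subject, series in KS2_LEVEL4.items():
        for start, end in ((1995, 1997), (1997, 2006)):
            print(
                f"{subject} {start}-{end}: "
                f"{point_change(series, start, end)} points, "
                f"{points_per_year(series, start, end):.1f} per year"
            )
```

    On these figures the gain in the first two years is of a similar size to the gain over
    the whole of the following period, which is the point the comparison is making.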

    One inference we might take from these figures, especially with respect to

    results at KS2 (age 11), is that the introduction of National Testing constituted

    a major perturbation in the primary school system such that teachers were left

    initially deskilled by the innovation, so results started low; but results rapidly

    improved as teachers and students came to understand what was required of them,

    in terms of test preparation, and then progress tailed off as the limits of such
    artificial improvement were reached.

    Results for GCSE at age 16 are rather different but perhaps even more instructive
    (Table 2 and Figure 3). They have been rising steadily since the exam was
    first introduced in 1988 and indeed were rising prior to its introduction. In the
    mid-1970s, when only the top 20 per cent of students were thought capable of
    passing O-level, the percentage of students passing at least five O-levels or their
    equivalent under the previous dual system was 22.6 per cent.[4] By 1988, the first
    year of GCSE results, this had risen to 29.9 per cent. By the mid-1990s this had
    risen further to 43.5 per cent and the most recent results for 2010 indicate that
    75 per cent of students now pass five or more GCSEs or their equivalent at grades
    A*–C.

    TABLE 2: Percentage of pupils gaining O-level/CSE grade 1/GCSE and Equivalents
    1975–2010, England

    Year    Percentage 5 or more A*–C    Percentage 5 or more A*–G
    1975    22.6                         58.6
    1980    24.0                         69.0
    1988    29.9                         74.7
    1990    34.5                         80.3
    1995    43.5                         85.7
    1997    45.1                         86.4
    2000    49.2                         88.9
    2005    56.8                         89.9
    2007    61.4                         90.9
    2008    65.3                         91.6
    2009    70.0                         92.3
    2010    75.4                         92.8

    Source: Torrance (2003) and time series 1996–2010 available at:
    http://www.education.gov.uk/rsgateway/DB/SFR/s000985/sfr01-2011.pdf (accessed 5 September 2011).
    NB R&S update and revise pass rates so these figures may vary by fractions of a percentage from those
    published in previous years; the figures recorded here are the most recent posted on the DoE
    website. For details of calculating equivalence between O-level, CSE and GCSE see Torrance (2003).
    It should also be noted that DCSF/DfE Research and Statistics report totals including GCSEs
    and equivalents, including KS4 students taking iGCSEs and vocational GCSEs. The totals
    therefore are slightly higher than the headline GCSE pass rate that is announced each autumn
    but this does not affect the thrust of the argument.

    Figure 3. Percentage of pupils gaining O-level/CSE grade 1/GCSE and equivalents 1975–2010,
    England [series: % 5 or more A*–C grades, % 5 or more A*–G grades; y-axis 0–100 per cent]

    That is, 75 per cent of the school population now achieve what 30 years ago it

    was thought only the top 20 per cent could achieve. Furthermore, taking the full

    range of grades into account (A*–G), as an indicator of the numbers of students
    gaining at least some benefit from their secondary education, almost 60 per cent
    gained at least five A*–G grades in 1975, while nearly 93 per cent achieved five
    A*–G in 2010.

    So from the published statistics in the public domain the evidence is that pass

    rates have been steadily improving over many years. And, on the face of it, this

    represents an absolute transformation of what the system is achieving, compared

    to 30 years ago.

    Of course, overall pass rates conceal other issues. Within these general trends

    different sub-groups perform better than others, and results vary by social class,
    gender and race. Thus, for example, 87.5 per cent of candidates of Chinese origin
    gained at least five A*–Cs in 2009 (n=2,275), while only 67 per cent of candidates
    of Black African and Caribbean origin did so (n=23,609), though this in
    itself represents a very significant improvement from the 35 per cent recorded in
    the early 2000s (Torrance, 2005). Amongst candidates of White British origin,
    69.8 per cent passed at least five A*–Cs in 2009 (n=461,445). A recent Joseph

    Rowntree-sponsored study indicates that poor, working-class white boys do worst

    of all (Cassen and Kingdon, 2007).

    There is not space here to explore such differential pass rates in more detail,
    but, taken together, the results of National Curriculum Tests and GCSE indicate
    that there is a major bifurcation developing between those students who are doing
    well and a substantial minority of perhaps 20–25 per cent of children who are not
    riding the rising tide of results. Clearly this raises major political and educational
    issues, as the system is driven more and more by the pursuit of examination success
    for the majority, and de facto attends less and less to the needs of those who,
    for whatever reason, do not fit into the model. Overall, however, for the purposes
    of the present discussion, the key point is that GCSE pass rates have been rising
    in England for more than 30 years.

    A-level pass rates have also been rising in similar fashion over the last 30 years

    (Table 3 and Figure 4). Originally A-level examinations were designed to qualify

    and select applicants for entrance to university. They were designed to be taken by

    a minority of the minority thought to be capable of benefiting from an academic

    secondary education. Today A-levels taken at age 18 are starting to replace GCSE

    as the marker of a successfully completed secondary education.

    Again, not every year is recorded in Table 3 and Figure 4, but rather those

    that indicate trends over time, and particularly over the last ten years when already

    good results still show an incremental improvement year on year. The key points

    to note include steadily rising numbers of passes and top grades over 30 years;
    a particular blip around 2000–2001 when major changes to the structure of A-level
    were introduced (Curriculum 2000), but, once accommodated, the steadily rising
    pass rate was resumed and perhaps slightly accelerated; as was the shift towards
    awarding of top grades. In 1980 only nine per cent of entries gained grade A, and
    from a significantly smaller number of overall entries (n=567,027), while in 2010
    27 per cent of entries were awarded grade As, from a total entry of 784,877.

    TABLE 3: A-level pass rates 1980–2010, England

    Year    Entries    Percent A grades    Percent all grades A–E
    1980    567,027    8.8                 67.8
    1985    -          9.8                 70.2
    1990    -          12.0                76.7
    1995    -          15.6                84.0
    2000    606,995    18.8                90.6
    2001    -          18.3                89.6
    2002    -          20.3                94.1
    2003    -          21.3                95.3
    2004    -          22.1                95.9
    2005    717,127    22.4                96.2
    2006    -          23.8                96.5
    2007    -          25.0                96.9
    2008    -          25.6                97.2
    2009    -          26.5                97.5
    2010    784,877    26.8                97.6

    Sources:
    Department of Education and Science (1980) Statistics of School Leavers, CSE and GCE, England.
    Department of Education and Science (1985) Statistics of Education, School Leavers, CSE and
    GCE England.
    Daily Telegraph 14 August 2008 accessed via website http://www.telegraph.co.uk 4 October 2010.
    Times Educational Supplement 17 August 1990 p. 3.
    Times Educational Supplement 18 August 1995 p. 5.
    Department for Education and Employment (2000) Statistics of Education: GCSE/GNVQ and
    GCE A/AS Level and Advanced GNVQ Examination Results 1999/2000 England accessed from
    DCSF Research and Statistics website: http://www.dcsf.gov.uk/rsgateway 4 October 2010.
    All further results, 2001–2010, from Joint Council for Qualifications website
    http://www.jcq.org.uk/ accessed 4 October 2010.
    It might be noted in passing that it is remarkably difficult to gain access to A-level pass
    rates pre-2001 when they first started appearing on the JCQ website. So far as I can discern there is no
    single source of results which goes back to 1980 or beyond. Government statistics have been produced
    in very different ways over the period, to address whatever was the key policy concern of the day. Thus
    this table has been produced by extensive mining of different sources.
    Notes:
    First results for Curriculum 2000 including modular A/S levels.
    2010: First use of A* and A grades: A* = 8.1 per cent, A = 18.7 per cent = 26.8 per cent total A* + A
    (NB c. 14 per cent of all candidates achieved straight As in 2010, i.e. 3 x A* or A grades, up from c.
    7 per cent in 2000).

    Figure 5 illustrates the change very dramatically. Here the diamonds line indi-

    cates the grade distribution in 1980, with the majority of passing grades being C,

    D and E, and skewed towards the bottom end. The squares line indicates grade

    distribution in 2010, and the graph is reversed, with most passing grades being C,
    B and A.

    Figure 4. Percentage A-level passes, 1980–2010, England. 1980: n = 567,027; 2010:
    n = 784,877 [series: % A grades, % all pass grades (A–E); y-axis 0–100 per cent]

    Figure 5. Percentage distribution of A-level grades, E–A, 1980 and 2010, England
    [series: 1980, 2010; x-axis grades E, D, C, B, A; y-axis 0–30 per cent]

    So what might be the explanation for these long term trends? Here we return
    to the points made in the opening section of the paper. Some element of a genuine
    rise in standards is likely to be present, driven by better socio-economic conditions

    of students, higher expectations of educational outcomes by students, parents and

    teachers, and better teaching. But this is combined with and compounded by two

    key elements of the changes which have taken place in systems of assessment:

    (i) an increasingly more focused concentration on passing exams, by both

    teachers (teaching to the test) and the majority of students (extrinsic

    motivation), because of the perceived importance of educational success

    in institutional accountability and individual life chances;

    (ii) the increased transparency of modular, criterion-referenced assessment
    systems, which affords teachers and students much more opportunity to
    practise for tests and improve grades through coaching, specific feedback

    and resubmission of work.

    6. IMPACT ON EDUCATIONAL EXPERIENCE AND QUALITY

    In relation to National Curriculum test scores, the evidence in England suggests

    that teaching to the test is the most likely recent explanation for rising scores

    which tail off as teachers and students come to be about as efficient as they can
    be at scoring well on the tests within a regime of coaching and practice. Many

    research studies have reported an increasing focus on test preparation, particularly

    in the final year of primary school prior to the tests being taken (for a review of

    recent studies see Wyse and Torrance, 2009). Thus, for example, McNess et al.

    (2001) note that:

    Whole class teaching and individual pupil work increased at the expense of group
    work . . . [there was] a noticeable increase in the time spent on the core subjects . . .
    [and] teachers . . . put time aside for revision and mock tests . . . (pp. 12–13)

    While Hall et al. (2004) report that:

    assessment is synonymous with testing . . . assessment, narrowed to test-taking in
    preparation for SATs, is the main business of life in the last two terms of year six.
    (p. 804)

    However, it is not only independent research studies which highlight such prob-

    lems. School inspectors who routinely visit schools on a regular basis have

    reported on a narrowing of the curriculum and summaries of their inspection findings
    have been included in the annual reports from the Office for Standards in
    Education (OfSTED). One recent report noted:

    In many [primary] schools the focus of the teaching of English is on those parts of
    the curriculum on which there are likely to be questions in national tests . . . History
    and, more so, geography continued to be marginalized . . . In [secondary] schools . . .
    the experience of English had become narrower . . . as teachers focused on tests and
    examinations . . . There was a similar tension in mathematics . . . (OfSTED, 2006,
    pp. 52–56)

    Such concerns have also now reached Parliament, with a report of the Select

    Committee for Children, Schools and Families (2008) stating:

    In an effort to drive up national standards, too much emphasis has been placed on a
    single set of tests and this has been to the detriment of some aspects of the curriculum
    and some students. (Select Committee, reported on BBC 13 May 2008:
    http://news.bbc.co.uk/1/hi/education/7396623.stm)

    And finally, after a boycott of KS2 tests in 2010 by about 25 per cent of primary

    schools, the Bew Committee was set up by the new Conservative and Liberal

    Democrat coalition government to review KS2 testing, reporting in June 2011.

    Their terms of reference included:

    how to avoid, as far as possible, the risk of perverse incentives, over-rehearsal and
    reduced focus on productive learning. (Bew Report, 2011, p. 4)

    The report notes:

    There are considerable concerns . . . that the system is too high stakes, which can
    lead to unintended consequences such as over-rehearsal and teaching to the test.
    (p. 9)

    Comparable evidence can also be identified internationally. Many research studies

    from the United States (e.g. Klein et al., 2000; Linn, 2000; Shepard, 1990) report

    similar findings from previous investigations of test-based reform in the USA, and

    the same issues are now beginning to emerge from studies of the No Child Left

    Behind program. State level NCLB test scores are rising (CEP, 2007), but equally
    'Administrators and teachers have made a concerted effort to align curriculum
    and instruction with state academic standards and assessments' (CEP, 2006, p. 1).
    A recently completed study by Rand Education funded by the US National Science
    Foundation noted that:

    changes included a narrowing of the curriculum and instruction toward tested topics
    and even toward certain problems styles or formats. Teachers also reported focusing
    more on students near the proficient cut-score . . . (Hamilton et al., 2007, Summary:
    p. xix)

    So overall, the international research evidence suggests that rising test scores

    might actually mask falling standards as students are exposed to a much restricted

    curriculum.

    Similar issues with regard to narrowing the curriculum have been reported

    with respect to GCSE in England. For the purposes of government statistics and

    the compilation of league tables of secondary schools the passing grade for

    GCSE is grade C, and schools are under enormous pressure to maximise the numbers
    of students passing at least five A*–C grades. As noted above, OfSTED has

    observed a focus on exam preparation in secondary schools as well as primary

    schools, and research by Gillborn and Youdell (2006) has reported schools focus-

    ing particularly on identifying and supporting those students who could possibly

    be moved from a D to a C grade, GCSE 'triage' as they have termed it, a practice
    very similar to that noted by the US Rand Report cited above. Taken together with
    our earlier observations about a significant minority of children not fitting in with
    the model, this means that perhaps 20–25 per cent of children are increasingly

    ignored as they progress through the school system, if they are not thought worthy

    of such triage investment.

    At A-level, research undertaken as part of a Nuffield Foundation sponsored

    review of the 14–19 curriculum in England identified modularisation and the
    re-taking of modular tests as a key issue for pass rates in science (Hayward and

    McNichol, 2007). My own research investigating assessment across the post-compulsory sector including A-level has identified transparency of procedures,

    objectives and criteria as a key issue which, combined with the detailed feedback

    which teachers give students, has led to a situation which I have characterised as

    criteria compliance:

    . . . greater transparency of intended learning outcomes and the criteria by which
    they are judged, and . . . [c]larity in assessment procedures [and] processes . . . has
    underpinned the widespread use of coaching, practice and provision of formative
    feedback to boost individual and institutional achievement . . . . However . . . such
    transparency encourages instrumentalism . . . transparency of objectives coupled
    with extensive use of coaching and practice to help learners meet them is in danger
    of removing the challenge of learning and reducing the quality and validity of
    outcomes achieved. This might be characterized as a move from assessment of
    learning, through the currently popular idea of assessment for learning, to assessment
    as learning, where assessment procedures and practices come completely to dominate
    the learning experience, and criteria compliance comes to replace learning.
    (Torrance, 2007, p. 282)

    So, potentially, we have reached a situation in England where scores and grades

    are continuing to rise but the validity and reliability of the standards achieved

    are subject to increasing doubt, and the educational experience of even the most
    successful students, let alone those who are not successful, is compromised.

    Employers and university selectors alike are expressing concern about the

    quality and credibility of GCSE and A-level grades (e.g. Sykes Review, 2010) and

    this in turn is drawing responses from the Examination Boards (e.g. Cambridge

    Assessment, 2010, 2011). A recent review commissioned by the Conservative

    Party, the Sykes Review, reported that:

    Confidence in the qualifications and assessment system has been diminishing for
    many years. The usefulness of the system has been eroded by the politicisation of
    assessment outcomes, by universities' loss of confidence in A levels as a certificate of
    readiness for university-level study, by employers' loss of confidence in GCSEs and
    A levels as certification of relevant knowledge and skills, and by the disproportionate
    burden placed by external assessment on pupils, teachers and schools. The volume
    of external assessment has also grown enormously . . . . This process has undermined
    the credibility of teacher and school assessment, as well as limiting and undermining
    teaching. (Sykes Review, 2010, p. 4)

    Now it might be argued that it was ever thus: employers and indeed examiners
    have often complained that standards are not high enough. Also, to reiterate, the
    Sykes Review was set up to report to the Conservative Party when in opposition,
    and might thus be thought of as rather partisan. However, the Review Committee

    membership was very broad, and actually produced findings not that far removed

    from OFSTED and the 2008 Parliamentary Select Committee. The more interest-

    ing point to note is how widespread is the concern across the political spectrum.

    It is also interesting to note the time lag between the research findings of the late

    1990s and early 2000s (the work of McNess, Hall and others, cited earlier) and

    the impact on policy thinking some ten years later. We could have avoided ten

    years of this progressive narrowing of the curriculum if politicians had been in a

    position to hear and respond to the research literature.

    Successive research studies over 40+ years have indicated that it is the vitality
    of teacher–student relationships and the quality of teacher–student interaction

    that are the most important factors in improving student learning experiences

    and raising attainment (Galton et al., 1980; Galton et al., 1999; Jackson, 1968;

    Mehan, 1979; Mercer, 1995). Yet this is precisely what is threatened by an over-concentration
    on testing. The focus of the teacher–student relationship is currently

    oriented towards criteria compliance and grade accumulation, rather than learning.

    It is almost as if successive governments in England have taken an actuarial view

    of the role of rising pass rates. Examination success is correlated with social and

    economic success, so, the policy thinking seems to be, maximise exam passes,

    especially for previously disadvantaged groups, and social and economic mobility

    will necessarily improve, irrespective of the educational experience of the students

    or the quality of the outcomes achieved.

    In focusing on assessment, standards and accountability to drive improvements

    in schooling, policymakers seem to have lost sight of the purpose of education and

    the nature of individual achievement, of what it is that standards are supposed to
    embody in terms of the knowledge, skills, attitudes and competences that we might
    expect of young people leaving school in the twenty-first century. It is almost as

    if the unit of analysis for policy-making, so to speak, has shifted from the cur-

    riculum and the building blocks of individual student learning and achievement,

    to the overall output of the system. Individual achievement and, more particularly,

    the content and quality of that achievement (the traditional focus of assessment)

    has been ignored, or at least taken for granted, as policy has focused on raising test

    scores across the system as a whole.

    Meanwhile, many educationists and assessment developers have been content
    to ride the tiger of accountability, and take the policy context as given, in
    order to try to insinuate their own version of measurement driven instruction or

    assessment for learning into the system. However, assessment processes cannot

    be so easily assimilated into a single, integrated, systemic operation. Three issues

    in particular are apparent:

    (i) just because assessment can be observed to have negative backwash effects

    on the curriculum and teaching, this doesn't necessarily mean that the
    same mechanism is available to harness these effects to beneficial purposes;
    certainly this is proving far more difficult than seems to have been

    anticipated;

    (ii) the impact of assessment for learning on students' knowledge and understanding
    will inevitably be mediated by the accountability context in which

    it operates; thus students are currently learning to accumulate grades,

    rather than understand the structure, coherence and content of particular

    knowledge domains;

    (iii) criterion-referencing enables the structure of knowledge domains and the processes of assessment associated with them to be more transparent, such that more students can achieve more success, but the very nature of that success undermines the selection function of assessment and thus, in turn, the credibility of what is achieved.


    Education has moved from being a scarce good to a positional good, whereby all cannot succeed in terms of publicly reported grades without the nature of the success being called into question. This is now a key and urgent issue: to improve educational quality and outcomes without simultaneously undermining the credibility of the enterprise. Equally, we have to attend to the 20–25 per cent of students that the accountability model is leaving adrift.

    7. WHERE DO WE GO FROM HERE? DEVELOPING EDUCATIONAL QUALITY

    We are, however, where we are, and the irony of the current situation, certainly in England, is that a narrow test-driven accountability system is unlikely to produce flexible and creative workers for the so-called knowledge economy, even amongst the 75 per cent who are currently seen as successful, let alone the 25 per cent who are not. Nor are such systems likely to contribute to more general deliberation about the curriculum, teacher development and the democratic purposes and organisation of schooling. Quite the reverse: accountability systems take these matters of purpose as settled, the only issues being efficiency and effectiveness in meeting already determined goals.

    But it is becoming apparent that a one-size-fits-all standards agenda in educational policymaking has run its course and that what is needed is a new focus on curriculum content, local responsiveness and the quality of teaching and learning: on the production of diverse experiences of learning for an uncertain world.

    8. POLICY OPTIONS FOR THE FUTURE

    There are no simple or straightforward policy options with respect to assessment

    and testing. Assessment intersects with every aspect of an educational system:

    • at the level of the individual student and teacher and their various experiences (positive or negative) of the assessment process;

    • at the level of the school or similar educational institution and how it is organised and held to account; and

    • at the level of the educational and social system with respect to what knowledge is endorsed and which people are legitimately accredited for future economic and social leadership.

    Governments in England over the last 20 years or so seem to have only partially appreciated this, assuming that standards can be mandated and measured without the process of measurement impacting on and, in key respects, distorting the system as a whole. Similarly, because educational achievement is correlated with social and economic well-being, the efforts of successive governments in England seem to have concentrated on pushing as many children as possible through as many examinations as possible. This seems to have been conceived of as part of the drive for social inclusion and improving social mobility, without reflecting on


    re-conceptualise the integration of curriculum development and

    assessment by starting from the perspective of the curriculum: i.e. put

    resources and support into re-thinking curriculum goals for the twenty-first

    century and developing illustrative examples of high quality assessment

    tasks that underpin and reinforce these goals, for teachers to use as

    appropriate.

    The Bew Report (2011) on Key Stage 2 (KS2) testing indicates that statutory testing will be further restricted, particularly with respect to testing Writing, with 'writing composition . . . subject only to . . . teacher assessment' (p. 14). Meanwhile, national testing of the whole cohort in Science was ended in 2010, with achievement in KS2 Science now being monitored by the testing of a sample of schools and students.

    So, governments continue to learn the lessons of an over-concentration on testing, though the inspection regime also needs to change. The focus of institutional self-evaluation and other accountability mechanisms, including inspection, needs to be on the quality of the learning experience in the classroom, rather than simply on the outcomes produced.

    But given what we now know about twenty years of such attempts to combine amelioration with educational development, it might be argued that what is needed is a much more radical break with current policy and practice. Continuing centralisation of systemic control will threaten long-term systemic sustainability, professional development and capacity building. This issue was noted by the external evaluation of the National Literacy and Numeracy Strategies when it reported that continuing this kind of accountability for too long may result in a culture of dependence (Earl et al., 2003, p. 6). Similar problems can be observed with respect to the potential opportunities to develop teacher assessment. Although national testing was ended at age seven and replaced by teacher assessment, teachers continue to use past papers to provide them with evidence for their judgements (Reed and Lewis, 2005). Comparable issues are emerging with the abolition of national testing at age 14, with preparations for GCSE simply being started earlier. Thus what is required is not just rolling back the boundaries of testing. The liberated intellectual and material possibilities for developing high quality teaching and learning must be supported by positive programmes of professional development.

    In the end, quality education must involve quality activities taking place in schools and classrooms, and this requires curriculum and professional development at local level. A radical version of the future would attempt not simply to ameliorate central control but to challenge it, and would seek to embed new ideas and practices at local level. It is necessary to invest in more creative forms of curriculum and professional development, especially with respect to classroom-level assessment skills and understandings, and to re-build the communities of epistemological practice within which judgments about standards are made and ultimately reside.

    Understanding and addressing key educational issues for the twenty-first century requires much more curriculum flexibility and responsiveness and this requires investment in teacher professional development at local level. Twenty years of increasing central control and regulation have produced a narrow and risk-averse education culture which is the very antithesis of the ostensible purpose of the exercise. Producing higher test scores is not enough, for learners or for governments. The need for better quality educational encounters and better quality information with which to take decisions has never been more acute. We need to stimulate new visions of what might be accomplished by our education system, and new ways to record diverse experiences and outcomes, rather than continuing to insist that all we can achieve is compliance with that which is already known.

    9. NOTES

    1 Earlier versions of this article were presented as a keynote speech to the New Zealand Association for Research in Education (NZARE) annual conference, University of Auckland, December 2010; the British Educational Studies Association (BESA) annual conference, Manchester Metropolitan University, July 2011; and as a paper to the British Educational Research Association (BERA) annual conference, London Institute of Education, September 2011.

    2 General Certificate of Education Ordinary Level; GCE Advanced Level was and still is taken at around 18+ to qualify for entry to university.

    3 Certificate of Secondary Education.

    4 i.e. the equivalent of five GCSEs at grades A–C: the top GCSE grades of A–C are officially accepted as the equivalent of the old O-level passes; the percentage of students gaining at least five A–Cs is the officially and commonly accepted measure of a good secondary education; the percentage of students gaining at least five A–Gs (the full range of grades) is the officially and commonly accepted measure of a minimally satisfactory secondary education.

    10. REFERENCES

    Airasian, P. (1988) Measurement-driven instruction: a closer look, Educational Measurement: Issues and Practice, 7 (4), 6–11.

    Bennett, R. (2011) Formative assessment: a critical review, Assessment in Education, 18 (1), 5–25.

    Bew Report (2011) Independent Review of Key Stage 2 Testing, Assessment and Accountability. Available at: http://www.education.gov.uk/ks2review (accessed 25 August 2011).

    Black, P. (1998) Testing: Friend or Foe? (London, Falmer Press).

    Black, P. and Wiliam, D. (1998) Assessment and Classroom Learning, Assessment in Education, 5 (1), 7–74.

    Black, P., McCormick, R., James, M. and Pedder, D. (2006) Learning how to learn and assessment for learning, Research Papers in Education, 21 (2), 119–132.

    Bloom, B. (1974) An introduction to Mastery Learning Theory. In J. Block (Ed.) Schools Society and Mastery Learning (New York, Holt, Rinehart and Winston).

    Broadfoot, P., James, M., McMeeking, S., Nuttall, D. and Stierer, B. (1988) Records of Achievement: Report of the National Evaluation (London, HMSO).

    Cambridge Assessment (2010) A better approach to regulating qualification standards. Available at: http://www.cambridgeassessment.org.uk/ca/Viewpoints/Viewpoint?id=134763 (accessed 4 July 2011).


    Cambridge Assessment (2011) Higher education admissions test must be fair, valid and transparent. Available at: http://www.cambridgeassessment.org.uk/ca/Spotlight/Detail?tag=entry (accessed 4 July 2011).

    Carless, D. (2011) From Testing to Productive Student Learning (London, Routledge).

    Carless, D., Joughin, G., Liu N-F and Associates (2006) How Assessment Supports Learning (Hong Kong, Hong Kong University Press).

    Cassen, R. and Kingdon, G. (2007) Tackling Low Educational Achievement (York, Joseph Rowntree Foundation).

    Centre on Education Policy (2006) From the Capital to the Classroom: Year 4 of the No Child Left Behind Act: Summary and Recommendations. Available at: http://www.cep-dc.org/.

    Centre on Education Policy (2007) Has Student Achievement Increased Since No Child Left Behind? Available at: http://www.cep-dc.org/.

    Cowie, B. and Bell, B. (1999) A model of formative assessment in science education, Assessment in Education, 6, 101–116.

    Daugherty, R. (1995) National Curriculum Assessment: a Review of Policy 1987–1994 (London, Falmer Press).

    Department for Education (2011) The English Baccalaureate. Available at: http://www.education.gov.uk/schools/teachingandlearning/qualifications/englishbac/a0075975/theenglishbaccalaureate (accessed 4 July 2011).

    Department of Education and Science (1987) Task Group on Assessment and Testing: A Report (London, DES).

    Earl, L., Watson, N., Levin, B., Leithwood, K., Fullan, M., Torrance, N. et al. (2003) Watching and Learning 3: Final Report of the External Evaluation of England's Literacy and Numeracy Strategies; Executive Summary (Nottingham, DfES).

    Ebel, R.L. (1972) Essentials of Educational Measurement (Englewood Cliffs, NJ, Prentice-Hall).

    Galton, M., Simon, B. and Croll, P. (1980) Inside the Primary Classroom (London, Routledge and Kegan Paul).

    Galton, M., Hargreaves, L., Comber, C. and Wall, D. (1999) Inside the Primary Classroom: 20 Years on (London, Routledge).

    Gillborn, D. and Youdell, D. (2006) Educational triage and the D-to-C conversion: suitable case for treatment? In H. Lauder, P. Brown, J. Dillabough and A. H. Halsey (Eds) Education, Globalisation and Social Change (Oxford, Oxford University Press).

    Gilmore, A. (2002) Large-scale assessment and teachers' assessment capacity: learning opportunities for teachers in the National Education Monitoring Project in New Zealand, Assessment in Education, 9 (3), 343–361.

    Gipps, C. and Goldstein, H. (1983) Monitoring Children (London, Heinemann).

    Glaser, R. (1963) Instructional technology and the measurement of learning outcomes, American Psychologist, 18, 519–522.

    Hall, K., Collins, J., Benjamin, S., Nind, M. and Sheehy, K. (2004) SATurated models of pupildom: assessment and inclusion/exclusion, British Educational Research Journal, 30 (6), 801–881.

    Hamilton, L. et al. (2007) Standards-based Accountability under No Child Left Behind (Santa Monica, Rand Education).

    Hargreaves, E. (2007) The validity of collaborative assessment for learning, Assessment in Education, 14 (2), 185–199.

    Hayward, G. and McNicholl, J. (2007) Modular mayhem? A case study of the development of the A level science curriculum in England, Assessment in Education, 14 (3), 335–351.

    Inner London Education Authority (1984) Improving Secondary Schools (The Hargreaves Report) (London, ILEA).

    Jackson, P. (1968) Life in Classrooms (New York, Holt, Rinehart and Winston).


    Klein, S. et al. (2000) What do test scores in Texas tell us? Education Policy Analysis Archives, 8, 49. Available at: http://epaa.asu.edu/epaa/v8n49.

    Linn, R. (2000) Assessments and accountability, Educational Researcher, 29, 4–16.

    McNess, E., Triggs, P., Broadfoot, P., Osborn, M. and Pollard, A. (2001) The changing nature of assessment in English primary schools: findings from the PACE Project 1989–1997, Education 3–13, 29 (3), 9–16.

    Mehan, H. (1979) Learning Lessons: Social Organisation in the Classroom (Harvard, Harvard University Press).

    Mercer, N. (1995) The Guided Construction of Knowledge (Clevedon, Multi-Lingual Matters).

    Murphy, R. and Torrance, H. (1988) The Changing Face of Educational Assessment (Maidenhead, Open University Press).

    New Zealand Qualification Authority (2011) History of NCEA. Available at: http://www.nzqa.govt.nz/qualifications-standards/qualifications/ncea/understanding-ncea/history-of-ncea/ (accessed 7 July 2011).

    No Child Left Behind Act (2001) Public Law 107–110. Available at: http://www.ed.gov/nclb/landing.jhtml.

    Office for Standards in Education (2006) The Annual Report of Her Majesty's Chief Inspector of Schools 2005/06 (London, OfSTED). Available at: http://www.ofsted.gov.uk.

    Popham, J. (1987) The merits of measurement-driven instruction, Phi Delta Kappan, 68, 679–682.

    Reed, M. and Lewis, K. (2005) Key Stage 1 Evaluation of New Assessment Arrangements (Slough, NFER).

    Resnick, L. and Resnick, D. (1992) Assessing the thinking curriculum. In B. Gifford and M. O'Connor (Eds) Future Assessments: Changing Views of Aptitude, Achievement and Instruction (Boston, Kluwer).

    Sadler, R. (1989) Formative assessment and the design of instructional systems, Instructional Science, 18, 119–144.

    Sadler, R. (1998) Formative assessment: revisiting the territory, Assessment in Education, 5 (1), 77–84.

    Shavelson, R., Black, P., Wiliam, D. and Coffey, J. (2005) On linking formative and summative functions in the design of large-scale assessment systems. Available at: http://www.stanford.edu/dept/SUSE/SEAL/Reports_Papers/On%20Aligning%20Formative%20and%20Summative%20Functions_Submit.doc (accessed 4 July 2011).

    Shepard, L. (1990) Inflated test score gains: is the problem old norms or teaching to the test? Educational Measurement: Issues and Practice, 9 (3), 15–22.

    Shepard, L. (2000) The role of assessment in a learning culture, Educational Researcher, 29 (7), 4–14.

    Sykes Review (2010) The Sir Richard Sykes Review of the future of English qualifications and assessment system. Available at: http://www.conservatives.com/news/news_stories/2010/03/~/media/Files/Downloadable%20Files/Sir%20Richard%20Sykes_Review.ashx (accessed 4 July 2011).

    Torrance, H. (1981) The origins and development of mental testing in England and the United States, British Journal of Sociology of Education, 2 (1), 45–59.

    Torrance, H. (Ed, 1995) Evaluating Authentic Assessment: Issues, Problems and Future Possibilities (Buckingham, Open University Press).

    Torrance, H. (2003) Assessment of the National Curriculum in England. In T. Kellaghan and D. Stufflebeam (Eds) International Handbook of Educational Evaluation (Dordrecht, Kluwer).


    Torrance, H. (2005) Testing times for black achievement: some observations from England. Paper presented to Symposium Leaving No Child Behind: How Federal Education Agencies are Addressing Achievement Gaps for Linguistic and Racial/Ethnic Groups, American Educational Research Association Annual Conference, Montreal, 11–15 April.

    Torrance, H. (2007) Assessment as Learning? How the use of explicit learning objectives, assessment criteria and feedback in post-secondary education and training can come to dominate learning, Assessment in Education, 14 (3), 281–294.

    Torrance, H. and Pryor, J. (1998) Investigating Formative Assessment: Teaching, Learning and Assessment in the Classroom (Buckingham, Open University Press).

    Wyatt-Smith, C., Klenowski, V. and Gunn, S. (2010) The centrality of teachers' judgement practice in assessment: A study of standards in moderation, Assessment in Education: Principles, Policy and Practice, 17 (1), 59–75.

    Wyse, D. and Torrance, H. (2009) The development and consequences of national curriculum assessment for primary education in England, Educational Research, 51 (2), 213–238.

    Correspondence

    Professor Harry Torrance

    The Education and Social Research Institute

    Manchester Metropolitan University

    799 Wilmslow Road

    Manchester

    M20 2RR

    E-mail: [email protected]
