Torrance 2011 Assessment


Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954
Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

British Journal of Educational Studies

Publication details, including instructions for authors and subscription information:

http://www.tandfonline.com/loi/rbje20

Using Assessment to Drive the Reform of Schooling: Time to Stop Pursuing the Chimera?

Harry Torrance

Manchester Metropolitan University

Published online: 08 Dec 2011.

To cite this article: Harry Torrance (2011): Using Assessment to Drive the Reform of Schooling: Time to Stop Pursuing the Chimera?, British Journal of Educational Studies, 59:4, 459–485

To link to this article: http://dx.doi.org/10.1080/00071005.2011.620944


British Journal of Educational Studies

Vol. 59, No. 4, December 2011, pp. 459–485

USING ASSESSMENT TO DRIVE THE REFORM OF SCHOOLING: TIME TO STOP PURSUING THE CHIMERA?

by HARRY TORRANCE, Manchester Metropolitan University

ABSTRACT: Internationally, over the last 20–30 years, changing the procedures and processes of assessment has come to be seen, by many educators as well as policy-makers, as a way to frame the curriculum and drive the reform of schooling. Such developments have often been manifested in large scale, high stakes testing programmes. At the same time educational arguments have been made about the need to provide students with good quality formative feedback, and informative reports about what they have achieved. The chimera of a perfectly integrated and functioning curriculum and assessment system has been pursued, but such ambition far outstretches systemic capacity; it is neither feasible nor desirable. The national testing and examination system in England is an exemplar case. As national results have improved, much evidence suggests that, if anything, actual standards of achievement are falling, and grade inflation is undermining public confidence in the whole system. The paper will review these issues and tensions, and argue that a different model for developing curriculum and assessment is urgently needed.

Keywords: assessment, school reform, examination results, England

1. INTRODUCTION¹

Internationally, over the last 20–30 years, various developments have taken place in the field of curriculum and assessment that have led governments around the world to look to assessment policy and practice as a way of exerting pressure on their school systems. Changing the procedures and processes of assessment has come to be seen, by many educators as well as policy-makers, as a way to frame the curriculum and drive the reform of schooling. There is not a single explanation for how and why this has happened. Many tributaries have contributed to the current torrent of policy initiatives. But the unintended consequences, or at least I assume they are unintended, are becoming very apparent, and if they are not addressed then they are likely to undermine the validity and legitimacy of the whole enterprise.

In this paper I review some of the general factors contributing to the current mainstream intellectual and policy consensus; explore some of the consequences of current practice; and identify key elements which must be changed in order to develop a curriculum and assessment system which is fit for purpose. The argument of the paper is that in the increasingly frantic search for a perfectly integrated and functioning assessment system, our ambition has far outstretched our capacity to deliver in terms of both what is feasible and what is desirable. The compromises that have been struck between educational aspiration and political purpose have distorted both, with very negative consequences for the educational experience of students and the credibility of the overall system. The first part of the paper briefly summarises where we are now with respect to both the educational and political arguments which focus on using assessment to reform schooling. Secondly, I examine evidence of the effects of such developments in England, including steadily rising pass rates in national tests and public examinations, but increasing scepticism about their validity and reliability, i.e. about what they actually mean in terms of educational standards. Finally, I go on to think about what can be salvaged and what has to change if the appropriateness and quality of schooling in the twenty-first century is to be improved.

ISSN 0007-1005 (print)/ISSN 1467-8527 (online)
© 2011 Society for Educational Studies

2. HOW DID WE GET HERE?

Changes in assessment do not simply arise from technical developments in the field, though these certainly contribute. Rather, such change reflects developments in the social and economic aspirations which we hold for the education system, and thus what it is that we are trying to design assessment to accomplish. This section of the paper briefly reviews the following issues and developments with respect to their diverse contributions to the current settlement:

• selection, certification and norm referencing;
• human resource development and education for all;
• criterion referencing, clarity of outcomes and the development of content standards;
• social justice and educational inclusion;
• summative and formative assessment.

Selection, Certification and Norm Referencing

One of the most profound intellectual and policy shifts over the last 30 years or so has been the move from seeing only a small percentage of a student cohort as being both educable and worth educating, to seeing education as an investment in the social and economic potential of the whole cohort. We now take this aspiration for granted, at least at the level of policy rhetoric, but it was not always so: quite the reverse.

Historically, education was a scarce commodity, access to educational opportunities was limited, and educational assessment was largely concerned with selecting individuals for those limited opportunities; for access to an elite secondary education for example, and access to university. In turn grades and certificates were awarded to individuals at the end of particular courses of study, as they progressed through the education system. So the focus of assessment was on identifying individual achievement, and particularly on selecting and certificating individuals. In so doing, this process functioned to identify, and to legitimate on grounds of educational merit, the next cohort of suitably qualified and socialised personnel for economic and social leadership roles in society. Selection and certification was done by relatively small elite groups, of relatively small elite groups, for relatively small elite groups, and was underpinned by reference to the idea that innate intellectual ability was distributed along a normal distribution curve within a population. The most obvious example in England is the work of Cyril Burt in producing the intellectual justification for mental testing and in turn the 11+ selection test for grammar school entrance (Torrance, 1981).

Selection, and grading for certification, produced the need for assessments to generate a rank order, with norm-referencing being used for such purposes. What mattered was where an individual's score came in relation to their peers, rather than any absolute level of achievement that it might signal in itself. Of course absolute levels of achievement were important in terms of determining grade boundaries, but such conceptions of achievement remained largely within the tacit knowledge of examiners and were not reported explicitly. What mattered publicly was the norm-referenced rank-order and grades awarded. Such practices were a product of their time, largely determined by the imperative to create a small social and economic elite, to lead and manage a largely unskilled manual workforce.
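The contrast between grading by rank order and grading against an absolute level of achievement can be made concrete with a small sketch. It is purely illustrative and does not reproduce any examination board's actual procedure: the pupil names, the scores, the 20 per cent quota and the pass mark of 60 are all invented for the example, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple


def norm_referenced(scores: Dict[str, float],
                    quotas: List[Tuple[str, float]]) -> Dict[str, str]:
    """Grade by rank order: each grade goes to a fixed share of the cohort."""
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    grades: Dict[str, str] = {}
    start, cumulative = 0, 0.0
    for grade, share in quotas:
        cumulative += share
        end = min(len(ranked), round(cumulative * len(ranked)))
        for name in ranked[start:end]:
            grades[name] = grade
        start = max(start, end)
    # Any pupils left over through rounding receive the lowest grade.
    for name in ranked[start:]:
        grades[name] = quotas[-1][0]
    return grades


def criterion_referenced(scores: Dict[str, float],
                         pass_mark: float) -> Dict[str, str]:
    """Grade against a fixed standard, regardless of how peers perform."""
    return {name: ("pass" if score >= pass_mark else "fail")
            for name, score in scores.items()}


# Hypothetical cohort, and the same cohort with every score 20 marks higher.
cohort = {"Ann": 82, "Ben": 74, "Cal": 66, "Dee": 58, "Eve": 41}
stronger = {name: score + 20 for name, score in cohort.items()}
quotas = [("pass", 0.2), ("fail", 0.8)]  # assumed quota: top 20 per cent pass

print(norm_referenced(cohort, quotas))
print(norm_referenced(stronger, quotas))
print(criterion_referenced(cohort, 60))
print(criterion_referenced(stronger, 60))
```

Under the rank-order rule the same single pupil passes in both cohorts, however much the whole group improves; under the fixed pass mark the number passing rises with achievement. The choice between the two is therefore a choice about what a grade is meant to report.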

Human Resource Development and Education for All

Such times have changed. Without rehearsing all the changes in the international terms of trade, and the conditions of production, that have occurred over the last 30–40 years, it is apparent that we now live in a world of intense global economic competition with mass movements of capital and labour. Unskilled production has virtually vanished from the UK and other similar economies, and the emphasis now is on education for the so-called knowledge economy and as a form of investment in human capital. The focus is now on education for all, or at least the large majority, and the development of a fit-for-purpose assessment system as a system, i.e. as part of an integrated approach to national human resource development. The imperative now is to treat education as an economic investment, both on the part of the individual student, and on the part of government. Instead of needing a legitimate reason to dispense with the intellectual capabilities of most of the population, governments now need to cultivate these capabilities.

Criterion Referencing, Clarity of Outcomes and the Development of Content Standards

In parallel with such developments the need for assessment to produce more useful information about student achievement has also become apparent: useful information for teachers, for the students themselves, and also for other stakeholders such as parents, employers and government. Norm-referenced, rank ordered grades do not communicate what students have achieved and, over time, we have seen a move toward more criterion-referenced assessment. Initial interest in criterion-referencing derived from the development of mastery learning programmes and evaluation studies, in the 1960s and 1970s, which sought to delineate and identify what students should know and be able to do after following a particular course of study (e.g. Bloom, 1974; Ebel, 1972; Glaser, 1963). Such early work was very much internal, so to speak, to the curriculum development and evaluation research community, but the idea of reporting learning outcomes, rather than norm-referenced grades, became more widely disseminated as demands for utility and accountability developed in the 1970s and 1980s. Employers wanted more information about what school leavers could do and governments wanted more information about what the school system was producing. Moreover, demands also grew for the school system to produce different things: a wider range of more relevant skills and understandings for the knowledge economy. This in turn required a wider range of assessment methods to be developed to identify and report a wider range of learning outcomes: practical work, coursework and extended project work, for example, to test practical competences and the application, rather than simply the memorisation and regurgitation, of knowledge. Thus a concern for what we might term content standards, and the production of more useful information about what school students know, understand and can do, has merged with debates about how best to measure and report such content standards, and indeed enforce them.

Social Justice and Educational Inclusion

Various types of social justice arguments have also contributed to this nexus of change, partly linked to the human capital development arguments outlined above, but also partly driven by arguments about promoting social inclusion and social mobility through equal access to educational opportunities. Thus advocates argue that the majority of the population should not be abandoned to comparative, norm-referenced, failure. Rather, we need our assessment systems to identify and report what students can do, rather than what they cannot, including the many social and attitudinal outcomes of education which are just as important as academic outcomes (e.g. Broadfoot et al., 1988; ILEA Hargreaves Report, 1984). Thus we want our children not just to be able to do maths, or science, or whatever, but to enjoy them and understand their importance. Equally we wish to value achievements in other domains, including social and political understanding, and ensure that students can contribute to civil society. In tandem with such general arguments about widening the scope and inclusiveness of assessment have come specific technical developments incorporating graded tests, the modularisation of the curriculum and the possibility of accumulating better final results through the assessment of coursework and even re-sits of modular papers to improve grades (Murphy and Torrance, 1988 review some of the original advocacy for these sorts of developments; Hayward and McNichol, 2007 report some of the problems).





Summative and Formative Assessment

A more specifically educational variant of some of these arguments is manifested in the debate about summative and formative assessment, and the role that formative assessment could play in improving the quality and outcomes of teaching and learning. Summative assessment reports the outcomes of an assessment process. It largely takes place at the end of a course of study, and results are reported after the course of study has finished, to the student and to interested others. Even if the assessment has involved some form of criterion-referencing, and is informative and positive (all big ifs of course), reporting after the fact about what has been achieved leaves little scope for using such information for improvement. Moreover narrow forms of summative assessment, focusing largely on testing academic achievement, can have a very narrowing backwash effect on the curriculum and the quality of teaching and the student experience (as will be reviewed in more detail below). Advocates of formative assessment argue that using a wider range of classroom-based tasks to assess student progress, and providing good quality feedback to students during a course on what they have achieved but also how they might improve, can facilitate learning and improve outcomes. Many issues arise of course, with respect to the nature and quality of the feedback and the support provided for students (which, again, will be reviewed below) but for the purposes of this introductory discussion it is sufficient to note that this major educational aspiration and endeavour is played out in the context of, and plays into the development of, the wider debate about the purpose, validity and reliability of assessment.

3. WHERE ARE WE NOW?

So, a wide variety of interacting elements, deriving from long term social and economic change, and from educational arguments about the role of assessment in facilitating learning, seem to have produced the current consensus. It is not that there is some sort of simple progression here, such that norm referencing has been completely superseded, or that there is any particularly conscious orchestration and integration of these different elements. Rather, all elements are in play at the present time, but the major influences currently driving developments in curriculum and assessment derive from human capital theory coupled with the demand for clarity of objectives and the prescription of content standards.

Thus governments around the world are looking to produce integrated curriculum and assessment systems to drive up standards, and they are supported by many educational advocates of greater integration. Perhaps the two most visible examples of change are the National Curriculum and Assessment system in England (DES, 1987), and the No Child Left Behind legislation in the United States (NCLB, 2001), now morphing into Obama's standards-oriented Race to the Top and the State-level Common Core Standards Initiative. Meanwhile, other countries are adopting similar programmes, including New Zealand, which has been developing national standards linked to a testing system since 2002 (NZQA, 2011), and Australia (cf. Wyatt-Smith et al., 2010).





A key problem, however, is that there remains a schism between the educational arguments for changes in assessment to enhance learning, and the policy demands for school improvement and accountability. This schism produces tensions in developing an integrated system, and can result in significant unintended consequences. The arguments in favour of using assessment to change teaching essentially fall into two related, but nevertheless distinct, categories. One argument derives from educational issues and values, the other is much more oriented towards accountability and the use of political pressure to bring about change.

The educational arguments revolve around the role that assessment plays in determining the curriculum, using the so-called backwash effect of assessment, noted earlier, in a positive way; as Resnick and Resnick (1992) have put it:

You get what you assess; you don't get what you don't assess; you should build assessment towards what you want . . . to teach . . . (p. 59)

This is very much the thinking that influenced the Measurement-Driven Instruction movement in the USA in the late 1980s and 1990s. Put desired objectives into testing programmes and teachers will teach those desired objectives (Airasian, 1988; Popham, 1987). Thus the intention is to use changes in assessment directly to influence curriculum content and the process of teaching and learning. More recently such arguments have developed to incorporate the notion of a standards-based curriculum, whereby standards are set in terms of curriculum content and achievement levels and tests are aligned with the curriculum to reinforce the teaching of those standards and to measure whether and to what extent such standards have been achieved.

A rather more complex interpretation of the same broad insight focuses much more at classroom level and on the quality of teacher–student interaction. Thus, to reiterate, it is also recognised that routine, informal assessment can play a key role in underpinning or undermining the quality of teaching and learning in the classroom. How teachers assess students' work, what sort of positive or negative feedback is given, and whether or not advice on how to improve is provided can make a great deal of difference to what is learned and how it is learned. This is the thinking which underlies the formative assessment movement in England, where such approaches are perhaps most developed (Black and Wiliam, 1998; Black et al., 2006; Torrance and Pryor, 1998), though it has also been very influential in Australia and New Zealand (Cowie and Bell, 1999; Sadler, 1989, 1998; Wyatt-Smith et al., 2010) and acknowledged as potentially important for developments in Hong Kong, the USA and elsewhere (Carless, 2006; Hargreaves, 2007; Shepard, 2000).

The political accountability arguments for using assessment to drive school reform are much simpler and more clear cut. Here the claim is that education systems in general, schools in particular, must have their efficiency and effectiveness measured by the outcomes produced. Expected standards of achievement must be prescribed and tests regularly employed to identify whether or not these expectations have been met. In publicly maintained school systems such prescription is controlled by government and the quality of teaching and learning in the classroom is assumed to rise if results improve. Essentially, in this model, testing is used as a lever to affect the system qua system; the detail at classroom level is assumed to look after itself. If results are improving, the quality of students' educational experience and achievement is assumed to be improving. However, just as with measurement-driven instruction or a standards-based curriculum, it is crucial to the logic and practice of such an accountability system that the tests employed do indeed genuinely sample the curriculum, and reliably measure student achievement. The tests must be valid indicators of quality across the system as a whole, otherwise they will drive the system in the wrong direction.

In England, we are very much dealing with this politically driven, accountability-oriented analysis of the nature of the problem of educational standards and what to do about them; though some elements of the arguments about standards-based instruction and formative assessment, or assessment for learning as it is now more commonly known, also feature in debate. In this respect we can note once again that change in the education system is unlikely to occur only as a result of educational arguments, or indeed simply to comply with government pressure and legislation. It is the interaction of the two which produces particular practices at particular points in time. In the systemic social and institutional space of education, educational arguments are likely to be modified and adapted to fit the prevailing political context, while at one and the same time such arguments are deployed in policy debates in order to increase the rhetorical and symbolic legitimacy of policy and to mobilise action within local educational contexts and institutions.

It is this interaction of policy and educational aspiration that seems to have produced the current educational orthodoxy of trying to combine formative approaches to classroom assessment with large-scale summative accountability systems. The aspiration to combine formative and summative assessment in a single system was first articulated in the Task Group on Assessment and Testing Report (The TGAT Report, DES, 1987), which provided the educational rationale for the original national system of testing. The report argued that:

It is possible to build up a comprehensive picture of the overall achievements of a pupil by aggregating, in a structured way, the separate results of a set of assessments designed to serve formative purposes. (para. 25)

The TGAT Report was also known as the Black Report since Paul Black chaired the Task Group, and he has been a significant advocate of merging formative and summative assessment over the intervening years (e.g. Black, 1998).

More recently, Shavelson et al. (2005), reviewing developments in England, Australia and the United States, argue that:

The potential for summative and formative assessment to work at cross purposes . . . is enormous. However . . . [i]f left in conflict the summative function will overpower the formative and . . . [t]he goal of teaching . . . becomes improving scores on these tests. (p. 7)





Gilmore (2002), reviewing the National Education Monitoring Project in New Zealand (NEMP), argues:

The vision that underlies large-scale assessment programmes . . . provides dependable assessment information for accountability (evaluative) purposes while at the same time supporting and sustaining exemplary teaching and learning. (p. 345)

Bennett (2011), in a substantial review of current international perspectives, concludes that:

The effectiveness of formative assessment will be limited by the nature of the larger system in which it is embedded . . .. Ultimately we have to change the system . . . if we want to have maximum impact on learning and instruction. [This] means remaking our accountability tests . . . [ ] . . . we have to rethink assessment from the ground up as a coherent system. (pp. 19–20)

Thus the rationale seems to be that accountability is here to stay; summative assessment will always drive out formative assessment if they are set in opposition to one another, therefore we need to splice them together in an attempt to create the perfect chimera, the perfect genetically modified assessment system. In this ambition our reach has far outstretched our grasp. The understandable felt need of educators to pursue their educational aspirations in the context of particular policy demands has produced a tendency towards making over-ambitious claims for what can be accomplished. And, certainly in England, the system that has been produced is starting to collapse under its own weight.

4. WHAT IS THE IMPACT OF THESE CHANGES? POLICY AND PRACTICE IN ENGLAND

I now move on to provide some illustrations of the problems that have become apparent in England, whereby the very improvements in results that the creation of a national system has brought about have arguably been accomplished by too much coaching and practising for the tests, and are now undermining the public credibility of the whole enterprise. England is chosen for illustrative purposes because in many respects it represents the paradigm case of the sorts of trends we have seen over the last 30 years or so. Moreover England now has over 20 years of experience of developing a national curriculum and assessment system so any initial teething troubles should have been long overcome. If any problems remain (and they do) then it is likely that they are intractable and require a different approach.

England has a statutory National Curriculum and Testing system, introduced in 1988 (DES, 1987), which tests all students in a cohort at regular intervals. Originally all students were tested at age seven in English and Maths, at ages 11 and 14 in English, Maths and Science, and sat national public examinations, the General Certificate of Secondary Education (GCSE), at age 16. Currently all students are tested only at 11, with GCSE retained at 16. Since 2005 classroom-based teacher assessment has been used to report results at age seven, while tests at 14 were abolished in 2009 following a fiasco of lost papers, unmarked papers and wrongly marked papers, which demonstrated just how overwhelming testing whole cohorts had become. Thus two stages in the process have been dropped as the undesirability of testing very young children, and the impossibility of maintaining any pretence of quality and reliability in such a mass system, became apparent. England also retains a subject-based examination system at the formal secondary school leaving age of 16 years, and a subject-based Advanced level (A-level) qualification, normally taken at 18 years of age, for entrance to university. So, over the last 25 years or so, if English policy makers could squeeze something into the assessment system, they did, though subsequently elements of what they squeezed in popped out again, as it transpired that there was not enough space in the system for all the testing that was envisaged. Two full levels of whole-cohort testing at ages seven and 14 have been dropped in an attempt to ameliorate the worst effects of testing.

The explicit use of assessment to drive educational change in England dates back to the introduction of a single system of secondary school examinations, the General Certificate of Secondary Education (GCSE), for 16-year-olds (the minimum school leaving age), by the then Conservative government of Margaret Thatcher in 1986 (with the first new exams taken in 1988). In the 1960s and 1970s England operated with two parallel secondary school examination systems: GCE O-level² for those students considered to be in the top 20 per cent of the ability range; and CSE³ for those considered to be in the next 40 per cent of the ability range. The bottom 40 per cent were not considered capable of taking examinations at all. Selection of the top 20 per cent for entry to grammar schools was based on the 11+ intelligence test and, overall, such a selective system represented the epitome of a norm-referenced system. The creation of a single system of examining, GCSE, in the mid-1980s, might be said to mark the point at which governments in England fully began to buy into human capital theory and treat education as an investment in the population as a whole, rather than as a way to select a social and economic elite. Of course education, and particularly assessment, still does play a major role in selecting and legitimating the selection of a social and economic elite, but it is at least now arguable that this outcome is an unintended effect of policy, rather than an overt intention.

One effect of selectivity was that, precisely because it was not thought appropriate for all children to take secondary school examinations, there were no overall data in the system about how well schools were doing and what standards were being attained across the system as a whole. Moreover, the selection test for secondary school allocation (the 11+) could only provide evidence that 80 per cent of pupils failed their 11+ and therefore failed primary education. Even this test was largely phased out with the introduction of comprehensive secondary schools, so that by the mid-1980s there were virtually no data whatsoever on the output of primary schools. The Labour government of the 1970s launched the Assessment of Performance Unit (APU) to try to provide evidence of standards achieved. However, the APU did not provide unequivocal and easily usable evidence about national standards (Gipps and Goldstein, 1983); nor, because of its sampling strategy, could it reach into and influence every classroom. First GCSE (1986), then the National Curriculum and National Testing (1988) were introduced in order to control directly what was taught and how it was taught, and to measure whether or not it was being taught effectively.

There has been extensive detailed argument about the scale and scope of the National Curriculum and Testing system and many modifications have been put in place since it was first introduced (cf. Daugherty, 1995; Torrance, 1995, 2003). However, the issue of educational accountability has remained the key policy problem for over 20 years now, and the development of a standards-based, test-driven education system has remained the key policy solution up to the present. The commitment to a testing regime has remained completely taken-for-granted, as elements of policy have been built up, layer by layer, and then stripped away again, in successive attempts by both Conservative and Labour governments to try finally to realise the vision of an integrated national curriculum and testing system and render it operational. So we have had 20+ years of a natural experiment with the educational provision of our children.

5. IMPACT ON RESULTS: ARE STANDARDS RISING?

Before moving on to review some of the detail of the system in operation, andsome of the educational consequences of this experiment, I will review some of

the results produced so we have a sense of what this move from a selective to

a mass system actually looks like. The increasingly feverish activity of governments

of both complexions since the mid-1990s, both with respect to the National

Curriculum and testing regime, and with respect to GCSE and A-level, indicates

that educational standards are still considered to be a political issue about which

something must be done, or at least be seen to be done.

National Curriculum test scores have risen since national testing was first

introduced but have plateaued since around 2000 and, insofar as they indicate

anything meaningful about educational standards, this suggests that progress

in primary education has stalled, or appears to have stalled. However, figures

for GCSE and A-level indicate that examination scores have risen consistently

since the 1970s, irrespective of which government is in power or which specific

curriculum interventions have been pursued. These results tend to indicate

that it is the general trend towards human resource development and criterion

referencing, combined with the general pressure to succeed, that has seen

scores rise.

To take national test results first, at age seven, in Table 1 and Figure 1, we

can see that results started high, improved a little, and have stayed high, but there

remains a stubborn 15–20 per cent or so of children who are not reaching level 2,

the expected level, in maths and English by the age of seven. At age 11 (Table 1





TABLE 1: Percentage of pupils gaining National Curriculum assessment level 2
or above at age seven and level 4 or above at age 11, England

         Age 7               Age 11
Year     Eng      Maths      Eng    Maths   Science
1992     77       78
1995     76       78         48     44
1996     80       80         58     54      62
1997     81/80    83         63     61      69
2000     81/84    90         75     72      85
2002     84/82    90         75     73      86
2005     85/82    91         79     75      86
2007     84/80    90         80     77      88
2008     85/80    90         81     78      88
2009     84/81    89         80     78      88
2010     85/81    89         81     80      85

Source: http://www.dcsf.gov.uk/rsgateway/

Notes:
1992 first full run of KS1 tests; 1995 first full run of KS2 English and Maths; 1996 first full run of KS2 Science.
1997: New Labour government elected. KS1 English results now being reported separately in terms of attainment targets (81 per cent gained level 2 in Reading, 80 per cent in Writing). Such details had been available previously but results were routinely reported as whole subject levels. The Writing score is averaged across the writing test and the spelling test. The scores are averaged in Figure 1.
2005: KS1 tests now conducted as teacher assessment so results 2005–2010 are no longer directly comparable with previous results at KS1, though interestingly, teacher assessment does not seem to diverge from the established trend in test results.
2010: Results for Age 11 Science now derive from teacher assessment only; a national sample of 5 per cent took tests with 81 per cent reaching level 4. NB 2010 also saw c. 25 per cent of primary schools boycotting the English and Maths tests (c. 4000 schools), leading to government agreement to review national testing for the future.

and Figure 2) the results start low, rapidly improve, but again, have plateaued since

2000 with around 20 per cent of children not achieving the expected level, level 4,

in maths or English.

Not every year's results are recorded in Table 1 and Figures 1 and 2; rather, sufficient

years are recorded to indicate trends over time along with key dates which

government has variously used and dropped as indicators of progress. Progress

since 1997 was the measure routinely deployed by the New Labour government

at national level. And at first sight progress since 1997 seems significant. But

closer scrutiny indicates significant improvements occurred in results prior to

1997. Thus, for example, in the first two years after National Testing was first introduced

at KS2 under a Conservative government (1995–1997) results improved by

15 percentage points in English (from 48 per cent to 63 per cent) and 17 percentage

points in maths (from 44 per cent to 61 per cent). In the ten years after 1997,

results improved by 16 percentage points in English (63 per cent to 79 per cent)

and 14 percentage points in maths (61 per cent to 75 per cent; 1997–2006), but



[Figure 1: line chart; plotted values not reproduced. Series: Age 7 English; Age 7 Maths. x-axis: 1992, 1995, 1996, 1997, 2000, 2002, 2005, 2007, 2008, 2009, 2010; y-axis: 0 to 100.]

Figure 1. Percentage of pupils gaining National Curriculum Assessment level 2 or above at age 7 (KS1), England

[Figure 2: line chart; plotted values not reproduced. Series: Age 11 English; Age 11 Maths; Age 11 Science. x-axis: 1992, 1995, 1996, 1997, 2000, 2002, 2005, 2007, 2008, 2009, 2010; y-axis: 0 to 100.]

Figure 2. Percentage of pupils gaining National Curriculum Assessment level 4 or above at age 11 (KS2), England

with most of this improvement being achieved by 2000. The plateau effect since

2000 has continued through to the most recent results available in 2010.
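
The comparison of the two periods is simple arithmetic on the percentages quoted in the text, and a brief sketch may make it easier to check. The figures are those cited above; the dictionary, the function names and the per-year averages are illustrative assumptions added here, not part of the published analysis.

```python
# Percentage of pupils reaching level 4 at KS2, as quoted in the text.
# The end-year label 2006 follows the period "1997-2006" given in the text.
# All names below are hypothetical and used for illustration only.
ks2_level4 = {
    "English": {1995: 48, 1997: 63, 2006: 79},
    "Maths": {1995: 44, 1997: 61, 2006: 75},
}


def gain(subject, start, end):
    """Change in percentage points between two years."""
    series = ks2_level4[subject]
    return series[end] - series[start]


def gain_per_year(subject, start, end):
    """Average percentage-point change per year over the interval."""
    return gain(subject, start, end) / (end - start)


for subject in ks2_level4:
    early = gain(subject, 1995, 1997)
    later = gain(subject, 1997, 2006)
    print(
        f"{subject}: +{early} points in 1995-1997 "
        f"({gain_per_year(subject, 1995, 1997):.1f} per year), "
        f"+{later} points in 1997-2006 "
        f"({gain_per_year(subject, 1997, 2006):.1f} per year)"
    )
```

On these figures the gain over the first two years is roughly as large as the gain over the following nine, which is the pattern the paragraph describes.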

One inference we might take from these figures, especially with respect to

results at KS2 (age 11), is that the introduction of National Testing constituted

a major perturbation in the primary school system such that teachers were left

initially deskilled by the innovation, so results started low; but results rapidly

improved as teachers and students came to understand what was required of them,

in terms of test preparation, and then progress tailed off as the limits of such

artificial improvement were reached.

Results for GCSE at age 16 are rather different but perhaps even more instructive

(Table 2 and Figure 3). They have been rising steadily since the exam was





TABLE 2: Percentage of pupils gaining O-level/CSE grade 1/GCSE and Equivalents 1975–2010, England

Year     Percentage 5 or more A*–C     Percentage 5 or more A*–G
1975     22.6                          58.6
1980     24.0                          69.0
1988     29.9                          74.7
1990     34.5                          80.3
1995     43.5                          85.7
1997     45.1                          86.4
2000     49.2                          88.9
2005     56.8                          89.9
2007     61.4                          90.9
2008     65.3                          91.6
2009     70.0                          92.3
2010     75.4                          92.8

Source: Torrance (2003) and time series 1996–2010 available at: http://www.education.gov.uk/rsgateway/DB/SFR/s000985/sfr01-2011.pdf (accessed 5 September 2011). NB R&S update and revise pass rates so these figures may vary by fractions of a percentage from those published in previous years; the figures recorded here are the most recent posted on the DoE website. For details of calculating equivalence between O-level, CSE and GCSE see Torrance (2003). It should also be noted that DCSF/DfE Research and Statistics report totals including GCSEs and equivalents, including KS4 students taking iGCSEs and vocational GCSEs. The totals therefore are slightly higher than the headline GCSE pass rate that is announced each autumn but this does not affect the thrust of the argument.

first introduced in 1988 and indeed were rising prior to its introduction. In the

mid-1970s, when only the top 20 per cent of students were thought capable of

passing O-level, the percentage of students passing at least five O-levels or their

equivalent under the previous dual system was 22.6 per cent.⁴ By 1988, the first

year of GCSE results, this had risen to 29.9 per cent. By the mid-1990s this had

risen further to 43.5 per cent and the most recent results for 2010 indicate that

[Figure 3: line chart; plotted values not reproduced. Series: % 5 or more A*–C grades; % 5 or more A*–G grades. y-axis: 0 to 100.]

Figure 3. Percentage of pupils gaining O-level/CSE grade 1/GCSE and equivalents 1975–2010, England

75 per cent of students now pass five or more GCSEs or their equivalent at grades

A*–C.

That is, 75 per cent of the school population now achieve what 30 years ago it

was thought only the top 20 per cent could achieve. Furthermore, taking the full

range of grades into account (A*–G), as an indicator of the numbers of students

gaining at least some benefit from their secondary education, almost 60 per cent

gained at least five A*–G grades in 1975, while nearly 93 per cent achieved five

A*–G in 2010.

So from the published statistics in the public domain the evidence is that pass

rates have been steadily improving over many years. And, on the face of it, this

represents an absolute transformation of what the system is achieving, compared

to 30 years ago.

Of course, overall pass rates conceal other issues. Within these general trends

different sub-groups perform better than others, and results vary by social class,

gender and race. Thus, for example, 87.5 per cent of candidates of Chinese origin

gained at least five A*–Cs in 2009 (n=2,275), while only 67 per cent of candidates

of Black African and Caribbean origin did so (n=23,609), though this in

itself represents a very significant improvement from the 35 per cent recorded in

the early 2000s (Torrance, 2005). Amongst candidates of White British origin,

69.8 per cent passed at least five A*–Cs in 2009 (n=461,445). A recent Joseph

Rowntree-sponsored study indicates that poor, working-class white boys do worst

of all (Cassen and Kingdon, 2007).

There is not space here to explore such differential pass rates in more detail,

but, taken together, the results of National Curriculum Tests and GCSE indicate

that there is a major bifurcation developing between those students who are doing

well and a substantial minority of perhaps 20–25 per cent of children who are not

riding the rising tide of results. Clearly this raises major political and educational

issues, as the system is driven more and more by the pursuit of examination success

for the majority, and de facto attends less and less to the needs of those who,

for whatever reason, do not fit into the model. Overall, however, for the purposes

of the present discussion, the key point is that GCSE pass rates have been rising

in England for more than 30 years.

A-level pass rates have also been rising in similar fashion over the last 30 years

(Table 3 and Figure 4). Originally A-level examinations were designed to qualify

and select applicants for entrance to university. They were designed to be taken by

a minority of the minority thought to be capable of benefiting from an academic

secondary education. Today A-levels taken at age 18 are starting to replace GCSE

as the marker of a successfully completed secondary education.

Again, not every year is recorded in Table 3 and Figure 4, but rather those

that indicate trends over time, and particularly over the last ten years when already

good results still show an incremental improvement year on year. The key points

to note include steadily rising numbers of passes and top grades over 30 years;

a particular blip around 2000–2001 when major changes to the structure of A-level

were introduced (Curriculum 2000), but, once accommodated, the steadily rising





TABLE 3: A-level pass rates 1980–2010, England

Year     Entries     Percent A grades     Percent all grades A–E
1980     567,027     8.8                  67.8
1985                 9.8                  70.2
1990                 12.0                 76.7
1995                 15.6                 84.0
2000     606,995     18.8                 90.6
2001                 18.3                 89.6
2002                 20.3                 94.1
2003                 21.3                 95.3
2004                 22.1                 95.9
2005     717,127     22.4                 96.2
2006                 23.8                 96.5
2007                 25.0                 96.9
2008                 25.6                 97.2
2009                 26.5                 97.5
2010     784,877     26.8                 97.6

Sources:
Department of Education and Science (1980) Statistics of School Leavers, CSE and GCE, England.
Department of Education and Science (1985) Statistics of Education, School Leavers, CSE and GCE England.
Daily Telegraph 14 August 2008, accessed via website http://www.telegraph.co.uk 4 October 2010.
Times Educational Supplement 17 August 1990 p. 3.
Times Educational Supplement 18 August 1995 p. 5.
Department for Education and Employment (2000) Statistics of Education: GCSE/GNVQ and GCE A/AS Level and Advanced GNVQ Examination Results 1999/2000 England, accessed from DCSF Research and Statistics website: http://www.dcsf.gov.uk/rsgateway 4 October 2010.
All further results, 2001–2010, from Joint Council for Qualifications website http://www.jcq.org.uk/ accessed 4 October 2010.

Notes:
It might be noted in passing that it is remarkably difficult to gain access to A-level pass rates pre-2001 when they first started appearing on the JCQ website. So far as I can discern there is no single source of results which goes back to 1980 or beyond. Government statistics have been produced in very different ways over the period, to address whatever was the key policy concern of the day. Thus this table has been produced by extensive mining of different sources.
First results for Curriculum 2000 including modular A/S levels.
First use of A* and A grades: A* = 8.1 per cent, A = 18.7 per cent = 26.8 per cent total A* + A (NB c. 14 per cent of all candidates achieved straight As in 2010, i.e. 3 x A* or A grades, up from c. 7 per cent in 2000).

pass rate was resumed and perhaps slightly accelerated; as was the shift towards

awarding of top grades. In 1980 only nine per cent of entries gained grade A, and

from a significantly smaller number of overall entries (n=567,027), while in 2010

27 per cent of entries were awarded grade As, from a total entry of 784,877.
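
Because both the number of entries and the proportion awarded the top grade rose, the absolute number of top-grade entries grew faster than either figure alone suggests. The sketch below works this through from the Table 3 values; the resulting counts are approximations derived here from rounded percentages, not figures reported in the sources, and all names are illustrative assumptions.

```python
# Entries and percentage of entries awarded the top grade(s), from Table 3.
# Names are hypothetical; derived counts are approximate because the
# published percentages are rounded to one decimal place.
a_level = {
    1980: {"entries": 567_027, "pct_top": 8.8},
    2010: {"entries": 784_877, "pct_top": 26.8},  # A* and A combined in 2010
}


def top_grade_entries(year):
    """Approximate number of entries awarded the top grade(s) in a year."""
    row = a_level[year]
    return round(row["entries"] * row["pct_top"] / 100)


entries_growth = a_level[2010]["entries"] / a_level[1980]["entries"]
top_growth = top_grade_entries(2010) / top_grade_entries(1980)

print(f"Top-grade entries 1980: about {top_grade_entries(1980):,}")
print(f"Top-grade entries 2010: about {top_grade_entries(2010):,}")
print(f"Entries grew by a factor of about {entries_growth:.2f}")
print(f"Top-grade entries grew by a factor of about {top_growth:.2f}")
```

On this rough calculation total entries grew by a little under 40 per cent, while the number of top-grade entries roughly quadrupled.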

Figure 5 illustrates the change very dramatically. Here the diamonds line indicates

the grade distribution in 1980, with the majority of passing grades being C,

D and E, and skewed towards the bottom end. The squares line indicates grade

distribution in 2010, and the graph is reversed, with most passing grades being C,

B and A.

So what might be the explanation for these long term trends? Here we return

to the points made in the opening section of the paper. Some element of a genuine



[Figure 4: line chart; plotted values not reproduced. Series: % A grades; % all pass grades (A–E). x-axis: 1980, 1985, 1990, 1995, then yearly 2000 to 2010; y-axis: 0 to 100.]

Figure 4. Percentage A-level passes, 1980–2010, England. 1980: n = 567,027; 2010: n = 784,877

[Figure 5: line chart; plotted values not reproduced. Series: 1980; 2010. x-axis: grades E, D, C, B, A; y-axis: 0 to 30.]

Figure 5. Percentage distribution of A-level grades, E–A, 1980 and 2010, England

rise in standards is likely to be present, driven by better socio-economic conditions

of students, higher expectations of educational outcomes by students, parents and

teachers, and better teaching. But this is combined with and compounded by two

key elements of the changes which have taken place in systems of assessment:

(i) an increasingly more focused concentration on passing exams, by both

teachers (teaching to the test) and the majority of students (extrinsic

motivation), because of the perceived importance of educational success

in institutional accountability and individual life chances;

(ii) the increased transparency of modular, criterion-referenced assessment

systems, which affords teachers and students much more opportunity to

practise for tests and improve grades through coaching, specific feedback

and resubmission of work.





6. IMPACT ON EDUCATIONAL EXPERIENCE AND QUALITY

In relation to National Curriculum test scores, the evidence in England suggests

that teaching to the test is the most likely recent explanation for rising scores

which tail off as teachers and students come to be about as efficient as they can

be at scoring well on the tests within a regime of coaching and practice. Many

research studies have reported an increasing focus on test preparation, particularly

in the final year of primary school prior to the tests being taken (for a review of

recent studies see Wyse and Torrance, 2009). Thus, for example, McNess et al.

(2001) note that:

Whole class teaching and individual pupil work increased at the expense of group work . . . [there was] a noticeable increase in the time spent on the core subjects . . . [and] teachers . . . put time aside for revision and mock tests . . . (pp. 12–13)

While Hall et al. (2004) report that:

assessment is synonymous with testing . . . assessment, narrowed to test-taking in preparation for SATs, is the main business of life in the last two terms of year six. (p. 804)

However, it is not only independent research studies which highlight such problems.

School inspectors who routinely visit schools have

reported on a narrowing of the curriculum and summaries of their inspection findings

have been included in the annual reports from the Office for Standards in

Education (OfSTED). One recent report noted:

In many [primary] schools the focus of the teaching of English is on those parts of the curriculum on which there are likely to be questions in national tests . . . History and, more so, geography continued to be marginalized . . . In [secondary] schools . . . the experience of English had become narrower . . . as teachers focused on tests and examinations . . . There was a similar tension in mathematics . . . (OfSTED, 2006, pp. 52–56)

Such concerns have also now reached Parliament, with a report of the Select

Committee for Children, Schools and Families (2008) stating:

In an effort to drive up national standards, too much emphasis has been placed on a single set of tests and this has been to the detriment of some aspects of the curriculum and some students. (Select Committee, reported on BBC 13 May 2008: http://news.bbc.co.uk/1/hi/education/7396623.stm)

And finally, after a boycott of KS2 tests in 2010 by about 25 per cent of primary

schools, the Bew Committee was set up by the new Conservative and Liberal

Democrat coalition government to review KS2 testing, reporting in June 2011.

Their terms of reference included:

how to avoid, as far as possible, the risk of perverse incentives, over-rehearsal and reduced focus on productive learning. (Bew Report, 2011, p. 4)





The report notes:

There are considerable concerns . . . that the system is too high stakes, which can lead to unintended consequences such as over-rehearsal and teaching to the test.

(p. 9)

Comparable evidence can also be identified internationally. Many research studies

from the United States (e.g. Klein et al., 2000; Linn, 2000; Shepard, 1990) report

similar findings from previous investigations of test-based reform in the USA, and

the same issues are now beginning to emerge from studies of the No Child Left

Behind program. State level NCLB test scores are rising (CEP, 2007), but equally

"Administrators and teachers have made a concerted effort to align curriculum

and instruction with state academic standards and assessments" (CEP, 2006, p. 1).

A recently completed study by Rand Education funded by the US National Science

Foundation noted that:

changes included a narrowing of the curriculum and instruction toward tested topics and even toward certain problems styles or formats. Teachers also reported focusing more on students near the proficient cut-score . . . (Hamilton et al., 2007, Summary: p. xix)

So overall, the international research evidence suggests that rising test scores

might actually mask falling standards as students are exposed to a much restricted

curriculum.

Similar issues with regard to narrowing the curriculum have been reported

with respect to GCSE in England. For the purposes of government statistics and

the compilation of league tables of secondary schools the passing grade for

GCSE is grade C, and schools are under enormous pressure to maximise the numbers

of students passing at least five A*–C grades. As noted above, OfSTED has

observed a focus on exam preparation in secondary schools as well as primary

schools, and research by Gillborn and Youdell (2006) has reported schools focusing

particularly on identifying and supporting those students who could possibly

be moved from a D to a C grade GCSE ("triage" as they have termed it), a practice

very similar to that noted by the US Rand Report cited above. Taken together with

our earlier observations about a significant minority of children not fitting in with

the model, this means that perhaps 20–25 per cent of children are increasingly

ignored as they progress through the school system, if they are not thought worthy

of such triage investment.

At A-level, research undertaken as part of a Nuffield Foundation sponsored

review of the 14–19 curriculum in England identified modularisation and the re-taking

of modular tests as a key issue for pass rates in science (Hayward and

McNichol, 2007). My own research investigating assessment across the post-compulsory

sector including A-level has identified transparency of procedures,

objectives and criteria as a key issue which, combined with the detailed feedback

which teachers give students, has led to a situation which I have characterised as

criteria compliance:





. . . greater transparency of intended learning outcomes and the criteria by which they are judged, and . . . [c]larity in assessment procedures [and] processes . . . has underpinned the widespread use of coaching, practice and provision of formative feedback to boost individual and institutional achievement . . . . However . . . such transparency encourages instrumentalism . . . transparency of objectives coupled with extensive use of coaching and practice to help learners meet them is in danger of removing the challenge of learning and reducing the quality and validity of outcomes achieved. This might be characterized as a move from assessment of learning, through the currently popular idea of assessment for learning, to assessment as learning, where assessment procedures and practices come completely to dominate the learning experience, and criteria compliance comes to replace learning. (Torrance, 2007, p. 282)

So, potentially, we have reached a situation in England where scores and grades

are continuing to rise but the validity and reliability of the standards achieved

are subject to increasing doubt, and the educational experience of even the most

successful students, let alone those who are not successful, is compromised.

Employers and university selectors alike are expressing concern about the

quality and credibility of GCSE and A-level grades (e.g. Sykes Review, 2010) and

this in turn is drawing responses from the Examination Boards (e.g. Cambridge

Assessment, 2010, 2011). A recent review commissioned by the Conservative

Party, the Sykes Review, reported that:

Confidence in the qualifications and assessment system has been diminishing for many years. The usefulness of the system has been eroded by the politicisation of assessment outcomes, by universities' loss of confidence in A levels as a certificate of readiness for university-level study, by employers' loss of confidence in GCSEs and A levels as certification of relevant knowledge and skills, and by the disproportionate burden placed by external assessment on pupils, teachers and schools. The volume of external assessment has also grown enormously . . . . This process has undermined the credibility of teacher and school assessment, as well as limiting and undermining teaching. (Sykes Review, 2010, p. 4)

Now it might be argued that it was ever thus: employers and indeed examiners

have often complained that standards are not high enough. Also, to reiterate, the

Sykes Review was set up to report to the Conservative Party when in opposition,

and might thus be thought of as rather partisan. However, the Review Committee

membership was very broad, and actually produced findings not that far removed

from OfSTED and the 2008 Parliamentary Select Committee. The more interesting

point to note is how widespread is the concern across the political spectrum.

It is also interesting to note the time lag between the research findings of the late

1990s and early 2000s (the work of McNess, Hall and others, cited earlier) and

the impact on policy thinking some ten years later. We could have avoided ten

years of this progressive narrowing of the curriculum if politicians had been in a

position to hear and respond to the research literature.

Successive research studies over 40+ years have indicated that it is the vitality

of teacher–student relationships and the quality of teacher–student interaction

that are the most important factors in improving student learning experiences





and raising attainment (Galton et al., 1980; Galton et al., 1999; Jackson, 1968;

Mehan, 1979; Mercer, 1995). Yet this is precisely what is threatened by an over-concentration

on testing. The focus of the teacher–student relationship is currently

oriented towards criteria compliance and grade accumulation, rather than learning.

It is almost as if successive governments in England have taken an actuarial view

of the role of rising pass rates. Examination success is correlated with social and

economic success, so, the policy thinking seems to be, maximise exam passes,

especially for previously disadvantaged groups, and social and economic mobility

will necessarily improve, irrespective of the educational experience of the students

or the quality of the outcomes achieved.

In focusing on assessment, standards and accountability to drive improvements

in schooling, policymakers seem to have lost sight of the purpose of education and

the nature of individual achievement, of what it is that standards are supposed to

embody in terms of the knowledge, skills, attitudes and competences that we might

expect of young people leaving school in the twenty-first century. It is almost as

if the unit of analysis for policy-making, so to speak, has shifted from the curriculum

and the building blocks of individual student learning and achievement,

to the overall output of the system. Individual achievement and, more particularly,

the content and quality of that achievement (the traditional focus of assessment)

has been ignored, or at least taken for granted, as policy has focused on raising test

scores across the system as a whole.

Meanwhile, many educationists and assessment developers have been content

to ride the tiger of accountability, and take the policy context as given, in

order to try to insinuate their own version of measurement driven instruction or

assessment for learning into the system. However, assessment processes cannot

be so easily assimilated into a single, integrated, systemic operation. Three issues

in particular are apparent:

(i) just because assessment can be observed to have negative backwash effects

on the curriculum and teaching, this doesn't necessarily mean that the

same mechanism is available to harness these effects to beneficial purposes;

certainly this is proving far more difficult than seems to have been

anticipated;

(ii) the impact of assessment for learning on students' knowledge and understanding

will inevitably be mediated by the accountability context in which

it operates; thus students are currently learning to accumulate grades,

rather than understand the structure, coherence and content of particular

knowledge domains;

(iii) criterion-referencing enables the structure of knowledge domains and the

processes of assessment associated with them to be more transparent, such

that more students can achieve more success, but the very nature of that

success undermines the selection function of assessment and thus, in turn,

the credibility of what is achieved.





Education has moved from being a scarce good to a positional good, whereby all

cannot succeed in terms of publicly reported grades without the nature of the success

being called into question. This is now a key and urgent issue: to improve

educational quality and outcomes without simultaneously undermining the credibility

of the enterprise. Equally, we have to attend to the 20–25 per cent of students

that the accountability model is leaving adrift.

7. WHERE DO WE GO FROM HERE? DEVELOPING EDUCATIONAL QUALITY

We are, however, where we are, and the irony of the current situation, certainly

in England, is that a narrow test-driven accountability system is unlikely to produce

flexible and creative workers for the so-called knowledge economy, even

amongst the 75 per cent who are currently seen as successful, let alone the 25 per

cent who are not. Nor are such systems likely to contribute to more general deliberation

about the curriculum, teacher development and the democratic purposes

and organisation of schooling. Quite the reverse: accountability systems take these

matters of purpose as settled, the only issues being efficiency and effectiveness in

meeting already determined goals.

But it is becoming apparent that a one-size-fits-all standards agenda in educational

policymaking has run its course and that what is needed is a new focus on

curriculum content, local responsiveness and the quality of teaching and learning:

on the production of diverse experiences of learning for an uncertain world.

8. POLICY OPTIONS FOR THE FUTURE

There are no simple or straightforward policy options with respect to assessment

and testing. Assessment intersects with every aspect of an educational system:

• at the level of the individual student and teacher and their various experiences (positive or negative) of the assessment process;

• at the level of the school or similar educational institution and how it is organised and held to account; and

• at the level of the educational and social system with respect to what knowledge is endorsed and which people are legitimately accredited for future economic and social leadership.

Governments in England over the last 20 years or so seem to have only partially

appreciated this, assuming that standards can be mandated and measured without

the process of measurement impacting on and, in key respects, distorting the

system as a whole. Similarly, because educational achievement is correlated with

social and economic well-being, the efforts of successive governments in England

seem to have concentrated on pushing as many children as possible through as

many examinations as possible. This seems to have been conceived of as part of

the drive for social inclusion and improving social mobility, without reflecting on





re-conceptualise the integration of curriculum development and

assessment by starting from the perspective of the curriculum: i.e. put

resources and support into re-thinking curriculum goals for the twenty-first

century and developing illustrative examples of high quality assessment

tasks that underpin and reinforce these goals, for teachers to use as

appropriate.

The Bew Report (2011) on Key Stage 2 (KS2) testing indicates that statutory

testing will be further restricted, particularly with respect to testing Writing,

with "writing composition . . . subject only to . . . teacher assessment" (p. 14).

Meanwhile, national testing of the whole cohort in Science was ended in 2010

with achievement in KS2 Science now being monitored by the testing of a sample

of schools and students.

So, governments continue to learn the lessons of an over-concentration on testing,

though the inspection regime also needs to change. The focus of institutional

self-evaluation and other accountability mechanisms including inspection needs to

be on the quality of the learning experience in the classroom, rather than simply

concentrating on the outcomes produced.

But given what we now know about twenty years of such attempts to combine

amelioration with educational development, it might be argued that what is

needed is a much more radical break with current policy and practice. Continuing

centralisation of systemic control will threaten long term systemic sustainability,

professional development and capacity building. This issue was noted by the external

evaluation of the National Literacy and Numeracy Strategies when it reported

that continuing this kind of accountability for too long may result in a culture

of dependence (Earl et al., 2003, p. 6). Similar problems can be observed with

respect to the potential opportunities to develop teacher assessment. Although

national testing was ended at age seven and replaced by teacher assessment, teachers

continue to use past papers to provide them with evidence for their judgements

(Reed and Lewis, 2005). Comparable issues are emerging with the abolition of

national testing at age 14, with preparations for GCSE simply being started earlier.

Thus what is required is not just rolling back the boundaries of testing. The liber-ated intellectual and material possibilities for developing high quality teaching and

learning must be supported by positive programmes of professional development.

In the end, quality education must involve quality activities taking place in schools and classrooms, and this requires curriculum and professional development at local level. A radical version of the future would attempt not simply to ameliorate central control but to challenge it, and would seek to embed new ideas and practices at local level. It is necessary to invest in more creative forms of curriculum and professional development, especially with respect to classroom level assessment skills and understandings, and to re-build the communities of epistemological practice within which judgments about standards are made and ultimately reside.

Understanding and addressing key educational issues for the twenty-first

century requires much more curriculum flexibility and responsiveness and this





requires investment in teacher professional development at local level. Twenty

years of increasing central control and regulation have produced a narrow and

risk-averse education culture which is the very antithesis of the ostensible purpose

of the exercise. Producing higher test scores is not enough, for learners or for

governments. The need for better quality educational encounters and better quality

information with which to take decisions has never been more acute. We need to

stimulate new visions of what might be accomplished by our education system,

and new ways to record diverse experiences and outcomes, rather than continuing

to insist that all we can achieve is compliance with that which is already known.

9. NOTES

1 Earlier versions of this article were presented as a keynote speech to the New Zealand Association for Research in Education (NZARE) annual conference, University of Auckland, December 2010; the British Educational Studies Association (BESA) annual conference, Manchester Metropolitan University, July 2011; and as a paper to the British Educational Research Association (BERA) annual conference, London Institute of Education, September 2011.

2 General Certificate of Education Ordinary Level; GCE Advanced Level was and still is taken at around 18+ to qualify for entry to university.

3 Certificate of Secondary Education.

4 i.e. the equivalent of five GCSEs at grades A*–C: the top GCSE grades of A*–C are officially accepted as the equivalent of the old O-level passes; the percentage of students gaining at least five A*–Cs is the officially and commonly accepted measure of a good secondary education; the percentage of students gaining at least five A*–Gs (the full range of grades) is the officially and commonly accepted measure of a minimally satisfactory secondary education.

10. REFERENCES

Airasian, P. (1988) Measurement-driven instruction: a closer look, Educational Measurement: Issues and Practice, 7 (4), 6–11.

Bennett, R. (2011) Formative assessment: a critical review, Assessment in Education, 18 (1), 5–25.

Bew Report (2011) Independent Review of Key Stage 2 Testing, Assessment and Accountability. Available at: http://www.education.gov.uk/ks2review (accessed 25 August 2011).

Black, P. (1998) Testing: Friend or Foe? (London, Falmer Press).

Black, P. and Wiliam, D. (1998) Assessment and Classroom Learning, Assessment in Education, 5 (1), 7–74.

Black, P., McCormick, R., James, M. and Pedder, D. (2006) Learning how to learn and assessment for learning, Research Papers in Education, 21 (2), 119–132.

Bloom, B. (1974) An introduction to Mastery Learning Theory. In J. Block (Ed.) Schools, Society and Mastery Learning (New York, Holt, Rinehart and Winston).

Broadfoot, P., James, M., McMeeking, S., Nuttall, D. and Stierer, B. (1988) Records of Achievement: Report of the National Evaluation (London, HMSO).

Cambridge Assessment (2010) A better approach to regulating qualification standards. Available at: http://www.cambridgeassessment.org.uk/ca/Viewpoints/Viewpoint?id=134763 (accessed 4 July 2011).





Cambridge Assessment (2011) Higher education admissions test must be fair, valid and transparent. Available at: http://www.cambridgeassessment.org.uk/ca/Spotlight/Detail?tag=entry (accessed 4 July 2011).

Carless, D. (2011) From Testing to Productive Student Learning (London, Routledge).

Carless, D., Joughin, G., Liu, N-F. and Associates (2006) How Assessment Supports Learning (Hong Kong, Hong Kong University Press).

Cassen, R. and Kingdon, G. (2007) Tackling Low Educational Achievement (York, Joseph Rowntree Foundation).

Centre on Education Policy (2006) From the Capital to the Classroom: Year 4 of the No Child Left Behind Act: Summary and Recommendations. Available at: http://www.cep-dc.org/.

Centre on Education Policy (2007) Has Student Achievement Increased Since No Child Left Behind? Available at: http://www.cep-dc.org/.

Cowie, B. and Bell, B. (1999) A model of formative assessment in science education, Assessment in Education, 6, 101–116.

Daugherty, R. (1995) National Curriculum Assessment: a Review of Policy 1987–1994 (London, Falmer Press).

Department for Education (2011) The English Baccalaureate. Available at: http://www.education.gov.uk/schools/teachingandlearning/qualifications/englishbac/a0075975/theenglishbaccalaureate (accessed 4 July 2011).

Department of Education and Science (1987) Task Group on Assessment and Testing: A Report (London, DES).

Earl, L., Watson, N., Levin, B., Leithwood, K., Fullan, M., Torrance, N. et al. (2003) Watching and Learning 3: Final Report of the External Evaluation of England's Literacy and Numeracy Strategies; Executive Summary (Nottingham, DfES).

Ebel, R.L. (1972) Essentials of Educational Measurement (Englewood Cliffs, NJ, Prentice-Hall).

Galton, M., Simon, B. and Croll, P. (1980) Inside the Primary Classroom (London, Routledge and Kegan Paul).

Galton, M., Hargreaves, L., Comber, C. and Wall, D. (1999) Inside the Primary Classroom: 20 Years on (London, Routledge).

Gillborn, D. and Youdell, D. (2006) Educational triage and the D-to-C conversion: suitable case for treatment? In H. Lauder, P. Brown, J. Dillabough and A. H. Halsey (Eds) Education, Globalisation and Social Change (Oxford, Oxford University Press).

Gilmore, A. (2002) Large-scale assessment and teachers' assessment capacity: learning opportunities for teachers in the National Education Monitoring Project in New Zealand, Assessment in Education, 9 (3), 343–361.

Gipps, C. and Goldstein, H. (1983) Monitoring Children (London, Heinemann).

Glaser, R. (1963) Instructional technology and the measurement of learning outcomes, American Psychologist, 18, 519–522.

Hall, K., Collins, J., Benjamin, S., Nind, M. and Sheehy, K. (2004) SATurated models of pupildom: assessment and inclusion/exclusion, British Educational Research Journal, 30 (6), 801–881.

Hamilton, L. et al. (2007) Standards-based Accountability under No Child Left Behind (Santa Monica, Rand Education).

Hargreaves, E. (2007) The validity of collaborative assessment for learning, Assessment in Education, 14 (2), 185–199.

Hayward, G. and McNicholl, J. (2007) Modular mayhem? A case study of the development of the A level science curriculum in England, Assessment in Education, 14 (3), 335–351.

Inner London Education Authority (1984) Improving Secondary Schools (The Hargreaves Report) (London, ILEA).

Jackson, P. (1968) Life in Classrooms (New York, Holt, Rinehart and Winston).





Klein, S. et al. (2000) What do test scores in Texas tell us? Education Policy Analysis Archives, 8, 49. Available at: http://epaa.asu.edu/epaa/v8n49.

Linn, R. (2000) Assessments and accountability, Educational Researcher, 29, 4–16.

McNess, E., Triggs, P., Broadfoot, P., Osborn, M. and Pollard, A. (2001) The changing nature of assessment in English primary schools: findings from the PACE Project 1989–1997, Education 3–13, 29 (3), 9–16.

Mehan, H. (1979) Learning Lessons: Social Organisation in the Classroom (Harvard, Harvard University Press).

Mercer, N. (1995) The Guided Construction of Knowledge (Clevedon, Multi-Lingual Matters).

Murphy, R. and Torrance, H. (1988) The Changing Face of Educational Assessment (Maidenhead, Open University Press).

New Zealand Qualification Authority (2011) History of NCEA. Available at: http://www.nzqa.govt.nz/qualifications-standards/qualifications/ncea/understanding-ncea/history-of-ncea/ (accessed 7 July 2011).

No Child Left Behind Act (2001) Public Law 107–110. Available at: http://www.ed.gov/nclb/landing.jhtml.

Office for Standards in Education (2006) The Annual Report of Her Majesty's Chief Inspector of Schools 2005/06 (London, OfSTED). Available at: http://www.ofsted.gov.uk.

Popham, J. (1987) The merits of measurement-driven instruction, Phi Delta Kappan, 68, 679–682.

Reed, M. and Lewis, K. (2005) Key Stage 1 Evaluation of New Assessment Arrangements (Slough, NFER).

Resnick, L. and Resnick, D. (1992) Assessing the thinking curriculum. In B. Gifford and M. O'Connor (Eds) Future Assessments: Changing Views of Aptitude, Achievement and Instruction (Boston, Kluwer).

Sadler, R. (1989) Formative assessment and the design of instructional systems, Instructional Science, 18, 119–144.

Sadler, R. (1998) Formative assessment: revisiting the territory, Assessment in Education, 5 (1), 77–84.

Shavelson, R., Black, P., Wiliam, D. and Coffey, J. (2005) On linking formative and summative functions in the design of large-scale assessment systems. Available at: http://www.stanford.edu/dept/SUSE/SEAL/Reports_Papers/On%20Aligning%20Formative%20and%20Summative%20Functions_Submit.doc (accessed 4 July 2011).

Shepard, L. (1990) Inflated test score gains: is the problem old norms or teaching to the test? Educational Measurement: Issues and Practice, 9 (3), 15–22.

Shepard, L. (2000) The role of assessment in a learning culture, Educational Researcher, 29 (7), 4–14.

Sykes Review (2010) The Sir Richard Sykes Review of the future of English qualifications and assessment system. Available at: http://www.conservatives.com/news/news_stories/2010/03/~/media/Files/Downloadable%20Files/Sir%20Richard%20Sykes_Review.ashx (accessed 4 July 2011).

Torrance, H. (1981) The origins and development of mental testing in England and the United States, British Journal of Sociology of Education, 2 (1), 45–59.

Torrance, H. (Ed.) (1995) Evaluating Authentic Assessment: Issues, Problems and Future Possibilities (Buckingham, Open University Press).

Torrance, H. (2003) Assessment of the National Curriculum in England. In T. Kellaghan and D. Stufflebeam (Eds) International Handbook of Educational Evaluation (Dordrecht, Kluwer).





Torrance, H. (2005) Testing times for black achievement: some observations from England. Paper presented to Symposium 'Leaving No Child Behind: How Federal Education Agencies are Addressing Achievement Gaps for Linguistic and Racial/Ethnic Groups', American Educational Research Association Annual Conference, Montreal, 11–15 April.

Torrance, H. (2007) Assessment as Learning? How the use of explicit learning objectives, assessment criteria and feedback in post-secondary education and training can come to dominate learning, Assessment in Education, 14 (3), 281–294.

Torrance, H. and Pryor, J. (1998) Investigating Formative Assessment: Teaching, Learning and Assessment in the Classroom (Buckingham, Open University Press).

Wyatt-Smith, C., Klenowski, V. and Gunn, S. (2010) The centrality of teachers' judgement practice in assessment: A study of standards in moderation, Assessment in Education: Principles, Policy and Practice, 17 (1), 59–75.

Wyse, D. and Torrance, H. (2009) The development and consequences of national curriculum assessment for primary education in England, Educational Research, 51 (2), 213–238.

Correspondence

Professor Harry Torrance

The Education and Social Research Institute

Manchester Metropolitan University

799 Wilmslow Road

Manchester

M20 2RR

E-mail: [email protected]


