
    Identifying academic language needs throughdiagnostic assessment

    John Read*

    Department of Applied Language Studies and Linguistics, University of Auckland, Private Bag 92019, Auckland 1142, New Zealand

    Abstract

The increasing linguistic diversity among both international and domestic students in English-medium universities creates new challenges for the institutions in addressing the students' needs in the area of academic literacy. In order to identify students with such needs, a major New Zealand university has implemented the Diagnostic English Language Needs Assessment (DELNA) programme, which is now a requirement for all first-year undergraduate students, regardless of their language background. The results of the assessment are used to guide students to appropriate forms of academic language support where applicable. This article examines the rationale for the assessment programme, which takes account of some specific provisions governing university admission in New Zealand law. Then, drawing on the test validation framework by Read and Chapelle [Read, J., & Chapelle, C. A. (2001). A framework for second language vocabulary assessment. Language Testing, 18, 1–32], the article considers in some detail: 1) the way in which DELNA is presented to staff and students of the university, and 2) the procedures for reporting the results. It also considers the criteria by which the programme should be evaluated.

© 2008 Elsevier Ltd. All rights reserved.

    Keywords: Language assessment; English for academic purposes; Diagnosis; University admission; Undergraduate students; Language support

    1. Introduction

    The internationalisation of education in the major English-speaking countries has long created the need to provide

    various forms of academic language support for those international students who have been admitted to the institution,

but whose proficiency is still not fully adequate to meet the language demands of their degree studies. Language support most often takes the form of English for academic purposes (EAP) courses targeting specific skills such as writing or listening, but it can also include adjunct language classes linked to a particular content course, writing clinics, peer

    editing programmes, self-access centres, and so on. A typical strategy is to require incoming international students to

    take an in-house placement test, the results of which are used either to exempt individuals from the EAP programme or

    to direct them into the appropriate courses to address their needs. Accounts of tests designed broadly for this purpose

at various universities can be found in Brown (1993), Fox (2004), Fulcher (1997), and Wall, Clapham, and Alderson (1994).

* Tel.: +64 9 373 7599 x87673; fax: +64 9 308 2360.

    E-mail address: [email protected]

1475-1585/$ - see front matter © 2008 Elsevier Ltd. All rights reserved.

    doi:10.1016/j.jeap.2008.02.001

Journal of English for Academic Purposes 7 (2008) 180–190

www.elsevier.com/locate/jeap


    At the same time, it is now well recognised that many students who are not on student visas also have academic

    language needs. This may result from the success of policies to recruit students from indigenous ethnic or linguistic

minority groups which have traditionally been underrepresented in tertiary education. Another major category consists of relatively recent migrants or refugees, who have received much if not all of their secondary education in the host country and thus have met the academic requirements for university admission, but who still experience difficulties with academic reading and writing in particular (Harklau, Losey, & Siegal, 1999). The term Generation 1.5 has been coined in the US to refer to the fact that these students are separated from the country of their birth but often not fully integrated, linguistically, educationally or culturally, into their new society. Beyond these two identifiable

    categories, there is a broader continuum of academic literacy needs within the student body in the contemporary

    English-medium university, including many students who are monolingual in English.

    Although various forms of language support may be available to these domestic students on campus, the issue is

how to identify the ones who need such support and to what extent they should be required to take advantage of it.

    There can be legal or ethical constraints on directing students into language support on the basis of their language

    background or other demographic characteristics. It may also be counterproductive to make it obligatory for students

    to participate in a support programme when they have no wish to be set apart from their peers and are reluctant to

    acknowledge that they have language needs. One way to address the situation is to introduce some form of diagnostic

    assessment, comparable to the in-house placement tests for international students. In fact, one of the tests cited above

(Fulcher, 1997) was designed to be administered at the University of Surrey in the UK to all incoming students, regardless of their immigration status or language background. A similar solution is emerging at the university which is

    the subject of the present article.

    Having regard for these various considerations, it is necessary to give some careful thought to the development of

    an assessment procedure for this purpose. There are technical issues, such as how to assess native and non-native

    speakers by means of a common metric and how to reliably identify those with no need of language support within

the minimum amount of testing time. However, the focus of this discussion will be on the need to present the assessment to the students and to the university community in a manner that will achieve its desired goals while at the same

    time avoiding unnecessary compulsion.

    2. The context

    The particular case to be considered here is a programme called Diagnostic English Language Needs Assessment

    (DELNA), which has been implemented at the University of Auckland in New Zealand. The programme was intro-

duced to address concerns that developed through the 1990s with the influx of students who are now collectively identified as having English as an additional language (EAL). During that decade New Zealand tertiary institutions

    vigorously recruited international students, particularly from East Asia. These students were required to demonstrate

    their proficiency in English as a condition of admission. However, the typical requirement for undergraduates of Band

    6.0 in IELTS came to be recognised as a relatively modest level of English proficiency, particularly for students whose

    cultural background and previous educational experience made it difficult to meet the academic expectations of their

    lecturers and tutors (Read & Hayes, 2003). In the absence of any moves to raise the minimum English requirement for

entry, then, the University of Auckland, like other New Zealand universities and polytechnics, needed to provide

    various forms of ongoing language support for international students.

    The liberalisation of immigration policy in the late 1980s also opened up opportunities for skilled migrants and

business investors to migrate to New Zealand with their families. This led to an inflow of new immigrants from Taiwan, China, South Korea, India and Hong Kong, peaking in 1995 but continuing at lower levels to this day. The vast

    majority of the new immigrants settled in the Auckland metropolitan area and in time these communities produced

    substantial numbers of students for tertiary institutions in the region, and for the University of Auckland in particular.

    The students from these communities had quite similar linguistic, educational and cultural profiles to international

    students; many students in both categories had attended a New Zealand secondary school for one, two or more years

    before entering the university. However, there was one crucial difference. Under New Zealand law (the Education Act

    1989), permanent residents are classified as domestic students for the purpose of university admission and cannot be

    subjected to any entry requirement that is not also imposed on citizens of the country. This means specifically that new

    migrants cannot be targeted to take an English proficiency test or enrol in ESL classes as a condition of being admitted

    into a university.


    Another provision in the Education Act creates further challenges. The law allows any domestic student who has

    reached the age of 20 to apply for special admission to a New Zealand university, regardless of their level of prior

educational achievement. Thus, in principle adult migrants as well as citizens have had open entry to tertiary education, although in practice their choices have been constrained by admission requirements for particular degree programmes, and those lacking a New Zealand secondary school qualification are likely to be strongly counselled to

    initially take on a light, part-time workload.

    Students accepted for special admission have diverse language needs. Whereas those from the East Asian migrant

communities may resemble international students linguistically and culturally, others are mature students from English-speaking backgrounds who may not lack proficiency in the language as such but rather academic literacy. These

    students include members of the Pacific Nations communities (particularly from Samoa, Tonga, the Cook Islands,

    Niue and the Tokelau Islands) who may have native proficiency in general conversational English but whose low level

    of achievement in their secondary schooling would have excluded them from further educational opportunity, had the

special admission provision not been available. Although the Pacific communities are long established in New Zealand, it has only been in more recent years that the universities have made systematic efforts to recruit Pasifika students, with a particular emphasis on programmes in Education, Health Sciences and Theology.

Thus, through the 1990s the University of Auckland faced various challenges in responding to the growing linguistic diversity of its student body, not least because of the constraints imposed by the Education Act. Proposals from two leading professors (Ellis, 1998; Ellis & Hattie, 1999) that the university should introduce an entrance examination in English for students who could not produce evidence of adequate competence in the language received support from

    the Faculty of Arts and were accepted by the central administration of the university. The development and piloting of

the DELNA instruments took place in 2000–01 (Elder & Erlam, 2001) and the programme became operational in

    2002.

    3. DELNA: its philosophy and design

    Before looking at how DELNA operates in practice, it is useful to outline several basic principles underlying its

    development. To some extent, the principles reflect the constraints imposed on the university by the Education

    Act, but they can also be seen as a positive commitment by the institution to enhancing the educational opportunities

    of the whole student body.

One principle was that the test results would not play any role in admissions decisions; students were to be assessed only after they had been accepted into the university for their chosen degree programme. In this sense,

    then, the administration of DELNA represents a low-stakes situation, although from another point of view the

    stakes are higher for students who are at serious risk of failing courses or not achieving their academic potential

    as a result of their limited proficiency in the language. The university, too, has a stake in preserving academic

standards and maintaining good completion rates, particularly on equity grounds for Māori, Pasifika and other

    students from historically underrepresented groups on the campus.

As a means of emphasising the point that DELNA was not IELTS under another guise, it was deliberately called an "assessment" rather than a "test", and the individual components are known as "measures".

    There was to be an important element of personal choice for students in their participation in DELNA and their

    subsequent uptake of opportunities for language support and enhancement. In practice, particular departments

and degree programmes have required their students to take DELNA, to participate in some form of language support, or both, but the principle remains that students should be strongly encouraged to take advantage of this initiative rather than being compelled to do so against their will.

    DELNA represented a recognition by the university that it shares with students a joint responsibility to address

    academic language needs. This contrasts with the situation of international students applying for admission,

    where the onus is on the students to demonstrate, by paying a substantial fee for an international English

    test, that they have adequate competence in the language. For students and for departments, DELNA is free

    of charge and several of the language support options are available to students at no additional cost to them.

    In operation, DELNA involves two phases of assessment, Screening and Diagnosis, as shown in Table 1. The

    Screening measures were designed to provide a quick and efficient means of separating out native speakers and other


    proficient users of the language who were unlikely to encounter difficulties with academic English, and exempting

    them from further assessment. Both of the Screening measures are computer-based. One is a vocabulary test, assessing

knowledge of a sample of academic words by means of a simple word-definition matching format (Beglar & Hunt,

    1999). The other, variously known in the literature as a speed reading (Davies, 1975, 1990) or cloze-elide (Manning,

    1987) format, is a kind of reverse cloze procedure. In each line of an academic-style text an extraneous word is inserted

and the test takers must identify each inserted word under a speeded condition, which means that only the most proficient students complete all 73 items within the time available. In a validation study (Elder & Erlam, 2001), the reliability estimates were 0.87 for Vocabulary and 0.88 for Speed Reading. The two tests correlated with a composite

    listening, reading and writing score from the Diagnosis (see below) individually at 0.74 (vocabulary) and 0.77 (speed

    reading), and collectively at 0.82.
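To make the combination of the two Screening measures more concrete, the short sketch below (in Python, with entirely hypothetical scores) computes a composite Screening result and its Pearson correlation with a composite of the Diagnosis band scores, in the spirit of the Elder and Erlam (2001) analysis. None of the numbers or score ranges here are DELNA's actual values.

def pearson(x, y):
    # Pearson product-moment correlation for two equal-length lists of scores
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / (var_x * var_y) ** 0.5

# Hypothetical results for five candidates
vocabulary    = [58, 44, 71, 39, 65]        # academic vocabulary items correct
speed_reading = [55, 40, 68, 30, 60]        # cloze-elide insertions identified (max 73)
diagnosis     = [6.5, 5.0, 7.5, 4.5, 7.0]   # mean of listening, reading and writing bands

screening_composite = [v + s for v, s in zip(vocabulary, speed_reading)]
print(round(pearson(screening_composite, diagnosis), 2))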

    For students who score below a threshold level on the Screening, the three measures in the Diagnosis phase provide

a more extensive, task-based assessment of their academic language skills. Unlike the computerised Screening measures,

    they are all paper-based instruments. In the Listening test (30 min), the students hear an audio-recorded mini-lecture on

    a non-specialist topic and respond to short answer, multiple-choice and information transfer items. The Reading test

    (45 min) is based on one or two reading texts on topics of general interest totalling about 1200 words. Various item types

    are used, including cloze, information transfer, matching, multiple-choice, true-false and short answer. For the Writing

task (30 min), the candidates write 200 words of commentary on a social trend, as presented to them in the form of a simple table or graph. Their writing is rated on three analytic scales: fluency, content, and grammar and vocabulary.

    The Diagnosis phase takes 2 hours to administer, as compared to 30 min for the Screening, and is obviously more

    expensive in other respects, in that it requires manual scoring and, in the case of the writing task, double rating on the

three scales by trained examiners (for research on the training procedures, see Elder, Barkhuizen, Knoch, & von Randow, 2007; Knoch, Read, & von Randow, 2007). The Elder and Erlam (2001) validation study obtained reliability estimates of 0.82 for Listening and 0.83 for Reading. In the case of Writing, the two recent studies just cited (Elder et al., 2007; Knoch et al., 2007) produced estimates of 0.95–0.97 for the reliability of candidate separation, using the

    FACETS program.
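As an illustration only, the sketch below shows one straightforward way that a doubly rated writing script on the three analytic scales might be summarised into per-scale and overall results. The band values and the simple averaging are assumptions made for the example; they do not reproduce DELNA's operational rating procedure or the multi-faceted Rasch (FACETS) analysis.

SCALES = ("fluency", "content", "grammar_and_vocabulary")

def summarise_writing(rater_a, rater_b):
    # Average the two raters' bands on each analytic scale, then across the scales
    per_scale = {scale: (rater_a[scale] + rater_b[scale]) / 2 for scale in SCALES}
    overall = round(sum(per_scale.values()) / len(SCALES), 1)
    return per_scale, overall

# Hypothetical ratings on the 9 (high) to 4 (low) band scale
first_rater  = {"fluency": 6, "content": 5, "grammar_and_vocabulary": 6}
second_rater = {"fluency": 5, "content": 5, "grammar_and_vocabulary": 6}
print(summarise_writing(first_rater, second_rater))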

    Further details of the two phases of the DELNA assessment, including sample items and tasks, can be found in the

    DELNA Handbook, which is downloadable from the programme website: www.delna.auckland.ac.nz.

    Set out this way, DELNA looks very much like a conventional language test. Certainly the Diagnosis tasks are sim-

ilar to those found in IELTS and other EAP proficiency tests. However, the intended purpose of the instrument is different and this means that it needs to be presented in a distinctive manner, in keeping with the principles outlined at the beginning of this section.

    4. An analysis of test purpose

    A useful framework for analysing how test purpose should influence test design and delivery is that developed by

Read and Chapelle (2001). Although the framework is exemplified in terms of vocabulary testing, it has general applicability to various forms of language assessment. As shown in Fig. 1, the framework has numerous components and

    it is beyond the scope of the present article to consider them all in detail.

At the top level of the framework, test purpose is decomposed into three components (inferences, uses and intended impacts), which in turn lead to validity considerations and mediating factors. It is the second and third mediating factors which are of particular concern here, but it is also necessary to address the first component briefly.

Table 1

The structure of DELNA

Screening (30 min): Vocabulary; Speed Reading

Diagnosis (2 hours): Listening to a mini-lecture; Reading academic-type texts; Writing an interpretation of a graph


    4.1. Construct definition

The inferences to be made on the basis of performance in DELNA can be defined in terms of academic literacy in English: the ability of incoming undergraduate students to cope with the language demands of their degree programme. Although ultimately the assessment is targeted at students for whom English is an additional language (EAL), the construct is broader than academic literacy in English as an additional language because many of those to be assessed come from English-speaking backgrounds, and the whole function of the initial Screening phase of DELNA is to separate out students for whom adequate academic literacy is unlikely to be at issue. Designing a test for students with English as both a first and an additional language creates a special challenge because it cannot be assumed that items and tasks will perform the same way for the two groups. Elder, McNamara, and Congdon (2003)

    used Rasch analysis to investigate this issue and found a somewhat complex pattern, whereby each of the DELNA

    tasks except the vocabulary measure exhibited some significant bias in favour of either native or non-native speakers.

    However, since the bias was in both directions and relatively small in magnitude overall, the researchers considered

    that it was within tolerable limits for a low-stakes assessment of this kind.
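Elder, McNamara, and Congdon (2003) used Rasch techniques for this analysis. As a much cruder sketch of the underlying idea, checking whether a single item behaves differently for native (NS) and non-native (NNS) speakers once overall ability has been taken into account, the hypothetical code below compares proportion-correct on an item across groups matched on total score. The data format, group labels and banding scheme are all invented for illustration.

from collections import defaultdict

def item_bias_check(responses, groups, item_index, n_bands=4):
    # responses: one 0/1 vector per candidate; groups: "NS" or "NNS" per candidate.
    # Returns the mean NS-NNS gap in proportion correct across matched total-score bands;
    # positive values suggest the item favours native speakers, negative values the reverse.
    totals = [sum(r) for r in responses]
    lo, hi = min(totals), max(totals)
    width = max(1, (hi - lo + 1) // n_bands)

    correct = defaultdict(int)
    count = defaultdict(int)
    for resp, group, total in zip(responses, groups, totals):
        band = min((total - lo) // width, n_bands - 1)
        correct[(band, group)] += resp[item_index]
        count[(band, group)] += 1

    gaps = []
    for band in range(n_bands):
        if count[(band, "NS")] and count[(band, "NNS")]:
            p_ns = correct[(band, "NS")] / count[(band, "NS")]
            p_nns = correct[(band, "NNS")] / count[(band, "NNS")]
            gaps.append(p_ns - p_nns)
    return sum(gaps) / len(gaps) if gaps else 0.0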

Read and Chapelle (2001) distinguish three levels of inference: whole test, sub-test and item. For DELNA, item-level inferences are not appropriate. In the Screening phase, the construct is defined specifically in terms of efficient

    access to academic language knowledge and it is sufficient to make inferences at the level of the whole test. Thus, the

    vocabulary and speed reading scores are combined into a single result to determine whether the student should proceed

    to the Diagnosis phase.

Elder and von Randow (in press) have investigated the validity of inferences based on the Screening score, examining its suitability as a basis for determining whether students needed to proceed to the Diagnosis. Their study

    involved an analysis of the performance of 353 students who took both the Screening and Diagnosis measures. A

    minimum criterion score was set on the basis of performance in the listening, reading and writing tests of the

    Diagnosis phase. Then, by means of regression analysis, an optimum cut score (combining the vocabulary and speed

    reading scores) was established for the Screening phase. This cut score successfully identified 93% of the students

    whose performance fell below the criterion level in the Diagnosis phase. However, it also meant that relatively few

    students would be exempted from taking the costly Diagnosis measures and so, with financial considerations in

    mind, a lower cut score was set. The lower score identified only 81% of the students who were under the criterion

level but, on the other hand, it resulted in less than 1% of false negatives: students below the cut score who nevertheless had achieved the criterion level in the Diagnosis. Therefore, for operational purposes it is only students whose

    Screening performance falls under the lower cut score who are required to proceed to the Diagnosis. Those who

are between the two cut scores receive a general recommendation to seek academic language support (see 4.3 below).
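The cut-score trade-off described above can be sketched in a few lines. The function below, with entirely hypothetical scores and thresholds, computes the two figures at issue: the proportion of genuinely at-risk candidates that a given Screening cut score identifies, and the proportion of false negatives in the article's sense (candidates below the cut score who had in fact reached the Diagnosis criterion). It illustrates the kind of analysis reported by Elder and von Randow rather than their actual procedure.

def cut_score_tradeoff(screening, diagnosis, cut_score, criterion):
    # screening: combined Screening scores; diagnosis: composite Diagnosis scores
    pairs = list(zip(screening, diagnosis))

    at_risk = [s for s, d in pairs if d < criterion]            # truly below the criterion
    identified = sum(1 for s in at_risk if s < cut_score)       # ...and caught by the cut score

    flagged = [d for s, d in pairs if s < cut_score]            # required to take the Diagnosis
    false_negatives = sum(1 for d in flagged if d >= criterion) # ...but actually at criterion

    return {
        "at_risk_identified": identified / len(at_risk) if at_risk else None,
        "false_negatives": false_negatives / len(flagged) if flagged else 0.0,
    }

# Hypothetical data: compare a stricter and a more lenient cut score directly
screening = [95, 102, 110, 88, 120, 99, 105, 91]
diagnosis = [5.0, 6.5, 7.0, 4.5, 7.5, 5.5, 6.0, 5.0]
print(cut_score_tradeoff(screening, diagnosis, cut_score=106, criterion=6.0))
print(cut_score_tradeoff(screening, diagnosis, cut_score=100, criterion=6.0))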

[Fig. 1. A framework for incorporating a systematic analysis of test purpose into test validation (adapted from Read & Chapelle, 2001, p. 10). The figure relates three components of test purpose (inferences, uses and intended impacts) to the corresponding validity considerations (construct validity, relevance and utility, actual consequences) and mediating factors (construct definition, performance summary and reporting, test presentation), which feed into test design (decisions about the structure and formats of the test) and validation (arguments based on theory, evidence and consequences).]


    For those who complete the Diagnosis, sub-test inferences are desirable so that students can be advised on whether

    they should seek language support in each of the three skill areas of listening, reading and writing. This means that

each sub-test needs to provide a reliable measure of the skill involved. The reliability estimates quoted in Section 3 are

    very satisfactory from this perspective.

    4.2. Test presentation

    Although test presentation comes third in the Read and Chapelle framework, it is more appropriate to discuss it

    next in this account of DELNA. Presentation is a mediating factor that comes from a consideration of the impact of

a test (Messick, 1996). Read and Chapelle (2001) point out that most research on impact in language testing has focused on the washback effects of existing tests and examinations (see, e.g., Alderson, 1996; Cheng & Watanabe, 2004). However, Read and Chapelle argue that if the consequences of implementing a test are to be seen as an integral element in evaluating its quality, a statement of the intended impact of the instrument needs to be included in

    the specification of test purpose early in the development of a new test. Thus, the actual consequences of putting the

    test into operation can be evaluated by reference to the prior statement of intended impact. This means in turn that

    the test developers should consider how the intended impact can be achieved through the way that the test is

    presented.

Test presentation is a concept that has not received much attention in the literature and it deserves some consideration here. It consists of a series of steps, taken as part of the process of developing and implementing the test, to influence its impact in a positive direction. Since there are numerous stakeholders in assessment, particularly when the stakes are high, "[t]est developers choose to portray their tests in ways that will appeal to particular audiences" (Read & Chapelle, 2001, p. 18). These can include educational administrators, teachers, parents, users of the test results, and

of course the test takers, who need to be familiar with the test formats and willing to accept that the test is a fair assessment of their language abilities.

    Seen in this light, test presentation has a strong connection to that much maligned concept in testing, face validity.

Authors of introductory texts on language testing, starting with Lado (1961), have generally dismissed this concept as not being a credible form of evidence to support a validity argument, since it is based on "simple inspection" (Lado, 1961, p. 321) or "the judgment of an untrained observer" (Davies et al., 1999, p. 59).

However, this rejection of the concept has generally been accompanied by an acknowledgement that, although the term may be a misnomer, it represents a matter of genuine concern in testing. That is to say, test developers are confronted with a real problem if, regardless of the technical merits of the test, one or more of the stakeholder groups are not convinced that the content or the formats are suitable for the assessment purpose. Thus, Alderson, Clapham, and Wall (1995) give face validity a positive gloss as meaning "acceptable to users" (p. 173), echoing Carroll (1980), who had earlier proposed acceptability as one of the four desirable characteristics (along with relevance, comparability and economy) of a communicative test. In addition, Bachman and Palmer (1996, p. 42) and Davies et al. (1999, p. 59) refer to the even more positive notion of "test appeal".

    Thus, test presentation can be seen as a proactive approach to promoting the acceptability of the test to the various

stakeholders, and above all to the test takers, in order to achieve the intended impact. The test developer needs to ensure that the purpose and nature of the assessment are clearly understood, that it meets stakeholder expectations as much

    as possible, and that the test takers in particular engage with the test tasks in a manner that will help produce a valid

    measure of their language ability. Major proficiency tests generate a strong external motivation for students because of

the stakes involved, whereas with a programme like DELNA it is more important to create a positive internal motivation based on a recognition of the benefits that the results of the assessment may bring for the student.

    4.2.1. Presentation of DELNA to students

The general principles underlying the presentation of DELNA are those that were introduced in Section 3 above: the fact that the results are never used for admissions purposes; the term "assessment" is preferred to "test"; there is a significant element of personal choice for students; and the university shares with its students the responsibility for addressing their academic language needs. The slogan "Increases your chance of success", which has featured in DELNA publicity, is also intended to express the positive intent of the programme.

    There have been two main pathways to DELNA for students entering the university each semester. The first was

    literally by invitation. In the admissions office the records of incoming domestic students (citizens and permanent


    residents) were reviewed to identify those who had not provided evidence of their competence in English for tertiary-

level study. Students coming directly from secondary school in the last few years hold the National Certificate of Educational Achievement (NCEA), which includes a literacy requirement to demonstrate proficiency in academic

    reading and writing in English. However, mature students and recently arrived immigrants who enter the university

    under special admission often lack a recognised secondary school qualification or any other evidence of academic

    literacy.

Students thus identified received a letter inviting them to take the DELNA Diagnosis. For statutory reasons, as previously explained, the university could not make it mandatory but, that consideration aside, the wording of the letter had a positive tone which emphasised the intended role of DELNA in enhancing the students' study experience. Initially, the uptake of these invitations was relatively low but by 2005 it had reached about 40% (116/295).

    The other main pathway, which has now essentially superseded the first one, results from decisions by departments

    or faculties to require all students in designated first-year courses to take DELNA. Initially, this applied to programmes

which attracted a high proportion of EAL students, such as the Bachelor degrees in Business and Information Management, and in Film, TV and Media Studies. However, from 2007 it has officially become a requirement for almost all

    first-year students, regardless of their language background, to take the DELNA Screening. This not only observes the

    legal niceties but also highlights the important role of the Screening phase in efficiently separating out academically

    literate students for exemption from further assessment.

In 2007 a total of 5427 students were assessed through the DELNA programme. These students are estimated to represent around 70% of all the first-year students at the university that year, although the percentage is higher if

    groups such as transferring and exchange students are excluded. Of those who completed the Screening, 1208

    were recommended to return for the Diagnosis phase; however, only 504 (42%) did so. This shortfall is discussed

in Section 5 below.

    In terms of presentation, as DELNA assessment has become the norm for first-year students, it is increasingly

accepted as just another part of the experience of entering the university. Students are informed of the assessment requirement in their department's handbook and can obtain further information from the programme website, including

    the downloadable DELNA Handbook, with its sample versions of the assessment tasks and advice on completing

    them. In addition, it is easy for students to book for a DELNA session online at their preferred day and time. One other

    appealing feature of the Screening measures in particular is that they are computer-administered, which adds a novelty

    value for students who may never have taken such a language test before.

    4.2.2. Presentation of DELNA to staff

    Much of the initial impetus for the development of DELNA came from the concerns of teaching staff across the

university in the 1990s about the academic literacy needs of students in their classes. This created a positive environment for the acceptance of a programme like DELNA to address those needs, but of course that is not the same as an

    understanding of how this particular programme works.

The establishment of DELNA saw the formation of a Reference Group chaired by the Deputy Vice-Chancellor (Academic) and composed of representatives from all the university faculties as well as from the various language support

    programmes. The group meets regularly to discuss policy issues, monitor the implementation of DELNA and provide

a channel of communication from the programme staff to the faculties. The departments which were the early participants in DELNA are well represented on the group but, as the assessments have expanded, it has been necessary to

    open new avenues of communication to academic and administrative staff across the university to ensure that: a) an

informed decision is made when departments or faculties decide to require their first-year students to take the assessment; b) the relevant staff correctly interpret the DELNA results when they receive them; and c) effective follow-up

    action is taken to give students access to language support if they need it.

In 2005 an information guide for staff was produced in pamphlet form and it has been followed by an FAQ document. However, experience has shown that the printed material must be backed up by face-to-face meetings with key

    staff members responsible for DELNA administration in particular faculties or departments.

    4.3. Performance summary and reporting

This brings us back to the second mediating factor of the Read and Chapelle (2001) framework, performance summary and reporting, which relates to the intended use of the test. The assessment results are used to identify students


who may be at risk because of their low academic literacy and then to advise them on suitable forms of language support and enhancement. Where participation in DELNA is a course requirement, the results also go to the academic

    programme or department for follow-up action as appropriate. Thus, the two main recipient groups for the results

    are the students and their departments.

Given that the whole purpose of the programme is to address academic literacy needs, the reporting of student performance includes not only the assessment result but also a recommendation for language enhancement where appropriate. At this point, then, it is useful to list the main language support options available on the campus.

Credit courses in academic language skills: ESOL 100–102 for EAL students, and ENGWRIT 101, a writing course for students from English-speaking backgrounds.

Workshops, short non-credit courses, individual consultations and self-access study facilities offered by the Student Learning Centre (SLC, available to all students) and the English Language Self-Access Centre (ELSAC, specialising in services for EAL students).

    Discipline-specific language tutorials linked to particular courses (following a kind of adjunct model) which

    have for some time attracted a high proportion of EAL students. Currently these courses are in Commerce,

    Health Sciences, Theology, and Film, TV and Media Studies.

The Screening phase of DELNA is primarily intended to exempt highly proficient students from further assessment. Thus, the scores from the vocabulary and speed reading measures are combined to divide the test-takers into three categories with deliberately non-technical labels (a schematic sketch of the mapping follows the list):

Good: no language enrichment required.

Satisfactory: some independent activity at SLC or ELSAC recommended.

Recommended for Diagnosis: should take the DELNA Diagnosis.
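Schematically, and only as an illustration, the mapping from a combined Screening score to these three categories might look like the sketch below. The cut scores shown are invented; as noted elsewhere, the operational cut scores depend on the test form.

GOOD_CUT = 120          # hypothetical combined-score threshold
SATISFACTORY_CUT = 100  # hypothetical combined-score threshold

def screening_category(vocabulary_score, speed_reading_score):
    combined = vocabulary_score + speed_reading_score
    if combined >= GOOD_CUT:
        return "Good"                     # no language enrichment required
    if combined >= SATISFACTORY_CUT:
        return "Satisfactory"             # independent activity at SLC or ELSAC recommended
    return "Recommended for Diagnosis"    # should take the DELNA Diagnosis

print(screening_category(68, 60))  # -> "Good" with these invented cut scores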

The Screening result is sent individually to each student by email and, when DELNA is a departmental requirement, an Excel file of results for each course is forwarded to a designated staff member. Until 2006, the Screening

    reports included the two actual test scores for vocabulary and speed reading. However, the fact that the cut scores

for the three categories varied according to which form of the test each student took caused some confusion and, in addition, there were indications that the scores were being used in at least one academic programme as quasi-proficiency measures to assign students to tutorial groups according to their language ability. This led to the current policy of reporting just the student's category.

    In the case of the Diagnosis phase, a scale modelled on the IELTS band scores (from a top level of 9 down to 4)

    has been used for rating performance and reporting the results to students. However, for reporting to staff a simpler

    A-B-C-D system is used for each of the three skills (listening, reading and writing). The A and B grades correspond to

    the Good and Satisfactory categories respectively in the Screening, and students whose three-grade average is at one of

    these levels receive an email report. On the other hand, students averaging in the C and D range, who are considered to

    be at significant risk, are sent an email request to collect their results in person from the DELNA language adviser. As

    with the Screening, the results are also sent to the designated staff member when the Diagnosis is required by the

    department.
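A rough sketch of this reporting logic follows: band scores for the three skills are converted to the letter grades used in staff reports, and the average band determines whether the student simply receives an emailed report or is asked to collect the results from the language adviser. The band-to-grade boundaries are assumptions for illustration only; the operational conversion is not specified in the article.

def band_to_grade(band):
    # Hypothetical boundaries on the 9 (high) to 4 (low) band scale
    if band >= 8:
        return "A"
    if band >= 7:
        return "B"
    if band >= 6:
        return "C"
    return "D"

def diagnosis_report(listening, reading, writing):
    bands = {"listening": listening, "reading": reading, "writing": writing}
    grades = {skill: band_to_grade(b) for skill, b in bands.items()}
    average_band = sum(bands.values()) / 3
    # A or B averages are emailed; C or D averages see the language adviser in person
    if band_to_grade(average_band) in ("A", "B"):
        delivery = "email report"
    else:
        delivery = "collect results from the DELNA language adviser"
    return grades, delivery

print(diagnosis_report(6, 7, 5))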

    The appointment of the language adviser, beginning in 2005, resulted from a recognition that students scoring low

in the Diagnosis were generally not accessing the recommended language support. A small-scale study by Bright and

    von Randow (2004), involving follow-up interviews with eighteen DELNA candidates at the end of the academic year,

    found that only four of them had taken the specific advice given in the results notification. Although most of the

    participants had in fact passed their courses, they acknowledged that they had really struggled to meet the language

demands of their studies. One strong message from the interviews was that the students would have greatly appreciated the opportunity to discuss their DELNA results and their language support options face-to-face, rather than just

    receiving the impersonal emailed report. Thus, the language adviser now meets with each student individually, goes

    over their profile of results, and directs them to the most appropriate form(s) of support. She often follows up the initial

    meeting with ongoing monitoring of their progress through the semester or even longer.

    Thus performance summary and reporting in this case involves not simply the form of the report but also, for the

    less proficient students, the medium by which the result is communicated to them.


    5. Evaluating the programme

The extended discussion in Section 4, drawing on the Read and Chapelle (2001) framework, has shown how the purpose of the assessment has been worked out through the design and delivery mechanisms of DELNA. At the time of writing, the programme is still being rolled out. It has yet to achieve full participation by the incoming first-year student population in the Screening phase, and furthermore in 2006 only a third of the students (444 out of 1340) who were recommended for Diagnosis on the basis of their Screening results actually went on to the second phase of the assessment. Higher levels of participation will depend on the extent to which faculties and departments enforce the requirement that their students should take one or both phases of DELNA. Some academic programmes have introduced specific incentives for students to take the assessment, by, for instance, withholding the first essay grade or subtracting a few percent of the final course grade of students who do not comply.

    However, the point of the exercise is not just to assess the students but rather to address their academic language

needs where appropriate. As noted in the previous section, there is now provision within the DELNA programme itself, through the work of the language adviser, to provide intensive counselling for those students whose results in the

    Diagnosis phase show that they have the most serious language needs. Some academic units have introduced their own

follow-up measures for such students. For example, the Bachelor's degree in Business and Information Management has a well-established Language and Communication Support Programme (LanCom), which integrates various forms of support into the delivery of its courses. In the Faculty of Engineering, students who score below a minimum level in the DELNA Diagnosis must undertake a quasi-course with its own course code, involving attendance at 15 hours of

    workshops at the Student Learning Centre (SLC) and satisfactory completion of another 15 hours of directed study at

    the English Language Self-Access Centre (ELSAC).

With the expansion of DELNA assessment into the Faculties of Arts and Science, it is more of a challenge to respond to the language needs of students enrolled for a degree which includes courses offered by several different departments. In the first instance, the Screening results may simply provide course conveners with a broad profile of the

    language needs of their students, who may be several hundred in number. Many departments lack the resources to offer

specialised language support to their students. One realistic option for them is the introduction of systematic procedures for referring students in need to SLC or ELSAC; another option may be to review their teaching and assessment

    practices to avoid creating unnecessary difficulties for EAL students in their courses.

Returning briefly to the Read and Chapelle (2001) framework, one element in the validation of a test or assessment procedure is an investigation of its actual consequences as compared to its intended impact. At the institutional level, the intended impact can be defined in terms of levels of academic literacy in the student population. The implementation of DELNA is supposed to lead to a meaningful reduction over time in the number of students whose academic

    performance is hampered by language-related difficulties. The question is what kind of data counts as evidence that

    the goal is being achieved for the undergraduate student body as a whole.

Davies and Elder (2005) took up this point in their review of current theory and practice in language test validation, using DELNA as a case study. They formulated a series of eight hypotheses that can be investigated to build an argument for the validity of DELNA. Most of the hypotheses relate either to the technical qualities of the DELNA tests

    as measures of academic literacy or to the utility of the scores to the users. However, the final hypothesis takes up the

    issue of the wider impact of the programme:

H.8 "The student population will benefit from the information that the test provides" (2005, p. 805).

    Davies and Elder highlight a number of challenges in, first, defining the nature of the benefit and then gathering

evidence in support of the hypothesis. One way to address the hypothesis would be to define the benefit as an increase in academic literacy, as measured by a further assessment of students' language proficiency after, say, a semester or two of

    study. However, DELNA is set up as a one-time assessment for each student and the system blocks them from taking it

    more than once. In addition, there are currently no plans to introduce an exit test of English proficiency for graduating

    students.

    This means that we need to look for benefits in other ways. One kind of evidence relates to student uptake of the

    DELNA advice by accessing the various language support options available to them. If they enrol in an ESOL credit

    course or attend a tutorial linked to one of their subject courses, their progress and end-of-course achievement will be

    assessed by their tutors. On the other hand, it is more of a challenge to monitor the benefit gained by students who

    participate in the support programmes at SLC and ELSAC. Students are required to register when they first access


    these programmes and records are kept of their attendance at workshops and individual consultations, but that is not

the same as assessing the benefit of these language support opportunities in improving the students' academic language proficiency.

    A broader approach to the situation is to look at grade point averages and retention rates for whole undergraduate

cohorts, particularly in courses with large EAL student enrolments. As Davies and Elder (2005) point out, though, it is

    difficult to separate out language proficiency from academic ability, motivation, sociocultural adjustment and the

    range of other factors that influence student achievement in their university studies, particularly if underachievement

    is represented not just by dropout or failure rates but also by lower grades than the student might otherwise have

    achieved. The issues involved are reminiscent of those which have complicated research on the predictive validity

of major proficiency tests like TOEFL and IELTS (see, e.g., Hill, Storch, & Lynch, 1999; Light, Xu, & Mossop, 1987).

    Thus, global university-wide measures of impact may prove to be less useful than more focused investigations of

    particular groups of students. One such study, being conducted by the DELNA Programme in conjunction with the

Department of Film, TV and Media Studies, is tracking a cohort of students through their three years of study towards

    a BA major in FTVMS. The data include annual interviews with the students as well as the quantitative measures

    provided by the initial DELNA results and their course grades. Through this kind of targeted research, it will become

    possible to develop a validity argument that combines rich qualitative evidence with more objective measures of

students' language proficiency and academic achievement.

    6. Conclusion

    The DELNA assessment programme has a number of features that differentiate it from other tests of English for

    academic purposes. First, it does not function as a gatekeeping device for university admission, and students cannot be

    excluded from the institution on the basis of their results in either phase of the assessment. The fact that some students

    find this hard to believe helps to account for the relatively low participation rate in the Diagnosis phase of DELNA

    among those who are recommended to take it. Secondly, it is not simply a placement procedure to direct students

    into one or more courses within a required EAP programme according to their level and areas of need. There is a range

    of language support options that students are recommended to participate in as appropriate. A related feature is the

    distinctive philosophy behind the programme which holds that students should retain a degree of personal choice

as to whether they take advantage of the opportunities for language and study support which are available to them. Although it partly reflects the constraints imposed by national education legislation, this approach is also based on

    the assumption that academic language support will be more effective if students recognise for themselves the extent

    of their language needs and make a commitment to attend to them.

    One other important characteristic of DELNA is that it is centrally funded with a direct management line to the

office of the Deputy Vice-Chancellor (Academic). Although its offices are located in the Department of Applied Language Studies and Linguistics, the programme has always been conceived as a university-wide initiative. This helps to

    avoid any perception that DELNA is just serving the interests of a particular department or faculty. It is an issue that

    has emerged in discussions with staff from other New Zealand universities about the possibility of introducing

    a DELNA programme on their own campuses. Initial enquiries have typically come from student learning advisers

    or ESOL tutors who have thought in terms of purchasing a set of diagnostic tests for their own institution. However,

a briefing on the full scope of DELNA and its associated language support provisions reveals how much more is involved, with a firm commitment by senior management being a crucial element in the successful operation of the programme at Auckland.

    The DELNA programme is moving to a consolidation phase after the considerable expansion in the coverage of

    incoming undergraduate students over the past couple of years. There is a consequent need to ensure that effective

use is made of the DELNA results and that an increasing proportion of the targeted students participate in the appropriate forms of language support and enhancement. The position of DELNA as a centrally funded programme is secure for the foreseeable future, although it remains to be seen to what extent the university will be able to commit

    sufficient resources to meet the range of language needs that the assessment results are revealing. Other related issues

may yet emerge, such as the need to set language proficiency standards for students graduating from Bachelor's programmes or concerns about the academic literacy of postgraduate students. For now, though, it is widely accepted

    within the institution that DELNA is a very worthwhile means of addressing the language needs of incoming

    undergraduates.


    References

    Alderson, J. C. (Ed.). (1996). Washback in language testing [Special issue]. Language Testing, 13(3).

    Alderson, J. C., Clapham, C., & Wall, D. (1995). Language test construction and evaluation. Cambridge: Cambridge University Press.

    Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice. Oxford: Oxford University Press.

Beglar, D., & Hunt, A. (1999). Revising and validating the 2000 word level and university word level vocabulary tests. Language Testing, 16, 131–162.

Bright, C., & von Randow, J. (2004, September). Tracking language test consequences: The student perspective. Paper presented at the Ninth National Conference on Community Languages and English for Speakers of Other Languages (CLESOL), Christchurch, New Zealand.

Brown, J. D. (1993). A comprehensive criterion-referenced language testing project. In: D. Douglas, & C. Chapelle (Eds.), A new decade of language testing research (pp. 163–184). Washington, DC: TESOL.

Carroll, B. J. (1980). Testing communicative performance. Oxford: Pergamon.

Cheng, L., & Watanabe, Y. (2004). Washback in language testing: Research contexts and methods. Mahwah, NJ: Lawrence Erlbaum Associates.

Davies, A. (1975). Two tests of speed reading. In: R. L. Jones, & B. Spolsky (Eds.), Testing language proficiency (pp. 119–130). Arlington, VA: Center for Applied Linguistics.

    Davies, A. (1990). Principles of language testing. Oxford: Blackwell.

    Davies, A., Brown, A., Elder, C., Hill, K., Lumley, T., & McNamara, T. (1999). A dictionary of language testing. Cambridge: Cambridge

    University Press.

Davies, A., & Elder, C. (2005). Validity and validation in language testing. In: E. Hinkel (Ed.), Handbook of research in second language teaching and learning (pp. 795–813). Mahwah, NJ: Lawrence Erlbaum.

Elder, C., Barkhuizen, G., Knoch, U., & von Randow, J. (2007). Evaluating rater responses to an online training program for L2 writing assessment. Language Testing, 24, 37–64.

Elder, C., & Erlam, R. (2001). Development and validation of the Diagnostic English Language Needs Assessment (DELNA): Final report. Auckland: Department of Applied Language Studies and Linguistics, University of Auckland.

Elder, C., McNamara, T., & Congdon, P. (2003). Rasch techniques for detecting bias in performance assessments: An example comparing the performance of native and non-native speakers on a test of academic English. Journal of Applied Measurement, 4, 181–197.

    Elder, C., & von Randow, J. (in press). Exploring the utility of a web-based English language screening tool. Language Assessment Quarterly.

    Ellis, R. (1998). Proposal for a language proficiency entrance examination. Unpublished manuscript, New Zealand: University of Auckland.

    Ellis, R., & Hattie, J. (1999). English language proficiency at the University of Auckland: A proposal. Unpublished manuscript, New Zealand:

    University of Auckland.

Fox, J. (2004). Test decisions over time: Tracking validity. Language Testing, 21, 437–465.

Fulcher, G. (1997). An English language placement test: Issues in reliability and validity. Language Testing, 14, 113–139.

    Harklau, L., Losey, K. M., & Siegal, M. (Eds.). (1999). Generation 1.5 meets college composition: Issues in the teaching of writing to U.S.

    educated learners of ESL. Mahwah, NJ: Lawrence Erlbaum.

Hill, K., Storch, N., & Lynch, B. (1999). A comparison of IELTS and TOEFL as predictors of academic success. In: R. Tulloh (Ed.), IELTS research reports (Vol. 2, pp. 52–63). Canberra: IELTS Australia.

Knoch, U., Read, J., & von Randow, J. (2007). Re-training writing raters online: How does it compare with face-to-face training? Assessing Writing, 12, 26–43.

    Lado, R. (1961). Language testing. London: Longman.

Light, R. L., Xu, M., & Mossop, J. (1987). English proficiency and academic performance of international students. TESOL Quarterly, 21, 251–261.

Manning, W. H. (1987). Development of cloze-elide tests of English as a second language. TOEFL Research Report, No. 23. Princeton, NJ: Educational Testing Service.

Messick, S. (1996). Validity and washback in language testing. Language Testing, 13, 241–256.

Read, J., & Chapelle, C. A. (2001). A framework for second language vocabulary assessment. Language Testing, 18, 1–32.

Read, J., & Hayes, B. (2003). The impact of IELTS on preparation for academic study in New Zealand. In: R. Tulloh (Ed.), IELTS research reports 2003 (Vol. 4, pp. 153–205). Canberra: IELTS Australia.

Wall, D., Clapham, C., & Alderson, J. C. (1994). Evaluating a placement test. Language Testing, 11, 321–344.

John Read is Head of the Department of Applied Language Studies and Linguistics at the University of Auckland. His primary research interests

    are in vocabulary assessment and testing English for academic and professional purposes. He is the author of Assessing Vocabulary (Cambridge,

    2000) and has been co-editor of Language Testing.
