J. Read 2008 - Diagnostic Assessment
Identifying academic language needs through diagnostic assessment
John Read*
Department of Applied Language Studies and Linguistics, University of Auckland, Private Bag 92019, Auckland 1142, New Zealand
Abstract
The increasing linguistic diversity among both international and domestic students in English-medium universities creates new
challenges for the institutions in addressing the students' needs in the area of academic literacy. In order to identify students with
such needs, a major New Zealand university has implemented the Diagnostic English Language Needs Assessment (DELNA) pro-
gramme, which is now a requirement for all first-year undergraduate students, regardless of their language background. The results
of the assessment are used to guide students to appropriate forms of academic language support where applicable. This article ex-
amines the rationale for the assessment programme, which takes account of some specific provisions governing university admis-
sion in New Zealand law. Then, drawing on the test validation framework of Read and Chapelle [Read, J., & Chapelle, C. A. (2001). A
framework for second language vocabulary assessment. Language Testing, 18, 1–32], the article considers in some detail: 1) the
way in which DELNA is presented to staff and students of the university, and 2) the procedures for reporting the results. It also
considers the criteria by which the programme should be evaluated.
© 2008 Elsevier Ltd. All rights reserved.
Keywords: Language assessment; English for academic purposes; Diagnosis; University admission; Undergraduate students; Language support
1. Introduction
The internationalisation of education in the major English-speaking countries has long created the need to provide
various forms of academic language support for those international students who have been admitted to the institution,
but whose proficiency is still not fully adequate to meet the language demands of their degree studies. Language sup-
port most often takes the form of English for academic purposes (EAP) courses targeting specific skills such as writing
or listening, but it can also include adjunct language classes linked to a particular content course, writing clinics, peer
editing programmes, self-access centres, and so on. A typical strategy is to require incoming international students to
take an in-house placement test, the results of which are used either to exempt individuals from the EAP programme or
to direct them into the appropriate courses to address their needs. Accounts of tests designed broadly for this purpose
at various universities can be found in Brown (1993), Fox (2004), Fulcher (1997), and Wall, Clapham, and Alderson
(1994).
* Tel.: +64 9 373 7599 x87673; fax: +64 9 308 2360.
E-mail address: [email protected]
1475-1585/$ – see front matter © 2008 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jeap.2008.02.001
Journal of English for Academic Purposes 7 (2008) 180–190
www.elsevier.com/locate/jeap
At the same time, it is now well recognised that many students who are not on student visas also have academic
language needs. This may result from the success of policies to recruit students from indigenous ethnic or linguistic
minority groups which have traditionally been underrepresented in tertiary education. Another major category con-
sists of relatively recent migrants or refugees, who have received much if not all of their secondary education in
the host country and thus have met the academic requirements for university admission, but who still experience dif-
ficulties with academic reading and writing in particular (Harklau, Losey, & Siegal, 1999). The term Generation 1.5
has been coined in the US to refer to the fact that these students are separated from the country of their birth but often
not fully integrated – linguistically, educationally or culturally – into their new society. Beyond these two identifiable
categories, there is a broader continuum of academic literacy needs within the student body in the contemporary
English-medium university, including many students who are monolingual in English.
Although various forms of language support may be available to these domestic students on campus, the issue is
how to identify the ones who need such support and to what extent they should be required to take advantage of it.
There can be legal or ethical constraints on directing students into language support on the basis of their language
background or other demographic characteristics. It may also be counterproductive to make it obligatory for students
to participate in a support programme when they have no wish to be set apart from their peers and are reluctant to
acknowledge that they have language needs. One way to address the situation is to introduce some form of diagnostic
assessment, comparable to the in-house placement tests for international students. In fact, one of the tests cited above
(Fulcher, 1997) was designed to be administered at the University of Surrey in the UK to all incoming students,
regardless of their immigration status or language background. A similar solution is emerging at the university which is
the subject of the present article.
Having regard for these various considerations, it is necessary to give some careful thought to the development of
an assessment procedure for this purpose. There are technical issues, such as how to assess native and non-native
speakers by means of a common metric and how to reliably identify those with no need of language support within
the minimum amount of testing time. However, the focus of this discussion will be on the need to present the assess-
ment to the students and to the university community in a manner that will achieve its desired goals while at the same
time avoiding unnecessary compulsion.
2. The context
The particular case to be considered here is a programme called Diagnostic English Language Needs Assessment
(DELNA), which has been implemented at the University of Auckland in New Zealand. The programme was intro-
duced to address concerns that developed through the 1990s with the influx of students who are now collectively iden-
tified as having English as an additional language (EAL). During that decade New Zealand tertiary institutions
vigorously recruited international students, particularly from East Asia. These students were required to demonstrate
their proficiency in English as a condition of admission. However, the typical requirement for undergraduates of Band
6.0 in IELTS came to be recognised as a relatively modest level of English proficiency, particularly for students whose
cultural background and previous educational experience made it difficult to meet the academic expectations of their
lecturers and tutors (Read & Hayes, 2003). In the absence of any moves to raise the minimum English requirement for
entry, then, the University of Auckland – like other New Zealand universities and polytechnics – needed to provide
various forms of ongoing language support for international students.
The liberalisation of immigration policy in the late 1980s also opened up opportunities for skilled migrants and
business investors to migrate to New Zealand with their families. This led to an inflow of new immigrants from Tai-
wan, China, South Korea, India and Hong Kong, peaking in 1995 but continuing at lower levels to this day. The vast
majority of the new immigrants settled in the Auckland metropolitan area and in time these communities produced
substantial numbers of students for tertiary institutions in the region, and for the University of Auckland in particular.
The students from these communities had quite similar linguistic, educational and cultural profiles to international
students; many students in both categories had attended a New Zealand secondary school for one, two or more years
before entering the university. However, there was one crucial difference. Under New Zealand law (the Education Act
1989), permanent residents are classified as domestic students for the purpose of university admission and cannot be
subjected to any entry requirement that is not also imposed on citizens of the country. This means specifically that new
migrants cannot be targeted to take an English proficiency test or enrol in ESL classes as a condition of being admitted
into a university.
Another provision in the Education Act creates further challenges. The law allows any domestic student who has
reached the age of 20 to apply for special admission to a New Zealand university, regardless of their level of prior
educational achievement. Thus, in principle adult migrants as well as citizens have had open entry to tertiary educa-
tion, although in practice their choices have been constrained by admission requirements for particular degree pro-
grammes, and those lacking a New Zealand secondary school qualification are likely to be strongly counselled to
initially take on a light, part-time workload.
Students accepted for special admission have diverse language needs. Whereas those from the East Asian migrant
communities may resemble international students linguistically and culturally, others are mature students from
English-speaking backgrounds who may not lack proficiency in the language as such but rather academic literacy. These
students include members of the Pacific Nations communities (particularly from Samoa, Tonga, the Cook Islands,
Niue and the Tokelau Islands) who may have native proficiency in general conversational English but whose low level
of achievement in their secondary schooling would have excluded them from further educational opportunity, had the
special admission provision not been available. Although the Pacific communities are long established in New Zea-
land, it has only been in more recent years that the universities have made systematic efforts to recruit Pasifika stu-
dents, with a particular emphasis on programmes in Education, Health Sciences and Theology.
Thus, through the 1990s the University of Auckland faced various challenges in responding to the growing linguis-
tic diversity of its student body, not least because of the constraints imposed by the Education Act. Proposals from two
leading professors (Ellis, 1998; Ellis & Hattie, 1999) that the university should introduce an entrance examination in
English for students who could not produce evidence of adequate competence in the language received support from
the Faculty of Arts and were accepted by the central administration of the university. The development and piloting of
the DELNA instruments took place in 2000–01 (Elder & Erlam, 2001) and the programme became operational in
2002.
3. DELNA: its philosophy and design
Before looking at how DELNA operates in practice, it is useful to outline several basic principles underlying its
development. To some extent, the principles reflect the constraints imposed on the university by the Education
Act, but they can also be seen as a positive commitment by the institution to enhancing the educational opportunities
of the whole student body.
One principle was that the test results would not play any role in admissions decisions; students were to be as-
sessed only after they had been accepted into the university for their chosen degree programme. In this sense,
then, the administration of DELNA represents a low-stakes situation, although from another point of view the
stakes are higher for students who are at serious risk of failing courses or not achieving their academic potential
as a result of their limited proficiency in the language. The university, too, has a stake in preserving academic
standards and maintaining good completion rates, particularly on equity grounds for Maori, Pasifika and other
students from historically underrepresented groups on the campus.
As a means of emphasising the point that DELNA was not IELTS under another guise, it was deliberately called
an 'assessment' rather than a 'test', and the individual components are known as 'measures'.
There was to be an important element of personal choice for students in their participation in DELNA and their
subsequent uptake of opportunities for language support and enhancement. In practice, particular departments
and degree programmes have required their students to take DELNA and/or to participate in some
form of language support, but the principle remains that students should be strongly encouraged to take advan-
tage of this initiative rather than being compelled to do so against their will.
DELNA represented a recognition by the university that it shares with students a joint responsibility to address
academic language needs. This contrasts with the situation of international students applying for admission,
where the onus is on the students to demonstrate, by paying a substantial fee for an international English
test, that they have adequate competence in the language. For students and for departments, DELNA is free
of charge and several of the language support options are available to students at no additional cost to them.
In operation, DELNA involves two phases of assessment, Screening and Diagnosis, as shown in Table 1. The
Screening measures were designed to provide a quick and efficient means of separating out native speakers and other
proficient users of the language who were unlikely to encounter difficulties with academic English, and exempting
them from further assessment. Both of the Screening measures are computer-based. One is a vocabulary test, assessing
knowledge of a sample of academic words by means of a simple word–definition matching format (Beglar & Hunt,
1999). The other, variously known in the literature as a speed reading (Davies, 1975, 1990) or cloze-elide (Manning,
1987) format, is a kind of reverse cloze procedure. In each line of an academic-style text an extraneous word is inserted
and the test takers must identify each inserted word under a speeded condition which means that only the most pro-
ficient students complete all 73 items within the time available. In a validation study (Elder & Erlam, 2001), the
reliability estimates were 0.87 for Vocabulary and 0.88 for Speed Reading. The two tests correlated with a composite
listening, reading and writing score from the Diagnosis (see below) individually at 0.74 (vocabulary) and 0.77 (speed
reading), and collectively at 0.82.
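The pattern reported here, with each Screening measure correlating with the Diagnosis composite individually and the two together correlating more strongly, can be illustrated with a short sketch. The scores below are synthetic and the equal weighting of the two measures is an illustrative assumption; none of it reproduces the actual Elder and Erlam (2001) data or the DELNA scoring scheme.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic scores for illustration only -- not the validation-study data.
# All four variables share one underlying proficiency factor plus noise.
n = 200
ability = rng.normal(0, 1, n)
vocab = ability + rng.normal(0, 0.7, n)   # Screening: vocabulary
speed = ability + rng.normal(0, 0.7, n)   # Screening: speed reading
diag = ability + rng.normal(0, 0.6, n)    # composite of Diagnosis L/R/W


def r(x, y):
    """Pearson correlation between two score vectors."""
    return float(np.corrcoef(x, y)[0, 1])


# An equally weighted Screening composite (a hypothetical weighting).
composite = vocab + speed
print(f"vocab vs diagnosis:     r = {r(vocab, diag):.2f}")
print(f"speed vs diagnosis:     r = {r(speed, diag):.2f}")
print(f"composite vs diagnosis: r = {r(composite, diag):.2f}")
```

Because the two Screening measures carry partly independent measurement error, combining them tends to yield a higher correlation with the criterion than either measure alone, which is the rationale for reporting a single combined Screening result.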
For students who score below a threshold level on the Screening, the three measures in the Diagnosis phase provide
a more extensive, task-based assessment of their academic language skills. Unlike the computerised Screening measures,
they are all paper-based instruments. In the Listening test (30 min), the students hear an audio-recorded mini-lecture on
a non-specialist topic and respond to short answer, multiple-choice and information transfer items. The Reading test
(45 min) is based on one or two reading texts on topics of general interest totalling about 1200 words. Various item types
are used, including cloze, information transfer, matching, multiple-choice, true-false and short answer. For the Writing
task (30 min), the candidates write 200 words of commentary on a social trend, as presented to them in the form of a sim-
ple table or graph. Their writing is rated on three analytic scales: fluency, content, and grammar and vocabulary.
The Diagnosis phase takes 2 hours to administer, as compared to 30 min for the Screening, and is obviously more
expensive in other respects, in that it requires manual scoring and, in the case of the writing task, double rating on the
three scales by trained examiners (for research on the training procedures, see Elder, Barkhuizen, Knoch, & von
Randow, 2007; Knoch, Read, & von Randow, 2007). The Elder and Erlam (2001) validation study obtained reliability
estimates of 0.82 for Listening and 0.83 for Reading. In the case of Writing, the two recent studies just cited (Elder
et al., 2007; Knoch et al., 2007) produced estimates of 0.95–0.97 for the reliability of candidate separation, using the
FACETS program.
Further details of the two phases of the DELNA assessment, including sample items and tasks, can be found in the
DELNA Handbook, which is downloadable from the programme website: www.delna.auckland.ac.nz.
Set out this way, DELNA looks very much like a conventional language test. Certainly the Diagnosis tasks are sim-
ilar to those found in IELTS and other EAP proficiency tests. However, the intended purpose of the instrument is dif-
ferent and this means that it needs to be presented in a distinctive manner, in keeping with the principles outlined at the
beginning of this section.
4. An analysis of test purpose
A useful framework for analysing how test purpose should influence test design and delivery is that developed by
Read and Chapelle (2001). Although the framework is exemplified in terms of vocabulary testing, it has general ap-
plicability to various forms of language assessment. As shown in Fig. 1, the framework has numerous components and
it is beyond the scope of the present article to consider them all in detail.
At the top level of the framework, test purpose is decomposed into three components – inferences, uses and in-
tended impacts – which in turn lead to validity considerations and mediating factors. It is the second and third me-
diating factors which are of particular concern here, but it is also necessary to address the first component briefly.
Table 1
The structure of DELNA

Phase                 Measures
Screening (30 min)    Vocabulary; Speed Reading
Diagnosis (2 hours)   Listening to a mini-lecture; Reading academic-type texts; Writing an interpretation of a graph
4.1. Construct definition
The inferences to be made on the basis of performance in DELNA can be defined in terms of academic literacy in
English: the ability of incoming undergraduate students to cope with the language demands of their degree pro-
gramme. Although ultimately the assessment is targeted at students for whom English is an additional language
(EAL), the construct is broader than academic literacy in English as an additional language because many of those
to be assessed come from English-speaking backgrounds, and the whole function of the initial Screening phase of
DELNA is to separate out students for whom adequate academic literacy is unlikely to be at issue. Designing
a test for students with English as both a first and an additional language creates a special challenge because it cannot
be assumed that items and tasks will perform the same way for the two groups. Elder, McNamara, and Congdon (2003)
used Rasch analysis to investigate this issue and found a somewhat complex pattern, whereby each of the DELNA
tasks except the vocabulary measure exhibited some significant bias in favour of either native or non-native speakers.
However, since the bias was in both directions and relatively small in magnitude overall, the researchers considered
that it was within tolerable limits for a low-stakes assessment of this kind.
Read and Chapelle (2001) distinguish three levels of inference: whole test, sub-test and item. For DELNA, item-
level inferences are not appropriate. In the Screening phase, the construct is defined specifically in terms of efficient
access to academic language knowledge and it is sufficient to make inferences at the level of the whole test. Thus, the
vocabulary and speed reading scores are combined into a single result to determine whether the student should proceed
to the Diagnosis phase.
Elder and von Randow (in press) have investigated the validity of inferences based on the Screening score,
examining its suitability as a basis for determining whether students needed to proceed to the Diagnosis. Their study
involved an analysis of the performance of 353 students who took both the Screening and Diagnosis measures. A
minimum criterion score was set on the basis of performance in the listening, reading and writing tests of the
Diagnosis phase. Then, by means of regression analysis, an optimum cut score (combining the vocabulary and speed
reading scores) was established for the Screening phase. This cut score successfully identified 93% of the students
whose performance fell below the criterion level in the Diagnosis phase. However, it also meant that relatively few
students would be exempted from taking the costly Diagnosis measures and so, with financial considerations in
mind, a lower cut score was set. The lower score identified only 81% of the students who were under the criterion
level but on the other hand it resulted in less than 1% of false negatives: students below the cut score who neverthe-
less had achieved the criterion level in the Diagnosis. Therefore, for operational purposes it is only students whose
Screening performance falls under the lower cut score who are required to proceed to the Diagnosis. Those who
are between the two cut scores receive a general recommendation to seek academic language support (see 4.3 below).
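The two-threshold procedure just described can be sketched in code. Everything below is a synthetic illustration: the score distributions, the criterion, and both cut values are invented for the example and are not the figures from the Elder and von Randow study. 'False negatives' follows the article's usage: students below the Screening cut who nevertheless met the Diagnosis criterion.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic Screening and Diagnosis composites that correlate positively,
# standing in for the 353 students in the study (illustration only).
n = 353
diagnosis = rng.normal(60, 10, n)
screening = 0.8 * (diagnosis - 60) + rng.normal(0, 6, n) + 50

CRITERION = 55  # hypothetical minimum criterion on the Diagnosis composite


def classification_rates(cut):
    """Rates produced by a given Screening cut score.

    Returns (identified, false_neg):
      identified -- share of below-criterion students who fall below the cut
                    and so would be sent on to the Diagnosis;
      false_neg  -- share of the whole cohort below the cut despite having
                    met the criterion (the article's 'false negatives').
    """
    below_criterion = diagnosis < CRITERION
    flagged = screening < cut
    identified = (flagged & below_criterion).sum() / below_criterion.sum()
    false_neg = (flagged & ~below_criterion).sum() / n
    return identified, false_neg


# A higher and a lower hypothetical cut score: lowering the cut exempts
# more students from the costly Diagnosis but catches fewer at-risk ones.
for cut in (48, 44):
    ident, fn = classification_rates(cut)
    print(f"cut={cut}: identified {ident:.0%} of at-risk students, "
          f"false negatives {fn:.1%} of cohort")
```

The trade-off is monotone: every student flagged by the lower cut is also flagged by the higher one, so lowering the cut can only reduce both the identification rate and the false-negative rate, which is exactly the balance the operational cut score has to strike against testing cost.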
[Fig. 1 is a diagram linking: Test purpose (Inferences; Uses; Intended Impacts); Validity considerations (Construct Validity; Relevance and Utility; Actual Consequences); Mediating factors (Construct Definition; Performance Summary and Reporting; Test Presentation); Test design (decisions about the structure and formats of the test); and Validation (arguments based on theory, evidence and consequences).]
Fig. 1. A framework for incorporating a systematic analysis of test purpose into test validation (adapted from Read & Chapelle, 2001, p. 10).
For those who complete the Diagnosis, sub-test inferences are desirable so that students can be advised on whether
they should seek language support in each of the three skill areas of listening, reading and writing. This means that
each sub-test needs to provide a reliable measure of the skill involved. The reliability estimates quoted in Section 3 are
very satisfactory from this perspective.
4.2. Test presentation
Although test presentation comes third in the Read and Chapelle framework, it is more appropriate to discuss it
next in this account of DELNA. Presentation is a mediating factor that comes from a consideration of the impact of
a test (Messick, 1996). Read and Chapelle (2001) point out that most research on impact in language testing has
focused on the washback effects of existing tests and examinations (see, e.g., Alderson, 1996; Cheng & Watanabe,
2004). However, Read and Chapelle argue that if the consequences of implementing a test are to be seen as an in-
tegral element in evaluating its quality, a statement of the intended impact of the instrument needs to be included in
the specification of test purpose early in the development of a new test. Thus, the actual consequences of putting the
test into operation can be evaluated by reference to the prior statement of intended impact. This means in turn that
the test developers should consider how the intended impact can be achieved through the way that the test is
presented.
Test presentation is a concept that has not received much attention in the literature and it deserves some consideration
here. It consists of a series of steps, taken as part of the process of developing and implementing the test, to in-
fluence its impact in a positive direction. Since there are numerous stakeholders in assessment, particularly when the
stakes are high, '[t]est developers choose to portray their tests in ways that will appeal to particular audiences' (Read
& Chapelle, 2001, p. 18). These can include educational administrators, teachers, parents, users of the test results, and
of course the test takers, who need to be familiar with the test formats and willing to accept that the test is a fair as-
sessment of their language abilities.
Seen in this light, test presentation has a strong connection to that much maligned concept in testing, face validity.
Authors of introductory texts on language testing, starting with Lado (1961), have generally dismissed this concept as
not being a credible form of evidence to support a validity argument, since it is based on 'simple inspection' (Lado,
1961, p. 321) or 'the judgment of an untrained observer' (Davies et al., 1999, p. 59).
However, this rejection of the concept has generally been accompanied by an acknowledgement that, although the
term may be a misnomer, it represents a matter of genuine concern in testing. That is to say, test developers are con-
fronted with a real problem if – regardless of the technical merits of the test – one or more of the stakeholder groups
are not convinced that the content or the formats are suitable for the assessment purpose. Thus, Alderson, Clapham,
and Wall (1995) give face validity a positive gloss as meaning 'acceptable to users' (p. 173), echoing Carroll (1980),
who had earlier proposed acceptability as one of the four desirable characteristics (along with relevance, comparabil-
ity and economy) of a communicative test. In addition, Bachman and Palmer (1996, p. 42) and Davies et al. (1999,
p. 59) refer to the even more positive notion of 'test appeal'.
Thus, test presentation can be seen as a proactive approach to promoting the acceptability of the test to the various
stakeholders, and above all to the test takers, in order to achieve the intended impact. The test developer needs to en-
sure that the purpose and nature of the assessment is clearly understood, that it meets stakeholder expectations as much
as possible, and that the test takers in particular engage with the test tasks in a manner that will help produce a valid
measure of their language ability. Major proficiency tests generate a strong external motivation for students because of
the stakes involved, whereas with a programme like DELNA it is more important to create a positive internal moti-
vation based on a recognition of the benefits that the results of the assessment may bring for the student.
4.2.1. Presentation of DELNA to students
The general principles underlying the presentation of DELNA are those that were introduced in Section 3 above:
the fact that the results are never used for admissions purposes; the term 'assessment' is preferred to 'test'; there is
a significant element of personal choice for students; and the university shares with its students the responsibility for
addressing their academic language needs. The slogan 'Increases your chance of success', which has featured in
DELNA publicity, is also intended to express the positive intent of the programme.
There have been two main pathways to DELNA for students entering the university each semester. The first was
literally by invitation. In the admissions office the records of incoming domestic students (citizens and permanent
residents) were reviewed to identify those who had not provided evidence of their competence in English for tertiary-
level study. Students coming directly from secondary school in the last few years hold the National Certificate of Ed-
ucational Achievement (NCEA), which includes a literacy requirement to demonstrate proficiency in academic
reading and writing in English. However, mature students and recently arrived immigrants who enter the university
under special admission often lack a recognised secondary school qualification or any other evidence of academic
literacy.
Students thus identified received a letter inviting them to take the DELNA diagnosis. For statutory reasons, as pre-
viously explained, the university could not make it mandatory but, that consideration aside, the wording of the letter
had a positive tone which emphasised the intended role of DELNA in enhancing the students' study experience. Ini-
tially, the uptake of these invitations was relatively low but by 2005 it had reached about 40% (116/295).
The other main pathway, which has now essentially superseded the first one, results from decisions by departments
or faculties to require all students in designated first-year courses to take DELNA. Initially, this applied to programmes
which attracted a high proportion of EAL students, such as the Bachelor degrees in Business and Information Man-
agement, and in Film, TV and Media Studies. However, from 2007 it has officially become a requirement for almost all
first-year students, regardless of their language background, to take the DELNA Screening. This not only observes the
legal niceties but also highlights the important role of the Screening phase in efficiently separating out academically
literate students for exemption from further assessment.
In 2007 a total of 5427 students were assessed through the DELNA programme. These students are estimated to
represent around 70% of all the first-year students at the university that year, although the percentage is higher if
groups such as transferring and exchange students are excluded. Of those who completed the Screening, 1208
were recommended to return for the Diagnosis phase; however, only 504 (42%) did so. This shortfall is discussed
in Section 5 below.
In terms of presentation, as DELNA assessment has become the norm for first-year students, it is increasingly
accepted as just another part of the experience of entering the university. Students are informed of the assessment re-
quirement in their department's handbook and can obtain further information from the programme website, including
the downloadable DELNA Handbook, with its sample versions of the assessment tasks and advice on completing
them. In addition, it is easy for students to book for a DELNA session online at their preferred day and time. One other
appealing feature of the Screening measures in particular is that they are computer-administered, which adds a novelty
value for students who may never have taken such a language test before.
4.2.2. Presentation of DELNA to staff
Much of the initial impetus for the development of DELNA came from the concerns of teaching staff across the
university in the 1990s about the academic literacy needs of students in their classes. This created a positive environ-
ment for the acceptance of a programme like DELNA to address those needs, but of course that is not the same as an
understanding of how this particular programme works.
The establishment of DELNA saw the formation of a Reference Group chaired by the Deputy Vice-Chancellor (Ac-
ademic) and composed of representatives from all the university faculties as well as from the various language support
programmes. The group meets regularly to discuss policy issues, monitor the implementation of DELNA and provide
a channel of communication from the programme staff to the faculties. The departments which were the early partic-
ipants in DELNA are well represented on the group but, as the assessments have expanded, it has been necessary to
open new avenues of communication to academic and administrative staff across the university to ensure that: a) an
informed decision is made when departments or faculties decide to require their first-year students to take the assess-
ment; b) the relevant staff correctly interpret the DELNA results when they receive them; and c) effective follow-up
action is taken to give students access to language support if they need it.
In 2005 an information guide for staff was produced in pamphlet form and it has been followed by an FAQ doc-
ument. However, experience has shown that the printed material must be backed up by face-to-face meetings with key
staff members responsible for DELNA administration in particular faculties or departments.
4.3. Performance summary and reporting
This brings us back to the second mediating factor of the Read and Chapelle (2001) framework, performance summary and reporting, which relates to the intended use of the test. The assessment results are used to identify students
186 J. Read / Journal of English for Academic Purposes 7 (2008) 180–190
who may be at risk because of their low academic literacy and then to advise them on suitable forms of language sup-
port and enhancement. Where participation in DELNA is a course requirement, the results also go to the academic
programme or department for follow-up action as appropriate. Thus, the two main recipient groups for the results
are the students and their departments.
Given that the whole purpose of the programme is to address academic literacy needs, the reporting of student per-
formance includes not only the assessment result but also a recommendation for language enhancement where appro-
priate. At this point, then, it is useful to list the main language support options available on the campus.
- Credit courses in academic language skills: ESOL100–102 for EAL students, and ENGWRIT101, a writing course for students from English-speaking backgrounds.
- Workshops, short non-credit courses, individual consultations and self-access study facilities offered by the Student Learning Centre (SLC – available to all students) and the English Language Self-Access Centre (ELSAC – specialising in services for EAL students).
- Discipline-specific language tutorials linked to particular courses (following a kind of adjunct model), which have for some time attracted a high proportion of EAL students. Currently these courses are in Commerce, Health Sciences, Theology, and Film, TV and Media Studies.
The Screening phase of DELNA is primarily intended to exempt highly proficient students from further assessment. Thus, the scores from the vocabulary and speed reading measures are combined to divide the test-takers into three categories with deliberately non-technical labels:
- Good – no language enrichment required.
- Satisfactory – some independent activity at SLC or ELSAC recommended.
- Recommended for Diagnosis – should take the DELNA Diagnosis.
The Screening result is sent individually to each student by email and, when DELNA is a departmental require-
ment, an Excel file of results for each course is forwarded to a designated staff member. Until 2006, the Screening
reports included the two actual test scores for vocabulary and speed reading. However, the fact that the cut scores
for the three categories varied according to which form of the test each student took caused some confusion and, in addition, there were indications that the scores were being used in at least one academic programme as quasi-proficiency measures to assign students to tutorial groups according to their language ability. This led to the current policy of reporting just the student's category.
In the case of the Diagnosis phase, a scale modelled on the IELTS band scores (from a top level of 9 down to 4)
has been used for rating performance and reporting the results to students. However, for reporting to staff a simpler
A-B-C-D system is used for each of the three skills (listening, reading and writing). The A and B grades correspond to
the Good and Satisfactory categories respectively in the Screening, and students whose three-grade average is at one of
these levels receive an email report. On the other hand, students averaging in the C and D range, who are considered to
be at significant risk, are sent an email request to collect their results in person from the DELNA language adviser. As
with the Screening, the results are also sent to the designated staff member when the Diagnosis is required by the
department.
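The reporting procedure described above amounts to a simple decision rule: each of the three skills receives a letter grade, and the average determines whether the student is emailed a report or asked to meet the language adviser in person. The sketch below is purely illustrative; the article does not specify how the letter grades are averaged, so the numeric mapping, the threshold, and the function name `diagnosis_follow_up` are all assumptions.

```python
# Illustrative sketch of the Diagnosis reporting rule (assumed implementation,
# not the actual DELNA system). Grades A-D per skill are mapped to points,
# averaged, and the result determines how the student is contacted.

GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1}

def diagnosis_follow_up(listening: str, reading: str, writing: str) -> str:
    """Return the assumed follow-up action for a set of three skill grades:
    an A/B average leads to an email report, while a C/D average (students
    considered to be at significant risk) leads to an in-person meeting."""
    avg = sum(GRADE_POINTS[g] for g in (listening, reading, writing)) / 3
    # Under this assumed mapping, 2.5 marks the boundary between the
    # A-B range and the C-D range.
    if avg >= 2.5:
        return "email report"
    return "collect results in person from the language adviser"
```

For example, a student graded A, B, B across the three skills would receive an emailed report, whereas one graded C, D, C would be asked to see the adviser.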
The appointment of the language adviser, beginning in 2005, resulted from a recognition that students scoring low
in the Diagnosis were generally not accessing the recommended language support. A small-scale study by Bright and von Randow (2004), involving follow-up interviews with eighteen DELNA candidates at the end of the academic year,
found that only four of them had taken the specific advice given in the results notification. Although most of the
participants had in fact passed their courses, they acknowledged that they had really struggled to meet the language
demands of their studies. One strong message from the interviews was that the students would have greatly appreci-
ated the opportunity to discuss their DELNA results and their language support options face-to-face, rather than just
receiving the impersonal emailed report. Thus, the language adviser now meets with each student individually, goes
over their profile of results, and directs them to the most appropriate form(s) of support. She often follows up the initial
meeting with ongoing monitoring of their progress through the semester or even longer.
Thus performance summary and reporting in this case involves not simply the form of the report but also, for the
less proficient students, the medium by which the result is communicated to them.
5. Evaluating the programme
The extended discussion in Section 4, drawing on the Read and Chapelle (2001) framework, has shown how the
purpose of the assessment has been worked out through the design and delivery mechanisms of DELNA. At the time of
writing, the programme is still being rolled out. It has yet to achieve full participation by the incoming first-year stu-
dent population in the Screening phase, and furthermore in 2006 only a third of students (444 out of 1340) who were
recommended for Diagnosis on the basis of their Screening results actually went on to the second phase of the assess-
ment. Higher levels of participation will depend on the extent to which faculties and departments enforce the require-
ment that their students should take one or both phases of DELNA. Some academic programmes have introduced
specific incentives for students to take the assessment, by, for instance, withholding the first essay grade or subtracting
a few percent of the final course grade of students who do not comply.
However, the point of the exercise is not just to assess the students but rather to address their academic language
needs where appropriate. As noted in the previous section, there is now provision within the DELNA programme it-
self, through the work of the language adviser, to provide intensive counselling for those students whose results in the
Diagnosis phase show that they have the most serious language needs. Some academic units have introduced their own
follow-up measures for such students. For example, the Bachelor's degree in Business and Information Management
has a well-established Language and Communication Support Programme (LanCom), which integrates various forms
of support into the delivery of their courses. In the Faculty of Engineering, students who score below a minimum level in the DELNA Diagnosis must undertake a quasi-course with its own course code, involving attendance at 15 hours of
workshops at the Student Learning Centre (SLC) and satisfactory completion of another 15 hours of directed study at
the English Language Self-Access Centre (ELSAC).
With the expansion of DELNA assessment into the Faculties of Arts and Science, it is more of a challenge to re-
spond to the language needs of students enrolled for a degree which includes courses offered by several different de-
partments. In the first instance, the Screening results may simply provide course conveners with a broad profile of the
language needs of their students, who may be several hundred in number. Many departments lack the resources to offer
specialised language support to their students. One realistic option for them is the introduction of systematic proce-
dures for referring students in need to SLC or ELSAC; another option may be to review their teaching and assessment
practices to avoid creating unnecessary difficulties for EAL students in their courses.
Returning briefly to the Read and Chapelle (2001) framework, one element in the validation of a test or assessment procedure is an investigation of its actual consequences as compared to its intended impact. At the institutional level,
the intended impact can be defined in terms of levels of academic literacy in the student population. The implemen-
tation of DELNA is supposed to lead to a meaningful reduction over time in the number of students whose academic
performance is hampered by language-related difficulties. The question is what kind of data counts as evidence that
the goal is being achieved for the undergraduate student body as a whole.
Davies and Elder (2005) took up this point in their review of current theory and practice in language test validation,
using DELNA as a case study. They formulated a series of eight hypotheses that can be investigated to build an ar-
gument for the validity of DELNA. Most of the hypotheses relate either to the technical qualities of the DELNA tests
as measures of academic literacy or to the utility of the scores to the users. However, the final hypothesis takes up the
issue of the wider impact of the programme:
H.8 The student population will benefit from the information that the test provides. (2005, p. 805).
Davies and Elder highlight a number of challenges in, first, defining the nature of the benefit and then gathering
evidence in support of the hypothesis. One way to address the hypothesis would be to define the benefit as an increase
in academic literacy, as measured by a further assessment of students' language proficiency after, say, a semester or two of
study. However, DELNA is set up as a one-time assessment for each student and the system blocks them from taking it
more than once. In addition, there are currently no plans to introduce an exit test of English proficiency for graduating
students.
This means that we need to look for benefits in other ways. One kind of evidence relates to student uptake of the
DELNA advice by accessing the various language support options available to them. If they enrol in an ESOL credit
course or attend a tutorial linked to one of their subject courses, their progress and end-of-course achievement will be
assessed by their tutors. On the other hand, it is more of a challenge to monitor the benefit gained by students who
participate in the support programmes at SLC and ELSAC. Students are required to register when they first access
these programmes and records are kept of their attendance at workshops and individual consultations, but that is not
the same as assessing the benefit of these language support opportunities in improving the students' academic language proficiency.
A broader approach to the situation is to look at grade point averages and retention rates for whole undergraduate
cohorts, particularly in courses with large EAL student enrolments. As Davies and Elder (2005) point out, though, it is
difficult to separate out language proficiency from academic ability, motivation, sociocultural adjustment and the
range of other factors that influence student achievement in their university studies, particularly if underachievement
is represented not just by dropout or failure rates but also by lower grades than the student might otherwise have
achieved. The issues involved are reminiscent of those which have complicated research on the predictive validity
of major proficiency tests like TOEFL and IELTS (see, e.g., Hill, Storch, & Lynch, 1999; Light, Xu, & Mossop, 1987).
Thus, global university-wide measures of impact may prove to be less useful than more focused investigations of
particular groups of students. One such study, being conducted by the DELNA Programme in conjunction with the
Department of Film, TV and Media Studies, is tracking a cohort of students through their three years of study towards
a BA major in FTVMS. The data include annual interviews with the students as well as the quantitative measures
provided by the initial DELNA results and their course grades. Through this kind of targeted research, it will become
possible to develop a validity argument that combines rich qualitative evidence with more objective measures of
students' language proficiency and academic achievement.
6. Conclusion
The DELNA assessment programme has a number of features that differentiate it from other tests of English for
academic purposes. First, it does not function as a gatekeeping device for university admission, and students cannot be
excluded from the institution on the basis of their results in either phase of the assessment. The fact that some students
find this hard to believe helps to account for the relatively low participation rate in the Diagnosis phase of DELNA
among those who are recommended to take it. Secondly, it is not simply a placement procedure to direct students
into one or more courses within a required EAP programme according to their level and areas of need. There is a range
of language support options that students are recommended to participate in as appropriate. A related feature is the
distinctive philosophy behind the programme which holds that students should retain a degree of personal choice
as to whether they take advantage of the opportunities for language and study support which are available to them. Although it partly reflects the constraints imposed by national education legislation, this approach is also based on
the assumption that academic language support will be more effective if students recognise for themselves the extent
of their language needs and make a commitment to attend to them.
One other important characteristic of DELNA is that it is centrally funded with a direct management line to the
office of the Deputy Vice-Chancellor (Academic). Although its offices are located in the Department of Applied Lan-
guage Studies and Linguistics, the programme has always been conceived as a university-wide initiative. This helps to
avoid any perception that DELNA is just serving the interests of a particular department or faculty. It is an issue that
has emerged in discussions with staff from other New Zealand universities about the possibility of introducing
a DELNA programme on their own campuses. Initial enquiries have typically come from student learning advisers
or ESOL tutors who have thought in terms of purchasing a set of diagnostic tests for their own institution. However,
a briefing on the full scope of DELNA and its associated language support provisions reveals how much more is in-
volved, with a firm commitment by senior management being a crucial element in the successful operation of the pro-
gramme at Auckland.
The DELNA programme is moving to a consolidation phase after the considerable expansion in the coverage of
incoming undergraduate students over the past couple of years. There is a consequent need to ensure that effective
use is made of the DELNA results and that an increasing proportion of the targeted students participate in the appro-
priate forms of language support and enhancement. The position of DELNA as a centrally funded programme is se-
cure for the foreseeable future, although it remains to be seen to what extent the university will be able to commit
sufficient resources to meet the range of language needs that the assessment results are revealing. Other related issues
may yet emerge, such as the need to set language proficiency standards for students graduating from Bachelor's programmes or concerns about the academic literacy of postgraduate students. For now, though, it is widely accepted
within the institution that DELNA is a very worthwhile means of addressing the language needs of incoming
undergraduates.
References
Alderson, J. C. (Ed.). (1996). Washback in language testing [Special issue]. Language Testing, 13(3).
Alderson, J. C., Clapham, C., & Wall, D. (1995). Language test construction and evaluation. Cambridge: Cambridge University Press.
Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice. Oxford: Oxford University Press.
Beglar, D., & Hunt, A. (1999). Revising and validating the 2000 word level and university word level vocabulary tests. Language Testing, 16, 131–162.
Bright, C., & von Randow, J. (2004, September). Tracking language test consequences: The student perspective. Paper presented at the Ninth
National Conference on Community Languages and English for Speakers of Other Languages (CLESOL), Christchurch, New Zealand.
Brown, J. D. (1993). A comprehensive criterion-referenced language testing project. In: D. Douglas, & C. Chapelle (Eds.), A new decade of
language testing research (pp. 163–184). Washington, DC: TESOL.
Carroll, B. J. (1980). Testing communicative performance. Oxford: Pergamon.
Cheng, L., & Watanabe, Y. (2004). Washback in language testing: Research contexts and methods . Mahwah, NJ: Lawrence Erlbaum Associates.
Davies, A. (1975). Two tests of speed reading. In: R. L. Jones, & B. Spolsky (Eds.), Testing language proficiency (pp. 119–130). Arlington, VA:
Center for Applied Linguistics.
Davies, A. (1990). Principles of language testing. Oxford: Blackwell.
Davies, A., Brown, A., Elder, C., Hill, K., Lumley, T., & McNamara, T. (1999). A dictionary of language testing. Cambridge: Cambridge
University Press.
Davies, A., & Elder, C. (2005). Validity and validation in language testing. In: E. Hinkel (Ed.), Handbook of research in second language teaching
and learning (pp. 795–813). Mahwah, NJ: Lawrence Erlbaum.
Elder, C., Barkhuizen, G., Knoch, U., & von Randow, J. (2007). Evaluating rater responses to an online training program for L2 writing assessment. Language Testing, 24, 37–64.
Elder, C., & Erlam, R. (2001). Development and validation of the Diagnostic English Language Needs Assessment (DELNA): Final report. Auckland:
Department of Applied Language Studies and Linguistics, University of Auckland.
Elder, C., McNamara, T., & Congdon, P. (2003). Rasch techniques for detecting bias in performance assessments: An example comparing the
performance of native and non-native speakers on a test of academic English. Journal of Applied Measurement, 4, 181–197.
Elder, C., & von Randow, J. (in press). Exploring the utility of a web-based English language screening tool. Language Assessment Quarterly.
Ellis, R. (1998). Proposal for a language proficiency entrance examination. Unpublished manuscript, New Zealand: University of Auckland.
Ellis, R., & Hattie, J. (1999). English language proficiency at the University of Auckland: A proposal. Unpublished manuscript, New Zealand:
University of Auckland.
Fox, J. (2004). Test decisions over time: Tracking validity. Language Testing, 21, 437–465.
Fulcher, G. (1997). An English language placement test: Issues in reliability and validity. Language Testing, 14, 113–139.
Harklau, L., Losey, K. M., & Siegal, M. (Eds.). (1999). Generation 1.5 meets college composition: Issues in the teaching of writing to U.S.-educated learners of ESL. Mahwah, NJ: Lawrence Erlbaum.
Hill, K., Storch, N., & Lynch, B. (1999). A comparison of IELTS and TOEFL as predictors of academic success. In: R. Tulloh (Ed.), IELTS
research reports (Vol. 2, pp. 52–63). Canberra: IELTS Australia.
Knoch, U., Read, J., & von Randow, J. (2007). Re-training writing raters online: How does it compare with face-to-face training? Assessing
Writing, 12, 26–43.
Lado, R. (1961). Language testing. London: Longman.
Light, R. L., Xu, M., & Mossop, J. (1987). English proficiency and academic performance of international students. TESOL Quarterly, 21,
251–261.
Manning, W. H. (1987). Development of cloze-elide tests of English as a second language. TOEFL Research Report No. 23. Princeton, NJ:
Educational Testing Service.
Messick, S. (1996). Validity and washback in language testing. Language Testing, 13, 241–256.
Read, J., & Chapelle, C. A. (2001). A framework for second language vocabulary assessment. Language Testing, 18, 1–32.
Read, J., & Hayes, B. (2003). The impact of IELTS on preparation for academic study in New Zealand. In: R. Tulloh (Ed.), IELTS research reports
2003 (Vol. 4, pp. 153–205). Canberra: IELTS Australia.
Wall, D., Clapham, C., & Alderson, J. C. (1994). Evaluating a placement test. Language Testing, 11, 321–344.
John Read is Head of the Department of Applied Language Studies and Linguistics at the University of Auckland. His primary research interests
are in vocabulary assessment and testing English for academic and professional purposes. He is the author of Assessing Vocabulary (Cambridge,
2000) and has been co-editor of Language Testing.