
    Identifying academic language needs throughdiagnostic assessment

    John Read*

    Department of Applied Language Studies and Linguistics, University of Auckland, Private Bag 92019, Auckland 1142, New Zealand

    Abstract

The increasing linguistic diversity among both international and domestic students in English-medium universities creates new challenges for the institutions in addressing the students' needs in the area of academic literacy. In order to identify students with such needs, a major New Zealand university has implemented the Diagnostic English Language Needs Assessment (DELNA) programme, which is now a requirement for all first-year undergraduate students, regardless of their language background. The results of the assessment are used to guide students to appropriate forms of academic language support where applicable. This article examines the rationale for the assessment programme, which takes account of some specific provisions governing university admission in New Zealand law. Then, drawing on the test validation framework by Read and Chapelle [Read, J., & Chapelle, C. A. (2001). A framework for second language vocabulary assessment. Language Testing, 18, 1–32], the article considers in some detail: 1) the way in which DELNA is presented to staff and students of the university, and 2) the procedures for reporting the results. It also considers the criteria by which the programme should be evaluated.

© 2008 Elsevier Ltd. All rights reserved.

    Keywords: Language assessment; English for academic purposes; Diagnosis; University admission; Undergraduate students; Language support

    1. Introduction

    The internationalisation of education in the major English-speaking countries has long created the need to provide

    various forms of academic language support for those international students who have been admitted to the institution,

but whose proficiency is still not fully adequate to meet the language demands of their degree studies. Language support most often takes the form of English for academic purposes (EAP) courses targeting specific skills such as writing or listening, but it can also include adjunct language classes linked to a particular content course, writing clinics, peer

    editing programmes, self-access centres, and so on. A typical strategy is to require incoming international students to

    take an in-house placement test, the results of which are used either to exempt individuals from the EAP programme or

    to direct them into the appropriate courses to address their needs. Accounts of tests designed broadly for this purpose

at various universities can be found in Brown (1993), Fox (2004), Fulcher (1997), and Wall, Clapham, and Alderson (1994).

* Tel.: +64 9 373 7599 x87673; fax: +64 9 308 2360.

    E-mail address: [email protected]

1475-1585/$ - see front matter © 2008 Elsevier Ltd. All rights reserved.

    doi:10.1016/j.jeap.2008.02.001

Journal of English for Academic Purposes 7 (2008) 180–190

www.elsevier.com/locate/jeap


    At the same time, it is now well recognised that many students who are not on student visas also have academic

    language needs. This may result from the success of policies to recruit students from indigenous ethnic or linguistic

minority groups which have traditionally been underrepresented in tertiary education. Another major category consists of relatively recent migrants or refugees, who have received much if not all of their secondary education in the host country and thus have met the academic requirements for university admission, but who still experience difficulties with academic reading and writing in particular (Harklau, Losey, & Siegal, 1999). The term Generation 1.5 has been coined in the US to refer to the fact that these students are separated from the country of their birth but often not fully integrated, linguistically, educationally or culturally, into their new society. Beyond these two identifiable

    categories, there is a broader continuum of academic literacy needs within the student body in the contemporary

    English-medium university, including many students who are monolingual in English.

    Although various forms of language support may be available to these domestic students on campus, the issue is

how to identify the ones who need such support and to what extent they should be required to take advantage of it.

    There can be legal or ethical constraints on directing students into language support on the basis of their language

    background or other demographic characteristics. It may also be counterproductive to make it obligatory for students

    to participate in a support programme when they have no wish to be set apart from their peers and are reluctant to

    acknowledge that they have language needs. One way to address the situation is to introduce some form of diagnostic

    assessment, comparable to the in-house placement tests for international students. In fact, one of the tests cited above

(Fulcher, 1997) was designed to be administered at the University of Surrey in the UK to all incoming students, regardless of their immigration status or language background. A similar solution is emerging at the university which is

    the subject of the present article.

    Having regard for these various considerations, it is necessary to give some careful thought to the development of

    an assessment procedure for this purpose. There are technical issues, such as how to assess native and non-native

    speakers by means of a common metric and how to reliably identify those with no need of language support within

the minimum amount of testing time. However, the focus of this discussion will be on the need to present the assessment to the students and to the university community in a manner that will achieve its desired goals while at the same

    time avoiding unnecessary compulsion.

    2. The context

    The particular case to be considered here is a programme called Diagnostic English Language Needs Assessment

    (DELNA), which has been implemented at the University of Auckland in New Zealand. The programme was intro-

duced to address concerns that developed through the 1990s with the influx of students who are now collectively identified as having English as an additional language (EAL). During that decade New Zealand tertiary institutions

    vigorously recruited international students, particularly from East Asia. These students were required to demonstrate

    their proficiency in English as a condition of admission. However, the typical requirement for undergraduates of Band

    6.0 in IELTS came to be recognised as a relatively modest level of English proficiency, particularly for students whose

    cultural background and previous educational experience made it difficult to meet the academic expectations of their

    lecturers and tutors (Read & Hayes, 2003). In the absence of any moves to raise the minimum English requirement for

entry, then, the University of Auckland, like other New Zealand universities and polytechnics, needed to provide

    various forms of ongoing language support for international students.

    The liberalisation of immigration policy in the late 1980s also opened up opportunities for skilled migrants and

business investors to migrate to New Zealand with their families. This led to an inflow of new immigrants from Taiwan, China, South Korea, India and Hong Kong, peaking in 1995 but continuing at lower levels to this day. The vast

    majority of the new immigrants settled in the Auckland metropolitan area and in time these communities produced

    substantial numbers of students for tertiary institutions in the region, and for the University of Auckland in particular.

    The students from these communities had quite similar linguistic, educational and cultural profiles to international

    students; many students in both categories had attended a New Zealand secondary school for one, two or more years

    before entering the university. However, there was one crucial difference. Under New Zealand law (the Education Act

    1989), permanent residents are classified as domestic students for the purpose of university admission and cannot be

    subjected to any entry requirement that is not also imposed on citizens of the country. This means specifically that new

    migrants cannot be targeted to take an English proficiency test or enrol in ESL classes as a condition of being admitted

    into a university.


    Another provision in the Education Act creates further challenges. The law allows any domestic student who has

    reached the age of 20 to apply for special admission to a New Zealand university, regardless of their level of prior

educational achievement. Thus, in principle adult migrants as well as citizens have had open entry to tertiary education, although in practice their choices have been constrained by admission requirements for particular degree programmes, and those lacking a New Zealand secondary school qualification are likely to be strongly counselled to

    initially take on a light, part-time workload.

    Students accepted for special admission have diverse language needs. Whereas those from the East Asian migrant

communities may resemble international students linguistically and culturally, others are mature students from English-speaking backgrounds who may not lack proficiency in the language as such but rather academic literacy. These

    students include members of the Pacific Nations communities (particularly from Samoa, Tonga, the Cook Islands,

    Niue and the Tokelau Islands) who may have native proficiency in general conversational English but whose low level

    of achievement in their secondary schooling would have excluded them from further educational opportunity, had the

special admission provision not been available. Although the Pacific communities are long established in New Zealand, it has only been in more recent years that the universities have made systematic efforts to recruit Pasifika students, with a particular emphasis on programmes in Education, Health Sciences and Theology.

Thus, through the 1990s the University of Auckland faced various challenges in responding to the growing linguistic diversity of its student body, not least because of the constraints imposed by the Education Act. Proposals from two leading professors (Ellis, 1998; Ellis & Hattie, 1999) that the university should introduce an entrance examination in English for students who could not produce evidence of adequate competence in the language received support from

    the Faculty of Arts and were accepted by the central administration of the university. The development and piloting of

the DELNA instruments took place in 2000–01 (Elder & Erlam, 2001) and the programme became operational in

    2002.

    3. DELNA: its philosophy and design

    Before looking at how DELNA operates in practice, it is useful to outline several basic principles underlying its

    development. To some extent, the principles reflect the constraints imposed on the university by the Education

    Act, but they can also be seen as a positive commitment by the institution to enhancing the educational opportunities

    of the whole student body.

One principle was that the test results would not play any role in admissions decisions; students were to be assessed only after they had been accepted into the university for their chosen degree programme. In this sense,

    then, the administration of DELNA represents a low-stakes situation, although from another point of view the

    stakes are higher for students who are at serious risk of failing courses or not achieving their academic potential

    as a result of their limited proficiency in the language. The university, too, has a stake in preserving academic

standards and maintaining good completion rates, particularly on equity grounds for Māori, Pasifika and other

    students from historically underrepresented groups on the campus.

As a means of emphasising the point that DELNA was not IELTS under another guise, it was deliberately called an "assessment" rather than a "test", and the individual components are known as "measures".

    There was to be an important element of personal choice for students in their participation in DELNA and their

    subsequent uptake of opportunities for language support and enhancement. In practice, particular departments

and degree programmes have required their students to take DELNA, to participate in some form of language support, or both, but the principle remains that students should be strongly encouraged to take advantage of this initiative rather than being compelled to do so against their will.

    DELNA represented a recognition by the university that it shares with students a joint responsibility to address

    academic language needs. This contrasts with the situation of international students applying for admission,

    where the onus is on the students to demonstrate, by paying a substantial fee for an international English

    test, that they have adequate competence in the language. For students and for departments, DELNA is free

    of charge and several of the language support options are available to students at no additional cost to them.

    In operation, DELNA involves two phases of assessment, Screening and Diagnosis, as shown in Table 1. The

    Screening measures were designed to provide a quick and efficient means of separating out native speakers and other


    proficient users of the language who were unlikely to encounter difficulties with academic English, and exempting

    them from further assessment. Both of the Screening measures are computer-based. One is a vocabulary test, assessing

knowledge of a sample of academic words by means of a simple word-definition matching format (Beglar & Hunt,

    1999). The other, variously known in the literature as a speed reading (Davies, 1975, 1990) or cloze-elide (Manning,

    1987) format, is a kind of reverse cloze procedure. In each line of an academic-style text an extraneous word is inserted

and the test takers must identify each inserted word under a speeded condition, which means that only the most proficient students complete all 73 items within the time available. In a validation study (Elder & Erlam, 2001), the reliability estimates were 0.87 for Vocabulary and 0.88 for Speed Reading. The two tests correlated with a composite

    listening, reading and writing score from the Diagnosis (see below) individually at 0.74 (vocabulary) and 0.77 (speed

    reading), and collectively at 0.82.
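To make the combination of the two Screening measures more concrete, the short sketch below (in Python, with entirely hypothetical scores) computes a composite Screening result and its Pearson correlation with a composite of the Diagnosis band scores, in the spirit of the Elder and Erlam (2001) analysis. None of the numbers or score ranges here are DELNA's actual values.

def pearson(x, y):
    # Pearson product-moment correlation for two equal-length lists of scores
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / (var_x * var_y) ** 0.5

# Hypothetical results for five candidates
vocabulary    = [58, 44, 71, 39, 65]        # academic vocabulary items correct
speed_reading = [55, 40, 68, 30, 60]        # cloze-elide insertions identified (max 73)
diagnosis     = [6.5, 5.0, 7.5, 4.5, 7.0]   # mean of listening, reading and writing bands

screening_composite = [v + s for v, s in zip(vocabulary, speed_reading)]
print(round(pearson(screening_composite, diagnosis), 2))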

    For students who score below a threshold level on the Screening, the three measures in the Diagnosis phase provide

a more extensive, task-based assessment of their academic language skills. Unlike the computerised Screening measures,

    they are all paper-based instruments. In the Listening test (30 min), the students hear an audio-recorded mini-lecture on

    a non-specialist topic and respond to short answer, multiple-choice and information transfer items. The Reading test

    (45 min) is based on one or two reading texts on topics of general interest totalling about 1200 words. Various item types

    are used, including cloze, information transfer, matching, multiple-choice, true-false and short answer. For the Writing

task (30 min), the candidates write 200 words of commentary on a social trend, as presented to them in the form of a simple table or graph. Their writing is rated on three analytic scales: fluency, content, and grammar and vocabulary.

    The Diagnosis phase takes 2 hours to administer, as compared to 30 min for the Screening, and is obviously more

    expensive in other respects, in that it requires manual scoring and, in the case of the writing task, double rating on the

three scales by trained examiners (for research on the training procedures, see Elder, Barkhuizen, Knoch, & von Randow, 2007; Knoch, Read, & von Randow, 2007). The Elder and Erlam (2001) validation study obtained reliability estimates of 0.82 for Listening and 0.83 for Reading. In the case of Writing, the two recent studies just cited (Elder et al., 2007; Knoch et al., 2007) produced estimates of 0.95–0.97 for the reliability of candidate separation, using the

    FACETS program.
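As an illustration only, the sketch below shows one straightforward way that a doubly rated writing script on the three analytic scales might be summarised into per-scale and overall results. The band values and the simple averaging are assumptions made for the example; they do not reproduce DELNA's operational rating procedure or the multi-faceted Rasch (FACETS) analysis.

SCALES = ("fluency", "content", "grammar_and_vocabulary")

def summarise_writing(rater_a, rater_b):
    # Average the two raters' bands on each analytic scale, then across the scales
    per_scale = {scale: (rater_a[scale] + rater_b[scale]) / 2 for scale in SCALES}
    overall = round(sum(per_scale.values()) / len(SCALES), 1)
    return per_scale, overall

# Hypothetical ratings on the 9 (high) to 4 (low) band scale
first_rater  = {"fluency": 6, "content": 5, "grammar_and_vocabulary": 6}
second_rater = {"fluency": 5, "content": 5, "grammar_and_vocabulary": 6}
print(summarise_writing(first_rater, second_rater))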

    Further details of the two phases of the DELNA assessment, including sample items and tasks, can be found in the

    DELNA Handbook, which is downloadable from the programme website: www.delna.auckland.ac.nz.

    Set out this way, DELNA looks very much like a conventional language test. Certainly the Diagnosis tasks are sim-

ilar to those found in IELTS and other EAP proficiency tests. However, the intended purpose of the instrument is different and this means that it needs to be presented in a distinctive manner, in keeping with the principles outlined at the beginning of this section.

    4. An analysis of test purpose

    A useful framework for analysing how test purpose should influence test design and delivery is that developed by

Read and Chapelle (2001). Although the framework is exemplified in terms of vocabulary testing, it has general applicability to various forms of language assessment. As shown in Fig. 1, the framework has numerous components and

    it is beyond the scope of the present article to consider them all in detail.

At the top level of the framework, test purpose is decomposed into three components (inferences, uses and intended impacts), which in turn lead to validity considerations and mediating factors. It is the second and third mediating factors which are of particular concern here, but it is also necessary to address the first component briefly.

Table 1

The structure of DELNA

Screening (30 min): Vocabulary; Speed Reading

Diagnosis (2 hours): Listening to a mini-lecture; Reading academic-type texts; Writing an interpretation of a graph


    4.1. Construct definition

The inferences to be made on the basis of performance in DELNA can be defined in terms of academic literacy in English: the ability of incoming undergraduate students to cope with the language demands of their degree programme. Although ultimately the assessment is targeted at students for whom English is an additional language (EAL), the construct is broader than academic literacy in English as an additional language because many of those to be assessed come from English-speaking backgrounds, and the whole function of the initial Screening phase of DELNA is to separate out students for whom adequate academic literacy is unlikely to be at issue. Designing a test for students with English as both a first and an additional language creates a special challenge because it cannot be assumed that items and tasks will perform the same way for the two groups. Elder, McNamara, and Congdon (2003)

    used Rasch analysis to investigate this issue and found a somewhat complex pattern, whereby each of the DELNA

    tasks except the vocabulary measure exhibited some significant bias in favour of either native or non-native speakers.

    However, since the bias was in both directions and relatively small in magnitude overall, the researchers considered

    that it was within tolerable limits for a low-stakes assessment of this kind.
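Elder, McNamara, and Congdon (2003) used Rasch techniques for this analysis. As a much cruder sketch of the underlying idea, checking whether a single item behaves differently for native (NS) and non-native (NNS) speakers once overall ability has been taken into account, the hypothetical code below compares proportion-correct on an item across groups matched on total score. The data format, group labels and banding scheme are all invented for illustration.

from collections import defaultdict

def item_bias_check(responses, groups, item_index, n_bands=4):
    # responses: one 0/1 vector per candidate; groups: "NS" or "NNS" per candidate.
    # Returns the mean NS-NNS gap in proportion correct across matched total-score bands;
    # positive values suggest the item favours native speakers, negative values the reverse.
    totals = [sum(r) for r in responses]
    lo, hi = min(totals), max(totals)
    width = max(1, (hi - lo + 1) // n_bands)

    correct = defaultdict(int)
    count = defaultdict(int)
    for resp, group, total in zip(responses, groups, totals):
        band = min((total - lo) // width, n_bands - 1)
        correct[(band, group)] += resp[item_index]
        count[(band, group)] += 1

    gaps = []
    for band in range(n_bands):
        if count[(band, "NS")] and count[(band, "NNS")]:
            p_ns = correct[(band, "NS")] / count[(band, "NS")]
            p_nns = correct[(band, "NNS")] / count[(band, "NNS")]
            gaps.append(p_ns - p_nns)
    return sum(gaps) / len(gaps) if gaps else 0.0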

Read and Chapelle (2001) distinguish three levels of inference: whole test, sub-test and item. For DELNA, item-level inferences are not appropriate. In the Screening phase, the construct is defined specifically in terms of efficient

    access to academic language knowledge and it is sufficient to make inferences at the level of the whole test. Thus, the

    vocabulary and speed reading scores are combined into a single result to determine whether the student should proceed

    to the Diagnosis phase.

Elder and von Randow (in press) have investigated the validity of inferences based on the Screening score, examining its suitability as a basis for determining whether students needed to proceed to the Diagnosis. Their study

    involved an analysis of the performance of 353 students who took both the Screening and Diagnosis measures. A

    minimum criterion score was set on the basis of performance in the listening, reading and writing tests of the

    Diagnosis phase. Then, by means of regression analysis, an optimum cut score (combining the vocabulary and speed

    reading scores) was established for the Screening phase. This cut score successfully identified 93% of the students

    whose performance fell below the criterion level in the Diagnosis phase. However, it also meant that relatively few

    students would be exempted from taking the costly Diagnosis measures and so, with financial considerations in

    mind, a lower cut score was set. The lower score identified only 81% of the students who were under the criterion

level but, on the other hand, it resulted in less than 1% of false negatives: students below the cut score who nevertheless had achieved the criterion level in the Diagnosis. Therefore, for operational purposes it is only students whose

    Screening performance falls under the lower cut score who are required to proceed to the Diagnosis. Those who

are between the two cut scores receive a general recommendation to seek academic language support (see 4.3 below).
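The cut-score trade-off described above can be sketched in a few lines. The function below, with entirely hypothetical scores and thresholds, computes the two figures at issue: the proportion of genuinely at-risk candidates that a given Screening cut score identifies, and the proportion of false negatives in the article's sense (candidates below the cut score who had in fact reached the Diagnosis criterion). It illustrates the kind of analysis reported by Elder and von Randow rather than their actual procedure.

def cut_score_tradeoff(screening, diagnosis, cut_score, criterion):
    # screening: combined Screening scores; diagnosis: composite Diagnosis scores
    pairs = list(zip(screening, diagnosis))

    at_risk = [s for s, d in pairs if d < criterion]            # truly below the criterion
    identified = sum(1 for s in at_risk if s < cut_score)       # ...and caught by the cut score

    flagged = [d for s, d in pairs if s < cut_score]            # required to take the Diagnosis
    false_negatives = sum(1 for d in flagged if d >= criterion) # ...but actually at criterion

    return {
        "at_risk_identified": identified / len(at_risk) if at_risk else None,
        "false_negatives": false_negatives / len(flagged) if flagged else 0.0,
    }

# Hypothetical data: compare a stricter and a more lenient cut score directly
screening = [95, 102, 110, 88, 120, 99, 105, 91]
diagnosis = [5.0, 6.5, 7.0, 4.5, 7.5, 5.5, 6.0, 5.0]
print(cut_score_tradeoff(screening, diagnosis, cut_score=106, criterion=6.0))
print(cut_score_tradeoff(screening, diagnosis, cut_score=100, criterion=6.0))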

[Fig. 1. A framework for incorporating a systematic analysis of test purpose into test validation (adapted from Read & Chapelle, 2001, p. 10). The figure relates three components of test purpose (inferences, uses and intended impacts) to the corresponding validity considerations (construct validity, relevance and utility, actual consequences) and mediating factors (construct definition, performance summary and reporting, test presentation), which feed into test design (decisions about the structure and formats of the test) and validation (arguments based on theory, evidence and consequences).]


    For those who complete the Diagnosis, sub-test inferences are desirable so that students can be advised on whether

    they should seek language support in each of the three skill areas of listening, reading and writing. This means that

each sub-test needs to provide a reliable measure of the skill involved. The reliability estimates quoted in Section 3 are

    very satisfactory from this perspective.

    4.2. Test presentation

    Although test presentation comes third in the Read and Chapelle framework, it is more appropriate to discuss it

    next in this account of DELNA. Presentation is a mediating factor that comes from a consideration of the impact of

a test (Messick, 1996). Read and Chapelle (2001) point out that most research on impact in language testing has focused on the washback effects of existing tests and examinations (see, e.g., Alderson, 1996; Cheng & Watanabe, 2004). However, Read and Chapelle argue that if the consequences of implementing a test are to be seen as an integral element in evaluating its quality, a statement of the intended impact of the instrument needs to be included in

    the specification of test purpose early in the development of a new test. Thus, the actual consequences of putting the

    test into operation can be evaluated by reference to the prior statement of intended impact. This means in turn that

    the test developers should consider how the intended impact can be achieved through the way that the test is

    presented.

Test presentation is a concept that has not received much attention in the literature and it deserves some consideration here. It consists of a series of steps, taken as part of the process of developing and implementing the test, to influence its impact in a positive direction. Since there are numerous stakeholders in assessment, particularly when the stakes are high, "[t]est developers choose to portray their tests in ways that will appeal to particular audiences" (Read & Chapelle, 2001, p. 18). These can include educational administrators, teachers, parents, users of the test results, and

of course the test takers, who need to be familiar with the test formats and willing to accept that the test is a fair assessment of their language abilities.

    Seen in this light, test presentation has a strong connection to that much maligned concept in testing, face validity.

Authors of introductory texts on language testing, starting with Lado (1961), have generally dismissed this concept as not being a credible form of evidence to support a validity argument, since it is based on "simple inspection" (Lado, 1961, p. 321) or "the judgment of an untrained observer" (Davies et al., 1999, p. 59).

However, this rejection of the concept has generally been accompanied by an acknowledgement that, although the term may be a misnomer, it represents a matter of genuine concern in testing. That is to say, test developers are confronted with a real problem if, regardless of the technical merits of the test, one or more of the stakeholder groups are not convinced that the content or the formats are suitable for the assessment purpose. Thus, Alderson, Clapham, and Wall (1995) give face validity a positive gloss as meaning "acceptable to users" (p. 173), echoing Carroll (1980), who had earlier proposed acceptability as one of the four desirable characteristics (along with relevance, comparability and economy) of a communicative test. In addition, Bachman and Palmer (1996, p. 42) and Davies et al. (1999, p. 59) refer to the even more positive notion of "test appeal".

    Thus, test presentation can be seen as a proactive approach to promoting the acceptability of the test to the various

stakeholders, and above all to the test takers, in order to achieve the intended impact. The test developer needs to ensure that the purpose and nature of the assessment are clearly understood, that it meets stakeholder expectations as much

    as possible, and that the test takers in particular engage with the test tasks in a manner that will help produce a valid

    measure of their language ability. Major proficiency tests generate a strong external motivation for students because of

the stakes involved, whereas with a programme like DELNA it is more important to create a positive internal motivation based on a recognition of the benefits that the results of the assessment may bring for the student.

    4.2.1. Presentation of DELNA to students

The general principles underlying the presentation of DELNA are those that were introduced in Section 3 above: the fact that the results are never used for admissions purposes; the term "assessment" is preferred to "test"; there is a significant element of personal choice for students; and the university shares with its students the responsibility for addressing their academic language needs. The slogan "Increases your chance of success", which has featured in DELNA publicity, is also intended to express the positive intent of the programme.

    There have been two main pathways to DELNA for students entering the university each semester. The first was

    literally by invitation. In the admissions office the records of incoming domestic students (citizens and permanent


    residents) were reviewed to identify those who had not provided evidence of their competence in English for tertiary-

level study. Students coming directly from secondary school in the last few years hold the National Certificate of Educational Achievement (NCEA), which includes a literacy requirement to demonstrate proficiency in academic

    reading and writing in English. However, mature students and recently arrived immigrants who enter the university

    under special admission often lack a recognised secondary school qualification or any other evidence of academic

    literacy.

Students thus identified received a letter inviting them to take the DELNA Diagnosis. For statutory reasons, as previously explained, the university could not make it mandatory but, that consideration aside, the wording of the letter had a positive tone which emphasised the intended role of DELNA in enhancing the students' study experience. Initially, the uptake of these invitations was relatively low but by 2005 it had reached about 40% (116/295).

    The other main pathway, which has now essentially superseded the first one, results from decisions by departments

    or faculties to require all students in designated first-year courses to take DELNA. Initially, this applied to programmes

which attracted a high proportion of EAL students, such as the Bachelor degrees in Business and Information Management, and in Film, TV and Media Studies. However, from 2007 it has officially become a requirement for almost all

    first-year students, regardless of their language background, to take the DELNA Screening. This not only observes the

    legal niceties but also highlights the important role of the Screening phase in efficiently separating out academically

    literate students for exemption from further assessment.

In 2007 a total of 5427 students were assessed through the DELNA programme. These students are estimated to represent around 70% of all the first-year students at the university that year, although the percentage is higher if

    groups such as transferring and exchange students are excluded. Of those who completed the Screening, 1208

    were recommended to return for the Diagnosis phase; however, only 504 (42%) did so. This shortfall is discussed

in Section 5 below.

    In terms of presentation, as DELNA assessment has become the norm for first-year students, it is increasingly

accepted as just another part of the experience of entering the university. Students are informed of the assessment requirement in their department's handbook and can obtain further information from the programme website, including

    the downloadable DELNA Handbook, with its sample versions of the assessment tasks and advice on completing

    them. In addition, it is easy for students to book for a DELNA session online at their preferred day and time. One other

    appealing feature of the Screening measures in particular is that they are computer-administered, which adds a novelty

    value for students who may never have taken such a language test before.

    4.2.2. Presentation of DELNA to staff

    Much of the initial impetus for the development of DELNA came from the concerns of teaching staff across the

university in the 1990s about the academic literacy needs of students in their classes. This created a positive environment for the acceptance of a programme like DELNA to address those needs, but of course that is not the same as an

    understanding of how this particular programme works.

The establishment of DELNA saw the formation of a Reference Group chaired by the Deputy Vice-Chancellor (Academic) and composed of representatives from all the university faculties as well as from the various language support

    programmes. The group meets regularly to discuss policy issues, monitor the implementation of DELNA and provide

a channel of communication from the programme staff to the faculties. The departments which were the early participants in DELNA are well represented on the group but, as the assessments have expanded, it has been necessary to

    open new avenues of communication to academic and administrative staff across the university to ensure that: a) an

informed decision is made when departments or faculties decide to require their first-year students to take the assessment; b) the relevant staff correctly interpret the DELNA results when they receive them; and c) effective follow-up

    action is taken to give students access to language support if they need it.

In 2005 an information guide for staff was produced in pamphlet form and it has been followed by an FAQ document. However, experience has shown that the printed material must be backed up by face-to-face meetings with key

    staff members responsible for DELNA administration in particular faculties or departments.

    4.3. Performance summary and reporting

This brings us back to the second mediating factor of the Read and Chapelle (2001) framework, performance summary and reporting, which relates to the intended use of the test. The assessment results are used to identify students


who may be at risk because of their low academic literacy and then to advise them on suitable forms of language support and enhancement. Where participation in DELNA is a course requirement, the results also go to the academic

    programme or department for follow-up action as appropriate. Thus, the two main recipient groups for the results

    are the students and their departments.

Given that the whole purpose of the programme is to address academic literacy needs, the reporting of student performance includes not only the assessment result but also a recommendation for language enhancement where appropriate. At this point, then, it is useful to list the main language support options available on the campus.

Credit courses in academic language skills: ESOL 100–102 for EAL students, and ENGWRIT 101, a writing course for students from English-speaking backgrounds.

Workshops, short non-credit courses, individual consultations and self-access study facilities offered by the Student Learning Centre (SLC, available to all students) and the English Language Self-Access Centre (ELSAC, specialising in services for EAL students).

    Discipline-specific language tutorials linked to particular courses (following a kind of adjunct model) which

    have for some time attracted a high proportion of EAL students. Currently these courses are in Commerce,

    Health Sciences, Theology, and Film, TV and Media Studies.

The Screening phase of DELNA is primarily intended to exempt highly proficient students from further assessment. Thus, the scores from the vocabulary and speed reading measures are combined to divide the test-takers into three categories with deliberately non-technical labels (a schematic sketch of the mapping follows the list):

Good: no language enrichment required.

Satisfactory: some independent activity at SLC or ELSAC recommended.

Recommended for Diagnosis: should take the DELNA Diagnosis.
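Schematically, and only as an illustration, the mapping from a combined Screening score to these three categories might look like the sketch below. The cut scores shown are invented; as noted elsewhere, the operational cut scores depend on the test form.

GOOD_CUT = 120          # hypothetical combined-score threshold
SATISFACTORY_CUT = 100  # hypothetical combined-score threshold

def screening_category(vocabulary_score, speed_reading_score):
    combined = vocabulary_score + speed_reading_score
    if combined >= GOOD_CUT:
        return "Good"                     # no language enrichment required
    if combined >= SATISFACTORY_CUT:
        return "Satisfactory"             # independent activity at SLC or ELSAC recommended
    return "Recommended for Diagnosis"    # should take the DELNA Diagnosis

print(screening_category(68, 60))  # -> "Good" with these invented cut scores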

The Screening result is sent individually to each student by email and, when DELNA is a departmental requirement, an Excel file of results for each course is forwarded to a designated staff member. Until 2006, the Screening

    reports included the two actual test scores for vocabulary and speed reading. However, the fact that the cut scores

for the three categories varied according to which form of the test each student took caused some confusion and, in addition, there were indications that the scores were being used in at least one academic programme as quasi-proficiency measures to assign students to tutorial groups according to their language ability. This led to the current policy of reporting just the student's category.

    In the case of the Diagnosis phase, a scale modelled on the IELTS band scores (from a top level of 9 down to 4)

    has been used for rating performance and reporting the results to students. However, for reporting to staff a simpler

    A-B-C-D system is used for each of the three skills (listening, reading and writing). The A and B grades correspond to

    the Good and Satisfactory categories respectively in the Screening, and students whose three-grade average is at one of

    these levels receive an email report. On the other hand, students averaging in the C and D range, who are considered to

    be at significant risk, are sent an email request to collect their results in person from the DELNA language adviser. As

    with the Screening, the results are also sent to the designated staff member when the Diagnosis is required by the

    department.
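A rough sketch of this reporting logic follows: band scores for the three skills are converted to the letter grades used in staff reports, and the average band determines whether the student simply receives an emailed report or is asked to collect the results from the language adviser. The band-to-grade boundaries are assumptions for illustration only; the operational conversion is not specified in the article.

def band_to_grade(band):
    # Hypothetical boundaries on the 9 (high) to 4 (low) band scale
    if band >= 8:
        return "A"
    if band >= 7:
        return "B"
    if band >= 6:
        return "C"
    return "D"

def diagnosis_report(listening, reading, writing):
    bands = {"listening": listening, "reading": reading, "writing": writing}
    grades = {skill: band_to_grade(b) for skill, b in bands.items()}
    average_band = sum(bands.values()) / 3
    # A or B averages are emailed; C or D averages see the language adviser in person
    if band_to_grade(average_band) in ("A", "B"):
        delivery = "email report"
    else:
        delivery = "collect results from the DELNA language adviser"
    return grades, delivery

print(diagnosis_report(6, 7, 5))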

    The appointment of the language adviser, beginning in 2005, resulted from a recognition that students scoring low

in the Diagnosis were generally not accessing the recommended language support. A small-scale study by Bright and

    von Randow (2004), involving follow-up interviews with eighteen DELNA candidates at the end of the academic year,

    found that only four of them had taken the specific advice given in the results notification. Although most of the

    participants had in fact passed their courses, they acknowledged that they had really struggled to meet the language

demands of their studies. One strong message from the interviews was that the students would have greatly appreciated the opportunity to discuss their DELNA results and their language support options face-to-face, rather than just

    receiving the impersonal emailed report. Thus, the language adviser now meets with each student individually, goes

    over their profile of results, and directs them to the most appropriate form(s) of support. She often follows up the initial

    meeting with ongoing monitoring of their progress through the semester or even longer.

    Thus performance summary and reporting in this case involves not simply the form of the report but also, for the

    less proficient students, the medium by which the result is communicated to them.


    5. Evaluating the programme

The extended discussion in Section 4, drawing on the Read and Chapelle (2001) framework, has shown how the purpose of the assessment has been worked out through the design and delivery mechanisms of DELNA. At the time of writing, the programme is still being rolled out. It has yet to achieve full participation by the incoming first-year student population in the Screening phase, and furthermore in 2006 only a third of the students (444 out of 1340) who were recommended for Diagnosis on the basis of their Screening results actually went on to the second phase of the assessment. Higher levels of participation will depend on the extent to which faculties and departments enforce the requirement that their students should take one or both phases of DELNA. Some academic programmes have introduced specific incentives for students to take the assessment, by, for instance, withholding the first essay grade or subtracting a few percent of the final course grade of students who do not comply.

    However, the point of the exercise is not just to assess the students but rather to address their academic language

needs where appropriate. As noted in the previous section, there is now provision within the DELNA programme itself, through the work of the language adviser, to provide intensive counselling for those students whose results in the

    Diagnosis phase show that they have the most serious language needs. Some academic units have introduced their own

follow-up measures for such students. For example, the Bachelor's degree in Business and Information Management has a well-established Language and Communication Support Programme (LanCom), which integrates various forms of support into the delivery of its courses. In the Faculty of Engineering, students who score below a minimum level in the DELNA Diagnosis must undertake a quasi-course with its own course code, involving attendance at 15 hours of

    workshops at the Student Learning Centre (SLC) and satisfactory completion of another 15 hours of directed study at

    the English Language Self-Access Centre (ELSAC).

With the expansion of DELNA assessment into the Faculties of Arts and Science, it is more of a challenge to respond to the language needs of students enrolled for a degree which includes courses offered by several different departments. In the first instance, the Screening results may simply provide course conveners with a broad profile of the

    language needs of their students, who may be several hundred in number. Many departments lack the resources to offer

specialised language support to their students. One realistic option for them is the introduction of systematic procedures for referring students in need to SLC or ELSAC; another option may be to review their teaching and assessment

    practices to avoid creating unnecessary difficulties for EAL students in their courses.

Returning briefly to the Read and Chapelle (2001) framework, one element in the validation of a test or assessment procedure is an investigation of its actual consequences as compared to its intended impact. At the institutional level, the intended impact can be defined in terms of levels of academic literacy in the student population. The implementation of DELNA is supposed to lead to a meaningful reduction over time in the number of students whose academic

    performance is hampered by language-related difficulties. The question is what kind of data counts as evidence that

    the goal is being achieved for the undergraduate student body as a whole.

Davies and Elder (2005) took up this point in their review of current theory and practice in language test validation, using DELNA as a case study. They formulated a series of eight hypotheses that can be investigated to build an argument for the validity of DELNA. Most of the hypotheses relate either to the technical qualities of the DELNA tests

    as measures of academic literacy or to the utility of the scores to the users. However, the final hypothesis takes up the

    issue of the wider impact of the programme:

H.8 "The student population will benefit from the information that the test provides" (2005, p. 805).

    Davies and Elder highlight a number of challenges in, first, defining the nature of the benefit and then gathering

evidence in support of the hypothesis. One way to address the hypothesis would be to define the benefit as an increase in academic literacy, as measured by a further assessment of students' language proficiency after, say, a semester or two of

    study. However, DELNA is set up as a one-time assessment for each student and the system blocks them from taking it

    more than once. In addition, there are currently no plans to introduce an exit test of English proficiency for graduating

    students.

    This means that we need to look for benefits in other ways. One kind of evidence relates to student uptake of the

    DELNA advice by accessing the various language support options available to them. If they enrol in an ESOL credit

    course or attend a tutorial linked to one of their subject courses, their progress and end-of-course achievement will be

    assessed by their tutors. On the other hand, it is more of a challenge to monitor the benefit gained by students who

    participate in the support programmes at SLC and ELSAC. Students are required to register when they first access


    these programmes and records are kept of their attendance at workshops and individual consultations, but that is not

the same as assessing the benefit of these language support opportunities in improving the students' academic language proficiency.

    A broader approach to the situation is to look at grade point averages and retention rates for whole undergraduate

cohorts, particularly in courses with large EAL student enrolments. As Davies and Elder (2005) point out, though, it is

    difficult to separate out language proficiency from academic ability, motivation, sociocultural adjustment and the

    range of other factors that influence student achievement in their university studies, particularly if underachievement

    is represented not just by dropout or failure rates but also by lower grades than the student might otherwise have

    achieved. The issues involved are reminiscent of those which have complicated research on the predictive validity

of major proficiency tests like TOEFL and IELTS (see, e.g., Hill, Storch, & Lynch, 1999; Light, Xu, & Mossop, 1987).

    Thus, global university-wide measures of impact may prove to be less useful than more focused investigations of

    particular groups of students. One such study, being conducted by the DELNA Programme in conjunction with the

Department of Film, TV and Media Studies, is tracking a cohort of students through their three years of study towards

    a BA major in FTVMS. The data include annual interviews with the students as well as the quantitative measures

    provided by the initial DELNA results and their course grades. Through this kind of targeted research, it will become

    possible to develop a validity argument that combines rich qualitative evidence with more objective measures of

students' language proficiency and academic achievement.

    6. Conclusion

    The DELNA assessment programme has a number of features that differentiate it from other tests of English for

    academic purposes. First, it does not function as a gatekeeping device for university admission, and students cannot be

    excluded from the institution on the basis of their results in either phase of the assessment. The fact that some students

    find this hard to believe helps to account for the relatively low participation rate in the Diagnosis phase of DELNA

    among those who are recommended to take it. Secondly, it is not simply a placement procedure to direct students

    into one or more courses within a required EAP programme according to their level and areas of need. There is a range

    of language support options that students are recommended to participate in as appropriate. A related feature is the

    distinctive philosophy behind the programme which holds that students should retain a degree of personal choice

as to whether they take advantage of the opportunities for language and study support which are available to them. Although it partly reflects the constraints imposed by national education legislation, this approach is also based on

    the assumption that academic language support will be more effective if students recognise for themselves the extent

    of their language needs and make a commitment to attend to them.

    One other important characteristic of DELNA is that it is centrally funded with a direct management line to the

office of the Deputy Vice-Chancellor (Academic). Although its offices are located in the Department of Applied Language Studies and Linguistics, the programme has always been conceived as a university-wide initiative. This helps to

    avoid any perception that DELNA is just serving the interests of a particular department or faculty. It is an issue that

    has emerged in discussions with staff from other New Zealand universities about the possibility of introducing

    a DELNA programme on their own campuses. Initial enquiries have typically come from student learning advisers

    or ESOL tutors who have thought in terms of purchasing a set of diagnostic tests for their own institution. However,

a briefing on the full scope of DELNA and its associated language support provisions reveals how much more is involved, with a firm commitment by senior management being a crucial element in the successful operation of the programme at Auckland.

    The DELNA programme is moving to a consolidation phase after the considerable expansion in the coverage of

    incoming undergraduate students over the past couple of years. There is a consequent need to ensure that effective

use is made of the DELNA results and that an increasing proportion of the targeted students participate in the appropriate forms of language support and enhancement. The position of DELNA as a centrally funded programme is secure for the foreseeable future, although it remains to be seen to what extent the university will be able to commit

    sufficient resources to meet the range of language needs that the assessment results are revealing. Other related issues

may yet emerge, such as the need to set language proficiency standards for students graduating from Bachelor's programmes or concerns about the academic literacy of postgraduate students. For now, though, it is widely accepted

    within the institution that DELNA is a very worthwhile means of addressing the language needs of incoming

    undergraduates.


    References

    Alderson, J. C. (Ed.). (1996). Washback in language testing [Special issue]. Language Testing, 13(3).

    Alderson, J. C., Clapham, C., & Wall, D. (1995). Language test construction and evaluation. Cambridge: Cambridge University Press.

    Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice. Oxford: Oxford University Press.

Beglar, D., & Hunt, A. (1999). Revising and validating the 2000 word level and university word level vocabulary tests. Language Testing, 16, 131–162.

Bright, C., & von Randow, J. (2004, September). Tracking language test consequences: The student perspective. Paper presented at the Ninth National Conference on Community Languages and English for Speakers of Other Languages (CLESOL), Christchurch, New Zealand.

Brown, J. D. (1993). A comprehensive criterion-referenced language testing project. In: D. Douglas, & C. Chapelle (Eds.), A new decade of language testing research (pp. 163–184). Washington, DC: TESOL.

Carroll, B. J. (1980). Testing communicative performance. Oxford: Pergamon.

Cheng, L., & Watanabe, Y. (2004). Washback in language testing: Research contexts and methods. Mahwah, NJ: Lawrence Erlbaum Associates.

Davies, A. (1975). Two tests of speed reading. In: R. L. Jones, & B. Spolsky (Eds.), Testing language proficiency (pp. 119–130). Arlington, VA: Center for Applied Linguistics.

    Davies, A. (1990). Principles of language testing. Oxford: Blackwell.

    Davies, A., Brown, A., Elder, C., Hill, K., Lumley, T., & McNamara, T. (1999). A dictionary of language testing. Cambridge: Cambridge

    University Press.

Davies, A., & Elder, C. (2005). Validity and validation in language testing. In: E. Hinkel (Ed.), Handbook of research in second language teaching and learning (pp. 795–813). Mahwah, NJ: Lawrence Erlbaum.

Elder, C., Barkhuizen, G., Knoch, U., & von Randow, J. (2007). Evaluating rater responses to an online training program for L2 writing assessment. Language Testing, 24, 37–64.

Elder, C., & Erlam, R. (2001). Development and validation of the Diagnostic English Language Needs Assessment (DELNA): Final report. Auckland: Department of Applied Language Studies and Linguistics, University of Auckland.

Elder, C., McNamara, T., & Congdon, P. (2003). Rasch techniques for detecting bias in performance assessments: An example comparing the performance of native and non-native speakers on a test of academic English. Journal of Applied Measurement, 4, 181–197.

    Elder, C., & von Randow, J. (in press). Exploring the utility of a web-based English language screening tool. Language Assessment Quarterly.

    Ellis, R. (1998). Proposal for a language proficiency entrance examination. Unpublished manuscript, New Zealand: University of Auckland.

    Ellis, R., & Hattie, J. (1999). English language proficiency at the University of Auckland: A proposal. Unpublished manuscript, New Zealand:

    University of Auckland.

Fox, J. (2004). Test decisions over time: Tracking validity. Language Testing, 21, 437–465.

Fulcher, G. (1997). An English language placement test: Issues in reliability and validity. Language Testing, 14, 113–139.

    Harklau, L., Losey, K. M., & Siegal, M. (Eds.). (1999). Generation 1.5 meets college composition: Issues in the teaching of writing to U.S.

    educated learners of ESL. Mahwah, NJ: Lawrence Erlbaum.

Hill, K., Storch, N., & Lynch, B. (1999). A comparison of IELTS and TOEFL as predictors of academic success. In: R. Tulloh (Ed.), IELTS research reports (Vol. 2, pp. 52–63). Canberra: IELTS Australia.

Knoch, U., Read, J., & von Randow, J. (2007). Re-training writing raters online: How does it compare with face-to-face training? Assessing Writing, 12, 26–43.

    Lado, R. (1961). Language testing. London: Longman.

Light, R. L., Xu, M., & Mossop, J. (1987). English proficiency and academic performance of international students. TESOL Quarterly, 21, 251–261.

Manning, W. H. (1987). Development of cloze-elide tests of English as a second language. TOEFL Research Report, No. 23. Princeton, NJ: Educational Testing Service.

Messick, S. (1996). Validity and washback in language testing. Language Testing, 13, 241–256.

Read, J., & Chapelle, C. A. (2001). A framework for second language vocabulary assessment. Language Testing, 18, 1–32.

Read, J., & Hayes, B. (2003). The impact of IELTS on preparation for academic study in New Zealand. In: R. Tulloh (Ed.), IELTS research reports 2003 (Vol. 4, pp. 153–205). Canberra: IELTS Australia.

Wall, D., Clapham, C., & Alderson, J. C. (1994). Evaluating a placement test. Language Testing, 11, 321–344.

John Read is Head of the Department of Applied Language Studies and Linguistics at the University of Auckland. His primary research interests

    are in vocabulary assessment and testing English for academic and professional purposes. He is the author of Assessing Vocabulary (Cambridge,

    2000) and has been co-editor of Language Testing.
