
NEWCASTLE UNIVERSITY

Annual Report of External Examiner for a taught programme Session: 2012/13

NOTES

5. Please return your report as soon as possible after completion of your duties (no later than 1 September for undergraduate programmes; and no later than four weeks following the Board of Examiners – by 30 November if possible – for postgraduate programmes) to: Academic Administration Manager, c/o Examinations & Academic Events, Newcastle University, King’s Gate, Newcastle upon Tyne NE1 7RU.

6. The report will be circulated to relevant staff in the School and to relevant University committees (of which student representatives are members) and used in the Annual Monitoring and Review and Internal Subject Review processes.

7. All comments provided are highly valued by the University in contributing to the maintenance and enhancement of the academic standards of its awards and will be taken into full consideration where these fit within Newcastle University policy and practice.

8. Full details of how this report will be considered can be found in Section 5, Appendix 1 of the Policy and Procedures for External Examiners of Taught Programmes.

PART 1

Name:

Diane Schmitt

Institutional Address:

Nottingham Language Centre, Nottingham Trent University, Burton Street, Nottingham NG8 1NG

School in which examining undertaken:

INTO Newcastle

Programme/s examined:

INTO Diploma Business

Year of appointment: 2012/13

Is this your final year?

NO

If you wish to make an additional, confidential report to the Vice-Chancellor on a matter of particular importance and/or sensitivity, please do so on a separate sheet, indicating that it is confidential.


Please provide any comments for the attention of the Board of Studies under the following headings:

Exemplary Practice

The commitment of the EAP team to developing the programme and all aspects of its delivery.

Commendations

Semester One assessments are used to provide formative feedback to students, but do not contribute to the final grade.

Recommendations

The listening test must be replaced as soon as possible with an assessment that clearly assesses candidates’ listening ability. The reading test also requires urgent revision, as do the scales for reporting student performance.


PART 2: QUALITY AND STANDARDS

Please answer the following questions indicating yes/no as appropriate, with any additional comments you wish to make:

Intended Learning Outcomes

Are the intended learning outcomes of the programme(s) appropriate (compared to those in similar programmes elsewhere in the sector)? Are the intended learning outcomes appropriate to the level of award as set out in the Framework for Higher Education Qualifications?

No Yes X N/A

Yes, although there is a heavy focus on writing, to the detriment of other skills.

Curriculum

Does the curriculum enable students to attain the intended learning outcomes of the programme?

No Yes X N/A

As above, writing takes up a disproportionate amount of syllabus time – roughly 46% of sessions focus on writing, with the remaining time split between reading, listening, speaking, vocabulary, grammar and skills. There is scope for giving more attention to these other skills, since writing is increasingly not the only way students are assessed. The limited number of sessions on seminar skills, for instance, and their proximity to the actual assessment create the appearance of teaching to the test rather than developing these skills for use in other areas of the EAP module or the other Diploma modules.

Methods of Assessment

Are the methods (and balance) of assessment appropriate in measuring student achievement in relation to the intended learning outcomes?

No Yes X N/A

The balance of assessment is appropriate in relation to the learning outcomes set.

Criteria for assessment and examination

Are the assessment criteria appropriate for measuring student attainment of the intended learning outcomes?

The listening assessment is in no way fit for purpose and must be replaced. The exam consists of four parts. The first part purports to be a lecture, but is essentially a reading and multiple-choice gap-fill exercise which does not require any listening to complete. The second, third and fourth parts of the test are built around a debate. These parts look more like a listening test, but it is difficult to work out exactly what aspects of listening they aim to assess. Looking at the student marks, it is clear that Part 1 does not discriminate between candidates at all and Part 2 very little. Thus neither of these two parts provides any real value to the exam. Only Parts 3 & 4 appear to spread out the candidates in any way, but it is difficult to interpret what exactly this tells us about students’ ability to carry out listening in academic contexts. A new listening exam must clearly outline which aspects of listening it aims to assess, and this should be documented in a set of test specifications which can be used to develop parallel exams for use with the two yearly cohorts and for future years. The exam should be trialled on students from other INTO centres or other UK universities to ensure that the difficulty level of the items is appropriate.

The reading assessment would also benefit from careful consideration of what exactly you are aiming to assess. The items focus on the word, sentence and paragraph level and do not require the students to make any links across paragraphs or demonstrate understanding of the texts as a whole. There does not appear to be an obvious rationale for the weighting of items, as some items which are worth two points require two pieces of information and others three. It is not clear how marks are awarded for these types of items. The module handbook lays out a range of reading skills, most of which are not assessed by this exam or explicitly by any other assessment on the programme. The EAP team should look to developing test specifications which clearly lay out the purpose of the exam, what it should cover and the desired format of the item types. This will enable them to ensure that the relevant learning outcomes are covered and that parallel forms can be developed for subsequent cohorts. The EAP team should also consider trialling their exams with other similar cohorts at other institutions to ensure that the level of difficulty of their exams is appropriate and that all items are working as expected. The exam is supposedly modelled on the IELTS reading exam, but it does not assess the range of reading skills covered in that exam. It would also make more sense for the exam to relate to what is taught on the INTO curriculum rather than to an external exam.

The writing exam is a good attempt at providing a subject-specific assessment. It is not clear to me, as the external examiner, where the background knowledge required for success in the assessment comes from. Is this based on content students have covered in another module? This needs to be made clearer. The writing exam would also benefit from having marking criteria that are clearly differentiated from the coursework writing criteria.

Are the marking criteria effective in discriminating between levels of attainment in relationship to the classification of the award?

No X Yes N/A

According to the Module Handbook, INTO Newcastle uses a common grading scale from 10-90. However, the conversion scales for the Reading Test are from 30-90 and for listening from 0-100. On the foundation programme, the listening conversion scale is from 40-80, and the Graduate Diploma scales are different again. It is not clear how each of these scales is related to the others, and this proliferation of scales appears to undermine the notion of a common grading scale. Furthermore, INTO claims that students who receive marks of 65 or 70 on their common grading scale demonstrate the same level of proficiency as IELTS levels 6.5 and 7.0 respectively. What evidence, if any, do you have for these claims? The evidence I see leads me to a very different conclusion. The writing and speaking coursework assessment tasks undertaken by students on the foundation, diploma and graduate diploma programmes have been designed to reflect the educational level of each programme. Several are linked to subject modules. The use of the common scale with generic IELTS descriptors leads one to believe that the task difficulty and language difficulty required for achieving points along the scale are equivalent. In other words, use of the common scale and descriptors implies that students at foundation, diploma and graduate diploma level who all achieve a mark of 65 on their writing are demonstrating the same level of skill and language use. This is patently not the case given the major differences in the requirements for each assessment task. The links to the IELTS scale are problematic for a number of reasons and should be severed. See Appendix A for a fuller discussion of this issue.

Internal marking

Is internal marking (in accordance with the marking criteria) impartial, fair and consistent?

No Yes X N/A

Yes, to the extent it can be given the confusion highlighted above.


Standards

Are the standards of the programme(s) appropriate? Please refer to the national subject benchmark statements (where appropriate), the Framework for Higher Education Qualifications, the programme specification and (where appropriate) requirements of professional statutory bodies

No Yes X N/A

The quality of the work produced by students who achieve passing grades and above is high.

Comment upon the extent to which the stated output standards for the programme(s) are comparable with those of similar programmes in other UK higher education institutions.

The module learning outcomes are in line with those set by other similar programmes at other institutions I am familiar with. The amount and weighting of assessments is also similar.

Comment upon the comparability of the output standards achieved by students with those achieved by students on similar programmes elsewhere.

For the assessments that do actually work well, e.g. the writing and speaking coursework, the speaking exam and the writing exam, the students’ output is on a par with that of students on similar programmes. However, we cannot be sure of this when it comes to the listening and reading examinations, as these do not succeed in assessing the learning outcomes for this programme.

Comment upon the particular strengths and weaknesses of the current cohort.

Given the range of other issues highlighted above and the limited amount of time allotted for me to review the work for this programme and the Graduate Diploma, I am afraid I am unable to comment on the details of this particular cohort’s performance.

For examiners of subjects which contribute to joint or combined honours programmes, please provide separate comments on the comparability of standards and student performance specifically relating to joint or combined honours.

n/a


PART 3: PROCEDURES

Were you given sufficient notice of the examination dates and of the meeting of the Board of Examiners?

No Yes X N/A

If you made any recommendations in your previous report have these been addressed by the University? Please outline briefly any issues which you feel have not been considered appropriately.

No Yes N/A X

Was the Board of Examiners conducted in accordance with the University's policies and procedures?

No Yes X N/A

Is the process of assessment effective and fair in its treatment of individual candidates, particularly with regard to the exercise of discretion?

No Yes X N/A

Were you given sufficient information on the following to enable you to fulfil your duties?

Programme aims and learning outcomes (eg, the programme specification)

Curriculum content

Assessment procedures (including assessment criteria)

Marking scheme and instruction to examiners

Standards

No Yes X N/A

Yes, except that I would like to see answer keys to examinations at the time I am asked to approve them. It would be helpful if the spreadsheet of student marks included student numbers for ease of cross-referencing. It was extremely useful to see students’ entry scores and be able to make comparisons between entry and exit scores.

Were you given an opportunity to comment upon the assessment structure of the programme, where applicable? Were these appropriate in terms of the intended learning outcomes and of an appropriate standard?

No Yes X N/A

The overall structure, which combines a mix of coursework and examination, is appropriate. However, the listening and reading exams are weak links in this structure. Also, the scales used to report student performance are out of sync with what is actually being assessed and should be replaced with scales and descriptors that are more accurately aligned with how the students are being assessed. As repeatedly noted by the previous external, the use of an IELTS-linked scale means that student marks on EAP are artificially inflated. I was shown a tracking analysis of students from the 2010/11 diploma cohort through to graduation in 2013. Average EAP marks are about 13 points higher than the average marks earned by students in their final degree classification.

The weaknesses I have outlined in the assessment process are serious and need to be seen as a priority for both the development and the ongoing delivery of this programme. It is important for INTO and University management to note that achieving the quality needed for high-stakes assessments such as these will require an investment in time and, importantly, staff development. Very few EAP teachers have had training in assessment development. The expertise required to deliver the training is available both within Newcastle University and within the INTO organisation; alternatively, it can be provided by bringing in external assessment specialists. Whichever route INTO and the University choose to take, CPD in this area must be a priority.

Did you have sufficient opportunity to review student work and examination scripts? Was the sample provided sufficient for you to make the required judgements?

No X Yes N/A

The sample was sufficient, but the amount of time was not, given the size of the cohort, the range of assessments, and the fact that I am reviewing two programmes with different assessment structures.

(Where appropriate) Were you given an adequate opportunity to participate in the assessment process through involvement in, for example, practicals/clinical examinations/exhibitions etc. If YES, was the method of selection of students appropriate?

No Yes N/A X

From the achievements of students, does the quality of the programme seem adequate to support the attainment of the standards of the award?

No Yes X N/A

This is largely due to the amazing commitment of the module leaders and the teaching staff.

Collaborative Programmes

(For external examiners of collaborative programmes, including articulations, franchises, validations, and multiple or joint awards) Were you offered adequate information about the collaboration, e.g. a copy of the memorandum of agreement?

No Yes N/A X

Were you offered adequate information about any variations in the programme compared to the same or similar ones offered at Newcastle?


No Yes N/A X

Did you have the opportunity to compare the achievements of candidates taught by the partner institution with those taught by the University? If YES, were the achievements of students taught by the partner institution comparable to those taught by the University? If NO, from the achievements of students, does the quality of the programme at the partner institution seem adequate to support the attainment of the standards of the award?

No Yes N/A X

Signed: Date: 25 October 2013


Appendix A

Newcastle University requires that end-of-programme assessment marks be reported using the IELTS scale. It is strongly advised that this practice be ended because it gives the impression that there is no difference between students who are directly admitted to the university on the basis of an IELTS (or other English proficiency test) score and those who are admitted as a result of successfully completing a foundation, diploma or graduate diploma programme with INTO. It is important that users of IELTS scores and INTO programme reports have a clear understanding of the differences between these entry routes, and an important step towards achieving this aim would be to change the way INTO’s programme assessment marks are reported to better reflect what is taught and assessed.

The IELTS test is a general proficiency test of the English language that makes use of academic-related topics to assess reading and writing. The speaking and listening tasks do not use academic topics. The reading and listening texts are short, as are the writing texts that students produce – 150 and 250 words for the latter. There is no requirement for test takers to integrate the skills, e.g. to write about something they have read or listened to. The IELTS exam provides a 3-hour snapshot of an applicant’s level of proficiency in the English language. The IELTS test does not claim to, and cannot, provide the university with information about an applicant’s ability to apply their language skills to the types of academic tasks they will be required to undertake as university students.

INTO Foundation, Diploma and Graduate Diploma programmes, on the other hand, offer year-long courses of instruction with 100 to 200 hours of English for Academic Purposes tuition which aim both to develop students’ language proficiency and, importantly, to teach them how to apply their current level of language to a range of academic tasks. Students are required to read genuine academic texts (academic books and articles), use knowledge gained from their reading to participate in seminar discussions and give academic presentations, and to write full-length academic assignments.

Because the purposes of the IELTS and INTO assessments are different, the tasks are, and must be, different. The assessment tasks on the INTO programmes call on a number of skills and strategies that cannot be assessed using an IELTS-style marking scheme. As noted above, the purpose of IELTS is to provide a point-in-time snapshot of a student’s language proficiency, whereas INTO assessments aim to provide information about the degree to which students have achieved the learning outcomes of a course of study.

Language proficiency tests are not linked to any specific syllabus because they are intended to be taken by students from a wide range of educational and learning backgrounds. One result of this is that scores on these tests are not very sensitive to change over short periods of study which are directly linked to specific syllabi and learning outcomes. There is a dearth of research on gains made on IELTS and other proficiency tests over time, but the few studies that have been done are not encouraging. Green (2005:57) found that students studying on EAP courses showed little improvement on the writing element of IELTS tests taken at the start and end of their courses. Green’s data show that students who started with the lowest scores made the greatest gains, 1.25 bands over 7-10 weeks and just under half a band over 3-4 weeks, while students who entered with higher scores improved by less than half a band over any of the time periods between testing. Scores of students entering with 6.0 were more likely to go down than up, most likely due to regression to the mean arising from the width of the IELTS bands and/or standard error of measurement, rather than to any regression in ability. Tonkyn (2012) found similar trends on the IELTS speaking test for a group of 24 students on a 10-week pre-sessional; the mean gain for the group was less than 0.25 of a band. While some students showed greater gains, Tonkyn’s results mirror Green’s, and both conclude that proficiency tests like IELTS do not function as sensitive progress tests and were never intended to do so. Thus, for universities that use IELTS band scoring for their own internal assessments, what we would expect to see is that most students would show very little change in their place on the IELTS scale, e.g. Graduate Diploma students entering with IELTS 6.0 would be unlikely to achieve IELTS 6.5 unless they benefit from score rounding. If students do achieve much greater gains on internal assessments, as some appear to do, it is just as likely that these “gains” are due to problems with the way the scale is being applied on the internal programme as to “true” gains in English language proficiency as measured by IELTS.

A further problem with use of an IELTS-linked common scale is that each of the INTO programmes has a number of explicitly stated learning outcomes that reflect the educational levels of foundation, Year 1 and pre-master’s work. These learning outcomes would more appropriately be assessed with criterion-referenced achievement assessments suited to each level. Use of the IELTS-linked common scale may lead score users to believe that the level of task difficulty required for achieving points along the scale is the same, i.e. an overall mark of 65 at foundation level refers to the same level of work as a 65 achieved at diploma or graduate diploma level. Task-specific descriptors that assess the learning outcomes for each programme level would provide a better indicator of students’ achievement of the respective learning outcomes and allow users to better understand what students at each level can do. Scales using numbers or letter grades can still be used if these are easier for score users to understand and use, but these would be backed up by level-specific descriptors that give a much clearer idea of students’ abilities to apply their language skills to academic tasks.

In sum, use of an IELTS-linked scale masks the key difference between what is assessed in a 3-hour exam and what is assessed through a combination of coursework assessments and examinations. Furthermore, the use of a common scale for English language fails to differentiate between the very different requirements of language use demanded by the three programme levels. Both of these situations represent bad academic practice. From my many conversations with staff responsible for the EAP modules on the Diploma and Graduate Diploma, I can see that the problem of the scale is holding back the development of assessment tasks that truly demonstrate what students have learned on each course.

References

Green, A. (2005). EAP study recommendations and score gains on the IELTS Academic Writing test. Assessing Writing, 10(1), 44-60.

Tonkyn, A. (2012). Measuring and perceiving changes in oral complexity, accuracy and fluency: examining instructed learners’ short-term gains. In A. Housen, F. Kuiken, and I. Vedder (eds.), Dimensions of L2 Performance and Proficiency: Complexity, Accuracy and Fluency in SLA. Amsterdam: John Benjamins, pp. 221-246.
