Practical Language Testing Glenn Fulcher

PLT: Designing test specification 2
Dr. Khalili Sabet
Presenter: Masoud Dolatshahi
University of Guilan
© Masoud Dolatshahi - Guilan University

3. A sample detailed specification for a reading test

A- Test framework
• The test purpose
• Target test takers
• Criterion domain
• Rationale for the test content

Test purpose and target audience
• Globalization: workers and immigrants
• The higher the level of education, the greater the capacity to undertake work and integrate socially
• Positive correlation between reading ability and earnings
• Inverse correlation between reading ability and the ability to access critical services such as health care

The test is not intended for:
• those who have already spent up to three years in the country prior to testing
• those below the age of 18

Annotation 1: The framework for this test is not terribly long, but presents a rationale for the purpose of the test and identifies the test takers. One important feature of this framework is that it states who the test is not designed for. A specification that limits the applicability of a test is a sign that the designers have clearly thought through the kinds of decisions that the test results may support.

B- General description
1- Reading to search for simple information (facts) relating to employment and social services.
2- Comprehending consequences and reasons.
3- Lower level text processing that supports survival reading in social and work-seeking contexts.
4- Background knowledge should not play a role in responding to test items.
5- Subject/topic knowledge is limited to everyday survival tasks.
6- Cultural knowledge should not play a role in responding to test items.

7- Linguistic knowledge is restricted to basic reading comprehension as follows:
Word recognition/identification: priority is given to words in the most common 2000, although some less common work-specific vocabulary may be included as long as it is not too technical.

8- Cohesion
• Pronominal reference (he, they)
• Substitution (e.g. the same, one)

9- Skills: reading for factual information
• Scan a text to identify a piece of information or a word quickly
• Skim a text to extract the main message

10- Understanding logical sequence clause relations

11- Cause-consequence: y is the consequence of x
e.g. ‘more agricultural jobs are expected in Lincolnshire this year because of the excellent Spring’ (explicit), ‘more agricultural jobs are expected in Lincolnshire after good weather conditions this Spring’ (inexplicit).

Instrument-achievement: by doing x, y occurs
e.g. ‘Register with your dentist today. In this way you will get treatment in reasonable time if you experience toothache’ (explicit) and ‘Register with your dentist today and get treatment in reasonable time if you experience toothache’ (inexplicit).

Annotation 2: The general description sets out what should be tested. The knowledge and skills listed are those that are expected to be directly relevant to the target test takers and the purpose of the test. Notice that it also states what should not be tested – background knowledge, cultural knowledge or linguistic knowledge that a new arrival could not be expected to have mastered, and is not relevant to their early survival needs. This level of explicitness acts as a good guide for identifying anti-items.

C- Prompt attributes
1- All texts should be concrete, not abstract
2- All texts should be factual

D- Text types and genres
• Descriptions = Advertisements, job announcements, notices, signs, directories
• Procedures = Public information leaflets, forms
• Recounts = News items

More complex text types such as exposition, argument and narrative are excluded. Topics that can be varied may include work, leisure, health, travel, accommodation and shopping. Texts should not be more difficult than Flesch-Kincaid 40, and should not exceed 150 words.
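The length and difficulty limits above can be screened automatically when candidate texts are collected. The following is a minimal sketch only, and it rests on assumptions that are not in the specification: it reads "Flesch-Kincaid 40" as a floor of 40 on the Flesch Reading Ease scale (where lower scores mean harder texts), and it counts syllables with a rough vowel-group heuristic. The function names are hypothetical.

```python
import re

MAX_WORDS = 150          # length limit stated in the specification
MIN_READING_EASE = 40.0  # ASSUMPTION: "Flesch-Kincaid 40" read as a Flesch Reading Ease floor


def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, discounting a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / len(sentences)) - 84.6 * (syllables / len(words))


def text_is_admissible(text: str) -> bool:
    """True if the text meets both the word limit and the assumed readability floor."""
    words = re.findall(r"[A-Za-z']+", text)
    return len(words) <= MAX_WORDS and flesch_reading_ease(text) >= MIN_READING_EASE
```

A screen of this kind only flags candidates; whether a text is concrete, factual and of the right genre still needs a human judgment against the prompt attributes.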

Annotation 3: It is important that the text types and their features are carefully specified in a reading test for a group of test takers who are so clearly defined in terms of their language needs. Texts outside the fields and registers listed would not allow sound inferences to the constructs. The length and difficulty are also specified, and a sample text provided. Notice how the text is coded as A2–D1. As more suitable texts are found they can be coded, for example, B1–E3 for a leaflet explaining how to register with a Health Clinic. The coding can then be used to ensure that the Assembly Specification has an adequate number of texts from each category so that the test developer can claim that the test has adequate domain representation.

E- Response attributes
• Response attributes are restricted to the selection of a correct response from four options (multiple choice).
• Measurement component: each item is dichotomously scored and items are summed as a measure of ‘basic reading for work and social purposes’.

Annotation 4: Only multiple-choice items are going to be used in this test. Constructed response items that ask the test taker to produce language are considered too difficult for the intended test-taking population.
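The measurement component under Response attributes (dichotomous scoring, summed across items) can be illustrated with a short sketch. The item identifiers and the option labels A to D are hypothetical, and treating an omitted item as incorrect is an assumption; the specification does not say how omissions are handled.

```python
from typing import Mapping


def score_form(responses: Mapping[str, str], answer_key: Mapping[str, str]) -> int:
    """Sum dichotomous item scores: 1 for the keyed option, 0 otherwise."""
    total = 0
    for item_id, key in answer_key.items():
        chosen = responses.get(item_id)  # ASSUMPTION: an omitted item scores 0
        total += 1 if chosen == key else 0
    return total


# Hypothetical three-item key and one test taker's responses.
answer_key = {"item01": "B", "item02": "D", "item03": "A"}
responses = {"item01": "B", "item02": "C"}  # item03 omitted
raw_score = score_form(responses, answer_key)  # one item correct
```

Because every item contributes either 0 or 1, the raw score on a full form is simply the number of items answered correctly.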

Item specifications

Specification supplement: The stem may or may not contain words taken from the passage, or use synonyms or paraphrase for words taken from the passage. It is expected that stems containing words from the text are likely to be easier, as word recognition will be achieved by matching words in the stem to those in the text, rather than identifying similar meanings.

Specification supplement: The target word should be located within two sentences of the pronominal reference or substitution. Distracters are drawn from noun phrases in the vicinity of the target word, and may be closer to the pronominal reference or substitution.

Annotation 5: Each of the item specifications sets out what is and is not allowed. Codes are provided for the key variables. The example for item Type 4 is coded as A2. This coding is very helpful at this level of delicacy as it allows the assembly specification to say that so many items should test pronominal and substitution reference, and how many of each should be close to the referent (and therefore easier), how many should be distant. When devising coding systems for items it is necessary to remember that the more complex the coding system, the more complex the assembly specification will be. Also, as the number of codes goes up to reflect the complexity of variability in the criterion domain, so does the number of items that you will have to put on the test to adequately reflect the criterion domain!

F- Assembly specification
• Targets: This test is intended to be used to provide English language support to newly arrived workers.
• Constraints:
Form constraints: Each form contains 50 items attached to 10 texts.
Text constraints: 5 description texts, 3 procedure texts and 2 recount texts.
Topic constraints: 5 Employment texts and 5 Social Needs texts, one to be drawn from each subtopic.
Item constraints:

G- Delivery specification
Paper based, with one text and approximately five questions appearing on each page. Each text and questions to appear in Times New Roman 12pt.

Length of test: 75 minutes.

Annotation 6: The assembly specification attempts to ensure that the relevant text types and topics are well represented, and that there is at least one item from each code in the system. Using this assembly specification each form of the test should be parallel in both content and difficulty. Finally, we have the delivery specification that provides layout and length, but does not tell us how the test is to be administered.
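The form, text and topic constraints of the assembly specification lend themselves to a mechanical check when a new form is put together. The sketch below is illustrative only: the `Text` record and the `check_form` function are hypothetical names, the topic labels are lower-cased for convenience, and the subtopic rule and the item constraints are left out because their details are not given here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Text:
    text_type: str  # "description", "procedure" or "recount"
    topic: str      # "employment" or "social needs"
    n_items: int    # number of items attached to this text


# Counts taken from the form, text and topic constraints.
TEXTS_PER_FORM = 10
ITEMS_PER_FORM = 50
REQUIRED_TEXT_TYPES = {"description": 5, "procedure": 3, "recount": 2}
REQUIRED_TOPICS = {"employment": 5, "social needs": 5}


def check_form(texts: list[Text]) -> list[str]:
    """Return a list of constraint violations; an empty list means the form passes."""
    problems = []
    if len(texts) != TEXTS_PER_FORM:
        problems.append(f"expected {TEXTS_PER_FORM} texts, found {len(texts)}")
    total_items = sum(t.n_items for t in texts)
    if total_items != ITEMS_PER_FORM:
        problems.append(f"expected {ITEMS_PER_FORM} items, found {total_items}")
    type_counts = dict(Counter(t.text_type for t in texts))
    if type_counts != REQUIRED_TEXT_TYPES:
        problems.append(f"text types {type_counts} do not match {REQUIRED_TEXT_TYPES}")
    topic_counts = dict(Counter(t.topic for t in texts))
    if topic_counts != REQUIRED_TOPICS:
        problems.append(f"topics {topic_counts} do not match {REQUIRED_TOPICS}")
    return problems
```

Running the same check on every form is one way to support the claim that forms are parallel in content; parallel difficulty would need item statistics that this sketch does not model.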

4. Granularity
• Many teachers do not like detailed test specifications like the example above
• They limit creativity
• They encourage a tendency for teachers to teach nothing apart from what is defined in the specifications

Solution:
• Popham suggested that specifications be ‘boiled down’ to contain just a general description and a sample item

While this would not be desirable in a high-stakes test, it is an excellent strategy for classroom assessment.

5. Performance conditions

• When a test specification attempts to list the [test holding] conditions, these are frequently referred to as performance conditions

• Canadian Language Benchmarks:
  • Interaction is face-to-face, or on the phone.
  • Rate of speech is slow to normal.
  • Context is mostly familiar, or clear and predictable, but also moderately demanding (e.g. real-world environment; limited support from interlocutors).
  • Circumstances range from informal to more formal occasions.
  • Instructions have five to six steps, and are given one at a time, with visual clues.
  • Topics are of immediate everyday relevance.
  • Setting is familiar.
  • Topic is concrete and familiar.
• CEFR does not place specific tasks in its levels

6. Target language use domain analysis

• This approach involves describing the item/task according to features that exist in the target language use situation across a number of categories:

• 1- The facets of the testing environment (such as place, equipment, personnel, and so on)
• 2- The facets of the test rubric (organization, time, instructions)
• 3- The facets of the input (format and language)
• 4- The facets of the expected response (format, language, restrictions on response)
• 5- The relationship between input and response (whether reciprocal, non-reciprocal or adaptive)

Despite following the TLU style, we notice that the introductory material (purpose, constructs), and the format of the sample item and related description, show that Mills has created a hybrid specification that suits her own teaching and testing context.

A service encounter (travel agency) specification (Mills, 2009: 95–101)

• Test purpose
The purpose of this test is to test whether students have learned the content of the course and to provide diagnostic feedback to the teacher so that the course and the final test can be tailored to the level and the needs of the learners.

• Definition of construct
For this achievement test a syllabus-based construct definition is used:
• Situation: ‘the ability to perform a simplified simulated authentic dialogue to book or take a booking for a flight at a travel agency’.
• Skill: Comprehend and write down important facts. Negotiate for meaning when breakdowns in communication occur by asking for clarification and reformulating information.

Sample Item
One student is the travel agent and one is the customer. The customer wants to know the price and availability of seats. Both students need to write down 5 pieces of information.

Rubrics
The customer will receive a role card stating the destination, preferred travel day, and class of ticket. The travel agent will receive one of the following flight information tables.

7. Moving back and forth

• Test designers move back and forth all the time

• A specification that is ready for use might be version 1.0, and numbers on the way to 1.0 show the specification’s current state of evolution.
• In what Davidson and Lynch (2002) call ‘ownership’, the creation of specifications is a dynamic process that is almost always the work of a group, rather than an individual. Specifications therefore reflect the theoretical and practical beliefs and judgments of their creators.
