
7/01/2014


Session 3: Constructing Quality Assessments

Summer Institute: Assessment in Schools January 2014

Session 3: Constructing Quality Assessments

In this session we will

• describe the four key design features for quality assessments:

1. the assessment method and its relationship to the outcomes being measured;

2. the adequacy of sampling of the content of the assessment;

3. the construction of quality items, tasks, exercises and marking schemes;

4. the use of the assessments so that they minimise bias.

• explain the importance of developing a Table of Specifications and describe how it is constructed;

• identify 2 categories of assessment methods from which to choose for any particular assessment – selected response and constructed response;

Session 3: Constructing Quality Assessments

In this session we will

• compare and contrast the strengths and weaknesses of selected-response and constructed-response items;

• develop effective multiple choice items (selected response);

• develop effective matching items (selected response);

• develop effective short answer items (constructed response);

• develop effective essay items (constructed response); and,

• discuss the strengths and weaknesses of constructed-response items.


Effective Classroom Assessment

Step 1: Have a Clear Purpose

Why assess?

• Who will use the results?

• How will they use the results?

Step 2: Have Clear Outcomes

Assess what?

• Are the outcomes clear?

• Are the outcomes appropriate?

Step 3: Have a Sound Design

Assess how?

• What method?

• Are they quality tasks?

• Have the tasks sampled the outcomes well?

• Has bias been avoided?

Step 4: Communicate Effectively

Report results to whom?

• Report results how?

• Meet the needs of the reporting audience?


Design Method 1: Matching Assessment Methods to Outcomes

Assessment methods: selected response (multiple choice, matching); constructed response (essay, short answer); performance assessment; interviews etc.

Knowledge

• Selected response: Generally used for assessing knowledge.

• Constructed response: Can tap understanding of relationships between elements of knowledge.

• Performance assessment: Not usually used.

• Interviews etc.: Can ask questions or interview, but this is time consuming.

Reasoning

• Selected response: Can be used to assess some aspects of reasoning.

• Constructed response: Written descriptions of complex problem solving can be used.

• Performance assessment: Can watch students solve problems and infer reasoning proficiency.

• Interviews etc.: Can ask students to “think aloud” or can ask questions to probe reasoning.

Performance Skills

• Selected response and constructed response: Can assess some aspects of the knowledge prerequisites of the performance but cannot generally tap the skill.

• Performance assessment: Can observe and evaluate skills as they are being performed.

• Interviews etc.: Can be used to assess the oral communication part of the performance and the knowledge part of the performance.

Products

• Selected response and constructed response: Can assess the knowledge part of the ability to create products, but cannot be used to assess the products themselves.

• Performance assessment: Can assess proficiency in carrying out the product development and the attributes of the product itself.

• Interviews etc.: Can probe procedural knowledge and knowledge of attributes of the products, but not the product itself.


Design Method 2: Sample Achievement Appropriately

• Ensure that there are enough exercises of the right kind (items, activities, etc.) to sample student performance so that the conclusions drawn about achievement are well supported.

• Always aim to get the best picture of student achievement from the smallest possible sample of student performance: maximum information for minimum cost.

• Use a Table of Specifications to ensure appropriate sampling of content.

Design Method 2: Sample Achievement Appropriately

Building a Table of Specifications

Building a Table of Specifications requires the following steps:

1. Preparing a list of outcomes – these describe the types of performances the students are expected to demonstrate (e.g. Knows basic terms – “Writes a definition of each term”; “Identifies the term that represents each weather element”; etc.)

2. Outlining the course content – the content describes the area in which each type of performance is to be demonstrated (e.g. “air pressure”; “wind”; “temperature”; etc.)

3. Preparing a chart that relates the outcomes to the content through the number or percentage of items. This gives the relative emphasis in the curriculum.

Design Method 2: Sample Achievement Appropriately

Table of Specifications

                                  Learning Outcomes
Content Area                      Basic Skills   Application   Problem Solving   Total Percentage
Fractions                         5              5             5                 15
Mixed numbers                     5              5             10                20
Decimals                          5              15            10                30
Decimal to Fraction conversions   5              15            15                35
Total Percentage Points           20             40            40                100

(Adapted from Miller et al. (2009))
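As an illustration only, the chart above can be held as a small data structure so that its totals are checked before any items are written. The sketch below is in Python; the names `SPEC`, `row_totals`, `column_totals` and `items_per_cell`, and the 40-item test length, are assumptions made for this example and are not part of the session materials.

```python
# Percentage weights copied from the Table of Specifications above
# (rows: content areas; columns: learning outcomes).
OUTCOMES = ["Basic Skills", "Application", "Problem Solving"]
SPEC = {
    "Fractions": [5, 5, 5],
    "Mixed numbers": [5, 5, 10],
    "Decimals": [5, 15, 10],
    "Decimal to Fraction conversions": [5, 15, 15],
}


def row_totals(spec):
    """Total percentage for each content area."""
    return {area: sum(weights) for area, weights in spec.items()}


def column_totals(spec):
    """Total percentage for each learning outcome, in OUTCOMES order."""
    return [sum(column) for column in zip(*spec.values())]


def items_per_cell(spec, test_length):
    """Convert percentage weights into item counts for a test of a given length.

    Hypothetical helper: it simply rounds each cell, so for some test lengths
    the counts may not add up exactly and would need adjusting by hand.
    """
    return {
        area: [round(weight * test_length / 100) for weight in weights]
        for area, weights in spec.items()
    }


if __name__ == "__main__":
    print(row_totals(SPEC))
    print(dict(zip(OUTCOMES, column_totals(SPEC))))
    print(items_per_cell(SPEC, 40))  # assumed 40-item test
```

A check like this makes it easy to see whether the relative emphasis in the test still matches the relative emphasis in the curriculum after items are added or dropped.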


Design Method 2: Sample Achievement Appropriately

Table of Specifications

Reading Skill                                      Number of Items
Locates details in a passage                       10
Identifies the main idea in a paragraph            10
Identifies the sequence of actions or events       10
Identifies relationships expressed in a passage    20
Total Number of Items                              50

(Adapted from Miller et al. (2009))

Design Method 3: Construct Quality Items, Tasks, Exercises and Marking Schemes

• A sound assessment requires the construction of good quality assessment items, tasks, exercises and marking schemes.

• Each item or task type requires a set of procedural guidelines for construction.

Design Method 3: Construct Quality Items, Tasks, Exercises and Marking Schemes

• All test items require students to make either a selected response or a constructed response.

• Selected-response items require students to choose an answer from a set of two or more options (e.g. multiple choice items and matching items).

• Constructed-response items require students to supply their own responses (e.g. short answer items and essay items).


Parts of a Multiple Choice Item

Lead sentence (directions line): The lengths of some of the largest battleships ever built are shown in the table below.

Stimulus:

Name                 Length (feet)
USS California       624
USS Iowa             887
IJN Musashi          862
USS North Carolina   729

Stem (stem question): What is the range of the lengths of the battleships in the table?

Options (each distractor is shown with its rationale):

(A) 105     Distractor [last length – first length]
(B) 263 *   Key [correct answer]
(C) 775.5   Distractor [mean]
(D) 795.5   Distractor [median]
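The key and the three distractor rationales in this item can be checked with a few lines of arithmetic. The sketch below is in Python and is only an illustration; the variable names are chosen for this example.

```python
from statistics import mean, median

# Lengths in feet, in the order they appear in the stimulus table.
lengths = {
    "USS California": 624,
    "USS Iowa": 887,
    "IJN Musashi": 862,
    "USS North Carolina": 729,
}
values = list(lengths.values())

options = {
    "A": values[-1] - values[0],      # distractor: last length minus first length
    "B": max(values) - min(values),   # key: the range
    "C": mean(values),                # distractor: the mean
    "D": median(values),              # distractor: the median
}

if __name__ == "__main__":
    for label, value in options.items():
        print(label, value)
```

Each distractor corresponds to a specific, plausible error (confusing the range with another statistic, or subtracting in table order), which is what a rationale is meant to record.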


Parts of a Multiple Choice Item (continued)

• Directions Line, Lead Sentence – an introduction which directs a student to use the stimulus to answer the item or provides contextual information about the stimulus.

• Stimulus – information required in order to answer the item. Use a stimulus only if a student needs it to answer the question.

• Stem – a question or statement which poses a clearly defined problem that is aligned to the content standard or curriculum benchmark being measured.

Parts of a Multiple Choice Item (continued)

• Options – the answer choices for students to select when answering the question.

• Key – the correct answer.

• Distractors – the incorrect answers.

• Rationales – justifications that explain why a particular distractor is plausible yet incorrect, or how it reflects a common misconception.

Quick Quiz (True or False or Don’t Know [need more information])

1. One function of the lead sentence is to direct students to use the stimulus to answer the item.

2. Item rationales explain why distractors are plausible.

3. Distractors are incorrect answer options.

4. A stimulus should be included on an item even if it is not necessary to answer the item.

5. Answer options include the key (correct answer) and the distractors.

6. Reading passages, maps, diagrams, and tables are examples of stimuli.

Item Types

                      Answer Type
Stem Type             Correct Answer   Best Answer   Negative Answer
Direct Question       A                B             C
Sentence Completion   D                E             F
Picture/Diagram       G                H             I

Which statement best characterises the man appointed by President Eisenhower to be Chief Justice of the United States Supreme Court?

A. An Associate Justice of the Supreme Court who had once been a Professor of Law at Harvard.

B. A successful governor who had been an unsuccessful candidate for the Republican presidential nomination.

C. A well-known New York attorney who successfully represented the leaders of the Communist Party in the US.

D. A Democratic Senator from a southern state who had supported Eisenhower in his campaign for the presidency.

Direct question and best answer type (Type B)

Requirements of writing good items

Item writers must have

• a thorough knowledge of the subject matter, including knowledge of popular fallacies and misconceptions;

• an understanding of the content standards of the curriculum and of student performance;

• good verbal communication skills;

• technical item writing skills; and,

• imagination and ingenuity.

Sources of item ideas

Sources of item ideas include

• chance ideas and inspirations; 

• the work (verbal, written) of students;

• the items and ideas of other people;

• job analysis (“What is an individual who is proficient in this area expected to be able to do?”); and,

• imagination and ingenuity.

Advantages of multiple choice items

Advantages of multiple choice items include 

• versatility (adaptable to various levels of learning outcome, including simple recall of knowledge, analysis of phenomena, application of principles, interpreting cause and effect relationships, etc.);

• increased validity (more questions therefore greater coverage of the syllabus); 

• increased reliability because of objectivity of marking; and,

• increased efficiency (easily marked).


Disadvantages of multiple choice items

Disadvantages of multiple choice items include 

• limited versatility (not adaptable to measuring certain learning outcomes, including articulating an explanation, displaying thought processes, etc.);

• decreased reliability because of susceptibility to guessing; and,

• the difficulty of construction.


A couple of myths associated with multiple choice items

• Multiple choice items can only be used to measure lower-level outcomes such as those based on knowledge, facts and principles.

• Students tend to blindly guess the answers to multiple choice items (some people believe that the minimum score a person can get on a multiple choice test is 25% for a 4-option test).
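The second myth can be examined with a simple binomial model of blind guessing. The sketch below is in Python and rests on stated assumptions: every item has 4 equally attractive options, every answer is a pure guess, and the test has 20 items (the test length is chosen only for illustration). Under these assumptions 25% is the expected score, not the minimum, and a score well above 25% is unlikely to come from guessing alone.

```python
from math import comb


def p_exact(k, n, options=4):
    """Probability of exactly k correct answers out of n items by blind guessing."""
    p = 1 / options
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def p_at_least(k, n, options=4):
    """Probability of k or more correct answers out of n items by blind guessing."""
    return sum(p_exact(j, n, options) for j in range(k, n + 1))


def expected_score(n, options=4):
    """Average number correct over many blind-guessing attempts."""
    return n / options


if __name__ == "__main__":
    n = 20  # assumed test length
    print("expected number correct:", expected_score(n))
    print("chance of scoring zero:", p_exact(0, n))
    print("chance of 10 or more correct:", p_at_least(10, n))
```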


Hints for Writing Multiple Choice Questions

Align the item with the learning objective

Economics objective: Compare and contrast three forms of business organization (sole proprietorship, partnership, and corporation)

Which group influences policies and sets objectives within a corporation?

a) bondholders

b) shareholders *

c) employee unions

d) government regulators

Is this assessing the learning objective?

Hints for Writing Multiple Choice Questions

Each item should focus on an important concept.

The early development of analysis of variance (ANOVA) was mainly due to work done by Sir Ronald A. Fisher.

What was Fisher’s middle name?

a) Alan

b) Albert

c) Aylmer

d) Arthur

Is this assessing an important learning objective?


Hints for Writing Multiple Choice Questions

The language is simple, clear and unambiguous.

Is this clear and concise language?

The local community theatre group in Kollum is performing a play in the Navdeep Public School auditorium. There are 15 rows of seats.  Each row contains 28 seats. 

What is the largest number of tickets that the community theatre group can sell to fill the auditorium for one performance?

a) 400 tickets

b) 420 tickets *

c) 450 tickets

d) 470 tickets

Hints for Writing Multiple Choice Questions

How has this stem improved?

A theatre has 15 rows and each row contains 28 seats. What is the largest number of tickets that the theatre can sell to fill the auditorium for one performance?

a) 400 tickets

b) 420 tickets *

c) 450 tickets

d) 470 tickets

Better …

Hints for Writing Multiple Choice Questions

The stem and answer options must be grammatically consistent with one another

What is the answer? Has the question really assessed the learning objective?

The type of vessel that carries blood from the heart to the lungs is an

a) artery.

b) capillary.

c) node.

d) vein.

Hints for Writing Multiple Choice Questions

How has the item improved?

The type of vessel that carries blood from the heart to the lungs is 

a) an artery*.

b) a capillary.

c) a node.

d) a vein.

Better …

Hints for Writing Multiple Choice Questions

The stem and distractors must not give clues to the key

What is the answer? Has the question really assessed the learning objective?

Seetha is writing a report on different types of ecosystems. In which source would she most likely find this information?

a) Cooking with Native Plants

b) A Guide to Rock Collecting

c) Ecosystems of Asia *

d) The Big Book of Animal Habits

Hints for Writing Multiple Choice Questions

The item presents a single clearly formulated question in the stem and is written in question or sentence completion format.

Can you cover the options and still answer the item?

From the article, the reader can tell that

a) turtles like to hide under the rocks.

b) dogs like to play in the snow.

c) squirrels like to eat mulberries.

d) cats like to chase mice.

Hints for Writing Multiple Choice Questions

How has the item improved?

From the article, the reader can tell that turtles like to

a) hide under the rocks.

b) play in the snow.

c) eat mulberries.

d) chase mice.

Better …

Hints for Writing Multiple Choice Questions

The stem and answer options are phrased in positive terms.

Have you ever been asked what is not the answer to a question?

None of the following cities are state capitals except

a) Bellary.

b) Hyderabad*.

c) Katni.

d) Pune.


Hints for Writing Multiple Choice Questions

How has the item improved?

Better …

Which one of the following cities is a state capital?

a) Bellary

b) Hyderabad*

c) Katni

d) Pune


Hints for Writing Multiple Choice Questions

There are no repetitions in the options.

Why do you think that repetitions in the options can lower validity?

Milk can be pasteurised at home by

a) heating it to a temperature of 33°C for 30 minutes.

b) heating it to a temperature of 43°C for 30 minutes.

c) heating it to a temperature of 53°C for 30 minutes.

d) heating it to a temperature of 63°C for 30 minutes*.

Hints for Writing Multiple Choice Questions

Better …

The minimum temperature that can be used to pasteurise milk at home is

a) 33°C.

b) 43°C.

c) 53°C.

d) 63°C*.

How has the item improved?

Hints for Writing Multiple Choice Questions

Ensure that there is only one correct or clearly best answer.

What is the correct answer to this item?

Which one is the odd one out?

a) billiards

b) cricket

c) hockey

d) football


Hints for Writing Multiple Choice Questions

Ensure that there are neither repetitions nor opposites in the options.

Which would be the most likely effect of this change in fiscal policy?

A. The inflation rate would decline.

B. The unemployment rate would rise.

C. Consumer spending would increase. *

D. Consumer spending would decrease.

What is the correct answer to this item?

Hints for Writing Multiple Choice Questions

Answer options are plausible and similar in context, ideas, focus, phrasing and length

What is the correct answer to this item? Why has the item writer made d) so long?

Epistemology is the branch of philosophy dealing with

a) the nature of science.

b) morality.

c) beauty.

d) the nature and origin of knowledge – that is, the manner in which human beings sense and process external stimuli in the form of knowledge.


Hints for Writing Multiple Choice Questions

Specific determiners (always, all, never, only, none) must be used cautiously.

What is the correct answer to this item? How could the weak students get this item correct without really knowing the answer?

Achievement tests help students to improve their learning by

a) encouraging them all to study hard.

b) informing them of their progress.

c) giving them all a feeling of success.

d) preventing any of them from neglecting their work.


Hints for Writing Multiple Choice Questions

Use “All of the above” and “None of these” sparingly

What is the correct answer to this item? What is wrong with this item?

Which of the following levels are included in Bloom’s Taxonomy?

a) Comprehension

b) Application

c) Analysis

d) All of the above


Hints for Writing Multiple Choice Questions

What is the problem with using “None of the above” as the answer?

Which one of the following is a level in Bloom’s Taxonomy for the cognitive domain?

a) Critical Thinking

b) Scientific Thinking

c) Reasoning

d) None of the above

Another Example


Hints for Writing Multiple Choice Questions

Ensure that answer options do not overlap with each other

What is the correct answer to this item? What is wrong with this item?

If the scores on a test have a reliability of 0.75, what percentage of an observed score is attributable to errors of measurement?

a) over 5%

b) over 10%

c) over 20%

d) over 30%


Hints for Writing Multiple Choice Questions

How has the item improved?

Better …

If the scores on a test have a reliability of 0.75, what percentage of an observed score is attributable to errors of measurement?

a) 2.5%

b) 5%

c) 25%

d) 50%
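The slide does not mark a key for this item. A minimal sketch in Python, assuming the classical test theory reading of reliability (the ratio of true-score variance to observed-score variance), suggests why the "over X%" version of the item is flawed: several of its options are true at once, whereas only one option in the improved version matches. The function name `error_share` is made up for this example.

```python
def error_share(reliability):
    """Share of observed-score variance attributed to measurement error.

    Assumption: reliability is true-score variance divided by
    observed-score variance, so the error share is 1 - reliability.
    """
    return 1 - reliability


share_pct = error_share(0.75) * 100

# Options from the flawed version of the item, each read as "over X%".
overlapping = {"a": 5, "b": 10, "c": 20, "d": 30}
true_overlapping = [label for label, bound in overlapping.items() if share_pct > bound]

# Options from the improved version, each a single value.
improved = {"a": 2.5, "b": 5, "c": 25, "d": 50}
true_improved = [label for label, value in improved.items() if value == share_pct]

if __name__ == "__main__":
    print("error share (%):", share_pct)
    print("true options, flawed item:", true_overlapping)
    print("true options, improved item:", true_improved)
```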


Writing multiple choice items from an editorial point of view

The items must be

• free from spelling, punctuation, grammatical and other editorial faults;

• presented with appropriate text fonts (size, type), highlighting (bold, underlining, italics) and layout (paragraphing and positioning);

• arranged in such a way that the students do not have to turn pages to link sources to questions (stems to options); and,

• arranged in the paper in the order from easiest to hardest.


Quick Quiz: What is the problem with item 1?

One result of having a theatre in a community is more:

a) jobs

b) teachers

c) stores

d) crime

Quick Quiz: What is the problem with item 2?

In Charles Dickens’ novel A Christmas Carol, which characteristic describes Ebenezer Scrooge?

a) miserly

b) nervous

c) inquisitive

d) all of the above

Quick Quiz: What is the problem with item 3?

The table below shows snowfall totals for Pokhara in February.

Based on the table, what percent of days in February had snowfall of 5 cm or more?

a) less than 20% *

b) less than 30%

c) more than 65%

d) more than 75%

Quick Quiz: What is the problem with item 4?

Why should candy be eaten sparingly between meals?

a) Candy depletes energy.

b) Candy causes diabetes.

c) Candy causes headaches.

d) Candy dulls the appetite for other foods essential for proper nutrition.*

Quick Quiz (True or False or Don’t Know [need more information])

A GOOD ITEM

1. measures a specific learning objective.

2. contains subject matter and vocabulary that is above the student’s grade-level.

3. has only one correct answer or clearly best answer.

4. assesses trivial or obscure subject matter.

5. is free from grammatical clueing.

6. is free of negative wording such as “not” or “none of the above.”

Quick Quiz (True or False or Don’t Know [need more information])

A GOOD ITEM

7. assesses more than one concept.

8. contains options that are opposite of one another.

9. contains distractors that assess common errors or misconceptions.

Quick Quiz (True or False or Don’t Know [need more information])

1. Distractors should be parallel in content, structure, and length.

2. Cognition refers to the difficulty level of an item.

3. Items should be written so that the content in the item is accessible to the widest range of students.

4. Parallelism refers to when students from different ethnic, sex, or cultural groups perform differently on an item.

5. Controversial items are often assessed with the multiple-choice format.

Quick Quiz (True or False or Don’t Know [need more information])

6. Item fairness means that the item assesses all students at the appropriate age and enrolled grade-level.

7. An example of bias is presenting a culturally stereotypical situation in the item.

Match Column A with Column B to make each statement true. The first one has been done for you.

Column A      Column B
A week        is a period of 100 years
A decade      has 10 days
September     is a period of 10 years
A century     has 7 days
              is the ninth month of the year

Design Method 3: Matching Questions

• Variation on multiple choice type items.

• Suited to testing knowledge of terms, definitions and events.

Design Method 3: Hints for Writing Matching Questions

• Clearly explain the basis on which the match is to be made.

• Make sure the items all cover the same kind of content.

• Make the lists of items and responses short.

• Have an unequal number of items and responses to reduce guessing.

• Have all group names, numbers etc. in one list.

• Arrange the lists in a systematic fashion.
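The hint about unequal list lengths can be given a rough numerical sense. The sketch below is in Python and is only an illustration under stated assumptions: the student guesses blindly, and each response may be used at most once. The function name is made up for this example. An extra response also means the last match cannot be found simply by elimination.

```python
from math import perm


def p_all_correct_by_guessing(n_items, n_responses):
    """Chance of matching every item correctly by blind guessing.

    Assumes each response is used at most once, so there are
    perm(n_responses, n_items) equally likely ways to fill in the matches.
    """
    return 1 / perm(n_responses, n_items)


if __name__ == "__main__":
    print("4 items, 4 responses:", p_all_correct_by_guessing(4, 4))
    print("4 items, 5 responses:", p_all_correct_by_guessing(4, 5))
```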


Design Method 3: Constructed-Response Items

Hints for Writing Short-Response Questions

• Relatively easy to construct.

• Not truly objective in terms of marking.

• Can be limited in measuring higher order outcomes.

Design Method 3: Constructed-Response Items

Hints for Writing Short-Response Questions

• Make sure the question or statement is clear and precise.
  – (Example: The Chernobyl nuclear disaster occurred in ___?)

• Ensure that the blanks in completion items are at the end of the statement.

• Make sure there are no irrelevant clues.

• Provide appropriate space for the answer.

• Keep the structure simple with minimal parts, and sequence the parts in order of difficulty.

• Prepare a marking key.

Design Method 3: Constructed-Response Items

Hints for Essay Questions

Designing and developing essays involves three steps:

1. Planning – begin with clearly articulated outcomes; key components of knowledge, reasoning or writing

2. Task Development – specify what it is students have to know; identify what it is students are to write about; and, scaffold the response without letting the students know how to succeed.

3. Marking Guide Development – decide whether the task will require an analytical or holistic rubric.


Design Method 3: Constructed-Response Items

Hints for Essay Questions

• Write the question in simple, clear and direct language.

• Present the question in such a way that the student’s task is clearly defined.

• Consider carefully the amount of time the students need to answer the essay item.

• Use essay items to assess higher-order cognitive skills.

• Prepare a marking guide in advance.

• Make prior decisions regarding such factors as spelling, grammar, etc.

• Evaluate essay responses anonymously.

• Score essay responses via analytic or holistic rubrics.

HOLISTIC RATING SCALE – Poetry Task

Exercise: The poet uses a great deal of imagery (mental pictures) in the poem. For example, he states that there are a great number of golden daffodils fluttering and dancing, and he compares them to the stars in the Milky Way. How do these mental pictures add to your understanding or enjoyment of the poem? Write a paragraph explaining your answer.

Scoring Rubric

Note: At Levels 3 and 4, errors in mechanics and grammar (e.g. fragments, misspellings, flawed punctuation and incorrect capitalisation) should not impede understanding.

ANALYTICAL RATING SCALE Marking Guideline

Assign points to student responses that most closely match the characteristics listed.

4. - provides a direct, accurate response to the function of imagery
   - shows good organisation and a clear focus
   - includes few, if any, errors in grammar, usage, or mechanics

3. - exhibits a fairly clear and logical response to the question
   - contains minor organisational flaws or a somewhat unclear focus
   - may include some errors in grammar, usage, or mechanics

2. - attempts to address the question
   - includes confusing organisation and a focus which is unclear
   - contains problems in mechanics that interfere with communication

1. - barely attempts to address the question
   - includes little organisation and is not focused
   - complicates message with serious problems in language and mechanics

0. - indicates that the student has failed to attempt the question

N/S - indicates that the response is illegible or unreadable

Analytic Marking Rubric

Problem Area        Identified (1 Mark)   Described (0, 1 or 2 Marks)   Solution (0, 1 or 2 Marks)   Total
Local Control       1                     2                             1                            4
Federal Support     1                     1                             2                            4
Legal Constraints   1                     2                             2                            5
The Media           1                     0                             0                            1
TOTAL               4                     5                             5                            14
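As an illustration, the rubric above can be recorded so that each mark is checked against the maximum for its criterion and the totals are computed rather than added by hand. The sketch below is in Python; the names `MAX_MARKS`, `MARKS`, `check_marks`, `area_totals` and `criterion_totals` are assumptions made for this example, and the maxima are read from the column headings.

```python
# Maximum marks available for each criterion, read from the column headings.
MAX_MARKS = {"identified": 1, "described": 2, "solution": 2}

# Marks awarded in the example rubric above.
MARKS = {
    "Local Control": {"identified": 1, "described": 2, "solution": 1},
    "Federal Support": {"identified": 1, "described": 1, "solution": 2},
    "Legal Constraints": {"identified": 1, "described": 2, "solution": 2},
    "The Media": {"identified": 1, "described": 0, "solution": 0},
}


def check_marks(marks, max_marks):
    """Raise ValueError if any mark is outside 0..maximum for its criterion."""
    for area, awarded in marks.items():
        for criterion, mark in awarded.items():
            if not 0 <= mark <= max_marks[criterion]:
                raise ValueError(f"{area}/{criterion}: {mark} is out of range")


def area_totals(marks):
    """Total marks for each problem area (the right-hand column)."""
    return {area: sum(awarded.values()) for area, awarded in marks.items()}


def criterion_totals(marks):
    """Total marks for each criterion (the bottom row)."""
    return {c: sum(awarded[c] for awarded in marks.values()) for c in MAX_MARKS}


check_marks(MARKS, MAX_MARKS)
grand_total = sum(area_totals(MARKS).values())

if __name__ == "__main__":
    print(area_totals(MARKS))
    print(criterion_totals(MARKS))
    print("grand total:", grand_total)
```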


Design Method 4: Minimise Bias

• Potential sources of error include the scoring process, the student’s emotional state, the test administration environment and the item context.

• These sources must be understood and taken into account in the construction and administration of the tests or tasks.

Some Reflective Questions

1. What would be the major points of a presentation to staff advocating the use of selected-response items over constructed-response items?

2. Based on your own experience, how skilled do you think most classroom teachers are in constructing selected-response items?

3. What is your opinion regarding the relative virtues of holistic versus analytic scoring of essays? Would your opinion change if you were designing a scoring rubric for a history item?

4. Here is an essay by a 5th Grade student in response to the question: Explain why smoking is dangerous to your health. The teacher expected students to mention the effects of smoking on lungs, teeth, heart and stamina. What score would you give the essay out of 5?

Smoking is a very nasty habit and it’s against the law to smoke in many buildings today. Smoking makes people cough and it turns your teeth yellow and they get decaid and they fall out. Smoking causes some kinds of cancer to.

Some Recommended Readings

1. Downing, S.M. (2006) Selected-response item formats in test development. In S.M. Downing and T.M. Haladyna (Eds.) Handbook of test development. Mahwah, NJ: Lawrence Erlbaum Associates.

2. Miller, M.D., Linn, R.L. and Gronlund, N.E. (2009) Measurement and Assessment in Teaching. Upper Saddle River, NJ: Pearson.

3. Reynolds, C.R., Livingston, R.B. and Willson, V. (2009) Measurement and Assessment in Education. Upper Saddle River, NJ: Pearson.

4. Popham, W.J. (2006) Assessment for Educational Leaders. Boston, MA: Pearson.

5. Welch, C. (2006) Item and prompt development in performance testing. In S.M. Downing and T.M. Haladyna (Eds.) Handbook of test development. Mahwah, NJ: Lawrence Erlbaum Associates.

Some Recommended Internet Sites

1. www.ncme.org

National Council on Measurement in Education (NCME).

2. www.nwrel.org

Northwest Regional Educational Laboratory (NWREL)

3. www.wested.org

WestEd