7/01/2014
Session 3: Constructing Quality Assessments
Summer Institute: Assessment in Schools January 2014
Session 3: Constructing Quality Assessments
In this session we will
• describe the four key design features for quality assessments:
1. the assessment method and its relationship to the outcomes being measured;
2. the adequacy of sampling of the content of the assessment;
3. the construction of quality items, tasks, exercises and marking schemes;
4. the use of the assessments so that they minimise bias.
• explain the importance of developing a Table of Specifications and describe how it is constructed;
• identify two categories of assessment methods from which to choose for any particular assessment: selected response and constructed response;
• compare and contrast the strengths and weaknesses of selected-response and constructed-response items;
• develop effective multiple choice items (selected response);
• develop effective matching items (selected response);
• develop effective short answer items (constructed response);
• develop effective essay items (constructed response); and,
• discuss the strengths and weaknesses of constructed-response items.
Effective Classroom Assessment
Step 1: Have a Clear Purpose
• Why assess?
• Who will use the results?
• How will they use the results?
Step 2: Have Clear Outcomes
• Assess what?
• Are the outcomes clear?
• Are the outcomes appropriate?
Step 3: Have a Sound Design
• Assess how?
• What method?
• Are they quality tasks?
• Have the tasks sampled the outcomes well?
• Has bias been avoided?
Step 4: Communicate Effectively
• Report results to whom?
• Report results how?
• Meet the needs of the reporting audience?
Design Method 1: Matching Assessment Methods to Outcomes
Knowledge
• Selected response (multiple choice, matching): Generally used for assessing knowledge.
• Constructed response (essay, short answer): Can tap understanding of relationships between elements of knowledge.
• Performance assessment: Not usually used.
• Interviews etc.: Can ask questions or interview, but this is time consuming.
Reasoning
• Selected response (multiple choice, matching): Can be used to assess some aspects of reasoning.
• Constructed response (essay, short answer): Written descriptions of complex problem solving can be used.
• Performance assessment: Can watch students solve problems and infer reasoning proficiency.
• Interviews etc.: Can ask students to “think aloud” or can ask questions to probe reasoning.
Performance Skills
• Selected response and constructed response: Can assess some aspects of the knowledge prerequisites of the performance but cannot generally tap the skill.
• Performance assessment: Can observe and evaluate skills as they are being performed.
• Interviews etc.: Can be used to assess the oral communication part of the performance and the knowledge part of the performance.
Products
• Selected response and constructed response: Can assess the knowledge part of the ability to create products, but cannot be used to assess the products themselves.
• Performance assessment: Can assess proficiency in carrying out the product development and the attributes of the product itself.
• Interviews etc.: Can probe procedural knowledge and knowledge of attributes of the products, but not the product itself.
Design Method 2: Sample Achievement Appropriately
• Ensure that there are enough exercises of the right kind (items, activities, etc.) to sample student performance, so that the conclusions drawn about achievement are well supported.
• Always aim to get the best picture of student achievement with the smallest possible sample of student performance: maximum information for minimum cost.
• Use a Table of Specifications to ensure appropriate sampling of content.
Design Method 2: Sample Achievement Appropriately
Building a Table of Specifications
Building a Table of Specifications requires the following steps:
1. Preparing a list of outcomes – these describe the types of performances the students are expected to demonstrate (e.g. Knows basic terms – “Writes a definition of each term”; “Identifies the term that represents each weather element”; etc.)
2. Outlining the course content – the content describes the area in which each type of performance is to be demonstrated (e.g. “air pressure”; “wind”; “temperature”; etc.)
3. Preparing a chart that relates the outcomes to the content through the number or percentage of items. This gives the relative emphasis in the curriculum.
Design Method 2: Sample Achievement Appropriately
Table of Specifications (percentage points by learning outcome)

Content Area                       Basic Skills   Application   Problem Solving   Total Percentage
Fractions                                5              5               5                15
Mixed numbers                            5              5              10                20
Decimals                                 5             15              10                30
Decimal to fraction conversions          5             15              15                35
Total Percentage Points                 20             40              40               100

(Adapted from Miller et al. (2009))
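The three steps for building a Table of Specifications can be sketched in a few lines of Python. This is only an illustration: the percentages are copied from the table above, while the function names and the 40-item test length in the usage note are assumptions made for the example, not part of the session material.

```python
# Percentage points per cell, copied from the Table of Specifications above
# (adapted from Miller et al., 2009). Rows are content areas, columns are outcomes.
SPEC = {
    "Fractions": {"Basic Skills": 5, "Application": 5, "Problem Solving": 5},
    "Mixed numbers": {"Basic Skills": 5, "Application": 5, "Problem Solving": 10},
    "Decimals": {"Basic Skills": 5, "Application": 15, "Problem Solving": 10},
    "Decimal to fraction conversions": {
        "Basic Skills": 5, "Application": 15, "Problem Solving": 15,
    },
}


def row_totals(spec):
    """Total percentage for each content area (hypothetical helper)."""
    return {area: sum(cells.values()) for area, cells in spec.items()}


def column_totals(spec):
    """Total percentage for each learning outcome (hypothetical helper)."""
    totals = {}
    for cells in spec.values():
        for outcome, pct in cells.items():
            totals[outcome] = totals.get(outcome, 0) + pct
    return totals


def items_per_cell(spec, test_length):
    """Convert percentages to item counts for a test of `test_length` items.

    Counts may be fractional; rounding them is left to the test writer.
    """
    return {
        area: {outcome: pct * test_length / 100 for outcome, pct in cells.items()}
        for area, cells in spec.items()
    }


if __name__ == "__main__":
    print(row_totals(SPEC))
    print(column_totals(SPEC))
    print(items_per_cell(SPEC, 40)["Decimals"])
```

For an assumed 40-item test, the Decimals row would translate into 2 Basic Skills items, 6 Application items and 4 Problem Solving items. Checking that the row and column totals both reach 100 is a quick way to catch arithmetic slips when drafting a table by hand.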
Design Method 2: Sample Achievement Appropriately
Table of Specifications

Reading Skill                                        Number of Items
Locates details in a passage                               10
Identifies the main idea in a paragraph                    10
Identifies the sequence of actions or events               10
Identifies relationships expressed in a passage            20
Total Number of Items                                      50

(Adapted from Miller et al. (2009))
Design Method 3: Construct Quality Items, Tasks, Exercises and Marking Schemes
• A sound assessment requires the construction of good quality assessment items, tasks, exercises and marking schemes.
• Each item or task type requires a set of procedural guidelines for construction.
Design Method 3: Construct Quality Items, Tasks, Exercises and Marking Schemes
• All test items require students to make either a selected response or a constructed response.
• Selected-response items require students to choose an answer from a set of two or more options (e.g. multiple choice items and matching items).
• Constructed-response items require students to supply their own responses (e.g. short answer items and essay items).
Parts of a Multiple Choice Item
Lead sentence (directions line): The lengths of some of the largest battleships ever built are shown in the table below.
Stimulus:

Name                   Length (feet)
USS California              624
USS Iowa                    887
IJN Musashi                 862
USS North Carolina          729

Stem (stem question): What is the range of the lengths of the battleships in the table?
Options (rationales in square brackets):
(A) 105 [last length – first length] (distractor)
(B) 263 * [correct answer] (key)
(C) 775.5 [mean] (distractor)
(D) 795.5 [median] (distractor)
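The rationales in the battleship item can be checked directly: each distractor is the result of a specific, plausible mistake. The short sketch below recomputes all four options from the stimulus table. The dictionary and function names are illustrative assumptions; only the lengths and the rationales come from the slide.

```python
from statistics import mean, median

# Stimulus data from the slide, in the order shown in the table (feet).
LENGTHS_FT = {
    "USS California": 624,
    "USS Iowa": 887,
    "IJN Musashi": 862,
    "USS North Carolina": 729,
}


def option_values(lengths):
    """Recompute the key and each distractor from its stated rationale."""
    values = list(lengths.values())
    return {
        "key: range (largest - smallest)": max(values) - min(values),
        "distractor: last length - first length": values[-1] - values[0],
        "distractor: mean": mean(values),
        "distractor: median": median(values),
    }


if __name__ == "__main__":
    for rationale, value in option_values(LENGTHS_FT).items():
        print(f"{rationale}: {value}")
```

Writing the rationale down before the option, as here, is one way to make sure every distractor corresponds to a real misconception and not to an arbitrary number.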
Parts of a Multiple Choice Item (continued)
• Directions Line, Lead Sentence – an introduction which directs a student to use the stimulus to answer the item or provides contextual information about the stimulus.
• Stimulus – information required in order to answer the item. Only use stimulus if a student needs the stimulus to answer the question.
• Stem – a question or statement which poses a clearly defined problem that is aligned to the content standard or curriculum benchmark being measured.
Parts of a Multiple Choice Item (continued)
• Options – the answer choices for students to select when answering the question.
• Key – the correct answer.
• Distractors – the incorrect answers.
• Rationales – justifications that explain why a certain distractor is plausible, yet incorrect, or demonstrates a common misconception.
Quick Quiz (True or False or Don’t Know [need more information])
1. One function of the lead sentence is to direct students to use the stimulus to answer the item.
2. Item rationales explain why distractors are plausible.
3. Distractors are incorrect answer options.
4. A stimulus should be included on an item even if it is not necessary to answer the item.
5. Answer options include the key (correct answer) and the distractors.
6. Reading passages, maps, diagrams, and tables are examples of stimuli.
Item Types

                          Answer Type
Stem Type                 Correct Answer   Best Answer   Negative Answer
Direct Question                 A               B               C
Sentence Completion             D               E               F
Picture/Diagram                 G               H               I
Which statement best characterises the man appointed by President Eisenhower to be Chief Justice of the United States Supreme Court?
A. An Associate Justice of the Supreme Court who had once been a Professor of Law at Harvard.
B. A successful governor who had been an unsuccessful candidate for the Republican presidential nomination.
C. A well-known New York attorney who successfully represented the leaders of the Communist Party in the US.
D. A Democratic Senator from a southern state who had supported Eisenhower in his campaign for the presidency.
Direct question and best answer type (Type B)
Requirements of writing good items
Item writers must have
• a thorough knowledge of the subject matter ‐ including knowledge of popular fallacies and misconceptions;
• an understanding of the content standards of the curriculum and of the performance expected of students;
• good verbal communication skills;
• technical item writing skills; and,
• imagination and ingenuity.
Sources of item ideas
Sources of item ideas include
• chance ideas and inspirations;
• the work (verbal, written) of students;
• the items and ideas of other people;
• job analysis (“What is an individual who is proficient in this area expected to be able to do?”); and,
• imagination and ingenuity.
Advantages of multiple choice items
Advantages of multiple choice items include
• versatility (adaptable to various levels of learning outcome, including simple recall of knowledge, analysis of phenomena, application of principles, interpreting cause and effect relationships, etc.);
• increased validity (more questions therefore greater coverage of the syllabus);
• increased reliability because of objectivity of marking; and,
• increased efficiency (easily marked).
Disadvantages of multiple choice items
Disadvantages of multiple choice items include
• limited versatility (not adaptable to measuring certain learning outcomes, including articulating an explanation, displaying thought processes, etc.);
• decreased reliability because of susceptibility to guessing; and,
• the difficulty of construction.
A couple of myths associated with multiple choice items
• Multiple choice items can only be used to measure lower‐level outcomes such as those based on knowledge, facts and principles.
• Students tend to blindly guess the answers to multiple choice items (some people believe that the minimum score a person can get on a multiple choice test is 25% for a 4‐option test).
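The parenthetical in the second myth can be tested with a little arithmetic. On a 4-option test, 25% is the expected score from blind guessing, not a floor: a guesser can score anywhere from 0% to 100%. The sketch below is a minimal illustration that assumes every item has the same number of options and that guesses are independent; the function names and the test lengths used are assumptions made for the example.

```python
from math import comb


def guess_score_distribution(n_items, n_options):
    """Probability of each possible raw score (0..n_items) under blind guessing.

    Assumes independent guesses with a 1/n_options chance of success per item,
    i.e. a binomial distribution.
    """
    p = 1 / n_options
    return [
        comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
        for k in range(n_items + 1)
    ]


def expected_score(n_items, n_options):
    """Mean raw score under blind guessing."""
    dist = guess_score_distribution(n_items, n_options)
    return sum(k * prob for k, prob in enumerate(dist))


def prob_at_least(n_items, n_options, min_correct):
    """Chance of getting at least `min_correct` items right by guessing alone."""
    return sum(guess_score_distribution(n_items, n_options)[min_correct:])


if __name__ == "__main__":
    # Hypothetical 40-item, 4-option test.
    print("expected raw score:", expected_score(40, 4))
    print("P(score below 25%):", 1 - prob_at_least(40, 4, 10))
    print("P(score of 50% or more):", prob_at_least(40, 4, 20))
```

Under these assumptions, scores below 25% are common for a pure guesser, while a score of half marks or better on a 40-item test is very unlikely to come from guessing alone.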
Hints for Writing Multiple Choice Questions
Align the item with the learning objective
Economics objective: Compare and contrast three forms of business organization (sole proprietorship, partnership, and corporation)
Which group influences policies and sets objectives within a corporation?
a) bondholders
b) shareholders *
c) employee unions
d) government regulators
Is this assessing the learning objective?
Hints for Writing Multiple Choice Questions
Each item should focus on an important concept.
The early development of analysis of variance (ANOVA) was mainly due to work done by Sir Ronald A. Fisher.
What was Fisher’s middle name?
a) Alan
b) Albert
c) Aylmer
d) Arthur
Is this assessing an important learning objective?
Hints for Writing Multiple Choice Questions
The language is simple, clear and unambiguous.
Is this clear and concise language?
The local community theatre group in Kollum is performing a play in the Navdeep Public School auditorium. There are 15 rows of seats. Each row contains 28 seats.
What is the largest number of tickets that the community theatre group can sell to fill the auditorium for one performance?
a) 400 tickets
b) 420 tickets*
c) 450 tickets
d) 470 tickets
Hints for Writing Multiple Choice Questions
How has this stem improved?
A theatre has 15 rows and each row contains 28 seats. What is the largest number of tickets that the theatre can sell to fill the auditorium for one performance?
a) 400 tickets
b) 420 tickets*
c) 450 tickets
d) 470 tickets
Better …
Hints for Writing Multiple Choice Questions
The stem and answer options must be grammatically consistent with one another
What is the answer? Has the question really assessed the learning objective?
The type of vessel that carries blood from the heart to the lungs is an
a) artery.
b) capillary.
c) node.
d) vein.
Hints for Writing Multiple Choice Questions
How has the item improved?
The type of vessel that carries blood from the heart to the lungs is
a) an artery*.
b) a capillary.
c) a node.
d) a vein.
Better …
Hints for Writing Multiple Choice Questions
The stem and distractors must not give clues to the key
What is the answer? Has the question really assessed the learning objective?
Seetha is writing a report on different types of ecosystems. In which source would she most likely find this information?
a) Cooking with Native Plants
b) A Guide to Rock Collecting
c) Ecosystems of Asia*
d) The Big Book of Animal Habits
Hints for Writing Multiple Choice Questions
The item presents a single clearly formulated question in the stem and is written in question or sentence completion format.
Can you cover the options and still answer the item?
From the article, the reader can tell that
a) turtles like to hide under the rocks.
b) dogs like to play in the snow.
c) squirrels like to eat mulberries.
d) cats like to chase mice.
Hints for Writing Multiple Choice Questions
How has the item improved?
From the article, the reader can tell that turtles like to
a) hide under the rocks.
b) play in the snow.
c) eat mulberries.
d) chase mice.
Better …
Hints for Writing Multiple Choice Questions
The stem and answer options are phrased in positive terms.
Have you ever been asked what is not the answer to a question?
None of the following cities are state capitals except
a) Bellary.
b) Hyderabad*.
c) Katni.
d) Pune.
Hints for Writing Multiple Choice Questions
How has the item improved?
Better …
Which one of the following cities is a state capital?
a) Bellary
b) Hyderabad*
c) Katni
d) Pune
Hints for Writing Multiple Choice Questions
There are no repetitions in the options.
Why do you think that repetitions in the options can lower validity?
Milk can be pasteurised at home by
a) heating it to a temperature of 33°C for 30 minutes.
b) heating it to a temperature of 43°C for 30 minutes.
c) heating it to a temperature of 53°C for 30 minutes.
d) heating it to a temperature of 63°C for 30 minutes*.
Hints for Writing Multiple Choice Questions
Better …
The minimum temperature that can be used to pasteurise milk at home is
a) 33°C.
b) 43°C.
c) 53°C.
d) 63°C*.
How has the item improved?
Hints for Writing Multiple Choice Questions
Ensure that there is only one correct or clearly best answer.
What is the correct answer to this item?
Which one is the odd one out?
a) billiards
b) cricket
c) hockey
d) football
Hints for Writing Multiple Choice Questions
Ensure that there are neither repetitions nor opposites in the options.
Which would be the most likely effect of this change in fiscal policy?
A. The inflation rate would decline.
B. The unemployment rate would rise.
C. Consumer spending would increase. *
D. Consumer spending would decrease.
What is the correct answer to this item?
Hints for Writing Multiple Choice Questions
Answer options are plausible and similar in context, ideas, focus, phrasing and length
What is the correct answer to this item? Why has the item writer made d) so long?
Epistemology is the branch of philosophy dealing with
a) the nature of science.
b) morality.
c) beauty.
d) the nature and origin of knowledge – that is, the manner in which human beings sense and process external stimuli in the form of knowledge.
Hints for Writing Multiple Choice Questions
Specific determiners (always, all, never, only, none) must be used cautiously.
What is the correct answer to this item? How could the weak students get this item correct without really knowing the answer?
Achievement tests help students to improve their learning by
a) encouraging them all to study hard.
b) informing them of their progress.
c) giving them all a feeling of success.
d) preventing any of them from neglecting their work.
Hints for Writing Multiple Choice Questions
Use “All of the above” and “None of these” sparingly
What is the correct answer to this item? What is wrong with this item?
Which of the following levels are included in Bloom’s Taxonomy?
a) Comprehension
b) Application
c) Analysis
d) All of the above
Hints for Writing Multiple Choice Questions
What is the problem with using “None of the above” as the answer?
Which one of the following is a level in Bloom’s Taxonomy for the cognitive domain?
a) Critical Thinking
b) Scientific Thinking
c) Reasoning
d) None of the above
Another Example
Hints for Writing Multiple Choice Questions
Ensure that answer options do not overlap with each other
What is the correct answer to this item? What is wrong with this item?
If the scores on a test have a reliability of 0.75, what percentage of an observed score is attributable to errors of measurement?
a) over 5%
b) over 10%
c) over 20%
d) over 30%
Hints for Writing Multiple Choice Questions
How has the item improved?
Better …
If the scores on a test have a reliability of 0.75, what percentage of an observed score is attributable to errors of measurement?
a) 2.5%
b) 5%
c) 25%
d) 50%
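The arithmetic behind this item can be sketched as follows. Under classical test theory, reliability is the proportion of observed-score variance that is true-score variance, so the remainder is the share attributable to measurement error. The function names below are illustrative, and the standard deviation of 10 in the usage lines is an assumed value, not something taken from the slide.

```python
def error_variance_share(reliability):
    """Proportion of observed-score variance due to measurement error."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return 1.0 - reliability


def standard_error_of_measurement(score_sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return score_sd * error_variance_share(reliability) ** 0.5


if __name__ == "__main__":
    share = error_variance_share(0.75)
    print(f"error share: {share:.0%}")
    # Assumed score standard deviation of 10 marks, for illustration only.
    print("SEM:", standard_error_of_measurement(10, 0.75))
```

With a reliability of 0.75 the error share is 25%, which matches option c) in the improved item, and the options no longer overlap, so only one of them can be defended.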
Writing multiple choice items from an editorial point of view
The items must be
• free from spelling, punctuation, grammatical and other editorial faults;
• presented with appropriate text fonts (size, type), highlighting (bold, underlining, italics) and layout (paragraphing and positioning);
• arranged in such a way that the students do not have to turn pages to link sources to questions (stems to options); and,
• arranged in the paper in order from easiest to hardest.
Quick Quiz: What is the problem with item 1?
One result of having a theatre in a community is more:
a) jobs
b) teachers
c) stores
d) crime
Quick Quiz: What is the problem with item 2?
In Charles Dickens’ novel A Christmas Carol, which characteristic describes Ebenezer Scrooge?
a) miserly
b) nervous
c) inquisitive
d) all of the above
Quick Quiz: What is the problem with item 3?
The table below shows snowfall totals for Pokhara in February.
Based on the table, what percent of days in February had snowfall of 5 cm or more?
a) less than 20% *
b) less than 30%
c) more than 65%
d) more than 75%
Quick Quiz: What is the problem with item 4?
Why should candy be eaten sparingly between meals?
a) Candy depletes energy.
b) Candy causes diabetes.
c) Candy causes headaches.
d) Candy dulls the appetite for other foods essential for proper nutrition.*
Quick Quiz (True or False or Don’t Know [need more information])
A GOOD ITEM
1. measures a specific learning objective.
2. contains subject matter and vocabulary that is above the student’s grade-level.
3. has only one correct answer or clearly best answer.
4. assesses trivial or obscure subject matter.
5. is free from grammatical clueing.
6. is free of negative wording such as “not” or “none of the above.”
Quick Quiz (True or False or Don’t Know [need more information])
A GOOD ITEM
7. assesses more than one concept.
8. contains options that are opposite of one another.
9. contains distractors that assess common errors or misconceptions.
Quick Quiz (True or False or Don’t Know [need more information])
1. Distractors should be parallel in content, structure, and length.
2. Cognition refers to the difficulty level of an item.
3. Items should be written so that the content in the item is accessible to the widest range of students.
4. Parallelism refers to when students from different ethnic, sex, or cultural groups perform differently on an item.
5. Controversial items are often assessed with the multiple-choice format.
6. Item fairness means that the item assesses all students at the appropriate age and enrolled grade-level.
7. An example of bias is presenting a culturally stereotypical situation in the item.
Match Column A with Column B to make each statement true. The first one has been done for you.

Column A        Column B
A week          is a period of 100 years
A decade        has 10 days
September       is a period of 10 years
A century       has 7 days
                is the ninth month of the year
Design Method 3: Matching Questions
• Variation on multiple choice type items.
• Suited to testing knowledge of terms, definitions and events.
Design Method 3: Hints for Writing Matching Questions
• Clearly explain the basis on which the match is to be made.
• Make sure the items contain the same content.
• Make the lists of items and responses short.
• Have an unequal number of items and responses to reduce guessing.
• Have all group names, numbers etc. in one list.
• Arrange the lists in a systematic fashion.
Design Method 3: Constructed-Response Items
Hints for Writing Short-Response Questions
• Relatively easy to construct.
• Not truly objective in terms of marking.
• Can be limited in measuring higher order outcomes.
Design Method 3: Constructed-Response Items
Hints for Writing Short-Response Questions
• Make sure the question or statement is clear and precise.
  – (Example: The Chernobyl nuclear disaster occurred in ___?)
• Ensure that the blanks in completion items are at the end of the statement.
• Make sure there are no irrelevant clues.
• Provide appropriate space for the answer.
• Keep the structure simple with minimal parts and sequence parts in difficulty order.
• Prepare a marking key.
Design Method 3: Constructed-Response Items
Hints for Essay Questions
Designing and developing essays involves three steps:
1. Planning – begin with clearly articulated outcomes; key components of knowledge, reasoning or writing
2. Task Development – specify what it is students have to know; identify what it is students are to write about; and, scaffold the response without letting the students know how to succeed.
3. Marking Guide Development – decide whether the task will require an analytical or holistic rubric.
Design Method 3: Constructed-Response Items
Hints for Essay Questions
• Write the question in simple, clear and direct language.
• Present the question in such a way that the student’s task is clearly defined.
• Consider carefully the amount of time the students need to answer the essay item.
• Use essay items to assess higher-order cognitive skills.
• Prepare a marking guide in advance.
• Make prior decisions regarding such factors as spelling, grammar, etc.
• Evaluate essay responses anonymously.
• Score essay responses via analytic or holistic rubrics.
HOLISTIC RATING SCALE – Poetry Task
Exercise: The poet uses a great deal of imagery (mental pictures) in the poem. For example, he states that there are a great number of golden daffodils fluttering and dancing, and he compares them to the stars in the Milky Way. How do these mental pictures add to your understanding or enjoyment of the poem? Write a paragraph explaining your answer.
Scoring Rubric
Note: At Levels 3 and 4, errors in mechanics and grammar (e.g. fragments, misspellings, flawed punctuation and incorrect capitalization) should not impede understanding.
ANALYTICAL RATING SCALE
Marking Guideline
Assign points to student responses that most closely match the characteristics listed.
4. - provides a direct, accurate response to the function of imagery
   - shows good organisation and a clear focus
   - includes few, if any, errors in grammar, usage, or mechanics
3. - exhibits a fairly clear and logical response to the question
   - contains minor organisational flaws or a somewhat unclear focus
   - may include some errors in grammar, usage, or mechanics
2. - attempts to address the question
   - includes confusing organisation and a focus which is unclear
   - contains problems in mechanics that interfere with communication
1. - barely attempts to address the question
   - includes little organisation and is not focused
   - complicates message with serious problems in language and mechanics
0. - indicates that the student has failed to attempt the question
N/S - indicates that the response is illegible or unreadable
Analytic Marking Rubric

Problem Area         Identified    Described             Solution              Total
                     (1 Mark)      (0, 1 or 2 Marks)     (0, 1 or 2 Marks)
Local Control            1              2                     1                  4
Federal Support          1              1                     2                  4
Legal Constraints        1              2                     2                  5
The Media                1              0                     0                  1
TOTAL                    4              5                     5                 14
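An analytic rubric like the one above is easy to tally mechanically, which also guards against marks that exceed the maximum for a criterion. The following is a minimal sketch: the marks and maxima are taken from the rubric, while the data layout and function names are assumptions made for the example.

```python
# Maximum marks per criterion, as stated in the rubric's column headings.
MAX_MARKS = {"identified": 1, "described": 2, "solution": 2}

# Marks awarded in the example rubric, one entry per problem area.
AWARDED = {
    "Local Control": {"identified": 1, "described": 2, "solution": 1},
    "Federal Support": {"identified": 1, "described": 1, "solution": 2},
    "Legal Constraints": {"identified": 1, "described": 2, "solution": 2},
    "The Media": {"identified": 1, "described": 0, "solution": 0},
}


def area_total(marks):
    """Sum the marks for one problem area, rejecting out-of-range values."""
    for criterion, mark in marks.items():
        if not 0 <= mark <= MAX_MARKS[criterion]:
            raise ValueError(f"{criterion}: {mark} is outside 0..{MAX_MARKS[criterion]}")
    return sum(marks.values())


def criterion_totals(awarded):
    """Column totals: marks earned on each criterion across all areas."""
    return {
        criterion: sum(marks[criterion] for marks in awarded.values())
        for criterion in MAX_MARKS
    }


if __name__ == "__main__":
    row_totals = {area: area_total(marks) for area, marks in AWARDED.items()}
    print(row_totals)
    print(criterion_totals(AWARDED))
    print("overall:", sum(row_totals.values()))
```

Keeping the per-criterion marks separate, and not just the overall total, is what distinguishes analytic from holistic scoring: the column totals show where a response was strong or weak.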
Design Method 4: Minimise Bias
• Potential sources of error include the scoring process, the student’s emotional state, the test administration environment and the item context.
• These sources must be understood and taken into account in the construction and administration of the tests and tasks.
Some Reflective Questions
1. What would be the major points of a presentation to staff advocating the use of selected-response items over constructed-response items?
2. Based on your own experience, how skilled do you think most classroom teachers are in constructing selected-response items?
3. What is your opinion regarding the relative virtues of holistic versus analytic scoring of essays? Would your opinion change if you were designing a scoring rubric for a history item?
4. Here is an essay by a 5th Grade student in response to the question: Explain why smoking is dangerous to your health. The teacher expected students to mention the effects of smoking on lungs, teeth, heart and stamina. What score would you give the essay out of 5?
Smoking is a very nasty habit and it’s against the law to smoke in many buildings today. Smoking makes people cough and it turns your teeth yellow and they get decaid and they fall out. Smoking causes some kinds of cancer to.
Some Recommended Readings
1. Downing, S.M. (2006) Selected-response item formats in test development. In S.M. Downing and T. M. Haladyna (Eds.) Handbook of test development. Mahwah, NJ: Lawrence Erlbaum and Associates.
2. Miller, M.D., Linn, R.L. and Gronlund, N.E. (2009) Measurement and Assessment in Teaching. Pearson. Upper Saddle River, New Jersey.
3. Reynolds, C.R., Livingston, R.B., and Willson, V. (2009) Measurement and Assessment in Education. Pearson. Upper Saddle River, New Jersey.
4. Popham, W.J. (2006) Assessment for Educational Leaders. Pearson. Boston.
5. Welch, C. (2006) Item and prompt development in performance testing. In S.M. Downing and T. M. Haladyna (Eds.) Handbook of test development. Mahwah, NJ: Lawrence Erlbaum and Associates.