Writing High Quality Assessment Items
Using a Variety of Formats
Scott Strother & Duane Benson
11/14/14
Carnegie Community College Pathways
Both Pathways’ instructional systems include:
• Ambitious learning goals
• Lessons and out-of-class materials
• Formative and summative assessments
• Productive persistence
• Language and literacy
• Dynamic online environment
• Advancing quality teaching
• Rapid analytics to support continuous improvement
Summative Assessments: Purpose
Summative Assessments
1) Understand Levels of Student Knowledge
2) Understand Variation in Student Knowledge
3) Program Improvement
Summative Assessments: Assessment Building Process
(Process diagram with nine numbered steps; step labels:)
• Item Writing
• Language & Literacy
• Field Testing / Comparison
• Dissemination and Administration
• Psychometric Analysis
• Assessment Assembly
• Measurement Review and Content Check: Faculty; Carnegie
• Performance Analysis
• Measurement Review and Content Check: Faculty; Carnegie
Summative Assessments: Assessment Report
(Process diagram repeated from the previous slide.)
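Summative Assessments: Purpose
Assessment | Term | Pathway Difference
Statway Mid-Pathway Assessments | Fall 2011 | +4.9
Statway Mid-Pathway Assessments | Spring 2012 | +7.2
Statway Mid-Pathway Assessments | Fall 2012 | +3.5
Statway Mid-Pathway Assessments | Spring 2013 | +4.1
Statway End-of-Pathway Assessments | Spring 2012 | +20.8
Statway End-of-Pathway Assessments | Spring 2013 | +17.4
Quantway 1 Assessments | Spring 2012 | +15.8
Quantway 1 Assessments | Fall 2012 | +13.2
Quantway 1 Assessments | Spring 2013 | +11.7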
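Summative Assessments: Purpose
Assessment | Term | Correlation to Grade
Statway Mid-Pathway Assessments | Fall 2011 | .61**
Statway Mid-Pathway Assessments | Spring 2012 | .44*
Statway Mid-Pathway Assessments | Fall 2012 | .57**
Statway Mid-Pathway Assessments | Spring 2013 | .64**
Statway End-of-Pathway Assessments | Spring 2012 | .60**
Statway End-of-Pathway Assessments | Spring 2013 | .49**
Quantway 1 Assessments | Spring 2012 | .60**
Quantway 1 Assessments | Fall 2012 | .66**
Quantway 1 Assessments | Spring 2013 | .66**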
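A correlation between assessment scores and course grades can be computed along the following lines. This is only a minimal sketch: the slides do not say which coefficient was used, so Pearson's r is an assumption here, and the scores and grade points below are made-up illustration data, not Pathways results.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: assessment percent scores and course grade points.
scores = [55, 62, 70, 71, 80, 88, 93]
grades = [1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0]
r = pearson_r(scores, grades)
print(round(r, 2))
```

A value near 1 would mean students who score higher on the assessment also tend to earn higher grades; a value near 0 would mean the assessment tells little about the grade.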
Summative Assessments: Purpose
Summative Assessments
1) Understand Levels of Student Knowledge
2) Understand Variation in Student Knowledge
3) Program Improvement
What is Next?
Summative Assessments: Improvement
Item Diversity
Going Beyond Multiple Choice
Diversity of Item Types
Why not multiple choice?
• MC items tend to be reliable, seem to be easily written and understood…
• Some would argue MC items:
  – Do not test critical thinking, creativity
  – Inauthentic
  – Too easily answered by guessing
  – Encourage teaching to test
  – Do not engage students
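The "too easily answered by guessing" concern can be put in numbers. The sketch below assumes a student guesses blindly and independently on every item and that all items have the same number of options; the 20-item, 4-option test and the cut score of 12 are hypothetical values chosen for illustration.

```python
from math import comb


def prob_at_least(n_items, n_options, min_correct):
    """Chance of getting at least min_correct items right by blind guessing."""
    p = 1 / n_options  # chance of guessing any single item correctly
    return sum(
        comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
        for k in range(min_correct, n_items + 1)
    )


# Hypothetical test: 20 four-option items, cut score of 12 correct.
expected_correct = 20 * (1 / 4)
print(expected_correct)
print(prob_at_least(20, 4, 12))
```

Under these assumptions a guesser expects 5 of 20 correct, but reaching a cut score of 12 by guessing alone is rare. Fewer options per item (for example true/false) make guessing far more rewarding.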
Summative Assessments: Improvement
Capacity for Assessment Writing
Assessment Training
Summative Assessments: Improvement
Implementation
Online Assessments
Summative Assessments: Improvement
Item Diversity
Capacity for Assessment Writing
Implementation
Going Beyond Multiple Choice
Online Assessments
Assessment Training
Assessment Expert Convening
Assessment Team /
Training
1. Diversity of Item Types
2. High Quality Items
3. High Quality Assessments
Diversity of Item Types
• True False
• Ordering / Ranking
• Multiple choice
• Multiple select
• Fill in blank / completion
• Matching
• Drag n’ drop
• Hotspot
• Knowledge matrix
• Cascaded items
• Calculations
• Short answer
• Visual representation (graphing, plotting, sketching a diagram, drawing, creating graphics)
• Labeling
• Constructed response / free response
• Project-based construction (performance task, investigation project, portfolio piece, multistep project, performance assessment)
• Apprentice task
• Essay
• Computer Simulations
Example of New Item Formats
• Drag n’ drop
• Hotspot
Trade-Offs
Option | Example | Benefits | Drawbacks
1. Established measure | Multiple-choice, true-false | Cheap, easy to obtain and score | Less authentic, complex, engaging
2. Novel but somewhat established measure | Open-response, portfolio | More complex, less rote | Challenges in making raters consistent, time
3. Cutting-edge technology-based measure | Performance, computer simulation | Much more authentic, engaging | Resources (including computers), remaining technical challenges
Diversity of Item Types
Big Considerations
• Purpose of Assessment
• Depth of Knowledge
• Cognitive Load / Level of Knowledge
  – Recognize and recall
  – Procedural knowledge (skills and concepts)
  – Explain and conclude (strategic thinking)
  – Justification (extended thinking)
• Design Specifications
• Grading (time, reliability at scale, etc.)
• Goal of Item (e.g. productive struggle)
• Context
Item
Program
Diversity of Item Types
Purpose of Assessment
Depth of Knowledge
Cognitive Load
Context
Design Specifications
Grading
Goal of Item
• True False
• Ordering / Ranking
• Labeling
• Multiple Choice
• Multiple select
• Fill in blank / completion
• Matching
• Drag n’ drop
• Hotspot
• Knowledge Matrix
• Cascaded items
• Calculations
• Short answer
• Visual representations
• Constructed response / free response
• Project-based construction
• Apprentice task
• Essay
• Computer Simulation
High Quality Assessments
• Aligning Items with Learning Outcomes or Standards
  – Categorical Concurrence
  – Range-of-Knowledge Correspondence
  – Balance of Representation
  – Depth-of-Knowledge Consistency
High Quality Assessments
• Aligning Items with Learning Outcomes or Standards
• Reliability
• Validity
Validity is an argument
• Validity: evidence supports specific interpretations of scores for specific uses
• No such thing as a valid test – there are only valid uses and inferences
• A variety of evidence can be used to support these arguments
  – Test content
  – Response processes
  – Relationships with other measures
  – Generalizability across administrations, forms, etc.
High Quality Assessments
• Aligning Items with Learning Outcomes or Standards
• Reliability
• Validity
• Bias
• Depth of Knowledge (targeting various levels and skills*)
• Difficulty (ensuring the appropriate spread)
• Discrimination (of various ability levels)
• Variety of Item Formats
• Context / Relevance
• Purpose of Assessments (e.g. formative, summative)
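Difficulty and discrimination can be estimated from scored responses with classical item statistics. The sketch below is one common approach, not necessarily the analysis used for the Pathways assessments: difficulty is taken as the proportion of students answering correctly, and discrimination as the correlation between an item and the total on the remaining items. The response matrix is hypothetical (rows are students, columns are items, 1 = correct).

```python
from math import sqrt


def item_difficulty(responses, item):
    """Proportion of students answering the item correctly (higher = easier)."""
    return sum(row[item] for row in responses) / len(responses)


def item_discrimination(responses, item):
    """Correlation between the item score and the total on the remaining items."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    n = len(responses)
    mean_i = sum(item_scores) / n
    mean_r = sum(rest_scores) / n
    cov = sum((i - mean_i) * (r - mean_r) for i, r in zip(item_scores, rest_scores))
    var_i = sum((i - mean_i) ** 2 for i in item_scores)
    var_r = sum((r - mean_r) ** 2 for r in rest_scores)
    if var_i == 0 or var_r == 0:
        return None  # undefined when every student has the same score
    return cov / sqrt(var_i * var_r)


# Hypothetical scored responses: 6 students by 4 items.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]

for item in range(4):
    print(item, item_difficulty(responses, item), item_discrimination(responses, item))
```

In this made-up data the last item has a negative discrimination value: students who do well on the rest of the test are less likely to get it right, which is usually a signal to review the item.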
Purpose
• Examples
  – Sum up what a student has learned?
  – Compare across a wide range of students?
  – Chart progress towards mastery?
  – Understand student’s learning process?
  – Drive academic improvement?
• Technical/measurement quality
  – Reliability
  – Validity
• Instructional priorities
  – Authenticity
  – Task complexity
  – Engagement
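Reliability is often summarized with an internal-consistency coefficient. The sketch below uses Cronbach's alpha, which is an assumption on our part since the slides do not name a specific coefficient. The two response matrices are hypothetical (rows are students, columns are item scores).

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a students-by-items matrix of item scores."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across students, and variance of total scores.
    item_vars = [pvariance([row[j] for row in responses]) for j in range(k)]
    total_var = pvariance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("alpha is undefined when all total scores are equal")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: two items that always agree, and two that are unrelated.
consistent = [[1, 1], [0, 0], [1, 1], [0, 0]]
unrelated = [[1, 1], [1, 0], [0, 1], [0, 0]]
print(cronbach_alpha(consistent))
print(cronbach_alpha(unrelated))
```

Items that rank students the same way push alpha toward 1, while unrelated items pull it toward 0. For items scored 0/1 this formula coincides with KR-20.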
High Quality Assessments
Critical things often overlooked:
- How will you use the feedback?
- Useful for teaching and learning?
- Program goals (e.g. productive struggle)
- Administration platform
- BALANCE
Process of Item Writing
• Item writing is hard
  – Become very familiar with content and learning outcomes
  – Choose specific learning outcomes, difficulty level, discrimination, item format, etc. first
  – Consider design specifications of each item format
  – Consider resources: time, administration platform
  – Balance
  – Get help!
    • Collective Understanding
    • Group effort
    • Iterative Feedback
    • Examples are great! Good and bad!
  – PRACTICE
    • Writing
    • Reviewing!