Cast of Characters - Confex
Cast of Today’s Characters
• Michael R. McCormick, Superintendent Val Verde USD
• Jennifer M. Doskocil, Elementary Coordinator
• Sandy Sanford, EdD, Assessment Consultant
• Pete Goldschmidt, PhD, Research Consultant
[Figure: 2002-2013 School Year timeline. Labels: Pre-Year Benchmark, Benchmark 1, Benchmark 2, Benchmark 3, Post-Year Benchmark; "Analysis & Action" appears five times.]
• 95% Multiple Choice
• ~75% of Standards
• 20 to 40 items
• 85% Scan Sheet
“formative analysis” & Action 2002-2013
• By Grade Levels Teams
• Reduction from Standard to Items
• Using EADMS Analytic Tools
• Isolate Offending Standard/Item(s)
  o e.g., Wrong Answer Analysis
• Determine Cause(s)
• Determine Instructional Fix
• Apply Instructional Fix
• Intensely Collaborative (at least initially)
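The "Wrong Answer Analysis" step above can be sketched in a few lines. This is only an illustration of the general idea (tallying which incorrect option students chose most often on each multiple-choice item); it is not the EADMS tool, and the function names, item ids, and response data are all made up for the example.

```python
from collections import Counter

def wrong_answer_analysis(responses, answer_key):
    """Tally incorrect choices per item.

    responses: list of dicts, one per student, mapping item id -> chosen option
    answer_key: dict mapping item id -> correct option
    Returns a dict mapping item id -> Counter of wrong options chosen.
    """
    tallies = {item: Counter() for item in answer_key}
    for student in responses:
        for item, correct in answer_key.items():
            choice = student.get(item)
            # Skipped items (None) are ignored; only wrong choices are counted.
            if choice is not None and choice != correct:
                tallies[item][choice] += 1
    return tallies

def most_common_distractor(tallies):
    """For each item with at least one wrong answer, return (option, count)."""
    return {item: c.most_common(1)[0] for item, c in tallies.items() if c}

# Made-up data, for illustration only.
answer_key = {"Q1": "B", "Q2": "D"}
responses = [
    {"Q1": "B", "Q2": "A"},
    {"Q1": "C", "Q2": "A"},
    {"Q1": "C", "Q2": "D"},
    {"Q1": "B", "Q2": "A"},
]
tallies = wrong_answer_analysis(responses, answer_key)
top = most_common_distractor(tallies)
print(top)
```

A grade-level team could then look at the most frequently chosen wrong option on an offending item as a starting point for the "Determine Cause(s)" step.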
Downside of 2002-2013
• Lockstep System
• District Dictated—Not Teacher Owned
• Competition Increased as years passed
• Collaboration Decreased as years passed
• Original Purpose Compromised
• Ignored the “Formative Assessment” Revolution
• Could not effectively deal with new SA item types and PTs
Influence of the New “Summative Assessment”
Characteristic | OLD in Math | NEW in Math
Administration | Paper & Pencil | Computer
Test Components | One MC | CAT & PT
Item Types | MC | MC, MS, EQ, TM, TI, DD, GR, ST
Responses | One Correct per Item | May have Many Correct per Item
Cognitive Complexity | Lower | Higher
Psychomotor Dependent | Little | Lots
Results | Math beat ELA | ELA beat Math
MC Question
Are things like the required use of a Computer & Technology Enhanced Items (TEIs)…
A. needless Contaminants creating needless barriers?
(Construct Irrelevant Variance)
B. essential Components for CCR in the 21st century?
Our Need at time of Pilot
Transition to an assessment/analysis approach that…
• minimizes Competition and maximizes Collaboration
• features a Teacher-Driven approach to assessment/analysis as opposed to a District-Dictated approach
• incorporates the instructional use of Formative Interactions
• considers the psychomotor aspects of Computer use & TEIs
A Word about the “F” word
“formative analysis” = analyzing any assessment in order to act formatively
“Formative Interaction” = using the assessment to interact in a formative manner with students (e.g., complex interactions)
“FORMATIVE Assessment” = assessment at the point of instruction and incorporated systematically in the instructional process
Pilot Research Questions (2015-2016)
• Will teachers given full authority to choose all assessment items (Control Group) OR teachers choosing half the assessment items with an expert choosing the other half (Treatment Group) build assessments that align better to Smarter Balanced specifications with respect to mix of DOK, Item Types, Task Models, & Claims?
• Which Group will perform better on the Summative Assessment?
Pilot Structure
Characteristic | 2015-16
Grade Level | 5th Grade
Content Area | Math
Treatment Group | 6 Sites (Random)
Treatment Assignment | 8 Item Testlet
Testlet Unique to | Each Teacher (~27)
Items Selected by | Teachers (4 Items) & Expert (4 Items)
Control Group | 6 Sites (Random)
Control Assignment | 8 Item Testlet
Testlet Unique to | Each Teacher (~27)
Items Selected by | Teachers (all 8 Items)
Assignment Iterations | 5
Iteration Interval | Approx. 6 weeks
Administration Window | ~2 weeks at end of instruction
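The slides say only that the 6 Treatment and 6 Control sites were assigned at random; the actual procedure is not described. As a hedged sketch of one common way to do this, the following shuffles a list of sites and splits it in half. The site names, the function name, and the seed are hypothetical.

```python
import random

def assign_sites(sites, n_treatment, seed=None):
    """Randomly split sites into (treatment, control) lists.

    A seed makes the assignment reproducible for auditing.
    """
    rng = random.Random(seed)
    shuffled = list(sites)       # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    treatment = sorted(shuffled[:n_treatment])
    control = sorted(shuffled[n_treatment:])
    return treatment, control

# Hypothetical site names; the pilot used 12 sites in total.
sites = [f"Site {i}" for i in range(1, 13)]
treatment, control = assign_sites(sites, n_treatment=6, seed=2015)
print("Treatment:", treatment)
print("Control:  ", control)
```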
Item Alignment
• Each item created to conform to the appropriate Smarter Balanced Item Specification Task Model
• Testlet Mix (Item Type, Task Model, DOK, & Claim) as close as possible to SA variety as described in Blueprint and Item Specifications
Experiment—Pilot (Candidate List)
• Expert builds Candidate List (CL) of 20 items in advance of each Assessment Cycle. Items are authored in EADMS
• Based on 2-3 Priority Standards per Cycle
• Mix of Item Types, Task Model, DOK, & Claim aligned to Summative Assessment
• CL cover sheet (Group A or B specific) describing process with the following sheets showing each item with indication of standard, item type, Task Model, DOK, and Claim/Target
• Items not used for Testlet could be used by teachers to support lessons
Pilot Testlet System
Candidate Item List Published
Teachers Study List
Group A Chooses 8 Items
Group B Chooses 4 Items
Group B: 4 More Items Added
Testlets Created in EADMS
Testlet Numbers Sent to Teachers
Teachers Administer Testlets
Scores Captured in EADMS
Administration Monitored in EADMS
Results Harvested from EADMS
Results Analyzed with Respect to Summative Assessment
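The selection step in this workflow (Group A teachers pick all 8 items, Group B teachers pick 4 and the expert adds 4 more, all from the 20-item Candidate List) can be sketched as a small validation routine. This is an assumption-laden illustration, not how EADMS builds testlets; `build_testlet` and the item numbers are hypothetical.

```python
def build_testlet(candidate_ids, teacher_picks, expert_picks=(), size=8):
    """Combine teacher and expert picks into one testlet.

    Returns (testlet_items, unused_items). Unused Candidate List items
    remain available to support lessons.
    """
    candidates = set(candidate_ids)
    chosen = list(teacher_picks) + list(expert_picks)
    if len(chosen) != size:
        raise ValueError(f"expected {size} items, got {len(chosen)}")
    if len(set(chosen)) != size:
        raise ValueError("an item was selected more than once")
    unknown = set(chosen) - candidates
    if unknown:
        raise ValueError(f"items not on the Candidate List: {sorted(unknown)}")
    unused = sorted(candidates - set(chosen))
    return sorted(chosen), unused

candidate_list = list(range(1, 21))  # 20 candidate items, numbered 1 to 20

# Group A (Control): the teacher chooses all 8 items.
group_a_testlet, group_a_unused = build_testlet(
    candidate_list, teacher_picks=[1, 3, 4, 7, 9, 12, 15, 20]
)

# Group B (Treatment): the teacher chooses 4 items, the expert adds 4 more.
group_b_testlet, group_b_unused = build_testlet(
    candidate_list, teacher_picks=[2, 5, 11, 18], expert_picks=[1, 8, 14, 19]
)
print(group_b_testlet, len(group_b_unused))
```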
Professional Development—Pilot
• 3.5 hours Up-Front, On-Site PD re the value of item selection
  o Building Testlet
  o Measuring purposely
  o Standard, DOK, Item Type, Task Model mix, Claim mix
  o Formative Interaction
  o TEI psychomotor alert
• Optional “On Demand” extended PD
• Continual emphasis via Instructional Coaches
• Reminders & emphasis via Candidate Item Lists with each Assessment Cycle
Role of the Testlet in Pilot
• “formative Tool”
  o Short-Cycle Interim (Benchmark)
  o To be “formatively analyzed” and acted on accordingly
• Remaining CL items could be used to support instruction
Pilot Study Results
• No significant difference between Treatment and Control Groups with respect to the SA
• Testlet Quality converged (required PD same for all)
• Teachers still used Testlets more Summatively than Formatively
• Teachers wanted Testlet symmetry across grade level at each site
• Psychomotor challenges re Computer & TEIs greater than thought
Our Need for Year 2 Study
Improving the assessment/analysis approach so as to…
• minimize Competition and maximize Collaboration
• feature a Teacher-Driven approach to assessment as opposed to a District-Dictated approach
• Move to school (grade level) based Testlets
• Accelerate Formative Interaction towards FORMATIVE Assessment
• Accelerate help for psychomotor aspects of computer & TEI
Year 2 Research Questions
Analysis of Fidelity of Implementation
What is the…
• 1. relationship between Professional Learning & Testlet Quality?
• 2. relationship between Professional Learning & SA Outcomes?
• 3. relationship between Testlet Quality & SA Outcomes?
Pilot & Year 2 Structure
Characteristic | 2015-16 | 2016-17
Grade Level | 5th Grade | 3rd, 4th, & 5th Grades
Content Area | Math | Math
Treatment Group | 6 Sites (Random) | 12 Sites
Treatment Assignment | 8 Item Testlet | 8 Item Testlet
Testlet Unique to | Each Teacher (27) | Each Site (12)
Items Selected by | Teachers (4 Items) & Expert (4 Items) | Grade Team, Expert, or Both
Control | 6 Sites (Random) |
Control Assignment | 8 Item Testlet |
Testlet Unique to | Each Teacher (27) |
Items Selected by | Teachers (all 8 Items) |
Assignment Iterations | 5 | 9
Iteration Interval | Approx. 6 weeks | Approx. 4 weeks
Administration Window | ~2 weeks | ~4 weeks
Year 2 PD
• Continuation of Pilot PD
• More explicit Direction of Pilot PD
• More detail on Formative Techniques
• Provision and Explanation of Formative Tools (e.g., Low Tech to High Tech Process and Power Points)
• Greater emphasis on Computer & TEI re psychomotor implications
• More On-Demand On-Site PD
Year 2 Testlet System
Candidate Item List Published
Site Grade Levels Study List
Items selected in one of three ways:
  o GL Chooses 8 Items
  o GL 4 & Expert 4 Items
  o Expert Chooses 8 Items
Testlets Created in EADMS
Testlet Numbers Sent to Teachers
Teachers Administer Testlets
Scores Captured in EADMS
Results Harvested from EADMS
Results Analyzed with Respect to Summative Assessment
Teachers Use Results Formatively
formative Tool
• Candidate List contains 20 items that are a representative mix of the 2 or 3 Priority Standards with regard to Claims, Item Types, Task Models, & DOK levels
• Model of ideal mix built
• Grade level Testlet mix compared to ideal model mix
• Each Testlet awarded an integer score on a scale from 0 to 8
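[Table: Candidate List items 1 to 20 in columns, with each site's 8 selected items marked X and a score Q per site. The column positions of the X marks were lost in extraction.]
Site | Items Selected | Q
Site A | 8 | 1
Site B | 8 | 4
Site C | 8 | 8
Item attributes shown across the 20 item columns:
STND | 5NF3, 5NF4, 5NF6
DOK | DOK 1, DOK 2, DOK 3
TM | TM1a, TM1b, TM2, TM3, TM4a, TM4b, TM6a, TM6b, TM1, TM2c
Claim | Claim 1 Target F, C2A, C3D
Model | 2 Items, 1 Item, 1 or 2 Items, 2 Items, 1 or 2 Items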
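The slides state that each Testlet's mix is compared to an ideal model mix and awarded an integer score from 0 to 8, but they do not give the scoring rule. The sketch below shows one possible rule, purely as an assumption: each of the 8 chosen items earns a point if it falls within the model's quota for its group. The group labels G1 to G5 are hypothetical stand-ins for the five groups in the Model row ("2 Items, 1 Item, 1 or 2 Items, 2 Items, 1 or 2 Items"), using the upper bound of each range as the quota.

```python
from collections import Counter

# Hypothetical group labels; quotas are the upper bounds from the Model row.
MODEL_MAX = {"G1": 2, "G2": 1, "G3": 2, "G4": 2, "G5": 2}

def quality_score(item_groups, model_max=MODEL_MAX, size=8):
    """Score a testlet from 0 to `size` under an ASSUMED rule.

    item_groups: the group label of each chosen item (one entry per item).
    An item counts toward the score only while its group's quota is not
    exceeded; items in groups outside the model count for nothing.
    """
    counts = Counter(item_groups)
    matched = sum(min(counts[group], cap) for group, cap in model_max.items())
    return min(matched, size)

balanced = ["G1", "G1", "G2", "G3", "G3", "G4", "G4", "G5"]
lopsided = ["G1"] * 8
print(quality_score(balanced), quality_score(lopsided))
```

Under this assumed rule a testlet that matches the model mix scores 8, while one that draws every item from a single group scores only as many points as that group's quota.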
formative Tool Evaluation
Year 2 Results
Results pertain to:
• Extending the model presented previously
• Relationships among elements (describing implementation)
• Relationships with outcomes
Caveats:
• Lack of true experimental control
• Aggregate Data
• Single Testlet
Year 2 Results
• What might have influenced teachers in selecting items (correlations)?
Influence | Implementation | Rigor
Pygmalion effect (last year’s 5th graders) | .64 | .12
Responsiveness (current students as 4th graders) | .03 | .47
Most recent performance | ? | ?
Year 2 Results
Correlation with change in performance from 4th to 5th grade:
• Fidelity of implementation: .43
• Rigor (School/testlet contribution to ability): .44
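For readers who want to see how a correlation of this kind is computed, here is a minimal Pearson correlation sketch. The slides do not say which correlation coefficient was used, so Pearson's r is an assumption, and the numbers below are made up for illustration; they do not reproduce the .43 or .44 reported above.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

# Made-up site-level values, for illustration only.
fidelity = [1, 2, 3, 4, 5]   # hypothetical fidelity-of-implementation ratings
gain = [2, 1, 4, 3, 5]       # hypothetical change in performance
r = pearson_r(fidelity, gain)
print(round(r, 2))
```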
Year 2 Results (modeling testlets)
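Use a Multilevel Measurement Model: Cross-classified, Three Level (items within Students and Schools)
Level-1 Model
$$\mathrm{Prob}(\mathrm{RESPONSE}_{ijk} = 1 \mid \pi_{jk}) = \phi_{ijk}$$
$$\log\left[\frac{\phi_{ijk}}{1 - \phi_{ijk}}\right] = \eta_{ijk}$$
$$\begin{aligned}
\eta_{ijk} = {} & \pi_{0jk} + \pi_{1jk}\,\mathrm{ITEM1}_{ijk} + \pi_{2jk}\,\mathrm{ITEM2}_{ijk} + \pi_{3jk}\,\mathrm{ITEM5}_{ijk} + \pi_{4jk}\,\mathrm{ITEM6}_{ijk} \\
& + \pi_{5jk}\,\mathrm{ITEM7}_{ijk} + \pi_{6jk}\,\mathrm{ITEM8}_{ijk} + \pi_{7jk}\,\mathrm{ITEM9}_{ijk} + \pi_{8jk}\,\mathrm{ITEM10}_{ijk} + \pi_{9jk}\,\mathrm{ITEM11}_{ijk} \\
& + \pi_{10jk}\,\mathrm{ITEM12}_{ijk} + \pi_{11jk}\,\mathrm{ITEM13}_{ijk} + \pi_{12jk}\,\mathrm{ITEM14}_{ijk} + \pi_{13jk}\,\mathrm{ITEM15}_{ijk} + \pi_{14jk}\,\mathrm{ITEM16}_{ijk} \\
& + \pi_{15jk}\,\mathrm{ITEM17}_{ijk} + \pi_{16jk}\,\mathrm{ITEM18}_{ijk} + \pi_{17jk}\,\mathrm{ITEM19}_{ijk} + \pi_{18jk}\,\mathrm{ITEM20}_{ijk}
\end{aligned}$$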
Year 2 Results (modeling Testlets)
Between Students within Testlets
$$\pi_{0jk} = \beta_{00k} + r_{0jk}$$
$$\pi_{pjk} = \beta_{p0k}, \quad p = 1, \ldots, 18$$
Between Testlets
$$\beta_{00k} = \gamma_{000} + u_{00k}$$
$$\beta_{p0k} = \gamma_{p00}, \quad p = 1, \ldots, 18$$
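As a reading aid, substituting the student-level and testlet-level equations into the Level-1 model gives a single combined equation. The substitution itself follows directly from the equations above; the normality assumptions on the random effects in the last line are the conventional ones for this kind of model and are not stated on the slides.

$$\eta_{ijk} = \gamma_{000} + \sum_{p=1}^{18} \gamma_{p00}\, D_{pijk} + r_{0jk} + u_{00k}$$

Here $D_{pijk}$ denotes the $p$-th item indicator in the Level-1 model (ITEM1, ITEM2, ITEM5, ..., ITEM20 in that order). For a response to the item with indicator $p$, this reduces to

$$\eta_{ijk} = \gamma_{000} + \gamma_{p00} + r_{0jk} + u_{00k}, \qquad \phi_{ijk} = \frac{1}{1 + e^{-\eta_{ijk}}}$$

and for a response where every indicator is 0 it reduces to $\eta_{ijk} = \gamma_{000} + r_{0jk} + u_{00k}$. Assumed, not stated in the source: $r_{0jk} \sim N(0, \tau_{\pi})$ and $u_{00k} \sim N(0, \tau_{\beta})$.

Read this way, each $\gamma_{p00}$ is a fixed item effect (relative easiness on the logit scale), $r_{0jk}$ is the student effect, and $u_{00k}$ is the testlet effect.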
Year 2 Results (testlet items)
[Figure: ICCs for the testlet items. Vertical axis from 0.000 to 1.000; horizontal axis from -4.00 to 4.00.]
Year 2 Results (testlet items)
[Figure: Item Information Functions for the testlet items. Vertical axis from 0.000 to 0.300; horizontal axis from -4.00 to 4.00.]
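Curves like these can be sketched under a Rasch-type (one-parameter logistic) form, which is an assumption here: it is consistent with a logit model that has only item effects and a student effect, but the slides do not say how the plotted curves were produced. The difficulty value `b` below is hypothetical.

```python
import math

def icc(theta, b):
    """Item characteristic curve: P(correct) at ability theta, difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Item information for a Rasch-type item: P * (1 - P)."""
    p = icc(theta, b)
    return p * (1.0 - p)

# Ability grid from -4.0 to 4.0 in steps of 0.5, matching the plotted range.
thetas = [x / 2 for x in range(-8, 9)]
b = 0.5  # hypothetical item difficulty on the logit scale

curve = [(t, icc(t, b), item_information(t, b)) for t in thetas]
for theta, p, info in curve:
    print(f"{theta:5.1f}  P={p:.3f}  I={info:.3f}")
```

Under this form the information function peaks at 0.25 where ability equals the item's difficulty, which fits within the 0.000 to 0.300 vertical range of the information plot.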
Year 2 Results (TBD)
Model individual student data.
Link students to teachers in order to create variation within testlet between teachers.
Incorporate longitudinal aspect.
Year 2 Challenges (Observation & Survey)
• Administration window narrower than instructional period— Saturation
• Disconnects in understanding the analytic processes
• Disconnects in understanding the nature of the formative use process
• Each of these will be addressed in Year 3
• Computer & TEI psychomotor problems still abound
• Still see some resistance and apathy, but “Buy-In” traction increasing
• Need for Student Voice
Teacher Survey Highlights
1. More training on purpose of FORMATIVE Assessment and Formative Interaction
2. More training on how to perform “formative analysis” of results
3. Liked that assessments were common among grade level teachers at site (feedback from 15-16 survey) to facilitate group analysis and collaboration
Teacher Survey Highlights
4. Liked easy access to what they judged to be well-written items that challenged students
5. Liked the concept of giving more control to teachers
Year-3 Next Steps
• Systematic Districtwide PD on Formative Assessment Techniques (Using both Internal & External Assets, e.g., SVMI)
• Formative use of both Performance Tasks and unused CL items (Monitoring)
• Continuation of the Grades 3-5 Testlet Process (and inclusion of Grades 1 & 2)
• Real Time Monitoring of the Testlet Results and providing Expert Commentary to Sites
• Develop & implement Computer/TEI psychomotor solutions
• Develop & implement Student Voice instruments
• Measure degree of teacher uptake of PD
Role of the Provided Items in Year 3
• Chosen Testlet to act as Short-Cycle Interim that is formatively analyzed and acted on
• Choose remaining CL items (and other items) to be used with the FORMATIVE Assessment process
• Develop the use of well-written Performance Tasks (PTs) in the FORMATIVE Assessment process
Characteristic | 2015-16 | 2016-17 | 2017-2018
Grade Level | 5th Grade | Grades 3 to 5 | Grades 1 to 5
Content Area | Math | Math | Math
Treatment Group | 6 Sites (Random) | 12 Sites | 12 Sites
Treatment Assignment | 8 Item Testlet | 8 Item Testlet | 8 Item Testlet, Lesson Items
Testlet Unique to | Each Teacher (27) | Each Site (12) | Each Site (12)
Items Selected by | Teachers (4) & Expert (4) | Grade Team, Expert, or Both | Grade Team, Expert, or Both
Control | 6 Sites (Random) | |
Control Assignment | 8 Item Testlet | |
Testlet Unique to | Each Teacher (27) | |
Items Selected by | Teachers (all 8 Items) | |
Assignment Iterations | 5 | 9 | 12
Iteration Interval | Approx. 6 weeks | Approx. 4 weeks | 2-3 Weeks
Admin Window | ~2 weeks | ~2 weeks | Same as Instruction Window
Window Alignment | Testlet last half Instruction | Testlet ~last half Instruction | Testlet and Instruction the same
 | | | Add Performance Tasks
Emphasis | Testlet to SA Relationship | Formative Use of Testlets | FORMATIVE Assessment
Thank You!
Michael R. McCormick [email protected]
@ValVerdeSupt
Jennifer M. Doskocil [email protected]
Pete Goldschmidt [email protected]
Sandy Sanford [email protected]
www.k12sharing.org Password “Discover”