Cast of Characters - Confex · •2. relationship between Professional Learning & SA Outcomes? ......

The Impact of Increased Teacher Assessment Literacy & Use of Formative Feedback

Year-2 Study

Cast of Today’s Characters

• Michael R. McCormick, Superintendent Val Verde USD

• Jennifer M. Doskocil, Elementary Coordinator

• Sandy Sanford, EdD, Assessment Consultant

• Pete Goldschmidt, PhD, Research Consultant

Short History of Val Verde Elementary Benchmarks

[Figure: School Year benchmark timeline, 2002-2013. Labels: Pre-Year Benchmark, Benchmark 1, Benchmark 2, Benchmark 3, Post-Year Benchmark, with five “Analysis & Action” steps.]

• 95% Multiple Choice

• ~75% of Standards

• 20 to 40 items

• 85% Scan Sheet

“formative analysis” & Action 2002-2013

• By Grade Levels Teams

• Reduction from Standard to Items

• Using EADMS Analytic Tools

• Isolate Offending Standard/Item(s)

o e.g., Wrong Answer Analysis

• Determine Cause(s)

• Determine Instructional Fix

• Apply Instructional Fix

• Intensely Collaborative (at least initially)

Downside of 2002-2013

• Lockstep System

• District Dictated—Not Teacher Owned

• Competition Increased as years passed

• Collaboration Decreased as years passed

• Original Purpose Compromised

• Ignored the “Formative Assessment” Revolution

• Could not effectively deal with new SA item types and PTs

---Collaboration Capital---

Build your REPUTATION

SHARING

-Michael R. McCormick, Val Verde USD

Influence of the New “Summative Assessment”

Characteristic | OLD in Math | NEW in Math
Administration | Paper & Pencil | Computer
Test Components | One MC | CAT & PT
Item Types | MC | MC, MS, EQ, TM, TI, DD, GR, ST
Responses | One Correct per Item | May have Many Correct per Item
Cognitive Complexity | Lower | Higher
Psychomotor Dependence | Little | Lots
Results | Math beat ELA | ELA beat Math

MC Question

Are things like the required use of a Computer & Technology Enhanced Items (TEIs)…

A. needless Contaminants creating needless barriers?

(Construct Irrelevant Variance)

B. essential Components for CCR in the 21st century?

Pilot Study 2015-16

Our Need at time of Pilot

Transition to an assessment/analysis approach that…

• minimizes Competition and maximizes Collaboration

• features a Teacher-Driven approach to assessment/analysis as opposed to a District-Dictated approach

• incorporates the instructional use of Formative Interactions

• considers the psychomotor aspects of Computer use & TEIs

A Word about the “F” word

“formative analysis” = analyzing any assessment in order to act formatively

“Formative Interaction” = using the assessment to interact in a formative manner with students (e.g., complex interactions)

“FORMATIVE Assessment” = assessment at the point of instruction and incorporated systematically in the instructional process

Pilot Research Questions (2015-2016)

• Will teachers given full authority to choose all assessment items (Control Group) OR teachers choosing half the assessment items with an expert choosing the other half (Treatment Group) build assessments that align better to Smarter Balanced specifications with respect to mix of DOK, Item Types, Task Models, & Claims?

• Which Group will perform better on the Summative Assessment?

Pilot Structure

Characteristic | 2015-16
Grade Level | 5th Grade
Content Area | Math
Treatment Group | 6 Sites (Random)
Treatment Assignment | 8 Item Testlet
Testlet Unique to | Each Teacher (~27)
Items Selected by | Teachers (4 Items) & Expert (4 Items)
Control Group | 6 Sites (Random)
Control Assignment | 8 Item Testlet
Testlet Unique to | Each Teacher (~27)
Items Selected by | Teachers (all 8 Items)
Assignment Iterations | 5
Iteration Interval | Approx. 6 weeks
Administration Window | ~2 weeks at end of instruction

Item Alignment

• Each item created to conform to the appropriate Smarter Balanced Item Specification Task Model

• Testlet Mix (Item Type, Task Model, DOK, & Claim) as close as possible to SA variety as described in Blueprint and Item Specifications

Claim 1 Target C

Experiment—Pilot (Candidate List)

• Expert builds Candidate List (CL) of 20 items in advance of each Assessment Cycle. Items are authored in EADMS

• Based on 2-3 Priority Standards per Cycle

• Mix of Item Types, Task Model, DOK, & Claim aligned to Summative Assessment

• CL cover sheet (specific to Group A or B) describes the process; the sheets that follow show each item with its standard, item type, Task Model, DOK, and Claim/Target

• Items not used for Testlet could be used by teachers to support lessons

Pilot Testlet System

Candidate Item List Published

Teachers Study List

Group A Chooses 8 Items

Group B Chooses 4 Items

Group B 4 More Items Added

Testlets Created in EADMS

Testlet Numbers Sent to Teachers

Teachers Admin Testlets

Scores Captured in EADMS

Administration Monitored in EADMS

Results Harvested from EADMS

Results Analyzed W/R Summative Assessment
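The item-selection step of this flow can be sketched in code. This is only an illustration, assuming a 20-item Candidate List numbered 1 to 20; the function name `build_testlet` and the specific item numbers are hypothetical and do not come from EADMS or the study.

```python
def build_testlet(candidate_list, teacher_picks, expert_picks=()):
    """Combine teacher and expert picks into one 8-item testlet.

    Group A (control): the teacher supplies all 8 items.
    Group B (treatment): the teacher supplies 4 and the expert adds 4.
    """
    chosen = list(teacher_picks) + list(expert_picks)
    if len(chosen) != 8:
        raise ValueError("a testlet must contain exactly 8 items")
    if len(set(chosen)) != 8:
        raise ValueError("an item may appear only once in a testlet")
    if not set(chosen) <= set(candidate_list):
        raise ValueError("every item must come from the Candidate List")
    return sorted(chosen)


# Hypothetical item numbers, for illustration only.
candidate_list = list(range(1, 21))
group_a = build_testlet(candidate_list, [1, 2, 5, 8, 11, 13, 17, 19])
group_b = build_testlet(candidate_list, [3, 6, 9, 12], expert_picks=[1, 11, 15, 19])
```

The checks mirror the constraints stated on the slides (8 items per testlet, all drawn from the Candidate List); anything beyond that is an assumption.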

Professional Development—Pilot

• 3.5 hours Up-Front, On-Site PD re the value of item selection

o Building Testlet

o Measuring purposely

o Standard, DOK, Item Type, Task Model mix, Claim mix

o Formative Interaction

o TEI psychomotor alert

• Optional “On Demand” extended PD

• Continual emphasis via Instructional Coaches

• Reminders & emphasis via Candidate Item Lists with each Assessment Cycle

Role of the Testlet in Pilot

• “formative Tool”

o Short-Cycle Interim (Benchmark)

o To be “formatively analyzed” and acted on accordingly

• Remaining CL items could be used to support instruction

Pilot Study Results

• No significant difference in Treatment and Control Groups w/r SA

• Testlet Quality converged (required PD same for all)

• Teachers still used Testlets more Summatively than Formatively

• Teachers wanted Testlet symmetry across grade level at each site

• Psychomotor challenges re Computer & TEIs greater than thought

Year 2 Study 2016-17

Our Need for Year 2 Study

Improving the assessment/analysis approach so as to…

• minimize Competition and maximize Collaboration

• feature a Teacher-Driven approach to assessment as opposed to a District-Dictated approach

• Move to school (grade level) based Testlets

• Accelerate Formative Interaction towards FORMATIVE Assessment

• Accelerate help for psychomotor aspects of computer & TEI

Year 2 Research Questions

Analysis of Fidelity of Implementation

What is the…

• 1. relationship between Professional Learning & Testlet Quality?

• 2. relationship between Professional Learning & SA Outcomes?

• 3. relationship between Testlet Quality & SA Outcomes?

Pilot & Year 2 Structure

Characteristic | 2015-16 | 2016-17
Grade Level | 5th Grade | 3rd, 4th, & 5th Grades
Content Area | Math | Math
Treatment Group | 6 Sites (Random) | 12 Sites
Treatment Assignment | 8 Item Testlet | 8 Item Testlet
Testlet Unique to | Each Teacher (27) | Each Site (12)
Items Selected by | Teachers (4 Items) & Expert (4 Items) | Grade Team, Expert, or Both
Control | 6 Sites (Random) |
Testlet Unique to | Each Teacher (27) |
Assignment Iterations | 5 | 9
Iteration Interval | Approx. 6 weeks | Approx. 4 weeks
Administration Window | ~2 weeks | ~4 weeks

Year-2 Indicators

• Proximal = Testlet Quality

• Distal = Summative Assessment Outcomes

Mediation Study

Professional Development → formative Tool → Summative Assessment Results

Year 2 PD

• Continuation of Pilot PD

• More explicit Direction of Pilot PD

• More detail on Formative Techniques

• Provision and Explanation of Formative Tools (e.g., Low Tech to High Tech Process and Power Points)

• Greater emphasis on Computer & TEI re psychomotor implications

• More On-Demand On-Site PD

Year 2 Testlet System

Candidate Item List Published

Site Grade Levels Study List

GL Chooses 8 Items

Testlets Created in EADMS

Testlet Numbers Sent to Teachers

Teachers Admin Testlets

Scores Captured in EADMS

Results Harvested from EADMS

GL 4 & Expert 4 Items

Expert Chooses 8 Items

Results Analyzed W/R Summative Assessment

Teachers Use Results Formatively

formative Tool

• Candidate List contains 20 items that are a representative mix of the 2 or 3 Priority Standards with regard to Claims, Item Types, Task Models, & DOK levels

• Model of ideal mix built

• Grade level Testlet mix compared to ideal model mix

• Each Testlet awarded an integer score on a scale from 0 to 8

[Table: Candidate List items 1 to 20, with each site's chosen items marked X and the resulting quality score Q]

Site | Items marked X (of 20) | Q
Site A | 8 | 1
Site B | 8 | 4
Site C | 8 | 8

STND: 5NF3, 5NF4, 5NF6

DOK: DOK 1, DOK 2, DOK 3

TM: TM1a, TM1b, TM2, TM3, TM4a, TM4b, TM6a, TM6b, TM1, TM2c

Claim: Claim 1 Target F, C2A, C3D

Model: 2 Items, 1 Item, 1 or 2 Items, 2 Items, 1 or 2 Items

formative Tool Evaluation

Killer Item 11 C1TF DOK-2

Killer Item 19 C2TA DOK-3

Year 2 Results

Results pertain to:

• Extending the model presented previously

• Relationships among elements (describing implementation)

• Relationships with outcomes

Caveats:

• Lack of true experimental control

• Aggregate Data

• Single Testlet

Year 2 Results

• What might have influenced teachers in selecting items (correlations)?

Influence | Implementation | Rigor
Pygmalion effect (last year’s 5th graders) | .64 | .12
Responsiveness (current students as 4th graders) | .03 | .47
Most recent performance | ? | ?

Year 2 Results

Change in performance from 4th to 5th grade | Correlation
Fidelity of implementation | .43
Rigor (School/testlet contribution to ability) | .44

Year 2 Results (modeling testlets)

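Use a Multilevel Measurement Model

Cross-classified Three Level (items within Students and Schools)

Level-1 Model

$$\mathrm{Prob}(\mathrm{RESPONSE}_{ijk} = 1 \mid \pi_{jk}) = \phi_{ijk}$$

$$\log\left[\frac{\phi_{ijk}}{1 - \phi_{ijk}}\right] = \eta_{ijk}$$

$$\begin{aligned}
\eta_{ijk} = {} & \pi_{0jk} + \pi_{1jk}\,\mathrm{ITEM1}_{ijk} + \pi_{2jk}\,\mathrm{ITEM2}_{ijk} + \pi_{3jk}\,\mathrm{ITEM5}_{ijk} + \pi_{4jk}\,\mathrm{ITEM6}_{ijk} \\
& + \pi_{5jk}\,\mathrm{ITEM7}_{ijk} + \pi_{6jk}\,\mathrm{ITEM8}_{ijk} + \pi_{7jk}\,\mathrm{ITEM9}_{ijk} + \pi_{8jk}\,\mathrm{ITEM10}_{ijk} \\
& + \pi_{9jk}\,\mathrm{ITEM11}_{ijk} + \pi_{10jk}\,\mathrm{ITEM12}_{ijk} + \pi_{11jk}\,\mathrm{ITEM13}_{ijk} + \pi_{12jk}\,\mathrm{ITEM14}_{ijk} \\
& + \pi_{13jk}\,\mathrm{ITEM15}_{ijk} + \pi_{14jk}\,\mathrm{ITEM16}_{ijk} + \pi_{15jk}\,\mathrm{ITEM17}_{ijk} + \pi_{16jk}\,\mathrm{ITEM18}_{ijk} \\
& + \pi_{17jk}\,\mathrm{ITEM19}_{ijk} + \pi_{18jk}\,\mathrm{ITEM20}_{ijk}
\end{aligned}$$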

Year 2 Results (modeling Testlets)

Between Students within Testlets

$$\pi_{0jk} = \beta_{00k} + r_{0jk}$$

$$\pi_{qjk} = \beta_{q0k}, \quad q = 1, \dots, 18$$

Between Testlets

$$\beta_{00k} = \gamma_{000} + u_{00k}$$

$$\beta_{q0k} = \gamma_{q00}, \quad q = 1, \dots, 18$$
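Substituting the between-student and between-testlet equations into the Level-1 model gives a log-odds of γ000 + γq00 + r0jk + u00k for an item with fixed effect γq00 (items with no indicator term contribute no item offset). The sketch below turns that sum into a response probability. The function name and all numeric values are hypothetical, chosen only to show the arithmetic; they are not estimates from the study.

```python
import math


def response_probability(gamma_000, item_effect, r_student, u_testlet):
    """Probability of a correct response under the three-level logit model.

    gamma_000   : grand intercept
    item_effect : fixed item effect (0.0 for an item without an indicator)
    r_student   : student-level random effect r_0jk
    u_testlet   : testlet-level random effect u_00k
    """
    eta = gamma_000 + item_effect + r_student + u_testlet  # log-odds
    return 1.0 / (1.0 + math.exp(-eta))  # inverse logit


# Hypothetical values for illustration only.
p_average = response_probability(0.0, 0.0, 0.0, 0.0)    # log-odds of zero
p_hard_item = response_probability(0.0, -1.0, 0.0, 0.0)  # negative item effect
```

A negative item effect lowers the log-odds and therefore the probability, so in this parameterization harder items have more negative effects.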

Year 2 Results (testlet items)

[Figure: ICCs for the testlet items. Vertical axis 0.000 to 1.000; horizontal axis -4.00 to 4.00.]

Year 2 Results (testlet items)

[Figure: Item Information Functions for the testlet items. Vertical axis 0.000 to 0.300; horizontal axis -4.00 to 4.00.]
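For a logit model with item intercepts only (no item-specific slopes), as in the measurement model on the earlier slides, curves of this kind can be computed as below. This is a sketch under assumptions: that "ICCs" here means item characteristic curves, that the horizontal axis is ability on the logit scale, and that item effects enter additively. The item effect values are hypothetical.

```python
import math


def icc(theta, item_effect):
    """Item characteristic curve: P(correct) at ability theta."""
    return 1.0 / (1.0 + math.exp(-(theta + item_effect)))


def item_information(theta, item_effect):
    """Item information for an intercept-only logit item: P * (1 - P)."""
    p = icc(theta, item_effect)
    return p * (1.0 - p)


# Ability grid from -4.00 to 4.00 in steps of 0.5, matching the plot range.
thetas = [x / 2 for x in range(-8, 9)]

# Hypothetical item effects: an easier item and a harder item.
for effect in (1.0, -1.0):
    curve = [round(icc(t, effect), 3) for t in thetas]
    info = [round(item_information(t, effect), 3) for t in thetas]
    print(effect, curve[8], info[8])  # values at theta = 0
```

Under these assumptions each item's information peaks at 0.25, at the ability where the probability of a correct response is 0.5, which is consistent with a vertical axis that tops out at 0.300.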

Year 2 Results

Ability by School

Year 2 Results (TBD)

Model individual student data.

Link students to teachers in order to create variation within testlet between teachers.

Incorporate longitudinal aspect.

Year 2 Challenges (Observation & Survey)

• Administration window narrower than instructional period— Saturation

• Disconnects in understanding the analytic processes

• Disconnects in understanding the nature of the formative use process

• Each of these will be attacked in Year 3

• Computer & TEI psychomotor problems still abound

• Still see some resistance and apathy, but “Buy-In” traction increasing

• Need for Student Voice

Example of Resistance/Apathy Indicator

Teacher Survey Highlights

1. More training on purpose of FORMATIVE Assessment and Formative Interaction

2. More training on how to perform “formative analysis” of results

3. Liked that assessments were common among grade level teachers at site (feedback from 15-16 survey) to facilitate group analysis and collaboration

Teacher Survey Highlights

4. Liked easy access to what they judged to be well-written items that challenged students

5. Liked the concept of giving teachers more control

Year 3 Study 2017-18

Year-3 Next Steps

• Systematic Districtwide PD on Formative Assessment Techniques (Using both Internal & External Assets, e.g., SVMI)

• Formative use of both Performance Tasks and unused CL items (Monitoring)

• Continuation of the 3-5 Testlet Process (& Include 1&2)

• Real Time Monitoring of the Testlet Results and providing Expert Commentary to Sites

• Develop & implement Computer/TEI psychomotor solutions

• Develop & implement Student Voice instruments

• Measure degree of teacher uptake of PD

Role of the Provided Items in Year 3

• Chosen Testlet to act as Short-Cycle Interim that is formatively analyzed and acted on

• Choose remaining CL items (and other items) to be used with the FORMATIVE Assessment process

• Develop the use of well-written Performance Tasks (PTs) in the FORMATIVE Assessment process

Characteristic | 2015-16 | 2016-17 | 2017-2018
Grade Level | 5th Grade | Grades 3 to 5 | Grades 1 to 5
Content Area | Math | Math | Math
Treatment Group | 6 Sites (Random) | 12 Sites | 12 Sites
Treatment Assignment | 8 Item Testlet | 8 Item Testlet | 8 Item Testlet, Lesson Items
Testlet Unique to | Each Teacher (27) | Each Site (12) | Each Site (12)
Items Selected by | Teachers (4) & Expert (4) | Grade Team, Expert, or Both | Grade Team, Expert, or Both
Control | 6 Sites (Random) | |
Testlet Unique to | Each Teacher (27) | |
Assignment Iterations | 5 | 9 | 12
Iteration Interval | Approx. 6 weeks | Approx. 4 weeks | 2-3 Weeks
Admin Window | ~2 weeks | ~2 weeks | Same as Instruction Window
Window Alignment | Testlet last half Instruction | Testlet ~last half Instruction | Testlet and Instruction the same
 | | | Add Performance Tasks
Emphasis | Testlet to SA Relationship | Formative Use of Testlets | FORMATIVE Assessment

Thank You!

Michael R. McCormick [email protected]

@ValVerdeSupt

Jennifer M. Doskocil [email protected]

Pete Goldschmidt [email protected]

Sandy Sanford [email protected]

www.k12sharing.org Password “Discover”
