PARCC Field Test Study Comparability of High School Mathematics End-of-Course Assessments National Conference on Student Assessment San Diego June 2015


Overview

• PARCC field test EOC study design
• Statistical analysis
• SME review of item maps

Introduction

• To assist states in aligning instruction to the CCSSM, model course pathways were developed for High School mathematics with standards organized into two sequences of coursework designed to lead to college and career readiness and to prepare students for study in more advanced mathematics courses.
  – The Traditional pathway is based on organization of high school course work typically seen in the United States.
    o It includes two algebra courses and a geometry course, with some data, probability and statistics included in each course. (Algebra 1, Geometry, Algebra 2)
  – The Integrated pathway provides a more integrated approach to secondary mathematics that is less common in the United States, but typical internationally.
    o It includes a sequence of three courses, each of which includes number, algebra, geometry, probability and statistics. (Integrated Mathematics 1, 2, 3)

Study Overview

The HS EOC comparability study was designed to address the following research questions:

1. What degree of comparability (e.g., linked or concorded) can be achieved between the assessments of the two course sequences? Can the comparability be achieved at the course level or only at the aggregate level?

2. How do the psychometric properties of items that are used in assessments of both course sequences compare? More specifically, can a single calibration suffice for an item used in both course sequences or must an item be separately calibrated for use in each?

Overview of Field Test Design

• To the extent possible, the Field Test was designed to reflect future operational administrations
  – 2 separate administrations: PBA in March, EOY in April
  – Dual mode administration
  – PBA and EOY field test forms constructed to full operational test blueprints and requirements
• FT data collection design
  – 2 conditions: 1) Full summative (FS, PBA+EOY), 2) PBA or EOY but not both
  – Linking through common items across forms and conditions, and randomly equivalent groups
  – Oversampling to reach target sample size of 1,200 valid cases per form
  – Initial design of 6 FS forms per test title for scoring/scaling and research studies; modified in response to recruitment challenges

EOC Study Data Collection

• Primary FT data (CBT as per RFP)
  – Traditional and Integrated forms with common items
    o Original design had 6 Condition 1 (FS, PBA & EOY) forms for each EOC
  – Number of forms reduced for all EOCs, with greater reduction and redistribution for Integrated
    o Linkage across same-level courses (Alg1/Math1, Geometry/Math2, Alg2/Math3), and diagonally as per PARCC frameworks
  – For each EOC
    o Sample recruitment challenges, sought volunteers
    o Target of 1,200 valid cases per form not met despite forms reduction; persistent gaps for Integrated Math

Data Status – Traditional Math

All forms below are Test Condition Cond 1 FS (PBA+EOY). Core Items = number of core items per form; Score Points = possible score points per form.

                 PBA                                            EOY
Test       Form  Valid Cases  Core Items  Score Points   Form  Valid Cases  Core Items  Score Points
Algebra 1  1     1,284        12          16             1     1,165        28          39
Algebra 1  2     1,127        14          25             2     1,094        28          39
Algebra 1  3     1,168        13          27             3     1,171        29          46
Algebra 1  4     937          10          22             4     1,107        26          37
Algebra 1  5     1,269        10          19             5     1,288        21          29
Geometry   1     1,426        12          22             1     1,169        29          45
Geometry   2     1,319        8           13             2     959          26          36
Geometry   3     1,252        10          20             3     1,075        27          40
Geometry   4     1,204        14          29             4     1,033        27          40
Geometry   5     1,302        12          22             5     1,178        30          47
Algebra 2  1     1,318        12          29             1     1,132        25          40
Algebra 2  2     1,191        6           6              2     1,172        26          41
Algebra 2  3     1,107        9           12             3     1,063        25          42
Algebra 2  4     1,623        10          12             4     1,294        23          38

Data Status – Integrated Math

Core Items = number of core items per form; Score Points = possible score points per form.

Test               Test Condition        Part  Form  Valid Cases  Core Items  Score Points
Integrated Math 1  Cond 1 FS (PBA+EOY)   PBA   1     472          13          24
Integrated Math 1  Cond 1 FS (PBA+EOY)   EOY   1     640          26          38
Integrated Math 1  Cond 2 PBA            PBA   2     1,114        14          29
Integrated Math 1  Cond 2 EOY            EOY   2     516          26          40
Integrated Math 1  Cond 2 EOY            EOY   3     478          25          33
Integrated Math 2  Cond 1 FS (PBA+EOY)   PBA   1     414          10          10
Integrated Math 2  Cond 1 FS (PBA+EOY)   EOY   1     429          20          26
Integrated Math 2  Cond 2 PBA            PBA   2     882          9           11
Integrated Math 2  Cond 2 EOY            EOY   2     289          20          27
Integrated Math 2  Cond 2 EOY            EOY   3     307          20          25
Integrated Math 3  Cond 1 FS (PBA+EOY)   PBA   1     320          8           13
Integrated Math 3  Cond 1 FS (PBA+EOY)   EOY   1     286          23          31
Integrated Math 3  Cond 2 PBA            PBA   2     848          10          14
Integrated Math 3  Cond 2 EOY            EOY   2     316          23          36
Integrated Math 3  Cond 2 EOY            EOY   3     329          24          34

Core Items, N Common (Points)

Type of link: Pathway. Each row links a Traditional Condition 1 test to its Integrated counterpart. Cells give the number of common core items, with points in parentheses; "-" marks a cell that is blank in the source.

EOC Linkage               PBA                                       EOY
Traditional  Integrated   Integ. Cond 1  Integ. Cond 2  Total      Integ. Cond 1  Integ. Cond 2  Total
ALG1         IM1          6(15)          4(9)           10(24)     7(10)          18(23)         25(33)
GEOM         IM2          -              1(3)           1(3)       3(4)           8(9)           11(13)
ALG2         IM3          2(7)           2(2)           4(9)       8(10)          16(26)         24(36)

Analysis Plan

• Classical item analysis: cross-sequence examination of relative item difficulties
• Cross-sequence DIF
• Comparative analyses of factor structure
• Cross-sequence linking
  – Separate calibrations (1PL), linking with mean-mean procedure
• Item maps
  – For examination of consistency of item difficulties
  – For examination of consistency of meaning of scores at key points with respect to KSAs

Item Difficulty for Common Items

• Calculate summary statistics of item difficulties (p-values) for common items administered in each pathway
• Convert common item p-values to z-scores and plot to examine the consistency of relative difficulty across the pathways

$$z_{it} = \frac{p_{it} - \bar{p}_{t}}{s_{p_t}}$$

Here $p_{it}$ is the p-value of common item $i$ on test $t$, and $\bar{p}_{t}$ and $s_{p_t}$ are the mean and standard deviation of the common-item p-values on test $t$.
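The z-score conversion of common-item p-values, and the correlation reported on the z-value plots, can be sketched as follows. This is a minimal illustration, not the study's code: the p-values are hypothetical, the function names are invented for the example, and the use of the sample standard deviation (n - 1) is an assumption.

```python
from statistics import mean, stdev


def z_scores(p_values):
    """Standardize item p-values within one test: z = (p - mean) / sd."""
    m = mean(p_values)
    s = stdev(p_values)  # sample SD (n - 1); an assumption, not from the slides
    return [(p - m) / s for p in p_values]


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical p-values for five common items (illustrative values only).
p_algebra1 = [0.62, 0.48, 0.35, 0.71, 0.22]
p_math1 = [0.58, 0.41, 0.33, 0.69, 0.15]

z_algebra1 = z_scores(p_algebra1)
z_math1 = z_scores(p_math1)

# Correlation of relative difficulty across the two pathways.
r = pearson_r(z_algebra1, z_math1)
```

Because standardization is a linear transformation within each test, the correlation of the z-scores equals the correlation of the raw p-values; the z-score plot mainly removes overall difficulty differences between the two populations so that relative difficulty can be compared visually.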

[Figure] Z-Value Plot: PBA, Algebra 1 vs. Mathematics 1. 2014 FT z-value comparison; x-axis: Zvalue_Algebra 1; y-axis: Zvalue_Int. Math 1; both axes run from -3 to 3. R = .989

[Figure] Z-Value Plot: EOY, Algebra 1 vs. Mathematics 1. 2014 FT z-value comparison; x-axis: Zvalue_Algebra 1; y-axis: Zvalue_Int. Math 1; both axes run from -3 to 3. R = .959

[Figure] Z-Value Plot: EOY, Geometry vs. Mathematics 2. 2014 FT z-value comparison; x-axis: Zvalue_GEO; y-axis: Zvalue_Int. Math 2; both axes run from -3 to 3. R = .856

[Figure] Z-Value Plot: EOY, Algebra 2 vs. Mathematics 3. 2014 FT z-value comparison; x-axis: Zvalue_Algebra 2; y-axis: Zvalue_Int. Math 3; both axes run from -3 to 3. R = .929

Z-Score Summary

• Algebra 1, Mathematics 1: Correlations indicate consistency of common item relative difficulty in the two EOC populations, at levels considered sufficient to support linking

• Geometry, Mathematics 2: Lower correlation, typically considered insufficient for linking

• Algebra 2, Mathematics 3: Correlation at level considered sufficient to support linking

Students per EOC Test Item

EOC            Number of Items  Average Sample Size  Minimum Sample Size
Algebra 1      142              1160                 825
Geometry       104              1213                 1035
Algebra 2      149              1144                 935
Mathematics 1  94               616                  444
Mathematics 2  62               417                  266
Mathematics 3  79               372                  280


Separate Calibrations, Linking

• For dichotomous items, the 1PL model (Rasch)

• For polytomous items, the one-parameter partial credit (1PPC) model

• After separate calibrations, examined correlations of item difficulty parameter estimates for the EOC pair common items.

• Item parameter estimates for each EOC course pair were placed on the same scale using the common item linking mean-mean procedure.
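The mean-mean step can be sketched as below. This is a minimal illustration under stated assumptions, not the study's procedure: with a 1PL model and the same fixed discrimination in both calibrations, the transformation is assumed to reduce to a single additive shift (slope 1). The function names and all difficulty values are hypothetical.

```python
from statistics import mean


def mean_mean_shift(b_base_common, b_new_common):
    """Constant that moves the new calibration onto the base scale.

    Assumes a 1PL (Rasch) model with the same fixed discrimination in
    both calibrations, so the mean-mean transformation has slope 1.
    """
    if len(b_base_common) != len(b_new_common):
        raise ValueError("common-item lists must be the same length")
    return mean(b_base_common) - mean(b_new_common)


def apply_shift(difficulties, shift):
    """Place difficulty estimates from the new calibration on the base scale."""
    return [b + shift for b in difficulties]


# Hypothetical difficulty estimates for three common items,
# calibrated separately in each course (illustrative values only).
b_algebra1_common = [-0.50, 0.20, 1.10]
b_math1_common = [-0.90, -0.10, 0.70]

shift = mean_mean_shift(b_algebra1_common, b_math1_common)

# Every Mathematics 1 item, common or course-specific, gets the same shift.
b_math1_all = [-0.90, -0.10, 0.70, 0.35, 1.40]
b_math1_linked = apply_shift(b_math1_all, shift)
```

After the shift, the common items have the same mean difficulty on both scales, which is the defining property of the mean-mean method; how closely the individual common-item estimates agree is what the correlations summarize.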

Correlations of Common Item Difficulty Parameter Estimates

• Algebra 1 with Mathematics 1: .92
• Algebra 2 with Mathematics 3: .92
• Geometry with Mathematics 2: .84


Item Maps

• Item maps for each course included both course-specific items and common items, separately identified.

• The common items provide the vehicle for aligning the items from the two courses.

• The location of each item on the map is based on a specified response probability (RP67)

- Metric: Scale score=(RP67 theta * 100) + 40022
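For a dichotomous Rasch item, the RP67 location is the ability at which the probability of a correct response is 0.67. A minimal sketch follows, assuming the standard Rasch response function; polytomous (1PPC) items need a different treatment that is not shown. The function names are hypothetical, the slope of 100 follows the metric line, and the additive constant is left as a required argument rather than assumed.

```python
import math


def rasch_p(theta, b):
    """Probability of a correct response under the Rasch (1PL) model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def rp_theta(b, rp=0.67):
    """Ability at which a Rasch item of difficulty b is answered
    correctly with probability rp (RP67 when rp = 0.67)."""
    return b + math.log(rp / (1.0 - rp))


def scale_score(theta, slope, intercept):
    """Linear transformation from the theta metric to the reporting metric."""
    return slope * theta + intercept


# Hypothetical item difficulty (illustrative value only).
b_item = 0.30
theta67 = rp_theta(b_item)

# intercept=0.0 is a placeholder only, not a value taken from the study.
location = scale_score(theta67, slope=100.0, intercept=0.0)
```

Under this model the RP67 location sits about 0.71 logits above the item difficulty, so every dichotomous item moves up the map by the same amount relative to its difficulty estimate.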

Item Map: Algebra 1 vs. Int. Math 1

Expert Review (Subject Matter Experts) Rating Scale

Question: Does obtaining a Score of X (showing what a student knows and can do in terms of item content) for Test I match what it means to obtain a Score of X in Test II?

Responses:
1 Yes, very much so
2 For the most part, but there are some differences
3 Somewhat, but weakly
4 No, not at all

Expert Review: First Set of Ratings

• Interpret the meaning of scores at key points on the scale in terms of the KSAs represented by the distribution of items in the vicinity of the score. Key scale scores: 550, 650, 750

• Review items located near the 3 scale points and interpret performance on the two tests. All items and item-specific information were provided.

• Side-by-side comparison of maps for designated Traditional-Integrated EOC pairs

• Compare the distribution of items on each item map

• Examine pattern of common item performance across EOCs, and relative to unique items within

Rating Tasks

Provide ratings at values of 550, 650, 750, and Overall for each of the following:

• Course level
  – Algebra 1 / Mathematics 1
  – Geometry / Mathematics 2
  – Algebra 2 / Mathematics 3
• Aggregate level (end of 3-course sequence)
  – Traditional Sequence / Integrated Sequence

Item Map: Geometry vs. Int. Math 2

Item Map: Algebra 2 vs. Int. Math 3

Ratings following Session 1: Algebra 1 with Mathematics 1

Comparison: Algebra 1 with Math 1

Scale Value  Average  1 Yes, very much  2 For the most part  3 Somewhat, but weakly  4 No, not at all
550          1.83     3                 1                    2                       0
650          1.83     3                 1                    2                       0
750          1.83     3                 1                    2                       0
Overall      2.00     2                 2                    2                       0

Ratings following Session 1: Geometry with Mathematics 2

Comparison: Geometry with Math 2

Scale Value  Average  1 Yes, very much  2 For the most part  3 Somewhat, but weakly  4 No, not at all
550          3.17     1                 1                    0                       4
650          2.83     1                 1                    2                       2
750          3.00     1                 1                    1                       3
Overall      3.17     0                 2                    1                       3
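The Average column in the Session 1 rating tables is consistent with a count-weighted mean of the rating codes 1 to 4 over six responses. The sketch below checks that arithmetic; the function name and data layout are assumptions made for illustration.

```python
def mean_rating(counts):
    """Mean rating given counts[k] = number of responses with rating k + 1."""
    n = sum(counts)
    return sum((k + 1) * c for k, c in enumerate(counts)) / n


# Response counts for ratings 1, 2, 3, 4, copied from the rating tables.
algebra1_math1 = {
    "550": [3, 1, 2, 0],
    "650": [3, 1, 2, 0],
    "750": [3, 1, 2, 0],
    "Overall": [2, 2, 2, 0],
}
geometry_math2 = {
    "550": [1, 1, 0, 4],
    "650": [1, 1, 2, 2],
    "750": [1, 1, 1, 3],
    "Overall": [0, 2, 1, 3],
}

# Averages rounded to two decimals, as in the tables.
averages_alg1 = {k: round(mean_rating(v), 2) for k, v in algebra1_math1.items()}
averages_geom = {k: round(mean_rating(v), 2) for k, v in geometry_math2.items()}
```

Lower averages indicate stronger agreement that a given score means the same thing on both tests, since 1 is "Yes, very much so" and 4 is "No, not at all".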


Group Discussion of Item Maps/Ratings

SMEs discussed results and were given the opportunity to change ratings during the 2nd meeting

• Second ratings for Algebra 1 / Mathematics 1 indicated less comparability than initial ratings

• Second ratings for Traditional Pathway with Integrated Pathway indicated more comparability than initial ratings

Item Mapping Summary

• Algebra 1 with Mathematics 1
  – Responses were close to evenly distributed among ratings of 1 to 3
• Algebra 2 with Mathematics 3
  – Modal response was (2) For the most part
  – 87.5% of responses were either 1 or 2
• Geometry with Mathematics 2
  – Modal response was that the math skills were not comparable
  – 67% of responses were either (3) Somewhat, but weakly or (4) No, not at all
• Aggregate level
  – Majority of the responses were (2) For the most part, but there are some differences

Limitations

• Results from field test data do not always translate directly to operational administration results.

• The small sample sizes, especially for the Integrated Mathematics courses, make firm conclusions problematic.

• Data from operational administrations should provide larger samples; the more stable results that follow should allow for firmer conclusions.

Conclusions

• The data suggest separate scales for Geometry and Mathematics 2
  – Concordance tables may be a possibility for aligning scores, if common item correlations are high enough; however, this will likely yield concordant scores that differ substantially in terms of meaning, that is, in terms of the underlying knowledge, skills, and abilities needed to obtain each score.

• For the Algebra 1/Mathematics 1 and Algebra 2/Mathematics 3 comparisons, the data from the small samples do not strongly support concurrent calibration.
  – Depending on operational results, options for reporting may include linking of the separate IRT scales to support a common reporting scale, or concordance tables to align scores.