Improving Test-Taking Effort in Low-Stakes Group-Based Educational Testing: A Meta-Analysis of Interventions
Joseph A. Rios & Ou L. Liu, Educational Testing Service
Presentation delivered at the annual AERA conference on April 13, 2018, in New York City
What is Noneffortful Responding?
• Nonsystematic responding with intentional disregard for item content due to low test-taking effort
[Diagram: low test-taking effort may manifest as either no response or a noneffortful response]
Impact on Evaluation of Measurement Properties
• Inflated difficulty parameters (van Barneveld, 2007)
• Inflated internal consistency reliability (Wise, 2009)
• Increased Type I error in DIF analyses (DeMars & Wise, 2010)
• Biased predictive validity coefficients (Wise, 2009)
• Biased linking coefficients (Mittelhaëuser, Béguin, & Sijtsma, 2015)
Impact of Noneffortful Responding on Aggregated Scores (Rios et al., 2017)
[Figure: Standardized difference (d) between observed and true aggregated scores (y-axis, 0 to -1.0) plotted against the proportion of noneffortful responses in the total sample (1% to 25%), shown separately for easy, moderate, and hard unrelated-item conditions.]
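To make the pattern in the figure concrete, the minimal simulation below (not part of the original study; every parameter value is an illustrative assumption, and it simplifies by treating noneffortful responding as whole examinees guessing at chance) shows how the observed group mean falls below the true mean as the share of noneffortful responding grows, expressed as a standardized difference (observed minus true).

```python
import numpy as np

rng = np.random.default_rng(0)

def standardized_difference(p_noneffortful, n_examinees=10_000, n_items=40,
                            p_correct_true=0.6, p_guess=0.25):
    """Standardized difference (observed - true) in mean proportion-correct
    when a share of examinees respond noneffortfully (random guessing).
    All parameter values are illustrative, not taken from Rios et al. (2017)."""
    # True (fully effortful) proportion-correct scores for every examinee
    true_scores = rng.binomial(n_items, p_correct_true, n_examinees) / n_items
    # Flag a random subset of examinees as noneffortful responders
    noneffortful = rng.random(n_examinees) < p_noneffortful
    # Noneffortful responders score at the chance level instead of their true level
    guess_scores = rng.binomial(n_items, p_guess, n_examinees) / n_items
    observed = np.where(noneffortful, guess_scores, true_scores)
    # Standardized difference between observed and true group means
    return (observed.mean() - true_scores.mean()) / true_scores.std(ddof=1)

for p in [0.01, 0.025, 0.05, 0.0625, 0.125, 0.25]:
    print(f"{p:>7.2%} noneffortful -> d = {standardized_difference(p):.2f}")
```

As in the figure, the standardized difference becomes increasingly negative as the proportion of noneffortful responses rises.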
Impact on Score Interpretations
• Treatment effects (Osborne & Blanchard, 2011)
• Math achievement gaps between: (a) males and females, and (b) Black and White students (Soland, in press)
• Evaluation ratings of school personnel when assessing student growth (Wise, Ma, Cronin, & Theaker, 2013)
• Country-level comparisons (Debeer, Buchholz, Hartig, & Janssen, 2014)
How Do We Increase Test-Taking Effort?
Motivation Interventions

• Increase Test Relevance
  Objective: improve the importance that students place on the assessment results.
  Examples: explain how the assessment results will be used to improve classroom instruction, curriculum, and institutional reputation.

• Modify Assessment Design
  Objective: alter test content or administration procedures to improve interest and/or effort.
  Examples: avoid lengthy item stems, limit open-ended responses, align content with students' interests, use game-design features.

• Promise Feedback
  Objective: provide performance-contingent feedback that is informative of students' competence.
  Examples: individual-level score reports.

• External Incentive
  Objective: improve performance by providing performance-contingent rewards.
  Examples: give students money for every correct item; a certificate of achievement for meeting proficiency.
Study Objective
• Conducted a meta-analysis of studies that included interventions to improve test-taking effort and performance in low-stakes, group-based educational testing contexts
1. What is the overall impact of interventions on improving test-taking effort and test performance?
2. What are the contextual variables (e.g., participant, methodological, and assessment characteristics) that moderate the impact of such interventions?
3. Which intervention type is most effective in improving test-taking effort and/or test performance?
Search Strategy
• Four distinct search strategies were employed: (a) a reference database search, (b) internet browsing, (c) expert consultation, and (d) citation searches (both backward and forward)
• Keywords used were: “test taking” AND “motivation”
• This literature search was completed between July 28, 2016 and August 23, 2016.
Eligibility Criteria
• Participants had to be in K-12 or higher education settings
• Studies had to include a control condition that did not deviate from standard testing practice
• Studies had to report quantitative results for test-taking effort and/or test performance outcome measures
Variable Coding
• The following variables were coded for each study (an example coding record is sketched below):
  • participant age (K-12 vs. higher education)
  • percentage of female participants
  • randomized vs. partially randomized sampling
  • participant recruitment strategy
  • performance measure item type
  • length of performance measure
  • publication type
  • intervention type
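As an illustration of how each retained study might be represented during coding, here is a minimal sketch of one coding-sheet row; the field names and example values are assumptions for illustration, not the authors' actual coding instrument.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CodedStudy:
    """One row of a hypothetical coding sheet (all field names are illustrative)."""
    study_id: str
    age_group: str               # "K-12" or "higher education"
    pct_female: Optional[float]  # percentage of female participants
    randomized: bool             # randomized vs. partially randomized sampling
    recruitment: str             # participant recruitment strategy
    item_type: str               # performance measure item type
    test_length: Optional[int]   # length of the performance measure (items)
    publication_type: str        # "published" or "grey"
    intervention: str            # e.g., "external incentive"

example = CodedStudy(
    study_id="hypothetical_2010_a", age_group="higher education",
    pct_female=55.0, randomized=True, recruitment="volunteer",
    item_type="multiple choice", test_length=40,
    publication_type="published", intervention="promising feedback",
)
```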
Analyses
• Effect sizes were calculated using Cohen's d (a computational sketch follows this slide)
• Publication bias was examined via the funnel plot and the trim-and-fill method
• Outliers were identified and down-weighted based on a sensitivity analysis
• Meta-regression was used to calculate average effect sizes and conduct moderator analyses using the robust variance estimation (RVE) procedure
  • RVE helps mitigate artificial reduction of variance estimates and inflation of Type I error due to effect size dependencies
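A minimal sketch of the first analytic step, computing Cohen's d for a treatment-control contrast together with a common approximation to its sampling variance; these are the standard textbook formulas, not code from the study, and the data are made up. The RVE meta-regression itself is not reproduced here and would normally be fit with specialized meta-analysis software.

```python
import numpy as np

def cohens_d(treatment, control):
    """Cohen's d (standardized mean difference) with pooled SD,
    plus a common large-sample approximation to its sampling variance."""
    t, c = np.asarray(treatment, float), np.asarray(control, float)
    n_t, n_c = len(t), len(c)
    # Pooled standard deviation across the two conditions
    sp = np.sqrt(((n_t - 1) * t.var(ddof=1) + (n_c - 1) * c.var(ddof=1))
                 / (n_t + n_c - 2))
    d = (t.mean() - c.mean()) / sp
    # Approximate sampling variance of d
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    return d, var_d

rng = np.random.default_rng(1)
incentive = rng.normal(0.2, 1.0, 120)  # hypothetical intervention condition
control = rng.normal(0.0, 1.0, 120)    # hypothetical standard-administration control
print(cohens_d(incentive, control))
```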
Results
• 5,556 studies were examined and 45 were retained
• 25 studies were published in journals and 20 came from the grey literature
• 44 effect sizes of test-taking effort and 87 effect sizes of test performance based on 15,962 participants
• 47% of studies included (21 out of 45) were published since 2010
• Only 7 out of the 45 studies were conducted outside of the United States
Evaluating Publication Bias and Outliers
Average Effect Sizes and Heterogeneity
Dependent Variable | k | n | M [95% CI] | SE | p | I²
Test-taking effort | 22 | 44 | 0.20 [0.10, 0.31] | 0.05 | <.001 | 78.61%
Test performance | 40 | 87 | 0.13 [0.06, 0.19] | 0.03 | <.01 | 68.92%

Note. k = number of studies; n = number of effect sizes; CI = confidence interval
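For readers unfamiliar with I², it is the standard Higgins-Thompson heterogeneity index; the conventional definition (not given on the slide) is

$$
I^2 = \max\!\left(0,\; \frac{Q - (n-1)}{Q}\right) \times 100\%,
\qquad
Q = \sum_{i=1}^{n} w_i \left(d_i - \bar{d}\right)^2,
\quad w_i = \frac{1}{v_i},
\quad \bar{d} = \frac{\sum_i w_i d_i}{\sum_i w_i},
$$

where $d_i$ is the $i$th effect size, $v_i$ its sampling variance, and $n$ the number of effect sizes. Values like 78.61% indicate that most of the observed variability in effect sizes reflects true heterogeneity rather than sampling error.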
Moderator Analysis
Test-Taking Effort

Moderator | k | n | Estimate | SE
Age | 21 | 41 | -.13 | .11
Gender | 21 | 41 | -.04 | .05
Sampling | 22 | 44 | .06 | .09
Recruitment Strategy | 22 | 44 | -.30** | .10
Item Type | 16 | 26 | -.08 | .21
Test Length | 16 | 26 | -.01 | .08

Note. ** p < .001

• No significant moderators were observed for the test performance dependent variable
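As a much-simplified illustration of what the Estimate column represents, the sketch below fits a weighted least-squares meta-regression of effect sizes on a dichotomous moderator. It uses plain fixed-effect inverse-variance weights and ignores the dependency-robust RVE adjustments reported in the study; all values are made up.

```python
import numpy as np

# Hypothetical effect sizes, their sampling variances, and a 0/1 moderator
# (e.g., recruitment strategy); none of these values come from the meta-analysis.
d = np.array([0.35, 0.28, 0.41, 0.05, 0.02, 0.10])
v = np.array([0.02, 0.03, 0.04, 0.02, 0.05, 0.03])
x = np.array([0, 0, 0, 1, 1, 1])  # moderator indicator

W = np.diag(1.0 / v)                        # inverse-variance weights
X = np.column_stack([np.ones_like(d), x])   # intercept + moderator
# Weighted least-squares estimate: (X'WX)^{-1} X'W d
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ d)
print(f"intercept = {beta[0]:.2f}, moderator slope = {beta[1]:.2f}")
```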
Publication Type as a Moderator

Dependent Variable | Publication Type | k | n | M [95% CI] | SE | p
Test-Taking Effort | Published | 9 | 21 | 0.38 [0.17, 0.58] | 0.10 | <.001
Test-Taking Effort | Grey | 13 | 23 | 0.08 [-0.01, 0.17] | 0.04 | .09
Test Performance | Published | 22 | 44 | 0.17 [0.05, 0.29] | 0.06 | .09
Test Performance | Grey | 18 | 43 | 0.07 [0, 0.14] | 0.03 | <.05

Note. k = number of studies; n = number of effect sizes; CI = confidence interval
Average Effect Sizes by Intervention Type

Test-Taking Effort
Intervention Type | k | n | M [95% CI]
Promising Feedback | 8 | 12 | 0.30 [0.005, 0.56]
Test Relevance | 5 | 8 | 0.24 [-0.013, 0.61]
Assessment Design | 6 | 8 | 0.44 [0.29, 0.59]
External Incentives | 7 | 9 | 0.52 [0.30, 0.73]

Test Performance
Intervention Type | k | n | M [95% CI]
Promising Feedback | 13 | 19 | 0.10 [-0.02, 0.22]
Test Relevance | 11 | 21 | 0.20 [-0.12, 0.33]
Assessment Design | 12 | 17 | 0.08 [-0.15, 0.11]
External Incentives | 14 | 22 | 0.14 [-0.14, 0.22]

Note. k = number of studies; n = number of effect sizes
Post-hoc Analysis of External Incentives Intervention
• Examined inverse variance-weighted average effect sizes by subtype for the external incentives intervention
• The relatively low average effect size of the external incentives intervention type was deflated by the inclusion of the monetary incentives approach
Test Performance

Sub-Intervention | k | n | Average Effect Size | 95% CI
Monetary | 8 | 13 | .07 | [.01, .12]
Nonmonetary | 6 | 7 | .23 | [.15, .31]
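For reference, an inverse variance-weighted average effect size like those in the table can be computed as in the minimal fixed-effect sketch below; the values are illustrative only, and the study's reported estimates additionally involve random-effects weighting and RVE.

```python
import numpy as np

def weighted_average_effect(d, v):
    """Fixed-effect inverse-variance weighted mean effect size with a 95% CI."""
    d, v = np.asarray(d, float), np.asarray(v, float)
    w = 1.0 / v                        # inverse-variance weights
    d_bar = np.sum(w * d) / np.sum(w)  # weighted mean effect size
    se = np.sqrt(1.0 / np.sum(w))      # standard error of the weighted mean
    return d_bar, (d_bar - 1.96 * se, d_bar + 1.96 * se)

# Hypothetical nonmonetary-incentive effect sizes and their variances (made up)
print(weighted_average_effect([0.30, 0.18, 0.25], [0.01, 0.02, 0.015]))
```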
Implications
• How participants are recruited may impact the effectiveness of test-taking effort interventions
• Providing students with performance-contingent nonmonetary incentives may improve both test-taking effort and test performance on low-stakes educational assessments
• Practitioners should avoid solely relying on the use of test-taking effort interventions to improve the validity of inferences made from low-stakes assessments
THANK YOU!
Joseph A. Rios, [email protected]