The Effects of Peer Review and Reproducibility on Learning:
a Randomized Experiment
CAL 2011, Manchester - presented by Ian E. Holliday, Aston University
Introduction
● Problem: Stats/Math Education
(rote learning in high schools vs. non-rote learning in college)
Introduction
● Statistics Education (rote vs. non-rote learning)
● Constructivism (seems promising)
● Peer Review:
● different effects/roles: Reviewee vs. Reviewer (Van Gennip et al. 2009; Lundstrom et al. 2009; Strijbos et al. 2009)
● assumes Reproducibility
● requires technology (Wessa 2009)
Purpose of the Study
● Attempts to prove the causal effects of Peer Review (based on Reproducible Computing) on:
● Perceived utility of statistics
● Actual behavior (application of statistics)
● Non-rote learning (conceptual understanding)
● Attitude towards risk
● Setting:
● Fully randomized experiment
● Stock market environment/game
Experiment embedded in stats course
XSE Trading Screen
XSE is linked to Stats Software
The Challenge
● Implement a Market-Neutral Arbitrage Investment Strategy:
● Long Pile: contains stocks which are expected to rise
● Short Pile: contains stocks which are expected to drop
● Neutral Pile: contains stocks for which no prediction can be made. Note: in efficient markets, all stocks should go to the Neutral Pile
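The three piles can be illustrated with a minimal sketch. It assumes each stock comes with a forecast of its expected return and that a small threshold separates "no prediction" from a usable signal. The function name, tickers, forecasts, and threshold are hypothetical and are not taken from the XSE game.

```python
from typing import Dict, List


def sort_into_piles(expected_returns: Dict[str, float],
                    threshold: float = 0.01) -> Dict[str, List[str]]:
    """Sort stocks into long, short, and neutral piles.

    expected_returns maps a ticker to a forecast return (hypothetical input).
    threshold is an assumed cut-off below which no prediction is made.
    """
    piles: Dict[str, List[str]] = {"long": [], "short": [], "neutral": []}
    for ticker, forecast in expected_returns.items():
        if forecast > threshold:
            piles["long"].append(ticker)     # expected to rise
        elif forecast < -threshold:
            piles["short"].append(ticker)    # expected to drop
        else:
            piles["neutral"].append(ticker)  # no usable prediction
    return piles


# Made-up forecasts, for illustration only.
forecasts = {"AAA": 0.04, "BBB": -0.03, "CCC": 0.002}
piles = sort_into_piles(forecasts)

# In an efficient market no forecast clears the threshold,
# so every stock lands in the neutral pile.
efficient = sort_into_piles({"AAA": 0.0, "BBB": 0.0, "CCC": 0.0})
```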
Hypothesis 1
● Utility hypothesis (practical relevance)
Hypothesis 2
● Behavior hypothesis
Hypothesis 3
● Non-rote learning hypothesis
Hypothesis 4
● Attitude hypothesis
Cohorts / Groups
● Groups:
● Control group: students receive instructor-based feedback and need to correct mistakes from previous assignment
● Treatment group: students engage in peer review (based on solution provided by instructor)
● Cohorts:
● 2 full rounds of peer review
● 4 full rounds of peer review
2 Groups x 2 Cohorts for each Hypothesis
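As a rough illustration of random assignment within one cohort, the sketch below shuffles a list of student identifiers and splits it into a control and a treatment group. The slides do not describe the actual assignment procedure, so the function name, the seed, the equal split, and the student identifiers are all assumptions.

```python
import random
from typing import Dict, List


def assign_groups(student_ids: List[str], seed: int = 0) -> Dict[str, List[str]]:
    """Randomly split one cohort into control and treatment groups.

    Assumption: a simple shuffle-and-split; the study's real
    procedure is not given in the slides.
    """
    rng = random.Random(seed)      # seeded so the split can be reproduced
    shuffled = list(student_ids)   # copy, leave the caller's list untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "treatment": shuffled[half:]}


# Hypothetical cohort of ten students.
cohort = [f"s{i:02d}" for i in range(10)]
groups = assign_groups(cohort, seed=42)
```

Running the same function once per cohort (2 rounds and 4 rounds of peer review) would give the four cells of a 2 x 2 layout.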
Phases of Experiment
● Phase A: preparation period which was needed to ensure that the stock market’s statistical properties are suitable for performing an MNAS (market-neutral arbitrage strategy)
● Phase B: period during which the MNAS is implemented (decisions are made at the beginning of phase B)
● Phase C: aftermath
Market Index (phase A, B, C)
Odds Ratios
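For readers unfamiliar with the measure, an odds ratio compares the odds of an outcome in the treatment group with the odds in the control group. The sketch below computes it from a 2 x 2 table of counts, together with a conventional Wald 95% confidence interval on the log scale. The counts are made up for illustration and are not results from this study.

```python
import math
from typing import Tuple


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for a 2 x 2 table.

    a = treatment with outcome, b = treatment without outcome,
    c = control with outcome,   d = control without outcome.
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


def wald_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% confidence interval, computed on the log-odds scale."""
    if min(a, b, c, d) == 0:
        raise ValueError("Wald interval needs all four cells to be non-zero")
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical counts, NOT data from the experiment.
example_or = odds_ratio(30, 20, 15, 35)   # (30 * 35) / (20 * 15) = 3.5
example_ci = wald_ci(30, 20, 15, 35)
```

An odds ratio above 1 means the outcome is more likely in the treatment group. An interval that excludes 1 is usually read as evidence of an effect.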
Binomial Effect Size Display
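The Binomial Effect Size Display restates a correlation r between treatment and outcome as two "success rates": 0.5 + r/2 for the treatment group and 0.5 - r/2 for the control group. A minimal sketch follows. The value r = 0.30 is an arbitrary illustration, not a result from this study.

```python
from typing import Dict


def besd(r: float) -> Dict[str, float]:
    """Convert a correlation r into BESD success rates."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    return {
        "treatment_success": 0.5 + r / 2,
        "control_success": 0.5 - r / 2,
    }


# Illustrative value only: r = 0.30 gives roughly 65% vs. 35%.
display = besd(0.30)
```

The difference between the two rates always equals r, which is what makes the display easy to read.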
Conclusions
● Submitting Peer Review messages does cause:
● Students to use statistics more often
● Non-rote learning
● Changed attitudes towards risk
● No evidence that Peer Review changes perceived practical relevance (maybe more rounds of peer review are needed)
Strengths versus Weaknesses
● Limitations:
● Only 2 cohorts (2 rounds and 4 rounds of PR)
● No evidence for the Utility (practical relevance) Hypothesis
● Strengths:
● with the exception of the practical relevance hypothesis, all experimental observations are based on objective measurements which are generated by innovative, educational technology
● the experiment is embedded in a challenging game which has a history of many years and is known to be enjoyable and captivating
● the measured learning outcomes lie outside of the regular curriculum
Contact
● Ian E. Holliday [email protected]
● P. Wessa [email protected]
● Website: http://www.freestatistics.org