1
Automatic Essay Scoring is Here and Now Online
Welcome to CIT S234
Gary Greer, University of Houston Downtown
& Michelle Overstreet, The College Board
Tuesday, Oct 25, 2005 (9:15 AM - 10:15 AM)
Coral Tower, Lobby Level
2
AES, AI, ACCUPLACER/WritePlacer
When essays are scored by human experts, the scoring characteristics can be mapped by Artificial Intelligence (AI) and used in Automatic Essay Scoring (AES).
AI is used to identify and internalize essay features into scoring models (algorithms).
The algorithms are verified in simulation and subsequently on live essays.
The algorithms are used by AES to score an essay.
3
Automatic Essay Scoring
AI maps salient characteristics of about 300 freshman essays into a linear model for each score level (for example, 6s, 8s, 10s).
AES is carried out by mathematically matching live essays to these predetermined linear models to predict a score.
AES algorithms specify whether an essay’s characteristics mathematically match the semantic space previously specified by human graders.
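The matching step described above can be illustrated with a minimal sketch. It assumes a nearest-profile rule: the feature vectors of human-scored essays are averaged per score level, and a new essay receives the score of the closest profile. The two numeric features, the training values, and the function names are hypothetical, and this is not a description of the scoring engine's actual algorithm.

```python
from statistics import mean

def fit_score_profiles(essays):
    """Average the feature vectors of human-scored essays at each score level."""
    by_score = {}
    for features, score in essays:
        by_score.setdefault(score, []).append(features)
    return {
        score: [mean(column) for column in zip(*rows)]
        for score, rows in by_score.items()
    }

def predict_score(profiles, features):
    """Return the score level whose profile is closest (squared Euclidean distance)."""
    def distance(profile):
        return sum((a - b) ** 2 for a, b in zip(profile, features))
    return min(profiles, key=lambda score: distance(profiles[score]))

# Hypothetical training data: (feature vector, human-assigned score).
training = [
    ([0.2, 0.3], 6), ([0.3, 0.2], 6),
    ([0.5, 0.6], 8), ([0.6, 0.5], 8),
    ([0.9, 0.8], 10), ([0.8, 0.9], 10),
]
profiles = fit_score_profiles(training)
predicted = predict_score(profiles, [0.55, 0.5])  # a hypothetical "live" essay
```

A production system would use far more features (the slides mention about 300) and a more elaborate model; the sketch only shows the idea of matching an essay to score profiles learned from human-scored examples.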
4
AES
AES therefore emulates human raters: it evaluates characteristic essay features such as Structure, Content, Style, Syntax, Discourse, and Word choice, then predicts a maximum likelihood estimate of a score using the algorithms derived from the 300 human-expert scored essays.
AES's performance has been verified in national-level studies; it now awaits performance tests by users at the local level.
We conducted our local performance study with ACCUPLACER/WritePlacer.
5
WritePlacer employs AI called IntelliMetric
WritePlacer infers and internalizes the rubric and pooled judgments of human scorers by analyzing over 300 semantic, syntactic and discourse features in five categories:
Focus and Unity
Development and Elaboration
Organization and Structure
Sentence Structure
Mechanics and Conventions
6
ACCUPLACER/WritePlacer is Online
ACCUPLACER Online offers an option for AES called WritePlacer Plus. Delivery is online, testing time is reduced, reliability is enhanced, and scoring is immediate.
At U. of Houston Downtown we asked whether this AES is the same as human-expert scoring. In other words, does this AES differ from human scoring?
7
We Conducted a Local Study
Research Question 1
What is the correlation between WritePlacer scores and human expert scores? Is it significant?
Research Question 2
Do the distributions of scores differ? (Are the medians equal?)
8
Our Hypotheses
Hypothesis 1
A significant correlation exists between WritePlacer scores and human expert scores.
(H0: correlation = 0)
Hypothesis 2
The median WritePlacer score is equal to the median human expert score.
(H0: medians are equal.)
9
Our Method
Participants were 112 randomly selected college freshmen who took the essay test.
Their essays were scored twice: first by WritePlacer's AES and then by human experts.
The correlation between the two sets of scores was obtained.
To see whether the median scores differed, a non-parametric test statistic was obtained.
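The two computations in this method can be sketched with the standard library alone. The sketch assumes Spearman's rank correlation (computed as a Pearson correlation on tie-averaged ranks) and the rank sums that feed a Wilcoxon rank-sum test. The score lists are hypothetical placeholders, not the study data, and the function names are illustrative.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def rank_sums(group_a, group_b):
    """Rank the pooled scores and return the sum of ranks for each group."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    return sum(ranks[:n_a]), sum(ranks[n_a:])

# Hypothetical paired scores for six essays (not the study data).
aes_scores = [6, 8, 8, 10, 7, 9]
human_scores = [6, 8, 9, 10, 8, 9]
rho = spearman_rho(aes_scores, human_scores)
sum_aes, sum_human = rank_sums(aes_scores, human_scores)
```

Averaging the ranks of tied values matters here because essay scores fall on a short integer scale, so ties are common.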
10
Table 1 Frequencies of Differences
Difference   Frequency   Percent   Who scored higher?
-2           4           4%        AES
-1           7           6%        AES
0            67          60%       identical
1            28          25%       Human
2            6           5%        Human
Total        112         100%
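The agreement rates implied by Table 1 can be recomputed directly from its frequencies. The short sketch below uses only the counts shown in the table; the variable names are illustrative.

```python
# Frequencies from Table 1, keyed by the score difference.
differences = {-2: 4, -1: 7, 0: 67, 1: 28, 2: 6}

total = sum(differences.values())

# Exact agreement: both scorers gave the same score.
exact_rate = differences[0] / total

# Adjacent agreement: scores identical or within 1 point of each other.
within_one = sum(count for diff, count in differences.items() if abs(diff) <= 1)
within_one_rate = within_one / total

# Discrepant pairs: scores 2 points apart.
two_point = sum(count for diff, count in differences.items() if abs(diff) == 2)
two_point_rate = two_point / total

print(f"exact: {exact_rate:.0%}, within 1: {within_one_rate:.0%}, "
      f"2 apart: {two_point_rate:.0%}")
```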
11
Table 2 – Significance Tests
Medians Test (Wilcoxon)
Group   n     Mean Rank   Sum of Ranks
AES     112   119.63      13398
Human   112   105.38      11802
Wilcoxon Test Statistic = 11802, p > .05
Correlation
rho = .724, p < .05
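As a rough plausibility check on Table 2, the rank sums can be run through the large-sample normal approximation for the Wilcoxon rank-sum statistic. This sketch assumes no correction for tied scores (the raw scores are not given, so none can be applied), so its p-value is only approximate and is not the value the authors computed.

```python
from math import sqrt
from statistics import NormalDist

n_aes = n_human = 112
rank_sum_aes, rank_sum_human = 13398, 11802  # from Table 2
n_total = n_aes + n_human

# All ranks 1..N must add up to N(N + 1) / 2.
expected_total = n_total * (n_total + 1) // 2
ranks_consistent = (rank_sum_aes + rank_sum_human) == expected_total

# Under H0 the rank sum W of the human group has this mean and variance
# (the variance ignores ties, which is an assumption of this sketch).
mean_w = n_human * (n_total + 1) / 2
var_w = n_aes * n_human * (n_total + 1) / 12

z = (rank_sum_human - mean_w) / sqrt(var_w)
p_two_sided = 2 * NormalDist().cdf(-abs(z))

print(f"z = {z:.3f}, two-sided p = {p_two_sided:.3f}")
```

Under these assumptions z is about -1.65 and the two-sided p is close to .10, which agrees with the reported p > .05.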
12
Discussion of Tables
Table 1 indicates that 91% of the paired scores were identical or agreed within 1 point, and that 9% differed by 2 points.
(The 10 pairs (9%) that differed by 2 points were split 60%-40%: 6 where Human > AES and 4 where AES > Human.)
Table 2 shows inferential statistics supporting the conclusion that AES assigns essays the same scores that human experts assign to those essays.
13
Findings
The correlation between WritePlacer scores and human-expert scores is significant:
rho = .72, p < .05.
The distributions of WritePlacer scores and human-expert scores do not differ significantly:
Wilcoxon W = 11802, p > .05.
14
Conclusions
Scoring essays by AES (as implemented within ACCUPLACER/WritePlacer) is consistent with scoring essays by human experts. (Interrater reliability is significant.)
AES scoring of essays is not subject to unreliability (inconsistency) due to fatigue. AES never gets tired!
AES scoring is efficient and effective.
15
Additional Issues:
1. Measurement error is eliminated.
2. An essay supplemented by multiple-choice (MC) items gives increased confidence about placement.
3. Efficiency: faculty are freed for instruction.
4. GMAT/MCAT/SAT are adopting AES.
5. Deep Blue learned chess moves.
16
Thank you