Automatic Essay Scoring is Here and Now Online

Welcome to CIT S234

Gary Greer, University of Houston Downtown

& Michelle Overstreet, The College Board

Tuesday, Oct 25, 2005 (9:15 AM - 10:15 AM)

Coral Tower, Lobby Level

AES, AI, ACCUPLACER/WritePlacer

When essays are scored by human experts, the scoring characteristics can be mapped by Artificial Intelligence (AI) and used in Automatic Essay Scoring (AES).

AI is used to identify and internalize essay features into scoring models (algorithms).

The algorithms are verified in simulation and subsequently on live essays.

The algorithms are used by AES to score an essay.

Automatic Essay Scoring

AI maps salient characteristics of freshman essays (about 300) into a linear model of each score (for example, 6s, 8s, 10s, etc.).

AES is carried out by mathematically matching live essays to these predetermined linear models to predict a score.

AES algorithms specify whether an essay’s characteristics mathematically match the semantic space previously specified by human graders.
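
The matching step can be pictured with a toy sketch. It assumes one linear model per score level and assigns the score whose model gives the largest value for the essay's features. The feature names, weights, and the `predict_score` helper are all hypothetical; the slides do not describe the actual WritePlacer models.

```python
# Toy illustration only: hypothetical features and weights, not the real models.
SCORE_MODELS = {
    6: {"bias": 1.0, "weights": {"word_count": -0.002, "spelling_errors": 0.10}},
    8: {"bias": 0.5, "weights": {"word_count": 0.001, "spelling_errors": 0.00}},
    10: {"bias": -1.0, "weights": {"word_count": 0.004, "spelling_errors": -0.15}},
}


def linear_match(model, features):
    """Weighted sum of the essay's features plus a bias term."""
    return model["bias"] + sum(
        weight * features[name] for name, weight in model["weights"].items()
    )


def predict_score(features):
    """Return the score level whose linear model matches the essay best."""
    return max(SCORE_MODELS, key=lambda s: linear_match(SCORE_MODELS[s], features))


# Hypothetical essays described by two features each.
short_essay = {"word_count": 150, "spelling_errors": 12}
long_essay = {"word_count": 600, "spelling_errors": 1}
print(predict_score(short_essay), predict_score(long_essay))
```

A real system would use far more features (the slides mention about 300) and weights learned from human-scored essays rather than chosen by hand.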

AES

AES therefore emulates human raters by repeatedly evaluating characteristic essay features such as structure, content, style, syntax, discourse, and word choice. It then predicts a maximum likelihood estimate of the score according to the algorithms derived from the 300 essays scored by human experts.

AES's performance has been verified in national-level studies; it now awaits performance tests by users at the local level.

We conducted our local performance study with ACCUPLACER/WritePlacer.

WritePlacer employs AI called Intellimetrics

WritePlacer infers and internalizes the rubric and pooled judgments of human scorers by analyzing over 300 semantic, syntactic and discourse features in five categories:

Focus and Unity

Development and Elaboration

Organization and Structure

Sentence Structure

Mechanics and Conventions

ACCUPLACER/WritePlacer is Online

ACCUPLACER Online offers an option for AES called WritePlacer Plus. Delivery is online, testing time is reduced, reliability is enhanced, and scoring is immediate.

At the University of Houston Downtown we asked whether this AES produces the same scores as human experts. In other words, does AES scoring differ from human scoring?

We Conducted a Local Study

Research Question 1

What is the correlation between WritePlacer scores and human expert scores? Is it significant?

Research Question 2

Do distributions of scores differ? (Are the medians equal?)

Our Hypotheses

Hypothesis 1

A significant correlation exists between WritePlacer scores and human expert scores.

(H0: correlation = 0)

Hypothesis 2

The median WritePlacer score is equal to the median human expert score.

(H0: Medians are equal.)

Our Method

Participants were 112 randomly selected college freshmen who took the essay examination.

Their essays were scored twice: first by WritePlacer's AES and then by human experts.

Correlation between scores was obtained.

To see whether the median scores differed, a non-parametric test statistic was obtained.
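
As a rough illustration of these two steps, the sketch below computes a Spearman rank correlation and the pooled rank sums used by a Wilcoxon rank-sum test. It is a minimal pure-Python version written for this summary, not the software used in the study, and the example scores are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rank_sums(x, y):
    """Rank the pooled scores and return the sum of ranks for each group."""
    ranks = average_ranks(list(x) + list(y))
    return sum(ranks[: len(x)]), sum(ranks[len(x):])


# Hypothetical paired scores for six essays (not the study data).
aes_scores = [6, 8, 8, 10, 7, 9]
human_scores = [6, 8, 9, 10, 8, 9]
print(spearman_rho(aes_scores, human_scores))
print(rank_sums(aes_scores, human_scores))
```

The smaller of the two rank sums is then compared with its distribution under the null hypothesis to obtain a p-value.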

Table 1 Frequencies of Differences

Difference   Frequency   Percent   Who scored higher?
-2           4           4%        AES
-1           7           6%        AES
0            67          60%       identical
1            28          25%       Human
2            6           5%        Human
Total        112         100%
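
The agreement rates implied by Table 1 can be recomputed directly from the frequencies. The short sketch below does that arithmetic; the variable names are ours, and a positive difference is read as the human score being higher, following the table's last column.

```python
# Frequencies from Table 1, keyed by score difference
# (positive = human scored higher, negative = AES scored higher).
diff_counts = {-2: 4, -1: 7, 0: 67, 1: 28, 2: 6}

total = sum(diff_counts.values())

# Exact agreement: both scores identical.
exact = diff_counts[0] / total

# Adjacent agreement: scores identical or within 1 point of each other.
adjacent = sum(c for d, c in diff_counts.items() if abs(d) <= 1) / total

# How often each scorer gave the higher score.
human_higher = sum(c for d, c in diff_counts.items() if d > 0)
aes_higher = sum(c for d, c in diff_counts.items() if d < 0)

print(f"total essays: {total}")
print(f"exact agreement: {exact:.0%}")
print(f"agreement within 1 point: {adjacent:.0%}")
print(f"human higher: {human_higher}, AES higher: {aes_higher}")
```

Of the 45 essays where the two scores differed, the human score was higher in 34 and the AES score was higher in 11.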

Table 2 – Significance Tests

Medians Test

Group   n     Mean Rank   Sum of Ranks
AES     112   119.63      13398
Human   112   105.38      11802

Wilcoxon Test Statistic: 11802, p > .05

Correlation

rho = .724, p < .05
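
The figures in Table 2 can be checked for internal consistency. The sketch below assumes the test is the rank-sum form of the Wilcoxon test, which the pooled rank totals suggest, and uses a normal approximation without a tie correction. The slides do not say how the p-value was actually computed, so the z and p values here are only approximate.

```python
from math import erf, sqrt

# Values reported in Table 2.
n_aes, n_human = 112, 112
sum_ranks_aes, sum_ranks_human = 13398, 11802
mean_rank_aes, mean_rank_human = 119.63, 105.38

n_total = n_aes + n_human

# Pooled ranks 1..N always sum to N(N+1)/2.
expected_total = n_total * (n_total + 1) // 2
ranks_add_up = sum_ranks_aes + sum_ranks_human == expected_total

# Each mean rank should equal its sum of ranks divided by the group size.
mean_ranks_match = (
    abs(sum_ranks_aes / n_aes - mean_rank_aes) < 0.01
    and abs(sum_ranks_human / n_human - mean_rank_human) < 0.01
)

# Normal approximation for the rank sum W under H0
# (assumption: no correction for tied scores).
w = sum_ranks_human
w_mean = n_human * (n_total + 1) / 2
w_sd = sqrt(n_aes * n_human * (n_total + 1) / 12)
z = (w - w_mean) / w_sd
p_two_sided = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

print(ranks_add_up, mean_ranks_match)
print(f"z = {z:.2f}, two-sided p = {p_two_sided:.3f}")
```

Under these assumptions z is about -1.65 and the two-sided p-value is roughly .10, which agrees with the reported p > .05.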

Discussion of Tables

Table 1 indicates that 91% of the paired scores were identical or agreed within 1 point and that 9% differed by 2 points.

(The 10 essays (9%) that differed by 2 points were split 60%-40%: 6 where Human > AES and 4 where AES > Human.)

Table 2 shows inferential statistics supporting the conclusion that AI scoring assigns essays the same scores that human experts assign to the same essays.

Findings

The correlation between WritePlacer scores and human-expert scores is significant:

rho = .72, p < .05.

The distributions of WritePlacer scores and human-expert scores do not differ significantly:

Wilcoxon W = 11802, p > .05.

Conclusions

Scoring essays by AES (as implemented within ACCUPLACER/WritePlacer) is consistent with scoring essays by human experts. (Interrater reliability is significant.)

AES scoring of essays is not subject to unreliability (inconsistency) due to fatigue. AES never gets tired!

AES scoring is efficient and effective.

Additional Issues:

1. Measurement error is eliminated.

2. Essay supplemented by MC items = increased confidence about placement.

3. Efficiency: faculty freed for instruction.

4. GMAT/MCAT/SAT are adopting AES.

5. Deep Blue learned chess moves.

Thank you