Validity Evaluation
NCSA Presentation
NAAC/NCIEA GSEG Consortia
Validation is “a lengthy, even endless process” (Cronbach, 1989, p. 151)
NAAC/Center for Assessment. 4/30/08
Kane’s argument-based framework
• “…assumes that the proposed interpretations and uses will be explicitly stated as an argument, or network of inferences and supporting assumptions, leading from observations to the conclusions and decisions. Validation involves an appraisal of the coherence of this argument and of the plausibility of its inferences and assumptions” (Kane, 2006, p. 17).
Two Types of Arguments
• An interpretive argument specifies the proposed interpretations and uses of test results by laying out the network of inferences and assumptions leading from the observed performances to the conclusions and decisions based on those performances.
• The validity argument provides an evaluation of the interpretive argument (Kane, 2006).
The NAAC-GSEG
• Builds on previous projects (NHEAI and the first phase of NAAC) that focused on technical documentation
• Collecting and evaluating validity evidence was a significant challenge
• This project focuses on having five states create and evaluate validity arguments
• Our focus is NOT on single studies or even sets of studies, but on how the studies support or refute a validity argument
Process
• Theory of Action
• Interpretive Argument
• Validity Evaluation
Theory of Action
Types of Studies
• Surveys
  – Teachers and principals
    • Skills, resources, supports
  – Scorer background
  – Consequential validity
  – Parent surveys
  – Learner Characteristics Inventories
• Classroom visit protocols
• Scoring observation protocols
• Performance-level judgment assignment
• Focus groups
Prioritizing the Studies
• Based on the Theory of Action…
  – Which studies will give us the best information to develop an interpretive argument?
  – How will we synthesize the study results into our interpretive argument?
Evaluating the Validity Argument
• Haertel (1999) reminded us that the individual pieces of evidence (typically presented in separate chapters of technical documents) do not, by themselves, make an assessment system valid or invalid; only by synthesizing this evidence to evaluate the interpretive argument can we judge the validity of the assessment program.
An approach for organizing synthesis for validity evaluation (Ryan & Marion)

Example 1
• Dimension of validity evidence: Response processes
• Proposition/claim: The rubric appropriately captures the knowledge and skills valued and assessed
• Validity study question: What features and components of the GAA portfolio entries and scoring rubric do scorers attend to during the scoring process?
• Evidence (data) collected: Think-aloud protocols and interviews with a group of scorers (n = 14)
• Results: Summary of results

Example 2
• Dimension of validity evidence: Generalization (reliability)
• Proposition/claim: Scorers apply the rubric accurately and consistently when scoring the AA-AAS
• Validity study question: Given training and support materials similar to those used by the state’s AA-MAS vendor, will scorers achieve an acceptable level of consistency and accuracy in their ratings?
• Criteria (what is acceptable or good enough evidentiary support?): Typically r = .8
• Evidence (data) collected: Scores examined for consistency and accuracy for a set of trained scorers (without SWD expertise)
• Results: Mixed results
• Claim and support? (Is the claim supported by the evidence? How strong is the support?): Weak correlational results for inter-scorer agreement
• Potential alternative explanations? (Do the results reveal other likely explanations?): Scorers of AA-AAS must have content expertise to be effective scorers
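The r = .8 criterion above is a conventional threshold for inter-scorer consistency. As an illustration only, the sketch below (hypothetical rubric scores, not project data) computes the Pearson correlation and exact-agreement rate between two scorers and checks them against that threshold:

```python
# Sketch: checking inter-scorer consistency against an r = .80 criterion.
# The score lists below are hypothetical illustrations, not project data.
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def exact_agreement(x, y):
    """Proportion of entries the two scorers rated identically."""
    return sum(a == b for a, b in zip(x, y)) / len(x)

scorer_1 = [3, 2, 4, 4, 1, 3, 2, 4]   # hypothetical rubric scores
scorer_2 = [3, 2, 3, 4, 1, 3, 3, 4]

r = pearson_r(scorer_1, scorer_2)
pct = exact_agreement(scorer_1, scorer_2)
print(f"inter-scorer r = {r:.2f}, exact agreement = {pct:.0%}")
print("meets r >= .80 criterion" if r >= 0.80 else "below r = .80 criterion")
```

In practice, correlation and exact agreement answer different questions (scorers can correlate highly while systematically disagreeing on levels), which is one reason a validity evaluation weighs several such indices rather than a single coefficient.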
Next Steps
• States are completing their studies and comparing their results with their interpretive arguments.
• They will then evaluate these study-result and interpretive-argument comparisons to develop the validity evaluation.
Contact Information
• Jacqui Kearns, Ed.D.
  NAAC, University of Kentucky
  www.naacpartners.org
• Scott Marion
  NCIEA
  www.nciea.org