“Interface” Validity

Investigating the potential role of face validity in content validation

Gábor Szabó, Robert Märcz

ECL Examinations

EALTA 9 - Innsbruck, June 2, 2012


Outline

- Questions of face validity

- New approach

- Context, participants and instruments

- Results

- Conclusions


“Post mortem”?

Educational context:

it is important to seem to be testing as well as to be actually doing it

Test takers’ acceptance of the test:

- contributes to its validity

- source of motivation

Lay opinion – taken seriously?


“Interface” validity

New approach:

Test takers are asked to

- give their opinion on the test (face validity)

- give their opinion on the content (content validity)


Context and participants

ECL International Language Examination System

Level – B2

Reading comprehension test

Two tasks: sentence completion and short answer

Online questionnaire

903 responses within the first week (approx. 50%)


The instrument

Questionnaire of 17 items

Four-point Likert scale

(4: completely true – 1: not true at all)

6 items – on face validity: general statements concerning difficulty, layout, etc.

11 items – on content validity: descriptors of the CEFR paraphrased

Two negative items (halo effect)


The Questionnaire - Examples

Face validity:

3. I had enough time to complete the tasks.

Content validity:

Original CEFR descriptor:

“Can understand articles and reports concerned with contemporary problems in which the writers adopt particular stances or viewpoints.”

9. I could understand the viewpoints of the writer.

16. It was difficult to understand the viewpoints of the writer.


Procedure

Halo effect: analysing the parallel, oppositely worded items, we found significant negative correlations (-0.630 and -0.670)

Deleting responses with inconsistent response patterns

791 candidates’ responses were found valid and consistent
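The screening step above can be illustrated with a minimal sketch. The slides do not state the exact rule used to flag an inconsistent response pattern, so the `is_consistent` rule below is an assumption: on the four-point scale, agreeing (3 or 4) with one item of an oppositely worded pair, such as items 9 and 16, is expected to go with disagreeing (1 or 2) with the other. The response data are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def is_consistent(positive, negative):
    """Assumed rule (not from the slides): exactly one item of an
    oppositely worded pair should be agreed with (score 3 or 4)."""
    return (positive >= 3) != (negative >= 3)


# Made-up responses: (positive item, negative item) per candidate
responses = [(4, 1), (3, 2), (4, 4), (1, 4), (2, 3), (1, 1)]

# Keep only candidates whose answers to the pair do not contradict each other
consistent = [r for r in responses if is_consistent(*r)]

# Correlation between the paired items before and after screening
r_all = pearson([p for p, _ in responses], [n for _, n in responses])
r_kept = pearson([p for p, _ in consistent], [n for _, n in consistent])
```

In this toy data the two contradictory response patterns, (4, 4) and (1, 1), are dropped, and the negative correlation between the paired items becomes stronger after screening.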


Results and analysis

Descriptive statistics


Results and analysis

• Item correlations

  – Expectation: significant, probably moderate correlations

    • Descriptors tap into different aspects of the B2 construct

  – Actual results

    • Strong, significant correlation (0.807) in one case:

      Though the text was long I was able to scan it quickly

      Though the text was complex I was able to scan it quickly


Results and analysis

– Actual results

  • Moderate, significant correlations (0.405-0.654)

    I could quickly identify the content of the text – I could understand the viewpoints of the writer

    I could understand the stance of the writer – I could quickly identify the content of the text

    I could quickly identify the content of the text – Though the text was complex I was able to scan it quickly

    Most consistent pattern of correlations in the case of item 8: I could quickly identify the content of the text


Results and analysis

– Actual results

  • Low, sometimes not significant, occasionally negative correlations (<0.4)

    I could rarely find idioms in the text

    A broad active vocabulary was needed to complete the tasks

    The text was concerned with contemporary problems
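The three bands used on the results slides can be written as a small helper. This is only a sketch: the slides give the lower bound of the moderate band (0.4) and report 0.807 as strong, but do not state where moderate ends and strong begins, so the `strong_cutoff` of 0.7 is an assumption. Negative values fall into the low band, as on the slide above.

```python
def label_correlation(r, strong_cutoff=0.7):
    """Sort an item correlation into the bands used on the slides.

    strong_cutoff is a hypothetical boundary; the slides only show that
    0.654 counted as moderate and 0.807 as strong.
    """
    if r >= strong_cutoff:
        return "strong"
    if r >= 0.4:
        return "moderate"
    return "low"


# Values reported on the slides, plus one made-up negative value
labels = [label_correlation(r) for r in (0.807, 0.654, 0.405, -0.12)]
```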


Results and analysis

• Batch correlations

  – Correlating face validity items with content validity items

    • Significant, moderate correlation (0.536) found

    • Indication of relationship between constructs?
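One way such a batch correlation could be computed is sketched below. The slides do not say how the two batches were aggregated, so averaging each candidate's scores within a batch before correlating is an assumption, and the item numbers in `FACE_ITEMS` and `CONTENT_ITEMS` as well as the response data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def batch_mean(row, item_ids):
    """Mean score of one candidate over a batch of questionnaire items."""
    return sum(row[i] for i in item_ids) / len(item_ids)


# Hypothetical item numbers for each batch (not the real questionnaire split)
FACE_ITEMS = [1, 2, 3]
CONTENT_ITEMS = [8, 9, 10]

# Made-up responses on the 1-4 scale, one dict per candidate
rows = [
    {1: 4, 2: 3, 3: 4, 8: 4, 9: 3, 10: 4},
    {1: 2, 2: 2, 3: 1, 8: 2, 9: 3, 10: 2},
    {1: 3, 2: 3, 3: 3, 8: 3, 9: 2, 10: 3},
    {1: 1, 2: 2, 3: 2, 8: 2, 9: 1, 10: 1},
]

# One aggregate score per candidate and batch, then a single correlation
face_scores = [batch_mean(r, FACE_ITEMS) for r in rows]
content_scores = [batch_mean(r, CONTENT_ITEMS) for r in rows]
batch_r = pearson(face_scores, content_scores)
```

Any negatively worded items would need to be reverse-scored before averaging; the sketch leaves that step out.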


Conclusions

• Using candidate feedback in content validation is potentially useful

• Further analyses of data in progress

  – Checking for significant differences between sets of responses to different items

• Refinement of reworded descriptors needed

• Further research necessary

  – Relationship between candidate performance and opinion


Thank you for your attention!
