Measurement, Testing & Evaluation


Major concepts and terminology in second language testing


Measurement: Fundamental Concepts & Preliminaries


Importance of Testing in Educational Situations

• To determine the progress of students
• To ascertain the achievement of educational objectives
• To make sound decisions based on evaluation
• To know how much learning has taken place

Teaching & Testing Relationship

• Testing at the service of teaching
• Washback / backwash effect: positive / negative


Concepts & Terms

Test: A procedure designed to elicit a certain behavior from which one can make inferences about certain characteristics of an individual.

Assessment: An ongoing process and a kind of measurement which encompasses a wider domain than a test and is carried out in direct and indirect ways.


Concepts & Terms

Measurement: The process of quantifying individuals’ characteristics according to specific rules and procedures.

Evaluation: The systematic gathering of information for the purpose of making decisions.

Qualitative vs. Quantitative Evaluations


Teaching–Assessment Relation

[Figure: diagram of the teaching–assessment relation; only the label "TEACHING" survives]


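[Figure: diagram relating Evaluation and Measurement, with numbered regions; the numbered cases are described in the list that follows]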


1. Non-test, non-measure evaluation: qualitative description of students’ performance

2. Non-test measure for evaluation: teacher’s ranking for assigning grades

3. Test for evaluative purposes: achievement testing

4. Test for non-evaluative purposes: proficiency test for research

5. Non-test measure for non-evaluative purposes: assigning code numbers to subjects for research
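
The five cases above can be read as combinations of three yes/no questions: is it a test, is it a measure, and is it used for evaluation? The sketch below is only an illustration of that reading. The function name `region` is hypothetical, and it assumes (as the phrase "non-test measure" suggests) that every test counts as a kind of measure.

```python
def region(is_test: bool, is_measure: bool, is_evaluative: bool) -> int:
    """Map an activity to one of the five numbered cases in the list above.

    Assumption (not stated explicitly on the slide): a test is always a measure.
    """
    if is_test and not is_measure:
        raise ValueError("a test is treated here as a kind of measure")
    if is_evaluative:
        if is_test:
            return 3  # test for evaluative purposes
        if is_measure:
            return 2  # non-test measure for evaluation
        return 1      # non-test, non-measure evaluation
    if is_test:
        return 4      # test for non-evaluative purposes
    if is_measure:
        return 5      # non-test measure for non-evaluative purposes
    raise ValueError("neither a measure nor an evaluation")


# Achievement testing: a test, hence a measure, used for evaluation.
print(region(is_test=True, is_measure=True, is_evaluative=True))
```

A qualitative description of performance would be `region(False, False, True)`, and assigning code numbers to research subjects would be `region(False, True, False)`.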


Nominal Scale

Not really a ‘scale’ because it does not scale objects along any dimension. It simply labels objects and gives the researcher the least amount of information about participants.

Gender: Male = 1, Female = 2

Religious affiliation: Catholic = 1, Protestant = 2, Jewish = 3, Muslim = 4, Other = 5

Other examples: yes/no responses; categorizing subjects by hair colour, marital status, race, political party affiliation, college major, or birthplace.

Nominal data is often generated in studies using a questionnaire design from closed, forced-choice questions, e.g. type of pet (cat, dog, rat, etc.).
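
A small sketch may help show why nominal codes are only labels. The gender codes come from the slide; the list of responses and the helper names (`mode_label`, `SWAPPED_LABELS`) are invented for illustration. Swapping the codes leaves the most frequent category unchanged but changes the arithmetic mean, which is why a mean of nominal codes says nothing.

```python
from collections import Counter
from statistics import mean

# Codes from the slide: Male = 1, Female = 2.
GENDER_LABELS = {1: "Male", 2: "Female"}
# An equally valid (hypothetical) coding with the numbers swapped.
SWAPPED_LABELS = {2: "Male", 1: "Female"}

# Hypothetical questionnaire responses, invented for this example.
responses = [1, 2, 2, 1, 2, 2]
# The same answers expressed in the swapped coding.
recoded = [3 - code for code in responses]


def mode_label(codes, labels):
    """Return the most frequent category name for a list of nominal codes."""
    counts = Counter(labels[code] for code in codes)
    return counts.most_common(1)[0][0]


# The mode is the same under both codings...
print(mode_label(responses, GENDER_LABELS), mode_label(recoded, SWAPPED_LABELS))
# ...but the mean of the codes depends entirely on the arbitrary numbering.
print(mean(responses), mean(recoded))
```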


Ordinal Scale

Numbers are used to place objects in order, but there is no information regarding the differences (intervals) between points on the scale.

• Symptoms of depression from a psychiatric assessment: None = 0, Mild = 1, Moderate = 2, Severe = 3
• Ranking students according to frequency of spelling errors
• The scores on a Likert questionnaire: Strongly agree = 5, Agree = 4, No opinion = 3, Disagree = 2, Strongly disagree = 1
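
As a rough illustration of ordinal data, the sketch below uses the Likert codes from the slide and ranks students by spelling errors. The student names, error counts, and answers are hypothetical. It shows two things: order-based summaries such as the median make sense, and ranks hide how large the gaps between students actually are.

```python
from statistics import median

# Likert codes as given on the slide.
LIKERT = {
    "Strongly agree": 5,
    "Agree": 4,
    "No opinion": 3,
    "Disagree": 2,
    "Strongly disagree": 1,
}

# Hypothetical answers to one questionnaire item.
answers = ["Agree", "Strongly agree", "Disagree", "Agree", "No opinion"]
median_code = median(LIKERT[a] for a in answers)

# Hypothetical spelling-error counts per student.
errors = {"Ali": 7, "Sara": 2, "Reza": 20}

# Rank students from fewest to most errors (rank 1 = fewest).
ranking = sorted(errors, key=errors.get)
ranks = {name: position for position, name in enumerate(ranking, start=1)}

print(median_code)
print(ranks)
# The ranks are evenly spaced (1, 2, 3) although the underlying gaps are not
# (5 errors between the first two students, 13 between the last two).
```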


Interval Scale

• An interval scale is a scale on which equal intervals between objects represent equal meaningful differences.
• Determining scores on a grammar test
• A 10-degree difference has the same meaning anywhere along the scale.


Ratio Scale

• Ratio scales have a true zero point, so ratios between values are meaningful.
• Physical scales of time, length, weight, speed, and absolute temperature (Kelvin scale)
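
The contrast between interval and ratio scales can be sketched with temperature. This example assumes the Celsius scale as the interval scale (the slide names only the Kelvin scale); the two temperatures are arbitrary. A difference of 10 degrees is the same on both scales, but "twice as hot" only makes sense on the scale with a true zero.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert degrees Celsius to kelvins (shift by 273.15)."""
    return celsius + 273.15


# Two arbitrary temperatures in degrees Celsius.
a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences survive the change of scale: both are (about) 10 degrees.
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios do not: 20 / 10 is 2 in Celsius, but the Kelvin ratio is close to 1.
ratio_c = b_c / a_c
ratio_k = b_k / a_k

print(diff_c, round(diff_k, 6))
print(ratio_c, round(ratio_k, 3))
```

Only the Kelvin ratio reflects a physical relationship, because only the Kelvin scale has an absolute zero.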


    the categories of the variable:

   

Property          Nominal   Ordinal   Interval   Ratio
Distinctiveness   +         +         +          +
Ordering          -         +         +          +
Equal interval    -         -         +          +
Absolute zero     -         -         -          +
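
The property table is cumulative: each scale keeps the properties of the scale before it and adds one more. A minimal sketch of that structure follows; the names `SCALES`, `PROPERTIES`, `properties_of`, and `has_property` are hypothetical and chosen only for this illustration.

```python
# Scales and properties in the order used in the table.
SCALES = ["nominal", "ordinal", "interval", "ratio"]
PROPERTIES = ["distinctiveness", "ordering", "equal interval", "absolute zero"]


def properties_of(scale: str) -> list[str]:
    """Return the properties a scale has; each level adds one to the last."""
    level = SCALES.index(scale)
    return PROPERTIES[: level + 1]


def has_property(scale: str, prop: str) -> bool:
    """Check one cell of the table ('+' corresponds to True)."""
    return prop in properties_of(scale)


for scale in SCALES:
    print(scale, properties_of(scale))
```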


Test Genres

Test Battery: A group of tests standardized on the same population to yield comparable results and to produce a single score.

• Traditional vs. Computer-adaptive
• Discrete-point vs. Global/Integrative
• Pragmatic vs. Functional/Communicative
• Norm-referenced vs. Criterion-referenced
• Direct vs. Indirect
• Subjective vs. Objective
• Summative vs. Formative
• Power vs. Speed


NRT vs. CRT

Purpose

CRT: To determine whether each student has achieved specific skills or concepts. To find out how much students know before and after instruction.

NRT: To rank each student with respect to the achievement of others in broad areas of knowledge. To discriminate between high and low achievers.

Content

CRT: Measures specific skills making up a designated curriculum and identified by teachers and curriculum experts. Each skill is expressed as an instructional objective.

NRT: Measures broad skill areas sampled from a variety of textbooks, syllabi, and the judgments of curriculum experts.

Item Characteristics

CRT: Each skill is tested by at least four items to obtain an adequate sample of performance and to minimize the guessing effect. The items which test any given skill are parallel in difficulty.

NRT: Each skill is usually tested by fewer than four items. Items vary in difficulty. Selected items show high discrimination indexes.

Score Interpretation

CRT: Test takers are compared with a preset standard for acceptable achievement. The performance of other examinees is irrelevant. A student's score is usually expressed as a percentage. Student achievement is reported for individual skills.

NRT: Test takers are compared with other examinees and assigned a score, usually expressed as a percentile, a grade-equivalent score, or a stanine. Student achievement is reported for broad skill areas, although some norm-referenced tests do report student achievement for individual skills.
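
The two kinds of score interpretation can be sketched side by side. Everything specific here is an assumption for illustration: the cut score of 80%, the group scores, and the function names. The percentile rank uses one common definition (the percentage of the group scoring below the given score); other definitions exist.

```python
def percentage_score(correct: int, total: int) -> float:
    """CRT-style score: percentage of items answered correctly."""
    return 100.0 * correct / total


def meets_standard(correct: int, total: int, cut: float = 80.0) -> bool:
    """Compare a score with a preset standard (hypothetical cut of 80%)."""
    return percentage_score(correct, total) >= cut


def percentile_rank(score: float, group_scores: list[float]) -> float:
    """NRT-style score: percentage of the group scoring below `score`."""
    below = sum(1 for s in group_scores if s < score)
    return 100.0 * below / len(group_scores)


# Hypothetical raw scores (out of 20) for a norm group of five examinees.
group = [10, 12, 16, 18, 19]

# Criterion-referenced reading: other examinees are irrelevant.
print(percentage_score(16, 20), meets_standard(16, 20))
# Norm-referenced reading: the same raw score relative to the group.
print(percentile_rank(16, group))
```

The same raw score of 16 out of 20 meets the preset standard in the first reading, yet sits below the middle of this particular group in the second.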


Test Items

Alternate-response items
• True / False
• Yes / No
• Agree / Disagree
• Right / Wrong

Fixed / closed-ended response items
• Multiple-choice
• Matching

Free / open-ended response items
• Short answer
• Gap-fill
• Essay


Teacher-made vs. Standardized Tests

Teacher-made/classroom tests: Small-scale classroom tests generally prepared, administered, and scored by one teacher.

Standardized tests: Tests with fixed contents, constant administration and scoring procedures, and statistically acceptable characteristics.

Differences between teacher-made tests (TMD) and standardized tests (SDT):
• Administration & scoring
• Content sampling
• Test construction
• Norms & standards
• Purpose and use


Self-assessment

1. A test refers to a standard set of items to be answered.

2. Evaluation uses both tests and informal pieces of evidence for making a value judgment and decision.

4. Measurement refers to any device for obtaining information in a quantitative manner.

5. If a person knows how to teach, he may not necessarily be able to judge the ability of his pupils.

Answers (in slide order): T, F, T, T

Mohd. Pazhouhesh


6. Educational decisions can be made without measurement or evaluation.

7. Summative evaluation involves the use of tests and quizzes for the purpose of determining the effectiveness of instructional programs.

Answers (in slide order): F, T


The process of gathering information to make proper decisions is called ----------.

a. measurement   b. testing   c. evaluation   d. examination

The subjective judgment of a teacher about a student’s performance is a kind of --------- evaluation.

a. quantitative   b. standard   c. qualitative   d. comprehensive
