Bab 3

Page 1: Bab 3

CHAPTER THREE

STANDARDIZED TESTS

WHAT ARE STANDARDIZED TESTS?

One of the most widespread assessment procedures used is standardized tests.

Although standardized tests have many critics, and the pressure to change

them is considerable, they will be part of the assessment procedures in most

school districts in the foreseeable future. Teachers should be able to use the

results. Standardized tests are prepared for nationwide use (usually

commercial) to provide accurate and meaningful information on students’

levels of performance relative to others at their age or grade levels. To make

test scores comparable, the tests are administered and scored under carefully

controlled conditions that are uniform for all students so that students all over

the country (and world) have equal chances to demonstrate what they know.

Standard methods are used to develop items, administer the test, score it, and

report the scores to interested audiences. Such tests are usually constructed by

subject matter specialists and experts on testing.

Standardized tests are typically used to provide a yardstick (which

teacher-made tests cannot provide) against which to compare individuals or

groups of students. The interpretation of scores on standardized tests is

based on national and subnational norms (see Box 3.1 for definitions of

measurement criteria). Test norms are records of the performances of groups

of individuals who have previously taken the test. Test norms are used to

determine how the score of any test taker compares with the scores of a

sample of similar individuals. The test publishers provide one or more ways

of comparing each student’s raw score (number of correct answers) with the

norm sample.

Standardized tests evolved and proliferated because of the unreliability

of school assessments. The Scholastic Aptitude Test (SAT), for example,

was created in 1926 as an efficient and economical way for college admission

officers to select the most promising students from the pool of applicants. The

SAT test scores were found to be a better predictor of grades in college than

were high school grades. Since that time standardized tests have been used to

(a) select and place students into classes, programs, special schools, or

colleges, (b) decide whether a student should advance to the next level, (c)

diagnose students’ problems in learning, (d) determine honors, awards, and

scholarships, (e) evaluate the effectiveness of instructional programs, (f)

apply for federal funds, and (g) conduct research. Standardized test scores

have become the yardstick for measuring the quality of schools, school

districts, and even education within a state and the country as a whole.

There are two types of standardized tests: achievement tests and

aptitude tests. Achievement tests focus on the knowledge and skills learned

in school and may be in the form of achievement batteries, diagnostic tests, or

subject-specific tests. Aptitude tests focus on the potential maximum

achievement of students and may measure general intellectual aptitude,

aptitude to do well in college or certain vocational training programs, reading

aptitude, mechanical aptitude, or perceptual aptitude. Although aptitude tests

and achievement tests are theoretically different, their results are so highly

correlated that both may be considered achievement tests (see Box 3.2 for

definitions of statistical interpretations of test results).

BOX 3.1

CRITERIA FOR GOOD MEASUREMENT PROCEDURES

CRITERION: DEFINITION

Reliability: Reliability exists when a student’s performance remains the same on repeated measurements. On a norm-referenced measure, this means that when the measure is repeated and the raw scores of students are arranged in order from highest to lowest, all students will keep the same rank.

Validity: Validity means that the test actually measures what it was designed to measure, all of what it was designed to measure, and nothing but what it was designed to measure.

Objectivity: Objectivity is the agreement of (a) experts on the correct answer to a test item and (b) different scorers on what score should be assigned to a test paper or questionnaire.

Practicality: Practicality of a measure is determined by the cost per copy, the time it takes to administer it, the ease of scoring, and other factors teachers have to take into account before deciding to use a particular measure.

Discrimination: When a norm-referenced measure is used, each item has to discriminate among students as high, medium, and low on the skill or knowledge being measured.

Norm-referenced tests: Norm-referenced tests are designed to test a student’s performance as it compares to the performances of other students.

Criterion-referenced tests: Criterion-referenced tests are designed to compare a student’s test performance to preset criteria defining excellence on learning tasks or skills.
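
The reliability criterion in Box 3.1 (students keep the same rank when a measure is repeated) can be checked numerically. The sketch below is one possible illustration, not a procedure taken from the chapter: it assumes the Spearman rank correlation as the index of rank agreement, and the function names and the two sets of scores are hypothetical. A value of 1.0 means every student kept the same rank; lower values mean ranks shifted between administrations.

```python
import math


def ranks(scores):
    """Rank scores from highest (rank 1) to lowest; tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next student has the same score.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            result[order[position]] = shared_rank
        start = end + 1
    return result


def rank_consistency(first, second):
    """Spearman rank correlation between two administrations of the same measure.

    Undefined (division by zero) if every student has the same score.
    """
    rx, ry = ranks(first), ranks(second)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical raw scores for five students on two administrations.
first_testing = [90, 80, 70, 60, 50]
second_testing = [88, 82, 65, 66, 40]

print(ranks(first_testing))    # [1.0, 2.0, 3.0, 4.0, 5.0]
print(ranks(second_testing))   # [1.0, 2.0, 4.0, 3.0, 5.0]
print(rank_consistency(first_testing, second_testing))  # 0.9
```

In this made-up example the third and fourth students trade places on the second testing, so the rank agreement drops from 1.0 to 0.9.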

BOX 3.2

INTERPRETING STANDARDIZED TEST SCORES

STATISTIC: DEFINITION

Frequency distribution: A list of the number of people who obtain each score on a test. This information may be expressed as a simple graph, called a histogram or bar graph, where the horizontal or x-axis indicates the possible scores and the vertical or y-axis indicates the number of students who attained each score.

Measures of central tendency: The mean is the sum of all scores in the class divided by the number of students. The median is the midpoint in a set of scores arranged in order, from highest to lowest. It is most useful when a few unusually high or low scores distort the mean.

Standard deviation: A measure of how far, on average, students’ scores fall from the mean score (computed as the square root of the average squared difference from the mean). A large standard deviation indicates that students obtained a wide range of scores on the test. A small standard deviation indicates that the range of scores is narrow and most students scored right around the mean.

Standard score: An indication of how far each student’s score is above or below the mean as measured by standard deviation units, which allows for the comparison of scores from different tests, regardless of the size of the class or the number of items on the test. To find the standard score, subtract the mean from the student’s raw score and divide by the standard deviation (this yields the z-score). Three common standard scores are the z-score, the stanine, and the normal curve equivalent. Z-scores have a mean of 0 and a standard deviation of 1. Stanine (a combination of the words standard nine) scores have a mean of 5 and a standard deviation of 2. The normal curve equivalent (NCE) scores range from 1 to 99 with a mean of 50 and a standard deviation of about 21.

Percentile rank: The percentage of the class with scores below that obtained by the student. Percentile rank can range from 0 to 100.

Grade-equivalent scores: The average of the scores of all students in the norming sample at that grade level. Grade-equivalent scores are generally listed as numbers, such as 11.4, 9.6, 7.2, or 3.5. The whole number expresses the grade level and the decimals stand for tenths of a year. Grade-equivalent scores are easy to interpret and understand.
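
The statistics defined in Box 3.2 can be worked through on a small set of scores. The following is a minimal sketch under stated assumptions: the eight raw scores are invented, the function names are illustrative, and the standard deviation is the population form (dividing by the number of students). The stanine is approximated as 2z + 5 rounded to the nearest whole number, whereas published tests usually assign stanines from fixed percentage bands, and the NCE uses the conventional multiplier of 21.06.

```python
import math
from collections import Counter
from statistics import mean, median, pstdev


def frequency_distribution(scores):
    """Number of students who obtained each score, in ascending score order."""
    return dict(sorted(Counter(scores).items()))


def z_score(raw, mu, sigma):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (raw - mu) / sigma


def stanine(z):
    """Approximate stanine (mean 5, SD 2), rounded and limited to 1..9."""
    return max(1, min(9, math.floor(2 * z + 5 + 0.5)))


def normal_curve_equivalent(z):
    """NCE (mean 50, SD about 21.06), limited to the 1..99 range."""
    return max(1.0, min(99.0, 50 + 21.06 * z))


def percentile_rank(raw, scores):
    """Percentage of scores in the group that fall below the given raw score."""
    return 100 * sum(s < raw for s in scores) / len(scores)


# Hypothetical raw scores for a class of eight students.
scores = [55, 65, 65, 65, 70, 70, 80, 90]
mu, sigma = mean(scores), pstdev(scores)

print(frequency_distribution(scores))  # {55: 1, 65: 3, 70: 2, 80: 1, 90: 1}
print(f"{mu:.1f} {median(scores):.1f} {sigma:.1f}")  # 70.0 67.5 10.0
z = z_score(90, mu, sigma)
print(z, stanine(z), percentile_rank(90, scores))  # 2.0 9 87.5
```

Here the student with a raw score of 90 is two standard deviations above the class mean, which corresponds to a stanine of 9 and a percentile rank of 87.5 within this class. Against a publisher's norm sample the same raw score could map to quite different values.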

ADVANTAGES OF STANDARDIZED TESTS

1. Standardized tests are easily administered and they take little time away

from instruction.

2. Standardized tests provide a standard situation in which all students are

required to answer the same questions. This ensures that all students may

be evaluated on the same criteria—some students will not be evaluated on

different criteria than others.

3. Standardized tests provide a permanent record of behavior when they are

written (some tests can be oral). A permanent record of answers allows

teachers to examine the same answers several times to ensure that the

evaluation is fair and unbiased.

4. Standardized test scores allow simple comparisons between students,

schools, districts, states, and nations. From the global comparisons

provided, an overall assessment can be made.

5. Standardized tests are used by psychometricians and major institutions and,

therefore, they carry scientific credibility and tradition.

6. Standardized tests are unparalleled for certain purposes, such as large-

scale, cost-effective assessment of large numbers of students on low-level

cognitive objectives.

7. Standardized tests tend to have high predictive validity. Advanced

placement tests, for example, accurately predict how students will perform

in college courses.

DISADVANTAGES OF STANDARDIZED TESTS

1. The content of standardized tests is problematic. Standardized tests

measure factual or declarative information and a narrow group of verbal

skills (such as word recall, fluency, and recognition vocabulary). They

tend not to measure depth of understanding, integration of knowledge, and

production of discourse, let alone social progress, individual worth, or

school effectiveness. Abstract verbal skills, for example, do not determine

excellence in writing a poem, singing a lullaby, tutoring a child, or giving

an order in a factory.

2. The range of what standardized tests can assess is limited.

Standardized tests are inadequate in assessing students’ generative

capabilities, such as (a) expressing themselves orally or in writing, (b)

organizing and analyzing an abundance of data, (c) devising an

experiment to answer an interesting question, and (d) working

cooperatively with others.

3. The ability of standardized tests to identify students with special

needs is negligible. Standardized tests are of little help in identifying

students who need either support and help to succeed or challenges

beyond those offered by the curriculum because they are

a. Not timely. They are administered at most once per year and usually

only once every several years.

b. Not aligned. Assessment must be aligned with the curriculum and

conducted regularly and frequently to avoid having students fall

farther and farther behind and having other students endure repetition

and a slow pace (because they learn quickly or already know what is

being taught). Any student is potentially in need of special assistance

at some point. And any student is potentially eligible for additional

challenge. Unless teachers have the capability to assess students

frequently with regard to the curriculum, they have no way of knowing

whether a challenge or special assistance is needed.

4. Standardized tests are not helpful in assessing (a) student learning in

specific courses or (b) achievement of district program goals. A

generic test for high school juniors will not yield information on the degree to

which students have learned in a specific class, for example, physics, auto

mechanics, or family life. Prepared program goals (what students will

have learned as a result of studying social studies or language arts) typically

include broad statements such as “understand major historical trends” or

“communicate effectively in speaking and writing”. Such goals involve

higher-level outcomes not included in standardized tests.

5. Standardized tests have limited use in assessing student exit outcomes.

Exit outcomes are statements of the knowledge and skills students will possess

after completing schooling. Exit outcomes address the question, “When

students leave us, what will they know and what can they do?” Most

schools use exit outcomes to guide curriculum planning and curriculum

audits. Thus, if a district has an exit outcome relating to critical and

creative thinking, faculty may examine the science program to ensure it

does not rely solely on the memorization of facts and the performance of

“cookbook” labs. In some schools, furthermore, a condition for graduation

is that students demonstrate proficiency on the exit outcomes to a

committee of teachers, community or business leaders, and other students.

Such demonstrations may include an original garment design, a creative

solution to a situation in auto mechanics, or a unique approach to a

problem in trigonometry.

6. Standardized tests are not useful in conducting external curriculum

audits. To determine the quality of a school or district curriculum, student

performance must be compared with that of students from all over the state or

nation, using the same procedures and techniques and sharing the results.

Although this is precisely the purpose of a standardized test, the tests are not

capable of assessing the full range of the curriculum—all the knowledge

and skills that educators and the community consider important. A

curriculum audit should reflect the entire range of the curriculum deemed

important by educators and the community.

7. Standardized test results that are used for high-stakes purposes create

a temptation to cheat in some manner. There are many stories about

schools that exclude low-achieving students from a standardized test

because including them would depress their average score and cause them

to lose face (or worse) in comparison with other schools. There are also

stories about teachers who unfairly coach their students on test items or

provide unauthorized assistance during the actual assessment.

8. The impact standardized tests have on instruction is problematic. Test

construction emphasizes basic skills and neglects many of the most

important outcomes of schooling. When teachers “teach to the test,” they

emphasize basic skills at the expense of higher-order reasoning skills.

9. Use of standardized test results is limited. Standardized tests predict

how many years of conventional education a student will attain. They do

not, however, predict occupational success in fields such as medicine,

engineering, teaching, scientific research, and business.

Cautions

In and of themselves, tests are incapable of harming students. It is the way test

results are used that is potentially harmful to students. (See Activity 3.1 to

conduct your own analysis of standardized tests.) Even the best tests can

create problems if their results are misused. The issue is not whether

standardized tests should exist, but rather how their results should be used.

Some helpful hints follow.

1. Make sure you are using the right tests. Schools often devise new goals

and curriculum only to assess their success with tests that are not relevant

to either the goals or the new instruction or materials. Whatever the

purposes of assessment, they cannot be accomplished unless the correct

tests are used.

2. Do not use the results of standardized tests to judge the success of

local programs and goals. Standardized tests cover large segments of

subject matter or general abilities related to learning. They focus on

general goals common to schools across the country and are not suitable

for evaluating limited instruction, such as in a single learning unit or for

judging how well a strictly local instructional goal is achieved.

3. Assume that test results are fallible and not always accurate. Low

scores can be the result of (a) poor health, negative moods, and

distractions; (b) lack of test-taking skills; and (c) inability to take tests

well due to factors such as anxiety. Every test score contains possible

error. Many students who score poorly on standardized tests excel in

school, college, or occupations.

4. Use more than a single test score to make important decisions. Given

the possibility of error that exists for every test score, a single test score is

too suspect to serve as the sole criterion for any crucial decision.

Supporting evidence is needed.

5. Do not set arbitrary minimums for performance on tests. Using

arbitrary minimums to make critical decisions is inherently unfair. If the

standard is arbitrarily set at 85, for example, there may be no valid reason

to predict that a person who scores 86 will perform better in the future

than a person who scores 84. Tests do not have sufficient validity and

reliability to make such fine distinctions.

6. Remember that a test does not measure all the content, skills, or

behavior of interest. Tests are limited to what they cover, which is

usually a sample of what a student knows or can do. Another test that

samples differently could get quite different results. Scores are

approximations of students’ knowledge and competencies.

7. Remember, in some cases there is no alternative to standardized tests.

The SAT and GRE provide important information, as do advanced

placement tests. Standardized tests have their place and, if used

appropriately, provide information that cannot be currently obtained any

other way.
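
The arithmetic behind hints 3 to 5 (every score contains error, so a cutoff of 85 cannot reliably separate an 84 from an 86) can be sketched with the standard error of measurement from classical test theory, SEM = SD × √(1 − reliability). This formula is not given in the chapter; it is brought in here as an assumption, and the standard deviation of 10 and reliability coefficient of 0.91 are hypothetical values chosen for illustration.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical estimate of the typical error in an observed score."""
    return sd * math.sqrt(1.0 - reliability)


def score_band(observed, sd, reliability):
    """Range of one SEM on either side of an observed score."""
    sem = standard_error_of_measurement(sd, reliability)
    return (observed - sem, observed + sem)


def bands_overlap(a, b):
    """True when two score bands share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical test: standard deviation 10, reliability coefficient 0.91.
SD, RELIABILITY = 10, 0.91
band_84 = score_band(84, SD, RELIABILITY)   # roughly 81 to 87
band_86 = score_band(86, SD, RELIABILITY)   # roughly 83 to 89
print(bands_overlap(band_84, band_86))      # True
```

Under these assumed values the SEM is about 3 points, so the bands around 84 and 86 overlap substantially and both straddle the cutoff of 85.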

ACTIVITY 3.1 ANALYZING STANDARDIZED TESTS

1. Select a standardized test that either you or your students have taken. Note the

types of questions used in the test. Write sample questions that are appropriate for

your students based on the types used in the test. Have your students practice

answering the questions until you are sure they are familiar with how to

answer each type of question.

2. Choose several of the test questions in the standardized test. Analyze and label

them according to the following categories:

a. Prior knowledge needed

b. Higher-level reasoning needed

c. More than one answer seems correct

d. Ambiguous wording

e. Recall required

f. Culturally biased

g. Other:

Comment on your findings:

HOW TO HELP YOUR STUDENTS DO BETTER ON

STANDARDIZED TESTS

Three steps can help students do better on standardized tests: (a) making

students comfortable in the test-taking situation, (b) showing students how to

complete tests efficiently, and (c) helping students understand that their scores

on standardized tests are neither a cause for pride nor shame.

The first step is the warm-up. It involves familiarizing students with the

mechanics of the testing situation. Pass out facsimiles of the answer sheet and

have students practice filling in their names and other information. Rehearse

the preliminary instructions, using the exact language the manual advises.

Practice arranging the seats according to the seating plan suggested by the test

makers. Give students proper pencils and have them practice filling in answer

sheets rapidly because speed is essential on standardized tests and neatness

does not help one’s score. During practice, watch for students who tend to

lose their place or have trouble marking the proper boxes. Having students

practice reading questions on one sheet and marking answers on another sheet

is especially helpful. The purpose of such warm-up procedures is to make the

mechanics of test taking so familiar that students will be relaxed and

competent when faced with the real thing.

The second step is the dry run. Use old copies of the test that contain

questions no longer used to familiarize students with question format, the

vocabulary of instructions, and the general appearance of the test. Have

students devise their own best strategies for test taking and then share their

ideas with one another. Look through the test to determine whether any

special skills, such as reading graphs and charts, are needed. If so, drill

students on those skills. Tell students whether they should guess or avoid

guessing. Because skimming is a vital skill, have students practice reading

passages both aloud and silently, stressing only key words, and reading

passages with the question and answer in mind. At the end of each practice

session, discuss with students the following rules for test taking:

1. As quickly as possible, complete the entire test or section. First answer

only questions you are sure of and those with obvious answers. Lightly

mark the questions that make you pause and return to them later.

2. Leave a minute at the end of the test to fill in any blank boxes. Guess on every question you don’t know unless there is a penalty for doing so.

3. Do not get interested in the reading passages or any information contained in the test. Standardized tests are not for learning or thinking. They are for gauging how well students take tests.

4. Never argue with an answer. Simply select the answers the testing agency will score as correct.

The third step is the follow-through. This involves letting students know that

what was tested was their ability to take a test and their scores are neither

cause for pride nor shame. The real work completed during the school year

measures achievement and ability.

SUMMARY

Standardized tests are tests prepared for nationwide use (usually commercial)

to provide accurate and meaningful information on students’ levels of

performance relative to others at their age or grade levels. National and

subnational normative data are provided for most standardized instruments so

that student performance can be compared to other than local norms.

Standardized test scores may be used to evaluate the effectiveness of

instructional programs, to select and place students, to diagnose students’

problems in learning, and to conduct research.

There are two types of standardized tests—achievement and aptitude.

Standardized achievement tests may be in the form of achievement batteries,

diagnostic tests, and subject-specific tests. Aptitude tests may be general

intelligence tests or multifactor aptitude batteries. Standardized tests ensure that

all students are evaluated by the same criteria. They yield more accurate and

fairer evaluations than unsystematic observations. On the other hand,

standardized achievement and aptitude tests measure a narrow group of verbal

skills and primarily contain multiple-choice items that do not allow students to

demonstrate complex cognitive and problem-solving skills.

To interpret standardized test scores, you need to understand

frequency distributions, measures of central tendency and standard deviation,

percentiles, grade-equivalents, and normal curve equivalents. The criteria that

standardized tests (and all other measurement procedures) have to meet are

reliability, validity, objectivity, practicality, and discrimination. A measure is

reliable if it is consistent and accurate. A measure is valid if it measures what it

is supposed to measure. A measure is objective if there is agreement among

(a) experts on the correct answer to a test item and (b) different scorers on

what score should be assigned to a test paper or questionnaire. Practicality is

determined by the cost of and ease with which a measure is used. A measure

discriminates if it differentiates among high-, medium-, and low- achieving

students.

Teachers can help their students score higher on standardized tests by

making students comfortable in the test-taking situation, showing students

how to complete tests efficiently, and helping students understand that their

scores on standardized tests are neither a cause for pride nor shame. In

holding teachers accountable for student scores on standardized tests, it

should be remembered that teachers can provide an opportunity for students

to learn; they cannot make students learn.