COMPARISON OF THE WAIS AND THE WAIS-R
IN A HIGH SCHOOL POPULATION
by
CHERYL L. SIMON, B.A.
A DISSERTATION
IN
PSYCHOLOGY
Submitted to the Graduate Faculty of Texas Tech University in Partial Fulfillment of the Requirements for
the Degree of
DOCTOR OF PHILOSOPHY
Approved
Accepted
May, 1986
ACKNOWLEDGEMENTS
I am deeply indebted to my chairman, Dr. James
Clopton. His consistent support, guidance,
encouragement, and experimental expertise are in large
measure responsible for the success of this research
project. I would also like to acknowledge the helpful
input of Drs. George, Locke, Maddux, Stoltenberg, and
Wysocki. I am grateful to Bill Hoke, Laura Farrell,
Debbie Shanks, and George Simon for their patience and
diligence in testing subjects. Finally, I wish to thank
the Lubbock Independent School District administrators
for their cooperation with this project and the high
school principals and counselors at Coronado, Estacado,
and Monterey for their patience and helpfulness in the
execution of this study.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS ii
LIST OF TABLES v
LIST OF FIGURES vii
I. INTRODUCTION 1
Overview 1
Developmental History of the Wechsler Scales 3
Organizational Differences
between the WAIS and WAIS-R 11
General Testing Considerations 14
Source of WAIS-R Items 17
Reliability 32
Validity 39
Standardization of the WAIS
and WAIS-R 41
Factor Analysis 44
Studies Comparing WAIS and
WAIS-R Scores 47
Race-IQ Controversy 55
Examiner Effects 60
Rationale for Present Study 65
II. METHOD 71
Subjects 71
Procedure 72
Formal Hypotheses 73
III. RESULTS 77
Subject Characteristics 78
Main Data Analyses 78
Supplementary Data Analyses 93
IV. DISCUSSION 104
Results and Implications of
Hypothesis Testing 104
Limitations of the Present Study 123
Suggestions for Future Research 123
REFERENCES 125
APPENDIX: CONSENT FORM 134
LIST OF TABLES
1. Comparative Order of WAIS and WAIS-R Subtests 12
2. WAIS and WAIS-R Classifications of Intelligence 15
3. Changes in Item Content from the WAIS to the WAIS-R 18
4. Mean Reliability Coefficients and Standard Errors of Measurement across Age Groups for WAIS and WAIS-R Subtest and IQ Scores 34
5. Demographic Information for 70 Subjects Included in WAIS/WAIS-R Comparisons: Race, Age, Sex, School 79
6. WAIS and WAIS-R Range and Mean IQ Scores 80
7. F Values Denoting Main and Interaction Effects in Test x Order x Race Analysis of Variance with WAIS and WAIS-R Counterbalanced for Order of Administration 81
8. Means, Standard Deviations, and Correlations for WAIS and WAIS-R Scores 83
9. Main Effects of Test within Order and Order within Test for the Six Subtest and Three IQ Scores with Significant Test x Order Interactions (Tukey's Method) 88
10. Comparison of WAIS and WAIS-R Subtest and IQ Means for 70 High School Students: Breakdown of Naive Subjects 90
11. Scheffe Test of Multiple Comparisons for Mean White, Black, and Mexican-American Subtest and IQ Scores on WAIS and WAIS-R Combined 92
12. F Values for Examiner Main Effect and Examiner Interactions with Test and Order 96
13. WAIS and WAIS-R Mean IQ Differences in Three IQ Categories 101
14. Comparison of WAIS and WAIS-R Verbal, Performance, and Full Scale Mean IQ Scores for White, Black, and Mexican-American Subjects 114
15. Percentage of Correct Responses for Armstrong and King Items on the WAIS-R 117
LIST OF FIGURES
1. Test x Order Interaction for WAIS and WAIS-R Verbal IQ Scores 84
2. Test x Order Interaction for WAIS and WAIS-R Performance IQ Scores 85
3. Test x Order Interaction for WAIS and WAIS-R Full Scale IQ Scores 86
4. Race x Test Interaction for WAIS and WAIS-R Performance IQ Scores 94
5. Test x Order x Examiner Interaction (Order 1) for WAIS and WAIS-R Digit Symbol Subtest 98
6. Test x Order x Examiner Interaction (Order 2) for WAIS and WAIS-R Digit Symbol Subtest 99
7. Frequency Distribution of WAIS-R FSIQ Scores for the Three Racial Groups 103
CHAPTER I
INTRODUCTION
Overview
In 1981, the Wechsler Adult Intelligence Scale
(WAIS; Wechsler, 1955) was revised and published as the
Wechsler Adult Intelligence Scale-Revised (WAIS-R;
Wechsler, 1981). This revision sought to update the WAIS
content and to provide new norms based upon the scores
obtained from more recent samples of the population. Due
to the many differences between the two tests (revised
instructions, updated test items, altered sequence of
subtest administration, altered scoring of test items,
and updated norms), it is important to determine whether
the two scales provide comparable scores.
Previous studies comparing the two tests have found
significant differences between many of the subtest and
IQ scores. In most cases, WAIS scores have been higher
than WAIS-R scores (Kelly, Montgomery, Felleman, & Webb,
1984; Lippold & Claiborn, 1983; Mishra & Brown, 1983;
Prifitera & Ryan, 1983; Rabourn, 1983; Smith, 1983;
Urbina, Golden, & Ariel, 1983; Wechsler, 1981). However,
Simon and Clopton (1984) found that WAIS-R scores were
higher than WAIS scores in a sample of mentally retarded
individuals. Edwards and Klein (1984) used a sample of
gifted subjects and found no significant difference
between WAIS and WAIS-R scores. The present study will
determine whether differences exist between the two tests
in a sample of high school students.
This study will also compare the performance of
three racial groups (White, Black, and Mexican-American)
on the WAIS and the WAIS-R. Since Wechsler made several
changes in the content of the WAIS-R to make it a more
culture-fair test, it will be informative to compare the
two tests to determine whether the significant
discrepancy found between racial groups on the WAIS
persists on the WAIS-R. Order of administration and
examiner effects are also examined to determine whether
these variables significantly influence WAIS and WAIS-R
scores. Finally, statistical tests are performed to
determine whether differences between the WAIS and WAIS-R
vary according to the subject's IQ.
As an introduction to the research, the
developmental history of the Wechsler scales is
presented. This brief historical summary deals with some
of the trends in the intelligence testing movement which
led to the development of the WAIS and the WAIS-R. The
differences between these two tests are discussed in some
detail. Specifically, differences in WAIS and WAIS-R
test organization, item content, reliability, validity,
and factor analysis are surveyed. A review of studies
comparing the WAIS and WAIS-R is also included to define
the general direction of current research findings.
This general comparison of the WAIS and WAIS-R is
followed by a section which summarizes the historical
events surrounding the development of the race-IQ
controversy in the United States. The literature
concerning racial differences on standardized
intelligence tests is surveyed.
The next section includes a review of the different
types of examiner effects found when administering
standardized intelligence tests. Relevant literature is
also summarized.
A rationale for the present study is discussed. The
methodology, specific hypotheses, and results of the data
analyses are presented. Finally, the results are
discussed as they relate to previous findings.
Limitations of the study are outlined as well as
suggestions for future research in this area.
Developmental History of the Wechsler Scales
Interest in human intelligence and its assessment
emerged in the latter part of the nineteenth century.
The first attempts to understand "mental" concepts were
directed toward the measurement of human perception and
discrimination. It was during this time that E. H. Weber
(1795-1887) and G. T. Fechner (1801-1887) were developing
new methods of psychophysical measurement. Weber had
postulated that as the magnitude of any perceptual
stimulus increases, so does the size of change required
for discrimination to occur. Successful discrimination
of a change in stimulus intensity was termed a "just
noticeable difference" (JND). Fechner attempted to
determine the relationship between stimulus intensity and
perceived intensity by developing a scale that indirectly
measured the subjective impression of stimulus intensity.
Sir Francis Galton, an English biologist, was the
first to engage in research directed toward the
assessment of "intelligence." He conducted several
statistical studies of individual sensory and psychomotor
responses. Galton believed that the measurement of these
responses provides an index of a person's intellectual
functioning. He wrote:
The only information that reaches us concerning outward events appears to pass through the avenue of our senses; and the more perceptive the senses are of difference, the larger is the field upon which our judgment and intelligence can act. (Galton, 1883, p. 27)
James Cattell, a prominent American psychologist in
the 1800's, shared Galton's view that a measure of
intellectual functioning could best be obtained through
tests of sensory discrimination and reaction time.
Cattell developed a series of measures that he termed
"mental tests." These tests were administered
individually and consisted of measures of muscular
strength, sensitivity to pain, visual and auditory
acuity, weight discrimination, speed of movement, and
memory (Anastasi, 1968, p. 9). Other psychologists
developed similar procedures. For example, Jastrow set
up an exhibit at the Columbian Exposition held in
Chicago in 1893 and invited visitors to take tests of
sensory, motor, and other simple perceptual processes
and then compare their performance with the performance
of other people.
Several European psychologists of this period had
devised tests which attempted to measure more complex
intellectual functions. In 1895, Kraepelin compiled
tests to measure practice effects, memory, and
susceptibility to fatigue and distraction. He
considered these to be "the basic factors in the
characterization of an individual" (Anastasi, 1968,
p. 9). Ebbinghaus used tests of arithmetic computation,
memory span, and sentence completion to measure the
scholastic achievement of children. The Italian
psychologist, Ferrari, devised a series of tests that
included motor response measures and picture
interpretation tasks. In 1895, Binet and Henri
criticized most of the available tests for relying too
heavily on sensory input and simple, discrete abilities.
They proposed a series of tests that assessed such
functions as memory, attention, imagination,
suggestibility, and comprehension. These tests were the
forerunners to the Binet intelligence scales.
Although Alfred Binet had devoted many years of
research toward the assessment of intelligence, it was
not until 1905 that he and his associate, Theodore
Simon, developed the first objective and practical scale
to assess intellectual behavior. The Binet-Simon scale
was used to determine which children in Paris were fully
educable, educable with special help in the schools, or
retarded to the point of being unable to benefit from
public education (Matarazzo, 1972). In 1908, Binet and
Simon made improvements in their scale and developed the
concept of mental age. This concept relates the
performance of an individual to the performance of other
persons within a particular age group. For example, a
child whose performance equals that of an average 7-year-
old is given the mental age 7. L. M. Terman, a
professor residing at Stanford University, revised the
Binet-Simon scale and popularized the test in the United
States. The revised scale became known as the
Stanford-Binet (Terman, 1916) and provided a ratio of
mental age to chronological age. This ratio was labeled
"intelligence quotient" or I. Q. The Stanford-Binet
remained the major intelligence test in America for 21
years. In 1937, the 1916 form was revised, divided into
two equivalent forms (L and M), and restandardized on a
sample of 3,184 Americans (Anastasi, 1968).
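
The ratio described above can be illustrated with a minimal sketch. The multiplication by 100 follows the conventional Stanford-Binet formulation; the function name and the example ages are hypothetical and are not drawn from the test manuals.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * mental_age / chronological_age


# A 7-year-old who performs like an average 7-year-old.
print(ratio_iq(7, 7))
# A 5-year-old who performs like an average 7-year-old.
print(ratio_iq(7, 5))
```

Because the denominator keeps growing with age while mental age levels off, a ratio of this kind becomes hard to interpret for adults, which is one reason the scale was considered unsuitable for them.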
Although the Stanford-Binet was fairly popular, much
clinical evidence suggested that the scale was unsuitable
for use with adults (Zimmerman & Woo-Sam, 1973). The
test was originally developed for use with children and
lacked face validity for most adults. Wechsler (1939)
expressed this view:
Asking the ordinary housewife to furnish you with a rhyme to the words "day," "cat," and "mill," or an ex-army sergeant to give you a sentence with the words "boy," "river," or "ball," is not apt to evoke either interest or respect. (p. 17)
David Wechsler, in an attempt to design an
intelligence test for adults that was not merely an
extension of a children's test, developed the
Wechsler-Bellevue (W-B I) in 1939. This scale differed
from previous tests in several ways. It contained
material more suitable for adults, included more adults
in the standardization sample, de-emphasized the
importance of speed, and omitted the relatively routine
manipulation of words (e.g., rhyming words, using words
in a sentence) used in the earlier tests. The W-B I was
divided into Verbal and Performance subtests, with items
grouped by type rather than age level. The Verbal
section consisted of five subtests: Information,
Comprehension, Arithmetic, Similarities, and Digit Span.
Vocabulary was added as an alternate subtest. The
Performance scale also had five subtests: Picture
Completion, Block Design, Picture Arrangement, Object
Assembly, and Digit Symbol.
Wechsler (1939) introduced the concept of the
deviation IQ score when he developed the
Wechsler-Bellevue I. The deviation IQ score defines an
individual's level of intelligence by comparing the
performance of the individual with the scores attained
by members of his or her own age group. A fixed mean
(100) and standard deviation (15) allow an individual's
IQ to have the same basic meaning regardless of the
subject's age. For example, an IQ of 100 obtained by a
50-year-old and by a 20-year-old would reflect the same
relative standing within each individual's age group. The
deviation IQ score has replaced the mental age concept
in modern intelligence tests.
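
The idea of a deviation score can be sketched as a simple rescaling. This is only an illustration of the concept: actual Wechsler IQs are read from tables in the test manual rather than computed from a formula, and the age-group means and standard deviations used below are hypothetical values.

```python
def deviation_iq(score: float, age_group_mean: float, age_group_sd: float) -> float:
    """Rescale a score to mean 100, SD 15 relative to the examinee's own age group."""
    if age_group_sd <= 0:
        raise ValueError("standard deviation must be positive")
    z = (score - age_group_mean) / age_group_sd  # standing within the age group
    return 100.0 + 15.0 * z


# Hypothetical norms: an average score in either age group maps to 100.
print(deviation_iq(score=110, age_group_mean=110, age_group_sd=20))  # younger group
print(deviation_iq(score=90, age_group_mean=90, age_group_sd=25))    # older group
# One standard deviation above the age-group mean maps to 115.
print(deviation_iq(score=130, age_group_mean=110, age_group_sd=20))
```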
The Wechsler-Bellevue II (W-B II) was developed as
an alternate form for situations in which subjects who
had completed the Wechsler-Bellevue I later needed to be
retested. The W-B II was used primarily for
intellectual assessment of military personnel, but was
never as readily accepted as the first version of the
test (Matarazzo, 1972). However, Wechsler adapted and
modified items from the W-B II and used them in the
construction of the Wechsler Intelligence Scale for
Children (WISC; 1949). The WISC was developed as an
intelligence test for children between the ages of six
and sixteen.
Sixteen years after the publication of the W-B I,
Wechsler published the Wechsler Adult Intelligence Scale
(WAIS; 1955). He had the support of the Psychological
Corporation in his efforts to obtain a national sample
of test results for use in standardization of the WAIS.
The WAIS was designed specifically for adults and older
adolescents aged sixteen and above. Soon after its
introduction, the WAIS became one of the most frequently
used psychological tests in the country (Matarazzo,
1972).
The WAIS is composed of 11 subtests. Six of these
subtests are grouped into the Verbal Scale, i.e.,
Information, Comprehension, Arithmetic, Similarities,
Digit Span, and Vocabulary. The scaled scores of these
subtests are combined to yield a Verbal IQ (VIQ). The
remaining five subtests make up the Performance Scale,
i.e., Digit Symbol, Picture Completion, Block Design,
Picture Arrangement, and Object Assembly. These scaled
scores are combined to give a Performance IQ (PIQ). All
11 subtests contribute to a Full Scale IQ (FSIQ).
In 1981, Wechsler issued a revised and
restandardized version of the WAIS. This new edition is
known as the Wechsler Adult Intelligence Scale-Revised
(WAIS-R; 1981). Wechsler's (1981) main objective for
revising and restandardizing the 1955 WAIS was to ensure
its continued validity and effectiveness as a test of
intelligence. The WAIS-R contains many improvements
over the WAIS. These improvements include clarification
of administration and scoring procedures, modification
of outdated or biased item content, and use of a larger
and more up-to-date standardization sample. These
changes will be described in the following sections.
Even though these improvements have been made,
Wechsler's (1981) views on the nature and function of an
intelligence test have not undergone any major changes.
Therefore, it is clear that Wechsler intended for the
WAIS-R to measure the same intellectual factors as the
WAIS.
Organizational Differences between the WAIS and WAIS-R
The WAIS-R is composed of the same 11 subtests as
the WAIS, although there is a difference in order of
administration. In the WAIS, all of the Verbal subtests
are presented first, followed by all of the Performance
subtests. On the WAIS-R, the relative positions of the
subtests have been changed, and Verbal and Performance
subtests are presented alternately. Table 1 lists the order of
presentation of subtests on the two scales. Wechsler
(1981) believed that alternating the Verbal and
Performance subtests on the WAIS-R would help maintain
the subject's interest, but would not affect IQ scores.

Table 1
Comparative Order of WAIS and WAIS-R Subtests

WAIS Subtests          WAIS-R Subtests
Information            Information
Comprehension          Picture Completion
Arithmetic             Digit Span
Similarities           Picture Arrangement
Digit Span             Vocabulary
Vocabulary             Block Design
Digit Symbol           Arithmetic
Picture Completion     Object Assembly
Block Design           Comprehension
Picture Arrangement    Digit Symbol
Object Assembly        Similarities

Note. WAIS = Wechsler Adult Intelligence Scale; WAIS-R = WAIS-Revised.

The method of obtaining IQ scores is similar for both
tests. The raw scores for each of the subtests are
converted into scaled scores. The scaled scores for
each of the 11 subtests are based on a reference group
which consisted of 500 subjects in the standardization
sample between the ages of 20 and 34. These scaled
scores range from 1 to 19 with a mean of 10 and a
standard deviation of 3. The sum of the scaled scores
obtained on the Verbal subtests is converted to a VIQ
score, and the sum of the scaled scores obtained on the
Performance subtests is converted to a PIQ score. An
FSIQ score is then derived from the sum of the Verbal
and Performance scaled scores. The conversion of scaled
score to IQ score is based on age norms. The WAIS has
norms for ten age groups ranging from 16 years, 0
months, to 75 years and over, i.e., 16-17, 18-19, 20-24,
25-34, 35-44, 45-54, 55-64, 65-69, 70-74, 75+. The
WAIS-R provides norms for nine age groups ranging from
16 years, 0 months to 74 years, 11 months. These age
groups are similar to those used for WAIS
standardization, except that the last age group (75+) is
omitted. Although the WAIS-R lacks norms for this group,
it is common practice to use the norms for the 70- to
74-year-old group for those individuals who are 75 years
and older. This, of course, should be done with
caution.
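
The choice of WAIS-R norm group described above can be sketched as a lookup. The age bands follow the list given in the text with the 75+ group omitted; the function name, the use of completed years as input, and the fallback flag are illustrative assumptions rather than anything specified in the manual.

```python
from typing import List, Tuple

# WAIS-R norm groups in completed years, as listed in the text (no 75+ group).
WAIS_R_AGE_GROUPS: List[Tuple[int, int]] = [
    (16, 17), (18, 19), (20, 24), (25, 34), (35, 44),
    (45, 54), (55, 64), (65, 69), (70, 74),
]


def wais_r_norm_group(age_years: int) -> Tuple[str, bool]:
    """Return (norm group label, whether the 70-74 fallback was used)."""
    if age_years < 16:
        raise ValueError("the WAIS-R norms start at 16 years, 0 months")
    for low, high in WAIS_R_AGE_GROUPS:
        if low <= age_years <= high:
            return f"{low}-{high}", False
    # Ages 75 and over: common practice, to be applied with caution.
    return "70-74", True


print(wais_r_norm_group(17))
print(wais_r_norm_group(80))
```

Returning the fallback flag alongside the label keeps the caution visible to whoever interprets the resulting IQ score.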
The WAIS-R, like the WAIS, has a mean IQ score of
100 and a standard deviation of 15. However, the WAIS-R
has a more restricted range of possible IQ scores. WAIS
IQ scores range from 41 to 167 for the VIQ score, from
35 to 185 for the PIQ score, and from 41 to 179 for the
FSIQ score. The WAIS-R IQs range from 46 to 150 for
the VIQ score, 47 to 150 for the PIQ score, and 46 to
150 for the FSIQ score. The most striking difference in
the IQ score range is the ceiling of 150 on all WAIS-R
IQ scores. Wechsler (1981) explained this change:
The WAIS-R is not intended to make fine discriminations among adults of extremely high ability because it has a natural ceiling, like measures of all aptitudes, beyond which it no longer measures what it was originally designed to appraise. For this reason, IQ's above 150 - a point more than three standard deviations above the population mean - are not provided for in the IQ tables of this Manual. (p. 5)
The WAIS-R classifies individual intellectual
levels on the basis of the same IQ score categories as
the WAIS. Table 2 presents these categories, along with
the name assigned to each category for the WAIS and the
WAIS-R. Changes from the WAIS include the use of the
terms High Average and Low Average in place of Bright
Normal and Dull Normal. The category label of Mentally
Retarded also replaces the term Mentally Defective.
General Testing Considerations
It is important for the examiner to be aware of the
changes in the administration and scoring of the WAIS-R.
Some of the more general modifications will be discussed
briefly in this section, whereas others will be
discussed in more detail under the headings of specific
subtests.

Table 2
WAIS and WAIS-R Classifications of Intelligence

IQ Score         WAIS Classification    WAIS-R Classification
130 and above    Very Superior          Very Superior
120-129          Superior               Superior
110-119          Bright Normal          High Average (Bright)
90-109           Average                Average
80-89            Dull Normal            Low Average (Dull)
70-79            Borderline             Borderline
69 and below     Mental Defective       Mentally Retarded

Note. WAIS = Wechsler Adult Intelligence Scale; WAIS-R = WAIS-Revised. The information for this table was obtained from the WAIS and WAIS-R manuals.
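
The classification bands in Table 2 amount to a threshold lookup, sketched below. The labels are the historical ones reported in Table 2, with the WAIS-R labels given without their parenthetical alternates; the function and variable names are hypothetical.

```python
# (lower bound of band, WAIS label, WAIS-R label), as reported in Table 2
BANDS = [
    (130, "Very Superior", "Very Superior"),
    (120, "Superior", "Superior"),
    (110, "Bright Normal", "High Average"),
    (90, "Average", "Average"),
    (80, "Dull Normal", "Low Average"),
    (70, "Borderline", "Borderline"),
]
# Labels for IQ scores of 69 and below.
LOWEST = ("Mental Defective", "Mentally Retarded")


def classify_iq(iq: int, scale: str = "WAIS-R") -> str:
    """Return the classification label for an IQ score on the given scale."""
    if scale not in ("WAIS", "WAIS-R"):
        raise ValueError("scale must be 'WAIS' or 'WAIS-R'")
    column = 0 if scale == "WAIS" else 1
    for lower, *labels in BANDS:
        if iq >= lower:
            return labels[column]
    return LOWEST[column]


print(classify_iq(115, "WAIS"))
print(classify_iq(115, "WAIS-R"))
```

The score boundaries are identical on the two scales; only the labels attached to some bands differ.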
Instructions on scoring are clarified throughout
the WAIS-R manual. For example, the WAIS Similarities
and Vocabulary subtests do not include instructions for
questioning ambiguous responses in order to better
determine whether the response is worth 2, 1, or 0
points. This is changed in these two subtests of the
WAIS-R, with a specific notation, (Q), written after
responses that should be more thoroughly questioned.
The WAIS-R manual also lists specific principles to
guide the scoring of multiple responses. For example,
if a subject gives a response that is intended to
replace an earlier response, the earlier response is
ignored and the later one is scored. The WAIS manual
does not provide such specific guidelines. Instructions
for starting and discontinuing subtests are clarified in
the WAIS-R manual. For example, a score of 0 is given
to items passed after the criteria for discontinuing
have been met. These criteria are explained in the
introduction of each subtest. Upon comparison, it may
be seen that the WAIS-R manual is much more explicit
than the WAIS manual in its rules for administration and
scoring.
Both manuals discuss the issues of appropriate
testing conditions, maintaining rapport, and prorating
the IQ scores when a subtest is not administered. Each
manual treats these issues in a similar fashion,
although the WAIS-R manual tends to include more
specific information on these topics. The record form
itself has been modified, with more space provided for
recording responses and observations. To make this
additional space available, the record form was expanded
from the original four pages of the WAIS to six pages.
The front page of the WAIS-R protocol no longer contains
the Information subtest. Therefore, it can be used as a
summary page for test scores without revealing any of
the raw data.
Source of WAIS-R Items
The WAIS-R retains not only the same subtests and
general characteristics of the WAIS, but also many of
the same items. Approximately 80% of the WAIS-R items
are retained from the WAIS, either in identical form or
with slight modifications. However, some of the items
which appeared dated were revised or omitted, and some
new items were added. Table 3 summarizes the changes in
item content on the WAIS-R.

Table 3
Changes in Item Content from the WAIS to the WAIS-R

                                   Number of Items in the WAIS-R
                              Total      Items      Modified    New
                              WAIS-R     From       WAIS        Items
Test                          Items      WAIS*      Items**
Verbal Scale
  Information (29)a            29         20          0          9
  Digit Span (14)              14         14          0          0
  Vocabulary (40)              35         33          0          2
  Arithmetic (14)              14         12          1          1
  Comprehension (14)           16         12          0          4
  Similarities (13)            14         10          1          3
Performance Scale
  Picture Completion (21)      20         14          1          5
  Picture Arrangement (8)      10          6          0          4
  Block Design (10)             9          9          0          0
  Object Assembly (4)           4          4          0          0
  Digit Symbol (90)            93         90          0          3

Note. This table was adapted from information obtained in the WAIS-R manual.
a The number of items in the 1955 WAIS is shown in the parentheses following the subtest name.
* Items same as WAIS or slightly modified by rewording or redrawing.
** Items from WAIS substantially modified.

Of the 258 items on the WAIS-R, 227 appeared on the
WAIS. Of these 227 items from the WAIS, 134 originated
on the 1939 W-B I and 93 appeared as new items on the
1955 WAIS. Thus, 87.9% of the items on the WAIS-R are
from previous Wechsler scales (WAIS or W-B I), and only
12.1% of the items on the WAIS-R are new. Due to the
high degree of item overlap between the two scales,
WAIS-R scores would be expected to correlate
highly with previous Wechsler scales. Furthermore, any
sizable differences found between scores on the scales
must reflect more than changes in content alone.
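
The item counts behind these proportions can be checked with a short tally. The per-subtest figures below are transcribed from Table 3; the variable names are hypothetical, and the shares come out at roughly 88% carried over and 12% new.

```python
# subtest: (total WAIS-R items, items from WAIS, modified WAIS items, new items)
TABLE_3 = {
    "Information": (29, 20, 0, 9),
    "Digit Span": (14, 14, 0, 0),
    "Vocabulary": (35, 33, 0, 2),
    "Arithmetic": (14, 12, 1, 1),
    "Comprehension": (16, 12, 0, 4),
    "Similarities": (14, 10, 1, 3),
    "Picture Completion": (20, 14, 1, 5),
    "Picture Arrangement": (10, 6, 0, 4),
    "Block Design": (9, 9, 0, 0),
    "Object Assembly": (4, 4, 0, 0),
    "Digit Symbol": (93, 90, 0, 3),
}

total_items = sum(row[0] for row in TABLE_3.values())
# Items carried over from the WAIS, whether unchanged or substantially modified.
carried_over = sum(row[1] + row[2] for row in TABLE_3.values())
new_items = sum(row[3] for row in TABLE_3.values())

print(total_items, carried_over, new_items)
print(f"carried over: {carried_over / total_items:.3f}")
print(f"new:          {new_items / total_items:.3f}")
```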
Wechsler (1981) summarized the changes in content,
administration, and scoring for each subtest of the
WAIS-R. Smith (1982) elaborated somewhat on these
changes and discussed possible effects that these
subtest differences could have on resulting WAIS-R
scores. A survey of the changes made in WAIS-R subtest
content, administration, and scoring may help in
determining the source of WAIS and WAIS-R score
differences.
WAIS-R Verbal Subtests
Information. The Information subtest is a series
of questions that tap knowledge commonly attained in an
educational setting. The Information subtest of the
WAIS-R is composed of 29 items just as it is on the
WAIS. Nine of these WAIS-R items are from the W-B I, 11
originated on the WAIS, and nine items are completely
new. Items which changed over time were updated, e.g.,
"What is the population of the United States?" Also, new
items replaced questions judged to be culturally biased.
These new items relate to women (Marie Curie, Amelia
Earhart), minorities (Louis Armstrong, Martin Luther
King), and information of more recent historical
interest (theory of relativity, Presidents of the
U. S. since 1950).
The only change in the administration of the
Information subtest has to do with the presentation of
the first six items. The examiner still begins with
item 5, but now items 1-4 are given if either item 5 or
6 is missed. On the WAIS, both items 5 and 6 had to be
missed in order to go back to the first four items. The
subtest is still discontinued after five consecutive
failures. Credit for items 1-4 is given if items 5 and
6 are passed.
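
The difference between the two starting rules can be stated as a small decision function. This is a sketch of the rule as described above, not of the manual's wording; the function and parameter names are hypothetical.

```python
def must_give_items_1_to_4(item5_passed: bool, item6_passed: bool,
                           scale: str = "WAIS-R") -> bool:
    """Decide whether the examiner goes back to Information items 1-4."""
    if scale == "WAIS-R":
        # Go back if EITHER item 5 or item 6 is missed.
        return not (item5_passed and item6_passed)
    if scale == "WAIS":
        # Go back only if BOTH items 5 and 6 are missed.
        return (not item5_passed) and (not item6_passed)
    raise ValueError("scale must be 'WAIS' or 'WAIS-R'")


# A subject who passes item 5 but misses item 6:
print(must_give_items_1_to_4(True, False, "WAIS-R"))
print(must_give_items_1_to_4(True, False, "WAIS"))
```

The two rules disagree only when exactly one of items 5 and 6 is missed, which is the case in which WAIS-R examinees are given the easier items and WAIS examinees are not.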
The updating of individual subtest questions should
increase the likelihood that subjects will be familiar
with the material. The changes in administration of the
first six items should also have the effect of
increasing variability at the lower end of the scale.
Digit Span. The Digit Span subtest requires the
subject to listen to several series of digits and
memorize the sequence of digits in each series. On the
first seven sets of digits, the subject is instructed to
repeat the series verbatim. The last seven series are
to be repeated in reverse order. The WAIS-R retains the
same digits as the WAIS, and each item has two trials.
On the WAIS, the second trial of a series is
administered to the subject only if the first trial is
failed. The WAIS-R manual, on the other hand, instructs
that both trials are to be administered even if the
first trial is passed. The subtest is discontinued when
both trials of an item are missed.
The method of scoring the Digit Span subtest has
also been modified. On the WAIS, the score for the
Digit Span subtest equals the number of digits in the
longest forward series repeated correctly, plus the
number of digits in the longest backward series repeated
correctly. On the WAIS-R Digit Span, each item is
assigned a value of 2, 1, or 0 points, depending on the
number of trials passed. This raises the maximum
possible score from 17 to 28, thus increasing the
sensitivity of measurement for this particular subtest.
Unfortunately, the administration of this subtest now
takes more time due to the increased number of trials.
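
The two scoring methods can be compared in a short sketch. It assumes forward series of 3 to 9 digits and backward series of 2 to 8 digits, which is consistent with the maximum scores of 17 and 28 given above; the data layout and function names are hypothetical, and the discontinue rule is not modeled.

```python
from typing import Dict, Tuple

# series length -> (trial 1 passed, trial 2 passed)
Trials = Dict[int, Tuple[bool, bool]]


def wais_digit_span(forward: Trials, backward: Trials) -> int:
    """WAIS: longest forward series passed plus longest backward series passed."""
    def longest(trials: Trials) -> int:
        passed = [length for length, (t1, t2) in trials.items() if t1 or t2]
        return max(passed, default=0)
    return longest(forward) + longest(backward)


def wais_r_digit_span(forward: Trials, backward: Trials) -> int:
    """WAIS-R: each item earns 2, 1, or 0 points for the number of trials passed."""
    return sum(int(t1) + int(t2)
               for trials in (forward, backward)
               for t1, t2 in trials.values())


# A perfect performance on every series reaches the ceiling under both methods.
perfect_forward = {n: (True, True) for n in range(3, 10)}
perfect_backward = {n: (True, True) for n in range(2, 9)}
print(wais_digit_span(perfect_forward, perfect_backward))
print(wais_r_digit_span(perfect_forward, perfect_backward))
```

Under the WAIS-R method, two subjects with the same longest span can receive different scores if one of them fails more individual trials, which is the added sensitivity noted above.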
Vocabulary. The Vocabulary subtest requires the
subject to define a series of words. The WAIS-R
Vocabulary list contains 35 words, compared with the 40
words used in the WAIS. Seven of the WAIS words have
been dropped and two new words added. Besides word
changes, the order of words has also been rearranged to
reflect the current level of item difficulty. Several
items were omitted either because they were too
difficult to define (e.g., travesty) or too difficult to
score (e.g., slice). Wechsler (1981) also wanted to
shorten the subtest.
There has been no change in the administration
procedures of the Vocabulary subtest. Examiners begin
with word 4 and give words 1-3 if any word from 4-8 is
failed. The subtest is still discontinued after five
consecutive failures. The scoring of words 1-3 has been
changed. Whereas these items are scored 2 or 0 on the
WAIS, they are now given scores of 2, 1, or 0. These
changes in scoring should increase variability slightly
at the extreme lower end of functioning, but should have
no other major effects. Administration time may be
shortened slightly since there are five fewer items to
be defined and the words have been rearranged according
to item difficulty.
Arithmetic. The Arithmetic subtest requires
subjects to solve several arithmetic word problems mentally.
This subtest consists of 14 items, the same number as
the WAIS. There are five items retained from the W-B I,
eight items that originated on the WAIS, and one new
item. Although several of these items have been
reworded in an effort to update them, the computations
are nearly identical to those on the WAIS. Several
items were reworded in an attempt to remove references
to the male gender, to reflect current prices, and to
update cultural material. These changes should serve to
increase the face validity of the subtest.
The administration of the subtest remains
unchanged. The examiner starts with item 3, giving
items 1-2 only if items 3 and 4 are both missed. The
subtest is discontinued after four consecutive failures.
The rules for allowing bonus points for rapid, correct
solutions have been altered slightly. On the WAIS-R,
bonus points are awarded for five items instead of the
four items which include bonus points on the WAIS.
Therefore, the maximum score is increased from 18 to 19.
Comprehension. The Comprehension subtest requires
the subject to answer questions that measure his or her
ability to make practical judgments. This subtest is
composed of 16 items. Two WAIS items have been dropped
from the WAIS-R and four new items have been added.
Comprehension now contains eight items from the W-B I
and four from the WAIS. The order of item difficulty
has been rearranged. For example, all of the proverbs
are now placed toward the end of the subtest. There has
also been some clarification in the wording of several
items.
There have been several changes in the
administration and scoring of the Comprehension subtest.
The examiner now starts with the first item for all
subjects, instead of with item 3 as on the WAIS. The
examiner is also allowed to give help on the first item
if the subject fails to give a 2-point response to the
question, whereas there was no assistance allowed on the
WAIS. For those items requiring two ideas for full
credit, subjects giving only one idea are now asked for
a second response. The maximum score is 32 instead of
28 as on the WAIS.
These changes in administration and scoring should
help the subject better understand the nature of the
required response. The changes also increase the
variability at the lower end of the scale.
Similarities. The Similarities subtest requires
the subject to tell how two objects or concepts are
alike. This subtest now contains 14 items as compared
to the 13 items on the WAIS. One item from the WAIS was
modified to remove a specific gender reference, e.g.,
Coat-Dress changed to Coat-Suit. Of the 14 items, nine
originated on the W-B I, two on the WAIS, and three
items are new.
Administration of the Similarities subtest remains
the same as the WAIS. The examiner starts the test with
item 1 and discontinues after four consecutive failures.
However, the examiner may now give help by supplying the
subject with the correct answer on item 1 if the subject
fails to give a 2-point response. On the WAIS, help was
given on item 1 only if it was failed completely.
Scoring of the items remains the same, with each answer
being assigned a score of 2, 1, or 0. All of the
changes mentioned above increase the maximum score from
26 on the WAIS to 28 on the WAIS-R. Changes should also
clarify for the subject the type of response that is
expected.
WAIS-R Performance Subtests
Picture Completion. The Picture Completion subtest
requires subjects to find the most important detail
missing in a series of pictures. There are now 20 items
on this subtest as compared to the 21 items on the WAIS
Picture Completion. Six items from the WAIS were
dropped and five new ones added. The subtest contains
eight items from the W-B I and seven items that
originated on the WAIS. Several of the items have been
redrawn in order to improve clarity, to modernize the
content, to include minorities, and to lessen regional
bias. For example, one item from the WAIS was
substantially modified, i.e., a snow scene was changed
to a beach scene.
Administration of the WAIS-R Picture Completion
remains very similar to the WAIS. However, there have
been a few minor changes. For example, every WAIS item
was introduced with the question, "Now what is missing
in this picture?" On the WAIS-R, however, it is
permissible to modify or omit this phrase once it is
clear that the subject understands the instructions. On
both versions, the subject begins with item 1 and
discontinues after five consecutive failures. The
examiner is now allowed to give help on the first and
second item failed. On the WAIS, help is given on the
second item only if the first item is failed. There is
still a 20 second time limit for each picture, and the
subject is questioned only the first time an unimportant
missing part is named. One point is given for each
correct answer. This yields a maximum score of 20
instead of 21 as on the WAIS.
These changes should have no major effect on the
scores obtained on the WAIS-R. The added flexibility
given to the examiner in helping the subject may be of
some benefit to those individuals functioning at a lower
level of intelligence.
Picture Arrangement. The Picture Arrangement
subtest requires the subject to arrange several sets of
pictures in the correct order so that they tell a story
that makes sense. There are now ten Picture Arrangement
items on the WAIS-R as compared to the eight items on
the WAIS. Two WAIS items were eliminated and four new
ones added. One item (Nest) was eliminated because it
was too easy and the other (Hold-up) was eliminated
because it proved to be ambiguous. One WAIS item
(Enter) was modified and slightly redrawn. New items
and modifications of former items have improved clarity,
and now represent people of both sexes and different
races.
There have been several changes made in the
administration and scoring of the WAIS-R Picture
Arrangement subtest. First of all, testing on the WAIS
subtest continues if either item 1 or 2 is passed. If
both trials of items 1 and 2 are failed, testing is
discontinued. Rather than allowing for two trials on
items 1-2, the WAIS-R only allows two trials on item 1.
Testing is continued even if both trials of item 1 are
failed, and it is only discontinued after 4 consecutive
failures. Two points are awarded for a correct
arrangement and one point for an acceptable variation.
Time bonuses are no longer awarded on the WAIS-R,
reducing the maximum score from 36 to 20 points.
The provision for discontinuing may make this
subtest less frustrating for those subjects functioning
at the lower level of intelligence since there is only
one trial on item 2. This change should also shorten
administration time.
Block Design. The Block Design subtest requires
the subject to reproduce several designs using either
four or nine blocks. All of the blocks are identical,
with some sides being all red, some sides all white, and
some sides half red and half white. There are nine
Block Design items on the WAIS-R instead of the original
ten on the WAIS. One simple item was eliminated and no
new items were added. The item dropped was too similar
in difficulty to other items to be of discriminative
value. The WAIS-R subtest contains seven items from the
W-B I and two items that originated on the WAIS.
The administration procedures of the WAIS-R Block
Design remain the same. The examiner begins with item
1, allows two trials if needed on items 1 and 2, and
discontinues after three consecutive items are failed.
However, the scoring on the WAIS-R has been changed
substantially. The value of the first two items has
been reduced from 4, 2, or 0 points to 2, 1, or 0
points. The number of items which include bonus points
for speed has been increased from four to seven and the
number of bonus points that can be earned has increased
on five items from 2 to 3 points. Therefore, even
though the length of the subtest has been reduced by one
item, the maximum score has increased from 48 on the
WAIS to 51 on the WAIS-R.
These changes should result in more score
variability in the middle and upper ranges of
functioning. The increase in credit for rapid
performance may differentially affect some subgroups of
the population.
Object Assembly. The Object Assembly subtest
requires subjects to put together puzzle pieces to make
different objects. The WAIS-R Object Assembly subtest
is composed of the same four items that are found on the
WAIS, i.e., Manikin, Profile, Hand, Elephant. The only
change in the items themselves involves some minor
modernization of the manikin. The reverse sides of the
puzzle pieces are now gray in color instead of being the
same beige color on both sides as they are on the WAIS.
All four of the items are administered to each
subject, and the object pieces are presented to the
subject in the same pattern as on the WAIS. On the WAIS
and WAIS-R, one point is awarded for each joint that is
correctly assembled within the time limit. The subject
can earn up to three bonus points for rapid, perfect
completion of each item. However, even though there are
three categories of bonus points on each test (based on
the number of seconds it takes for the subject to
complete the task), the WAIS-R makes these points more
difficult to obtain by requiring faster completion of
the object. The highest number of points that may be
earned on items 2, 3, and 4 has been decreased by one
point each on the WAIS-R. Therefore, the total possible
score is now 41 as opposed to 44 on the WAIS.
It is quite probable that these changes will have
no major effect on Object Assembly scores. Variability
should not be changed since decreasing the scores on the
last three items was accomplished by eliminating a
one-point gap that existed on the WAIS between the score
for perfect completion and the first time bonus. Of
course, the coloring of the reverse side of each object
piece should eliminate the problem of subjects turning
the pieces over by mistake since they should now
immediately recognize their error.
Digit Symbol. On the Digit Symbol subtest,
subjects are required to associate geometric symbols
with the numbers 1-9. The subject is asked to reproduce
the symbols beneath their paired numbers in a timed
paper-and-pencil task. There are no significant changes
made in this subtest. The same nine number-symbol
pairings used on the W-B I and the WAIS are now used in
the WAIS-R. The total number of items has been
increased by three by simply reducing the number of
sample items by three. There are 93 WAIS-R items
instead of the 90 items on the WAIS. The order of items
on the first row has been changed slightly because of
the shift in the starting point. The subject is still
given 90 seconds to complete as many items as possible.
The additional three items included in the WAIS-R should
increase the variability somewhat among higher
functioning individuals.
Reliability
Split-Half Reliability
Wechsler (1955) computed reliability coefficients
for WAIS IQ and subtest scores (with the exception of
Digit Span and Digit Symbol) using the split-half
correlational technique. This involved computing the
correlation between scores on odd and even items and
correcting the coefficient for the full length of the
test with the Spearman-Brown formula. In the case of
Digit Span, where the two halves of the test may be
considered separate tests, the correlation for Digits
Forward and Digits Backward scores was corrected for the
full length of the test. A separate study was done to
estimate the reliability of the Digit Symbol subtest
since the split-half technique is inappropriate for timed
tests. Reliability coefficients were computed for three
age groups in the standardization sample: 18-19, 25-34,
and 45-54.
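The odd-even procedure described above can be illustrated with a short Python sketch. The item scores are hypothetical (they are not data from the WAIS standardization sample), and the function names pearson_r, spearman_brown, and split_half_reliability are introduced here only for illustration. The sketch correlates the odd-item and even-item half scores and then applies the Spearman-Brown correction, r_full = 2r / (1 + r), to estimate the reliability of the full-length test.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half, length_factor=2.0):
    """Project the reliability of a test lengthened by length_factor."""
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


def split_half_reliability(item_scores):
    """Odd-even split-half reliability, corrected to full test length.

    item_scores holds one list of item scores per subject.
    """
    odd_totals = [sum(items[0::2]) for items in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(items[1::2]) for items in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_totals, even_totals))


# Hypothetical pass/fail scores for six subjects on an eight-item subtest.
HYPOTHETICAL_ITEMS = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 0, 0, 0],
]

if __name__ == "__main__":
    print(round(split_half_reliability(HYPOTHETICAL_ITEMS), 2))
```

The correction is needed because each half is only half as long as the full test, and shorter tests tend to be less reliable. As noted above, this approach is not suitable for timed subtests such as Digit Symbol.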
In Wechsler's (1955) study, Full Scale IQ's yielded
reliability coefficients of .97 in all three age groups.
Verbal IQ's had identical reliabilities of .96 in all
three groups, and Performance IQ's had reliabilities of
.93 and .94. Therefore, it may be seen that all three
IQ's have a high degree of internal consistency. The
individual subtests yielded lower reliabilities ranging
from some coefficients in the .60's found with Digit
Span, Picture Arrangement, and Object Assembly, to
coefficients as high as .96 for Vocabulary.
Wechsler (1981) also employed a split-half procedure
to determine the reliability of the WAIS-R IQ and subtest
scores. As on the WAIS, different procedures were used
to obtain coefficients for the Digit Span and Digit
Symbol subtests. The reliabilities for the three IQ's are
very high across all nine age groups. The average
coefficients for the VIQ, PIQ, and FSIQ across age ranges
were .97, .93, and .97, respectively. Coefficients for
the individual subtests ranged from .52 for Object
Assembly to .96 for Vocabulary.
Ryan, Prifitera, and Larsen (1982) conducted a study
concerning the split-half reliability of the WAIS-R for a
mixed sample of psychiatric and neurological patients.
Split-half reliability coefficients ranged from .[?]2 for
the Arithmetic subtest to .92 for Vocabulary and Block
Design. The results of this study are consistent with
Wechsler's (1981) conclusions and demonstrate that the
WAIS-R is a reliable instrument.
A comparison of the WAIS and WAIS-R reliability data
as computed by Wechsler (1955, 1981) in Table 4 suggests
Table 4
Mean Reliability Coefficients and Standard Errors of
Measurement Across Age Groups for WAIS and WAIS-R
Subtest and IQ Scores
                            WAIS             WAIS-R
Test                      r      SEm       r      SEm
Information              .91     0.87     .89     0.93
Digit Span               .68     1.71     .83     1.2?
Vocabulary               .95     0.6?     .96     0.61
Arithmetic               .82     1.32     .84     1.1?
Comprehension            .78     1.43     .84     1.20
Similarities             .86     1.19     .84     1.24
Picture Completion       .83     1.16     .81     1.25
Picture Arrangement      .67     1.61     .74     1.41
Block Design             .84     1.20     .87     0.98
Object Assembly          .68     1.63     .68     1.5?
Digit Symbol             .92     0.85     .82     1.2?
Verbal IQ                .96     3.00     .97     2.74
Performance IQ           .93     3.87     .93     4.14
Full Scale IQ            .97     2.60     .97     2.53
Note. The standard errors of measurement are in scaled
score units for the subtests and in IQ score units for
the Verbal, Performance, and Full Scale IQ scores. This
table contains reliability and standard error of
measurement data obtained from the WAIS and WAIS-R
manuals.
slight but inconsistent increases in reliability for the
WAIS-R. The WAIS-R's internal consistency coefficients
are slightly lower for three subtests: Information,
Similarities, and Picture Completion. The remaining six
subtests (excluding Digit Span and Digit Symbol) show
either identical or slightly higher split-half
correlations for the WAIS-R. The increase in the
sensitivity of measurement for the WAIS-R Digit Span
appears to have produced an increase in reliability for
that subtest. Therefore, even though Wechsler (1981)
concluded that the WAIS-R is more reliable than the WAIS,
it would seem more appropriate to say that the
reliability of the WAIS-R is comparable to the WAIS and
that both have highly reliable scores.
The standard error of measurement (SEm) is another
way to determine the presence of error variability. This
measurement describes the band of error surrounding an
individual's theoretical "true" score on that test. That
is, the SEm estimates the standard deviation of an
individual's scores on a test if the person were tested a
large number of times, and there were no changes in
scores due to the repeated testing. Table 4 presents the
average standard error of measurement for WAIS and WAIS-R
scores. There appears to be a tendency toward slightly
lower SEm's on the WAIS-R. However, four subtests show
an increase in variability on the WAIS-R: Information,
Similarities, Picture Completion, and Digit Symbol.
Compared to the WAIS, the WAIS-R has an SEm that is lower
for VIQ and FSIQ scores and higher for PIQ scores.
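In classical test theory the SEm is commonly computed as SD x sqrt(1 - r), where r is the reliability coefficient. The Python sketch below applies that formula. The function names are illustrative, and the standard deviations (15 for IQ scores, 3 for subtest scaled scores) are the conventional Wechsler scale values rather than figures taken from Table 4. Because the tabled values are averages across age groups, the formula applied to a mean reliability coefficient should approximate, but will not necessarily reproduce, the tabled SEm.

```python
from math import sqrt

IQ_SD = 15.0           # conventional SD of Wechsler IQ scores
SCALED_SCORE_SD = 3.0  # conventional SD of Wechsler subtest scaled scores


def standard_error_of_measurement(sd, reliability):
    """SEm = SD * sqrt(1 - r) under classical test theory."""
    return sd * sqrt(1.0 - reliability)


def score_band(observed, sd, reliability, z=1.0):
    """Band of +/- z SEm around an observed score."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - z * sem, observed + z * sem


if __name__ == "__main__":
    # Full Scale IQ with a reliability coefficient of .97
    print(round(standard_error_of_measurement(IQ_SD, 0.97), 2))
    # Band of one SEm around a hypothetical observed IQ of 100
    low, high = score_band(100, IQ_SD, 0.97)
    print(round(low, 1), round(high, 1))
```

The formula makes the direction of the relationship explicit: for a fixed standard deviation, a higher reliability coefficient yields a narrower band of error around the observed score.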
Test-Retest Reliability
There have been several test-retest studies
conducted using the WAIS-R (Ryan, Georgemiller, Geisser,
& Randall, 1985; Wechsler, 1981). The WAIS-R manual
(1981) reported the results of two test-retest studies
for two separate age groups, i.e., 25- to 34-year-olds and
45- to 54-year-olds. Correlations for the 25- to 34-year-
old group's VIQ, PIQ, and FSIQ scores were .94, .89, and
.95, respectively. For the 45- to 54-year-old group the
VIQ, PIQ and FSIQ correlational values were .97, .90, and
.96. Using intertest intervals of 2-5 weeks in the 25-34
age group and 2-7 weeks in the 45-54 age group, Wechsler
(1981) obtained comparable increases in the mean IQ
scores for the younger group (VIQ = 3.3, PIQ = 8.9, and
FSIQ = 6.6) and the older group (VIQ = 3.1, PIQ = 7.7,
and FSIQ = 5.7). Ryan et al. (1985) also obtained highly
significant correlation coefficients on the WAIS-R using
a sample of 21 psychiatric and neurologically impaired
patients (r's = .79, .88, and .86 for VIQ, PIQ, and FSIQ,
respectively). There was a mean IQ gain on retest of
2.91 points for VIQ, 4.52 points for PIQ, and 3.86 points
for FSIQ. Of course, these practice effects are somewhat
less than those found by Wechsler (1981) in his
test-retest studies. Ryan et al. (1985) suggested that
test-retest changes may vary according to the population
studied.
The significant correlations achieved in the three
WAIS-R reliability studies are consistent with the
results of a WAIS test-retest study conducted by Coons
and Peacock (1959). They found that VIQ, PIQ, and FSIQ
scores correlated about .96 in a sample of psychiatric
patients. This study also revealed that WAIS IQ scores
increased on retest, with VIQ, PIQ, and FSIQ score
increases averaging 2.6, 8.6, and 5.0 points. It may be
concluded on the basis of the highly significant
correlation coefficients that the WAIS and the WAIS-R
have excellent test-retest reliability. However, there
is a consistent practice effect for the two tests which
leads to significantly increased scores on retest,
particularly for the Performance IQ score.
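The distinction drawn above between a high test-retest correlation and a systematic practice effect can be illustrated with a short Python sketch. The scores are hypothetical and the function names are introduced only for illustration. A correlation coefficient reflects whether subjects keep their relative standing across the two sessions, so it can remain high even when nearly every score rises on retest.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def retest_summary(first, second):
    """Return (test-retest correlation, mean gain from first to second testing)."""
    gains = [b - a for a, b in zip(first, second)]
    return pearson_r(first, second), sum(gains) / len(gains)


# Hypothetical IQ scores for five subjects tested twice.
FIRST_TESTING = [95, 100, 105, 110, 120]
SECOND_TESTING = [99, 103, 111, 113, 126]

if __name__ == "__main__":
    r, mean_gain = retest_summary(FIRST_TESTING, SECOND_TESTING)
    print(round(r, 2), round(mean_gain, 1))
```

In this hypothetical sample every subject gains between 3 and 6 points, yet the rank order is preserved, so the correlation stays close to 1. This is why the correlation and the mean gain need to be reported separately.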
Scoring Reliability
Scoring reliability on the WAIS has been of concern
to examiners for many years (Cohen, 1965; Sattler,
Winget, & Roth, 1969; Schwartz, 1966; Walker, Hunt, &
Schwartz, 1965). Guertin, Ladd, Frank, Rabin, and
Hiester (1971) reviewed the literature concerning scoring
reliability and concluded that even minor changes in test
procedures can affect individual scores. They also noted
that differences in the test scores obtained by different
examiners are often found, but that little is known about
the factors that contribute to these differences.
Research indicates that certain subtests (Comprehension,
Similarities, and Vocabulary) require a great deal of
examiner judgment, and that examiners often assign
different scores to identical responses. One variable
that does not appear to be a factor involved in the
ability to score tests reliably is the examiner's level
of experience (Schwartz, 1966). These researchers found
that less experienced examiners had the same level of
scoring disagreement as more skilled clinicians.
To date, there is only one study published
concerning scoring reliability on the WAIS-R. Ryan
(1983) employed 19 psychologists and 20 graduate students
to score WAIS-R protocols from two vocational counseling
clients. He found that mechanical scoring error produced
IQ scores that differed by between 4 and 18 points,
regardless of the examiner's level of experience. For
both protocols and both groups of examiners, scoring
agreement with the actual Full Scale IQ scores ranged
from 32% to 35%. However, over 77% of the scores were
within one standard error of measurement of the true
scores. The results of this study suggest that the
improvements made in the WAIS-R scoring procedures did
not produce greater consistency in scoring among
examiners. The WAIS-R still allows the judgment of the
examiner to enter into the scoring process, especially on
the subtests of Comprehension, Vocabulary, and
Similarities.
Validity
Wechsler did not present validity data in either the
WAIS or the WAIS-R manual. In fact, he appears to assume
that intelligence is whatever it is that the Wechsler
Scales measure. Wechsler (1981) concluded that since the
WAIS-R measures the same abilities as the WAIS and the
W-B I, and also overlaps considerably in test content
with the earlier forms of the Wechsler Adult Scales, then
any validity studies with the WAIS and the W-B I may be
considered relevant to the WAIS-R.
There has been a great deal of research that
examines the validity of the WAIS. Matarazzo (1972)
surveyed studies that correlated the WAIS with academic
success (r = .50) and years of educational experience (r
= .70). Matarazzo (1972) also reviewed many other
articles concerning the concurrent validity of the WAIS
in an attempt to determine the relationships between
different variables and the WAIS. Zimmerman and Woo-Sam
(1973) concluded that the strength of the relationship
between WAIS IQs and other criteria of intelligence is
dependent on the reliability of the other criteria. For
example, if the reliability of the comparative criterion
is low, the relationship between that specific criterion
and the WAIS will be weakened.
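One standard way to express this point, assuming the classical test theory model, is Spearman's attenuation formula: the observed correlation equals the correlation between true scores multiplied by the square root of the product of the two reliabilities. The Python sketch below uses hypothetical values; neither the numbers nor the function names come from Zimmerman and Woo-Sam (1973).

```python
from math import sqrt


def attenuated_correlation(true_r, reliability_x, reliability_y):
    """Observed correlation expected when both measures contain error."""
    return true_r * sqrt(reliability_x * reliability_y)


def max_observable_correlation(reliability_x, reliability_y):
    """Upper bound on the observed correlation between two measures."""
    return sqrt(reliability_x * reliability_y)


if __name__ == "__main__":
    # Hypothetical: true correlation of .80 and a test reliability of .97,
    # paired with a reliable criterion (.90) and an unreliable one (.60).
    print(round(attenuated_correlation(0.80, 0.97, 0.90), 2))
    print(round(attenuated_correlation(0.80, 0.97, 0.60), 2))
```

Under these assumptions, the same underlying relationship produces a smaller validity coefficient when the criterion is measured less reliably, even though the test itself is unchanged.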
To date, there has been only one validity study
published for the WAIS-R. Ryan and Rosenberg (1983)
found a significant relationship between the Verbal and
Full Scale IQ scores of the WAIS-R and the three
achievement standard scores (Reading, Spelling, Math) of
the Wide Range Achievement Test (WRAT). These findings
support the concurrent validity of the WAIS-R. However,
further research is needed, comparing the WAIS-R with
other types of achievement and intelligence tests. Given
the great similarity between the structure and content of
the WAIS and the WAIS-R, Ryan and Rosenberg's (1983)
study supports Wechsler's (1981) assumption that the
WAIS-R is as valid a measure of intellectual functioning
as the WAIS.
Standardization of the WAIS and WAIS-R
The WAIS-R manual is more complete than the WAIS
manual in presenting statistical information related to
standardization samples. Upon comparison, it may be seen
that the WAIS-R used a slightly larger standardization
sample (N = 1880) than did the WAIS (N = 1700). Both of
these tests were standardized on a national sample, and
both attempted to include representative proportions of
the following variables: age, sex, race, geographic
region, occupation, education, and urban-rural residence.
The WAIS sample was derived in an attempt to approximate
the proportion of these variables as represented in the
1950 census. The normative sample for the WAIS-R was
derived from the same variables in the 1970 census.
Comparing the proportions of these variables in the
WAIS and WAIS-R samples should reflect changes in the
general population from 1950 to 1970 and suggest ways
that WAIS-R scores may differ from WAIS scores. However,
only four of these variables seem to have changed
significantly over time. These variables are discussed
below.
Geographic Region
In the WAIS-R standardization sample, there has been
a slight shift from the Northeast and North Central to
the Western region of the United States. Approximately
18% of the standardization sample is now from the Western
region as compared with the 12% of the WAIS
standardization sample.
Urban-Rural Residence
WAIS and WAIS-R samples reveal a significant shift
from rural to urban residence between standardizations.
This shift is best demonstrated in the 20-24 year age
group. When the WAIS was being standardized, 65% of this
group lived in urban areas, whereas 78.8% of this group
lived in urban regions when the WAIS-R standardization
sample was being established (1970 census).
Occupation
The standardization sample of the WAIS-R was
stratified across six occupational groups derived from
the census categories of working and non-working
individuals. The division of occupational categories is
different in the WAIS standardization sample, with
individuals stratified across 13 categories, so it is
difficult to directly compare the occupational
stratification for the two tests. Another problem that
makes it difficult to compare the two tests is that the
young student population (ages 16-19) is not clearly
represented in the WAIS-R manual as being part of the
standardization sample. However, there has been a
dramatic increase in the number of women included in the
labor force for the WAIS-R sample. In the 1950 census, a
majority of women in every age group, except 18-19, were
homemakers. This changed in the 1970 census, with many
women moving to administrative, managerial, and clerical
positions.
Education
Educational attainment was divided into five levels
according to the number of years of school completed,
i.e., 8 years or less, 9-11 years, 12 years (high school
graduate), 13-15 years, 16 years or more (college
graduate). These categories were the same for the WAIS
and the WAIS-R. Separate educational distributions were
determined for each sex and for every age group. In
general, there has been a trend toward increased
educational attainment since the 1950 census was
conducted. For example, only about 18% of the 25-34 year
age group used in the WAIS sample had graduated from
college, as compared with 23.7% of the 25-34 age group in
the WAIS-R sample.
Factor Analysis
The WAIS and WAIS-R have both been factor analyzed
to determine the major dimensions of the scales.
Wechsler (1955) performed the first factor analysis of
WAIS subtest scores. Three major factors were identified
in this study: a general factor in which all of the
subtests clustered, a verbal factor, and a performance
factor. Many of the factor analyses conducted on the
WAIS since that time have derived the same three factors,
with only slight variations related to the type of
statistical procedure utilized (see Matarazzo, 1976, for
a complete discussion). Cohen (1957a, 1957b) conducted a
factor analysis of the WAIS with intercorrelations of
subtests obtained for four of the age groups in the
standardization sample. He found the presence of a
single factor common to all 11 subtests. In addition,
however, three major group factors were identified. One
was a Verbal Comprehension factor, with large weights for
the Vocabulary, Information, Comprehension, and
Similarities subtests. A Perceptual Organization factor
was found with loadings mainly for the Block Design and
Object Assembly subtests. Finally, the third major group
factor was described as a Memory factor (Freedom from
Distractibility). This factor loaded primarily on the
Arithmetic and Digit Span subtests. Several other
studies also support the presence of Cohen's (1957a,
1957b) three group factors (e.g., Berger, Bernstein,
Klein, Cohen, & Lucas, 1964; Denerll, Broeder, & Sokolov,
1964).
There have been several factor analytic studies
conducted on the scores obtained by the subjects in the
WAIS-R standardization sample (Beck, 1985; Blaha &
Wallbrown, 1982; Gutkin, Reynolds, & Galvin, 1984;
Naglieri & Kaufman, 1983; O'Grady, 1983; Parker, 1983;
Plake, Gutkin, & Kroetin, 1984; Silverstein, 1982, 1985).
Each of these studies employed different types of
analyses (e.g., principal factor, multifactor,
orthogonal, oblique, maximum likelihood confirmatory
analysis, cluster analysis), which affected the number of
factors found as well as the different subtests which
loaded into each factor. For example, Silverstein (1982)
conducted a principal factor analysis on the
standardization data for the WAIS and the WAIS-R. He
identified two stable factors in both tests, i.e., Verbal
Comprehension and Perceptual Organization. He also found
a general factor (g) that accounted for a majority of the
variance in the two tests. These results were confirmed
in a more recent factor analytic study performed on
WAIS-R scores obtained from a sample of psychiatric and
medical patients (Ryan, Rosenberg, & DeWolfe, 1984).
Parker (1983) also found Verbal and Performance factors
which loaded onto the WAIS-R. However, using a
three-factor analysis, Parker (1983) found evidence of
another discrete factor which he labeled Freedom from
Distractibility. This three-factor model was confirmed
in a later study which analyzed the WAIS-R scores of
psychiatric and medical patients (Beck, 1985). In
contrast, O'Grady (1983) concluded that the WAIS-R
contains a general intellectual factor (g) and that the
other factors play only a small role in influencing
WAIS-R subtest scores. To determine which factor model
was most appropriate, Silverstein (1985) conducted a
cluster analysis of WAIS-R subtests used in the
standardization sample. Although he found evidence for
three major clusters of subtests (Verbal Comprehension,
Perceptual Organization, and Freedom from
Distractibility), he concluded that the choice between
two and three factor solutions is academic and that for
all practical purposes, the WAIS-R measures nothing but g.
Besides the factor of general intelligence, two
factors have appeared consistently in the majority of the
WAIS and WAIS-R factor analytic studies: Verbal
Comprehension and Perceptual Organization. A third
factor described either as Memory or Freedom from
Distractibility was also frequently found. Research
supports the use of Full Scale IQ scores due to the large
general factor in the test scores. However, the presence
of the third Freedom from Distractibility factor does not
lend support to the grouping of the subtests into Verbal
and Performance scales. In conclusion, there are no
major differences between the factor structures of the
WAIS and the WAIS-R, although there may be variation in
how WAIS and WAIS-R subtests load on the factors.
Studies Comparing WAIS and WAIS-R Scores
Since the publication of the WAIS-R, several studies
have been conducted to determine the equivalency of WAIS
and WAIS-R scores. The results of these WAIS/WAIS-R
comparisons will be reviewed to determine the magnitude
and source of these differences.
Wechsler (1981) conducted the first comparison of
WAIS and WAIS-R scores, administering the two tests in a
counterbalanced order to a sample of 72 adults between
the ages of 35 and 44. There was an intertest interval
of 3 to 6 weeks. In this sample, the Verbal,
Performance, and Full Scale WAIS IQ scores were
approximately 7, 8, and 8 points higher, respectively,
than corresponding IQ scores on the WAIS-R. However,
correlations between WAIS and WAIS-R IQ scores were very
high (r = .91, .79, and .88 for VIQ, PIQ, and
FSIQ scores, respectively).
For all eleven subtests, WAIS scores were higher
than WAIS-R scores. Wechsler (1981) did not report
significance levels for these scores, but subtests with
the largest differences included Similarities (2.2 point
difference), Vocabulary (1.8 point difference),
Comprehension (1.8 point difference), Picture Completion
(1.8 point difference), and Digit Symbol (1.8 point
difference). Digit Span had the smallest difference
between subtest scores (0.6 point difference). There
were clear practice effects; individuals tended to do
somewhat better on the second test taken.
Studies have been conducted to determine whether
differences in WAIS and WAIS-R scores exist for various
groups of test subjects. For example, Lippold and
Claiborn (1983) used a combined version of the WAIS and
the WAIS-R, with items which are identical for the two
tests administered only once, and tested 30 veterans who
had been referred for neuropsychological evaluation.
This procedure led to the assignment of identical scores
to those items that were the same for both tests. As in
Wechsler's (1981) study, WAIS IQ scores were
significantly higher than WAIS-R IQ scores. There was a
7.6 point difference in VIQ scores, an 8.6 point
difference in PIQ scores, and an 8.4 point difference in
FSIQ scores. Unfortunately, Lippold and Claiborn (1983)
did not report subtest score comparisons.
Rabourn (1983) administered the same combined form
of the WAIS and WAIS-R to 52 subjects from the University
of California Counseling Center. As in the Lippold and
Claiborn (1983) study, this particular format was used to
eliminate the practice effects common in the usual
test-retest methods of administration. Rabourn (1983)
found that subjects scored significantly higher on the
WAIS than on the WAIS-R. WAIS Verbal, Performance, and
Full Scale IQ scores were 6.2, 7.6, and 6.7 points
higher, respectively, than corresponding WAIS-R IQ
scores. All WAIS subtest scores with the exception of
Information were significantly higher than corresponding
WAIS-R subtest scores.
Kelly, Montgomery, Felleman, and Webb (1984)
compared WAIS and WAIS-R IQ scores obtained from two
groups of neurologically impaired patients (N = 114).
One group was administered the WAIS and the other was
administered the WAIS-R. Results of this comparison
showed that all three WAIS IQ scores were significantly
higher than corresponding WAIS-R IQ scores.
Prifitera and Ryan (1983) conducted a WAIS/WAIS-R
comparison using 32 psychiatric and vocational counseling
patients. Subjects were administered both tests in a
counterbalanced order. WAIS-R VIQ, PIQ, and FSIQ scores
were 7.59, 7.06, and 7.77 points lower than corresponding
WAIS IQ scores. Differential practice effects were also
found such that the order of administration affected the
score differences between the two tests.
Smith (1983) administered the WAIS and WAIS-R in a
counterbalanced order to 70 college students. WAIS VIQ,
PIQ, and FSIQ scores were significantly higher than
corresponding scores on the WAIS-R by 8, 9, and 9 points,
respectively. Eight of the 11 subtest scores showed a
significant difference with scores higher on the WAIS.
For the other three subtests, Similarities, Picture
Arrangement, and Block Design, scores were either higher
or equal on the WAIS, even though no significant
differences were found. The correlations between WAIS
and WAIS-R were somewhat lower than the values found in
Wechsler's (1981) study, with the coefficients for
Picture Arrangement (r = .15) and Object Assembly (r =
.14) not reaching statistical significance. Smith (1983)
also found significant Test x Order interactions for nine
of the 11 subtests and all three IQ scores such that the
order of administration affected the difference between
WAIS and WAIS-R scores.
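The nonsignificance of these two coefficients can be checked with the usual t test for a Pearson correlation. The following is a minimal sketch, assuming that all 70 subjects contributed to each coefficient (the summary above does not say so explicitly); the function name is illustrative only. Both t values fall well short of roughly 2.0, the approximate two-tailed .05 critical value at 68 degrees of freedom.

```python
import math


def t_for_correlation(r: float, n: int) -> float:
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Coefficients reported for Smith (1983); n = 70 is assumed to apply to both.
N_SUBJECTS = 70
for label, r in [("Picture Arrangement", 0.15), ("Object Assembly", 0.14)]:
    t = t_for_correlation(r, N_SUBJECTS)
    print(f"{label}: r = {r:.2f}, t({N_SUBJECTS - 2}) = {t:.2f}")
```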
Urbina, Golden, and Ariel (1983) administered the
WAIS and WAIS-R to 35 females and 33 males ranging in age
from 16 to 74 years. Forty-nine of the subjects were
labeled "normal," whereas 19 were psychiatric patients in
various diagnostic categories. The order of test
administration was not counterbalanced; i.e., 72% of the
subjects took the WAIS before the WAIS-R. The intertest
interval was from 1 day to 7 months. WAIS subtest and IQ
scores were significantly higher than corresponding
WAIS-R scores, except for the Digit Span subtest. For
all three of the IQ scores, there was a 5 point
difference between the two tests. Correlations between
respective WAIS and WAIS-R scores were also highly
significant, ranging from a correlation of .57 for
Object Assembly to a correlation of .95 for VIQ and
Vocabulary. The differences in WAIS and WAIS-R scores
were related to the age of the subject and order of test
administration. For example, Urbina et al. (1983) noted
that the updated content of the WAIS-R may pose greater
difficulties for older than for younger people. This
would explain their finding that older subjects had
significantly higher WAIS Verbal and Full Scale IQ
scores. Also, those subjects who took the WAIS after the
WAIS-R scored higher on the WAIS.
Mishra and Brown (1983) examined the comparability
of the WAIS and WAIS-R by administering the two scales to
a sample of 88 predominately college-aged subjects in a
counterbalanced order. As in the studies previously
reviewed, WAIS scores were significantly higher than
WAIS-R scores, with the exception of Picture Arrangement.
WAIS Verbal, Performance, and Full Scale IQ scores were
approximately 5 to 6 points higher than the corresponding
WAIS-R IQs. All of the correlations obtained from the
scores of the two tests were significant, ranging from
.51 for Object Assembly and Picture Completion to .85 for
Vocabulary. The only statistically significant main
effect for order was for the Picture Arrangement subtest.
Mishra and Brown (1983) do not provide separate subtest
means for each order of test administration. Therefore,
it is not possible to determine whether the order effect
found for the Picture Arrangement subtest is responsible
for the lack of a significant difference between WAIS and
WAIS-R Picture Arrangement scores.
In a counterbalanced design, Simon and Clopton
(1984) administered both the WAIS and WAIS-R to 29 mildly
and moderately retarded adults. Significant differences
between the WAIS and WAIS-R were found for two of the IQ
scores and for five of the 11 subtest scores. In
contrast to the results of previous comparison studies,
WAIS-R Full Scale IQ and Verbal IQ scores were
significantly higher than corresponding WAIS IQ scores.
For the Arithmetic and Vocabulary subtests, WAIS-R scores
were also significantly higher than WAIS scores.
However, for the Digit Symbol, Picture Completion, and
Object Assembly subtests, WAIS scores were significantly
higher than WAIS-R scores. No significant order main
effects were found, but a significant Test x Order
interaction was found for the Block Design subtest.
Significant correlations were also obtained for each of
the corresponding WAIS and WAIS-R scores.
Edwards and Klein (1984) compared the performance of
38 highly intelligent Mensa members on the WAIS and
WAIS-R. The two tests were administered in a
counterbalanced order with an intertest interval of
approximately three weeks. The authors found no
significant differences between initial WAIS and WAIS-R
IQ scores. WAIS Verbal, Performance, and Full Scale IQ
scores were only 3.5, 2.5, and 1.7 points higher than
corresponding WAIS-R IQ scores. Subtest differences were
not reported. Edwards and Klein (1984) found that the
order of test administration affected the difference
between WAIS and WAIS-R test scores. Subjects who took
the WAIS followed by the WAIS-R gained a mean of 3 Full
Scale points upon retesting, in contrast to an average
9-point gain for subjects who took the WAIS-R followed by
the WAIS. Results of this study indicate that
individuals in the Superior and Very Superior range of
intelligence tend to earn fairly equivalent WAIS and
WAIS-R scores.
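The two retest gains reported by Edwards and Klein (1984) can be separated into a practice component and a test (norm) component. The sketch below assumes a simple additive model in which the practice effect is the same in both orders; this model is an illustration, not the authors' own analysis, and the function name is hypothetical.

```python
from typing import Tuple


def decompose_crossover(gain_wais_first: float,
                        gain_waisr_first: float) -> Tuple[float, float]:
    """Split two retest gains into a practice effect and a norm difference.

    Additive model (an assumption made for illustration):
      WAIS first, WAIS-R second: gain = practice - norm_difference
      WAIS-R first, WAIS second: gain = practice + norm_difference
    where norm_difference is how many points higher WAIS scores run.
    """
    practice = (gain_wais_first + gain_waisr_first) / 2
    norm_difference = (gain_waisr_first - gain_wais_first) / 2
    return practice, norm_difference


# Full Scale gains of 3 and 9 points, as summarized above.
practice, norm_difference = decompose_crossover(3.0, 9.0)
print(f"practice effect: {practice:.1f} points")
print(f"WAIS minus WAIS-R: {norm_difference:.1f} points")
```

Under this model, a difference between the two gains is exactly what a Test x Order interaction reflects in a counterbalanced design.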
In summary, the WAIS/WAIS-R comparisons using
subjects of fairly average intelligence found WAIS scores
to be significantly higher than WAIS-R scores. However,
there were two studies which did not find WAIS scores to
be higher than WAIS-R scores. Simon and Clopton (1984)
found WAIS-R scores to be significantly higher than WAIS
scores when testing individuals in a mentally retarded
population. Edwards and Klein (1984) found no
significant differences in initial WAIS and WAIS-R scores
when they administered both tests to gifted individuals.
This is an indication that the subject's level of
functioning may be a factor determining the difference in
the scores he or she receives on the two tests. Another
common finding in most of the studies is the presence of
high correlation coefficients for the scores on the two
tests. However, there is one subtest, Object Assembly,
that consistently correlates lower than the other
subtests. This may be partially explained by the low
reliability of this subtest. Because the WAIS and WAIS-R
Object Assembly subtests have lower reliability than the
other subtests, they cannot be expected to correlate
highly with each other.
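The reliability argument follows from classical test theory, in which the observed correlation between two measures cannot exceed the square root of the product of their reliabilities. The sketch below illustrates that ceiling; the reliability values are hypothetical placeholders, not figures taken from the WAIS or WAIS-R manuals.

```python
import math


def max_observable_correlation(rel_x: float, rel_y: float) -> float:
    """Upper bound on an observed correlation given two reliabilities."""
    return math.sqrt(rel_x * rel_y)


# Hypothetical reliabilities, chosen only to show the size of the effect.
examples = {
    "less reliable subtest pair": (0.70, 0.70),
    "more reliable subtest pair": (0.95, 0.95),
}
for label, (rel_wais, rel_waisr) in examples.items():
    ceiling = max_observable_correlation(rel_wais, rel_waisr)
    print(f"{label}: ceiling on r = {ceiling:.2f}")
```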
Race-IQ Controversy
The issue of cultural differences in IQ scores first
became controversial after the American development of
mass testing for intelligence in World War I. Yerkes,
Otis, and other psychologists developed the Army Alpha
(largely verbal and quantitative) and the Army Beta
(nonverbal) group tests and applied them to almost two
million men in 1917-1918. After the war, Yerkes (1921)
published comparisons between Blacks and Whites and
between different White immigrant groups. This report
concluded that there was a significant difference in IQ
scores obtained by Blacks and Whites. The publication
also concluded that draftees from English-speaking
countries or Scandinavia scored relatively high, while
those from Latin or Slavic backgrounds scored low. Even
though Yerkes found a relationship between IQ and length
of stay in the United States, he did not emphasize the
fact that many of his "minority subjects" were less
acculturated than the White, English-speaking subjects.
The Lippmann-Terman debate in the early 1920's was
symbolic of the controversy surrounding comparative
testing of racial groups. Underlying the controversy
then, as now, were the issues of whether certain races
are inherently less intelligent than others (hereditarian
position) and whether equality of opportunity would be
denied to minorities by the way authorities use findings
of test differences. Walter Lippmann, a noted
journalist, took an environmentalist position and
attacked the IQ tests and their interpretations. He
believed that socioeconomic as well as other cultural and
environmental differences were responsible for the IQ
discrepancies between Blacks and Whites. Terman, on the
other hand, defended psychological tests and took a
hereditarian position. He believed that Blacks were
lower in measured IQ than Whites as a result of a variety
of inherited, genetic traits.
A similar controversy developed in the early 1970's
in reaction to Arthur Jensen's publication in the 1969
Harvard Educational Review, which questioned the benefits
of Head Start programs and implied that Blacks were
hereditarily inferior (see Block and Dworkin, 1976, for a
summary of the controversy created by Jensen's article).
The Civil Rights movement in the 1960's and 70's
contributed to an increasingly hostile attitude toward
any procedures that segregated racial groups in
educational and work settings. Psychological tests,
especially IQ measures, became the target of many groups
trying to prevent black children from being stereotyped
and limited to taking classes for slow learners.
Even though there has been a major movement toward
racial equality in the last twenty years, the issue of
cultural differences in intelligence continues to be a
topic surrounded by controversy. There are hundreds of
studies dealing with the cultural applications of
intelligence tests. Researchers such as Jensen (1980),
Matarazzo (1972), and Oakland (1977) have written
extensively on this subject. It is interesting to note
that most of the data cited on both sides of the heredity
vs. environment debate may be interpreted as supporting
either perspective (Matarazzo, 1972). Loehlin, Lindzey,
and Spuhler (1975) reviewed the literature on race
differences in IQ and suggested three possible causes:
test inadequacies, environmental and cultural
differences, and genetic differences.
Within the United States, many studies have found
Black-White group differences in IQ scores of
approximately one standard deviation (15 Wechsler
points), but these differences appear to be influenced by
factors such as the socioeconomic conditions and
educational attainment of Blacks and Whites. According
to Matarazzo and Pankratz (1980), this average difference
in racial IQ scores is only an empirical finding and does
not mean that Black individuals are any less intelligent
than Whites. Rather, they conclude that research in this
area has only served to correlate scores on particular
tests with skin color. The same authors also note that
an IQ score may be viewed as a measure of social
conditions, not just a biological inevitability. They
proposed that if nutrition, home experience, and the
school environment were similar for Black and White
individuals, then racial differences in IQ scores would
decrease.
The Wechsler scales and other standardized
intelligence tests have been used extensively to compare
IQ scores for children and adults of different races
(e.g., Dreger & Miller, 1968; Oakland & Feigenbaum, 1979;
Sandoval, 1979; Shuey, 1966; Vance & Engin, 1978). Each
of these studies found rather large mean differences
between Blacks and Whites on these standardized
intelligence tests. However, in most of these studies
the distribution of individuals' scores overlapped
considerably, with many Blacks earning IQ scores above
the mean of the White sample. Other variables such as
socioeconomic class, education, sex, family size, and
prenatal care have been found to influence IQ scores and
confound direct correlations that have been made between
race and IQ (Matarazzo, 1972). For example, Baughman and
Dahlstrom (1968) found that the White children in their
study earned a mean IQ score that was 13 points higher
than that of the Black children. However, the parents of the
White children had approximately 2 more years of
education, earned a substantially higher income, and were
working in higher occupational categories. It may be
concluded that it is not possible to predict an
individual's IQ score solely on the basis of his race.
Other environmental factors must also be taken into
consideration.
Examiner Effects
Several researchers have found that examiner
characteristics influence scores on the Wechsler scales.
Studies have investigated such topics as examiner
expectancy effects, as well as the impact of the
examiner's experience, sex, and race on test scores.
Expectancy Effects
Pretest information plays a role in the performance
of individuals on intelligence tests (Egeland, 1969;
Masling, 1959; Sattler, Hillix, & Neher, 1970; Sattler &
Winget, 1970; Schroeder & Kleinsasser, 1972; Simon,
1969). Examinees who were positively presented were
rated significantly higher than negatively presented
examinees (Sattler & Winget, 1970). When confederates
were coached to give certain test responses on the WAIS
which included several ambiguous responses, subjects
received significantly more credit on the ambiguous items
when they produced an overall superior WAIS record
(Sattler, Hillix, & Neher, 1970). In one study,
confederate examinees played "cold" and "warm" roles when
tested by unsuspecting experimenters (Masling, 1959).
The warm role enhanced the score in three
ways: experimenters used more reinforcing comments, gave
more opportunity to clarify and correct answers, and used
more lenient scoring on the protocols of the warm
subjects. In contrast, Simon (1969) pointed out that the
existence of expectancies regarding an examinee's
probable level of performance does not mean that scoring
bias will necessarily occur. Therefore, although results
indicate that the expectancies of the examiner do not
automatically lead to scoring bias, it is important for
examiners to take this factor into consideration when
interpreting test scores. Of course, eliminating halo
effects in the administration and scoring of intelligence
tests is a difficult goal. One possible approach would
be to have examiners administer tests without prior
knowledge of the examinee's abilities (Schroeder &
Kleinsasser, 1972). However, even if this procedure were
followed, the examiner would undoubtedly begin to develop
expectancies after administering the first few subtests
of a scale. Sattler (1974) noted that it is probably
impossible to eliminate the examiner's positive or
negative evaluations of the individual being tested.
However, it is possible for the examiner to become more
aware of his bias and minimize the effect of his
reactions on his rating of the individual's test
responses.
Examiner Experience
Studies that have evaluated the examiner's
experience as a variable that may affect scores on the
Wechsler tests report nonsignificant findings (Davis,
Peacock, Fitzpatrick, & Mulhern, 1969; Kaspar, Throne, &
Schulman, 1968; Masling, 1959; Plumb & Charles, 1955;
Schwartz, 1966). Overall, the studies indicate that the
examiner's experience is not of critical importance in
affecting test scores. Once a certain level of
proficiency is reached, the variable of experience as a
factor that affects test procedures or the examinee's
performance is no longer important (Sattler, 1974).
Examiner Sex
Few studies of examiner effects have systematically
evaluated the role of the examiner's sex. Those that
have done so indicate that, although the examiner's sex
may interact with the examinee's sex in affecting
performance on selected tests or subtests, no consistent
trends have emerged. For example, in one study female
examiners obtained higher WISC Arithmetic subtest scores
from female examinees than from male examinees, whereas
male examiners tended to obtain higher scores from male
examinees than from female examinees (Pedersen,
Shinedling, & Johnson, 1968). However, another study
failed to find significant examiner sex effects for the
WISC Arithmetic subtest, but found that the examiner's
sex was a significant factor influencing results on other
WISC subtests (Quereshi, 1968b).
Examiner Race
Many researchers have found that racial differences
affect the examiner-examinee relationship (Anastasi,
1968; Deutsch, Fishman, Kogan, North, & Whiteman, 1964).
According to the research, Black examinees who are given
intelligence tests by White examiners may display
behaviors that reflect their discomfort in the test
situation. For example, they may be more hesitant to
answer questions or have unnatural reactions to the test.
However, results of research on racial differences
suggest that the examiner's race does not usually affect
the actual performance of Black or White subjects on
individual intelligence tests (Graziano, Varca, & Levy,
1982; Sattler & Gwynne, 1982; Shuey, 1966). Unfortunately,
there have been only a few studies undertaken
which examine these variables, and most of the studies
are not well designed. For example, several of the
studies reviewed used only a small number of examiners
and neglected to control for the examiner's race.
Therefore, more research is needed in this area to better
determine the effect of the examiner's race on the
examinee's test performance.
General Examiner Studies
The research literature conclusively shows that
examiners differ in their scoring of unclear responses.
Using ambiguous responses, examiner differences have been
found in many studies using the Wechsler scales (Mahan,
1963; Massey, 1964; Miller & Chansky, 1972; Miller,
Chansky, & Gredler, 1970; Plumb & Charles, 1955; Sattler,
Winget, & Roth, 1969; Schwartz, 1966; Walker, Hunt, &
Schwartz, 1965). Because of this variability in the
scores that examiners give to ambiguous responses, it may
be concluded that the subjects' IQ scores are dependent,
in part, on the particular examiner performing the
evaluation.
Other studies of the Wechsler scales yield
conflicting results regarding examiner differences. Even
though many of these studies have flaws in their
methodology, there is some support for the idea that
examiners occasionally differ in the subtest and IQ
scores they obtain. However, these differences are not
large or pervasive. Although reasons for examiner
differences are not usually known, Thomas, Hertzig,
Dryman, and Fernandez (1971) indicated that one examiner
who obtained higher scores in their study (a) used more
positive terms in her reports to describe the
examiner-child interaction and the child's behavior, (b)
spent more time getting to know the child and
establishing rapport, and (c) always encouraged the child
to try to answer the test questions. The research does
not indicate that there will be significant differences
found among every group of examiners. However,
researchers conclude that examiners should attempt to
achieve the highest level of competence and skill.
Rationale for Present Study
There are two related issues to be addressed in this
study. The first issue has to do with the equivalence of
the WAIS and WAIS-R in a sample of 70 high school
students. It will be important to determine whether
corresponding WAIS and WAIS-R subtest and IQ scores are
similar and whether the two tests are significantly
correlated with one another in this population. The
second area of research deals with the comparable
performance of three different racial groups (White,
Black, and Mexican-American) on the WAIS and WAIS-R.
This research will determine whether differences between
these three racial groups are the same for both tests.
This study will also compare scores obtained by the three
racial groups on the WAIS-R, a supposedly more
culture-fair test than the WAIS, and see how these
differences compare with the commonly found 10-15 IQ
point gap between white and minority individuals on the
WAIS (Dreger & Miller, 1968; Shuey, 1966).
In addition to these major issues of interest, a
supplementary examination of examiner effects will also
be made to determine whether examiner characteristics
differentially influenced test scores. Simon and Clopton
(1984) included an analysis of examiner effects in their
comparison of WAIS and WAIS-R scores. They found
significant Examiner x Test interactions for two
subtests, Digit Span and Block Design. Earlier research
with the WAIS demonstrated that examiner variables often
lead to differences in test scores (for review, see
Guertin, Ladd, Frank, Rabin, & Hiester, 1971).
With regard to the equivalence of the two tests,
there have been several comparisons made between the WAIS
and WAIS-R using different populations (Edwards & Klein,
1984; Kelly, Montgomery, Felleman, & Webb, 1984; Lippold
& Claiborn, 1983; Mishra & Brown, 1983; Prifitera & Ryan,
1983; Rabourn, 1983; Simon & Clopton, 1984; Smith, 1983;
Urbina, Golden, & Ariel, 1983; Wechsler, 1981). Of
course, since the normative sample of the WAIS-R is more
up-to-date and was carefully selected to ensure its
representativeness, the WAIS-R is usually preferred over
its predecessor, the WAIS, for measuring adult
intelligence. However, there are several reasons why it
is important to know whether these two scales provide
comparable measures of intelligence. First of all, there
are situations in which it would be important to know
whether the scores could be treated as interchangeable.
In developmental studies, for example, previous data is
often collected with the WAIS, in which case it would be
necessary to know whether changes in intelligence test
scores were meaningful or were merely due to differences
between the two scales. Second, results on the WAIS may
have been used to categorize individuals in certain
settings, e.g., treatment centers for the mentally
retarded, educationally gifted programs, or special
education classes for the learning disabled. When these
same individuals are tested later with the WAIS-R,
discrepancies could result in different decisions
regarding placement.
It is the purpose of the present study to compare
the WAIS and the WAIS-R using a sample of high school
students. In most school systems, assignment of students
to special education classes is based on the discrepancy
between intelligence and achievement scores. The present
study will utilize a sample of high school students to
determine whether they obtain significantly different
scores on the WAIS and the WAIS-R. If, as predicted,
these students obtain significantly lower scores on the
WAIS-R than on the WAIS, this has serious implications
regarding their ability to qualify for remedial
educational placement. In fact, it would be much more
difficult for students who are tested with the WAIS-R to
meet the criteria for assignment of the handicapping
condition of a learning disability. Therefore, it is
necessary to determine the magnitude of the differences
between the WAIS and WAIS-R in this population so that
standards of placement may be adjusted accordingly.
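The placement argument can be made concrete with a small sketch. Every number below is hypothetical: the 15-point discrepancy criterion, the achievement score, and the 7-point drop from the WAIS to the WAIS-R are placeholders chosen for illustration, not values drawn from any school district's rules or from this study's data.

```python
def meets_discrepancy_criterion(iq: float, achievement: float,
                                required_gap: float = 15.0) -> bool:
    """True when IQ exceeds achievement by at least the required gap.

    required_gap is a hypothetical criterion used only for illustration.
    """
    return (iq - achievement) >= required_gap


# Hypothetical student: same achievement score, two IQ estimates.
achievement_score = 82.0
wais_fsiq = 100.0
waisr_fsiq = 93.0  # assumed to run 7 points lower than the WAIS

print("Qualifies with WAIS:  ",
      meets_discrepancy_criterion(wais_fsiq, achievement_score))
print("Qualifies with WAIS-R:",
      meets_discrepancy_criterion(waisr_fsiq, achievement_score))
```

In this illustration the same student meets the criterion when tested with the WAIS but not when tested with the WAIS-R, which is the situation the present study is designed to quantify.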
One of the most important reasons for obtaining this
sample of high school students has to do with the wide
range of IQ scores expected in this population. Most of
the previous WAIS/WAIS-R comparisons, with the exception
of Wechsler (1981), have utilized samples with a fairly
narrow range of IQs. Several studies used college
students with generally above average IQ scores (Mishra &
Brown, 1983; Rabourn, 1983; Smith, 1983). Two studies
compared the two tests in a population of neurologically
impaired individuals (Kelly, Montgomery, Felleman, &
Webb, 1984; Lippold & Claiborn, 1983). Edwards and
Klein (1984) used intellectually gifted individuals in
their comparison. The population of normal high school
students used in this study should represent a more
varied sample with regard to IQ scores.
It is necessary to examine differential effects of
administration order (WAIS followed by WAIS-R vs. WAIS-R
followed by WAIS). Since each subject is given both the
WAIS and the WAIS-R in counterbalanced order, it is
important to rule out the possibility of differential
practice effects. Although the results of this analysis
will have limited clinical value since there will
obviously not be many cases when the two tests are
administered in a short period of time, it is expected
that order will differentially affect the magnitude of
test differences as in Smith's (1983) study. Therefore,
order effects will be examined in relationship to
WAIS/WAIS-R test scores.
This study will also examine the performance of
three different racial groups on the WAIS and the WAIS-R.
Earlier studies conducted using the Wechsler scales have
documented major discrepancies between IQ means of
different racial groups (Jensen, 1980; Shuey, 1966;
Winter, 1968; Wysocki & Wysocki, 1969). However, there
have been no studies to date analyzing racial differences
in WAIS-R scores or comparing the IQ differences between
the WAIS and the WAIS-R for different racial groups. The
present study proposes to determine whether the same
differences between racial groups exist on the WAIS-R as
on the WAIS.
CHAPTER II
METHOD
Subjects
The original subjects included in this study were 75
high school students (41 males, 34 females) between the
ages of 16 and 19. These students were enrolled in
regular classes at three high schools in the Lubbock
Independent School District: Coronado High School,
Estacado High School, and Monterey High School.
Prospective subjects were chosen randomly by school
counselors and principals. Involvement in the study was
on a voluntary basis, with no reward for participation.
Those subjects under the age of 18 who were willing to
participate in the study were required to have their
parents sign a consent form (See Appendix A). Subjects
who were 18 years old, or older, signed the consent form
themselves.
Although 75 subjects had volunteered to participate
in this study, complete WAIS and WAIS-R protocols were
obtained from only 70 subjects. Five potential subjects
dropped out of the study after the first round of
testing. One potential subject moved out of town, one
subject ran away from home, one subject transferred out
of a study hall and did not wish to be tested during a
regular class, one subject refused to be retested because
she was studying for final exams, and one subject could
not be contacted by the examiner.
Procedure
Each subject was randomly assigned to one of two
groups, resulting in 38 subjects in the first group and
37 subjects in the second group. It was the intention of
the author that the 38 subjects in the first group (Order
1) would be tested initially with the WAIS and then with
the WAIS-R. The 37 subjects in the second group (Order
2) were to be tested first with the WAIS-R, then with the
WAIS. However, of the five subjects who dropped out of
the study after the first test, one was in Order 1 and
four were in Order 2. This resulted in an unequal number
of subjects being assigned to each order (Order 1 = 37
subjects, Order 2 = 33 subjects).
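The group sizes reported above can be verified with simple arithmetic. The sketch below only restates the counts given in the text; the dictionary labels are illustrative.

```python
# Subjects originally assigned to each administration order.
assigned = {"Order 1 (WAIS first)": 38, "Order 2 (WAIS-R first)": 37}

# Subjects who dropped out after the first test, by order.
dropped = {"Order 1 (WAIS first)": 1, "Order 2 (WAIS-R first)": 4}

completed = {order: assigned[order] - dropped[order] for order in assigned}

for order, n in completed.items():
    print(f"{order}: {n} subjects with complete protocols")
print("Total:", sum(completed.values()))
```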
There was an interval of 30 to 60 days between
administration of the two tests. All of the subjects
except for one were tested while school was in session,
at an isolated testing area located in the school
building. The remaining subject was given her second
test in her home several days after school was out for
the summer.
The WAIS and WAIS-R were administered and scored
according to Wechsler's (1955, 1981) instructions. Five
students enrolled in the Clinical Psychology graduate
program at Texas Tech University administered and scored
the tests, and the scoring of each protocol was checked
by the author. Two of the examiners were male and three
were female. All of the examiners were Caucasian. Each
examiner was proficient in the use of the WAIS and the
WAIS-R. Proficiency on the WAIS was achieved through
individualized instruction by the author, whereas
proficiency on the WAIS-R had been determined in a
graduate course on intelligence testing.
Initially, each examiner was randomly assigned 15
subjects to test. However, two of the examiners (exs. 3
and 4) were able to complete testing on only 13 subjects
apiece, and one examiner (ex. 5) completed testing on
only 12 subjects. Therefore, Examiner 2 tested two extra
subjects (n = 17) using the WAIS and the WAIS-R to ensure
complete data for 70 subjects.
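The examiner caseloads can be tallied in the same way. The count for Examiner 1 is not stated directly; the sketch assumes that Examiner 1 completed the 15 subjects originally assigned, which is the only value consistent with a total of 70.

```python
# Completed subjects per examiner. Examiner 1's count of 15 is an
# inference from the original assignment, not a figure stated in the text.
completed_by_examiner = {1: 15, 2: 17, 3: 13, 4: 13, 5: 12}

total_completed = sum(completed_by_examiner.values())
print("Subjects with complete data:", total_completed)
```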
Subjects were informed that test results would be
available to them when they completed both tests.
Formal Hypotheses
1. It is hypothesized that WAIS IQ scores will be
significantly higher than corresponding WAIS-R IQ scores.
This prediction is based on the fact that, with only two
exceptions (Edwards & Klein, 1984; Simon & Clopton,
1984), all studies comparing WAIS and WAIS-R scores have
found WAIS scores to be higher than WAIS-R scores. Also,
previous WISC and WISC-R comparisons have shown that WISC
IQ scores are significantly higher than WISC-R scores
(Berry & Sherrets, 1975; Brooks, 1977; Doppelt & Kaufman,
1977; Larrabee & Holroyd, 1976; Schwarting, 1976; Solly,
1977; Swerdlik, 1977; Weiner & Kaufman, 1979).
Flynn's (1984) review demonstrated that every
Stanford-Binet and Wechsler standardization sample from
1932 to 1978 established tougher norms than its
predecessor. This pattern of increasingly more difficult
norms was interpreted to mean that, during that period of
46 years, the average IQ of Americans rose by almost 14
points (nearly one full standard deviation). These
changes in standardization samples serve to explain why
individuals perform better on the WAIS than on the
WAIS-R. The standardization sample used to establish
norms for the WAIS-R performed at a higher level than the
sample used to establish norms for the WAIS. Therefore,
when an individual's performance is compared with the
two sets of norms, he is going to score higher when
compared to the "easier" WAIS norms than when compared to
the more difficult WAIS-R norms.
2. It is hypothesized that there will be
significant Test x Order interactions, such that the
difference between WAIS and WAIS-R IQ scores will be
affected by the order in which the tests are
administered. In studies where both tests are given to
each subject (Edwards & Klein, 1984; Simon & Clopton,
1984; Smith, 1983), WAIS vs. WAIS-R differences are
dependent on the order in which the tests are
administered.
3. It is hypothesized that White subjects will have
significantly higher WAIS and WAIS-R IQ scores than Black
or Mexican-American subjects. This prediction is based
on earlier studies (Baughman & Dahlstrom, 1968; Dreger &
Miller, 1968; Jensen, 1980; Oakland & Feigenbaum, 1979;
Sandoval, 1979; Shuey, 1966; Vance & Engin, 1978) that
found 10-12 point differences in IQ scores when Black and
White subjects were compared on standardized intelligence
tests.
4. It is predicted that there will be significant
Test x Race interactions such that the difference between
WAIS and WAIS-R scores will be affected by the race of
the subject. Wechsler changed several items when
revising the WAIS in an attempt to make it a more
culture-fair test. New questions were added on the
Information subtest about Martin Luther King and Louis
Armstrong. Several of the pictures in the Picture
Completion and Picture Arrangement subtests were also
changed to depict Black individuals. Therefore, it is
hypothesized that the discrepancy found between White and
Black IQ scores on the WAIS should not be as large on the
WAIS-R, with Blacks performing relatively better on the
WAIS-R. However, Black IQ scores are still expected to
be lower than the IQ scores of White subjects on the
WAIS-R.
5. It is hypothesized that WAIS subtest and IQ
scores will correlate significantly with corresponding
WAIS-R scores. This prediction is based on previous WAIS
and WAIS-R studies that found high correlations between
WAIS scores and corresponding WAIS-R scores (Mishra &
Brown, 1983; Simon & Clopton, 1984; Smith, 1983; Urbina,
Golden, & Ariel, 1983; Wechsler, 1981).
Due to the large number of statistical comparisons,
the alpha level is set at p < .01 for all statistical
tests to provide a more conservative measure of
significant results.
CHAPTER III
RESULTS
This chapter will include a description of the 70
subjects whose test scores were analyzed. Results of
main and supplementary analyses performed on the research
data will be presented. The main analyses tested the
first four hypotheses:
1. A significant main effect for Test was predicted
such that WAIS subtest and IQ scores would be higher than
corresponding WAIS-R scores.
2. A significant Test x Order interaction was
predicted such that the difference between WAIS and
WAIS-R scores would be affected by the order in which the
tests were administered (Order 1 vs. Order 2).
3. A significant main effect for Race was predicted
such that White subjects would score higher than Black or
Mexican-American subjects on the WAIS and the WAIS-R.
4. A significant Test x Race interaction was
predicted such that the race of the subjects would affect
their differential performance on the WAIS and WAIS-R.
Supplementary analyses were performed to determine
the significance of WAIS/WAIS-R correlations, and to
determine whether there were significant differences
between the scores for each examiner. An additional
analysis was also performed to investigate the
relationship between WAIS and WAIS-R differences and the
subject's IQ.
Subject Characteristics
Table 5 presents information about the race, age,
sex, and school for the 70 subjects who completed both
the WAIS and WAIS-R. Full Scale IQ scores for subjects
on the WAIS ranged from 80 to 136, with a mean Full Scale
IQ of 105.27. The Full Scale IQ scores for subjects on
the WAIS-R ranged from 74 to 143, with a mean Full Scale
IQ of 100.67. Verbal, Performance, and Full Scale IQ
ranges and means are presented in Table 6.
Main Data Analyses
A 2 X 2 X 3 analysis of variance with one within-
(Test) and two between-subjects (Order, Race) variables
was performed on each of the 11 subtest scaled scores and
three IQ scores of the WAIS and WAIS-R. Table 7 gives F
values for the main effects of Test, Order, and Race and
for interactions among these three variables.
Hypotheses 1 and 2
Significant differences were found between WAIS and
WAIS-R scores for ten of the 11 subtests and for all
three IQ scores. In each case, WAIS scores were
Table 5
Demographic Information for 70 Subjects Included in WAIS/WAIS-R Comparison:
Race, Age, Sex, School

Race                   N
Caucasian             34
Black                 20
Mexican-American      16

Age                    N
16                    39
17                    16
18                    13
19                     2

Sex                    N
Male                  40
Female                30

School                 N
Coronado              21
Estacado              24
Monterey              25
Table 6
WAIS and WAIS-R Range and Mean IQ Scores

                        WAIS                  WAIS-R
                   Range        M        Range        M
Verbal IQ         78 - 139   101.34     71 - 143    97.86
Performance IQ    83 - 142   109.73     69 - 139   104.87
Full Scale IQ     80 - 136   105.27     74 - 143   100.67
Table 7
[Table not legible in this transcript. It gives F values for the main effects of Test, Order, and Race and for the interactions among these variables, for each of the 11 subtests and three IQ scores.]
significantly higher than corresponding WAIS-R scores.
Table 8 depicts mean subtest and IQ scores for the WAIS
and WAIS-R. Arithmetic was the only subtest that did not
yield significantly different scores for the two tests.
However, WAIS Arithmetic subtest scores were still
slightly higher (M = 9.26) than WAIS-R scores (M = 8.70).
There was a significant interaction between Order of
administration (Order 1 vs. Order 2) and Test (WAIS
vs. WAIS-R) for six of the 11 subtests and for all three
IQ scores (see Table 7, Column 4). Figures 1, 2, and 3
illustrate the Test x Order interaction for the Verbal,
Performance, and Full Scale IQ scores, respectively. The
difference between WAIS and WAIS-R scores is much larger
(with WAIS scores higher than WAIS-R scores) in Order 2
as compared to Order 1 for all three IQ scores. The
WAIS-R in Order 1 does not always yield lower IQ
scores, but obtaining practice before taking the WAIS
results in a larger difference between the two tests.
Due to the significant interaction between Test and
Order, Tukey's procedure for multiple comparisons between
means was utilized to compare simple main effects for
Test within
Table 8
Means, Standard Deviations, and Correlations for WAIS and WAIS-R Scores

                            WAIS               WAIS-R
Subtest                 Mean      SD       Mean      SD        r
Information             8.26     2.44      7.37     2.23     .86***
Comprehension           9.96     3.24      8.67     2.74     .73***
Arithmetic              9.26     2.59      8.70     2.34     .75***
Similarities           10.66     2.47      8.33     2.45     .64***
Digit Span              9.57     2.76      8.91     2.38     .68***
Vocabulary              8.96     2.38      7.47     2.10     .89***
Digit Symbol           11.78     2.71     10.41     2.52     .73***
Picture Completion     10.58     2.24      9.03     2.33     .48***
Block Design           11.54     3.04     10.66     3.20     .74***
Picture Arrangement    10.78     2.31     10.08     2.50     .44***
Object Assembly        11.67     2.84     10.08     3.02     .32*
Verbal IQ             101.34    12.30     97.86    13.56     .91***
Performance IQ        109.73    12.12    104.87    16.00     .61***
Full Scale IQ         105.27    11.83    100.67    14.60     .82***

Note. Subtest means are in scaled score points and Verbal IQ, Performance IQ, and Full Scale IQ means are in IQ points. *p < .01; ***p < .0001
Figure 1
Test X Order Interaction for WAIS and WAIS-R Verbal IQ Scores
[Line graph not reproduced. Vertical axis: VIQ; horizontal axis: Order of Administration; lines: WAIS and WAIS-R.]

Figure 2
Test X Order Interaction for WAIS and WAIS-R Performance IQ Scores
[Line graph not reproduced. Vertical axis: PIQ; horizontal axis: Order of Administration; lines: WAIS and WAIS-R.]

Figure 3
Test X Order Interaction for WAIS and WAIS-R Full Scale IQ Scores
[Line graph not reproduced. Vertical axis: FSIQ; horizontal axis: Order of Administration; lines: WAIS and WAIS-R.]
Order 1 and Order 2, and simple effects for Order within
Test. Table 9 presents these results for the six
subtests and three IQ scores with significant Test x
Order interactions.
When the WAIS-R followed the WAIS in Order 1 (Table
9, Column 1 vs. 2), the WAIS produced a significantly
higher score on only one subtest, Similarities. The
WAIS-R produced a significantly higher Performance IQ
score by 3.98 points. All other test differences were
non-significant.
When the WAIS followed the WAIS-R in Order 2 (Table
9, Column 3 vs. 4), the WAIS resulted in significantly
higher scores on all six of the subtests. The WAIS also
produced a higher Verbal IQ by 5.61 points, Performance
IQ by 14.76 points, and Full Scale IQ by 10.58 points.
WAIS scores were compared in Order 1 and Order 2
(WAIS first vs. WAIS second). When the WAIS was the
second scale administered (Table 9, Column 3 vs. 1),
significantly higher scaled scores resulted on
Similarities, Digit Symbol, Picture Completion, and
Object Assembly. All three of the IQ scores were
significantly higher when the WAIS was administered as
the second scale (VIQ, 3.19 points; PIQ, 6.65 points;
FSIQ, 4.88 points). WAIS-R scores were also compared in
Table 9
Main Effects of Test within Order and Order within Test for the Six Subtest and Three IQ Scores with
Significant Test x Order Interactions (Tukey's Method)

               Order 1              Order 2          Diff. of T w/in O     Diff. of O w/in T
            WAIS    WAIS-R       WAIS    WAIS-R
Subtest     (1)      (2)         (3)      (4)          1-2       3-4         3-1       2-4
SIM         9.73     8.30       11.70     8.36        1.43*     3.34*       1.97*    -0.06
DSYM       11.46    10.94       12.15     9.82        0.52      2.33*       0.69*     1.12*
PC          9.89     9.35       11.36     8.67        0.54      2.69*       1.47*     0.68
BD         11.30    11.27       11.82     9.97        0.03      1.85*       0.52      1.30*
PA         10.54    10.81       11.06     9.27       -0.27      1.79*       0.52      1.54*
OA         10.78    11.32       12.67     8.70       -0.54      3.97*       1.89*     2.62*
VIQ        99.84    98.24      103.03    97.42        1.60      5.61*       3.19*     0.82
PIQ       106.59   110.57      113.24    98.48       -3.98*    14.76*       6.65*    12.09*
FSIQ      102.97   103.70      107.85    97.27       -0.73     10.58*       4.88*     6.43*

Order 1 = WAIS followed by WAIS-R; Order 2 = WAIS-R followed by WAIS.
T = Test; O = Order.
Subtest means are in scaled score points and Verbal IQ, Performance IQ, and Full Scale IQ means are in IQ points.
SIM = Similarities; DSYM = Digit Symbol; PC = Picture Completion; BD = Block Design; PA = Picture Arrangement; OA = Object Assembly; VIQ = Verbal IQ; PIQ = Performance IQ; FSIQ = Full Scale IQ.
*p < .01.
Order 1 and Order 2 (Table 9, Column 2 vs. 4). When the
WAIS-R was administered as the second scale (Order 1),
four of the six subtests (Digit Symbol, Block Design,
Picture Arrangement, Object Assembly) and two of the IQ
scores (PIQ, 12.09 points; FSIQ, 6.43 points) were
significantly higher than when the WAIS-R was
administered as the first scale.
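The four difference columns of Table 9 follow directly from the cell means. The short Python sketch below recomputes them for the three IQ scores, as a way of checking the arithmetic. The function and variable names are illustrative assumptions, not part of the original analysis, and the sketch does not reproduce the Tukey significance tests, only the differences between means.

```python
# Cell means for the three IQ scores, copied from Table 9.
# Columns: (1) WAIS in Order 1, (2) WAIS-R in Order 1,
#          (3) WAIS in Order 2, (4) WAIS-R in Order 2.
CELL_MEANS = {
    "VIQ": (99.84, 98.24, 103.03, 97.42),
    "PIQ": (106.59, 110.57, 113.24, 98.48),
    "FSIQ": (102.97, 103.70, 107.85, 97.27),
}


def simple_effect_differences(means):
    """Return the four differences between cell means tabulated in Table 9."""
    c1, c2, c3, c4 = means
    return {
        "test_within_order1": round(c1 - c2, 2),  # column 1 minus column 2
        "test_within_order2": round(c3 - c4, 2),  # column 3 minus column 4
        "order_within_wais": round(c3 - c1, 2),   # column 3 minus column 1
        "order_within_waisr": round(c2 - c4, 2),  # column 2 minus column 4
    }


differences = {
    score: simple_effect_differences(means)
    for score, means in CELL_MEANS.items()
}

for score, diffs in differences.items():
    print(score, diffs)
```

Read this way, a negative value under Test within Order 1 (as for Performance IQ) means that the WAIS-R mean exceeded the WAIS mean when the WAIS-R was given second.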
Even though it is important to ascertain the effect
that order of administration has on test results,
differential practice effects make it difficult to
determine actual test differences. Therefore, a
between-groups comparison of subjects taking their first
Wechsler scale was made. Table 10 presents differences
between means for these "naive" subjects. WAIS scores
were significantly higher than WAIS-R scores for the
following subtests: Similarities, Vocabulary, and Object
Assembly. Performance IQ scores and Full Scale IQ scores
were significantly higher on the WAIS than on the WAIS-R.
Hypotheses 3 and 4
The three racial groups (White, Black,
Mexican-American) obtained significantly different scores
for 10 of the 11 subtests and for all three IQ scores
(see Table 7).
The Scheffe method of multiple comparisons between
Table 10
Comparison of WAIS and WAIS-R Subtest and IQ Means for 70 High School Students:
Breakdown of Naive Subjects

                         WAIS (N = 37)      WAIS-R (N = 33)
Subtest                   M        SD         M        SD       Difference
Information              8.16     2.23       7.27     2.15        0.89
Comprehension            9.35     3.42       9.09     2.47        0.26
Arithmetic               9.30     2.72       8.79     2.33        0.51
Similarities             9.73     2.47       8.36     2.29        1.37*
Digit Span               9.49     3.23       8.58     2.17        0.91
Vocabulary               8.92     2.67       7.18     1.69        1.74*
Digit Symbol            11.46     2.05       9.82     2.69        1.64
Picture Completion       9.89     2.00       8.67     2.15        1.22
Block Design            11.30     3.25       9.97     3.25        1.33
Picture Arrangement     10.54     1.73       9.27     2.04        1.18
Object Assembly         10.78     2.52       8.70     2.72        2.08**
Verbal IQ               99.84    12.97      97.42    12.10        2.41
Performance IQ         106.59    10.31      98.48    13.96        8.11**
Full Scale IQ          102.97    11.27      97.27    12.41        5.70*
means was used to determine whether White subjects scored
significantly higher on the WAIS and the WAIS-R than
Blacks and Mexican-American subjects. The subtest and IQ
means for White subjects were compared with those of
Black and Mexican-American subjects separately as well as
with the combined means of Black and Mexican-American
subjects. Table 11 presents the mean subtest and IQ
scores for the three racial groups along with F values
for the three comparisons examined.
Although White subjects scored higher than Black
subjects on all subtests and IQ scores, significant
differences were found for only three subtests:
Information, Arithmetic, and Vocabulary. White subjects
also scored significantly higher than Black subjects on
the Verbal and Full Scale IQ scores.
White subjects scored higher than Mexican-American
subjects on all subtests and IQ scores, with
significantly higher scores on the Comprehension and
Similarities subtests. White subjects also obtained
significantly higher Verbal and Full Scale IQ scores than
Mexican-American subjects.
When subtest and IQ scores of the White subjects
were compared with the combined scores of Black and
Mexican-American subjects, White subjects scored
Table 11
[Table not legible in this transcript. It gives mean WAIS and WAIS-R subtest and IQ scores for White, Black, and Mexican-American subjects, with F values for three comparisons: White vs. Black, White vs. Mexican-American, and White vs. Black and Mexican-American combined.]
significantly higher on all of the Verbal subtests with
the exception of Digit Span and all three of the IQ
scores. None of the Performance subtest scores were
significantly different from one another.
The predicted Test x Race interaction was found to
be significant only for Performance IQ scores. This was
an indication that the race of the subject significantly
affected their differential performance on the WAIS and
WAIS-R Performance IQ scores (see Table 7). Figure 4
illustrates this interaction. For the Performance IQ
score, WAIS scores were higher than WAIS-R scores for
each racial group. The WAIS vs. WAIS-R difference in PIQ
scores for Black subjects was significantly larger than
the WAIS vs. WAIS-R difference in PIQ scores for White
and Mexican-American subjects.
Tukey's procedure was used to compare mean PIQ
scores for each racial group on the WAIS and the WAIS-R.
Results indicate that significant differences between the
PIQ scores of Whites vs. Blacks and Mexican-Americans
exist on both the WAIS and the WAIS-R.
Supplementary Data Analyses
Hypothesis 5
It was predicted that WAIS subtest and IQ scores
would be significantly correlated with corresponding
Figure 4
Race X Test Interaction for WAIS and WAIS-R Performance IQ Scores
[Line graph not reproduced. Vertical axis: PIQ; horizontal axis: WAIS, WAIS-R; lines: White, Black, Mexican-American (M-A). Point labels legible in the source: White, 115.76 and 112.35; M-A, 102.90; Black, 95.50.]
WAIS-R scores. Table 8 presents the correlation
coefficients for WAIS and WAIS-R subtest and IQ scores.
All correlations were statistically significant.
The three WAIS IQ scores were significantly
correlated with corresponding WAIS-R IQ scores. There
was a tendency for Verbal subtest and IQ scores to have
higher correlation coefficients than Performance scores.
The subtests whose correlations had the greatest
magnitude were Vocabulary (r = 0.89), Information (r =
0.86), and Arithmetic (r = 0.75). These subtests
contributed to a highly positive relationship between the
Verbal IQ scores on the WAIS and WAIS-R (r = 0.91).
Conversely, the subtests with the lowest magnitude of
correlation between the two tests were Object Assembly (r
= 0.32), Picture Arrangement (r = 0.44), and Picture
Completion (r = 0.48). These lower correlations
contributed to a lower correlation between the
Performance IQ scores of the two tests (r = 0.61).
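For readers who wish to see how a coefficient of this kind is obtained, the following Python sketch computes a Pearson product-moment correlation between paired scores. The text does not state which software or formula was used, so this is an assumption, and the paired scaled scores shown are hypothetical values chosen for illustration; they are not data from the present study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scaled scores for five subjects (illustration only).
wais_scores = [8, 10, 12, 9, 11]
wais_r_scores = [7, 9, 10, 9, 10]

r = pearson_r(wais_scores, wais_r_scores)
print(round(r, 2))
```

A high coefficient indicates only that subjects keep a similar rank order on the two tests; it says nothing about whether mean scores differ, which is why the correlations in Table 8 can be high even though WAIS means exceed WAIS-R means.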
Test X Order x Examiner
A 2 X 2 X 5 analysis of variance with one within-
(Test) and two between-subjects (Order, Examiner)
variables was performed as a supplementary test to
determine whether examiners assigned significantly
different test scores from one another. Table 12
Table 12
F Values for Examiner Main Effect and Examiner Interactions with Test and Order

Subtest                Examiner    T x E    O x E    T x O x E
Information              0.70      1.02     0.00       ?.25
Comprehension            1.01      0.51     0.12       ?.20
Arithmetic               0.64      1.05     0.99       0.35
Similarities             1.36      3.37     0.45       1.56
Digit Span               0.56      2.88     1.38       0.92
Vocabulary               1.85      0.66     0.75       0.68
Digit Symbol             0.48      2.09     0.56       4.35*
Picture Completion       1.94      1.90     0.85       2.04
Block Design             1.04      0.56     0.88       0.48
Picture Arrangement      0.28      0.20     1.00       2.27
Object Assembly          1.40      1.60     0.60       0.45
Verbal IQ                1.08      1.83     0.45       2.25
Performance IQ           1.22      1.95     0.77       2.80
Full Scale IQ            1.10      1.82     0.46       3.37

Note. *p < .01.
T = Test; O = Order; E = Examiner. ? = leading digit not legible in the source.
presents F values for the Examiner main effect and
Examiner interactions with Test and Order of
administration.
According to these results, there were no
significant differences between examiner scores for any
of the subtest or IQ scores. There were also no
significant Test x Examiner interactions, indicating that
the examiner was not a factor affecting WAIS/WAIS-R test
differences for any of the subtest or IQ scores.
However, there was a significant Test x Order x Examiner
interaction found for one subtest, Digit Symbol. There
were no significant interactions found for any of the IQ
scores.
Figures 5 and 6 illustrate the interaction within
Order 1 and 2, respectively, for the Digit Symbol
subtest. In Order 1, all of the examiners with the
exception of Examiner 3 tested subjects whose WAIS mean
scores were higher than their WAIS-R mean scores. In
contrast, Examiner 3 tested subjects whose
WAIS-R mean scores were higher than WAIS mean scores. It
should also be noted that Examiner 4 obtained a larger
difference between WAIS and WAIS-R Digit Symbol scores in
Order 1 than the other examiners. In Order 2, all
examiners tested subjects whose mean scores were higher
on the WAIS than on the WAIS-R Digit Symbol subtest.
Figure 5
Test X Order x Examiner Interaction (Order 1) for WAIS and WAIS-R Digit Symbol Subtest
[Line graph not reproduced. Vertical axis: Digit Symbol scaled score; horizontal axis: WAIS, WAIS-R; one line per examiner.]

Figure 6
Test X Order x Examiner Interaction (Order 2) for WAIS and WAIS-R Digit Symbol Subtest
[Line graph not reproduced. Vertical axis: Digit Symbol scaled score; horizontal axis: WAIS, WAIS-R; one line per examiner.]
However, differences between WAIS and WAIS-R scores were
much larger in Order 2 than in Order 1 for all examiners
except Examiner 5. Examiner 5 tested subjects whose mean
scores were only 0.1 scaled point higher on the WAIS than
the WAIS-R Digit Symbol subtest.
Test Differences and IQ Categories
Differences between WAIS and WAIS-R Performance and
Full Scale IQ scores varied significantly according to
the IQ category of the individuals tested. WAIS-R Full
Scale IQ scores were used to define three categories:
Below Average (IQ < 90), Average (89 < IQ < 110), and
Above Average (IQ > 109). For Performance IQ scores, the
discrepancy between the two tests was significantly
different for the Above Average group as compared to the
Average and Below Average groups (see Table 13). Mean
WAIS PIQ scores were 3.89 points lower than WAIS-R scores
in the Above Average group. In contrast, mean WAIS PIQ
scores were 6.89 points higher than WAIS-R PIQ scores in
the Average group and 11.07 points higher than WAIS-R PIQ
scores in the Below Average group.
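The grouping rule described above can be sketched in a few lines of Python. The cutoffs (below 90, 90 to 109, 110 and above) come from the text; the function names and the five example records are hypothetical and serve only to show how a mean WAIS minus WAIS-R difference would be obtained within each category.

```python
from statistics import mean


def iq_category(wais_r_fsiq):
    """Classify a WAIS-R Full Scale IQ using the cutoffs given in the text."""
    if wais_r_fsiq < 90:
        return "Below Average"
    if wais_r_fsiq < 110:
        return "Average"
    return "Above Average"


def mean_difference_by_category(records):
    """Average WAIS minus WAIS-R difference within each WAIS-R IQ category.

    records: iterable of (wais_score, wais_r_score, wais_r_fsiq) tuples.
    """
    grouped = {}
    for wais, wais_r, fsiq in records:
        grouped.setdefault(iq_category(fsiq), []).append(wais - wais_r)
    return {category: mean(diffs) for category, diffs in grouped.items()}


# Hypothetical Full Scale IQ pairs for illustration only; not study data.
hypothetical_records = [
    (95, 85, 85),
    (99, 88, 88),
    (106, 100, 100),
    (104, 99, 99),
    (118, 120, 120),
]

category_means = mean_difference_by_category(hypothetical_records)
print(category_means)
```

Because the categories are defined by the WAIS-R score itself, a subject's position in a category is not independent of the difference score, which is worth keeping in mind when interpreting the category means.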
For Full Scale IQ scores, significant differences
between WAIS and WAIS-R scores were also found for the
Above Average as compared to the Average and Below
Average groups (see Table 13). Mean WAIS-R FSIQ scores
Table 13
WAIS and WAIS-R Mean IQ Differences in Three IQ Categories

                        N      Mean       SD
Average IQ (90-109)
VIQ Difference         36      3.69      5.46
PIQ Difference         36      6.89     11.15
FSIQ Difference        36      5.78      6.97

Below Average IQ (69-89)
VIQ Difference         15      6.40      3.29
PIQ Difference         15     11.07      7.90
FSIQ Difference        15      9.53      4.00

Above Average IQ (110-143)
VIQ Difference         19      0.79      6.07
PIQ Difference         19     -3.89     15.06
FSIQ Difference        19     -1.53     10.18

Note. Each difference score is computed by subtracting the mean WAIS-R score from the mean WAIS score. IQ categories are defined by WAIS-R scores.
were 1.53 points higher than WAIS FSIQ scores in the
Above Average range. This is compared to the Average and
Below Average groups in which WAIS FSIQ scores were 5.78
and 9.53 points higher than WAIS-R FSIQ scores,
respectively.
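As a consistency check, the category means in Table 13 can be pooled, weighting each by its group size, to see whether they recover the overall WAIS minus WAIS-R differences. The Python sketch below does this. The group size of 36 for the Average category is an assumption taken as 70 minus the 15 Below Average and 19 Above Average subjects, and the names used are illustrative.

```python
# (N, mean VIQ difference, mean PIQ difference, mean FSIQ difference)
# for each WAIS-R IQ category, as given in Table 13.
CATEGORIES = {
    "Below Average": (15, 6.40, 11.07, 9.53),
    "Average": (36, 3.69, 6.89, 5.78),  # N assumed to be 70 - 15 - 19
    "Above Average": (19, 0.79, -3.89, -1.53),
}


def pooled_differences(categories):
    """Weight each category mean by its N and return the pooled means."""
    total_n = sum(row[0] for row in categories.values())
    pooled = []
    for column in (1, 2, 3):  # VIQ, PIQ, FSIQ difference columns
        weighted_sum = sum(row[0] * row[column] for row in categories.values())
        pooled.append(round(weighted_sum / total_n, 2))
    return total_n, tuple(pooled)


total_n, (pooled_viq, pooled_piq, pooled_fsiq) = pooled_differences(CATEGORIES)
print(total_n, pooled_viq, pooled_piq, pooled_fsiq)
```

The pooled values can be compared with the differences between the WAIS and WAIS-R means in Table 6 (101.34 vs. 97.86, 109.73 vs. 104.87, and 105.27 vs. 100.67).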
A limitation inherent in this comparison of IQ
categories is that the Below Average group was composed
almost entirely of minorities (see Figure 7). Further,
there was only one minority subject included in the Above
Average group. However, the significant difference that
was found between test scores in the Average and Above
Average categories in the overall analysis was also found
in an analysis of the data taken from White subjects
only. As there was only one White subject in the Below
Average group, it was not possible to compare Below
Average with Above Average categories in this sample.
Figure 7
[Figure not legible in this transcript. It shows the distribution of subjects' IQ scores for the White, Black, and Mexican-American groups, with the mean and N for each group.]
CHAPTER IV
DISCUSSION
This section will begin with a general discussion of
the findings for the principal hypotheses. Following the
general discussion, the limitations of the present study
are outlined. Finally, suggestions are presented for
future research.
Results and Implications of Hypothesis Testing
Test Differences
Hypothesis 1 predicted that WAIS subtest and IQ
scores would be significantly higher than corresponding
WAIS-R scores when these tests were administered to a
sample of 70 high school students. This hypothesis was
confirmed for 10 of the 11 subtests (with the exception
of the Arithmetic subtest) and all three IQ scores.
These results are comparable to earlier WAIS/WAIS-R
comparisons, in which investigators found that WAIS IQs
were significantly higher than WAIS-R IQs (e.g., Kelly,
Montgomery, Felleman, & Webb, 1984; Lippold & Claiborn,
1983; Mishra & Brown, 1983; Prifitera & Ryan, 1983;
Rabourn, 1983; Smith, 1983; Urbina, Golden, & Ariel,
1983; Wechsler, 1981).
In the present study, WAIS VIQ, PIQ, and FSIQ scores
were higher than corresponding scores on the WAIS-R by
3.48, 4.86, and 4.60 points, respectively. These IQ
differences are somewhat lower than the 7-9 point
differences found in several of the previous studies that
sampled other populations (e.g., Prifitera & Ryan, 1983;
Lippold & Claiborn, 1983; Rabourn, 1983; Smith, 1983;
Wechsler, 1981). However, research data would suggest
that these differences are nonetheless meaningful.
Wechsler (1981) reported average standard errors of
measurement for Verbal, Performance, and Full Scale IQ
scores of 2.74, 4.14, and 2.53 points, respectively
(p. 33). WAIS and WAIS-R IQ scores would be considered
roughly equivalent if the differences between
corresponding scores fell within this predicted range.
However, differences between WAIS and WAIS-R IQ scores in
the present study exceeded this expected margin. This is
further evidence that the scores between the two tests
were significantly different.
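The comparison made in this paragraph can be laid out explicitly. The Python sketch below sets the mean differences reported for the present study beside the average standard errors of measurement cited from Wechsler (1981) and checks whether each difference exceeds its standard error. The names are illustrative assumptions, and the check is a simple magnitude comparison rather than a formal significance test.

```python
# Mean WAIS minus WAIS-R differences reported in the present study (IQ points).
OBSERVED_DIFFERENCE = {"VIQ": 3.48, "PIQ": 4.86, "FSIQ": 4.60}

# Average standard errors of measurement cited from Wechsler (1981, p. 33).
STANDARD_ERROR = {"VIQ": 2.74, "PIQ": 4.14, "FSIQ": 2.53}


def exceeds_sem(observed, sem):
    """Flag each scale whose mean difference is larger than its SEM."""
    return {scale: observed[scale] > sem[scale] for scale in observed}


exceeds = exceeds_sem(OBSERVED_DIFFERENCE, STANDARD_ERROR)

# Ratio of each difference to its SEM, as a rough index of relative size.
ratios = {
    scale: OBSERVED_DIFFERENCE[scale] / STANDARD_ERROR[scale]
    for scale in OBSERVED_DIFFERENCE
}

for scale in OBSERVED_DIFFERENCE:
    print(scale, exceeds[scale], round(ratios[scale], 2))
```

On these figures the Performance IQ difference exceeds its standard error by the smallest margin and the Full Scale IQ difference by the largest.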
There are several possible explanations for the
higher scores usually obtained on the WAIS as compared to
the WAIS-R. Flynn (1984) noted that since 1932 it has
been necessary to establish increasingly more difficult
norms for intelligence tests to compensate for the
improved test performance of the standardization sample.
He concluded that, as a result of the increasing
difficulty level of the tests, subjects who are tested
with two different intelligence tests nearly always score
higher on the test that was standardized earlier.
There are significant differences between the
general characteristics of the WAIS and WAIS-R
standardization samples. The WAIS standardization sample
was tested in the early 1950's, prior to the impact of
television and mass media, and before the prominence of
child-development movements that stress the importance of
early stimulation for young children. The WAIS-R sample
was tested in the late 1970's and had the benefits of
mass media, more enlightened and better educated parents,
and therefore greater cultural advantages. Not
surprisingly, the individuals in the WAIS-R sample
performed better than did their WAIS counterparts on the
types of tasks included in a Wechsler test.
There are several differences between the WAIS and
WAIS-R which may contribute to the improved performance
by the WAIS-R standardization sample. Rabourn (1983)
cites specific changes on the WAIS-R which could
potentially account for higher scoring by the
standardization sample. For example, the WAIS-R manual
instructs examiners to ask for a second response on items
needing two answers on the Comprehension subtest.
Further, there have been changes in directions, scoring,
timing, and wording for subtests which may account for
some of the differences. These changes could have
facilitated higher scoring in the WAIS-R standardization
group, resulting in a set of WAIS-R norms that are more
difficult than the WAIS norms. Therefore, when today's
adults are compared to their contemporaries in the WAIS-R
normative group, their scores will not be as high as when
they are compared to their age counterparts of just one
generation ago in the WAIS standardization sample.
The fact that individuals generally tend to obtain
significantly lower scores on the WAIS-R as compared to
the WAIS has implications regarding academic placement.
For example, discrepancies between achievement and
intellectual ability for students referred for a possible
learning disorder will tend to be smaller when the WAIS-R
is used instead of the WAIS. This will make it less
likely that students taking the WAIS-R will qualify for
remedial placement based on a learning disability.
Examiners should keep the discrepancy between the two
scores in mind when assessing the presence of learning
disabilities and recommending appropriate academic
placement. Past WISC/WISC-R comparisons have shown that
mainstreaming decisions may be contingent on which scale
is included in the assessment battery (cf. Swerdlik,
1977).
Hypothesis 2 predicted that the difference between
WAIS and WAIS-R scores would be affected by the order in
which the two tests were administered. This hypothesis
was confirmed for six subtests (Similarities, Digit
Symbol, Picture Completion, Block Design, Picture
Arrangement, Object Assembly) and all three IQ scores.
These results confirm earlier research in this area which
indicates that the order of test administration
differentially affects WAIS/WAIS-R test scores (Edwards &
Klein, 1984; Simon & Clopton, 1984; Smith, 1983). The
six WAIS and WAIS-R subtests and three IQ scores for
which significant Test x Order interactions were found
were compared within Order 1 and Order 2 (see Table 9).
The WAIS subtest and IQ scores were significantly higher
than corresponding WAIS-R scores when individuals took
the WAIS-R first (Order 2). These same practice effects
did not take place in Order 1 except for the Performance
IQ score, which was significantly higher on the WAIS-R.
Generally, therefore, when the WAIS was given first, the
practice effect served to mask the real difference
between the scores for the two tests (i.e., the two tests
approached equivalence). However, obtaining practice on
the WAIS-R before taking the WAIS (which is usually
higher than the WAIS-R anyway) magnifies the difference
between the tests.
It is interesting to note that all five of the
Performance subtest differences were affected by order of
administration, suggesting that the performance subtests
are more susceptible to practice effects than the verbal
subtests. This may be because the performance subtests
were novel to many subjects, unlike the verbal tasks,
which were similar to academic material. It is probable
that the intertest interval of 30 to 60 days in the
present study contributed to the differential practice
effects, particularly on the Performance subtests. It is
unlikely that an individual would be retested after such
a short length of time in a clinical setting, thus
decreasing the likelihood that practice on an earlier
test would affect test scores. Therefore, although it is
important to be aware of how these factors influence test
scores in the present research, these findings have
limited clinical implications.
Racial Differences
Hypothesis 3 predicted that White subjects would
score significantly higher than Black and
Mexican-American subjects on the WAIS and the WAIS-R subtest and
IQ scores. This hypothesis received some support in the
present study (see Table 11). White subjects received
significantly higher Verbal, Performance, and Full Scale
IQ scores than the two minority groups combined (see
Table 11). In addition, White subjects scored
significantly higher than Blacks and Mexican-Americans
(combined) on the Verbal subtests of Information,
Comprehension, Arithmetic, Similarities, and Vocabulary.
White subjects scored slightly higher on the remaining
Verbal subtest as well as on all of the Performance
subtests. However, these differences were not
significant.
These results are in agreement with past studies
that have compared White and minority IQ scores on
standardized intelligence tests (Holtzman, Diaz-Guerrero,
& Swartz, 1975; Jensen, 1980; Matarazzo, 1972; Sattler,
1974; Winter, 1968). In the United States, Black-White
group differences of between 10 and 15 Wechsler points
are not uncommon (for review, see Sundberg & Gonzales,
1981). In the present study, the differences between
mean IQ scores of White and Black subjects were 18.22
points on VIQ, 14.86 on PIQ, and 18.07 on FSIQ. The
differences between mean IQ scores of White and
Mexican-American subjects were somewhat lower, with a
16.35 point difference on VIQ, 11.00 point difference on
PIQ, and 15.12 point difference on FSIQ.
The significant differences found between White and
Black students occurred on Verbal subtests which measure
academic abilities: Information, Arithmetic, and
Vocabulary. One possible explanation for these results
is that, due to cultural differences, Black individuals
may have a more difficult time learning material in a
classroom setting than do White individuals. Black
individuals may have developed unique verbal skills
(i.e., Black Dialect) that are not measured by
conventional tests or accepted in a classroom setting
(Williams, 1970). Motivational factors may also
contribute to this discrepancy between scores. Blacks
and other minority group individuals may respond with "I
don't know" more quickly to terminate the unpleasantness
of interacting with a demanding adult. Studies have
suggested that minority group individuals, in comparison
with middle class individuals, are more wary of adults,
less motivated to be correct for the sake of correctness
alone, and often willing to settle for lower levels of
achievement and success (Zigler & Butterfield, 1968).
In the present study, White subjects did not score
significantly higher than either Black or
Mexican-American subjects on any of the Performance
subtests. Black and Mexican-American mean scores on
Performance subtests were 1-4 points higher overall than
their scores on Verbal subtests, contributing to the
decreased discrepancy between racial groups on
Performance IQ scores (see Table 11). Since verbal tasks
consistently produce racial differences, several
individuals have attempted to produce more "culture-fair"
tests by eliminating the language factor (e.g., The
Leiter International Performance Scale, 1966; The
Progressive Matrices, 1960). However, these types of
non-verbal tests have not been shown to have greater
predictive validity than verbal tests with minority
groups (Anastasi, 1976).
Several researchers (e.g., Loehlin, Lindzey, &
Spuhler, 1975; Matarazzo & Pankratz, 1980) have addressed
the rather consistent finding that minority groups score
lower on standardized intelligence tests than do White
subjects. Some of the explanations offered include
environmental deprivation, educational deficits, lower
socioeconomic status, genetic differences, and/or test
inadequacies. The present study did not attempt to
determine the causal factors underlying differences in IQ
scores between various racial groups. However, results
do suggest that IQ score differences between minority
groups and White individuals continue to be significant.
Hypothesis 4 predicted that the race of the subject
would affect differential performance on the WAIS and the
WAIS-R. However, a significant Race x Test interaction
was found for only the Performance IQ (see Table 7). In
other words, there were no significant differences
between the WAIS and the WAIS-R due to racial factors on
any of the subtest or IQ scores other than Performance
IQ. As expected, White, Black, and Mexican-American
groups all scored several points higher on the three WAIS
IQ scores as compared to corresponding WAIS-R IQ scores
(see Table 14). However, the difference between
WAIS/WAIS-R Performance IQ scores of Black subjects was
7.40 points as compared to a difference of 3.41 points
for Whites and 4.75 for Mexican-Americans. Therefore,
even though Black PIQ scores tend to be lower than PIQ
scores earned by the other two racial groups on both
tests, the discrepancy between WAIS and WAIS-R PIQ scores
is significantly greater for the Black as compared to the
White racial group (see Figure 4).
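
The WAIS/WAIS-R discrepancies cited above can be recomputed from the group means reported in Table 14. The following Python sketch is offered only as an illustration and was not part of the original analysis; the dictionary layout and the function name wais_minus_waisr are assumptions introduced here, and the values are the means as read from Table 14.

```python
# Mean IQ scores as read from Table 14: {group: {test: {scale: mean}}}.
TABLE_14 = {
    "White": {
        "WAIS": {"VIQ": 110.03, "PIQ": 115.76, "FSIQ": 113.29},
        "WAIS-R": {"VIQ": 107.06, "PIQ": 112.35, "FSIQ": 109.88},
    },
    "Black": {
        "WAIS": {"VIQ": 92.15, "PIQ": 102.90, "FSIQ": 96.60},
        "WAIS-R": {"VIQ": 88.50, "PIQ": 95.50, "FSIQ": 90.45},
    },
    "Mexican-American": {
        "WAIS": {"VIQ": 94.38, "PIQ": 105.44, "FSIQ": 99.06},
        "WAIS-R": {"VIQ": 90.00, "PIQ": 100.69, "FSIQ": 93.88},
    },
}


def wais_minus_waisr(table):
    """Return {group: {scale: WAIS mean minus WAIS-R mean}}, rounded to 2 places."""
    return {
        group: {
            scale: round(tests["WAIS"][scale] - tests["WAIS-R"][scale], 2)
            for scale in tests["WAIS"]
        }
        for group, tests in table.items()
    }


differences = wais_minus_waisr(TABLE_14)
for group, by_scale in differences.items():
    print(group, by_scale)
```

Under these assumptions the Performance IQ differences come out as 3.41 (White), 7.40 (Black), and 4.75 (Mexican-American), matching the figures given in the text. The sketch describes mean differences only and says nothing about their statistical significance.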
Table 14
Comparison of WAIS and WAIS-R Verbal, Performance, and Full Scale Mean IQ Scores for White,
Black, and Mexican-American Subjects

                 White     Black     Mexican-American
WAIS
  VIQ            110.03     92.15     94.38
  PIQ            115.76    102.90    105.44
  FSIQ           113.29     96.60     99.06
WAIS-R
  VIQ            107.06     88.50     90.00
  PIQ            112.35     95.50    100.69
  FSIQ           109.88     90.45     93.88

Note. VIQ = Verbal IQ, PIQ = Performance IQ, FSIQ = Full Scale IQ.

Significant differences in the Performance IQ scores
of White and minority groups continue to exist on the
WAIS-R (see Figure 4). There were no significant Race x
Test interactions found for the Verbal or Full Scale IQ
scores, although the Full Scale Test x Race interaction
approached significance (p < .013). The fact that
minority groups scored significantly lower than White
subjects on the WAIS-R Performance scale but not on the
Verbal scale suggests that, contrary to clinical lore,
performance items are no more culture fair than verbal
items on the WAIS-R. One might assume that, since the
WAIS-R was revised in several areas to make it a more
culture-fair test, racial IQ differences might decline on
the WAIS-R when compared with the WAIS. However, it
appears that WAIS-R revisions did not lessen the IQ score
differences between White and minority group subjects.
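
The claim that the revision did not narrow the group differences can be checked descriptively against the means in Table 14. The sketch below is illustrative rather than a reproduction of the original analysis; the names MEANS and white_minus_group are hypothetical, and the values are the means as read from Table 14.

```python
# Mean IQ scores as read from Table 14: {group: {test: {scale: mean}}}.
MEANS = {
    "White": {
        "WAIS": {"VIQ": 110.03, "PIQ": 115.76, "FSIQ": 113.29},
        "WAIS-R": {"VIQ": 107.06, "PIQ": 112.35, "FSIQ": 109.88},
    },
    "Black": {
        "WAIS": {"VIQ": 92.15, "PIQ": 102.90, "FSIQ": 96.60},
        "WAIS-R": {"VIQ": 88.50, "PIQ": 95.50, "FSIQ": 90.45},
    },
    "Mexican-American": {
        "WAIS": {"VIQ": 94.38, "PIQ": 105.44, "FSIQ": 99.06},
        "WAIS-R": {"VIQ": 90.00, "PIQ": 100.69, "FSIQ": 93.88},
    },
}


def white_minus_group(means, group):
    """Return {test: {scale: White mean minus the given group's mean}}."""
    return {
        test: {
            scale: round(means["White"][test][scale] - means[group][test][scale], 2)
            for scale in means["White"][test]
        }
        for test in means["White"]
    }


gaps = {g: white_minus_group(MEANS, g) for g in ("Black", "Mexican-American")}
for group, by_test in gaps.items():
    for test, by_scale in by_test.items():
        print(group, test, by_scale)
```

With these values, the gap on every scale is at least as large on the WAIS-R as on the WAIS for both minority groups (for example, the White-Black Performance IQ gap is 12.86 on the WAIS and 16.85 on the WAIS-R). This is a comparison of means only and does not replace the significance tests reported in the text.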
Wechsler made several changes in the WAIS-R in an
attempt to improve the face validity of the test. He
included several pictures of Black individuals in the
Picture Completion and Picture Arrangement subtests and
added questions about Louis Armstrong and Martin Luther
King in the Information subtest. In an effort to
determine whether changing and/or adding any of these
items on the WAIS-R led to improved scores for minority
group individuals, three-dimensional chi square analyses
(Item x Race x Right/Wrong) were performed on two of the
Information items that were added (Armstrong, King).
Responses of individuals from each racial group were
compared on the identified item as well as on the two
items of similar difficulty on either side of the
identified item. Table 15 presents the results of these
analyses in terms of the percentage of correct responses
given for each item by the three racial groups. For the
Armstrong item, a significant Race x Item interaction was
found, indicating that the race of the subject
differentially affected their response to the compared
items. Black subjects answered the Armstrong item
correctly more often than the other items of similar
difficulty. This may be contrasted with the differential
performance of the White subjects, as they tended to give
a higher percentage of incorrect responses to the
Armstrong item as compared to the items of similar
difficulty. Therefore, it may be concluded that Black
subjects found the Armstrong item relatively easier than
the other items of supposedly similar difficulty, whereas
White subjects found the Armstrong item much more
difficult than the other items.
Table 15
Percentage of Correct Responses for Armstrong and King Items on the WAIS-R

                   White    Black    Mexican-American
Armstrong item     26.5     55.0     25.0
Other 2 items*     71.0     40.0     34.0
King item          91.0     70.0     50.0
Other 2 items*     47.0     23.0     25.0

*Note. The other 2 items refer to the questions immediately preceding and following the Armstrong and King items. These items are supposedly at similar levels of difficulty.

Analysis of the King item revealed that the race of
the subject did not differentially affect their
performance on the King item as compared to two items of
similar difficulty. However, there was a significant
main effect for Race such that White subjects tended to
answer all three items correctly more often than Black or
Mexican-American subjects. There was also a significant
main effect for item, which indicates that there is a
difference in the difficulty level of the three items.
All subjects found the two comparison items more
difficult than the King item. Overall, the inclusion of
the King and Armstrong items appears to add only to the
face validity of the WAIS-R and did not serve to
significantly lessen the discrepancy between White and
minority group IQ scores.
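
The pattern described for the two added items can be seen by subtracting the percentage correct on the comparison items from the percentage correct on the target item within each racial group. The following sketch is illustrative only; the names TABLE_15 and target_minus_comparison are hypothetical, and the assignment of each percentage to a group follows the pattern described in the text for Table 15.

```python
# Percent correct as read from Table 15:
# {item: {group: (target item, other 2 items)}}.
TABLE_15 = {
    "Armstrong": {
        "White": (26.5, 71.0),
        "Black": (55.0, 40.0),
        "Mexican-American": (25.0, 34.0),
    },
    "King": {
        "White": (91.0, 47.0),
        "Black": (70.0, 23.0),
        "Mexican-American": (50.0, 25.0),
    },
}


def target_minus_comparison(table):
    """Return {item: {group: target percent minus comparison percent}}."""
    return {
        item: {
            group: round(target - comparison, 1)
            for group, (target, comparison) in groups.items()
        }
        for item, groups in table.items()
    }


relative_difficulty = target_minus_comparison(TABLE_15)
for item, by_group in relative_difficulty.items():
    print(item, by_group)
```

A positive value means the target item was answered correctly more often than its neighbors. Under these assumptions the Armstrong item is positive only for Black subjects (15.0, against -44.5 for White and -9.0 for Mexican-American subjects), which is the Race x Item interaction described above, whereas the King item is positive for all three groups.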
Supplementary Findings
Correlational data. Even though significant
differences exist between the WAIS and WAIS-R IQ scores,
additional evidence supports the idea that there is a
close relationship between the basic abilities tapped by
the two tests. Factor analyses conducted on the WAIS and
WAIS-R show no significant differences between the factor
structures of the tests (cf. Blaha & Wallbrown, 1982;
Gutkin, Reynolds, & Galvin, 1984; Parker, 1983;
Silverstein, 1982). Basically, there are three major
factors which are found to load consistently into each
scale. These include a general factor common to all
subtests, a Verbal Comprehension factor, and a Perceptual
Organization factor. Several studies also identified an
additional factor in both tests labeled Freedom from
Distractibility.
Numerous studies have found strong correlations
between WAIS and WAIS-R IQ scores (e.g., Edwards & Klein,
1984; Mishra & Brown, 1983; Simon & Clopton, 1984; Smith,
1983; Urbina, Golden, & Ariel, 1983; Wechsler, 1981). As
predicted, the correlational results of the present study
(r = .91, .61, and .82 for VIQ, PIQ, and FSIQ,
respectively) are similar to those in Wechsler's (1981)
study (r = .91, .79, and .88 for VIQ, PIQ, and FSIQ,
respectively). These highly positive correlations are
not surprising in view of the large overlap in item
content on the WAIS and the WAIS-R. However, these
results also indicate that the WAIS and the WAIS-R
measure similar abilities. Since the two tests also
appear to be measuring similar constructs of
intelligence, it is suggested that the subtest and IQ
score discrepancies on the two tests are a result of a
difference in standardization norms.
Examiner variable. A Test x Order x Examiner
analysis of variance was performed to determine whether
the five examiners produced significantly different
scores from one another. There were no significant
examiner main effects found for any of the subtest or IQ
scores. However, there was one Test x Order x Examiner
interaction found for the Digit Symbol subtest. This
interaction was unexpected as the instructions for this
particular subtest are specific, a precise time limit of
90 seconds is given on both the WAIS and WAIS-R, and
there is no subjectivity involved in scoring. Despite
this one specific instance of examiner variance, overall
the results indicate that the examiners in this study did
not produce significantly different scores.
IQ category differences. An analysis of WAIS and
WAIS-R differences in three different IQ categories
suggests that differences between the two tests are
affected by the IQ of the subject. Individuals in the
Above Average group (IQ > 109) obtained slightly higher
scores on WAIS-R PIQ and FSIQ as compared to
corresponding scores on the WAIS. However, subjects in
the Average (89 < IQ < 110) and Below Average (IQ < 90)
groups earned significantly higher PIQ and FSIQ scores on
the WAIS as compared to the WAIS-R (see Table 13).
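
The three IQ categories used in this analysis amount to a simple classification rule. The sketch below restates the cutoffs given above; the function name iq_category is hypothetical, whole-number IQ scores are assumed, and the sketch does not specify which IQ score was used to assign subjects to categories.

```python
def iq_category(iq: int) -> str:
    """Classify a whole-number IQ score using the cutoffs stated in the text."""
    if iq > 109:
        return "Above Average"   # IQ > 109
    if iq > 89:
        return "Average"         # 89 < IQ < 110
    return "Below Average"       # IQ < 90


for score in (85, 90, 109, 110):
    print(score, iq_category(score))
```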
Flynn (1984) noted that it has been necessary to
establish increasingly stringent norms for intelligence
tests in the past 44 years. More stringent norms
translate into lower test scores obtained on the most
recently derived tests, e.g., WAIS-R. Flynn (1984)
attributed this discrepancy in test norms to the fact
that people have been getting smarter over the years.
However, the results of the present study suggest that
these differences in norms do not exist across all ranges
of intelligence. For example, it is true that subjects
of Below Average and Average intelligence tend to score
significantly higher on the WAIS than on the WAIS-R
(Kelly, Montgomery, Felleman, & Webb, 1984; Lippold &
Claiborn, 1983; Mishra & Brown, 1983; Prifitera & Ryan,
1983; Rabourn, 1983; Smith, 1983; Urbina, Golden, &
Ariel, 1983; Wechsler, 1981). In contrast, however, the
present study shows that individuals in the Above Average
category did not obtain the expected discrepancy between
WAIS and WAIS-R scores. In fact, WAIS-R scores were
slightly higher than WAIS scores for this group. Thus,
the current results suggest that, while the norms have
become more difficult for individuals of Average
intelligence, they have not changed as much for
individuals of Above Average intelligence. Edwards &
Klein (1984) also found that highly intelligent adults
did not perform significantly better on the WAIS as
compared to the WAIS-R. Further, a comparison of the
WAIS and WAIS-R in a mentally retarded sample (Simon &
Clopton, 1984) failed to support Flynn's (1984)
conclusion that overall IQ scores in the United States
are rising. Therefore, although Flynn's (1984)
assumptions about increased IQ may be true in the more
normal ranges of intelligence, it should be emphasized
that individuals in the extreme upper and lower ranges of
intelligence do not appear to be exhibiting these same
increases.
To further understand these test differences as they
relate to IQ category, the difference between variance
scores on the WAIS and the WAIS-R was examined. The
WAIS-R variance is larger than that obtained on the WAIS
in this study as well as in Wechsler's (1981) original
WAIS/WAIS-R comparison. These variance differences can
explain why WAIS-R scores are artificially inflated in
the Above Average ranges of intelligence, minimizing the
difference between WAIS and WAIS-R test scores. This may
also explain why the difference between the WAIS and the
WAIS-R is much greater for people in the Below Average
range of intelligence. Therefore, it is important for
researchers to take the variance of test scores into
consideration when examining test differences rather than
looking solely at differences between test means.
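
The reasoning about variance can be made concrete with a small numerical sketch. It assumes, purely for illustration, that a subject holds the same relative standing (z score) in the sample on both tests; the means and standard deviations below are hypothetical values chosen for the example and are not results from the present study.

```python
def expected_difference(z, wais_mean, wais_sd, waisr_mean, waisr_sd):
    """Expected WAIS minus WAIS-R score for a subject at relative standing z."""
    wais_score = wais_mean + wais_sd * z
    waisr_score = waisr_mean + waisr_sd * z
    return wais_score - waisr_score


# Hypothetical sample statistics (NOT from the study): the WAIS mean is
# higher, but the WAIS-R has the larger standard deviation.
HYPOTHETICAL = {
    "wais_mean": 105.0,
    "wais_sd": 13.0,
    "waisr_mean": 100.0,
    "waisr_sd": 15.0,
}

for z in (-1.5, 0.0, 1.5):
    print(z, expected_difference(z, **HYPOTHETICAL))
```

With these hypothetical values the expected WAIS advantage is 8 points at z = -1.5, 5 points at the mean, and 2 points at z = +1.5. A larger WAIS-R standard deviation therefore shrinks the difference in the upper range and enlarges it in the lower range, which is the pattern described above.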
Limitations of the Present Study
The present research was performed in a specific
population of 70 high school students between the ages of
16 and 19. Therefore, generalizing the results of this
study to other groups should be done with caution. It is
also interesting that the subjects were enrolled in a
school district that was segregated in the recent past.
In fact, one of the schools used still has a very large
number of minority as compared to white students. It is
possible that these factors contributed to the
significant differences between the IQ scores of the
three racial groups tested.
The design of the present study originally called
for each examiner to test the same number of subjects
with both tests. However, it became necessary for
examiners to test an unequal number of subjects. There
was also an unequal number of subjects assigned to each
order of administration due to differential subject
dropout. However, these minor deviations in the research
design do not appear to present a significant
limitation given the strength of the major findings.
Suggestions for Future Research
Further research is needed to compare scores
obtained on the WAIS-R and the WISC-R. A great deal of
effort has been expended comparing the two forms of the
Wechsler adult scale, but there has been limited research
examining the transition of scores from the WISC-R to the
WAIS-R. Another standardized test of intellectual
achievement that is becoming very popular in the academic
setting is the Kaufman Assessment Battery for Children
(K-ABC; 1983). It will be important for researchers to
determine whether these two tests provide comparable
scores, particularly since the K-ABC also provides
standardized scores in several areas of academic
achievement.
Several of the WAIS/WAIS-R comparison studies
suggest that the discrepancy between the scores on the
two tests varies according to the general IQ of the
population tested. For example, mentally retarded
subjects tend to score higher on the WAIS-R as compared
to the WAIS. This is completely opposite from research
findings with other populations in which WAIS scores are
significantly higher. It would be interesting for future
studies to examine the WAIS-R norms to determine the
reasons for this discrepancy.
REFERENCES
Anastasi, A. (1968). Psychological testing (3rd ed.). London: MacMillan and Company.
Baughman, E. E., & Dahlstrom, W. G. (1968). Negro and white children: A psychological study in the rural south. New York: Academic Press.
Beck, N. C. (1985). WAIS-R factor structure in psychiatric and general medical patients. Journal of Consulting and Clinical Psychology, 53, 402-405.
Berger, L., Bernstein, A., Klein, E., Cohen, J., & Lucas, G. (1964). Effects of aging and pathology on the factorial structure of intelligence. Journal of Consulting Psychology, 28, 199-207.
Berry, K., & Sherrets, S. (1975). A comparison of the WISC and WISC-R for special education students. Pediatric Psychology, 3, 14.
Blaha, J., & Wallbrown, F. H. (1982). Hierarchical factor structure of the Wechsler Adult Intelligence Scale-Revised. Journal of Consulting and Clinical Psychology, 50, 652-660.
Block, N. J., & Dworkin, G. (Eds.). (1976). The IQ controversy. New York: Pantheon Press.
Brooks, C. R. (1977). WISC, WISC-R, S-B L & M, WRAT: Relationships and trends among children ages six to ten referred for psychological evaluation. Psychology in the Schools, 14, 30-33.
Cohen, E. (1965). Examiner differences with individual intelligence tests. Perceptual and Motor Skills, ^ , 1324.
Cohen, J. (1957a). A factor-analytically based rationale for the WAIS. Journal of Consulting Psychology, 21, 451-457.
Cohen, J. (1957b). The factorial structure of the WAIS between early adulthood and old age. Journal of Consulting Psychology, 21, 283-290.
Coons, W. H., & Peacock, E. P. (1959). Inter-examiner reliability of the WAIS with mental hospital patients. Ortho-Psychiatric Association Quarterly, 22, 33-37.
Davis, W. E., Peacock, W., Fitzpatrick, P., & Mulhern, M. (1969). Examiner differences, prior failure, and subjects' WAIS arithmetic scores. Journal of Clinical Psychology, 25, 178-180.
Denerll, R. D., Broeder, J., & Sokolov, S. L. (1964). WISC and WAIS factors in children and adults with epilepsy. Journal of Clinical Psychology, 20, 236-240.
Deutsch, M., Fishman, J. A., Kogan, L., North, R., & Whiteman, M. (1964). Guidelines for testing minority children. Journal of Social Issues, 20, 129-145.
Doppelt, J. E., & Kaufman, A. S. (1977). Estimation of the differences between WISC-R and WISC IQs. Educational and Psychological Measurement, 37, 417-424.
Dreger, R. M., & Miller, F. S. (1968). Comparative psychological studies of negroes and whites in the United States: 1959-1965. Psychological Bulletin, 70, 1-58.
Edwards, B. T., & Klein, M. (1984). Comparison of the WAIS and WAIS-R with subjects of high intelligence. Journal of Clinical Psychology, 40, 300-302.
Egeland, B. (1969). Examiner expectancy: Effects on the scoring of the WISC. Psychology in the Schools, 6, 313-315.
Flynn, J. R. (1984). The mean IQ gain of Americans: Massive gains 1932 to 1978. Psychological Bulletin, 95, 29-51.
Galton, F. (1883). Inquiries into human faculty and its development. London: Macmillan and Company.
Graziano, W. G., Varca, P. E., & Levy, J. C. (1982). Race of examiner effects and the validity of intelligence tests. Review of Educational Research, 52, 469-497.
Guertin, W. H., Ladd, C. E., Frank, G. H., Rabin, A. I., & Hiester, D. S. (1971). Research with the Wechsler Intelligence Scales for adults: 1965-1970. The Psychological Record, 21, 289-339.
Gutkin, T. B., Reynolds, C. R., & Galvin, G. A. (1984). Factor analysis of the Wechsler Adult Intelligence Scale-Revised (WAIS-R): An examination of the standardization sample. Journal of School Psychology, 22, 83-93.
Holtzman, W. H., Diaz-Guerrero, R., & Swartz, J. D. (1975). Personality development in two cultures: A cross-cultural longitudinal study of school children in Mexico and the United States. Austin: University of Texas Press.
Jensen, A. R. (1980). Bias in mental testing. New York: Free Press.
Kaspar, J. C., Throne, F. M., & Schulman, J. L. (1968). A study of the inter-judge reliability in scoring the responses of a group of mentally retarded boys to three WISC subscales. Educational and Psychological Measurement, 28, 469-477.
Kaufman, A. S., & Kaufman, N. L. (1983). Kaufman Assessment Battery for Children: Administration and scoring manual. Circle Pines, Minnesota: American Guidance Service.
Kelly, M. P., Montgomery, M. L., Felleman, E. S., & Webb, W. W. (1984). Wechsler Adult Intelligence Scale and Wechsler Adult Intelligence Scale-Revised in a neurologically impaired population. Journal of Clinical Psychology, 40, 788-791.
Larrabee, G. H., & Halroyd, R. G. (1976). Comparison of WISC and WISC-R using a sample of highly intelligent children. Psychological Reports, 38, 1077-1080.
Leiter, R. G. (1966). Manual for the Leiter International Performance Scale: Parts I and II. Beverly Hills, California: Western Psychological Services.
Lippold, S., & Claiborn, J. M. (1983). Comparison of the Wechsler Adult Intelligence Scale and the Wechsler Adult Intelligence Scale-Revised. Journal of Consulting and Clinical Psychology, 51, 315.
Loehlin, J. C., Lindzey, G., & Spuhler, J. N. (1975). Race differences in intelligence. San Francisco: W. H. Freeman and Company.
Mahan, T. W. (1963). Diagnostic consistency and prediction: A note on graduate student skills. Personnel and Guidance Journal, 42, 364-367.
Masling, J. M. (1959). The effects of warm and cold interaction on the administration and scoring of an intelligence test. Journal of Consulting Psychology, 23, 336-341.
Massey, J. O. (1964). WISC scoring criteria. Palo Alto, California: Consulting Psychologists Press.
Matarazzo, J. D. (1972). Wechsler's measurement and appraisal of adult intelligence (5th ed.). Baltimore: Williams and Wilkins.
Matarazzo, J. D., & Pankratz, L. D. (1980). Intelligence. In R. H. Woody (Ed.), Encyclopedia of clinical assessment. San Francisco: Jossey-Bass.
Miller, C. K., & Chansky, N. M. (1972). Psychologists' scoring of WISC protocols. Psychology in the Schools, 9, 144-152.
Miller, C. K., Chansky, N. M., & Gredler, G. R. (1970). Rater agreement on WISC protocols. Psychology in the Schools, 7, 190-193.
Mishra, S. P., & Brown, K. H. (1983). The comparability of WAIS and WAIS-R IQs and subtest scores. Journal of Clinical Psychology, 39, 754-757.
Naglieri, J. A., & Kaufman, A. S. (1983). How many factors underlie the WAIS-R? Journal of Psychoeducational Assessment, 1, 113-119.
Oakland, T. (Ed.). (1977). Psychological and educational assessment of minority children. New York: Brunner/Mazel.
Oakland, T., & Feigenbaum, D. (1979). Multiple sources of test bias on the WISC-R and Bender-Gestalt test. Journal of Consulting and Clinical Psychology, 47, 968-974.
O'Grady, K. E. (1983). A confirmatory maximum likelihood factor analysis of the WAIS-R. Journal of Consulting and Clinical Psychology, 51, 826-831.
Parker, K. (1983). Factor analysis of the WAIS-R at nine age levels between 16 and 74 years. Journal of Consulting and Clinical Psychology, 51, 302-308.
Pedersen, D. M., Shinedling, M. M., & Johnson, D. L. (1968). Effects of sex of examiner and subject on children's quantitative test performance. Journal of Personality and Social Psychology, 10, 251-254.
Plake, B. S., Gutkin, T. B., & Kroetin, T. (1984). Confirmatory factor analysis of the WAIS-R: Plausibility of models. Journal of Psychoeducational Assessment, 2, 273-277.
Plumb, G. R., & Charles, D. C. (1955). Scoring difficulty of Wechsler Comprehension responses. Journal of Educational Psychology, 46, 179-183.
Prifitera, A., & Ryan, J. J. (1983). WAIS/WAIS-R comparisons in a clinical sample. Clinical Neuropsychology, 5, 97-99.
Quereshi, M. Y. (1968). Intelligence test scores as a function of sex of experimenter and sex of subject. Journal of Psychology, 69, 277-284.
Rabourn, R. E. (1983). The WAIS and WAIS-R: A comparison and a caution. Professional Psychology: Research and Practice, 14, 357-361.
Raven, J. C. (1960). Guide to the Standard Progressive Matrices. London: Lewis and Company.
Ryan, J. J. (1983). Scoring reliability on the WAIS-R. Journal of Consulting and Clinical Psychology, 51, 149-150.
Ryan, J. J., Georgemiller, R. J., Geisser, M. E., & Randall, D. M. (1985). Test-retest stability of the WAIS-R in a clinical sample. Journal of Clinical Psychology, 41, 552-555.
Ryan, J. J., Prifitera, A., & Larsen, J. (1982). Reliability of the WAIS-R with a mixed patient sample. Perceptual and Motor Skills, 55, 1277-1278.
Ryan, J. J., & Rosenberg, S. J. (1983). Relationship between the Wechsler Adult Intelligence Scale-Revised and the Wide Range Achievement Test in a sample of mixed patients. Perceptual and Motor Skills, 56, 623-626.
Ryan, J. J., Rosenberg, S. J., & DeWolfe, A. S. (1984). Generalization of the WAIS-R factor structure with a vocational rehabilitation sample. Journal of Consulting and Clinical Psychology, 52, 311-312.
Sandoval, J. (1979). The WISC-R and internal evidence of test bias with minority groups. Journal of Consulting and Clinical Psychology, 47, 919-927.
Sattler, J. M. (1969). Effects of cues and examiner influence on two Wechsler subtests. Journal of Consulting and Clinical Psychology, 33, 716-721.
Sattler, J. M. (1974). Assessment of children's intelligence. Philadelphia: W. B. Saunders Company.
Sattler, J. M., & Gwynne, J. (1982). White examiners generally do not impede the intelligence test performance of Black children: To debunk a myth. Journal of Consulting and Clinical Psychology, 50, 196-208.
Sattler, J. M., Hillix, W. A., & Neher, L. A. (1970). Halo effect in examiner scoring of intelligence test responses. Journal of Consulting and Clinical Psychology, 34, 172-176.
Sattler, J. M., & Winget, B. M. (1970). Intelligence testing procedures as affected by expectancy and IQ. Journal of Clinical Psychology, 26, 446-448.
Sattler, J. M., Winget, B. M., & Roth, R. J. (1969). Scoring difficulty of WAIS and WISC C, S, and V responses. Journal of Clinical Psychology, 25, 175-177.
Schroeder, H. E., & Kleinsasser, L. D. (1972). Examiner bias: A determinant of children's verbal behavior on the WISC. Journal of Consulting and Clinical Psychology, 19, 451-454.
Schwarting, F. G. (1976). A comparison of the WISC and WISC-R. Psychology in the Schools, 13, 139-141.
Schwartz, M. L. (1966). The scoring of WAIS C responses by experienced and inexperienced judges. Journal of Clinical Psychology, 17^, kl5-kll.
Shuey, A. M. (1966). The testing of negro intelligence (2nd ed.). New York: Social Science Press.
Silverstein, A. B. (1982). Factor structure of the Wechsler Adult Intelligence Scale-Revised. Journal of Consulting and Clinical Psychology, 50, 661-664.
Silverstein, A. B. (1985). Cluster analysis of the Wechsler Adult Intelligence Scale-Revised. Journal of Clinical Psychology, 41, 98-100.
Simon, W. E. (1969). Expectancy effects in the scoring of vocabulary items: A study of scorer bias. Journal of Educational Measurement, 6, 159-164.
Simon, C. L., & Clopton, J. R. (1984). Comparison of WAIS and WAIS-R scores of mildly and moderately mentally retarded adults. American Journal of Mental Deficiency, 89, 301-303.
Smith, R. S. (1982). A comparison study of the Wechsler Adult Intelligence Scale (WAIS) and the Wechsler Adult Intelligence Scale-Revised (WAIS-R) in a college population. (Doctoral dissertation, Biola University, 1981). Dissertation Abstracts International, 43, 886B.
Smith, R. S. (1983). A comparison of the Wechsler Adult Intelligence Scale and the Wechsler Adult Intelligence Scale-Revised in a college population. Journal of Consulting and Clinical Psychology, 51, 414-419.
Solly, D. C. (1977). Comparison of WISC and WISC-R scores of mentally retarded and gifted children. Journal of School Psychology, 15, 255-258.
Sundberg, N. D., & Gonzales, L. R. (1981). Cross-cultural and cross-ethnic assessment: Overview and issues. In P. McReynolds (Ed.), Advances in psychological assessment, Vol. 5. San Francisco: Jossey-Bass.
Swerdlik, M. E. (1977). The question of the comparability of the WISC and WISC-R: Review of the research and implications for school psychologists. Psychology in the Schools, 14, 260-270.
Terman, L. M. (1916). The measurement of intelligence. Boston: Houghton Mifflin.
Thomas, A., Hertzig, M. E., Dryman, I., & Fernandez, P. (1971). Examiner effect in IQ testing of Puerto Rican working-class children. American Journal of Orthopsychiatry, 41, 809-821.
Urbina, S. P., Golden, C. J., & Ariel, R. N. (1983). WAIS/WAIS-R: Initial comparisons. Clinical Neuropsychology, 4, 145-146.
Vance, H. B., & Engin, A. (1978). Analysis of cognitive abilities of black children's performance on the WISC-R. Journal of Clinical Psychology, 34, 452-456.
Walker, R. E., Hunt, W. A., & Schwartz, M. L. (1965). The difficulty of WAIS C scoring. Journal of Clinical Psychology, 21, 427-429.
Wechsler, D. (1939). Measurement of adult intelligence (1st ed.). Baltimore: Williams and Wilkins.
Wechsler, D. (1949). Manual for the Wechsler Intelligence Scale for Children. New York: The Psychological Corporation.
Wechsler, D. (1955). Manual for the Wechsler Adult Intelligence Scale. New York: The Psychological Corporation.
Wechsler, D. (1958). The measurement and appraisal of adult intelligence (4th ed.). Baltimore: Williams and Wilkins.
Wechsler, D. (1981). WAIS-R manual: Wechsler Adult Intelligence Scale-Revised. New York: The Psychological Corporation.
Weiner, S. G., & Kaufman, A. S. (1979). WISC-R vs WISC for black children suspected of learning or behavioral disorders. Journal of Learning Disabilities, 12, 100-104.
Williams, R. L. (1970). Black pride, academic relevance, and individual achievement. Counseling Psychologist, 2, 18-22.
Winter, G. D. (1968). Intelligence, interest, and personality characteristics of a selected group of students: A description and comparison of white and negro students in a vocational rehabilitation administration program in Bassick and Harding high schools, Bridgeport, Connecticut. Dissertation Abstracts, 28, 4920-4921.
Wysocki, B. A., & Wysocki, A. C. (1969). Cultural differences as reflected in Wechsler-Bellevue Intelligence (WBII) Test. Psychological Reports, 15, 95-101.
Yerkes, R. M. (Ed.). (1921). Psychological examining in the U. S. army. Memoirs of the National Academy of Science, 15, 890.
Zimmerman, I. L., & Woo-Sam, J. M. (1973). Clinical interpretation of the Wechsler Adult Intelligence Scale. New York: Grune and Stratton.
APPENDIX: CONSENT FORM
This research is being conducted by a graduate student in the Department of Psychology at Texas Tech University. In order to have your child participate in this study, it is necessary that you be informed as to the nature of the research. By signing this form, you give your consent for your son or daughter to participate in this study.
The purpose of this study is to compare the scores obtained on two intelligence tests, the Wechsler Adult Intelligence Scale (WAIS) and the Wechsler Adult Intelligence Scale-Revised (WAIS-R). Each student will be given both tests with a time interval of 30-45 days between administrations. Testing will be conducted at the high school with the cooperation of both the principal and the counselor. Each test should take approximately 90 minutes to administer, and no testing will be done at times when a student is taking a test in a class. The WAIS and the WAIS-R will be administered by graduate students in the Texas Tech Psychology Department who are proficient in the use of each test.
After all students have completed both tests, I will be happy to meet with you and discuss test results. However, it will not be possible to reveal IQ scores obtained on either test. This information will be used only for research purposes. Results of this study should be helpful in determining whether large differences exist between these tests.
Dr. James Clopton, who is in charge of this project, has agreed to answer any questions you may have concerning this research. He can be reached by calling 742-3703. Dr. Clopton has also informed me that you may contact the Texas Tech University Institutional Review Board for the Protection of Human Subjects by writing them in care of the Office of Research Services, Texas Tech University, Lubbock, Texas 79409, or by calling 748-3884.
If this project causes any physical injury to participants, treatment is not necessarily available at Texas Tech University or the Student Health Center, nor is there necessarily any insurance carried by the University or its personnel applicable to cover such injury. Financial compensation for any such injury must be provided through the participant's own insurance program. Further information may be obtained from Dr. J. Knox Jones, Jr., Vice President for Research and Graduate Studies, 742-2153, Room 118, Administration Building, Texas Tech University, Lubbock, Texas 79409.
I have read the above and give my consent for my son or daughter to participate.
Signature of Student Signature of Witness
Signature of Guardian