8/18/2019 thesis 7_16
1/142
INFORMATION TO USERS

This was produced from a copy of a document sent to us for microfilming. While the most advanced technological means to photograph and reproduce this document have been used, the quality is heavily dependent upon the quality of the material submitted.

The following explanation of techniques is provided to help you understand markings or notations which may appear on this reproduction.

1. The sign or target for pages apparently lacking from the document photographed is Missing Page(s). If it was possible to obtain the missing page(s) or section, they are spliced into the film along with adjacent pages. This may have necessitated cutting through an image and duplicating adjacent pages to assure you of complete continuity.

2. When an image on the film is obliterated with a round black mark it is an indication that the film inspector noticed either blurred copy because of movement during exposure, or duplicate copy. Unless we meant to delete copyrighted materials that should not have been filmed, you will find a good image of the page in the adjacent frame.

3. When a map, drawing or chart, etc., is part of the material being photographed the photographer has followed a definite method in sectioning the material. It is customary to begin filming at the upper left hand corner of a large sheet and to continue from left to right in equal sections with small overlaps. If necessary, sectioning is continued again, beginning below the first row and continuing on until complete.

4. For any illustrations that cannot be reproduced satisfactorily by xerography, photographic prints can be purchased at additional cost and tipped into your xerographic copy. Requests can be made to our Dissertations Customer Services Department.

5. Some pages in any document may have indistinct print. In all cases we have filmed the best available copy.

University Microfilms International
300 N. Zeeb Road, Ann Arbor, MI 48106
18 Bedford Row, London WC1R 4EJ, England
-
8/18/2019 thesis 7_16
2/142
8012402

OEHMKE, THERESA MARIA

THE DEVELOPMENT AND VALIDATION OF A TESTING INSTRUMENT TO MEASURE PROBLEM SOLVING SKILLS OF CHILDREN IN GRADES FIVE THROUGH EIGHT

The University of Iowa, Ph.D., 1979

University Microfilms International
300 N. Zeeb Road, Ann Arbor, MI 48106
18 Bedford Row, London WC1R 4EJ, England

Copyright 1979 by OEHMKE, THERESA MARIA
All Rights Reserved
THE DEVELOPMENT AND VALIDATION OF A
TESTING INSTRUMENT TO MEASURE PROBLEM SOLVING SKILLS
OF CHILDREN IN GRADES FIVE THROUGH EIGHT
by
Theresa Maria Oehmke
A thesis submitted in partial fulfillment of the
requirements for the degree of Doctor of Philosophy
in Education in the Graduate College of
The University of Iowa

December, 1979

Thesis supervisor: Professor Harold L. Schoen
Graduate College
The University of Iowa
Iowa City, Iowa
CERTIFICATE OF APPROVAL

PH.D. THESIS

This is to certify that the Ph.D. thesis of

Theresa Maria Oehmke

has been approved by the Examining Committee for the thesis requirement for the Doctor of Philosophy degree in Education at the December, 1979 graduation.

Thesis committee: Thesis supervisor
Member
Member
Member
DEDICATION
To Bob and Jim
ii
ACKNOWLEDGEMENT
For their help in the preparation of this thesis, I owe an expression of appreciation to a large number of persons, only a few of whom I shall mention by name.

To Professor Harold Schoen I extend my thanks for all the guidance, motivation, direction and assistance he gave me from the initiation to the completion of this study.

Special thanks are due George Immerzeel, Joan Duea, Earl Ockenga and John Tarr and the personnel at the Malcolm Price Laboratory School for their encouragement and support during several phases of this investigation.

A debt of gratitude to Professors H. D. Hoover and A. N. Hieronymus is acknowledged for their assistance on some of the statistical details of the testing procedure and for providing me with the opportunity to collect some of the pertinent data in this study.

I would like to express my appreciation to William M. Smith for the use of his expertise on the technical aspects of the use of the computer in data processing.

Thanks go to Ada Burns for her amazing ability to type what I thought I had written.

Finally, I would like to thank my husband Bob, and son Jim, and all my friends who were always ready with an encouraging word.
iii
TABLE OF CONTENTS

Page

LIST OF TABLES vi
LIST OF FIGURES vii

CHAPTER

I. INTRODUCTION 1
     Purpose 3
     Overview of the Study 3
     IPSP Test Development 4
     IPSP Test Validation 5
     Reliability 6
     Content Validity 8
     Concurrent Validity 8
     Discriminant Validity 9
     Operational Definitions Used in This Study 10
     Overview 11

II. REVIEW OF THE LITERATURE 12
     Models of Problem Solving 13
     Testing the Ability to Solve Problems 19
     Summary 22

III. DEVELOPMENT AND PROCEDURES 23
     Purpose of the Iowa Problem Solving Project 23
     Development of the IPSP Test 24
     Design and Development of Interview Procedures 28
     The Quantification Scale 30
     Development 30
     Use of the Scale 34
     The Pilot Interview Study 40
     Procedures 40
     Pilot Sample 41
     Interview Problems 42
     The Interviews 43
     The Final Interview Study 44
     ITBS and the IPSP Test 45
     IPSP Subtest Discrimination 49
iv
CHAPTER Page

IV. DATA AND RESULTS 50
     IPSP Test and Interview Data 50
     Pilot Interview Study 51
     Final Interview Study 53
     ITBS and the IPSP Test 55
     IPSP Test Administration 55
     ITBS Subtests 60
     Correlations Between IPSP Subtests and ITBS Subtests 60
     IPSP Subtest Discrimination 62

V. ANALYSES AND IMPLICATIONS 72
     Summary and Conclusions 72
     Phase 1: The Final Interview Study 73
     Phase 2: ITBS and the IPSP Test 73
     Phase 3: IPSP Subtest Discrimination 74
     Limitations 74
     Classroom Implications 76
     Implications for Research 81
     IPSP Test 82

APPENDIX A. DESCRIPTION OF THE IOWA PROBLEM SOLVING PROJECT 85
APPENDIX B. PHASES OF IPSP TEST DEVELOPMENT AND SUMMARY OF TEST FORMS 94
APPENDIX C. INTERVIEW PROBLEMS USED IN THE PILOT STUDY 98
APPENDIX D. IOWA TEST OF BASIC SKILLS RELIABILITY ANALYSIS NATIONAL REPRESENTATIVE SAMPLE 106
APPENDIX E. FORMULA FOR ITBS CORRELATIONS CORRECTED FOR ATTENUATION 108
APPENDIX F. ANALYSIS OF IPSP TEST RESULTS OCTOBER 1978 AND MARCH 1979 ADMINISTRATIONS 111
APPENDIX G. PILOT TEACHING STUDY 120

BIBLIOGRAPHY 125
v
LIST OF TABLES

Table Page

1. Phases of Validation of the IPSP Test 27
2. Analysis of IPSP Test with Interviews, Pilot Study 52
3. Analysis of IPSP Test with Interviews, Final Study 54
4. Reliability Analysis 56
5. Reliability Analysis 57
6. Reliability Analysis 58
7. Reliability Analysis 59
8. Correlations Between IPSP Subtests and Iowa Test of Basic Skills Tests 61
9. Reliability Analysis Sample of Iowa Students October 1978 68
10. Reliability Analysis Sample of Iowa Students March 1979 69
11. Correlations Corrected for Attenuation of October 1978 IPSP Subtests 70
12. Iowa Test of Basic Skills Reliability Analysis National Representative Sample 107
vi
LIST OF FIGURES

Figure Page

1. Steps and Component Skills of the IPSP Test Model 5
2. Interviewing Procedures 31
3. Quantification Scheme for the Components of the IPSP Test Model Step 1: Getting to Know the Problem 35
4. Quantification Scheme for the Components of the IPSP Test Model Step 3: Carrying Out the Plan 36
5. Quantification Scheme for the Components of the IPSP Test Model Step 4: Looking Back 37
6. Schedule of Events for Pilot Interview Study 42
7. Schedule of Events for Final Interview Study 46
8. Square of Correlations Corrected for Attenuation Between IPSP Steps and ITBS Subtests Form 563, Grade 5 63
9. Square of Correlations Corrected for Attenuation Between IPSP Steps and ITBS Subtests Form 563, Grade 6 64
10. Square of Correlations Corrected for Attenuation Between IPSP Steps and ITBS Subtests Form 783, Grade 7 65
11. Square of Correlations Corrected for Attenuation Between IPSP Steps and ITBS Subtests Form 582, Grade 8 66
12. Ann's IPSP Test Profile 77
13. Dave's IPSP Test Profile 78
14. Schedule for Pilot Study 122
vii
1
CHAPTER I
INTRODUCTION
The question of what processes are involved in problem solving, and more particularly in mathematical problem solving, is one that has been investigated for many years. As one surveys the literature, one not only becomes aware of the vast quantity of research that is being done but also of the many and diverse methods used to study the problem solving processes.

Studies range from simply observing individual students as they solve problems to factor analysis of paper-pencil measures of problem solving. A data-gathering method that has come into prominent use today is the structured one-to-one interview. In such cases, the researcher observes the behavior of the subject as he "thinks aloud" while solving selected problems. Typically the interviews are audio- or video-taped and a protocol or checklist of processes is completed for each student-problem pair. Other approaches include simulating the problem solving processes of a human being on a computer and using the results to build a problem solving theory. In addition, results and methods of researchers investigating the cognitive processes involved in general problem solving often have application for mathematical problem solving.

In an attempt to analyze and thereby further understand the problem
solving process, researchers have proposed various multistep models. One of the earliest was proposed by Wallas (1926). Based on his own experiences and his analyses of what others think they are doing when they solve problems, Wallas suggested a four-step model: preparation, incubation, illumination, and verification. Drawing upon many years of experience as a mathematician and a teacher, Polya (1957, 1962) proposed another four-step model: understand the problem, make a plan, carry out the plan, and look back at the complete solution. Restle and Davis (1962) suggested that the problem solver goes through a number of independent but sequential stages. The student solves a subproblem at each stage, thereby allowing him to go on to the next step. These and other multistep models have appeared in the literature with varying degrees of supporting empirical evidence.

If problem solving is a multistep process, the question then arises: would it be possible to measure a person's ability (or skill) at each of the steps? If there were a reliable testing instrument which measured this ability, of what value would it be to the classroom teacher? Would the teacher be able to use such a test to help plan for problem solving instruction? Of course, the one-to-one interview strategy is available to evaluate a child's problem solving processes, but the teacher does not have the time to conduct interviews with all the students in the classroom. In addition, the interview has been criticized for yielding results which are subjective and unreliable.

There is little in the literature concerning paper-pencil testing instruments which purport to measure problem solving processes. Many existing group administered tests contain problem solving or
3

application sections which consist of verbal problems. For example, the Iowa Test of Basic Skills (Houghton Mifflin Co., 19 l b) and the Metropolitan Achievement Tests (Harcourt, Brace and Jovanovich, 1971) contain subtests designed to measure problem solving ability. However, the information these tests provide, namely, the grade-level equivalent or percentile rank of the student and class in a larger group, is derived from the number of correct answers. No attempt is made to measure the process that was used to arrive at the solution or to identify specific skills which may be the source of the child's difficulties.

Purpose

As part of the Iowa Problem Solving Project (IPSP),* Schoen and Oehmke developed a multiple choice paper-pencil test designed to provide individual and class profiles which illustrate the performance of fifth through eighth graders at each of three steps of the problem solving process. A modified version of the four-step problem solving process model proposed by Polya served as the IPSP testing model.

The purpose of this study is the validation of this testing instrument, called the Iowa Problem Solving Project Test (IPSP test).

Overview of the Study

The steps in the validation process are listed below.

1. Compute estimates of the reliability coefficients for the IPSP test and its three subtests.

*A three-year project directed by George Immerzeel of the University of Northern Iowa and funded under ESEA, Title IV, C.
4

2. Judge the content validity of the IPSP test using the judgments of mathematics educators and educational measurement specialists.

3. Judge the concurrent validity by measuring the relationship between the IPSP subtest scores and a) the results of a think aloud measurement procedure, and b) the mathematics concepts, mathematics problem solving, reading comprehension, and graph skills subtest scores of the Iowa Test of Basic Skills.

4. Judge the discriminant validity of the IPSP test by analyzing a matrix of correlation coefficients, corrected for attenuation, between pairs of the three IPSP subtests and the four ITBS subtests.
IPSP Test Development

Several "non-standard" instruments have been developed to test problem solving ability (Wearne, 1976; Zalewski, 1975; Lucas, 1972). These will be described in Chapter two. However, the IPSP test appears to be the first attempt to measure skills in a multistep model. A multiple choice format was chosen in order to maximize the potential for broad, long-range impact on classroom practices. It was reasoned that a) standardized machine-scored tests presently have a great influence on curriculum and instruction, and b) in this format the IPSP test may have some influence on standardized testing. Furthermore, the machine scoring capability is likely to increase the number of potential users. After a good deal of "brainstorming" with members of the IPSP team, a search of the problem solving literature and a detailed analysis of each stage, the testing model as shown in Figure 1 was developed.
5

1. Get to Know the Problem
   A. Determine insufficient information
   B. Identify extraneous information
   C. Write a question for the problem setting

2. Choose What to Do from a List of Strategies

3. Do It
   A. Choose the necessary computation
   B. Estimate from a diagram
   C. Compute from a diagram
   D. Use a table
   E. Compute from an equation

4. Look Back
   A. Identify problems that can be solved in the same way as a given one
   B. Vary conditions in a given problem
   C. Check a solution with the conditions of the problem

Figure 1. Steps and Component Skills of the IPSP Test Model
Like standardized tests, the IPSP test can be efficiently administered to large groups of students and machine scored, with various norm data easily obtainable. In addition, subtest scores corresponding to steps 1, 3, and 4 can be obtained. After nearly two years of effort, no viable way to test skills in step 2 in a multiple-choice format was found.

IPSP Test Validation

As an innovative measurement instrument containing many items with untested structural characteristics and designed to measure constructs never before measured with a paper-pencil instrument, the IPSP test development called for much careful planning and formative evaluation.
6

Major questions concerned the validity of the test, if indeed a reliable test with reliable subtests could be constructed. By utilizing the Iowa Testing Program's tryout facilities it was possible to construct experimental test units, administer them to representative samples of Iowa fifth through eighth graders, and revise the units based on the item analyses and test data. Also, over 100 students were interviewed at various stages in the test development process as a concurrent check on the test validity. Over a three-year period the test evolved into its present form.

For the present study, estimates of reliability and of several types of validity of the final form of the IPSP test were obtained: (1) content validity, (2) concurrent validity, and (3) discriminant validity. A general description of the procedures for each of the above areas follows. A detailed description of the procedures and findings is in later chapters.
Reliability

The concept of reliability refers to an estimation of the degree of consistency of measurement. Theoretically, if a test is reliable, an individual should obtain the same, or almost the same, score on repeated administrations. Many factors may affect a student's observed score to make it different from the theoretically true score. These "errors" of measurement may be caused by the test itself, i.e., a particular sample of items may be a "good" or "bad" sample; by attributes of the person taking the test, i.e., motivation, attitude, health, test-wiseness; or by administrative conditions and procedures at the time of
7

the test, i.e., noise, distractions, poor lighting, or lack of uniformity in giving test instructions. The most difficult factors to control are the subject's attributes.

Reliability of a score for a testing sample is defined as the ratio of the variance of the true score to the variance of the observed score:

    r_tt = s_t^2 / s_x^2

where r_tt = reliability of the test, s_t^2 = variance of the true score, and s_x^2 = variance of the obtained score. Reliability is also referred to in the literature as a correlation between scores on the same test or parallel tests.
One of the most commonly used methods of estimating reliability is the Cronbach alpha. If the data are in dichotomous form, this estimate is equivalent to the Kuder-Richardson 20 coefficient. It may be calculated by the following formula:

    alpha = [k / (k - 1)] * [1 - (k * s_i^2) / s_x^2]

where s_x^2 is the variance of the sum over the k items and s_i^2 is the average item variance. An advantage of this coefficient is that it takes into account all ways a given test might be split and then gives a "best estimate" reliability. In this study, reliability coefficients were estimated using either the Cronbach alpha formula or a modified version of the KR-8, which is a formula in the series of 20 developed by Kuder and Richardson (1937).
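The alpha computation described above can be sketched in a few lines. This is a minimal illustration, not the thesis' actual analysis: the item-response matrix is hypothetical, and numpy is assumed to be available.

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha for an (examinees x items) score matrix.
    With dichotomous 0/1 items this equals the KR-20 coefficient."""
    x = np.asarray(item_scores, dtype=float)
    k = x.shape[1]                         # number of items
    item_vars = x.var(axis=0, ddof=1)      # variance of each item
    total_var = x.sum(axis=1).var(ddof=1)  # variance of the total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical right/wrong (1/0) responses: 6 examinees by 4 items
scores = [[1, 1, 1, 0],
          [1, 0, 1, 0],
          [0, 0, 1, 0],
          [1, 1, 1, 1],
          [0, 0, 0, 0],
          [1, 1, 0, 1]]
print(round(cronbach_alpha(scores), 3))
```

Because the items here are dichotomous, the same value would be obtained from the KR-20 formula.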
8
Content Validity

The basic content validity question is: Are the items in question a representative sample of the construct or subject matter domain to be measured? A familiar example of a need for high content validity is the typical classroom test. Here the individual's score on a sample of items from a content domain is used to infer the student's achievement level in the entire domain. Thus the assessment of content validity involves careful judgment of the degree to which the selection of items is appropriate and representative of the domain or construct to be tested.

Content validity is usually assessed by the judgment of experts, that is, subject matter and testing specialists. A limitation of this method is that there are no clearly specified techniques or standards for determining content validity.

The reaction of educational testing experts and mathematics educators to the IPSP testing model and sample test units was sought. The testing experts were two faculty members in Educational Measurement at The University of Iowa who also are authors of the Iowa Test of Basic Skills. The mathematics educators were the IPSP project team and advisory board members, both composed of mathematics teachers in grades five through graduate mathematics education.
Concurrent Validity

Concurrent validity refers to the degree to which a test correlates with an independent measure. It is the determining factor in deciding whether a test can replace procedures that are either more
elaborate or require special techniques, as in the case of the IPSP test and the one-to-one interview method. One way of demonstrating a test's concurrent validity is to compare the test scores with some independent measure which is presumed to measure the behavior in question.

In this study data from interviews and from the IPSP test were gathered concurrently. The interview measure consisted of presenting a series of problems to a child and asking him to think aloud as he attempted to solve each problem. The data were then obtained by coding and analyzing the solution strategies from observations, audio-tapes, and the subjects' written work. Thus one estimate of the concurrent validity of the IPSP test is the correlation between the score on each of the three IPSP subtests and corresponding data collected via the think aloud procedure.

Another estimate of concurrent validity was derived from the IPSP test's relationship to several standardized achievement measures. The criteria selected were four Iowa Test of Basic Skills subtests: mathematics concepts, mathematics problem solving, reading comprehension, and graph skills.
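Each of these validity estimates comes down to correlating two sets of paired scores; the standard choice for such data is the Pearson product-moment coefficient. A minimal sketch, with invented numbers standing in for the real subtest and interview data:

```python
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

# Hypothetical paired scores: an IPSP subtest vs. coded think-aloud data
ipsp_scores = [12, 9, 15, 7, 11, 14]
interview_scores = [8, 6, 9, 5, 7, 9]
print(round(pearson_r(ipsp_scores, interview_scores), 3))
```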
Discriminant Validity

Campbell and Fiske (1959) state that validation usually proceeds by convergent methods, i.e., independent measurement techniques designed to measure the same trait. However, for purposes of test interpretation, discriminant validity is also required; that is, not only should a test be highly correlated with other tests purporting to measure the same trait, but it should not correlate highly with tests that measure distinctly different traits. In the case of the IPSP test, the
10

discriminant validity issue refers to the degree to which the subtest scores differ from each other and from scores on other similar tests.

Discriminant validity in this study is approached through the use of matrices of correlations corrected for attenuation. Intercorrelated variables include steps 1, 3, and 4 of the IPSP test and the aforementioned subtests of the ITBS. In particular, the three IPSP subtests should not be highly correlated with each other nor with the ITBS subtests.
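The attenuation correction behind these matrices is presumably the classical Spearman formula: the observed correlation is divided by the square root of the product of the two measures' reliabilities. A small sketch with hypothetical values (the thesis' own formula and data appear in Appendix E and Chapter IV):

```python
from math import sqrt

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Classical disattenuation: estimate the correlation between true
    scores from an observed correlation and the two reliabilities."""
    return r_xy / sqrt(r_xx * r_yy)

# Hypothetical: observed correlation .45 between two subtests whose
# reliability estimates are .80 and .75
print(round(correct_for_attenuation(0.45, 0.80, 0.75), 3))
```

A corrected value near 1.0 would suggest two subtests measure essentially the same trait, which is exactly what the discriminant-validity analysis checks against.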
Operational Definitions Used in This Study

1. Problem solving
   To search consciously for some action appropriate to attaining a clearly conceived, but not immediately attainable aim. To solve a problem means to find such action (Polya, 1962).

2. Cognitive processes
   Actions of cognitive performance such as perceiving, remembering, thinking, and desiring that depend on the subject's performance capacities. These processes are not directly observable but are presumed to underlie a person's behavior when faced with a cognitive task.

3. Four-step problem solving test model
   For purposes of this study Figure 1 defines the model. Demonstration of the skills in each of steps 1, 3, and 4 is taken as evidence of a student's ability in that step.

4. Problem solving processes
   a) the cognitive processes presumed to underlie or guide the
11
subject's choices of solution strategies;
   b) the observable actions or operations the subject actually adopts to attempt to solve the problem. Also called problem solving solution strategies. It should be clear from the context which meaning applies.
Overview

Previous approaches to problem solving research and attempts to measure problem solving are discussed in Chapter II, Review of the Literature. Chapter III includes a discussion of the theoretical framework and design of the IPSP test, procedures used to develop the verbal problems, scoring methods for the think aloud interviews, and data gathering procedures. The results are discussed in Chapter IV. A summary of the validation results, implications for future research, and implications for the teaching of mathematical problem solving are discussed in Chapter V.
12
CHAPTER II
REVIEW OF THE LITERATURE
Since the purpose of this paper is the development and validation of an instrument to test mathematical problem solving, first, various models that mathematical educators, mathematicians, psychologists, and other researchers have used to study the problem solving processes will be discussed, and second, testing instruments developed by other researchers to measure problem solving processes will be summarized. The reader who is interested in a more detailed discussion of heuristics, strategies, task analysis, structures and other pertinent factors involved in the problem solving processes is referred to one or more of the following reviews.

A collection of papers edited by Kleinmuntz (1966) includes discussions of particular aspects of problem solving, research and theory. Davis (1966) gives a comprehensive survey of research and theory relative to traditional learning, cognitive and Gestalt approaches to problem solving, and computer and mathematical models of problem solving.

Kilpatrick (1967) presents an extensive discussion of the definition of heuristics. He describes in detail the notion of problem solving as a search process, the thinking aloud technique, and other methods used to study problem solving processes. He also develops a coding system for transcribing the problem solving protocol of subjects
13
from audio-tape to paper as they use the think aloud technique.

Hollander's (1973) review focuses on studies related to word problems for students in grades three through eight. The review includes studies carried out from 1922 to 1969 in seven categories: problem analysis, computation, general reading ability, specific reading skills, specificity of the problem statement, the problem situation, and language factors.

Webb (1975) notes that

   ...Because of the complexity of problem solving processes and the number of variables associated with problem solving, research in this area has been too diverse to have any real consolidation...(p. 1)

His review focuses on studies that involve problem solving tasks and strategies. These studies were conducted from 1967 to 1973 and involved students in grades three through eight.

Lucas (1972) discusses the nature of problem solving, the search mode used by information processors, some formal models of the problem solving process and some techniques used by earlier researchers to externalize thought processes.

Models of Problem Solving

The attempt to describe the thought processes used in mathematical problem solving is not a new quest. For many years mathematicians have sought to determine and understand the thought processes they use in discovering new mathematics. Some accounts discuss in great detail the individual thought processes used in formulating and proving mathematical conjectures.
14

The mathematician Henri Poincare (1914), in attempting "to see what happens in the very soul of the mathematician," gives an explicit account of his personal recollection of the discovery of the Fuchsian functions. After his discussion he states:

   ...As regards my other researches, the account I should give would be exactly similar, and the observations related by other mathematicians in the inquiry of L'Enseignement Mathematique* would only confirm them.
   One is at once struck by these appearances of sudden illumination, obvious indications of a long course of previous unconscious work. The role of this unconscious work in mathematical discovery seems to me indisputable, and we shall find traces of it in other cases where it is less evident...(p. 55).

"The appearances of sudden illumination" recounted by Poincare were also cited by Gauss in referring to a theorem on which he had worked for years. He states, "Like a sudden flash of lightning the riddle happened to be solved."

"Sudden illumination" is the third stage of the psychologist Wallas' (1926) model. In an attempt to analyze and thereby further understand the problem solving process, Wallas observed accounts of thought processes related to him by his students, colleagues and friends. It was the account of the German physicist Helmholz that inspired Wallas to describe three stages in the formation of new thought:

preparation: the first stage during which the problem is investigated and all the facts are gathered,

incubation: the second stage during which one rests from any conscious thought about the problem at hand and/or consciously

*A review which instituted an inquiry into the habits of mind and methods of work of mathematicians during the early 20th century.
15
thinks of another problem,

illumination: the third stage during which the idea and/or solution appears as a 'flash' or 'aha'.

Wallas added a fourth stage, verification, during which the validity of the idea is tested, which Helmholz did not describe but which Poincare vividly describes in his accounts.
Polya (1957, 1962) also developed a four-step model for problem solving. His principal aim in advocating a heuristic approach to problem solving was to enable the student to ask questions which focus on the essence of the problem. Drawing upon many years of experience as a mathematician and a teacher of mathematics, he describes the following four-step model:

(1) Understanding the problem: The student tries to understand the problem by looking at the data and asking the questions, Is it possible to satisfy the conditions of the problem?, Is there redundant or insufficient data?

(2) Devising a plan: The student tries to find a connection between the data and the unknown; the student should eventually choose a plan or strategy for the solution.

(3) Carrying out the plan: The student carries out the plan, checking each step along the way.

(4) Looking back: The student examines the solution by checking the results and/or arguments. The student also attempts to relate the method or result to other problems.

In a discussion of Polya's model, Mayer (1977) points out that some of Polya's ideas (restating the given and the goal) are examples of the
16
Gestalt idea of "restructuring." He concludes that:
...while Polya gives many excellent intuitions
about how the restructuring event occurs and how
to encourage it , the concept is still a vague one
that has not been experimentally we ll studied...
(p. 67)
The earlier mathematicians, as well as Wallas and Polya, based their accounts of the problem solving process on either introspection or retrospection. That is, the problem solvers reported on their thought processes as they worked (introspection) or they recalled these thought processes after they had completed the problem (retrospection). Kilpatrick (1967) finds difficulties with both of these methods. Of introspection he states, "but psychologists soon were raising questions about the nature and magnitude of the distortion introduced by requiring the subject to observe himself thinking." His criticism of retrospection refers to Bloom and Broder (1950), who stated that the difficulties with retrospection lie in remembering all the steps in one's thought processes, including errors and blind alleys, and in reproducing these steps without rearranging them into a more coherent, logical order.
It appears that Claparede (1917; 1934) was the first to use a third approach, the "think aloud" technique (Kilpatrick, 1967). This technique does not require the subjects to think and observe themselves thinking at the same time. The subjects are asked to vocalize their thought processes as they are thinking. Hence the subjects do not have to analyze their thought processes, nor are they required to have special training. There are, however, potential difficulties with the think aloud technique: interference of speech with thinking, the lapse into silence when
the subject is deeply engrossed in thought, and the essential difference between the verbalized solution and the one found silently. Kilpatrick (1967) summarizes the views of several authors (Rota, 1966; Brunk, Collister, Swift and Slayton, 1958; Gagne and Smith, 1962; Dansereau and Gugg, 1966) concerning these difficulties and concludes that:

...The method of thinking aloud has the special virtues of being both productive and easy to use. If the subject understands what is wanted—that he is not only to solve the problem but also to tell how he goes about finding a solution—and if the method is used with an awareness of its limitation, then one can obtain detailed information about thought processes... (p. 8).
One of the first attempts to systematically gather empirical evidence was by Duncker (1945), who studied the problem solving protocols of subjects who were given a problem and asked to "think aloud." Two of the problems that he used were the tumor problem:

...Given a human being with an inoperable stomach tumor, and rays which destroy organic tissues at sufficient intensity, by what procedure can one free him of the tumor by these rays and at the same time avoid destroying the healthy tissue which surrounds it?... (p. 1).
and the 13 problem:

...Why are all six-place numbers of the form 276,276; 591,591; 112,112 divisible by 13?... (p. 31).
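The arithmetic behind the 13 problem is worth making explicit: any six-place number of this form is its three-digit block times 1001, and 1001 = 7 × 11 × 13. A quick check of that observation (illustrative only; not part of Duncker's protocol):

```python
# A six-place number of the form abc,abc equals abc * 1001,
# and 1001 = 7 * 11 * 13, so every such number is divisible by 13.
def repeated_form(block: int) -> int:
    """Build e.g. 276276 from the three-digit block 276."""
    return block * 1000 + block

assert 7 * 11 * 13 == 1001
# Every number of this form is divisible by 13:
assert all(repeated_form(b) % 13 == 0 for b in range(100, 1000))
```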
Duncker illustrated a typical solution protocol for the tumor problem with a flow chart and observed that the problem solving process starts from a general solution, then progresses to a functional solution, and then to a specific solution.

In a more recent attempt to gather data empirically, Restle and Davis (1962) developed a model which describes the subject as going
through sequential stages when solving a problem. Each stage is a subproblem with its own subgoal. Thus, the individual solves a sequence of subproblems which then enables him to continue on to the next stage. The model states that the number of stages, k, for any given problem can be determined by the square of the average time to solution, t, divided by the square of the standard deviation, s, of the time to solution, or k = t²/s². They do not describe the stages, and assume that each is of equal difficulty.
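Under the Restle and Davis model, k can be estimated directly from a sample of solution times. A minimal sketch (the function name and the choice of the sample standard deviation are mine, not theirs):

```python
from statistics import mean, stdev

def estimated_stages(solution_times):
    """Restle and Davis (1962): k = t^2 / s^2, where t is the mean
    time to solution and s is its standard deviation."""
    t = mean(solution_times)
    s = stdev(solution_times)  # sample standard deviation
    return (t * t) / (s * s)
```

With solution times of 10, 12, 8, and 10 minutes, for instance, t = 10 and s² = 8/3, giving an estimate of 37.5 stages.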
Simon (1975) and his associates have also investigated thought processes used in problem solving. He used laboratory conditions to observe human beings working on well-structured problems that the subjects find difficult but not unsolvable. He states the following broad characteristics of the information processing system which he uses to describe human problem solving:

(1) serial processing, one process at a time;

(2) small short term memory capacity for only a few symbols; and

(3) infinite long term memory with fact retrieval but slow storage.

Simon further states that the solver always appears to search sequentially, adding small successive accretions to his store of information about the problem and its solution.
These models and other multistep models (e.g., Dewey, 1933; Johnson, 1955; Gagne, 1962; Guilford, 1967; Kilpatrick, 1969; Post, 1968; Webb, 1974, 1977; Lester, 1975) have appeared in the literature with varying degrees of supporting empirical evidence. Such models suggest a number of questions about teaching and measuring problem solving skills. If problem solving is a multistep
process, would it be possible to measure a person's ability at each step? Would this information be more useful to a teacher than just a single number-right, percentile, or grade level equivalent score? How can this type of evaluation be effected? Can paper-pencil tests be used, or is a one-to-one interview assessment necessary? A paper and pencil test can be a very accurate and efficient evaluation instrument, especially in the case of easily measured skills. However, the complex process of problem solving is more difficult to evaluate.
Testing the Ability to Solve Problems
In contrast to both the wealth and diversity of research in the problem solving process, there is a scarcity of instruments to measure these processes. Johnson (1961) cites the dearth of instruments to measure the problem solving process in the National Council of Teachers of Mathematics yearbook on Evaluation in Mathematics:

...The committee would have liked to include material on the appraisal of mental processes in the learning of mathematics. As teachers of mathematics, we are deeply concerned about developing skill in productive thinking. Too often, many of us find ourselves knowing little about the relations between the solutions given by our students and the thought processes that led to these solutions. However, tests for appraising higher mental processes such as concept formation, problem solving, and creative thinking in mathematics do not exist... (pp. 2-3).
Recently, efforts have been made to develop instruments that measure problem solving processes. Speedie, Treffinger, and Feldhusen (1973) summarized what they and other authors (Ray, 1955; John, 1954; Keisler, 1969) view as characteristics of a good problem solving process test. Three of these characteristics are:
1. The test should yield a variety of continuous measures concerning the outcomes of the problem solving, the processes, and the intellectual skills involved.

2. The test should demonstrate both reliability and validity.

3. The test should be practical for group administration.
Many standardized group-administered tests, such as the Iowa Tests of Basic Skills and the Metropolitan Achievement Tests, contain subtests designed to measure problem solving ability. This measurement is obtained from a single score, i.e., the number of correct answers. No attempt is made to decompose either the verbal problem itself, or the solution, into component skills so that one can identify the specific skill which may be the source of the student's difficulty.
Proudfit (1978) cites other limitations of standardized tests. She lists eight tests that were examined by Charles and Moses (Webb & Moses, 1977), who concluded that most of the items did not measure problem solving processes but emphasized application of either previously learned skills or algorithms. In addition, Proudfit examined other problem solving instruments, e.g., the Purdue Elementary Problem Solving Inventory, and concluded that they did not meet the previously stated criteria on at least one of the three counts.
Post (1968), using a four step model (recognition, analysis, production, verification), designed a problem solving test which is in a multiple-choice format. However, the scores on the test are measures of the end product and not of any particular step in his model. His content validation procedure used the judgment of a panel of experts, and his split-half reliabilities ranged from .60 to .82.
Foster (1972) developed a problem solving test which had a multiple-choice format for some items and an open-ended format for other items. Some of the items are designed to measure process more than product, but the test is not machine scorable.

Hiatt (1970) discusses the need for measuring mathematical problem solving processes. He designed a "creative problem solving" test in which points are awarded on the basis of approaches used, e.g., more points are awarded if the student does some computation mentally. Since this test is in an open-ended format, it would be difficult to administer to large groups.
Wearne (1977) developed a test of problem solving behavior which provides information about the child's mastery of the prerequisite mathematical concepts in the problems. Each item, called a super-item, is composed of three parts: a comprehension question, an application question, and a problem solving question. The comprehension question assesses the child's understanding of the problem setting. The application question assesses the child's understanding of the concept which is presumed to be a prerequisite to solving the problem, and finally, the problem is solved. However, it was difficult to substantiate the assertion that the comprehension and application items are assessing prerequisites of the problem solving items.
Proudfit (1978) has designed a problem solving process test in which the child is given three problems, asked to select one, solve it, and then answer a list of 12 questions before going on to the next problem. The questions refer to the child's solution processes. Two of the questions are open-ended and the other 10 are in a multiple-choice
format. The test was administered to 100 fifth grade students but no formal reliability or validity data were reported.

Zalewski (1974) developed a set of verbal problems which he administered to a group of seventh graders. His purpose was to predict the results of problem solving process assessments, using the think aloud procedure and coding scheme, from his paper and pencil test. For the interview test he used Lucas' point system, which gave 1 point for 'Approach', 2 points for 'Plan', and 2 points for 'Result'. The written test was scored using the number of correct answers. The correlation between the rankings on the written tests and the interviews was .68, below the criterion Zalewski had set prior to his study.
Summary

Many researchers, including psychologists, mathematicians and educators, have investigated the problem solving processes using multistep models. Their consensus is that the multistep model is a valid model for the investigation of the problem solving processes. In addition, not only is there a scarcity of instruments to evaluate these models, but also a scarcity of instruments to measure problem solving processes in a machine scorable format. The IPSP test is an instrument developed to measure problem solving skills in a machine scorable format. The validation of this test is the purpose of this study.
CHAPTER III
DEVELOPMENT AND PROCEDURES
Purpose of the Iowa Problem Solving Project

The Iowa Problem Solving Project (IPSP) is a three year project directed by George Immerzeel of the University of Northern Iowa and funded under ESEA, Title IV-C. Its primary purpose is to develop, evaluate, and disseminate materials to improve the mathematical problem solving abilities of students in grades five through eight.
The approach advocated by the IPSP staff involves both the teaching of specific solution strategies and the solving of many interesting verbal problems with increasing difficulty levels. Eight instructional modules have been developed to help the student build a variety of skills and strategies necessary to successfully solve verbal problems. Each module consists of a booklet and a card deck. The booklet provides instructional activities aimed at developing a particular skill, while the 100-problem card deck provides practice in solving problems of varying difficulty levels which are especially designed for that skill. Two IPSP members (Schoen and Oehmke) were assigned the task of developing a measurement instrument based on the IPSP problem solving model. A more complete description of the IPSP proposal, modules, and teaching strategies is given in Appendix A.
Development of the IPSP Test
After several meetings with the IPSP team and advisory board to discuss the purpose, goals and philosophy of the IPSP test, Schoen and Oehmke began the task of developing a problem solving process test. The testing model which was used, after several revisions, is given in Chapter I (Figure 1). It should be noted that while there were frequent consultations among the staff, the IPSP test and the IPSP modules were designed and developed independently; the modules were developed at the University of Northern Iowa and the testing instrument was developed at The University of Iowa.
Three of the more important constraints that were imposed on the testing instruments were: (1) the format should be multiple-choice so that the test can be machine scored, (2) the items should measure problem solving subskills and not just the ability to get a final answer, and (3) the test should be based on the IPSP testing model as developed by the IPSP team.
A search of the literature was completed to locate instruments which measure problem solving processes, and any items which were found were classified according to the IPSP testing model. Instruments were found which were in an open-ended or multiple-choice format or some combination of both (Kilpatrick, 1967; Post, 1968; Hiatt, 1970; Foster, 1972; Hollander, 1973; James, 1973; Zalewski, 1974; Wearne, 1976; Krutetskii, 1976). In many cases it was difficult to classify the items according to the specific subskills of the IPSP testing model. Those items that seemed to be most amenable to revision were selected and rewritten to conform to a specific step of the model. In addition, many
items were written to test subskills in each category. The objective was to build a large item bank to be used during the formative period. Thus, valid items which satisfied item analysis and reliability criteria in tryouts could be selected for inclusion in the IPSP test.

A first draft of the IPSP test was examined by two authors of the Iowa Tests of Basic Skills (ITBS) at The University of Iowa, by the IPSP team, and by the IPSP Advisory Board. The consensus was that the items did measure the subskills in the IPSP testing model. However, there was also a consensus that the items were "too wordy" and tended to be "cute." These suggestions, as well as those of several doctoral students in mathematics education, were incorporated into several revisions of the first draft.
Each revised, open-ended item was then typed on a 3" x 5" card and presented to twenty-nine students in individual interviews. These students were in grades five through eight and were selected by their teachers as representing a cross section of each of their classes. At the beginning of each interview the student was told that these problems might be different from any they had seen before, that s/he was not being tested, but that his/her thoughts and suggestions as s/he was working the problems were needed to improve the test items. The student read each problem aloud, then talked aloud while solving the problem. Paper and pencil were available if the student chose to use them.

The primary purpose of these interviews was to get the reaction of the target audience to the items and to gather ideas for distractors. It was the interviewers' intent to be passive but still attempt to elicit as much information as possible from the students. Generally,
the students were extremely cooperative. Two of the brightest eighth graders not only solved the "hardest" problems immediately, but analyzed the items in great detail. Their comments were very informative, and some of their suggestions were used as foils for the next revision. The interviews were recorded on audio tape and analyzed according to reading difficulties, concept difficulties, strategies used, computational errors, and answers given. Using information obtained from these interviews, the first draft of the IPSP test was revised.
Experimental units of multiple-choice items were developed and administered at three different times, with revisions following each tryout. The last tryout prior to the project evaluation was administered to representative samples of Iowa students in grades five through eight.
Appendix B contains a more complete description of the phases in the development of the IPSP test along with a summary of formative data. The study reported here focuses on the validation, rather than the development, of the IPSP test. This validation was completed in three main phases: determining the relationship between the IPSP test and data gathered from individual interviews, determining the relationship between the IPSP subtests and several ITBS scales, and determining the relationships between the IPSP subtests. The first phase is an assessment of concurrent validity, and the last two are forms of discriminant validation using the Campbell and Fiske (1959) terminology. During the validation several different forms of the IPSP test were used. These are numbered and described in Table 1.

Phase one of the validation process was considered to be the most important and the least straightforward. Thus, most of the effort, as
Table 1

Phases of Validation of the IPSP Test

Test Date and Sample Tested        Form   Grade Level   Items: Total (Subtest)   Phase of Validation

December, 1977                     561    5,6           40(12,4,12,12)           Pilot Interview Study; Pretest
Malcolm Price Lab School           562    5,6           40(12,4,12,12)           Pilot Interview Study; Posttest
                                   781    7,8           40(12,4,12,12)           Pilot Interview Study; Pretest
                                   782    7,8           40(12,4,12,12)           Pilot Interview Study; Posttest

January, 1978                      563    5,6           20(6,2,6,6)              Final Tryout Series
Representative Sample of           564    5,6           20(6,2,6,5)              Final Tryout Series
Iowa Students                      581    5,6,7,8       20(6,2,6,6)              Final Tryout Series
                                   582    5,6,7,8       20(6,2,5,7)              Final Tryout Series
                                   783    7,8           20(6,2,6,6)              Final Tryout Series
                                   784    7,8           20(4,2,7,7)              Final Tryout Series

May, 1978                          561                  40(12,4,12,12)           Final Interview Study
Two Fifth Grade Classes from an
Iowa City Elementary School

October, 1978                      565    5,6           30(10,0,10,10)           Summative Evaluation of
One Hundred Fifth and Sixth        785    7,8           30(10,0,10,10)           IPSP Project: Pretest
Grade Classes from Iowa Schools
well as the emphasis in this report, was placed on the relationship between the interview data and IPSP test scores.
Design and Development of Interview Procedures
A preliminary list of interviewing procedures was developed using the experiences that were gained during the first round of interviews described in the previous section. The purpose of the later interviews was to discover the strategies the child uses to solve verbal problems. Hence it was decided that the interviewer should not lead the child into selecting a particular heuristic. Any questions asked by the interviewer to elicit more information should not lead the child. There were also instances of nonverbal leading that occurred during the trial interviews. For example, a child unsure of which operation to use would say, "I think I should multiply?" and then look at the interviewer's face to get some sort of reaction. Whether the child in fact multiplied or performed some other operation depended on the reaction of the interviewer. Consequently, a major reason for developing a set of written interview procedures was to minimize the interviewer's influence on the student's problem solving processes.
A first draft of the procedure was brief and contained such instructions to the interviewer as: Encourage the student to vocalize his thinking as much as possible during the sample warm-up time; Try to record something on tape; Don't go any longer than 15 seconds without recording something on tape. After the list was developed it was tested with two volunteer fifth and sixth graders. Revisions were made, and these were tried with two volunteer students, a sixth and a
seventh grader. In addition, two doctoral students in mathematics education tried the procedure with several of their students. Some of the suggestions that were incorporated into a revised edition were: Supply a pencil with the eraser cut off to prevent erasures; Rather than giving the student one or more sheets of paper for all the problems, put one problem on each sheet so that problem and computations are together; Tell the students that you will not let them know whether their solution is correct.
A difficulty that occurred with some frequency was that of the student becoming silent while using paper and pencil, either to do arithmetic computations or while apparently thinking. If the interviewer interrupted and asked the child to tell what s/he was thinking, the reply would be, "I'll tell you as soon as I'm finished," or "Just a minute, I'm thinking." This difficulty was handled in several ways. If the child was in deep thought working on the problem, the interviewer would ask a question about the overt observable behavior of the child, e.g., "Are you doing some multiplication now?" or "Are you adding now?", when it was obvious that the child was doing that particular computation using paper and pencil. The child usually mumbled "yes" or shook his or her head and went on with the arithmetic. This strategy was used as an indicator on the tape to let the coder know what the student was doing during that time.
A second strategy that worked very well was to make a comment similar to the following: "That's fine, you're telling me what you're thinking"; "You're doing fine, Sue, you're telling me what I want to know"; "Can you put into words what you're thinking?" Such statements
seemed to encourage the child to vocalize and give more details, yet did not appear to lead the students into using a specific strategy. The final form of the list of interview procedures was the result of five tryouts and revisions. Figure 2 shows the final form used in this validation study.
The Quantification Scale
Development
Since one of the goals of this study was to assess the relationship between the IPSP test and the results of the think aloud interviews, a quantification code based on the IPSP testing model was needed to process the interview findings. Some related research was found.
Kilpatrick (1967) developed a coding scheme to analyze the protocols used by his subjects in a think aloud interview, but did not attempt to quantify these protocols. Lucas (1972) used a modification of Kilpatrick's coding scheme with calculus students. His five-point scoring code is based on three categories: Approach (one point), Plan (two points), and Result (two points). The "Approach" phase represented the subject's understanding of the problem, the "Plan" phase represented the subject's attempt to find a path to obtain the answer, and the "Result" phase was the subject's final answer.
Zalewski (1974) investigated the relationship between a paper-pencil test and interview results. He essentially followed the procedures established by Kilpatrick and Lucas, with some modifications since his subjects were seventh graders. The process score obtained by using Lucas' scoring procedure provided a basis for ranking the subjects.
Interviewing Procedures

1. Problems should be typed one on a page, preferably placed at the side of the page so that the student can use the rest of the page for any computing, drawing diagrams, tables, or any type of thinking.

2. Start the interview with 2 sample problems, thus allowing the student to become familiar with the routine and with the type of information the interviewer would like to find. At all times make a conscious effort to put the child at ease.

3. Tell the student that no information will be given on whether the answer or strategies are correct, since you want to get the best possible data.

4. Do encourage the students to go on by making comments such as, "You're doing just fine." "That's good, you're telling me what you're thinking." "Go ahead." BUT DO NOT LEAD THE STUDENT INTO USING A STRATEGY.

5. Don't go any longer than about 15-20 seconds without recording something on tape. EXCEPTION: If the student is doing computations or drawing a diagram or making tables, etc., make some sort of statement such as, "You're making a table," etc.

6. Encourage the student to vocalize his thinking as much as possible.

7. If a student falls silent while writing or drawing, prompt him by reading what he has written or ask him what he is doing. However, rule 5 takes precedence over rule 7.

8a. If a student doesn't answer, or doesn't make any comments about his thinking, wait about 15 seconds and ask, "Can you tell me what you are thinking?" Wait another 10 seconds or so and ask again, this time, "Are you trying to figure something out?" If nothing happens, call this IMPASSE. Now ask the question, "Would you like a hint or another problem?"

8b. If the child says yes, this would indicate to you that the rating part of the data gathering is over, but continue to get diagnostic data. This can be done by asking the student to identify the area that presented the trouble and why he had this trouble, e.g., didn't know method, lack of understanding of the problem, read problem incorrectly, etc.
Figure 2
Figure 2 (cont'd.)

8c. If the student says no, then allow more time and ask him if he would tell you what he is thinking or what method he is trying, or whether he would try to do his figuring on paper. Then repeat steps 8a, 8b and 8c again.

9. If the student is not trying to solve the problem, get him on the right track, but only after IMPASSE.

10. For the first half of the problems, observe the student. Does he have a habit of LOOKING BACK? If not, follow step 11.

11. If the student does NOT have the habit of LOOKING BACK, and has already been given the first half of the problems, then lead him on with prompts listed on the LOOKING BACK coding sheet, e.g., "Did you check your answer with the conditions of the problem?" "Did you check your answer?" "How sure are you that your answer is correct?"

DON'T

1. Do not allow the child to erase. Instruct him to make a line through the mistake.

2. Do not give any tutoring or prompting until after the IMPASSE, and then only if the child asks a question. However, do use the procedure listed in step 8a.

3. Do not summarize what the child has done. Try to get him/her to do it.

4. Do not tell the student whether he is on the right track, or whether his answer is correct.

5. Do not tell the student that you are going to use the strategies listed in steps 8a, 8b and 8c.
Written tests were administered to these same subjects, who were then ranked according to the number of correct answers. The correlation coefficient between the written tests and interviews was .68. Zalewski concluded that a higher correlation is necessary before the written test scores can be used as a substitute or predictor for interview results.
Webb (1975) also used an adaptation of the coding system developed by Kilpatrick and Lucas. He used the "Approach," "Plan," and "Result" scoring system and obtained a frequency count from a checklist of problem solving process variables.
From the preceding discussion it appeared that no 3-step quantification scheme was available to investigate the relationships between interviews and IPSP test results. A first attempt at developing the scale was made using Kilpatrick's processing sequence with some modifications in order to follow the IPSP testing model. In trying to quantify these processing sequences the procedures became very cumbersome. A new attempt was made in which flow charts were designed for each step of the model. Again, when it came time to assign a number at the various branches, the instrument became unmanageable. Another attempt was made in which behavior in each step of the testing model was assigned three numbers: 0, 1, and 2, which were to serve as categories. In the 0 category would be those processes which were totally incorrect or a response such as "I don't know what to do." The 2 category would contain responses which were completely correct, and the 1 category would contain the intermediate responses. This new procedure was used with the audio tapes from the first interviews. It became immediately apparent
that at least one more category was needed and that the categories were not explicit enough for each step of the model.

These revisions were made, and the resulting instrument now had four categories: 0, 1, 2, 3, and more explicit descriptors under each category. This new instrument was used to process additional tapes and further revisions were made. At this stage the instrument was examined by the same two mathematics educators who were consulted on the interview form. Each category and its descriptors were thoroughly discussed. Step 4, the looking back step, presented the greatest difficulty. If the subject gives an answer, appears to be mulling it over, goes back to the problem, reads it again, and then gives another answer, is this checking the answer or trying to understand the problem? It was decided to include this process under step 1 and be more explicit with the descriptors under step 4.
After general agreement on the appropriateness of the scale was reached, three raters quantified audio tapes of interviews using this form. After a few minor additions the instrument was considered to be in "final" form. As a final test, each rater analyzed the same three interviews on audio tape.
Use of the Scale
The final form of the quantification scale is given in Figures 3, 4, and 5. Behavior which involved reading, analyzing and understanding the problem was classified as step 1 behavior. Briefly, a score of 0 was assigned to a student who failed completely to understand a problem; 1 was assigned to a student whose analysis of the problem was incorrect
Quantification Scheme for the Components of the IPSP Test Model

Step 1: Getting to Know the Problem

Score 0:
- Says he doesn't understand the problem and makes no attempt at solution.
- Tries to solve the problem unaware that there is insufficient information and never starts correct strategy.
- Fails to use data correctly in attempts at solution, e.g., uses all extraneous data to arrive at solution.
- Immediately tries to do some arithmetic operations using all the numbers in the problem without regard to a correct strategy.

Score 1:
- Makes a false start (recognizes it as such) but can't arrive at a correct strategy.
- Says he doesn't understand problem—rereads it—tries to make a start but is unsuccessful (includes rephrasing, trying to understand what is unknown, what is given, or searching for a path).
- Reads problem, knows there are "too many numbers" but can't organize proper data (extraneous data).
- Makes true statements about extraneous data but does not advance solution.
- Rereads problem—appears to know there is something missing but cannot state what is wrong and makes a false start (missing data).
- Tries to solve the problem without regard to using data correctly. After a brief trial and error realizes he is not using data correctly but cannot correct the situation.
- Tries to summarize data or repeat it in a different form but does not find correct strategy.

Score 2:
- Makes a false start—but eventually arrives at a correct strategy (includes trial and error).
- Tries to solve the problem and unconsciously makes up his own missing data but does not state that there is insufficient data. His solution strategy is correct for the data he provides.
- Tries to summarize data or repeat it in a different form which starts him out on the correct strategy but later he gets off the track.
- States there is no solution because of insufficient data and attempts to modify conditions but is unsuccessful.
- Uses correct strategy but does not use data in its proper form (e.g., neglects units).
- Solves some of the cases involved in the problem but fails to consider all solutions.

Score 3:
- Any correct attempt to understand the problem by reading or rephrasing, i.e., trying to understand what is unknown, what is given, or searching for a path.
- Rereads the problem to assist in drawing figures, tables, equations, performing a check or introducing symbols. (Must be apparent that this is going to aid in understanding the problem and finding a correct strategy.)
- States a plan for an intermediate or final goal which is a correct strategy.
- Carries out exploratory manipulations which lead to correct solution.
- States problem can't be worked and tells what is needed to work it (modifies problem). States reason why it can't be worked (insufficient data).
- States what data is not needed in solution of problem while stating correct strategy (extraneous data).
- States the conditions and constraints of the problem correctly.
- Immediately starts out to work the problem and succeeds.

Figure 3
Step 3: Carrying Out the Plan
Quantification Scheme for the Components of the IPSP Test Model

Score 0:
- Any manipulations or computations that are done are incorrect.
- Strategy is set up correctly but is not able to carry it out, e.g., cannot solve an equation.
- Tries to use a diagram, figure, or table but does the computation incorrectly.
- Suggests a plan but cannot carry it out.

Score 1:
- Does less than half the number of necessary computations correctly.
- Sets up equation but cannot solve it completely. Does simple operations like addition and subtraction.
- Uses successive approximation (systematic trial and error) and does the first step correctly but cannot carry it to the end.

Score 2:
- Does half or more of the necessary computations correctly.
- Sets up equation but cannot solve it completely in that s/he makes errors on the harder operations, e.g., multiplication, division, clearing fractions, etc.
- Makes a mistake in copying correct number but carries out computations correctly.
- Makes an incorrect diagram, figure, or table but uses the numbers in the computation correctly.
- Uses successive approximation and does the first few steps correctly but bombs on the computations involved in the final step.
- Does computation correctly but uses units incorrectly.

Score 3:
- Sets up problem correctly and carries out actual computations correctly.
- Sets up problem incorrectly, but all computation is done correctly.
- Uses the algorithm or equation correctly, e.g., manipulates all parts of the equation correctly.
- When using successive approximations (trial and error), uses information from previous trial correctly, i.e., computes all these values correctly.
- Starts to execute plans, makes a mistake (computationally) but finds errors and corrects them.

Figure 4
Step 4: Looking Back
Quantification Scheme for the Components of the IPSP Test Model

Score 0:
- Makes no attempt to check answer or conditions of problem.
- Says, "It's probably wrong," and makes no attempt to check the answer.
- Says s/he doesn't know and makes no attempt to correct it.

Score 1:
- Expresses uncertainty about answer.
- Says it's probably wrong (or some version) and attempts to give a reason for his/her uncertainty.
- Makes an attempt to check the answer but is not successful enough to be convinced that it is right or wrong.
- Checks computations involved in answer but does not check to see if answer satisfies condition of problem. Errors here should be major.

Score 2:
- Makes some attempt to check answer or decide whether it is correct—but eventually gives up.
- Makes an attempt to check the answer by various methods (i.e., retraces steps, checks condition of problem, substitutes answer but cannot carry out check completely).
- Makes an attempt to check the answer by various methods (i.e., retraces steps, checks condition of problem, substitutes answer but fails to detect the incorrect answer). Errors here should be minor.

Score 3:
- Attempts to check the values of an unknown or the validity of an argument.
- Tries to decide whether the answer makes sense (i.e., realistic, reasonable estimates).
- Checks that all pertinent data has been used.
- Suggests a new problem that can be solved in the same way.
- Successfully attempts to simplify the problem.
- Checks solution by retracing steps or substitution.
- Checks that solution satisfies conditions of problem.

Figure 5
but who understood some of its elements; 2 was assigned to a student
whose analysis was correct except for a minor error such as reading data
incorrectly; 3 was assigned to an entirely correct understanding of the
problem which led to a valid solution strategy. A crucial point is that
the step 1 score was not affected by errors in the application of a
solution strategy, once that strategy was chosen. Errors in the application of a chosen strategy were reflected in the step 3 score.
Step 4 behavior consisted of student moves after a tentative solution was reached. Many students stopped as soon as they had an answer
and were given a score of 0 for step 4. Briefly, a score of 1 was
assigned if some uncertainty was expressed but no systematic check was
made; 2 was assigned if a check was attempted but was either incorrect
or incomplete; 3 was assigned if a valid check of the computation, conditions and/or reasonableness of the solution was carried out. Again,
specific criteria were described for each numerical score, but an important point is that the step 4 score was not affected by any behavior
preceding a tentative solution. An exception was that students were
assigned 0 on step 4 if no tentative solution was reached.
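The 0–3 rubric for each scored step can be summarized as a simple lookup table. The sketch below is purely illustrative; the dictionary keys, function name, and criterion summaries are my own condensation of Figures 3–5 and the surrounding prose, not part of the IPSP instrument itself.

```python
# Illustrative encoding of the 0-3 interview scoring rubric described above.
# The step names and one-line criteria are condensed from Figures 3-5;
# this data structure is hypothetical, not part of the original instrument.
RUBRIC = {
    "step1_understanding": {
        0: "fails completely to understand the problem",
        1: "analysis incorrect, but understands some elements",
        2: "correct except for a minor error (e.g., misread data)",
        3: "entirely correct understanding leading to a valid strategy",
    },
    "step3_execution": {
        0: "all manipulations or computations incorrect",
        1: "less than half of necessary computations correct",
        2: "half or more correct, or only minor slips (e.g., units)",
        3: "sets up and carries out computations correctly",
    },
    "step4_looking_back": {
        0: "no tentative solution reached, or no attempt to check it",
        1: "expresses uncertainty but makes no systematic check",
        2: "attempts a check, but it is incorrect or incomplete",
        3: "valid check of computation, conditions, or reasonableness",
    },
}

def describe(step: str, score: int) -> str:
    """Return the rubric description for a given step and 0-3 score."""
    return RUBRIC[step][score]

print(describe("step1_understanding", 0))
```

A rater working from the audio tapes would, in effect, select one descriptor per step for each problem; note that step 2 (devising a plan) is folded into the step 1 score in this scheme.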
The following two examples will illustrate the scoring scheme.
Ann, a sixth grader, was presented with this problem:
A bag of XL-50 brand marbles contains 25 marbles and costs
19¢. How much will 125 marbles cost?
Ann read the problem aloud and this is the transcribed interview:
A: Uh . . . Oh boy . . . hm . . .
I: What are you doing now?
A: I'm trying to figure out how I'll do this. Either add or multiply
. . . O.K. I'm going to multiply 125 marbles by 19¢. (multiplies)
It comes out $11.25. That's not right.
I: What are you trying to find?
A: I'm trying to get the right answer.
I: But what answer?
A: What I should do with 25, 19, and 125, because I know with those
numbers I have to do something.
(Silence)
(Rereads the problem)
(Silence)
A: I want to see if I multiplied wrong . . .
(Remultiplies but is still stumped)
Ann exhibited a behavior which occurred very frequently in student
interviews. She tried to do some arithmetic operations using all the
numbers in the problem. She lacked, or at least failed to use, analytic
skills, essentially step 1 behavior. However, her computational skills
and ability to use tables and diagrams, as illustrated in other problems, were good. On this problem, she was given a score of 0 for step
1; 3 for step 3; 0 for step 4. If she had made a computational error or
misused an equation, she would have received a 0, 1, or 2 on step 3. A
similar pattern emerged in her solution to other problems.
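For reference, the marble problem reduces to a unit-rate computation. The sketch below is my own illustration of the strategy Ann never found (it assumes the 19¢-per-bag reading of the problem), not part of the interview materials:

```python
# The marble problem: a bag of 25 marbles costs 19 cents;
# find the cost of 125 marbles.
# Correct strategy: 125 marbles is 125 / 25 = 5 bags,
# so the cost is 5 * 19 cents.
bag_size = 25         # marbles per bag
bag_price = 19        # cents per bag
wanted = 125          # marbles needed

bags_needed = wanted // bag_size       # 5 bags
cost_cents = bags_needed * bag_price   # 5 * 19 = 95 cents
print(cost_cents)                      # 95
```

Ann instead multiplied together the numbers she saw, which is exactly the score-0 descriptor "immediately tries to do some arithmetic operations using all the numbers in the problem without regard to a correct strategy."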
Dave, a fifth grader, is an example of a student who was able to
understand most of the problem settings presented to him, but had difficulty carrying out his solution strategies. This is illustrated with
the following example of a single-step problem:
Mr. Price earned $75 in each of 8 weeks. How much did he
earn for all 8 weeks?
D: O.K. 75, 75, 75, . . . (adds eight 75's)
O.K. 1, 2, 3, . . .
I: So you wrote eight 75's down, right?
D: Yes, O.K., that'd be . . . eight 5's would be 40. It'd be 0 and 4
on top. And eight 7's would be . . . O.K. let's see . . . Hmm.
(Writes them down and adds them)
D: It'd be $5.60.
I: O.K. that's your answer then?
D: Yes.
Dave chose to add eight 75's, which was a correct strategy. However, he had difficulty in finding the sum. To make the computation
easier, he correctly noted that 8 sevens is the same as 4 fourteens. He
was scored a 3 on step 1, and 2 on step 3 on this problem. His score on
step 4 was 0 since he did not exhibit any behavior in that category.
Later, with prompting, Dave realized that he had left out the 8 fives,
and he corrected himself.
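Dave's column-wise addition can be checked directly. The sketch below is my own illustration of the computation he attempted:

```python
# Mr. Price's problem: $75 earned in each of 8 weeks.
# Dave's strategy: add eight 75's, column by column.
weeks = 8
weekly = 75            # dollars per week

ones_column = weeks * 5    # eight 5's -> 40: write 0, carry 4
tens_column = weeks * 7    # eight 7's -> 56 (in tens)
total = tens_column * 10 + ones_column   # 560 + 40 = 600
print(total)                             # 600, i.e., $600
```

Dropping the eight 5's, as Dave initially did, leaves 560, consistent with the incorrect total he reported before being prompted.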
The Pilot Interview Study
Procedures
After a year of development, a pilot study was designed in which
data from interviews and the IPSP test in their developing forms were
gathered from the same sample of students in Fall, 1977. The year of
development included a trial run in January, 1977, and one in March,
1977, in which four 20-item forms of the IPSP test for each run were
administered to students in grades five through eight. The form of the
IPSP test that was used in this pilot study was a revised one based on
the data obtained from those trial runs.
Preparations for the pilot study included briefing the classroom
teachers and administrators of the school, setting up a schedule for
interviewing the students, and working out the logistics of the schedule. A final session was held which was attended by the IPSP staff, the
classroom teachers who would be involved in the study, the head of the
school's mathematics department, the principal, and involved counselors.
The form of the IPSP test that would be used was presented, and the team
discussed the purpose of the study and answered questions raised by the
school personnel.
Pilot Sample
All of the students in grades five through eight in the Malcolm
Price Laboratory School, Cedar Falls, Iowa, were involved in this pilot
study. The students were randomly divided into two groups across grade
levels. Group one consisted of 99 students in grades five through
eight. Group two consisted of 103 students from the same grade levels.
However, within the groups, the fifth and sixth graders were administered one form of the test while the seventh and eighth graders completed
another form of the test. Concurrently with the above groupings, each
teacher was asked to divide each of their classes into an upper ability
and lower ability half and to select one "verbal" student from each
half. This resulted in the selection of 32 students, four from each
grade, to be involved in think aloud interviews. All of the interviews
were conducted by the investigator and followed the interview form discussed in the previous section. The interview tapes were then coded
according to the quantification code previously described. Students in
group two completed the IPSP test after the interviews were completed.
Correlation coefficients between interview and IPSP test scores were
then computed. Figure 6 shows the time schedule for the study.
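The correlation coefficients referred to here are, presumably, ordinary Pearson product-moment correlations between each student's interview scores and IPSP test scores. A minimal sketch follows; the score lists are invented placeholders, not data from the study:

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two score lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical interview totals and IPSP test scores for five students:
interview = [3, 5, 2, 6, 4]
ipsp_test = [10, 14, 8, 15, 12]
print(round(pearson_r(interview, ipsp_test), 3))  # 0.994
```

A high positive coefficient would support using the paper-and-pencil IPSP test as a proxy for the much more labor-intensive think-aloud interviews.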
Schedule of Events for Pilot Interview Study

Date    Activity                                           Responsibility
11/28   paper and pencil test for group 1                  classroom teacher
11/29   interview 4 students from each grade in group 2    the investigator
        (room 4)
11/30   paper and pencil test for group 2                  classroom teacher
        interview 4 students from each grade in group 1    the investigator
        (room 4)

Figure 6
Interview Problems
One hundred open-ended verbal problems were developed for grades
five through eight independently from those on the IPSP test. These
problems were reviewed by members of the IPSP staff. Samples from the
100 problems were administered to six volunteer students in grades five
through eight in think aloud interviews. Information obtained from
these interviews and suggestions from the staff were used in revising
some problems and eliminating others. A pool of 65 problems resulted.
These problems were then classified into seven levels: level one containing simple one step word problems and each succeeding level containing problems that were increasingly difficult in both concepts and
computations to be used. Each problem was typed on a half sheet of
paper so the student could do any needed computations on that paper.
These problems are included in Appendix C.
The Interviews
The investigator conducted all 32 interviews. Because the interviews were taking place during the regular school day, a rather brief
time limit of 20 minutes per student was allotted. The first five minutes were used in talking to the student about the procedure to be used
and in presenting two sample problems. Students were encouraged to talk
but were not given any hints or told whether what they wer