MATHEMATICAL METHODS IN THE SOCIAL …rca.ucsd.edu/selected_papers/A Theory of Stimulus...

23
MATHEMATICAL METHODS IN THE SOCIAL SCIENCES, 1959 Proceedings of the First Stanford Symposium Edited by KENNETH J. ARROW SAMUEL KARLIN PATRICK SUPPES 1960 STANFORD UNIVERSITY PRESS STANFORD, CALIFORNIA

Transcript of MATHEMATICAL METHODS IN THE SOCIAL …rca.ucsd.edu/selected_papers/A Theory of Stimulus...

MATHEMATICAL METHODS IN THE

SOCIAL SCIENCES, 1959

Proceedings of the First Stanford Symposium

Edited by KENNETH J. ARROW

SAMUEL KARLIN

PATRICK SUPPES

1960

STANFORD UNIVERSITY PRESS STANFORD, CALIFORNIA

STANFORD UNIVERSITY PRESS STANFORD, CALIFORNIA @ 1960 BY THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UNIVERSITY ALL RIGHTS RESERVED

LIBRARY OF CONGRESS CATALOG CARD NUMBER: 60-9048 PRINTED IN THE UNITED STATES OF AMERICA

IN

= 0 since only differences in the - 1) parameters (s2, t2, • • ·, s,., t,.) This suggests that n must be

assume the scores are numbered have some order. Suppose it is

3, 4),

>tt) (i < j).

identified. In fact, on the basis :), where the ordering on s and t1 the model based on two "time" r process of one parameter (the

Then

O'u = p•4 +t4 ,

0"34 = p•r•a+t4-ts •

'· Another case in which all of = 0 < ta. In all of the other

ivariate Statistical Analysis, New

:al Inference in Factor Analysis," in rkeley Symposium on Mathematical Angeles: University of California

ysis: The Radex," in P.F. Lazarsfeld, lt&C68, Glencoe, Ill.: The Free Press,

15

A Theory of Stimulus Discrimination Learning

RICHARD C. ATKINSON

University of California, Los Angeles

1. Introduction

This paper presents a theory of discrimination learning. For simplicity, the analysis will be restricted to situations in which only one of two stimuli is presented on each trial; the subject's task is to learn to respond appro­priately to them. However, the theory can be readily generalized to situa­tions involving more than two stimuli.

There are several recent quantitative theories dealing with this type of problem; in particular, the work of Bush and Mosteller [4), Restle [15, 16, 17], Burke and Estes [3], Green [11), Atkinson [1], and Estes [6, 7]. Each of these theories has certain limitations. The limitations are of two types. In some cases the conceptual framework on which the theory is based seems unrealistic from a psychological viewpoint. In other cases, the theories ap­pear to be contradicted by experimental data or are extremely restrictive in the type of problem to which they are applicable. An analysis of these theories will not be presented (see Restle [18] and Estes [7)) but later some predictions derived from our model and comparable results from other theories will be examined.

The experimental situation involves a series of discrete trials. On each trial one of two stimuli CS1 or S2) is presented. To the presentation of s,, the subject makes one of two responses CA1 or A2); these responses are mutually exclusive and exhaustive. A trial is terminated by a reinforcing event CE1 or £2). If an E1 occurs the A1 response has been reinforced, and if an E2 occurs the A2 response has been reinforced.

Thus the experimenter can present one of the following four combinations on each trial: CS1, £1), CS1, £2), CS2, £1), or CS2, £2). The respective probabili­ties of these four events will be a, b, c, and d, where a+ b + c + d = 1. More general schedules of reinforcing events and stimulus events can be investigated but the above schedule encompasses most of the experimental research on discrimination learning. The traditional type of discrimination task is described when a= d = 1/2 and b = c = 0; the subject must learn to

This research was supported by the Office of Naval Research under Contract. Nonr 233(58).

221

222 RICHARD C. ATKINSON

make the AI response to the presentation of SI and to make the A2 response to the presentation of S2. Another type of discrimination problem only re­cently investigated is specified when a, b, c, d * 0 [2], [6], [8], [9], [14].

Three additional parameters are frequently used to avoid cumbrous nota­tion: rri = aj(a + b), rr2 = c/(c + d), and [j = a + b. The parameter rri is the probability of an EI event given SI, the parameter rr2 is the probability of an EI event given S2, and [j is the probability of an S1 presentation.

The aim of the present model is to account for the following factors in discrimination learning:

(i) The effect of stimulus dimensions; specifically, the variable of stimu­lus similarity or differentiability;

(ii) The effect of reinforcement schedules; the influence of rri and rr2; (iii) The effect of stimulus schedules; that is, the influence of variations

in [j; (iv) Previous experience on other discrimination tasks.

Other variables influencing discrimination learning have been experimentally investigated, but for the moment we shall be more than happy to limit our attention to the above factors.

The dependent variable of major interest is the expected probability on trial n of an A, response to the presentation of an S1 stimulus. This proba­bility will be denoted as Pn(At I SJ).

The model is similar in some respects to the theories mentioned above. All of these theories conceptually represent the S, stimulus as a collection of component parts; the hypothetical components are called stimulus ele­ments or cues. Thus, the stimuli S1 and S2 are represented by two sets of elements, and similarity between the stimuli is defined with respect to the number of elements the two sets have in common. Some theorists have proposed that individual elements are conditioned to responses; these theories will be referred to as component models. Others, notably Estes [7], have proposed that patterns of stimulus elements are conditioned to responses.

The problem encountered by both types of theories is illustrated by the observation that subjects can learn to discriminate with perfect accuracy between stimuli having so many features in common that initially there is marked generalization among them. To account for this result the com­ponent theories have had to postulate some mechanism by which the com­mon stimulus elements gradually become ineffective or, to use Restle's ter­minology, become "adapted and rendered nonfunctional during learning." The mechanisms that have been proposed lack psychological rationale and tend to be applicable to extremely restricted schedules of reinforcement. The pattern model is not subject to these objections and has no difficulty accounting for the above observation. Unhappily, however, it implies that the rate of discrimination learning is independent of the number of common stimulus elements. This prediction seems contrary to most experimental evidence.

In our model, both conditioning concepts are employed. Further, a mecha­nism is postulated that integrates the two types of conditioning. The con-

THEORY OF STIM

trol of the integrating mechat and stimulus similarity of the

2. Stimulus Representation

Stimuli impinging on the 01

terms of a set S* of stimulu~ stimulus leads to the activatio 1. Further, as will be indica is a function of the activated are theoretical constructs to ~

are not the receptor neurons o tion of the physical stimulus,

For our problem, t;he total s tion of sl and the total stimu of s2 are represented by two spectively; the presentation 01

For different pairs of physical sizes and have different relati partially overlapping, or one

A set s. is defined that rei and S2 (i.e., Sc = St n Sz). T identified in a discrimination 1 be associated with the actual associated with background 1

mental chamber, sounds from so forth.

Let n(SI), n(Sz), and n(Sc) r1 S., respectively. Then

(1) n(SI)-WI= n(SI

The quantities WI and wz can the stimuli sl and 52 similar Roughly speaking, the more t

to unity; as similarity increa! For many experimental sit

comparable (see, for example, that wr = w2 = w; hence, wht should assume that wi = w2.

One additional restriction h That is, the case where WI

describe a situation in which sented on all trials of the ex Suppes and Atkinson [19] for paper.

iON

1 and to make the A2 response iscrimination problem only re­* 0 [2), [6), [8), [9], [14). used to avoid cumbrous nota­

l- b. The parameter n-1 is the meter n-2 is the probability of y of an sl presentation. t for the following factors in

:ifically, the variable of stimu-

, the influence of n-r and n-2; is, the influence of variations

:tation tasks. ning have been experimentally more than happy to limit our

s the expected probability on f an S1 stimulus. This proba-

:he theories mentioned above. he s, stimulus as a collection nents are called stimulus ele­are represented by two sets of is defined with respect to the ommon. Some theorists have 1ed to responses; these theories >thers, notably Estes [7), have :tre conditioned to responses. theories is illustrated by the

minate with perfect accuracy common that initially there is ount for this result the com­mechanism by which the com­Iective or, to use Restle's ter­mfunctional during learning." ck psychological rationale and ~ schedules of reinforcement. •jections and has no difficulty ~pily, however, it implies that ient of the number of common :ontrary to most experimental

~employed. Further, a mecha­·pes of conditioning. The con-

,j THEORY OF STIMULUS DISCRIMINATION LEARNING 223

trol of the integrating mechanism is governed by the reinforcing schedule and stimulus similarity of the particular discrimination task.

THEORY 2. Stimulus Representation

Stimuli impinging on the organism are to be represented conceptually in terms of a set S* of stimulus elements. The presentation of a particular stimulus leads to the activation of a unique subset of S* with probability 1. Further, as will be indicated later, the response elicited by the stimulus is a function of the activated stimulus elements. These stimulus elements are theoretical constructs to which we will assign certain properties. They are not the receptor neurons of neurophysiology but a schematic representa­tion of the physical stimulus, having certain simple and uniform properties.

For our problem, 1;he total stimulus situation associated with the presenta­tion of S1 and the total stimulus situation associated with the presentation of s. are represented by two subsets of stimulus elements St and S2, re· spectively; the presentation of s, (i = 1, 2) leads to the activation of set s,_ For different pairs of physical stimuli the sets S 1 and s. may be of differept sizes and have different relationships. That is, S 1 and S2 may be disjoint, partially overlapping, or one may be a subset of the other.

A set s. is defined that represents those stimulus elements common to S1 and s. (i.e., Sa= Sr n S2). Typically, two sources of common elements are identified in a discrimination problem. One subset of common elements may be associated with the actual stimuli to be discriminated; the other may be associated with background stimuli such as characteristics of the experi­mental chamber, sounds from the apparatus, proprioceptive stimulation, and so forth.

Let n(Sr), n(S2), and n(Sa) represent the number of elements in St, s., and Sa, respectively. Then

(1) n(St) - n(Sa) Wt = n(St) '

n(S.) - n(Sa) w. = n(S.)

The quantities Wt and w. can be used to define indexes of similarity between the stimuli S1 and S2 similar to those introduced by Bush and Mosteller [4]. Roughly speaking, the more dissimilar the stimuli the closer Wt and w. are to unity; as similarity increases wr and w. approach zero.

For many experimental situations the stimuli Sr and s. are es;;entially comparable (see, for example, [2)) and, consequently, it is natural to assume that Wt = w. = w; hence, when the subscript on w is omitted the reader should assume that Wt = w2.

One additional restriction is imposed; namely, 0 < Wt ~ 1 and 0 ~ w2 ~ 1. That is, the case where wr = w2 = 0 will not be considered. This would describe a situation in which the same stimulus (represented by Sa) is pre­sented on all trials of the experiment. The case is considered in detail by Suppes and Atkinson [19) for models of the type to be presented in this paper.

224 RICHARD C. ATKINSON

3. Conditioning States

With regard to conditioning, we distinguish between individual stimulus elements and stimulus patterns. A stimulus pattern refers to a subset of stimulus elements all of which are activated simultaneously; on a given trial, only one stimulus pattern can be activated. Two conditioning relations are defined: (i) on trial n, each stimulus element is conditioned to either A1

or A,; (ii) on trial n, each stimulus pattern is conditioned to either A 1

or A2.

To clarify, consider a set S* with only three stimulus elements s1, s2, and S3. There are seven possible stimulus patterns: (s1), (s2), (s3), (s1, s2), (s1, s3), (s2, s3), and (s1, s2, s3). If the stimulus impinging on the subject is such that the elements St, s2, and S3 are activated simultaneously, then the only pat­tern activated is (St, s,, s3); patterns (s1, s3) and (s2, s3) are not activated. Under the most general experimental conditions we might assume that any set of conditioning relations is possible; as an example, to be referred to again, the following states might hold on trial n for individual elements and for the stimulus patterns:

(2) St-A2, s2-A1, s3-A~o (s~)-At ,

(s2)-A2, (s3)-A2, (St, s2)-A1 ,

(s~o s3)-A2, (s2, s3)-A~o (s~o s2, s3)-A2 .

In the riext two sections we consider how these conditioning relations de­termine responses and how they are established.

4. Response Probability and Perceptual States

The response made by a subject on a given trial depends on the condi' tioning states of the activated stimulus elements. However, the response can be determined either by the stimulus pattern or by the individual stimu­lus elements. Specifically,

(P) The subject can respond in terms of the activated stimulus pattern and make the At response to which the activated stimulus pattern is conditioned;

(C) The subject can respond to the component stimulus elements such that the probability of an At response is the proportion of activated stimulus elements conditioned to A1.

If the response is determined on trial n by the stimulus pattern, then we will say that the subject is in state P on trial n; if the response is deter­mined by the component stimulus elements, then the subject is in state C. These two states will be referred to collectively as perceptual states.

In (2), if the stimulus elements St, s2, and S3 are activated simultaneous­ly, then (i) an A2 will be elicited if the subject is in perceptual state P, and (ii) an A1 will be elicited with probability 2/3 if the subject is in state C. However, if only s1 is activated, then (i) an A 1 will be made if the sub­ject is in state P, and (ii) an A2 will occur if the subject is in state C.

THEORY OF STil\

Whether a subject is in th< on his reinforcement history. subject has been consistently then he will remain in state reinforced when the subject v C. If partial reinforcement l or C, then in the future the probability other than 0 or 1

5. Conditioning Process

To be explicit, in discuss random variable. The forma been made in describing othE not have facilitated the preSE variable is a (measurable) func probability measure on the s on the values of the random

The sample space X for a comes x of the experiment. of stimulus, response, andre experiment, but it is impor random variable we speak oJ a particular realization of tl doubtedly produce a differen1 of the experiment at another

The random variable Fn(x) ' states as a function of reinf< ing random variable is define

(i) All activated stimulu: are conditioned to the reinfm then the stimulus elements a are conditioned to A1•

(i') No change occurs in t stimulus patterns on trial n.

(ii) If the subject is in pe reinforced, then the subject w response is reinforced, then for the C state, if the respo1 in C, but he switches to P i

(ii') No change occurs in Then

(3) Fn(X) =

~SON

sh between individual stimulus s pattern refers to a subset of :ed simultaneously; on a given ted. Two conditioning relations mentis conditioned to either A1 rn is conditioned to either A1

ee stimulus elements s1, s2, and terns: (s1), (s2), (s3), (s1, s.), (s1, S3), ~ing on the subject is such that ultaneously, then the only pat-and (s2, s3) are not activated.

ions we might assume that any an example, to be referred to trial n for individual elements

-A,'

A,' , s2, s3)-A • .

these conditioning relations de­>hed.

~tes

•en trial depends on the condi­ments. However, the response tern or by the individual stimu-

the activated stimulus pattern the activated stimulus pattern

1ponent stimulus elements such >e is the proportion of activated

'{the stimulus pattern, then we ial n; if the response is deter-then the subject is in state C.

ively as perceptual states. I S3 are activated simultaneous­ubject is in perceptual state P, ity 2/3 if the subject is in state an A, will be made if the sub­if the subject is in state C.

THEORY OF STIMULUS DISCRIMINATION LEARNING 225

Whether a subject is in the P or C state at the start of trial n depends on his reinforcement history. If the sequence of At responses made by the subject has been consistently reinforced when the subject was in the P state, then he will remain in state P; if the At responses have been consistently reinforced when the subject was in the C state, then he will remain in state C. If partial reinforcement has occurred when the subject was in either P or C, then in the future the subject will be in state P or C with some probability other than 0 or 1.

5. Conditioning Process

To be explicit, in discussing conditioning we introduce the notion of a random variable. The formal application of random variables could have been made in describing other features of the model, but until now it would not have facilitated the presentation. We remind the reader that a random variable is a (measurable) function defined on a sample space, and the over-all probability measure on the sample space induces a probability distribution on the values of the random variable.

The sample space X for a given experiment is the set of all possible out­comes x of the experiment. It is suggestive to think of x as the sequence of stimulus, response, and reinforcing events for a particular subject in the experiment, but it is important to remember that when in defining the random variable we speak of subject x, we mean the subject's protocol in a particular realization of the experiment. The same subject would un­doubtedly produce a different experimental outcome x in another realization of the experiment at another time.

The random variable Fn(x) describes changes in conditioning and perceptual states as a function of reinforcement schedules. Specifically, the condition­ing random variable is defined in terms of the following statements:

(i) All activated stimulus elements and the activated stimulus pattern are conditioned to the reinforced response. That is, if E1 occurs on trial n, then the stimulus elements and the stimulus pattern activated on the trial are conditioned to A 1•

(i') No change occurs in the conditioning state of stimulus elements or stimulus patterns on trial n.

(ii) If the subject is in perceptual state P and the A; response made is reinforced, then the subject will remain in state P; however, if the alternative response is reinforced, then the subject will change to state C. Similarly, for the C state, if the response emitted is reinforced the subject remains in C, but he switches to P if the A 1 response was not reinforced.

(ii') No change occurs in the perceptual state on trial n. Then

(3)

\

3 if (i) and (ii) apply ,

2 if (i) and (ii') apply , Fn(x) = 1 if (i') and (ii) apply ,

0 if (i') and (ii') apply .

226 RICHARD C. ATKINSON

Only three cases will be considered in this paper.

Case 1: (4) P(F,. = 3) = 0, P(F,. = 2) = P(F,. = 1) = 0,

P(F,. = 0) = (1 - 0) (0 < 0 ;$; 1) .

Case II:

(5) P(F,. = 3) = ata2, P(Fn = 2) = (1 - at)a2, P(F,. = 1) = at(1 - az),

P(F,. = 0) = (1 - at)(1 - a2) (0 < at, a2 ;$; 1) .

Case III:

(6) P(F,. = 3) = rs, P(F,. = 2) = r2, P(F,. = 1) = Tt, P(F,. = 0) =To.

6. Subject States

At the start of trial n, it is assumed that a subject can be described by an ordered four-tuple in which

(i) The first member of the tuple is either P or C and indicates the per­ceptual state on trial n.

(ii) The second member is either At or A2. If it is At (i = 1, 2), then the individual stimulus elements in the subset St,..., s. are conditioned to Ac and the stimulus pattern associated with the entire subset St is conditioned to At.

(iii) The third member is either At or A2 and indicates whether the in­dividual elements in s. are conditioned to At or A2.

(iv) The fourth member is either At or A2. If it is At (i = 1, 2), then the individual stimulus elements in the subset S2 ,..., s. are conditioned to At and the stimulus pattern associated with the entire subset S2 is conditioned to At.

Thus, following the rules for response probability, we find that if the sub­ject is in state (C, At, A2, A2), then the probability of an At response is Wt in the presence of St and 0 in the presence of S2. As a second example, for the state (P, At, A2, At) the probability of an At response is 1 in the presence of both St and S2.

These four-tuples will be referred to as subject states and assigned identi­fying numbers as follows:

1. (P, At, At, At) 9. (C, Ar, Ar, Ar)

2. (P, Ar, Ar, A2) 10. (C, Ar, Ar, A2)

3. (P, Ar, A2, At) 11. (C, At, A2, Ar)

4. (P, Ar, A2, A2) 12. (C, Ar, A2, A2)

5. (P, A2, Ar, Ar) 13. (C, A2, Ar, Ar)

6. (P, A2, Ar, A2) 14. (C, A2, Ar, Az)

7. (P, A2, A2, Ar) 15. (C, A2, A2, At)

8. (P, Az, A2, A2) 16. (C, A2, A2, A2)

The subject state description permits response distributions to be specified in a fairly simple fashion. One reason for being able to restrict the number

THEORY OF STIM

of states to sixteen is that, b a trial (i.e., if F,.(x) = 3, 2) in subsequent trials the stimulm stimulus elements in the subse response. Thus it is not nece pattern for sj and the individ in the description of the seco The only complication is witl wish to admit the possibility elements in the subset sj - ~

is a limitation but it is offse1

7. Mathematical Formulati(]

From the assumptions prest that the sequence of random is a Markov chain.I This r matrix P = [PtJ] may be defir being in subject state j on tri about subject states on trials further, PtJ is independent of process is completely charact initial probability distributim

We now use the axioms of matrix. In making such a d1 ous possible occurrences on < ing from a point represents : sibilities. Two of the sixtee lustrate the procedure.

Each path on a tree from : a possible change in subject path is obtained by multipli Figure 1 the probability of t a transition from one state t

priate branches. For exam1 them lead from state 4 to : second example, in Figure 2 and, consequently, P12,2 = ra( bilities for Case III is given

Let Ut(n) be the probabili trial n, where n = 1, 2, · · · .

(7) U(n)

Then U(n) = U(n- 1) · P, a

I For more complex stimulus ! tions other than those proposed may not be a Markov chain.

;QN

paper.

= 1) = 0,

t < 8 ~ 1).

las, P(Fn = 1) = a1(1 - as),

(0 < a1, a. ::::;; 1) .

~ = 1) = r1, P(F,. = 0) = ro.

il. subject can be described by

P or C and indicates the per-

If it is A1 (i = 1, 2), then the .., 8. are conditioned to A1 and 11tire subset 81 is conditioned

and indicates whether the in­or A •. If it is A1 (i = 1, 2), then the

- 8. are conditioned to A1 and 1tire subset 8. is conditioned

tbility, we find that if the sub­.bility of an A1 response is w1

:>f 8,. As a second example, 1f an A1 response is 1 in the

;ect states and assigned identi-

9. (C, A1, A1, A1) 10. (C, A1, A1, A.)

11. (C, A1, A., A1) 12. (C, A1, A., A.) 13. (C, A., A1, A1) 14. (C, A., A1, A.)

15. (C, A., A., A1) 16. (C, A., A., A.)

1e distributions to be specified ing able to restrict the number

I :·; :_~ \,j

q I

THEORY OF STIMULUS DISCRIMINATION LEARNING 227

of states to sixteen is that, by assumption, if conditioning has occurred on a trial (i.e., if Fn(x) = 3, 2) in which S1 (j = 1, 2) was presented, then on all subsequent trials the stimulus pattern associated with 81 and the individual stimulus elements in the subset 81,.., 8. will both be conditioned to the same response. Thus it is not necessary to keep track separately of the stimulus pattern for 8} and the individual elements in the subset 8} ,_ 8., as indicated in the description of the second and fourth components of a subject state. The only complication is with regard to initial conditions, where we might wish to admit the possibility that the pattern associated with 8} and the elements in the subset 81 - 8. are conditioned to different responses. This is a limitation but it is offset by the resulting simplicity of the model.

7. Mathematical Formulation

From the assumptions presented in the preceding sections, it can be shown that the sequence of random variables that take the subject states as values is a Markov chain.1 This means, among other things, that a transition matrix P = [Pt1] may be defined, where PtJ is the conditional probability of being in subject state j on trial n + 1, given state i on trial n. Information about subject states on trials preceding n in no way affects Pt1 on trial n; further, Pu is independent of n, that is, constant over trials. The learning process is completely characterized by these transition probabilities and the initial probability distribution on the subject states.

We now use the axioms of the preceding section to derive the transition matrix. In making such a derivation it is convenient to represent the vari· ous possible occurrences on a trial by a tree. Each set of branches emanat· ing from a point represents a mutually exclusive and exhaustive set of pos· sibilities. Two of the sixteen trees are presented in Figures 1 and 2 to il­lustrate the procedure.

Each path on a tree from a beginning point to a terminal point represents a possible change in subject state on a given trial. The probability of each path is obtained by multiplication of conditional probabilities. Thus, for Figure 1 the probability of the top path is simply na. The probability of a transition from one state to another is obtained by summing over appro­priate branches. For example, of the sixteen paths in Figure 1, two of them lead from state 4 to state 2 and, therefore, P4,2 = na + r.a. As a second example, in Figure 2 only one path leads from state 12 to state 2 and, consequently, P12,2 = r3(1- w1)a. The complete set of transition proba­bilities for Case III is given in the Appendix.

Let Ui(n) be the probability of being in subject state i at the start of trial n, where n = 1, 2, · · · . Define the row matrix

(7) U (n) = [u1(n), u.(n), · · ·, u16(n)] .

Then U(n) = U(n- 1) · P, and, in general,

1 For more complex stimulus and reinforcement schedules or for conditioning assump· tions other than those proposed in Cases I-III, the particular sequence of Subject states may not be a Markov chain.

C\1 C\1 v v

228

THEORY OF STIM1

(8)

Further, define Pl1> as the pr4 given that at trial r we were exists and is independent of i

(9)

The limiting quantities u1 exi: reducible and aperiodic. A M proper subset of states, that i within this set the probability odic if there is no fixed perio

Experimentally, it is imposs on a given trial. That is, the response (At or A'J), and reini but this information is not Sll

ample, if St is presented and which of the sixteen states n fact, for this particular com would have been possible: 1, 2, confounding is due to the fac observable.

Since trial descriptions and correspondence, it is necessary to define probabilities of even lowing quantities are of partie bility on trial n of an At res Pn(At I S.), the conditional pro s. stimulus presentation. By that (10) Pn(At I St) = Ut(n) + u2(

+ wt[uu(1i

and

(11) Pn(At I S2) = Ut(n) + Ua(

+ w2[un(1

Also, for analytical purposes perceptual state P will be us

(12) Pn(P) =

ANAl

8. Empirical Implications o

In this case changes in th pletely dependent on each othc

-ln~N ~~~~

~ v v v v v

I

§

"' ... 3 "' .!i ri .a­-- ~ "'-­"' .. g,-

1:: bO 0

.5 N ......... "' :.:: ~ ... ~ ... .... .. "' .... ::: < ~ N 'So ........ · -~-;; r...­~

§

"' ~ ~ "' ~ ~

.E

.!i ri .a_ ·- ~ "'·-&b bO §

·.E v

~ ~ ... .... ... "' .':: ::: ... < -a ..... ..... · ~ + ~ ;:

fi:~ .'::

THEORY OF STIMULUS DISCRIMINATION LEARNING 229

(8) U(n) = U(1) · pn-l .

Further, define Plj> as the probability of being in state j. at trial r + n, given that at trial r we were in state i. Moreover, if the appropriate limit exists and is independent of i, let

(9) UJ = lim Pin> . n-oo J

The limiting quantities u1 exist for any finite-state Markov chain that IS Ir­reducible and aperiodic. A Markov chain is irreducible if there is no closed proper subset of states, that is, no proper subset of states such that once within this set the probability of leaving it is 0. A Markov chain is aperi­odic if there is no fixed period for return to any state.

Experimentally, it is impossible to identify individual states of the process on a given trial. That is, the experimenter knows which stimulus (51 or 52), response CA1 or A2), and reinforcing event (E1 or E2) occurred on the trial, but this information is not sufficient to identify the subject state. For ex­ample, if 51 is presented and A1 occurs, we cannot establish unequivocally which of the sixteen states the subject was in when the A1 occurred. In fact, for this particular combination, any one of the following ten states would have been possible: 1, 2, 3, 4, 9, 10, 11, 12, 13, or 14. Obviously, this confounding is due to the fact that the perceptual states are not directly observable.

Since trial descriptions and subject states cannot be placed in one-to-one correspondence, it is necessary (for an experimental evaluation of the theory) to define probabilities of events that are observable. Consequently, the fol­lowing quantities are of particular interest: PnCA1I 51), the conditional proba­bility on trial n of an A1 response given an 51 stimulus presentation; and Pn(AII 52), the conditional probability on trial n of an A1 response given an 52 stimulus presentation. By inspection of the theoretical states it follows that (10) PnCA1 I 51) = u1(n) + u2(n) + ua(n) + U4(n) + U9(n) + u10(n)

+ uh[Un(n) + u12(n)) + (1 - uh)[u13(n) + uu(n)]

and

(11) Pn(AI I 52) = u1(n) + ua(n) + Uo(n) + u1(n) + u9(n) + U1a(n)

+ w2[un(n) + U1o(n)) + (1 - w2)[uro(n) + uu(n)] .

Also, for analytical purposes, the probabilities at the start of trial n of perceptual state P will be useful:

(12) Pn(P) = u1(n) + u2(n) + · · · + Us(n) .

ANALYSIS OF THE MODEL

8. Empirical Implications of Case I

In this case changes in the conditioning and perceptual states are com­pletely dependent on each other, as indicated by (4). If, on trial n, a change

230 RICHARD C. ATKINSON

in the conditioning state is possible, then a change in the perceptual state also can occur; if a change is not possible in the conditioning state, then no change can occur in the perceptual state.

An inspection of the transition matrix for this case leads to two immediate conclusions: (i) the subject states 3, 6, 11, and 14 are transient; (ii) the asymp­totic probability of any subject state is independent of 0. This follows from the observation that the main diagonal of P for this case has terms of the form (1- 0) + oa, while all other non-zero terms are of the form (JiJ'.

Asymptotic Predictions

We now consider some general aspects of asymptotic behavior that are of experimental interest. If 0 < uJI < 1 and 0 ~ w2 < 1, then a single closed set of states exists,2 namely cl = {2, 4, 5, 7, 9, 16}; cl forms an irreducible aperi­odic Markov chain, and it can be shown that

(13)

u2 = ad(a +b)/A ,

u, = ad(c + d)/A ,

Us = bc(c + d)/A ,

where A = (a + b)(c + d) .

Further, by (10)-(12) we have

U7 = bc(a + b)/A,

U9 = ac/A,

U1a = bd/A ,

(14) Poo(Al I Sl) = u2 + u4 + u9 = 7r!, Poo(Al I S2) = Us + lt7 + u9 = 7r2,

ad+ be Poo(P) = u2 + U4 + Us + u7 = A .

The only other admissible condition not considered above is when w1

= w2 = 1. In this event, two closed sets of states exist, namely C1 and C2 = {1, 8, 10, 12, 13, 15}. If the subject is not in cl or c2 at the start of the ex­periment, then (i) the subject will be absorbed into cl if he was initially in state 11 or 14, and (ii) the subject will be absorbed into C2 if he was in­itially in state 3 or 6. Thus when w1 = w2 = 1 the probabilities of absorp­tion into C1. and C2 are, respectively:

(15)

and

(16) U!(1) + U3(1) + lta(1) + Us(1) + U!o(1) + ul2(1) + U13(1) + U15 (1) .

C2 also forms an irreducible aperiodic Markov chain, and if the subject is absorbed into the c2 set, then

U1 =ad/A,

(17) Us= bd/A,

U10 = ad(a + b)/A,

Further, when absorbed into C2,

u12 = ad(c +d)/A ,

U13 = bc(c + d)/A ,

U1s = bc(a + b)/A .

2 A set C of states is closed if no state outside C can be reached from any state in C.

THEORY OF STil\'

(18) Poo(Al I S1) = U1 + U10 +

Thus for Case I, Poo(A, I SJ) ability distribution U(1), as ir holds for Poo(P) except for th

(19) Poo(P) = {(ab +be)/. (ac +btl)/

Thus, when w1 = ws = 1, the The asymptotic results for

described by W. K. Estes for; cally the probability of a res. simply the probability of rei presented. That is, the prob p .. (AII S1); similarly, the pre p .. (AI I S2). Experimental evid later.

Pre-asymptotic Predictions

Certain pre-asymptotic prot: tion task will be investigated and b = c = 0. For this prob: is correct when Ss is present~

Under these conditions the of states 2 and 4. Consequer

(20)

To simplify the analysis, a 1, 2, · · ·, 16). Both of these mental applications. The tv first condition and no initial · tion.

Given these restrictions, it Further, it can be shown tha

(21)

ua(n) = Us(n) = Uo(n) =

= U13(n) = uu(n) =

u1(n) = Us(n) = u9(n) =

1 u10(n) = uu(n) =

4w ([J

u2(n) = u4(n) •

By (21) and the fact that ~

ON

ilange in the perceptual state the conditioning state, then no

. s case leads to two immediate 4 are transient; (ii) the asymp­~pendent of 0. This follows >f P for this case has terms ~ro terms are of the form Oo'.

~mptotic behavior that are of ~2 < 1, then a single closed set Cr forms an irreducible aperi-

~e(a + b)/.d,

zc/.d,

ld/.d.

. I S2) = u, + U1 + U9 = rr2,

ad+ be J

nsidered above is when w1 = es exist, namely Cr and C2 = ' or c2 at the start of the ex­:1 into Cr if he was initially lbsorbed into c2 if he was in-1 the probabilities of absorp-

r Un(1) + Uu(1) + Ur6(1)

· Uu(1) + U13(1) + Ut 5(1) .

r chain, and if the subject is

rd(c + d)/.d ,

oc(c + d)/.d ,

oc(a + b)/.d •

can be reached from any state in C.

(18)

THEORY OF STIMULUS DISCRIMINATION LEARNING 231

P~(Ar I 5r) = Ur + Uro + U12 = 7rr, P~(Ar I 52) = Ur + Uta + ul5 = 7rz •

ac + bd P~(P) = U1 + Us = J .

Thus for Case I, P~(At I 51) is independent of 0, wr, w2, and the initial prob­ability distribution U(1), as indicated by (14) and (18). The same conclusion holds for P~(P) except for the case where wr = w2 = 1; here,

(19) p (P) = {(ab + bc)/.d, with probability indicated by (15) , ~ (ac + bd)/.d, with probability indicated by (16) .

Thus, when wr = w2 = 1, the asymptotic probability P~(P) depends on U(1). The asymptotic results for P~(At I 51) correspond to the "matching law"

described by W. K. Estes for a linear learning model [5]. That is, asymptoti­cally the probability of a response in the presence of a given stimulus is simply the probability of reinforcing that response when the stimulus is presented. That is, the probability of an Er event given 5r is rrr, which is P~(Arl 5r); similarly, the probability of an Er given 52 is rr2, which is P~CA1I 52). Experimental evidence dealing with this result will be presented later.

Pre-asymptotic Predictions

Certain pre-asymptotic properties of the model for a classical discrimina­tion task will be investigated, that is, for a situation in which a = d = 1/2 and b = c = 0. For this problem Ar is correct when 5r is presented and A2 is correct when s2 is presented.

Under these conditions the process is absorbed in a closed set consisting of states 2 and 4. Consequently,

(20) P~(Ar I Sr) = P-CA2I 52) = P~(P) = 1 .

To simplify the analysis, assume that Wt = w2 = w and Ut(1) = 1/16 (i = 1, 2, · · ·, 16). Both of these assumptions are reasonable for many experi­mental applications. The two stimuli are taken to be comparable by the first condition and no initial response bias is posited by the second condi­tion.

Given these restrictions, it is obvious that Pn(Ar I 5r) = PnCA2I 52) for all n. Further, it can be shown that

(21)

ua(n) = u,(n) = u6(n) = uin) = uu(n) 1 = Ura(n) = Uu(n) = Urs(n) = 16(1- O)n-I ,

1 3 ur(n) = u8(n) = u9(n) = Ur6(n) = 4(1- 0/2)n-I-

16(1- O)n-r,

1 1 Uro(n) = U12(n) = -t;;-Cl1 - 0(1 - w)/2]n-I - (1 - 0/2]n-I} + 16(1 - O)n-r,

u2(n) = u4(n) .

By (21) and the fact that 2:ut(n) = 1, it follows that

232 RICHARD C. ATKINSON

(22) 1-w 3w-1 Pn(All S1) = 1- --[1- 0(1- w)/2]n-l- [1- 0/2Jn-l.

4w 4w

Equation (22) also describes the probability of a correct response, defined here as

An inspection of (22) indicates that the rate of approach to the asymptote is a function of both 8 and w. To investigate the effect of w on condition­ing, let 8 remain fixed and consider two discrimination problems, one de­scribed by w' and the other by w, where w' > w. Further, for fixed 8, define p~(C) and Pn(C) in terms of w' and w, respectively.

By inspection of (22) it is clear that on early trials p~(C) will be greater than Pn(C). However, at some trial a crossover occurs, and from this point on p,(C) is greater than p~(C) as they both approach unity. Further, for a fixed value of w, the crossover point occurs later in the series of trials as w' increases. The prediction of an increase in the number of errors on early trials with an increase in similarity of the discriminanda is reasonably well-established experimentally. With regard to the crossover effect we know of no conclusive experimental evidence; however, some recent data of LaBerge and Smith [13] suggest that such a phenomenon may occur.

The proof of the above result depends on a rather surprising observation. Define EN as the expected number of incorrect responses made by a subject over the first N trials of the experiment. Then

N 3 1 3w-1 EN= ,L. [1- p,(C)] =-- -[1- 8(1- w)/2]N- [1- 8/2]N.

"- 1 28 2w8 2w8

As the number of trials becomes large,

N 3 EN__._. 28

Thus we have another interesting experimental prediction for Case I. The number of errors made in the "perfect" learning of a classical discrimina­tion problem is independent of the similarity of the discriminanda.

9. Empirical Implications of Case II

In this case, changes in the perceptual and conditioning states are statisti­cally independent, as indicated by (5). On any trial, a change in the per­ceptual state may occur with probability a1, while a change in the condi­tioning state may occur with probability a2. This modification (as compared with Case I, in which changes in the perceptual and conditioning states are completely dependent) leads to predictions which, for some parameter values, are markedly different from those of Case I.

In analyzing Case II some experimental data will be compared with theo­retical predictions. However, before inspecting particular experiments, there

THEORY OF STill

are several general results tl (i) As in Case I, subject

(ii) In general, u, and p.( as a1 and a2 approach 1, tl and, therefore, by P~CA1I S1)

(iii) We have restricted 0 • theories of discrimination obi conditions, no change can OCI

initially he is in a P·state, tl for e-states. Consequently, states 1 through 8 and the o1 closed set,

(23)

u1 = ac/.4,

u2 = ad(a­

U3 = 0,

U4 = ad(c-

If the process is absorbed in

(24) P~CA1IS1)

P~CA1IS,)

The u/s for the second clos identical to those presented in P~CA1 I St) is defined by (10) a

(25)

Thus, when a1 = 0, asymptot ing on the initial state of the suits of (25) are identical to 1 Estes [3]. In contrast, the p1 predictions generated by Est{

(iv) Another general resul· w2 = 1. Here, independent oJ P~(Al IS.) = TC2.

Classical Discrimination Lean

When rc1 = 1, rc2 = 0, and fJ tion matrix, that states 2 anc u2 =a and u4 =d. Conseque:

ON

-I- 3w- 1 [1- 0/2]"-I . 4w

a correct response, defined

PnCS2) = p,(AI I S1) .

>f approach to the asymptote the effect of w on condition· ·imination problems, one de· o. Further, for fixed 0, define ively. r trials p~(C) will be greater ~r occurs, and from this point tpproach unity. Further, for s later in the series of trials tse in the number of errors ty of the discriminanda is Vith regard to the crossover al evidence; however, some that such a phenomenon may

rather surprising observation. responses made by a subject

en

1)/2]N- 3w- 1 [1- 0/2]N . 2w0

tl prediction for Case I. The ting of a classical discrimina· of the discriminanda.

mditioning states are statisti· y trial, a change in the per· while a change in the condi· ~his modification (as compared tal and conditioning states are 11, for some parameter values,

. will be compared with theo· : particular experiments, there

THEORY OF STIMULUS DISCRIMINATION LEARNING

are several general results that are important in applying the theory: (i) As in Case I, subject states 3, 6, 11, and 14 are transient.

233

(ii) In general, u, and P~CA1I St) depend on a1, a2, w1, and w2. However, as a1 and a2 approach 1, the stationary process is described by (14)-(18) and, therefore, by P~(AI I S1) = rr1 and P~(AI I S2) = rr2.

(iii) We have restricted 0 <at~ 1. However, results relevant to other theories of discrimination obtain when a1 = 0 and 0 <a,~ 1. Under these conditions, no change can occur in the perceptual state of the subject. If initially he is in a P-state, then he will remain in a P-state; the same holds for C-states. Consequently, two closed sets are formed, one consisting of states 1 through 8 and the other of states 9 through 16. For the former closed set,

u1 = ac/d, Us = bc(c + d)/d ,

(23) u2 = ad(a + b)/d , ll6 = 0,

U3 = 0, u, = bc(a + b)/d ,

Itt = ad(c + d)/d , Us = bd/d .

If the process is absorbed in this set, by (10) and (11)

(24) P~(AI I Sl) = lll + ll2 + ll3 + lt4 = il"J '

p.,(AI IS,) = ll1 + U3 + U5 + u, = n, •

The ut's for the second closed set, consLting of states 9 through 16, are identical to those presented in (23), i.e., Ut = Ut+B· However, for these states P~(AI I St) is defined by (10) and (11) as

(25)

P~(AI I S1) = ltg + ll1o + w1[Un + zt12] + (1 - WI)[uu + Uu]

= 7rJW! + (1 - WI)[~7rl + (1 - ~)11:2] ,

P"'(AI I S2) = ltg + ll13 + w2[U12 + u,,] + (1 - w2)[uiO + llu]

= n2w2 + (1 - w2)[~n1 + (1 - ~)n2l ·

Thus, when a1 = 0, asymptotic behavior is described by (24) or (25), depend· ing on the initial state of the subject. It is interesting to note that the re· suits of (25) are identical to the asymptotic predictions made by Burke and Estes [3]. In contrast, the predictions embodied in (24) are identical to the predictions generated by Estes' pattern model [7].

(iv) Another general result of interest holds for the case in which w1 = w2 = 1. Here, independent of the value of ~. we have P~(AII S1) = 11:1 and p"'(AI I S2) = 71:2.

Classical Discrimination Learning

When n1 = 1, n2 = 0, and ~ * 0, 1, it is clear, by inspection of the transi· tion matrix, that states 2 and 4 form the only closed set and, further, that u2 =a and Ut =d. Consequently,

p,(AII SI) ~ 1, PnCA2IS2) ~ 1.

234

1.0

.9

.!!!_.8 c( -Q.

.7

/ /

5

,-" /

RICHARD C. ATKINSON

----------.,..,. .....

---------------

10

------------

--w •.1

---w•.9

15 BLOCKS OF 10 TRIALS

-----

20

Figure 3. Theoretical functions for P,.(A 1 I S1 ) averaged over 10-trial blocks; a= d =¥.a and u,(l) = ~6·

Computational results are presented in Figure 3 for a = d = 1/2, u,(1) = 1/16, a1 = a2 = a, and w1 = wz = w. Only Pn(Al I S1) is plotted, since it is equal to Pn(Azl Sz) under these conditions. Clearly, the larger the value of a the more rapidly the subject approaches perfect responding. Also, as was true for Case I, the proportion of correct responses on early trials increases with increasing values of w; however, at some trial a crossover occurs and from this point on the larger the value of w the greater the number of errors.

Similar computations are provided in Figure 4 for w1 > 0 and wz = 0. This would describe a situation in which the presentation of the stimulus Sz is theoretically represented by a set of stimulus elements that is a proper subset of 81. For example, experimentally S1 might be the onset of a tone and a light while Sz is the onset of just the tone. As can be seen from the figure, in mastering the problem the subject makes more errors on Sz trials than on S1 trials. Stated differently, the subject learns to make the A 1 response to S1 more quickly than the Az response to Sz.

An experiment was conducted to test this specific prediction, employing a procedure and apparatus described elsewhere [2]. Facing the subject was an array of ten lights; the onset of all ten lights designated an S1 trial and the onset of a specific subset of five of these lights designated an S2 trial. (The five lights that made up the Sz were randomly selected for each sub­ject.) On every trial the subject made either an A1 or an Az response; this was followed by a signal that told him which response was correct. Sixty subjects were run for eighty trials; a= d = 1/2, with the restriction that over the eighty trials there were exactly forty S1's and forty S2's.

1.0

.9

.!!. .8 c( -Q.

.7

.6

THEORY OF STU

. 5 .....,,...-,..--.---r-.--""T"""-5

Bl

Figure 4(a). Theoretical fUJI a= d = ¥.!, u,(l) = ~6. <It= a2 = .0

1.0

.9

-~.8 c( -Q.

.7

.6

.5 .....,r-"T"""--r--r-"T"""--r---r 5

BLI

Figure 4(b). Theoretical fur a= d = ¥.!, u,(l) = ~6. <It= a2 = .0

)N

------===== ~

--------~

-----

ALS

--W•.I ---w •.9

' 15 ' 20

aged over 10-trial blocks; a= d =¥.a

3 for a = d = 1/2, u.(1) = 1/16, is plotted, since it is equal to

te larger the value of a the ~ponding. Also, as was true 1ses on early trials increases ~ trial a crossover occurs and ' the greater the number of

.rre 4 for a.11 > 0 and wz = 0. presentation of the stimulus

ulus elements that is a proper ight be the onset of a tone and s can be seen from the figure, more errors on S, trials than !llms to make the Ar response I•

specific prediction, employing ~ [2]. Facing the subject was nts designated an Sr trial and lights designated an Sz trial.

ndomly selected for each sub· an Ar or an Az response; this t response was correct. Sixty L/2, with the restriction that y Sr's and forty Sz's.

1.0

.9

~ .8 .ii -Q.

.7

.6 'I

" I

I

THEORY OF STIMULUS DISCRIMINATION LEARNING 235

/

I I

/

" / /

..... ..... ..... " " "

.... .... .... .... ..... .....

.... .... -....

-----------------­_,---

- p(A1IS1)

--- p(AziSz)

.5 1 1 1 1 1 I I I I I I I I I I I I I I I I 20 5 10 15

BLOCKS OF 10 TRIALS

Figure 4(a). Theoretical functions for P,.(A,[ S,) averaged over 10-trial blocks; a=d=¥.!, u,(1) =lAs, llt =a2 = .OS, and oo1 = .9, oo2 =0.

1.0

.9

~.8 <[ -Q.

-------

p(A~Is~)

p(AziSz)

.5 1 a 1 1 1 I I I I I I I I I I I I I I I I 5 10 15 20

BLOCKS OF 10 TRIALS

Figure 4(b). Theoretical functions for p,.(A,[ S,) averaged over 10-trial blocks; a= d = ¥.!, u

1(1) = lhs, a 1 = a 2 = .OS, and oo1 = .5, oo2 = 0.

- .q ~ - THEORY OF STIMl 0

'3 '3 '3 It) c ,--.. 0 It) .g

g u Before the experiment was 1 c

.E trials and the last block of se'

"' predicted, over the last block oJ "' "' on sl trials was greater than "' ... {') t test indicated that the resul1 ~ .... For the first block of ten tr

~ '-' the random sequence of sl an 8 ~ and five s. trials. Over the -0 trials was slightly less than .q "' CLI ::s this result did not approach s1 ~ 1)""! The Estes and Burke Study -~II 1JCC. The case in which 1r1 = 1, 11 .. ., jl., ; task investigated by Estes anc >Oo- study, but only the acquisitiot

CLI considered. Facing the subjec '" II &, ....

0 ·- t: the onset of either the six lig ~

~ .. ( 1sl 1v) 111d

cq .. the right half was designated t: nated an s. trial. On each t1 response and was then told \\

Figure 5 presents theoretical Ol It)- - It! Ol CLI and w1 = ro2 = .1, .5, and .9. . • • • • • .c .... 3 3 3 3 "J "J !!! .. w = .1 provide a reasonably g

I .£ I An inspection of the curves - 'N I "' en en ..Ill for this special case, namely I ~ <( <( I closer P-CA1 Is.) is to .5, the - - I :en 3 a. a. ...J

<( .. Other Studies .... I a: ~ I Two studies deal with the , I

.... .. Ol CLI

> perimental groups. Popper a: 0 0

N 1l groups in which 1r1 = .85, 8 = bD .30, and .15. A similar stud) LL "' ....

r--O CLI [2] in which 1r1 = .9 and Irs > "' en "' respectively. The findings of

~ c 0 .g the discussion will be limited lt)g u

The experimental procedun c .E

I m begins with the onset of one I ~ >. I I

·.;:] a response (At or A,) and is 1 CLI ....

I I ~ .... "' Figure 6 presents theoretic I I :a~ I I .c .... ro = ro1 = w. and a = a1 = as; \ I I ~ ::s \ \ I -~ The abscissa represents differ

\ \ I "'., ' ' ' I CLI C a. As a approaches 1 and ~ 3 "' bD "' 1r1 for all values of 1r2. How

q ~ IX) ·- CLI ,..., IQ "1 ~~

Csl''it)d w relation between P-CA1 I S1) a1 probability of an A1 response

236 Bogartz, and Turner study a:

., • ts

I

' \ \ \ I

I )

/ /

/ /

/

- It! ~ • • • 3 3 3

I I I I I I \ I \ I

\ ' ::::----;, '

"' "!

tg

~

t=' "!

0

~

-(/) -..J

m

ct ~ .... 2 IL

,...0

C/) :Ill: 0

an9 CD

...,

Ci § ·= u § ..... ~

~ _...._

.... V) -~" 8 -Q.,

Ci "' .. :::2

~ a;l0 ti II ·- cc. ~"d p.. ;

.Co-~ II.

6:, " ·- ~ ~ ..

.s ..

.£ "' ~ ~ -;; .E

I

~ t ~

1 ~ "' §

·= u s:: .E

.. ~

-;; >. ·~] ~ -;; g~

..c .. E-< :::2

-~ "'-c:~ ~ ; 6:,"' ~~ ~

THEORY OF STIMULUS DISCRIMINATION LEARNING 237

Before the experiment was run, it was decided that the first block of ten trials and the last block of seventy trials would be analyzed separately. As predicted, over the last block of seventy trials the proportion of A1 responses on S1 trials was greater than the proportion of A,'s on S2 trials. A paired t test indicated that the result was significant at the .05 level.

For the first block of ten trials, an additional restriction was placed on the random sequence of sl and s2 events such that there were five sl trials and five S2 trials. Over the first ten trials the proportion of A1's on S1 trials was slightly less than the proportion of A2's on S2 trials; however, this result did not approach statistical significance.

The Estes and Burke Study

The case in which n1 = 1, 1r2 = 1/2, and 8 = 1/2 describes a discrimination task investigated by Estes and Burke [8]. There are several aspects to the study, but only the acquisition process for the "constant" group will be considered. Facing the subject was a circular array of twelve lights, and the onset of either the six lights on the left half of the panel or the six on the right half was designated an S1 trial; the onset of the other six desig­nated an S2 trial. On each trial the subject made either an A1 or an A2 response and was then told which response was correct.

Figure 5 presents theoretical curves computed for a1 = a2 = .05, Ui(l) = 1/16, and w1 = aJ2 = .1, .5, and .9. The functions Pn(AI I S1) and PnCA1 I S2) for w = .1 provide a reasonably good fit of the observed data .

An inspection of the curves in Figure 5 illustrates some theoretical results for this special case, namely that the closer p.(Al I S1) is to unity and the closer P-CA1 I S2) is to .5, the larger is the value of w.

Other Studies

Two studies deal with the case where 1r1 is fixed and n2 varies over ex­perimental groups. Popper and Atkinson [14] report a study involving five groups in which n1 = .85, 8 = .50, and n2 takes on the values .85, .70, .50, .30, and .15. A similar study is reported by Atkinson, Bogartz, and Turner [2] in which 1r1 = .9 and 1r2 is .9, .7, .5, .3, and .1 for the five groups, respectively. The findings of these studies are in agreement. Consequently, the discussion will be limited to the latter study.

The experimental procedure is similar to Estes and Burke [8]. Each trial begins with the onset of one of two lights (S1 or S2). The subject makes a response CA1 or A2) and is then told which response was correct CE1 or E2).

Figure 6 presents theoretical curves for P-CA1 I S1) for various values of w = w1 = w2 and a = a1 = a2; 1r1 and 8 are fixed at .9 and .5, respectively. The abscissa represents different values of n2 and the parameters are w and a. As a approaches 1 and also as w approaches 1, P=CA1I S1) approaches n1 for all values of 1r2. However, in general, the model predicts a convex relation between P-CA1 I S1) and 1r2. This convex relation for the asymptotic probability of an A1 response on S1 trials was demonstrated in the Atkinson. Bogartz, and Turner study and also in the Popper and Atkinson study.

238 RICHARD C. ATKINSON

Figure 6 presents another interesting feature of Case II. Here we observe that when rr1 = rr. = .9, the probability P®CA1 I S1) is generally greater than .9, with maximal differences between .9 and Poo(All S1) for small values of OJ

and a. This result obtains in general for Case II. That is, if n:1 = n:: = n: > 1/2, then P~CA1 I S1) and p®(A1 I S2) are both greater than n: except when w = 1 or when a = 1.

10. Discussion

An empirical evaluation of the theory is currently in progress. The re­search involves experimentation with both rats and human subjects. The variables being analyzed encompass various stimulus and reinforcement schedules and also include procedures designed to manipulate the values of w1 and w.. The research with rats is being conducted using two-bar Skin­ner boxes in which the bars are retractable, thereby permitting a discrete trial procedure. The equipment is completely automatic, and the animal's response data can be immediately transferred to I.B.M. cards for analysis.

Concurrently, various procedures for estimating parameters are being ex­plored. In any application of the model it will be necessary to estimate at least two parameters. For example, if the stimuli S1 and S2 are com­parable, then application of Case I would involve estimating w and fJ. A maximum-likelihood method has been worked out for this case and is pro­grammed at the Western Data Processing Center for the I.B.M. 709 computer. An individual subject's response record and sequence of experimental events are read into the computer, and values of w and fJ are computed that max­imize the likelihood of the particular response protocol. Some preliminary work indicates that Case I will have very restricted applicability. Conse­quently, a similar procedure for Case II is being developed that permits the simultaneous estimation of w, a1, and a". Other procedures for estima­tion, including a pseudo maximum-likelihood method suggested by Suppes, are also being investigated.

From a psychological standpoint there is one additional comment to be made about applications of the model. This involves a prediction, suggested by the work of Wyckoff [20], of the rate of change in perceptual and con­ditioning states. If the perceptual and conditioning processes are viewed as response mechanisms, then we would expect, in terms of a gradient of re­inforcement [12], that the response temporally closest to the reinforcing event should be acquired most rapidly. Thus we might suspect that the rate of change in the conditioning states would be greater than in the per­ceptual states. For situations in which this analysis is correct, Case I is definitely not applicable. However, Case II may represent a good approxima­tion when a1 < a,. More generally, such an analysis would suggest that r1 < r• for Case III.

In conclusion, it appears that the model generates interesting predictions regarding both reinforcement schedules and similarity between discriminanda. No rigorous attempt has been made to test the theory for the special cases considered. Nevertheless, qualitatively it appears that the model accounts

1 THEORY OF STIMl

for some aspects of traditional extended without modification plex stimulus and reinforceme1

Listed below are the transit terms are indicated. Interpret trial n + 1, given state i on tr

Pu = a + c + (b + d)ro Pu = dr. P1.1 = br. Pl.9 = (b + d)n Pu• = dra P1.15 =bra p •. l =cr. P2.2 = a + (b + c)ro + dCr1 + ro) Pu = d(ra + rz) P2.s = P1.1 P•.9 =era P•.1o = (b + c)n P•.1e = P1.u Pa.l = (a+ c)(ra + r:) Pa.a =(a+ c)(rJ + ro) + (b +d) Pu=Pu Pa.1 = Pu Ps.n = P1.9 Ps.12 = P1.12 Ps.l5 = P:.1e Pu = P2.1 Pu = a(ra + r.) Pu = a(r1 + ro) + (b + c)ro + t p4,8 = pl,7

P4.9 = P•.9 P4.12 = P•.1o P4.18 = P•.1e P5.1 = ar. P5.5 = (a + d)ro + b(n + ro) +' Pu = b(ra + r:) P5.s = Pu P5.9 = ara P5.1s =(a+ d)n p5,18 = Pl.U Po.:= P5.1 Pe.5 = Pu Pe.e = (a+ c)ro + (b + d)Cr1 +

SON

~of Case II. Here we observe I SI) is generally greater than •..,(AI I SI) for small values of m ase II. That is, if rri = rr: = h greater than rr except when

1rrently in progress. The re­tts and human subjects. The

stimulus and reinforcement i to manipulate the values of :onducted using two-bar Skin­thereby permitting a discrete r automatic, and the animal's to I.B.M. cards for analysis.

ting parameters are being ex­lllill be necessary to estimate e stimuli SI and S, are com­rolve estimating w and fJ. A

out for this case and is pro­er for the I.B.M. 709 computer. }Uence of experimental events tnd fJ are computed that max­! protocol. Some preliminary :stricted applicability. Conse­ng developed that permits the Other procedures for estima­method suggested by Suppes,

one additional comment to be tvolves a prediction, suggested change in perceptual and con­::ming processes are viewed as in terms of a gradient of re­

:Iy closest to the reinforcing s we might suspect that the ld be greater than in the per­; analysis is correct, Case I is y represent a good approxima-analysis would suggest that

nerates interesting predictions tilarity between discriminanda. te theory for the special cases ~ears that the model accounts

THEORY OF STIMULUS DISCRIMINATION LEARNING 239

for some aspects of traditional types of discrimination learning and can be extended without modification to discrimination tasks involving more com­plex stimulus and reinforcement schedules.

APPENDIX

Listed below are the transition probabilities for Case III; only non-zero terms are indicated. Interpret Pt.J as the probability of being in state j on trial n + 1, given state i on trial n.

Pl.l = a + c + (b + d)ro Pu = dr2 Pu = br: P1.9 = (b + d)n PI.n = dn PI.I5 =bra P2.I = cr2 P2.2 =a+ (b + c)ro + d(ri + ro) Pu = d(n + r2) P2.s = Pu P:.9 =en P:.Io = (b + c)n P:.Ia = P1.I• Pa.I = (a+ c)(ra + r2) Pa.a = (a+ c)(TI + ro) + (b + d)ro Pa.f = PI.f Pa.1 = Pu Pa.n = PI.9 Pa.I2 = PI.u Pa.Io = P:.Ia pf,I = P2.I Pu = a(n + r2) Pu = a(ri + ro) + (b + c)ro + d p,,s = PI,7 p,,9 = P:.9 Pf.I2 = P:.IO Pf.Ie = P:.Ia Po.I = ar2 P•.• =(a+ d)ro + b(ri + ro) + c Pu = b(ra + r2) P•.s = PI.4 Po.9 = ara Po.Ia = (a + d)n Po.Ia = PI.I2 Po.z = P6.I Po.6 = P4.I p6.8 = (a + c)ro + (b + d)(TI + ro)

Po.s = (b + d)(ra + r2) Po.IO = Po.9 Po.Ia = P4.9 Po.u = (a+ c)ri P1.I = P•.I Pu = c(ra + r:) Pr. 1 = (a + d)ro + b + c(n + ro) P1.s = P•.s Pr.9 = Po.9 P1,u = P6.1a Pr.1o = Ps.Io Ps.: = P1.1

Ps.o = Po.s Ps.s = (a + c)ro + b + d Ps.Io = Pe.Io Ps.Ia = Pu.Ia Ps.Io = Po.u PD. I = Pa.n P9,4 = Ps.Io P9.7 = PI.Is PD.9 = a + c + (b + d)ro P9.12 = Ps.s P9.I6 = PI.7 PlO.I = CW2T3

Pio.: = [b + cw2 + d(1- w:)]n P1D.4 = d(1 - w2)ra Pio.s = PI.B P10.D = c[m2r2 + (1 - m2)(ra + r2)] P10.Io =a+ bro

+ c[w2ro + (1 - w:)(ri + ro)] + d[w:(Tl + ro) + (1 - m:)ro]

P10.u = d[w2Cra + r2) + (1- w2)r:] P10.Ia = P9.I5 Pn.I = [a(1- WI)+ c(1- m:)]ra P1,1a = [a(1- WI)+ bwi + c(1- m2)

+ dw2]TI

Pn.f = dw:ra

240 RICHARD C. ATKINSON

Pn.1 = b(J)Jn

Pn.9 = a[w1Cra + r.) + (1- w1)r.J + c[w.(ra + r.) + (1-w.)r.]

Pn.n = a[wl(Tl + ro) + (1- wl)ro] + b[w1ro + (1- w1)Cr1 + ro)] + c[w.(rl + ro) + (1- w.)ro] + d[w.ro + (1 - w.)(rl + ro)]

Pn.12 = d[w.r. + (1- w.)(ra + r.)] Pn.l5 = b[w1r2 + (1- w1)Cra + r.)] Pl2,1 = PB.l3 P12.2 = a(1 - wl)ra P12.4 = [a(1 - w1) + bw1 + clr1 P12.B = Pn.7 pl2,9 = p4,1 P12.1o = a[w1Cra + r.) + (1- wl)r.] P12.12 = a[(rl + ro)wl + (1- wl)ro]

+ b[W1T0 + (1 - W1)(71 + To)) + cro + d

P12.1o = Pn.l5 P1a.1 = aw1ra P1u = [aw1 + b(1 - w1) + d]r1 P1a. 1 = b(1 - w1)ra PH.s = dra PH.D = a[w1r2 + (1- w1)Cra + r.)] PH.la = a[w1ro + (1 - w1)Cr1 + ro)]

+ b[wl(TI + ro) + (1 - wl)ro] + c + dro

PH.l5 = b[w1Cra + r.) + (1 - w1)7•] Pl3,16 = P5.B

Pu.• = P1a.1 Pu.5 = cw.ra Pu.o = [aw1 + b(1 - w1) + cw2

+ d(1 - w.)]rl Pu.s = [b(1 - w1) + d(1 - w.)]ra Pu.1o = P1a,9 Pu.1a = c[w.r. + (1 - w.)(ra + r.)] Pu.u = a[w1ro + (1- w1)Cr1 + ro)]

+ b(wl(Tl + To) + (1 - Wl)To] +c[w.ro + (1- w.)(n + ro)] + d[(JJ•(Tl + ro) + (1 - w.)ro]

Pu.l6 = b(wl(Ta + rz) + (1- wl)r.] + d(w2(7a + 72) + (1 - Wz)T2]

P1s.1 = Ps.9 P1s,5 = c(1 - w.)ra Pl5. 1 = [a + c(1 - wz) + dw.]rl P1s.s = dwzra P1s.9 = Ps.l pl5,13 = c[w.(ra + r.) + (1 - w.)r.] P15.1s = aro +b.

+ C(w2(71 + To) + (1 - Wz)To] + d[w.ro + (1 - wz)(rl + ro)]

P1s.16 = Pn.12 P16.2 = Ps.9 P1o.s = P2.D P1o.s =(a+ c)rl P16.10 = Ps.l pl6,13 = p6,5

P16.l6 =(a+ c)ro + b + d

REFERENCES

[1) ATKINSON,R. C. "A Markov Model for Discrimination Learning," Psychomet·rika, 23 (1958), 309-22.

[2) ATKINSON, R. C., W. H. BOGARTZ, and R.N. TURNER. "Supplementary Report: Discrimination Learning with Probabilistic Reinforcement Schedules," Journal of Experimental Psychology, 51 (1959), 349-50.

[3) BURKE, C. }., and W. K. ESTES. "A Component Model for Stimulus Variables in Discrimination Learning," Psychometrika, 22 (1957), 133-45.

[4) BUSH, R. R., and F. MOSTELLER. "A Model for Stimulus Generalization and Discrimination," Psychological Review, 58 (1951), 413-23.

[5) ESTES, W. K. "Theory of Learning with Constant, Variable, or Contingent Probabilities of Reinforcement," Psychometrika, 22 (1957), 113-32.

[6) ESTES, W. K. "Of Models and Men," The American Psychologist, 12 (1957), 609--17.

[7) ESTES, W. K. "Component and Pattern Models with Markovian Interpretations," in R. Bush and W. K. Estes, eds., Studies in Mathematical Learning Theory, Stanford, Calif.: Stanford University Press, 1959.

THEORY OF STIM1

[8) ESTES, W. K., and C. J. BUl Discrimination Learning in ll ogy, 50 (1955), 81-88.

[9) ESTES, W. K., C. ]. BURKE, bilistic Discrimination Learni1 233-39.

[10] ESTES, W. K., and P. SUPPE W. K. Estes, eds., Studies i1 Stanford University Press, 1!

[11] GREEN, E. J. "A Simplifiecl Review, 65 (1958), 56-63.

[12] HULL, C. L. Principles of E [13) LABERGE, D. L., and A. SMJ

ing," Journal of Experime1 [14] POPPER, J., and R. C. ATKU

tioning Situation," JournnJ, • [15) RESTLE, F. "A Theory of

(1955), 11-19. [16] RESTLE, F. "Axioms of a '

20 (1955), 201-8. [17] RESTLE, F. "A Theory of

Psychological Review, 64 (I [18] RESTLE, F. "A Survey and

W. K. Estes, eds., Studies· Stanford University Press, 1

[19] SUPPES, P., and R. C. ATKl Situations, I. The Theory,' Applied Mathematics and St

[20] WYCKOFF, L. B., JR. "Th1 ha v ior," Psychological Revi

~SON

~13.1

:ro2rs aw1 + b(1 - ro1) + cw2 + d(1 - w2)lr1

b(1 - ro1) + d(1 - w,)]rs pl3,9

c[ro2r2 + (1 - ro2)Crs + r2)] a(WiTO + (1 - WJ)(Tl + To)] + b[wh1 + ro) + (1 - wJ)ro] +c[a12ro + (1 - w2)Cr1 + ro)] + d[(JJ2(TJ + ro) + (1 - w2)ro]

b[(JJJ(rs + r,) + (1 - w1)r2l + d[(J)hs + r,) + (1 - ro2)r,]

1;,9

·(1 - (JJ,)rs

a + c(1 - w,) + dw2Jr1 fro2rs >5,1

c[ro2Crs + r2) + (1 - w2)r2l aro +b. + c[w.(r1 + ro) + (1 - w,)ro] + d[w2ro + (1 - w2)Cn + ro)] Pu.u 1;,9

12,9

a+ c)ri Pu Pu (a + c)ro + b + d

ination Learning," Psychometrika,

'URNER. "Supplementary Report: .forcement Schedules," Journal of

.ent Model for Stimulus Variables ~2 (1957), 133-45. for Stimulus Generalization and

51), 413-23. :onstant, Variable, or Contingent a, 22 (1957), 113-32. 1merican Psychologist, 12 (1957),

s with Markovian Interpretations," : Mathematical Learning Theory, 59.

J

THEORY OF STIMULUS DISCRIMINATION LEARNING 241

[8] ESTES, W. K., and C. J. BURKE. "Application of a Statistical Model to Simple Discrimination Learning in Human Subjects," Journal of Experimental Psychol· ogy, 50 (1955), 81-88.

[9] ESTES, W. K., C.}. BURKE, R. C. ATKINSON, and }. P. FRANKMANN. "Proba· bilistic Discrimination Learning," Journal of Experimental Psychology, 54 (1957), 233-39.

[10] ESTES, W. K., and P. SUPPES. "Foundations of Linear Models," in R. Bush and W. K. Estes, eds., Studies in Mathematical Learning Theory, Stanford. Calif.: Stanford University Press, 1959.

[11] GREEN, E. }. "A Simplified Model for Stimulus Discrimination," Psychological Review, 65 (1958), 56-63.

[12] HULL, C. L. Principles of Behavior, New York: Appleton-Century-Crofts, 1943. [13] LABERGE, D. L., and A. SMITH. "Selective Sampling in Discrimination Learn­

ing," Journal of Experimental Psychology, 54 (1957), 423-30. [14] POPPER, J., and R. C. ATKINSON. "Discrimination Learning in a Verbal Condi­

tioning Situation," Journal of Experimental Psychology, 56 (1958), 21-25. [15] RESTLE, F. "A Theory of Discrimination Learning," Psycjwlogical Review, 62

(1955), 11-19. [16] RESTLE, F. "Axioms of a Theory of Discrimination Learning," Psychometrika,

20 (1955), 201-8. [17] RESTLE, F. "A Theory of Selective Learning with Probable Reinforcements,"

Psychological Review, 64. (1957), 182-91. [18] RESTLE, F. "A Survey and Classification of Learning Models," in R. Bush and

W. K. Estes, eds., Studies in Mathematical Learning Theory, Stanford, Calif.: Stanford University Press, 1959.

[19] SUPPES, P., and R. C. ATKINSON. "Markov Learning Models for Multiperson Situations, I. The Theory," Technical Report No. 21, Contract Nonr 225(17), Applied Mathematics and Statistics Laboratory, Stanford University, 1959.

[20] WYCKOFF, L. B., }R. "The Role of Observing Responses in Discrimination Be­havior," Psychological Review, 59 (1952), 431-42.