Rescorla, R. 1985, Pavlovian Conditioning: It's Not What You Think It Is
Time and Causality: A theory of learning
What is associative learning for?
How does Rescorla Wagner do? How does it fail?
Wagner’s time-based theory of learning
Applications
What is associative learning for?
Learning about causality: Tone --> food
Learning about features of stimuli - what goes with what
[Diagram: a pastry and its features: juicy, nice, fruit, warm]
If you want to design a model to learn about causality, what should it be like?
Directionality: Cause --> effect, or effect --> Cause
Sensitivity to delay between Cause and effect
Sensitive to correlation
Learning about predictable outcomes?
What possible rules are there for forming associations?
Would a pure contiguity model have the properties we want? (e.g. Hebb)
$\Delta V = \alpha\beta(\lambda)$
Direction: no
Delay: yes
Correlation: no
Predictable outcomes: no
Rescorla Wagner avoids some of these problems:
$\Delta V = \alpha\beta(\lambda - \Sigma V)$
The bracketed term measures how surprising the US is
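A minimal sketch of how the two rules differ over repeated CS-US pairings with a single CS. The function names and the parameter values (alpha = 0.3, beta = 1, lambda = 1) are illustrative assumptions, not taken from the lecture.

```python
def contiguity(trials, alpha=0.3, beta=1.0, lam=1.0):
    """Pure contiguity: every pairing adds the same increment."""
    v, history = 0.0, []
    for _ in range(trials):
        v += alpha * beta * lam
        history.append(v)
    return history


def rescorla_wagner(trials, alpha=0.3, beta=1.0, lam=1.0):
    """Rescorla Wagner: the increment shrinks as V approaches lambda."""
    v, history = 0.0, []
    for _ in range(trials):
        surprise = lam - v  # the bracketed term (one CS, so the sum is just V)
        v += alpha * beta * surprise
        history.append(v)
    return history


hebb_curve = contiguity(10)
rw_curve = rescorla_wagner(10)
print("contiguity:     ", [round(v, 2) for v in hebb_curve])
print("Rescorla Wagner:", [round(v, 2) for v in rw_curve])
```

Under these assumptions the contiguity curve keeps growing by the same step, while the Rescorla Wagner curve levels off below lambda as the US becomes less surprising.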
Rescorla Wagner thus allows selective learning about surprising outcomes
It can also explain sensitivity to correlation, because the context is treated as a cue that competes with the tone:
context ---> food    context+tone ---> food
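The correlation effect can be sketched by treating the context as a second cue in the Rescorla Wagner rule. This is an illustrative simulation only: the alternating trial schedule, the shared learning rate for reinforced and nonreinforced trials, and the names `train`, `correlated` and `uncorrelated` are assumptions.

```python
def train(schedule, cycles=100, alpha=0.2, beta=1.0):
    """Apply the Rescorla Wagner rule with a summed prediction.

    schedule: list of (cues_present, lam) trial types, run in order each cycle.
    """
    v = {"context": 0.0, "tone": 0.0}
    for _ in range(cycles):
        for cues, lam in schedule:
            error = lam - sum(v[c] for c in cues)  # surprise given all cues present
            for c in cues:
                v[c] += alpha * beta * error
    return v


# Tone correlated with food: food only arrives when the tone is on.
correlated = train([(("context",), 0.0), (("context", "tone"), 1.0)])
# Tone uncorrelated with food: food is just as likely without the tone.
uncorrelated = train([(("context",), 1.0), (("context", "tone"), 1.0)])

print("correlated:  ", {k: round(x, 3) for k, x in correlated.items()})
print("uncorrelated:", {k: round(x, 3) for k, x in uncorrelated.items()})
```

When food also arrives in the context alone, the context comes to predict it, so the food on tone trials is no longer surprising and the tone ends up with little strength.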
Rescorla Wagner
Direction: no
Delay: no
Correlation: yes
Predictable outcomes: yes
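The "predictable outcomes" entry can be illustrated with a blocking design run through the Rescorla Wagner rule. This is a sketch under assumed values (alpha = 0.2, beta = 1, lambda = 1, 50 trials per stage); `rw_trial` and `light_strength` are hypothetical helper names.

```python
def rw_trial(v, cues, lam, alpha=0.2, beta=1.0):
    """One Rescorla Wagner trial: all present cues share the same error."""
    error = lam - sum(v[c] for c in cues)
    for c in cues:
        v[c] += alpha * beta * error


def light_strength(stage1_trials, stage2_trials=50):
    """Strength acquired by the light after tone-->food, then tone+light-->food."""
    v = {"tone": 0.0, "light": 0.0}
    for _ in range(stage1_trials):
        rw_trial(v, ("tone",), 1.0)
    for _ in range(stage2_trials):
        rw_trial(v, ("tone", "light"), 1.0)
    return v["light"]


blocked = light_strength(stage1_trials=50)  # food already predicted by the tone
control = light_strength(stage1_trials=0)   # food surprising in stage 2
print("blocked:", round(blocked, 4), "control:", round(control, 4))
```

With the tone pretrained, the food in stage 2 is fully predicted and the light gains almost nothing; without pretraining, tone and light share the available strength.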
Rescorla Wagner cannot explain why backward conditioning should not work, and cannot easily explain the effect of trace intervals.
This is because there is nothing in the Rescorla Wagner equation that refers to time,
and time is the essence of causality
Wagner’s SOP (1981)
Sometimes Opponent Process Theory
incorporates time, by basing itself on the idea that processing of a stimulus can vary:
as a function of time (cf trace decay in STM)
as a function of recent events
stimulus processing is reduced if:
the same stimulus has just been presented (self-generated priming)
a predictor (CS) for the stimulus has just been presented (retrieval-generated priming)
General Assumptions
Stimulus represented as a set of elements, some of which may be activated by stimulus presentation.
Elements may be inactive, or in one of two states:
A1 is a primary state of limited capacity (corresponding to rehearsal/STM)
A2 is a secondary state of activation.
Differences between A1 and A2:
Response elicited by A2 is often less intense than that elicited by A1; in some cases it is the opposite
When a stimulus is presented, some of its inactive elements enter A1, then gradually decay into A2, and then become inactive again.
inactive --> A1 --> A2 --> inactive
(A1 --> A2 is fast; A2 --> inactive is slow)
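The I --> A1 --> A2 --> I cycle can be sketched as proportions of elements moving between states on each time step. The transition probabilities (`p1`, `pd1`, `pd2`) and their values are assumptions chosen only so that A1 --> A2 is fast and A2 --> I is slow; `sop_step` is a hypothetical helper name.

```python
def sop_step(state, present, p1=0.6, pd1=0.3, pd2=0.05):
    """Advance the proportions of elements in (I, A1, A2) by one time step."""
    inactive, a1, a2 = state
    to_a1 = p1 * inactive if present else 0.0  # I -> A1 only while the stimulus is on
    to_a2 = pd1 * a1                           # A1 -> A2 (fast)
    to_i = pd2 * a2                            # A2 -> I (slow)
    return (inactive - to_a1 + to_i, a1 + to_a1 - to_a2, a2 + to_a2 - to_i)


state = (1.0, 0.0, 0.0)  # all elements start inactive
trace = []
for t in range(45):
    state = sop_step(state, present=(t < 5))  # stimulus on for the first 5 steps
    trace.append(state)

a1_peak = max(range(len(trace)), key=lambda t: trace[t][1])
a2_peak = max(range(len(trace)), key=lambda t: trace[t][2])
print("A1 peaks at step", a1_peak, "and A2 peaks at step", a2_peak)
```

In this sketch A1 activity peaks early and A2 activity peaks later and lingers, which is the time course the later slides rely on.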
How does this model produce self generated priming?
[Diagrams: distribution of the stimulus's elements across I, A1 and A2 at successive moments]
When the stimulus is first presented its elements go into A1, and then quickly decay into A2
Elements cannot go from A2 directly to A1; they must decay to I first
The more elements have accumulated in the A2 state, the fewer are left for the next presentation of the CS to put into A1
So the second presentation produces less A1 activity, and the stimulus is less effective
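Self-generated priming can be sketched by presenting the same stimulus twice and comparing a short gap with a long one. The transition probabilities, the two-step presentations and the names `sop_step` and `second_peak` are illustrative assumptions.

```python
def sop_step(state, present, p1=0.6, pd1=0.3, pd2=0.05):
    """Advance the proportions of elements in (I, A1, A2) by one time step."""
    inactive, a1, a2 = state
    to_a1 = p1 * inactive if present else 0.0  # only inactive elements can enter A1
    to_a2 = pd1 * a1
    to_i = pd2 * a2
    return (inactive - to_a1 + to_i, a1 + to_a1 - to_a2, a2 + to_a2 - to_i)


def second_peak(gap, duration=2):
    """Peak A1 activity on a second presentation starting `gap` steps after the first ends."""
    state = (1.0, 0.0, 0.0)
    for _ in range(duration):  # first presentation
        state = sop_step(state, True)
    for _ in range(gap):       # stimulus absent
        state = sop_step(state, False)
    peak = 0.0
    for _ in range(duration):  # second presentation
        state = sop_step(state, True)
        peak = max(peak, state[1])
    return peak


soon = second_peak(gap=3)
late = second_peak(gap=100)
print("peak A1 after a short gap:", round(soon, 3))
print("peak A1 after a long gap: ", round(late, 3))
```

After a short gap most elements are still sitting in A2, so few are available to enter A1; after a long gap they have returned to I and the stimulus is fully effective again.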
Retrieval-generated priming:
If an associate of the stimulus is presented, then the stimulus's elements are activated directly to the A2 state.
inactive --> A2 --> inactive
Condition Tone ---> Food, and then present Tone; what happens to Food elements?
They go directly into A2, so when food is presented it is less effective
“conditioned diminution of the UR”
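A sketch of retrieval-generated priming for the food (US) elements. The constant priming probability `p2` stands in for the effect of a trained tone and is an assumption, as are the other parameter values and the names `us_step` and `food_a1`.

```python
def us_step(state, p1=0.0, p2=0.0, pd1=0.3, pd2=0.05):
    """One time step for the US (food) elements, held as proportions (I, A1, A2).

    p1: share of inactive elements sent to A1 (food itself is presented)
    p2: share of inactive elements sent straight to A2 (a trained CS is present)
    """
    inactive, a1, a2 = state
    to_a1 = p1 * inactive
    primed = p2 * inactive  # retrieval route: I -> A2, bypassing A1
    to_a2 = pd1 * a1
    to_i = pd2 * a2
    return (inactive - to_a1 - primed + to_i,
            a1 + to_a1 - to_a2,
            a2 + primed + to_a2 - to_i)


def food_a1(tone_first, cs_steps=5):
    """A1 activity produced by food, with or without a trained tone beforehand."""
    state = (1.0, 0.0, 0.0)
    for _ in range(cs_steps):
        state = us_step(state, p2=0.3 if tone_first else 0.0)
    state = us_step(state, p1=0.6)  # food delivered
    return state[1]


signalled = food_a1(tone_first=True)
unsignalled = food_a1(tone_first=False)
print("food A1 when signalled:  ", round(signalled, 3))
print("food A1 when unsignalled:", round(unsignalled, 3))
```

Because the tone has already moved many food elements into A2, fewer are left in I for the food to push into A1, so signalled food produces less A1 activity.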
Differences between A1 and A2:
Learning about A1 and A2 obeys different rules.
In order to form an excitatory association:
--- the CS must be in A1
--- the US must be in A1
--- if the US is in A2 an inhibitory association forms
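These rules can be sketched as a product rule: excitatory learning grows with the overlap of CS elements in A1 and US elements in A1, inhibitory learning with the overlap of CS elements in A1 and US elements in A2. The rate parameters `l_plus` and `l_minus` and the example proportions are assumptions chosen for illustration.

```python
def delta_v(cs_a1, us_a1, us_a2, l_plus=1.0, l_minus=1.0):
    """Net change in CS-->US associative strength at one moment in time."""
    excitatory = l_plus * cs_a1 * us_a1   # CS in A1 while US in A1
    inhibitory = l_minus * cs_a1 * us_a2  # CS in A1 while US in A2
    return excitatory - inhibitory


first_trial = delta_v(cs_a1=0.6, us_a1=0.6, us_a2=0.0)  # surprising US: all in A1
late_trial = delta_v(cs_a1=0.6, us_a1=0.3, us_a2=0.3)   # US split between A1 and A2
extinction = delta_v(cs_a1=0.6, us_a1=0.0, us_a2=0.5)   # no food: US only in A2
print(first_trial, late_trial, extinction)
```

The three calls correspond to net excitatory learning, no net change, and net inhibitory learning respectively.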
How does conditioning happen?
After one trial (tone --> food):
CS mainly in A1, US mainly in A1 --> excitatory learning
How does conditioning stop?
After many trials (tone --> food):
CS mainly in A1, US mainly in A2 --> mix of excitatory and inhibitory learning
How does extinction happen?
Tone --> nothing:
CS mainly in A1, US all in A2 --> inhibitory learning
Inhibitory conditioning:
First establish tone --> food association
then introduce tone+light --> nothing trials
Light: CS mainly in A1, US (food) all in A2 --> inhibitory learning
An inhibitor prevents inactive elements of the US from entering A2.
It will thus interfere with the action of a conditioned excitor, which is trying to put inactive US elements into A2.
So how does this model do all the things that learning about causality would require?
Selective learning about signals for surprising events
Correlation
Delay
Directionality
Blocking: tone --> food, then tone+light --> food
Early Stage 1 [diagram: tone and food element states]
Late Stage 1 [diagram: tone and food element states]
Stage 2 [diagram: light and food element states]
CS mainly in A1, US mainly in A2 --> mix of excitatory and inhibitory learning
Excitatory Conditioning Short ISI
Mainly A1/A1 ---> strong excitatory association
Excitatory Conditioning Medium ISI
Less CS in A1 ---> weaker excitatory association
Excitatory Conditioning Very Long ISI
No CS in A1 ---> no excitatory association
Backward conditioning (food precedes tone)
By the time the CS is in A1, the US has largely decayed into A2 ---> little excitatory learning
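The ISI and backward conditioning cases can be sketched by running CS and US elements through the same state cycle and summing excitatory (A1/A1) and inhibitory (CS A1 / US A2) overlap across one trial with a naive CS. All parameter values, the equal weighting of the two kinds of learning, and the names `step` and `trial_net_change` are assumptions.

```python
def step(state, present, p1=0.6, pd1=0.3, pd2=0.05):
    """Advance the proportions of elements in (I, A1, A2) by one time step."""
    inactive, a1, a2 = state
    to_a1 = p1 * inactive if present else 0.0
    to_a2 = pd1 * a1
    to_i = pd2 * a2
    return (inactive - to_a1 + to_i, a1 + to_a1 - to_a2, a2 + to_a2 - to_i)


def trial_net_change(cs_onset, us_onset, duration=2, steps=80,
                     l_plus=1.0, l_minus=1.0):
    """Net change in CS-->US strength over one trial with a naive CS."""
    cs = us = (1.0, 0.0, 0.0)
    exc = inh = 0.0
    for t in range(steps):
        cs = step(cs, cs_onset <= t < cs_onset + duration)
        us = step(us, us_onset <= t < us_onset + duration)
        exc += cs[1] * us[1]  # CS in A1 while US in A1
        inh += cs[1] * us[2]  # CS in A1 while US in A2
    return l_plus * exc - l_minus * inh


short_isi = trial_net_change(cs_onset=0, us_onset=2)
medium_isi = trial_net_change(cs_onset=0, us_onset=6)
long_isi = trial_net_change(cs_onset=0, us_onset=30)
backward = trial_net_change(cs_onset=2, us_onset=0)
for name, value in [("short", short_isi), ("medium", medium_isi),
                    ("very long", long_isi), ("backward", backward)]:
    print(name, round(value, 4))
```

In this sketch the net change shrinks as the ISI grows, because less of the CS is still in A1 when the US arrives, and it turns negative when the US comes first, because the CS is then in A1 mostly while the US is in A2.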
Further Predictions and Applications
The theory predicts that a US will be processed less effectively when it is predicted. This was tested by Terry and Wagner (1975).
Train
US: CS --> US    no US: CS --> -
(or the opposite)
[Figure: % CRs to CS+ and CS- across sessions; one panel where the US signals reinforcement, one where the US signals nonreinforcement]
After this training (US: CS --> US; no US: CS --> -), train tone --> US and light --> no US
Test: compare
tone --> US: CS ??
light --> US: CS ??
Predicted shock should be less effective than unsignalled shock
Tone trials should be less accurate than light trials
[Figure, redrawn from Terry & Wagner: mean percent correct (deviation from P) against preparatory releaser interval (sec), for CS- and CS+]
Another prediction of the account is that a predicted CS is less effective at evoking its CR than a surprising one -- priming
A --> X --> food
B --> Y --> food
Test CR to X and Y with same combinations and with different combinations:
same: A --> X, B --> Y
different: A --> Y, B --> X
[Figure, from Honey, Hall & Bonardi, 1993: elevation ratio for Same and Different test trials]
Applications 1
Andresen et al (1990) – The scapegoat effect
Suggested that a novel-tasting food, eaten after “normal” food and before chemotherapy (CT), will acquire a strong association and overshadow the association to the normal food
(act as a scapegoat)
This idea appeals to two principles:
(i) conditioning two stimuli together results in less learning than if you condition just one -- overshadowing
(ii) novel stimuli condition better than familiar ones -- latent inhibition
[Diagram: CS CS CS CS +, each presentation occurring in the same context]
Applications 2
Drug addiction and tolerance, e.g. Paletta & Wagner 1986
Response elicited by A2 may be opposite to that elicited by A1
If the UR has two phases, one opposite to the other, it suggests A1 and A2 activity are opposite to each other
e.g. UR to morphine: sedation/hypoactivity (A1 response) followed by hyperactivity (compensatory A2 response)
This means that CSs associated with the drug may produce tolerance to the drug’s effect
Paletta & Wagner 1986
Three groups of animals:
Morphine (distinctive context)
Morphine (home cage)
No drug
Then tested all groups in the distinctive context
Measured activity and sensitivity to pain (tail flick test)
Across several experiments they found evidence of hyperactivity and hyperalgesia in the group that had experienced morphine in a distinctive context: the opposite of the drug’s normal effects
Suggested Reading
Dickinson, A. (1980). Contemporary animal learning theory. Cambridge University Press. (Short, sophisticated but compelling introduction to learning theory written from a causal perspective)
Honey, R.C., Hall, G., & Bonardi, C. (1993). Negative priming in associative learning: Evidence from serial conditioning procedures. Journal of Experimental Psychology: Animal Behavior Processes, 19, 90-97.
A test of Wagner's theory
Marlin, N.A., & Miller, R.R. (1981). Associations to contextual stimuli as a determinant of long term habituation. Journal of Experimental Psychology: Animal Behavior Processes, 7, 313-333.
A test of Wagner's theory
Paletta, M.S., & Wagner, A.R. (1986). Development of context-specific tolerance to morphine: support for a dual process interpretation. Behavioral Neuroscience, 100, 611-623.
Application of Wagner's theory
Terry, W.S., & Wagner, A.R. (1975). Short term memory for "surprising" versus "expected" conditioned stimuli in Pavlovian conditioning. Journal of Experimental Psychology: Animal Behavior Processes, 104, 122-133.
A test of Wagner's theory
Wagner, A.R. (1981). SOP: A model of automatic memory processing in animals. In N.E. Spear & R.R. Miller (Eds.), Information processing in animals: Memory mechanisms (pp. 95-128). Hillsdale, N.J.: Erlbaum.
Wagner’s theory!