Chapter 8 Instrumental Conditioning: Learning the Consequences of Behavior.

8.1 Behavioral Processes

• The “Discovery” of Instrumental Conditioning

• Components of the Learned Association

• Putting It All Together: Building the S–R–C Association

• Learning and Memory in Everyday Life— The Problem with Punishment

• Choice Behavior

The “Discovery” of Instrumental Conditioning

• Instrumental conditioning—developing a contingency between response and outcome.

• Organism learns to make responses to obtain or avoid important consequences (e.g., trained circus animals, waterskiing squirrels).

Photo credit: AP/Wide World Photos

Free-Operant Learning

• Operant conditioning—a type of instrumental learning.

• Skinner’s free-operant paradigm:
  ◦ Replaces Thorndike’s discrete trials.
  ◦ Learner can operate the apparatus (e.g., Skinner box) at will.
  ◦ Learner’s trial-independent responses are measured with a cumulative recorder.
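
What a cumulative recorder measures can be sketched in a few lines of Python. This is an illustration only, not a description of Skinner's apparatus: the function names and the lever-press times are hypothetical. The record is the running count of responses over time, so a steeper rise means a higher response rate.

```python
from bisect import bisect_right


def cumulative_record(response_times, sample_times):
    """Return the number of responses made up to each sample time."""
    ordered = sorted(response_times)
    return [bisect_right(ordered, t) for t in sample_times]


def response_rate(response_times, start, end):
    """Responses per unit time in [start, end): the slope of the record."""
    if end <= start:
        raise ValueError("end must be greater than start")
    count = sum(1 for t in response_times if start <= t < end)
    return count / (end - start)


# Hypothetical lever-press times in seconds (not data from the slides).
presses = [1.0, 2.5, 3.0, 3.2, 3.4, 9.0]
record = cumulative_record(presses, [0, 2, 4, 6, 8, 10])
burst_rate = response_rate(presses, 2, 4)
```

Flat stretches of the record correspond to pauses in responding; the burst between 2 s and 4 s shows up as the steepest segment.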

Operant Conditioning

Free-Operant Learning

• Reinforcement—consequences increase behavior probability.

• Punishment—consequences decrease behavior probability.

Components of the Learned Association

• Three components to instrumental conditioning:
  ◦ Stimulus (S)
  ◦ Response (R)
  ◦ Consequence (C)

Stimulus

• A discriminative stimulus is a cue, not a US or CS: a signal for when a response will lead to a consequence.
  ◦ Example: starting whistle for racing swimmers.
  ◦ Example: potty seat for a toilet-training toddler.
• Can increase behavior probability, but does NOT elicit behavior.

Response: Shaping

• Shaping: successive approximations to the desired response are reinforced.
  ◦ Collect baseline data on current behavior (establish operant level).
  ◦ Identify target behavior.
  ◦ Reinforce successive approximations of the target response.
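
The shaping steps above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not a training protocol: it assumes each response can be scored as a single number, and the tolerance rule, the function name `shape`, and the example values are all hypothetical. A response is reinforced when it falls within the current tolerance of the target, and the tolerance tightens after each reinforcement.

```python
def shape(responses, target, start_tolerance, step):
    """Reinforce responses close enough to the target, tightening the criterion.

    Returns a list of (response, tolerance_in_effect, reinforced) tuples.
    """
    tolerance = start_tolerance
    log = []
    for response in responses:
        reinforced = abs(response - target) <= tolerance
        log.append((response, tolerance, reinforced))
        if reinforced:
            # Demand a closer approximation next time (never below zero).
            tolerance = max(0, tolerance - step)
    return log


# Hypothetical response scores drifting toward a target of 10.
history = shape([2, 5, 6, 8, 7, 9, 10], target=10, start_tolerance=6, step=2)
reinforced_flags = [entry[2] for entry in history]
```

In this run the early, rough approximations (5, 6, 8) earn reinforcement, but once the criterion has tightened only the exact target response does.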

Response: Shaping

• Example: Helping autistic children learn language (Ivar Lovaas, 1987).
  ◦ Say the target word (e.g., child’s name).
  ◦ Reinforce any sound with food, then only closer imitations.
  ◦ Introduce new words.

Response: Chaining

• Chaining: learning a complicated sequence of responses by adding one discrete “link” (step) at a time.
  ◦ Backward chaining: training the steps in reverse order.
• Examples:
  ◦ Teaching pets unusual tricks.
  ◦ Teaching workers a sequential manufacturing process.
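
The order of training in backward chaining can be made concrete with a short sketch. The function name and the three-step task are hypothetical; the only idea taken from the slide is that the last link is trained first and earlier links are added one at a time.

```python
def backward_chaining_stages(steps):
    """Return the sub-sequence practiced at each training stage.

    Stage 1 is the final step alone; each later stage adds the preceding step.
    """
    return [steps[i:] for i in range(len(steps) - 1, -1, -1)]


# Hypothetical three-step manufacturing task.
task = ["pick up part", "attach part", "inspect part"]
stages = backward_chaining_stages(task)
```

Every stage ends with the final step, so each practice run finishes at the point where the consequence is delivered.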

Skinner

• http://www.youtube.com/watch?v=mm5FGrQEyBY

Consequence: Primary Reinforcers

• Reinforcer—behavioral consequence that makes future behavior more likely.

• Primary reinforcers: events that are reinforcing because of their natural characteristics and inherent ability to reinforce behavior (drive reduction theory).
  ◦ Examples: food or water, sleep, sex.

Consequence: Secondary Reinforcers

• Secondary (conditioned) reinforcers: events that function as reinforcers because they are consistently associated with one or more primary reinforcers.
  ◦ Example: money.
  ◦ No biological imperative.
  ◦ Can be exchanged for primary reinforcers (e.g., food or shelter).

Consequence: Punishers

• Punishers: behavioral consequences that lead to a reduction of future behavior.
• Strong and enduring aversive stimuli are the most effective suppressors.
  ◦ Aversive stimuli of low intensity may reinforce the behavior we intend to suppress!
  ◦ Apply aversive stimuli immediately after the targeted behavior.
  ◦ Delaying the punisher decreases contingency.

The Negative Contrast Effect

Data from Kobre and Lipsitt, 1972.

Learning and Memory in Everyday Life— The Problem with Punishment

• Use of corporal punishment is controversial.

• Alternatives:
  ◦ Scolding
  ◦ Time-out
  ◦ Grounding
  ◦ Withholding allowance

• Avoid attention for punished behavior.

• Reinforce appropriate behavior.

Putting It All Together: Building the S–R–C Association

• Timing affects learning:
  ◦ Immediate consequence = best learning.
  ◦ Instrumental conditioning is faster if the R–C interval is short (temporal contiguity).
• Timing can also impact:
  ◦ Punishment: immediate punishment is more effective than delayed punishment.
  ◦ Self-control: forgoing an immediate reward for a greater future reward.
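
One way to illustrate the self-control trade-off is hyperbolic delay discounting, V = A / (1 + kD). This model is not on the slides and is used here only as an assumption; the discount rate `k`, the function names, and the reward amounts and delays are hypothetical. The sketch shows how a reward loses value with delay, and how a lower discount rate favors the larger, later reward.

```python
def discounted_value(amount, delay, k):
    """Hyperbolic discounting (assumed model): value = amount / (1 + k * delay)."""
    if delay < 0 or k < 0:
        raise ValueError("delay and k must be non-negative")
    return amount / (1.0 + k * delay)


def preferred(small_soon, large_late, k):
    """Compare two (amount, delay) options and name the one with higher value."""
    v_small = discounted_value(small_soon[0], small_soon[1], k)
    v_large = discounted_value(large_late[0], large_late[1], k)
    return "large_late" if v_large > v_small else "small_soon"


# Hypothetical choice: 5 units now versus 10 units after a delay of 20.
steep = preferred((5, 0), (10, 20), k=0.1)     # steep discounting
shallow = preferred((5, 0), (10, 20), k=0.01)  # shallow discounting
```

With the steep rate the delayed reward is worth less than the immediate one, so the immediate reward wins; with the shallow rate the larger delayed reward is preferred, which is the pattern the slide calls self-control.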

8.1 Interim Summary

• Instrumental conditioning = learning a three-way association (S → R → C) between:
  ◦ Discriminative stimulus (S)
  ◦ Response (R)
  ◦ Consequence (C); C may be a reinforcer or punisher.
• In instrumental conditioning, C occurs only if R is made, whereas in classical conditioning, the consequence (US) occurs automatically after the stimulus (CS).

8.1 Interim Summary

• Four classes of instrumental conditioning:
  ◦ Positive reinforcement
  ◦ Negative reinforcement
  ◦ Positive punishment
  ◦ Negative punishment
• “Negative” and “positive” show whether the consequence is subtracted or added.
• “Reinforcement” and “punishment” show whether the response increases or decreases with learning.
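
The two-by-two naming scheme above can be written as a small lookup. The function name and the string labels ("added", "removed", "increases", "decreases") are hypothetical choices made for this sketch.

```python
def classify(consequence_change, response_change):
    """Name the class of instrumental conditioning.

    consequence_change: "added" or "removed" (is the consequence added or subtracted?)
    response_change: "increases" or "decreases" (what happens to the response?)
    """
    if consequence_change not in ("added", "removed"):
        raise ValueError("consequence_change must be 'added' or 'removed'")
    if response_change not in ("increases", "decreases"):
        raise ValueError("response_change must be 'increases' or 'decreases'")
    sign = "positive" if consequence_change == "added" else "negative"
    kind = "reinforcement" if response_change == "increases" else "punishment"
    return f"{sign} {kind}"


# Withholding allowance: something is removed and the behavior decreases.
example = classify("removed", "decreases")
```

Note that "negative" describes removal of the consequence, not whether the outcome is unpleasant, so negative reinforcement still increases responding.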

8.1 Interim Summary

• Operant conditioning: a subclass of instrumental conditioning in which the organism responds at its own rate.
• Complex responses may be trained by:
  ◦ Shaping: reinforcement of progressive approximations.
  ◦ Chaining: training a sequence of responses, one step at a time.

8.2 Brain Substrates

• The Basal Ganglia (BG) and Instrumental Conditioning

• Mechanisms of Reinforcement in the Brain

BG and Instrumental Conditioning

• BG help connect information from the sensory and motor cortices to make a behavioral response.

• BG may serve as storage for S–R associations (especially those in which R is a movement).

• With BG lesions (in dorsolateral striatum):
  ◦ Rats still learned to lever-press for food.
  ◦ But they were impaired in training with a discriminative stimulus (S).

Basal Ganglia

Reinforcement in the Brain: This figure shows that instrumental learning may involve the interaction of several neural systems.

Electrical Brain Stimulation

• One of the “pleasure centers” is the ventral tegmental area (VTA) in the brainstem.
  ◦ The VTA is the center for dopamine neuromodulation.
• VTA stimulation = powerful reinforcement

Electrical Brain Stimulation: Brain stimulation may directly activate the brain's “reinforcement” system, eliminating the need for natural reinforcers (e.g., food).

[Figure labels: Stimulus S (sight of lever); Response R (press lever); Consequence C (electrical brain stimulation); visual system (e.g., visual cortex); motor system (e.g., basal ganglia); taste system (e.g., brainstem gustatory nuclei); reinforcement system; “Hungry?”]

Dopamine and Reinforcement

• Some VTA axons extend to the nucleus accumbens in BG.
  ◦ Nucleus accumbens sends dopamine to motor areas in the striatum.
• Dopamine may be the physiological basis for the “wanting” aspect of reinforcement.
  ◦ “Motivation” or “wanting” in chemical form.
  ◦ May contribute to addictive behavior.

Reward Prediction by Dopamine Neurons

• Schultz (2002) trained monkeys to press a lever for food.

• Electrophysiological recordings indicate that dopamine neurons in a monkey’s midbrain signal reward (or omission of reward).

Reward Prediction by Dopamine Neurons

• In the study:
  ◦ Dopamine neurons in a monkey’s midbrain respond strongly after unexpected rewards.
  ◦ If a light occurs before food, dopamine neurons increase activation after the light, but not after the food.
  ◦ Dopamine neurons decrease activity after an expected reward does NOT occur (omission).
• Illustrates the reward prediction hypothesis, i.e., dopamine is involved in predicting future reward.
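
The three outcomes at reward time can be mimicked with a simple prediction-error calculation. This is a sketch under stated assumptions, not the analysis used in the study: it assumes a delta-rule update (error = reward received minus reward predicted), and the function name, learning rate, and trial count are hypothetical. It models only the response at the time of reward, not the shift of activity to the light.

```python
def train_prediction(n_trials, reward=1.0, learning_rate=0.2, prediction=0.0):
    """Repeatedly pair a cue with reward; return (final prediction, per-trial errors)."""
    errors = []
    for _ in range(n_trials):
        error = reward - prediction       # prediction error on this trial
        errors.append(error)
        prediction += learning_rate * error
    return prediction, errors


learned, errors = train_prediction(50)

unexpected_reward = 1.0 - 0.0      # no prediction yet: large positive error
predicted_reward = 1.0 - learned   # well-learned cue: error near zero
omitted_reward = 0.0 - learned     # expected reward withheld: negative error
```

The signs line up with the slide: a positive error for an unexpected reward, almost none for a fully predicted reward, and a negative error when a predicted reward is omitted.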

[Figure panels: (A) unexpected juice reward; (B) reward is predicted by light stimulus; (C) predicted reward is omitted. Adapted from Schultz, 2002.]

Opioids and Hedonic (Liking) Value

• Endogenous opioids (endorphins) may mediate “liking.”

• Opiates (heroin, morphine) bind to the brain’s natural opiate receptors.

• Opiates may provide information about “liking” that helps stimulate VTA’s “wanting” system.

8.2 Interim Summary

• In the brain, instrumental S–R–C associations may be stored in corticocortical connections and via the basal ganglia.

• The brain’s reinforcement system may include release of dopamine from the ventral tegmental area to the basal ganglia.

• Drugs that interfere with the dopamine system disrupt instrumental conditioning.

8.2 Interim Summary

• Several hypotheses on the interaction of dopamine and reinforcement:
  ◦ Anhedonia hypothesis: dopamine gives reinforcers their “goodness.”
  ◦ Incentive salience hypothesis: dopamine modulates “wanting” rather than “liking” (how hard an organism is willing to work for reinforcement).
  ◦ Reward prediction hypothesis: dopamine signals whether reinforcement is expected.

8.2 Interim Summary

• Whereas dopamine may be involved in “wanting,” endogenous opioids may be involved in “liking.”
  ◦ Drugs that affect brain opiate receptors affect the hedonic (“goodness”) value of primary reinforcers and punishers (e.g., food and pain).

8.3 Clinical Perspectives

• Drug Addiction

• Behavioral Addiction

• Treatments

Drug Addiction

• Pathological addiction: a strong habit maintained despite harmful consequences.
  ◦ Involves craving a “high” (euphoria) and avoiding withdrawal.
  ◦ Seeking pleasure involves positive reinforcement.
  ◦ Avoiding pain involves negative reinforcement.

• As indicated by the incentive salience hypothesis, dopamine is involved in “wanting” a drug.

Effects of Drugs on Dopaminergic Neurons

Behavioral Addiction

• Behavioral addiction: addiction to certain behaviors, rather than drugs.
  ◦ Produces euphoria.
  ◦ Understanding drug addiction may help understand/treat behavioral addictions.
• Examples: compulsive gambling, eating, sex, Internet use, shopping, exercise, work.

Photo credit: Everynight Images/Alamy

Behavioral Addiction

• http://www.youtube.com/watch?v=Bz2VT5Ky7Kw

Treatments

• Naltrexone (drug) treatment:
  ◦ Blocks opiate receptors and so indirectly inhibits dopamine production; may help treat heroin addicts and compulsive gamblers.
• (Cognitive) behavior therapies:
  ◦ E.g., extinction, distancing, reinforcement of alternative behaviors, delayed reinforcement.
  ◦ Based on instrumental conditioning principles.

8.3 Interim Summary

• Addictive drugs (e.g., heroin, caffeine) may hijack the brain’s reinforcement system.
  ◦ Addiction may be psychological as well as physiological.

• Behavioral addictions may reflect same brain processes as drug addictions.

8.3 Interim Summary

• Treatment for people with addictions may include:
  ◦ Cognitive therapy
  ◦ Medication
  ◦ Behavioral therapy, including principles learned from instrumental conditioning.