Operant applications
Uploaded by david-a-townsend
Operant Applications
Principles of Learning
Applications of Operant Conditioning
Skinner introduced the concept of teaching machines that shape learning in small steps and provide reinforcement for correct responses.
In School
Applications of Operant Conditioning
Reinforcers affect productivity. Many companies now allow employees to share
profits and participate in company ownership.
At work
Applications of Operant Conditioning
At Home
In children, reinforcing good behavior increases its occurrence. Ignoring unwanted behavior decreases its occurrence.
Operant conditioning: Addiction (1)
Drug use is a behaviour that is reinforced by the positive reinforcement that occurs from the pharmacologic properties of the drug.
Operant conditioning: Addiction (2)
Once a person is addicted, drug use is reinforced by the negative reinforcement of removing or avoiding painful withdrawal symptoms.
Behavior Therapy
• Behavior therapy uses learning methods to change abnormal behavior, thoughts and feelings
– Behavior therapists use classical and operant conditioning techniques as well as modeling
– Counterconditioning: learning a new response
• Systematic desensitization: relaxation is paired with a stimulus that formerly induced anxiety
• Aversive conditioning: an unpleasant event is paired with a stimulus to reduce its attractiveness
Counterconditioning
Cognitive Behavior Therapy
• Cognitive therapy assumes that thought patterns can cause a disturbance of emotion or behavior
– Beck’s Cognitive Therapy for Depression
• Depressed mood caused by cognitive distortions
– “Nothing good ever happens to me”
– Ellis’s Rational Emotive Behavior Therapy
• Emotional upset is due to irrational beliefs
– “I must be loved by everyone”
The Cognitive Paradigm
• Cognition involves the mental processes of perceiving, recognizing, judging and reasoning
• The cognitive paradigm focuses on how people structure and understand their experiences and how these experiences are related to past experiences stored in memory
Operant conditioning: Application to CBT techniques
• Functional Analysis – identify high-risk situations and determine reinforcers
• Examine long- and short-term consequences of drug use to reinforce resolve to be abstinent
• Schedule time and receive praise
• Develop meaningful alternative reinforcers to drug use
Gary Wilkes (1994) Animal Trainer
• Elephants:
Dangerous, handling stress sensitive
Calluses build-up (unable to walk)
Cut away with sharp tool
Elephant Manicure
• Violent, aggressive bull
• Calluses not trimmed in 10 years
• Vets cannot touch him
• What to do?
• Large steel gate with hole in corner (size of elephant’s foot)
• Clicker + carrot
• Clicker + approach gate + carrot
• Clicker + lift foot + carrot
• Clicker + move foot to hole
• Etc.
• After training: elephant would voluntarily walk to gate and put foot through
Elephant Manicure
• CS + US
• SHAPING
Self Awareness
• Self-aware: able to observe one’s own behavior
• “I think Joe will quit school” (he is engaged in those types of behaviors)
• I have observed myself engaged in those behaviors (“I think I will quit school”)
• Long-term comas
• Behave as if awake:
Open eyes
Turn heads
Move a hand
Coma = not responsive to environment
Boyle and Greer (1983)
• Reinforced spontaneous behaviors with music
Moved patient
Requested action
Reward = short selection of favorite music
2 sessions a day/ 16 weeks
• Reinforcement
• Outcome (Reward) contingent on behavior
• Cause and effect!
• 33% increased spontaneous movement
1 came out of coma
Norris Edwards: Chapter 8: Wade08.ppt Page: 19
The Problem with Reward
• Misuse of reward ~ rewards must be tied to the behavior we are trying to increase.
• Each of us has had the experience of standing in the checkout line at the market and seeing a child in a shopping cart tempted by the candy and toys on display adjacent to the line.
• When we as parents give in and purchase something to quiet our kids in that situation, what behavior are we actually reinforcing?
Norris Edwards: Chapter 8: Wade08.ppt Page: 20©1999 Prentice Hall
Hidden Cost of Rewards
• Preschoolers played with felt-tipped markers and were observed
• Divided into 3 groups:
– Given markers again and asked to draw
– Promised a reward for playing with markers
– Played with markers, then rewarded
Albert Bandura: Social Cognitive Theory
• Theories that emphasize how behavior is learned and maintained through observation and imitation of others, positive consequences, and cognitive processes such as plans, expectations, and beliefs.
• Observational Learning ~ A process in which an individual learns new responses by observing the behavior of another (a model) rather than through direct experience; sometimes called Vicarious Conditioning.
Skinner (1953) and Verbal Behaviors
• “That itches”
• “That tickles”
• “That hurts”
• Observed behavior:
• Scratching
• Giggling
• Tears and groans
Basic Behavioral Principles
• Antecedent - any stimulus that happens before a behavior (S)
• Behavior - an observable and measurable act of an individual (R)
• Consequence - any stimulus that happens after a behavior (O)
Social-Cognitive Learning Theories
• To this point most American learning theories have maintained the position that most learning can be explained in terms of the behavioral ABCs.
• Antecedents: events preceding the behavior
• Behavior itself
• Consequences of the behavior.
• Social learning theories emphasize the importance of observational learning by observing people in a social context.
Verbal Conditioning
S-R-O
Skinner (1957)
The Mand (Requesting)
• All mands have one thing in common: in the antecedent condition, there is a Motivative Operation (or motivation {S-S}) in place.
• A = thirst (MO) (S)
• B = “I want juice” (R)
• C = student gets juice (O)
• If a child does not want the item, you cannot teach them to mand for it.
Verbal Conditioning
• S
Hungry?
Sleepy?
• R
Yes!
No! (self awareness?)
• O
Reinforced (behavior and self-aware observation)
Reward or punish?
When Punishment Fails
• Most misbehavior is hard to punish immediately.
• Punishment conveys little information.
• An action intended to punish may instead be reinforcing because it brings attention.
Behavior and the Mind
• Edward Tolman’s (1938) experiments with rats demonstrated latent learning
• Latent learning is learning that is not immediately revealed through a change in behavior
• Latent learning occurs without obvious reinforcement
• Individuals’ perceptions of the model and of themselves influence their learning.
Tolman
Latent Learning: A Classic Experiment (Tolman & Honzik, 1930)
Three groups of rats were given practice trials in a maze, 1 trial per day.
The maze consisted of a series of components shaped like the letter T.
A trial started when the rat was placed in the Start box and ended when he entered the Goal box, after which he was removed from the maze.
[Figure: maze made of a series of T-shaped components, running from START to GOAL]
When the rat went up the stem of the T, he reached a choice point. If he turned one way, he came to a dead end. If he turned the other way, he came to the entrance of the next component.
Each time the rat turned into the dead end, it was counted as an error. The measure of performance (dependent variable) was the number of errors on a trial.
If learning occurred, the number of errors should decrease as more and more trials were given.
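The error measure described above can be made concrete with a small sketch. It is an illustration only: the turn sequences, the names `count_errors` and `is_improving`, and the rule of at most one error per choice point are assumptions made for this example, not details of Tolman & Honzik's procedure or data.

```python
from typing import List, Sequence


def count_errors(turns: Sequence[str], correct: Sequence[str]) -> int:
    """Count dead-end entries on one trial.

    Simplifying assumption: one turn is recorded per choice point, and a
    turn that differs from the correct one means the rat entered a dead end.
    """
    if len(turns) != len(correct):
        raise ValueError("need exactly one recorded turn per choice point")
    return sum(1 for taken, right in zip(turns, correct) if taken != right)


def is_improving(errors_by_trial: Sequence[int]) -> bool:
    """Crude check: fewer errors on the last trial than on the first."""
    return errors_by_trial[-1] < errors_by_trial[0]


# Hypothetical maze with four choice points (L = left, R = right).
CORRECT_ROUTE = ["L", "R", "R", "L"]

# Hypothetical turns for one rat on three successive daily trials.
trials = [
    ["R", "L", "R", "R"],  # 3 wrong turns
    ["L", "L", "R", "R"],  # 2 wrong turns
    ["L", "R", "R", "L"],  # 0 wrong turns
]

errors: List[int] = [count_errors(t, CORRECT_ROUTE) for t in trials]
print(errors)                # [3, 2, 0]
print(is_improving(errors))  # True
```

A falling error count across trials is what the slide treats as evidence of learning. Averaging such counts over the rats in a group would give one point per trial on an error curve.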
Latent Learning: A Classic Experiment (Tolman & Honzik, 1930)
GROUP 1: On every trial, these rats received food when they reached the goal box.
GROUP 2: These rats never received food. They were simply removed from the maze when they got to the goal box.
GROUP 3: These rats got no food on Trials 1 to 10. But on Trial 11, and every trial afterwards, they received a food reward.
US = Food
UR = Consume Food
CS = Maze
CR= Consume Food
Latent Learning: A Classic Experiment (Tolman & Honzik, 1930)
[Figure: average errors (0 to 10) plotted against trials 1 to 17 (1 trial per day) for Groups 1, 2, and 3]
The day-to-day decrease in errors represented a “relatively permanent change in behavior” that resulted from practice.
This was clear evidence for learning.
Hull’s theory predicts that the rats in groups 3 & 2 will not learn
Latent Learning: A Classic Experiment (Tolman & Honzik, 1930)
[Figure: average errors (0 to 10) plotted against trials 1 to 17 (1 trial per day) for Groups 1, 2, and 3]
Group 2 got no food but still improved slightly. Removal from the maze was a small reward.
There was little evidence for learning.
Hull vs. Tolman
• Hull’s law of primary reinforcement:
– “when a stimulus-response relationship is followed by a reduction in need, the probability increases that on subsequent occasions the same stimulus will invoke the same response” (Schultz & Schultz, op. cit., p. 329)
• Learning can only take place if there is reinforcement
• S-R connections strengthened by the no. of reinforcements that have occurred - Hull called this “habit strength”
• Habit strength = intervening variable
Hull vs. Tolman
• Tolman devised an experimental test of Hull’s theory
• Hull’s theory states that learning must involve reinforcement
– So we can deduce this hypothesis from his theory:
• Rats will not learn if they are not rewarded
– Tolman tested this hypothesis
Latent Learning: A Classic Experiment
[Figure: average errors (0 to 10) plotted against trials 1 to 17 (1 trial per day) for Groups 1, 2, and 3]
Getting no food on Trials 1 – 10, Group 3 performed like Group 2 through Trial 11.
Latent Learning: A Classic Experiment
[Figure: average errors (0 to 10) plotted against trials 1 to 17 (1 trial per day) for Groups 1, 2, and 3]
On the next trial, Group 3 matched Group 1, and then did even better!
Latent Learning: A Classic Experiment (Tolman & Honzik, 1930)
Interpretation
Group 3 learned the route through the maze on Trials 1 to 10 but didn’t show it because there was no motivation to perform. How could they learn if there were no CS/US pairings?
They outperformed Group 1 because the shift from no reward to reward made the reward seem larger by comparison. This is called “positive contrast.”
So S-S is the way animals learn?
Hull maintained that the maze itself caused small S-R bonds to form
S-R theory still dominated psychology for 40 more years
Response Vs. Place Learning
(Tolman, Ritchie & Kalish, 1946)
[Figure: maze with two start boxes (Start 1, Start 2) and two goal boxes (Goal 1, Goal 2)]
GROUP P always found food in Goal Box 1.
This maze had no walls or roof so that rats could see “landmarks” in the room such as a window, door, or lamp.
On a random half of the trials, the rats started from Start Box 1, and on the other half they started from Start Box 2.
GROUP R found food in Goal Box 1 when they started from Start Box 1 but received food in Goal Box 2 when they started from Start Box 2.
Cognitive theory predicted that GROUP P would learn faster because they only had to learn one cognitive map.
Behavior theory predicted GROUP R would learn faster because they only had to learn one sequence of movements at the choice point—a right turn.
What’s YOUR prediction? Are you a behaviorist or a cognitivist?
GROUP P / GROUP R
Group P learned faster.
But later studies found that if the maze had a roof so the rats couldn’t see things in the room, response learning was faster.
Both response and place learning occur. Which type is faster depends on what cues are available. So both the S-R and S-S views turned out to be right!
S-R or S-S
Classical conditioning can involve both S-R and S-S
Today:
Controlled vs. Automatic processing
S-S = While learning
S-R = After learning
Theories Explaining Classical Conditioning
HULL
• Born 1884 in Akron, NY
• Graduated U. of Michigan in 1913
• Ph.D. U. of Wisconsin 1918
• 1929-1952 Professor of Psychology at Yale
• Died 1952
Tolman
• Born Newton, Mass., on April 14, 1886
• BA at MIT in electrochemistry
• Ph.D. psychology in 1915
• Spent month at Giessen under Koffka. Heavily influenced by Gestalt movement
• Ardent pacifist
• Dismissed at Northwestern U
• Went to UC Berkeley for rest of career
S-R or S-S
Behavioral vs. Cognitive Views of Learning
These traditions in learning theory have existed for decades. They give different answers to the fundamental question, “What is learned” when learning takes place?
Behaviorists say: “Specific actions”
Cognitivists say: “Mental representations”
For example, in a “Skinner Box”, a rat may receive a food reward every time he presses the bar. He presses faster and faster. What has he learned?
S-R S-S
S-R vs. S-S Views of Learning
S-R view: “to press the bar.”
S-S view: “that pressing produces food.”
S-R vs. S-S Views of Learning
S-R (“learns to”)
1. Learning involves the formation of associations between specific actions and specific events (stimuli) in the environment. These stimuli may either precede or follow the action (antecedents vs. consequences).
2. Many behaviorists use intervening variables to explain behavior (e.g., habit, drive) but avoid references to mental states.
3. RADICAL BEHAVIORISM (operant conditioning/behavior modification/behavior analysis): avoids any intervening variables and focuses on descriptions of relationships between behavior and environment (“functional analysis”).
S-R vs. S-S Views of Learning
S-S (“learns that”)
1. Learning takes place in the mind, not in behavior. It involves the formation of mental representations of the elements of a task and the discovery of how these elements are related.
2. Behavior is used to make inferences about mental states but is not of interest in itself (“methodological behaviorism”).
3. EXAMPLE: Tolman & Honzik’s experiment on latent learning. Tolman, a pioneer of cognitive psychology, argued that when rats practice mazes, they acquire a “cognitive map” of the layout: mental representations of the landmarks and their spatial relationships.
S-R or S-S
• Autoshaping
• Taste aversion
• Eyeblink conditioning
• Blocking
• Extinction
• Spontaneous Recovery
• S-R
• S-S
• S-R
• S-S
• S-R
• S-S
Latent Learning
• Rats: one maze trial/day
• One group found food every time (red line)
• Second group never found food (blue line)
• Third group found food on Day 11 (green line)
– Sudden change, day 12
• Learning isn’t the same as performance
Cognitive Maps
• Tolman trained rats in this maze, with all alleys open
– Not to scale; the path on the left is too long.
• If “Block A” in place, rats chose green (shorter) path
• If “Block B” in place, rats chose blue path
– Green path also blocked
• Rats navigate as if they have an internal map
Varieties of cognitive maps? (Gallistel 1990)
Specific issues:
• Spatial scale (local vs. home-range)
• Geometric content (metric, topological)
• Reference frame (egocentric/view-dependent vs. allocentric/view-independent)
Evidence:
• People: short cuts in cities and VR (errors); mixed evidence on contents of underlying map
• Rodents: most studies on local scale; mixed evidence on contents
• Insects: on local and home-range scale: metric, egocentric
Broader Definition (Gallistel 1990): ‘A cognitive map is a record in the central nervous system of macroscopic geometric relations among surfaces in the environment used to plan movements through the environment. A central question is what type of geometric relations a map encodes’.
More on Cognitive Maps: Chimpanzee Behavior
• Chimpanzee on experimenter’s back
• Watched site baiting: 18 locations
• Later released to retrieve food
• Most food found
• Retrieval route differed from baiting route
• Traveling distance was very efficient
Cognitive Maps (spatial learning)
More on Cognitive Maps: Chimpanzee Behavior
• Second experiment
• Same general plan
• 18 locations: 9 fruits and 9 vegetables
• First retrieval visits were to retrieve fruits, in accordance with food preferences
More on Cognitive Maps: Chimpanzee Behavior
• Results suggest that chimpanzees have something like a cognitive map of the compound.
• As they are carried around, chimpanzees store information about food locations not on the basis of the particular path that they are traveling, but on the basis of their cognitive map.
Cognitive Map = A separate type of memory (Bedroom, Gestalt)
More on Cognitive Maps: Chimpanzee Behavior
• Chimpanzees work with this cognitive representation to determine the most efficient route to travel in gathering food.
• This solution depends on cognitive mediation between inputs and behavior that transforms and organizes inputs.
• To explain chimpanzees’ behavior without appeal to mediating processes would provide an impoverished view of what the animal does.
http://www.scottcamazine.com/photos/BeeBehavior/images/06waggleDance_jpg.jpg
Sun Compass and Memory in Bees
[Figure: waggle-dance diagram with food directions of 20°, 40°, and 75° and the same angles shown relative to Up]
• Bees encode (allocentric?) flight direction in dances
• As sun moves, dances change
• Dances change even when bees can’t see sun (thus compensate by memory)
• Reference for memory: landmarks (Dyer & Gould 1981; Dyer & Dickinson 1996)
[Figure: diagram labeled H and F, with sun positions at Noon (12) and 16:00]
The basic task
A STRATEGY FOR INCREASING BEHAVIOUR
• Behavioral self-management is a strategy for increasing some desired behavior (for example, hours spent studying or exercising) by using self-administered rewards. A behavioral self-management program requires the following:
Strategies for increasing a desired behavior
• Choose a target behaviour (the behaviour you want to increase)
• Record a baseline (count time engaged in the desired behaviour or number of times the desired behaviour is performed)
• Establish goals (set gradual goals – daily and weekly)
• Choose reinforcers (for when you reach daily and weekly goals)
• Record your progress (time you engaged in the behaviour or number of times you performed the activity)
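The five steps above can be sketched as a small record-keeping structure. This is a minimal illustration rather than a prescribed tool: the class name `SelfManagementPlan`, its fields, the example numbers, and the fixed-step rule for raising the goal are all assumptions made here. Only daily goals are modelled, and weekly goals are left out for brevity.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List


@dataclass
class SelfManagementPlan:
    target_behaviour: str    # step 1: the behaviour to increase
    baseline: List[float]    # step 2: e.g. minutes per day before starting
    daily_goal: float        # step 3: current daily goal
    reinforcer: str          # step 4: self-administered reward
    progress: List[float] = field(default_factory=list)  # step 5: daily log

    def baseline_average(self) -> float:
        """Average amount of the behaviour during the baseline period."""
        return mean(self.baseline)

    def record(self, amount: float) -> bool:
        """Log one day's amount; return True if the daily goal was met."""
        self.progress.append(amount)
        return amount >= self.daily_goal

    def raise_goal(self, step: float) -> None:
        """Raise the daily goal gradually by a fixed step (an assumed rule)."""
        self.daily_goal += step


# Hypothetical example: increasing minutes spent studying per day.
plan = SelfManagementPlan(
    target_behaviour="studying",
    baseline=[30, 20, 40],
    daily_goal=35,
    reinforcer="one episode of a favourite show",
)

if plan.record(45):
    print(f"Goal met: reward yourself with {plan.reinforcer}")
    plan.raise_goal(5)  # next daily goal becomes 40
```

Starting the first goal slightly above the baseline average keeps the goals gradual, as the list recommends.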