Operant Conditioning The Learner is NOT passive. Learning based on consequence!!!
Uploaded by henry-henderson
Operant Conditioning
• Learning controlled by a connection to the consequences of one’s behavior
• The consequences of a behavior determine whether it will be repeated in the future
Vs. Classical Conditioning
• Behavior is…
  CC: elicited, automatic, reflexive
  OC: emitted, voluntary, complex behaviors
• Reward is…
  CC: provided independent of actions
  OC: dependent on behavior
B.F. Skinner
• The most influential behaviorist and proponent of Operant Conditioning.
• Nurture guy through and through.
• Used a Skinner Box (Operant Conditioning Chamber) to demonstrate his concepts.
• Skinner's operant box: non-reflexive behaviors could be altered by learning.
Chaining Behaviors
Subjects are taught a number of responses successively in order to get a reward.
Thorndike’s Puzzle and The Law of Effect
• Edward Thorndike
• Locked cats in a cage
• Behavior changes because of its consequences.
• If a response is rewarded, that response is more likely to occur.
• If consequences are unpleasant, the stimulus-response connection will weaken. (Law of Effect, LOE)
• Called the whole process instrumental learning.
• Instrumental behaviors
Operant Conditioning
• Reinforcement: increases probability of response
  Positive: desirable stimulus is added
  Negative: undesirable stimulus is removed
• Punishment: decreases probability of response
  Positive: adding something bad
  Negative: removing something good
Reinforcement
When an event increases the likelihood that a response will occur again
Positive
Adding something good
Designed to increase behavior
Negative
Removing something bad
Designed to increase behavior
Types of reinforcers
• Primary vs. secondary
  Primary: inherently satisfying to most people
  Secondary: gain value from conditioning
• Immediate vs. delayed
  Reinforcement usually needs to be immediate, but humans can handle delayed reinforcers
  Important for self-control
Rat basketball
What type of learning was this an example of?
Can you explain what helped the rats learn to score a basket?
Punishment/Consequence
• When an event decreases the likelihood that a response will occur again
• Two types: Positive & Negative
Positive ≠ Good. POSITIVE = ADD
Adding something bad
Designed to decrease behavior
Negative ≠ Bad. NEGATIVE = SUBTRACT
Removing something good
Designed to decrease behavior
Importance of reinforcement
• Punishment signals undesirable behavior but doesn’t inform of desired behavior
• Punished behavior is suppressed
• Punishment teaches stimulus discrimination
• Punishment (esp. physical) teaches fear & aggression
• Ignore behavior that one wants to punish; look for what to reinforce
Punishment tends to be ineffective
• It tells the organism what not to do, rather than what to do
• Creates anxiety that can interfere with future learning
• Encourages subversive behavior (sneakiness)
• Provides a model for aggressive behavior
• May be true only for some cultures
Neg. reinforcement ≠ punishment
The Decision Tree
How to solve operant conditioning problems
1. Should the behavior increase or decrease?
   • Increase: Reinforcement
   • Decrease: Punishment
2. Is something being added or taken away?
   • Added: Positive
   • Removed: Negative
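The two questions of the decision tree can be written as a small lookup. This is only a sketch for practice problems: the function name `classify_consequence` and the string labels it accepts are assumptions made for illustration, not part of the original slides.

```python
def classify_consequence(behavior_should, stimulus_is):
    """Apply the two decision-tree questions.

    behavior_should: "increase" or "decrease" (question 1)
    stimulus_is:     "added" or "removed"     (question 2)
    Both argument names and labels are hypothetical, chosen for this sketch.
    """
    # Question 1: increase -> reinforcement, decrease -> punishment
    kind = {"increase": "reinforcement", "decrease": "punishment"}[behavior_should]
    # Question 2: added -> positive, removed -> negative
    sign = {"added": "positive", "removed": "negative"}[stimulus_is]
    return f"{sign} {kind}"


# Taking away a privilege so that a behavior happens less often:
print(classify_consequence("decrease", "removed"))  # negative punishment
# Turning off an annoying alarm once the seat belt is buckled:
print(classify_consequence("increase", "removed"))  # negative reinforcement
```

Note that the answer never depends on whether the stimulus is "good" or "bad", only on the direction of the behavior change and on whether something was added or removed.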
Review
                                    Positive                     Negative
Punishment (decreases behavior)     ADD something unfavorable    SUBTRACT something desirable
Reinforcement (increases behavior)  ADD something desirable      SUBTRACT something unfavorable
Applications of Operant Conditioning
Behavior Modification
• Started with Thorndike
• Altering individual behavior (frequency) through positive and negative reinforcement and positive and negative punishment
• Adaptive behaviors
• Reduction of behavior through its extinction and punishment
• A.K.A. Applied Behavior Analysis or Positive Behavior Support (PBS)
A child is riding with an adult, and the child is thirsty, so the child asks to stop and get a drink. The adult says no; the child asks again, and again, and again... Finally, the adult gives in, saying, "All right, just this once." Big mistake, right? Why? The adult has now put the child's asking on a partial reinforcement schedule, which makes the same behavior likely to be repeated later on. Instead, the adult should have said, "All right, I'll get you a drink IF you don't ask for one for the next 10 minutes" (the time may have to vary, depending on the child). Then the adult is providing the child with positive reinforcement for being quiet.
Ending a Relationship?????
Behavior Modification
• Reinforcement provides a system of rewards and punishments to change negative behavior into positive responses.
  Provides rewards when someone acts in a positive manner. Rewards can range from a compliment to granting a special privilege to the patient whose behavior becomes desirable.
  Unwanted behavior might bring a negative consequence, such as the removal of a favorite object or the loss of a privilege.
• Cognitive behavior modification techniques focus on thought patterns that affect behavior.
  Involve teaching a patient to recognize thoughts that may be unrealistic or distort reality.
  Keeping a journal, role-playing, and being asked to defend thoughts that defy reality.
  Eating disorders, anxiety disorders, OCD, panic attacks
• Aversion behavior modification techniques center on the premise that all behavior is learned and can be unlearned. (aka CC)
  Electrical shock treatment is one example of an aversive stimulus used to treat deviant behavior.
  (Mild) medication given to alcoholics that might make them ill if they drink while using the drug.
• The token system provides immediate rewards while setting goals for future conduct.
  Distribute a token or similar object each time a patient or student exhibits positive behavior.
  Tokens can be amassed and later exchanged for a prize or privilege, or lost due to unwanted behavior.
  This form of behavior modification is commonly used in mental institutions and prisons to help control individuals who show violent tendencies.
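The bookkeeping behind a token system (earn, lose, exchange) can be sketched in a few lines. Everything named here is hypothetical: the `TokenEconomy` class, its method names, and the example prices are assumptions for illustration and do not describe any particular program.

```python
class TokenEconomy:
    """Minimal token-system ledger (illustrative sketch only)."""

    def __init__(self, prices):
        # prices: mapping of prize/privilege name -> cost in tokens (assumed values)
        self.prices = dict(prices)
        self.balance = 0

    def award(self, tokens=1):
        """Immediate reward: give tokens when positive behavior is shown."""
        self.balance += tokens

    def fine(self, tokens=1):
        """Tokens lost due to unwanted behavior; the balance never goes below zero."""
        self.balance = max(0, self.balance - tokens)

    def redeem(self, item):
        """Exchange saved tokens for a prize or privilege; False if too few tokens."""
        cost = self.prices[item]
        if self.balance < cost:
            return False
        self.balance -= cost
        return True


ledger = TokenEconomy({"extra recess": 5, "sticker": 2})
for _ in range(6):
    ledger.award()                      # six instances of positive behavior
ledger.fine(2)                          # one instance of unwanted behavior
print(ledger.redeem("extra recess"))    # False: only 4 tokens remain
print(ledger.redeem("sticker"))         # True: 2 tokens spent
print(ledger.balance)                   # 2
```

In the terms used earlier, awarding a token is positive reinforcement with a secondary reinforcer, and taking tokens away is negative punishment.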
Premack principle
• A less frequently performed behavior can be increased by reinforcing it with a more frequent behavior
• Ex: Eat your vegetables before you can have dessert!
Operant Conditioning in Daily Life
Do we wait for the subject to deliver the desired behavior?
Sometimes, we use a process called shaping.
Shaping is reinforcing small steps on the way to the desired behavior.
To train a dog to get your slippers, you would have to reinforce him in small steps: first for finding the slippers, then for putting them in his mouth, then for bringing them to you, and so on. This is shaping behavior.
To get Barry to become a better student, you need to do more than give him a massage when he gets good grades. You have to give him massages when he studies for ten minutes, or when he completes his homework. Small steps to get to the desired behavior.
Shaping
• Reinforcing responses that come successively closer to the desired response
• Successive approximations
Shaping
• Reinforcers gradually move the organism’s actions toward the desired end behavior
• Successive approximations: behaviors closer & closer to the end learning goal get rewarded
1. Simply turning toward the lever will be reinforced
2. Only stepping toward the lever will be reinforced
3. Only moving to within a specified distance from the lever will be reinforced
4. Only touching the lever with a part of the body will be reinforced
5. Only touching the lever with a specified paw will be reinforced
6. Only depressing the lever partially with the specified paw will be reinforced
7. Only depressing the lever completely with the specified paw will be reinforced
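The seven steps above amount to a criterion that is raised over time: a response is reinforced only if it reaches at least the current step. The sketch below illustrates that idea. The rule for when to raise the criterion (after three reinforced trials in a row) is an assumption made for this example, as are the names `SHAPING_STEPS` and `shape`; the slides do not specify one.

```python
# The seven successive approximations, paraphrased from the list above.
SHAPING_STEPS = [
    "turn toward the lever",
    "step toward the lever",
    "move within a set distance of the lever",
    "touch the lever with any body part",
    "touch the lever with the specified paw",
    "partially depress the lever with the specified paw",
    "fully depress the lever with the specified paw",
]


def shape(observed_levels, successes_to_advance=3):
    """Decide, trial by trial, whether a response is reinforced.

    observed_levels: for each trial, the index of the furthest step the
        animal's response reached (0 = merely turned toward the lever).
    successes_to_advance: ASSUMED rule; raise the criterion after this
        many reinforced trials in a row.
    Returns (list of reinforced? flags, final criterion index).
    """
    criterion = 0
    streak = 0
    reinforced = []
    for level in observed_levels:
        meets = level >= criterion      # only responses at or past the criterion count
        reinforced.append(meets)
        if not meets:
            streak = 0
            continue
        streak += 1
        if streak == successes_to_advance and criterion < len(SHAPING_STEPS) - 1:
            criterion += 1              # demand a closer approximation from now on
            streak = 0
    return reinforced, criterion


flags, final = shape([0, 0, 0, 0, 1, 1, 1])
print(flags)                 # the 4th trial is not reinforced: the bar has moved
print(SHAPING_STEPS[final])  # the step now required for reinforcement
```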
Schedules of reinforcement
• How often do you give the reinforcer?
• Every time, or just some of the times you see the behavior?
Schedules of Reinforcement
• Continuous reinforcement schedule: reinforcing a response every time
  Learning occurs rapidly, extinction occurs rapidly
• Partial reinforcement schedule: reinforcing a response only some of the time
  Slower acquisition, but resistant to extinction
• Fixed vs. Variable; Ratio vs. Interval
  Fixed ratio: after set # of responses
  Variable ratio: after unpredictable # of responses
  Fixed interval: after set amount of time has passed
  Variable interval: after unpredictable amount of time has passed
Continuous v. Partial Reinforcement
Continuous
• Reinforce the behavior EVERY TIME the behavior is exhibited.
• Usually done when the subject is first learning to make the association.
• Acquisition comes really fast.
• But so does extinction.
Partial
• Reinforce the behavior only SOME of the times it is exhibited.
• Acquisition comes more slowly.
• But is more resistant to extinction.
• FOUR types of Partial Reinforcement schedules.
Schedules of reinforcement
Continuous vs. partial
Ratio schedules
1. Fixed-ratio (FR) schedules: reinforcement after a fixed (predictable) number of responses
   Ex: paid $1 for every 20 apples you pick
2. Variable-ratio (VR) schedules: reinforcement after a varying (unpredictable) number of responses
   Induces very high rate of responding
   Ex: scratch & win lottery tickets
Interval schedules
3. Fixed-interval (FI) schedule: reinforcement after a fixed (predictable) amount of time
4. Variable-interval (VI) schedule: reinforcement after varying (unpredictable) amounts of time
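To make the difference between the four partial schedules concrete, here is a small sketch in which each schedule is a function that is told when a response occurs and answers whether that response is reinforced. The function names, the use of a 1-in-n chance per response for the variable-ratio case, and the exponentially distributed waits for the variable-interval case are all modeling assumptions for this illustration, not part of the slides.

```python
import random


def fixed_ratio(n):
    """FR: reinforce every n-th response."""
    count = 0

    def respond(t):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return respond


def variable_ratio(mean_n, rng):
    """VR: each response has a 1-in-mean_n chance (assumed model), so the
    number of responses per reward is unpredictable but averages mean_n."""
    def respond(t):
        return rng.random() < 1.0 / mean_n
    return respond


def fixed_interval(seconds):
    """FI: reinforce the first response made at least `seconds` after the last reward."""
    last = 0.0

    def respond(t):
        nonlocal last
        if t - last >= seconds:
            last = t
            return True
        return False
    return respond


def variable_interval(mean_seconds, rng):
    """VI: like FI, but the required wait is redrawn at random after each reward
    (exponential waits are an assumption of this sketch)."""
    last = 0.0
    wait = rng.expovariate(1.0 / mean_seconds)

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.expovariate(1.0 / mean_seconds)
            return True
        return False
    return respond


# FR-20, as in "$1 for every 20 apples": 100 apples picked earns 5 rewards.
apples = fixed_ratio(20)
print(sum(apples(t) for t in range(1, 101)))           # 5

# FI-7 with one response per second for 20 seconds: rewards at t = 7 and t = 14.
weekly = fixed_interval(7)
print([t for t in range(1, 21) if weekly(t)])           # [7, 14]

# VR and VI depend on chance; a seeded generator keeps the run repeatable.
rng = random.Random(0)
tickets = variable_ratio(4, rng)
print(sum(tickets(t) for t in range(1, 101)), "rewards in 100 responses on VR-4")
```

On ratio schedules, responding faster earns rewards faster, whereas on interval schedules extra responses before the interval has elapsed earn nothing.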
Reinforcement Schedules
            Ratio                               Interval
Fixed       after set number of responses       after set amount of time
Variable    after random number of responses    after random amount of time
Name that Schedule!
1. Winning at the slot machines
2. Getting a free flight after accumulating 10,000 flight miles
3. Receiving an allowance every Saturday regardless of chores, as long as you’ve done one chore
4. Random drug testing at your job
A. Variable Ratio    B. Fixed Ratio    C. Variable Interval    D. Fixed Interval
Answers: 1. A   2. B   3. D   4. C