Operant Conditioning The Learner is NOT passive. Learning based on consequence!!!



Operant Conditioning

• Learning controlled by a connection to the consequence of one’s behavior
• Consequences of behavior determine whether it will be repeated in future

Vs. Classical Conditioning

Behavior is…
• CC: elicited, automatic, reflexive
• OC: emitted, voluntary, complex behaviors

Reward is…
• CC: provided independent of actions
• OC: dependent on behavior

http://www.youtube.com/watch?v=JA96Fba-WHk

B.F. Skinner

• The most influential behaviorist and proponent of Operant Conditioning.

• Nurture guy through and through.

• Used a Skinner Box (Operant Conditioning Chamber) to prove his concepts.

Skinner operant box: non-reflexive behaviors could be altered by learning

Chaining Behaviors

Subjects are taught a number of responses successively in order to get a reward.

Click picture to see a rat chaining behaviors.

Click to see a cool example of chaining behaviors.

http://www.youtube.com/watch?v=6A9sPBftMQg

http://www.youtube.com/watch?v=_u1Zhh5ZfGM

Thorndike’s Puzzle and The Law of Effect

• Edward Thorndike
• Locked cats in a cage
• Behavior changes because of its consequences.
• If a response is rewarded, that response is more likely to occur.
• If consequences are unpleasant, the stimulus-response connection will weaken. (LOE)
• Called the whole process instrumental learning.
• Instrumental behaviors

Click picture to see a better explanation of the Law of Effect.

http://www.youtube.com/watch?v=6A9sPBftMQg

http://www.youtube.com/watch?v=pb-6DqfYw6U

http://www.youtube.com/watch?v=BDujDOLre-8

Thorndike

Operant Conditioning

Reinforcement: increases probability of response
• Positive: desirable stimulus is added
• Negative: undesirable stimulus is removed

Punishment: decreases probability of response
• Positive: adding something bad
• Negative: removing something good

http://www.youtube.com/watch?v=t7Dna0677m0


http://www.youtube.com/watch?v=PRvhoRVfAss&feature=related


Reinforcement

When an event increases the likelihood that a response will occur again

Positive

Adding something good

Designed to increase behavior

Negative

Removing something bad

Designed to increase behavior

Types of reinforcers

Primary vs. secondary
• Primary: inherently satisfying to most people
• Secondary: gain value from conditioning

Immediate & delayed
• Usually needs to be immediate, but humans can handle delayed reinforcers
• Important for self-control

Rat basketball

What type of learning was this an example of?

Can you explain what helped the rats learn to score a basket?

http://www.youtube.com/watch?v=drnnulHw5CM

Punishment/Consequence

When an event decreases the likelihood that a response will occur again

Two types: Positive & Negative

Positive ≠ Good. POSITIVE = ADD

Adding something bad

Designed to decrease behavior

Negative ≠ Bad. NEGATIVE = SUBTRACT

Removing something good

Designed to decrease behavior

Importance of reinforcement
• Punishment signals undesirable behavior but doesn’t inform of desired behavior
• Punished behavior is suppressed
• Punishment teaches stimulus discrimination
• Punishment (esp. physical) teaches fear & aggression
• Ignore behavior that one wants to punish; look for what to reinforce

Punishment tends to be ineffective
• It tells the organism what not to do, rather than what to do
• Creates anxiety that can interfere with future learning
• Encourages subversive behavior (sneakiness)
• Provides a model for aggressive behavior
• Only true for some races/cultures

Neg. reinforcement ≠ punishment

The Decision Tree

How to solve operant conditioning problems

1. Should the behavior increase or decrease?
• Increase. (Reinforcement)
• Decrease. (Punishment)

2. Is something being added or taken away?
• Added. (Positive)
• Removed. (Negative)
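The two questions in the decision tree can be sketched as a small lookup. This is only an illustration; the function name `classify_consequence` and the string labels it accepts are assumptions made for the example, not part of the original slides.

```python
def classify_consequence(behavior_change: str, stimulus_change: str) -> str:
    """Name the operant procedure from the two decision-tree questions.

    behavior_change: "increase" or "decrease" (should the behavior go up or down?)
    stimulus_change: "added" or "removed" (is something added or taken away?)
    """
    # Question 1: increase -> reinforcement, decrease -> punishment
    kinds = {"increase": "reinforcement", "decrease": "punishment"}
    # Question 2: added -> positive, removed -> negative
    signs = {"added": "positive", "removed": "negative"}

    if behavior_change not in kinds:
        raise ValueError("behavior_change must be 'increase' or 'decrease'")
    if stimulus_change not in signs:
        raise ValueError("stimulus_change must be 'added' or 'removed'")

    return f"{signs[stimulus_change]} {kinds[behavior_change]}"


# Taking away something bad so a behavior happens more often:
print(classify_consequence("increase", "removed"))  # negative reinforcement
# Taking away something good so a behavior happens less often:
print(classify_consequence("decrease", "removed"))  # negative punishment
```

Note that "positive" and "negative" only answer the second question (add or subtract); whether the procedure is reinforcement or punishment comes entirely from the first.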

Review

Punishment (decreases behavior)
• Positive: ADD something unfavorable
• Negative: SUBTRACT something desirable

Reinforcement (increases behavior)
• Positive: ADD something desirable
• Negative: SUBTRACT something unfavorable

Applications of Operant Conditioning


Behavior Modification
• Started with Thorndike
• Altering individual behavior (frequency) through positive and negative reinforcement and positive and negative punishment
• Adaptive behaviors
• Reduction of behavior through its extinction and punishment
• A.K.A. Applied Behavior Analysis or Positive Behavior Support (PBS)

A child is riding with an adult, and the child is thirsty. So, the child asks to stop and get a drink. The adult says no, the child asks again, and again, and again... Finally, the adult gives in, saying, "All right, just this once." Big mistake, right? Why? The adult has now put the child on a partial reinforcement schedule, guaranteeing a repetition of the same behavior later on. Instead, the adult should have said, "All right, I'll get you a drink IF you don't ask for one for the next 10 minutes" (the time may have to vary, depending on the child). Then, the adult is providing the child with positive reinforcement for being quiet.

Ending a Relationship?????

Behavior Modification

Reinforcement provides a system of rewards and punishments to change negative behavior into positive responses.
• Provides rewards when someone acts in a positive manner. Rewards can range from a compliment to granting a special privilege to the patient whose behavior becomes desirable.
• A negative consequence might be the result of unwanted behavior, with the removal of a favorite object or taking away a privilege.

Cognitive behavior modification techniques focus on thought patterns that affect behavior.
• Involve teaching a patient to recognize thoughts that may be unrealistic or distort reality.
• Keeping a journal, role-playing, and being asked to defend thoughts that defy reality.
• Eating disorders, anxiety disorder, OCD, panic attacks

Aversion behavior modification techniques center on the premise that all behavior is learned and can be unlearned. (aka CC)
• Electrical shock treatment is one example of aversive stimuli used to treat deviant behavior. (Mild)
• Medication given to alcoholics that might make them ill if they drink while using the drug.

The token system provides immediate rewards while setting goals for future conduct.
• Distribute a token or similar object each time a patient or student exhibits positive behavior.
• Tokens can be amassed and later exchanged for a prize or privilege, or lost due to unwanted behavior.
• This form of behavior modification is commonly used in mental institutions and prisons to help control individuals who show violent tendencies.
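The bookkeeping behind the token system described above (earn a token for positive behavior, lose tokens for unwanted behavior, exchange saved tokens for a prize) might be sketched like this. The class name `TokenEconomy`, the prize names, and the token costs are all hypothetical, chosen only for illustration.

```python
class TokenEconomy:
    """Minimal token-system ledger for one patient or student (illustrative)."""

    def __init__(self, prize_costs):
        # prize_costs maps a prize or privilege to its price in tokens
        # (hypothetical values supplied by whoever runs the program)
        self.prize_costs = dict(prize_costs)
        self.balance = 0

    def reward(self, tokens=1):
        """Hand out tokens immediately after a positive behavior."""
        self.balance += tokens

    def fine(self, tokens=1):
        """Take tokens away after an unwanted behavior (never below zero)."""
        self.balance = max(0, self.balance - tokens)

    def exchange(self, prize):
        """Trade saved tokens for a prize; return True if it was affordable."""
        cost = self.prize_costs[prize]
        if cost > self.balance:
            return False
        self.balance -= cost
        return True


student = TokenEconomy({"extra recess": 3, "homework pass": 5})
student.reward()       # raised hand before speaking
student.reward(2)      # finished the assignment
student.fine()         # interrupted a classmate
student.reward()       # helped clean up
print(student.balance)                   # 3
print(student.exchange("extra recess"))  # True
print(student.exchange("homework pass")) # False
```

The immediate token is the reward in the moment; the prize list is what sets the goal for future conduct.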

Premack principle
• A less frequently performed behavior can be increased by reinforcing it with a more frequent behavior
• Eat your vegetables before you can have dessert!

Operant Conditioning in Daily Life

Do we wait for the subject to deliver the desired behavior?

Sometimes, we use a process called shaping.

Shaping is reinforcing small steps on the way to the desired behavior.

To train a dog to get your slippers, you would have to reinforce him in small steps. First, to find the slippers. Then to put them in his mouth. Then to bring them to you and so on…this is shaping behavior.

To get Barry to become a better student, you need to do more than give him a massage when he gets good grades. You have to give him massages when he studies for ten minutes, or when he completes his homework. Small steps to get to the desired behavior.

Shaping

Reinforcing responses that come successively closer to the desired response

Successive approximations

Shaping

Reinforcers gradually increase organism’s actions toward desired end behavior

Successive approximations: behaviors closer & closer to end learning goal get rewarded

1. Simply turning toward the lever will be reinforced

2. Only stepping toward the lever will be reinforced

3. Only moving to within a specified distance from the lever will be reinforced

4. Only touching the lever with a part of the body will be reinforced

5. Only touching the lever with a specified paw will be reinforced

6. Only depressing the lever partially with the specified paw will be reinforced

7. Only depressing the lever completely with the specified paw will be reinforced
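The seven successive approximations above can be sketched as a rising criterion: a response is reinforced only if it is at least as close to the goal as the current step, and the criterion moves up once the animal has met it a few times. Coding each response as a step index (0 to 6) and the `successes_to_advance` rule are assumptions for this sketch, not something the slides specify.

```python
# The seven steps from the list above, from easiest to the final behavior.
STEPS = [
    "turn toward the lever",
    "step toward the lever",
    "move to within a specified distance from the lever",
    "touch the lever with a part of the body",
    "touch the lever with a specified paw",
    "depress the lever partially with the specified paw",
    "depress the lever completely with the specified paw",
]


def shape(observed, successes_to_advance=2):
    """Decide which observed responses get reinforced during shaping.

    observed: sequence of step indexes (0-6), one per response the animal makes.
    Returns (list of True/False reinforcement decisions, final criterion index).
    """
    criterion = 0   # start by reinforcing even the crudest approximation
    streak = 0      # reinforced responses since the criterion last moved
    reinforced_log = []
    for step_index in observed:
        reinforced = step_index >= criterion
        reinforced_log.append(reinforced)
        if reinforced:
            streak += 1
            # Raise the bar once the current step is reliably met.
            if streak == successes_to_advance and criterion < len(STEPS) - 1:
                criterion += 1
                streak = 0
    return reinforced_log, criterion


log, criterion = shape([0, 0, 0, 1, 1, 2])
print(log)               # [True, True, False, True, True, True]
print(STEPS[criterion])  # move to within a specified distance from the lever
```

The third response (merely turning toward the lever) goes unrewarded because the criterion has already moved on; that is what pushes behavior toward the next step.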

Schedules of reinforcement

• How often do you give the reinforcer?
• Every time or just sometimes you see the behavior?

Schedules of Reinforcement

Continuous reinforcement schedule: reinforcing a response every time
• Learning occurs rapidly, extinction occurs rapidly

Partial reinforcement schedule: reinforcing a response only some of the time
• Slower acquisition, but resistant to extinction

Fixed vs. Variable
Ratio vs. Interval

• Fixed ratio: after set # of responses
• Variable ratio: after unpredictable # of responses
• Fixed interval: after set amount of time has passed
• Variable interval: after unpredictable amount of time has passed

Continuous v. Partial Reinforcement

Continuous
• Reinforce the behavior EVERY TIME the behavior is exhibited.
• Usually done when the subject is first learning to make the association.
• Acquisition comes really fast.
• But so does extinction.

Partial
• Reinforce the behavior only SOME of the times it is exhibited.
• Acquisition comes more slowly.
• But is more resistant to extinction.
• FOUR types of Partial Reinforcement schedules.

Schedules of reinforcement

Continuous vs. partial

Ratio schedules

1. Fixed-ratio (FR) schedules: Reinforcement after a fixed (predictable) number of responses
   Ex: paid $1 for every 20 apples you pick

2. Variable-ratio (VR) schedules: Reinforcement after a varying (unpredictable) number of responses
   Induces very high rate of responding
   Ex: scratch & win lottery tickets

Interval schedules

3. Fixed-interval (FI) schedule: Reinforcement after a fixed (predictable) amount of time

4. Variable-interval (VI) schedule: Reinforcement after varying (unpredictable) amounts of time
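To make the difference between the four partial schedules concrete, here is a rough sketch in which each schedule answers one question: "is this response reinforced?" The class names, the `respond` method, and the uniform random draws used for the variable schedules are assumptions for illustration; real experiments may vary the requirement in other ways.

```python
import random


class FixedRatio:
    """Reinforce every n-th response (e.g. $1 for every 20 apples picked)."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t=None):
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after an unpredictable number of responses averaging mean_n."""

    def __init__(self, mean_n, rng=None):
        self.mean_n = mean_n
        self.rng = rng or random.Random()
        self._reset()

    def _reset(self):
        self.count = 0
        # Uniform draw on 1..(2*mean_n - 1) has mean mean_n (an assumed choice).
        self.required = self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t=None):
        self.count += 1
        if self.count >= self.required:
            self._reset()
            return True
        return False


class FixedInterval:
    """Reinforce the first response made after a set amount of time."""

    def __init__(self, interval):
        self.interval = interval
        self.available_at = interval

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self.interval
            return True
        return False


class VariableInterval:
    """Reinforce the first response after an unpredictable amount of time."""

    def __init__(self, mean_interval, rng=None):
        self.mean_interval = mean_interval
        self.rng = rng or random.Random()
        self.available_at = self._draw()

    def _draw(self):
        # Uniform draw on 0..2*mean has mean mean_interval (an assumed choice).
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self._draw()
            return True
        return False


# Apple picking on FR-20: 60 apples earn 3 payments.
apples = FixedRatio(20)
print(sum(apples.respond() for _ in range(60)))  # 3

# Checking once a day on FI-7: paid on days 7, 14 and 21.
weekly = FixedInterval(7)
print([day for day in range(1, 22) if weekly.respond(day)])  # [7, 14, 21]
```

On the ratio schedules, responding faster earns reinforcers faster; on the interval schedules, extra responses before the time is up earn nothing.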

Reinforcement Schedules

            Ratio                              Interval
Fixed       after set number of responses      after set amount of time
Variable    after random number of responses   after random amount of time

Name that Schedule!

1. Winning at the slot machines
2. Getting a free flight after accumulating 10,000 flight miles
3. Receiving an allowance every Saturday regardless of chores, as long as you’ve done one chore
4. Random drug testing at your job

A. Variable Ratio    C. Variable Interval
B. Fixed Ratio       D. Fixed Interval

Answers: 1. A, 2. B, 3. D, 4. C
