APPSYCH Chapter 6 Operant Conditioning

Chapter 6: Learning “Operant Conditioning”



Operant Conditioning

A type of learning in which behavior is strengthened if followed by reinforcement or diminished if followed by punishment.

In rats:
• trial and error learning
• allows acquisition of motor programs that aren’t instinctive
• behavior shaped by rewards
• develops as a result of the association of reinforcement with a particular response
• on a proportion of occasions

Trial & Error --> Trial & Reward --> Operant Conditioning

Operant Response --> Reinforcement --> Learned Behavior


Classical vs. Operant

They both use acquisition, discrimination, spontaneous recovery (SR), generalization, and extinction.

Classical Conditioning is automatic (respondent behavior).

Ex.) Your dog gets sick and requires several painful trips to the vet. Now he hides every time he hears you rattle your keys. Automatic.

Operant Conditioning involves behavior that acts on the environment and produces consequences (operant behavior).

Ex.) Teacher comments on test.


Edward Thorndike

Law of Effect: rewarded behavior is likely to recur.

Previous theories had emphasized practice or repetition. Thorndike gave equal consideration to the effects of reward or punishment, success or failure, and satisfaction or annoyance on the learner.


B.F. Skinner

Instead of the antecedents of behavior (what comes before), a new focus on the consequences of behavior.

B.F. Skinner argued that classical conditioning (CC) did not explain complex behavior.

2 categories of consequences: Reinforcement & Punishment.

Reinforcement is designed to increase the probability that a behavior will occur again.

Punishment is designed to decrease the probability that a behavior will occur again.


Operant Conditioning Chamber


Positive reinforcement - when something is given (present a desirable stimulus).

Negative reinforcement - when something is removed (remove an aversive stimulus).

Skinner - punishment should be judicious, immediate, consistent, and severe enough to actually be a punishment.


Shaping

A procedure in Operant Conditioning in which reinforcers guide behavior closer and closer towards a goal.

Reinforcers

Any event that STRENGTHENS the behavior it follows.

There are positive (+) and negative (–) reinforcers.


Positive Reinforcement

Strengthens a response by presenting a stimulus after a response.

We may continue to go to work each day because we receive a paycheck on a weekly or monthly basis.  

If we receive awards for writing short stories, we may write short stories more often.

Receiving praise for our karaoke performances can increase how often we sing.  These are all examples of positive reinforcement.


Negative Reinforcement

Strengthens a response by reducing or removing an aversive stimulus.

Examples:

Driving in heavy traffic is a negative condition for most of us. You leave home earlier than usual one morning, and don't run into heavy traffic. You leave home earlier again the next morning and again you avoid heavy traffic. Your behavior of leaving home earlier is strengthened by the consequence of the avoidance of heavy traffic.

The concept of Negative Reinforcement is difficult to learn because of the word negative. Negative Reinforcement is often confused with Punishment. They are very different, however.

Negative Reinforcement strengthens a behavior because a negative condition is stopped or avoided as a consequence of the behavior.

Punishment, on the other hand, weakens a behavior because a negative condition is introduced or experienced as a consequence of the behavior.
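The distinction above can be sketched as a small lookup. This is only an illustration: the function name and the string labels are made up for this sketch and do not come from the slides. The fourth combination (something removed, behavior weakened) is usually called negative punishment, a term the slides do not use.

```python
def classify_consequence(stimulus_change: str, behavior_effect: str) -> str:
    """Name a consequence from two observations (labels are hypothetical).

    stimulus_change: "added" if something is presented, "removed" if taken away
    behavior_effect: "strengthened" or "weakened"
    """
    if stimulus_change not in ("added", "removed"):
        raise ValueError("stimulus_change must be 'added' or 'removed'")
    if behavior_effect not in ("strengthened", "weakened"):
        raise ValueError("behavior_effect must be 'strengthened' or 'weakened'")

    # Positive/negative describes what happens to the stimulus,
    # not whether the outcome is good or bad.
    sign = "positive" if stimulus_change == "added" else "negative"
    # Reinforcement/punishment describes what happens to the behavior.
    kind = "reinforcement" if behavior_effect == "strengthened" else "punishment"
    return f"{sign} {kind}"


# Leaving home early: heavy traffic is avoided and the behavior is strengthened.
print(classify_consequence("removed", "strengthened"))  # negative reinforcement
```

The point of the sketch is that "negative" answers a different question from "punishment", which is why the two are easy to confuse.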


Fixed-ratio Schedules

A schedule that reinforces a response only after a specified number of responses.

Examples in natural environments:

Jobs that pay based on units delivered. Employees often find this schedule undesirable because it produces a rate of response that leaves them nervous and exhausted at the end of the day. They may feel pressured not to slow down or take rest breaks, since they feel that doing so will cost them money. This is an example of how a schedule can produce a high rate of response even though the response rate is aversive to the subject.

Examples in video games:

Collecting tokens. Many games require the player to collect a fixed number of tokens to advance to the next level, obtain a new life point, or receive some other reinforcer.

Attaining a new level in an RPG. Some RPGs clearly indicate how much experience is required to achieve the next level. A high degree of certainty as to the level of work that will be required to achieve the next level puts the player on a fixed-ratio schedule.
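A fixed-ratio rule is simple enough to write down directly. The sketch below is illustrative only; the function name and the ratio of 5 are assumptions chosen for the example, not values from the slides.

```python
def fixed_ratio_rewards(num_responses: int, ratio: int) -> list[int]:
    """Return the 1-based response numbers that earn a reinforcer.

    On a fixed-ratio schedule every `ratio`-th response is reinforced.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return [n for n in range(1, num_responses + 1) if n % ratio == 0]


# A hypothetical piece-rate job: one payment per 5 units delivered.
print(fixed_ratio_rewards(20, 5))  # [5, 10, 15, 20]
```

Because the learner can count exactly how many responses remain, working faster always brings the next reinforcer sooner, which fits the high response rates described above.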


Variable-ratio Schedule

A schedule of reinforcement that reinforces a response after an unpredictable number of responses.

Examples:

Slot machines. Slot machines are programmed on a VR schedule. The gambler has no way of predicting how many times he must put a coin in the slot and pull the lever to hit a payoff, but the more times a coin is inserted, the greater the chance of a payout. People who play slot machines are often reluctant to leave them, especially when they have had a large number of un-reinforced responses. They are concerned that someone else will win the moment they leave.

Playing golf. It only takes a few good shots to encourage the player to keep playing or play again. The player is uncertain how good each shot will be, but the more often they play, the more likely they are to get a good shot.

Door to door salesmen. It is uncertain how many houses they will have to visit to make a sale, but the more houses they try, the more likely that they will succeed.
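One way to picture a variable-ratio schedule is to draw a new, random response requirement after each reinforcer. The sketch below is a simplified model under stated assumptions: the function name, the uniform draw between 1 and `2 * mean_ratio - 1` (which averages out to `mean_ratio`), and the seed parameter are all choices made for this illustration, not a description of how any real slot machine is programmed.

```python
import random


def variable_ratio_rewards(num_responses: int, mean_ratio: int, seed=None) -> list[int]:
    """Return the 1-based response numbers that earn a reinforcer.

    After each reinforcer, the number of responses required for the next
    one is drawn at random, so the learner cannot predict it.
    """
    if mean_ratio < 1:
        raise ValueError("mean_ratio must be at least 1")
    rng = random.Random(seed)  # seeded only so a run can be repeated
    rewarded = []
    next_reward = rng.randint(1, 2 * mean_ratio - 1)
    for n in range(1, num_responses + 1):
        if n == next_reward:
            rewarded.append(n)
            # Draw a fresh, unpredictable requirement for the next reinforcer.
            next_reward = n + rng.randint(1, 2 * mean_ratio - 1)
    return rewarded


rewards = variable_ratio_rewards(num_responses=50, mean_ratio=5, seed=1)
print(rewards)  # gaps between rewarded responses vary from 1 to 9
```

More responses still mean more reinforcers on average, but no single response is known in advance to be the one that pays off.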


Fixed-interval Schedule

A schedule of reinforcement that reinforces a response only after a specified time has elapsed.

Example: I give Bart a Butterfinger the first time he moons someone after ten minutes have passed.
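A fixed-interval schedule is commonly read as "the first response after the interval has elapsed is reinforced". The sketch below follows that reading. The function name, the sample response times, and the choice to restart the timer from the reinforced response are assumptions made for illustration.

```python
def fixed_interval_rewards(response_times, interval):
    """Return the response times (e.g. in minutes) that earn a reinforcer.

    Responses made before the interval has elapsed earn nothing; the first
    response at or after that point is reinforced and the timer restarts.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    rewarded = []
    available_at = interval  # earliest time the first reinforcer can be earned
    for t in sorted(response_times):
        if t >= available_at:
            rewarded.append(t)
            available_at = t + interval  # assumption: timer restarts at the reward
    return rewarded


# Hypothetical response times in minutes, with a ten-minute interval.
print(fixed_interval_rewards([2, 9, 11, 12, 19, 23, 40], 10))  # [11, 23, 40]
```

Responses at minutes 2, 9, 12, and 19 earn nothing, which is why extra responding early in the interval does not pay off on this schedule.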


Variable-interval Schedule

A schedule of reinforcement that reinforces a response at unpredictable time intervals.

Example: pop quizzes.
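A variable-interval schedule can be sketched the same way as a timed schedule, except that the waiting time before the next reinforcer becomes available is drawn at random. This is a rough model under stated assumptions: the function name, the uniform draw between 0 and `2 * mean_interval`, and the seed parameter are choices made for this illustration.

```python
import random


def variable_interval_rewards(response_times, mean_interval, seed=None):
    """Return the response times that earn a reinforcer.

    The first response after an unpredictable waiting time is reinforced,
    then a new waiting time is drawn.
    """
    if mean_interval <= 0:
        raise ValueError("mean_interval must be positive")
    rng = random.Random(seed)  # seeded only so a run can be repeated
    rewarded = []
    available_at = rng.uniform(0, 2 * mean_interval)
    for t in sorted(response_times):
        if t >= available_at:
            rewarded.append(t)
            # The learner cannot tell when the next reinforcer becomes available.
            available_at = t + rng.uniform(0, 2 * mean_interval)
    return rewarded


# A student who studies once a day for 30 days (a hypothetical response pattern),
# with a quiz becoming possible every 5 days on average.
print(variable_interval_rewards(range(1, 31), mean_interval=5, seed=0))
```

Since a quiz could come at any time, steady responding is the only way to be ready for it, which is the usual explanation for why pop quizzes encourage consistent studying.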


Punishment

An event that DECREASES the behavior that it follows.

Does punishment work?