PRINCIPLES OF APPETITIVE CONDITIONING
Chapter 6

Early Contributors
Thorndike's Contribution
Emphasized laws of behavior
Demonstrated trial-by-trial (S-R) learning

Skinner's Contribution
Emphasized contingency: a specified relationship between behavior and reinforcement in a given situation
The environment "sets" the contingencies: S(R→O)
A "Faux" Distinction
Instrumental conditioning: a conditioning procedure in which the environment constrains the opportunity for reward (discrete trials)
Operant conditioning: when a specific response produces reinforcement, and the frequency of the response determines the amount of reinforcement obtained (continuous responding, schedules of reinforcement)
Thorndike's Law of Effect
S-R associations are "stamped in" by reward (satisfiers)

Thorndike: "What is learned?"
[Diagram: Stimulus (S) → Response (R); reinforcement "stamps in" this S-R connection]
Habit Learning
[Diagram: S → R → O, with the S-O link labeled "Pavlovian Association" and the R-O link labeled "Instrumental Association"]
Is that it?
"O" Matters
The Importance of Past Experience
Depression/Negative Contrast: the effect in which a shift from high to low reward magnitude produces a lower level of responding than if the reward magnitude had always been low.
Elation/Positive Contrast: the effect in which a shift from low to high reward magnitude produces a greater level of responding than if the reward magnitude had always been high.
Negative and Positive Contrast
Logic of Devaluation Experiment
R-O or Goal-Directed: Responding controlled by the current value of the reinforcer, and so it should be reduced to zero after devaluation.
S-R or Habit: Responding that is not controlled by the current value of the reward, and so it is insensitive to reinforcer devaluation.
[Figure: level of responding (Max to Min) under Normal vs. Devalued conditions]
R-O Association (aka the Instrumental Association)

Phase 1: Push Left → Pellet; Push Right → Sucrose
Devaluation: Pellet + LiCl (or Sucrose + LiCl)
Test: with Pellet devalued, push Right? With Sucrose devalued, push Left?

[Figure: number of Left vs. Right pushes when the Pellet is devalued and when the Sucrose is devalued]
Summary of Devaluation
Neutered male rats lower, but do not eliminate, their responding previously associated with access to a "ripe" female rat.
Rats satiated on reward #1 preferentially lower responding for reward #1 more than for reward #2.
Goal-devaluation effects tend to shrink with continued training, and goal-directed responding is replaced by habit learning.
S-O Association (aka the Pavlovian Association)

Stage 1: Right → Pellet; Left → Sucrose
Stage 2: Tone → Pellet; Light → Sucrose
Test: during the Tone, press Left or Right? During the Light, press Left or Right?

[Figure: number of Left vs. Right presses during the Tone and during the Light]
Skinner's Contributions
Automatic, easy measurements that can be compared across species
The Three-Term Contingency
Three terms define the contingency:
Discriminative stimulus (S+ or S-)
Operant (R)
Consequence (O)
Operant Strengthened
[Diagram: in a Skinner box with the light on (S+), the rat may bite, groom, lick, rear, or push the lever (R); only the lever press produces the reinforcer (O), so that operant is strengthened]
Techniques and Concepts
Shaping (successive approximations): require closer and closer approximations to the target behaviour
Secondary reinforcers: stimuli accompanying reinforcer delivery
Marking: feedback that a response has occurred
Shaping
Shaping (or successive-approximation procedure): select a frequently occurring operant behavior, then slowly change the contingency until the desired behavior is learned
Training a Rat to Bar Press
Step 1: reinforce eating out of the food dispenser
Step 2: reinforce moving away from the food dispenser
Step 3: reinforce moving in the direction of the bar
Step 4: reinforce pressing the bar
Appetitive Reinforcers
Primary reinforcer: an activity whose reinforcing properties are innate
Secondary reinforcer: an event that has developed its reinforcing properties through its association with primary reinforcers
Primary Reward Magnitude
The acquisition of an instrumental or operant response: the greater the magnitude of the reward, the faster the task is learned
The differences in performance may reflect motivational differences
[Figure: acquisition as a function of reward magnitude]
Primary Reward and Degraded Contingency
[Figure: event records of bar presses and food deliveries. Perfect contingency → strong responding; degraded contingency → weak responding]
Strength of Secondary Reinforcers
Several variables affect the strength of secondary reinforcers:
The magnitude of the primary reinforcer
The greater the number of primary-secondary pairings, the stronger the reinforcing power of the secondary reinforcer
The time elapsing between the presentation of the secondary reinforcer and the primary reinforcer affects the strength of the secondary reinforcer
[Figure: secondary reinforcer strength as a function of the number of primary-secondary pairings]
Schedules of Reinforcement
Schedule of reinforcement: a contingency that specifies how often or when we must act to receive reinforcement
Schedules of Reinforcement
Fixed Ratio (FR): reinforcement is given after a fixed number of responses; short pauses
Variable Ratio (VR): reinforcement after a varying number of responses
Schedules of Reinforcement
Fixed Interval (FI): the first response after a given interval is rewarded; produces the FI scallop
Variable Interval (VI): like FI, but the interval varies around a given average; the scallop disappears
Fixed Interval Schedule
Fixed interval schedule: reinforcement is available only after a specified period of time, and the first response emitted after the interval has elapsed is reinforced
Scallop effect: the ability to withhold the response until close to the end of the interval increases with experience
The pause is longer with longer FI schedules
Variable Interval Schedules
Variable interval schedule: an average interval of time between available reinforcers, but the interval varies from one reinforcement to the next
Characterized by steady rates of responding
The longer the interval, the lower the response rate
The scallop effect does not occur on VI schedules
Encourages S-R habit learning
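The four basic schedules can be summarized by when they make reinforcement available. The sketch below is my own illustration of that logic (the function and variable names are not from the chapter), not an implementation from the slides:

```python
import random

# Illustrative sketch: when does each basic schedule deliver reinforcement?
# (Names and signatures are my own, chosen for clarity.)

def fixed_ratio(n, response_count):
    """FR-n: reinforce every n-th response."""
    return response_count % n == 0

def variable_ratio(n, rng):
    """VR-n: each response is reinforced with probability 1/n, so
    reinforcement comes after a varying number of responses averaging n."""
    return rng.random() < 1.0 / n

def fixed_interval(t, now, last_reinforcement):
    """FI-t: the first response emitted after t seconds have elapsed
    since the last reinforcement is reinforced."""
    return now - last_reinforcement >= t

def variable_interval(required_wait, now, last_reinforcement):
    """VI: like FI, but required_wait is redrawn around an average
    after each reinforcement, so the scallop disappears."""
    return now - last_reinforcement >= required_wait

rng = random.Random(0)
print(fixed_ratio(5, 10))          # 10th response on FR-5 is reinforced
print(fixed_interval(30, 45, 0))   # response at 45 s on FI-30 is reinforced
```

Note how the ratio schedules depend only on counts while the interval schedules depend only on elapsed time; that difference is what produces the FI scallop and the steady VI response rates described above.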
Some Other Schedules
DRL: differential reinforcement of low rates of responding
DRH: differential reinforcement of high rates of responding
DRO: differential reinforcement of other behavior (anything but the target behavior)
Compound Schedules
Compound schedule: a complex contingency in which two or more schedules of reinforcement are combined
Example to schedule: $5 today vs. $50 if you wait (e.g., VI-30 vs. VI-60)
Concurrent schedules permit the subject to alternate between different schedules, or to repeatedly choose between working on different schedules (A vs. B)
Matching Law
B1/(B1+B2) = R1/(R1+R2)
B stands for the number of occurrences of a certain behavior
R stands for the number of reinforcers earned
Sniffy the Rat

Schedule ("1" vs "2")    B1/(B1+B2)    R1/(R1+R2)
VI-30 vs VI-10           25%           25%
VI-10 vs VI-30           75%           75%
VI-10 vs VI-50           83.3%         83.3%
VI-50 vs VI-10           16.7%         16.7%
VI-30 vs VI-30           50%           50%
VI-10 vs VI-10           50%           50%
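The Sniffy percentages follow directly from the matching law: on a VI-t schedule the programmed reinforcement rate is roughly proportional to 1/t, so strict matching predicts B1/(B1+B2) = (1/t1) / (1/t1 + 1/t2). A minimal sketch (the function name is my own):

```python
# Matching-law predictions for concurrent VI schedules.
# On VI-t, reinforcement rate is ~1/t, so under strict matching
# the behavior allocated to schedule 1 is (1/t1) / (1/t1 + 1/t2).

def predicted_allocation(t1, t2):
    """Fraction of behavior allocated to schedule 1 under strict matching."""
    r1, r2 = 1.0 / t1, 1.0 / t2
    return r1 / (r1 + r2)

# Reproduces the Sniffy table:
print(round(predicted_allocation(30, 10), 3))  # VI-30 vs VI-10 -> 0.25
print(round(predicted_allocation(10, 50), 3))  # VI-10 vs VI-50 -> 0.833
print(round(predicted_allocation(30, 30), 3))  # VI-30 vs VI-30 -> 0.5
```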
Typical Result
[Figure: typical matching result]
Deviations From Matching
Bias: a preference for responding on one response more than the other that has nothing to do with the schedules programmed
One pigeon key requires more force to close its contact than the other, so the pigeon has to peck harder
One food hopper delivers food more quickly than another
Sensitivity
Overmatching: the relative rate of responding is more extreme than predicted by matching. The subject appears to be "too sensitive" to the schedule differences.
Undermatching: the relative rate of responding on a key is less extreme than predicted by matching. The subject appears to be "insensitive" to the schedule differences.
Overmatching

Poor Self-Control
[Diagram: direct choice (concurrent schedule) between alternative A (small reward) and alternative B (LARGE reward)]
Self-Control and Overmatching
Concurrent choice: humans and nonhumans often choose an immediate small reward over a larger delayed reward (delayed rewards are "discounted")
Another Example of Impulsivity
"Free" reinforcers are given every 20 s (at 20 s, 40 s, and 60 s)
A lever press advances delivery of the first pellet and deletes the second pellet
So if you press at 2 seconds, you get a pellet immediately, but no other pellets until the 60-second pellet is available
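The payoff arithmetic of this procedure makes the impulsivity point concrete. A small worked example (the cycle length and pellet times are taken from the description above; the function is my own illustration):

```python
# Worked example of the free-pellet procedure: pressing the lever
# trades an earlier first pellet for a smaller total payoff.

def pellets_earned(press_time=None):
    """Pellets received in one 60 s cycle.

    Without a press, free pellets arrive at 20 s, 40 s, and 60 s (3 total).
    A press delivers the first pellet immediately, deletes the second,
    and leaves only the 60 s pellet (2 total).
    """
    if press_time is None:
        return 3
    return 2

print(pellets_earned())    # waiting earns 3 pellets per cycle
print(pellets_earned(2))   # an impulsive press at 2 s earns only 2
```

Pressing is "impulsive" precisely because the immediate pellet costs one pellet per cycle relative to simply waiting.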
Delay of Reinforcement
Delayed reinforcers are steeply discounted
Loss of self-control and impulsivity
[Figure: reinforcer potency (0-100) as a function of delay, for a small immediate reward and a large delayed reward]
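The steep discounting curve described above is commonly modeled with a hyperbolic function, V = A / (1 + kD). This is a standard model from the discounting literature, not a formula given in the slides, and the amounts, delays, and k value below are invented for illustration:

```python
# Hyperbolic discounting sketch (V = A / (1 + k*D)): a standard model
# of steep delay discounting. All numbers here are made up to
# illustrate the small-immediate vs. large-delayed preference.

def discounted_value(amount, delay, k=0.2):
    """Present value of a reward of size `amount` after `delay` units."""
    return amount / (1.0 + k * delay)

small_now   = discounted_value(20, delay=0)     # small, immediate
large_later = discounted_value(100, delay=30)   # large, delayed

# With these numbers the small immediate reward has the higher present
# value (20 vs. ~14.3) -- the impulsive choice the slides describe.
print(small_now, large_later)
```

Because the hyperbola falls fastest at short delays, the large reward can regain the higher value when both delays are lengthened equally, which is why pre-commitment (choosing early, as in the concurrent-chain procedure below) supports self-control.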
Concurrent Chain (Pre-commitment)
[Diagram: an initial choice between links A and B leads to terminal links delivering the small vs. LARGE rewards]
Behavioral Methods for Self-Control
Pre-commitment: self-exclusion, contracts
Distraction
Modeling
Shaping waiting
Reduce delay for small
Increase delay for large
The Discontinuance of Reinforcement
Extinction: the elimination or suppression of a response caused by the discontinuation of reinforcement or the removal of the unconditioned stimulus
When reinforcement is first discontinued, the rate of responding remains high; under some conditions, it even increases

Extinction Paradox
Stronger learning ≠ slower extinction: the Partial Reinforcement Extinction Effect, or PREE
Importance of Consistency of Reward
Extinction is slower following partial rather than continuous reinforcement
Partial reinforcement extinction effect (PREE): the greater resistance to extinction of an instrumental or operant response following intermittent rather than continuous reinforcement during acquisition
One of the most reliable phenomena in psychology
Acquisition with Differing Percentages
[Figure: speed across acquisition days for the 100% group vs. the 80/50/30% groups]
Extinction with Differing Percentages
[Figure: speed across extinction days for the 100%, 80%, 50%, and 30% groups]
Explanations
Mowrer-Bitterman Discrimination Hypothesis
Amsel's Frustration Theory (emotional)
Capaldi's Sequential Theory (cognitive)
Theios Experiment (not just discrimination)

       PHASE 1    PHASE 2    EXT
G1     100%       -          0%
G2     100%       100%       0%
G3     50%        100%       0%
G4     50%        -          0%
[Figure: speed across extinction trials. G1 and G2 (100% in Phase 1) extinguish faster than G3 and G4 (50% in Phase 1), even though G2 and G3 both ended training on 100% reinforcement]
Amsel's Frustration Theory
Nonreward when reward is expected produces frustration. Partially reinforced animals learn to keep responding in the face of frustration, so extinction is slow; continuously reinforced animals never learn this, so extinction is rapid.
[Diagram: 100% reinforcement group]
[Diagram: 50% reinforcement group]
Amsel (Percentage Reinforcement)
[Figure: speed across extinction trials for the 100% and 50% groups]
Amsel's Frustration Theory

BETWEEN SUBJECT                      EXT
GROUP 1: T→F (100%)                  T-
GROUP 2: N→F (50%)                   N-
→ PREE

WITHIN SUBJECT                       EXT
TRIALS 1,3,6…: T→F (100%)            T-
TRIALS 2,4,5…: N→F (50%)             N-
→ Reversed PREE
Influence of Reward Magnitude
The influence of reward magnitude on resistance to extinction depends upon the amount of acquisition training.
With extended acquisition, a small consistent reward may produce more resistance to extinction than a large reward (the absence of the large reward is more frustrating).

Reward Magnitude and Percentage
[Figure: resistance to extinction as a function of reward magnitude and reinforcement percentage]
Sequential Theory
Sequential theory: if reward follows nonreward, the animal will associate the memory of the nonrewarded experience with the operant or instrumental response
During extinction, the only memory present after the first nonrewarded experience is that of the nonrewarded experience
Animals receiving continuous reward do not experience nonrewarded responses, and so they do not associate nonrewarded responses with later reward
Thus, the memory of receiving a reward after persistence in the face of nonreward becomes a cue for continued responding
Key variables: number of N-R transitions, N-length, variability of N-length
What is the significance of the PRE?
It encourages organisms to persist even though every behavior is not reinforced
In the natural environment, not every attempt to attain a desired goal is successful
The PRE is adaptive because it motivates animals not to give up too easily