Operant applications

Operant Applications Principles of Learning


Transcript of Operant applications

Page 1: Operant applications

Operant Applications

Principles of Learning

Page 2: Operant applications
Page 3: Operant applications

Applications of Operant Conditioning

Skinner introduced the concept of teaching machines that shape learning in small steps and provide reinforcement for correct responses.

In School


Page 4: Operant applications

Applications of Operant Conditioning

Reinforcers affect productivity. Many companies now allow employees to share profits and participate in company ownership.

At work

Page 5: Operant applications

Applications of Operant Conditioning

At Home

In children, reinforcing good behavior increases its occurrence. Ignoring unwanted behavior decreases its occurrence.

Page 6: Operant applications


Operant conditioning: Addiction (1)

Drug use is a behaviour that is positively reinforced by the pharmacologic effects of the drug.

Page 7: Operant applications


Operant conditioning: Addiction (2)

Once a person is addicted, drug use is negatively reinforced: it removes or avoids painful withdrawal symptoms.

Page 8: Operant applications

Behavior Therapy

• Behavior therapy uses learning methods to change abnormal behavior, thoughts and feelings

– Behavior therapists use classical and operant conditioning techniques as well as modeling

– Counterconditioning: learning a new response

• Systematic desensitization: relaxation is paired with a stimulus that formerly induced anxiety

• Aversive conditioning: an unpleasant event is paired with a stimulus to reduce its attractiveness

Ch 2.23

Page 9: Operant applications

Counterconditioning

Page 10: Operant applications

Cognitive Behavior Therapy

• Cognitive therapy assumes that thought patterns can cause a disturbance of emotion or behavior

– Beck’s Cognitive Therapy for Depression

• Depressed mood is caused by cognitive distortions

– “Nothing good ever happens to me”

– Ellis’s Rational Emotive Behavior Therapy

• Emotional upset is due to irrational beliefs

– “I must be loved by everyone”

Ch 2.25

Page 11: Operant applications

The Cognitive Paradigm

• Cognition involves the mental processes of perceiving, recognizing, judging and reasoning

• The cognitive paradigm focuses on how people structure and understand their experiences and how these experiences are related to past experiences stored in memory

Ch 2.24

Page 12: Operant applications


Page 13: Operant applications


Operant conditioning: Application to CBT techniques

• Functional Analysis – identify high-risk situations and determine reinforcers

• Examine long- and short-term consequences of drug use to reinforce resolve to be abstinent

• Schedule time and receive praise

• Develop meaningful alternative reinforcers to drug use

Page 14: Operant applications

Gary Wilkes (1994) Animal Trainer

• Elephants:

Dangerous and sensitive to handling stress

Calluses build up on the feet (elephant becomes unable to walk)

Calluses must be cut away with a sharp tool

Page 15: Operant applications

Elephant Manicure

• Violent Aggressive Bull

• Callus not trimmed in 10 years

• Vets cannot touch him

• What to do?

• Large steel gate with hole in corner (size of elephant's foot)

• Clicker + carrot

• Clicker + approach gate + carrot

• Clicker + lift foot + carrot

• Clicker + move foot to hole

• Etc.

• After training: elephant would voluntarily walk to gate and put foot through

Page 16: Operant applications

Elephant Manicure

• CS + US

• SHAPING

• Large steel gate with hole in corner (size of elephant's foot)

• Clicker + carrot

• Clicker + approach gate + carrot

• Clicker + lift foot + carrot

• Clicker + move foot to hole

• Etc.

• After training: elephant would voluntarily walk to gate and put foot through
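
The shaping sequence on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `shape`, the `reps_per_step` parameter and the idea that each approximation must be reinforced a fixed number of times before the criterion moves on are all hypothetical and are not taken from Wilkes's procedure.

```python
def shape(observed, steps, reps_per_step=2):
    """Reinforce successive approximations of a target behavior.

    observed: behaviors the animal emits, in order.
    steps: approximations ordered from easiest to the final behavior.
    reps_per_step: assumed number of reinforced repetitions before the
        criterion moves to the next approximation (hypothetical value).
    Returns (log, step): log pairs each behavior with True when it earned
    click + carrot; step is the index of the next untrained approximation.
    """
    step, reps, log = 0, 0, []
    for behavior in observed:
        # Only the current approximation is reinforced; earlier ones no longer are.
        meets = step < len(steps) and behavior == steps[step]
        log.append((behavior, meets))
        if meets:
            reps += 1
            if reps == reps_per_step:
                step, reps = step + 1, 0  # raise the criterion
    return log, step


steps = ["approach gate", "lift foot", "move foot to hole"]
session = ["approach gate", "approach gate", "lift foot",
           "approach gate", "lift foot"]
log, next_step = shape(session, steps)
```

In this toy session the fourth behavior ("approach gate") is no longer reinforced, because the criterion has already moved on to "lift foot".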

Page 17: Operant applications

Self Awareness

• Self-aware: observe one's own behavior

• “I think Joe will quit school” (he is engaged in those types of behaviors)

• I have observed myself engaged in those behaviors (“I think I will quit school”)

• Long-term comas

• Behave as if awake:

Open eyes

Turn heads

Move a hand

Coma = not responsive to environment

Page 18: Operant applications

Boyle and Greer (1983)

• Reinforced spontaneous behaviors with music

Moved patient

Requested action

Reward = short selection of favorite music

2 sessions a day/ 16 weeks

• Reinforcement

• Outcome (Reward) contingent on behavior

• Cause and effect!

• 33% increased spontaneous movement

1 came out of coma

Page 19: Operant applications

Norris Edwards: Chapter 8: Wade08.ppt Page: 19

The Problem with Reward

• Misuse of reward ~ rewards must be tied to the behavior we are trying to increase.

• Each of us has had the experience of standing in the checkout line at the market and seeing a child in a shopping cart tempted by the candy and toys on display adjacent to the line.

• When we as parents give in and purchase something to quiet our kids in that situation, what behavior are we actually reinforcing?

Page 20: Operant applications

Norris Edwards: Chapter 8: Wade08.ppt Page: 20©1999 Prentice Hall

Hidden Cost of Rewards

• Preschoolers played with felt-tipped markers and were observed

• Divided into 3 groups:

– Given markers again and asked to draw

– Promised a reward for playing with markers

– Played with markers, then rewarded

Page 21: Operant applications

Albert Bandura: Social Cognitive Theory

• Theories that emphasize how behavior is learned and maintained through observation and imitation of others, positive consequences, and cognitive processes such as plans, expectations, and beliefs.

• Observational Learning ~ A process in which an individual learns new responses by observing the behavior of another (a model) rather than through direct experience; sometimes called Vicarious Conditioning.

Page 22: Operant applications

Skinner (1953) and Verbal Behaviors

• “That itches”

• “That tickles”

• “That hurts”

• Observed behavior:

• Scratching

• Giggling

• Tears and groans

Page 23: Operant applications

Basic Behavioral Principles

• Antecedent - any stimulus that happens before a behavior (S)

• Behavior - an observable and measurable act of an individual (R)

• Consequence - any stimulus that happens after a behavior (O)

Page 24: Operant applications

Social-Cognitive Learning Theories

• To this point most American learning theories have maintained the position that most learning can be explained in terms of the behavioral ABCs.

• Antecedents: events preceding the behavior

• Behavior itself

• Consequences of the behavior

• Social learning theories emphasize the importance of observational learning, that is, learning by observing people in a social context.

Page 25: Operant applications

Verbal Conditioning

S-R-O

Page 26: Operant applications

Skinner (1957)

Page 27: Operant applications

The Mand (Requesting)

• All mands have one thing in common: in the antecedent condition, there is a Motivative Operation (or motivation {S-S}) in place.

• A = thirst (MO) (S)

• B = “I want juice” (R)

• C = student gets juice (O)

• If a child does not want the item, you cannot teach them to mand for it.
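
The A-B-C breakdown of the mand can be written down as a small record. The sketch below is only an illustration: the class name `Contingency`, its field names and the helper `can_teach_mand` are hypothetical, and the single rule it encodes is the slide's own point that a mand can only be taught when a motivating condition is in place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    """One three-term contingency (hypothetical record layout)."""
    antecedent: str      # A, the stimulus before the behavior (S)
    behavior: str        # B, the observable act (R)
    consequence: str     # C, the stimulus after the behavior (O)
    motivation_present: bool = False  # is an MO in place in the antecedent?


def can_teach_mand(c: Contingency) -> bool:
    # Per the slide: if the child does not want the item, the mand cannot be taught.
    return c.motivation_present


juice = Contingency("thirst", "I want juice", "student gets juice",
                    motivation_present=True)
not_thirsty = Contingency("just had a drink", "I want juice",
                          "student gets juice")
```

Here `can_teach_mand(juice)` is true and `can_teach_mand(not_thirsty)` is false; the second antecedent is an invented example, not one from the slides.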

Page 28: Operant applications

Verbal Conditioning

• S

Hungry?

Sleepy?

• R

Yes!

No! (self awareness?)

• O

Reinforced (behavior and self-aware observation)

Reward or punish?

Page 29: Operant applications


When Punishment Fails

• Most misbehavior is hard to punish immediately.

• Punishment conveys little information.

• An action intended to punish may instead be reinforcing because it brings attention.

Page 30: Operant applications

Behavior and the Mind

• Edward Tolman's (1938) experiment with rats demonstrated latent learning

• Latent learning is learning that is not immediately revealed through a change in behavior

• Latent learning occurs without obvious reinforcement

• Individuals' perceptions of the model and of themselves influence their learning.

Page 31: Operant applications

Tolman

Latent Learning: A Classic Experiment (Tolman & Honzik, 1930)

Three groups of rats were given practice trials in a maze, 1 trial per day.

The maze consisted of a series of components shaped like the letter T.

A trial started when the rat was placed in the Start box and ended when he entered the Goal box, after which he was removed from the maze.

Page 32: Operant applications

Tolman

Latent Learning: A Classic Experiment (Tolman & Honzik, 1930)

[Figure: maze made of a chain of T-shaped units running from START to GOAL]

When the rat went up the stem of the T, he reached a choice point. If he turned one way, he came to a dead end. If he turned the other way, he came to the entrance of the next component.

Page 33: Operant applications

Tolman

Latent Learning: A Classic Experiment (Tolman & Honzik, 1930)

[Figure: maze made of a chain of T-shaped units running from START to GOAL]

Each time the rat turned into the dead end, it was counted as an error. The measure of performance (dependent variable) was the number of errors on a trial.

If learning occurred, the number of errors should decrease as more and more trials were given.
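
The dependent variable described here, errors per trial, can be computed as in the sketch below. The data layout is an assumption made for illustration: a trial is a list of `(unit, turn)` pairs, one per turn the rat makes at a T unit, and `correct_turn` maps each unit to the turn that leads onward. The actual layout of Tolman and Honzik's maze is not given in the slides, so the two-unit maze here is hypothetical.

```python
def count_errors(trial, correct_turn):
    """Count dead-end entries in one trial.

    trial: list of (unit, turn) pairs in the order the turns were made.
    correct_turn: dict mapping unit -> the turn that leads to the next unit.
    Every turn that differs from the correct one is a dead-end entry (an error).
    """
    return sum(1 for unit, turn in trial if turn != correct_turn[unit])


def errors_per_trial(trials, correct_turn):
    """Error count for each trial, in order (the learning curve)."""
    return [count_errors(t, correct_turn) for t in trials]


# Hypothetical two-unit maze: go left at unit 0, right at unit 1.
correct_turn = {0: "L", 1: "R"}
trials = [
    [(0, "R"), (0, "L"), (1, "L"), (1, "R")],  # two dead ends
    [(0, "L"), (1, "L"), (1, "R")],            # one dead end
    [(0, "L"), (1, "R")],                      # error-free run
]
curve = errors_per_trial(trials, correct_turn)
```

A curve that falls across trials, as in this made-up data, is the pattern the slide treats as evidence of learning.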

Page 34: Operant applications

Latent Learning: A Classic Experiment (Tolman & Honzik, 1930)

GROUP 1: On every trial, these rats received food when they reached the goal box.

GROUP 2: These rats never received food. They were simply removed from the maze when they got to the goal box.

GROUP 3: These rats got no food on Trials 1 to 10. But on Trial 11, and every trial afterwards, they received a food reward.

US = Food

UR = Consume Food

CS = Maze

CR= Consume Food

Page 35: Operant applications

Latent Learning: A Classic Experiment (Tolman & Honzik, 1930)

[Figure: average errors (y-axis, 0 to 10) by trial (x-axis, trials 1 to 17, 1 trial per day) for Groups 1, 2 and 3]

The day-to-day decrease in errors represented a “relatively permanent change in behavior” that resulted from practice.

This was clear evidence for learning.

Hull’s theory predicts that the rats in groups 3 & 2 will not learn

Page 36: Operant applications

Latent Learning: A Classic Experiment

(Tolman & Honzik, 1930)

[Figure: average errors (y-axis, 0 to 10) by trial (x-axis, trials 1 to 17, 1 trial per day) for Groups 1, 2 and 3]

Group 2 got no food but still improved slightly. Removal from the maze was a small reward.

There was little evidence for learning.

Page 37: Operant applications

Hull vs. Tolman

• Hull’s law of primary reinforcement:

– “when a stimulus-response relationship is followed by a reduction in need, the probability increases that on subsequent occasions the same stimulus will invoke the same response” (Schultz & Schultz, op. cit., p. 329)

• Learning can only take place if there is reinforcement

• S-R connections are strengthened by the number of reinforcements that have occurred; Hull called this “habit strength”

• Habit strength = intervening variable

Page 38: Operant applications

Hull vs. Tolman

• Tolman devised an experimental test of Hull’s theory

• Hull’s theory states that learning must involve reinforcement

– So we can deduce this hypothesis from his theory:

• Rats will not learn if they are not rewarded

– Tolman tested this hypothesis

Page 39: Operant applications

Latent Learning: A Classic Experiment

[Figure: average errors (y-axis, 0 to 10) by trial (x-axis, trials 1 to 17, 1 trial per day) for Groups 1, 2 and 3]

Getting no food on Trials 1 – 10, Group 3 performed like Group 2 through Trial 11.

Page 40: Operant applications

Latent Learning: A Classic Experiment

[Figure: average errors (y-axis, 0 to 10) by trial (x-axis, trials 1 to 17, 1 trial per day) for Groups 1, 2 and 3]

On the next trial, Group 3 matched Group 1, and then did even better!

Page 41: Operant applications

Latent Learning: A Classic Experiment (Tolman & Honzik, 1930)

Interpretation

Group 3 learned the route through the maze on Trials 1 to 10 but didn’t show it because there was no motivation to perform. How could they learn if there were no CS/US pairings?

They outperformed Group 1 because the shift from no reward to reward made the reward seem larger by comparison. This is called “positive contrast.”

Page 42: Operant applications

So S-S is the way animals learn?

Hull maintained that the maze itself caused little S-R bonds to form

S-R theory still dominated psychology for 40 more years

Page 43: Operant applications

Response Vs. Place Learning

GROUP P always found food in Goal Box 1.

Start 1

Start 2

Goal 2

Goal 1

(Tolman, Ritchie & Kalish, 1946)

This maze had no walls or roof so that rats could see “landmarks” in the room such as a window, door, or lamp.

On a random half of the trials, the rats started from Start Box 1, and on the other half they started from Start Box 2.

GROUP R found food in Goal Box 1 when they started from Start Box 1 but received food in Goal Box 2 when they started from Start Box 2.
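
The reward rules for the two groups, and the random half-and-half start schedule, can be stated compactly in code. The function names `rewarded_goal` and `start_schedule` and the `seed` parameter are hypothetical; the rules themselves are the ones given on this slide.

```python
import random


def rewarded_goal(group, start_box):
    """Return the goal box (1 or 2) that holds food on a trial."""
    if start_box not in (1, 2):
        raise ValueError("start_box must be 1 or 2")
    if group == "P":
        # Place group: food is always in the same place, Goal Box 1.
        return 1
    if group == "R":
        # Response group: Goal Box 1 from Start Box 1, Goal Box 2 from Start Box 2.
        return start_box
    raise ValueError("group must be 'P' or 'R'")


def start_schedule(n_trials, seed=0):
    """Random order of start boxes, half from Start Box 1 and half from Start Box 2."""
    starts = [1] * (n_trials // 2) + [2] * (n_trials - n_trials // 2)
    random.Random(seed).shuffle(starts)
    return starts


schedule = start_schedule(10)
group_p_goals = [rewarded_goal("P", s) for s in schedule]
group_r_goals = [rewarded_goal("R", s) for s in schedule]
```

For Group P the rewarded location never changes across the schedule, while for Group R it switches with the start box.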

Page 44: Operant applications

Response Vs. Place Learning

GROUP P always found food in Goal Box 1.

Start 1

Start 2

Goal 2

Goal 1

(Tolman, Ritchie & Kalish, 1946)

Cognitive theory predicted that GROUP P would learn faster because they only had to learn one cognitive map.

Behavior theory predicted GROUP R would learn faster because they only had to learn one sequence of movements at the choice point—a right turn.

GROUP R found food in Goal Box 1 when they started from Start Box 1 but received food in Goal Box 2 when they started from Start Box 2.

Page 45: Operant applications

Response Vs. Place Learning

GROUP P always found food in Goal Box 1.

Start 1

Start 2

Goal 2

Goal 1

(Tolman, Ritchie & Kalish, 1946)

GROUP R found food in Goal Box 1 when they started from Start Box 1 but received food in Goal Box 2 when they started from Start Box 2.

What’s YOUR prediction? Are you a behaviorist or a cognitivist?

GROUP P

GROUP R

Page 46: Operant applications

Response Vs. Place Learning

GROUP P always found food in Goal Box 1.

Start 1

Start 2

Goal 2

Goal 1

(Tolman, Ritchie & Kalish, 1946)

GROUP R found food in Goal Box 1 when they started from Start Box 1 but received food in Goal Box 2 when they started from Start Box 2.

What’s YOUR prediction? Are you a behaviorist or a cognitivist?

GROUP P

GROUP R

Group P learned faster.

But later studies found that if the maze had a roof so the rats couldn’t see things in the room, response learning was faster.

Page 47: Operant applications

Response Vs. Place Learning

GROUP P always found food in Goal Box 1.

Start 1

Start 2

Goal 2

Goal 1

(Tolman, Ritchie & Kalish, 1946)

GROUP R found food in Goal Box 1 when they started from Start Box 1 but received food in Goal Box 2 when they started from Start Box 2.

What’s YOUR prediction? Are you a behaviorist or a cognitivist?

GROUP P

GROUP R

Group P learned faster. Both response and place learning occur. Which type is faster depends on what cues are available. So both the S-R and S-S views turned out to be right!

Page 48: Operant applications

S-R or S-S

Classical conditioning can involve both S-R and S-S

Today:

Controlled vs. Automatic processing

S-S= While learning

S-R= After learning

Page 49: Operant applications

Theories Explaining Classical Conditioning

Hull

• Born 1884 in Akron, NY

• Graduated U. of Michigan in 1913

• Ph.D. U. of Wisconsin 1918

• 1929-1952 Professor of Psychology at Yale

• Died 1952

Tolman

• Born Newton, Mass. on April 14, 1886

• BA at MIT in electrochemistry

• Ph.D. psychology in 1915

• Spent month at Giessen under Koffka. Heavily influenced by Gestalt movement

• Ardent pacifist

• Dismissed at Northwestern U

• Went to UC Berkeley for rest of career

S-R or S-S

Page 50: Operant applications

Behavioral vs. Cognitive Views of Learning

These traditions in learning theory have existed for decades. They give different answers to the fundamental question, “What is learned when learning takes place?”

Behaviorists say: “Specific actions”

Cognitivists say: “Mental representations”

For example, in a “Skinner Box”, a rat may receive a food reward every time he presses the bar. He presses faster and faster. What has he learned?

S-R S-S

Page 51: Operant applications

S-R vs. S-S Views of Learning

These traditions in learning theory have existed for decades. They give different answers to the fundamental question, “What is learned when learning takes place?”

For example, in a “Skinner Box”, a rat may receive a food reward every time he presses the bar. He presses faster and faster. What has he learned?

S-R view: “to press the bar.”

S-S view: “that pressing produces food.”

Page 52: Operant applications

S-R vs. S-S Views of Learning

S-R (“learns to”)

1. Learning involves the formation of associations between specific actions and specific events (stimuli) in the environment. These stimuli may either precede or follow the action (antecedents vs. consequences).

2. Many behaviorists use intervening variables to explain behavior (e.g., habit, drive) but avoid references to mental states.

3. RADICAL BEHAVIORISM (operant conditioning/behavior modification/behavior analysis): avoids any intervening variables and focuses on descriptions of relationships between behavior and environment (“functional analysis”).

Page 53: Operant applications

S-R vs. S-S Views of Learning

S-S (“learns that”)

1. Learning takes place in the mind, not in behavior. It involves the formation of mental representations of the elements of a task and the discovery of how these elements are related.

2. Behavior is used to make inferences about mental states but is not of interest in itself (“methodological behaviorism”).

3. EXAMPLE: Tolman & Honzik’s experiment on latent learning. Tolman, a pioneer of cognitive psychology, argued that when rats practice mazes, they acquire a “cognitive map” of the layout: mental representations of the landmarks and their spatial relationships.

Page 54: Operant applications

S-R or S-S

• Autoshaping: S-R

• Taste aversion: S-S

• Eyeblink conditioning: S-R

• Blocking: S-S

• Extinction: S-R

• Spontaneous Recovery: S-S

Page 55: Operant applications

Latent Learning

• Rats: one maze trial/day

• One group found food every time (red line)

• Second group never found food (blue line)

• Third group found food on Day 11 (green line)

– Sudden change, day 12

• Learning isn’t the same as performance

Page 56: Operant applications


Cognitive Maps

• Tolman trained rats in this maze, with all alleys open

– Not to scale; the path on the left is too long.

• If “Block A” in place, rats chose green (shorter) path

• If “Block B” in place, rats chose blue path

– Green path also blocked

• Rats navigate as if they have an internal map

Page 57: Operant applications

Varieties of cognitive maps? (Gallistel 1990)

Specific issues:

• Spatial scale (local vs. home-range)

• Geometric content (metric, topological)

• Reference frame (egocentric/view-dependent vs. allocentric/view-independent)

Evidence:

• People: short cuts in cities and VR (errors); mixed evidence on contents of underlying map

• Rodents: most studies on local scale; mixed evidence on contents

• Insects: on local and home-range scale; metric, egocentric

Broader Definition (Gallistel 1990): ‘A cognitive map is a record in the central nervous system of macroscopic geometric relations among surfaces in the environment used to plan movements through the environment. A central question is what type of geometric relations a map encodes’.

Page 58: Operant applications

More on Cognitive Maps: Chimpanzee Behavior

Page 59: Operant applications

More on Cognitive Maps: Chimpanzee Behavior

• Chimpanzee on experimenter’s back

• Watched site baiting: 18 locations

• Later released to retrieve food

• Most food found

• Retrieval route differed from baiting route

• Traveling distance was very efficient

Cognitive Maps (spatial learning)

Page 60: Operant applications

More on Cognitive Maps: Chimpanzee Behavior

• Second experiment

• Same general plan

• 18 locations: 9 fruits and 9 vegetables

• First retrieval visits were to retrieve fruits, according with food preferences

Page 61: Operant applications

More on Cognitive Maps: Chimpanzee Behavior

• Results suggest that chimpanzees have something like a cognitive map of the compound.

• As they are carried around, chimpanzees store information about food locations not on the basis of the particular path that they are traveling, but on the basis of their cognitive map.

Cognitive Map = A separate type of memory (Bedroom, Gestalt)

Page 62: Operant applications

More on Cognitive Maps: Chimpanzee Behavior

• Chimpanzees work with this cognitive representation to determine the most efficient route to travel in gathering food.

• This solution depends on cognitive mediation between inputs and behavior that transforms and organizes inputs.

• To explain chimpanzees’ behavior without appeal to mediating processes would provide an impoverished view of what the animal does.

Page 63: Operant applications

http://www.scottcamazine.com/photos/BeeBehavior/images/06waggleDance_jpg.jpg

Sun Compass and Memory in Bees

[Figure: food directions and dance directions at 20°, 40° and 75° (relative to “Up”)]

• Bees encode (allocentric?) flight direction in dances

• As sun moves, dances change

• Dances change even when bees can’t see sun (thus compensate by memory)

• Reference for memory: landmarks (Dyer & Gould 1981; Dyer & Dickinson 1996)

[Figure: The basic task (labels: H, F, Noon, 16:00)]

Page 64: Operant applications

A STRATEGY FOR INCREASING BEHAVIOUR

• Behavioral self-management is a strategy for increasing some desired behavior (for example, hours spent studying or exercising) by using self-administered rewards. A behavioral self-management program requires the following:

Page 65: Operant applications

Strategies for increasing a desired behavior

• Choose a target behaviour (the behaviour you want to increase)

• Record a baseline (count time engaged in the desired behaviour or number of times the desired behaviour is performed)

• Establish goals (set gradual goals – daily and weekly)

• Choose reinforcers (for when you reach daily and weekly goals)

• Record your progress (time you engaged in the behaviour or number of times you performed the activity)
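
The steps above can be sketched as a small Python helper for one target behaviour, such as minutes spent studying. Everything specific here is an assumption for illustration: the function names, the choice to space weekly goals evenly between baseline and target, and the sample numbers are not part of the slides.

```python
from statistics import mean


def baseline(daily_minutes):
    """Average time spent on the target behaviour before the program starts."""
    return mean(daily_minutes)


def gradual_goals(baseline_minutes, target_minutes, weeks):
    """Weekly goals rising in equal steps from baseline to target.

    Equal spacing is an assumption; any gradual schedule would fit the slide.
    """
    step = (target_minutes - baseline_minutes) / weeks
    return [round(baseline_minutes + step * (w + 1)) for w in range(weeks)]


def reinforcers_earned(logged_minutes, goal):
    """For each recorded day, True if the goal was met and a reward is due."""
    return [minutes >= goal for minutes in logged_minutes]


base = baseline([20, 40, 30])            # recorded baseline days (made-up data)
goals = gradual_goals(base, 60, 3)       # three weekly goals toward 60 minutes
week1 = reinforcers_earned([35, 45, 38], goals[0])
```

With this made-up record the baseline is 30 minutes, the weekly goals are 40, 50 and 60 minutes, and only the second day of week 1 earns the self-administered reward.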

Page 66: Operant applications