The Traveling salesrat: Insights into optimal spatial...

1
Introduction Methods Acknowledgments References Conclusions 0.2 0.4 0.6 0.8 1 1 2 3 4 0 Last Trial Configuration Type First Trial 1 2 3 4 0 0.2 0.4 0.6 0.8 1 Configuration Type Exploration Trial First Trial Last Trial Optimality Ratio = Path taken/min path length Theoretical Distribution of Path Lengths Experimental Distribution of Path Lengths 3 2.5 2 1.5 1 0.5 0 20 40 60 80 3 2.5 2 1.5 1 0.5 0 100 200 300 Trials Optimality Ratio Performance across trials 2.0 1.8 1.6 1.4 1.2 1.0 0.8 4 2 6 8 10 Days Optimality Ratio Performance across days 1 2 3 4 5 6 T2 T1 T3 T4 T5 T7 T8 T9 T6 T10 Traveling Salesman Problem Optimization on the basis of spatial and reward information Last Trial Last Trial First Trial First Trial 10 trials to find the optimal path for a given configuration A total of 24 configurations over a six day period were used. Re-optimization due to reward contingencies modification of First Trial Last Trial Reward removed from city after 10 trials. Trials continued until rat was able to re-optimize its path. Spatially optimal, reward optimal, or ambiguous configurations were presented. Each configurations: Exploration trial+ 3 decision trials. ? The traveling salesman problem is a classic problem in artificial intelligence and theoretical computer science in which an agent has to plan visits to a fixed set of cities. It can be solved by calculating the total distance traveled for every possible tour and sorting the solutions. Finding the best solution is computationally expensive (NP-complete problem) because each city added increases the complexity of the problem exponentially. ? Heuristic methods allow humans to find near optimal solutions. ? Understanding the neural mechanisms underlying these heuristic processes can give insights into how complex choices are made. ? We propose a rodent model to investigate problem solving strategies at both behavioral and neural levels. ? We study how rats use a combination of spatial and reward information to optimize their decisions. The authors would like to profusely thank Brian Gereke for assistance with experimental design, conduction of experiments and data analysis , Peter Hirshman and Nadia Corral-Frías for their help running experiments. Traveling Salesman Problem Optimization on the basis of spatial and reward information Re-Optimization because of reward removal T1 T4 T7 T10 Reward Removed T11 T14 T17 T20 Waveforms of VTA neurons: T21 T22 T23 T24 Re-optimization of path after a reward is removed from a learned location 30º 90º 150º 180º -150º -90º -30º Applegate, D., Bixby, R., Chvátal, V., et al. (2006) The Traveling Salesman Problem: A Computational Study. Princeton, NJ: Princeton University Press. Bures, J.,Buresova, O., and Nerad, L. (1992). Can rats solve a simple version of the traveling salesman problem? Behavioural Brain Research. 52, 133-142. Fields, H., et al. (2007). Ventral Tegmental Area Neurons in Learned Appetitive Behavior and Positive Reinforcement. Annu. Rev. Neurosci. 30, 289-316. ? The ability of rats to optimize their path based on reward contingencies suggests that reinforcement may contribute to the rats’ ability to shift their strategy towards the optimal path. ? This shift is likely to depend at least in part on the interaction between brain structures that are involved in reward processing (VTA), spatial navigation (hippocampus) and planning (prefrontal cortex). ? Route optimization occurs within a configuration, not over sessions, which suggests that this task involves planning and short term memory, not long term memory. ? Rats will choose the optimal spatial or reward solution if presented separately – this choice is made more decisively with training. When both reward and spatial options are presented together, rats will shift strategy from reward to spatial optimization with training. ? Rats are able re-optimize when reward contingencies are changed. ? Optimization based on reward availability and quantity suggests the involvement of the dopamingeric system of the ventral tegmental area (VTA). ? Unlike in most TSP problems, spatial choice is biased by the current position and orientation of the rat. Optimality Ratio = Path taken/min path length Number of Paths Number of Paths Starting Orientation (degrees) 0 30 90 150 180 -150 -90 -30 0 0.2 0.4 0.6 0.8 1 Probability of Choosing Right City City preference based on initial starting position n=45 (4 rats) n=3 rats Influence of initial orientation in cases of ambiguous choices Last Trial First Trial Rats were randomly started at eight angles (0, 30, 90, 120, 180, 210, 270, 330) relative to center of arena. Rats were allowed to visit only one city. Type 1: Spatially Ambiguous Reward Disambiguous Type 2: Spatially Disambiguous Reward Ambiguous Type 3: Spatially Disambiguous Reward Disambiguous Type 4: Spatially Disambiguous Reward Disambiguous Worst Choice Best spatial + reward Reward Choice Spatial Choice Choose R = 1 Choose L = 0 Influence of initial orientation in ambiguous choices (Fields et al., 2007) Intermediate Trial Sample paths of a rat running trials of a Type 4 configuration Spatially Optimal Reward Optimal Non-optimal Reward + Spatially Optimal Spatially Optimal Reward Optimal Non-optimal Reward + Spatially Optimal Spatially Optimal Reward Optimal Non-optimal Reward + Spatially Optimal Spatially Optimal Reward Optimal Non-optimal Reward + Spatially Optimal 679.22 1 2 3 Laurel Watkins de Jong , Gerard M. Martin , Jean-Marc Fellous 1. 2. 3. Undergraduate Biology Research Program (UBRP), Univ. of Arizona, Tucson, AZ Dept. of Psychology, Memorial University, CA Dept of Psychology and program in applied Mathematics, Univ. of Arizona, Tucson, AZ. The Traveling salesrat: Insights into optimal spatial navigation and the role of the dopaminergic system. Sample Paths Sammy. Jr Darwin Darwin Sammy Jr. Indiana 0 5 10 15 20 timed-out 0 1 2 3 4 5 Trials until re-optimization Number of occurences n=25 (4 rats) 679.22

Transcript of The Traveling salesrat: Insights into optimal spatial...

Page 1: The Traveling salesrat: Insights into optimal spatial ...amygdala.psychdept.arizona.edu/posters/TSP-SFN-09.pdf · 10 trials. Trials continued until rat was able to re-optimize its

Introduction

Methods

Acknowledgments

References

Conclusions

0.2

0.4

0.6

0.8

1

1 2 3 40

Last Trial

Configuration Type

First Trial

1 2 3 40

0.2

0.4

0.6

0.8

1

Configuration Type

Exploration Trial First Trial Last Trial

Optimality Ratio = Path taken/min path length

Theoretical Distribution of Path Lengths

Experimental Distribution of Path Lengths

32.521.510.50

20

40

60

80

32.521.510.50

100

200

300

Trials

Op

tima

lity

Ra

tio

Performance across trials

2.0

1.8

1.6

1.4

1.2

1.0

0.8

42 6 8 10

Days

Op

tima

lity

Ra

tio

Performance across days

1 2 3 4 5 6

T2T1

T3 T4

T5

T7 T8

T9

T6

T10Traveling Salesman Problem

Optimization on the basis of spatial and reward information

Last Trial

Last TrialFirst Trial

First Trial

10 trials to find the optimal path for a given configuration

A total of 24 configurations over a six day period were used.

Re-optimization due to reward contingencies modification of

First Trial Last Trial

Reward removed from city after10 trials.

Trials continued until rat was able to re-optimize its path.

Spatially optimal, reward optimal, or ambiguous configurations were presented. Each configurations: Exploration trial+ 3 decision trials.

? The traveling salesman problem is a classic problem in artificial intelligence and theoretical computer science in which an agent has to plan visits to a fixed set of cities. It can be solved by calculating the total distance traveled for every possible tour and sorting the solutions. Finding the best solution is computationally expensive (NP-complete problem) because each city added increases the complexity of the problem exponentially.

? Heuristic methods allow humans to find near optimal solutions.

? Understanding the neural mechanisms underlying these heuristic processes can give insights into how complex choices are made.

? We propose a rodent model to investigate problem solving strategies at both behavioral and neural levels.

? We study how rats use a combination of spatial and reward information to optimize their decisions.

The authors would like to profusely thank Brian Gereke for assistance with experimental design, conduction of experiments and data analysis , Peter Hirshman and Nadia Corral-Frías for their help running experiments.

Traveling Salesman Problem

Optimization on the basis of spatial and reward information

Re-Optimization because of reward removal

T1 T4 T7 T10

Reward Removed

T11 T14

T17 T20Waveforms of VTA neurons:

T21 T22 T23 T24

Re-optimization of path after a reward is removed from a learned location

0º 30º 90º 150º 180º

-150º -90º -30º

Applegate, D., Bixby, R., Chvátal, V., et al. (2006) The Traveling Salesman Problem: A ComputationalStudy. Princeton, NJ: Princeton University Press.

Bures, J.,Buresova, O., and Nerad, L. (1992). Can rats solve a simple version of the traveling salesman problem? Behavioural Brain Research. 52, 133-142.

Fields, H., et al. (2007). Ventral Tegmental Area Neurons in Learned Appetitive Behavior and Positive Reinforcement. Annu. Rev. Neurosci. 30, 289-316.

?The ability of rats to optimize their path based on reward contingencies suggests that reinforcement may contribute to the rats’ ability to shift their strategy towards the optimal path.

?This shift is likely to depend at least in part on the interaction between brain structures that are involved in reward processing (VTA), spatial navigation (hippocampus) and planning (prefrontal cortex).

?Route optimization occurs within a configuration, not over sessions, which suggests that this task involves planning and short term memory, not long term memory.

?Rats will choose the optimal spatial or reward solution if presented separately – this choice is made more decisively with training. When both reward and spatial options are presented together, rats will shift strategy from reward to spatial optimization with training.

?Rats are able re-optimize when reward contingencies are changed.

?Optimization based on reward availability and quantity suggests the involvement of the dopamingeric system of the ventral tegmental area (VTA).

?Unlike in most TSP problems, spatial choice is biased by the current position and orientation of the rat.

Optimality Ratio = Path taken/min path length

Nu

mb

er

of P

ath

sN

um

be

r o

f P

ath

s

Starting Orientation (degrees)

0 30 90 150 180 -150 -90 -300

0.2

0.4

0.6

0.8

1

Pro

ba

bili

ty o

f C

ho

osi

ng

Rig

ht C

ity

City preference based on initial starting position

n=45 (4 rats)

n=3 rats

Influence of initial orientation in cases of ambiguous choices

Last TrialFirst Trial

Rats were randomly started at eight angles (0, 30, 90, 120, 180, 210, 270, 330) relative to center of arena.

Rats were allowed to visit only one city.

Type 1:Spatially Ambiguous

Reward Disambiguous

Type 2:Spatially Disambiguous

Reward Ambiguous

Type 3:Spatially Disambiguous Reward Disambiguous

Type 4:Spatially Disambiguous Reward Disambiguous

Worst Choice Best spatial + reward Reward Choice

Spatial Choice

Choose R = 1Choose L = 0

Influence of initial orientation in ambiguous choices

(Fields et al., 2007)

Intermediate Trial

Sample paths of a rat running trials of a Type 4 configuration

Spatially Optimal

Reward Optimal

Non-optimal

Reward + SpatiallyOptimal

Spatially Optimal

Reward Optimal

Non-optimal

Reward + SpatiallyOptimal

Spatially Optimal

Reward Optimal

Non-optimal

Reward + SpatiallyOptimal

Spatially Optimal

Reward Optimal

Non-optimal

Reward + SpatiallyOptimal

679.22

1 2 3Laurel Watkins de Jong , Gerard M. Martin , Jean-Marc Fellous1. 2. 3.

Undergraduate Biology Research Program (UBRP), Univ. of Arizona, Tucson, AZ Dept. of Psychology, Memorial University, CA Dept of Psychology and program in applied Mathematics, Univ. of Arizona, Tucson, AZ.

The Traveling salesrat: Insights into optimal spatial navigation and the role of the dopaminergic system.

Sample Paths

Sammy. Jr

Darwin

DarwinSammy Jr.

Indiana

0 5 10 15 20 timed-out0

1

2

3

4

5

Trials until re-optimization

Nu

mb

er

of o

ccu

ren

ce

s

n=25 (4 rats)

679.22