Yin i Knowlton

  • 7/31/2019 Yin i Knowlton

    The role of the basal ganglia in habit formation

    Henry H. Yin* and Barbara J. Knowlton

    *Laboratory for Integrative Neuroscience, National Institute on Alcohol Abuse and Alcoholism, National Institutes of Health, 5625 Fishers Lane, TS-13, Bethesda, Maryland 20892, USA. Department of Psychology, Franz Hall, University of California, Los Angeles, California 90095-1563, USA.

    Correspondence to B.J.K. e-mail: [email protected]

    doi:10.1038/nrn1919

    Abstract | Many organisms, especially humans, are characterized by their capacity for intentional, goal-directed actions. However, similar behaviours often proceed automatically, as habitual responses to antecedent stimuli. How are goal-directed actions transformed into habitual responses? Recent work combining modern behavioural assays and neurobiological analysis of the basal ganglia has begun to yield insights into the neural basis of habit formation.

    When you flip on a light switch, your behaviour could be a result of the desire for a state of illumination coupled with the belief that a certain movement will lead to it. Sometimes, however, you just turn on the light habitually, without anticipating the consequences: the very context of having arrived home in a dark room automatically triggers your reaching for the light switch. Although to the observer these two cases might appear to be similar, they differ in the extent to which they are controlled by outcome expectancy. When the light switch is known to be broken, the habit might still persist whereas the goal-directed action might not.

    Intuitively, then, goal-directed actions are controlled by their consequences, habits by antecedent stimuli. But how can we translate such intuitive concepts into operationally defined terms and experimentally testable hypotheses? Here, we outline the basic conceptual framework that has emerged from the behavioural analysis of goal-directed actions and stimulus-driven habits, and integrate this framework with recent findings on the anatomy and physiology of the basal ganglia, a set of nuclei that have long been known to control voluntary behaviour. More specifically, we show that distinct networks involving the basal ganglia are the neural implementations of actions and habits, and that an understanding of these networks can illuminate findings from different levels of analysis, from the cellular and molecular mechanisms of synaptic plasticity to the conditions that favour habit formation and the development of compulsivity in various clinical disorders.

    Basal ganglia and instrumental behaviours

    The basal ganglia: anatomy and functions. The basal ganglia are a set of nuclei located in the cerebrum (FIG. 1). Unlike the cortex, which has excitatory, glutamatergic projection neurons, the basal ganglia contain inhibitory, GABA (γ-aminobutyric acid)-containing projection neurons. Of these projection neurons, the spiny variety belongs to the striatum (the input nucleus) and the aspiny variety belongs to the pallidum (the output nucleus)1,2. The striatal projection neurons are often quiescent owing to their intrinsic membrane properties2, and when they are activated by strong and coherent inputs from the cortex (and, to a lesser extent, the thalamus), they tend to reduce the tonically active pallidal output. The outcome of this disinhibitory pathway, the most basic pathway in the basal ganglia, is the facilitation of the targeted motor network3. However, a different pathway, traditionally known as the indirect pathway, appears to exert inhibitory control over downstream thalamocortical and brainstem networks4.

    In discussing the role of the basal ganglia in behaviour, it is useful to think of them as a biological system that operates by classic selectionist principles, possessing a generator of diversity and mechanisms of selection and of differential amplification. The striatum receives massive projections from almost all cortical areas, and from the intralaminar nuclei of the thalamus. These are organized roughly by the area from which the projection arises. The thalamocortical network, which projects to the striatum, provides a wealth of inputs that represent a diverse array of signals related to representations of sensory inputs, motor programmes and internal states2,5. This dynamic set of inputs, which can change from moment to moment, therefore constitutes a generator of diversity. Moreover, the basal ganglia, and in particular the striatum, are capable of selection and differential amplification: in the short term through lateral inhibition and the membrane properties of the striatal projection neurons, which shift between different states of excitability; and in the long run by long-term synaptic plasticity, which can preserve or alter the process of behavioural selection2,6.

    R E V I E W S

    464 | JUNE 2006 | VOLUME 7 www.nature.com/reviews/neuro

    [Figure 1 diagram labels: Cortex, Striatum, SNc/VTA, Thalamus, Brainstem, GPe, STN, GPi/SNr. Legend: dopamine modulation, indirect pathway, direct pathway, excitation, inhibition.]

    Instrumental behaviour. Given their crucial place in the cerebrum, how do the basal ganglia function in generating purposive behaviour? Divac7 and Konorski8 were among the first to systematically examine the effects of cortical and basal ganglia lesions on the acquisition of instrumental behaviours. Whenever a particular outcome is contingent on a response, be it flexing a leg, traversing a maze or pressing a lever, the behaviour in question is instrumental. Instrumental behaviours differ from reflexes and fixed action patterns, which are not controlled by the contingency between behaviour and its consequences. Lesions of the sensorimotor cortex severely impaired skilled movements, and lesions of the premotor cortex impaired the chaining of action repertoires. By contrast, lesions of the basal ganglia (particularly the striatum) disrupted the very instrumentality of actions: despite relatively intact fine movements, the animals that were tested could no longer perform or acquire actions in order to earn specific rewards or avoid aversive stimuli8.

    Although Konorski8 presciently observed that striatal lesions produced variable results, he did not have at his disposal behavioural assays that would have allowed him to precisely analyse these effects. A major obstacle to understanding basal ganglia function is the conceptual confusion that characterized the field of instrumental learning for many decades, which in some ways persists even today. Although instrumental behaviour appears to be primarily directed towards a goal, traditional theories, with a few notable exceptions9,10, dismissed this obvious possibility. In the prime of behaviourism research, the study of learning was dominated by Hull and his followers, for whom instrumental learning is described in terms of stimulus–response (S–R) bonds strengthened by subsequent reinforcement11,12. S–R/reinforcement theory was based on the work of Thorndike, and aimed to eliminate unscientific concepts such as intentionality, expectancy and internal representation11,12. The most fundamental assumption of this theory is that all behaviour is elicited by some antecedent stimuli from the external environment, and that the consequences of behaviour, by providing satisfaction or dissatisfaction to the organism, merely reinforce or weaken the S–R association. Deliberately dismissing the intentional account of goal-directed behaviour (that our behaviour can be controlled by action–outcome contingencies), the S–R/reinforcement theorist assigned no causal role to outcome expectancy. Although this position might be considered extreme today, its pervasive influence on neuroscience can hardly be exaggerated, and it remains powerful in many of the implicit assumptions made by researchers who interpret all neural activity solely as a function of antecedent stimuli presented before the motor response.

    However, research over the past two decades has shown conclusively that animals can encode the causal relationship between their actions and outcomes, and control their actions according to their anticipation of, and desire for, the outcome13,14. Consequently, we are now aware of the paramount importance of two previously neglected variables: the remembered value of the expected outcome and the knowledge of the causal relationship between the action and the outcome. The realization that these variables can be manipulated by the experimenter has revolutionized the study of purposive behaviour.

    As a result of this paradigm shift, there are now experimental assays that measure intentionality and goal-directedness. Two classes of assay have become common in the contemporary analysis of instrumental learning. In the first, the value of the outcome is increased (inflated) or decreased (devalued). Devaluation is far more common because it is easier to reduce the value of an outcome; for example, by giving the animal unlimited exposure to the food reinforcer before a brief probe test. If performance is sensitive to manipulations of outcome value (for example, if the rate of responding decreases after outcome devaluation), then the behaviour is controlled by the anticipation of the outcome. If performance is insensitive to these manipulations, then the behaviour is controlled by antecedent stimuli (it is habitual). Importantly, this test should occur in the absence of the outcome to probe the nature of memory for the association independently of new learning that can occur during the test.

    Figure 1 | A schematic of the main connections of the basal ganglia. Simplified illustration of basal ganglia anatomy based on a primate brain. The direct and indirect pathways from the striatum have net effects of disinhibition and inhibition on the cortex, respectively. STN, subthalamic nucleus; GPe, external globus pallidus; GPi, internal globus pallidus; SNr, substantia nigra pars reticulata; SNc, substantia nigra pars compacta; VTA, ventral tegmental area.

    NATURE REVIEWS | NEUROSCIENCE | VOLUME 7 | JUNE 2006 | 465

    [Box 1 figure panels. a | Ratio vs interval: reward rate plotted against response rate for ratio and interval schedules. b | Overtraining: response rate plotted against session, with changes in response rate during initial acquisition and no changes in response rate once overtrained. c | Omission and degradation: reward rate plotted against response rate for the omission and degradation contingencies.]

    In the second class of assays, the action–outcome contingency (A–O; the degree to which the outcome depends on the action) is manipulated14,15. This is often done using contingency degradation, a procedure that introduces free rewards that are independent of any action. Instrumental contingency can be viewed as the probability of reward given a particular action relative to the probability of reward given no action. If these probabilities are the same, the contingency is said to be completely degraded. This would be the case, for example, if one is paid the same amount regardless of how much work is done; the question is to what extent work output would decrease as a result of the degraded contingency between work and pay. If degrading the contingency had no effect on work, it could be concluded that the behaviour was habitual and not goal-directed.
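The contingency described above is often summarized as a single number. The following is a minimal sketch, assuming the common difference formalization, ΔP = P(reward | action) − P(reward | no action); the article itself only says the two probabilities are compared. The function name and the example probabilities are hypothetical and are not data from the studies cited.

```python
def delta_p(p_reward_given_action: float, p_reward_given_no_action: float) -> float:
    """Return P(reward | action) - P(reward | no action).

    Assumed formalization (Delta-P); both arguments must be probabilities.
    """
    for p in (p_reward_given_action, p_reward_given_no_action):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_reward_given_action - p_reward_given_no_action


# Hypothetical schedules (illustrative numbers only).
intact = delta_p(0.5, 0.0)    # reward depends entirely on the action
degraded = delta_p(0.5, 0.5)  # free rewards make the action irrelevant
omission = delta_p(0.0, 0.5)  # the action now prevents the reward
```

Under this reading, complete degradation corresponds to a value of zero and omission to a negative value, which matches the description of omission as a reversal of the normal contingency.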

    For any given behaviour to be established as a goal-directed action, it must pass both tests16. First, performance must be sensitive to revaluation of the outcome. Second, performance must be sensitive to manipulation of the A–O contingency. Actions characterized by these criteria are not defined by specific motor programmes but by the goal state, such as a certain rate of reward; in maintaining this goal state the behaviour in question is modulated bidirectionally. Such bidirectional control can be demonstrated empirically by a complete reversal in instrumental contingency known as omission (BOX 1), in which an action that previously earned a reward is arranged to prevent it, and the animal can only earn rewards by refraining from performing the action17,18. Not surprisingly, omission is the most rapid method for reducing performance of goal-directed actions.

    The analysis of the instrumental actions reviewed above has crucial implications for the study of habit formation, as behaviour not guided by outcome expectancy

    Box 1 | Conditions that lead to habit formation

    In ratio schedules, a response results in a certain probability of reward; more responses yield more rewards. In interval schedules, a response is only rewarded after a certain time interval has elapsed. Under certain conditions (for example, when a single action–outcome (A–O) pairing is used), these schedules can generate behaviours that differ greatly in their sensitivity to manipulations of outcome value and instrumental contingency. For instance, training under an interval schedule results in behaviour that is less sensitive to the imposition of the omission contingency17. In short, whereas ratio schedules produce goal-directed actions controlled by the A–O contingency, interval schedules tend to generate stimulus–response (S–R) habits25. The most crucial difference between these schedules can be illustrated by plotting their feedback functions, with the rate of response on the x-axis and the rate of reward on the y-axis106 (panel a). Whereas ratio schedules set up a strong correlation between response rates and reward rates, interval schedules do not13.

    Moreover, as Dickinson has observed, in both overtraining and interval schedules, the experienced instrumental contingency (the correlation between a change in response rate and a change in reward rate) is low24,107. In interval schedules, this experienced contingency is usually low. However, in ratio schedules the experienced contingency is high early in training, when response rates vary, resulting in varying local rates of reward; but with overtraining, the animals tend to respond at a consistently high rate, resulting in little change in the local rates of reward (panel b). Finally, this hypothesis also explains why, given two actions and two outcomes in training, behaviour was shown to be goal-directed even after extensive training with interval schedules14, as this condition ensures that experienced contingency remains high (choosing one action would completely stop reward delivered by the other action).

    The feedback function can also be used to illustrate common manipulations of the instrumental contingency. For example, omission is a complete reversal of the normal A–O contingency; that is, a response prevents the reinforcer, but no response results in reinforcer delivery (panel c). In degradation, the instrumental contingency is reduced by presenting non-contingent background reinforcers; for example, making the probability of the reinforcer the same regardless of response (panel c).
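The contrast between ratio and interval feedback functions can be sketched numerically. The code below is an illustration under stated assumptions, not the authors' model: the ratio schedule is idealized as reward rate = response rate / ratio requirement, and the interval schedule as a random-interval schedule with randomly timed responses, for which reward rate = 1 / (T + 1/B), where T is the mean interval and B is the response rate. All function names and parameter values are hypothetical.

```python
def ratio_reward_rate(response_rate: float, ratio: float) -> float:
    """Rewards per minute on an idealized ratio schedule.

    Assumes one reward per `ratio` responses on average.
    """
    return response_rate / ratio


def interval_reward_rate(response_rate: float, mean_interval: float) -> float:
    """Rewards per minute on an idealized random-interval schedule.

    Assumes a reward becomes available after `mean_interval` minutes on
    average and is collected by the next response, which arrives after
    1 / response_rate minutes on average.
    """
    if response_rate <= 0:
        return 0.0
    return 1.0 / (mean_interval + 1.0 / response_rate)


def marginal_gain(feedback, response_rate: float, step: float = 1.0, **params) -> float:
    """Extra reward rate earned per extra response per minute."""
    before = feedback(response_rate, **params)
    after = feedback(response_rate + step, **params)
    return (after - before) / step


# Hypothetical animal responding 20 times per minute.
ratio_gain = marginal_gain(ratio_reward_rate, 20.0, ratio=10.0)
interval_gain = marginal_gain(interval_reward_rate, 20.0, mean_interval=0.5)
```

With these illustrative parameters, responding faster pays off steadily on the ratio schedule but barely changes the reward rate on the interval schedule, which saturates near 1 / T. This is one way to picture why the experienced contingency is low under interval schedules.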

    Extinction: Operationally, the withholding of reinforcement after previous reinforcement.

    and the instrumental contingency can be described as an S–R habit. This is a clear prediction from S–R/reinforcement theory, according to which the outcome is not part of the S–R association, but merely strengthens or weakens it. Indeed, under many conditions behaviours are not sensitive to changes in contingency and outcome devaluation19–21. The S–R/reinforcement theory of Thorndike and Hull has, therefore, stood the test of time when judged by its success at capturing the nature of habit learning.

    As a result of extensive research, there is now a consensus that instrumental behaviours are controlled by two distinct systems, the A–O system and the S–R system, that are engaged under different conditions. In appetitive instrumental learning, the amount of training (in particular the number of rewarded responses) appears to be a crucial factor in determining the shift from A–O to S–R control over behaviour; that is, habit formation. Therefore, overtraining tends to promote habit formation22. The schedule of reinforcement used is also a key factor (BOX 1). Early studies using devaluation to examine the associative structure of instrumental conditioning failed to find any evidence that performance was controlled by goal expectancy, as devaluation had no effect on performance during the extinction test. The use of interval schedules in these studies was largely responsible for their failure to find evidence for A–O learning19,23,24. An explicit comparison of the schedules demonstrated that, even with the amount of reinforcement equated, interval schedules produce habitual responding whereas ratio schedules do not25. The difference in sensitivity to changes in outcome value must therefore be due to differences between interval and ratio schedules (BOX 1).

    Habit learning in the dorsal striatum

    Early efforts to understand basal ganglia functions were heavily influenced by S–R/reinforcement theory. According to the dominant view, the basal ganglia are the neural implementation of the law of effect, responsible for S–R learning reinforced by rewards (with the reinforcement signal possibly provided by dopamine) in a gradual process of habit formation26–28. Unsurprisingly, this view initially found considerable empirical support29,30.

    Clear evidence comes from studies using the place/response learning task, first invented by Tolman and revived by Packard and McGaugh in a series of important experiments31,32. In this task, a rat is trained to retrieve food from one arm of a cross maze surrounded by various environmental cues (FIG. 2). After training, it is given probe tests in which the starting arm is placed at the opposite end of the maze. The use of the response strategy (same left turn) shows that the learning was inflexible and response-specific, but the use of the place strategy (right turn) shows that the animal was able to incorporate surrounding spatial cues in deciding which way to turn, selecting a response that was the opposite of what was initially learned.
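The logic of the probe test can be made concrete with a small geometric sketch. It assumes a cross maze with four arms named by compass direction and a rat that runs from its starting arm to the centre before turning; these names, the `classify_probe` helper and the example layout are hypothetical conveniences, not part of the original task description.

```python
ARMS = ["N", "E", "S", "W"]  # compass arms listed clockwise
OPPOSITE = {"N": "S", "S": "N", "E": "W", "W": "E"}


def arm_entered(start_arm: str, turn: str) -> str:
    """Return the arm reached after turning at the choice point."""
    heading = OPPOSITE[start_arm]          # the rat runs from its arm toward the centre
    step = {"right": 1, "left": -1}[turn]  # a right turn is one step clockwise
    return ARMS[(ARMS.index(heading) + step) % 4]


def classify_probe(train_start: str, train_turn: str, probe_turn: str) -> str:
    """Label a probe-trial turn made from the arm opposite the training start."""
    goal = arm_entered(train_start, train_turn)
    probe_start = OPPOSITE[train_start]
    if probe_turn == train_turn:
        return "response"  # same body turn as in training
    if arm_entered(probe_start, probe_turn) == goal:
        return "place"     # opposite turn that reaches the trained location
    return "other"


# Hypothetical layout: trained from the south arm with a left turn.
trained_goal = arm_entered("S", "left")
same_turn = classify_probe("S", "left", "left")
opposite_turn = classify_probe("S", "left", "right")
```

In this layout the trained left turn from the south arm leads to the west arm. On a probe from the north arm, repeating the left turn enters the east arm (response strategy), whereas turning right returns to the west arm (place strategy).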

    After moderate training, most rats used the place strategy when tested, but after extensive training they switched to a response strategy. Moreover, with inactivation of the dorsal striatum, the rats were more likely to use the place strategy despite extended training. By contrast, inactivation of the hippocampus had the opposite effect: the response strategy was used more frequently even early in training32.

    These results have two important implications. First, with overtraining, there is a shift in behavioural control from goal-directed actions to habits, and such a shift can be revealed by a behavioural assay. Second, the dorsal striatum and the hippocampus might, on the basis of this account, be viewed as competing learning systems. This view has been developed by Poldrack and Packard, who argued that direct or indirect neural connections between the hippocampus and dorsal striatum could mediate the competition between them33.

    Data from human studies suggest that there is a similar dissociation between declarative learning that is dependent on the medial temporal lobe (MTL) and non-declarative striatum-dependent learning. Unlike habits, declarative memories can be acquired rapidly, often after a single trial. These memories are explicit, in that participants are aware of the memories, and they are flexible, in that they can be applied to new situations. For example, declarative and habit learning were dissociated in a recent study using a concurrent discrimination task in which pairs of objects were presented34. The participants' task was to choose the rewarded item in each pair. Neurologically intact participants learned these discriminations quickly. Patients with severe amnesia following damage to the MTL were also able to learn these discriminations, but their performance improved much more slowly. Although the patients eventually learned the discriminations, they did not show explicit knowledge of these associations. They were unable to choose the rewarded items from the total array of stimuli. Their performance appeared to be habitual, with the presentation of the pair automatically eliciting the choice of the correct item. Indeed, the amnesic participants justified their choices by stating that some items 'just seemed right', rather than relying on their declarative memory for previous trials.

    Another task that has been used to assess habit learning in humans is the probabilistic classification task. In this task, a series of cues are each probabilistically associated with one of two outcomes, and the participant must guess which outcome is predicted on the basis of the cues that appear in each trial. Because the cues and outcomes are probabilistically associated, it is difficult to memorize their relationship explicitly. Amnesic patients are able to learn these associations normally, which is consistent with the idea that they are learned independently of MTL structures that support declarative memory. Furthermore, patients with Parkinson's disease, who exhibit abnormal striatal functioning due to loss of dopaminergic input, have been shown to be impaired in the implicit learning of these associations35, although they manage to achieve normal levels of performance with further training. This suggests that other neural systems can support learning in this task. A recent study found that patients with mild Parkinson's disease were able to perform almost as well as neurologically normal participants on the probabilistic classification task, but they showed a very different pattern of brain activation during performance as revealed by functional MRI. Whereas in control participants the striatal regions were activated during learning, patients with Parkinson's disease showed activation in the hippocampus and surrounding MTL cortical regions36. It appears that patients with Parkinson's disease achieve good performance by relying on declarative memory, whereas neurologically intact participants rely on non-declarative learning mechanisms. Many real-world tasks encountered by humans probably involve both habit and declarative learning; the system that contributes most to performance depends on the amount of training, the ease of memorizing associations and the relative integrity of the basal ganglia and MTL in the learner.

    [Figure 2 panel labels. a | Training and probe test on the cross maze: Start, Goal, Response, Place. b | Win-stay on a radial arm maze.]

    Functional heterogeneity in the dorsal striatum. Despite the evidence for basal ganglia involvement in habit learning, many findings cannot be explained by the idea that the dorsal striatum is the substrate of this type of learning. For example, studies recording from caudate cells in monkeys performing a saccade task have shown that the neural activity encoding the preferred direction of saccade could change according to whether that direction is rewarded, and this activity is rapidly modified as new contingencies are encountered37–39. Simultaneous recording from the prefrontal cortex (PFC) and caudate has shown that caudate activity rapidly adapts to the contingency before PFC activity does, and even before significant improvements in performance occur40. Such data suggest that certain learning mechanisms in the striatum do not have the characteristics of habit learning, that anticipation of future rewards has a crucial role in regulating striatal activity, and that changes in neural activity as a result of learning occur at a rate too rapid to be explained by the slow and gradual changes posited by traditional S–R/reinforcement theory.

    Because the dorsal striatum is a large and heterogeneous structure, similar to the cerebral cortex, the question naturally arises as to whether, like the cortex, it is also functionally specialized. The caudate in primates is part of the associative striatum, which receives inputs from association cortices. It corresponds to the dorsomedial striatum (DMS) in rodents, whereas the putamen is part of the sensorimotor striatum, corresponding to the dorsolateral striatum (DLS) in rodents41 (FIG. 3a). Many investigators have created large lesions of the dorsal striatum in rodents, without regard for the medial/lateral distinction, but the damage appears to have been more prominent in the lateral region.

    Indeed, the DLS differs from the DMS in connectivity, distribution of various receptors and mechanisms of synaptic plasticity41–43 (BOX 2). Previous studies have also suggested a functional dissociation between the DLS and DMS42,44. For example, work by Devan and White showed that the DMS, like the dorsal hippocampus, is involved in flexible place learning, whereas the DLS subserves inflexible response learning45,46. In particular, they discovered that lesions of the DMS result in a preference for cue-based responding in the water-maze task. Taking into account these results and the different patterns of anatomical connectivity, these investigators proposed that the DMS belongs to the same functional system as the hippocampus.

    In view of the distinction between actions and habits outlined above, these considerations raise the interesting possibility that the DLS is involved in S–R learning, whereas the DMS is involved in A–O learning. Yin et al. conducted a series of studies to test this hypothesis using assays (BOX 1) that could be applied to instrumental learning paradigms17,21,47–49. Taking advantage of the established differences between ratio and interval feedback schedules, they first examined the effects of excitotoxic lesions to the DLS using variable interval schedules, which are known to generate habits: in this case, lever pressing that is insensitive to outcome devaluation. After training, the sucrose reward was devalued by inducing taste aversion until the animals stopped consuming it in their home cages. When these rats were tested later in extinction, lever pressing of controls was not reduced by devaluation. By contrast, although rats with DLS lesions could normally learn to press a lever for reward, they made fewer responses after devaluation relative to the controls. It appears that because their habit system was disrupted by the lesion, the alternative A–O system assumed control over behaviour. However, a similar effect was not observed in rats with DMS lesions.

    Figure 2 | Simple maze tasks for measuring habits and actions. a | In the place/response task, rats are trained to retrieve food from one arm of a T-maze or cross maze. The content of learning can be assessed by moving the starting arm to the other side of the maze on a probe test. The animal may enter the arm corresponding to the location of the reward during training (place strategy) or the arm corresponding to the turning response that was reinforced during training (response strategy). b | In the radial arm maze, animals can learn either a win-stay or a win-shift contingency. In win-stay, arms baited with food are signalled by a cue (such as a light at the entryway). Animals will gradually learn to respond to these cues by running down the arms and retrieving the food. Extensive win-stay training produces behaviour insensitive to devaluation113, and requires the dorsolateral striatum114. By contrast, the win-shift task is similar to natural foraging in that animals need to efficiently traverse the region without revisiting areas before resources are replenished. They must learn the location of the arms that they have visited on each trial. Because arms are not re-baited on each trial, once the animal visits the arm and eats the food, it should remember not to return to that arm during the session. Win-shift performance is sensitive to devaluation113, and is impaired by hippocampal lesions114.

    [Figure 3 diagram labels. a | Limbic network: orbital and ventral PFC; limbic striatum (nucleus accumbens); ventral pallidum; mediodorsal thalamus. Associative network: prefrontal and parietal association cortices; associative striatum (caudate/DMS); associative pallidum; mediodorsal/ventral thalamus. Sensorimotor network: sensorimotor cortices; sensorimotor striatum (putamen/DLS); motor pallidum; ventral thalamus. Levels: thalamocortical network, basal ganglia, midbrain (DA neurons). Arrow: habit formation, increasing effector specificity and automaticity. b | Associative network, action–outcome (A–O); sensorimotor network, stimulus–response (S–R). Legend: excitation, inhibition, dopamine modulation, disinhibition.]

    Figure 3 | Cortico-basal ganglia networks as the fundamental motifs of cerebral organization. a | Highly simplified schematic illustration of the three major networks: the limbic, associative and sensorimotor networks. b | Schematic illustration showing cortico-basal ganglia networks in relation to serial adaptation. A shift from the associative to the sensorimotor cortico-basal ganglia network is observed during habit formation. DA, dopamine; DLS, dorsolateral striatum; DMS, dorsomedial striatum; PFC, prefrontal cortex.

    In another study, to assess the role of the DMS in A–O learning, Yin et al. used a training procedure with two actions and two outcomes under variable ratio schedules. This procedure generates goal-directed actions that are sensitive to outcome devaluation and contingency degradation14. The posterior DMS (pDMS) was shown to be a crucial substrate for the acquisition and expression of goal-directed actions. Both pre- and post-training lesions, as well as reversible inactivation of the pDMS, abolished sensitivity to devaluation and degradation48. Moreover, local blockade of NMDA (N-methyl-D-aspartate) receptors, which are required for the induction of long-term potentiation (LTP) in this region, specifically prevented the encoding of new A–O contingencies without impairing performance47. Therefore, the pDMS appears to be a crucial neural substrate for the learning and expression of goal-directed actions. In its absence, the behaviour of the animal becomes habitual even under training conditions that result in goal-directed actions in control rats.

    Moreover, it was shown that the pDMS is also involved in flexible choice behaviour in the place/response task on a cross maze49. After pre-training lesions were created, rats were trained extensively to retrieve food from the east arm of the maze, starting from the south arm, by turning right at the choice point (FIG. 2). Unlike control rats, most rats in the pDMS lesion group turned right on the probe tests, when they started from the north arm. This observation agrees with a growing body of recent data that show the role of the DMS in flexible choice behaviour50,51. Note that the key manipulation in the place/response task, namely the probe test with the opposite starting point, is similar to a reversal in the A–O contingency. Previously, a particular turn would lead to the arm with food, but with the 180° rotation of the starting point, the same turn would lead to the previously unrewarded arm. Again, the choice behaviour of rats with pDMS lesions is rendered inflexible and habitual.

    Lever-pressing controlled by the instrumental contingency therefore shares common neural substrates with the use of the place strategy in the maze. Despite differences between the motor programmes of pressing a lever and of traversing a maze, the common neural substrate in the pDMS suggests that this area is crucial for learning the A–O contingency, the feature shared by these tasks. On the cross maze, after a reversal in starting point, reaching the original goal requires a reintegration of the spatial features of the environment with the goal location. Whereas the hippocampus is necessary to ascertain the spatial location of the reward, the pDMS is involved in choosing the correct course of action that leads to this location.

    One interpretation of these results is that the hippocampus does not compete with, or function independently of, the striatum, as has been previously claimed29,33. Rather, the hippocampus can act together with dorsomedial and ventral striatal regions to form a functional circuit. This hypothesis is supported by studies that examined activity in the DMS during spatial navigation on various mazes52,53. According to these studies, the DMS contains spatially selective neurons that fire when animals take a particular route to reach a goal; it also contains head-direction neurons with activity aligned with that of the place fields of hippocampal place cells. Therefore, information about the current position of the animal provided by hippocampal place cells can be used to signal where to go to reach a definite goal, and this information is probably conveyed to the DMS directly via the cortico-striatal projection from the hippocampal pyramidal neurons.

    Further evidence for the role of the associative striatum in A–O learning has come from studies examining caudate (DMS homologue) activity in humans and other primates54–56. Tricomi et al.57 found that caudate activity was modulated by the perceived contingency between action and outcome. Robust activation was found only when the participants thought that their action resulted in the gain or loss of money, whereas time-locked anticipation of the outcome without the action contingency did not activate the caudate. These results also clearly implicate the associative striatum as a crucial component of the A–O system. Furthermore, Williams and Eskandar recorded neural activity from both the anterior caudate and the putamen (DLS homologue) in monkeys trained to move joysticks after presentations of discriminative stimuli58. These authors showed that caudate activity in response to outcome presentation is strongly correlated with the rate of learning (slope of the learning curve), whereas putamen activity is correlated with the learning curve itself. Although the authors interpreted such learning as S–R, in view of the framework above, the behaviour of the monkeys is probably controlled by the A–O contingency. The specific discriminative stimuli merely tell the animal which A–O contingency is in effect (that is, that a particular joystick movement will lead to reward), and the learning that occurs during the steepest portion of the learning curve corresponds to the initial acquisition of the A–O association. However, once this rapid learning has taken place, caudate activity quickly decreases, whereas putamen activity remains high and follows the learning curve closely until it asymptotes. This pattern of activity agrees with earlier theoretical claims about the relative rates of learning in the A–O and S–R systems59. Moreover, this study also found that, whereas stimulation of the putamen had no effect, stimulation of the caudate significantly enhanced the rate of learning without changing the asymptotic level of performance or hedonic preference, suggesting a causal role for this structure in instrumental learning.

    Box 2 | Different rules of synaptic plasticity

    A basic assumption in contemporary neuroscience is that long-term synaptic plasticity, widely studied in the forms of long-term potentiation (LTP) and long-term depression (LTD), is a central physiological mechanism that underlies learning. The dissociation between the dorsolateral striatum (DLS) and dorsomedial striatum (DMS) at the level of behaviour is mirrored by distinct rules of synaptic plasticity in these regions. Although dopamine is crucial for all forms of striatal plasticity, the exact mechanisms show remarkable regional variation43.

    The DMS expresses LTP that depends on the activation of D1-like dopamine receptors and NMDA (N-methyl-d-aspartate) glutamate receptors43,108. The blockade of NMDA receptors in this region specifically prevents the learning of action–outcome contingency, suggesting a critical functional role for LTP in the DMS in such learning47. Additional evidence comes from a study using intracranial self-stimulation of the dopaminergic cells in rats to reinforce lever pressing109. The optimal parameters for self-stimulation were also found to induce cortico-striatal LTP in vivo, and the degree of potentiation in the cortico-striatal pathway in the DMS negatively correlated with the time taken to acquire lever pressing, which is a measure of initial action–outcome learning. This form of LTP requires the activation of D1 receptors, suggesting that it is the same form as is observed in vitro.

    By contrast, dopamine-dependent striatal LTD is usually found in the DLS, and requires the activation of D2-like dopamine receptors, group I metabotropic glutamate receptors, and L-type calcium channels110. The resulting increase in intracellular calcium causes the postsynaptic synthesis and the release of endocannabinoids, which then act as a retrograde messenger on presynaptic cannabinoid CB1 receptors to decrease the probability of glutamate release from the cortico-striatal terminals111. The role of this intriguing form of plasticity in DLS-dependent habit learning is not known. However, previous work has shown that local infusion of a D2 receptor agonist can improve acquisition on the win-stay task, which typically produces habitual responding that is insensitive to outcome devaluation112,113.

    As the above observations suggest, the same learning experience can result in different types of synaptic change in the DLS and DMS. These changes are regulated by distinct rules as a result of differential distribution of key receptors in these regions. Future studies will no doubt shed light on this striking correlation between mechanisms of striatal plasticity at the cellular and molecular level and the action/habit dissociation at the behavioural level.

    A hierarchy of cortico-basal ganglia networks

    We have suggested that associative structures (abstract descriptions of learning processes at the behavioural level) can be mapped onto discrete regions in the dorsal striatum. In particular, A–O learning can be mapped onto the DMS, whereas S–R learning can be mapped onto the DLS. How, then, are we to interpret such demonstrations of functional heterogeneity from studies that use the strategy of process dissociation? More importantly, what does it tell us about habit formation, whereby behavioural control is switched from one system to another?

    Paradoxically, the chief implication of such functional heterogeneity is not that a more refined analysis of behaviour is accompanied by a more refined localization of function. If we compare the relevant data on the striatum with data from other brain regions that project to, or receive inputs from, the basal ganglia, a different picture emerges.

    Considerable evidence shows that the PFC also has a crucial role in instrumental learning60–65. Studies by Balleine and colleagues have shown that rats with pre-training lesions to the medial PFC, especially the prelimbic region, which provides massive projections to the DMS, failed to show sensitivity to devaluation and degradation60,61. In addition, pre-training lesions of the mediodorsal nucleus of the thalamus, an eventual downstream target of outputs from the DMS as well as the major source of thalamic projections to the PFC, also abolish sensitivity to devaluation and degradation66. To a certain extent, these observations resemble the effects of the pDMS lesions reviewed above67.

    Taking into account the above observations, we can no longer maintain that the dorsal striatum as a whole is a substrate for habit learning. Nor can we capture the distinction adequately with the traditional contrast between hippocampus-dependent learning and striatum-dependent learning. It should be noted in this connection that selective pre-training lesions of the hippocampus do not consistently render behaviour habitual, as lesions of the pDMS do68. One possible role for the hippocampus, in view of this result and of the results from previous maze studies31,46,69, is the integration of goal-directed actions that require some representation of spatial and/or temporal configurations. In any case, although the precise role of the hippocampus in A–O learning remains to be determined, the considerable functional heterogeneity in the dorsal striatum prompts a reconsideration of the currently accepted model of multiple memory systems in which the striatum as a whole serves a specific mnemonic function.

    Alternatively, we propose that a cortico-basal ganglia network is a fundamental motif of cerebral organization, and is the fundamental unit of function at the level of behaviour (FIG. 3a). This claim is inspired by the traditional model of basal ganglia organization in terms of parallel and re-entrant loops70, although we do not place special emphasis on either the thalamocortical target of basal ganglia outputs or the strictly parallel nature of the networks. Indeed, as discussed below, interaction between networks is vital to the transformation from actions to habits.

    A cortico-basal ganglia network is a functional group comprising different cortical, striatal and pallidal components, in addition to the various cell groups (for example, dopaminergic) in the midbrain that constitute the brain's value system, as well as the associated diencephalic structures (for example, the thalamus and the subthalamic nucleus). The integration of various physiological processes in these components results in the output of the network, that is, behaviour. Although each of these components, by virtue of characteristic physiological properties, has unique computational properties, at the behavioural level it is the integrated functioning of a distributed network comprising various components that is important. That is, when we probe behaviour with contemporary behavioural assays, we can map dissociable classes of behaviours onto dissociable cortico-basal ganglia networks. This point is worth emphasizing, as systems neuroscience is often dominated by attempts to localize psychological functions without regard for the actual functional circuitry of the brain. Not only do the psychological functions lack operational specificity, but the anatomical entities that are said to subserve such functions also lack the requisite circuitry. For instance, it is often asserted that the neocortex mediates a particular function, whereas the striatum subserves another40. By contrast, using operationally defined representational structures that can be dissociated behaviourally allows us to identify the distributed networks that control distinct types of decision-making and learning. Although our proposal remains preliminary, and needs to be refined and corrected by future research, it should be clear by now that the traditional view of multiple memory systems (which divides the cerebrum into distinct functional systems corresponding to visually distinct anatomical entities such as the hippocampus, amygdala, striatum and neocortex) does not provide a fully satisfactory explanatory framework.

    In the framework proposed here, the corticostriatal projections are loosely organized by cortical region so that the limbic cortex projects to the limbic striatum (mainly the nucleus accumbens), the association cortex projects to the dorsomedial, or associative, striatum, and the sensorimotor cortex projects to the dorsolateral or sensorimotor striatum71 (FIG. 3a). The limbic network, which has a key role in appetitive Pavlovian learning, can exert tremendous influence on the associative and sensorimotor networks (discussed below).

    In the associative network, the medial PFC (similar to the dorsolateral PFC in primates72) and the DMS (caudate) are involved in transient or working memory. Lesions of either structure impair performance on spatial delayed response and delayed alternation tasks7,73,74. Like activity in the PFC62, caudate activity is also strongly modulated by anticipation of reward75. Thus, the associative network is capable of monitoring recent actions as well as anticipating their consequences. By contrast, the sensorimotor level comprises the sensorimotor cortices and their targets in the basal ganglia, beginning with the DLS. The outputs of this circuit eventually reach the motor cortices and brainstem motor networks. Unlike the associative striatum, neural activity in the sensorimotor striatum is not directly modulated by reward expectancy, but is more closely related to movements and to discriminative stimuli76,77.

    Habit formation and serial adaptation. Joel and Weiner proposed an important revision to the traditional scheme of parallel circuits41,78,79. They argued that, rather than forming closed loops with strict point-to-point topographical organization, different loops interact through interconnections between them. This claim is supported by recent anatomical work. In addition to the closed, strictly reciprocal projections, there are open striatonigral projections to a nigral area that, in turn, projects to a different striatal region80. These connections could allow the activity in one cortico-basal ganglia circuit to be propagated to the next circuit iteratively, suggesting a hierarchical organization in which a given cortico-basal ganglia circuit can be considered as a particular level in a functional hierarchy81. In addition, further interaction between circuits is possible at the level of the thalamo-cortico-thalamic connections82.

    We therefore propose that these overlapping cortico-basal ganglia networks form a labile hierarchy with three major levels, consisting of the limbic (stimulus–outcome, S–O), associative (A–O) and sensorimotor (S–R) networks (FIG. 3a). Here, we focus on the last two networks (FIG. 3b), which we locate in the two cortico-basal ganglia circuits coursing through the dorsal striatum. These networks are characterized by strong re-entrant projections to the thalamocortical network, often precisely re-entering the cortical region from which the corticostriatal projections arise83. The associative network is crucial for the acquisition and performance of goal-directed actions, but in the course of habit formation this network appears to relinquish control over behaviour to the sensorimotor network, which is responsible for S–R habits. This relationship is most clearly revealed in two related sets of observations: one on differences in the extent of effector specificity, and the other on the switch, with extended practice, from one network to another in the control of behaviour.

    Effector specificity refers to the extent to which the learning of a skill, as reflected in various performance measures, is limited to the effector (for example, a hand) with which it is originally trained. As shown by a study using monkeys, correct performance early in the learning of a behavioural sequence is not specific to the hand originally used to perform the sequence; with extensive practice, however, correct performance becomes specific to the hand used84. This task, not surprisingly, also requires the striatum, and learning of new and older sequences depends on different striatal regions.

    The degree of effector specificity reflects the level of functional integration in the hierarchical organization of cortico-basal ganglia networks. The associative network achieves a higher level of functional integration, having at its disposal a wider range of motor programmes that can be selected to reach the goal. It is not effector-specific, possibly owing to the bilateral corticostriatal projections in this network. By contrast, the sensorimotor network is more effector-specific, possibly owing to its more lateralized corticostriatal projections85. With habit formation, therefore, the control of behaviour shifts from a higher level of functional integration to a lower one; more specifically, from the associative cortico-basal ganglia network to the sensorimotor cortico-basal ganglia network (FIG. 3b). However, extensive damage to either network results in the other network assuming control over instrumental behaviour17,21,47–49,60.

    Human imaging studies of habit learning have found that overtraining of a behaviour shifts the cortical substrate from ventral areas to more dorsal areas, and similar shifts have been observed in the striatum. Learning of new motor responses, for example, activated the caudate and the dorsolateral PFC, whereas with well-learned sequences the site of activation shifted to the putamen and motor cortices. When well-trained participants were asked to pay attention to their actions, the caudate and the more ventral PFC were again activated86,87. Such findings are not surprising in light of the hierarchical framework: attention to action requires the associative network, but once a task is well learned only the sensorimotor network is needed for its performance.

    In another study, Poldrack et al. examined the neural basis of automaticity, a concept from cognitive psychology operationally defined as resistance to interference from the performance of a secondary task88. After extensive training, the associative cortico-basal ganglia network, including the dorsolateral PFC and its corresponding striatal target in the caudate, decreased in activity. However, the supplementary motor area and the putamen/globus pallidus, parts of the sensorimotor cortico-basal ganglia network, did not show a similar decrease. As behaviour became more automatic with extensive practice, there was also a shift from the associative to the sensorimotor cortico-basal ganglia networks.

    [Figure 4 diagram labels: stimulus (S); response rate (R); outcome rate (O); responses; outcomes; experienced contingency detector; contiguity detector; motor output.]

    Temporal-difference algorithm
    A reinforcement learning method that is driven by the difference between temporally successive predictions, rather than by the difference between predicted and actual outcomes.

    Markov decision processes
    A stochastic control process with the Markov property: future states are conditionally independent of past states and depend only on the current state.

    Potential mechanisms for serial adaptation. What are the mechanisms underlying the processes of serial adaptation described above? Unfortunately, there is little evidence available to answer this question. As mentioned above, the spiralling connections between the striatum and the midbrain discovered by Haber and colleagues could serve as a possible anatomical instantiation of links between networks, but numerous other possibilities exist82. Without indulging in speculative anatomy, we discuss the problem at a more abstract, computational level, which is open to different neural implementations.

    As described in BOX 1, Dickinson first proposed that the experienced contingency between behaviour and reward is the key determinant of whether behaviour is goal-directed or habitual. Experienced contingency is defined as the correlation between changes in reward rates and changes in response rates. This account has implications for possible neural implementations. It suggests that there are neural detectors for rates of responses and rates of outcomes, and that outputs from these detectors must converge to yield some estimate of experienced contingency, which could determine whether the A–O system or the S–R system is engaged. To detect rates and changes in rates, a process akin to differentiation would be appropriate. For example, as illustrated by FIG. 4, activity in a particular unit could simply reflect the derivative (for example, rate) of activity upstream, and an iteration of this process could readily yield the second derivative (for example, a change in rate). Although our framework implicates the cortico-basal ganglia networks as the neural implementations of such computational processes, identifying the specific substrates requires extensive empirical work. This simple mechanism suggests that any reduction in experienced instrumental contingency, as encountered in contingency degradation and overtraining, could lead to reduced output of the contingency detector, and it is this output that would compete with the S–R/reinforcement system for the control of behaviour.
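As a purely computational illustration of this account, the sketch below estimates experienced contingency from cumulative counts of responses and outcomes. It is a minimal sketch under stated assumptions: the function names, the use of finite differences as the "differentiation" step, the use of Pearson correlation as the convergence step and the example numbers are all hypothetical choices made here, and nothing in it is a claim about the actual neural implementation.

```python
from math import sqrt


def differentiate(values, dt=1.0):
    """Finite-difference derivative of a regularly sampled signal."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either signal never changes."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def experienced_contingency(cumulative_responses, cumulative_outcomes, dt=1.0):
    """Correlate changes in response rate with changes in outcome rate.

    Both inputs are cumulative event counts sampled every `dt` time units
    (at least three samples each).
    """
    # First derivative: rate of responding and rate of outcomes.
    response_rate = differentiate(cumulative_responses, dt)
    outcome_rate = differentiate(cumulative_outcomes, dt)
    # Second derivative: change in each rate.
    return pearson(differentiate(response_rate, dt), differentiate(outcome_rate, dt))


# Hypothetical session: outcomes track responding (one outcome per five responses).
responses = [0, 10, 30, 35, 65, 80]
contingent_outcomes = [0, 2, 6, 7, 13, 16]
# Hypothetical degraded session: outcomes arrive at a fixed rate regardless of responding.
degraded_outcomes = [0, 3, 6, 9, 12, 15]

print(experienced_contingency(responses, contingent_outcomes))
print(experienced_contingency(responses, degraded_outcomes))
```

In this toy example the estimate is high when outcome rate follows response rate and falls to zero when outcomes are delivered independently of responding, which corresponds to the reduced detector output that the text associates with contingency degradation.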

    A different and more formal model, which accounts for much of the data on the various conditions leading to habit formation, was provided by a recent theoretical paper89. Using a set of computational methods known as reinforcement learning, Daw et al. modelled the process of habit formation by combining two independent controllers with distinct mechanisms for estimating value functions (the yield of behaviour in a given state). The model-based controller was used to simulate the A–O system, whereas the model-free controller was used to simulate the S–R habit system. The key proposal was that arbitration is based on the uncertainty (posterior variances of estimated values or expected inaccuracy) in estimating the value function; the value determining actual choice behaviour is taken from the controller with the least uncertainty. According to Daw et al., the model-free (habit) controller, using the temporal-difference algorithm, estimates value functions by caching (that is, storing a long-run value for future use), and choice behaviour is determined by the stored value. Because such estimates are divorced from the outcome (much like the S–R reinforcement theory), this method is computationally tractable but inflexible, yielding behaviour that is insensitive to outcome devaluation, whereas exactly the opposite is true of the model-based controller (A–O system). Further work is needed to extend the uncertainty-based model beyond discrete Markov decision processes to truly free operant conditions, and to incorporate instrumental contingency into this model.
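The contrast between cached and computed values can be illustrated with a toy sketch. This is not the model of Daw et al.: the class names, the one-step task, the learning rate and, in particular, the two uncertainty measures (a count-based stand-in for posterior variance and a fixed constant for the model-based controller) are simplifying assumptions introduced here only to show the arbitration idea.

```python
from dataclasses import dataclass


@dataclass
class ModelFreeController:
    """Habit (S-R) system: caches a long-run value for the action."""
    alpha: float = 0.1   # learning rate (assumed)
    value: float = 0.0   # cached value
    n_updates: int = 0

    def update(self, reward):
        # Temporal-difference update for a one-step task: the successor
        # prediction is zero, so the TD error is reward - value.
        self.value += self.alpha * (reward - self.value)
        self.n_updates += 1

    def uncertainty(self):
        # Illustrative stand-in for posterior variance: shrinks with training.
        return 1.0 / (1 + self.n_updates)


@dataclass
class ModelBasedController:
    """Goal-directed (A-O) system: computes value from an action-outcome model."""
    p_outcome: float = 1.0  # probability that the action yields the outcome
    noise: float = 0.05     # fixed uncertainty (assumed)

    def evaluate(self, outcome_value):
        return self.p_outcome * outcome_value

    def uncertainty(self):
        return self.noise


def arbitrate(mf, mb, outcome_value):
    """Take the value from whichever controller is currently less uncertain."""
    if mf.uncertainty() < mb.uncertainty():
        return "habit", mf.value
    return "goal-directed", mb.evaluate(outcome_value)


mf, mb = ModelFreeController(), ModelBasedController()
for trial in range(1, 101):
    mf.update(reward=1.0)  # the action is rewarded on every training trial
    if trial in (5, 100):
        # Probe after devaluation: the outcome is now worth nothing.
        print(trial, arbitrate(mf, mb, outcome_value=0.0))
```

With these assumed parameters, the probe after limited training is answered by the model-based controller, whose value drops to zero as soon as the outcome is devalued, whereas the probe after extended training is answered from the cache, which still holds the old value. That is the insensitivity to devaluation described in the text.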

    Habits in relation to addiction

    Addiction has often been viewed simply as a maladaptive type of habit learning90. Although this view is supported by the insensitivity of drug-seeking behaviour to harmful consequences, the motivational compulsion seen in addiction can hardly be explained by S–R/reinforcement theory alone. Although our suggestion that habit formation involves the serial adaptation of distinct cortico-basal ganglia networks is also supported by the literature on addiction91,92, in the case of addiction consideration must be given to additional processes, especially appetitive Pavlovian conditioning, as incidental pairings between situational cues and drugs allow such learning to take place. In most situations, of course, Pavlovian conditioning and instrumental learning can occur simultaneously, and interact in controlling behaviour. In our view, to understand addiction it is necessary to consider these interactions.

    Figure 4 | Schematic illustration of hypothetical mechanisms for the detection of instrumental contingency in appetitive instrumental learning. The most straightforward mechanism for the detection of rates and changes in rates is the biological equivalent of differentiation. Anticipation is made possible by a higher-order derivative of the detected variable, just as velocity can increase more quickly than distance. Therefore, in the neural implementation of differentiation we already have a possible mechanism for prediction. The output of the experienced contingency detector should have a crucial role in determining whether the action–outcome (A–O) system or the stimulus–response (S–R) system is controlling behaviour. In the absence of any activation of this detector, the S–R system, as described by traditional S–R/reinforcement theory, can assume control over behaviour. In this illustration we have also assumed that the contiguity between response and outcome reinforces the S–R association.

    Stereotypy
    Repetitive patterns of behaviour that are characterized by the lack of variation; often observed in various psychiatric disorders and after psychomotor stimulant administration.

    Striosome
    A patch-like compartment in the striatum that is characterized by low acetylcholinesterase staining and other chemical markers.

    In Pavlovian conditioning, the contingent pairing of a conditional stimulus (CS) and an outcome results in the acquisition of conditional responses (CRs) to the previously neutral stimulus. The CR is not controlled by the response–outcome contingency: even if the response prevents the outcome, as when an omission contingency (BOX 1) is imposed, the CR is still elicited by the CS93.

    As Berridge and Robinson have argued, situational cues in addiction can acquire motivational properties, which they call incentive salience94. Incentive salience is a measure of how much the reward is wanted rather than liked, and it is this property that is argued to be greatly enhanced in addiction. Being a description of appetitive preparatory CRs, it can be dissociated from consummatory CRs such as taste reactivity8,95. Preparatory CRs are usually less specific than consummatory CRs (for example, salivation); although measurable peripherally, they also correspond to central motivational states such as craving or wanting in appetitive learning, or fear in aversive learning8. Such states induced by predictors of reward can directly potentiate instrumental responding8,96.

    It has long been claimed that discriminative stimuli preceding instrumental actions and reafferent stimuli generated by actions can form associations with the outcome and further motivate instrumental behaviour97. Although such explanations fail to account for much of the contemporary data, they remain valuable for their emphasis on Pavlovian–instrumental interactions, which have been amply documented24. Pavlovian–instrumental transfer (PIT), a rigorous experimental method used to study such interactions, assesses the extent to which Pavlovian CSs that predict outcomes can potentiate instrumental performance yielding the same outcomes98. As PIT is normally produced by long, tonic CSs, which can also elicit preparatory CRs in appetitive conditioning, one potentially important mechanism underlying addiction at the level of neural systems is the heightened transfer from the Pavlovian incentive system to the systems that govern instrumental behaviour8. This mechanism is in accord with the important role of environmental cues in triggering compulsive drug seeking91.

    In view of the serial adaptation hypothesis described above, PIT can also be viewed in terms of interactions between cortico-basal ganglia networks (FIG. 3a). In this connection, an intriguing recent finding is that as behaviour becomes habitual it also becomes more susceptible to transfer of control; that is, a Pavlovian CS can potentiate habitual responding more than it can potentiate goal-directed actions99. As the nucleus accumbens, which belongs to the limbic cortico-basal ganglia network, is critical for PIT100, it could also exert control over the sensorimotor network (FIG. 3a) via the spiralling connections with dopaminergic neurons80.

    Similar ideas have been advanced recently by Canales, whose argument is based on experiments that measured activity in different chemical compartments in the striatum5. Work by Canales and Graybiel has shown that exposure to addictive drugs leads to relatively higher activation of striosomal neurons than of matrix neurons, and that this pattern of activation is correlated with a measure of motor stereotypy101. These two compartments generally delineate two sources of cortical inputs to the striatum, and so Canales argues that the dominance of the striosomal activation reflects heightened control of the basal ganglia circuitry by inputs from limbic cortical areas. This hypothesis is supported by the finding that lesioning or inactivating the infralimbic cortex, which is involved in the inhibitory control of Pavlovian CRs102 and a source of inputs to the striosome compartment, resulted in sensitivity to devaluation even in overtrained rats whose performance is normally habitually controlled103,104. Although the role of the infralimbic–striosome system in habit formation is not clear, it may in fact be engaged in Pavlovian control of instrumental systems. An obvious prediction here is that lesions of this system would disrupt PIT.

    What is clear from the above discussion is that the motivational compulsion seen in addiction could be modelled by PIT, and implemented by links between the limbic and the sensorimotor cortico-basal ganglia networks (FIG. 3a). Accordingly, different stages of addiction are expected to be characterized by distinct behavioural characteristics as a result of the underlying serial adaptation from network to network. In support of such claims, a recent study of the effect of cocaine self-administration on striatal activity in monkeys found a gradual spread and intensification of the effects of the drug from the ventral striatum to the dorsal striatum92. Everitt and Robbins have also shown that reafferent stimuli that predict reward can initially potentiate dopamine release in the accumbens, and eventually in the dorsal striatum, which suggests that these Pavlovian motivators can affect cortico-basal ganglia networks that mediate instrumental behaviour105. Pavlovian learning, therefore, possibly precedes instrumental learning, with serial adaptation initiated in the limbic network and eventually spreading to the sensorimotor network. As a result, our general framework can readily incorporate various accounts of addiction, and establish a relationship between habitual responding and motivational compulsion.

    Conclusions

    Given the enormous structural complexity of the basal ganglia, a strictly bottom-up approach in elucidating their functions might not be fruitful. Instead, research can be guided by a top-down analysis based on the understanding of behaviour. The goal of this review, above all, is to clear up conceptual confusions and stimulate research by outlining a coherent framework based on known anatomy and physiology as well as our current understanding of instrumental behaviours.

    Central to this framework is the distinction between goal-directed actions and stimulus-driven habits, the two main categories of instrumental behaviour. They can be dissociated at the behavioural level using assays that manipulate the value of the outcome and the contingency between action and outcome. Using these assays,


    1. Swanson, L. W. Cerebral hemisphere regulation of motivated behavior. Brain Res. 886, 113–164 (2000).

    A learned and provocative review of cerebral anatomy focusing on basal ganglia organization.

    2. Wilson, C. J. in The Synaptic Organization of the Brain (ed. Shepherd, G. M.) 329–375 (Oxford Univ. Press, New York, 2004).

    3. Deniau, J. M. & Chevalier, G. Disinhibition as a basic process in the expression of striatal functions. II. The striato-nigral influence on thalamocortical cells of the ventromedial thalamic nucleus. Brain Res. 334, 227–233 (1985).

    4. Albin, R. L., Young, A. B. & Penney, J. B. The functional anatomy of basal ganglia disorders. Trends Neurosci. 12, 366–375 (1989).

    5. Canales, J. J. Stimulant-induced adaptations in neostriatal matrix and striosome systems: transiting from instrumental responding to habitual behavior in drug addiction. Neurobiol. Learn. Mem. 83, 93–103 (2005).

    6. Wickens, J. R. & Koetter, R. in Models of Information Processing in the Basal Ganglia (eds Houk, J. C., Davis, J. L. & Beiser, D. G.) 187–214 (MIT Press, Cambridge, Massachusetts, 1995).

    7. Divac, I., Rosvold, H. E. & Szwarcbart, M. K. Behavioral effects of selective ablation of the caudate nucleus. J. Comp. Physiol. Psychol. 63, 184–190 (1967).

    8. Konorski, J. Integrative Activity of the Brain (University of Chicago Press, Chicago, 1967).

    9. Skinner, B. The Behavior of Organisms (Appleton-Century-Crofts, New York, 1938).

    10. Tolman, E. C. Purposive Behavior in Animals and Man (Macmillan, New York, 1932).

    11. Thorndike, E. L. Animal Intelligence: Experimental Studies (Macmillan, New York, 1911).

    12. Hull, C. Principles of Behavior (Appleton-Century-Crofts, New York, 1943).

    13. Dickinson, A. in Animal Learning and Cognition (ed. Mackintosh, N. J.) 45–79 (Academic, Orlando, 1994).

    14. Colwill, R. M. & Rescorla, R. A. in The Psychology of Learning and Motivation (ed. Bower, G.) 55–104 (Academic, New York, 1986).

    References 13 and 14 are excellent introductions to the modern study of instrumental learning.

    15. Hammond, L. J. The effect of contingency upon the appetitive conditioning of free-operant behavior. J. Exp. Anal. Behav. 34, 297–304 (1980).

    16. Dickinson, A. & Balleine, B. in Spatial Representation: Problems in Philosophy and Psychology (eds Eilan, N. et al.) 277–293 (Blackwell, Malden, Massachusetts, 1993).

    17. Yin, H. H., Knowlton, B. J. & Balleine, B. W. Inactivation of dorsolateral striatum enhances sensitivity to changes in the action–outcome contingency in instrumental conditioning. Behav. Brain Res. 166, 189–196 (2006).

    18. Davis, J. & Bitterman, M. E. Differential reinforcement of other behavior (DRO): a yoked-control comparison. J. Exp. Anal. Behav. 15, 237–241 (1971).

    19. Holman, E. W. Some conditions for the dissociation of consummatory and instrumental behavior in rats. Learn. Motiv. 6, 358–366 (1975).

    20. Adams, C. D. Variations in the sensitivity of instrumental responding to reinforcer devaluation. Q. J. Exp. Psychol. 33B, 109–122 (1982).

    21. Yin, H. H., Knowlton, B. J. & Balleine, B. W. Lesions of dorsolateral striatum preserve outcome expectancy but disrupt habit formation in instrumental learning. Eur. J. Neurosci. 19, 181–189 (2004).

    22. Colwill, R. & Rescorla, R. A. The role of response–reinforcer associations increases throughout extended instrumental training. Anim. Learn. Behav. 16, 105–111 (1988).

    23. Dickinson, A. in Learning, Motivation, and Cognition (eds Bouton, M. E. & Fanselow, M. S.) 345–367 (American Psychological Association, Washington DC, 1997).

    24. Dickinson, A. in Contemporary Learning Theories (eds Klein, S. B. & Mowrer, R. R.) 279–308 (Lawrence Erlbaum Associates, Hillsdale, New Jersey, 1989).

    25. Dickinson, A., Nicholas, D. J. & Adams, C. D. The effect of the instrumental training contingency on susceptibility to reinforcer devaluation. Q. J. Exp. Psychol. B 35, 35–51 (1983).

    26. Miller, R. Meaning and Purpose in the Intact Brain (Oxford Univ. Press, New York, 1981).

    27. Mishkin, M., Malamut, B. & Bachevalier, J. in Neurobiology of Learning and Memory (eds Lynch, G. et al.) 65–77 (Guilford, New York, 1984).

    28. Robbins, T. W., Giardini, V., Jones, G. H., Reading, P. & Sahakian, B. J. Effects of dopamine depletion from the caudate-putamen and nucleus accumbens septi on the acquisition and performance of a conditional discrimination task. Behav. Brain Res. 38, 243–261 (1990).

    29. Packard, M. G. & Knowlton, B. J. Learning and memory functions of the basal ganglia. Annu. Rev. Neurosci. 25, 563–593 (2002).

    30. White, N. M. A functional hypothesis concerning the striatal matrix and patches: mediation of S–R memory and reward. Life Sci. 45, 1943–1957 (1989).

    31. Packard, M. G. Glutamate infused posttraining into the hippocampus or caudate-putamen differentially strengthens place and response learning. Proc. Natl Acad. Sci. USA 96, 12881–12886 (1999).

    32. Packard, M. G. & McGaugh, J. L. Inactivation of hippocampus or caudate nucleus with lidocaine differentially affects expression of place and response learning. Neurobiol. Learn. Mem. 65, 65–72 (1996).

    33. Poldrack, R. A. & Packard, M. G. Competition among multiple memory systems: converging evidence from animal and human brain studies. Neuropsychologia 41, 245–251 (2003).

    34. Bayley, P. J., Frascino, J. C. & Squire, L. R. Robust habit learning in the absence of awareness and independent of the medial temporal lobe. Nature 436, 550–553 (2005).

    35. Knowlton, B. J., Mangels, J. A. & Squire, L. R. A neostriatal habit learning system in humans. Science 273, 1399–1402 (1996).

    36. Moody, T. D., Bookheimer, S. Y., Vanek, Z. & Knowlton, B. J. An implicit learning task activates medial temporal lobe in patients with Parkinson's disease. Behav. Neurosci. 118, 438–442 (2004).

    37. Kawagoe, R., Takikawa, Y. & Hikosaka, O. Expectation of reward modulates cognitive signals in the basal ganglia. Nature Neurosci. 1, 411–416 (1998).

    38. Lauwereyns, J. et al. Feature-based anticipation of cues that predict reward in monkey caudate nucleus. Neuron 33, 463–473 (2002).

    39. Lauwereyns, J., Watanabe, K., Coe, B. & Hikosaka, O. A neural correlate of response bias in monkey caudate nucleus. Nature 418, 413–417 (2002).

    40. Pasupathy, A. & Miller, E. K. Different time courses of learning-related activity in the prefrontal cortex and striatum. Nature 433, 873–876 (2005).

    41. Joel, D. & Weiner, I. The connections of the dopaminergic system with the striatum in rats and primates: an analysis with respect to the functional and compartmental organization of the striatum. Neuroscience 96, 451–474 (2000).

    42. West, M. O. et al. A region in the dorsolateral striatum of the rat exhibiting single-unit correlations with specific locomotor limb movements. J. Neurophysiol. 64, 1233–1246 (1990).

    43. Partridge, J. G., Tang, K. C. & Lovinger, D. M. Regional and postnatal heterogeneity of activity-dependent long-term changes in synaptic efficacy in the dorsal striatum. J. Neurophysiol. 84, 1422–1429 (2000).

    The first study to demonstrate regional variations in the types and mechanisms of striatal synaptic plasticity.

    44. Whishaw, I. Q., Mittleman, G., Bunch, S. T. & Dunnett, S. B. Impairments in the acquisition, retention and selection of spatial navigation strategies after medial caudate-putamen lesions in rats. Behav. Brain Res. 24, 125–138 (1987).

    45. Devan, B. D., McDonald, R. J. & White, N. M. Effects of medial and lateral caudate-putamen lesions on place- and cue-guided behaviors in the water maze: relation to thigmotaxis. Behav. Brain Res. 100, 5–14 (1999).

    46. Devan, B. D. & White, N. M. Parallel information processing in the dorsal striatum: relation to hippocampal function. J. Neurosci. 19, 2789–2798 (1999).

    47. Yin, H. H., Knowlton, B. J. & Balleine, B. W. Blockade of NMDA receptors in the dorsomedial striatum prevents action–outcome learning in instrumental conditioning. Eur. J. Neurosci. 22, 505–512 (2005).

    48. Yin, H. H., Ostlund, S. B., Knowlton, B. J. & Balleine, B. W. The role of the dorsomedial striatum in instrumental conditioning. Eur. J. Neurosci. 22, 513–523 (2005).

    49. Yin, H. H. & Knowlton, B. J. Contributions of striatal subregions to place and response learning. Learn. Mem. 11, 459–463 (2004).

    References 47–49 present a series of studies that established for the first time a dissociation between S–R learning in the DLS and A–O learning in the pDMS.

    50. Ragozzino, M. E. Acetylcholine actions in the dorsomedial striatum support the flexible shifting of response patterns. Neurobiol. Learn. Mem. 80, 257–267 (2003).

    they can also be dissociated in terms of their underlying neural substrates, in the form of distinct cortico-basal ganglia networks.

    Clearly, an understanding of network interactions that result in a switch in behavioural control from actions to habits has important implications for the study of skill learning, addiction and various clinical disorders resulting from basal ganglia abnormalities. At present, however, we remain ignorant of the detailed mechanisms that underlie habit formation at all levels of analysis. At the behavioural level, all the conditions that promote habit formation have yet to be characterized precisely. Although several behavioural characteristics of habits can be specified (for example, insensitivity to outcome devaluation and contingency degradation, lack of behavioural flexibility and lack of awareness in humans), other characteristics are less clear (for example, the degree of effector specificity and the need for attention during learning). At the neural systems level, we do not yet understand the properties of the cortico-basal ganglia networks responsible for differences in behavioural flexibility, or in sensitivity to instrumental contingency manipulations. At the cellular level, in addition to our ignorance of the detailed molecular mechanisms underlying synaptic transmission and plasticity in the basal ganglia, we do not yet understand how synaptic plasticity in the basal ganglia alters the outputs of the networks, and we do not have direct evidence linking such plasticity to well-defined learning. Nevertheless, we hope that the framework proposed here will stimulate future research, by directing attention to those variables that are crucial in the analysis of purposive behaviour, and by underscoring the importance of precise behavioural analysis in elucidating the functions of neural systems.


    51. Ragozzino, M. E., Jih, J. & Tzavos, A. Involvement of the dorsomedial striatum in behavioral flexibility: role of muscarinic cholinergic receptors. Brain Res. 953, 205–214 (2002).

    52. Ragozzino, K. E., Leutgeb, S. & Mizumori, S. J. Dorsal striatal head direction and hippocampal place representations during spatial navigation. Exp. Brain Res. 139, 372–376 (2001).

    53. Mulder, A. B., Tabuchi, E. & Wiener, S. I. Neurons in hippocampal afferent zones of rat striatum parse routes into multi-pace segments during maze navigation. Eur. J. Neurosci. 19, 1923–1932 (2004).

    54. Delgado, M. R., Locke, H. M., Stenger, V. A. & Fiez, J. A. Dorsal striatum responses to reward and punishment: effects of valence and magnitude manipulations. Cogn. Affect. Behav. Neurosci. 3, 27–38 (2003).

    55. Delgado, M. R., Stenger, V. A. & Fiez, J. A. Motivation-dependent responses in the human caudate nucleus. Cereb. Cortex 14, 1022–1030 (2004).

    56. Zink, C. F., Pagnoni, G., Martin-Skurski, M. E., Chappelow, J. C. & Berns, G. S. Human striatal responses to monetary reward depend on saliency. Neuron 42, 509–517 (2004).

    57. Tricomi, E. M., Delgado, M. R. & Fiez, J. A. Modulation of caudate activity by action contingency. Neuron 41, 281–292 (2004).

    An interesting human imaging study that provided strong evidence for the role of the caudate in encoding A–O contingencies.

    58. Williams, Z. M. & Eskandar, E. N. Selective enhancement of associative learning by microstimulation of the anterior caudate. Nature Neurosci. 9, 562–568 (2006).

    59. Dickinson, A., Balleine, B., Watt, A. & Gonzalez, F. Motivational control after extended instrumental training. Anim. Learn. Behav. 23, 197–206 (1995).

    60. Balleine, B. W. & Dickinson, A. Goal-directed instrumental action: contingency and incentive learning and their cortical substrates. Neuropharmacology 37, 407–419 (1998).

    61. Corbit, L. H. & Balleine, B. W. The role of prelimbic cortex in instrumental conditioning. Behav. Brain Res. 146, 145–157 (2003).

    62. Leon, M. I. & Shadlen, M. N. Effect of expected reward magnitude on the response of neurons in the dorsolateral prefrontal cortex of the macaque. Neuron 24, 415–425 (1999).

    63. Tsujimoto, S. & Sawaguchi, T. Properties of delay-period neuronal activity in the primate prefrontal cortex during memory- and sensory-guided saccade tasks. Eur. J. Neurosci. 19, 447–457 (2004).

    64. Tsujimoto, S. & Sawaguchi, T. Neuronal representation of response–outcome in the primate prefrontal cortex. Cereb. Cortex 14, 47–55 (2004).

    65. Tsujimoto, S. & Sawaguchi, T. Working memory of action: a comparative study of ability to selecting response based on previous action in New World monkeys (Saimiri sciureus and Callithrix jacchus). Behav. Processes 58, 149–155 (2002).

    66. Corbit, L. H., Muir, J. L. & Balleine, B. W. Lesions of mediodorsal thalamus and anterior thalamic nuclei produce dissociable effects on instrumental conditioning in rats. Eur. J. Neurosci. 18, 1286–1294 (2003).

    67. Ostlund, S. B. & Balleine, B. W. Lesions of medial prefrontal cortex disrupt the acquisition but not the expression of goal-directed learning. J. Neurosci. 25, 7763–7770 (2005).

    68. Corbit, L. H., Ostlund, S. B. & Balleine, B. W. Sensitivity to instrumental contingency degradation is mediated by the entorhinal cortex and its efferents via the dorsal hippocampus. J. Neurosci. 22, 10976–10984 (2002).

    69. Packard, M. G. & McGaugh, J. L. Double dissociation of fornix and caudate nucleus lesions on acquisition of two water maze tasks: further evidence for multiple memory systems. Behav. Neurosci. 106, 439–446 (1992).

    70. Alexander, G. E., DeLong, M. R. & Strick, P. L. Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annu. Rev. Neurosci. 9, 357–381 (1986).

    71. Reep, R. L., Cheatwood, J. L. & Corwin, J. V. The associative striatum: organization of cortical projections to the dorsocentral striatum in rats. J. Comp. Neurol. 467, 271–292 (2003).

    72. Dalley, J. W., Cardinal, R. N. & Robbins, T. W. Prefrontal executive and cognitive functions in rodents: neural and neurochemical substrates. Neurosci. Biobehav. Rev. 28, 771–784 (2004).

    73. Divac, I., Markowitsch, H. J. & Pritzel, M. Behavioral and anatomical consequences of small intrastriatal injections of kainic acid in the rat. Brain Res. 151, 523–532 (1978).

    74. Levy, R., Friedman, H. R., Davachi, L. & Goldman-Rakic, P. S. Differential activation of the caudate nucleus in primates performing spatial and nonspatial working memory tasks. J. Neurosci. 17, 3870–3882 (1997).

    75. Hassani, O. K., Cromwell, H. C. & Schultz, W. Influence of expectation of different rewards on behavior-related neuronal activity in the striatum. J. Neurophysiol. 85, 2477–2489 (2001).

    76. Kimura, M., Aosaki, T. & Ishida, A. Neurophysiological aspects of the differential roles of the putamen and caudate nucleus in voluntary movement. Adv. Neurol. 60, 62–70 (1993).

    77. Kanazawa, I., Murata, M. & Kimura, M. Roles of dopamine and its receptors in generation of choreic movements. Adv. Neurol. 60, 107–112 (1993).

    78. Joel, D. & Weiner, I. The organization of the basal ganglia-thalamocortical circuits: open interconnected rather than closed segregated. Neuroscience 63, 363–379 (1994).

    An important review in a series by the same authors arguing for interactions between cortico-basal ganglia networks.

    79. Joel, D. & Weiner, I. The connections of the primate subthalamic nucleus: indirect pathways and the open-interconnected scheme of basal ganglia-thalamocortical circuitry. Brain Res. Brain Res. Rev. 23, 62–78 (1997).

    80. Haber, S. N., Fudge, J. L. & McFarland, N. R. Striatonigrostriatal pathways in primates form an ascending spiral from the shell to the dorsolateral striatum. J. Neurosci. 20, 2369–2382 (2000).

    81. Redgrave, P., Prescott, T. J. & Gurney, K. The basal ganglia: a vertebrate solution to the selection problem? Neuroscience 89, 1009–1023 (1999).

    82. Haber, S. N. The primate basal ganglia: parallel and integrative networks. J. Chem. Neuroanat. 26, 317–330 (2003).

    83. Middleton, F. A. & Strick, P. L. Basal ganglia and cerebellar loops: motor and cognitive circuits. Brain Res. Brain Res. Rev. 31, 236–250 (2000).

    84. Rand, M. K. et al. Characteristics of sequential movements during early learning period in monkeys. Exp. Brain Res. 131, 293–304 (2000).

    85. McGeorge, A. J. & Faull, R. L. The organization of the projection from the cerebral cortex to the striatum in the rat. Neuroscience 29, 503–537 (1989).

    86. Jueptner, M., Frith, C. D., Brooks, D. J., Frackowiak, R. S. & Passingham, R. E. Anatomy of motor learning. II. Subcortical structures and learning by trial and error. J. Neurophysiol. 77, 1325–1337 (1997).

    87. Jueptner, M. et al. Anatomy of motor learning. I. Frontal cortex and attention to action. J. Neurophysiol. 77, 1313–1324 (1997).

    88. Poldrack, R. A. et al. The neural correlates of motor skill automaticity. J. Neurosci. 25, 5356–5364 (2005).

    References 85–88 show shifts in activation patterns of cortico-basal ganglia networks in the course of skill learning.

    89. Daw, N. D., Niv, Y. & Dayan, P. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nature Neurosci. 8, 1704–1711 (2005).

    90. Everitt, B. J. & Wolf, M. E. Psychomotor stimulant addiction: a neural systems perspective. J. Neurosci. 22, 3312–3320 (2002).

    91. Altman, J. et al. The biological, social and clinical bases of drug addiction: commentary and debate. Psychopharmacology (Berl.) 125, 285–345 (1996).

    92. Porrino, L. J., Lyons, D., Smith, H. R., Daunais, J. B. & Nader, M. A. Cocaine self-administration produces a progressive involvement of limbic, association, and sensorimotor striatal domains. J. Neurosci. 24, 3554–3562 (2004).

    93. Williams, D. R. & Williams, H. Automaintenance in the pigeon: sustained pecking despite contingent non-reinforcement. J. Exp. Anal. Behav. 12, 511–520 (1969).

    94. Robinson, T. E. & Berridge, K. C. Addiction. Annu. Rev. Psychol. 54, 25–53 (2003).

    95. Berridge, K. C. & Robinson, T. E. What is the role of dopamine in reward: hedonic impact, reward learning, or incentive salience? Brain Res. Brain Res. Rev. 28, 309–369 (1998).

    96. Tiffany, S. T. A cognitive model of drug urges and drug-use behavior: role of automatic and nonautomatic processes. Psychol. Rev. 97, 147–168 (1990).

    97. Rescorla, R. A. & Solomon, R. L. Two-process learning theory: relationships between Pavlovian conditioning and instrumental learning. Psychol. Rev. 74, 151–182 (1967).

    98. Corbit, L. H. & Balleine, B. W. Double dissociation of basolateral and central amygdala lesions on the general and outcome-specific forms of Pavlovian–instrumental transfer. J. Neurosci. 25, 962–970 (2005).

    99. Holland, P. C. Relations between Pavlovian–instrumental transfer and reinforcer devaluation. J. Exp. Psychol. Anim. Behav. Process. 30, 104–117 (2004).

    100. Corbit, L. H., Muir, J. L. & Balleine, B. W. The role of the nucleus accumbens in instrumental conditioning: evidence of a functional dissociation between accumbens core and shell. J. Neurosci. 21, 3251–3260 (2001).

    101. Canales, J. J. & Graybiel, A. M. A measure of striatal function predicts motor stereotypy. Nature Neurosci. 3, 377–383 (2000).

    102. Rhodes, S. E. & Killcross, S. Lesions of rat infralimbic cortex enhance recovery and reinstatement of an appetitive Pavlovian response. Learn. Mem. 11, 611–616 (2004).

    103. Coutureau, E. & Killcross, S. Inactivation of the infralimbic prefrontal cortex reinstates goal-directed responding in overtrained rats. Behav. Brain Res. 146, 167–174 (2003).

    104. Killcross, S. & Coutureau, E. Coordination of actions and habits in the medial prefrontal cortex of rats. Cereb. Cortex 13, 400–408 (2003).

    105. Everitt, B. J. & Robbins, T. W. Neural systems of reinforcement for drug addiction: from actions to habits to compulsion. Nature Neurosci. 8, 1481–1489 (2005).

    106. Baum, W. M. The correlation-based law of effect. J. Exp. Anal. Behav. 20, 137–153 (1973).

    107. Dickinson, A. Actions and habits: the development of behavioural autonomy. Phil. Trans. R. Soc. Lond. B 308, 67–78 (1985).

    108. Kerr, J. N. & Wickens, J. R. Dopamine D-1/D-5 receptor activation is required for long-term potentiation in the rat neostriatum in vitro. J. Neurophysiol. 85, 117–124 (2001).

    109. Reynolds, J. N., Hyland, B. I. & Wickens, J. R. A cellular mechanism of reward-related learning. Nature 413, 67–70 (2001).

    110. Gerdeman, G. L., Partridge, J. G., Lupica, C. R. & Lovinger, D. M. It could be habit forming: drugs of abuse and striatal synaptic plasticity. Trends Neurosci. 26, 184–192 (2003).

    111. Gerdeman, G. L., Ronesi, J. & Lovinger, D. M. Postsynaptic endocannabinoid release is critical to long-term depression in the striatum. Nature Neurosci. 5, 446–451 (2002).

    112. Packard, M. G. & White, N. M. Dissociation of hippocampus and caudate nucleus memory systems by posttraining intracerebral injection of dopamine agonists. Behav. Neurosci. 105, 295–306 (1991).

    113. Sage, J. R. & Knowlton, B. J. Effects of US devaluation on win-stay and win-shift radial maze performance in rats. Behav. Neurosci. 114, 295–306 (2000).

    114. Packard, M. G., Hirsh, R. & White, N. M. Differential effects of fornix and caudate nucleus lesions on two radial maze tasks: evidence for multiple memory systems. J. Neurosci. 9, 1465–1472 (1989).

    Acknowledgements

    H.H.Y. was supported by the Intramural Research Program at the National Institute on Alcohol Abuse and Alcoholism, National Institutes of Health. B.J.K. was supported by a National Science Foundation grant. We would like to thank B. Balleine, R. Costa, N. Daw, T. Dickinson and S. Ostlund for helpful discussion.

    Competing interests statement

    The authors declare no competing financial interests.

    DATABASES

    The following terms in this article are linked online to:

    OMIM: http://www.ncbi.nlm.nih.gov/Omim

    Parkinson's disease

    FURTHER INFORMATION

    Knowlton's homepage: http://www.psych.ucla.edu/Faculty/Knowlton

