CAL Simulation Guides · Jonathan Gray, Eduardo Alonso, Esther Mondragón & Alberto Fernández....


CAL Simulation Guides ISSN 2054-2747 Temporal Difference Simulator © Version 1.0 2012

Jonathan Gray, Eduardo Alonso, Esther Mondragón & Alberto Fernández.

Centre for Computational and Animal Learning Research St. Albans, United Kingdom


Temporal Difference Simulator 1.0

Contents

1 Introduction
2 Before you start
3 Running the application
  3.1 Entering a design
  3.2 Temporal properties
  3.3 Parameters
  3.4 Design settings
    3.4.1 Set Different US per Phase
    3.4.2 Context Simulation
    3.4.3 Compound Results and Configural Cues
  3.5 Procedural settings
    3.5.1 Number Random Trial Combinations
    3.5.2 Number of Random Distributions for Variable Length CS
    3.5.3 Time-step Length
    3.5.4 Add Decision Rule Simulation
    3.5.5 Eligibility Traces
    3.5.6 Mean Type
    3.5.7 Variable Distribution Type
  3.6 Saving a design
  3.7 Loading a design
  3.8 Outputs
  3.9 Exporting results to Excel
  3.10 Figures display
4 Worked Examples
  4.1 A simple worked example - Egger-Miller effect
  4.2 Working with context - ABA Renewal
  4.3 Using configural cues - Negative patterning
  4.4 Using variable durations - Temporal overshadowing
5 Terms of use
6 Feedback
7 References


1 Introduction

This document is a quick guide to installing and using the Centre for Computational and Animal

Learning Research’s Temporal Difference Simulator v1.0 for Complete Serial Compound (Sutton &

Barto, 1987; Moore, Choi & Brunzell, 1998) TD.

Executable versions (.exe for Windows and .app for Apple) are available, in addition to a .jar file

intended to run on Java Runtime Environment 6 or above for UNIX/Linux operating systems. Once

downloaded to your computer, the file will run without installation.

This document does not cover technical details of the underlying implementation, or the Temporal

Difference model.

The simulator builds upon a simulator of Rescorla and Wagner’s model, the “RW_Simulator”

version 3.0 (Mondragón, Alonso & Fernández, 2011; Alonso, Mondragón & Fernández, 2012).

2 Before you start

You will need to download a version of the simulator appropriate to your platform (.exe for Windows machines, .app for OSX machines or .jar for UNIX/Linux operating systems) from http://www.cal-r.org/index.php?id=software.

For Windows users the download will be named “TD_Simulator.exe” and can be run directly after downloading. For Mac users a disk image (“TD_Simulator.dmg”) is provided containing the .app file, which can be run either directly from the disk image or after dragging the .app to your Applications folder.

Users of other platforms should select the “JAVA” button to download the “TD_Simulator.jar” file.

This file will run on any platform provided that Java Runtime Environment (JRE) 6 or above is

installed. Most popular Linux distributions such as Fedora, Debian, Ubuntu, Arch, and CentOS

already include a JRE.

Users who wish to access the source code of the simulator should also download the .jar file, which contains .java files in addition to the runnable binaries.

3 Running the application

To start the simulator, navigate to the directory where you stored the file and double-click the icon to launch it. After launching the simulator, you will be presented with the main screen as shown in Figure 1.

This window is headed by the main menu (“File”, “Design Settings”, “Procedural Settings”, and

“Help”), and consists of two input panels and one output panel. The experimental design is specified

in a matrix of groups and phases in the top panel; the values of the parameters are entered in the

bottom left panel; summary results are displayed in the output panel on the bottom right.


Figure 1 Simulator main screen

3.1 Entering a design

Begin by selecting the cell on the first row, next to Group 1 to enter the experimental design for

phase 1, describing each trial as follows:

Number of trials followed by Stimuli followed by Reinforcer (+ or -)

Make sure that you separate each trial type with a forward slash (/) and press Enter when you

have finished entering trial descriptions for a phase and group. For example, 80 reinforced trials of

stimulus A followed by 80 reinforced trials of two stimuli A and B would be entered as “80A+/80AB+”.

The order in which trials occur is determined by the order in which they are entered in the phase; if

the design requires that different kinds of trial occur in a random ordering, the corresponding

“Random” checkbox should be checked.
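The trial notation described above can be illustrated with a short Python sketch. This is not the simulator's own parser: the function name `parse_phase`, the regular expression and the returned tuple layout are assumptions made purely for illustration, and the sketch assumes every trial type starts with an explicit trial count.

```python
import re

# Hypothetical pattern: a trial count, one or more stimulus letters, then + or -
# (the typographic minus sign used in this guide is accepted as well).
TRIAL_PATTERN = re.compile(r"^(\d+)([A-Za-z]+)([+\-−])$")


def parse_phase(description):
    """Split a phase such as "80A+/80AB+" into (trials, stimuli, reinforced) tuples."""
    trials = []
    for part in description.split("/"):
        match = TRIAL_PATTERN.match(part.strip())
        if match is None:
            raise ValueError(f"Cannot parse trial description: {part!r}")
        count, stimuli, sign = match.groups()
        trials.append((int(count), stimuli, sign == "+"))
    return trials


print(parse_phase("80A+/80AB+"))  # [(80, 'A', True), (80, 'AB', True)]
```

The tuples keep the order in which the trial types were entered, mirroring the default (non-random) ordering described above.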

To change the name of a group, click the current name and edit it. You can also add and remove

groups by clicking the “+” and “−” buttons on the left of the window. Similarly, clicking the “+” and “−”

buttons at the top right will respectively add and remove phases.

3.2 Temporal properties

After you enter a trial description, you will see that the corresponding cell in the “Stimuli Temporal

Parameters” column has been populated with the stimuli you entered. Here, we can set up the

durations of stimuli and their temporal relationship to one another (the type of conditioning). Click the

cell to open the temporal properties window for this group and phase as in Figure 2.

First, set the duration of the US, then click the “CS Temporal Properties” for a CS to set the duration


of that CS. Stimuli can have a fixed (F) duration where they will last exactly the number of seconds

specified on every trial, or a variable (V) duration. For stimuli with a variable length, actual lengths on

each trial are selected from a random distribution of durations such that the mean duration of the

stimulus over all the trials where it is present in a phase is the number of seconds input by the user.

For variable duration stimuli, each phase is run with a number of random orderings¹ of these

durations specified in the “Number of Random Distributions for Variable Length CS” (see 3.5.2). The

values for variable length stimuli are produced by averaging the results of all the phases.

Stimuli can also be configured to have different onsets and offsets, relative to the US, by clicking the

corresponding cell in the “Conditioning” column. A stimulus can be configured to have a forward² (Fw,

the default) relationship to the US, a backward (Bw) relationship, or a simultaneous (Sm) relationship.

In the same window, the ISI (inter-stimulus interval) can also be set, giving the length of time between

onset of the stimulus and the US for forward and simultaneous conditioning, and the end of the US

and the start of the stimulus for backward conditioning.

The durations of ITIs (inter-trial intervals) can also be configured per phase and group by clicking

the “ITI” column, with each ITI for a phase made up of a minimal fixed length and an additional

variable period that can be 0.

3.3 Parameters

After entering an experiment configuration you can edit the parameters by pressing the “Set

Parameters” button. Three tables appear. In the top table, α values for each CS must be entered

(the default is 0.3). The middle table contains a set of default values given to the US (β+ is set to 0.75, β− to 0.7 and λ to 1); the user can, of course, modify these parameters. Finally, the bottom table

labeled “Others” contains the δ (trace decay) and γ (discount factor) parameters, which default to 0.9

and 0.95 respectively and may also be modified.

¹ Note that using random trial orderings in combination with variable stimuli can result in long runtimes.

² Note that for variable length stimuli, configuring a forward relationship will always ensure the US is delivered after the stimulus has finished, even though the actual duration of the stimulus changes on each phase.

Figure 2 Temporal properties window


3.4 Design settings

3.4.1 Set Different US per Phase

This option allows you to set different β+, β− and λ values for different phases, i.e. different US

motivational values per phase.

3.4.2 Context Simulation

The simulator supports the simulation of context; selecting “Same Context” from the “Context

Simulation” menu in the “Design Settings” menu will prompt the user to enter a salience for the

context (by default 0.05), then add it to all trials in all phases and groups, as seen in Figure 3.

Figure 3 Same context display

Alternatively, selecting “Different Contexts” will allow the user to set one of several contexts per

phase and group, and modify the salience by clicking the context column as shown in Figure 4.

Figure 4 Different contexts display

This adds a context column in each phase. By default, the context added is the φ context,

represented by φ(0.05) in the context column of a phase, as with a single context. To modify a

context, click on it to open the context window (Figure 4). Here, you can select one of four distinct contexts and alter its salience. By default, “No Context Simulation” is selected.

3.4.3 Compound Results and Configural Cues

This simulator generates associative strength values for standard additive compound stimuli. It also

computes compound values using added configural cues, which have a length equal to the

overlapping period of all the component stimuli of the configural cue.


To calculate compound values, you must select “Design Settings/Show Compound Results”. If you wish to calculate stimulus compounds with configural cues you must also tick “Design Settings/Use Configural Cues” in the main menu. Press “Set Parameters” to input the alpha values for the configural cues. By default, the product of the elemental alpha values will appear but the

user can modify these values. Configural cues are represented as “c(ΦA)”, “c(ΦAB)”, etc. (see

Figure 5).
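As a small illustration of the default described above (the product of the elemental alpha values), the following Python sketch computes the default salience of a configural cue. The function name and the dictionary of alphas are hypothetical; the values are the defaults quoted in this guide.

```python
from math import prod


def default_configural_alpha(element_alphas):
    """Default salience of a configural cue: the product of its elements' alphas."""
    return prod(element_alphas)


# Default saliences quoted in this guide: 0.3 for a CS, 0.05 for the context.
alphas = {"A": 0.3, "B": 0.3, "Φ": 0.05}

c_ab = default_configural_alpha([alphas["A"], alphas["B"]])
c_ctx_a = default_configural_alpha([alphas["Φ"], alphas["A"]])
print(round(c_ab, 4), round(c_ctx_a, 4))  # 0.09 0.015
```

Because the product of values below 1 is smaller than any of them, a configural cue starts out less salient than its elements unless the user overrides the default.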

Figure 5 Settings for calculating compounds with context and configural cues

3.5 Procedural settings

3.5.1 Number Random Trial Combinations

Select this option to alter the number of combinations used for random phases; the default is 100.

Setting this to a high number will result in significantly slower runtimes, particularly in combination

with variable length stimuli.

3.5.2 Number of Random Distributions for Variable Length CS

Select this option to alter the number of combinations used for variable length stimuli; the default is 100. Setting this to a high number will result in significantly slower runtimes, particularly in

combination with random trial combinations.

3.5.3 Time-step Length

Select this option to alter the size of the time-step used in simulation, which defaults to 1 (1 time-step equals 1 second), and hence the number of components per stimulus (length of stimulus divided by time-step size). Note that the length of the time-step is also the lower bound on durations: the US, CSs and ITIs can be no shorter than the time-step.

Lower numbers will allow the user to simulate at a higher temporal resolution, but will increase

the simulation time required for long stimuli durations as the number of components increases.
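The relationship between stimulus length, time-step size and number of components can be made concrete with a minimal Python sketch. The function name is hypothetical, and rounding to the nearest whole component when the duration is not an exact multiple of the time-step is an assumption, not documented simulator behaviour.

```python
def components_per_stimulus(duration_s, time_step_s=1.0):
    """Number of components = stimulus length divided by time-step size."""
    if duration_s < time_step_s:
        # The time-step is the lower bound on all durations.
        raise ValueError("durations can be no shorter than the time-step")
    # Assumption: round to the nearest whole number of components.
    return round(duration_s / time_step_s)


print(components_per_stimulus(10))        # 10 components at the default time-step
print(components_per_stimulus(10, 0.5))   # 20 components at a finer time-step
```

Halving the time-step doubles the number of components, which is why finer temporal resolution increases simulation time for long stimuli.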


3.5.4 Add Decision Rule Simulation

Select this option to add a simulated conditioned response rate per minute to results. The response

rate is derived from the decision rule given by Church and Kirkpatrick (2001). Enabling the option

prompts the user to enter a threshold value that must be exceeded to produce a response.

3.5.5 Eligibility Traces

Use this menu to select the algorithm used to calculate eligibility traces that are used to control the

extent to which a component is eligible for changes to associative strength. Three options are

available: an accumulating trace (Sutton, 1988) which increases each time a CS occurs and decays

relatively slowly, a replacing trace (Singh & Sutton, 1996) that has the same decay properties but

never accumulates above 1, and a bounded accumulating trace (Sutton & Barto, 1987, 1990) where trace decay is exponential.

By default, the simulator uses a replacing trace.
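The difference between the accumulating and replacing options can be sketched in Python using their common textbook forms. This is an illustrative sketch only, not the simulator's implementation: the function names are hypothetical, the single `decay` factor stands in for whatever combination of the δ and γ parameters the simulator applies, and the bounded accumulating trace is not shown.

```python
def accumulating_trace(presence, decay):
    """Trace grows by 1 whenever the component is present, then decays."""
    trace, history = 0.0, []
    for present in presence:
        trace = decay * trace + (1.0 if present else 0.0)
        history.append(trace)
    return history


def replacing_trace(presence, decay):
    """Trace is reset to 1 when the component is present, so it never exceeds 1."""
    trace, history = 0.0, []
    for present in presence:
        trace = 1.0 if present else decay * trace
        history.append(trace)
    return history


presence = [1, 1, 1, 0, 0]  # component present for three steps, then absent
print(accumulating_trace(presence, 0.5))  # [1.0, 1.5, 1.75, 0.875, 0.4375]
print(replacing_trace(presence, 0.5))     # [1.0, 1.0, 1.0, 0.5, 0.25]
```

With repeated presentations the accumulating trace rises above 1, whereas the replacing trace stays capped at 1 and only decays once the component is absent.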

3.5.6 Mean Type

Use this menu to choose the type of mean used when producing variable distributions for stimulus and ITI durations. By default the simulator will use the arithmetic mean, but it can alternatively use the geometric mean.

3.5.7 Variable Distribution Type

Two variable distributions are available for producing variable duration stimuli and ITIs: an

exponential distribution and a uniform distribution. By default, the simulator uses an exponential

distribution to produce variable durations.
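One way to picture how a set of variable durations with a user-specified mean might be produced is sketched below in Python. This is a hedged illustration, not the simulator's algorithm: the function name, the uniform range of 0 to twice the mean, and the rescaling step that forces the arithmetic mean to match exactly are all assumptions. The sketch also ignores the time-step lower bound and the geometric mean option.

```python
import random


def variable_durations(mean_s, n_trials, distribution="exponential", seed=None):
    """Draw n_trials durations whose arithmetic mean equals mean_s."""
    rng = random.Random(seed)
    if distribution == "exponential":
        raw = [rng.expovariate(1.0 / mean_s) for _ in range(n_trials)]
    elif distribution == "uniform":
        # Assumption: uniform between 0 and twice the requested mean.
        raw = [rng.uniform(0.0, 2.0 * mean_s) for _ in range(n_trials)]
    else:
        raise ValueError(f"unknown distribution: {distribution!r}")
    # Rescale so the mean over all trials in the phase is exactly mean_s.
    scale = mean_s * n_trials / sum(raw)
    return [duration * scale for duration in raw]


durations = variable_durations(30, 90, seed=1)
print(len(durations), round(sum(durations) / len(durations), 6))
```

Each call with a different seed gives a different set of per-trial durations with the same mean, which is the sense in which a phase can be run over many random distributions and the results averaged.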

3.6 Saving a design

If you would like to save your design so you can retrieve it another time (for instance, to modify it), all

you have to do is to click “Save” on the “File” menu. This will pop up a window, which asks you

where you would like to save your file. You need to choose a directory and a filename. Click “OK” to

save the file to the directory you have chosen. A file with the extension “.tdl” will appear in the

chosen directory.

3.7 Loading a design

Once you have saved your design, you can re-use it by selecting “Open” from the “File” menu. The

application will replace the currently loaded design with the design from the file you have selected.


3.8 Outputs

After the experimental design and parameters have been entered, and the design and procedural settings have been chosen, click “Run”. Once the simulation has been completed, a textual summary of the results will be displayed in the data area, located on the right-hand side, as shown in Figure 6³. The user can scroll up and down to check the stimulus mean V values per trial, group and phase and the V of each component of the stimulus on the final trial of each phase.

At this stage complete results can be exported to an Excel spreadsheet for further analysis, or

displayed as figures.

Where compounds are used, they are represented as “AB”, “ABC” etc. Configural cue

compounds are displayed as “[AB]”, “[ABC]”, etc. and replace the standard compounds in text output,

Excel spreadsheets, and figures display.

If decision rule simulation is enabled (see 3.5.4), a simulated response rate (responses per minute) per component will also be displayed in all outputs.

3.9 Exporting results to Excel

The application can also export the results to a “.xlsx” (Excel 2007-2010) type spreadsheet, usable in

Microsoft Excel and current versions of LibreOffice/OpenOffice. It creates a workbook that has a

different sheet per group. Phases are presented individually on a separate table. Each sheet

contains the name of the file followed by CS, US and Other parameters followed by context

saliences (if using context). Each phase table is preceded by the temporal properties of all the stimuli in that group and phase and by a line showing its design. Each phase table shows the V results per

component for each trial, as well as the average V for the complete stimulus on each trial. If you

have enabled decision rule simulation, a simulated response per component is also shown for each

trial. Figure 7 shows the exported data. To export the results select “File/Export”.

³ The “Clear All” button can also be used to clear the current design, leaving the number of groups and phases intact.

Figure 6 Parameters and results


Figure 7 Excel spreadsheet results

Exporting large result sets can be slow, and a progress bar with an estimated time remaining will

be displayed.

3.10 Figures display

The simulator also displays graphs of results, accessible after running an experiment by clicking the

“Display Figures” button. By default, the simulator will show separate figures for each phase for the

mean associative strength of each stimulus, plotted against trials (Figure 8 (a)), and the associative

strength of each component of each stimulus after the final trial (Figure 8 (b)). If decision rule

simulation has been enabled, the simulator will also show a simulated response graph, showing the

simulated responses per minute for each stimulus at each time-step after the final trial (Figure 8 (c)).

For all graphs, the user can select which groups and stimuli are displayed by checking or

unchecking their respective boxes. The figures can remain open while a new experimental design is

run to aid comparison between results.


Figure 8 Figures display: (a) trial level, (b) component level, (c) simulated response


Figures can also be saved, copied, printed, zoomed and modified. To access these functions, right-click (or, on a Mac, Ctrl+click) on the figure to open the menu. For instance, to facilitate data comparison you may wish to fix the Y-axis limits (by default, axis limits are set to the highest value plotted). Right-click (or Ctrl+click) and choose “Properties”. Then select “Plot” in the first row of tabs, “Range Axis” (for the Y-axis) in the second, and “Range” in the bottom row of tabs. Untick the “Auto-adjust range” box and modify the range values as required. Figure 9 shows these menus.

Figure 9 Range axis options for figures

You can also zoom in on a portion of the graph by left-clicking and dragging down and right to encompass the area you wish to focus on. After zooming in, you can return to the full graph by left-clicking and dragging up and left.

4 Worked Examples

4.1 A simple worked example - Egger-Miller effect

The Egger-Miller effect (Egger & Miller, 1962) describes a phenomenon in which a redundant

stimulus that precedes and overlaps a target stimulus reduces the target conditioning compared to a

control in which the target is presented alone despite having the same temporal relationship with the

US.

For our Egger-Miller effect experiment we will consider three groups. Group “Redundant” will be

“80AB+”, Group “Informative” will be “40A−/80AB+” randomly presented, and Group “Control” will

consist of “80B+” trials. After entering the trial description for Group “Redundant”, click the “CS

Temporal Properties” for CS A to set the duration of the stimulus. For this example, select “Fixed”

from the “Duration” column and enter 10 for the “Length”, then click “OK” or hit “Enter” to return to the

temporal properties window.

Now, you can set the type of conditioning (how the stimulus is related in time to the US). By

default, this is set to forward conditioning, indicated by “Fw” in the “Conditioning” column. In this

case, the default is fine for our purposes. Check that the “Type” column is set to “Fixed” and the “ISI”

column is set to “10”, then click “OK” or press “Enter”. Now, set CS B to a duration of 5 seconds with

an ISI of 5 seconds.
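The timing implied by these settings can be checked with a short Python sketch. It relies only on the definition given in section 3.2 (for forward conditioning, the ISI is the time from stimulus onset to the US); the function name and the data layout are hypothetical.

```python
def forward_timeline(stimuli):
    """Map each stimulus to (onset, offset) in seconds from the earliest onset.

    stimuli: dict of name -> (duration_s, isi_s), all forward conditioned.
    Returns the per-stimulus timeline and the US onset time.
    """
    us_onset = max(isi for _, isi in stimuli.values())
    timeline = {}
    for name, (duration, isi) in stimuli.items():
        onset = us_onset - isi  # the ISI runs from stimulus onset to the US
        timeline[name] = (onset, onset + duration)
    return timeline, us_onset


timeline, us_onset = forward_timeline({"A": (10, 10), "B": (5, 5)})
print(timeline, us_onset)  # {'A': (0, 10), 'B': (5, 10)} 10
```

With these values A starts 5 seconds before B and both end when the US begins, which is the arrangement in which A precedes and overlaps the target B.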

Next enter the data for the remaining groups. Once a stimulus has been entered its “Temporal

Properties” will be filled automatically (it will keep the information from the first group). If you need to

change them, you can do so.


Now we can input the parameters as in Table 1, with “Set Different US per Phase” disabled.

Table 1 Egger-Miller parameters

Parameter Value

α 0.3

β+ 0.75

β− 0.75

λ 1

δ 0.9

γ 0.95

Now click the “Run” button to begin simulating the experiment.

Figure 10 Egger-Miller results

Figure 10 shows the effect. The redundant stimulus A reduces the associative strength of B when

compared with the levels acquired by B in Group “Informative” and Group “Control”.

4.2 Working with context - ABA Renewal

In this example you will use the simulator to simulate the ABA renewal effect (Bouton & Bolles,

1979), where a stimulus conditioned in one context and extinguished in another recovers some of

the associative strength acquired previously when returned to the original context. This example will

show you how to use different contexts across phases.

In this instance, you will need two groups, Group “ABA” (the experimental group) and Group

“AAA” (the control, that will be conditioned with the same context throughout all phases). Create the

first group, and set the trial description to “50A+” with a fixed duration of 10 seconds, forward

conditioned. Now, add Group “AAA” with the same trial description, i.e., “50A+”. You should now

have two groups, with identical trial descriptions.


Now, add a new phase. This will be the extinction phase of the conditioning, so enter a trial

description of “25A−” for the two groups. Finally, add a test phase with a trial description of “5A−” for

all groups.

Next, you need to add context to your phases. Enable the “Different Contexts” setting, then change the identity of the context shown in Phase 2 for Group “ABA”. For this experiment, ensure the context alpha is set to the default (0.05) and change the context from φ to ψ, then click “OK”. The design table should look as in Figure 11.

Figure 11 Design for ABA renewal

Finally enable compound results by selecting “Show Compound Results” from the “Design

Settings” menu. Ensure that your parameters are set as in Table 2, and click “Run”.

Table 2 ABA renewal parameters

Parameter Value

α 0.3

β+ 0.75

β− 0.75

λ 1

δ 0.9

γ 0.95

φ 0.05

ψ 0.05

Once the simulation has completed, click the “Display Figures” button to get a graphical view of

the results and inspect the results for the φA compound in the final phase - the relative performance of the “AAA” group and of the group “ABA” shows the impact of the different context.


Figure 12 ABA Renewal results

Figure 12 shows V mean values per trial for the test phase of the experiment, with the φA

compound showing reduced extinction in Group “ABA”.

4.3 Using configural cues - Negative Patterning

In this example, you will use the simulator to reproduce a negative patterning phenomenon. In this

procedure, two stimuli signal the US separately whereas the compound formed by these two stimuli

does not (Rescorla, 1972).

Create a single group named “Negative P.” with the trial description “100A+/100B+/100AB−”, and

set all stimuli to a fixed duration of 10 seconds, forward conditioning. In this experiment the different

trial types must be interspersed, so you will need to make sure the “Random” checkbox is selected.

You will also need to increase the number of random combinations from the default (100), to 1000.

To do this, select “Number of Random Trial Combinations” from the “Procedural Settings” menu,

enter 1000 in the popup that appears and click “OK”. For this experiment, use the following

parameters:

Table 3 Negative patterning parameters

Parameter Value

α 0.3

β+ 0.75

β− 0.7

λ 1

δ 0.9

γ 0.95

Finally, enable “Show Compound Results” from the “Design Settings” menu, click the “Run”

button, and then display the figures. As you can see (Figure 13 (a)), no discrimination occurs. Like

the Rescorla and Wagner model, TD requires the use of configural cues to successfully solve

negative patterning.
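Why purely elemental summation cannot solve this discrimination can be seen from a simplified, trial-level argument. This is a sketch that ignores the within-trial time-steps used by TD. Here V_A, V_B and V_c denote the associative strengths of A, B and an added configural cue, and λ is the US asymptote:

```latex
\begin{align*}
\text{A+ and B+ trials:}\quad
  & V_A \to \lambda, \qquad V_B \to \lambda \\
\text{AB$-$ trials, elements only:}\quad
  & V_A + V_B \to 0
  \quad \text{(incompatible with the line above when } \lambda > 0\text{)} \\
\text{AB$-$ trials, with a configural cue:}\quad
  & V_A + V_B + V_c \to 0
  \;\Longrightarrow\; V_c \to -2\lambda
\end{align*}
```

Without the extra cue the two requirements conflict, so the elements settle on a compromise and the compound is never weaker than its elements. With the cue, a negative V_c can cancel the summed strength of A and B on compound trials only.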


Now, enable the “Use Configural Cues” option from the “Design Settings” menu. This will add a single configural cue, c(AB), to the CS α table, with its salience set to the product of the values for A and B. Run the experiment again, and view the figures. When viewing figures with configural cues enabled, configural cues are not visible by default. To show configural cues on a figure, the user must check the corresponding checkbox at the bottom of the figure window.

(a) The simulation does not predict negative patterning discrimination without configural cues

(b) The simulation correctly predicts negative patterning discrimination when using configural cues

Figure 13 Negative patterning results

Figure 13 (b) demonstrates that, assuming a configural cue in the representation of the stimulus compound [AB], TD is able to solve the discrimination: the associative strength of the individual stimuli increases whereas the associative strength of the compound decreases with training.

At this point, you can also view the associative strength of the c(AB) cue – and note that it has

become a conditioned inhibitor.


4.4 Using Variable Durations - Temporal Overshadowing

To demonstrate the use of variable durations for stimuli we will now simulate a temporal

overshadowing experiment. Overshadowing refers to a phenomenon by which a stimulus (the target stimulus) acquires more associative strength when it is conditioned alone with a US than when it is conditioned in compound with a second stimulus. Temporal overshadowing results have shown that a fixed

duration stimulus overshadows a target stimulus more than a variable length stimulus, both when the

target stimulus is fixed and when it is variable (Jennings, Alonso, Mondragón & Bonardi, 2011).

First, we create Group “FF”. Phase 1 will have 90 reinforced presentations of two stimuli, A and B, of equal and fixed duration (30 s), presented simultaneously with forward conditioning. We now add Phase 2, which is identical for all groups and consists of 3 non-reinforced presentations of stimulus B with a fixed duration. Phase 1 of Group “VF” consists of a fixed-duration stimulus B presented with a variable-duration stimulus A, both stimuli terminating at the time of US delivery. We can now add the remaining groups and phases following the specifications in Table 4. Please note that all variable stimuli have a mean duration of 30 s and that their durations are exponentially distributed (“Procedural Settings/Variable Distribution Type/Exponential”).
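To give a feel for what an exponentially distributed duration with a 30 s mean looks like, the following sketch draws one duration per trial using the Python standard library. It is an illustration only: the function name and the seed argument are hypothetical, and the guide does not state how the simulator itself samples durations (for example, whether values are rounded to whole time steps or bounded below).

```python
import random


def sample_variable_durations(n_trials, mean_s=30.0, seed=None):
    """Draw n_trials stimulus durations (in seconds) from an exponential
    distribution with the given mean. Illustrative helper, not simulator code."""
    rng = random.Random(seed)
    # expovariate expects the rate (1 / mean), not the mean itself.
    return [rng.expovariate(1.0 / mean_s) for _ in range(n_trials)]


if __name__ == "__main__":
    durations = sample_variable_durations(90, seed=1)
    print("shortest:", round(min(durations), 1), "s")
    print("longest: ", round(max(durations), 1), "s")
    print("mean:    ", round(sum(durations) / len(durations), 1), "s")
```

Individual durations vary widely around the mean, which is why the time of US delivery is much less predictable from the onset of a variable stimulus than from the onset of a fixed one.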

Table 4 Temporal overshadowing settings

       Phase 1                                              Phase 2
Group  CSs    Duration A  Duration B  Length (s)  C. type   CS   Duration  Length (s)  C. type
FF     90AB+  Fixed       Fixed       30          Fw        3B-  Fixed     30          Fw
VF     90AB+  Variable    Fixed       30          Fw        3B-  Fixed     30          Fw
F      90B+   --          Fixed       30          Fw        3B-  Fixed     30          Fw
VV     90AB+  Variable    Variable    30          Fw        3B-  Fixed     30          Fw
FV     90AB+  Fixed       Variable    30          Fw        3B-  Fixed     30          Fw
V      90B+   --          Variable    30          Fw        3B-  Fixed     30          Fw

(a) Temporal properties

Parameter  Value
α          0.3
β+         0.75
β−         0.7
λ          1
δ          0.9
γ          0.95

(b) Temporal overshadowing parameters

Press “Set parameters” and make sure that the values match those in Table 4 (b). Click “Run”.
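The trial notation used in Table 4 (for example 90AB+ or 3B-) packs three pieces of information into one string: the number of trials, the stimuli presented, and whether the trials are reinforced. The sketch below decodes that notation and records the Phase 1 design of the six groups as plain data, which can be handy for checking a design before entering it. The names `TrialSpec`, `parse_trial_spec` and `PHASE_1` are hypothetical and are not part of the simulator.

```python
import re
from typing import NamedTuple, Tuple


class TrialSpec(NamedTuple):
    count: int               # number of trials
    cues: Tuple[str, ...]    # stimuli presented together
    reinforced: bool         # True for "+", False for "-"


# digits, then one or more upper-case cue letters, then + or - (ASCII or U+2212)
_TRIAL_PATTERN = re.compile(r"^(\d+)([A-Z]+)([+\-\u2212])$")


def parse_trial_spec(text):
    """Decode a string such as '90AB+' into a TrialSpec."""
    match = _TRIAL_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid trial specification: {text!r}")
    count, cues, sign = match.groups()
    return TrialSpec(int(count), tuple(cues), sign == "+")


# Phase 1 of Table 4: trial string and duration type of each stimulus.
PHASE_1 = {
    "FF": ("90AB+", {"A": "Fixed", "B": "Fixed"}),
    "VF": ("90AB+", {"A": "Variable", "B": "Fixed"}),
    "F": ("90B+", {"B": "Fixed"}),
    "VV": ("90AB+", {"A": "Variable", "B": "Variable"}),
    "FV": ("90AB+", {"A": "Fixed", "B": "Variable"}),
    "V": ("90B+", {"B": "Variable"}),
}

if __name__ == "__main__":
    for group, (trials, durations) in PHASE_1.items():
        print(group, parse_trial_spec(trials), durations)
```

A quick consistency check is that every cue named in a trial string also has a duration type assigned, which is the case for all six groups above.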


Figure 14 Temporal overshadowing simulation during B test trials Phase 2.

Figure 14 shows the simulated mean associative strength of B during test. The simulation correctly predicts lower associative strength in Groups “FF” and “FV”, in which the overshadowing stimulus is fixed, than in Groups “VF” and “VV”, in which it is variable, each relative to its corresponding control (Group “F” when the target B is fixed, Group “V” when it is variable).

Figure 15 CS components associative strength during the last test trial

Figure 15 shows the simulated associative strength of the CS components. The simulation correctly predicts a timing pattern, that is, a progressive increase in component strength towards the offset of stimulus B, when B is trained with a fixed duration (Groups “F”, “FF” and “VF”), and a flat pattern in the strength of the B components for Groups “V”, “FV” and “VV”, in which B is trained with a variable duration.
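The rising pattern obtained with a fixed-duration stimulus can be reproduced with a very small TD sketch. The code below is a simplification under stated assumptions, not the simulator's implementation: it represents the stimulus as one component per second (a complete serial compound), trains a single stimulus on its own, uses one-step TD without eligibility traces, and takes the step size to be the product α·β+. The function name is hypothetical; the parameter values are taken from Table 4 (b).

```python
def train_fixed_duration_csc(trials=90, n_components=30,
                             alpha=0.3, beta=0.75, gamma=0.95, lam=1.0):
    """One-step TD learning over a fixed-duration stimulus.

    Assumptions (not confirmed by the guide): one component per time step,
    US delivered at stimulus offset, no eligibility traces.
    Returns the learned strength of each component, from onset to offset.
    """
    w = [0.0] * n_components
    step = alpha * beta
    for _ in range(trials):
        for t in range(n_components):
            is_last = t == n_components - 1
            reward = lam if is_last else 0.0           # US arrives at offset
            next_value = 0.0 if is_last else w[t + 1]  # nothing follows the US
            td_error = reward + gamma * next_value - w[t]
            w[t] += step * td_error
    return w


if __name__ == "__main__":
    strengths = train_fixed_duration_csc()
    for t in (0, 10, 20, 29):
        print(f"component {t:2d}: {strengths[t]:.3f}")
```

Because each component learns to predict the discounted strength of the component that follows it, strength grows towards the offset of the stimulus, where the US is delivered. The sketch covers only the fixed-duration case; it does not attempt to reproduce the flat pattern reported for variable durations.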


5 Terms of use

Please read the following license agreement carefully. If you do not accept the terms of the agreement, please delete the software from your computer immediately.

Temporal Difference Simulator ver. 1.0 and its software and documentation is copyrighted by

Jonathan Gray, Eduardo Alonso, Esther Mondragón, and Alberto Fernández (The authors). The

following terms apply to Temporal Difference Simulator ver. 1.0 unless explicitly disclaimed. The

authors hereby grant permission to use, copy and distribute, (but NOT sell or modify) this software

and its documentation, provided that it is retained unchanged in all copies and that this notice is

included verbatim in any distributions. No written agreement, license, or royalty fee is required to

use or distribute this software.

Use of this software and its authorship must be acknowledged in oral (for example, lectures,

tutorials, laboratory sessions, demonstrations, conferences) or written communication (for

example, books, articles, proceedings).

The authors are not liable for any misuse or misleading use of the software.

IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, INDIRECT OR

CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS

OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR

OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR

PERFORMANCE OF THIS SOFTWARE.

THE AUTHOR SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT

LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A

PARTICULAR PURPOSE, AND NON-INFRINGEMENT. THIS SOFTWARE AND ITS

DOCUMENTATION ARE PROVIDED ON AN "AS IS" BASIS, AND THE AUTHORS HAVE NO

OBLIGATION TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR

MODIFICATIONS.

6 Feedback

If you have any questions or comments about the application or this document, please feel free to

email Esther Mondragón, at [email protected]. We welcome any suggestions or criticisms. If

there is an inaccuracy somewhere, please let us know where it occurred and what values were

used. It would be very helpful if a saved file from the simulator, which contains the values, is sent as

an attachment.

7 References

Alonso, E., Mondragón, E., & Fernández, A. (2012). A Java simulator of Rescorla and Wagner's

prediction error model and configural cue extensions. Computer Methods and Programs in

Biomedicine. doi: 10.1016/j.cmpb.2012.02.004. Available online 13 March 2012,

http://www.sciencedirect.com/science/article/pii/S0169260712000429.

Bouton, M. E., & Bolles, R. C. (1979). Contextual control of the extinction of conditioned fear. Learning and Motivation, 10, 445–466.

Church, R., & Kirkpatrick, K. (2001). Theories of conditioning and timing. In S. Klein & R. Mowrer (Eds.), Handbook of contemporary learning theories (pp. 211–255). Mahwah, NJ: Lawrence Erlbaum Associates.

Egger, M. D., & Miller, N. E. (1962). Secondary reinforcement in rats as a function of information value

and reliability of the stimulus. Journal of Experimental Psychology, 64, 97–104.


Jennings, D. J., Alonso, E., Mondragón, E., & Bonardi, C. (2011). Temporal uncertainty during

overshadowing: A temporal difference approach. In E. Alonso & E. Mondragón (Eds.),

Computational Neuroscience for Advancing Artificial Intelligence: Models, Methods and

Applications (pp. 46-55). Hershey, PA: IGI Global.

Mondragón, E., Alonso, E., & Fernández, A. (2011). Rescorla & Wagner Simulator © V.3 and V. 3.1

[Computer software]. London: CAL-R. http://www.cal-r.org/index.php?id=R-Wsim.

Moore, J., Choi, J., & Brunzell, D. (1998). Predictive timing under temporal uncertainty: the TD model

of the conditioned response. In D. Rosenbaum & A. Collyer (Eds.), Timing of Behavior: Neural,

Computational, and Psychological Perspectives (pp. 3–34). Cambridge, MA: MIT Press.

Rescorla, R. A. (1972). "Configural" conditioning in discrete-trial bar pressing. Journal of Comparative

and Physiological Psychology, 79, 307-317.

Singh, S. P., & Sutton, R. S. (1996). Reinforcement learning with replacing eligibility traces. Machine

Learning, 22, 123–158.

Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3,

9–44.

Sutton, R. S., & Barto, A. G. (1987). A temporal-difference model of classical conditioning. In

Proceedings of the Ninth Annual Conference of the Cognitive Science Society, pp. 355–378.

Sutton, R. S., & Barto, A. G. (1990). Time-derivative models of Pavlovian reinforcement. In M. Gabriel & J. Moore (Eds.), Learning and computational neuroscience: Foundations of adaptive networks (pp. 497–537). Cambridge, MA: MIT Press.
