CAL Simulation Guides ISSN 2054-2747 Temporal Difference Simulator © Version 1.0 2012
Jonathan Gray, Eduardo Alonso, Esther Mondragón & Alberto Fernández.
Centre for Computational and Animal Learning Research St. Albans, United Kingdom
Temporal Difference Simulator 1.0
Contents
1 Introduction
2 Before you start
3 Running the application
   3.1 Entering a design
   3.2 Temporal properties
   3.3 Parameters
   3.4 Design settings
      3.4.1 Set Different US per Phase
      3.4.2 Context Simulation
      3.4.3 Compound Results and Configural Cues
   3.5 Procedural settings
      3.5.1 Number Random Trial Combinations
      3.5.2 Number of Random Distributions for Variable Length CS
      3.5.3 Time-step Length
      3.5.4 Add Decision Rule Simulation
      3.5.5 Eligibility Traces
      3.5.6 Mean Type
      3.5.7 Variable Distribution Type
   3.6 Saving a design
   3.7 Loading a design
   3.8 Outputs
   3.9 Exporting results to excel
   3.10 Figures display
4 Worked Examples
   4.1 A simple worked example - Egger-Miller effect
   4.2 Working with context - ABA Renewal
   4.3 Using configural cues - Negative patterning
   4.4 Using variable durations - Temporal overshadowing
5 Terms of use
6 Feedback
7 References
1 Introduction
This document is a quick guide to installing and using the Centre for Computational and Animal
Learning Research’s Temporal Difference Simulator v1.0, a simulator of the Temporal Difference (TD)
model with a Complete Serial Compound stimulus representation (Sutton & Barto, 1987; Moore, Choi &
Brunzell, 1998).
Executable versions (.exe for Windows and .app for Apple) are available, in addition to a .jar file
intended to run on Java Runtime Environment 6 or above for UNIX/Linux operating systems. Once
downloaded to your computer, the file will run without installation.
This document does not cover technical details of the underlying implementation, or the Temporal
Difference model.
The simulator builds upon a simulator of Rescorla and Wagner’s model, the “RW_Simulator”
version 3.0 (Mondragón, Alonso & Fernández, 2011; Alonso, Mondragón & Fernández, 2012).
2 Before you start
You will need to download a version of the simulator appropriate to your platform (.exe for Windows
machines, .app for OSX machines or .jar for UNIX/Linux operating systems) from
http://www.cal-r.org/index.php?id=software.
For Windows users the download will be named “TD_Simulator.exe” and can be run directly after
downloading. For Mac users a disk image is provided (“TD_Simulator.dmg”) containing the .app file,
which can be run either directly from the disk image or after dragging the .app to your Applications
folder.
Users of other platforms should select the “JAVA” button to download the “TD_Simulator.jar” file.
This file will run on any platform provided that Java Runtime Environment (JRE) 6 or above is
installed. Most popular Linux distributions such as Fedora, Debian, Ubuntu, Arch, and CentOS
already include a JRE.
Users who wish to access the source code of the simulator should also download the .jar file, which contains .java files in addition to the runnable binaries.
3 Running the application
To start the simulator, navigate to the directory where you stored the file and double
click the icon to launch it. After launching the simulator, you will be presented with the main screen as
shown in Figure 1.
This window is headed by the main menu (“File”, “Design Settings”, “Procedural Settings”, and
“Help”), and consists of two input panels and one output panel. The experimental design is specified
in a matrix of groups and phases in the top panel; the values of the parameters are entered in the
bottom left panel; summary results are displayed in the output panel on the bottom right.
Figure 1 Simulator main screen
3.1 Entering a design
Begin by selecting the cell on the first row, next to Group 1 to enter the experimental design for
phase 1, describing each trial as follows:
Number of trials followed by Stimuli followed by Reinforcer (+ or -)
Make sure that you separate each trial type with a forward slash (/) and press Enter when you
have finished entering trial descriptions for a phase and group. For example, 80 reinforced trials of
stimulus A followed by 80 reinforced trials of two stimuli A and B would be entered as “80A+/80AB+”.
The order in which trials occur is determined by the order in which they are entered in the phase; if
the design requires that different kinds of trial occur in a random ordering, the corresponding
“Random” checkbox should be checked.
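The trial notation described above can be illustrated with a small parser. This is only a sketch of the notation as this guide describes it, not the simulator's own code; the names `TrialType` and `parse_phase` are hypothetical, and it assumes stimuli are written as single letters.

```python
import re
from typing import List, NamedTuple


class TrialType(NamedTuple):
    count: int        # number of trials of this type
    stimuli: str      # stimuli presented together, e.g. "AB"
    reinforced: bool  # True for "+", False for "-"


# Count, then stimulus letters, then "+" or "-" (ASCII hyphen or Unicode minus).
_TRIAL = re.compile(r"^(\d+)([A-Za-z]+)([+\-\u2212])$")


def parse_phase(description: str) -> List[TrialType]:
    """Split a phase description such as "80A+/80AB+" into its trial types."""
    trials = []
    for part in description.strip().split("/"):
        match = _TRIAL.match(part.strip())
        if match is None:
            raise ValueError(f"cannot parse trial description: {part!r}")
        count, stimuli, sign = match.groups()
        trials.append(TrialType(int(count), stimuli, sign == "+"))
    return trials


for trial in parse_phase("80A+/80AB+"):
    print(trial)
```

The order of the returned list follows the order in which the trial types were entered, which matches the default (non-random) ordering described above.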
To change the name of a group, click the current name and edit it. You can also add and remove
groups by clicking the “+” and “−” buttons on the left of the window. Similarly, clicking the “+” and “−”
buttons at the top right will respectively add and remove phases.
3.2 Temporal properties
After you enter a trial description, you will see that the corresponding cell in the “Stimuli Temporal
Parameters” column has been populated with the stimuli you entered. Here, we can set up the
durations of stimuli and their temporal relationship to one another (the type of conditioning). Click the
cell to open the temporal properties window for this group and phase as in Figure 2.
First, set the duration of the US, then click the “CS Temporal Properties” for a CS to set the duration
of that CS. Stimuli can have a fixed (F) duration where they will last exactly the number of seconds
specified on every trial, or a variable (V) duration. For stimuli with a variable length, actual lengths on
each trial are selected from a random distribution of durations such that the mean duration of the
stimulus over all the trials where it is present in a phase is the number of seconds input by the user.
For variable duration stimuli, each phase is run with a number of random orderings¹ of these
durations specified in the “Number of Random Distributions for Variable Length CS” (see 3.5.2). The
values for variable length stimuli are produced by averaging the results of all of these runs.
Stimuli can also be configured to have different onsets and offsets, relative to the US, by clicking the
corresponding cell in the “Conditioning” column. A stimulus can be configured to have a forward² (Fw,
the default) relationship to the US, a backward (Bw) relationship, or a simultaneous (Sm) relationship.
In the same window, the ISI (inter-stimulus interval) can also be set, giving the length of time between
onset of the stimulus and the US for forward and simultaneous conditioning, and the end of the US
and the start of the stimulus for backward conditioning.
The durations of ITIs (inter-trial intervals) can also be configured per phase and group by clicking
the “ITI” column, with each ITI for a phase made up of a minimal fixed length and an additional
variable period that can be 0.
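The ISI definition above can be turned into a short calculation of when a CS starts and ends relative to the US. This is a sketch under stated assumptions: the function name `cs_window` is hypothetical, times are in seconds, and for forward and simultaneous conditioning the ISI is read as the time from CS onset to US onset.

```python
from typing import Tuple


def cs_window(us_onset: float, us_duration: float, cs_duration: float,
              isi: float, conditioning: str = "Fw") -> Tuple[float, float]:
    """Return (onset, offset) of a CS for a given conditioning type."""
    if conditioning in ("Fw", "Sm"):
        # ISI measured from CS onset to US onset (assumption, see lead-in).
        onset = us_onset - isi
    elif conditioning == "Bw":
        # ISI measured from the end of the US to the start of the CS.
        onset = us_onset + us_duration + isi
    else:
        raise ValueError(f"unknown conditioning type: {conditioning!r}")
    return onset, onset + cs_duration


# A 10 s CS with a 10 s ISI and a 5 s CS with a 5 s ISI, US starting at t = 20 s.
print(cs_window(us_onset=20, us_duration=1, cs_duration=10, isi=10))
print(cs_window(us_onset=20, us_duration=1, cs_duration=5, isi=5))
```

With these values the longer CS starts first and both end at US onset, which is the arrangement used in the Egger-Miller worked example later in this guide.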
3.3 Parameters
After entering an experiment configuration you can edit the parameters by pressing the “Set
Parameters” button. Three tables appear. In the top table, α values for each CS must be entered
(the default is 0.3). The middle table contains a set of default values given to the US (β+ is set to
0.75, β− to 0.7 and λ to 1); the user can, of course, modify these parameters. Finally, the bottom table,
labeled “Others”, contains the δ (trace decay) and γ (discount factor) parameters, which default to 0.9
and 0.95 respectively and may also be modified.
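To give a rough idea of the role each parameter plays, the following is a generic textbook-style sketch of TD learning with a complete serial compound representation. It is not the simulator's implementation, which this guide does not describe. The function name `train_csc_td`, the single CS, the US arriving one time-step after CS offset, the replacing trace and the use of a single β (β+, since every trial is reinforced) are all assumptions made for illustration.

```python
from typing import List


def train_csc_td(n_trials: int = 80, cs_steps: int = 10,
                 alpha: float = 0.3, beta: float = 0.75, lam: float = 1.0,
                 delta: float = 0.9, gamma: float = 0.95) -> List[float]:
    """Return one associative strength per CS component after training."""
    w = [0.0] * cs_steps                 # one weight per time-step of the CS
    for _ in range(n_trials):
        trace = [0.0] * cs_steps         # eligibility of each component
        v_prev = 0.0                     # prediction at the previous time-step
        for t in range(cs_steps + 1):    # at the final step the CS is off and the US arrives
            cs_on = t < cs_steps
            v_now = w[t] if cs_on else 0.0
            us = lam if t == cs_steps else 0.0
            # Prediction error scaled by the US learning rate beta.
            error = beta * (us + gamma * v_now - v_prev)
            # Each component changes in proportion to its salience and eligibility.
            for i in range(cs_steps):
                w[i] += alpha * error * trace[i]
            # Traces decay by delta; the active component is reset to 1 (replacing trace).
            trace = [delta * e for e in trace]
            if cs_on:
                trace[t] = 1.0
            v_prev = v_now
    return w


strengths = train_csc_td()
print([round(v, 3) for v in strengths])
```

In this sketch the component closest to the US approaches λ, and earlier components approach values discounted by γ for each time-step of distance from the US.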
¹ Note that using random trial orderings in combination with variable stimuli can result in long runtimes.
² Note that for variable length stimuli, configuring a forward relationship will always ensure the US is delivered after the
stimulus has finished, even though the actual duration of the stimulus changes on each phase.
Figure 2 Temporal properties window
3.4 Design settings
3.4.1 Set Different US per Phase
This option allows you to set different β+, β− and λ values for different phases, i.e. different US
motivational values per phase.
3.4.2 Context Simulation
The simulator supports the simulation of context; selecting “Same Context” from the “Context
Simulation” menu in the “Design Settings” menu will prompt the user to enter a salience for the
context (by default 0.05), then add it to all trials in all phases and groups, as seen in Figure 3.
Figure 3 Same context display
Alternatively, selecting “Different Contexts” will allow the user to set one of several contexts per
phase and group, and modify the salience by clicking the context column as shown in Figure 4.
Figure 4 Different contexts display
This adds a context column in each phase. By default, the context added is the φ context,
represented by φ(0.05) in the context column of a phase, as with a single context. To modify a
context, click on it to open the context window (Figure 4). Here, you can select one of four
distinct contexts and alter its salience. By default, “No Context Simulation” is selected.
3.4.3 Compound Results and Configural Cues
This simulator generates associative strength values for standard additive compound stimuli. It also
computes compound values using added configural cues, which have a length equal to the
overlapping period of all the component stimuli of the configural cue.
To calculate compound values, you must select “Design Settings/Show Compound Results”. If you wish to calculate stimulus compounds with configural cues you must also tick “Design Settings/Use Configural Cues” in the main menu. Press “Set Parameters” to input the alpha values for the configural cues. By default, the product of the elemental alpha values will appear but the
user can modify these values. Configural cues are represented as “c(ΦA)”, “c(ΦAB)”, etc. (see
Figure 5).
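As a small illustration of the default described above, the salience of a configural cue can be computed as the product of the saliences of its elements. The function name `default_configural_alpha` and the example values are hypothetical; only the product rule comes from this guide.

```python
from math import prod
from typing import Dict, Iterable


def default_configural_alpha(alphas: Dict[str, float], members: Iterable[str]) -> float:
    """Default salience of a configural cue: product of its elements' alphas."""
    return prod(alphas[name] for name in members)


# Example saliences: two CSs at the default 0.3 and a context at 0.05.
alphas = {"A": 0.3, "B": 0.3, "Φ": 0.05}

for members in ("AB", "ΦA", "ΦAB"):
    print(f"c({members})", default_configural_alpha(alphas, members))
```

Because the product of values below 1 shrinks quickly, a cue built from many elements starts with a very small salience unless the user overrides it in the parameters table.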
Figure 5 Settings for calculating compounds with context and configural cues
3.5 Procedural settings
3.5.1 Number Random Trial Combinations
Select this option to alter the number of combinations used for random phases; the default is 100.
Setting this to a high number will result in significantly slower runtimes, particularly in combination
with variable length stimuli.
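The idea of running a random phase over several trial orderings can be sketched as follows. This only illustrates generating the orderings; how the simulator draws and combines them internally is not documented here. The function name `random_orderings`, the trial labels and the `seed` argument are hypothetical.

```python
import random
from typing import Dict, Iterator, List, Optional


def random_orderings(trial_counts: Dict[str, int], n_combinations: int = 100,
                     seed: Optional[int] = None) -> Iterator[List[str]]:
    """Yield n_combinations shuffled trial sequences for one random phase."""
    rng = random.Random(seed)
    # Expand e.g. {"A-": 40, "AB+": 80} into a flat list of 120 trials.
    base = [label for label, count in trial_counts.items() for _ in range(count)]
    for _ in range(n_combinations):
        order = base[:]      # copy so each ordering is shuffled independently
        rng.shuffle(order)
        yield order


for order in random_orderings({"A-": 40, "AB+": 80}, n_combinations=3, seed=1):
    print(order[:6], "...")
```

Each additional combination means one more full run of the phase, which is why a high setting slows the simulation down.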
3.5.2 Number of Random Distributions for Variable Length CS
Select this option to alter the number of combinations used for variable length stimuli; the default is
100. Setting this to a high number will result in significantly slower runtimes, particularly in
combination with random trial combinations.
3.5.3 Time-step Length
Select this option to alter the size of the time-step used in the simulation, which defaults to 1 (one
time-step equals one second). The time-step size also determines the number of components per
stimulus (the length of the stimulus divided by the time-step size). Note that the time-step is also the
lower bound on durations: the US, CSs and ITIs can be no shorter than one time-step.
Smaller time-steps allow the user to simulate at a higher temporal resolution, but increase
the simulation time required for long stimulus durations, as the number of components increases.
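The relationship between duration, time-step and number of components can be written out directly. In this sketch the function name `n_components` is hypothetical, and rounding to the nearest whole component is an assumption; the guide only states that the count is the stimulus length divided by the time-step size.

```python
def n_components(duration_s: float, time_step_s: float = 1.0) -> int:
    """Number of components of a stimulus lasting duration_s seconds."""
    if time_step_s <= 0:
        raise ValueError("time-step must be positive")
    if duration_s < time_step_s:
        # The time-step is the lower bound on every duration.
        raise ValueError("duration cannot be shorter than the time-step")
    return round(duration_s / time_step_s)


print(n_components(10))        # 10 s stimulus, default 1 s time-step
print(n_components(10, 0.5))   # same stimulus at twice the temporal resolution
```

Halving the time-step doubles the number of components per stimulus, which is the source of the longer runtimes mentioned above.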
3.5.4 Add Decision Rule Simulation
Select this option to add a simulated conditioned response rate per minute to results. The response
rate is derived from the decision rule given by Church and Kirkpatrick (2001). Enabling the option
prompts the user to enter a threshold value that must be exceeded to produce a response.
3.5.5 Eligibility Traces
Use this menu to select the algorithm used to calculate eligibility traces that are used to control the
extent to which a component is eligible for changes to associative strength. Three options are
available: an accumulating trace (Sutton, 1988) which increases each time a CS occurs and decays
relatively slowly, a replacing trace (Singh & Sutton, 1996) that has the same decay properties but
never accumulates above 1, and a bounded accumulating trace (Sutton & Barto, 1987, 1990) where trace decay is exponential.
By default, the simulator uses a replacing trace.
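The difference between the three trace types can be seen in a short sketch. These are common textbook forms of each rule, written with a single decay parameter, and are not taken from the simulator's source; the function names and the exact form of the bounded accumulating trace are assumptions.

```python
from typing import List, Sequence


def accumulating(x: Sequence[int], decay: float = 0.9) -> List[float]:
    """Trace grows by 1 each time the component is active; can exceed 1."""
    e, out = 0.0, []
    for active in x:
        e = decay * e + active
        out.append(e)
    return out


def replacing(x: Sequence[int], decay: float = 0.9) -> List[float]:
    """Trace is reset to 1 when the component is active; never exceeds 1."""
    e, out = 0.0, []
    for active in x:
        e = 1.0 if active else decay * e
        out.append(e)
    return out


def bounded_accumulating(x: Sequence[int], decay: float = 0.9) -> List[float]:
    """Exponentially weighted trace that approaches 1 but stays below it."""
    e, out = 0.0, []
    for active in x:
        e = decay * e + (1.0 - decay) * active
        out.append(e)
    return out


activity = [1, 1, 1, 0, 0]   # component active for three steps, then off
for rule in (accumulating, replacing, bounded_accumulating):
    print(rule.__name__, [round(v, 4) for v in rule(activity)])
```

With repeated activation the accumulating trace climbs above 1, whereas the other two remain bounded; all three decay by the same factor once the component is no longer active.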
3.5.6 Mean Type
Use this menu to choose the type of mean used when producing variable distributions for stimuli and ITI durations. By default the simulator will use the arithmetic mean, but it can alternatively use the geometric mean.
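The effect of the mean type can be illustrated by rescaling a set of durations so that the chosen mean equals a target value. This is a sketch of one way such a constraint could be met, not a description of the simulator's procedure; the function name `rescale_to_mean` and the example durations are hypothetical.

```python
from statistics import fmean, geometric_mean
from typing import List, Sequence


def rescale_to_mean(durations: Sequence[float], target: float,
                    mean_type: str = "arithmetic") -> List[float]:
    """Scale positive durations so their chosen mean equals target."""
    if mean_type == "arithmetic":
        mean_fn = fmean
    elif mean_type == "geometric":
        mean_fn = geometric_mean
    else:
        raise ValueError(f"unknown mean type: {mean_type!r}")
    factor = target / mean_fn(durations)
    return [d * factor for d in durations]


durations = [10.0, 20.0, 40.0]
print(rescale_to_mean(durations, 30.0, "arithmetic"))
print(rescale_to_mean(durations, 30.0, "geometric"))
```

For the same set of unequal durations the geometric mean is smaller than the arithmetic mean, so matching the geometric mean to the target yields longer durations overall.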
3.5.7 Variable Distribution Type
Two variable distributions are available for producing variable duration stimuli and ITIs: an
exponential distribution and a uniform distribution. By default, the simulator uses an exponential
distribution to produce variable durations.
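The two distribution types can be compared with a short sampling sketch. The function name `draw_durations` is hypothetical, and the range chosen for the uniform case (0 to twice the mean) is an assumption; the guide does not state the bounds the simulator uses. The sketch also does not enforce the time-step lower bound or force the sample mean to match the target exactly.

```python
import random
from typing import List, Optional


def draw_durations(mean_s: float, n: int, distribution: str = "exponential",
                   seed: Optional[int] = None) -> List[float]:
    """Draw n durations (seconds) whose expected value is mean_s."""
    rng = random.Random(seed)
    if distribution == "exponential":
        # expovariate takes the rate, i.e. 1 / mean.
        return [rng.expovariate(1.0 / mean_s) for _ in range(n)]
    if distribution == "uniform":
        # Assumed range: symmetric around the mean, starting at 0.
        return [rng.uniform(0.0, 2.0 * mean_s) for _ in range(n)]
    raise ValueError(f"unknown distribution: {distribution!r}")


for kind in ("exponential", "uniform"):
    sample = draw_durations(30.0, 20000, kind, seed=0)
    print(kind, "mean:", round(sum(sample) / len(sample), 2),
          "max:", round(max(sample), 1))
```

Both samples average close to 30 s, but the exponential sample contains occasional very long durations, while the uniform sample never exceeds the upper bound.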
3.6 Saving a design
If you would like to save your design so you can retrieve it another time (for instance, to modify it), all
you have to do is to click “Save” on the “File” menu. This will pop up a window, which asks you
where you would like to save your file. You need to choose a directory and a filename. Click “OK” to
save the file to the directory you have chosen. A file with the extension “.tdl” will appear in the
chosen directory.
3.7 Loading a design
Once you have saved your design, you can re-use it by selecting “Open” from the “File” menu. The
application will replace the currently loaded design with the design from the file you have selected.
3.8 Outputs
After the experimental design and parameters have been entered, and the design and procedural settings have been chosen, click “Run”. Once the simulation has been completed, a textual summary of the results will be displayed in the data area, located on the right-hand side, as shown in
Figure 6³. The user can scroll up and down to check the mean V values of each stimulus per trial, group and phase, and the V of each component of the stimulus on the final trial of each phase.
At this stage complete results can be exported to an Excel spreadsheet for further analysis, or
displayed as figures.
Where compounds are used, they are represented as “AB”, “ABC” etc. Configural cue
compounds are displayed as “[AB]”, “[ABC]”, etc. and replace the standard compounds in text output,
Excel spreadsheets, and figures display.
If decision rule simulation is enabled (see 3.5.4), a simulated response rate (responses per
minute) per component will also be displayed in all outputs.
3.9 Exporting results to excel
The application can also export the results to a “.xlsx” (Excel 2007-2010) type spreadsheet, usable in
Microsoft Excel and current versions of LibreOffice/OpenOffice. It creates a workbook that has a
different sheet per group. Each phase is presented in a separate table. Each sheet
contains the name of the file followed by the CS, US and Other parameters, followed by the context
saliences (if using context). Each phase table is preceded by the temporal properties of all the stimuli
in that group and phase, and by a line showing its design. Each phase table shows the V results per
component for each trial, as well as the average V for the complete stimulus on each trial. If you
have enabled decision rule simulation, a simulated response per component is also shown for each
trial. Figure 7 shows the exported data. To export the results select “File/Export”.
³ The “Clear All” button can also be used to clear the current design, leaving the number of groups and phases intact.
Figure 6 Parameters and results
Figure 7 Excel spreadsheet results
Exporting large result sets can be slow, and a progress bar with an estimated time remaining will
be displayed.
3.10 Figures display
The simulator also displays graphs of results, accessible after running an experiment by clicking the
“Display Figures” button. By default, the simulator will show separate figures for each phase for the
mean associative strength of each stimulus, plotted against trials (Figure 8 (a)), and the associative
strength of each component of each stimulus after the final trial (Figure 8 (b)). If decision rule
simulation has been enabled, the simulator will also show a simulated response graph, showing the
simulated responses per minute for each stimulus at each time-step after the final trial (Figure 8 (c)).
For all graphs, the user can select which groups and stimuli are displayed by checking or
unchecking their respective boxes. The figures can remain open while a new experimental design is
run to aid comparison between results.
(a) Trial level
(b) Component level
(c) Simulated response
Figure 8 Figures display
Figures can also be saved, copied, printed, zoomed and modified. To access these functions,
right-click (or, on a Mac, Ctrl+Click) while pointing at the figure to open the menu. For
instance, to facilitate data comparison you may wish to fix the Y-axis limits (by default axis limits are
set to the highest value plotted). Right-click (or Ctrl+Click) and choose “Properties”. Then, select
“Plot” in the first row of tabs, “Range Axis” (for the Y-axis) in the second, and “Range” in the bottom row of
tabs. Untick the “Auto-adjust range” tick-box and modify the range values as required. Figure 9
shows these menus.
Figure 9 Range axis options for figures
You can also zoom in on a portion of the graph by left-clicking and dragging down and right to
encompass the area you wish to focus on. After zooming in, you can return to the full graph by left-
clicking and dragging up and left.
4 Worked Examples
4.1 A simple worked example - Egger-Miller effect
The Egger-Miller effect (Egger & Miller, 1962) describes a phenomenon in which a redundant
stimulus that precedes and overlaps a target stimulus reduces the target conditioning compared to a
control in which the target is presented alone despite having the same temporal relationship with the
US.
For our Egger-Miller effect experiment we will consider three groups. Group “Redundant” will be
“80AB+”, Group “Informative” will be “40A−/80AB+” randomly presented, and Group “Control” will
consist of “80B+” trials. After entering the trial description for Group “Redundant”, click the “CS
Temporal Properties” for CS A to set the duration of the stimulus. For this example, select “Fixed”
from the “Duration” column and enter 10 for the “Length”, then click “OK” or hit “Enter” to return to the
temporal properties window.
Now, you can set the type of conditioning (how the stimulus is related in time to the US). By
default, this is set to forward conditioning, indicated by “Fw” in the “Conditioning” column. In this
case, the default is fine for our purposes. Check that the “Type” column is set to “Fixed” and the “ISI”
column is set to “10”, then click “OK” or press “Enter”. Now, set CS B to a duration of 5 seconds with
an ISI of 5 seconds.
Next enter the data for the remaining groups. Once a stimulus has been entered its “Temporal
Properties” will be filled automatically (it will keep the information from the first group). If you need to
change them, you can do so.
Now we can input the parameters, as in Table 1 with US per phase disabled.
Table 1 Egger-Miller parameters
Parameter Value
α 0.3
β+ 0.75
β− 0.75
λ 1
δ 0.9
γ 0.95
Now click the “Run” button to begin simulating the experiment.
Figure 10 Egger-Miller results
Figure 10 shows the effect. The redundant stimulus A reduces the associative strength of B when
compared with the levels acquired by B in Group “Informative” and Group “Control”.
4.2 Working with context - ABA Renewal
In this example you will use the simulator to simulate the ABA renewal effect (Bouton & Bolles,
1979), where a stimulus conditioned in one context and extinguished in another recovers some of
the associative strength acquired previously when returned to the original context. This example will
show you how to use different contexts across phases.
In this instance, you will need two groups, Group “ABA” (the experimental group) and Group
“AAA” (the control, that will be conditioned with the same context throughout all phases). Create the
first group, and set the trial description to “50A+” with a fixed duration of 10 seconds, forward
conditioned. Now, add Group “AAA” with the same trial description, i.e., “50A+”. You should now
have two groups, with identical trial descriptions.
Now, add a new phase. This will be the extinction phase of the conditioning, so enter a trial
description of “25A−” for the two groups. Finally, add a test phase with a trial description of “5A−” for
all groups.
Next, you need to add context to your phases. Enable the “Different Contexts” setting, then
change the identity of the context shown in Phase 2 in Group “ABA”. For this experiment,
ensure the context alpha is set to the default (0.05) and change the context from φ to ψ, then click
“OK”. The design table should look as in Figure 11.
Figure 11 Design for ABA renewal
Finally enable compound results by selecting “Show Compound Results” from the “Design
Settings” menu. Ensure that your parameters are set as in Table 2, and click “Run”.
Table 2 ABA renewal parameters
Parameter Value
α 0.3
β+ 0.75
β− 0.75
λ 1
δ 0.9
γ 0.95
φ 0.05
ψ 0.05
Once the simulation has completed, click the “Display Figures” button to get a graphical view of
the results and inspect the results for the φA compound in the final phase - the relative performance of the “AAA” group and of the group “ABA” shows the impact of the different context.
Figure 12 ABA Renewal results
Figure 12 shows V mean values per trial for the test phase of the experiment, with the φA
compound showing reduced extinction in Group “ABA”.
4.3 Using configural cues - Negative Patterning
In this example, you will use the simulator to reproduce a negative patterning phenomenon. In this
procedure, two stimuli signal the US separately whereas the compound formed by these two stimuli
does not (Rescorla, 1972).
Create a single group named “Negative P.” with the trial description “100A+/100B+/100AB−”, and
set all stimuli to a fixed duration of 10 seconds, forward conditioning. In this experiment the different
trial types must be interspersed, so you will need to make sure the “Random” checkbox is selected.
You will also need to increase the number of random combinations from the default (100), to 1000.
To do this, select “Number of Random Trial Combinations” from the “Procedural Settings” menu,
enter 1000 in the popup that appears and click “OK”. For this experiment, use the following
parameters:
Table 3 Negative patterning parameters
Parameter Value
α 0.3
β+ 0.75
β− 0.7
λ 1
δ 0.9
γ 0.95
Finally, enable “Show Compound Results” from the “Design Settings” menu, click the “Run”
button, and then display the figures. As you can see (Figure 13 (a)), no discrimination occurs. Like
the Rescorla and Wagner model, TD requires the use of configural cues to successfully solve
negative patterning.
Now, enable the “Use Configural Cues” option from the “Design Settings” menu. This will add a single
configural cue to the CS α table, c(AB), with the salience set to the product of the values for A and
B. Run the experiment again, and view the figures. When viewing figures with configural cues
enabled, configural cues are not visible by default. To show configural cues on a figure, the user
must check the corresponding checkbox at the bottom of the figure window.
(a) The simulation does not predict negative patterning discrimination without configural cues
(b) The simulation correctly predicts negative patterning discrimination when using configural cues
Figure 13 Negative patterning results
Figure 13 (b) demonstrates that, assuming a configural cue in the representation of the stimulus
compound [AB], TD is able to solve the discrimination: the associative strength of each individual
stimulus increases whereas the associative strength of the compound decreases with training.
At this point, you can also view the associative strength of the c(AB) cue – and note that it has
become a conditioned inhibitor.
4.4 Using Variable Durations - Temporal Overshadowing
To demonstrate the use of variable durations for stimuli we will now simulate a temporal
overshadowing experiment. Overshadowing refers to a phenomenon by which a stimulus that is
conditioned alone with a US (target stimulus) acquires more associative strength than when
conditioned paired with a second stimulus. Temporal overshadowing results have shown that a fixed
duration stimulus overshadows a target stimulus more than a variable length stimulus, both when the
target stimulus is fixed and when it is variable (Jennings, Alonso, Mondragón & Bonardi, 2011).
First, we create the “FF” Group. Phase 1 will have 90 reinforced presentations of two stimuli, A
and B, of equal and fixed durations (30 s), presented simultaneously with forward conditioning. We
now add Phase 2, which will be identical for all groups and will consist of 3 non-reinforced presentations
of stimulus B with a fixed duration. Phase 1 of Group “VF” will consist of a fixed length stimulus B
presented with a variable duration stimulus A, both stimuli ending at the time of US delivery.
We can now add the remaining groups and phases following the specifications in Table 4. Please note
that all variable stimuli will have a 30 s mean and their variability will be exponential (“Procedural
Settings/Variable Distribution Type/Exponential”).
Table 4 Temporal overshadowing settings
        Phase 1                                                   Phase 2
Group   CSs     Duration A   Duration B   Length (s)   C. type    CS    Duration   Length (s)   C. type
FF      90AB+   Fixed        Fixed        30           Fw         3B-   Fixed      30           Fw
VF      90AB+   Variable     Fixed        30           Fw         3B-   Fixed      30           Fw
F       90B+    --           Fixed        30           Fw         3B-   Fixed      30           Fw
VV      90AB+   Variable     Variable     30           Fw         3B-   Fixed      30           Fw
FV      90AB+   Fixed        Variable     30           Fw         3B-   Fixed      30           Fw
V       90B+    --           Variable     30           Fw         3B-   Fixed      30           Fw
(a) Temporal properties
Parameter Value
α 0.3
β+ 0.75
β− 0.7
λ 1
δ 0.9
γ 0.95
(b) Temporal overshadowing parameters
Press “Set Parameters” and make sure that the values are the same as in the table above. Click “Run”.
Figure 14 Temporal overshadowing simulation during B test trials Phase 2.
Figure 14 shows the simulated mean associative strength of B during test. This simulation
correctly predicts lower levels for Group “FF” and Group “FV” in which the overshadowing stimulus is
fixed than for Group “VF” and Group “VV” with a variable overshadowing stimulus, relative to their
corresponding controls Group “F” and Group “V” respectively.
Figure 15 CS components associative strength during the last test trial
Figure 15 shows the simulated associative strength of the CS components. The simulation correctly
predicts a timing pattern in the components’ strength -- a progressive increase in strength towards the
offset of stimulus B -- when B is trained with a fixed duration (Group “F”, Group “FF” and Group “VF”),
and a flat pattern in the strength of B’s components for Groups “V”, “FV” and “VV”, trained with a
variable duration B.
5 Terms of use
Please read the following license agreement carefully. If you do not accept the terms of the
agreement, please delete the software from your computer immediately.
Temporal Difference Simulator ver. 1.0 and its software and documentation is copyrighted by
Jonathan Gray, Eduardo Alonso, Esther Mondragón, and Alberto Fernández (The authors). The
following terms apply to Temporal Difference Simulator ver. 1.0 unless explicitly disclaimed. The
authors hereby grant permission to use, copy and distribute, (but NOT sell or modify) this software
and its documentation, provided that it is retained unchanged in all copies and that this notice is
included verbatim in any distributions. No written agreement, license, or royalty fee is required to
use or distribute this software.
Use of this software and its authorship must be acknowledged in oral (for example, lectures,
tutorials, laboratory sessions, demonstrations, conferences) or written communication (for
example, books, articles, proceedings).
The authors are not liable for any misuse or misleading use of the software.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, INDIRECT OR
CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS
OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR
OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
PERFORMANCE OF THIS SOFTWARE.
THE AUTHOR SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
PARTICULAR PURPOSE, AND NON-INFRINGEMENT. THIS SOFTWARE AND ITS
DOCUMENTATION ARE PROVIDED ON AN "AS IS" BASIS, AND THE AUTHORS HAVE NO
OBLIGATION TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR
MODIFICATIONS.
6 Feedback
If you have any questions or comments about the application or this document, please feel free to
email Esther Mondragón, at [email protected]. We welcome any suggestions or criticisms. If
there is an inaccuracy somewhere, please let us know where it occurred and what values were
used. It would be very helpful if a saved file from the simulator, which contains the values, is sent as
an attachment.
7 References
Alonso, E., Mondragón, E., & Fernández, A. (2012). A Java simulator of Rescorla and Wagner's
prediction error model and configural cue extensions. Computer Methods and Programs in
Biomedicine. doi: 10.1016/j.cmpb.2012.02.004. Available online 13 March 2012,
http://www.sciencedirect.com/science/article/pii/S0169260712000429.
Bouton, M. E., & Bolles, R. C. (1979). Contextual control of the extinction of conditioned fear. Learning
and Motivation, 10, 445–466.
Church, R., & Kirkpatrick, K. (2001). Theories of Conditioning and Timing. In S. Klein & R. Mowrer
(Eds.), Handbook of contemporary learning theories (pp. 211–255). Lawrence Erlbaum Associates,
Mahwah, NJ.
Egger, M. D., & Miller, N. E. (1962). Secondary reinforcement in rats as a function of information value
and reliability of the stimulus. Journal of Experimental Psychology, 64, 97–104.
Jennings, D. J., Alonso, E., Mondragón, E., & Bonardi, C. (2011). Temporal uncertainty during
overshadowing: A temporal difference approach. In E. Alonso & E. Mondragón (Eds.),
Computational Neuroscience for Advancing Artificial Intelligence: Models, Methods and
Applications (pp. 46-55). Hershey, PA: IGI Global.
Mondragón, E., Alonso, E., & Fernández, A. (2011). Rescorla & Wagner Simulator © V.3 and V. 3.1
[Computer software]. London: CAL-R. http://www.cal-r.org/index.php?id=R-Wsim.
Moore, J., Choi, J., & Brunzell, D. (1998). Predictive timing under temporal uncertainty: the TD model
of the conditioned response. In D. Rosenbaum & A. Collyer (Eds.), Timing of Behavior: Neural,
Computational, and Psychological Perspectives (pp. 3–34). Cambridge, MA: MIT Press.
Rescorla, R. A. (1972). "Configural" conditioning in discrete-trial bar pressing. Journal of Comparative
and Physiological Psychology, 79, 307-317.
Singh, S. P., & Sutton, R. S. (1996). Reinforcement learning with replacing eligibility traces. Machine
Learning, 22, 123–158.
Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3,
9–44.
Sutton, R. S., & Barto, A. G. (1987). A temporal-difference model of classical conditioning. In
Proceedings of the Ninth Annual Conference of the Cognitive Science Society, pp. 355–378.
Sutton, R. S. & Barto, A. G. (1990). Time-Derivative Models of Pavlovian Reinforcement. In M.
Gabriel and J. Moore (Eds.), Learning and computational neuroscience: Foundations of Adaptive
Networks (pp. 497–537). Cambridge, MA: MIT Press.