UNIVERSITY OF CALIFORNIA, IRVINE
Distributed Individual-Based Simulation
DISSERTATION
submitted in partial satisfaction of the requirements for the degree of
DOCTOR OF PHILOSOPHY
in Information and Computer Science
by
Jiming Liu
Dissertation Committee:
Professor Lubomir F. Bic, Co-Chair
Professor Michael B. Dillencourt, Co-Chair
Professor Arthur D. Lander
2008
© 2008 Jiming Liu
The dissertation of Jiming Liu
is approved and is acceptable in quality and form for
publication on microfilm and in digital formats:
____________________________
____________________________ Committee Co-Chair
____________________________ Committee Co-Chair
University of California, Irvine
2008
Contents
List of Figures vii
List of Tables x
List of Algorithms xi
Acknowledgments xii
Curriculum Vitae xiii
Abstract xiv
1 Chapter 1: Introduction 1
1.1 Research motivation and target problem......................................................... 2
1.2 Individual-based model .................................................................................. 3
1.3 Distributed individual-based simulation ......................................................... 5
1.3.1 Problem partitioning ............................................................................. 5
1.3.2 Problem mapping .................................................................................. 6
1.3.3 Communication ..................................................................................... 6
1.3.4 Simulation consistency ......................................................................... 7
1.4 Chapters in the dissertation ............................................................................ 7
2 Chapter 2: Particle Diffusion Model 9
2.1 The particle diffusion model .......................................................................... 9
2.2 Model specification ........................................................................................ 11
2.2.1 Simulated space .................................................................................... 11
2.2.2 Cell representation ................................................................................ 12
2.2.3 Receptor representation ........................................................................ 12
2.2.4 Particle representation ........................................................................... 13
2.3 Basic simulation ............................................................................................. 15
3 Chapter 3: Improved Model 16
3.1 Macro simulation distance step ...................................................................... 17
3.2 Calculating particle capturing ......................................................................... 18
3.2.1 First hit .................................................................................................. 20
3.2.2 Number of returns ................................................................................. 21
3.2.3 Horizontal distance ............................................................................... 22
3.2.4 Probability of capturing ........................................................................ 22
3.2.5 Modification to the formula .................................................................. 23
4 Chapter 4: Distributed Individual-Based Simulation 25
4.1 Problem decomposition .................................................................................. 27
4.1.1 The Lagrangian decomposition method ................................................ 27
4.1.2 The Eulerian decomposition method with vertical strips ..................... 29
4.1.3 The Eulerian decomposition method with horizontal strips ................. 31
4.2 Implementation ............................................................................................... 33
4.2.1 Overview ............................................................................................... 33
4.2.2 Messengers ............................................................................................ 35
4.3 Consistency with sequential implementation ................................................ 41
4.3.1 Identical results with the sequential simulation .................................... 42
4.3.2 Random number seed initialization ...................................................... 45
5 Chapter 5: Simulation Enhancement: Parallel Simulation Protocols 48
5.1 Exchange less frequently ................................................................................ 49
5.2 Shadow cells ................................................................................................... 53
5.3 Conflict scenarios description ........................................................................ 55
5.3.1 Scenario 1: A free particle becoming stuck when it should not ........... 55
5.3.2 Scenario 2: A free particle not becoming stuck when it should ........... 61
5.4 Conflict resolution .......................................................................................... 66
5.4.1 Solution to scenario 1 ............................................................................ 69
5.4.2 Solution to scenario 2 ............................................................................ 73
5.5 The order of processing the particles .............................................................. 77
5.5.1 First come, first processed .................................................................... 77
5.5.2 Particles migration ................................................................................ 79
5.5.3 Preserving the order of processing the particles ................................... 81
5.5.4 Temporary incoming particles .............................................................. 82
5.6 Conflict resolution algorithm .................................................... 85
5.7 Correctness of distributed simulation ............................................................. 90
6 Chapter 6: Experimental Assessments 92
6.1 Performance evaluation .................................................................................. 92
6.1.1 Experiment 1 ......................................................................................... 94
6.1.2 Experiment 2 ......................................................................................... 96
6.1.3 Experiment 3 ......................................................................................... 98
6.1.4 Experiment 4 ......................................................................................... 100
6.1.5 Performance trade-offs .......................................................................... 102
6.2 System capability and scalability ................................................................... 104
6.3 System steady state - stop criteria .................................................................. 104
6.3.1 Determining the point of steady state ................................................... 105
6.3.2 Quantifying the mean time to reaching steady state ............................. 106
7 Chapter 7: Biology Results Obtained From the Simulation 110
7.1 Analysis of simulation output ......................................................................... 110
7.1.1 Variation of number of stuck particles .................................................. 111
7.1.2 Shapiro-Wilk W test for variation ......................................................... 114
7.2 More experiments ........................................................................................... 118
7.2.1 Case study 1: release stuck particles ..................................................... 118
7.2.2 Case study 2: stuck particles crossing through cell .............................. 123
8 Chapter 8: Related work 127
9 Chapter 9: Conclusions 131
9.1 Contribution .................................................................................................... 131
9.2 Future work .................................................................................................... 133
Bibliography 135
Appendix 1 140
Appendix 2 161
Appendix 3 181
Appendix 4 202
List of Figures
2.1 Particle diffusion models ...................................................................................... 10
2.2 The simulated space ............................................................................................. 11
2.3 Cell representation ................................................................................................ 12
2.4 Algorithm for the basic simulation ....................................................................... 15
3.1 Cell grid and 9 areas of a particle position ........................................................... 19
3.2 A particle moving to the right segment at the end of macro time step ................. 22
4.1 Lagrangian decomposition method and node mapping ........................................ 28
4.2 Eulerian decomposition method with vertical strips and node mapping .............. 30
4.3 Eulerian decomposition method with horizontal strips and node mapping ......... 31
4.4 An example of simulated space partitioning and node mapping .......................... 33
4.5 An example of nodes with their neighbors ........................................................... 33
4.6 Messengers system architecture and the task and shuttle Messengers ................. 34
4.7 Creator Messenger script pseudo code ................................................................. 35
4.8 Task Messenger script pseudo code ..................................................................... 37
4.9 Left shuttle Messenger script pseudo code ........................................................... 41
4.10 Random number sequences .................................................................................. 42
4.11 Random number sequence change in particle migration ...................................... 43
4.12 Random number sequences unique to new particles ............................................ 44
4.13 Function to assign a random number seed to a particle ........................................ 46
5.1 Communication granularity level ......................................................................... 51
5.2 Node mapping with shadow cells ............................................................. 52
5.3 View of incoming particle A’ on local node (node 1) ........................................... 54
5.4 Views of local node with an incoming particle (scenario 1) ................................ 56
5.5 Views of particles movement in sequential and parallel implementation
of scenario 1 ......................................................................................................... 57
5.6 View of local node with degraded stuck particle on local node (scenario 2) ....... 61
5.7 Views of particles movement in sequential and parallel implementation
of scenario 2 ......................................................................................................... 63
5.8 Tentatively stuck particles on node 1 .................................................................... 66
5.9 TSP tag structure .................................................................................................. 67
5.10 DSP tag structure .................................................................................................. 69
5.11 Solution for scenario 1 .......................................................................................... 70
5.12 Overstating the number of free receptors in the shadow cells of node 2 ............. 74
5.13 Particles migration between node 1 and node 2 ................................................... 80
5.14 A temporary incoming particle on the local node ................................................ 84
5.15 Pseudo code of the conflict resolution .................................................................. 86
6.1 Execution time of experiment 1 at different epoch lengths ................................. 94
6.2 Speedup of experiment 1 at different epoch lengths ............................................ 95
6.3 Execution time of experiment 2 at different epoch lengths .................................. 97
6.4 Speedup of experiment 1 and 2 on 5 nodes .......................................................... 97
6.5 Execution time of experiment 3 at different epoch lengths .................................. 99
6.6 Speedup of experiment 3 at different epoch lengths ............................................ 99
6.7 Execution time of experiment 4 at different epoch lengths .................................. 101
6.8 Speedup of experiment 3 and 4 on 10 nodes ........................................................ 101
6.9 Example of fitted piecewise linear models to one of the stuck particles datasets 108
6.10 Example of fitted piecewise linear models to one of free particles datasets ........ 109
7.1 Average value and number of stuck particles from one simulation in bin 1 ........ 112
7.2 10 time ranges (T1 – T10) in bin 1 of stuck particles .......................................... 113
7.3 10 time ranges (T1 – T10) in bin 1 of free particles ............................................. 115
7.4 CV for the number of stuck particles in 10 time ranges ....................................... 116
7.5 CV for the number of free particles in 10 time ranges ......................................... 106
7.6 Stuck particles released back to system ................................................................ 118
7.7 Particles diffusion in the simulated space (6.67 Bio-minutes) ............................. 119
7.8 Number of stuck Particles at the end of simulation (6.67 Bio-minutes) .............. 121
7.9 Number of free particles at the end of simulation (6.67 Bio-minutes) ................. 122
7.10 A stuck particle released crossing the cell ............................................................ 123
7.11 Particles diffusion at the end of 20 million iterations (3.33 Bio-minutes) ........... 124
7.12 Number of stuck particles at the end of simulation (3.33 Bio-minutes) .............. 125
7.13 Number of free particles at the end of simulation (3.33 Bio-minutes) ................. 126
List of Tables
6.1 Parameter set for experiments .............................................................................. 93
6.2 Comparison of the speedup and accuracy at different epoch lengths .................. 102
6.3 Mean time with confidence interval results for stuck particles ............................ 107
6.4 Mean time with confidence interval results for free particles .............................. 107
7.1 p-value produced by Shapiro-Wilk W Test for bin 1 output ................................ 115
7.2 Parameters in case studies ..................................................................................... 117
List of Algorithms
5.1 Parallel simulation procedures performed by task Messengers ........................... 84
Acknowledgments
I would like to thank my committee members, Professor Lubomir Bic, Professor Michael Dillencourt, and Professor Arthur Lander, for their guidance, support, and patience throughout the years. I would like to thank Professor Dan Gillen for his help with the simulation data analysis.
I would like to thank the past and present members of the Messengers research group: Bozhena Bidyuk, Hairong Kuang, Susan Mabry, Eugene Gendelman, Koji Noguchi, Lei Pan, Richard Utter, Ming Kin Lai, Javid Huseynov, and Wendy Zhang. I thank everyone for their help and encouragement over the years. In particular, I would like to express my gratitude to Koji and Ming for maintaining the Messengers system and helping me debug my Messengers programs.
I would like to thank my family for their love and understanding.
Curriculum Vitae
Jiming Liu
1983 B.S. in Computer Engineering, National University of Defense Technology, China
1992 M.S. in Industrial Engineering, Purdue University, West Lafayette, Indiana
2001 M.S. in Information and Computer Science, University of California, Irvine
2008 Ph.D. in Information and Computer Science, University of California, Irvine
2001 GAANN Fellowship, Department of Education, USA
Abstract of the Dissertation
Distributed Individual-Based Simulation
By
Jiming Liu
Doctor of Philosophy in Information and Computer Science
University of California, Irvine, 2008
Professor Lubomir F. Bic, Co-Chair
Professor Michael B. Dillencourt, Co-Chair
Individual-based simulation can be implemented in a distributed fashion by making each
machine in a distributed system responsible for a portion of the problem. The Eulerian
and Lagrangian methods can be used to decompose the problem. Individual-based
simulation is not a new concept, nor is distributed computing; the system we developed
offers techniques and prototypes for combining these two paradigms into one
large-scale simulation environment.
Our simulation target model is a biological particle diffusion model with a large
population of particles. We describe methods for improving simulation performance that
combine several techniques, including model improvement, distributed simulation system
structures, a hybrid problem decomposition that combines the classical Lagrangian and
Eulerian methods, and varying the granularity of synchronization between computers in the
distributed system. We present performance results showing the speedup of the simulation
using the methods and parallelism we developed and implemented. We evaluate the
consistency of results between the distributed simulation and the sequential simulation,
and the trade-offs between accuracy and speedup. We also present biological case studies
carried out using our simulation system.
Chapter 1
Introduction
An individual-based model is a simulation model with the following characteristics. It
typically consists of an environment and individuals. The environment is a simulated
space in which interactions occur (1) between the individuals and (2) between the
individuals and the environment. Each individual is defined by its characteristic
parameters or attributes. The behaviors and states of the individuals are tracked
through the entire simulation, and global consequences emerge from the local
interactions of the individuals. An individual-based simulation can also exhibit
mobility, where individuals move around in the simulated space. Because of these
characteristics, individual-based simulation has been widely used in many applications,
such as ecology and biology, traffic control, and sociology.
A distributed individual-based simulation runs an individual-based model in a
distributed computing environment. Distributed simulation applies to applications that
simulate large populations of individuals or carry out time-consuming tasks. The major
issues in implementing a distributed simulation include application partitioning,
communication overhead, and consistency of simulation results.

We developed a distributed individual-based simulation system to support a
time-consuming individual-based model with a large population. Individual-based
simulation is not a new concept, nor is distributed computing; the system we developed
offers techniques and prototypes for combining these two paradigms into one large-scale
simulation environment.
1.1 Research motivation and target problem
We began this research after becoming interested in a biological experiment presented
by Professor Arthur Lander. The experiment studies molecules diffusing in an area
composed of cells. During the diffusion process, several events occur, such as
molecules binding to and unbinding from the cell receptors, and molecules degrading
from the intercellular space. The problem is that the molecular diffusion process is
extremely time-consuming to simulate. In order to observe the molecular activities in
the diffusion process, the experiment needs to run for a period of time measured in
biological time steps. The biological time step used in this experiment is
5 nanoseconds, so to simulate a molecule's movement for 1 second of the biological
clock, the simulation must run for 200 million iterations. Simulating a large number
of particles can therefore take a very long time.
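The iteration count follows directly from these time scales; a quick sanity check of the arithmetic (the 5 ns step and the 1-second target are the values stated above):

```python
# Number of micro time steps needed to cover 1 second of biological time,
# given the 5-nanosecond biological time step used in the experiment.
bio_time_step = 5e-9   # seconds (5 ns)
target_time = 1.0      # seconds of biological clock

iterations = round(target_time / bio_time_step)
assert iterations == 200_000_000  # 200 million iterations, as stated
```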
We are interested in this problem because it is 1) a typical computer simulation
problem, 2) an individual-based application, and 3) a good candidate for distributed
simulation. The developed simulation system can also be used as a tool for simulating
other biological applications.
1.2 Individual-based model
The principles of modeling an individual-based application can be summarized as
follows: 1) each individual has its own identity and behaves distinctly because of
environmental influences, and 2) the interactions between individuals, or between an
individual and its environment, are inherently localized; e.g., individuals are
influenced by nearby individuals. Based on these principles and the characteristics of
the applications, individual-based modeling is used in many research areas.
Applications of individual-based models are found most commonly in the study of
ecology and biology [KBW99, RG05, HHM96], sociology [WH96], artificial intelligence,
and traffic control [HM96]. The techniques, approaches, software, and tools used in
such modeling have been developed and continue to be researched.
We use some of these techniques and approaches in presenting our target model, a
biological application of molecular diffusion within an intercellular space. The
target model is spatially explicit, meaning that the particles are associated with a
geometric location in the simulated space. The model also exhibits mobility because
the particles move around in the simulated space. There are three basic components to
be defined in the individual-based model for this biological application:
• Simulated space
The simulated space is the simulation environment representing an intercellular
space, which is constructed of cells. Each cell has a grid-like geometric location
in the simulated space. Receptors reside on each cell as part of the cell
structure.
• Particles
Particles are the individuals. Particles are associated with a geometric location
and move around in the simulated space. Particles do not interact with each other
directly; instead, they interact with nearby cell receptors, and the results of
these interactions influence their behavior. Particles thus interact with each
other indirectly through the receptors. The state and location of each particle
are tracked through the entire simulation.
• Parameters
Parameter values in the model specification define the simulation environment and
control the behavior of particles. By applying a specific set of parameters or
interaction rules to the particles, some complex decision making by a particle can
be simulated. Different sets of parameters can be used to study different
scenarios of the application.
In chapter 2 we present the original application model and its characteristics. In
chapter 3, we present an improved model that makes it suitable for computer simulation
while not changing the nature of the application.
1.3 Distributed individual-based simulation
A distributed system runs on a collection of computers. A distributed simulation system
is developed for applications that simulate large-scale problems and carry out
time-consuming tasks. The main issues in a distributed simulation system can be
summarized as follows: 1) problem partitioning and mapping, 2) communication overhead,
3) consistency of results, and 4) system capability and scalability.
1.3.1 Problem partitioning
To simulate a problem on a cluster of computers, the problem needs to be decomposed
into smaller pieces. The number of smaller problems should be equal to or greater than
the number of computers, meaning that each computer processes one or more subproblems.
There are two well-known approaches to problem partitioning in distributed
individual-based simulations: 1) the Lagrangian decomposition method and 2) the
Eulerian decomposition method.
The Lagrangian decomposition method focuses on the simulated individual entities. It
divides the entities into multiple groups and makes each computer responsible for one
group of entities. This decomposition method achieves the most parallelism for
applications in which the individual entities interact only with other entities in the
same group.
The Eulerian decomposition method divides the simulated environment into small regions.
Each computer is responsible for the activities occurring in its region. This
decomposition method is usually used in applications in which the individual entities
interact with the environment locally and global environment synchronization does not
happen frequently. In chapter 4, we present and compare these methods and present a
hybrid of the Lagrangian and Eulerian methods for problem decomposition.
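The difference between the two methods can be sketched as follows. This is an illustrative sketch, not the dissertation's implementation; the function names and the round-robin and vertical-strip choices are assumptions for illustration only.

```python
def lagrangian_partition(particles, n_nodes):
    """Lagrangian: divide the individual entities into groups, one group
    per node, regardless of where the entities sit in space."""
    groups = [[] for _ in range(n_nodes)]
    for i, p in enumerate(particles):
        groups[i % n_nodes].append(p)  # round-robin assignment
    return groups

def eulerian_partition(particles, space_width, n_nodes):
    """Eulerian: divide the simulated space into vertical strips, one strip
    per node; each node owns whatever particles fall inside its strip."""
    strip_width = space_width / n_nodes
    strips = [[] for _ in range(n_nodes)]
    for p in particles:  # p is an (x, y) position
        node = min(int(p[0] // strip_width), n_nodes - 1)
        strips[node].append(p)
    return strips
```

Under the Lagrangian split a particle never changes owner; under the Eulerian split a particle that crosses a strip boundary must migrate to the neighboring node.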
1.3.2 Problem mapping
Mapping a portion or a set of portions of the problem to a node can be
straightforward. Usually the mapping is taken into consideration at the problem
decomposition stage. The goals of mapping are to 1) minimize communication between the
processors, 2) balance the workload on each node, 3) provide scalability, and
4) provide system flexibility. In chapter 4 we present the mapping strategy.
1.3.3 Communication
In the distributed system, each node processes a portion or a set of portions of the
problem. No portion exists as an independent task: the dependences between portions
require data exchange to keep every node synchronized. The time spent on data exchange
is one of the major issues in a distributed system. In chapter 5 we present a
communication granularity approach that reduces communication overhead by exchanging
data less frequently between the machine nodes.
1.3.4 Simulation consistency
The application was initially simulated sequentially. We implement it on our
distributed system to make it run faster. In general, the consistency of results
between a sequential simulation and a parallel simulation is evaluated from two
aspects: statistical consistency and exact consistency. We focus on exact consistency.
In chapter 4, we discuss the random number generation strategy of assigning a random
number generator to each particle. In chapter 5, we present the conflict resolution
and additional protocols that ensure exact consistency while still benefiting from
most of the parallelism.
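The per-particle random number generator strategy can be sketched as follows. This is a minimal sketch; the class layout and the seeding-by-ID scheme are illustrative assumptions, with the full strategy given in chapter 4:

```python
import random

class Particle:
    """Each particle owns a private random number generator seeded from its
    unique particle ID. A particle that migrates to another node draws the
    same random sequence it would have drawn in the sequential simulation,
    which is the basis for exact consistency."""
    def __init__(self, pid, x, y):
        self.pid = pid
        self.x, self.y = x, y
        self.rng = random.Random(pid)  # deterministic per-particle seed

# The same particle ID yields the same sequence on any node.
a = Particle(42, 0.0, 0.0)
b = Particle(42, 0.0, 0.0)
assert [a.rng.random() for _ in range(5)] == [b.rng.random() for _ in range(5)]
```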
1.4 Chapters in the dissertation
There are 9 chapters in this dissertation. Chapter 2 presents the target model with
the model specification. Chapter 3 presents the improved model. Chapter 4 presents the
parallel structures. Chapter 5 discusses enhancements to the parallel simulation.
Chapter 6 examines and presents the performance evaluation. In chapter 7 we use the
simulation system as a tool in biological case studies to produce results in a
biological form. Chapter 8 reviews work related to this research, and the final
chapter, chapter 9, presents conclusions and future work.
Chapter 2
The Target Model
2.1 The particle diffusion model
Our target model simulates particle diffusion in a biological intercellular space,
based on the theory of molecular diffusion as random walks in biology presented by
Berg [Ber93]. We present this simulation model using a state- and event-based modeling
approach [Fis95]. Figure 2.1 (a) presents the target model conceptually and shows the
basic nature of the particle diffusion process in a virtual intercellular space.
Figure 2.1 (b) illustrates the state- and event-based description of the model. It
defines three possible states of the system, represented by circles, where a state is
identified by the particles existing in the system. The six events are represented by
curves. A particle is a free particle when it arrives in the system and moves around
in it. A stuck particle is a particle that has been captured by a receptor. A stuck
particle can be released back into the system as a new free particle, or be absorbed
by the receptor and degrade from the system for the rest of the simulation time.
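The particle life cycle just described can be summarized as a small state machine. The state and event names below are illustrative, not the labels used in Figure 2.1(b):

```python
from enum import Enum

class ParticleState(Enum):
    FREE = "free"    # moving through the intercellular space
    STUCK = "stuck"  # captured by, and occupying, a receptor
    GONE = "gone"    # absorbed and degraded; out of the system for good

# Transitions sketched from the description above.
TRANSITIONS = {
    (ParticleState.FREE, "captured"): ParticleState.STUCK,
    (ParticleState.STUCK, "released"): ParticleState.FREE,
    (ParticleState.STUCK, "absorbed"): ParticleState.GONE,
}

def apply_event(state, event):
    """Apply an event; events not defined for a state leave it unchanged."""
    return TRANSITIONS.get((state, event), state)
```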
(a) Conceptual model
(b) State and event-based model
Figure 2.1 Particle diffusion models
2.2 Model specification
2.2.1 Simulated space
A two-dimensional particle diffusion space is used as the simulated space for this
computational model. The space is composed of a set of cells with fixed locations and
sizes, arranged in a grid geometry; see figure 2.2. The left boundary is a closed
boundary: particles attempting to pass through the left boundary are bounced back into
the simulated space. The right boundary is an open boundary: particles that walk
across the right boundary disappear from the simulated space. The top and bottom
boundaries are identified with each other. For example, a particle crossing the top
boundary ends up on the other side (the bottom) of the simulated space with the same
x coordinate, and a particle crossing the bottom boundary ends up in the top part of
the simulated space. Topologically, the space is a cylinder, closed on the left and
open on the right.
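The three boundary rules can be sketched as a single position-adjustment function. The dimensions here are illustrative stand-ins, not the model's actual parameters:

```python
WIDTH, HEIGHT = 100.0, 50.0  # illustrative space dimensions

def apply_boundaries(x, y):
    """Return the adjusted position, or None if the particle left through
    the open right boundary."""
    if x < 0:
        x = -x           # closed left boundary: bounce back into the space
    if x > WIDTH:
        return None      # open right boundary: particle disappears
    y = y % HEIGHT       # top and bottom identified: cylindrical wrap-around
    return (x, y)
```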
Figure 2.2 The simulated space
2.2.2 Cell representation
A cell object is a square measuring 10µm by 10µm. The location of a cell object is a
fixed position in the simulated space. Cells do not overlap with each other. The
distance between cells is one tenth of the cell size, which equals 1µm. The wall of
each cell is divided into 20 segments, which correspond to cell membranes. The number
of receptors in a given segment varies over time. Figure 2.3(a) displays the geometry
of the cells in the simulated space. Figure 2.3(b) illustrates the details of a cell
object, which include the cell area, the cell wall divided into wall segments, and the
receptors that reside on the wall segments.
(a) (b)
Figure 2.3 Cell representation
2.2.3 Receptor representation
Receptors reside on cell membranes (cell segments in our simulation system). A
receptor has two states: free or occupied. When a receptor captures a particle, its
state changes from free to occupied. When the stuck particle is released or degraded,
the receptor that captured the particle changes its state back to free. In the
following section we describe the particles.

Receptors, both free and occupied, move between neighboring cell segments at a
predefined rate at every simulation time step. This movement balances the number of
free and occupied receptors across the cell segments. For example, at each time step,
50% of the free and 50% of the occupied receptors in one segment are moved to its
neighbors. Fractional receptor counts are accumulated at every time step, and a
receptor move occurs when at least one whole receptor is available.
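The fractional bookkeeping can be sketched as follows; the class layout and the 50% rate are illustrative, following the example above:

```python
class Segment:
    """A cell wall segment tracking free receptors plus the accumulated
    fraction of receptors scheduled to move but not yet whole."""
    def __init__(self, free_receptors):
        self.free = free_receptors
        self.carry = 0.0  # accumulated fractional receptors

def move_free_receptors(src, dst, rate):
    """Accumulate rate * free each time step; move only whole receptors."""
    src.carry += src.free * rate
    whole = min(int(src.carry), src.free)  # whole receptors available now
    src.carry -= whole
    src.free -= whole
    dst.free += whole
```

With one free receptor and a rate of 0.5, nothing moves on the first step (the carry reaches 0.5); on the second step the carry reaches 1.0 and one receptor moves.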
2.2.4 Particle representation
Particles are identified in the system by their position and state. The position of a particle is defined by its coordinates, x and y, in the simulated space. A particle does not own a territory; the model allows more than one particle to reside at the same spot with identical coordinate values. The density of particles in the simulated space is therefore unlimited. The total number of particles entering the system can be varied to simulate different cases, according to biological assumptions.
New particles enter the simulated space periodically during the course of the simulation. A particle starts as a free particle and can travel freely through the alleys (between cells) or be captured by a free receptor on a cell segment. When a particle moves to its next position within an alley and is not captured, it remains a free particle at the end of that time step. If a particle is captured by a free receptor, it becomes a stuck particle and occupies that receptor.
A free particle walks a fixed distance (the micro distance step) in one of four directions (north, south, east, or west). The direction is chosen with equal probability, based on the value of a random number generated uniformly over the interval [0.0, 1). For example, if the random number falls in [0, 0.25), the particle moves north; if in [0.25, 0.5), it moves south; and so on.
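The direction rule above can be written as a small helper; the function name is ours.

```python
import random

# One uniform draw in [0, 1) split into four equal intervals (N, S, E, W).
def pick_direction(u):
    if u < 0.25:
        return "N"
    if u < 0.5:
        return "S"
    if u < 0.75:
        return "E"
    return "W"

direction = pick_direction(random.random())
```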
The micro distance step (s) is 10^-9 m, or 1nm. It is calculated using equation (2.1), which we adapted from [Ber93]. Here D is the diffusion coefficient and t is the micro time step:

s = √(4Dt) = √(4 × (5×10^-7 cm²/sec) × (5×10^-9 sec)) = 10^-7 cm = 1nm.  (2.1)
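As a numeric sanity check of equation (2.1), taking D = 5×10^-7 cm²/sec and the micro time step t = 5×10^-9 sec (the value implied by 2×10^8 iterations per bio-second):

```python
import math

# Check that s = sqrt(4*D*t) gives the stated 1 nm micro distance step.
D = 5e-7                       # diffusion coefficient, cm^2/sec
t = 5e-9                       # micro time step, sec
s_cm = math.sqrt(4 * D * t)    # micro distance step, cm
s_m = s_cm * 1e-2              # in meters: 1e-9 m, i.e. 1 nm
```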
As a particle diffuses through the cell alleys, at each time step a calculation decides whether the particle gets captured by a receptor and becomes a stuck particle. If a particle hits a cell wall but is not captured by a free receptor, it bounces away from the wall in the next time step.

A stuck particle can be absorbed by the cell segment on which its occupied receptor currently resides. When this happens, the occupied receptor becomes free and the stuck particle disappears from the system permanently. The rate at which this occurs is predefined as a system parameter. A stuck particle can also be released from its receptor and re-enter the system at a new location near the segment where it was released. The rate at which this occurs is also predefined as a system parameter.
2.3 Basic simulation
The simulation algorithm described in this chapter is straightforward. The algorithm of this basic simulation is shown in Figure 2.4.

Procedure MAIN
    While (not end of simulation) do
        For (all cell segments)
            Move receptors between the neighboring segments
        End For
        For (all cell segments)
            Calculate the total # of degraded particles accumulated in this iteration
            Update the # of free and occupied receptors
        End For
        For (all free particles)
            Move the particle by distance (s) in one of the directions (N, S, E, W)
            Decide whether the particle gets captured by a free receptor
            Update the particle's state and location
            Update the receptor's state
        End For
        Advance the bio-clock by (t)
    End While
End MAIN
Figure 2.4 Algorithm for the basic simulation
3. Chapter 3
Improved Model
In the micro simulation model described in chapter 2, particles diffuse through the simulated space in a simple Brownian motion: a particle moves one step in one of four directions per micro time step. Direct execution of this model is time-consuming, and the system can become infeasible for a simulation that runs over a long period of time. For example, simulating the model for one bio-second requires the system to execute 2×10^8 iterations. This chapter introduces a macro simulation, which reduces the execution time by replacing micro time steps with a macro time step and improves the system's scalability. This macro simulation 1) does not change the nature of the micro simulation and 2) runs more efficiently.
In the micro-step model introduced previously, a particle may be captured at a micro time step when it moves a micro distance step and hits a cell wall; the system then decides whether it is indeed captured. Since the macro simulation replaces micro time steps with a macro time step, micro time steps are no longer visible to the system. This complicates the decision as to when particles should be captured, i.e., at which micro time step. In this chapter we introduce the macro simulation. In particular, we discuss two issues that must be resolved for the macro simulation model: defining the macro simulation distance step, and defining the particle-capturing process so that it is consistent with the micro simulation model. The latter uses some classical results about the behavior of random walks [Fel66].
3.1 Macro simulation distance step
We calculate the motion in a macro simulation as a distance and a direction. The distance is calculated as a displacement drawn from a Gaussian distribution. The direction is chosen uniformly at random.
Calculating the macro distance step

To replace micro time steps with a macro time step, we compute the macro distance step (L) as a displacement drawn from a Gaussian distribution:

L = N(0, ξ).  (3.1)

The standard deviation is ξ = √(4Dt), where D = 5×10^-7 cm²/sec is the fixed parameter defined in chapter 2. Here t is the macro time step; we assume that it equals 2000 micro time steps, so t = 10^-5 sec and ξ = √(4Dt) ≈ 4.4×10^-6 cm = 0.044µm.
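One macro move, as described in this section, can be sketched as follows; the variable names are ours.

```python
import math
import random

# Macro distance step: L ~ N(0, xi), xi = sqrt(4*D*t), with a uniform
# direction in [0, 360) degrees.
D = 5e-7                             # cm^2/sec
t = 1e-5                             # macro time step (2000 micro steps), sec
xi_um = math.sqrt(4 * D * t) * 1e4   # standard deviation in um, ~0.0447

L = random.gauss(0.0, xi_um)                      # signed displacement, um
theta = math.radians(random.uniform(0.0, 360.0))  # direction of motion
dx, dy = L * math.cos(theta), L * math.sin(theta)
```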
The assumption that a macro time step equals 2000 micro time steps is reasonable for the following reason. We want the macro distance step to be reasonably large, but not so large that a particle travels too far in one step. The micro distance step at each micro time step is a random-walk step equal to 0.001µm (as presented in chapter 2). The geometry of a cell is 10µm by 10µm; a cell segment is 2µm (2000 micro distance steps) and the alley between two cells is 1µm (1000 micro distance steps). With the standard deviation ξ = 0.044µm (44 micro distance steps), 3ξ equals 132 micro distance steps, which means that with less than 0.3% probability a particle moves as many as 132 micro steps within one macro time step. Consequently, 1) a particle is highly unlikely to cross a cell alley in fewer than 5 macro steps, and 2) a particle is highly unlikely to cross a cell segment within 10 macro steps. The macro time step assumption is therefore reasonable.
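The "less than 0.3%" figure above is the standard two-sided tail of a normal distribution beyond three standard deviations, which can be checked directly:

```python
import math

# P(|L| > 3*xi) for a zero-mean normal with std xi: the two-sided
# 3-sigma tail, erfc(3/sqrt(2)) ~ 0.0027.
p_tail = math.erfc(3 / math.sqrt(2))
```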
Particle moving direction
The particle moves in a direction between 0 and 360 degrees. A uniformly distributed random number is generated to determine the direction of motion.
3.2 Calculating particle capturing
In the macro simulation, a particle moves a macro step instead of a micro step. We
need a method to determine whether a particle is captured during a given macro step.
Our method is based on determining, in order, the following four items:
• The first hit of a particle against a cell wall: the number of micro steps a particle takes before it first hits a cell wall within a macro time step, if it hits one at all. To determine which receptor on which cell wall segment a free particle may bind to, we divide the cell alley around a cell into 8 areas, labeled 0 to 7 (see Figure 3.1). A free particle can be captured by a receptor residing on the nearest cell wall segment. For example, a particle located in area 0 can be captured by a free receptor on a segment of cell wall cw0, and a particle in area 5 can be captured by a free receptor on a segment of cell wall cw0 or cw1.
Figure 3.1 Cell grid and 8 areas of a particle position
• The number of returns of the particle to the cell wall after the first hit: within the remainder of the macro time step, calculate how many times the particle repeatedly hits the same cell wall.
• Horizontal distance: to determine which segment of the cell wall the particle hits.
• The probability of getting captured: with all of the above computed, to decide whether the particle gets captured.
3.2.1 First hit
For each particle, we calculate the number of micro time steps it takes before it first hits a cell wall. The relevant cell wall is determined by the position of the particle, as illustrated in Figure 3.1.

The first-hit-time is the number of micro time steps (t) a particle takes before it hits a cell wall for the first time. We compute the first-hit-time using inequality (3.2), which we adapt from Theorem 2 (7.5) of [Fel66]. We generate a real variable x uniformly in the interval [0, 1] and let t be the smallest integer for which (3.2) holds. If t is less than or equal to 2000 (a macro step equals 2000 micro steps), then t is used for the further calculation; otherwise the particle is considered not captured during the macro step. Note that if a particle is located far away from a cell wall, the chance of hitting the wall within a macro time step is relatively small. We therefore use inequality (3.2) only for particles that are within 15 micro steps of the cell wall.
For far-away particles, the chance of getting captured is small and the computation of t is too time-consuming. In this case we use a normal approximation to estimate the first-hit-time, using equation (3.3), which we adapt from Theorem 3 (7.7) of [Fel66].
sum over k of (r/k) · C(k, (k+r)/2) · 2^(−k) ≥ x,  t ≤ 2000,  (3.2)

where r is the distance to the cell wall (in micro steps), x is a random number in [0, 1], C(·,·) is the binomial coefficient, and the sum runs over k = 2, 4, 6, ..., t if r is an even number, or k = 1, 3, 5, ..., t if r is an odd number.
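Sampling the first-hit-time from inequality (3.2) can be sketched as follows; the function name is ours, and the binomial term is evaluated in log space to keep large k tractable. The per-step first-passage probabilities of a simple random walk through distance r ≥ 1 are accumulated until the running sum reaches the uniform draw x.

```python
import math

# phi_r(k) = (r/k) * C(k, (k+r)/2) * 2^-k, summed over k with the same
# parity as r; t is the smallest k whose cumulative sum reaches x.
# Returns None if the walk does not hit within the macro step.

def first_hit_time(r, x, max_t=2000):
    total = 0.0
    for k in range(r, max_t + 1, 2):          # k has the parity of r
        log_term = (math.lgamma(k + 1)
                    - math.lgamma((k + r) // 2 + 1)
                    - math.lgamma((k - r) // 2 + 1)
                    - k * math.log(2.0))
        total += (r / k) * math.exp(log_term)
        if total >= x:
            return k
    return None

# From distance r = 1: P(hit at step 1) = 0.5, P(hit at step 3) = 0.125.
```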
t = r² / u²,  u = N(0, 1),  t < 2000,  (3.3)

where r is the distance to the cell wall.
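The normal approximation can be sketched directly; the function name is ours, and a hit is accepted only if the estimated time falls within the macro step.

```python
import random

# Equation (3.3) for far-away particles: t = r^2 / u^2 with u a
# standard-normal draw; only t < 2000 counts as a hit.

def approx_first_hit_time(r, u):
    if u == 0.0:
        return None               # a (measure-zero) zero draw: no hit
    t = (r / u) ** 2
    return t if t < 2000 else None

t = approx_first_hit_time(30, random.gauss(0.0, 1.0))
```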
3.2.2 Number of returns
If a particle hits the cell wall, the next step is to determine the total number of returns the particle makes to the same cell wall. Equation (3.4) below is used to calculate the number of returning hits (v) of a particle to a cell wall. We adapt this equation from Theorem 4 of [Fel66].
v = |d| · √n,  d = N(0, 1),  (3.4)

where n is the number of remaining micro steps.
3.2.3 Horizontal distance
When a particle hits a cell wall, the probability of its being captured depends on which segment of the wall it hits and how many free receptors are on that segment. To determine the hit segment, we use equation (3.5) to calculate the horizontal distance (x), which is the distance the particle moves parallel to the direction of the cell wall before the first hit. Here t is the number of micro steps the particle takes before it hits the cell wall (section 3.2.1). For example, in Figure 3.2, a particle starts in the area of segment S2, but it may drift to the right and end up in a different segment's area, in this case S3.

x = N(0, √t).  (3.5)
Figure 3.2 A particle moving to the right segment at the end of macro time step
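Locating the hit segment can be sketched as below; the names are ours, and we assume the drift x is in micro steps of 0.001µm along a 10µm wall split into 5 segments of 2µm (S0..S4).

```python
# Map a horizontal hit position to a wall segment index.
SEGMENT_UM = 2.0

def hit_segment(start_um, x_steps):
    """start_um: position along the wall (um); x_steps: drift in micro steps."""
    pos = start_um + x_steps * 0.001          # micro step = 0.001 um
    pos = min(max(pos, 0.0), 10.0 - 1e-9)     # clamp to the wall extent
    return int(pos // SEGMENT_UM)             # segment index, 0..4

# As in Figure 3.2: a particle starting over S2 that drifts right
# can end up hitting S3.
```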
3.2.4 Probability of capturing
We assume that the probabilities of being captured on any given hit are equal and independent. The total probability p of being captured in the macro step is 1 − (1 − q)^v, where v is the number of return hits calculated in section 3.2.2 and q is the probability of being captured on a particular hit, q = (RR × Rf) / LS. The value of q depends on the following three quantities:
• Rf: the number of free receptors on the cell segment that the particle hits.
• LS: the size of a cell segment. There are 20 cell segments in total, 5 on each side of the cell. The size of a cell segment is 2µm.
• RR: the radius of a receptor. Since the receptors are non-overlapping, we assign each receptor an area with this radius.
3.2.5 Modification to the formula
The formulas introduced in the previous sections overestimate the probability of being captured, because the hit probabilities are not really independent: if the first hit misses a receptor, the next hit is likely to be nearby and hence also likely to miss. To correct this, we make the following two changes:

Reduce the number of hits by a factor of u in equation (3.4) to adjust v:

w = v / u.  (3.6)

Reduce the probability of being captured (described in section 3.2.4) by a factor of f, which results in equation (3.7):
p = 1 − (1 − q)^w,
q = (RR × f × Rf) / LS,  (3.7)

where LS = 2µm, RR = 5nm, and Rf is the number of free receptors.
In this simulation we use u = 5 and f = 0.3. Determining appropriate values for the factors u and f depends on biological observations.

Finally, we generate a uniformly distributed random number a (0 < a < 1). If a < p, the particle is captured by one of the free receptors on the cell segment where the particle hits.
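The complete capture decision, combining equations (3.6) and (3.7) with the stated constants (LS = 2µm, RR = 5nm = 0.005µm, u = 5, f = 0.3), can be sketched as follows; the function names are ours.

```python
import random

# Capture decision with the modified formulas.
LS_UM, RR_UM = 2.0, 0.005
U, F = 5.0, 0.3

def capture_probability(v, free_receptors):
    w = v / U                                   # adjusted hits, eq. (3.6)
    q = (RR_UM * F * free_receptors) / LS_UM    # per-hit probability
    return 1.0 - (1.0 - q) ** w                 # total probability, eq. (3.7)

def is_captured(v, free_receptors):
    return random.random() < capture_probability(v, free_receptors)

p = capture_probability(10, 100)   # w = 2, q = 0.075 -> p = 0.144375
```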
4. Chapter 4
Distributed Individual-Based Simulation
Individual-based simulations are simulations based on the global consequences of local interactions among individuals of a population, and between individuals and their environment. These individuals might represent plants, animals, or molecules in ecosystems or biological systems [FSW97, HM96], vehicles in traffic [PKC07], people in crowds, or autonomous objects [FBD98]. Such models typically consist of an environment or framework in which the interactions occur, and some number of individuals defined in terms of their behaviors (procedural/activity rules) and characteristic parameters. In an individual-based model, the characteristics of each individual are tracked through the entire simulation. This stands in contrast to modeling techniques in which the characteristics of the population are averaged together and the model attempts to simulate changes in these averaged characteristics for the whole population. Individual-based models are also known as entity-based or agent-based models.
Our model has the characteristics of an individual-based model: 1) it consists of a simulated space of cells in a grid geometry as the environment, 2) the particles are the individuals, 3) each particle has its own state and behaviors, 4) particles do not interact with each other; they interact only with the receptors on the cell walls, and 5) the characteristics of the particles are tracked throughout the simulation at each iteration. An individual-based model can be simulated sequentially or in a distributed fashion. Our goal is to develop a distributed simulation system that supports simulations of this model, runs faster than the sequential system, and is accurate.
A distributed system runs on a cluster of nodes. Each node carries a certain amount of the workload. The nodes interact by exchanging messages over the interconnecting network to achieve synchronization. The major issues that arise in implementing such a parallel computing system include: 1) dividing the problem into small portions, 2) providing a distributed computing environment to support the implementation, and 3) ensuring that the simulation is correct, in the sense that it provides results consistent with what a sequential implementation produces.
In this chapter, section 4.1 presents methods for decomposing the problem for distributed computing and mapping the partitioned problem onto physical nodes. Section 4.2 describes the algorithms developed for the distributed simulation, and section 4.3 discusses the results.
4.1 Problem decomposition
In this section we describe two problem decomposition methods, the Lagrangian method and the Eulerian method, which are two well-known mechanisms for partitioning a problem in a distributed system by assigning each node a fixed portion of the problem. We describe the advantages and disadvantages of each method for this particular simulation. Our conclusion is that the best partitioning scheme for our simulation is a hybrid of the two methods: Eulerian horizontal strip decomposition.
4.1.1 The Lagrangian decomposition method
Definition
In the Lagrangian decomposition method, each node is responsible for a set of entities (particles) and tracks them for the entire simulation. A set of entities can be grouped by static, pre-defined attributes.
Decomposition
In our simulation, all particles have the same characteristics. However, we can group particles by the locations where they enter the system, for example by cell row. New particles enter continuously from the left boundary of the simulated space at the center point of each row. We can therefore reasonably group particles according to their entry point. Figure 4.1 shows two groups of particles. Group one is represented by the black circles and consists of the particles that initially enter from row one. Group two is represented by the blank circles and consists of the particles entering from row two. Each group of particles is assigned to a node, and all particles in a group remain assigned to that node for the entire simulation. As shown in Figure 4.1, the group-one particles are assigned to node 1, and the group-two particles are assigned to node 2.
Figure 4.1 Lagrangian decomposition method and node mapping
Advantages
The advantage of this decomposition method is that particles are bound to a specific node and do not migrate to other nodes, so no communication time is spent on particle migration.
Disadvantages
The particles do not interact with each other directly; they interact indirectly through the receptors that reside on the cell walls. When a particle comes in contact with a receptor, the status of the receptor affects the new status of both the particle and the receptor itself. If the status of the receptor changes, this state change is seen by particles that later come in contact with the same receptor. This behavior requires each node to keep an up-to-date map of the status of all receptors with which its particles may come in contact. In the Lagrangian model, a particle tracked by any one machine could be anywhere in the simulated space. Hence the Lagrangian approach complicates the simulation: every machine must know about every environmental change in the entire system, which requires communication among all machines. The more machines added to the system, the worse the performance, so this approach does not scale well.
4.1.2 The Eulerian decomposition method with vertical strips
Definition
In the Eulerian decomposition method, each node is responsible for a specific region of the simulated space. Unlike the Lagrangian approach, we divide the fixed geometric space into problem regions instead of dividing by entity type.
Decomposition
One way of implementing this method is to partition the whole simulated space into vertical strips whose shape does not change during the simulation. Each strip contains a column of cells and is mapped to a node. Figure 4.2 shows the vertical strips and the mapping. As new particles enter the simulated space, it takes some time for them to move from the entry point to the rest of the space. Figure 4.2 shows that node 1 carries more particles than node 2. This is especially true at the beginning of the simulation, when there are many more free particles than stuck particles.
Figure 4.2 Eulerian decomposition method with vertical strips and node mapping
Disadvantages
One obvious problem with this approach is that there is no workload balance, and no parallelism at the beginning of the simulation.

A reasonable question is: can we cut vertical strips of different sizes to achieve workload balance and parallelism? For instance, one node could map only a portion of the leftmost cell column while another node maps multiple cell columns. This is not a good idea for our simulation model, because it breaks a cell into multiple pieces allocated to different nodes. Breaking the geometry causes heavy communication traffic between nodes to keep the cell segments synchronized, complicating the simulation with communication overhead.
4.1.3 The Eulerian decomposition method with horizontal strips
Definition
An alternative Eulerian method is to partition the simulated space into horizontal strips. The shape of the strips does not change during the simulation. New particles are grouped by the location where they enter the simulated space, and each horizontal strip accepts new particles at each time step.
Figure 4.3 Eulerian decomposition method with horizontal strips and node mapping
Decomposition
Figure 4.3 illustrates particles grouped into horizontal strips based on their locations in the simulation space. Each horizontal strip is mapped onto a node. This approach combines geometry partitioning with particle grouping. The mapping of geometry to nodes does not change during the simulation. Although particles migrate to neighboring nodes, the workload remains balanced throughout the simulation.
Advantages
The proportion of particles in each horizontal strip does not change dramatically during the simulation, so this approach provides good workload balancing throughout. Communication is required only when particles walk across a horizontal boundary and migrate to a neighboring node, rather than among all nodes in the entire simulated space. This is the approach we have chosen for the simulation.
Disadvantages
The required communication between neighboring nodes can still be costly if particle migration occurs too frequently. We can reduce the communication frequency by utilizing techniques that we will discuss in the next chapter.
Partitioned space mapping on nodes
Using the Eulerian horizontal strip decomposition approach, Figure 4.4 shows a 5-by-5-cell virtual space partitioned and mapped onto a 5-node network. Each row of cells is mapped to a node, and each node has two neighbors; for example, node 2 has node 1 as its lower neighbor and node 3 as its upper neighbor. We wrap the simulated space around by connecting node 1 and node 5, so node 5 is the lower neighbor of node 1 and node 1 is the upper neighbor of node 5. The numbers marked in each cell are cell numbers: there are 25 cells, 5 cells per row. Figure 4.5 lists the nodes and their neighbors.

Figure 4.4 An example of simulated space partitioning and node mapping

Node | Upper neighbor node | Lower neighbor node
  1  |          2          |          5
  2  |          3          |          1
  3  |          4          |          2
  4  |          5          |          3
  5  |          1          |          4

Figure 4.5 An example of nodes and their neighbors
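The wrap-around neighbor table of Figure 4.5 can be expressed as a formula; the helper name is ours. With nodes numbered 1..N, upper(n) = n % N + 1 and lower(n) = (n − 2) % N + 1.

```python
# Wrap-around neighbor mapping for a ring of N nodes numbered 1..N.
def neighbors(n, total=5):
    upper = n % total + 1
    lower = (n - 2) % total + 1
    return upper, lower

table = {n: neighbors(n) for n in range(1, 6)}
# table == {1: (2, 5), 2: (3, 1), 3: (4, 2), 4: (5, 3), 5: (1, 4)}
```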
4.2 Implementation
4.2.1 Overview
We use MESSENGERS [BFD96, FBD98, FBD99, FCH+08], a distributed programming system based on the principles of autonomous objects and developed in our research group, to support the implementation of our distributed individual-based simulation. The autonomous objects, called Messengers, carry their own behaviors, perform tasks in the form of programs, and are capable of navigating through the underlying network.

In the implementation we use three types of Messengers: 1) a creator Messenger, which creates a logical network on a set of nodes, 2) a task Messenger, which carries out the computation task on each node, and 3) a shuttle Messenger, which performs the synchronization between neighboring nodes.
Figure 4.6 Messengers system architecture and the task and shuttle Messengers
Figure 4.6 shows the Messengers system architecture and the activities of the task Messenger and the shuttle Messengers. Each node has a task Messenger, drawn as a gray circle labeled "TM", which carries the node's workload in the simulation. A left-shuttle Messenger is drawn as a dashed gray circle labeled "LM" and a right-shuttle Messenger as a dashed gray circle labeled "RM". The two shuttle Messengers are injected by a task Messenger to migrate particles and propagate cell-grid state updates to its neighbors at every time unit. A shuttle Messenger has a short life in the system: it hops to a neighbor, uploads the particles and the space-strip information to the neighbor node, and then exits the system. In the next section we describe the details of the Messengers implementation.
4.2.2 Messengers
Creator Messenger
We use a creator Messenger to create a logical network on a set of physical nodes and inject a task Messenger on each node. Figure 4.7 shows the pseudo code of the creator Messenger script.

1  create(node_name;n_link;physical_node);
2  for ( number of nodes < total_nodes ) {
3      create(node_name;n_link;physical_node);
4  }
5  // Create link between the first node and last node
6  create(node_name;n_link;physical_node);
7  // at the first node
8  for ( current_node < total_nodes ) {
9      inject(Task_Messenger);
10     hop(link=+"n_link");
11 }
12 exit;

Figure 4.7 Creator Messenger script pseudo code
Three Messengers statements are called by the creator Messenger script to perform this work. The Messengers logical network is created by the statements in lines 1-6. The creator Messenger injects a task Messenger on each node (line 9) and hops to the next node (line 10); the for-loop iterates through all nodes in the network.
1. create: This Messengers statement creates a logical node on a specified physical node. It also generates a link as the Messenger moves. The links we create between nodes are two-way links.
2. inject: This Messengers statement activates another Messenger, which starts work on the same node. The statement can pass parameters to the injected Messenger. The creator Messenger uses it to start the task Messenger on each node.
3. hop: This statement navigates the Messenger to other nodes along the links of the underlying network. The creator Messenger hops along the logical node network on the physical nodes to inject a task Messenger on each node.
Task Messenger
A task Messenger is injected by the creator Messenger on each logical node. Basically, a task Messenger reads the update information sent by the neighboring nodes and performs the tasks assigned to its node. A task Messenger also uploads up-to-date information and injects shuttle Messengers to communicate with the neighbors. Figure 4.8 presents the Messenger script pseudo code for a task Messenger.

1.  // Initialization
2.  initParameters();
3.  createNodeGrid();
4.  initParticles();
5.  signalEvent(left_shuttle);
6.  signalEvent(right_shuttle);
7.  // Simulation
8.  while( current time < simulation time )
9.  {
10.     waitEvent(e_left_shuttle, i);
11.     waitEvent(e_right_shuttle, i);
12.     updateStuckParticle();
13.     moveReceptors();
14.     degradeParticles();
15.     computeParticles();
16.     loadParticlesToShuttles();
17.     current time++;
18.     inject(left_shuttle);
19.     inject(right_shuttle);
20.     waitEvent(left_shuttle_hop);
21.     waitEvent(right_shuttle_hop);
22. }
23. exit;

Figure 4.8 Task Messenger script pseudo code
The script initializes the node parameters and the node's particles (lines 2, 4). We have two kinds of Messengers: 1) the task Messenger, which runs and stays on the node, and 2) the shuttle Messengers, which come and go to perform the data exchange. When task and shuttle Messengers execute at the same time on a node, the multiple Messengers must be synchronized on that node so that they run in the correct order. The event synchronization mechanism is used to manage the Messengers: the two statements signalEvent(event) and waitEvent(event) synchronize them through the node variable event.

The simulated space mapped to each node is created on line 3. The while-loop (lines 8-22) is the main task of the program; it simulates the particles' movement for a fixed simulation time. At each iteration, the task Messenger waits for the neighbors to finish uploading their information (lines 10-11). The number of stuck particles is updated (line 12). Line 13 moves the receptors around the cell segments, and line 14 calculates and processes the degraded stuck particles. All free particles are advanced to their next positions (line 15). The particles that walked across to the neighbors are loaded into the shuttle variables (line 16). The shuttle Messengers are injected to the neighbors (lines 18-19). Before advancing to the next iteration, the task Messenger waits for the shuttle Messengers to finish loading the data from the node variables into the corresponding Messenger variables (lines 20-21).
Eight C functions are called in the task Messenger script. The following describes each function.
1. initParameters: This function initializes the simulation parameters, such as the size of the cell grid and the number of receptors in each cell segment.
2. createNodeGrid: This function creates the simulated space on each node. The simulated space has a grid cell structure, and all cell grids have the same size.
3. initParticles: Each node is responsible for simulating a group of particles. This function initializes the Messengers node variables for particles and allocates the node variable arrays of particles.
4. updateStuckParticle: This function calculates and updates the number of stuck particles in each cell segment.
5. moveReceptors: This function calculates the number of free and captured receptors in each cell segment and moves receptors among the neighboring segments at a pre-defined exchange rate.
6. degradeParticles: This function calculates the number of degrading particles based on the number of stuck particles in each cell segment. The rate of particle degradation is pre-defined. When a stuck particle degrades from the system, the receptor that held it is freed and its state changes from occupied to free.
7. computeParticles: This function processes all free particles residing on the node. It calculates the next movement of each free particle and decides on particle capture or degradation. If a particle is captured by a free receptor, the particle's state changes from free to stuck and it is removed from the free-particle list. The receptor that captures the particle changes its state from free to occupied, and the number of free receptors is decreased by 1.
8. loadParticlesToShuttles: This function sorts through all free particles. Particles that have crossed the node boundary are loaded into the shuttle node variables, ready to be migrated to the neighboring nodes.
Shuttle Messenger
A shuttle Messenger is injected by a task Messenger on a node and hops along the underlying network between the nodes. A shuttle Messenger 1) sorts out the particles that are now in a neighbor's territory, and 2) carries these to-be-migrated particles, hops to the neighbor node, and downloads the particles into the neighbor's node variables. A shuttle Messenger exits the system after finishing its task, so a new shuttle Messenger is injected at every time step. We have two shuttle Messengers: a left-shuttle Messenger, which hops to the upper neighbor, and a right-shuttle Messenger, which hops to the lower neighbor. Figure 4.9 shows the script pseudo code of the left shuttle Messenger. The right shuttle Messenger does a similar job; the only difference is that the left shuttle hops to the upper neighbor and the right shuttle hops to the lower neighbor.
Line 1 loads the migrating particles stored in the node variables into the Messenger variables of the left shuttle Messenger. Line 2 loads the current cell grid state (the state of stuck particles and of receptors) into a Messenger variable of the left shuttle Messenger. When the left shuttle is ready to hop to its neighbor, it signals that it has finished loading the data and is ready to leave (line 3). The left shuttle Messenger then hops to its upper neighbor node (line 4). On the neighbor node, the arriving shuttle must wait for the neighbor's own shuttle Messengers to leave before updating the neighbor's node variables (lines 5-6). The migrated particles are uploaded into the neighbor on line 7, and the cell grid state is updated on line 8. Finally, the left shuttle Messenger signals the task Messenger to continue with the next iteration.

1.  shuttleLoad(node_left_out, msgr_left_part);
2.  gridMapShuttle(&param, node_grid, msgr_grid);
3.  signalEvent(left_shuttle_hop, i);
4.  hop(link=+/-"nodeLink");
5.  waitEvent(left_shuttle_hop, i);
6.  waitEvent(right_shuttle_hop, i);
7.  shuttleLoad(msgr_left_part, node_right_in);
8.  gridMapShuttle(&param, msgr_grid, node_grid);
9.  signalEvent(e_right_shuttle, i);
10. exit;
Figure 4.9 Left shuttle Messenger script pseudo code
There are two C functions called in the left shuttle Messenger script. The following describes each function.
1. shuttleLoad: This function loads migrated particles from a node area of the task Messenger into the left shuttle Messenger. If no particle migration is needed, the length of the list of migrating particles is set to zero.
2. gridMapShuttle: This function synchronizes the cell map (the state of receptors) between the neighbors. The cell map is updated through data synchronization between neighbor nodes.
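As a concrete illustration of the first function, the following is a minimal C sketch of shuttleLoad. The thesis does not show the Messenger variable types, so particle_t, particle_list_t, and the fixed capacity are assumptions of this sketch, not the actual implementation.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical particle record and fixed-capacity list; the real
   Messenger variable types are not given in the text. */
typedef struct { double x, y; int index; } particle_t;

typedef struct {
    particle_t items[1024];   /* assumed capacity */
    int length;               /* number of valid entries */
} particle_list_t;

/* Copy the to-be-migrated particles from the node-variable list (src)
   into the Messenger-variable list (dst).  When no migration is
   needed, src->length is zero and dst simply records a zero length. */
static void shuttleLoad(const particle_list_t *src, particle_list_t *dst)
{
    dst->length = src->length;
    memcpy(dst->items, src->items,
           sizeof(particle_t) * (size_t)src->length);
}
```

Copying by length rather than by full capacity keeps the empty-migration case cheap: a zero-length list transfers nothing but still overwrites the destination length.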
4.3 Consistency with sequential implementation
We have described how to implement a parallel version of our simulation. We need to verify that the parallel version produces results consistent with the sequential simulation. A common criterion is statistical consistency: the results may be somewhat different but exhibit similar statistical behavior. We describe here how to achieve something stronger, exact consistency: we ensure that the behavior of each particle is exactly the same in our parallel and sequential simulations.
4.3.1 Identical results with the sequential simulation
Handling random numbers
Figure 4.10 Random number sequences
In Figure 4.10, we give an example of generating random numbers in the sequential and the parallel simulations. P1, P2 and P3 are three particles. The sequential simulation runs on node 1; one random number sequence, S1, is generated for all particles on node 1. In the parallel simulation, one random number sequence is generated for all particles on each node. We assume that P1 and P2 enter the system from node 1, and P3 enters the system from node 2. Sequence S1 is used by the particles on node 1, so P1 and P2 use the same sequence S1; sequence S2 is generated for the particles on node 2, so P3 uses sequence S2. This implementation produces different results: the random numbers generated in the sequential simulation differ from those generated in the parallel simulation for the same particle. For example, at iteration 2 in the sequential simulation, the random number generated for particle P1 is the fourth number in sequence S1, labeled "4, S1"; in the parallel simulation, it is the third number in sequence S1, labeled "3, S1". To resolve this difference, we assign the same random number sequence to a cell row in both simulations. Figure 4.11 shows that, before migration occurs, the random numbers for each particle are consistent in both the sequential and parallel systems. However, with this setup, there is another problem in obtaining identical random numbers in the sequential and parallel simulations.
Figure 4.11 Random number sequence change in particle migration
In the parallel simulation, particles migrate to neighbor nodes, and the migrated particles are treated as new particles there. The random number sequence used by a particle is the one generated for the cell row where the particle first entered the system, so a migrating particle loses its original random number sequence and starts a new sequence on the node it migrates to. This causes the random numbers to vary on both cell rows. Figure 4.11 illustrates the random number sequence change after particle migration.
At iteration 3, particle P2 migrates to node 2 and loses its original random number sequence S1. On node 2, P2 is treated as a new particle, so it is assigned the next number in sequence S2, labeled "4, S2".
Figure 4.12 Random number sequences unique to new particles
In order to achieve exact consistency between the distributed and sequential simulations, we need the sequence of random numbers generated for every particle to be the same in both systems. To accomplish this we assign a unique random number sequence to each new particle, so the random numbers used by the simulation for that particle are drawn from the sequence bound to the particle no matter where the particle resides during the simulation. When the particle migrates to a neighbor, it carries its random number sequence structure with it. We use this mechanism in both the sequential and parallel systems, so the random number sequence for a particular particle is always identical. Figure 4.12 illustrates the sequences that are unique to each new particle.
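The idea of binding a sequence to a particle rather than to a node can be sketched in C with the glibc reentrant drand48 family, which the implementation uses. The names particle_t, next_move, and migrate are illustrative, not taken from the thesis; migration is modeled as a plain struct copy.

```c
#define _GNU_SOURCE   /* srand48_r/drand48_r are glibc extensions */
#include <assert.h>
#include <stdlib.h>

/* The particle record carries its own generator state, so after a
   migration the particle keeps drawing from the same sequence. */
typedef struct {
    struct drand48_data rng;   /* per-particle random number state */
} particle_t;

static double next_move(particle_t *p)
{
    double r;
    drand48_r(&p->rng, &r);    /* draw from the particle's own sequence */
    return r;
}

/* Migration copies the whole record, generator state included. */
static particle_t migrate(const particle_t *p) { return *p; }
```

Because the drand48 state travels inside the particle record, the stream of numbers a particle sees is independent of which node executes it, which is exactly the property exact consistency requires.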
4.3.2 Random number seed initialization
We create a unique random number seed for each new particle. New particles enter the simulated space on each cell row, and each cell row is mapped onto a node. Figure 4.13 shows the function that initializes the seed structure.
The function initSeed is called when a new particle is generated. The seed value is an integer with an initial value of 99 plus the node number (lines 4 and 6). We use the C function srand48_r(long int seedval, struct drand48_data *buffer) to initialize the seed structure (lines 7 to 9). The array randState[particle_index] is a node variable; it stores the value of the initialized seed structure. Each element of the array is bound to a particle by its particle_index, which is assigned to a new particle according to the order in which it enters the system on each node. The data type struct drand48_data occupies 24 bytes of memory; we use the C library function memcpy to copy the data into the node variable array (line 10). So, during the simulation, whenever a random number is generated to simulate a particle movement, it is always generated from the sequence that was initialized and saved with the particle at initialization time.
1. function initSeed()
2. {
3.   int node_i[5] = {1, 2, 3, 4, 5};
4.   int seed_int = 99;
5.   struct randState_s *rs = &randState[particle_index];
6.   seed_int = seed_int + node_i[node_label];
7.   struct drand48_data *seedState;
8.   seedState = (struct drand48_data *)malloc(sizeof(struct drand48_data));
9.   srand48_r(seed_int, seedState);
10.  memcpy(rs, seedState, 24);
11.  free(seedState);
12. }
Figure 4.13 Function for assigning a random number seed to a particle
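The seeding scheme around initSeed can be sketched compactly as follows. Here randState is a plain array standing in for the node-variable array of the same name; MAX_PARTICLES and the helper names are assumptions of this sketch. Each draw reads and advances the state saved for that particle.

```c
#define _GNU_SOURCE   /* srand48_r/drand48_r are glibc extensions */
#include <assert.h>
#include <stdlib.h>

/* Per-node array of saved generator states, one slot per particle,
   indexed by particle_index.  The bound is assumed. */
enum { MAX_PARTICLES = 1024 };
static struct drand48_data randState[MAX_PARTICLES];

/* Seed scheme from initSeed: base value 99 plus the node number. */
static void init_seed(int particle_index, long node_number)
{
    srand48_r(99 + node_number, &randState[particle_index]);
}

static double particle_rand(int particle_index)
{
    double r;
    drand48_r(&randState[particle_index], &r);  /* advances saved state */
    return r;
}
```

Storing struct drand48_data values directly in the array (instead of a 24-byte memcpy of a heap copy) has the same effect but lets the compiler track the size via sizeof.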
The shuttle function in the shuttle Messengers
The shuttle function in the shuttle Messengers carries the particle and its random number sequence structure when hopping between the nodes. Two more variables are added to the shuttle function: (1) node_randState_left_out and (2) node_randState_right_in. The shuttle function loads the particle and its random number sequence from the node variables into the Messenger variables before it hops to the neighbor node. The modified shuttle functions called in the left Messenger are as follows:
• shuttle(node_left_out, msgr_left_part, node_randState_left_out, left_randState)
• shuttle(msgr_left_part, node_right_in, left_randState, node_randState_right_in)
5. Chapter 5
Simulation Enhancement: Parallel Simulation Protocols
The main goal is to parallelize the simulation model to speed up its execution, so that we are able to simulate this application for long enough, in terms of biological time, within a realistic time frame. In the previous section, we created the system structure for this distributed individual-based simulation. The problem with that implementation is the slowness caused by communication overhead; because of this overhead, the simulation takes more time in the parallel computation than in the sequential execution. In this section, we present our approach to enhancing the system implementation and improving the performance of the parallel computation. The two major issues we need to address and resolve are 1) the communication overhead and 2) the consistency between the parallel and sequential simulations. Some other subtle problems arise as well and will be discussed.
Based on the mapping of the simulated space onto the machine nodes, discussed in the previous sections, communication between nodes is required whenever a particle moves across to a neighbor node. This communication becomes a significant factor in slowing down the simulation when it occurs frequently, and its overhead has a greater impact when the simulation runs for a long period of time.
The correctness of the results produced by the parallel system is another concern. The simulation model was originally implemented sequentially; converting it to a parallel simulation creates a speed vs. accuracy trade-off.
In the following sections, we present our solutions to these two issues.
5.1 Exchange less frequently
We reduce the communication overhead by decreasing the frequency of communication between the nodes. This requires finding a granularity of communication delay that gives us adequate parallelism without compromising the accuracy of the simulation results.
Epoch
We define an epoch as the time interval between two occurrences of data exchange. Its length determines the granularity of communication used in the system. For example, the system synchronization in a distributed simulation can be designed to occur at every iteration or at every epoch.
Epoch length
In our simulation the time interval, or epoch, is measured in iterations, and we use the epoch length to quantify it. For example, if we set the epoch length to 500 iterations and exchange data once every epoch, then the communication delay is 500 iterations.
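The schedule can be stated as a toy model in C: with epoch length E, neighbor data is exchanged once every E iterations instead of every iteration. This helper just counts the exchanges over a run; it is an illustration, not code from the simulation.

```c
#include <assert.h>

/* Count how many data exchanges a run of `iterations` steps performs
   when the system synchronizes only at the end of each epoch. */
static int count_exchanges(int iterations, int epoch_length)
{
    int exchanges = 0;
    for (int i = 1; i <= iterations; i++)
        if (i % epoch_length == 0)   /* end of an epoch: synchronize */
            exchanges++;
    return exchanges;
}
```

An epoch length of 1 recovers the original scheme of exchanging every iteration, which is the baseline the next subsection compares against.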
Exchange every epoch
Figure 5.1 shows a simulation that runs for 3 iterations. Figure 5.1 (a) shows the communication that occurs when data is exchanged every iteration: Ti is the computation time for iteration i, and Ci is the time spent on network communication and on data packing and unpacking. The total computation time is 3Ti, the total communication time is 3Ci, and the total time spent on 3 iterations is 3Ti + 3Ci.
Figure 5.1 (b) shows the communication that happens when data is exchanged every 3 iterations (i.e., with an epoch length of 3). Te is the total computation time, Te = 3Ti, and Ce is the total communication time. In general, 3Ci > Ce: the communication overhead is reduced when the communication granularity is increased to several iterations.
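The timing argument above can be written as arithmetic: exchanging every iteration costs n(Ti + Ci), while exchanging once per epoch of n iterations costs nTi + Ce. The inequality Ce < nCi is the assumption under which the epoch scheme wins; the sample numbers in the usage are made up for illustration.

```c
#include <assert.h>

/* Total time when data is exchanged after every iteration. */
static double time_every_iteration(int n, double Ti, double Ci)
{
    return n * (Ti + Ci);
}

/* Total time when data is exchanged once per n-iteration epoch. */
static double time_every_epoch(int n, double Ti, double Ce)
{
    return n * Ti + Ce;
}
```

For example, with n = 3, Ti = 1, Ci = 2 and Ce = 4 (so that Ce < 3Ci), per-iteration exchange costs 9 time units while per-epoch exchange costs 7.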
Our goal is to find the epoch length that yields the least communication overhead while retaining most of the accuracy of the computation. This approach provides a way to speed up the simulation; at the same time, however, it introduces a communication delay.
Figure 5.1 Communication granularity level
The delay in turn introduces two issues:
1. A particle can move across to a neighbor node during an epoch. This is addressed by introducing shadow cells.
2. The information in shadow cells may not be current, causing particles to become stuck when they should not, or vice versa. We describe this problem in more detail in section 5.3, and the solution in section 5.4.
Figure 5.2 Node mapping with shadow cells
5.2 Shadow cells
We create shadow cells to extend the local node boundary. A shadow cell is a copy of a neighbor cell taken at the beginning of the epoch; during the epoch, the shadow cells are part of the local node's working space. Figure 5.2 shows a 5-node mapping with shadow cells. The simulated space consists of 25 cells in 5 rows and 5 columns, numbered from 1 to 25. Each node maps one row of local cells and two rows of shadow cells of its neighbors; the local cells are marked with bold numbers and the shadow cells with gray numbers. For example, node 1 maps one row of local cells, numbered 1 to 5, and two rows of shadow cells, numbered 6 to 10 and 21 to 25 respectively. Shadow cells are synchronized at the beginning of the epoch and can be accessed and processed by the local node. The length of the epoch determines how frequently the shadow cells are refreshed: at the start of each epoch, the entire simulated space is synchronized through local data exchange between neighbors.
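The neighbor-row bookkeeping behind this mapping can be sketched with 0-based indices. Node k owns cell row k and shadows the rows of its two neighbors, wrapping around at the ends as in Figure 5.2 (node 1 in the figure shadows cells 6 to 10 and 21 to 25). The helper names are ours, not from the implementation.

```c
#include <assert.h>

enum { ROWS = 5 };   /* 5 cell rows, one per node, as in Figure 5.2 */

/* Row shadowed from the upper neighbor (wraps at the first node). */
static int upper_shadow_row(int node) { return (node + ROWS - 1) % ROWS; }

/* Row shadowed from the lower neighbor (wraps at the last node). */
static int lower_shadow_row(int node) { return (node + 1) % ROWS; }
```

In 0-based terms the figure's node 1 is node 0, whose upper shadow row is row 4 (cells 21 to 25) and whose lower shadow row is row 1 (cells 6 to 10).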
A potential problem with this approach is that out-of-date information in the shadow cells can cause a conflict between a local particle and an incoming particle. For example, consider Figure 5.3, which shows how an incoming particle appears to a local node. In this example, node 1 is the local node and node 2 is the lower neighbor. Particles on node 1 are labeled by the letter A; particles on node 2 are labeled by the letter B. At the beginning of an epoch, all A particles on node 1 and all B particles on node 2 are within their respective node cell boundaries. As the simulation continues and a particle in cell number 6 on node 2 moves across its boundary into the area of cell number 1 on node 1, we mark this particle as A' to distinguish it from the other particles on node 1. This particle A' is an incoming particle to node 1. Before the end of the epoch, node 1 does not see this particle: node 1 only processes its own A particles, while particle A' is still moved and processed by node 2. On the other hand, node 2 does not see up-to-date information on node 1, so it processes particle A' based on the stale information about node 1 that was stored on node 2 at the beginning of the epoch.
Figure 5.3 View of incoming particle A’ on local node (node 1)
After the particle enters its neighbor's cell area, a free receptor on node 1 can be available to a local particle A and also, at a later time, to the incoming particle A'. If the local particle is captured by the free receptor, the incoming particle, which is visible only to node 2, can be captured by the same free receptor, because node 2 sees the cells on node 1 only as its shadow cells and does not know that the receptor has become occupied by capturing a local particle. We illustrate this example in scenario 1 of section 5.3, below.
If we do not correct the information mismatch introduced by the shadow cells, the simulation loses accuracy, and such inaccuracies can grow as the simulation continues. To correct this we developed a conflict resolution scheme, described in section 5.4.
5.3 Conflict scenarios description
With shadow cells supporting less frequent data exchange, a conflict occurs when two free particles are candidates to be captured by the same free receptor. Because of the communication delays in the distributed simulation, such a conflict has to be resolved at the end of the epoch. In this section we describe scenarios showing how these conflicts occur; we then give our solution in the following sections. We refer to Figure 5.3 for the description of the local node (node 1), the neighbor node (node 2), local particles (particle A), incoming particles on the local node (particle A') and neighbor node particles (particle B).
5.3.1 Scenario 1: A free particle becoming stuck when it should not
In this first scenario, a free receptor is available to a local particle; it captures the local particle and becomes occupied. At a later time, this now-occupied receptor still appears to a neighbor node to be a free receptor, because the neighbor reads the stale data saved at the beginning of the epoch. As a result, the neighbor node captures another particle (an incoming particle), which it should not capture. Figure 5.4 shows the sequence of particle capturing events in both the sequential and the parallel simulation within an epoch. We use a simple data set to describe this scenario.
Figure 5.4 Views of local node with an incoming particle of the local node (scenario 1)
Figure 5.4 (a) shows the sequence of events that occur on node 1 in the sequential simulation. Figure 5.4 (b) and (c) illustrate the sequence of events that occur in the distributed simulation where we use shadow cells. Figure 5.4 (b) illustrates node 1 as it appears to node 1 itself; the question mark indicates that the incoming particle (particle A') is not visible to the local node. Figure 5.4 (c) shows the local node as it appears to the neighbor node, node 2 (the local cells are the shadow cells of the neighbor node); here the question mark indicates that the local particles (particles A) are not visible to node 2. The time periods T0 through Te are successive time periods within an epoch.
Figure 5.5 illustrates this scenario graphically along a time line. The sequential simulation runs on one node; the parallel simulation runs on two nodes with shadow cells mapped on each node. The shadow cells are labeled by the letter S.
Figure 5.5 Views of particles movement in sequential and parallel implementations of scenario 1
1. At time T0, the beginning of the epoch, the system is synchronized by exchanging data between neighbor nodes. Figure 5.4 (a), (b) and (c) all show the same view: one free receptor (# of stuck particles) on the local node, and one local free particle (# of local particles). During the synchronization, all incoming particles are migrated to the local node, so there are no incoming free particles at the beginning of an epoch.
2. At time T1, one particle on the neighbor node walks across the boundary, becoming an incoming particle on the local node. Figure 5.4 (a) shows that, without communication delay, i.e., in a sequential simulation where all cells are located on one node, the local particle and the incoming particle are treated the same. In this view, two free particles are processed: one is the local particle and the other is the incoming particle.
In Figure 5.4 (b), there is a question mark for the incoming particle. The question mark indicates that, during the epoch, node 1 does not see any incoming particles that come from node 2. At this time, node 1 has an incoming particle that has crossed from node 2, but node 1 does not see it. Node 1 only moves the local free particle to the next location.
In Figure 5.4 (c), the question mark indicates that node 2 does not see any particles on node 1 during the epoch. At this time, one particle that was located on node 2 before time T1 walks across the node boundary and enters its shadow cell. This particle is an incoming particle of node 1, but is moved and processed by node 2 based on the stale information about node 1 during the epoch.
3. At time T2, Figure 5.4 (a) and (b) show that a particle capturing event occurs. Figure 5.4 (a) shows that there are two free particles; the local free particle gets captured by the free receptor and becomes a stuck particle, while the incoming particle moves to the next location.
The particle capturing event on node 1 is illustrated in Figure 5.4 (b): node 1 sees only the local free particle, which gets captured by the free receptor and becomes a stuck particle.
In this example (Figure 5.4 (c)), the particle capturing event does not occur on node 2, even though node 2 sees a free receptor available in its shadow area (node 1) based on the stale information saved on the neighbor node at T0. The incoming particle may be too far away from the available free receptor on the local node, or the capturing probability may not satisfy the capturing criteria. So node 2 moves the incoming particle to the next location in its shadow area.
4. At time T3, node 1 sees that there is no free receptor left on the local node. Node 1 then moves the incoming particle to the next location (Figure 5.4 (a)). Figure 5.4 (b) shows that there is no free receptor available and no local free particle to be moved, so node 1 has no work to do.
However, a particle capturing event occurs on node 2; see Figure 5.4 (c). Node 2 reads that there is a free receptor available on node 1, its shadow area, and processes the incoming particle. The incoming particle gets captured by the free receptor, the same receptor that captured the local particle at T2, and becomes a stuck particle. Compared with the events in the sequential simulation illustrated in Figure 5.4 (a), this particle capturing event on node 2 should not occur and this incoming particle should not be captured. The cause of this problem is that node 2 does not see the up-to-date information on node 1 and uses stale information to make the capturing decision. The solution for this scenario is described in section 5.4.1.
5. Time Te is the end of the epoch. In the sequential simulation, there is one stuck particle and one free particle in the system. In the parallel simulation, node 1 ends with one stuck particle, 0 free local particles, and unknown incoming particles (Figure 5.4 (b)); node 2 ends with one stuck particle, 0 free incoming particles, and unknown local particles (Figure 5.4 (c)). These unknowns are resolved when the system performs its synchronization by exchanging data between the nodes at the end of the epoch. In this example, no particle needs to be migrated between the local node and the neighbor node; we discuss particle migration in section 5.4.3.
The basic idea for resolving scenario 1 is to re-process the particle capturing events and release the stuck particle that should not be stuck. We implement a mechanism for this solution in the conflict resolution; the details are discussed in section 5.4.1.
5.3.2 Scenario 2: A free particle not becoming stuck when it should
Figure 5.6 Views of local node with degraded stuck particles on local node (Scenario 2)
In general, data can only be recovered if it was saved or can be reproduced. In the distributed simulation, the communication delay can cause information loss when the system is updated based on stale data. We saw the free receptor conflict in scenario 1, caused by using stale data at T3 on node 2; the conflict in that situation can be resolved in the conflict resolution, in which the stuck particles are recalculated to produce the correct result. In this second scenario, the problem is that a free particle does not get stuck when it should. When this happens, the system has no knowledge of the event, so the lost stuck particle cannot be recovered. We use a simple dataset to describe this scenario. Figure 5.6 shows the sequence of events on node 1: Figure 5.6 (a) illustrates the sequence of events in the sequential simulation, and Figure 5.6 (b) and (c) illustrate the sequence of events in the parallel simulation using shadow cells.
Figure 5.7 illustrates this scenario graphically along a time line. The sequential simulation runs on one node; the parallel simulation runs on two nodes with shadow cells mapped on each node. The shadow cells are labeled by the letter S.
Figure 5.7 Views of particles movement in sequential and parallel implementations of scenario 2
1. At time T0, the beginning of the epoch, the local node data is the same in both the sequential and parallel simulations, and is also saved by the neighbor node. There is 1 stuck particle on the local node.
2. At time T1, Figure 5.6 (a) illustrates a particle degradation event on the local node: the stuck particle is degraded from the system, and the receptor that captured it is freed. In the sequential simulation, the just-freed receptor is immediately available to the free particles.
Figure 5.6 (b) and (c) illustrate the particle activities in the parallel simulation on two nodes. Figure 5.6 (b) shows the particle degradation event on node 1: the receptor occupied by the stuck particle is freed. This newly freed receptor is available only to the local particles, because node 1 does not see any incoming particles from its neighbor node.
Node 2, however, is not aware of the particle degradation event on node 1, so it does not see the newly freed receptor (Figure 5.6 (c)). The number of free receptors that appears to node 2 remains 0, read from the data saved at the beginning of the epoch, at time T0.
3. At time T2, Figure 5.6 (a) illustrates a particle walking across the boundary and becoming an incoming particle. The incoming particle is treated the same as a local free particle in the sequential simulation.
In the parallel simulation, node 1 does not see the incoming particle that crosses over from the neighbor node (Figure 5.6 (b)); there is no work for node 1 during this time period.
On node 2, a particle moves across its node boundary and enters the shadow cell area, becoming an incoming particle of node 1. Node 2 processes this particle in the shadow area and moves it to the next location on node 1 (Figure 5.6 (c)).
4. At time T3, in the sequential simulation (Figure 5.6 (a)), the incoming particle is captured by the newly freed receptor.
In the parallel simulation, node 1 is not aware of the incoming particle that arrived from the neighbor at time T2, so the capturing event does not occur on node 1 and the newly freed receptor remains free there (Figure 5.6 (b)).
The neighbor node, node 2, in turn is not aware of the newly released free receptor on node 1, so the particle capturing event does not happen in the shadow area on node 2 either. The incoming particle processed by node 2 should become stuck but does not, because the current information about the free receptor on the local node is not passed to the neighbor node during the epoch; the only data available to the neighbor node is the stale data saved at time T0.
The problem with this lost stuck particle is that it cannot be corrected later, because neither the local nor the neighbor node has any knowledge of the to-be-captured incoming particle. The event of the incoming particle being captured by the free receptor on the local node is not snapshotted by the system, and therefore cannot be found or reproduced later. To be able to capture this incoming particle when it should be captured, we need a free receptor that appears available to the neighbor node at that time. To make this happen, we overstate the number of free receptors in the shadow cells at the beginning of the epoch. The details are discussed in section 5.4.2.
5. At time Te, the end of the epoch, in the sequential simulation there is a stuck particle in the system (Figure 5.6 (a)). In the parallel simulation, because the particle capturing event does not happen during the epoch, a free particle remains in the system instead of a stuck particle. This shows that, after time T3, the sequential and parallel simulations produce a different result that cannot be corrected; in this scenario, the discrepancies between the two simulations cannot be resolved at a later time.
We have described two conflict scenarios in this section. In the following section, we discuss the solutions and further potential problems in each scenario.
5.4 Conflict resolution
We developed a conflict resolution scheme to deal with the scenarios identified in section 5.3. The scheme is to (1) take snapshots of every particle capturing event that occurs during the epoch, (2) take snapshots of every particle degradation event that occurs during the epoch, and (3) re-process the data collected during the previous epoch to resolve the conflicts and obtain the correct simulation results.
Figure 5.8 Tentatively stuck particles on node 1
Tentatively stuck particle
We call a particle a tentatively stuck particle if it is captured by a free receptor during the epoch. The tentatively stuck particles include the stuck particles in the local cells and the stuck particles in the node's shadow cells of the neighbor nodes. Figure 5.8 illustrates the tentatively stuck particles on node 1. Cells 1 to 5 are the local cells of node 1, cells 6 to 10 are the shadow cells of its lower neighbor node, and cells 21 to 25 are the shadow cells of its upper neighbor. There are a total of 5 tentatively stuck particles in this figure: three local tentatively stuck particles in the local cells, one stuck particle in the shadow cells of the lower neighbor, and one stuck particle in the shadow cells of the upper neighbor. This means that during the epoch, node 1 takes five snapshots to record five particle capturing events.
Tentatively stuck particle tag
We use a tentatively stuck particle (TSP) tag to save a snapshot whenever a particle capturing event occurs during the epoch. A TSP tag is attached to every tentatively stuck particle. At the end of the epoch, in the conflict resolution process, these TSP tags are exchanged between nodes. Figure 5.9 shows the TSP tag.
Figure 5.9 TSP tag structure
The tag entries are:
1. particle index: The index into the free particle list, used to remember the order in which the free particle entered.
2. iteration: The iteration at which the particle is captured during the epoch.
3. location: The coordinates of the location where the free particle resides before it gets captured.
4. event type: This indicates whether the tentatively stuck particle is a local stuck particle or was captured in the shadow cells. If it is captured on a local cell, it is a type I event; otherwise it is type II. For example, the event type in the tag for a tentatively stuck particle in the local cells is I, while it is II for tentatively stuck particles captured in the shadow cells.
5. cell segment number: The number of the cell segment in which the particle is captured.
6. number of free receptors: The number of free receptors at the time a particle gets captured.
7. number of return hits: One of the parameters used in calculating the capturing of a particle.
8. random number for the capturing probability: The random number generated when the particle gets captured.
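The field list above can be rendered as a C structure. The thesis gives only the field list (Figure 5.9), so the concrete names and types here are assumptions of this sketch.

```c
#include <assert.h>

/* One snapshot of a particle capturing event, taken during the epoch
   and exchanged between nodes at conflict resolution time. */
typedef struct {
    int    particle_index;    /* entering order in the free particle list */
    int    iteration;         /* iteration at which the capture occurred */
    double loc_x, loc_y;      /* location just before the capture */
    int    event_type;        /* 1 = captured on a local cell, 2 = in a shadow cell */
    int    cell_segment;      /* cell segment of the capture */
    int    n_free_receptors;  /* free receptors at capture time */
    int    n_return_hits;     /* parameter of the capture calculation */
    double capture_rand;      /* random number used for the capture probability */
} tsp_tag_t;
```

Carrying the iteration and the random number in the tag is what lets the resolution phase replay a capture decision deterministically instead of drawing fresh randomness.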
Degraded stuck particle tag
We use a degraded stuck particle (DSP) tag to save a snapshot when a stuck particle degrades from the local cells. Figure 5.10 shows the DSP tag. A DSP tag is attached to the stuck particle that is degraded from the system, and is processed in the conflict resolution.
Figure 5.10 DSP tag structure
5.4.1 Solution to scenario 1
As described in scenario 1 in the previous section, a conflict occurs when one free receptor captures two free particles during the epoch. The rule in our simulation is that a free receptor can capture only one free particle at a time, after which it becomes an occupied receptor; a free particle cannot be captured by an occupied receptor. So the second particle capturing event should not occur. In this section we describe how we fix this problem.
To fix the problem, we (1) take snapshots of all tentatively stuck particles during the epoch and (2) exchange the TSP tags between the nodes and re-run the simulation of the last epoch to either confirm each tentatively stuck particle or release it as a free particle. Because the conflict is caused by a type II tentatively stuck particle, the re-run process needs to be executed only when a type II tentatively stuck particle exists.
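The re-run decision itself is a simple scan of the collected tags: the last epoch is replayed only if at least one tag is type II (a capture in a shadow cell), since only those can conflict with a local capture. The tag model and names below are ours, reduced to the single relevant field.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal tag model: 1 = captured on a local cell, 2 = in a shadow cell. */
typedef struct { int event_type; } rerun_tag_t;

/* Return 1 if the epoch must be re-processed, i.e. if any collected
   capture happened in a shadow cell and may duplicate a receptor. */
static int needs_rerun(const rerun_tag_t *tags, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (tags[i].event_type == 2)
            return 1;
    return 0;
}
```

Skipping the replay when all captures are type I keeps the conflict resolution cheap in the common case where no particle was captured across a node boundary.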
Figure 5.11 illustrates the process of taking the snapshots during the epoch and how the conflict resolution resolves the tentatively stuck particles for scenario 1. Figure 5.11 (a) shows the snapshot taken on the local node during the epoch; Figure 5.11 (b) shows the snapshot taken on the neighbor node during the epoch. Figure 5.11 (c) illustrates the re-processing of the tags of the tentatively stuck particles: it releases the tentatively stuck particle that was tagged at T3 on node 2 and moves it to the next location. The process is as follows:
Figure 5.11 Solution for scenario 1
1. At simulation time T2 during the epoch, a free receptor is available on the local
node and captures the local free particle. A snapshot is taken for this capturing
event, labeled by a TSP tag. The event type of this TSP tag is type I;
see Figure 5.11 (a).
2. Figure 5.11 (b) shows that, at simulation time T3 during the epoch, the same free
receptor that captured the local free particle at T2 still appears to the neighbor node
as a free receptor in its shadow cell area. The particle that walks into the shadow
cell area on node 2 gets captured by this apparently free receptor. A TSP tag is created
for this tentatively stuck particle. The event type in this tag is type II because the
particle was captured in the shadow cells of the neighbor node.
At the conflict resolution time, the TSP tags created in the shadow cells of the
neighbor nodes during the epoch are sent over to the local node. In this example, the
local node has a total of two TSP tags, one of type I and one of type II. This means
that during the epoch at most two particles can become stuck on the local node.
The resolution repeats the simulation of the last epoch, starting at T0. The difference
from the normal simulation is that this repeat simulation does not simulate the free
particles; it only calculates the stuck particles. The goal of the resolution is to find the
tentatively stuck particles that should not become stuck and release them back as free
particles. Because the TSP tags collect all particles that could possibly become stuck
during the epoch, the resolution reads each TSP tag to either confirm it as a stuck
particle or release it as a free particle. A released particle continues to move to its next
location until the end of the epoch. The tags are sorted by the iteration at which the
snapshot took place, so the tag created at T2 is processed before the tag created at T3.
Figure 5.11 (c) illustrates the resolution process: it confirms one tentatively stuck
particle as a stuck particle and releases the other as a free particle. The resolution of
the TSP tags proceeds as follows.
1. The repeat process starts at T0 but only re-calculates the capturing events at the
times when the snapshots were taken. At time T2, the TSP tag taken at T2 during
the epoch is processed. If the data in the TSP tag matches the data in the current
simulation, the tentatively stuck particle is confirmed as a stuck particle. Figure
5.11 (c) shows that the tag data matches the current simulation: there is a free
receptor available in that cell segment, so the tentatively stuck particle is confirmed
as a stuck particle and the free receptor becomes occupied.
2. At time T3 during the resolution process, the TSP tag created at T3 during the
epoch is processed. There is no free receptor left in the cell segment, so the second
tentatively stuck particle cannot become stuck. It is released as a free particle and
continues to move to its next location until the end of the epoch.
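The replay for scenario 1 can be sketched as follows. This is a simplified model with our own names and a plain dict for receptor state, not the thesis implementation, and it omits re-simulating the movement of released particles.

```python
from collections import namedtuple

# Minimal tag record for the sketch; real TSP tags carry more fields
# (event type, saved receptor counts, random-number state).
Tag = namedtuple("Tag", ["iteration", "cell_segment"])

def resolve_scenario1(tags, free_receptors):
    """Replay the epoch's capture events against the T0 receptor state.

    `tags` is the merged list of local (type I) and exchanged (type II)
    TSP tags; `free_receptors` maps cell segment -> free-receptor count
    at the start of the epoch.  Returns (confirmed, released) tag lists.
    """
    confirmed, released = [], []
    # Process tags in snapshot order, so the T2 capture resolves
    # before the T3 capture.
    for tag in sorted(tags, key=lambda t: t.iteration):
        if free_receptors.get(tag.cell_segment, 0) > 0:
            free_receptors[tag.cell_segment] -= 1  # receptor now occupied
            confirmed.append(tag)
        else:
            released.append(tag)  # back to a free particle
    return confirmed, released
```

With one free receptor in the segment at T0 and tags taken at T2 and T3, the first tag is confirmed and the second is released, matching Figure 5.11 (c).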
The result of the conflict resolution for this example is that the local free particle
becomes stuck at T2; it is the only new stuck particle during this epoch. The second
tentatively stuck particle is released as a free particle and continues to move to its next
location until the end of the epoch. The calculation of the free-particle movement in the
resolution process is accurate because it is based on the data saved in the TSP tag. The
conflict resolution thus ensures that the result is consistent with what the sequential
simulation produces.
5.4.2 Solution to scenario 2
We have defined type I and type II tentatively stuck particles. The solution to scenario
1 shows that, in order to use the tentatively stuck particle mechanism, we must ensure
that all tentatively stuck particles, both local stuck particles (type I) and stuck particles
in shadow cells (type II), are successfully tagged during the epoch. Scenario 2
describes the problem of a type II tentatively stuck particle not being tagged during
the epoch. This happens because a receptor freed during the epoch is visible only to
the local free particles; the neighbor node has no visibility into the local node during
the epoch. To resolve this problem, we overstate the number of free receptors in the
shadow cells at the beginning of the epoch. The number of free receptors that appears
to the free particles in the shadow cells is then larger than it actually is, so the
probability of capturing a free particle in a shadow cell is increased. This approach
ensures that all possible tentatively stuck particles are detected and tagged during the
epoch.
We overstate the number of free receptors in the shadow cells (1) at the beginning
of the epoch and (2) during the epoch, whenever a free particle in a shadow cell reads
a free-receptor count of zero.
Figure 5.12 Overstating the number of free receptors of local node in the shadow cell on neighbor node
At the beginning of the epoch, the system has performed synchronization: all the
nodes have been updated by exchanging data with their neighbors, so each node has
the current information of its neighbors saved in its shadow cells. We update the
shadow cells by adding one additional free receptor to every cell segment of the
shadow cells. This information is used as stale data to calculate the incoming particles
in the shadow cells during the epoch. The extra free receptor in a shadow cell acts as a
receptor newly freed by a particle degradation event during the epoch and is visible to
the incoming particles in the shadow cell. Figure 5.12 illustrates this approach to
resolving scenario 2.
Figure 5.12 (a) shows the snapshot taken on the local node during the epoch.
Figure 5.12 (b) shows the snapshot taken on the neighbor node during the epoch.
Figure 5.12 (c) illustrates the re-processing of the two tags of tentatively stuck
particles. The process is described as follows:
1. At simulation time T1 during the epoch, a stuck particle is degraded, so the receptor
bound to this stuck particle is freed. Once freed, it is immediately available as a
free receptor to the local free particles. This particle degradation event is tagged
with a DSP tag; see Figure 5.12 (a).
2. At simulation time T3 during the epoch, on the neighbor node, the incoming
particle gets captured in the shadow cell as a tentatively stuck particle by the free
receptor. This receptor is the one saved at the beginning of the epoch as an
overstated free receptor. A type II TSP tag is generated for this tentatively stuck
particle; see Figure 5.12 (b).
DSP tags are local to a node and are not exchanged with neighbors. A DSP tag
contains the event information of a particle degradation on the local node. Because the
degradation process changes the number of free receptors and the number of stuck
particles in the system, the DSP tags must be processed together with the exchanged
TSP tags in the conflict resolution.
Figure 5.12 (c) illustrates the resolution process. It confirms the tentatively
stuck particle tagged at T3. The resolution of the tags proceeds as follows.
1. When the conflict resolution reaches the first DSP tag at iteration T1, it confirms
that the stuck particle is degraded and that the receptor bound to it is freed. The
newly freed receptor is available immediately.
2. When the resolution reaches time T3, the TSP tag created at T3 during the epoch
is processed. The newly freed receptor is available at that time, and the number of
free receptors in the current system matches the number saved in the TSP tag,
which is one. So this tentatively stuck particle is confirmed as a stuck particle.
This produces the same result as the sequential simulation: by the end of the
epoch, there is one stuck particle and no free receptor left in the system.
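The joint replay of DSP and TSP tags can be sketched as one pass over the merged, time-ordered event list. This is a simplified model with illustrative names; at equal iterations it processes the degradation first, reflecting that a freed receptor is available immediately.

```python
def resolve_with_degradation(events, free_receptors):
    """Replay DSP and TSP tags together in iteration order (a sketch).

    `events` is a list of (iteration, kind, cell_segment) tuples, with
    kind "DSP" (a degradation frees a receptor) or "TSP" (a tentative
    capture to confirm or release).  `free_receptors` maps cell
    segment -> free-receptor count at the start of the epoch.
    """
    confirmed, released = [], []
    # "DSP" sorts before "TSP", so at the same iteration the freed
    # receptor becomes available before the capture is checked.
    for it, kind, seg in sorted(events):
        if kind == "DSP":
            free_receptors[seg] = free_receptors.get(seg, 0) + 1
        elif free_receptors.get(seg, 0) > 0:
            free_receptors[seg] -= 1      # receptor becomes occupied
            confirmed.append((it, seg))
        else:
            released.append((it, seg))    # back to a free particle
    return confirmed, released
```

In the Figure 5.12 example, the segment starts with zero free receptors; the degradation at T1 frees one, so the capture tagged at T3 is confirmed.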
When a cell is full or almost full with stuck particles, there are none or very few
free receptors left. This situation arises after the simulation has run for some time and
the free receptors have captured particles up to the cell's capacity. However, when the
system reaches this state, the stuck particles start to degrade more, because the
degradation rate is based on the number of stuck particles in a cell segment: the more
stuck particles reside in a cell segment, the faster they degrade. This degradation can
therefore produce more free receptors during the epoch than were predicted at its
beginning, so adding one free receptor to each cell segment in the shadow cells at the
beginning of the epoch may not always suffice. To deal with this situation, whenever
an incoming particle reads zero free receptors in a shadow cell during the epoch, we
set the count to 1, to increase the particle-capturing probability.
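The two overstatement rules can be expressed as a pair of small helpers. This is a sketch under our own names; the thesis applies rule (1) at synchronization time and rule (2) at every read during the epoch.

```python
def shadow_count_at_sync(neighbor_free):
    """Rule (1): at synchronization, save one extra free receptor
    into every cell segment of the shadow cells."""
    return neighbor_free + 1

def shadow_count_read(stale_free):
    """Rule (2): during the epoch, a particle that reads zero free
    receptors in a shadow cell segment is shown one instead."""
    return stale_free if stale_free > 0 else 1
```

Both rules only increase the apparent receptor count, so they can only add tentatively stuck particles, never miss one; any over-tagging is then handled by the scenario 1 resolution.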
The potential problem with this approach is that overstating the free receptors
in the shadow cells can cause more particles to be tagged as tentatively stuck during
the epoch than should be. When that happens, however, it simply becomes the issue
we addressed in scenario 1.
5.5 The order of processing the particles
We have described the problems and their resolutions for particle-capturing events in
the distributed simulation. In this section we address the order of processing the
particles.
5.5.1 First come, first processed
The order of processing particles has no biological meaning. Driven by the iterations,
the particles are processed in an order determined only by the implementation. It does
not matter which particle is captured, since the particles move simultaneously and
instantaneously, but the same chosen order must be preserved in both simulation
systems to make the sequential and parallel simulations comparable and achieve an
identical simulation result.
In general, in the sequential simulation, the particles are processed with a
first-come-first-processed priority. The order of processing particles can determine
which particle gets stuck: if two free particles are candidates to be captured by one
free receptor, the one that comes first gets the priority to become stuck. Below we
alter the example stated earlier in scenario 1 (see Figure 5.4 (a)) to describe this
issue.
• If both the local and the incoming free particles become candidates of the free
receptor at the same iteration, i.e. T3 = T2, only one particle can be captured by
the free receptor: the one that is processed first becomes stuck. The other
particle bounces back from the cell segment and continues its next move as a
free particle. So the order of processing these two particles determines which
particle gets stuck.
In the sequential simulation, the particles keep their order for the entire
simulation on each node; the particle that enters the system first is always processed
first. This is not always true in the parallel simulation, where some particles migrate
to neighbor nodes at the end of an epoch. The particles that migrate to a neighbor are
treated as new particles on that neighbor node, so under the first-come-first-processed
priority they are processed after the particles already on the neighbor node. This can
change the order of processing the particles, because the migrated particles may have
entered the system earlier than some of the particles on the neighbor node.
The sequential and parallel simulations can produce different results if the
particles are simulated in different orders, so we need to preserve the order of
processing the particles throughout the simulation.
5.5.2 Particle migration
In the parallel simulation, the incoming particles migrate to their neighbor nodes at
the end of an epoch, so at the beginning of an epoch all particles start on the node they
belong to. A particle migrates to its neighbor node as a new particle of that node.
Under the first-come-first-processed priority, the migrated particles are processed
after the local particles.
Figure 5.13 illustrates a simple example of particle migration between node 1
and node 2. At epoch e, there are 4 particles A1, A2, A3, and A4 on node 1, labeled
A1(e,1), A2(e,1), A3(e,1) and A4(e,1), and 4 particles on node 2, labeled B1(e,2),
B2(e,2), B3(e,2) and B4(e,2), where e is the epoch time and the second value is the
node number.
Figure 5.13 Particle migration between node 1 and node 2
As discussed in the earlier section, simulating particles in different orders in the
two simulations can produce different results. In the sequential simulation, the order
of processing the particles does not change for the entire simulation. Figure 5.13 (a)
shows the particle migration in the sequential simulation. During epoch e, particle
A3(e,1) on node 1 moves across to the node 2 area and becomes an incoming particle
of node 2. At the end of the epoch, the migration process does not move the particle
from its current node to the node it migrates to; instead, it updates the particle's
values. Particle A3(e,1) is updated to A3(e+1,2) to indicate that at epoch e+1 particle
A3 is located in the node 2 area. So the migration process in the sequential simulation
does not change the order of processing particles on the nodes.
Figure 5.13 (b) shows the particle migration in the parallel simulation. Particle
A3(e,1) migrates from node 1 to node 2: it is shuttled over by the shuttle Messenger to
node 2 and labeled A3(e+1,2). The order of processing the particles on node 1 changes
from A1, A2, A3, A4 to A1, A2, A4, and the order on node 2 changes from B1, B2,
B3, B4 to B1, B2, B3, B4, A3.
To keep the order of processing particles the same in both the sequential and
parallel simulations, we migrate the particles to their neighbors in the sequential
simulation in the same way as in the parallel simulation. The sequential simulation
runs on one physical node; we create in it a set of logical nodes corresponding to the
set of physical nodes on which the parallel simulation runs. When an incoming
particle migrates to a neighbor node in the parallel simulation, it is shuttled over by
the shuttle Messenger. In the sequential simulation, the incoming particle is deleted
from the particle list of the logical local node and added to the particle list of the
logical neighbor node it migrates to.
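The migration between logical nodes in the sequential simulation amounts to a delete-then-append on per-node particle lists. A minimal sketch, with names of our own choosing:

```python
def migrate_sequential(logical_nodes, src, dst, particle):
    """Move a particle between the particle lists of two logical nodes.

    `logical_nodes` maps node number -> ordered particle list, one per
    physical node of the parallel run.  Migration deletes the particle
    from the source list and appends it to the destination list, so the
    destination processes it after its existing particles, exactly as
    the parallel simulation does.
    """
    logical_nodes[src].remove(particle)
    logical_nodes[dst].append(particle)
```

Applied to the Figure 5.13 example, migrating A3 from node 1 to node 2 leaves node 1 with A1, A2, A4 and node 2 with B1, B2, B3, B4, A3.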
5.5.3 Preserving the order of processing the particles
We pack and unpack the particles shuttled between nodes in a way that keeps the
particles in a consistent order. There are three types of new particles entering a node:
(1) new particles entering the system for the first time; (2) migrated particles; and (3)
tentatively stuck particles released back into the system as free particles.
The new particles entering from the left boundary are always appended to the
end of the particle list on each node. The migrated particles and the tentatively stuck
particles just freed by the conflict resolution are sorted by their particle index on the
original node and appended to the particle list on the current node. The procedure for
unpacking particles on a destination node is as follows:
a. Load particles migrated from the upper neighbor to the local node.
b. Load particles migrated from the lower neighbor to the local node.
c. Load tentatively stuck particles to the local node.
d. Run the conflict resolution to resolve the tentatively stuck particles.
e. Sort the particles that come from the upper neighbor by their original
particle index and append them to the particle list on the local node.
f. Sort the particles that come from the lower neighbor by their original
particle index and append them to the particle list on the local node.
g. Sort the particles released from the tentatively stuck particles that were
incoming particles in the upper shadow cells and append them to the
particle list on the local node.
h. Sort the particles released from the tentatively stuck particles that were
incoming particles in the lower shadow cells and append them to the
particle list on the local node.
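The append order of steps a–h can be sketched as one loop over the four incoming groups. This is a simplified model with our own names; the real unpacking also runs the conflict resolution between loading and appending, which is omitted here.

```python
def unpack_particles(local, upper_migrated, lower_migrated,
                     upper_released, lower_released, index):
    """Append the four incoming groups in the fixed order of steps e-h.

    `index` extracts a particle's index on its node of origin; each
    group is sorted by it before being appended, so the sequential and
    parallel runs see the same processing order for new particles.
    """
    for group in (upper_migrated, lower_migrated,
                  upper_released, lower_released):
        local.extend(sorted(group, key=index))
    return local
```

Because the group order and the within-group sort key are both fixed, any node that receives the same sets of particles produces the same final list, regardless of arrival order.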
5.5.4 Temporary incoming particles
A temporary incoming particle is a particle that walks into a shadow cell, comes
close to a free receptor, and then returns to its local node during the epoch. Figure
5.14 illustrates the path of a temporary incoming particle of the local node. The
particle labeled B starts on the neighbor node in cell number 7 and walks across the
boundary into the node 1 shadow cell, becoming an incoming particle of node 1 in
shadow cell number 2; the incoming particle is labeled B'. The incoming particle
moves within the shadow cell and can come very close to a free receptor in shadow
cell number 2 without getting stuck. By the end of the epoch it returns to node 2 and
ends at a new location; the returned particle is again labeled B.
A temporary incoming particle can be processed differently in the sequential and
parallel simulations. In the sequential simulation, this particle is simply a free particle
on node 2 with a new location. In the parallel simulation this is not always the case:
the particle may become a type II tentatively stuck particle while it moves around in
the shadow cell. If that happens, the tentatively stuck particle is shuttled to node 1 at
the end of the epoch. After it is released by the conflict resolution, it stays on node 1
as a free particle and is processed by node 1 in the next epoch. The particle is then
processed on different nodes, and from that point on the order of processing this
particle differs between the two systems. As mentioned in the previous section, the
sequential and parallel simulations can produce different results when particles are
processed in different orders.
Figure 5.14 A temporary incoming particle on the local node
To handle this mismatch, we define a rule: if, during an epoch, a particle travels
across into a shadow cell and at least once comes near a free receptor in a cell
segment, within a predefined distance (for example, 5 micro steps), and at the end of
the epoch the particle travels back to its original node, then this particle is considered
an incoming particle and must be migrated to the neighbor.
We apply this rule in both the sequential and parallel simulations to ensure that
the particle is simulated in the same way. Temporary incoming particles are treated
the same as other newly migrated particles on the node. What distinguishes them from
the other migrating particles is how they are identified as migrating particles during
the epoch. A temporary incoming particle is identified as a migrating particle when it
meets two conditions: (1) it comes close enough to a free receptor in the shadow cell
but is not captured during the epoch, and (2) it ends up on the node it started on at the
beginning of the epoch.
So in total there are four types of new particles entering a node at the end of an
epoch. The unpacking, sorting, and appending of the migrated particles described in
this section preserve the order of processing the particles.
5.6 Conflict resolution algorithm
The conflict resolution is part of the synchronization process and is executed at the
end of each epoch. Figure 5.15 displays the pseudo code of the conflict resolution.
Lines 18–27 process and confirm the stuck particles and set a particle free if it is not a
stuck particle. Line 11 calls a function that re-calculates the just-freed particle and
moves it to its next location until the end of the epoch. If the particle becomes stuck
during this calculation, the function returns a stuck state to tagParts.state, and the
particle is treated as a tentatively stuck particle again and continues to be processed.
 1  processTentativeStuckParts() {
 2      n: current iteration
 3      el: epoch length
 4      for (i from (n - el) to n) {
 5          // use the data saved at the beginning of the epoch
 6          moveCellReceptors(i);
 7          for (j from 0 to tagPart_len) {
 8              if (tagParts[j] was set to a free particle in line 25) {
 9                  // recalculate the next movement as a free particle;
10                  // returns 1 when it becomes stuck
11                  tagParts[j].state = re-calculate unstuck particles;
12              }
13          }
14          if (all tagParts have been processed, set at line 30) {
15              continue;
16          }
17          if (i == tagParts[tagI].iteration) {
18              while (i == tagParts[tagI].iteration and tagI < tagPart_len) {
19                  calculate the degradation;
20                  calculate the capturing;
21                  if (captured) {
22                      update cell map;
23                  }
24                  else {
25                      set the tagPart to a free particle and move one step;
26                  }
27                  tagI++;
28              }
29              if (tagI >= tagPart_len) {
30                  mark all tagParts as processed;
31              }
32          }
33          update the current node cell map;
34      }
35  }
Figure 5.15 Pseudo code of the conflict resolution
Algorithm with less frequent exchange
The algorithm for the parallel simulation with a variable granularity of data-exchange
delay is listed below as Algorithm 5.1.
Algorithm 5.1 Parallel simulation procedures performed by task Messengers
Procedure MAIN
    Open output files
    Initiate Messenger shuttle events
    While not (end of simulation) do
        If at epoch then
            Wait for shuttle events
            LOAD_TAG_SHUTTLE
            PROCESS_TAG_PARTS
            For all tag parts
                Add unconfirmed tentatively stuck particles to the migrating particle list
                Add the random number structure to node variable(s)
            End For
            Sort node particles migrated from left/right neighbor by particle index
            Add migrating particles to the node particle list
            Reset shuttle variable(s)
            UPDATE_BIN_STUCK_PARTS
            Write particles to the open files
        End If
        MOVE_RECEPTORS
        DEGRADATE_STUCK_PARTS
        COMPUTE_PARTS
        Advance by 1 step
        If at epoch then
            LOAD_MIGRATION_PARTS
            UPDATE_PARTS
            Inject Messenger shuttles
            Wait for the signals from shuttles hopped over from neighbors
        End If
    End While
    Close output files
End MAIN

Procedure LOAD_TAG_SHUTTLE
    Load tag parts of left shuttle to node variables
    Load tag parts of right shuttle to node variables
    Sort tag parts by tag particle index
    Sort random number structure by tag particle index
End LOAD_TAG_SHUTTLE

Procedure PROCESS_TAG_PARTS
    Refer to the pseudo code in Figure 5.15
End PROCESS_TAG_PARTS

Procedure UPDATE_BIN_STUCK_PARTS
    For (all cell segments on local node)
        Update tagPart's # of free/stuck receptors with the local node value
        Increase the # of free receptors in the neighbor cell segments by 1
    End For
End UPDATE_BIN_STUCK_PARTS

Procedure MOVE_RECEPTORS
    For (all cell segments on local node and neighbor nodes)
        If integer(50% of free or occupied receptors in the segment) >= 1
            Move them to the neighbor segments
            Update the total number of free and occupied receptors
        End If
    End For
End MOVE_RECEPTORS

Procedure DEGRADATE_STUCK_PARTS
    For (all cell segments on local node and neighbor nodes)
        Calculate the total # of degrading particles accumulated in the iteration
        If the total # of degrading particles >= 1 then
            Add it to tagPart as a degradation particle
            Update the # of free and occupied receptors
        End If
    End For
End DEGRADATE_STUCK_PARTS

Procedure COMPUTE_PARTS
    Define: S = the particle state (0: free particle; 1: local stuck; 2: tagPart stuck)
            R = the particle release state (1: particle stuck, needs to be released)
    For (all particles)
        If the particle is a freed particle or out of the right boundary of the virtual space then
            Set particle state to freed
            Go to next particle
        End If
        Copy random number sequence structure from node variable
        Generate three random numbers
        IS_PARTICLE_STUCK
        If S = 1 or S = 2 then
            R = 1
            S = 0
            Add the particle to tagPart
            Copy the random sequence structure to tagPart's
            Increase # of tagPart by 1
        Else
            Generate 2 random numbers
            MOVE_PART
            Update the particle location
            Copy the random sequence structure to node variable
        End If
    End For
    Return the # of tagPart
End COMPUTE_PARTS

Procedure IS_PARTICLE_STUCK
    Define: D = the distance between the particle and the cell segment
            FS = the number of steps the particle takes to reach the cell segment
            RS = the number of steps the particle takes to return to the cell segment
    Find the particle location area in the grid
    Calculate D, FS and RS
    If D > FS + RS
        If the particle ends in the neighbor area then
            If the particle is within 5 steps of a cell segment then
                Set the particle to be an invaded particle
            End If
            Get the # of free receptors from the neighbor cell segment
        End If
        Get the # of free receptors from the local cell segment
        Calculate whether the particle is stuck
        If it is a stuck particle then
            Update the cell segment
            Add it to tagPart
        End If
    End If
End IS_PARTICLE_STUCK

Procedure MOVE_PART
    Calculate the direction and distance to the next location of the particle (e3.2)
    If the particle hits the cell segment, it bounces back to the grid area
    Update the particle with the next location
End MOVE_PART

Procedure LOAD_MIGRATION_PARTS
    For (all particles)
        If the particle is a crossing particle to the neighbor then
            Load the crossing particle to the respective shuttle
            Load the corresponding random number sequence structure to the shuttle
            Set particle state to released
        End If
    End For
    UPDATE_PARTS
End LOAD_MIGRATION_PARTS

Procedure UPDATE_PARTS
    For (all particles)
        If the particle is to be released
            Recycle the particle to be reused to store the next new particle
        End If
    End For
End UPDATE_PARTS
5.7 Correctness of distributed simulation
Most distributed individual-based simulations are developed for parallel execution
from the beginning, but many simulation models are designed for sequential
simulation. Converting such a sequential simulation model into a parallel structured
simulation to obtain a better speedup is not a trivial task. Some interesting issues in
such a conversion are addressed by Bajaj et al. [BBM99] in their case study. They
focus on the process of converting a sequential model to a parallel implementation
using the PARSEC programming language, which supports both sequential and
parallel simulation algorithms, and they identify the areas of the initial sequential
simulation model that need to be changed to make it suitable for parallel simulation.
The approach works for models that can accommodate such changes without losing
the requirements of the original model. However, their study focused on the
conversion process rather than on comparing and validating the simulation results of
the two simulation structures.
We have developed the protocols used in the distributed individual-based
simulation. These protocols provide a set of rules governing the computation and
conflict resolution to produce exactly the same results as the sequential simulation.
In the next chapter we discuss the performance at varying levels of
communication granularity and the trade-off between simulation speedup and result
accuracy between the sequential and parallel simulations.
6. Chapter 6
Experimental Assessment
We have developed the protocols used in the distributed individual-based simulation.
These protocols provide a set of rules governing the computation and conflict
resolution to produce exactly the same results as the sequential simulation. The
assessment we present in this chapter covers (1) the performance evaluation, (2) the
capability and scalability of the distributed simulation system, and (3) the system
steady state.
6.1 Performance evaluation
To evaluate the performance of a parallel system, the most commonly used metric is
speedup, which captures the benefit of solving a given problem using a parallel
system. In general, speedup is defined as the ratio of the time needed to solve the
problem on a single processor to the time used to simulate the same problem on
multiple processors. However, speedup in our simulation cannot be evaluated as
straightforwardly as this definition suggests: the granularity of communication
between processors is a significant factor in the speedup.
In this section we evaluate the simulation performance by running experiments
on problems of different sizes. We then study the factors that affect the performance
of the simulation and explain the experimental results. The parameters used in the
experiments are listed in Table 6.1.
Cell specification
    Cell size                                                           10µm by 10µm
    Number of cell segments per cell                                    20
    Cell segment size                                                   2µm
    Cell alley (between cells)                                          1µm
    Number of receptors per cell segment                                20
    Receptor radius                                                     5nm
Particle diffusion parameters
    Time step                                                           1e-5 second
    Distance step                                                       N(0, 0.044µm)
Other parameters
    Free receptor moving rate (between cell segments, per iteration)    0.5
    Stuck receptor moving rate (between cell segments, per iteration)   0.5
    Stuck particle degradation rate (per iteration, per cell segment)   5e-7

Table 6.1 Parameter set for experiments
6.1.1 Experiment 1
The simulated space in this experiment consists of 50 cells organized in 5 rows with
10 cells in each row. New particles enter the system from the left boundary of the
simulated space at a rate of 0.001 particles per iteration per cell row. The experiment
runs for 1,000,000 iterations, so we simulate a total of 5000 particles.
Figure 6.1 Execution time of experiment 1 at different epoch lengths
In the sequential simulation, the experiment runs on one node. In the parallel
simulation, the simulated space is mapped to 5 nodes, one cell row per node. We
apply the parallel simulation protocols discussed in Chapter 5 to this experiment and
run it at epoch lengths of 10, 100, 200, 500, 1000, 2000, 3000 and 5000. Figure 6.1
shows the execution times of the sequential and parallel simulations (labeled 5 nodes)
at each epoch length. The execution time of the sequential simulation is 1303
seconds. In the parallel simulation, the execution time varies with the length of the
epoch. The performance of the parallel simulation is worse than the sequential
simulation when the epoch length is less than or equal to 10. The execution time is
then reduced significantly as the epoch length extends from 10 to 1000; beyond 1000
it does not change much, because when the epoch is relatively long, the system
synchronization and the conflict resolution process start to take longer to finish.
Figure 6.2 Speedup of experiment 1 at different epoch lengths
Figure 6.2 shows the speedup of the experiment. The speedup S_p of this
experiment is calculated as

    S_p = T / T_p

where p is the number of processors, T is the execution time of the sequential
simulation, and T_p is the execution time of the parallel simulation with p processors.
The parallel simulation on 5 nodes gains more speedup as the epoch length increases.
When the epoch length stretches to 5000, the speedup is 2.62. The simulation
achieves a good speedup in this experiment.
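As a quick check of the reported numbers (the function name below is ours), the speedup formula implies a parallel run time of roughly 1303 / 2.62 ≈ 497 seconds at epoch length 5000:

```python
def speedup(t_sequential, t_parallel):
    """S_p = T / T_p, the metric used throughout this chapter."""
    return t_sequential / t_parallel

# Experiment 1 reports T = 1303 s and S_p = 2.62 at epoch length 5000,
# implying a parallel execution time of about 1303 / 2.62 ≈ 497 s.
t_parallel_est = 1303 / 2.62
```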
6.1.2 Experiment 2
The parallel simulation gains a speedup in experiment 1, but the speedup is not linear
or close to linear. In this experiment we increase the problem size by increasing the
number of particles to be simulated. The experiment has the same simulated space as
experiment 1 but more particles: 0.004 particles per iteration per cell row enter the
system, 4 times more than in experiment 1. We run the simulation for the same set of
epoch lengths as in experiment 1.
Figure 6.3 shows the execution times. The execution time of the sequential
simulation is 9558 seconds. The parallel simulation (labeled 5 nodes) gains speedup
as the epoch length increases, consistent with experiment 1. However, experiment 2
obtains more speedup than experiment 1; Figure 6.4 compares the speedups of the
two simulations. The maximum speedup of experiment 2 is 4.42, which is close to a
linear speedup. This result shows that the problem has to be big enough to gain a
better performance.
Figure 6.3 Execution time of experiment 2 at different epoch lengths
Figure 6.4 Speedup of experiment 1 and 2 on 5 nodes
Before the system reaches a steady state, as new particles are added, the
number of free particles in the system increases as well. This is because the free
receptors are limited to a fixed number, 20 per cell segment in this case. When free
receptors become occupied by capturing particles, no more free particles can be captured
until the occupied receptors become free again through degradation of stuck particles.
With more free particles in the system, the simulation spends more time on calculation.
The sequential simulation takes about 7 times longer in experiment 2 than in experiment 1.
However, the parallel simulation handles this better: because the calculation is
distributed over the nodes of the parallel system, the parallelization efficiency increases.
Therefore the performance of the parallel simulation improves when simulating a bigger
problem.
6.1.3 Experiment 3
In this experiment, we increase the simulated space to 100 cells, organized as 10 rows
with 10 cells in each row. We map one cell row to one node, so 10 nodes are used; the
workload is balanced by rows of cells. New particles enter the system at a rate of
0.001 per iteration per cell row. A total of 10,000 particles are simulated in this
experiment. As in the earlier experiments, we run the experiment at epoch lengths of
10, 100, 200, 500, 1000, 2000, 3000, and 5000.
Figure 6.5 shows the execution time of the sequential simulation and the parallel
simulation (labeled 10 nodes). The execution time of the sequential simulation is
2625 seconds. In the parallel simulation, at an epoch length of 10, the execution time
exceeds that of the sequential simulation. The parallel simulation gains more speedup
as the epoch length grows, consistent with the earlier experiments.
Figure 6.5 Execution time of experiment 3 at different epoch lengths
Figure 6.6 Speedup of experiment 3 at different epoch lengths
Figure 6.6 shows the speedup of experiment 3. The best speedup is 3.5, at
an epoch length of 5000. Although the workload on each node is the same as in ex-
periment 1, this experiment achieves better performance relative to the sequential simu-
lation. This is because, when we simulate a big problem with more cell rows involved,
the sequential simulation takes more time to process the particles on those cell rows.
The parallel structure, however, can add more nodes to handle more cell rows. Because
the workload on each node is balanced by cell rows, the execution time does not increase
when more nodes are added to the system. The only extra time required is the overhead
spent on synchronization among the neighbor nodes in a larger network.
6.1.4 Experiment 4
Experiment 3 gives a good speedup when simulating more cells; the best speedup is
3.5. In this experiment, we further increase the problem size to see if we can get a
speedup close to linear. We use the same simulated space as in experiment 3 and
the same set of 10 nodes to map the problem. The problem size is increased by simulat-
ing more particles: we increase the particle entry rate to 0.004 particles per iteration per
cell row. There are 40,000 particles simulated in this experiment, 4 times more than
in experiment 3.
Figure 6.7 illustrates the execution time for the sequential simulation and the
parallel simulation on 10 nodes at epoch lengths of 10, 100, 200, 500, 1000,
2000, 3000, and 5000. The execution time of the sequential simulation is 20712 sec-
onds. In the parallel simulation, at an epoch length of 10, the execution time exceeds
that of the sequential simulation. The parallel simulation gains more speedup as the epoch
length grows, consistent with experiment 3. Figure 6.8 shows the
comparison of the speedups of experiments 3 and 4. The best speedup in experiment 4 is
8.47, at an epoch length of 5000. Experiment 4 obtains better performance by
simulating more particles, which is also consistent with what experiment 2 produced.
Figure 6.7 Execution time of experiment 4 at different epoch lengths
Figure 6.8 Speedup of experiment 3 and 4 on 10 nodes
6.1.5 Performance trade-offs
In these experiments we found that the simulation is scalable. When adding more nodes
or increasing the number of simulated particles in the system, the simulation gains better per-
formance with a significant speedup, especially as the epoch length stretches from 0
to 1000. The system continues to gain a little more speedup as the epoch length
increases further, but not much. In this section we compare the results produced by
the parallel simulation at each epoch length with the result produced by the se-
quential simulation, and find the epoch length that gives the best simulation re-
sults in terms of speedup and consistency.

We found that the result generated by the parallel simulation with an epoch
length less than or equal to 1000 is identical to the result produced by the sequential simula-
tion. When the length increases beyond 1000 iterations, some discrepancy starts to
occur. However, the simulation does not gain much more speedup with epoch
lengths longer than 1000, as observed in the experiments. This performance trade-
off evaluation is most relevant for applications that keep gaining speedup as the
epoch length grows.
Iterations         500     1000    2000    3000    5000
Accuracy           100%    100%    99.7%   95.8%   93.9%
5-node speedup     1.86    2.24    2.45    2.56    2.6
10-node speedup    2.74    3.16    3.42    3.48    3.5

Table 6.2 Comparison of the speedup and accuracy at different epoch lengths
Table 6.2 shows that as the epoch length gets longer, the accuracy decreases
while the speedup increases slightly. An accuracy of 100% means that the results from the
sequential simulation are exactly the same as the results produced by the parallel simula-
tion, including 1) the number of free particles, 2) the number of stuck particles, and 3)
the location of each free particle in the system. If the accuracy is not 100%, some
particles do not match between the results produced by the sequential and parallel
simulations in one or more of these three counts.
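The three-way match used in this accuracy definition can be sketched as a comparison routine. This is a hypothetical Python illustration; the record layout and field names are invented, not the dissertation's actual data structures.

```python
def results_match(seq, par):
    """True only if the sequential and parallel results agree on all three
    counts in the accuracy definition: 1) the number of free particles,
    2) the number of stuck particles, and 3) the location of each free
    particle (compared as an unordered set of coordinates)."""
    return (seq["free_count"] == par["free_count"]
            and seq["stuck_count"] == par["stuck_count"]
            and set(seq["free_locations"]) == set(par["free_locations"]))

seq = {"free_count": 3, "stuck_count": 2,
       "free_locations": [(1.0, 2.0), (4.5, 0.5), (9.0, 3.0)]}
par = {"free_count": 3, "stuck_count": 2,
       "free_locations": [(9.0, 3.0), (1.0, 2.0), (4.5, 0.5)]}
print(results_match(seq, par))  # True: same counts, same locations
```

Comparing locations as a set reflects that the two runs need not enumerate particles in the same order.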
The following particle behaviors cause the discrepancy when the epoch
length grows too long: 1) a particle is not stuck when it should be, and 2) a particle
moves across the boundary of the shadow node and fails to migrate to the correct
node. We can enforce new rules in the conflict resolution to correct these. For exam-
ple, we can overstate more free receptors to deal with issue 1, ensuring that all tenta-
tively stuck particles are captured during the epoch. However, the time spent on resolv-
ing these issues can outweigh the speedup the simulation obtains by resolving them.
We therefore conclude that if the simulation obtains a significant speedup from a longer
epoch, the new rules should be enforced to keep the simulation results consistent.
The system does not gain much speedup and starts to lose accuracy as the
epoch length grows beyond 1000 iterations. This inaccuracy introduces uncertainty
into the system and can cause more discrepancy in the rest of the simulation. Therefore
an epoch length of 1000 iterations is the recommended length to benefit most from
the parallelization while keeping the simulation results consistent.
6.2 System capability and scalability
The distributed simulation implementation makes it possible to simulate this particle
diffusion model, which could potentially run for a very long period of time, in a reason-
able time frame. The distributed simulation system is scalable and extensible to deal
with bigger or more complicated problems. As described in the experiments, the
simulation gains more speedup when simulating a bigger problem, either by running on
more nodes or by processing more particles.
The architecture of the distributed simulation allows different problem parame-
ters and new functionalities to be added as needed. This supports the flexibility of the
simulation implementation. Particle behaviors in the diffusion process can be ex-
amined by changing or adding parameters. In Chapter 7, we use this system to
simulate biological applications in case studies.
6.3 System steady state - stop criteria
The simulation model that we study in this work is a biological particle diffusion model,
which is a flow system - particles enter into the system constantly during the entire
simulation. A steady state of such a system is that the current observed behavior of the
system will continue in the future, even the particles still flow through the system.
However, from that point on, the system starts to produce repeating results; therefore,
there is no need to continue the simulation. This leads us to find out if our system has
105
such a steady state, and if it does, what it takes for the system to reach the steady state.
We did the following work to define the system steady state: 1) specify mean-
ingful criteria for determining the time at which the number of cells in each bin had
reached a steady state, and 2) quantify the mean time to reach a steady state for each
bin. Professor Dan Gillen of the ICS department at UCI came up with this idea and is
the main contributor to this work.
6.3.1 Determining the point of steady state
When each bin reaches a steady state, the system reaches a steady state. For
each bin, consider a piecewise linear regression model of the form:

    y_i = β0 + β1·min(t_i, c) + β2·max(t_i, c) + ε_i        (eq. 1)

where y_i represents the number of cells collected at time t_i, c represents a single
change-point for the piecewise linear term, and ε_i is a random error term. In this
case β1 denotes the slope of the number of captured particles at times less than c, and β2
denotes the slope of the number of captured particles at times greater than c. To deter-
mine the point of equilibrium, the regression model given in (eq. 1) was fit assuming
the value of c for each observed catchment time. This procedure produced a slope
estimate β̂2 corresponding to a change point at each observed catchment time. The point
of equilibrium was determined as the first time point at which the upper limit of a 95% confi-
dence interval for β2 (the slope of the line for times after c) had ruled out values greater
than 0. Intuitively, this means that we are looking for the first change point at which we
are confident that the first-order trend in the number of captured particles is no longer
positive.
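The change-point search described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the analysis code actually used: it fits eq. 1 by ordinary least squares for every candidate change point and returns the first one whose 95% upper confidence limit for the post-change slope β2 rules out positive values, using a normal-quantile approximation (1.96) in place of the exact t quantile.

```python
import numpy as np

def steady_state_time(t, y, z=1.96):
    """First candidate change point c for which the upper 95% confidence
    limit of the post-c slope (beta2 in eq. 1) rules out values > 0."""
    t, y = np.asarray(t, float), np.asarray(y, float)
    n = len(t)
    for c in t[1:-1]:                         # interior candidates only
        # Design matrix for y = b0 + b1*min(t, c) + b2*max(t, c) + e
        X = np.column_stack([np.ones(n), np.minimum(t, c), np.maximum(t, c)])
        beta, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
        dof = n - 3
        if dof <= 0 or rank < 3:
            continue
        resid = y - X @ beta
        sigma2 = float(resid @ resid) / dof   # residual variance
        se_b2 = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[2, 2])
        if beta[2] + z * se_b2 < 0:           # CI entirely below zero
            return float(c)
    return None

# Noise-free demo: counts rise linearly until t = 10, then drift slightly down.
t = list(range(20))
y = [ti if ti <= 10 else 10 - 0.01 * (ti - 10) for ti in t]
print(steady_state_time(t, y))  # 10.0
```

On real, noisy bin counts the exact t quantile and a robust variance estimate would be preferable; the structure of the search is the same.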
To apply this model to both free and stuck particles in each bin, we collected 10
datasets from 10 runs on 10 nodes. Each run 1) computes one cell row per node,
2) generates a unique random number sequence, and 3) runs for 60 million iterations
(10 minutes of biological time). We collect data every 500 iterations. The data shows
that the particles move around within the first 4 bins after a certain time, and few
particles appear in bin 5.

The procedure was performed for each of the 10 datasets for each of the
first 4 bins, resulting in a separate time to reach a steady state for each dataset.
6.3.2 Quantifying the mean time to reaching steady state
Now we present the mean time (and corresponding 95% confidence interval) to reach
a steady state for both free and stuck particles in each bin. Table 6.3 shows the re-
sults for stuck particles and Table 6.4 presents the results for free particles.

Figure 6.9 shows the histogram of the number of stuck particles in the 5 bins for one of the
simulation runs. The mean time to reach the steady state varies from bin
1 to bin 4. In bins 1 and 2, about 100% of the receptors are occupied when the steady
state is reached. In bin 3, about 91% of the receptors are full. In bin 4, less than half of
the receptors are occupied. Only a small number of receptors in bin 5 capture parti-
cles. After all 4 bins reach the steady state, the number of stuck particles keeps its
level: the system is in a steady state.
Bin    Mean        Lo. 95CI    Hi. 95CI
1      4025500     3850274     4200726
2      4028000     3950069     4105931
3      6933000     6833945     7032055
4      14675500    14058454    15292546

Table 6.3 Mean time with confidence interval results for stuck particles
Bin    Mean time   Lo. 95CI    Hi. 95CI
1      8128000     7581478     8674522
2      10148000    9380705     10915295
3      13130500    11465772    14795228
4      19563000    15970401    23155599

Table 6.4 Mean time with confidence interval results for free particles
Figure 6.10 shows the histogram charts of free particles for each of the 5
bins. As with stuck particles, free particles reach the steady state in each bin at a different
time and keep the number of free particles in each bin at a different level. The farthest
bin free particles travel through, in this case, is bin 5; the number of free particles in
bin 5 is at most 1 most of the time after the system reaches the steady state. Because the
number of free particles takes longer to reach the steady state than the number of stuck
particles, we say the system reaches a steady state at the time when the number of free
particles reaches the steady state in each bin.
Figure 6.9 Example of fitted piecewise linear models for one of the stuck-particle datasets (panels (a)-(e): bins 1-5)
Figure 6.10 Example of fitted piecewise linear models for one of the free-particle datasets (panels (a)-(e): bins 1-5)
7. Chapter 7
Biology Results Obtained From the Simulation
We have presented our distributed simulation system, which improves the performance of
simulation applications. One of our goals is to make this system useful in biological
application simulations. In this chapter we present a data analysis of the simulation re-
sults and use the simulation system in biological case studies.
7.1 Analysis of simulation output
In this section we present a statistical analysis of the simulation results. The output of
the simulation we analyze includes 1) the number of stuck particles in each bin and 2)
the number of free particles in each bin during the simulation. We want to find
out whether the variations in the number of stuck and free particles fit
a normal distribution and, if not, what the distribution is and how close it is to normal.
The assumption is that if the variation does not fit a normal
distribution, the problem cannot be well formulated mathematically, and a
computer simulation is necessary to better describe it.
In the following sections, we describe our approach to defining and calculating the
variations in the number of stuck particles and the number of free particles. To sim-
plify the calculation, we present only the data for bin 1 from one simulation run. The
same calculation and data analysis can be done for different simulation runs with altered
parameter sets and for each cell bin.
7.1.1 Variation of number of stuck particles
The dataset we measure is the histogram data of the number of stuck particles in bin 1,
collected every 500 iterations. We consider two measures of variabil-
ity: the range and the variation. We also run a statistical test of whether the variation
is normally distributed. The mean value used in the variation calculation is the average
over the 10 datasets that come from the 10 simulation runs. Figure 7.1 (a)
shows the average-value curve and the curve of the number of stuck particles from one of
the simulation runs for bin 1. Figure 7.1 (b) zooms in on part of Figure 7.1
(a) for a better look at the two curves.
(a) Average curve and number of stuck particles from one of the simulation runs
(b) Zoom in of two curves from part of (a)
Figure 7.1 Average value and number of stuck particles from one simulation in bin 1
Figure 7.2 10 time ranges (T1 – T10) in bin 1 of stuck particles
The curve of the number of stuck particles stabilizes as it approaches the
steady state. Knowing how much variation there is from the beginning of the simula-
tion to the time the steady state is reached can be very helpful: once we know the
variation, we can find out whether it is normally distributed. If it is nor-
mally distributed, the application can be mathematically modeled and calculated in-
stead of simulated.
We use time ranges to cluster the data and calculate the variation within each range.
The total time span in this calculation runs from the beginning of the simulation to the
time the simulation reaches the steady state. For example, for the number of stuck
particles in bin 1, the system is close to the steady state around iteration 4,000,000.
We define a time range of 500,000 iterations; Figure 7.2 shows how this divides the total
simulation time into 10 ranges (T1 to T10). Each time range contains 1000 data points,
collected every 500 iterations. We calculate the variation in
each time range as follows:
Let x_i represent the simulated data in a dataset, n be the size of the dataset (in
this case, n is 1000), x̄_i be the average value on the fitting curve, and i = 1, ..., n.

The absolute variation:

    Δx_i = x_i − x̄_i

The relative variation:

    ∂x_i = (x_i − x̄_i) / x̄_i
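The two measures can be written out directly. A minimal sketch in Python (the dissertation's implementation language is not stated), with x the data from one run and the second argument the 10-run average curve:

```python
def variations(x, avg):
    """Per-point absolute (dx_i = x_i - xbar_i) and relative
    (dx_i / xbar_i) variation against the average curve."""
    abs_var = [xi - mi for xi, mi in zip(x, avg)]
    rel_var = [(xi - mi) / mi for xi, mi in zip(x, avg)]
    return abs_var, rel_var

# Toy data: one run's stuck-particle counts vs. the 10-run average curve.
a, r = variations([102.0, 98.0, 105.0], [100.0, 100.0, 100.0])
print(a)  # [2.0, -2.0, 5.0]
print(r)  # [0.02, -0.02, 0.05]
```

In the actual analysis, x would be the 1000 points of one time range and avg the fitted average values at the same iterations.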
7.1.2 Shapiro-Wilk W test for variation
The same method used in Section 7.1.1 to calculate the variation in the number of stuck
particles is applied to the number of free particles. To simplify the process, we
again show only the datasets for bin 1.

In total, four sets of data are run through the Shapiro-Wilk W test, which tests the null
hypothesis that the absolute and relative variations came from a normally distributed popu-
lation. The time range for free particles is 1,000,000 iterations, be-
cause free particles take longer to reach the steady state. Figure 7.3 shows the 10 time
ranges for the number of free particles over the total simulation time of 10,000,000 iterations.
Figure 7.3 10 time ranges (T1 – T10) in bin 1 of free particles
              Number of stuck particles      Number of free particles
Time range    Absolute var.  Relative var.  Absolute var.  Relative var.
T1            < .0001        0.000          < .0001        0.000
T2            < .0001        < .0001        < .0001        < .0001
T3            < .0001        < .0001        < .0001        < .0001
T4            < .0009        < .0013        < .0023        < .0004
T5            < .0001        < .0001        < .0001        < .0001
T6            < .0001        < .0001        < .0569        < .0582
T7            < .0001        < .0001        < .1626        < .2151
T8            < .0002        < .0003        < .0001        < .0001
T9            < .0009        < .0011        < .0001        < .0001
T10           < .0001        < .0001        < .0008        < .0001

Table 7.1 p-values produced by the Shapiro-Wilk W test for bin 1 output
A small p-value (< 0.05) rejects the null hypothesis. Table
7.1 shows the p-values for each time range for the stuck and free particles in bin 1.
The results show that almost all the tests are rejected. The details of the test results are
listed in the Appendices. The tests were run at the UCI Center for Statistical Consulting,
Department of Statistics.

The same range-variation calculation and statistical test can be ap-
plied to the other bins for further analysis. The time range can also be varied when measuring
the output; with a different time range, the distribution may vary as well.
Figure 7.4 CV for the number of stuck particles in 10 time ranges
Figure 7.5 CV for the number of free particles in 10 time ranges
Coefficient of variation (CV)

We can measure the spread of the variations in each time range by calculating
the coefficient of variation. Because most of the variations do not come from a normal
distribution, we use the relative variation ∂x_i over the N data points in the CV calculation, i.e.

    CV = sqrt( (1/N) Σ (∂x_i)² )

Here N is 1000 for the stuck particles and 2000 for the free parti-
cles. Figure 7.4 shows the coefficient of variation for the number of stuck particles, and Fig-
ure 7.5 presents the coefficient of variation for the number of free particles.
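A minimal sketch of this calculation in Python. Note the formula is reconstructed from a garbled original as the root-mean-square of the relative variations ∂x_i; the square root is an assumed reading, consistent with the usual definition of a coefficient of variation.

```python
from math import sqrt

def coefficient_of_variation(rel_var):
    """CV = sqrt((1/N) * sum of (dx_i)^2): root-mean-square of the relative
    variations within one time range (the sqrt is an assumed reading)."""
    return sqrt(sum(d * d for d in rel_var) / len(rel_var))

print(round(coefficient_of_variation([0.02, -0.02, 0.05]), 4))  # 0.0332
```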
Simulated space                             50 cells, 5 rows by 10 columns

Cell specification
  Cell size                                 10µm by 10µm
  Number of cell segments per cell          20
  Cell segment size                         2µm
  Cell alley (between cells)                1µm
  Number of receptors per cell segment      20
  Receptor radius                           5nm

Particle diffusion parameters
  Time step                                 1e-5 second
  Distance step                             N(0, 0.044µm)

Other parameters
  Free receptor moving rate (between cell segments, per iteration)     0.5
  Stuck receptor moving rate (between cell segments, per iteration)    0.5
  Stuck particle degradation rate (per iteration, per cell segment)    5e-7
  Stuck particle release rate (per iteration, per cell segment)        1.0e-7

Table 7.2 Case study parameters
7.2 More experiments
We extend the capability of the simulation model to program various scenarios that
would be interested in a biology study. In this section we present two cases we simu-
lated by using this system and illustrate the results.
7.2.1 Case study 1: releasing stuck particles
In this case study, we release the stuck particles back into the system as new particles.
The released particle has same characteristics of a new particle. However, it entries into
the system at the location it gets released, instead of entering from the left boundary of
the simulated space. For example, if a stuck particle is released, it becomes a new par-
ticle and the entry location is at the middle point of the cell segment it was stuck to (see
Figure 7.6). The parameters we used in this case study are listed in table 7.2.
Figure 7.6 Stuck particles released back to system
The simulation runs for 40 million iterations, which equals 6.67 biological min-
utes. By releasing stuck particles, we now have two sources of new particles in-
jecting into the system. We observed that 1) the number of free particles keeps increasing
across the bins, 2) the particles move as far as the 8th bin, and 3) the system does not reach a
steady state by the end of the simulation. From this observation, we can predict that
when particles move across the right boundary of the simulation space and start
disappearing from the system, the system will eventually reach a steady state, in which
the rate of new particles entering the system matches the rate of particles degrading
out of it. Figure 7.7 shows the particle diffusion in the system at the end of the
simulation; each dot represents the location of a free particle. Figure 7.7 (a) shows the earlier
results without particles being released back into the system. Figure 7.7 (b) shows the re-
sults with particles released back into the simulated space as new particles.
(a) Without releasing stuck particles (b) With releasing stuck particles
Figure 7.7 Particles diffusion in the simulated space (6.67 Bio-minutes)
To simplify the data presentation, Figure 7.8 compares the stuck
particles and Figure 7.9 compares the free particles in the 8 bins of one cell
row. Curve 1 shows the number of particles without the release of particles,
and curve 2 shows the number of particles with the release feature. With
the release feature, particles can travel as far as bin 8, whereas in the simulation
where no stuck particles are released, particles do not travel beyond bin 5 and
the system reaches a steady state faster.
Figure 7.8 Number of stuck particles at the end of simulation (6.67 Bio-minutes)
Figure 7.9 Number of free particles at the end of simulation (6.67 Bio-minutes)
7.2.2 Case study 2: stuck particle crossing through cell
In this simulation, we do not release stuck particles in a certain rate as we did in the
case study 1 while keep all other parameters same. We release a stuck particle on the
cell wall 0 (cw 0), at every 10,000 iterations, which equals 0.1 bio-seconds. The re-
leased the particle re-enters the system from the other side of the cell. Figure 7.10 illus-
trates an example of such crossing cell release. The release rate can be adjusted by
simply changing value of release parameter of the simulation. For example, the number
of releasing particles be can calculated by a certain percentage of stuck particles.
Figure 7.10 A stuck particle released crossing the cell
Figure 7.11 shows the particle diffusion results at the end of 20 million itera-
tions, which equals 3.33 bio-minutes. Figure 7.11 (a) shows the free particle locations at
the end of the simulation without releasing stuck particles, and Figure 7.11 (b) shows the
simulation with stuck particles released periodically. With the stuck
particles released in this way, some free particles are able to travel as far as bin 9.
To simplify the data presentation, Figure 7.12 compares the number
of stuck particles in the 10 bins and Figure 7.13 compares the number of free parti-
cles in the 10 bins. Curve 1 illustrates the data from the simulation without releasing particles, and
curve 2 presents the data from this case study, in which the stuck particles are released periodi-
cally and move across the cells.
(a) Without releasing feature (b) Stuck particles released crossing cells
Figure 7.11 Particles diffusion at the end of 20 million iterations (3.33 Bio-minutes)
Figure 7.12 Number of stuck particles at the end of simulation (3.33 Bio-minutes)
Figure 7.13 Number of free particles at the end of simulation (3.33 Bio-minutes)
8. Chapter 8
Related Work
The research work on parallel and distributed simulation started in the 1970s and has
remained active since then. Distributed simulation technologies address issues
concerning the execution of simulation programs on a collection of computers that do
not share memory and are connected by an underlying communication network. Parallel
and distributed simulation systems can benefit many applications, including
individual-based applications [FBD98, MBD98].
The main goal of a distributed simulation system is to reduce execution time.
To achieve this goal, issues in developing such a system have long been discussed and
studied, such as problem decomposition, distributed virtual environments, time manage-
ment, synchronization, parallel algorithms, and simulation correctness.
Problem partitioning is a defining characteristic of a distributed system. A
partition is a logical boundary between portions of the problem or information, or a
physical boundary between groups of machine nodes. The purpose of partitioning is to
assign responsibility for some aspect of the problem to a specific processor, in order to
achieve maximum parallel efficiency. Obtaining good load balancing and
efficient communication between the processors are the main concerns in partitioning.
Two well-known methods for partitioning individual-based
models in distributed systems are the Lagrangian method and the Eulerian method
[CHM+94, FBD98, Mer98]. In general, the Lagrangian method assigns a fixed set of enti-
ties to each node in the distributed system. The Eulerian method divides the simulated
space and assigns a portion of it, together with the entities currently
located in that portion, to a node. We apply a hybrid of
these two methods in our problem partitioning. New particles are grouped based on
their entry location. The simulated space is divided into Eulerian horizontal strips, and
each node in the distributed system is responsible for one or more
horizontal strips. Particles residing in one partition can migrate to another
periodically. Our partition provides an opportunity to maintain load balancing
throughout the simulation and a mapping structure that supports efficient
communication between the nodes in the distributed system.
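The Eulerian horizontal-strip mapping can be sketched as a simple coordinate-to-node function. This is an illustrative Python sketch with invented names and a made-up strip height, not the dissertation's code:

```python
def node_for_particle(y, strip_height, num_nodes):
    """Eulerian strip partition: each node owns one horizontal strip of the
    simulated space, and a particle belongs to the strip containing its
    y coordinate (top-edge particles are clamped to the last node)."""
    return min(int(y // strip_height), num_nodes - 1)

def must_migrate(y_old, y_new, strip_height, num_nodes):
    """A particle migrates when a move carries it into another node's strip."""
    return (node_for_particle(y_old, strip_height, num_nodes)
            != node_for_particle(y_new, strip_height, num_nodes))

# 5 cell rows, one row-high strip per node; 10.0 is an illustrative height.
STRIP, NODES = 10.0, 5
print(node_for_particle(27.5, STRIP, NODES))   # 2
print(must_migrate(27.5, 31.0, STRIP, NODES))  # True: strip 2 -> 3
```

Because a particle's owner is a pure function of its position, migration checks need no global lookup, which keeps inter-node communication local to strip neighbors.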
Research on distributed simulation systems has continued for decades, and the
techniques developed to resolve their issues have matured
over that time [NF92, Fuj99, Fuj01]. With the principal goal of reducing
execution time, the communication cost is a key problem that must be addressed. The
communication between nodes is managed by synchronization, whose goal is to
ensure that each node processes events in timestamp order. The
main synchronization techniques are described as conservative synchronization
and optimistic synchronization.
In conservative synchronization, each node keeps this timestamp order
precisely, so execution on a distributed simulation system produces
exactly the same results as execution on a sequential simulation system. In contrast
to the conservative approach, optimistic synchronization allows events to be
processed concurrently. Concurrently processed events may create conflicts; how-
ever, optimistic synchronization is able to detect and recover from them.
Implementations of synchronization mechanisms in a distributed simulation system
generally rely on application-model-specific information. What must be avoided is the commu-
nication overhead of the synchronization itself. Several research papers [OHS91, Fer95,
LPL93, Fuj99] contain the basic ideas of synchronization techniques, alternative schemes
for different models and applications, and solutions that ultimately reduce the execution
time of a distributed simulation system.
The synchronization mechanism we use in our distributed simulation system is
an alternative form of optimistic synchronization. We introduce the epoch as the time interval
at which synchronization occurs; the length of an epoch is defined by a number of iterations,
which are the time steps of the simulation. During an epoch, particles processed on each node
may create conflicts with particles on other nodes. We developed a conflict
resolution scheme that takes a snapshot of tentatively conflicting events and resolves the
conflicts by rolling back to the prior epoch and re-processing the particles that caused
them. This synchronization mechanism works efficiently: the distributed simulation
system obtains good speedup, sufficient correctness, and scalability.
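A highly simplified sketch of this epoch scheme, in Python with invented state and callback names: the state is snapshotted at the start of an epoch, every node advances optimistically for the whole epoch, and if the synchronization point detects conflicts, the state is rolled back to the snapshot and the conflicting work is re-processed.

```python
import copy

def run_epoch(state, advance, detect_conflicts, resolve):
    """One epoch of the optimistic scheme: snapshot, advance, then either
    commit the epoch or roll back to the snapshot and re-process."""
    snapshot = copy.deepcopy(state)       # taken at the start of the epoch
    advance(state)                        # each node runs the whole epoch
    conflicts = detect_conflicts(state)   # checked at the sync point
    if conflicts:
        state.clear()
        state.update(snapshot)            # roll back to the prior epoch
        resolve(state, conflicts)         # re-process conflicting particles
    return state

# Toy demo: two "nodes" both try to capture the same receptor.
state = {"receptor_free": True, "captured_by": []}

def advance(s):
    s["captured_by"] = ["node0", "node1"]  # both capture optimistically
    s["receptor_free"] = False

def detect(s):
    return s["captured_by"][1:] if len(s["captured_by"]) > 1 else []

def resolve(s, conflicts):
    s["captured_by"] = ["node0"]           # first claimant wins on replay
    s["receptor_free"] = False

print(run_epoch(state, advance, detect, resolve)["captured_by"])  # ['node0']
```

In the real system the snapshot, advance, and resolve steps operate on per-node particle and receptor data rather than a single dictionary, but the commit-or-rollback structure is the same.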
Distributed individual-based simulation systems can provide substantial benefit
to applications that are simulated in time steps and need a significantly long simulation
time to produce interesting results. A computer simulation system may also be useful for
simulating models that are formulated mathematically [BKP+05, BKP+06,
BPK+08]. The simulation can present the model in a virtual environment on a real bio-
clock time scale, and the results from the simulation can be used to validate and enhance the
model. Our distributed individual-based simulation system provides such an environ-
ment for the biological particle diffusion model. The system has the ability to adapt to
different case-study scenarios and is scalable to large problems.
9. Chapter 9
Conclusions
We have described a distributed individual-based simulation system that allows large-scale
simulation while preserving consistency between the results of the sequential simulation
and the parallel simulation. The research work starts with a biology application model
and focuses on developing a system in which a computer can help study
and understand such applications. The contributions of this work are listed in the next section.
9.1 Contributions
The main contribution of this dissertation is an approach to performing compu-
tationally intensive individual-based simulation. We investigated issues and provided
solutions in the following areas:
• Model simplification. We simplified a basic simulation model to use macro
time steps instead of micro time steps by replacing the random walk with Gaus-
sian distribution and formulate the basic micro simulation.
• Parallelization to improve performance.
  • We studied the characteristics of the simulation model and the available
    choices of parallelism. We investigated methods of problem partitioning
    and chose and implemented Eulerian horizontal-strip mapping for the
    parallel simulation implementation.
  • We defined a level of granularity for communication delay to reduce
    communication frequency and to speed up the simulation.
  • We developed a set of rules and protocols and a conflict-resolution scheme
    that preserve parallelism while keeping the parallel simulation results
    consistent with the sequential simulation. The trade-off between speedup
    and consistency was investigated and presented.
• Performance evaluation. We evaluated system performance by measuring the
execution speedup and the accuracy of the parallel simulation results relative to
the sequential simulation. The parallel implementation allows large-scale
simulation while preserving correctness.
• A tool for biology applications. Biological datasets are used in the case study.
The simulation results are presented from a biological point of view and can be
used in the analysis and study of molecular diffusion in a virtual intercellular
space. The simulation system is flexible enough to run different biological
datasets by varying the system parameters.
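The model-simplification contribution rests on the central limit theorem: the sum of many independent micro random-walk steps is approximately Gaussian, so one macro time step can be drawn directly from a normal distribution instead of simulating every micro step. The following one-dimensional sketch illustrates the idea only; it is not the dissertation's code, and the function names and step sizes are illustrative assumptions.

```python
import random
import statistics

random.seed(1)

def micro_walk(n_steps, step=1.0):
    """Sum n_steps micro moves of +/- step (the original random walk)."""
    return sum(random.choice((-step, step)) for _ in range(n_steps))

def macro_step(n_steps, step=1.0):
    """Replace n_steps micro moves with a single Gaussian draw that has the
    same mean (0) and variance (n_steps * step**2), per the central limit
    theorem -- one macro time step instead of many micro time steps."""
    return random.gauss(0.0, step * n_steps ** 0.5)

# Both approaches produce displacements with matching first two moments.
samples_micro = [micro_walk(100) for _ in range(10000)]
samples_macro = [macro_step(100) for _ in range(10000)]
```

The macro version needs one random draw per particle per macro step, which is what makes the simplified model computationally tractable for long simulated times.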
9.2 Future work
The work presented in this dissertation can be continued in the following areas.
Improvement of the conflict resolution scheme
We use conflict resolution to solve the problem caused by communication
delay and to ensure consistency of the simulation results. One of the rules we use in
conflict resolution is to overstate the number of receptors in shadow cells at the begin-
ning of an epoch. An interesting problem is to develop a dynamic overstating mechanism
to further improve the performance and the consistency between the sequential and
parallel simulations. For example, when many free receptors are available, a small
overstatement could be used; when the number of stuck particles in the system increases,
the overstatement could be increased, because more stuck particles can then degrade
during the epoch and free their receptors. Adjusting the amount of overstatement can
make this rule operate more accurately and efficiently.
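A dynamic overstating rule of the kind proposed above could take the following shape. This is a hypothetical sketch, not part of the implemented system; the function name, the scaling factor, and the cap are all illustrative assumptions.

```python
def dynamic_overstatement(free_receptors, stuck_particles,
                          base=1, scale=0.05, cap=20):
    """Hypothetical rule: overstate shadow-cell receptor counts more
    aggressively as stuck particles accumulate (more of them may degrade
    and free receptors during the epoch), and keep the overstatement small
    when many receptors are already free. All parameters are assumptions."""
    if free_receptors > stuck_particles:
        return base                       # plenty of free receptors: small overstatement
    extra = int(scale * stuck_particles)  # grow with the stuck-particle population
    return min(base + extra, cap)         # bounded, to limit rollback exposure
```

Bounding the overstatement matters: overstating too far increases the chance that a conflict is detected at the end of the epoch and triggers a rollback.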
The length of an epoch affects the simulation performance and consistency. More
work can be done to develop a dynamic mechanism that varies the granularity of commu-
nication and further investigates the trade-offs between performance and accuracy. For
example, the epoch length can be increased when conflicts occur less frequently, and it
can be adjusted in concert with the other rules applied to the system.
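The epoch-length adaptation described above could be realized as a simple feedback controller. The sketch below is hypothetical: the target conflict rate and the doubling/halving policy are assumptions, not measured values from the dissertation.

```python
def adjust_epoch_length(current_len, conflicts, particles_processed,
                        target_rate=0.01, min_len=1, max_len=64):
    """Hypothetical controller: lengthen the epoch when conflicts are rare
    (fewer synchronization messages), shorten it when the observed conflict
    rate exceeds a target (cheaper rollbacks). Thresholds are illustrative."""
    rate = conflicts / max(particles_processed, 1)
    if rate < target_rate / 2:
        current_len *= 2         # few conflicts: coarser communication granularity
    elif rate > target_rate:
        current_len //= 2        # many conflicts: finer granularity
    return max(min_len, min(current_len, max_len))
```

Clamping the length keeps the controller from oscillating into degenerate settings (an epoch of zero steps, or one so long that every rollback is expensive).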
Usability evaluation
The simulation system we developed can be used as a tool for simulating bio-
logical applications. We have run the simulation in a biology case study to produce data
for biological analysis, adding new particle events and changing the parameter sets.
However, more case studies are needed to evaluate the system's usability and
extensibility.
10. Bibliography
[BBM99] L. Bajaj, R. Bagrodia, and R. Mayer. Case Study: Parallelizing a Sequential Simulation Model. Computer Science Department, University of California, Los Angeles, 1999.
[Ber93] H. C. Berg. Random Walks in Biology. Expanded Edition. Princeton University Press, 1993.
[BFD96] L. F. Bic, M. Fukuda, and M. B. Dillencourt. Distributed Computing using Autonomous Objects. IEEE Computer, 29(8), 1996.
[BKP+05] T. Bollenbach, K. Kruse, P. Pantazis, M. Gonzalez-Gaitan, and F. Julicher. Robust formation of morphogen gradients. Physical Review Letters, 94, 018103, 2005.
[BKP+06] T. Bollenbach, K. Kruse, P. Pantazis, M. Gonzalez-Gaitan, and F. Julicher. Morphogen Transport in Epithelia. arXiv:q-bio/0609011v1 [q-bio.OT], 2006.
[BPK+08] T. Bollenbach, P. Pantazis, A. Kicheva, C. Bokel, M. Gonzalez-Gaitan, and F. Julicher. Precision of the Dpp Gradient. Development, 135(6), pages 1137-1146, 2008.
[CHM+94] T. W. Clark, R. v. Hanxleden, J. A. McCammon, and L. R. Scott. Parallelizing Molecular Dynamics using Spatial Decomposition. Proceedings of the IEEE Scalable High-Performance Computing Conference, pages 95-102, 1994.
[FBD98] M. Fukuda, L. F. Bic, and M. B. Dillencourt. Distributed Individual-Based Simulation Using Autonomous Objects. Technical Report 97-46, Department of Information and Computer Science, University of California, Irvine, 1998.
[FBD99] M. Fukuda, L. F. Bic, and M. B. Dillencourt. Messages versus messengers in distributed programming. Journal of Parallel and Distributed Computing, 57:188-211, 1999.
[FCH+08] M. Fukuda, C. Wicke, H. Kuang, E. Gendelman, K. Noguchi, and M. K. Lai. Messengers User's Manual, version 3.1.4. Department of Computer Science, Donald Bren School of Information and Computer Sciences, University of California, Irvine, 2008.
[Fel66] W. Feller. An Introduction to Probability Theory and Its Applications. Volume 1, Third Edition. John Wiley & Sons, Inc., New York, 1966.
[Fer95] A. Ferscha. Parallel and Distributed Simulation of Discrete Event Systems. In: Handbook of Parallel and Distributed Computing. McGraw-Hill, 1995.
[Fis95] P. A. Fishwick. Simulation Model Design and Execution (Building Digital Worlds). Prentice Hall International Series in Industrial and Systems Engineering, Prentice-Hall, Inc., New Jersey, 1995.
[FSW97] P. A. Fishwick, J. G. Sanderson, and W. F. Wolff. A Multimodeling Basis for Across-Trophic-Level Ecosystem Modeling: The Florida Everglades Example. Transactions of the Society for Computer Simulation International, 15(2), pages 76-89, 1998.
[Fuj99] R. M. Fujimoto. Parallel and Distributed Simulation. Proceedings of the 1999 Winter Simulation Conference, pages 122-131, 1999.
[Fuj99a] R. M. Fujimoto. Exploiting Temporal Uncertainty in Parallel and Distributed Simulations. Proceedings of the 13th Workshop on Parallel and Distributed Simulation, pages 46-53, 1999.
[Fuj01] R. M. Fujimoto. Parallel and Distributed Simulation Systems. Proceedings of the 2001 Winter Simulation Conference, pages 147-157, 2001.
[Fuk97] M. Fukuda. Messengers: A Distributed Computing System Based on Autonomous Objects. PhD Dissertation, Department of Information and Computer Science, University of California, Irvine, 1997.
[Gim02] H. R. Gimblett. Integrating Geographic Information Systems and Agent-based Modeling Techniques for Simulating Social and Ecological Processes. A volume in the Santa Fe Institute Studies in the Sciences of Complexity, Oxford University Press, 2002.
[HHM96] S. Hinckley, A. J. Hermann, and B. A. Megrey. Development of a spatially explicit, individual-based model of marine fish early life history. Marine Ecology Progress Series, Volume 139, pages 47-68, 1996.
[HM96] T. Hopkins and D. R. Morse. The implementation and visualization of a large spatial individual-based model using Fortran 90. Technical Report 18-96, Computing Laboratory, University of Kent, Canterbury, UK, 1996.
[HNP97] J. Hamilton, D. A. Nash, and U. W. Pooch. Distributed Simulation. Computer Science & Engineering, Volume 8, CRC Press, 1997.
[KBW99] J.-U. Kreft, G. Booth, and J. W. T. Wimpenny. Applications of individual-based modeling in microbial ecology. In Proceedings of the 8th International Symposium on Microbial Ecology, Atlantic Canada Society for Microbial Ecology, Halifax, Canada, 1999.
[LPL93] Y.-B. Lin, B. R. Preiss, and W. M. Loucks. Selecting the Checkpoint Interval in Time Warp Parallel Simulation. In Proceedings of the 7th Workshop on Parallel and Distributed Simulation, pages 3-10. IEEE Computer Society, 1993.
[MBD98] F. Merchant, L. Bic, and M. B. Dillencourt. Load Balancing in Individual-Based Spatial Applications. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT'98), pages 350-357, 1998.
[Mer98] F. Merchant. Load balancing in spatial individual-based systems using autonomous objects. PhD Dissertation, Department of Information and Computer Science, University of California, Irvine, 1998.
[MSC94] W. Maniatty, B. Szymanski, and T. Caraco. Implementation and Performance of Parallel Ecological Simulations. In Proc. Conf. Applications in Parallel and Distributed Computing, Caracas, Venezuela, April 1994. IFIP Transactions A-44, North Holland, Amsterdam, pages 93-102, 1994.
[NF92] D. Nicol and R. Fujimoto. Parallel Simulation Today. NASA Contract Nos. NAS1-18605 and NAS1-19480. In Annals of Operations Research, Institute for Computer Applications in Science and Engineering, NASA Langley Research Center, 1992.
[OHS91] B. Overeinder, B. Hertzberger, and P. Sloot. Parallel Discrete Event Simulation. In Third Workshop on Design and Realization of Computer Systems. http://www.science.uva.nl/research/scs/papers/byyear.html.
[PKC07] J. Plumert, J. Kearney, and J. Cremer. How Does Traffic Density Influence Cyclists' Gap Choices? International Conference on Road Safety and Simulation (RSS), Rome, Italy, 2007.
[RG05] B. Rashleigh and G. D. Grossman. An individual-based simulation for mottled sculpin in a southern Appalachian stream. Ecological Modelling, 187:247-258, 2005.
[Rob05] S. Robinson. Distributed simulation and simulation practice. Simulation, Volume 81, Number 1, 2005.
[TT96] Y. M. Teo and S. C. Tay. Performance analysis of parallel simulation on distributed systems. Distributed Systems Engineering, 3, pages 20-31. The British Computer Society, The Institution of Electrical Engineers and IOP Publishing Ltd, 1996.
[WH96] J. D. Westervelt and L. D. Hopkins. Facilitating mobile objects within the context of spatial landscape processes. NCGIA, Third International Conference/Workshop on Integrating GIS and Environmental Modeling, Santa Fe, NM, 1996.
Appendix 1
Shapiro-Wilk W test results for the number of stuck particles: absolute variation distributions for time ranges T1 to T10
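The tables in these appendices report JMP-style Shapiro-Wilk output (fitted normal parameters, moments, and the W statistic with its p-value). For readers who want to reproduce such a test on their own simulation logs, an equivalent check can be run with SciPy, assuming SciPy is available; the sample below is synthetic stand-in data, not the dissertation's dataset.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Stand-in for the per-run variation values read from the simulation logs.
sample = rng.normal(loc=0.28, scale=2.17, size=1000)

# Shapiro-Wilk: Ho is that the data are drawn from a normal distribution;
# a small p-value rejects Ho, matching the note under each table.
w_stat, p_value = stats.shapiro(sample)
```

With genuinely normal data, W is close to 1; the appendix values (W between 0.49 and 0.999) show how far each observed distribution departs from normality.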
T1 Distribution
[Normal quantile plot omitted]
Fitted Normal(0.2823, 2.16659); N = 1000; skewness 0.0980; kurtosis -0.5827; min -5.000; max 5.800
Shapiro-Wilk W test: W = 0.991890, Prob<W < .0001
Note: Ho = The data is from the Normal distribution. Small p-values reject Ho.
T2 Distribution
[Normal quantile plot omitted]
Fitted Normal(-1.3123, 2.6049); N = 1000; skewness -0.4210; kurtosis 0.0921; min -9.600; max 4.600
Shapiro-Wilk W test: W = 0.986497, Prob<W < .0001
T3 Distribution
[Normal quantile plot omitted]
Fitted Normal(0.2473, 1.57448); N = 1000; skewness 0.1276; kurtosis 0.3539; min -4.500; max 4.600
Shapiro-Wilk W test: W = 0.985826, Prob<W < .0001
T4 Distribution
[Normal quantile plot omitted]
Fitted Normal(-1.6343, 1.51794); N = 1000; skewness -0.1753; kurtosis -0.2177; min -5.900; max 2.100
Shapiro-Wilk W test: W = 0.994383, Prob<W = 0.0009
T5 Distribution
[Normal quantile plot omitted]
Fitted Normal(-1.0976, 1.90504); N = 1000; skewness 0.0482; kurtosis -0.6698; min -5.700; max 3.500
Shapiro-Wilk W test: W = 0.988991, Prob<W < .0001
T6 Distribution
[Normal quantile plot omitted]
Fitted Normal(0.1468, 1.91214); N = 1000; skewness -0.1116; kurtosis 0.2930; min -5.000; max 5.600
Shapiro-Wilk W test: W = 0.990576, Prob<W < .0001
T7 Distribution
[Normal quantile plot omitted]
Fitted Normal(-0.7542, 1.62597); N = 1000; skewness -0.1906; kurtosis -0.4946; min -5.400; max 2.900
Shapiro-Wilk W test: W = 0.988097, Prob<W < .0001
T8 Distribution
[Normal quantile plot omitted]
Fitted Normal(-0.7583, 1.68346); N = 1000; skewness -0.0952; kurtosis -0.4927; min -5.100; max 3.700
Shapiro-Wilk W test: W = 0.993359, Prob<W = 0.0002
T9 Distribution
[Normal quantile plot omitted]
Fitted Normal(-0.4849, 1.28836); N = 1000; skewness -0.0730; kurtosis -0.2644; min -3.700; max 3.200
Shapiro-Wilk W test: W = 0.994430, Prob<W = 0.0009
T10 Distribution
[Normal quantile plot omitted]
Fitted Normal(0.6512, 1.45349); N = 1000; skewness -0.2468; kurtosis 0.3717; min -4.500; max 4.700
Shapiro-Wilk W test: W = 0.992108, Prob<W < .0001
Appendix 2
Shapiro-Wilk W test results for the number of stuck particles: relative variation distributions for time ranges T1 to T10
T1 Distribution
[Distribution plot omitted]
Fitted Normal(-0.0006, 0.11194); N = 997 (3 missing); skewness -0.1908; kurtosis 33.9151; min -1.000; max 1.000
Shapiro-Wilk W test: W = 0.489990, Prob<W = 0.0000
Note: Ho = The data is from the Normal distribution. Small p-values reject Ho.
T2 Distribution
[Distribution plot omitted]
Fitted Normal(-0.0041, 0.00789); N = 1000; skewness -0.4379; kurtosis 0.1678; min -0.0290; max 0.0167
Shapiro-Wilk W test: W = 0.984724, Prob<W < .0001
T3 Distribution
[Distribution plot omitted]
Fitted Normal(0.00065, 0.00415); N = 1000; skewness 0.1206; kurtosis 0.3404; min -0.0119; max 0.0121
Shapiro-Wilk W test: W = 0.986340, Prob<W < .0001
T4 Distribution
[Distribution plot omitted]
Fitted Normal(-0.0042, 0.00394); N = 1000; skewness -0.1715; kurtosis -0.2192; min -0.0153; max 0.0055
Shapiro-Wilk W test: W = 0.994673, Prob<W = 0.0013
T5 Distribution
[Distribution plot omitted]
Fitted Normal(-0.0028, 0.00488); N = 1000; skewness 0.0435; kurtosis -0.6682; min -0.0146; max 0.0090
Shapiro-Wilk W test: W = 0.989184, Prob<W < .0001
T6 Distribution
[Distribution plot omitted]
Fitted Normal(0.00037, 0.00485); N = 1000; skewness -0.1065; kurtosis 0.2877; min -0.0127; max 0.0142
Shapiro-Wilk W test: W = 0.990664, Prob<W < .0001
T7 Distribution
[Distribution plot omitted]
Fitted Normal(-0.0019, 0.00412); N = 1000; skewness -0.1952; kurtosis -0.4927; min -0.0137; max 0.0073
Shapiro-Wilk W test: W = 0.988012, Prob<W < .0001
T8 Distribution
[Distribution plot omitted]
Fitted Normal(-0.0019, 0.00427); N = 1000; skewness -0.1018; kurtosis -0.4803; min -0.0130; max 0.0094
Shapiro-Wilk W test: W = 0.993555, Prob<W = 0.0003
T9 Distribution
[Distribution plot omitted]
Fitted Normal(-0.0012, 0.00326); N = 1000; skewness -0.0737; kurtosis -0.2599; min -0.0094; max 0.0081
Shapiro-Wilk W test: W = 0.994561, Prob<W = 0.0011
T10 Distribution
[Distribution plot omitted]
Fitted Normal(0.00165, 0.00368); N = 1000; skewness -0.2432; kurtosis 0.3816; min -0.0114; max 0.0120
Shapiro-Wilk W test: W = 0.992216, Prob<W < .0001
Appendix 3
Shapiro-Wilk W test results for the number of free particles: absolute variation distributions for time ranges T1 to T10
T1 Distribution
[Normal quantile plot omitted]
Fitted Normal(0.2968, 2.31378); N = 2000; skewness 0.3267; kurtosis 0.0235; min -5.600; max 8.000
Shapiro-Wilk W test: W = 0.992199, Prob<W < .0001
Note: Ho = The data is from the Normal distribution. Small p-values reject Ho.
T2 Distribution
[Normal quantile plot omitted]
Fitted Normal(-0.1145, 4.15515); N = 2000; skewness -0.1797; kurtosis 0.4234; min -14.90; max 13.50
Shapiro-Wilk W test: W = 0.988270, Prob<W < .0001
T3 Distribution
[Normal quantile plot omitted]
Fitted Normal(1.2164, 6.66349); N = 2000; skewness 0.4871; kurtosis -0.2343; min -14.70; max 23.00
Shapiro-Wilk W test: W = 0.978342, Prob<W < .0001
T4 Distribution
[Normal quantile plot omitted]
Fitted Normal(-1.095, 6.68336); N = 2000; skewness 0.0383; kurtosis -0.3452; min -22.00; max 20.80
Shapiro-Wilk W test: W = 0.997436, Prob<W = 0.0023
T5 Distribution
[Normal quantile plot omitted]
Fitted Normal(-14.28, 8.94581); N = 2000; skewness 0.3992; kurtosis -0.2813; min -38.40; max 16.10
Shapiro-Wilk W test: W = 0.984885, Prob<W < .0001
T6 Distribution
[Normal quantile plot omitted]
Fitted Normal(0.74645, 7.88901); N = 2000; skewness 0.0477; kurtosis -0.1174; min -25.30; max 24.10
Shapiro-Wilk W test: W = 0.998430, Prob<W = 0.0569
T7 Distribution
.001
.01
.05
.10
.25
.50
.75
.90
.95
.99
.999
-4
-3
-2
-1
0
1
2
3
4
Nor
mal
Qua
ntile
Plo
t
-30 -20 -10 0 10 20
Fitted Normal(-3.4707, 7.57891)
Quantiles: 100.0% (maximum) 20.90; 99.5% 16.30; 97.5% 11.70; 90.0% 5.80; 75.0% (quartile) 1.60; 50.0% (median) -3.40; 25.0% (quartile) -8.60; 10.0% -12.80; 2.5% -18.10; 0.5% -25.40; 0.0% (minimum) -31.40
Moments: Mean -3.47075; Std Dev 7.5789102; Std Err Mean 0.1694696; upper 95% Mean -3.138394; lower 95% Mean -3.803106; N 2000; Sum Wgt 2000; Sum -6941.5; Variance 57.439879; Skewness -0.046462; Kurtosis 0.1270792; CV -218.3652; N Missing 0
Fitted Normal Parameter Estimates: Location µ = -3.47075 (95% CI: -3.803106 to -3.138394); Dispersion σ = 7.5789102 (95% CI: 7.3511109 to 7.8213852)
Goodness-of-Fit (Shapiro-Wilk W Test): W = 0.998759, Prob<W = 0.1626
Note: Ho = The data is from the Normal distribution. Small p-values reject Ho.
T8 Distribution
[Normal quantile plot]
Fitted Normal(5.65015, 9.06994)
Quantiles: 100.0% (maximum) 38.10; 99.5% 27.80; 97.5% 23.60; 90.0% 17.90; 75.0% (quartile) 12.18; 50.0% (median) 5.20; 25.0% (quartile) -0.90; 10.0% -5.60; 2.5% -10.90; 0.5% -17.30; 0.0% (minimum) -26.50
Moments: Mean 5.65015; Std Dev 9.0699373; Std Err Mean 0.20281; upper 95% Mean 6.047891; lower 95% Mean 5.252409; N 2000; Sum Wgt 2000; Sum 11300.3; Variance 82.263762; Skewness 0.1123349; Kurtosis -0.294735; CV 160.5256; N Missing 0
Fitted Normal Parameter Estimates: Location µ = 5.65015 (95% CI: 5.252409 to 6.047891); Dispersion σ = 9.0699373 (95% CI: 8.7973221 to 9.3601153)
Goodness-of-Fit (Shapiro-Wilk W Test): W = 0.996043, Prob<W < .0001
Note: Ho = The data is from the Normal distribution. Small p-values reject Ho.
T9 Distribution
[Normal quantile plot]
Fitted Normal(-4.4903, 8.03024)
Quantiles: 100.0% (maximum) 22.10; 99.5% 16.40; 97.5% 11.70; 90.0% 6.30; 75.0% (quartile) 1.18; 50.0% (median) -4.90; 25.0% (quartile) -10.60; 10.0% -14.50; 2.5% -18.90; 0.5% -22.20; 0.0% (minimum) -29.10
Moments: Mean -4.4903; Std Dev 8.0302377; Std Err Mean 0.1795616; upper 95% Mean -4.138153; lower 95% Mean -4.842447; N 2000; Sum Wgt 2000; Sum -8980.6; Variance 64.484718; Skewness 0.209037; Kurtosis -0.32125; CV -178.8352; N Missing 0
Fitted Normal Parameter Estimates: Location µ = -4.4903 (95% CI: -4.842447 to -4.138153); Dispersion σ = 8.0302377 (95% CI: 7.7888729 to 8.2871523)
Goodness-of-Fit (Shapiro-Wilk W Test): W = 0.994541, Prob<W < .0001
Note: Ho = The data is from the Normal distribution. Small p-values reject Ho.
T10 Distribution
[Normal quantile plot]
Fitted Normal(-8.6058, 7.90431)
Quantiles: 100.0% (maximum) 15.50; 99.5% 10.70; 97.5% 6.60; 90.0% 1.40; 75.0% (quartile) -3.30; 50.0% (median) -8.30; 25.0% (quartile) -13.60; 10.0% -18.70; 2.5% -25.30; 0.5% -31.20; 0.0% (minimum) -36.00
Moments: Mean -8.60585; Std Dev 7.9043115; Std Err Mean 0.1767458; upper 95% Mean -8.259225; lower 95% Mean -8.952475; N 2000; Sum Wgt 2000; Sum -17211.7; Variance 62.47814; Skewness -0.193983; Kurtosis 0.1786541; CV -91.84812; N Missing 0
Fitted Normal Parameter Estimates: Location µ = -8.60585 (95% CI: -8.952475 to -8.259225); Dispersion σ = 7.9043115 (95% CI: 7.6667316 to 8.1571972)
Goodness-of-Fit (Shapiro-Wilk W Test): W = 0.997073, Prob<W = 0.0008
Note: Ho = The data is from the Normal distribution. Small p-values reject Ho.
Appendix 4
Shapiro-Wilk W test results for the relative-variation distribution of the number of free particles, over time ranges T1 through T10
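The "Moments" rows in the reports that follow (mean, standard deviation, skewness, kurtosis) can be checked independently of the statistics package that produced them. Below is a minimal Python sketch, offered only as an illustration: it runs on a synthetic normal sample, not the simulation data, and uses plain population-moment formulas for skewness and kurtosis (statistics packages usually apply small-sample bias corrections, so their values can differ slightly).

```python
import math
import random

def moments(xs):
    """Return mean, sample std dev, skewness, and excess kurtosis,
    mirroring the 'Moments' block of the distribution reports."""
    n = len(xs)
    mean = sum(xs) / n
    # Central moments of order 2, 3, and 4.
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    sd = math.sqrt(m2 * n / (n - 1))   # sample standard deviation
    skew = m3 / m2 ** 1.5              # population skewness
    kurt = m4 / m2 ** 2 - 3.0          # excess kurtosis (0 for a normal)
    return mean, sd, skew, kurt

# Synthetic standard-normal sample of the same size as the reports (N = 2000).
random.seed(42)
sample = [random.gauss(0.0, 1.0) for _ in range(2000)]
mean, sd, skew, kurt = moments(sample)
print(f"Mean {mean:.4f}  Std Dev {sd:.4f}  "
      f"Skewness {skew:.4f}  Kurtosis {kurt:.4f}")
```

For a true normal sample, skewness and excess kurtosis should both be near zero, which is what the Shapiro-Wilk tests below probe more formally.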
T1 Distribution
[Normal quantile plot]
Fitted Normal(0.01326, 0.09714)
Quantiles: 100.0% (maximum) 0.4130; 99.5% 0.3402; 97.5% 0.2281; 90.0% 0.1381; 75.0% (quartile) 0.0638; 50.0% (median) 0.0037; 25.0% (quartile) -0.0334; 10.0% -0.0997; 2.5% -0.1852; 0.5% -0.2857; 0.0% (minimum) -0.3590
Moments: Mean 0.0132613; Std Dev 0.0971411; Std Err Mean 0.0021721; upper 95% Mean 0.0175212; lower 95% Mean 0.0090014; N 2000; Sum Wgt 2000; Sum 26.522633; Variance 0.0094364; Skewness 0.2452791; Kurtosis 1.628284; CV 732.51496; N Missing 0
Fitted Normal Parameter Estimates: Location µ = 0.0132613 (95% CI: 0.0090014 to 0.0175212); Dispersion σ = 0.0971411 (95% CI: 0.0942214 to 0.100249)
Goodness-of-Fit (Shapiro-Wilk W Test): W = 0.969956, Prob<W = 0.0000
Note: Ho = The data is from the Normal distribution. Small p-values reject Ho.
T2 Distribution
[Normal quantile plot]
Fitted Normal(0.00026, 0.02463)
Quantiles: 100.0% (maximum) 0.0861; 99.5% 0.0664; 97.5% 0.0533; 90.0% 0.0299; 75.0% (quartile) 0.0143; 50.0% (median) 0.0011; 25.0% (quartile) -0.0142; 10.0% -0.0314; 2.5% -0.0508; 0.5% -0.0606; 0.0% (minimum) -0.0828
Moments: Mean 0.0002624; Std Dev 0.0246308; Std Err Mean 0.0005508; upper 95% Mean 0.0013426; lower 95% Mean -0.000818; N 2000; Sum Wgt 2000; Sum 0.5248922; Variance 0.0006067; Skewness 0.0587376; Kurtosis 0.4722138; CV 9385.1007; N Missing 0
Fitted Normal Parameter Estimates: Location µ = 0.0002624 (95% CI: -0.000818 to 0.0013426); Dispersion σ = 0.0246308 (95% CI: 0.0238905 to 0.0254189)
Goodness-of-Fit (Shapiro-Wilk W Test): W = 0.990876, Prob<W < .0001
Note: Ho = The data is from the Normal distribution. Small p-values reject Ho.
T3 Distribution
[Normal quantile plot]
Fitted Normal(0.00375, 0.02764)
Quantiles: 100.0% (maximum) 0.0906; 99.5% 0.0762; 97.5% 0.0615; 90.0% 0.0434; 75.0% (quartile) 0.0221; 50.0% (median) 0.00085; 25.0% (quartile) -0.0159; 10.0% -0.0302; 2.5% -0.0444; 0.5% -0.0571; 0.0% (minimum) -0.0701
Moments: Mean 0.003755; Std Dev 0.0276432; Std Err Mean 0.0006181; upper 95% Mean 0.0049672; lower 95% Mean 0.0025427; N 2000; Sum Wgt 2000; Sum 7.5099536; Variance 0.0007641; Skewness 0.290481; Kurtosis -0.3062; CV 736.17618; N Missing 0
Fitted Normal Parameter Estimates: Location µ = 0.003755 (95% CI: 0.0025427 to 0.0049672); Dispersion σ = 0.0276432 (95% CI: 0.0268124 to 0.0285276)
Goodness-of-Fit (Shapiro-Wilk W Test): W = 0.990514, Prob<W < .0001
Note: Ho = The data is from the Normal distribution. Small p-values reject Ho.
T4 Distribution
[Normal quantile plot]
Fitted Normal(-0.0035, 0.02233)
Quantiles: 100.0% (maximum) 0.0753; 99.5% 0.0554; 97.5% 0.0391; 90.0% 0.0261; 75.0% (quartile) 0.0126; 50.0% (median) -0.0035; 25.0% (quartile) -0.0197; 10.0% -0.0319; 2.5% -0.0455; 0.5% -0.0547; 0.0% (minimum) -0.0712
Moments: Mean -0.003452; Std Dev 0.0223334; Std Err Mean 0.0004994; upper 95% Mean -0.002473; lower 95% Mean -0.004431; N 2000; Sum Wgt 2000; Sum -6.903916; Variance 0.0004988; Skewness 0.0869692; Kurtosis -0.356845; CV -646.9788; N Missing 0
Fitted Normal Parameter Estimates: Location µ = -0.003452 (95% CI: -0.004431 to -0.002473); Dispersion σ = 0.0223334 (95% CI: 0.0216622 to 0.023048)
Goodness-of-Fit (Shapiro-Wilk W Test): W = 0.996842, Prob<W = 0.0004
Note: Ho = The data is from the Normal distribution. Small p-values reject Ho.
T5 Distribution
[Normal quantile plot]
Fitted Normal(-0.0437, 0.02746)
Quantiles: 100.0% (maximum) 0.0502; 99.5% 0.0277; 97.5% 0.0138; 90.0% -0.0044; 75.0% (quartile) -0.0250; 50.0% (median) -0.0470; 25.0% (quartile) -0.0639; 10.0% -0.0765; 2.5% -0.0901; 0.5% -0.1013; 0.0% (minimum) -0.1218
Moments: Mean -0.04371; Std Dev 0.0274649; Std Err Mean 0.0006141; upper 95% Mean -0.042506; lower 95% Mean -0.044915; N 2000; Sum Wgt 2000; Sum -87.4202; Variance 0.0007543; Skewness 0.384414; Kurtosis -0.268225; CV -62.83426; N Missing 0
Fitted Normal Parameter Estimates: Location µ = -0.04371 (95% CI: -0.044915 to -0.042506); Dispersion σ = 0.0274649 (95% CI: 0.0266394 to 0.0283436)
Goodness-of-Fit (Shapiro-Wilk W Test): W = 0.986192, Prob<W < .0001
Note: Ho = The data is from the Normal distribution. Small p-values reject Ho.
T6 Distribution
[Normal quantile plot]
Fitted Normal(0.00211, 0.02259)
Quantiles: 100.0% (maximum) 0.0691; 99.5% 0.0574; 97.5% 0.0467; 90.0% 0.0321; 75.0% (quartile) 0.0172; 50.0% (median) 0.0011; 25.0% (quartile) -0.0136; 10.0% -0.0259; 2.5% -0.0407; 0.5% -0.0554; 0.0% (minimum) -0.0708
Moments: Mean 0.0021096; Std Dev 0.0225929; Std Err Mean 0.0005052; upper 95% Mean 0.0031004; lower 95% Mean 0.0011189; N 2000; Sum Wgt 2000; Sum 4.2192877; Variance 0.0005104; Skewness 0.0524162; Kurtosis -0.153238; CV 1070.9336; N Missing 0
Fitted Normal Parameter Estimates: Location µ = 0.0021096 (95% CI: 0.0011189 to 0.0031004); Dispersion σ = 0.0225929 (95% CI: 0.0219138 to 0.0233157)
Goodness-of-Fit (Shapiro-Wilk W Test): W = 0.998437, Prob<W = 0.0582
Note: Ho = The data is from the Normal distribution. Small p-values reject Ho.
T7 Distribution
[Normal quantile plot]
Fitted Normal(-0.0095, 0.02097)
Quantiles: 100.0% (maximum) 0.0592; 99.5% 0.0450; 97.5% 0.0328; 90.0% 0.0161; 75.0% (quartile) 0.0044; 50.0% (median) -0.0094; 25.0% (quartile) -0.0238; 10.0% -0.0354; 2.5% -0.0498; 0.5% -0.0697; 0.0% (minimum) -0.0857
Moments: Mean -0.009536; Std Dev 0.0209743; Std Err Mean 0.000469; upper 95% Mean -0.008616; lower 95% Mean -0.010455; N 2000; Sum Wgt 2000; Sum -19.07102; Variance 0.0004399; Skewness -0.02056; Kurtosis 0.143328; CV -219.9601; N Missing 0
Fitted Normal Parameter Estimates: Location µ = -0.009536 (95% CI: -0.010455 to -0.008616); Dispersion σ = 0.0209743 (95% CI: 0.0203439 to 0.0216454)
Goodness-of-Fit (Shapiro-Wilk W Test): W = 0.998850, Prob<W = 0.2151
Note: Ho = The data is from the Normal distribution. Small p-values reject Ho.
T8 Distribution
[Normal quantile plot]
Fitted Normal(0.01507, 0.02412)
Quantiles: 100.0% (maximum) 0.1008; 99.5% 0.0735; 97.5% 0.0625; 90.0% 0.0479; 75.0% (quartile) 0.0323; 50.0% (median) 0.0138; 25.0% (quartile) -0.0024; 10.0% -0.0148; 2.5% -0.0289; 0.5% -0.0454; 0.0% (minimum) -0.0702
Moments: Mean 0.0150675; Std Dev 0.0241163; Std Err Mean 0.0005393; upper 95% Mean 0.016125; lower 95% Mean 0.0140099; N 2000; Sum Wgt 2000; Sum 30.134937; Variance 0.0005816; Skewness 0.1176393; Kurtosis -0.308412; CV 160.05523; N Missing 0
Fitted Normal Parameter Estimates: Location µ = 0.0150675 (95% CI: 0.0140099 to 0.016125); Dispersion σ = 0.0241163 (95% CI: 0.0233914 to 0.0248878)
Goodness-of-Fit (Shapiro-Wilk W Test): W = 0.995788, Prob<W < .0001
Note: Ho = The data is from the Normal distribution. Small p-values reject Ho.
T9 Distribution
[Normal quantile plot]
Fitted Normal(-0.0118, 0.02095)
Quantiles: 100.0% (maximum) 0.0567; 99.5% 0.0422; 97.5% 0.0301; 90.0% 0.0162; 75.0% (quartile) 0.0030; 50.0% (median) -0.0128; 25.0% (quartile) -0.0276; 10.0% -0.0380; 2.5% -0.0497; 0.5% -0.0583; 0.0% (minimum) -0.0774
Moments: Mean -0.011833; Std Dev 0.0209498; Std Err Mean 0.0004685; upper 95% Mean -0.010914; lower 95% Mean -0.012752; N 2000; Sum Wgt 2000; Sum -23.66628; Variance 0.0004389; Skewness 0.1801721; Kurtosis -0.335315; CV -177.0433; N Missing 0
Fitted Normal Parameter Estimates: Location µ = -0.011833 (95% CI: -0.012752 to -0.010914); Dispersion σ = 0.0209498 (95% CI: 0.0203201 to 0.02162)
Goodness-of-Fit (Shapiro-Wilk W Test): W = 0.995178, Prob<W < .0001
Note: Ho = The data is from the Normal distribution. Small p-values reject Ho.
T10 Distribution
[Normal quantile plot]
Fitted Normal(-0.0219, 0.02014)
Quantiles: 100.0% (maximum) 0.0387; 99.5% 0.0267; 97.5% 0.0164; 90.0% 0.0035; 75.0% (quartile) -0.0084; 50.0% (median) -0.0212; 25.0% (quartile) -0.0344; 10.0% -0.0476; 2.5% -0.0649; 0.5% -0.0804; 0.0% (minimum) -0.0923
Moments: Mean -0.02188; Std Dev 0.0201419; Std Err Mean 0.0004504; upper 95% Mean -0.020996; lower 95% Mean -0.022763; N 2000; Sum Wgt 2000; Sum -43.75908; Variance 0.0004057; Skewness -0.236473; Kurtosis 0.2169387; CV -92.05821; N Missing 0
Fitted Normal Parameter Estimates: Location µ = -0.02188 (95% CI: -0.022763 to -0.020996); Dispersion σ = 0.0201419 (95% CI: 0.0195365 to 0.0207863)
Goodness-of-Fit (Shapiro-Wilk W Test): W = 0.996051, Prob<W < .0001
Note: Ho = The data is from the Normal distribution. Small p-values reject Ho.