Xingdong Bian- X-Machine Model of a Biological System

  • 8/3/2019 Xingdong Bian- X-Machine Model of a Biological System

    X-Machine Model of a Biological System

    Third year undergraduate dissertation project

    Final Dissertation

    Department of Computer Science

    University of Sheffield

    Author: Xingdong Bian

    Supervisor: Prof. Mike Holcombe

    Module code: COM3021

    Date: 29/03/2006

    This report is submitted in partial fulfilment of the requirement for the degree of

    Bachelor of Science with Honours in Computer Science by Xingdong Bian.

    Signed declaration:

    All sentences or passages quoted in this dissertation from other people's work have been specifically acknowledged by clear cross-referencing to author, work and page(s). Any illustrations which are not the work of the author of this dissertation have been used with the explicit permission of the originator and are specifically acknowledged. I understand that failure to do this amounts to plagiarism and will be considered grounds for failure in this dissertation and the degree examination as a whole.

    Name: XINGDONG BIAN

    Signature:

    Date: 02/05/2006

    Abstract:

    This project is in the field of computational biology. It uses computer simulation models to display the spatial and temporal aspects of biological systems in detail.

    The aim of this project is to develop a simulation of a vital part of the immune system using the X-machine framework and tools such as xparser and XML. Existing models written in Matlab are converted into XML, and xparser then parses the XML into a runnable programme in C source code.

    Three models are involved in this project: a chemical interaction model, an NF-kB signalling pathway model and a combined NF-kB and MAP kinase signalling model. The first two have existing Matlab models to be converted, but the last requires some research in order to add a new pathway to the NF-kB model.

    Acknowledgments

    I would like to thank everyone who helped me with this project, especially my supervisor Prof. Mike Holcombe, who led me in the right direction and gave me many ideas and much advice. Thanks also to Mr. Simon Coakley, who helped me with the XML specification, xparser and visualisation; to Mr. Mark Pogson, who helped me with the Matlab example models; and to Prof. Eva Qwarnstrom, who helped me with biological knowledge and experimental data.

    Contents

    Title
    Signed declaration
    Abstract
    Acknowledgments
    Contents
    Figure List

    Chapter 1 Introduction
    Section 1.1 Background
    Section 1.2 About the Project
    Section 1.2.1 Agent-Based Modelling
    Section 1.2.2 X-machine
    Section 1.2.3 HPCx
    Section 1.3 About This Dissertation

    Chapter 2 Literature Review
    Section 2.1 Overview
    Section 2.2 Agent-Based Intracellular Chemical Interactions Model
    Section 2.3 Agent-Based the NF-κB Signalling Pathway Model
    Section 2.4 NF-κB Signalling Pathway and MAP Kinase Signal Pathway Combined Model
    Section 2.5 Some Agent-Based Modelling Approaches
    Section 2.5.1 Swarm Agent-Based Modelling
    Section 2.5.2 MASON Multi-Agent Simulations
    Section 2.5.3 X-machine Framework and XML

    Chapter 3 Requirements and Analysis
    Section 3.1 Objectives and Requirement for the Project
    Section 3.2 Analysis for Intracellular Chemical Interaction Model
    Section 3.2.1 Importance and User Requirements
    Section 3.2.2 Conversion from Matlab
    Section 3.2.3 Concentrations Rates
    Section 3.3 Analysis for the NF-κB Signalling Pathway Model
    Section 3.3.1 Importance and User Requirements
    Section 3.3.2 Conversion from Matlab
    Section 3.4 Analysis for the NF-κB & MAP Kinase Signalling Pathway Combined Model

    Chapter 4 Design
    Section 4.1 Associated Language with the Project
    Section 4.1.1 XML
    Section 4.1.2 Matlab
    Section 4.1.3 C
    Section 4.2 Overall Design
    Section 4.2.1 X-machine Framework's Architecture
    Section 4.2.2 Main XML File Structure
    Section 4.2.3 Iteration XML File Structure
    Section 4.3 Design of Chemical Interaction Model
    Section 4.4 Design of NF-κB Signalling Pathway Model
    Section 4.5 Design of NF-κB & MAP Kinase Signalling Pathway Combined Model

    Chapter 5 Implementation and Testing
    Section 5.1 Implementation of Three Models
    Section 5.1.1 Implementation of Chemical Interaction Model
    Section 5.1.2 Implementation of the NF-κB Signalling Pathway Model
    Section 5.1.3 Implementation of NF-κB & MAP Kinase Signal Pathway Combined Model
    Section 5.2 Testing Methods
    Section 5.2.1 Unix Tool for Single Iteration Testing
    Section 5.2.2 Getdata Programme for Whole Iteration Files Testing

    Chapter 6 Results and Discussion
    Section 6.1 Results and Discussion of Chemical Interaction Model
    Section 6.2 Result and Discussion of NF-κB Pathway model
    Section 6.3 Result and Discussion of NF-κB & MAP kinase pathways combined model

    Chapter 7 Conclusions
    Section 7.1 Summary of the Dissertation and Project
    Section 7.2 Future Work of this Project

    References

    Appendices

    Figure List

    Figure 2.1 Transmembrane Signalling Biomechanical and Soluble Mediators
    Figure 2.2 Chemical Interaction Model Visualisation (Matlab)
    Figure 2.3 Chemical Interaction Model Results (Matlab)
    Figure 2.4 NF-κB Pathway Model Visualisation (Matlab)
    Figure 2.5 NF-κB Pathway Model Results (Matlab)
    Figure 2.6 Summary of MAP kinase pathway
    Figure 3.1 Process of combination
    Figure 3.2 Chemical reactions
    Figure 3.3 Concentration of molecule A, B and C against time
    Figure 3.4 Possible states and transition of an NF-κB
    Figure 3.5 Simplify of the MAP Kinase pathway
    Figure 4.1 Structure of the Main file (a)
    Figure 4.2 Structure of the Main file (b)
    Figure 4.3 NF-κB & MAP Kinase Signalling Pathway Relation
    Figure 5.1 States and relations in X-machine
    Figure 5.2 Visualisation of Chemical Interaction Model
    Figure 5.3 Visualisation of NF-κB signalling pathway model
    Figure 5.4 Visualisation of NF-κB MAP kinase combined model
    Figure 5.5 Concentration against Iterations (time steps) graph
    Figure 6.1 Chemical interaction agent model graph one
    Figure 6.2 Chemical interaction agent model graph two
    Figure 6.3 Visualisation for chemical interaction model
    Figure 6.4 NF-κB pathway agent model result (a)
    Figure 6.5 NF-κB pathway agent model result (b)
    Figure 6.6 Result of the combined model

    Chapter 1: Introduction

    Section 1.1: Background

    This project is in the field of computational biology, an interdisciplinary field that joins computer technology and biology. Computational biology has emerged only in recent years. The field is located at the interface between the two scientific and technological disciplines that can be argued to drive a significant if not the dominating part of contemporary scientific innovation [1].

    As more is discovered in biology about the structure, organisation and behaviour of cells, tissues, organisms and communities of biological systems, deeper understanding, and possibly simulation, is needed. Computer technology can meet this need and can provide predictions of important aspects of a biological system's behaviour. It has given new vitality to biological research. A famous example is the Human Genome Project, which has generated an extraordinary amount of data. Biologists are now faced with the challenge of extracting meaning from linear sequences composed of billions of base pairs. The work of computational biologists is indispensable for this task and for many other biological problems that lend themselves to computational solutions [2]. This is why the field of computational biology has developed so dramatically: more and more people in both areas are starting to work together to find the best solutions for their research.

    There are currently 10 major research areas in computational biology: sequence analysis, computational evolutionary biology, gene expression analysis, regulation analysis, protein expression analysis, analysis of mutations in cancer, structure prediction, measuring biodiversity, modelling biological systems and high-throughput image analysis. My project is in the ninth area listed above, modelling biological systems. This area involves the use of computer simulations of cellular subsystems to capture both the spatial and temporal aspects of the complex connections between cellular processes.

    Biological computer modelling is defined as the use of a computer programme that tries to simulate an abstract model of a particular biological system. Biological computer simulation is a subset of computer simulation. Computer simulation is a useful part of modelling many natural systems, giving insight into the operation of the systems being modelled. Before computer simulation, people relied on mathematical models alone; with computer simulation, modelling entered a new stage.

    Here is a brief history of computer simulation (quoted from the Wikipedia article "Computer Simulation", which is licensed under the GNU Free Documentation License, http://www.gnu.org/copyleft/fdl.html):

    Computer simulation was developed hand-in-hand with the rapid growth of the computer, following its first large-scale deployment during the Manhattan Project in World War II to model the process of nuclear detonation. It was a simulation of 12 hard spheres using a Monte Carlo algorithm. Computer simulation is often used as an adjunct to, or substitution for, modelling systems for which simple closed form analytic solutions are not possible. There are many different types of computer simulation; the common feature they all share is the attempt to generate a sample of representative scenarios for a model in which a complete enumeration of all possible states of the model would be prohibitive or impossible. Computer models were initially used as a supplement for other arguments, but their use later became rather widespread. The physicist Richard Feynman, was not fond of such models and once called them "a disease" [3].

    Section 1.2: About the Project

    The aim of my project is to develop a simulation of a vital part of the immune system using an existing framework and tools. The framework was developed by the Computational Biology Research Group in our department. It can model different kinds of biological systems, which are defined in terms of individual agents that play the role of different biological entities such as molecules and receptors. The simulations the group has built can handle thousands of these agents operating and communicating with one another. This approach is called Agent-Based Modelling.

    Section 1.2.1: Agent-Based Modelling

    Agent-Based Modelling was developed to deal with the complexities of such systems and to extend the capabilities of previous chemical modelling attempts [4][5]. It can provide a better understanding of how cellular reactions operate in both their spatial and temporal aspects.

    Agent-Based modelling (also known as individual-based modelling) treats each individual component of a system as a single entity (or agent) obeying its own pre-defined rules and reacting to its environment and neighbouring agents accordingly [4][6]. An agent is therefore a natural way to represent a component of a system.

    Agents can be represented by various computational models; the approach chosen here is the X-machine, providing an intuitive and precise method to model the functional behaviour of systems in a flexible and modular manner [5]. A single stream X-machine is used to describe each individual agent, and communication channels are identified between machines to deal with agent interactions [7]. When modelling complex systems, the X-machine has an essential feature: a model can be developed directly by adding new agents to the system, which makes the modelling process extensible.

    Section 1.2.2: X-machine

    We use the X-machine because of its special properties. X-machines are similar to finite state machines, which are models of behaviour based on states and transitions, but X-machines have an additional feature: memory. Transitions between states can read the memory and modify it [9]. This memory gives the X-machine an important and novel capability. Because the memory of an X-machine can hold data such as physical location, the number of states required to model the system remains manageably small.
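To make the idea of a state machine with memory concrete, here is a minimal Python sketch. It is not the framework's XML or its generated C code; the class name, the two states and the `bind_radius` parameter are assumptions made purely for illustration. The agent keeps its position in memory, so two states are enough whatever its location.

```python
from dataclasses import dataclass, field


@dataclass
class XMachineAgent:
    """Toy X-machine: a finite set of states plus a memory that transitions read and modify."""
    state: str = "free"
    memory: dict = field(default_factory=lambda: {"x": 0.0})

    def step(self, dx, target_x, bind_radius=0.5):
        """One transition: its outcome depends on the input, the state and the memory."""
        if self.state == "free":
            # The position lives in memory, not in the set of states.
            self.memory["x"] += dx
            if abs(self.memory["x"] - target_x) <= bind_radius:
                self.state = "bound"
        # A bound agent ignores further movement input in this sketch.
        return self.state


agent = XMachineAgent()
states = [agent.step(dx=1.0, target_x=3.0) for _ in range(4)]
```

Without the memory, every distinct position would need its own state; with it, the state set stays at two.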

    The framework is used as follows: a model is written in XML following the X-machine specification format, and then the Xparser (which was also built by the Computational Biology Research Group in our department) produces a programme in C code from the X-machine XML specification. Running the programme simulates the agents' behaviour, and the simulation can also be visualised with a special visualisation programme, written in C, built for the model.

    The framework is based on XML, rather than on writing C code directly, because XML is a simple and flexible text format derived from SGML. It shows all the states of each agent clearly and is much simpler to write than C. Once the XML code has been created, the Xparser parses it into C code easily.

    Section 1.2.3: HPCx

    The computational biology research team has already built a model of the vital part of the immune system in Matlab. My task is to convert that model to the X-machine framework, which produces code that is built with a C compiler.

    The reason is that the supercomputer HPCx cannot run Matlab but can run C. In order to have this supercomputer calculate our simulation, we have to convert our model into C.

    The hardware specification of the supercomputer is as follows (quoted from http://www.epcc.ed.ac.uk/msc/systems_HPCx.htm):

    The HPCx system is located at the UK's CCLRC's Daresbury Laboratory and operated by the HPCx Consortium.

    The HPCx system uses IBM p690+ Regatta nodes for the compute and IBM p690 Regatta nodes for login and disk I/O. Each Regatta node contains 32 processors. At present there are two p690 service nodes. At the beginning of the user service on HPCx phase2 in April 2004, twenty p690+ nodes were used for compute jobs, offering a total of 640 processors. From Monday, 10 May, there were 38 frames, i.e. 1216 processors, available to users. Then the system had a throughput of at least 4.8 Tflops (4800 AU/hr). This was increased to 50 nodes offering 1600 processors end of May 2004. The peak computational power of the HPCx system is 10.8 Tflops peak, or at least 6 Tflops sustained. The complete new platform gave a value of 6,188 Gflops for the Rmax value of the Linpack benchmark. The service can thus provide 6,188 AUs per hour, 148,512 AUs per day.
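The processor counts and allocation-unit figures in the quoted specification follow from simple arithmetic, which the short check below reproduces. It uses only numbers stated in the quotation and assumes that every node contributes all 32 of its processors.

```python
PROCESSORS_PER_NODE = 32  # stated in the quoted specification

# Processor totals for the 20-, 38- and 50-node configurations.
processors = {nodes: nodes * PROCESSORS_PER_NODE for nodes in (20, 38, 50)}

# 6,188 AUs per hour sustained over a 24-hour day.
aus_per_day = 6188 * 24
```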

    The HPCx service is provided by a consortium led by the University of Edinburgh, with the Council for the Central Laboratory of the Research Councils and IBM. This supercomputer will help us by running the simulation across thousands of processors, with different agents on different processors, to obtain a much more accurate result. However, my project does not involve HPCx directly.

    Section 1.3: About This Dissertation

    This dissertation consists of seven chapters. After this introductory chapter, the second chapter is the literature review, which covers the related background literature, the X-machine framework in detail and the three programming languages associated with my project. The third chapter, requirements and analysis, discusses the objectives, requirements and analysis of the project in more detail, including how the project will be evaluated. The fourth chapter, design, describes the design techniques used in this project. The fifth chapter, implementation and testing, covers the coding methods and how the models are tested. The sixth chapter, results and discussion, is an important chapter that presents the main results of the models together with some discussion. The last chapter, conclusions, summarises the project and the dissertation.

    Chapter 2: Literature Review

    Section 2.1: Overview

    Three models are involved in my project: an intracellular chemical interactions model, the NF-κB signalling pathway model and a combined model of the NF-κB signalling pathway and the MAP Kinase signal pathway. There are also three programming languages associated with my project: XML, Matlab and C. The picture below, produced by Prof. Eva Qwarnstrom, shows part of the signalling pathways in a cell; some of the molecules shown will appear in the models:

    Figure 2.1 [26]

    Section 2.2: Agent-Based Intracellular Chemical Interactions Model

    Firstly, I will introduce the intracellular chemical interactions model. Even the simplest life forms require the interaction of more than 400 chemical processes that are encoded by genes [9]. To track and understand intracellular chemical interactions, the intracellular signalling pathways should be considered. Intracellular signalling pathways are very important in the control and regulation of cell behaviour. Agent-based modelling can show these pathways in both their spatial and temporal aspects, and it can provide a framework for calculating chemical interactions accurately.

    Complex interactions of genes, proteins and other molecules within the cell must be addressed in order to gain a better understanding of how these pathways operate [6][10][11]. Mathematical models that incorporate information about the physical components of the cell also make the activity of signalling pathways easier to understand. Intracellular signalling pathways have traditionally been modelled with reaction kinetics, using ordinary differential equations to describe how the quantity of each chemical changes with time. This is valid only when the chemicals in the cell are well mixed. However, due to internal structure and low numbers and non-uniform distributions of certain key molecules in the cell, this is certainly not true [12].
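As an illustration of the reaction kinetics approach, the following sketch integrates the mass-action equations for a single reaction A + B → C with a forward Euler step. The rate constant `k`, the initial concentrations and the step size are arbitrary assumptions, and the equations rely on a well-mixed volume, which is exactly the assumption questioned above.

```python
def simulate_kinetics(a0, b0, k, dt, steps):
    """Forward-Euler integration of d[A]/dt = d[B]/dt = -k[A][B], d[C]/dt = +k[A][B]."""
    a, b, c = a0, b0, 0.0
    history = [(a, b, c)]
    for _ in range(steps):
        rate = k * a * b          # mass-action rate, valid only if well mixed
        a -= rate * dt
        b -= rate * dt
        c += rate * dt
        history.append((a, b, c))
    return history


history = simulate_kinetics(a0=1.0, b0=0.5, k=2.0, dt=0.01, steps=500)
a_end, b_end, c_end = history[-1]
```

Every unit of C produced consumes one unit each of A and B, so `a + c` and `b + c` stay constant throughout the run. Position never appears in these equations, which is why spatial effects are hard to express in this style of model.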

    Also, because the signalling pathways are complex, reaction kinetics models need a large number of ordinary differential equations. The resulting description is huge and its solutions are difficult to express. Models of this kind have other problems as well: there are limits to how well they function, and large systems of ordinary differential equations are sensitive, so that small changes to the equations can cause big changes in behaviour. Such models therefore give only a narrow view of the real behaviour of cells, even though they can sometimes provide useful results.

    An important factor that needs to be accounted for in intracellular modelling is time delays. Time delays in certain cellular processes such as transcription can have very significant effects on pathway behaviour [6]. Ordinary differential equation models do not consider this factor because, by their nature, they cannot incorporate such delays.
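A delay is easy to express at the level of an individual event. In the sketch below, which is only an illustration, a hypothetical transcription event that starts at one time step delivers its product a fixed number of steps later. The delay length and all names are assumptions.

```python
from collections import deque


def run_with_delay(start_steps, delay, total_steps):
    """Return the product count after each step when every start event
    only yields a product `delay` steps after it began."""
    pending = deque(sorted(start_steps))   # steps at which transcription starts
    in_progress = deque()                  # completion times of running events
    product, counts = 0, []
    for t in range(total_steps):
        while pending and pending[0] == t:
            in_progress.append(pending.popleft() + delay)
        while in_progress and in_progress[0] <= t:
            in_progress.popleft()
            product += 1
        counts.append(product)
    return counts


counts = run_with_delay(start_steps=[0, 2], delay=3, total_steps=7)
```

Here the events started at steps 0 and 2 produce their products at steps 3 and 5 respectively, not at the moment they start.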

    An even more important factor for intracellular modelling is spatial effects. Again, it is hard for differential equation models to take spatial effects into account.

    In summary, although differential equation models are important, they have many disadvantages and limitations for modelling intracellular interactions. To gain a higher level of understanding of the mechanical and structural effects on intracellular pathways, more transparent and abstract models are needed.

    A good modelling approach here is agent-based modelling. Agent-based modelling models each individual component of a system as a single agent obeying its own pre-defined rules and reacting to its environment and neighbouring agents accordingly [6]. It therefore offers new methods of modelling spatial systems that deal with much finer spatial and temporal scales, where activity is represented at the level of the individual or agent. Processes enter these systems naturally as agent behaviour, and so they fit naturally into the spatial context as well. Agent-based modelling has recently been applied to a variety of biological systems, including insect communities and epithelial tissue [13][14][15][16].

    In a biological system such as a biochemical pathway, anything from a molecule to a signalling receptor to an entire chain of interactions can be modelled as an agent, thus providing a modular and extensible modelling framework which allows abstraction of details as necessary [5]. Agent-based modelling is therefore well suited to spatial modelling, which is useful for monitoring intracellular interactions and the changes in cell structure caused by the interaction processes.

    Compared with differential equation models, agent-based models have much more freedom: they can model molecules in any quantity and at any position, limited only by the power of the computer. The two important factors, time delays and spatial effects, can also be included in the model easily. Note, however, that the number of agents must be positive.

    Unlike differential equation models, agent-based models do not need many ordinary differential equations, but they do need details of each agent's position and properties, which is a large amount of information to specify. It should also be noted that the agent-based model should agree with the associated kinetics model.
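A minimal sketch of such a model for one reaction, A + B → C, is given below. It is not the Matlab model discussed in this chapter: the box size, random-walk step, reaction radius and agent counts are assumptions chosen to keep the example short. Each agent stores its own kind and position, and an A and a B that come within the reaction radius are replaced by one C.

```python
import math
import random


def step(agents, box, move, radius, rng):
    """Move every agent by a random step, then react A + B -> C for close pairs."""
    for agent in agents:
        # Random walk, clamped so that agents stay inside the box.
        agent["pos"] = [min(box, max(0.0, p + rng.uniform(-move, move)))
                        for p in agent["pos"]]
    a_list = [a for a in agents if a["kind"] == "A"]
    b_list = [b for b in agents if b["kind"] == "B"]
    used = set()
    for a in a_list:
        for b in b_list:
            if id(b) in used:
                continue
            if math.dist(a["pos"], b["pos"]) <= radius:
                # The pair is consumed: B is removed and A becomes the product C.
                used.add(id(b))
                a["kind"] = "C"
                break
    return [ag for ag in agents if id(ag) not in used]


def count(agents, kind):
    """Number of agents of the given kind."""
    return sum(1 for ag in agents if ag["kind"] == kind)


rng = random.Random(1)
box = 5.0
agents = [{"kind": k, "pos": [rng.uniform(0, box) for _ in range(3)]}
          for k in ["A"] * 30 + ["B"] * 20]
for _ in range(200):
    agents = step(agents, box, move=0.5, radius=0.7, rng=rng)
```

The counts returned by `count` at each step could be plotted against time and compared with the curve from the corresponding kinetics model, which is one way to check that the two models agree.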

    The two images below come from an agent-based model of an intracellular interaction (A + B → C) coded in Matlab by Mark Pogson in our department. Figure 2.2 shows a step in the middle of the interaction; it clearly displays the position and number of all three kinds of molecule in a three-dimensional box. Figure 2.3 shows the number of each kind of molecule against time in seconds. We can see that, as time passes, molecule A interacts with molecule B to produce molecule C, and that the numbers of the three molecules are linked.

    Figure 2.2

    Figure 2.3

    Section 2.3: Agent-Based the NF-κB Signalling Pathway Model

    After the intracellular chemical interaction model, we now move on to the second model involved in my project, the NF-κB signalling pathway. NF-κB, nuclear factor kappa B, is a heterodimeric protein composed of different combinations of members of the Rel family of transcription factors. The Rel/NF-kB family of transcription factors are involved mainly in stress-induced, immune, and inflammatory responses. In addition, these molecules play important roles during the development of certain hemopoietic cells, keratinocytes, and lymphoid organ structures. More recently, NF-kB family members have been implicated in neoplastic progression and the formation of neuronal synapses. NF-kB is also an important regulator in cell fate decisions, such as programmed cell death and proliferation control, and is critical in tumorigenesis [17]. The intracellular NF-κB signalling pathway is therefore important to the immune system.

    Because it controls cell death and proliferation, research into the NF-κB signalling pathway is very important. If the pathway could be controlled so that cancer cells kill themselves while normal cells stay alive, cancer, one of the biggest problems in the world today, could be solved. However, the pathway is not easy to control, so a good model of the intracellular NF-κB signalling pathway is needed to show both its spatial and temporal details for research purposes.

    NF-κB activation is tightly controlled by inhibitors of NF-κB (IκB) proteins [5][18]. IκB sequesters the majority of NF-κB in the cytoplasm as complexes by masking their nuclear localisation signals [19]. During activation, IκB is phosphorylated by IκB kinases (IKK), causing its ubiquitination and proteasome-mediated degradation. The newly free NF-κB is consequently transported into the nucleus, inducing genes bearing cognate binding motifs [5].
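The activation sequence just described can be read as a small set of states and transitions for one NF-κB molecule. The sketch below encodes that reading as a lookup table. The state and event names are labels invented for this example, not identifiers taken from the models in this project.

```python
# Transition table: (current state, event) -> next state.
# States and events are illustrative labels for the steps described in the text.
TRANSITIONS = {
    ("bound_to_IkB_in_cytoplasm", "IkB_phosphorylated_by_IKK"): "IkB_marked_for_degradation",
    ("IkB_marked_for_degradation", "IkB_degraded"): "free_in_cytoplasm",
    ("free_in_cytoplasm", "transported_to_nucleus"): "in_nucleus",
}


def run(events, state="bound_to_IkB_in_cytoplasm"):
    """Apply events in order; an event that does not apply leaves the state unchanged."""
    for event in events:
        state = TRANSITIONS.get((state, event), state)
    return state


final = run(["IkB_phosphorylated_by_IKK", "IkB_degraded", "transported_to_nucleus"])
blocked = run(["transported_to_nucleus"])
```

The second call shows the sequestering effect: while the molecule is still bound to its inhibitor, a transport event has no effect and the state does not change.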

    The information above shows how important the NF-κB signalling pathway is and how NF-κB is activated. We now need a computational model to find out how it controls the signalling pathways, together with the results provided by experiment.

    As with the intracellular chemical interaction model, differential equations have been used to model the behaviour of the inhibitors. However, as mentioned above, differential equation models are limited in how well they can show the actual pathway, so the best approach here is agent-based modelling.

    Agent-based modelling can give the intracellular NF-κB signalling pathway a better scope of analysis and a more complete view of the regulatory mechanisms. It shows what is actually happening inside the cell. In this model a single agent is a molecule inside the cell, and its behaviour is controlled by the rules of interaction and by its environment. Sometimes it is not possible to model every individual molecule because of biological or computational limitations, but by using other agents to separate the system into useful components, the model can still provide a complete view of the pathway.

    Again, in this model agent-based modelling has a wider scope than reaction kinetics modelling, but the agent-based model must agree with the corresponding reaction kinetics model.

    The two images below come from a second agent-based model coded in Matlab by Mark Pogson from our department. Figure 2.4 shows a step in the middle of the NF-κB signalling pathway simulation; it clearly displays a model of the cell and the position of each kind of molecule. Figure 2.5 shows the concentration of each kind of molecule against time in seconds.

    Figure 2.4

    Figure 2.5

    As we can see, the model is made up of many different molecules in a spherical cell with a spherical nuclear region at its centre. In reality, however, some cells have unique, non-spherical shapes. To model those cells, we would need special software to work out the boundary, but the model would still be based on a spherical model with all kinds of coordinates.

    Section 2.4: NF-κB Signalling Pathway and MAP Kinase Signal Pathway Combined Model

    MAP Kinase stands for Mitogen-activated protein kinase. In cell biology, mitogen-activated protein kinases are serine/threonine-specific protein kinases that respond to extracellular stimuli (mitogens) and regulate various cellular activities, such as gene expression, mitosis, differentiation, and cell survival/apoptosis. Extracellular stimuli lead to activation of a MAPK via a signalling cascade composed of MAPK, MAPK kinase (MAPKK), and MAPKK kinase (MAPKKK). A MAPKKK that is activated by extracellular stimuli phosphorylates a MAPKK on its serine and threonine residues, and then this MAPKK activates a MAPK through phosphorylation on its serine and tyrosine residues. This MAPK signalling cascade has been evolutionarily well-conserved from yeast to mammals. [27]
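The three-tier cascade described in this passage can be sketched as a chain in which each kinase becomes active one step after the kinase above it. This is a deliberately simplified illustration: real cascades are graded and reversible, and the one-step activation rule is an assumption made for the example.

```python
CASCADE = ["MAPKKK", "MAPKK", "MAPK"]


def propagate(stimulus_present, steps):
    """Return the list of active kinases after each step of the cascade."""
    active = {name: False for name in CASCADE}
    timeline = []
    for _ in range(steps):
        # Update from the bottom up so the signal moves one tier per step.
        for i in range(len(CASCADE) - 1, 0, -1):
            if active[CASCADE[i - 1]]:
                active[CASCADE[i]] = True   # phosphorylated by the tier above
        if stimulus_present:
            active["MAPKKK"] = True          # activated by the extracellular stimulus
        timeline.append([n for n in CASCADE if active[n]])
    return timeline


timeline = propagate(stimulus_present=True, steps=3)
```

With a stimulus present, MAPKKK is active after the first step, MAPKK after the second and MAPK after the third. Without a stimulus nothing is activated.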

    Figure 2.6 [25]

    Figure 2.6 shows only a summary of the MAP kinase pathway, whereas Figure 2.1 shows a more complex and complete set of signalling pathways. It also shows the cross talk between the NF-κB and MAP kinase pathways.

    This pathway can also be modelled with an agent-based model, by introducing each molecule as an agent. As with the NF-κB signalling pathway, agent-based modelling can provide a better scope of analysis and a more complete view of the regulatory mechanisms. However, the combined model is more complex and more important for research purposes, and it is necessary for the computer model to display what is actually happening inside the cell.

    The most important thing is to see whether these two pathways interfere with each
    other when they are in the same model; the cross-interaction between their members
    is also crucial.

    If the two pathways behave normally in the same model, the X-machine framework is
    capable of modelling more than one pathway. This is also the basis for future models
    that contain three or more pathways.

    Section 2.5: Some Agent-Based Modelling Approaches

    Section 2.5.1: Swarm Agent-Based Modelling

    Swarm is a multi-agent software platform for the simulation of complex adaptive

    systems. In the Swarm system the basic unit of simulation is the swarm, a collection of

    agents executing a schedule of actions. Swarm supports hierarchical modelling

    approaches whereby agents can be composed of swarms of other agents in nested

    structures. Swarm provides object oriented libraries of reusable components for

    building models and analyzing, displaying, and controlling experiments on those

    models. Swarm is currently available as a beta version in full, free source code form. It

    requires the GNU C Compiler, Unix, and X Windows. [33]

    The modelling formalism that Swarm adopts is a collection of independent agents

    interacting via discrete events. Within that framework, Swarm makes no assumptions about the particular sort of model being implemented. There are no domain specific

    requirements such as particular spatial environments, physical phenomena, agent

    representations, or interaction patterns. Swarm simulations have been written for such

    diverse areas as chemistry, economics, physics, anthropology, ecology, and political

    science. [33]

    Swarm treats each individual agent as a basic unit. Each agent generates events that
    affect itself and other agents, and a Swarm simulation consists of a number of agents
    interacting with each other.


    Swarm needs libraries to do the simulation. Swarm libraries serve two major

    functions. The libraries are a set of classes that model builders can use by direct

    instantiation. For many objects, especially highly technical ones such as schedule data

    structures, it's likely that all a user will ever do is use the classes as provided. But in

    addition, one can use Swarm libraries by subclassing them, specializing particular
    classes for particular modelling needs. Both modes of using the Swarm libraries are
    important; Swarm is designed to facilitate both as appropriate. [33] This reliance on
    the provided libraries is also a limitation of Swarm agent-based modelling.

    Section 2.5.2: MASON Multi-Agent Simulations

    MASON stands for Multi-Agent Simulator Of Neighbourhoods... or Networks... or
    something... MASON is a fast discrete-event multiagent simulation library core in
    Java, designed to be the foundation for large custom-purpose Java simulations, and
    also to provide more than enough functionality for many lightweight simulation needs.

    MASON contains both a model library and an optional suite of visualization tools in

    2D and 3D. MASON is a joint effort between George Mason University's ECLab

    Evolutionary Computation Laboratory and the GMU Center for Social Complexity,

    and was designed by Sean Luke, Gabriel Catalin Balan, and Liviu Panait, with help

    from Claudio Cioffi-Revilla, Sean Paus, Keith Sullivan, Daniel Kuebrich, Joey

    Harrison, and Ankur Desai. [34]

    MASON has some special features:

    - Simulations can be serialized to checkpoints (freeze-dried and written to disk),
      which can be recovered from at any time, even to different Java platforms and new
      MASON visualization toolkits.
    - MASON can be set up to be guaranteed duplicatable, meaning that the same
      simulation parameters will produce the same results regardless of platform.
    - Libraries are provided for visualizing in 2D and in 3D (using Java3D), to
      manipulate the model graphically, to take screenshots, and to generate movies
      (using Java Media Framework).
    - While the visualization toolkits are fairly large, the core simulation model is
      intentionally very small, fast, and easy to understand. [34]

    However, as the description above shows, MASON uses Java to simulate models. As
    explained in the last chapter, the models need to run on HPCx, but HPCx does not
    support Java, so it is not possible to choose this simulation system for my project.

    As the last two sections show, neither of these two systems suits my project as well
    as the X-machine framework. The next section explains why the X-machine framework
    is the most suitable one for my project.


    Section 2.5.3: X-machine Framework and XML

    Because agent-based modelling is so widely used for intracellular interactions, it is
    necessary to develop a common architecture for systems with large numbers of agents.
    The approach here is a framework based on the X-machine, which standardises the way
    agents are expressed. The X-machine framework uses XML code, which the Xparser
    (itself written in C) parses into runnable C code.

    There are many tools for computational biology modelling research, but few are aimed
    at agent-based modelling, and those that exist have inflexible structures that do not
    suit our models. Some agent-based frameworks already exist, but they cannot meet the
    needs of intracellular modelling, because inside actual cells there are millions of
    molecules and associated cellular signalling. Given this huge number of agents, a
    common architecture is essential. Running on a supercomputer such as HPCx, as
    mentioned in the introduction, makes the modelling result more accurate. The reason
    the model can run on supercomputers lies in the definition of agents. The agents are
    defined as autonomous computing machines that communicate with messages, so the
    processing of the agents can be spread across many processors and computers that are
    connected on a network [8].

    The messaging between agents is similar to message passing between computers, so
    agent messages map naturally onto messages sent between computers. MPI (Message
    Passing Interface) is a library that allows the creation of programs that can be spread
    across computers and that communicate with messages, and it has become the de facto
    standard for distributed memory parallel processing [8]. So we can use computers to
    simulate the agents and the messages between those agents.

    It is possible to define a cell as a system that processes parallel collections of
    communications. We therefore need a good model to define the behaviour of agents
    that run in parallel, send each other data and process it. The X-machine meets these
    needs. Like other finite state machines, it has states and input and output alphabets,
    but it also has something other state machines lack: memory. This additional memory
    makes it useful and suitable for agent-based modelling, because the functions that
    move the machine between states can read the memory and modify it.

    The definition of a stream X-machine is an 8-tuple [16]:

    $X = (\Sigma, \Gamma, Q, M, \Phi, F, q_0, m_0)$

    - $\Sigma$ and $\Gamma$ are the input and output alphabets respectively.
    - $Q$ is the finite set of states.
    - $M$ is the (possibly) infinite set called memory.
    - $\Phi$, the type of the machine $X$, is a set of partial functions $\varphi$ that map
      an input and a memory state to an output and a possibly different memory state,
      $\varphi: \Sigma \times M \rightarrow \Gamma \times M$.
    - $F$ is the next state partial function, $F: Q \times \Phi \rightarrow Q$, which given
      a state and a function from the type $\Phi$ determines the next state. $F$ is often
      described as a state transition diagram.
    - $q_0$ and $m_0$ are the initial state and initial memory respectively.
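
    As a minimal sketch of the definition above, the following Python fragment models a
    stream X-machine with an integer memory. Python is used purely for illustration (the
    framework itself generates C), and the names StreamXMachine, bind and release, as well
    as the two-state example, are assumptions rather than part of the X-machine framework:

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# A processing function phi: (input, memory) -> (output, new_memory),
# or None when phi is undefined for that pair (it is a partial function).
Phi = Callable[[str, int], Optional[Tuple[str, int]]]


@dataclass
class StreamXMachine:
    transitions: Dict[Tuple[str, str], str]  # F: (state, phi name) -> next state
    functions: Dict[str, Phi]                # the type Phi, keyed by name
    state: str                               # current state, initially q0
    memory: int                              # current memory, initially m0

    def step(self, symbol: str) -> Optional[str]:
        """Consume one input symbol; return the output, or None if nothing applies."""
        for (src, name), dst in self.transitions.items():
            if src != self.state:
                continue
            result = self.functions[name](symbol, self.memory)
            if result is not None:
                output, self.memory = result
                self.state = dst
                return output
        return None


def bind(symbol: str, memory: int) -> Optional[Tuple[str, int]]:
    # Hypothetical function: defined only for "contact"; memory counts bindings.
    return ("bound", memory + 1) if symbol == "contact" else None


def release(symbol: str, memory: int) -> Optional[Tuple[str, int]]:
    # Hypothetical function: defined only for "release"; memory is unchanged.
    return ("freed", memory) if symbol == "release" else None


machine = StreamXMachine(
    transitions={("free", "bind"): "bound", ("bound", "release"): "free"},
    functions={"bind": bind, "release": release},
    state="free",
    memory=0,
)
outputs = [machine.step(s) for s in ["contact", "contact", "release", "contact"]]
```

    The second "contact" produces no output because no function is defined for it in the
    "bound" state, which is how the partiality of the functions shows up in practice.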

    From now on the term X-machine refers to a stream X-machine [8]. Because X-machines
    can communicate, we can use the Communicating X-machine model, in which X-machines
    exchange messages. The Communicating X-machine model can be defined as the tuple [8]:

    $((C_i^x)_{i=1..n}, R)$

    where:

    - $C_i^x$ is the i-th Communicating X-machine in the system, and
    - $R$ is a communication relation between the n X-machines

    Different ways of defining R give different definitions of the communicating
    X-machine. One of the most accepted approaches uses the idea of a communication
    matrix which acts as the means of communication between X-machines [8]. The cells
    of the matrix hold the messages between X-machines. However, this approach has
    disadvantages when X-machines are used as agents: when there are many agents, the
    communication matrices become too large to link them all together. Also, because the
    communication pattern keeps changing, an agent cannot easily tell which agent a
    message should be sent to.

    In communicating X-machine agent-based models, agents are restricted to interacting
    with surrounding agents, so the distance over which messages are sent between agents
    is restricted. In this approach, the communication relation R between X-machines
    consists of two lists: a message list and a message type list. All X-machines can read
    and understand the messages in the message list. This is central to this kind of
    implementation: the actions of each X-machine are based on input messages. If the
    source of an input message is too far from the X-machine, the message is ignored; if
    the source is within a reasonable distance, it is processed. The method can also be
    extended by tagging each message with additional information, e.g. the maximum
    distance for the sending X-machine and the possible receiving X-machines.
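
    The distance rule described above can be sketched as follows. This is an illustration
    only, written in Python rather than the framework's C; the Message fields, the
    max_range tag and the function readable_messages are hypothetical names, not the
    framework's actual data structures:

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Position = Tuple[float, float, float]


@dataclass
class Message:
    kind: str           # an entry from the message type list, e.g. "location"
    sender_id: int
    position: Position  # where the sending agent was when it posted the message
    max_range: float    # hypothetical tag: how far this message may travel


def readable_messages(board: List[Message], reader_id: int,
                      reader_pos: Position) -> List[Message]:
    """Return the messages that an agent at reader_pos should process."""
    return [
        m for m in board
        if m.sender_id != reader_id
        and math.dist(m.position, reader_pos) <= m.max_range
    ]


board = [
    Message("location", 1, (0.0, 0.0, 0.0), 2.0),
    Message("location", 2, (1.0, 0.0, 0.0), 2.0),
    Message("location", 3, (9.0, 0.0, 0.0), 2.0),
]
near_agent_1 = readable_messages(board, reader_id=1, reader_pos=(0.0, 0.0, 0.0))
```

    Every agent reads the same list, and only the distance test decides which messages it
    acts on, so no agent needs to know in advance who its receivers will be.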

    There are many ways of communicating and handling messages. A useful case is
    communication between two agents that are processed on distinct computers in a
    computer cluster or a grid system. The current practice is to keep a local message
    list for each CPU in the cluster. An agent only sends messages to, and receives
    messages from, its local CPU, but a separate calculation determines whether any agent
    on a different CPU needs the message. The calculation involves the distance between
    agents: by giving each agent an influence boundary, it is easy to decide whether an
    agent needs the message.
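
    One possible way to organise the local message lists is sketched below. It assumes,
    purely for illustration, that space is divided into equal slabs along the x axis with
    one slab per CPU; the slab width and all function names are hypothetical and are not
    taken from the framework:

```python
from typing import Dict, List, Tuple

SLAB_WIDTH = 10.0  # hypothetical: each CPU owns a slab of this width along x


def owner_cpu(x: float) -> int:
    """Index of the CPU whose slab contains coordinate x."""
    return int(x // SLAB_WIDTH)


def cpus_needing_message(x: float, influence: float, n_cpus: int) -> List[int]:
    """All CPUs whose slab overlaps the interval [x - influence, x + influence]."""
    first = max(0, owner_cpu(x - influence))
    last = min(n_cpus - 1, owner_cpu(x + influence))
    return list(range(first, last + 1))


def distribute(messages: List[Tuple[int, float, float]],
               n_cpus: int) -> Dict[int, List[int]]:
    """Build one local message list per CPU from (sender_id, x, influence) triples."""
    local_lists: Dict[int, List[int]] = {cpu: [] for cpu in range(n_cpus)}
    for sender_id, x, influence in messages:
        for cpu in cpus_needing_message(x, influence, n_cpus):
            local_lists[cpu].append(sender_id)
    return local_lists


lists = distribute([(1, 5.0, 2.0), (2, 9.5, 1.0), (3, 25.0, 0.5)], n_cpus=3)
```

    Only the message from agent 2, whose influence boundary crosses a slab edge, is
    copied to a second CPU; the others stay on their local list.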

    XML is used here as the implementation architecture for the X-machine. The X-machine
    architecture is defined in an XML text file. This is easy for most people to use,
    because the XML code can be modified with any text editor. It is also possible to
    develop a graphical interface for modifying the XML without viewing the
    implementation directly.

    It is necessary to build a parser that turns the XML into a runnable C programme,
    which runs the X-machine agents with the message list relation. The parser itself is
    coded in C and works for all XML-coded X-machine agent models; we call it the
    Xparser. To run the programme, another XML text file is needed to define the starting
    state and details of each agent as an initial point. By varying these files, it is
    possible to carry out different runs of the model, with different results, for research.
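
    To make the idea of an initial-state file concrete, the sketch below reads a small
    XML document with Python's standard xml.etree.ElementTree module. The tag names
    (states, agent, name, id, state, x, y, z) are invented for this example and should
    not be read as the actual schema expected by the Xparser:

```python
import xml.etree.ElementTree as ET

# Hypothetical initial-state file; the tag names are illustrative only.
INITIAL_STATE = """
<states>
  <agent><name>A</name><id>1</id><state>0</state><x>1.0</x><y>2.0</y><z>0.5</z></agent>
  <agent><name>B</name><id>2</id><state>0</state><x>4.0</x><y>2.5</y><z>1.5</z></agent>
</states>
"""


def load_agents(xml_text: str) -> list:
    """Parse every <agent> element into a dictionary of typed fields."""
    agents = []
    for node in ET.fromstring(xml_text).findall("agent"):
        agents.append({
            "name": node.findtext("name"),
            "id": int(node.findtext("id")),
            "state": int(node.findtext("state")),
            "position": tuple(float(node.findtext(axis)) for axis in "xyz"),
        })
    return agents


agents = load_agents(INITIAL_STATE)
```

    Changing the values in such a file, and nothing else, is enough to start a different
    run of the same model.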

    The X-machine model can be visualised with a specially written visualisation
    programme, which is also coded in C. The visualisation gives a direct view of the
    model's structure and interaction procedures. It is also possible to capture each
    frame of the visualisation as an image file; a set of such screenshots can be
    converted into a video file using a free program called VirtualDub (see
    http://www.virtualdub.org/).


    Chapter 3: Requirements and Analysis

    Section 3.1: Objectives and Requirement for the Project

    This chapter is mainly about the objectives and requirements of the project. Each of
    the three models will be discussed in detail.

    For my project, the aim is to develop a simulation of a vital part of the immune
    system using an existing framework and tools. The framework is the X-machine
    framework, developed by the computational biology research group in our department,
    which can readily model the biological systems involved in my project. Each individual
    agent plays the role of a molecule or a receptor. Agent-based modelling can handle
    thousands of these agents operating and communicating with one another.

    The objective of my project is to convert the two existing Matlab models into
    X-machine models. Mark Pogson has modelled both the intracellular chemical
    interaction model and the NF-κB signalling pathway model in Matlab, and I have
    already received them. However, for the third model, which combines the two pathways,
    there is no existing Matlab model. This makes it challenging, and it needs to be fully
    tested to see whether it works properly in the X-machine framework.

    Clearly, the first requirement is to understand all the Matlab models in detail; then
    I need to work out the architecture and method of X-machine modelling. I also need to
    understand how to use the Xparser developed by Simon Coakley.

    Then I can start. After fully understanding a Matlab model, I need to convert it into
    an X-machine model, which is represented by an XML file. I then need to create an
    initial state file called 0.xml (also in XML) that gives the model the details of its
    starting agents: Matlab can generate the initial agents at the start of every run, but
    in the X-machine framework I need to create them myself. Next, I use the Xparser to
    parse the XML into C. If there are no problems with compiling, this produces a
    runnable .exe programme file. Running the programme with a number of iterations and
    the 0.xml initial state file carries out the whole process and produces an XML file for
    each iteration. Simon Coakley has also developed a visualisation programme
    specialised for X-machine models, which gives a direct view of the model in 3D
    pictures.

    After the two Matlab models have been converted into X-machine models, it is possible
    to start the third model. This model is built by defining each molecule as an agent,
    with a set of binding rules and a set of movement rules for each new kind of molecule.


    As Figure 3.1 shows, it is possible to start with two individual models for the NF-κB
    and MAPK pathways and then put them together into a single model. However, one thing
    is important: the state numbers for each pathway's molecules should be unique, so
    that they do not clash when the models are combined. The cross-talk between the
    NF-κB pathway and the MAPK pathway also needs to be shown in the model, if detailed
    data for it is available. I will discuss the combined model further in a following
    section and chapter.
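
    One simple way to guarantee unique state numbers is to renumber the states when the
    pathway models are merged. The sketch below illustrates the idea in Python; the
    function merge_state_tables, the molecule names and the state names are all
    hypothetical and are not taken from the Matlab or X-machine models:

```python
from typing import Dict, List


def merge_state_tables(*pathways: Dict[str, List[str]]) -> Dict[str, int]:
    """Combine per-pathway state lists, giving every state a unique number."""
    combined: Dict[str, int] = {}
    next_free = 0
    for pathway in pathways:
        for molecule, state_names in pathway.items():
            for state_name in state_names:
                combined[f"{molecule}:{state_name}"] = next_free
                next_free += 1
    return combined


# Illustrative state lists only; the real models define their own states.
nfkb_pathway = {"NFkB": ["free_cytoplasm", "bound_IkB"], "IkB": ["free_cytoplasm"]}
mapk_pathway = {"MAPK": ["free_cytoplasm", "free_nucleus"]}
state_table = merge_state_tables(nfkb_pathway, mapk_pathway)
```

    Because numbers are handed out from a single counter, two pathways can never end up
    sharing a state number, whatever order they are merged in.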

    Figure 3.1 Process of combination

    Section 3.2: Analysis for Intracellular Chemical Interaction Model

    Section 3.2.1: Importance and User Requirements

    This model is very basic and simple, but everything develops from the basic to the
    complex. Many aspects of life involve the interaction of multiple components and
    subunits and the corresponding emergence of both form and function. This is true
    whether we are dealing with molecules within an individual cell, cells within tissue,
    organs within an organism or organisms within a community or ecology. [28] By
    working out how each molecule interacts with another kind, it is possible to build a
    large and complex model with a number of different kinds of molecules or pathways.

    The key feature of agent-based modelling is that each molecule is modelled as an
    agent. Figure 3.2 illustrates the difference: (a) reaction kinetics differential
    equations treat reacting chemicals as well mixed and uniform; (b) the agent-based
    approach models each individual molecule [28].

    [Figure 3.1 diagram labels: NF-κB, MAPK, Mix, Cross-Talk, Combined Model]

    Figure 3.2 Chemical reactions [28]

    Agent-based models have greater scope than reaction kinetics differential equation
    models, but they need many more details to be defined. For example, the movement of a
    single molecule needs to be defined, as do the binding rules of A molecules to B
    molecules. Incorrect data may cause a big difference in the result.

    Agent-based models have to agree with reaction kinetics differential equation models,
    because when an agent-based model has a large number of well-mixed molecules, the
    reaction kinetics differential equations apply. However, there is not much information
    about individual molecular interactions, so it is necessary to derive some data for the
    agent-based model from reaction kinetics.
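
    For reference, the reaction kinetics view of a single binding reaction A + B -> C can
    be sketched with a simple Euler integration, against which molecule counts from an
    agent-based run could be compared. The mass-action form, the rate constant k and the
    starting concentrations are assumptions chosen for illustration, not values from the
    Matlab models:

```python
def simulate_kinetics(a0: float, b0: float, k: float, dt: float, steps: int):
    """Euler integration of d[A]/dt = d[B]/dt = -k[A][B] and d[C]/dt = +k[A][B]."""
    a, b, c = a0, b0, 0.0
    history = [(a, b, c)]
    for _ in range(steps):
        reacted = k * a * b * dt  # amount of A and B converted into C this step
        a, b, c = a - reacted, b - reacted, c + reacted
        history.append((a, b, c))
    return history


# Hypothetical parameters: k and the initial concentrations are illustrative.
history = simulate_kinetics(a0=1.0, b0=0.5, k=0.8, dt=0.5, steps=40)
```

    With many well-mixed agents, the counts of A, B and C per unit volume would be
    expected to follow curves of this general shape.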

    Section 3.2.2: Conversion from Matlab

    During conversion, one major addition needs to be defined first: the state of each
    molecule. An X-machine is a special kind of state machine, so when modelling
    intracellular actions, each molecule is an X-machine and each has a state. I therefore
    need to work out the possible states of each kind of molecule.

    The intracellular chemical interaction model initially has only two kinds of molecule,
    so the states are easy to define. Molecule A has two states, free and bound with
    molecule B; molecule B has one state, free. From the perspective of A, it receives a
    message from molecule B and decides whether to bind. After A binds with B, the pair
    becomes a third kind of molecule. To mark this in the state, we can let molecule B
    disappear and change molecule A to the state "bound with B"; it is actually molecule C
    now, but this representation is easier to compute and display.

    The requirement for binding is also important. Normally the interaction boundary
    depends on the radius of the molecule. It is necessary to define the radius and
    interaction boundary for each kind of molecule.
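
    The binding rule described in this section might be sketched as follows. The sketch
    assumes, for illustration only, that two molecules can bind when the distance between
    their centres is no more than the sum of their radii; the class Molecule, the function
    try_bind and the state numbers are hypothetical:

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

FREE, BOUND_WITH_B = 0, 1  # hypothetical numbers for the two states of molecule A


@dataclass
class Molecule:
    kind: str
    position: Tuple[float, float, float]
    radius: float
    state: int = FREE


def try_bind(a: Molecule, b_molecules: List[Molecule]) -> bool:
    """Bind a free A to the first B inside its interaction boundary; remove that B."""
    if a.state != FREE:
        return False
    for b in b_molecules:
        # Assumed interaction boundary: the two spheres touch or overlap.
        if math.dist(a.position, b.position) <= a.radius + b.radius:
            a.state = BOUND_WITH_B  # A now stands for the product, molecule C
            b_molecules.remove(b)   # B disappears from the model
            return True
    return False


a = Molecule("A", (0.0, 0.0, 0.0), 1.0)
bs = [Molecule("B", (5.0, 0.0, 0.0), 0.5), Molecule("B", (1.2, 0.0, 0.0), 0.5)]
first_attempt = try_bind(a, bs)
second_attempt = try_bind(a, bs)
```

    The second attempt fails because A is no longer free, which matches the rule that a
    bound A represents molecule C and cannot bind again.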


    Section 3.2.3: Concentrations Rates

    A good way to check whether the result is correct is to count the molecules. For each
    bond, the numbers of molecule A and molecule B each decrease by one unit and the
    number of molecule C increases by one unit, and this should happen in the same time
    step; looking back at Figure 2.3, the concentration changes are easy to see. The model
    will be built on this basis. The evaluation of this model is also easy: if the
    concentration change of molecule A over a time step $\Delta t$ is $\Delta a$, and that
    of molecule B is $\Delta b$, then because the interaction is between molecule A and
    molecule B and produces molecule C, $\Delta a = \Delta b$, as in Figure 3.3 [6]:

    Figure 3.3: concentration of molecule A, B and C against time [6]
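
    The bookkeeping described in this section can be written out explicitly. The symbol
    $\Delta c$ for the change in molecule C, and $a(t)$, $b(t)$, $c(t)$ for the counts at
    time $t$, are notation introduced here for illustration, and the relations assume that
    A + B -> C is the only reaction taking place:

```latex
\begin{align}
  \Delta a &= \Delta b = -\Delta c, \\
  a(t) + c(t) &= a(0) + c(0), \\
  b(t) + c(t) &= b(0) + c(0).
\end{align}
```

    The last two relations follow from summing the first over all time steps, and they
    give a quick consistency check on the molecule counts of any iteration.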

    Section 3.3: Analysis for the NF-κB Signalling Pathway Model

    Section 3.3.1: Importance and User Requirements

    As discussed in the last chapter, the NF-κB signalling pathway is vital to immune
    response regulation. Alterations in pathway regulation underlie many diseases,
    including atherosclerosis and arthritis. The modelling of individual molecules,
    receptors and genes provides a more comprehensive outline of regulatory network
    mechanisms than previously possible with equation-based approaches. [28] For this
    model, all the data is from single cell experimental analysis by the Academic Unit of
    Cell Biology, Division of Genomic Medicine in the University of Sheffield.

    A user of this model will be able to alter the moving speed, radius and initial
    quantity (concentration) of each kind of molecule. The user should also be able to
    define the colour for each kind of molecule. This means that even if the data from the
    Matlab code is not correct, the user can correct the model as soon as the experiments
    have finished. Each kind of molecule is independent of the others, so changing the
    details of one kind will not affect the others but will give the correct result.

    The interaction of NF-κB with IκB should follow the interaction requirement described
    in the last section: NF-κB can be seen as molecule A of the last model and IκB as
    molecule B, so when they bind, the NF-κB & IκB complex can be seen as molecule C.
    The concentration change should therefore follow Figure 3.3, but because many other
    kinds of molecule are involved, the situation will be much more complex.

    Section 3.3.2: Conversion from Matlab

    From the detail in the Matlab code, it is possible to know that: Activation of the
    NF-κB pathway is controlled by inhibitors of NF-κB (IκB) proteins, which sequester
    the majority of NF-κB in the cytoplasm as complexes by masking their nuclear
    localisation signals. During activation, IκB is phosphorylated by IκB kinases (IKK),
    causing its degradation. The newly freed NF-κB is consequently transported into the
    nucleus, inducing inflammatory genes, including those encoding IκB, thus regulating
    the pathway through negative feedback. [28][29][30][31]

    Also from the Matlab code, NF-κB, IκBα, IκBβ, IκBε, nuclear importing receptors and
    nuclear exporting receptors are modelled as agents. The conversion from Matlab is
    complicated, because each kind of molecule has a set of states. However, the number of
    IκBβ and IκBε molecules in a real cell is tiny, so at Mark Pogson's suggestion it is
    not necessary to include these two molecules in the model. We can now look at the
    possible states for each kind of molecule:

    The NF-κB molecule is the most complicated one in this model; see Figure 3.4 for its
    possible states and transitions. A single molecule can be bound or unbound with
    different molecules in the cytoplasm and in the nucleus, free in the cytoplasm or in
    the nucleus, and bound or unbound with importing and exporting receptors. In more
    detail, NF-κB should have the following states:

    - bound with IκB in the cytoplasm;
    - free in the cytoplasm;
    - bound with a nuclear importing receptor;
    - free in the nucleus;
    - bound with IκB in the nucleus;
    - bound with a nuclear exporting receptor;
    - bound with IκB and then bound with a nuclear importing receptor;
    - bound with IκB and then bound with a nuclear exporting receptor.

    For IκB the possible states are: free in the cytoplasm, bound with a nuclear importing
    receptor, free in the nucleus and bound with an exporting receptor. IκB is much
    simpler than the NF-κB molecule.

    For both kinds of nuclear receptor there are two states: dormant and active. Active
    means that something is bound to the receptor; dormant means that it is free and
    ready to bind with other kinds of molecule.
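
    The states listed above can be written down as enumerations, as in the Python sketch
    below. The numeric values are arbitrary, and the ALLOWED table is only a small
    illustrative subset of transitions inferred from the prose; it is an assumption and is
    not copied from Figure 3.4 or from the Matlab code:

```python
from enum import Enum


class NFkBState(Enum):
    BOUND_IKB_CYTOPLASM = 0
    FREE_CYTOPLASM = 1
    BOUND_IMPORT_RECEPTOR = 2
    FREE_NUCLEUS = 3
    BOUND_IKB_NUCLEUS = 4
    BOUND_EXPORT_RECEPTOR = 5
    BOUND_IKB_AND_IMPORT_RECEPTOR = 6
    BOUND_IKB_AND_EXPORT_RECEPTOR = 7


class ReceptorState(Enum):
    DORMANT = 0  # free and ready to bind
    ACTIVE = 1   # something is bound to the receptor


# Hypothetical subset of transitions, for illustration only.
ALLOWED = {
    NFkBState.BOUND_IKB_CYTOPLASM: {NFkBState.FREE_CYTOPLASM},
    NFkBState.FREE_CYTOPLASM: {NFkBState.BOUND_IKB_CYTOPLASM,
                               NFkBState.BOUND_IMPORT_RECEPTOR},
    NFkBState.BOUND_IMPORT_RECEPTOR: {NFkBState.FREE_NUCLEUS},
    NFkBState.FREE_NUCLEUS: {NFkBState.BOUND_IKB_NUCLEUS,
                             NFkBState.BOUND_EXPORT_RECEPTOR},
}


def can_transition(src: NFkBState, dst: NFkBState) -> bool:
    """True if the illustrative table allows moving from src to dst."""
    return dst in ALLOWED.get(src, set())
```

    Keeping the states in one enumeration makes it easy to check that every state number
    is distinct before the model is written out as XML.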


    Figure 3.4: Possible states and transitions of an NF-κB [5]

    Section 3.4: Analysis for the NF-κB & MAP Kinase Signalling Pathway Combined Model

    This model involves two pathways: the NF-κB signalling pathway and the MAP kinase
    signalling pathway. The NF-κB pathway has already been done in the second model, so
    the tasks are to build the MAP kinase pathway separately and then combine the two.

    As Figure 3.5 shows, the model can be simplified from Figure 2.6. The Ras, SOS and
    Grb2 molecules can be seen as a single kind, which can be treated like NF-κB in the
    last model. Active-Ras can be treated like IκB. However, neither of them can go
    inside the nucleus. After they bind, they produce a molecule called MAPK, which
    stands in for Raf (MAP KKK), MEK1/2 (MAP KK) and ERK1/2 (MAP K). Raf, MEK1/2 and
    ERK1/2 form a sequential cascade, so they can be treated as one kind of molecule,
    MAPK. MAPK is the only one that goes inside the nucleus, where it switches on a gene.

    As with NF-κB and IκB, the interaction of Ras_SOS_Grb2, Active-Ras and MAPK should
    follow the concentration change shown in Figure 3.3. The changes of NF-κB and IκB
    should not be affected in this model; this is how it will be evaluated.

    The cross-talk between these two pathways has not yet been fully discovered. The only
    thing known at present is that a molecule called NIK is an important part of the
    cross-talk.

    The next chapter covers design, where I will discuss the design of each model in detail.

    Figure 3.5: Simplification of the MAP kinase pathway
    [Diagram labels: Ras_SOS_GRb2 and Active-Ras (Bound); Raf (MAP KKK), MEK1/2 (MAP KK), ERK1/2 (MAP K); MAPK; Nuclear Membrane; Ap-1]


    Chapter 4: Design

    Section 4.1: Associated Language with the Project

    There are three programming languages associated with my project -- XML, Matlab

    and C. It is necessary to get familiar with these languages before the design of the

    models.

    Section 4.1.1: XML

    Firstly, let's have a look at XML. XML, the Extensible Markup Language, is similar to
    the familiar HTML (Hypertext Markup Language); both are derived from SGML (Standard
    Generalized Markup Language). XML is a simple but very flexible language. It was
    originally designed to meet the challenges of large-scale electronic publishing [24].
    XML is now also used for exchanging data between the Web and other devices, e.g. the
    RSS (Really Simple Syndication) feed service, which provides a text file in a common
    XML format so that users receive up-to-date information such as news and weather.

    Compared with HTML, XML is very flexible, because the tags in HTML are predefined,
    whereas in XML you can define the tags yourself. With an XML parser or reader that
    understands your own tags, a goal can be achieved with great efficiency.

    Section 4.1.2: Matlab

    Secondly, we turn to Matlab. Matlab is an interactive mathematical environment and

    high-level technical computing language, originally based on the FORTRAN packages

    LINPACK and EISPACK, but now based on LAPACK and BLAS [20]. Matlab is a

    really useful tool for mathematical modelling; it also has many features [21]:

    - High-level language for technical computing
    - Development environment for managing code, files, and data
    - Interactive tools for iterative exploration, design, and problem solving
    - Mathematical functions for linear algebra, statistics, Fourier analysis, filtering,
      optimization, and numerical integration
    - 2-D and 3-D graphics functions for visualizing data
    - Tools for building custom graphical user interfaces
    - Functions for integrating MATLAB based algorithms with external applications
      and languages, such as C, C++, Fortran, Java, COM, and Microsoft Excel

    With these features, Matlab is a powerful tool for computing and mathematical
    studies. However, compared with the XML architecture, it is less suitable for
    agent-based modelling when handling the agents and the messages of the communication
    relation. Another reason is that the HPCx supercomputer does not support Matlab. To
    have the supercomputer run in parallel, with different agents on different CPUs, it is
    necessary to convert the existing Matlab-coded models into X-machine models.

    Section 4.1.3: C

    Lastly, we focus on the C programming language. The book The C Programming Language
    by Brian Kernighan and Dennis Ritchie gives an informal specification of C and some
    history of the language.

    The C programming language is a standardized imperative computer programming

    language developed in the early 1970s by Ken Thompson and Dennis Ritchie for use

    on the UNIX operating system. It has since spread to many other operating systems,

    and is one of the most widely used programming languages. C is prized for its

    efficiency, and is the most popular programming language for writing system software, though it is also used for writing applications. It is also commonly used in computer

    science education, despite not being designed for novices. [22][23]

    C operates very close to the hardware and is closer to assembly language than most
    other high-level languages. This makes it easier for programmers to control what the
    programme is doing, which results in greater efficiency than other languages.

    C is also versatile, because a wide range of compilers, libraries and interpreters
    are available for it. That is why the Xparser and the visualisation programme for
    X-machine models are both written in C. Also, as mentioned above, the HPCx
    supercomputer has no problem running C, so C is the best choice for the post-parsing
    programming language of X-machine models.

    Now that all three languages have been introduced, we are prepared for the design
    stage.

    Section 4.2: Overall Design

    Section 4.2.1: X-machine Framework's Architecture

    The X-machine framework is a specialised framework for building models, in biology
    and other areas, based on individual agents. The architecture currently uses an .xml
    text file to define the data for each individual agent. As described in the last
    section, XML is a portable description language. An .xml file is easy to build, either
    by writing it directly in a text editor (low-level programming) or by constructing it
    with a GUI tool (high-level implementation).


    There is also another important .xml file, which defines all the interaction rules, the
    sending and receiving of messages, the movements, the variables and so on. A parser
    called Xparser parses this .xml file into a C source file, and a compiler then turns
    that into an executable programme. The programme runs the X-machine agents and
    implements the global message list communication relation [8].

    The model starts when this programme is supplied with an initial .xml text file that
    holds the states and other information of every agent. Each iteration of the programme
    generates an .xml text file that holds all the changes to the states and other
    information such as location and speed.

    After a number of iterations (which can be set when the programme starts), there will be
    a set of .xml text files. Note that one iteration represents 0.5 seconds, so 2 iterations
    represent 1 second. These files are easy to use: a specialised visualisation tool can
    display the model, and a getdata tool can extract the information needed to generate a
    graph with specified x and y axes. This will be introduced in detail in the next
    chapter, Implementation and Testing.
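
    The timing rule is simple arithmetic. As a minimal sketch in C, an iteration number can
    be converted into simulated time as follows. The helper name iteration_to_seconds and
    the constant name are illustrative assumptions, not part of the framework.

```c
#include <assert.h>

/* Simulated time covered by one iteration, as stated in the text. */
#define SECONDS_PER_ITERATION 0.5

/* Convert an iteration number into elapsed simulated time in seconds.
   Hypothetical helper: the framework itself only numbers the files. */
double iteration_to_seconds(int iteration)
{
    assert(iteration >= 0);   /* iteration numbers are not negative */
    return iteration * SECONDS_PER_ITERATION;
}
```

    A getdata-style tool could use such a conversion to label the time axis of a graph.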

    Section 4.2.2: Main XML File Structure

    The main XML file is the soul of the model. Even though it is not needed when we
    visualise the model or extract its data, the model will not work, or will not work
    properly, if this file is missing or incorrect.

    The structure of the main file is not simple (see Figure 4.1). At the highest level it
    consists of three parts: Defined States, X-machine and Messages. The Defined States
    part is really a set of comments listing all the states of each molecule, to help users
    understand the model. The Messages part declares each kind of message and its contents.

    Figure 4.1 Structure of the Main file (a): the model main file divides into Defined
    States, X-machine and Messages.


    The most complicated part is X-machine, which itself consists of three sub-parts:
    Memory, States and Functions (Figure 4.2).

    The Memory part holds the variables. The user defines all the global variables here
    with special tags, which is quite simple to do.

    The States part normally contains three states: input, output and move, which are
    linked with the functions in the Functions part. For most models this part does not
    need to be changed.

    The Functions part is the core of the file and its most complicated sub-part. It
    controls each agent's behaviour. The Outputdata function sends out messages containing
    location, state and bond information. The Inputdata function gets messages from other
    agents and processes them with the appropriate reactions. The Movements function
    controls the movement and location of the agents, and also draws the boundary of the
    model structure. Depending on the model, other functions may be needed in this
    sub-part.

    Figure 4.2 Structure of the Main file (b): X-machine divides into Memory, States and
    Functions; Functions contains Outputdata, Inputdata and Movements.


    Section 4.2.3: Iteration XML File Structure

    Iteration files are also important for this framework. They have to be simple and
    uniform so that they are easy to read and process.

    Most commonly, these files start with an iteration number tag showing which iteration
    the file belongs to, followed by an entry for each agent in the model. The file
    includes every agent present at that iteration and the detailed data for each of them.
    Different models will have different types of data for their agents.

    Section 4.3: Design of Chemical Interaction Model

    The chemical interaction model is the simplest model in my project. It follows the
    reaction A + B → C, and only three kinds of molecules are involved, so the structure of
    the model and its xml files is simple as well. A Matlab model already exists, and I can
    reuse its molecule information and interaction rules for the X-machine model. The work
    is therefore really a conversion of that model.

    In Matlab it is possible to generate the data for a number of molecules randomly when
    the programme starts, and Matlab is also a good tool for plotting concentration against
    time for the model. In the X-machine model these functions need extra tools, so the
    design of the main file and the iteration file should be quite simple.

    As in chapter three, the first thing to do for the conversion is to clarify the states
    of each kind of molecule. Molecule A has two states: 0, free in the box, and 1, bound
    with B (this actually appears as molecule C). Molecule B has only one state: 100, free
    in the box. When a molecule B binds with a molecule A it is treated as having
    disappeared, so there is no need for a B state meaning bound with molecule A.
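
    The state numbers can be written down as a small C enumeration. The sketch below is
    only illustrative: the numbers 0, 1 and 100 come from the design, while the identifier
    names and the two helper functions are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* State numbers from the design; the identifier names are illustrative. */
enum chem_state {
    A_FREE  = 0,    /* molecule A free in the box */
    A_BOUND = 1,    /* molecule A bound with B, i.e. displayed as molecule C */
    B_FREE  = 100   /* molecule B free in the box */
};

/* Returns 1 if the state belongs to a molecule of kind A, 0 otherwise. */
int is_molecule_a(int state)
{
    return state == A_FREE || state == A_BOUND;
}

/* Count the product C: every A molecule in the bound state is one C. */
int count_c(const int *states, int n)
{
    int i, count = 0;
    assert(states != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (states[i] == A_BOUND)
            count++;
    return count;
}
```

    Because a bound B simply disappears, counting A molecules in state 1 is enough to
    follow the amount of C over time.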

    The model is shaped as a box but, according to the Matlab code, two coordinate systems
    are needed: Cartesian and polar. Cartesian coordinates are mostly used for location,
    and polar coordinates for motion. Using the Cartesian coordinates it is possible to
    draw the boundary of the box and keep each molecule inside it, by reversing the
    movement in polar coordinates when a molecule hits the edge of the box.
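
    The boundary rule can be sketched in C as follows. This is a hedged illustration, not
    the model's actual code: it assumes the box occupies [0, box_length] on each axis, that
    the caller has already converted the polar movement into a Cartesian displacement, and
    that "reversing" means pointing the direction exactly backwards. The struct and
    function names are hypothetical.

```c
#include <assert.h>

#define PI 3.14159265358979323846

/* Hypothetical subset of an agent's memory used for movement. */
struct molecule {
    double x, y, z;             /* Cartesian position */
    double movetheta, movephi;  /* polar direction of travel, in radians */
};

/* Cartesian test: is the point inside the cube [0, box_length]^3?
   Placing the box corner at the origin is an assumption. */
int inside_box(double x, double y, double z, double box_length)
{
    return x >= 0.0 && x <= box_length &&
           y >= 0.0 && y <= box_length &&
           z >= 0.0 && z <= box_length;
}

/* Try to move by the Cartesian displacement (dx, dy, dz), which the caller
   derives from the polar movement as
     dx = r sin(theta) cos(phi), dy = r sin(theta) sin(phi), dz = r cos(theta).
   If the step would leave the box, stay in place and reverse the polar
   direction instead. Returns 1 when the direction was reversed, else 0. */
int step_or_reverse(struct molecule *m, double dx, double dy, double dz,
                    double box_length)
{
    assert(m != 0 && box_length > 0.0);
    if (!inside_box(m->x + dx, m->y + dy, m->z + dz, box_length)) {
        m->movetheta = PI - m->movetheta;   /* flip the polar angle */
        m->movephi += PI;                   /* point the azimuth backwards */
        if (m->movephi >= 2.0 * PI)
            m->movephi -= 2.0 * PI;         /* keep phi in [0, 2*pi) */
        return 1;
    }
    m->x += dx;
    m->y += dy;
    m->z += dz;
    return 0;
}
```

    Replacing (theta, phi) by (pi - theta, phi + pi) negates every component of the
    direction vector, so the next step carries the molecule back into the box.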

    The Memory part has to contain all the global variables that support the coordinate
    systems described above, along with other necessary variables such as the state number
    and the molecule radius.

    The States part is simple: only three states have to be set, output, input and move, as
    described in section 4.2.2.


    The most challenging part is the Functions part. Four functions are essential:
    outputdata, inputdata, checkbondtries and move. The best way to handle binding is for
    each A molecule to look for a partner from its own perspective. By processing the
    location messages from the B molecules, the A molecule chooses the best one and then
    bonds with it. Bond messages are also involved in the binding process. The move
    function makes sure all the molecules move freely around inside the box.

    The Messages part contains two kinds of messages. The first is the location message,
    which contains each molecule's state, Cartesian coordinates and id number. The second
    is the bond message, which holds the sender's id and state, the receiver's state, a
    bondunbond tag and a distance value.
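
    The two message kinds map naturally onto two C structures. The sketch below assumes
    plain int and double fields; the struct, field and function names are illustrative and
    are not taken from the framework's generated code.

```c
#include <assert.h>

/* Location message: who the sender is and where it is. */
struct location_message {
    int id;          /* sender's id number */
    int state;       /* sender's state number */
    double x, y, z;  /* sender's Cartesian coordinates */
};

/* Bond message: a request or notification about a bond between two agents. */
struct bond_message {
    int sender_id;
    int sender_state;
    int receiver_state;
    int bondunbond;   /* tag describing what the message asks for */
    double distance;  /* distance between the two molecules */
};

/* Build a bond message in reply to a received location message.
   self_id and self_state describe the agent that sends the bond message. */
struct bond_message make_bond_message(int self_id, int self_state,
                                      const struct location_message *other,
                                      int tag, double distance)
{
    struct bond_message b;
    assert(other != 0);
    b.sender_id = self_id;
    b.sender_state = self_state;
    b.receiver_state = other->state;
    b.bondunbond = tag;
    b.distance = distance;
    return b;
}
```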

    The iteration files have the structure described in section 4.2.3. The detailed
    implementation of this model appears in the next chapter, Implementation and Testing.

    Section 4.4: Design of NF-κB Signalling Pathway Model

    The NF-κB signalling pathway model is complicated compared with the last one: it
    involves four kinds of molecules and more than a dozen different states. The molecules
    are NF-κB, IκBα (IκBβ and IκBε are ignored because their concentrations are low),
    nuclear importing receptors and nuclear exporting receptors.

    NF-κB can bind with IκBα, nuclear importing receptors and nuclear exporting receptors.
    After it has bound with IκBα, the NF-κB & IκBα complex can also bind with nuclear
    importing and exporting receptors. So the state numbers designed for NF-κB are: 0 - free
    in cytoplasm, 1 - bound to IκBα in cytoplasm, 2 - bound to nuclear importing receptors,
    3 - bound to IκBα and then bound to nuclear importing receptors, 4 - bound to nuclear
    exporting receptors, 5 - bound to IκBα and then nuclear exporting receptors, 6 - free in
    nucleus and 7 - bound to IκBα in nucleus.

    As in the chemical interaction model, IκBα acts similarly to molecule B in that model.
    When IκBα binds with NF-κB it is not necessary to display it, so the solution is to
    eliminate it. The situation is different, however, when IκBα binds with nuclear
    importing and exporting receptors. The state numbers have to be unique, so the state
    numbers designed for IκBα are: 10 - free in cytoplasm, 11 - bound to nuclear importing
    receptors, 12 - bound to nuclear exporting receptors and 13 - free in nucleus.

    For nuclear importing and exporting receptors, the states are easy. They do not need to
    track which kind of molecule is bound to them, because the state numbers of the two
    kinds of molecules above already indicate the type of bond. The only thing the
    receptors need to record is whether they are busy or not. The state numbers designed
    for nuclear importing receptors are: 20 - dormant and 21 - active. Nuclear exporting
    receptors use the same scheme with different numbers: 30 - dormant and 31 - active.
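
    Since every state number is unique, the kind of molecule can be recovered from the
    state alone. The following C sketch collects the state numbers of this section in one
    enumeration; the numbers come from the design, while the identifier names and the two
    helper functions are assumptions made for illustration.

```c
/* State numbers from the design; identifier names are illustrative. */
enum nfkb_model_state {
    NFKB_FREE_CYTO     = 0,   /* NF-kB free in cytoplasm */
    NFKB_IKB_CYTO      = 1,   /* NF-kB bound to IkBa in cytoplasm */
    NFKB_ON_IMPORT     = 2,   /* NF-kB bound to importing receptor */
    NFKB_IKB_ON_IMPORT = 3,   /* NF-kB & IkBa bound to importing receptor */
    NFKB_ON_EXPORT     = 4,   /* NF-kB bound to exporting receptor */
    NFKB_IKB_ON_EXPORT = 5,   /* NF-kB & IkBa bound to exporting receptor */
    NFKB_FREE_NUC      = 6,   /* NF-kB free in nucleus */
    NFKB_IKB_NUC       = 7,   /* NF-kB bound to IkBa in nucleus */
    IKB_FREE_CYTO      = 10,
    IKB_ON_IMPORT      = 11,
    IKB_ON_EXPORT      = 12,
    IKB_FREE_NUC       = 13,
    IMPORT_DORMANT     = 20,
    IMPORT_ACTIVE      = 21,
    EXPORT_DORMANT     = 30,
    EXPORT_ACTIVE      = 31
};

enum molecule_kind { KIND_NFKB, KIND_IKB, KIND_IMPORT, KIND_EXPORT, KIND_UNKNOWN };

/* Recover the kind of molecule from its state number. */
enum molecule_kind kind_of_state(int state)
{
    if (state >= 0 && state <= 7)   return KIND_NFKB;
    if (state >= 10 && state <= 13) return KIND_IKB;
    if (state == 20 || state == 21) return KIND_IMPORT;
    if (state == 30 || state == 31) return KIND_EXPORT;
    return KIND_UNKNOWN;
}

/* A receptor is busy exactly when it is in its active state. */
int receptor_is_busy(int state)
{
    return state == IMPORT_ACTIVE || state == EXPORT_ACTIVE;
}
```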

    Now that the state numbers have been designed, the next part is the Memory part, which
    contains only the definitions of variables. The last model had the shape of a box, but
    this model has the shape of a cell, so the structure and boundaries are more complex.
    The coordinate systems are nevertheless the same as in the last model, Cartesian and
    polar, and each serves the same purpose: Cartesian coordinates take care of locations,
    and polar coordinates control the movement of molecules. The boundaries are drawn by
    the molecules themselves. With both coordinate systems working together, nuclear
    receptors lie on the nuclear membrane and the other molecules move within the region
    they belong to, e.g. inside the cytoplasm or the nucleus. If any molecule is about to
    cross a boundary, its movement can be reversed to pull it back.

    The States part is exactly the same as in the last model. The Functions part,
    however, is not.

    Because this model involves nuclear receptors, some new sets of rules are needed in
    this part. Another thing to note is that, for each bond with a nuclear receptor, there
    should be a delay before the molecule is unbound and released into a new region. For
    example, when NF-κB binds with a nuclear importing receptor, after a while its state
    should change to NF-κB moving freely inside the nucleus. The last function, move, is
    responsible for drawing the boundary: it should let nuclear receptors move only on the
    nuclear membrane and keep the other molecules moving in the right regions.

    The Messages part is also the same as in the last model, containing the location
    message and the bond message. The location message again contains each molecule's
    state, Cartesian coordinates and id number; the bond message contains the sender's id
    and state, the receiver's state, a bondunbond tag and a distance value.

    Iteration files have the same structure, but now contain more kinds of molecules.

    In the next chapter, I will follow this design and discuss the implementation of the
    NF-κB signalling pathway model in detail.

    Section 4.5: Design of NF-κB & MAP Kinase Signalling Pathway Combined Model

    The structural design of this model is almost the same as that of the NF-κB model, but
    a new pathway is added: the MAP kinase pathway. Because there is no existing Matlab
    model for this one, the relationship and cross-talk between the two pathways is
    important. However, according to the Academic Unit of Cell Biology, Division of
    Genomic Medicine at the University of Sheffield, the only known relationship between
    the two pathways is the one shown in Figure 4.3; the exact cross-talk between the
    NF-κB and MAP kinase pathways has not yet been worked out. So the design of this model
    is really the addition of a new pathway to the last model. Figure 4.3 shows that
    molecules from outside the cell pass through the Toll receptor; some of them then enter
    the NF-κB pathway, and the others enter the MAP kinase pathway, each with some
    probability. The model itself, however, covers only intracellular behaviour, so the
    initial molecules are assigned in the first iteration file.

    Figure 4.3 NF-κB & MAP Kinase Signalling Pathway Relation [32]

    Some more states need to be added. From Figure 3.4 in the last chapter, it is possible
    to treat Ras, SOS and Grb2 as a single molecule that acts similarly to molecule A in
    the first model, while Active-Ras acts similarly to molecule B in the first model.


    Chapter 5: Implementation and Testing

    This chapter describes in detail how the three models in my project were implemented.
    Once the models are finished it is important to test and evaluate them, so the testing
    method for the models is also covered in this chapter.

    Section 5.1: Implementation of Three Models

    This section covers the implementation of the models in my project. Following the
    design from the last chapter, the implementation process of all three models is
    explained. The first two models are converted from two Matlab models but, because of
    the differences between Matlab and the X-machine framework, the best way to convert
    them is to take the ideas and algorithms from the Matlab code and then write the model
    directly in XML, following the XML specification for the X-machine framework.

    Section 5.1.1: Implementation of Chemical Interaction Model

    In Matlab, the chemical interaction model works as follows: first, it defines the
    constants it needs, such as time step, box length and speed range; second, it generates
    the initial positions of the molecules and plots them immediately; third, it creates
    the initial direction vectors; fourth, it loops over each molecule A, looking from its
    perspective for a suitable molecule B to bind; fifth, it controls the movement of each
    molecule and keeps it inside the interaction model box; lastly, it draws a graph of
    concentration against time.

    As noted in the last chapter, the X-machine framework has no tool of its own for
    generating initial values or drawing graphs, so special external tools are needed, but
    these are not difficult to produce.

    This model starts with the main .xml file. The states of each molecule were defined in
    the last chapter, so next we need to define constants and variables. Box length is
    defined as a constant, 3000 (in units of 10^-10 m), for later use. The integer
    variables are the id number and state number. The double variables are x, y and z for
    Cartesian coordinates; postheta, posphi and posr for polar coordinates; movetheta,
    movephi and mover for movements in polar coordinates; and iradius for the radius of
    molecules.
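
    In C terms, the memory of one agent corresponds roughly to the structure below. The
    field names are those listed in the text; the struct name, the make_agent helper and the
    choice to place the box between 0 and BOX_LENGTH are assumptions made for this sketch,
    not something taken from the generated code.

```c
#include <assert.h>

#define BOX_LENGTH 3000.0   /* box length in units of 1e-10 m, as in the text */

/* Sketch of one agent's memory; the struct name is hypothetical. */
struct agent_memory {
    int id;                            /* id number */
    int state;                         /* state number */
    double x, y, z;                    /* Cartesian position */
    double postheta, posphi, posr;     /* position angles and radius */
    double movetheta, movephi, mover;  /* movement angles and step length */
    double iradius;                    /* radius of the molecule */
};

/* Create an agent at a Cartesian position inside the box. The angle and
   radius fields start at zero here and would be filled in by the model. */
struct agent_memory make_agent(int id, int state,
                               double x, double y, double z, double iradius)
{
    struct agent_memory a = {0};
    assert(x >= 0.0 && x <= BOX_LENGTH);
    assert(y >= 0.0 && y <= BOX_LENGTH);
    assert(z >= 0.0 && z <= BOX_LENGTH);
    a.id = id;
    a.state = state;
    a.x = x;
    a.y = y;
    a.z = z;
    a.iradius = iradius;
    return a;
}
```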

    Next, in the States part, three states are defined: output, input and move. The output
    state is associated with the Outputdata function and points to the input state. The
    input state is associated with the Inputdata function and has the move state as its
    destination. The move state is linked with the Move function and has the output state
    as its next state. So the three states are linked together in a closed ring (see
    Figure 5.1):


    Figure 5.1 States and relations in X-machine: Output (initial), Input and Move, linked
    in a ring.
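
    The ring of states is a three-state cycle. A minimal C sketch of the transition rule,
    with illustrative identifier names that are not taken from the framework:

```c
/* The three agent states; Output is the initial state. */
enum agent_state { STATE_OUTPUT, STATE_INPUT, STATE_MOVE };

/* State reached after the function attached to state s has run. */
enum agent_state next_state(enum agent_state s)
{
    switch (s) {
    case STATE_OUTPUT: return STATE_INPUT;   /* after Outputdata */
    case STATE_INPUT:  return STATE_MOVE;    /* after Inputdata */
    case STATE_MOVE:   return STATE_OUTPUT;  /* after Move */
    }
    return STATE_OUTPUT;  /* not reached for valid input; restart the ring */
}
```

    Applying next_state three times from Output returns to Output, which is the closed ring
    described in the text.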

    The next part is the Functions part. As the design chapter noted, this is the most
    complicated part. All the code written between its xml tags is actually C code, mostly
    if-else statements, while loops and other simple C constructs.

    The first function is Outputdata. It sends out the location message using a method
    called add_location_message, passing the molecule's id, state, x, y and z to the other
    molecules for them to act on. This function is quite short.

    The second function is Inputdata. It first defines some local variables for processing
    the location messages. With a while loop, it then gets all the location messages and
    processes them one by one for each molecule A. For each message, it first checks
    whether the message came from the molecule itself, and it discards messages from other
    A molecules. It then checks whether the squared distance (no square root is taken) is
    less than the molecule's radius squared (radius²). If all these requirements are met,
    a bond message can be sent. It carries the source molecule's id and state, the
    destination molecule's state, the distance between them, and the integer 3 as the
    bindunbind tag (3 means a try for a bond).
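
    The checks made for each location message can be sketched in C as below. This is an
    illustration under stated assumptions rather than the model's code: the struct and
    function names are hypothetical, and the comparison is done on squared quantities so
    that no square root is needed.

```c
#include <assert.h>

enum { A_FREE = 0, A_BOUND = 1, B_FREE = 100 };

/* Hypothetical view of the receiving molecule A. */
struct agent {
    int id;
    int state;
    double x, y, z;
    double iradius;   /* interaction radius of the molecule */
};

struct location_message {
    int id;
    int state;
    double x, y, z;
};

/* Squared Euclidean distance between two points. */
double distance_squared(double ax, double ay, double az,
                        double bx, double by, double bz)
{
    double dx = ax - bx, dy = ay - by, dz = az - bz;
    return dx * dx + dy * dy + dz * dz;
}

/* Returns 1 if a bond message with tag 3 (a try for a bond) should be sent
   in response to msg, and 0 otherwise. */
int should_try_bond(const struct agent *self, const struct location_message *msg)
{
    assert(self != 0 && msg != 0);
    if (msg->id == self->id)
        return 0;                                  /* message from itself */
    if (msg->state == A_FREE || msg->state == A_BOUND)
        return 0;                                  /* message from another A */
    return distance_squared(self->x, self->y, self->z,
                            msg->x, msg->y, msg->z)
           < self->iradius * self->iradius;
}
```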

    The third function is called checkbondtries. It processes the bond messages that have 3
    on the bindunbind tag and decides whether to make the bond. Once the ids of both
    associated molecules have been checked, a new bond message is sent. It carries the same
    information but with 0 on the bindunbind tag (0 means bind) and 0.0 for the distance,
    since the distance is no longer useful after the molecules bind. The molecule B is also
    freed from memory (it disappears) by a return 1 in an if-else statement.
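
    The text does not spell out how this model decides between several tries. One plausible
    rule, which is how Checkbondtries is described for the NF-κB model, is to accept the
    try from the closest sender. The C sketch below shows that selection only; the struct
    and function names are hypothetical and the rule itself is an assumption here.

```c
#include <assert.h>
#include <stddef.h>

/* One received bond message carrying tag 3 (a try for a bond). */
struct bond_try {
    int sender_id;
    double distance;
};

/* Return the index of the try with the smallest distance, or -1 if there
   are no tries. On a tie the earliest try wins. */
int closest_try(const struct bond_try *tries, int n)
{
    int i, best = -1;
    assert(tries != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (best < 0 || tries[i].distance < tries[best].distance)
            best = i;
    return best;
}
```

    The winning sender would then receive the bond message with 0 on the tag, and all other
    tries would be ignored.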

    The fourth and last function in this model is move. It processes the bond messages that
    have 0 on the bindunbind tag and makes the bond, which means the molecule's state is
    also changed here (from 0 to 1 in this model). The rest of the function controls the
    movement of the molecules. Using Cartesian coordinates for location and polar
    coordinates for movement, all the molecules are restricted to the inside of the box.
    The movement follows Brownian motion: the molecules move freely inside the box within
    the defined speed range.

    The last part is the Messages part. As in the design chapter, two kinds of messages are
    defined: the location message and the bond message.

    That is all for the main .xml file. Now we need an initial iteration .xml file and a
    visualisation programme. The programme that creates the initial iteration file and the
    visualisation programme are both part of the X-machine framework, and Simon Coakley
    has already written some examples, so they only need to be adapted to suit this model.

    The programme that creates the initial iteration file is written in C. First, some
    variables are defined: the initial numbers of molecules and their moving speeds. The
    main part consists of for loops, one per type of molecule, each generating the assigned
    number of molecules with random coordinates and speeds.
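
    A generation loop of this kind might look like the following C sketch. It is not the
    framework's example programme: the struct, the function names, the use of rand() and
    the uniform distribution over a box starting at the origin are all assumptions, and
    writing the result out as an .xml file is left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record for one generated molecule. */
struct molecule {
    int id;
    int state;
    double x, y, z;   /* Cartesian position inside the box */
    double speed;     /* moving speed */
};

/* Uniform random double in [lo, hi], built on the standard rand(). */
static double random_between(double lo, double hi)
{
    return lo + (hi - lo) * ((double)rand() / (double)RAND_MAX);
}

/* Fill out[0..count-1] with molecules of one type. Ids are numbered from
   first_id upwards. Returns the next unused id. */
int generate_molecules(struct molecule *out, int count, int state,
                       int first_id, double box_length,
                       double min_speed, double max_speed)
{
    int i;
    assert(out != NULL && count >= 0);
    assert(box_length > 0.0 && min_speed <= max_speed);
    for (i = 0; i < count; i++) {
        out[i].id = first_id + i;
        out[i].state = state;
        out[i].x = random_between(0.0, box_length);
        out[i].y = random_between(0.0, box_length);
        out[i].z = random_between(0.0, box_length);
        out[i].speed = random_between(min_speed, max_speed);
    }
    return first_id + count;
}
```

    Calling generate_molecules once per molecule type, passing on the returned id each
    time, mirrors the one-loop-per-type layout described above.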

    The visualisation programme is also written in C and uses some OpenGL libraries. It
    reads each of the iteration files and displays each type of molecule in a different
    colour. As the iteration file changes, the molecules are displayed as moving objects.
    It can also rotate the view, save each iteration's display as an image, and so on.
    Figure 5.2 shows some images of the model visualisation:

    Figure 5.2 Visualisation of Chemical Interaction Model


    Section 5.1.2: Implementation of the NF-κB Signalling Pathway Model

    The Matlab model of the NF-κB signalling pathway is similar to the chemical interaction
    model, but it is more complex and involves many more molecules and receptors. The model
    works as follows: first, it defines the constants it needs; second, it assigns initial
    positions in spherical polar coordinates and also converts them into Cartesian
    coordinates; third, it uses a big while loop to carry out the interactions according to
    defined sets of rules; lastly, the results are drawn on a graph of concentration
    against time.

    With the state numbers already defined, the constants and variables come next. The
    only constant defined here is the receptor delay, with a value of 10. The variables are
    exactly the same as in the chemical interaction model: id number, state number, x, y,
    z, postheta, posphi, posr, movetheta, movephi and iradius.

    The States part is the same as in the last model: three states are defined, output,
    input and move, and their relationship follows Figure 5.1.

    The Functions part is where the difference between the NF-κB pathway model and the
    chemical interaction model is easy to see. As in the last model, there are four
    functions in this part: Outputdata, Inputdata, Checkbondtries and Move.

    The first function is Outputdata. This time it is not as simple as in the last model:
    it has two sub-parts. The first sends out the location message, which passes the
    molecule's id, state, x, y and z to the other molecules for them to act on. The second
    sub-part is for nuclear receptors. It checks the state and, if the molecule is an
    active nuclear receptor, decreases the receptor delay counter. Once the counter reaches
    zero, the receptor sends out a bond message to unbind the molecule that was bound to
    it. The bond message contains both molecules' states and has 1 on the bindunbind tag (1
    corresponds to unbind). The active nuclear receptor then releases the molecule that was
    bound to it, and the receptor's state changes to dormant.
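
    The receptor delay can be sketched in C as a counter that is decreased once per
    Outputdata call. The state numbers and the delay value of 10 come from the text; the
    struct, the function names and the assumption that the counter is loaded at the moment
    the receptor becomes active are illustrative only.

```c
#include <assert.h>

#define RECEPTOR_DELAY 10   /* the receptor delay constant from the text */

enum {
    IMPORT_DORMANT = 20, IMPORT_ACTIVE = 21,
    EXPORT_DORMANT = 30, EXPORT_ACTIVE = 31
};

/* Hypothetical receptor memory: its state and the remaining delay. */
struct receptor {
    int state;
    int delay;
};

/* Make a dormant receptor active and load the delay counter. */
void receptor_activate(struct receptor *r)
{
    assert(r != 0);
    assert(r->state == IMPORT_DORMANT || r->state == EXPORT_DORMANT);
    r->state = (r->state == IMPORT_DORMANT) ? IMPORT_ACTIVE : EXPORT_ACTIVE;
    r->delay = RECEPTOR_DELAY;
}

/* One Outputdata step for a receptor. Returns 1 when the delay has run out,
   meaning a bond message with tag 1 (unbind) should be sent; the receptor
   is then dormant again. Returns 0 in every other case. */
int receptor_tick(struct receptor *r)
{
    assert(r != 0);
    if (r->state != IMPORT_ACTIVE && r->state != EXPORT_ACTIVE)
        return 0;               /* dormant receptors do nothing */
    r->delay--;
    if (r->delay > 0)
        return 0;               /* still holding the molecule */
    r->state = (r->state == IMPORT_ACTIVE) ? IMPORT_DORMANT : EXPORT_DORMANT;
    return 1;
}
```

    With one iteration representing 0.5 seconds, a delay of 10 holds a molecule on the
    receptor for 5 seconds of simulated time under these assumptions.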

    The second function is Inputdata. It starts by setting some local variables for use in
    the function and then checks the bond messages one by one with a while loop. If a bond
    message has a bindunbind tag of 1, the state of the associated molecule is changed to
    place it in the appropriate region. There are six situations:

    1. If NF-κB is bound to an importing nuclear receptor, make it free in nucleus;
    2. If NF-κB is bound to an exporting nuclear receptor, make it free in cytoplasm;
    3. If NF-κB & IκBα is bound to an importing nuclear receptor, make it free in nucleus;
    4. If NF-κB & IκBα is bound to an exporting nuclear receptor, make it free in cytoplasm;
    5. If IκBα is bound to an importing nuclear receptor, make it free in nucleus;
    6. If IκBα is bound to an exporting nuclear receptor, make it free in cytoplasm.
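
    Using the state numbers from the design chapter, the six situations amount to a small
    lookup from the bound state to the released state. The C sketch below shows that
    mapping; the function name is hypothetical, and leaving every other state unchanged is
    an assumption.

```c
/* State of a molecule after an unbind message (tag 1) releases it from a
   nuclear receptor. States that are not bound to a receptor are returned
   unchanged. */
int state_after_release(int state)
{
    switch (state) {
    case 2:  return 6;   /* NF-kB on importing receptor -> free in nucleus */
    case 4:  return 0;   /* NF-kB on exporting receptor -> free in cytoplasm */
    case 3:  return 7;   /* NF-kB & IkBa on importing receptor -> in nucleus */
    case 5:  return 1;   /* NF-kB & IkBa on exporting receptor -> in cytoplasm */
    case 11: return 13;  /* IkBa on importing receptor -> free in nucleus */
    case 12: return 10;  /* IkBa on exporting receptor -> free in cytoplasm */
    default: return state;
    }
}
```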


    This function then gets the location messages for each molecule. It first checks whether
    the location message was sent by the molecule itself. If it was not, it checks the
    distance between the molecule and the message sender. If the squared distance is less
    than radius², it checks whether their states match any of the following four situations
    (s: sender, r: receiver of the location message):

    1. r: NF-κB free in cytoplasm and s: IκBα free in cytoplasm;
    2. r: NF-κB free in nucleus and s: IκBα free in nucleus;
    3. r: dormant nuclear importing receptor and s: (NF-κB free in cytoplasm, IκBα free in
       cytoplasm or NF-κB & IκBα free in cytoplasm);
    4. r: dormant nuclear exporting receptor and s: (NF-κB free in nucleus, IκBα free in
       nucleus or NF-κB & IκBα free in nucleus).

    If any of the four situations is matched, a bond message with a bindunbind tag of 3 (a
    try for a bond) is sent.
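
    The four situations can be expressed as a single C predicate over the receiver and
    sender states. The sketch uses the state numbers from the design chapter and reads
    "NF-κB & IκBα free in cytoplasm" as state 1 and "NF-κB & IκBα free in nucleus" as
    state 7; that reading and the function name are assumptions.

```c
/* Returns 1 if a receiver in state r may send a bond try (tag 3) to a
   sender in state s, and 0 otherwise. */
int can_try_bond(int r, int s)
{
    switch (r) {
    case 0:    /* NF-kB free in cytoplasm */
        return s == 10;
    case 6:    /* NF-kB free in nucleus */
        return s == 13;
    case 20:   /* dormant nuclear importing receptor */
        return s == 0 || s == 10 || s == 1;
    case 30:   /* dormant nuclear exporting receptor */
        return s == 6 || s == 13 || s == 7;
    default:
        return 0;
    }
}
```

    The distance test would be applied before this predicate, so that only nearby pairs
    with compatible states produce a bond try.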

    The third function is Checkbondtries. It only processes bond messages that have 3 on
    the bindunbind tag. When it gets such a message, it checks whether the sender is the
    closest in distance; if so, the two molecules are bound to each other. First it sends a
    bond message with 0 on the bindunbind tag, which means bind. And then change