Urop poster 2014 benson and mat 5 13 14 (2)


Computational methods for simulating biochemical networks with noise
Mathew Shum and Benson Lim

Department of Chemical Engineering and Materials Science, University of California, Irvine

References
1. Wang, J., Xu, L., Wang, E., & Huang, S. (2010). The potential landscape of genetic circuits imposes the arrow of time in stem cell differentiation. Biophysical Journal, 99(1), 29-39.
2. Wang, J., Zhang, K., Xu, L., & Wang, E. (2011). Quantifying the Waddington landscape and biological paths for development and differentiation. Proceedings of the National Academy of Sciences, 108(20), 8257-8262.
3. Adelman, J. L., & Grabe, M. (2013). Simulating rare events using a weighted ensemble-based string method. The Journal of Chemical Physics, 138(4), 044105.
Kreyszig, E. (2010). Advanced Engineering Mathematics. John Wiley & Sons.
4. Gillespie, D. T. (1977). Exact stochastic simulation of coupled chemical reactions. The Journal of Physical Chemistry, 81(25), 2340-2361.
5. Heath, M. T. (2002). Scientific Computing: An Introductory Survey, 2nd edition. McGraw-Hill, New York. See Section 9.3.3, especially Example 9.9.

Acknowledgments
Margaret J. Tse (PhD student), Dr. Elizabeth L. Read
UROP (Undergraduate Research Opportunity Program)

Conclusion and Future Direction

Future Work:
Both the Langevin approach and the Fokker-Planck approach involve different approximations of stochastic dynamics. We plan to compare the output of these methods on identical biochemical networks in order to determine whether the predicted results are the same. We plan to use the existing MATLAB codes to simulate two network models (the "general" and the "exclusive" genetic toggle switch) using three different methods: the Gillespie algorithm, the averaged steady-state distribution, and the Fokker-Planck equation. Next, we will carry out systematic simulations of the two network models with identical parameters using all three approaches. Quantitative comparison of the predicted steady-state distributions will be carried out by calculating the sum of the squared residuals. Computational efficiency will be recorded as the processor time required to complete the simulations.
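The comparison described above can be sketched as follows. This is a minimal illustration in Python rather than the poster's MATLAB code; the function names and the two example distributions are hypothetical, and the numbers are made up purely to show the calculation.

```python
import time


def sum_squared_residuals(p, q):
    """Sum of squared differences between two distributions on the same bins."""
    if len(p) != len(q):
        raise ValueError("distributions must be sampled on the same bins")
    return sum((a - b) ** 2 for a, b in zip(p, q))


def timed(func, *args):
    """Run func(*args) and return (result, processor seconds used)."""
    start = time.process_time()
    result = func(*args)
    return result, time.process_time() - start


# Hypothetical steady-state distributions from two methods (illustrative values only).
p_method_a = [0.10, 0.40, 0.40, 0.10]
p_method_b = [0.12, 0.38, 0.38, 0.12]

ssr, cpu_seconds = timed(sum_squared_residuals, p_method_a, p_method_b)
print(f"SSR = {ssr:.4f}, processor time = {cpu_seconds:.6f} s")
```

In practice the timed call would wrap a full simulation run rather than the residual calculation, and both distributions would need to be binned on the same grid before they are compared.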

Conclusion
The Langevin equation can be used to simulate individual trajectories on a given potential surface. The Fokker-Planck equation can be used to simulate the time-dependent behavior of the probability distribution. Our MATLAB scripts for these two methods were validated against previous work. Our codes are being used to help develop more efficient simulation methods in the Read lab.


Results  

Genetic regulatory network model:
The genetic toggle switch involves two genes that mutually repress each other. This type of network is thought to underlie many types of cell differentiation, in which a cell can differentiate into one of two alternative types. The figures show the distribution of expression of the two genes in a population of cells. The population of cells splits into two types, with some expressing more of Gene 1 and some expressing more of Gene 2. Computer memory limitations were a challenge: the simulations shown have not reached steady state.

2D Ring Potential:
The 2D ring potential has a potential landscape that consists of four attractor basins (wells) and a circular driving force. The multiple attractor basins are the locations of the long-lived states. The circular driving force is depicted in Figure 4. We calculated the potential surface using the equations shown below:

Periodic 2D system
The periodic two-dimensional model has a single potential well with a vertical driving force, along with periodic boundary conditions. The driving force pulls trajectories downward through the attractor basin. The attractor basin indicates where trajectories will most likely land. The trajectories represent pathways along which the reaction will most likely proceed. However, the potential surface alone does not show the true path the reaction will take. We calculated the potential surface using the Langevin equation and validated our code by comparing our figures to reference [3], shown to the right.

We were also able to calculate individual stochastic trajectories on the surface using the Langevin equation. The wells in the potential surface determine where the trajectories spend most of their time, while the circular driving force determines the average direction of motion.

Figure 5: Normalized probability surface of a genetic toggle switch. (A) Probability surface of the genetic switch at t = 0.1 s, where most of the cells are expressing both genes at an intermediate level. (B) Probability surface at t = 4.5 s, where the probability has split into two attractor basins. (C) Probability surface at t = 20 s, where the probability has converged back into a single attractor basin.

Figure 3: Periodic potential surface
Figure 4: Single trajectory for the 2D ring
[Axis labels: X₁, X₂, 𝜙]

[Figure 5 panels: A. t = 0.1 s, B. t = 4.5 s, C. t = 20 s. Axis labels: Gene 1 (x₁), Gene 2 (x₂), P(x)]

Figure 2: Periodic potential surface for the 2D system
[Axis labels: X₁, X₂, 𝜙]

Figure 3: Periodic potential surface for the 2D system according to reference [3]

Equation 1 is the 2D ring potential surface (background).
Equation 2 determines the trajectory positions (black).

x₁ and x₂ represent the expression levels of the two genes X₁ and X₂.
a₁ and a₂ represent the self-activation strengths.
b₁ and b₂ give the mutual inhibition strengths.
k₁ and k₂ are the degradation constants.
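The toggle-switch equations themselves did not survive in this transcript. As a rough illustration of how self-activation, mutual inhibition, and degradation terms can combine, the sketch below integrates a deterministic two-gene model of the Hill-function type used in references [1] and [2]. The functional form, the threshold `s`, the Hill coefficient `n`, the use of equal parameters for both genes, and all numerical values are assumptions, not the poster's actual model.

```python
def hill_activation(x, s, n):
    """Activating Hill function: rises from 0 to 1 as x passes the threshold s."""
    return x ** n / (s ** n + x ** n)


def hill_repression(x, s, n):
    """Repressing Hill function: falls from 1 to 0 as x passes the threshold s."""
    return s ** n / (s ** n + x ** n)


def toggle_rhs(x1, x2, a=1.0, b=1.0, k=1.0, s=0.5, n=4):
    """Rates of change: self-activation (a), mutual inhibition (b), degradation (k)."""
    dx1 = a * hill_activation(x1, s, n) + b * hill_repression(x2, s, n) - k * x1
    dx2 = a * hill_activation(x2, s, n) + b * hill_repression(x1, s, n) - k * x2
    return dx1, dx2


def integrate(x1, x2, dt=0.01, steps=5000):
    """Forward-Euler integration of the noise-free model."""
    for _ in range(steps):
        dx1, dx2 = toggle_rhs(x1, x2)
        x1 += dt * dx1
        x2 += dt * dx2
    return x1, x2


# Two initial conditions on opposite sides of the diagonal settle into
# opposite states: one with gene 1 dominant, one with gene 2 dominant.
high1 = integrate(1.5, 0.1)
high2 = integrate(0.1, 1.5)
print("start (1.5, 0.1) ->", high1)
print("start (0.1, 1.5) ->", high2)
```

Without noise, each starting point stays in its own basin; the stochastic methods on this poster describe how a population spreads over, and moves between, such basins.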

 

Abstract

Computational models of complex biochemical networks can be useful in understanding the regulation of gene expression. One important consideration in developing such models is how to include the effects of noise, because gene expression has been demonstrated to be stochastic. We studied three model systems in this project in order to explore different computational methods for simulating biochemical systems. First, we developed MATLAB scripts that sample trajectories for two toy model systems, which share some features with biochemical networks. Finally, we developed a script to model a small gene network using the Fokker-Planck equation, which relates to the biochemical kinetics and stochastic fluctuations.

Background Information

"Biochemical networks" refers to the complex interactions between biomolecules in cells. An example is a gene-regulatory network, in which different genes can interact with each other. By taking into account randomness (or noise), stochastic simulations of biochemical networks can give insight into cellular behavior. Gene expression has been found to be a stochastic process, which means that the number of copies of a protein encoded by a gene can vary randomly over time. For stochastic processes, the outcome can be predicted only in a probabilistic way. The steady-state probability distribution gives the likelihood of the system existing in each possible state. In the probability distribution, the peaks give the most likely observable states. The potential surface is inversely related to the probability distribution, so its wells correspond to the peaks of the distribution and represent the most likely states. The potential surface is analogous to a gravitational potential, where a ball sitting on a slope must roll downhill until it reaches the bottom of the well.
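One common way to make the relation between probability and potential concrete is the definition used in references [1] and [2], where the potential is the negative logarithm of the steady-state probability. The sketch below applies that definition to a small made-up distribution; the function name, the `floor` parameter, and the numbers are illustrative assumptions, not results from the poster.

```python
import math


def potential_from_probability(p, floor=1e-12):
    """Return U_i = -ln(P_i) after normalizing P; floor avoids log(0)."""
    total = sum(p)
    return [-math.log(max(value / total, floor)) for value in p]


# Hypothetical bimodal steady-state distribution over five states.
p = [0.05, 0.40, 0.10, 0.40, 0.05]
u = potential_from_probability(p)

# Wells are local minima of the potential, i.e. peaks of the probability.
wells = [i for i in range(1, len(u) - 1) if u[i] < u[i - 1] and u[i] < u[i + 1]]
print("potential:", [round(value, 2) for value in u])
print("well indices:", wells)
```

The two peaks of the distribution become the two wells of the potential, which is the picture behind the rolling-ball analogy.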

Methodology
We studied two toy model systems and a small gene regulatory model. Each toy model had a defined potential function, and the driving forces share some features of biochemical networks because the systems can exist in a nonequilibrium state. We also developed MATLAB scripts to simulate the dynamics of these particular systems. We used two simulation methods that include noise: the Langevin equation and the Fokker-Planck equation.

[Figure: schematic of the steady-state probability distribution P(x) and the corresponding potential surface, both plotted against x. Labels: reactants, products, stable (observable) states.]

Langevin equation symbols:
x: hypothetical particle position
m: hypothetical particle mass
η(t): noise term
λ: constant (damping coefficient)
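The poster's equation image is not recoverable from this transcript. The symbols listed here match the standard Langevin equation of motion, so the form below is offered as an assumption based on that standard form, with \phi denoting the potential surface:

```latex
m \frac{d^{2}\mathbf{x}}{dt^{2}}
  = -\nabla \phi(\mathbf{x})
    - \lambda \frac{d\mathbf{x}}{dt}
    + \boldsymbol{\eta}(t)
```

A non-conservative driving force, such as the circular force in the 2D ring model, would enter as an additional term on the right-hand side.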

 

The Langevin equation takes a potential function and defines a deterministic driving force. Noise is then included as an additive term. We simulated this by applying the backward Euler method and adding a random displacement at each time point.
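A minimal sketch of this kind of trajectory simulation is given below. It is not the poster's MATLAB code: it uses the explicit Euler-Maruyama update rather than backward Euler, assumes the overdamped limit (the inertial term is dropped), and uses an illustrative one-dimensional double-well potential. The names `dphi_dx`, `simulate_langevin`, `damping`, and `noise_strength` are all hypothetical.

```python
import math
import random


def dphi_dx(x):
    """Gradient of the illustrative double-well potential phi(x) = (x^2 - 1)^2."""
    return 4.0 * x * (x * x - 1.0)


def simulate_langevin(x0, steps, dt=1e-3, damping=1.0, noise_strength=0.2, seed=0):
    """Overdamped Langevin trajectory via the explicit Euler-Maruyama scheme."""
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * noise_strength * dt)  # std. dev. of each random kick
    x = x0
    xs = [x0]
    for _ in range(steps):
        drift = -dphi_dx(x) / damping              # deterministic driving force
        x = x + drift * dt + sigma * rng.gauss(0.0, 1.0)
        xs.append(x)
    return xs


trajectory = simulate_langevin(x0=1.0, steps=20000)
mean_abs = sum(abs(x) for x in trajectory) / len(trajectory)
print(f"mean |x| over the trajectory = {mean_abs:.2f}")
```

The trajectory spends most of its time near the wells at x = ±1, which is the behavior described for the wells of the potential surface. An implicit (backward Euler) step would instead solve for the new position inside the drift term at each time point.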

The Fokker-Planck equation (an advection-diffusion equation) models the probabilistic behavior of a system. We solved it with the finite-difference method, a method for solving partial differential equations, to simulate the time-dependent behavior of the probability distribution. In this approach, the random variation in gene expression is analogous to the random diffusive motion of a particle.
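The finite-difference idea can be sketched in one dimension as follows. The poster does not state which scheme was used, so this example assumes an explicit time step with upwind advection and zero-flux boundaries, a linear restoring drift, and a constant diffusion coefficient. All names and parameter values are illustrative, and the actual gene network model is two-dimensional.

```python
def fokker_planck_step(p, x, dx, dt, drift, diffusion):
    """One explicit step of dP/dt = -d(f P)/dx + D d2P/dx2 in flux form."""
    n = len(p)
    # flux[i] is the probability flux across the face between cells i-1 and i.
    # flux[0] and flux[n] stay zero, so no probability leaves the domain.
    flux = [0.0] * (n + 1)
    for i in range(1, n):
        f = drift(0.5 * (x[i - 1] + x[i]))
        upwind = p[i - 1] if f > 0 else p[i]       # take P from the upstream cell
        flux[i] = f * upwind - diffusion * (p[i] - p[i - 1]) / dx
    return [p[i] - dt / dx * (flux[i + 1] - flux[i]) for i in range(n)]


n, dx, dt = 61, 0.1, 0.002
x = [-3.0 + i * dx for i in range(n)]
p = [0.0] * n
p[45] = 1.0 / dx                                   # all probability starts near x = 1.5

for _ in range(2500):                              # integrate to t = 5
    p = fokker_planck_step(p, x, dx, dt, drift=lambda v: -1.0 * v, diffusion=0.5)

total = sum(p) * dx
mean = sum(xi * pi for xi, pi in zip(x, p)) * dx
print(f"total probability = {total:.6f}, mean position = {mean:.3f}")
```

Writing the update in terms of fluxes keeps the total probability constant from step to step, and the time step is chosen small enough for the explicit scheme to remain stable. The initial spike relaxes toward a single peak centered on the bottom of the well.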