From RNA-seq Time Series Data
to Models of Regulatory Networks
Konstantin Mischaikow, Dept. of Mathematics, Rutgers
Brown, Dec 2017
A Model Problem
Estimated number of malaria cases in 2010: between 219 and 550 million
Estimated number of deaths due to malaria in 2010: 600,000 to 1,240,000
Malaria may have killed half of all the people that ever lived. And more people are now infected than at any point in history. There are up to half a billion cases every year, and about 2 million deaths - half of those are children in sub-Saharan Africa. J. Whitfield, Nature, 2002
Resistance is now common against all classes of antimalarial drugs apart from artemisinins. … Malaria strains found on the Cambodia–Thailand border are resistant to combination therapies that include artemisinins, and may therefore be untreatable.
World Health Organization
Malaria is of great public health concern, and seems likely to be the vector-borne disease most sensitive to long-term climate change.
World Health Organization
Malaria
Malaria: P. falciparum 48 hour cycle
1-2 minutes
Task: Understand the regulation on the genetic/biomolecular level with the goal of affecting the dynamics with drugs.
A proposed network
A differential equation $dx/dt = f(x, \lambda)$ is probably a reasonable model for the dynamics, but I do not have an analytic description of $f$ or estimates of the parameters $\lambda$.
Malaria is:
Sequenced
Poorly annotated
[Figure: expression heat maps for all genes (5409) and for periodic genes (43). Color scale: standard deviations from mean expression (z-score), from -1.5 (low) to 1.5 (high). Horizontal axis: time in vitro (hours), 0 to 60.]
Walter Reed Army Inst. Research Duke University
A Philosophical Interlude
The Lac Operon (Ozbudak et al., Nature 2004)
Network Model
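$$\frac{1}{\tau_y}\dot{y} = \alpha\,\frac{R_T}{R_T + R(x)} - y$$
$$\frac{1}{\tau_x}\dot{x} = \beta y - x$$
$$R(x) = \frac{R_T}{1 + \left(\frac{x}{x_0}\right)^n}$$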
ODE Model
Data
ODEs are great modeling tools, but should be handled with care.
Parameter values:
$$\alpha = \frac{84.4}{1 + (G/8.1)^{1.2}} + 16.1$$
$$\beta = \ldots$$
Classical Qualitative Representation of Dynamics
Dynamic Signature (Morse Graph)
Not precise, accurate, rigorous
Precise, not accurate, not rigorous
What does it mean to solve an ODE?
Conley-Morse Chain Complex
model, “truth”
parameter
Combinatorial Dynamics
[Figure: directed graph on the vertices p0, p1, p2, p3]
Vertices: States
Edges: Dynamics
Don’t know exact current state, so don’t know exact next state
Simple decomposition of Dynamics:
Recurrent
Nonrecurrent (gradient-like)
Linear time Algorithm!
Morse Graph of state transition graph
State Transition Graph $F : X \rightrightarrows X$
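The decomposition into recurrent and nonrecurrent parts can be sketched with a strongly connected component computation, which runs in time linear in the number of vertices and edges. The following is a minimal illustration, not the software used in the talk; the toy graph and the function names are hypothetical, and the reachability step is written for clarity rather than speed.

```python
def strongly_connected_components(graph):
    """Iterative Tarjan algorithm; graph maps every vertex to a list of successors."""
    index, low, on_stack = {}, {}, set()
    stack, components, counter = [], [], 0
    for root in graph:
        if root in index:
            continue
        index[root] = low[root] = counter
        counter += 1
        stack.append(root)
        on_stack.add(root)
        work = [(root, iter(graph[root]))]
        while work:
            v, successors = work[-1]
            advanced = False
            for w in successors:
                if w not in index:
                    index[w] = low[w] = counter
                    counter += 1
                    stack.append(w)
                    on_stack.add(w)
                    work.append((w, iter(graph[w])))
                    advanced = True
                    break
                if w in on_stack:
                    low[v] = min(low[v], index[w])
            if advanced:
                continue
            work.pop()
            if work:
                parent = work[-1][0]
                low[parent] = min(low[parent], low[v])
            if low[v] == index[v]:
                comp = set()
                while True:
                    w = stack.pop()
                    on_stack.discard(w)
                    comp.add(w)
                    if w == v:
                        break
                components.append(frozenset(comp))
    return components


def morse_graph(graph):
    """Recurrent components plus the reachability relation between them."""
    comps = strongly_connected_components(graph)
    # A component is recurrent if it contains a cycle (size > 1 or a self-loop).
    recurrent = [c for c in comps if len(c) > 1 or any(v in graph[v] for v in c)]

    def reachable(start):
        seen, todo = set(start), list(start)
        while todo:
            v = todo.pop()
            for w in graph[v]:
                if w not in seen:
                    seen.add(w)
                    todo.append(w)
        return seen

    order = {(a, b) for a in recurrent for b in recurrent
             if a != b and b <= reachable(a)}  # (a, b): dynamics can go from a to b
    return recurrent, order


# Hypothetical toy graph: a 2-cycle {a, b} that leaks into a fixed vertex d.
toy = {"a": ["b"], "b": ["a", "c"], "c": ["d"], "d": ["d"], "e": ["a"]}
nodes, order = morse_graph(toy)
```

Here `c` and `e` are nonrecurrent (gradient-like), while `{a, b}` and `{d}` are the Morse nodes.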
[Figure: poset on the elements p0, p1, p2, p3]
Morse Graph of $F : X \rightrightarrows X$
Join irreducibles $J^{\vee}(A)$
Birkhoff’s Theorem implies that the Morse graph and the lattice of Attractors are equivalent.
What is observable? $A \subset X$ is an attractor if $F(A) = A$
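Lower sets $O(M)$: $\emptyset$, $\{p_0\}$, $\{p_1\}$, $\{p_1, p_0\}$, $\{p_2, p_1, p_0\}$, $\{p_3, p_2, p_1, p_0\}$
Computable
Observable
Lattice of attractors of $F : X \rightrightarrows X$
$\vee = \cup$, $\wedge$ = maximal attractor in $\cap$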
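The correspondence between a poset and its lattice of lower sets can be checked on the four-element example. The sketch below assumes the order read off from the lower sets listed on the slide (p0 and p1 minimal, p2 above both, p3 above p2); the function names are hypothetical.

```python
from itertools import combinations

# below[p] = elements strictly below p (assumed from the lower sets on the slide).
below = {"p0": set(), "p1": set(), "p2": {"p0", "p1"}, "p3": {"p0", "p1", "p2"}}


def lower_sets(below):
    """All subsets that contain everything below each of their members."""
    elems = sorted(below)
    result = []
    for r in range(len(elems) + 1):
        for subset in combinations(elems, r):
            s = frozenset(subset)
            if all(below[p] <= s for p in s):
                result.append(s)
    return result


def join_irreducibles(lattice):
    """Nonempty members that are not the union of two strictly smaller members."""
    out = []
    for a in lattice:
        if not a:
            continue
        proper = [b for b in lattice if b < a]
        if not any(b | c == a for b in proper for c in proper):
            out.append(a)
    return out


O = lower_sets(below)
J = join_irreducibles(O)
```

The lattice has six elements and four join irreducibles, one per poset element, which is the content of Birkhoff's theorem in this small case.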
Topology (differential equations are not defined on discrete sets)
Let $X$ be a compact metric space (phase space).
Let $R(X)$ denote the lattice of regular closed subsets of $X$.
Infinite unbounded lattice
Level of measurement; applicable scale for model
Let $L$ be a finite bounded sublattice of $R(X)$.
Topology
Dynamics
Use Birkhoff to define poset $(P := J^{\vee}(A), <)$
Remark: I have purposefully ignored the relation between $L$ and $F : X \rightrightarrows X$
$G(L)$ denotes the atoms of $L$, the “smallest” elements of $L$
For each $p \in P$ define a Morse tile $M(p) := \mathrm{cl}(A \setminus \mathrm{pred}(A))$
Declare a bounded sublattice $A \subset L$ to be a lattice of forward invariant regions (attracting blocks).
Example
Morse tiles $M(p)$
Let $F'(x) = -f(x)$.
Phase space: $X = [-4, 4] \subset \mathbb{R}$
Atoms of lattice: $G(L) = \{[n, n+1] \mid n = -4, \ldots, 3\}$
[Figure: graph of $F$ over $[-4, 4]$; poset $P$ (via Birkhoff) with elements 1, 2, 3]
How does this relate to a differential equation $dx/dt = f(x)$?
[Figure: graph of $F$ over $[-4, 4]$ with the attracting blocks $A$ marked]
Lattice of attracting blocks: $A = \{[-3,-1],\ [1,3],\ [-3,-1] \cup [1,3],\ [-4,4]\}$
Attracting blocks are regions of phase space that are forward invariant with time.
Remark: This leads to a homology theory
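The Morse tiles of this example can be computed directly from the lattice of attracting blocks. The sketch below represents each block by the atoms $[n, n+1]$ it contains and adds the empty set as the bottom element, which is an assumption (the slide lists only the nonempty blocks); the helper names are hypothetical.

```python
def block(*intervals):
    """Atoms (indexed by left endpoint n) covering a union of intervals [a, b]."""
    return frozenset(n for a, b in intervals for n in range(a, b))


# Lattice of attracting blocks from the example; block() is the assumed bottom.
A = [block(), block((-3, -1)), block((1, 3)),
     block((-3, -1), (1, 3)), block((-4, 4))]


def morse_tiles(lattice):
    """Map each join-irreducible block a to the atoms of cl(a minus pred(a))."""
    tiles = {}
    for a in lattice:
        if not a:
            continue
        proper = [b for b in lattice if b < a]
        pred = frozenset().union(*proper)
        if pred != a:  # a is join irreducible, pred is its unique predecessor
            tiles[a] = a - pred
    return tiles


def as_intervals(atoms):
    """Merge consecutive atoms into closed intervals."""
    out = []
    for n in sorted(atoms):
        if out and out[-1][1] == n:
            out[-1][1] = n + 1
        else:
            out.append([n, n + 1])
    return [tuple(i) for i in out]


tiles = morse_tiles(A)
top_tile = as_intervals(tiles[block((-4, 4))])
```

Under these assumptions the three tiles are $[-3,-1]$, $[1,3]$, and $[-4,-3] \cup [-1,1] \cup [3,4]$.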
Switching Systems (an example of how to use these ideas)
Choosing $L$ and $F : X \rightrightarrows X$
Biological Model
Assume $x_i$ decays: $dx_i/dt = -\gamma_i x_i$
$dx_i/dt = -\gamma_i x_i + \Lambda_i(x)$
$dx_i/dt = -\gamma_i x_i + \Lambda_i(x_j)$
How do I want to interpret this information?
What differential equation do I want to use?
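Proposed model:
[Figure: step-function plots with axes $x_1$ and $dx_2/dt$, threshold $\theta_{2,1}$, and levels $l_{2,1}$ and $u_{2,1}$]
$x_1$ represses the production of $x_2$ (repressing edge from node 1 to node 2).
$x_1$ activates the production of $x_2$ (activating edge from node 1 to node 2).
Parameters: 1 per node, 3 per edge
For $x_1 < \theta_{2,1}$ we ask about $\mathrm{sign}(-\gamma_2 x_2 + u_{2,1})$.
For $x_1 > \theta_{2,1}$ we ask about $\mathrm{sign}(-\gamma_2 x_2 + l_{2,1})$.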
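$x_i$ denotes the amount of species $i$.
$$\sigma^-_{j,i}(x_i) = \begin{cases} u_{j,i} & \text{if } x_i < \theta_{j,i} \\ l_{j,i} & \text{if } x_i > \theta_{j,i} \end{cases}$$
Focus on the sign of $-\gamma_i x_i + \sigma^-_{i,j}(x_j)$ and of $-\gamma_i x_i + \sigma^+_{i,j}(x_j)$.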
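As a small illustration of the switching nonlinearity, the sketch below implements the repressing switch as defined on the slide and, by assumption, the activating switch as its mirror image (low value below the threshold, high value above). The numerical values are hypothetical and chosen only to satisfy $0 < l < u$.

```python
def sigma_minus(x, theta, l, u):
    """Repressing switch: u below the threshold, l above it."""
    if x == theta:
        raise ValueError("switch is not defined at the threshold")
    return u if x < theta else l


def sigma_plus(x, theta, l, u):
    """Activating switch (assumed mirror image): l below the threshold, u above."""
    if x == theta:
        raise ValueError("switch is not defined at the threshold")
    return l if x < theta else u


def rate_sign(x_i, gamma_i, production):
    """Sign of -gamma_i * x_i + production, as +1, 0, or -1."""
    value = -gamma_i * x_i + production
    return (value > 0) - (value < 0)


# Hypothetical numbers: x1 represses x2 with theta = 1, l = 0.5, u = 2, gamma_2 = 1.
sign_low = rate_sign(1.5, 1.0, sigma_minus(0.5, 1.0, 0.5, 2.0))   # x1 below threshold
sign_high = rate_sign(1.5, 1.0, sigma_minus(3.0, 1.0, 0.5, 2.0))  # x1 above threshold
```

With $x_2 = 1.5$, production wins when $x_1$ is below its threshold and decay wins when $x_1$ is above it.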
Phase space: $X = (0, \infty)^2$
Parameter space is a subset of $(0, \infty)^8$
[Figure: phase plane with axes $x_1$, $x_2$ and thresholds $\theta_{2,1}$, $\theta_{1,2}$]
Fix z a regular parameter value.
$z$ is a regular parameter value if
$0 < \gamma_i$,
$0 < l_{i,j} < u_{i,j}$,
$0 < \theta_{i,k} \neq \theta_{j,k}$, and
$0 \neq -\gamma_i\theta_{j,i} + \Lambda_i(x)$
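Example: network on nodes 1 and 2
We care about the sign of:
$-\gamma_2\theta_{1,2} + \sigma^+_{2,1}(x_1)$
$-\gamma_1\theta_{2,1} + \sigma^-_{1,2}(x_2)$
If $x_1 < \theta_{2,1}$ and $-\gamma_2\theta_{1,2} + \sigma^+_{2,1}(x_1) > 0$
If $x_1 < \theta_{2,1}$ and $-\gamma_2\theta_{1,2} + \sigma^+_{2,1}(x_1) < 0$
[Figure: phase plane with axes $x_1$, $x_2$ and thresholds $\theta_{2,1}$, $\theta_{1,2}$]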
Need to Construct State Transition Graph $F_z : X \rightrightarrows X$
Fix z a regular parameter value.
Vertices: $X$ corresponds to all rectangular domains and co-dimension 1 faces defined by the thresholds $\theta$.
Edges:
Faces pointing in map to their domain.
Domains map to their faces pointing out.
If a domain has no outpointing faces, it maps to itself.
[Figure: phase plane with axes $x_1$, $x_2$ and thresholds $\theta_{2,1}$, $\theta_{1,2}$]
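The vertex and edge rules above can be sketched for the two-node example in which node 1 activates node 2 ($\sigma^+_{2,1}$) and node 2 represses node 1 ($\sigma^-_{1,2}$). This is a minimal illustration under stated assumptions, not the DSGRN implementation: there is one threshold per variable, there is no self-regulation (so each face is crossed in the same direction from both sides), and the parameter values are hypothetical.

```python
def switch(kind, side, l, u):
    """Value of a switch on the low (0) or high (1) side of its threshold."""
    if kind == "+":
        return u if side == 1 else l
    return l if side == 1 else u


def state_transition_graph(p):
    """Domains are pairs (a, b); faces are ("x1", b) and ("x2", a)."""
    domains = [(a, b) for a in (0, 1) for b in (0, 1)]
    edges = {d: set() for d in domains}
    # Faces on x1 = theta21: direction is the sign of -gamma1*theta21 + sigma^-_{1,2}.
    for b in (0, 1):
        s = -p["gamma1"] * p["theta21"] + switch("-", b, p["l12"], p["u12"])
        src, dst = ((0, b), (1, b)) if s > 0 else ((1, b), (0, b))
        edges[src].add(("x1", b))   # domain maps to its outpointing face
        edges[("x1", b)] = {dst}    # face maps to the domain it points into
    # Faces on x2 = theta12: direction is the sign of -gamma2*theta12 + sigma^+_{2,1}.
    for a in (0, 1):
        s = -p["gamma2"] * p["theta12"] + switch("+", a, p["l21"], p["u21"])
        src, dst = ((a, 0), (a, 1)) if s > 0 else ((a, 1), (a, 0))
        edges[src].add(("x2", a))
        edges[("x2", a)] = {dst}
    for d in domains:
        if not edges[d]:            # no outpointing faces: domain maps to itself
            edges[d].add(d)
    return edges


base = dict(gamma1=1.0, gamma2=1.0, theta21=1.0, theta12=1.0)
# Hypothetical values with l < gamma*theta < u on both edges.
cycle_graph = state_transition_graph(dict(base, l12=0.5, u12=2.0, l21=0.5, u21=2.0))
# Hypothetical values with gamma*theta < l < u on both edges.
fp_graph = state_transition_graph(dict(base, l12=2.0, u12=3.0, l21=2.0, u21=3.0))
```

With the first parameter choice every vertex has exactly one successor and the graph is a single cycle through all four domains; with the second, domain `(1, 1)` has no outpointing face and maps to itself.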
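Example: network on nodes 1 and 2
Fix $z$ a regular parameter value.
Constructing state transition graph $F_z : X \rightrightarrows X$
Check signs of $-\gamma_i\theta_{j,i} + \sigma^{\pm}_{i,j}(x_j)$
Assume: $l_{1,2} < \gamma_1\theta_{2,1} < u_{1,2}$ and $\gamma_2\theta_{1,2} < l_{2,1} < u_{2,1}$. Morse graph: FP(0,1)
Assume: $l_{1,2} < \gamma_1\theta_{2,1} < u_{1,2}$ and $l_{2,1} < \gamma_2\theta_{1,2} < u_{2,1}$. Morse graph: FC. Dynamics orders maxima and minima: $(M_1, M_2, m_1, m_2)$
[Figure: phase plane with axes $x_1$, $x_2$ and thresholds $\theta_{2,1}$, $\theta_{1,2}$]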
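To see how a cycle in the state transition graph orders maxima and minima, the sketch below walks the cycle of domains obtained when $l_{1,2} < \gamma_1\theta_{2,1} < u_{1,2}$ and $l_{2,1} < \gamma_2\theta_{1,2} < u_{2,1}$, assuming node 1 activates node 2 and node 2 represses node 1. It is a simplified illustration, not the polynomial-time algorithm mentioned later in the talk; `tendency` and `extrema` are hypothetical names.

```python
# Domains (a, b): a = 1 if x1 is above its threshold, b = 1 if x2 is above its threshold.
cycle = [(1, 0), (1, 1), (0, 1), (0, 0)]


def tendency(domain):
    """Direction (+1 up, -1 down) of x1 and x2 along the cycle in this domain."""
    a, b = domain
    d1 = +1 if b == 0 else -1  # node 2 represses node 1
    d2 = +1 if a == 1 else -1  # node 1 activates node 2
    return (d1, d2)


def extrema(cycle):
    """Record a maximum (M) or minimum (m) whenever a direction flips."""
    events = []
    for k, dom in enumerate(cycle):
        nxt = cycle[(k + 1) % len(cycle)]
        for i, (s, t) in enumerate(zip(tendency(dom), tendency(nxt)), start=1):
            if s > 0 > t:
                events.append(f"M{i}")
            elif s < 0 < t:
                events.append(f"m{i}")
    return events


sequence = extrema(cycle)
```

The resulting cyclic sequence agrees with the ordering $(M_1, M_2, m_1, m_2)$ shown on the slide.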
DSGRN Database
Parameter graph provides explicit partition of entire 8-D parameter space.
Observe that we can query this database for local or global dynamics.
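Input: Regulatory Network on nodes 1 and 2
Output: DSGRN database
(1) FP(1,1): $\gamma_1\theta_{2,1} < l_{1,2} < u_{1,2}$, $\gamma_2\theta_{1,2} < l_{2,1} < u_{2,1}$
(2) FP(0,1): $l_{1,2} < \gamma_1\theta_{2,1} < u_{1,2}$, $\gamma_2\theta_{1,2} < l_{2,1} < u_{2,1}$
(3) FP(0,1): $l_{1,2} < u_{1,2} < \gamma_1\theta_{2,1}$, $\gamma_2\theta_{1,2} < l_{2,1} < u_{2,1}$
(4) FP(1,1): $\gamma_1\theta_{2,1} < l_{1,2} < u_{1,2}$, $l_{2,1} < \gamma_2\theta_{1,2} < u_{2,1}$
(5) FC: $l_{1,2} < \gamma_1\theta_{2,1} < u_{1,2}$, $l_{2,1} < \gamma_2\theta_{1,2} < u_{2,1}$
(6) FP(0,0): $l_{1,2} < u_{1,2} < \gamma_1\theta_{2,1}$, $l_{2,1} < \gamma_2\theta_{1,2} < u_{2,1}$
(7) FP(1,0): $\gamma_1\theta_{2,1} < l_{1,2} < u_{1,2}$, $l_{2,1} < u_{2,1} < \gamma_2\theta_{1,2}$
(8) FP(1,0): $l_{1,2} < \gamma_1\theta_{2,1} < u_{1,2}$, $l_{2,1} < u_{2,1} < \gamma_2\theta_{1,2}$
(9) FP(0,0): $l_{1,2} < u_{1,2} < \gamma_1\theta_{2,1}$, $l_{2,1} < u_{2,1} < \gamma_2\theta_{1,2}$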
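The nine entries can be reproduced by enumerating where $\gamma\theta$ sits relative to $(l, u)$ on each edge. The sketch below is a simplified domain-level classification, assuming node 1 activates node 2 and node 2 represses node 1; it is not the DSGRN code, and the names `POSITIONS`, `production_signs`, and `classify` are hypothetical.

```python
from itertools import product

# Position of gamma*theta relative to (l, u):
# "below": gamma*theta < l < u, "between": l < gamma*theta < u, "above": l < u < gamma*theta.
POSITIONS = ("below", "between", "above")


def production_signs(position):
    """Signs of (-gamma*theta + l, -gamma*theta + u)."""
    return {"below": (+1, +1), "between": (-1, +1), "above": (-1, -1)}[position]


def classify(pos1, pos2):
    """pos1: gamma1*theta21 vs (l12, u12); pos2: gamma2*theta12 vs (l21, u21)."""
    low1, high1 = production_signs(pos1)
    low2, high2 = production_signs(pos2)

    def successors(a, b):
        d1 = high1 if b == 0 else low1  # sigma^-_{1,2}: u when x2 is low, l when high
        d2 = low2 if a == 0 else high2  # sigma^+_{2,1}: l when x1 is low, u when high
        out = []
        if d1 > 0 and a == 0:
            out.append((1, b))
        if d1 < 0 and a == 1:
            out.append((0, b))
        if d2 > 0 and b == 0:
            out.append((a, 1))
        if d2 < 0 and b == 1:
            out.append((a, 0))
        return out

    fixed = [d for d in product((0, 1), repeat=2) if not successors(*d)]
    return f"FP({fixed[0][0]},{fixed[0][1]})" if fixed else "FC"


# Node numbering follows the list above: pos1 varies fastest.
table = {1 + i + 3 * j: classify(POSITIONS[i], POSITIONS[j])
         for j in range(3) for i in range(3)}
```

In this small network a parameter node either has a single domain with no outgoing transition (a fixed point) or none at all, in which case the four domains form the full cycle.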
Back to Malaria
Remark: there are a variety of statistical methods for generating possible regulatory networks from this type of time series data.
In vitro data (WRAIR + Haase)
[Figure: expression heat map of putative TF genes (456). Color scale: standard deviations from mean expression (z-score), from -1.5 (low) to 1.5 (high). Horizontal axis: time in vitro (hours), 0 to 60.]
Remarks about dynamics:
1. Gene expression is cyclic in nature.
2. We know relative times of expression of genes.
Assumption: Expression of important functions must be robust to perturbations.
Experimental time series for associated genes
Simple Test
Cyclic feedback system: well understood using classical dynamical systems techniques.
Under the assumption of monotone switches, if parameter values are chosen such that there exists a stable periodic orbit, then the maxima in the network must occur in the order: (188, 93, 184, 395) (green, blue, cyan, red)
Conclusion: This network does not generate observed dynamics
No mathematical theory
Time series for associated genes
Computation time on laptop approximately 1 second.
SQL Query: A stable cycle involving oscillations in all genes
DSGRN computation produces a parameter graph with approximately 45,000 nodes.
96 parameter graph nodes with Morse graph that has a minimal node consisting of a Full Cycle (FC).
Have developed a polynomial time algorithm that takes paths in the state transition graph and identifies sequences of possible maxima and minima.
Tested all max-min sequences from state transition graphs from all 96 parameter graph nodes against 17,280 experimental patterns. No match.
Conclusion: This network does not generate observed dynamics
DSGRN Analysis (II): Max-Min Matching
M m M m M m
M M m M m m
DSGRN strategy
I. Start with a proposed gene regulatory network.
Perform DSGRN computation to identify parameter nodes for which the minimal Morse node is FC.
Reject parameter node if max-min sequences of FC are not linear extensions of poset.
Extract from experimental data the poset indicating possible max-min orderings.
Compute fraction of parameter nodes that match experimental data.
IF fraction is small, THEN reject regulatory network.
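The matching step in this strategy can be sketched as a linear-extension test. The example below is illustrative only: the poset relations and the max-min sequences are hypothetical toy data, a parameter node is assumed to be kept if at least one of its sequences is consistent with the poset, and the cyclic nature of the sequences is ignored.

```python
def is_linear_extension(sequence, relations):
    """relations is a set of pairs (a, b) meaning a must occur before b."""
    position = {event: k for k, event in enumerate(sequence)}
    return all(position[a] < position[b] for a, b in relations)


def matching_fraction(sequences_by_node, relations):
    """Fraction of parameter nodes with at least one consistent max-min sequence."""
    matches = sum(
        any(is_linear_extension(seq, relations) for seq in seqs)
        for seqs in sequences_by_node.values()
    )
    return matches / len(sequences_by_node)


# Hypothetical poset extracted from data: M1 before M2, M1 before m1, M2 before m2.
relations = {("M1", "M2"), ("M1", "m1"), ("M2", "m2")}
# Hypothetical max-min sequences for two parameter nodes.
sequences_by_node = {
    "node A": [("M1", "M2", "m1", "m2")],
    "node B": [("M2", "M1", "m1", "m2")],
}
fraction = matching_fraction(sequences_by_node, relations)
```

A network would then be rejected when this fraction falls below whatever cutoff is chosen; the talk does not state a specific value.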
Assumption: Expression of important functions must be robust to perturbations.
DSGRN strategy
II. Start with an acceptable gene regulatory network.
Create new regulatory network via random perturbations:
Add/Remove edge(s)
Add/Remove node
I. Start with the newly proposed gene regulatory network.
Current favorite network
90% of parameter nodes* induce minimal FC node
80% of parameter nodes* induce minimal FC which agrees with experimentally determined max-min ordering.
Thank you for your attention
Homology + Database Software: chomp.rutgers.edu
Rutgers: S. Harker
MSU: T. Gedeon, B. Cummings
FAU: W. Kalies
VU Amsterdam: R. Vandervorst
Example (continued)
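$X = [-4, 4] \subset \mathbb{R}$
The homology Conley index of $M(p)$ is
$$CH_*(p) := H_*(A, \mathrm{pred}(A); k), \quad k \text{ a field}$$
[Figure: graph of $F$ over $[-4, 4]$ with attracting blocks $A$; poset $(P, <)$ with elements 1, 2, 3 labeled by Conley indices $(k, 0, \ldots)$, $(k, 0, \ldots)$, and $(0, k, 0, \ldots)$]
Theorem (R. Franzosa): There exists a strictly upper triangular (with respect to $<$) boundary operator
$$\Delta : \bigoplus_{p \in P} CH_*(p) \to \bigoplus_{p \in P} CH_*(p)$$
such that the induced homology is isomorphic to $H_*(X)$.
$$\Delta = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$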
Conley index can be used to guarantee existence of equilibria, periodic orbits, heteroclinic and homoclinic orbits, and chaotic dynamics.
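The theorem can be checked by hand on this example. The sketch below assumes the field $k$ is the rationals and reads the only nonzero block of the boundary operator from the third column of the matrix on the slide: it maps $CH_1$ (spanned by node 3) to $CH_0$ (spanned by nodes 1 and 2). The helper `rank` is a plain Gaussian elimination written for this illustration.

```python
from fractions import Fraction


def rank(matrix):
    """Rank over the rationals by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in matrix]
    cols = len(m[0]) if m else 0
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


# Degree-1 part of the boundary operator: CH_1 (node 3) -> CH_0 (nodes 1 and 2).
delta_1 = [[1], [1]]
dim_CH0, dim_CH1 = 2, 1

# The boundary operator vanishes out of degree 0 and out of degree 2 and higher.
betti_0 = dim_CH0 - rank(delta_1)        # dim ker(delta_0) - rank(delta_1)
betti_1 = (dim_CH1 - rank(delta_1)) - 0  # dim ker(delta_1) - rank(delta_2)
```

The result, one class in degree 0 and none in degree 1, agrees with the homology of the interval $X = [-4, 4]$.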