The error threshold or ribo-organisms
Eörs Szathmáry
Collegium Budapest AND Eötvös University
Crucial assumptions
• There was in fact an RNA-dominated world
• RNAs acted as genes and as ribozymes
• Replication as a problem was solved
• The accuracy problem?
• The internal competition problem?
Inaccurate replication immediately raises further concerns (Eigen, 1971)
• Early replication must have been error-prone
• Error threshold sets the limit of maximal genome size to <100 nucleotides
• Not enough for several genes
• Unlinked genes will compete
• Genome collapses
• Resolution???
An example of “replication”
RNA RNA RNA RNA RNA RGA RNA RNA RNA RNX RNA RNA RNH DNM RNA RNA RNA RQA RNA RNJ RPA
WORLD WORLF WORLD WORLL IDRYD WORLD WORLD KORLD WORLD WORLD WORLD WORLD WORUD WORLD WORHD WORLD WORLD WORWD WORLD WORLD WRRLD
HYPOTHESIS EYPKTHYSII HYPEXHESIS HYPOTHESIS HYPOTHESIS HYPETHESKS HYYOTHESIS HYPOTHESIS HYPOTHESIS HYPOTHESIS HYPOTHESIS HYPOTHESIS HYPOSHESIS HYPOTMESIS HTPOTHESIS CYPOTGESIS HYPOTHEGIA HYPOXHLSIS HYPXTHESIS HYPOTHESIS HYPUTHESIS
Eigen’s Paradox and the Error threshold
$N < \frac{\ln s}{1 - q}$
N: length
s: superiority of the master
q: copying fidelity per digit (1 - q is the error rate per digit)
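The threshold bounds the genome length by ln s divided by the per-digit error rate. A minimal sketch of that arithmetic follows; the function name and the example values (s = e, error rate 0.01) are illustrative assumptions, not figures from the talk.

```python
import math

def max_genome_length(superiority, error_rate_per_digit):
    """Upper bound on genome length N: ln(s) / (error rate per digit)."""
    if superiority <= 1:
        raise ValueError("the master must outcompete its mutants (s > 1)")
    if not 0 < error_rate_per_digit < 1:
        raise ValueError("error rate per digit must lie strictly between 0 and 1")
    return math.log(superiority) / error_rate_per_digit

# Assumed values: ln s = 1 and one error per 100 digits copied.
print(max_genome_length(math.e, 0.01))
```

With ln s = 1 and an error rate of 0.01 the bound is about 100 digits, the order of magnitude quoted earlier for early replicators.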
Quasispecies made simple
• For didactics, there are only two genotypes
• Only forward mutations
• Fitness values and mutation rates
Simplified error threshold
x + y = 1
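One common textbook way to write the two-genotype case is sketched below; it is an assumed formulation, not necessarily the one on the slide. The master (frequency x) replicates at rate A and is copied correctly with probability Q; the mutant (frequency y = 1 - x) replicates at rate a < A; there is no back mutation. The names A, a, Q and both function names are hypothetical.

```python
def master_equilibrium(A, a, Q):
    """Equilibrium master frequency for dx/dt = x * (A*Q - (A*x + a*(1 - x)))."""
    if A <= a:
        raise ValueError("the master must replicate faster than the mutant (A > a)")
    return max(0.0, (A * Q - a) / (A - a))

def simulate(A, a, Q, x0=0.5, dt=0.01, steps=20000):
    """Euler integration of the same equation, as a numerical cross-check."""
    x = x0
    for _ in range(steps):
        mean_fitness = A * x + a * (1 - x)   # keeps x + y = 1
        x += dt * x * (A * Q - mean_fitness)
    return x

# Assumed values: master ten times fitter than the mutant.
print(master_equilibrium(10.0, 1.0, 0.5), simulate(10.0, 1.0, 0.5))
```

In this sketch the master persists only while A·Q > a, so the critical copying fidelity is Q = a/A; below it the equilibrium frequency of the master is zero.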
Error threshold and error catastrophe
Error threshold and extinction threshold
Population dynamics on surfaces
• Reaction-diffusion on the surface (following Hogeweg and Boerlijst, 1991)
• One tends to interact with one’s neighbours
• This is important because a lesson from theoretical ecology is that such conditions promote the coexistence of competitors
• Important effect on the dynamics of the primordial genome (cf. Eigen’s paradox)
Nature 420, 360-363 (2002).
Replicase RNA
Other RNA
Elements of the model
• A cellular automaton model simulating replication and dispersal in 2D
• Replication needs a template next door
• Replication probability proportional to rate constant (allowing for replication)
• Diffusion
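The elements listed above can be sketched as a toy lattice model. This is a deliberately simplified illustration, not the published model: the torus boundary, the four-site neighbourhood, the swap-based diffusion, the state names and the rate values are all assumptions made for the example.

```python
import random

EMPTY, REPLICASE, OTHER = 0, 1, 2  # hypothetical site states

def neighbours(i, j, n):
    """Four nearest neighbours of site (i, j) on an n x n torus."""
    return [((i - 1) % n, j), ((i + 1) % n, j), (i, (j - 1) % n), (i, (j + 1) % n)]

def replication_step(grid, rate, rng):
    """Pick a random site; if empty, a neighbouring template may be copied into it."""
    n = len(grid)
    i, j = rng.randrange(n), rng.randrange(n)
    if grid[i][j] != EMPTY:
        return
    templates = [grid[a][b] for a, b in neighbours(i, j, n) if grid[a][b] != EMPTY]
    if templates:
        template = rng.choice(templates)
        # Replication succeeds with a probability set by the template's rate constant.
        if rng.random() < rate[template]:
            grid[i][j] = template

def diffusion_step(grid, rng):
    """Swap the contents of a random site with one random neighbour."""
    n = len(grid)
    i, j = rng.randrange(n), rng.randrange(n)
    a, b = rng.choice(neighbours(i, j, n))
    grid[i][j], grid[a][b] = grid[a][b], grid[i][j]

def count(grid, state):
    return sum(row.count(state) for row in grid)

rng = random.Random(1)
n = 20
grid = [[rng.choice([EMPTY, EMPTY, REPLICASE, OTHER]) for _ in range(n)] for _ in range(n)]
rate = {REPLICASE: 0.5, OTHER: 0.8}  # assumed replication probabilities
for _ in range(5000):
    replication_step(grid, rate, rng)
    diffusion_step(grid, rng)
print(count(grid, REPLICASE), count(grid, OTHER), count(grid, EMPTY))
```

The sketch has no decay and no enzymatic help from a neighbouring replicase, so it only illustrates local replication and dispersal; coexistence results need the fuller model.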
[Figure: lattice neighbourhood around the focal site at column i, row j, spanning columns i - 2 to i + 2 and rows j - 2 to j + 2; sites labelled X and S]
Maximum as a function of molecule length
• Target and replicase efficiency
• Copying fidelity
• Trade-off among all three traits: worst case
Evolving population
Error rate Replicase activity
SCM is better than HPC at high mutation rates (Zintzaras, Santos, Szathmáry, J. theor. Biol. 2002)
• Survival of the flattest
• SCM is better only at high mutation rates
• Exactly relevant for early systems
RNA structure and the error threshold: Kun, Santos, Szathmáry (2005) Nature Genetics 37, 1008-1011.
• The 3D shape of the molecule
• Enzymatic activity depends on the structure
• Phenotype of a ribozyme is the structure
• There are fewer structures than sequences
• A few mutations in the sequence usually do not change the structure
• The 2D structure can be computed easily
RNA structure – an example
Master copy: AUCGUCUGUCGGCGAU
Mutant: GCAUGACUCAUUAUGC
Same structure
Same fitness
(different text can have the same meaning)
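The two sequences in this example can be checked for a shared fold with a few lines of code. The sketch assumes a simple hairpin (a six-base-pair stem closing a four-nucleotide loop) and allows G-U wobble pairs; that reading of the structure is an inference from the sequences, and the function name is hypothetical.

```python
# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def forms_hairpin(seq, stem=6):
    """True if the first `stem` bases pair with the last `stem` bases, antiparallel."""
    return all((seq[k], seq[-1 - k]) in PAIRS for k in range(stem))

master = "AUCGUCUGUCGGCGAU"
mutant = "GCAUGACUCAUUAUGC"

# Number of positions at which the two sequences differ.
differences = sum(x != y for x, y in zip(master, mutant))
print(forms_hairpin(master), forms_hairpin(mutant), differences)
```

Both sequences satisfy the hairpin test although they differ at all 16 positions, which is the point of the slide: very different genotypes can share one phenotype.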
Aim / question
• The phenotype is more easily maintained than the genotype.
• Phenotypic error threshold, which is higher than the genotypic error threshold.
• For estimating the error threshold a fitness landscape is needed
• The proposed fitness landscapes will be based on mutagenesis experimental data
• Enzymatic activity will be used as a proxy for fitness (protocell)
Neurospora Varkud Satellite Ribozyme
[Figure: secondary structure of the Neurospora VS ribozyme, nucleotides numbered 640 to 780, with helices II to VI marked]
N = 144
83/144 (57%) of the positions were mutated; we used 183 mutants
Hairpin Ribozyme
[Figure: secondary structure of the hairpin ribozyme, nucleotides numbered 1 to 50, with helices H1 to H4 and loops A and B marked]
N = 50
39/50 (78%) of the positions were mutated; we used 142 mutants
General observations on ribozymes
1. Structure is important, individual base pairs are not
2. Structure can be slightly varied
3. There are critical sites
4. The landscape is multiplicative (there might be a slight synergy)
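A multiplicative landscape of the kind described in point 4 can be sketched as follows. The site positions and activity values are invented for illustration (they are not measured data), and unlisted sites are assumed to be neutral.

```python
import math

def multiplicative_fitness(mutated_sites, site_activity):
    """Relative activity of a mutant: product of per-site relative activities.

    Sites missing from `site_activity` are treated as neutral (factor 1.0).
    """
    return math.prod(site_activity.get(site, 1.0) for site in mutated_sites)

# Hypothetical per-site relative activities; 0.0 marks a critical site.
site_activity = {3: 0.5, 10: 0.8, 27: 0.0}

print(multiplicative_fitness([], site_activity))        # unmutated master
print(multiplicative_fitness([3, 10], site_activity))   # two mildly harmful mutations
print(multiplicative_fitness([3, 27], site_activity))   # hits a critical site
```

Under these assumptions a single hit at a critical site abolishes activity regardless of the other positions, while mild mutations compound by multiplication. Synergy would require an extra interaction term, which this sketch leaves out.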
RNA Population dynamics
• Replication rate is proportional to fitness
• Copying is error-prone, but length does not change
• Degradation is independent of fitness
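The three rules above can be sketched as one discrete generation of a constant-size population. This is an assumed simplification: parents are drawn in proportion to fitness, every digit is miscopied with probability mu, and the whole parental generation is discarded, so degradation does not depend on fitness. The fitness function, the master sequence and all parameter values are hypothetical.

```python
import random

def copy_with_errors(seq, mu, rng, alphabet="ACGU"):
    """Copy digit by digit; each digit is miscopied with probability mu. Length is kept."""
    return "".join(
        rng.choice([c for c in alphabet if c != base]) if rng.random() < mu else base
        for base in seq
    )

def next_generation(population, fitness, mu, rng):
    """Draw parents in proportion to fitness, then copy each one with errors."""
    weights = [fitness(seq) for seq in population]
    if sum(weights) <= 0:
        return []  # nothing can replicate
    parents = rng.choices(population, weights=weights, k=len(population))
    return [copy_with_errors(parent, mu, rng) for parent in parents]

master = "AUCGUCUGUCGGCGAU"

def fitness(seq):
    # Hypothetical single-peak fitness, used only to drive the example.
    return 1.0 if seq == master else 0.2

rng = random.Random(0)
population = [master] * 200
for _ in range(50):
    population = next_generation(population, fitness, mu=0.02, rng=rng)
print(population.count(master) / len(population))
```

Raising `mu` in this toy run lowers the surviving fraction of the master, which is the qualitative behaviour the error threshold describes.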
Phenotypic error threshold
[Figure, two panels: time to extinction (generations) against per-digit mutation rate (effective rate in the VS ribozyme panel); mean, minimum and maximum shown]
VS ribozyme: population size = 10000, r = -0.993, estimated error threshold μ* = 0.0536
Hairpin ribozyme: population size = 10000, r = -0.95, estimated error threshold μ* = 0.146
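One plausible reading of the estimates is that a straight line is fitted to time to extinction against mutation rate and extended to the rate at which the time reaches zero. That reading is an assumption, not something the slide states. The sketch below shows only the arithmetic, and the data points are invented.

```python
def x_intercept(xs, ys):
    """Least-squares line through (xs, ys); return the x at which the line hits y = 0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -intercept / slope

# Hypothetical data: extinction gets faster as the mutation rate rises.
rates = [0.10, 0.11, 0.12]
times = [300.0, 200.0, 100.0]
print(x_intercept(rates, times))
```

For these made-up points the fitted line reaches zero at a rate of about 0.13, slightly beyond the highest rate sampled.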
Comparison with other types of landscapes
[Figure, two bar charts (VS ribozyme; hairpin ribozyme): estimated error threshold on the structural, Mt. Fuji (0.2), Mt. Fuji (0.8), single peak and Eigen (ln s = 1) landscapes]
Mt. Fuji type of landscape
• No structure
• Activities based on point mutations
Single peak fitness landscape
• Based on average activity of point mutants
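A single peak landscape can be sketched as below, on the assumption that the master has fitness 1 and every other sequence shares one value, the mean activity of the measured point mutants. Both the assumption and the activity values are illustrative, not taken from the data set.

```python
def single_peak_fitness(seq, master, point_mutant_activities):
    """Fitness 1.0 for the master; every other sequence gets the mean point-mutant activity."""
    if seq == master:
        return 1.0
    return sum(point_mutant_activities) / len(point_mutant_activities)

# Hypothetical relative activities of three point mutants.
activities = [0.1, 0.4, 0.7]
print(single_peak_fitness("AUCG", "AUCG", activities))
print(single_peak_fitness("AUCC", "AUCG", activities))
```

Unlike a structure-based landscape, this one gives no credit to neutral mutants, which may be why it yields a lower threshold in the comparison.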
Neutral mutations tame the error threshold
• Extrapolation from the available mutants as samples to the whole fitness landscape
• Accuracy of viral RNA polymerases would be sufficient to run the genome of a ribo-organism of about 70 genes
Error rates and the origin of replicators
Some open questions
• The maximum genome size of the stochastic corrector model (how many genes in the bag?)
• The evolution of genome size through duplication and divergence of metabolic enzyme functions
• The origin of chromosomes