Naïve assumption: no selection against synonymous substitutions
[Figure: selection and the rate of synonymous substitutions, plotted against sequence position]
Synonymous purifying selection (conservation)
Protein folding
Splicing regulatory elements
mRNA structure
Overlapping genes
Codon bias
[Figure: codon alignment of three species; x-axis: sequence position]
Amino acid:  T   A   L   T   S   I   G   R
Species 1:   ACT GCC CTT ACA AGC ATC GGG CGT
Species 2:   ACG GCT CTT ACA AGC ATC GGT CGG
Species 3:   ACA GCA CTT ACA AGC ATC GGA CGA
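The contrast in the three-species figure (codons that vary only synonymously versus codons that are fully conserved) can be sketched in a few lines. This is an illustrative sketch only: the function name `classify_columns` is hypothetical, the sequences are the codons from the figure concatenated in order, and the standard genetic code is assumed.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def classify_columns(seqs):
    """Label each codon column as identical, synonymous, or nonsynonymous."""
    n_codons = min(len(s) for s in seqs) // 3
    labels = []
    for pos in range(n_codons):
        codons = {s[3 * pos:3 * pos + 3] for s in seqs}
        amino_acids = {CODON_TABLE[c] for c in codons}
        if len(codons) == 1:
            labels.append("identical")      # same codon in every species
        elif len(amino_acids) == 1:
            labels.append("synonymous")     # codons differ, amino acid does not
        else:
            labels.append("nonsynonymous")  # amino acid differs
    return labels

# Codons taken from the three-species figure (T A | L T S I | G R).
species = [
    "ACTGCCCTTACAAGCATCGGGCGT",
    "ACGGCTCTTACAAGCATCGGTCGG",
    "ACAGCACTTACAAGCATCGGACGA",
]
labels = classify_columns(species)
print(labels)
```

In this toy alignment the four middle codons (L T S I) are identical in all species, while the flanking codons differ only synonymously, which is the pattern a site-specific Ks model is meant to detect.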
Testing for synonymous selection
H0: free from synonymous selection → constant Ks
H1: under synonymous selection → variable Ks
Likelihood ratio test:
$$2\log\frac{L(D \mid M_1)}{L(D \mid M_0)} \sim \chi^2_1$$
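A minimal sketch of the likelihood ratio test, assuming the maximized log-likelihoods of the constant-Ks and variable-Ks models are already available from some model-fitting step, and that the χ² reference distribution has one degree of freedom. The function name `lrt_pvalue` and the numeric log-likelihoods are hypothetical, for illustration only.

```python
import math

def lrt_pvalue(loglik_null, loglik_alt):
    """Likelihood ratio test for nested models, chi-square with 1 d.f."""
    # Test statistic: 2 * (log L(D|M1) - log L(D|M0)); clamp tiny negatives to 0.
    stat = max(0.0, 2.0 * (loglik_alt - loglik_null))
    # For 1 d.f., the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical log-likelihoods (not taken from the study).
stat, p = lrt_pvalue(loglik_null=-1523.4, loglik_alt=-1519.1)
print(f"2*delta(logL) = {stat:.2f}, p = {p:.4f}")
```

Using the closed form for one degree of freedom keeps the sketch dependency-free; a statistics library would be needed for other degrees of freedom.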
Research objective
Quantify and characterize the magnitude and role of synonymous purifying selection
Comparative sequence data
S. cerevisiae, S. paradoxus, S. mikatae, S. bayanus, S. castellii
> 20 million years
70%-90% coding DNA sequence identity
5,135 datasets of multiple sequence alignments + phylogenies (5,182 of ~6,000 S. cerevisiae genes)
Obtained from Wapinski et al., Nature 2007
[Figure: example multiple sequence alignment; x-axis: position]
GATCGATTC
GATCGATTA
GATCGGTCC
GCTCGGTCC
GATAGACAT
Under significant synonymous selection: 42% (2,154)
Under synonymous selection: 45.6% (2,341)
Not under synonymous selection: 12.4% (640)
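As a quick arithmetic check, the three gene counts sum to the 5,135 datasets and reproduce the quoted percentages to within 0.1 percentage point. The snippet below is only a sanity check on the slide's numbers, listed in slide order.

```python
# Gene counts per category, in the order they appear on the slide.
counts = [2154, 2341, 640]
total = sum(counts)  # should equal the number of datasets

# Percentage of all datasets in each category.
shares = [100.0 * c / total for c in counts]

for c, s in zip(counts, shares):
    print(f"{c:>5} genes  {s:5.2f}%")
print(f"total: {total}")
```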
Synonymous selection underlies codon bias
Different organisms prefer specific codons over others that encode the same amino acid
Arginine (R) codon usage in S. cerevisiae:
AGA 48%
AGG 21%
CGA 7%
CGC 6%
CGG 4%
CGU 14%
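Usage percentages like these come from counting how often each synonymous codon appears among all codons for the same amino acid. A minimal sketch for arginine, assuming an in-frame coding sequence; the function name `arginine_codon_usage` and the toy sequence are hypothetical and not real yeast data.

```python
from collections import Counter

# The six arginine codons, written in RNA letters as on the slide.
ARG_CODONS = {"AGA", "AGG", "CGA", "CGC", "CGG", "CGU"}

def arginine_codon_usage(cds):
    """Return each arginine codon's fraction among all arginine codons."""
    rna = cds.upper().replace("T", "U")
    # Split into in-frame codons, ignoring any trailing partial codon.
    codons = [rna[i:i + 3] for i in range(0, len(rna) - len(rna) % 3, 3)]
    counts = Counter(c for c in codons if c in ARG_CODONS)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {c: counts[c] / total for c in sorted(ARG_CODONS)}

# Toy in-frame sequence (hypothetical): ATG AGA AGA CGT AGG TAA
usage = arginine_codon_usage("ATGAGAAGACGTAGGTAA")
for codon, frac in usage.items():
    print(f"{codon} {frac:.0%}")
```

Genome-wide figures would pool the counts over all coding sequences rather than a single gene.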
Codon bias (synonymous selection) derives from protein structure
Translation speed; translation accuracy
S. cerevisiae mitochondrial NADP(+)-dependent isocitrate dehydrogenase (PDB: 2QFY)
Codon bias at the protein 3D structure
codon bias core > codon bias surface
codon bias interface > codon bias surface
MDR1 is a member of the ABC transporter family.
ABC transporters pump drugs out of the cell using ATP, which changes the conformation of the protein.
These proteins have been shown to induce multidrug resistance in various cancers.
C3435T is a synonymous SNP that has been reported as a risk factor for several diseases, such as Parkinson’s disease, colon cancer, and renal epithelial tumors.
The association could arise from:
1. A change in mRNA level
2. A change in splicing
3. Linkage disequilibrium with other causative SNPs
4. Something else
FACS analysis.
Purple: cells transfected with an empty vector.
All other colors: cells transfected with a vector containing MDR1 (various haplotypes).
MDR1 pumps the drug (Bodipy) out of the cells.
The inhibitor acts differently on the various haplotypes.
The study showed that the synonymous substitutions did not change protein levels but rather the protein structure.
This was shown by a differential response to specific antibodies.
This is important for linking SNPs to diseases.
Conservation of Ks in pol
[Figure: Ks rate (y-axis, 0 to 5) by site (x-axis, 750 to 950)]
Mayrose et al. Bioinformatics/ISMB (2007)
[Figure: Ks rate (y-axis, 0 to 4) by position (x-axis, 900 to 950); annotated regions: DNA flap, cPPT, CTS]
Conservation of Ks in pol (zoom in)
cPPT = central polypurine tract
This region serves as a primer for reverse transcriptase in the synthesis of the plus-strand DNA.
CTS = central termination sequence
The CTS is involved in the nuclear import of the HIV-1 genome.
Kudla et al. showed that levels of GFP, a protein whose gene can easily be inserted into a host genome and whose levels can then be easily quantified, are strongly affected by the secondary structure at the 5’ end of the mRNA.
Mechanism: stable secondary structures at the 5’ end of the mRNA obstruct ribosome binding and result in lower protein levels.
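To make "stable secondary structure" concrete, the sketch below counts the maximum number of nested base pairs a short 5’ segment can form, using the classic Nussinov dynamic program. This is a crude proxy chosen for illustration, not a free-energy calculation and not necessarily the measure used by Kudla et al.; the function name `max_base_pairs`, the minimum loop length of 3, and the toy sequences are all assumptions.

```python
# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def max_base_pairs(seq, min_loop=3):
    """Nussinov algorithm: maximum number of nested base pairs in seq."""
    rna = seq.upper().replace("T", "U")
    n = len(rna)
    if n == 0:
        return 0
    # dp[i][j] = max pairs within rna[i..j]; short spans stay at 0.
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: base j is unpaired
            # case: base j pairs with k, leaving >= min_loop bases between them
            for k in range(i, j - min_loop):
                if (rna[k], rna[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]

# Toy 5' segments (hypothetical): one can form a hairpin, one cannot pair at all.
structured = max_base_pairs("GGGAAAACCC")
unstructured = max_base_pairs("AAAAAAAAAA")
print(structured, unstructured)
```

A higher pair count suggests more potential structure; a realistic analysis would compute folding free energy with a dedicated RNA folding tool.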