Predicting the 3D Structure of RNA motifs
Ali Mokdad – UCSF
May 28, 2007
Predicting RNA structure
• Existing RNA folding algorithms (mfold, sfold, pfold, Dynalign) determine the locations of cWW helices.
• Internal loops, hairpin loops, and junctions are represented as “bulges” or unstructured areas between these helices.
• Many of these “bulges” have stable 3D structures that in many cases allow the whole molecule to carry out its function.
• Many long-range interactions in the same RNA molecule, or interactions between RNA and other molecules occur at these locations.
• If we can determine the structures of these areas, we can target them with drugs, and we can better understand their mechanisms and functions...
Predicting 3D structure of RNA loops
• RNA loops are mostly made of non-WC BPs.
• These non-WC BPs are less common than helical WC BPs, but they still make up a substantial portion (ca. 1/3).
• To complicate things, the non-WC BPs are not all homogeneous; instead, they belong to one of a dozen or so geometric types*.
• As a result, their 3D structures have long eluded computational prediction.
*Leontis & Westhof, RNA 2001, v7: 499-512
Isostericity-based structure prediction
• Comparative Sequence Analysis has been very successful in predicting cis WC BPs.
• The problem with applying it to non-WC BPs is that their allowed patterns of substitutions are more diverse and less obvious than cis WC substitutions.
• To some extent these patterns were not even known until recently*
*Leontis, Stombaugh, & Westhof, NAR 2002, v30: 3497-531
ISFOLD: a small first step to solve a big problem
• ISFOLD looks in sequence alignments for patterns similar to the known isosteric substitution patterns of base pairs.
• This similarity can be scored and ranked, and predictions of individual BP occurrences (their types and locations) can be made from it.
• Such structural predictions are, of course, only as good as the sequence alignments.
• For the best results, the alignments should be highly accurate and large, but also divergent enough to show substitutions in places of interest.
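The scoring idea above can be illustrated with a minimal sketch. It is not ISFOLD's actual scoring function: it assumes the score for a geometric family is simply the fraction of sequences whose pair at two alignment columns falls in that family's allowed substitution set. The `cWW` set uses the canonical pairs named in this talk; `toy_family` is a made-up placeholder, not an entry from the published isostericity matrices, and the function names are hypothetical.

```python
from collections import Counter

# Toy substitution sets. Only the cWW set comes from the canonical pairs
# listed in this talk; "toy_family" is a made-up placeholder, NOT a real
# isostericity matrix entry.
ISOSTERIC_SETS = {
    "cWW": {"CG", "GC", "AU", "UA", "GU", "UG"},
    "toy_family": {"AG", "AA", "CA"},
}

def column_pairs(alignment, i, j):
    """Return the two-letter pair seen at columns i and j in every sequence."""
    return [seq[i] + seq[j] for seq in alignment]

def score_families(alignment, i, j, families=ISOSTERIC_SETS):
    """Rank families by the fraction of observed pairs that fall in each set."""
    counts = Counter(column_pairs(alignment, i, j))
    total = sum(counts.values())
    scores = {
        name: sum(n for pair, n in counts.items() if pair in allowed) / total
        for name, allowed in families.items()
    }
    # Highest-scoring family first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Tiny hypothetical alignment (rows are sequences, columns are positions).
alignment = ["GAAC", "GAGC", "CAAG", "GCAC"]
ranking = score_families(alignment, 1, 2)
```

Running `score_families` over every column pair and keeping the top-ranked family per pair would give both a location and a type for each candidate BP, which is why a larger and more divergent alignment gives the ranking more to work with.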
CSA example from 5S rRNA
• CSA looks in sequence alignments for canonical “mutual compensating mutations” (C=G, G=C, A–U, U–A, and G/U & U/G) that covary in two alignment positions (or columns).
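As a rough illustration of the covariation search, the sketch below flags column pairs where nearly every sequence shows a canonical pair and at least two different canonical pairs occur (so the columns actually covary rather than being conserved). The thresholds `min_fraction` and `min_distinct` and the function names are assumptions made for this example, not part of any published CSA tool.

```python
# Canonical pairs as listed on the slide (WC pairs plus G/U wobble).
CANONICAL = {"CG", "GC", "AU", "UA", "GU", "UG"}

def covariation_support(alignment, i, j):
    """Return (fraction of canonical pairs, number of distinct canonical pairs)."""
    pairs = [seq[i] + seq[j] for seq in alignment]
    canonical = [p for p in pairs if p in CANONICAL]
    return len(canonical) / len(pairs), len(set(canonical))

def candidate_cww_pairs(alignment, min_fraction=0.9, min_distinct=2):
    """List column pairs (i, j) that look like covarying cWW base pairs."""
    length = len(alignment[0])
    hits = []
    for i in range(length):
        for j in range(i + 1, length):
            fraction, distinct = covariation_support(alignment, i, j)
            if fraction >= min_fraction and distinct >= min_distinct:
                hits.append((i, j))
    return hits

# Hypothetical alignment: columns 0 and 4 covary, the middle is conserved.
alignment = ["GAAAC", "CAAAG", "AAAAU", "UAAAA"]
hits = candidate_cww_pairs(alignment)
```

A check like this only recognizes the six canonical combinations, so it has nothing to say about column pairs whose substitutions follow a non-WC pattern.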
[Figure: 5S rRNA example with base pairs labeled ??, cWW, cWW, cWW, cWS, tWS]
ISFOLD GUI
ISFOLD predictions for the 5S rRNA BPs
All Correct predictions!
ISFOLD predictions for the whole motif
Note: #26 is 75%A & 25%C
Summary of all 5S ISFOLD predictions
Also: discovered 2 mistakes in original classification of BPs from crystal structure
ISFOLD can also use mutation data
(An example from viroids)
(Viable and lethal mutations determined experimentally)
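One way mutation data could narrow down candidate BP types is sketched below. This is an assumption about how such data might be used, not a description of ISFOLD's method: a candidate family is kept only if every experimentally viable pair is in its allowed set and no lethal pair is. The family sets and the `consistent_families` name are hypothetical, and treating every lethal mutation as a disrupted base pair is a deliberate simplification.

```python
# Hypothetical candidate families and their allowed pairs. "toy_family" is a
# placeholder, not a real isostericity matrix entry.
CANDIDATES = {
    "cWW": {"CG", "GC", "AU", "UA", "GU", "UG"},
    "toy_family": {"AG", "AA", "CA"},
}

def consistent_families(candidates, viable, lethal):
    """Keep families that allow all viable pairs and none of the lethal ones."""
    keep = []
    for name, allowed in candidates.items():
        if viable <= allowed and not (lethal & allowed):
            keep.append(name)
    return keep

# Hypothetical experimental outcomes for one pair of positions.
viable = {"AA", "CA"}
lethal = {"GC"}
survivors = consistent_families(CANDIDATES, viable, lethal)
```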
Predicting Loop E motif in Viroids
Published model* Without mutation data With mutation data
(My viroids alignment is low quality)
*Zhong et al, J Virol 2006, v80: 8566-81
Conclusions
• This software predicts not just cWW BPs, but all types of BPs from sequence alignments.
• ISFOLD does 2 tasks:
1. Predicts which 2 nucleotides are interacting to form a BP (location).
2. Predicts which specific type of interaction is most probably formed.
• Good results when based on good alignments (5S rRNA).
• Results dramatically improved when mutation data is used.
• The higher the quality of the alignment, the better the predictions.
• By good alignment quality I mean 3 things:
1. Large number of sequences
2. Enough variability between sequences
3. But not so much variability that it might mean a complete change of the 3D motif (motif swap).
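The three criteria can be turned into a quick sanity check on an alignment. The sketch below is only illustrative: it measures size by sequence count and variability by pairwise identity, and every threshold in `looks_usable` is a made-up default rather than a value from the talk. It assumes equal-length aligned sequences and at least two of them.

```python
from itertools import combinations

def pairwise_identity(a, b):
    """Fraction of aligned columns where two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def alignment_summary(alignment):
    """Summarize size and variability of an alignment (needs >= 2 sequences)."""
    identities = [pairwise_identity(a, b) for a, b in combinations(alignment, 2)]
    return {
        "n_sequences": len(alignment),
        "mean_identity": sum(identities) / len(identities),
        "min_identity": min(identities),
    }

def looks_usable(summary, min_sequences=50, max_mean_identity=0.95,
                 min_pair_identity=0.5):
    """Apply the three criteria; all thresholds are hypothetical defaults."""
    return (
        summary["n_sequences"] >= min_sequences             # 1. large enough
        and summary["mean_identity"] <= max_mean_identity   # 2. some variability
        and summary["min_identity"] >= min_pair_identity    # 3. not too divergent
    )

# Tiny hypothetical alignment, far too small for real use.
summary = alignment_summary(["GAAC", "GAGC", "CAAG"])
```

Pairwise identity is a crude proxy for the third criterion: a very low minimum identity hints that some sequences may no longer share the motif, but it cannot confirm a motif swap by itself.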
ISFOLD: Mission is NOT accomplished
• ISFOLD as it is now is only the beginning; it provides a framework that can be built upon in the future…
• One thing to consider is that RNA recurrent motifs (such as internal and hairpin loops) occur as whole units – groups of individual BPs tend to occur together.
• When a good structural library of observed recurrent motifs becomes available (like the SCOR database), ISFOLD could be modified to study whole motifs at once (specific stacks of BPs can be scored together).