New structure-based methods for the phylogenetic analysis of ribosomal RNA sequences using the parsimony optimality criterion
Joseph J. Gillespie, Matthew J. Yoder*, Anthony I. Cognato
RNA molecules have characteristic higher-order structure that is conserved across all life
Using structure to guide the assignment of positional nucleotide homology, we offer two new approaches to analyzing rRNA sequences:
1. RAA/RSC/REC coding
2. RNA basepair coding
[Figure: rRNA secondary-structure diagram with structural elements labeled 1a-1b, 2 and 2a-2f, 3 and 3a-3p, and the 5’ and 3’ ends marked]
RAA/RSC/REC coding*
Methods for characterizing regions of RNA sequence alignments wherein positional nucleotide homology cannot be assigned with confidence
Based on the premise of using information from secondary structure (i.e., compensatory base change evidence) to delimit unalignable positions
*Gillespie (2004) Mol. Phylogenet. Evol. 33: 936-943
Region of ambiguous alignment (RAA)
Two or more adjacent, non-pairing positions within a sequence wherein positional homology cannot be confidently assigned due to the high occurrence of indels in other sequences
Region of slipped-strand compensation (RSC)
Region involved in base-pairing wherein positional homology cannot be defended across a multiple sequence alignment; inconsistency in pairing likely due to slipped-strand mispairing
Region of expansion and contraction (REC)
Variable helical region flanked by conserved basepairs at the 3’ and 5’ ends, and an unpaired terminal bulge of at least three nucleotides; characteristic of RNA hairpin-stem loops
Why? Subdividing large ambiguously aligned regions into smaller components provides:
1. a means for comparing structurally similar nucleotides in fragment-level alignment methods (INAASE, POY)
2. fewer character state transformations between taxa, with less potential to exceed the number of allotted states in a given phylogenetic software
3. the ability to objectively assign different substitution weights to pairing (RSC, REC) and non-pairing (RAA) regions
4. improvements to existing global structural models for the various rRNA molecules on public databases
5. a more explicit set of homologies
RNA basepair coding
RNA basepair coding: code (20 states)*

| non-pairing | pairing | pairing | pairing | pairing |
|---|---|---|---|---|
| A = A | AA = C | CA = H | GA = M | UA = T |
| C = R | AC = Q | CC = I | GC = F | UC = W |
| G = N | AG = E | CG = L | GG = P | UG = Y |
| U = D | AU = G | CU = K | GU = S | UU = V |

*adopted from Smith et al. (2004) Mol. Biol. Evol. 21: 419-427
RNA basepair coding: substitution matrix
[Figure: substitution matrix over the coded states, with groups labeled non-pairing, canonical, and non-canonical, and with (-) marks]
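One way to picture a substitution matrix organized by these three groups is the Python sketch below. The slide does not give the actual costs, so the values in `PLACEHOLDER_COSTS` are arbitrary placeholders. Treating G-U and U-G wobble pairs as canonical is also an assumption, as is leaving transformations between non-pairing and pairing states undefined (`None`), on the reasoning that under a single structural model a character is either paired or unpaired in every taxon. All names are hypothetical.

```python
STATES = "ARNDCQEGHILKMFPSTWYV"  # the 20 coded states

NON_PAIRING = set("ARND")        # unpaired A, C, G, U
# Assumption: Watson-Crick pairs (AU=G, UA=T, GC=F, CG=L) plus wobble (GU=S, UG=Y).
CANONICAL = set("GTFLSY")


def state_class(state):
    """Classify a coded state as non-pairing, canonical, or non-canonical."""
    if state in NON_PAIRING:
        return "non-pairing"
    if state in CANONICAL:
        return "canonical"
    if state in STATES:
        return "non-canonical"
    raise ValueError("unknown state: %r" % state)


# Arbitrary placeholder costs keyed by a sorted pair of class names.
# These are NOT the values used in the original slides.
PLACEHOLDER_COSTS = {
    ("non-pairing", "non-pairing"): 1,
    ("canonical", "canonical"): 1,
    ("canonical", "non-canonical"): 2,
    ("non-canonical", "non-canonical"): 1,
}


def step_matrix(costs):
    """Build a symmetric cost table; class pairs missing from `costs` map to None."""
    matrix = {}
    for a in STATES:
        for b in STATES:
            if a == b:
                matrix[a, b] = 0
            else:
                key = tuple(sorted((state_class(a), state_class(b))))
                matrix[a, b] = costs.get(key)
    return matrix


matrix = step_matrix(PLACEHOLDER_COSTS)
print(state_class("F"), matrix["F", "L"], matrix["A", "F"])
```

Keying the costs by class rather than by individual state keeps the sketch short; a real step matrix for a parsimony program would need an explicit value for each of the 20 × 20 cells.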
RNA basepair coding: weighting, i.e.
RNA basepair coding scripts
available via the Jrna script package
http://hymenoptera.tamu.edu/rna
This project was funded by NSF-PEET DEB grants 0328922 to Robert Wharton and 0358920 to Anthony Cognato.