Creation, curation and analysis of RNA and Protein alignments with Jalview
![Page 1: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/1.jpg)
Creation, curation and analysis of Protein and RNA alignments with Jalview
Jim Procter, Jalview Coordinator. Scottish Phylogeny Discussion Group, James Hutton Institute, 18th February 2013
![Page 2: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/2.jpg)
What is Jalview ?
• A Java alignment viewer .. but not just for viewing..
• Java ?
  – Platform independence
  – Standalone or web based tool
• Open source since ‘98
  – Coordinated in Dundee
  – Core development funded by …
![Page 3: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/3.jpg)
• Columns relate distinct sites in biomolecules
[Figure: example alignment, columns 1-21, with annotation rows]
FER2_ARATH/1-17    MASTALS----SAIVSTSFLR
Q93Z60_ARATH/1-17  MASTALS----SAIVSTSFLR
FER1_MAIZE/1-21    MATVLGSPRAPAFFFSSSSLR
O80429_MAIZE/1-12  MAATAL---------SMSILR
Quality            (bar graph)
Conservation       **7777---------*6*4**
Consensus          MASTALS----SAIVSTSFLR
T-COFFEE           997655311113445789999
Alignment programs: T-COFFEE, Muscle, Clustal, etc..
Sequence Database search: Blast, HMMer, etc..
Ortholog database: Orthodb, Panther, HOGENOM, etc.
Domain or Motif database: NCBI CDD, Prosite, Pfam/Rfam, PIR ..
Expert Knowledge: Experimental characterization, mutation studies.
Phylogenetic trees: MrBayes/RaXML/etc
2D & 3D structure prediction: Homology modelling, Fold recognition, secondary structure & disorder
Published Literature: Figures, supplementary info
Create
Curate
Refine
Evolutionary Analysis: Positive selection analysis, Molecular basis of character traits
Molecular analysis: Structure-function relationships, Active sites, Binding motifs
![Page 4: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/4.jpg)
Curating Alignments
– Alignments sometimes need manual curation
– Correct alignments conserve common properties
– Shading can highlight differences
![Page 5: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/5.jpg)
Alignments are central …
Alignment programs: T-COFFEE, Muscle, Clustal, etc..
Sequence Database search: Blast, HMMer, etc..
Ortholog database: Orthodb, Panther, HOGENOM, etc.
Domain or Motif database: NCBI CDD, Prosite, Pfam/Rfam, PIR ..
Expert Knowledge: Experimental characterization, mutation studies.
Phylogenetic trees: MrBayes/RaXML/etc
2D & 3D structure prediction: Homology modelling, Fold recognition, secondary structure & disorder
Published Literature: Figures, supplementary info
Create
Curate
Refine
Analyse
Annotate
Evolutionary Analysis: Positive selection analysis, Molecular basis of character traits
Molecular analysis: Structure-function relationships, Active sites, Binding motifs
![Page 6: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/6.jpg)
Distributed Annotation System
Interactive Editing, Visualization
Alignments
Structures
Features
Annotation
Trees
Sequences
PCA
Figure Generation
Clickable HTML, Line Art, Images
Analysis
Pairwise alignment
Consensus, Conservation & Clustering
Shading
Trees/PCA
![Page 7: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/7.jpg)
J Song et al. Science 2011;331:1036-1040 Published by AAAS
Fig. 1 Structural overview of mDNMT1(650–1602)–DNA 19-nucleotide oligomer complex with bound AdoHcy.
Results at each stage require analysis: manual verification
A stage may be repeated several times with different methods and parameters
Data, analysis, evidence and results must be properly recorded
Analysis involves lots of different kinds of data:
Sequences, alignments, trees, structures, functional assays, literature.
![Page 8: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/8.jpg)
Illustrator Word/LaTeX.. Etc...
Google Alignment on Web
Jalview
• Key qualities for a workbench
  – Undo and Redo!
  – Archival of data, results and all display settings
  – Filtering & Multiple views
  – Easy access to databases and programs
![Page 9: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/9.jpg)
Lightweight UI
Integrate with web sites
Multi-windowed UI
Visualization & Analysis
Common Data & Analysis
Editing, messaging and File Input/Output
Jalview Flavours
![Page 10: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/10.jpg)
Distributed Annotation System
Interactive Editing, Visualization
Alignments
Structures
Features
Annotation
Trees
Sequences
PCA
Figure Generation
Clickable HTML, Line Art, Images
ClustalW
Mafft
Multiple alignment
Protein Disorder
Functional site analysis
Protein 2ndary structure
Analysis
Pairwise alignment
Consensus, Conservation & Clustering
Shading
Trees/PCA
![Page 11: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/11.jpg)
http://www.jalview.org
New logo
Jalview Launch Buttons
Installation packages and source
Help and documentation
Jalview Community
Jalview Development and release history
Jalview training news and course dates
![Page 12: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/12.jpg)
Launching the Jalview desktop
![Page 13: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/13.jpg)
![Page 14: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/14.jpg)
Total launches by continent (Sep 2012)
Americas: 26,575 (2012), 27,427 (2011), -3%
Europe: 25,161 (2012), 21,612 (2011), +16%
Asia: 15,395 (2012), 11,386 (2011), +35%
Oceania: 758 (2012), 1,184 (2011), -36%
Africa: 468 (2012), 346 (2011), +35%
UK: 5,886, +11%
India: 3168, 75%
Malaysia: 1464, 66%
Brazil: 2109, +50%
![Page 15: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/15.jpg)
Jalview News
![Page 16: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/16.jpg)
How accurate are the Google Analytics Launch Statistics ?
![Page 17: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/17.jpg)
[Chart: Year on Year increase in Desktop Launches as measured by Google Analytics vs our web server logs. Months January to December, 0 to 30,000 launches. Series: Server logs 2009, 2010, 2011, 2012; Google Analytics 2009, 2010, 2011, 2012]
1 in 4 users say ‘no’ to Google Analytics
![Page 18: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/18.jpg)
Jalview Desktop monthly usage from 2007-2012
[Chart: Desktop launches per month and Unique IPs per month, 0 to 30,000]
2008 Total 144868, 23924 IPs
2009 Total 181118, 32537 IPs
2010 Total 191321, 38483 IPs
2011 Total 233320, 48563 IPs
2012 Total 275539, 55805 IPs
![Page 19: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/19.jpg)
The Jalview Example Project
• Demonstrates most of Jalview’s key features
![Page 20: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/20.jpg)
One alignment, many views
Sequence features highlight key regions like functional sites
Alignment annotation area shows graphs and symbols from calculations and manual curation
![Page 21: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/21.jpg)
Linked tree viewer allows subgroups to be identified in alignment
Group selections
Colours and mouseovers
Linked Jmol viewer shows one or more structures coloured by alignment views
![Page 22: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/22.jpg)
Typical Jalview workflow
1. Import sequence or alignment
   – Drag/drop, Paste, or URL
2. Decorate sequences with references & annotation from external databases
3. Create alignment
4. Use built in shading, conservation analysis, tree and PCA capabilities to explore
   – Also use annotation and structure data if available
5. Select regions for refinement or further analysis
6. Import trees, annotation, etc. created with other programs to explore further
7. Prepare annotated views for publication
![Page 23: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/23.jpg)
Phylogenetic analysis and Jalview
• Built in tree methods
  – UPGMA
    • Fast, simple, but not reliable for phylogenetic inference
  – Neighbour joining
    • Slower than UPGMA
    • Perhaps useful for a first approximation
  – Jalview’s implementation is not the most efficient
• Import trees for subgroup analysis
  – Load any number of Newick/NH Extended files onto an alignment from another program
    • Bootstraps are displayed
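To make the UPGMA bullet concrete, here is a minimal Python sketch of average-linkage clustering over aligned sequences. It is not Jalview's implementation: the distance measure (fraction of mismatching columns, skipping columns where either sequence has a gap) and the function names `p_distance` and `upgma` are assumptions chosen for illustration, and the output is a Newick topology without branch lengths.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of mismatching columns, skipping columns where either row has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(names, rows):
    """Return a Newick topology (no branch lengths) built by UPGMA."""
    # Each cluster id maps to (newick subtree, number of leaves).
    clusters = {i: (names[i], 1) for i in range(len(names))}
    dist = {
        frozenset((i, j)): p_distance(rows[i], rows[j])
        for i, j in combinations(range(len(names)), 2)
    }
    next_id = len(names)
    while len(clusters) > 1:
        # Pick the closest pair of clusters.
        i, j = min(combinations(sorted(clusters), 2), key=lambda p: dist[frozenset(p)])
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # Distance to the merged cluster is the size-weighted average.
        for k in clusters:
            d_ik, d_jk = dist[frozenset((i, k))], dist[frozenset((j, k))]
            dist[frozenset((next_id, k))] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        clusters[next_id] = (f"({tree_i},{tree_j})", n_i + n_j)
        next_id += 1
    return next(iter(clusters.values()))[0] + ";"


names = ["FER2_ARATH", "Q93Z60_ARATH", "FER1_MAIZE", "O80429_MAIZE"]
rows = [
    "MASTALS----SAIVSTSFLR",
    "MASTALS----SAIVSTSFLR",
    "MATVLGSPRAPAFFFSSSSLR",
    "MAATAL---------SMSILR",
]
print(upgma(names, rows))
```

The two identical Arabidopsis sequences are joined first. As the slide warns, a tree like this is only a quick grouping aid, not a phylogenetic inference.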
![Page 24: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/24.jpg)
Typical Jalview workflow
1. Import sequence or alignment
   – Drag/drop, Paste, or URL
2. Decorate sequences with references & annotation from external databases
3. Create alignment
4. Use built in shading, conservation analysis, tree and PCA capabilities to explore
   – Also use annotation and structure data if available
5. Select regions for refinement or further analysis
6. Import trees, annotation, etc. created with other programs to explore further
7. Prepare annotated views for publication
Save as Jalview project at strategic points
![Page 25: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/25.jpg)
• Jalview projects store key data for a session
  – Alignments
  – Annotation & Database IDs
  – Structures
  – Trees
  – Display settings
• New Jalview versions are tested for backwards compatibility
  – Archival record of your analysis
![Page 26: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/26.jpg)
Distributed Annotation System
Interactive Editing, Visualization
Alignments
Structures
Features
Annotation
Trees
Sequences
PCA
Retrieval from External Databases Standard DBs
![Page 27: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/27.jpg)
DAS allows Jalview access to over 270 Sequence Databases…
• New database dialog in 2.8.x series
• Cross source querying
• ENSEMBL sources
![Page 28: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/28.jpg)
JDAS
Developed by Rafael Jimenez, Jonathan Warren and other Java developers in the DAS Community
![Page 29: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/29.jpg)
Sequence Features
![Page 30: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/30.jpg)
Sources of sequence feature data
• Jalview sequence annotation files
• DAS sources
• GFF files
• Certain ‘rich’ alignment formats
  – Stockholm
  – AMSA
![Page 31: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/31.jpg)
Sequence Features Dialog box
DAS ANNOTATION SERVERS
• Query matches ID to Authority
• Map to local reference frame
• Mouse over for feature name, links and scores
• Group features by source
• Type==colour
• Highlight start-end
• Order for optimal display
• Select specific sources
• Filtered list
• Add user defined sources
![Page 32: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/32.jpg)
Shading, thresholding, colour by label.
![Page 33: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/33.jpg)
ClustalW
Mafft
AACon
Clustal Omega quick alignment of millions
of sequences
Assorted protein disorder predictors
Protein conservation calculations
JABAWS 2 services in Jalview 2.8
![Page 34: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/34.jpg)
Alignment Web Services: JABAWS:MSA
Peter Troshin
jws2
jaws2
HTTP
Replaces original Jalview 2 services:
• Extensible framework for wrapping command line programs
• Can be installed on user’s own machine/cluster
See Troshin et al. application note in Bioinformatics for more details.
![Page 35: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/35.jpg)
www.compbio.dundee.ac.uk/jabaws
Jalview Web Service GUI
JABAWS Java Client
JABAWS:MSA Troshin et al. 2011, Bioinformatics. JABAWS 2 In Preparation.
Native JABAWS installs on a range of platforms
JABAWS Virtual Appliance for your private use.
JABAWS Amazon Machine Image on EC2
![Page 36: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/36.jpg)
Jalview’s JABAWS Configuration Panel
![Page 37: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/37.jpg)
Jalview’s Alignment Methods
• JABAWS alignment services
  – Preset alignment modes
  – User defined settings
• Pairwise alignment
  – Needleman and Wunsch
    • Mostly used internally
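For readers unfamiliar with the Needleman and Wunsch method named above, the following Python sketch shows the global alignment recurrence and traceback. It is an illustration only: the scoring scheme (match +1, mismatch -1, linear gap penalty -1) and the function name `needleman_wunsch` are assumptions, and Jalview's own pairwise aligner may use a different scoring scheme.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = needleman_wunsch("MASTALS", "MATALS")
print(aligned_a)
print(aligned_b)
print(total)
```

The table costs time and memory proportional to the product of the two sequence lengths, which is fine for a pair of sequences but does not scale to many.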
![Page 38: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/38.jpg)
Jalview Public Service Statistics
[Chart: service usage, 0 to 12,000, from Jun-05 to Aug-12. Series: TcoffeeWS, ProbconsWS, MuscleWS, MafftWS, ClustalOWS, ClustalWS, JNet MSA Predictions, JNet Sequence Predictions, Mafft Alignments, Clustal Alignments, Muscle Alignments]
93939 Alignments and 8789 Jpred3 Jobs from September 2011 – August 2012
83950 Alignments and 8193 Jpred3 Jobs from September 2010 – August 2011
![Page 39: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/39.jpg)
Common types of alignment algorithm
a. Sequence database searches – optimal alignment between query and hit, e.g. Blast (single sequence), PSI-Blast and HMMER
b. Progressive – optimise alignment between branches on guide tree, e.g. ClustalW
c. Transitive – optimise MSA to maximise consistency between pairs, e.g. T-COFFEE, ProbCons
Profile methods – e.g. Muscle and MAFFT are hybrids of b and c. Latest methods, e.g. ClustalO, are hybrids – employ sampling strategies to speed up tree building & refinement.
[Figure: panels a, b and c, each showing a Query and Hit1 to Hit5]
Figure adapted from Procter et al. (2010) Nature Methods 7 S16-S25
![Page 40: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/40.jpg)
Alignment Job Parameter Settings
Browse or edit to change name of set
Text box to add notes for the parameter set
Start job with current settings or cancel.
Buttons appear to create, update, rename or delete user settings.
Parameters contains more complex settings
Tooltips give brief description and link (right click) to further info
![Page 41: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/41.jpg)
‘Realignment’ – adding sequences to an existing alignment
Clustal Realign options:
• Jalview leaves gaps in when sending sequences to JABAWS
• ClustalW
  – Fixes sequences with gaps, aligns other sequences to profile
• ClustalO
  – Creates an HMM from sequences with gaps
  – Aligns all sequences to HMM
![Page 42: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/42.jpg)
ClustalW
Mafft
AACon
Clustal Omega quick alignment of millions
of sequences
![Page 43: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/43.jpg)
18 new alignment conservation calculations provided as web services
AACon
• Work like built-in calculations
• GUI to control parameters
• Settings stored in project file
![Page 44: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/44.jpg)
Are conservation scores trustworthy ?
• Good quality alignments
  – Conservation == probable molecular similarity
• Poor alignments
  – Conservation == random noise
• How do you measure MSA reliability ?
  – Try different methods. Vary parameters
  – Compare the results.
![Page 45: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/45.jpg)
[Figure: example alignment, columns 1-21, with T-COFFEE, Quality, Consensus and Conservation annotation rows]
FER2_ARATH/1-17    MASTALS----SAIVSTSFLR
Q93Z60_ARATH/1-17  MASTALS----SAIVSTSFLR
FER1_MAIZE/1-21    MATVLGSPRAPAFFFSSSSLR
O80429_MAIZE/1-12  MAATAL---------SMSILR
Consistent Pairwise Alignments
Score = 740 Length = 17 PID = 100%
Q93Z60_ARATH MASTALSSAIVSTSFLR
FER2_ARATH   MASTALSSAIVSTSFLR
Score = 160 Length = 17 PID = 29.41%
FER1_MAIZE   LGSPRAPAFFFSSSSLR
FER2_ARATH   MASTALSSAIVSTSFLR
Score = 160 Length = 17 PID = 29.41%
FER1_MAIZE   LGSPRAPAFFFSSSSLR
Q93Z60_ARATH MASTALSSAIVSTSFLR
Score = 310 Length = 12 PID = 58.33%
O80429_MAIZE MAATALSMSILR
FER2_ARATH   MASTALSSAIVS
Score = 310 Length = 12 PID = 58.33%
O80429_MAIZE MAATALSMSILR
Q93Z60_ARATH MASTALSSAIVS
Major inconsistency: +4 Shift
Score = 120 Length = 12 PID = 41.67%
O80429_MAIZE MAATALSMSILR
FER1_MAIZE   APAFFFSSSSLR
• Calculate ‘shift’ between all pairwise alignments and the multiple sequence alignment
  – Higher shifts are less reliable
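The PID values quoted for the pairwise alignments on this slide can be reproduced with a few lines of Python. This is a sketch under one stated assumption: percent identity is taken as identical residues divided by the number of aligned columns (columns where both rows are gaps are ignored). Other definitions of the denominator exist, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(row_a, row_b):
    """Percent of aligned columns in which both rows hold the same residue."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    # Ignore columns that are gaps in both rows.
    columns = [(x, y) for x, y in zip(row_a, row_b) if x != "-" or y != "-"]
    if not columns:
        return 0.0
    identical = sum(x == y and x != "-" for x, y in columns)
    return 100.0 * identical / len(columns)


# Pairwise alignments taken from the slide (no gaps in these pairs).
print(round(percent_identity("LGSPRAPAFFFSSSSLR", "MASTALSSAIVSTSFLR"), 2))
print(round(percent_identity("MAATALSMSILR", "MASTALSSAIVS"), 2))
print(round(percent_identity("MAATALSMSILR", "APAFFFSSSSLR"), 2))
```

Under this definition the three pairs have 5 of 17, 7 of 12 and 5 of 12 identical columns, which matches the 29.41%, 58.33% and 41.67% shown on the slide.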
![Page 46: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/46.jpg)
T-COFFEE alignment reliability scores
This figure shows a set of structures superimposed according to an alignment generated by T-COFFEE. The T-COFFEE reliability score highlights the most reliable regions in red, and least reliable in blue.
The easiest way to do this is from the command line, or from the T-COFFEE web site
• CORE – score an alignment
  • http://tcoffee.crg.cat/apps/tcoffee/do:core
• M-COFFEE – combine results from many ‘popular aligners’
  • http://tcoffee.crg.cat/apps/tcoffee/do:mcoffee
• Jalview can read the ‘score_ascii’ file for an alignment
Hope to add functionality to JABAWS in future
![Page 47: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/47.jpg)
Multi Harmony – from Jaap Heringa’s Group at Amsterdam Free University
Progress monitored in job service window
SRBS client submits alignment and groups to service
Return results to user
![Page 48: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/48.jpg)
![Page 49: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/49.jpg)
Protein Secondary Structure Prediction
• Neural network trained on amino acid profiles
  – Predicts Helix, shEet, or Coil based on sliding window
• Also predicts coiled coils and surface accessibilities
• Server can take
  – Single Sequence
    • Service finds homologs with PSI-Blast
  – Alignment
    • Service uses MSA to calculate profile for prediction
![Page 50: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/50.jpg)
Just to recap…
Sequence features are overlaid on alignment to highlight key regions
Alignment annotation area shows graphs and symbols from calculations and manual curation
Annotation like T-COFFEE scores can be used to shade alignment
![Page 51: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/51.jpg)
![Page 52: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/52.jpg)
Protein Disorder prediction
• Complementary problem to secondary structure prediction
  – Recognise structured & unstructured domains
  – Predict holes in density maps (REM465)
  – Detect flexible loops (‘HOTLOOPS’)
• Programs provided by JABAWS 2 employ
  – Machine learning methods (DisEMBL)
  – Similarity to disordered sequences (RONN)
  – Empirical amino acid statistics (IUPred, GlobPlot)
![Page 53: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/53.jpg)
Disorder Predictions from JABAWS
Jalview JABAWS 2.0 Client: process results into both annotation and features
JABAWS Analysis Service
Features highlight disordered region or structured domain predictions
Use Threshold & Per-sequence option on ‘Colour by Annotation’ dialog to shade alignment using raw scores
![Page 54: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/54.jpg)
Disorder in Interleukin 7 Orthologs
![Page 55: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/55.jpg)
Jalview and Nucleotide data
• Basic by current standards
• Built in cDNA->Amino acid translation
  – Works with Aligned cDNA
  – Preserves alignment annotation
  – Does not backtranslate
• European nucleotide archive records
  – Parse cDNA annotation in contigs
• Display of WUSS or VIENNA RNA secondary structure notation
  – Stockholm file import
  – LocARNA Clustal files
![Page 56: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/56.jpg)
Translation of annotated cDNA alignment
Codon highlighting
Mouse is over Arginine
Structure Highlighting
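As an illustration of translating an aligned cDNA while keeping the alignment, here is a small Python sketch. It is not Jalview's code. It assumes the standard genetic code, that the reading frame starts at the first column, and that gaps fall in whole in-frame codons. The name `translate_aligned` is hypothetical, and codons it cannot interpret become `X`.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_aligned(cdna_row):
    """Translate one aligned cDNA row codon by codon, keeping gap columns."""
    row = cdna_row.upper().replace("U", "T")
    protein = []
    # A trailing incomplete codon (1 or 2 bases) is ignored.
    for start in range(0, len(row) - len(row) % 3, 3):
        codon = row[start:start + 3]
        if codon == "---":
            protein.append("-")  # a whole-codon gap becomes one gap column
        else:
            protein.append(CODON_TABLE.get(codon, "X"))  # X: partial gap or ambiguity
    return "".join(protein)


print(translate_aligned("ATGGCT---AGC"))
print(translate_aligned("atgcgt"))
```

Because each codon maps to exactly one protein column, column `k` of the translation corresponds to cDNA columns `3k` to `3k+2`. A mapping of that kind is what would let a mouse position over a residue be related back to its codon.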
![Page 57: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/57.jpg)
Jalview 2.8 and RNA 2nd-ary Structure
Structure Consensus Logo: Shows base pair distribution at each paired position in a given RNA secondary structure.
Linked VARNA RNA Secondary Structure viewer and editor.
RALEE style colouring highlights distinct stems and helices
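Paired positions of the kind the consensus logo and stem colouring rely on can be read from a bracket-style structure line. The Python sketch below handles VIENNA dot-bracket strings and the nested bracket pairs used in WUSS (`()`, `<>`, `[]`, `{}`). It treats every other symbol as unpaired, so WUSS pseudoknot letters are not interpreted. The function name `base_pairs` is hypothetical.

```python
# Closing symbol -> matching opening symbol.
CLOSERS = {")": "(", ">": "<", "]": "[", "}": "{"}
OPENERS = set(CLOSERS.values())


def base_pairs(structure):
    """Return sorted (i, j) zero-based column pairs from a bracket structure line."""
    stacks = {opener: [] for opener in OPENERS}
    pairs = []
    for pos, symbol in enumerate(structure):
        if symbol in OPENERS:
            stacks[symbol].append(pos)
        elif symbol in CLOSERS:
            stack = stacks[CLOSERS[symbol]]
            if not stack:
                raise ValueError(f"unmatched '{symbol}' at column {pos}")
            pairs.append((stack.pop(), pos))
        # Any other symbol (., -, _, :, letters) is treated as unpaired.
    if any(stacks.values()):
        raise ValueError("structure has unclosed opening brackets")
    return sorted(pairs)


print(base_pairs("((..))"))
print(base_pairs("<<..>>.(.)"))
```

Keeping one stack per bracket type means different bracket styles are matched independently of each other.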
![Page 58: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/58.jpg)
VARNA has a wide range of 2D RNA plots and supports interactive annotation
Selections and mouse positions shared between alignment view and VARNA
VARNA: Interactive drawing and editing of the RNA secondary structure. Kévin Darty, Alain Denise and Yann Ponty. Bioinformatics (2009) 25 1974-1975
![Page 59: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/59.jpg)
A brief history of Jalview
Michele Clamp: Director of Informatics and Scientific Applications, Harvard.
James Cuff: High Performance Computing, Harvard.
Steve Searle: Now at Sanger, UK
Andrew Waterhouse: U. Basel.
Jim Procter
David Martin
Jalview Version 1: 1997
2004: Jalview 1 published.
Jalview Version 2: 2005
VAMSAS eScience project
![Page 60: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/60.jpg)
Jalview v2: Alignment, Analysis, Figure Generation
TOPALi v2: Evolution & Phylogeny
AstexViewer@MSD-EBI: Structure analysis
Iain Milne
Dominik Lindner
Frank Wright
David Marshall
Pierre Marguerite
Tom Oldfield
Andrew Waterhouse
Jim Procter
David Martin, Geoff Barton
VAMSAS: Visualization and Analysis of Molecular Sequences, Alignments and Structures
![Page 61: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/61.jpg)
Analysis of Protein Sequences
Analysis of Nucleic Acid Sequences
Analysis of Protein Structures
VAMSAS: Visualization and Analysis of Molecular Sequences, Alignments and Structures
![Page 62: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/62.jpg)
Jalview
TOPALi
AstexViewer@MSD-EBI
Aim: Enable user to move between different VAMSAS Applications
Databases, Annotation, Alignment
2-ary Structure Prediction
Structure Databases, Structural Clustering, Uniprot/MSD Mapping
Model Selection, Phylogeny
Ancestral Sequences, Positive Selection, Recombination
Still can’t do this ☹
![Page 63: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/63.jpg)
A brief history of Jalview
Michele Clamp: Interim Director of Research Computing, Harvard.
James Cuff: Cycle Computing, USA.
Steve Searle: Now at Sanger, UK
Andrew Waterhouse: U. Basel.
Jim Procter
David Martin
Jalview Version 1: 1997
2004: Jalview 1 published.
Jalview Version 2: 2005
2.1 2.2 2.3 2.4
VAMSAS eScience project
Jalview 2 Paper: 2009
![Page 64: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/64.jpg)
Unsupported graphical tools cause headaches..
User Headaches:
• Can’t do what you want:
  – Read/write format ‘X’
  – Run my analysis ‘Y’
  – Load my 10Gb dataset
• Slow or difficult to use
• Doesn’t work properly
Core Developer needed for:
• Staying ‘state of the art’:
  – New standards
  – New analysis methods
  – New data
• Maintenance
  – Dependencies
  – New OS Versions
  – New hardware
• Hard to do ‘research’ and maintain software
![Page 65: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/65.jpg)
Michele Clamp Interim Director of Research Computing, Harvard.
James Cuff Cycle Computing, USA.
A brief history of Jalview
Steve Searle Now at Sanger, UK
Andrew Waterhouse U. Basel.
Jalview Version 2 2005
Jalview Version 1 1997
Jim Procter
David Martin
2004 Jalview 1 published.
Jalview 2 Paper 2009
2.1
2.2
2.3
2.4 2.5 2.6
New BBSRC Funding 2009-2014
.1 2.7
Peter Troshin
2.8
VAMSAS eScience project
![Page 66: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/66.jpg)
The Jalview 5 Year Plan
Extensible Maintainable
Sustainable
Community
More Capable
Users and Developers
More Flexible
1 Oct 2009 30th Sep 2014
5 Year Tools and Resources Development Fund Grant from the UK Biotechnology and Biological Sciences Research Council
![Page 67: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/67.jpg)
The Jalview 5 Year Plan
Extensible Maintainable
More Capable More Flexible
Software Engineering
Large Datasets
Analysis services
1 Oct 2009 30th Sep 2014
Sustainable
Community
GUI refactor & plugins
![Page 68: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/68.jpg)
Co-developed with Sasha Sherstnev on Jpred BBSRC BBR
![Page 69: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/69.jpg)
The Jalview 5 Year Plan
Extensible Maintainable
More Capable More Flexible
Software Engineering
Large Datasets
Analysis services
1 Oct 2009 30th Sep 2014
Sustainable
Community Users and Developers
Community Outreach
• Website redesign
• Training
• Users and Developers
• Internationalisation
• True Open Source Development
• Issue tracker and open repository
GUI refactor & plugins
![Page 70: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/70.jpg)
![Page 71: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/71.jpg)
www.google-melange.org www.google-melange.com
![Page 72: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/72.jpg)
Lauren Lui, UC Santa Cruz. http://jalview-rnasupport.blogspot.com/
alignment fetcher
Purine/pyrimidine colourscheme
Colouring to highlight helical structure
WUSS annotation parser (from RALEE)
NESCent
![Page 73: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/73.jpg)
Jan Engelhardt (Uni. Leipzig)
NESCent
![Page 74: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/74.jpg)
Google Summer of Code 2013: Why should I participate?
Students gain:
• skills
• real world experience
• sample code
• contacts
Organizations gain:
• new contributions & contributors
• global exposure
NESCent
![Page 75: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/75.jpg)
Google and NESCent Summer of Code 2013
New phylogeny support in Jalview?
• cDNA/protein editing
– Display AA translation of cDNA alignment
– ORF support
• Bacterial genome alignment manipulation
– New editing
• JABAWS
– Phylogeny services
– Extend alignment services to:
  • Return guide trees and uncertainty scores
  • Support ‘add’ and profile alignment
![Page 76: Creation, curation and analysis of RNA and Protein alignments with Jalview](https://reader033.fdocuments.net/reader033/viewer/2022060108/554f526bb4c905524c8b4f54/html5/thumbnails/76.jpg)
Kersten Schroeder, U. Dundee, UK.
Paul Gardiner, Rfam, NZ.
Albert Vilella, EBI, UK.
The Jalview developers
Michele Clamp, Harvard, USA.
James Cuff, Cycle Computing
Steve Searle, Sanger, UK
Andrew Waterhouse, Basel, Switzerland.
Geoff Barton (Money)
David Martin (Teaching)
Peter Troshin (JABAWS)
Barry Strachan (logo)
Tom Walsh (Apache)
Ryan Maclaughlan (CSS)
Andrew Millar (Drupal)
All the Jalview users, and …
RNA Features
Lauren Lui, UC Santa Cruz, USA.
Jan Engelhardt, Univ. Leipzig, Germany.
Yann Ponty (VARNA), École Polytechnique, Fr.
RNA Experts
T-COFFEE Scores
Paolo di Tomasso, Notredame Group, CRG, Spain.
Google Summer of Code
Hilmar Lapp & Karen Cranston, NESCent (Duke U.)
Yann Ponty, École P., Fr.