Genomic sequence analysis tools and a genotype-phenotype … · 2020. 4. 24. · Dengue Virus...

Genomic sequence analysis tools and a genotype-phenotype association platform in the Virus Pathogen Resource

Yun Zhang (1), Brett Pickett (1), Eva Rab (1), Jyothi Noronha (1), R. Burke Squires (1), Victoria Hunt (1), Mengya Liu (2), Liwei Zhou (3), Chris Larson (4), Jonathan Dietrich (3), Edward B. Klem (3), Richard H. Scheuermann (1,5)

(1) Department of Pathology and (5) Division of Biomedical Informatics, Univ. of Texas Southwestern Medical Center, Dallas, TX; (2) Southern Methodist Univ., Dallas, TX; (3) Northrop Grumman Health Solutions, Rockville, MD; (4) Vecna Technologies, Greenbelt, MD.
Multiple Sequence Alignment, Phylogenetic Tree and Meta-CATS

The Metadata Genome Comparison Analysis Tool consists of three parts: a multiple sequence alignment (using MUSCLE), a chi-square goodness of fit test to identify positions (columns) of the multiple sequence alignment that significantly differ from the expected (random) distribution of residues between all metadata groups, and a Pearson's chi-square test to identify the specific pairs of metadata groups that contribute to the observed statistical difference. When 3 or more groups are included in the analysis, the Goodness of Fit test will identify columns having significant variation between all groups, while the Pearson's test will identify the specific pair(s) of groups that make the column significant (i.e., if those groups were not included in the analysis, the column would no longer be identified as significant).

Note: when only two metadata groups are compared, the p-values for both statistical tests will be identical, since all significant columns from the Goodness of Fit test can only differ between groups.

In the example shown, the Chi-square Goodness of Fit Test Result and the Pearson's Chi-square Pairwise Comparison Report each list 2 positions that differ significantly between the specified groups.

3D Protein Structure Viewer

The viewer controls shown in the Figure 4 screenshot include:

•  Display options: these options control the general appearance of the protein structure in the viewer (display type, color structure by, label).
•  Highlight ligands.
•  Highlight epitopes: first, select an epitope type from the list, then check epitopes to highlight.
•  Highlight by SwissProt position: highlight the residues in this structure corresponding to defined SwissProt positions. Enter one or more comma-delimited positions, or a range, then click Highlight.
•  Highlight/label features: these options color, highlight or label certain features of the structure.
•  Save/restore: the state of the viewer (highlighting, zooming, etc.) can be saved at any time and then retrieved later.
•  Jmol command: advanced users can enter a Jmol command directly using Jmol Interactive Script.

Sequence Feature Variant Type (SFVT)

•  Sequence Features (SFs): characterized structural, functional, immune epitope, or sequence alteration regions of a protein, manually curated from UniProt [1], GenBank [2], and the Immune Epitope Database [3] and then validated by expert researchers.
•  Variant Type (VT): polymorphisms in each Sequence Feature are identified as "Variant Types" of the Sequence Feature.
•  Available for hepatitis C virus subtype 1a, Dengue virus type 2, and Orthopoxvirus (Vaccinia) in ViPR.
•  Enables researchers to quickly query and analyze the genotypic changes for all sequence records that could be associated with a given phenotype.
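The Sequence Feature and Variant Type definitions can be illustrated with a small sketch. It assumes the sequences are already aligned, that a Sequence Feature is given by 1-based start and end columns, and that the reference strain's variant is called VT-1 with the remaining variants numbered by decreasing frequency. The function name `assign_variant_types`, the toy strains, and the numbering rule are hypothetical choices made for illustration; the poster does not state how ViPR numbers its Variant Types.

```python
from collections import Counter


def assign_variant_types(aligned, start, end, reference):
    """Group strains by the residues they carry in one Sequence Feature.

    aligned   -- dict mapping strain name -> aligned protein sequence
    start,end -- 1-based inclusive alignment columns of the feature
    reference -- strain whose variant is labelled VT-1 (an assumption)
    """
    # Slice the feature region out of every aligned sequence.
    variants = {strain: seq[start - 1:end] for strain, seq in aligned.items()}
    counts = Counter(variants.values())

    # VT-1 is the reference variant; the others are numbered by falling
    # frequency, ties broken alphabetically so the result is deterministic.
    ref_variant = variants[reference]
    others = sorted((v for v in counts if v != ref_variant),
                    key=lambda v: (-counts[v], v))
    labels = {ref_variant: "VT-1"}
    for number, variant in enumerate(others, start=2):
        labels[variant] = f"VT-{number}"

    return {strain: labels[variant] for strain, variant in variants.items()}


if __name__ == "__main__":
    # Toy alignment; strain names and sequences are made up.
    toy_alignment = {
        "strain_ref": "MKTAYIAKQR",
        "strain_a": "MKTAYIAKQR",
        "strain_b": "MKTSYIAKQR",
        "strain_c": "MKTSYIAKQR",
        "strain_d": "MKT-YIAKQR",
    }
    print(assign_variant_types(toy_alignment, start=3, end=5,
                               reference="strain_ref"))
```

Once every strain carries a Variant Type label, the labels can be cross-tabulated against a phenotype or other metadata field, which is the kind of genotype-phenotype query the SFVT approach is meant to support.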


    Figure 2: A screenshot of the Sequence Feature Details page. The details page displays strain information, Sequence Feature information, available 3D protein structures, and a table containing all Variant Types for the selected Sequence Feature.

    Figure 4: 3D Protein Structure Viewer in the Virus Pathogen Database and Analysis Resource (ViPR). A display of an example sequence feature highlighted on the 3D structure of the Hepatitis C Virus NS5b protein (PDB ID: 1CSJ).

    Summary

    ViPR combines the strength of a relational database with a suite of integrated bioinformatics tools to support everything from basic sequence and structural analyses to more advanced genotype-phenotype studies. The uniqueness of ViPR lies in:

    •  integrating data from various sources
    •  encouraging the analysis of the comprehensive data contained within the system
    •  combining the available tools to quickly perform complex analytical workflows
    •  facilitating rapid hypothesis generation using bioinformatics methods for subsequent experimental testing
    •  allowing data sharing and storage with collaborators in personal workbenches

    Acknowledgements

    We would like to thank Elliot Lefkowitz, Carla Kuiken, Bernard Moss, and R. Pad Padmanabhan for reviewing and validating the SFVT definitions, as well as the primary data providers for the sequence data that was used throughout this study. We also recognize the scientific and technical personnel responsible for supporting and developing ViPR, which has been wholly supported with federal funds from the NIAID, NIH, Department of Health and Human Services (N01AI2008038 to R.H.S.).

    References

    1. (2011) Ongoing and future developments at the Universal Protein Resource. Nucleic Acids Research, 39, D214-219.
    2. Sayers, E.W., Barrett, T., Benson, D.A., Bolton, E., Bryant, S.H., Canese, K., Chetvernin, V., Church, D.M., DiCuccio, M., Federhen, S. et al. (2011) Database resources of the National Center for Biotechnology Information. Nucleic Acids Research, 39, D38-51.
    3. Vita, R., Zarebski, L., Greenbaum, J.A., Emami, H., Hoof, I., Salimi, N., Damle, R., Sette, A. and Peters, B. (2010) The immune epitope database 2.0. Nucleic Acids Research, 38, D854-862.
    4. Edgar, R.C. (2004) MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics, 5, 113.
    5. Waterhouse, A.M., Procter, J.B., Martin, D.M., Clamp, M. and Barton, G.J. (2009) Jalview Version 2--a multiple sequence alignment editor and analysis workbench. Bioinformatics, 25, 1189-1191.
    6. Zmasek, C.M. and Eddy, S.R. (2001) ATV: display and manipulation of annotated phylogenetic trees. Bioinformatics, 17, 383-384.
    7. Hanson, R. (2010) Jmol - a paradigm shift in crystallographic visualization. Journal of Applied Crystallography, 43, 1250-1260.

    Introduction

    The Virus Pathogen Database and Analysis Resource (ViPR, www.viprbrc.org), sponsored by the National Institute of Allergy and Infectious Diseases, serves as a single publicly accessible repository of integrated datasets and analysis tools for 14 different virus families. It supports wet-bench virology researchers focusing on the development of diagnostics, prophylactics, vaccines, and treatments for these pathogens.

    Figure 1: A screenshot of the ViPR homepage. The ViPR homepage is the portal used to access the various types of data and advanced functionality within the system.

    ViPR Supports 14 Virus Families

    Arenaviridae, Bunyaviridae, Caliciviridae, Coronaviridae, Filoviridae, Flaviviridae, Hepeviridae, Herpesviridae, Paramyxoviridae, Picornaviridae, Poxviridae, Reoviridae, Rhabdoviridae, and Togaviridae.

    ViPR Integrates Data from Many Sources

    •  GenBank sequence records, gene annotations, and strain metadata
    •  Gene Ontology (GO) classifications
    •  UniProtKB protein annotations
    •  Protein Data Bank (PDB) 3D protein structures
    •  Immune epitopes from the Immune Epitope Database (IEDB)
    •  Clinical data
    •  Additional data derived from computational algorithms
    •  Host-Pathogen Interactions (coming soon)

    ViPR Provides Analysis and Visualization Tools

    •  Genome Annotator
    •  BLAST Sequence Similarity Search
    •  Multiple Sequence Alignment
    •  Phylogenetic Tree Construction
    •  3D Protein Structures with Sequence Feature or Epitope Highlights
    •  Sequence Feature Variant Type (SFVT) Analysis
    •  Metadata-driven Comparative Genomics Analysis
    •  SNP Analysis

    ViPR enables you to store and share data and results through the ViPR Workbench.

    Figure 3: Analytical tools available for SFVT data. (A) A multiple sequence alignment calculated with MUSCLE [4] and visualized with JalView [5] in ViPR. (B) A metadata-driven comparative genomics analysis tool to identify individual positions that correlate with a metadata attribute. (C) A phylogenetic tree that has been automatically colored according to country of isolation using the Archaeopteryx [6] tool, showing the relationship between HCV-1 genomes.
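    The metadata-driven comparison in Figure 3B looks for alignment positions whose residues correlate with a metadata attribute. The sketch below shows one way such a test could work for a single alignment column: a chi-square test on the residue-by-group count table across all groups, followed by the same test on each pair of groups. It is an illustration under stated assumptions, not ViPR's implementation: the function names (`chi2_sf`, `column_test`, `pairwise_tests`), the toy data, and the use of a standard contingency-table expectation are all hypothetical, and every group is assumed to contain at least one sequence.

```python
import math
from collections import Counter
from itertools import combinations


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    y = x / 2.0
    # Start from Q(1/2, y) = erfc(sqrt(y)) or Q(1, y) = exp(-y), then apply
    # Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / gamma(a + 1).
    a, q = (0.5, math.erfc(math.sqrt(y))) if df % 2 else (1.0, math.exp(-y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1.0
    return min(q, 1.0)


def column_test(residues_by_group):
    """Chi-square test on the residue-by-group table of one alignment column.

    residues_by_group -- dict mapping group name -> list of residues observed
                         (each group is assumed to be non-empty)
    Returns (statistic, degrees_of_freedom, p_value).
    """
    counts = {g: Counter(r) for g, r in residues_by_group.items()}
    residues = sorted(set().union(*counts.values()))
    total = sum(sum(c.values()) for c in counts.values())
    df = (len(residues) - 1) * (len(counts) - 1)
    if df == 0:  # invariant column or a single group: nothing to test
        return 0.0, 0, 1.0
    stat = 0.0
    for c in counts.values():
        group_n = sum(c.values())
        for res in residues:
            res_n = sum(other[res] for other in counts.values())
            expected = group_n * res_n / total
            stat += (c[res] - expected) ** 2 / expected
    return stat, df, chi2_sf(stat, df)


def pairwise_tests(residues_by_group):
    """Repeat the column test for every pair of groups."""
    return {
        (a, b): column_test({a: residues_by_group[a], b: residues_by_group[b]})
        for a, b in combinations(sorted(residues_by_group), 2)
    }


if __name__ == "__main__":
    # Toy column: residues seen at one alignment position in three groups.
    column = {
        "group_1": list("AAAAAAAAAA"),
        "group_2": list("AAAAAAAAAT"),
        "group_3": list("TTTTTTTTTT"),
    }
    print(column_test(column))
    for pair, result in pairwise_tests(column).items():
        print(pair, result)
```

    In practice the test would be run on every column of the alignment, and with many columns some form of multiple-testing correction would normally be considered; the poster does not say whether ViPR applies one.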

    Search our comprehensive database for:

    •  Genomes
    •  Genes & proteins
    •  Immune epitopes
    •  3D protein structures

    Analyze data online:

    •  Identify similar sequences (BLAST)
    •  Align sequences (MSA)
    •  Identify short peptides in proteins
    •  Visualize aligned sequences

    Use your workbench to:

    •  Store data in working sets for future analysis
    •  Integrate ViPR data with your laboratory data
    •  Store analysis results
    •  Share results and data with collaborators

    For influenza virus data and tools, use the Influenza Research Database, a separate resource created by the ViPR team.

    Two DBPs Funded!!

    Through the Driving Biological Projects (DBP) program, the ViPR team will be collaborating with two groups to extend ViPR in support of more host-pathogen interaction data.

    A group headed by Dr. Moriah Szpara and Dr. Lynn

    Clinical Data for Human Dengue Virus Isolates!!

    In this release of ViPR we have added extensive clinical data for ~2600 Dengue Virus isolates.

    •  Includes patient demographics and measures of disease severity, immune response, evolution, etc.
    •  Isolates from both Eastern and Western Hemispheres.
    •  All linked to complete genome sequences!
    •  Try our Metadata Genome Compare tool for custom comparative analysis.
    •  Search, or download complete dataset.

    Genome Statistics for Virus Families

    Families: 14
    Genera: 70
    Species: 912
    Strains: 50,196
    Segments: 64,367

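    Virus Families

    Single-Stranded Positive-Sense RNA: Caliciviridae, Coronaviridae, Flaviviridae, Hepeviridae, Picornaviridae, Togaviridae

    Single-Stranded Negative-Sense RNA: Arenaviridae, Bunyaviridae, Filoviridae, Paramyxoviridae, Rhabdoviridae

    Double-Stranded RNA: Reoviridae

    Double-Stranded DNA: Herpesviridae, Poxviridae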

    Family: Arenaviridae (1 Genus - 1118 complete genomes)
    Family: Bunyaviridae (4 Genera - 2800 complete genomes)
    Family: Caliciviridae (6 Genera - 451 complete genomes)
    Family: Coronaviridae (2 Subfamilies - 536 complete genomes)
    Family: Filoviridae (2 Genera - 96 complete genomes)
    Family: Flaviviridae (3 Genera - 5967 complete genomes)

    Featured Viruses

    •  Dengue
    •  Hepatitis C virus

    Highlights

    ViPR Workbench

    The ViPR Workbench allows users to save 'working sets' of sequences, searches and analysis results between web sessions in their own private workspace. Users can share working sets or analysis results with collaborators.

    Key Highlights:

    •  Save and organize working sets of sequences, analysis results and search criteria
    •  Visualize saved analysis results
    •  Share working sets and analysis results with others
    •  Upload personal sequences and combine with existing working sets


    Data Summary Updated April 26, 2011

