Centers of Excellence for Influenza Research and Surveillance 6th Annual Meeting Aug 1, 2012 Status...
Centers of Excellence for Influenza Research and Surveillance
6th Annual Meeting
Aug 1, 2012
Status of IRD Development
Session Topics
• Current CEIRS data in IRD
  – Surveillance
  – Serology
  – Immunology & ImmPort
• IRD enhancements over past year
  – Search improvements
  – Surveillance data from map
  – Support for serology data
  – 3D movies
  – Phylogenetic tree decoration
  – Metadata-driven comparative genomics analysis
  – Sequence feature submission tool
  – Host factor data
  – Publications
• Plans for future development
Current CEIRS Data in IRD
CEIRS Surveillance Samples
               2008   2009    2010    2011     2012    Total
# of Samples   3040   65806   68551   132266   19321   288984
[Bar chart: CEIRS Surveillance Samples by year; y-axis: # of samples]
94% avian; 5.8% non-human mammalian; 0.2% human
Surveillance Sample Stats
                     Avian Records   Avian %   Non-Human Mammalian Records   Mammalian %
Total                197,207                   14,098
Tested               175,746         89%       12,973                        92%
Flu-positive         10,136          5.8%      510                           3.9%
Linked to sequence   772             7.6%      11                            2.2%
*as of May 1, 2012
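The percentages in this table appear to be stepwise: each row is taken relative to the row above it (tested out of total, flu-positive out of tested, linked to sequence out of flu-positive). A minimal Python sketch that reproduces them from the counts on the slide; the dictionary layout and the function name `stepwise_percentages` are illustrative assumptions, not part of IRD.

```python
# Counts copied from the Surveillance Sample Stats slide (as of May 1, 2012).
COUNTS = {
    "avian": {"total": 197_207, "tested": 175_746, "flu_positive": 10_136, "linked": 772},
    "non_human_mammalian": {"total": 14_098, "tested": 12_973, "flu_positive": 510, "linked": 11},
}

# Order of the rows in the table; each stage is a subset of the previous one.
STAGES = ["total", "tested", "flu_positive", "linked"]


def stepwise_percentages(counts):
    """Return each stage as a percentage of the stage directly before it."""
    result = {}
    for prev, cur in zip(STAGES, STAGES[1:]):
        result[cur] = 100.0 * counts[cur] / counts[prev]
    return result


for group, counts in COUNTS.items():
    pct = stepwise_percentages(counts)
    print(group, {stage: round(value, 1) for stage, value in pct.items()})
```

Reading the table this way, 7.6% means that 772 of the 10,136 flu-positive avian records are linked to a sequence, not 7.6% of all avian records.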
Serology Samples
Species category      Submission Year   Sample Count
Avian                 2011              254
Avian                 2012              962
Human                 2012              201
Non Human Mammalian   2010              200
Non Human Mammalian   2011              986
Non Human Mammalian   2012              72
TOTAL                                   2675
Influenza Serology Data
CEIRS Immunology Data in IRD
Introduction to ImmPort
• Immunology Database and Analysis Portal (ImmPort)
  – Bioinformatics Integration Support Contract (BISC)
• Purpose
  – Warehouse for storing immunology experiment data
  – Integrate data with analysis and visualization tools
  – Provide access to research community
• Projects
  – Population Genetics Analysis Program
  – HLA Region Genetics in Immune-mediated Diseases
  – Modeling Immunity for Biodefense
  – Others
Additional ImmPort Capabilities
• Integrate data from multiple resources
  – OMIM, GO, synonyms, protein-protein interactions, etc.
• Suite of data analysis and visualization tools
  – Microarray
  – Flow Cytometry
  – Other “-omics” platforms
IRD Enhancements Over Past Year
Sequence Search Page Enhancements
Quick Text Search
Surveillance Data from Map
Spinning 3D Protein Structure Movie
Phylogenetic Tree Decoration
Decorate by:
– Host species
  • Avian: Avian grouped/separated
– Country
– Year
– HA subtype
– NA subtype
– HA & NA subtype
– Geographic region
– Flu season
– SFVT
Manual decoration
Metadata-driven Comparative Analysis Tool
Sequence Feature Variant Type (SFVT)
Sequence Feature Submission Tool
DMID Systems Biology Program
Host Factor Data
IRD/ViPR Publications 2012
Future Development Plans
User Support and Outreach
• Data
  – Evaluate feasibility of supporting Antigenic Cartography
  – Prepare packages of (correctly formatted) data to export to external tools
• Outreach
  – Perform on-site outreach at CEIRS centers
  – Continue developing tutorials for existing tools & features
Search
• Query Capabilities
  – Ability to search for high-path and/or low-path strains (using sequence biomarkers)
Comparative Genomics
• Develop PCR Primer design tool (exclude orthologs)
• Increase SF definitions for: virulence, host specificity, replication, etc.
• Provide a new tool to assign (or convert between) sequence coordinate schemes
Annotation and Host Factor Data
• Ensure sequence submissions are appropriately prepared (e.g. no primer sequence)
• Increase number of host factor datasets
• Develop method to handle different statistical methods from various “-omics” platforms (e.g. microarray, proteomics, etc.)
Surveillance
• Identify NIAID-funded human surveillance studies and solicit deposition into IRD
• Develop additional use-cases to identify additional helpful data types
Immunology
• Epitopes
  – Add search options such as: CD4, CD8, host
• Serology
  – Solicit feedback from community on use-cases
  – Identify volunteers for data submission
PA-X Prediction for All Strains
• Build on analysis performed earlier this year by Jagger et al. Science 2012 Jul 13;337(6091):199-204
  – Identified new protein on segment 3 using ~1000 sequences
• Frameshift occurs at codon 190 in PA protein, results in new C-terminus
• IRD will extend this analysis across all segment 3 sequences in the resource
  – Add PA-X annotation to existing IRD sequence records
  – Allow users to search for PA-X protein sequences
  – Provide data that can assist in downstream comparative genomics analyses
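The frameshift idea on this slide can be illustrated with a short Python sketch: translate the PA coding sequence in frame up to the shift site, then continue in the +1 frame until the first stop codon. This is only a sketch under stated assumptions, not the IRD pipeline or the method of Jagger et al. The function names, the default `shift_codon=190` (taken from the slide), and the junction convention (shift applied immediately after that codon) are assumptions for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG x TCAG x TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(nt):
    """Translate nucleotides in frame 0, stopping before the first stop codon."""
    protein = []
    for i in range(0, len(nt) - 2, 3):
        aa = CODON_TABLE.get(nt[i:i + 3], "X")  # "X" marks ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def predict_frameshift_product(cds, shift_codon=190, shift=1):
    """Translate `shift_codon` codons in frame, then continue in the +`shift` frame.

    ASSUMPTION: the shift is applied immediately after codon `shift_codon`;
    the exact junction should be checked against the published analysis.
    Returns None if the in-frame part is shorter than `shift_codon` codons.
    """
    cds = cds.upper().replace("U", "T")
    junction = 3 * shift_codon
    n_term = translate(cds[:junction])
    if len(n_term) < shift_codon:
        return None  # sequence too short or interrupted before the shift site
    c_term = translate(cds[junction + shift:])
    return n_term + c_term


# Toy sequence (not a real PA gene): in frame 0 a stop follows codon 3,
# while the +1 frame after codon 3 continues for two more residues.
toy = "ATGGCAGCATAAAGGGTAACC"
print(translate(toy))                                  # MAA
print(predict_frameshift_product(toy, shift_codon=3))  # MAAKG
```

Applied to every segment 3 record, a routine of this shape would yield a candidate PA-X sequence per strain, which could then be stored as an annotation and made searchable.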
H5 clade annotation tool
• Automated clade determination for any query HA sequence
• Match WHO clade definitions
NGS Deep Sequencing Data
• Primary data in SRA
• Derived data in IRD
  – Positions with sequence variation
  – Proportion of reads with a particular sequence variation
• Metadata to understand the context
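The two derived data types listed above (positions with variation, and the proportion of reads carrying each variant) can be sketched in a few lines of Python. This is a minimal illustration, not IRD's implementation: it assumes reads are already aligned to a common coordinate system and padded with "-" where they do not cover a position, and the function name `variant_positions` and the 5% reporting cutoff are hypothetical.

```python
from collections import Counter


def variant_positions(aligned_reads, min_fraction=0.05):
    """Return {1-based position: {base: fraction of reads}} for variable columns.

    ASSUMPTION: reads share coordinates and use '-' for uncovered positions.
    A column is reported when at least two bases each reach `min_fraction`.
    """
    if not aligned_reads:
        return {}
    length = max(len(read) for read in aligned_reads)
    report = {}
    for pos in range(length):
        # Collect the bases observed at this column, skipping gaps and short reads.
        bases = [r[pos] for r in aligned_reads if pos < len(r) and r[pos] != "-"]
        if not bases:
            continue
        depth = len(bases)
        fractions = {base: n / depth for base, n in Counter(bases).items()}
        supported = [b for b, f in fractions.items() if f >= min_fraction]
        if len(supported) > 1:
            report[pos + 1] = fractions
    return report


reads = ["ACGT", "ACGT", "ATGT", "AC-T"]
print(variant_positions(reads))  # {2: {'C': 0.75, 'T': 0.25}}
```

Real deep-sequencing data would normally be read from an alignment file and filtered by base quality and depth, which is where the accompanying metadata becomes necessary for interpretation.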