Gramene Scientific Advisory Board December 14, 2010
Gramene Scientific Advisory Board
December 14, 2010
1Gramene SAB 2010
Introduction of SAB Members
• David Marshall (SCRI)
• Paul Flicek (EBI)
• Michael Ashburner (Cambridge)
• Anna M McClung (USDA-ARS)
• Patricia Klein (Texas A&M)
• William Beavis (Iowa State)
• Tim Nelson (Yale)
• Georgia Davis (Missouri)
Introduction of Gramene
• Doreen Ware (CSHL, PI)
• Susan McCouch (Cornell, PI)
• Pankaj Jaiswal (OSU, PI)
• Ed Buckler (Cornell, PI)
• Vindhya Amarasinghe (OSU, Pathways)
• Karthikeyan Athikkattuvalasu (Cornell, Diversity, Phenotypes)
• Terry Casstevens (Cornell, Diversity)
• Charles Chen (Cornell, Diversity)
• Aaron Chuah (CSHL, Diversity)
• Genevieve DeClerck (Cornell, Diversity)
• Palitha Dharmawardhana (OSU, Pathways)
• Marcela Monaco (CSHL, Pathways)
• Will Spooner (CSHL, Genomes)
• Joshua Stein (CSHL, Genomes)
• Jim Thomason (CSHL, Germplasm, Website, Pathways, Genes)
• Sharon Wei (CSHL, Genomes)
• Ken Youens-Clark (CSHL, Project Manager, etc.)
Aim 1: Genomes
Doreen Ware, PI
Sharon Wei, Will Spooner, Ken Youens-Clark, Jim Thomason, Marcela Monaco, Josh Stein,
(Total Full Time Equivalent [FTE] 3.5)
Note: hired 25% FTE (Josh) to replace Noel Yap who left the project in the Cornell Group
1.5 FTE available from Ware, Dvorak NSF collaborations
Suggestions From Last Year
• Add Brachypodium
– Added in Release 29
• Add a basal plant, e.g. Selaginella
– We chose Physcomitrella patens because it was better documented at the time (GB record and published)
– Selaginella now has a GB record and will be investigated for 2011
• Add a Solanaceae and/or legume
– We are adding tomato in 2011 and are looking into either soybean or Medicago
• Display RNAseq data
– We now have the ability to display it as a DAS track (see maizesequence.org)
– Need to investigate data sources
Highlights in 2010
• Genomes: 3 new; many updates
• Software: Ensembl 59 provides new visualizations
– SNP view
– SNP Mart
– Multi-species view
– Multi-sequence alignment
• New Analyses
– Gene-centered synteny build
– EPO multi-sequence alignment
– Split-gene detection
• New Development
– GERP Conservation (Sharon)
– GWAS views (Aaron, NSF 2010 collaboration)
– Tandem arrays (Josh, Will)
17 Genomes in Release 32
• Physcomitrella (moss): Basal land plant
• Updated assemblies of grapevine & poplar
• Updated annotations of Indica rice & Arabidopsis
• Updated assemblies & annotations of Oryza chr 3S projects
Genome Plans 2011:
Planning:
• Lycopersicon esculentum (tomato)
• Oryza glaberrima (African domesticated rice)
• Oryza brachyantha (wild rice)
• Aegilops tauschii (wheat D, NSF #0701916)
Investigating:
• Selaginella moellendorffii (basal vascular plant)
• Triticum aestivum (hexaploid wheat)
• Malus x domestica (apple)
• Glycine max (soybean) or Medicago
Collaborations Genomes
– NSF PGI #0638820 PI Wing end 2009 (wild rice OMAP)
– USDA ARS Grape end 2009
– NSF PGI PI Buckler end 2009
– NSF 2010 #0723510 PI Nordborg end 2011 (Arabidopsis thaliana, A. lyrata, Capsella)
– NSF #0701916 PGI PI Dvorak end 2011 (wheat)
– NSF PGI PI Wilson end 2010 (maize)
– NSF PGI PI #0723510 Scanlon end 2012 (maize)
– NSF PGI PI Springer to start this year (maize)
– NSF PGI PI Wing end 2011 (wild rice OGE)
– NSF PGI #1032105 PI McCombie end 2012 (wheat)
– EBI BBSRC Paul Kersey (travel for coordination participants)
– NSF PGI PI McCouch end 2014 (rice)
– NSF XXX iPlant Steve Goff
New Maps and Markers
New maps in last year:
• Sorghum genetic (Mace)
• Barley genetic (Close)
• Ae. tauschii genetic (Dvorak)
• Switchgrass genetic (Tobias)
More genomes in CMap
Added two more fully sequenced genomes to CMap with seq/seq comparisons based on orthology (build 32).
New SNP View
• Synonymous coding
• Non-synonymous coding
• Stop gain/loss
• Splice site
• UTR
• Intronic
Shows functional consequences of polymorphism
New in Ensembl 56
Rice 160,000 SNPs x 21 varieties (incl. Nipponbare ref.) from OryzaSNP, MSU6
Maize 1.6 million SNPs x 27 NAM founder lines from Panzea, AGPv1
Arabidopsis
2010 Project SNP Discovery: 637,522 SNPs x 21 ecotypes (incl. Col-0 ref.), TAIR9
2010 Project 250K SNP chip genotypes v3.04, 214,000 SNPs x 1179 ecotypes, TAIR9
1001 Genomes/WTCHG SNPs from dbSNP, 2.7 million SNPs, 17 ecotypes, TAIR9
Grape 71K SNPs (Myles et al.)
SNP BioMart
Filter on region, phenotype, strains, id, & consequence (e.g. introduced STOP codon), and other attributes
Available for rice japonica, rice indica, Arabidopsis & grape datasets
Configure output fields and format (XLS, CSV, TSV, or HTML)
If HTML, link to Variation, Gene, or Browser Pages
Whole Genome Alignments
Schwartz S et al., Genome Res., 2003;13(1):103-7
Kent WJ et al., Proc Natl Acad Sci U S A., 2003;100(20):11484-9
BLASTZ-CHAIN-NET between 20 pairs of species
Alignment (Release)
Oryza sativa Japonica     O.jap
Oryza sativa Indica       31    O.ind
Sorghum bicolor           31    -     S.bic
Brachypodium distachyon   31    31    -     B.dis
Arabidopsis thaliana      31    31    31    31    A.tha
Arabidopsis lyrata        31    -     -     -     31
Vitis vinifera            31    -     -     -     31
Populus trichocarpa       31    -     -     -     31
Oryza glaberrima 3s       31    -     -     -     -
Oryza minuta CC 3s        31    -     -     -     -
Oryza officinalis 3s      31    -     -     -     -
Oryza punctata 3s         31    -     -     -     -
Physcomitrella patens     32    -     -     -     32
New & improved alignment viewer (Ensembl 56)
Multispecies View
• Stack any number of genomes aligned to a common reference by BLASTZ
• Browse & zoom along any genome independently
Re-introduced in Ensembl 56
Automated Detection of Split Genes
Special class of “paralog” since Ensembl 58
• Contiguous split paralog: Non-overlapping, nearby (<1 Mb), same strand
• Putative split paralog: Non-overlapping, different regions (e.g. scaffolds)
Species                   Split Genes
Populus trichocarpa       1181
Sorghum bicolor           1087
Oryza sativa Japonica     916
Vitis vinifera            520
Oryza sativa Indica       365
Zea mays                  280
Arabidopsis lyrata        202
Arabidopsis thaliana      137
Brachypodium distachyon   101
Genome alignment confirms inconsistent annotation
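The two split-paralog classes can be illustrated with a small classifier. This is only a sketch under stated assumptions, not the Ensembl Compara implementation: the `GeneModel` record, the function name, and the choice to return no class for same-region pairs that are distant or on opposite strands are all hypothetical. It also assumes the pair has already been flagged as candidate fragments of one gene.

```python
from dataclasses import dataclass
from typing import Optional

MAX_GAP_BP = 1_000_000  # "nearby" threshold taken from the slide (<1 Mb)


@dataclass(frozen=True)
class GeneModel:
    """Hypothetical minimal gene record used only for this illustration."""
    region: str  # chromosome or scaffold name
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: int  # +1 or -1


def classify_split_pair(a: GeneModel, b: GeneModel) -> Optional[str]:
    """Classify a candidate pair of gene fragments.

    Returns "contiguous", "putative", or None when the pair matches
    neither definition from the slide.
    """
    if a.region != b.region:
        # Different regions (e.g. separate scaffolds) can never overlap.
        return "putative"
    if a.start <= b.end and b.start <= a.end:
        # Overlapping models are excluded by both definitions.
        return None
    gap = max(a.start, b.start) - min(a.end, b.end)
    if a.strand == b.strand and gap < MAX_GAP_BP:
        return "contiguous"
    # Assumption: same region but distant or opposite strand is left unclassified.
    return None
```

For example, two same-strand fragments at chr1:100-200 and chr1:300-400 are classed as contiguous, while a fragment on a different scaffold is classed as putative.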
Gene-Centered Synteny Build
2010: Implemented with automated pipeline runnables
• Release 31: monocots
• Release 32: dicots
Oryza sativa Japonica O.jap
Brachypodium distachyon YES B.dis
Sorghum bicolor YES YES S.bic
Arabidopsis thaliana - - - A.tha
Arabidopsis lyrata - - - YES A.lyr
Vitis vinifera - - - YES YES V.vin
Populus trichocarpa - - - YES YES YES P.tri
Mapping types: Compara orthologs; collinear mappings (DAGchainer); “in-range” mappings near collinear anchors
Grape Reference Highlights Duplicated Regions in Arabidopsis and Poplar
• Polyploid and segmental duplications manifest as co-syntenic regions
• SyntenyView links to browser: Thus users can easily navigate between duplicated regions
EPO Multiple Alignment & Ancestor Reconstruction
• Gramene implementation in 2010
• Release 32: 8-way EPO alignment
– Rice japonica, indica, Brachypodium, sorghum, Arabidopsis, A. lyrata, grape, poplar
Paten et al (2008) Genome Research 18:1814
Paten et al (2008) Genome Research 18:1829
2010 Genomes Development: Constrained Elements
• Genomic Evolutionary Rate Profiling (GERP): measures purifying selection
• Method testing using 4-way and 8-way EPO alignments as input with varying parameters
• Input tree generated from 1301 ortholog sets
• Planning release in 2011
Cooper et al (2005) Genome Research 15:901
2010 Genomes Development
Tandem Duplicate Detection
• Adjacent paralogs with no more than 2 intervening unrelated genes
• Increase gene dosage
• Diversifying selection
• Often species-specific
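The adjacency rule (paralogs separated by at most 2 unrelated genes) can be sketched as a single pass over genes in chromosome order. This is a minimal illustration, not the pipeline used for the release: the input format (one paralog-family label per gene, `None` for genes without paralogs) and the function name are assumptions.

```python
from collections import defaultdict

MAX_INTERVENING = 2  # from the slide: no more than 2 intervening unrelated genes


def tandem_clusters(families, max_intervening=MAX_INTERVENING):
    """Group same-family genes into tandem clusters.

    families: family labels in chromosome order (None = gene has no paralogs).
    Returns a sorted list of clusters, each a list of gene indices (size >= 2).
    """
    positions = defaultdict(list)
    for idx, family in enumerate(families):
        if family is not None:
            positions[family].append(idx)

    clusters = []
    for idxs in positions.values():
        current = [idxs[0]]
        for idx in idxs[1:]:
            # Number of genes lying between this paralog and the previous one.
            if idx - current[-1] - 1 <= max_intervening:
                current.append(idx)
            else:
                if len(current) > 1:
                    clusters.append(current)
                current = [idx]
        if len(current) > 1:
            clusters.append(current)
    return sorted(clusters)
```

With the gene order `["A", "B", "A", None, None, "A", "B", None, None, None, "A"]`, the three leading "A" genes form one cluster; the last "A" and both "B" genes are too far from their paralogs to join one.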
Species         Clusters   Genes   Largest cluster   Function of largest cluster
Rice japonica   2519       7054    24                phytosulfokine receptor-like (LRR-kinase receptor)
Sorghum         2182       5927    19                Chalcone-stilbene synthase like
Maize           1871       4564    22                DUF1754 (domain of unknown function)
Arabidopsis     1738       4581    28                ECA1 gametogenesis related family
LRR-Kinase cluster in rice
LRR-Kinase species-specific expansions
Collaboration with Ensembl Genomes
• Share conference calls
• Developers meeting (Hinxton, UK, Sept. 2010)
• Co-authored papers/posters
• Two releases
• Ensembl Developer’s Workshop
Website Improvements
• Home facelift: quick entry-points
• Migrated to Apache 2.0 in Release 31
REST Interfaces
New RESTful interface for site gives greater user control over data views and format
New Oryza Pages
• Highlights this genus with images, phylogeny, geographic origin, & traits of interest
• Entry points to browsers, germplasm, markers, & taxonomy ontology
Web Services
• Distributed Annotation Server (DAS) serving Ensembl genes as well as Gramene markers, sequences, and QTL
• Gramene Mart integration with Galaxy
• Public MySQL server
• Diversity data via Tassel and GDPC
• Subversion for code access
Browser Development 2011 Plans
• Communicate/distinguish gene-confidence information
– 28% of MSU6 rice genes are annotated as “TE_related” and 17% are in the poorly-conserved “hypothetical” class
– 20% of Sorghum genes are “low-confidence” (TE, pseudogenes, etc.)
– Color-code or display in separate tracks in browser
– Color-code in gene-tree display
• List/Display detailed gene-level synteny information
– Explicitly list syntenic genes from Gene Page
– Indicate that a gene is syntenic to one or more genes of a different species within the browser (e.g. color-code or synteny track)
• List co-syntenic genes
– 2 genes (in separate blocks) having synteny to a common gene in another species arose from a large-scale duplication event (e.g. polyploidy or segmental duplication).
• Tandem Array track
– Indicate clusters of paralogous genes within browser
• [Challenges of low-depth or highly fragmented genomes, e.g. wheat & Physcomitrella]
2010 Ongoing Development Work
• miRNA pipeline runnable
– Refine and automate steps in miRNA annotation
– Vmatch alignment
– mfold RNA secondary structure prediction
– Filter based on secondary structure
• Gene-Build with RNAseq evidence data
– First pilot experiments performed
Questions for the SAB?
• Nominate genomes
• New data types, e.g. RNAseq data available for current genomes that we may not be aware of
• Any physical aspects of web site needing improvement
Aim 2: Pathways
Pankaj Jaiswal, PI
Palitha Dharmawardhana, Jim Thomason, Vindhya Amarasinghe, Liya Ren,
AS Karthikeyan, Marcela Monaco
Note: Liya left the project this year and has been replaced by Marcela.
Aim#2 Plan (2009-2010 / Year-3)
• Continue curating Rice and Sorghum Pathways
• Release MaizeCyc and BrachyCyc
• Add all available microarray probesets to MarkerDb and allow OMICS viewer to validate
• Develop Reactome database for (Rice)
• Update the gene database schema to structure the allele based annotations on function, phenotype and interactions.
• Maintain and Develop Ontologies
Added BrachyCyc, MaizeCyc
Updated Pathway tools twice to latest versions.
Updated the individual pathway databases twice to be consistent with the Pathway tools version
Rice Pathways curated by addition of hydroxycinnamic acid and serotonin biosynthetic pathways, updates to auxin biosynthesis, tryptophan biosynthesis. Addition of 80 transport reactions and 477 transporters
Suggestions from last SAB
Concerns on supporting three technologies: Cyc, Reactome, WikiPathways.
Suggested moving to Reactome and allowing the Cyc and WikiPathways databases to be populated by automated exports using BioPax.
Reactome Database Build
• Reactome:
– Rice
• Start with RiceCyc import and build on the existing Ensembl and Curated Genedb resources
– Arabidopsis
• After consulting with the Reactome project and the Arabidopsis Reactome group, this will become part of the renewal effort. The work will start with integrating it into the Reactome central database from its current location at JIC (www.arabidopsisreactome.org), followed by active curation.
• Active curation will be primarily done in collaboration with Nick Provart’s group at Univ. of Toronto.
• This is a new international collaboration
– Plan is to integrate the plant-specific Reactome database instances in the Reactome central database, but provide a modified user interface for users.
Rice Reactome
• Initial build of the Rice Reactome started by importing the complete (curated and predicted) RiceCyc data in BioPax level-2 format.
• A test-v2 Rice Reactome is available from this link.
– The Reactome tools, with some tweaking, successfully imported 375 pathways and their child reactions
– Work is now under way to integrate the mappings to
• ChEBI, Ligand and PubChem for compounds/metabolites
• KEGG for EC enzymes
• Uniprot
– Drawing the network diagrams requires manual curation.
• Priority is to draw networks for fully curated Rice Pathways by using the Reactome tools
– Integrate predicted models of regulatory pathways for rice based on the reference pathway projections for cell cycle, transcription, translation, etc.
– Curate test case rice pathways
• Organized a week-long workshop attended by curators from Gramene and BAR-Univ. of Toronto (Nick Provart’s group)
• Mentored by Reactome co-PI Peter D’Eustachio
• A test case of ABA metabolism and signaling was curated, which contained both the molecular and genetic interaction datasets.
ABA metabolism and signaling pathway
Klingler et al J. Exp. Bot. (2010) 61 (12): 3199-3210.
Reactome model: A prototype reaction network, ABA-mediated transcriptional regulation, was laid out using material from Nambara & Marion-Poll (2005 – PMID: 15862093) to supplement the pathways of ABA synthesis and catabolism available as RiceCyc templates, and the regulatory processes discussed by Xiong et al. (2002 – PMID: 11779861) (especially Figure 10) and Klingler et al. (2010 – PMID: 20522527)
Automated Cyc and WikiPathways builds
• Based on the SAB suggestions, progress has been made towards extending the annotation of the Cyc and Wiki versions of the pathway databases in an automated way.
• To take that approach, however, we have to streamline the data workflow and structure the current curated gene database as a central repository/aggregator of the necessary datasets.
• The Curated Gene database schema was restructured to hold whole-genome-based annotations on genes and alleles and their associations to function, phenotype, germplasm, pathways, gene-to-gene interactions, gene products, and gene models, besides providing cross references to sequencing project objects (like gene models from IRGSP-RAP, MSU-OSA, BGI gene models for rice O. sativa) and published literature.
• Use aggregated datasets for automated Cyc build using the standard Pathway tools and provide the BioPax and SBML dumps to the WikiPathways project for their users.
• Gramene’s focus will be pathway curation and annotation in Reactome and functional annotation in the gene database.
Outreach
• Curated rice specific pathways and compounds contributed to PlantCyc and MetaCyc projects on reference pathway databases.
• Organized Workshops
– Community Gene Annotation Workshop at Plant Biology 2010 (July 2010)
• Jointly organized with Plant Ontology (PO) Project.
• Provided meeting support by way of website portal and onsite helping hands
• Tool development (plant configurations of Phenote annotation tool and Ontologies) and funding provided by PO project.
• Attended by about 35 researchers of which 12 were awarded travel support by PO.
– Reactome workshop at CSHL, 25-29 October 2010
• Attended by Gramene and BAR curators
• Mentored by Reactome database (Peter D’Eustachio)
• Hands on curation of a test case pathway.
• Analysis of RiceCyc import and current Reactome Annotation tools.
• Development of curation strategy and annotation guidelines.
Plans for 2010-2011
• Release Rice Reactome
• Release curated gene database in new avatar as aggregator of gene information
• Integrate microarray probeset mappings in OMICS validator for non-rice pathways
• Conduct the gene and pathway annotation outreach workshops.
• Develop test cases for upcoming Renewal and strategies for analyzing large-scale datasets generated by NextGen technologies on transcriptomics and metabolomics.
• Maintain the current Cyc-based Pathway views; upgrade to v14.5 and later of Ptools
Pathway Collaborations
• Metacyc/BioCyc (Peter Karp)
• Reactome (Lincoln Stein, Peter D’Eustachio)
• Arabidopsis Reactome (Nick Provart, Henning Hermjakob)
• PlantCyc (Sue Rhee)
• SolCyc and Solanaceae Genome Network (Lukas Mueller)
• Phenote curation tool (Nomi Harris, Suzi Lewis)
• Ontologies (GO, PO, OBO)
• BrachyBase (Todd Mockler)
• Sorghum Biofuel and Bioenergy Project (John Mullet)
• MaizeSequence.org
• MaizeGDB
• Maize Pathways (Andrew Hanson)
• C3-C4 project (Tim Nelson, Tom Brutnell, Chris Myer, R. Bruskiewich)
• WikiPathways
• Expression data (Todd Mockler, Tim Nelson, Tom Brutnell)
Questions for SAB?
• Nominate Pathways• Types of analysis users are interested in• Potential collaborators (national and
International)
Aim3: Gramene Diversity Module
Susan McCouch & Edward Buckler, PIs
Terry Casstevens, Genevieve DeClerck, Charles Chen, AS Karthikeyan,
Jon Zhang, Qi Sun, Ken Youens-Clark.
Suggestions from last year
• Integration with key tools
– We provide a new SNP query tool, Web-launched Tassel, and downloads to work with Flapjack, in formats like Plink, HapMap, etc.
• How about genotype storage?
– Implemented BLOBs to store SNPs
New Data Sets
• Arabidopsis
– Atwell et al. Genotype, phenotype, association data. ~214,000 SNPs, 199 Germplasm, 107 Phenotypes.
• Rice
– Zhao et al., PLoS, May 2010, "1536 Assay": 1311 SNPs x 395 varieties, mapped to MSU6.0
– Gross B, et al., Mol Ecol., Aug 2010, SNP diversity study from PG
• Maize
– dbSNP IDs and AGPv2 coordinate update for current dataset (1.6 million SNP x 27 NAM lines)
Web Interface – SNP Query
Downloads
Tassel
GWAS Visualization
Tassel Development
• New data structure significantly improving memory efficiency
• Alignment viewer
• User-friendly “wizards”
• Progress monitoring with ability to cancel tasks
• Import/export Hapmap, Flapjack, Plink data formats
• Auto-loading and analysis execution from web site startup
• GLM and MLM:
– GLM interface simplified.
– Compression and faster P3D implemented for MLM resulting in reduced runtime.
– Matrix Algebra library wrapper written to make switching to newer, faster libraries easier.
– EJML Matrix Algebra library interface implemented.
• Tassel 3.0 Pipeline…
– Automates complex loading/analysis pipelines
– Doesn't need Java coding to create
– Has simultaneously executing pipeline segments
– Works from web site launch, command line, and GUI
Comparative GWAS candidate gene workflow (flowchart):
• Prior-candidate genes
– Experimental evidence (from other species, e.g. Arabidopsis)
– Ontology terms
• Compara pipeline
• Selection of candidate genes
– Coordinates of the genes
– Functional implication or annotations
• GWAS associations
– Associated SNP map positions
– p-values
• Hapmap SNP information
– SNP positions
– Linkage disequilibrium estimates (r2)
• Linkage block size calculations
– Linkage block size for the ith prior candidate is given by: Bi = 95% quantile {di1, di2, di3, …, dix}, where di1, di2, …, dix are the map distances from the SNP loci in the gene to other loci on the same chromosome that are in perfect LD (r2 = 1.0)
• Enrichment score calculations
– For the ith prior candidate gene, the enrichment score, Ei, is calculated from the weighted hypergeometric probability of observing gi significant associations in the linkage block Bi, given the number of SNPs xi located in the block and the total number Gt of SNP loci on the chromosome
• Functional implications
– Functional implication of prior candidate genes by statistically significant overrepresentation of association signals
Example: Days-to-silk flowering time associations of maize chromosome 8
- Maize first generation hapmap 1.6 M SNP of all chromosomes- 136,119 SNPs on chromosome 8
- Flowering time trait, Days-to-Silk, of maize GWAS associations on chromosome 8- 144 associations (p-values < 1e-6)
- Curated Arabidopsis flowering time candidate genes- 274 genes in total
- Compara orthology of maize homologs to Arabidopsis flowering time candidates- 74 prior candidate genes
- Linkage disequilibrium estimates (r2) from 136,119 SNPs, filtered with MAF > 0.05
- Genetic distances calculated from each maize candidate gene to 144 GWAS associations
- Genetic distances of every pair of SNP loci in a perfect LD (r2=1.0)
Linkage block size calculations
[Figure: empirical cumulative probability distribution of genetic distances between SNP loci that are in perfect LD; x-axis: genetic distance of SNP loci (0 to 0.8 Mb); y-axis: probability; the 95% quantile is marked]
Linkage block size = 105,387 bp
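The block-size rule (the 95% quantile of distances between SNP loci in perfect LD) can be sketched as follows. This is an illustration only: the function names are hypothetical, and the linear-interpolation quantile convention is an assumption, since the slides do not say which quantile estimator was used.

```python
def quantile(values, q):
    """Quantile by linear interpolation between order statistics (assumed convention)."""
    if not values:
        raise ValueError("no distances supplied")
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def linkage_block_size(perfect_ld_distances, q=0.95):
    """Block size B_i for one candidate gene.

    perfect_ld_distances: distances from SNP loci in the gene to other loci
    on the same chromosome that are in perfect LD (r2 = 1.0).
    """
    return quantile(perfect_ld_distances, q)
```

Called on the pooled perfect-LD distances for a chromosome, a function like this would yield a single block size in the same units as the input distances.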
Enrichment score calculations
Suppose GWAS identifies Mt SNPs significantly associated with flowering time variation among the Nt total SNPs on a given chromosome.
The enrichment score (Sei) determines the probability of getting gi significant GWAS associations, weighted by p-values, within a linkage block.
Enrichment score for ith gene: [equation shown as an image on the slide]
where
Mt: total number of significant GWAS SNPs on a given chromosome
Nt: total number of SNPs on a given chromosome
gi: significant GWAS SNPs in the defined window
xi: number of SNPs in the defined window
Sei: enrichment score of the ith maize flowering time candidate gene
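As a rough illustration of the hypergeometric part of this score, the sketch below computes the probability of seeing at least gi significant SNPs among the xi SNPs in a window, given Mt significant SNPs among Nt on the chromosome. It is an unweighted simplification: the p-value weighting mentioned on the slide is not specified in the transcript and is not reproduced here, and the function name is hypothetical.

```python
from math import comb


def hypergeom_tail(g_i, x_i, M_t, N_t):
    """P(X >= g_i) for X ~ Hypergeometric(population N_t, successes M_t, draws x_i).

    g_i: significant GWAS SNPs observed in the window
    x_i: number of SNPs in the window
    M_t: significant GWAS SNPs on the chromosome
    N_t: total SNPs on the chromosome
    """
    if not (0 <= M_t <= N_t and 0 <= x_i <= N_t):
        raise ValueError("counts must satisfy 0 <= M_t, x_i <= N_t")
    total = comb(N_t, x_i)
    upper = min(x_i, M_t)
    # Sum the probability mass for k = g_i .. upper significant SNPs in the window.
    favourable = sum(
        comb(M_t, k) * comb(N_t - M_t, x_i - k)
        for k in range(max(g_i, 0), upper + 1)
    )
    return favourable / total
```

A small tail probability indicates that the window around a candidate gene holds more significant associations than expected by chance under this simplified model.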
[Figure: log10 of odds (LOD) of maize flowering time prior candidate genes on chromosomes 2, 3, and 8; labeled genes: FT, AGL79, GI, TOC1, and rap2.7 (AP2) maize homologs; threshold at LOD = 2*]
* Probability of the null hypothesis is assessed by randomizing the association results with respect to the SNP positions, without changing the number and strength of association signals.
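The randomization described in the footnote can be sketched as a permutation test: the same number of association signals is reassigned to randomly chosen SNP positions and the count falling in the candidate window is compared with the observed count. This is a simplified sketch under assumptions, not the authors' procedure: it uses counts only (signal strength is preserved trivially because p-values are not used), and the function name, window representation, and add-one correction are illustrative choices.

```python
import random


def permutation_pvalue(snp_positions, significant_positions, window,
                       n_perm=1000, seed=0):
    """Empirical probability of seeing at least the observed number of
    significant SNPs inside `window` when signals are shuffled across SNPs.

    snp_positions: positions of all SNPs on the chromosome
    significant_positions: positions of the significant GWAS SNPs
    window: (start, end) of the linkage block, inclusive
    """
    rng = random.Random(seed)
    start, end = window
    significant = set(significant_positions)
    observed = sum(1 for pos in significant if start <= pos <= end)

    at_least = 0
    for _ in range(n_perm):
        # Reassign the same number of signals to random SNP positions.
        shuffled = rng.sample(snp_positions, len(significant))
        count = sum(1 for pos in shuffled if start <= pos <= end)
        if count >= observed:
            at_least += 1
    # Add-one correction keeps the estimate strictly positive.
    return (at_least + 1) / (n_perm + 1)
```

A window containing no significant SNPs returns 1.0, while a window holding far more signals than random placement produces returns a value near 1 / (n_perm + 1).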
Plans - Rice
• Rice Diversity 44K chip: ~39,000 SNPs, 400 rice lines, phenotype data for 23 traits - Build 33
• Rice SNP Consortium 1M chip data - Build 34
• Curate key large GWAS results
Plans Maize, Arabidopsis
• Maize Diversity/Panzea, 56 million SNPs x 104 maize lines (Build 33)
• Phenotypic data for an additional 10-20 traits (depending on publication acceptance rate)
• Additional data from Arabidopsis 2010 Project
• Curate key large GWAS results
Diversity Collaborations
• Rice:
– McCouch (#0606461, #1026555)
– Wing (#1026200)
– Purugganan (#0701382)
– Olsen (#0638820)
• Arabidopsis: Nordborg (#0723510)
• Maize: Buckler (#0820619)
Plans - Software
• Google Web Toolkit for association data viewer
• SNP Query - additional features
• TASSEL
– Flapjack integration. Work with SCRI to create seamless connectivity between the two applications
– Complete support for heterozygous data
– Greater JUnit testing (regression testing)
– Automated MLM/GLM association analysis
– New graphical displays (e.g. Manhattan plot)
– Improvements to kinship calculations, imputation function
• Functional implications from GWAS associations -- develop web-based interface for statistical method
Plans – Comparative GWAS
• Develop web-based interface for comparative candidate gene enrichment system.
Diversity Questions for the SAB
• What should happen to diversity data in the renewal?
– Large projects such as SeeD (CIMMYT), Wheat/Barley CAP, GRIN-Global will likely go to new standards
• What needs to be done to transition?
Aim 5: Outreach
Everyone
Tutorials
OpenHelix’s Gramene tutorial went live at the end of March 2010. It includes a self-run tutorial as well as PowerPoint slides, handouts, and exercises. As of Sept. 7, in the five months it had been available, the landing page had received 305 views, with 36 viewings of the tutorial.
Five new Gramene-produced tutorials such as this one on pathways.
Meetings and Presentations
– Presentations
• PAG
• Rice Technical Working Group
• Maize conference
• International Symposium on Integrative Bioinformatics
• Evolution
• ISMB
• Genome Informatics
• Agronomy, Crop and Soil Sciences Meeting
– ASPB curation workshop with hands-on exercises
– Other:
• Gramene Retreat (CSHL, June 2010)
• Plant Ensembl developers meeting (Hinxton, Sept. 2010)
• Plant Reactome training workshop (CSHL, Oct. 2010)
• Ken and Jim TA’d bioinformatics course (CSHL, Oct. 2010)
Letters of Support
• Wise/Dickerson, NSF-PGRP TRPGR: NextGen PLEXdb (0543441)
• Ana Caicedo (UMass) The evolutionary genomics of invasive weedy rice (0638820)
• Rod Wing CPGS Oryza Genome Evolution (1026200)
• Dick McCombie CPGS: Gene Discovery in Wheat (1032105)
• Carolyn Lawrence, NSF-PGRP GERP: Functional Structural Diversity Among Maize Haplotypes (0743804)
• Steven Briggs, TRPGR Discovery, revision, and validation of maize genes by proteogenomics (0924023)
• Matt Vaughn, Epigenetic Variation in Maize (0922095)
Publications
• “Gramene database in 2010: updates and extensions” (Youens-Clark, et al.) Nucleic Acids Research, 2010, 1–10 doi:10.1093/nar/gkq1148.
• “Fine Quantitative Trait Loci Mapping of Carbon and Nitrogen Metabolism Enzyme Activities and Seedling Biomass in the Intermated Maize IBM Mapping Population.” (Zhang, Chen, Buckler, et al.) Plant Physiology, in press.
• “Gramene database: a hub for comparative plant genomics.” (P Jaiswal). Methods Mol Biol. 2011;678:247-75. (invited book chapter)
• “Applications and methods utilizing the Simple Semantic Web Architecture and Protocol (SSWAP) for bioinformatics resource discovery and disparate data and service integration.” (Nelson et al.) BioData Min. 2010 Jun 4;3(1):3.
Coming Up:
• “Gramene GeneTrees: A comprehensive database of phylogenetic trees in plants and other model Eukaryotes” (Plant Phys)
• RiceCyc
• Diversity
• Genome sequence analysis
Plant Ensembl Collaboration
• Lead: Will
• EBI Participants: Paul Kersey, Paul Derwent, Dan Staines, Andy Yates
• Gramene Participants: Will Spooner, Doreen Ware, Aaron Chuah, Shiran Pasternak, Sharon Wei
Plant Reactome Curators Meeting
Pankaj Jaiswal and Marcela Monaco organized an intensive five-day meeting (October 25-29) at CSHL with Peter D'Eustachio of New York University to learn how to use the Reactome model and software to curate plant pathways.
Other participants included Vindhya Amarasinghe (OSU), Palitha Dharmawardhana (OSU), and Hardeep Nahal (Univ. of Toronto).
• Development work on visualizing annotations from DNA Subway within Gramene’s Ensembl views
• Contribution of reference genomes for high-throughput sequencing
Web Usage and Stats
Page Requests by Year per Month, 2001 - 2010
Explanation of drop in web usage
Prior to release 29, Gramene was experiencing problems from abusive spidering by web searches on our development site. As a consequence, all indexing was disabled in our “robots.txt” file. Through an error in the release process, this file was copied to the live server, thereby refusing access to search engines. This explains the severe drop in usage by casual users finding Gramene through Internet searches. The problem has been fixed, and usage appears to be climbing again.
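A release-time check along the following lines could catch a development "robots.txt" before it reaches the live server. This is a sketch, not part of Gramene's release process: the function name and the two sample files are illustrative assumptions, and it relies only on Python's standard `urllib.robotparser` module.

```python
from urllib.robotparser import RobotFileParser


def allows_search_engines(robots_txt, url="http://www.gramene.org/",
                          agents=("Googlebot", "*")):
    """Return True if every listed crawler may fetch `url` under this robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return all(parser.can_fetch(agent, url) for agent in agents)


# Illustrative file contents (assumed, not the actual Gramene files).
DEV_ROBOTS = "User-agent: *\nDisallow: /\n"   # blocks all indexing
LIVE_ROBOTS = "User-agent: *\nDisallow:\n"    # empty Disallow permits everything

if __name__ == "__main__":
    for name, text in (("dev", DEV_ROBOTS), ("live", LIVE_ROBOTS)):
        print(name, "allows crawling:", allows_search_engines(text))
```

Running such a check against the file staged for the live server, and failing the release if it returns False, would guard against the copy error described above.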
3-year Perspective
Top Countries - Visits% Nov 2009 – Nov 2010
Duration of Visit
Depth of Visit
Visitor Loyalty
Thanks, from Gramene
End