Biocurator 2012 poster P10
Community annotation with EcoliWiki and GONUTS
Daniel Renfro, Brenley McIntosh, Deborah Siegele and Jim Hu
Texas A&M University (College Station, TX)
Abstract
Community participation in content generation and maintenance for biological databases has long been viewed as a possible solution to the problems of cost and scalability that limit the classical model for biocuration. The success of Wikipedia has inspired a proliferation of biological wikis. EcoliWiki and GONUTS are two wikis that are designed for distinct but overlapping purposes. EcoliWiki is modeled on typical model organism databases, with a central component being gene-centric pages about genes, their products, expression and regulation, and evolution. GONUTS is a Gene Ontology browser and repository for term-specific usage notes for GO. It also supports community annotation for proteins with UniProt accessions. EcoliWiki and GONUTS share common wiki infrastructure for automated creation of pages from templates, handling references, and capturing tabular data to enable structured data mining. Both use the directed acyclic graph structure of MediaWiki categories to capture relationships between pages.
So far, the initial fear that wikis would introduce chaos into annotation has not been a problem. Instead, a common problem faced by wikis and other community annotation systems is that biologists have only weak incentives to participate in content curation. To increase participation and couple annotation to common career goals for academic biologists, we created the Community Assessment of Community Annotation with Ontologies (CACAO). In CACAO, biologists get teaching credit for having teams of students participate in GO annotation. Annotation is done as an intercollegiate competition on the GONUTS website, and annotations, along with student-generated notes, are submitted to GO and UniProt after review by curators. CACAO leverages the expertise of students, faculty supervisors, and biocurators and could be a viable model for other kinds of community efforts.
Adapting wikis for annotation
• Traditional models of community curation create barriers to user participation
• Contributions are invisible while gatekeepers evaluate them
• Partial information is discouraged
• Wikis provide immediate feedback and allow submission of smaller units of information
• But wikis are traditionally too unstructured for efficient extraction of structured data
• TableEdit is a MediaWiki extension developed for EcoliWiki to address this problem
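As a rough illustration of how wiki structure can support data mining, the category relationships mentioned in the abstract can be treated as a directed acyclic graph and walked programmatically. The sketch below is not the EcoliWiki or GONUTS code; the page and category names in `PARENTS` and the `ancestors` helper are hypothetical, chosen only to show a page that belongs to more than one parent category.

```python
from collections import deque

# Hypothetical child -> parent category links (not taken from EcoliWiki).
# A page or category may have several parents, which makes this a DAG
# rather than a tree.
PARENTS = {
    "Gene:lacZ": ["Genes", "lac operon"],
    "lac operon": ["Operons"],
    "Genes": ["EcoliWiki content"],
    "Operons": ["EcoliWiki content"],
}


def ancestors(page, parents):
    """Return every category reachable from `page` by following parent links."""
    seen = set()
    queue = deque(parents.get(page, []))
    while queue:
        category = queue.popleft()
        if category in seen:
            continue  # already reached through another parent
        seen.add(category)
        queue.extend(parents.get(category, []))
    return seen


print(sorted(ancestors("Gene:lacZ", PARENTS)))
```

The `seen` set ensures that a category shared by two branches, such as the common root here, is visited only once.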
GO Annotations
The Gene Ontology (GO) is the de facto ontology for functional annotation. Any registered user can add GO annotations for Escherichia coli gene products to EcoliWiki (http://ecoliwiki.net) and GO annotations for any protein in UniProt to GONUTS (http://gonuts.tamu.edu).
GONUTS & CACAO
Scoreboard - Tracks team annotations, challenges and points in real time.
Team and Individual Contributions - A table on each team’s page tracks annotations from team members. A similar table shows each annotation contributed by the individual biocurator.
Assessment by Experienced Students - Graduate students or undergraduates who have completed at least one semester of CACAO initially assess every annotation as acceptable, unacceptable, requiring changes, or requiring additional review by a professional biocurator. In addition, these students judge challenges and refinements.
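The four assessment outcomes and the team scoreboard described above could be modeled roughly as follows. This is a minimal sketch, not the GONUTS implementation: the `Assessment` enum, the `scoreboard` function, the team names, and in particular the scoring rule (one point per acceptable annotation) are assumptions made for illustration, since the poster does not state how points are awarded.

```python
from collections import Counter
from enum import Enum


class Assessment(Enum):
    """The four outcomes an assessor can assign to an annotation."""
    ACCEPTABLE = "acceptable"
    UNACCEPTABLE = "unacceptable"
    REQUIRES_CHANGES = "requiring changes"
    NEEDS_CURATOR_REVIEW = "requiring additional review"


# Assumed scoring rule, for illustration only:
# one point per acceptable annotation, nothing otherwise.
POINTS = {Assessment.ACCEPTABLE: 1}


def scoreboard(annotations):
    """Sum points per team from (team, assessment) pairs."""
    totals = Counter()
    for team, assessment in annotations:
        totals[team] += POINTS.get(assessment, 0)
    return dict(totals)


# Hypothetical teams and assessments.
example = [
    ("Team A", Assessment.ACCEPTABLE),
    ("Team A", Assessment.REQUIRES_CHANGES),
    ("Team B", Assessment.ACCEPTABLE),
    ("Team B", Assessment.ACCEPTABLE),
    ("Team B", Assessment.UNACCEPTABLE),
]
print(scoreboard(example))
```

Keeping the point values in a separate mapping means the scoring rule can be changed, for example to reward successful challenges, without touching the tallying code.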
GONUTS & Electronic Jamborees
Annotation jamborees were first described for the annotation of the Drosophila genome:
"Because the breadth of expertise necessary to annotate a complete genome does not exist in any single individual or organization, we hosted an 'Annotation Jamboree' involving more than 40 scientists from around the world, primarily from the Drosophila research community. Each was responsible for organizing and interpreting the gene set for a given protein family or biological process. Over a 2-week period, jamboree participants worked to define genes, to classify them according to predicted function, and to begin synthesizing information from a genome-wide perspective." - Adams et al. (2000) The genome sequence of Drosophila melanogaster. Science 287:2185-2195
Bringing multiple investigators to a single site is difficult. GONUTS allows the Reference Genome project of the GO Consortium to organize annotation jamborees via conference calls and over the internet.
Other groups can use GONUTS in similar ways.
Genes of interest for an annotation jamboree are tagged in GONUTS. These tags allow a set of software tools to generate graphs and tables that compare the GO annotations for each gene in the group.
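The kind of comparison table described here can be sketched as a gene-by-term presence matrix. The code below is an illustration under stated assumptions rather than the GONUTS tooling: `comparison_table`, `format_table`, and the gene names `geneA` and `geneB` are hypothetical, and the assignment of GO term IDs to those genes is made up for the example.

```python
def comparison_table(annotations):
    """Build a gene-by-term presence table from {gene: set of GO term IDs}."""
    # Column order: every term used by any gene in the group, sorted by ID.
    terms = sorted(set().union(*annotations.values()))
    rows = {
        gene: [term in gene_terms for term in terms]
        for gene, gene_terms in annotations.items()
    }
    return terms, rows


def format_table(terms, rows):
    """Render the table as tab-separated text, 'x' marking an annotation."""
    lines = ["gene\t" + "\t".join(terms)]
    for gene, flags in rows.items():
        lines.append(gene + "\t" + "\t".join("x" if f else "-" for f in flags))
    return "\n".join(lines)


# Hypothetical tagged genes and their (made-up) GO annotations.
tagged = {
    "geneA": {"GO:0003677", "GO:0005524"},
    "geneB": {"GO:0005524", "GO:0006260"},
}
terms, rows = comparison_table(tagged)
print(format_table(terms, rows))
```

Laying the annotations out this way makes it easy to see which terms are shared across the group and which genes are missing an annotation that their neighbors have.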
Recruited via | Institution
GO consortium | TAMU, UCL, Miss State
Phage Meeting | Mich State, Penn State, Wisconsin-Parkside, N. Dakota State, Central Florida
PortEco Steering Committee | Swarthmore, Wisconsin
ASM General Meeting | Hofstra
ASM CUE | N. Texas
TAMU Seminar speakers | Miami Ohio
Other | Houston Baptist
Growth in CACAO Activity
Participation Recruitment
Challenges and Rebuttals – Submitted challenges are displayed in a table that allows for multiple challenges and rebuttals.