Connect.barcodeoflife.org. Promote barcoding as a global standard Build participation Working Groups...

Post on 15-Dec-2015

Connect.barcodeoflife.org

• Promote barcoding as a global standard

• Build participation

• Working Groups

• BARCODE standard

• International Conferences

• Increase production of public BARCODE records

Networks, Projects, Organizations

Barcode of Life Community

Principles and Goals
• Free and open access
• Standardization and scalability
• Specimen-centered
• Rapid data release following primary QA/QC
• Ongoing crowd-sourced data curation
• Enable accelerated modern taxonomy
• Navigate across data types (DNA, specimens, species, publications, georeferences)
• Locate, aggregate, display and analyze data, resources

How Barcoding Works
• Building the reference library:
– Well-identified specimen
– Tissue subsample
– DNA extraction, PCR amplification
– DNA sequencing
– Data submission to GenBank
• Using the reference library:
– Unidentified specimen
– Tissue, DNA, sequencing
– Comparison with reference sequences
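The comparison step above can be sketched as a nearest-match lookup. This is only a minimal illustration, assuming the query and reference sequences are already aligned to the same length. The function names, the toy sequences, and the 98% identity threshold are hypothetical and do not come from the slides; real identification tools rely on proper sequence alignment or database search.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions where two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def best_match(query: str, references: dict[str, str], threshold: float = 98.0):
    """Return (species, identity) for the closest reference sequence.

    The species is None when the best hit falls below the threshold,
    which is an assumed cutoff chosen for illustration only.
    """
    species, identity = max(
        ((name, percent_identity(query, seq)) for name, seq in references.items()),
        key=lambda pair: pair[1],
    )
    return (species if identity >= threshold else None, identity)


# Toy reference library: made-up 20 bp fragments, not real barcode sequences.
REFERENCES = {
    "Species A": "ACGTACGTACGTACGTACGT",
    "Species B": "ACGTTCGTACGAACGTACCT",
}

if __name__ == "__main__":
    print(best_match("ACGTACGTACGTACGTACGT", REFERENCES))
```

Returning the identity even when no species is assigned lets a caller see how close an unidentified specimen came to the nearest reference.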

How Barcoding is Done

From specimen to sequence to species

[Figure: workflow from Collecting and Voucher Specimen through DNA extraction, CO1 gene, DNA sequencing, and Trace file to Public Databases of Barcode Records. Gene labels shown: ND1, ND2, ND3, COIII.]

NBII, 25 February 2009

BOLD Workbench for Barcode Data Assembly/Analysis

GenBank, EMBL, and DDBJ: Official Archival Repositories of Barcode Data

http://www.insdc.org/

Current Norm: High throughput
Large labs, hundreds of samples per day

ABI 3100 capillary automated sequencer

Large capacity PCR and sequencing reactions

• US$100-150K purchase
• 2-3 hours processing time
• 150-500 samples per day
• US$3-5 per sample

Technology Development Partnership Goal

The DNA Sequencing Lab of 2013?

Producing Barcode Data: 201?
Barcode data anywhere, instantly

• Data in seconds to minutes
• Pennies per sample
• Link to reference database
• A taxonomic GPS
• Usable by non-specialists

Status of Barcode Data
• BOLD records (public and private):
– 956,000 records, 78,000 named species
• BARCODE records in GenBank:
– 194,000 records
– Insects: 150,000 records
– Fish: 23,500 records
– Birds: 6,000 records
– Mammals: 2,500 records

BARCODE Data Standard
Required Elements for COI

• Species designation
• Voucher ID in standard Darwin Core format
• Minimum 500 bp, <1% ambiguous sites
• Bidirectional overlapping reads, 2 trace files
• Primer name and sequences
• Country/ocean region
• Strongly recommended:
– Collection date and collector
– Identifier
– Latitude/longitude
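The required elements above lend themselves to an automated check. The sketch below is an illustration only: the `BarcodeRecord` class, its field names, and the `check_record` function are hypothetical, the voucher check assumes a colon-separated three-part ID, and the ambiguity rule is assumed to mean fewer than 1% ambiguous sites. It is not the validation that GenBank or BOLD actually performs.

```python
from dataclasses import dataclass

# IUPAC ambiguity codes (everything other than A, C, G, T).
AMBIGUOUS = set("NRYKMSWBDHV")


@dataclass
class BarcodeRecord:
    """Hypothetical container for the required COI elements."""
    species: str
    voucher_id: str        # assumed form: institution:collection:catalog_id
    sequence: str
    trace_files: int
    primers: str
    country_or_ocean: str


def check_record(rec: BarcodeRecord) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if not rec.species.strip():
        problems.append("missing species designation")
    parts = rec.voucher_id.split(":")
    if len(parts) != 3 or not all(p.strip() for p in parts):
        problems.append("voucher ID is not a three-part triplet")
    seq = rec.sequence.upper()
    if len(seq) < 500:
        problems.append("sequence shorter than 500 bp")
    ambiguous = sum(1 for base in seq if base in AMBIGUOUS)
    if seq and ambiguous / len(seq) >= 0.01:
        problems.append("1% or more ambiguous sites")
    if rec.trace_files < 2:
        problems.append("fewer than 2 trace files")
    if not rec.primers.strip():
        problems.append("missing primer information")
    if not rec.country_or_ocean.strip():
        problems.append("missing country/ocean region")
    return problems


if __name__ == "__main__":
    record = BarcodeRecord(
        species="Example species",
        voucher_id="NHM:LEP:123456",
        sequence="ACGT" * 150,  # 600 bp placeholder, not a real barcode
        trace_files=2,
        primers="hypothetical forward/reverse primer pair",
        country_or_ocean="Costa Rica",
    )
    print(check_record(record))
```

Collecting every problem in one pass, rather than stopping at the first failure, gives a submitter a complete list to fix.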

Non-COI regions for other taxa

• Land plants:
– Chloroplast matK and rbcL approved Nov 09
– Non-coding plastid and nuclear regions being explored
• Fungi and protists:
– CBOL Working Groups convened
– Recommendations expected in 2010

BARCODE Records in INSDC

[Figure: a BARCODE record and the elements it links to]
• Barcode Sequence
• Voucher Specimen
• Species Name
• Specimen Metadata
• Literature (link to content or citation)
• Indices: Catalogue of Life, GBIF/ECAT
• Nomenclators: Zoo Record, IPNI, NameBank
• Publication links: New species
• Georeference, Habitat, Character sets, Images, Behavior, Other genes
• Trace files, Primers
• Other Databases: Phylogenetic, Pop’n Genetics, Ecological
• Databases: Provisional sp.

Linkout from GenBank to BOLD

ISBER: 13 May 2009

Linkout from GenBank to Taxonomy

Link from GenBank to Museums

Washington Airport Gate 3

• Dulles, National, or Baltimore-Washington?
• 2 concourses at BWI: concourse A or B?
• 3 concourses at National
• 4 Dulles concourses

The Controlled Vocabulary of Airport Codes

Darwin Core Triplet
Structured Link to Vouchers

Institutional Acronym : Collection Code : Catalog ID

NHM : LEP : 123456

personal : DHJanzen : SRNP12345
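The triplet structure above (institutional acronym, collection code, and catalog ID, separated by colons) is simple to parse mechanically. The following is a minimal sketch; the `VoucherTriplet` type and `parse_triplet` function are hypothetical names introduced for illustration, and the parser assumes that none of the three parts contains a colon.

```python
from typing import NamedTuple


class VoucherTriplet(NamedTuple):
    """Hypothetical holder for the three parts of a voucher link."""
    institution: str
    collection: str
    catalog_id: str


def parse_triplet(text: str) -> VoucherTriplet:
    """Split 'institution:collection:catalog_id' into its three parts.

    Whitespace around each part is ignored, so 'NHM : LEP : 123456'
    and 'NHM:LEP:123456' give the same result.
    """
    parts = [part.strip() for part in text.split(":")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected institution:collection:catalog_id, got {text!r}")
    return VoucherTriplet(*parts)


if __name__ == "__main__":
    # The two example vouchers shown on the slide.
    print(parse_triplet("NHM:LEP:123456"))
    print(parse_triplet("personal:DHJanzen:SRNP12345"))
```

Rejecting malformed input outright, rather than guessing at missing parts, keeps an ambiguous voucher string from being silently linked to the wrong specimen.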

NCBI’s Biorepository List

• Compiled from Index Herbariorum, literature sources, GenBank submissions

• 6,936 records
• 1,177 records with non-unique acronyms
• 517 homonymous acronyms
– 374 shared by two records
– 143 shared by three records

AMNH: Icelandic Institute of Natural History, Akureyri Division; Akureyri; Iceland

AMNH: American Museum of Natural History; New York; USA

UNL: Universidad Autónoma de Nuevo León; Monterrey, Nuevo León; Mexico

UNL: University of Nebraska State Museum; Lincoln, Nebraska; USA

UNL: Centro de Estratigrafia e Paleobiologia da Universidade Nova de Lisboa; Monte de Caparica; Portugal

ZMK: Zoological Museum, Kristiania; Oslo; Norway

ZMK: Zoologisches Museum der Universität Kiel; Kiel; Germany

ZMK: Zoological Museum, Copenhagen; Copenhagen; Denmark
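Homonymous acronyms like those listed above can be found by grouping records on the acronym. The sketch below uses only the eight example rows from the slide; the `RECORDS` list and the `find_homonyms` function are hypothetical names, and the code is not the procedure NCBI used to compile its counts.

```python
from collections import defaultdict

# Example rows from the slide: (acronym, institution name).
RECORDS = [
    ("AMNH", "Icelandic Institute of Natural History, Akureyri Division"),
    ("AMNH", "American Museum of Natural History"),
    ("UNL", "Universidad Autónoma de Nuevo León"),
    ("UNL", "University of Nebraska State Museum"),
    ("UNL", "Centro de Estratigrafia e Paleobiologia da Universidade Nova de Lisboa"),
    ("ZMK", "Zoological Museum, Kristiania"),
    ("ZMK", "Zoologisches Museum der Universität Kiel"),
    ("ZMK", "Zoological Museum, Copenhagen"),
]


def find_homonyms(records):
    """Map each acronym used by more than one record to its institutions."""
    by_acronym = defaultdict(list)
    for acronym, institution in records:
        by_acronym[acronym].append(institution)
    return {a: names for a, names in by_acronym.items() if len(names) > 1}


if __name__ == "__main__":
    for acronym, names in find_homonyms(RECORDS).items():
        print(acronym, len(names), names)
```

The slide's totals are internally consistent with this kind of grouping: 374 + 143 = 517 homonymous acronyms, and 374 × 2 + 143 × 3 = 1,177 records with non-unique acronyms.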

CBOL/GBIF/NCBI Registry of Biorepositories

www.biorepositories.org

Mixture of:
• Single collections
• Repository institutions
• Networks/consortia
• Databases
• NGOs

Does NOT include:
• GenBank
• EMBL
• DDBJ
• BOLD

What Should We Do?
CBOL will invest a year to populate institution and collection data in biorepositories.org
• Hope to build synchronization with:
– Institution database at GenBank
– Index Herbariorum
– Authority files in BOLD
• Hope to install web services
• How can we accelerate the registration process?
• Where should the data reside long-term?
– GenBank?
– GBIF?