Academia Sinica, 16 January 2007
DNA Barcoding: An Emerging Global Standard for Species Identification
Consortium for the Barcode of Life
National Museum of Natural History
Smithsonian Institution
http://www.barcoding.si.edu
202/633-0808; fax 202/633-2938
A DNA barcode is a short gene sequence
taken from standardized portions
of the genome, used to identify species
Characteristics of Barcode Regions
• Flanked by conserved regions
• Easy to amplify
• Low intraspecies variability
• Discontinuous variation between species
• Long enough to work in all groups
• Short enough for single reads
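The two variability criteria in the list above (low variation within species, discontinuous variation between species) can be checked numerically. The following is only an illustrative sketch: the sequences, the species labels, and the function names `p_distance` and `gap_summary` are hypothetical and do not come from the talk, and the uncorrected p-distance is just one of several distance measures that could be used.

```python
from itertools import combinations

def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def gap_summary(records):
    """records: list of (species, sequence) pairs.

    Returns (largest within-species distance, smallest between-species distance).
    """
    within, between = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (within if sp1 == sp2 else between).append(p_distance(s1, s2))
    return max(within, default=0.0), min(between, default=float("inf"))

# Toy, made-up data: two specimens for each of two hypothetical species.
records = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "TCGAACCTAC"),
    ("species_B", "TCGAACCTAC"),
]
max_within, min_between = gap_summary(records)
print(f"max within-species distance:  {max_within:.2f}")
print(f"min between-species distance: {min_between:.2f}")
```

A candidate region behaves well for a group when the largest within-species distance stays clearly below the smallest between-species distance, as it does in this toy data.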
The Mitochondrial Genome
Figure labels: D-Loop; H-strand; L-strand; Cyt b; ND1; ND2; ND3; ND4; ND4L; ND5; ND6; COI; COII; COIII; ATPase subunit 6; ATPase subunit 8; Small ribosomal RNA; Large ribosomal RNA
Using DNA Barcodes
• Establish reference library of barcodes from identified voucher specimens
• If necessary, revise species limits
• Then:
– Identify unknowns by searching against reference sequences
– Look for matches (mismatches) against ‘library on a chip’
– Before long: Analyze relative abundance in multi-species samples
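Identifying an unknown by searching against reference sequences can be sketched as a nearest-match lookup. This is a minimal illustration under stated assumptions, not the method used by any particular barcode database: the reference library, the query sequences, the function name `identify`, and the `max_distance` cutoff of 0.1 are all hypothetical, and sequences are assumed to be already aligned to equal length.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def identify(query, library, max_distance):
    """Return (species, distance) for the closest reference sequence.

    If even the closest reference is farther than max_distance, the species
    is reported as None so the specimen can be flagged for a closer look.
    """
    best_species, best_dist = None, float("inf")
    for species, reference in library:
        d = p_distance(query, reference)
        if d < best_dist:
            best_species, best_dist = species, d
    if best_dist > max_distance:
        return None, best_dist
    return best_species, best_dist

# Hypothetical reference library built from identified voucher specimens.
library = [
    ("species_A", "ACGTACGTACGTACGTACGT"),
    ("species_B", "TTGTACCTACGAACGTTCGT"),
]
close_query = "ACGTACGTACGTACGTACGA"  # one site away from species_A
far_query = "GGGGGGGGGGGGGGGGGGGG"    # unlike anything in the library

match = identify(close_query, library, max_distance=0.1)
no_match = identify(far_query, library, max_distance=0.1)
print(match)
print(no_match)
```

Returning no species when the best hit is too distant keeps identification separate from the "triage" use: a poor match is a prompt for taxonomic follow-up, not an answer.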
Analytical chain
1. Databasing
2. Labeling
3. Imaging
4. Tissue sampling
5. DNA extraction
6. PCR
7. PCR check
8. Sequencing reaction
9. Sequencing cleanup
10. Sequencing
11. Trace editing & submission
BoLD Data System
• Developed/hosted by Univ. Guelph
• Workbench for most barcode projects
• Laboratory Information Management System (LIMS) for assembling data
• Management and Analysis System
• Identification system for matching unknowns to reference records
• Uploading to GenBank
Current Norm: High throughput
ABI 3100 capillary
automated sequencer
Large capacity PCR and
sequencing reactions
Cost of Reagents and Disposables
Step                  Fresh/Frozen   Museum
Tissue Sampling       $0.41          $0.41
DNA Extraction        $0.34          $2.00
PCR Amplification     $0.24          $0.48
PCR Product Check     $0.35          $0.70
Cycle Sequencing      $1.04          $2.08
Sequencing Cleanup    $0.32          $0.64
Sequence              $0.40          $0.80
Total                 $3.10          $7.11
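The totals in the cost table can be re-derived from the per-step figures. The short sketch below does only that arithmetic; the dictionary layout and the names `COSTS` and `totals` are illustrative choices, and `Decimal` is used so that the cent amounts add exactly.

```python
from decimal import Decimal

# Per-step reagent and disposable costs from the table: (fresh/frozen, museum).
COSTS = {
    "Tissue Sampling":    ("0.41", "0.41"),
    "DNA Extraction":     ("0.34", "2.00"),
    "PCR Amplification":  ("0.24", "0.48"),
    "PCR Product Check":  ("0.35", "0.70"),
    "Cycle Sequencing":   ("1.04", "2.08"),
    "Sequencing Cleanup": ("0.32", "0.64"),
    "Sequence":           ("0.40", "0.80"),
}

def totals(costs):
    """Sum each column of the cost table exactly, using Decimal arithmetic."""
    fresh = sum(Decimal(f) for f, _ in costs.values())
    museum = sum(Decimal(m) for _, m in costs.values())
    return fresh, museum

fresh_total, museum_total = totals(COSTS)
print(f"Fresh/Frozen total: ${fresh_total}")
print(f"Museum total:       ${museum_total}")
```

Both sums agree with the totals given in the table, and they show that museum material costs a little more than twice as much as fresh or frozen tissue, with DNA extraction accounting for the largest relative increase.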
Producing Barcode Data: 2008
Faster, more portable: Hundreds of samples per hour
Integrated DNA microchips
Table-top microfluidic systems
Producing Barcode Data: 2010?
Barcode data anywhere, instantly
• Data in seconds to minutes
• Pennies per sample
• Link to reference database
• A taxonomic GPS
• Usable by non-specialists
Uses of DNA Barcodes
Applied tool for identifying regulated species:
• Disease vectors, agricultural pests, invasives
• Environmental indicators, protected species
• Using minimal samples, damaged specimens, gut contents, droppings
Research tool for improving species-level taxonomy:
• Associating all life history stages, genders
• Testing species boundaries, finding new variants
“Triage” tool for flagging potential new species:
• Undescribed and cryptic species
Steatogenini until the early 90’s
Figure labels: Hypopygus lepturus Hoedeman 1962; Steatogenys elegans; Steatogenys duidae
Color patterns in Hypopygus (Nijssen & Isbrücker 1972)
Steatogenini during the 90’s
Figure labels: Hypopygus neblinae Mago-Leccia 1994; Hypopygus lepturus Hoedeman 1962; Steatogenys
Steatogenini during the 90’s / today
Figure labels: Hypopygus neblinae Mago-Leccia 1994; Hypopygus lepturus Hoedeman 1962; Stegostenopos Triques 1997; Steatogenys
Figure labels: Steatogenys sp.; Hypopygus lepturus; Stegostenopos cryptogenes (R. Bernhard, 2004; panel 8a)
RAG 1, MP/ML/Dist
Tree figure labels: clades A, C, D; Steatogenys; H. lepturus; Stegostenopos; Hypopygus neblinae
12S 16S, strict consensus of ML/MP/Dist
Tree figure labels: clades A, C, D, E; H. neblinae; Stegostenopos; Steatogenys; H. lepturus
D-loop, MP/ML/Dist
Tree figure labels: clades D, E; H. lepturus; tip labels 2781, 2845, 2876, 2885, 2792, 2791
COI - BARCODE, MP
Tree figure labels: Eigenmannia sp.; clades A, C, D, E; H. lepturus; H. neblinae; Stegostenopos; tip labels 2781, 2845, 2792, 2791
Wider Impacts of Barcoding: 2008
• Catalyzing interoperability of databases
– Barcode data standards link sequences, specimens, species names and publications
• Improving the information infrastructure
– Digital library initiative in taxonomy
• Renewing the mission of museums
– DNA recovery from formalin-fixed specimens
– Promoting the growth of DNA banks
• Expanding analytical toolbox for taxonomy
What DNA Barcoding is NOT
• Barcoding is not DNA taxonomy; no single gene (or character) is adequate
• Barcoding is not Tree of Life; barcode clusters are not phylogenetic trees
• Barcoding is not just COI; standardizing on one region has benefits and limits
• Molecules in taxonomy are not new, but the large scale and standardization are
• BUT… Barcoding can help to create a 21st century research environment for taxonomy
Consortium for the Barcode of Life (CBOL)
• First barcoding publications in 2002
• Cold Spring Harbor planning workshops in 2003
• Sloan Foundation grant, launch in May 2004
• Secretariat opens at Smithsonian, September 2004
• First international conference February 2005
• Now an international affiliation of:
– 130+ Member Organizations, 40 countries, 6 continents
– Natural history museums, biodiversity organizations
– Users: e.g., government agencies
– Private sector biotech companies, database providers
CBOL Member Organizations
June 2006: 120 Member Organizations, 40 countries
CBOL’s Working Groups
• Database: Designing/constructing the Barcode Section of GenBank
• DNA: Protocols for formalin-fixed and old museum specimens; Producing LIMS for dissemination
• Data Analysis: Beyond phenetic methods; population genetics perspective
• Plants: Identify gene region(s) for barcoding
Infrastructure of Taxonomy: Fragmented, Disconnected
• Collections and databases of specimens
• Compilations of taxonomic names
• Data repositories (characters, gene sequences, images, trees)
• Monographs
• Floristic and faunistic surveys/inventories
• Revisions
• The (undigitized) Taxonomic Literature
Barcode Records in INSDC
• Consensus results of Front Royal meeting
– GBIF, ITIS, GRIN
– NBII, Species2000, IPNI
– ICZN, ZooRecord, OBIS
• Structured link to voucher specimen
• Species name selected from authority
• Online access to metadata
• Trace files and quality scores
• Minimum sequence length
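The record requirements listed above lend themselves to a simple completeness check. The sketch below is hypothetical throughout: the `BarcodeRecord` fields, the `missing_elements` function, the example identifiers and URL, and especially the `MIN_LENGTH` value are assumptions made for illustration, since the slide names the criteria but gives no field names or numeric threshold.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical threshold: the slide lists "minimum sequence length" without a number.
MIN_LENGTH = 500

@dataclass
class BarcodeRecord:
    sequence: str
    species_name: str
    name_authority: str   # source the species name was selected from
    voucher_link: str     # structured link to the voucher specimen
    metadata_url: str     # online access to specimen metadata
    trace_files: List[str] = field(default_factory=list)
    quality_scores: List[int] = field(default_factory=list)

def missing_elements(rec: BarcodeRecord, min_length: int = MIN_LENGTH) -> List[str]:
    """Return the listed requirements that this record does not meet."""
    problems = []
    if not rec.voucher_link:
        problems.append("voucher link")
    if not rec.species_name or not rec.name_authority:
        problems.append("species name from authority")
    if not rec.metadata_url:
        problems.append("metadata access")
    if not rec.trace_files or not rec.quality_scores:
        problems.append("trace files and quality scores")
    if len(rec.sequence) < min_length:
        problems.append("minimum sequence length")
    return problems

# Made-up example records.
complete = BarcodeRecord(
    sequence="ACGT" * 150,
    species_name="Hypopygus lepturus",
    name_authority="example authority list",
    voucher_link="EXAMPLE:Fish:0001",
    metadata_url="https://example.org/specimen/0001",
    trace_files=["0001_F.ab1", "0001_R.ab1"],
    quality_scores=[40] * 600,
)
incomplete = BarcodeRecord(
    sequence="ACGT" * 50,
    species_name="Hypopygus lepturus",
    name_authority="example authority list",
    voucher_link="",
    metadata_url="https://example.org/specimen/0002",
)
print(missing_elements(complete))
print(missing_elements(incomplete))
```

Reporting every unmet requirement, instead of stopping at the first, tells a submitter in one pass what still has to be supplied.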
BARCODE records in GenBank
Diagram, core elements: Barcode Sequence; Voucher Specimen; Species Name; Specimen Metadata; Literature (link to content or citation)
Diagram, linked resources: Indices (Catalog of Life, GBIF/ECAT); Nomenclators (Zoo Record, IPNI, NameBank); Publication links (New species)
Diagram, other labels: Georeference; Habitat; Character sets; Images; Behavior; Other genes; Trace files; Primers; Other Databases (Phylogenetic, Pop’n Genetics, Ecological)
Digitizing Taxonomic Literature
• CBOL’s catalytic efforts:
– Library-Laboratory meeting in London on electronic access to taxonomic literature
– Led to formation of Biodiversity Heritage Library initiative
– Proactive steps with PubMed to add taxonomic journals to online abstracts
– Aggressive negotiation with publishers of barcoding papers
The Barcode Assembly Line: 2006
Freshly collected specimens
Frozen tissue
Young museum specimens
DNA Barcode Data
The Barcode Assembly Line: 2008
Opening the museum treasure-trove
Freshly collected specimens
Frozen tissue
Young museum specimens
DNA Barcode Data
Formalin-fixed specimens
Older museum specimens
CBOL Formalin Workshop
• Literature survey of DNA recovery protocols from formalin-fixed specimens
• Solicited proposal from National Research Council
• May 8-9 workshop in Washington
• Chemists, biochemists, biophysicists, biomedical researchers
• Create a new research agenda
Data analysis protocols in 2008
A Bigger, Better Analytical Toolkit to handle the Barcode Data Explosion
• Collaboration of statisticians, computer scientists, population geneticists
• Sampling issues:
– Sample size versus confidence level
– Sample size in light of geography, gene flow
• Analytical tools and protocols:
– Treatment of missing DNA site data
– Identification versus species delimitation (classification versus clustering)
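The trade-off between sample size and confidence level can be made concrete with a textbook probability model. This is an illustration, not a protocol from the talk: it assumes specimens are independent random draws from one well-mixed population, so a variant at frequency p is missed by all n specimens with probability (1 - p)^n. The function name `min_sample_size` is hypothetical, and the independence assumption is exactly what geography and gene flow (the next sampling issue on the slide) can violate.

```python
import math

def min_sample_size(frequency: float, confidence: float) -> int:
    """Smallest n such that 1 - (1 - frequency)**n >= confidence.

    frequency:  assumed population frequency of the variant of interest
    confidence: desired probability of sampling it at least once
    """
    if not (0 < frequency < 1 and 0 < confidence < 1):
        raise ValueError("frequency and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - frequency))

# Specimens needed to see a variant at 5% frequency with 95% confidence.
n = min_sample_size(0.05, 0.95)
print(n)
```

Because the required n grows quickly as the variant frequency falls, a fixed number of specimens per species gives very different confidence levels for common and rare variants.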
CBOL’s Working Groups
• Database: Designing/constructing the Barcode Section of GenBank
• DNA: Protocols for formalin-fixed and old museum specimens; Producing LIMS for dissemination
• Data Analysis: Beyond phenetic methods; population genetics perspective
• Plants: Identify barcode gene region(s) for land plants
Progress toward Plant Barcode
• Kress 2005 proposal for ITS and trnH-psbA
• Kew Gardens receives Sloan/Moore Foundation support
• Phase 1 screens 100 genes across 50 sibling species pairs
• Phase 2 tests of matK, rpoC1, rpoB, ndhJ, and accD
• Canadian proposal for rbcL
• CBOL protocols for approving barcode regions
Current and Planned CBOL Barcoding Projects
• FishBOL and All Birds Initiatives
• “Demonstrator Systems” by 2008:
– Tephritid fruit flies (agricultural pests)
– Mosquitoes (disease vectors)
• African Scale Insect Barcoding Initiative (planned at Cape Town Regional Meeting)
• Barcoding for Conservation Committee
Launching CBOL Projects
• Assembling Steering Committee
– Users
– Taxonomists, collection curators
– Service providers (BoLD, analytical labs)
• Plan for scope, timetable, logistics
• Pilot tests of primers, PCR amplification
• Assemble pipeline of specimens to lab
ABBI and FISH-BOL
• Global initiatives to create reference library
• Enable users to adopt barcode ID systems
• All-species barcode database will:
– Strengthen specimen/species data
– Improve collections, tissue/DNA resources
– Attract users to barcoding for specimen IDs
• Regional Working Groups
• Small Steering Committee and CBOL
Planned Outreach
• Regional meetings in:
– Cape Town, South Africa, 7-8 April 2006, SANBI
– Nairobi, Kenya, 18-19 October 2006, NMK
– São Paulo, Brazil, February 2007, INPA
– Southern/SE Asia, mid-2007
• Second International Barcode Conference
– Southeast Asia (?), September 2007 (?)
• Support from CBOL, host governments and international development agencies
Milestones for 2008
Timeline chart, 2006 Q1 to 2008 Q2, with these rows and labels:
Database: 100K records; 200K records; 500K records
Database WG: Data Standards; Extended DB Interoperability
Data Analysis WG: Data Analysis Protocols and S/W
DNA WG: Formalin Study; Advanced Lab Protocols
Plant WG: Development of Consensus Plant Barcode Region
Campaigns: Regional Groups Operational; First Data Releases; 10K birds; 30K fish
Other milestones shown: International Conference; Demonstrator System Launched; BoLI Data Portal Launched