A metagenetic approach for revealing community structure of marine planktonic copepods
J. HIRAI,* M. KURIYAMA,* T. ICHIKAWA,* K. HIDAKA* and A. TSUDA†
*National Research Institute of Fisheries Science, Fisheries Research Agency, 2-12-4 Fukuura, Kanazawa, Yokohama, Kanagawa 236-8648, Japan
†Atmosphere and Ocean Research Institute, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8564, Japan
Molecular Ecology Resources (2015) 15, 68–80
Keywords: 28S rDNA: D2, 454 genome sequencer, biodiversity, Copepoda, metagenetics
Introduction
Marine ecosystem
Planktonic copepods: most abundant, wide distribution
Arthropoda, 388-522 Mya; 11,500 morphological species (new, cryptic species)
Important: marine food webs and biogeochemical cycles
Indicator: sensitive to environmental changes (natural, anthropogenic stressors)
=> Study their community structure
=> Understanding and monitoring changes in marine ecosystems
Horizontal distribution: studied locally, not at global scales
Due to time-consuming morphological classification, requiring experts
Cryptic species, difficulty identifying the immature stages, etc.
Solution: DNA-based genetic analysis
1) Sanger sequencing
-> time-consuming & cost-intensive for individual sorting and analysis
2) Metagenetic method
-> useful for surveying species richness in metazoans
-> MOTUs: molecular operational taxonomic units based on sequence similarity
-> independent of morphological classification
-> effective tool for rapidly and comprehensively revealing the community structure
Introduction
DNA barcoding
To facilitate species identification based on similarity to known sequences in DB
Common molecular marker: COI, highly variable 5' end of COI gene
Copepods: only a few studies used COI as a molecular marker
COI shows high evolutionary rate, difficulty in designing primers for a broad group
rRNA: common genetic marker for metagenetic analysis, having both variable and conserved regions
Large subunit rDNA (LSU): more variable, used for species identification
D2 region of LSU (about 350 bp): hypervariable region btw conserved regions
-> suitable for designing universal primers
LSU region: available sequences in DB: larger than COI
Used for metagenetic analysis of Haptophyta
In this study,
Develop a metagenetic method for revealing community structure of copepods using 454 pyrosequencing
=> Rapid & comprehensive analysis of copepod community structure will be possible
Propose: new, efficient technique for assessing copepod community structure (diversity and distribution of planktonic copepods)
Materials and methods
4 plankton community samples (1 artificial (Art.), 3 field-collected (FC))
1. Similarity threshold for MOTU clustering in Art.
2. Apply new method to 3 FC samples (showing high species diversity, significant hydrographic variation observed)
* Compare: metagenetic vs. morphological analysis
* Evaluate the accuracy of the new method
Artificial community samples
33 species (subtropical regions; Table 1): 3 orders, 17 families, 27 genera
-> Sanger sequencing / pyrosequencing
Sanger sequencing
33 species: first antenna
Primers: LSU Cop-D1F (5'-GCGGAGGAAAAGAAAACAAC-3'), LSU Cop-D3R (5'-CGATTAGTCTTTCGCCCCT-3'), 1000 bp
Field-collected samples
O-line transect along 138°E
Subtropical: (31°00.0'N, 137°59.9'E)
Kuroshio: (32°54.7'N, 138°00.6'E)
Slope: (33°52.0'N, 137°44.0'E)
* Vertical sampling
- Depth: 0 to 200 m / daytime
- VMPS (vertical multiple plankton sampler): 0.25 m² mouth-opening area, 100 µm mesh
* Sample preparation
- 99% ethanol, kept at 4 °C
- After 24 hours, replaced with new EtOH
- Aliquots: morphological classification and metagenetic analysis
* Temperature-Salinity- CTD profiler
Fig. S1. Vertical profiles of water temperature and salinity at 138°E (0–250 m depth).
Highest Temperature, Salinity
Pyrosequencing
4 ethanol-preserved samples (1 artificial; 1/4 aliquots of the field-collected samples)
* Remove all large noncopepods using 2 mm mesh
* Filtered onto 100 µm mesh
Primers: 400 bp w D2 region, highly conserved region; successful amplification of LSU-D2 region in > 100 copepod species
LSU Cop-D2F (5'-AGACCGATAGCAAACAAGTAC-3')
LSU Cop-D2R (5'-GTCCGTGTTTCAAGACGG-3')
Chimera: major cause of overestimation of diversity
Solution:
- Low number of cycles
- Long extension time
- Low concentrations of template DNA
Quality filtering
To minimize overestimation of MOTUs
Criteria
1. Contained no ambiguous Ns
2. Comprised 300-420 bp w/o primer sites
3. Contained ≤ 5 homopolymers
4. No MID adaptor mismatch
5. No more than three mismatches per primer
6. Average quality score > 27
7. Contained primer sites
based on LSU sequences of copepods in GenBank
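The seven criteria above can be sketched as a single pass/fail check per read. This is only an illustrative sketch, not the pipeline used in the study: the `Read` record and its field names are hypothetical, and criterion 3 is read here as "no homopolymer run longer than 5 bases", which is one possible interpretation of the slide.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Read:
    """Hypothetical record for one pyrosequencing read."""
    insert: str            # sequence with the primer sites trimmed off
    mean_quality: float    # average quality score of the read
    mid_mismatches: int    # mismatches in the MID adaptor
    # (forward, reverse) primer mismatches; None where a primer was not found
    primer_mismatches: Tuple[Optional[int], Optional[int]]


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of one repeated base."""
    return max((len(m.group(0)) for m in re.finditer(r"(.)\1*", seq)), default=0)


def passes_quality_filter(read: Read) -> bool:
    fwd, rev = read.primer_mismatches
    return (
        fwd is not None and rev is not None          # 7. both primer sites present
        and "N" not in read.insert.upper()           # 1. no ambiguous Ns
        and 300 <= len(read.insert) <= 420           # 2. 300-420 bp without primers
        and longest_homopolymer(read.insert) <= 5    # 3. ASSUMED reading of "<= 5 homopolymers"
        and read.mid_mismatches == 0                 # 4. no MID adaptor mismatch
        and fwd <= 3 and rev <= 3                    # 5. at most three mismatches per primer
        and read.mean_quality > 27                   # 6. average quality score > 27
    )
```

Putting the primer-presence check first lets the later mismatch comparisons assume both values are integers.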
Merge: forward & reverse sequences in mothur
Classify: based on the reference data set using the naïve Bayesian classifier (> 70% threshold)
Align: sequences classified as subclass Copepoda, MAFFT
Filtering: using single-linkage preclustering
Chimeras: removed using UCHIME
Reference data: copepod sequences w LSU-D2 region in GenBank + 100 LSU-D2 seqs. from artificial & field-collected samples
Taxonomic information: Boxshall & Halsey (2004), SILVA
MOTU analysis of the artificial community sample
Clustered at 95-99% similarity threshold
In-del regions: removed to minimize overestimation of MOTUs in the distance calculation, due to homopolymers (most common error of 454 pyrosequencing)
Only MOTUs w ≥3 sequence reads
Number of MOTUs: calculated for each similarity threshold, 95-99%
33 reference sequences vs. MOTUs: NJ tree built using MEGA5
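To make the threshold idea concrete, here is a minimal sketch of similarity-based clustering with the ≥3-read cutoff. It is a simplified greedy stand-in, not the mothur clustering used in the study; the function names are hypothetical, and gap columns are simply skipped to mimic the removal of in-del regions from the distance calculation.

```python
from collections import Counter


def similarity(a: str, b: str) -> float:
    """Fraction of identical positions between two aligned sequences,
    ignoring any column where either sequence has a gap ('-')."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def cluster_motus(aligned_reads, threshold=0.97, min_reads=3):
    """Greedy clustering: the most abundant unique sequences become
    representatives; each other sequence joins the first representative
    it matches at >= threshold. MOTUs with < min_reads reads are dropped.
    Returns a list of (representative_sequence, read_count)."""
    clusters = []  # each entry: [representative, total_reads]
    for seq, n in Counter(aligned_reads).most_common():
        for cluster in clusters:
            if similarity(seq, cluster[0]) >= threshold:
                cluster[1] += n
                break
        else:
            clusters.append([seq, n])
    return [(rep, n) for rep, n in clusters if n >= min_reads]
```

Running the same reads through several thresholds (for example 0.95 to 0.99) and counting the returned MOTUs gives the kind of per-threshold comparison described above.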
No. of sequence reads per MOTU / DW of the identified reference species in Art.
log DW = 2.891 (log PL) - 7.467
Pearson's product-moment correlation coefficient (r): 1 = positive, 0 = no correlation, -1 = negative
Btw proportion of DW and sequence reads, using SPSS
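For readers who want to see what r measures, the following is a minimal sketch of the calculation on proportions. The study used SPSS; this plain-Python version is only illustrative, and the helper names are hypothetical.

```python
import math


def proportions(values):
    """Convert raw values (e.g. DW or read counts per species) to fractions of the total."""
    total = sum(values)
    return [v / total for v in values]


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

A call such as `pearson_r(proportions(dw), proportions(reads))` would compare the two proportions species by species, with `dw` and `reads` listed in the same species order.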
Fig. S3. Relationship between percentage of dry weight and sequence reads at 97% similarity in the artificial community analysis (r = 0.638, p < 0.01).
MOTU analysis of field-collected samples
Quality-filtered sequences: Slope: 10611, Kuroshio: 6221, Subtropical: 12500 reads
Low read numbers, many short fragments
6221 reads were used for MOTU clustering, for comparison of the No. of MOTUs among the 3 sites
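Normalising every site to the smallest read count can be sketched as below. The slides do not say how the 6221 reads were chosen at the other sites, so random subsampling without replacement is an assumption here, and the function name and seed are hypothetical.

```python
import random


def subsample_reads(reads, depth, seed=0):
    """Randomly draw `depth` reads without replacement (ASSUMED method)."""
    if depth > len(reads):
        raise ValueError("depth exceeds the number of available reads")
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.sample(reads, depth)


def equalise_depth(reads_by_site, seed=0):
    """Subsample every site down to the smallest site's read count."""
    depth = min(len(reads) for reads in reads_by_site.values())
    return {site: subsample_reads(reads, depth, seed)
            for site, reads in reads_by_site.items()}
```

With read lists of 10611, 6221 and 12500 entries, `equalise_depth` would return 6221 reads for each site, so MOTU counts are compared at equal sequencing effort.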
Clustered into MOTUs (97% similarity threshold)
Classified into taxonomic order using a naïve Bayesian classifier
Calanoid: classified at family level
< 70% threshold: unclassified MOTU
Taxonomic composition of MOTUs
(1) A nonbiomass-based approach: use only the No. of MOTUs
(2) Biomass-based approach: MOTUs including the No. of sequence reads
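The difference between the two approaches is just what gets counted per taxon: one unit per MOTU, or the MOTU's reads. A minimal sketch, with a hypothetical function name and input layout:

```python
from collections import defaultdict


def composition(motus, use_reads):
    """Percentage composition per taxon.

    motus: list of (taxon, n_reads) pairs, one pair per MOTU.
    use_reads=False -> nonbiomass-based: each MOTU counts once.
    use_reads=True  -> biomass-based: each MOTU is weighted by its reads.
    """
    totals = defaultdict(float)
    for taxon, n_reads in motus:
        totals[taxon] += n_reads if use_reads else 1
    grand_total = sum(totals.values())
    return {taxon: 100 * value / grand_total for taxon, value in totals.items()}
```

For example, three MOTUs given as `[("Calanoida", 80), ("Calanoida", 10), ("Cyclopoida", 10)]` (made-up numbers) are two-thirds Calanoida by MOTU count but 90% Calanoida by reads.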
Detection of biomass-dominant species: representative sequences of the top 6 MOTUs (those with the highest numbers of sequence reads) were BLAST-searched
Morphological analysis of field-collected samples
Morphological classification: Quantitative sample aliquots for calanoid copepods: depends on the size
Biomass (DW) estimation: log DW = 2.891 (log PL) - 7.467
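As a worked example of the regression, the sketch below converts a length PL to DW. The slides give neither the logarithm base nor the units, so base 10 is an assumption made here for illustration, and the function name is hypothetical.

```python
import math


def dry_weight(pl):
    """DW from PL via log DW = 2.891 * log PL - 7.467.

    ASSUMPTION: 'log' is taken as log10; units of PL and DW are
    whatever the original regression uses (not stated in the slides).
    """
    return 10 ** (2.891 * math.log10(pl) - 7.467)
```

Because the slope is close to 3, doubling PL multiplies the estimated DW by about 2^2.891, roughly 7.4 times, which is why a few large-bodied species can dominate biomass.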
Total number of species, biomass of each species
Numbers of species and total biomass per family
Biomass-dominant species
=> Compared with the values obtained from metagenetic analysis
MOTUs from metagenetic analysis vs. LSU-D2 sequences of morphologically identified species
Pearson's product-moment correlation coefficient (r): btw sequence reads of MOTUs and biomass
Results
Artificial community sample analysis
33 reference sequences: D2 region, 405-408 bp
Asymptote: suggesting sufficient sampling coverage
99%, 98%: overestimation
96%, 95%: underestimation
97%: the closest match to the true number of MOTUs
Nonselected MOTUs: probably present as gut contents of predatory copepods
MOTU no.   Reads   Best hit                    Identity   Accession no.
MOTU 1     1415    Pareucalanus attenuatus     99%        AB796416
MOTU 2     1028    Subeucalanus subtenuis      100%       AB796417
MOTU 3     811     Calanus sinicus             100%       AB796406
MOTU 4     591     Euchirella messinensis      99%        AB796401
MOTU 5     523     Eucalanus californicus      100%       AB796414
MOTU 6     262     Undeuchaeta major           100%       AB796403
MOTU 7     187     Centropages sp.             99%        AB796413
MOTU 8     93      Paraeuchaeta media          99%        AB796418
MOTU 9     88      Euchirella curticauda       99%        AB796400
MOTU 10    87      Neocalanus gracilis         99%        AB796410
MOTU 11    61      Temora discaudata           99%        AB796428
MOTU 13    53      Mesocalanus tenuicornis     99%        AB796407
MOTU 14    52      Pleuromamma abdominalis     99%        AB796423
MOTU 15    39      Pontellina plumata          99%        AB796426
MOTU 16    31      Aetideus acutus             99%        AB796399
MOTU 17    24      Gaetanus minor              99%        AB796402
MOTU 18    18      Candacia curta              100%       AB796412
MOTU 19    17      Lucicutia flavicornis       100%       AB796419
MOTU 20    11      Corycaeus sp.               99%        AB796430
MOTU 21    10      Oithona sp.                 100%       AB796429
MOTU 23    10      Mecynocera clausi           100%       AB796420
MOTU 26    7       Haloptilus sp.              99%        AB796405
MOTU 27    7       Paracalanus sp.             99%        AB796425
MOTU 30    4       Scolecithrix danae          99%        AB796427
MOTU 31    4       Cosmocalanus darwinii       99%        AB796408
MOTU 33    3       Calocalanus sp.             99%        AB796411
MOTU 34    3       Pareucalanus sp.            99%        AB796415
MOTU 35    3       Metridia brevicauda         100%       AB796421
Table S1. BLAST results of selected MOTUs in the artificial community analysis.
Fig. S2. Unrooted NJ tree of the artificial community for comparison between reference sequences of 33 species and MOTUs at the 97% similarity threshold. Scale bar indicates p-distance. Reference sequences obtained by Sanger sequencing are indicated by red circles and representative sequences of MOTUs are indicated by blue squares. Values in parentheses represent numbers of sequence reads in each MOTU.
The number of sequence reads & DW of each species
=> Correlation (not strong)
Species with high biomass (DW) tend to have large numbers of sequence reads
Field-collected sample analysis
110 copepod MOTUs, 97% similarity
70 of these were classified as calanoid copepods
73 calanoid copepod species: morphologically identified
MOTU number ≒ species richness
No. of calanoid MOTUs > observed morphological species
Taxonomic composition of MOTUs at the 3 sites (figure annotations):
Calanoida: 59.4, 65.6, 63.3% (59.4-65.6%, dominant at all 3 sites)
Other groups: 11.5-15.6%, 3.1-3.3%, 16.7-20.3%
Calanoida: 49.9, 85.1, 62.9%; Cyclopoida: 49.3, 11.0, 35.2%
Fig. S4. Unrooted neighbor-joining tree of the field-collected samples for comparison between morphological species and MOTUs at 97% similarity. Scale bar indicates p-distance. Sequences of morphologically identified species are indicated by red circles and representative sequences of MOTUs are indicated by blue squares.
64 morphological species -> 47 MOTUs, 97% similarity
9 morphological species -> not detected in MOTUs
23 MOTUs -> did not correspond to morphological species
Proportion of sequence reads -> proportion of DW: positive correlation
Family-level species richness: significantly correlated with morphological analyses of calanoid copepods at all 3 sites (Slope r = 0.691, Kuroshio r = 0.878, Subtropical r = 0.808)
Fig. 6 Comparison btw metagenetic & morphological analysis of calanoid copepods
Family-level % of sequence reads & DW: significant correlation in FC (Slope r = 0.843, Kuroshio r = 0.802, Subtropical r = 0.921)
Large proportion of reads: high biomass
underestimation
In Kuroshio Current: high species richness
The correlation btw No. of sequence reads & DWUsefulness of the number of reads as a proxy for biomass
99% similarity threshold: high species-level resolution for detection of dominant species
Discussion
MOTUs (LSU-D2) in metagenetic analysis:
Reflect species composition
Proxy for species richness
97% similarity: suitable for surveying the community structure of pelagic copepods using LSU regions
LSU: simple to design primers
Slow evolution rate: underestimation with insufficient taxonomic resolution
97%: high species resolution
97% similarity MOTU clustering: avoids artificial inflation of diversity; Haptophyta
99%: ideal for species identification, not proper for evaluating species richness
Inflation in the No. of MOTUs => small numbers of sequence reads => no significant effect on dominant MOTUs
Art.: Metridia venusta & Oncaea sp. not detected in the MOTUs; Nematoda
Why: insufficient quantity of template DNA, PCR bias
Gut contents of carnivorous copepods
Discussion
Discrepancy btw Biomass & No. of sequence reads: primer mismatches, length of amplicons & copy numbers of rRNA
Art.:
- Primer mismatch -> PCR efficiency; to minimize mismatch, the 3' region of each primer is important
- Sequence length (OK)
- Sequence reads: suggested to be a proxy for biomass, but not strong (bias)
Correlation btw biomass & No. of sequence reads in NGS: SSU region study
FC:
High species richness in the warm, western-boundary Kuroshio Current
Copepod diversity: strongly correlated with temperature
Higher diversity: warm oligotrophic oceans
Highest species richness: affected by the Kuroshio Current (HT and S)
Kuroshio Current:
- Transports plankton from lower latitudes
- Increases species diversity in the western North Pacific
Discussion
No. of MOTUs > morphological species
Morphological identification: adult copepods only
Metagenetic analysis: immature stages, possible cryptic species, gut contents
=> Higher estimates of species richness
Morphological species with large biomass: successfully detected as MOTUs in FC
MOTU detected (O), morphological species absent (X): small numbers of sequence reads, rare species
Other possibility: rare MOTUs = artefacts, pseudogenes, remnants of extracellular DNA in the water
Rarefaction curve: did not fully stabilize
No. of sequence reads ∝ No. of MOTUs
MMGH sample
Discussion
Proportion of sequence reads Vs. DW: correlated
Discrepancies btw sequence reads & DW: - methodological biases of metagenetic & morphological analysis
MOTU classification at family level (Fig. 6)
Species richness of taxa that are difficult to identify morphologically (e.g. Paracalanidae & Scolecitrichidae): small size, subtle morphological differences
Hydrographic area: Kuroshio & Subtropical stations: Paracalanidae (genus Calocalanus)
Underestimation:
Acartiidae (primer mismatches, short sequence lengths, phylogeny)
Clausocalanidae (small genetic distance btw Clausocalanidae & Calanidae)
=> 97% similarity threshold: not good!
This metagenetic analysis: optimized for a wide range of copepod taxa
Solution: different methods for data analysis and different molecular markers should be selected (Acartiidae, Calanidae, Clausocalanidae)
Discussion
Sequence reads ∝ biomass composition at the family level: (Fig.6,7)
Morphological analysis: time-consuming sorting, dissection
* Metagenetic analysis: all individuals, immature stages, and rapid detection of biomass-dominant taxa
Dominant taxa: valuable insight into the composition of the copepod community; important to understanding copepod community structure and environmental conditions
C. sinicus, P. parvus: dominant at the Slope station
Known as key species and important prey for planktivorous fish in this region
Detect species richness and biomass of small copepods (Oithona, Paracalanus, Clausocalanus: underestimated)
99% similarity: proper to detect dominant species
Rapid means of obtaining valuable information on copepod community structure
Must be improved: LSU reference DB accumulation
Calanoid copepods (specific ecological characters): classification to the genus level
-> easily adapted to field-collected samples on the global scale.