Comparative metagenomics of bathypelagic plankton and...

20
ORIGINAL ARTICLE Comparative metagenomics of bathypelagic plankton and bottom sediment from the Sea of Marmara Achim Quaiser 1,2 , Yvan Zivanovic 3 , David Moreira 1 and Purificacio ´n Lo ´pez-Garcı´a 1 1 Unite´ d’Ecologie, Syste´matique et Evolution, CNRS UMR8079, Universite´ Paris-Sud 11, Orsay, France; 2 Ecosyste`mes, Biodiversite´, Evolution, CNRS UMR 6553, Universite´ de Rennes 1, Rennes, France and 3 Institut de Ge´ne´tique et Microbiologie, UMR C8621 Universite´ Paris-Sud 11, Orsay, France To extend comparative metagenomic analyses of the deep-sea, we produced metagenomic data by direct 454 pyrosequencing from bathypelagic plankton (1000 m depth) and bottom sediment of the Sea of Marmara, the gateway between the Eastern Mediterranean and the Black Seas. Data from small subunit ribosomal RNA (SSU rRNA) gene libraries and direct pyrosequencing of the same samples indicated that Gamma- and Alpha-proteobacteria, followed by Bacteroidetes, dominated the bacterial fraction in Marmara deep-sea plankton, whereas Planctomycetes, Delta- and Gamma- proteobacteria were the most abundant groups in high bacterial-diversity sediment. Group I Crenarchaeota/Thaumarchaeota dominated the archaeal plankton fraction, although group II and III Euryarchaeota were also present. Eukaryotes were highly diverse in SSU rRNA gene libraries, with group I (Duboscquellida) and II (Syndiniales) alveolates and Radiozoa dominating plankton, and Opisthokonta and Alveolates, sediment. However, eukaryotic sequences were scarce in pyrose- quence data. Archaeal amo genes were abundant in plankton, suggesting that Marmara planktonic Thaumarchaeota are ammonia oxidizers. Genes involved in sulfate reduction, carbon monoxide oxidation, anammox and sulfatases were over-represented in sediment. Genome recruitment analyses showed that Alteromonas macleodii ‘surface ecotype’, Pelagibacter ubique and Nitrosopumilus maritimus were highly represented in 1000 m-deep plankton. A comparative analysis of Marmara metagenomes with ALOHA deep-sea and surface plankton, whale carcasses, Peru subsurface sediment and soil metagenomes clustered deep-sea Marmara plankton with deep- ALOHA plankton and whale carcasses, likely because of the suboxic conditions in the deep Marmara water column. The Marmara sediment clustered with the soil metagenome, highlighting the common ecological role of both types of microbial communities in the degradation of organic matter and the completion of biogeochemical cycles. The ISME Journal (2011) 5, 285–304; doi:10.1038/ismej.2010.113; published online 29 July 2010 Subject Category: integrated genomics and post-genomics approaches in microbial ecology Keywords: deep-sea; anaerobic respiration; carbon fixation; carbon cycle; sulfate reduction; ammonia oxidation Introduction Marine waters and their underlying sediments constitute the largest portion of the biosphere. However, despite they are key for biogeochemical cycling in our planet, the composition of marine microbial communities, their metabolic potential and activities and their interactions with the environment remain poorly understood. A consider- able effort has been devoted though in past years to study planktonic communities inhabiting euphotic layers by molecular and genomics-based methods that complement classical cultivation-based approaches. In fact, surface seawaters have been an environment of choice for pioneer studies of micro- bial diversity based on the amplification, cloning and sequencing of small subunit ribosomal RNA (SSU rRNA) genes (Schmidt et al., 1991), as well as for more modern metagenomic (Venter et al., 2004; DeLong et al., 2006; Rusch et al., 2007; Feingersch et al., 2009) and metatranscriptomic (Frias-Lopez et al., 2008; Poretsky et al., 2009; Shi et al., 2009) analyses. Knowledge about deep-sea planktonic communities is much more fragmentary, and the situation is even worse for deep-sea sediments, as with the exception of cold seeps, hydrothermal sediments or whale falls, which have raised some Received 4 March 2010; revised 18 May 2010; accepted 13 June 2010; published online 29 July 2010 Correspondence: P Lo ´pez-Garcı ´a, Unite ´ d’Ecologie, Syste ´matique et Evolution, CNRS UMR8079, Universite ´ Paris-Sud, ba ˆtiment 360, Orsay Cedex 91405, France. E-mail: [email protected] The ISME Journal (2011) 5, 285–304 & 2011 International Society for Microbial Ecology All rights reserved 1751-7362/11 www.nature.com/ismej

Transcript of Comparative metagenomics of bathypelagic plankton and...

Page 1: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

ORIGINAL ARTICLE

Comparative metagenomics of bathypelagicplankton and bottom sediment from theSea of Marmara

Achim Quaiser1,2, Yvan Zivanovic3, David Moreira1 and Purificacion Lopez-Garcıa1

1Unite d’Ecologie, Systematique et Evolution, CNRS UMR8079, Universite Paris-Sud 11, Orsay, France;2Ecosystemes, Biodiversite, Evolution, CNRS UMR 6553, Universite de Rennes 1, Rennes, France and 3Institutde Genetique et Microbiologie, UMR C8621 Universite Paris-Sud 11, Orsay, France

To extend comparative metagenomic analyses of the deep-sea, we produced metagenomic data bydirect 454 pyrosequencing from bathypelagic plankton (1000 m depth) and bottom sediment of theSea of Marmara, the gateway between the Eastern Mediterranean and the Black Seas. Data fromsmall subunit ribosomal RNA (SSU rRNA) gene libraries and direct pyrosequencing of the samesamples indicated that Gamma- and Alpha-proteobacteria, followed by Bacteroidetes, dominated thebacterial fraction in Marmara deep-sea plankton, whereas Planctomycetes, Delta- and Gamma-proteobacteria were the most abundant groups in high bacterial-diversity sediment. Group ICrenarchaeota/Thaumarchaeota dominated the archaeal plankton fraction, although group II and IIIEuryarchaeota were also present. Eukaryotes were highly diverse in SSU rRNA gene libraries, withgroup I (Duboscquellida) and II (Syndiniales) alveolates and Radiozoa dominating plankton, andOpisthokonta and Alveolates, sediment. However, eukaryotic sequences were scarce in pyrose-quence data. Archaeal amo genes were abundant in plankton, suggesting that Marmara planktonicThaumarchaeota are ammonia oxidizers. Genes involved in sulfate reduction, carbon monoxideoxidation, anammox and sulfatases were over-represented in sediment. Genome recruitmentanalyses showed that Alteromonas macleodii ‘surface ecotype’, Pelagibacter ubique andNitrosopumilus maritimus were highly represented in 1000 m-deep plankton. A comparative analysisof Marmara metagenomes with ALOHA deep-sea and surface plankton, whale carcasses, Perusubsurface sediment and soil metagenomes clustered deep-sea Marmara plankton with deep-ALOHA plankton and whale carcasses, likely because of the suboxic conditions in the deep Marmarawater column. The Marmara sediment clustered with the soil metagenome, highlighting the commonecological role of both types of microbial communities in the degradation of organic matter and thecompletion of biogeochemical cycles.The ISME Journal (2011) 5, 285–304; doi:10.1038/ismej.2010.113; published online 29 July 2010Subject Category: integrated genomics and post-genomics approaches in microbial ecologyKeywords: deep-sea; anaerobic respiration; carbon fixation; carbon cycle; sulfate reduction;ammonia oxidation

Introduction

Marine waters and their underlying sedimentsconstitute the largest portion of the biosphere.However, despite they are key for biogeochemicalcycling in our planet, the composition of marinemicrobial communities, their metabolic potentialand activities and their interactions with theenvironment remain poorly understood. A consider-able effort has been devoted though in past years to

study planktonic communities inhabiting euphoticlayers by molecular and genomics-based methodsthat complement classical cultivation-basedapproaches. In fact, surface seawaters have been anenvironment of choice for pioneer studies of micro-bial diversity based on the amplification, cloningand sequencing of small subunit ribosomal RNA(SSU rRNA) genes (Schmidt et al., 1991), as well asfor more modern metagenomic (Venter et al., 2004;DeLong et al., 2006; Rusch et al., 2007; Feingerschet al., 2009) and metatranscriptomic (Frias-Lopezet al., 2008; Poretsky et al., 2009; Shi et al., 2009)analyses. Knowledge about deep-sea planktoniccommunities is much more fragmentary, and thesituation is even worse for deep-sea sediments, aswith the exception of cold seeps, hydrothermalsediments or whale falls, which have raised some

Received 4 March 2010; revised 18 May 2010; accepted 13 June2010; published online 29 July 2010

Correspondence: P Lopez-Garcıa, Unite d’Ecologie, Systematiqueet Evolution, CNRS UMR8079, Universite Paris-Sud, batiment360, Orsay Cedex 91405, France.E-mail: [email protected]

The ISME Journal (2011) 5, 285–304& 2011 International Society for Microbial Ecology All rights reserved 1751-7362/11

www.nature.com/ismej

Page 2: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

attention because of their increased diversity andproductivity (Jorgensen and Boetius, 2007), typicalsea-bottom sediments have been barely explored. Ametagenomic study of subsurface sediments at thePeru margin from 1 to 50 m below the seafloor(bottom depth 150.5 m; Biddle, personal commu-nication) was only recently published (Biddle et al.,2008). So far, metagenomic analyses targeting glob-ally deep-sea planktonic communities have beencarried out in two locations. DeLong et al. (2006)studied picoplankton at different depths from sur-face to 4000 m at the ALOHA station in the NorthPacific subtropical gyre by constructing fosmidlibraries and sequencing the insert extremities ofrandomly chosen clones, which yielded an averageof 8–10 Mbp of sequence per library. A similarapproach was used to study plankton coming from3000 m depth from the Ionian Sea in the Mediterra-nean basin, which also produced B10 Mbp ofsequence (Martin-Cuadrado et al., 2007). Morerecently, B200 Mbp of sequence were obtained froma shotgun library for the 4000 m depth at theALOHA station (Konstantinidis et al., 2009). Com-paring metagenomic sequences from differentdepths in the water column is helpful to determinethe genes and the potential metabolic functionsassociated to stratified marine communities. Con-versely, comparing metagenomic data from similardepth and different geographic locations can informabout the influence of local parameters and/orgeographic distance on the composition and meta-bolic capabilities of microbial communities. Deep-sea (below 1000 m) planktonic environments aregenerally considered to be relatively uniform andstable in terms of microbial diversity, as microbialdispersal is high and conditions are overall rela-tively similar (oligotrophy, high pressure and lowtemperature). However, different water masses areendowed with different physico-chemical character-istics. Such differences are particularly important inclose seas, much more influenced by coastal inputand local features. For example, Mediterraneanwaters are very different from open oceanic waters,and this seems to be reflected at the metagenomelevel (Martin-Cuadrado et al., 2007).

The Sea of Marmara has a maximum depth ofB1300 m and is the gateway between the Mediter-ranean Sea and the Black Sea, to which it isconnected through the straits of the Dardanelles(65 m sill depth) and the Bosphorus (35 m silldepth), respectively. The Mediterranean and theBlack seas have very different average salinities(38% and 18%, respectively), which together withthe shallow sill depths of the two straits, results in astably stratified Marmara water column with adefined halocline at B25 m of depth. The strongstratification limits vertical heat exchange andventilation of subhalocline waters, leading toan almost constant temperature of 14.5 1C belowthe halocline and to very low concentrations ofdissolved oxygen (30–50mmol kg�1) compared with

surface waters, almost ten times as oxygenated(Unluata et al., 1990; Besiktepe et al., 1994). TheSea of Marmara is subjected to a two-layer flowregime. On one hand, it is fed by the surface influxof brackish, nutrient-rich waters from the Black Sea,whereas there is an outflow of salty waters throughthe Bosphorus undercurrent. Biologically availablenutrients imported from the Black Sea are rapidlyconsumed by photosynthetic organisms on surfacewaters, leading to a net export of particulatenutrients to deeper layers. On the other hand,Marmara receives the influx of salty (nearly 39%),highly oxygenated, but nitrate- and phosphate-depleted Mediterranean (Aegean) waters. Thesesalty, denser waters sink toward the bottom of theMarmara basin, contributing to the strong stratifica-tion (Unluata et al., 1990; Besiktepe et al., 1994).

During the past 160 000 years, Marmara wasdisconnected from the Mediterranean several timeswhen the eustatic sea level dropped below its outletand evolved into a lake with freshened waters,leaving ancient shorelines and other pieces ofevidence of its lacustrine past visible today. Thelatest connection with the Mediterranean occurred12 000 years ago (Cagatay et al., 2009). In addition tothis agitated history, the Marmara basin is highlydynamic from a geological point of view, as itsNorthern slope is traversed by the seismically moreactive branch of the North Anatolian Fault, the MainMarmara Fault. Gas seeps and water seeps have beenunevenly observed in the three major basins foundalong this fault (from West to East: Tekirdag, theCentral basin and Cinarcik) (Geli et al., 2008; Zitteret al., 2008). Gas or cold seeps are enriched inmethane and can be seen directly as bubbles comingout from the sediment, or can be detected indirectlyby the presence of dark patches revealing areas ofstrong redox activity in which sulfate reductionoccurs just beneath the surface. These patches areoften colonized by siboglinid polychaetes andbivalves likely associated with symbiotic bacteriaand may be covered by white, little-structured matsof sulfide-oxidizing bacteria. The distribution of gasseeps may provide indications of fault activity (Geliet al., 2008). Water seeps are observed moresporadically. In this case, brackish water trappedin the sediment during ancient lacustrine times(before 14 000 years ago) is expelled throughchimneys of authigenic carbonates, as indicated bypore fluid chemistry (Zitter et al., 2008). In additionto the influence of seeps on localized areas, thenormal sediments of Marmara, particularly in baysand estuaries, are anoxic and heavily pollutedbecause of incoming currents from the Black Sea,the intense ship traffic and the industrial, agricul-tural and municipal wastewaters of a denselypopulated region (mostly Istanbul). A recent studyexplored the prokaryotic diversity of several sedi-ments in coastal regions based on denaturing gelgradient electrophoresis of amplified SSU rRNAgene fragments, which suggested the dominant

Comparative metagenomics: Sea of MarmaraA Quaiser et al

286

The ISME Journal

Page 3: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

occurrence of fermentative bacteria, denitrifyingbacteria and hydrogenotrophic methanogenicarchaea (Cetecioglu et al., 2009).

In this study, we present metagenomic dataobtained by direct 454 pyrosequencing of DNA from1000 m-deep plankton of the 0.2–5 mm fraction size(Ma101) and from bottom sediment (1300 m; Ma29)of the Marmara Sea. To relate metagenome data tothe more conventional SSU rRNA gene-based diver-sity, we analyzed in parallel SSU rRNA genelibraries for archaea, bacteria and eukaryotes fromthe same samples. In addition, we studied theprokaryotic diversity in two other points of thedeepest part of the water column (500 and 1250 mdepth) below and above the 1000 m-deep pointstudied in more detail. A comparative metage-nomics analysis of Marmara metagenomic data withthat of previous studies of deep-sea and surfaceplankton, subsurface marine sediment, whalecarcasses and soil reveals interesting trends.

Materials and methods

SamplingSamples from the Marmara Sea were retrievedaboard the Atalante in 30 May (sediment) and 6June (plankton) 2007 during the sampling cruise‘MARNAUT’ (http://cdf.u-3mrs.fr/~henry/marmara/index.html). During the cruise, oxygen concentra-tions at various water columns ranged from236–238 mmol kg�1 in surface waters to 8 mmol kg�1

at bottom (clearly suboxic, o10 mmol kg�1; Cagatayet al., 2009). Plankton from the water column in theCentral Basin (401 50.30N 2811.40 E) and from threedifferent depths (samples Ma120 at 500 m, Ma101 at1000 m and Ma109 at 1250 m, the latter collected10 m above the bottom) was collected using Niskinbottles mounted on a CTD rosette. Concomitantly,measurements of salinity, temperature and oxygenfor each sample were obtained (SupplementaryFigure S1). Ma101 (1000 m) had an oxygen concen-tration of 29.7 mmol kg�1. Its temperature of 14.47 1Cand salinity of 38.6 PSU revealed a clear influence ofEastern Mediterranean waters. Seawater filtrationwas carried out onboard immediately. Total volumesof 5 l (Ma120 and Ma109) and 100 l (Ma101) werefiltered through 5 mm-diameter pore filters (TMTP,Millipore, Billerica, MA, USA) and the filtratespassed through 0.22 mm-pore-diameter filters (GTTP,Millipore). Filters were fixed in ethanol and storedat �20 1C. The sediment sample was recovered in asterile Falcon tube from the surface of a core from amulti-push-core system at 1300 m depth (401 46.80Nand 2916.10 E) and stored at �80 1C.

DNA extraction, PCR amplification, cloning andsequencingFilters were trimmed into fragments of ca 1 mm2

with a sterile scalpel and DNA was then extractedusing a Mo-Bio PowerSoil DNA extraction kit

(Mo-Bio, Carlsbad, CA, USA) following the manu-facturer’s instructions. DNA from the sedimentsample was extracted with the Mo-Bio UltracleanSoil DNA kit Mega Prep (Mo-Bio) and also purifiedusing the PowerClean DNA Clean-Up kit (Mo Bio)according the manufacturer’s instructions.

Archaeal, bacterial and eukaryotic SSU rRNAgenes were amplified with at least two differentcombinations of primers specific for each domainof life: Ar21F (50-TTCCGGTTGATCCTGCCGGA-30),ANMEF (50-GGCTCAGTAACACGTGGA-30) and theprokaryote-specific primer 1492R (50-GGTTACCTTGTTACGACTT-30) for archaea; B27F (50-AGAGTTTGATCCTGGCTCAG-30), B63F (50-CAGGCCTAACACATGCAAGTC-30) and 1492R for bacteria; and EK-42F(50-CTCAARGAYTAAGCCATGCA-30), EK-82F (50-GAAACTGCGAATGGCTC-30) and EK-1498R (50-CACCTACGGAAACCTTGTTA-30) for eukaryotes. PCR reactionswere performed as follows: initial denaturation at 94 1Cfor 5 min, 32 cycles consisting of a denaturation stepat 94 1C for 30 s, annealing at 55 1C for 30 s andextension at 72 1C for 1 min, and followed by afinal extension at 72 1C for 7 min. SSU rRNA genelibraries were constructed using the TOPO TAcloning kit (Invitrogen, Carlsbad, CA, USA)and chemically competent TOP10’ One Shot cells(Invitrogen). Approximately one hundred cloneinserts per domain of organisms and sample wereamplified using vector primers and those having theexpected size were partially sequenced using thereverse primer (1492R or 1498R, respectively) byCogenics (Meylan, France). The sequences weredeposited in GenBank with accession numbersHM103738—HM103902 (archaeal SSU rDNAs),HM103523—HM103737 (bacterial SSU rDNAs),HM103388—HM103522 (eukaryotic SSU rDNAs)and SRA012098 (454 pyrosequences).

Phylogenetic analysis of SSU rRNA genes andrarefaction analysesA total of 401 partial prokaryotic SSU rRNAsequences (153, 117, 55 and 76 from samplesMa29, Ma101, Ma109 and Ma120, respectively)were aligned using the NAST alignment tool(DeSantis et al., 2006b) and chimera checked usingBellerophon 3 (http://greengenes.lbl.gov; DeSantiset al., 2006a). The partial high quality sequenceswere imported in the Greengenes ARB database(236 469 sequences, 18 November 2008 release) andthe alignment was manually corrected using thesequence editor included in the ARB software(Ludwig et al., 2004). In the case of eukaryotes, atotal of 144 partial sequences (86 for Ma29 and 65for Ma101) were aligned using the SINA web alignertool, imported in the SILVA SSURef ARB database(release 100) and manually corrected using ARB(Pruesse et al., 2007). Subsets of sequences werechosen on the basis of neighbor joining treesconstructed in ARB. Alignments excluding gapsand ambiguously aligned positions were exported

Comparative metagenomics: Sea of MarmaraA Quaiser et al

287

The ISME Journal

Page 4: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

from ARB to reconstruct phylogenetic trees bymaximum likelihood using TREEFINDER applyinga general time reversible model of sequence evolu-tion (Jobb et al., 2004). Maximum likelihood boot-strap values were inferred using 1000 replicates.The phylogenetic trees were visualized with theprogram TREEVIEW (Page, 1996). Distance matriceswere generated in ARB using previously generatedalignments only from good quality sequences andexported in Phylip format. SSU rDNA accumulationcurves were obtained for archaea, bacteria andeukaryotes from these matrices at different sequenceidentity levels (100%, 97%, 95%, 90% and 80%)using DOTUR (Schloss and Handelsman, 2005).

Pyrosequencing of Marmara DNA samples and qualitytrimming of metagenomic sequences for comparativeanalysesDNA from the 0.2–5-mm size 1000-m-deep planktonfraction, Ma101, and from the sediment sampleMa29 was sequenced in 2� 1/2 GS-FLX run (onesample per region; Cogenics, Meylan, France). As aminimum of 5 mg of DNA was needed for pyro-sequencing and because of the relatively lowamount of DNA obtained from Ma101 (3 mg), this

sample was subjected to a whole genome amplifica-tion step (Cogenics) to double the DNA amount. Thepotential bias in DNA amplification, if any, wasexpected to be low. The total number of valid readsobtained was 545 585. With an average of 182 bpread length, the number of bases cumulated at99 Mb. We selected five other metagenomic analysesfrom marine origin (surface and deep-sea waters,ocean subsurface sediments, whale carcasses) andsoil (Table 1) to carry out a comparative analysis. Asthese metagenomic data were generated usingdifferent strategies and sequencing technologiesand to minimize potential biases owing to differ-ences in sequence length, sequence reads weretreated as follows. Ma101 and Ma29 pyrosequencereads were trimmed to the size range of 130–240 bp(average of 192 bp). Surface plankton and Perumargin subseafloor metagenomes consisting ofshorter reads were trimmed to a size range of 90–130 bp. Metagenome sequences of whale carcassesand Waseca soil, which were generated by shotgunsequencing, were trimmed differently to get homo-genous and comparable sequence fragments. First,only sequences in the size range of 400–3500 bpwere retained. Then, 200 nucleotides (soil) and 100nucleotides (whale carcass) located at 50 and 30 of

Table 1 Characteristics of the original metagenomic libraries and of the subsequently selected data used for our comparativemetagenomic study.

Ma29Marmaradeep-seasediment(1300 m)

Ma101Marmaradeep-seaplankton(1000 m)

ALOHAdeep-seaplankton

(500 m, 770 mand 4000 m)

ALOHAeuphotic zone

plankton (70 m)

Peru sedimentmix (1, 16, 32and 50 m bsf;

seafloor150.5 m)

Whalecarcasse

mix(rib, boneand mat)

Waseca soil

Sequencingmethod

Pyrosequencing Pyrosequencing Sanger,fosmid ends

Pyrosequencing Pyrosequencing Sanger,shotgun

Sanger,shotgun

Original averagesize (bp)

181 181 B900 110 100 3400 2400

Size range (bp) 135–240 135–240 — 90–130 90–130 — —Average size aftertrimming (bp)

191.44 191.47 192 110 109 192 192

No of fragments 178 199 183 631 149 022 319 477 266 259 200 722 192 639Total nucleotides (bp) 34 114 417 35 159 828 28 612 224 35 142 470 29 022 231 38 538 624 36 986 688GC content (%) 55.63 44.41 51.75 35.35 45.95 47.44 58.03tRNA (total matches) 187 407 278 286 150 211 187% COG matches(from no of reads)

28.81 37.45 29.46 22.22 12.82 38.83 30.56

% KEGG matches(from no of reads)

15.62 24.29 17.79 25.70 10.03 21.47 15.54

% of matches againstnr (only Ma29 andMa101)

44.9 57.2 ND ND ND ND ND

No of rRNA matches(LSU+SSU)

119 476 154 723 146 921 102

COG marker proteinmatches/Mbp

24.65 50.83 32.52 79.13 46.82 36.72 19.65

Effective genomesize (Mbp)

3.66 1.78 2.77 1.59 2.92 2.45 4.59

% of ‘ghost reads’ 11.8 9.1 0.3 10.0 12.8 0.6 4.6Reference This study This study DeLong

et al. (2006)Frias-Lopezet al. (2008)

Biddleet al. (2008)

Tringeet al. (2005)

Tringeet al. (2005)

Abbreviations: COG, clusters of orthologous groups; KEGG, Kyoto Encyclopedia of Genes and Genomes; ND, not determined; rRNA, ribosomalRNA; tRNA, transfer RNA; bsf, below sea floor.

Comparative metagenomics: Sea of MarmaraA Quaiser et al

288

The ISME Journal

Page 5: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

each read were removed, as they often containedstretches of single nucleotides (low quality se-quence) and vector sequences. In addition, readscontaining stretches of more than 10 A, T, G, C or Nand reads containing strong matches to the NCBI(http://www.ncbi.nlm.nih.org) vector database(manually inspected) were eliminated from ouranalysis. Sequence length was then homogenizedto 192 bp to have a size similar to that of oursequences and, subsequently, duplicate reads wereremoved. These trimmed sequences were randomlynumbered and, from them, sequence fragments wererandomly chosen for subsequent metagenome com-parison analysis. We used a roughly comparabletotal amount of sequence from the different meta-genomes in our analyses, the smaller metagenomesbeing the limiting factor (Table 1). The sequencesfrom whale carcasses retained represent a mix ofequal proportions of rib, bone and mat whale fallsequences. Fosmid-end sequences from three differ-ent depths (500, 770 and 4000 m) of the ALOHAmeso- and bathypelagic plankton metagenome weretrimmed to 192 bp sizes and vector-contaminatedreads were eliminated (three in total). To keep acomparable size of the ALOHA deep-sea planktonwith the other chosen metagenomes, we pooled allthose trimmed sequences in a single ‘ALOHA deep-sea plankton’ metagenome (41 995, 55 312 and52 150 sequences from HF500, HF770 and HF4000,respectively; that is, a total of 149 457 sequences).

Identification and elimination of ‘ghost reads’During preliminary analysis of Marmara pyro-sequences, we detected the presence of 9.1–12.8%of artefactual repeated sequences (‘ghost reads’) inour samples, as well as in other (pyrosequence)metagenomes analyzed in this work (see below), aphenomenon also recently observed by others(Gomez-Alvarez et al., 2009). ‘Ghost reads’ werealso observed in nonpyrosequence metagenomes,but to a lower extent (0.3–4.6%; Table 1). To identifyand eliminate ‘ghost reads’ from our samples, weperformed a distribution analysis by plotting, foreach metagenome, the number of duplicated readsas a function of duplication size at positions 1, 5, 20,50 and 80 nt from the start of sequence. Asevidenced by the plots (Supplementary Figure S2),a plateau for the number of duplicates is reached ineach case when duplication size exceeds B10 nt, atany start position examined, and levels further forduplication sizes as long as 400 nt. As metagenomesequences are supposed to be randomly distributed,it is highly unlikely that reads start with identicalstretches of nucleotides. After additional manualinspection, we determined the following duplica-tion size cut-offs: for the ALOHA deep-sea planktonfosmid-end sequences, the first 20 nt, and for theother metagenomes, the first 15 nt. We then elimi-nated all the sequences that were identical with

respect to these criteria to assure the uniqueness ofthe reads used for subsequent comparative analyses.

Taxonomic profiling based on rRNA and Clusters oforthologous group (COG) marker protein matchesTrimmed pyrosequences were blasted against thecurated rRNA database from Urich et al. (2008)containing SSU and large subunit (LSU) rRNAsequences, as well as additional taxonomic identi-fiers for groups of environmental sequences.BLASTN analysis was performed against SSU andLSU rRNA databases separately. The BLAST outputfile was used as input file for MEGAN (Huson et al.,2007). The taxonomic affiliation of the sequenceswas assigned using MEGAN with a cut-off bit scoreof 490 for the long (Ma101, Ma29, ALOHA deep-seaplankton, whale carcass and soil) and 455 for theshort (plankton surface and Peru subseafloor) meta-genomic sequences. Reads matching with low bitscore values were manually checked. In addition,BLASTX (Altschul et al., 1997) analysis was per-formed against a database consisting of 31 markerprotein families that provide sufficient phylogeneticinformation to perform taxonomic profiling derivingfrom 191 reference species with sequenced genomes(Ciccarelli et al., 2006; von Mering et al., 2007). Thisdatabase was complemented with marker proteinsequences from seven recently sequenced genomesthat are representatives of additional phyla or groups(Cenarchaeum symbiosum, Nitrosopumilus maritimusSCM1, Candidatus Pelagibacter ubique HTCC1062,Pseudoalteromonas haloplanktis TAC125, Sulfuri-monas denitrificans DSM 1251, Rhodopirellula balticaSH 1, Geobacter uraniireducens Rf4, Alteromonasmacleodii ‘Deep ecotype’). Only best matches fromBLAST analysis were retained and the taxonomicaffiliation was determined using MEGAN (originalBLAST cut-off e-05, bit score 440, average bitscore 69.7).

Estimates of effective genome sizes (EGS)We applied the in silico method developed byRaes et al (2007) to estimate the average genomesize of microbial communities using metagenomicsequence data.

COG, KEGG and SEED subsystem clustering and crosscomparison of metagenomesThe seven selected metagenomes were analyzed byBLASTX against the COG (Tatusov et al., 2003) andKEGG (Kyoto Encyclopedia of Genes and Genomes;Kanehisa et al., 2004) databases applying a cut-offbit score of 440 and normalized to the number ofgenomic fragments for each metagenome. Thenumber of hits to the functional categories fromCOG and KEGG with at least 10 and 1 matches,respectively, in all seven metagenomes were normal-ized to the number of matches in each metagenome

Comparative metagenomics: Sea of MarmaraA Quaiser et al

289

The ISME Journal

Page 6: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

and used for cluster analysis (831 COG functionalcategories, 148 KEGG functional groups). Thepercentage of each COG and KEGG functionalcategory shared in the seven metagenomes wasdetermined and shown in cross-comparative heat-maps. The COG functional categories showing thehighest variations among the seven metagenomeswere determined applying the following cut-offcriteria: 410% standard deviation among themetagenomes within the functional categories. Theremaining 74 COG functional categories are shown(see below, Figure 5). The seven metagenomes hadmatches with a total of 207 different KEGG func-tional groups. KEGG groups without matches in oneof the metagenomes and o20 matches in at leastanother metagenome were eliminated. The remain-ing 148 functional KEGG groups were used forcluster analysis. Unique, overrepresented and un-derrepresented groups were identified using s.d.cut-off of 8% and additional manual inspection. Theapplication of different matrices and differentsampling depths did not change the metagenomesample tree topology in KEGG analysis (applied to148 KEGG groups). Hierarchical cluster analysis wasperformed with MeV 4.4 (Saeed et al., 2003)applying Euclidean distance and Kendall’s taumatrices and average linkage clustering.

Screening of functional key enzymesWe constructed reference databases for a number ofkey enzymes that are diagnostic, or strongly sugges-tive, of particular metabolic pathways. Only proteinsequences with confirmed activity or/and strongsimilarity to them were included in the database.The metagenomes were blasted against these referencedatabases and matches showing E-values smaller than1e-05 were retained. The remaining matches wereblasted against nr and the alignments manuallyinspected to decide about their classification.

Genome recruitmentGenome fragment recruitment analysis was per-formed using promer and nucmer packages in-cluded in the MUMmer 3.0 sequence alignmentsoftware tool (Kurtz et al., 2004) Metagenome readswere aligned to the chosen reference genome usingthe promer and/or nucmer alignment tools understandard conditions. The output file was parsed byshow-coords (included in MUMmer 3.0) applyingoptions that knockout overlapping alignments fromanother reading frame (-k), display the sequencelengths (-l), display start and stop of matching regionon nucleotide level (standard), sort the output filesaccording to the reference file (-r) and indicateidentities and similarities of each match (default).Nucleotide regions matching the reference genomewere identified, and duplicates correspondingto overlapping fragment matches eliminated. Theremaining matches correspond to unique nucleotide

regions that were nonredundantly covered by themetagenome reads based on amino acid alignments.To avoid the inclusion of non-coding regions asintergenic regions, as true matches, only protein-encoding regions were considered as referencegenome length, resulting in the exclusion of riboso-mal RNAs and transfer RNAs.

Shared matches and metagenome cross-comparisonEach metagenome was analyzed with promer(MUMmer 3.0) against all other metagenomes usingthe ‘maxmatch’ parameter. The results were parsedwith the show-coords script with �k and �rparameters as described above. Applying thesestrict parameters, the average amino-acid identitybetween the matches was 80.5% for match lengthcoverage of 70%. Only reciprocal matches wereretained, and the shared matches between themetagenomes counted. The seven metagenomeswere blasted (TBLASTN) against each other usingstrict-cut-off (bit score 440) and normalizationcriteria. The numbers of normalized reciprocal bestmatches were counted and used to construct adistance matrix for subsequent clustering.

Results and discussion

Diversity of Archaea, Bacteria and Eucarya in Marmaradeep-sea plankton and sediment based on SSU rDNAgene librariesTo provide a contextual framework of microbialdiversity to our metagenomic analyses, we con-structed SSU rRNA gene libraries of the samesamples plus, in the case of plankton, two addi-tional depths in the aphotic water column, usingvarious combinations of domain-specific primers,followed by SSU rDNA sequencing, BLAST compar-ison and phylogenetic analysis. We then comparedthe diversity retrieved in this way with the informa-tion derived from Marmara metagenomic data, bothby using the identified SSU rRNA gene sequencesand taxonomically relevant marker proteins asclassified using the NCBI taxonomy by MEGAN(Huson et al., 2007; see below).

A total of 165 archaeal and 215 bacterial SSUrRNA sequences of high quality were generated fromgene libraries of the Marmara water column at 500 m(Ma120), 1000 m (Ma101), 1250 m (Ma109) and froma sediment sample (Ma29) in the Marmara Seabottom. The eukaryotic diversity was studied onlyin the two samples selected for pyrosequencing,Ma101 and Ma29, with a total of 135 sequencesgenerated. Short and/or poor-quality sequenceswere discarded from our analyses. As expected,the sediment sample Ma29 harbored a higherdiversity of organisms belonging to the threedomains of life than plankton samples, which wasalso reflected in the corresponding rarefactioncurves (Supplementary Figure S3), though this trendwas more apparent for archaea and bacteria. Figure 1

Comparative metagenomics: Sea of MarmaraA Quaiser et al

290

The ISME Journal

Page 7: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

shows the relative proportions of the different taxaidentified in the different SSU rDNA libraries asdeduced from individual phylogenetic analyses (seeSupplementary Figures S4 to S15).

Planktonic bacteria were more diverse in meso-pelagic waters (500 m), with seven different bacterialphyla detected by phylogenetic analysis (Figure 1and Supplementary Figures S4 to S9), than inbathypelagic waters (1000 and 1250 m). Neverthe-less, libraries from all three samples were largelydominated by Proteobacteria, Gammaproteobacteriabeing largely the most represented group at 1000and 1250 m depth. At 1000 m, 85% of the Gamma-proteobacteria affiliated to the common opportunis-tic marine planktonic genus Alteromonas (Figure 1and Supplementary S4). Alphaproteobacteria repre-sented B20% of the bacterial population in allplankton libraries, and were dominated by membersof the ubiquitous SAR11 group, followed by membersof the Roseobacter clade (Supplementary Figure S6).The latter is also very abundant in marine environ-ments and occupies a variety of ecological nichesshowing activities that go from sulfur oxidation and

the production of secondary metabolites to carbonmonoxide oxidation and aerobic anoxygenic photo-synthesis (Wagner-Dobler and Biebl, 2006; Brinkhoffet al., 2008). Deltaproteobacteria were underrepre-sented at 1000 m, but were the third most repre-sented bacterial group at 500 m and 1250 m(Figure 1). However, whereas in the 500 m libraryall deltaproteobacterial sequences affiliated to thetypical deep-sea planktonic group SAR324 (Wrightet al., 1997; Lopez-Garcıa et al., 2001), sequencesfrom the 1250 m sample, a suboxic sample atonly 10 m above the sea bottom, were related alsoto deep-sea sediment environmental clones af-filiated to the genus Nitrospina or were sequencesof difficult affiliation within the Deltaproteobacteria(Supplementary Figure S7). Acidobacteria, a phy-lum that was found to be abundant in deep-seaMediterranean waters (Quaiser et al., 2008), was alsorelatively well represented in mesopelagic waters,but not in deeper waters. By contrast, Acidobacteriawere one of the three dominant groups in sedi-ment libraries, together with Gamma- and mostlysulfate-reducing Delta-proteobacteria (Figure 1,

0

10

20

30

40

50

60

70

80

90

100

Per

cent

age

of S

SU

rR

NA

seq

uenc

es

Ma120(500 m)

Ma101(1000 m)

Ma101pyroseq.

Ma109(1250 m)

Ma29(sediment)

Ma29pyroseq.

Bacteria

TM6

MBMPE71

Nitrospirae

Lentisphaera

OP3

WS3

BRC1

Marine_group_A

Chloroflexi

Gemmatimonadetes

NKB19

Actinobacteria

Planctomycetes

Bacteroidetes

Acidobacteria

Betaproteobacteria

Deltaproteobacteria

Alphaproteobacteria

Gammaproteobacteria

Bacteria

Archaea

Eukaryotes

0

10

20

30

40

50

60

70

80

90

100

Per

cent

age

of S

SU

rR

NA

seq

uenc

es

0

10

20

30

40

50

60

70

80

90

100

Per

cent

age

of S

SU

rR

NA

seq

uenc

es

Ma120(500 m)

Ma101(1000 m)

Ma101pyroseq.

Ma109(1250 m)

Ma29(sediment)

Ma29pyroseq.

Euryarchaeota, Unclassified

Euryarchaeota, ANME/Methanosarcinales

Euryarchaeota, DHVE1

Euryarchaeota, Benthic Group E

Euryarchaeota, Group III

Euryarchaeota, Group II

Thaumarchaeota (Group I Crenarchaeota)

Ma101 Ma29

Haptophyta

Telonemida

Radiozoa - Acantharea

Radiozoa - Polycistinea

Cercozoa

Stramenopiles

Alveolata - others

Alveolata - Dinoflagellates

Alveolata - MGII

Alveolata - MGI

Metazoa

Fungi

Amoebozoa

Uncertain

Figure 1 Relative proportions of different archaeal, bacterial and eukaryotic taxa based on SSU rDNA sequences detected in genelibraries and metagenomic data in deep-sea plankton and bottom sediment of the Marmara Sea. Pyrosequence indicates SSU rDNApercentages in metagenomic pyrosequence data as classified by MEGAN; no eukaryotic SSU rDNA sequences were identified in them.

Comparative metagenomics: Sea of MarmaraA Quaiser et al

291

The ISME Journal

Page 8: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

Supplementary Figures S7 and S8). SedimentGammaproteobacteria clustered apart from planktonsequences, being mostly related to clone sequencesfrom deep-sea sediments. One sequence was relatedto the symbiont of the anaerobic annelid Olaviusalgarvensis (Supplementary Figure S5; Woyke et al.,2006), relatives of which have been found inassociation with other gutless oligochaete wormsin the Mediterranean (Ruehland et al., 2008). Thissuggests that some of the sediment-associatedbacteria form stable endosymbiotic associationswith a variety of anaerobic eukaryotic hosts. Inaddition to Proteobacteria and Acidobacteria,sequences ascribing to another 14 phyla and candidatedivisions were identified in the sediment sampleMa29 (Figure 1 and Supplementary Figure S9),highlighting the extremely rich and largely unex-plored bacterial diversity of deep marine sediments,similar to previous observations in other Mediterra-nean sediments (Polymenakou et al., 2005).

Archaea showed lower diversity than bacteria asindicated by rarefaction curves showing that ourlibraries had practically reached saturation at 97%sequence identity, especially in the bathypelagicsamples (Supplementary Figure S3). Althoughgroup I Crenarchaeota/Thaumarchaeota (Brochier-Armanet et al., 2008) frequently dominate in deepocean waters (Karner et al., 2001; DeLong et al.,2006), our planktonic samples seemed dominated byEuryarchaeota based on SSU rDNA sequences(between 57% and 94% of archaeal sequences) ashas been observed in other deep-sea localities(Martin-Cuadrado et al., 2008; Figure 1 and Supple-mentary Figure S10 and S11). This would be inagreement with recent observations that group Iarchaea in the East Mediterranean meso- and bath-ypelagic waters may have more variable relativeproportions than those suggested by previous stu-dies (Varela et al., 2008). However, these proportionspartially reflect the use of one Euryarchaeota-biasedprimer sets used to amplify archaea (B50% ofsequences), and hence should be taken with cau-tion. Thaumarchaeota reached important levels inbathypelagic samples (Figure 1), but although pre-sent in the sediment Ma29, they represented o10%of the total archaeal sequences. All the Ma29thaumarchaeotal sequences associated to the sedi-ment sample clustered with Cenarchaeum/Nitroso-pumilus or with a group composed of environmentalsequences retrieved from deep-sea environments(Lost City, Mariana Trough) and from the substratumof the fish tank where N. maritimus was isolatedfrom (Konneke et al., 2005). By contrast, planktonicsequences clustered with other cosmopolitan opera-tional taxonomic units (OTUs) identified in Pacific,Antarctic and Mediterranean meso- and bathypela-gic waters (DeLong et al., 2006; Martin-Cuadradoet al., 2008; Supplementary Figure S10). Within theEuryarchaeota, groups II and III were roughlyequally represented in the three water columndepths studied (Figure 1). Group II Euryarchaeota

were only detected in plankton, forming two majorlineages (Supplementary Figure S11) that were notrelated to proteorhodopsin-containing group II Eur-yarchaeota (Frigaard et al., 2006). The generallyunderrepresented group III Euryarchaeota was rela-tively abundant in the deepest part of the watercolumn in Marmara. Euryarchaeota, especially ofthe benthic group E (Vetriani et al., 1999), domi-nated the SSU rRNA gene library from the sedimentsample. About 25% of the Ma29 archaea werescattered in ill-defined euryarchaeotal groups com-posed of environmental sequences coming, mostly,from deep-sea sediments or hydrothermal vents.Finally, around 7% of Ma29 archaea belonged to theMethanosarcinales, with some sequences very clo-sely related to members of the ANME2 group(Figure 1 and Supplementary S12). The presenceof ANME sequences is not surprising, as theMarmara Sea harbors several scattered cold seepareas (Geli et al., 2008; Zitter et al., 2008) where theanaerobic oxidation of methane likely occurs.However, the relatively low proportion of archaeais consistent with the ‘normal’ bottom-sedimentnature of the Ma29 sediment.

The diversity of microbial eukaryotes was high inboth Marmara deep-sea plankton and sediment, asshown by rarefaction curves far from saturation(Figure 1 and Supplementary S3). As expected inwaters from this depth, our planktonic SSU rDNAlibraries were dominated by alveolate lineages,which accounted for 49.3% of the total clones andcomprised sequences belonging to the parasiticmarine alveolate groups I (Duboscquellida) and II(Syndiniales), as well as to core dinoflagellates(Supplementary Figure S13). Alveolates were alsovery abundant in sediment libraries (29.5%), withtypical dinoflagellate sequences more abundantthan sequences belonging to the marine alveolategroups I and II. One ciliate sequence was alsoidentified in Ma29 (Supplementary Figure S13). Thesecond more abundant group in Ma101 planktonwas that of the Radiozoa (Rhizaria) with bothrepresentatives of the Acantharea and Polycystinea.Although Acantharea were slightly more abundant,all the sequences belonged to a single phylotype,closely related to an environmental sequenceretrieved from Arctic waters (Lovejoy et al., 2006),whereas the Polycystinea showed a higher diversity(Supplementary Figure S14). A few sequences ofhaptophytes were detected in Ma101, which likelycorrespond to sinking cells, though it might bepossible that some haptophytes are still active at thisdepth because of their mixotrophic capabilities. Afew sequences of fungi and metazoa (ctenophoresand copepods) were identified in the planktonicsample as well (Supplementary Figure S15). In thesediment libraries, together with the alveolates,opisthokonts (fungi and metazoa) were the mostabundant, accounting for up to 30% of the totalsequences. We also detected members of the Amoe-bozoa, Telonemia, Cercozoa and Stramenopiles.

Comparative metagenomics: Sea of MarmaraA Quaiser et al

292

The ISME Journal

Page 9: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

Several of the Stramenopile (heterokont) sequencescorresponded to diatoms, suggesting that diatomssink while conserving their DNA intact more success-fully than other photosynthetic eukaryotes, thoughwe cannot exclude that they corresponded to hetero-trophic diatom species, which have recently beenfound in coastal sediments (Blackburn et al., 2009).

Microbial diversity in Marmara deep-sea plankton andsediment based on SSU rDNA and protein marker genepyrosequencesWe compared the community composition deducedfrom SSU rDNA libraries to that deduced from thedetection of SSU rRNA gene fragments in pyrose-quence reads followed by their taxonomic classifi-cation by MEGAN (Huson et al., 2007). However,this could only be performed for prokaryotic com-munities, as we detected a too small number ofeukaryotic SSU rRNA gene fragments (only 5) inMa101 and Ma29 pyrosequences, and none of themcould be classified in a given taxonomic group withcertainty. The low number of eukaryotic rRNA genematches was likely because of both, the largergenome sizes of eukaryotes (decreasing the prob-ability of sequencing a particular gene) and the factthat they are quantitatively less abundant thanprokaryotic cells in our samples. The relativeproportions of the major bacterial groups as classi-fied by MEGAN in Ma101 plankton pyrosequenceswere comparable to those identified in SSU rDNAlibraries (Figure 1) with a large dominance ofGammaproteobacteria (with 73.2% clones, 61.6%rRNA matches) followed by Alphaproteobacteria(21.1% clones, 21.9% rRNA matches) and Bacter-oidetes (2.8% clones, 6.2% rRNA matches). Theclassification of bacterial SSU rDNA clones andpyrosequences was also rather congruent in thesediment sample Ma29, with Gammaproteobacteria,Alphaproteobacteria, Deltaproteobacteria and Bac-teroidetes showing similar proportions. On the otherhand, Planctomycetes seemed to be underrepre-sented in the clone library (3.3%) compared withthe rRNA matches (25%), whereas an opposite trendwas observed for the Acidobacteria (10% in pyrose-quence matches, whereas they represented 25.6% inthe clone library). These variations were probablydue to differences in primer specificities as it isknown for Planctomycetes (Vergin et al., 1998), aswell as to the statistical error associated to the totallow numbers of assigned SSU rRNA matches in thesediment sample. A relative high abundance ofPlanctomycetes was already reported in Mediterra-nean deep-sea sediments, as they represented up to14% (7.9–14%) in SSU rRNA clone librariesconstructed from four different sediment samples(Polymenakou et al., 2005). In all, B20% of thebacterial sequences in our gene libraries affiliated tocandidate divisions and little abundant phyla. Asimilar percentage appeared as unclassified bacteriausing MEGAN (Figure 1). By contrast, the affiliation

of archaeal pyrosequences by MEGAN did notcorrelate well with that of SSU rDNAs in libraries,although this may be partly due to primer-inducedbiases toward the Euryarchaeota. In the planktonsample, Thaumarchaeota pyrosequences dominated(89%), whereas they accounted for 43% in genelibraries. In the sediment sample Ma29 only sixarchaeal SSU rRNA matches were identified in thesediment metagenome, which made the comparisonwith SSU rDNA data unreliable.

In addition, we carried out BLASTX analyses ofMa101 and Ma29 pyrosequences against the nrdatabase in GenBank. The output file was then usedas input in MEGAN to classify the sequencesaccording to its taxonomic identifiers. From thetotality of raw sequences, 44.9% (Ma29) and 57.2%(Ma101) had matches in the nr database (Table 1).Archaeal matches accounted for 10.3% (95% ofwhich belonged to the Thaumarchaeota) and 2.32%(47% Thaumarchaeota) in Ma101 and Ma29, respec-tively. Although these values compare well with therespective percentages of archaeal SSU rDNAmatches in pyrosequences, they are potentiallyhighly biased because of the lack of completegenome sequences of group II and III Euryarchaeotain databases.

Microbial diversity in Marmara deep-sea plankton andsediment compared with other metagenomic data setsWe compared the Marmara sequences against acollection of selected published metagenomes thatwere adequately trimmed and/or merged to mini-mize potential biases because of differences insequence length and read depth (Table 1, seeMaterial and methods). We selected all the meta-genomes available at the time of analysis corre-sponding to deep-sea plankton that were ofsufficient size to be compared with our data sets inaddition to sensible outgroup metagenomes. Theseincluded the data sets of deep-sea plankton fosmid-end sequences in the Pacific ALOHA water column(500–4000 m depth; DeLong et al., 2006) and themetagenomic sequences of surface plankton in thesame water column (Frias-Lopez et al., 2008). Formarine sediments, the only metagenomic dataavailable corresponded to Peru margin subsurfacesamples (Biddle et al., 2008) that, although not verydeep (150 m depth), could still be a good represen-tative of this type of habitat. We included theWaseca soil metagenome (Tringe et al., 2005), assoils correspond to the ecological compartmentwhere the ultimate degradation of organic mattertakes place on continents, as is the case of sedimentsin oceans. Finally, we also included whale carcassmetagenomic data (Tringe et al., 2005), because ofboth its marine deep origin and its transitionallocation between bottom sediment and deep plank-ton. The microbial diversity captured in the DNAsequences from the seven different metagenomeswas analyzed using rRNA and conserved marker

Comparative metagenomics: Sea of MarmaraA Quaiser et al

293

The ISME Journal

Page 10: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

protein sequence matches. Metagenome sequenceswere blasted (BLASTN) against a curated SSU and(LSU) database containing detailed taxonomicidentifiers including rRNA genes from environmen-tal samples (Urich et al., 2008). They were alsoblasted (BLASTX) against 31 families of markerproteins from 198 reference species covering 36different taxonomic groups of all three domains (18for Bacteria, 10 for Archaea and 8 for Eucarya) thatgenerally provide sufficient phylogenetic informa-tion as to perform taxonomic profiling (Ciccarelliet al., 2006). Although this approach suffers fromsome bias, given that many of the phylotypesidentified by SSU rRNA sequences belong tolineages for which genome sequences are not yetavailable, the overall taxonomic profile based onthese 31 protein markers corroborated that obtainedfrom SSU rDNA sequences, although with someminor quantitative differences (data not shown).

The taxonomic classification of the rRNA andmarker protein matches in the different metagen-omes was analyzed using MEGAN. As the metagen-ome sequences are too short for detailed affiliationto taxonomic groups, we limited the assignment tothe phylum level. We first identified rRNA matchesto known SSU rRNA sequences. The affiliation toSSU rRNA sequences excluding the LSU rRNAmatches represents an adequate way to capture the

largest diversity, as SSU rRNA sequences areavailable from sequenced genomes, but also fromenvironmental species only known by their SSUrRNA genes. The highest number of SSU rRNA hitswas found in the plankton surface (272) followed bythe whale carcass metagenomes (254) and Ma101(188) and ALOHA deep-sea fosmid ends (50;Table 1). The lowest number of rRNA hits occurredin sediment Ma29 (45) followed by the soil meta-genome (49) and Peru subseafloor (100), likelyrevealing a larger genome average size (see below).

At the level of domains, bacteria were the mostabundant group in all but the Peru metagenome(Figure 2). In the Peru subseafloor sediment, 54% ofthe rRNA matches affiliated to Archaea (37.5% oftotal affiliated to Marine benthic group C, Crenarch-aeota), although archaeal rRNA matches were onlydetected at 16, 32 and 50 m, but not at 1 m below theseafloor, as already reported (Biddle et al., 2008). Arelatively high proportion of archaeal rRNAs wasfound in both planktonic deep-sea metagenomes,Ma101 and ALOHA deep-sea sequences accountingfor 13.6% and 16.0% (Thaumarchaeota: 11.1% and12.0%), respectively. In the sediment sample as wellas in soil, archaeal rRNA matches accounted for13.3% and 8.2%, respectively. Surprisingly, noarchaeal rRNA sequence was detected in the whalecarcass metagenome.

other Eukarya

Alveolata

other Archaea

Thaumarchaeota

Cren_MarBenthGpC

other Euryarchaeota

Eury_SAGMA1

Eury_group_II

Eury_Methanomicrobia

other Bacteria

Fibrobacteres/Acidobacteria

Actinobacteria

Cyanobacteria

Chloroflexi

Firmicutes

Planctomycetes

Bacteroidetes/Chlorobi

other Proteobacteria

Betaproteobacteria

Delta-/Epsilonproteobacteria

Alphaproteobacteria

Gammaproteobacteria

Not assigned SSU rRNA genes

0

10

20

30

40

50

60

70

80

90

100R

elat

ive

perc

enta

ge o

f SS

U r

RN

A s

eque

nces

Eukaryotes

Archaea

Bacteria

Ma29sediment

(45)

Ma101pl.1000 m

(188)

ALOHAdeep-seaplankton

(50)

ALOHAsurfaceplankton

(252)

Perumargin

subsurfacesediment

(100)

Whalecarcass

(254)

Wasecasoil(49)

Marmara Sea

Figure 2 Relative proportions of different archaeal, bacterial and eukaryotic taxa identified in different metagenomic data sets based ontheir content of SSU rDNA sequences. The taxonomic assignment was carried out using MEGAN. The total number of SSU rDNAmatches in each metagenome is indicated in parentheses.

Comparative metagenomics: Sea of MarmaraA Quaiser et al

294

The ISME Journal

Page 11: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

Within bacteria, Gammaproteobacteria weredominant (47.9%) in the Ma101 plankton sampleand they were relatively abundant in the whalecarcass metagenome (25.2%), but were much lessabundant in other metagenomes, with values around10% (Figure 2). The proportion of Alphaproteobac-teria was much more stable in all plankton and thewhale carcass metagenomes, with values around20%. The Marmara sediment and the soil samplehad a relatively similar distribution of rRNA hits intaxonomic groups. The two more different taxo-nomic profiles corresponded to the ALOHA surfaceplankton metagenome, which was dominated bycyanobacteria (47.8%) and to the subseafloor eco-system (Peru margin) where archaea dominatedfollowed by bacteria belonging to the Chloroflexi(26.0%; Figure 2). Eukaryotic SSU rDNAs were scarcewith no or very few matches detected (o5%); onlyin the ALOHA deep-sea plankton metagenome theyrepresented around 10% of the total SSU rDNAsdetected. Taking into account that eukaryotic rRNAgenes are usually organized in large tandems with,sometimes, hundreds of gene copies, these obser-vations reinforce the idea that, although diverse,eukaryotes are minor components of the microbialdiversity, at least in terms of cell numbers.

Comparison of Marmara deep-sea plankton andsediment with other metagenomesWe then compared seven selected metagenomic datasets by reciprocal TBLASTN analyses. We identifiedhigh scoring sequence pairs and computed thenumber of reciprocal best matches. As can be seenin Figure 3, the highest number of best reciprocalmatches occurred in the Marmara deep-sea planktonMa101 and whale carcass metagenomes, followed byMa101 and the ALOHA deep-sea plankton data set.The Peru margin data set accounted for the lowest

number of high scoring pairs with the rest ofthe metagenomes, followed by the ALOHA surfacesample.

These results were further confirmed by theanalysis of shared sequence reads between meta-genomes. For this, each metagenome was analyzedagainst all other metagenomes using the promeralignment tool of MUMMER (Kurtz et al., 2004) withstrict parameters and considering only reciprocalbest matches. Under these conditions, only highlyconserved proteins that are present in the metagen-omes can be identified. The highest number ofreciprocal matches was again identified between thewhale carcass and the Marmara deep-sea planktonmetagenome (Ma101) with 9666 matches followedby the Ma101/ALOHA deep-sea data set with 8367matches (Figure 4a). As the number of reads amongthese three metagenomes differs to some extent forthe ALOHA deep-sea metagenome (149 022) com-pared with Ma101 (183 621) and whale carcass(200 722), the shared reads to the ALOHA deep-seadata set represent a minimum value of shared reads.Nevertheless, as the number of shared reads with theother metagenomes is much lower, these threemetagenomes are clearly the most similar, sharingat least 1859 reads. Most of those shared readsencoded housekeeping proteins as ribosomal pro-teins and transfer RNA synthetases, correspondingessentially to the marker proteins mentioned above.Cluster analysis using the number of shared reads,did indeed group together those three metagenomes(Figure 4b). The Marmara sediment Ma29clustered with the soil sample and all these fivemetagenomes were closer among them than with thePeru subseafloor and the ALOHA surface watermetagenomes.

This clustering was further confirmed by thecomparison of COG and KEGG categories in thedifferent data sets (see below).

ALOHAsurface plankton

ALOHAdeep-sea plankton

Whale carcass

Ma101(plankton 1,000 m)

Peru margin(subseafloor)

Soil (waseca)

Ma29(Marmara sediment)

Number of best matches

Ma29 (sediment; 1,300 m depth)

Ma101 (plankton; 1,000 m depth)

ALOHA deep-sea plankton

ALOHA surface plankton

Peru margin (subsurface sediment)

Whale carcass

Waseca soil

2000 4000 6000 8000 10000 120000

Figure 3 Best reciprocal TBLASTN high scoring pairs (HSPs) between selected metagenomes.

Comparative metagenomics: Sea of MarmaraA Quaiser et al

295

The ISME Journal

Page 12: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

COG and KEGG functional categoriesThe predicted protein sequences of the sevenmetagenomes were compared against different data-bases including COGs (Tatusov et al., 2003), KEGG(Kanehisa et al., 2004) and SEED subsystems(Overbeek et al., 2005). In all, but the plankton-surfaceand the subseafloor metagenomes (both encompass-ing short reads, B100 bp) the proportion of matchesto the COG database exceeded largely the KEGG andSEED matches (Supplementary Figure S17). Thesequences were categorized and the presence offunctional protein groups and metabolic traitscompared among the different metagenomes usingcluster analysis. At the level of COG categories, onlyslight variations between the metagenomes could beobserved. The highest variations corresponded tothe categories of signal transduction (category T)and translation and ribosome biogenesis (category J)and, more precisely, to the ratio between the twocategories (Supplementary Figure S18). Notably,other housekeeping genes, such as those involvedin transcription (category K) seemed more stableacross metagenomes than those involved in transla-tion. The ratio of genes involved in signal transduc-tion versus those involved in translation was lowestin the surface ALOHA metagenome and in the Perusubsurface metagenome, whereas it was intermedi-ate in the deep ALOHA and Marmara plankton andmuch higher in the rest of the metagenomes. The

fact that genes involved in signal transduction weremore relatively abundant in the whale carcass, inMarmara sediment and in soil metagenomes is inagreement with the idea that the microbial commu-nities in these structured biotopes entertain morecomplex interactions that involve cell-to-cell com-munication. Similarly, the higher ratio of signaltransduction versus translation genes in deep-seaplankton compared with that in surface waters issuggestive of a higher cell-to-cell interaction level indeep-sea planktonic communities. This would bein agreement with the idea that deep-sea planktoniclife occurs mostly associated to particles of sinkingorganic matter, where biofilms may form (Martin-Cuadrado et al., 2007). Finally, the low ratio ofsignal transduction versus translation genes in thePeru subsurface is likely explained by the domi-nance of this particular sample by Crenarchaeota,generally characterized by compact genomes, and bythe particular stable environmental constraints ofthis extreme habitat that result in a geneticallydistinct community (Biddle et al., 2008).

A hierarchical cluster analysis of the differentCOG categories (831 COGs having at least 1 match ineach of the 7 metagenomes) showed that the twodeep-sea plankton samples plus the whale carcassmetagenomes, on the one hand, and the soil andMa29 sediment, on the other hand, grouped together(Figure 4 and Supplementary Figure S19). Toidentify functional categories that might be specificfor a metagenome we determined 74 COGs thatshowed the highest variations applying a cut-offstandard deviation of 10% within the functionalgroup followed by additional manual inspection(Figure 4). The more highly represented COGs in thedeep-sea plankton sample (Ma101) corresponded toa chemotaxis protein involved in environmentalsensing and cell response, and the 3-deoxy-D-arabino-heptulosonate 7-phosphate synthase in-volved in the biosynthesis of aromatic amino acids(Table S1). The sediment sample (Ma29) was largelyenriched in a variety of COGs, many of which wererelated to anaerobic and/or autotrophic metabolismsand to sulfur cycling (Table S1), which is in goodagreement with microbial communities associatedto anoxic sediments.

The hierarchical cluster analysis of KEGG func-tional categories yielded a similar result to thatobserved using COG categories, with the deepplanktonic and the whale carcass and the soil andsediment metagenomes, respectively, clusteringtogether (Figure 4 and Supplementary Figure S20).The deep-sea plankton sample Ma101 showedrelatively high levels of sequences affiliated tobacterial chemotaxis, flagellar assembly, arachido-nic acid metabolism, type III secretion system andgamma-hexachlorocyclohexane degradation. InMa29, the most abundant genes corresponded toelectron transfer carriers, type II secretionsystem, signal transduction mechanisms and energymetabolism.

Marmaraplankton(1,000 m)

Whalecarcass

ALOHAdeep-sea plankton

(500-4,000 m)

9,666

3,7278,367

1,859

ALOHA surface plankton

ALOHA deep-sea plankton

Whale carcass

Ma101 (1,000 m plankton)

Peru margin (subseafloor)

Soil (waseca)

Ma29 (Marmara sediment)

a

b

Figure 4 MUMMER-based identification of shared sequences inmetagenomes. (a) Number of shared sequences in Marmara andALOHA deep-sea plankton and whale carcass metagenomes.(b) Cluster analysis of selected metagenomes based on sharedsequences.

Comparative metagenomics: Sea of MarmaraA Quaiser et al

296

The ISME Journal

Page 13: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

Estimation of the EGS based on COG marker proteinsRecently, Raes et al. (2007) established an in silicomethod to estimate the average genome sizeof microbial communities using metagenomicsequence data. This method is based on the inverserelationship of the genome size and the marker genedensity and includes the normalization to thenumber of sequence reads and the total number ofbase pairs of the analyzed metagenome. The selectedmarker genes encode for 35 COG marker proteinspresent in all genomes with low and constant copynumbers, independently of genome size. Thiscorrelation was identified by simulations usingfragmented complete genome sequences andapplied to metagenomic sequences with fragmentlengths of 400–1100 bp and a bit score cut-off 460for the identification of marker protein matches byBLAST analysis. To test the applicability of thiscorrelation to shorter reads and to get an idea ofthe average genome size of microorganisms in thedeep-sea plankton and sediment, we applied thedescribed function to our data. The 35 COG markerproteins were blasted against all 7 metagenomes. Asthe bit score in BLAST analysis depends on thealigned fragment length, we first evaluated differentbit score cut-offs ranging from 20 to 80. We observeda strong variability in the number of marker proteinmatches and, consequently, in the EGS estimations

using bit score cut-offs in the range of 60 and above.This indicates that this approach strongly dependson fragment length. Hence, it can be at mostindicative when applied to fragments with averagelengths between 100 and 200 bp. Nevertheless,when applying a bit score cut-off of 440 for thedetection of marker genes, we obtained an EGS of3.66 Mbp for the sediment community (Ma29) and1.78 Mbp for the deep-sea plankton (Ma101; Table 1and Supplementary Figure S21). Using theuntrimmed original metagenome fragments from oneGOS surface plankton sample and the Waseca soilecosystem and applying a bit score -off of 460, Raeset al. (2007) obtained average prokaryotic EGS of1.6 Mbp and 4.74 Mbp, respectively. These valuesare comparable with our results obtained from theALOHA surface plankton pyrosequences (1.71 Mbp)and the trimmed Waseca soil fragments (4.59 Mbp)using a bit score cut-off threshold 440. However,although Raes et al. (2007) estimated the whalefall average genome size to 3.55 Mbp using theaverage of all three different whale carcassessamples, our estimation yielded only 2.45 Mbp(cut-off 440). This might be partly because ofdifferences in the actual sequences considered forthe estimation, as in our study we used an equalmixture of the different whale ecosystems trimmedto shorter sequence lengths. At any rate, despite the

Waseca soil

Ma29 (sediment)

ALOHA deep-sea plankton

Ma101 (plankton)

Whale carcass

ALOHA surface plankton

Peru margin (subseafloor)

ALOHA surface plank.

ALOHA deep-sea plank.

Ma101 (plankton)

Whale carcass

Waseca soil

Ma29 (sediment)

Peru margin

Xenobiotics by cyt P450 (Ma101)

Ether lipid metabolism (Ma101)Bacterial chemotxis (Ma101)

Type III secretion (Ma101)

Biphenyl A degradation (Ma101)

Taurin metabolism (Ma101)

Flagellar assembly (Ma101)

Signal transduction (Ma29)Androgen and Estrogen (Ma29)Glucosaminoglucan degradation (Ma29)

Spingholipid biosynthesis (Ma29)

Pyruvate oxidoreductase (Ma29)Type II secretion system (Ma29)

Protein kinase (Ma29)Sporulation (Ma29)

COG

KEGG

Figure 5 Cluster analysis of several metagenomes based on matches to different COG and KEGG categories normalized to the respectivenumber of sequence fragments. For COGs, only functional categories having at least 10 matches in all 7 metagenomes were retained(831 COGs), from which the most frequent 74 are shown (see also Supplementary Figure S19). For KEGGs, only functional categorieshaving at least 1 match in all 7 metagenomes or at least 20 matches in 1 metagenome were retained (148 KEGGs; see also SupplementaryFigure S20). The names of particular KEGG categories showing strong differences across metagenomes are shown at the bottom.

Comparative metagenomics: Sea of MarmaraA Quaiser et al

297

The ISME Journal

Page 14: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

variability of the EGS values obtained depending onthe thresholds used, these estimates are consistentwith smaller genome sizes associated to planktonsamples and larger genome sizes associated to soiland sediment samples, with the notable exception ofthe highly divergent and archaea-dominated Perusubsurface.

Functional key enzymesIn addition to the general overview on the commu-nity functional potential that the relative abundanceof functional COG and KEGG categories providefrom the different metagenomes, we looked forparticular proteins that are considered to be diag-nostic for particular enzymatic pathways and,hence, for particular metabolic capabilities. Theseincluded ammonia monooxygenase AmoA, AmoBand AmoC subunits (nitrification), 4-hydroxybutyr-yl dehydratase (CO2 fixation by the 3-hydroxypro-pionate/4-hydroxybutyrate pathway), dissimilatorysulfite reductase DsrA and DsrB subunits (sulfaterespiration), dissimilatory nitrite reductase subunitsNirK and NirS (nitrate respiration), nitrogenasesubunits NifH and NifD (nitrogen fixation), carbon-monoxide dehydrogenase CoxLMS subunits (COoxidation), RuBisCO (CO2 fixation), sulphatase(degradation of sulfonated heteropolysaccharides),hydroxylamine oxidoreductase HAO (anammox),methyl coenzyme A reductase (anaerobic oxidationof methane) and C-P lyase (phosphonate utilization;Figure 6 and Supplementary Table S2).

The ammonium monooxygenase is the keyenzyme in the first step of nitrification. Thaumarch-aeota are major factors in the oxidation of ammoniato nitrite in soil and oceans as suggested by thedominance of archaeal over bacterial amoA genes(Leininger et al., 2006; Yakimov et al., 2007).However, not all deep-sea Thaumarchaeota possessamoA, suggesting that many deep-sea archaea arenot chemolithoautotrophic ammonia oxidizers(Agogue et al., 2008). In our case, archaeal amogenes were fairly abundant in the deep-sea planktonsample Ma101 (32 matches) as compared with therest of metagenomes (below 1–2 matches), whereasbacterial amo genes were detected in the Ma101metagenome in much lower numbers (1 match).This suggests that bathypelagic Thaumarchaeota inthe Marmara Sea are indeed ammonia oxidizers,whereas those in the open Pacific Ocean are likelynot (or to much lesser extent). In addition, Ma101Thaumarchaeota seem to be chemolithoautotrophic,as we identified a high number of matches withstrong identities and bit scores (ranging from58–151) to 4-hydroxybutyryl dehydratase, a keyenzyme in the 3-hydroxypropionate/4-hydroxybutyratepathway for autotrophic CO2 fixation in group Iarchaea (Berg et al., 2007). In addition to this Cfixation pathway, we identified a number of hits tothe RuBisCO large subunit (rbcL), indicating thepresence of the more conventional Calvin cycle forCO2 fixation in the different metagenomes (Supple-mentary Table S2).

We identified genes involved in nitrate respirationin the whale carcass, Marmara sediment and

Ma29 (sediment; 1,300 m depth)

Ma101 (plankton; 1,000 m depth)

ALOHA deep-sea plankton

ALOHA surface plankton

Peru margin (subseafloor)

Whale carcass

Waseca soil

AmoABC(Thaumarchaeota)

AmoABC(Bacteria)

diss. nitrite reductase(NirK/NirS)

Nitrogenase(NifH/NifD)

diss. sulfite reductase(DsrAB)

Carbon-monoxid DH(CoxLMS)

4-hydroxybutyryl dehydratase(Thaumarchaeota)

methyl-CoM reductase(McrABCDG)

RuBisCO

Sulfatase

Hydroxylamine oxidoreductase(anamox)

C-P lyase(PhnJL-Phosphonate degradation)

N° of key enzyme matches/genome equivalent

Figure 6 Comparison of the relative proportion of key metabolic enzymes in Marmara deep-sea plankton and sediment withother metagenomes.

Comparative metagenomics: Sea of MarmaraA Quaiser et al

298

The ISME Journal

Page 15: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

plankton and soil, but not in the ALOHA watercolumn (Table S2). The absence of genes for nitraterespiration in the highly oxygenated ALOHA watercolumn in the open Pacific Ocean correlates withthe fact that denitrification occurs when oxygen islimited and nitrate is available as terminal electronacceptor. This is the situation in the above suboxicto anoxic samples included in our study (deepMarmara Sea, whale carcass, soil). Furthermore, inthe case of the whale carcass, the presence of thesegenes correlates with the occurrence of a significantnumber of SSU rRNA hits to Epsilonproteobacteriaaffiliated to S. denitrificans (22% of the whalecarcass metagenome), which is able to performdenitrification. Tringe et al. (2005) already notedan enrichment of nitrate respiration genes in whalecarcass and soil. Along with genes involved innitrate respiration, genes involved in nitrogenfixation, a process inhibited by oxygen, were alsorelatively abundant in soil and Ma29 sedimentmetagenomes (Figure 6, Supplementary Table S2).Many of the nitrogen fixers in the whale carcassmetagenome might be Alpha- or Epsilon-proteobac-teria (Johnston et al., 2005).

As expected, genes encoding the subunits of thedissimilatory sulfite reductase responsible for thereduction of sulfite to sulfide during sulfate reduc-tion were very abundant in the sediment Ma29, butwere also present, though at much lower abundance,in Marmara deep-sea plankton and the whalecarcass metagenomes (Figure 6 and SupplementaryTable S2). Sulfate reduction is predominantlycarried out by Deltaproteobacteria, which indeedaccounted for ca 20% SSU rRNA genes in librariesand a similar percentage of rRNA hits in pyrose-quences (Figure 1). Phylogenetic analyses of SSUrDNA sequences from gene libraries showed thatmany of the Ma29 Delta-proteobacteria affiliated tothe sulfate-reducing Desulfobacteraceae, but also toseveral lineages without cultivated members (Sup-plementary Figure S7), which suggest that some ofthem may be sulfate reducers as well. Among theplanktonic deltaproteobacterial SSU rDNA se-quences, several belonged to the uncultivated groupSAR324. The co-occurrence of genes for sulfatereduction in the same sample might suggest that theSAR324 are capable of reducing sulfate. Indeed, thepresence of certain metabolic genes in metagenomicclones (Moreira et al., 2006) and the relativeabundance of this group in oxygen-depleted waters(Zaikova et al., 2010) suggested that SAR324may correspond to anaerobic or microaerophilicorganisms.

In anoxic sediments, sulfate reduction is generallyaccompanied by the activity of methanogenicarchaea in deeper sediment layers. In cold seepenvironments, such as those existing in localizedareas of the Marmara Sea, some sulfate-reducingbacteria are symbiotically associated to archaeacarrying out anaerobic methane oxidation. We thuslooked for the presence of methyl coenzyme M

reductase, which catalyses the terminal step inmethanogenesis and seems to have a role also inreverse methanogenesis, being characteristic ofmethane-metabolizing archaea. We used the genesencoding the subunits of methyl coenzyme M fromMethanosarcina barkeri (McrABCDG) and those ofthe nickel protein involved in the anaerobic oxida-tion of methane (McrABG) from an unculturedarchaeon (Kruger et al., 2003) as seeds against thechosen metagenomes. However, no hits were de-tected. This is in agreement with both, the fact thatMa29 corresponded to ‘normal’, bottom sedimentnot strongly influenced by cold seep activity andwith the fact that it was collected from the surface ofthe sediment core and hence above the methanogen-esis layer. Accordingly, SSU rRNA sequences be-longing to either typical methanogenic archaea or toANME groups were scarce (o10%; Figure 1).

As Planctomycetes were relatively abundant inthe Marmara sediment, we searched for the presenceof genes that could indicate the occurrence ofanammox activity by blasting 28 genes closelyrelated to the Kuenenia stuttgartensis gene encodingthe hydroxylamine oxidoreductase, one of the keyenzymes of the anammox reaction (Strous et al.,2006). The highest number of matches was found inthe Ma29 metagenome (8) followed by the whalecarcass (2) and the ALOHA deep-sea metagenome(2). Although the number of matches remainsrelatively low, their relative higher abundance inMarmara sediment together with the presence of arelative high abundance of planctomycetes SSUrDNAs suggest that at least a fraction of thesebacteria are able to oxidize ammonia anaerobically.Correlating with a relative high abundance ofPlanctomycetes in Marmara sediment, we alsodetected a high number of sulfatases (Figure 6 andSupplementary Table S2). Sulfatases are abundantin the genome of R. baltica (109 hits; Glockner et al.,2003) and, in general, marine planctomycetespossess a large number of these enzymes, whichthey might use for the initial breakdown of sulfatedheteropolysaccharides, thus having an importantrole in recycling these abundant oceanic com-pounds (Woebken et al., 2007).

The use of phosphonate compounds has beenrecently proposed as an important source of phos-phorous in P-depleted surface marine waters, aswell as in more P-rich deep waters. Known genesbut also novel pathways for phosphonate utilizationare abundant in metagenomic picoplankton librariesas deduced from functional screenings (Martinezet al., 2009). Genes for phosphonate utilization werealso abundant in the highly oligotrophic surfacewaters of the East Mediterranean (Feingersch et al.,2009). We looked for the presence of C-P lyaseinvolved in phosphonate metabolism in our selectedmetagenomes. We detected a similar moderateabundance of homologs in Marmara and ALOHAdeep-sea plankton. However, they were much moreabundant in the whale carcass (Figure 6), which is

Comparative metagenomics: Sea of MarmaraA Quaiser et al

299

The ISME Journal

Page 16: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

most likely related to the active mobilization bythese microbial communities of bone-associatedphosphonates.

Finally, we also looked for carbon monoxidedehydrogenase genes. They were highly abundantin the ALOHA deep-sea metagenome (34 hits)followed by the Marmara Ma101 plankton andMa29 sediment (19 hits each), soil (17 hits), theALOHA surface metagenome (15 hits), the whalecarcass (8) and Peru subsurface (4) metagenomes.The oxidation of CO to CO2 as an alternative orsupplementary energy source is widespread inmany marine bacteria including, notably, membersof the highly versatile and abundant Roseobacterclade (King and Weber, 2007; Brinkhoff et al., 2008).Carbon monoxide dehydrogenase genes weredetected at relative high abundance in deep Medi-terranean waters, which suggested that deep-seamicrobes might perform a similar form of lithoheter-otrophy to that showed by surface bacterioplanktonby oxidizing CO (Martin-Cuadrado et al., 2007). Thepossible role of CO oxidation in deep waters hasbeen criticized because the source of CO would beunclear at that depth and because carbon monoxidedehydrogenase is also involved in some pathways ofcentral C metabolism, for instance in acetogenicmethanogens (Pezacka and Wood, 1984). However,hydrothermal activity associated with oceanicridges and to submarine volcanic areas, which areindeed rather extensive in the Mediterranean,constitutes a very likely source of CO in the deepsea. Furthermore, sequencing of metagenomic

fosmids containing carbon monoxide dehydro-genase genes has shown that they are organized inclusters having the typical structure of CO oxidizingbacteria (Martin-Cuadrado et al., 2009). Hence,lithoheterotrophy based on CO oxidation mayactually be an important strategy to gain free energyalso in the deep sea.

Genome recruitment in Marmara deep-sea planktonand sediment metagenomesTo get a deeper insight on the genomic structure ofdominant microorganisms in Marmara Sea, weperformed recruitment analysis using the promerand nucmer packages included in MUMmer 3.0(Kurtz et al., 2004). We selected as seeds themicrobial genomes that showed the highest numberof hits in the Marmara metagenomes. In Ma101plankton, the microorganisms that retrieved morehits in pyrosequences with high average amino-acididentities were Nitrosopumilus maritimus, P. ubique,A. macleodii ‘deep ecotype’ and A. macleodiiATCC27126, isolated from surface waters (with77%, 81%, 83% and 90% amino acid identities,respectively). Thus, we plotted these genome se-quences against the Ma101 deep-sea plankton meta-genome. At the amino-acid level (promer), thegenomes of A. macleodii ‘deep ecotype’, A. macleodiiATCC27126, P. ubique and N. maritimus recruitedMa101 pyrosequences covering 39%, 44%, 46% and25%, respectively, of the respective genomes and,when intergenic regions and ribosomal RNA genes

protein coding region

total

Genome coverage (%)

Alteromonas macleodii “deep”(aa - 16613)

Alteromonas macleodii “deep”(nt - 2397)

Alteromonas macleodii “surface”(aa - 19521)

Alteromonas macleodii “surface”(nt - 10053)

Nitrosopumilus maritimus(aa - 4541)

Nitrosopumilus maritimus(nt - 65)

Pelagibacter ubique HTCC1062(aa - 7413)

Pelagibacter ubique HTCC1062(nt - 1850)

Rhodopirellula baltica(aa - 1114)

Sulfurimonas denitrificans(aa - 3565)

Bradyrhizobium japonicum(aa - 3526)

Prochlorococcus marinus(aa - 89451)

Nitrosopumilus maritimus(aa - 1210)

Ma101(plankton)

Ma29(sediment)

Whalecarcass

Soil(waseca)

ALOHA(surface plankton)

ALOHA(deep plankton)

0 10 20 30 40 50 60 70 80 90

Figure 7 Percentage of coverage of the most abundantly recruited genomes in Marmara deep-sea plankton (Ma101) and sediment(Ma29) as compared with those most represented in other selected metagenomes. The numbers in parentheses indicate the numbers ofmatching reads; aa, based on amino acids (promer); nt based on nucleotides (nucmer).

Comparative metagenomics: Sea of MarmaraA Quaiser et al

300

The ISME Journal

Page 17: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

were excluded, the coverage increased to 45.7%,52.0%, 48.8% and 27.3%, respectively (Figure 7).Similarly, at the nucleotide level (nucmer), weobserved coverages of 31.8% (A. macleodii ATCC27126,‘surface ecotype’), 7.1% (A. macleodii ‘deep eco-type’), 21.7% (P. ubique) and 0.435% (N. maritimus).The regions from A. macleodii ‘deep-ecotype’not covered by Ma101 corresponded essentiallyto the (meta)genomic islands containing variableregions that were previously identified as majordifferences between this strain and the ATCC27126(Ivars-Martinez et al., 2008). Similarly, several(meta)genomic islands were identified in the recruit-ment plot of P. ubique (Supplementary Figure S22).

The dominance of A. macleodii ATCC27126, theso-called ‘surface ecotype’, over the ‘deep-ecotype’in deep Marmara waters is at odds with someprevious observations in other Mediterraneanwaters. Whereas the surface ecotype is dominantin surface Mediterranean and also in open oceanwaters and seems to prefer warmer temperatures,the deep-ecotype is also present in surface waters,but in lower amounts and exhibits a preference forlower temperatures (Ivars-Martinez et al., 2008). The‘deep-ecotype’ isolate was obtained from 1000 m-deep Adriatic water at an average temperature of13 1C, exactly the same depth and temperature(14 1C) as that of the Marmara Ma101 sample studiedhere. However, in Marmara, the surface ecotypepredominates. It has been proposed that the ‘deep-ecotype’ is better adapted to microaerophilic condi-tions and that it might have a better tolerance toheavy metals (Ivars-Martinez et al., 2008). However,this is also at odds with our observations, as thedeep Marmara waters are suboxic and the MarmaraSea is heavily polluted (Unluata et al., 1990;Besiktepe et al., 1994; Cagatay et al., 2009). Two,nonmutually exclusive, explanations could accountfor the dominance of the surface ecotype in thedeep Marmara Sea. First, A. macleodii cells fromsurface East Mediterranean (Aegean) waters aretransported to depth after the water mass enteringMarmara through the Dardanelles. Therefore, theirpresence would attest to the Aegean nature of thedeep Marmara waters. However, this would notnecessarily explain its relative high abundance atthis depth. The second explanation would refer tothe higher metabolic flexibility of the surfaceecotype, which is able to exploit a wider range ofsubstrates (Ivars-Martinez et al., 2008) and mighteventually be more adapted to Marmara bathypela-gic waters.

In Marmara sediment, recruitment was muchpoorer than in deep-sea plankton because of thelarger microbial diversity and, hence, lower genomecoverage. The genomes that recruited best werethose of the planctomycetes Pirellula sp. 1 (2520hits) and Blastopirellula maris (2133 hits), followedby that of Geobacter uranioreducens Rf4 (1853 hits),Pelobacter carbinolicus DSM2380 (1494 hits) andNitrosococcus oceani (1375 hits), illustrating the

high relative abundances of, mostly, planctomycetesand sulfate-reducing Deltaproteobacteria.

Concluding remarksThis study is a contribution to incipient compara-tive metagenomics of deep-sea microbial commu-nities from both plankton and sediment to startbuilding a better picture of the microbial diversityand the associated gene content and functionalpotential as a function of geographical and physi-co-chemical environmental parameters. We ana-lyzed metagenomic 454 pyrosequences frombathypelagic plankton (Ma101, 1000 m depth) andfrom ‘normal’, non-cold-seep influenced, bottomsediment (Ma29, 1300 m bottom depth) in the Seaof Marmara. Overall, the metagenomic data were ingood agreement with the bacterial diversity inferredfrom SSU rRNA gene libraries. Bathypelagic Mar-mara plankton was dominated by Gamma- andAlpha-proteobacteria with the ‘surface’ A. macleodiiecotype and P. ubique as dominant lineages, asshown by genome recruitment analyses. Within thearchaea, group I Crenarchaeota/Thaumarchaeotathat seem capable of oxidizing ammonia, as attestedby the presence of archaeal amo genes, dominate.Sediment communities are extremely diverse, withparticularly high relative abundances of Deltapro-teobacteria, many of them are sulfate reducers, andplanctomycetes, which are important in carboncycling. Methanogenic and/or methanotrophic ar-chaea were not abundant in the surface sedimentsample, indicating that the transition to the metha-nogenesis zone is deeper and that the sediment isnot actually influenced by intense seeping activity,which, in Marmara, is restricted to localized blackpatches along the main North Anatolian Fault (Geliet al., 2008; Zitter et al., 2008). Future comparisonwith metagenome sequences from those seep areasin the same location would be interesting to revealspecific adaptations to those environments.Although a large diversity of eukaryotes wasdetected from both plankton and sediment, eukar-yotic sequences (including rRNA genes) in themetagenome data sets were much less abundantthan prokaryotic ones. The comparison of theMarmara metagenomic data with those of otherdeep-sea data sets (ALOHA deep-sea plankton,whale carcasses), marine subsurface sediment ofthe Peru margin, and other outgroup environments(ALOHA surface plankton and soil) showed thatMarmara deep-sea plankton resembled both themeso- and bathypelagic Pacific ALOHA deep-seaand the whale carcass metagenomes. AlthoughMarmara bathypelagic waters share the deep-seaconditions with the ALOHA deep-sea plankton, itshares suboxic conditions with the chemosyntheticecosystem of whale carcasses. Marmara sedimentclusters with the soil metagenome, indicating thatthe ecological role involving the ultimate degrada-tion of organic matter and the completion of

Comparative metagenomics: Sea of MarmaraA Quaiser et al

301

The ISME Journal

Page 18: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

biogeochemical cycles imposes strong constraintsand determines the nature and function of theassociated microbial communities.

Acknowledgements

We thank P Henry and N Cagatay, chief scientists of theoceanographic cruise MARNAUT, for providing PLG theopportunity to participate, as well as to the shipboardscientific team for an enjoyable cruise. We also acknowl-edge Captain and crew of R/V L’Atalante and the helpof the Turkish Navy to protect our ship in the zonesof heavy ship traffic. We thank INSU for providinga CTD-rosette for water sampling and L Fichen, whooperated it. This work was funded by the French NationalAgency for Research (EVOLDEEP project, contract numberANR-08-GENM-024-002).

References

Agogue H, Brink M, Dinasquet J, Herndl GJ. (2008). Majorgradients in putatively nitrifying and non-nitrifyingArchaea in the deep North Atlantic. Nature 456:788–791.

Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z,Miller W et al. (1997). Gapped BLAST and PSI-BLAST:a new generation of protein database search programs.Nucleic Acids Res 25: 3389–3402.

Berg IA, Kockelkorn D, Buckel W, Fuchs G. (2007).A 3-hydroxypropionate/4-hydroxybutyrate autotrophiccarbon dioxide assimilation pathway in Archaea.Science 318: 1782–1786.

Besiktepe ST, Sur IH, Ozsoy E, Abdul Latif M, Oguz T,Unluata U. (1994). The circulation and hydrography ofthe Marmara Sea. Prog Oceanogr 34: 285–334.

Biddle JF, Fitz-Gibbon S, Schuster SC, Brenchley JE,House CH. (2008). Metagenomic signatures of thePeru Margin subseafloor biosphere show a geneticallydistinct environment. Proc Natl Acad Sci USA 105:10583–11058.

Blackburn MV, Hannah F, Rogerson A. (2009). Firstaccount of apochlorotic diatoms from intertidal sandof a south Florida beach. Estuarine Coastal and ShelfScience 84: 519–526.

Brinkhoff T, Giebel HA, Simon M. (2008). Diversity,ecology, and genomics of the Roseobacter clade: ashort overview. Arch Microbiol 189: 531–539.

Brochier-Armanet C, Boussau B, Gribaldo S, Forterre P.(2008). Mesophilic Crenarchaeota: proposal for a thirdarchaeal phylum, the Thaumarchaeota. Nat RevMicrobiol 6: 245–252.

Cagatay MN, Eris K, Ryan WBF, Sancar U, Polonia A,Akcer S et al. (2009). Late Pleistocene-Holoceneevolution of the northern shelf of the Sea of Marmara.Mar Geol 265: 87–100.

Cetecioglu Z, Ince BK, Kolukirik M, Ince O. (2009).Biogeographical distribution and diversity of bacterialand archaeal communities within highly pollutedanoxic marine sediments from the marmara sea.Mar Pollut Bull 58: 384–395.

Ciccarelli FD, Doerks T, von Mering C, Creevey CJ, Snel B,Bork P. (2006). Toward automatic reconstruction of ahighly resolved tree of life. Science 311: 1283–1287.

DeLong EF, Preston CM, Mincer T, Rich V, Hallam SJ,Frigaard NU et al. (2006). Community genomics

among stratified microbial assemblages in the ocean’sinterior. Science 311: 496–503.

DeSantis Jr TZ, Hugenholtz P, Keller K, Brodie EL, Larsen N,Piceno YM et al. (2006b). NAST: a multiple sequencealignment server for comparative analysis of 16S rRNAgenes. Nucleic Acids Res 34: W394–W399.

DeSantis TZ, Hugenholtz P, Larsen N, Rojas M, Brodie EL,Keller K et al. (2006a). Greengenes, a chimera-checked 16S rRNA gene database and workbenchcompatible with ARB. Appl Environ Microbiol 72:5069–5072.

Feingersch R, Suzuki MT, Shmoish M, Sharon I, Sabehi G,Partensky F et al. (2009). Microbial communitygenomics in eastern Mediterranean Sea surface waters.ISME J 4: 78–87.

Frias-Lopez J, Shi Y, Tyson GW, Coleman ML, SchusterSC, Chisholm SW et al. (2008). Microbial communitygene expression in ocean surface waters. Proc NatlAcad Sci USA 105: 3805–3810.

Frigaard NU, Martinez A, Mincer TJ, DeLong EF. (2006).Proteorhodopsin lateral gene transfer betweenmarine planktonic Bacteria and Archaea. Nature 439:847–850.

Geli L, Henry P, Zitter T, Dupre S, Tryon M, Cagatay MNet al. (2008). Gas emissions and active tectonics withinthe submerged section of the North Anatolian Faultzone in the Sea of Marmara. Earth and PlanetaryScience Letters 274: 34–39.

Glockner FO, Kube M, Bauer M, Teeling H, Lombardot T,Ludwig W et al. (2003). Complete genome sequenceof the marine planctomycete Pirellula sp. strain 1.Proc Natl Acad Sci USA 30: 30.

Gomez-Alvarez V, Teal TK, Schmidt TM. (2009). Systema-tic artifacts in metagenomes from complex microbialcommunities. ISME J 3: 1314–1317.

Huson DH, Auch AF, Qi J, Schuster SC. (2007). MEGANanalysis of metagenomic data. Genome Res 17: 377–386.

Ivars-Martinez E, Martin-Cuadrado AB, D’Auria G, Mira A,Ferriera S, Johnson J et al. (2008). Comparativegenomics of two ecotypes of the marine planktoniccopiotroph Alteromonas macleodii suggests alterna-tive lifestyles associated with different kinds ofparticulate organic matter. ISME J 2: 1194–1212.

Jobb G, von Haeseler A, Strimmer K. (2004). TREEFINDER:a powerful graphical analysis environment for mole-cular phylogenetics. BMC Evol Biol 4: 18.

Johnston AW, Li Y, Ogilvie L. (2005). Metagenomic marinenitrogen fixation—feast or famine? Trends Microbiol13: 416–420.

Jorgensen BB, Boetius A. (2007). Feast and famine—microbial life in the deep-sea bed. Nat Rev Microbiol 5:770–781.

Kanehisa M, Goto S, Kawashima S, Okuno Y, Hattori M.(2004). The KEGG resource for deciphering thegenome. Nucleic Acids Res 32: D277–D280.

Karner MB, DeLong EF, Karl DM. (2001). Archaealdominance in the mesopelagic zone of the PacificOcean. Nature 409: 507–510.

King GM, Weber CF. (2007). Distribution, diversity andecology of aerobic CO-oxidizing bacteria. Nat RevMicrobiol 5: 107–118.

Konneke M, Bernhard AE, de la JR , Walker CB, Waterbury JB,Stahl DA. (2005). Isolation of an autotrophic ammonia-oxidizing marine archaeon. Nature 437: 543–546.

Konstantinidis KT, Braff J, Karl DM, DeLong EF. (2009).Comparative metagenomic analysis of a microbialcommunity residing at a depth of 4000 meters at

Comparative metagenomics: Sea of MarmaraA Quaiser et al

302

The ISME Journal

Page 19: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

station ALOHA in the North Pacific subtropical gyre.Appl Environ Microbiol 75: 5345–5355.

Kruger M, Meyerdierks A, Glockner FO, Amann R,Widdel F, Kube M et al. (2003). A conspicuous nickelprotein in microbial mats that oxidize methaneanaerobically. Nature 426: 878–881.

Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M,Antonescu C et al. (2004). Versatile and open softwarefor comparing large genomes. Genome Biol 5: R12.

Leininger S, Urich T, Schloter M, Schwark L, Qi J,Nicol GW et al. (2006). Archaea predominate amongammonia-oxidizing prokaryotes in soils. Nature 442:806–809.

Lopez-Garcıa P, Lopez-Lopez A, Moreira D, Rodrıguez-Valera F. (2001). Diversity of free-living prokaryotesfrom a deep-sea site at the Antarctic Polar Front. FEMSMicrobiol Ecol 36: 193–202.

Lovejoy C, Massana R, Pedros-Alio C. (2006). Diversityand distribution of marine microbial eukaryotes in theArctic Ocean and adjacent seas. Appl Environ Micro-biol 72: 3085–3095.

Ludwig W, Strunk O, Westram R, Richter L, Meier H,Yadhukumar et al. (2004). ARB: a software envi-ronment for sequence data. Nucleic Acids Res 32:1363–1371.

Martin-Cuadrado AB, Ghai R, Gonzaga A, Rodriguez-Valera F. (2009). CO dehydrogenase genes found inmetagenomic fosmid clones from the deep Mediterra-nean. Appl Environ Microbiol 75: 7436–7444.

Martin-Cuadrado AB, Lopez-Garcia P, Alba JC, Moreira D,Monticelli L, Strittmatter A et al. (2007). Metage-nomics of the deep Mediterranean, a warm bath-ypelagic habitat. PLoS ONE 2: e914.

Martin-Cuadrado AB, Rodriguez-Valera F, Moreira D, AlbaJC, Ivars-Martinez E, Henn MR et al. (2008). Hindsightin the relative abundance, metabolic potential andgenome dynamics of uncultivated marine archaeafrom comparative metagenomic analyses of bathype-lagic plankton of different oceanic regions. ISME J 2:865–886.

Martinez A, Tyson GW, Delong EF. (2009). Widespreadknown and novel phosphonate utilization pathwaysin marine bacteria revealed by functional screeningand metagenomic analyses. Environ Microbiol 12:222–238.

Moreira D, Rodrıguez-Valera P, Lopez-Garcıa P. (2006).Genome fragments from mesopelagic Antarctic watersreveal a novel deltaproteobacterial group related to themyxobacteria. Microbiology 152: 505–517.

Overbeek R, Begley T, Butler RM, Choudhuri JV, Chuang HY,Cohoon M et al. (2005). The subsystems approach togenome annotation and its use in the project to annotate1000 genomes. Nucleic Acids Res 33: 5691–5702.

Page RD. (1996). TreeView: an application to displayphylogenetic trees on personal computers. ComputAppl Biosci 12: 357–358.

Pezacka E, Wood HG. (1984). Role of carbon monoxidedehydrogenase in the autotrophic pathway used byacetogenic bacteria. Proc Natl Acad Sci USA 81:6261–6265.

Polymenakou PN, Bertilsson S, Tselepides A, StephanouEG. (2005). Bacterial community composition indifferent sediments from the Eastern MediterraneanSea: a comparison of four 16S ribosomal DNA clonelibraries. Microb Ecol 50: 447–462.

Poretsky RS, Hewson I, Sun S, Allen AE, Zehr JP, Moran MA.(2009). Comparative day/night metatranscriptomic

analysis of microbial communities in the North Pacificsubtropical gyre. Environ Microbiol 11: 1358–1375.

Pruesse E, Quast C, Knittel K, Fuchs BM, Ludwig W,Peplies J et al. (2007). SILVA: a comprehensive onlineresource for quality checked and aligned ribosomalRNA sequence data compatible with ARB. NucleicAcids Res 35: 7188–7196.

Quaiser A, Lopez-Garcia P, Zivanovic Y, Henn MR,Rodriguez-Valera F, Moreira D. (2008). Comparativeanalysis of genome fragments of Acidobacteria fromdeep Mediterranean plankton. Environ Microbiol 10:2704–2717.

Raes J, Foerstner KU, Bork P. (2007). Get the most out ofyour metagenome: computational analysis of environ-mental sequence data. Curr Opin Microbiol 10: 490–498.

Ruehland C, Blazejak A, Lott C, Loy A, Erseus C, DubilierN. (2008). Multiple bacterial symbionts in two speciesof co-occurring gutless oligochaete worms from Med-iterranean sea grass sediments. Environ Microbiol 10:3404–3416.

Rusch DB, Halpern AL, Sutton G, Heidelberg KB, William-son S, Yooseph S et al. (2007). The Sorcerer II GlobalOcean Sampling expedition: Northwest Atlanticthrough Eastern Tropical Pacific. PLoS Biol 5: e77.

Saeed AI, Sharov V, White J, Li J, Liang W, Bhagabati Net al. (2003). TM4: a free, open-source system formicroarray data management and analysis. Biotechni-ques 34: 374–378.

Schloss PD, Handelsman J. (2005). Introducing DOTUR, acomputer program for defining operational taxonomicunits and estimating species richness. Appl EnvironMicrobiol 71: 1501–1506.

Schmidt TM, DeLong EF, Pace NR. (1991). Analysis of amarine picoplankton community by 16S rRNA genecloning and sequencing. J Bacteriol 173: 4371–4378.

Shi Y, Tyson GW, DeLong EF. (2009). Metatranscriptomicsreveals unique microbial small RNAs in the ocean’swater column. Nature 459: 266–269.

Strous M, Pelletier E, Mangenot S, Rattei T, Lehner A,Taylor MW et al. (2006). Deciphering the evolutionand metabolism of an anammox bacterium from acommunity genome. Nature 440: 790–794.

Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, KiryutinB, Koonin EV et al. (2003). The COG database: anupdated version includes eukaryotes. BMC Bioinfor-matics 4: 41.

Tringe SG, von Mering C, Kobayashi A, Salamov AA,Chen K, Chang HW et al. (2005). Comparativemetagenomics of microbial communities. Science308: 554–557.

Unluata U, Oquz T, Latif MA, Ozsoy E. (1990). On thephysical oceanography of the Turkish Straits. In: PrattLJ (ed). The Physical Oceanography of Sea Straits.Kluver: Dordrecht. pp 25–60.

Urich T, Lanzen A, Qi J, Huson DH, Schleper C, SchusterSC. (2008). Simultaneous assessment of soil microbialcommunity structure and function through analysis ofthe meta-transcriptome. PLoS One 3: e2527.

Varela MM, van Aken HM, Sintes E, Herndl GJ. (2008).Latitudinal trends of Crenarchaeota and Bacteria inthe meso- and bathypelagic water masses ofthe Eastern North Atlantic. Environ Microbiol 10:110–124.

Venter JC, Remington K, Heidelberg JF, Halpern AL,Rusch D, Eisen JA et al. (2004). Environmental genomeshotgun sequencing of the Sargasso Sea. Science 304:66–74.

Comparative metagenomics: Sea of MarmaraA Quaiser et al

303

The ISME Journal

Page 20: Comparative metagenomics of bathypelagic plankton and ...max2.ese.u-psud.fr/publications/2011-ISMEJ_metagenomics-Marmara-Sea.pdf · ORIGINAL ARTICLE Comparative metagenomics of bathypelagic

Vergin KL, Urbach E, Stein JL, DeLong EF, Lanoil BD,Giovannoni SJ. (1998). Screening of a fosmid library ofmarine environmental genomic DNA fragments revealsfour clones related to members of the order Planctomy-cetales. Appl Environ Microbiol 64: 3075–3078.

Vetriani C, Jannasch HW, MacGregor BJ, Stahl DA,Reysenbach AL. (1999). Population structure andphylogenetic characterization of marine benthic Ar-chaea in deep-sea sediments. Appl Environ Microbiol65: 4375–4384.

von Mering C, Hugenholtz P, Raes J, Tringe SG, Doerks T,Jensen LJ et al. (2007). Quantitative phylogeneticassessment of microbial communities in diverseenvironments. Science 315: 1126–1130.

Wagner-Dobler I, Biebl H. (2006). Environmental biologyof the marine Roseobacter lineage. Annu Rev Micro-biol 60: 255–280.

Woebken D, Teeling H, Wecker P, Dumitriu A, Kostadinov I,Delong EF et al. (2007). Fosmids of novel marinePlanctomycetes from the Namibian and Oregon coastupwelling systems and their cross-comparison withplanctomycete genomes. ISME J 1: 419–435.

Woyke T, Teeling H, Ivanova NN, Huntemann M, RichterM, Gloeckner FO et al. (2006). Symbiosis insightsthrough metagenomic analysis of a microbial consor-tium. Nature 443: 950–955.

Wright TD, Vergin KL, Boyd PW, Giovannoni SJ. (1997). Anovel delta-subdivision proteobacterial lineage fromthe lower ocean surface layer. Appl Environ Microbiol63: 1441–1448.

Yakimov MM, La Cono V, Denaro R, D’Auria G,Decembrini F, Timmis KN et al. (2007). Primaryproducing prokaryotic communities of brine, interfaceand seawater above the halocline of deep anoxiclake L’Atalante, Eastern Mediterranean Sea. ISME J 1:743–755.

Zaikova E, Walsh DA, Stilwell CP, Mohn WW, Tortell PD,Hallam SJ. (2010). Microbial community dynamics ina seasonally anoxic fjord: Saanich Inlet, BritishColumbia. Environ Microbiol 12: 172–191.

Zitter TAC, Henry P, Aloisi G, Delaygue G, Cagatay MN,Mercier de Lepinay B et al. (2008). Cold seeps alongthe main Marmara Fault in the Sea of Marmara(Turkey). Deep-Sea Research I 55: 552–570.

Supplementary Information accompanies the paper on The ISME Journal website (http://www.nature.com/ismej)

Comparative metagenomics: Sea of MarmaraA Quaiser et al

304

The ISME Journal