Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial...

30
Comparison of 26 Sphingomonad Genomes Reveals Diverse Environmental 1 Adaptations and Biodegradative Capabilities 2 3 Frank O. Aylward 1,2 , Bradon R. McDonald 1 , Sandra M. Adams 1,2 , Alejandra Valenzuela 1,2 , 4 Rebeccah A. Schmidt 1,2 , Lynne A. Goodwin 4,5 , Tanja Woyke 5 , Cameron R. Currie 1,2 , Garret 5 Suen 1,2 ,and Michael Poulsen 1,2,3,# 6 7 1 DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, 8 Wisconsin 53706, USA 9 2 Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin 53706, 10 USA 11 3 Section for Ecology and Evolution, Department of Biology, University of Copenhagen, 12 Universitetsparken 15, 2100 Copenhagen East, Denmark 13 4 Biosciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, 14 USA 15 5 DOE Joint Genome Institute, Walnut Creek, California 94598, USA 16 17 # Address for correspondence: Michael Poulsen, University of Copenhagen, Department of 18 Biology, Section for Ecology and Evolution, Universitetsparken 15, Building 3, 1st floor, 2100 19 Copenhagen East, Phone: +45 35321292, Fax: +45 35321250, E-mail: [email protected] 20 Running Title: Ecological Genomics of Sphingomonads 21 22 23 Copyright © 2013, American Society for Microbiology. All Rights Reserved. Appl. Environ. Microbiol. doi:10.1128/AEM.00518-13 AEM Accepts, published online ahead of print on 5 April 2013 on May 10, 2020 by guest http://aem.asm.org/ Downloaded from

Transcript of Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial...

Page 1: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

Comparison of 26 Sphingomonad Genomes Reveals Diverse Environmental 1

Adaptations and Biodegradative Capabilities 2

3

Frank O. Aylward1,2, Bradon R. McDonald1, Sandra M. Adams1,2, Alejandra Valenzuela1,2, 4

Rebeccah A. Schmidt1,2, Lynne A. Goodwin4,5, Tanja Woyke5, Cameron R. Currie1,2, Garret 5

Suen1,2,and Michael Poulsen1,2,3,# 6

7

1 DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, 8

Wisconsin 53706, USA 9

2 Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin 53706, 10

USA 11

3 Section for Ecology and Evolution, Department of Biology, University of Copenhagen, 12

Universitetsparken 15, 2100 Copenhagen East, Denmark 13

4 Biosciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, 14

USA 15

5 DOE Joint Genome Institute, Walnut Creek, California 94598, USA 16

17

# Address for correspondence: Michael Poulsen, University of Copenhagen, Department of 18

Biology, Section for Ecology and Evolution, Universitetsparken 15, Building 3, 1st floor, 2100 19

Copenhagen East, Phone: +45 35321292, Fax: +45 35321250, E-mail: [email protected] 20

Running Title: Ecological Genomics of Sphingomonads 21

22

23

Copyright © 2013, American Society for Microbiology. All Rights Reserved.Appl. Environ. Microbiol. doi:10.1128/AEM.00518-13 AEM Accepts, published online ahead of print on 5 April 2013

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 2: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

ABSTRACT: Sphingomonads comprise a physiologically versatile group within the 24

Alphaproteobacteria that include strains of interest for biotechnology, human health, and 25

environmental nutrient cycling. Here we compare 26 sphingomonad genome sequences to 26

gain insight into their ecology, metabolic versatility, and environmental adaptations. Our 27

multi-locus phylogenetic and average amino acid identity (AAI) analyses confirm that 28

Sphingomonas, Sphingobium, Sphingopyxis, and Novosphingobium are well-resolved 29

monophyletic groups with the exception of Sphingomonas sp. strain SKA58, which we 30

propose belongs to the genus Sphingobium. Our pan-genomic analysis of sphingomonads 31

reveals numerous species-specific ORFs but few signatures of genus-specific cores. The 32

organization and coding potential of the sphingomonad genomes appears highly variable, 33

and plasmid-mediated gene transfer and chromosome-plasmid recombination, together 34

with prophage- and transposon-mediated rearrangements, appear to play prominent roles 35

in the genome evolution of this group. We find many of the sphingomonad genomes encode 36

numerous oxygenases and glycoside hydrolases, which are likely responsible for their 37

ability to degrade various recalcitrant aromatic compounds and polysaccharides, 38

respectively. Many of these enzymes are encoded on megaplasmids, suggesting that they 39

may be readily transferred between species. We also identified enzymes putatively used for 40

the catabolism of sulfonate and nitroaromatic compounds in many of the genomes, 41

suggesting that plant-based compounds or chemical contaminants may be sources of 42

nitrogen and sulfur. Many of these sphingomonads appear adapted to oligotrophic 43

environments, but several contain genomic features indicative of host associations. Our 44

work provides a basis for understanding the ecological strategies employed by 45

sphingomonads and their role in environmental nutrient cycling. 46

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 3: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

INTRODUCTION 47

Bacteria belonging to the genera Sphingomonas, Sphingobium, Novosphingobium, and 48

Sphingopyxis, collectively referred to as sphingomonads, comprise an assortment of 49

physiologically diverse bacteria within the phylum Alphaproteobacteria (1). Members of this 50

group are broadly distributed in nature and have been isolated from a variety of environments 51

including human-associated niches (2), chemically-contaminated water and sediment (3), 52

activated sludge from wastewater treatment plants (4), seawater (5), and stromatolites (6). Many 53

sphingomonads have been demonstrated to play an important role in nutrient cycling, especially 54

in oligotrophic environments (5). Others reside in various plant- and animal-associated 55

ecosystems, and can even be responsible for nosocomial infections in humans (2). 56

A salient characteristic of many sphingomonads is their ability to degrade a variety of 57

recalcitrant aromatic compounds and polysaccharides (1). A number of sphingomonads have 58

been isolated from environments heavily contaminated with pesticides, herbicides, and other 59

xenobiotics, and some have been shown to use these compounds as a sole carbon source (7, 8). 60

Some sphingomonads have been shown to degrade dioxins (9), diphenyl ethers (7, 10), 61

polyethylene glycol (11), dibenzofurans (12, 13), and many other aromatic and chloroaromatic 62

compounds (8, 14). Moreover, many strains are known to metabolize lignin (15), an aromatic 63

polymer abundant in plant cell walls. Taken together with the ability of many sphingomonads to 64

deconstruct polysaccharides (16, 17), it is clear that these bacteria play important roles in the 65

degradation of diverse recalcitrant compounds in the environment. 66

To provide insight into the ecology of sphingomonads, we compared the genomes of 26 67

bacteria belonging to the genera Sphingomonas, Sphingobium, Novosphingobium, and 68

Sphingopyxis. Additionally, we sequenced and analyzed draft genomes of Sphingomonas sp. 69

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 4: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

strain Mn802 and Sphingomonas sp. strain PR090111-T3T-6A, two isolates associated with 70

termites. The primary objectives of this study were to use genomic tools to shed light on the 71

ecological strategies utilized by these bacteria and to broaden our knowledge of the evolutionary 72

processes affecting their genomes. We compared the architecture of all sphingomonad genomes 73

and, through analysis of the sphingomonad pan-genome, identified shared and variably 74

distributed genes. Using targeted comparative analyses of specific genes involved in nutrient 75

uptake and metabolism, our work provides hypotheses regarding the genomic content necessary 76

for sphingomonads to thrive in a diversity of environments. Given the role many sphingomonads 77

have been postulated to play in the cycling of recalcitrant carbon compounds in the environment, 78

we also compared the complement of genes involved in the degradation of polysaccharides and 79

aromatic compounds. 80

MATERIALS AND METHODS 81

Termite Collection and Bacterial Isolations 82

Termite colonies were sampled from natural populations in South Africa and Puerto Rico 83

(Table S1). Sphingomonas sp. strain Mn802 was isolated from a worker in a nest of the fungus-84

growing termite Macrotermes natalensis in 2008, and strain PR090111-T3T-6A was isolated 85

from nest material from a Nasutitermes sp. in 2009. For strain Mn802, an underground fungus-86

growing termite colony of Macrotermes natalensis was excavated in South Africa and workers 87

were collected using sterile forceps. Workers were kept together with fungus material in alcohol-88

wiped plastic containers until they were processed at the Forestry and Agricultural 89

Biotechnology Institute (FABI) at the University of Pretoria, South Africa. In the laboratory, 90

major workers were ground in 0.5 mL water and a homogenate was plated on 1% carboxymethyl 91

cellulose (CMC) medium (10 g CMC and 20 g agar/L), and a Sphingomonas isolate was 92

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 5: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

obtained through serial transfer. For strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 93

sampled from a termitarium in Bosque Estatal Maricao, Puerto Rico in 2009. A homogenate of 94

nest material in sterile water was plated on chitin medium (4 g chitin, 0.7 g K2HPO4, 0.3 g 95

KH2PO4, 0.5 g MgSO4•5H2O, 0.01 g FeSO4•7H2O, 0.001 g ZnSO4, 0.0001 g MnCl2, 1 L water) 96

(18), and colonies on the plates were isolated by serial transfer. 97

Genome Sequencing 98

DNA was extracted from two pure cultures of Sphingomonas isolates grown in liquid LB 99

media using the Qiagen DNeasy Blood & Tissue Kit (Qiagen California, USA). Shotgun 100

sequencing libraries were constructed at the DOE Joint Genome Institute (JGI), and all aspects 101

related to library construction and sequencing are available at http://www.jgi.doe.gov/. Whole-102

genome sequencing was performed for both isolates using a single picotiter plate on a 454 FLX 103

Titanium pyrosequencer (19) and de novo assembled using Newbler version 2.3. The draft 104

genome of strain Mn802 consists of 25 contigs comprising 3.19 Mb, while the genome of strain 105

PR090111-T3T-6A consists of 26 contigs comprising 3.91 Mb. The raw 454 sequences for 106

Sphingomonas spp. strain Mn802, and strain PR090111-T3T-6A are available from the NCBI 107

Sequence Read Archive (SRA) under accession numbers SRA037852 and SRA037853, 108

respectively, and the assemblies used in this study have been deposited in 109

DDBJ/EMBL/GenBank under the accession numbers AORY00000000 and AORL00000000, 110

respectively. 111

Genome Annotations 112

Proteins from all draft and complete genomes were predicted de novo using Prodigal (20). 113

Protein clusters were constructed through comparison with previously-constructed clusters 114

available on the eggNOG Database version 3.0 (21) using HMMER (22). To do this, all bacterial 115

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 6: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

cluster alignments (bactNOGs) available from eggNOG were downloaded on 09/28/2011, and 116

HMMer3 was used to construct hidden-Markov model profiles for each cluster. For the clusters 117

that lacked alignments on the eggNOG FTP, alignments were generated using MUSCLE (23). 118

All sphingomonad proteins predicted were compared to these clusters using hmmscan from 119

HMMer3, with a minimum e-value < 1e-5. Proteins were designated as belonging to the same 120

cluster as their top HMMer model hit. Raw data regarding cluster membership can be found in 121

Dataset S1. 122

To identify distinguishing genomic features in the three sphingomonad genera for which 123

multiple genomes are available (Sphingomonas, Sphingobium, and Novosphingobium), we 124

performed an enrichment analysis of the protein clusters represented in these genera. To do this, 125

we used Fisher’s Exact Test (P < 0.01) to identify protein clusters that were over-represented in 126

one of these genera compared to the other two genera combined (raw data available in Dataset 127

S1). For our analysis of highly variable coding potential, protein clusters with highly variable 128

membership across the sphingomonads were identified by calculating the standard deviation of 129

cluster membership across all genomes. The annotations of representative proteins belonging to 130

the clusters with the highest variability (defined here as having a standard deviation > 1.5) were 131

analyzed using the same tools implemented for whole-genome annotations (see below). 132

Visualization of Venn diagrams was performed using the tool Venny 133

(http://bioinfogp.cnb.csic.es/tools/venny/index.html). 134

Proteins predicted from the sphingomonad genomes were annotated using the KEGG 135

KAAS server (24) and by comparison with the Clusters of Orthologous Groups (COG) database 136

(25) using RPSBLAST (26) (e-value < 1e-5). To identify transposons we found all proteins with 137

hits to one or more transposon-associated COGs (full list and details in Dataset S2). Prophage 138

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 7: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

and phage-like proteins on the chromosomes of the complete sphingomonad genomes were 139

identified using PhiSpy (27) with the associated GenBank files available on the GenBank 140

database (http://www.ncbi.nlm.nih.gov/genbank/index.html) as the input. Oxygenases were 141

determined by identifying all hits to mono- and di-oxygenases in the KEGG Orthology (28) 142

database (full KEGG annotations are available in Dataset S3). For Carbohydrate Active Enzymes 143

(CAZymes, (29)), predictions were made as previously described (30) (full annotations for 144

oxygenases and CAZymes are available in Dataset S2). To identify transposons we found 145

proteins with homology to 27 COG profiles associated with these elements (details and full 146

annotations in Dataset S2). 147

Phylogenetic Analyses 148

A multi-locus phylogenetic tree was constructed by identifying orthologous protein-149

coding genes between genomes. This was done by identifying all universally conserved eggNOG 150

clusters and, for each genome, retaining only the protein having the highest HMMER hit to the 151

Markov model for that cluster. This resulted in 282 universally conserved protein-coding genes 152

for each genome. Protein sequences for each conserved cluster were aligned using the E-INS-i 153

algorithm implemented in MAFFT v6 (31), and then converted to nucleotide alignments. These 154

were concatenated and used to generate a phylogeny using RAxML (32). The concatenated 155

alignment was partitioned by gene and GTRGAMMA model parameters were estimated for each 156

partition of the alignment separately. The phylogeny was generated with 100 bootstrap iterations. 157

For the Average Amino Acid Identity (AAI) analyses, we used a strategy similar to one 158

previously described (33). AAI values were calculated for each pair of genomes using the 159

reciprocal best BLASTP hits of their predicted proteins (parameters –X 150, -q -1, -F F, -e 1e-5). 160

The AAI values were then converted into distances by subtracting them from 100. The 161

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 8: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

dendrogram was created by using the resulting pairwise distances as input for the Neighbor 162

package in Phylip (34). Both the multi-locus phylogeny and AAI dendrogram were visualized 163

using the interactive Tree of Life (iTOL) (35). The pairwise synteny plots were constructed using 164

the same reciprocal best BLASTP hits used for the AAI analysis and plotting them using R (36). 165

RESULTS AND DISCUSSION 166

Phylogenetic Analyses of the Sphingomonads 167

We analyzed the phylogenetic relationships of the sphingomonad genomes using both 168

multi-locus sequence analysis (MLSA) and Average Amino acid Identity (AAI). For the MLSA, 169

we used a concatenated alignment of 248 orthologous genes found to be universally conserved 170

across all genomes. The resulting phylogeny revealed distinct clustering of the four genera, 171

consistent with biochemical and 16S-based phylogenetic analyses distinguishing these groups 172

(37) (Figure 1). Maximum-likelihood bootstrap support was 100% for all nodes, and the 173

topology of this tree was identical to a dendrogram created by clustering pairwise AAI analysis 174

(Figure 1), suggesting well-supported phylogenetic relationships. The pairwise AAI values 175

ranged from 55.4% (between Sphingomonas wittichii RW1 and Novosphingobium sp. strain 176

AP12) and 99.5% (between Sphingomonas wittichii species RW1 and DP58) (all available in 177

Dataset S4). Despite its classification in the genus Sphingomonas, both our MLSA and AAI 178

analysis revealed that strain SKA58 clusters within the genus Sphingobium in the same clade as 179

Sphingobium yanoikiyae XLDN2-5 and strain AP49 (Figure 1), suggesting that this bacterium 180

should be classified within this genus. 181

General Genomic Organization and Selfish Genetic Elements 182

Statistics for the 26 genomes analyzed in this study can be found in Table 1. Considerable 183

variation in the organization of the seven complete genomes was observed (Table 2). For 184

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 9: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

example, Sphingobium chlorophenolicum L-1 and Sphingobium japonicum UT26S contain two 185

chromosomes, while all other complete genomes contain only one. Synteny analysis revealed 186

only a small degree of conserved gene order between the complete genomes (Figure S1). As 187

noted previously (38), a small degree of shared gene order is present between Sphingobium 188

japonicum UT26S and Sphingobium chlorophenolicum L-1. These genomes appear to have 189

undergone multiple chromosomal rearrangements since their divergence, and neither share 190

identifiable synteny with strain SYK-6. Also, only a small degree of synteny was observed 191

between the chromosomes of Novosphingobium aromaticivorans DSM 12444 and 192

Novosphingobium sp. strain PP1Y (Figure S1). 193

All of the complete genomes, except that of Sphingopyxis alaskensis RB2256, contained 194

megaplasmids (plasmids >100 Kb, (39)), consistent with previous suggestions that these 195

replicons are common in sphingomonads (40). Although some megaplasmids were comparable 196

in size to the secondary chromosomes of Sphingobium chlorophenolicum L-1 and Sphingobium 197

japonicum UT26S, these replicons do not encode 16S rRNA operons or translational machinery 198

characteristic of chromosomes (see the Translation category in Figure S2). Furthermore, the 199

metabolic potential encoded on the different replicons is markedly different, with the vast 200

majority of proteins involved in essential processes such as amino acid and coenzyme 201

metabolism encoded on the chromosomes, while hypothetical proteins are more commonly 202

encoded on megaplasmids and plasmids (Figure S2). 203

We identified type IV secretion system components encoded in the majority of the 204

genomes (Table S2), raising the possibility for natural transformation or plasmid-mediated gene 205

transfer as mechanisms for gene exchange. Components of the type IV apparatus are encoded on 206

the megaplasmid of each megaplasmid-containing bacteria except Novosphingobium 207

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 10: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

aromaticivorans DSM 12444, suggesting that these replicons may encode their own mechanisms 208

for transfer. Experiments demonstrating the transfer of the 184.5 Kb megaplasmid pNL1 of 209

Novosphingobium aromaticivorans DSM 12444 to other sphingomonads has previously been 210

reported (40), and these experiments also observed rapid recombination between the transferred 211

plasmid and the chromosome of the new host. Moreover, ~120 open reading frames (ORFs) with 212

shared synteny and > 98% amino acid identity are found on plasmid pSLGB of Sphingobium sp. 213

strain SYK-6, the main chromosome of Sphingobium japonicum UT26S, and the plasmid 214

pSPHCH01 of Sphingobium chlorophenolicum L-1 (previously reported in (38, 41)), suggesting 215

recent transfer events. The transfer of essential genes to plasmids has been postulated to give rise 216

to secondary chromosomes in other bacteria (42), providing a potential explanation for the multi-217

chromosomal arrangement in Sphingobium chlorophenolicum L-1 and Sphingobium japonicum 218

UT26S. 219

Other selfish genetic elements, such as prophage and transposons, were also prevalent 220

features of the sphingomonads. The program PhiSpy (27) identified several prophage encoded on 221

the chromosomes (Table 2), with the highest number predicted from Novosphingobium 222

aromaticivorans DSM 12444, where 5 putative prophage predicted to contribute 9.3% of all 223

proteins encoded by this bacterium. Novosphingobium sp. strain PP1Y encoded 3 prophage 224

contributing 113 protein-coding genes to its chromosome. Between 1-2 putative prophage were 225

also identified in all chromosomes of the complete genomes, with the exception of the smaller 226

chromosome of Sphingobium japonicum UT26S (Table 2). COG-based annotation also identified 227

a highly variable number of transposons in the sphingomonads, with between 2-65 of these 228

elements encoded in the complete and draft genomes (Table 2, Dataset S2). The largest number 229

was found in Sphingomonas sp. strain PAMC 26621, Novosphingobium sp. strain Rr 2-17, and 230

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 11: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

the Sphingobium yanoikuyae XLDN2-5 (63-65 each), while the fewest were identified in the two 231

termite-associated Sphingomonas strains (Mn802 and PR090111-T3T-6A). 232

Pan-Genomics of the Sphingomonads 233

To investigate the pan-genome of the sphingomonads, we used Prodigal (20) to identify a 234

total of 110,698 proteins. Comparison of proteins encoded in the sphingomonad genomes to the 235

eggNOG Database (21), which contains previously generated protein clusters from bacterial 236

genomes, revealed a total of 23,087 protein clusters, including 248 that were universally 237

conserved between all genomes and 492 that were conserved between the complete genomes. 238

These genes consisted primarily of ribosomal proteins and proteins involved in replication, DNA 239

repair, amino acid metabolism, transcription, and other basic metabolic functions (Figure S3). 240

Although numerous genus-specific protein clusters were identified in our analyses 241

(Figure 2a), the majority of these were only represented in one or a few genomes within genera 242

(Figure 2b), indicating that they do not represent features that consistently distinguish the 243

sphingomonad genera. Note that because the genome of Sphingopyxis alaskensis RB2256 is the 244

only representative of its genus, it was not possible to ascertain whether the protein clusters 245

unique to this bacterium are representative of other Sphingopyxis isolates. To investigate 246

consistent differences in coding potential between sphingomonad genera for which multiple 247

genomes are available, we identified protein clusters that were over-represented in either of the 248

genera Sphingomonas, Sphingobium, or Novosphingobium compared to the other two combined 249

(Fisher’s Exact test, P < 0.01). Using this approach we identified 87 protein clusters over-250

represented in Sphingomonas, 163 in Sphingobium, and 145 in Novosphingobium (details in 251

Dataset S1). Interestingly, most of these clusters have no COG annotation, a general or unknown 252

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 12: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

functional prediction, or are associated with DNA processing (Figure S4), suggesting that they 253

encode selfish genetic elements rather than conserved metabolic processes. 254

Consistent with inter-genus findings, the majority of protein clusters identified as highly 255

variable are predicted to encode transposases, phage-related proteins, or proteins that could not 256

be annotated (Figure 3). Some of these variably distributed proteins encode for glycoside 257

hydrolases (GHs) and oxygenases, suggesting that sphingomonad genomes may be particularly 258

affected by adaptations related to changes in biodegradative capabilities. Clustering analysis of 259

the variable protein clusters revealed that many were represented in specific genera or species, 260

suggesting that shared ancestry may be responsible for their distribution (Figure 3). 261

Sphingomonas wittichii strains RW1 and DP58, for example, both encoded multiple proteins 262

belonging to clusters annotated as outer membrane receptors, dehydrogenases, and transposases 263

(Figure 3). Overall, however, our pan-genomic analysis revealed no large-scale differences in 264

coding potential that distinguish the four sphingomonad genera. 265

Aromatic Compound and Polysaccharide Degradation 266

To investigate the capacity of sphingomonads to degrade aromatic compounds, we 267

analyzed the encoded mono- and di-oxygenases, enzymes that catalyze the ring cleavage step 268

critical to aromatic compound degradation (43). The distribution of mono- and dioxygenases 269

varied by an order of magnitude across the sphingomonad phylogeny, with between 4-43 270

monooxygenases and 3-46 dioxygenases encoded (Figure 1, Table S3). Both Sphingomonas 271

wittichii RW1 and DP58 encoded the largest complement of oxygenases, although Sphingobium 272

chlorophenolicum L-1, Sphingobium sp. strain SYK-6, and all Novosphingobium species except 273

for Novosphingobium nitrogenifigens Y88T also encoded a large complement of these enzymes 274

(Table S3). The most abundant monooxygenase families identified included nitronate 275

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 13: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

monooxygenase (NMO, K000459), vanillate monooxygenases (vanA, K03862), and 276

alkanesulfonate monooxygenase (ssuD, K04091), while the most abundant dioxygenase families 277

included taurine dioxygenase (tauD, K03119) and 4-hydroxyphenylpyruvate dioxygenase (hppD, 278

K00457) (Table 3). Both Spingomonas wittichii RW1 and DP58 contained the largest number of 279

predicted vanillate monooxygenases (24 and 19, respectively) as well as multiple dioxin 280

dioxygenases (hcaA homologs, K05708), the latter being generally absent from the other 281

sphingomonad genomes (Dataset S2). 282

The abundance of tauD and ssuD homologs in many of the genomes suggests that 283

sulfonated compounds may be sources of both carbon and sulfur for many of the 284

sphingomonads. Bacteria have been shown to use a variety of organosulfur compounds as carbon 285

and sulfur sources, and these compounds are abundant in soil and aquatic environments, as well 286

as in environmental pollutants and fossil fuels (44). Novosphingobium sp. strain AP12, isolated 287

from the rhizosphere of the cottonwood tree (Populus deltoides), for example, contained a 288

particularly large amount of these enzymes (16 ssuD homologs and 17 tauD homologs). 289

Homologs to the sulfonate transport apparatus (ssuABC) were also identified in strain AP12 as 290

well as all of the soil- or rhizosphere associated Sphingobium species (S. chlorophenolicum L-1, 291

S. japonicum UT26S, strain AP49, and S. yanoikuyae XLDN2-5) (Table S2). Interestingly, S. 292

japonicum UT26S contained all ssuABC homologs but no ssuD dioxygenase, suggesting it may 293

degrade sulfur-containing compounds using alternative enzymes (Tables 3 and S2, Dataset S3). 294

In contrast to ssuD and tauD, the presence of nitronate monooxygenase (NMO, K00457) 295

homologs in the sphingomonad genomes was more consistent, with all genomes coding for 296

between 2-7 copies. The largest complement of these enzymes was found in the Sphingobium 297

chlorophenolicum L-1 and Sphingobium japonicum UT26S. NMO preferentially acts on 298

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 14: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

nitroalkanes, raising the possibility that these compounds may be naturally occurring sources of 299

both nitrogen and carbon to some sphingomonads. 300

Many of the sphingomonads contained a diversity of GHs, suggesting adaptations to 301

degrade plant polysaccharides. Sphingomonas elodea ATCC 31461 and Sphingomonas sp. 302

ATCC 31555 contained the largest complement of these enzymes (99 and 97, respectively) 303

including cellulases and hemicellulases (Figure 1). The genomes with the most predicted GHs 304

also encoded representatives from the cellulase families GH5, GH6, and GH9 (Table 3), in 305

support of their ability to hydrolyze cellulose. The genomes of these also encode a diversity of 306

other GHs, carbohydrate esterases, and polysaccharide lyases that likely target pectins, xylans, 307

mannans, and other plant cell wall polysaccharides (Table S3). The isolates with the largest 308

complement of hemicellulases and cellulases, Sphingomonas elodea ATCC 31461, 309

Sphingomonas sp. ATCC 31555, Sphingobium yanoikuyae XLDN2-5, and Sphingobium sp. 310

strain AP49, were isolated from environments rich in plant biomass, such as plant tissue, soil, 311

and pond water, suggesting sphingomonad participation in the degradation of environmental 312

plant biomass. 313

Among the sphingomonads with the least GHs (20-22 each) were strains encoding a large 314

number of oxygenases (e.g., Sphingomonas wittichii strains RW1 and DP58 and Sphingobium sp. 315

strain SYK-6; Figure 1, Table S3), suggesting specialization in the degradation of aromatic 316

compounds and lacking capacity to deconstruct recalcitrant polysaccharides. However, although 317

Novosphingobium nitrogenifigens Y88T encoded only 17 GHs, it also lacked a substantial coding 318

potential for oxygenases compared to other Novosphingobium strains (Figure 1, Table S3), 319

suggesting that the degradation of a wide diversity of recalcitrant carbon compounds may be less 320

important for this strain compared to other sphingomonads. 321

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 15: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

Analysis of the seven complete sphingomonad genomes revealed that many oxygenases 322

and GHs are encoded on plasmids or megaplasmids (Table 2). In Novosphingobium sp. strain 323

PP1Y, more GHs were encoded on the megaplasmid than the chromosome (41 versus 36), and 324

the plasmids of Sphingomonas wittichii RW1 and Novosphingobium aromaticivorans DSM 325

12444 contain several dioxygenases. Moreover, the smaller of the two chromosomes in 326

Sphingobium chlorophenolicum L-1 has been shown to encode the genes necessary to degrade 327

the aromatic pesticide pentachlorophenol, and evidence suggests that three genes involved have 328

arisen through multiple horizontal gene transfer events (38). Together with the high incidence of 329

chromosomal rearrangements and plasmid-mediated gene transfers observed, this is consistent 330

with the assertion that plasmids and other selfish genetic elements are likely responsible for 331

frequent exchange of biodegradative capabilities in sphingomonads (14). 332

Adaptations to the Environment 333

All of the sphingomonad genomes contained between 1-4 copies of the 16S rRNA gene, 334

with the majority containing one or two copies (Table 1). Previous work has shown that low 16S 335

rRNA copy number is a characteristic commonly associated with bacteria that inhabit a 336

specialized niche and do not rapidly respond to changing environmental conditions (45). 337

Together with the specific sets of oxygenases and GHs encoded by most of the sphingomonads, 338

it seems likely that many of these bacteria are specialized heterotrophs capable of utilizing 339

components of refractory detritus or xenobiotics as carbon, sulfur, and phosphorus sources. This 340

has previously been suggested for Sphingopyxis alaskensis RB2256 (46-48), Sphingobium 341

japonicum UT26S (49), and sphingomonads residing in subsurface sediment, such as 342

Novosphingobium aromaticivorans DSM 12444 (50). This is likely a successful ecological 343

strategy in oligotrophic environments, where few alternative nutrient sources would be available 344

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 16: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

and where other microbes would likely not possess the enzymatic machinery necessary to 345

compete for their use. Thus, rather than remaining dormant for long periods of time and doubling 346

rapidly when environmental conditions become favorable, sphingomonads may be adapted to 347

specialize in the utilization of particular recalcitrant compounds in oligotrophic conditions. 348

Adaptation to oligotrophic conditions has already been shown for the marine bacterium 349

Sphingopyxis alaskensis RB2256, an important contributors to nutrient cycling in marine 350

ecosystems (46, 47, 51). This ultramicrobacterium, which maintains an extremely small cell 351

volume (< 0.08 μm3), has adapted to persist in seawater containing extremely low concentrations 352

of amino acids and other nutrients (5, 47). Previous analyses have suggested that this genome has 353

not undergone the same degree of genomic reduction as has been observed in other marine 354

ultramicrobacteria (47, 52). This species has the second-smallest genome of all of the 355

sphingomonads analyzed here (smallest among the complete genomes) and is the only complete 356

sphingomonad genome lacking a megaplasmid, suggesting that it may in fact have undergone a 357

considerable amount of genomic streamlining compared to other sphingomonads. 358

Despite the apparent adaptations of many sphingomonads to oligotrophic conditions, 359

many of these bacteria can be found in host-associated environments that are likely more 360

nutrient-rich. Many of the species analyzed in this study were isolated from environments 361

including termites, lichen, plant tissue, or the rhizosphere (Table 1). Type IV secretion systems, 362

implicated in host-microbe interactions in other Alphaproteobacteria (53), were identified in 363

most of the sphingomonad genomes (Table S2), raising the possibility that these systems are 364

used during host interaction, in addition to their likely role in DNA transfer. Many of these 365

associations may, however, be short-term and not involve host-microbe specificity. For example, 366

some of the sphingomonads isolated from the rhizosphere, such as Novosphingobium sp. strain 367

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 17: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

AP49 and Sphingobium sp. strain AP12, encode genes for organosulfur and nitroaromatic 368

compound utilization, indicating that they may subsist on recalcitrant compounds or secondary 369

metabolites produced by potential plant associates. These bacteria may therefore degrade these 370

compounds in the soil or similar environments, without the necessity for any specific association 371

with metazoan hosts. 372

Two Sphingomonas species sequenced as part of this study, strain Mn802 and strain 373

PR090111-T3T-6A, were isolated from termites (Table S1). Strain PR090111-T3T-6A encodes 374

80 CAZymes, suggesting it may play a role in the degradation of the copious quantities of plant 375

biomass ingested by its potential termite host. Strain Mn802, however, encodes relatively few 376

CAZymes and has the smallest genome size of all the sphingomonads, suggesting that it may 377

have reduced biodegradative capabilities and subsists on easily accessible nutrients readily 378

available in the termite environment. Sphingomonads have previously been isolated from 379

termites (54), indicating that members of this group may have a consistent but as-of-yet 380

uncharacterized termite-association. If this is correct, our analyses suggest that termite-associated 381

sphingomonads do not all partake in the breakdown of recalcitrant plant material, but rather may 382

serve alternative functions in the associations. 383

Conclusions 384

Consistent with previous biochemical characterizations and analyses of 16S rDNA (37), 385

the genera Sphingomonas, Sphingobium, Novosphingobium, and Sphingopyxis form well-386

resolved phylogenetic clades at the whole-genome level, although Sphingomonas sp. strain 387

SKA58 appears to belong in the genus Sphingobium. A large number of species-specific genes 388

were present in the sphingomonad genomes, and few genomic features can reliably distinguish 389

the genera. Selfish genetic elements appear to be prominent forces shaping this genome 390

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 18: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

evolution, and megaplasmids, prophage, transposons, and frequent chromosomal rearrangements 391

are prevalent features in the group. Many sphingomonads appear to successfully occupy 392

oligotrophic niches and utilize recalcitrant compounds as sources of carbon, nitrogen, and sulfur. 393

Species with few oxygenases or GHs may have evolved to live in more nutrient-rich 394

environments, including host-associations, while others may have undergone genomic 395

streamlining to adapt to extreme oligotrophic conditions. As more information emerges on the 396

distribution and catabolic activities of the sphingomonads, insights into their ecology will greatly 397

advance our understanding of their impact on global nutrient cycles and potential applications to 398

biotechnology. 399

400

Acknowledgements 401

This work was funded by the DOE Great Lakes Bioenergy Research Center (DOE BER Office 402

of Science DE-FC02-07ER64494) supporting FOA, SMA, SW, AV, RAS, CRC, GS, and MP, 403

and by a STENO stipend awarded by the Danish Agency for Science, Technology and 404

Innovation to MP. The work conducted by the US Department of Energy Joint Genome Institute 405

is supported by the Office of Science of the US Department of Energy under Contract No. DE-406

AC02-05CH11231. We thank W. de Beer, M. Wingfield and the Forestry and Agricultural 407

Biotechnology Institute, University of Pretoria, for assistance, and three anonymous reviewers 408

for helpful suggestions to improve this work. 409

410

411

412

413

414

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 19: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

Figure Legends 415

416

Figure 1. Maximum-likelihood MLSA of 26 sphingomonad genomes constructed from 417

concatenated nucleotide sequences of 248 universally conserved genes, together with the 418

distribution of glycoside hydrolases (GHs) and mono- and di-oxygenases identified in the 419

sphingomonad genomes. The topology of this tree is identical to the AAI dendrogram created 420

from the proteins encoded in these genomes. Boostrap support for the MLSA is given at each 421

node, and the bar indicates the number of nucleotide substitutions per site. GHs were divided 422

into categories based on their predicted substrate as follows: oligosaccharides: GH1, GH2, GH3, 423

GH13, GH15, GH28, GH29, GH31, GH35, GH36, GH39, GH42, GH43, GH92; hemicelluloses: 424

GH8, GH10, GH26, GH51, GH53, GH105; chitin and peptidoglycan: GH18, GH20, GH23, 425

GH24, GH25, GH73; cellulose: GH5, GH6, GH9, GH44; other: all other GH families. 426

427

Figure 2. A) Venn diagram showing the number of shared and unique protein clusters identified 428

in the 4 sphingomonad genera. B) Distribution of protein clusters identified in the complete and 429

draft sphingomonad genomes. The x-axis represents the number of genomes encoding a member 430

of a protein cluster, and the y-axis represents the total number of clusters having that many 431

genomes represented. 432

433

Figure 3. Heatmap representing the sphingomonad protein clusters with the most variable 434

distributions across the 26 genomes. The dendrogram was constructed using the pairwise 435

Pearson’s r correlation calculated for the protein cluster membership distributions (raw values in 436

Dataset S1). 437

438

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 20: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

References 439

440

1. Balkwill, DL, Fredrickson, J, and Romine, M. 2006. The prokaryotes: a handbook on 441 the biology of bacteria, chapter 6.10, Sphingomonas and related genera. Springer, 442 Singapore. 443

2. Ryan, MP, and Adley, CC. 2010. Sphingomonas paucimobilis: a persistent Gram-444 negative nosocomial infectious organism. J. Hosp. Infect. 75:153-7. 445

3. Miller, TR, Delcher, AL, Salzberg, SL, Saunders, E, Detter, JC, and Halden, RU. 446 2010. Genome sequence of the dioxin-mineralizing bacterium Sphingomonas wittichii 447 RW1. J. Bacteriol. 192:6101-2. 448

4. Hu, A, He, J, Chu, KH, and Yu, CP. 2011. Genome sequence of the 17β-estradiol-449 utilizing bacterium Sphingomonas strain KC8. J. Bacteriol. 193:4266-7. 450

5. Eguchi, M, Ostrowski, M, Fegatella, F, Bowman, J, Nichols, D, Nishino, T, and 451 Cavicchioli, R. 2001. Sphingomonas alaskensis strain AFO1, an abundant oligotrophic 452 ultramicrobacterium from the North Pacific. Appl. Environ. Microbiol. 67:4945-54. 453

6. Farias, ME, Revale, S, Mancini, E, Ordonez, O, Turjanski, A, Cortez, N, and 454 Vazquez, MP. 2011. Genome sequence of Sphingomonas sp. S17, isolated from an 455 alkaline, hyperarsenic, and hypersaline volcano-associated lake at high altitude in the 456 Argentinean Puna. J. Bacteriol. 193:3686-7. 457

7. Coronado, E, Roggo, C, Johnson, DR, and van der Meer, JR. 2012. Genome-wide 458 analysis of salicylate and dibenzofuran metabolism in Sphingomonas wittichii RW1. 459 Front. Microbiol. 3:300. 460

8. Rentz, JA, Alvarez, PJ, and Schnoor, JL. 2008. Benzo(a)pyrene degradation by 461 Sphingomonas yanoikuyae JAR02. Environ. Pollut. 151:669-77. 462

9. Wittich, RM, Wilkes, H, Sinnwell, V, Francke, W, and Fortnagel, P. 1992. 463 Metabolism of dibenzo-p-dioxin by Sphingomonas sp. strain RW1. Appl. Environ. 464 Microbiol. 58:1005-10. 465

10. Schmidt, S, Wittich, RM, Erdmann, D, Wilkes, H, Francke, W, and Fortnagel, P. 466 1992. Biodegradation of diphenyl ether and its monohalogenated derivatives by 467 Sphingomonas sp. strain SS3. Appl. Environ. Microbiol. 58:2744-50. 468

11. Sugimoto, M, Tanabe, M, Hataya, M, Enokibara, S, Duine, JA, and Kawai, F. 2001. 469 The first step in polyethylene glycol degradation by sphingomonads proceeds via a 470 flavoprotein alcohol dehydrogenase containing flavin adenine dinucleotide. J. Bacteriol. 471 183:6694-8. 472

12. Harms, H, Wilkes, H, Wittich, R, and Fortnagel, P. 1995. Metabolism of 473 hydroxydibenzofurans, methoxydibenzofurans, acetoxydibenzofurans, and 474 nitrodibenzofurans by Sphingomonas sp. strain HH69. Appl. Environ. Microbiol. 475 61:2499-505. 476

13. Wilkes, H, Wittich, R, Timmis, KN, Fortnagel, P, and Francke, W. 1996. 477 Degradation of chlorinated dibenzofurans and dibenzo-p-dioxins by Sphingomonas sp. 478 strain RW1. Appl. Environ. Microbiol. 62:367-71. 479

14. Stolz, A. 2009. Molecular characteristics of xenobiotic-degrading sphingomonads. Appl. 480 Microbiol. Biotechnol. 81:793-811. 481

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 21: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

15. Masai, E, Katayama, Y, Nishikawa, S, and Fukuda, M. 1999. Characterization of 482 Sphingomonas paucimobilis SYK-6 genes involved in degradation of lignin-related 483 compounds. J. Ind. Microbiol. Biotechnol. 23:364-373. 484

16. Sutherland, IW, and Kennedy, L. 1996. Polysaccharide lyases from gellan-producing 485 Sphingomonas spp. Microbiology 142 (Pt 4):867-72. 486

17. Hashimoto, W, Miyake, O, Momma, K, Kawai, S, and Murata, K. 2000. Molecular 487 identification of oligoalginate lyase of Sphingomonas sp. strain A1 as one of the enzymes 488 required for complete depolymerization of alginate. J. Bacteriol. 182:4572-7. 489

18. Hsu, SC, and Lockwood, JL. 1975. Powdered chitin agar as a selective medium for 490 enumeration of actinomycetes in water and soil. Appl. Microbiol. 29:422-6. 491

19. Margulies, M, Egholm, M, Altman, WE, Attiya, S, Bader, JS, Bemben, LA, Berka, 492 J, Braverman, MS, Chen, YJ, Chen, Z, Dewell, SB, Du, L, Fierro, JM, Gomes, XV, 493 Godwin, BC, He, W, Helgesen, S, Ho, CH, Irzyk, GP, Jando, SC, Alenquer, ML, 494 Jarvie, TP, Jirage, KB, Kim, JB, Knight, JR, Lanza, JR, Leamon, JH, Lefkowitz, 495 SM, Lei, M, Li, J, Lohman, KL, Lu, H, Makhijani, VB, McDade, KE, McKenna, 496 MP, Myers, EW, Nickerson, E, Nobile, JR, Plant, R, Puc, BP, Ronan, MT, Roth, 497 GT, Sarkis, GJ, Simons, JF, Simpson, JW, Srinivasan, M, Tartaro, KR, Tomasz, A, 498 Vogt, KA, Volkmer, GA, Wang, SH, Wang, Y, Weiner, MP, Yu, P, Begley, RF, and 499 Rothberg, JM. 2005. Genome sequencing in microfabricated high-density picolitre 500 reactors. Nature 437:376-80. 501

20. Hyatt, D, Chen, GL, Locascio, PF, Land, ML, Larimer, FW, and Hauser, LJ. 2010. 502 Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC 503 Bioinformatics 11:119. 504

21. Powell, S, Szklarczyk, D, Trachana, K, Roth, A, Kuhn, M, Muller, J, Arnold, R, 505 Rattei, T, Letunic, I, Doerks, T, Jensen, LJ, von Mering, C, and Bork, P. 2012. 506 eggNOG v3.0: orthologous groups covering 1133 organisms at 41 different taxonomic 507 ranges. Nucleic Acids Res. 40:D284-9. 508

22. Eddy, SR. 1998. Profile hidden Markov models. Bioinformatics 14:755-63. 509 23. Edgar, RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high 510

throughput. Nucleic Acids Res. 32:1792-7. 511 24. Moriya, Y, Itoh, M, Okuda, S, Yoshizawa, AC, and Kanehisa, M. 2007. KAAS: an 512

automatic genome annotation and pathway reconstruction server. Nucleic Acids Res. 513 35:W182-5. 514

25. Tatusov, RL, Natale, DA, Garkavtsev, IV, Tatusova, TA, Shankavaram, UT, Rao, 515 BS, Kiryutin, B, Galperin, MY, Fedorova, ND, and Koonin, EV. 2001. The COG 516 database: new developments in phylogenetic classification of proteins from complete 517 genomes. Nucleic Acids Res. 29:22-8. 518

26. Altschul, SF, Madden, TL, Schaffer, AA, Zhang, J, Zhang, Z, Miller, W, and 519 Lipman, DJ. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein 520 database search programs. Nucleic Acids Res. 25:3389-402. 521

27. Akhter, S, Aziz, RK, and Edwards, RA. 2012. PhiSpy: a novel algorithm for finding 522 prophages in bacterial genomes that combines similarity- and composition-based 523 strategies. Nucleic Acids Res. 40:e126. 524

28. Kanehisa, M, Araki, M, Goto, S, Hattori, M, Hirakawa, M, Itoh, M, Katayama, T, 525 Kawashima, S, Okuda, S, Tokimatsu, T, and Yamanishi, Y. 2008. KEGG for linking 526 genomes to life and the environment. Nucleic Acids Res. 36:D480-4. 527

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 22: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

29. Cantarel, BL, Coutinho, PM, Rancurel, C, Bernard, T, Lombard, V, and Henrissat, 528 B. 2009. The Carbohydrate-Active EnZymes database (CAZy): an expert resource for 529 Glycogenomics. Nucleic Acids Res. 37:D233-8. 530

30. Suen, G, Scott, JJ, Aylward, FO, Adams, SM, Tringe, SG, Pinto-Tomas, AA, Foster, 531 CE, Pauly, M, Weimer, PJ, Barry, KW, Goodwin, LA, Bouffard, P, Li, L, 532 Osterberger, J, Harkins, TT, Slater, SC, Donohue, TJ, and Currie, CR. 2010. An 533 insect herbivore microbiome with high plant biomass-degrading capacity. PLoS Genet. 534 6:e1001129. 535

31. Katoh, K, and Toh, H. 2010. Parallelization of the MAFFT multiple sequence alignment 536 program. Bioinformatics 26:1899-900. 537

32. Stamatakis, A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic 538 analyses with thousands of taxa and mixed models. Bioinformatics 22:2688-90. 539

33. Konstantinidis, KT, and Tiedje, JM. 2005. Towards a genome-based taxonomy for 540 prokaryotes. J. Bacteriol. 187:6258-64. 541

34. Felsenstein, J. 1989. PHYLIP - Phylogeny Inference Package (Version 3.2). Cladistics 542 5:164-166. 543

35. Letunic, I, and Bork, P. 2007. Interactive Tree Of Life (iTOL): an online tool for 544 phylogenetic tree display and annotation. Bioinformatics 23:127-8. 545

36. R Development Core Team. 2008. R: A language and environment for statistical 546 computing. 547

37. Takeuchi, M, Hamana, K, and Hiraishi, A. 2001. Proposal of the genus Sphingomonas 548 sensu stricto and three new genera, Sphingobium, Novosphingobium and Sphingopyxis, 549 on the basis of phylogenetic and chemotaxonomic analyses. Int. J. Syst. Evol. Microbiol. 550 51:1405-17. 551

38. Copley, SD, Rokicki, J, Turner, P, Daligault, H, Nolan, M, and Land, M. 2012. The 552 whole genome sequence of Sphingobium chlorophenolicum L-1: insights into the 553 evolution of the pentachlorophenol degradation pathway. Genome Biol. Evol. 4:184-98. 554

39. Schwartz, E. 2009. Microbial megaplasmids. Springer-verlag, Berlin. 555 40. Basta, T, Keck, A, Klein, J, and Stolz, A. 2004. Detection and characterization of 556

conjugative degradative plasmids in xenobiotic-degrading Sphingomonas strains. J. 557 Bacteriol. 186:3862-72. 558

41. Masai, E, Kamimura, N, Kasai, D, Oguchi, A, Ankai, A, Fukui, S, Takahashi, M, 559 Yashiro, I, Sasaki, H, Harada, T, Nakamura, S, Katano, Y, Narita-Yamada, S, 560 Nakazawa, H, Hara, H, Katayama, Y, Fukuda, M, Yamazaki, S, and Fujita, N. 561 2012. Complete genome sequence of Sphingobium sp. strain SYK-6, a degrader of lignin-562 derived biaryls and monoaryls. J. Bacteriol. 194:534-5. 563

42. Slater, SC, Goldman, BS, Goodner, B, Setubal, JC, Farrand, SK, Nester, EW, Burr, 564 TJ, Banta, L, Dickerman, AW, Paulsen, I, Otten, L, Suen, G, Welch, R, Almeida, 565 NF, Arnold, F, Burton, OT, Du, Z, Ewing, A, Godsy, E, Heisel, S, Houmiel, KL, 566 Jhaveri, J, Lu, J, Miller, NM, Norton, S, Chen, Q, Phoolcharoen, W, Ohlin, V, 567 Ondrusek, D, Pride, N, Stricklin, SL, Sun, J, Wheeler, C, Wilson, L, Zhu, H, and 568 Wood, DW. 2009. Genome sequences of three Agrobacterium biovars help elucidate the 569 evolution of multichromosome genomes in bacteria. J. Bacteriol. 191:2501-11. 570

43. Harayama, S, Kok, M, and Neidle, EL. 1992. Functional and evolutionary relationships 571 among diverse oxygenases. Annu. Rev. Microbiol. 46:565-601. 572

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 23: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

44. Kertesz, MA. 2000. Riding the sulfur cycle-metabolism of sulfonates and sulfate esters 573 in gram-negative bacteria. FEMS Microbiol. Rev. 24:135-75. 574

45. Klappenbach, JA, Dunbar, JM, and Schmidt, TM. 2000. rRNA operon copy number 575 reflects ecological strategies of bacteria. Appl. Environ. Microbiol. 66:1328-33. 576

46. Lauro, FM, McDougald, D, Thomas, T, Williams, TJ, Egan, S, Rice, S, DeMaere, 577 MZ, Ting, L, Ertan, H, Johnson, J, Ferriera, S, Lapidus, A, Anderson, I, Kyrpides, 578 N, Munk, AC, Detter, C, Han, CS, Brown, MV, Robb, FT, Kjelleberg, S, and 579 Cavicchioli, R. 2009. The genomic basis of trophic strategy in marine bacteria. Proc. 580 Natl. Acad. Sci. USA 106:15527-33. 581

47. Williams, TJ, Ertan, H, Ting, L, and Cavicchioli, R. 2009. Carbon and nitrogen 582 substrate utilization in the marine bacterium Sphingopyxis alaskensis strain RB2256. 583 ISME J. 3:1036-52. 584

48. Fegatella, F, Lim, J, Kjelleberg, S, and Cavicchioli, R. 1998. Implications of rRNA 585 operon copy number and ribosome content in the marine oligotrophic 586 ultramicrobacterium Sphingomonas sp. strain RB2256. Appl. Environ. Microbiol. 587 64:4433-8. 588

49. Nagata, Y, Natsui, S, Endo, R, Ohtsubo, Y, Ichikawa, N, Ankai, A, Oguchi, A, 589 Fukui, S, Fujita, N, and Tsuda, M. 2011. Genomic organization and genomic structural 590 rearrangements of Sphingobium japonicum UT26, an archetypal gamma-591 hexachlorocyclohexane-degrading bacterium. Enzyme Microb. Technol. 49:499-508. 592

50. Fredrickson, JK, Balkwill, DL, Romine, MF, and Shi, T. 1999. Ecology, physiology, 593 and phylogeny of deep subsurface Sphingomonas sp. J. Ind. Microbiol. Biotechnol. 594 23:273-283. 595

51. Cavicchioli, R, Ostrowski, M, Fegatella, F, Goodchild, A, and Guixa-Boixereu, N. 596 2003. Life under nutrient limitation in oligotrophic marine environments: an 597 eco/physiological perspective of Sphingopyxis alaskensis (formerly Sphingomonas 598 alaskensis). Microb. Ecol. 45:203-17. 599

52. Giovannoni, SJ, Tripp, HJ, Givan, S, Podar, M, Vergin, KL, Baptista, D, Bibbs, L, 600 Eads, J, Richardson, TH, Noordewier, M, Rappe, MS, Short, JM, Carrington, JC, 601 and Mathur, EJ. 2005. Genome streamlining in a cosmopolitan oceanic bacterium. 602 Science 309:1242-5. 603

53. Hubber, A, Vergunst, AC, Sullivan, JT, Hooykaas, PJ, and Ronson, CW. 2004. 604 Symbiotic phenotypes and translocated effector proteins of the Mesorhizobium loti strain 605 R7A VirB/D4 type IV secretion system. Mol. Microbiol. 54:561-74. 606

54. Wenzel, M, Schonig, I, Berchtold, M, Kampfer, P, and Konig, H. 2002. Aerobic and 607 facultatively anaerobic cellulolytic bacteria from the gut of the termite Zootermopsis 608 angusticollis. J. Appl. Microbiol. 92:32-40. 609

55. Balkwill, DL, Drake, GR, Reeves, RH, Fredrickson, JK, White, DC, Ringelberg, DB, 610 Chandler, DP, Romine, MF, Kennedy, DW, and Spadoni, CM. 1997. Taxonomic 611 study of aromatic-degrading bacteria from deep-terrestrial-subsurface sediments and 612 description of Sphingomonas aromaticivorans sp. nov., Sphingomonas subterranea sp. 613 nov., and Sphingomonas stygia sp. nov. Int. J. Syst. Bacteriol. 47:191-201. 614

56. Strabala, TJ, Macdonald, L, Liu, V, and Smit, AM. 2012. Draft genome sequence of 615 Novosphingobium nitrogenifigens Y88(T). J. Bacteriol. 194:201. 616

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 24: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

57. Luo, YR, Kang, SG, Kim, SJ, Kim, MR, Li, N, Lee, JH, and Kwon, KK. 2012. 617 Genome sequence of benzo(a)pyrene-degrading bacterium Novosphingobium 618 pentaromativorans US6-1. J. Bacteriol. 194:907. 619

58. Brown, SD, Utturkar, SM, Klingeman, DM, Johnson, CM, Martin, SL, Land, ML, 620 Lu, TY, Schadt, CW, Doktycz, MJ, and Pelletier, DA. 2012. Twenty-one genome 621 sequences from Pseudomonas species and 19 genome sequences from diverse bacteria 622 isolated from the rhizosphere and endosphere of Populus deltoides. J. Bacteriol. 623 194:5991-3. 624

59. D'Argenio, V, Petrillo, M, Cantiello, P, Naso, B, Cozzuto, L, Notomista, E, Paolella, 625 G, Di Donato, A, and Salvatore, F. 2011. De novo sequencing and assembly of the 626 whole genome of Novosphingobium sp. strain PP1Y. J. Bacteriol. 193:4296. 627

60. Gan, HM, Chew, TH, Hudson, AO, and Savka, MA. 2012. Genome sequence of 628 Novosphingobium sp. strain Rr 2-17, a nopaline crown gall-associated bacterium isolated 629 from Vitis vinifera L. grapevine. J. Bacteriol. 194:5137-8. 630

61. Anand, S, Sangwan, N, Lata, P, Kaur, J, Dua, A, Singh, AK, Verma, M, Kaur, J, 631 Khurana, JP, Khurana, P, Mathur, S, and Lal, R. 2012. Genome sequence of 632 Sphingobium indicum B90A, a hexachlorocyclohexane-degrading bacterium. J. Bacteriol. 633 194:4471-2. 634

62. Gai, Z, Wang, X, Tang, H, Tai, C, Tao, F, Wu, G, and Xu, P. 2011. Genome sequence 635 of Sphingobium yanoikuyae XLDN2-5, an efficient carbazole-degrading strain. J. 636 Bacteriol. 193:6404-5. 637

63. Shin, SC, Kim, SJ, Ahn do, H, Lee, JK, and Park, H. 2012. Draft genome sequence of 638 Sphingomonas echinoides ATCC 14820. J. Bacteriol. 194:1843. 639

64. Gai, Z, Wang, X, Zhang, X, Su, F, Wang, X, Tang, H, Tai, C, Tao, F, Ma, C, and Xu, 640 P. 2011. Genome sequence of Sphingomonas elodea ATCC 31461, a highly productive 641 industrial strain of gellan gum. J. Bacteriol. 193:7015-6. 642

65. Wang, X, Tao, F, Gai, Z, Tang, H, and Xu, P. 2012. Genome sequence of the welan 643 gum-producing strain Sphingomonas sp. ATCC 31555. J. Bacteriol. 194:5989-90. 644

66. Shin, SC, Ahn do, H, Lee, JK, Kim, SJ, Hong, SG, Kim, EH, and Park, H. 2012. 645 Genome sequence of Sphingomonas sp. strain PAMC 26605, isolated from arctic lichen 646 (Ochrolechia sp.). J. Bacteriol. 194:1607. 647

67. Lee, J, Shin, SC, Kim, SJ, Kim, BK, Hong, SG, Kim, EH, Park, H, and Lee, H. 2012. 648 Draft genome sequence of a Sphingomonas sp., an endosymbiotic bacterium isolated 649 from an arctic lichen Umbilicaria sp. J. Bacteriol. 194:3010-1. 650

68. Lee, H, Shin, SC, Lee, J, Kim, SJ, Kim, BK, Hong, SG, Kim, EH, and Park, H. 2012. 651 Genome sequence of Sphingomonas sp. strain PAMC 26621, an arctic-lichen-associated 652 bacterium isolated from a Cetraria sp. J. Bacteriol. 194:3030. 653

69. Ma, Z, Shen, X, Hu, H, Wang, W, Peng, H, Xu, P, and Zhang, X. 2012. Genome 654 sequence of Sphingomonas wittichii DP58, the first reported phenazine-1-carboxylic 655 acid-degrading strain. J. Bacteriol. 194:3535-6. 656

657 658

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 25: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

Table 1. Statistics and isolation information for the 26 sphingomonad genomes.

Genome Genome size

(Mpb) Sequencing

Status Chromosomes

(Plasmids) No. of 16S Operons Isolation Source Reference

Novosphingobium Novosphingobium aromaticivorans DSM 12444 4.2 Complete 1 (2) 3 Subsurface sediment (55) Novosphingobium nitrogenifigens Y88T 4.2 Draft 1 Paper mill wastewater (56) Novosphingobium pentaromativorans US6-1 5.3 Draft 1 Ulsan Bay sediment (57) Novosphingobium sp. strain AP12 5.6 Draft 1 Cottonwood rhizosphere (58) Novosphingobium sp. strain PP1Y 5.3 Complete 1 (3) 3 Seawater (59) Novosphingobium sp. strain Rr 2-17 4.5 Draft 1 Grapevine crown gall (60) Sphingopyxis Sphingopyxis alaskensis RB2256 3.4 Complete 1 (1) 1 Seawater (46) Sphingobium Sphingobium chlorophenolicum L-1 4.6 Complete 2 (1) 3 Soil (38) Sphingobium indicum B90A 4.1 Draft 2 Sugarcane rhizosphere (61) Sphingobium japonicum UT26S 4.4 Complete 2 (3) 3 Contaminated soil (49) Sphingobium sp. strain AP49 4.5 Draft 1 Cottonwood rhizosphere (58) Sphingobium sp. strain SYK-6 4.3 Complete 1 (1) 2 Pulp mill wastewater (41) Sphingobium yanoikuyae XLDN2-5 5.4 Draft 1 Contaminated soil (62) Sphingomonas Sphingomonas echinoides ATCC 14820 4.2 Draft 1 Plate contaminant (63) Sphingomonas elodea ATCC 31461 4.1 Draft 1 Plant (Elodea) tissue (64) Sphingomonas sp. strain KC8 4.1 Draft 1 Activated sludge (4) Sphingomonas sp. strain Mn802 3.2 Draft 1 Termite This study Sphingomonas sp. strain PR090111-T3T-6A 3.9 Draft 1 Termite This study Sphingomonas sp. strain S17 4.3 Draft 1 Hypersaline lake (6) Sphingomonas sp. strain SKA58 3.9 Draft 4 Seawater NA Sphingomonas sp. strain ATCC 31555 4.0 Draft 1 Pond water (65) Sphingomonas sp. strain PAMC 26605 4.7 Draft 1 Arctic lichen (66) Sphingomonas sp. strain PAMC 26617 4.7 Draft 1 Arctic lichen (67) Sphingomonas sp. strain PAMC 26621 4.8 Draft 2 Arctic lichen (68) Sphingomonas wittichii DP58 5.6 Draft 1 Pimiento rhizosphere (69) Sphingomonas wittichii RW1 5.9 Complete 1 (2) 2 Elbe River (3)

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 26: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

Novosphingobium aromaticivorans DSM 12444Novosphingobium nitrogeni!gens Y88T

Novosphingobium pentaromativorans US6-1Novosphingobium sp. strain PP1Y Novosphingobium sp. strain Rr 2-17Novosphingobium sp. strain AP12Sphingopyxis alaskensis RB2256 Sphingobium sp. strain SYK-6 Sphingomonas sp. strain SKA58 Sphingobium yanoikuyae XLDN2-5Sphingobium sp. strain AP49Sphingobium chlorophenolicum L-1 Sphingobium indicum B90A Sphingobium japonicum UT26S Sphingomonas sp. strain PR090111-T3T-6ASphingomonas sp. strain KC8Sphingomonas wittichii RW1Sphingomonas wittichii DP58 Sphingomonas sp. strain ATCC 31555 Sphingomonas elodea ATCC 31461Sphingomonas sp. strain Mn802 Sphingomonas sp. strain S17 Sphingomonas sp. strain PAMC 26621Sphingomonas sp. strain PAMC 26617Sphingomonas sp. strain PAMC 26605 Sphingomonas echinoides ATCC14820

0.1

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

0 20 40 60 80 100 0 20 40 60 80 100

Glycoside Hydrolases Oxygenases

Oligosaccharides

Chitin and Peptidoglycan

Hemicelluloses

Cellulose

Other

Predicted Substrate

Monooxygenases

Diooxygenases

100

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 27: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

Table 2. Size and number of encoded selfish genetic elements and biodegradative enzymes in the 7 complete sphingomonad genomes.

Genome Size (Kb) Putative

Prophage Phage-like proteins (% of total proteins) Transposons Monooxygenases Dioxygenases CAZymes

Novosphingobium aromaticivorans DSM 12444 Chromosome 3,561.6 5 309 (9.3%) 16 13 19 58 Plasmid pNL1 184.5 1 1 1 13 1 Plasmid pNL2 487.3 2 3 8 Novosphingobium sp. strain PP1Y Chromosome 3,911.5 3 113 (3.3%) 24 10 33 36 Plasmid Lpl 192.1 1 5 2 Plasmid Spl 48.7 1 1 Plasmid Mpl 1,161.6 17 5 6 41 Sphingobium chlorophenolicum L-1 Chromosome 1 3,080.8 1 107 (3.8%) 15 9 4 30 Chromosome 2 1,368.7 1 74 (6.7%) 8 6 15 17 Plasmid pSPHCH01 123.7 2 1 Sphingobium japonicum UT26S Chromosome 1 3,514.8 2 95 (2.7%) 37 9 6 28 Chromosome 2 681.9 7 2 2 22 Plasmid pUT1 31.8 5 Plasmid pUT2 5.4 Plasmid pPCHQ1 191.0 2 6 2 2 Sphingobium sp. strain SYK-6 Chromosome 4,199.3 1 88 (2.3%) 29 16 23 25 Plasmid pSLPG 148.8 6 1 Sphingomonas wittichii RW1 Chromosome 5,382.26 2 183 (3.8%) 36 42 38 33 Plasmid pSWIT01 310.2 2 9 2 1 Plasmid pSWIT02 222.8 1 31 1 6 2 Sphingopyxis alaskensis RB2256 Chromosome 3,345.2 1 62 (2.0%) 36 9 7 33 F Plasmid 28.5

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 28: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

0

1,000

2,000

3,000

4,000

5,000

6,000

7,000

8,000

9,000

10,000

No

. o

f P

ro

tein

s i

n C

luste

r

No. of Shared Protein Clusters

Complete and Draft Genomes

Complete Genomes Only

Number of Genomes Represented in Cluster

2 4 6 8 10 12 14 16 18 20 22 24 26

2856

1035

2281

193

2230

6004

216

386

1121437

23628193073

95

113

Novosphingobium Sphingopyxis

SphingomonasSphingobiumA.

B.

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 29: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

ctNOG48720

ctNOG36942

ctNOG25022

ctNOG01575

ctNOG01004

ctNOG01747

ctNOG55345

ctNOG64903

ctNOG16692

ctNOG92823

ctNOG05450

ctNOG11156

ctNOG02899

ctNOG05539

ctNOG21021

ctNOG67812

ctNOG00683

ctNOG00976

ctNOG37476

ctNOG07586

ctNOG18385

ctNOG01071

ctNOG06489

ctNOG01556

ctNOG16474

ctNOG05617

ctNOG29058

ctNOG36442

ctNOG63405

ctNOG03614

ctNOG37608

ctNOG50242

ctNOG07960

ctNOG02315

ctNOG00329

ctNOG11601

ctNOG06966

ctNOG24549

ctNOG08745

ctNOG58638

ctNOG67681

ctNOG24928

ctNOG56084

ctNOG00059

ctNOG58937

ctNOG62732

ctNOG00104

ctNOG61303

ctNOG31010

ctNOG61960

ctNOG07381

ctNOG76184

bactNOG48720 Uncharacterized protein

bactNOG76184 Uncharacterized protein

bactNOG36942 Uncharacterized protein

bactNOG25022 Uncharacterized protein

bactNOG01747 Phage integrase

bactNOG01575 Transposase

bactNOG01004 Transposase

bactNOG37476 Uncharacterized protein

bactNOG55345 Uncharacterized protein

bactNOG64903 Transcriptional regulator

bactNOG16692 Transposase

bactNOG92823 Uncharacterized protein

bactNOG00976 Xanthine dehydrogenase

bactNOG00683 GH3 glucosidase

bactNOG21021 Uncharacterized protein

bactNOG67812 Uncharacterized protein

bactNOG05450 2-polyprenyl-6-methoxyphenol hydroxylase

bactNOG11156 Uncharacterized protein

bactNOG02899 GH92 mannosidase

bactNOG05539 Outer membrane receptor

bactNOG01556 Transposase

bactNOG07586 Transposase

bactNOG18385 Antirestriction protein

bactNOG01071 Uncharacterized protein

bactNOG06489 Predicted transcriptional regulator

bactNOG16474 Putative silver e�ux pump

bactNOG05617 Uncharacterized protein

bactNOG29058 Transposase

bactNOG36442 Predicted endonuclease

bactNOG07960 Taurine catabolism dioxygenase

bactNOG02315 Na+/H+-dicarboxylate symporters

bactNOG63405 Outer membrane receptor

bactNOG50242 Uncharacterized protein

bactNOG03614 Lignostilbene-alpha,beta-dioxygenase

bactNOG37608 Uncharacterized protein

bactNOG00329 Acetyl-CoA acetyltransferase

bactNOG11601 Outer membrane receptor

bactNOG06966 Transposase

bactNOG24549 Transposase

bactNOG08745 Outer membrane receptor

bactNOG58638 Outer membrane receptor

bactNOG67681 Outer membrane receptor

bactNOG24928 Outer membrane receptor

bactNOG56084 Outer membrane receptor

bactNOG62732 Outer membrane receptor

bactNOG00059 Aldehyde dehydrogenases

bactNOG58937 Acyl-CoA synthetases

bactNOG07381 Outer membrane receptor

bactNOG00104 Pyruvate/2-oxoglutarate dehydrogenase

bactNOG61303 Dehydrogenase

bactNOG31010 Uncharacterized protein

bactNOG61960 Outer membrane receptor

Sph

ing

om

on

as

ech

ino

ides

ATC

C 1

48

20

Sph

ing

om

on

as

sp. s

tra

in P

AM

C 2

66

05

Sph

ing

om

on

as

sp. s

tra

in P

AM

C 2

66

17

Sph

ing

om

on

as

sp. s

tra

in P

AM

C 2

66

21

Sph

ing

om

on

as

sp. s

tra

in S

17

Sph

ing

om

on

as

sp. s

tra

in M

n8

02

Sph

ing

om

on

as

elo

dea

ATC

C 3

14

61

Sph

ing

om

on

as

sp. s

tra

in A

TCC

31

55

5Sp

hin

go

mo

na

s w

itti

chii

DP

58

Sph

ing

om

on

as

wit

tich

ii R

W1

Sph

ing

om

on

as

sp. s

tra

in K

C8

Sph

ing

om

on

as

sp. s

tra

in P

R0

90

11

1-T

3T-

6A

Sph

ing

ob

ium

jap

on

icu

m U

T2

6S

Sph

ing

ob

ium

ind

icu

m B

90

ASp

hin

go

biu

m c

hlo

rop

hen

olic

um

L-1

Sph

ing

ob

ium

sp

. str

ain

AP

49

Sp

hin

go

biu

m y

an

oik

uya

e X

LD

N2

-5Sp

hin

go

mo

na

s sp

. str

ain

SK

A5

8Sp

hin

go

biu

m s

p. s

tra

in S

YK

-6Sp

hin

go

pyx

is a

lask

ensi

s R

B2

25

6N

ovo

sph

ing

ob

ium

sp

. str

ain

AP

12

No

vosp

hin

go

biu

m s

p. s

tra

in R

r 2

-17

No

vosp

hin

go

biu

m s

p. s

tra

in P

P1

YN

ovo

sph

ing

ob

ium

pen

taro

ma

tivo

ran

s U

S6

-1N

ovo

sph

ing

ob

ium

nit

rog

eni!

gen

s Y

88

T

No

vosp

hin

go

biu

m a

rom

ati

civo

ran

s D

SM

12

44

4

Dataset sphingo

0

2

4

6

22

38

54

No. of proteins in cluster

ctNOG48720

ctNOG36942

ctNOG25022

ctNOG01575

ctNOG01004

ctNOG01747

ctNOG55345

ctNOG64903

ctNOG16692

ctNOG92823

ctNOG05450

ctNOG11156

ctNOG02899

ctNOG05539

ctNOG21021

ctNOG67812

ctNOG00683

ctNOG00976

ctNOG37476

ctNOG07586

ctNOG18385

ctNOG01071

ctNOG06489

ctNOG01556

ctNOG16474

ctNOG05617

ctNOG29058

ctNOG36442

ctNOG63405

ctNOG03614

ctNOG37608

ctNOG50242

ctNOG07960

ctNOG02315

ctNOG00329

ctNOG11601

ctNOG06966

ctNOG24549

ctNOG08745

ctNOG58638

ctNOG67681

ctNOG24928

ctNOG56084

ctNOG00059

ctNOG58937

ctNOG62732

ctNOG00104

ctNOG61303

ctNOG31010

ctNOG61960

ctNOG07381

ctNOG76184

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

bac

proteins er

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from

Page 30: Comparison of 26 Sphingomonad Geno mes Reveals Diverse ...€¦ · 93 obtained through serial transfer. Fo r strain PR090111-T3T-6A, a colony of Nasutitermes sp. was 94 sampled from

Table 3. Select catabolic enzymes identified in the sphingomonad genomes. NMO: nitranoate monooxygenase; vanA: vanillate monooxygenase; ssuD: alkanesulfonate monooxygenase; tauD: taurine dioxygenase; hppD: 4-hydroxyphenylpyruvate dioxygenase.

Monooxygenases Dioxygenases Cellulase Families

Genome NMO; K00459

vanA; K03862

ssuD; K04091

tauD; K03119

hppD; K00457 GH5 GH6 GH9

Novosphingobium Novosphingobium aromaticivorans DSM 12444 4 4 1 2 2 Novosphingobium nitrogenifigens Y88T 3 1 2 3 1 Novosphingobium pentaromativorans US6-1 6 5 2 2 2 Novosphingobium sp. strain AP12 5 16 17 1 Novosphingobium sp. strain PP1Y 3 2 1 1 2 1 1 Novosphingobium sp. strain Rr 2-17 3 2 5 1 Sphingopyxis Sphingopyxis alaskensis RB2256 6 1 1 1 Sphingobium Sphingobium chlorophenolicum L-1 7 1 1 6 1 Sphingobium indicum B90A 5 1 1 Sphingobium japonicum UT26S 7 2 1 1 Sphingobium sp. strain AP49 4 1 2 4 1 Sphingobium sp. strain SYK-6 2 7 7 1 Sphingobium yanoikuyae XLDN2-5 4 2 4 2 2 Sphingomonas Sphingomonas echinoides ATCC 14820 2 1 1 Sphingomonas elodea ATCC 31461 2 1 2 1 1 1 1 Sphingomonas sp. strain KC8 4 1 2 1 Sphingomonas sp. strain Mn802 2 1 1 Sphingomonas sp. strain PR090111-T3T-6A 3 1 1 Sphingomonas sp. strain S17 3 2 1 2 Sphingomonas sp. strain SKA58 3 1 1 1 Sphingomonas sp. strain ATCC 31555 2 1 1 2 2 Sphingomonas sp. strain PAMC 26605 3 3 1 1 1 Sphingomonas sp. strain PAMC 26617 2 2 1 2 4 2 Sphingomonas sp. strain PAMC 26621 2 2 1 2 2 Sphingomonas wittichii DP58 4 19 5 8 1 Sphingomonas wittichii RW1 5 24 4 7 1

on May 10, 2020 by guest

http://aem.asm

.org/D

ownloaded from