University of Groningen Genomics-based discovery and ...

University of Groningen Genomics-based discovery and engineering of biocatalysts for conversion of amines Heberling, Matthew Michael IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below. Document Version Publisher's PDF, also known as Version of record Publication date: 2017 Link to publication in University of Groningen/UMCG research database Citation for published version (APA): Heberling, M. M. (2017). Genomics-based discovery and engineering of biocatalysts for conversion of amines. University of Groningen. Copyright Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons). The publication may also be distributed here under the terms of Article 25fa of the Dutch Copyright Act, indicated by the “Taverne” license. More information can be found on the University of Groningen website: https://www.rug.nl/library/open-access/self-archiving-pure/taverne- amendment. Take-down policy If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim. Downloaded from the University of Groningen/UMCG research database (Pure): http://www.rug.nl/research/portal. For technical reasons the number of authors shown on this cover page is limited to 10 maximum. Download date: 10-02-2022



Chapter 2 | Genome mining

Matthew M. Heberling1,4, Christiaan P. Postema1, Thomas J. Meijer1, Marleen Otzen1, Daniel Wibberg2, Jochen Blom3, Andreas Schlüter2, and Dick B. Janssen1

1 Biotransformation and Biocatalysis, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Nijenborgh 4, 9747 AG Groningen, The Netherlands
2 Center for Biotechnology (CeBiTec), Institute for Genome Research and Systems Biology, Genome Research of Industrial Microorganisms, Bielefeld University, Universitätsstr. 27, 33615 Bielefeld, Germany
3 Bioinformatics and Systems Biology, Gießen University, Gießen, Germany
4 Current Address: Peptone – The Protein Intelligence Company, Hullenbergweg 280, 1101 BV Amsterdam, The Netherlands

Author contribution: MMH and DBJ designed experiments and established collaborations with DW, JB, and AS for bioinformatics analyses of the genomes. CPP and TJM cloned and characterized Vp-TA. MO assisted in the bioinformatics analyses of the M7V and SBV1 genomes. MMH and DBJ wrote the manuscript.


Unearthing β-amino acid-converting enzymes

Abstract

Background: Beta-amino acids (β-aa) occur as constituents in bioactive molecules, including coenzyme A and numerous peptide antibiotics. Little is known about their microbial metabolism, which is especially complex in the case of β,β-di-substituted amino acids like β-valine that cannot be converted by most deamination enzymes that act on α-amino acids. Since rare genes might be involved, we examined metagenomic and soil enrichment cultures as complementary strategies for obtaining genes encoding deamination pathways.

Results: Soil samples and environmental gene libraries expressed in E. coli were subjected to growth selection in media containing β-aa’s as sole N-/C-sources. During growth selections, we discovered a pure isolate of Acidovorax (strain MG01) and a mixed culture (M7) of Herbaspirillum and Variovorax species (M7H and M7V, resp.) that metabolize β-valine. Additional cultures containing a Herbaspirillum, Burkholderia, and Variovorax species were also obtained that convert at least three of the following substrates: β-Glu, β-Val, β-Phe, β-Ala, α-methyl-β-Ala, β-Leu, and β-Tyr. The metagenome screening produced no hits. After partial genome sequencing of the M7 culture and MG01 strain, bioinformatics analyses identified CoA ligase-lyase pairs in Acidovorax strain MG01 and in Variovorax strain M7V involved in β-valine conversion. Characterization of a β-transaminase from Variovorax M7V (VarM7V-TA) implicated a role in the degradation of three β-aa’s. The new transaminase showed superior activity with (S)-β-Tyr compared to previously characterized β-TAs. Lastly, genome mining identified putative transaminases and a tyrosine 2,3-aminomutase (TAM) in Acidovorax MG01 and the β-valine degrader Pseudomonas sp. strain SBV1, respectively, as interesting candidates for future characterization.

Conclusions: Through soil enrichments, biocatalysts involved in β-aa conversions were discovered. At least four bacterial cultures showed metabolic versatility with up to six β-aa’s. The recently discovered route for β-Val metabolism in strain SBV1 was strongly implicated in the new strains from this study, MG01 and M7V. Thus, the prevalent route for β-valine metabolism in bacteria seems to be CoA-dependent. Characterization of a β-TA from strain M7V suggests a role in β-aa metabolism.

Key words: genome mining, β-amino acid, β-valine, β-phenylalanine, comparative genomics, soil enrichment, β-transaminase, tyrosine 2,3-aminomutase

Abbreviations: β-amino acid (β-aa), tyrosine and phenylalanine aminomutase (TAM/PAM)/ammonia lyase (TAL/PAL), histidine ammonia lyase (HAL), transaminase (TA), metagenomic DNA (metDNA), minimal media (MM), Acidovorax (Aci), Variovorax (Var)


1 | Background

Beta-amino acids (β-aa’s) occur in many bioactive secondary metabolites across all five kingdoms of life [1]. Bacteria dominate in biosynthetic capacities of β-aa’s with their distinct ability to generate β-aa’s occurring in non-ribosomal peptides and macrolactam polyketides through de novo biosynthesis [1,2]. The structural diversity and protease-resistance effects from β-aa moieties and β-peptide scaffolds in many bioactive molecules accentuate their potential in drug development [1-3]. For instance, bioactive molecules harboring β-aa’s are used for their anticancer (β-Phe in taxol; β-Tyr in C-1027), insecticidal (β-Ala in destruxins), and antibacterial effects (α,β-diaminopropanoic acid in penicillins) [1,2,4,5]. Incorporation of a β-Val moiety increased water solubility of a renin inhibitor drug for hypertension while maintaining potency [6]. The occurrence of synthetic pathways leading to β-amino acids suggests the existence of a diversity of enzymes acting on these compounds or their precursors. Such enzymes may be applicable in biocatalytic processes aimed at production of natural and semi-synthetic bioactive compounds.

Typical routes for discovery of novel microbial biocatalysts involve metagenome library screening, genome database mining, or soil enrichment of microbial cultures using targeted substrates as sole nitrogen or carbon sources. Metagenomic DNA (metDNA) libraries provide access to novel genes through culture-independent means that circumvent a major barrier set by the 99% of microbes that resist lab cultivation [7]. Detection is based on functional screening of expression libraries, e.g. using enzyme assays, or on direct large-scale sequencing. A compelling example of novel gene discovery through marine metagenomics is the massive sequencing project of microbiomes from the Sargasso Sea, which revealed 1.2 M novel genes from an estimated 1,800 genomic species comprising 148 new bacterial phylotypes [8]. Through functional soil metagenomics screening, DeSantis et al. added 200 novel nitrilases to the enzymatic toolbox for production of valuable hydroxy-carboxylic acids [9]. Inspired by these impressive metagenomics endeavors, novel biocatalysts are being continually discovered as eco-friendly alternatives that can replace traditional chemical catalysts or improve prior biosynthetic processes in industry [10,11].

Soil enrichments are usually the preferred route when searching for rare enzymes [12]. Organisms producing notable β-aa-converting enzymes have been enriched from soil, providing substrate specificities towards β-Phe and β-Val [13-16]. For instance, Kim et al. discovered the aryl β-aa degrading bacterium Mesorhizobium sp. LUK after soil enrichment using rac-β-Phe selection medium and characterized the β-transaminase (β-TA) responsible for (S)-β-Phe conversion [13,14]. A more efficient β-TA was later discovered by our group through similar soil enrichments [16]. We have also discovered β-Val-conversion enzymes in a bacterial strain obtained by soil enrichment, establishing the first report on β-Val metabolism in bacteria [15].

Newly isolated bacterial cultures are now routinely characterized by next generation DNA sequencing techniques. Once new microbial genomes are (partially) established and annotated, comparative genomics is a vital tool used for discovering relevant metabolic pathways and new enzymes for environmental, industrial, and therapeutic applications. As an example with industrial and therapeutic relevance, the complete biosynthetic pathway of the anticancer drug Taxol, previously restricted to plant species, was recently decoded for an endophytic fungus via comparative genomics to enable a viable alternative biosynthetic route to Taxol [17].

Deamination screening is applied when searching for enzymes acting on C-N bonds, since amination (biosynthetic) screening is impractical for growth selections, unless an auxotrophic trait can be complemented. Thus, the discovery of deamination enzymes that convert various amino compounds, including β-Val and β-Phe, can be targeted by selective enrichment using such substrates as a nitrogen (N)- or carbon (C)-source. The structure of β-Val (or other dialkylglycines) restricts the diversity of feasible biodegradation routes because the amino-substituted β-carbon is blocked, which eliminates conversion by amino acid dehydrogenases, aminotransferases, or amine oxidases that are commonly involved in the catabolism of α-amino acids. Our recent work with Pseudomonas sp. str. SBV1 defied the expected route of conversion through an aminomutase or ammonia lyase (Figure 1) by discovering a novel CoA-dependent ligase-lyase pair [15].

Figure 1 | Possible initial steps in β-Val metabolism. AM = aminomutase, MM = methyl mutase, and AL = ammonia lyase.

The complete pathway involves β-Val-converting enzymes and leucine catabolic enzymes. After screening an SBV1 genomic library in E. coli, two sets of catabolic genes were found in a cosmid library insert that conferred growth of the recombinant E. coli in β-Val selection media. A β-valinyl-CoA ligase (BvaA) encoded by the insert was shown to activate β-Val for subsequent deamination of the CoA-adduct via one or two β-valinyl-CoA ammonia lyases (BvaB1/B2) to give 3-methylcrotonyl-CoA, which is putatively shunted to the leucine degradation pathway encoded by another gene cluster.

In the current work, we further explore the diversity and distribution of bacterial β-aa metabolism. The previous work described above shows that aminotransferases may act on β-aa’s, unless the amino group is on a tertiary carbon atom, in which case a lyase requiring CoA activation or a mutase may be involved. It remains unclear how widespread these β-aa deamination mechanisms are, and whether β-aa catabolic pathways and gene clusters are conserved among β-aa degrading bacteria. Inspired by their possible applicability in biocatalysis, we isolated novel β-aa degrading bacteria and studied the deamination enzymes and β-aa catabolic gene clusters in these strains.

2 | Results

2.1 | Discovery of organisms for β-amino acid metabolism

In search of organisms metabolizing β-aa’s, we attempted metDNA library screening as well as traditional soil enrichments. In both approaches, 12 diverse amino compounds were supplied as either sole N- or C-sources in liquid minimal medium (MM). MetDNA extracted from unenriched Dutch compost soil was used earlier to construct two recombinant plasmid libraries of ~300k and 500k vector clones, respectively (Dr. A. Schallmey, unpublished). Both libraries were screened, ensuring a 12x coverage of clones. Although metDNA screening yielded no positive hits, an Acidovorax contaminant (strain MG01) was shown to grow in liquid MM supplied with β-Val as a sole N-source. Further MG01 characterization showed kanamycin resistance on MM agarose plates, but not on LB agar plates. Failure to isolate plasmid DNA from this strain confirmed absence of a recombinant library plasmid. The kanamycin resistance of strain MG01 explains its persistence during the library screenings, as kanamycin selection was applied throughout.

Due to the lack of hits from the metDNA library screenings, focus was shifted to liquid MM growth selections. Of the 12 amino substrates, only three (β-Glu, β-Val, β-Phe) produced growth in five enrichment cultures. Substrates that produced no growth were β-Asn, β-Leu, β-Tyr, azido acetic acid, β-Ala, α-methyl-β-Ala, 3-phenyl-2-oxazolidinone, N-methyl-Asp, and N-butyl-Asp. These cultures were carried through two subsequent rounds of serial dilutions into the respective fresh liquid MM. After colony purification, cultures that were seemingly pure isolates were obtained that metabolize the aforementioned β-aa’s (Table 1): M1 (Herbaspirillum sp.) and M2 (Burkholderia sp.) on β-Glu; M6 (Variovorax sp.) on β-Phe; M7 (provisionally identified as a Variovorax sp.) on β-Val; and M8 (Variovorax sp.) on β-Phe.

The substrate scope of all five cultures, along with the MG01 and SBV1 strains, was tested with the other 10 amino compounds used for initial soil enrichments (Table 1). The M7 culture enriched with β-Val was found to metabolize five additional β-aa’s: α-methyl-β-Ala, β-Leu, β-Glu, β-Phe, and β-Tyr. The MG01 and SBV1 strains, which were enriched with β-Val, also exhibited versatile growth by metabolizing 4–6 other β-aa’s. Because of their metabolism of β-aa’s and capacity to degrade β-Val, the M7 culture and Acidovorax strain MG01 were chosen for further genomics and bioinformatics studies. Strain SBV1 served as a positive control in the ensuing comparative genomics studies since its β-Val-conversion genes and enzymes were recently characterized [15]. Since the remaining cultures (M1, M2, M6 and M8) were not subjected to partial genome sequencing, they are not assigned strain IDs in Table 1.

Table 1 | Summary of cultures and growth capacities

| Culture/genus¹ | Strain² | β-Ala | α-methyl-β-Ala | β-Val | β-Leu | β-Glu | β-Phe | β-Tyr |
|---|---|---|---|---|---|---|---|---|
| M1 Herbaspirillum sp. | | n.t.³ | + | - | + | +* | - | - |
| M2 Burkholderia sp. | | n.t. | + | - | + | +* | - | - |
| M6 Variovorax sp. | | n.t. | - | - | - | - | +* | - |
| M7 Variovorax/Herbaspirillum spp.⁴ | M7V/H | n.t. | + | +* | + | + | + | + |
| M8 Variovorax sp. | | n.t. | + | +* | + | + | + | + |
| Acidovorax sp. | MG01 | + | + | +* | + | - | + | + |
| Pseudomonas sp. | SBV1 | + | + | +* | - | - | - | + |

¹ Based on 16S rRNA sequence analyses. Bottom two strains were not isolated in the soil enrichments in this study.
² Strain designation only for soil isolates subjected to genome sequencing.
³ n.t. = not tested.
⁴ Subsequent bioinformatics analysis of partial genome sequence indicated a mixed culture.
* Substrate used during initial growth selection of respective bacterial soil isolate.
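The growth data in Table 1 can be queried programmatically, for example to list which cultures grow on at least three β-aa’s. The following is a minimal sketch that transcribes the table as printed; the names `SUBSTRATES`, `GROWTH`, `substrates_used`, and `versatile_cultures` are illustrative and not part of the original work, and "n.t." entries are treated as unknown rather than as negative results.

```python
# Growth results transcribed from Table 1 (as printed).
# True = growth (+), False = no growth (-), None = not tested (n.t.).
SUBSTRATES = ["β-Ala", "α-methyl-β-Ala", "β-Val", "β-Leu", "β-Glu", "β-Phe", "β-Tyr"]

GROWTH = {
    "M1":   [None, True, False, True, True, False, False],
    "M2":   [None, True, False, True, True, False, False],
    "M6":   [None, False, False, False, False, True, False],
    "M7":   [None, True, True, True, True, True, True],
    "M8":   [None, True, True, True, True, True, True],
    "MG01": [True, True, True, True, False, True, True],
    "SBV1": [True, True, True, False, False, False, True],
}


def substrates_used(culture):
    """Return the substrates on which the culture showed growth."""
    return [s for s, grew in zip(SUBSTRATES, GROWTH[culture]) if grew is True]


def versatile_cultures(min_substrates=3):
    """Return cultures that grow on at least `min_substrates` substrates."""
    return sorted(c for c in GROWTH if len(substrates_used(c)) >= min_substrates)


if __name__ == "__main__":
    for culture in GROWTH:
        print(culture, substrates_used(culture))
    print("Versatile cultures:", versatile_cultures())
```

Keeping "not tested" separate from "no growth" avoids counting untested substrates as negative results.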

2.2 | Partial genome construction and classification of β-Val-degrading isolates

To identify and compare putative β-Val catabolic gene clusters, we sequenced genomic DNA (gDNA) of strains MG01 and SBV1 and of the M7 culture, and assembled draft genomes of each (Table S1). During the genome reconstruction phase of the M7 culture (Table 1), it appeared that the data represent a mixed culture consisting of a Variovorax sp. (designated as strain M7V) and a Herbaspirillum sp. (strain M7H). The persistence of the mixed culture during serial dilutions and diligent colony purification steps suggests a symbiotic relationship between strains M7V and M7H that is crucial for growth in β-Val selection medium. From the metagenome, 673 M7V contigs were extracted and mapped to the best available reference genome (Variovorax paradoxus EPS, 6,550,056 bp, 6,088 CDS, 66% G+C content), a bacterium that promotes plant growth via roots [18]. Reference mapping resulted in 167 M7V scaffolds covering 8,872,035 bp that harbor 7,919 predicted genes. The resulting gaps between scaffolds and their relative locations cannot be fully relied upon when mapping draft genomes to a reference, especially if the strains do not belong to the same species and show a substantially different genome size. For the Variovorax str. M7V draft genome mapping, the reference genome (Vp EPS) used is at least 2.3 Mbp shorter and represents a different species (ANI: ~86%). A total of 4,814,686 bp from the M7H partial genome were mapped to the reference genome of Herbaspirillum seropedicae SmR1. The unassigned contigs represent <1% of the total sequence and cover a total of 558,078 bp that contain 517 annotated genes.

The draft genome of Acidovorax strain MG01 comprised 150 scaffolds that include 887 scaffolded contigs based on the reference genome of Acidovorax strain AAC00-1 (5,352,772 bp, 4,868 CDS, 69% G+C content, GenBank CP000512.1). The unordered contigs cover 190,418 bp (216 CDS), or ~4% of total sequence space covered by all contigs. The ordered contigs were assembled from ~6.8 M sequence reads that produced a 70x coverage of the draft genome for Acidovorax MG01 (4,373,069 bp, 4,080 CDS).

The Pseudomonas SBV1 partial genome (5,854,165 bp, 5,409 CDS, 59% G+C content) constitutes 315 scaffolds created by mapping 1,899 scaffolded contigs against the reference genome from Pseudomonas fluorescens strain Pf-5 (7,074,893 bp, 6,144 CDS, 63% G+C content) [19]. The contigs were assembled from ~6.9 M sequence reads (347,907,871 bp) to produce a 50x coverage of the partial genome. The unordered contigs span 327,987 bp (325 CDS), representing ~5% of total sequence space covered by all contigs.

Genus identifications of the soil isolates were based on 16S rRNA sequencing that produced ≥97% nucleotide sequence identity (seq. ID) with the chosen microbial genera. For species classification, the low pairwise average nucleotide identity (ANI) values in Table S2 indicate that all three β-Val-degrading strains represent new species. Specifically, no ANI comparison between the newly established genomes and their reference sets surpasses 87%, which is below the recently proposed identity thresholds of 92% by Zhang et al. and 95% by Richter et al. for species circumscription [20,21].
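The assembly figures quoted above follow from simple ratios, which the sketch below makes explicit. It is an illustration under stated assumptions, not the pipeline used in this work: the function names are hypothetical, the text does not state which genome length the reported coverage refers to (so it is passed as a parameter), and "total sequence space covered by all contigs" is assumed here to mean ordered plus unordered contig length.

```python
def fold_coverage(total_read_bp, genome_bp):
    """Average sequencing depth: total sequenced bases divided by genome length."""
    return total_read_bp / genome_bp


def unordered_fraction(unordered_bp, ordered_bp):
    """Share of contig sequence that could not be ordered against the reference.

    Assumption: the total sequence space is ordered + unordered contig length.
    """
    return unordered_bp / (unordered_bp + ordered_bp)


def below_species_thresholds(ani_percent, thresholds=(92.0, 95.0)):
    """True if the ANI value lies below every proposed species threshold [20,21]."""
    return all(ani_percent < t for t in thresholds)


# Values quoted in the text for Pseudomonas sp. SBV1 and its reference Pf-5.
SBV1_READ_BP = 347_907_871
SBV1_DRAFT_BP = 5_854_165
PF5_REFERENCE_BP = 7_074_893

if __name__ == "__main__":
    print(f"Depth vs. draft genome:     {fold_coverage(SBV1_READ_BP, SBV1_DRAFT_BP):.1f}x")
    print(f"Depth vs. reference genome: {fold_coverage(SBV1_READ_BP, PF5_REFERENCE_BP):.1f}x")
    print(f"SBV1 unordered fraction:    {unordered_fraction(327_987, SBV1_DRAFT_BP):.1%}")
    print(f"MG01 unordered fraction:    {unordered_fraction(190_418, 4_373_069):.1%}")
    print("ANI 87% below both thresholds:", below_species_thresholds(87.0))
```

Passing the genome length explicitly makes the choice of denominator visible, which matters when the draft genome and the reference differ in size as they do here.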

2.3 | Identification of β-Val gene clusters

To understand the capacities of the MG01 and M7V strains to degrade β-Val, we searched their draft genomes (Table 2) for gene clusters similar to the two gene clusters enabling β-Val metabolism in SBV1 (Figure 2a) [15]. As depicted in Figure 2b (top), the two SBV1 gene clusters encode the coupled metabolism of β-Val in ORFs 8, 9, and 12; and leucine in ORFs 3, 4, 5, and 6. Similar gene clusters for leucine metabolism were found in the M7V and MG01 draft genomes with the four catabolic genes sharing 41–69% seq. ID (Table S3) with the respective genes in SBV1. The gene order within the leucine gene clusters is also the same as in SBV1, although the adjacency of genes varies. For the β-Val gene cluster, homologs of the β-valinyl-CoA ligase (VarBvaA) and lyases (VarBvaB1/VarBvaB2) were found in strain M7V with seq. IDs in the range of 40–64%. The closest verified homologs of VarBvaA and VarBvaB1 in the public databases were BvaA and BvaB1, resp. At least two β-Val gene clusters seem to exist in the MG01 draft genome, albeit with lower seq. IDs in the range of 24–55%.

Although the above sequence similarities provide convincing evidence for the presence of β-Val-converting enzymes, examining the context of these genes could strengthen their inferred function. A notable feature of the β-Val gene cluster in SBV1 (Figure 2b, top) is the adjacency of the genes encoding the characterized β-Val-CoA ligase BvaA (ORF 8) and lyase BvaB1 (ORF 9). This ligase-lyase pair is flanked by a permease (ORF 7) and transcription regulator (ORF 10). In the MG01 draft genome, two ligase-lyase pairs were found (Figure 2b, bottom, same numbering for homologous genes). One pair consists of the ligase AciBvaA2 and lyase AciBvaB1b (contig 76), but with reversed orientation compared to the respective SBV1 gene cluster. Neither a homologous permease nor a transcription regulator flanks this pair. However, genes encoding ABC-type branched-chain amino acid transport proteins and an SOS-response transcriptional repressor are nearby on the same contig.


Table 2 | Genome mining summary for β-amino acid-converting enzymes

| Enzyme [MW, Length] | Organism | Closest Verified Homolog* [Organism, code, % seq. ID] | Substrate/comment |
|---|---|---|---|
| β-valine-CoA ligases | | | |
| BvaA [52.2 kDa, 466 aa] | SBV1 | query | β-valine** |
| VarBvaA [48.9 kDa, 440 aa] | M7V | BvaA [Pseudomonas sp. SBV1, AIZ50556.1, 64%] | β-valine |
| AciBvaA1 [48.6 kDa, 436 aa] | MG01 | PA-CoA ligase [Burkholderia cenocepacia, 2Y4O, 73%] | Phenylacetate |
| AciBvaA2 [44.6 kDa, 407 aa] | MG01 | PA-CoA ligase [Burkholderia cenocepacia, 2Y4N, 30%] | β-valine |
| AciBvaA3 [57.7 kDa, 525 aa] | MG01 | Fatty Acyl CoA Synthetase [Mycobacterium tuberculosis, 3R44, 29%] | could be novel β-valine-CoA ligase |
| β-valinyl-CoA lyases | | | |
| BvaB1 [27.9 kDa, 253 aa] | SBV1 | query | β-valinyl-CoA** |
| BvaB2 [28.5 kDa, 266 aa] | SBV1 | query | β-valinyl-CoA** |
| VarBvaB1 [27.5 kDa, 252 aa] | M7V | BvaB1 [Pse SBV1, AIZ50557.1, 60%] | β-valinyl-CoA |
| VarBvaB2 [27.7 kDa, 259 aa] | M7V | Enoyl-CoA hydratase [M. tuberculosis, 3Q0J, 60%] | Acyl-CoA |
| AciBvaB1a [25.7 kDa, 245 aa] | MG01 | BvaB1 [same as above, 50%] | β-valinyl-CoA |
| AciBvaB1b [28.9 kDa, 270 aa] | MG01 | Enoyl-CoA hydratase [Bacillus anthracis, 3KQF, 42%] | β-valinyl-CoA |
| AciBvaB2 [28 kDa, 259 aa] | MG01 | Enoyl-CoA hydratase [M. tuberculosis, 3H81, 62%] | Acyl-CoA |
| Transaminases | | | |
| β-TA (VarM7V-TA) [46.1 kDa, 434 aa] | M7V | β-Phe TA [V. paradoxus, 4AO9, 84%] | β-Phe**, β-Tyr**, β-Leu** |
| ba-TA M7V [62.7 kDa, 550 aa] | M7V | ba-TA [Pseudomonas sp., 4UHM, 64%] | β-alanine |
| ba-TA MG01 [50.4 kDa, 469 aa] | MG01 | ba-TA [same as above, 63%] | β-alanine |
| ba-TA SBV1 [49.5 kDa, 456 aa] | SBV1 | ω-TA [P. putida, 3A8U, 94%] | β-alanine |
| Aminomutase | | | |
| putTAM SBV1 [59.0 kDa, 539 aa] | SBV1 | TAM [Streptomyces globisporus, 3KDY, 36%] | β-Tyr, α-Tyr |

* Represents the closest homolog within the PDB or UniProt/SwissProt databases.
** Confirmed.
PA = phenylacetate, TAM = tyrosine 2,3-aminomutase, TA = transaminase, ba = β-alanine


Figure 2 | Pathway genes involved in β-Val metabolism. (a) Proposed metabolic pathway of β-Val with the associated ORF numbers of the β-Val activating enzymes (red) and leucine pathway enzymes (blue) in strain SBV1. (b) Gene clusters involved in β-Val metabolism in strain SBV1 (top) and the implicated clusters in strains M7V (middle) and in MG01 (bottom). Amino acid sequence identity percentages relative to the corresponding enzyme in strain SBV1 are indicated above the relevant ORFs. The gene clusters for SBV1 originate from a library insert and those from the other two strains originate from assembled contigs based on a reference genome, as indicated below each cluster (“Co”). All ORFs are accurately depicted. MT = putative methyltransferase


The second ligase-lyase pair in strain MG01 includes (in contig 41) the putative ligase AciBvaA3 and lyase AciBvaB1a (55% seq. ID). The putative ligase is implicated as a CoA-dependent enzyme through its gene location next to the CoA lyase. The organization within the gene cluster comprising the second pair is the same as in SBV1, except that the transcription regulator (ORF 10) is in the opposite direction and upstream of the ligase (ORF 8). Overall, the AciBvaA3-AciBvaB1a pair could represent a second β-Val gene cluster in MG01 that uses a novel CoA ligase. The AciBvaA1 ligase and AciBvaB2 lyase are likely not β-Val enzymes since their closest verified homologs are a phenylacetate-CoA ligase from Burkholderia cenocepacia (PDB: 2Y4O, 73% seq. ID) and an enoyl-CoA hydratase from Mycobacterium tuberculosis (PDB: 3H81, 62%), as seen in Table 2. In addition, their genes are not adjacent to any of the genes encoding putative β-Val enzymes in the MG01 genome. One ligase-lyase pair is present in the M7V draft genome: ligase VarBvaA and lyase VarBvaB1 (Figure 2b, middle). A transcription regulator homologous to that of ORF 10 in SBV1 (49% seq. ID) is located 3 ORFs downstream of this pair. However, several transport and regulatory proteins encoded nearby on the same contig are not homologous to the respective ORFs in SBV1. Overall, the organization within this gene cluster is similar to that in SBV1, except that the transcription regulator (ORF 10) in M7V is not adjacent to the lyase (ORF 9).
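The gene-context argument used above (a putative ligase gains support when its gene sits directly next to a lyase gene on the same contig) can be expressed as a small adjacency check. The sketch below is illustrative only: the `Orf` record, the function name, the ORF positions, and the contig labels "X" and "Y" for AciBvaA1 and AciBvaB2 are hypothetical placeholders, since the text gives only the contig numbers 76 and 41 for the two adjacent pairs.

```python
from typing import NamedTuple


class Orf(NamedTuple):
    contig: str
    index: int   # ordinal position of the ORF along the contig (hypothetical)
    role: str    # annotated function, e.g. "ligase" or "lyase"
    name: str


def adjacent_ligase_lyase_pairs(orfs):
    """Return (ligase, lyase) name pairs that are direct neighbours on one contig.

    Both neighbours are checked, so the result does not depend on orientation.
    """
    by_position = {(o.contig, o.index): o for o in orfs}
    pairs = []
    for orf in orfs:
        if orf.role != "ligase":
            continue
        for step in (-1, 1):
            neighbour = by_position.get((orf.contig, orf.index + step))
            if neighbour is not None and neighbour.role == "lyase":
                pairs.append((orf.name, neighbour.name))
    return sorted(pairs)


# Toy layout for strain MG01; positions and the contigs "X"/"Y" are invented.
MG01_ORFS = [
    Orf("76", 5, "lyase", "AciBvaB1b"),
    Orf("76", 6, "ligase", "AciBvaA2"),
    Orf("41", 2, "ligase", "AciBvaA3"),
    Orf("41", 3, "lyase", "AciBvaB1a"),
    Orf("X", 10, "ligase", "AciBvaA1"),
    Orf("Y", 4, "lyase", "AciBvaB2"),
]

if __name__ == "__main__":
    print(adjacent_ligase_lyase_pairs(MG01_ORFS))
```

With this toy layout, the two pairs described in the text are reported, while AciBvaA1 and AciBvaB2 are not, because neither has a partner gene next to it.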

2.4 | Comparative genomic studies of β-Val-degrading strains

To examine if the occurrence of typical β-Val metabolism genes is restricted to the isolated β-Val degraders, we carried out comparative genomic studies using the EDGAR tool [22]. Each β-Val-degrading strain was compared to two closely related reference strains (>95% 16S rRNA seq. IDs) as follows: SBV1 (Pf Pf-5 and Pf Pf0-1), MG01 (Ac AAC00-1 and Da SPH-1), and M7V (Vp S110 and Vp EPS). The reference strains were unable to degrade β-Val. The resulting Venn diagrams produced varying numbers of singletons (Figure 3). Singletons in the Venn diagrams represent genes that return no reciprocal best hits between the subset of genomes. The “Singletons” interface within EDGAR uses a stricter definition, rejecting all genes that have any unidirectional BLAST hit against any other genome in the subset. Thus, this interface was used to verify true singletons (parenthesized numbers in Figure 3).

For the SBV1 comparison set in Figure 3a, all three characterized β-Val-converting enzymes were verified as singletons. The MG01 comparison set in Figure 3b shows that ligase AciBvaA1 and lyase AciBvaB2 are not singletons, which does not provide evidence for a role in β-Val conversion and further supports the implied roles from the above sequence analyses. However, the other ligase-lyase pair in MG01 were verified as singletons. The M7V comparison set in Figure 3c shows that its ligase-lyase pair consists of singletons as well. Thus, the putative genes involved in β-Val conversion in all three strains were verified as singletons, confirming that their occurrence is restricted to the isolated β-Val degraders.
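The two singleton definitions described above differ only in which hits disqualify a gene. The following sketch contrasts them on toy data; it is not EDGAR's implementation, and the function names, gene identifiers, and hit tables are hypothetical.

```python
def reciprocal_best_hits(best_hit_ab, best_hit_ba):
    """Pairs (a, b) where a's best hit is b and b's best hit is a."""
    return {(a, b) for a, b in best_hit_ab.items() if best_hit_ba.get(b) == a}


def venn_singletons(genes, rbh_pairs_per_reference):
    """Query genes without a reciprocal best hit in any reference genome."""
    matched = {a for pairs in rbh_pairs_per_reference for a, _ in pairs}
    return set(genes) - matched


def strict_singletons(genes, genes_with_any_hit_per_reference):
    """Query genes without any BLAST hit, in either direction, in any reference."""
    hit = set().union(*genes_with_any_hit_per_reference)
    return set(genes) - hit


# Toy example with one query genome (g1..g3) and one reference (r1, r2).
QUERY_GENES = ["g1", "g2", "g3"]
BEST_HIT_QUERY_TO_REF = {"g1": "r1", "g2": "r2"}   # g3 has no hit at all
BEST_HIT_REF_TO_QUERY = {"r1": "g1", "r2": "g1"}   # r2 prefers g1, not g2

if __name__ == "__main__":
    rbh = reciprocal_best_hits(BEST_HIT_QUERY_TO_REF, BEST_HIT_REF_TO_QUERY)
    print("Venn singletons:  ", sorted(venn_singletons(QUERY_GENES, [rbh])))
    print("Strict singletons:", sorted(strict_singletons(QUERY_GENES, [{"g1", "g2"}])))
```

In this toy case g2 counts as a Venn-diagram singleton because its hit is not reciprocal, but it is rejected under the stricter definition because a unidirectional hit exists; only g3 passes both.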


Figure 3 | Comparative genomics analyses. Venn diagrams depict comparative analyses of (a) SBV1, (b) MG01, and (c) M7V partial genomes with the indicated number of shared and unique genes (i.e., singletons) present between the β-Val-degrading genomes (red) and the respective reference genomes from strains that do not convert β-Val (green and blue). Verified singleton numbers are indicated in parentheses. The putative β-Val-degrading enzymes that are verified singletons are boxed in grey. The Vp S110 genome comprises 2 chromosomes.

2.5 | Genome mining for β-alanine enzymes

Bioinformatics was also applied to rationalize the soil isolates’ catabolism of β-Ala and α-methyl-β-Ala. Three possible routes of β-Ala metabolism were considered (Figure 4): route 1, transamination to give malonate semialdehyde via a β-Ala:pyruvate transaminase, ba-TA (EC 2.6.1.18); route 2, conversion to α-Ala via an alanine 2,3-aminomutase, ala-AM (EC 5.4.3.10); and route 3, acryloyl-CoA formation via a CoA transferase and a CoA ammonia lyase (EC 4.3.1.6). Peptide sequences of a ba-TA (PDB: 4UHM, Pseudomonas sp.) [23] and a β-alanyl CoA lyase (AJ715482, Clostridium propionicum) [24] involved in β-alanine catabolism were searched against the draft genomes to retrieve orthologs (Table 2). No sequences exist for native ala-AMs, only for mutants of a lysine 2,3-aminomutase, lys-AM (CAB13860.1, Bacillus subtilis; GAP81141.1, Porphyromonas gingivalis) that acquired ala-AM activity [25]. Ba-TAs were the closest verified homologs (>60% seq. ID), and the ba-TA query sequence generated hits in all three strains. No hits were generated using the query sequences for the lyase or the lys-AM mutants. Overall, all three strains likely employ transamination to metabolize β-Ala. For α-methyl-β-Ala, the ba-TA query from Pseudomonas sp. str. AAC (ba-TA Pse) was recently shown by Wilding et al. to convert this substrate [23]. Given the strong identities between this enzyme and the orthologs from each soil isolate, it is likely that these homologs enable the metabolism of α-methyl-β-Ala as well.

Figure 4 | Pathways for β-alanine catabolism. Metabolic routes of β-alanine are numbered in red and involve: ba-TA (β-Ala-transaminase), ala-AM (alanine-2,3-aminomutase), and an AL (β-alanyl-CoA:ammonia lyase).

2.6 | β-alanine gene clusters encode putative transaminases

Wilding et al. discovered that adjacent genes encoding the PLP-dependent ba-TA Pse and a malonate semialdehyde dehydrogenase (ms-DH) constitute a bicistronic operon for β-Ala catabolism in Pseudomonas sp. str. AAC [23]. Using ba-TA Pse as the query for transaminase homologs (route 1) yielded hits (Table 2) in all soil isolates with strong homology (61–78% seq. ID). Like the ba-TA Pse gene, the genes of the homologs are adjacent to genes encoding putative ms-DHs, except in M7V, where the ba-TA gene is separated by three ORFs from a gene encoding a non-homologous ms-DH (Figure 5). The ba-TA and ms-DH homologs in strains SBV1 and MG01 share >60% seq. ID with the respective enzymes identified by Wilding et al. Interestingly, the ba-TA gene from strain MG01 is also near a gene encoding an NAD(P)-dependent malate dehydrogenase (mal-DH), which may feed pyruvate into the TA reaction through decarboxylation of oxaloacetate during β-Ala catabolism. From the gene context analyses and homology, the TA homologs from all three isolates seem to play a role in β-Ala catabolism. Although a large disconnect between sequence similarity and function often hampers homology-based annotation of class III transaminases like the ba-TA, sequence analyses (Figure 6) between the putative ba-TAs and previously characterized homologs from Pseudomonas putida (ba-TA Pp) and P. aeruginosa (ba-TA Pa) strengthen their likely role as promiscuous ba-TAs [23,26]. For example, the ‘flipping arginine’ (R414) that binds the β-Ala carboxylate group is present in all putative ba-TAs, along with the two residues (W61 and S231) that position this group for catalysis [26]. These three residues define the ba-TA active site fingerprint proposed by Steffen-Munsberg et al. [26]. All residues for PLP cofactor binding are also present. Substrate promiscuity is an innate feature of class III TAs [26], and the putative ba-TAs possess all six active site residues (F/Y24, W61, F89, Y153, I262, R414) that were declared promiscuity determinants in ba-TA Pp [27-29].

[Figure 5 graphic not reproduced: gene maps for SBV1 (contig 319), MG01 (contig 260), M7V (contig 387), and the Pse AAC reference, showing ORFs for ba-TA (EC 2.6.1.18), ms-DH (EC 1.2.1.27), mal-DH (EC 1.1.1.40), an amino acid transporter, and a transcription regulator, together with the corresponding reaction scheme.]

Figure 5 | Gene context analyses of putative β-alanine transaminases in soil isolates. The recently discovered β-alanine catabolic pathway genes in Pseudomonas sp. str. AAC (Pse AAC) serve as a reference (Wilding et al., 2016). The seq. IDs between the reference enzymes and the orthologs in each strain are indicated above the relevant ORF. The contig number is indicated below each genome name. All ORF sizes are accurately depicted. Abbreviations: mal-DH = malate dehydrogenase, and ms-DH = malonate semialdehyde dehydrogenase.


Figure 6 | Multiple sequence alignment of β-alanine transaminases (ba-TA). The ba-TA Pse query sequence from Pseudomonas sp. str. AAC (Pse_4UHM, A0A081YAY5) used for genome mining is the reference (ref) for residue numbering and for sorting by seq. ID (parenthesized). The putative ba-TAs from soil isolates are boxed in black. Fingerprint regions proposed by Steffen-Munsberg et al. (2015) that define ba-TA activity in class III transaminases are in red boxes with consensus residues below each box (capitalized = >70% conserved in original subset, lowercase = 30–70% conserved). Symbols mark various functional sites found within the 3D structure of Pse_4UHM: the carboxyl-binding R414 (‘flipping arginine’), PLP interactions, and residues proposed to accommodate long-chain aliphatic ω-amino acid substrates in the active site. The remaining TA sequences with PDB codes are Pa_4B98 (Pseudomonas aeruginosa) and Pp_3A8U (P. putida). Six key active site residues in Pp_3A8U that are conserved in eight other ω-TAs are also indicated; they are proposed as determinants of similar substrate specificities with various ω-amino acids and amines (Park et al., 2012).
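The fingerprint check described in Section 2.6 amounts to reading off, for each putative ba-TA, the residues found at the alignment columns of the reference positions W61, S231, and R414. The Python sketch below shows one way to do this mapping for a pairwise slice of a gapped alignment. The function names and the toy sequences are invented for illustration; real use would pass the aligned Pse_4UHM reference and a query from the alignment in Figure 6.

```python
from typing import Dict, Iterable


def column_of(ref_aligned: str, ref_pos: int) -> int:
    """0-based alignment column of the 1-based residue `ref_pos` of the reference."""
    seen = 0
    for col, ch in enumerate(ref_aligned):
        if ch != "-":  # gaps do not count toward residue numbering
            seen += 1
            if seen == ref_pos:
                return col
    raise ValueError(f"reference has fewer than {ref_pos} residues")


def residues_at(ref_aligned: str, query_aligned: str,
                positions: Iterable[int]) -> Dict[int, str]:
    """Query residues aligned to the given reference positions."""
    return {p: query_aligned[column_of(ref_aligned, p)] for p in positions}


def has_fingerprint(ref_aligned: str, query_aligned: str,
                    expected: Dict[int, str]) -> bool:
    """True if the query carries the expected residue at every reference position."""
    found = residues_at(ref_aligned, query_aligned, expected)
    return all(found[p] == aa for p, aa in expected.items())


# Toy alignment (invented); the reference has a gap in its third column.
REF = "MW-SR"
QUERY_OK = "MWASR"
QUERY_BAD = "MWASK"
TOY_FINGERPRINT = {2: "W", 3: "S", 4: "R"}
# With the real alignment, the fingerprint from the text would be
# {61: "W", 231: "S", 414: "R"} in Pse_4UHM numbering.
```

Counting residues rather than columns keeps the numbering tied to the reference structure even when the alignment introduces gaps upstream of the site of interest.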


2.7 | Genome mining reveals a putative β-transaminase and tyrosine aminomutase

Enzymes with activities toward the remaining growth substrates were searched for in the draft genomes of the soil isolates to rationalize the growth selections in Table 1: β-Glu AM (glutamate 2,3-aminomutase, GAM) [30]; β-Tyr AM (tyrosine 2,3-aminomutase, TAM) [31]; β-Phe AM (phenylalanine 2,3-aminomutase, PAM) [32,33]; β-transaminase, β-TA [34]; and two ω-TAs [26]. Only the convincing hits from each search query are discussed below.

A transamination route is implicated in the metabolism of β-Phe in strain M7V, based on the genome mining results in Table 2. Specifically, a β-TA (VarM7V-TA) sharing 84% seq. ID with a PLP fold-type I β-TA from V. paradoxus str. CBF3 (Vp-TA) was discovered in strain M7V [16]. In strain MG01, the β-Phe catabolic route is not clear. A putative transaminase (ba-TA MG01) shares only 31% seq. ID with the ba-TA from Mesorhizobium loti, which has activity with β-Phe [26,35]. Searches with this MG01 hit against the PDB and UniProt databases also returned ba-TAs as the most closely related enzymes, including the aforementioned ba-TA Pse (63% seq. ID, PDB: 4UHM). Although ba-TA Pse has a broad substrate scope comprising amino acids and amines, β-Phe was not tested.

Although all three strains metabolized β-Tyr, candidate enzymes were found in only two strains, SBV1 and M7V. In strain M7V, the aforementioned VarM7V-TA could be the enzyme converting β-Tyr, since this activity is confirmed below with purified enzyme and also aligns with the activity of its homolog [16]. From strain SBV1, a putative tyrosine 2,3-aminomutase (putTAM SBV1) sharing 36% seq. ID with top homologs from Streptomyces globisporus (PDB: 3KDY) and Chondromyces crocatus (UniProt: Q0VZ68) was detected (Table 2).

2.8 | Bioinformatics analyses of a putative tyrosine aminomutase

The aromatic ammonia lyases (AL) and aminomutases (AM) belong to a family of enzymes that utilize the 4-methylidene-imidazole-5-one (MIO) internal cofactor for amino group transfer or ammonia release with substrates phenylalanine (PAL/PAM), tyrosine (TAL/TAM), and histidine (HAL) [31-33]. The MIO cofactor is formed by autocatalytic rearrangement of a [Thr/Ala]-Ser-Gly signature motif and acts as an electrophile during catalysis. To examine the MIO-dependent enzymes present in the new strains, we used BLAST searches and multiple sequence alignments (MSA) (Figure 7a). All putative MIO enzymes that were detected contained the catalytic tyrosines, Tyr85 and Tyr309, and the carboxylate-binding residue, Arg312 (putTAM SBV1 numbering). These conserved residues were recently used to select and characterize 22 novel PAL and TAL genes by Jendresen et al. [36]. Based on similarities and signature motifs differentiating between HAL (E494DHVSM) and TAM/TAL (Q494D(I/V)VSM), it appears that both MG01 and M7V contain a putative HAL, whereas SBV1 likely contains a TAM/TAL [37]. The latter is in agreement with the presence of a histidine residue at position 116, which acts as a substrate selectivity switch (S1) by binding the tyrosine hydroxyl group [38,39]. A second switch (S2) is present at position 461. Almost all HALs contain a methionine at this position, while most PALs contain a lysine and TAM/TALs a glutamic acid or an alanine [38]. This residue also influences the enantioselectivity of TAM from Chondromyces crocatus (TAM Cc) [37]. In putTAM SBV1 there is a lysine at S2, suggesting that the enzyme may be an unusual catabolic (R)-selective TAM or even a β-TAL, which has never been reported. Near this putative SBV1 TAM/TAL gene there is a gene for a putative CoA ligase that might form coumaryl-CoA for further conversion in the benzoate pathway. Unfortunately, both genes lie at the end of a contig, which prevents further analysis.

TAM family members may have evolved through convergent evolution from more distant sequences [37], causing a lack of phylogenetic clustering (Figure 7b). Accordingly, the putative TAM/TAL from SBV1 shows only modest resemblance to other TAMs/TALs. To examine possible weaknesses in the current database annotation of MIO enzymes, we randomly selected 14 bacterial homologs (seqs. 8 and 22–34) that are annotated as HAL (10), TAL (1), or TAM (3) and manually inspected alignments for characteristic sequence motifs. Two homologs from Actinoplanes utahensis (seq. 34, 46% seq. ID) and Bacillus subtilis (seq. 32, 40% seq. ID) annotated as HALs seem to be PALs, given their lack of a His residue at the S1 switch position and the presence of a Lys at the S2 location. Most of the other putative HALs may represent a TAM or TAL, since the defining residues match those of TAM/TAL (Figure 7a) and the closest homologs in the Protein Data Bank are TAMs, not HALs. Apparently, automated sequence annotation has led to many errors in MIO enzyme annotation [37], probably because the first MIO enzyme characterized was a histidine ammonia lyase from Pseudomonas putida. The putative TAM homologs present in our β-amino acid degraders and those annotated as HALs mentioned above may form a novel cluster within the MIO enzyme family.

Figure 7 | Bioinformatics characterization of MIO-dependent enzymes from soil isolates. (a) Partial sequence alignment of MIO-dependent enzymes. Various MIO enzyme sequences are aligned to deduce the likely function of putative HAL/TAM enzymes (‘put’) from soil isolates (red sequences 2, 3, 21) by examining residues within the indicated key regions that represent substrate selectivity switches (S1, S2), the internal cofactor (MIO), and a signature motif, resp. Sequence IDs (brackets) and residue numbering are relative to putTAM SBV1 (seq. 21). (b) Phylogenetic analysis of characterized MIO enzymes and their uncharacterized homologs. Verified and predicted (*) substrate specificities are indicated to right. For clades including either putative or characterized enzymes, the indicated specificities are either predicted or verified, respectively. PAM Tc served as the outgroup. MIO enzymes represented in the multiple sequence alignment and phylogenetic tree are:

Verified HAL enzymes (ID, organism, code):
HAL Aci (Acidovorax citrulli AAC00-1, A1TRE4)
HAL Bs (Bacillus subtilis, BAA06644)
HAL Mx (Myxococcus xanthus, ABF90953)
HAL Pp (Pseudomonas putida, 1GK3)
HAL Sa (Stigmatella aurantiaca, AAK57183)

Verified AMs:
TAM Cc (Chondromyces crocatus Cm C5, Q0VZ68)
TAM Am (Actinomadura madurae, ABY66005)
TAM Cm (Cupriavidus metallidurans, ABF07117, TAL activity predominates)
TAM Mf (Myxococcus fulvus, FM212243)
TAM Sg (Streptomyces globisporus, 3KDY)
TAM Str (Streptoalloteichus sp. ATCC 53650, AFV52190.1)
PAM Pa (Pantoea agglomerans, 3UNV)
PAM Tc (Taxus chinensis, 2YII)

Verified ALs:
TAL Fj (Flavobacterium johnsoniae, AKE50827.1)
TAL Ha (Herpetosiphon aurantiacus, WP_012189431.1)
TAL Rc (Rhodobacter capsulatus, ABU33770.1)
TAL Rs (Rhodobacter sphaeroides, Q3IWB0)
TAL Se (Saccharothrix espanaensis, Q2EYY5)
PAL Np (Nostoc punctiforme, B2J528)
PAL Pl (Photorhabdus luminescens, Q7N4T3)
PAL Sm (Streptomyces maritimus, Q9KHJ9)

Homologs of putTAM SBV1 (bold) retrieved from ‘nr’ NCBI database:
putHAL Au (Actinoplanes utahensis, KHD76861)
putHAL Bs (Bacillus subtilis, WP_061186919.1)
putHAL Hb (Hirschia baltica, WP_015827157)
putHAL Oa (Oblitimonas alkaliphila, AKX59620)
putHAL Pa2 (Pseudoalteromonas arctica, ERG09680)
putHAL Pa3 (Pseudomonas amygdali, KPX55588)
putHAL Pf (Plesiocystis pacifica, WP_006974021)
putHAL Pm (Pseudomonas mandelii, WP_03055403)
putHAL Pse (Pseudomonas sp. BICA1-14, KJS74211.2)
putHAL Vc (Vibrio cyclitrophicus, WP_016791306)
putTAM Str (Streptomyces sp. Mg1, B4VDU5)
putTAM Pse (Pseudoalteromonas sp. P1-11, A0A0P9G87)
putTAM Av (Acinetobacter venetianus, A0A150HUR0)
putTAL Str (Streptomyces sp. WK-5344, L0N6Q0)
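The manual inspection described in Section 2.8 follows a small set of residue rules, which can be written down as a decision function. The Python sketch below is a deliberate simplification of that reasoning and not a validated classifier: the function name, the order in which the rules are applied, and the labels are assumptions made for illustration, and the inputs are the residues at S1 (position 116), S2 (position 461), and the first position of the signature motif (position 494), all in putTAM SBV1 numbering.

```python
def classify_mio(s1: str, s2: str, motif_first: str) -> str:
    """Tentative MIO enzyme class from three aligned residues (one-letter codes).

    Rules paraphrased from the text (assumed ordering):
      - Glu at position 494 (E494DHVSM motif) points to HAL.
      - Gln at position 494 (Q494D(I/V)VSM motif) points away from HAL;
        a His at S1 then suggests a tyrosine-selective TAM/TAL, while
        no His at S1 combined with Lys at S2 suggests a PAL.
    """
    if motif_first == "E":
        return "HAL"
    if motif_first == "Q":
        if s1 == "H":
            return "TAM/TAL"
        if s2 == "K":
            return "PAL"
        return "non-HAL, unresolved"
    return "unknown"


# putTAM SBV1 as described in the text: His at S1, Lys at S2, Gln at 494.
put_tam_sbv1 = classify_mio("H", "K", "Q")
# A hypothetical sequence annotated as HAL but lacking His at S1, with Lys at S2.
suspect_hal = classify_mio("F", "K", "Q")
```

A rule set like this is only a first filter. The text itself notes that the lysine at S2 makes putTAM SBV1 atypical, so any call would still need phylogenetic and experimental support.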

2.9 | Sequence analysis of a putative β-transaminase

Beta-transaminases (β-TAs) catalyze the transfer of amino groups from β-aa’s to either pyruvate or α-ketoglutarate, producing alanine or glutamate, respectively, and the corresponding β-ketoacid. Amine transfer proceeds by a ping-pong mechanism with the aminated pyridoxal 5’-phosphate (PLP) as the modified enzyme intermediate. Of the seven reported β-TAs that convert β-Phe, two have crystal structures, and a few reports have investigated their substrate scopes and synthetic capacities via kinetic resolution or asymmetric synthesis [14,16,26,40-43]. To further evaluate the potential of VarM7V-TA as a β-TA, its sequence was compared with verified β-TAs, one of which was correctly predicted as a β-TA (Pg-TA) by Crismaru et al. [16] based on the signature motif, R-X-[AVI]-X(6)-P-X(14)-D-G-X(8)-[EDNQ]-[YFW], that can identify aromatic β-TAs [40,41]. The multiple sequence alignment in Figure 8 shows that VarM7V-TA contains all active site residues identified in the crystal structure of Vp-TA (PDB: 4AO9), including the residues of the signature motif for identifying aromatic β-TAs [16].

The gene contexts of the β-TAs included in the sequence alignment were compared, except for Mes-TA, for which this information is lacking, to deduce their probable physiological roles. The genes encoding VarM7V-TA and Vp-TA are surrounded by pathway genes involved in sulfur and nitrogen/amino acid metabolism. In strain M7V, the deaminated aryl β-keto acid product formed from β-Phe (Figure 9) may be shunted to the benzoate degradation pathway or to the β-oxoadipate pathway via acetophenone degradation [44], given the presence of genes encoding a dioxygenase (51% seq. ID with phenoxybenzoate dioxygenase from Pseudomonas pseudoalcaligenes strain POB310 [45,46]) and a catechol 1,2-dioxygenase (45% seq. ID with a homolog from Cupriavidus necator [47]). Both oxygenases in M7V are also singletons. The limited gene context available for strain CBF3 does not indicate the metabolic fate of the aryl β-keto acid.
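The signature motif quoted in Section 2.9 is written in a PROSITE-like notation, which translates directly into a regular expression for scanning candidate sequences. The Python sketch below handles only the elements that occur in this motif (single residues, X, bracketed alternatives, and repeat counts); the function name and the synthetic test sequence are invented for illustration, and this is not the procedure reported by Crismaru et al.

```python
import re

BETA_TA_MOTIF = "R-X-[AVI]-X(6)-P-X(14)-D-G-X(8)-[EDNQ]-[YFW]"


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern into a Python regular expression."""
    parts = []
    for token in pattern.split("-"):
        m = re.fullmatch(r"(X|[A-Z]|\[[A-Z]+\])(?:\((\d+)\))?", token)
        if m is None:
            raise ValueError(f"unsupported pattern element: {token!r}")
        core, count = m.groups()
        core = "." if core == "X" else core  # X matches any residue
        parts.append(core + (f"{{{count}}}" if count else ""))
    return "".join(parts)


motif_regex = prosite_to_regex(BETA_TA_MOTIF)

# Synthetic sequence (invented) that satisfies the motif starting at index 2.
synthetic = "MK" + "R" + "A" + "V" + "A" * 6 + "P" + "A" * 14 + "DG" + "A" * 8 + "E" + "Y" + "LL"
match = re.search(motif_regex, synthetic)
# The same sequence with the D-G pair removed should no longer match.
broken = synthetic.replace("DG", "AA")
```

A motif hit of this kind only flags a candidate. In the text, the prediction for VarM7V-TA additionally rests on the conserved active site residues from the Vp-TA structure and on the activity measurements.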


Figure 8 | Multiple sequence alignment of β-transaminases. The alignment includes verified β-TAs as follows: Pg-TA (Paraburkholderia graminis C4D1M, WP_006050635); Mes-TA (Mesorhizobium sp. strain LUK, 2YKY); and Vp-TA (Variovorax paradoxus, 4AO9). VarM7V-TA (V. paradoxus sp. strain M7V) represents a putative β-TA. Active site residues based on Vp-TA crystal structure are marked with symbols according to function: ♦-interact with the PLP cofactor; ▼-bind the carboxylate groups of (S)-β-Phe, pyruvate, and/or the α/β-carboxylate group(s) of 2-oxoglutarate; and ■-hydrophobic interactions with the phenyl ring of (S)-β-Phe. Boxed region (red) indicates area containing signature motif residues (•) of β-TAs proposed by Crismaru et al. Seq. ID percentages are parenthesized and are relative to VarM7V-TA peptide sequence.


Figure 9 | Possible degradation pathways of aryl β-amino acids in Variovorax paradoxus strain M7V. The genes encoding for the boxed enzymes are found in the partial genome of the strain. The VarM7V-TA is represented in the first enzymatic step (β-TA).

2.10 | Characterization of a β-transaminase

To verify the function of VarM7V-TA as a β-TA in strain M7V, the gene was cloned and over-expressed in E. coli, and the enzyme was purified via an N-terminal His6-affinity tag to give protein of >90% purity. The other β-TAs included in the multiple sequence alignment in Figure 8, except for Pg-TA, were tested alongside VarM7V-TA using an enzyme-coupled assay with α-ketoglutarate as the amino acceptor and glutamate dehydrogenase for continuous detection of NAD+ reduction coupled to glutamate formation. VarM7V-TA showed the same or significantly higher activities for β-aa conversions compared to Vp-TA, which was previously one of the most active β-TAs (Figure 10). Mes-TA was the least active for all substrates. The highest specific activity (24 U/mg) was exhibited by VarM7V-TA with (S)-β-Tyr, which was converted ~2x faster than by Vp-TA. VarM7V-TA’s promiscuous activities towards β-Phe, β-Tyr, and β-Leu likely explain the capacity of strain M7V to utilize these substrates as sole N-sources (Table 1). The results also indicated substantial substrate inhibition: for most enzyme-substrate combinations, the highest activities were found with the low (2 mM) concentration of amine donor. None of the β-TAs showed activity with β-Glu or with several aliphatic (isopropylamine, 3-aminopentane, amylamine, 4-aminobutyric acid) and aromatic amines ((S)-α-MBA, (S)-(+)-2-phenylglycinol, benzylamine, phenethylamine, piperonylamine).
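Specific activities such as the 24 U/mg reported above are typically derived from the slope of an absorbance trace in a coupled assay. The Python sketch below shows the arithmetic under stated assumptions that are not given in the text: NADH formation is followed at 340 nm (molar absorption coefficient 6.22 mM⁻¹ cm⁻¹), one NADH is formed per glutamate produced, and the assay volume, path length, and enzyme amount in the example are invented.

```python
import math

# Molar absorption coefficient of NADH at 340 nm, in mM^-1 cm^-1.
EPSILON_NADH_340 = 6.22


def specific_activity(dA_per_min: float, volume_ml: float, enzyme_mg: float,
                      path_cm: float = 1.0,
                      epsilon: float = EPSILON_NADH_340) -> float:
    """Specific activity in U/mg (1 U = 1 µmol product per minute).

    Beer-Lambert: rate [mM/min] = (dA/min) / (epsilon * path length).
    Multiplying mM/min by the assay volume in mL gives µmol/min.
    """
    rate_mM_per_min = dA_per_min / (epsilon * path_cm)
    units = rate_mM_per_min * volume_ml
    return units / enzyme_mg


# Invented example: slope of 0.311 A/min in a 1 mL assay with 2 µg of enzyme.
example = specific_activity(dA_per_min=0.311, volume_ml=1.0, enzyme_mg=0.002)
```

Because the measured signal depends on the coupling enzyme keeping up with the transaminase, a calculation like this is only meaningful when glutamate dehydrogenase is present in excess.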


Figure 10 | Activities of β-transaminases. Specific activity (S.A.) measurements with various β-amino acids for (a) Mes-TA, (b) Vp-TA, and (c) VarM7V-TA in an enzyme-coupled assay using GDH. Only converted enantiomers are shown.

3 | Discussion

3.1 | Soil enrichments prevail in screening for β-amino acid conversions

Traditional soil enrichments and meta-/genome library screening have become prominent paths for discovering enzymes and unveiling novel metabolic capacities in microbes [11,13,16,48-50]. Both approaches were applied in this work to discover biocatalysts involved in β-aa conversion, but only the soil enrichment approach succeeded. Failure of the library screening could be due to: 1) a limited library size, in which β-aa active genes may be rare or absent; 2) growth screening of libraries, which may miss genes coding for an enzyme that only confers a growth phenotype in a host that metabolizes the product; or 3) intrinsic limitations such as expression [51]. For 1), it is estimated that <800 genome equivalents were screened, and redundant genomes likely reduce this number further. Although a soil sample may contain a similar number of genome equivalents from cultivable microbes (1% of ~9,000 [12]), soil enrichments increase the chances of discovering rare β-aa degradation genes by overcoming the latter two challenges of library screening. Pre-enriched libraries may help, but were not attempted here [11,51]. With the advent of next generation sequencing, our combined approach for rare gene discovery with growth selection and genome sequencing is the preferred route, avoiding the troubles of library construction and screening.

Through growth selection, we retrieved six new organisms, including strains metabolizing several β-aa’s. Three organisms (M7V, MG01, SBV1) were sequenced and partial genome sequences were analyzed to locate genes of β-aa active enzymes (Table 2). A likely essential symbiotic relationship with strain M7H prevented pure culture isolation of M7V. Differences in metal resistance between Variovorax sp. and Herbaspirillum sp. may be exploited to separate the M7V and M7H strains [52-54]. The broader range of metal resistance of V. paradoxus could favor isolation of the M7V strain [54]. In all three partial genomes, putative fold-type I PLP-dependent TAs for β-Phe and β-Ala degradation were detected. New CoA ligase-lyase gene clusters for β-Val degradation were detected in MG01 and M7V, which are similar to the characterized β-Val gene cluster in SBV1. A putative MIO-dependent TAM was detected in SBV1 that likely enables its degradation of β-Tyr. We also examined the occurrence of MIO enzymes.

Despite an ~50x genome coverage by sequencing reads, a bias was introduced somewhere before or during 16S rRNA and partial genome sequencing that hindered detection of the 16S rRNA gene in the M7H partial genome. Searches against the unordered contigs originating from the M7 culture also produced no significant 16S rRNA gene hits [55]. Thus, the M7H strain contamination was not detected during initial identification of the M7 culture due to some bias that prevented sequencing of its 16S rRNA gene at both the strain isolation and partial genome construction stages.

3.2 | Enhanced functional implication of putative β-valine enzymes

Until recently, nothing was known about microbial deamination of β-Val [15]. Our interest in β-Val conversion was to discover rare ammonia lyases or aminomutases that can convert β-amino compounds that cannot be converted by common deamination mechanisms. A CoA-dependent ligase-lyase pair was recently discovered for the conversion of β-Val by BvaA and BvaB1/2 enzymes in Pseudomonas strain SBV1 [15]. Our results show that this route is not unique to strain SBV1: similar CoA-dependent enzyme pairs were found in two new β-Val-degrading strains, M7V (Var BvaA/BvaB1) and MG01 (Aci BvaA2/BvaB1b). Otzen et al. previously compared a BvaA homology model with the crystal structure of a homologous phenylacetate (PA)-CoA ligase (PDB: 2Y4N) to rationalize their specificity differences (Figure 11a) [15,56]. They postulated that alterations in six active site residues (Y136 (BvaA)→L167 (PA), F141→W172, A147→D178, A214→G245, I236→N270, and P245→I278) change the shape and polarity of the β-Val binding pocket in a way that prevents BvaA from converting PA. Comparing these six residue positions in the sequence alignment in Figure 11b suggests that Aci BvaA1 is likely a PA-CoA ligase, while Aci BvaA2 and Var BvaA are β-Val CoA ligases.

Additional β-aa-converting enzymes were predicted in strains SBV1, MG01, and M7V. For β-Ala conversion, putative ba-TAs were homologous to a highly promiscuous ba-TA recently discovered by Wilding et al. [23]. This enzyme was shown to produce 12-aminododecanoic acid, a building block of nylon-12 [57]. Thus, the homologs from each strain, especially from SBV1, could be candidates for industrial evaluation.
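The six-position comparison used above to separate β-Val CoA ligases from PA-CoA ligases can be expressed as a simple match count. In the Python sketch below, the two reference residue strings come from the substitutions listed in the text (BvaA: Y136, F141, A147, A214, I236, P245; PA-CoA ligase: L167, W172, D178, G245, N270, I278). The function name, the tie handling, and the example candidate are hypothetical, since the residues of the individual homologs are shown only in Figure 11b.

```python
# Residues at the six aligned binding-pocket positions, in the order of the text.
BVAA_POCKET = "YFAAIP"   # β-valinyl-CoA ligase BvaA from strain SBV1
PA_POCKET = "LWDGNI"     # phenylacetate-CoA ligase (PDB: 2Y4N)


def ligase_call(six_residues: str) -> str:
    """Tentative specificity call from the six aligned pocket residues."""
    if len(six_residues) != 6:
        raise ValueError("expected exactly six residues")
    bva_matches = sum(a == b for a, b in zip(six_residues, BVAA_POCKET))
    pa_matches = sum(a == b for a, b in zip(six_residues, PA_POCKET))
    if bva_matches > pa_matches:
        return "β-Val CoA ligase-like"
    if pa_matches > bva_matches:
        return "PA-CoA ligase-like"
    return "ambiguous"


# Hypothetical candidate sharing four of six residues with BvaA.
candidate_call = ligase_call("YFAGNP")
```

Counting identities treats all six positions as equally informative, which is an assumption of this sketch rather than a claim from the homology model.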


Figure 11 | Sequence analyses of active site residues in aryl- and β-valinyl-CoA ligases. (a) Comparison of reactions catalyzed by the CoA ligases that convert β-valine (BV) and phenylacetate (PA). (b) Comparison of aryl substrate binding pocket residues based on a phenylacetate-CoA ligase from B. cenocepacia (Bc PA-CoA ligase) (Law et al., 2011). Included in the multiple sequence alignment are the putative CoA ligases Aci BvaA1 and Aci BvaA2 and the Var M7V ligase (Var BvaA), as well as the β-valinyl CoA ligase from Pse SBV1 (BvaA). Green residues are found only in the Bc PA-CoA ligase, red residues only in the β-valinyl CoA ligase, and white residues in neither. Ref = reference sequence for seq. ID % and residue numbering.


3.3 | A new class of putative tyrosine aminomutases

For β-Tyr catalysts, a putative TAM from SBV1 (putTAM SBV1) and 8 of its homologs from the nr database at NCBI were inferred based on careful sequence and phylogenetic analyses. TAM reversibly converts β-Tyr to α-Tyr, while α-TAL reversibly deaminates α-Tyr. There are no known β-TALs. The gene context of putTAM SBV1 consists of a benzoate degradation gene, suggesting that SBV1 uses TAM to convert β-Tyr to α-Tyr, which is metabolized to benzoate for carbon recycling of aromatic compounds (Figure 12) [58,59].

Figure 12 | Proposed catabolism of β-tyrosine in Pseudomonas sp. strain SBV1. Likely route utilizing TAM (putTAM SBV1, boxed). *unknown enzyme

This pathway is further supported by the fact that no other candidate enzymes were found in SBV1 that enable its degradation of β-Tyr. Previously characterized TAMs are involved in the biosynthesis of β-Tyr for antibiotic and chromophore production [5,31,37,60]. Thus, putTAM SBV1 could be a novel MIO-enzyme specialized in the catabolism of β-Tyr. The unconventional gene context likely explains why TAM SBV1 and its homologs do not cluster with known TAMs in the phylogenetic tree (Figure 7b). Such cluster separation between newly discovered TALs (TAL Ha/Fj in Figure 7) and known TALs was also apparent in a recent work by Jendresen et al., who identified 22 new PAL and TAL enzymes in a quest to discover highly active and specific industrial candidates [36]. The gene context of a bifunctional TAM/TAL (seq. 21, 34% seq. ID, UniProt: Q1LRV9) with an unknown role in Cupriavidus metallidurans str. CH34 (TAM Cm) was also different to that of putTAM SBV1 [37]. The gene for TAM Cm is surrounded by amino acid conversion and transport genes, including a putative ba-TA that is homologous (50% seq. ID) to the aforementioned promiscuous ba-TA Pse (PDB: 4UHN/M) [23]. Thus, putTAM SBV1 and TAM Cm likely are involved in β-Tyr catabolism. Interestingly, none of the strains harboring the new class of putative TAMs are Cyanobacteria, which are prolific producers of secondary metabolites [61]. Thus, investigating whether β-Tyr serves as a building block for novel bioactive natural products from these strains could be fruitful. In general, predicting function solely based on sequence similarity for MIO enzymes can lead to extensive misannotations. This may be the case for the 10 randomly chosen homologs (40–100% seq. IDs) of of putTAM SBV1 that are annotated as HALs. These bacterial homologs likely represent 2 novel PALs and 8 novel TAM/TALs. 
Of the top 5,000 hits generated from the putTAM SBV1 homolog search, which are mostly annotated as HALs in the nr database, up to 420 of the deposited sequences could be wrongly annotated after filtering these hits for ≥40% seq. ID and ≥90% query coverage. Conversely, these putative enzymes could represent a novel group of HALs. This would break the strict conservation of the Glu-494 residue among characterized HALs [62], because in these sequences it is replaced by a Gln, the same residue that is conserved across characterized non-HAL MIO-enzymes in Figure 7a and in prior sequence alignments [37,62]. Although the role of this residue remains elusive [62], the rationale for implicating putTAM SBV1 suggests erroneous annotations of the putative HALs. The prevalent HAL misannotation was also observed by Krug et al., who functionally verified one putative HAL as a TAM (TAM Cm), which was discovered using the signature motif applied here [37]. Clearly, automatic annotation tools need to incorporate differentiating residue and motif information of protein sequences from prior studies to reduce the propagating errors that devalue public sequence databases. Using primary to tertiary structural analyses, our recent work highlighted the importance of hydrophobicity in an active site inner loop that was found to be a partial structural determinant of PAM and PAL activities [33]. Schafhauser et al. recently applied this approach to predict the function of a putative PAL to be a PAM, which supplies the β-Phe moiety of a potent mycotoxin in a prevalent foodborne mold [63]. Extending such an approach to the TAM/TAL enzymes would offer additional insight into their predicted activities, especially when the gene context and differentiating motifs are uninformative or lacking and β-Tyr utilization by the host organism is unknown.

3.4 | A new β-transaminase

Characterization of a β-TA from M7V (VarM7V-TA) likely rationalized the strain’s capacity to convert other substrates like β-Phe, β-Tyr, and β-Leu. Such substrate promiscuity is well known for the ω-TA class of enzymes, which encompasses β-TA [23,26]. The active site residue position 90 was proposed in a recent review to influence activity with amines other than amino acids: Phe = high activity, Tyr = low activity [26]. The observed activities of the β-TAs included in the sequence alignment (Figure 8) correlate with this notion. For example, the three β-TAs tested in this study are not active with amines and contain a Tyr-90, while Pg-TA harbors a Phe-90 and was previously shown to be active with amines [41]. All tested β-TAs were highly enantioselective towards the substrates tested. Based on observations from MD simulations of a highly promiscuous ba-TA, Wilding et al. recently suggested that, in addition to variations in binding pocket size, substantial structural rearrangements within ω-TAs may largely influence substrate specificities [23]. Perhaps similar simulation studies would shed light on the observed trends of substrate activities in this study. The signature motifs that separately identify ba-TA (Figure 6) and β-TA (Figure 8) sequences allow rapid discrimination between β-Phe and β-Ala TAs.
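The position-90 rule described above (Phe = high activity with amines, Tyr = low activity) can be illustrated as a simple lookup on one alignment column. The following is only a minimal sketch: the function name, the toy sequences, and the column index are hypothetical, and in practice the column corresponding to position 90 has to be read from the actual multiple sequence alignment.

```python
# Heuristic taken from the text: the residue at active-site position 90
# of a beta-TA hints at activity with amines (Phe = high, Tyr = low).
AMINE_ACTIVITY_BY_RESIDUE = {"F": "high", "Y": "low"}


def predict_amine_activity(aligned_seq: str, column: int) -> str:
    """Predict amine activity from a single column of an aligned sequence.

    `column` is the 0-based alignment column assumed to correspond to
    position 90; any residue other than Phe or Tyr (or a gap) is 'unknown'.
    """
    residue = aligned_seq[column].upper()
    return AMINE_ACTIVITY_BY_RESIDUE.get(residue, "unknown")


# Made-up aligned fragments for illustration only (not real enzyme sequences).
toy_alignment = {
    "toy_TA_with_Phe": "MK-AFGT",
    "toy_TA_with_Tyr": "MKQAYGT",
    "toy_TA_with_gap": "MKQA-GT",
}

predictions = {
    name: predict_amine_activity(seq, column=4)
    for name, seq in toy_alignment.items()
}
print(predictions)
```

Such a lookup only encodes the reported correlation; it does not replace experimental verification of activity.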

4 | Conclusions

Using soil enrichments, we isolated three novel strains capable of growth on different β-aa’s. Genome mining identified CoA-dependent ligase-lyase pairs that are likely involved in β-valine metabolism. The gene clusters showed similarities in overall organization to the β-valine degradation genes described by Otzen et al. [15]. Sequence analysis also identified genes encoding PLP fold-type I aminotransferases that could be involved in the metabolism of aromatic β-aa’s. The genome of strain SBV1 encodes at least two enzymes potentially relevant for biocatalytic β-amino acid production: a novel TAM and a highly promiscuous ba-TA, whose homologs partake in the biosynthesis of antibiotics and various high-value platform chemicals, respectively [5,23,37,64].

5 | Methods

5.1 | Strains and Chemicals

The bacterial strains isolated in this study, in addition to those used as reference strains for comparative growth studies, are summarized in Table S4. Chemicals screened during growth selections were obtained as the racemic form when relevant from the following suppliers: Acros Organics (β-phenylalanine); Fluka (β-alanine); Sigma-Aldrich (β-leucine; α-methyl-β-alanine; N-methyl-DL-aspartic acid; β-glutamic acid); Bachem (β-asparagine); InnoChimie GmbH (β-tyrosine); Fluorochem (β-valine); and Frinton Laboratories, Inc. (3-phenyl-2-oxazolidinone). The following chemicals were synthesized by Wiktor Szymanski (Univ. of Groningen) according to established protocols: azido acetic acid; N-butyl aspartic acid.

5.2 | Screening of metagenomic DNA libraries for deamination enzymes

Two unenriched libraries (unpublished) using Dutch soil from medium compost (DIR-MC) and ripe compost (DIR-RC) were screened for deamination enzymes using liquid MM growth selection experiments with the substrates in Table 1. The metDNA libraries were screened using a DH5α-E/recombinant pZERO2 (KanR) expression system with an ~12x coverage of the respective library size. The liquid MM (pH 7.0) used for growth selections comprised per L: 5.3 g Na2HPO4·12H2O, 1.4 g KH2PO4, 0.2 g MgSO4·7H2O, 1.0 g Na2SO4, 1.0 mL vitamin solution, and 5.0 mL of a trace metal solution that is vital for bacterial growth [65]. For screening metDNA library constructs in the E. coli DH5α-E (Invitrogen) expression host, thiamine (100 µM) and kanamycin (30 µg/mL) were added to overcome thiamine auxotrophy and to select for the recombinant host, respectively. To screen libraries with various sole nitrogen (N) sources, a β-aa (5 mM) was used in combination with glucose (0.2% v/v). As a reliable N-source serving as a positive growth control, (NH4)2SO4 was used (5 mM) when screening with sole carbon sources (5 mM substrate). Cultures were capped with a cotton plug and aluminum foil before being placed in a shaking incubator (200 rpm) at 30°C. Plate media were solidified with washed agarose (0.80% final conc. (v/v)).

5.3 | Enrichment and isolation of β-amino acid degrading bacteria

During the metDNA library screening, a false-positive hit was discovered that was identified as a new Acidovorax strain, eventually designated Aci MG01, which was enriched with β-Val as a sole N-source. For the enrichment of β-aa-degrading microbes, liquid MM were inoculated (30°C) with soil samples from a grassy field in the Netherlands (53° 12’ 23” N, 6° 32’ 40” E). After serial dilutions and colony purification, a single CFU was used to inoculate the relevant liquid MM (5 mL) to confirm growth on the sole N- or C-source until saturated growth was observed. This final culture was then used for genomic DNA (gDNA) isolation and glycerol stocks for each of the final soil isolates outlined in Table 1. Each soil culture, except for M8, could grow in various liquid rich media, such as DSMZ, LB, and TB media. Cultures were also tested with other β-aa’s as sole N-sources and N-butyl-Asp as a sole C-source. The substrate growth screening was performed in 10 mL sterile, round-bottom Greiner tubes placed at 30°C (200 rpm). All chosen reference strains were subjected to liquid MM growth experiments with β-valine provided as the sole N-source for the experimental samples and (NH4)2SO4 for growth controls.

5.4 | Cloning and production of β-transaminases

The gene encoding VarM7V-TA was amplified from gDNA isolated from strain M7V using primers (Forward: 5’-gcgcggcagccatatgacccttgcccccatcg-3’; Reverse: 5’-gcggccgcaagcttttatcagcccagggcgggcagcag-3’) and inserted via NdeI/HindIII restriction sites into a pET28b(+) expression vector under transcriptional control of the T7 promoter. The resulting recombinant gene codes for an N-terminal His6-tag (MGSSHHHHHH) followed by a 10-amino acid linker region (SSGLVPRGSH) and the coding sequence. Plasmids expressing the Vp-TA and Mes-TA genes were described earlier [16,34]. The pET28b(+)/E. coli C41 (DE3) expression system was used for all three TAs using kanamycin (50 mg/ml) in all cultures. For Vp-TA (48.6 kDa) production, a pre-culture was diluted (100x) in 1 L TB medium (5 L flask) and incubated in a rotary shaker (37°C, 135 rpm) until OD600 ~0.5, and then induced with IPTG (0.8 mM). The induced culture was incubated for an additional 16 h (30°C, 135 rpm). The same protocol was followed for VarM7V-TA (48.3 kDa) production, except that the induced culture was incubated for 20 h (24°C, 135 rpm). For Mes-TA (49.5 kDa) production, the induced culture was incubated for 60 h (17°C, 135 rpm). Purification of all three transaminases led to >90% purity using the following protocol. Cells were harvested by centrifugation (15 min, 6,000 × g, 4°C), suspended in lysis buffer (20 mM Tris-Cl, pH 8, 100 mM NaCl, protease inhibitor cocktail), and sonicated at 4°C. After ultra-centrifugation, the clarified lysate was incubated for 1 h (4°C) in a 25 mL column (Bio-Rad) on an orbital shaker with 1.5 mL TALON cobalt resin (Clontech) pre-equilibrated with lysis buffer (no protease inhibitor). After washing with increasing concentrations of imidazole, target protein was eluted with 0.1 M of imidazole. Purified samples were desalted using gravity-flow Econo-Pac 10DG desalting columns (Bio-Rad) at 4°C in storage buffer (50 mM KPi, 10% glycerol, pH 8.0). Protein concentration was determined using Bradford reagent (Sigma). The final samples were aliquoted and stored at -20°C.
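As a quick sanity check on the cloning design in this section, the two primers can be scanned for the NdeI (CATATG) and HindIII (AAGCTT) recognition sequences. The sketch below is illustrative only: the helper names are hypothetical, and the primer strings are copied from the text with whitespace removed.

```python
NDEI_SITE = "CATATG"     # NdeI recognition sequence (contains the ATG start codon)
HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence

# Primer sequences as given in the text (5'->3'), whitespace stripped.
forward_primer = "gcgcggcagccatatgacccttgcccccatcg".replace(" ", "").upper()
reverse_primer = "gcggccgcaagcttttatcagcccagg gcgggcagcag".replace(" ", "").upper()

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def site_position(primer: str, site: str) -> int:
    """Return the 0-based index of a recognition site, or -1 if absent."""
    return primer.find(site)


ndei_index = site_position(forward_primer, NDEI_SITE)
hindiii_index = site_position(reverse_primer, HINDIII_SITE)

# The gene-specific part of the reverse primer, read on the coding strand,
# shows how the amplified gene is expected to end.
coding_strand_end = reverse_complement(reverse_primer[hindiii_index + len(HINDIII_SITE):])

print("NdeI site at index", ndei_index)
print("HindIII site at index", hindiii_index)
print("Coding-strand 3' end:", coding_strand_end)
```

Reading the reverse primer on the coding strand suggests that the amplified gene ends in two consecutive stop codons (TGA TAA) directly upstream of the HindIII site.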

5.5 | 16S rRNA and partial genome sequencing

The GenElute Bacterial Genomic DNA Kit (Sigma) was used to obtain small-scale preparations of gDNA for 16S rRNA sequencing. For larger gDNA preparations, the protocol of Poelarends et al. was used with liquid MM cultures of 50–100 mL containing the appropriate sole nitrogen source (5 mM) [66]. For 16S rRNA sequencing, PCR was used to amplify the gene using the relevant gDNA as a template with two universal primers: 27F (5’-AGAGTTTGATCMTGGCTCAG-3’) and 1492R (5’-GGYTACCTTGTTACGACTT-3’). The PCR product was sequenced by GATC Biotech AG (Germany) and the nucleotide sequences (Accession #’s) were analyzed using the ‘blastn’ tool from NCBI to obtain the top microbial genome hits. To further study two β-Val-degrading microbes (‘7a’ soil isolate from Table 1 (Var M7V); and a previously isolated Aci MG01), the resulting genomic DNA from the large-scale preparation was subjected to paired-end sequencing by Baseclear BV (the Netherlands). The partial genome sequencing results were obtained using an Illumina (Solexa) GAIIx Genome Analyzer. Subsequent processing of the raw DNA sequence reads followed by de novo gene assembly using the CLC Genomics Workbench (v5.0.1) software resulted in the reassembly statistics in Table S1 and provided local search databases for genome mining. For partial genome construction, the sequencing reads were assembled against the relevant reference genome and automatically annotated using GenDB (Table S5).

5.6 | BLAST analyses

The contig sequences for M7V and MG01 acquired from the de novo gene assemblies served as databases that were searched with a local ‘tblastn’ search tool in Geneious (v8.1.8) [67,68]. Amino acid sequences of target enzymes served as queries using the ‘tblastn’ search option to generate the top hits from public databases (PDB or Uni-/Swiss-Prot) summarized in Table 2. The non-redundant ‘nr’ database on the NCBI server was searched using these top BLAST hits as queries and the ‘blastp’ (v2.4.0+) search option to ascertain the putative function. The homolog search using the TAM SBV1 peptide sequence as a query against the nr database on the NCBI server with the ‘blastp’ tool generated the top 5,000 hits [68]. From these hits, 10 homologs annotated as putative HALs (based on homology only) were randomly chosen to represent sequences with diverse relatedness (40–100% seq. ID) to putTAM SBV1. The 10 homologs are analyzed in Figure 7 (“putHAL [organism]”).
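The hit filtering applied to the homolog search (≥40% sequence identity and ≥90% query coverage) can be sketched as follows. This is not the procedure actually used in the study; it is a minimal illustration that assumes the hits were exported as tab-separated text with three columns (subject ID, percent identity, percent query coverage), for example via the BLAST+ tabular format `-outfmt "6 sseqid pident qcovs"`. The hit names and values below are made up.

```python
import csv
import io


def filter_hits(tsv_text: str, min_identity: float = 40.0, min_query_cov: float = 90.0) -> list:
    """Keep subject IDs whose identity and query coverage meet both cutoffs.

    Expects tab-separated rows: subject ID, % identity, % query coverage.
    """
    kept = []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row:
            continue  # skip empty lines
        subject_id, identity, query_cov = row[0], float(row[1]), float(row[2])
        if identity >= min_identity and query_cov >= min_query_cov:
            kept.append(subject_id)
    return kept


# Hypothetical example rows (not real BLAST output).
example_hits = (
    "hit_A\t62.5\t98\n"   # passes both cutoffs
    "hit_B\t38.0\t99\n"   # identity too low
    "hit_C\t55.1\t72\n"   # query coverage too low
    "hit_D\t40.0\t90\n"   # exactly at both cutoffs, kept
)

kept_hits = filter_hits(example_hits)
print(kept_hits)
```

Both cutoffs are inclusive here, matching the "≥" thresholds stated in the text.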

5.7 | Partial genome construction and comparative genome analyses

The paired-end sequence reads from Var M7V, Aci MG01, and Pse SBV1 were submitted to the GenDB [69]. For each dataset, a de novo assembly was performed by applying the GS de novo Assembler software (release v2.3 with default settings). To facilitate genome ordering, the best reference organisms were identified based on the best 16S homology hit using BLASTn. Obtained contigs were mapped to the reference replicons by means of the computer program r2cat [70]. Subsequently, contigs representing the three target organisms were extracted and stored in different projects. Grouped and ordered contigs were automatically annotated using the genome annotation system GenDB 2.0 [69]. The reference strains from which whole genomes were used for the partial genome constructions are indicated (*) in Table S4. For the Pse SBV1 partial genome construction and comparative analyses, Pf Pf0-1 and Pf Pf-5 were used as the reference genomes. The comparative genome databases were constructed with the EDGAR [22] software using the partial genome output from GenDB. Sequence identity and the availability of a genome were key factors in choosing reference strains for comparative growth and genome studies.

5.8 | Gene context analyses

To acquire the gene contexts of the β-valine enzymes and VarM7V-TA, the annotations from the partial genomes housed in the GenDB server were exported to .csv format and processed in MS Excel after file conversion. To acquire the gene context of the Pg-TA, the annotated genome from the NCBI public database (RefSeq: NZ_ABLD01000001.1) was imported into the Geneious software (v8.1.8) and then exported to MS Excel for further processing. For the Vp-TA gene context, the contig (20,678 bp) harboring the gene was imported into the RAST web-based server (http://rast.nmpdr.org/rast.cgi) for annotation and then exported into MS Excel for editing [71-73]. Figures depicting ORF arrangements (Figures 2 and 5) represent relative aspect ratios.

5.9 | Multiple sequence alignments and phylogenetic tree

The Geneious software (v8.1.8) was used for all MSAs and construction of the phylogenetic tree. The MSA represented in Figure 7a was also used to construct the phylogenetic tree and adjusted manually beforehand to remove unnecessary gaps. Regions surrounding the highlighted areas and key active site residues did not contain gaps. The tree was constructed using the neighbor-joining method, Jukes-Cantor genetic distance model, bootstrap resampling with 1,000 replicates, and a >70% support threshold as the consensus method.
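For readers unfamiliar with the Jukes-Cantor model mentioned above, the correction converts the observed fraction of differing sites p between two aligned sequences into an estimated distance d = -((k-1)/k) ln(1 - k p/(k-1)), where k is the number of character states (4 for nucleotides, 20 for amino acids). The sketch below illustrates this general formula; whether Geneious applies exactly this form to protein alignments is an assumption, and the function names and toy sequences are hypothetical.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable (ungapped) columns")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p: float, n_states: int = 20) -> float:
    """Jukes-Cantor corrected distance for an alphabet of n_states characters."""
    saturation = (n_states - 1) / n_states
    if p >= saturation:
        return math.inf  # distance is undefined beyond saturation
    return -saturation * math.log(1.0 - p / saturation)


# Toy aligned fragments (made up): one difference over four ungapped columns.
p = p_distance("ACD-E", "ACDKF")
print(p, jukes_cantor(p, n_states=20), jukes_cantor(p, n_states=4))
```

The corrected distance is always at least as large as p, because it accounts for multiple substitutions at the same site; the pairwise distances would then feed into neighbor-joining.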

5.10 | Enzyme activity assay

The transamination activity was measured indirectly via a coupled-enzyme assay (Figure S1) using glutamate dehydrogenase (GDH, Sigma). The rate of transaminase-mediated glutamate formation was measured by monitoring NADH formation at 340 nm. After an assay optimization was performed in a multi-cell spectrophotometer (Jasco V-660), final component concentrations were used in substrate screening assays as follows: GDH (15 U/mL), NAD+ (2 mM), PLP (50 μM), amino substrate (2, 8, or 16 mM), α-ketoglutarate (α-KG) (0.1 mM), and an appropriate amount of enzyme. Specific activity (U/mg) measurements for substrate screenings were performed in a 96-well plate format using a Synergy Mx (SMx) Monochromator-Based Multi-Mode Microplate Reader (BioTEK). For each reaction, all components except α-KG were added (150 μL volume) and incubated in a plate thermomixer (27°C, 10 min), after which the plate was uncovered and placed in the SMx plate reader (27°C). The reaction was initiated with addition of α-KG (150 μL) via syringe delivery (~1 min) to give a final volume of 300 μL in 50 mM potassium-phosphate (KPi) buffer (pH 8). The plate was then shaken for 5 sec. Absorbance at 340 nm was recorded every 10 sec for 1 h. Pathlength corrections were taken at the end of each plate read using 900 nm and 977 nm absorbance reads. Slopes were calculated by SMx software (Gen5, v1.10, BioTEK) by calculating the tangents over pre-set time-frames (0-1, 0.5-1.5, 1-2, 1.5-2.5, 2-3, 2.5-3.5, 3-4, 3.5-4.5, 4-5, 4.5-6, 5-7, 7-9, 9-11, 11-15, and 15-20 min). A maximum of 36-40 wells was used in order to avoid long lag times between measurements. To calculate the reaction rates, an average of the slopes within the linear range of the absorbance vs. time plot was used. Slopes with R² values <0.9 were filtered out. All substrate solutions were prepared in 50 mM KPi buffer (pH 8) with adjustments to pH 8 using phosphoric acid as needed. Activity plots were generated in GraphPad Prism 6.
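The conversion from absorbance slopes to specific activity can be sketched as below. This is a hedged illustration, not the Gen5 calculation itself. It assumes the standard NADH extinction coefficient at 340 nm (6,220 M⁻¹ cm⁻¹), a water constant of 0.18 for the 977 nm minus 900 nm pathlength correction, and the usual definition of 1 U as 1 µmol product per minute; all function names and numeric inputs are hypothetical.

```python
EPSILON_NADH_340 = 6220.0  # M^-1 cm^-1, standard literature value (assumption)
WATER_CONSTANT = 0.18      # A977 - A900 for 1 cm of water (assumed instrument constant)


def linear_fit(times, values):
    """Least-squares slope and R^2 of values vs. times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    ss_res = sum((v - (intercept + slope * t)) ** 2 for t, v in zip(times, values))
    ss_tot = sum((v - mean_v) ** 2 for v in values)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return slope, r_squared


def pathlength_cm(a977: float, a900: float) -> float:
    """Estimate the optical pathlength of a well from the water absorbance peak."""
    return (a977 - a900) / WATER_CONSTANT


def specific_activity(slopes_with_r2, path_cm, volume_ml, enzyme_mg, min_r2=0.9):
    """Average the slopes (A/min) that pass the R^2 filter and convert to U/mg."""
    good = [slope for slope, r2 in slopes_with_r2 if r2 >= min_r2]
    if not good:
        raise ValueError("no slopes passed the R^2 filter")
    mean_slope = sum(good) / len(good)                      # absorbance units per min
    molar_rate = mean_slope / (EPSILON_NADH_340 * path_cm)  # mol/L per min
    umol_per_min = molar_rate * 1e6 * (volume_ml / 1000.0)  # µmol per min in the well
    return umol_per_min / enzyme_mg


# Hypothetical example: one window passes the R^2 filter, one is rejected.
slope, r2 = linear_fit([0, 1, 2, 3], [0.10, 0.20, 0.30, 0.40])
activity = specific_activity([(0.0622, 0.99), (0.5, 0.5)], path_cm=1.0,
                             volume_ml=0.3, enzyme_mg=0.003)
print(slope, r2, activity)
```

With these made-up numbers, a slope of 0.0622 A/min in a 300 µL reaction containing 3 µg of enzyme corresponds to roughly 1 U/mg.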

Supplementary information

Figure S1 | Coupled-enzyme assay with TA and GDH. The transaminase (TA) reaction was measured indirectly by monitoring NADH formation at 340 nm from the coupled reaction with glutamate dehydrogenase (GDH).

Table S1 | De novo gene assembly statistics

Genomes:                   M7V/H¹        MG01          SBV1²
Total bases from reads     777,288,212   354,796,820   361,771,089
Matched bases              772,640,989   350,972,163   354,769,799
Non-matched bases          4,647,223     3,824,657     7,001,290
Contig count               746           370           996
Partial genome size (bp)³  14,505,630    4,684,633     6,440,520
Coverage⁴                  53x           75x           55x

¹ Subsequent bioinformatics analyses proved that the M7 culture contains genomes of a Variovorax and Herbaspirillum species (M7V and M7H, resp.).
² Previously reported by Otzen et al., 2015
³ Total non-redundant sequence length covered by all contigs
⁴ Indicates fold-excess of sequence length covered by matched reads over the partial genome length
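The coverage values in Table S1 follow from the footnote definition (matched bases divided by partial genome size). A short worked check, using only the numbers from the table (the variable names are illustrative):

```python
# (matched bases, partial genome size in bp) taken from Table S1
table_s1 = {
    "M7V/H": (772_640_989, 14_505_630),
    "MG01": (350_972_163, 4_684_633),
    "SBV1": (354_769_799, 6_440_520),
}

# Fold coverage = matched bases / partial genome size
coverage = {genome: matched / size for genome, (matched, size) in table_s1.items()}
rounded_coverage = {genome: round(value) for genome, value in coverage.items()}

for genome, value in coverage.items():
    print(f"{genome}: {value:.2f}x")
```

Rounded to whole numbers, the ratios reproduce the 53x, 75x, and 55x reported in the table.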

Table S2 | Pairwise Average Nucleotide Identity (ANI) %

Genome        MG01     Ac AAC00-1   Da SPH-1
MG01          99.98
Ac AAC00-1*   83.05    100.00
Da SPH-1      82.87    83.02        100.00

Genome        M7V      Vp EPS       Vp S110
M7V           100.00
Vp EPS*       86.51    100.00
Vp S110       86.99    87.98        99.84

Genome        SBV1     Pf Pf-5      Pf Pf0-1
SBV1          100.00
Pf Pf-5*      84.98    100.00
Pf Pf0-1      86.72    85.00        100.00

*reference genome used for partial genome construction of relevant soil isolate

Table S3 | % sequence ID matrix of putative β-valine-degrading enzymes

CoA ligases

Enzyme        BvaA   Aci BvaA1   Aci BvaA2
Aci BvaA1     31
Aci BvaA2     24     24
Aci BvaA3     0      0           0
Var BvaA      63     30          22

CoA lyases

Enzyme        BvaB1   Aci BvaB1a   Aci BvaB1b   Var BvaB1   BvaB2   Aci BvaB2
Aci BvaB1a    55
Aci BvaB1b    32      32
Var BvaB1     59      60           36
BvaB2         31      30           31           30
Aci BvaB2     26      29           29           28          32
Var BvaB2     28      27           34           28          37      33

Table S4 | Bacterial strains

Strain                     Abbrev.     Cat. No.  Supplier/Ref.                                     Storage Media
Variovorax sp. str. M7V    M7V         n/a       isolated in this study as mixed culture           LB
V. paradoxus str. S110     Vp S110     23568     KCTC (Korea)                                      LB
V. paradoxus str. EPS*     Vp EPS      n/a       Paul Orwin (California State University, U.S.A.)  LB
Acidovorax sp. str. MG01   Aci MG01    n/a       isolated in this study                            LB
A. citrulli AAC00-1*       Ac AAC00-1  n/a       Ron Walcott (University of Georgia, USA)          sent in water/LB
Delftia acidovorans SPH-1  Da SPH-1    14801     DSMZ                                              DSMZ Liquid Media 1
Pseudomonas sp. str. SBV1  Pse SBV1    n/a       Otzen et al., 2015                                LB
P. fluorescens Pf0-1       Pf Pf0-1    n/a       Stuart Levy (Tufts Univ., USA)*                   LB /PseudoMM
P. fluorescens Pf-5*       Pf Pf-5     BAA-477   ATCC*                                             ATCC Medium 18/LB

*reference genome used for partial genome construction of relevant soil isolate

Table S5 | Statistics and features of partial genome constructions

Genome:                         M7V¹          MG01          SBV1
Aligned reads                   6,606,653     6,806,499     6,993,234
Assembled bases                 468,920,403   339,676,556   347,907,871
Contigs                         673           887           1899
Scaffolds                       167           150           315
Bases in scaffolds              8,872,035     4,373,069     5,854,165
Bases not in scaffolds          118,589       190,418       327,987
Coverage                        52.16x        70.65x        50.63x
G + C content [%]               68.93         64.75         58.96
Scaffold size N50               150,845       63,456        36,027
  Largest                       370,018       228,295       110,083
Contig size N50                 32,874        11,547        6,482
  Largest                       142,777       84,75         39,909
No. of CDS [ordered/unordered]  7,920 / 128   4,080 / 216   5,409 / 325

¹ Reads were separated from those originating from symbiont genome of Herbaspirillum M7H.
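Table S5 reports N50 values for scaffolds and contigs. As a reminder of what this statistic means, N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length. A minimal sketch with made-up contig lengths (the function name and the example lengths are hypothetical and unrelated to the assemblies in this study):

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    half_total = sum(ordered) / 2
    running_total = 0
    for length in ordered:
        running_total += length
        if running_total >= half_total:
            return length


# Made-up contig lengths: total = 54, so half = 27; 10 + 9 + 8 reaches 27.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
example_n50 = n50(example_lengths)
print(example_n50)
```

A larger N50 indicates a more contiguous assembly, which is consistent with the scaffold N50 values exceeding the contig N50 values in the table.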

References

1. Spiteller P. β-Amino Acid Biosynthesis. Amin. Acids, Pept. Proteins Org. Chem. Vol.1 – Orig. Synth. Amin. Acids. 2009. p. 119–61.

2. Kudo F, Miyanaga A, Eguchi T. Biosynthesis of natural products containing β-amino acids. Nat. Prod. Rep. 2014;31:1056–73.

3. Walsh CT, O’Brien R V, Khosla C. Nonproteinogenic amino acid building blocks for nonribosomal peptide and hybrid polyketide scaffolds. Angew. Chem. Int. Ed. Engl. 2013;52:7098–124.

4. Wang B, Kang Q, Lu Y, Bai L, Wang C. Unveiling the biosynthetic puzzle of destruxins in Metarhizium species. Proc. Natl. Acad. Sci. 2012;109:1287–92.

5. Liu W. Biosynthesis of the enediyne antitumor antibiotic C-1027. Science. 2002;297:1170–3.

6. Kleinert HD, Luly JR, Marcotte PA, Perun TJ, Plattner JJ, Stein H. Renin inhibitors: Improvements in the stability and biological activity of small peptides containing novel Leu-Val replacements. FEBS Lett. 1988;230:38–42.

7. Kaeberlein T, Lewis K, Epstein SS. Isolating “uncultivable” microorganisms in pure culture in a simulated natural environment. Science. 2002;296:1127–9.

8. Venter JC. Environmental genome shotgun sequencing of the Sargasso Sea. Science. 2004;304:66–74.

9. DeSantis G, Zhu Z, Greenberg WA, Wong K, Chaplin J, Hanson SR, et al. An enzyme library approach to biocatalysis: Development of nitrilases for enantioselective production of carboxylic acid derivatives. J. Am. Chem. Soc. 2002;124:9024–5.

10. Ferrer M, Martínez-Abarca F, Golyshin PN. Mining genomes and “metagenomes” for novel catalysts. Curr. Opin. Biotechnol. 2005;16:588–93.

11. Gabor EM, de Vries EJ, Janssen DB. A novel penicillin acylase from the environmental gene pool with improved synthetic properties. Enzyme Microb. Technol. 2005;36:182–90.

12. Torsvik V. Prokaryotic diversity- magnitude, dynamics, and controlling factors. Science. 2002;296:1064–6.

13. Kim JH, Kyung DH, Yun HD, Cho BK, Kim BG. Screening and purification of a novel transaminase catalyzing the transamination of aryl β-amino Acid from Mesorhizobium sp. LUK. J. Microbiol. Biotechnol. 2006;16:1832–6.

14. Kim J, Kyung D, Yun H, Cho B-K, Seo J-H, Cha M, et al. Cloning and characterization of a novel β-transaminase from Mesorhizobium sp. strain LUK: a new biocatalyst for the synthesis of enantiomerically pure β-amino acids. Appl. Environ. Microbiol. 2007;73:1772–82.

15. Otzen M, Crismaru CG, Postema CP, Wijma HJ, Heberling MM, Szymanski W, et al. Metabolism of β-valine via a CoA-dependent ammonia lyase pathway. Appl. Microbiol. Biotechnol. 2015;99:8987–98.

16. Crismaru CG, Wybenga GG, Szymanski W, Wijma HJ, Wu B, Bartsch S, et al. Biochemical properties and crystal structure of a β-phenylalanine aminotransferase from Variovorax paradoxus. Appl. Environ. Microbiol. 2013;79:185–95.

17. Yang Y, Zhao H, Barrero R a, Zhang B, Sun G, Wilson IW, et al. Genome sequencing and analysis of the paclitaxel-producing endophytic fungus Penicillium aurantiogriseum NRRL 62431. BMC Genomics. 2014;15:69.

18. Han J-I, Spain JC, Leadbetter JR, Ovchinnikova G, Goodwin LA, Han CS, et al. Genome of the root-associated plant growth-promoting bacterium Variovorax paradoxus strain EPS. Genome Announc. 2013;1:e00843-13.

19. Paulsen IT, Press CM, Ravel J, Kobayashi DY, Myers GSA, Mavrodi D V, et al. Complete genome sequence of the plant commensal Pseudomonas fluorescens Pf-5. Nat. Biotechnol. 2005;23:873–8.

20. Richter M, Rosselló-Móra R. Shifting the genomic gold standard for the prokaryotic species definition. Proc. Natl. Acad. Sci. 2009;106:19126–31.

21. Zhang W, Du P, Zheng H, Yu W, Wan L, Chen C. Whole-genome sequence comparison as a method for improving bacterial species definition. J. Gen. Appl. Microbiol. 2014;60:75–8.

22. Blom J, Albaum SP, Doppmeier D, Puhler A, Vorholter F-J, Zakrzewski M, et al. EDGAR: A software framework for the comparative analysis of prokaryotic genomes. BMC Bioinformatics. 2009;10:154.

23. Wilding M, Peat TS, Newman J, Scott C. A β-alanine catabolism pathway containing a highly promiscuous ω-transaminase in the 12-aminododecanate-degrading Pseudomonas sp. strain AAC. Appl. Environ. Microbiol. 2016;82:3846–56.

24. Herrmann G, Selmer T, Jessen HJ, Gokarn RR, Selifonova O, Gort SJ, et al. Two β-alanyl-CoA:ammonia lyases in Clostridium propionicum. FEBS J. 2005;272:813–21.

25. Liao HH, Gokarn RR, Gort SJ, Jessen HJ, Selifonova O. Alanine 2, 3-aminomutase. Google Patents; 2008. p. 65.

26. Steffen-Munsberg F, Vickers C, Kohls H, Land H, Mallin H, Nobili A, et al. Bioinformatic analysis of a PLP-dependent enzyme superfamily suitable for biocatalytic applications. Biotechnol. Adv. 2015;33:566–604.

27. Park E-S, Kim M, Shin J-S. Molecular determinants for substrate selectivity of ω-transaminases. Appl. Microbiol. Biotechnol. 2012;93:2425–35.

28. Watanabe N, Sakabe K, Sakabe N, Higashi T, Sasaki K, Aibara S, et al. Crystal structure analysis of ω-amino acid: pyruvate aminotransferase with a newly developed weissenberg camera and an imaging plate using synchrotron radiation. J. Biochem. 1989;105:1–3.

29. Yonaha K, Toyama S, Yasuda M. Properties of crystalline ω-amino acid: pyruvate aminotransferase of Pseudomonas sp. F-126. Agric. 1977;41:1701–6.

30. Ruzicka FJ, Frey PA. Glutamate 2,3-aminomutase: a new member of the radical SAM superfamily of enzymes. Biochim. Biophys. Acta. 2007;1774:286–96.

31. Rachid S, Krug D, Weissman KJ, Müller R. Biosynthesis of (R)-β-tyrosine and its incorporation into the highly cytotoxic chondramides produced by Chondromyces crocatus. J. Biol. Chem. 2007;282:21810–7.

32. Heberling MM, Wu B, Bartsch S, Janssen DB. Priming ammonia lyases and aminomutases for industrial and therapeutic applications. Curr. Opin. Chem. Biol. 2013;17:250–60.

33. Heberling MM, Masman MF, Bartsch S, Wybenga GG, Dijkstra BW, Marrink SJ, et al. Ironing out their differences: Dissecting the structural determinants of a phenylalanine aminomutase and ammonia lyase. ACS Chem. Biol. 2015;10:989–97.

34. Wybenga GG, Crismaru CG, Janssen DB, Dijkstra BW. Structural determinants of the β-selectivity of a bacterial aminotransferase. J. Biol. Chem. 2012;287:28495–502.

35. Seo J-H, Hwang J-Y, Seo S-H, Kang H, Hwang B-Y, Kim B-G. Computational selection, identification and structural analysis of ω-aminotransferases with various substrate specificities from the genome sequence of Mesorhizobium loti MAFF303099. Biosci. Biotechnol. Biochem. 2012;76:1308–14.

36. Jendresen CB, Stahlhut SG, Li M, Gaspar P, Siedler S, Förster J, et al. Highly active and specific tyrosine ammonia-lyases from diverse origins enable enhanced production of aromatic compounds in Bacteria and Saccharomyces cerevisiae. Appl. Environ. Microbiol. 2015;81:4458–76.

37. Krug D, Müller R. Discovery of additional members of the tyrosine aminomutase enzyme family and the mutational analysis of CmdF. ChemBioChem. 2009;10:741–50.

38. Wu B, Szymański W, Wijma HJ, Crismaru CG, de Wildeman S, Poelarends GJ, et al. Engineering of an enantioselective tyrosine aminomutase by mutation of a single active site residue in phenylalanine aminomutase. Chem. Commun. 2010;46:8157–9.

39. Watts KT, Mijts BN, Lee PC, Manning AJ, Schmidt-Dannert C. Discovery of a substrate selectivity switch in tyrosine ammonia-lyase, a member of the aromatic amino acid lyase family. Chem. Biol. 2006;13:1317–26.

40. Shon M, Shanmugavel R, Shin G, Mathew S, Lee S-H, Yun H. Enzymatic synthesis of chiral γ-amino acids using ω-transaminase. Chem. Commun. 2014;50:12680–3.

41. Mathew S, Bea H, Nadarajan SP, Chung T, Yun H. Production of chiral β-amino acids using ω-transaminase from Burkholderia graminis. J. Biotechnol. 2015;196–197C:1–8.

42. Bea H-S, Park H-J, Lee S-H, Yun H. Kinetic resolution of aromatic β-amino acids by ω-transaminase. Chem. Commun. 2011;47:5894.

43. Wybenga GG, Szymański W, Wu B, Feringa BL, Janssen DB, Dijkstra BW, et al. Structural investigations into the stereochemistry and activity of a phenylalanine-2,3-aminomutase from Taxus chinensis. Biochemistry. 2014;53:3187–98.

44. Cripps RE. The microbial metabolism of acetophenone. Metabolism of acetophenone and some chloroacetophenones by an Arthrobacter species. Biochem. J. 1975;152:233–41.

45. Halden RU, Peters EG, Halden BG, Dwyer DF. Transformation of mono- and dichlorinated phenoxybenzoates by phenoxybenzoate-dioxygenase in Pseudomonas pseudoalcaligenes POB310 and a modified diarylether-metabolizing bacterium. Biotechnol. Bioeng. 2000;69:107–12.

46. Dehmel U, Engesser KH, Timmis KN, Dwyer DF. Cloning, nucleotide sequence, and expression of the gene encoding a novel dioxygenase involved in metabolism of carboxydiphenyl ethers in Pseudomonas pseudoalcaligenes POB310. Arch. Microbiol. 1995;163:35–41.

47. Kabisch M, Fortnagel P. Nucleotide sequence of the metapyrocatechase II (catechol 2,3-oxygenase II) gene mpcII from Alcaligenes eutrophus JMP 222. Nucleic Acids Res. 1990;18:5543.

48. Uchiyama T, Abe T, Ikemura T, Watanabe K. Substrate-induced gene-expression screening of environmental metagenome libraries for isolation of catabolic genes. Nat. Biotechnol. 2005;23:88–93.

49. Dorn E, Hellwig M, Reineke W, Knackmuss H-J. Isolation and characterization of a 3-chlorobenzoate degrading pseudomonad. Arch. Microbiol. 1974;99:61–70.

50. Miyazaki R, Bertelli C, Benaglio P, Canton J, De Coi N, Gharib WH, et al. Comparative genome analysis of Pseudomonas knackmussii B13, the first bacterium known to degrade chloroaromatic compounds. Environ. Microbiol. 2015;17:91–104.

51. Gabor EM, de Vries EJ, Janssen DB. Construction, characterization, and use of small-insert gene banks of DNA isolated from soil and enrichment cultures for the recovery of novel amidases. Environ. Microbiol. 2004;6:948–58.

52. Govarthanan M, Lee SM, Kamala-Kannan S, Oh BT. Characterization, real-time quantification and in silico modeling of arsenate reductase (arsC) genes in arsenic-resistant Herbaspirillum sp. GW103. Res. Microbiol. 2015;166:196–204.

53. Belimov AA, Hontzeas N, Safronova VI, Demchinskaya SV, Piluzza G, Bullitta S, et al. Cadmium-tolerant plant growth-promoting bacteria associated with the roots of Indian mustard (Brassica juncea L. Czern.). Soil Biol. Biochem. 2005;37:241–50.

54. Han J-I, Choi H-K, Lee S-W, Orwin PM, Kim J, LaRoe SL, et al. Complete genome sequence of the metabolically versatile plant growth-promoting endophyte Variovorax paradoxus S110. J. Bacteriol. 2011;193:1183–90.

55. Brooks JP, Edwards DJ, Harwich MD, Rivera MC, Fettweis JM, Serrano MG, et al. The truth about metagenomics: quantifying and counteracting bias in 16S rRNA studies. BMC Microbiol. 2015;15:66.

56. Law A, Boulanger MJ. Defining a structural and kinetic rationale for paralogous copies of phenylacetate-CoA ligases from the cystic fibrosis pathogen Burkholderia cenocepacia J2315. J. Biol. Chem. 2011;286:15577–85.

57. Wilding M, Walsh EFA, Dorrian SJ, Scott C. Identification of novel transaminases from a 12-aminododecanoic acid-metabolizing Pseudomonas strain. Microb. Biotechnol. 2015;8:665–72.

58. Antoine P, Taillieu X, Thonart P. The degradation of L-tyrosine to phenol and benzoate in pig manure. Appl. Biochem. Biotechnol. 1997;63–65:707–17.

59. Fuchs G, Boll M, Heider J. Microbial degradation of aromatic compounds — from one strategy to four. Nat. Rev. Microbiol. 2011;9:803–16.

60. Christenson SD, Liu W, Toney MD, Shen B. A novel 4-methylideneimidazole-5-one-containing tyrosine aminomutase in enediyne antitumor antibiotic C-1027 biosynthesis. J. Am. Chem. Soc. 2003;125:6062–3.

61. Micallef ML, D’Agostino PM, Sharma D, Viswanathan R, Moffitt MC. Genome mining for natural product biosynthetic gene clusters in the Subsection V cyanobacteria. BMC Genomics. 2015;16:669.

62. Calabrese JC, Jordan DB, Boodhoo A, Sariaslani S, Vannelli T. Crystal structure of phenylalanine ammonia lyase: Multiple helix dipoles implicated in catalysis. Biochemistry. 2004;43:11403–16.

63. Schafhauser T, Kirchner N, Kulik A, Huijbers MME, Flor L, Caradec T, et al. The cyclochlorotine mycotoxin is produced by the nonribosomal peptide synthetase CctN in Talaromyces islandicus (“Penicillium islandicum”). Environ. Microbiol. 2016;18:3728–41.

64. Van Lanen SG, Oh T-J, Liu W, Wendt-Pienkowski E, Shen B. Characterization of the maduropeptin biosynthetic gene cluster from Actinomadura madurae ATCC 39144 supporting a unifying paradigm for enediyne biosynthesis. J. Am. Chem. Soc. 2007;129:13082–94.

65. Neidhardt FC, Bloch PL, Smith DF. Culture medium for enterobacteria. J. Bacteriol. 1974;119:736–47.

66. Poelarends GJ, Wilkens M, Larkin MJ, van Elsas JD, Janssen DB. Degradation of 1,3-dichloropropene by Pseudomonas cichorii 170. Appl. Environ. Microbiol. 1998;64:2931–6.

67. Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, et al. Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 2012;28:1647–9.

68. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J. Mol. Biol. 1990;215:403–10.

69. Meyer F, Goesmann A, McHardy AC, Bartels D, Bekel T, Clausen J, et al. GenDB--an open source genome annotation system for prokaryote genomes. Nucleic Acids Res. 2003;31:2187–95.

70. Husemann P, Stoye J. r2cat: synteny plots and comparative assembly. Bioinformatics. 2010;26:570–1.

71. Brettin T, Davis JJ, Disz T, Edwards RA, Gerdes S, Olsen GJ, et al. RASTtk: A modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes. Sci. Rep. 2015;5:8365.

72. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics. 2008;9:75.

73. Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res. 2014;42:D206–14.