Proc. Natl. Acad. Sci. USA
Vol. 83, pp. 7405-7409, October 1986
Genetics

Promoters selected from random DNA sequences
(mutagenesis/evolution)

MARSHALL S. Z. HORWITZ AND LAWRENCE A. LOEB
The Joseph Gottstein Memorial Cancer Research Laboratory, Department of Pathology SM-30, University of Washington, Seattle, WA 98195

Communicated by Earl P. Benditt, July 3, 1986

Abbreviations: tetr, tetracycline resistance gene; bp, base pair(s).

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

ABSTRACT We have selected a group of Escherichia coli promoters from random DNA sequences by replacing 19 base pairs at the -35 promoter region of the tetracycline resistance gene tetr of the plasmid pBR322. Substitution of 19 base pairs with chemically synthesized random sequences results in a maximum of 4^19 (about 3 × 10^11) possible replacement sequences. From a population of about 1000 bacteria harboring plasmids with these random substitutions, tetracycline selection has revealed several functional -35 promoter sequences. These promoters have retained only partial homology to the -35 promoter consensus sequence. In three of these promoters, the consensus alignment shifts 10 nucleotides downstream, allowing the RNA polymerase to recognize another Pribnow box from within the original pBR322 sequence. Two of the sequences promote transcription more strongly than the native promoter. This technique may have application for the selection of additional DNA sequences with varied biological activity.

Comparison of known RNA polymerase binding sites of different genes of Escherichia coli reveals two highly conserved promoter elements centered at about -10 and -35 base pairs (bp) from the start of transcription (1-5). A consensus sequence of the nontemplate strand includes "TATAAT" from positions -13 to -8, the "Pribnow box," and "TTGACA" from positions -36 to -31, the "recognition" site, with 17 bp between the two (6). The involvement of each nucleotide in the initiation of transcription has been inferred largely from an analysis of mutations. Rare mutations that increase transcription, "up mutations," usually increase homology with the consensus sequence and spacing, while the more common mutations that decrease transcription, "down mutations," usually decrease homology with the consensus sequence and spacing (7).

Targeted random mutagenesis has been used to define prokaryotic (8) and eukaryotic (9) translation initiation signals. In these experiments base substitutions were limited in number to no more than three and were identified before activity was manually assayed. This strategy limits the search for functional sequences to derivatives of those already known.

We report here a technique that has allowed us to create several unusual promoter recognition sequences. We have replaced the promoter recognition site of a plasmid-borne selectable marker with chemically synthesized random sequences of 19 bp, such that every plasmid molecule contains a unique, randomly chosen sequence. When the plasmids are introduced into competent cells, growth selection identifies those sequences with promoter activity. When all four DNA bases are present in the 19-bp random stretch, there are 4^19 (about 3 × 10^11) different possible replacement sequences. We have used antibiotic selection to identify several unusual promoters from a population of about 1000 such bacteria, heterogeneous in DNA sequence at the promoter recognition site of the tetracycline resistance gene of pBR322 (tetr). This technique may be generalized for the selection of other genetic regulatory elements or protein coding sequences.

MATERIALS AND METHODS

Oligonucleotides. Oligonucleotides were synthesized by the phosphoramidite method with an Applied Biosystems 380A DNA synthesizer and purified by thin layer chromatography (10). Random sequences were synthesized with equimolar mixtures of phosphoramidites.

Plasmid Constructions. Restriction endonucleases and enzymes of nucleic acid metabolism were obtained commercially and were used according to the suppliers' instructions. Standard molecular cloning methods were employed (11).

A plasmid with a deletion in the promoter recognition site, pBdEC, was constructed by digestion of pBR322 with EcoRI and Cla I, by extension of the 5' overhangs with the large fragment of DNA polymerase I, and by blunt-end recircularization with T4 DNA ligase. The plasmid sequences were confirmed by DNA sequence analysis.

The plasmid populations containing random substitution with all four bases, pRAN4, or just three bases (cytosine, guanine, and thymine), pRAN3, were constructed by hybridizing 4 x 10-' pM of primer 8-mer, 5' GGATCGAT 3', to 2 x 10- pM of template 35-mer of mixed sequence, either 5' CCGAATTC(A,C,G,T)19ATCGATCC 3' or 5' CCGAATTC(C,G,T)19ATCGATCC 3', respectively, in 90 mM NaCl/15 mM Tris·HCl, pH 7.9/1 mM MgCl2 at 65°C for 5 min and 57°C for 90 min. The primed template was extended with the large fragment of DNA polymerase I and digested with an excess of EcoRI and Taq I. The resulting product was ligated into EcoRI- and Cla I-digested pBR322 that had been treated with bacterial alkaline phosphatase and purified by agarose gel electrophoresis.

One of the new promoter sequences was duplicated in a second plasmid to rule out the possibility of mutation outside the promoter region. The insert of plasmid pBT9 was reconstructed in the plasmid pBT9R by hybridizing 105 pM 26-mer (5' AATTCTTGGGCGCGCGTCGGCTTGAT 3') to 10-5 pM 24-mer (5' CGATCAAGCCGACGCGCGCCCAAG 3') using the conditions described above. Incubation with T4 polynucleotide kinase in the presence of ATP added 5'-phosphoryl termini. The resulting product contains EcoRI and Cla I sticky ends and was ligated into similarly digested pBR322. The expected plasmid sequences were confirmed by DNA sequence analysis.

Competent DH5 and DH5.1 E. coli (endA1, recA1), prepared by the method of Hanahan (12) and purchased from Bethesda Research Laboratories and Vector Cloning Systems (San Diego, CA), respectively, were used for DNA transformation. Transformants were grown in Luria-Bertani medium (LB) (with 0.1% glucose) for 60 min (approximately two doublings) prior to antibiotic selection.

DNA Sequence Analysis. Rapid plasmid DNA preparation was by the alkaline lysis method (11). DNA sequencing was


by dideoxy chain-termination (13) using double-stranded pBR322 templates (14) from these rapid preparations. Both DNA strands were sequenced.

Tetracycline Resistance Determination. Tetracycline resistance was determined by the 50% efficiency of plating (EOP50) method (15) on LB agar (with 0.1% glucose).

RNA Gel Blot Analysis. Nucleic acids were purified from E. coli (16) and electrophoresed on 1% agarose/2.2 M formaldehyde (11). Hybridization was on GeneScreenPlus membranes (New England Nuclear) following the manufacturer's instructions. Densitometry of the autoradiogram was performed on a Hoefer GS300 scanning densitometer.

RESULTS

The Promoter Recognition Sequence Is Necessary for Transcription. The location of the tetr promoter has been deduced from promoter consensus sequence homology in pBR322 (17), deletion mutations (18), and electron microscopic mapping (19). The transcription initiation site has been identified by S1-nuclease mapping (16). The gene encodes a single, noninducible, 43.5-kDa polypeptide (20) that functions at the cell membrane to block accumulation of the antibiotic (21).

We first deleted the promoter recognition sequence in tetr to confirm the importance of this sequence in the transcription of that gene. EcoRI and Cla I restriction sites flank the -35 sequence in tetr (Fig. 1). We constructed pBdEC, a plasmid with a 22-bp deletion extending from position -42 in the EcoRI site to position -21 in the Cla I site. E. coli DH5.1 was found to be resistant to a tetracycline concentration of 2 µg/ml in the absence of a plasmid, to 40 µg/ml when harboring pBR322, and to just 4 µg/ml when harboring pBdEC, the plasmid with the promoter recognition site deletion (Fig. 2). RNA gel blot analysis shows an absence of the tetr transcript in cells containing pBdEC (Fig. 3, see below). Therefore, the promoter recognition sequence is necessary for tetr transcription.

Replacement of the Promoter Recognition Sequence with Random Sequences Yields Promoter Substitutions. The rationale of the experiment is explained in Fig. 1. We excised the promoter recognition sequence of pBR322 by digestion with EcoRI and Cla I and purification of the larger fragment by agarose gel electrophoresis. Next, an oligonucleotide template containing defined sequence termini sandwiching a 19-base random stretch was hybridized to a defined sequence primer, copied with the large fragment of DNA polymerase I in the presence of all four dNTPs, and digested with EcoRI and Taq I to produce a heterogeneous restriction fragment population. We then ligated the restriction fragments into the parent plasmid to produce a population of plasmids, pRAN4, containing random 19-bp promoter substitutions. For this population, there is a maximum of 4^19 different possible replacement sequences. The plasmid population was used to transform E. coli. Growth in ampicillin selects for bacteria containing plasmids, while growth in tetracycline selects for plasmids with functional tetr promoters.
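The template design described above can be sketched in a few lines of Python. This is only an illustration: the flank and primer sequences are those given in Materials and Methods, while the function names, the uniform sampling of bases, and the fixed seed are assumptions made for the sketch and are not part of the original procedure.

```python
import random

LEFT_FLANK = "CCGAATTC"   # 5' end of the 35-mer template; contains the EcoRI site GAATTC
RIGHT_FLANK = "ATCGATCC"  # 3' end of the template; contains the Cla I site ATCGAT
PRIMER = "GGATCGAT"       # 8-mer primer from Materials and Methods

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def random_template(alphabet, rng, n_random=19):
    """Build one template: fixed flanks around n_random bases drawn uniformly (assumed)."""
    core = "".join(rng.choice(alphabet) for _ in range(n_random))
    return LEFT_FLANK + core + RIGHT_FLANK


def library_size(alphabet, n_random=19):
    """Upper bound on the number of distinct random stretches."""
    return len(alphabet) ** n_random


rng = random.Random(0)  # fixed seed, only so the sketch is repeatable
pran4_template = random_template("ACGT", rng)  # four-base population
pran3_template = random_template("CGT", rng)   # adenine-deficient population

# The primer pairs with the 3' flank of every template, whatever the random core is.
primer_pairs = reverse_complement(RIGHT_FLANK) == PRIMER

print(len(pran4_template), primer_pairs)
print(library_size("ACGT"), library_size("CGT"))
```

The two library sizes printed here are the exact values behind the rounded figures quoted in the text (about 3 × 10^11 for four bases and about 10^9 for three).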

Specifically, we transformed DH5.1 with pRAN4 (Table 1). Ampicillin selection yielded 125 colonies. To determine the nature of the insertions present, plasmids from 10 of these colonies were characterized. Plasmid sizes were compared by agarose gel electrophoresis (not shown), and in each plasmid about 200 bp centered at -35 were sequenced. Two plasmids are identical to pBR322, and three plasmids contain deletions bounded by the EcoRI and Cla I sites (sequences not shown); these are assumed to be part of a background of about 50% of the vectors that escaped either digestion with restriction enzymes or ligation of the insert. The other five plasmids derived from pRAN4 (pBB3, pBB5, pBB9, pBB10, and pBB13) contain promoter substitutions of 10-23 bp (Fig. 2). Of the total of 77 bases substituted among these five plasmids, the average insert length is 15, and the composition is 22% adenine, 27% cytosine, 34% guanine, and 17% thymine.

Selection of pRAN4 transformants in both ampicillin and tetracycline yielded 28 colonies. Plasmids from all 28 of these colonies were characterized to determine whether any contain promoter replacement sequences. Twenty-seven plasmids are identical to pBR322. These are from the background of unmodified vectors; it is improbable that a promoter sequence identical to pBR322 would be present within the small subpopulation of all random sequences selected here. The other plasmid, pBG8, contains a 38-bp promoter substitution (Fig. 2). Although there is sequence length heterogeneity, the average length of all pRAN4 substitutions is 19 bases, in agreement with the target length. We conclude that about half of the 125 ampicillin-resistant colonies contain plasmids with promoter substitutions and that about 1 of these 63 colonies is also tetracycline resistant, suggesting that, very roughly, 2% of the 3 × 10^11 possible random sequences present in this construction may duplicate promoter recognition site activity.

[Figure 1 schematic: the tetr promoter region of pBR322 (ampr, ori, tetr), with the EcoRI and Cla I sites flanking the -35 element and the -10 element downstream. The plasmid is digested with EcoRI/Cla I and gel purified; a 19-bp stretch of unspecified bases (N) is joined in with DNA ligase to give pRAN4, followed by amp or amp/tet selection.]

FIG. 1. Schematic of experimental strategy. The tetr promoter sequence is indicated (17). Matches with the -10 and -35 promoter consensus sequence are boxed (6). An additional "anti-tet" promoter (data not shown) initiates transcription from within the tetr Pribnow box, transcribing in the opposite direction, toward the ampicillin resistance gene (19). Transfection was into E. coli DH5.1. amp, ampicillin; tet, tetracycline; N, unspecified bases.


[Figure 2 alignment: the individual aligned sequences are not legible in this copy. Columns: Sequence, with the -35 and -10 positions marked; Tetracycline resistance (µg/ml). Rows, in order: consensus (-35: tcTTGACat; -10: ttgTAtAaT); native (pBR322), 40; A,C,G,T substitution (pRAN4): pBG8, 40; pBB3, 2; pBB5, 2; pBB9, 6; pBB10, 2; pBB13, 4; C,G,T substitution (pRAN3): pBT9, 50; pBT21, 60; pBTR3, 30; pBA1, 6; pBA2, 6; pBA3, 4; pBA4, 2; pBA5, 2; pBA6, 2; pBA7, 2; pBA8, 10; pBA9, 2; pBA10, 4; deletion (pBdEC), 4.]

FIG. 2. Promoter sequence substitutions. The consensus promoter sequence (6) is on the top line. The most strongly conserved bases are in upper case. The sequence of pBR322 (7) is on the second line. The lines following list promoter substitutions. Spaces have been inserted before and after the pBR322 promoter and the promoter substitutions to maximize alignment with the consensus. Matches with the consensus are boxed. Dashes denote positions identical with pBR322. Base matches among the dashes indicate positions within pBR322, outside the substitution, that together with the substitution allow for -35 promoter consensus match. The last line lists the sequence of the promoter deletion plasmid pBdEC. Two plasmids, pBTR3 and pBA8, possess downstream insertions, a probable artifact of the method of plasmid construction, here indicated as subscripts among the dashes. Three plasmids, pBA7, pBT9, and pBT21, have two sites of potential alignment with the consensus; although not aligned, bases in this alternate site are underlined. Tetracycline resistance was determined by the 50% efficiency of plating (EOP50) method (15).
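The kind of alignment summarized in Fig. 2 can be illustrated with a simple ungapped scan for the best match to the -35 consensus. This is a minimal sketch, not the authors' alignment procedure. The pBT9 region is assembled from the 26-mer given in Materials and Methods; the native pBR322 region between the EcoRI and Cla I sites is written from the published pBR322 sequence and should be treated as an assumption here, as should the function names and the offset convention (offset 0 is the G of the EcoRI site).

```python
MINUS35 = "TTGACA"  # consensus recognition site
MINUS10 = "TATAAT"  # consensus Pribnow box

# EcoRI site + 19-bp segment + Cla I site, nontemplate strand, 5' to 3'.
NATIVE = "GAATTC" + "TCATGTTTGACAGCTTATC" + "ATCGAT"  # assumed pBR322 sequence
PBT9 = "GAATTC" + "TTGGGCGCGCGTCGGCTTG" + "ATCGAT"    # from the pBT9R 26-mer


def count_matches(window, consensus):
    """Number of positions at which window and consensus agree."""
    return sum(a == b for a, b in zip(window, consensus))


def best_match(seq, consensus):
    """Return (score, offset) of the first highest-scoring ungapped window."""
    k = len(consensus)
    best_score, best_offset = -1, -1
    for offset in range(len(seq) - k + 1):
        score = count_matches(seq[offset:offset + k], consensus)
        if score > best_score:  # strict '>' keeps the leftmost of any ties
            best_score, best_offset = score, offset
    return best_score, best_offset


native_score, native_offset = best_match(NATIVE, MINUS35)
pbt9_score, pbt9_offset = best_match(PBT9, MINUS35)

print("native:", native_score, "of 6 at offset", native_offset)
print("pBT9:  ", pbt9_score, "of 6 at offset", pbt9_offset)
print("shift: ", pbt9_offset - native_offset, "nucleotides downstream")
```

Under these assumptions the best -35 match in pBT9 lies 10 nucleotides downstream of the native TTGACA and spans the junction between the substitution and the Cla I site, which agrees with the caption's remark that bases outside the substitution can contribute to the consensus match.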

To verify that our synthetic restriction fragments were responsible for the observed sequence heterogeneity at the promoter recognition region and to check for allowable sequence diversity, we prepared a population of plasmids, pRAN3, deficient in adenine throughout the randomly substituted segment. The plasmid population pRAN3 contains promoter recognition site substitutions of random 19-base runs of cytosine, guanine, and thymine. For this population, there is a maximum of 3^19 (about 10^9) different possible replacement sequences. We transformed DH5.1 with pRAN3 (Table 1). Ampicillin selection yielded 887 colonies. Plasmids from 10 of these colonies were characterized. All 10 of these plasmids, pBA1-pBA10, contain replacement insertions of 15-29 bp (Fig. 2), implying that >90% of the plasmids in pRAN3 contain promoter substitutions. Of the total of 203 bases substituted among these 10 plasmids, the average insert length is 20, and the composition is 0% adenine, 32% cytosine, 42% guanine, and 26% thymine. This composition bias indicates that the sequence heterogeneity results from ligation of the random insert, not from a cellular process. Selection in both ampicillin and tetracycline yielded 23 colonies. Plasmids from all 23 of these colonies were characterized. Twenty-one plasmids are identical to pBR322. The presence of adenine in these promoter sequences indicates that these are from the background of unmodified vectors and are not present within the small subpopulation of all random sequences selected here. The other two plasmids, pBT9 and pBT21, contain promoter substitutions of 19 and 17 bases, respectively (Fig. 2). An additional tetracycline-resistant colony containing a plasmid, pBTR3 (Fig. 2), was detected by replica plating the colonies growing on the ampicillin media onto ampicillin/tetracycline media (data not shown). Although there is sequence length heterogeneity, the average length of all pRAN3 substitutions is 20 bases, close to the target length of 19. We conclude that >90% of the 887 ampicillin-resistant colonies contain promoter substitutions and that about two of these colonies are also tetracycline resistant, suggesting that, very roughly, 0.2% of the 10^9 possible random sequences present in this construction may duplicate promoter recognition site activity. The absence of adenine reduces the frequency at which promoter recognition sites may be selected from random sequences.

FIG. 3. RNA gel blot analysis to quantify tetr transcription. Nucleic acid (5 µg per lane) purified from E. coli DH5.1 harboring the indicated plasmid and 100 ng of pBR322 DNA were electrophoresed through 1% agarose/2.2 M formaldehyde, transferred to a hybridization membrane, and probed with the 32P-labeled, nick-translated, 787-bp EcoRV-Nru I restriction fragment (11) from the coding region of tetr. A HindIII digest of bacteriophage λ DNA was used as a size marker (data not shown). The 1.4-kilobase (kb) band is indicated.

Table 1. Transformation with random plasmid populations

         amp selection                            amp and tet selection
         Colonies,  Sequences, no.                Colonies,  Sequences, no.
DNA      no.        pBR322  Deletions  Inserts    no.        pBR322  Deletions  Inserts
pRAN4    125        2       3          5          28         27      0          1
pRAN3    887        0       0          10         23         21      0          2

E. coli DH5.1 was transformed with 410 ng of DNA from the indicated plasmid population at an efficiency of 10^3-10^4 colonies per µg of DNA with ampicillin (amp) selection and 10-10^2 colonies per µg of DNA with ampicillin and tetracycline (tet) selection. The number of colonies produced is listed. For transformation with each of the two plasmid populations, plasmids from 10 ampicillin-resistant colonies have been sequenced in the promoter region for tetr. The number of plasmids that are pBR322 background, contain promoter deletions, or contain promoter replacement insertions is indicated. The insertion sequences are listed in Fig. 2. All transformed colonies were grown on LB agar. Ampicillin concentration was 50 µg/ml; tetracycline concentration was 12.5 µg/ml. Equal amounts of DNA were used for both antibiotic selections. No transformants were observed in the absence of DNA. pRAN4, random plasmid population containing all four bases; pRAN3, random plasmid population containing just three bases (cytosine, guanine, and thymine).
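The rough frequencies quoted in the text (about 2% for pRAN4 and about 0.2% for pRAN3) follow from the colony counts in Table 1. The arithmetic can be sketched as below; the function name is hypothetical, and the fractions 0.5 and 0.9 are the "about half" and ">90%" estimates from the sequenced samples, so the results are order-of-magnitude figures only.

```python
def active_fraction(amp_colonies, fraction_with_insert, tet_resistant_inserts):
    """Estimate the fraction of random inserts that act as promoters.

    amp_colonies: colonies under ampicillin selection (Table 1)
    fraction_with_insert: estimated share of those colonies carrying a substitution
    tet_resistant_inserts: substitution plasmids recovered under amp + tet selection
    """
    colonies_with_insert = amp_colonies * fraction_with_insert
    return tet_resistant_inserts / colonies_with_insert


# Equal amounts of DNA were used for both selections, so the counts are comparable.
pran4 = active_fraction(125, 0.5, 1)  # about half of 125 colonies; 1 insert (pBG8)
pran3 = active_fraction(887, 0.9, 2)  # >90% of 887 colonies; 2 inserts (pBT9, pBT21)

print(f"pRAN4: {pran4:.1%}")
print(f"pRAN3: {pran3:.2%}")
print(f"ratio: {pran4 / pran3:.1f}")
```

With only one and two positive colonies, respectively, these estimates carry large sampling uncertainty, which is why the text qualifies them as very rough.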

Tetracycline Resistance Correlates with Transcription. DH5.1 bearing plasmids with promoter recognition site replacements was tested for resistance to tetracycline (Fig. 2). The range of tetracycline resistance conferred by the plasmids selected for ampicillin resistance only (pBB3, pBB5, pBB9, pBB10, pBB13, and pBA1-pBA10) is between 2 and 10 µg/ml. For the plasmids selected for both ampicillin and tetracycline resistance (pBG8, pBT9, pBT21, and pBTR3), the range is between 30 and 60 µg/ml, while pBR322 is resistant to 40 µg/ml. Both pBT9 and pBT21 are resistant to concentrations that inhibit pBR322, 50 and 60 µg/ml, respectively.

An RNA gel blot was used to quantify transcription from tetr (Fig. 3). The 787-bp EcoRV-Nru I restriction fragment from the protein coding region of tetr was hybridized to cellular nucleic acids. The absence of hybridization in DH5.1 with no plasmid reveals the specificity of the probe. Hybridization of nucleic acids from DH5.1 harboring pBR322 detects three bands. The upper two are plasmid DNA, while the bottom band contains a tetr transcript of about 1.4 kilobases maximum length. DH5.1 containing either the tetracycline-sensitive promoter deletion plasmid, pBdEC, or the tetracycline-sensitive promoter substitution plasmid, pBA5, reveals the absence of tetr transcripts. DH5.1 harboring the tetracycline-resistant promoter substitutions, pBG8, pBT9, pBT21, and pBTR3, exhibits various levels of tetr transcripts. The amount of tetr transcript for each plasmid was quantified by densitometry. To exclude possible differences in plasmid copy number, transcript levels have been normalized to plasmid DNA concentrations by taking the quotient of the values for the 1.4-kilobase band and the plasmid DNA bands. There is a direct correlation between levels of tetracycline resistance and tetr transcript (Table 2). Therefore, the phenotype provides a good estimate of promoter strength.

The Promoter Substitutions Are Responsible for Transcription. It is possible that the promoter substitution is not the cause of tetracycline resistance. To rule out the possibility that a chromosomal rather than a plasmid mutation is responsible for this phenotype, we performed secondary transformations using plasmids purified from the primary tetracycline-resistant transformants. The plasmids pBG8, pBT9, pBT21, and pBTR3 each transformed E. coli DH5 under tetracycline selection with an efficiency about equal to pBR322 (data not shown). We, therefore, conclude that tetracycline resistance is conferred by the plasmid, not by a mutation in the host bacteria.

To unambiguously establish that the promoter replacements are the elements of the plasmid responsible for tetracycline resistance, we reconstructed one of the plasmids. We used oligonucleotides of defined sequence to duplicate the promoter substitution in pBT9. We synthesized two oligonucleotides, each identical to one of the two DNA strands of the promoter recognition region. The two oligonucleotides were annealed and ligated into pBR322. A synthetic 26-mer with the same sequence as pBT9 in the nontemplate strand extending from the EcoRI restriction site to the Cla I restriction site was hybridized to a complementary synthetic 24-mer with an equivalent sequence for the template strand. This insert was ligated into EcoRI- and Cla I-digested pBR322 to produce pBT9R, a plasmid identical in promoter sequence to pBT9. In the absence of a plasmid, DH5 was found to be resistant to a tetracycline concentration of 2 µg/ml. When harboring pBT9R, the reconstruction of pBT9, DH5 is resistant to a tetracycline concentration of 50 µg/ml, the same level of resistance that the original pBT9 confers on DH5.1. Therefore, tetracycline resistance is conferred by the promoter substitutions, not by a mutation elsewhere on the plasmid.
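That the two synthetic oligonucleotides anneal to give EcoRI- and Cla I-compatible ends can be checked directly from the sequences given in Materials and Methods. The sketch below is illustrative only: the helper names are hypothetical, and the 4-nucleotide top-strand overhang is supplied as an input on the assumption that EcoRI leaves a 5'-AATT extension.

```python
TOP_26MER = "AATTCTTGGGCGCGCGTCGGCTTGAT"   # nontemplate strand, 5' to 3'
BOTTOM_24MER = "CGATCAAGCCGACGCGCGCCCAAG"  # template strand, 5' to 3'

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def sticky_ends(top, bottom, top_overhang_len):
    """Return the two 5' overhangs if the strands pair perfectly, else None."""
    bottom_rc = reverse_complement(bottom)  # bottom strand read in top-strand sense
    paired_top = top[top_overhang_len:]     # part of the top strand that must pair
    if not bottom_rc.startswith(paired_top):
        return None
    top_overhang = top[:top_overhang_len]
    # Whatever the bottom strand has beyond the paired region is its 5' overhang.
    bottom_overhang = bottom[:len(bottom_rc) - len(paired_top)]
    return top_overhang, bottom_overhang


ends = sticky_ends(TOP_26MER, BOTTOM_24MER, top_overhang_len=4)
print(ends)
```

The duplex has a 22-bp paired core, a 5'-AATT extension on the top strand, and a 5'-CG extension on the bottom strand, the latter being the overhang produced by both Cla I and Taq I.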

DISCUSSION

We inserted random, chemically synthesized DNA sequences into plasmids and have selected from a heterogeneous population those sequences that are biologically active. From a population of about 1000 plasmids, each different and random in sequence for about 19 bp in the promoter recognition site of tetr, four additional promoters have been identified by antibiotic selection, two of which promote transcription more strongly than the native promoter. We have demonstrated that the promoter substitutions were present within the initial population of random sequences by synthesizing sequences of biased composition, deficient in adenine; new promoter sequences were recovered at a lower frequency and did not contain adenine. We have established that the promoter substitutions are responsible for transcription initiation by reconstructing one of the new promoters through chemical synthesis of that DNA sequence; reinsertion of this sequence into a plasmid with a deletion in the promoter recognition site restores biological activity.

Table 2. Correlation of phenotype and transcription

           Relative value
           Tetracycline    tetr
Plasmid    resistance      transcript
pBT21      1.5             1.27
pBT9       1.25            1.17
pBG8       1.00            1.01
pBR322     1               1
pBTR3      0.75            0.290
pBA5       0.05            0
pBdEC      0.05            0

Plasmids were grown in E. coli DH5.1. Tetracycline resistance was determined by the 50% efficiency of plating (EOP50) method (15), and the tetr transcript levels were quantified by densitometry of the RNA gel blot (Fig. 3). Transcript levels were normalized to plasmid DNA concentration. All values are expressed relative to those for pBR322.
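The relationship summarized in Table 2 can be made quantitative with a correlation coefficient. The sketch below uses the relative values from Table 2; the choice of a Pearson coefficient, the function names, and the example arguments to the normalization helper are assumptions for illustration, since the text reports only that the correlation is direct.

```python
from math import sqrt

# (plasmid, relative tetracycline resistance, relative tetr transcript), from Table 2
TABLE2 = [
    ("pBT21", 1.5, 1.27),
    ("pBT9", 1.25, 1.17),
    ("pBG8", 1.00, 1.01),
    ("pBR322", 1.0, 1.0),
    ("pBTR3", 0.75, 0.290),
    ("pBA5", 0.05, 0.0),
    ("pBdEC", 0.05, 0.0),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def relative_transcript(band_1_4kb, plasmid_bands, reference_ratio):
    """Normalize the 1.4-kb band to plasmid DNA, then express relative to a reference.

    reference_ratio is the same quotient measured for pBR322 (hypothetical inputs).
    """
    return (band_1_4kb / plasmid_bands) / reference_ratio


resistance = [row[1] for row in TABLE2]
transcript = [row[2] for row in TABLE2]
r = pearson(resistance, transcript)
print(f"Pearson r = {r:.2f}")
```

With seven plasmids, two of which contribute zero transcript, the coefficient should be read as a summary of Table 2 rather than as a statistical test.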

Ligation of target DNA upstream from a promoterless marker gene is an established strategy for the identification of promoters. The tetracycline resistance gene of pBR322 has been used as a promoter probe (22-24). In these experiments the promoter typically was inactivated by deletion within the recognition site. Restriction digests of genomic DNA from various prokaryotes and eukaryotes were ligated into the inactive promoter as a test of their potential to restore function. Depending upon the organismal source of the DNA, insertions with promoter activity were selected at a frequency of 0.2-33% from among all recombinants. In the absence of DNA sequence data this high frequency of promoter selection has been variously explained as either verification of functional promoters (15) or as fortuitous restoration of the deleted portion of the native promoter (25). The latter explanation implies a lack of stringency of RNA polymerase in choosing promoter DNA sequences.

While all four of the promoter recognition regions reported here retain homology with the consensus sequence, in three of those sequences, pBG8, pBT9, and pBT21, the consensus alignment shifts 10 nucleotides downstream (Fig. 2). In these three promoters transcription initiation also shifts 10 nucleotides downstream (data not shown). The RNA polymerase, therefore, recognizes another Pribnow box from within the original pBR322 sequence. A candidate Pribnow box, 10 nucleotides downstream from the original Pribnow box, is TGCGGTAGTTT, where agreement with the consensus sequence has been underlined. Supportive evidence comes from a mutation in lac in which a base substitution at +1 activates a latent Pribnow box to initiate transcription at +13 (26).

The functional tetr promoter in pBTR3 has conserved the consensus spacing of 17 nucleotides between the -35 and -10 promoter elements. However, the nonfunctional tetr promoter in pBA7 retains substantial homology with the consensus sequence but not the spacing. These promoter mutations show the significance of the spacing between the two promoter elements.

From among more than 150 known promoters and promoter mutations (7), the only up mutation that decreases homology to the consensus sequence is found in the -35 region of ara (27), and there are no down mutations that increase homology with the consensus sequence. Among the sequences reported here, pBTR3 contains decreased homology with the consensus sequence, relative to pBR322, and is a down mutation, in agreement with this observation. However, pBT9 and pBT21 contain tetr promoters with decreased consensus homology at both promoter elements, yet are up mutations. While our results highlight the importance of the promoter consensus sequence, they also indicate that deviations from the consensus need not necessarily decrease promoter activity.
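Homology with a consensus element, as discussed above for the candidate Pribnow box, can be illustrated by counting identical positions in each window of a sequence. The sketch below slides the -10 consensus TATAAT along the candidate sequence TGCGGTAGTTT quoted in the text. A plain identity count is an assumption made here for illustration; it gives every position equal weight and is not necessarily how the alignments in Fig. 2 were made, and it does not recover which bases were underlined in the original.

```python
# Sliding-window identity count against the -10 consensus (illustrative scoring).
CONSENSUS_MINUS_10 = "TATAAT"


def match_count(window, consensus):
    """Number of positions at which window and consensus carry the same base."""
    return sum(a == b for a, b in zip(window, consensus))


def scan(sequence, consensus):
    """Return (offset, window, matches) for every full-length window."""
    k = len(consensus)
    return [
        (i, sequence[i:i + k], match_count(sequence[i:i + k], consensus))
        for i in range(len(sequence) - k + 1)
    ]


candidate = "TGCGGTAGTTT"  # candidate Pribnow box region quoted in the text
hits = scan(candidate, CONSENSUS_MINUS_10)
best = max(score for _, _, score in hits)

for offset, window, score in hits:
    print(offset, window, score)
```

By this simple count no window matches the consensus at more than half of its positions, consistent with the partial homology described for the selected promoters.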

It should be emphasized that the set of all possible DNA sequences may contain a large number of biologically active sequences that have never been tested in nature. Presumably, during prebiotic evolution nucleotides were linked together and replicated chemically with high error rates in the absence of polymerases (28). The total pool of possible sequences may have been progressively narrowed by the more rapid reproduction of particular molecular species. Accurate replication of polymers of increasing length is likely to have been coupled to the evolution of fidelity mechanisms, resulting in a further reduction in the number of random species (29). From this, a very limited repertoire of functional DNA sequences evolved. By presenting a cell with a random population of nucleotide sequences, selection of unusual biological activities might be possible. Since the arithmetic potential of sequence diversity is so great, however, only a small fraction of all sequences can be screened experimentally.

The technique of biological selection from a large population of random DNA sequences may be useful for defining other prokaryotic and eukaryotic functions. Genetic regulatory signals that may be studied include enhancers and replication origins. The technique may also be extended for the selection of protein coding sequences; candidate applications include peptide hormones, leader sequences, catalytic domains, and even entire enzymes. The import of this technique is its usefulness in the absence of physical, chemical, or empirical assumptions about structure and function relationships.
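The arithmetic potential of sequence diversity referred to above can be made concrete with the figures used in this study: 19 randomized positions and a screened population of about 1000 plasmids. The sketch below is a back-of-the-envelope calculation only. The count of adenine-free sequences assumes that adenine is excluded entirely, which is a simplification of the adenine-deficient synthesis described in the text.

```python
# Size of the sequence space for a randomized stretch of DNA.
def sequence_space(length, alphabet_size=4):
    """Number of distinct sequences of `length` over `alphabet_size` bases."""
    return alphabet_size ** length


total = sequence_space(19)              # all four bases at 19 positions
adenine_free = sequence_space(19, 3)    # assumption: no adenine at any position
screened = 1000                         # approximate population size in the text
fraction = screened / total

print(f"{total:,} possible 19-mers")
print(f"{adenine_free:,} possible adenine-free 19-mers")
print(f"fraction screened: {fraction:.1e}")
```

The first figure, 4^19 = 274,877,906,944, is the value given in the abstract as about 3 x 10^11.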

We thank Ray Monnat and Brad Preston for helpful advice, Patrick Chou and Yim Foon Lee for performing oligonucleotide synthesis, and Chris Bjarke and Peter Evers for technical assistance with DNA sequencing. M.S.Z.H. is a student in the Medical Scientist Training Program. This work was funded by the Gottstein Memorial Trust.

1. Pribnow, D. (1975) J. Mol. Biol. 99, 419-443.
2. Schaller, H., Gray, C. & Herrmann, K. (1975) Proc. Natl. Acad. Sci. USA 72, 737-741.
3. Takanami, M., Sugimoto, K., Sugisaki, H. & Okamoto, T. (1976) Nature (London) 260, 297-302.
4. Seeburg, P. H., Nusslein, C. & Schaller, H. (1977) Eur. J. Biochem. 74, 107-113.
5. Rosenberg, M. & Court, D. (1979) Annu. Rev. Genet. 13, 319-353.
6. Hawley, D. K. & McClure, W. R. (1983) Nucleic Acids Res. 11, 2237-2255.
7. McClure, W. R. (1985) Annu. Rev. Biochem. 54, 171-204.
8. Matteuci, M. D. & Heyneker, H. C. (1983) Nucleic Acids Res. 11, 3113-3121.
9. Kozak, M. (1986) Cell 44, 283-292.
10. Alvarado-Urbina, G., Sathe, G. M., Liu, W.-C., Gillen, M. F., Duck, P. D., Bender, R. & Ogilvie, K. K. (1981) Science 214, 270-274.
11. Maniatis, T., Fritsch, E. F. & Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory, Cold Spring Harbor, NY).
12. Hanahan, D. (1983) J. Mol. Biol. 166, 557-580.
13. Sanger, F., Nicklen, S. & Coulson, A. R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
14. Wallace, B. R., Johnson, M. J., Suggs, S. V., Miyoshi, K., Bhatt, R. & Itakura, K. (1981) Gene 16, 21-26.
15. West, R. W., Neve, R. L. & Rodriguez, R. L. (1979) Gene 7, 271-288.
16. Brosius, J., Cate, R. L. & Perlmutter, A. P. (1982) J. Biol. Chem. 257, 9205-9210.
17. Sutcliffe, J. G. (1979) Cold Spring Harbor Symp. Quant. Biol. 43, 77-90.
18. Rodriguez, R. L., West, R. W. & Heyneker, H. L. (1979) Nucleic Acids Res. 6, 3267-3287.
19. Stuber, D. & Bujard, H. (1981) Proc. Natl. Acad. Sci. USA 78, 167-171.
20. Backman, K. & Boyer, H. W. (1983) Gene 26, 197-203.
21. McMurray, L., Petrucci, R. E. & Levy, S. B. (1980) Proc. Natl. Acad. Sci. USA 77, 3974-3977.
22. Widera, G., Gautier, F., Lindenmaier, W. & Collins, J. (1978) Mol. Gen. Genet. 163, 301-305.
23. Neve, R. L., West, R. W. & Rodriguez, R. L. (1979) Nature (London) 277, 324-325.
24. West, R. W. & Rodriguez, R. L. (1982) Gene 20, 291-304.
25. Brosius, J. (1984) Gene 27, 151-160.
26. Maquat, L. E. & Reznikoff, W. S. (1980) J. Mol. Biol. 139, 551-556.
27. Horwitz, A. H., Morandi, C. & Wilcox, G. (1980) J. Bacteriol. 142, 659-667.
28. Fakhrai, H., van Roode, J. H. G. & Orgel, L. E. (1981) J. Mol. Evol. 17, 295-302.
29. Eigen, M. & Schuster, P. (1978) Naturwissenschaften 65, 341-369.
