DNA Functional Groups Required for Formation of Open Complexes ...


THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1987 by The American Society of Biological Chemists, Inc.

Vol. 262, No. 2, Issue of January 15, pp. 892-898, 1987. Printed in U.S.A.

DNA Functional Groups Required for Formation of Open Complexes between Escherichia coli RNA Polymerase and the λ PR Promoter

IDENTIFICATION VIA BASE ANALOG SUBSTITUTIONS*

(Received for publication, June 20, 1986)

John W. Dubendorff‡, Pieter L. deHaseth§, Mary S. Rosendahl¶, and Marvin H. Caruthers‖

From the Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado 80309-0215

Synthetic 75-base pair promoters bearing base changes and/or base analog substitutions at selected positions were constructed. Using both abortive initiation and run-off transcription assays, the interaction of these altered promoters with Escherichia coli RNA polymerase was studied in order to determine the involvement of DNA functional groups in promoter recognition. Two adjacent thymines in the -35 region were identified whose 5-methyl groups play a crucial role. Additionally, the combined results from several substitution experiments showed that functional groups in the major groove of the strongly conserved T·A base pair at the -7 position are probable sites of direct interaction with RNA polymerase.

The specific interaction of RNA polymerase with promoter DNA is an important step in the expression of bacterial genes. While sequence comparisons among over 100 promoters readily reveal substantial diversity, two regions of homology have emerged as well (1, 2) and are commonly referred to as the -35 (5'-TTGACA-3') and -10 or Pribnow box (5'-TATAAT-3') consensus sequences, where these numbers indicate the approximate positions upstream from the start site of transcription. Specific sites of contact between RNA polymerase and promoter DNA have been localized to the DNA at or near these regions of homology (3). In view of the sequence variation even in these regions, a subset of the chemical functionalities specified by these consensus sequences (or the complementary sequences on the anti-sense strand) must be sufficient to enable polymerase to recognize the DNA as a promoter. Extensive mutational analysis has shown that while the strongest promoters seem to have better matches to the consensus sequence, the exclusion of a particular base pair at some sites is more critical (1, 4, 5). This produces a picture of the transcription process in Escherichia coli which is both complex and flexible.

* This work was supported by National Institutes of Health Grant GM21120 (to M. H. C.), an Upjohn Graduate Fellowship (to J. W. D.), and National Institutes of Health postdoctoral fellowships (to P. L. deH. and M. S. R.). This is Paper XXII in the series "Studies on Gene Control Regions." The preceding paper is Mandecki, W., Goldman, R. A., Powell, B. S., and Caruthers, M. H. (1985) J. Bacteriol. 164, 1353. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

‡ Current address: Biology Dept., Brookhaven National Laboratory, Upton, Long Island, NY 11973.

§ Current address: Dept. of Biochemistry, School of Medicine, Case Western Reserve University, Cleveland, OH 44106.

¶ Current address: Synthetech Inc., Boulder, CO 80301.

‖ To whom correspondence should be addressed.

In an attempt to better define transcription at the functional group level, we have introduced specific base modifications into the DNA of the bacteriophage λ PR promoter (Fig. 1). By substituting deoxyuridine for thymidine and deoxyinosine for deoxyguanosine, the 5-methyl group on thymine and the 2-amino group on guanine were removed and thereby tested as contact sites between E. coli polymerase and DNA. This approach, called functional group mutagenesis, has the advantage of testing one functional group without affecting others, as would be the case if base pair transversions and transitions were studied. These modified promoters were assembled enzymatically from chemically synthesized DNA fragments using T4 DNA ligase. In order to understand this system and to establish a base line performance in our laboratory, we initially constructed and measured the activity of a synthetic, unmodified λ PR promoter (6). We also showed in a preliminary report (7) that no significant effect on transcription activity was observed with certain analog-substituted promoters. We have continued this work on the PR promoter region and report here the identification of two positions in the -35 region where methyl groups on thymine appear critical to the interaction with RNA polymerase. Additionally, the combined results from several analog substitutions showed that functional groups in the major groove of the strongly conserved T·A base pair at the -7 position are probable sites of direct interaction with RNA polymerase.

MATERIALS AND METHODS

Enzymes and Reagents-RNA polymerase was isolated from E. coli (grain processing) in the laboratory of Dr. Carol L. Cech (Department of Chemistry and Biochemistry, University of Colorado, Boulder) by the method of Burgess and Jendrisak (8). The activity of the various preparations used in this study ranged from 50 to 80% of total concentration as measured by the T7 template functional assay method of Chamberlin et al. (9). The RNA polymerase concentrations reported here are nominal values which are not corrected for the fraction of active enzyme. T4 polynucleotide kinase and DNA ligase were from either Bethesda Research Laboratory or New England Biolabs. Snake venom phosphodiesterase and CpA¹ were from Sigma. Deoxynucleosides including 5-methyldeoxycytidine, deoxyuridine, and deoxyinosine were purchased from Sigma or Pharmacia P-L Biochemicals.

Promoters-Oligodeoxynucleotides were synthesized using the phosphoramidite methodology (10-12) and purified by high performance liquid chromatography (13). The sequences of selected segments were confirmed by the electrophoresis homochromatography mobility shift analysis procedure (14, 15). Analog segments were analyzed by partial snake venom phosphodiesterase digestion and electrophoresis with a similarly treated sample of the unmodified segment as a marker. Oligodeoxynucleotides comprising the 75-base pair (bp) PR promoter were covalently joined using T4 DNA ligase and isolated by standard procedures (6, 16). Synthetic duplexes were stored frozen in 10 mM Tris (pH 8), 50 mM KCl, 1 mM EDTA.

¹ The abbreviations used are: CpA, cytidylyl(3'→5')adenosine; CpApU, cytidylyl(3'→5')adenylyl(3'→5')uridine; bp, base pair; PAGE, polyacrylamide gel electrophoresis.

Abortive Initiation Assays-The formation of functional or open RNA polymerase-promoter complexes was detected through their ability to carry out the reiterative synthesis of the trinucleotide CpApU from CpA and UTP (17). The reactions were carried out as previously described for the wild-type 75-base pair synthetic promoter (6). Briefly, association lag times (18) were obtained by incubating initiating dinucleotide (1 mM), [α-32P]UTP (0.04 mM; 20 Ci/mmol), and synthetic DNA template (1 nM) in standard reaction buffer (40 mM Tris (pH 8), 100 mM KCl, 10 mM MgCl2, 1 mM dithiothreitol) for 10 min at 37 °C. RNA polymerase was added to 10 nM at time 0. Aliquots (2.7 µl) were removed at specified times and spotted on paper (Whatman 3MM) prestreaked with 0.1 M EDTA to quench the reaction. The CpApU product was separated from labeled UTP by ascending paper chromatography as described (19). The lag time, τ, was determined using a linear regression analysis which included only points corresponding to times greater than three times the estimated τ value. An anomalous dependence of τ on the concentration of RNA polymerase, ascribed to end-binding of polymerase to the 75-bp synthetic promoter (6, 20), prevented determination of kinetic parameters by generating τ plots as described by McClure (18). A low RNA polymerase concentration (10 nM) was therefore used to minimize binding to DNA termini.
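The regression rule used to obtain the lag time can be sketched in a few lines of Python. This is an illustration only and not the authors' actual procedure: it assumes that late-time product accumulation approaches a straight line whose intercept on the time axis equals the lag time, and it re-estimates that intercept iteratively using only points later than three times the current estimate. The function names, the starting guess, and the convergence settings are hypothetical.

```python
import math


def fit_line(points):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


def estimate_lag_time(times, product, tau_guess, max_iter=20, tol=1e-3):
    """Estimate the lag time (same units as `times`).

    Assumption: at times well beyond the lag, product grows linearly,
    product ~ rate * (t - tau), so tau is the time-axis intercept.
    Only points with t > 3 * tau (current estimate) enter each fit.
    """
    tau = tau_guess
    for _ in range(max_iter):
        late = [(t, y) for t, y in zip(times, product) if t > 3 * tau]
        if len(late) < 2:
            raise ValueError("too few points beyond 3 * tau for a fit")
        slope, intercept = fit_line(late)
        new_tau = -intercept / slope  # where the fitted line crosses y = 0
        if abs(new_tau - tau) < tol:
            return new_tau
        tau = new_tau
    return tau


def simulated_product(t, rate, tau):
    """Hypothetical test curve: a lag followed by linear accumulation."""
    return rate * (t - tau * (1.0 - math.exp(-t / tau)))
```

Applied to a simulated time course, `estimate_lag_time` returns a value close to the lag time used to generate the data. With real data the result depends on how many points fall beyond the three-times cutoff.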

Run-off Transcription Assay-The rate of functional complex formation was measured by the ability of complexes to synthesize RNA corresponding to the properly initiated message which terminated at or near the end of the synthetic promoter. Reactions were initiated by mixing 8 µl of RNA polymerase with 56 µl of binding reaction mixture containing synthetic promoter duplex. Both components were prewarmed at 37 °C before mixing. The final RNA polymerase and DNA concentrations were 67 nM and 3 to 5 nM, respectively, unless otherwise specified. After selected times, 7.5-µl aliquots were removed and added to 2.5 µl of transcription mixture containing heparin (100 µg/ml final concentration), which binds RNA polymerase not present in functional complexes, and nucleotide triphosphates to allow transcription (open complexes at the λ PR promoter are not sensitive to added heparin (17)). Binding and transcription reactions were performed in 30 mM Tris (pH 7.9), 100 mM KCl, 3 mM MgCl2, 0.1 mM EDTA, and 0.2 mM dithiothreitol. During the 20-min transcription reactions, ATP and CTP were at 200 µM and α-32P-labeled GTP and UTP were at 2 µM and 20 Ci/mmol. Reactions were terminated by freezing at -78 °C. Mixtures were dried in a Speed-Vac and taken up in 5 µl of 7 M urea, 1.5 mM GTP, 1.5 mM UTP, 0.1% bromphenol blue, and 0.1% xylene cyanol. Samples were electrophoresed on 20% polyacrylamide, 7 M urea gels until the xylene cyanol marker dye had run approximately 12 cm. RNA was visualized by autoradiography and the λ PR-specific transcripts excised and counted for Cerenkov radiation.
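The mixing volumes quoted for the run-off assay imply the working concentrations of the added components. The sketch below is simple dilution arithmetic under the assumption that volumes are additive and that the quoted final concentrations refer to the mixed reactions. The implied stock values are not stated in the text, and the helper names are hypothetical.

```python
def stock_needed(final_conc, added_volume, total_volume):
    """Concentration a component must have before mixing to reach
    `final_conc` after `added_volume` is brought to `total_volume`."""
    return final_conc * total_volume / added_volume


def after_dilution(conc, aliquot_volume, total_volume):
    """Concentration of a component after an aliquot is diluted."""
    return conc * aliquot_volume / total_volume


# Binding reaction: 8 ul polymerase + 56 ul binding mixture.
binding_total_ul = 8 + 56
polymerase_stock_nM = stock_needed(67, 8, binding_total_ul)

# Transcription step: 7.5 ul aliquot + 2.5 ul transcription mixture.
transcription_total_ul = 7.5 + 2.5
heparin_in_mixture_ug_ml = stock_needed(100, 2.5, transcription_total_ul)
polymerase_during_transcription_nM = after_dilution(
    67, 7.5, transcription_total_ul
)
```

Under these assumptions the polymerase added to the binding reaction is eightfold more concentrated than its final value, and each aliquot is diluted to three-quarters of its binding-reaction concentration when the transcription mixture is added.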

RESULTS

Promoter Constructs-λ PR was synthesized so as to contain base pair analogs (deoxyuridine, deoxyinosine, and 5-methyldeoxycytidine) as well as base pair changes at certain key locations that had been identified in chemical probing experiments to be sites of close contact with RNA polymerase (Fig. 1). These included base pairs in the -10 and -35 regions that show a high degree of sequence conservation among different promoters (1, 2), DNA within and just upstream of the -10 region that is apparently contacted by RNA polymerase (3), and the single-stranded domain as detected in functional RNA polymerase-promoter complexes (21). Some promoters bear substitutions at multiple positions so that several functional groups could be checked simultaneously. The lack of an effect on transcription for such multiple substitutions was strong evidence that none of the groups tested was important in forming a functional complex. Two promoters, both having the same size as the analog promoters, were constructed as positive and negative controls. These were the wild-type unmodified promoter and the transition T·A to C·G at position -7. The synthetic wild-type promoter had previously been shown to yield results identical to those obtained from the same promoter as part of a bacteriophage λ restriction fragment (6). As a control for our ability to detect altered promoter activity, the -7 transition was used primarily because the analogous natural mutation at this highly conserved position has been shown to dramatically reduce promoter function (22-24).

Generally promoters were synthesized by enzymatically joining short DNA segments containing 15 to 20 mononucleotides each. The fully assembled promoters were isolated from reactants by polyacrylamide gel electrophoresis (PAGE) under nondenaturing conditions. Homogeneity was then checked by PAGE under denaturing conditions where the individual 75 mononucleotide single strands were easily separated (Fig. 2). We estimate that promoters synthesized and purified by this procedure were at least 95% homogeneous.

FIG. 1. A summary of analog substitutions in the λ PR promoter. Starting with the initiation site for cro RNA as base pair +1, base pairs to the left and right are indicated as - and +, respectively. The -10 and -35 regions are marked by the shaded rectangles below the DNA sequence. Promoter sites substituted with analogs are designated as follows: deoxyinosine for deoxyguanosine, G→I; deoxyuridine for thymidine, T→U. Transitions from thymidine·deoxyadenosine to 5-methyldeoxycytidine·deoxyguanosine base pairs at positions -34 and -35 are designated by T·A→Cm·G. Transitions at position -7 from thymidine·deoxyadenosine to deoxycytidine·deoxyguanosine and deoxycytidine·deoxyinosine are designated T·A→C·G and T·A→C·I, respectively. Analog promoters are defined by numbers located either at the terminus of lines or inserted into lines leading to the analog abbreviation. These analog promoters are therefore abbreviated as P-1 through P-20 in the text, Tables, and Figures. Promoters designated 1, 2, 3, 7, 9, 10, 11, 12a, 12b, 13, 15, 17, 18, 19, and 20 are modified to contain only a single analog or base pair substitution. Promoters designated 4, 5, 6, 8, 12, 14, and 16 contain two or three such substitutions.

Comparison of Promoters Using Abortive Initiation-We have investigated the kinetics of open complex formation with the 75-bp synthetic duplexes using an abortive initiation assay to detect formation of open complexes. Typical results with unmodified and selected analog promoters are shown in Fig. 3. In most cases analog promoters behaved similarly to the unmodified 75-bp promoter. Results with P-3 and P-10 illustrate this observation. Lag time experiments for variants bearing single (P-12a and P-12b) and double (P-12) substitutions of uracil for the two highly conserved thymines in the -35 region are especially interesting. It can be seen that single uracil substitutions reduce the promoter-dependent production of aborted product from the promoters to similar extents, while leading to significant increases in lag times. The doubly uracil-substituted promoter shows negligible activity during the time of our assay.

FIG. 2. PAGE analysis on denaturing gels of purified promoters. Promoters isolated by PAGE under nondenaturing conditions from reaction mixtures were then analyzed by PAGE under denaturing conditions on a 20% acrylamide, 7 M urea gel. 32P was from internal sites generated by enzymatically joining 5'-32P-labeled deoxyoligonucleotides to form the final promoters. Lanes 1 to 7 contain P-75 (the synthetic, unmodified duplex), P-19, P-20, P-15, P-16, P-13, and P-11, respectively.

The τ values calculated from these abortive initiation assays are summarized in Table I. As shown before, the synthetic wild-type promoter as a 75-bp duplex behaves similarly to a PR promoter carried on an 890-base pair restriction fragment, demonstrating the validity of our approach (6). In most cases, these variant PR promoters, with analog substitutions at a variety of positions, gave lag times of 1 to 3 (±1.0) min based on several determinations. We conclude that within experimental error these promoters show similar behavior. Of particular interest was that substitution of uracil for thymine at all A·T base pairs between -15 and +5 and also substitution of inosine for guanine at all but two G·C base pairs in the same region (sites at -1 and +3 were not tested) did not alter the rate of transcription initiation. These 20 base pairs include the -10 region, the transcription initiation site, and the DNA region generally considered to be unwound during formation of at least some open promoter complexes (3).

TABLE I
A summary of association τ values obtained by the abortive initiation assay on analog and unmodified λ PR promoters

Promoterᵃ   τᵇ (min)      Promoterᵃ   τᵇ (min)
P-890       1.3           P-11        2.8
P-75        1.7           P-12a       9
P-1         1.5           P-12b       9
P-2         2.5           P-12        NAᶜ
P-3         2.9           P-13        1.9
P-4         1.9           P-14        NA
P-5         3.0           P-15        2.1
P-6         1.8           P-16        3.0
P-7         0.6           P-17        NA
P-8         1.6           P-18        NA
P-9         1.4           P-19        1.7
P-10        2.0           P-20        3.7

ᵃ P-890, a nonsynthetic, naturally occurring 890-bp HaeIII-derived duplex from phage λ that contains the PR promoter; P-75, a synthetic 75-bp duplex containing the λ PR promoter; P-1 to P-20, various modified promoters as defined in the legend to Fig. 1.
ᵇ The association τ values or lag times were calculated from abortive initiation results.
ᶜ NA is an abbreviation for not active. These promoters were inactive in the abortive initiation assay.

FIG. 3. Time courses of CpApU synthesis from synthetic promoters. Percent of counts incorporated into product is plotted as a function of time (min). Promoter and RNA polymerase concentrations were 1 and 10 nM, respectively. The numbers 3, 10, 12, 12a, 12b, 17, and 18 refer to promoter analogs or base pair changes as defined in the legend to Fig. 1. Since promoters P-12a and P-12b, and also P-17 and P-18, show, as pairs, identical abortive initiation results, only one symbol is used in each case to show the data. P-890 is the 890-bp HaeIII restriction fragment of phage λ and contains the entire rightward control region. The 75-bp unmodified promoter is abbreviated as P-75.

There were, however, several promoters where alterations in sequence or insertion of analogs caused an increase in lag times. Substitution of uracil for thymine at either -34 or -35 changes τ from 1.7 min for the unmodified promoter to 9 min or more for each of the single substituted analogs. This indicates a 4- to 5-fold reduction in promoter strength. Insertion of uracil at both -34 and -35 generated a very weak promoter whose τ value could not be measured. Several additional promoters displayed negligible activity in the assay; all bear substitutions at either the -7 position or the two conserved thymines in the -35 region. These include the variant P-17 (C·G for T·A at -7), confirming our ability to detect naturally occurring severe "down" mutations, and also the substitution of the analog base pair C·I at the same position. In addition, the double substitution of Cm·G for T·A in the -35 region resulted in severe loss of activity.

Comparison of Promoters Using the Run-off Transcription Assay-Promoters were also tested for activity using the productive rate assay of Stefano and Gralla (25). Unlike abortive initiation, this assay uses the production of run-off RNA as a reflection of promoter occupancy through the time course of the binding reaction (see "Materials and Methods").

Our previous characterization of the synthesized promoters showed that transcription is initiated at the correct site and proceeds in the right direction (6). A typical experiment is presented in Fig. 4. At each time point, the number of productive complexes present is related to the quantity of RNA synthesized. In addition to transcripts of the expected size (21 nucleotides), smaller transcripts were also observed. These smaller transcripts (18 to 20 nucleotides) probably result from polymerases which have paused near the end of the promoter duplex. The group of four discrete transcripts representing complete or almost complete elongation to the end of the fragment was excised to determine promoter-specific complex formation. A plot of the radioactivity in the run-off transcript versus time of incubation of RNA polymerase with DNA gives an indication of the rate of promoter saturation (Fig. 5). With the majority of templates, including the wild-type, half-times for promoter saturation of 4.0 ± 0.5 min were found (Table II), which corresponds to a τ value of 5.8 min. This is longer than the time obtained from the lag assays described in the previous section but is consistent with the value expected from the higher RNA polymerase concentration used in the run-off transcription assay (67 versus 10 nM in the abortive initiation assays). For all analogs, the results in the run-off transcription assay paralleled those obtained in the abortive initiation assay. Thus certain promoters which showed less than 10% of wild-type activity (P-12, P-14, P-17, and P-18) by this assay were also inactive in the abortive initiation assay. Similarly the same templates that, like the wild-type promoter, gave lag times of 1 to 3 min showed kinetic behavior similar to that of the wild-type promoter in this assay.

FIG. 4. Gel analysis of complex formation between RNA polymerase and wild-type promoter. P-75 promoter DNA was at 5 nM. Numbers above each lane (0.5, 1, 2, 5, 10, 20, 30, and 60) indicate the time in minutes after addition of RNA polymerase (final concentration 67 nM). The arrows bracket the location of run-off RNA. Complete assay conditions are under "Materials and Methods." Identical results were obtained when [γ-32P]ATP was substituted for [α-32P]triphosphates (data not shown).

FIG. 5. Comparison of promoters by initial rate of complex formation with RNA polymerase. For each time point, the four discrete transcripts representing complete or almost complete elongation were excised as gel slices and counted. Complete assay conditions are described under "Materials and Methods" and the legend to Fig. 4. Data are shown for the wild-type promoter, P-3, and P-16 as a function of time (min).

TABLE II
A summary of half-times of promoter saturation as obtained from run-off transcription assays
The promoter abbreviations are defined in Table I and the legend to Fig. 1. Half-times for promoter saturation were completed at 67 nM E. coli RNA polymerase. Values in parentheses were from experiments completed at 120 nM E. coli RNA polymerase.

Promoter    t1/2 (min)
P-75        3.8 (5)
P-1         3.6
P-2         4.2
P-3         4.5
P-12a       (15)
P-12b       (15)
P-12        NAᵃ
P-13        3.8
P-14        NA
P-15        4.1
P-17        NA
P-18        NA
P-19        3.4
P-20        3.4

[Entries for P-4, P-9, P-10, P-11, and P-16, and the assignment of a further value of 4.0 min, are illegible in the source.]

ᵃ NA is an abbreviation for not active. These promoters had less than 10% of the wild-type promoter activity (P-75).

FIG. 6. Comparison of wild-type, P-12a, and P-12b by initial rate of complex formation with RNA polymerase. DNA and RNA polymerase concentrations were 15 and 120 nM, respectively. Complete assay conditions are under "Materials and Methods." Gel patterns of the time course (0.5 to 60 min) of complex formation using wild-type, 12a, and 12b promoters are shown. Arrows bracket the location of the run-off RNA. Panels A, B, and C show the reactions using P-75, P-12a, and P-12b, respectively. Gels were analyzed as described in the legend to Fig. 4.

FIG. 7. Time course of complex formation between RNA polymerase and P-75, P-12a, or P-12b. Gels shown in Fig. 6 were analyzed as outlined under "Materials and Methods." Separate symbols mark P-75, P-12a, and P-12b as a function of time (min). Curves corresponding to results with P-75 (solid line) and also P-12a plus P-12b (dashed line) are shown.
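The correspondence quoted in the run-off assay results between a half-time of 4.0 min and a τ value of 5.8 min can be checked with a short calculation. The sketch assumes that promoter saturation approaches its plateau as a single exponential, so that the half-time equals τ multiplied by ln 2. That kinetic form is an assumption made here for illustration, and the function names are hypothetical.

```python
import math


def tau_from_half_time(t_half):
    """Time constant of a single-exponential approach to saturation,
    given its half-time (same units in and out)."""
    return t_half / math.log(2)


def fraction_saturated(t, tau):
    """Fraction of promoters in productive complexes at time t,
    assuming f(t) = 1 - exp(-t / tau)."""
    return 1.0 - math.exp(-t / tau)


tau_wild_type = tau_from_half_time(4.0)            # about 5.8 min
half_check = fraction_saturated(4.0, tau_wild_type)  # one half by construction
```

The same conversion applied to the limits of the quoted range, 3.5 and 4.5 min, brackets the time constant between roughly 5.0 and 6.5 min.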

Due to their reduced activity, the promoters with the single T to U substitutions were assayed at 120 nM RNA polymerase concentration in an attempt to increase the measurable signal. From Figs. 6 and 7 and Table II it can be seen that these promoters (P-12a and P-12b) show similar kinetics of open complex formation, with both significantly slower than the wild-type promoter assayed under the same conditions. The relative effect of the single uridine substitutions as judged by this assay is similar to that seen using the abortive initiation reaction.

Transcription from End-labeled Promoters-Promoters P-12, P-14, P-17, and P-18 were inactive in both assays. In order to confirm that these promoters were indeed inactive and that the observed lack of transcription was not due to promoter loss or breakdown during purification or storage, run-off transcription assays were carried out with 5' end-labeled promoters (26). After isolation by denaturing gel electrophoresis and precipitation, they were incubated with 40 nM RNA polymerase for 30 min. Transcription was then carried out for 30 min as described above. The promoters were resolved from their transcription products by electrophoresis on 20% acrylamide, 7 M urea gels. Autoradiography allowed visualization of the labeled promoter DNA and the transcribed RNA. The bands were excised and counted, allowing a comparison of the activity of various promoters from the ratios of counts in the RNA and the DNA. Using this protocol, promoter P-10 was found to direct the synthesis of greater than 10 times the number of RNA chains as did P-12, P-14, P-17, or P-18 (on a mole basis), confirming the observation that the latter four promoters have greatly reduced activity (data not shown).

DISCUSSION

We have described experiments which, for the first time, identify promoter functional groups that affect transcription by RNA polymerase. Two methyl groups on adjacent thymines at -34 and -35 were found to be crucial to promoter recognition. Removal of either methyl group via substitution of uracil for thymine led to a 4- to 5-fold reduction in the rate of formation of a transcriptionally competent complex, whereas substitution of uracil at both sites generated an inactive promoter. These results are quite surprising since λ PR is classified as a strong promoter (27). However, previous research has shown that substitution of uracil for thymine at a single site in the lac operator (i.e. removal of a methyl group) can increase the free energy of binding to lac repressor by 1.5 kcal/mol (28). If removal of the methyl groups at -34 and -35 of PR has an equivalent effect on RNA polymerase binding, then the resulting 3 kcal/mol change in binding free energy would indeed be significant and translate into approximately a 100-fold reduction in affinity of E. coli RNA polymerase for the modified PR promoter (P-12). Unfortunately, we have not been able to determine the initial binding constant (KB) and isomerization rate constant (k2) for the various 75-base pair synthetic promoters (see "Materials and Methods" and also Ref. 6). As a result, it is unknown which step of open complex formation is affected by the modifications described here. It is conceivable that the methyl groups at -34 and -35 interact specifically with hydrophobic amino acid side chain(s) in such a way that the helix is destabilized to yield a lowered melting temperature. In this way, the thymine methyl groups may contribute to the formation of both open and closed promoter complexes. Alternatively, the methyl groups may serve as a recognition element only during closed complex formation and facilitate the further interaction of RNA polymerase with the promoter through additional contacts which lead to strand separation and open complex formation (29, 30). Our results do not distinguish between these possibilities.
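The estimate of an approximately 100-fold reduction in affinity follows from the standard thermodynamic relation between a change in binding free energy and the ratio of equilibrium constants, K(wild type)/K(modified) = exp(ΔΔG/RT). A minimal sketch of this arithmetic is given below. The temperature of 37 °C and the additivity of the two 1.5 kcal/mol contributions are assumptions made here for illustration, and the function name is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_change_in_affinity(ddg_kcal_per_mol, temp_celsius=37.0):
    """Fold reduction in binding constant for a free energy penalty ddG.

    Uses K_wt / K_mod = exp(ddG / RT). The 37 degree C default is an
    assumption, not a value taken from the paper.
    """
    temp_kelvin = temp_celsius + 273.15
    return math.exp(ddg_kcal_per_mol / (R_KCAL_PER_MOL_K * temp_kelvin))


if __name__ == "__main__":
    one_methyl = fold_change_in_affinity(1.5)   # single methyl group removed
    two_methyls = fold_change_in_affinity(3.0)  # both removed, assumed additive
    print(f"1.5 kcal/mol: about {one_methyl:.0f}-fold")
    print(f"3.0 kcal/mol: about {two_methyls:.0f}-fold")
```

Under these assumptions the 3 kcal/mol penalty corresponds to a reduction on the order of 100-fold, in line with the figure quoted in the text. The exact factor depends on the temperature chosen.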

We do, however, propose that RNA polymerase through hydrophobic interactions recognizes specifically the thymine 5-methyl groups at -34 and -35 irrespective of how these recognition events contribute to the steps leading to transcription. Analogous experiments with lac operators substituted with 5-bromodeoxyuridine, 5-bromodeoxycytidine, and 5-methyldeoxycytidine have shown that lac repressor can recognize substituents at the 5-position of pyrimidines (31, 32). The alternative explanation is that insertion of uracil would distort the promoter conformation in the -35 region and thereby negatively affect the formation of a functional polymerase-promoter complex. This explanation seems unlikely since recent experiments with defined sequence deoxyoligonucleotides have shown that substitution of uracil for thymine did not affect the global structure of DNA (33). The only structural variation detected was a small change (estimated at 10° to 20°) in the base orientations around the N-glycosidic bonds (χ angles). If these results can be applied directly to promoters, such small perturbations in the absence of any other detectable conformational distortions would seem to be incapable of leading to the large alterations in promoter activity observed at -34 and -35. Moreover, substitutions of uracil for thymine elsewhere, including multiple substitutions in P-4 and P-16, had a negligible effect on transcription. If uracil substitution grossly distorted the promoter DNA conformation, then at least some of these other uracil analogs should also have exhibited reduced activity. This was not the case.

The significance of thymine 5-methyl groups as polymerase contact sites in the -35 region is consistent with current data. As summarized in Fig. 8, chemical modification and photo-cross-linking have helped to define the sites of close approach between E. coli RNA polymerase and several of its promoters (3, 34). These earlier observations suggested that in the -35 region, RNA polymerase recognizes an E. coli promoter through contacts in the major groove of DNA. Methyl contact sites as suggested by the experiments described here are superimposed on these previously published results. As can be seen by inspection of these data, these methyl groups are positioned precisely in the region where RNA polymerase has been shown to contact DNA. Moreover, through substitution of these methyl groups with bromine, photochemical cross-linking experiments have shown that these sites are in close proximity to the polymerase molecule (35) and further support our conclusion that RNA polymerase contacts the promoter through direct hydrophobic interactions at positions -34 and -35. Of particular note is the observation that removal of either methyl group by introduction of a single uridine results in a similar reduction of the rate of complex formation. Viewed from another perspective, this implies that the presence of a single methyl group at either site allows complex formation to proceed, albeit at approximately one-fourth to one-fifth the normal rate. Inspection of over 100 E. coli promoter sequences shows a high conservation of thymine at -34 and -35, with nearly all promoters having at least one and the stronger promoters retaining thymine at both positions (1, 36). Among the sequenced promoters (1, 2), those lacking thymine at -34 and -35 are generally considered to be transcriptionally the least active. Evidently a range of hydrophobicity (0, 1, or 2 methyl groups) is discernible during recognition of the promoter by RNA polymerase. Apparently RNA polymerase can tolerate a certain amount of variability in the molecular surface of the promoters to which it binds. This is not surprising in view of the great variation in sequence and wide spectrum of regulatory characteristics of E. coli promoters (36, 37).

[Fig. 8 graphic. Recoverable labels: -35 region, -16 region, Pribnow box; Front, Back; Top strand, Bottom strand.]

FIG. 8. Planar representation of RNA polymerase contacts with the T7 A3 and lac UV5 promoters (modified from Ref. 3). Polymerase contacts with the T7 A3 and lac UV5 promoters are superimposed on a cylindrical projection of the DNA helix. The distance between the Pribnow box and the -35 region is that of the T7 A3 promoter. Contact regions likely to interact with polymerase are shown as shaded areas. These shaded areas are located primarily on the front face in the -35 region and on both faces in a region from approximately -16 through the Pribnow box. These regions were defined from chemical protection experiments (3). The two filled circles represent the methyl groups whose removal (by introduction of uridine at these sites) affects binding of RNA polymerase. Substitution of a C·I base pair for the normally occurring T·A base pair at the -7 position (filled-in section of the Pribnow box region) indicates that the central major groove is recognized by RNA polymerase at this highly conserved site.

Promoters containing 5-methyldeoxycytidine and deoxycytidine at positions -34 and -35 showed little activity. This result indicates that although the methyl groups appear to be recognized by polymerase, some additional characteristics of T·A base pairs in the -35 region are necessary for promoter function. Several possibilities are immediately apparent. It is conceivable that while the methyl groups are important to the kinetic recognition of the DNA as a promoter, some hydrogen bonding pattern recognition (38) is required for functional complex formation. For example, perhaps the exocyclic amino group on adenine and the thymine 4-carbonyl interact through hydrogen bonding with RNA polymerase. Another possibility is that the two additional hydrogen bonds in Cᵐ·G rather than T·A base pairs greatly affect the local melting or unwinding of the promoter during transcription. Alternatively, the introduction of Cᵐ·G base pairs at these positions might disturb the local DNA structure sufficiently so as to prevent productive binding of RNA polymerase to the promoter. Perturbations of DNA structure by cross-chain clashes of guanines on opposite strands have been described (39). Replacement of two T·A base pairs in the -35 region by C·G base pairs introduces the possibility for just such unfavorable contacts. The ambiguity as to the explanation for the observed results with these substitutions might be resolved by synthesis of promoters containing either C·I or Cᵐ·I base pairs at these two positions.


FIG. 9. Possible mechanism for denaturation of promoters. B: His, Tyr, Ser, Thr, Lys, or Arg. The -CO2H group could be either Asp or Glu. This reaction could represent the nucleation of strand separation presented by Buc and McClure (40).


Base analog substitutions in the -10 region that test for recognition of the thymine 5-methyl and guanine 2-amino groups have a negligible effect on transcription. However, other changes in this region have dramatic effects: substitution of a C·G base pair for the T·A base pair at -7 leads to a promoter lacking measurable activity, as does the introduction of a C·I base pair. On a functional group level, the C·I analog base pair presents a minor groove identical to that of the T·A base pair found normally at -7. The differences between the C·I and T·A base pairs exist exclusively in the major groove. Since removal of the methyl group from thymine (P-3) at this position results in no change in the kinetics (Figs. 3 and 5, Table I), the altered locations of the exocyclic amino and carbonyl groups in the major groove (for C·I or C·G relative to T·A) must be responsible for the observed lack of activity. Additionally, the absence of a minor groove amino group on inosine excludes alteration of the local helix structure (39) as an explanation for this result. By elimination, therefore, we arrive at the conclusion that at the -7 position RNA polymerase recognizes some or all of the functional groups present in the central part of the major groove of a T·A base pair. This is consistent with chemical and physical modification data suggesting interaction with polymerase almost exclusively in the major groove of the Pribnow box region (2, 34). We therefore propose a mechanism where the critical contacts between RNA polymerase and the promoter at -7 occur through the central part of the major groove and lead to an induced local melting of the promoter (Fig. 9). The denaturation process would be initiated by the interaction of carbonyl and amino groups from the T·A base pair at -7 with appropriate side chains on RNA polymerase. Thus groups on RNA polymerase would shift the hydrogen bonding potential of functional groups on the bases away from the base paired state. If this RNA polymerase-induced local melting were accompanied by a conformational change of the DNA (40) or protein (41), then perhaps even the one hydrogen bond remaining after tautomerization would not form or reformation of the keto-amino forms of the base pairs would be inhibited. This process could not occur if T·A were replaced by transitions such as C·G or C·I, where the orientation of the amino and carbonyl groups is inverted relative to RNA polymerase. The mechanism as shown in Fig. 9 is supported by the pKa of adenosine N1 (4.8), the ability of carboxylic acids with pKa similar to 4 (Asp and Glu) to catalyze enolization of carbonyl groups, the ease with which the adenine exocyclic amino groups form additional hydrogen bonds when part of B form DNA (42), and observations that DNA duplexes can be destabilized by acidic conditions (43). A series of such catalytic events in the -10 region (usually quite rich in A·T base pairs) could therefore be responsible for the DNA melting process.

These results identify for the first time certain DNA base pair functional groups that appear to be involved in transcription of promoters by E. coli RNA polymerase. Moreover, the results further extend previous conclusions (28, 31, 32, 44) that thymine 5-methyl groups are important contact sites between DNA and proteins. Thus the thymine 5-methyl may be a major distinguishing functional group which allows proteins to read DNA sequences. The methyl group is, after all, the only functional group that uniquely defines a base pair by one contact within a protein-DNA complex. Recognition of base pairs or individual bases through other functional groups requires two contacts (45).

Acknowledgments-We thank Jeri Beltman for excellent technical assistance and David Auble for critically reading the manuscript.

REFERENCES

1. Hawley, D. K., and McClure, W. R. (1983) Nucleic Acids Res. 11, 2237-2255
2. Rosenberg, M., and Court, D. (1979) Annu. Rev. Genet. 13, 319-353
3. Siebenlist, U., Simpson, R. B., and Gilbert, W. (1980) Cell 20, 269-281
4. von Hippel, P. H., Bear, D. G., Winter, R. B., and Berg, O. G. (1982) in Promoters: Structure and Function (Rodriguez, R., and Chamberlin, M., eds) pp. 3-33, Praeger Scientific, Praeger Publishers, New York
5. Youderian, P., Bouvier, S., and Susskind, M. M. (1982) Cell 30, 843-853
6. deHaseth, P. L., Goldman, R. A., Cech, C. L., and Caruthers, M. H. (1983) Nucleic Acids Res. 11, 773-787
7. Caruthers, M. H., Beaucage, S. L., Efcavitch, J. W., Fisher, E. F., Goldman, R. A., deHaseth, P. L., Mandecki, W., Matteucci, M. D., Rosendahl, M. S., and Stabinsky, Y. (1983) Cold Spring Harbor Symp. Quant. Biol. 47, 411-418
8. Burgess, R. R., and Jendrisak, J. J. (1975) Biochemistry 14, 4634-4638
9. Chamberlin, M. J., Nierman, W. C., Wiggs, J., and Neff, N. (1979) J. Biol. Chem. 254, 10061-10069
10. Caruthers, M. H. (1982) in Chemical and Enzymatic Synthesis of Gene Fragments, A Laboratory Manual (Gassen, H. G., and Lang, A., eds) pp. 71-79, Verlag-Chemie, Weinheim, Federal Republic of Germany
11. Barone, A. D., Tang, J.-Y., and Caruthers, M. H. (1984) Nucleic Acids Res. 12, 4051-4061
12. Caruthers, M. H., McBride, L. J., Bracco, L. P., and Dubendorff, J. W. (1985) Nucleosides & Nucleotides 4, 95-105
13. Matteucci, M. D., and Caruthers, M. H. (1981) J. Am. Chem. Soc. 103, 3185-3191
14. Bambara, R., Jay, E., and Wu, R. (1974) Nucleic Acids Res. 1, 1503-1520
15. Tu, C., Jay, E., Bahl, C., and Wu, R. (1976) Anal. Biochem. 74, 73-93
16. Yansura, D. G., Goeddel, D. V., and Caruthers, M. H. (1977) Biochemistry 16, 1772-1780
17. Hawley, D. K., and McClure, W. R. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 6381-6385
18. McClure, W. R. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 5634-5638
19. McClure, W. R., Cech, C. L., and Johnston, D. E. (1978) J. Biol. Chem. 253, 8941-8948
20. Melancon, P., Burgess, R. R., and Record, M. T., Jr. (1983) Biochemistry 22, 5169-5176
21. Siebenlist, U. (1979) Nature 279, 651-652
22. Post, L. E., Arfsten, A. E., Reusser, F., and Nomura, M. (1978) Cell 16, 215-229
23. Berman, M. L., and Landy, A. (1979) Proc. Natl. Acad. Sci. U. S. A. 76, 4303-4307
24. Rosenberg, M., Chepelinsky, A. B., and McKenney, K. (1983) Science 222, 734-739
25. Stefano, J. E., and Gralla, J. D. (1980) J. Biol. Chem. 255, 10423-10430
26. Maxam, A., and Gilbert, W. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 560-564
27. Hawley, D. K., Malan, T. P., Mulligan, M. E., and McClure, W. R. (1982) in Promoters: Structure and Function (Rodriguez, R., and Chamberlin, M., eds) pp. 54-68, Praeger Scientific, Praeger Publishers, New York
28. Goeddel, D. V., Yansura, D. G., and Caruthers, M. H. (1977) Nucleic Acids Res. 4, 3039-3054
29. Chamberlin, M. J., Rosenberg, S., and Kadesch, T. (1982) in Promoters: Structure and Function (Rodriguez, R., and Chamberlin, M., eds) pp. 34-53, Praeger Scientific, Praeger Publishers, New York
30. Lu, P., Cheung, S., and Arndt, K. (1983) J. Biomol. Structure Dynam. 1, 509-521
31. Fisher, E. F., and Caruthers, M. H. (1979) Nucleic Acids Res. 13, 401-416
32. Caruthers, M. H. (1980) Acct. Chem. Res. 13, 155-160
33. Delort, A.-M., Neumann, J. M., Molko, D., Hervé, M., Téoule, R., and Dinh, S. T. (1985) Nucleic Acids Res. 13, 3343-3356
34. Simpson, R. B. (1982) in Promoters: Structure and Function (Rodriguez, R., and Chamberlin, M., eds) pp. 164-180, Praeger Scientific, Praeger Publishers, New York
35. Simpson, R. B. (1979) Cell 18, 277-285
36. McClure, W. R. (1985) Annu. Rev. Biochem. 54, 171-204
37. von Hippel, P. H., Bear, D. G., Morgan, W. D., and McSwiggen, J. A. (1984) Annu. Rev. Biochem. 53, 389-446
38. von Hippel, P. H. (1979) in Biological Regulation and Development (Goldberger, R. F., ed) pp. 279-347, Plenum, New York
39. Calladine, C. R. (1982) J. Mol. Biol. 161, 343-352
40. Buc, H., and McClure, W. R. (1985) Biochemistry 24, 2712-2723
41. Roe, J.-H., Burgess, R. R., and Record, M. T., Jr. (1985) J. Mol. Biol. 184, 441-453
42. Ts'o, P. O. P. (1974) in Basic Principles of Nucleic Acid Chemistry (Ts'o, P. O. P., ed) Vol. 1, pp. 453-577, Academic Press, Orlando, FL
43. Bloomfield, V. A., Crothers, D. M., and Tinoco, I., Jr. (1974) Physical Chemistry of Nucleic Acids, pp. 334-336, Harper and Row, New York
44. Yolov, A. A., Vinogradova, M. N., Gromova, E. S., Rosenthal, A., Cech, D., Veiko, V. P., Metelev, V. G., Kosykh, V. G., Buryanov, Ya. I., Bayev, A. A., and Shabarova, Z. A. (1985) Nucleic Acids Res. 13, 8983-8998
45. Seeman, N. C., Rosenberg, J. M., and Rich, A. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 966-970