Supporting Information - PNAS · 2010. 10. 20. · 3. Kuhlman B, et al. (2003) Design of a novel...



Supporting Information
Guntas et al. 10.1073/pnas.1006528107

SI Materials. All oligonucleotides were supplied by Eurofins MWG Operon. Kits to isolate library plasmids were from Qiagen and Fermentas. All restriction endonucleases and Taq DNA polymerase were from New England Biolabs. Affinity-tag chromatography, anion-exchange, gel-filtration, and PD-10 desalting columns were purchased from GE Healthcare. Cloned Pfu DNA polymerase, dNTPs, and the QuikChange site-directed mutagenesis kit were from Stratagene. BODIPY (507/545)-iodoacetamide was purchased from Invitrogen. pGEX-4T-1 and pET-21b vectors were from GE Healthcare and EMD Biosciences, respectively. Trimethoprim was purchased from Sigma.

SI Methods. Creation of the Directed Library in Silico. The structural alignment of Ubc12 (1) (PDB ID code 1Y8X) and UbcH7 docked to E6AP (2) (PDB ID code 1C4Z) was performed using the PyMOL molecular viewer. The initial random rigid-body perturbation of E6AP and Ubc12 was on the order of 1 Å normal and parallel to the interface with 5° of relative rotation. After the domains were slid into contact, Monte Carlo optimization of the docked position in centroid mode (side chains represented as spherical pseudoatoms) was used to relieve clashes and remove large gaps at the interface. Sequence design was performed with Rosetta's full-atom scoring function (3) and simulated annealing at 13 interface residues on E6AP (cysteine and proline were excluded). The neighboring residues were allowed to repack but not to change identity. The chi angles for the amino acid side-chain conformations were taken from Dunbrack's backbone-dependent rotamer library (4). For each chi angle, the set of angles included five samples (the average observed in the Dunbrack library and the values ±0.5 and ±1 standard deviations from it). Both histidine tautomers were allowed during simulations. After each round of sequence design, the backbone angles were minimized using gradient-based minimization. Sequence design and backbone minimization were iterated 10 times to arrive at the final model. The entire protocol was repeated 6,500 times, and the interface binding energy of the final models was predicted using the Rosetta full-atom scoring function. Binding energy was calculated by subtracting the calculated energy of the unbound proteins from the calculated energy of the bound complex. The side-chain and backbone conformations were kept the same in the unbound and bound states. The 323 top-scoring models were aligned to create the in silico amino acid profiles.
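The two pieces of arithmetic in this protocol, the five-point sampling of each chi angle and the binding-energy subtraction, can be sketched as follows. This is a minimal illustration and not Rosetta code: the function names and the example numbers are hypothetical, and the energies are assumed to be already-computed totals for the complex and for each unbound partner.

```python
def chi_samples(mean_deg, sd_deg):
    """Five samples per chi angle: the library mean and the values
    0.5 and 1 standard deviation on either side of it."""
    offsets = (-1.0, -0.5, 0.0, 0.5, 1.0)
    return [mean_deg + k * sd_deg for k in offsets]


def binding_energy(e_complex, e_unbound_a, e_unbound_b):
    """Energy of the bound complex minus the energies of the unbound partners."""
    return e_complex - (e_unbound_a + e_unbound_b)


# Illustrative numbers only (not taken from the study).
print(chi_samples(-60.0, 10.0))
print(binding_energy(-310.0, -200.0, -95.0))
```

Because the side-chain and backbone conformations are held fixed between the bound and unbound states, the subtraction isolates the interactions across the interface.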

Fixed-Backbone Design of E6AP*. Monomeric E6AP* was redesigned keeping the backbone fixed and using Rosetta's full-atom scoring function. The same 13 residues varied in the interface simulations were allowed to vary in identity. Neighboring residues were allowed to change side-chain conformation. One hundred independent simulations were performed to create the profile that shows the amino acid preferences of monomeric E6AP (Table S1).

Docking Simulations of D1 Design. D1's sequence and the sequence for Ubc12* were threaded onto the starting E6AP-Ubc12 model. Before docking, side chains throughout the entire structure were repacked. The initial relative perturbation parameters were set to 2-Å translations and 15° rotations. Following centroid-mode docking, interface residues were repacked, and one cycle of rigid-body minimization was performed. Finally, 50 cycles of Monte Carlo minimization were performed to arrive at a final model (5). Two thousand decoys were generated, and total score versus rmsd (with respect to the starting docked structure) was plotted (Fig. S4). The model with the lowest total energy is shown in Fig. 3B.
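The bookkeeping behind a score-versus-rmsd plot can be sketched as below. This is an assumption-laden illustration, not the docking software itself: the `Decoy` record and both helper functions are hypothetical, and the rmsd is computed without superposition, on the assumption that decoy and starting-model coordinates already share a reference frame.

```python
import math
from typing import NamedTuple


class Decoy(NamedTuple):
    name: str
    total_score: float
    rmsd: float


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) tuples, no superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def lowest_energy_decoy(decoys):
    """Return the decoy with the lowest total score."""
    return min(decoys, key=lambda d: d.total_score)
```

A pronounced funnel in such a plot, with the lowest-scoring decoys clustered at low rmsd, is usually read as support for the starting model.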

DNA Library Construction. The inserts for all mutant gene libraries of E6AP* were synthesized (described below) using the same PCR scheme (Fig. S5), except that the 5′ outside PCR primer and the 5′ restriction site for cloning were modified to avoid cross-contamination among the libraries. The plasmid pQE32-wtE6AP-DH[1,2], which expresses murine DHFR[1–108] as a C-terminal fusion to wild-type E6AP, was double-digested with either PstI/HindIII (directed library) or BglII/HindIII (random and semidirected libraries). Following overnight ligation at 14 °C of 1–1.5 μg of linearized vector with a fourfold molar excess of insert, the DNA was ethanol-precipitated and electroporated into Escherichia coli XL1-Blue cells. The library size was estimated by extrapolating from the number of colony-forming units obtained by plating a small fraction of the transformed cells on LB/ampicillin plates. The library plasmid (pQE32-E6AP*-DH[1,2]) was purified for use in protein complementation assay (PCA) selections.
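The two calculations implied by this paragraph, the insert mass needed for a fourfold molar excess and the extrapolation of library size from a plated fraction, can be sketched as follows. The function names are hypothetical, and the vector length, colony count, and volumes in the example are placeholders rather than values from the study; only the 567-bp insert length and the fourfold excess come from the text.

```python
def insert_mass_ug(vector_mass_ug, vector_bp, insert_bp, molar_excess=4.0):
    """Mass of insert giving the requested molar excess over the vector.

    Moles scale as mass / length for double-stranded DNA, so the mass ratio
    is the molar ratio times the length ratio.
    """
    return vector_mass_ug * molar_excess * insert_bp / vector_bp


def estimate_library_size(colonies_counted, total_volume_ul, plated_volume_ul):
    """Extrapolate total transformants from colonies on a plated aliquot."""
    if plated_volume_ul <= 0 or plated_volume_ul > total_volume_ul:
        raise ValueError("plated volume must be positive and no larger than the total")
    return colonies_counted * total_volume_ul / plated_volume_ul


# Placeholder inputs: a 4,000-bp vector and 250 colonies from 1 uL of a 1,000-uL recovery.
print(insert_mass_ug(1.0, 4000, 567))
print(estimate_library_size(250, 1000, 1))
```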

PCR Assembly and Amplification for the Directed Library. For the assembly of E6AP fragment [634–709] (Fig. S5), mutagenic oligonucleotides were combined and incubated in a 100-μL PCR reaction containing 0.25 mM each dNTP, 2 mM MgSO4, 10 mM KCl, 10 mM (NH4)2SO4, 0.1% Triton X-100, 0.1 mg/mL nuclease-free BSA, and 20 mM Tris-HCl (pH 8.8), using the following thermal cycling protocol: 95 °C for 5 min; 85 °C for 2 min; addition of 5 units of cloned Pfu DNA polymerase; 40 cycles of 95 °C for 30 s, 52 °C for 30 s, and 72 °C for 20 s; and finally 72 °C for 2 min. The final concentration of each oligonucleotide was 0.3 μM. Two and a half microliters of the assembly reaction was diluted into a 100-μL amplification reaction, and the target fragment (228 base pairs) was PCR-amplified using outside primers and standard PCR conditions. The amplified mutagenic fragment was annealed to the PCR-amplified E6AP [525–634] fragment, and overlap extension PCR was performed to amplify the full-length insert (567 base pairs), which was flanked by unique PstI and HindIII restriction sites.
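As a quick check on the numbers in this protocol, the sketch below totals the programmed hold times of the assembly reaction and computes the oligonucleotide carryover into the amplification reaction. The helper names are hypothetical; ramp times and the pause for enzyme addition are ignored, and the 100 μL is assumed to be the final volume of the amplification reaction.

```python
# Assembly program from the text, as (temperature in deg C, hold time in s).
PRE_CYCLE = [(95, 300), (85, 120)]
CYCLE = [(95, 30), (52, 30), (72, 20)]
N_CYCLES = 40
FINAL = [(72, 120)]


def hold_seconds(steps):
    """Sum the hold times of a list of (temperature, seconds) steps."""
    return sum(seconds for _, seconds in steps)


def total_hold_seconds(pre, cycle, n_cycles, final):
    """Total programmed hold time, excluding temperature ramps."""
    return hold_seconds(pre) + n_cycles * hold_seconds(cycle) + hold_seconds(final)


def diluted_concentration(stock_conc, transfer_volume, final_volume):
    """Concentration after transferring an aliquot into a larger final volume."""
    return stock_conc * transfer_volume / final_volume


print(total_hold_seconds(PRE_CYCLE, CYCLE, N_CYCLES, FINAL))  # seconds
print(diluted_concentration(0.3, 2.5, 100.0))                 # uM of each oligonucleotide
```

Under these assumptions the 40-fold dilution leaves each assembly oligonucleotide at about 7.5 nM in the amplification reaction.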

PCR Assembly and Amplification for the Random and Semidirected Libraries. The PCR conditions are as described above. The full-length inserts were flanked by BglII and HindIII restriction sites.

High-Throughput Screening. The plasmid pQE32-Ubc12*-DH[3] expresses murine DHFR[109–187] as a C-terminal fusion to Ubc12*. One and a half micrograms each of pQE32-Ubc12*-DH[3] and pQE32-E6AP*-DH[1,2] were mixed and transformed into E. coli BL21/pREP4 cells via six electroporations. After an hour of incubation at 37 °C, cells were washed twice with M9 minimal medium and plated on M9 minimal plates supplemented with 50 μg/mL kanamycin, 150 μg/mL ampicillin, 1 mM IPTG, 30 μM thiamine, and 2–40 μg/mL trimethoprim. Following incubation at room temperature for 4–5 d or at 37 °C for 2 d, a small fraction (∼1%) of the colonies that had been scraped off the plates was diluted into 50 mL of M9 medium supplemented with the same concentrations of selective antibiotics. Cells were incubated overnight at the selection temperature, and plasmid DNA was extracted. Using the harvested plasmid DNA as the template, the mutant E6AP* gene pool was PCR-amplified with Taq DNA polymerase. The PCR product was gel-purified, double-digested with the appropriate enzymes, and ligated to linearized pQE32-wtE6AP-DH[1,2] as described above for the next round of PCA selections. The stringency of selection for each round was modulated by varying the incubation temperature and trimethoprim concentration (Table S4).

Guntas et al. www.pnas.org/cgi/doi/10.1073/pnas.1006528107 1 of 6

Cloning, Expression, and Purification. E6AP* mutants and Ubc12* were PCR-amplified and cloned into pGEX-4T-1 and pET-21b, respectively. All point mutants of E6AP*-derived designs and of Ubc12* were made following the QuikChange site-directed mutagenesis protocol. Ubc12* was purified by His-tag affinity chromatography followed by anion-exchange chromatography. E6AP* mutants were purified in three steps. Following GST-affinity chromatography, proteins were cleaved from the GST tag by thrombin digestion. GST and E6AP* were then separated by anion exchange. Finally, monomeric E6AP* was collected using a Superdex 75 HiLoad 16/60 gel-filtration chromatography column.

Fluorescence Polarization. One hundred micromolar Ubc12* was labeled with BODIPY (boron-dipyrromethene) overnight at 4 °C in 500 μL of 50 mM Tris (pH 7.5), 1 mM BODIPY (507/545)-iodoacetamide, and 1 mM TCEP [tris(2-carboxyethyl)phosphine]. The suspension was centrifuged to recover the supernatant, and the reaction was stopped with 5 mM DTT. The labeled protein was buffer-exchanged into 20 mM KH2PO4, 150 mM NaCl, 5 mM β-mercaptoethanol (pH 7.0) using a PD-10 desalting column. The concentration of the labeled protein was determined as described before (6). The fluorescence polarization experiments and the data analysis to determine the dissociation constants were done essentially as described before (6). After each titration, polarization was measured in triplicate.
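The text defers the fitting procedure to ref. 6, so the model actually used is not stated here. As a hedged illustration only, the sketch below implements a generic 1:1 binding model with ligand depletion, which is one common way to relate polarization to a dissociation constant. All function and parameter names are hypothetical, and the concentrations need only share a unit (for example, μM).

```python
import math


def fraction_bound(titrant_total, labeled_total, kd):
    """Fraction of labeled protein in complex for 1:1 binding.

    Solves the mass-action quadratic for the complex concentration, so it
    stays valid when the labeled protein is not dilute relative to Kd.
    """
    b = titrant_total + labeled_total + kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * titrant_total * labeled_total)) / 2.0
    return complex_conc / labeled_total


def polarization(titrant_total, labeled_total, kd, p_free, p_bound):
    """Observed polarization as a bound-fraction-weighted mix of two states."""
    f = fraction_bound(titrant_total, labeled_total, kd)
    return p_free + (p_bound - p_free) * f
```

Fitting Kd would then amount to minimizing the squared difference between `polarization(...)` and the measured values over the titration series, with `p_free` and `p_bound` either fixed from controls or fitted alongside Kd.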

Analytical Size Exclusion Chromatography. For the analysis of pure proteins, 32 μL of a 100 μM solution of each protein was loaded onto a Superdex 75 PC 3.2/30 column. For the analysis of protein complexes, 32 μL of a protein solution containing 100 μM of each binding partner was loaded onto the column. The flow rate was 0.1 mL/min.

PCA Assays. Survival in the first round of PCA selection was compared for all libraries. Based on agarose gel and spectrophotometric analysis, 200 ng of plasmid expressing either the library mutants or the parent E6AP* as a fusion to DHFR[1–108] was mixed with 200 ng of pQE32-Ubc12*-DH[3] plasmid and electroporated into E. coli BL21/pREP4 cells. Fifty percent of the cells from each transformation were plated on an M9 minimal medium agar plate with 2 μg/mL trimethoprim and incubated at room temperature for 4–5 d. A small fraction of the cells from each transformation was plated on LB plates supplemented with 150 μg/mL ampicillin and 50 μg/mL kanamycin to determine the transformation efficiency. The comparable numbers of colonies on rich plates suggest that the different numbers of colonies observed on selective M9 minimal plates were not due to differences in electroporation efficiency.

1. Huang DT, et al. (2005) Structural basis for recruitment of Ubc12 by an E2 binding domain in NEDD8's E1. Mol Cell 17:341–350.
2. Huang L, et al. (1999) Structure of an E6AP-UbcH7 complex: Insights into ubiquitination by the E2-E3 enzyme cascade. Science 286:1321–1326.
3. Kuhlman B, et al. (2003) Design of a novel globular protein fold with atomic-level accuracy. Science 302:1364–1368.
4. Dunbrack RL, Jr, Cohen FE (1997) Bayesian statistical analysis of protein side-chain rotamer preferences. Protein Sci 6:1661–1681.
5. Gray JJ, et al. (2003) Protein-protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations. J Mol Biol 331:281–299.
6. Eletr ZM, et al. (2005) E2 conjugating enzymes must disengage from their E1 enzymes before E3-dependent ubiquitin and ubiquitin-like transfer. Nat Struct Mol Biol 12:933–934.

Fig. S1. Binding experiment with WT E6AP and WT Ubc12. E6AP (700 μM) was titrated into BODIPY-labeled Ubc12 (2 μM), and the change in fluorescence polarization was monitored. In similar experiments with partners that bind with tight affinity, the fluorescence polarization rises to ∼0.4 upon binding (Fig. 2). These data suggest that the affinity between E6AP and Ubc12 is weaker than 1 mM (Kd > 1 mM).


Fig. S2. A comparison of computational (blue) and experimental (red) amino acid profiles for the directed library. The experimental values were calculated assuming that the allowed nucleic acids at each sequence position were incorporated with equal frequency.


Fig. S3. SDS-PAGE of fractions collected using analytical size exclusion chromatography. When mixed with Ubc12*, the tight binder SD3 causes Ubc12* to elute in earlier fractions than either pure Ubc12* or Ubc12* mixed with the weak binder D2.

Fig. S4. Total energy versus rmsd plot. Docking simulations to predict the interface between "D1" and Ubc12* generated 2,000 decoy structures. The rmsd is calculated relative to the starting model of E6AP docked to Ubc12*. The model that has the lowest total energy is shown in Fig. 3B.

Fig. S5. Synthesis of library inserts. Overlapping designed mutagenic oligonucleotides were annealed and extended to assemble and amplify the genetic fragment that encodes E6AP* residues 634–709. This fragment was mixed and extended with the E6AP* fragment that encodes residues 525–634. DNA sequences for residues 628–634 are identical in both fragments, and this overlap region serves as the annealing site for SOE-PCR to amplify the full-length insert.


Table S1. Amino acid preferences for monomeric E6AP*

E6AP* residue

V634 L635 S638 L639 L642 M653 I655 I659 S660 Q661 I682 F690 Y694

ARG (9) 1 (8) (4) (2) (1) 68 (35)
LYS (7) (12) 32 (18) (1) (3) (9) (1) 5 (27)
HIS (2) (1)
ASP (5) 2 (21) (11) (1) (9)
GLU 32 (6) 72 (21) 47 (17) (2) (1) (1)
PHE (2) (2) (4) (1)
TYR (1) (1) (12) (1) 100 (29) (8) 100 (95) 4 (11)
TRP (4) (2)
LEU (2) 25 (12) (3) 100 (98) 100 (96) 5 (9) 23 (12)
ILE (1) 21 (1) (3) 100 (49) (1) (1)
VAL (4) (2) (10) (1) 22 (14) 100 (99) (1)
MET (1) (4) (1) (1) (1)
ALA (2) (1) (6)
GLY
SER (4) (1) (5) (4) (1) (1) (45) (8) (6)
THR 68 (39) (13) (16) (1) 100 (92) (36) 100 (97) (5) 73 (41) (2)
ASN (2) (3) (1) (5)
GLN (17) (4) (5) (1) (1)
PRO
CYS
STOP

Amino acid percentages in the computationally designed sequences. Percentages are shown from two sets of simulations, each comprising 100 fixed-backbone design simulations with monomeric E6AP. Percentages shown in parentheses derive from runs with a higher final temperature during simulated annealing and therefore have more varied sequences. Proline and cysteine were not allowed in the computational protocol. Shaded boxes indicate the amino acids included in the experimental library. In addition to covering the sequence space observed computationally, the experimental library included the wild-type amino acid at each sequence position. The theoretical diversity of the experimental library is 5.6 × 10^7.

Table S2. Sequences isolated and their affinities for Ubc12*

      RM†                  634 635 638 639 642 653 655 659 660 661 682 690 694 Kd, μM
D1    G629R                R   G   W   L   L   V   I   I   P   T   V   Y   L   0.034 ± 0.017
D2                         A   R   W   L   L   L   L   I   P   T   V   Y   Y   60 ± 9
D3    G629R                R   G   W   L   L   L   L   I   P   T   V   Y   L   1.6 ± 0.3
D4    G629R                R   G   W   L   L   V   I   I   G   A   V   Y   L   0.023 ± 0.016
D5                         R   G   W   L   L   L   L   I   P   T   V   Y   L   2.4 ± 0.5
D6                         R   G   W   L   L   L   L   I   G   A   V   Y   L   ND
D7    E557K, A599D         R   A   W   L   L   V   I   I   G   A   V   Y   L   ND
D8    E557K                R   G   W   L   L   L   L   I   P   A   V   Y   Y   ND
SD1                        R   A   R   L   L   M   I   I   P   P   V   F   V   0.14 ± 0.02 (0.32 ± 0.08)
SD2                        R   A   T   L   L   M   I   I   P   P   V   Y   L   0.32 ± 0.06 (0.59 ± 0.06)
SD3   E557K, Q637R         H   V   Y   L   L   M   I   I   P   P   V   F   V   0.019 ± 0.003 (0.064 ± 0.007)
SD4   E685D                R   C   H   L   L   M   I   I   P   P   V   F   L   0.052 ± 0.010 (0.16 ± 0.02)
SD5                        R   A   H   L   L   M   I   I   P   P   V   F   L   0.039 ± 0.012 (0.11 ± 0.01)
SD6   E685D                T   A   R   L   L   M   I   I   P   P   V   Y   L   (1.44 ± 0.30)
SD7   E685D                T   A   R   L   L   M   I   I   P   P   V   F   V   (0.28 ± 0.04)
SD8                        W   A   S   L   L   M   I   I   P   P   V   F   I   (0.16 ± 0.02)
SD9   E685D                T   A   R   L   L   M   I   I   P   P   V   F   L   (0.27 ± 0.05)
SD10  E557K, V649A         R   A   Q   L   L   M   I   I   P   P   V   F   L   (0.15 ± 0.02)
SD11  E557K, M619T         R   V   H   L   L   M   I   I   P   P   V   Y   L   (2.23 ± 0.41)
SD12                       T   A   R   L   L   M   I   I   S   P   V   F   V   ND
SD13  E685D                R   I   H   L   L   M   I   I   P   P   V   F   L   ND
SD14  E685D                T   M   R   L   L   M   I   I   P   P   V   F   L   ND
SD15                       R   A   L   L   L   M   I   L   S   P   V   F   V   ND
SD16  D543G, E557K, E685D  T   A   R   L   L   M   I   I   P   P   V   F   L   ND
SD17  N692D                R   A   S   L   L   M   I   I   P   P   V   Y   L   ND
WT                         V   L   S   L   L   M   I   I   S   Q   I   F   Y   188.7 ± 108.4

Mutations shown in bold were introduced as a result of PCR at designed positions during the screening. Kd values in parentheses were measured with GST-fused protein. ND: not determined.
†RM: Random mutations (interface mutations are shown in italics) introduced as a result of PCR at undesigned residues.


Table S3. Sequence similarity between D1 and top sequence-matching designs

Design  634 635 638 639 642 653 655 659 682 690 694 Match
1       R   L   G   L   L   M   I   I   V   Y   Y   7/11
2       R   L   A   L   L   M   M   I   V   Y   L   7/11
3–8     R   L   L   L   L   H   I   I   V   F   L   7/11
9–11    R   L   A   L   L   H   I   I   V   F   L   7/11
12      R   L   L   L   L   T   I   I   V   F   L   7/11
13      R   L   T   L   L   S   I   I   V   F   L   7/11
14      R   L   A   L   L   T   I   I   V   F   L   7/11
15      R   L   S   L   L   T   I   T   V   Y   L   7/11
16      R   L   S   L   L   T   I   T   V   Y   L   7/11
17      R   L   A   L   L   T   I   T   V   Y   L   7/11
18      E   V   W   L   L   H   I   I   V   Y   A   7/11
19      L   R   W   L   L   S   I   T   V   F   L   6/11
20      D   G   F   L   L   H   I   T   V   Y   A   6/11
21      G   G   F   L   L   H   S   I   V   Y   A   6/11
D1      R   G   W   L   L   V   I   I   V   Y   L

Matching residues are shown in bold. Sequences 3 through 8 and 9 through 11 were sampled more than once. Positions 660 and 661 were not considered in this analysis.
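The Match column is a position-by-position identity count against D1 over the 11 positions listed. A minimal sketch of that count follows, using three rows copied from the table; the function name and the dictionary layout are hypothetical.

```python
# Residues at positions 634, 635, 638, 639, 642, 653, 655, 659, 682, 690, 694.
D1 = "RGWLLVIIVYL"

# A few rows copied from Table S3, keyed by design number.
DESIGNS = {
    "1": "RLGLLMIIVYY",
    "2": "RLALLMMIVYL",
    "19": "LRWLLSITVFL",
}


def count_matches(design, reference):
    """Number of positions at which two equal-length sequences agree."""
    if len(design) != len(reference):
        raise ValueError("sequences must have the same length")
    return sum(a == b for a, b in zip(design, reference))


for name, sequence in DESIGNS.items():
    print(f"Design {name}: {count_matches(sequence, D1)}/{len(D1)}")
```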

Table S4. PCA selection conditions

              Round 1      Round 2      Round 3                      Round 4
Directed      [RT, 2, 5]   [37, 2, 3]   [37, 2, 2]                   [37, 40, 2]
Random        [RT, 2, 5]   [37, 2, 3]   [37, 2, 2]                   [37, 40, 2]
Semidirected  [RT, 2, 5]   [37, 2, 3]   [37, 40, 2] or [37, 2, 2] *  [37, 2, 2] *

Selection conditions: [temperature (°C), trimethoprim (μg/mL), incubation time (days)]. RT: room temperature.
*The selection was performed using Ubc12*-DHFR[3]_I114A, which allows a more stringent selection.
