[] Encyclopedia of Polymer Science and Technology.(BookFi.org)

Encyclopedia of Polymer Science and Technology. Copyright John Wiley & Sons, Inc. All rights reserved.

  • Vol. 10 GENETIC METHODS OF POLYMER SYNTHESIS 145

    GENETIC METHODS OF POLYMER SYNTHESIS

    Introduction

    Polymers have found enormous technological significance in a variety of areas including plastics, resins, fibers, and biomaterials. Both the chemical nature of the polymer and its structure are critical in determining its properties; the importance of polymer structure in dictating function has, for many decades, fueled the development of synthetic strategies for creating well-defined macromolecules. During the past 15 years, the application of biosynthetic strategies has gained increasing prominence as a polymer synthetic method, owing to the fact that the biosynthesis of macromolecules offers precise control of monomer sequence and macromolecular structure. This control is the source of the information storage, hierarchical assembly, mechanical strength, and enzymatic properties of natural macromolecules such as nucleic acids and proteins, which serve as an inspiration to the polymer chemist. Genetic methods of polymer synthesis have therefore been used to create protein-based polymeric materials inspired by natural molecules such as collagen, silk, and elastin, as well as materials comprising amino acid sequences with no natural counterpart. The macromolecules produced by these strategies exhibit novel and well-controlled properties, and genetically directed synthetic methods promise the contribution of additional advanced materials in the future.

    Properties of Synthetic Polymers

    Synthetic polymers, used here to describe polymers produced by chemical (ie, nonbiological) methods, have enjoyed great technological success since the beginning of their development in the early 20th century. A nearly limitless set of monomers can be polymerized by a vast variety of chemical transformations, which has contributed to the successful application of polymers in disparate applications, such as textiles, high temperature materials, and biomaterials. The properties of polymers are controlled by their chemical composition and by structural parameters such as polymer chain length, stereochemistry, monomer sequence, and topology. Because chemical polymerizations are statistical processes, synthetic polymers exist as a population of different molecules of varying structure, rather than as a collection of identical molecules. The properties of a synthetic polymer can therefore only be characterized with respect to the average properties of the chain population and the distribution around this average. Such heterogeneity has not adversely affected the use of these macromolecules in bulk application, but does have profound consequences when attempting to engineer polymer properties on the nanometer length scale. In this regime, important parameters such as functional group placement and macromolecular assembly are directly affected by variations in molecular weight, sequence, stereochemistry, and topology.
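The idea of an average and a distribution around it can be made concrete with the usual number-average and weight-average molar masses. The following is a minimal sketch, not taken from the source: the function name and both populations are hypothetical, chosen only to contrast a statistical chain population with a population of identical chains.

```python
def chain_statistics(population):
    """Return (Mn, Mw, dispersity) for a chain population.

    `population` maps molar mass in g/mol to the number of chains of that mass.
    """
    n_total = sum(population.values())
    mass_total = sum(mass * count for mass, count in population.items())
    m_n = mass_total / n_total  # number-average molar mass
    # Weight-average molar mass: each chain is weighted by its own mass.
    m_w = sum(count * mass * mass for mass, count in population.items()) / mass_total
    return m_n, m_w, m_w / m_n


# Hypothetical populations, for illustration only.
synthetic_population = {8000: 20, 10000: 50, 12000: 30}  # statistical polymerization
protein_population = {43000: 100}                        # identical chains

synthetic_stats = chain_statistics(synthetic_population)
protein_stats = chain_statistics(protein_population)
print("synthetic: Mn=%.0f Mw=%.0f dispersity=%.3f" % synthetic_stats)
print("protein:   Mn=%.0f Mw=%.0f dispersity=%.3f" % protein_stats)
```

A population of identical molecules gives a dispersity of exactly 1, whereas any spread in chain length pushes the value above 1.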

    There have been many advances in synthetic polymer chemistry that have increased the control of polymer structure. These advances began most notably in the 1950s with the implementation of Ziegler-Natta polymerizations (1,2) and living polymerizations (3–5), which permitted, for the first time, control over architectural parameters such as stereochemistry and molecular weight, respectively. The development of these methods was critical to developing an understanding of structure-property relationships in polymeric materials and for the production of useful plastics, fibers, and elastomers. Since that time, a variety of novel chemical strategies for controlling polymerizations have continued to evolve, including the use of organometallic catalysts to control stereochemistry in olefinic polymers; atom-transfer radical polymerizations that permit living free-radical polymerization; living N-carboxyanhydride polymerizations; and living ruthenium-catalyzed ring opening metathesis polymerizations (2,6–15). These methods have been extremely important for producing stereoregular materials, synthetic macromolecules with unusual self-assembly properties and phase separation behavior, polymers with triggered and controlled response, and functionalized polymers with controlled molecular weight. Despite these advances, precise control of functional group placement remains elusive in polymers produced by chemical methods, which can limit their utility in applications where precise molecular recognition, catalysis, and assembly are desired. Because genetically directed methods of polymer synthesis offer precise control over molecular weight, stereochemistry, and sequence, they have been increasingly applied over the last 15 years toward the synthesis of exactly defined protein-based materials that have potential uses in materials, biomedical, and nanotechnology applications.

    Biosynthesis of Proteins

    The control over protein synthesis in living organisms is derived from the templated nature of the genetically directed polymerization (Fig. 1). A DNA message encoding an amino acid sequence, whether naturally derived or artificial, is transcribed (converted) to a messenger RNA (mRNA) sequence. This sequence is translated (decoded) by transfer RNA (tRNA) molecules, which deliver amino acids to the ribosome. At the ribosome, the corresponding amino acid sequence is linked together enzymatically to form the protein. The tRNA molecules carry the appropriate amino acid to its exactly specified location along the protein chain via codon-anticodon pairing, and are charged with the appropriate amino acid by a highly selective class of enzymes, the aminoacyl tRNA synthetases. These enzymes exhibit an extremely small error frequency in amino acid incorporation, fewer than one error in 10,000 (16), so the amino acid sequence is preserved with great fidelity during protein biosynthesis; this permits the exact control of the placement of over 20 different amino acids along the protein backbone. This primary sequence directs the formation of regular secondary structure (helical, coil, or sheet) and mediates the 3-D structure, assembly, and molecular recognition that are the hallmark of biological materials and the source of their remarkable properties. The strict sequence control, then, allows the synthesis of a remarkably diverse set of materials from a single set of building blocks. For example, the range of properties of Silk (qv), Collagen (qv), and elastin, which are each synthesized from the same set of 20 natural amino acids, arises from differences in their primary sequences and consequently different conformations and assembly.

    Fig. 1. Schematic of protein biosynthesis.
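The templated transcription and translation steps described above can be sketched in a few lines. This is an illustrative toy, not a bioinformatics tool: the codon table is a small excerpt of the standard genetic code, the DNA string is a hypothetical example, and the fidelity estimate assumes that incorporation errors are independent and occur at the upper bound of one in 10,000 quoted in the text.

```python
# Excerpt of the standard genetic code (mRNA codons); "*" marks a stop codon.
CODON_TABLE = {
    "AUG": "M",
    "GGU": "G", "GGC": "G",
    "GCU": "A", "GCG": "A",
    "UCU": "S", "AGC": "S",
    "UAA": "*",
}


def transcribe(dna_coding_strand):
    """Convert the DNA coding strand to mRNA by replacing T with U."""
    return dna_coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon is reached."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


def error_free_fraction(n_residues, error_rate=1e-4):
    """Fraction of chains with no misincorporation, assuming independent errors."""
    return (1.0 - error_rate) ** n_residues


dna = "ATGGGTGCTGGCGCGGGTTCTTAA"  # hypothetical gene fragment
protein = translate(transcribe(dna))
print(protein)
print(round(error_free_fraction(1000), 3))
```

Under these assumptions, roughly nine in ten chains of 1000 residues would carry no error at all; because the quoted error frequency is an upper bound, the real fraction would be at least this high.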

    The mechanical properties of these natural proteins, which can be comparable to and sometimes exceed those of high performance synthetic materials (Table 1), coupled with their potential biological activity, have been a motivation for the synthesis of polymers via genetic methods. Capturing these properties in genetically engineered artificial protein polymers offers the possibility of biologically derived performance materials and other protein materials that can be engineered with properties not currently found in the natural protein repertoire (eg, novel folding motifs and assemblies). Furthermore, the exact molecular weight control and precise specification of amino acid position of genetically directed methods provide materials that carry sufficient information to exactly control their folding and assembly. Because proteins assume thermodynamically stable structures in solution, the 3-D placement of functional groups on the atomic length scale, with assembly controlling placement on longer length scales, is possible; the advantages of this assembly include the purposeful control of the mechanical, chemical, and biological properties of the artificial protein polymer chain. The variety of amino acid sequences that can be constructed and produced in cellular hosts via protein engineering methods provides access to a wide variety of protein polymers with diverse properties. Additionally, the incorporation of non-natural amino acids in these polymers has significantly broadened the versatility of in vivo protein synthesis as a route to producing polymeric materials.

    Table 1. Representative Properties of Several Natural Proteins and Synthetic Fibers (a)

    Material | UTS, MPa | Modulus, GPa | % Strain at break
    B. mori silk | 740 | 10 | 20
    N. clavipes silk | 875–972 | 11–13 | 17–18
    Collagen (mammalian tendon) | 120 | 1.2 | 13
    Elastin (bovine ligament) (b) | 2 | 0.0011 | 150
    Polylactic acid, 50,000–300,000 MW | 28–50 | 1.2–3.0 | 2–6
    Bone | 160 | 20 | 3
    Kevlar (49 fiber) | 3600 | 130 | 2.7
    Synthetic rubber | 50 | 0.001 | 850

    (a) Adapted from Ref. 17, with permission from Elsevier.
    (b) From Ref. 18.

    Genetically Directed Synthetic Strategies

    Recombinant DNA methods have been traditionally used in site-directed mutagenesis studies designed to probe protein folding or enzymatic activity. The ease with which genetic sequences can be constructed has, however, led to the increased use of these methods for the synthesis of proteins with repetitive sequences. Naturally occurring materials with desired properties can be produced in organisms such as bacteria and yeast, simply by transferring the DNA encoding the natural protein into the appropriate expression host. Alternatively, short, repetitive amino acid sequences that are found in naturally occurring proteins can be linked together to form an artificial protein (one that does not exist in nature). Finally, since the folding of certain amino acid motifs is understood, amino acid sequences not found in nature can be designed from scratch (de novo design) to produce artificial proteins with desired chemical, biological, and/or physical behavior.

    The production of naturally occurring proteins, such as silk, collagen, or elastin, is motivated by their excellent mechanical properties. In one approach for producing these proteins, the cDNA encoding the natural protein is isolated from an organism that produces the protein of interest and is cloned into an expression plasmid that carries the necessary genetic information to direct the expression host to produce both RNA and protein. While this approach guarantees that the polymer sequence will match that of the natural protein, difficulties with gene stability and protein synthesis can be encountered when the genes are extremely long and repetitive and when the coding preferences of the organism from which the DNA is isolated are different from those of the expression host. In many cases, it is not possible or necessary to construct a single DNA sequence that encodes the full-length protein, and so shortened DNA sequences from the natural protein can be used.

    Since many natural structural proteins comprise highly repetitive amino acid sequences, strategies to produce artificial repetitive proteins with properties analogous to those of the natural proteins have also been developed. The artificial repetitive proteins are either based on naturally occurring amino acid sequences or on sequences identified via de novo design. The overall strategy for gene design and expression is summarized in Figure 2. Synthetic genes encoding the artificial proteins are produced via standard molecular biological protocols; the genes for short amino acid sequences (30–40 amino acids, 90–120 base pairs) are first produced by solid-phase chemical synthesis and are then ligated into circular plasmid DNA, which serves as a source of the DNA monomer for future experiments. The initial solid-phase synthesis of the short genes permits use of DNA sequences that reflect the natural coding preferences of the organism in which the protein will be expressed and allows reduction of repetition in codon usage. It also permits the inclusion of amino acids with desired chemical functionality, or the insertion of biologically active sequences, to capture a variety of desired materials properties in the resulting protein. After the sequence of the gene in the plasmid is confirmed, the monomer is isolated from the plasmid and multimerized enzymatically. The multimerization proceeds in a head-to-tail orientation to form genes that can have lengths of up to 5000 base pairs (approximately 1675 amino acids); these genes are ultimately ligated into an expression plasmid (vide infra).

    Fig. 2. Protein engineering strategy for the genetically directed synthesis of protein polymers.
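The relationship between peptide length and gene length, and the idea of reducing codon repetition, can be sketched as follows. This is a hedged illustration only: the codon lists are hypothetical choices among valid synonymous codons rather than a measured host usage table, and cycling through them is just one simple way to avoid repeating the same codon; the 36-residue peptide is likewise an assumed example.

```python
# Hypothetical synonymous codon choices (DNA); a real design would consult the
# codon usage table of the intended expression host.
CODON_CHOICES = {
    "G": ["GGT", "GGC"],
    "A": ["GCG", "GCT"],
    "S": ["TCT", "AGC"],
}


def back_translate(peptide):
    """Encode a peptide as DNA, cycling through synonymous codons per residue type."""
    used = {}
    codons = []
    for amino_acid in peptide:
        options = CODON_CHOICES[amino_acid]
        index = used.get(amino_acid, 0)
        codons.append(options[index % len(options)])
        used[amino_acid] = index + 1
    return "".join(codons)


monomer_peptide = "GAGAGS" * 6          # 36 residues, within the 30-40 range above
monomer_gene = back_translate(monomer_peptide)

print(len(monomer_peptide), "residues ->", len(monomer_gene), "base pairs")
print(monomer_gene[:36])
print("codons in a 5000-bp gene:", 5000 // 3)
```

Three bases per residue gives 108 base pairs for the 36-residue monomer, and a 5000-bp gene corresponds to about 1670 codons, consistent with the approximate figure quoted above.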

    There are several different strategies (random, iterative, and recursive) that have emerged for the construction of multimers, as outlined in Figure 3, although the use of one strategy is not exclusive of another; ie, a combination of these approaches can be used in the construction of a single artificial gene (19). In random approaches, such as the general one described above, DNA monomers are oligomerized in a single step, creating a population of oligomers of different lengths. In iterative approaches, a DNA segment is oligomerized in a series of uniform steps, each step adding one DNA segment to the growing oligomer. In recursive approaches, the DNA monomers are joined sequentially, with the length of the ligated segments doubling at each step. Random approaches offer the advantage of rapid creation of a library of repetitive genes of different lengths in a single step. Although this method does not guarantee the synthesis of a specific gene of desired length, it has been the most widely adopted of the multimerization strategies, owing to its simplicity and the fact that ligation conditions can be controlled to increase the probability of obtaining multimers of a given molecular weight range. Iterative and recursive approaches offer the advantage of production of genes with a specific and predetermined molecular weight, which is desirable if the effects of sequential changes in molecular weight or the impact of a precise and alternating block copolymer architecture are of interest. These methods require repetitive cloning steps, the exact number depending on the desired length of the gene. The different approaches are used depending on the requirements of a specific macromolecular design, and demonstrate the flexibility of genetic methods of polymer synthesis for the production of varied types of polymer architectures.

    Fig. 3. Multimerization strategies employed for the construction of artificial genes. Adapted from Ref. 19, with permission from Elsevier.
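The difference between the three multimerization strategies is easiest to see by counting monomer copies. The sketch below is a toy model under stated assumptions: all function names are hypothetical, each iterative step is taken to add exactly one monomer, each recursive step to double the insert, and the random strategy is represented by a uniform draw of lengths, which is a placeholder and not a model of real ligation kinetics.

```python
import random


def iterative_copies(n_steps):
    """Monomer copies after n cloning steps, adding one monomer per step."""
    return 1 + n_steps


def recursive_copies(n_steps):
    """Monomer copies after n cloning steps, doubling the insert at each step."""
    return 2 ** n_steps


def random_library(n_clones, max_copies, seed=0):
    """Toy library from a single ligation step: one multimer length per clone.

    A uniform distribution is assumed purely for illustration.
    """
    rng = random.Random(seed)
    return sorted(rng.randint(1, max_copies) for _ in range(n_clones))


for steps in range(1, 5):
    print(steps, "steps: iterative", iterative_copies(steps),
          "copies, recursive", recursive_copies(steps), "copies")

library = random_library(n_clones=10, max_copies=20)
print("random library lengths (monomer copies):", library)
```

The random approach yields many lengths at once, which must then be screened, while the iterative and recursive routes reach one predetermined length at the cost of repeated cloning steps.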

    Once the appropriate gene library is constructed, individual genes are incorporated into an expression plasmid, which is then incorporated into an expression host, most commonly E. coli. Because only one type of plasmid is maintained per cell, the plasmid DNA of individual bacterial colonies can be screened to identify and isolate a specific synthetic gene of desired length. The chosen synthetic gene is expressed from an expression plasmid (Fig. 2), which directs the synthesis of mRNA and protein in the cell. The expression plasmid also contains a promoter that regulates protein production and that can be controlled by inexpensive external means (eg, addition of chemicals, change in temperature). During protein expression, host cells are grown to a desired density, and then protein synthesis is initiated by simple addition of chemicals or change in temperature. Protein polymers can be produced at high levels in host cells; protein generally accumulates intracellularly (although it can also be transported out of the cell) and is then isolated from the cellular proteins via cell lysis, washing, precipitation, and chromatographic methods. The proteins are analyzed via amino acid analysis, chromatographic methods, mass spectrometry, NMR, and immunochemical analysis to confirm their identity; the pure product is absolutely homogeneous in both molecular weight and sequence. A variety of protein polymers based on silk, collagen, elastin, mussel adhesive proteins, and viral spike proteins have been produced by these methods, and there are also reports of the production of hybrid polymeric materials comprising synthetic polymers coupled to genetically engineered motifs. There are limited reports of the synthesis of the mussel adhesive and viral spike protein polymers (20,21) and the hybrid polymers (22–24). The vast majority of materials have been produced from silk, collagen, elastin, and artificial amino acid sequences (Table 2); research in the design, synthesis, and characterization of these materials will therefore be presented below.

    Silk-Like Protein Polymers

    Naturally occurring silk (qv) is produced by a variety of organisms such as the silkworm (eg, Bombyx mori), spider (eg, Nephila clavipes), and scorpion, and the toughness and mechanical properties of the silk fiber have interested materials scientists for its use as a lightweight, high performance fiber, and in composite applications. Of all the silks, those from B. mori and from the dragline of N. clavipes have been the most extensively studied. The silk from the silkworm B. mori has been cultivated for centuries and used as a textile fiber. Spider silks, in contrast, are a family of proteins, in which there are multiple types of silk; for example, N. clavipes has seven different types of silk, each of which is tailored for a specific function. All silks have a highly repetitive amino acid sequence, which is characteristic of the fibrous proteins (eg, silk, collagen, and elastin) that serve a mechanical or structural function; globular proteins that serve catalytic or molecular recognition functions do not contain such repetitive sequences. The exact nature of the repetitive amino acid sequence depends on the type of silk, but includes alanine- and glycine-rich regions that have a high propensity for β-sheet formation via intrachain hydrogen bonding, interrupted by less crystalline or amorphous domains. The crystalline region from B. mori fibroin, for example, has the sequence [GAGAGSGAAG(SGAGAG)8Y], and it is estimated to occur approximately 70 times within the protein (101). The repeated motifs from spider silk are less conserved, but a 13 amino acid repeat, (YGGLGSQGAGRGG), based on cDNA sequences, has been identified (102); this consensus region is generally followed by a short polyalanine sequence. The novel mechanical properties of silks are believed to arise from the folding of the protein into small β-sheet crystallites that are dispersed in a noncrystalline matrix (103); shear forces during fiber spinning permit processing and alignment of the crystallites to produce excellent fiber properties. The dragline of N. clavipes, for example, dissipates energy over a large surface area and balances stiffness (modulus of 10–50 GPa), strength (tensile strength of near 1.0 GPa), and extensibility (elongation to break of 10–30%) (104) (Table 1). These mechanical properties compare well to those of synthetic high performance materials, with a high tensile strength that approaches that of Kevlar, a toughness 3 times that of aramid fibers, and a strength 5 times that of steel by weight.
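The shorthand used for repetitive sequences, in which a subscript gives the number of copies of a bracketed unit, expands directly into a residue string. As a small worked example, the B. mori crystalline repeat quoted above can be written out and its composition counted; the variable names are illustrative, and the figure of 70 copies is the approximate estimate given in the text.

```python
from collections import Counter

# [GAGAGSGAAG(SGAGAG)8Y]: the subscript 8 means eight copies of SGAGAG.
crystalline_unit = "GAGAGSGAAG" + "SGAGAG" * 8 + "Y"

composition = Counter(crystalline_unit)
unit_length = len(crystalline_unit)
glycine_fraction = composition["G"] / unit_length

print("residues per unit:", unit_length)
print("composition:", dict(composition))
print("glycine fraction: %.2f" % glycine_fraction)

# About 70 copies of the unit are estimated to occur in the natural protein.
print("residues in ~70 units:", 70 * unit_length)
```

The count makes the glycine- and alanine-rich character of the crystalline region explicit: 49 of the 59 residues in each unit are glycine or alanine.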

    Protein engineering strategies have been widely employed for the production of silk-like proteins, both in research investigations aimed at understanding the molecular basis for the excellent mechanical properties of silk, as well as in commercial applications that employ protein engineering methods as a source of specialty materials (or potentially materials for bulk fiber applications). Protein engineering of silk is motivated in large part by the difficulty of isolating these proteins in large quantities from the predatory spider, which cannot be cultivated like the silkworm. Despite advances in recombinant DNA methods, no complete cDNA or genomic DNA sequence of a natural silk protein has been successfully utilized for protein expression. The difficulties arise primarily because of the extreme length and repetitiveness of these genes, and also potentially because of the poor codon correspondence between the animal and the expression host, which places an unbalanced demand on aminoacyl-tRNA pools during bacterial protein expression. As a result of these difficulties, genetic deletions and/or premature termination during protein synthesis are often observed. However, the relatively simple, repetitive amino acid sequences that comprise silk proteins can be easily used to construct artificial silk-like proteins, and this approach has addressed the instability and poor production of protein from the expression plasmids prepared from the natural cDNA. Indeed, these approaches have been used to produce silk-like protein polymers containing nearly 1000 amino acids from bacterial expression systems, and somewhat longer polymers in yeast, indicating the flexibility of the method. Limitations in predicting protein polymer structure and function from the properties of the repetitive amino acid sequences can occur, but have not caused serious problems in the design of silk-like artificial proteins; the repetition of consensus sequences of natural silks provides protein-based polymers in which the conformational properties of the original protein are well preserved.

    Table 2. Representative Consensus Sequences Used for Genetically Synthesized Protein Polymers (a)

    Protein model | Natural protein | Expression host | Ref.

    Silk-like proteins
    (GAGAGS)x | B. mori silk | E. coli | 26
    [(GAGAGS)9GAAGY]x | B. mori silk | E. coli | 26,61
    [(GAGAGS)n-(GAAVTGRGDSPASAAGY)m]x | B. mori silk/fibronectin | E. coli | 27,61
    [(GAGAGS)n-(GAAPGAS-IKVAVSAGPSAGY)]x | B. mori silk/laminin | E. coli | 61
    [GGAGSGYGGGYGHGYGSDGG-(GAGAGS)3]2,4,6 | S. c. ricini/B. mori silks | E. coli | 28
    [(GXG)n-(A)m]x | N. clavipes dragline silk | E. coli, B. subtilis, P. pastoris | 29–33
    [(GPGGYGPGQQ)n-(A)m]x | N. clavipes dragline silk | E. coli, B. subtilis, P. pastoris | 29–33
    [GLGGQGGGAGQGGYG]x | N. clavipes dragline silk | E. coli | 35
    [SGRGGLGGQGAGA10-GGAGQGGYGGLGSQGT]x | N. clavipes dragline silk | E. coli | 34
    [SGRGGLGGQGAGA5-GGAGQYGGLGSQG]x | N. clavipes dragline silk | E. coli | 41–43
    [SGRGGLGGQGAGMA5M-GGAGQYGGLGSQG]x | N. clavipes dragline silk | E. coli | 41–43
    [SGRGYSLGGQGAGA5-GGAGQYGGLGSQG]x | N. clavipes dragline silk | E. coli | 41,44
    [SGPGGYGPGQQT]x | N. clavipes dragline silk | E. coli | 34
    [(GPGGSGPGGY)2-GPGGK]11 | N. clavipes flagelliform silk | E. coli | 46
    [{(AEAEAKAK)2AG(GPGQQ)6GS}9-(AEAEAKAK)2AG(GPGQQ)] | N. clavipes flagelliform silk | E. coli | 47
    [A18TS(GVGAGYGAGAGYGV-GAGYGAGVGYGAGAGY)TS]4 | S. cynthia ricini/B. mori silk | E. coli | 45
    Partial cDNA construct | N. clavipes dragline silk | E. coli, mammalian, transgenic plants | 25,36–38,40
    Partial cDNA construct | A. diadematus silk | Mammalian, transgenic animals | 38,39

    Collagen, cDNA constructs
    proα1 | Human type I, III collagen | P. pastoris, S. frugiperda | 48,50,55
    proα1 and proα2 | Human type I collagen | P. pastoris | 49
    proα1 | Human type I collagen | H. polymorpha | 54
    proα2 | Human type I collagen | Transgenic mouse | 51
    proα1 | Human type I collagen | Transgenic mouse | 52

    Collagen-like proteins
    (GESGREGAPGAEGSP-GRDGSPGAKGDRGET)6 | Human type I collagen | B. brevis | 57
    (GAPGAPGAPGPVGPA-GKSGDRGETGPAGPP)8,10 | Human type I collagen | B. brevis | 57
    Partial cDNA | Mouse type I, rat type III collagens | P. pastoris | 53
    GPE[{(GPQ)(GPE)4}2]x | Bovine collagen | S. cerevisiae | 56
    GXP-GXQ-rich random polymers | Collagen-like | P. pastoris | 58
    (GPP)32 | Collagen-like | E. coli | 59,60
    [GAP(GPP)4]x | Collagen-like | E. coli | 61
    [(GAP(GPP)4)2GLPGPKGDRGD-AGPKGADGSPGPAGPA-GPVGSP]n | Collagen-like, cell binding | E. coli | 61
    (GAPGAPGSQGAPGLQ)52 | Collagen-like | E. coli | 61–63

    Elastin-like proteins
    (VPGXG)x | Mammalian elastin | E. coli, yeast, transgenic plants | 61,64–74
    [(VPGVG)n(VPGXG)m]x | Mammalian elastin | E. coli | 67,73,80,81
    [(VPGVG)n(VPGAG)m(VPGGG)p]x (b) | Mammalian elastin | E. coli | 19,68,76
    [GVGVP GVG(F/V)PGXG(F/V)P-(GVGVP)3]n | Mammalian elastin | E. coli | 75
    [GVGVP GVG(F/V)PGXG(F/V)P-GVGVP GVGFPG(V/F)GFP]n | Mammalian elastin | E. coli | 75
    [GVGVP GVG(F/K)PGEGFP-GVGVP GVG(F/V)PG(F/K)GVP]n | Mammalian elastin | E. coli | 75
    [GVGIP GFGEP GEGFPGVGVP-GFGFP GFGIP]n | Mammalian elastin | E. coli | 75
    (GVGIP)x, (AVGVP)x | Mammalian elastin | E. coli | 75
    AB (c) | Mammalian elastin | E. coli | 77–79
    BAB (c) | Mammalian elastin | E. coli | 77–79
    (GVG(V/I)P)10GVGVPGRGDSP-(GVG(V/I)P)10 | Elastin/cell-binding domain | E. coli | 75
    [LD-CS5 (d)-(GVPGI)x]y | Fibronectin/elastin | E. coli | 66
    [LD-CS5-G(VPGIG)20VP]5 | Fibronectin/elastin | E. coli | 88
    RKTMG[LD-CS5-G(VPGIG)20,25VP]5,3,1-LEKAAKLE | Fibronectin/elastin | E. coli | 88
    [LD-CS5-G((VPGIG)2VPGKG(VPGIG)2)4-VP]3 | Fibronectin/elastin | E. coli | 89

    Silk-elastin-like proteins
    [(GVGVP)n(GAGAGS)m]x | Elastin/B. mori silk | E. coli | 61,82,87
    [(GVGVP)n(GAGAGS)m-(GAAVTGRGDSPASAAGY)-(GAGAGS)p]x | Elastin/B. mori silk/fibronectin | E. coli | 61,82
    [(GVGVP)n(GEGVP)-(GVGVP)m(GAGAGS)p]x | Elastin/B. mori silk | E. coli | 83,84
    [(GAGAGS)n(GVGVP)m-(GKGVP)(GVGVP)p]x | B. mori silk/elastin | E. coli | 82,85–87

    Artificial proteins
    [(AG)nPEG]x | Random coil | E. coli | 100
    [(AG)nEG]x | Silk mimetic sequence | E. coli | 90
    [(GA)3GX]x | Silk mimetic sequence | E. coli | 91,98,99
    ED(E17D)xEE, benzylated | Helical | E. coli | 92–94
    Helix-[(AG)3PEG]10-Helix (e) | Leucine zipper | E. coli | 95
    Helix-[(AG)3PEG]10 | Leucine zipper | E. coli | 95
    Helix | Leucine zipper | E. coli | 95
    [(AAAQ)x(AAAE)(AAAQ)x]y | Helical | E. coli | 96
    [GKGSAQA]x | Random coil | E. coli | 97
    [AKPSYPPTYK]x | Mussel adhesive mimetic | E. coli | 20
    [LSVQTSAPLTVSDGK]16 | Viral spike protein | E. coli | 21
    [SGLDFDNNALRIKLG]26 | Viral spike protein | E. coli | 21
    [QLSLRVSEPLDTSHGV]64 | Viral spike protein | E. coli | 21

    (a) One-letter abbreviations of the amino acids are employed except as noted in the table. The letter X indicates positions in which several different amino acids have been encoded. Only general consensus repeats are given; minor linker regions, fusion tags, random sequences of amino acids, and minor substitutions in isolated repeats are deleted for simplification.
    (b) The A and G residues are dispersed randomly throughout the sequence to reduce gene repetition.
    (c) A blocks: [VPGEG(IPGAG)4]14, [(APGGVPGGAPGG)2]x, [VPGVG(IPGVGVPGVG)2]19, [VPGEG(VPGVG)4]30, [VPGEG(VPGVG)4]48, [(VPGMG)5]x; B blocks: [VPAVG(IPAVG)4]16, [VPGFG(IPGVG)4]14.
    (d) CS5 = GEEIQIGHIPREDVDYHLYP; CS5 is a cell-binding domain from fibronectin.
    (e) Helix = SGDLENEVAQLEREVRSLEDEAAELEQKVSRLKNEIEDLKAE.

    The instability of long cDNA sequences has limited their lengths to less than2.5 kb. Recently, a 1.5-kb partial cDNA fragment of N. clavipes dragline silk hasbeen cloned and the 43-kDa protein has been expressed fromE. coli (25). However,most work has focused on the bacterial expression of silk-like proteins from syn-thetic genes, as mentioned above. Earliest reports of this approach date back tothe work of Cappello and co-workers in 1990 (26). In these initial studies, a varietyof different silk-like protein sequences with the repetitive sequence (GAGAGS)nwere constructed, (where G is glycine, A is alanine, and S is serine) on the basisof the B. mori silk broin. The 108-bp monomer, anked by Ban I restriction siteswhich permit only head-to-tail ligation of monomers, was multimerized enzymat-ically, and the multimers were fractionated and ligated into an expression systemcontrolled by a strong, temperature-sensitive promoter that permits induction ofprotein expression by simply raising temperature to above 40C. The (GAGAGS)nsequence proved intractable because of the formation of very stable and insoluble-sheet structures, and so more complex sequences with improved solubility havebeen produced via introduction of noncrystalline regions dispersed between the-sheet crystalline regions. For example, articial proteins designed to containboth the crystalline segments of silk broin and the cell adhesion domain of -bronectin have also been synthesized by Cappello and co-workers, and have beenshown via wide-angle X-ray scattering and molecular simulations to crystallizeaccording to the models for Silk I (27). More recently, a sequence that alternatesthe (GAGAGS)4 repeat from B. mori with a (VPGVG)8 repeat from elastin hasbeen produced, and the protein maintains excellent materials properties based onthe combination of crystallinity and elasticity imparted by the two repetitive do-mains of the 832-amino acid articial protein (105). 
Copolymers of the crystalline region of B. mori silk fibroins with the unordered glycine-rich region of Samia cynthia ricini have also been produced to improve the tractability of the B. mori crystalline domain (28).
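
    As a rough illustration of the head-to-tail multimerization described above, the arithmetic that links monomer size to the number of encoded repeats can be sketched in Python. The sketch assumes that the 108-bp monomer encodes only (GAGAGS) hexapeptides, with no linker or restriction-site codons, which the text does not state; the function names are hypothetical and serve only this example.

```python
MOTIF = "GAGAGS"   # hexapeptide repeat of B. mori fibroin (from the text)
MONOMER_BP = 108   # monomer length quoted in the text


def repeats_per_monomer(monomer_bp: int = MONOMER_BP, motif: str = MOTIF) -> int:
    """Motif copies encoded by one monomer (assumption: no linker codons)."""
    codons, leftover = divmod(monomer_bp, 3)
    if leftover:
        raise ValueError("monomer length must be a whole number of codons")
    copies, partial = divmod(codons, len(motif))
    if partial:
        raise ValueError("monomer does not encode a whole number of motifs")
    return copies


def multimer(n_monomers: int) -> tuple:
    """Return (gene length in bp, encoded sequence) for n head-to-tail monomers."""
    if n_monomers < 1:
        raise ValueError("need at least one monomer")
    gene_bp = n_monomers * MONOMER_BP
    protein = MOTIF * (repeats_per_monomer() * n_monomers)
    return gene_bp, protein


if __name__ == "__main__":
    # Enzymatic multimerization gives a mixture of lengths; list a few of them.
    for n in (1, 2, 4, 8):
        bp, protein = multimer(n)
        print(f"{n} monomer(s): {bp} bp, {len(protein)} residues")
```

    Under these assumptions each monomer contributes six hexapeptide repeats, so the fractionation step mentioned above selects among genes whose lengths differ by multiples of 108 bp.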

    Similar success has been observed for the expression of repetitive artificial proteins based on spider silk amino acid sequences. Genes for partial amino acid sequences from N. clavipes have been generated, and repetitive genes of these sequences have been produced via recursive multimerization strategies. Proteins comprising 8 or 16 repeats of a poly(alanine) β-sheet hard segment with either interspersed (GXG) variants or (GPGGY) and (GPGQQ) alternating pentamers as the soft segment have been expressed in both E. coli and Bacillus subtilis by Fahnestock and co-workers (29–31). When expressed in E. coli under control of the strong promoter of the bacteriophage T7, approximately 10–30% of the soluble cell protein is the expression target sequence, although some truncated products are formed in these expression systems. The proteins are easily purified via chromatographic methods or via solubility differences between the artificial repetitive protein and other cellular proteins (31,32). When expressed in the yeast Pichia pastoris, from genes employing the codon preferences for Pichia (33), truncated products are eliminated and protein yields are increased, with silk-like proteins comprising 10% of the total protein (yields of 1 g/L). A chromosomal integration strategy is used to establish the genes in Pichia by homologous recombination, and expression is controlled by the strong promoter AOX1, which is induced by addition of small amounts of methanol. Fusing the target protein to secretion signals and prosequences of Saccharomyces cerevisiae permits efficient secretion of the target protein from the cell and into the extracellular medium, although at much lower yields. The advantages of this system include the fact that P. pastoris is a useful host for large-scale fermentation, and that extracellular localization of an expressed protein target has many potential advantages for low cost production (simpler recovery, purification, and the potential for expression from immobilized cells).
Circular dichroism (CD) characterization of these proteins in dilute aqueous solution indicates that the protein polymer adopts a random coil conformation, with increasing β-sheet content upon drying, consistent with the reported behavior of natural silk proteins upon spinning into fibers.

    The production of other synthetic genes based on the amino acid sequences of dragline silk from N. clavipes has also been achieved by Kaplan and co-workers. From the partial cDNA sequences of this silk, two consensus repeats (NCMAG1 and NCMAG2, abbreviated for N. clavipes major ampullate gland) have been identified: (GGAGQGGYGGLGSQGAGRGGLGGQGAG), followed by a polyalanine region, and (GPGGYGPGQQGPGGYAPGQQPSGPGS), also followed by a polyalanine region. Repetitive proteins with sequences based on NCMAG1 and NCMAG2 have been produced from E. coli via expression of artificial repetitive genes from a pQE9 expression vector under control of a bacteriophage T5 promoter (34); these proteins contain a hexahistidine fusion to facilitate purification via metal chelate affinity chromatography. Yields of approximately 15 mg/L have been observed, although the yield of the silk-like repetitive proteins decreases with increasing length. Characterization via CD confirms the expected presence of β-sheet structure in these proteins. Additional silk-like proteins, also modeled after the dragline silk of N. clavipes, have been produced in E. coli by other research groups (35,106); in general, these proteins have molecular weights ranging from 10 to 160 kDa (molecular weights of natural spider silk proteins can reach 740 kDa), with yields of purified protein ranging from 2 to 20 mg/L.

    In addition to expression from bacterial and yeast expression hosts, silk proteins have also been expressed from transgenic hosts, motivated in part by the low expression yields obtained in microorganisms. Tobacco and potato plants have been successfully utilized as transgenic hosts for producing silk-like proteins based on dragline silk (36,37). In these investigations, at least 2% of the total soluble protein in the endoplasmic reticulum of the tobacco and potato leaves and potato tubers is reported to be silk. In addition, partial cDNA clones of the silk protein from Araneus diadematus and from N. clavipes have been expressed in mammalian cells [baby hamster kidney (BHK) cells] to yield several grams of protein that is recovered in soluble form by ammonium sulfate precipitation. The 25–50-mg/L yields provide an ultimate yield of more than 12 g of material purified from conditioned culture media (38). For production of larger amounts of these recombinant silk-like proteins, expression from goat milk may prove a viable alternative, and is currently under development; transgenic goats that express silk proteins in their milk have been produced by Nexia Biotechnologies, Inc., and methods to purify these proteins have been developed (39,107–109).

    Additional areas of active research have been to mimic the fiber spinning conditions of the spider and to control β-sheet assembly in order to produce artificial silk protein materials that have mechanical properties similar to those of the natural silk protein fibers. In general, the spinning of recombinant, silk-like artificial proteins requires dissolution in harsh solvents such as hexafluoroisopropanol, formic acid, or 9 M lithium bromide. For example, fibers have been spun from solutions of these proteins in hexafluoroisopropanol with quenching in methanol and wet drawing. Such fibers exhibit the properties [tenacities of 2 g/denier (gpd) and 8–10% elongation] of traditional textile fibers, but do not match the properties of natural dragline silk (110). Silk proteins produced from bacterial sources have also been spun into fibers from more benign aqueous processing conditions that better mimic those of the spider. In 2002, Arcidiacono and co-workers reported the aqueous solution spinning of silk sequences derived from N. clavipes dragline silk and expressed from E. coli (40). The proteins are produced via expression of the C-terminal cDNA of N. clavipes dragline (42.9 kDa), or via expression of the consensus sequences NCMAG1 and NCMAG2 of N. clavipes (55 kDa). The key to the successful spinning of these proteins from aqueous solution has been to eliminate the drying of the protein at any stage of purification and processing, by increasing protein concentration via removal of water during purification, with processing in dilute denaturing buffer (160 mM–1 M urea). The content of β-sheet structure in the fibers increases with time, as expected, and fiber diameters range from 10 to 60 μm, depending on the weight percentage of the spin dope and the identity of the protein. Fibers are insoluble in water and birefringent, indicating the formation of oriented fibers, although no mechanical property measurements have been reported. The A. diadematus-derived protein produced in mammalian cells (38) has also been processed in a similar manner: first concentrated into an aqueous spin dope exceeding 23% (w/v) protein, coagulated in methanol/water, and exposed to wet draw conditions. The fibers isolated via these methods exhibit a tenacity of 2.26 gpd and 55–130% extensibility, which yield a toughness equivalent to that of silk, although the properties do not exactly mimic natural silk fibers, which have a tenacity of 7–11 gpd and an extensibility of 30% (vide supra).

    The difficulty in processing silk proteins is largely a result of the spontaneous assembly into β-sheets of the poly(alanine) sequences of dragline silk, especially under conditions of shear. In an effort to understand the architectural variables that will permit purposeful control of β-sheet formation and hierarchical assembly during processing, protein engineering methods have been employed toward the design and synthesis of silk protein sequences in which amino acids and amino acid sequences are included for control of β-sheet formation. One approach to controlling protein properties, demonstrated by Kaplan and co-workers, has been to incorporate sequences that can trigger β-sheet assembly. In one design, methionine residues have been included to flank the (AAAAA) units of the NCMAG1-based repetitive proteins (41–43). Under reducing conditions, these polymers readily form β-sheet structures as assessed by CD and Fourier transform infrared (FTIR) spectroscopies; β-sheet assembly is disrupted upon oxidation of methionine to the larger and more hydrophilic sulfoxide. These researchers have also introduced sites of enzymatic phosphorylation that flank the alanine-rich sequences, in order to alter the size and charge of serine residues and control β-sheet formation (41,44). When the serine hydroxyl group in the sequence RGYSLG is phosphorylated by cAMP protein kinase, β-sheet formation is prevented; upon dephosphorylation by alkaline phosphatase, the ability to form β-sheets is recovered. In other investigations aimed at controlling solubility and processability of silk-like proteins, Asakura and co-workers (45) have designed a chimeric sequence that combines the crystalline polyalanine sequence (Ala)18 of Samia cynthia ricini silk with a noncrystalline (GX)-rich sequence (GVGAGYGAGAGYGVGAGYGAGVGYGAGAGY) found in the silk fibroin of B. mori.
The introduction of the glycine-rich noncrystalline sequence in the protein results in improved solubility of this silk-like protein and prevents β-sheet formation in the polyalanine region of the protein, as assessed via solid-state 13C CP/MAS NMR studies.

    The materials properties of other members of the spider silk family are also becoming of increasing research interest. Sequences based on the flagelliform silk protein, which produces the unique elastomeric properties of the capture spiral of spider webs, have been recently studied by Conticello and co-workers (46). A protein containing 11 repeats of the flagelliform-derived amino acid sequence {(GPGGSGPGGY)2GPGGK} can be expressed in E. coli, and has been characterized by a combination of techniques including CD, FTIR, and NMR, which indicate that these molecules adopt a β-turn conformation. These results suggest that the elastomeric properties of the flagelliform sequence may have similar origins as those found in other β-turn-forming proteins such as elastin. The mechanical properties of protein polymers that contain flagelliform-like amino acid sequences can also be controlled by the inclusion of peptide motifs that are known to form β-sheet crystallites but that are not naturally found in silk proteins.

    Conticello and co-workers (47) have also produced alternating block copolymers containing a flagelliform-like sequence (GPGQQ)6, which is derived from the A. diadematus dragline silk fibroin, and the amphiphilic sequence (AEAEAKAK)2. Although the (AEAEAKAK) sequence is not a silk-derived sequence, it has been shown to form very stable β-sheet structures in aqueous solution (111). Proteins containing these sequences assemble into β-sheet linked membranes from concentrated aqueous solution, and membrane formation is responsive to conditions that alter the conformation of the (AEAEAKAK) units in the polymer (47).

    The use of silk-like protein polymers in disparate applications such as textiles, medicine, and cosmetics will be possible, given their demonstrated synthesis and favorable properties. One difficulty to be overcome in the synthesis of artificial silk-like proteins from bacterial hosts is the generally low expression yields, which can be as low as 1–10 mg/L. Yeast systems offer advantages in this regard, with potential yields of 300–1000 mg/L possible, although difficulties in purifying the protein from these expression hosts have been encountered. While the mechanical properties of spider silk are a desirable goal, it has remained difficult to reproduce the properties in the laboratory environment, although continued progress is being made in this direction, owing to increased understanding of the processing of these proteins and the ability to control the solubility of the proteins via protein engineering strategies. Improved understanding of the impact of synthesis and processing on the resulting mechanical properties of silk-like materials will increase the number of applications in which these polymers will find use. Coupled with the fact that silk-like proteins can also be useful in supporting cell growth and tissue function, the proteins will find use in both materials science and medical applications, such as in fibers, threads, membranes, and scaffolds. The genetic tailoring of their sequence and the ability to include repeats of other structural proteins such as collagen or elastin (vide infra) have also resulted in a variety of silk-based proteins that can be useful in varied materials applications, including fiber reinforcement, controlled release, and scaffolds for tissue engineering, and will continue to expand the number of uses in which these polymers will find application.

    Collagen-Like Protein Polymers

    Collagen (qv) is another important fibrous structural protein that has been produced by protein engineering methods for applications in materials science and biology. The collagens comprise a highly abundant family of extracellular matrix proteins that are found in all connective tissues. They represent 30% of the total body proteins in mammals and are found in tendons, ligaments, cartilage, bone, and skin. Although there are at least 20 genetically distinct types of collagen (the most abundant in mammals being types I–III), all are composed of three helical chains that have the general amino acid sequence Gly-X-Y, in which X is primarily proline, and Y is primarily hydroxyproline. The monomeric helical sequence first assembles into a triple helix with dimensions of 300 nm in length and 1.5 nm in diameter, which then forms larger-scale hierarchical structures that ultimately assemble into high strength elastic fibers that form connective tissue (112). In addition to its structural role, collagen is also involved in a variety of important biological events such as early development, cell attachment and proliferation, wound healing, and tissue remodeling. This combination of mechanical strength, higher order assembly, and biological activity of collagen has made it a broadly used protein-based biomaterial. Collagen I, in particular, is the most abundant structural protein in animals, and has been widely exploited for medical use, cosmetics, therapeutics, and also as its denatured form (gelatin) in food and photographic emulsions. Although collagen is available in large quantities from mammalian sources, extraction from animal tissues poses the risk of contamination by viral or infectious agents, which has become of increasing concern in recent years. Owing to this potential risk, the simplicity of the repetitive sequence, the wide biological and materials uses of collagen, and the ability to tailor collagen amino acid sequences, protein engineering strategies have become more prevalent for the production of recombinant collagen.

    In contrast to the production of silk-like proteins, recombinant production of collagen-like proteins has focused primarily on the production of cDNA sequences isolated from natural sources. Fibrillar collagens have been produced in mammalian cells (113,114), insect cells (48,49), yeast (51,115–118), and transgenic animals (51,52,118) and plants (119,120). Yields generally range from 15 to 500 mg/L, although a triple-helical, 160-kDa human type I collagen has been produced in mouse milk at 18-mg/mL yields (52), and a 21-kDa fragment of mouse collagen I (from cDNA fragments) can be produced from P. pastoris at yields of 14.8 g/L of clarified broth (53).

    To obtain correctly assembled collagen fibrils, most organisms employ extensive post-translational modification of procollagens, and the enzymes that mediate these modifications have been incorporated into expression systems for collagen production. One of the most important considerations is the hydroxylation of proline residues to 4-hydroxyproline via the action of the tetrameric (α2β2) enzyme prolyl 4-hydroxylase. In almost all collagens (with the exception of some marine invertebrate collagens), hydroxylation of the proline is required for correct fibril formation and to maintain the melting temperature of the collagen at physiologically relevant levels (approximately 40°C). A lack of hydroxylation reduces (and can, under physiological conditions, eliminate) the propagation of banded fibril formation, lowers the melting transition temperature, and can render the collagen useless for biological applications (120). Most bacterial and yeast expression hosts do not contain the genes for prolyl 4-hydroxylase, and some insect and mammalian expression hosts do not generate sufficient amounts of this enzyme. Coexpression of collagen with the subunits of prolyl 4-hydroxylase has therefore been used to permit recombinant production of functional collagen. Another consideration in collagen expression is that the protein is generally expressed as procollagen, which contains N- and C-terminal peptides to impart solubility. The genes for the proteolytic enzymes N-proteinase and C-proteinase are also often included in expression hosts to allow processing of the procollagen, via removal of the N- and C-terminal peptides, into the assembling collagen form. It has been recently reported, however, that human type I collagen proteins lacking the propeptides can be produced from S. cerevisiae expression hosts that lack the genes for the prolyl 4-hydroxylase (117) and can still correctly fold.
Generally, in expression systems in which the proteinases are not included, procollagen can be cleaved with pepsin post-translationally to liberate the N- and C-terminal propeptides and permit fibrillar assembly of collagen.

    Vuorela and co-workers (50) have demonstrated that coexpression of subunits of human prolyl 4-hydroxylase with the proα1(III) chains of human type III procollagen in P. pastoris yeast expression systems permits production of approximately 15 mg/L of properly folded homotrimeric type III procollagen that is resistant to pepsin digestion. The yeast expression hosts also demonstrate an expected increased level of prolyl 4-hydroxylase activity, and can be used in the large-scale production of different types of recombinant collagen. Similar strategies have also been employed for the production of high levels of human type I collagen in P. pastoris (121). In these investigations, the genes for the proα1 and proα2 chains of type I procollagen are expressed, along with the genes for both the α and β subunits of prolyl 4-hydroxylase. The correct assembly of proα1 and proα2 chains is observed (2:1 ratio), and high expression yields of 500 mg/L are obtained. In contrast to these other yeast systems, properly hydroxylated, 28-kDa mouse α1(I) collagen fragments can be expressed from the yeast Hansenula polymorpha without coexpression of prolyl 4-hydroxylase (54). Successful expression of the procollagen chains from insect cells has also been observed. Proα1 chains of human type III collagen have been expressed from baculovirus vectors in S. frugiperda Sf9 or High Five cells (48), as have wild-type and modified proα chains of human type I collagen (55). With coexpression of human prolyl 4-hydroxylase, correctly assembling procollagen molecules can be produced at yields of 10–20 mg/L.

    Mammary gland expression has also been explored as a route to produce procollagens, although this has been more limited in scope and use to date. The general method involves the expression of genomic or cDNA transgene sequences in conjunction with mammary-gland-specific promoters, which drive the expression of the proteins in milk. It has been achieved for procollagen via two different approaches. In one strategy, the cDNA encoding a shortened version of recombinant proα2 procollagen chains has been integrated into the mouse genome, and the procollagen has been coexpressed with the α and β subunits of prolyl 4-hydroxylase. The expression of a thermally stable triple helical artificial α2(I) homotrimer at 50–200 mg/L is possible in this system (51). In another instance, transgenic mice were generated that contained the αS1-casein mammary-gland-specific promoter linked to a 37-kb segment of the human α1(I) procollagen structural gene. Very high levels of triple helical procollagen are produced via this system (8 mg/mL) (52). An advantage for transgenic expression of protein polymers in milk is that the expression levels of proteins can be increased via the use of a species with a higher volume production of milk (eg, mouse to rabbit to pig or goat).

    Although recombinant natural collagen sequences have been of significant academic and commercial interest (115,122,123), the production of collagen-based proteins with additional functions has also been explored. For example, in an effort to design advanced biomaterials, Fertala and co-workers (124) have identified specific regions in collagen that are responsible for particular biological activity. These researchers have developed a cDNA cassette expression system (125) that allows systematic removal of specific amino acid sequences in collagen II. By removal of certain D blocks in collagen II, followed by expression of the modified protein in mammalian cells (HT-1080 cells), purification, and assessment of human chondrocyte (cartilage cell) spreading and migration, the amino acid region 704–938 has been identified as critical for the spreading of chondrocytes (124). The studies demonstrate that collagen II domains differ in their ability to support attachment and migration of chondrocytes, and suggest that advanced collagen-based materials could be designed to present multiple copies of the critical domains, permitting support of a greater number of cells and potential improvements in tissue regeneration. In other efforts to produce collagen-based materials with improved biological activity, Hayashi and co-workers have produced a collagen III–epidermal growth factor (EGF) chimeric protein from Sf9 insect cells (126). The proteins maintain the fibril-forming properties of the collagen domain, and films of the protein also improve cell growth via the action of the EGF; the chimeric protein may therefore be useful for cell culturing, wound healing, and tissue engineering applications. Together, these studies point to the utility of protein engineering strategies to identify and to produce materials containing multifunctional domains that impart desired biological and materials properties.

    In addition to protein engineering investigations that have been based on natural collagen sequences, there is also significant interest in creating artificial collagen-like sequences with novel chemical and physical properties. Accordingly, genetically directed methods have been employed to produce artificial protein polymers containing the collagen-based GXY repeat, with substitutions in the X and Y positions to impart desired chemical functionality. The proteins have been expressed in yeast and bacterial systems. For expression from yeast (Saccharomyces cerevisiae) and certain bacteria (Bacillus brevis), fusion of the α-mating factor secretion signal permits secretion of the collagen-like proteins (CLPs) into the medium (53,56,57), which simplifies purification and points to the feasibility of commercial scale production of the proteins. Kajino and co-workers (57) have designed DNA sequences encoding human α1 collagen repeats (GESGREGAPGAEGSPGRDGSPGAKGDRGET)6 and (GAPGAPGAPGPVGPAGKSGDRGETGPAGPP)8,10 according to the codon usage of B. brevis. The proteins are expressed in B. brevis at estimated yields of 500 mg/L, and they exhibit reversible sol-gel behavior similar to that of native gelatin. Nonhydroxylated gelatins, with sequences based on partial cDNA sequences from mouse type I and rat type III collagen, have been produced in higher yields of 14.8 g/L from P. pastoris, using the S. cerevisiae α-mating factor (53). Completely synthetic, 36.8-kDa collagen sequences that are rich in GXP and GXQ sequences can also be produced, without degradation, in P. pastoris at high yields of approximately 3–6 g/L (58).

    Expression from E. coli has not been as successful as the yeast and B. brevis expression systems. In early attempts to express CLPs from bacterial expression systems (59,60), a 22-kDa CLP with the sequence (Gly-Pro-Pro)32 was produced in E. coli under control of a thermally inducible promoter. The resulting protein degraded in this cellular host, although use of an appropriate mutant host minimized this problem. Reducing the proline content of the collagen-like proteins appears to minimize stability problems, as less proline-rich proteins with sequences such as {GAP(GPP)4}x and {{GAP(GPP)4}2GPAGPVGSP}x have been successfully produced from E. coli by Ferrari and Cappello (61). Recently, the sequence (GAPGAPGSQGAPGLQ)52, which has an even lower proline content than previously produced CLPs, has been expressed in E. coli; the protein can be expressed in soluble form, at 100–200-mg/L yields, in low cell density, small-scale experiments (62,63).
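
    The trend in proline content described above can be checked directly from the repeat units quoted in the text. The following Python sketch is illustrative only: it computes the proline fraction of one repeat unit of each CLP and confirms that glycine occupies every third position (the GXY pattern). The dictionary keys and function names are hypothetical labels chosen for this example, and one-letter codes are assumed for (Gly-Pro-Pro).

```python
# Repeat units quoted in the text, written out in one-letter code.
CLP_UNITS = {
    "(GPP)": "GPP",
    "GAP(GPP)4": "GAP" + "GPP" * 4,
    "{GAP(GPP)4}2GPAGPVGSP": ("GAP" + "GPP" * 4) * 2 + "GPAGPVGSP",
    "GAPGAPGSQGAPGLQ": "GAPGAPGSQGAPGLQ",
}


def proline_fraction(unit: str) -> float:
    """Fraction of residues in the repeat unit that are proline."""
    return unit.count("P") / len(unit)


def is_gxy(unit: str) -> bool:
    """True if the unit is a whole number of triplets, each starting with Gly."""
    return len(unit) % 3 == 0 and all(
        unit[i] == "G" for i in range(0, len(unit), 3)
    )


if __name__ == "__main__":
    for label, unit in CLP_UNITS.items():
        print(f"{label:24s} Pro = {proline_fraction(unit):.1%}  GXY = {is_gxy(unit)}")
```

    On these sequences the proline fraction falls from two-thirds for (Gly-Pro-Pro) to one-fifth for (GAPGAPGSQGAPGLQ), in line with the reduced-proline designs discussed above; the calculation says nothing about stability itself.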

    The tensile strengths and availability of type I collagen render it the most widely used natural polymer in the medical field, with uses in tissue engineering, delivery systems, and cosmetic surgery; the denatured form of collagen (gelatin) finds widespread use as a food additive and photographic emulsion. In addition to minimizing risks associated with isolation of collagen from animal sources, genetically directed strategies for the production of recombinant collagen-like protein polymers are also appealing as a result of the potential for CLPs in the above applications. The progress toward expression of collagen-like proteins from bacterial and other expression systems may permit the rational design of artificial CLPs with specialized materials uses; for example, understanding the individual roles of collagen domains in eliciting biological responses may result in the design of CLP-based scaffolds for directed interactions with cells. The ability to produce designer collagen and gelatin polymers via genetically directed methods may therefore drive the growth of these strategies as a large-scale source of recombinant human collagens and artificial collagen-like polymers for a variety of additional applications in nanotechnology, materials science, and medicine.

    Elastin-Like Protein Polymers

    Elastin is another protein found in connective tissues, and provides strength, flexibility, and elasticity to organs and tissue. When expressed in tissue, elastin is produced as the 750–800 amino acid protein tropoelastin, which then assembles into microfibrils, is modified via hydroxylation of proline, and is cross-linked via lysyl oxidase oxidation of lysine-rich domains (127). This assembly and insolubilization of tropoelastin results in the mature elastin network, which confers elastic and biological functionality to tissue. Elastin is composed of large, mobile, highly hydrated hydrophobic domains and smaller alanine-rich regions. The hydrophobic domains of elastin contain proline, alanine, valine, leucine, isoleucine, and glycine, with valine and glycine being most abundant. The sequence of the hydrophobic domain is highly repetitive, containing repeats of the general sequences GX, PX, GGX, or PGX (X = G, A, V, L, or I); the common repeat in mammalian elastin is VPGVG, with up to 11 consecutive pentapeptide repeats (127,128). In contrast to silk and collagen, elastin, even at high molecular weights, is soluble in aqueous solution, which, along with the highly repetitive nature of the hydrophobic regions of this functional protein, has fueled an enormous amount of interest in the use of elastin-based repeats in genetically engineered protein polymers. Whereas protein engineering has been employed in the production of tropoelastins for studies of biological activity, essentially all materials investigations have focused on the biosynthesis of elastin-like proteins (ELPs) comprising repetitions of the VPGVG motif.

    One unique feature of the VPGVG sequence is that, although it is generally soluble, it exhibits a lower critical solution temperature (LCST), which causes aqueous solutions of the protein to separate into solvent-rich and polymer-rich phases upon heating. This transition is analogous to that of the well-known poly(N-isopropylacrylamide) polymers, in which an entropy-driven dehydration of the polymer chain results in phase separation above a critical temperature. The ability of elastin-like polymers to undergo an LCST transition is conserved as long as the glycine and valine residues are present. The LCST transition is observed even with substitution, in the repeat sequence VPGXG, of any amino acid X (except proline); indeed, choice of the amino acid residue X can be used to control the LCST of the protein. This unique behavior of elastin confers mechanical properties that make it a versatile platform for the engineering of protein-based materials for a variety of applications such as biomaterials, drug delivery, responsive silica membranes, and tissue engineering.

    Because of this versatility, both chemical and protein engineering methods have been used widely for the synthesis of elastin-based protein materials based on the VPGXG sequence. A single repeat of the VPGVG unit is sufficient to observe the random coil to β-turn transition (129), which has made chemical strategies accessible for the synthesis of elastin-based peptide materials. However, limitations of solid-phase peptide synthesis plague these materials at higher molecular weights, since the presence of hard-to-purify deletion products significantly alters the properties of the materials (64). Protein engineering methods have therefore been used widely for synthesis, with relatively high yields (generally 50–100 mg/L, and up to 800 mg/L) facilitating the characterization and use of these materials for commercial applications. Elastin-like proteins have been produced in E. coli (64–68), fungi (69), chloroplasts (70), and plants (71), although most materials investigations with elastin-like proteins utilize bacterial expression.

    The broad range of interest in elastin-like protein polymers has fueled the development of new genetic strategies for the assembly of the genes encoding the elastin repeats (70,74). Seamless cloning methods (72) permit the assembly of high molecular weight genes, in a controlled head-to-tail fashion, directly into the expression vector. Clones of up to 5000 bp (200 kDa) have been prepared via these strategies, with yields commonly on the order of 200–800 mg/L. Because the multimerization is random in nature, seamless cloning yields a library of genes of different lengths, and the exact length of the genes cannot be controlled purposefully. Recursive strategies have therefore also been applied for the synthesis of repetitive artificial proteins with specific and predetermined chain lengths (68) to enable control of elastin-like protein transition temperature via purposeful control of ELP molecular weight and block architecture. By sequential cloning steps, gene products carrying a specified number of repeats can be produced. A variety of (VPGXG)n ELPs have been produced via these strategies, with molecular weights up to 130 kDa (330 pentapeptide repeats).
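
    To make the idea of reaching a predetermined chain length by sequential cloning steps concrete, the following Python sketch plans one possible sequence of steps. It is a simplified model and not the published protocol: it assumes, hypothetically, that each cloning round can either join a gene to a copy of itself ("double") or append one starting unit ("add_one"), and it ignores all restriction-site and vector details. The function names are illustrative.

```python
def plan_recursive_assembly(target_units: int) -> list:
    """Return an ordered list of steps that grows 1 unit into target_units."""
    if target_units < 1:
        raise ValueError("target must be at least one unit")
    steps = []
    n = target_units
    # Work backwards from the target: halve when even, remove one unit when odd.
    while n > 1:
        if n % 2 == 0:
            steps.append("double")
            n //= 2
        else:
            steps.append("add_one")
            n -= 1
    steps.reverse()
    return steps


def replay(steps: list) -> int:
    """Apply the planned steps to a single unit and return the final unit count."""
    n = 1
    for step in steps:
        n = n * 2 if step == "double" else n + 1
    return n


if __name__ == "__main__":
    for target in (8, 16, 33):
        steps = plan_recursive_assembly(target)
        print(target, len(steps), steps)
```

    In this toy model the number of rounds grows roughly with the logarithm of the target length, which illustrates why recursive strategies can reach long, exactly defined genes in few steps, whereas random multimerization gives only a distribution of lengths.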

    The elastin-like proteins produced by genetic methods have proven useful for a number of applications in biotechnology, materials science, and medicine. In early demonstrations by Urry and co-workers (64), ELPs with the general sequence G(VPGVG)19VPGV have been expressed from E. coli, are easily purified from cellular proteins, and exhibit a sharp LCST at approximately 48°C, as expected for a VPGVG sequence of this molecular weight. By appropriate design of the elastin-like polymer, thermal, chemical, and electrochemical stimuli can all be used to alter and control the LCST of ELPs (64). Urry and co-workers have demonstrated, for example, that substitution of any of the natural amino acids in the fourth position of the VPGVG pentapeptide provides a range of LCST transitions. Attachment of redox-active cofactors such as FAD/FADH2 (flavin adenine dinucleotide) or NAD/NADH (nicotinamide adenine dinucleotide) to glutamic acids in a VPGEG repeat permits electrochemical control over the hydrophobic collapse of the protein (73). Upon reduction of FAD to FADH2 or of NAD to NADH, the increased hydrophobicity of the pentapeptide results in the hydrophobic collapse of the elastin-like polymer at substantially lower temperatures (73). If the two different states of the ELP (oxidized and reduced) have LCSTs above and below room temperature, the electrochemical transition can stimulate an observable transition under ambient conditions.

    Following up on the extensive body of work by Urry (74,75,130), others have also taken advantage of the control of LCST possible for these elastin-like polymers in the design of ELPs for drug delivery applications. In one example, Chilkoti and co-workers (19,68,76) have produced a family of polymers of the general sequence VPGXG in E. coli, where X = V, A, and/or G. The ratios of V:A:G are varied in order to tune the LCST, with one library containing no A or G, and two others containing the residues in ratios of 5:2:3 and 1:8:7. The A and G substitutions in the fourth position of the pentamer are dispersed throughout the sequence to reduce the repetition within the gene. These polymers have 120 to 330 pentapeptide repeats and molecular weights of 50 kDa to 130 kDa, and the LCST of the polymers is dependent on both the molecular weight and the guest residue sequence. Transition temperatures that vary smoothly between approximately 28 °C and 78 °C are obtained, demonstrating the potential for genetically designing an ELP with a desired transition temperature for drug delivery applications. In addition, ELPs designed to exhibit an LCST at 35 °C (131) exhibit complex shear moduli similar to those observed for collagen and hyaluronan at physiological temperature, and are able to support cartilaginous cell growth. The combination of these results suggests that these materials may also be useful as injectable scaffolds for cartilaginous tissue repair (131), although chemical cross-linking would likely be required to impart sufficient mechanical integrity upon gel formation. These investigators have also designed block copolymers based on the VPGXG sequences, with one block comprising 64 repeats of the V:A:G composition 1:8:7 (LCST > 90 °C) and the other comprising 60 repeats of VPGVG (LCST = 35 °C). Upon heating, these block copolymers form 40–100-nm size nanoparticles, owing to the hydrophobic collapse and aggregation of the VPGVG block. The mechanical, chemical, and/or biological properties can be tuned to enable self-assembly with incremental changes in temperature, pH, and ionic strength, making these block copolymers attractive candidates for controlled delivery, tissue engineering applications, and stimuli responsive surfaces and membranes (132,133).
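
    The relationship between repeat number, guest residue composition, and molecular weight quoted above can be checked with a rough estimate. The sketch below is an illustration only: it uses standard average residue masses, treats the guest position X as a composition-weighted average, and ignores any leader, trailer, or fusion sequences, none of which are specified here. The function name elp_mass_kda is hypothetical.

```python
# Average residue masses in Da (free amino acid minus one water).
RESIDUE_MASS = {"G": 57.05, "A": 71.08, "V": 99.13, "P": 97.12}
WATER = 18.02  # one water per chain for the free N- and C-termini


def elp_mass_kda(n_repeats, guest_ratio):
    """Approximate mass (kDa) of (VPGXG)n for a guest residue ratio.

    `guest_ratio` maps one-letter codes to relative amounts at position X,
    for example {"V": 5, "A": 2, "G": 3}. Hypothetical helper, sketch only.
    """
    total = sum(guest_ratio.values())
    mean_guest = sum(RESIDUE_MASS[aa] * w for aa, w in guest_ratio.items()) / total
    # Fixed positions of the pentapeptide: V, P, G, (X), G
    fixed = RESIDUE_MASS["V"] + RESIDUE_MASS["P"] + 2 * RESIDUE_MASS["G"]
    return (n_repeats * (fixed + mean_guest) + WATER) / 1000.0


if __name__ == "__main__":
    print(round(elp_mass_kda(330, {"V": 5, "A": 2, "G": 3}), 1))
    print(round(elp_mass_kda(120, {"V": 1}), 1))
```

    Under these assumptions, 330 repeats at a V:A:G ratio of 5:2:3 give roughly 129 kDa and 120 repeats of VPGVG give roughly 49 kDa, in line with the 50 kDa to 130 kDa range stated above. Which library each chain length belongs to is not specified here, so the pairing of length and composition in the example is an assumption.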

    Genetically directed synthetic strategies also permit the preparation of other complex block copolymers with tunable properties. Conticello and co-workers (77,78) have designed and synthesized a series of elastin-mimetic diblock (AB) and triblock (BAB) copolymers for assembly on the mesoscopic scale, and have expressed these polymers in E. coli. The B blocks consist primarily of the hydrophobic elastin repeats {VPAVG(IPAVG)4}16, and the A blocks comprise more hydrophilic repeats such as {VPGEG(IPGAG)4}14, {VPGEG(VPGVG)4}30, and {VPGEG(VPGVG)4}48 (see Table 2). The hydrophobic B blocks exhibit an LCST below 37 °C, which is desired so that collapse of the hydrophobic block occurs under physiologically relevant conditions. The substitution of alanine (A) in place of glycine (G) in position three of the pentamer results in a change of mechanical properties of the material from elastomeric to plastic, which permits control of the mechanical properties of the block. The hydrophilic blocks are chosen because of their high LCST values, so that phase separation of the two blocks occurs at physiological conditions to form ordered polymeric structures. Indeed, BAB block copolymers with an A block comprising {VPGEG(VPGVG)4}30 undergo a reversible sol–gel transition at 23 °C. The same group has demonstrated that diblock copolymers of the general sequence {VPGEG(IPGAG)4}14VPGEG{VPGFG(IPGVG)4}16VPGFG assemble into spherical particles and beaded filaments in water at 25 °C (79).
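
    The nested repeat shorthand used for these block copolymers can be expanded mechanically into a full one-letter sequence, which is convenient for counting residues per block. The parser below is a minimal sketch written for this purpose; the function name expand is hypothetical, and the parser handles only uppercase one-letter codes, parentheses, braces, and numeric repeat counts (symbolic counts such as n or x, and named domains such as CS5, are deliberately rejected).

```python
def expand(notation):
    """Expand shorthand such as '{VPGEG(IPGAG)4}14' into a plain sequence.

    A bracketed group followed by digits is repeated that many times; a
    group with no digits is taken once. Hypothetical helper, sketch only.
    """
    closers = {"(": ")", "{": "}"}
    pos = 0

    def parse(closer):
        nonlocal pos
        parts = []
        while pos < len(notation) and notation[pos] != closer:
            ch = notation[pos]
            if ch in closers:
                pos += 1  # step past the opening bracket
                inner = parse(closers[ch])
                if pos >= len(notation):
                    raise ValueError("unbalanced brackets in notation")
                pos += 1  # step past the closing bracket
                start = pos
                while pos < len(notation) and notation[pos].isdigit():
                    pos += 1
                count = int(notation[start:pos]) if pos > start else 1
                parts.append(inner * count)
            elif ch.isupper():
                parts.append(ch)
                pos += 1
            else:
                raise ValueError(f"unexpected character {ch!r} at position {pos}")
        return "".join(parts)

    return parse(None)


if __name__ == "__main__":
    diblock = expand("{VPGEG(IPGAG)4}14VPGEG{VPGFG(IPGVG)4}16VPGFG")
    print(len(diblock), diblock[:25])
```

    Applied to the diblock sequence quoted above, the expansion gives 760 residues: 350 from the first block, 400 from the second, and 5 from each of the two single pentapeptides.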

    There are also significant opportunities for controlling the mechanical properties of ELP materials via covalent cross-linking strategies. Several different strategies have been employed to cross-link soluble elastin molecules into a cross-linked, elastomeric network. Early work by Urry employed irradiation for cross-linking, which yields elastomeric networks with an elastic modulus very similar to native elastin. Although this method does not require a specific amino acid for cross-linking, it affords little control over the position of cross-linking and can also result in chain scission. Therefore, in addition to irradiation (134), other radical-based cross-linking methods employing dicumyl peroxide have been developed, as well as chemical cross-linking strategies such as carbodiimide coupling (80) and chemical reactions with electrophilic reagents. Chemical cross-linking approaches, explored by several groups, offer the advantage of control of the positions of the chemically reactive cross-linking sites, which can in turn control cross-linking density, molecular weight between cross-links, and resultant mechanical properties. Elastin-like proteins with the repetitive sequence {(VPGVG)4VPGKG}39 have been produced by the Conticello group; the lysines are reacted with the electrophilic N-hydroxysuccinimide esters of bifunctional carboxylic acids, bis(sulfosuccinimidyl)suberate (in phosphate buffer), or disuccinimidylsuberate (in DMSO) to produce cross-linked gels (67). The elastin-mimetic hydrogels exhibit expected elastomeric behavior, with a reversible contraction (62.4%) and expansion upon cycling between 10 and 45 °C. Trifunctional tris-succinimidyl aminotriacetate has also been used by other groups to cross-link elastin-like polymers {VPGKG(VPGVG)6}n (with n = 56, 112, and 224) and {VPGKG(VPGVG)16}n (with n = 51, 102, and 204) (81). The mechanical properties of the cross-linked hydrogels are dependent on molecular weight, the density of lysine residues, and temperature, and range from 0.24 to 15 kPa. Cross-linked elastin-like polymer hydrogels exhibit an LCST, although one that is gradual, as compared to the abrupt LCST transition observed for soluble elastin-like polymers. In additional studies by Urry and co-workers (80), polymers with the general sequence {(GVGVP)2(GXGVP)(GVGVP)2}n, with X = lysine or glutamic acid, have been mixed and cross-linked via carbodiimide-mediated coupling. The swelling of the cross-linked materials varies as a function of the cross-linking reaction temperature, and filamentous structures are formed in elastin-like polymers cross-linked above their LCST.

    Methods to produce elastomeric fibers from ELPs have also been investigated. Uncross-linked polymers can be spun into fibrous mats via electrospinning protocols (135), with long uniform fibers resulting from spinning solutions greater than 10 wt% in polymer; tensile strengths of 35 MPa and a material modulus of 1.8 GPa are possible via these strategies. Acrylate groups have been incorporated via reaction with the lysine (K) amines to permit production of cross-linked elastin-mimetic fibers (136), which exhibit increased tensile strengths and moduli that are similar to those reported for native elastin.

    Alternate strategies to control the mechanical properties of elastin-like proteins, via combination with amino acid sequences from different fibrous proteins, have also been explored. Combining the temperature-responsiveness of elastin-based repeats with the excellent mechanical properties of silk-like proteins (vide supra) yields protein polymers that have properties similar to those observed in segmented polyurethanes, with the advantage of controlled sensitivity to stimuli (pH and temperature). Cappello and co-workers have combined the (GVGVP) repeat from mammalian elastin with the (GAGAGS) sequence from B. mori silk to create copolymers of over 10 different compositions (82). The relative lengths of the silk-like and elastin-like blocks are varied, and the ionic sensitivity of the polymers is controlled via the selection of amino acids in the elastin-like block. Representative copolymer compositions include the sequences {(GVGVP)4(GEGVP)(GVGVP)3(GAGAGS)}16 and {(GVGVP)4(GXGVP)(GVGVP)4(GAGAGS)}11, where X is either glutamic acid or valine (83,84); these silk-elastin-like polymers (SELPs) are easily expressed from E. coli hosts. Characterization of the physical behavior of these proteins via turbidity measurements indicates that the transition temperature (LCST) of the polymers containing glutamic acid is sensitive to pH, and that the LCST of both the glutamic acid- and valine-containing polymers can be modulated with changes in ionic strength, temperature, polymer concentration, and polymer length (83,84). The simultaneous control over amino acid sequence and molecular weight afforded by genetically directed methods of polymer synthesis therefore provides opportunities for fine-tuning the stimuli-responsive behavior of these polymers. The materials have potential applications as injectable urethral bulking agents for treating incontinence, cell culture coatings, drug delivery systems, or soft-tissue augmentation or bone repair.

    Cappello and co-workers have also investigated polymers in which the number of silk-like domains in the polymer is increased to produce polymers that spontaneously form hydrogels in aqueous solution. For example, the sequence {(GVGVP)4(GKGVP)(GVGVP)3(GAGAGS)4}12 undergoes an irreversible sol–gel transition in aqueous solution of physiological pH and ionic strength, mediated by hydrogen-bond controlled crystallization of the silk-like domains (85), unlike the polymers above, which contain fewer silk-like repeats and do not form hydrogels. The hydrogels (12 wt%) exhibit mechanical and swelling properties that do not change as a function of temperature, as the presence of the silk domains eliminates their temperature sensitivity (86). X-ray diffraction patterns of SELP polymers are very similar to those of simple silk-like polymers (SLPs), confirming the presence of similar crystalline arrangements of silk-like blocks in both SELPs and SLPs. Crystalline silk-fibroin domains are also observed, via NMR, in the solid state of other SELPs (137). Proteins and DNA can be delivered from the hydrogels without loss of activity (85,87,138,139); bioactive DNA can be released from the gels for up to 28 days, with the rate of release for 2.6–11-kb plasmids being dependent on the size of the plasmid DNA (87,139). DNA released from these gels can also be delivered in the active form in a mouse breast cancer model (139). These results suggest the potential use of these polymers as in situ gel-forming implants for protein and gene delivery applications (140,141).

    Protein engineering strategies have also allowed the incorporation of biological function into elastin-like protein polymers and SELPs. The incorporation of cell-binding amino acid sequences, such as RGD (82,142) and REDV (from the CS5 domain of fibronectin) (66) (Table 2), permits endothelial cell adhesion to these materials for their application in tissue engineering. Tirrell and co-workers have also incorporated cross-linking domains into biologically active elastin-like polymers (Table 2) to permit the production of biologically active, cross-linked materials with mechanical properties similar to those of native elastin. The incorporation of lysine residues at the termini or in the VPGXG repeat of elastin-CS5 copolymers permits cross-linking via glutaraldehyde or N-hydroxysuccinimide activated suberic acids (88,89). Molecules of the general sequence RKTMG{LD-CS5-G(VPGIG)20VP}xLEKAAKLE, with x = 1, 3, and 5, have been cross-linked via reaction with glutaraldehyde. The observed molecular weight between cross-links (Mc) and mechanical properties of the cross-linked polymers approximate those of elastin (0.3–0.6 MPa), with elongation-to-break values of 100–220% (88). A broader range of mechanical properties can be obtained from elastin-based proteins in which lysine residues are incorporated in the VPGXG repeats. Proteins of the general sequence {LD-CS5-G((VPGIG)2(VPGKG)(VPGIG)2)4VP}3, when cross-linked via reaction with NHS-activated suberic acids, yield cross-linked materials with Young's moduli ranging from 0.07 to 0.97 MPa, with Mc values of 3,000–38,000 (89). These values span the range of mechanical properties of natural elastins and vary as expected with measured extents of cross-linking. Films prepared from these proteins have also been shown to support endothelial cell adhesion under shear stresses that mimic those in blood vessels (143). The polymers are therefore being considered for applications in engineered artificial grafts for the surgical reconstruction of small- and medium-diameter blood vessels.

    Overall, the applications for elastin-based protein polymers are numerous, given their useful elastomeric properties, biological behavior, and low immunogenicity. The versatility of these systems includes the ability to engineer the LCST behavior of (VPGXG)n polymers via choice of the X residue, to produce cross-linked elastomeric gels, and to synthesize elastin-like block copolymers that assemble into nanoparticles. The amino acid sequence of these polymers can also be engineered to allow the incorporation of biologically active domains. Their possible applications as drug and gene delivery vehicles, membranes, elastomeric fibers, and vascular grafts have continued to motivate interest in their scientific and technological potential, and a variety of academic and industrial research groups continue to investigate these polymers.

    Artificial Proteins Designed de Novo

    In addition to the design of protein polymers based on repetitive amino acid sequences found in naturally occurring proteins, de novo design has also been applied to the production of artificial proteins with desired conformational properties. Knowledge of the secondary structure preferences (ie, α-helix, β-sheet, coiled-coil, reverse turn, etc) of amino acids and amino acid sequences has permitted the design of completely artificial proteins with prescribed structures, chemical reactivity, and assembly. These short amino acid sequences are encoded into genes and multimerized into target polymer lengths via the genetic strategies previously described. The proteins are expressed from E. coli and purified by appropriate protocols, and their structure and assembly are then characterized. A variety of novel protein materials, such as crystalline lamellar solids, smectic-like liquid crystals, and reversible hydrogels, have been produced via these methods, and showcase the control over macromolecular structure and function that can be achieved by genetically directed polymer synthesis.

    The earliest example of this approach involved the design and synthesis of crystalline lamellar solids by Tirrell and co-workers. Although the folded chain lamellar crystal is a well-known motif in polymer science, this architecture is formed in synthetic polymers for largely kinetic reasons. The trapping of the polymer chain into a folded conformation upon cooling makes it essentially impossible to precisely control the thickness and surface chemistry of such chain-folded lamellar structures. The genetic engineering approach, however, has afforded such folded structures in which both variables can be controlled, simply by controlling the sequence of amino acids in a protein polymer. Polymers with the general sequence {(AG)xEG}y (with x ranging from 3 to 6, and y ranging from 5 to 54) have been designed on the basis of the knowledge that alanylglycyl dyads form thermodynamically stable β-sheets in natural silk proteins. When precipitated from formic acid, this family of proteins forms crystalline solids with primarily β-sheet structure, as ascertained from infrared and Raman spectroscopic analysis, cross polarization/magic angle spinning NMR, and wide-angle X-ray diffraction (WAXD) (90,144). X-ray diffraction investigations indicate that the lamellar thickness is always shorter than the chain length of the polymers, as expected for the chain-folded architecture, and consistent with this observation, X-ray structure refinement identifies an antiparallel β-sheet arrangement of the chains as the best fit to experimental X-ray data. The glutamic acid residues of this sequence would be expected to lie at the lamellar surface, and indeed, deprotonation of the carboxylic acid groups does not cause any change in the chain conformation or intersheet packing distance as observed via vibrational spectroscopy and WAXD, respectively (145). Consistent with these results, X-ray diffraction characterization of crystalline lamellar solids of {(AG)3XG} sequences (with X = Asn, Phe, Ser, Val, or Tyr) demonstrates that the interlamellar spacing increases linearly with the increasing volume of the amino acid side chain in position X (91), but that neither the turn length nor the intrasheet packing distance is altered. These results suggest that crystal surface functionality can be designed via the choice of the amino acid in position X of the {(AG)3XG}y sequences.

    The self-assembling properties of certain β-sheet-forming peptides have also been captured in protein polymers. Sequences of alternating polar and nonpolar residues comprise a prevalent sequence motif in the antiparallel β-sheet regions of natural proteins, and accordingly, the peptide (AEAEAKAK)2 has been demonstrated by Zhang and co-workers to self-assemble into an organized supramolecular structure (111). Protein polymers of the sequence (AEAEAKAKAEAEAKAK)9 have been produced by Goeden-Wood and co-workers in an effort to capture the self-assembling characteristics of the peptides while improving the mechanical properties of the supramolecular gels (146,147). The proteins (which contain a decahistidine fusion sequence) can be expressed from E. coli and purified via metal-chelate affinity chromatography with yields of approximately 5 mg/L. The protein is shown, via CD and FTIR investigations, to adopt a stable β-sheet structure, and is suggested to form β-sheet fibrils via Congo Red dye binding assays and characterization by scanning electron microscopy (SEM). Under physiological conditions, the protein forms hydrogels with a storage modulus of approximately 100 Pa, suggesting the potential application of these polymers in soft tissue engineering or drug delivery applications.

    In addition to the de novo design of β-sheet forming protein polymers in which self-assembly is controlled by monomer sequence, the design of helical proteins in which assembly is controlled by molecular weight has also been demonstrated. Helical rods form well-defined self-assembled structures; one particularly well-studied example is poly(γ-benzyl-α,L-glutamate) (PBLG), which forms oriented films and liquid crystalline solutions. The large dipole moment along the helical axis permits orientation of these rod-like polymers via the use of an electric field to yield materials with interesting piezoelectric and nonlinear optical properties. Since the conventional approach for making these molecules utilizes the ring-opening polymerization of N-carboxy-α-amino acid anhydrides, molecules made by this method have broad molecular weight distributions, and therefore form cholesteric liquid crystalline phases in which there is orientational order but no longitudinal registry of individual chains. Smectic liquid crystalline phases, in which there is both orientational order and longitudinal registry, are not observed for PBLG synthesized by the chemical methods. In contrast, genetically directed synthetic strategies permit the synthesis of monodisperse poly(α,L-glutamic acid) (PLGA), which can be chemically benzylated to form biosynthetically derived PBLGs capable of forming smectic-like liquid crystalline phases in solution (92,93). Monodisperse polypeptides with the sequence ED(E17D)xEE (referred to as PBLG-x after benzylation), where x = 3–6, have been produced in E. coli (93,94). Aspartic acid (D) is included periodically to provide recognition sites for the enzyme BbsI and to preserve the genetic stability of the repetitive glutamic acid sequence. The biosynthetically produced PLGAs are benzylated by alkylation of the PLGA with phenyldiazomethane, and analysis by NMR indicates 94–98% benzylation. Small-angle X-ray diffraction patterns of films dried from solutions of PBLG-4 and PBLG-5 (in a 97/3 mixture of chloroform and trifluoroacetic acid) show well-defined spacings of 114.5 and 140 Å, respectively. These values almost exactly match the expected lengths of the monodisperse PBLG helices (114.5 and 141 Å) (92,93), suggesting alignment of the helical molecules in a smectic-like liquid crystalline phase. The films are thought to specifically orient in a twisted grain boundary (TGB)-like phase, owing to the superposition of a smectic-A layering (from the monodisperse PBLG) and the chirality of the helical rod (148).
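
    The expected rod lengths can be approximated from the sequence alone. The sketch below assumes an ideal α-helix with the textbook axial rise of 1.5 Å per residue and counts the residues in ED(E17D)xEE; the function names are hypothetical, and how the cited work arrived at its expected lengths is not stated here.

```python
RISE_PER_RESIDUE = 1.5  # Å per residue along an ideal alpha-helix (textbook value)


def pblg_residues(x):
    """Residues in ED(E17D)xEE: 2 (ED) + 18 per repeat + 2 (EE)."""
    return 2 + 18 * x + 2


def helix_length(x):
    """Approximate end-to-end length (Å) of the fully helical PBLG-x rod."""
    return pblg_residues(x) * RISE_PER_RESIDUE


if __name__ == "__main__":
    for x in (4, 5):
        print(f"PBLG-{x}: {pblg_residues(x)} residues, {helix_length(x):.1f} Å")
```

    This estimate gives 114 Å for PBLG-4 (76 residues) and 141 Å for PBLG-5 (94 residues), close to the expected lengths and measured layer spacings quoted above, which is consistent with a layer thickness set by one fully extended helix.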

    Additional helical polymers in which assembly is controlled by monomer sequence have also been designed by Tirrell and co-workers and by Kaplan and McGrath, and have proven fruitful for assembly of protein-based materials. Helical coiled-coil motifs (comprising leucine zipper peptides) are found in transcription factors and play an important role in controlling protein dimerization and DNA binding; related helical motifs are also found in structural proteins such as keratin (the main structural protein in hair and nails). The primary structural feature of the leucine zipper peptides is the heptad repeat sequence abcdefg, in which hydrophobic amino acids occupy positions a and d (d is primarily leucine), while charged residues generally occupy positions e and g. Under appropriate pH and temperatures, these amino acid sequences adopt helical conformations that array the hydrophobic residues a and d along a single face of the helix. Assembly of these helices, most often in the form of dimeric coiled-coils, is controlled by aggregation of the hydrophobic face, with the stability of specific associations modulated by the charged residues e and g. The coiled-coil stability, aggregation number, and aggregate specificity can be manipulated through control of the heptad amino acid sequence and the chain length. Thus, these helical motifs serve as versatile structures for the assembly of novel protein-based materials.
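
    The heptad bookkeeping described above can be made concrete with a small sketch that labels each residue with its position a to g and collects the residues falling at chosen positions. The toy heptad IAALESE used in the example is invented for illustration (hydrophobic at a and d, glutamic acid at e and g) and is not a sequence from the cited work; the function names are likewise hypothetical.

```python
HEPTAD = "abcdefg"


def heptad_register(sequence, first_position="a"):
    """Pair each residue with its heptad position, starting at `first_position`."""
    offset = HEPTAD.index(first_position)
    return [(HEPTAD[(offset + i) % 7], aa) for i, aa in enumerate(sequence)]


def residues_at(sequence, positions, first_position="a"):
    """Return the residues that fall at the given heptad positions, in order."""
    return "".join(
        aa for pos, aa in heptad_register(sequence, first_position) if pos in positions
    )


if __name__ == "__main__":
    toy = "IAALESE" * 2  # hypothetical two-heptad peptide, illustration only
    print("a/d core:", residues_at(toy, "ad"))
    print("e/g flank:", residues_at(toy, "eg"))
```

    For the toy peptide the a and d positions return ILIL and the e and g positions return EEEE, which mirrors the division of labor in the text: the a/d residues form the hydrophobic face, and the e/g residues carry the charges that modulate association.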

    The earliest work in this area by Kaplan and McGrath focused on the synthesis of a variety of isolated leucine zipper peptides with controlled assembly properties (149,150). Expanding the approach, Tirrell and co-workers have designed hydrogel-forming, triblock copolymers carrying short leucine zipper end blocks flanking a water-soluble polyelectrolyte domain (95). In this triblock copolymer, there are three main considerations in the design of the protein sequence: (1) the hydrophobic face of the leucine zipper block is modeled after the a/d residue pattern of the Jun oncogene product; (2) the b, c, and f positions are chosen on the basis of the residues that most commonly reside in those positions in naturally occurring coiled-coil proteins; and (3) nine of the 12 e and g positions in the sequence are occupied by Glu residues in order to facilitate pH control of gelation and viscoelastic behavior. The triblock protein polymer comprises two leucine zipper end blocks of six heptad repeats, with a central polyelectrolyte block of the sequence {(AG)3PEG}10. It has been expressed in E. coli, and CD characterization of the purified protein confirms the presence of both helical and random coil structures. At low pH and ambient temperature, these materials form elastic gels; increases in pH or temperature produce a viscous solution of nonassociated protein (95). Solid-state NMR investigations of the protein hydrogel show that the central polyelectrolyte domain is isotropically mobile on the microsecond time scale, while the leucine zipper domains are rigid, which is consistent with the association of the leucine zippers as the basis for hydrogel formation (151). On the millisecond time scale, the leucine zipper domains exhibit rigid body motion, which supports the hypothesis that the domains act as continuously exchanging physical cross-links.

    Alteration of the composition of these proteins via genetic methods may prove to be a successful strategy for modulating the physical behavior of protein-based hydrogels. For example, alteration of the acid/base balance in the helical domains (positions e and g) can alter the stability of the coiled-coil association, which will in turn control the stability of the hydrogel network and alter the pH and temperature at which these macromolecular structures assemble into hydrogels. Inclusion of cross-linking or lysis domains and/or cellular or molecular recognition sequences will permit the design of hydrogels with predetermined physical and biological properties and may expand their uses as reagents for encapsulation and controlled delivery.

    Genetically directed synthetic methods are also being exploited for the structure-based design of protein polymers that display desired functional groups. Kiick and co-workers have designed a series of functionalized, alanine-rich helical proteins with the general sequence {(AAAQ)x(AAAE)(AAAQ)x}h, in which the position and number of the glutamic acid residues (E) can be varied systematically. Members of this family of proteins can be produced in E. coli and are easily purified via metal chelate affinity chromatography at yields of 10–20 mg/L. The proteins are shown, via CD spectroscopy, to be highly helical under ambient conditions (96), and, along with related random coil sequences, have intended applications for purposeful display of biologically active groups and/or organic moieties.


    In all of the cases described above, genetically directed methods have been used to produce protein polymers with predesigned secondary structures that control the folding, assembly, and/or the mechanical and biological properties of the resulting polymer. However, genetically directed methods of polymer synthesis have also been applied to the production of protein polymers in which a lack of regular secondary structure is desired. Recursive multimerization strategies have been used by Won and Barron (97) to produce genes that encode random coil protein polymers of the general sequence (GKGSAQA)x, with molecular weights up to 46.25 kDa. Protein polymers of this repetitive, nonnatural sequence have been produced in E. coli at yields of approximately 15 mg/L, and CD studies confirm the random coil nature of the protein. The ultimate goal of this line of inves