Mutalik QuantEstimation Supplement revised - Nature


  • 1

    Supplemental Information

    Quantitative Estimation of Activity and Quality for Collections of Functional Genetic Elements

    Vivek K. Mutalik1,2,3,9, Joao C. Guimaraes1,3,4,9, Guillaume Cambray1,3,9, Quynh-Anh Mai1,3,

    Marc Juul Christoffersen1,3, Lance Martin1,3,8, Ayumi Yu1,3,8, Colin Lam1,3, Cesar Rodriguez1,3,8,

    Gaymon Bennett1,3,8, Jay D. Keasling1,2,3,6,7, Drew Endy1,5,9,*, Adam P. Arkin1,2,3,9,*

    1 BIOFAB International Open Facility Advancing Biotechnology (BIOFAB), 5885 Hollis Street, Emeryville, CA 94608, USA

    2 Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA

    3 Department of Bioengineering, University of California, Berkeley, CA 94720, USA

    4 Department of Informatics, Computer Science and Technology Center, University of Minho, Campus de Gualtar, Braga, Portugal

    5 Department of Bioengineering, Stanford University, Stanford, CA 94305, USA

    6 Department of Chemical & Biomolecular Engineering, University of California, Berkeley, CA 94720, USA

    7 Joint Bioenergy Institute, 5885 Hollis Street, Emeryville, CA 94608, USA

    8 Present addresses: Dept. of Bioengineering, Stanford University, Stanford, CA 94305, USA (L. M.); Philotic, Inc., 88 Kearny St, Suite 2100, San Francisco, CA 94108, USA (A. Y.); Autodesk, Inc., One Market Street, Suite 200, San Francisco, CA 94105 (C. R.); Center for Biological Futures, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave., Seattle, WA 98109 (G. B.).

    9 Equal contribution

    *Correspondence should be addressed to D.E. or A.P.A. (endy@stanford.edu; aparkin@lbl.gov)

    Nature Methods: doi:10.1038/nmeth.2403

  • 2

    Table of Contents

    Supplementary Figures

    SF1. Combinatorial assembly of Promoters and 5′ UTRs 3

    SF2. Quality and reproducibility of characterization pipeline 4

    SF3. Comparison of plasmid-borne vs chromosomally integrated Promoter:5′ UTR combinatorial library driving gfp expression 5

    SF4. Observed variation and correlation of mRNA abundance and fluorescence from combinatorial library of expression elements 6

    SF5. Estimation of expression element performance scores using transcript abundance and translation efficiency datasets 7

    SF6. Estimation of part activity with limited measurements 8

    SF7. Genetic element performance scores at two different temperatures 9

    SF8. Variability in 5′ UTR scores is correlated with RNA folded structure at the UTR:GOI junction 10

    Supplementary Tables

    ST1. Putative transcription and translational elements used in this work 11

    ST2. ANOVA table for main expression elements and their interaction 13

    ST3. Process Steps and Costs for BIOFAB Quant. Estimation Pilot Study 14

    ST4. List of plasmids and strains used in this work 15

    ST5. List of primers used in the present work 18

    Supplementary Note 22

    References 25


  • 3

    Supplementary Figure 1: Combinatorial assembly of promoters and 5′ UTRs. (A) The vector backbone of pFABOUT2 (gfp) was PCR amplified using primers oFAB57 and oFAB58, which introduce BsaI target sites (shown as green and purple boxes with arrowheads) by replacing the TetR coding region, Ptet and the 5′ UTR driving expression of the gfp gene. The terminators (symbol T) and promoters (line-arrow) on the vector backbone are as shown in the figure. The purified PCR products were then digested with BsaI to yield pFABOUT2_cut, with overhangs as shown in the figure. The cut vector backbone was then ligated to phosphorylated-annealed oligos encoding the promoter and 5′ UTR to yield the 84 constructs that make up the GFP library (seven promoters combined with eleven 5′ UTRs plus one Null-RBS 5′ UTR as a control). (B) The vector backbone of pFABOUT18 (rfp) was PCR amplified using primers oFAB58 and oFAB60 to introduce BsaI sites (shown as red and purple boxes with arrowheads) upstream of the reporter. The cut vector pFABOUT18_cut was then ligated to phosphorylated-annealed oligos encoding promoters and 5′ UTRs to yield the 84 constructs of the RFP library. Note that both the GFP and RFP libraries have a common four-nucleotide TTTG junction between promoters and 5′ UTRs.
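
The library size in this caption follows from crossing seven promoters with twelve 5′ UTRs (eleven UTRs plus the Null-RBS control). The sketch below enumerates that design under stated assumptions: the labels p1 to p7 and u1 to u11 follow the figure legends, `null_rbs` is a hypothetical label for the control, and no sequences are modeled.

```python
from itertools import product

# Part labels only; "null_rbs" is a hypothetical name for the Null-RBS control.
promoters = [f"p{i}" for i in range(1, 8)]              # seven promoters
utrs = [f"u{i}" for i in range(1, 12)] + ["null_rbs"]   # eleven 5' UTRs + control


def enumerate_library(promoters, utrs):
    """Return one (promoter, utr) pair per construct in the combinatorial library."""
    return list(product(promoters, utrs))


library = enumerate_library(promoters, utrs)
# Combinations that exclude the Null-RBS control.
functional = [pair for pair in library if pair[1] != "null_rbs"]

print(len(library), len(functional))
```

Dropping the control leaves 7 × 11 = 77 combinations, the number compared across instruments in Supplementary Figure 2.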


  • 4

    Supplementary Figure 2: Quality and reproducibility of measurements. (a) Comparison between plate reader and flow cytometer measurements across 77 combinations of 7 promoters and 11 UTRs driving gfp expression (error bars indicate standard deviation across three biological replicates). The correlation (excluding one outlier in grey, at very low expression level) is r = 0.981, with R² = 0.962. (b) Comparison between plate reader and flow cytometer measurements across 77 combinations of 7 promoters and 11 UTRs driving rfp expression (error bars indicate standard deviation across three biological replicates). The correlation (excluding two outliers in grey, at very low expression level) is r = 0.980, with R² = 0.961. (c) Comparison of two biological replicates of flow cytometer measurements. The correlation between (log) intensities of replicate measurements lies between 0.984 and 0.995. An example is shown in the figure (Replicate 1 vs Replicate 2), with r = 0.986 and R² = 0.972 (N = 154). The correlation for the plate reader measurements lies between 0.995 and 0.998. (d) Transcript abundance was measured in three replicate biological samples. The correlation between (log) intensities of replicate measurements lies between 0.788 and 0.907. A representative example is shown in the figure (Replicate 1 vs Replicate 2), with r = 0.907 and R² = 0.823 (N = 154).
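
For a linear regression with a single predictor, the coefficient of determination equals the square of the Pearson correlation, which is how the paired r and R² values in this caption relate. The sketch below is illustrative and is not the authors' analysis code: `pearson_r` is a hand-written helper, and the only numbers used are the r values reported in the caption.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# r values as reported in the caption, keyed by panel.
reported_r = {"a": 0.981, "b": 0.980, "c": 0.986, "d": 0.907}

# Square each reported r and round to three decimals for comparison with R².
r_squared = {panel: round(r ** 2, 3) for panel, r in reported_r.items()}
print(r_squared)
```

Squaring the rounded r values reproduces the reported R² for panels a, c and d. Panel b comes out at 0.960 rather than 0.961, a last-digit difference that is presumably due to r being rounded before it was reported.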

    [Figure panels. (a) GFP expression strength by plate reader (RNU/OD/h, log2) versus GFP expression strength by flow cytometry (A.U., log2); R² = 0.962. (b) RFP expression strength by plate reader (RNU/OD/h, log2) versus RFP expression strength by flow cytometry (A.U., log2); R² = 0.961. (c) Replicate 1 versus Replicate 2 expression strength by flow cytometry (A.U., log2); R² = 0.972. (d) Replicate 1 versus Replicate 2 mRNA abundance (Molecules/l, log2); R² = 0.823.]


  • 5

    Supplementary Figure 3. Comparison of plasmid-borne vs chromosomally integrated Promoter:5′ UTR combinatorial library driving gfp expression. Scatter plot of fluorescence (GFP) measurements from 63 pairs of plasmid-borne versus chromosomally integrated Promoter-5′ UTR combinations. A linear regression on 60 points gives the following relationship: F_plasmid = 12.07 * F_chromosomal - 18.03 (R² = 0.85). These data fit the expected dosage difference between a p15A-borne and a chromosomally integrated gene (~12-15 copies versus ~1 copy).
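
To illustrate how the reported fit relates to the expected copy-number ratio, the sketch below applies the fitted line (slope 12.07, intercept -18.03) to a few chromosomal fluorescence values. The function names and the example inputs are hypothetical. Only the two coefficients come from the figure.

```python
SLOPE = 12.07       # fitted slope reported for the regression
INTERCEPT = -18.03  # fitted intercept reported for the regression


def predict_plasmid_fluorescence(f_chromosomal):
    """Apply the reported linear fit to a chromosomal fluorescence value (A.U.)."""
    return SLOPE * f_chromosomal + INTERCEPT


def fold_difference(f_chromosomal):
    """Predicted plasmid-to-chromosome fluorescence ratio at a given chromosomal value."""
    return predict_plasmid_fluorescence(f_chromosomal) / f_chromosomal


# Example chromosomal values chosen for illustration, not measured data.
for f in (10, 30, 60):
    print(f, round(predict_plasmid_fluorescence(f), 2), round(fold_difference(f), 2))
```

Because the intercept is negative, the predicted ratio is 12.07 - 18.03/F_chromosomal, so it stays below the slope and approaches about 12 only at higher chromosomal fluorescence.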

    [Figure: mean fluorescence on plasmid (A.U.) versus mean fluorescence on chromosome (A.U.); fitted line y = 12.07 * x - 18.03, R² = 0.85.]


  • 6

    Supplementary Figure 4: Observed variation and correlation of mRNA abundance and fluorescence from combinatorial library of expression elements. Scatter plots of mRNA abundance versus fluorescence for constructs driving gfp (a) and rfp (b) expression. Pairwise comparison of mRNA levels (c) and fluorescence (d) between the GFP and RFP libraries. Data points are coded with a different symbol for every promoter (p_i) and a different color for every 5′ UTR (u_i), according to the legend.

    [Figure panels. (a) GFP fluorescence (A.U., log2) versus GFP mRNA abundance (Molecules/L, log2); R² = 0.58. (b) RFP fluorescence (A.U., log2) versus RFP mRNA abundance (Molecules/L, log2); R² = 0.59. (c) GFP mRNA abundance (Molecules/L, log2) versus RFP mRNA abundance (Molecules/L, log2). (d) GFP fluorescence (A.U., log2) versus RFP fluorescence (A.U., log2); R² = 0.38. Legend: symbols p1 to p7, colors u1 to u11.]

    R