Plenary Speaker slides at the 2016 International Workshop on Biodesign Automation

IWBDA 2016 - Newcastle Upon Tyne. Accelerating Synthetic Biology via Software and Hardware Advances. Prof. Natalio Krasnogor, Interdisciplinary Computing and Complex BioSystems (ICOS) Research Group, Centre for Bacterial Cell Biology, Centre for Synthetic Biology and the Bioeconomy, Newcastle University. Natalio.Krasnogor@newcastle http://homepages.cs.ncl.ac.uk/natalio.krasnogor/ twitter: @NKrasnogor. Tuesday, 6 September 16

Transcript of Plenary Speaker slides at the 2016 International Workshop on Biodesign Automation

IWBDA 2016 - Newcastle Upon Tyne

Accelerating Synthetic Biology via Software and Hardware Advances

Prof. Natalio Krasnogor

Interdisciplinary Computing and Complex BioSystems (ICOS) Research Group
Centre for Bacterial Cell Biology
Centre for Synthetic Biology and the Bioeconomy
Newcastle University

Natalio.Krasnogor@newcastle
http://homepages.cs.ncl.ac.uk/natalio.krasnogor/
twitter: @NKrasnogor

Tuesday, 6 September 16


Outline

• Computational & Hardware support for designing and manufacturing Combinatorial DNA at your Desk

• Machine Intelligence for Synthetic Biology

• Conclusions

• Different scales require different “programming languages”, e.g. DNALD, SBOL, IBL, etc., for modularity, hierarchical abstraction, reusability & standardisation across scales

• Microfluidics for writing DNA but also as a “wind-tunnel” on your desktop, e.g.:
  • to try out multiple designs and gather data
  • to optimise cell-free kits for ad-hoc applications
  • to combinatorially stress-test synthetic cell systems

• Machine Intelligence & data analytics across scales

Reading and Writing DNA at your Desk

• The study of biology has accelerated rapidly thanks to methods for massively parallel cell-free cloning and DNA sequencing in desktop next generation sequencing (NGS) machines
• The engineering of biology is still largely restrained by limitations of gene synthesis and cloning methodologies
• Off-the-shelf microfluidics is about to supercharge synthetic biology by:
  • increasing the throughput of gene synthesis
  • reducing cost through miniaturization
  • handling the complexity of more ambitious designs through autonomous liquid handling at source

Combinatorial DNA Synthesis on your Desktop

[Diagram: CAD/CAM workflow. Labels: Parts, Library Targets, Operators, Planner, Assembly Plan, Instrument Instructions, Programmable Order Polymerization (POP), Microfluidics Combinatorial Assembly of DNA (M-CAD), Microfluidics In Vitro Cloning (MIC)]

Key challenge is to enable precise design, editing and manufacturing of combinatorial DNA libraries at your desk.


and then find out what the heck just happened!?!?

A Programming Language for Sequences: DNALD (DNA Library Design)

A specification language that produces a set of target DNA sequences as a function of operations on a set of inputs

To maximise impact the specification process must be:
• user friendly and debuggable
• but expressively powerful enough to:
  • define non-trivial combinatorial constructs
  • communicate degrees of freedom
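As a rough illustration of a specification that maps operations on inputs to a set of target sequences, the sketch below expands a combinatorial design in plain Python. It is not DNALD syntax: the part names, the toy sequences and the `expand_library` helper are all hypothetical and stand in for whatever the real language provides.

```python
from itertools import product

# Hypothetical parts library: name -> DNA sequence (toy sequences, not real parts).
PARTS = {
    "promA": "TTGACA",
    "promB": "TATAAT",
    "rbs1": "AGGAGG",
    "rbs2": "AGGAGA",
    "gfp": "ATGGTG",
}

def expand_library(slots, parts=PARTS):
    """Return every target obtained by picking one part per slot and concatenating.

    `slots` is a list of lists of part names; each inner list holds the
    alternatives (the degrees of freedom) allowed at that position.
    """
    targets = {}
    for choice in product(*slots):
        name = "+".join(choice)
        targets[name] = "".join(parts[p] for p in choice)
    return targets

# Two promoters x two RBSs x one gene gives four library targets.
library = expand_library([["promA", "promB"], ["rbs1", "rbs2"], ["gfp"]])
for name, seq in library.items():
    print(name, seq)
```

Listing the alternatives per slot is one simple way to communicate degrees of freedom; the size of the library is the product of the slot sizes.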


DNA Library Designer with DNALD


Background validation

• evaluation constraints
• syntax errors
• error navigation
• errors marked

Suggests quick fixes

resolve names

correct indices


Search across projects

search results

navigate workspace

Compare differences between files and versions

duplicate each or every change

highlights insertions and deletions

Graphical Representation of Complex DNA Libraries

Assembly plan


Graphical Representation of Complex DNA Libraries

And Paired Visualisations

• Emphasises reuse with shared nodes and provides indication of library's combinatorial degree

• Every path from 5' to 3' is an output
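The statement that every path from 5' to 3' is an output can be sketched as a path enumeration over a small directed acyclic graph. The graph below, its node names and the `enumerate_outputs` helper are hypothetical; the sketch also assumes the library graph has no cycles.

```python
# Hypothetical library graph: each node is a DNA part, edges point 5' -> 3'.
# The shared "rbs" node is reused by both promoters.
GRAPH = {
    "5'": ["promA", "promB"],
    "promA": ["rbs"],
    "promB": ["rbs"],
    "rbs": ["gfp", "rfp"],
    "gfp": ["3'"],
    "rfp": ["3'"],
    "3'": [],
}

def enumerate_outputs(graph, start="5'", end="3'"):
    """Yield every start-to-end path, without the two terminal markers."""
    stack = [(start, [])]
    while stack:
        node, path = stack.pop()
        if node == end:
            yield path
            continue
        for nxt in graph[node]:
            # The end marker is not a part, so it is not added to the path.
            new_path = path if nxt == end else path + [nxt]
            stack.append((nxt, new_path))

outputs = sorted(enumerate_outputs(GRAPH))
for parts in outputs:
    print(" + ".join(parts))
```

The number of paths (four here) is the library's combinatorial degree, while the graph itself only needs five part nodes because of the shared node.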


How can DNALD be extended?

• Plug-ins could be added to add semantics to variants, e.g.:
  • different codon usage or codon tables for same protein sequence
  • different coded protein sequence with same physico-chemical properties
  • equivalent/reduced alphabets for contact number preservation
  • equivalent/reduced alphabets for solvent accessibility preservation

Automated Alphabet Reduction for Protein Datasets. BMC Bioinformatics, 2009, 10:6
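To make the codon-usage idea concrete, here is a minimal sketch of what such a variant plug-in might compute: all DNA sequences that encode the same short protein. The codon table is only an excerpt of the standard genetic code, and `codon_variants` is a hypothetical helper, not part of DNALD.

```python
from itertools import product

# Excerpt of the standard genetic code (amino acid -> synonymous DNA codons).
# Only the residues used in the toy example are listed.
SYNONYMOUS_CODONS = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
    "W": ["TGG"],
}

def codon_variants(protein):
    """Return all DNA sequences that encode `protein` under the table above."""
    choices = [SYNONYMOUS_CODONS[aa] for aa in protein]
    return ["".join(codons) for codons in product(*choices)]

variants = codon_variants("MKF")
print(len(variants), variants)
```

A real plug-in would presumably weight or filter these variants by a host-specific codon usage table rather than enumerate them all.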


How can DNALD be extended?

• Plug-ins could be added to use, e.g.:
  • Statistical or machine learning driven design of experiments

How can DNALD be extended?

Planning heuristics adaptable to other assembly protocols


EXAMPLES


Stem Cell Reprogramming (UKB)

Frank Edenhofer

Operons Rewiring (UEVE)

François Képès

From the DNA Library to the Synthesis Plan

• When O={+} & P=unrestricted ⇒ Planning problem

• Related computational problem bounded-depth min-cost string production (BDMSP) is NP-hard and APX-hard by reduction from vertex cover
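Because the planning problem above is hard in general, practical planners rely on heuristics. The sketch below is one simple greedy heuristic, similar in spirit to byte-pair encoding: it repeatedly concatenates the adjacent pair of parts that occurs most often across the library, so shared intermediates are built once and reused. It is an illustration only, not the planner from the talk; it gives no optimality guarantee and ignores any depth bound. The function name and the toy library are hypothetical.

```python
from collections import Counter

def greedy_plan(targets):
    """Plan binary concatenations ('+') for a library, reusing intermediates.

    `targets` is a list of tuples of part names. Returns the list of
    (left, right) concatenation steps in the order they would be performed.
    """
    seqs = [list(t) for t in targets]
    steps = []
    while any(len(s) > 1 for s in seqs):
        # Count adjacent pairs across all unfinished targets.
        pairs = Counter()
        for s in seqs:
            for a, b in zip(s, s[1:]):
                pairs[(a, b)] += 1
        # Most frequent pair first; ties are broken by name for determinism.
        (left, right), _ = max(pairs.items(), key=lambda kv: (kv[1], kv[0]))
        steps.append((left, right))
        merged = f"({left}+{right})"
        # Replace occurrences of the pair with the new intermediate.
        for s in seqs:
            i = 0
            while i < len(s) - 1:
                if s[i] == left and s[i + 1] == right:
                    s[i:i + 2] = [merged]
                i += 1
    return steps

toy_library = [("p1", "r1", "g"), ("p2", "r1", "g"), ("p1", "r2", "g")]
plan = greedy_plan(toy_library)
for left, right in plan:
    print(left, "+", right)
```

On this toy library the heuristic needs five concatenations instead of the six required when each three-part target is assembled independently, because the intermediate `(r1+g)` is shared by two targets.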


One Pot VS Generic Protocols

Microfluidic gene synthesis is advancing fast to:
• overcome the limitation of strictly assembling genes in one-pot reactions & accommodate a range of assembly methods
• be able to execute ad hoc gene synthesis via programmability over droplet routing
• enable the implementation of complex & parallel schemes (which are challenging to execute both manually and on liquid handling robots)
• accommodate different construction protocols
• be more reproducible

Zhou, X. et al. Microfluidic PicoArray synthesis of oligodeoxynucleotides and simultaneous assembling of multiple DNA sequences. Nucleic Acids Res., 32, 5409–5417. 2004

Tian, J. et al. Advancing high-throughput gene synthesis technology. Mol. Biosyst., 5, 714–722. 2009

Quan, J. et al. Parallel on-chip gene synthesis and application to optimization of protein expression. Nat. Biotechnol., 29, 449–452. 2011

Programmable Liquid Handling


What Can Be Done

on-chip:
• Synthesis of Genes de novo ==> POP assembly
• Construction of Rationally Designed (DNALD) Combinatorial Gene Libraries ==> M-CAD
• Cell-free cloning of assembled synthetic DNA ==> M-IC

off-chip:
• Sequenced validation
• Downstream (application) validation

Post-transcriptional regulation of Azurin, a bacterial QS-activated gene (Nottingham & Newcastle)

Koch, Heeb, Camara, Dubern, Krasnogor


Combinatorial Library Design (DNALD) & Construction (EWD): the Azurin example

• Bacteria regulate gene expression at the transcriptional and post-transcriptional level
• RsmA, a global post-transcriptional regulator, modulates the switch between acute and chronic infection (P. aeruginosa in cystic fibrosis)
• RsmA positively and negatively regulates target mRNAs by binding to mRNA secondary structures (stem loops, palindromic sequences)
• RsmA homologues (CsrA) are present in a variety of bacteria, Gram-positive and Gram-negative

It is postulated that RsmA positively regulates Azurin

Three hypothetical loops in the mRNA

2nd and 3rd AGGA are in the loop of the stem


M-CAD


Results

• Gel electrophoresis analysis of a representative set of 16 of the Azurin library targets shows that all constructs are of the expected size with no spurious assembly products

• Western blot from extracts of Pseudomonas aeruginosa expressing the azurin gene incubated with anti-azurin polyclonal antibodies

Outline

• Computational & Hardware support for designing and manufacturing Combinatorial DNA at your Desk

• Machine Intelligence for Synthetic Biology

• Conclusions

Synthetic Polymers For Controlling QS Dependent Phenotypes

Bacterial Sequestrant

Dual action

QS Quencher


10s of person-years!

On Rolls-Royces & Ford Ts: an analogy

Rolls-Royce:
• Hand-crafted
• + Comfortable
• + Reliable/Robust
• Faster
• + Expensive
• Selective

Ford T:
• Assembly Line Product
• - Comfortable
• - Reliable/Robust
• Slower
• Cheaper
• Ubiquitous/Popular

Modular models for SynBio design

http://www.virtualparts.org

What does the VPR do?

• Provides modular, composable, dynamic models of genetic components
• AND includes models of the upper layers of molecular biology they encode (mRNA, proteins, metabolites etc.)
• AND their interactions
• SBML and Rule Based
• Facilitates model-based design
• Supports automated design
  • e.g. Computational Intelligence
• Supports CAD tools and languages

G. Misirli, J. Hallinan, and A. Wipat, “Composable modular models for synthetic biology,” ACM J. Emerging Technologies in Computing Systems, vol. 11, iss. 3, pp. 1-19, 2014

Synthetic Biology Open Language (SBOL)

• Synthetic biology standard (currently Version 2.0):
  • Designed to allow for the exchange of descriptions of genetic parts, devices, modules, and systems.
  • Facilitates storage of genetic designs in repositories.
  • Allows for designs of genetic parts and systems to be embedded in publications.
• SBOL can be used to create workflows between different tools

Galdzicki et al., Nature Biotechnology (2014)

Six independent groups collaborated on the design of a set of genetic toggle switches using several SBOL enabled tools.

An Environment for Augmented Biodesign Using Integrated Data Resources. James McLaughlin, Goksel Misirli, Matthew Pocock, and Anil Wipat. IWBDA 2016

User

create data-informed design

Data augmented design

AmBiT

Data enrichment: BLAST, EMBOSS, database cross refs

Other SBOL Stack instances

Meta-Stochastic Simulations

ssapredict - Web application using classifiers as a tool for biologists to deduce the best stochastic simulation algorithm for their model

• User simply clicks to upload stochastic model in SBML format
• Fast model property analysis is performed (C++ and igraph)
• Algorithm prediction performed using biomodels analysis (Linear SVC using python sklearn)
• Results displayed. User can then download preconfigured simulator to execute their model

Automated Model Analysis for Simulations

Reaction & species dependency graphs generated from models

Clocks identify fast to compute properties
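As an illustration of the kind of graph referred to above, the sketch below builds a reaction dependency graph for a toy gene-expression model and derives two cheap properties from it. The model, the helper name and the chosen properties are assumptions for the example, and stoichiometry is simplified to presence or absence of a species.

```python
# Hypothetical toy model: reaction name -> (reactants, products).
REACTIONS = {
    "transcription": ({"gene"}, {"gene", "mRNA"}),
    "translation": ({"mRNA"}, {"mRNA", "protein"}),
    "mRNA_decay": ({"mRNA"}, set()),
    "protein_decay": ({"protein"}, set()),
}

def reaction_dependency_graph(reactions):
    """Map each reaction to the reactions whose propensity it can change.

    Reaction j depends on reaction i when i changes the amount of a species
    that is a reactant of j.
    """
    graph = {}
    for name_i, (reactants_i, products_i) in reactions.items():
        # Species whose copy number changes when reaction i fires
        # (simplification: a species on both sides is treated as unchanged).
        changed = reactants_i ^ products_i
        graph[name_i] = sorted(
            name_j
            for name_j, (reactants_j, _) in reactions.items()
            if changed & reactants_j
        )
    return graph

graph = reaction_dependency_graph(REACTIONS)
n_edges = sum(len(dependents) for dependents in graph.values())
mean_out_degree = n_edges / len(graph)
print(graph)
print(n_edges, mean_out_degree)
```

Properties of this kind are cheap to compute and could serve as input features for a classifier that picks a simulation algorithm.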


Model analysis


Automated Model Analysis for Formal Verification

• Model checking exhaustively verifies whether a property holds for a model of a system. Statistical model checking (SMC) integrates simulation with model checking by generating and verifying a number of simulation paths to determine an “approximate correctness” of queried properties.

• The machine learning method for selecting the most appropriate stochastic simulation algorithm (SSA) has been extended to the selection of statistical model checkers (SMCs).

• However, there are intrinsic differences between simulation algorithms and model checkers; model checkers require both the model & property specifications.

• Our methodology is illustrated for frequently used properties in the literature, called property patterns.

In collaboration with Prof. M. Gheorghe, Dr. Savas Konur (Bradford University) & Mehmet E. Bakir (Sheffield University)
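The statistical model checking idea described on this slide, checking a property on many simulated paths and reporting the fraction that satisfy it, can be sketched as follows. The birth-death model, the parameter values and both function names are assumptions chosen for the example; a real SMC tool would also report confidence bounds.

```python
import random

def simulate_birth_death(birth, death, t_end, rng, x0=0):
    """Gillespie simulation of a birth-death process; returns visited copy numbers."""
    t, x = 0.0, x0
    visited = [x]
    while True:
        rate_birth, rate_death = birth, death * x
        total = rate_birth + rate_death
        if total == 0:
            break  # nothing can fire any more
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Pick which reaction fires, proportionally to its rate.
        x += 1 if rng.random() * total < rate_birth else -1
        visited.append(x)
    return visited

def estimate_eventually(threshold, n_paths=1000, seed=0, **model):
    """Estimate P(eventually x > threshold) as the fraction of satisfying paths."""
    rng = random.Random(seed)
    hits = sum(
        any(x > threshold for x in simulate_birth_death(rng=rng, **model))
        for _ in range(n_paths)
    )
    return hits / n_paths

p = estimate_eventually(threshold=5, birth=1.0, death=0.1, t_end=20.0)
print(f"estimated probability: {p:.3f}")
```

More paths give a tighter estimate at a higher simulation cost, which is one reason the choice of checker and algorithm matters for performance.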


Verification Patterns

• Patterns are frequently used property types for querying features of models (e.g., something is always the case, something will eventually be the case)
• Below are 8 frequently used patterns represented in natural language and using existing temporal logic operators

Fastest model checkers


SMCs Prediction

• The SSA prediction method has been extended by allowing parallel edges in species and reaction dependency graphs and by adding some non-graph properties, such as the number of updated variables involved in a reaction (min, mean, max and sum of the update values).

• Support Vector Machine (SVM) prediction of the fastest SMC is presented below.

Pattern            Accuracy
Eventually         0.945
Always             0.927
Follows            0.961
Precedes           0.967
Never              0.942
Steady-State       0.939
Until              0.941
Infinitely-Often   0.961

Outline

• Computational & Hardware support for designing and manufacturing Combinatorial DNA at your Desk

• Machine Intelligence for Synthetic Biology

• Conclusions

Synthetic Biology Life Cycle

Design
• domain specific language for synthetic biology
• SB entities (genes, proteins, promoters) first class entities
• implemented as Eclipse RCP

Simulation
• emphasis on high performance
• 9 different stochastic simulation algorithm variants
• automated algorithm selection
• MPI support

Verification
• quasi-natural language for definition of properties
• automatic translation into temporal logics
• automated algorithm selection

VERIFY [ GFP > 0 uM ] EVENTUALLY HOLDS
VERIFY [ GFP > 0 uM ] ALWAYS HOLDS
VERIFY [ GFP > 2*RFP ] NEVER HOLDS

Biocompilation
• links to sequence repositories
• design completion with terminators, RBS, spacers, ...
• consideration of custom constraints

GTATAATTACGGCTACAATGCGCCGTTATT
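The translation of the VERIFY statements on this slide into temporal logic might look roughly like the sketch below. The mapping to PRISM-style queries (F for eventually, G for always) is an assumption, as are the regular expression, the `translate` helper and the decision to drop the unit suffix; the actual tool's grammar and target logics are not shown in the slides.

```python
import re

# Assumed mapping from the quasi-natural-language patterns to PRISM-style
# probabilistic temporal logic (F = eventually, G = always).
TEMPLATES = {
    "EVENTUALLY HOLDS": "P=? [ F ({expr}) ]",
    "ALWAYS HOLDS": "P=? [ G ({expr}) ]",
    "NEVER HOLDS": "P=? [ G !({expr}) ]",
}

STATEMENT = re.compile(r"^VERIFY \[ (?P<expr>.+) \] (?P<pattern>[A-Z ]+)$")

def translate(statement):
    """Translate one VERIFY statement into a temporal-logic query string."""
    match = STATEMENT.match(statement.strip())
    if match is None or match["pattern"] not in TEMPLATES:
        raise ValueError(f"unsupported statement: {statement!r}")
    # Drop the unit suffix; this sketch passes plain numbers to the logic.
    expr = re.sub(r"\s*uM\b", "", match["expr"])
    return TEMPLATES[match["pattern"]].format(expr=expr)

for line in [
    "VERIFY [ GFP > 0 uM ] EVENTUALLY HOLDS",
    "VERIFY [ GFP > 0 uM ] ALWAYS HOLDS",
    "VERIFY [ GFP > 2*RFP ] NEVER HOLDS",
]:
    print(translate(line))
```

Keeping the pattern-to-template table separate from the parser makes it straightforward to add further patterns or to target a different logic.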


Synthetic Biology Life Cycle

Design, Simulation, Verification, Biocompilation

Data Analytics & Machine Intelligence

“Wind Tunneling” via desktop microfluidics

Synthetic Biology Life Cycle

[Diagram build: the following data bundle appears repeatedly around the life cycle]

• SBOL files
• DSL files
• Computational logs
• Engineering Protocols
• Experimental logs (seq, proteomics, metabolomics, optical, etc.)
• etc.

Synthetic Biology Life Cycle disrupted by machine learning, data analytics and peer-to-peer data-driven bio-manufacturing

so we can finally find out what the heck just happened!?!?

Like “Neural Grafting” for BioRobots

Acknowledgements

My colleagues at the ICOS and CSBB in Newcastle

Prof. A. Wipat (Newcastle U.)
Dr. M. Gheorghe (Bradford U.)
Dr. J. Bacardit (Newcastle U.)
Prof. P. Wright (Newcastle U.)
Prof. C. Alexander (U. Nottingham)
Dr. F. Fernandez-Trillo (U. Birmingham)
Prof. M. Camara (U. Nottingham)
Dr. S. Heeb (U. Nottingham)
Dr. J. Dubern (U. Nottingham)
Prof. C. Biggs (U. Sheffield)
Dr. S. Konur (Bradford U.)
Dr. S. Kalvala (Warwick U.)
Dr. C. Ladrou (Warwick U.)
Dr. C. Delattre (Illumina)
Dr. A. Rivald (Illumina)
Prof. E. Shapiro (Weizmann Institute)
Dr. T. Ben Yehezquel (Weizmann Institute)
Prof. U. Feigel (Weizmann Institute)

EP/N031962/1

EP/J004111/2

EP/D021847/2

EP/I031642/2

BB/F01855X/1

BB/D019613/1

5 Years Research Managing Director for a new £8M grant:

http://tinyurl.com/h99vl3h

closing date: 5/September/2016
