Biochemistry, computing in biology

Page 1: Biochemistry, computing in biology 1Introduction 2Theoretical background Biochemistry/molecular biology 3Theoretical background computer science 4History.

Biochemistry, computing in biology

Page 2

1 Introduction

2 Theoretical background Biochemistry/molecular biology

3 Theoretical background computer science

4 History of the field

5 Splicing systems

6 P systems

7 Hairpins

8 Detection techniques

9 Micro technology introduction

10 Microchips and fluidics

11 Self assembly

12 Regulatory networks

13 Molecular motors

14 DNA nanowires

15 Protein computers

16 DNA computing - summary

17 Presentation of essay and discussion

Course outline

Page 3

Recombination

Page 4

Recombination and crossover

Page 5

Recombination and crossover

Page 6

If no exchange of genes (i.e. of a phenotypic marker) occurs, the recombination event cannot be detected

Recombination and crossover

Page 7

Recombination and crossover

Page 8

Introduction to ciliates

Page 9

literature

Genome Gymnastics: Unique Modes of DNA Evolution and Processing in Ciliates. David M. Prescott, Nature Reviews Genetics

Computational power of gene rearrangement. Lila Kari and Laura Landweber, DIMACS Series in Discrete Mathematics and Theoretical Computer Science

Page 10

Very ancient (~2 × 10^9 years ago)

Very rich group (~10,000 genetically different organisms)

Very important from the evolutionary point of view

The ciliate

Page 11

DNA molecules in the micronucleus are very long (hundreds of kbp)

DNA molecules in the macronucleus are gene-size and short (average ~2000 bp)

The ciliate

Page 12

The ciliate

Page 13

Baldauf et al. 2000. Science 290:972.

The ciliate tree

Page 14

Urostyla grandis (bar: 50 µm)

Holosticha kessleri (bar: 100 µm)

Uroleptus sp. (bar: 100 µm)

Eschaneustyla sp. (bar: 25 µm)

Scrambled genes found: S. lemnae, O. trifallax, O. nova

Page 15

The ciliate

Page 16

The ciliate

Page 17

DAPI staining of the ciliate

Page 18

Nuclei

Micronucleus: the small nucleus containing a single copy of the genome that is used for sexual reproduction

Macronucleus: the large nucleus that carries up to several hundred copies of the genome and controls metabolism and asexual reproduction

Page 19

Prescott, 2000

Macronucleus

Micronucleus

Cutting, splicing, elimination, reordering, and amplification of DNA

Lifecycle of a ciliate

Page 20

The ciliate, meiosis

Page 21

Cell pairing

Meiosis and nuclear exchange

Nuclear fusion and duplication of the zygotic nucleus

Macronuclear development and nuclear degeneration

MIC

MAC

Modified from Larry Klobutcher & Carolyn Jahn Ann. Review Microbiology, 2002

Polytenization

Chromatid breakage

De novo telomere formation

The ciliate, reproduction

Page 22

Computing in ciliates

Page 23

Astounding feats of ‘DNA computing’ are routine in this ‘simple’ single-celled organism, a protozoan. In the initial micronucleus, DNA is ‘junky’ and scrambled, but…

…it reassembles itself in the proper sequence by means of computer-like acrobatics (unscrambling, throwing out genetic ‘junk’) in the macronucleus

The ciliate

Page 24

IES: internal eliminated segments

MDS: macronuclear destined sequences

MAC

MIC

Telomere Pointers

The complexity of spirotrich biology

Page 25

Splicing

Page 26

Fractioned genes

Page 27

Intervening non-coding DNA regions (IESs: internal eliminated segments) interrupt protein-coding sequences (MDSs: macronuclear destined sequences)

IESs are removed during macronuclear development

MDSs are unscrambled

Prescott, 2000

The complexity of gene scrambling

Page 28

Oxytricha nova

Actin I (Hogan et al., 2001)

DNA polymerase α (Landweber et al., 2000)

α-TBP (Prescott et al., 1998)

Scrambled genes: α-TBP, actin I, DNA pol α

Page 29

Prescott et al., 1998

Degree of scrambling in α-TBP

Page 30

Hogan et al., 2001

Unscrambling of actin I

Page 31

Landweber et al., 2000

Degree of scrambling in DNA pol α

Page 32

DNA folding and recombination, DNA pol α

Page 33

DNA folding and recombination

Page 34

DNA pol α: hairpin loop

Prescott, 2000

DNA folding and recombination, DNA pol α

Page 35

Prescott et al., 1998

Recombination, α-TBP

Page 36

(i) Isolate the micronuclear and macronuclear forms of the α-TBP gene

(ii) Compare the micronuclear and macronuclear gene structures (MDSs and IESs) to determine whether the gene is scrambled

(iii) Compare homologous MDSs and scrambling patterns in various stichotrich species (earlier diverging species vs later diverging species)

(iv) Trace a parsimonious evolutionary scrambling pathway

Tracing evolutionary scrambling
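Step (ii) can be illustrated with a minimal sketch. It assumes the micronuclear MDS order has already been read off as a list of MDS numbers, with a negative number marking an inverted MDS. This encoding, the function name is_scrambled and the example orders are illustrative assumptions, not data from the slides.

```python
def is_scrambled(mic_order):
    """Return True if the micronuclear MDS order is not simply 1, 2, ..., k.

    mic_order lists MDS numbers in the order they occur in the
    micronucleus; a negative number marks an inverted MDS
    (an encoding assumed for this sketch).
    """
    return list(mic_order) != list(range(1, len(mic_order) + 1))


# Hypothetical gene structures, for illustration only.
nonscrambled = [1, 2, 3, 4]
scrambled = [3, 4, -2, 1]  # MDS 2 is inverted and out of place

print(is_scrambled(nonscrambled), is_scrambled(scrambled))
```

A gene whose MDSs are only interrupted by IESs, but appear in macronuclear order and orientation, counts as non-scrambled under this definition.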

Page 37

Uroleptus sp.

Oxytrichidae and Paraurostyla weissei

Comparisons of scrambling complexity

Page 38

Figure labels: Oxytricha trifallax, Oxytricha nova, Stylonychia mytilus, Uroleptus sp., Paraurostyla weissei (node values: 100, 100, 100)

The evolution of recombination

Page 39

Figure labels: P. weissei, Uroleptus sp., Holosticha sp., O. trifallax, O. nova, S. mytilus

Evolutionary scrambling pathway

Page 40

Formal theory

Page 41

Ciliate computing

The process of gene unscrambling in hypotrichous ciliates represents one of nature’s ingenious solutions to the computational problem of gene assembly. With some essential genes fragmented into as many as 50 pieces, these organisms rely on a set of sequence and structural clues to detangle their coding regions.

For example, pointer sequences present at the junctions between coding and non-coding sequences permit reassembly of the functional copy. As the process of gene unscrambling appears to follow a precise algorithm or set of algorithms, the question remains: what is the actual problem being solved?

Page 42

Genomic copies of some protein-coding genes are obscured by intervening non-protein-coding DNA sequence elements (internally eliminated sequences, IES)

Protein-coding sequences (macronuclear destined sequences, MDS) are present in a permuted order and must be rearranged.

The problem in the cell

Page 43

By clever structural alignment…, the cell decides which sequences are IESs and MDSs, as well as which are guides.

After this decision, the process is simply sorting, O(n).

The decision process is unknown, but amounts to finding the correct path. It is the most costly step.

Assumption
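The O(n) claim can be made concrete with a short sketch. It assumes the costly decision has already been made, so every MDS carries its final index 1..n; the function name sort_labelled_mds and the three-fragment example are hypothetical.

```python
def sort_labelled_mds(fragments):
    """Place each (index, sequence) pair at its final position in one pass.

    Assumes the indices 1..n are already known and unique, which is
    the decision the slide identifies as the costly part.
    """
    ordered = [None] * len(fragments)
    for index, seq in fragments:
        ordered[index - 1] = seq  # one constant-time placement per fragment
    return ordered


# Hypothetical labelled MDSs in scrambled order.
fragments = [(3, "GGA"), (1, "ATG"), (2, "CCT")]
assembled = "".join(sort_labelled_mds(fragments))
print(assembled)
```

Because the labels are the integers 1..n, no comparisons are needed; each fragment is visited exactly once.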

Page 44

Either there is some as yet undiscovered “oracle” mechanism within the cell, or the cell simulates non-determinism.

The former solution lacks biological credibility, and the latter implies an exponential explosion in time and space.

What we want is a deterministic algorithm for applying the inter- and intramolecular recombination operations to descramble an arbitrary gene.

Ciliate computing

Page 45

The first proposed step in gene unscrambling (alignment, or combinatorial pattern matching) may involve searches through several possible matches, via either intramolecular or intermolecular strand associations.

This part could be similar to Adleman’s (1994) DNA solution of a directed Hamiltonian path problem.

Ciliate computing

Page 46

The second step (homologous recombination at aligned repeats) involves the choice of whether to retain the coding or the non-coding segment between each pair of recombination junctions.

This decision process could even be equivalent to solving an n-bit instance of a satisfiability problem, where n is the number of scrambled segments.

Ciliate computing

Page 47

We use our knowledge of the first step to develop a model for the guided homologous recombinations and prove that such a model has the computational power of a Turing machine, the accepted formal model of computation. This indicates that, in principle, these unicellular organisms may have the capacity to perform at least any computation carried out by an electronic computer.

Ciliate computing

Page 48

Assume the cell simply reconstructs the genes by matching up pointers.

Just one problem: pointer sequences are not unique. In fact, they may have multiplicities greater than 13.

The proposed solution to this was that the cell would simply try every possible combination of pointers until it found the right two.

Ciliate computing, the naïve model
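To see why non-unique pointers hurt the naïve model, the sketch below lists every occurrence of a pointer in a sequence and enumerates the candidate pairs the cell would have to try. The sequence, the pointer "ATG" and the helper names are made up for illustration.

```python
from itertools import combinations
from math import comb


def occurrences(sequence, pointer):
    """Return the start position of every occurrence of pointer (overlaps allowed)."""
    positions = []
    start = sequence.find(pointer)
    while start != -1:
        positions.append(start)
        start = sequence.find(pointer, start + 1)
    return positions


def candidate_pairs(sequence, pointer):
    """All unordered pairs of occurrences that a naive search would consider."""
    return list(combinations(occurrences(sequence, pointer), 2))


seq = "AATGCCCTGAATGTTTATGC"  # hypothetical micronuclear stretch
pairs = candidate_pairs(seq, "ATG")
print(occurrences(seq, "ATG"), len(pairs))

# A single pointer with multiplicity 13 already gives this many pairs:
print(comb(13, 2))
```

A pointer with multiplicity m yields m(m-1)/2 candidate pairs, and the choices for different pointers multiply, which is where the combinatorial cost of trying every combination comes from.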

Page 49

Relies on short repeat sequences to act as guides in homologous recombination events

Splints are analogous to edges in Adleman’s experiment

One example represents the solution of a 50-city Hamiltonian path (HP) problem (50 pieces reordered)

How the cell computes
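The Hamiltonian path analogy can be sketched as a search over fragments, where fragment A may be followed by fragment B when A's outgoing pointer equals B's incoming pointer. The fragment names, pointer labels and the function hamiltonian_paths are assumptions made for this sketch; it is a plain exhaustive search, not a model of what the cell actually does.

```python
def hamiltonian_paths(fragments):
    """Yield every ordering that uses each fragment once and chains pointers.

    fragments maps a name to (incoming_pointer, outgoing_pointer).
    Worst-case cost grows exponentially with the number of fragments.
    """
    names = list(fragments)

    def extend(path, remaining):
        if not remaining:
            yield list(path)
            return
        for nxt in sorted(remaining):
            # An edge exists when the outgoing pointer matches the incoming one.
            if fragments[path[-1]][1] == fragments[nxt][0]:
                yield from extend(path + [nxt], remaining - {nxt})

    for start in names:
        yield from extend([start], set(names) - {start})


# Hypothetical fragments; pointer "a" is deliberately not unique.
frags = {
    "M1": ("s", "a"),
    "M2": ("a", "b"),
    "M3": ("b", "a"),
    "M4": ("a", "e"),
}
paths = list(hamiltonian_paths(frags))
print(paths)
```

Even though pointer "a" occurs several times, only one ordering visits every fragment, which is the sense in which a correctly reordered gene corresponds to a Hamiltonian path.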

Page 50

Guided recombination system

uxwxv ⇒ uxv + •wx (•wx is a circular strand)

Formal model

Page 51

Context necessary for a recombination between repeats x:

(p, x, q) ~ (p’, x, q’)

Formal model

Page 52

Formal Language Model

uxwxv ⇒ uxv + •wx

where u = u’p, w = qw’ = w’’p’, v = q’v’

Intramolecular recombination: the guide is x; •wx is deleted from the original strand as a circle.

Intermolecular recombination: strand exchange.

This system has the power of a universal Turing machine (proven by Tom Head)

Formal model, splicing operation
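The two operations can be sketched on plain strings. This is a simplified illustration: it ignores the context condition (p, x, q) ~ (p’, x, q’), takes the positions of the two repeats as given, and represents the excised circle •wx by the string wx. The intermolecular form uxv + u’xv’ ⇒ uxv’ + u’xv is an assumed reading of "strand exchange", and all function names and sequences are hypothetical.

```python
def intramolecular(strand, x, first, second):
    """Apply uxwxv => uxv + (circular) wx at two given occurrences of x."""
    n = len(x)
    if strand[first:first + n] != x or strand[second:second + n] != x:
        raise ValueError("x must occur at both positions")
    if second < first + n:
        raise ValueError("the two occurrences of x must not overlap")
    u, w, v = strand[:first], strand[first + n:second], strand[second + n:]
    return u + x + v, w + x  # (linear strand, excised circle written as wx)


def intermolecular(s1, i, s2, j, x):
    """Strand exchange at x: uxv + u'xv' => uxv' + u'xv (assumed form)."""
    n = len(x)
    if s1[i:i + n] != x or s2[j:j + n] != x:
        raise ValueError("x must occur at both positions")
    return s1[:i + n] + s2[j + n:], s2[:j + n] + s1[i + n:]


# u = "AC", x = "TTT", w = "GGA", v = "CA"
linear, circle = intramolecular("ACTTTGGATTTCA", "TTT", 2, 8)
swapped = intermolecular("ACTTTCA", 2, "GGTTTAA", 2, "TTT")
print(linear, circle, swapped)
```

Only one copy of the guide x remains in the linear product; the other copy leaves with the circle.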

Page 53

Formal model, splicing operation

Page 54

Gene unscrambling algorithm

Page 55

Ciliate computing

Page 56

Micronucleus: cell mating

Macronucleus: RNA transcripts (expression)

Micro: I0 M1 I1 M2 I2 M3 … Ik Mk Ik+1

M = P1 N P2

Macro: permutation of (possibly rotated) M1, …, Mk, and I0, …, Ik+1 are removed

Gene assembly in ciliates
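A small sketch of this model, under stated assumptions: each MDS is written as a triple (P1, N, P2), the outgoing pointer P2 of one MDS is taken to equal the incoming pointer P1 of the next, and only one copy of each shared pointer is kept in the assembled gene. The segment encoding, the function name assemble and the sequences are illustrative, and MDS inversion is not handled.

```python
def assemble(micronuclear):
    """Drop IESs, order MDSs by index, and join them at shared pointers.

    micronuclear is a list of ("IES", seq) and ("MDS", index, (p1, n, p2))
    segments in micronuclear order (an encoding assumed for this sketch).
    """
    mdss = sorted((seg for seg in micronuclear if seg[0] == "MDS"),
                  key=lambda seg: seg[1])
    gene = ""
    previous_out = None
    for _, index, (p1, body, p2) in mdss:
        if previous_out is None:
            gene += p1  # the first MDS keeps its leading pointer
        elif p1 != previous_out:
            raise ValueError(f"pointer mismatch before MDS {index}")
        gene += body + p2  # the shared pointer is written only once
        previous_out = p2
    return gene


# Hypothetical micronuclear gene: I0 M2 I1 M1 I2 M3 I3
mic = [
    ("IES", "tt"),
    ("MDS", 2, ("CG", "AAA", "TA")),
    ("IES", "gg"),
    ("MDS", 1, ("AT", "GGG", "CG")),
    ("IES", "cc"),
    ("MDS", 3, ("TA", "CCC", "GC")),
    ("IES", "aa"),
]
print(assemble(mic))
```

The sketch sorts by a known index; in the cell that index is not given and must be recovered from the pointers themselves.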

Page 57

Molecular operators

Page 58

Molecular operators

Page 59

Molecular operators

Page 60

Molecular operators

Page 61

Molecular operators

Page 62

Molecular operators

Page 63

Molecular operators

Page 64

Molecular operators

Page 65

Molecular operators

Page 66

Molecular operators

Page 67

Molecular operators

Page 68

Molecular operators

Page 69

Molecular operators

Page 70

The pointer sequences must be in spatial proximity during unscrambling

Topology must be faithfully reproduced somehow

Pointers

Page 71

A recombination event attaches the minor locus to the end of the major locus

Relocation of a locus