Dynamic programming

21
MARYAM BIBI FA12-BTY-011 TOPIC : DYNAMIC PROGRAMING SUBJECT : BIOINFIRMATICS

Transcript of Dynamic programming

MARYAM BIBI

FA12-BTY-011

TOPIC : DYNAMIC PROGRAMING

SUBJECT : BIOINFIRMATICS

Alignmentused to uncover homologies between sequences combined with phylogenetic studies can determine orthologous and paralogousrelationships

Global Alignments compares one whole sequence with other entire

sequence computationally expensive

Local Alignment uses a subset of a sequence and attempts to align it to subset of other sequences computationally less expensive

Dynamic Programming:

dynamic programing is solving complex

prblems by breaking them into a simpler

subproblems.

Problem can be divided into many smaller

parts.

Needleman and Wunsch were the first to

propose this method.

Needleman and Wunsch describes general

algorithm for sequence aignment.

Maximize a score of similarity to give maximun

match.

Maximun match= largest number of nucleotides

that can be match with others.

That want to quantify sequence similarity

between two sequences.

Dynamic programming in bioinformatics

Dynamic programming is widely used in

bioinformatics for the tasks such as sequence

alignment, protein folding, RNA structure

prediction and protein-DNA binding.

First dynamic programming algorithms for

protein-DNA binding were developed in the 1970s

independently by Charles Delisi in USA

and Georgii Gurskii and Alexanderr

zasedatelev in USSR.

Dynamic Programming in sequence alignment

There are three steps in dynamic programing.

1. initialization.The first step in the global alignment dynamic programming approach is to create a matrix with M + 1 columns and N + 1 rows where M and N correspond to the size of the sequences to be aligned.

2. Matrix filling(scoring)We fill the matrix with highest possible score.

To align with diagnol ( align in next position.)

Allign in off-diagonal requires inserion of crossponding gaps

3. Traceback and aligningMove from last corner and follow arrow.

Global alignment via dynamic programing

1st column and 1st row will be empty.

Fill 1st block with zero

Then fill 1st row and 1st coulmn with gap penality multiples.

While filling the matrix there are three possible values

horizental: score+ gap penality

Verticle: score+ gap penality

Diagonal: score+(match/mismatch)

We have to write max score from these values in a cell

Let ,

match = + 1

mismatch= -1

gap penality= -2

Lets

Seq# 1 A A A C

Seq# 2 A G C -

A A T C

0 -2 -4 -6 -8

A -2( 1) ( -4)

1(-4)

-1 -6

-1-1

-3 -5

G -4-1 0 -2 -4

C -6-3 -2 -1 -1

Backward tracking

In backward tracking we have to move

from last cell( lower corner) and follows

arrow from which cell the current cell’s

values come from and go ahead.

A A T C

0 -2 -4 -6 -8

A -2 1 -1 -3 -5

G -4-1 0 -2 -4

C -6-3 -2 -1 -1

A A T C

0 -2 -4 -6 -8

A -2 1 -1 -3 -5

G -4-1 0 -2 -4

C -6-3 -2 -1 -1

Path 1

Path 2

Backtracking

Now we have to allign this sequence.

For alligning there are 2 rules.

1. If the value come from column we will have

to write two sequences

2. If value come from horizontal or vertical

then we will have to write perpendicular

and add gap to other side.

Backtracking

Here we have two paths. So we will get two

possible sequences.

We will write sequence from 3’-end.

Path 1:

Seq#1 A A T C

Seq#2 - A G C

Backtracking

Path 2

Seq# 1 A A T C

Seq# 2 A G - C

Hense , we do global allignment in this way.

Local alignment via dynamic programing

Algorithim is same as in global alignment,

but there are some changes.

We fill 1st column and 1st row with zero.

If the value comes in negative number than

it is replaced by zero.

Backtracking will be start from maximun

value.

Local alignment

Let,

Match = 1

Mismatch = 0

Gap penality= 0

And sequence#1 GAATTCAGTTA

Squence#2 GGATCGA

G A A T T C A G T T A

0 0 0 0 0 0 0 0 0 0 0 0

G 0 1 1 1 1 1 1 1 1 1 1 1

G 0 1 1 1 1 1 1 1 2 2 2 2

A 0 1 2 2 2 2 2 2 2 2 2 3

T 0 1 2 2 3 3 3 3 3 3 3 3

C 0 1 2 2 3 3 4 4 4 4 4 4

G 0 1 2 2 3 3 4 4 5 5 5 5

A 0 1 2 3 3 3 3 4 5 5 5 6

Bactracking

After the matrix fill step, the maximum alignment

score for the two test sequences is 6. The traceback

step determines the actual alignment(s) that result

in the maximum score.

G A A T T C A G T T A

0 0 0 0 0 0 0 0 0 0 0 0

G 0 1 1 1 1 1 1 1 1 1 1 1

G 0 1 1 1 1 1 1 1 2 2 2 2

A 0 1 2 2 2 2 2 2 2 2 2 3

T 0 1 2 2 3 3 3 3 3 3 3 3

C 0 1 2 2 3 3 4 4 4 4 4 4

G 0 1 2 2 3 3 4 4 5 5 5 5

A 0 1 2 3 3 3 3 4 5 5 5 6

Backtracking

Rule will be same for this as in global

alignment

Sequence alignment :

Seq#1 G A A T T C A G T TA

Seq#2 G A - T C – G - - A

So in this way we align the sequence using

dynamic programing.