Global and local alignment (bioinformatics)

05/23/2022 Introduction to Bioinformatics

Transcript of Global and local alignment (bioinformatics)

Page 1: Global and local alignment (bioinformatics)

05/02/2023 Introduction to Bioinformatics

Page 2: Global and local alignment (bioinformatics)

Group Members

151-15-255

151-15-453

151-15-240

151-15-245

Page 3: Global and local alignment (bioinformatics)


Page 4: Global and local alignment (bioinformatics)


Why align sequences?

• Useful for discovering
  – Functional,
  – Structural, and
  – Evolutionary relationships

• For example
  – To find whether two (or more) genes or proteins are evolutionarily related to each other
  – Two proteins with similar sequences will probably be structurally or functionally similar

Page 5: Global and local alignment (bioinformatics)


Global vs Local Alignment

• Global Alignment
  – A general global alignment technique is the Needleman–Wunsch algorithm, which is based on dynamic programming.
  – Attempts to align the sequences over their entire length
  – Suitable for similar sequences of roughly equal length

• Local Alignment
  – Local alignments are more useful for dissimilar sequences that are suspected to contain regions of similarity or similar sequence motifs within their larger sequence context.
  – Stretches of sequence with the highest density of matches are aligned
  – Suitable for sequences that are partially similar, differ in length, or contain conserved regions

Page 6: Global and local alignment (bioinformatics)


Page 7: Global and local alignment (bioinformatics)


Global Alignment

The Needleman–Wunsch algorithm obtains the optimal global alignment under a linear gap cost by computing a score for each position of the aligned sequences.

It is based on the dynamic programming technique.

For two sequences of lengths m and n, we define a matrix of dimensions (m+1) x (n+1).
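As a minimal sketch of the matrix described here, the Python snippet below allocates the (m+1) x (n+1) table and presets its first row and column with cumulative gap costs. The function name `init_global_matrix` is hypothetical, and the default gap penalty of -1 is an assumption borrowed from the worked example later in the deck.

```python
def init_global_matrix(m, n, gap=-1):
    """Return an (m+1) x (n+1) matrix with the first row and column preset.

    gap is a linear gap penalty (assumed -1, as in the deck's worked example).
    """
    matrix = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        matrix[i][0] = i * gap  # i characters aligned against gaps only
    for j in range(1, n + 1):
        matrix[0][j] = j * gap  # j characters aligned against gaps only
    return matrix


matrix = init_global_matrix(7, 6)
print(len(matrix), len(matrix[0]))  # 8 7
```

The extra row and column stand for the empty prefix of each sequence, which is why the table is one larger than the sequences in both directions.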

Page 8: Global and local alignment (bioinformatics)

Global Alignment

Three steps in dynamic programming:
• Initialization
• Matrix fill (scoring)
• Traceback (alignment)

Smith–Waterman algorithm

Page 9: Global and local alignment (bioinformatics)


Sequences:
S: ATTATCT
T: TTTCTA

Match Score = +2, Mismatch Score = 0, Gap Penalty = -1

Scoring matrix (rows: S, columns: T):

      _   T   T   T   C   T   A
  _   0  -1  -2  -3  -4  -5  -6
  A  -1   0  -1  -2  -3  -4  -3
  T  -2   1   2   1   0  -1  -2
  T  -3   0   3   4   3   2   1
  A  -4  -1   2   3   4   3   4
  T  -5  -2   1   4   3   6   5
  C  -6  -3   0   3   6   5   6
  T  -7  -4  -1   2   5   8   7

Each cell (i, j) is computed from its three neighbours:

  (i-1, j-1)   (i-1, j)
  (i, j-1)     (i, j)
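The fill step for this example can be sketched in Python as follows. The function name `needleman_wunsch_matrix` is hypothetical. The scoring defaults are the ones stated on the slide (match +2, mismatch 0, gap -1), with S indexing the rows and T the columns.

```python
def needleman_wunsch_matrix(s, t, match=2, mismatch=0, gap=-1):
    """Fill the global alignment scoring matrix (rows: s, columns: t)."""
    m, n = len(s), len(t)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    # Initialization: the first row and column hold cumulative gap penalties.
    for i in range(1, m + 1):
        H[i][0] = i * gap
    for j in range(1, n + 1):
        H[0][j] = j * gap
    # Matrix fill: each cell takes the best of its three neighbours.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            pair = match if s[i - 1] == t[j - 1] else mismatch
            H[i][j] = max(
                H[i - 1][j - 1] + pair,  # diagonal: align s[i-1] with t[j-1]
                H[i - 1][j] + gap,       # up: gap in t
                H[i][j - 1] + gap,       # left: gap in s
            )
    return H


H = needleman_wunsch_matrix("ATTATCT", "TTTCTA")
print(H[-1][-1])  # 7
```

The bottom-right cell holds the score of the best global alignment, which is 7 for these two sequences.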

Page 10: Global and local alignment (bioinformatics)


Traceback on the same scoring matrix (rows: S, columns: T): the path is followed from the bottom-right cell (score 7) back to the top-left cell (score 0).

Page 11: Global and local alignment (bioinformatics)


Optimal Alignment:

S: A T T A T C T -
T: - T T - T C T A

No. of matches = 5
No. of gaps = 3

(5 x 2) + (3 x -1) = 7
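To check the score of this alignment, a small Python sketch can walk the two gapped strings column by column. The helper name `score_alignment` is hypothetical, and it assumes gaps are written as "-" and that the scoring is the one used for this example (match +2, mismatch 0, gap -1).

```python
def score_alignment(top, bottom, match=2, mismatch=0, gap=-1):
    """Sum the column scores of two gapped strings of equal length."""
    if len(top) != len(bottom):
        raise ValueError("aligned strings must have the same length")
    total = 0
    for a, b in zip(top, bottom):
        if a == "-" or b == "-":
            total += gap       # a character aligned against a gap
        elif a == b:
            total += match     # identical characters
        else:
            total += mismatch  # different characters
    return total


print(score_alignment("ATTATCT-", "-TT-TCTA"))  # 7
```

Five matching columns contribute 10 and three gap columns contribute -3, which agrees with the value in the bottom-right cell of the scoring matrix.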

Page 12: Global and local alignment (bioinformatics)


Page 13: Global and local alignment (bioinformatics)

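Matrix fill (scoring)

Match: 2, Mismatch: -1, Gap: -2

S/T   _   A   T   G   A   T   G   T   A   G
 _    0   0   0   0   0   0   0   0   0   0
 G    0   0   0
 A    0
 G    0
 A    0
 T    0
 G    0
 T    0
 G    0
 C    0

Match cell (row G, column G):
  diagonal: 0 + 2 = 2
  up:       0 + (-2) = -2
  left:     0 + (-2) = -2
  score = max(0, 2, -2, -2) = 2

Mismatch cell (for example row G, column A):
  diagonal: 0 + (-1) = -1
  up:       0 + (-2) = -2
  left:     0 + (-2) = -2
  score = max(0, -1, -2, -2) = 0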
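The single-cell rule illustrated here is the local (Smith–Waterman style) update, in which negative scores are replaced by 0. A minimal Python sketch is shown below. The function name `local_cell_score` and its argument order are hypothetical, and the defaults are the scores stated on the slide.

```python
def local_cell_score(diag, up, left, a, b, match=2, mismatch=-1, gap=-2):
    """Score one cell from its three neighbours, never dropping below 0."""
    pair = match if a == b else mismatch
    return max(0, diag + pair, up + gap, left + gap)


print(local_cell_score(0, 0, 0, "G", "G"))  # 2
print(local_cell_score(0, 0, 0, "G", "A"))  # 0
```

The floor at 0 is what lets a local alignment start afresh anywhere in the matrix instead of carrying penalties from unrelated prefixes.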

Page 14: Global and local alignment (bioinformatics)

Matrix fill (scoring)

Match: 2, Mismatch: -1, Gap: -2

S/T   _   A   T   G   A   T   G   T   A   G
 _    0   0   0   0   0   0   0   0   0   0
 G    0   0   0   2   0   0   2   0   0   2
 A    0   2   0   0   4   2   0   1   2   0
 G    0   0   1   2   2   3   4   2   0   4
 A    0   2   0   0   4   2   2   3   4   2
 T    0   0   4   2   2   6   4   4   2   3
 G    0   0   2   6   4   4   8   6   4   4
 T    0   0   2   4   5   6   6  10   8   6
 G    0   0   0   4   3   4   8   8   9  10
 C    0   0   0   2   3   2   6   7   7   8
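A hedged Python sketch of the complete local fill is given below. It assumes S = GAGATGTGC indexes the rows and T = ATGATGTAG the columns, as laid out in the table. The function name `smith_waterman_matrix` is hypothetical. Besides the matrix, it reports the highest score and every cell that holds it.

```python
def smith_waterman_matrix(s, t, match=2, mismatch=-1, gap=-2):
    """Fill the local alignment matrix; return it with the best score and its cells."""
    m, n = len(s), len(t)
    H = [[0] * (n + 1) for _ in range(m + 1)]  # first row and column stay 0
    best, best_cells = 0, []
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            pair = match if s[i - 1] == t[j - 1] else mismatch
            H[i][j] = max(
                0,                       # restart the alignment here
                H[i - 1][j - 1] + pair,  # diagonal
                H[i - 1][j] + gap,       # up
                H[i][j - 1] + gap,       # left
            )
            if H[i][j] > best:
                best, best_cells = H[i][j], [(i, j)]
            elif H[i][j] == best and best > 0:
                best_cells.append((i, j))
    return H, best, best_cells


H, best, cells = smith_waterman_matrix("GAGATGTGC", "ATGATGTAG")
print(best, cells)  # 10 [(7, 7), (8, 9)]
```

Two cells share the top score of 10, so the traceback can start from either one.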

Page 15: Global and local alignment (bioinformatics)

Trace back

On the same scoring matrix (Match: 2, Mismatch: -1, Gap: -2), the traceback starts from a cell holding the highest score, 10, and follows the path back until a cell with score 0 is reached.

Page 16: Global and local alignment (bioinformatics)

Alignment

G A T G T A G
| | | | |   |
G A T G T - G
2 2 2 2 2 -2 2

6 x 2 = 12, 1 x -2 = -2, total = 10

G A T G T
| | | | |
G A T G T
2 2 2 2 2

5 x 2 = 10, total = 10
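Both alignments can be recovered with a traceback sketch like the one below, written in Python under stated assumptions: the function name `local_alignment` and its `end` parameter are hypothetical, S = GAGATGTGC indexes the rows, T = ATGATGTAG indexes the columns, and ties are resolved by preferring the diagonal move, then up, then left. Other tie-breaking rules can yield other equally scored alignments.

```python
def local_alignment(s, t, end=None, match=2, mismatch=-1, gap=-2):
    """Return (aligned t, aligned s, score) for a local alignment ending at `end`."""
    m, n = len(s), len(t)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            pair = match if s[i - 1] == t[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + pair,
                          H[i - 1][j] + gap, H[i][j - 1] + gap)
    if end is None:
        # First cell with the maximum score, scanning row by row.
        end = max(((i, j) for i in range(m + 1) for j in range(n + 1)),
                  key=lambda cell: H[cell[0]][cell[1]])
    i, j = end
    top, bottom = [], []  # top collects characters of t, bottom those of s
    while i > 0 and j > 0 and H[i][j] > 0:
        pair = match if s[i - 1] == t[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + pair:   # diagonal move
            top.append(t[j - 1])
            bottom.append(s[i - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:      # up: gap in t
            top.append("-")
            bottom.append(s[i - 1])
            i -= 1
        else:                                   # left: gap in s
            top.append(t[j - 1])
            bottom.append("-")
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), H[end[0]][end[1]]


print(local_alignment("GAGATGTGC", "ATGATGTAG"))
# ('GATGT', 'GATGT', 10)
print(local_alignment("GAGATGTGC", "ATGATGTAG", end=(8, 9)))
# ('GATGTAG', 'GATGT-G', 10)
```

Starting from cell (7, 7) gives the gap-free alignment, while starting from cell (8, 9) gives the longer alignment with one gap. Both score 10.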

Page 17: Global and local alignment (bioinformatics)


Page 18: Global and local alignment (bioinformatics)
