Dot Plots


Page 1: Dot Plots

Dot Plots

Page 2: Dot Plots

DNA dot plots

Identification of regions of
– Similarity between two sequences
– Insertions-deletions: Introns
– Repetitive regions (self-self analysis)
– Inverted repeats

Page 3: Dot Plots

Repeats

• All DNA sequences contain repeats

Page 4: Dot Plots

Repeats

• All DNA sequences contain repeats

Page 5: Dot Plots

Window size

• Window size 1

Page 6: Dot Plots

Window size

• Window size 9
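To make the effect of the window size concrete, here is a minimal Python sketch that is not taken from the slides. The names `dot_plot` and `render`, the `min_matches` parameter, and the use of the exercise sequence CCTAAAGG as input are assumptions made for illustration. A dot is drawn wherever enough characters of two windows agree, so a larger window suppresses chance single-base matches.

```python
def dot_plot(seq1, seq2, window=1, min_matches=None):
    """Return the set of (i, j) window start positions that count as a dot.

    A dot is placed at (i, j) when at least `min_matches` of the `window`
    aligned characters starting at seq1[i] and seq2[j] are identical.
    """
    if min_matches is None:
        min_matches = window  # assumption: require a perfect window match
    dots = set()
    for i in range(len(seq1) - window + 1):
        for j in range(len(seq2) - window + 1):
            matches = sum(
                a == b
                for a, b in zip(seq1[i:i + window], seq2[j:j + window])
            )
            if matches >= min_matches:
                dots.add((i, j))
    return dots


def render(seq1, seq2, dots, window):
    """Draw the plot as text: seq1 runs across, seq2 runs down."""
    width = len(seq1) - window + 1
    height = len(seq2) - window + 1
    return "\n".join(
        "".join("*" if (i, j) in dots else "." for i in range(width))
        for j in range(height)
    )


seq = "CCTAAAGG"
for w in (1, 3):
    dots = dot_plot(seq, seq, window=w)
    print(f"window size {w}: {len(dots)} dots")
    print(render(seq, seq, dots, w))
```

With window size 1 every shared base produces a dot. With window size 3 and perfect matches required, only the main diagonal remains for this short sequence.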

Page 7: Dot Plots

Exercise

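[Figure: empty dot plot grid with Sequence 1 (CCTAAAGG) on the horizontal axis and Sequence 2 on the vertical axis, labelled GGAAATCC from top to bottom]

Practice for
a) window size 1
b) window size 3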

Page 8: Dot Plots

Exercise

[Figure: dot plot grid with Sequence 1 (CCTAAAGG) on the horizontal axis and Sequence 2 on the vertical axis, labelled GGAAATCC from top to bottom]

Window size 1

Identity

Page 9: Dot Plots

Exercise

[Figure: dot plot grid with Sequence 1 (CCTAAAGG) on the horizontal axis and Sequence 2 on the vertical axis, labelled GGAAATCC from top to bottom]

Window size 3

Not considered

Page 10: Dot Plots

Exercise

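[Figure: dot plot grid with Sequence 1 (CCTAAAGG) on the horizontal axis and Sequence 2 on the vertical axis, labelled GGAAATCC from top to bottom; the score 3 is entered in row 2]

Window size 3

GGA vs GGA = 3 / 3 identities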

Page 11: Dot Plots

Exercise

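[Figure: dot plot grid with Sequence 1 (CCTAAAGG) on the horizontal axis and Sequence 2 on the vertical axis, labelled GGAAATCC from top to bottom; the scores 3 and 2 are entered in rows 2 and 3]

Window size 3

GGA vs GAA = 2 / 3 identities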

Page 12: Dot Plots

Exercise

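[Figure: dot plot grid with Sequence 1 (CCTAAAGG) on the horizontal axis and Sequence 2 on the vertical axis, labelled GGAAATCC from top to bottom; the scores 3, 2 and 1 are entered in rows 2 to 4]

Window size 3

GGA vs AAA = 1 / 3 identities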

Page 13: Dot Plots

Exercise

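[Figure: dot plot grid with Sequence 1 (CCTAAAGG) on the horizontal axis and Sequence 2 on the vertical axis, labelled GGAAATCC from top to bottom; the scores 3, 2, 1 and 0 are entered in rows 2 to 5]

Window size 3

GGA vs AAT = 0 / 3 identities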

Page 14: Dot Plots

Exercise

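Sequence 1 (horizontal axis) against Sequence 2 (vertical axis); each cell is the number of identities in the window:

     C  C  T  A  A  A  G  G
G
G       0  0  0  1  2  3
A       0  0  1  2  3  2
A       0  1  2  3  2  1
A       0  1  3  2  1  0
T       1  3  1  1  0  0
C       3  1  0  0  0  0
C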

Window size 3
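The completed score matrix can be checked with a short Python sketch. This is an illustration under stated assumptions, not the method used to build the slide: the helper name `window_scores` is hypothetical, and Sequence 2 is assumed to be the same sequence as Sequence 1 read from bottom to top, which is inferred from the vertical axis labels.

```python
def window_scores(seq1, seq2, window=3):
    """Count identities between every pair of length-`window` windows.

    scores[j][i] compares seq1[i:i + window] with seq2[j:j + window].
    """
    n1 = len(seq1) - window + 1
    n2 = len(seq2) - window + 1
    return [
        [
            sum(a == b
                for a, b in zip(seq1[i:i + window], seq2[j:j + window]))
            for i in range(n1)
        ]
        for j in range(n2)
    ]


seq1 = "CCTAAAGG"
seq2 = "CCTAAAGG"  # assumption: self-comparison, as the axis labels suggest
scores = window_scores(seq1, seq2)

# Print the last window of sequence 2 first, so that sequence 2 runs
# from bottom to top as on the slide.
rows = ["".join(str(s) for s in row) for row in reversed(scores)]
print("\n".join(rows))
```

A threshold on these scores (for example, at least 2 of 3 identities) then decides which cells are drawn as dots.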

Page 15: Dot Plots

Introns

[Figure: dot plot of a gene against its mRNA]

Introns are spliced out in the mRNA
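Because the introns are missing from the mRNA, a gene-versus-mRNA dot plot shows the exons as diagonal segments that are shifted by the intron length. The sketch below illustrates this with made-up sequences; `exon1`, `intron`, `exon2` and the helper `diagonal_offsets` are hypothetical and chosen only to keep the example small.

```python
from collections import Counter


def diagonal_offsets(gene, mrna, window=6):
    """Count exact window matches on each diagonal.

    The diagonal is identified by (gene index - mRNA index).
    """
    counts = Counter()
    for i in range(len(gene) - window + 1):
        for j in range(len(mrna) - window + 1):
            if gene[i:i + window] == mrna[j:j + window]:
                counts[i - j] += 1
    return counts


# Hypothetical toy sequences, not real genes.
exon1 = "ATGGCCAAGT"
intron = "GTAAGTTTTCAG"
exon2 = "CTGACCGGATAA"

gene = exon1 + intron + exon2
mrna = exon1 + exon2  # the intron is spliced out

offsets = diagonal_offsets(gene, mrna)
for offset in sorted(offsets):
    print(f"diagonal offset {offset}: {offsets[offset]} matching windows")
```

The first exon lies on the main diagonal (offset 0), while the second exon lies on a diagonal shifted by the length of the intron.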

Page 16: Dot Plots

Protein dot plots

Page 17: Dot Plots

CLC Combined Workbench

Page 18: Dot Plots

Ankyrin repeat protein

Page 19: Dot Plots

HIV Long Terminal Repeats

Page 20: Dot Plots

Di-nucleotide repeats
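In a self-self plot, a di-nucleotide repeat appears as a block of parallel diagonals spaced two positions apart. The following sketch, which is an illustration and not part of the slides, counts exact window matches per diagonal for a made-up (CA)n sequence; the helper `diagonal_counts` and the sequence are hypothetical.

```python
from collections import Counter


def diagonal_counts(seq1, seq2, window=4):
    """Count exact window matches per diagonal (index in seq1 - index in seq2)."""
    counts = Counter()
    for i in range(len(seq1) - window + 1):
        for j in range(len(seq2) - window + 1):
            if seq1[i:i + window] == seq2[j:j + window]:
                counts[i - j] += 1
    return counts


repeat = "CA" * 5  # hypothetical di-nucleotide repeat: CACACACACA
counts = diagonal_counts(repeat, repeat)
for offset in sorted(counts):
    print(f"diagonal offset {offset:+d}: {counts[offset]} matching windows")
```

Only even offsets occur, because a window matches again after every full repeat unit of two bases.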

Page 21: Dot Plots

Repetitive regions

Page 22: Dot Plots

Exercise: Inverted repeats

Page 23: Dot Plots

Exercise: Inverted repeats

[Figure: dot plot grid with Sequence 1 (CCTAAAGG) on the horizontal axis and its reverse complement on the vertical axis, labelled GGATTTCC from top to bottom]

Make a dot plot of the sequence against its reverse complement. Diagonals now represent inverted repeats.

Window size 3
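The exercise can be worked through in a few lines of Python. This is a sketch under stated assumptions: the helper names `reverse_complement` and `window_matches` are hypothetical, and only perfect window matches are reported.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement each base, then reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]


def window_matches(seq1, seq2, window=3):
    """List (i, j) start positions where the two windows are identical."""
    return [
        (i, j)
        for i in range(len(seq1) - window + 1)
        for j in range(len(seq2) - window + 1)
        if seq1[i:i + window] == seq2[j:j + window]
    ]


seq = "CCTAAAGG"
rc = reverse_complement(seq)  # CCTTTAGG
hits = window_matches(seq, rc)
for i, j in hits:
    print(f"{seq[i:i + 3]} at position {i} matches the reverse complement at {j}")
```

Here CCT and AGG each match the reverse complement, because AGG is the reverse complement of CCT: the two ends of the sequence form an inverted repeat.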

Page 24: Dot Plots

Genome dot plots: inverted repeats

Analysis of a random sequence from Homo sapiens chromosome 7 reveals numerous short inverted repeats.

Page 25: Dot Plots

The human Alu sequence

A self-self plot reveals some repetitive regions.

Page 26: Dot Plots

The human Alu sequence

A plot of the Alu sequence against its reverse-complement reveals its inverted repeat (palindromic) nature, seen as the diagonal along the entire sequence length

Page 27: Dot Plots

WD-repeat proteins

[Figure panels: Identity matrix; Blosum45 matrix]
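The difference between identity scoring and substitution-matrix scoring can be sketched as follows. The scoring function `toy_similarity_score` is a deliberately simplified stand-in based on rough amino-acid groups; it is an assumption for illustration and does not contain Blosum45 values. The peptides and all helper names are hypothetical as well.

```python
# Rough physico-chemical groups; a toy stand-in, NOT the Blosum45 matrix.
GROUPS = ["ILVM", "FWY", "KRH", "DE", "STNQ", "AG", "C", "P"]
GROUP_OF = {aa: k for k, group in enumerate(GROUPS) for aa in group}


def identity_score(a, b):
    """Identity matrix: 1 for identical residues, otherwise 0."""
    return 1 if a == b else 0


def toy_similarity_score(a, b):
    """Reward identities and, to a lesser extent, similar residues."""
    if a == b:
        return 2
    if GROUP_OF[a] == GROUP_OF[b]:
        return 1
    return -1


def protein_dots(seq1, seq2, window, threshold, score):
    """List (i, j) positions whose summed window score reaches the threshold."""
    return [
        (i, j)
        for i in range(len(seq1) - window + 1)
        for j in range(len(seq2) - window + 1)
        if sum(score(a, b)
               for a, b in zip(seq1[i:i + window], seq2[j:j + window]))
        >= threshold
    ]


# Hypothetical peptides: no identical residues, only similar ones.
pep1 = "LKDVS"
pep2 = "IRELT"

identity_dots = protein_dots(pep1, pep2, 5, 4, identity_score)
similarity_dots = protein_dots(pep1, pep2, 5, 4, toy_similarity_score)
print("identity scoring:  ", identity_dots)
print("similarity scoring:", similarity_dots)
```

With identity scoring the two peptides give no dot, while similarity scoring detects the conservative substitutions. A real analysis would use an established matrix such as Blosum45 in place of the toy function.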

Page 28: Dot Plots

Conclusion

• Dot plots provide an intuitive view of sequence comparisons.

• The sliding window size is important.

• For proteins, substitution matrices can be used.

• Dot plots can reveal
– Repeats
– Insertions/deletions (such as introns)
– Inverted repeats