Introduction to Bioinformatics Dot Plots. One of the simplest and oldest methods for sequence...

Introduction to Bioinformatics

Dot Plots

Dot Plots

• One of the simplest and oldest methods for sequence alignment

• Visualization of regions of similarity
– Assign one sequence to the horizontal axis
– Assign the other to the vertical axis
– Place a dot in each cell where the two symbols match
– Diagonal lines indicate contiguous regions of identity

Simple Example

• Construct a simple dot plot for

GCTGAA
GCTAA

• One sequence goes horizontally, the other vertically
• Mark boxes with matched horizontal and vertical symbols (* marks a match, . marks no match)
• Look for diagonal(s)

Alignment:
GCTGAA
GCT-AA

  G C T G A A
G * . . * . .
C . * . . . .
T . . * . . .
A . . . . * *
A . . . . * *
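The construction above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original slides: the function names `dot_plot` and `render` are made up for this example, and it assumes GCTGAA runs horizontally and GCTAA vertically, as in the grid.

```python
def dot_plot(horizontal, vertical):
    """Return a grid (list of rows) that is True wherever the two symbols match."""
    return [[h == v for h in horizontal] for v in vertical]


def render(horizontal, vertical, grid):
    """Format the grid as text: '*' for a match, '.' for no match."""
    lines = ["  " + " ".join(horizontal)]  # header row: the horizontal sequence
    for symbol, row in zip(vertical, grid):
        cells = " ".join("*" if cell else "." for cell in row)
        lines.append(symbol + " " + cells)
    return "\n".join(lines)


# Hypothetical usage with the two short sequences from this slide.
grid = dot_plot("GCTGAA", "GCTAA")
print(render("GCTGAA", "GCTAA", grid))
```

Every cell is compared independently, so the cost grows with the product of the two sequence lengths. That is fine for a worksheet but already hints at why long sequences produce crowded plots.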

Another Example

• Construct a simple dot plot for

GCTAGTCAGATCTGACGCTA
GATGGTCACATCTGCCGC

A long stretch of nearly identical residues is revealed starting at the fifth nucleotide of each sequence (GTCA-ATCTG-CGC).
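One way to check this observation is to walk the single diagonal that starts at the fifth nucleotide of both sequences and print the symbols that agree. The sketch below is illustrative only: the helper name `diagonal` is hypothetical, and it assumes the two sequences are GCTAGTCAGATCTGACGCTA and GATGGTCACATCTGCCGC.

```python
def diagonal(seq_a, seq_b, start_a, start_b):
    """Walk one diagonal of the dot plot from (start_a, start_b).

    Returns the shared symbol at each matching position and '-' at each
    mismatch, stopping when either sequence runs out.
    """
    pairs = zip(seq_a[start_a:], seq_b[start_b:])
    return "".join(a if a == b else "-" for a, b in pairs)


seq_a = "GCTAGTCAGATCTGACGCTA"  # assumed first sequence
seq_b = "GATGGTCACATCTGCCGC"    # assumed second sequence

# The fifth nucleotide is index 4 when counting from zero.
print(diagonal(seq_a, seq_b, 4, 4))  # prints GTCA-ATCTG-CGC
```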

Sliding Window and Cutoff

• Problem
– Plot becomes noisy when comparing large, similar sequences

• Solution
– Sliding window (size = w)
– Cutoff (value = v)
– Consider w nucleotides at a time
– When at least v matches occur in a window, place a dot on the space where the window starts
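The window-and-cutoff rule can be sketched as follows. This is a minimal sketch under stated assumptions, not the slides' own implementation: the function name `windowed_dot_plot` is hypothetical, the window is assumed to run along the diagonal (position k of one window is compared with position k of the other), and windows that would run past the end of either sequence are simply skipped.

```python
def windowed_dot_plot(horizontal, vertical, w, v):
    """Return the set of (row, col) window starts that earn a dot.

    A dot is placed at (row, col) when at least v of the w aligned
    positions starting there match. Rows index the vertical sequence,
    columns the horizontal one.
    """
    dots = set()
    for row in range(len(vertical) - w + 1):
        for col in range(len(horizontal) - w + 1):
            matches = sum(
                horizontal[col + k] == vertical[row + k] for k in range(w)
            )
            if matches >= v:
                dots.add((row, col))
    return dots


# Hypothetical usage: the longer example sequences with w = 4 and v = 3.
dots = windowed_dot_plot("GCTAGTCAGATCTGACGCTA", "GATGGTCACATCTGCCGC", 4, 3)
print(sorted(dots))
```

With w = 1 and v = 1 this reduces to the simple dot plot. Raising w and v removes isolated chance matches while keeping runs that survive an occasional mismatch.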

Example

• Same example with w = 4 and v = 3

• Compare to the previous plot. You make the call!

Worksheet

• w = 4 and v = 3

What else can it do (and how)?

• Gaps
• Inverse subsequence
• Repeats
• Palindrome
• Genome rearrangement
• Exon identification
• RNA structure prediction
• Nice tool for conceptualizing sequence-related algorithms