Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene...

80
CAP 5510: Introduction to Bioinformatics CGS 5166: Bioinformatics Tools Giri Narasimhan ECS 254; Phone: x3748 [email protected] www.cis.fiu.edu/~giri/teach/BioinfS15.html

Transcript of Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene...

Page 1: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

CAP 5510: Introduction to BioinformaticsCGS 5166: Bioinformatics Tools

Giri Narasimhan ECS 254; Phone: x3748

[email protected] www.cis.fiu.edu/~giri/teach/BioinfS15.html

Page 2: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 2

Gene Expressionq Process of transcription and/or translation of a

gene is called gene expression. q Every cell of an organism has the same genetic

material, but different genes are expressed at different times.

q Patterns of gene expression in a cell is indicative of its state.

Page 3: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 3

Hybridizationq If two complementary strands of DNA or mRNA

are brought together under the right experimental conditions they will hybridize.

q A hybridizes to B ⇒  A is reverse complementary to B, or  A is reverse complementary to a subsequence of B.

q It is possible to experimentally verify whether A hybridizes to B, by labeling A or B with a radioactive or fluorescent tag, followed by excitation by laser.

Page 4: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 4

Measuring gene expressionq Gene expression for a single gene can be measured

by extracting mRNA from the cell and doing a simple hybridization experiment.

q Given a sample of cells, gene expression for every gene can be measured using a single microarray experiment.

Page 5: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 5

Microarray/DNA chip technology

q High-throughput method to study gene expression of thousands of genes simultaneously.

q Many applications: Genetic disorders & Mutation/polymorphism detection Study of disease subtypes Drug discovery & toxicology studies Pathogen analysis Differing expressions over time, between tissues, between drugs, across disease states

Page 6: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 6

Microarray Data

Gene Expression Level

Gene1

Gene2

Gene3

Page 7: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 7

Gene Chips

Page 8: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 8

DNA Chips & Images

Page 9: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 9

Page 10: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 10

Gene g

Probe 1 Probe 2 Probe N …

Page 11: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 11

Microarray/DNA chips (Simplified)

q Construct probes corresponding to reverse complements of genes of interest.

q Microscopic quantities of probes placed on solid surfaces at defined spots on the chip.

q Extract mRNA from sample cells and label them. q Apply labeled sample (mRNA extracted from cells) to every

spot, and allow hybridization. q Wash off unhybridized material. q Use optical detector to measure amount of fluorescence

from each spot.

Page 12: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 12

Affymetrix DNA chip schematic

www.affymetrix.com

Page 13: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 13

What’s on the slide?

Page 14: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 14

Microarrays: competing technologies

q Affymetrix & Agilent q Differ in:

method to place DNA: Spotting vs. photolithography Length of probe Complete sequence vs. series of fragments

Page 15: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 15

Sample Treated Sample(t1) Expt 1 Treated Sample(t2) Expt 2 Treated Sample(t3) Expt 3 … Treated Sample(tn) Expt n

Study effect of treatment over time

Page 16: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 16

http://www.arabidopsis.org/info/2010_projects/comp_proj/AFGC/RevisedAFGC/Friday/

Page 17: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 17

How to compare 2 cell samples with Two-Color Microarrays?

q mRNA from sample 1 is extracted and labeled with a red fluorescent dye.

q mRNA from sample 2 is extracted and labeled with a green fluorescent dye.

q Mix the samples and apply it to every spot on the microarray. Hybridize sample mixture to probes.

q Use optical detector to measure the amount of green and red fluorescence at each spot.

Page 18: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 18

q  Variations in cells/individuals q  Variations in mRNA extraction, isolation, introduction of dye, variation

in dye incorporation, dye interference q  Variations in probe concentration, probe amounts, substrate surface

characteristics q  Variations in hybridization conditions and kinetics q  Variations in optical measurements, spot misalignments, discretization

effects, noise due to scanner lens and laser irregularities q  Cross-hybridization of sequences with high sequence identity q  Limit of factor 2 in precision of results q  Variation changes with intensity: larger variation at low or high

expression levels

Sources of Variations & Experimental Errors

Need to Normalize data

Page 19: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 19

Clusteringq Clustering is a general method to study patterns in

gene expressions. q Several known methods:

 Hierarchical Clustering (Bottom-Up Approach)  K-means Clustering (Top-Down Approach)  Self-Organizing Maps (SOM)

Page 20: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 20

Hierarchical Clustering: Example

0 1 2 3 4 5 6 7 8 9

10

0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9

10

0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9

10

0 1 2 3 4 5 6 7 8 9 10

Page 21: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 21

A Dendrogram

Page 22: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 22

Hierarchical Clustering [Johnson, SC, 1967]

q Given n points in Rd, compute the distance between every pair of points

q While (not done) Pick closest pair of points si and sj and make them part of the same cluster. Replace the pair by an average of the two sij

Try the applet at: http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletH.html

Page 23: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 23

Distance Metricsq For clustering, define a distance function:

  Euclidean distance metrics

  Pearson correlation coefficient k=2: Euclidean Distance

kd

i

kiik YXYXD

/1

1)(),( !"

#$%

&−= ∑

=

!!"

#$$%

& −!!"

#$$%

& −= ∑

= y

i

x

id

i

xyYYXX

d σσρ

1

1-1 ≤ ρxy ≥ 1

Page 24: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 24

Page 25: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 25

Clustering of gene expressionsq Represent each gene as a vector or a point in d-

space where d is the number of arrays or experiments being analyzed.

Page 26: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 26

From Eisen MB, et al, PNAS 1998 95(25):14863-8

Clustering Random vs. Biological Data

Page 27: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 27

K-Means Clustering: Example

Example from Andrew Moore’s tutorial on Clustering.

Page 28: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 28

Start

Page 29: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 29

Page 30: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 30

Page 31: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 31

Start

End

Page 32: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 32

K-Means Clustering [McQueen ’67]

Repeat Start with randomly chosen cluster centers Assign points to give greatest increase in score Recompute cluster centers Reassign points

until (no changes) Try the applet at: http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/

AppletH.html

Page 33: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 33

Comparisonsq Hierarchical clustering

Number of clusters not preset. Complete hierarchy of clusters Not very robust, not very efficient.

q K-Means Need definition of a mean. Categorical data? More efficient and often finds optimum clustering.

Page 34: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 34

Functionally related genes behave similarly across experiments

Page 35: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 35

Self-Organizing Maps [Kohonen]q Kind of neural network. q Clusters data and find complex relationships

between clusters. q Helps reduce the dimensionality of the data. q Map of 1 or 2 dimensions produced. q Unsupervised Clustering q Like K-Means, except for visualization

Page 36: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 36

SOM Architecturesq  2-D Grid q  3-D Grid q  Hexagonal Grid

Page 37: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 37

SOM Algorithmq Select SOM architecture, and initialize weight

vectors and other parameters. q While (stopping condition not satisfied) do for each

input point x winning node q has weight vector closest to x. Update weight vector of q and its neighbors. Reduce neighborhood size and learning rate.

Page 38: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 38

SOM Algorithm Detailsq Distance between x and weight vector: q Winning node: q Weight update function (for neighbors):

q Learning rate:

iwx −

)]()()[,,()()1( kwkxixkkwkw iii −+=+ µ

ii

wxxq −=min)(

!!

"

#

$$

%

& −−= 2

2)(

exp)(),,( 0

σηµ

xqi rrkixk

Page 39: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 39

World Bank Statisticsq Data: World Bank statistics of countries in 1992. q 39 indicators considered e.g., health, nutrition,

educational services, etc. q The complex joint effect of these factors can can

be visualized by organizing the countries using the self-organizing map.

Page 40: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 40

World Poverty PCA

Page 41: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 41

World Poverty SOM

Page 42: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 42

World Poverty Map

Page 43: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 43

Page 44: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 44

Page 45: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 45

Viewing SOM Clusters on PCA axes

Page 46: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 46

1

5

4

3

2

1

2 3 4 5 6

SOM Example [Xiao-rui He]

Page 47: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 47

Neural Networks

Σ Input X

Synaptic Weights W

ƒ(•)

Bias θ

Output y

Page 48: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 48

Learning NN

×

×

× Σ

Σ

Adaptive Algorithm

Input X

1

Weights W

- +

Desired Response

Error

Page 49: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 49

Types of NNsq  Recurrent NN q  Feed-forward NN q  Layered

Other issuesq  Hidden layers possible q  Different activation functions possible

Page 50: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 50

Application: Secondary Structure Prediction

Page 51: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

PCR and Sequencing

Page 52: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 52

Polymerase Chain Reaction (PCR)q For testing, large amount of DNA is needed

Identifying individuals for forensic purposes Ø (0.1 microliter of saliva contains enough epithelial cells)

Identifying pathogens (viruses and/or bacteria) q PCR is a technique to amplify the number of copies of a

specific region of DNA. q Useful when exact DNA sequence is unknown q Need to know “flanking” sequences q Primers designed from “flanking” sequences

Page 53: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 53

PCR

DNA

Region to be amplifiedFlanking Regions with

known sequence

Reverse Primer

Millions of Copies

Forward Primer

Flanking Regions with known sequence

Page 54: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 54

PCR

Page 55: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 55

Schematic outline of a typical PCR cycle

Target DNA

Primers

dNTPs

DNA polymerase

Page 56: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 56

Picture Copyright: AccessExcellence @ the National Museum of Health

Page 57: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 57

Gel Electrophoresisq Used to measure the lengths of DNA fragments. q When voltage is applied to DNA, different size

fragments migrate to different distances (smaller ones travel farther).

Page 58: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 58

Gel Pictures

Page 59: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 59

Gel Electrophoresis: Measure sizes of fragments

q The phosphate backbone makes DNA a highly negatively charged molecule.

q DNA can be separated according to its size. q Gel: allow hot 1% solution of purifed agarose to cool

and solidify/polymerize. q DNA sample added to wells at the top of a gel and

voltage is applied. Larger fragments migrate through the pores slower.

q Varying concentration of agarose makes different pore sizes & results.

q Proteins can be separated in much the same way, only acrylamide is used as the crosslinking agent.

Page 60: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 60

Gel Electrophoresis

Page 61: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 61

Gel Electrophoresis

Page 62: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 62

Page 63: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 63

Sequencing

Page 64: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 64

Why sequencing?q Useful for further study:

Locate gene sequences, regulatory elements Compare sequences to find similarities Identify mutations Use it as a basis for further experiments

Next 4 slides contains material prepared by Dr. Stan Metzenberg. Also see: http://stat-www.berkeley.edu/users/terry/Classes/s260.1998/Week8b/week8b/node9.html

Page 65: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 65

Historyq Two methods independently developed in 1974

Maxam & Gilbert method Sanger method: became the standard

q Nobel Prize in 1980

Page 66: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 66

Original Sanger Methodq  (Labeled) Primer is annealed to template strand of denatured DNA. This

primer is specifically constructed so that its 3' end is located next to the DNA sequence of interest. Once the primer is attached to the DNA, the solution is divided into four tubes labeled "G", "A", "T" and "C". Then reagents are added to these samples as follows:

“G” tube: ddGTP, DNA polymerase, and all 4 dNTPs “A” tube: ddATP, DNA polymerase, and all 4 dNTPs “T” tube: ddTTP, DNA polymerase, and all 4 dNTPs “C” tube: ddCTP, DNA polymerase, and all 4 dNTPs

q  DNA is synthesized, & nucleotides are added to growing chain by the DNA polymerase. Occasionally, a ddNTP is incorporated in place of a dNTP, and the chain is terminated. Then run a gel.

q  All sequences in a tube have same prefix and same last nucleotide. q  http://www.wellcome.ac.uk/Education-resources/Teaching-and-

education/Animations/DNA/WTDV026689.htm

Page 67: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 67

Sanger Methodq  Example of sequences seen in gel from “G” tube:

Page 68: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 68

Modified Sangerq  Reactions performed in a single tube containing all four ddNTP's, each

labeled with a different color dye

Page 69: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 69

Sequencing Gels: Separate vs Single Lanes

A C G T

GCCAGGTGAGCCTTTGCA

Page 70: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 70

Sequencing

Page 71: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 71

Shotgun Sequencing

From http://www.tulane.edu/~biochem/lecture/723/humgen.html

Page 72: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 72

Sequencing

Page 73: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 73

Sequencing: Generate Contigsq Short for “contiguous sequence”. A continuously covered

region in the assembly.

q  Jang W et al (1999) Making effective use of human genomic sequence data. Trends Genet. 15(7): 284-6. Kent WJ and Haussler D (2001) Assembly of the working draft of the human genome with GigAssembler. Genome Res 11(9): 1541-8.

Dove-tail overlap

Collapsing into a single sequence

Page 74: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

q Scaffold (supercontig): formed when two contigs with no sequence overlap can be linked

Data from paired end reads help create scaffolds with known gaps Ø  If two reads end up in two different contigs, then we can link contigs to form

scaffold.

Paired Reads

2/16/15 CAP5510 / CGS 5166 74

Page 75: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 75

Shotgun Sequencing

From http://www.tulane.edu/~biochem/lecture/723/humgen.html

Page 76: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

Human Genome Project

q Many videos available on youtube.com, dnatube.com, and elsewhere.

q Find some and watch them.

2/16/15 CAP5510 / CGS 5166 76

Page 77: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 77

Assembly: Simple Exampleq  ACCGT, CGTGC, TTAC, TACCGT q  Total length = ~10 q 

•  --ACCGT-- •  ----CGTGC •  TTAC----- •  -TACCGT— •  TTACCGTGC

Page 78: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 78

Assembly: Complicationsq Errors in input sequence fragments (~3%)

Indels or substitutions q Contamination by host DNA q Chimeric fragments (joining of non-contiguous fragments) q Unknown orientation q Repeats (long repeats)

Fragment contained in a repeat Repeat copies not exact copies Inherently ambiguous assemblies possible Inverted repeats

q Inadequate Coverage

Page 79: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 79

Assembly: Complications

Page 80: Giri Narasimhangiri/teach/Bioinf/S15/Lec5... · 2015. 2. 16. · 2/16/15 CAP5510 / CGS 5166 2 Gene Expression! Process of transcription and/or translation of a gene is called gene

2/16/15 CAP5510 / CGS 5166 80

Assembly: Complications