An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown,...

71
An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert Hinman, Julio Ng, Michael Sneddon, Hoa Troung, Jerry Wang, Che Fung Yung

Transcript of An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown,...

Page 1: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Molecular Biology Primer

Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert Hinman, Julio Ng, Michael Sneddon, Hoa Troung, Jerry Wang, Che Fung Yung

Page 2: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Outline:

• 1. What Is Life Made Of?• 2. What Is Genetic Material?• 3. What Carries Information between DNA and Proteins• 4. How are Proteins Made?• 5. How to analysis genome (some lab techniques)

Page 3: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

1. What is Life made of?

Page 4: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Life begins with Cell

• A cell is a smallest structural unit of an organism that is capable of independent functioning

• All cells have some common features

Page 5: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

All Cells have common Cycles

• Born, eat, replicate, and die

Page 6: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Two types of cells: Prokaryotes v.s.Eukaryotes

Page 7: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Prokaryotes and Eukaryotes

•According to the most recent evidence, there are three main branches to the tree of life. •Prokaryotes include Archaea (“ancient ones”) and bacteria.•Eukaryotes are kingdom Eukarya and includes plants, animals, fungi and certain algae.

Page 8: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Prokaryotes and Eukaryotes, continuedProkaryotes Eukaryotes

Single cell Single or multi cell

No nucleus Nucleus

No organelles Organelles

One piece of circular DNA Chromosomes

No mRNA post transcriptional modification

Exons/Introns splicing

Page 9: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Section 2: Genetic Material of Life

Page 10: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

DNA: The Code of Life

• The structure and the four genomic letters code for all living organisms

• Adenine, Guanine, Thymine, and Cytosine which pair A-T and C-G on complimentary strands.

Page 11: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

DNA, continued• DNA has a double helix

structure which composed of • sugar molecule• phosphate group• and a base (A,C,G,T)

• DNA always reads from 5’ end to 3’ end for transcription replication 5’ ATTTAGGCC 3’3’ TAAATCCGG 5’

Page 12: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

DNA the Genetics Makeup• Genes are inherited and are

expressed• genotype (genetic makeup)• phenotype (physical

expression)

• On the left, is the eye’s phenotypes of green and black eye genes.

Page 13: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Genetic Information: Chromosomes

• (1) Double helix DNA strand. • (2) Chromatin strand (DNA with histones)• (3) Condensed chromatin during interphase with centromere. • (4) Condensed chromatin during prophase • (5) Chromosome during metaphase

Page 14: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Chromosomes

Organism Number of base pair number of Chromosomes

---------------------------------------------------------------------------------------------------------

Prokayotic

Escherichia coli (bacterium) 4x106 1

Eukaryotic

Saccharomyces cerevisiae (yeast) 1.35x107 17

Drosophila melanogaster(insect) 1.65x108 4

Homo sapiens(human) 2.9x109 23

Zea mays(corn) 5.0x109 10

Page 15: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

The organization of genes on a human chromosome

Page 16: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Human genome sequence

Page 17: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Comparison of genomes

Page 18: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Discovery of DNA• DNA Sequences

• Chargaff and Vischer, 1949• DNA consisting of A, T, G, C

• Adenine, Guanine, Cytosine, Thymine• Chargaff Rule

• Noticing #A#T and #G#C • A “strange but possibly meaningless”

phenomenon.• Wow!! A Double Helix

• Watson and Crick, Nature, April 25, 1953•

• Rich, 1973• Structural biologist at MIT.• DNA’s structure in atomic resolution. Crick Watson

1 Biologist1 Physics Ph.D. Student900 wordsNobel Prize

Page 19: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Watson & Crick – “…the secret of life”

• Watson: a zoologist, Crick: a physicist

• “In 1947 Crick knew no biology and practically no organic chemistry or crystallography..” – www.nobel.se

• Applying Chagraff’s rules and the X-ray image from Rosalind Franklin, they constructed a “tinkertoy” model showing the double helix

• Their 1953 Nature paper: “It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.”

Watson & Crick with DNA model

Rosalind Franklin with X-ray image of DNA

Page 20: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

DNA: The Basis of Life• Humans have about 3 billion base

pairs.• How do you package it into a cell?• How does the cell know where in the

highly packed DNA where to start transcription?

• Special regulatory sequences• DNA size does not mean more

complex• Complexity of DNA

• Eukaryotic genomes consist of variable amounts of DNA

• Single Copy or Unique DNA• Highly Repetitive DNA

Page 21: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Human Genome Composition

Page 22: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

DNA, continued• DNA has a double helix structure. However,

it is not symmetric. It has a “forward” and “backward” direction. The ends are labeled 5’ and 3’ after the Carbon atoms in the sugar component.

5’ AATCGCAAT 3’

3’ TTAGCGTTA 5’

DNA always reads 5’ to 3’ for transcription replication

Page 23: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Basic Structure

Phosphate

Sugar

Page 24: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

DNA - replication• DNA can replicate by

splitting, and rebuilding each strand.

• Note that the rebuilding of each strand uses slightly different mechanisms due to the 5’ 3’ asymmetry, but each daughter strand is an exact replica of the original strand.

http://users.rcn.com/jkimball.ma.ultranet/BiologyPages/D/DNAReplication.html

Page 25: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Section 6: What carries information between DNA to Proteins

Page 26: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

The flow of genetic information

Page 27: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

DNA RNA: Transcription• DNA gets transcribed by a

protein known as RNA-polymerase

• This process builds a chain of bases that will become mRNA

• RNA and DNA are similar, except that RNA is single stranded and thus less stable than DNA• Also, in RNA, the base uracil (U) is

used instead of thymine (T), the DNA counterpart

Page 28: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

T

C

A

C

T

G

G

C

G

A

G

T

C

A

G

C

G

A

G

U

C

A

G

C

DNA RNA

A = T

G = C

T U

Page 29: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Definition of a Gene

• Regulatory regions: up to 50 kb upstream of +1 site

• Exons: protein coding and untranslated regions (UTR)1 to 178 exons per gene (mean 8.8)8 bp to 17 kb per exon (mean 145 bp)

• Introns: splice acceptor and donor sites, junk DNAaverage 1 kb – 50 kb per intron

• Gene size: Largest – 2.4 Mb (Dystrophin). Mean – 27 kb.

Page 30: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Transcription: DNA pre mRNA

RNA polymerase II catalyzes the formation of phosphodiester bond that link nucleotides together to form a linear chain from 5’ to 3’ by unwinding the helix just ahead of the active site for polymerization of complementary base pairs.

• The hydrolysis of high energy bonds of the substrates (nucleoside triphosphates ATP, CTP, GTP, and UTP) provides energy to drive the reaction.

• During transcription, the DNA helix reforms as RNA forms.• When the terminator sequence is met, polymerase halts and

releases both the DNA template and the RNA.

Transcription occurs in the nucleus. σ factor from RNA polymerase reads the promoter sequence and opens a small portion of the double helix exposing the DNA bases.

Page 31: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Central Dogma Revisited

• Base Pairing Rule: A and T or U is held together by 2 hydrogen bonds and G and C is held together by 3 hydrogen bonds.

• Note: Some mRNA stays as RNA (ie noncoding RNA).

DNA Pre mRNA mRNA

protein

Splicing

Spliceosome

Translation

Transcription

Nucleus

Ribosome in Cytoplasm

Page 32: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

RNA processing: pre-RNA mature RNA• 5’ Cap

• Poly-A

• Splicing

• Editing

Page 33: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Splicing

Page 34: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Alternative splicing

Page 35: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

5’ Cap of RNA

Page 36: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

PolyA addition

Page 37: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

3 How are Proteins Made?

Page 38: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Revisiting the Central Dogma• In going from DNA to proteins,

there is an intermediate step where mRNA is made from DNA, which then makes protein• This known as The Central

Dogma• Why the intermediate step?

• DNA is kept in the nucleus, while protein sythesis happens in the cytoplasm, with the help of ribosomes

Page 39: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

The Central Dogma (cont’d)

Page 40: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Translation• The process of going

from RNA to polypeptide.

• Three base pairs of RNA (called a codon) correspond to one amino acid based on a fixed table.

• Always starts with Methionine and ends with a stop codon

Page 41: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

tRNA

Page 42: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Translation, continued• Catalyzed by Ribosome

• Using two different sites, the Ribosome continually binds tRNA, joins the amino acids together and moves to the next location along the mRNA

• ~10 codons/second, but multiple translations can occur simultaneously

http://wong.scripps.edu/PIX/ribosome.jpg

Page 43: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

The genetic code

Page 44: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Reading frames

Page 45: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Protein Synthesis: Summary• There are twenty amino

acids, each coded by three- base-sequences in DNA, called “codons”• This code is degenerate

• The central dogma describes how proteins derive from DNA• DNA mRNA (splicing?)

protein• The protein adopts a 3D

structure specific to it’s amino acid arrangement and function

Page 46: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Simultaneous translation

Page 47: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Proteins

• Complex organic molecules made up of amino acid subunits

• 20* different kinds of amino acids. Each has a 1 and 3 letter abbreviation.

• Proteins are often enzymes that catalyze reactions.

• Also called “poly-peptides”

*Some other amino acids exist but not in humans.

Page 48: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

• Composed of a chain of amino acids.

R

|

H2N--C--COOH

|

H

Proteins

20 possible groups

Page 49: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

20 amino acids

Page 50: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

R R | | H2N--C--COOH H2N--C--COOH | | H H

Proteins

Page 51: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Dipeptide

R O R | II | H2N--C--C--NH--C--COOH | | H H

This is a peptide bond

Page 52: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Protein structure• Linear sequence of amino acids folds to form

a complex 3-D structure.• The structure of a protein is intimately

connected to its function.

Page 53: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

How to Analyze DNA?

Page 54: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Analyzing a Genome• How to analyze a genome in four easy steps.

• Cut it• Use enzymes to cut the DNA in to small fragments.

• Copy it• Copy it many times to make it easier to see and detect.

• Read it• Use special chemical techniques to read the small fragments.

• Assemble it• Take all the fragments and put them back together. This is

hard!!!

• Bioinformatics takes over• What can we learn from the sequenced DNA.• Compare interspecies and intraspecies.

Page 55: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Polymerase Chain Reaction (PCR)• Polymerase Chain Reaction (PCR)

• Used to massively replicate DNA sequences.

• How it works:• Separate the two strands with low heat• Add some base pairs, primer sequences, and

DNA Polymerase• Creates double stranded DNA from a single

strand.• Primer sequences create a seed from which

double stranded DNA grows.• Now you have two copies.• Repeat. Amount of DNA grows exponentially.

• 1→2→4→8→16→32→64→128→256…

Page 56: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Cloning DNA• DNA Cloning

• Insert the fragment into the genome of a living organism and watch it multiply.

• Once you have enough, remove the organism, keep the DNA.

• Use Polymerase Chain Reaction (PCR)

Vector DNA

Page 57: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Cutting DNA

• Restriction Enzymes cut DNA• Only cut at special sequences

• DNA contains thousands of these sites.

• Applying different Restriction Enzymes creates fragments of varying size.

Restriction Enzyme “A” Cutting Sites

Restriction Enzyme “A” & Restriction Enzyme “B” Cutting Sites

Restriction Enzyme “B” Cutting Sites

“A” and “B” fragments overlap

Page 58: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Pasting DNA

• Two pieces of DNA can be fused together by adding chemical bonds

• Hybridization – complementary base-pairing

• Ligation – fixing bonds with single strands

Page 59: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Electrophoresis• A copolymer of mannose and galactose,

agaraose, when melted and recooled, forms a gel with pores sizes dependent upon the concentration of agarose

• The phosphate backbone of DNA is highly negatively charged, therefore DNA will migrate in an electric field

• The size of DNA fragments can then be determined by comparing their migration in the gel to known size standards.

Page 60: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

DNA Microarray

Tagged probes become hybridized to the DNA chip’s microarray.

May, 11, 2004 60

An Introduction to Bioinformatics Algorithms www.bioalgorithms.info

Millions of DNA strands build up on each location.

http://www.affymetrix.com/corporate/media/image_library/image_library_1.affx

Page 61: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

DNA Microarray

Affymetrix

Each blue spot indicates the location of a PCR product. On a real microarray, each spot is about 100um in diameter.

May, 11, 2004 www.geneticsplace.com 61

An Introduction to Bioinformatics Algorithms www.bioalgorithms.info

Microarray is a tool for analyzing gene expression that consists of a glass slide.

Page 62: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Affymetrix GeneChip® Arrays

Data from an experiment showing the expression of thousands of genes on a single GeneChip® probe array.

http://www.affymetrix.com/corporate/media/image_library/image_library_1.affxMay 11,2004 14

Page 63: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Beta globins:• Beta globin chains of closely related species are highly similar:• Observe simple alignments below:

Human β chain: MVHLTPEEKSAVTALWGKV NVDEVGGEALGRLL

Mouse β chain: MVHLTDAEKAAVNGLWGKVNPDDVGGEALGRLL

Human β chain: VVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLG

Mouse β chain: VVYPWTQRYFDSFGDLSSASAIMGNPKVKAHGKK VIN

Human β chain: AFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGN

Mouse β chain: AFNDGLKHLDNLKGTFAHLSELHCDKLHVDPENFRLLGN

Human β chain: VLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH

Mouse β chain: MI VI VLGHHLGKEFTPCAQAAFQKVVAGVASALAHKYH

There are a total of 27 mismatches, or (147 – 27) / 147 = 81.7 % identical

Page 64: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Beta globins: Cont.Human β chain: MVH L TPEEKSAVTALWGKVNVDEVGGEALGRLL

Chicken β chain: MVHWTAEEKQL I TGLWGKVNVAECGAEALARLL

Human β chain: VVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLG

Chicken β chain: IVYPWTQRFF ASFGNLSSPTA I LGNPMVRAHGKKVLT

Human β chain: AFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGN

Chicken β chain: SFGDAVKNLDNIK NTFSQLSELHCDKLHVDPENFRLLGD

Human β chain: VLVCVLAHHFGKEFTPPVQAAY QKVVAGVANALAHKYH

Mouse β chain: I L I I VLAAHFSKDFTPECQAAWQKLVRVVAHALARKYH

-There are a total of 44 mismatches, or (147 – 44) / 147 = 70.1 % identical

- As expected, mouse β chain is ‘closer’ to that of human than chicken’s.

Page 65: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Molecular evolution can be visualized with phylogenetic tree.

Page 66: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Origins of New Genes.

• All animals lineages traced back to a common ancestor, a protish about 700 million years ago.

Page 67: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

How Do Different Species Differ?

• As many as 99% of human genes are conserved across all mammals

• The functionality of many genes is virtually the same among many organisms

• It is highly unlikely that the same gene with the same function would spontaneously develop among all currently living species

• The theory of evolution suggests all living things evolved from incremental change over millions of years

Page 68: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Mouse and Human overview

• Mouse has 2.1 x109 base pairs versus 2.9 x 109 in human.

• About 95% of genetic material is shared.• 99% of genes shared of about 30,000 total.• The 300 genes that have no homologue in

either species deal largely with immunity, detoxification, smell and sex*

*Scientific American Dec. 5, 2002

Page 69: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Human and Mouse

Significant chromosomal rearranging occurred between the diverging point of humans and mice.

Here is a mapping of human chromosome 3.

It contains homologous sequences to at least 5 mouse chromosomes.

Page 70: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Comparative Genomics• What can be done with the

full Human and Mouse Genome? One possibility is to create “knockout” mice – mice lacking one or more genes. Studying the phenotypes of these mice gives predictions about the function of that gene in both mice and humans.

Page 71: An Introduction to Bioinformatics Algorithms Molecular Biology Primer Angela Brooks, Raymond Brown, Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert.

An Introduction to Bioinformatics Algorithms

Future reading and references• Molecular Cell Biology Lodish, Harvey; Berk, Arnold;

Zipursky, S. Lawrence; Matsudaira, Paul; Baltimore, David; Darnell, James E. New York: W. H. Freeman & Co. ; c1999