A Zero-Knowledge Based Introduction to Biology · 2013. 1. 10.
A Zero-Knowledge Based Introduction to Biology Jim Notwell 09 January 2013


A Zero-Knowledge Based Introduction to Biology

Jim Notwell

09 January 2013


Q: What is your genome?

A: The sum of your hereditary information.


From DNA to Organism

You are composed of ~10 trillion cells

[Figure labels: Cell, Protein]

Proteins do most of the work in biology


Central Dogma of Biology


DNA: “Blueprints” for a cell

•Genetic information is encoded in long strings of nucleotides

•The nucleotides of deoxyribonucleic acid (DNA) come in four flavors: adenine (A), thymine (T), guanine (G), and cytosine (C)
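Because DNA is a long string over a four-letter alphabet, it maps naturally onto an ordinary string type. The sketch below is only an illustration of that idea; the function names and the toy sequence are assumptions, not part of the slides.

```python
from collections import Counter

# The four DNA bases: adenine, cytosine, guanine, thymine.
DNA_ALPHABET = set("ACGT")


def is_valid_dna(seq):
    """Return True if every character of seq is one of A, C, G, T."""
    return set(seq.upper()) <= DNA_ALPHABET


def base_counts(seq):
    """Count how often each of the four bases occurs in seq."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


example = "GATTACA"  # toy sequence chosen for illustration
print(is_valid_dna(example))  # True
print(base_counts(example))   # {'A': 3, 'C': 1, 'G': 1, 'T': 2}
```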


Phosphate-deoxyribose Backbone

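[Figure: chemical structure of the phosphate-deoxyribose backbone of one nucleotide, with the 5’ and 3’ positions marked and bonds labeled “to previous nucleotide”, “to next nucleotide”, and “to base”]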


Nucleobase Complementary Pairing

•Purines: Adenine (A), Guanine (G)

•Pyrimidines: Cytosine (C), Thymine (T)

•Complementary pairs: A with T, G with C (each pair joins one purine with one pyrimidine)
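Complementary pairing means one strand determines the other. A minimal sketch, assuming the standard A-T and G-C pairing and using hypothetical helper names:

```python
# Translation table implementing A<->T and G<->C pairing.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Return the base-by-base complement, in the same left-to-right order."""
    return seq.upper().translate(COMPLEMENT)


def reverse_complement(seq):
    """Return the opposite strand written in its own 5' to 3' direction."""
    return complement(seq)[::-1]


print(complement("GATTACA"))          # CTAATGT
print(reverse_complement("GATTACA"))  # TGTAATC
```

The second function reverses the result because the two strands of a double helix run in opposite directions.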


DNA Double Helix


DNA Packaging


Q: What is your genome?

A: The sum of your hereditary information. Humans bundle two copies of the genome into 46 chromosomes in nearly every cell.


Central Dogma of Biology


DNA vs RNA


RNA Nucleobases

•Purines: Adenine (A), Guanine (G)

•Pyrimidines: Cytosine (C), Uracil (U)


Gene Transcription

5’ G A T T A C A . . . 3’

3’ C T A A T G T . . . 5’

Strands are separated (DNA helicase)


Gene Transcription

5’ G A T T A C A . . . 3’ (DNA)

3’ C T A A T G T . . . 5’ (DNA template)

5’ G A U U A C A 3’ (RNA)

An RNA copy of the 5’→3’ sequence is created from the 3’→5’ template
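The rule above can be sketched in a few lines: pairing each base of the 3’→5’ template with its RNA partner yields the same letters as the 5’→3’ DNA strand, with U in place of T. The function names are hypothetical.

```python
# RNA partner of each template base: A->U, C->G, G->C, T->A.
TEMPLATE_TO_RNA = str.maketrans("ACGT", "UGCA")


def transcribe_from_template(template_3_to_5):
    """Build the RNA (written 5'->3') from a template strand written 3'->5'."""
    return template_3_to_5.upper().translate(TEMPLATE_TO_RNA)


def transcribe_from_coding(coding_5_to_3):
    """Shortcut: the RNA equals the 5'->3' DNA strand with T replaced by U."""
    return coding_5_to_3.upper().replace("T", "U")


print(transcribe_from_template("CTAATGT"))  # GAUUACA
print(transcribe_from_coding("GATTACA"))    # GAUUACA
```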


Gene Transcription

5’ G A U U A C A . . . 3’ (pre-mRNA)

5’ G A T T A C A . . . 3’ (DNA)

3’ C T A A T G T . . . 5’ (DNA)


RNA Processing

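[Figure: introns are removed from the pre-mRNA and the exons are joined; a 5’ cap and a poly(A) tail are added, giving the mRNA with its 5’ UTR and 3’ UTR]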
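RNA processing can be pictured as string surgery: keep the exons, drop the introns, then mark the ends. The sketch below is a toy model only. The sequence, the exon coordinates, the tail length, and the way the cap is recorded (a boolean flag) are all assumptions made for illustration.

```python
def splice(pre_mrna, exons):
    """Join the exon intervals (half-open [start, end) pairs) and drop the rest."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def process(pre_mrna, exons, tail_length=10):
    """Splice, then record a 5' cap and append a poly(A) tail."""
    return {
        "cap": True,  # stands in for the 5' cap, which is not a plain base
        "sequence": splice(pre_mrna, exons) + "A" * tail_length,
    }


# Hypothetical pre-mRNA: exon 1, one intron, exon 2.
exon1, intron, exon2 = "AUGGCC", "GUAAGUUUUCAG", "UGGACUUGA"
pre_mrna = exon1 + intron + exon2
exons = [(0, len(exon1)), (len(exon1) + len(intron), len(pre_mrna))]

mrna = process(pre_mrna, exons)
print(mrna["sequence"])  # AUGGCCUGGACUUGAAAAAAAAAAA
```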


Gene Structure

[Figure: gene structure from 5’ to 3’: promoter, 5’ UTR, exons separated by introns, 3’ UTR, with regions marked as coding or non-coding]


Central Dogma of Biology


From RNA to Protein

•Proteins are long strings of amino acids joined by peptide bonds

•Translation from RNA sequence to amino acid sequence is performed by ribosomes

•20 amino acids → 3 RNA letters are required to specify a single amino acid (two letters give only 4 × 4 = 16 combinations; three give 4 × 4 × 4 = 64)
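The counting argument behind the three-letter codon can be checked directly. This is a small arithmetic sketch; the function name is hypothetical.

```python
def min_codon_length(alphabet_size=4, n_amino_acids=20):
    """Smallest word length whose number of possible words covers all amino acids."""
    length = 1
    while alphabet_size ** length < n_amino_acids:
        length += 1
    return length


print(4 ** 2)              # 16 two-letter words: too few for 20 amino acids
print(4 ** 3)              # 64 three-letter words: enough
print(min_codon_length())  # 3
```

With 64 codons available for 20 amino acids plus a stop signal, several codons can map to the same amino acid.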


Amino Acid

Alanine

Arginine

Asparagine

Aspartate

Cysteine

Glutamate

Glutamine

Glycine

Histidine

Isoleucine

Leucine

Lysine

Methionine

Phenylalanine

Proline

Serine

Threonine

Tryptophan

Tyrosine

Valine

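[Figure: generic amino acid structure: an amino group (NH2), a central carbon bearing a hydrogen and a side chain (R), and a carboxyl group (COOH)]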

There are 20 standard amino acids


Proteins

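[Figure: one amino acid residue in a protein chain, bonded to the previous and next amino acids (aa). The chain starts at the N-terminus and ends at the C-terminus, following the 5’ → 3’ direction of the mRNA.]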


Translation

The ribosome (a complex of protein and RNA) synthesizes a protein by reading the mRNA in triplets (codons). Each codon is translated to an amino acid.


Translation

Codon table (rows: first and third base; columns: second base)

1st 3rd   G          A          C          U
G   G     GGG Gly    GAG Glu    GCG Ala    GUG Val
G   A     GGA Gly    GAA Glu    GCA Ala    GUA Val
G   C     GGC Gly    GAC Asp    GCC Ala    GUC Val
G   U     GGU Gly    GAU Asp    GCU Ala    GUU Val
A   G     AGG Arg    AAG Lys    ACG Thr    AUG Met or START
A   A     AGA Arg    AAA Lys    ACA Thr    AUA Ile
A   C     AGC Ser    AAC Asn    ACC Thr    AUC Ile
A   U     AGU Ser    AAU Asn    ACU Thr    AUU Ile
C   G     CGG Arg    CAG Gln    CCG Pro    CUG Leu
C   A     CGA Arg    CAA Gln    CCA Pro    CUA Leu
C   C     CGC Arg    CAC His    CCC Pro    CUC Leu
C   U     CGU Arg    CAU His    CCU Pro    CUU Leu
U   G     UGG Trp    UAG STOP   UCG Ser    UUG Leu
U   A     UGA STOP   UAA STOP   UCA Ser    UUA Leu
U   C     UGC Cys    UAC Tyr    UCC Ser    UUC Phe
U   U     UGU Cys    UAU Tyr    UCU Ser    UUU Phe

Abbreviations: Ala = Alanine, Arg = Arginine, Asn = Asparagine, Asp = Aspartic acid, Cys = Cysteine, Gln = Glutamine, Glu = Glutamic acid, Gly = Glycine, His = Histidine, Ile = Isoleucine, Leu = Leucine, Lys = Lysine, Met = Methionine, Phe = Phenylalanine, Pro = Proline, Ser = Serine, Thr = Threonine, Trp = Tryptophan, Tyr = Tyrosine, Val = Valine
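The codon table is a lookup from 64 three-letter words to 20 amino acids plus STOP. One compact way to hold it in code is sketched below, using one-letter amino acid codes with "*" for STOP. The base ordering "UCAG" and the 64-character string are a common convention for writing the standard genetic code, not something taken from the slides.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for codons in the order UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base U
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# How many codons encode each amino acid (the redundancy of the code).
degeneracy = Counter(CODON_TABLE.values())

print(CODON_TABLE["AUG"])  # M (methionine, also the start codon)
print(sorted(c for c, aa in CODON_TABLE.items() if aa == "*"))  # ['UAA', 'UAG', 'UGA']
print(degeneracy["L"], degeneracy["W"])  # 6 1
```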


Translation

5’ . . . A U U A U G G C C U G G A C U U G A . . . 3’

AUU: UTR
AUG: Met (Start Codon)
GCC: Ala
UGG: Trp
ACU: Thr
UGA: Stop Codon
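The example above can be replayed in code: skip the untranslated bases, start at AUG, read three letters at a time, and stop at a stop codon. This is a simplified sketch. It assumes translation starts at the first AUG in the string, and the dictionary holds only the codons needed for this example, so any other codon would raise a KeyError.

```python
# Subset of the codon table, just enough for the slide's example.
CODONS = {"AUG": "Met", "GCC": "Ala", "UGG": "Trp", "ACU": "Thr", "UGA": "STOP"}


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODONS[mrna[i:i + 3]]
        if residue == "STOP":
            break
        protein.append(residue)
    return protein


print(translate("AUUAUGGCCUGGACUUGA"))  # ['Met', 'Ala', 'Trp', 'Thr']
```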


Central Dogma of Biology


Protein coding: 1%

Other Stuff: 99%


Protein coding: 1%
Non-coding exons: 2%
Introns/promoters/polyA sites: 37%
Intergenic transcribed RNA: 19%
Regulatory elements: 9%
???: 32%

The ENCODE Project Consortium (2012) Nature 489:57-74


Non-coding RNAs

•RNAs transcribed from DNA but not translated into protein

•Structural ncRNAs: Conserved secondary structure

•Involved in gene regulation


microRNA


Different Cell Types

Subsets of the DNA sequence determine the identity and function of different cells


Gene Expression Regulation

•When should each gene be expressed?

•Why? Every cell has the same DNA, but each cell expresses different proteins.

•Signal transduction: one signal is converted into another. The cascade has “master regulators” that turn on many proteins, each of which in turn turns on many more.


Central Dogma of Biology


Transcription Regulation

•Transcription factors bind to binding sites

•A complex of transcription factors forms

•The complex assists or inhibits formation of the RNA polymerase machinery


Transcription Factor Binding Sites

•Short, degenerate DNA sequences recognized by particular transcription factors

•For complex organisms, cooperative binding of multiple transcription factors is required to initiate transcription

[Figure: binding sequence logo]
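"Degenerate" means some positions of a binding site tolerate more than one base. One simple way to express that is with IUPAC ambiguity codes expanded into a regular expression, as sketched below. The motif, the test sequence, and the function names are hypothetical; only one strand is scanned, and real binding-site models usually weight each position rather than using a yes/no pattern.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Expand a degenerate motif into a regular expression string."""
    return "".join(IUPAC[symbol] for symbol in motif.upper())


def find_sites(sequence, motif):
    """Return (position, matched text) for every match, overlapping ones included."""
    # A lookahead lets the scan report overlapping matches.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical motif: the fourth position may be C or G.
print(find_sites("CCTGACTCAGGTGAGTCATT", "TGASTCA"))
# [(2, 'TGACTCA'), (11, 'TGAGTCA')]
```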


Transcription Regulation

[Figure labels: Transcription Factor A, TF A Binding Site, Gene B]


Gene Regulatory Region

understand how different permutations of the same regulatory elements alter gene expression. An understanding of how the combinatorial organization of a promoter encodes regulatory information first requires an overview of the proteins that constitute the transcriptional machinery.

THE EUKARYOTIC TRANSCRIPTIONAL MACHINERY

Factors involved in the accurate transcription of eukaryotic protein-coding genes by RNA polymerase II can be classified into three groups: general (or basic) transcription factors (GTFs), promoter-specific activator proteins (activators), and coactivators (Figure 2). GTFs are necessary and can be sufficient for accurate transcription initiation in vitro (reviewed in 141). Such factors include RNA polymerase II itself and a variety of auxiliary components, including TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH. In addition to these “classic” GTFs, it is apparent that in vivo transcription also requires Mediator, a highly conserved, large multisubunit complex that was originally identified in yeast (reviewed in 38, 119).

GTFs assemble on the core promoter in an ordered fashion to form a transcription preinitiation complex (PIC), which directs RNA polymerase II to the transcription start site (TSS). The first step in PIC assembly is binding of TFIID, a multisubunit complex consisting of TATA-box-binding protein (TBP) and a set of tightly bound TBP-associated factors (TAFs). Transcription then proceeds through a series of steps, including promoter melting, clearance, and escape, before a fully functional RNA polymerase II elongation complex is formed. The current model of transcription regulation views this as a cycle, in which complete PIC assembly is stimulated only once. After RNA polymerase II escapes from the promoter, a scaffold structure, composed of TFIID, TFIIE, TFIIH, and Mediator, remains on the core promoter (73); subsequent reinitiation of transcription then only requires rerecruitment of RNA polymerase II-TFIIF and TFIIB.

[Figure 1 labels: Distal regulatory elements (Enhancer, Silencer, Locus control region, Insulator); Promoter (under 1 kb), made up of Proximal promoter elements and Core promoter]

Figure 1: Schematic of a typical gene regulatory region. The promoter, which is composed of a core promoter and proximal promoter elements, typically spans less than 1 kb pairs. Distal (upstream) regulatory elements, which can include enhancers, silencers, insulators, and locus control regions, can be located up to 1 Mb pairs from the promoter. These distal elements may contact the core promoter or proximal promoter through a mechanism that involves looping out the intervening DNA.

General transcription factor (GTF): a factor that assembles on the core promoter to form a preinitiation complex and is required for transcription of all (or almost all) genes

Coactivators: adaptor proteins that typically lack intrinsic sequence-specific DNA binding but provide a link between activators and the general transcriptional machinery

PIC: preinitiation complex

TSS: transcription start site

The assembly of a PIC on the core promoter is sufficient to direct only low levels of accurately initiated transcription from DNA templates in vitro, a process generally referred to as basal transcription. Transcriptional activity is greatly stimulated by a second class of factors, termed activators. In general, activators are sequence-specific DNA-binding proteins whose recognition sites are usually present in sequences upstream of the core promoter (reviewed in 149). Many classes of activators, discriminated by different DNA-binding domains, have been described, each associating with their own class of specific DNA sequences. Examples of activator families include those containing a cysteine-rich zinc finger, homeobox, helix-loop-helix (HLH), basic leucine zipper (bZIP), forkhead, ETS, or Pit-Oct-Unc (POU) DNA-binding domain (reviewed in 142). In addition to a sequence-specific DNA-binding domain, a typical activator also contains a separable activation domain that is required for the activator to stimulate transcription (149). An

www.annualreviews.org • Transcriptional Regulatory Elements 31

Annu. Rev. Genom. Human Genet. 2006.7:29-59. Downloaded from arjournals.annualreviews.org by Stanford University Robert Crown Law Lib. on 04/03/07. For personal use only.


Gene Regulatory Region


TBP: TATA-box-binding protein

TAF: TBP-associated factor

TFBS: transcription factor-binding site

[Figure 2 labels: PIC, TFIID, TFIIA, TFIIB, TFIIF, TFIIH, TFIIE, RNA polymerase II, Mediator, Activator (DBD, AD), Co-activator, Core promoter, TATA, TSS, and “?” marks on unknown targets]

Figure 2: The eukaryotic transcriptional machinery. Factors involved in eukaryotic transcription by RNA polymerase II can be classified into three groups: general transcription factors (GTFs), activators, and coactivators. GTFs, which include RNA polymerase II itself and TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH, assemble on the core promoter in an ordered fashion to form a preinitiation complex (PIC), which directs RNA polymerase II to the transcription start site (TSS). Transcriptional activity is greatly stimulated by activators, which bind to upstream regulatory elements and work, at least in part, by stimulating PIC formation through a mechanism thought to involve direct interactions with one or more components of the transcriptional machinery. Activators consist of a DNA-binding domain (DBD) and a separable activation domain (AD) that is required for the activator to stimulate transcription. The direct targets of activators are largely unknown.

extensive discussion of the properties of activators is beyond the scope of this review; readers are referred to several excellent reviews on the subject (87 and references therein).

The DNA-binding sites for activators [also called transcription factor-binding sites (TFBSs)] are generally small, in the range of 6–12 bp, although binding specificity is usually dictated by no more than 4–6 positions within the site. The TFBSs for a specific activator are typically degenerate, and are therefore described by a consensus sequence in which certain positions are relatively constrained and others are more variable. Many activators form heterodimers and/or homodimers, and thus their binding sites are generally composed of two half-sites. Notably, the precise subunit composition of an activator can also dictate its binding specificity and regulatory action (37).

Although an activator can bind to a wide variety of sequence variants that conform to the consensus, in certain instances the precise sequence of a TFBS can impact the regulatory output. For example, TFBS sequence variations can affect activator binding strength (reviewed in 30), which may be biologically important in situations such as in early development, in which activators are distributed in a concentration gradient (84, 144). TFBS sequence variations may also direct a preference for certain dimerization partners over others (37, 124, 142). Finally, the particular sequence of a TFBS can affect the structure of a bound activator in a way that alters its activity (69, 104, 108, 154, 163). The best-studied examples are nuclear hormone receptors, a large class of ligand-dependent activators. Various studies have shown that the relative orientation of the half-sites, as well as the spacing between them, play a major role in directing the regulatory action of the bound nuclear hormone receptor dimer (37).

Activators work, at least in part, by increasing PIC formation through a mechanism thought to involve direct interactions with one or more components of the transcriptional machinery, termed the “target” (141, 149). Activators may also act by promoting a step in the transcription process subsequent to PIC assembly, such as initiation, elongation, or reinitiation (103). Finally, activators have also been proposed to function by recruiting activities that modify chromatin structure (47, 106). Chromatin often poses a barrier to transcription because it prevents the transcriptional machinery from interacting directly with promoter DNA, and thus can be

32 Maston · Evans · Green



Q: What if the transcription/translation machinery makes mistakes?

Q: What is the effect in coding regions?


Evolution = Mutation + Selection


Structural Abnormalities

Normal

Insertion

Reciprocal Translocation

Duplication

Deletion

Inversion


Single Nucleotide Changes

Normal
DNA: A A A A T A C G T G C A
mRNA: U U U U A U G C A C G U
Protein: Phe Tyr Ala Arg

Silent Mutation
DNA: A A G A T A C G T G C A
mRNA: U U C U A U G C A C G U
Protein: Phe Tyr Ala Arg

Missense Mutation
DNA: A A A A T A C C T G C A
mRNA: U U U U A U G G A C G U
Protein: Phe Tyr Gly Arg

Nonsense Mutation
DNA: A A A A T T C G T G C A
mRNA: U U U U A A G C A C G U
Protein: Phe STOP
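The three kinds of single nucleotide change can be told apart by translating the normal and mutated mRNA and comparing the proteins. The sketch below is illustrative only: the dictionary is a subset of the codon table covering just the codons in these examples, and the function names are hypothetical.

```python
# Subset of the codon table covering the codons used in the examples above.
CODONS = {
    "UUU": "Phe", "UUC": "Phe", "UAU": "Tyr", "UAA": "STOP",
    "GCA": "Ala", "GGA": "Gly", "CGU": "Arg",
}


def translate(mrna):
    """Translate codon by codon from position 0, ending at the first STOP."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODONS[mrna[i:i + 3]]
        protein.append(residue)
        if residue == "STOP":
            break
    return protein


def classify(normal_mrna, mutant_mrna):
    """Label a single nucleotide change as silent, nonsense, or missense."""
    normal, mutant = translate(normal_mrna), translate(mutant_mrna)
    if mutant == normal:
        return "silent"
    if "STOP" in mutant and "STOP" not in normal:
        return "nonsense"
    return "missense"


normal = "UUUUAUGCACGU"
print(classify(normal, "UUCUAUGCACGU"))  # silent
print(classify(normal, "UUUUAUGGACGU"))  # missense
print(classify(normal, "UUUUAAGCACGU"))  # nonsense
```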


Single Nucleotide Changes

Normal
DNA: A A A A T A C G T G C A
mRNA: U U U U A U G C A C G U
Protein: Phe Tyr Ala Arg

Frameshift (Insertion)
DNA: A A A T A T A C G T G C
mRNA: U U U A U A U G C A C G
Protein: Phe Ile Cys Thr

Frameshift (Deletion of T)
DNA: A A A A A C G T G C A
mRNA: U U U U U G C A C G U
Protein: Phe Leu His Val
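A frameshift changes every codon downstream of the edit, because the triplet boundaries move. The sketch below shows this by splitting the mRNA into codons before and after a one-base insertion or deletion. The helper names and the edit positions are assumptions chosen to mirror the example sequences.

```python
def codons(mrna):
    """Split an mRNA into complete triplets, ignoring leftover bases at the end."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


def insert_base(mrna, position, base):
    """Return mrna with base inserted before the given 0-based position."""
    return mrna[:position] + base + mrna[position:]


def delete_base(mrna, position):
    """Return mrna with the base at the given 0-based position removed."""
    return mrna[:position] + mrna[position + 1:]


normal = "UUUUAUGCACGU"
print(codons(normal))                       # ['UUU', 'UAU', 'GCA', 'CGU']
print(codons(insert_base(normal, 3, "A")))  # ['UUU', 'AUA', 'UGC', 'ACG']
print(codons(delete_base(normal, 4)))       # ['UUU', 'UUG', 'CAC']
```

Only the first codon survives either edit; everything after it is read in a different frame.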


Selection

[Figure: a population over time, contrasting a harmful mutation with a beneficial mutation]
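The contrast between harmful and beneficial mutations can be sketched with a textbook deterministic model of selection. This model is not from the slides: it assumes a very large population of single-copy (haploid) individuals, where carriers of the mutation have relative fitness 1 + s and everyone else has fitness 1. The starting frequencies and values of s are arbitrary.

```python
def next_frequency(p, s):
    """Frequency of the mutation one generation later under selection."""
    # Carriers reproduce at rate 1 + s; the population average is 1 + p * s.
    return p * (1 + s) / (1 + p * s)


def trajectory(p0, s, generations):
    """Return the mutation frequency at generation 0, 1, ..., generations."""
    frequencies = [p0]
    for _ in range(generations):
        frequencies.append(next_frequency(frequencies[-1], s))
    return frequencies


beneficial = trajectory(p0=0.01, s=0.1, generations=200)  # s > 0: spreads
harmful = trajectory(p0=0.5, s=-0.1, generations=200)     # s < 0: disappears

print(round(beneficial[-1], 4))
print(round(harmful[-1], 4))
```

Real populations are finite, so chance also matters; this sketch leaves that out on purpose.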


Summary

Evolution = Mutation + Selection


Summary

•All hereditary information is encoded in double-stranded DNA

•Each cell in an organism has the same DNA

•DNA → RNA → protein

•Proteins have many diverse roles in the cell

•Gene regulation diversifies protein products within different cells


Further Reading

•See website: cs173.stanford.edu