Genome Organization & Protein Synthesis and Processing in Plants


Page 1: Genome Organization & Protein Synthesis and Processing in Plants

Genome Organization & Protein Synthesis and Processing in Plants

Page 2: Genome Organization & Protein Synthesis and Processing in Plants

Viral genomes

Viral genomes: ssRNA, dsRNA, ssDNA, dsDNA, linear or circular

Viruses with RNA genomes:
• Almost all plant viruses and some bacterial and animal viruses
• Genomes are rather small (a few thousand nucleotides)

Viruses with DNA genomes (e.g. lambda = 48,502 bp):
• Often a circular genome

Replicative form of viral genomes:
• All ssRNA viruses produce dsRNA molecules
• Many linear DNA molecules become circular

Molecular weight and contour length:
• Duplex length per base pair = 3.4 Å
• Mol. weight per base pair = ~660 Da
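The two constants above are enough to estimate the physical size of any double-stranded genome. A minimal sketch, applied to the lambda genome quoted on this slide; the function and constant names are illustrative, not from the slides.

```python
# Slide constants for double-stranded DNA.
RISE_PER_BP_ANGSTROM = 3.4   # duplex length per base pair
MASS_PER_BP_DALTON = 660.0   # approximate molecular weight per base pair


def contour_length_um(n_bp: int) -> float:
    """Contour length of a duplex of n_bp base pairs, in micrometres."""
    return n_bp * RISE_PER_BP_ANGSTROM * 1e-4  # 1 Å = 1e-4 µm


def molecular_weight_da(n_bp: int) -> float:
    """Approximate molecular weight of a duplex of n_bp base pairs, in daltons."""
    return n_bp * MASS_PER_BP_DALTON


LAMBDA_BP = 48_502  # lambda genome size given on the slide

print(f"{contour_length_um(LAMBDA_BP):.1f} um")    # 16.5 um
print(f"{molecular_weight_da(LAMBDA_BP):.2e} Da")  # 3.20e+07 Da
```

So the lambda genome, fully stretched, is roughly 16.5 µm long and weighs about 3.2 x 10^7 Da.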

Page 3: Genome Organization & Protein Synthesis and Processing in Plants

Procaryotic genomes
• Generally 1 circular chromosome (dsDNA)
• Usually without introns
• Relatively high gene density (~2500 genes per mm of E. coli DNA)
• Contour length of E. coli genome: 1.7 mm
• Often indigenous plasmids are present

Page 4: Genome Organization & Protein Synthesis and Processing in Plants

Plasmids
Extrachromosomal circular DNAs
• Found in bacteria, yeast and other fungi
• Size varies from ~3,000 bp to 100,000 bp
• Replicate autonomously (origin of replication)
• May contain resistance genes
• May be transferred from one bacterium to another
• May be transferred across kingdoms
• Multicopy plasmids (up to ~400 plasmids per cell)
• Low-copy plasmids (1–2 copies per cell)
• Plasmids may be incompatible with each other
• Are used as vectors that can carry a foreign gene of interest (e.g. insulin)

[Figure: plasmid vector map with labels β-lactamase, ori, foreign gene]

Page 5: Genome Organization & Protein Synthesis and Processing in Plants

Eukaryotic genome

• Moderately repetitive
  – Functional (protein coding, tRNA coding)
  – Unknown function
• SINEs (short interspersed elements)
  – 200-300 bp
  – 100,000 copies
• LINEs (long interspersed elements)
  – 1-5 kb
  – 10-10,000 copies

Page 6: Genome Organization & Protein Synthesis and Processing in Plants

Eukaryotic genome
• Highly repetitive
  – Minisatellites
    • Repeats of 14-500 bp
    • 1-5 kb long
    • Scattered throughout genome
  – Microsatellites
    • Repeats up to 13 bp
    • 100s of kb long, 10^6 copies
    • Around centromere
  – Telomeres
    • Short repeats (6 bp)
    • 250-1,000 at ends of chromosomes

Page 7: Genome Organization & Protein Synthesis and Processing in Plants

Eucaryotic genomes
• Located on several chromosomes
• Relatively low gene density (50 genes per mm of DNA in humans)
• Contour length of DNA from a single human cell = 2 meters
• Approximately 10^11 cells = total length 2 x 10^11 m (2 x 10^8 km)
• Distance between sun and earth (1.5 x 10^8 km)
• Human chromosomes vary in length over a 25-fold range
• Carry organelle genomes as well
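The scale comparison on this slide is easy to check numerically. A minimal sketch, using only the slide's own figures (2 m of DNA per cell, 10^11 cells, 1.5 x 10^8 km from sun to earth); the function and constant names are illustrative.

```python
DNA_PER_CELL_M = 2.0    # contour length of DNA in one human cell (slide value)
CELL_COUNT = 1e11       # number of cells used on the slide
SUN_EARTH_KM = 1.5e8    # sun-earth distance (slide value)


def total_dna_km(cells: float, per_cell_m: float = DNA_PER_CELL_M) -> float:
    """Total DNA contour length over all cells, converted from metres to km."""
    return cells * per_cell_m / 1000.0


total = total_dna_km(CELL_COUNT)
print(f"total DNA length: {total:.1e} km")                    # 2.0e+08 km
print(f"sun-earth distances: {total / SUN_EARTH_KM:.2f}")     # 1.33
```

With these inputs the DNA of all cells laid end to end spans about 2 x 10^8 km, slightly more than one sun-earth distance.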

Page 8: Genome Organization & Protein Synthesis and Processing in Plants

Mitochondrial genome (mtDNA)

• Multiple identical circular chromosomes
• Size ~15 kb in animals
• Size ~200 kb to 2,500 kb in plants
• Over 95% of mitochondrial proteins are encoded in the nuclear genome
• Often A+T rich genomes
• mtDNA is replicated before or during mitosis

Page 9: Genome Organization & Protein Synthesis and Processing in Plants

Chloroplast genome (cpDNA)
• Multiple circular molecules
• Size ranges from 120 kb to 160 kb
• Similar to mtDNA
• Many chloroplast proteins are encoded in the nucleus (separate signal sequence)

Page 10: Genome Organization & Protein Synthesis and Processing in Plants

“Cellular” Genomes

[Figure: genome types by host. Viruses: viral genome (inside the capsid). Procaryotes: bacterial chromosome, plasmids. Eucaryotes: chromosomes (nuclear genome, inside the nucleus), mitochondrial genome, chloroplast genome.]

Genome: all of an organism’s genes plus intergenic DNA
Intergenic DNA = DNA between genes

Page 11: Genome Organization & Protein Synthesis and Processing in Plants

Estimated genome sizes

[Figure: logarithmic scale of genome size in nucleotides, from 1e1 to 1e12, showing the size ranges of viruses (1024), bacteria (>100), fungi, mitochondria (~100), plants and mammals. Number in ( ) = completely sequenced genomes.]

Page 12: Genome Organization & Protein Synthesis and Processing in Plants

Size of genomes (bp)

Epstein-Barr virus    0.172 x 10^6
E. coli               4.6 x 10^6
S. cerevisiae         12.1 x 10^6
C. elegans            95.5 x 10^6
A. thaliana           117 x 10^6
D. melanogaster       180 x 10^6
H. sapiens            3200 x 10^6

Page 13: Genome Organization & Protein Synthesis and Processing in Plants

Chromosome organization

[Figure: eucaryotic chromosome with a telomere at each end, the centromere, and the p-arm and q-arm]

Centromere:
• DNA sequence that serves as an attachment site for proteins during mitosis
• In yeast these sequences (~130 nt) are very A+T rich
• In higher eucaryotes centromeres are much longer and contain “satellite DNA”

Telomeres:
• At the ends of chromosomes; help stabilize the chromosome
• In yeast telomeres are ~100 bp long (imperfect repeats)
• Repeats are added by a specific telomerase

Telomere repeat structure:
5’ – (TxGy)n
3’ – (AxCy)n
x and y = 1 - 4; n = 20 to 100 (1500 in mammals)

Page 14: Genome Organization & Protein Synthesis and Processing in Plants

Gene classification

[Figure: simplified chromosome with genes separated by intergenic regions. Coding genes: messenger RNA, then proteins (structural proteins, enzymes). Non-coding genes: structural RNA (transfer RNA, ribosomal RNA, other RNA).]

Page 15: Genome Organization & Protein Synthesis and Processing in Plants

What is a gene?
• Definitions

1. Classical definition: portion of DNA that determines a single character (phenotype)

2. One gene – one enzyme (Beadle & Tatum 1941): “Every gene encodes the information for one enzyme”

3. One gene – one protein: “One gene contains information for one protein” (structural proteins included), i.e. one gene – one polypeptide

4. Current definition: a piece of DNA (or in some cases RNA) that contains the primary sequence to produce a functional biological gene product (RNA, protein).

Page 16: Genome Organization & Protein Synthesis and Processing in Plants

Coding region
Nucleotides (open reading frame) encoding the amino acid sequence of a protein

The molecular definition of gene includes more than just the coding region

Page 17: Genome Organization & Protein Synthesis and Processing in Plants

Noncoding regions

• Regulatory regions
  – RNA polymerase binding site
  – Transcription factor binding sites
• Introns
• Polyadenylation [poly(A)] sites

Page 18: Genome Organization & Protein Synthesis and Processing in Plants

Gene

Molecular definition: entire nucleic acid sequence necessary for the synthesis of a functional polypeptide (protein chain) or functional RNA

Page 19: Genome Organization & Protein Synthesis and Processing in Plants

Anatomy of a gene

• ORF. From start (ATG) to stop (TGA, TAA, TAG)
• Upstream region with binding site (e.g. TATA box)
• Poly(A) ‘tail’
• Splice sites. Introns are bounded by GT (5’ end) and AG (3’ end) splice signals.
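The ORF rule above (start at ATG, stop at the first in-frame TGA, TAA or TAG) translates directly into a scan over the sequence. A minimal sketch, assuming an intron-free DNA string and the forward strand only; `find_orfs` and its parameters are hypothetical names, and real gene finders also handle the reverse strand, splicing and minimum-length thresholds. It needs Python 3.9 or later for the built-in generic type hints.

```python
STOP_CODONS = {"TGA", "TAA", "TAG"}


def find_orfs(seq: str, min_codons: int = 1) -> list[tuple[int, int]]:
    """Return (start, end) index pairs of ORFs on the forward strand.

    An ORF runs from an ATG to the first in-frame stop codon; `end` is the
    index just past the stop codon. `min_codons` counts codons before the stop.
    Every ATG is tried, so nested ORFs sharing a stop codon are all reported.
    """
    seq = seq.upper()
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                break
    return orfs


dna = "CCATGAAATGGTAAGG"
print(find_orfs(dna))   # [(2, 14)]
print(dna[2:14])        # ATGAAATGGTAA
```

The second ATG in the example (at index 7) starts a frame with no stop codon before the end of the sequence, so it is not reported.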

Page 20: Genome Organization & Protein Synthesis and Processing in Plants

Bacterial genes

• Most do not have introns
• Many are organized in operons: contiguous genes, transcribed as a single polycistronic mRNA, that encode proteins with related functions

Polycistronic mRNA encodes several proteins

Page 21: Genome Organization & Protein Synthesis and Processing in Plants

Bacterial operon

What would be the effect of a mutation in the control region (a) compared to a mutation in a structural gene (b)?

Page 22: Genome Organization & Protein Synthesis and Processing in Plants

Eucaryotic genes

Hemoglobin beta subunit gene:
Exon 1: 90 bp
Intron A: 131 bp
Exon 2: 222 bp
Intron B: 851 bp
Exon 3: 126 bp

Introns: intervening sequences within a gene that are not translated into a protein sequence. Collagen has 50 introns.

Exons: sequences within a gene that encode protein sequences

Splicing: removal of introns from the pre-mRNA molecule.
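The effect of splicing on transcript length can be worked through with the figures on this slide. A minimal sketch, reading the slide's labels as exon 1 = 90 bp, exon 2 = 222 bp, exon 3 = 126 bp, intron A = 131 bp and intron B = 851 bp; the data layout and function names are illustrative.

```python
# (kind, length in bp) in 5'->3' order, as read from the slide.
BETA_GLOBIN_SEGMENTS = [
    ("exon", 90),
    ("intron", 131),
    ("exon", 222),
    ("intron", 851),
    ("exon", 126),
]


def unspliced_length(segments) -> int:
    """Length of the region before splicing: exons plus introns."""
    return sum(length for _, length in segments)


def spliced_length(segments) -> int:
    """Length after splicing: introns removed, exons joined."""
    return sum(length for kind, length in segments if kind == "exon")


print(unspliced_length(BETA_GLOBIN_SEGMENTS))     # 1420
print(spliced_length(BETA_GLOBIN_SEGMENTS))       # 438
print(spliced_length(BETA_GLOBIN_SEGMENTS) // 3)  # 146
```

The joined exons give 438 bp, or 146 codons, which agrees with the 146-residue beta-globin chain. That suggests the slide's exon sizes count coding sequence only, although the slide does not say so.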

Page 23: Genome Organization & Protein Synthesis and Processing in Plants

Regulatory mechanisms

• ‘organize expression of genes’ (function calls)
• Promoter region (binding site), usually near coding region
• Binding can block (inhibit) expression
• Computational challenges
  – Identify binding sites
  – Correlate sequence to expression

Page 24: Genome Organization & Protein Synthesis and Processing in Plants

Eukaryotic genes

• Most have introns
• Produce monocistronic mRNA: only one encoded protein
• Large

Page 25: Genome Organization & Protein Synthesis and Processing in Plants

Alternative splicing

• Splicing is the removal of introns
• The primary transcript (pre-mRNA) from some genes can be spliced into two or more different mRNAs

Page 26: Genome Organization & Protein Synthesis and Processing in Plants

“Nonfunctional” DNA

• Higher eukaryotes have a lot of noncoding DNA

• Some has no known structural or regulatory function (no genes)

[Figure scale: 80 kb]

Page 27: Genome Organization & Protein Synthesis and Processing in Plants

Types of eukaryotic DNA

Page 28: Genome Organization & Protein Synthesis and Processing in Plants

Duplicated genes
• Encode closely related (homologous) proteins
• Clustered together in genome
• Formed by duplication of an ancestral gene followed by mutation

[Figure: gene cluster with five functional genes and two pseudogenes]

Page 29: Genome Organization & Protein Synthesis and Processing in Plants

Pseudogenes

• Nonfunctional copies of genes
• Formed by duplication of ancestral gene, or reverse transcription (and integration)
• Not expressed due to mutations that produce a stop codon (nonsense or frameshift) or prevent mRNA processing, or due to lack of regulatory sequences

Page 30: Genome Organization & Protein Synthesis and Processing in Plants

Repetitive DNA
• Moderately repeated DNA
  – Tandemly repeated rRNA, tRNA and histone genes (gene products needed in high amounts)
  – Large duplicated gene families
  – Mobile DNA
• Simple-sequence DNA
  – Tandemly repeated short sequences
  – Found in centromeres and telomeres (and others)
  – Used in DNA fingerprinting to identify individuals

Page 31: Genome Organization & Protein Synthesis and Processing in Plants

Types of DNA repeats

Tandem repeats (e.g. satellite DNA)

5’-CATGTGCTGAAGGCTATGTGCTGCGACG-3’
3’-GTACACGACTTCCGATACACGACGCTGC-5’

Inverted repeats (e.g. in transposons)

5’-CATGTGCTGAAGGCTCAGCACATCGACG-3’
3’-GTACACGACTTCCGAGTCGTGTAGCTGC-5’

• Form stem-loop structures

Palindromes = adjacent inverted repeats (e.g. restriction sites)
• Form hairpin structures

[Figure: stem-loop (stem and loop labelled) and hairpin structures]

Perfect repeats vs degenerate repeats
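An inverted repeat is a segment whose reverse complement appears further along the same strand, and a palindrome is the special case where the two arms are adjacent. A minimal sketch that checks both on the slide's example sequence; the function names are hypothetical, only perfect repeats are found, and the EcoRI site GAATTC is used as a well-known palindromic restriction site.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindrome(site: str) -> bool:
    """True if the site reads the same 5'->3' on both strands."""
    return site == reverse_complement(site)


def find_inverted_repeats(seq: str, arm_len: int) -> list[tuple[int, int]]:
    """Return (i, j) where seq[j:j+arm_len] is the reverse complement of
    seq[i:i+arm_len] and the second arm starts at or after the end of the first."""
    hits = []
    for i in range(len(seq) - arm_len + 1):
        target = reverse_complement(seq[i:i + arm_len])
        j = seq.find(target, i + arm_len)
        while j != -1:
            hits.append((i, j))
            j = seq.find(target, j + 1)
    return hits


inverted = "CATGTGCTGAAGGCTCAGCACATCGACG"  # top strand of the slide's inverted-repeat example
print(find_inverted_repeats(inverted, 9))  # [(1, 14)]
print(inverted[1:10], inverted[14:23])     # ATGTGCTGA TCAGCACAT
print(is_palindrome("GAATTC"))             # True
```

In the slide's sequence the two 9-nt arms are separated by 4 nt, so the single strand could fold into a stem of 9 bp with a 4-nt loop.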

Page 32: Genome Organization & Protein Synthesis and Processing in Plants

Repetitive sequences

[Figure: caesium chloride density gradient separating satellite DNA from chromosomal DNA]

Repeats in the mouse genome

Type                    No. of repeats   Size               Percent of genome
Highly repetitive       > 1 million      < 10 bp            10 %
Moderately repetitive   > 1000           ~150 - ~300 bp     20 %

Page 33: Genome Organization & Protein Synthesis and Processing in Plants

DNA repeats and forensics

Gender determination
1) Standard technique: PCR amplification of the amelogenin locus (males = XY => 103 + 109 bp)
2) AluSTXa: Alu insertion on X
3) AluSTYa: Alu insertion on Y

[Figure: X-Y homologous regions carrying the AluSTXa and AluSTYa Alu insertions, and two gels with lanes M, F and Suspect; band sizes 878 bp and 556 bp on one gel, 528 bp and 199 bp on the other]

Page 34: Genome Organization & Protein Synthesis and Processing in Plants

Mobile DNA

• Move within genomes
• Most of moderately repeated DNA sequences found throughout higher eukaryotic genomes
  – L1 LINE is ~5% of human DNA (~50,000 copies)
  – Alu is ~5% of human DNA (>500,000 copies)
• Some encode enzymes that catalyze movement

Page 35: Genome Organization & Protein Synthesis and Processing in Plants

Transposition

• Movement of mobile DNA
• Involves copying of mobile DNA element and insertion into new site in genome

Page 36: Genome Organization & Protein Synthesis and Processing in Plants

Why?

• Molecular parasite: “selfish DNA”
• Probably have significant effect on evolution by facilitating gene duplication, which provides the fuel for evolution, and exon shuffling

Page 37: Genome Organization & Protein Synthesis and Processing in Plants

RNA or DNA intermediate

• Transposon moves using DNA intermediate

• Retrotransposon moves using RNA intermediate

Page 38: Genome Organization & Protein Synthesis and Processing in Plants

Types of mobile DNA elements

Page 39: Genome Organization & Protein Synthesis and Processing in Plants

LTR (long terminal repeat)
• Flank viral retrotransposons and retroviruses
• Contain regulatory sequences: transcription start site and poly(A) site

Page 40: Genome Organization & Protein Synthesis and Processing in Plants
Page 41: Genome Organization & Protein Synthesis and Processing in Plants

LINEs and SINEs
• Non-viral retrotransposons
  – RNA intermediate
  – Lack LTR
• LINEs (long interspersed elements)
  – ~6000 to 7000 base pairs
  – L1 LINE (~5% of human DNA)
  – Encode enzymes that catalyze movement
• SINEs (short interspersed elements)
  – ~300 base pairs
  – Alu (~5% of human DNA)

Page 42: Genome Organization & Protein Synthesis and Processing in Plants

Proteins

• Most protein sequences (today) are inferred
• What’s wrong with this?
• Proteins (and nucleic acids) are modified
• ‘Mature’ RNA
• Computational challenges
  – Identify (possible) aspects of molecular life cycle
  – Identify protein-protein and protein-nucleic acid interactions

Page 43: Genome Organization & Protein Synthesis and Processing in Plants

Genetic variation

• Variable number tandem repeats (minisatellites). 10-100 bp. Forensic applications.

• Short tandem repeat polymorphisms (microsatellites). 2-5 bp, 10-30 consecutive copies.

• Single nucleotide polymorphisms

Page 44: Genome Organization & Protein Synthesis and Processing in Plants

Single nucleotide polymorphisms

• About 1 per 2000 bp
• Types
  – Silent
  – Truncating
  – Shifting
• Significance: much of individual variation
• Challenge: correlation to disease
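Whether a single-base substitution in a coding region is silent or truncating can be decided by translating the codon before and after the change. A minimal sketch, assuming the standard genetic code and a substitution inside one known codon; `classify_snp` is a hypothetical helper, and the "missense" label (amino acid changed) is added here for completeness and is not on the slide. Reading-frame shifts arise from insertions or deletions rather than substitutions, so they are outside this sketch.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_snp(codon: str, position: int, new_base: str) -> str:
    """Classify a substitution at `position` (0, 1 or 2) of a DNA codon."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "silent"       # same amino acid (or stop) encoded
    if after == "*":
        return "truncating"   # new stop codon ends the protein early
    return "missense"         # different amino acid (label not on the slide)


print(classify_snp("GAA", 2, "G"))  # silent: GAA and GAG both encode Glu
print(classify_snp("TGG", 2, "A"))  # truncating: TGG (Trp) becomes TGA (stop)
print(classify_snp("GAG", 1, "T"))  # missense: GAG (Glu) becomes GTG (Val)
```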

Page 45: Genome Organization & Protein Synthesis and Processing in Plants

E. coli genome

• 4.6 x 10^6 bp. One chromosome. Published 1997.
• 4,285 protein-coding genes
• 122 structural RNA genes
• Repeats. Regulatory elements. Transposons.
• Lateral transfers.

Page 46: Genome Organization & Protein Synthesis and Processing in Plants

E. coli protein functions

Function              Genes   Percent
Regulatory            45      1.05
Cell structure        182     4.24
Transposons, etc.     87      2.03
Transport & binding   281     6.55
Putative transport    146     3.40
Replication, repair   115     2.68
Transcription         55      1.28
Translation           182     4.24
Enzymes               251     5.85
Unknown               1632    38.06