
MCD Nucleic Acids Alexandra Burke-Smith


1. Nucleic Acids and Chromosomes Dr Birgit Leitinger

DNA and RNA bases - purines and pyrimidines, the four bases of DNA (A, T, G, C) and of RNA (A, U, G, C)

Sugars in DNA and RNA - deoxyribose and ribose

Nucleotides and nucleosides

DNA: polymer of deoxyribonucleotide units

Primary sequence of DNA specifies genetic information

DNA double helix - antiparallel complementary strands

Melting and annealing of complementary strands

Watson-Crick base pairing: G-C and A-T

Genomes of E.coli (a single circular DNA molecule) and Homo Sapiens (46 chromosomes contain linear DNA)

The human karyotype

Packaging of eukaryotic DNA - chromatin and chromosomes

1. Outline the Chemical composition of DNA

Nucleic Acids

DNA and RNA are nucleic acids: linear polymer chains consisting of nucleotides joined by phosphodiester bonds

DNA is a linear polymer of deoxyribonucleotide units

A nucleotide consists of a nitrogenous base, a sugar and one or more phosphate groups

A nucleoside is composed of a nitrogenous base and a sugar (NO PHOSPHATE)

DNA and RNA each contain 4 different types of nucleotides that are arranged in different sequences.

Sugars- deoxyribose or ribose

The sugar in DNA is deoxyribose, lacking an oxygen atom that is present in ribose, the parent compound. Ribose is the sugar in RNA.

Primes (’) are used in numbering the carbon atoms in the ribose: 1’ to 5’. The 1’ C is linked to the base; the 5’ C is linked to the phosphate. The 3’ C has an OH group which is important in the structure of DNA.

DNA and RNA nitrogenous bases

There are 5 different bases; but each type of nucleic acid (DNA or RNA) contains only 4 bases:

Purines (big): Adenine (A) and Guanine (G)

Consists of TWO aromatic rings including carbon and nitrogen

Pyrimidines (Small): Cytosine (C) and Thymine (T)/ Uracil (U)- in RNA

Consists of ONE aromatic ring including carbon and nitrogen

The nitrogenous bases are linked to the pentose sugar (ribose or deoxyribose) by a beta-glycosidic linkage


Nucleosides vs Nucleotides

Base + (deoxy)ribose = nucleoside

Base Nucleoside

Adenine (deoxy)adenosine

Guanine (deoxy)guanosine

Cytosine (deoxy)cytidine

Uracil (deoxy)uridine

Thymine (deoxy)thymidine

Note: a nucleoside can also involve a ribose sugar instead of a deoxyribose, e.g. adenosine

Base + (deoxy)ribose + phosphate = nucleotide

E.g. adenosine monophosphate (AMP), deoxycytidine triphosphate (dCTP)

2. Describe the Structure of DNA

A single DNA strand is a long chain of deoxyribonucleotide units linked by phosphodiester links.

The 3’-OH group on the sugar of one nucleotide is linked to a phosphate group, which in turn is joined to the 5’-OH of the adjacent sugar.

On each deoxyribose there is a base.

The chain has two ends, the 5’ end and the 3’ end. It is not symmetrical.

The primary sequence is the linear sequence of the bases. By convention, the nucleotide sequence is specified in the 5’ to 3’ direction.

The bases carry genetic information, the sugar and phosphates perform a structural role.

The secondary structure of DNA is a right-handed double helix. The two chains in the helix run in opposite directions, i.e. they are antiparallel.

There are 10 base pairs per helical turn

The deoxyribose and phosphate groups run along the outside of the helix, with the negative charges outside.

The bases point inwards, with their flat planes perpendicular to the helix axis.

The two chains are held together by hydrogen bonds between the bases.

The two strands are complementary in their sequence due to the specificity of base-pairing. Adenine always pairs with Thymine (or Uracil in RNA); Guanine always pairs with Cytosine.

A and T form 2 hydrogen bonds, so an A-T pair is less stable than a G-C pair, which forms 3 hydrogen bonds

The DNA strands can be separated by melting (heat or low salt) and rejoined by re-annealing (cooling or high salt)

RNA can assume a variety of shapes.
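The pairing rules above (antiparallel strands, A with T, G with C, 2 or 3 hydrogen bonds per pair) can be sketched in a few lines of Python. This is only an illustration; the function names `complementary_strand` and `hydrogen_bonds` are made up for this example and do not come from the lecture.

```python
# Watson-Crick pairing as stated in the notes: A-T (2 H-bonds), G-C (3 H-bonds).
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complementary_strand(strand_5_to_3: str) -> str:
    """Return the partner strand, also written in the 5' to 3' direction."""
    # The strands are antiparallel, so pair every base and reverse the order.
    return "".join(PAIR[base] for base in reversed(strand_5_to_3))


def hydrogen_bonds(strand_5_to_3: str) -> int:
    """Count the hydrogen bonds holding this strand to its partner."""
    return sum(H_BONDS[base] for base in strand_5_to_3)


top = "ATGCGC"
print("5'-" + top + "-3'")
print("5'-" + complementary_strand(top) + "-3'  (partner strand)")
print("hydrogen bonds in the duplex:", hydrogen_bonds(top))
```

A GC-rich duplex has more hydrogen bonds than an AT-rich duplex of the same length, which is consistent with the statement that G-C pairs are more stable.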

3. Explain How cells package DNA

Genome The entire DNA coding for an organism constitutes its genome.

E.coli has 4.6 x 10^6 base pairs in a single circular double-stranded molecule. The length of E.coli DNA is 1.4 mm.

The human genome is approximately 3 x 10^9 base pairs of DNA divided into chromosomes that each contain a single, linear double-helical DNA molecule of ~ 200 x 10^6 base pairs. Chromosomes are visible only just before cells divide, not in non-dividing cells. Mitotic chromosomes: highly condensed form of chromosomes. Interphase chromosomes: extended form



The karyotype

A karyotype is an organised profile of someone’s chromosomes.

A diploid human cell has 46 chromosomes:

- 22 pairs of ‘normal’ chromosomes (autosomes)

- 2 sex chromosomes (XX or XY)

Abnormal karyotypes in breast tumour: 48 chromosomes instead of 46, multiple translocations, e.g. two pieces of chromosome 8 (green) plus a piece of chromosome 17 (purple).

Packaging of eukaryotic DNA

Eukaryotic DNA is tightly packaged, forming a complex with proteins. This complex is called chromatin. During interphase, when cells are not dividing, chromatin is more extended. Further condensation during mitosis produces chromosomes.

The lowest level of packaging is the nucleosome, which consists of DNA wrapped around histone proteins.

Nucleosome includes ~ 200 bp DNA (linker DNA plus core DNA)

Structure of the nucleosome

The nucleosome causes an approximately 7-fold condensation of DNA (200 bp ~ 68 nm when extended; ~ 10 nm when wrapped around the histone octamer).

The chain of nucleosomes is further packed to generate a more compact structure: the 30 nm fiber, giving ~40-fold condensation.

In the interphase nucleus the DNA is further condensed about 100-1000 fold.
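The 7-fold figure for the nucleosome follows from simple arithmetic, sketched below. The rise of 0.34 nm per base pair is an assumption for B-form DNA; it is not stated in the notes, but it is the value implied by "200 bp ~ 68 nm". The function names are illustrative only.

```python
# Assumed axial rise per base pair for B-form DNA (implied by 200 bp ~ 68 nm).
RISE_PER_BP_NM = 0.34


def extended_length_nm(base_pairs: int) -> float:
    """Length of naked, fully extended double-stranded DNA in nanometres."""
    return base_pairs * RISE_PER_BP_NM


def fold_condensation(base_pairs: int, packed_length_nm: float) -> float:
    """How many times shorter the packed form is than the extended DNA."""
    return extended_length_nm(base_pairs) / packed_length_nm


# One nucleosome: ~200 bp of DNA packed into a particle ~10 nm across.
print(f"extended: {extended_length_nm(200):.0f} nm")
print(f"condensation: {fold_condensation(200, 10.0):.1f}-fold")
```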

Chromosomes



2. DNA Replication, the cell cycle and mitosis Dr Birgit Leitinger

DNA replication is semi-conservative

DNA synthesis is mediated by DNA polymerases

DNA polymerases require a template and a primer

DNA synthesis occurs in the 5’ to 3’ direction

Nucleoside analogs lacking a 3’-OH terminate the chain

The replication fork, leading and lagging strands

Proof reading mechanism

Replication of E.coli and mammalian chromosomes

Phases of the life cycle of a eukaryotic cell

Stages of mitosis

1. Explain how cells duplicate DNA

DNA replication

Before each cell division, a faithful copy of the DNA needs to be made to ensure genetic continuity in the daughter cells

DNA replication is semi-conservative. Each daughter DNA molecule contains one old and one new strand.

The two strands are complementary to each other so each strand serves as a template for the synthesis of the other strand. This generates two identical copies.

The DNA helix is very stable and has to be unwound before replication can occur. This is done by a DNA helicase, an enzyme that uses ATP as a source of energy to break hydrogen bonds between base pairs.

New DNA is synthesised by enzymes called DNA polymerases. DNA polymerases add nucleotides to the 3’ end of a growing chain. DNA polymerases require a template strand, an oligonucleotide primer (a short sequence of polymerised nucleotides made by a different enzyme), and a supply of deoxynucleotide triphosphates (dNTPs)

The Enzyme reaction

DNA polymerases add dNTPs to the 3’ end of a DNA molecule.

DNA (and RNA) synthesis occurs in 5’ to 3’ direction.

Energy is released by hydrolysis of the triphosphate. This drives the reaction.

A free 3’ hydroxyl group is required.

Some nucleoside analogs are used as drugs; they act as chain terminators, e.g. acyclovir (herpes) and dideoxycytidine (ddC, zalcitabine; a drug for HIV). Chain terminators do not have a free 3’ hydroxyl group, therefore the polymerase cannot add any more dNTPs
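The requirement for a free 3’ hydroxyl group can be illustrated with a toy model of chain extension. This is a deliberately simplified sketch, not a description of the enzyme: the function `synthesise` and its parameters are hypothetical, the primer is modelled as DNA rather than RNA, and every base listed in `chain_terminators` is assumed to be incorporated as an analog lacking a 3’-OH.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesise(template_3_to_5: str, primer_len: int, chain_terminators=()) -> str:
    """Grow a new strand 5'->3' along a template that is read 3'->5'.

    template_3_to_5   : template bases in the order the polymerase meets them
    primer_len        : number of template positions already covered by the primer
    chain_terminators : bases supplied only as analogs without a 3'-OH (toy assumption)
    """
    # The primer provides the first free 3'-OH that the polymerase extends.
    new_strand = [PAIR[base] for base in template_3_to_5[:primer_len]]
    for base in template_3_to_5[primer_len:]:
        incoming = PAIR[base]
        new_strand.append(incoming)  # added to the 3' end of the growing chain
        if incoming in chain_terminators:
            break  # no 3'-OH on the analog, so nothing more can be added
    return "".join(new_strand)


template = "TACGGTCA"
print(synthesise(template, primer_len=2))                           # full-length product
print(synthesise(template, primer_len=2, chain_terminators={"C"}))  # terminated early
```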

The Replication Fork

Replication begins at discrete points on the DNA molecule called origins of replication.

The site of DNA synthesis is called a replication fork: the fork moves along during the process.

The templates for the two new daughter strands have opposite orientations: 3’ to 5’ and 5’ to 3’

The replication fork is asymmetric. Both strands are synthesised in a 5’-3’ direction. The leading strand is synthesised continuously (the top strand in the lecture diagram), whereas the lagging strand is synthesised in short pieces termed Okazaki fragments (the bottom strand)




RNA Primers

RNA primes the synthesis of new DNA

A specialised RNA polymerase called DNA primase synthesises a short RNA fragment (~ 5 nucleotides). The RNA primer is only transient and removed at a later stage of replication.

For the synthesis of the leading strand, an RNA primer is needed only to start replication at a replication origin.

Synthesis of the Lagging Strand

DNA primase synthesises multiple short RNA fragment primers

DNA polymerase adds to RNA primer, starting the Okazaki fragment, and finishes the particular section of the DNA fragment

A special ribonuclease removes RNA primer using a 5’ to 3’ exonuclease activity.

A repair DNA polymerase then replaces RNA with DNA.

DNA ligase joins the two fragments together using ATP. This makes the DNA strand continuous.

Replication Fork Proteins

DNA replication is performed by a group of proteins that work together with DNA polymerase and primase, forming a multi-enzyme protein machine.

Coordination between leading and lagging strands. Most of the proteins involved in DNA replication are held together as a large multi-enzyme complex.

The looping of the template for the lagging strand ensures that the two DNA polymerases can work together closely and that the two new strands are synthesised in the same orientation.

The sliding clamp makes sure the DNA polymerase stays attached during DNA synthesis, and forces

Single stranded DNA-binding proteins make sure the DNA remains extended

Accuracy of DNA replication

The high fidelity of DNA replication requires a proof-reading mechanism to ensure no mistakes are made.

Mutations (changes in DNA sequence) can be very dangerous to the organism. Any replication errors that are not corrected are passed on to the daughter cells.

DNA replication has an error frequency of about 1 change per 10^9 base pairs.

Before a new nucleotide is added, the previous nucleotide is checked for correct base-pairing.

Any incorrect bases are removed by the 3’ to 5’ exonuclease activity of DNA polymerase. A new, correct nucleotide is then added.
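To put the error frequency in perspective, the short calculation below estimates how many changes would be expected each time a genome is copied. It is a back-of-the-envelope sketch that assumes errors are independent and uses the genome sizes quoted earlier in these notes; the function name is illustrative.

```python
# About 1 change per 10^9 base pairs copied (figure from the notes).
BASE_PAIRS_PER_ERROR = 10**9


def expected_errors(genome_size_bp: int) -> float:
    """Expected number of changes per round of replication, assuming independence."""
    return genome_size_bp / BASE_PAIRS_PER_ERROR


human_haploid_bp = 3 * 10**9   # approximate human genome size from the notes
e_coli_bp = 46 * 10**5         # 4.6 x 10^6 base pairs from the notes

print("human (haploid genome):", expected_errors(human_haploid_bp))
print("E.coli:", expected_errors(e_coli_bp))
```

On these assumptions a human cell would introduce only a few changes per genome copy, and an E.coli cell far fewer than one.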

Replication of the E.coli chromosome

In E.coli, replication starts at a unique origin, OriC.

Two replication forks proceed simultaneously in opposite directions (the newly forming DNA creates a loop shape).

The two forks meet at the other side of the circular chromosome. This is known as BI-DIRECTIONAL REPLICATION



Replication of eukaryotic genome

Eukaryotic chromosomes are linear and very long.

Multiple replication origins are distributed at intervals of about 100 kilobase pairs.

Each replication origin gives bi-directional replication forks. Replication is finished when all the forks have met.

2. Describe the cell division cycle and explain how chromosomes segregate as the cell divides

M phase: Mitosis; cell division; 1 hr

G1 phase: Gap phase 1 (prior to DNA synthesis); 10 hrs

S phase: period of DNA synthesis (replication); 9 hrs

G2 phase: Gap phase 2 (between DNA synthesis and mitosis); 4 hrs

G0: cells which have stopped dividing

G1, G0, S, and G2 are Interphase
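As a quick check on the phase durations listed above, the sketch below adds them up and shows what fraction of the cycle each phase occupies. The durations are the ones given in these notes for a typical dividing cell; G0 is left out because it has no fixed length.

```python
# Phase durations in hours, as listed in the notes.
PHASE_HOURS = {"G1": 10, "S": 9, "G2": 4, "M": 1}

total_hours = sum(PHASE_HOURS.values())
interphase_hours = total_hours - PHASE_HOURS["M"]

print("total cycle length:", total_hours, "hours")
print("interphase:", interphase_hours, "hours")
for phase, hours in PHASE_HOURS.items():
    print(f"{phase}: {hours / total_hours:.0%} of the cycle")
```

The cycle adds up to one day, of which mitosis itself takes only about an hour.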

1) Interphase G2: Chromosomes not visible

2) Late prophase: Condensed chromosomes, each contains two sister chromatids

3) Metaphase: Condensed chromosomes, aligned on central plane of spindle

4) Anaphase: Sister chromatids move to opposite poles of spindle. Centromere splits, ensuring each new cell gets one half of each replicated chromosome

5) Telophase: Two daughter cells formed

6) Interphase G1: Condensation process reversed



3. Gene Organisation & Transcription Part 1- Gene Transcription Dr Laki Buluwela

1. Describe the basic differences between DNA and RNA

DNA: Double-stranded polynucleotide formed from two separate chains of covalently linked deoxyribonucleotide units. It serves as the cell’s store of genetic information that is transmitted from generation to generation

Deoxyribose and phosphate backbone

Adenine, guanine, cytosine and thymine bases

Stability: Deoxyribose sugar less reactive because of C-H bonds. Stable in alkaline conditions. Minor groove means harder for enzymes to attack DNA.

RNA: A (usually) single-stranded polynucleotide in the form of a chain of covalently linked ribonucleotide subunits. It is synthesized when an RNA polymerase copies the nucleotide sequence of DNA. RNA serves a variety of functions in cells including protein synthesis, and can be of the form messenger RNA, micro RNA, ribosomal RNA or transfer RNA.

Ribose and phosphate backbone

Adenine, guanine, cytosine and uracil bases

Stability: Ribose is more reactive because of C-OH bond. Not stable in alkaline conditions. Has larger grooves, so it is more easily attacked by enzymes.

DNA and Genes

• DNA is the hereditary material.

• DNA is organised into functional units called “Genes”.

• The Gene is the unit of inheritance

• Genes encode information for making proteins and functional RNAs

The human Genome

The complete DNA sequence of an organism is called a “Genome”

22 autosomal pairs, and a pair of sex chromosomes (XX or XY)

Each nucleated cell in the Human body has the same DNA content of a full genome

The DNA in a diploid human cell consists of ~ 6 x 10^9 base pairs in total and is approx 2 m long
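The figure of about 2 m can be reproduced from the base-pair count. The sketch below assumes a rise of 0.34 nm per base pair (B-form DNA), which is not stated in this section, and treats all the DNA of one diploid cell as laid end to end.

```python
BASE_PAIRS_PER_DIPLOID_CELL = 6 * 10**9   # from the notes
RISE_PER_BP_M = 0.34e-9                   # assumed: 0.34 nm per base pair, in metres

total_length_m = BASE_PAIRS_PER_DIPLOID_CELL * RISE_PER_BP_M
print(f"total DNA length per cell: {total_length_m:.2f} m")
```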

Gene Expression

DNA → RNA → PROTEIN

The initial product of gene expression is always RNA

In some cases this RNA is functional (e.g. tRNA, rRNA, snRNA)

In many cases this RNA is the template for protein translation (eg mRNA)

Transcription occurs in the cell nucleus. The RNA is then exported to the cell cytoplasm, to be used in “Protein Translation”

Typically, 10,000 - 30,000 different genes are expressed in Human Cells / Tissues

Some genes are expressed in all cells

~ 25% of genes expressed in the cell are required for cell specific function.

Differences in Gene Expression Explain Cellular Specialisation: Both Red Cells and lymphocytes are “related” as they develop from a common progenitor cell. However, differentiated cells express different genes related to their function, e.g. B-Lymphocytes make “Antibodies” and function in the immune system, and Red Blood Cells contain “Haemoglobin” and function in oxygen transport

Regulation of gene expression can occur at many levels: transcription, RNA processing, RNA transport, RNA turnover, translation and protein activity



2. Describe what is meant by “Transcription”

The process in which nucleotide information in the DNA is copied into RNA is called TRANSCRIPTION

DNA → RNA

1. DNA strand unwinds

2. Ribonucleotide bases are joined by phosphodiester bonds. The RNA chain grows one base at a time in a 5’ ->3’ direction (nucleotides are added to the 3’ end)

Proteins involved with Transcription

• Gene transcription is carried out by enzymes called “RNA Polymerases”

• Gene transcription also involves special gene regulatory proteins called “Transcription Factors”

• The “start” of a gene contains DNA sequences that are important in bringing about transcription. These sequences, at which the transcription complex assembles, are called “Gene promoters”

3. List the major functional classes of RNA and the classes of RNA polymerases involved in synthesising each of these.

Functional classes of RNA

Transfer

Ribosomal

Messenger

Eukaryotic RNA Polymerases

Eukaryotic cells contain three types of RNA polymerases:

RNA Polymerase I - Transcribes rRNA genes

RNA Polymerase II - Transcribes genes encoding proteins into mRNA

RNA Polymerase III - Transcribes tRNA and 5S RNA genes

4. Describe what is meant by a “gene promoter”

The elongation reaction of RNA synthesis requires a separate and distinct initiation step to build a transcription complex. The DNA sequence at which the initiation complex assembles is called a gene promoter.

5. Describe what is meant by a “Transcription factor”

The rate/level of transcription from a given gene is regulated by the activity of DNA binding proteins, or Transcription Factors.

These can be transcriptional activators, or transcriptional repressors, and hence are useful in regulating gene expression. They act collectively to bring about cell specific / developmental / inducible gene expression.

6. Describe, with the aid of diagrams, the processes involved in transcribing a eukaryotic gene.

Components of the Basal Transcription Complex (diagrams on powerpoint)

The Basal transcription complex allows RNA polymerase II to be phosphorylated and then engage in transcription. It assembles at the gene promoter and works at a constant speed.

In the absence of binding of other Transcription Factors (which affect the efficiency of the Basal Transcription complex and hence control the rate/level of transcription) this produces a Basal (low) level of transcription.

Transcription factors (TFs) bind specific DNA sequences and “bend DNA” on binding, i.e. cause the DNA to start to unwind. They can interact with each other and the Basal Transcription Complex to modulate transcription.



Step 1

TFII D contains TATA Binding Protein (TBP; specifies the initiation point for transcription by RNA polymerase II) and TBP-Associated Factors (TAFs). On initially binding to DNA, TFII D:

• Partially unwinds the DNA helix, widening the minor groove to allow extensive contact with bases within the DNA.

• This unwinding is asymmetric with respect to the TBP-TATA complex, thereby assuring transcription is unidirectional

Step 2

• TFII A and TFII B bind

• TFII B is particularly important, as it is able to bind to TFII D and RNA Polymerase II

Step 3

• RNA polymerase binds to TFII B (with TFII F bound to itself)

Final Steps

• involve the binding of TFII E, TFII H and TFII J to RNA polymerase

• TFII H promotes further unwinding of the DNA helix to facilitate RNA synthesis by RNA Polymerase II.

Transcription Factor Activity

• Transcription factors also facilitate transcription by helping to remodel “chromatin”. They do this by recruiting proteins with enzymatic activities that modify histones.

• The Nucleosome (histone protein-DNA complex) is the building block of chromatin

• Histone tails extend from the nucleosome, and are thus subject to post-translational modification

• Acetylation of the histones gives the nucleosome a more open structure, therefore more transcription occurs.

• Hence hyperacetylation correlates with gene expression

• Hypoacetylation correlates with gene repression

Factors affecting Transcription Factor Expression

Cell lineage is a major determinant, but also signals external to the cell, e.g.

• Hormones
• Growth Factors
• Mechanical Stress
• Heat
• Cell contact
• Light
• Touch
• Voltage

Gene Regulation in Human Disease

• Mutated Transcription Factors have been implicated in several human hereditary disorders

• Abnormal Transcription Factor expression is found in several human cancers

• Mutations affecting the regulation of specific genes have been described in certain human diseases

Example 1: Inflammation and Aspirin

• Inflammation occurs with tissue damage, infection, etc.

• NFkB is a transcription factor that is held in the cytoplasm by IkB (an inhibitor of NFkB). An external signal (e.g. mechanical stress) triggers the breakdown of IkB; NFkB then goes into the nucleus, where it starts transcription → production of cytokines → inflammation

• Aspirin acts to inhibit the breakdown of IkB. Consequently, NFkB remains in the cytoplasm and is unable to initiate transcription of cytokine genes, therefore reducing inflammation.

Example 2: In Leukaemia



• Over half of Acute Lymphoblastic Leukaemias (ALL) involve mutated transcription factors

• These mutations are often caused by chromosomal translocations.

Example 3: The oestrogen receptor in Breast Cancer

• Oestrogen is a steroid hormone (able to get across cell membranes easily)

• Cells that respond to oestrogen contain a protein called the oestrogen receptor.

• The oestrogen receptor is a transcription factor that regulates “oestrogen regulated genes”

• Many oestrogen regulated genes are involved with cell growth

• Over half of all breast cancers “over express” the oestrogen receptor, leading to increase in cell growth

• Breast cancer treatment involves the use of anti-oestrogens, such as tamoxifen

• Tamoxifen mimics oestrogen, therefore binds to the oestrogen receptor and inhibits cell growth, e.g. tumour growth

7. Distinguish “sense” and “anti-sense” DNA template strands.

Sense strand: The sense DNA strand reads in the same direction as the RNA strand formed. It is not complementary to the RNA single strand, but rather has the same sequence (with U in the RNA replacing T)

Antisense strand: the DNA strand from which the RNA is built up, i.e. it is complementary to the synthesised RNA, and is read in the opposite direction.
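The relationship between the two DNA strands and the RNA can be checked with a small sketch. The function names `transcribe` and `sense_to_rna` are invented for this illustration, and the six-base sequences are arbitrary examples.

```python
# Base pairing between a DNA template and the RNA being made (U pairs with A).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(antisense_3_to_5: str) -> str:
    """Build RNA 5'->3' by pairing with the antisense strand read 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in antisense_3_to_5)


def sense_to_rna(sense_5_to_3: str) -> str:
    """The RNA has the same sequence as the sense strand, with U instead of T."""
    return sense_5_to_3.replace("T", "U")


sense = "ATGGCT"       # written 5'->3'
antisense = "TACCGA"   # its partner, written 3'->5'

print(transcribe(antisense))   # RNA made from the antisense (template) strand
print(sense_to_rna(sense))     # same sequence, read directly off the sense strand
```

Both routes give the same RNA, which is why gene sequences are usually quoted as the sense strand.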



4. Gene Organisation & Transcription Part 2- mRNA Processing Dr Laki Buluwela

Transcription in E.coli

Bacteria only have one type of RNA polymerase

The Newly transcribed RNA is ready for use in translation

Certain antibiotics inhibit transcription in bacteria e.g. rifamycins

1. Describe, with the aid of diagrams, the events that take place in pre-mRNA processing, including an explanation of the splice-donor site, splice-acceptor site, and the “lariat” intermediate in mRNA splicing.

RNA Processing

pre-mRNA → mRNA

• The initial RNA produced from a gene is known as a “primary transcript” or “pre-mRNA” or “heterogeneous nuclear RNA” (hnRNA)

• This needs to be processed to make mRNA before translation

• This occurs in the nucleus, and then the mRNA is exported to the cytoplasm for translation

The anatomy of gene (see powerpoint for diagram)

• The promoter lies at the 5’ end

• The sequence information contained in the final mRNA is encoded “discontinuously” in the DNA of the gene

• Segments of the gene which contain sequences that form part of the final RNA are called “exons”

• “Introns” are sequences in the gene which are transcribed but are edited out of the final mRNA

The sequence of events in mRNA splicing

Introns start with the sequence “GU” and end with the sequence “AG”. This knowledge is used as “landmarks” to help remove the introns

1. RNA processing uses small nuclear ribonucleoproteins, or snRNPs.

The first to be used is U1, which binds to the splice donor sequence (AGGU)




2. The next to bind are snRNPs U2, U4, U5 and U6.

U2, U4 and U6 bind to the general intron

U5 binds to the splice acceptor site (Pyr15NCAG; Pyr15 = run of 15 pyrimidines, N = any base)

3. The binding of U2, U4, U5 and U6 completes the formation of the splicing complex or “spliceosome” (sub-organelle of the nucleus)

This results in a cleavage of the splice donor sequence at the end of the exon and beginning of the intron

4. The free end of the intron is thus able to move, and is encouraged to fold back to a specific “A” residue in the intron which is used as the “branchpoint” in an intermediate step in mRNA splicing

5. The “branch” results from a phosphodiester bond between the 5’ phosphate on the “G” at the start of the cleaved intron and the 2’ OH of the branchpoint “A”, creating a loop in the RNA

6. The phosphodiester bond between the “G” at the end of the intron and the next exon is then cleaved and the intron removed as a “lariat” structure

7. The exposed, adjacent exon sequences are finally “ligated” (joined) together (AG)
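The net effect of the steps above (cut at the donor site, cut at the acceptor site, join the exons) can be sketched as a toy string operation. This is only an illustration under strong assumptions: the intron coordinates are supplied by hand rather than recognised by snRNPs, the lariat intermediate is not modelled, and the function name `splice` and the example sequences are hypothetical.

```python
def splice(pre_mrna: str, introns: list) -> str:
    """Remove introns given as (start, end) index pairs and join the exons.

    Each intron must begin with GU and end with AG, the landmark
    sequences mentioned in the notes; otherwise a ValueError is raised.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"not a valid intron: {intron!r}")
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the final exon
    return "".join(exons)


exon1, intron, exon2 = "AUGGCAG", "GUAAGUCUAACCCUUUCAG", "GCUUAA"
pre_mrna = exon1 + intron + exon2
mature = splice(pre_mrna, [(len(exon1), len(exon1) + len(intron))])
print(mature)
```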

2. Define the function of the “Spliceosome”

The spliceosome is the complex formed by the binding of the small nuclear ribonucleoproteins (snRNPs) to pre-mRNA during mRNA processing; it coordinates the splicing of pre-mRNA to form mature mRNA.

3. Describe the addition of a “cap” and “poly A tail” to pre-messenger (hn-) RNA.

Post-transcriptional modification of mRNA

Addition of the 5’ cap:

• Added to 5’ end of mRNA

• Formed by hydrolysis of the terminal triphosphate of the mRNA to a diphosphate.

• This then reacts with a phosphate of GTP to form a 5’-5’ phosphate linkage.

• The cap is further modified by methylation at the N7 position in the purine (guanine) ring to form a 7-methylguanylate cap.

• Acts to protect mRNA at the 5’ end

• Enhances the translation of mRNA.

Polio

A highly infectious disease caused by a virus that invades the nervous system and interferes with the recognition of the cap during translation.

Enters the body through the mouth and multiplies in the intestine

Mainly affects children <3 yrs old

Initial symptoms are fever, fatigue, headache, vomiting, stiffness in the neck and pain in the limbs.

1 in 200 infections leads to irreversible paralysis

Vaccine has allowed us to almost eradicate the disease

By targeting the 5’ cap, cells become committed to making polio proteins instead of host proteins

Addition of Poly-A tail

• Known as Polyadenylation

• Added one base at a time

• Added 11-30 bases downstream (3’ end) of the sequence AAUAAA, which is found in all mRNAs

• Artificially engineered mRNAs without the poly-A tail are less stable and degrade rapidly

4. With examples, describe how mutations in splice sites feature in human disease.

A mutation is a heritable change in the sequence of a gene.

• ~ 13% of cases feature mutations in gene promoters

• ~ 6 % feature mutations of the polyadenylation sequence

• ~ 1.2 % feature mutations of RNA capping

• ~ 33% feature mutations in splice donor / acceptor sequences

• the remaining 47% of cases involve mutations that alter the structure / function of the protein being produced, e.g. in cancer



E.g. Thalassemia

• derived from Greek meaning sea

• genetic, haematological disease in which there is an imbalance in the relative amounts of alpha-chains and beta-chains making up Haemoglobin.

• originally characterized by Dr Cooley and Dr Pearl Lee

• very common in the Mediterranean, SE Asia, China

• certain overlaps with SCA, as carriers have resistance to Malaria

• symptomatic (severe anaemia) by 12 months old

1) Hepatosplenomegaly/Hepatomegaly (enlarged spleen and liver)

2) Extramedullary Hematopoiesis (massive increase in RBC production, extending outside the bone marrow)

- massive RBC breakdown

- iron overload (hemosiderosis), and elevated GI absorption of iron due to chronic anaemia

- hepatic fibrosis, cirrhosis, darkening of skin by iron-stimulated melanin production, cardiomyopathy, congestive heart failure

• Blood transfusions/marrow transplants useful for management of disease

• Beta-thalassemia: relative deficiency of beta chains. Several types of beta-thalassemia feature splice site mutations in the beta-globin gene.

• The intron therefore cannot be spliced out, so mature mRNA is not formed, and translation of a non-functional protein occurs



5. Protein Translation and post-translation modification Professor Tony Magee

1. Outline the mechanisms by which ribosomes can translate a mRNA sequence into a protein sequence

2. Describe the role of aminoacyl tRNAs in ensuring the fidelity of the genetic code

3. State how a ribosome recognises the start and end of a sequence to be translated

4. Explain why some antibiotics inhibit protein synthesis in prokaryotes but not eukaryotes

5. Identify the features of a newly-synthesised protein that are required for it to enter the secretory pathway

6. Give examples of the ways in which newly-synthesised proteins can be post-translationally modified e.g. insulin

There is a linear relationship between the information encoded within DNA (the genetic code) and the proteins that are synthesised using that information

Three nucleotides encode one amino acid, hence the name “triplet code”; a group of three nucleotides is called a codon

Eukaryotic DNA contains non-coding introns and coding exons

Protein synthesis always starts with Met =AUG and finishes with a stop codon =UAA, UAG or UGA

Transfer RNAs are the transporters of amino acids (bound to the 3’ terminus) to the ribosome; they contain an anticodon loop that binds to the corresponding codon on the mRNA

Structure of a typical mRNA

5’ CAP - 5’ UTR – Coding Region – 3’ UTR – poly A

CAP: 7-Methyl guanosine (entry site for ribosome)

UTR: untranslated region

Poly A: protects mRNA

Genetic Code

64 potential codons, only 20 amino acids, therefore most amino acids are coded for by more than one codon. The code is said to be degenerate

Rare amino acids have fewer codons e.g. Met (AUG), Trp (UGG) have a single codon

AUG (Met) is the start codon for all protein sequences

All protein sequences have “stop codons”: UAA, UAG, UGA

Reading mRNA

The ribosome starts by finding the 5’ CAP, recognises it, then scans 5’ – 3’

Translation starts at first AUG, which sets the “frame” in succeeding triplet codons to bring amino acids

Translation stops at first “in frame” termination codon

Overview

Messenger RNA (mRNA) is transcribed from DNA (pre-mRNA) and processed (spliced) in the nucleus

mRNA is transported out of the nucleus and translated in the 5’->3’ direction into protein in the cytoplasm and on the rough endoplasmic reticulum

The machinery for translating mRNA is the ribosome, a large 2 subunit complex of proteins and ribosomal RNA (rRNA)
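The genetic code and the reading rules described above (64 codons, start at the first AUG, read in frame, stop at the first in-frame stop codon) can be sketched as follows. The compact 64-letter string is the standard genetic code written in U, C, A, G order with one-letter amino acid symbols and `*` for stop; the function names `codons_for` and `translate` are illustrative, and the scan from the 5’ cap is reduced here to a simple search for the first AUG.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid: str) -> list:
    """All codons that encode the given one-letter amino acid (or '*' for stop)."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def translate(mrna_5_to_3: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna_5_to_3.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    protein = []
    for i in range(start, len(mrna_5_to_3) - 2, 3):
        amino_acid = CODON_TABLE[mrna_5_to_3[i:i + 3]]
        if amino_acid == "*":
            break  # in-frame termination codon
        protein.append(amino_acid)
    return "".join(protein)


print(len(CODON_TABLE), "codons for", len(set(AMINO_ACIDS) - {"*"}), "amino acids")
print("Met:", codons_for("M"), "Trp:", codons_for("W"), "Leu:", codons_for("L"))
print(translate("GGAUGGCUUGGUAAAUG"))
```

In the example the second AUG is never read, because translation has already stopped at the in-frame UAA.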




Transfer RNA

The transporter of amino acids to the ribosome for translation (at least one per amino acid)

Hydrogen bonds within itself to form a compact structure (represented by a clover shape)

Anti-parallel binding- like DNA

3 bases (anticodon) are complementary to codon on mRNA, i.e. mRNA = 5’-3’, tRNA = 3’-5’

“Wobble”- the first two bases on an anticodon are really important with respect to the binding energy with mRNA. The wobble is that the 3rd base has a lower binding energy so may accept a different base, resulting in a different primary sequence.

Aminoacyl tRNA synthetases

One for each amino acid

Important role in fidelity of translation (proof reading mechanism)- selectivity for correct amino acids using the hydrolysis of ATP and hydrolysis of incorrect amino acid releasing AMP (adenosine monophosphate)

Mutations of these enzymes are linked with cancer, neuropathies, autoimmunity, metabolic disease

Initiation

Step 1: dissociation of ribosome subunits (40S + 60S)

Step 2: assembly of preinitiation complex containing Met-tRNA + initiation factors (eIFs) + 40S subunit + GTP (guanosine triphosphate). Only initiator Met-tRNA can bind to 40S subunit alone. 40S subunit is primarily involved in tRNA and mRNA recognition

Step 3: binding of mRNA to preinitiation complex, which sets the frame of translation. eIF4E and G bind to cap and are recognised by the preinitiation complex.

Step 4: binding of 60S subunit to the complex. The hydrolysis of a GTP molecule ensures correct base pairing.

Elongation

Step 1: binding of new tRNA carrying second amino acid to the adjacent A (aminoacyl) site, in frame with initiator Met (in P site)

Step 2: catalysis of peptide bond between the two amino acids by peptidyl transferase on the 60S subunit

Step 3: translocation of tRNA to P (peptidyl) site (original A site, which moves along) and dissociation of first tRNA. The cycle then repeats, but with the A and P site having shifted along

Elongation Factors (EFs) are proteins that promote movement of ribosome along mRNA using GTP. They use the energy of GTP to enhance the efficiency and accuracy of translation by providing “pauses” (e.g. the time taken for GTP hydrolysis) that allow incorrect base pairs to dissociate.

Termination

Step 1: recognition of in-frame stop codon. Release factors (proteins) bind to the empty A site; it is not a tRNA that recognises the stop codon.

Step 2: release of peptide chain. Peptidyl transferase catalyses transfer of the completed protein chain to water and releases it from the ribosome, by hydrolysing the bond between the tRNA and the peptide

Step 3: dissociation of release factors and ribosomes. These can all be re-used when the cycle repeats itself.

Polyribosomes

• Ribosomes do not work singly on a mRNA but in multiple copies on the mRNA – a polyribosome – like a string of beads

• This makes translation very efficient

• Translation speed of each ribosome = 15 amino acids/sec


• Multiple ribosomes translate the same mRNA simultaneously: for a 300 amino acid long protein with one ribosome every 30 amino acids of synthesised protein, the number of protein molecules produced in 1 min is ~4000

Antibiotics inhibit protein synthesis in prokaryotes

• Translational machinery is complex, easily disrupted – common target for antibiotics
• Antibiotics exploit differences between prokaryotic and eukaryotic ribosomes and translation factors, for example:

• Streptomycin - inhibits initiation
• Tetracycline - inhibits aa-tRNA binding
• Erythromycin - inhibits translocation
• Chloramphenicol - inhibits peptidyl transferase
• Puromycin - terminates elongation prematurely

Antibiotics selectively inhibit prokaryotes

Antibiotics are natural products of bacteria or fungi that give them a selective advantage over other microbes

INTRACELLULAR COMPONENTS

Most cellular compartments are bounded by a membrane, so the cell needs a mechanism to transfer proteins across membranes

Rough endoplasmic reticulum (RER): secreted and transmembrane proteins

Same ribosomes on RER as for cytoplasmic proteins

The first 20-25 amino acids that emerge from the ribosome are very hydrophobic (e.g. Leu); these form a particular signal sequence

• Step 1: recognition of signal sequence by a protein-RNA complex: “Signal-Recognition Particle” (SRP), which halts translation so that the ribosome, with its mRNA and nascent polypeptide, can be transported through the cytoplasm to the RER

• Step 2: binding of SRP to a receptor at the RER surface, translation resumes

• Step 3: translocation of the polypeptide into the lumen of RER through a specific protein channel

Note: Transmembrane proteins have an extra hydrophobic sequence holding them in the membrane, whereas secreted proteins will go completely into the lumen

• Step 4: cleavage of signal sequence by signal peptidase (co-translational). Folding of the polypeptide then occurs.

Post-translational modification

There are only 20 naturally occurring amino acids, so the cell uses post-translational modifications (over 200) to increase diversity (by changing the side chains in different amino acids), including:

• Disulphide bond formation (e.g. insulin)
• Proteolytic cleavage (e.g. insulin A and B chains)
• Addition of carbohydrate (Glycosylation)
• Addition of phosphate (Phosphorylation)
• Addition of lipid groups (Prenylation, Acylation)
• Hydroxylation (e.g. Collagen)

Example 1: Insulin biosynthesis in pancreatic β cells

• Insulin undergoes extensive post-translational modification along the production pathway, including disulphide bond formation (2) and proteolytic cleavage (4, 5)

Example 2: Glycosylation in the RER and Golgi complex

• Addition of sugars

• If the protein contains the specific amino acid sequence Asn-X-Ser/Thr (where X is any amino acid except proline), this sequence is recognised and the protein undergoes initial N-glycosylation and glycan processing (see Metabolism 8)


6. Analysis of nucleic acids Professor Tony Magee

Outline the following:

1. DNA Cloning

2. Restriction endonuclease digestion

3. Separation of DNA fragments

4. Nucleic Acid Hybridisation

5. Polymerase Chain Reaction (PCR)

6. Next generation sequencing

Definition of Cloning: A method of selectively amplifying DNA sequences of interest to generate homogeneous DNA populations.

Cell-based DNA cloning (in vivo)

1. Cutting a target DNA and a replicon (sequence capable of independent replication – e.g. plasmid, bacteriophage, Yeast Artificial Chromosome) with restriction endonucleases, so that the ends of the two DNA sequences are compatible

2. Mixing and joining the DNA fragments by using the enzyme DNA ligase

3. Transformation of the recombinant DNA molecules into host cells (bacteria, yeast)

4. Selective propagation of individual cell colonies on agar plate (selecting for an antibiotic resistance marker in the replicon; only host cells with replicon and target DNA survive)

5. Expansion of the cell culture and isolation of recombinant DNA (using gene markers)

Restriction endonucleases (RE)

Type II REs cleave DNA at specific palindromic sequences (i.e. the sequence reads the same 5’->3’ on both strands), which are usually 4-8 bp long

One half of a bacterial restriction-modification system (like a primitive immune system), whereby foreign DNA (e.g. from a bacteriophage) is “chopped” up. Host DNA is methylated (by a specific methylase) and is therefore protected.

Some produce “blunt” ends- symmetrical

Some produce “sticky” ends- asymmetrical with an overhang. These are extremely useful for cloning

The longer the recognition site, the less frequently it occurs in DNA, and therefore the less often that particular RE will cut

This allows you to tailor large genomic sequences of DNA into shorter ones
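
The points above (palindromic recognition sites, and longer sites cutting less often) can be checked with a short sketch. It assumes a random sequence with equal base frequencies, so an n bp site is expected about once every 4^n bp. The helper names are hypothetical. The EcoRI site GAATTC, cut after the G, is used as the example, and only the top strand is modelled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(site):
    """A site is palindromic if it reads the same 5'->3' on both strands."""
    return site == reverse_complement(site)


def expected_spacing(site_length):
    """Average distance (bp) between sites in random DNA.

    Assumption: all four bases are equally likely at every position.
    """
    return 4 ** site_length


def digest(seq, site, cut_offset):
    """Cut the top strand `cut_offset` bases into every occurrence of `site`."""
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


print(is_palindromic("GAATTC"))             # True
print(expected_spacing(4), expected_spacing(6), expected_spacing(8))
# 256 4096 65536
print(digest("AAGAATTCGG", "GAATTC", 1))    # ['AAG', 'AATTCGG']
```

Under the same assumption, a 4 bp cutter gives many small fragments and an 8 bp cutter gives few large ones.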

Separation of DNA fragments

Electrophoresis

- DNA phosphate backbone has negative charge, therefore will move towards the anode (+VE) when an electric field is applied

- When forced to move through a porous gel matrix (agarose / polyacrylamide gel) small fragments travel faster than larger ones

- After resolution, DNA can be isolated from the gel or transferred to a membrane to form a replica for hybridisation



Nucleic acid hybridisation

- method for detecting specific nucleic acid sequences in which homologous single-stranded DNA or RNA molecules combine to form double-stranded molecules, i.e. takes advantage of the complementary nature of DNA/RNA

- involves a labelled nucleic acid probe to identify target homologous molecules in a mixture of nucleic acids
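
As a minimal sketch of the principle, a single-stranded probe can only anneal where the target strand contains the probe's reverse complement. The function names and the short example sequences are hypothetical, and only perfect matches are considered.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_sites(target, probe):
    """Positions (0-based) in a single-stranded target where the probe anneals.

    Assumption: only exactly complementary, antiparallel pairing is counted.
    """
    site = reverse_complement(probe)
    return [
        i
        for i in range(len(target) - len(site) + 1)
        if target[i:i + len(site)] == site
    ]


target = "TTGACCGTAAGG"   # hypothetical denatured target strand, 5'->3'
probe = "TACGG"           # hypothetical labelled probe, 5'->3'
print(probe_sites(target, probe))    # [4]
print(probe_sites(target, "AAAAA"))  # [] (no complementary stretch)
```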

Nucleic Acid Blotting

Target DNA (e.g. from an agar plate or gel) is immobilised on a solid support - nylon or nitrocellulose membrane - which readily binds single-stranded nucleic acid (e.g. denatured DNA or mRNA)

The membrane is then hybridised with a solution of (radioactively or fluorescently) labelled probe

The sample is incubated and then washed to remove unbound probe

Bound probe is then detected, e.g. by exposure of photographic film

The DNA can then be isolated

Hybridisation assays - immensely important and versatile in nucleic acid analysis, e.g.
- Southern blot hybridisation (DNA target and DNA probe)
- Northern blot hybridisation (RNA target and DNA probe)
- Colony blot hybridisation (bacterial DNA target, DNA probe)
- Tissue in situ hybridisation (RNA target and RNA probe)
- Chromosome in situ hybridisation (chromosome target and DNA probe)
- Reverse hybridisation – Microarrays (immobilised DNA or oligonucleotide probe, target DNA solution)

Chromosome in situ hybridisation: used to locate specific genes on chromosomes using fluorescently labelled DNA probes

Melting temperature and Hybridisation Stringency

Energy requirements for denaturation

Denaturation of double-stranded DNA is achieved by heating until the hydrogen bonds between the bases holding the two strands together are disrupted, leaving the bases free to anneal with DNA probes

The energy needed to do this depends on:
- Strand length: longer strand = more hydrogen bonds to break
- Base composition: G-C pair has one more hydrogen bond than A-T, so harder to break
- Chemical environment: Monovalent cations (Na+) stabilise the DNA duplex by neutralising charge on phosphate backbone. Denaturants (formamide / urea) destabilise the DNA duplex, so less energy is required for separation.

Melting Temperature

Melting temperature (Tm) - measure of nucleic acid duplex stability. It is the midpoint temperature of the transition from double stranded (DS) to single stranded (SS) forms of nucleic acid (i.e. half-denaturation point)

For mammalian genomic DNA (40% GC) this is ~87 °C

Hybridisation is carried out at temperatures less than 25 °C below Tm
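
The dependence of Tm on base composition and salt can be illustrated with a commonly quoted empirical approximation for long DNA duplexes. The formula is not given in these notes and is an assumption brought in for illustration, as are the function name and the 0.2 M Na+ example. It ignores strand length and denaturants.

```python
import math


def approx_tm(gc_percent, na_molar):
    """Approximate Tm (deg C) of a long DNA duplex.

    Assumption: empirical rule Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC),
    which is only a rough guide and omits length and formamide terms.
    """
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent


# 40% GC DNA at an assumed 0.2 M Na+: about 86 deg C
print(round(approx_tm(40, 0.2), 1))

# More G-C pairs raise Tm; less Na+ lowers it.
print(approx_tm(60, 0.2) > approx_tm(40, 0.2))    # True
print(approx_tm(40, 0.02) < approx_tm(40, 0.2))   # True
```

Under these assumptions the estimate for 40% GC DNA lands close to the ~87 °C quoted above, and the two comparisons mirror the effects of base composition and Na+ listed under the energy requirements.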

Hybridisation stringency

Increasing the stringency lessens the likelihood of non-homologous annealing.

Hybridisation stringency (i.e. the power to distinguish between related sequences) increases with:
- Increase in temperature
- Decrease in Na+ concentration

Exactly complementary sequences will hybridise at higher temperature or lower salt than sequences which are not exactly complementary and have base mismatches

This makes it possible to distinguish between closely related sequences

Cell-free DNA cloning (in vitro) - Polymerase Chain Reaction (PCR)

Important in modern molecular biology


In vitro method to allow selective amplification of a specific target DNA within a heterogeneous collection of DNA sequences (e.g. total genomic DNA or complex cDNA population)

Sequence information of the gene of interest is used to design primers complementary to the two strands being copied (forward and reverse, approx 15-25 nucleotides in length)

Primers are specifically annealed to heat-denatured DNA by lowering temperature

Thermostable Thermus aquaticus (Taq) DNA polymerase extends 5’->3’ from the primers and generates new strands using dNTPs (deoxyribonucleotide triphosphates)

Taq polymerase comes from a hot springs bacterium, so it is adapted to work at high temperatures

Leads to a massive increase in target sequence

Denature = 94 °C, anneal = 50-60 °C, extend = 72 °C
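
The way repeated denature/anneal/extend cycles enrich the exact target sequence can be sketched by counting strands. The sketch assumes a single starting duplex and perfect efficiency (every strand is copied in every cycle). The function name is hypothetical. "Long" strands have one end defined by a primer, and "short" strands have both ends defined, i.e. exact target length.

```python
def pcr_duplexes(cycles):
    """Return (target_duplexes, total_duplexes) after a number of PCR cycles.

    Assumptions: one starting duplex, 100% efficiency. A target duplex is
    one in which both strands are exactly the primer-to-primer sequence.
    """
    original, long_, short = 2, 0, 0   # single strands before cycle 1
    target, total = 0, 1
    for _ in range(cycles):
        # Each template strand ends the cycle paired with its new copy.
        total = original + long_ + short
        target = short                 # short template + short copy
        # Copies: original -> long, long -> short, short -> short.
        long_, short = long_ + original, short + long_ + short
    return target, total


for n in (1, 2, 3, 4, 30):
    print(n, pcr_duplexes(n))
# 1 (0, 2)
# 2 (0, 4)
# 3 (2, 8)
# 4 (8, 16)
# 30 (1073741764, 1073741824)
```

Under these assumptions exact-length target duplexes first appear in cycle 3, make up half of all duplexes after cycle 4, and are essentially all of the ~10^9 duplexes after 30 cycles.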

Undergoes ~30 cycles, 20 hours. After 4 cycles half of the fragments are now target sequences

Primer Design

Length
- Usually about 20 nucleotides for a complex genomic DNA target – this gives the required specificity for target sequence

Base composition
- Avoid tandem repeats of nucleotides that can form hairpins
- %GC and length should give an ~equal Tm for each primer
- Otherwise one primer may bind without the other

3’ end
- Avoid complementarity of the bases at the 3’ ends
- Primer dimers may result, causing non-productive amplification: the primers are then extended instead of the sequence
- Unimportant at the 5’ end because polymerisation proceeds 5’->3’ (addition of dNTPs to 3’ end)
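
Two of the checks above (matched Tm, no complementarity at the 3’ ends) can be sketched as follows. The Tm estimate uses the Wallace rule of thumb for short oligonucleotides (2 °C per A/T, 4 °C per G/C). This rule is not taken from these notes and is an assumption, as are the function names, the example primers and the choice of 4 bases for the 3’ end check.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def wallace_tm(primer):
    """Rough Tm (deg C) of a short primer: 2*(A+T) + 4*(G+C).

    Assumption: Wallace rule, only a guide for oligos of about 14-20 nt.
    """
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def three_prime_complementary(primer_a, primer_b, n=4):
    """True if the last n bases of the two primers can pair (primer-dimer risk)."""
    return primer_a[-n:] == reverse_complement(primer_b[-n:])


forward = "ATGCATGCATGCATGCATGC"   # hypothetical 20-mer, 50% GC
print(wallace_tm(forward))                                  # 60
print(three_prime_complementary("GGGGACGT", "CCCCACGT"))    # True
print(three_prime_complementary("GGGGAAAA", "CCCCAAAA"))    # False
```

A primer pair would pass this sketch if the two `wallace_tm` values are close and `three_prime_complementary` is False.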

Applications

• Typing genetic markers (Restriction Fragment Length Polymorphisms)
• Detecting point mutations – restriction site changes, allele-specific amplification
• cDNA cloning
• Genome walking
• Gene expression – Reverse Transcription-PCR (reverse transcribe mRNA to DNA using oligo-dT [binds to polyA tail], then amplify); this measures mRNA levels
• Introducing mutations experimentally (base mismatches)
• DNA microarrays
• DNA sequencing – Human Genome

The next generation

Very high throughput DNA sequencing that will yield cheap sequencing of individual genomes – personalised medicine

1. Take collection of DNA/RNA fragments
2. Form a sequence library
3. PCR technology can make many short sequence reads
4. High computing power is then needed for sequence assembly

Note: this means we can sequence entire genome quickly and cheaply
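
To give a feel for what "sequence assembly" involves, here is a toy sketch that repeatedly merges the two reads with the longest suffix/prefix overlap. This greedy approach is an assumption chosen for brevity, not the method used by real assemblers, which must cope with sequencing errors, repeats and millions of reads. The function names and example reads are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the overlap is shorter than min_len.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge reads pairwise by best overlap until no overlaps remain."""
    reads = list(dict.fromkeys(reads))   # drop duplicate reads, keep order
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break                        # nothing left to join
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [
            r for idx, r in enumerate(reads) if idx not in (best_i, best_j)
        ] + [merged]
    return reads


# Hypothetical short reads covering the sequence ATGGCGTACGTTAGC
reads = ["CGTACGTT", "ATGGCGTA", "CGTTAGC"]
print(greedy_assemble(reads))   # ['ATGGCGTACGTTAGC']
```

Even in this toy version every read is compared with every other read on each pass, which hints at why assembling a whole genome from short reads needs high computing power.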