4: Genome evolution. Gene Duplication Gene Duplication - History 1936: The first observation of a...

Post on 19-Dec-2015


4: Genome evolution

Gene Duplication

Gene Duplication - History

1936: The first observation of a duplicated gene was in the Bar gene of Drosophila.

1950: Alpha and beta chains of hemoglobin are recognized to have been derived from gene duplication

1970: Ohno developed a theoretical framework of gene duplication

1995: Gene duplications are studied in fully sequenced genomes

Types of Genomic Duplications

•Part of an exon or the entire exon is duplicated

•Complete gene duplication

•Partial chromosome duplication

•Complete chromosome duplication

•Polyploidy: full genome duplication

Mechanism of Gene Duplication

Genes are duplicated mainly due to unequal crossing over

If two regions on the chromosome are similar in sequence, they can mispair, which increases the chance of unequal crossing over. This happens, for example, when both regions are copies of the same repeated sequence (microsatellite, transposon, etc.).
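The effect of a misaligned crossover can be illustrated with a minimal sketch. It assumes a hypothetical chromosome in which one gene is flanked by two copies of the same repeat (R); the segment names and breakpoint positions are illustrative only and do not come from the slides.

```python
def unequal_crossover(chrom_a, chrom_b, break_a, break_b):
    """Recombine two chromosomes at breakpoints that are not aligned.

    Each chromosome is a list of labelled segments. The first product
    joins chrom_a up to break_a with chrom_b from break_b onward; the
    second product is the reciprocal one.
    """
    product_1 = chrom_a[:break_a] + chrom_b[break_b:]
    product_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return product_1, product_2


# Hypothetical layout: one gene flanked by the same repeat on both sides.
chromosome = ["left", "R", "gene", "R", "right"]

# The downstream repeat of one chromatid (ends at index 4) mispairs with
# the upstream repeat of the other (ends at index 2), and the crossover
# takes place there.
duplicated, deleted = unequal_crossover(chromosome, chromosome, 4, 2)

print(duplicated)  # product carrying two copies of the gene
print(deleted)     # reciprocal product that has lost the gene
```

One product gains a copy of the gene and the reciprocal product loses it, so the total number of segments is conserved across the two products.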

After a Gene is Duplicated

Alternative fates:

1. It can die and become a pseudogene.

2. It can retain its original function, thus allowing the organism to produce double the amount of the derived protein.

3. The two copies can diverge and each one will specialize in a different function.

[Figure labels: identical copies; one copy dies; divergence.]

Invariant repeats

If the duplicated genes are identical or nearly identical, they are called invariant repeats. Many times the effect is an increase in the quantity of the derived protein, and this is why these duplications are also called “dose repetitions”.

Classical examples are the genes encoding rRNAs and tRNAs needed for translation.

Variant repeats

Some classic examples:

Trypsin, the digestive enzyme, and thrombin, which cleaves fibrinogen during blood clotting, were derived from a complete gene duplication.

Lactalbumin, which is involved in lactose synthesis, and lysozyme, which degrades bacterial cell walls, are also the result of an ancient gene duplication.


Dose Repetition

Gene duplication in mosquito as a response to insecticides

Kingdom = Metazoa (humans are also Metazoa)
Phylum = Arthropoda (humans are Chordata)
Class = Insecta (humans are Mammalia)
Order = Diptera (humans are Primates)
Genus = Culex (humans are Homo)
Species = pipiens (humans are sapiens)

Organophosphorous insecticides

Organophosphorous insecticides (e.g., parathion and malathion) interact with many enzymes. In particular, they inhibit acetylcholinesterase (AChE) activity in the central nervous system, which is lethal.

Acetylcholine is a neurotransmitter that, upon release from neurons, stimulates the opening of Na+ and K+ channels. These channels regulate the function of the brain as well as the heart, lungs, and skeletal muscles.

Acetylcholinesterase catalyzes the hydrolysis of acetylcholine into inactive acetate and choline.

Acetylcholinesterase

[Figure: cholinergic synapse, shown without and with insecticide. Labels: acetyl-CoA + choline, acetylcholine, cholinergic neuron, postsynaptic tissue, acetylcholinesterase, insecticide.]

Esterases

Esterases are detoxifying carboxylester hydrolases that are responsible for resistance to organophosphorous insecticides.

These enzymes are nonspecific.

Detoxifying esterases

[Figure: cholinergic synapse with insecticide and esterase. Labels: acetyl-CoA + choline, acetylcholine, cholinergic neuron, postsynaptic tissue, insecticide, esterase.]

Culex pipiens typically has 2 genes encoding esterases: Est-3 and Est-2. These genes are separated by an intergenic DNA fragment varying between 2–6 kb.

Est-3 Est-2

Alignment of predicted estα2 and estβ2 amino acid sequences of Culex quinquefasciatus

~47% similarity between the two sequences

[Biochem.J.(1997) 325,359-365]

Resistance alleles correspond to an over-production of esterase (which binds or metabolizes the insecticide) relative to the basal esterase production of susceptibility alleles. Several resistance alleles have been described.

Different alleles show 85-90% similarity.

Esterase starch gel

For most alleles, the over-production of esterase is the result of gene duplication. This concerns either one locus or both.

[Figure: duplication of the Est-3 (A) and Est-2 (B) loci. Labels: 47% similarity; ~100% similarity.]

Nomenclature for the various resistance genes and their products at the Ester resistance locus

Genetica 112–113: 287–296, 2001

The duplication of the two esterase loci explains the tight statistical association of some electromorphs, such as A2 and B2. Although A1, A2 and A4 are coded by alleles at the Est-3 locus, and B2 and B4 by alleles at the Est-2 locus, A1, A4-B4 and A2-B2 are considered alleles of a single superlocus (named Ester).

Independent amplifications have occurred only a few times.

The level of gene duplication varies between the different alleles:

EsterB1 can easily reach 100 copies in the field.

Ester4 has never been found above a few copies.

It varies also within and among populations for a given amplified allele.

Why the various amplified alleles have distinct limits of amplification is unknown.

Frequency of resistance allele in Montpellier (France)

[Figure: frequency of the resistance allele A1 along a transect (x-axis ticks at 1, 11, 21, 31 and 41 km), with the treatment area marked.]

The resistance allele has a cost for the mosquito. In the absence of insecticide in the environment, non-resistant mosquitoes have the highest fitness.
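The trade-off between resistance and its cost can be sketched with a simple one-locus haploid selection model. This is a minimal illustration, not the model used in the cited study: the fitness values, starting frequency and number of generations are hypothetical.

```python
def next_frequency(p, w_resistant, w_susceptible):
    """Frequency of the resistance allele after one generation of selection."""
    mean_fitness = p * w_resistant + (1 - p) * w_susceptible
    return p * w_resistant / mean_fitness


def simulate(p0, w_resistant, w_susceptible, generations):
    """Iterate the selection recursion for a number of generations."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, w_resistant, w_susceptible)
    return p


# Hypothetical relative fitness values.
# Treated area: the insecticide halves the fitness of susceptible mosquitoes.
treated = simulate(p0=0.5, w_resistant=1.0, w_susceptible=0.5, generations=20)

# Untreated area: resistance carries an assumed 10% cost.
untreated = simulate(p0=0.5, w_resistant=0.9, w_susceptible=1.0, generations=20)

print(f"treated area:   {treated:.3f}")
print(f"untreated area: {untreated:.3f}")
```

Under these assumptions the resistance allele rises toward fixation where insecticide is applied and declines where it is not, which is the qualitative pattern the slide describes.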

Geographic distribution of resistance allele

Genetica 112–113: 287–296, 2001


Gene Duplication in Aphids as a Response to Insecticides

The same story as in the mosquitoes.

A Few Words About Aphids

Kingdom = Metazoa (humans are also Metazoa)
Phylum = Arthropoda (humans are Chordata)
Class = Insecta (humans are Mammalia)
Order = Hemiptera (humans are Primates)
Genus = Myzus (humans are Homo)
Species = persicae (humans are sapiens)

There are around 4,000 species, of which ~250 are pests.

Myzus persicae likes lettuce. In fact, it is the most important aphid pest on lettuce.

E4 & FE4

Myzus persicae has 2 genes encoding esterases E4 and FE4, which are responsible for the resistance to organophosphorous insecticides.

These genes show 99% identity in nucleotide sequence, and both have exactly the same exon-intron structure (same sizes and same positions).
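Figures such as 99% identity come from comparing aligned sequences position by position. The sketch below shows one common convention, in which alignment columns containing a gap are skipped; the two short sequences are made up for illustration and are not E4 or FE4 sequences.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two aligned sequences.

    Both sequences must already be aligned (same length, "-" for gaps).
    Columns with a gap in either sequence are excluded, which is one
    convention among several.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned sequences differing at a single position.
copy_1 = "ATGGCCTTAGACCTGA"
copy_2 = "ATGGCCTTAGTCCTGA"

print(percent_identity(copy_1, copy_2))
```

Note that the slides use both "similarity" and "identity"; this sketch computes identity only.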

Many copies of E4 and FE4

Resistant strains of the aphid were found to contain multiple copies of E4 and FE4. The sequences of all copies are 100% identical.

It is believed that this duplication occurred within the last 50 years, with the introduction of the selective agent.

Take home message I:

Increase in gene number can occur quite rapidly under selection pressure.

Take home message II:

Gene-duplication mutations are not the limiting step in evolution. It is selection that counts most.


Duplications of RNA-specifying genes

Ribosome

The ribosome is a complex of proteins and RNA (called rRNA) on which proteins are built, based on the information in the mRNA.

Ribosomes are always composed of two subunits, one large and one small.

In prokaryotes the entire ribosome is 70S, and is composed of a 50S large subunit and a 30S small subunit.

In eukaryotes the entire ribosome is 80S, and is composed of a 60S large subunit and a 40S small subunit.

Each subunit contains different rRNAs.

The S value is the sedimentation coefficient measured in an ultracentrifuge.

rRNA

There are also ribosomal genes coded by the mitochondrial genome.

In fact, the mitochondrial ribosome is coded by both nuclear and mitochondrial genes.

Comparison of ribosome structure in Bacteria, Eukaryotes, and Mitochondria

                      Bacterial (70S)    Eukaryotic (80S)    Mitochondrial (55S)
Large subunit         50S                60S                 39S
  rRNAs (1 of each)   23S (2904 nts)     28S (4700 nts)      16S (1560 nts)
                      5S (120 nts)       5S (120 nts)        -
                      -                  5.8S (160 nts)      -
  Proteins            33                 ~49                 48
Small subunit         30S                40S                 28S
  rRNA                16S (1542 nts)     18S (1900 nts)      12S (950 nts)
  Proteins            20                 ~33                 29

16S and 18S are the most commonly used genes in phylogenetic analysis.

Eukaryotic rRNA genes

• 28S, 5.8S, and 18S rRNAs are encoded by a single transcription unit (45S), in which they are separated by 2 internal transcribed spacers (ITS) and bounded by external transcribed spacers (ETS).

[Figure: the 45S transcription unit, in order: ETS, 18S, ITS 1, 5.8S, ITS 2, 28S, ETS.]

Human rRNA genes

• In humans, the 45S rDNA is organized into 5 clusters (each has 30-40 repeats).

• These clusters are located on chromosomes 13, 14, 15, 21, and 22.

• These clusters are transcribed by RNA polymerase I.

[Figure: tandem array of repeated 18S-28S units.]

• 5S rRNA genes occur in tandem arrays; there are about 200-300 true 5S genes and many dispersed pseudogenes.

• In humans there are two gene clusters on chromosome 1 (in dogs there is a single gene cluster).

• 5S rRNA is transcribed by RNA polymerase III.

Numbers of rRNA and tRNA genes per haploid genome in various organisms
__________________________________________________________________________
Genome source                    Number of    Number of       Approximate
                                 rRNA sets    tRNA genes^a    genome size (bp)
__________________________________________________________________________
Human mitochondrion              1            22              2 × 10^4
Nicotiana tabacum chloroplast    2            37              2 × 10^5
Escherichia coli                 7            ~100            4 × 10^6
Neurospora crassa                ~100         ~2,600          2 × 10^7
Saccharomyces cerevisiae         ~140         ~360            5 × 10^7
Caenorhabditis elegans           ~55          ~300            8 × 10^7
Tetrahymena thermophila          1            ~800^c          2 × 10^8
Drosophila melanogaster          120-240      590-900         2 × 10^8
Physarum polycephalum            80-280       ~1,050          5 × 10^8
Euglena gracilis                 800-1,000    ~740            2 × 10^9
Human                            ~300         ~1,300          3 × 10^9
Rattus norvegicus                150-170      ~6,500          3 × 10^9
Xenopus laevis                   500-760      6,500-7,800     8 × 10^9
__________________________________________________________________________

Correlation between the number of rRNA genes and the genome size

Correlation between number of rRNA genes and genome size: an exception

The general pattern: a bigger genome means more genes to be transcribed, and therefore more rRNA is needed.

Numbers of rRNA and tRNA genes per haploid genome in various organisms
__________________________________________________________________________
Genome source                    Number of    Approximate
                                 rRNA sets    genome size (bp)
__________________________________________________________________________
Human mitochondrion              1            2 × 10^4
Nicotiana tabacum chloroplast    2            2 × 10^5
Escherichia coli                 7            4 × 10^6
Neurospora crassa                ~100         2 × 10^7
Saccharomyces cerevisiae         ~140         5 × 10^7
Caenorhabditis elegans           ~55          8 × 10^7
Tetrahymena thermophila          1            2 × 10^8
Drosophila melanogaster          120-240      2 × 10^8
Physarum polycephalum            80-280       5 × 10^8
Euglena gracilis                 800-1,000    2 × 10^9
Human                            ~300         3 × 10^9
Rattus norvegicus                150-170      3 × 10^9
Xenopus laevis                   500-760      8 × 10^9
__________________________________________________________________________
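The strength of this trend can be checked with a rank correlation on the values in the table. The sketch below is only an illustration: it takes the midpoint of each reported range (an assumption made here, not something the slides do) and uses a hand-written Spearman coefficient so that it needs no external library. It also recomputes the coefficient without Tetrahymena thermophila, the row with a single rRNA set in a 2 × 10^8 bp genome, which is presumably the exception the slide title refers to.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# (name, rRNA sets, genome size in bp); ranges replaced by midpoints (assumption).
table = [
    ("Human mitochondrion", 1, 2e4),
    ("Nicotiana tabacum chloroplast", 2, 2e5),
    ("Escherichia coli", 7, 4e6),
    ("Neurospora crassa", 100, 2e7),
    ("Saccharomyces cerevisiae", 140, 5e7),
    ("Caenorhabditis elegans", 55, 8e7),
    ("Tetrahymena thermophila", 1, 2e8),
    ("Drosophila melanogaster", 180, 2e8),
    ("Physarum polycephalum", 180, 5e8),
    ("Euglena gracilis", 900, 2e9),
    ("Human", 300, 3e9),
    ("Rattus norvegicus", 160, 3e9),
    ("Xenopus laevis", 630, 8e9),
]

rho_all = spearman([r[1] for r in table], [r[2] for r in table])
without = [r for r in table if r[0] != "Tetrahymena thermophila"]
rho_without = spearman([r[1] for r in without], [r[2] for r in without])

print(f"all genomes:         rho = {rho_all:.2f}")
print(f"without Tetrahymena: rho = {rho_without:.2f}")
```

A rank correlation is used here because genome sizes span five orders of magnitude, so a linear correlation on raw values would be dominated by the largest genomes.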

Concerted Evolution

18S rRNA tree

[Figure: 18S rRNA phylogenetic tree of animals with fungal and choanoflagellate outgroups; scale bar 0.1. Labeled clades: Bilateria, Cnidaria, Hexactinellida, Demospongiae, Calcarea.]

Evolution of rRNA genes

• Although there are many copies of the same gene in the genome, and the duplication is an ancient phenomenon (since all organisms have many copies), all copies present in one genome are almost identical.

Divergent (classical) evolution vs. concerted evolution

[Figure: two panels, divergent evolution and concerted evolution, each showing duplication, mutation and speciation along a time axis.]

Question?

• How is it possible that all the ribosomal copies remain identical?

(a) Stringent selection.
(b) Recent multiplication.
(c) Concerted evolution.

(a) Stringent selection.

Refuted by the fact that the ITS regions are as conserved as the functional rRNA sequences.

(b) Recent multiplication.

Refuted by the fact that the intraspecific homogeneity does not decrease with evolutionary time.

(c) Concerted evolution.

CONCERTED EVOLUTION

A member of a gene family does not evolve independently of the other members of the family.

It exchanges sequence information with other members reciprocally or non-reciprocally.

Through genetic interactions among its members, a multigene family evolves in concert as a unit.


Concerted evolution results in a homogenized set of nonallelic homologous sequences.


CONCERTED EVOLUTION REQUIRES:

(1) the horizontal transfer of mutations among the family members (homogenization).

(2) the spread of mutations in the population (fixation).
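The first requirement, homogenization, can be illustrated with a toy simulation of non-reciprocal transfer within one genome. This is a sketch under simplifying assumptions: no new mutations arise, every pair of copies is equally likely to interact, and the variant names, family size and number of steps are hypothetical. It does not model the second requirement (fixation in the population).

```python
import random


def gene_conversion_step(family, rng):
    """Overwrite one randomly chosen copy with the sequence of another copy."""
    donor, recipient = rng.sample(range(len(family)), 2)
    family[recipient] = family[donor]


def homogenize(family, steps, seed=0):
    """Apply repeated non-reciprocal transfers and return the new family."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    family = list(family)      # work on a copy; the input is left untouched
    for _ in range(steps):
        gene_conversion_step(family, rng)
    return family


# Hypothetical gene family of six copies, each carrying a different variant.
initial = ["v1", "v2", "v3", "v4", "v5", "v6"]
final = homogenize(initial, steps=200)

print(len(set(initial)), "distinct variants before")
print(len(set(final)), "distinct variants after")
```

Because each step replaces one copy with an existing one, the number of distinct variants can only stay the same or decrease, while the number of copies stays constant.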

Mechanisms of concerted evolution

1. Unequal crossing-over
2. Gene conversion
3. Duplicative transposition

Mechanisms of concerted evolution: 1. Unequal crossing-over

[Figure: unequal crossing-over, panels 1 to 4.]

Gene conversion

Gene conversion (one possible origin)

(a) Heteroduplexes are formed by the resolution of a Holliday structure or by other mechanisms.

(b) The blue DNA uses the invaded segment (e') as a template to "correct" the mismatch, resulting in gene conversion.

(c) Both DNA molecules use their original sequences as templates to correct the mismatch. Gene conversion does not occur.

Gene conversion has been found in all species and at all loci that were examined in detail.

The rate of gene conversion varies with genomic location.

Concerted evolution: Advantages of Gene Conversion over Unequal Crossing-Over

1. Unequal crossing-over changes the number of repeats, and may cause a dosage imbalance. Gene conversion does not change repeat number.

2. Gene conversion can act on dispersed repeats. Unequal crossing-over is severely restricted when repeats are dispersed.

[Figure labels: deletion; duplication.]

Concerted evolution: Advantages of Unequal Crossing-Over over Gene Conversion

1. Unequal crossing-over is faster and more efficient in bringing about concerted evolution. At the mutation level, unequal crossing-over (UCO) occurs more frequently than gene conversion (GC).

2. In a gene-conversion event, only a small region is involved. In yeast, an unequal crossing-over event involves on average ~20,000 bp. A gene-conversion tract cannot exceed 1,500 bp.