
Genome Rearrangements

Anne Bergeron, Comparative Genomics Laboratory, Université du Québec à Montréal

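Belle marquise, vos beaux yeux me font mourir d'amour.
(Fair marquise, your beautiful eyes make me die of love.)

Vos yeux beaux d'amour me font, belle marquise, mourir.
(Your eyes, beautiful with love, make me, fair marquise, die.)

Me font vos beaux yeux mourir, belle marquise, d'amour.
(Make me your beautiful eyes die, fair marquise, of love.)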

1. General introduction to genome rearrangements
   Examples of rearranged genomes

2. Measures of distance
   Rearrangement operations
   The Hannenhalli-Pevzner distance equation

3. A unifying view of genome rearrangements
   The Double-Cut-and-Join operation
   The adjacency graph and the distance equation


Example of rearranged genomes: Mitochondrial Genomes

Bombyx mori

Homo sapiens

Mitochondria are small, oval shaped organelles surrounded by two highly specialized membranes.

Animal mitochondrial genomes are normally circular, ~16 kb in length, and encode 13 proteins, 22 tRNAs, and 2 rRNAs.

Homo sapiens RWLFSTNHKDIGTLYLLFGAWAGVLGTALSLLIRAELGQPGNLLGNDHIYNVIVTAHAFVMIFFMVMPIMIGGFGNWLVPLMIGAPDMAFPRMNNM
Bombyx mori  KWIYSTNHKDIGTLYFIFGIWSGMIGTSLSLLIRAELGNPGSLIGDDQIYNTIVTAHAFIMIFFMVMPIMIGGFGNWLVPLMLGAPDMAFPRMNNM
             :*::***********::** *:*::**:**********:**.*:*:*:***.*******:**********************:*************

Homo sapiens SFWLLPPSLLLLLASAMVEAGAGTGWTVYPPLAGNYSHPGASVDLTIFSLHLAGVSSILGAINFITTIINMKPPAMTQYQTPLFVWSVLITAVLLLLSLP
Bombyx mori  SFWLLPPSLMLLISSSIVENGAGTGWTVYPPLSSNIAHSGSSVDLAIFSLHLAGISSIMGAINFITTMINMRLNNMSFDQLPLFVWAVGITAFLLLLSLP
             *********:**::*::** ************:.* :*.*:****:********:***:********:***:   *:  * *****:* ***.*******

Homo sapiens VLAAGITMLLTDRNLNTTFFDPAGGGDPILYQHLFWFFGHPEVYILILPGFGMISHIVTYYSGKKEPFGYMGMVWAMMSIGFLGFIVWAHHMFTVGMDVD
Bombyx mori  VLAGAITMLLTDRNLNTSFFDPAGGGDPILYQHLFWFFGHPEVYILILPGFGMISHIISQESGKKETFGCLGMIYAMLAIGLLGFIVWAHHMFTVGMDID
             ***..************:***************************************::  *****.** :**::**::**:****************:*

Homo sapiens TRAYFTSATMIIAIPTGVKVFSWLATLHGSNMKWSAAVLWALGFIFLFTVGGLTGIVLANSSLDIVLHDTYYVVAHFHYVLSMGAVFAIMGGFIHWFPLF
Bombyx mori  TRAYFTSATMIIAVPTGIKIFSWLATMHGTQINYNPNILWSLGFVFLFTVGGLTGVILANSSIDITLHDTYYVVAHFHYVLSMGAVFAIIGGFINWYPLF
             *************:***:*:******:**:::::.. :**:***:**********::*****:**.***********************:****:*:***

Homo sapiens SGYTLDQTYAKIHFTIMFIGVNLTFFPQHFLGLSGMPRRYSDYPDAYTTWNILSSVGSFISLTAVMLMIFMIWEAFASKRKVLMVEEPSMNLE
Bombyx mori  TGLSLNSYMLKIQFFTMFIGVNMTFFPQHFLGLAGMPRRYSDYPDSYISWNMISSLGSYISLLSVMMMLIIIWESMINQRINLFSLNLPSSIE
             :* :*:.   **:*  ******:**********:***********:* :**::**:**:*** :**:*:::***:: .:*  *:  : . .:*

Here is an alignment of the cytochrome c oxidase I proteins of Homo sapiens and Bombyx mori, respectively.


73% identity over more than 500 amino acids.
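As a rough illustration of how an identity percentage of this kind can be computed, the sketch below counts identical columns in two aligned sequences of equal length. The function name `percent_identity` and the choice to divide by the full alignment length (gap columns, written "-", never count as matches) are assumptions made for this example; the slides do not say which convention was used. The example input is only the first 15 residues of the alignment.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned columns holding the same residue in both rows.

    Assumption: both rows come from the same alignment (equal length), and a
    gap character '-' is never counted as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    if not seq_a:
        raise ValueError("empty alignment")
    matches = sum(1 for x, y in zip(seq_a, seq_b) if x == y and x != "-")
    return 100.0 * matches / len(seq_a)


# First 15 columns of the COX1 alignment: 12 of the 15 residues are identical.
human_prefix = "RWLFSTNHKDIGTLY"
silkworm_prefix = "KWIYSTNHKDIGTLY"
print(percent_identity(human_prefix, silkworm_prefix))
```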


A lowly worm

Charles Darwin, 1809 - 1882

The 37 genes of animal mitochondria are highly conserved.

But the order of the genes differs from species to species.


COX1 COX2 ATP6 ATP8 COX3 ND3 ND2 ND4L ND4 ND5 CYTB RNS RNL ND1

ND6

Homo sapiens mitochondrial genome (proteins and rRNAs)

COX1 COX2 ATP6 ATP8 COX3 ND3 ND2 ND6 CYTB

ND5 ND4 ND4L RNS RNL ND1

Bombyx mori mitochondrial genome (proteins and rRNAs)

ND4L ND4 ND5 RNS RNL ND1

ND6

ND6

ND5 ND4 ND4L RNS RNL ND1

The invariant parts

COX1 stands for the gene cytochrome c oxidase I.


COX1 COX2 ATP6 ATP8 COX3 ND3 ND2 ND4L CYTB

Homo sapiens mitochondrial genome (proteins and rRNAs)

COX1 COX2 ATP6 ATP8 COX3 ND3 ND2 CYTB

ND4L

ND4 ND5 RNS RNL ND1

ND6

ND6

ND5 ND4 RNS RNL ND1

Bombyx mori mitochondrial genome (proteins and rRNAs)

The modified parts

ND4 ND5 ND6 RNS RNL ND1

Example of rearranged genomes: Mitochondrial Genomes of 6 Arthropoda

Fruit Fly

Mosquito

Silkworm

Locust

Tick

Centipede

Identical ‘runs’ of genes have been grouped.

(Art work by Guillaume Bourque, scientific work by Guillaume Bourque, Pavel Pevzner and Glenn Tesler, 2004)

Example of rearranged genomes: mammal X chromosomes

Sixteen large synteny blocks are ordered differently in the X chromosomes of the human, mouse and rat. Blocks have similar gene content and order. Note that the estimated number of genes in the X chromosome is 2000.

(Art work by Guillaume Bourque, scientific work by Guillaume Bourque, Pavel Pevzner and Glenn Tesler, 2004)


Problem: Given two or more genomes, how do we measure their similarity and/or distance with respect to gene order and gene content?

Sub-problem: How do we know that two genes or blocks are the "same" in two different species?


Rearrangement operations affect gene order and gene content. There are various types:

• Inversions
• Transpositions
• Reverse transpositions
• Translocations, fusions and fissions
• Duplications and losses
• Others...

Rearrangement operations

Any set of operations yields a distance between genomes, by counting the minimum number of operations needed to transform one genome into the other.
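To make the idea of "minimum number of operations" concrete, here is a minimal brute-force sketch. It assumes genomes are single linear chromosomes written as tuples of signed integers and uses inversions as the only operation; the names `inversions` and `min_operations` are made up for this example. Breadth-first search explores every genome reachable in 1, 2, 3, ... operations, so it is only usable on very small genomes and is not the method used in the literature.

```python
from collections import deque
from typing import Callable, Iterable, Tuple

Genome = Tuple[int, ...]  # signed genes on one linear chromosome (assumption)


def inversions(genome: Genome) -> Iterable[Genome]:
    """Yield every genome obtained by inverting one segment genome[i:j]."""
    n = len(genome)
    for i in range(n):
        for j in range(i + 1, n + 1):
            # An inversion reverses the order of the segment and flips each sign.
            flipped = tuple(-g for g in reversed(genome[i:j]))
            yield genome[:i] + flipped + genome[j:]


def min_operations(source: Genome, target: Genome,
                   neighbors: Callable[[Genome], Iterable[Genome]]) -> int:
    """Smallest number of operations turning source into target (exhaustive BFS)."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        genome, steps = queue.popleft()
        for candidate in neighbors(genome):
            if candidate == target:
                return steps + 1
            if candidate not in seen:
                seen.add(candidate)
                queue.append((candidate, steps + 1))
    raise ValueError("target cannot be reached with these operations")


print(min_operations((1, -3, -2, 4), (1, 2, 3, 4), inversions))
```

Passing a different `neighbors` function (for example one that generates transpositions) gives the distance for that set of operations, which is the point made above: each choice of operations defines its own distance.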

Rearrangement operations

• Inversions


Example: Mitochondrial Genomes of 6 Arthropoda

Fruit Fly

Mosquito

Silkworm

Locust

Tick

Centipede

An inversion.

Rearrangement operations

• Transpositions


Example: Mitochondrial Genomes of 6 Arthropoda

Fruit Fly

Mosquito

Silkworm

Locust

Tick

Centipede

A transposition
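A transposition can be pictured as exchanging two adjacent blocks of genes without changing their orientation. The sketch below assumes a chromosome stored as a Python list and three cut positions `i < j < k`; the function name `transposition` and this parameterization are choices made for the example only.

```python
from typing import List


def transposition(genome: List[int], i: int, j: int, k: int) -> List[int]:
    """Move the block genome[i:j] to the right of the block genome[j:k].

    Equivalently, the two adjacent blocks genome[i:j] and genome[j:k] swap
    places; gene orientations are unchanged.
    """
    if not 0 <= i < j < k <= len(genome):
        raise ValueError("need 0 <= i < j < k <= len(genome)")
    return genome[:i] + genome[j:k] + genome[i:j] + genome[k:]


# Gene 2 jumps over genes 3 and 4.
print(transposition([1, 2, 3, 4, 5], 1, 2, 4))
```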

Rearrangement operations

• Reverse transpositions


Example: Mitochondrial Genomes of 6 Arthropoda

Fruit Fly

Mosquito

Silkworm

Locust

Tick

Centipede

A reverse transposition
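A reverse transposition moves a block and inverts it at the same time. As a hedged sketch under the same assumptions as a plain transposition (a chromosome stored as a list of signed integers, cut positions `i < j < k`, and a hypothetical function name), the moved block is reversed and every gene in it changes sign:

```python
from typing import List


def reverse_transposition(genome: List[int], i: int, j: int, k: int) -> List[int]:
    """Move the block genome[i:j] to the right of genome[j:k] and invert it."""
    if not 0 <= i < j < k <= len(genome):
        raise ValueError("need 0 <= i < j < k <= len(genome)")
    # Inverting a block reverses its order and flips the sign of each gene.
    inverted = [-g for g in reversed(genome[i:j])]
    return genome[:i] + genome[j:k] + inverted + genome[k:]


# Block (2 3) jumps over gene 4 and lands inverted as (-3 -2).
print(reverse_transposition([1, 2, 3, 4, 5], 1, 3, 4))
```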

Rearrangement operations

• Translocations, fusions and fissions


[Source: Linda Ashworth, LLNL] DOE Human Genome Program Report

From 24 chromosomes

To 21 chromosomes
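Operations that change the number of chromosomes, or exchange material between them, can be sketched on a genome stored as a list of linear chromosomes. The representation (each chromosome a list of signed genes) and the three function names below are assumptions for illustration: a fusion lowers the chromosome count by one, a fission raises it by one, and a translocation leaves it unchanged.

```python
from typing import List, Tuple

Chromosome = List[int]


def fusion(a: Chromosome, b: Chromosome) -> Chromosome:
    """Join the right end of a to the left end of b."""
    return a + b


def fission(chromosome: Chromosome, i: int) -> Tuple[Chromosome, Chromosome]:
    """Cut one chromosome into two non-empty pieces at position i."""
    if not 0 < i < len(chromosome):
        raise ValueError("cut position must lie strictly inside the chromosome")
    return chromosome[:i], chromosome[i:]


def translocation(a: Chromosome, b: Chromosome,
                  i: int, j: int) -> Tuple[Chromosome, Chromosome]:
    """Cut a at i and b at j, then exchange the two right-hand parts."""
    return a[:i] + b[j:], b[:j] + a[i:]


print(fusion([1, 2], [3]))
print(fission([1, 2, 3], 2))
print(translocation([1, 2, 3], [4, 5, 6], 1, 2))
```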


The Hannenhalli-Pevzner distance equation


In 1995, Hannenhalli and Pevzner found a formula to compute the minimum number of inversions, translocations, fusions or fissions necessary to transform a multichromosomal genome into another.

Sketch of the approach:

• Cap the chromosomes
• Concatenate all the chromosomes
• Sort the resulting genome by inversions
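The first two steps of this sketch can be illustrated as follows. This is only a toy version under stated assumptions: chromosomes are lists of signed genes numbered 1..n, caps are fresh markers numbered from n+1, and the chromosomes are concatenated in the order given. The function name `cap_and_concatenate` is hypothetical, and the code makes one arbitrary choice of capping and concatenation; it does not attempt to find the choice that minimizes the number of inversions, nor does it perform the sorting step.

```python
from typing import List


def cap_and_concatenate(chromosomes: List[List[int]], n_genes: int) -> List[int]:
    """Frame each linear chromosome with two new cap markers and join them all.

    Caps are numbered n_genes + 1, n_genes + 2, ... so they cannot be confused
    with genes. The result is a single signed sequence that could then be
    handed to a sorting-by-inversions procedure.
    """
    result: List[int] = []
    next_cap = n_genes + 1
    for chromosome in chromosomes:
        result.append(next_cap)        # left cap
        result.extend(chromosome)
        result.append(next_cap + 1)    # right cap
        next_cap += 2
    return result


# Two chromosomes over genes 1..4 become one capped sequence.
print(cap_and_concatenate([[1, -3], [2, 4]], n_genes=4))
```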


Acts on up to 4 gene extremities.

Reminder

The Double-Cut-and-Join operation

Yancopoulos et al. 2005
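One possible way to picture the operation in code is to store a genome as a set of adjacencies (two gene extremities that are neighbours) and telomeres (a single extremity at the end of a linear chromosome). The sketch below follows that idea; the string labels such as "1t" and "1h" for the tail and head of gene 1, and the function name `dcj`, are conventions chosen for this example. The operation cuts at most two such elements and rejoins the freed extremities, so it touches at most 4 extremities.

```python
from typing import FrozenSet, Optional, Set, Tuple

Vertex = FrozenSet[str]  # adjacency (2 extremities) or telomere (1 extremity)


def dcj(genome: Set[Vertex], u: Tuple[str, ...],
        v: Optional[Tuple[str, ...]] = None) -> Set[Vertex]:
    """Apply one double-cut-and-join and return the new genome.

    u and v list the extremities of the elements to cut, in the order the
    caller wants them rejoined:
      (p, q), (r, s) -> {p, r} and {q, s}
      (p, q), (r,)   -> {p, r} and telomere {q}
      (p,),   (r,)   -> {p, r}          (two telomeres are joined)
      (p, q)         -> telomeres {p} and {q}
    """
    result = set(genome)
    result.remove(frozenset(u))
    if v is None:
        p, q = u
        result |= {frozenset([p]), frozenset([q])}
        return result
    result.remove(frozenset(v))
    if len(u) == 2 and len(v) == 2:
        (p, q), (r, s) = u, v
        result |= {frozenset([p, r]), frozenset([q, s])}
    elif len(u) == 2 and len(v) == 1:
        (p, q), (r,) = u, v
        result |= {frozenset([p, r]), frozenset([q])}
    elif len(u) == 1 and len(v) == 1:
        result.add(frozenset([u[0], v[0]]))
    else:
        raise ValueError("give the adjacency first, then the telomere")
    return result


# Linear chromosome (1 2 3): telomeres 1t and 3h, adjacencies 1h-2t and 2h-3t.
linear = {frozenset(["1t"]), frozenset(["1h", "2t"]),
          frozenset(["2h", "3t"]), frozenset(["3h"])}
inverted = dcj(linear, ("1h", "2t"), ("2h", "3t"))  # gene 2 is inverted
circular = dcj(linear, ("1t",), ("3h",))            # the chromosome is closed
```

With this representation, inversions, translocations, fusions and fissions all come out as particular choices of which elements to cut and how to rejoin them.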

Linear chromosomes

Translocation

Linear and circular chromosomes

Fusion

Fission

Inversion

Circular chromosomes

Fusion

Fission

Inversion

1. General introduction to genome rearrangements
   Examples of rearranged genomes

2. Measures of distance
   Rearrangement operations
   The Hannenhalli-Pevzner distance equation

3. A unifying view of genome rearrangements
   The Double-Cut-and-Join operation
   The adjacency graph and the distance equation

4. Breakpoint reuse
   Breakpoint reuse estimates
   Minimizing breakpoint reuse

The adjacency graph and the distance equation

23 5

14 6

Genome A

1 2 3 4

5 6

Genome B

Joint work with Julia Mixtacki and Jens Stoye


C = number of cycles
I = number of odd paths
G = number of “genes”

D = G - (C + I/2)

D = 6 - (1 + 2/2) = 4
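The formula can be evaluated mechanically. The sketch below is one way to do it, under these assumptions: both genomes contain the same genes exactly once, a genome is a list of `(genes, is_circular)` pairs with signed integer genes, and the helper names are invented for this example. Every adjacency or telomere of either genome is a vertex, every gene extremity is an edge joining the vertex of A and the vertex of B that contain it, and connected components are found with a small union-find. The gene orientations of the example on the slide did not survive in the transcript, so the genomes used here are hypothetical.

```python
from collections import defaultdict


def extremities(gene):
    """(left, right) extremities of a signed gene; 't' is the tail, 'h' the head."""
    g = abs(gene)
    return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))


def vertices(genome):
    """Adjacencies (2 extremities) and telomeres (1 extremity) of a genome."""
    result = []
    for genes, is_circular in genome:
        ends = [extremities(g) for g in genes]
        for (_, right), (left, _) in zip(ends, ends[1:]):
            result.append((right, left))
        if is_circular:
            result.append((ends[-1][1], ends[0][0]))
        else:
            result.append((ends[0][0],))
            result.append((ends[-1][1],))
    return result


def dcj_distance(genome_a, genome_b):
    """D = G - (C + I/2), read off the adjacency graph of the two genomes."""
    verts = vertices(genome_a) + vertices(genome_b)
    parent = list(range(len(verts)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Each extremity appears once in A and once in B: that pair is one edge.
    owner = defaultdict(list)
    for index, vertex in enumerate(verts):
        for extremity in vertex:
            owner[extremity].append(index)
    for a_index, b_index in owner.values():
        parent[find(a_index)] = find(b_index)

    edges = defaultdict(int)
    for a_index, _ in owner.values():
        edges[find(a_index)] += 1
    telomere_roots = {find(i) for i, v in enumerate(verts) if len(v) == 1}

    cycles = sum(1 for root in edges if root not in telomere_roots)
    odd_paths = sum(1 for root, n in edges.items()
                    if root in telomere_roots and n % 2 == 1)
    genes = len(owner) // 2
    return genes - (cycles + odd_paths // 2)


# Hypothetical genomes: A is B with gene 2 inverted.
genome_a = [([1, -2, 3], False)]
genome_b = [([1, 2, 3], False)]
print(dcj_distance(genome_a, genome_b))
```

In this small example the graph has one cycle and two odd paths over three genes, so D = 3 - (1 + 2/2) = 1, which matches the single inversion separating the two genomes.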
