Comparative Genomics in Vertebrates: Lessons from the Tetraodon nigroviridis genome

Page 2

R. Hinegardner 1968

Page 3

Identify human genes by comparison to a compact vertebrate genome

Tetraodon genomic sequence

Human genomic sequence

Exons

Page 4

Query: SPWTFPS*FLMSSSMKVPSWSRISSPM*GIL*STVSSST
       SPWTFPS* L+SSS+KV S S  SSPM*GIL  T SSST
Sbjct: SPWTFPS*LLISSSIKVSSSSFTSSPM*GILHKTXSSST

Query: LLFQLFLALSDLKQLRILHTDLKPDNVMLVD--EKELKIKLMDFGLALLTHEAKT--GTI
       +L Q+  AL  LK L ++H DLKP+N+MLVD   +  ++K++DFG A  +H +KT   T
Sbjct: ILQQVATALKKLKSLGLIHADLKPENIMLVDPVRQPYRVKVIDFGSA--SHVSKTVCSTY

Query: VNALAQYSHNEDEEEEEEHDFKVDKT-DLCDSKKHPE
       VNAL QY+ ++D+++ ++ + + +K  DL D +   E
Sbjct: VNALGQYNDDDDDDDGDDPEEREEKQKDLEDHRDDKE

Query: RYKELTEQQMPGALPPECTPNMDGPHARSVRREQSLHSFHTLFCRRCFKYDRFLH
       +YKELTEQQ+PGALPPECTPN+DGP+A+SV+REQSLHSFHTLFCRRCFKYD FLH
Sbjct: KYKELTEQQLPGALPPECTPNIDGPNAKSVQREQSLHSFHTLFCRRCFKYDCFLH

Page 5

BLAST

Query:   ATTGCGTATGCAGCGTAGCAATTGCGATAC

Subject: TTACGCGATGTAGACAGCGTAGCAATGTTGCA

Exact match: word of size W = 11 bases
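
The seeding step can be sketched in Python. This is only an illustration: it assumes the upper sequence on the slide is the query and the lower one the subject, the name find_seeds is hypothetical, and real BLAST uses a much faster lookup structure.

```python
from collections import defaultdict

QUERY = "ATTGCGTATGCAGCGTAGCAATTGCGATAC"
SUBJECT = "TTACGCGATGTAGACAGCGTAGCAATGTTGCA"

def find_seeds(query, subject, w=11):
    """Return sorted (query_pos, subject_pos) pairs where w bases match exactly."""
    # Index every word of size w in the query by its start position.
    index = defaultdict(list)
    for i in range(len(query) - w + 1):
        index[query[i:i + w]].append(i)
    # Look up every word of size w of the subject in that index.
    seeds = []
    for j in range(len(subject) - w + 1):
        for i in index.get(subject[j:j + w], []):
            seeds.append((i, j))
    return sorted(seeds)

for q_pos, s_pos in find_seeds(QUERY, SUBJECT):
    print(q_pos, s_pos, QUERY[q_pos:q_pos + 11])
```

Positions are 0-based. Overlapping seeds on the same diagonal belong to the same exact match.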

Page 6

Blast: extension from the initial word W

Query:   ATTGCGTATGCAGCGTAGCAATTGCGATAC

Subject: TTACGCGATGTAGACAGCGTAGCAATGTTGCA

Extended query segment: TATGCAGCGTAGCAAT

Scoring matrix NUC.4.4

      A   T   G   C   N
  A   5  -4  -4  -4  -2
  T  -4   5  -4  -4  -2
  G  -4  -4   5  -4  -2
  C  -4  -4  -4   5  -2
  N  -2  -2  -2  -2  -1

Scores during extension: +5 -4 -4 +5

Two successive mismatches cost 8, and 8 < X

X = threshold for cumulative score of successive mismatches = 21 by default
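
A minimal sketch of ungapped extension with an X threshold, using the NUC.4.4 values for plain bases (+5 match, -4 mismatch). It is an assumption-laden illustration, not the BLAST source: the function names are hypothetical, the seed position (10, 14) is the 0-based position of the 11-base exact match in the two sequences above, and the stopping rule used here is "stop once the score has fallen more than X below the best score seen". The segment it reports may differ slightly from the schematic drawn on the slide.

```python
QUERY = "ATTGCGTATGCAGCGTAGCAATTGCGATAC"
SUBJECT = "TTACGCGATGTAGACAGCGTAGCAATGTTGCA"

def score(a, b, match=5, mismatch=-4):
    """NUC.4.4 score for two unambiguous bases."""
    return match if a == b else mismatch

def extend_one_way(query, subject, q, s, step, x_drop):
    """Walk from (q, s) in direction step; return (best_score, best_length)."""
    total = best = best_len = length = 0
    while 0 <= q < len(query) and 0 <= s < len(subject):
        total += score(query[q], subject[s])
        length += 1
        if total > best:
            best, best_len = total, length
        elif best - total > x_drop:
            break  # fell too far below the best score: give up
        q += step
        s += step
    return best, best_len

def extend_seed(query, subject, q_pos, s_pos, w=11, x_drop=21):
    """Extend a seed both ways; return (query_start, query_end, score)."""
    seed = sum(score(query[q_pos + k], subject[s_pos + k]) for k in range(w))
    left, left_len = extend_one_way(query, subject, q_pos - 1, s_pos - 1, -1, x_drop)
    right, right_len = extend_one_way(query, subject, q_pos + w, s_pos + w, 1, x_drop)
    return q_pos - left_len, q_pos + w + right_len, seed + left + right

start, end, total = extend_seed(QUERY, SUBJECT, 10, 14)
print(QUERY[start:end], total)
```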

Page 7

TBLASTX, BLASTP, BLASTX

Word “W” = 3 amino acids

Query:   LECNQLIPIAHKTCPEGKNL

Words retained for HKT (threshold “T”): HKT, HLT, HVT, HYT, YKT, NKT

Subject: LKCHNTQLPFIYKTCPEGKN

Automaton

Extension (threshold “X”)
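
The word list on this slide can be reproduced with a short sketch. Assumptions: the three score rows below are copied from the BLOSUM62 matrix on Page 8, the threshold T = 11 is chosen here because it is consistent with the words shown (the slide does not give a value), and the function name neighbourhood is hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# BLOSUM62 rows for H, K and T only (column order follows AMINO_ACIDS).
BLOSUM62_ROWS = {
    "H": [-2, 0, 1, -1, -3, 0, 0, -2, 8, -3, -3, -1, -2, -1, -2, -1, -2, -2, 2, -3],
    "K": [-1, 2, 0, -1, -3, 1, 1, -2, -1, -3, -2, 5, -1, -3, -1, 0, -1, -3, -2, -2],
    "T": [0, -1, 0, -1, -1, -1, -1, -2, -2, -1, -1, -1, -1, -2, -1, 1, 5, -2, -2, 0],
}

def neighbourhood(word, threshold):
    """Map every same-length word scoring >= threshold against `word` to its score."""
    hits = {}
    for candidate in product(AMINO_ACIDS, repeat=len(word)):
        total = sum(
            BLOSUM62_ROWS[w][AMINO_ACIDS.index(c)] for w, c in zip(word, candidate)
        )
        if total >= threshold:
            hits["".join(candidate)] = total
    return hits

words = neighbourhood("HKT", threshold=11)  # T = 11 is an assumption

# Scan the subject for any word of the neighbourhood.
subject = "LKCHNTQLPFIYKTCPEGKN"
hit_positions = [i for i in range(len(subject) - 2) if subject[i:i + 3] in words]
print(sorted(words.items(), key=lambda kv: -kv[1])[:5], hit_positions)
```

Each hit position in the subject is then a starting point for the extension step.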

Page 8

BLOSUM62 scoring matrix

    A   R   N   D   C   Q   E   G   H   I   L   K   M   F   P   S   T   W   Y   V   B   Z   X   *
A   4  -1  -2  -2   0  -1  -1   0  -2  -1  -1  -1  -1  -2  -1   1   0  -3  -2   0  -2  -1   0  -4
R  -1   5   0  -2  -3   1   0  -2   0  -3  -2   2  -1  -3  -2  -1  -1  -3  -2  -3  -1   0  -1  -4
N  -2   0   6   1  -3   0   0   0   1  -3  -3   0  -2  -3  -2   1   0  -4  -2  -3   3   0  -1  -4
D  -2  -2   1   6  -3   0   2  -1  -1  -3  -4  -1  -3  -3  -1   0  -1  -4  -3  -3   4   1  -1  -4
C   0  -3  -3  -3   9  -3  -4  -3  -3  -1  -1  -3  -1  -2  -3  -1  -1  -2  -2  -1  -3  -3  -2  -4
Q  -1   1   0   0  -3   5   2  -2   0  -3  -2   1   0  -3  -1   0  -1  -2  -1  -2   0   3  -1  -4
E  -1   0   0   2  -4   2   5  -2   0  -3  -3   1  -2  -3  -1   0  -1  -3  -2  -2   1   4  -1  -4
G   0  -2   0  -1  -3  -2  -2   6  -2  -4  -4  -2  -3  -3  -2   0  -2  -2  -3  -3  -1  -2  -1  -4
H  -2   0   1  -1  -3   0   0  -2   8  -3  -3  -1  -2  -1  -2  -1  -2  -2   2  -3   0   0  -1  -4
I  -1  -3  -3  -3  -1  -3  -3  -4  -3   4   2  -3   1   0  -3  -2  -1  -3  -1   3  -3  -3  -1  -4
L  -1  -2  -3  -4  -1  -2  -3  -4  -3   2   4  -2   2   0  -3  -2  -1  -2  -1   1  -4  -3  -1  -4
K  -1   2   0  -1  -3   1   1  -2  -1  -3  -2   5  -1  -3  -1   0  -1  -3  -2  -2   0   1  -1  -4
M  -1  -1  -2  -3  -1   0  -2  -3  -2   1   2  -1   5   0  -2  -1  -1  -1  -1   1  -3  -1  -1  -4
F  -2  -3  -3  -3  -2  -3  -3  -3  -1   0   0  -3   0   6  -4  -2  -2   1   3  -1  -3  -3  -1  -4
P  -1  -2  -2  -1  -3  -1  -1  -2  -2  -3  -3  -1  -2  -4   7  -1  -1  -4  -3  -2  -2  -1  -2  -4
S   1  -1   1   0  -1   0   0   0  -1  -2  -2   0  -1  -2  -1   4   1  -3  -2  -2   0   0   0  -4
T   0  -1   0  -1  -1  -1  -1  -2  -2  -1  -1  -1  -1  -2  -1   1   5  -2  -2   0  -1  -1   0  -4
W  -3  -3  -4  -4  -2  -2  -3  -2  -2  -3  -2  -3  -1   1  -4  -3  -2  11   2  -3  -4  -3  -2  -4
Y  -2  -2  -2  -3  -2  -1  -2  -3   2  -1  -1  -2  -1   3  -3  -2  -2   2   7  -1  -3  -2  -1  -4
V   0  -3  -3  -3  -1  -2  -2  -3  -3   3   1  -2   1  -1  -2  -2   0  -3  -1   4  -3  -2  -1  -4
B  -2  -1   3   4  -3   0   1  -1   0  -3  -4   0  -3  -3  -2   0  -1  -4  -3  -3   4   1  -1  -4
Z  -1   0   0   1  -3   3   4  -2   0  -3  -3   1  -1  -3  -1   0  -1  -3  -2  -2   1   4  -1  -4
X   0  -1  -1  -1  -2  -1  -1  -1  -1  -1  -1  -1  -1  -1  -2   0   0  -2  -1  -1  -1  -1  -1  -4
*  -4  -4  -4  -4  -4  -4  -4  -4  -4  -4  -4  -4  -4  -4  -4  -4  -4  -4  -4  -4  -4  -4  -4   1

Page 9

Results

TBLASTX:

- Non-substitutive scoring matrix: match = +15, mismatch = -12

- Initial anchoring word: W = 5

- Never more than 2 consecutive mismatches: X = 25
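
The link between X and the number of tolerated mismatches is simple arithmetic, sketched below. The function name is hypothetical, and the rule assumed is the one stated on the slides: a run of mismatches is tolerated while its cumulative penalty stays below X.

```python
def max_consecutive_mismatches(mismatch_score, x):
    """Longest run of mismatches whose cumulative penalty stays below x."""
    penalty = abs(mismatch_score)
    run = 0
    while (run + 1) * penalty < x:
        run += 1
    return run

# TBLASTX settings from this slide: mismatch = -12, X = 25
print(max_consecutive_mismatches(-12, 25))  # 2 (24 < 25, but 36 > 25)

# NUC.4.4 with the default X from Page 6: mismatch = -4, X = 21
print(max_consecutive_mismatches(-4, 21))   # 5 (20 < 21, but 24 > 21)
```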

Page 10

33 % of the Tetraodon genome

TBLASTX (W = 5, X = 25, non-substitutive matrix)

(10 hours)

8.3 million alignments

322 annotated human genes

Page 11

[Plot: alignment length (bp) against % identity]

Page 12

Exofish performance

On 322 genes

Sensitivity: genes 62.5%, exons 27.5%

Specificity: 100 %

[Figure: a human gene and its Tetraodon matches]

Ecores (Evolutionary Conserved Regions)

Ecores per gene: 2.58

Page 13

Exofish

(fishing for exons… with fish exons)

Inputs: human genomic sequence, Tetraodon genome

Filter repeats and low complexity regions

Compute alignments

Select alignments

Assemble selected alignments

Ecores: evolutionary conserved regions
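
The select and assemble steps can be sketched as an interval merge. This is a guess at the general shape, not the Exofish implementation: the slides do not give the selection rule or the assembly rule, so the length and identity cut-offs, the merge-on-overlap rule, the function name and the example coordinates below are all hypothetical.

```python
def assemble_ecores(alignments, min_length=0, min_identity=0.0):
    """Merge selected alignments into ecores.

    alignments: (start, end, identity) tuples on the human sequence.
    Returns a list of (start, end) intervals.
    """
    # Select: keep alignments that pass both (hypothetical) cut-offs.
    selected = sorted(
        (start, end)
        for start, end, identity in alignments
        if end - start >= min_length and identity >= min_identity
    )
    # Assemble: merge alignments that overlap on the human sequence.
    ecores = []
    for start, end in selected:
        if ecores and start <= ecores[-1][1]:
            ecores[-1][1] = max(ecores[-1][1], end)
        else:
            ecores.append([start, end])
    return [tuple(e) for e in ecores]

# Hypothetical alignments: two overlap, one is too short, one stands alone.
example = [(100, 160, 0.80), (150, 220, 0.75), (400, 430, 0.50), (900, 990, 0.90)]
print(assemble_ecores(example, min_length=40, min_identity=0.6))
# [(100, 220), (900, 990)]
```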

Page 14

[Figure: Genscan, Exofish and Annotation tracks for Carnitine palmitoyl transferase I]

Page 15

[Figure: Genscan, Exofish and Annotation tracks for three genes: Similar to mouse HTF9C; Ran binding protein 1; KIAA1292 protein]

Page 16

Estimating the number of genes in a genome

Human genome

Exofish: 43728 ecores

Refseq (13751 genes)

65 %

35 %

? ecores

(How many genes?)

Page 17

How many genes in the human genome?

42066 ecores found in 42.4% of the human genome

42066 / 0.424 = 99212 ecores in the entire genome

11% of ecores correspond to pseudogenes

99212 x 0.89 = 88299 ecores correspond to genes

A gene possesses on average between 2.58 and 3.18 ecores

88299 / 3.18 = 27767 genes

88299 / 2.58 = 34224 genes

28000 < Human genes < 34000

Estimation based on 42 % of the human genome (January 2000)
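
The arithmetic on this slide can be replayed step by step. All input figures are taken from the slide; the variable names are only illustrative, and each intermediate value is rounded to a whole number as on the slide.

```python
ecores_found = 42066          # ecores found in the sequenced fraction
genome_fraction = 0.424       # fraction of the human genome available
pseudogene_fraction = 0.11    # share of ecores matching pseudogenes
ecores_per_gene = (2.58, 3.18)

# Extrapolate to the whole genome, then remove pseudogene ecores.
ecores_genome = round(ecores_found / genome_fraction)
ecores_in_genes = round(ecores_genome * (1 - pseudogene_fraction))

# More ecores per gene means fewer genes, hence max() for the lower bound.
low = round(ecores_in_genes / max(ecores_per_gene))
high = round(ecores_in_genes / min(ecores_per_gene))

print(ecores_genome, ecores_in_genes, low, high)
# 99212 88299 27767 34224
```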

Page 18

Genesweep (organized by Ewan Birney at Cold Spring Harbor in May 2000).

Science, 28 May 2000

Page 19

Number of genes in eukaryotes: a paradox?

Organism                  Nb. genes   Genome size

Flu virus                         8   0.001 Mb
Polio virus                       1   0.007 Mb

Mycoplasma genitalium           480   0.58 Mb
Archaeoglobus fulgidus        2,420   2.18 Mb
Mesorhizobium loti            6,746   7.03 Mb

Yeast                         6,000   16 Mb
Nematode                     19,000   100 Mb
Drosophila                   14,000   120 Mb
Arabidopsis                  25,000   100 Mb
Human                        30,000   3000 Mb

Paramecium                   40,000   80 Mb

Page 20

How to estimate the number of genes in a genome… without knowing the sequence?

[Plot: published estimates of the number of genes, from 20 000 to 160 000 on the vertical axis, by year from 1992 to 2006 on the horizontal axis. Labelled estimates: Antequera and Bird; Fields et al.; Roest Crollius et al.; Lander et al.; Ewing and Green; Liang et al. The plot also carries a question mark.]

Page 21

64% of the genome is anchored to chromosomes

36% remains as independent sequences

350 Mb genome, 21 chromosomes

Whole Genome Shotgun Sequencing: 8X

Assembly with Arachne

Physical mapping

Page 22

[Figure: phylogeny of the jawed vertebrates, drawn against a time scale covering the Paleozoic and Mesozoic eras; one label reads 225 my]

Gnathostomata (jawed vertebrates)

Chondrichthyes (cartilaginous fishes)

Osteichthyes (bony fishes)

Sarcopterygii (lobe-finned fishes): Coelacanthimorpha, Tetrapods

Tetrapods: Mammals (Mus musculus, Homo sapiens), Sauropsida (Gallus gallus)

Actinopterygii (ray-finned fishes): Acipenseriforms (sturgeons, …), Teleosts

Teleosts: Otophysi, Percomorphs

Cypriniforms (Danio rerio), Beloniforms (Oryzias latipes), Tetraodontiforms (Tetraodon nigroviridis, Takifugu rubripes)

Page 23

[Figure: an ancestral species splits into Species 1 and Species 2 (speciation), giving genes A and B; a duplication of B then gives B’. Two labels read "Signature?"]

• A and B derive from an ancestral gene by speciation: they are orthologs

• B’ appears by duplication of B: they are paralogs
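
The two definitions can be written as a small sketch on the gene tree of this slide. The tree encoding and the function names are hypothetical; the rule used is the usual one, where a pair of genes is classified by the event at their last common ancestor in the gene tree.

```python
# Gene tree as nested tuples: (event, left, right); leaves are gene names.
GENE_TREE = ("speciation", "A", ("duplication", "B", "B'"))

def leaves(node):
    """Set of gene names found under a node."""
    if isinstance(node, str):
        return {node}
    return leaves(node[1]) | leaves(node[2])

def relation(node, g1, g2):
    """Classify a gene pair by the event at its last common ancestor."""
    event, left, right = node
    for child in (left, right):
        # If one subtree still contains both genes, their ancestor is deeper.
        if not isinstance(child, str) and {g1, g2} <= leaves(child):
            return relation(child, g1, g2)
    return "orthologs" if event == "speciation" else "paralogs"

print(relation(GENE_TREE, "A", "B"))    # orthologs
print(relation(GENE_TREE, "B", "B'"))   # paralogs
```

Under the same rule, A and B’ also come out as orthologs, since they are separated by the speciation.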

Page 24

Ancestral genome

Duplication

Deletions

Intra-chromosomal rearrangements

Fusions and fissions

Translocations

Time (tens of millions of years)

Page 25

Identification of duplicate genes

              Tetraodon       Takifugu
n             1078            995
Ks <= 0.35    330 (30.6%)     179 (18.0%)
Ks > 0.35     748 (69.4%)     816 (82.0%)
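
A short sketch of the split at Ks = 0.35 and of the percentages shown. The counts are those of the slide; the function names and the example Ks values are hypothetical.

```python
def split_by_ks(ks_values, threshold=0.35):
    """Split duplicate pairs into Ks <= threshold and Ks > threshold."""
    low = [k for k in ks_values if k <= threshold]
    high = [k for k in ks_values if k > threshold]
    return low, high

def percent(part, total):
    """Share of `part` in `total`, as a percentage with one decimal."""
    return round(100 * part / total, 1)

# Hypothetical Ks values, only to show the split.
low, high = split_by_ks([0.10, 0.35, 0.36, 1.20])
print(len(low), len(high))  # 2 2

# Counts from the slide.
print(percent(330, 1078), percent(748, 1078))  # Tetraodon: 30.6 69.4
print(percent(179, 995), percent(816, 995))    # Takifugu: 18.0 82.0
```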

Page 26

Distribution of 748 duplicate genes in the Tetraodon genome

Page 27

Common ancestor

duplication

diploidization

Homo sapiens

Tetraodon nigroviridis

Page 28

Human genome: synteny with the Tetraodon genome

Tetraodon genome: synteny with the human genome

Page 30

The Paleozoic era

Page 31

The giant placoderm Dunkleosteus (~7 metres) chases two Cladoselache sharks

Drawing by Z. Burian under the direction of Prof. J. Augusta

Page 32

[Figure: phylogeny of the jawed vertebrates]

Gnathostomata (jawed vertebrates)

Chondrichthyes (cartilaginous vertebrates)

Osteichthyes (bony vertebrates)

Sarcopterygii (lobe-finned fishes): Coelacanthimorpha, Tetrapods

Tetrapods: Mammals (Mus musculus, Homo sapiens), Sauropsida (Gallus gallus)

Actinopterygii (ray-finned fishes): Acipenseriforms (sturgeons, …), Teleosts

Teleosts: Otophysi, Percomorphs

Cypriniforms (Danio rerio), Tetraodontiforms (Tetraodon nigroviridis, Takifugu rubripes), Beloniforms (Oryzias latipes)

Page 34

The ancestral osteichthyes genome (bony vertebrates)

Page 35

What are the intermediate steps in the evolution of the Tetraodon and human genomes?

Page 36

Modeling the evolution of a duplicated Tetraodon chromosome

Gene order is progressively rearranged over time along Tetraodon and human chromosomes (independently)

The degree of rearrangement along a chromosome segment is thus a measure of elapsed time

Modeling a few simple cases of chromosomal rearrangements in Tetraodon:

1) no rearrangement
2) a recent fusion between two chromosomes
3) an ancient fusion between two chromosomes
4) a fission (break) of a chromosome
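
One simple way to put a number on "degree of rearrangement" is to count breakpoints between two gene orders. This is only an illustrative proxy: the slides do not say which measure is used, and the function name and gene names below are hypothetical. The sketch assumes both orders contain the same genes.

```python
def count_breakpoints(order_a, order_b):
    """Count adjacent gene pairs of order_a that are not adjacent in order_b."""
    position = {gene: i for i, gene in enumerate(order_b)}
    breaks = 0
    for g1, g2 in zip(order_a, order_a[1:]):
        if abs(position[g1] - position[g2]) != 1:
            breaks += 1
    return breaks

ancestral = ["g1", "g2", "g3", "g4", "g5", "g6"]
unchanged = list(ancestral)
inverted = ["g1", "g2", "g5", "g4", "g3", "g6"]  # one inversion of g3..g5

print(count_breakpoints(ancestral, unchanged))  # 0
print(count_breakpoints(ancestral, inverted))   # 2
```

Under the reasoning of this slide, a segment with more breakpoints would be read as having had more time to rearrange.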

Page 37

A simple case: no interchromosomal rearrangement after the duplication

Tetraodon nigroviridis

Homo sapiens

Ancestral genome

Page 39

Distribution of 6884 orthologs in their respective genomes

[Figure: grid of Tetraodon chromosomes (1 to 21) against human chromosomes (1 to 22 and X); a second panel repeats the human chromosome axis and carries the labels 9 and 11 with the caption "Tetraodon chromosomes"]
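
A grid of this kind can be built by counting ortholog pairs per pair of chromosomes. The sketch below uses a handful of made-up pairs, not the 6884 orthologs of the slide, and the function names are hypothetical.

```python
from collections import Counter

def chromosome_grid(ortholog_pairs):
    """Count orthologs per (Tetraodon chromosome, human chromosome) cell."""
    return Counter(ortholog_pairs)

def best_human_match(grid, tetraodon_chromosome):
    """Human chromosome sharing the most orthologs with one Tetraodon chromosome."""
    cells = {h: n for (t, h), n in grid.items() if t == tetraodon_chromosome}
    return max(cells, key=cells.get)

# Hypothetical ortholog pairs: (Tetraodon chromosome, human chromosome).
pairs = [("2", "7"), ("2", "7"), ("2", "12"), ("5", "X"), ("2", "7")]
grid = chromosome_grid(pairs)

print(grid[("2", "7")], grid[("2", "12")])  # 3 1
print(best_human_match(grid, "2"))          # 7
```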

Page 42

Olivier Jaillon, Jean-Marc Aury, Jean-Louis Petit, Laurence Bouneau, Cécile Fischer, Alain Bernot, Sophie Nicaud, Carole Dossat, Béatrice Segurens, Corinne Dasilva, Marcel Salanoubat, Michael Levy, Nathalie Boudet, Véronique Anthouard, Claire Jubin, Vanina Castelli, Michael Katinka, Benoît Vacherie, Zineb Skalli, Laurence Cattolico, Julie Poulain, Simone Duprat, Philippe Brottier, Guillaume Lardier, Vincent Schachter, Francis Quetier, William Saurin, Claude Scarpelli, Patrick Wincker, Jean Weissenbach, Hugues Roest Crollius

Georges Lutfalla, Christian Biémont, Jean-Nicolas Volff

Jérôme Gouzy, Daniel Kahn

Nicole Stange-Thomann, Evan Mauceli, David Jaffe, Sheila Fisher, Kevin J. McKernan, Paul McEwan, Stephanie Bosak, Mike Zody, Jill Mesirov, Kerstin Lindblad-Toh, Bruce Birren, Chad Nusbaum, Eric S. Lander

Jean-Pierre Coutanceau, Catherine Ozouf-Costaz

Frédéric Brunet, Marc Robinson-Rechavi, Vincent Laudet

Sergi Castellano, Genis Parra, Charles Chapple, Roderic Guigó