A Phylogenetic Approach to Genome Evolution in Drosophila* Patrick M. O’Grady University of California, Berkeley Division of Organisms & Environment

Page 1:

A Phylogenetic Approach to Genome Evolution in Drosophila*

Patrick M. O’Grady

University of California, Berkeley

Division of Organisms & Environment

Page 2:

Karl Magnacca: Hylaeus, Hawaiian Drosophila

Rick Lapoint: Hawaiian Drosophila

Gordon Bennett: Nesophrosyne

Matthew Van Dam: Rhaphiomidas, Arenivaga, Rhaphidophoridae

Page 3:

biogeography of the southwest; population genetics; species formation; ecology; conservation biology; arid land use practices

taxonomy; systematics; origin and preservation of biodiversity; comparative method in biology

population genetics; genetics of species formation; coalescent processes; genomics

taxonomy; comparative phylogenetics; biogeography; conservation; evolution of host plant usage

Page 4:

Outline

Introduction

– Need for and application of a phylogenetic perspective in genome-level analyses

Phylogeny

– Phylogenetic relationships in the family Drosophilidae

– Placement of the Hawaiian Drosophila & Scaptomyza

– Time scale for drosophilid evolution

Genome Evolution

– Distribution of transposable elements in 12 Drosophila genomes

Page 5:

Phylogeny

– Evolutionary relationships in the Drosophilidae

[Figure: time-calibrated phylogeny. Axis: Divergence Time (Million Years), 0 to 50. Taxa: D. melanogaster, D. simulans, D. sechellia, D. mauritiana, D. yakuba, D. santomea, D. teissieri, D. erecta, D. orena, D. ananassae, D. pseudoobscura, D. persimilis, D. willistoni, D. virilis, D. mojavensis, D. grimshawi. Clade labels: melanogaster subgroup, melanogaster group, obscura group, willistoni group, Sophophora, virilis group, repleta group, Hawaiian Drosophila, Drosophila.]

- Taxon sampling

- Level of support

- Complex history

Page 6:

Hawaiian Drosophila

- 1,000 endemics

- diverse behavior & morphology

- single colonist

- biogeography

- ancient: 25 mya

[Figure: phylogeny of Hawaiian Drosophila with island labels. Taxa: D. hanaulae, D. ingens, D. cyrtoloma, D. melanocephala, D. obscuripes, D. neoperkinsi, D. oahuensis, D. nigribasis, D. neopicta, D. substenoptera, D. hemipeza, D. silvestris, D. heteroneura, D. differens, D. planitibia, D. picticornis, D. setosifrons, D. primaeva, D. adunca. Island codes shown: K, H, M, Mo, O, MN, EM, WM.]

Page 7:

Applications

phenotype

genotype

genomic content and expression profiles of olfactory and gustatory receptor neurons

comparative approach

Page 9:

Genomics

– Combination of molecular technology (high-throughput DNA sequencing) and computer science (bioinformatics) to determine and disseminate whole genome sequences

Genome-enabled (phylogenomic) research

– Use the wealth of information from genome projects focusing on related model organisms to generate characters for phylogenetic and population genetic analyses

Characters

– Amplify genes based on polytene chromosome location to sample the entire genome (56/104). Sample the nearly complete mt genome.

– Gene tree vs. species tree. Right gene?

Page 10:

[Figure: phylogeny of the genus Drosophila. Clade labels: Sophophora, virilis-repleta, Hawaiian Drosophila, polychaeta, immigrans-tripunctata. Node support values shown: 98, 99, 53, 99, 100, 99.]

The genus Drosophila is composed of six clades of widely divergent taxa.

Page 11:

[Figure: phylogeny showing the genera Scaptomyza, Zaprionus, Zygothrica, and Hirtodrosophila. Node support values shown: 96, 99, 96.]

The genus Drosophila is composed of six clades of widely divergent taxa.

Several genera are strongly supported as being embedded within Drosophila.

What now?!

Page 12:

Drosophila Clock

[Figure: dated phylogeny with fossil and biogeographic calibration points. Time axis ticks: 5, 15, 25, 35, 45, 55.]

Sophophora 33my

virilis-repleta 31-32my

immigrans-tripunctata 30my

Hawaiian taxa 25my

polychaeta 20my

Page 13:

Major bursts of diversification took place ~30 mya; over 90% of the extant drosophilid lineages were established around this time.

• Sea level dropped significantly

• Antarctic ice sheet formed, continuing the cooler period that began at the end of the Cretaceous

• Major biotic exchange between Nearctic and Neotropical regions

• Some major families of host plants originated and diversified during this period (e.g., Cactaceae)

Page 14:

Hawaiian Drosophila and the genus Scaptomyza

[Figure: phylogeny showing the Hawaiian Drosophilidae and the robusta and melanica lineages. Node support value shown: 99.]

The Hawaiian Drosophilidae (genus Scaptomyza plus the Hawaiian Drosophila) form a clade.

This is indicative of a single ancestral colonization event.

The sister group is the robusta/melanica lineage.

Page 15:

[Figure: dated phylogeny of the Hawaiian Drosophilidae. Node support values shown: 61, 100, 100, 99, 100, 99.]

Scaptomyza: 21my

modified mouthpart: 16my

picture wing, nudidrosophila: 15my

modified tarsus, antopocerus: 9my

Species group relationships in the Hawaiian Drosophila lineage agree with previous phylogenetic work and with the biogeography of Gardner Pinnacles.

Page 16:

[Figure: phylogeny of Scaptomyza. Subgenus labels: Elmomyza, Exalloscaptomyza, Bunostoma, Parascaptomyza, Engiscaptomyza, Grimshawomyia, Metascaptomyza. Node support values shown are all 99 or 100.]

Scaptomyza is strongly supported as monophyletic.

Basal members of Scaptomyza (and the sister clade to this group) are Hawaiian, suggesting that this cosmopolitan genus originated in Hawaii and subsequently spread throughout the world.

• Biogeographic data suggest that Scaptomyza species are better long-distance dispersers than Hawaiian Drosophila

– 70% of the described species are endemic to islands

– Galapagos, St. Helena, Tristan da Cunha, Gough

• Ecology, desiccation

• Expand taxon sampling in Scaptomyza to include additional taxa from all 17 subgenera in this genus.

Page 17:

Outline

Introduction

– Need for and application of a phylogenetic perspective in genome-level analyses

Phylogeny

– Phylogenetic relationships in the family Drosophilidae

– Placement of the Hawaiian Drosophila & Scaptomyza

– Time scale for drosophilid evolution

Genome Evolution

– Evolution of transposable elements in 12 Drosophila genomes

Page 18:

Comparative Genomics

“In many ways we are like children in an enchanted forest, wandering almost aimlessly from discovery to discovery. For the moment, at least, that should be sufficient. At some point we will inevitably emerge into a clearing where principles and patterns in the organization and evolution of the genome are evident. Until then, let us be thankful that the pleasures of the forest are so numerous and diverse.”

R. J. MacIntyre (1985)

Page 19:

TEs

• Transposable, or mobile, genetic elements

• Ubiquitous in most eukaryotic genomes: Arabidopsis (10%), rice (40%), Drosophila (15%), C. elegans (15%), mouse (37%), human (40%)

Class I

– Retroelements; copy themselves to RNA and then back to DNA via reverse transcriptase

• LTR and non-LTR

• LINEs and SINEs

• Similar to retroviruses

• Alu elements

Class II

– DNA elements; no RNA intermediate needed, cut-and-paste themselves directly using a transposase enzyme

MITEs – Miniature inverted-repeat transposable elements

• Non-autonomous, can’t transpose on their own

• Associated with non-coding regions of plant genes

Page 20:

Life History

Birth

– TE colonizes a nascent genome

Reproduction

– Copy number increases

• active transposition

• selection regulates copy number and transposition frequency

• utilize new niche space = horizontal transfer

Senescence

– Mutation inactivates some TEs

• transposition becomes less frequent

Death

– All elements become non-functional

– Transposition ceases

– TE eventually fades into genomic background

• sometimes leaves a footprint

• may not be discoverable

Page 21:

P element

Structure

– 2,907bp; 31bp IR; encodes an enzyme that transposes the element throughout the genome

Horizontal Transfer

– jumped from the genome of D. willistoni to D. melanogaster in the past 50 years

– the elements in the two species differ by 1bp, even though the MRCA of the host species lived ~30mya
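The contrast on this slide can be made concrete with a little arithmetic. The sketch below compares the observed difference (1 bp over 2,907 bp) with the divergence one would expect if the element had been inherited vertically since the ~30 mya host split. The substitution rate is an assumed round number chosen for illustration, not a value from the talk, and the estimate ignores multiple hits at the same site.

```python
# Rough comparison of observed vs. expected P element divergence.
# ASSUMED_RATE is a hypothetical neutral substitution rate (per site per
# year); it is an illustration value, not a figure from the talk.
P_ELEMENT_LENGTH_BP = 2907   # from the slide
OBSERVED_DIFFERENCES = 1     # from the slide
HOST_SPLIT_YEARS = 30e6      # ~30 mya, from the slide
ASSUMED_RATE = 1e-8          # assumption for illustration only


def observed_divergence(differences: int, length_bp: int) -> float:
    """Proportion of sites that differ between the two element copies."""
    return differences / length_bp


def expected_divergence(rate: float, split_years: float) -> float:
    """Expected substitutions per site if both lineages evolved
    independently since the split (no correction for multiple hits)."""
    return 2 * rate * split_years


observed = observed_divergence(OBSERVED_DIFFERENCES, P_ELEMENT_LENGTH_BP)
expected = expected_divergence(ASSUMED_RATE, HOST_SPLIT_YEARS)

print(f"observed: {observed:.6f} substitutions per site")
print(f"expected under vertical inheritance: {expected:.2f} substitutions per site")
print(f"ratio expected/observed: {expected / observed:.0f}")
```

Under these assumptions the expected divergence is more than a thousand times the observed one, which is the sense in which near-identical copies in distantly related hosts point to recent horizontal transfer rather than vertical inheritance.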

Page 22:

Survey

– BLAST known Drosophila elements (~180)

Page 23:

Mapping significant BLAST hits onto the species phylogeny (better than e-10, bit score >100)

[Figure: phylogeny of the 12 genomes (D. melanogaster, D. simulans, D. sechellia, D. yakuba, D. erecta, D. ananassae, D. pseudoobscura, D. persimilis, D. willistoni, D. virilis, D. mojavensis, D. grimshawi) with TE gains and losses marked on branches. Legend: unambiguous gain, no homoplasy; unambiguous gain, homoplasy; unambiguous loss.]

Complex history of gains and losses

- mirrors phylogenetic relationship between target and origin

- some interesting exceptions: simulans, erecta, grimshawi

Source bias, need TE profile for each taxon
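The significance thresholds quoted on this slide can be applied as a simple filter. The sketch below assumes the hits are stored in BLAST's 12-column tabular output (e-value in column 11, bit score in column 12) and that both thresholds must be met; the talk does not say how the hits were stored or combined. The example rows and all names in them are made up.

```python
import csv
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str      # name of the TE used as query
    subject: str    # genome sequence that was hit
    evalue: float
    bitscore: float


def significant_hits(lines: Iterable[str],
                     max_evalue: float = 1e-10,
                     min_bitscore: float = 100.0) -> List[Hit]:
    """Keep hits with e-value below max_evalue and bit score above min_bitscore."""
    hits = []
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue < max_evalue and bitscore > min_bitscore:
            hits.append(Hit(row[0], row[1], evalue, bitscore))
    return hits


# Hypothetical example rows (made-up names and numbers).
EXAMPLE = [
    "TE_a\tscaffold_1\t95.0\t500\t25\t0\t1\t500\t1001\t1500\t2e-120\t450",
    "TE_a\tscaffold_2\t80.0\t60\t12\t0\t1\t60\t10\t69\t1e-05\t48.2",
    "TE_b\tscaffold_3\t88.0\t90\t11\t0\t1\t90\t5\t94\t3e-12\t85.1",
]

for hit in significant_hits(EXAMPLE):
    print(hit.query, hit.subject, hit.evalue, hit.bitscore)
```

Only the first example row passes: the second fails both thresholds and the third has a good e-value but a bit score below 100.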

Page 24:

Unique TEs

TE sequences present in only a single genome

[Figure: phylogeny of the 12 genomes (D. melanogaster, D. simulans, D. sechellia, D. yakuba, D. erecta, D. ananassae, D. pseudoobscura, D. persimilis, D. willistoni, D. virilis, D. mojavensis, D. grimshawi) with counts of unique TEs marked on the tree: 1, 5, 6, 1, 1, 1.]

Truly unique TEs are rare

- most are inherited vertically

- life cycle is long enough to persist through multiple speciation events

- horizontal transfer is rare
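The definition on this slide (a TE present in only a single genome) translates directly into a lookup over a presence/absence table. The sketch below is illustrative only: the "pogo" and "P" entries follow what the talk says about those elements, "TE_x" is a hypothetical placeholder, and the table is not the survey's real data.

```python
from typing import Dict, Set

# Presence of TE families by genome. "pogo" and "P" follow statements in
# the talk; "TE_x" is hypothetical. This is not the survey's data set.
PRESENCE: Dict[str, Set[str]] = {
    "pogo": {"D. melanogaster"},
    "P": {"D. melanogaster", "D. willistoni"},
    "TE_x": {"D. virilis", "D. mojavensis", "D. grimshawi"},
}


def unique_tes(presence: Dict[str, Set[str]]) -> Dict[str, str]:
    """Map each TE found in exactly one genome to that genome."""
    return {
        te: next(iter(genomes))
        for te, genomes in presence.items()
        if len(genomes) == 1
    }


def unique_counts(presence: Dict[str, Set[str]]) -> Dict[str, int]:
    """Count unique TEs per genome, as tallied on the tree figure."""
    counts: Dict[str, int] = {}
    for genome in unique_tes(presence).values():
        counts[genome] = counts.get(genome, 0) + 1
    return counts


print(unique_tes(PRESENCE))
print(unique_counts(PRESENCE))
```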

Page 25:

Shared TEs: simple

[Figure: phylogeny of the 12 genomes (D. melanogaster, D. simulans, D. sechellia, D. yakuba, D. erecta, D. ananassae, D. pseudoobscura, D. persimilis, D. willistoni, D. virilis, D. mojavensis, D. grimshawi) with counts of shared TEs marked on the tree: 3, 11, 3, 1, 12, 22, 11, 4, 1, 6, 4, 11, 1, 1.]

Blue elements track phylogeny

Red elements show homoplasy

Have some transposons been lost in some taxa?

Were there multiple gains via horizontal transfer?
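One way to separate elements that track the phylogeny from elements that show homoplasy is to count the minimum number of state changes a presence/absence pattern needs on the species tree. The sketch below uses Fitch parsimony for this; it is a stand-in technique, since the talk does not say how gains and losses were mapped. The tree is the commonly published topology for the 12 genomes, entered here as an assumption, and both example patterns are hypothetical.

```python
from typing import Set, Tuple

# Assumed topology for the 12 genomes (nested pairs); not taken from the slides.
TREE = (
    (
        (
            (
                (("D. melanogaster", ("D. simulans", "D. sechellia")),
                 ("D. yakuba", "D. erecta")),
                "D. ananassae",
            ),
            ("D. pseudoobscura", "D. persimilis"),
        ),
        "D. willistoni",
    ),
    (("D. virilis", "D. mojavensis"), "D. grimshawi"),
)


def fitch(node, present: Set[str]) -> Tuple[Set[bool], int]:
    """Return (possible states, minimum changes) for the subtree at node."""
    if isinstance(node, str):
        return {node in present}, 0
    left, right = node
    left_states, left_changes = fitch(left, present)
    right_states, right_changes = fitch(right, present)
    changes = left_changes + right_changes
    shared = left_states & right_states
    if shared:
        return shared, changes
    # No state in common: one change is required on this pair of branches.
    return left_states | right_states, changes + 1


def min_changes(present: Set[str]) -> int:
    """Minimum number of gains plus losses needed to explain the pattern."""
    return fitch(TREE, present)[1]


# Hypothetical patterns: one restricted to a clade, one scattered.
clade_pattern = {"D. melanogaster", "D. simulans", "D. sechellia"}
scattered_pattern = {"D. melanogaster", "D. willistoni"}

print(min_changes(clade_pattern))
print(min_changes(scattered_pattern))
```

A pattern that needs a single change is consistent with one gain and vertical inheritance. A pattern that needs more is homoplasious, but the count alone does not say whether the extra steps were losses or independent gains by horizontal transfer, which is the question the slide raises.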

Page 26:

Shared TEs: complex

[Figure: two panels, each showing the phylogeny of the 12 genomes (D. melanogaster, D. simulans, D. sechellia, D. yakuba, D. erecta, D. ananassae, D. pseudoobscura, D. persimilis, D. willistoni, D. virilis, D. mojavensis, D. grimshawi).]

Is horizontal transfer more likely in those transposable elements that show a more dispersed distributional pattern?

Is this a difference in timing of life histories?

Page 27:

Summary

Loss

Two species in the melanogaster subgroup, D. simulans and D. erecta, have lost TEs present in their closest relatives.

- not due to source bias or sampling

- could this be correlated with population history, genome size, or some other aspect of the biology of these species?

D. grimshawi

D. grimshawi has far fewer TEs (18) than D. mojavensis (73) and D. virilis (68).

- Source bias? Perhaps, but these three taxa are essentially equidistant from D. melanogaster

- Could this be an effect of being isolated on a remote Pacific Island for ~25 my?

- reduced opportunity for horizontal transfer; “genomic isolation”

- Mode of speciation?

Page 28:

Survey

– tblastx of known Drosophila elements (~180)

– computational methods to discover novel elements

First pass: filter to obtain largest TEs

Second pass: obtain remaining elements

– phylogenetic analyses

• Long-term evolutionary history of TEs in genome

• Relationships of TE families

Structure & Function

– copy number and genomic location

– is/are coding region(s) complete?

– phylogenetic analyses - horizontal transfer?

– molecular evolution of TEs

– 3D structure prediction of transposase, reverse transcriptase, and other key protein regions

Page 30:

pogo TE found only in the D. melanogaster genome

- 21bp IR; ~2200bp long

- Sequence variation is low

- Only one SNP in the 5’ IR; 9 SNPs in the rest of the 17 full length, internally deleted, or terminally truncated elements

Full length: 4

Internally deleted: 11

Terminally truncated: 2

Only IRs: ~30

Page 31:

Uhu element

[Figure: Uhu element structure (labels: LTR, reverse transcriptase, LTR) and copy numbers for D. heteroneura, D. grimshawi, D. picticornis, D. silvestris, D. planitibia, and D. differens. Copy numbers shown: ~25*, ~123*, ~117*, ~58*, ~166*, ~600. Scale values: 0, 0.05, 0.1, 0.15, 0.2, 0.2+.]

* in-situ hybridization to polytene chromosomes (Wisotzkey et al. 1997)

Page 32:

Uhu Phylogeny

[Figure: tree of Uhu sequences. Tips: D. heteroneura X63028, X63029, X17356; D. silvestris X63038, X63032; D. planitibia X63035, X63034, X63033; D. differens X63031, X63030; D. picticornis X63037, X63036; scaffolds 15110a, 13733, 15252, 14853b, 14853c, 14906, 3137, 14853d, 14822, 14822b, 15203, 14853, 14853e, 15110b, 14822c, 15116. Node support values shown: 100, 100, 76, 64, 100, 59, 95, 75, 63, 58, 72, 58, 92, 76, 86. Scale bar: 10.]

- parsimony

Species tree

Inactive copies

- stop codons

- frameshifts

- deletions

- D. grimshawi

pairwise divergence: 0.00181 - 0.09457

- activity

- vertical vs. horizontal transfer?

- timing of life cycle?
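A pairwise divergence range like the one quoted above can be obtained from an alignment of element copies. The sketch below uses the uncorrected p-distance (proportion of differing sites, skipping gaps and ambiguous bases); this is an assumption, since the talk does not state which distance measure was used. The toy alignment is hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites among positions where both aligned
    sequences have an unambiguous base (A, C, G, or T)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            differences += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def divergence_range(alignment: Dict[str, str]) -> Tuple[float, float]:
    """Smallest and largest pairwise p-distance in the alignment."""
    distances = [
        p_distance(a, b) for a, b in combinations(alignment.values(), 2)
    ]
    return min(distances), max(distances)


# Hypothetical toy alignment of three element copies.
TOY_ALIGNMENT = {
    "copy_1": "ACGTACGTAC",
    "copy_2": "ACGTACGTAT",
    "copy_3": "ACGAAC-TAT",
}

low, high = divergence_range(TOY_ALIGNMENT)
print(f"pairwise divergence: {low:.5f} - {high:.5f}")
```

A narrow range with low values suggests that the copies spread recently, while a wide range is what one would expect from copies that have been accumulating mutations for different lengths of time.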

Page 33:

Uhu Project

- Incorporate remaining Uhu sequences

- Phylogeny, deletions, molecular evolution

- Localize to D. grimshawi chromosomes

- Do positions match in other taxa?

- Protein structure predictions

Page 34:

• Drosophilidae Phylogeny – Jim Bonacum (UI Springfield), Rob DeSalle (AMNH), Kenneth Kaneshiro (UH Manoa)

• Biogeography and Molecular Clock Analyses – Chelsea Specht & Jon Price (Smithsonian)

• Drosophila Transposons – Richard Lapoint, Gordon Bennett, Karl Magnacca, Matthew Van Dam

• Laboratory Assistance – Amy Turmelle, Jessica Pearson, Ed Reiman, Sean Sullivan, Mark Giannullo, Jake Wintermute

Page 35:

Loa element

[Figure: Loa element structure in D. silvestris and D. grimshawi, showing orf1 and orf2. Scale bar: 200bp. Sources cited on the figure: Felger & Hunt, 1992; SNAP; Don Gilbert, 2006. Scale values: 0, 0.05, 0.1, 0.15, 0.2, 0.2+.]

Page 36:

Loa Phylogeny

[Figure: tree of Loa sequences. Tips: D. silvestris X60174, X60175, X60176; scaffolds 6903, 14822, 8020, 7657, 5896, 3602, 14822b, 4020, 8401, 3325, 7987, 3860, 6680, 8003, 14952. Scale bar: 10.]

- parsimony

Inactive copies

- stop codons

- no frameshifts

- no deletions

- D. grimshawi

pairwise divergence: 0.00213 - 0.01104

- activity

- Gene prediction incorrect?
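Copies are flagged as inactive here on the basis of stop codons, so a quick scan for premature in-frame stops is one way to reproduce that check. The sketch below assumes the predicted coding sequence starts in frame at position 0 and uses the standard stop codons; if the gene prediction is off, as the slide asks, the reported stops may be artifacts of the wrong frame. The example sequences are made up.

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def premature_stops(cds: str) -> List[int]:
    """Return 0-based positions of in-frame stop codons that occur before
    the final codon of the sequence (frame assumed to start at 0)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    positions = []
    for start in range(0, usable - 3, 3):  # stop before the final codon
        if cds[start:start + 3] in STOP_CODONS:
            positions.append(start)
    return positions


# Hypothetical example sequences.
intact = "ATGAAAGGGTAA"       # stop codon only at the end
interrupted = "ATGTAGGGGTAA"  # premature TAG at position 3

print(premature_stops(intact))
print(premature_stops(interrupted))
```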