A Phylogenetic Approach to Genome Evolution in Drosophila*
Patrick M. O’Grady
University of California, Berkeley
Division of Organisms & Environment
Karl Magnacca: Hylaeus, Hawaiian Drosophila
Rick Lapoint: Hawaiian Drosophila
Gordon Bennett: Nesophrosyne
Matthew Van Dam: Rhaphiomidas, Arenivaga, Rhaphiophoridae
biogeography of the southwest; population genetics; species formation; ecology; conservation biology; arid land use practices
taxonomy; systematics; origin and preservation of biodiversity; comparative method in biology
population genetics; genetics of species formation; coalescent processes; genomics
taxonomy; comparative phylogenetics; biogeography; conservation; evolution of host plant usage
Outline
Introduction
– Need for and application of a phylogenetic perspective in genome-level analyses
Phylogeny
– Phylogenetic relationships in the family Drosophilidae
– Placement of the Hawaiian Drosophila & Scaptomyza
– Time scale for drosophilid evolution
Genome Evolution
– Distribution of transposable elements in 12 Drosophila genomes
Phylogeny
– Evolutionary relationships in the Drosophilidae
[Figure: dated phylogeny of Drosophila species: D. melanogaster, D. simulans, D. sechellia, D. mauritiana, D. yakuba, D. santomea, D. teissieri, D. erecta, D. orena, D. ananassae, D. pseudoobscura, D. persimilis, D. willistoni, D. virilis, D. mojavensis, D. grimshawi. Clade labels: melanogaster subgroup, melanogaster group, obscura group, willistoni group, Sophophora, virilis group, repleta group, Hawaiian Drosophila, Drosophila. Axis: Divergence Time (Million Years), 0 to 50.]
- Taxon sampling
- Level of support
- Complex history
Hawaiian Drosophila
- 1,000 endemics
- diverse behavior & morphology
- single colonist
- biogeography
- ancient: 25 mya
[Figure: phylogeny of Hawaiian Drosophila species (D. hanaulae, D. ingens, D. cyrtoloma, D. melanocephala, D. obscuripes, D. neoperkinsi, D. oahuensis, D. nigribasis, D. neopicta, D. substenoptera, D. hemipeza, D. silvestris, D. heteroneura, D. differens, D. planitibia, D. picticornis, D. setosifrons, D. primaeva, D. adunca), with tips labeled by island codes (K, O, Mo, M, MN, WM, EM, H).]
Applications
phenotype
genotype
genomic content and expression profiles of olfactory and gustatory receptor neurons
comparative approach
Characters
Genomics
– Combination of molecular technology (high throughput DNA sequencing) and computer science (bioinformatics) to determine and disseminate whole genome sequences
Genome-enabled (phylogenomic) research
– Use wealth of information from genome projects focusing on related model organisms to generate characters for phylogenetic and population genetic analyses
Amplify genes based on polytene chromosome location to sample entire genome (56/104). Sample nearly complete mt genome.
Gene tree vs. species tree. Right gene?
[Figure: phylogeny of the genus Drosophila showing the clades Sophophora, virilis-repleta, Hawaiian Drosophila, Scaptomyza, polychaeta, and immigrans-tripunctata, plus the genera Zaprionus, Zygothrica, and Hirtodrosophila; node support values range from 53 to 100.]
The genus Drosophila is composed of six clades of widely divergent taxa.
Several genera are strongly supported as being imbedded within Drosophila.
What now?!
Drosophila Clock
[Figure: dated phylogeny with a time axis from 5 to 55 my.]
Sophophora 33my
virilis-repleta 31-32my
Hawaiian taxa 25my
polychaeta 20my
immigrans-tripunctata 30my
Fossil and biogeographic calibration points
Major bursts of diversification took place ~30 mya; over 90% of the extant drosophilid lineages were established around this time.
• Sea level dropped significantly
• Antarctic ice sheet forms, continuing the cooler period that began at the end of the Cretaceous
• Major biotic exchange between Nearctic and Neotropical regions
• Some major families of host plants originated and diversified during this period (e.g., Cactaceae)
Hawaiian Drosophila and the genus Scaptomyza
[Figure: phylogeny of the Hawaiian Drosophilidae with the robusta, melanica lineage; node support shown: 99.]
The Hawaiian Drosophilidae (genus Scaptomyza plus the Hawaiian Drosophila) form a clade.
This is indicative of a single ancestral colonization event.
The sister group is the robusta/melanica lineage.
picture wing, nudidrosophila: 15my
modified mouthpart: 16my
modified tarsus, antopocerus: 9my
Scaptomyza: 21my
[Figure: species group phylogeny of the Hawaiian Drosophila; node support values range from 61 to 100.]
Species group relationships in the Hawaiian Drosophila lineage agree with previous phylogenetic work and the biogeography of Gardner Pinnacles
[Figure: phylogeny of Scaptomyza subgenera (Elmomyza, Exalloscaptomyza, Bunostoma, Parascaptomyza, Engiscaptomyza, Grimshawomyia, Metascaptomyza); node support values range from 99 to 100.]
Scaptomyza is strongly supported as monophyletic.
Basal members of Scaptomyza (and the sister clade to this group) are Hawaiian, suggesting that this cosmopolitan genus originated in Hawaii and subsequently spread throughout the world.
• Biogeographic data suggest that Scaptomyza species are better long-distance dispersers than Hawaiian Drosophila
– 70% of the described species are endemic to islands
– Galapagos, St. Helena, Tristan da Cunha, Gough
• Ecology, desiccation
• Expand taxon sampling in Scaptomyza to include additional taxa from all 17 subgenera in this genus.
Genome Evolution
– Evolution of transposable elements in 12 Drosophila genomes
Comparative Genomics
“In many ways we are like children in an enchanted forest, wandering almost aimlessly from discovery to discovery. For the moment, at least, that should be sufficient. At some point we will inevitably emerge into a clearing where principles and patterns in the organization and evolution of the genome are evident. Until then, let us be thankful that the pleasures of the forest are so numerous and diverse.”
R. J. MacIntyre (1985)
TEs
• Transposable, or mobile, genetic elements
• Ubiquitous in most eukaryotic genomes: Arabidopsis (10%), rice (40%), Drosophila (15%), C. elegans (15%), mouse (37%), human (40%)
Class I
– Retroelements; copy themselves to RNA and then back to DNA via reverse transcriptase
• LTR and non-LTR
• LINEs and SINEs
• Alu elements
• Similar to retroviruses
Class II
– DNA elements; no RNA intermediate needed, cut-and-paste themselves directly using a transposase enzyme
MITEs – Miniature inverted repeat elements
• Non-autonomous, can’t transpose on their own
• Associated with non-coding regions of plant genes
Life History
Birth
– TE colonizes a nascent genome
Reproduction
– Copy number increases
• active transposition
• selection regulates copy number and transposition freq.
• utilize new niche space = horizontal transfer
Senescence
– Mutation inactivates some TEs
• transposition becomes less frequent
Death
– All elements become non-functional
– Transposition ceases
– TE eventually fades into genomic background
• sometimes leaves a footprint
• may not be discoverable
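The birth, reproduction, senescence, and death stages can be sketched as a toy copy-number model. This is only an illustration under stated assumptions, not a model from the talk: the function name `simulate_te_family`, the transposition rate `u`, the inactivation rate `m`, and the regulation constant `k` are all hypothetical.

```python
def simulate_te_family(generations, u=0.2, m=0.01, k=50.0):
    """Toy TE life history; returns [(active, inactive), ...] per generation.

    Assumptions (all hypothetical, chosen only for illustration):
      u: maximum transposition rate per active copy per generation
      m: rate at which mutation inactivates an active copy
      k: copy number at which regulation halves the transposition rate
    """
    active, inactive = 1.0, 0.0  # birth: one active copy colonizes the genome
    history = [(active, inactive)]
    for _ in range(generations):
        total = active + inactive
        rate = u / (1.0 + total / k)   # regulation: more copies, less transposition
        new_copies = active * rate     # reproduction: active copies transpose
        newly_dead = active * m        # senescence: mutation inactivates copies
        active = active + new_copies - newly_dead
        inactive = inactive + newly_dead
        history.append((active, inactive))
    return history


history = simulate_te_family(5000)
peak_active = max(a for a, _ in history)
final_active, final_inactive = history[-1]
```

In this sketch the active copy number first rises, then declines once regulation pushes the transposition rate below the inactivation rate, leaving mostly inactive copies behind.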
P element
Structure
– 2,907 bp; 31 bp IR; encodes an enzyme that transposes the element throughout the genome
Horizontal Transfer
– jumped from the genome of D. willistoni to D. melanogaster in the past 50 years
– elements are 1 bp different in spite of the fact that the MRCA of the two species lived ~30 mya
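The argument for horizontal transfer rests on simple arithmetic, sketched below. The only inputs are the figures on the slide (2,907 bp, 1 bp difference, ~30 my); the function names are hypothetical, and the factor of 2 assumes both lineages accumulate changes independently after the split.

```python
ELEMENT_LENGTH_BP = 2907   # from the slide
DIFFERENCES_BP = 1         # from the slide
MRCA_AGE_YEARS = 30e6      # ~30 mya, from the slide


def observed_divergence(differences, length):
    """Proportion of sites that differ between the two element copies."""
    return differences / length


def implied_rate_if_vertical(divergence, mrca_age_years):
    """Substitution rate (per site per year) needed to explain the observed
    divergence if the element had been inherited vertically since the MRCA.
    Both lineages evolve independently, hence the factor of 2."""
    return divergence / (2 * mrca_age_years)


divergence = observed_divergence(DIFFERENCES_BP, ELEMENT_LENGTH_BP)
rate = implied_rate_if_vertical(divergence, MRCA_AGE_YEARS)
print(f"observed divergence: {divergence:.6f}")
print(f"implied rate under vertical inheritance: {rate:.2e} per site per year")
```

If the implied rate is far lower than any substitution rate one considers plausible for these genomes, vertical inheritance is a poor explanation and a recent transfer is the simpler one.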
Survey
– BLAST known Drosophila elements (~180)
Mapping significant BLAST hits onto the species phylogeny (better than e-10, bit score >100)
[Figure: species phylogeny of the 12 genomes (D. melanogaster, D. simulans, D. sechellia, D. yakuba, D. erecta, D. ananassae, D. pseudoobscura, D. persimilis, D. willistoni, D. virilis, D. mojavensis, D. grimshawi) with TE gains and losses mapped. Legend: unambiguous gain, no homoplasy; unambiguous gain, homoplasy; unambiguous loss.]
Complex history of gains and losses
- mirrors phylogenetic relationship between target and origin
- some interesting exceptions: simulans, erecta, grimshawi
Source bias, need TE profile for each taxon
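The significance cutoff used for the survey (e-value better than e-10, bit score >100) could be applied to BLAST tabular output roughly as follows. This is a sketch, not the pipeline used in the talk: it assumes the standard 12-column tabular format (`-outfmt 6`), in which columns 11 and 12 hold the e-value and bit score, and the function name and sample rows are made up for illustration.

```python
import csv
import io


def significant_hits(tabular_text, max_evalue=1e-10, min_bitscore=100.0):
    """Keep hits with e-value below max_evalue and bit score above min_bitscore.

    Expects BLAST tabular output with the default 12 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue < max_evalue and bitscore > min_bitscore:
            hits.append((row[0], row[1], evalue, bitscore))
    return hits


# Hypothetical example rows (scaffold names and scores are invented).
SAMPLE = (
    "Uhu\tscaffold_A\t98.5\t1600\t24\t0\t1\t1600\t500\t2099\t0.0\t2800\n"
    "Loa\tscaffold_B\t80.0\t150\t30\t0\t1\t150\t10\t159\t1e-05\t60.2\n"
    "pogo\tscaffold_C\t91.0\t400\t36\t0\t1\t400\t900\t1299\t2e-40\t310\n"
)

for query, subject, evalue, bitscore in significant_hits(SAMPLE):
    print(query, subject, evalue, bitscore)
```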
Unique TEs
TE sequences present in only a single genome
[Figure: species phylogeny of the 12 genomes with the number of unique TEs per genome (values shown: 1, 5, 6, 1, 1, 1).]
Truly unique TEs are rare
- most are inherited vertically
- life cycle is long enough to persist through multiple speciation events
- horizontal transfer is rare
Shared TEs simple
[Figure: species phylogeny of the 12 genomes with shared TEs mapped onto branches; counts of elements shown per branch.]
Blue elements track phylogeny
Red elements show homoplasy
Have some transposons been lost in some taxa?
Were there multiple gains via horizontal transfer?
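One way to separate elements that track phylogeny from those that show homoplasy is to count the minimum number of presence/absence changes on the species tree with Fitch parsimony. The sketch below is an assumption-laden illustration, not the method used for the slides: the tree topology is the commonly used one for the 12 sequenced genomes and is hard-coded here, and the function names are hypothetical.

```python
# Assumed topology for the 12 genomes, as nested 2-tuples (not taken from the slides).
SPECIES_TREE = (
    (
        (
            (
                (
                    ("D. melanogaster", ("D. simulans", "D. sechellia")),
                    ("D. yakuba", "D. erecta"),
                ),
                "D. ananassae",
            ),
            ("D. pseudoobscura", "D. persimilis"),
        ),
        "D. willistoni",
    ),
    (("D. virilis", "D. mojavensis"), "D. grimshawi"),
)


def fitch(node, present):
    """Return (possible ancestral states, minimum changes) for a subtree."""
    if isinstance(node, str):  # leaf: state is 1 if the TE is present, else 0
        return ({1} if node in present else {0}), 0
    left_states, left_changes = fitch(node[0], present)
    right_states, right_changes = fitch(node[1], present)
    shared = left_states & right_states
    if shared:  # children agree: no change needed at this node
        return shared, left_changes + right_changes
    return left_states | right_states, left_changes + right_changes + 1


def min_changes(present):
    """Minimum number of gains/losses needed to explain a TE distribution."""
    return fitch(SPECIES_TREE, set(present))[1]


def classify(present):
    changes = min_changes(present)
    if changes == 0:
        return "invariant"
    return "tracks phylogeny" if changes == 1 else "homoplasy"


print(classify({"D. melanogaster", "D. simulans", "D. sechellia"}))
print(classify({"D. melanogaster", "D. grimshawi"}))
```

Parsimony alone cannot say whether a homoplastic pattern reflects repeated loss or repeated gain by horizontal transfer; it only flags the elements for which the question arises.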
Shared TEs complex
[Figure: two panels of the 12-genome species phylogeny (D. melanogaster through D. grimshawi) showing TEs with more complex distributions.]
Is horizontal transfer more likely in those transposable elements that show a more dispersed distributional pattern?
Is this a difference in timing of life histories?
Summary
Loss
Two species in the melanogaster subgroup, D. simulans and D. erecta, have lost TEs present in their closest relatives.
- not due to source bias or sampling
- could this be correlated with population history, genome size, or some other aspect of the biology of these species?
D. grimshawi
D. grimshawi has far fewer TEs (18) than D. mojavensis (73) and D. virilis (68).
- Source bias?
- Perhaps, but these three taxa are essentially equidistant from D. melanogaster
- Could this be an effect of being isolated on a remote Pacific Island for ~25 my?
- reduced opportunity for horizontal transfer; “genomic isolation”
- Mode of speciation?
Survey
– tblastx of known Drosophila elements (~180)
– computational methods to discover novel elements
Structure & Function
– copy number and genomic location
– is/are coding region(s) complete?
– phylogenetic analyses - horizontal transfer?
– molecular evolution of TEs
– 3D structure prediction of transposase, reverse transcriptase, and other key protein regions
First pass: filter to obtain largest TEs
Second pass: obtain remaining elements
– phylogenetic analyses
• Long-term evolutionary history of TEs in genome
• Relationships of TE families
Unique TEs
pogo
TE found only in the D. melanogaster genome
- 21 bp IR; ~2200 bp long
- Sequence variation is low
- Only one SNP in the 5’ IR; 9 SNPs in the rest of the 17 full-length, internally deleted, or terminally truncated elements
Full length: 4
Internally deleted: 11
Terminally truncated: 2
Only IRs: ~30
Uhu element
[Figure: element diagram labeled LTR, reverse transcriptase, LTR; copy numbers shown for Hawaiian Drosophila species (D. heteroneura, D. grimshawi, D. picticornis, D. silvestris, D. planitibia, D. differens): ~25*, ~123*, ~117*, ~58*, ~166*, ~600; scale from 0 to 0.2+.]
* in-situ hybridization to polytene chromosomes (Wisotzkey et al. 1997)
Uhu Phylogeny
[Figure: Uhu phylogeny of GenBank sequences (D. heteroneura X63028, X63029, X17356; D. silvestris X63038, X63032; D. planitibia X63035, X63034, X63033; D. differens X63031, X63030; D. picticornis X63037, X63036) and D. grimshawi genome scaffolds (15110a, 13733, 15252, 14853b, 14853c, 14906, 3137, 14853d, 14822, 14822b, 15203, 14853, 14853e, 15110b, 14822c, 15116); node support values range from 58 to 100; scale bar 10.]
- parsimony
Inactive copies
- stop codons
- frameshifts
- deletions
Species tree
D. grimshawi
- pairwise divergence: 0.00181 to 0.09457
- activity
- vertical vs. horizontal transfer?
- timing of life cycle?
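A pairwise divergence range like the one reported for the D. grimshawi copies can be computed from an alignment as the proportion of differing sites (p-distance). The sketch below assumes uncorrected p-distances over pre-aligned sequences and skips gapped or ambiguous columns; the slides do not say which distance measure was used, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, skip="-N"):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap or N are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip or b in skip:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def pairwise_range(aligned):
    """Return (min, max) p-distance over all pairs in a {name: sequence} dict."""
    distances = [
        p_distance(aligned[x], aligned[y]) for x, y in combinations(aligned, 2)
    ]
    return min(distances), max(distances)


# Toy alignment (invented sequences, for illustration only).
copies = {
    "copy_1": "ACGTACGTAC",
    "copy_2": "ACGTACGTAC",
    "copy_3": "ACGTTCGTAA",
}
print(pairwise_range(copies))
```

A narrow range with low values is what one would expect from recently active copies, while a wide range suggests copies of very different ages.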
Uhu Project
- Incorporate remaining Uhu sequences
- Phylogeny, deletions, molecular evolution
- Localize to D. grimshawi chromosomes
- Do positions match in other taxa?
- Protein structure predictions
• Drosophilidae Phylogeny – Jim Bonacum (UI Springfield), Rob DeSalle (AMNH), Kenneth Kaneshiro (UH Manoa)
• Biogeography and Molecular Clock Analyses – Chelsea Specht & Jon Price (Smithsonian)
• Drosophila Transposons – Richard Lapoint, Gordon Bennett, Karl Magnacca, Matthew Van Dam
• Laboratory Assistance – Amy Turmelle, Jessica Pearson, Ed Reiman, Sean Sullivan, Mark Giannullo, Jake Wintermute
Loa element
[Figure: Loa element structure with orf1 and orf2 in D. silvestris and D. grimshawi; scale bar 200bp; scale from 0 to 0.2+. Sources shown: Felger & Hunt, 1992; SNAP; Don Gilbert, 2006.]
Loa Phylogeny
- parsimony
Inactive copies
- stop codons
- no frameshifts
- no deletions
D. grimshawi
- pairwise divergence: 0.00213 to 0.01104
- activity
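Screening copies for signs of inactivation (premature stop codons, frameshifts) can be sketched as below. This is a simplified illustration rather than the procedure used in the talk: it assumes each copy's ORF has already been extracted in frame, uses the standard stop codons, and treats a length difference from a reference ORF that is not a multiple of three as a frameshift. All names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def premature_stops(orf_dna):
    """Return codon indices of in-frame stop codons before the final codon."""
    seq = orf_dna.upper()
    usable = len(seq) - len(seq) % 3           # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # The last codon is expected to be the normal stop, so it is not reported.
    return [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


def has_frameshift(orf_dna, reference_length):
    """True if the length differs from the reference by a non-multiple of 3."""
    return (len(orf_dna) - reference_length) % 3 != 0


def looks_inactive(orf_dna, reference_length):
    """Flag a copy with a premature stop codon or an apparent frameshift."""
    return bool(premature_stops(orf_dna)) or has_frameshift(orf_dna, reference_length)


# Toy sequences (invented, for illustration only).
print(looks_inactive("ATGAAATAA", 9))      # intact toy ORF
print(looks_inactive("ATGTAAAAATAA", 12))  # premature stop at codon index 1
print(looks_inactive("ATGAAAATAA", 9))     # 1 bp insertion relative to reference
```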
[Figure: Loa phylogeny of D. silvestris sequences (X60174, X60175, X60176) and D. grimshawi genome scaffolds (6903, 14822, 8020, 7657, 5896, 3602, 14822b, 4020, 8401, 3325, 7987, 3860, 6680, 8003, 14952); scale bar 10.]
- Gene prediction incorrect?