Cis -regulatory variation and divergence in...

ACTA UNIVERSITATIS UPSALIENSIS UPPSALA 2016 Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1332 Cis-regulatory variation and divergence in Capsella KIM A. STEIGE ISSN 1651-6214 ISBN 978-91-554-9442-1 urn:nbn:se:uu:diva-268953


Dissertation presented at Uppsala University to be publicly examined in Lindahlsalen, Norbyvägen 18 A, Uppsala, Monday, 15 February 2016 at 13:15 for the degree of Doctor of Philosophy. The examination will be conducted in English. Faculty examiner: Professor Dr. Jeff J. Doyle (Department of Plant Biology, Cornell University).

Abstract

Steige, K. A. 2016. Cis-regulatory variation and divergence in Capsella. Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1332. 45 pp. Uppsala: Acta Universitatis Upsaliensis. ISBN 978-91-554-9442-1.

Cis-regulatory changes in e.g. promoters or enhancers that affect the expression of a linked focal gene have long been thought to be important for adaptation. In this thesis, I investigate the selective importance and genomic correlates of cis-regulatory variation and divergence in the genus Capsella, using massively parallel sequencing data. This genus provides an opportunity to investigate cis-regulatory changes in response to polyploidization and mating system shifts, as it harbors three diploid species, the outcrosser Capsella grandiflora and the selfers Capsella orientalis and Capsella rubella, as well as the tetraploid Capsella bursa-pastoris. We first identify cis-regulatory changes associated with adaptive floral evolution in connection with the recent switch to self-fertilization in C. rubella and show that cis-regulatory changes between C. rubella and its outcrossing close relative C. grandiflora are associated with differences in transposable element content. Second, we show that variation in positive and purifying selection is important for the distribution of cis-regulatory variation across the genome of C. grandiflora. Interestingly, the presence of polymorphic transposable elements is strongly associated with cis-regulatory variation in C. grandiflora. Third, we show that the tetraploid C. bursa-pastoris is of hybrid origin and investigate the contribution of both parental species to gene expression. We show that gene expression in the tetraploid is partly explained by cis-regulatory divergence between the parental species. Nonetheless, within C. bursa-pastoris there is a great deal of variation in homeolog expression. In summary, this thesis explores the role of cis-regulatory changes for adaptive morphological changes in connection to a shift in mating system, the role of cis-regulatory divergence between progenitor species for an allopolyploid as well as the impact of positive and purifying selection on cis-regulatory variation within a species.

Keywords: Capsella, Shepherd's Purse, cis-regulatory changes, allele-specific expression, mating system shift, floral evolution, polyploidy, positive selection, purifying selection, transposable elements, small RNA, methylation, transposable element silencing, distribution of fitness effects

Kim A. Steige, Department of Ecology and Genetics, Evolutionary Biology, Norbyvägen 18D, Uppsala University, SE-75236 Uppsala, Sweden.

© Kim A. Steige 2016

ISSN 1651-6214
ISBN 978-91-554-9442-1
urn:nbn:se:uu:diva-268953 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-268953)

“They say a little knowledge is a dangerous thing, but it's not one half so bad as a lot of ignorance.”
Terry Pratchett, Equal Rites

List of Papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals:

I. Steige KA, Reimegård J, Koenig D, Scofield DG, Slotte T. 2015. Cis-regulatory changes associated with a recent mating system shift and floral adaptation in Capsella. Molecular Biology and Evolution 32:2501-2514.

II. Steige KA*, Laenen B*, Reimegård J, Scofield DG, Slotte T. The impact of natural selection on the distribution of cis-regulatory variation across the genome of an outcrossing plant. Manuscript. *Equal contributions

III. Douglas GM*, Gos G*, Steige KA*, Salcedo A, Holm K, Josephs EB, Arunkumar R, Ågren JA, Hazzouri K, Wang W, Platts AE, Williamson RJ, Neuffer B, Lascoux M, Slotte T, Wright SI. 2015. Hybrid origins and the earliest stages of diploidization in the highly successful recent polyploid Capsella bursa-pastoris. Proceedings of the National Academy of Sciences 112:2806-2811. *Equal contributions

IV. Steige KA, Reimegård J, Rebernig CA, Köhler C, Scofield DG, Slotte T. The role of transposable elements for gene expression in Capsella hybrids and allopolyploids. Manuscript.

Reprints were made with permission from the respective publishers.

Additionally, I contributed to the following publications during my thesis work:

Slotte T, Hazzouri KM, Ågren JA, Koenig D, Maumus F, Guo Y, Steige K, Platts AE, Escobar JS, Newman LK, Wang W, Mandáková T, Vello E, Steffen J, Takuno S, Brandvain Y, Coop G, Andolfatto P, Hu TT, Blanchette M, Clark RM, Quesneville H, Nordborg M, Gaut BS, Lysak MA, Jenkins J, Grimwood J, Prochnick S, Shu S, Rokhsar D, Schmutz J, Weigel D, Wright SI. 2013. The Capsella rubella genome provides insights into the causes and consequences of mating system evolution. Nature Genetics 45:831–835.

Fischer I, Steige KA, Stephan W, Mboup M. 2013. Sequence evolution and expression regulation of stress-responsive genes in natural populations of wild tomato. PLoS ONE 8:e78182.

Contents

Introduction
  Importance of cis-regulatory changes for evolution
  Effect of transposable element insertions on gene expression
  The role of self-fertilization and polyploidization in plant speciation
  The model system Capsella

Materials and methods
  Illumina sequencing
  Analysis of cis-regulatory changes
  Inferring past demographic changes and selection based on the site frequency spectrum
  Combining information on polymorphism and divergence to assess positive and purifying selection

Research aims

Summary of the papers
  Paper I
  Paper II
  Paper III
  Paper IV

Svensk sammanfattning

Deutsche Zusammenfassung

Acknowledgements

References

Abbreviations

ASE      allele-specific expression
bp       base pairs
CRE      cis-regulatory element
DFE      distribution of negative fitness effects
DNA      deoxyribonucleic acid
nt       nucleotide
PCR      polymerase chain reaction
qPCR     quantitative polymerase chain reaction
QTL      quantitative trait locus
RdDM     RNA-directed DNA methylation
RNA      ribonucleic acid
selfing  self-fertilization
SFS      site frequency spectrum
SNP      single nucleotide polymorphism
TE       transposable element
WGD      whole genome duplication
α        proportion of nonsynonymous substitutions that are fixed by positive selection


Introduction

Importance of cis-regulatory changes for evolution

Since regulatory regions were discovered in the early 1960s (Jacob and Monod 1961), their role in adaptive evolution has been of major interest (Wray 2007, Carroll 2008, Stern and Orgogozo 2008, Wittkopp and Kalay 2012, Albert and Kruglyak 2015). King and Wilson (1975) first suggested that regulatory changes are important for phenotypic differences between humans and chimpanzees, as their proteins were so similar.

Modern hypotheses on the adaptive significance of regulatory changes often focus on changes in cis-regulatory elements (CREs), which are regulatory regions that are linked to a gene, such as promoters or enhancers. CREs are modular, which means that they can contain multiple different binding motifs of e.g. transcription factors (Figure 1). As transcription in eukaryotes is dependent on transcription factors, genetic changes in their binding motifs can alter gene expression (Hoopes 2008, Clancy 2008). In addition, DNA methylation in cis-regulatory regions as well as changes in local chromatin structure can also result in cis-regulatory changes (Phillips 2008).

As the CREs are modular, changes in CREs may result in quite specific changes in gene expression, for instance limited to a certain tissue, life stage or environmental condition. For this reason, it has been suggested that cis-regulatory mutations might have fewer deleterious pleiotropic effects than amino acid mutations, and potentially contribute more to adaptive evolution (Doebley and Lukens 1998, Carroll 2000, Wray 2007, Carroll 2008, Stern and Orgogozo 2008, but see Hoekstra and Coyne 2007).


Figure 1. Schematic overview of transcription and translation, leading from genomic DNA to proteins. Binding of transcription factors to CREs (yellow) is required for transcription. The introns in the RNA will be spliced out to generate mRNA. The mRNA then is translated to protein.

Several recent studies have found empirical evidence for cis-regulatory changes in association with phenotypic evolution. In Drosophila cis-regulatory changes have been found to be responsible for wing pigmentation (Prud'homme et al. 2007) and in sticklebacks for pigmentation (Miller et al. 2007) and tooth number (Cleves et al. 2014). In plant species cis-regulatory changes have been connected with changes in leaf morphology (Capsella: Sicard et al. 2014; tomato: Kimura et al. 2008), stigma length in tomato connected to selfing (Chen et al. 2007) and apical dominance in maize (Doebley et al. 1997, Studer et al. 2011).

Recently, evidence for positive selection on cis-regulatory divergence has started to accumulate. This stems mainly from studies that have contrasted cis-regulatory changes within vs. between species to show that there is an excess of cis-regulatory fixations between species (Wittkopp et al. 2008), or an excess of concordant cis- and trans-regulatory changes between species (Fraser et al. 2010). This is indicative of directional selection on cis-regulatory changes. Other studies have found patterns of positive selection on cis-regulatory changes (House et al. 2014) or a higher proportion of genes with evidence for positive selection among those with marked cis-regulatory divergence (Graze et al. 2012).

[Figure 1 labels: DNA (promoter region, TATA box, exons, intron); transcription; RNA; splicing; mRNA (poly-A tail); translation; protein]

Effect of transposable element insertions on gene expression

Transposable elements (TEs) were first discovered by Barbara McClintock in 1948 (McClintock 1948) and can make up a substantial part of genomes (e.g. 85% in maize; reviewed in Chénais et al. 2012). Transposons are usually repressed in both plants and animals, as there might be strong negative fitness effects if a TE transposes into or close to an important gene.

TEs are very common, but their content in genomes can differ even between closely related species (reviewed in Chénais et al. 2012), which makes their impact on gene expression very interesting to study. Indeed, Barbara McClintock already observed in maize in 1956 (McClintock 1956) an effect of TE insertions on the expression of genes located either nearby or at the insertion site. More recently, other studies have also found that TEs might affect the expression of nearby genes (Lippman et al. 2004, Hollister and Gaut 2009, Hollister et al. 2011). In Arabidopsis the RNA-directed DNA methylation (RdDM) pathway, which involves targeting of TEs by 24-nt small RNAs, is of particular interest for the decrease in expression of genes near TEs (Lippman et al. 2004, Hollister and Gaut 2009, Hollister et al. 2011).

The role of self-fertilization and polyploidization in plant speciation

Mating system shifts and polyploidy are important contributors to plant speciation. In angiosperms, it has been estimated that up to 15% of speciation events involve a change in ploidy (Wood et al. 2009) and in the Solanaceae family up to 15% of speciation processes involve shifts from outcrossing to self-fertilization (selfing; Goldberg and Igić 2012). These major transitions have marked evolutionary consequences and can be associated with major changes in morphology as well as genetic and gene expression variation. They therefore offer an opportunity to study cis-regulatory evolution in association with plant speciation.

In association with the shift to selfing, parallel changes in floral and reproductive traits have evolved in many different lineages. Selfers generally exhibit a so-called ‘selfing syndrome’ (Ornduff 1969), which is characterized by a reduction of the size of petals and sepals, a decrease in pollen number, a decrease in volatile production for scent and also a reduction of the distance between anther and stigma (reviewed in Sicard and Lenhard 2011). The occurrence of parallel changes in morphology in many independent lineages in response to the shift to selfing strongly suggests that the changes are adaptive. But even though the selfing syndrome has been studied in several species (e.g. Capsella: Slotte et al. 2012, Sicard et al. 2011; Arabis: Tedder et al. 2015; Ipomoea: Duncan and Rausher 2013; Leptosiphon: Goodwillie et al. 2006; Mimulus: Fishman et al. 2002) and genomic regions for these morphological changes have been identified in some species (e.g. Capsella: Slotte et al. 2012, Sicard et al. 2011; Leptosiphon: Goodwillie et al. 2006; Mimulus: Fishman et al. 2002), the underlying genetic changes are not well understood. The selfing syndrome therefore offers a great opportunity to investigate the role of cis-regulatory changes for morphological adaptation.

Another major mechanism of plant speciation is polyploidization (also termed whole genome duplication; WGD). Polyploid species can occur through polyploidization within a species (autopolyploid) or through hybridization between species associated with polyploidization (allopolyploid; Figure 2; Ramsey and Schemske 1998). In an allopolyploid species, the chromosome copies derived from the two progenitor species are called homeologs. A change in ploidy presumably leads to almost instant speciation, as offspring from backcrosses with the progenitor species are less likely to be fertile (Burton and Husband 2000).

A change in ploidy might have multiple drastic consequences on both the genome (‘genomic shock’) and the transcriptome (‘transcriptomic shock’; McClintock 1984). Major chromosomal changes have occurred following allopolyploid speciation in Tragopogon (Soltis and Soltis 2009) and an ongoing loss of genes has been found in maize (Schnable et al. 2011).


Figure 2. Simplified modes of hybridization and polyploidization. Two diploid progenitor species (A and B) can hybridize and form a diploid hybrid (left) or a tetraploid (allopolyploid; middle) that contains homeologous chromosomes from each of the progenitor species. When a WGD occurs within a species, an autopolyploid is formed (right).

After WGD, loss of redundant genes can be biased towards one of the homeologous chromosome copies in an allopolyploid. This is called ‘biased fractionation’ (Langham et al. 2004) and has been suggested to be quite common after WGD, especially in allopolyploids (Garsmeur et al. 2014). In most cases, genes on the more fractionated subgenome have been found to show lower expression than genes on the less fractionated subgenome (Schnable et al. 2011). It has been hypothesized that differences in selection due to expression level differences between the two homeologs play a role in fractionation (Freeling et al. 2012), and that the fitness cost of accumulating more major deleterious mutations might be lower for lowly expressed genes than for highly expressed genes.

One possible mechanism to explain expression biases between homeologs is the presence of TEs. As it is known that the presence of TEs can affect the expression of surrounding genes (Lippman et al. 2004), differences in TE content between homeologs might lead to expression dominance of the homeologous subgenome with fewer TEs. While it has been shown in multiple systems that TE content can affect expression bias between homeologs (Woodhouse et al. 2014, Pophaly and Tellier 2015), there are exceptions to this pattern (Renny-Byfield et al. 2015) and other mechanisms such as differences in TE silencing or TE family composition between species might also play an important role.

Allopolyploid species are therefore very interesting to study regarding cis-regulatory differences between the two homeologs. Are differences we see due to different contributions of the progenitor species, e.g. differences in regulatory elements or different TE accumulation between the progenitor species? Or is there a stronger impact of transcriptomic shock, e.g. changes due to hybridization or polyploidization themselves?

The model system Capsella

The genus Capsella (Shepherd’s Purse) is a promising model system to investigate the role of cis-regulatory changes within and between species, as well as the influence of differences in mating system and ploidy. There are four recognized species (Chater et al. 1993): the diploid outcrosser C. grandiflora, the two diploid selfers C. rubella and C. orientalis, as well as the tetraploid selfer C. bursa-pastoris (Figure 3). The outcrosser occurs mainly in Greece and parts of Italy, whereas the selfing species have a much broader distribution: C. rubella around the Mediterranean and parts of central Europe, C. orientalis in Central Asia, while the highly successful tetraploid C. bursa-pastoris is found worldwide (Hurka and Neuffer 1997).


Figure 3. Inflorescences of the four recognized Capsella species. The diploid outcrosser C. grandiflora (A), the diploid selfer C. rubella (B), the diploid selfer C. orientalis (C) and the tetraploid selfer C. bursa-pastoris (D).

Genomic studies in this system are greatly facilitated by the published reference genome of C. rubella (Slotte et al. 2013), its relatively small genome size, the close relationship with the model system Arabidopsis thaliana and the possibility to cross the different species with each other. All the selfing species in this system exhibit the typical floral traits of a selfing plant (Chater et al. 1993, Sicard et al. 2011, Slotte et al. 2012). In C. rubella the shift to selfing from the outcrossing ancestor C. grandiflora has occurred relatively recently (most likely < 200 kya; Foxe et al. 2009, Guo et al. 2009, Slotte et al. 2013). Five main genomic regions harbor quantitative trait loci (QTLs) for floral and reproductive divergence between C. rubella and C. grandiflora (Slotte et al. 2012). In addition to morphological differences and differences in mating system, the four Capsella species also differ with respect to genome size (Hurka et al. 2012) and TE content (Ågren et al. 2014). The outcrosser C. grandiflora has the highest TE content, as well as more TEs close to protein coding genes than the two selfing diploid species (Ågren et al. 2014). Of the two diploid selfers, C. orientalis contains considerably fewer TEs than C. rubella (Ågren et al. 2014). This system is therefore a very good model to investigate the cis-regulatory consequences of mating system shifts and polyploidy and assess TEs as a possible cause for expression divergence.


Materials and methods

Illumina sequencing

But how are we able to assess genome-wide regulatory changes? Three main methods are currently used to generate expression data: quantitative polymerase chain reaction (qPCR), microarray hybridization and transcriptome sequencing using massively parallel sequencing techniques.

For qPCR the sequences of the genes need to be known to design specific primers and probes to assess gene expression. Additionally, while this technique is quite sensitive to expression differences, analyzing a whole transcriptome, i.e. thousands of genes, would not be feasible. Using microarrays, it is possible to analyze all genes in an organism for which reliable probes can be designed, but different sequences in an array might differ in their specificity (Git et al. 2010), and for a non-model species microarrays might not yet be available and would have to be designed. Massively parallel sequencing, on the other hand, does not rely on the availability of a genome to generate sequencing data, even though a reference genome greatly facilitates data processing. But there are other challenges to consider, e.g. complicated bioinformatic pipelines and higher error rates than other sequencing methods (Nielsen et al. 2011), which include both sequencing errors and alignment errors.

We have used Illumina sequencing to generate whole transcriptome and genome sequencing data. Illumina sequencing is one of the methods of massively parallel sequencing. Using this method there are three main steps to generate the data (Bennett 2002, Bentley et al. 2008). First is the so-called library preparation. DNA or cDNA (for transcriptomes) is first fragmented into small pieces and adapters are ligated onto the fragments. These fragments with the adapters are then PCR amplified and purified to remove any remaining adapters that might affect the next steps. The second step is the cluster generation. In this step, the generated library is loaded onto the flow cell. On the surface of the flow cell, oligonucleotides complementary to the adapters of the library bind to the different fragments in the library. These fragments are then amplified, and at the end there should be clusters of different fragments on the flow cell. The third step is the sequencing. Single bases that are incorporated into the DNA strand are identified for each of the clusters. Once sequencing is done, short reads of e.g. ~100 bp are retained. Read length depends on the specific machine, chemistry and settings that are used.

Short reads lead to new challenges for data analysis. If a reference genome with good annotation is available, analysis is greatly facilitated. As it is not feasible to analyze massively parallel sequencing data by hand, we now depend on bioinformatics pipelines for processing and analyses.

Analysis of cis-regulatory changes

We use allele-specific expression (ASE) to directly identify genes with cis-regulatory changes. ASE is defined as unequal expression of the two alleles within an individual. When ASE is observed, it could be due to genetic or methylation changes in regulatory regions, presence of transposable elements, parental imprinting or differences in the chromatin structure between the two alleles. ASE should reflect only the contribution of cis-regulatory changes and not trans-effects because all trans-acting factors are shared within an individual (Figure 4; Fraser 2011).

There are of course caveats to this kind of study. This method relies on transcribed single nucleotide polymorphisms (SNPs) to distinguish the two alleles, so genes without such SNPs cannot be assessed. Additionally, read mapping biases can easily lead to a false signal of ASE. Therefore, it is important to decrease read mapping bias as much as possible. One way to do this is to mask known SNPs in the reference that is used for mapping. But as Degner et al. (2009) have shown, 5-10% of the analyzed SNPs still show mapping bias when a masked reference genome is used, and this can have a major effect on the inference of ASE. For this reason, it is preferable to use specific parental haplotypes based on the analyzed individual. This should significantly reduce mapping bias, and allow us to analyze patterns of ASE as a true biological signal.
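The masking step mentioned above can be illustrated with a minimal sketch. It assumes the reference is held as a plain string and SNP positions are 0-based; the function name `mask_snps` and the toy sequence are hypothetical and not part of any pipeline described in this thesis. Real pipelines would operate on FASTA and VCF files with dedicated tools.

```python
def mask_snps(reference, snp_positions, mask_char="N"):
    """Return a copy of `reference` with known SNP positions replaced by `mask_char`.

    `snp_positions` are assumed to be 0-based indices into the sequence.
    Masking means that neither allele matches the reference at that site,
    so reads carrying either allele are penalized equally during mapping.
    """
    seq = list(reference)
    for pos in snp_positions:
        if not 0 <= pos < len(seq):
            raise IndexError(f"SNP position {pos} is outside the sequence")
        seq[pos] = mask_char
    return "".join(seq)


# Toy example (hypothetical sequence and positions)
masked = mask_snps("ACGTACGT", [1, 5])
print(masked)
```

As noted in the text, masking alone does not remove all mapping bias, which is why individual-specific parental haplotypes are preferred.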


Figure 4. Allele-specific expression (ASE) within a diploid hybrid. There might be multiple different genomic or epigenetic changes that lead to a biased expression of the two alleles. CREs might differ between the two alleles, which in turn might lead to the binding of different transcription factors and a pattern of ASE (A). Another cause for ASE might be a transposable element insertion in one allele that is methylated and might affect surrounding genes (B).

To assess ASE, we used a hierarchical Bayesian method developed by Skelly et al. (2011), which incorporates data replicates. This method has a lower false positive rate compared to for example a simple binomial test, as technical variation is modeled more realistically by using genomic data, which has no true ASE. This is done by using the genomic read counts for coding SNPs to estimate parameters of a beta-binomial distribution of variation in allelic biases due to technical variation. These estimates are then used in the analysis of the RNA data. Ultimately, this results in estimates of the proportion of genes showing ASE and the overall variation in ASE along genes, as well as gene-specific estimates of the posterior probability of ASE and the degree of ASE.
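To illustrate why a simple binomial test can produce more false positives than a model with overdispersion, the following sketch compares an exact binomial test with a beta-binomial test on the same read counts. This is not the hierarchical Bayesian method of Skelly et al. (2011); it is only a toy comparison. The read counts and the beta-binomial parameters `a` and `b` are hypothetical; in the approach described above, the technical-variation parameters would be estimated from genomic read counts.

```python
from math import comb, exp, lgamma, log


def _two_sided_p(pmf, observed_k):
    """Sum the probabilities of all outcomes no more likely than the observed one."""
    threshold = pmf[observed_k] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(p for p in pmf if p <= threshold))


def binomial_two_sided_p(ref_reads, alt_reads, p=0.5):
    """Exact two-sided binomial test for allelic imbalance."""
    n = ref_reads + alt_reads
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    return _two_sided_p(pmf, ref_reads)


def _log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(k, n, a, b):
    """Probability of k reference reads out of n under a beta-binomial(a, b)."""
    return exp(log(comb(n, k)) + _log_beta(k + a, n - k + b) - _log_beta(a, b))


def beta_binomial_two_sided_p(ref_reads, alt_reads, a, b):
    """Two-sided test that allows extra (technical) variation in allelic ratios."""
    n = ref_reads + alt_reads
    pmf = [beta_binomial_pmf(k, n, a, b) for k in range(n + 1)]
    return _two_sided_p(pmf, ref_reads)


# Hypothetical counts: 70 reads from one allele, 30 from the other
print(binomial_two_sided_p(70, 30))
print(beta_binomial_two_sided_p(70, 30, a=20.0, b=20.0))
```

With the same counts, the beta-binomial test gives a larger p-value because part of the imbalance is attributed to technical variation rather than to true ASE.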

Inferring past demographic changes and selection based on the site frequency spectrum

Mutation introduces new variation into the genome. Either through selection or genetic drift, these new mutations can change in frequency in a population. In smaller populations, the impact of drift on allele frequencies is generally greater than in large populations. To be able to assess what evolutionary forces are acting on populations, we usually compare patterns of polymorphism with expectations under neutrality.

One way to summarize patterns of polymorphism in a population is a site frequency spectrum (SFS). To generate an SFS, the number of mutations found at each frequency in a sample of n alleles is counted (Figure 5). Demographic changes in populations over time affect the shape of the SFS. Population expansion for example leads to an increase of low frequency polymorphisms compared to the neutral expectation. Population admixture can lead to a general pattern of an excess of intermediate frequency polymorphisms. If the ancestral state cannot be reliably inferred, it is better to use a folded SFS, which means mutations in the frequencies 1 and n-1, 2 and n-2, … are combined. The SFS can also be specified for multiple populations, as a joint SFS, or multidimensional SFS if there are more than two populations or species. A multidimensional SFS contains additional information on the number of shared, fixed, and unique variants in each population or species.
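A minimal sketch of how an unfolded and a folded SFS can be tabulated is given below. It assumes that the input is a list of derived-allele counts per site in a sample of n alleles; the function names and the example counts are hypothetical.

```python
def unfolded_sfs(derived_counts, n):
    """Tabulate the unfolded SFS for a sample of n alleles.

    Entry i-1 of the result is the number of sites at which the derived
    allele is present in exactly i of the n sampled alleles (1 <= i <= n-1).
    Monomorphic sites (count 0 or n) are skipped.
    """
    sfs = [0] * (n - 1)
    for count in derived_counts:
        if 0 < count < n:
            sfs[count - 1] += 1
    return sfs


def fold_sfs(sfs):
    """Fold an unfolded SFS by combining classes i and n-i (minor-allele counts)."""
    n = len(sfs) + 1
    folded = []
    for i in range(1, n // 2 + 1):
        j = n - i
        folded.append(sfs[i - 1] if i == j else sfs[i - 1] + sfs[j - 1])
    return folded


def neutral_expectation(n, theta=1.0):
    """Expected unfolded SFS under the standard neutral model: theta / i."""
    return [theta / i for i in range(1, n)]


# Hypothetical derived-allele counts at ten sites in a sample of six alleles
counts = [1, 1, 1, 2, 5, 3, 0, 6, 2, 1]
sfs = unfolded_sfs(counts, n=6)
print(sfs)            # [4, 2, 1, 0, 1]
print(fold_sfs(sfs))  # [5, 2, 1]
```

Comparing the observed spectrum with `neutral_expectation` (scaled to the same number of sites) shows whether low, intermediate or high frequency classes are over-represented.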


Figure 5. Site frequency spectrum. Neutral expectation is shown in grey, an excess of high frequency variants is shown in green and an excess of low frequency variants is shown in blue.

Several programs use information from the SFS to infer past demographic events. One of those is δaδi (Gutenkunst et al. 2009), which is based on the diffusion approximation and uses a composite likelihood function to estimate demographic parameters from the SFS for up to three populations. Another program called fastsimcoal2 (Excoffier et al. 2013) relies on coalescent simulations and composite likelihood to obtain estimates of demographic parameters from the SFS. Both of these methods can model different demographic scenarios and infer parameters such as the effective population size Ne, possible past population size changes such as population expansion or bottlenecks, timing of speciation events or possible gene flow between species.

Not only demographic events but also selection can affect the shape of the SFS. For instance, weak purifying selection is expected to lead to a shift in the SFS toward a higher proportion of low frequency variants, whereas balancing selection is expected to lead to an excess of intermediate frequency variants. This information is captured in classical neutrality tests such as Tajima's D (Tajima 1989) and is also used in methods to assess the distribution of negative fitness effects (DFE) of new mutations (see below).
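As an illustration of how a neutrality statistic summarizes the shape of the SFS, the sketch below computes Tajima's D (Tajima 1989) from an unfolded SFS using the standard formulas for the variance constants. The function name and the example spectra are hypothetical; negative values indicate an excess of low frequency variants and positive values an excess of intermediate frequency variants.

```python
from math import sqrt


def tajimas_d(sfs):
    """Tajima's D from an unfolded SFS.

    sfs[i-1] is the number of sites where the derived allele occurs i times
    in a sample of n = len(sfs) + 1 alleles.
    """
    n = len(sfs) + 1
    seg_sites = sum(sfs)
    if n < 4 or seg_sites == 0:
        # the variance term is zero for n < 4, so D is undefined
        raise ValueError("need at least 4 alleles and 1 segregating site")

    # Mean number of pairwise differences (pi)
    pairs = n * (n - 1) / 2
    pi = sum(x * i * (n - i) for i, x in enumerate(sfs, start=1)) / pairs

    # Constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Hypothetical spectra for a sample of 10 alleles
print(tajimas_d([20, 2, 1, 0, 0, 0, 0, 0, 0]))   # excess of rare variants
print(tajimas_d([0, 0, 0, 5, 10, 5, 0, 0, 0]))   # excess of intermediate variants
```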

[Figure 5 axes: x-axis, counts (1 to 19); y-axis, proportion (0.00 to 0.30)]

Combining information on polymorphism and divergence to assess positive and purifying selection

A classic way to assess positive selection is the McDonald-Kreitman test (McDonald and Kreitman 1991). In their publication from 1991 the authors analyzed whether the Adh locus of Drosophila was under selection by testing whether the ratio of nonsynonymous (replacement) to synonymous substitutions that are fixed between the species significantly differed from the ratio of replacement to synonymous polymorphisms within a species. They found that the proportion of replacement changes was significantly greater among fixed substitutions (29%) than among polymorphisms (5%) and argued that this was likely caused by positive selection. Their reasoning was that a variant that is fixed by selection will be polymorphic for a shorter time than a variant that is fixed by random drift.

The proportion of nonsynonymous substitutions that are fixed by positive selection (α) can be estimated by an extension of the McDonald-Kreitman test (Fay et al. 2001, Smith and Eyre-Walker 2002). However, in the presence of weakly deleterious mutations, α will be underestimated (McDonald and Kreitman 1991, Eyre-Walker 2002, Eyre-Walker and Keightley 2009), as weakly negative selected mutations will contribute more to polymorphism than divergence. Accounting for weak purifying selection when estimating α is therefore important, and this can be done for instance by first inferring the distribution of negative fitness effects (DFE; Keightley and Eyre-Walker 2007). Here, we do this by using a method called DFE-α (Eyre-Walker and Keightley 2009).
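The logic of the McDonald-Kreitman test and the simple estimator of α can be sketched as follows. The counts used here are the Adh counts as commonly cited from McDonald and Kreitman (1991); they reproduce the 29% and 5% mentioned above. The test is implemented as a Fisher's exact test for convenience, which is an assumption of this sketch and not necessarily the test used in the original publication. The estimator α = 1 - (Ds·Pn)/(Dn·Ps) is the simple form without any correction for weakly deleterious mutations, so it is not the DFE-α method used in this thesis.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    threshold = prob(a) * (1 + 1e-9)  # tolerance for float ties
    return min(1.0, sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= threshold))


def alpha_mk(dn, ds, pn, ps):
    """Simple McDonald-Kreitman based estimate of alpha: 1 - (Ds * Pn) / (Dn * Ps)."""
    return 1.0 - (ds * pn) / (dn * ps)


# Adh counts as commonly cited from McDonald and Kreitman (1991)
dn, ds = 7, 17   # fixed differences: replacement, synonymous
pn, ps = 2, 42   # polymorphisms: replacement, synonymous

print(f"replacement among fixed:       {dn / (dn + ds):.2f}")
print(f"replacement among polymorphic: {pn / (pn + ps):.2f}")
print(f"Fisher's exact p-value:        {fisher_exact_two_sided(dn, ds, pn, ps):.4f}")
print(f"alpha (uncorrected):           {alpha_mk(dn, ds, pn, ps):.3f}")
```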

The DFE describes the approximate distribution of the strength of purifying selection on new mutations, expressed as the product of the effective population size and the selection coefficient (Ne*s). If this product is much larger than 1, then selection is efficient, but if it is less than 1 genetic drift determines if the mutation becomes fixed or lost. In DFE-α, the DFE is estimated using a gamma distribution based on information from the SFS of putatively neutrally evolving and selected variants (e.g. SFS for synonymous and nonsynonymous sites). The method assumes that the mutations are not in linkage disequilibrium (Eyre-Walker and Keightley 2007). One can account for recent demographic changes using information from the neutral SFS and a simple two-step population size change model.
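A gamma-distributed DFE is often summarized as the proportion of new mutations falling into ranges of Ne*s. The sketch below does this by simulation, under the assumption that the DFE is fully described by a gamma shape parameter and a mean Ne*s. The shape and mean used in the example are hypothetical values chosen for illustration and are not estimates from this thesis; the bin edges (1, 10, 100) are a common but arbitrary choice.

```python
import random


def nes_bin_proportions(shape, mean_nes, draws=50_000, seed=1):
    """Proportion of new mutations with Ne*s in [0, 1), [1, 10), [10, 100), [100, inf).

    Ne*s values are drawn from a gamma distribution with the given shape
    and mean (scale = mean / shape). Values below 1 are effectively neutral.
    """
    rng = random.Random(seed)
    scale = mean_nes / shape
    edges = [1.0, 10.0, 100.0]
    counts = [0, 0, 0, 0]
    for _ in range(draws):
        strength = rng.gammavariate(shape, scale)
        # index of the bin = number of edges the value has reached
        counts[sum(strength >= edge for edge in edges)] += 1
    return [c / draws for c in counts]


# Hypothetical L-shaped DFE: small shape parameter, large mean
for label, prop in zip(["0-1", "1-10", "10-100", ">100"],
                       nes_bin_proportions(shape=0.3, mean_nes=1000.0)):
    print(f"Ne*s {label}: {prop:.3f}")
```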

To decrease the impact of slightly deleterious mutations on estimates of α, some studies have removed low frequency variants (e.g. Fay et al. 2001, Charlesworth and Eyre-Walker 2006), but it has been suggested that the resulting estimates are still biased unless there is a high level of adaptive evolution or a strongly L-shaped DFE (Charlesworth and Eyre-Walker 2008). Eyre-Walker and Keightley (2009) therefore first estimate the DFE from the SFS and then use this estimate to infer the number of slightly deleterious and neutral substitutions expected in the selected site class. If the number of observed substitutions in that site class is higher than expected, the difference is inferred to be due to substitutions that were fixed by positive selection.


Research aims

In this thesis, I investigate the impact of selection on cis-regulatory variation, as well as the role of mating system shifts and polyploidy for cis-regulatory changes in the crucifer genus Capsella. The specific aims are to:

I. Identify cis-regulatory changes in association with the recent shift to selfing in C. rubella and test for a contribution of cis-regulatory changes to the selfing syndrome of C. rubella (paper I).

II. Test whether differences in transposable element content are associated with cis-regulatory variation and divergence in Capsella (papers I, II, IV).

III. Quantify the impact of positive and purifying selection on genes harboring standing cis-regulatory variation in the outcrosser C. grandiflora (paper II).

IV. Investigate the origin of the tetraploid C. bursa-pastoris and the genomic consequences of polyploid speciation (paper III).

V. Test whether cis-regulatory divergence among homeologous genes in C. bursa-pastoris is predictable based on cis-regulatory divergence between the diploid parental species (papers III, IV).


Summary of the papers

Paper I

Paper I gives insight into the role of regulatory changes for phenotypic divergence in wild plant species and identifies one possible explanation for regulatory divergence. We examined the selfing species C. rubella and the outcrosser C. grandiflora. C. rubella diverged from C. grandiflora less than 200 kya (Slotte et al. 2013) and already shows the typical floral traits of a selfing species. Previous studies have identified QTL responsible for this phenotypic shift (Sicard et al. 2011, Slotte et al. 2012). We have found evidence that suggests that: 1) cis-regulatory changes between two closely related plant species, the selfer C. rubella and the outcrosser C. grandiflora, might have played a role for phenotypic adaptation and 2) differences in the accumulation of TEs might be important for cis-regulatory divergence.

To assess cis-regulatory changes by quantifying ASE, we generated whole transcriptome sequencing data of flower buds and leaves of three interspecific C. grandiflora x C. rubella F1 hybrids. Additionally, we conducted whole genome sequencing of the F1s and their C. rubella parents. Assessing ASE within an F1, instead of studying differential expression between C. rubella and C. grandiflora, allows us to directly identify genes with cis-regulatory changes, as trans changes should affect both alleles equally. To assess allelic biases, we mapped our data to parental haplotypes to reduce mapping bias and used a method developed by Skelly et al. (2011) to estimate ASE (see Methods).
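The idea that equal expression of both alleles in an F1 is the null expectation can be illustrated with a much simpler stand-in for the hierarchical Bayesian method of Skelly et al. (2011): an exact two-sided binomial test of the allele-specific read counts of one gene against a 50:50 ratio. This sketch is an assumption-laden simplification, as it ignores overdispersion and technical variation, and the function name and read counts are hypothetical.

```python
from math import comb

def ase_binomial_test(reads_a, reads_b):
    """Exact two-sided binomial test of allele-specific read counts against 50:50.

    reads_a, reads_b: reads assigned to each parental allele of one gene
    (for example the C. rubella and the C. grandiflora allele in an F1).
    """
    n = reads_a + reads_b
    observed = comb(n, reads_a)
    # Under p = 0.5 every outcome k has probability comb(n, k) / 2**n, so we
    # sum all outcomes that are at least as unlikely as the observed split.
    extreme = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return extreme / 2 ** n

# Hypothetical read counts for two genes.
print(ase_binomial_test(50, 50))   # balanced expression of both alleles
print(ase_binomial_test(90, 10))   # strongly biased toward the first allele
```

In a real analysis many genes are tested at once, so p-values from a test like this would additionally require correction for multiple testing.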

We estimated that on average 44% of the genes show ASE, but only 6% showed strong allelic biases in flower buds and in leaves. Previously identified narrow QTL regions for floral and reproductive traits were enriched for genes showing ASE in flower buds. This does not seem to be caused by higher heterozygosity facilitating both detection of ASE and QTL in these regions, as leaves do not show this excess. We also identified 19 candidate genes that show ASE in flower buds and are located in the narrow QTL regions. These genes are promising candidates for further work into establishing the genetic basis of the selfing syndrome. The gene JAGGED, which is involved in determining petal growth and shape by promoting petal cell proliferation in A. thaliana (Sauret-Güeto et al. 2013, Schiessl et al. 2014), is of particular interest. Our results suggest that cis-regulatory differences between C. rubella and C. grandiflora result in lower expression of JAGGED in C. rubella than in C. grandiflora. This might be important for the selfing syndrome, as it has been shown that C. rubella has smaller petals due to a shorter period of cell proliferation (Sicard et al. 2011), consistent with expectations given the expression of JAGGED in these species.

Across all analyzed genes, we detected a shift towards higher expression of the C. rubella allele. A previous study had shown that C. rubella harbors a lower number of TEs near genes than C. grandiflora (Ågren et al. 2014), suggesting expression changes associated with TE silencing as one explanation for this pattern. Indeed we detected both an excess of heterozygous TE insertions in the vicinity of genes with ASE and a decrease of expression on the haplotype carrying the TE insertion. This decrease was even stronger for TE insertions targeted by uniquely mapping 24-nt small RNAs, suggestive of increased methylation and subsequent suppression of local expression.

Taken together, our results suggest that cis-regulatory changes have been important during recent adaptive floral evolution in Capsella, and differences in TE dynamics between selfing and outcrossing species could be an important mechanism underlying rapid cis-regulatory divergence.

Paper II

The importance of cis-regulatory changes for adaptation between species has been assessed in a variety of species (reviewed in Wittkopp and Kalay 2012). However, we know less about the factors that drive variation within species. Previous studies have shown that cis-regulatory variation is more common in regions with higher levels of polymorphism (Zhang et al. 2011, Lowry et al. 2013, Rockman et al. 2010). A recent study showed that in Caenorhabditis elegans, these elevated levels of polymorphism are likely due to a difference in background selection (Rockman et al. 2010). Genes that show cis-regulatory variation are mainly located in chromosome arms with increased rates of recombination, which makes background selection less efficient. In other selfers similar patterns might be expected, but patterns of variation in outcrossers such as C. grandiflora should be less affected by background selection (Slotte 2014).

In paper II we assessed the distribution of cis-regulatory variation in the outcrosser C. grandiflora and assessed which selective forces are acting on genes that show cis-regulatory variation. We also investigated whether genomic correlates (such as genomic location, recombination rate, gene density or TEs) differ between genes that show significant ASE vs. those that do not. C. grandiflora is an interesting system for studying the impact of selection on genes that show cis-regulatory variation, as it has a large effective population size (Slotte et al. 2013), low population structure (St Onge et al. 2011) as well as high levels of polymorphism. It has previously been shown that selection is very efficient on both protein coding genes (Slotte et al. 2010) and regulatory regions (Williamson et al. 2014) in C. grandiflora.

To assess cis-regulatory variation by ASE we conducted whole genome resequencing and generated transcriptome data from flower buds and leaves of three intraspecific F1s generated by crossing individuals from different C. grandiflora populations. In contrast to paper I, we did not map against parental haplotypes, but conducted read-backed phasing to generate phased fragments. Comparing these data to data from C. grandiflora x C. rubella F1s with known phase (paper I) that were treated in the same way showed that more than 95% of the SNPs were correctly phased.

On average 35% of the assessed genes showed ASE within the C. grandiflora F1s. Both flower buds and leaves showed a similar degree of expression bias. This is somewhat lower than the proportion of genes that showed ASE between C. grandiflora and C. rubella (paper I), but as this study assesses intraspecific variation, this is to be expected.

To assess patterns of polymorphism and selection on genes with vs. without ASE, we analyzed whole genome resequencing data from 32 individuals, including both a population and a range-wide sample. Genes that show ASE are within regions of higher levels of polymorphism than control genes that were also amenable to analysis of ASE. We also found that ASE genes have elevated ratios of nonsynonymous to synonymous polymorphisms, which indicates that these sets of genes might differ in levels of purifying selection.

We used the method DFE-α (Eyre-Walker and Keightley 2009) to assess the distribution of fitness effects and found that nonsynonymous sites of genes that show ASE generally are under relaxed purifying selection compared to control genes. This holds for different datasets (the population sample and the range-wide sample), when assuming either a constant population size or a stepwise change in population size, as well as after correcting for differences in expression level among ASE and control genes.

To assess the impact of positive selection on ASE genes and control genes, we estimated the rate of adaptive substitutions relative to neutral divergence. We found that ASE genes show a lower proportion of adaptive nonsynonymous substitutions than control genes. A different method by Messer and Petrov (2013) shows the same pattern.

As previously shown in paper I, we also found an association of ASE genes with the presence of heterozygous TE insertions. This association is slightly weaker than in paper I, but nonetheless significant. As factors other than TEs might affect ASE, we conducted a logistic regression with ASE as the response variable and multiple predictor variables. This showed that the presence of TEs increased the odds of observing ASE by ~40%. Other predictors such as tissue specificity and gene expression level were also significant, but did not increase the odds as much as TEs.
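To clarify what a ~40% increase in odds means, the short sketch below converts an odds ratio of 1.4 into the corresponding logistic regression coefficient and into a change in probability. The baseline probability of 0.35 is a hypothetical value used purely for illustration, and the function name is not taken from any analysis in paper II.

```python
from math import log

def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability obtained after multiplying the baseline odds by odds_ratio."""
    odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)

# An odds ratio of 1.4 corresponds to a logistic regression coefficient ln(1.4).
beta_te = log(1.4)

# Hypothetical baseline: 35% of genes without a nearby TE insertion show ASE.
p_without_te = 0.35
p_with_te = apply_odds_ratio(p_without_te, 1.4)
print(f"coefficient = {beta_te:.3f}")
print(f"P(ASE) without TE = {p_without_te:.3f}, with TE = {p_with_te:.3f}")
```

The example shows that a 40% increase in odds is not a 40% increase in probability; how much the probability changes depends on the baseline.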

The results of paper II show that cis-regulatory variation in C. grandiflora is pervasive and not connected to a specific chromosomal region. Genes that show cis-regulatory variation are under relaxed purifying selection and show lower levels of positive selection than the control genes. In general this suggests that variation in positive and purifying selection determines the distribution of cis-regulatory variation across the genome. Additionally the results of paper II suggest that TEs might be important for cis-regulatory variation in C. grandiflora.

Paper III

As mentioned in the introduction, WGD can have major genomic consequences. But to properly assess this, we need to know the specific mode of polyploid speciation: whether the polyploid is an autopolyploid, i.e. WGD within a species, or an allopolyploid, i.e. hybridization between species followed by WGD. In paper III we therefore conducted whole genome and transcriptome sequencing of the tetraploid C. bursa-pastoris, to assess the mode of speciation as well as genomic consequences of polyploidization. C. bursa-pastoris is a recently formed tetraploid, whose specific mode of speciation has been debated. Some studies claimed that C. bursa-pastoris was an autopolyploid of the outcrosser C. grandiflora (St Onge et al. 2012), whereas other studies claimed it to be an autopolyploid of the selfer C. orientalis (Hurka et al. 2012).

But these studies were either not based on all species of the genus Capsella (St Onge et al. 2012), or were only based on chloroplast DNA (Hurka et al. 2012). Based on alignments of assembled whole genome resequencing data of both C. bursa-pastoris homeologs and all the diploid species, we found strong support for a hybrid origin of this tetraploid from C. orientalis and the ancestor of C. grandiflora and C. rubella. Using divergence population genetic analyses of the multidimensional SFS obtained from resequencing data, we inferred that the origin of C. bursa-pastoris occurred relatively recently, within the last 100-300 ky. While other studies support patterns of biased fractionation (e.g. Woodhouse et al. 2010, Schnable et al. 2011), our study showed no evidence for major gene loss after polyploid speciation. Instead, we found that differences between the progenitor species have a major effect on variation between the homeologous chromosomes in C. bursa-pastoris. Most major effect mutations in the tetraploid that are fixed between the homeologs, which include stop codon gain, stop codon loss, start codon loss or splice site loss, are shared with one or both of the progenitor species. Interestingly, most of these shared major effect mutations are shared with the selfing progenitor C. orientalis. We additionally observed a decrease in the efficacy of selection genomewide due to a combination of demographic history, selfing and WGD. Lastly we investigated the effects of expression differences between the progenitor species and between the two homeologous genomes in C. bursa-pastoris. We found evidence that the expression of most homeologous genes within the tetraploid follows expression differences between the progenitor species.

To conclude, we show that the tetraploid C. bursa-pastoris has a hybrid origin from two diploid species of the genus Capsella. Our results also show that the ancestral legacies of the progenitor species of an allopolyploid can greatly affect genome evolution as well as gene expression variation.

Paper IV

In paper IV we investigated the role of cis-regulatory differences between the progenitor species for homeolog-specific expression in the tetraploid C. bursa-pastoris in more detail than in paper III, focusing in particular on whether differences in TE content between the progenitor species lead to overall differences in gene expression. As both C. grandiflora and C. rubella harbor more TE insertions than C. orientalis (Ågren et al. 2014), and the progenitor species likely differed as well at the time of speciation, we would expect a shift in gene expression towards the C. orientalis-like homeolog in the tetraploid. Diploid hybrids between the progenitor species should also have an expression bias towards C. orientalis under this model.

To test this, we generated diploid F1 hybrids between C. orientalis and C. rubella to be able to compare ASE in the F1s with homeolog-specific expression in the tetraploid. We conducted deep transcriptome and genome sequencing from flower buds and leaves in four C. bursa-pastoris accessions and two interspecific F1s and mapped the reads against parental haplotypes to reduce mapping bias. We also retained only SNPs that we could reliably phase in the tetraploid. For SNPs that show a fixed difference between C. grandiflora and C. orientalis as well as a fixed difference between the homeologs in C. bursa-pastoris, we should be able to confidently assign origin. This way, we can assess whether there is a global shift in the expressed genes towards one of the homeologous genomes, and whether TEs play a role for this. We assessed expression bias as in paper I, using the hierarchical Bayesian method developed by Skelly et al. (2011).

We found that three of the four C. bursa-pastoris accessions show an expression bias in the expected direction towards the C. orientalis-like homeologous genome. The fourth accession shows a bias in the opposite direction, towards the C. grandiflora/C. rubella-like homeologous genome, as do the diploid F1s. Generally the association between TE presence and significant expression bias was rather weak and the effect of TEs on the expression bias very small, unlike in paper I and paper II. This shows that differences in the total TE content alone cannot explain the patterns that we observe. One possible explanation for this might be that there are differences in TE silencing between the progenitor species. While we did not find such patterns in paper I, where we assessed C. grandiflora x C. rubella F1s, C. orientalis and C. rubella are more diverged (~1-2 Mya; paper III) than C. grandiflora and C. rubella (<200 kya; Slotte et al. 2013) and could potentially show differences in TE silencing. In Arabidopsis, differences in TE silencing have been observed between species that differ in mating system; for example, the selfer A. thaliana shows stronger silencing than its outcrossing relative A. lyrata (He et al. 2012).

Additionally we found that genes that show cis-regulatory differences between C. orientalis and C. rubella in the F1 hybrids are in general also more likely to show an expression bias in the same direction in C. bursa-pastoris. This agrees with the results from paper III. Nonetheless, we find that there are multiple genes that do not follow this pattern, as is clear from the differences in the overall direction of expression bias.

To conclude, in paper IV we investigated the possible role of TEs on gene expression changes between the two homeologs of C. bursa-pastoris and the diploid F1s between C. orientalis and C. rubella. We find that there is only a rather weak association and that other factors likely play a role and should be investigated in more detail.


Swedish summary

Heritable changes in an organism's phenotype can arise through mutations in protein-coding parts of the genome or through mutations that affect gene expression. Changes in proteins are expressed whenever the gene is expressed. In contrast, changes in gene expression may be noticeable only at certain developmental stages, under specific environmental conditions or in a specific organ. For this reason, it has been assumed that changes in gene expression may have fewer negative side effects and could therefore contribute to the adaptation of organisms to a greater extent than changes in proteins. Although a growing number of studies have examined the importance of gene expression changes for adaptation, many questions remain unanswered.

In this thesis I have tried to clarify the importance of changes in gene expression within and between species of the plant genus Capsella (shepherd's purses) in the mustard family (Brassicaceae). This genus contains four species that differ in mating system. Three of the species are self-fertilizing, whereas the fourth is outcrossing and needs its ovules to be fertilized by pollen from other individuals. Self-fertilizing species often have much smaller flowers than their outcrossing relatives. There is also variation in ploidy level within the genus Capsella, i.e. the species have different numbers of chromosome sets. One of the species is tetraploid and has four chromosome sets in its genome, whereas the other species are diploid and have two chromosome sets. Capsella is therefore an interesting system for studying what happens to gene expression in connection with changes in mating system or ploidy level.

In paper I we studied changes in gene expression between the diploid outcrossing species C. grandiflora and the diploid self-fertilizing species C. rubella. These two species differ in floral morphology, and we have shown that the morphological changes are linked to changes in gene expression. The gene JAGGED is of particular interest, because the expression changes we identified in this gene could affect petal size. We also examined possible causes of gene expression changes between these closely related species and found indications that selfish genes, so-called transposons, affect gene expression. These selfish genes can be copied or move from one place to another in the genome. Because the spread of selfish genes in the genome can be harmful, most organisms have mechanisms that silence them and prevent them from spreading in the genome. The two species C. grandiflora and C. rubella differ in their content of selfish genes, and we have shown that gene expression changes between the species are at least partly an effect of the mechanisms that control transposons also preventing the expression of nearby genes. Our results suggest that changes in gene expression have been important for floral evolution in C. rubella and that differences in selfish genes between species are an interesting mechanism for altered gene expression between species in connection with changes in mating system.

In paper II we examined how selection affects genes with expression variation within the species C. grandiflora. We found that genes with expression variation were spread across the genome and not confined to particular regions. Genes with expression variation were under weaker selection pressure than genes without expression variation. Genes with expression variation were also more often closely linked to selfish genes than other genes were. Our results suggest that variation in the efficacy of natural selection plays a role in the distribution of gene expression variation within a species and that selfish genes are important for expression variation.

In paper III we examined the tetraploid self-fertilizing species C. bursa-pastoris. Tetraploids can arise through doubling of the genome of a diploid species or through genome doubling in connection with hybridization between different species. It has long been unclear how C. bursa-pastoris arose. We examined all four Capsella species and found that C. bursa-pastoris arose recently, about 200,000 years ago, through hybridization between the diploid self-fertilizing species C. orientalis and the ancestor of C. grandiflora and C. rubella. We found no signs of massive gene loss after the origin of C. bursa-pastoris, even though this is common in other polyploid species. Instead, we found that inherited genetic variation and gene expression variation from the two parental species can to a large extent explain patterns of variation in C. bursa-pastoris. We also found signs of reduced efficacy of selection in C. bursa-pastoris as a result of a combination of altered ploidy level, self-fertilization and demographic history. This study thus clarified the origin of C. bursa-pastoris and also demonstrated how differences between the parental species can affect recently formed polyploid species.

In paper IV we examined in more detail the importance of gene expression changes between the parental species for gene expression in C. bursa-pastoris. We found that while differences between the parental species can largely explain gene expression patterns in the tetraploid, there was also considerable variation among individuals. In contrast to the earlier papers, we found no strong signs that selfish genes affect gene expression in this tetraploid species.

In this thesis I have examined the importance of changes in gene expression within and between species. I have found indications that changes in gene expression can be important for adaptive morphological changes, that differences between the parental species of a tetraploid can affect gene expression patterns, and that variation in selection pressure among genes can affect within-species variation in gene expression.


German summary

Mutations in an organism that are inherited and can alter the phenotype may occur in different regions of the DNA. These mutations affect either coding regions, i.e. regions that are expressed and translated into proteins, or regulatory regions, changes in which can influence gene expression. One hypothesis for why regulatory regions in particular are important for adaptive evolution is that changes in coding regions should always affect the organism. Changes in regulatory regions, in contrast, should only have effects in a particular tissue, under particular environmental conditions or at a particular developmental stage. This results in fewer negative consequences for the organism. In recent years, a growing number of studies have examined the relevance of regulatory regions for adaptive evolution. Nevertheless, many questions remain unresolved.

The aim of my doctoral thesis was to resolve some of the open questions regarding the importance of regulatory divergence in connection with the transition from outcrossing to selfing or with the ploidy of plants. Ploidy refers to the number of sets of chromosomes in a cell. In addition, I was also interested in regulatory variation within a plant species. For these questions, the genus Capsella (shepherd's purse) from the crucifer family is a good model system. This genus contains four recognized species: the diploid outcrossing species C. grandiflora, the diploid selfing species C. rubella and C. orientalis, and the tetraploid selfing species C. bursa-pastoris. Capsella is therefore well suited for studying regulatory changes with respect to these two characteristics (ploidy and mating system).

The first chapter of my thesis deals with the role of regulatory divergence in the transition from outcrossing to selfing between C. grandiflora and C. rubella. Selfing plants have typical floral traits: petals and sepals are smaller and they produce less pollen. These morphological changes are adaptive, and we were able to show that they can be associated with regulatory changes. One gene in particular, JAGGED, probably plays an important role here. This gene shows regulatory divergence between the two species and is known to influence petal growth. In addition, the first chapter examined the importance of transposons, also called jumping genes, for regulatory divergence between these two species. Transposons are able to change their position in the genome or to make copies of themselves and are therefore usually epigenetically silenced in the genome. However, this can also affect genes in the immediate vicinity of the transposons and silence them together with the transposons. The genomes of the two species C. grandiflora and C. rubella differ in their abundance of transposons, and this affects differences in gene expression. In the first chapter I was thus able to show that regulatory divergence plays a role in adaptive morphological change associated with the transition from outcrossing to selfing, and that differences in gene expression between the species can be attributed to transposons.

The second chapter examined how positive and negative selection affect genes with intraspecific regulatory variation in C. grandiflora. To this end, we examined how genes with regulatory variation are distributed across the genome and which factors can influence regulatory variation. The species C. grandiflora lends itself to this because it has a relatively large population size and, in addition, little population structure. We showed that genes with regulatory variation are distributed across the genome and are less strongly conserved than genes without regulatory variation. In addition, genes with regulatory variation are positively correlated with the presence of transposons in their vicinity. We showed that the strength of selection influences the distribution of regulatory variation within species, and that transposons are important for regulatory variation.

The third chapter of the thesis examined the origin of the tetraploid species C. bursa-pastoris. Tetraploid species can arise through doubling of the genome of a diploid species (autopolyploidy) or through a cross between two diploid species followed by genome doubling (allopolyploidy). C. bursa-pastoris turned out to be an allopolyploid species that arose about 200,000 years ago. This occurred through a cross between the selfer C. orientalis and an outcrossing ancestor of the two species C. grandiflora and C. rubella. Unlike in other cases of polyploidy, there is no loss of a large number of genes in C. bursa-pastoris. However, many mutations with negative fitness effects are shared with the parental species. In addition, the efficacy of selection is lower in this species, which can be attributed to a combination of demography, selfing and genome duplication. Finally, we showed that differences in gene expression between the parental species are also found in C. bursa-pastoris. C. bursa-pastoris is thus a young allopolyploid species whose parental species have a large influence on its genome and gene expression.

In the fourth chapter, gene expression in C. bursa-pastoris was examined in somewhat more detail and related to regulatory divergence between the parental species. There is a high level of variation in gene expression in C. bursa-pastoris, and the expression patterns of the parental species and C. bursa-pastoris correspond relatively weakly. In contrast to the other chapters, transposons do not appear to play a role in regulatory variation or divergence.

In the course of my thesis I have demonstrated the importance of regulatory variation and divergence with respect to changes in mating system or ploidy. There are indications that regulatory divergence plays a role in adaptive morphological differences between species, and that polyploids can be strongly influenced by differences between their parental species. In addition, the strength of selection can influence the distribution of regulatory variation within a species. With my thesis I have contributed to answering the remaining open questions about the relevance of regulatory variation and divergence for evolution in plants.


Acknowledgements

These last four years of doing my PhD here in Uppsala have been a great experience!

First and foremost, I would like to thank Tanja for giving me the opportunity to do my PhD thesis here at Uppsala University (as well as partly at SU and SciLife Lab in Stockholm). It has been fun working together and I really appreciate your support and mentoring during the last years. I would also like to thank all the current and previous members of the Slotte lab for all their help and valuable discussions. Cindy and Julia, who have been a great help in the greenhouse and the lab, thank you both! I also really appreciate not only Veronika’s help with the bioinformatics, but also the fun times when we were all alone on gamma four at SciLife Lab. I would also like to thank Mike for nice discussions at work. Ben, it has been great working with you and I especially am thankful for the weekly R lessons. I learned a lot! Jörg and Andy, who have been fun to talk to and share an office with in Stockholm.

I would also like to say thank you to Doug. You have been a great second supervisor and had always time to discuss things with me if I had a problem or a question. I really appreciate that!

Our work would also not have been possible without Johan. Thank you for your bioinformatics work and setting up the pipelines for the parental haplotypes. And also for being patient with an absolute beginner in bioinformatics and teaching me a lot!

Many thanks also to our co-authors and collaborators. Stephen Wright (University of Toronto) and his group, who have been heavily involved with my third paper. Dan Koenig, thank you for being involved with my first paper. Claudia Köhler (SLU Uppsala) and her group for great collaborations. Especially Carolin, who has been showing me how to properly do crosses between selfing plants, thanks a lot!

Also a great thanks to my office mates in Uppsala, Linnéa, Nagarjun and Krystyna, who have been making it fun to work! And I also thank you because I could always bug you about really tricky bioinformatics questions… or not so tricky ones.

A great thanks also to the members of the knitting group (and to those who tried at least once to get some cake). Jenni, Nina, Berrit, Claudia, Alex, Rhiannon, Martin, Marta, Gwenna, Cosima and everyone else who was there from time to time. It was a lot of fun to spend evenings with you knitting, talking science and going out for dinner with you (as well as having cake from time to time). Jenni, I would especially like to thank you, because you have been a great support during the last years, and I was really happy that you returned from France to work some more in Uppsala.

Homa and Mi, it was really fun spending time with you and I miss having lunch with you guys, being in Stockholm so often. Also a big thank you to Ioana! I have missed having you around and seeing you so enthusiastic about lichens! I would also like to thank Rob and Severin, who started their PhDs around the same time I did. It was fun having you around, being in book clubs together and teaching students to hold a pipette! Christen, you have been a great roommate, and since you moved on to Stockholm University it has been really nice to still have Fika or lunch together! My new roommate Alex, thanks for being so patient with me, since I have not been the best roommate, being totally stressed out at the end of my PhD.

And all the people who made my Friday evenings awesome with whiskey, rum, beer, discussions about science and everything else and of course food! Aaron, Bart, Jesper, Jelmer, Jaime, Matthias, Agnès, Taki, TJ, Torsten, Ludo, and all the others who joined in, I thank you all for great times we had together! I would also like to thank Claire, Vera, Verena, Kerri, Ruxi, Arielle, Luciana, Matthias, Paulina, and Sergio. It was a lot of fun having you around!

Also a big thank you to the EBC Graduate School and everyone who has been involved with it! Having the seminar series and the opportunity to talk to all the speakers has been a great enrichment for doing a PhD in Uppsala. The yearly retreats have also always been a fun thing to do!

Many thanks also to my friends in Germany. I have missed you during the last years, but we have always stayed in contact. Katharina, Kata and Tina, we did not always manage to Skype with all four of us together, but we always had a lot of fun nonetheless! Bine and Pat, thank you as well for being there for me.

And of course I do not want to forget my family. Ralph and Pat, thank you for all the support over the last years. You were always there for me when I needed you and have always supported me. Thank you! Lisa and Sinah, thank you too for always having time and always giving me a reason to laugh. You always gave me stability in difficult times.

I hope I have not forgotten to mention anyone here. I really appreciate having all of you around and making these last years a great experience!


References

Albert FW, Kruglyak L. 2015. The role of regulatory variation in complex traits and disease. Nat Rev Genet 16:197–212.

Bennett S. 2002. Solexa Ltd. Pharmacogenomics 5:433-438.

Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown CG, Hall KP, Evers DJ, Barnes CL, Bignell HR, Boutell JM, Bryant J, Carter RJ, Keira Cheetham R, Cox AJ, Ellis DJ, Flatbush MR, Gormley NA, Humphray SJ, Irving LJ, Karbelashvili MS, Kirk SM, Li H, Liu X, Maisinger KS, Murray LJ, Obradovic B, Ost T, Parkinson ML, Pratt MR, Rasolonjatovo IM, Reed MT, Rigatti R, Rodighiero C, Ross MT, Sabot A, Sankar SV, Scally A, Schroth GP, Smith ME, Smith VP, Spiridou A, Torrance PE, Tzonev SS, Vermaas EH, Walter K, Wu X, Zhang L, Alam MD, Anastasi C, Aniebo IC, Bailey DM, Bancarz IR, Banerjee S, Barbour SG, Baybayan PA, Benoit VA, Benson KF, Bevis C, Black PJ, Boodhun A, Brennan JS, Bridgham JA, Brown RC, Brown AA, Buermann DH, Bundu AA, Burrows JC, Carter NP, Castillo N, Chiara E Catenazzi M, Chang S, Neil Cooley R, Crake NR, Dada OO, Diakoumakos KD, Dominguez-Fernandez B, Earnshaw DJ, Egbujor UC, Elmore DW, Etchin SS, Ewan MR, Fedurco M, Fraser LJ, Fuentes Fajardo KV, Scott Furey W, George D, Gietzen KJ, Goddard CP, Golda GS, Granieri PA, Green DE, Gustafson DL, Hansen NF, Harnish K, Haudenschild CD, Heyer NI, Hims MM, Ho JT, Horgan AM, Hoschler K, Hurwitz S, Ivanov DV, Johnson MQ, James T, Huw Jones TA, Kang GD, Kerelska TH, Kersey AD, Khrebtukova I, Kindwall AP, Kingsbury Z, Kokko-Gonzales PI, Kumar A, Laurent MA, Lawley CT, Lee SE, Lee X, Liao AK, Loch JA, Lok M, Luo S, Mammen RM, Martin JW, McCauley PG, McNitt P, Mehta P, Moon KW, Mullens JW, Newington T, Ning Z, Ling Ng B, Novo SM, O'Neill MJ, Osborne MA, Osnowski A, Ostadan O, Paraschos LL, Pickering L, Pike AC, Pike AC, Chris Pinkard D, Pliskin DP, Podhasky J, Quijano VJ, Raczy C, Rae VH, Rawlings SR, Chiva Rodriguez A, Roe PM, Rogers J, Rogert Bacigalupo MC, Romanov N, Romieu A, Roth RK, Rourke NJ, Ruediger ST, Rusman E, Sanches-Kuiper RM, Schenker MR, Seoane JM, Shaw RJ, Shiver MK, Short SW, Sizto NL, Sluis JP, Smith MA, Ernest Sohna Sohna J, Spence EJ, Stevens K, Sutton N, Szajkowski L, Tregidgo CL, Turcatti G, Vandevondele S, Verhovsky Y, Virk SM, Wakelin S, Walcott GC, Wang J, Worsley GJ, Yan J, Yau L, Zuerlein M, Rogers J, Mullikin JC, Hurles ME, McCooke NJ, West JS, Oaks FL, Lundberg PL, Klenerman D, Durbin R, Smith AJ. 2008. Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456:53-59.

Burton TL, Husband BC. 2000. Fitness differences among diploids, tetraploids, and their triploid progeny in Chamerion angustifolium: mechanisms of inviability and implications for polyploid evolution. Evolution 54:1182-1191.

Carroll SB. 2000. Endless forms: the evolution of gene regulation and morphological diversity. Cell 101:577–580.


Carroll SB. 2008. Evo-devo and an expanding evolutionary synthesis: a genetic theory of morphological evolution. Cell 134:25–36.

Charlesworth J, Eyre-Walker A. 2006. The rate of adaptive evolution in enteric bacteria. Mol Biol Evol 23:1348-1356.

Charlesworth J, Eyre-Walker A. 2008. The McDonald-Kreitman test and slightly deleterious mutations. Mol Biol Evol 25:1007-1015.

Chater AO. 1993. Capsella. In: Tutin TG, Heywood H, Burges NA, Moore DM, Valentine DH, Walters SM, Webb DA, editors. Flora Europaea. Cambridge, UK: Cambridge University Press. pp. 381–382.

Chen KY, Cong B, Wing R, Vrebalov J, Tanksley SD. 2007. Changes in regulation of a transcription factor lead to autogamy in cultivated tomatoes. Science 318:643-645.

Chénais B, Caruso A, Hiard S, Casse N. 2012. The impact of transposable elements on eukaryotic genomes: from genome size increase to genetic adaptation to stressful environments. Gene 509:7-15.

Clancy S. 2008. RNA transcription by RNA polymerase: prokaryotes vs eukaryotes. Nature Education 1(1):125.

Cleves PA, Ellis NA, Jimenez MT, Nunez SM, Schluter D, Kingsley DM, Miller CT. 2014. Evolved tooth gain in sticklebacks is associated with a cis-regulatory allele of Bmp6. Proceedings of the National Academy of Sciences 111:13912–13917.

Degner JF, Marioni JC, Pai AA, Pickrell JK, Nkadori E, Gilad Y, Pritchard JK. 2009. Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data. Bioinformatics 25:3207–3212.

Doebley J, Lukens L. 1998. Transcriptional regulators and the evolution of plant form. Plant Cell 10:1075-82.

Doebley J, Stec A, Hubbard L. 1997. The evolution of apical dominance in maize. Nature 386:485–488.

Duncan TM, Rausher MD. 2013. Evolution of the selfing syndrome in Ipomoea. Front Plant Sci 4:301.

Excoffier L, Dupanloup I, Huerta-Sánchez E, Sousa VC, Foll M. 2013. Robust demographic inference from genomic and SNP data. PLoS Genet 9:e1003905.

Eyre-Walker A, Keightley PD. 2007. The distribution of fitness effects of new mutations. Nat Rev Genet 8:610-8.

Eyre-Walker A, Keightley PD. 2009. Estimating the rate of adaptive molecular evolution in the presence of slightly deleterious mutations and population size change. Mol Biol Evol 26:2097–2108.

Eyre-Walker A. 2002. Changing effective population size and the McDonald–Kreitman test. Genetics 162:2017-2024.

Fay J, Wycoff GJ, Wu C-I. 2001. Positive and negative selection on the human genome. Genetics 158:1227-1234.

Fishman L, Kelly AJ, Willis JH. 2002. Minor quantitative trait loci underlie floral traits associated with mating system divergence in Mimulus. Evolution 56:2138-2155.

Foxe JP, Slotte T, Stahl EA, Neuffer B, Hurka H, Wright SI. 2009. Recent speciation associated with the evolution of selfing in Capsella. Proceedings of the National Academy of Sciences 106:5241–5245.

Fraser HB, Moses AM, Schadt EE. 2010. Evidence for widespread adaptive evolution of gene expression in budding yeast. Proceedings of the National Academy of Sciences 107:2977–2982.


Fraser HB. 2011. Genome-wide approaches to the study of adaptive gene expression evolution: systematic studies of evolutionary adaptations involving gene expression will allow many fundamental questions in evolutionary biology to be addressed. Bioessays 33:469–477.

Freeling M, Woodhouse MR, Subramaniam S, Turco G, Lisch D, Schnable JC. 2012. Fractionation mutagenesis and similar consequences of mechanisms removing dispensable or less-expressed DNA in plants. Curr Opin Plant Biol 15:131–139.

Garsmeur O, Schnable JC, Almeida A, Jourda C, D'Hont A, Freeling M. 2014. Two evolutionarily distinct classes of paleopolyploidy. Mol Biol Evol 31:448–454.

Git A, Dvinge H, Salmon-Divon M, Osborne M, Kutter C, Hadfield J, Bertone P, Caldas C. 2010. Systematic comparison of microarray profiling, real-time PCR, and next-generation sequencing technologies for measuring differential microRNA expression. RNA 16:991-1006.

Goldberg EE, Igić B. 2012. Tempo and mode in plant breeding system evolution. Evolution 66:3701-9.

Goodwillie C, Ritland C, Ritland K. 2006. The genetic basis of floral traits associated with mating system evolution in Leptosiphon (Polemoniaceae): an analysis of quantitative trait loci. Evolution 60:491-504.

Graze RM, Novelo LL, Amin V, Fear JM, Casella G, Nuzhdin SV, McIntyre LM. 2012. Allelic imbalance in Drosophila hybrid heads: exons, isoforms, and evolution. Mol Biol Evol 29:1521–1532.

Guo Y-L, Bechsgaard JS, Slotte T, Neuffer B, Lascoux M, Weigel D, Schierup MH. 2009. Recent speciation of Capsella rubella from Capsella grandiflora, associated with loss of self-incompatibility and an extreme bottleneck. Proceedings of the National Academy of Sciences 106:5246–5251.

Gutenkunst RN, Hernandez RD, Williamson SH, Bustamante CD. 2009. Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data. PLoS Genet 5:e1000695.

He F, Zhang X, Hu J, Turck F, Dong X, Goebel U, Borevitz J, de Meaux J. 2012. Genome-wide analysis of cis-regulatory divergence between species in the Arabidopsis genus. Mol Biol Evol 29:3385–3395.

Hoekstra HE, Coyne JA. 2007. The locus of evolution: evo devo and the genetics of adaptation. Evolution 61:995–1016.

Hollister JD, Gaut BS. 2009. Epigenetic silencing of transposable elements: a trade-off between reduced transposition and deleterious effects on neighboring gene expression. Genome Res 19:1419–1428.

Hollister JD, Smith LM, Guo Y-L, Ott F, Weigel D, Gaut BS. 2011. Transposable elements and small RNAs contribute to gene expression divergence between Arabidopsis thaliana and Arabidopsis lyrata. Proceedings of the National Academy of Sciences 108:2322–2327.

Hoopes L. 2008. Introduction to the gene expression and regulation topic room. Nature Education 1(1):160.

House MA, Griswold CK, Lukens LN. 2014. Evidence for selection on gene expression in cultivated rice (Oryza sativa). Mol Biol Evol 31:1514-1525.

Hurka H, Friesen N, German DA, Franzke A, Neuffer B. 2012. ‘Missing link’ species Capsella orientalis and Capsella thracica elucidate evolution of model plant genus Capsella (Brassicaceae). Mol Ecol 21:1223-1238.

Hurka H, Neuffer B. 1997. Evolutionary processes in the genus Capsella (Brassicaceae). Plant Syst Evol 206:295–316.

Jacob F, Monod J. 1961. Genetic regulatory mechanisms in synthesis of proteins. J Mol Biol 3:318–356.


Keightley PD, Eyre-Walker A. 2007. Joint inference of the distribution of fitness effects of deleterious mutations and population demography based on nucleotide polymorphism frequencies. Genetics 177:2251-2261.

Kimura S, Koenig D, Kang J, Yoong FY, Sinha N. 2008. Natural variation in leaf morphology results from mutation of a novel KNOX gene. Curr Biol 18:672-7.

King MC, Wilson AC. 1975. Evolution at two levels in humans and chimpanzees. Science 188:107-116.

Langham RJ, Walsh J, Dunn M, Ko C, Goff SA, Freeling M. 2004. Genomic duplication, fractionation and the origin of regulatory novelty. Genetics 166:935–945.

Lippman Z, Gendrel A-V, Black M, Vaughn MW, Dedhia N, McCombie WR, Lavine K, Mittal V, May B, Kasschau KD, Carrington JC, Doerge RW, Colot V, Martienssen R. 2004. Role of transposable elements in heterochromatin and epigenetic control. Nature 430:471–476.

Lowry DB, Logan TL, Santuari L, Hardtke CS, Richards JH, Derose-Wilson LJ, McKay JK, Sen S, Juenger TE. 2013. Expression quantitative trait locus mapping across water availability environments reveals contrasting associations with genomic features in Arabidopsis. Plant Cell 25:3266-3279.

McClintock B. 1948. Mutable loci in maize. Carnegie Inst Wash Yearb 47:155-169.

McClintock B. 1956. Controlling elements and the gene. Cold Spring Harb Symp Quant Biol 21:197-216.

McClintock B. 1984. The significance of responses of the genome to challenge. Science 226:792-801.

McDonald JH, Kreitman M. 1991. Adaptive evolution at the Adh locus in Drosophila. Nature 351:652-654.

Messer PW, Petrov DA. 2013. Frequent adaptation and the McDonald-Kreitman test. Proceedings of the National Academy of Sciences 110:8615-8620.

Miller CT, Beleza S, Pollen AA, Schluter D, Kittles RA, Shriver MD, Kingsley DM. 2007. Cis-Regulatory changes in Kit ligand expression and parallel evolution of pigmentation in sticklebacks and humans. Cell 131:1179–1189.

Nielsen R, Paul JS, Albrechtsen A, Song YS. 2011. Genotype and SNP calling from next-generation sequencing data. Nat Rev Genet 12:443-51.

Ornduff R. 1969. Reproductive biology in relation to systematics. Taxon 18:121-133.

Phillips T. 2008. The role of methylation in gene expression. Nature Education 1:116.

Pophaly SD, Tellier A. 2015. Population level purifying selection and gene expression shape subgenome evolution in maize. Mol Biol Evol 32:3226-3235.

Prud'homme B, Gompel N, Carroll SB. 2007. Emerging principles of regulatory evolution. Proceedings of the National Academy of Sciences 104 (Suppl 1):8605–8612.

Ramsey J, Schemske D. 1998. Pathways, mechanisms, and rates of polyploid formation in flowering plants. Annu Rev Ecol Syst 29:467–501.

Renny-Byfield S, Gong L, Gallagher JP, Wendel JF. 2015. Persistence of subgenomes in paleopolyploid cotton after 60 my of evolution. Mol Biol Evol 32:1063–1071.

Rockman MV, Skrovanek SS, Kruglyak L. 2010. Selection at linked sites shapes heritable phenotypic variation in C. elegans. Science 330:372–376.

Sauret-Güeto S, Schiessl K, Bangham A, Sablowski R, Coen E. 2013. JAGGED controls Arabidopsis petal growth and shape by interacting with a divergent polarity field. PLoS Biol 11:e1001550.


Schiessl K, Muiño JM, Sablowski R. 2014. Arabidopsis JAGGED links floral organ patterning to tissue growth by repressing Kip-related cell cycle inhibitors. Proceedings of the National Academy of Sciences 111:2830–2835.

Schnable JC, Springer NM, Freeling M. 2011. Differentiation of the maize subgenomes by genome dominance and both ancient and ongoing gene loss. Proceedings of the National Academy of Sciences 108:4069–4074.

Sicard A, Stacey N, Hermann K, Dessoly J, Neuffer B, Bäurle I, Lenhard M. 2011. Genetics, evolution, and adaptive significance of the selfing syndrome in the genus Capsella. Plant Cell 23:3156–3171.

Sicard A, Thamm A, Marona C, Lee YW, Wahl V, Stinchcombe JR, Wright SI, Kappel C, Lenhard M. 2014. Repeated evolutionary changes of leaf morphology caused by mutations to a homeobox gene. Curr Biol 24:1880-1886.

Skelly DA, Johansson M, Madeoy J, Wakefield J, Akey JM. 2011. A powerful and flexible statistical framework for testing hypotheses of allele-specific gene expression from RNA-seq data. Genome Res 21:1728–1737.

Slotte T, Foxe JP, Hazzouri KM, Wright SI. 2010. Genome-wide evidence for efficient positive and purifying selection in Capsella grandiflora, a plant species with a large effective population size. Mol Biol Evol 27:1813-1821.

Slotte T, Hazzouri KM, Ågren JA, Koenig D, Maumus F, Guo Y, Steige K, Platts AE, Escobar JS, Newman LK, Wang W, Mandáková T, Vello E, Steffen J, Takuno S, Brandvain Y, Coop G, Andolfatto P, Hu TT, Blanchette M, Clark RM, Quesneville H, Nordborg M, Gaut BS, Lysak MA, Jenkins J, Grimwood J, Prochnick S, Shu S, Rokhsar D, Schmutz J, Weigel D, Wright SI. 2013. The Capsella rubella genome and the genomic consequences of rapid mating system evolution. Nat Genet 45:831-835.

Slotte T, Hazzouri KM, Stern D, Andolfatto P, Wright SI. 2012. Genetic architecture and adaptive significance of the selfing syndrome in Capsella. Evolution 66:1360–1374.

Slotte T. 2014. The impact of linked selection on plant genomic variation. Brief Funct Genomics 13:268–275.

Smith NGC, Eyre-Walker A. 2002. Adaptive protein evolution in Drosophila. Nature 415:1022-1024.

Soltis PS, Soltis DE. 2009. The role of hybridization in plant speciation. Annu Rev Plant Biol 60:561-588.

St Onge KR, Foxe JP, Li J, Li H, Holm K, Corcoran P, Slotte T, Lascoux M, Wright SI. 2012. Coalescent-based analysis distinguishes between allo- and autopolyploid origin in Shepherd's Purse (Capsella bursa-pastoris). Mol Biol Evol 29:1721-33.

St Onge KR, Källman T, Slotte T, Lascoux M, Palmé AE. 2011. Contrasting demographic history and population structure in Capsella rubella and Capsella grandiflora, two closely related species with different mating systems. Mol Ecol 20:3306–3320.

Stern DL, Orgogozo V. 2008. The loci of evolution: how predictable is genetic evolution? Evolution 62:2155–2177.

Studer A, Zhao Q, Ross-Ibarra J, Doebley J. 2011. Identification of a functional transposon insertion in the maize domestication gene tb1. Nat Genet 43:1160–1163.

Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.

Tedder A, Carleial S, Gołębiewska M, Kappel C, Shimizu KK, Stift M. 2015. Evolution of the Selfing Syndrome in Arabis alpina (Brassicaceae). PLoS One 10:e0126618.


Williamson RJ, Josephs EB, Platts AE, Hazzouri KM, Haudry A, Blanchette M, Wright SI. 2014. Evidence for widespread positive and negative selection in coding and conserved noncoding regions of Capsella grandiflora. PLoS Genet 10:e1004622.

Wittkopp PJ, Haerum BK, Clark AG. 2008. Regulatory changes underlying expression differences within and between Drosophila species. Nat Genet 40:346–350.

Wittkopp PJ, Kalay G. 2012. Cis-regulatory elements: molecular mechanisms and evolutionary processes underlying divergence. Nat Rev Genet 13:59–69.

Wood TE, Takebayashi N, Barker MS, Mayrose I, Greenspoon PB, Rieseberg LH. 2009. The frequency of polyploid speciation in vascular plants. Proceedings of the National Academy of Sciences 106:13875–13879.

Woodhouse MR, Cheng F, Pires JC, Lisch D, Freeling M, Wang X. 2014. Origin, inheritance, and gene regulatory consequences of genome dominance in polyploids. Proceedings of the National Academy of Sciences 111:5283–5288.

Woodhouse MR, Schnable JC, Pedersen BS, Lyons E, Lisch D, Subramaniam S, Freeling M. 2010. Following tetraploidy in maize, a short deletion mechanism removed genes preferentially from one of the two homologs. PLoS Biol 8:e1000409.

Wray GA. 2007. The evolutionary significance of cis-regulatory mutations. Nat Rev Genet 8:206–216.

Zhang X, Cal AJ, Borevitz JO. 2011. Genetic architecture of regulatory variation in Arabidopsis thaliana. Genome Res 21:725–733.

Ågren JA, Wang W, Koenig D, Neuffer B, Weigel D, Wright SI. 2014. Mating system shifts and transposable element evolution in the plant genus Capsella. BMC Genomics 15:602.

Acta Universitatis Upsaliensis
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1332

Editor: The Dean of the Faculty of Science and Technology

A doctoral dissertation from the Faculty of Science and Technology, Uppsala University, is usually a summary of a number of papers. A few copies of the complete dissertation are kept at major Swedish research libraries, while the summary alone is distributed internationally through the series Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology. (Prior to January, 2005, the series was published under the title “Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology”.)

Distribution: publications.uu.se
urn:nbn:se:uu:diva-268953
