1 Section: Population and evolutionary genetics · 127 molecular evolution across spermatogenesis...
Transcript of 1 Section: Population and evolutionary genetics · 127 molecular evolution across spermatogenesis...
1
Section: Population and evolutionary genetics 1
2
Contrasting levels of molecular evolution on the mouse X chromosome 3
4
Erica L. Larson*, Dan Vanderpool*, Sara Keeble*,†, Meng Zhou†, Brice A.J. Sarver*, 5
Andrew D. Smith†, Matthew D. Dean†, Jeffrey M. Good* 6
7
*Division of Biological Sciences, University of Montana, Missoula, MT 59812 8
†Molecular and Computational Biology, University of Southern California, Los Angeles, 9
CA 90089 10
11
Data access: The data reported in this paper is available through NCBI under the 12
accession numbers: SRP065082, SRP065034, and SRP075865.13
Genetics: Early Online, published on June 17, 2016 as 10.1534/genetics.116.186825
Copyright 2016.
2
14
Short title: Molecular evolution of spermatogenesis 15
16
Keywords: Faster-X, gene expression, DNA methylation, fluorescence-activated cell 17
sorting, postmeiotic sex chromosome repression (PSCR) 18
19
Corresponding author: 20
Jeffrey M. Good 21
Division of Biological Sciences 22
University of Montana 23
Missoula, MT 59812 24
Phone: (406) 243 – 5122 25
Email: [email protected] 26
3
Abstract 27
The mammalian X chromosome has unusual evolutionary dynamics compared to 28
autosomes. Faster-X evolution of spermatogenic protein-coding genes is known to be 29
most pronounced for genes expressed late in spermatogenesis, but it is unclear if these 30
patterns extend to other forms of molecular divergence. We tested for faster-X evolution 31
in mice spanning three different forms of molecular evolution – divergence in protein 32
sequence, gene expression, and DNA methylation – across different developmental stages 33
of spermatogenesis. We used fluorescence-activated cell sorting to isolate individual cell 34
populations and then generated cell-specific transcriptome profiles across different stages 35
of spermatogenesis in two subspecies of house mice (Mus musculus), thereby overcoming 36
a fundamental limitation of previous studies on whole tissues. We found faster-X protein 37
evolution at all stages of spermatogenesis and faster-late protein evolution for both X-38
linked and autosomal genes. In contrast, there was less expression divergence late in 39
spermatogenesis (slower-late) on the X chromosome and for autosomal genes expressed 40
primarily in testis (testis-biased). We argue that slower-late expression divergence 41
reflects strong regulatory constraints imposed during this critical stage of sperm 42
development and that these constraints are particularly acute on the tightly regulated sex 43
chromosomes. We also found slower-X DNA methylation divergence based on genome-44
wide bisulfite sequencing of sperm from two species of mice (M. musculus and M. 45
spretus), though it is unclear whether slower-X DNA methylation reflects development 46
constraints in sperm or other X-linked phenomena. Our framework clarifies key 47
differences in patterns of regulatory and protein evolution across spermatogenesis that are 48
4
likely to have important consequences for mammalian sex chromosome evolution, male 49
fertility, and speciation. 50
51
Introduction 52
The X chromosome plays a disproportionately large role in adaptation and speciation 53
(Ellegren 2011; Bachtrog et al. 2011; Charlesworth 2013), but the underlying molecular 54
and evolutionary drivers of these patterns remain unclear. On one hand, the X 55
chromosome often shows strong signatures of evolutionary constraint. For example, the 56
evolution of dosage compensation via epigenetic X chromosome inactivation (XCI) in 57
females (Lyon 1961; 1962) imposes regulatory constraints that select for strong 58
conservation of X-linked gene content in placental mammals (Ohno 1967; Kohn et al. 59
2004). On the other hand, these inherent constraints are punctuated by strong 60
specialization in X-linked gene content (Emerson et al. 2004; Potrzebowski et al. 2008; 61
Mueller et al. 2008; Zhang et al. 2010; Sin et al. 2012; Mueller et al. 2013) and 62
numerous examples of rapid X-linked evolution (Torgerson and Singh 2003; 2006; 63
Baines and Harr 2007; Kousathanas et al. 2014; Nam et al. 2015). 64
The X chromosome is predicted to evolve faster than the autosomes if beneficial 65
mutations are on average recessive because selection will act more efficiently on X-66
linked mutations exposed in hemizygous males (Charlesworth et al. 1987). Under this 67
model, faster-X evolution should be most intense for male-specific genes (Rice 1984; 68
Vicoso and Charlesworth 2009). Indeed, the strongest evidence for faster-X evolution 69
comes from patterns of protein-coding evolution during spermatogenesis (Torgerson and 70
Singh 2006; Baines et al. 2008; Grath and Parsch 2012; Sin et al. 2012; Kousathanas et al. 71
5
2014). Molecular evolution on the X chromosome may also differ from the autosomes 72
due to a smaller effective population size (i.e., ¾ autosomal Ne assuming equal sex ratios; 73
Vicoso and Charlesworth 2009; Mank et al. 2010) or sex-linked differences in mutation 74
rates (µ; Miyata et al. 1987; Begun and Whitley 2000; Ellegren 2007). Differences in Ne 75
and µ aside, the theoretical expectations for faster-X evolution should extend to other 76
functional aspects of DNA sequence evolution. Gene regulation is usually measured 77
through various biochemical phenotypes (e.g., transcript abundances, methylation 78
patterns) that may not directly reflect linked sequence evolution, yet accelerated 79
divergence of X-linked gene expression levels has been reported in fruit flies (Kayserili 80
et al. 2012; Meisel et al. 2012a; Llopart 2015; Coolon et al. 2015), birds (Dean et al. 81
2015), and mammals (Khaitovich et al. 2005a; Zhang et al. 2010; Brawand et al. 2011). 82
The evolution of other regulatory phenotypes has not been widely considered and the 83
extent to which different forms of molecular evolution show similar patterns of X-linked 84
evolution remains to be seen. 85
A critical evaluation of molecular evolution in the male germ line depends on a 86
few important details of spermatogenesis. Spermatogenesis is defined by progressive 87
gene specialization, with postmeiotic genes tending to be more narrowly expressed and 88
functionally specific (Eddy 2002; Schultz et al. 2003; Good and Nachman 2005). 89
Expression breadth and specialization influence rates of protein-coding evolution (Liao et 90
al. 2006; Ellegren and Parsch 2007; Meisel et al. 2012b) and genes expressed later in 91
spermatogenesis show more rapid protein-coding evolution (Good and Nachman 2005; 92
Sin et al. 2012). Yet there remain relatively few examples where patterns of divergence 93
have been evaluated across different stages of spermatogenesis (Good and Nachman 94
6
2005; Kousathanas et al. 2014) and how different forms of molecular evolution change in 95
this developmental context is largely unknown. The sex chromosomes are also silenced 96
during male meiosis through Meiotic Sex Chromosome Inactivation (MSCI) (Turner 97
2007). This period of inactivation results in strong selection against X-linked genes that 98
need to be expressed during meiosis, while genes expressed prior to MSCI are enriched 99
on the X chromosome (Wang et al. 2001; Khil et al. 2004). Gene expression remains 100
transcriptionally repressed in postmeiotic stages of spermatogenesis (Postmeiotic Sex 101
Chromosome Repression or PSCR), though several X-linked genes overcome PSCR and 102
are highly expressed in round spermatids (Namekawa et al. 2006; Mueller et al. 2008; 103
Sin et al. 2012). In general, X-linked genes expressed during PSCR tend to be more 104
specialized with narrower expression profiles and show more rapid protein evolution 105
relative to co-expressed autosomal genes (Sin et al. 2012; Kousathanas et al. 2014). 106
While the context and timing of expression during spermatogenesis play crucial 107
roles in the interpretation of faster-X protein evolution, most comparative expression 108
studies have focused on whole tissues. This experimental approach implicitly assumes 109
that differential gene expression between species is not simply an artifact of differences 110
in cell composition. Yet spermatogenesis is usually asynchronous and overlapping in 111
mature testis, leading to age-dependent heterogeneity in the abundances of germ cell 112
populations through time. Testis cellular composition (i.e., testis histology) may also 113
evolve rapidly when selection from sperm competition results in allometric shifts towards 114
more sperm-producing seminiferous tubules (Firman et al. 2015). Such technical issues 115
confound patterns of gene expression measured from whole testis (Good et al. 2010), 116
especially when combined with differences in stages of maturity, levels of fertility, or 117
7
comparisons between species. For example, testis expression patterns in primates cluster 118
more strongly with mating system than evolutionary relatedness (Brawand et al. 2011; 119
Saglican et al. 2014), indicating that testis transcriptomes are strongly influenced by 120
convergent shifts in cellular composition associated with different reproductive strategies. 121
When combined with considerable variation in relative enrichment for or against X-122
linkage across different stages of spermatogenesis (Khil et al. 2004), it is apparent that a 123
rigorous examination of faster-X gene expression evolution requires a cell- or stage-124
specific approach. 125
Here we report two experiments designed to evaluate three different forms of 126
molecular evolution across spermatogenesis in mice. First, we used fluorescence-127
activated cell sorting (FACS; Getun et al. 2011) and RNA sequencing (RNA-seq; Wang 128
et al. 2009) to generate transcriptomes from mitotic, meiotic, and postmeiotic 129
spermatogenic cells in two subspecies of house mice (Mus musculus musculus and M. m. 130
domesticus). We quantified genome-wide patterns of protein-coding and expression 131
divergence across key developmental stages of spermatogenesis and test for faster-X 132
molecular evolution. Second, we performed whole genome bisulfite sequencing (BS-seq; 133
Frommer et al. 1992) to quantify patterns of DNA methylation divergence between 134
sperm from house mice (M. m. musculus) and the closely related Algerian mouse (Mus 135
spretus). This second experiment allowed us to quantify the evolution of a key 136
regulatory phenotype on and off the X chromosome for the first time in mice. 137
Collectively, these experiments allow us to quantify different molecular evolutionary 138
patterns in light of specific stages of sperm development. 139
140
8
141
Materials and Methods 142
Experimental design and choice of mouse strains 143
Information on protein-coding evolution between M. m. musculus, M. m. domesticus, and 144
M. spretus is assessable using published genomic resources (Keane et al. 2011). However, 145
the technical demands and experimental resources necessary to generate novel cell-146
specific transcriptome and DNA methylation data across spermatogenesis in all three 147
lineages are reasonably beyond the scope of a single study. Therefore, we used a nested 148
subset of evolutionary contrasts to optimize our power to quantify gene expression and 149
methylation divergence. Divergence in gene expression levels accumulates relatively 150
quickly (Lemos et al. 2004) and so we focused our FACS-based partitioning of gene 151
expression across spermatogenesis on two subspecies of house mice (M. m. domesticus 152
and M. m. musculus). We used four wild-derived inbred strains (purchased from Jackson 153
Laboratory, Bar Harbor, ME) to generate inter-strain F1s for each subspecies (M. m. 154
musculus: CZECHII/EiJ females x PWK/PhJ males; M. m. domesticus: WSB/EiJ females 155
x LEWES/EiJ males). This crossing design reduces the impacts of inbreeding depression 156
on basic reproductive phenotypes and follows previous studies on these mice (Good et al. 157
2008; 2010; Campbell et al. 2013). Less in known about the evolutionary tempo of DNA 158
methylation divergence, but patterns of methylation can be highly conserved between 159
species (Molaro et al. 2011). Therefore, we focused our analysis of sperm DNA 160
methylation on contrasts between M. spretus and M. m. musculus represented by four 161
partially inbred strains of M. spretus (SFM and STF) and M. m. musculus (MPB and 162
MBS) acquired from François Bonhomme (U. Montpellier). 163
9
Gene expression and DNA methylation experiments were initiated independently 164
at the University of Montana (expression) and the University of Southern California 165
(methylation) using different strains of mice. To confirm the expected evolutionary 166
relationships among the mouse strains used in this study, we estimated a phylogeny using 167
published whole genome data for WSB/EiJ and SPRET/EiJ (Keane et al. 2011) and new 168
whole exome data from all other strains. Whole exomes were enriched for Illumina 169
sequencing using an in-solution NimbleGen SeqCap EZ Mouse (Roche) exome design 170
targeting 54.3 Mb of exonic regions in the mouse genome (Fairfield et al. 2011). This in-171
solution platform performs well when used across different species of Mus with minimal 172
biases in capture efficiency and sensitivity (Sarver et al., submitted). Custom individually 173
barcoded Illumina libraries (Meyer and Kircher 2010) were enriched and sequenced (100 174
bp PE) at the University of Utah Microarray and Genomic Analysis Core (HiSeq 2000), 175
University of Oregon Genomics and Cell Characterization Core Facility (HiSeq 2500) 176
and University of Southern California Epigenome Center (HiSeq 2000, NextSeq 500). 177
We mapped quality filtered reads to the mouse reference genome (GRCm38) using BWA 178
v0.7.12 (Li and Durbin 2009), called variants using HAPLOTYPECALLER in the GENOME 179
ANALYSIS TOOLKIT v3.3.0 (McKenna et al. 2010), filtered variants on a minimum quality 180
score of 30 and depth of 10 reads, and combined variants using VCFTOOLS v0.1.12b 181
(Danecek et al. 2011). The combined set of single nucleotide polymorphisms (SNPs) was 182
filtered to include sites called across all samples. We then used a concatenated alignment 183
of all variant genotypes to estimate a maximum likelihood phylogeny using RAxML 184
v8.2.3 (Stamatakis 2014). 185
10
Animal use was approved by the University of Southern California or the 186
University of Montana Institutional Animal Care and Use Committees. Experimental 187
males were weaned at ~21 days after birth and individually caged for at least 15 days to 188
mitigate potential reproductive influences associated with dominance interactions 189
(Snyder 1967). All experimental males were euthanized between 60-90 days using CO2 190
followed by cervical dislocation (UM protocol 002-13) or only cervical dislocation (USC 191
protocol 11394). 192
193
Transcriptome and methylome sample preparation and sequencing 194
We used FACS to enrich individual cell populations from across the developmental 195
timeline of spermatogenesis following Getun et al. (2011). Testes were dissected 196
following euthanasia, decapsulated, and seminiferous tubules were digested. Cells were 197
washed repeatedly, stained with Hoechst 33343 (Invitrogen) and propidium iodide, 198
filtered twice through a 40 µm cell strainer, and kept on ice prior to sorting. Cell sorting 199
was performed on a FACSAria IIu cell sorter (BD Biosciences) at the University of 200
Montana Center for Environmental Health Sciences Fluorescence Cytometry Core. Cell 201
populations were sorted based on size, granularity, and fluorescence and collected in 15 202
µL beta mercaptoethanol (Sigma) per mL of RLT lysis buffer (Qiagen). RNA was 203
extracted from each cell population using the Qiagen RNeasy kit and quantified using a 204
Bioanalyzer 2000 (Agilent). Samples with RNA integrity (RIN) ≥ 8 were prepared for 205
RNA-seq using the Illumina Truseq Sample Prep Kit v2 in a design that avoided batch 206
effects between cell populations and genotypes. Libraries were sequenced (100 bp PE 207
and 100 bp SE) on Illumina machines at the QB3 Vincent J. Coates Genomics 208
11
Sequencing Laboratory at University of California Berkley (HiSeq 2000), the University 209
of Utah Microarray and Genomic Analysis Core (HiSeq 2000), the University of Oregon 210
Genomics and Cell Characterization Core Facility (HiSeq 2500) and the University of 211
Southern California Epigenome Center (HiSeq 2000, NextSeq 500). 212
For bisulfite sequencing, sperm were isolated from caudal epididymes by 213
incubating diced tissue in 50 µl of equilibrated M199 for 40 minutes at 5% CO2 at 37°C. 214
We removed tissue debris and allowed sperm to settle for 20 minutes. We then collected 215
100 µl of the top suspension and incubated the sample in 100 µl of extraction buffer (20 216
mM Tris Cl pH 8.0, 20 mM EDTA, 200 mM NaCl, 80 mM DTT, 4% SDS, 250 µg/ml 217
Proteinase K) at 55°, with occasional mixing, until the sample was completely dissolved. 218
We extracted DNA with the QIAamp DNA Mini Kit or the QIAamp DNA Blood Mini 219
Kit (Qiagen) with a final elution in 50 µl of distilled H2O. Bisulfite treatment, which 220
converts unmethylated cytosines to thymine, and 100 bp PE Illumina sequencing was 221
performed at Beijing Genomics Institute. 222
223
Illumina sequence read processing and mapping 224
For RNA-seq data, we removed adaptors and low quality bases using TRIMMOMATIC 225
v0.32 (Bolger et al. 2014). Trimmed reads were mapped using TOPHAT v2.0.10 (default 226
parameters; (Kim et al. 2013) to strain-specific (PWK/PhJ and WSB/EiJ) published 227
pseudo-references that incorporate all known SNPs, indels, and structural variants 228
relative to the mouse reference genome (GRCm38) based on the classic laboratory strain, 229
C57BL/6J (Huang et al. 2014). This approach leverages the extensive annotation 230
developed for the mouse genome while minimizing mapping bias that favors reads 231
12
matching the C57BL/6J reference, which is predominantly derived from M. m. 232
domesticus (Yang et al. 2011). We translated reads back into the GRCm38 coordinates 233
using LAPELS v1.0.5 (Holt et al. 2013; Huang et al. 2013) and counted the number of 234
fragments uniquely mapping to protein-coding genes (GRCm38, Ensembl release 78) 235
using FEATURECOUNTS v1.4.4 (-Q 20, -C, --primary). In addition, we counted multiply-236
mapped fragments within 203 multicopy/ampliconic X-linked genes (Mueller et al. 2013) 237
and 156 Y-linked genes. 238
For methylome sequence data, we trimmed adaptors and mapped paired-end reads 239
using RMAPBS-PE in the RMAP package (Smith et al. 2008; 2009). We mapped M. m. 240
musculus to GRCm38 and M. spretus to a custom pseudo-reference generated with 241
GATK v1.6 that combined the coordinates of GRCm38 with the sequence variation 242
information (Sanger release version 1303) of M. spretus strain SPRET/EiJ (Keane et al. 243
2011). 244
245
Gene expression across spermatogenesis 246
All gene expression analyses were conducted using R v3.1.1(R Development Core Team 247
2014) and the BIOCONDUCTOR v3.0 R package edgeR v3.12 (Robinson et al. 2010) with 248
p-values adjusted to 5% false discovery rate (FDR; Benjamini and Hochberg 1995). We 249
restricted our analyses to protein-coding genes with greater than one Fragments Per 250
Kilobase of exon per Million mapped reads (FPKM) in at least four of our 36 samples, 251
and we also tested a range of expression thresholds (see Results). We evaluated library 252
normalization using the weighted trimmed mean of M-values (Robinson and Oshlack 253
2010) or the scaling factor method (R package DESeq v1.22, Anders and Huber 2010). 254
13
For brevity, we only present results using scaling factor normalization. To visualize 255
expression data, we normalized FPKM values so that the sum of squares equals one (R 256
package vegan v2.3-1, Oksanen et al. 2015) or used variance-stabilizing transformation 257
(Anders and Huber 2010). We plotted normalized FPKM values using the R packages 258
beanplot v1.2 (Kampstra 2008) and gplots v2.17.0 (Warnes et al. 2015). 259
We defined a gene as “expressed” in a particular cell type (e.g. spermatogonia, 260
round spermatids) if it had an FPKM > 1 in a minimum of four individuals for a given 261
cell type. We defined a gene as “induced” in a particular cell type if the median 262
expression in the focal cell was higher (>2✕) than the median expression of the other cell 263
types combined. Similarly, we defined genes as induced at a particular stage of 264
spermatogenesis (e.g. early, late) when expression was higher (>4✕) than the maximum 265
expression of all other stages (e.g. Kousathanas et al. 2014). We used Affymetrix 266
microarray data across 96 tissues (Mouse Genome 430 2.0 Array) from the Mouse 267
MOE430 Gene Atlas (Su et al. 2002; Lattin et al. 2008) to identify testis-biased genes 268
(testis expression >2✕ median tissue expression) and to estimate tissue specificity (τ) 269
following the recommendations of Liao and Zhang (2006). The τ value ranges from 0 to 270
1 with higher values indicative of expression restricted to one or a few tissues (Yanai et 271
al. 2005; Liao and Zhang 2006; Liao et al. 2006). 272
273
Evolutionary analyses of protein-coding and expression divergence 274
We used codeml in PAML v4.7 (Yang 2007) to estimate the rate of nonsynonymous 275
substitutions (dN), synonymous substitutions (dS), and the dN/dS ratio (ω). Annotated 276
protein-coding sequences were extracted from published whole-genome sequences of M. 277
14
m. musculus (PWK/PhJ), M. m. domesticus (WSB/EiJ), and M. spretus (SPRET/EiJ) 278
(Keane et al. 2011). We used BIOSTRINGS v2.32.1 (Pages et al. 2008) to retrieve 279
sequences from the longest transcript for each gene, excluding transcripts with internal 280
stop codons. For genes that are induced early and late in spermatogenesis we estimated 281
pairwise and global (i.e., one-rate) ω from unrooted three-taxon alignments of 1) 282
individual transcripts and 2) a concatenated set transcripts for each chromosome. 283
Estimates of dN and dS from one-to-one orthologs between M. m. domesticus 284
(C57BL/6J) and Rattus norvegicus were retrieved from Ensembl Biomart release 83 285
(Smedley et al. 2015). For analyses of individual genes, we discarded transcripts with dS 286
values above the 95% quantile as a filter for poor alignments and transcripts with dS 287
estimates near zero, which are uninformative and can artificially inflate ω. 288
To identify differentially expressed (DE) genes between M. m. musculus and M. 289
m. domesticus, we fit our data with negative binomial generalized linear models with 290
Cox-Reid tagwise dispersion estimates (McCarthy et al. 2012). Our model included 291
species and cell type as a single factor and we constructed a design matrix that contrasted 292
each unique combination (e.g., M. m. musculus spermatogonia vs. M. m. domesticus 293
spermatogonia). We then tested for differential expression using likelihood ratio tests, 294
dropping one coefficient from the design matrix (i.e., the ‘null’ model) and comparing 295
that to the full model. We also calculated the correlation (1 - Spearman’s ρ) in 296
chromosome-wide expression divergence between subspecies using pair-wise median 297
FPKM values with confidence intervals generated by bootstrapping the data 1000 times. 298
299
Sperm methylation divergence 300
15
We quantified sperm methylome divergence between M. m. musculus and M. spretus 301
using BS-seq (Frommer et al. 1992) on 8 mice (four per species). Methylation of the 302
cytosine in cytosine-phosphate-guanine dinucleotides (CpG sites) plays a central role in 303
the epigenetic regulation of gene expression in mammals (Reik 2007). Mammalian 304
genomes generally show high levels of CpG methylation, punctuated by small 305
hypomethylated regions (HMRs) associated with promoters and enhancers (Molaro et al. 306
2011). Our primary goal was to study methylation divergence at orthologous CpG sites. 307
Therefore, we excluded sites if more than 1/2 of the mapped reads for a given individual 308
suggested a mutation resulting in the loss of CpG status (neither TpG nor CpG). To 309
control for differences in sequencing coverage, we randomly downsampled uniquely 310
mapped autosomal reads that contained at least one CpG so that individual autosomes 311
had average CpG coverage similar to the X-linked CpG’s. Hypomethylated regions 312
(HMRs) and other basic statistics were called using the HMR program in METHPIPE (Song 313
et al. 2013). Genomic windows that were called HMR in one species but not the other 314
were flagged as potential differentially methylated regions (DMRs). We then called 315
DMRs using the DMR program in METHPIPE, requiring each DMR to contain at least five 316
orthologous CpG’s that were differentially methylated between the species based on an 317
hypergeometric test of the four possible states (methylated vs. not methylated ✕ M. 318
spretus vs. M. m. musculus). 319
We used three different null hypotheses to test whether the proportion of X-linked 320
DMR’s differed from expectations. First, the number of X-linked base pairs divided by 321
the total number of base pairs (minus mtDNA and chrY; 171,032,299 / 2,633,777,672 = 322
0.065) in the sequenced mouse genome (GRCm38) provided a crude expectation of the 323
16
proportion of X-linked DMR’s. Second, because hypomethylation tends to cluster near 324
protein-coding genes, we calculated the null expectation based on the proportion of 325
protein-coding genes that are X-linked (869 / 19,720 = 0.044). Third, to further account 326
for potential differences in CpG sequencing coverage between the X chromosome and the 327
autosomes, we calculated that X-to-autosome ratio of CpG’s that were covered at least 328
3✕ in one species and at least 4✕ in the other (the minimum number of reads necessary 329
to detect a significantly differentially methylated CpG) among the downsampled reads 330
(348,647 / 9,968,595 = 0.035). 331
332
The data reported in this paper is available through NCBI under the accession 333
numbers SRP065082, SRP065034, and SRP075865. 334
335
Results 336
Phylogenetic relationship of the mice used in this study 337
We estimated a maximum likelihood phylogeny using whole genome and targeted exome 338
resequencing data from the mouse strains used in this study (Figure 1). The resulting tree 339
conformed to the expected evolutionary relationships among strains of M. m. musculus, 340
M. m. domesticus, and M. spretus. Strains within a given taxon are closely related and 341
approximate levels of individual variation following general expectations for these 342
species and subspecies. Likewise, M. musculus and M. spretus were separated by about 343
twice as much genetic divergence as M. m. musculus and M. m. domesticus. Thus, the 344
different strains of mice used within our studies of expression, DNA methylation, and 345
protein-coding divergence represent closely-related samples from these three lineages. 346
17
347
Stage-specific dissection of gene expression across mouse spermatogenesis 348
We used FACS to generate highly enriched populations of four cell types spanning three 349
phases of spermatogenesis (Figures S1, 2A): (1) spermatogonia (mitosis), (2) 350
leptotene/zygotene spermatocytes (meiosis, prior to MSCI), (3) diplotene spermatocytes 351
(meiosis, after MSCI), and (4) round spermatids (postmeiosis). For each subspecies, M. m. 352
musculus and M. m. domesticus, we mapped between 53 and 74 million unique 353
sequencing fragments from each cell type to their respective genomes (Table S1). Each 354
cell population had a distinct expression profile (Figure S2A), there was strong clustering 355
of transcription levels by cell type (Figure S2B), and expression patterns of established 356
cell or stage-specific genes were specific to their cell type (Figure S2C). Overall, these 357
results indicate that our FACS experiments yielded highly enriched cell populations with 358
low variance among individual samples. 359
We detected expression of 14,223 protein-coding genes across spermatogenesis. 360
Most genes were expressed in multiple cell types (Table 1), but often with large between-361
cell differences in expression levels. Our developmental timeline brackets the onset of 362
MSCI (at mid-pachytene), and as expected, we observed chromosome-wide repression of 363
X-linked genes in diplotene spermatocytes and partial reactivation of the X chromosome 364
in postmeiotic round spermatids (Figure 2). Each cell population produced distinct 365
expression profiles (Figure S2), but early cell types (spermatogonia and 366
leptotene/zygotene spermatocytes) showed very similar patterns of expression. Therefore, 367
we grouped results into three expression stages relative to MSCI: 1) expressed early in 368
spermatogonia and/or leptotene/zygotene spermatocytes and silenced at MSCI, 2) 369
18
expressed early, silenced at MSCI, and reactivated late, and 3) expressed only late 370
(Figure 3). Genes induced at each of these stages showed good overall agreement 371
between functional annotations and developmental processes indicative of that stage of 372
spermatogenesis (Table 1). We also observed very high gene-by-gene correspondence 373
between our general groupings and previously described patterns of X-linked expression 374
during spermatogenesis (Figure S3), which used different methods of cell enrichment and 375
expression profiling (Namekawa et al. 2006). These results indicate that our FACS-based 376
enrichment captures the dynamic changes in transcription across spermatogenesis. 377
Next, we combined our data with published multi-tissue expression data and 378
found that testis-biased genes made up a larger proportion of the genes induced late in 379
spermatogenesis (early 4.0%, late 35.4%), consistent with increasing gene specificity 380
during postmeiotic development (Schultz et al. 2003). The tissue specificity of gene 381
expression (τ) also increased late (median ± SE τ early, 0.831 ± 0.005; late, 0.903 ± 0.009, 382
Wilcoxon test P ≤ 0.001). This pattern was particularly striking on the X chromosome 383
where nearly half of late induced genes were also testis-biased (Table 1). In addition, the 384
X chromosome was enriched for genes induced early. These results are in agreement with 385
previous studies indicating that X-linked gene content has been shaped by natural 386
selection for spermatogenic genes expressed before or after MSCI (Khil et al. 2004; 387
Mueller et al. 2008; Sin et al. 2012). 388
389
Faster-late and faster-X protein evolution 390
We calculated rates of protein-coding evolution (ω) between M. m. musculus, M. m. 391
domesticus, and M. spretus for genes induced at different stages of spermatogenesis. 392
19
These contrasts revealed two striking patterns. First, late-induced autosomal and X-linked 393
genes evolved more rapidly than early-induced genes (i.e. “faster-late”, Figure 4A, Table 394
2), confirming a previous finding of a positive correlation between rates of protein 395
evolution and timing of expression during spermatogenesis (Good and Nachman 2005). 396
Second, X-linked genes showed significantly higher ω when compared to autosomal 397
genes (i.e. “faster-X”, Figure 4A). These patterns of faster-late and faster-X protein 398
evolution held when considering only pairwise differences within M. musculus (M. m. 399
musculus versus M. m. domesticus) (Table S2) or when considering much more divergent 400
contrasts between the house mouse and the Norwegian rat (Table 2). They were also 401
robust to different expression level thresholds (Table S3). 402
Interpreting patterns of ω requires consideration of the synonymous (dS) and 403
nonsynonymous (dN) per site substitution rates in isolation. The mammalian X 404
chromosome has a lower per-nucleotide mutation rate than the autosomes as a 405
consequence of male-driven molecular evolution (Ellegren and Parsch 2007). Consistent 406
with this, we observed lower dS for X-linked genes across all evolutionary contrasts 407
(Tables 2, S2) and for X-linked genes expressed early or late in spermatogenesis (Table 408
S2). The X chromosome is also predicted to have a lower Ne relative to the autosomes 409
and thus shallower coalescent depths on average, which could have a particularly strong 410
impact on the comparison of X versus autosomal divergence between closely related 411
lineages. Consequently, we observed the greatest difference in X-linked versus autosomal 412
estimates of dS between M. m. musculus and M. m. domesticus (dSX/dSauto = 0.572 versus 413
dSX/dSauto ~0.8 for contrasts involving M. spretus or rat). Finally, we found that the X 414
chromosome shows higher dN compared to the autosomes (Table 2), with the greatest 415
20
difference among genes induced late in spermatogenesis (Table S2). Thus, we found 416
consistently faster-late and faster-X protein-coding evolution for mouse spermatogenic 417
genes across different levels of evolutionary divergence and when considering both 418
relative and absolute rates of non-synonymous divergence. 419
420
Slower-late and slower-X expression evolution 421
We observed strikingly different patterns of expression divergence relative to protein 422
evolution. Although there was a general trend towards more DE genes induced late in 423
spermatogenesis (24% of spermatogonia genes versus 37% of round spermatids genes 424
were DE) this trend was restricted to autosomal genes and there were actually fewer X-425
linked DE genes induced late (i.e. “slower-X”, Figures 4B, 4C). The slower-X pattern 426
late in spermatogenesis was robust to different temporal classifications (Table S4), 427
different expression thresholds (Table S5), different methods of accounting for X-linked 428
ampliconic/multicopy genes (Figure S4), and was also apparent in per chromosome 429
comparisons (Figure 5). As discussed above, testis-biased genes tended to be expressed 430
late in spermatogenesis (Table 1). When we restricted our analysis to testis-biased genes 431
we found that expression divergence was dramatically reduced late and was similarly 432
constrained between the X chromosome and the autosomes (i.e. “slower-late”, Figure 4B). 433
If we compare expression divergence on the autosomes and the X chromosome while 434
excluding testis-biased genes, the pattern of slower-X late is even more apparent (Figure 435
S5). Thus, slower-late expression evolution appears to be a general feature of X-linked 436
spermatogenic genes, but was restricted to testis-biased genes on the autosomes. 437
These estimates of differential expression reflect divergence of a biochemical 438
21
phenotype that do not account for the potential impacts of lower mutation rates and/or 439
shallower coalescent times on the X chromosome. If we conservatively scale our 440
expectation for X-linked differential expression by the X-to-autosome ratio of 441
synonymous divergence (dSX/dSauto = 0.572 for the contrast between M. m. musculus and 442
M. m. domesticus) then we find that the X chromosome becomes significantly enriched 443
for DE genes induced early (P < 0.001) and the slower-X pattern becomes non-significant 444
late (P = 0.38). Likewise, for testis-biased genes, the X chromosome is significantly 445
enriched for DE genes early (P < 0.001) but not late (P = 0.924). Thus, the strongest 446
pattern in our data is that expression divergence changes considerably across 447
spermatogenesis both on and off the X chromosome, but at each stage the rate of 448
evolution of the X chromosome relative to the autosomes depends on the extent to which 449
mutation and effective population size influence evolution of this phenotype. 450
Overall patterns of protein divergence and expression divergence were 451
qualitatively different across this timeline. Protein divergence increased late in 452
spermatogenesis and was elevated on the X chromosome, while the opposite was true for 453
expression divergence. Interpretation of X versus autosomal expression divergence is 454
clearly dependent on the assumptions one makes with respect to null expectations. Given 455
this, it is informative to focus on the relationships between gene expression and protein 456
evolution across spermatogenesis. We calculated correlations and partial correlations 457
between protein evolution (ω), expression level (normalized FPKM), log2 fold-change 458
between subspecies (logFC), and tissue specificity (τ) (Figure S6). There was a strong 459
positive relationship between expression level and logFC on and off the X chromosome 460
for genes induced early and late in spermatogenesis. LogFC and τ were positively 461
22
correlated on the autosomes early, indicating that tissue-specific genes tended to also 462
show larger changes in expression levels between M. m. musculus and M. m. domesticus. 463
In contrast, logFC was negatively correlated with both τ and ω on the autosomes late in 464
spermatogenesis, but only when testis-biased genes were included (Figure S6). These 465
results support our general finding of elevated protein-coding divergence coupled with 466
more constrained gene expression for testis-biased genes late in spermatogenesis. 467
468
Slower-X sperm methylome evolution 469
For BS-seq, we obtained an average genomic coverage of 12.7✕ in M. m. musculus and 470
12.3✕ in M. spretus (Tables S6, S7). Methylation levels across individual CpG sites were 471
most strongly correlated among individuals within strains, followed by strains within 472
species, followed by between species comparisons (Table S8), as expected based on 473
patterns of DNA sequence divergence (Figure 1). Approximately 6.0% of the genome lies 474
underneath X-linked HMR’s, which is reasonably close to that expected based the 475
relative length of the X chromosome (6.4%). We defined methylome divergence by 476
identifying differentially methylated regions (DMRs) as genomic regions with adequate 477
coverage in both species but where an HMR was called in only one of the species. Only 478
2.3% of the 9,580 DMRs between M. m. musculus and M. spretus were X-linked, a 479
strongly significant reduction given the proportion of X-linked CpGs (Χ2 = 23.2, P < 10-5; 480
Figure 6, Table S9). We observed considerable variation in DMR enrichment among 481
chromosomes (Figure 6), likely because of the sensitivity of enrichment tests when 482
samples sizes are large. Nine of twenty linkage groups showed significant skews in 483
observed versus expected DMRs but the X chromosome always showed the strongest de-484
23
enrichment of DMRs across a range of test configurations. We emphasize here that we 485
downsampled autosomal reads to match X-linked coverage, therefore our results cannot 486
be explained by differential coverage. Our results remained qualitatively identical across 487
five different downsampling schemes. Thus, in mouse sperm the X chromosome shows 488
less absolute divergence in sperm methylation status relative to the autosomes. 489
Finally, we evaluated the potential influence of lower X chromosome substitution 490
rates on sperm methylome evolution. Similar to our expression analyses above, we 491
corrected our expectation for the number of X-linked DMRs by the X-to-autosome ratio 492
of synonymous substitutions ( dSX/dSauto = 0.77 for the contrast between M. m. musculus 493
and M. m. spretus). Using this correction, the reduction in the frequency of DMRs on the 494
X chromosome becomes non-significant (Χ2, P = 0.088). 495
496
Discussion 497
Forty years ago King and Wilson argued that protein sequence and gene expression 498
represent distinct levels of evolution (King and Wilson 1975). This influential paper 499
popularized the ideas that evolution proceeds through different molecular mechanisms 500
along the transition from genotype to phenotype and that gene expression may play a 501
predominant role in organismal evolution. In our study we found striking contrasts in 502
patterns of divergence between different forms of molecular evolution dependent on 503
chromosome origin and developmental stage of spermatogenesis. Our cell-specific data 504
yield new insights into the evolution of spermatogenesis and the critical role that 505
developmental context plays in molecular evolution on and off of the sex chromosomes. 506
24
Below we discuss the implications of our results for the evolution of spermatogenesis, the 507
X chromosome, and speciation. 508
509
Protein evolution and spermatogenesis 510
Faster-X protein evolution has now been found across a broad range of taxa including 511
mammals (Hvilsom et al. 2012; Veeramah et al. 2014), birds (Mank et al. 2007), flies 512
(Begun et al. 2007; Langley et al. 2012; Garrigan et al. 2014) and other insects (Jaquiery 513
et al. 2012; Sackton et al. 2014). Both higher rates of adaptive substitutions (Kousathanas 514
et al. 2014) and relaxed constraint (Wright et al. 2015) likely contribute to these patterns. 515
X-linked sequence evolution is not always unusual (reviewed in (Vicoso and 516
Charlesworth 2006; Ellegren and Parsch 2007; Meisel and Connallon 2013), but there is 517
strong support for faster-X protein evolution for genes expressed in male reproductive 518
tissues (Torgerson and Singh 2003; 2006; Baines et al. 2008; Grath and Parsch 2012; Sin 519
et al. 2012; Kousathanas et al. 2014). From these data, gene expression has emerged as a 520
defining factor in faster-X protein evolution, with breadth of expression, tissue specificity, 521
and degree of sex-bias all strongly influencing evolutionary rates (Meisel 2011; Meisel et 522
al. 2012b). 523
Given these general findings, the details of spermatogenesis should strongly 524
dictate patterns of molecular evolution on and off the sex chromosomes, yet most 525
evolutionary studies have not utilized a strong developmental framework (but see 526
Kousathanas et al. 2014). Our FACS data allowed us to partition genes across 527
spermatogenesis, yielding strong support for faster-X protein evolution (Figure 4A) based 528
on relative (ω) and absolute (dN) estimates of amino acid divergence. Faster-X protein 529
25
evolution was most striking when considering ω estimates that used dS to approximate 530
neutral divergence. This assumption is violated in many species due to selection for 531
biased codon usage. For example, the X chromosome shows stronger codon usage bias in 532
Drosophila (Singh et al. 2005), which in turn may inflate X-linked estimates of ω 533
(Campos et al. 2013). There is also weak codon usage bias in mice, but unlike 534
Drosophila, it is significantly weaker on the X chromosome (Kessler and Dean 2014) and 535
therefore conservative with respect to the observation of faster-X protein evolution. 536
Further, both lower mutation rates and shallower coalescent times likely contribute to less 537
nucleotide divergence on the mouse X chromosome (Table S2), making the observation 538
of higher X-linked dN conservative. 539
Some aspects of our faster-X protein-coding results are seemingly contrary to the 540
basic dominance predictions of faster-X theory. Postmeiotic cells are haploid, leading to 541
the prediction that faster-X evolution in the germ line should be restricted to genes 542
expressed prior to meiosis (Kousathanas et al. 2014). However, spermatids form a 543
multicellular syncytium connected through intercellular bridges that enable functional 544
equivalence through cytoplasmic exchange (Braun et al. 1989; Caldwell and Handel 545
1991). Exchange of gene products between spermatids is likely a functional necessity 546
given that many sex-linked genes are essential to the postmeiotic stages of 547
spermatogenesis. Assuming exchange of autosomal gene products maintains functional 548
diploidy, then our finding of faster-X protein evolution across the diploid and haploid 549
stages of spermatogenesis is generally consistent with the dominance predictions of 550
faster-X theory (Charlesworth et al. 1987; Vicoso and Charlesworth 2009). 551
552
26
Regulatory evolution and spermatogenesis 553
In contrast to protein evolution, X-linked postmeiotic genes (i.e., induced late) showed 554
less differential expression than comparable autosomal genes (Figure 4). As with 555
expression divergence, sperm methylome evolution also appeared slower on the X 556
chromosome (Figure 5). Given the strong signature of faster-X protein evolution, why do 557
we find evidence for equivalent or slower-X gene expression and DNA methylation 558
evolution across the same developmental timeline? One simple explanation is that one or 559
more of the assumptions of the faster-X model sequence evolution do not hold for these 560
complex biochemical traits. First, for the predictions of faster-X theory to hold, these 561
regulatory phenotypes must reflect linked sequence evolution. For faster-X expression 562
divergence, this would require that differences in transcript abundances are due to 563
divergence in cis-regulatory elements and/or X-linked trans-regulatory elements 564
(Kayserili et al. 2012; Meisel et al. 2012a; Meisel and Connallon 2013). Several studies 565
suggest that divergence in transcript abundances is largely determined by evolution in cis 566
(Wittkopp et al. 2004; Landry et al. 2005; Wittkopp et al. 2008; Graze et al. 2009; Shi et 567
al. 2012; Goncalves et al. 2012; Shen et al. 2014; Mack et al. 2016), while other research 568
indicates that trans-regulatory divergence is more common (Emerson et al. 2010; 569
McManus et al. 2010; Schaefke et al. 2013; Meiklejohn et al. 2014; Coolon et al. 2014; 570
Combes et al. 2015) or that the inferred mode of regulatory divergence is dependent on 571
taxon and experimental design (Coolon and Wittkopp 2013; Guerrero et al. 2016). The 572
mechanisms underlying the evolution of DNA methylation are even more poorly 573
understood, but levels of DNA methylation within a given genomic region appear 574
27
strongly dependent on underlying (i.e., cis) genetic sequences (Hernando-Herraez et al. 575
2015). 576
Faster-X sequence evolution also assumes that beneficial mutations are on 577
average at least partially recessive (Charlesworth et al. 1987), while theory (Gibson and 578
Weir 2005) and several empirical studies (Lemos et al. 2008; McManus et al. 2010; 579
Schaefke et al. 2013) suggest that cis-regulatory elements should on average act 580
additively. Nonetheless, these studies also find a substantial proportion of cis-regulatory 581
elements that act non-additively or, in some cases, a greater overall proportion of non-582
additive cis-regulatory elements (Coolon et al. 2014). Unfortunately, these issues remain 583
largely unresolved on the X chromosome because both dominance relationships and cis 584
versus trans regulatory evolution are usually assessed through allele-specific expression 585
in F1 genotypes, which cannot be evaluated for X-linked genes during spermatogenesis. 586
It is also important to consider what fraction of X-linked substitutions that 587
influence gene expression or DNA methylation reflect new mutations that have been 588
targeted by positive directional selection. One formal possibility is that divergence of 589
regulatory traits simply reflects neutral sequence evolution. Both patterns of slower-X 590
gene expression and methylation divergence become non-significant when expectations 591
are corrected by lower synonymous substitution rates on the X chromosome. This simple 592
correction is likely highly conservative, but does suggest that differences in mutation 593
rates and/or the effective population sizes may partially account for patterns of 594
divergence on the X chromosome. We still lack a strong theoretical framework for 595
differentiating between adaptive and neutral divergence of expression phenotypes 596
(Khaitovich et al. 2005b; Meisel and Connallon 2013). Recent genomic analyses of 597
28
patterns of sequence polymorphism and divergence in mice indicate widespread adaptive 598
evolution of both coding (Halligan et al. 2010) and non-coding regions (Halligan et al. 599
2011; 2013). Most putatively adaptive autosomal substitutions in mouse genomes occur 600
within candidate regulatory regions (i.e., untranslated exons and conserved non-coding 601
elements), although the individual fitness effects of amino acid substitutions appear to be 602
much larger (Halligan et al. 2013). These studies did not consider X-linked patterns 603
directly but do suggest that a substantial fraction of substitutions within regulatory 604
regions may be targeted by positive selection. However, if selection on gene expression 605
acts on standing genetic variation, as opposed to new mutations, then substitution rates 606
are predicted to be lower on the X chromosome (Meisel and Connallon 2013) given 607
lower levels of X-linked genetic diversity in mice (Geraldes et al. 2008). The relative 608
contribution of new mutations versus standing genetic variation to adaptive regulatory 609
divergence has not yet been explicitly addressed (Coolon and Wittkopp 2013). 610
In sum, it clear that extending the faster-X model sequence evolution to 611
regulatory phenotypes depends on several assumptions that are both restrictive and 612
remain largely unresolved. However, if regulatory divergence between subspecies of 613
mice were simply a consequence of the genetic architecture of these traits, mutational 614
processes, or the origin of adaptive variation, then we might reasonably expect patterns 615
on and off the X chromosome to be consistent across different stages of spermatogenesis. 616
Instead we found that X-linked expression divergence, both overall and relative to the 617
autosomes, changed dynamically over the timeline of spermatogenesis with a strong 618
signature of less divergence late (Figure 4B). Given this, we propose that slower-late 619
regulatory evolution may be best explained by strong developmental constraints on gene 620
29
expression phenotypes during the later stages of spermatogenesis that are particularly 621
acute on the sex chromosomes. 622
Evolution of the mammalian X chromosome likely reflects a balance between the 623
inherent constraints of dosage compensation (Ohno 1967; Kohn et al. 2004) and 624
spermatogenesis (Schultz et al. 2003; Shima et al. 2004; Chalmel et al. 2007) with the 625
conflicting forces of sexual and antagonistic selection that favor X-linked male-biased 626
genes (Rice 1984). During spermatogenesis, the sex chromosomes are tightly regulated 627
through the generally repressive epigenetic environments of MSCI and PSCR (Turner 628
2007; Hu and Namekawa 2015). For example, PSCR in mice is partially maintained by 629
the multi-copy Y-linked Sly gene (Cocquet et al. 2009; 2010). Sly represses postmeiotic 630
sex chromosome expression and favors Y transmission in spermatids. The multi-copy X-631
linked genes Slx/Slxl1 seem to counter this by increasing sex chromosome expression and 632
X transmission. This direct antagonism appears to have driven a copy number arms race 633
(Scavetta and Tautz 2010; Ellis et al. 2011) where each added copy leads to an 634
incremental increase in relative expression and subsequent counter selection for up-635
regulation of the other gene group (Cocquet et al. 2012). So while the underlying 636
genomic architecture evolves rapidly via gene duplication (Ellis et al. 2011), expression 637
level phenotypes appear to be under strong stabilizing selection for a dosage equilibrium 638
between sex-linked postmeiotic genes that if disrupted leads to male sterility. This is 639
evidence for strong constraint on the regulatory phenotypes of X-linked genes that 640
overcome PSCR to be expressed during this critical period of sperm development. Our 641
data suggest that such constraints likely extend more broadly across the X chromosome 642
and to testis-biased autosomal genes. 643
30
We also found that patterns of DNA methylation diverged more slowly on X 644
chromosomes compared to the autosomes (Figure 5). Dynamic methylation plays a key 645
role in mammalian genome regulation (Li et al. 1992; Okano et al. 1999; Reik et al. 646
2001), yet very little is know about either the tempo or mechanisms of DNA methylome 647
evolution. Sperm are the first mature cells following a major “erasure”; nearly the entire 648
genome is demethylated in spermatogonia stem cells and then remethylated late in 649
spermatogenesis (Reik et al. 2001). A study involving primates found the methylomes of 650
mature sperm to differ from those of somatic cells in several important ways. Most 651
notably, the HMRs are more numerous and extend for longer genomic intervals in sperm 652
compared to somatic cells (Molaro et al. 2011). We propose that the observation of less 653
X-linked sperm methylome divergence may be an extension of the same regulatory 654
constraints that lead to slower gene expression divergence late in spermatogenesis, as 655
well as other X-linked phenomena such as mutational environment. Comparable 656
methylome data from other tissues and other stages in spermatogenesis will be helpful in 657
resolving these outstanding questions. 658
Cis-regulatory evolution is thought to proceed with fewer pleiotropic constraints 659
than protein-coding divergence (e.g. Carroll 2008). However, our results indicate that the 660
stages of spermatogenesis that show the fastest rates of protein evolution show less 661
expression divergence for both X-linked and testis-biased genes. Protein evolution is 662
strongly influenced by underlying patterns of gene expression; there are generally higher 663
rates of protein-coding divergence for genes that have lower expression (Nguyen et al. 664
2015) and for narrowly expressed genes (Liao et al. 2006; Meisel 2011). Expression 665
specificity increases late in spermatogenesis and the X chromosome is enriched for testis-666
31
biased postmeiotic genes. Therefore expression specificity may be the primary factor 667
underlying faster-late and faster-X protein evolution. While divergence in gene 668
expression (logFC) and tissue specificity (τ) were also positively correlated early in 669
spermatogenesis, they were negatively correlated for autosomal testis-biased genes 670
expressed late (Figure S6). Thus, specificity may actually constrain expression 671
divergence late in spermatogenesis, especially for postmeiotic testis-biased genes that 672
presumably play critical roles in sperm development and maturation. There is typically a 673
positive association between differential expression and tissue specificity (Meisel et al. 674
2012a), suggesting that our results reflect a unique feature of sperm development rather 675
than a general molecular evolutionary trend. 676
Our observation of less X-linked expression divergence late in spermatogenesis 677
differs some from studies in mammals reporting elevated (uncorrected) X-linked 678
expression divergence primarily in testis (Khaitovich et al. 2005a; Zhang et al. 2010; 679
Brawand et al. 2011). These previous studies have focused on various whole tissue 680
contrasts spanning different taxa and phylogenetic depths. However, Zhang et al. (2010) 681
did report greater relative X-linked expression divergence in spermatids between mouse 682
and rat; the exact cell population that we found to be the most conserved between 683
subspecies of mice (Figure 4). While this could reflect changes in X-linked expression 684
evolution over deeper evolutionary timescales, this previous result was based on a meta-685
analysis of microarray data collected independently in mouse (Chalmel et al. 2007) and 686
rat (Johnston et al. 2008) using different cell-enrichment procedures. A full assessment of 687
testis gene expression evolution between mouse and rat awaits the comparison of cell-688
specific transcriptomes that control for potential experimental artifacts. 689
32
How broadly the slower-late X and testis-biased autosomal patterns of regulatory 690
evolution apply to other taxa remains to be seen. Expression divergence has not been 691
systematically evaluated in non-mammalian systems using cell- or stage-specific data, 692
making the relative contributions of cellular composition and developmental constraint 693
difficult to determine. This technical limitation aside, the influence of developmental 694
timing on spermatogenic expression evolution will also be dependent on the existence or 695
extent of MSCI/PSCR-like phenomena in other taxa. For example, faster-X expression 696
divergence has also been reported in Drosophila (Kayserili et al. 2012; Meisel et al. 697
2012a; Llopart 2015; Coolon et al. 2015) where the X chromosome appears to be 698
transcriptionally repressed during spermatogenesis (Meiklejohn et al. 2011). Whether this 699
repression reflects a process akin to MSCI has been contentious, but it does suggest that 700
regulatory constraints are likely play an important role in sex chromosome evolution 701
(Vibranovski 2014). 702
703
Implications for speciation 704
Faster-X theory was originally proposed in part to explain two observations that invoke a 705
large role for sex chromosomes in speciation (Charlesworth et al. 1987). First, when 706
hybrids between recently diverged lineages show sex-specific sterility or inviability, it is 707
usually the heterogametic sex (Haldane 1922). Second, heterogametic sterility is 708
disproportionately linked to the X or Z chromosomes, a pattern known as the large-X 709
effect (Coyne 1992). Faster-X evolution remains one of the predominant mechanisms 710
invoked to explain the evolution of hybrid sterility (Kousathanas et al. 2014), and the link 711
between rapid evolution and speciation is intuitive. A long-standing alternative 712
33
hypothesis is that spermatogenesis is an inherently sensitive process that may be easily 713
disrupted in hybrids (Lifschytz and Lindsley 1972; Jablonka and Lamb 1991; Wu and 714
Davis 1993). Disruption of MSCI/PSCR are strong a priori candidate regulatory 715
mechanisms for the sensitivity of spermatogenesis (Lifschytz and Lindsley 1972). Studies 716
in mice (Good et al. 2010; Bhattacharyya et al. 2013; Campbell et al. 2013; Turner et al. 717
2014) and other mammals (Davis et al. 2015) have recently found a strong link between 718
regulatory disruption of the X chromosome during spermatogenesis and the evolution of 719
hybrid male sterility. Our results reveal that the same developmental stages that have 720
disrupted gene expression in mouse hybrids (Good et al. 2010; Bhattacharyya et al. 2013; 721
Campbell et al. 2013) show the strongest evolutionary constraints in gene expression 722
levels (Figure 4). While these data do not discount an important role for faster-X protein-723
coding evolution in speciation, they do suggest that inherent developmental constraints 724
on the regulation of gene expression late in spermatogenesis may play a central and 725
underappreciated role in the evolution of hybrid male sterility. 726
727
Acknowledgments 728
We would like to thank the University of Montana Fluorescence Cytometry Core, 729
supported by the National Institute of General Medicine Sciences of the National 730
Institutes of Health (P30GM103338), the University of Montana Genomics Core, 731
supported by a grant from the M.J. Murdock Charitable Trust and the Vincent J. Coates 732
Genomics Sequencing Laboratory at UC Berkeley, supported by NIH S10 733
Instrumentation Grants S10RR029668 and S10RR027303. We also thank Pamela K. 734
Shaw, Irina Getun, Bivian Torres, Colin Callahan and Brent Young for assistance with 735
34
FACS and other sample preparations, Francois Bonhomme for mice and the staff of the 736
UM Laboratory Animal Research facility. Members of the Good and Dean labs gave us 737
helpful feedback on experimental results and comments from Bret Payseur and multiple 738
anonymous reviewers improved previous versions of this manuscript. This research was 739
supported by the Eunice Kennedy Shriver National Institute of Child Health and Human 740
Development (R01HD073439; JMG), the National Institute of General Medical Sciences 741
(R01GM098536; MDD), and the National Science Foundation (1146525; MDD). 742
743
744
Figure Legends 745
Figure 1. Evolutionary relationships among mouse strains and species. The 746
phylogeny was estimated using maximum likelihood, based on genome-wide data. 747
Bootstrap proportions are indicated for each internal branch. Shaded circles indicate 748
which strains were used for estimates of gene expression, protein-coding and sperm 749
methylation divergence. 750
751
Figure 2. X-linked expression across spermatogenesis in M. m. musculus and M. m. 752
domesticus. (A) Progression of dividing germ cells during spermatogenesis. In early 753
prophase I the X chromosome is transcriptionally silenced (MSCI) for the remainder of 754
meiosis and remains partially inactivated (PSCR) during postmeiotic development of 755
spermatids. We used FACS to isolate cell populations spanning these three phases of 756
spermatogenesis (a-d). (B) Patterns of X-linked gene expression in FACS isolated cell 757
populations. Genes (rows) are grouped by the timing of induction: early (expressed prior 758
35
to MSCI), early and late (inactivated at MSCI and reactivated late) and late (expressed 759
only in postmeiotic cells). 760
761
Figure 3. Expression patterns for genes induced at different stages of 762
spermatogenesis in Mus musculus. The normalized FPKM values for each gene is 763
plotted as a horizontal tick mark and the distribution density for each cell type is plotted 764
as a gray outline. Lines represent the median FPKM for each cell type (solid) and stage of 765
spermatogenesis (dashed). Genes induced early have higher expression before the onset 766
of MSCI (SP, spermatogonia and LZ, leptotene/zygotene spermatocytes). Genes induced 767
both early and late are silenced at MSCI (DIP, diplotene spermatocytes) and are 768
reactivated in postmeiotic cells (RS, round spermatids). Genes induced late are only 769
highly expressed in postmeiotic cells. 770
771
Figure 4. Evolutionary divergence in protein-coding sequence and expression level. 772
(A) Median (± SE) estimates of protein-coding divergence (ω) among subspecies of M. 773
musculus and M. spretus is higher for genes expressed late and for X-linked genes. Gene 774
expression divergence between subspecies of M. musculus estimated as (B) the 775
proportion of expressed genes that are differentially expressed (DE) and (C) the pairwise 776
correlations of gene expression per chromosome (1 – ρ, ± 95% CI). The evolution of X-777
linked genes is either equivalent or slower than autosomal genes and there is a marked 778
drop in the proportion of testis-biased genes expressed late in spermatogenesis that are 779
DE. N = total number of genes. Significant differences in ω (Wilcoxon test) and 780
proportion of DE genes (Χ2) are indicated for each contrast. FDR corrected P-values : * ≤ 781
36
0.05, ** ≤ 0.01, *** ≤ 0.001. 782
783
Figure 5. Contrasting patterns of gene expression divergence on the X chromosome. 784
Early in spermatogenesis the X chromosome has a similar proportion of DE genes 785
compared to the autosomes, while there are fewer X-linked DE genes late in 786
spermatogenesis. The shift in the number of observed/expected X-linked genes between 787
early and late reflects the enrichment of X-linked genes induced early. Significance is 788
based on chromosome-wise hypergeometric test for enrichment with FDR corrected P-789
values. 790
791
Figure 6. Evolution of the mouse sperm methylome. Chromosomal distributions of 792
differentially methylated regions (DMRs) between M. m. musculus and M. spretus. The 793
observed and expected numbers of DMRs are plotted to the left. Expectations are based 794
on the proportion of CpG sites on each chromosome multiplied by the total number of 795
DMRs. Chromosomes in dark gray have significantly more or less DMRs than expected 796
based on chromosome-wise hypergeometric tests. The X chromosome has significantly 797
fewer DMRs compared to all of the autosomes (P < 0.001). 798
799
37
Table 1. The X chromosome is enriched for genes induced before and after MSCI and for testis-biased genes. Values represent 787
the percentage of genes that meet a given expression threshold in each cell type and stage of spermatogenesis. Enrichment of X-linked 788
genes are based on chromosome-wise hypergeometric tests, FDR corrected P-values: *** ≤ 0.001. 789
% Expressed1 % Induced2 % Testis-biased3 Functional annotation
Auto X Auto X Auto X Auto and X By cell
Spermatogonia 54.3 50.5 19.1 40.3*** 5.6 11.3*** 5cell, intracellular, anatomical structure development, biosynthetic process, nucleic acid binding transcription factor activity
Leptotene/zygotene spermatocytes
52.7 45.6 15.3 34.0*** 5.2 7.5 5organelle, intracellular, cell, cellular nitrogen metabolic process, nucleus
4Diplotene spermatocytes 48.0 13.2 16.2 - 48.4 - cilium, reproduction, microtubule organizing center
Round spermatids 50.3 35.6 22.8 23.0 42.5 44.9 cilium, reproduction
By stage
Early NA NA 10.6 33.9*** 3.1 7.5*** 5anatomical structure development, immune system process, plasma membrane, signal transduction, cell differentiation
Early & Late NA NA 2.3 9.3 14.3 19.5 5anatomical structure development, cytoskeletal organization, cytoskeletal protein binding, cell differentiation, cytoskeleton
Late NA NA 7.9 19.1*** 35.6 49.4*** reproduction, signal transducer activity, extracellular region, cell wall organization or biogenesis, neurological system process
1 % Expressed = genes with FPKM > 1 in a minimum of 4 individuals for a given cell type out of the total genes detected (14,223 genes). 790 2 % Induced = By cell: genes with a median FPKM > 2✕ the median FPKM of all other cell types combined out of the total genes expressed. By stage: genes 791 with a maximum FPKM > 4✕ the maximum FPKM of all other stages combined out of the total genes expressed. (Note: a gene can be induced in more than one 792 cell type, but can only be induced at a single stage). 793 3 % Testis-biased = genes with higher expression in the testes compared 96 other tissues from the Mouse MOE430 Gene Atlas out of the total genes induced. 794
38
4 Due to MSCI there are too few genes expressed in diplotene spermatocytes to report X-linked induced genes. 795 5 Only the top five gene enrichment categories are listed. 796
39
Table 2. Protein-coding divergence on the X chromosome and the autosomes. Estimates of median (± standard error) omega (ω), 797
nonsynonymous substitution rate (dN) and synonymous substitution rate (dS) among M. m. musculus, M. m. domesticus, M. spretus 798
and Rattus norvegicus. N = the number of genes in each comparison. Significant differences (Wilcoxon test) between the autosomes 799
and X chromosome are indicated by FDR corrected (Benjamini and Hochberg 1995) P-values: *** < 0.001. 800
global estimates
among Mus species M. m. domesticus
vs. Rattus norvegicus
N Auto 9,898 10,565 X 416 311 ω Auto 0.114 ± 3.00x10-3 0.119 ± 1.61x10-3 X 0.235*** ± 2.82x10-2 0.192*** ± 1.33x10-2 dN Auto 0.003 ± 5.48x10-5 0.023 ± 3.38x10-4 X 0.004*** ± 4.99x10-4 0.029*** ± 2.82x10-3 dS Auto 0.024 ± 1.10x10-4 0.196 ± 5.73x10-4 X 0.018*** ± 4.90x10-4 0.156*** ± 3.52x10-3
dSX/dSauto 0.750 0.796 801
40
References 802
Anders S., Huber W., 2010 Differential expression analysis for sequence count data. 803 Genome Biol 11: R106. 804
Bachtrog D., Kirkpatrick M., Mank J. E., McDaniel S. F., Pires J. C., Rice W., 805 Valenzuela N., 2011 Are all sex chromosomes created equal? Trends Genet 27: 350–806 357. 807
Baines J. F., Harr B., 2007 Reduced X-linked diversity in derived populations of house 808 mice. Genetics 175: 1911–1921. 809
Baines J. F., Sawyer S. A., Hartl D. L., Parsch J., 2008 Effects of X-linkage and sex-810 biased gene expression on the rate of adaptive protein evolution in Drosophila. Mol 811 Biol Evol 25: 1639–1650. 812
Begun D. J., Whitley P., 2000 Reduced X-linked nucleotide polymorphism in Drosophila 813 simulans. Proc Natl Acad Sci U S A 97: 5960–5965. 814
Begun D. J., Holloway A. K., Stevens K., Hillier L. W., Poh Y.-P., Hahn M. W., Nista P. 815 M., Jones C. D., Kern A. D., Dewey C. N., Pachter L., Myers E., Langley C. H., 816 2007 Population genomics: whole-genome analysis of polymorphism and divergence 817 in Drosophila simulans. PLoS Biol 5: e310–26. 818
Benjamini Y., Hochberg Y., 1995 Controlling the false discovery rate: a practical and 819 powerful approach to multiple testing. J R Stat Soc 57: 289–300. 820
Bhattacharyya T., Gregorova S., Mihola O., Anger M., Sebestova J., Denny P., Simecek 821 P., Forejt J., 2013 Mechanistic basis of infertility of mouse intersubspecific hybrids. 822 Proc Natl Acad Sci U S A 110: E468–77. 823
Bolger A. M., Lohse M., Usadel B., 2014 Trimmomatic: a flexible trimmer for Illumina 824 sequence data. Bioinformatics 30: 2114–2120. 825
Braun R. E., Behringer R. R., Peschon J. J., Brinster R. L., Palmiter R. D., 1989 826 Genetically haploid spermatids are phenotypically diploid. Nature 337: 373–376. 827
Brawand D., Soumillon M., Necsulea A., Julien P., Csárdi G., Harrigan P., Weier M., 828 Liechti A., Aximu-Petri A., Kircher M., Albert F. W., Zeller U., Khaitovich P., 829 Grützner F., Bergmann S., Nielsen R., Pääbo S., Kaessmann H., 2011 The evolution 830 of gene expression levels in mammalian organs. Nature 478: 343–348. 831
Caldwell K. A., Handel M. A., 1991 Protamine transcript sharing among postmeiotic 832 spermatids. Proc Natl Acad Sci U S A 88: 2407–2411. 833
Campbell P., Good J. M., Nachman M. W., 2013 Meiotic sex chromosome inactivation is 834 disrupted in sterile hybrid male house mice. Genetics 193: 819–828. 835
41
Campos J. L., Zeng K., Parker D. J., Charlesworth B., Haddrill P. R., 2013 Codon usage 836 bias and effective population sizes on the X chromosome versus the autosomes in 837 Drosophila melanogaster. Mol Biol Evol 30: 811–823. 838
Carroll S. B., 2008 Evo-devo and an expanding evolutionary synthesis: a genetic theory 839 of morphological evolution. Cell 134: 25–36. 840
Chalmel F., Rolland A. D., Niederhauser-Wiederkehr C., Chung S. S. W., Demougin P., 841 Gattiker A., Moore J., Patard J.-J., Wolgemuth D. J., Jégou B., Primig M., 2007 The 842 conserved transcriptome in human and rodent male gametogenesis. Proc Natl Acad 843 Sci U S A 104: 8346–8351. 844
Charlesworth B., Coyne J. A., Barton N. H., 1987 The relative rates of evolution of sex 845 chromosomes and autosomes. Am Nat 130: 113–146. 846
Charlesworth D., 2013 Plant sex chromosome evolution. J Exp Bot 64: 405–420. 847
Cocquet J., Ellis P. J. I., Mahadevaiah S. K., Affara N. A., Vaiman D., Burgoyne P. S., 848 2012 A genetic basis for a postmeiotic X versus Y chromosome intragenomic 849 conflict in the mouse. PLoS Genet 8: e1002900. 850
Cocquet J., Ellis P. J. I., Yamauchi Y., Mahadevaiah S. K., Affara N. A., Ward M. A., 851 Burgoyne P. S., 2009 The multicopy gene Sly represses the sex chromosomes in the 852 male mouse germline after meiosis. PLoS Biol 7: e1000244. 853
Cocquet J., Ellis P. J. I., Yamauchi Y., Riel J. M., Karacs T. P. S., Rattigan A., Ojarikre 854 O. A., Affara N. A., Ward M. A., Burgoyne P. S., 2010 Deficiency in the multicopy 855 Sycp3-like X-linked genes Slx and Slxl1 causes major defects in spermatid 856 differentiation. Mol. Biol. Cell 21: 3497–3505. 857
Combes M. C., Hueber Y., Dereeper A., Rialle S., Herrera J. C., Lashermes P., 2015 858 Regulatory divergence between parental alleles determines gene expression patterns 859 in hybrids. Genome Biol Evol 7: 1110–1121. 860
Coolon J. D., Wittkopp P. J., 2013 cis- and trans- regulation in Drosophila interspecific 861 hybrids. In:Chen ZJ, Birchler JA (Eds.), Polyploid and hybrid genomics,, pp. 37–57. 862
Coolon J. D., McManus C. J., Stevenson K. R., Graveley B. R., Wittkopp P. J., 2014 863 Tempo and mode of regulatory evolution in Drosophila. Genome Res 24: 797–808. 864
Coolon J. D., Stevenson K. R., McManus C. J., Yang B., Graveley B. R., Wittkopp P. J., 865 2015 Molecular mechanisms and evolutionary processes contributing to accelerated 866 divergence of gene expression on the Drosophila X chromosome. Mol Biol Evol 32: 867 2605–2615. 868
Coyne J. A., 1992 Genetics and speciation. Nature 355: 511–515. 869
Danecek P., Auton A., Abecasis G., Albers C. A., Banks E., DePristo M. A., Handsaker 870
42
R. E., Lunter G., Marth G. T., Sherry S. T., McVean G., Durbin R., 1000 Genomes 871 Project Analysis Group, 2011 The variant call format and VCFtools. Bioinformatics 872 27: 2156–2158. 873
Davis B. W., Seabury C. M., Brashear W. A., Li G., Roelke-Parker M., Murphy W. J., 874 2015 Mechanisms underlying mammalian hybrid sterility in two feline interspecies 875 models. Mol Biol Evol 32: 2534–2546. 876
Dean R., Harrison P. W., Wright A. E., Zimmer F., Mank J. E., 2015 Positive selection 877 underlies faster-Z evolution of gene expression in birds. Mol Biol Evol 32: 2646–878 2656. 879
Eddy E. M., 2002 Male germ cell gene expression. Recent Prog Horm Res 57: 103–128. 880
Ellegren H., 2007 Characteristics, causes and evolutionary consequences of male-biased 881 mutation. Proc Biol Sci 274: 1–10. 882
Ellegren H., 2011 Sex-chromosome evolution: recent progress and the influence of male 883 and female heterogamety. Nat Rev Genet 12: 157–166. 884
Ellegren H., Parsch J., 2007 The evolution of sex-biased genes and sex-biased gene 885 expression. Nat Rev Genet 8: 689–698. 886
Ellis P. J. I., Bacon J., Affara N. A., 2011 Association of Sly with sex-linked gene 887 amplification during mouse evolution: a side effect of genomic conflict in 888 spermatids? Hum Mol Genet 20: 3010–3021. 889
Emerson J. J., Hsieh L.-C., Sung H.-M., Wang T.-Y., Huang C.-J., Lu H. H.-S., Lu M.-Y. 890 J., Wu S.-H., Li W.-H., 2010 Natural selection on cis and trans regulation in yeasts. 891 Genome Res 20: 826–836. 892
Emerson J. J., Kaessmann H., Betrán E., Long M., 2004 Extensive gene traffic on the 893 mammalian X chromosome. Science 303: 537–540. 894
Fairfield H., Gilbert G. J., Barter M., Corrigan R. R., Curtain M., Ding Y., D'Ascenzo M., 895 Gerhardt D. J., He C., Huang W., Richmond T., Rowe L., Probst F. J., Bergstrom D. 896 E., Murray S. A., Bult C., Richardson J., Kile B. T., Gut I., Hager J., Sigurdsson S., 897 Mauceli E., Di Palma F., Lindblad-Toh K., Cunningham M. L., Cox T. C., Justice M. 898 J., Spector M. S., Lowe S. W., Albert T., Donahue L. R., Jeddeloh J., Shendure J., 899 Reinholdt L. G., 2011 Mutation discovery in mice by whole exome sequencing. 900 Genome Biol 12: R86. 901
Firman R. C., Garcia-Gonzalez F., Thyer E., Wheeler S., Yamin Z., Yuan M., Simmons 902 L. W., 2015 Evolutionary change in testes tissue composition among experimental 903 populations of house mice. Evolution 69: 848–855. 904
Frommer M., McDonald L. E., Millar D. S., Collis C. M., Watt F., Grigg G. W., Molloy 905 P. L., Paul C. L., 1992 A genomic sequencing protocol that yields a positive display 906
43
of 5-methylcytosine residues in individual DNA strands. Proc Natl Acad Sci U S A 907 89: 1827–1831. 908
Garrigan D., Kingan S. B., Geneva A. J., Vedanayagam J. P., Presgraves D. C., 2014 909 Genome diversity and divergence in Drosophila mauritiana: multiple signatures of 910 faster X evolution. Genome Biol Evol 6: 2444–2458. 911
Geraldes A., Basset P., Gibson B., Smith K. L., Harr B., Yu H.-T., Bulatova N., Ziv Y., 912 Nachman M. W., 2008 Inferring the history of speciation in house mice from 913 autosomal, X-linked, Y-linked and mitochondrial genes. Mol Ecol 17: 5349–5363. 914
Getun I. V., Torres B., Bois P. R. J., 2011 Flow cytometry purification of mouse meiotic 915 cells. JoVE 50: e2602. 916
Gibson G., Weir B., 2005 The quantitative genetics of transcription. Trends Genet 21: 917 616–623. 918
Goncalves A., Leigh-Brown S., Thybert D., Stefflova K., Turro E., Flicek P., Brazma A., 919 Odom D. T., Marioni J. C., 2012 Extensive compensatory cis-trans regulation in the 920 evolution of mouse gene expression. Genome Res 22: 2376–2384. 921
Good J. M., Nachman M. W., 2005 Rates of protein evolution are positively correlated 922 with developmental timing of expression during mouse spermatogenesis. Mol Biol 923 Evol 22: 1044–1052. 924
Good J. M., Dean M. D., Nachman M. W., 2008 A complex genetic basis to X-linked 925 hybrid male sterility between two species of house mice. Genetics 179: 2213–2228. 926
Good J. M., Giger T., Dean M. D., Nachman M. W., 2010 Widespread over-expression 927 of the X chromosome in sterile F1 hybrid mice. PLoS Genet 6: e1001148. 928
Grath S., Parsch J., 2012 Rate of amino acid substitution is influenced by the degree and 929 conservation of male-biased transcription over 50 myr of Drosophila evolution. 930 Genome Biol Evol 4: 346–359. 931
Graze R. M., McIntyre L. M., Main B. J., Wayne M. L., Nuzhdin S. V., 2009 Regulatory 932 divergence in Drosophila melanogaster and D. simulans, a genomewide analysis of 933 allele-specific expression. Genetics 183: 547–561. 934
Guerrero R. F., Posto A. L., Moyle L. C., Hahn M. W., 2016 Genome-wide patterns of 935 regulatory divergence revealed by introgression lines. Evolution 70: 696–706. 936
Haldane J. B. S., 1922 Sex ratio and unisexual sterility in hybrid animals. J. Genet. 12: 937 101–109. 938
Halligan D. L., Kousathanas A., Ness R. W., Harr B., Eöry L., Keane T. M., Adams D. J., 939 Keightley P. D., 2013 Contributions of protein-coding and regulatory change to 940 adaptive molecular evolution in murid rodents. PLoS Genet 9: e1003995–14. 941
44
Halligan D. L., Oliver F., Eyre-Walker A., Harr B., Keightley P. D., 2010 Evidence for 942 pervasive adaptive protein evolution in wild mice. PLoS Genet 6: e1000825–9. 943
Halligan D. L., Oliver F., Guthrie J., Stemshorn K. C., Harr B., Keightley P. D., 2011 944 Positive and negative selection in murine ultraconserved noncoding elements. Mol 945 Biol Evol 28: 2651–2660. 946
Hernando-Herraez I., Heyn H., Fernandez-Callejo M., Vidal E., Fernandez-Bellon H., 947 Prado-Martinez J., Sharp A. J., Esteller M., Marques-Bonet T., 2015 The interplay 948 between DNA methylation and sequence divergence in recent human evolution. Nucl 949 Acids Res 43: 8204–8214. 950
Holt J., Huang S., McMillan L., Wang W., 2013 Read annotation pipeline for high-951 throughput sequencing data. BCB'13: 605–612. 952
Hu Y. C., Namekawa S. H., 2015 Functional significance of the sex chromosomes during 953 spermatogenesis. Reproduction 149: R265–R277. 954
Huang S., Holt J., Kao C.-Y., McMillan L., Wang W., 2014 A novel multi-alignment 955 pipeline for high-throughput sequencing data. Database 2014: bau057. 956
Huang S., Kao C.-Y., McMillan L., Wang W., 2013 Transforming genomes using MOD 957 files with applications. BCB'13: 595–604. 958
Hvilsom C., Qian Y., Bataillon T., Li Y., Mailund T., Sallé B., Carlsen F., Li R., Zheng 959 H., Jiang T., Jiang H., Jin X., Munch K., Hobolth A., Siegismund H. R., Wang J., 960 Schierup M. H., 2012 Extensive X-linked adaptive evolution in central chimpanzees. 961 Proc Natl Acad Sci U S A 109: 2054–2059. 962
Jablonka E., Lamb M. J., 1991 Sex chromosomes and speciation. Proc. Biol. Sci. 243: 963 203–208. 964
Jaquiery J., Stoeckel S., Rispe C., Mieuzet L., Legeai F., Simon J. C., 2012 Accelerated 965 evolution of sex chromosomes in aphids, an X0 system. Mol Biol Evol 29: 837–847. 966
Johnston D. S., Wright W. W., Dicandeloro P., Wilson E., Kopf G. S., Jelinsky S. A., 967 2008 Stage-specific gene expression is a fundamental characteristic of rat 968 spermatogenic cells and Sertoli cells. Proc Natl Acad Sci U S A 105: 8315–8320. 969
Kampstra P., 2008 Beanplot: A boxplot alternative for visual comparison of distributions. 970 J Stat Softw 28: 1–9. 971
Kayserili M. A., Gerrard D. T., Tomancak P., Kalinka A. T., 2012 An excess of gene 972 expression divergence on the X chromosome in Drosophila embryos: implications 973 for the faster-X hypothesis. PLoS Genet 8: e1003200. 974
Keane T. M., Goodstadt L., Danecek P., White M. A., Wong K., Yalcin B., Heger A., 975 Agam A., Slater G., Goodson M., Furlotte N. A., Eskin E., Nellåker C., Whitley H., 976
45
Cleak J., Janowitz D., Hernandez-Pliego P., Edwards A., Belgard T. G., Oliver P. L., 977 McIntyre R. E., Bhomra A., Nicod J., Gan X., Yuan W., van der Weyden L., Steward 978 C. A., Bala S., Stalker J., Mott R., Durbin R., Jackson I. J., Czechanski A., Guerra-979 Assunção J. A., Donahue L. R., Reinholdt L. G., Payseur B. A., Ponting C. P., Birney 980 E., Flint J., Adams D. J., 2011 Mouse genomic variation and its effect on phenotypes 981 and gene regulation. Nature 477: 289–294. 982
Kessler M. D., Dean M. D., 2014 Effective population size does not predict codon usage 983 bias in mammals. Ecol Evol 4: 3887–3900. 984
Khaitovich P., Hellmann I., Enard W., Nowick K., Leinweber M., Franz H., Weiss G., 985 Lachmann M., Pääbo S., 2005a Parallel patterns of evolution in the genomes and 986 transcriptomes of humans and chimpanzees. Science 309: 1850–1854. 987
Khaitovich P., Pääbo S., Weiss G., 2005b Toward a neutral evolutionary model of gene 988 expression. Genetics 170: 929–939. 989
Khil P. P., Smirnova N. A., Romanienko P. J., Camerini-Otero R. D., 2004 The mouse X 990 chromosome is enriched for sex-biased genes not subject to selection by meiotic sex 991 chromosome inactivation. Nat Genet 36: 642–646. 992
Kim D., Pertea G., Trapnell C., Pimentel H., Kelley R., Salzberg S. L., 2013 TopHat2: 993 accurate alignment of transcriptomes in the presence of insertions, deletions and gene 994 fusions. Genome Biol 14: R36. 995
King M. C., Wilson A. C., 1975 Evolution at two levels in humans and chimpanzees. 996 Science 188: 107–116. 997
Kohn M., Kehrer-Sawatzki H., Vogel W., Graves J. A. M., Hameister H., 2004 Wide 998 genome comparisons reveal the origins of the human X chromosome. Trends Genet 999 20: 598–603. 1000
Kousathanas A., Halligan D. L., Keightley P. D., 2014 Faster-X adaptive protein 1001 evolution in house mice. Genetics 196: 1131–1143. 1002
Landry C. R., Wittkopp P. J., Taubes C. H., Ranz J. M., Clark A. G., Hartl D. L., 2005 1003 Compensatory cis-trans evolution and the dysregulation of gene expression in 1004 interspecific hybrids of Drosophila. Genetics 171: 1813–1822. 1005
Langley C. H., Stevens K., Cardeno C., Lee Y. C. G., Schrider D. R., Pool J. E., Langley 1006 S. A., Suarez C., Corbett-Detig R. B., Kolaczkowski B., Fang S., Nista P. M., 1007 Holloway A. K., Kern A. D., Dewey C. N., Song Y. S., Hahn M. W., Begun D. J., 1008 2012 Genomic variation in natural populations of Drosophila melanogaster. Genetics 1009 192: 533–598. 1010
Lattin J. E., Schroder K., Su A. I., Walker J. R., Zhang J., Wiltshire T., Saijo K., Glass C. 1011 K., Hume D. A., Kellie S., Sweet M. J., 2008 Expression analysis of G Protein-1012 Coupled Receptors in mouse macrophages. Immunome Res 4: 5–13. 1013
46
Lemos B., Araripe L. O., Fontanillas P., Hartl D. L., 2008 Dominance and the 1014 evolutionary accumulation of cis- and trans-effects on gene expression. Proc Natl 1015 Acad Sci U S A 105: 14471–14476. 1016
Lemos B., Meiklejohn C. D., Cáceres M., Hartl D. L., 2004 Rates of divergence in gene 1017 expression profiles of primates, mice and flies: stabilizing selection and variability 1018 among functional categories. Evolution 59: 126–137. 1019
Li E., Bestor T. H., Jaenisch R., 1992 Targeted mutation of the DNA methyltransferase 1020 gene results in embryonic lethality. Cell 69: 915–926. 1021
Li H., Durbin R., 2009 Fast and accurate short read alignment with Burrows-Wheeler 1022 transform. Bioinformatics 25: 1754–1760. 1023
Liao B. Y., Zhang J., 2006 Low rates of expression profile divergence in highly 1024 expressed genes and tissue-specific genes during mammalian evolution. Mol Biol 1025 Evol 23: 1119–1128. 1026
Liao B.-Y., Scott N. M., Zhang J., 2006 Impacts of gene essentiality, expression pattern, 1027 and gene compactness on the evolutionary rate of mammalian proteins. Mol Biol 1028 Evol 23: 2072–2080. 1029
Lifschytz E., Lindsley D. L., 1972 The role of X-chromosome inactivation during 1030 spermatogenesis. Proc Natl Acad Sci U S A 69: 182–186. 1031
Llopart A., 2015 Parallel faster-X evolution of gene expression and protein sequences in 1032 Drosophila: beyond differences in expression properties and protein interactions. 1033 PLoS ONE 10: e0116829. 1034
Lyon M. F., 1961 Gene action in the X-chromosome of the mouse (Mus musculus L.). 1035 Nature 190: 372–373. 1036
Lyon M. F., 1962 Sex chromatin and gene action in the mammalian X-chromosome. Am 1037 J Hum Genet 14: 135–148. 1038
Mack K. L., Campbell P., Nachman M. W., 2016 Gene regulation and speciation in house 1039 mice. Genome Res 26: 451–461.. 1040
Mank J. E., Axelsson E., Ellegren H., 2007 Fast-X on the Z: rapid evolution of sex-linked 1041 genes in birds. Genome Res 17: 618–624. 1042
Mank J. E., Vicoso B., Berlin S., Charlesworth B., 2010 Effective population size and the 1043 faster-X effect: empirical results and their interpretation. Evolution 64: 663–674. 1044
McCarthy D. J., Chen Y., Smyth G. K., 2012 Differential expression analysis of 1045 multifactor RNA-Seq experiments with respect to biological variation. Nucl Acids 1046 Res 40: 4288–4297. 1047
47
McKenna A., Hanna M., Banks E., Sivachenko A., Cibulskis K., Kernytsky A., 1048 Garimella K., Altshuler D., Gabriel S., Daly M., DePristo M. A., 2010 The Genome 1049 Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA 1050 sequencing data. Genome Res 20: 1297–1303. 1051
McManus C. J., Coolon J. D., Duff M. O., Eipper-Mains J., Graveley B. R., Wittkopp P. 1052 J., 2010 Regulatory divergence in Drosophila revealed by mRNA-seq. Genome Res 1053 20: 816–825. 1054
Meiklejohn C. D., Coolon J. D., Hartl D. L., Wittkopp P. J., 2014 The roles of cis- and 1055 trans-regulation in the evolution of regulatory incompatibilities and sexually 1056 dimorphic gene expression. Genome Res 24: 84–95. 1057
Meiklejohn C. D., Landeen E. L., Cook J. M., Kingan S. B., Presgraves D. C., 2011 Sex 1058 chromosome-specific regulation in the Drosophila male germline but little evidence 1059 for chromosomal dosage compensation or meiotic inactivation. PLoS Biol 9: 1060 e1001126. 1061
Meisel R. P., 2011 Towards a more nuanced understanding of the relationship between 1062 sex-biased gene expression and rates of protein-coding sequence evolution. Mol Biol 1063 Evol 28: 1893–1900. 1064
Meisel R. P., Connallon T., 2013 The faster-X effect: integrating theory and data. Trends 1065 Genet 29: 537–544. 1066
Meisel R. P., Malone J. H., Clark A. G., 2012a Faster-X evolution of gene expression in 1067 Drosophila. PLoS Genet 8: e1003013. 1068
Meisel R. P., Malone J. H., Clark A. G., 2012b Disentangling the relationship between 1069 sex-biased gene expression and X-linkage. Genome Res 22: 1255–1265. 1070
Meyer M., Kircher M., 2010 Illumina sequencing library preparation for highly 1071 multiplexed target capture and sequencing. Cold Spring Harb Protoc 2010: t5448. 1072
Miyata T., Hayashida H., Kuma K., Mitsuyasu K., Yasunaga T., 1987 Male-driven 1073 molecular evolution: a model and nucleotide sequence analysis. Cold Spring Harb 1074 Symp Quant Biol 52: 863–867. 1075
Molaro A., Hodges E., Fang F., Song Q., McCombie W. R., Hannon G. J., Smith A. D., 1076 2011 Sperm methylation profiles reveal features of epigenetic inheritance and 1077 evolution in primates. Cell 146: 1029–1041. 1078
Mueller J. L., Mahadevaiah S. K., Park P. J., Warburton P. E., Page D. C., Turner J. M. 1079 A., 2008 The mouse X chromosome is enriched for multicopy testis genes showing 1080 postmeiotic expression. Nat Genet 40: 794–799. 1081
Mueller J. L., Skaletsky H., Brown L. G., Zaghlul S., Rock S., Graves T., Auger K., 1082 Warren W. C., Wilson R. K., Page D. C., 2013 Independent specialization of the 1083
48
human and mouse X chromosomes for the male germ line. Nat Genet 45: 1083–1087. 1084
Nam K., Munch K., Hobolth A., Dutheil J. Y., Veeramah K. R., Woerner A. E., Hammer 1085 M. F., Great Ape Genome Diversity Project, Mailund T., Schierup M. H., 2015 1086 Extreme selective sweeps independently targeted the X chromosomes of the great 1087 apes. Proc Natl Acad Sci U S A 112: 6413–6418. 1088
Namekawa S. H., Park P. J., Zhang L.-F., Shima J. E., McCarrey J. R., Griswold M. D., 1089 Lee J. T., 2006 Postmeiotic sex chromatin in the male germline of mice. Curr Biol 1090 16: 660–667. 1091
Nguyen L. P., Galtier N., Nabholz B., 2015 Gene expression, chromosome heterogeneity 1092 and the fast-X effect in mammals. Biol Lett 11: 20150010. 1093
Ohno S., 1967 Sex chromosomes and sex-linked genes. Springer-Verlag. 1094
Okano M., Bell D. W., Haber D. A., Li E., 1999 DNA methyltransferases Dnmt3a and 1095 Dnmt3b are essential for de novo methylation and mammalian development. Cell 99: 1096 247–257. 1097
Oksanen J., Blanchet F. G., Kindt R., Legendre P., Minchin P. R., O’Hara R. B., Simpson 1098 G. L., Solymos P., Henry M., Stevens H., Wagner H., 2015 vegan: Community 1099 ecology package. R package version 2.3-1. 1100
Pages H., Gentleman R., Aboyoun P., DebRoy S., 2008 Biostrings: String objects 1101 representing biological sequences, and matching algorithms, 2008. R package version 1102 2.38.2. 1103
Potrzebowski L., Vinckenbosch N., Marques A. C., Chalmel F., Jégou B., Kaessmann H., 1104 2008 Chromosomal gene movements reflect the recent origin and biology of therian 1105 sex chromosomes. PLoS Biol 6: e80. 1106
R Development Core Team, 2014 R: A language and environment for statistical 1107 computing. 1108
Reik W., 2007 Stability and flexibility of epigenetic gene regulation in mammalian 1109 development. Nature 447: 425–432. 1110
Reik W., Dean W., Walter J., 2001 Epigenetic reprogramming in mammalian 1111 development. Science 293: 1089–1093. 1112
Rice W. R., 1984 Sex chromosomes and the evolution of sexual dimorphism. Evolution 1113 38: 735–742. 1114
Robinson M. D., Oshlack A., 2010 A scaling normalization method for differential 1115 expression analysis of RNA-seq data. Genome Biol 11: R25. 1116
Robinson M. D., McCarthy D. J., Smyth G. K., 2010 edgeR: a Bioconductor package for 1117
49
differential expression analysis of digital gene expression data. Bioinformatics 26: 1118 139–140. 1119
Sackton T. B., Corbett-Detig R. B., Nagaraju J., Vaishna L., Arunkumar K. P., Hartl D. 1120 L., 2014 Positive selection drives faster-Z evolution in silkmoths. Evolution 68: 1121 2331–2342. 1122
Saglican E., Ozkurt E., Hu H., Erdem B., Khaitovich P., 2014 Heterochrony explains 1123 convergent testis evolution in primates. bioRxiv: 010553. 1124
Sarver B. A., Keeble S., Cosart T., Tucker P. K., Dean M. D., Good J. M., Phylogenomic 1125 insights into genome evolution and speciation in mice. submitted: 1–39. 1126
Scavetta R. J., Tautz D., 2010 Copy number changes of CNV regions in intersubspecific 1127 crosses of the house mouse. Mol Biol Evol 27: 1845–1856. 1128
Schaefke B., Emerson J. J., Wang T. Y., Lu M. Y. J., Hsieh L. C., Li W. H., 2013 1129 Inheritance of gene expression level and selective constraints on trans- and cis-1130 regulatory changes in yeast. Mol Biol Evol 30: 2121–2133. 1131
Schultz N., Hamra F. K., Garbers D. L., 2003 A multitude of genes expressed solely in 1132 meiotic or postmeiotic spermatogenic cells offers a myriad of contraceptive targets. 1133 Proc Natl Acad Sci U S A 100: 12201–12206. 1134
Shen S. Q., Turro E., Corbo J. C., 2014 Hybrid mice reveal parent-of-origin and cis- and 1135 trans-regulatory effects in the retina. PLoS ONE 9: e109382–16. 1136
Shi X., Ng D. W.-K., Zhang C., Comai L., Ye W., Chen Z. J., 2012 Cis- and trans-1137 regulatory divergence between progenitor species determines gene-expression 1138 novelty in Arabidopsis allopolyploids. Nature Communications 3: 950–9. 1139
Shima J. E., McLean D. J., McCarrey J. R., Griswold M. D., 2004 The murine testicular 1140 transcriptome: characterizing gene expression in the testis during the progression of 1141 spermatogenesis. Biol Reprod 71: 319–330. 1142
Sin H. S., Ichijima Y., Koh E., Namiki M., Namekawa S. H., 2012 Human postmeiotic 1143 sex chromatin and its impact on sex chromosome evolution. Genome Res 22: 827–1144 836. 1145
Singh N. D., Davis J. C., Petrov D. A., 2005 X-linked genes evolve higher codon bias in 1146 Drosophila and Caenorhabditis. Genetics 171: 145–155. 1147
Smedley D., Haider S., Durinck S., Pandini L., Provero P., Allen J., Arnaiz O., Awedh M. 1148 H., Baldock R., Barbiera G., Bardou P., Beck T., Blake A., Bonierbale M., Brookes 1149 A. J., Bucci G., Buetti I., Burge S., Cabau C., Carlson J. W., Chelala C., 1150 Chrysostomou C., Cittaro D., Collin O., Cordova R., Cutts R. J., Dassi E., Genova A. 1151 D., Djari A., Esposito A., Estrella H., Eyras E., Fernandez-Banet J., Forbes S., Free R. 1152 C., Fujisawa T., Gadaleta E., Garcia-Manteiga J. M., Goodstein D., Gray K., Guerra-1153
50
Assunção J. A., Haggarty B., Han D.-J., Han B. W., Harris T., Harshbarger J., 1154 Hastings R. K., Hayes R. D., Hoede C., Hu S., Hu Z.-L., Hutchins L., Kan Z., Kawaji 1155 H., Keliet A., Kerhornou A., Kim S., Kinsella R., Klopp C., Kong L., Lawson D., 1156 Lazarevic D., Lee J.-H., Letellier T., Li C.-Y., Lio P., Liu C.-J., Luo J., Maass A., 1157 Mariette J., Maurel T., Merella S., Mohamed A. M., Moreews F., Nabihoudine I., 1158 Ndegwa N., Noirot C., Perez-Llamas C., Primig M., Quattrone A., Quesneville H., 1159 Rambaldi D., Reecy J., Riba M., Rosanoff S., Saddiq A. A., Salas E., Sallou O., 1160 Shepherd R., Simon R., Sperling L., Spooner W., Staines D. M., Steinbach D., Stone 1161 K., Stupka E., Teague J. W., Dayem Ullah A. Z., Wang J., Ware D., Wong-Erasmus 1162 M., Youens-Clark K., Zadissa A., Zhang S.-J., Kasprzyk A., 2015 The BioMart 1163 community portal: an innovative alternative to large, centralized data repositories. 1164 Nucl Acids Res 43: W598. 1165
Smith A. D., Chung W. Y., Hodges E., Kendall J., Hannon G., Hicks J., Xuan Z., Zhang 1166 M. Q., 2009 Updates to the RMAP short-read mapping software. Bioinformatics 25: 1167 2841–2842. 1168
Smith A. D., Xuan Z., Zhang M. Q., 2008 Using quality scores and longer reads 1169 improves accuracy of Solexa read mapping. BMC Bioinformatics 9: 128–8. 1170
Snyder R. L., 1967 Fertility and reproductive performance of grouped male mice. 1171 In:Benirschke K (Ed.), Comparative aspects of reproductive failure,, pp. 458–472. 1172
Song Q., Decato B., Hong E. E., Zhou M., Fang F., Qu J., Garvin T., Kessler M., Zhou J., 1173 Smith A. D., 2013 A reference methylome database and analysis pipeline to facilitate 1174 integrative and comparative epigenomics. PLoS ONE 8: e81148. 1175
Stamatakis A., 2014 RAxML version 8: a tool for phylogenetic analysis and post-analysis 1176 of large phylogenies. Bioinformatics 30: 1312–1313. 1177
Su A. I., Cooke M. P., Ching K. A., Hakak Y., Walker J. R., Wiltshire T., Orth A. P., 1178 Vega R. G., Sapinoso L. M., Moqrich A., Patapoutian A., Hampton G. M., Schultz P. 1179 G., Hogenesch J. B., 2002 Large-scale analysis of the human and mouse 1180 transcriptomes. Proc Natl Acad Sci U S A 99: 4465–4470. 1181
Torgerson D. G., Singh R. S., 2003 Sex-linked mammalian sperm proteins evolve faster 1182 than autosomal ones. Mol Biol Evol 20: 1705–1709. 1183
Torgerson D. G., Singh R. S., 2006 Enhanced adaptive evolution of sperm-expressed 1184 genes on the mammalian X chromosome. Heredity 96: 1–6. 1185
Turner J. M. A., 2007 Meiotic sex chromosome inactivation. Development 134: 1823–1186 1831. 1187
Turner L. M., White M. A., Tautz D., Payseur B. A., 2014 Genomic networks of hybrid 1188 sterility. PLoS Genet 10: e1004162. 1189
Veeramah K. R., Gutenkunst R. N., Woerner A. E., Watkins J. C., Hammer M. F., 2014 1190
51
Evidence for increased levels of positive and negative selection on the X 1191 chromosome versus autosomes in humans. Mol Biol Evol 31: 2267–2282. 1192
Vibranovski M. D., 2014 Meiotic sex chromosome inactivation in Drosophila. J. 1193 Genomics 2: 104–117. 1194
Vicoso B., Charlesworth B., 2006 Evolution on the X chromosome: unusual patterns and 1195 processes. Nat Rev Genet 7: 645–653. 1196
Vicoso B., Charlesworth B., 2009 Effective population size and the faster-X effect: an 1197 extended model. Evolution 63: 2413–2426. 1198
Wang P. J., McCarrey J. R., Yang F., Page D. C., 2001 An abundance of X-linked genes 1199 expressed in spermatogonia. Nat Genet 27: 422–426. 1200
Wang Z., Gerstein M., Snyder M., 2009 RNA-Seq: a revolutionary tool for 1201 transcriptomics. Nat Rev Genet 10: 57–63. 1202
Warnes G. R., Bolker B., Bonebakker L., Gentleman R., Huber W., Liaw A., Lumley T., 1203 Maechler M., Magnusson A., Moeller S., 2015 gplots: Various R programming tools 1204 for plotting data. R package version 2.17. 1205
Wittkopp P. J., Haerum B. K., Clark A. G., 2004 Evolutionary changes in cis and trans 1206 gene regulation. Nature 430: 85–88. 1207
Wittkopp P. J., Haerum B. K., Clark A. G., 2008 Regulatory changes underlying 1208 expression differences within and between Drosophila species. Nat Genet 40: 346–1209 350. 1210
Wright A. E., Harrison P. W., Zimmer F., Montgomery S. H., Pointer M. A., Mank J. E., 1211 2015 Variation in promiscuity and sexual selection drives avian rate of Faster-Z 1212 evolution. Mol Ecol 24: 1218–1235. 1213
Wu C.-I., Davis A. W., 1993 Evolution of postmating reproductive isolation: the 1214 composite nature of Haldane's rule and its genetic bases. Am Nat 142: 187–212. 1215
Yanai I., Benjamin H., Shmoish M., Chalifa-Caspi V., Shklar M., Ophir R., Bar-Even A., 1216 Horn-Saban S., Safran M., Domany E., Lancet D., Shmueli O., 2005 Genome-wide 1217 midrange transcription profiles reveal expression level relationships in human tissue 1218 specification. Bioinformatics 21: 650–659. 1219
Yang H., Wang J. R., Didion J. P., Buus R. J., Bell T. A., Welsh C. E., Bonhomme F., Yu 1220 A. H.-T., Nachman M. W., Piálek J., Tucker P. K., Boursot P., McMillan L., 1221 Churchill G. A., de Villena F. P.-M., 2011 Subspecific origin and haplotype diversity 1222 in the laboratory mouse. Nat Genet 43: 648–655. 1223
Yang Z., 2007 PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol 1224 24: 1586–1591. 1225
52
Zhang Y. E., Vibranovski M. D., Landback P., Marais G. A. B., Long M., 2010 1226 Chromosomal redistribution of male-biased genes in mammalian evolution with two 1227 bursts of gene gain on the X chromosome. PLoS Biol 8: 2527. 1228
1229