Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures...

31
Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions in gastric cancer Jaqueline Ramalho Buttura #1 , Monize Nakamoto Provisor #1,2 , Renan Valieris 1 , Vinicius Fernando Calsavara 3 ,Rodrigo Duarte Drummond 1 , Alexandre Defelicibus 1 Joao Paulo Lima 1 , Helano Carioca Freitas 4,5 , Vladmir C. Cordeiro Lima 4 , Thais Fernanda Bartelli 5 , Marc Wiedner 6 , Rafael Rosales 7 , Kenneth John Gollob 8 , Joanna Loizou 6,9 , Emmanuel Dias-Neto 5,10 , Diana Noronha Nunes 5 , Israel Tojal da Silva 1 1 Laboratory of Bioinformatics and Computational Biology, A.C.Camargo Cancer Center, So Paulo, SP 01509-010, Brazil. 2 present address: Department of Genomics, Fleury Group, So Paulo, SP, Brazil. 3 Department of Statistics and Epidemiology, A.C. Camargo Cancer Center, So Paulo, SP, Brazil. 4 Medical Oncology Department, A.C. Camargo Cancer Center, So Paulo, SP, Brazil. 5 Laboratory of Medical Genomics, A.C. Camargo Cancer Center, So Paulo, SP 01509-010, Brazil. 6 CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Lazarettgasse 14, AKH BT 25.3, 1090 Vienna, Austria. 7 University of So Paulo. Department of Mathematics and Computer Science. 14049-900, Ribeiro Preto, SP, Brazil. 8 Translational Immuno-oncology Group, A.C. Camargo Cancer Center, So Paulo, SP 01509-010, Brazil. 9 Institute of Cancer Research, Department of Medicine I, Medical University of Vienna and Comprehensive Cancer Center, Vienna 1090, Austria. 10 University of So Paulo. Laboratory of Neurosciences, Institute of Psychiatry. So Paulo, SP, Brazil. # - equal contribution; Abstract DNA mismatch repair deficiency (dMMR) leads to increased mutation load, which in turn may impact anti-tumor immune responses and treatment ef- fectiveness. Currently, there are different mutational signatures described in primary cancers that are associated with dMMR. Whether the somatic and epigenetic changes in MMR genes precede one or more dMMR signa- Preprint submitted to medRxiv April 17, 2020 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 22, 2020. . https://doi.org/10.1101/2020.04.17.20066944 doi: medRxiv preprint NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.

Transcript of Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures...

Page 1: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

Mutational signatures driven by epigenetic determinants

stratify patients for therapeutic interventions in gastric

cancer

Jaqueline Ramalho Buttura#1, Monize Nakamoto Provisor#1,2, RenanValieris1, Vinicius Fernando Calsavara3,Rodrigo Duarte Drummond1,Alexandre Defelicibus1 Joao Paulo Lima1, Helano Carioca Freitas4,5,

Vladmir C. Cordeiro Lima4, Thais Fernanda Bartelli5, Marc Wiedner6,Rafael Rosales7, Kenneth John Gollob8, Joanna Loizou6,9, Emmanuel

Dias-Neto5,10, Diana Noronha Nunes5, Israel Tojal da Silva1

1Laboratory of Bioinformatics and Computational Biology, A.C.Camargo Cancer Center,So Paulo, SP 01509-010, Brazil.

2present address: Department of Genomics, Fleury Group, So Paulo, SP, Brazil.3Department of Statistics and Epidemiology, A.C. Camargo Cancer Center, So Paulo,

SP, Brazil.4Medical Oncology Department, A.C. Camargo Cancer Center, So Paulo, SP, Brazil.

5Laboratory of Medical Genomics, A.C. Camargo Cancer Center, So Paulo, SP01509-010, Brazil.

6CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences,Lazarettgasse 14, AKH BT 25.3, 1090 Vienna, Austria.

7University of So Paulo. Department of Mathematics and Computer Science. 14049-900,Ribeiro Preto, SP, Brazil.

8Translational Immuno-oncology Group, A.C. Camargo Cancer Center, So Paulo, SP01509-010, Brazil.

9Institute of Cancer Research, Department of Medicine I, Medical University of Viennaand Comprehensive Cancer Center, Vienna 1090, Austria.

10University of So Paulo. Laboratory of Neurosciences, Institute of Psychiatry. SoPaulo, SP, Brazil.

# - equal contribution;

Abstract

DNA mismatch repair deficiency (dMMR) leads to increased mutation load,which in turn may impact anti-tumor immune responses and treatment ef-fectiveness. Currently, there are different mutational signatures describedin primary cancers that are associated with dMMR. Whether the somaticand epigenetic changes in MMR genes precede one or more dMMR signa-

Preprint submitted to medRxiv April 17, 2020

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.

Page 2: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

tures, and if so by which mechanism remains unknown. To investigate therelationship between these changes and dMMR signatures, we performed ade novo extraction of mutational signatures in a large cohort of 787 gastriccancer patients. We detected three dMMR-related signatures, one of whichclearly discriminates tumors with MLH1 gene silencing caused by hyper-methylation within its promoter (AUC = 98%). We then demonstrate thatsamples with the highest exposures to signature share features related to bet-ter prognosis, encompassing clinical and molecular aspects, as well as alteredimmune infiltrate composition, predictive of a better response to immunecheckpoint inhibitors. Overall, our analysis explored the impact of modifi-cations in MMR-related genes on shaping specific mutational signatures andwe provide evidence that patient classification based on mutational signatureexposure can identify a group of patients with a good prognosis and who arepotentially good candidates for immunotherapy.

Keywords: Mutational Signature, Bioinformatics, Gastric Cancer, DNA,Mismatch Repair Prognosis

1. Introduction1

Cancer results from the sequential accumulation of DNA alterations, in-2

cluding single nucleotide mutations [1] that arise from different endogenous3

or exogenous processes [2]. Distinct DNA-damaging processes leave charac-4

teristic nucleotide base-change footprints known as mutational signatures [3].5

Previous studies [4] have extracted distinct mutational signatures by examin-6

ing a large set of human cancer genomes and some of these have been reported7

in the COSMIC database (denoted hereafter as CS). This pan-cancer anal-8

ysis revealed significant heterogeneity of operational mutational processes,9

that encompass mutation-triggering events as diverse as the off-target activ-10

ity AID/APOBEC family of cytidine deaminases, the exposure to ultraviolet11

light, tobacco-smoking and the defective DNA mismatch repair [5, 6].12

13

Collectively, the understanding of the mechanistic basis of mutational sig-14

natures, as well as to their etiology, may provide clues for cancer diagnosis and15

hold prognostic value [7]. For example, six mutational signatures have been16

associated with BRCA1/BRCA2 dysfunction, which most likely are predic-17

tive of response to treatment with PARP inhibitors [8]. Thus, homologous18

recombination repair (HRR)-deficiency features based on these signatures19

2

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 3: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

allowed the prediction of BRCAness in breast cancer patients with 98.7%20

sensitivity [8]. Additionally, given that nucleotide excision repair (NER) de-21

ficient tumors are more sensitive to certain treatments, somatic variations in22

the ERCC2 gene, which encodes a key protein of the NER pathway, have23

also been linked with characteristic mutational signatures [9, 10]. Other mu-24

tational processes are associated with patients harboring biallelic MUTYH25

germline mutations [11], a finding that may indicate deficient base excision26

repair (BER). Such patients are eligible for genetic counseling [12] and might27

benefit from immunotherapy [13].28

29

In addition to HRR, NER and BER repair pathways, another mecha-30

nism underlying oncogenic genomic variations, with important effects on31

anti-tumor immune responses occur in tumors with impaired DNA mismatch32

repair (MMR), which harbor elevated frequencies of single-nucleotide vari-33

ants (SNVs) and exceptionally high indel rates [14]. Recent studies demon-34

strated that various MMR-deficient (dMMR) tumor types (gastrointestinal,35

glioblastoma, endometrial and prostate) are more responsive to programmed36

cell death protein 1 (PD1) immune checkpoint inhibitors as compared to37

MMR-proficient tumors [15, 16, 17]. A set of four mutational signatures38

(CS-6, CS-15, CS-20, and CS-26) have been associated with dMMR. Never-39

theless, it is still unclear if somatic and epigenetic changes in MMR genes40

lead to one or more dMMR signatures.41

42

In this study we investigated the significance of molecular events in MMR-43

genes that shape characteristic mutation signatures found in MMR-deficient44

gastric adenocarcinomas. The presence of these signatures was evaluated for45

their prognostic value in a cohort of 787 gastric cancer patients with pub-46

licly available data, including 439 patients from the TCGA, and validated47

in a second cohort composed of 170 gastric cancer patients [18]. We fur-48

ther investigated whether local tumor immune response and prognosis varied49

according to MMR-deficiency exposure load. The consequences of these ap-50

pear to be predictive of the responsiveness to immune checkpoint blockade51

and may be used to support treatment strategies in the future.52

3

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 4: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

2. Materials and Methods53

2.1. Clinical and genomic data from public cohort54

The non-redundant public cohorts assessed here contained clinical and55

molecular information of gastric adenocarcinoma samples provided by: i)56

The Cancer Genome Atlas (TCGA, N=439), ii) cBioPortal, N=226; iii) and57

International Cancer Genome Consortium (ICGC, N=122), totaling 787 pa-58

tients (Supplementary Material Table S1). TCGA data was assessed on59

October 4th, 2018 and corresponds to the MC3 variant calling project, which60

is a comprehensive effort to detect consensus mutations and forms the basis61

of Pan-Cancer Atlas initiative [19]. cBioPortal and ICGC cohorts’ com-62

prise Asian samples which were last assessed on January 9th, 2019. Raw63

reads from matched non-tumor exomes from TCGA dataset, encompassing64

the MMR genes were downloaded and used to detect the germline SNVs65

following the Genome Analysis Toolkits (GATK)s best practice for germi-66

line alterations calling. We also used additional filters considering mutations67

with VAF (variant allele frequency) ≥ 0.3 and minimum depth coverage of68

10 reads. Furthermore, dbNSFP MetaLR rankscore was used to filter out69

(≤ 0.6) the synonymous mutations. The methylation levels in the form of70

beta-values ranging from 0 to 1 were addressed for the TCGA cohort [20].71

We then used the CpG sites in the promoter of MMR genes to detect those72

that were hypermethylated or hypomethylated. The baseline clinical features73

are summarized in Supplementary Material Table S2.74

2.2. Clinical and genomic data from validation cohort75

Patients in the validation cohort were prospectively enrolled in an insti-76

tutional study to unveil the epidemiology and genomics of gastric adenocar-77

cinomas in Brazil [18]. This study was approved by the local ethics com-78

mittee and all participants provided written informed consent. An overview79

of the clinical characteristics of patients in the validation cohort is provided80

in Table S3. Genomic DNA from frozen tissues (n=165) was extracted with81

AllPrep DNA/RNA Mini Kit (Qiagen), QIASymphony THC 400 (Qiagen)82

or phenol/chloroform/isoamyl alcohol precipitation. gDNA from FFPE tis-83

sue (n=4) was extracted with RecoverAll Total Nucleic Acid Isolation Kit84

(Thermo Fisher), and there was one sample from gastric wash. Exome li-85

braries were prepared using Agilent SureSelect V6 kit and sequenced us-86

ing Illumina platforms (HiSeq4000, 100bp, n=33; Novaseq, 150bp, n=137 -87

4

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 5: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

pairedend reads for both). The raw sequencing data (.fastq files) were de-88

posited in SRA (http://www.ncbi.nlm.nih.gov/sra) under accession number89

PRJNA505810.90

91

For our local independent validation cohort, the somatic SNVs were called92

by using an in-house pipeline following the Broad Institute GATK Best Prac-93

tices guidelines [21] as described [6]. Briefly, the raw reads were aligned94

using Burrows Wheeler Aligner (BWA-mem) with default settings to assem-95

bly GRCh38. Next, alignment files in SAM format were converted to BAM96

files, sorted and filtered to exclude reads with mapq score <15. The re-97

tained reads were processed using SAMtools (v1.9) and Picard (v3.8) (https :98

//broadinstitute.github.io/picard/) respectively, which excludes low-quality99

reads and PCR duplicates. Finally, the somatic SNVs calling was performed100

for the whole exome data from analysis-ready BAM files using Mutect2 (v3.8)101

for tumor samples and further with a panel of 16 unmatched non-tumor leuko-102

cyte samples. Extensive filtering was applied to remove low mapping quality,103

as well as strand, position bias and OxoG oxidative artifacts. Furthemore,104

any residual germline mutations from the database of germline mutations105

of of gnomAD (https://gnomad.broadinstitute.org/) and Online Archive of106

Brazilian Mutations (ABraOM, available at http://abraom.ib.usp.br/) were107

removed.108

2.3. Mutational signatures estimation109

All somatic SNVs of the six classes (C>A, C>G, C>T, T>A, T>C and110

T>G) were mapped onto trinucleotide sequences by including the 5’ and 3’111

neighboring base-contexts. Next, the SNV spectrum with 96 trinucleotide112

mutations types for all samples were loaded into signeR [22] to estimate113

the optimal number of mutational signatures, which is based on the me-114

dian Bayesian Information Criterion (BIC) value. We next used the cosine115

similarity to compare the extracted de novo mutational signatures to those116

described in the COSMIC signatures (v2), considering cosine similarity > 0.7117

as a measure of closeness to COSMIC signatures. Patients with higher ex-118

posure for a given signature (Exposure value greater or equal to the third119

quartile) were named as high and those with lower exposure values (Exposure120

value less than third quartile) were named as low.121

5

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 6: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

2.4. Molecular features122

We used the MSIseq [23] software for microsatellite instability (MSI) sta-123

tus prediction (MSI-H and Non-MSI-H) from whole exome data. Briefly, this124

software is based on four machine-learning frameworks, which requires a cat-125

alog of somatic SNVs and microindels of samples, a file containing the exact126

locations of mononucleotides (length≥5) and microsatellites consisting of di,127

tri, and tetranucleotide repeats, as annotated in the simpleRepeats track and128

avaliable at http : //hgdownload.cse.ucsc.edu/goldenpath/hg19/database/.129

MSIseq is available at The Comprehensive R Archive Network (CRAN).130

131

Consistent with a method previously proposed by Chalmers et al. [24],132

the Tumor Mutational Burden (TMB) was calculated as the total number of133

mutations divided by the length of the target region in megabases.134

135

The tumor heterogeneity estimation was performed using math.score (MATH)136

function from package maftools version 3.8 [25]. A higher MATH score indi-137

cates increased tumor heterogeneity.138

139

Neoantigen count: The list of neoantigens available for 77 TCGA-STAD140

samples was extracted from The Cancer Immunome Atlas (TCIA) [26], as-141

sessed on June 17th, 2017 at https : //tcia.at/neoantigens.142

2.5. Statistical analyses143

The baseline patient characteristics are expressed as absolute and relative144

frequencies for qualitative variables and as the mean ± standard deviation145

(SD) for quantitative variables. Mutational signature exposure and TMB146

were considered as continuous variables. The association between qualita-147

tive variables was evaluated by chi-squared test or Fishers exact test, as148

appropriate.149

150

Overall survival functions were estimated by the Kaplan-Meier estima-151

tor and the log-rank test was used to compare the survival functions among152

groups (eg, patients with higher mutational signature exposure (Shigh) ver-153

sus other (Slow)). The Cox semiparametric proportional hazards model was154

fitted to the dataset to describe the relationship between overall survival and155

the main clinical features. Hazard ratio (HR) and 95% confidence intervals156

(95%CI) were calculated for all variables. A backward stepwise selection al-157

gorithm was applied, with different significance levels to enter (p=0.10) and158

6

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 7: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

remain (p=0.05) in the model. Variables were removed from the model if they159

were non-significant or acted as confounders (change in coefficient >20%).160

The proportional hazards assumption was assessed based on the Schoenfeld161

residuals [27]. There was evidence that covariates had a constant effect over162

time in all cases.163

164

Multivariate analyses were performed considering the main clinical fea-165

tures (such as age, pathological stage, Lauren tumor subtype and ethnicity),166

previously associated with overall survival, and with exposures of mutational167

signatures associated with dMMR, besides molecular features TMB and MSI168

status. Forest plots were created based on the final multiple Cox regression169

model. Metastatic patients were excluded from these analyses. In addition,170

we fitted simple and multiple logistic regression models in order to assess the171

effect of S2, S4 and S5 exposures in the MLH1 methylation. Overall perfor-172

mance, calibration, and the discriminatory power of the final multiple logistic173

regression model were assessed using the Brier score, the HosmerLemeshow174

goodness-of-fit test, and the area under the receiver operating characteristic175

(ROC) curve (AUC), respectively [28]. Besides, we assessed the goodness-of176

fit through a Q-Q plot. The significance level was fixed at 5% for all tests177

(two-sided). Statistical analysis was performed using R software (v3.5).178

2.6. Mutational signatures in cell lines179

The CRISPR-Cas9 knockout clones for MLH1 were generated in human180

HAP1 cells using the following guide RNA (gRNA) sequence: 5 - AAGA-181

CAATGGCACCGGGATC - 3. Clonal populations with a frameshift muta-182

tion within MLH1 were subsequently cultured for three months to allow for183

the accumulation of mutations during cellular division [29]. To identify muta-184

tions, genomic DNA was submitted to whole genome sequencing (WGS). De185

novo somatic mutations including substitutions, indels and rearrangements186

in subclones were obtained by removing all mutations seen in parental clones.187

Next, SNVs were mapped onto trinucleotide sequences by including the 5’188

and 3’ neighboring base-contexts and then the level of samples’ exposure to189

previously found mutational signatures was estimated [22].190

2.7. Significantly mutated genes and pathway analysis191

To assess the impact of dMMR pathway on the genes throughout the192

genome, we searched for genes more frequently mutated than would be ex-193

pected by chance [30]. The proper gene symbol annotation in MAF (Mu-194

7

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 8: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

tation Annotation Format) files was addressed by maftools (v.8) [25] (pre-195

pareMutSig function) and then loaded into online MutSigCV server (v.1.3.4)196

(https : //cloud.genepattern.org/gp/pages/index.jsf). The oncoplots were197

built by using the significantly mutated genes from MutSigCV analysis. The198

significantly mutated genes associated to S4high and S4low group were entered199

into Gene Set Enrichment Analysis (GSEA) according to the Investigate gene200

sets function available at MSigDB (Molecular Signatures Database (v7.0),201

http : //software.broadinstitute.org/gsea/msigdb/index.jsp). KEGG, RE-202

ACTOME, GO biological process, oncogenic signatures (module C6) and im-203

munologic signatures (module C7) were also considered to compute overlaps,204

and the top 20 gene sets with FDR q-value <0.05 were used to summarize205

these analysis.206

2.8. Inflammatory infiltrate and immune aspects207

We estimated the cellular composition from the bulk expression datasets208

(TCGA) by using two complementary approaches. For both analyses, FPKM209

(Fragments Per Kilobase Million) from 380 tumor TCGA-STAD samples210

were used as normalized gene expression profile, retrieved on January 22th,211

2018. First, the CIBERSORT software based on the deconvolution method212

for characterizing cell composition of complex tissues from their gene ex-213

pression profiles, was used [31]. CIBESORT takes advantage of a validated214

leukocyte gene signature matrix, termed LM22. This gene signature contains215

547 genes that distinguish 22 human hematopoietic cell phenotypes, includ-216

ing seven T cell types, nave and memory B cells, plasma cells, natural killer217

(NK) cells, and myeloid subsets. Simultaneously, a recent technique based on218

gene set enrichment analysis (GSEA) termed as xCell [32] was used to infer219

34 immune cell types. Herein, we used this method to confirm the findings220

by CIBERSORT.221

222

CIBERSORT analysis was performed online using a public server (http :223

//cibersort.stanford.edu/) for characterizing absolute and relative immune224

cell composition with 1000 permutations and disabled quantile normaliza-225

tion as set parameters. From the 380 TCGA-STAD tumor samples, 215226

215 samples (56%) yielded data on infiltrating immune cells (p-value<0.05),227

which were considered for further analysis (50 samples as S4high and 165 as228

S4low). We also used the second approach known as xCell to reinforce the229

findings when comparing S4high and S4low samples. xCell analysis was per-230

formed using the R package with default parameters (available at https :231

8

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 9: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

//github.com/dviraran/xCell). In order to verify the immune effector re-232

sponse present in S4high and S4low samples, differential expression of key im-233

munoregulatory/inflammatory or cytotoxic markers was also performed. The234

comparison of the groups in this section was performed by Mann-Whitney U235

Test with statistical significance set at p-value<0.05.236

237

We also used the pre-processed immune subtypes previously described by238

Thorsson et al. [33] for TCGA samples (available for 103 in S4high samples239

and 285 in S4low which can be assessed in Table S2).240

3. Results241

3.1. Mutational signatures242

Using signeR [22] analysis to estimate de novo mutation signatures across243

three gastric cancer cohorts, we identified seven (denoted hereafter as S[1-7])244

mutational signatures (Figure 1A) which are related to signatures described245

in the COSMIC database by cosine similarity scores (Figure 1B). Signature246

1 (S1) is associated with endogenous mutational processes initiated by spon-247

taneous deamination of 5-methylcytosine (CS-1); Signatures S2, S4 and S5248

are associated with defective DNA mismatch repair and/or microsatellite in-249

stability (CS-6/CS-15, CS-20 and CS-21/CS-26 respectively); Signature S3250

is related with failure of DNA-double strand break repair by homologous re-251

combination (CS-3); Signature S6 related to CS-17 with unknown etiology;252

and Signature S7 is associated with error-prone polymerase activity (POLE253

(DNA Polymerase Epsilon, Catalytic Subunit), CS-10).254

255

CS-3 (S3 in Figure 1 and Figure S1) was the predominant signature found256

here, supporting previous results which have characterized this signature in257

gastric cancer samples with a very high prevalence of small indels and base258

substitutions due to failure of DNA double-strand break repair by homolo-259

gous recombination [34]. This finding suggests that 7-12% of gastric cancers260

may benefit from either platinum therapy or PARP inhibitors. However, no-261

tably, another group of patients not exposed to signature CS-3 was found to262

be highly exposed to signatures associated with dMMR (S2, S4 and S5 in263

Figure S1). Thus, our analysis identified a distinct group of gastric cancer264

patients harboring features that might have therapeutic relevance and which265

are further investigated here.266

9

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 10: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

Figure 1: De novo mutational signatures in gastric cancer. (a) Mutational signa-tures in gastric cancer from 787 patients from TCGA, ICGC and cbioPortal cohorts. (b)Heatmap with cosine similarities between de novo mutational signatures and COSMICsignatures.

C>A C>G C>T T>A T>C T>G

0.00

0.20

S1

0.0

0.2

0.4

S2

0.00

0.03 S3

0.00

0.10

S4

0.00

0.10

S5

0.00

0.10

0.20

S6

0.0

0.2

0.4

AC

AA

CC

AC

GA

CT

CC

AC

CC

CC

GC

CT

GC

AG

CC

GC

GG

CT

TC

AT

CC

TC

GT

CT

AC

AA

CC

AC

GA

CT

CC

AC

CC

CC

GC

CT

GC

AG

CC

GC

GG

CT

TC

AT

CC

TC

GT

CT

AC

AA

CC

AC

GA

CT

CC

AC

CC

CC

GC

CT

GC

AG

CC

GC

GG

CT

TC

AT

CC

TC

GT

CT

AT

AA

TC

AT

GA

TT

CT

AC

TC

CT

GC

TT

GT

AG

TC

GT

GG

TT

TT

AT

TC

TT

GT

TT

AT

AA

TC

AT

GA

TT

CT

AC

TC

CT

GC

TT

GT

AG

TC

GT

GG

TT

TT

AT

TC

TT

GT

TT

AT

AA

TC

AT

GA

TT

CT

AC

TC

CT

GC

TT

GT

AG

TC

GT

GG

TT

TT

AT

TC

TT

GT

TT

S7

(a) Mutational signatures called by signeR.

CS

−1

CS

−2

CS

−3

CS

−4

CS

−5

CS

−6

CS

−7

CS

−8

CS

−9

CS

−10

CS

−11

CS

−12

CS

−13

CS

−14

CS

−15

CS

−16

CS

−17

CS

−18

CS

−19

CS

−20

CS

−21

CS

−22

CS

−23

CS

−24

CS

−25

CS

−26

CS

−27

CS

−28

CS

−29

CS

−30

S1

S2

S3

S4

S5

S6

S70.2

0.4

0.6

0.8

(b) Heatmap with cosine similarity between de novo signaturesand COSMIC signatures.

10

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 11: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

3.2. dMMR signatures and prognostic features267

We reasoned that dMMR signature exposure could hold prognostic value268

in gastric cancer. Therefore, we first evaluated the influence of each dMMR269

signature exposure and the main possible clinical and molecular prognostic270

features such as age at diagnosis, ethnicity, tumor pathological stage, Lauren271

classification, anatomic site, TMB and microsatellite instability (MSI) sta-272

tus on overall survival (OS) fitting simple Cox regression model (Figure S2).273

Data from 584 gastric cancer patients with available vital status information274

(Alive/Dead) and without metastasis at diagnosis were included in simple275

and multiple Cox regression models. The median follow-up time for these276

patients was 28.9 months (with a 95% confidence interval: 95%CI 25.8-32.1)277

and the mean follow up time was 36.2 months (95%CI 32.9-39.5).278

279

We then fitted a multiple Cox regression model to the dataset using prog-280

nostic features (variables with significant p-value are shown at Figure S2),281

and observed that S4 exposure burden was associated with improved OS282

compared with other dMMR signatures (hazard ratio [HR] 0.59 with 95%CI283

0.37-0.96) (Figure 2A, Figure S3). Thus, we focused on signature S4, which284

has the potential to offer important clinically actionable information for treat-285

ment selection.286

287

11

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 12: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

Figure 2: Forest plots showing the hazard ratio estimated according to multipleCox regression models for overall survival.

(a) Mutational signature S4

Variable

Age

Race (White)

StageIIIII

S4

N (total)

484

484

484484

484

N (deaths)

188

188

188188

188

Hazard Ratio [95% CI]

1.2565 [1.071,1.475]

1.4467 [1.08,1.938]

1.8439 [0.993,3.422]2.8682 [1.569,5.244]

0.598 [0.373,0.958]

P−Value

0.005 **

0.013 *

0.0520.001 **

0.033 *

0.5 1 1.5 2 2.5 3 3.5 4 4.5 5Hazard ratio [95% CI]

(b) Tumor Mutational Burden (TMB)

Variable

Age

Race (White)

StageIIIII

TMB

N (total)

484

484

484484

484

N (deaths)

188

188

188188

188

Hazard Ratio [95% CI]

1.2468 [1.064,1.461]

1.3829 [1.032,1.853]

1.8023 [0.971,3.346]2.878 [1.574,5.262]

0.6614 [0.468,0.934]

P−Value

0.006 **

0.03 *

0.0620.001 **

0.019 *

0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 5.5Hazard ratio [95% CI]

(c) MSI status

Variable

Age

Race (White)

StageIIIII

MSI−high

N (total)

484

484

484484

484

N (deaths)

188

188

188188

188

Hazard Ratio [95% CI]

1.2277 [1.045,1.442]

1.463 [1.092,1.96]

1.8523 [0.998,3.437]2.8699 [1.569,5.249]

0.7755 [0.508,1.183]

P−Value

0.012 *

0.011 *

0.0510.001 **

0.238

0.5 1 1.5 2 2.5 3 3.5 4 4.5 5Hazard ratio [95% CI]

Our analysis also revealed that a higher TMB was also associated with288

improved OS (HR=0.66; 95%CI 0.46-0.93) (Figure 2B), consistent with pre-289

vious studies [35]. Distinctly from TMB, we found no association between290

predicted MSI-H status and improved OS (HR=0.78; 95%CI 0.51-1.18) (Fig-291

ure 2C). The calibration curves for these models considering overall survival292

in 2 years is given in Figure S4, which indicates that all models are adequate.293

294

We used the maxstat function (available in R language) to define groups295

according to the S4 exposure. The optimal cutpoint was within the range of296

12

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 13: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

highest quartile (Q3), and thus patients with S4 exposure ≥Q3 were labeled297

as S4high otherwise S4low (<Q3). The survival curves from patients in the298

S4high and S4low groups were statistically different (p-value<0.03) with a me-299

dian OS of 72 months (95%CI 48.0-∞) in the S4high group as compared to300

37 months (95%CI 28.0-68.0) in the S4low group (Figure S5). Next, we used301

an independent gastric cancer cohort to validate that S4high has a survival302

benefit. Kaplan-Meier was performed to analyze patients in the validation303

cohort grouped based on samples’ exposure level of signature S4. By com-304

paring patients in the S4high group whose median OS was not reached at the305

time of 5 years (95%CI 38.2-∞) and the S4low group with a median OS of306

48 months (95%CI 21.3-∞), the data indicated a trend toward a survival307

benefit for S4high group, supporting our previous findings (Figure S6).308

3.3. dMMR signatures associated with MLH1 hypermethylation309

Although the genes associated with dMMR are known, the underlying310

gene modifications that lead to each of the dMMR signatures still remain311

poorly characterized.312

313

To improve our understanding of the determinant changes that influ-314

ence the different dMMR signatures detected in this study, we first looked315

for somatic and germline SNVs and indels in MMR genes (LIG1, POLE,316

EXO1, MLH1, MLH3, MSH2, MSH3, MSH5, MSH6, PCNA, PMS1, PMS2,317

PMS2L3, PMS2L4, POLD1, POLD2, POLD3, POLD4 and SSBP1 ). We318

observed that only 6% of patients (12/197) harbored somatic variations in319

the MLH1 gene within either S2high or S4high groups and no mutated patients320

within Slow groups. Likewise, our results showed that 9% of the patients in321

the S2high group harbored somatic variations in the MLH3 gene. We also322

found that only 8% of patients in the S5high group (8/100 considering TCGA323

cohort) harbor germline mutations in the MSH5 gene. Altogether, we could324

observe only few cases harboring mutated MMR genes with some association325

with the Shigh groups.326

327

We next searched for epigenetic changes in the MMR genes. In line with328

previous studies [36, 37], we observed downregulation of MLH1 gene expres-329

sion driven by hypermethylation of its promoter (Figure S7). To further as-330

sess how the mutational exposure is associated with epigenetic changes in the331

MLH1 gene, simple and multiple logistic regression models were fitted to the332

13

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 14: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

dataset (Table 1). This analysis revealed that S4 exposure burden was associ-333

ated with an increased chance of MLH1 promoter being methylated (odds ra-334

tio [OR]=22.561; 95%)CI 7.909-64.353). On the other hand, S5 exposure bur-335

den was associated with a decreased chance of MLH1 promoter methylation336

(OR=0.107; 95%CI 0.048-0.238).Finally, no difference was observed in S2337

exposure burden (OR=3.682; 95%CI 0.881-15.386). The performance of this338

model was adequate (HosmerLemeshow goodness-of-fit test χ2(8)=10.257;339

p-value=0.247) (Figure 3A), with a good performance observed (Brier score340

0.0364), and an excellent power of discrimination (AUC=0.982; 95%CI 0.971-341

0.994) (Figure 3B). Using the Youden index, the best cutoff value (threshold)342

was 0.125, which had a sensitivity of 95.45% and specificity of 95.82% (Fig-343

ure 3B).344

345

Table 1: Simple and multiple logistic regression models for MLH1 methylationand dMMR mutational signatures

VariableSimple logistic regression model

Coefficient Standard ErrorCI(95%) for coefficient

p-value ORCI(95%) for OR

Lower Upper Lower UpperExpS2 4.772 0.5226 3.748 5.796 < 0.0001 118.155 42.424 329.078ExpS4 3.2424 0.3469 2.562 3.922 < 0.0001 25.595 12.968 50.518ExpS5 0.2725 0.1674 −0.056 0.601 0.104 1.313 0.946 1.823

VariableMultiple logistic regression model

Coefficient Standard ErrorCI(95%) for coefficient

p-value ORCI(95%) for OR

Lower Upper Lower UpperIntercept −2.640 0.300 −3.227 −2.052 < 0.0001

ExpS2 1.304 0.730 −0.126 2.733 0.074 3.682 0.881 15.386ExpS4 3.1162 0.5348 2.068 4.164 < 0.0001 22.561 7.909 64.353ExpS5 −2.231 0.4067 −3.028 −1.434 < 0.0001 0.107 0.048 0.238

14

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 15: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

Figure 3: Performance and power discrimination of logistic model. (a) QQ Plotshowing that the propensity score model had an adequate level of calibration according tothe Hosmer-Lemeshow test. (b) Summary of ROC curve showing the power of discrimi-nation for MLH1 methylation at signature S4.

● ●●

●● ●

●●

●●

●●

●●

● ●●

● ● ●●●

●● ●● ●

●● ● ●● ●● ● ●

● ●

●● ●● ●●

● ●● ●

● ● ●

● ●

●● ● ●●

● ●●

●● ●●● ●● ●●● ●● ●●● ● ●●● ●● ●● ●●● ●●●● ●●

● ● ●●●● ●● ●● ● ●

●●

●● ●●

●● ●● ●

●●

● ●

● ● ●●●● ●● ●● ●●

● ●●● ● ● ●●● ●● ● ●● ●●● ● ● ●●● ● ●● ● ●● ●

● ●

● ● ● ●●●●●

●●● ● ●● ●●● ●

●●●

●●

●● ●●●

●● ● ●● ●

● ●●● ●

●●●

●●●

● ●

●● ●● ● ●●

● ●● ●● ●●● ● ●● ●● ● ●

● ● ●● ●●● ●●●● ●● ●●●●●

●●● ● ●●●

●● ●

● ●●● ●●●

●●

● ●●

● ●●● ● ●● ●●● ●

●●●● ●●● ●●● ●● ●●

●● ●●● ●●

−3 −2 −1 0 1 2 3

−4

−3

−2

−1

01

23

Quantiles of standard normal

Qua

ntile

s of

dev

ianc

e re

sidu

als

(a) QQ Plot

Specificity

Sen

sitiv

ity

1.0 0.8 0.6 0.4 0.2 0.0

0.0

0.2

0.4

0.6

0.8

1.0

0.125 (0.958, 0.955)

AUC=0.983, CI(95%)=[0.972−0.994]

(b) ROC Curve

It should be noted that these observations suggest that neither germline346

SNVs, somatic SNVs or indels are the major modifications affecting gene347

expression levels, moreover, in fact, MLH1 promoter is hypermethylated in348

almost 60% of individuals in the S4high group (beta-value≥ 0.3). Taken349

together, we conclude that the main mechanism of impaired MMR associated350

with the signature S4 (CS-20) in gastric cancer samples is driven by MLH1351

promoter hypermethylation. Moreover, by using genomic sequencing data352

from three HAP1 cells samples (2 MLH1KO and 1 MLH1WT cell), we have353

observed that MLH1KO cell lines have higher exposures of the S4 signature,354

while the parental cell line has higher exposure of the S5 signature (Figure355

S8), which identifies the absence of MLH1 as the cause of the S4 mutational356

signature.357

3.4. Clinical and molecular features358

Given our observation that the dMMR signature S4 was associated with359

better prognosis, and possibly related to an epigenetic causative mechanism,360

we next tried to further characterize the clinical and molecular features within361

15

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 16: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

S4high and S4low groups.362

363

Previously defined clinical features that were associated with improved364

prognosis in gastric cancer were also enriched in the S4high group (Table 2),365

such as distal anatomic site and intestinal histology [38]. On the other hand,366

known clinical variables associated with a worse prognosis in gastric cancer,367

such as cardia/proximal anatomic site, diffuse histology, positive lymph node368

metastasis (stage N+) and advanced pathological stages (stage III and IV)369

[38] were significantly higher in the S4low group (Table 2). In addition, the370

predicted MSI-H status, MSI and POLE molecular subtypes were also en-371

riched in the S4high group, while genomically stable (GS) and chromosomal372

instability (CIN) molecular subtypes were enriched in the S4low group (Table373

2). Our data also reveal that most cases of MSI-H (n=119/160, 74%, Table 2)374

were grouped within the S4high group, however, it has not escaped from our375

attention that a smaller fraction of MSI-H cases were unexpectedly grouped376

in the S4low group. Similarly, we have also found Non-MSI-H patients in377

the S4high group. Comparing the survival curves of these groups, we found378

MSI-H within S4low group trends to have a worse prognosis with 9.07 months379

as median OS (95%CI 9.0-∞) than Non-MSI-H within S4high group with 53380

months as median OS (95%CI 20.0-∞). Similarly, diffuse histologic subtype381

grouped into S4high (median OS not reached, 95%CI 24.0−∞) trends to have382

a better prognosis than intestinal histologic subtype grouped into S4low (me-383

dian OS of 43.1 months, 95%CI 28.0-∞) (Figure S9). Thus, we conclude that384

the mutational signature classification was able to improve the stratification385

of patients within the prognostic groups, independent of their previous clin-386

ical or molecular classification.387

388

To further understand tumor heterogeneity in the S4high and S4low pa-389

tients, we examined the spread of allele frequencies according to the quan-390

titative measure of the degree of heterogeneity [39]. We then performed a391

correlation analysis based on this score, S4 exposure and TMB (Figure 4).392

We noted that the correlation of the tumor heterogeneity score (MATH) with393

either TMB or S4 exposure are opposite in the S4high and S4low groups. In394

the S4high group, there was a negative correlation of S4 exposure or TMB395

with MATH and, in the S4low group there was a positive correlation. We also396

observed that the MATH score is higher in the S4low than the S4high group397

(p-value=3.711x10−12). Lastly, the TMB and neoantigen load (by TCIA)398

have shown a positive correlation with signature S4 exposure in both groups399

16

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 17: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

Table 2: The clinical-pathological features of gastric cancer according to S4dMMR mutational signature groups

All(n = 787) S4low(n = 590) S4high(n = 197)Pvalue

N % N % N %

Age (mean ± SD) 64.17 ± 11.73 63.44 ± 12 66.25 ± 10.68 0.0024Gender 767 97 571 91 196 99Female 274 36 192 34 82 42

0.0473Male 493 64 379 66 114 58Race 726 92 546 93 180 91White 275 28 203 37 72 40

0.7427Black 13 2 11 2 2 1Asian 437 60 331 61 106 59Other 1 0 1 0 0 0

Anatomic Site 627 80 495 84 132 67Cardia/Proximal 168 27 148 30 20 15

0.0015Fundus/Body 212 34 166 34 46 35

Antrum/Distal 242 39 178 36 64 48Other 5 1 3 1 2 2

Histology Lauren 467 59 383 65 84 43Diffuse 150 32 132 34 18 21

0.0041Intestinal 301 64 235 61 66 79Mixed 16 3 116 4 0 0

Stage T 712 90 526 89 186 94T1− T2 181 25 132 25 49 26

0.8116T3− T4 531 75 394 75 137 74

Stage N 712 90 526 89 186 94N0 173 24 115 22 58 31

0.0144N+ 539 76 411 78 128 69

Stage M 707 90 524 89 183 93M0 623 88 461 88 162 89

0.9422M1 62 1 47 9 15 8MX 22 3 16 3 6 3

Pathological Stage 715 91 546 93 169 86I 85 12 58 11 27 16

0.0386II 220 31 160 29 60 36III 289 40 228 42 61 36IV 121 15 100 18 21 12

Molecular Subtype 403 51 289 49 114 58CIN 223 55 206 71 17 15

<0.0001GS 50 12 47 16 3 3

EBV 38 9 33 11 5 4MSI 85 21 0 0 85 75

POLE 7 2 3 1 4 4MSIseq Status 787 100 590 100 197 100

MSI −H 160 20 41 7 119 60<0.0001

Non−MSI −H 627 80 549 93 78 40Immune Subtype 388 49 285 48 103 52

C1 128 33 101 35 27 26

<0.0001C2 209 54 135 47 74 72C3 35 9 34 12 1 1C4 9 2 8 3 1 1C6 7 2 7 2 0 0

17

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 18: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

(Figure 4).400

401

These findings suggest that tumors highly exposed to signature S4 are402

more homogeneous in the S4high group and, together with high TMB and403

high neoantigen load, a reduced tumor heterogeneity appears to be determi-404

nant of a good prognosis. In this sense, we speculate that the methylation of405

MLH1 promoter associated with the S4high signature may be an early event406

in tumorigenesis.407

408

In order to check if signature S4 is represented equally across the three409

cohorts, we compared their samples exposures. To avoid performing statis-410

tical tests with different numbers of samples, a subsampling procedure was411

applied, randomly selecting 24 samples from each cohort and then perform-412

ing the KruskalWallis test. This was repeated 1000 times, always generating413

p-values above 0.05, leading us to conclude that the S4 exposure is similar414

for all cohorts.415

3.5. Significantly mutated genes and related pathways in S4 dMMR groups416

MMR-deficiency leads to an elevated frequency of mutations in the genome417

[14] and the consequences of MMR-deficiency may be derived from func-418

tional alterations in many distinct genes. In order to verify the existence419

of a common set of genes commonly mutated and their main related path-420

ways between S4high and S4low groups (Supplementary Material Table S4)421

we investigated the presence of consistent SNVs differentiating these groups.422

At least one somatic mutation, including SNVs and indels, was detected for423

83.25% of S4high patients, and 78.64% within the S4low group. We observed424

an increased number of deletions in the S4high group, while within the S4low425

group the mutations basically consisted of SNVs. These results are expected426

when considering that MSI/dMMR would lead to a higher number of dele-427

tions [14].428

429

The gene set found as significantly mutated in the S4high group is com-430

posed of 102 genes. The most commonly mutated genes in this group are431

ARID1A (42%), KMT2D (35%) and TP53 (31%). In addition, there are an-432

other 56 genes presenting mutations in at least 10% of patients (Table S4A).433

The enrichment analysis of these mutated genes identified pathways related434

to immune cell differentiation, protein and RNA metabolism, gene expres-435

sion regulation, cell differentiation and embryogenesis (Table S4B). It was436

18

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 19: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

Figure 4: Scatter plots showing the Spearman correlation between molecularfeatures. Blue dots represent patients in S4low group and red those in S4high group.

●●

●●

●●

●●

●●

●●

● ●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

● ●

● ●

●●

●●

● ●

●●

●●

●●

●●

●●

●●

●●

● ●

●●

●●

●●

●●

● ●

●●

● ●

●●

● ●

●●

●●

● ●

●●

R = 0.76 , p < 2.2e−16R = 0.49 , p < 2.2e−16

1e−06

1e−05

1e−04

1e−03

1e−02 1e−01 1e+00 1e+01 1e+02log10(TMB)

log1

0(E

xp_S

4)

class_0.25_S4 ●a ●aS4high S4low

(a) TMB and S4 exposure

●●

●●

●● ●

●●

●●

●●

●●

●●

● ●

●●

●●

●●

●●

●●

●●

●●

●●

R = − 0.16 , p = 0.073R = 0.25 , p = 4.7e−07

25

50

75

0.1 1.0 10.0 100.0log10(TMB)

MAT

H

class_0.25_S4 ●a ●aS4high S4low

(b) TMB and MATH score

●●

●●

●● ●

● ●

●●

●●

●●

●●

● ●

●●

●●

●●

●●

●●

●●

●●

●●

R = − 0.29 , p = 0.001R = 0.27 , p = 5.5e−08

25

50

75

1e−06 1e−05 1e−04 1e−03log10(Exp_S4)

MAT

H

class_0.25_S4 ●a ●aS4high S4low

(c) S4 exposure and MATHscore

●●

●●

●●

●●

●● ●

R = 0.93 , p = 1.4e−05R = 0.49 , p = 1.1e−05

1

10

100

1000

10000

1e−06 1e−05 1e−04 1e−03log10(Exp_S4)

log1

0(N

eoan

tigen

cou

nt)

class_0.25_S4 ●a ●aS4high S4low

(d) S4 exposure and neoantigencount

19

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 20: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

previously suggested that somatic mutations in chromatinregulating genes437

such as KMT2D (also known as MLL2 ) and ARID1A are associated with438

improved survival [37].439

440

In the gene set found as significantly mutated in theS4low group, 12 out441

of 24 genes are known oncogenes, associated with tumor progression, or tu-442

mor suppressor genes. These 12 genes are PIK3CA, KRAS, RHOA, CDH1,443

CTNNB1, ITGAV, SMAD4, TP53, CDKN2A, APC, PTEN and PIK3R1444

(details in the Table S4C). The most frequently mutated genes in the S4low445

group are TP53 (47%), ARID1A (13%) and CDH1 (9%) (Table S4C). The446

other 21 significantly mutated genes for this group were mutated in up to447

8% of patients. In addition to the common pathways related to cancer,448

we also found pathways associated with regulation of cell death, phospho-449

rus metabolism, regulation of transferase activity, morphogenesis pathways450

(gland development and anatomical structure of a tube) and VEGF and neu-451

rotrophin signaling pathways (Table S4D).452

453

These findings were in accordance with some genes found previously in454

215 non-hypermutated tumors from the TCGA cohort as APC, CTNNB1,455

SMAD4 and SMAD2, with somatic mutations in CDH1 and RHOA enriched456

in the genomically stable and/or diffuse histology [40], subtypes enriched in457

S4low group.458

3.6. Immune diversity in S4 dMMR groups459

To investigate a possible role of the immune system being associated with460

the improved clinical outcomes seen in the S4high as compared to S4low, we461

performed a series of analysis to determine the immune cell infiltrate com-462

position in each group. These analyses used two different analytical method-463

ologies (see Materials and Methods), and demonstrated a significantly higher464

proportion of infiltrating cytotoxic and pro-inflammatory immune cells in the465

group S4high,as exemplified by increased CD8+ central and effector memory466

T cells, CD4+ memory T cells, Th1 cells, gamma/delta T cells, NK cells,467

M1 macrophages and plasmacytoid dendritic cells (pDC), as compared to468

the S4low group (Figure 5A and S10). In contrast, immature and immune469

regulatory dendritic cells were higher in the S4low group (Figure 5A and S10).470

471

To further characterize the immune regulatory environment in the S4high472

and S4low groups respectively associated with good or poor clinical outcomes,473

20

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 21: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

Figure 5: The main immunological features associated with S4high and S4low

groups.

** *** *** * * * * *

0.0

0.2

0.4

B.cells

.naive

B.cells

.mem

Plasma.c

ells

T.cell

s.CD8

T.cell

s.CD4.n

aive

T.cell

s.CD4.m

em.re

st

T.cell

s.CD4.m

em.ac

t TfhTr

egs

T.cell

s.gam

ma.delt

a

NK.cells

.rest

NK.cells

.act

Monoc

ytes

Macro

phag

es.M

0

Macro

phag

es.M

1

Macro

phag

es.M

2

Dendr

itic.ce

lls.re

st

Dendr

itic.ce

lls.ac

t

Mast.c

ells.r

est

Mast.c

ells.a

ct

Eosino

phils

Neutro

phils

Abs

olut

e va

lue

Groups S4−high S4−low

(a) Boxplots showing the absolute quantification of immune in-filtrate cells estimated by CIBERSORT (Newman et al., 2015).Blue boxes represent patients in S4low group and red those within S4high

group. Resulting p-values: *p<0.05; **p<0.01; ***p<0.001 by Mann-Whitney U test.

CD39

CTLA4

PD1

LAG3

GZMA

PRF1

GZMB

IFNG

PDL1

TIM3

**********

Exp

ress

ion

quar

tiles

Q1Q2Q3Q4

GroupsS4−highS4−low

Immune SubtypesC1C2C3C4C6

MLH1

00.20.40.60.81

p−value

10.10.010.0010.0001

< 0.01*

Groups

Immune Subtypes

MLH1

(b) Heatmap showing the immune effector genes (cytotoxic and im-mune checkpoints) expression. Normalized gene expression level (FPKM)for each marker gene was classified in quartiles. Q1 means the range up to firstquartile (25% lowest values); Q2: of 25% to 50% (median) expression value;Q3: of 50% to 75% expression values and Q4 means the range of 75% (thirdquartile) to highest expression values. Left to heatmap, the p value labels con-sider the Mann-Whitney U test by comparing gene expression between S4high

(red samples) versus S4low (blue samples) groups. These comparisons are alsoshown in Figure S9 as boxplots.

21

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 22: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

we performed comparisons between gene expression levels of key genes cod-474

ing for immunoregulatory and effector molecules proven to be important for475

tumor control in many cancers [41, 42]. First, the genes for CD8+ T cell476

related cytolytic molecules Granzyme A/B and Perforin-1 (GZMA, GZMB477

and PRF1 genes, respectively) all displayed higher expression in the S4high478

group (Figures 5B and S11). Moreover, the inflammatory T cell response479

related cytokine, IFN-gamma (IFNG gene) and other proinflammatory cy-480

tokines (IL1B, IL6 and IL8 ), T cell activation marker genes (IL2RA and481

ICOS ) and NK cell KIR family receptors were also higher in the S4high than482

the S4low group (Figures 5B and S11). Second, a series immunosuppression483

related genes (TGFB1, IL10, and FOXP3 ) showed no differences between484

the groups, while ENTPD1 (CD39 gene), a protein associated with Treg485

immunosuppression activity [43], was higher in S4low vs. S4high (Figures 5B486

and S11).487

488

Importantly, the expression of the immune checkpoint inhibitor genes489

(PDCD1 for PD1 receptor, CD274 for PD-L1 ligand, PDCD1LG2 for PD-490

L2, HAVCR2 for TIM3, LAG3 and CTLA4 ) were also higher in the S4high491

group (Figures 5B and S11) possibly indicating a relationship with a more492

immunologically activated tumor microenvironment [44]. Expression of HLA,493

antigen processing and presentation-related genes (such as CD86, B2M, vari-494

ous HLA class II genes, HLA-E, HLA-C, TAP1 and TAP2 ) were also higher495

in the S4high group (Figure S11). Together these findings indicate a highly496

activated immune microenvironment in the S4high group as compared to the497

S4low group.498

499

The immune subtypes previously characterized by Thorsson et al. [33]500

reinforce the finding that the S4high group primarily presents a more immuno-501

logically active tumor microenvironment which is composed predominantly502

of C2 (interferon-gamma) immune subtype, with a significantly higher pro-503

portion of this subtype as compared to S4low (Table 2 and Figure 5B). Im-504

portantly, the C2 immune subtype has been associated with highly mutated505

tumors [33]. On the other hand, the S4low group displayed a higher propor-506

tion of C3 (inflammatory) and C1 (wound healing) immune subtypes [33]507

(Table 2 and Figure 5B). Lastly, the C2 immune subtype in the S4low group508

seems to be less activated than in the S4high group with reduced relative gene509

expression of immune effector molecules (Figure 5B).510

22

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 23: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

3.7. Discussion511

This study provides a comprehensive and integrated analysis of the im-512

pact of MMR-related gene alterations in shaping specific mutational signa-513

tures associated with gastric cancer. We present evidence that the determi-514

nation of MMR-deficiency can be used not only for MSI-phenotype classifi-515

cation, but also as a potential indicator of prognosis and to select potential516

candidates for treatment with checkpoint inhibitors.517

518

We performed a de novo extraction of mutational signatures based on519

somatic SNVs across four WES cohorts, spanning 787 gastric cancer samples520

derived mainly from populations with European and Asian descendance. We521

found 7 different mutational signatures, with three elated to MMR-deficiency.522

523

Next, we examined the prognostic value of these dMMR signatures in mul-524

tivariate survival analysis by employing a Cox proportional hazards model.525

This analysis revealed signature S4, related to the previously described CS-526

20, as the only dMMR signature with significant prognostic value. This prog-527

nostic value was validated using our local cohort of gastric cancer patients,528

distinct in terms of molecular ancestry as well as some clinical and molecular529

features such as Lauren’s histology and tumor heterogeneity. This cohort530

was predominantly composed of diffuse/mixed histology samples, while pub-531

lic cohorts were enriched for the intestinal subtype. Furthermore, the S4low532

group in this independent cohort was less heterogeneous than the S4low group533

from the public cohorts and even than the S4high groups from both cohorts.534

Nevertheless, we observed a better prognosis for the patients of the S4high535

group, also for this cohort.536

537

Interestingly, after performing a comprehensive analysis of patients ex-538

posed to signature S4 by an in depth evaluation of molecular and immune fea-539

tures, we observed that the main mechanism associated with impaired MMR540

seems to be the hypermethylation of the MLH1 gene promoter (hMLH1 ).541

Moreover, we show that disruption of MLH1 in vitro using CRISPR/Cas9542

assay reproduces the CS-20 signature [45] that resembles the S4 signature.543

Here, we have shown an endogenous epigenetic mechanism for this signature544

in gastric cancer patients. Remarkably, we also reproduce the S4 signature545

in an isogenic cell model in which the MLH1KO cells had a high exposure of546

the S4 signature. Thus importantly, we conclude that independently of the547

primary mechanism that leads to the loss of MLH1 gene expression - due to548

23

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 24: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

promoter hypermethylation or loss of function mutagenesis - it results in the549

same mutational signature.550

551

It has been well documented that CpG island methylator (CIMP) phe-552

notype is an early event in tumorigenesis, preceding the hMLH1 in solid553

tumors, which in turn drives the microsatellite instability high (MSI-H) phe-554

notype [36, 40, 37]. In contrast, MSI-low (MSI-L) and microsatellite stable555

(MSS) gastric carcinoma subtypes have unmethylated MLH1 promoters and556

regular MLH1 activity [36]. Here, we classified samples as MSI-H and non-557

MSI-H (MSI-L and MSS) and observed that most cases of MSI-H fall within558

the S4high group, however, a smaller fraction of MSI-H cases did not show559

high exposure of this mutational signature and were grouped in the S4low560

group. This is in line with previous studies showing about one-quarter of561

the MSI-H cases, despite being MSI-H, present distinct molecular features562

and poor prognosis [46]. Similarly, we have also found Non-MSI-H patients563

in the S4high group, showing that mutational signature exposure is capable564

of clustering samples independently of their MSI-status. Furthermore, we565

also identified a few cases (4%) in the S4high group that instead of presenting566

hMLH1, carried somatic mutations in the MLH1 gene that apparently lead567

to loss-of-function of the encoded protein. Thus, for about 70% of S4high568

cases we found a clear genetic or epigenetic cause.569

570

We also demonstrated a strong correlation between S4 exposure and571

TMB, which showed significant prognostic value upon multivariate survival572

analysis. Hypermutated tumors have been associated with better progno-573

sis and a good response to immunotherapy apparently due to neoantigen574

enrichment and intrinsic antitumor immune responses [47, 48]. However, a575

threshold for classifying TMB-high samples usually varies with tumor type576

[49] and in some cases may not predict a better response [48] due to intratu-577

moral heterogeneity [50]. In this sense, it is important to highlight that most578

mutations in the S4high signature are clonal, which is an important feature579

to predict response to immune checkpoint inhibitors therapy.580

581

High intratumoral heterogeneity has been associated with an incomplete582

response to therapy, higher relapse rates, and poor clinical outcomes [51, 52].583

The increased genomic instability observed in MSI/dMMR and CIN (chro-584

mosomal instability) tumors is the major driver of high intratumoral hetero-585

geneity [53, 51]. However, the most unstable tumors (with the highest burden586

24

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 25: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

of somatic SNVs or copy number alterations) are not the most intrinsically587

heterogeneous[53]. Furthermore, the greatest intratumor heterogeneity was588

found in tumors exhibiting relatively high numbers of both somatic muta-589

tions and copy number alterations, which can be associated with exogenous590

mutagens, including viral infection and tobacco smoking. These tumors have591

high number of sub-clonal mutations related to late events and exhibit fre-592

quent chromosomal instability associated with CIN subtype, TP53 muta-593

tions, and APOBEC-related mutational signatures (previous related to EBV594

gastric cancer subtype [6, 53]. Likewise, here we noted that patients with595

higher S4 exposure harbor more homogeneous tumors as compared to S4low596

group. Taken together, the S4high group encompassed patients with interme-597

diate to high TMB, in addition to the MSI and CIN molecular phenotypes598

associated with lower tumor heterogeneity, which might allow for a more ef-599

fective antitumor immune response in this subset of gastric cancer patients.600

In this sense, a recent meta-analysis discussed the importance of MSI-status601

for the treatment response in gastric cancer patients, suggesting that MSI-H602

patients may not benefit from perioperative or adjuvant therapy and could603

go straight to surgery [54].604

605

Finally, several studies have shown that the tumor microenvironment606

context, at diagnosis, is capable of predicting treatment response and clini-607

cal outcome [55, 56]. The balance of inflammatory/cytotoxic immune cells,608

with elements of an effective antitumor response, including regulatory cells609

and suppressor signals, may indicate which patients have an intrinsically610

effective antitumor response, and thus, a better prognosis. EBV and MSI611

subtypes in gastric cancer have already been associated with higher immune612

infiltrate and responsiveness to immunotherapy, as well as better prognosis613

[55]. Here we found many elements indicating that the tumor microenviron-614

ment in the S4high group is more active as compared to S4low patients. In615

general, the absolute quantification by CIBERSORT [31] or GSEA scores616

by xCell [32] of immune cell subtypes and the differential gene expression617

pointed to higher activity of proinflammatory and cytotoxic cells, as well as618

antigen processing and presentation in S4high. In contrast, although there619

were some immunogenic tumors in S4low group, the predominant environ-620

ment was enriched in Treg lymphocytes and M2 macrophages, both related621

to worse prognosis [56, 55].622

623

In conclusion, while past studies have aimed to identify patients using624

25

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 26: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

molecular and clinical features such as MSI status, TMB load and MLH1625

gene expression levels, our study provides evidence that classification based626

on mutational signature exposure may identify groups of patients with com-627

mon clinical, immunological and mutational features that are directly related628

to a better prognosis, and who might benefit from immunotherapy-based629

treatments.630

3.8. Acknowledgements631

This project received financial support from FAPESP (14-26897-0 and632

16/11791-7); ED-N and KJG are research fellows from Conselho Nacional633

de Desenvolvimento Cientıfico e Tecnolgico (CNPq, Brazil). ED-N acknowl-634

edges the support given by Associacao Beneficente Alzira Denise Hertzog635

Silva (ABADHS).636

References637

[1] M. R. Stratton, P. J. Campbell, P. A. Futreal, The cancer genome,638

Nature 458 (2009) 719–724.639

[2] R. Hakem, DNA-damage repair; the good, the bad, and the ugly, EMBO640

J. 27 (2008) 589–605.641

[3] T. Helleday, S. Eshtad, S. Nik-Zainal, Mechanisms underlying muta-642

tional signatures in human cancers, Nat. Rev. Genet. 15 (2014) 585–598.643

[4] L. B. Alexandrov, S. Nik-Zainal, D. C. Wedge, S. A. Aparicio, S. Behjati,644

A. V. Biankin, et al., Signatures of mutational processes in human645

cancer, Nature 500 (2013) 415–421.646

[5] M. Hollstein, L. B. Alexandrov, C. P. Wild, M. Ardin, J. Zavadil, Base647

changes in tumour DNA have the power to reveal the causes and evolu-648

tion of cancer, Oncogene 36 (2017) 158–167.649

[6] I. Bobrovnitchaia, R. Valieris, R. D. Drummond, J. P. Lima, H. C.650

Freitas, T. F. Bartelli, et al., APOBEC-mediated DNA alterations:651

A possible new mechanism of carcinogenesis in EBV-positive gastric652

cancer, Int. J. Cancer 146 (2020) 181–191.653

[7] A. Van Hoeck, N. H. Tjoonk, R. van Boxtel, E. Cuppen, Portrait of654

a cancer: mutational signature analyses for cancer diagnostics, BMC655

Cancer 19 (2019) 457.656

26

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 27: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

[8] H. Davies, D. Glodzik, S. Morganella, L. R. Yates, J. Staaf, X. Zou,657

et al., HRDetect is a predictor of BRCA1 and BRCA2 deficiency based658

on mutational signatures, Nat. Med. 23 (2017) 517–525.659

[9] J. Kim, K. W. Mouw, P. Polak, L. Z. Braunstein, A. Kamburov, D. J.660

Kwiatkowski, et al., Somatic ERCC2 mutations are associated with a661

distinct genomic signature in urothelial tumors, Nat. Genet. 48 (2016)662

600–606.663

[10] M. Jager, F. Blokzijl, E. Kuijk, J. Bertl, M. Vougioukalaki, R. Janssen,664

et al., Deficiency of nucleotide excision repair is associated with muta-665

tional signature observed in cancer, Genome Res. 29 (2019) 1067–1077.666

[11] M. L. Thibodeau, E. Y. Zhao, C. Reisle, C. Ch’ng, H. L. Wong, Y. Shen,667

et al., Base excision repair deficiency signatures implicate germline and668

somatic MUTYH aberrations in pancreatic ductal adenocarcinoma and669

breast cancer oncogenesis, Cold Spring Harb Mol Case Stud 5 (2019).670

[12] M. D. Giraldez, F. Balaguer, L. Bujanda, M. Cuatrecasas, J. Mu?oz,671

V. Alonso-Espinaco, et al., MSH6 and MUTYH deficiency is a frequent672

event in early-onset colorectal cancer, Clin. Cancer Res. 16 (2010) 5402–673

5413.674

[13] N. M. Volkov, G. A. Yanus, A. O. Ivantsov, F. V. Moiseenko, O. G.675

Matorina, I. V. Bizin, et al., Efficacy of immune checkpoint blockade676

in MUTYH-associated hereditary colorectal cancer, Invest New Drugs677

(2019).678

[14] R. Mandal, R. M. Samstein, K. W. Lee, J. J. Havel, H. Wang, C. Kr-679

ishna, et al., Genetic diversity of tumors with mismatch repair deficiency680

influences anti-PD-1 immunotherapy response, Science 364 (2019) 485–681

491.682

[15] D. T. Le, J. N. Durham, K. N. Smith, H. Wang, B. R. Bartlett, L. K.683

Aulakh, et al., Mismatch repair deficiency predicts response of solid684

tumors to PD-1 blockade, Science 357 (2017) 409–413.685

[16] W. Abida, M. L. Cheng, J. Armenia, S. Middha, K. A. Autio, H. A.686

Vargas, et al., Analysis of the Prevalence of Microsatellite Instability in687

Prostate Cancer and Response to Immune Checkpoint Blockade, JAMA688

Oncol 5 (2019) 471–478.689

27

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 28: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

[17] Z. R. Reichert, J. Urrutia, J. J. Alumkal, Microsatellite Instability as690

an Emerging Biomarker for Checkpoint Inhibitor Response in Advanced691

Prostate Cancer, JAMA Oncol 5 (2019) 478–479.692

[18] GE4GACgroup, T. Bartelli, et al., Genomics and epidemiology for gas-693

tric adenocarcinomas (GE4GAC): a Brazilian initiative to study gastric694

cancer, Appl. Cancer Res. 39 (2019).695

[19] K. Ellrott, M. Bailey, G. Saksena, K. Covington, C. Kandoth, C. Stew-696

art, et al., Scalable open science approach for mutation calling of tumor697

exomes using multiple genomic pipelines, Cell Syst. 28 (2018) 271–281.698

[20] G. Ciriello, M. L. Miller, B. A. Aksoy, Y. Senbabaoglu, N. Schultz,699

C. Sander, Emerging landscape of oncogenic signatures across human700

cancers, Nat. Genet. 45 (2013) 1127–1133.701

[21] M. A. DePristo, E. Banks, R. Poplin, K. V. Garimella, J. R. Maguire,702

C. Hartl, et al., A framework for variation discovery and genotyping703

using next-generation DNA sequencing data, Nat. Genet. 43 (2011)704

491–498.705

[22] R. A. Rosales, R. D. Drummond, R. Valieris, E. Dias-Neto, I. T. da Silva,706

signeR: an empirical Bayesian approach to mutational signature discov-707

ery, Bioinformatics 33 (2017) 8–16.708

[23] M. N. Huang, J. R. McPherson, I. Cutcutache, B. T. Teh, P. Tan, S. G.709

Rozen, MSIseq: Software for Assessing Microsatellite Instability from710

Catalogs of Somatic Mutations, Sci Rep 5 (2015) 13321.711

[24] Z. R. Chalmers, C. F. Connelly, D. Fabrizio, L. Gay, S. M. Ali, R. Ennis,712

et al., Analysis of 100,000 human cancer genomes reveals the landscape713

of tumor mutational burden, Genome Med 9 (2017) 34.714

[25] A. Mayakonda, D. C. Lin, Y. Assenov, C. Plass, H. P. Koeffler, Maftools:715

efficient and comprehensive analysis of somatic variants in cancer,716

Genome Res. 28 (2018) 1747–1756.717

[26] P. Charoentong, F. Finotello, M. Angelova, C. Mayer, M. Efre-718

mova, D. Rieder, et al., Pan-cancer Immunogenomic Analyses Reveal719

Genotype-Immunophenotype Relationships and Predictors of Response720

to Checkpoint Blockade, Cell Rep 18 (2017) 248–262.721

28

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 29: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

[27] D. Schoenfeld, Partial residuals for the proportional hazards regression722

model., Biometrika 69 (1982) 239–241.723

[28] D. Hosmer, S. Lemeshow, S. RX., Applied Logistic Regression., Wiley724

3 edition (2013).725

[29] X. Zou, M. Owusu, R. Harris, S. P. Jackson, J. I. Loizou, S. Nik-Zainal,726

Validating the concept of mutational signatures with isogenic cell mod-727

els, Nat Commun 9 (2018) 1744.728

[30] M. S. Lawrence, P. Stojanov, P. Polak, G. V. Kryukov, K. Cibulskis,729

A. Sivachenko, et al., Mutational heterogeneity in cancer and the search730

for new cancer-associated genes, Nature 499 (2013) 214–218.731

[31] A. M. Newman, C. L. Liu, M. R. Green, A. J. Gentles, W. Feng, Y. Xu,732

et al., Robust enumeration of cell subsets from tissue expression profiles,733

Nat. Methods 12 (2015) 453–457.734

[32] D. Aran, Z. Hu, A. J. Butte, xCell: digitally portraying the tissue735

cellular heterogeneity landscape, Genome Biol. 18 (2017) 220.736

[33] V. Thorsson, D. Gibbs, S. Brown, D. Wolf, D. Bortone, T. Ou Yang,737

et al., The Immune Landscape of Cancer, Immunity 51 (2019) 411–412.738

[34] L. B. Alexandrov, S. Nik-Zainal, H. C. Siu, S. Y. Leung, M. R. Stratton,739

A mutational signature in gastric cancer suggests therapeutic strategies,740

Nat Commun 6 (2015) 8683.741

[35] R. Buttner, J. W. Longshore, F. Lopez-Rios, S. Merkelbach-Bruse,742

N. Normanno, E. e. a. Rouleau, Implementing TMB measurement in743

clinical practice: considerations on assay requirements, ESMO Open 4744

(2019) e000442.745

[36] S. Y. Leung, S. T. Yuen, L. P. Chung, K. M. Chu, A. S. Chan, J. C. Ho,746

hMLH1 promoter methylation and lack of hMLH1 expression in sporadic747

gastric carcinomas with high-frequency microsatellite instability, Cancer748

Res. 59 (1999) 159–164.749

[37] W. Hu, Y. Yang, L. Qi, J. Chen, W. Ge, S. Zheng, Subtyping of mi-750

crosatellite instability-high colorectal cancer, Cell Commun. Signal 17751

(2019) 79.752

29

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 30: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

[38] G. Cammerer, A. Formentini, M. Karletshofer, D. Henne-Bruns, M. Ko-753

rnmann, Evaluation of important prognostic clinical and pathological754

factors in gastric cancer, Anticancer Res. 32 (2012) 1839–1842.755

[39] A. Rajput, T. Bocklage, A. Greenbaum, J. H. Lee, S. A. Ness, Mutant-756

Allele Tumor Heterogeneity Scores Correlate With Risk of Metastases757

in Colon Cancer, Clin Colorectal Cancer 16 (2017) e165–e170.758

[40] A. J. Bass, V. Thorsson, I. Shmulevich, S. M. Reynolds, M. Miller,759

B. Bernard, T. Hinoue, et al., Comprehensive molecular characterization760

of gastric adenocarcinoma, Nature 513 (2014) 202–209.761

[41] G. Landskron, M. De la Fuente, P. Thuwajit, C. Thuwajit, M. A. Her-762

moso, Chronic inflammation and cytokines in the tumor microenviron-763

ment, J Immunol Res 2014 (2014) 149185.764

[42] S. A. Fuertes Marraco, N. J. Neubert, G. Verdeil, D. E. Speiser, In-765

hibitory Receptors Beyond T Cell Exhaustion, Front Immunol 6 (2015)766

310.767

[43] J. Bastid, A. Cottalorda-Regairaz, G. Alberici, N. Bonnefoy, J. F.768

Eliaou, A. Bensussan, ENTPD1/CD39 is a promising therapeutic target769

in oncology, Oncogene 32 (2013) 1743–1751.770

[44] M. Binnewies, E. W. Roberts, K. Kersten, V. Chan, D. F. Fearon,771

M. Merad, et al., Understanding the tumor immune microenvironment772

(TIME) for effective therapy, Nat. Med. 24 (2018) 541–550.773

[45] J. Drost, R. van Boxtel, F. Blokzijl, T. Mizutani, N. Sasaki, V. Sasselli,774

et al., Use of CRISPR-modified human stem cell organoids to study the775

origin of mutational signatures in cancer, Science 358 (2017) 234–238.776

[46] W. Hu, Y. Yang, L. Qi, J. Chen, W. Ge, S. Zheng, Subtyping of mi-777

crosatellite instability-high colorectal cancer, Cell Commun. Signal 17778

(2019) 79.779

[47] A. M. Goodman, S. Kato, L. Bazhenova, S. P. Patel, G. M. Frampton,780

V. Miller, et al., Tumor Mutational Burden as an Independent Predictor781

of Response to Immunotherapy in Diverse Cancers, Mol. Cancer Ther.782

16 (2017) 2598–2608.783

30

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint

Page 31: Mutational signatures driven by epigenetic determinants ... · 4/22/2020  · Mutational signatures driven by epigenetic determinants stratify patients for therapeutic interventions

[48] S. Mishima, A. Kawazoe, Y. Nakamura, A. Sasaki, D. Kotani,784

Y. Kuboki, et al., Clinicopathological and molecular features of re-785

sponders to nivolumab for patients with advanced gastric cancer, J786

Immunother Cancer 7 (2019) 24.787

[49] R. M. Samstein, C. H. Lee, A. N. Shoushtari, M. D. Hellmann, R. Shen,788

Y. Y. Janjigian, et al., Tumor mutational load predicts survival after789

immunotherapy across multiple cancer types, Nat. Genet. 51 (2019)790

202–206.791

[50] Q. Jia, W. Wu, Y. Wang, P. B. Alexander, C. Sun, Z. Gong, et al.,792

Local mutational diversity drives intratumoral immune heterogeneity in793

non-small cell lung cancer, Nat Commun 9 (2018) 5361.794

[51] G. Stanta, S. Bonin, Overview on Clinical Relevance of Intra-Tumor795

Heterogeneity, Front Med (Lausanne) 5 (2018) 85.796

[52] A. C. F. Bolhaqueiro, B. Ponsioen, B. Bakker, S. J. Klaasen, E. Ku-797

cukkose, R. H. van Jaarsveld, et al., Ongoing chromosomal instability798

and karyotype evolution in human colorectal cancer organoids, Nat.799

Genet. 51 (2019) 824–834.800

[53] F. Raynaud, M. Mina, D. Tavernari, G. Ciriello, Pan-cancer inference801

of intra-tumor heterogeneity reveals associations with different forms of802

genomic instability, PLoS Genet. 14 (2018) e1007669.803

[54] F. Pietrantonio, R. Miceli, A. Raimondi, Y. W. Kim, W. K. Kang, R. E.804

Langley, et al., Individual Patient Data Meta-Analysis of the Value of805

Microsatellite Instability As a Biomarker in Gastric Cancer, J. Clin.806

Oncol. 37 (2019) 3392–3400.807

[55] M. Wang, R. A. Busuttil, S. Pattison, P. J. Neeson, A. Boussioutas,808

Immunological battlefield in gastric cancer and role of immunotherapies,809

World J. Gastroenterol. 22 (2016) 6373–6384.810

[56] W. H. Fridman, L. Zitvogel, C. Saut?s-Fridman, G. Kroemer, The im-811

mune contexture in cancer prognosis and treatment, Nat Rev Clin Oncol812

14 (2017) 717–734.813

31

. CC-BY-NC-ND 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 22, 2020. .https://doi.org/10.1101/2020.04.17.20066944doi: medRxiv preprint