Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The ....

36
1 Whole exome and transcriptome analyses integrated with microenvironmental immune signatures of lung squamous cell carcinoma Jeong-Sun Seo 1,2,3,6,* , Ji Won Lee 2,3,* , Ahreum Kim 2,3,* , Jong-Yeon Shin 2,6,* , Yoo Jin Jung 4 , Sae Bom Lee 4 , Yoon Ho Kim 4 , Samina Park 5 , Hyun Joo Lee 5 , In-Kyu Park 5 , Chang-Hyun Kang 5 , Ji-Young Yun 2,6 , Jihye Kim 2,6 & Young Tae Kim 2,4,5 1 Gongwu Genomic Medicine Institute (G2MI), Medical Research Center, Seoul National University Bundang Hospital, Seongnamsi 13605, Korea 2 Genomic Medicine Institute (GMI), Medical Research Center, Seoul National University, Seoul 03080, Republic of Korea 3 Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul 03080, Republic of Korea 4 Cancer Research Institute, Seoul National University College of Medicine, Seoul 03080, Korea 5 Department of Thoracic and Cardiovascular Surgery, Seoul National University Hospital, Seoul 03080, Korea 6 Macrogen Inc., Seoul 08511, Korea * These authors contributed equally to this work Running title Genomic landscape and microenvironmental immune signature Abbreviations LUSC: Lung squamous cell carcinoma SCNV: Somatic copy-number variation PCA: Principal component analysis GO: Gene ontology GSEA: Gene set enrichment analysis STAR: Spliced transcripts alignment to a reference VSD: Variance stabilizing data FPKM: Fragments per kilobase million DEG: Differentially expressed ESTIMATE: Estimation of stromal and immune cells in malignant tumours using on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Transcript of Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The ....

Page 1: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

1

Whole exome and transcriptome analyses integrated with

microenvironmental immune signatures of lung squamous cell

carcinoma

Jeong-Sun Seo1,2,3,6,* , Ji Won Lee2,3,* , Ahreum Kim2,3,*, Jong-Yeon Shin2,6,*, Yoo Jin Jung4,

Sae Bom Lee4, Yoon Ho Kim4, Samina Park5, Hyun Joo Lee5, In-Kyu Park5, Chang-Hyun

Kang5, Ji-Young Yun2,6, Jihye Kim2,6 & Young Tae Kim2,4,5

1Gongwu Genomic Medicine Institute (G2MI), Medical Research Center, Seoul National University

Bundang Hospital, Seongnamsi 13605, Korea

2Genomic Medicine Institute (GMI), Medical Research Center, Seoul National University, Seoul

03080, Republic of Korea

3Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul 03080,

Republic of Korea

4Cancer Research Institute, Seoul National University College of Medicine, Seoul 03080, Korea

5Department of Thoracic and Cardiovascular Surgery, Seoul National University Hospital, Seoul

03080, Korea

6Macrogen Inc., Seoul 08511, Korea

* These authors contributed equally to this work

Running title

Genomic landscape and microenvironmental immune signature

Abbreviations

LUSC: Lung squamous cell carcinoma SCNV: Somatic copy-number variation PCA:

Principal component analysis GO: Gene ontology GSEA: Gene set enrichment

analysis STAR: Spliced transcripts alignment to a reference VSD: Variance

stabilizing data FPKM: Fragments per kilobase million DEG: Differentially expressed

ESTIMATE: Estimation of stromal and immune cells in malignant tumours using

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 2: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

2

expression TIMER: Tumor immune estimation resource FDR: False discovery rate

CYT: Cytolytic activity CAF: Carcinoma associated fibroblasts TAM: Tumor

associated macrophage KEGG: Kyoto encyclopedia of genes and genomes.

Corresponding authors

Professor Jeong-Sun Seo, Gongwu Genomic Medicine Institute (G2MI), Medical

Research Center, Seoul National University Bundang Hospital, Seongnamsi 13605,

Korea. E-mail: [email protected] or Professor Young Tae Kim, Department of

Thoracic and Cardiovascular Surgery, Seoul National University Hospital, Seoul

03080, Korea. E-mail: [email protected].

Conflicts of Interest

No potential conflicts of interest relevant to this article were disclosed.

Abstract

The immune microenvironment in lung squamous cell carcinoma (LUSC) is not well

understood, with interactions between the host immune system and the tumor, as

well as the molecular pathogenesis of LUSC, awaiting better characterization. To

date, no molecularly targeted agents have been developed for LUSC treatment.

Identification of predictive and prognostic biomarkers for LUSC could help optimize

therapy decisions. We sequenced whole exomes and RNA from 101 tumors and

matched noncancer control Korean samples. We used the information to predict

subtype-specific interactions within the LUSC microenvironment and to connect

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 3: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

3

genomic alterations with immune signatures. Hierarchical clustering based on gene

expression and mutational profiling revealed subtypes that were either immune

defective or immune competent. We analyzed infiltrating stromal and immune cells to

further characterize the tumor microenvironment. Elevated expression of

macrophage 2 signature genes in the immune competent subtype confirmed that

tumor-associated macrophages (TAMs) linked inflammation and mutation-driven

cancer. A negative correlation was evident between the immune score and the

amount of somatic copy-number variation (SCNV) of immune genes (r = -0.58). The

SCNVs showed a potential detrimental effect on immunity in the immune-deficient

subtype. Knowledge of the genomic alterations in the tumor microenvironment could

be used to guide design of immunotherapy options that are appropriate for patients

with certain cancer subtypes.

Introduction

Lung cancer is the second leading cause of death in Korea. The most common

type of primary lung cancer, lung adenocarcinoma, has been characterized at the

molecular level (1,2). Lung squamous cell carcinoma, which accounts for 30 percent

of all lung cancers (3), is not well characterized due to poor understanding of the

cancer’s genomic evolution (4) and the antitumor activity of immune cells (5,6).

Genomic alterations in the tumor characterize various stages of cancer progression.

Immune defenses, on the other hand, are governed by tumor stroma, including

basement membrane, extracellular matrix, vasculature, and cells of the immune

system (7-9). Most cells in tumor stroma have some capacity to suppress a tumor,

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 4: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

4

although this capacity changes as the cancer progresses; invasion and metastasis

can follow (10-13).

Immune and stromal characteristics have emerged as prognostic and predictive

factors that could be used to guide a personalized approach in cancer

immunotherapy (14,15). Analyses of genomic alterations, especially somatic

mutations, have been used to predict response to immunotherapy (16,17). Here, we

used genomic and transcriptomic analysis to integrate molecular subtypes of tumors

with and immune responses. We show that genomic alterations affect the tumor

microenvironment and tumor development in a subtype-specific manner. The data

show how genomic alterations and tumor microenvironment impact cancer

proliferation and invasion, and how predicted roles of immune cells and their

interactions with cancer cells in LUSC might affect cancer therapy and patient

survival.

Materials and Methods

RNA and whole exome sequencing

All protocols of this study were approved by the Institutional Review Board of Seoul

National University Hospital (IRB No:1312-117-545).

One hundred and one cases of lung squamous cell cancer samples, taken between

2011-2013, were included. Of these 101 patients, two patients were treated by

neoadjuvant chemotherapy before surgery, and were subsequently excluded from

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 5: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

5

survival analysis. All the tumor and matched adjacent noncancer control tissue

specimens were grossly dissected immediate after surgery and preserved in liquid

nitrogen. Data on clinical features such as smoking history, pathologic TNM stage,

tumor size and degree of differentiations were collected (Table 1 and Supplementary

Table S1). For RNA-seq, we extracted RNA from tissue using RNAiso Plus (Takara

Bio Inc.), followed by purification using RNeasy MinElute (Qiagen Inc.). RNA was

assessed for quality and was quantified using an RNA 6000 Nano LabChip on a

2100 Bioanalyzer (Agilent Inc.). The RNA-seq libraries were prepared as previously

described (18).

For whole exome sequencing, genomic DNA was extracted and 3 μg from each

sample was sheared and used for the construction of a paired-end sequencing

library as described in the protocol provided by Illumina. Enrichment of exonic

sequences was then performed for each library using the SureSelect Human All

Exon 50Mb Kit (Agilent Inc.) following the manufacturer's instructions.

Libraries for RNA and whole exome sequencing were sequenced with Illumina

TruSeq SBS Kit v3 on a HiSeq 2000 sequencer (Illumina Inc.) to obtain 100-bp

paired-end reads. The image analysis and base calling were performed using the

Illumina pipeline (v1.8) with default settings.

RNA-seq analysis

To characterize the LUSC transcriptome profile in cancer and noncancer control

cells, we performed RNA-Seq for 101 LUSC and matched noncancer control

samples. Total RNA extracted from lung specimens and depleted of ribosomal RNA

was sequenced at the desired depth (100X) on RNA-Seq (Illumina HiSeq). The

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 6: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

6

reads were aligned to the human genome (version GRCh37) with the Spliced

Transcripts Alignment to a Reference (STAR) alignment software. The pre-

processing pipeline on the GTAK website was followed (19). The raw read counts

were generated using HTSeq-count for each annotated gene.

Unsupervised subtype clustering

With the Ensembl gene set, the number of raw reads aligned to each gene was

computed by HT-seq count and was normalized by Variance Stabilizing Data (VSD)

method with use of the R package ‘DEseq2’. The variance for each gene was

calculated and the top 1,000 genes by variance were selected for PCA analysis

(20). PCA analysis using the 1,000 most variable genes was conducted with all

tumor and noncancer control samples. Samples were clustered based on principal

components into three groups noncancer control with 95% confidence interval by

hierarchical clustering methods as implemented in the R package ‘rgl’ (21). When

analyzing RNA sequencing data, batch effects should be considered if

experimental conditions and library preparation varied. All of our samples were

processed in the same batches, thus additional batch-effect corrections were not

necessary (22).

Differentially expressed gene-analysis

Differentially expressed genes of tumor subtypes compared to noncancer control

expression in noncancer control cells were determined by the significance criteria

(adjusted P value < 0.05, |Log2 (fold change)| ≥ 1, and base Mean ≥ 100) as

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 7: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

7

implemented in the R packages DESeq2and edgeR. The adjusted P value for

multiple testing was calculated by using the Benjamini–Hochberg correction from

the computed P value (23). The centered VSD values of the differentially

expressed gene list were applied to the array hierarchical clustering algorism

(Cluster 3.0) with uncentered correlation and average linkage (24). The gene

expression pattern was visualized with use of JAVA treeview. The hierarchical tree

by arrays was generated by the clustering process and two types of gene sets in

differentially expressed genes s (subtype A-UP and B-DOWN, subtype A-DOWN

and B-UP) were selected and enriched for Gene Ontology (GO) gene sets by

Gene Set Enrichment Analysis (GSEA) in order to determine genes enriched in

ranked gene lists.

Fragments Per Kilobase Million (FPKM) calculation and normalization

Raw reads (HTseqcounts) were normalized using FPKM as implemented in the R

package ’edgeR’ and the FPKM values were transformed to log2 values and adjusted

to the median centered gene expression values by subtracting the row-wise median

from the expression values in each row (Cluster 3.0). The centered and log2

transformed VSD and FPKM expression were used to illustrate gene expression

pattern in a heatmap.

GSEA and network analysis

GO analysis of the gene expression data was performed using GSEA (v2.24)

desktop tools (permutation = 1,000) and visualized by the Enrichment Map tool in

Cytoscape [P ≤ 0.05, FDR q-value ≤ 0.1, and similarity ≤ 0.5 (25)].

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 8: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

8

Tumor microenvironment analysis

The fractions of stromal and immune cells in tumor samples were estimated by

Estimation of STromal and Immune cells in MAlignant Tumours using Expression

data (ESTIMATE) scores with predictions of tumor purity based on the absolute

method previously reported (26). The abundance of six infiltrating immune cell types

(B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages, and dendritic cells) in

the two subtypes of LUSC was estimated with the Tumor IMmune Estimation

Resource (TIMER) algorithm (27).

Immune score, stromal score, tumor purity, and cell cycle score

Immune score, stromal score, and tumor purity were made using ESTIMATE. The

gene set for cell cycle score was used to calculate the cell cycle score from the

one in Davoli, T et al. (28).

Statistical analyses

Quantitative data are presented as mean ± standard deviation. We used R-3.2.3 to

perform the statistical analyses. The normality of the variables was tested by

Shapiro-Wilk normality test (29). For two groups, significance (P value) for normally

distributed variables was determined by unpaired Student t test, and non-normally

distributed variables were analyzed Mann-Whitney U test. For more than two groups,

Kruskal-Wallis and one-way ANOVA tests were used for non-parametric and

parametric method, respectively (30). Statistically significant differences were tested

at P values < 0.05. R value was computed by Pearson and distance correlation.

Survival graph was plotted by Kaplan Meier method, and comparison between the

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 9: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

9

subtypes were analyzed with the log-rank test, and Cox’s proportional hazards

model was also used for survival analysis (SPSS for Windows, version 22; SPSS

Inc., Chicago, IL).

Somatic mutation detection using whole exome sequencing

To find somatic mutations, we performed whole exome sequencing (100X). Reads

were aligned to the NCBI human reference genome (hg19) using BWA (31). Picard

was applied to mark duplicates and we used Genome Analysis Toolkit (GATK Indel

Realigner) to improve alignment accuracy. Somatic single-nucleotide variants (SNVs)

from 101 LUSC samples with matched noncancer control samples were called using

MuTect (32). GATK’s Haplotype Caller was also used for indel detection. All variants

were annotated with information from several databases using ANNOVAR (33).

Somatic copy-number variant detection

To detect somatic copy-number variants from whole exome sequencing data,

EXCAVATOR analysis was applied (34). GISTIC analysis was used to identify

recurrent amplification and deletion peaks (35).

Calculation of SCNV levels

We obtained amplified and deleted copy-number variants using EXCAVATOR and

GISTIC analyses. Arm and focal SCNV levels of each patient were respectively

calculated by summing the copy number changes at each copy number event. Arm,

chromosome, and focal CNV level were normalized to the mean and standard

deviation among the samples (28).

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 10: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

10

Total SCNV = ∑ | 𝐶𝑜𝑝𝑦 𝑁𝑢𝑚𝑏𝑒𝑟 𝑐ℎ𝑎𝑛𝑔𝑒 𝑎𝑡 𝑡𝑜𝑡𝑎𝑙 𝑐𝑜𝑝𝑦 𝑛𝑢𝑚𝑏𝑒𝑟 𝑟𝑒𝑔𝑖𝑜𝑛|

Arm-level SCNV = ∑ | 𝐶𝑜𝑝𝑦 𝑁𝑢𝑚𝑏𝑒𝑟 𝑐ℎ𝑎𝑛𝑔𝑒 𝑎𝑡 𝑎𝑟𝑚 𝑐𝑜𝑝𝑦 𝑛𝑢𝑚𝑏𝑒𝑟 𝑟𝑒𝑔𝑖𝑜𝑛|

Focal-level SCNV = ∑ |𝐶𝑜𝑝𝑦 𝑁𝑢𝑚𝑏𝑒𝑟 𝑐ℎ𝑎𝑛𝑔𝑒 𝑎𝑡 𝑓𝑜𝑐𝑎𝑙 𝑐𝑜𝑝𝑦 𝑛𝑢𝑚𝑏𝑒𝑟 𝑟𝑒𝑔𝑖𝑜𝑛|

Cell cycle related SCNV=∑|Copy Number change at copy number region of genes

used in Cell cycle score. |

Immune related SCNV=∑|Copy Number change at copy number region of genes

used in immune score. |

Prediction of neoantigens

We predicted neoantigens using the pVAC-Seq pipeline (36). We used non-

synonymous mutations to follow the pVAC-seq pipeline. Amino acid changes and

transcript sequences were annotated by variant effect predictor. Epitopes

predicted by HLAminer and were filtered by RNA expression (FPKM>1) and

coverage (tumor coverage >10X and noncancer control coverage>5X).

Availability of data and material

Whole exome and RNA sequencing data are available under the NCBI Sequence

Read Archive (SRA) study accession no. SRP114315

Results

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 11: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

11

Identification of LUSC subtypes

In this study, 101 LUSC and matched noncancer control samples were used to

discover significant differential gene expression, and principal component analysis

(PCA) on entire samples was performed to identify distinctive clusters based on

gene variability between LUSC and noncancer control samples. PCA with the top

1,000 most variable genes and unsupervised hierarchical clustering with the k-

means algorithm were applied to RNA-seq data and distinguished noncancer

control from tumor samples. In contrast, 19 tumor samples overlapped and were

more closely associated with the noncancer control group with 95 % confidence

interval ellipsoids (Fig. 1A and Supplementary Fig. S1). Additional PCA with 101

tumor samples distinguished the 19 tumor samples (subtype B) from 82 other

major tumor samples (subtype A) at the 95 % confidence interval. Thus, PCA

revealed two molecular subtypes, A and B. These two subtypes were also

distinguishable in the TCGA LUSC cohort (n = 431) by PCA and unsupervised

hierarchical clustering at 95 % confidence interval (Supplementary Fig. S2).

To identify the pattern of genomic alterations in LUSC, we sequenced the exome

of independent noncancer control samples (n = 101) and tumor samples (n = 101).

The number of total mutations in subtype B was less than subtype A (P < 0.001 by

Mann-Whitney U test; Fig. 1B and Supplementary Fig. S3A). Mutations in genes

encoding TP53, NAV3, CDH10, KMT2D, NFE2L2, CTNNA3, KEAP1 NTRK3, RB1,

NOTCH1, PTEN, FGFR2 and EGFR were also identified in current study (Fig. 1C).

Mutational signature combinations in subtype A were significantly different from

those in subtype B (P < 0.01 by Mann-Whitney U test) showing more irregular

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 12: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

12

patterns [Fig. 1D and Supplementary Fig. S3D, (32)]. We compared the burden of

somatic copy-number variation (SCNV) (28). Arm-level SCNVs (length > 98% of a

chromosome arm), focal-level SCNVs (length < 98% of a chromosome arm), and

total SCNVs (all of a chromosome), which comprise the burden of SCNV, were

mostly found in subtype A (Fig. 1E and F). To elucidate the effect of genomic

alteration on the tumor microenvironment, we also investigated major

histocompatibility complex (MHC) and tumor specific antigen (TSA) level, and found

that subtype A carried more deletions of MHC regions than did subtype B

(Supplementary Fig. S3C). TSA level was also higher in subtype A than in subtype B

(P < 0.001 by Mann Whitney U test; Supplementary Fig. S3D). Our transcriptome

analysis thus defined two LUSC subtypes, A and B, with different patterns of somatic

mutations.

Identification of differentially expressed genes in subtypes

We analyzed differential expression of 4,807 genes in subtype A and 1,471 genes

in subtype B compared to noncancer control expression, and matched the

significance criteria applied in this study (DESeq2: false discovery rate (FDR)  < 0.1,

DESeq2: FDR < 0.05). We derived a heatmap depicting 5,387 differentially

expressed genes in subtype A and B by excluding the overlapped genes

(Supplementary Table S2). Differentially expressed genes were significantly

enriched in several pathways through Gene Ontology (GO) term enrichment

analysis. Genes for which expression was upregulated were mainly involved with

the cell cycle and DNA replication (n = 1,826) in subtype A and with immune and

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 13: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

13

defense pathways (n = 1,876) in subtype B (Fig. 2A).

Further functional gene enrichment analysis using GSEA (Gene Set Enrichment

Analysis, v2.2.3; ref. 25) proved that these differentially expressed genes were

enriched in pathways related to cytoskeleton, mitosis, cell cycle, and chromatin

modification in subtype A and pathways related to defense response, immune

response, cytokine and metabolic and biosynthetic process in subtype B. Using

network analysis with Cytoscape (P ≤ 0.05, FDR q-value ≤ 0.1, and similarity ≤

0.5), we found that the increase in expression and activation of pathway-related

genes in subtype B was linked to regulation of immune cell differentiation and

apoptotic processes, which would increase immune cell abundance. Genes

characterizing subtype A, on the other hand, were linked to cell cycle and

microtubule pathways that would enhance cancer cell proliferation (Fig. 2B).

Consistent with gene set enrichment analysis, network analysis showed that cell

cycle and immune system gene sets were differentially correlated in each subtype.

Both immune response and cell cycle pathways were differently upregulated in the

subtypes, showing an interaction between immune response and somatic mutations.

The immune landscape of the microenvironment in LUSC

To identify the roles of immunity and distribution of infiltrating immune cells in tumor

and noncancer control samples, we computed stromal and immune scores along

with tumor purity based on the ESTIMATE method; ref. 26).The stromal and

immune scores in subtype B compared to subtype A were significantly increased

(stromal core P = 9.10 x 10-25, immune score P = 7.51 x 10-19 by unpaired Student

t test; Fig. 3A and Supplementary Fig. S4). The high immune scores suggested

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 14: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

14

that recruitment of immune cells was more enhanced in subtype B than in A, which

implicates not only immune cells surrounding the tumor but also immune cells

infiltrating the tumor (26). This observation may reflect the fact that the different

portion of infiltrating immune and stromal cells was intermixed in a dissecting

tumor, and this can be useful for predicting the degree of tumor purity and tumor

microenvironment (tumor purity P = 9.80 x 10-20 by unpaired Student t test;

Supplementary Fig. S4). As subtype B had more tumor-infiltrating immune cells than

did subtype A, cytolytic activity of subtype B must have been higher. Indeed, the

cytolytic activity score was higher in subtype B than in subtype A (P = 2.07 x 10-6 by

unpaired Student t test; Supplementary Fig. S4). Cytolytic activity is a metric of

immune-mediated cell destruction, and immune infiltration predicts more cytolytic

activity in subtype B of LUSC (37). Similarly, TCGA LUSC cohort results identified

different microenvironmental factors in the two subtypes. Subtype B (n = 328) in the

TCGA LUSC cohort was consistent with the description of our LUSC subtype B (n =

19; higher immune and stromal scores, lower tumor purity, and higher CYT score;

Supplementary Fig. S5). The TCGA cohort contained more subtype B than subtype A

samples; the Korean and TCGA cohorts likely had different subtype compositions.

Immune cells are important in tumors. Stromal cells influence tumor proliferation

and inflammation. In order to evaluate tumor associated stroma, the effects of

activated and normal stroma on two tumor subtypes were described by 50 stromal

signature genes as previously reported (38). The heat map of LUSC and noncancer

control cells using active and normal stroma genes indicated three groups: activated

stromal-rich samples (subtype A), normal and activated stromal-rich samples

(subtype B), and normal stromal-rich samples (noncancer control). Mean normalized

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 15: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

15

gene expression of the 25 exemplar activated and normal stromal genes was

higher in subtype B than in subtype A (activated stroma P = 0.0247; normal stroma

P < 0.01 by Mann-Whitney U test: Fig. 3A and Supplementary Figs. S6A and B).

The activated stromal signature genes were associated with carcinoma associated

fibroblasts (CAFs) and were overexpressed in subtype B as compared to subtype

A and noncancer control. Likewise, the normal stromal genes, which reflect the

composition of immune cells, were more highly expressed in subtype B than in

subtype A. Overexpression of growth factors and chemokines related to CAFs,

such as HGF, FGF7, CXCL12, MMP2, IL6, CCL2, NFkB, mediated tumor

promotion and aggressive invasion in subtype B [Supplementary Fig. S6C; (39)].

To analyze tumor microenvironment integration in subtypes A and B, we

assessed the correlation of stromal and immune scores as well as tumor purity.

There was a positive correlation between stromal and immune scores (subtype A,

Pearson’s r, 0.79; distance r, 0.78; subtype B Pearson’s r, 0.46; distance r, 0.54);

samples with high tumor purity showed low stromal and immune scores (Fig. 3B).

Subtypes A and B differed in their association with stromal and immune scores,

suggesting variation in microenvironments and in interactions with stromal and

immune cells.

We estimated (with use of the TIMER algorithm) the abundance of tumor-infiltrating

immune cells in six cell types (B cells, CD4+ T cells, CD8+ T cells, neutrophils,

macrophages, and dendritic cells) to predict immune cell profiling in subtypes A and

B. Noncancer control A and noncancer control B subtypes had no significant

differences. Dendritic cells were the most abundant cells among both tumor and

noncancer control cells. Although macrophages were not the most abundant cells in

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 16: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

16

tumors and noncancer control samples, macrophages seemed to be the most

influential in a subtype-specific manner (P = 6.27 x 10-9 by Mann-Whitney U test;

Figs. 3C and D). The proportion of macrophages in subtype B was significantly

higher than in subtype A, supporting the previous observations that macrophages

are present in large numbers in the tumor microenvironment and promote tumor

progression and metastasis (40).

Intratumoral immune coordination was assessed by analyzing the correlation

between selected immune cell markers. Pairwise comparisons of immune cell

abundance levels were done by measuring Pearson correlation coefficients (r).

The relationships implied by these correlations were visualized as r values

(Supplementary Fig. S7). Correlation between immune cells was greater in tumor

cells than in noncancer control cells, and the degree of correlation was higher in

subtype B than in subtype A. Dendritic cells were more correlated with other

immune cells in subtype B than in subtype A, and the correlations of dendritic cells

with neutrophil and neutrophil with CD8+ T cells were increased in subtype B.

Dendritic cells and neutrophils seemed to have immediate connection with CD8+ T

cells, confirming previous findings that the crosstalk between neutrophils and

recruited dendritic cells accelerates antigen presentation to T cells and generates

an antigen-specific immune response (41,42). Elevated expression of macrophage

2 signature genes in subtype B, a previously validated gene set (43), suggests

that tumor-associated macrophages (TAMs) promote tumor growth in subtype B

cancer (Fig. 3D and Supplementary Figs. S8A and B). We examined the

reproducibility of the immune parameters by subtype in both our LUSC and TCGA

LUSC cohorts. All Immune scores, activated stromal gene expression, and

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 17: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

17

macrophage 2 activity patterns by subtype were highly reproducible across both

cohorts. LUSC and TCGA LUSC cohort subtype B showed higher immune

infiltration (Supplementary Fig. S9). The immune microenvironment in subtype B

was converted to escape mode during the immunoediting process of LUSC in

order to help proliferation of resistant clones in an immunocompetent host. This

complexity of subtype-specific microenvironments should inform therapeutic

strategies for treatment of LUSC.

Impacts of somatic copy-number variants on tumor microenvironment

To identify the effect of genomic alterations on tumor microenvironments in LUSC,

we investigated the association between somatic genomic alterations and the

immune score. The SCNVs were significantly higher in subtype A (P < 0.001 by

Mann-Whitney test; Fig. 4A), and were negatively correlated with the immune

score (Pearson’s coefficient: -0.58) and with other immune related properties such

as the stromal score (stromal cell infiltration), CYT score (immune cytolytic activity),

CD4+ T-cell infiltration (Pearson’s coefficient: -0.42), macrophage infiltration

(Pearson’s coefficient: -0.39), and dendritic cell infiltration (Pearson’s coefficient: -

0.41; Figs. 4B and C). These results suggested that the negative correlation

between immune score and SCNVs in subtype A may influence the immune

response of subtype A. Focal-level SCNVs contributed more to immune scores

than arm-level SCNVs (Figs. 4D and E). In an association test, focal-level SCNVs

were more highly correlated with immune score (Fig. 4F). These findings indicate

that focal-level SCNVs are more influential on the tumor microenvironment, and

might be better targets for immunotherapy. Focal-level SCNVs had a stronger

correlation with LUSC immunity and may make it easier to determine the target

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 18: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

18

genes for drug interventions than arm-level SCNVs, because arm-level SCNVs are

large genomic defects that may affect multiple targets (44,45). To assess the role of

focal-level SCNV in the tumor microenvironment, we investigated focal-level SCNVs

in immune related pathways via KEGG enrichment analysis. Subtype A had a high

prevalence of SCNVs on immune-related pathways (B-cell receptor signaling

pathway, chemokine signaling pathway, T-cell receptor signaling pathway and Toll-

like receptor signaling pathway of KEGG; Fig. 5). Genes harboring SCNV deletions,

which lead to loss of function, were enriched in subtype A only for immune related

pathways of immune system processes, immune responses and response to

cytokine (Supplementary Fig. S10). These data suggested that focal-level SCNVs

may drive the low immune response in tumor microenvironments and would be good

targets to control immunity.

By analyzing the tumor microenvironment with RNA sequencing and whole exome

sequencing, we defined two immune related subtypes: subtype A (immune defective)

and subtype B (immune competent). The log2 VSD expression pattern between

tumors and noncancer control tissue was consistent over the validation gene set with

previously studied subtypes [Supplementary Fig. S11; (46)]. The subtype A

expression pattern resembled that of the classical subtype, in which TP63, AKR1C3,

FOXE1, and TXN genes were overexpressed. Subtype B appeared to overexpress

basal-related genes (S100A7, MMP13, and SERPINB3) and secretory-related genes

(ARHGDIB and TNFRSF14). Except for the secretory-related genes, the other genes

were all overexpressed in subtype A relative to subtype B. The expression pattern of

previous validated genes differed between subtypes. Thus, our clustering method

with gene selection for subtyping achieved similar results to those of the previous

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 19: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

19

study. Our immune subtyping approach categorized the LUSC into immune defective

and competent subtypes based on the pattern of infiltrating immune, stromal cells,

immune cell composition. Our data also suggest that genomic alterations, especially

SCNVs, decrease immune related activities and immune cells in cell proliferation

related subtypes of LUSC, and that activated stroma and TAM lead to the evolution

of cancer cells in the immune related subtype. Although the subtypes differed in

immune activation, mutation burden, and SCNV, the subtypes had no significant

differences in clinical features or overall survival in both cohorts (our LUSC cohort P

value = 0.223; TCGA LUSC cohort P value = 0.54; log-rank test; Supplementary Figs.

S12 and S13). It was difficult to evaluate to what extent clinical features were

affected by low immune activities due to the high SCNV in subtype A and the

increased activity of TAMs and activated stroma in subtype B.

We found that all immune checkpoint genes were more highly expressed in subtype

B than subtype A in both our LUSC samples and the TCGA LUSC cohort (Fig. 6A),

and two independent cohorts had similar expression patterns for immune

checkpoints (Fig.6B). Both PD-1 and its ligand PD-L1 as well as other immune

checkpoint genes are highly expressed in subtype B samples. PD-1 and PD-L1 are

involved in immune tolerance by preventing stimulation of an immune response and

inducing tumor immune escape (47), and might be valuable targets for new drugs in

subtype B. However, difficulty with inducibility of PD-L1 protein expression and with

accurate assays complicate their use. More reliable biomarkers are needed (48).

Discussion

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 20: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

20

Although the study of PD1 expression for immunotherapy has produced benefit in

many clinical cases, some patients do not respond to checkpoint blockade (49). For

such nonresponders, tumor-specific antigens with appropriate MHC-binding affinity,

might be more useful for immunotherapy (36,50,51). We analyzed the MHC region

and TSA levels for both subtypes and found that subtype A had more TSA and more

deletions of MHC regions than did subtype B. However, these results did not explain

the role of the tumor microenvironment. Further study is needed to validate the

results. In particular TSA content was predicted but not validated in the clinical

environment.

Our genetic results thus provide evidence for an immune response to cancer in

humans and indicate a mechanism of tumor-intrinsic resistance to cytolytic activity in

tumors with a high burden of somatic mutations. Analysis of genomic alterations and

their impact on the tumor microenvironment in a subtype-specific manner might

identify patients who could benefit from cancer immunotherapies that boost the

immune system.

Acknowledgements

This work has been supported by Macrogen Inc. (grant no. MGR17-01) and the

National Research Foundation of Korea (NRF) grant funded by the Korea

government (MSIP) (No. NRF-2014R1A2A2A05003665).

References

1. Park JY, Jang SH. Epidemiology of Lung Cancer in Korea: Recent Trends. Tuberc Respir Dis

(Seoul) 2016;79(2):58-69 doi 10.4046/trd.2016.79.2.58.

2. Lee CT. Epidemiology of Lung Cancer in Korea. Cancer Res Treat 2002;34(1):3-5 doi

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 21: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

21

10.4143/crt.2002.34.1.3.

3. Kim Y, Hammerman PS, Kim J, Yoon JA, Lee Y, Sun JM, et al. Integrative and comparative

genomic analysis of lung squamous cell carcinomas in East Asian patients. J Clin Oncol

2014;32(2):121-8 doi 10.1200/JCO.2013.50.8556.

4. Yates LR, Campbell PJ. Evolution of the cancer genome. Nat Rev Genet 2012;13(11):795-

806 doi 10.1038/nrg3317.

5. Gajewski TF, Schreiber H, Fu YX. Innate and adaptive immune cells in the tumor

microenvironment. Nat Immunol 2013;14(10):1014-22 doi 10.1038/ni.2703.

6. Yang Y. Cancer immunotherapy: harnessing the immune system to battle cancer. J Clin

Invest 2015;125(9):3335-7 doi 10.1172/JCI83871.

7. Green S, Dawe DE, Banerji S. Immune Signatures of Non-Small Cell Lung Cancer. J Thorac

Oncol 2017;12(6):913-5 doi 10.1016/j.jtho.2017.04.008.

8. Schoenhals JE, Seyedin SN, Anderson C, Brooks ED, Li YR, Younes AI, et al. Uncovering the

immune tumor microenvironment in non-small cell lung cancer to understand response

rates to checkpoint blockade and radiation. Transl Lung Cancer Res 2017;6(2):148-58 doi

10.21037/tlcr.2017.03.06.

9. Safonov A, Jiang T, Bianchini G, Gyorffy B, Karn T, Hatzis C, et al. Immune Gene Expression

Is Associated with Genomic Aberrations in Breast Cancer. Cancer Res 2017 doi

10.1158/0008-5472.CAN-16-3478.

10. Li H, Fan X, Houghton J. Tumor microenvironment: the role of the tumor stroma in cancer.

J Cell Biochem 2007;101(4):805-15 doi 10.1002/jcb.21159.

11. Vannucci L. Stroma as an Active Player in the Development of the Tumor

Microenvironment. Cancer Microenviron 2015;8(3):159-66 doi 10.1007/s12307-014-0150-x.

12. Ramamonjisoa N, Ackerstaff E. Characterization of the Tumor Microenvironment and

Tumor-Stroma Interaction by Non-invasive Preclinical Imaging. Front Oncol 2017;7:3 doi

10.3389/fonc.2017.00003.

13. Siniard RC, Harada S. Immunogenomics: using genomics to personalize cancer

immunotherapy. Virchows Arch 2017 doi 10.1007/s00428-017-2140-0.

14. Becht E, de Reynies A, Giraldo NA, Pilati C, Buttard B, Lacroix L, et al. Immune and Stromal

Classification of Colorectal Cancer Is Associated with Molecular Subtypes and Relevant for

Precision Immunotherapy. Clin Cancer Res 2016;22(16):4057-66 doi 10.1158/1078-

0432.CCR-15-2879.

15. Wang RF. A special issue on cancer immunotherapy. Cell Res 2017;27(1):1-2 doi

10.1038/cr.2017.1.

16. Rizvi NA, Hellmann MD, Snyder A, Kvistborg P, Makarov V, Havel JJ, et al. Cancer

immunology. Mutational landscape determines sensitivity to PD-1 blockade in non-small

cell lung cancer. Science 2015;348(6230):124-8 doi 10.1126/science.aaa1348.

17. Palmieri G, Colombino M, Cossu A, Marchetti A, Botti G, Ascierto PA. Genetic instability

and increased mutational load: which diagnostic tool best direct patients with cancer to

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 22: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

22

immunotherapy? J Transl Med 2017;15(1):17 doi 10.1186/s12967-017-1119-6.

18. Ju YS, Kim JI, Kim S, Hong D, Park H, Shin JY, et al. Extensive genomic and transcriptional

diversity identified through massively parallel DNA and RNA sequencing of eighteen

Korean individuals. Nat Genet 2011;43(8):745-52 doi 10.1038/ng.872.

19. Soundararajan R, Stearns TM, Griswold AL, Mehta A, Czachor A, Fukumoto J, et al.

Detection of canonical A-to-G editing events at 3' UTRs and microRNA target sites in

human lungs using next-generation sequencing. Oncotarget 2015;6(34):35726-36 doi

10.18632/oncotarget.6132.

20. Bailey P, Chang DK, Nones K, Johns AL, Patch AM, Gingras MC, et al. Genomic analyses

identify molecular subtypes of pancreatic cancer. Nature 2016;531(7592):47-52 doi

10.1038/nature16965.

21. Inamura K, Fujiwara T, Hoshida Y, Isagawa T, Jones MH, Virtanen C, et al. Two subclasses of

lung squamous cell carcinoma with different gene expression profiles and prognosis

identified by hierarchical clustering and non-negative matrix factorization. Oncogene

2005;24(47):7105-13 doi 10.1038/sj.onc.1208858.

22. Qian Liu MM. <strong>Evaluation of Methods in Removing Batch Effects on RNA-seq

Data</strong>. Infect Dis Transl Med 2016;2(1):3-9 doi 10.11979/idtm.201601002.

23. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for

RNA-seq data with DESeq2. Genome Biol 2014;15(12):550 doi 10.1186/s13059-014-0550-8.

24. de Hoon MJ, Imoto S, Nolan J, Miyano S. Open source clustering software. Bioinformatics

2004;20(9):1453-4 doi 10.1093/bioinformatics/bth078.

25. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene set

enrichment analysis: a knowledge-based approach for interpreting genome-wide

expression profiles. Proc Natl Acad Sci U S A 2005;102(43):15545-50 doi

10.1073/pnas.0506580102.

26. Yoshihara K, Shahmoradgoli M, Martinez E, Vegesna R, Kim H, Torres-Garcia W, et al.

Inferring tumour purity and stromal and immune cell admixture from expression data. Nat

Commun 2013;4:2612 doi 10.1038/ncomms3612.

27. Li B, Severson E, Pignon JC, Zhao H, Li T, Novak J, et al. Comprehensive analyses of tumor

immunity: implications for cancer immunotherapy. Genome Biol 2016;17(1):174 doi

10.1186/s13059-016-1028-7.

28. Davoli T, Uno H, Wooten EC, Elledge SJ. Tumor aneuploidy correlates with markers of

immune evasion and with reduced response to immunotherapy. Science 2017;355(6322)

doi 10.1126/science.aaf8399.

29. Ghasemi A, Zahediasl S. Normality tests for statistical analysis: a guide for non-statisticians.

Int J Endocrinol Metab 2012;10(2):486-9 doi 10.5812/ijem.3505.

30. Hazra A, Gogtay N. Biostatistics Series Module 3: Comparing Groups: Numerical Variables.

Indian J Dermatol 2016;61(3):251-60 doi 10.4103/0019-5154.182416.

31. Li H, Durbin R. Fast and accurate long-read alignment with Burrows-Wheeler transform.

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 23: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

23

Bioinformatics 2010;26(5):589-95 doi 10.1093/bioinformatics/btp698.

32. do Valle IF, Giampieri E, Simonetti G, Padella A, Manfrini M, Ferrari A, et al. Optimized

pipeline of MuTect and GATK tools to improve the detection of somatic single nucleotide

polymorphisms in whole-exome sequencing data. BMC Bioinformatics 2016;17(Suppl

12):341 doi 10.1186/s12859-016-1190-7.

33. Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from

high-throughput sequencing data. Nucleic Acids Res 2010;38(16):e164 doi

10.1093/nar/gkq603.

34. Xu D, Olman V, Wang L, Xu Y. EXCAVATOR: a computer program for efficiently mining

gene expression data. Nucleic Acids Res 2003;31(19):5582-9.

35. Mermel CH, Schumacher SE, Hill B, Meyerson ML, Beroukhim R, Getz G. GISTIC2.0

facilitates sensitive and confident localization of the targets of focal somatic copy-number

alteration in human cancers. Genome Biol 2011;12(4):R41 doi 10.1186/gb-2011-12-4-r41.

36. Hundal J, Carreno BM, Petti AA, Linette GP, Griffith OL, Mardis ER, et al. pVAC-Seq: A

genome-guided in silico approach to identifying tumor neoantigens. Genome Med

2016;8(1):11 doi 10.1186/s13073-016-0264-5.

37. Rooney MS, Shukla SA, Wu CJ, Getz G, Hacohen N. Molecular and genetic properties of

tumors associated with local immune cytolytic activity. Cell 2015;160(1-2):48-61 doi

10.1016/j.cell.2014.12.033.

38. Moffitt RA, Marayati R, Flate EL, Volmar KE, Loeza SG, Hoadley KA, et al. Virtual

microdissection identifies distinct tumor- and stroma-specific subtypes of pancreatic ductal

adenocarcinoma. Nat Genet 2015;47(10):1168-78 doi 10.1038/ng.3398.

39. Banat GA, Tretyn A, Pullamsetti SS, Wilhelm J, Weigert A, Olesch C, et al. Immune and

Inflammatory Cell Composition of Human Lung Cancer Stroma. PLoS One

2015;10(9):e0139073 doi 10.1371/journal.pone.0139073.

40. Pollard JW. Tumour-educated macrophages promote tumour progression and metastasis.

Nat Rev Cancer 2004;4(1):71-8 doi 10.1038/nrc1256.

41. Schuster S, Hurrell B, Tacchini-Cottier F. Crosstalk between neutrophils and dendritic cells:

a context-dependent process. J Leukoc Biol 2013;94(4):671-5 doi 10.1189/jlb.1012540.

42. Tate MD, Brooks AG, Reading PC, Mintern JD. Neutrophils sustain effective CD8(+) T-cell

responses in the respiratory tract following influenza infection. Immunol Cell Biol

2012;90(2):197-205 doi 10.1038/icb.2011.26.

43. Shaykhiev R, Krause A, Salit J, Strulovici-Barel Y, Harvey BG, O'Connor TP, et al. Smoking-

dependent reprogramming of alveolar macrophage polarization: implication for

pathogenesis of chronic obstructive pulmonary disease. J Immunol 2009;183(4):2867-83

doi 10.4049/jimmunol.0900473.

44. Beroukhim R, Mermel CH, Porter D, Wei G, Raychaudhuri S, Donovan J, et al. The

landscape of somatic copy-number alteration across human cancers. Nature

2010;463(7283):899-905 doi 10.1038/nature08822.

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 24: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

24

45. Weir BA, Woo MS, Getz G, Perner S, Ding L, Beroukhim R, et al. Characterizing the cancer

genome in lung adenocarcinoma. Nature 2007;450(7171):893-8 doi 10.1038/nature06358.

46. Wilkerson MD, Yin X, Hoadley KA, Liu Y, Hayward MC, Cabanski CR, et al. Lung squamous

cell carcinoma mRNA expression subtypes are reproducible, clinically important, and

correspond to normal cell types. Clin Cancer Res 2010;16(19):4864-75 doi 10.1158/1078-

0432.CCR-10-0199.

47. He J, Hu Y, Hu M, Li B. Development of PD-1/PD-L1 Pathway in Tumor Immune

Microenvironment and Treatment for Non-Small Cell Lung Cancer. Sci Rep 2015;5:13110

doi 10.1038/srep13110.

48. Tsiatas M, Mountzios G, Curigliano G. Future perspectives in cancer immunotherapy. Ann

Transl Med 2016;4(14):273 doi 10.21037/atm.2016.07.14.

49. Roh W, Chen PL, Reuben A, Spencer CN, Prieto PA, Miller JP, et al. Integrated molecular

analysis of tumor biopsies on sequential CTLA-4 and PD-1 blockade reveals markers of

response and resistance. Sci Transl Med 2017;9(379) doi 10.1126/scitranslmed.aah3560.

50. Matsushita H, Vesely MD, Koboldt DC, Rickert CG, Uppaluri R, Magrini VJ, et al. Cancer

exome analysis reveals a T-cell-dependent mechanism of cancer immunoediting. Nature

2012;482(7385):400-4 doi 10.1038/nature10755.

51. Gubin MM, Zhang X, Schuster H, Caron E, Ward JP, Noguchi T, et al. Checkpoint blockade

cancer immunotherapy targets tumour-specific mutant antigens. Nature

2014;515(7528):577-81 doi 10.1038/nature13988.

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 25: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

25

Table 1. Clinical data summary

Clinical data summary

Korean (n=101)

Patient characteristics Number of patients

Age at diagnosis, years

Median 70

Range 35-83

Sex

Male 95

Female 6

Smoking status

Never-smoker 12

Former smoker 61

Current smoker 28

Median follow-up, months 45

Tumor stage

Ⅰ 51

Ⅱ 29

Ⅲ 21

Ⅳ 0

T stage

T1 27

T2 58

T3 15

T4 1

N stage

N0 62

N1 24

N2 15

Recurrancy 31

Total LN

Median 30

Range 5-66

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 26: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

26

Figure Legend

Figure 1. Identification of the subtypes of LUSC. A, Principal component analysis

(PCA) of the top 1,000 most variable genes among the subtype A, B, and noncancer

control were performed across all samples. 3D PCA scores plots of top 1,000 of

most variable genes was drawn as meshes containing cancer and noncancer control

points (left) and subtype A and B points (right) based on K-means clustering (k-

means = 2) on the first three PCs with 95% confidence interval ellipsoids. B, Ratio of

somatic synonymous and nonsynonymous mutations in mutations per megabase

and subtypes were classified. C, Name of significantly mutated genes

(left),distribution of mutations across 101 LUSC, and frequency of significantly

mutated genes (right) were plotted. D, The mutational signature revealed by somatic

mutations in whole-exome sequencing. E, Arm level CNV, focal level CNV and total

level CNV of individual sample displayed in each column. F, Significant, focally

amplified (red), and deleted (blue) regions are plotted across chromosomal locations.

Figure 2. Molecular subtypes at transcriptomic expression level. A, Heatmap

depicted 5,387 differentially expressed genes in subtype A and B, and two distinct

clustered genes in differentially expressed genes were selected. Top 10 GO gene

sets in either clusters were determined based on the rank of enrichment –log10(P

value) of pathway and the matched significance criteria (P-value < 0.05 and FDR q

value < 0.1) B, Network visualization based on gene enrichment analysis. Nodes

represent subtype A (red) and B (blue) networks, and green edges represent genetic

overlap between networks using Gene Ontology (GO) enrichment analysis

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 27: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

27

(permutation = 1,000). The size of each node reflects the number of genes included

in the network. Genes in significant networks were annotated and grouped with

simplified GO terms. Networks meeting the cut-off conditions detailed at the bottom

of the figure (right) were visualized with the Enrichment Map plugin for Cytoscape (P

≤ 0.05, FDR q value ≤ 0.1, and similarity ≤ 0.5).

Figure 3. The immune landscape of the microenvironment in LUSC. A, The heatmap

depicted the expression of stromal genes and the tumor microenvironmental factor

(cytolytic activity (CYT), purity, immune, stromal), which divided into three groups,

describing activated stroma-rich samples (subtype A), normal and activated stroma-

rich samples (subtype B), and normal stroma-rich samples (noncancer control) gene

expression. B, Scatterplots between stromal and immune scores with tumor purity

gradient were shown, and correlation coefficient indicated by each of the subtypes.

The color grading corresponds to the tumor purity, indexed as shown on the color

bar at the bottom right of the figure. C, The abundance of infiltrating B cells, CD4+ T

cells, CD8+ T cells, neutrophil, macrophages, and dendritic cells in two subtypes was

estimated, and each P value was indicated by each of the subtypes (Mann-Whitney

U test and Kruskal-Wallis test). Box represents the median (thick line) and the

quartiles (line). D, The median centered and log2 transformed expression level

(log2fpkm+1) of M1 and M2 signature genes in subtypes and noncancer control was

box-plotted with corresponding Mann-Whitney U and Kruskal-Wallis test results. Box

represents the median (thick line) and the quartiles (line).

Figure 4. Impacts of genomic alterations on tumor micro environments in LUSC. A,

The level of total CNVs in cell cycle related gene sets are displayed across classes

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 28: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

28

(left) and the level of total CNVs in immune related gene sets are displayed across

classes (right). B, The association between genomic alteration load (the number of

mutations, the level of CNV) and cell cycle score is shown (left) and the association

between genomic alteration load and immune score is shown (right). C, The

correlations between genomic alterations (CNVs, mutations) and immune related

properties were displayed. D, The cell cycle scores are shown in each sample and

the high score densities and low score densities of cell cycle score are plotted on the

x axis and y axis. E, The immune scores are shown in each sample and the high

score densities and low score densities of immune score are plotted on the x axis

and y axis. F, The correlation between CNV level (arm, focal) and cell cycle score or

immune score are displayed across subtype.

Figure 5. SCNV in LUSC immune pathway. The diagrams show the genes with

SCNV in the four immune related pathways and the percentage of samples with

SCNV in immune related pathway across classes; copy number deletion (blue), copy

number amplification (pink), the percentage of samples with SCNV in subtype A (red

box), the percentage of samples with SCNV in subtype B (blue box)

Figure 6. The expression pattern of immune checkpoints in LUSC. A, The

expression of immune checkpoint genes was analyzed between subtypes in our

LUSC samples (n = 101) and TCGA LUSC cohort (n = 431), respectively. The

comparison of median centered and log2 transformed expression (log2fpkm) of

immune checkpoint genes was performed between subtypes, and P-value was

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 29: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

29

indicated by Mann-Whitney U or unpaired Student t test based on the normality. A

two-color scale was used with blue indicating low expression values and red

representing highly expressed genes. B, The median centered and log2 transformed

expression level (log2fpkm) of immune checkpoint genes in both cohorts was box-

plotted with corresponding Kruskal-Wallis test results. Box represents the median

(thick line) and the quartiles (line).

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 30: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 31: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 32: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 33: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 34: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 35: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453

Page 36: Whole exome and transcriptome analyses integrated with … · analyzer (Agilent Inc.). The . RNA-seq. libraries were prepared as previously . described (18). For. whole . exome sequencing,

Published OnlineFirst May 2, 2018.Cancer Immunol Res   Jeong-Sun Seo, Ji Won Lee, Ahreum Kim, et al.   cell carcinomamicroenvironmental immune signatures of lung squamous Whole exome and transcriptome analyses integrated with

  Updated version

  10.1158/2326-6066.CIR-17-0453doi:

Access the most recent version of this article at:

  Material

Supplementary

  C1

http://cancerimmunolres.aacrjournals.org/content/suppl/2018/05/02/2326-6066.CIR-17-0453.DAccess the most recent supplemental material at:

  Manuscript

Authorbeen edited. Author manuscripts have been peer reviewed and accepted for publication but have not yet

   

   

   

  E-mail alerts related to this article or journal.Sign up to receive free email-alerts

  Subscriptions

Reprints and

  [email protected] at

To order reprints of this article or to subscribe to the journal, contact the AACR Publications

  Permissions

  Rightslink site. Click on "Request Permissions" which will take you to the Copyright Clearance Center's (CCC)

.http://cancerimmunolres.aacrjournals.org/content/early/2018/05/02/2326-6066.CIR-17-0453To request permission to re-use all or part of this article, use this link

on July 20, 2020. © 2018 American Association for Cancer Research. cancerimmunolres.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 2, 2018; DOI: 10.1158/2326-6066.CIR-17-0453