Nomogram Integrating Genomics with Clinicopathological Features Improves Prognosis Prediction for Colorectal Cancer

Yongfu Xiong1†, Wenxian You1†, Min Hou2†, Linglong Peng1, He Zhou1, Zhongxue Fu1*

1Department of Gastrointestinal Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China.

2Department of Oncology, Affiliated Hospital of North Sichuan Medical College, Nanchong, China.

†Equal contributor

*Correspondence: Zhongxue Fu, Ph.D. Department of Gastrointestinal Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing 400016, China. E-mail: [email protected]

Running title

A Nomogram to Predict Colorectal Cancer Prognosis

Keywords: CRC, genomic signature, nomogram, clinicopathological features

Disclosure of potential conflicts of interest

The authors have no potential conflicts of interest to disclose.

Financial support

This work was financially supported by a grant from the National Natural Science Foundation of China (grant no. 81572319; project recipient: Zhongxue Fu).

Downloaded from mcr.aacrjournals.org on July 25, 2020. © 2018 American Association for Cancer Research.

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063

Abstract

The current tumor staging system is insufficient for predicting the outcomes of patients with colorectal cancer (CRC) because of its phenotypic and genomic heterogeneity. Integrating gene expression signatures with clinicopathological factors may yield a predictive accuracy exceeding that of the currently available system. Twenty-seven signatures that used gene expression data to predict CRC prognosis were identified and re-analyzed using bioinformatic methods. Next, clinically annotated CRC samples (n = 1710) with corresponding expression profiles were pooled to evaluate the value of these signatures in predicting a patient's probability of cancer recurrence and to establish a clinicopathologic-genomic nomogram. Only 2 of the 27 signatures evaluated showed a significant association with prognosis and provided a reasonable prediction accuracy in the pooled cohort (HR = 2.46, 95% CI = 1.183-5.132, p < 0.001, AUC = 60.83; HR = 2.33, 95% CI = 1.218-4.453, p < 0.001, AUC = 71.34). By integrating the above signatures with prognostic clinicopathological features, a clinicopathologic-genomic nomogram was cautiously constructed. The nomogram successfully stratified CRC patients into three risk groups with remarkably different disease-free survival (DFS) rates and further stratified stage II and III patients into distinct risk subgroups. Importantly, among patients receiving chemotherapy, the nomogram determined that those in the intermediate- (HR = 0.98, 95% CI = 0.255-0.679, p < 0.001) and high-risk (HR = 0.67, 95% CI = 0.469-0.957, p = 0.028) groups had favorable responses.

Implications: These findings offer evidence that genomic data provide independent and complementary prognostic information, and that incorporating this information refines the prognosis of CRC.

Introduction

Colorectal carcinoma (CRC) is the third most commonly diagnosed malignant disease and the second leading cause of cancer death worldwide (1). Despite advances in CRC screening, diagnosis, and curative resection, its prognosis is not entirely satisfactory: CRC is a well-recognized heterogeneous disease, so optimal management and individualized therapy strategies still present great challenges. Currently, treatment decisions and prognoses for patients with CRC are primarily driven by the tumor-node-metastasis (TNM) staging system, which is based merely on anatomical information (2). Previous studies revealed that patients with stage I CRC have a 5-year survival rate of approximately 93%, which decreases to approximately 80% for patients with stage II disease and to 60% for patients with stage III disease (3). However, discrepancies in the survival outcomes of patients at the same stage and receiving similar treatments are commonly observed.

More importantly, according to the current TNM stage, adjuvant therapy is generally recommended for all patients with stage III disease (4). However, patients with T1-2N1M0 tumors (stage IIIA) have significantly higher survival rates than those with stage IIB tumors (3), suggesting that identifying the patients at high risk of recurrence, who are most likely to benefit from adjuvant therapy, is critical. Therefore, substantial effort has been made to discover new clinicopathological indicators to optimize the current staging system. To date, several clinicopathological features with prognostic value, such as emergency presentation, adjacent organ involvement (T4), intestinal perforation or obstruction, high tumor grade, inadequate sampling of lymph nodes and lymphatic/vascular invasion, have been applied in clinical practice to identify CRC patients at high risk of recurrence. However, these factors are all relatively weak and insufficient to identify CRC patients who may benefit from adjuvant therapy, which leads to potential under- or over-treatment (5).

CRC biological behavior (e.g., recurrence, metastasis, drug resistance) is a tightly regulated process that requires the aberrant expression of related genes to empower carcinoma cells with their corresponding abilities. Accordingly, before clinicopathologically detectable changes occur, the underlying alterations necessary for recurrence have already occurred in the primary CRC, providing the possibility to develop robust prognostic tools by using multiple genes in combination (6). During the last decade, gene signatures have shown great promise in predicting the long-term outcomes and treatment responses of individual patients (7). Of note, because of their outstanding ability to predict the prognoses of patients with breast cancer, multigene assays such as Oncotype DX and MammaPrint have been approved by the US Food and Drug Administration and are available in routine clinical practice (8). Genomic signatures to predict the prognosis of CRC have also been developed continually over the past 10 years, but none is used commercially in the clinic (9). For example, Agesen et al. established a 13-gene expression classifier, ColoGuideEx, for prognosis prediction specific to patients with stage II CRC (10). Based on the essential role of lipid metabolism in carcinogenesis, Teodoro et al. identified a group of 4 genes that predict survival in intermediate-stage colon cancer patients, allowing the delineation of a high-risk group that may benefit from adjuvant therapy. In addition, Smith (11), Chen (12), Oh (13) and Popovici et al. (14) published their own signatures based on different mechanisms. Despite this large number of studies, no signature has remained credible in either meta-analyses or prospective trials (15). Nevertheless, although their prediction accuracies are low, these published signatures show moderate clinical usefulness (15). Moreover, when analyses are confined to a specific CRC stage, promising risk-stratification results have also been reported. Therefore, it is necessary to investigate the predictive ability of these published signatures on comprehensive large-scale datasets and to identify whether any can be used to guide treatment decisions clinically.

Currently, although doubts about the predictive value of clinicopathological features are increasing, these features still provide the most reliable guidelines for the prognostication and treatment of CRC (16). Thus, we hypothesized that integrating genomic signatures with clinicopathological features in a single model would yield a predictive accuracy exceeding that of the currently available prognostic system. A nomogram is a statistical prediction model that combines multiple prognostic factors to make intuitive, graphical and individualized predictions (17). Here, we aimed to apply a systematic approach to evaluate the clinical usefulness of CRC-related signatures and then construct a composite clinicopathologic-genomic nomogram by integrating factors with potential prognostic value in a training set. Moreover, using another independent set, the capacity of the nomogram to stratify CRC patients most likely to benefit from chemotherapy was further validated.

Materials and Methods

Patients and prognostic signatures

To identify gene expression data arrayed using the Affymetrix platform with clinically annotated data, we systematically searched Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/), ArrayExpress (http://www.ebi.ac.uk/arrayexpress/) and related literature published through Aug 1, 2017, with the keywords “colorectal cancer”, “CRC”, “colon cancer”, “survival”, “relapse”, “recurrence”, “prognostic”, “prognosis” and “outcomes”. For datasets whose clinical data did not accompany their gene expression profiles, we either searched the supplements or contacted one or more of the investigators to obtain the missing information. Datasets with small sample sizes and duplicated datasets were excluded. Raw microarray data and the corresponding clinical data of the remaining datasets were retrieved and manually organized when available. Only patients diagnosed with colorectal cancer for whom clinicopathological and survival information was available were included. Patients with follow-up or survival times of less than 1 month, as well as patients with missing or insufficient data on age, local invasion, lymph node metastasis, distant metastasis and TNM stage, were excluded from subsequent processing. Eventually, all patients satisfying the inclusion criteria were combined and summarized in Supplementary Tables S1 and S2.

Expression data processing

For raw CEL files available from Affymetrix microarrays, the data were normalized and annotated using the MAS5 algorithm and the corresponding annotation files from R Bioconductor to obtain summarized values for each probeset; otherwise, pre-processed data as provided by the contributors were used. For each sample in every dataset, measurements without a gene annotation were excluded, and multiple probesets corresponding to a single gene were summarized into a gene symbol by taking the most variable probeset, as measured by the interquartile range (IQR).
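The probeset-to-gene collapsing step can be illustrated with a minimal sketch. The probeset identifiers, expression values, and function names below are hypothetical, and the quartile convention (Python's "inclusive" method) is an assumption; the quartile definition used by the authors' R workflow is not stated in the text.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range (Q3 - Q1) of one probeset across samples."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def collapse_probesets(expr, probe_to_gene):
    """For each gene symbol, keep the probeset with the largest IQR.

    expr:          dict probeset_id -> list of expression values (one per sample)
    probe_to_gene: dict probeset_id -> gene symbol, or None if unannotated
    """
    best = {}  # gene symbol -> (iqr, probeset_id)
    for probe, values in expr.items():
        gene = probe_to_gene.get(probe)
        if gene is None:  # measurements without a gene annotation are excluded
            continue
        spread = iqr(values)
        if gene not in best or spread > best[gene][0]:
            best[gene] = (spread, probe)
    return {gene: expr[probe] for gene, (_, probe) in best.items()}


# Toy data: hypothetical probeset IDs and values, for illustration only.
expr = {
    "p1": [5.0, 5.1, 5.2, 5.0],  # FN1, low variability
    "p2": [2.0, 8.0, 4.0, 9.0],  # FN1, high variability (kept)
    "p3": [7.0, 7.5, 6.5, 7.2],  # POSTN, only probeset
    "p4": [1.0, 1.0, 1.0, 1.0],  # unannotated (dropped)
}
probe_to_gene = {"p1": "FN1", "p2": "FN1", "p3": "POSTN", "p4": None}
collapsed = collapse_probesets(expr, probe_to_gene)
```

Choosing the most variable probeset, rather than averaging, keeps one measured profile per gene instead of a blend of probesets with different hybridization behavior.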

Identification and analysis of gene signatures potentially related to CRC prognosis

Gene signatures potentially related to CRC prognosis were systematically retrieved from PubMed. The search was restricted to recent papers (January 2004 to August 2017) to increase validity. The selection criteria are detailed in Supplementary Figure S1. Articles that provided a list of differentially expressed genes in primary tumor samples associated with CRC prognosis were included in our study. Studies based on tissue microarrays and those focused exclusively on differences between stages or between primary tumors and metastases were excluded. The signatures finally included in our analysis are described in Table 1 (detailed descriptions are provided in Supplementary Table S3). The probesets or genes of those signatures were re-annotated using the SOURCE web tool to resolve retired gene symbols and differences between the tested platforms. The re-annotated genes were then subjected to biological function enrichment analysis: the online analytical tool DAVID (Database for Annotation, Visualization and Integrated Discovery) (18) was used to identify enriched gene ontology (GO) functions and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. GO terms and KEGG pathways with enrichment false discovery rate (FDR) values less than 0.05 were selected for further analysis. In addition, genes from the abovementioned signatures were mapped and imported into the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) 9.1 database, which queried the human protein-protein interaction network for interactions between effective linkers and seeds to construct a functional subnetwork. Cluster analyses were performed using correlation distance metrics and the average linkage agglomeration algorithm (R package nclust version 1.9.0; ref. 34). Non-negative matrix factorization (NMF) was performed with the NMF package (version 0.20.5) and standard strategies.

Subclass prediction of CRC patients in the training set

The re-processed microarray datasets, which represent the genomic features of each individual, were classified with the prognostic signatures identified above using the nearest template prediction (NTP) method (19) as implemented in GenePattern software (Broad Institute of Harvard and MIT, Boston, MA) (20). NTP requires only a list of pre-specified template signature genes and a dataset to be tested; it does not need a corresponding training dataset to capture good and poor gene expression patterns in each sample. Briefly, a template containing representative expression patterns of the signature genes was defined based on the published gene signatures from their respective studies. The proximity of each sample's signature gene expression pattern to the template was evaluated by calculating the cosine distance. The FDR was used to correct the p-values for multiple hypothesis testing, and prediction analysis was performed separately for each dataset. A sample was predicted as good or poor for a given signature when the prediction FDR was below 0.05, and the remaining samples, which had intermediate expression levels of the signature genes, were classified as uncertain.
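The logic of this classification step can be sketched in a few lines. This is a simplified illustration and not the GenePattern implementation: it assumes expression values are already standardized per gene, uses +1/-1 templates for good- and poor-prognosis genes, estimates a nominal p-value by drawing random gene sets from the same sample, and applies a Benjamini-Hochberg correction across samples. All gene names, values, and function names are hypothetical.

```python
import math
import random


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def ntp_predict(sample, good_genes, poor_genes, n_perm=1000, seed=0):
    """Return (nearest template label, nominal p-value) for one sample.

    sample: dict gene -> standardized expression value (assumed z-scores).
    """
    rng = random.Random(seed)
    genes = [g for g in list(good_genes) + list(poor_genes) if g in sample]
    x = [sample[g] for g in genes]
    good_t = [1.0 if g in good_genes else -1.0 for g in genes]
    poor_t = [-t for t in good_t]
    d_good, d_poor = cosine_distance(x, good_t), cosine_distance(x, poor_t)
    if d_good <= d_poor:
        label, d_obs, template = "good", d_good, good_t
    else:
        label, d_obs, template = "poor", d_poor, poor_t
    # Null distribution: distance of randomly drawn genes to the same template.
    pool = list(sample.values())
    hits = sum(
        cosine_distance(rng.sample(pool, len(genes)), template) <= d_obs
        for _ in range(n_perm)
    )
    return label, (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted, running = [0.0] * m, 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running = min(running, pvals[i] * m / rank)
        adjusted[i] = running
    return adjusted


def classify_cohort(samples, good_genes, poor_genes, fdr_cutoff=0.05):
    """Label each sample good/poor, or 'uncertain' if its FDR is not below the cutoff."""
    results = [ntp_predict(s, good_genes, poor_genes) for s in samples]
    fdrs = benjamini_hochberg([p for _, p in results])
    return [lab if q < fdr_cutoff else "uncertain" for (lab, _), q in zip(results, fdrs)]


# Toy cohort: 6 hypothetical signature genes plus 50 background genes.
good, poor = ["G1", "G2", "G3"], ["P1", "P2", "P3"]
background = {f"B{i}": 0.1 * ((i % 7) - 3) for i in range(50)}


def make_sample(signature_values):
    sample = dict(background)
    sample.update(zip(good + poor, signature_values))
    return sample


cohort = [
    make_sample([2.0, 1.5, 1.8, -1.7, -2.1, -1.6]),   # matches the good template
    make_sample([-1.9, -1.4, -2.2, 1.6, 2.0, 1.8]),   # matches the poor template
    make_sample([0.1, -0.2, 0.0, 0.1, -0.1, 0.2]),    # no clear pattern
]
calls = classify_cohort(cohort, good, poor)
```

Because the templates come from the published gene lists alone, no labeled training cohort is needed, which is why the same procedure can be repeated for each signature and each dataset.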

Concordance among these predictions was evaluated using unsupervised clustering according to Cramér's V coefficient of the paired prediction overlap, as previously described (21). Cramér's V values range from 0 to 1, with values between 0.36 and 0.49 indicating substantial correlation and values higher than 0.5 indicating strong correlation.
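For reference, Cramér's V for two sets of categorical predictions can be computed from the chi-square statistic of their contingency table as V = sqrt(chi2 / (n (k - 1))), where n is the number of samples and k is the smaller of the number of row and column categories. The sketch below uses hypothetical labels and does not apply any bias correction; whether the cited method used one is not stated here.

```python
import math
from collections import Counter


def cramers_v(labels_a, labels_b):
    """Cramér's V between two equally long lists of categorical labels."""
    n = len(labels_a)
    table = Counter(zip(labels_a, labels_b))  # observed cell counts
    row, col = Counter(labels_a), Counter(labels_b)
    chi2 = 0.0
    for r in row:
        for c in col:
            expected = row[r] * col[c] / n
            chi2 += (table[(r, c)] - expected) ** 2 / expected
    k = min(len(row), len(col))
    return math.sqrt(chi2 / (n * (k - 1))) if k > 1 else 0.0


# Hypothetical predictions from two signatures on the same four patients.
identical = cramers_v(["good", "good", "poor", "poor"],
                      ["good", "good", "poor", "poor"])
unrelated = cramers_v(["good", "good", "poor", "poor"],
                      ["good", "poor", "good", "poor"])
```

A pairwise matrix of such coefficients is what the unsupervised clustering of signatures would take as input.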

Development, comparison, and validation of the prognostic models

All patients included in this study were randomly separated into training and validation sets. Disease-free survival (DFS) was defined as the interval between the day that surgery was performed and the day that recurrence was first detected. If recurrence was not diagnosed, the date of death due to CRC or of last follow-up was used. Overall survival (OS) was calculated from the date of diagnosis or surgery to the date of death or last follow-up. Continuous variables were expressed as median (IQR) or median (range) values, and group comparisons were performed by Student's t test or the Wilcoxon rank sum test. Categorical variables were expressed as percentages, and group comparisons were performed by Pearson's χ² test or Fisher's exact test. Median follow-up was calculated using the reverse Kaplan-Meier method (22). In the training set, survival curves for different variable values were generated using Kaplan-Meier estimates and compared using the log-rank test. Variables that achieved significance at p values less than 0.05 were entered into a multivariate Cox regression model to identify independent prognostic factors. Based on the multivariate analysis results, a prognostic nomogram was established using a backward stepwise Cox proportional hazards model. The entire population was divided into three risk groups (high, intermediate, low) according to the tertiles of the total scores given by the established nomogram in the training set. Concordance index (c-index) values were used to measure discrimination performance. Finally, for external validation, the total score of each patient in the validation set was calculated according to the proposed nomogram to verify its generalizability. The three risk groups were determined by the tertiles defined in the training set, and the respective Kaplan-Meier survival curves were delineated. All statistical tests performed were two-sided, and p values less than 0.05 were considered statistically significant.
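Two of the steps above, deriving tertile cutoffs in the training set and measuring discrimination with the c-index, can be sketched as follows. The scores, follow-up times, and function names are hypothetical, the quantile convention is an assumption, and the c-index shown is the basic Harrell estimator without the refinements a survival package would add.

```python
from statistics import quantiles


def tertile_cutoffs(train_scores):
    """Lower and upper tertile boundaries of the training-set total scores."""
    lower, upper = quantiles(train_scores, n=3, method="inclusive")
    return lower, upper


def risk_group(score, cutoffs):
    """Assign a risk group using cutoffs fixed in the training set."""
    lower, upper = cutoffs
    if score <= lower:
        return "low"
    if score < upper:
        return "intermediate"
    return "high"


def harrell_c_index(times, events, scores):
    """Harrell's c-index: share of comparable pairs ranked correctly.

    A pair (i, j) is comparable when patient i had an event before time j;
    it is concordant when patient i also has the higher risk score.
    """
    concordant = tied = comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1
                elif scores[i] == scores[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / comparable


# Hypothetical training scores; cutoffs are then reused on validation patients.
cutoffs = tertile_cutoffs([0, 5, 10, 15, 20, 25, 30])
validation_groups = [risk_group(s, cutoffs) for s in (4, 15, 25)]

# Hypothetical follow-up: months, event indicator (1 = recurrence), risk score.
c_index = harrell_c_index(times=[5, 10, 15, 20],
                          events=[1, 1, 0, 1],
                          scores=[30, 20, 10, 5])
```

Freezing the cutoffs from the training set before touching the validation set is what keeps the validation step an honest check of the risk grouping.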

Results

Screening and analysis of eligible samples and published signatures

To comprehensively evaluate the clinical usefulness of published signatures, we retrieved all of them from the literature. Accordingly, 81 signatures published as valid prognostic tools in CRC were obtained from 507 potentially related articles. Among these signatures, those whose genes were not clearly described in their respective studies or whose prognostic value was investigated based on tissue arrays, immune cells, circulating blood or RT-PCR results were further excluded (see flowchart in Supplementary Figure S1 for details). Finally, a total of 27 signatures from 26 studies met all of the inclusion criteria and were retained for subsequent analysis (Supplementary Table S3). Almost all of these signatures were obtained from experiments based on CRC tissues; the one exception was derived from an animal experiment (23). The 27 signatures contained 1274 total genes, among which 1041 were unique, and the signature sizes ranged from 4 to 264 genes (Figure 1A). The top overlapping genes in these signatures were FN1 (ten times) and CYP1B1 and POSTN (six times each), followed by 7 genes (five times), 7 genes (four times), 29 genes (three times), and 107 genes (twice). As Figure 1A clearly shows, none of these genes appeared in all signatures. In addition, some genes repeated in different signatures even presented opposite prognostic values. This lower-than-expected overlap accounted, at least in part, for the low reproducibility of the signatures and their failure in clinical practice.

Accumulating evidence strongly suggests that outcomes of CRC patients, such as recurrence and metastasis, are tightly regulated by aberrantly expressed genes via specific biological processes (16). To gain insight into the underlying biological meaning of the gene set obtained from the above signatures, enrichment (GO and KEGG) and protein-protein interaction (PPI) analyses were conducted (Supplementary Table S4). Figure 1B-D shows that specific GO categories closely related to CRC prognosis, such as cell cycle (FDR < 0.0001), regulation of cell proliferation (FDR < 0.0001), extracellular matrix (FDR < 0.0001), and chemokine activity (FDR < 0.0001), were significantly enriched. Additionally, some KEGG pathways well known to be involved in CRC metastasis and recurrence were enriched, including colorectal cancer metastasis signaling (FDR < 0.0001), Wnt/β-catenin signaling (FDR = 0.0002), VEGF signaling (FDR = 0.0008), and chemokine signaling (FDR = 0.0021) (Figure 1E). More importantly, a visualized network of these gene interactions revealed that the abovementioned pathways not only form highly integrated modules but also play essential roles in the entire network (Figure 1F). These findings suggested that, notwithstanding their low reproducibility in systematic testing and clinical validation, genes obtained from CRC signatures do have potential prognostic value.

Furthermore, to assess the global performance of these signatures and construct a composite nomogram, publicly available datasets with whole-genome profiles and corresponding clinical information for CRC patients were screened and downloaded (see flowchart in Supplementary Figure S1 for details). Ultimately, 1710 CRC patients from 9 cohorts (Supplementary Table S5) met the set criteria and were retained. Although the profiles were obtained using different technologies and were based on fresh or formalin-fixed, paraffin-embedded (FFPE) specimens as source material, they showed no evidence of cohort-bias clustering by principal component analysis (PCA, Supplementary Figure S2A-C) or hierarchical clustering (Supplementary Figure S2D), suggesting that patients from different cohorts in the present study could be pooled. Moreover, to balance the baseline characteristics, all eligible patients were randomly divided into a training set (n = 855) and a validation set (n = 855). The demographic and major clinicopathological characteristics of the patients are summarized in Supplementary Table S2. As expected, no significant differences in these characteristics were observed between the training and validation sets.

Global prognostic performance of the published signatures

To determine the correlations between signatures and CRC patient outcomes, the prognostic performances of the 27 signatures were evaluated in the training set using a modified NTP method as previously described (19). Among the 27 signatures evaluated, all except one (Teodoro's signature (24), which contained only poor-prognosis genes) were able to confidently stratify patients (FDR < 0.05) into good and poor subgroups. Table 1 and Figure 2A summarize the prediction results obtained for each of the 855 patients. Popovici's signature [34] was the most prevalent prediction in the training set (83.6%; 615 of 855), whereas Agesen's signature [42] was confidently identified in only 19.9% (167 of 855) of CRC patients. Of note, conflicting prognostic outcomes were commonly observed in the training set, and only 34% of patients had consistent results (good or poor prognosis) in more than 5 signatures. However, as demonstrated in Figure 2A, the overall tendency of the NTP analysis was roughly consistent with previous studies (25). In addition, global Cramér's V coefficients were calculated for each signature to gain insight into their concordance. Unsupervised clustering based on these coefficients indicated a substantial association among these signatures (Figure 2B). Notably, 5 signatures obtained from the studies of Oh (13), Minh (11), Chang (9), Jiang (26), and Fritzmann et al. (27) were at least moderately correlated with each other. Moreover, another 4 signatures (Toshiaki's, Aziz's, Chen's, Xu's) also clustered together with similar coefficients. These findings are largely consistent with the initial purpose of each signature, which further confirms the potential prognostic value of these signatures.

We then applied Cox regression analysis to assess whether any signatures were significantly associated with CRC-related recurrence/progression independently of clinical features in the training set. The univariate and clinical factor-adjusted multivariate analysis results are displayed in Supplementary Table S6 and Figure 2C. Univariate analyses revealed that age (HR = 1.12, 95% CI = 1.021-1.269, p = 0.048), local invasion (T stage; HR = 2.21, 95% CI = 1.139-7.029, p = 0.027), lymph node metastasis (N stage; HR = 2.28, 95% CI = 1.189-3.784, p = 0.021), distant metastasis (M stage; HR = 2.46, 95% CI = 1.363-4.453, p = 0.003), and TNM stage (HR = 2.15, 95% CI = 1.165-3.971, p = 0.014) were significantly associated with poor prognosis. With respect to genomic factors, however, only 8 signatures had significant effects on DFS (Figure 2C). Next, we used multivariate Cox analysis with age, local invasion, lymph node metastasis, distant metastasis and TNM stage as covariates to evaluate the predictive values of the 8 signatures. As shown in Figure 2C, after adjustment for each covariate, Popovici's signature (HR = 1.92, 95% CI = 1.298-2.841, p = 0.001) and Fritzmann's signature (HR = 2.16, 95% CI = 1.144-4.069, p < 0.0001) remained significant predictors of DFS in the training set, which revealed that these two signatures are independent prognostic factors.

Construction and comparison analysis of the composite nomogram

Accurately predicting DFS in patients with CRC is important for counseling,

follow-up planning, and selection of appropriate adjuvant therapy. Thus, to provide

the clinician with a quantitative and user-friendly method to generate individualized

predictions, we constructed a nomogram that integrated the genomic signatures

identified above with clinicopathological features (Figure 3A). Calibration plots

revealed excellent agreement between the nomogram-predicted probabilities and the

actual observations of 1-, 3- and 5-year DFS (Figure 3B). The nomogram

on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063

Page 12: Nomogram Integrating Genomics with Clinicopathological ... · genomic nomogram was cautiously . constructed. The nomogram successfully stratified CRC patients into three risk. groups

12 / 26

demonstrated that distant metastasis made the largest contribution to prognosis,

followed by the TNM stage and lymph node metastasis. Age and Popovici’s signature

had moderate effects on survival rate. Each category within these variables was

assigned a point on the top scale based on the coefficients from Cox regression. By

summing all of the assigned points for the seven variables and drawing a vertical line

between total points and the survival probability axis, we easily obtained the

estimated probabilities of 1-, 3- and 5-year DFS. The risk score cutoff values (≤10.6,

10.6-21.2 and ≥21.2) were selected in terms of total points to stratify patients into

roughly equal tertiles in the training set, which accurately divided patients into low-

(reference), intermediate- (HR = 2.227, 95% CI = 1.507-3.291, p < 0.001), and

high-risk (HR = 6.787, 95% CI = 4.786-9.624, p < 0.001) subgroups with

significantly different DFS rates (Figure 3C, D).
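The point assignment and risk grouping described above can be illustrated with a minimal Python sketch. The coefficients and category names below are hypothetical placeholders rather than the fitted values behind Figure 3A, and scaling the largest effect to 10 points is an assumed convention. Only the cutoffs of 10.6 and 21.2 are taken from the text; how a total exactly equal to a cutoff is handled is also an assumption.

```python
# Hypothetical Cox coefficients (log hazard ratios) per category.
# These are illustrative placeholders, NOT the values fitted in the study.
COEFS = {
    "distant_metastasis": {"M0": 0.0, "M1": 2.0},
    "lymph_node_metastasis": {"N0": 0.0, "N1": 0.5, "N2": 1.0},
    "popovici_signature": {"good": 0.0, "poor": 0.8},
}


def build_point_table(coefs, max_points=10.0):
    """Convert coefficients to points so the widest-ranging variable spans max_points."""
    widest = max(max(lv.values()) - min(lv.values()) for lv in coefs.values())
    scale = max_points / widest
    table = {}
    for variable, levels in coefs.items():
        baseline = min(levels.values())  # lowest-risk category gets 0 points
        table[variable] = {lvl: (b - baseline) * scale for lvl, b in levels.items()}
    return table


def total_points(patient, table):
    """Sum the points of the category observed for each variable."""
    return sum(table[variable][patient[variable]] for variable in table)


def risk_group(points, low_cut=10.6, high_cut=21.2):
    """Assign a risk group from total points using the cutoffs quoted in the text."""
    if points <= low_cut:
        return "low"
    if points < high_cut:
        return "intermediate"
    return "high"


table = build_point_table(COEFS)
patient = {
    "distant_metastasis": "M1",
    "lymph_node_metastasis": "N2",
    "popovici_signature": "poor",
}
print(total_points(patient, table), risk_group(total_points(patient, table)))
```

In this toy setting the three variables contribute 10, 5, and 4 points, so the example patient falls in the intermediate group; with the seven variables of the actual nomogram the totals would span a wider range.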

Although the TNM stage in combination with other clinical factors is a

well-recognized prediction system for CRC prognosis (28), its effectiveness urgently

needs to be increased. To determine whether the genomic signatures added

prognostic value to the current system, time-dependent receiver operating

characteristic (ROC) analysis was applied to compare the performances between the

nomogram, clinicopathological factors and genomic signatures (Figure 3E).

As expected, the nomogram achieved the greatest area under the ROC curve (AUCs at

5-year DFS = 78.07, 71.40, 71.34, and 60.83, p < 0.05), suggesting that the integrated

clinicopathological genomic nomogram had a prognostic performance superior to

those of clinicopathological and genomic information by themselves.
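As a rough illustration of how an AUC at a fixed time horizon (here 5 years) can be computed for a risk score, the following Python sketch treats patients with an event by the horizon as cases and patients still event-free after the horizon as controls. It is a simplification under stated assumptions: patients censored before the horizon are simply dropped, whereas dedicated time-dependent ROC estimators account for them. The function name and toy data are hypothetical, and the result is a proportion, whereas the text reports AUCs as percentages.

```python
def auc_at_horizon(times, events, scores, horizon):
    """Simplified AUC at a time horizon.

    cases:    event observed at or before the horizon
    controls: followed beyond the horizon (event-free at the horizon)
    Patients censored before the horizon are excluded (a simplification).
    """
    cases = [s for t, e, s in zip(times, events, scores) if e == 1 and t <= horizon]
    controls = [s for t, e, s in zip(times, events, scores) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    # Probability that a random case has a higher score than a random control;
    # ties count as one half.
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Toy data: follow-up in years, event indicator (1 = recurrence), nomogram total points.
times = [1, 2, 6, 7, 3]
events = [1, 1, 0, 0, 0]
scores = [25, 10, 5, 12, 30]
print(auc_at_horizon(times, events, scores, horizon=5))
```

Computing this quantity separately for the nomogram score, the clinicopathological score, and each genomic signature gives the kind of side-by-side comparison summarized in Figure 3E.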

Validating the nomogram for stratifying CRC patient risks

To validate the correlation between the nomogram score (total points) and DFS

statuses of patients with CRC, Kaplan-Meier analysis and log-rank tests based on the

same cutoff value were conducted on the validation set. As shown in Figure 4A and B,

applying the clinicopathologic-genomic nomogram stratified patients into three

distinct risk subgroups with significantly different DFS rates. In addition, the

prognostic accuracy of the nomogram was remarkably better than those of both

clinicopathological and genomic information by themselves (Figure 4C), which was

consistent with the results obtained from the training set. Disagreements on

potentially appropriate candidates for adjuvant therapy prevail, especially for patients

belonging to AJCC stages II and III. In the present study, among patients with AJCC

stage II or stage III disease, the risk groups defined by the nomogram had significantly

different Kaplan-Meier survival curves (Figure 4D, E).
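The Kaplan-Meier curves used to compare the nomogram-defined risk groups can be sketched with a small product-limit estimator. This is a minimal illustration in plain Python with hypothetical toy data; it covers only the survival estimate for one group and does not implement the log-rank test.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times:  follow-up time per patient
    events: 1 if the event (e.g. recurrence) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving  # events and censored patients both leave the risk set
    return curve


# Toy group of four patients; the second is censored at time 2.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Running the estimator once per risk group (low, intermediate, high) yields one curve per group, which is the form of comparison shown in Figure 4.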

Moreover, to ensure adequate statistical power, patients with available

chemotherapy information were pooled regardless of the set to which they

belonged (training set or validation set). Intriguingly, we noted that adjuvant

chemotherapy did not enhance OS in all 869 patients (HR = 0.98, 95% CI =

0.769-1.258, p = 0.873; Figure 5A) or in patients who were defined as low risk by the

nomogram (HR = 0.93, 95% CI = 0.515-1.695, p = 0.824; Figure 5B). However,

patients defined by the nomogram as intermediate (HR = 0.98, 95% CI = 0.255-0.679,

p < 0.001; Figure 5C) or high risk (HR = 0.67, 95% CI = 0.469-0.957, p = 0.028;

Figure 5D) had favorable responses to adjuvant chemotherapy. Taken together, these

results suggested that the clinicopathologic-genomic nomogram has the

discriminatory power to identify CRC patients who are suitable candidates for

adjuvant chemotherapy.
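A simple arithmetic check can help when reading hazard ratios reported with confidence intervals. Assuming the intervals are Wald intervals from a Cox model, they are symmetric around log(HR), so the geometric mean of the two limits should approximately equal the reported HR. The helper below is a hypothetical illustration of that check, with an arbitrary 2% tolerance to allow for rounding.

```python
import math


def wald_ci_consistent(hr, lower, upper, rel_tol=0.02):
    """Check whether HR sits at the geometric midpoint of its confidence interval.

    Returns (is_consistent, geometric_midpoint). Assumes a Wald interval,
    i.e. exp(log(HR) +/- z * SE), which is symmetric on the log scale.
    """
    midpoint = math.sqrt(lower * upper)
    return math.isclose(hr, midpoint, rel_tol=rel_tol), midpoint


# Values as printed in the text for the chemotherapy analysis.
for label, hr, lo, hi in [
    ("all patients", 0.98, 0.769, 1.258),
    ("low risk", 0.93, 0.515, 1.695),
    ("intermediate risk", 0.98, 0.255, 0.679),
    ("high risk", 0.67, 0.469, 0.957),
]:
    ok, mid = wald_ci_consistent(hr, lo, hi)
    print(f"{label}: consistent={ok}, geometric midpoint={mid:.3f}")
```

Applied to the values printed above, the overall, low-risk, and high-risk estimates pass this check. The intermediate-risk entry does not, because the geometric midpoint of its interval is about 0.42; the source does not indicate which figure is intended.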

Discussion

Although surgery remains the mainstay of curative treatment, the prognosis of

patients with CRC, especially for long-term outcome, depends more on pre- or

postoperative individualized therapies (29), including various strategies of

chemotherapy and chemoradiation. However, because CRC is a well-recognized heterogeneous

cancer, its prognosis varies significantly, and optimal management presents challenges.

Currently, adjuvant chemotherapy is regarded as standard treatment for patients with

stage III colon cancer (4). However, patients with T1-2N1M0 tumors (stage IIIA)

have significantly higher survival rates than those with stage IIB tumors (3).

Moreover, for some patients, long-term survival continues to be jeopardized by the

high risk of recurrence even after chemotherapy, and personalized therapy needs to

be further optimized. With respect to stage II patients, disagreement regarding which

patients are potentially appropriate candidates for adjuvant therapy prevails. The

MOSAIC trial (30) has shown only a 20% benefit from adjuvant therapy at 3 years for

patients with stage II disease and a 28% benefit for patients with recognized clinical

risk factors. Thus, for approximately 75% of stage II patients who receive curative

surgery, the administration of adjuvant chemotherapy is unnecessary and potentially harmful.

Unfortunately, low-risk stage II patients have not been accurately defined. In addition,

patients with microsatellite instability (MSI CRC) have a better prognosis (31) and do

not benefit from 5-fluorouracil (5-FU)-based adjuvant chemotherapy (32), similar to

patients with CpG island methylation (33). All the abovementioned studies highlight

that the rational application of therapeutic strategies requires accurate prognosis

prediction and risk stratification for CRC patients.

Considering the inherent deficiency of TNM stages, substantial effort has been

placed on their optimization (2), and they have been continuously developed over the

past decade. Substantial changes in CRC classifications, based mainly on the analysis

of surveillance, epidemiology, and end results (SEER) data, were made in the seventh

edition (34). However, a comparative study demonstrated that the seventh TNM

edition did not provide a greater accuracy for predicting CRC patient prognoses but

rather resulted in a more complex classification for daily clinical use (35). The eighth

edition of the CRC staging system, containing major advances, including the

introduction of molecular markers, such as MSI, KRAS and BRAF mutations, to

strengthen its discriminatory power, will be implemented worldwide on January 1,

2018 (2). Nonetheless, the forthcoming edition may not provide substantial

improvement because the prognostic value of clinicopathological factors has peaked.

Moreover, the current models still lack genomic information, which may directly

reflect the heterogeneity of CRC and present substantial potential prognostic value.

Genomic signatures predicting the prognosis of CRC have been continually

developed in the past 10 years. In the present study, we retrieved 81 articles related to

the development of signatures for predicting CRC prognosis, including OS, DFS,

disease-specific survival (DSS) and various types of recurrence. For instance, based

on the association between aberrantly expressed genes and the duration of

individual DFS, Zhang et al. (36) constructed a six-microRNA-based classifier, which was

further validated to have reliable predictive power for detecting recurrence in patients

with stage II colon cancer. Similarly, Merlos-Suarez et al. (23) developed a genomic profile

specific for adult intestinal stem cells (ISC), which was highly enriched in genes that

positively predicted the risk of recurrence. Patients bearing primary CRCs with high

average expression levels of ISC genes had a relative risk of relapse that was 10-fold

higher than that for patients with low levels (23). Moreover, recent evidence (37)

suggests that gene signatures that closely reflect specific biological processes or

oncogenic pathway statuses also have potential prognostic values for stratifying CRC

patients. Even though almost all of these signatures were proven to have prognostic

value in their respective publications, none are routinely used in clinical practice.

Potential explanations for the unsatisfactory results must be considered. Thus, in

the current study, we exhaustively reviewed and analyzed the published multi-gene

signatures in CRC. To eliminate the potential confounding effects, only data derived

from the Affymetrix human genome platform having clearly described genes were

included. Of the 81 published signatures, 27 signatures from 26 publications met all

of the inclusion criteria and were retained (Table 1 and Supplementary Table S3). The

27 signatures contained 1274 total genes, among which 1041 were unique, and the

signature sizes ranged from 4 to 264 (Figure 1A). The top overlapping genes in the

above signatures were FN1 (ten times), CYP1B1 and POSTN (six times), 7 genes

(five times), 7 genes (four times), 29 genes (three times), and 107 genes (twice). As is

clearly shown in Figure 2A, none of these genes appeared in all signatures. These

results were expected, as they have been reported previously (15) and are in agreement with

the systematic analyses of lung cancer and breast cancer prognostic signatures (38).

The limited overlap of these signatures in gene identity was perplexing but

may help explain the low reproducibility. Moreover, genetic heterogeneity was

recently shown to exist not only between but also within tumors, meaning that

biological statuses, including oncogenic and carcinostatic states, might regularly vary

in the same patient (39). Our NTP analysis revealed that some patients concurrently

harbored more than one signature, which further confirmed the abovementioned

results and indicated that their tumors were genomically unstable at the individual level.

Taken together, the different signatures might reflect diverse biological processes and

be active in different or identical individual tumors, which partially accounts for the

limited overlap among signatures constructed in different CRC studies.
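The overlap tally reported earlier in this paragraph (total versus unique genes, and the number of signatures in which each gene recurs) can be reproduced with a short counting routine. The sketch below is a minimal illustration: the gene symbols FN1, POSTN, and CYP1B1 come from the text, but the signature names, the placeholder symbol GENE_X, and the memberships are hypothetical.

```python
from collections import Counter


def summarize_overlap(signatures):
    """Summarize gene overlap across signatures.

    signatures maps a signature name to its list of gene symbols.
    Returns (total genes, unique genes, per-gene occurrence counts,
    recurrence histogram), where recurrence[k] is the number of genes
    that appear in exactly k signatures.
    """
    gene_sets = {name: set(genes) for name, genes in signatures.items()}
    occurrences = Counter(g for genes in gene_sets.values() for g in genes)
    total_genes = sum(len(genes) for genes in gene_sets.values())
    recurrence = Counter(occurrences.values())
    return total_genes, len(occurrences), occurrences, recurrence


# Hypothetical memberships, for illustration only.
toy_signatures = {
    "sig_A": ["FN1", "POSTN", "CYP1B1"],
    "sig_B": ["FN1", "POSTN"],
    "sig_C": ["FN1", "GENE_X"],
}
total, unique, occurrences, recurrence = summarize_overlap(toy_signatures)
print(total, unique, occurrences.most_common(1), dict(recurrence))
```

Applied to the 27 retained signatures, the same tally would give the total and unique gene counts and the list of most frequently shared genes quoted above.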

In terms of global performance, in the training set, 8 of 27 signatures showed a

significant association with prognosis and could reasonably predict the outcome.

However, after multivariate adjustment, only 2 signatures remained powerful enough

to indicate prognosis for DFS, which directly demonstrated the low reproducibility of

CRC genomic signatures. More importantly, Popovici’s signature, Fritzmann’s

signature and TNM stage had remarkable prognostic abilities and were independent of

each other, indicating that these signatures may be used to refine the current

prognostic model and facilitate further stratification of patients with CRC at the same

TNM stage.

Our primary goal was to construct a nomogram that could integrate genomic

signatures with clinicopathological variables in a large cohort of CRC patients to add

prognostic value to the current staging system. Our multivariate analysis in

the training set revealed that age, local invasion, lymph node metastasis, distant

metastasis, TNM stage, Popovici’s signature and Fritzmann’s signature were

independent prognostic factors (Supplementary Table S6), which was highly

consistent with previous studies on CRC risk factors (40,41). Accordingly, we

cautiously built a clinicopathologic-genomic nomogram using the abovementioned

significant factors. The AUC of the nomogram showed a superior prediction accuracy

compared to that of the combined clinical factors in the training and validation sets

(Figures 3E and 4C). Additionally, the calibration plots showed optimal agreement

between the expected and observed survival probabilities (Figure 3B), ensuring the

reliability of our nomogram. Moreover, the nomogram successfully stratified CRC

patients into three risk groups with remarkably different DFS values based on total

points, which was further confirmed using the same cutoff value in the validation set.

The TNM staging system is the most important tool for guiding

treatment in clinical practice. However, disagreement prevails on potentially

appropriate candidates for adjuvant therapy, especially for stage II and III CRC

patients. Therefore, the identification of a high-risk subgroup among these patients is

urgently needed in clinical practice. Our findings suggest that the clinicopathologic-genomic

nomogram could be a promising tool to stratify stage II and III patients into distinct

risk subgroups and guide individualized therapy. Moreover, in the current study, we

clearly revealed that chemotherapy provides a survival benefit to patients classified as

intermediate- and high-risk by the nomogram (Figure 5).

Despite the promising results, there were several shortcomings in this work. First,

the NTP method uses only a list of signature genes to make class predictions in each

patient’s expression data, allowing the method to be less sensitive to differences in

experimental and analytical conditions and applicable to every patient without

optimization of the analysis parameters. In the current study, we applied the NTP

method for all prediction models, which generally consisted of two major components:

a gene signature and a prediction algorithm. Obviously, the intrinsic limitation of the

NTP method is that it inevitably causes the original algorithm to lose its prognostic value.

Second, the forthcoming 8th edition of the CRC TNM staging system may better

predict prognosis, but the data used in the present study were from the current 7th edition

or the 6th edition. Third, the prognostic signatures of CRC included in our study were

not exhaustive, and some promising signatures were excluded because of insufficient

information. In addition, some clinicopathological factors, even those well recognized

to have prognostic value, were not included in the composite nomogram because of

the low availability of information.
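To make the first limitation concrete, the following Python sketch shows the general idea of a nearest template prediction applied to a single patient using only a list of signature genes. It is an illustration under stated assumptions, not the GenePattern implementation used in the study: the template coding (+1 for poor-outcome genes, -1 for good-outcome genes), the use of cosine distance, the resampling of random gene sets to obtain a p-value, and the mapping of non-significant calls to an "uncertain" class are all assumptions made here, and every name in the code is hypothetical.

```python
import math
import random


def cosine_distance(x, y):
    """Return 1 - cosine similarity; 1.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return 1.0 if norm == 0 else 1.0 - dot / norm


def ntp_predict(sample, poor_genes, good_genes, n_perm=1000, alpha=0.05, seed=0):
    """Classify one sample (gene -> standardized expression) as poor/good/uncertain."""
    poor = [g for g in poor_genes if g in sample]
    good = [g for g in good_genes if g in sample]
    values = [sample[g] for g in poor + good]
    # Assumed template coding: +1 for poor-outcome genes, -1 for good-outcome genes.
    template_poor = [1.0] * len(poor) + [-1.0] * len(good)
    template_good = [-t for t in template_poor]
    d_poor = cosine_distance(values, template_poor)
    d_good = cosine_distance(values, template_good)
    if d_poor < d_good:
        label, template, d_obs = "poor", template_poor, d_poor
    else:
        label, template, d_obs = "good", template_good, d_good
    # Null distribution: distances obtained from random gene sets of the same size.
    rng = random.Random(seed)
    pool = list(sample.values())
    null = [
        cosine_distance(rng.sample(pool, len(values)), template)
        for _ in range(n_perm)
    ]
    p_value = (1 + sum(d <= d_obs for d in null)) / (n_perm + 1)
    return (label if p_value < alpha else "uncertain"), p_value
```

Because only the gene list enters the prediction, any weighting or decision rule from a signature's original publication is not used, which is the trade-off described in this paragraph.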

In summary, based on analyzing a large-scale CRC cohort and published

multi-gene prognostic signatures, we confirmed herein for the first time that genomic

information in combination with clinical data can improve patient prognostic

evaluations and should be considered as an independent and complementary approach

to the current clinicopathological prognostic model. Furthermore, considering the

wide promise of our nomogram, which integrates genomic signatures with

clinicopathological features to improve the prognosis prediction of colorectal cancer,

further systematic research is needed.

Acknowledgments

Conception and design: Z. Fu and Y. Xiong. Analysis and interpretation of data (e.g.,

statistical analysis, biostatistics, computational analysis): Y. Xiong, W. You, and M.

Hou. Writing, review, and/or revision of the manuscript: L. Peng, H. Zhou.

Administrative, technical, and/or material support (i.e., reporting or organizing data):

Y. Xiong, W. You, M. Hou, L. Peng, and H. Zhou. Study supervision: Z. Fu.

References

1. Favoriti P, Carbone G, Greco M, Pirozzi F, Pirozzi RE, Corcione F. Worldwide burden of

colorectal cancer: a review. Updates in surgery 2016;68(1):7-11 doi

10.1007/s13304-016-0359-y.

2. Amin MB, Greene FL, Edge SB, Compton CC, Gershenwald JE, Brookland RK, et al. The Eighth

Edition AJCC Cancer Staging Manual: Continuing to build a bridge from a population-based to

a more "personalized" approach to cancer staging. CA Cancer J Clin 2017;67(2):93-9 doi

10.3322/caac.21388.

3. O'Connell JB, Maggard MA, Ko CY. Colon cancer survival rates with the new American Joint

Committee on Cancer sixth edition staging. Journal of the National Cancer Institute

2004;96(19):1420-5 doi 10.1093/jnci/djh275.

4. Mahar AL, Compton C, Halabi S, Hess KR, Weiser MR, Groome PA. Personalizing prognosis in

colorectal cancer: A systematic review of the quality and nature of clinical prognostic tools for

survival outcomes. Journal of surgical oncology 2017 doi 10.1002/jso.24774.

5. Dotan E, Cohen SJ. Challenges in the management of stage II colon cancer. Seminars in

oncology 2011;38(4):511-20 doi 10.1053/j.seminoncol.2011.05.005.

6. Stintzing S, Stremitzer S, Sebio A, Lenz HJ. Predictive and prognostic markers in the treatment

of metastatic colorectal cancer (mCRC): personalized medicine at work.

Hematology/oncology clinics of North America 2015;29(1):43-60 doi

10.1016/j.hoc.2014.09.009.

7. Lu AT, Salpeter SR, Reeve AE, Eschrich S, Johnston PG, Barrier AJ, et al. Gene expression

profiles as predictors of poor outcomes in stage II colorectal cancer: A systematic review and

meta-analysis. Clin Colorectal Cancer 2009;8(4):207-14 doi 10.3816/CCC.2009.n.035.

8. Xin L, Liu Y-H, Martin TA, Jiang WG. The Era of Multigene Panels Comes? The Clinical Utility of

Oncotype DX and MammaPrint. World Journal of Oncology 2017;8(2):34-40 doi

10.14740/wjon1019w.

9. Chang W, Gao X, Han Y, Du Y, Liu Q, Wang L, et al. Gene expression profiling-derived

immunohistochemistry signature with high prognostic value in colorectal carcinoma. Gut

2014;63(9):1457-67 doi 10.1136/gutjnl-2013-305475.

10. Agesen TH, Sveen A, Merok MA, Lind GE, Nesbakken A, Skotheim RI, et al. ColoGuideEx: a

robust gene classifier specific for stage II colorectal cancer prognosis. Gut 2012;61(11):1560-7

doi 10.1136/gutjnl-2011-301179.

11. Smith JJ, Deane NG, Wu F, Merchant NB, Zhang B, Jiang A, et al. Experimentally derived

metastasis gene expression profile predicts recurrence and death in patients with colon

cancer. Gastroenterology 2010;138(3):958-68 doi 10.1053/j.gastro.2009.11.005.

12. Chen H, Sun X, Ge W, Qian Y, Bai R, Zheng S. A seven-gene signature predicts overall survival

of patients with colorectal cancer. Oncotarget 2016 doi 10.18632/oncotarget.10982.

13. Oh SC, Park Y-Y, Park ES, Lim JY, Kim SM, Kim S-B, et al. Prognostic gene expression signature

associated with two molecularly distinct subtypes of colorectal cancer. Gut 2012;61(9):1291-8

doi 10.1136/gutjnl-2011-300812.

14. Popovici V, Budinska E, Tejpar S, Weinrich S, Estrella H, Hodgson G, et al. Identification of a

poor-prognosis BRAF-mutant-like population of patients with colon cancer. Journal of clinical

oncology : official journal of the American Society of Clinical Oncology 2012;30(12):1288-95

doi 10.1200/JCO.2011.39.5814.

15. Phillips WA, Sanz-Pamplona R, Berenguer A, Cordero D, Riccadonna S, Solé X, et al. Clinical

Value of Prognosis Gene Expression Signatures in Colorectal Cancer: A Systematic Review.

PloS one 2012;7(11):e48877 doi 10.1371/journal.pone.0048877.

16. Zhang ZY, Gao W, Luo QF, Yin XW, Basnet S, Dai ZL, et al. A nomogram improves AJCC stages

for colorectal cancers by introducing CEA, modified lymph node ratio and negative lymph

node count. Scientific reports 2016;6:39028 doi 10.1038/srep39028.

17. Iasonos A, Schrag D, Raj GV, Panageas KS. How to build and interpret a nomogram for cancer

prognosis. Journal of clinical oncology : official journal of the American Society of Clinical

Oncology 2008;26(8):1364-70 doi 10.1200/JCO.2007.12.9791.

18. Dennis G, Jr., Sherman BT, Hosack DA, Yang J, Gao W, Lane HC, et al. DAVID: Database for

Annotation, Visualization, and Integrated Discovery. Genome biology 2003;4(5):P3.

19. Hoshida Y. Nearest template prediction: a single-sample-based flexible class prediction with

confidence assessment. PloS one 2010;5(11):e15543 doi 10.1371/journal.pone.0015543.

20. Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, Mesirov JP. GenePattern 2.0. Nature genetics

2006;38(5):500-1 doi

http://www.nature.com/ng/journal/v38/n5/suppinfo/ng0506-500_S1.html.

21. Fan C, Oh D, Wessels L, Weigelt B, Nuyten D, Nobel A, et al. Concordance among

gene-expression-based predictors for breast cancer. N Engl J Med 2006;355(6):560-9.

22. Guyot P, Ades AE, Ouwens MJ, Welton NJ. Enhanced secondary analysis of survival data:

reconstructing the data from published Kaplan-Meier survival curves. BMC medical research

methodology 2012;12:9 doi 10.1186/1471-2288-12-9.

23. Merlos-Suarez A, Barriga FM, Jung P, Iglesias M, Cespedes MV, Rossell D, et al. The intestinal

stem cell signature identifies colorectal cancer stem cells and predicts disease relapse. Cell

Stem Cell 2011;8(5):511-24 doi 10.1016/j.stem.2011.02.020.

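24. Vargas T, Moreno-Rubio J, Herranz J, Cejas P, Molina S, Gonzalez-Vallinas M, et al.

ColoLipidGene: signature of lipid metabolism-related genes to predict prognosis in stage-II

colon cancer patients. Oncotarget 2015;6(9):7348-63 doi 10.18632/oncotarget.3130.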

25. Wu J, Zhou L, Huang L, Gu J, Li S, Liu B, et al. Nomogram integrating gene expression

signatures with clinicopathological features to predict survival in operable NSCLC: a pooled

analysis of 2164 patients. Journal of experimental & clinical cancer research : CR 2017;36(1):4

doi 10.1186/s13046-016-0477-x.

26. Jiang Y, Casey G, Lavery IC, Zhang Y, Talantov D, Martin-McGreevy M, et al. Development of a

clinically feasible molecular assay to predict recurrence of stage II colon cancer. J Mol Diagn

2008;10(4):346-54 doi 10.2353/jmoldx.2008.080011.

27. Fritzmann J, Morkel M, Besser D, Budczies J, Kosel F, Brembeck FH, et al. A Colorectal Cancer

Expression Profile That Includes Transforming Growth Factor β Inhibitor BAMBI Predicts

Metastatic Potential. Gastroenterology 2009;137(1):165-75 doi 10.1053/j.gastro.2009.03.041.

28. Kazem MA, Khan AU, Selvasekar CR. Validation of nomogram for disease free survival for

colon cancer in UK population: A prospective cohort study. Int J Surg 2016;27:58-65 doi

10.1016/j.ijsu.2015.12.069.

29. Adam R, Delvart V, Pascal G, Valeanu A, Castaing D, Azoulay D, et al. Rescue surgery for

unresectable colorectal liver metastases downstaged by chemotherapy: a model to predict

long-term survival. Annals of surgery 2004;240(4):644-57; discussion 57-8.

30. André T, Boni C, Mounedji-Boudiaf L, Navarro M, Tabernero J, Hickish T, et al.

Oxaliplatin, Fluorouracil, and Leucovorin as Adjuvant Treatment for Colon Cancer. New

England Journal of Medicine 2004;350(23):2343-51 doi 10.1056/NEJMoa032709.

31. Des Guetz G, Schischmanoff O, Nicolas P, Perret GY, Morere JF, Uzzan B. Does microsatellite

instability predict the efficacy of adjuvant chemotherapy in colorectal cancer? A systematic

review with meta-analysis. Eur J Cancer 2009;45(10):1890-6 doi 10.1016/j.ejca.2009.04.018.

32. Vilar E, Gruber SB. Microsatellite instability in colorectal cancer-the stable evidence. Nature

reviews Clinical oncology 2010;7(3):153-62 doi 10.1038/nrclinonc.2009.237.

33. Jover R, Nguyen TP, Perez-Carbonell L, Zapater P, Paya A, Alenda C, et al. 5-Fluorouracil

adjuvant chemotherapy does not increase survival in patients with CpG island methylator

phenotype colorectal cancer. Gastroenterology 2011;140(4):1174-81 doi

10.1053/j.gastro.2010.12.035.

34. Gunderson LL, Jessup JM, Sargent DJ, Greene FL, Stewart A. Revised tumor and node

categorization for rectal cancer based on surveillance, epidemiology, and end results and

rectal pooled analysis outcomes. Journal of clinical oncology : official journal of the American

Society of Clinical Oncology 2010;28(2):256-63 doi 10.1200/jco.2009.23.9194.

35. Nitsche U, Maak M, Schuster T, Kunzli B, Langer R, Slotta-Huspenina J, et al. Prediction of

prognosis is not improved by the seventh and latest edition of the TNM classification for

colorectal cancer in a single-center collective. Annals of surgery 2011;254(5):793-800;

discussion -1 doi 10.1097/SLA.0b013e3182369101.

36. Zhang J-X, Song W, Chen Z-H, Wei J-H, Liao Y-J, Lei J, et al. Prognostic and predictive value of a

microRNA signature in stage II colon cancer: a microRNA expression analysis. The Lancet

Oncology 2013;14(13):1295-306 doi 10.1016/s1470-2045(13)70491-1.

37. de Sousa EMF, Colak S, Buikhuisen J, Koster J, Cameron K, de Jong JH, et al. Methylation of

cancer-stem-cell-associated Wnt target genes predicts poor prognosis in colorectal cancer

patients. Cell Stem Cell 2011;9(5):476-85 doi 10.1016/j.stem.2011.10.008.

38. Arpino G, Generali D, Sapino A, Del Mastro L, Frassoldati A, de Laurentis M, et al. Gene

expression profiling in breast cancer: a clinical perspective. Breast 2013;22(2):109-20 doi

10.1016/j.breast.2013.01.016.

39. Marusyk A, Almendro V, Polyak K. Intra-tumour heterogeneity: a looking glass for cancer?

Nature reviews Cancer 2012;12(5):323-34 doi 10.1038/nrc3261.

40. Zhang ZY, Luo QF, Yin XW, Dai ZL, Basnet S, Ge HY. Nomograms to predict survival after

colorectal cancer resection without preoperative therapy. BMC cancer 2016;16(1):658 doi

10.1186/s12885-016-2684-4.

41. Kawai K, Sunami E, Yamaguchi H, Ishihara S, Kazama S, Nozawa H, et al. Nomograms for

colorectal cancer: A systematic review. World journal of gastroenterology

2015;21(41):11877-86 doi 10.3748/wjg.v21.i41.11877.

42. Abdul Aziz NA, Mokhtar NM, Harun R, Mollah MMH, Mohamed Rose I, Sagap I, et al. A

19-Gene expression signature as a predictor of survival in colorectal cancer. BMC Medical

Genomics 2016;9(1) doi 10.1186/s12920-016-0218-1.

43. Shi M, He J. ColoFinder: a prognostic 9-gene signature improves prognosis for 871 stage II

and III colorectal cancer patients. PeerJ 2016;4:e1804 doi 10.7717/peerj.1804.

44. Xu G, Zhang M, Zhu H, Xu J. A 15-gene signature for prediction of colon cancer recurrence

and prognosis based on SVM. Gene 2017;604:33-40 doi 10.1016/j.gene.2016.12.016.

45. Nguyen MN, Choi TG, Nguyen DT, Kim JH, Jo YH, Shahid M, et al. CRC-113 gene expression

signature for predicting prognosis in patients with colorectal cancer. Oncotarget

2015;6(31):31674-92 doi 10.18632/oncotarget.5183.

46. Wang L, Shen X, Wang Z, Xiao X, Wei P, Wang Q, et al. A molecular signature for the

prediction of recurrence in colorectal cancer. Molecular cancer 2015;14:22 doi

10.1186/s12943-015-0296-2.

47. Vargas T, Moreno-Rubio J, Herranz J, Cejas P, Molina S, Gonzalez-Vallinas M, et al.

ColoLipidGene: signature of lipid metabolism-related genes to predict prognosis in stage-II colon cancer patients. Oncotarget 2015;6(9):7348-63 doi 10.18632/oncotarget.3130.

48. Sveen A, Agesen TH, Nesbakken A, Meling GI, Rognum TO, Liestol K, et al. ColoGuidePro: a prognostic 7-gene expression signature for stage III colorectal cancer patients. Clinical cancer research : an official journal of the American Association for Cancer Research 2012;18(21):6001-10 doi 10.1158/1078-0432.ccr-11-3302.

49. Salazar R, Roepman P, Capella G, Moreno V, Simon I, Dreezen C, et al. Gene expression signature to improve prognosis prediction of stage II and III colorectal cancer. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 2011;29(1):17-24 doi 10.1200/jco.2010.30.1077.

50. Kalady MF, Dejulius K, Church JM, Lavery IC, Fazio VW, Ishwaran H. Gene signature is associated with early stage rectal cancer recurrence. J Am Coll Surg 2010;211(2):187-95 doi 10.1016/j.jamcollsurg.2010.03.035.

51. Peng J, Wang Z, Chen W, Ding Y, Wang H, Huang H, et al. Integration of genetic signature and TNM staging system for predicting the relapse of locally advanced colorectal cancer. Int J Colorectal Dis 2010;25(11):1277-85 doi 10.1007/s00384-010-1043-1.

52. Watanabe T, Kobunai T, Yamamoto Y, Kanazawa T, Konishi T, Tanaka T, et al. Prediction of liver metastasis after colorectal cancer using reverse transcription-polymerase chain reaction analysis of 10 genes. Eur J Cancer 2010;46(11):2119-26 doi 10.1016/j.ejca.2010.04.019.

53. Hao JM, Chen JZ, Sui HM, Si-Ma XQ, Li GQ, Liu C, et al. A five-gene signature as a potential predictor of metastasis and survival in colorectal cancer. J Pathol 2010;220(4):475-89 doi 10.1002/path.2668.

54. Watanabe T, Kobunai T, Sakamoto E, Yamamoto Y, Konishi T, Horiuchi A, et al. Gene expression signature for recurrence in stage III colorectal cancers. Cancer 2009;115(2):283-92 doi 10.1002/cncr.24023.

55. Garman KS, Acharya CR, Edelman E, Grade M, Gaedcke J, Sud S, et al. A genomic approach to colon cancer risk stratification yields biologic insights into therapeutic opportunities. Proceedings of the National Academy of Sciences of the United States of America 2008;105(49):19432-7 doi 10.1073/pnas.0806674105.

56. Lin YH, Friederichs J, Black MA, Mages J, Rosenberg R, Guilford PJ, et al. Multiple gene expression classifiers from different array platforms predict poor prognosis of colorectal cancer. Clinical cancer research : an official journal of the American Association for Cancer Research 2007;13(2 Pt 1):498-507 doi 10.1158/1078-0432.ccr-05-2734.


Table 1 Genomic Signatures Included in the Study

                         Number of genes                          Patients with the signature classified as (%)a
First author  Prognosis  Original signature  Available genes (%)  Poor         Uncertainty  Good         Reference
Aziz          OS         19                  15(78.9%)            201(23.5%)   349(40.8%)   305(35.7%)   (42)
Chang         DFS        95                  95(100.0%)           341(39.9%)   299(35%)     215(25.1%)   (9)
Chen          OS         7                   6(85.7%)             293(34.3%)   265(31%)     297(34.7%)   (12)
Shi           DFS        9                   9(100.0%)            396(46.3%)   300(35.1%)   159(18.6%)   (43)
Xu            OS         15                  13(86.7%)            299(35%)     492(57.5%)   64(7.5%)     (44)
Minh          DFS        77                  74(96.5%)            358(41.9%)   366(42.8%)   131(15.3%)   (45)
Shen          DFS        45                  32(71.1%)            335(39.2%)   338(39.5%)   182(21.3%)   (46)
Teodoro       DFS        4                   4(100.0%)            0(0.0%)      855(100%)    0(0.0%)      (47)
Agesen        DFS        13                  13(100.0%)           74(8.7%)     688(80.5%)   93(10.9%)    (10)
Popovici      DFS        64                  56(87.5%)            333(38.9%)   140(16.4%)   382(44.7%)   (14)
Sveen         DFS        7                   7(100.0%)            140(16.4%)   598(69.9%)   117(13.7%)   (48)
Merlos        DFS        264                 247(93.6%)           156(18.2%)   235(27.5%)   464(54.3%)   (23)
Oh            DFS, OS    87                  81(93.1%)            278(32.5%)   269(31.5%)   308(36%)     (13)
Salazar       DFS        18                  16(88.9%)            192(22.5%)   326(38.1%)   337(39.4%)   (49)
Kalady        DFS, OS    36                  24(66.7%)            245(28.7%)   429(50.2%)   181(21.2%)   (50)
Peng          DFS        8                   7(87.5%)             120(14%)     453(53%)     282(33%)     (51)
Toshiaki      DFS        10                  9(90.0%)             323(37.8%)   232(27.1%)   300(35.1%)   (52)
Fritzmann     DFS        113                 100(88.5%)           568(66.4%)   166(19.4%)   121(14.2%)   (27)
Hao           DFS        4                   4(100.0%)            188(22%)     577(67.5%)   90(10.5%)    (53)
Jorissen      DFS        122                 102(83.6%)           370(43.3%)   231(27%)     254(29.7%)   (53)
Smith         DFS, OS    34                  29(85.3%)            337(39.4%)   395(46.2%)   123(14.4%)   (11)
Watanabe      DFS        31                  22(71.0%)            156(18.2%)   432(50.5%)   267(31.2%)   (54)
Garman        DFS, OS    46                  37(80.4%)            193(22.6%)   336(39.3%)   326(38.1%)   (55)
Jiang         DFS        7                   6(85.7%)             216(25.3%)   372(43.5%)   267(31.2%)   (26)
Lin.a         DFS        16                  14(87.5%)            141(16.5%)   518(60.6%)   196(22.9%)   (56)
Lin.b         DFS        22                  18(81.8%)            319(37.3%)   324(37.9%)   212(24.8%)   (56)
Wang          DFS        19                  19(100.0%)           226(26.4%)   507(59.3%)   122(14.3%)   (46)

a Patients were classified as good, poor or uncertain by the respective published genomic signatures based on prediction results (false discovery rate [FDR] <0.05) of the nearest template prediction (NTP).


Figure legends

Figure 1. Bioinformatic analysis of gene lists included in this study. A. Landscape distribution characteristics of the gene lists as reported in 27 prognostic signatures. Genes with good and poor prognostic values in each signature are shown as short green and red vertical bars, respectively. The number of gene occurrences is graphically depicted in the upper panel. Each row represents one signature annotated by the first author of the corresponding article. All the unique genes were sorted alphabetically. B-E. GO and KEGG pathway analyses of genes in the 27 signatures. The vertical axis represents GO or KEGG pathway annotations. The horizontal axis represents the number of genes assigned to the corresponding annotation. (B) Biological process; (C) cellular components; (D) molecular function; and (E) KEGG pathway. F. Protein-protein interaction (PPI) networks. Nodes represent the proteins, and edges represent the physical interactions. The black and gray nodes represent genes with good and poor prognostic values, respectively. CRC recurrence-related pathways that were significantly enriched by KEGG analysis form some closely connected regions and are highlighted in the colored box.

Figure 2. Global prognostic performance of the published CRC signatures. A. Concordance of signature-based prediction results in the training set. Each column represents the prediction for one individual sample. The orange, blue and white bars indicate good, poor and uncertain prognoses according to the corresponding signature, respectively. B. Heatmap of Cramér's V coefficients showing the correlation between these published signatures. C. Analysis of prognostic performance using Cox regression; left: univariate analysis; right: multivariate analysis adjusted for clinical factors.
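As an illustration of the statistic named in Figure 2B, the following is a minimal sketch of Cramér's V for two signatures' per-patient calls. It is not the authors' code; the function name `cramers_v` and the example labels are assumptions made for illustration, and no bias correction is applied.

```python
import math
from collections import Counter


def cramers_v(labels_a, labels_b):
    """Cramér's V between two equally long sequences of categorical labels.

    Hypothetical helper: V = sqrt(chi2 / (n * (k - 1))), where k is the
    smaller of the two category counts and chi2 is Pearson's statistic.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    rows = Counter(labels_a)                   # marginal counts, signature A
    cols = Counter(labels_b)                   # marginal counts, signature B
    observed = Counter(zip(labels_a, labels_b))  # joint counts
    k = min(len(rows), len(cols))
    if k < 2:
        return 0.0  # association is undefined with a single category
    chi2 = 0.0
    for r, r_count in rows.items():
        for c, c_count in cols.items():
            expected = r_count * c_count / n
            chi2 += (observed.get((r, c), 0) - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))


# Made-up calls for six patients from two signatures.
sig_a = ["good", "poor", "uncertain", "good", "poor", "uncertain"]
sig_b = ["good", "poor", "uncertain", "good", "poor", "uncertain"]
print(cramers_v(sig_a, sig_b))
```

A value near 1 means the two signatures assign patients to nearly the same classes; a value near 0 means their calls are close to independent.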

Figure 3. Development of the composite clinicopathologic-genomic nomogram. A. Composite nomogram to predict disease-free survival for patients with CRC. Each category within these variables was assigned a point value on the top scale based on the coefficients from Cox regression. By summing the assigned points for the seven variables and drawing a vertical line from the total points axis to the survival probability axes, the estimated probabilities of 1-, 3- and 5-year disease-free survival (DFS) can be read off. B. Calibration curves for predicting patient survival at 1, 3 and 5 years. The actual DFS is plotted on the y-axis, and the nomogram-predicted DFS is plotted on the x-axis; a plot along the 45-degree line would indicate a perfectly calibrated model in which the predicted probabilities are identical to the actual outcomes. C. The total point distribution of each CRC patient and their disease-free survival time and status. The blue dotted line represents the cutoff value dividing patients into low-, intermediate- and high-risk groups. D. Kaplan-Meier survival curves of DFS for different risk groups using the nomogram in the training set. E. Time-dependent ROC curves comparing the prognostic accuracies of 5-year disease-free survival among the prognostic signatures combined with clinicopathological features and the nomogram in the training set. The AUC 95% CIs were calculated from 1000 bootstraps of the survival data. ROC, receiver operating characteristic. AUC, area under the curve.
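The point assignment described for Figure 3A can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the coefficients, variable names and category labels are hypothetical, and the scaling (the largest single effect is set to 100 points) is a common nomogram convention assumed here rather than taken from the paper.

```python
def nomogram_points(coefs):
    """Convert Cox log-hazard coefficients into nomogram points.

    coefs: {variable: {category: coefficient}} (hypothetical input layout).
    Each variable is shifted so its lowest-risk category scores 0, then all
    effects are scaled so the largest one equals 100 points.
    """
    shifted = {
        var: {cat: beta - min(cats.values()) for cat, beta in cats.items()}
        for var, cats in coefs.items()
    }
    largest = max(max(cats.values()) for cats in shifted.values())
    if largest == 0:
        raise ValueError("all coefficients are equal; nothing to scale")
    return {
        var: {cat: 100.0 * (beta / largest) for cat, beta in cats.items()}
        for var, cats in shifted.items()
    }


def total_points(points, patient):
    """Sum the points of the categories observed for one patient."""
    return sum(points[var][patient[var]] for var in points)


# Hypothetical coefficients for two of the variables, for illustration only.
example_coefs = {
    "tnm_stage": {"II": 0.0, "III": 0.8},
    "signature": {"good": 0.0, "poor": 0.4},
}
pts = nomogram_points(example_coefs)
print(total_points(pts, {"tnm_stage": "III", "signature": "poor"}))
```

In a full nomogram the total is then mapped to survival probabilities through the fitted baseline survival function, which this sketch leaves out.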

Figure 4. Verifying the prognostic value of the nomogram in the validation set. A. The total point distribution of each CRC patient and their disease-free survival time and status. The blue dotted line represents the cutoff value dividing patients into low-, intermediate- and high-risk groups. B. Kaplan-Meier survival curves of DFS for different risk groups using the nomogram in the validation set. C. Time-dependent ROC curves comparing the prognostic accuracies of 5-year disease-free survival among the prognostic signatures in combination with clinicopathological features and the nomogram in the validation set. D, E. Kaplan-Meier survival curves for patients in different risk subgroups, which were stratified by TNM stages II (D) and III (E).
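The ROC comparisons in Figures 3E and 4C are summarized by the AUC, and the Figure 3 legend states that 95% CIs came from 1000 bootstraps. The sketch below shows a percentile bootstrap for a plain binary AUC. This is a simplification swapped in for illustration: the paper uses time-dependent ROC on censored survival data, and the function names, seed and toy data here are assumptions.

```python
import random


def auc(scores, labels):
    """Chance that a random event case outscores a random non-event case.

    labels are 1 (event) or 0 (no event); ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    if not 0 < sum(labels) < len(labels):
        raise ValueError("need at least one event and one non-event")
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[round((alpha / 2) * n_boot)]
    upper = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy risk scores and 5-year event indicators, for illustration only.
toy_scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
toy_events = [0, 0, 1, 1, 1, 0]
print(auc(toy_scores, toy_events), bootstrap_auc_ci(toy_scores, toy_events))
```

Fixing the seed makes the interval reproducible; with real survival data the resampling would be applied to patients and a censoring-aware AUC estimator would replace `auc`.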

Figure 5. Effect of chemotherapy on overall survival in patients stratified by the nomogram. A. Kaplan-Meier survival curves for all patients who received chemotherapy. B-D. Effects of adjuvant chemotherapy on overall survival in the different subgroups. Low-risk group (B); intermediate-risk group (C); high-risk group (D).


Yongfu Xiong, Wenxian You, Min Hou, et al. Nomogram Integrating Genomics with Clinicopathological Features Improves Prognosis Prediction for Colorectal Cancer. Mol Cancer Res. Published OnlineFirst May 21, 2018.

Updated version: Access the most recent version of this article at doi: 10.1158/1541-7786.MCR-18-0063

Supplementary Material: Access the most recent supplemental material at http://mcr.aacrjournals.org/content/suppl/2018/05/19/1541-7786.MCR-18-0063.DC1
