SUPPLEMENTARY INFORMATION

Systematic investigation of cancer-associated somatic point mutations in SNP databases

HyunChul Jung1,2, Thomas Bleazard3, Jongkeun Lee1 and Dongwan Hong1

1. Cancer Genomics Branch, Division of Convergence Technology, National Cancer Center, Gyeonggi-do 410-769, Korea

2. Bioinformatics and Systems Biology Graduate Program, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA

3. College of Natural Sciences, Seoul National University Graduate School, Seoul 110-799, Korea

To whom correspondence should be addressed; E-mail: [email protected]

Nature Biotechnology: doi:10.1038/nbt.2681

Table of Contents

Materials and Methods

Supplementary Notes

Supplementary Figures

Suppl. Figure 1. The number of overlapped positions supported by at least 1, 5 and 10 tumor samples

Suppl. Figure 2. Mutually exclusive alteration pattern between PIK3CA and TP53

Suppl. Figure 3. For TP53, Kaplan-Meier survival curves for tumor samples with cancer-associated somatic mutations represented in dbSNP or other variants versus wild-type by log-rank test

Suppl. Figure 4. Workflow of the proposed comprehensive SNP filtering approach

Supplementary Tables

Suppl. Table 1. List of the compiled cancer genomics articles

Suppl. Table 2. List of the cancer-associated somatic mutations represented in dbSNP

Suppl. Table 3. Functional consequence of the cancer-associated somatic mutations represented in dbSNP

Suppl. Table 4. High-confidence filtered cancer-associated somatic mutations represented in dbSNP shown in two example articles

Suppl. Table 5. Analysis of mutually exclusive alteration patterns

Suppl. Table 6. List of patients with cancer-associated somatic mutations represented in dbSNP and other variants in TP53

Suppl. Table 7. Cancer-associated somatic mutations represented in 1000 Genomes Project

References

Materials and Methods

Eligible cancer genomics articles

We selected articles published in Nature, Nature Genetics, Genome Research, and PNAS between January 2010 and June 2012 that used next-generation sequencing technology to study human cancer. In this survey, we focused on cancer genomics articles with whole genome sequencing (WGS) or whole exome sequencing (WES). We selected articles in which the identification of point mutations was one of the main parts of the study and was mentioned in the abstract. We excluded articles that only investigated structural variations, copy number variations, or pathogen infections using sequencing data. We selected articles regardless of the next-generation sequencing platform used, number of samples, cancer type, and point mutation calling algorithm used. Articles were first identified through PubMed and Google Scholar searches. We then searched for further articles using the search engine of each journal. Several individuals independently read the methods and supplementary methods of each paper to search for the SNP filtering approach used in the point mutation calling workflow. Following this inspection, we classified the articles into four categories: (A) Article with filtering; (B) Article with partial filtering; (C) Article with filtering-unknown; and (D) Other.

The 'articles with filtering' were those that used the common SNP filtering approach, which filtered identified point mutations against public SNP databases such as dbSNP [1] or the 1000 Genomes Project database [2]. According to their descriptions of the SNP filtering, the 'articles with filtering' did not use a subset of dbSNP such as common SNPs (SNPs with >= 1% minor allele frequency) or flagged SNPs (clinically associated SNPs), because most of the articles used old versions of dbSNP (e.g. dbSNP 130) that do not provide these subsets. The 'articles with partial filtering' were those that filtered out point mutations using the public databases but retained for analysis those found in disease databases such as COSMIC [3] or OMIM [4]. The 'articles with filtering-unknown' were those for which we could not find any description of an SNP filtering approach in any section of the article.

Preprocessing of COSMIC and dbSNP database for extraction of overlapping SNPs

We downloaded all SNPs listed in dbSNP135 from the UCSC genome browser [5]. We extracted SNPs whose class was ‘single’ on all chromosomes (n = 47,762,409). Of these single SNPs, we selected those whose function column contained ‘missense’, ‘nonsense’, ‘stop-loss’, or ‘splice’ (n = 514,700). We also downloaded COSMIC v60 data from the COSMIC web site. We selected point mutations for which an hg19 coordinate was available and whose mutation description column was ‘Nonstop extension’, ‘Substitution – Nonsense’, or ‘Substitution – Missense’. To select only somatic point mutations, we kept point mutations whose mutation somatic status was ‘confirmed somatic variant’ or ‘reported in another cancer sample as somatic’ and discarded those whose status was ‘variant of unknown origin’, ‘reported in another sample as germline’, ‘not specified’, or ‘confirmed germline variant’. We removed duplicate mutation entries with the same sample ID and retained just one representative of each. In the main analyses, we focused on overlapping non-silent SNPs supported by at least five tumor samples.
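The preprocessing steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the record keys (`chrom`, `pos`, `class`, `func`, `rsid`, `description`, `status`, `sample_id`) and the function names are hypothetical, and overlap is assumed to be decided by genomic position alone.

```python
from collections import defaultdict

# Terms and status strings are taken from the text above.
DBSNP_NONSILENT_TERMS = ("missense", "nonsense", "stop-loss", "splice")
COSMIC_NONSILENT = {
    "Nonstop extension",
    "Substitution - Nonsense",
    "Substitution - Missense",
}
COSMIC_SOMATIC = {
    "confirmed somatic variant",
    "reported in another cancer sample as somatic",
}


def nonsilent_dbsnp(snps):
    """Map (chrom, pos) -> rs ID for 'single'-class, non-silent SNPs."""
    kept = {}
    for snp in snps:
        is_nonsilent = any(term in snp["func"] for term in DBSNP_NONSILENT_TERMS)
        if snp["class"] == "single" and is_nonsilent:
            kept[(snp["chrom"], snp["pos"])] = snp["rsid"]
    return kept


def cosmic_support(mutations):
    """Map (chrom, pos) -> set of tumor sample IDs with a somatic non-silent call."""
    support = defaultdict(set)
    for mut in mutations:
        if mut["pos"] is None:  # no hg19 coordinate available
            continue
        # Normalize the en dash so both spellings of the description match.
        description = mut["description"].replace("\u2013", "-")
        if description not in COSMIC_NONSILENT:
            continue
        if mut["status"].lower() not in COSMIC_SOMATIC:
            continue
        # A set keeps one representative per sample ID at each position.
        support[(mut["chrom"], mut["pos"])].add(mut["sample_id"])
    return support


def overlapping_snps(snps, mutations, min_samples=5):
    """Positions in both sets, supported by at least min_samples tumor samples."""
    dbsnp = nonsilent_dbsnp(snps)
    result = {}
    for position, samples in cosmic_support(mutations).items():
        if position in dbsnp and len(samples) >= min_samples:
            result[position] = (dbsnp[position], len(samples))
    return result
```

Storing sample IDs in a set is one simple way to keep a single representative per sample; the default `min_samples=5` mirrors the threshold used in the main analyses.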

Prediction of functional consequence of the cancer-associated somatic mutations represented in dbSNP

We used three in-silico methods, SIFT [6], PolyPhen2 [7], and MutationAssessor [8], to assess the functional impact of the cancer-associated somatic mutations represented in the dbSNP database. In cases where positions had several reported variant alleles, we ran the three tools with all reported nucleotide changes. The prediction results can be found in supplementary figure 4 and table 4. Next, we classified each mutation position into functional and non-functional groups. Positions predicted to be functional with relatively low confidence were also classified into the functional group. Positions having multiple prediction results due to several reported variant alleles were classified into the functional group if one of the nucleotide changes was predicted to be functional. We used PhyloP [9] to assess the degree of conservation. Mutation positions for which the PhyloP score was greater than 1.3 (P < 0.05) were classified into the functional group.
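The classification rule above can be illustrated with a short Python sketch. The label strings and function names are hypothetical stand-ins for each tool's own output categories, and the conversion from a PhyloP score to a P value assumes the score is a negative base-10 logarithm of the P value, which is consistent with the stated cutoff (10^-1.3 is about 0.05).

```python
PHYLOP_CUTOFF = 1.3  # cutoff from the text

# Hypothetical labels standing in for each tool's own output categories.
FUNCTIONAL_LABELS = {"functional", "functional_low_confidence"}


def phylop_to_p(score):
    """Assumed relation: score = -log10(P), so P = 10 ** -|score|."""
    return 10 ** (-abs(score))


def tool_call(allele_labels):
    """A position is functional for a tool if any reported allele is."""
    return any(label in FUNCTIONAL_LABELS for label in allele_labels)


def classify_position(predictions, phylop_score):
    """predictions maps tool name -> list of per-allele labels.

    Returns a dict of method name -> True (functional) or False.
    """
    calls = {tool: tool_call(labels) for tool, labels in predictions.items()}
    calls["PhyloP"] = phylop_score > PHYLOP_CUTOFF
    return calls
```

Summing the boolean values of the returned dict gives the number of methods that call a position functional, which is the kind of count a "three of four methods" criterion would use.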

Identification of the high-confidence filtered mutations

For the bladder cancer article [10], we first downloaded publicly available raw sequencing data (SRA038181) from the NCBI Sequence Read Archive (SRA) [11]. We followed the same alignment and variant calling approach to replicate their variant calling results. We first aligned reads against the NCBI reference genome (hg18) using the Burrows-Wheeler Alignment (BWA) tool [12] and performed local realignment of the BWA-aligned reads using the Genome Analysis Toolkit (GATK) [13]. After removing PCR duplicates using Picard, somatic point mutations were called by VarScan [14]. We first processed the whole-exome sequencing data of the 9 bladder tumor samples used in the discovery step through the pipeline described above. To confirm our results, we contacted the authors to ask for their point mutation calling results (VarScan output file). The consistency between the results was very high. For example, there was little difference in the number of reads supporting variant alleles and in variant allele frequencies. Based on the high consistency in the variant calling results of the 9 tumor samples, we decided to analyze the 88 tumor samples using their variant calling results. To select high-confidence filtered cancer-associated somatic mutations represented in dbSNP, we used a list of the validated point mutations in their supplementary materials together with the variant calling results. For each tumor sample, we selected the filtered mutations for which the number of reads supporting the variant allele and the variant allele frequency were greater than those of at least one confirmed point mutation from the same sample. In cases where tumor samples had too few confirmed mutations for setting the two cutoff values, we did not include the mutations from these tumor samples.
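The per-sample selection rule can be sketched as follows. This is an illustration only: the record keys (`reads`, `vaf`) are hypothetical, the sketch assumes that both values must exceed those of the same confirmed mutation, and `min_confirmed` is a placeholder because the text does not state how many confirmed mutations counted as too few.

```python
def select_high_confidence(filtered, confirmed, min_confirmed=1):
    """Select filtered mutations that exceed a confirmed mutation in the same sample.

    filtered, confirmed: dict of sample ID -> list of dicts with
    'reads' (variant-supporting reads) and 'vaf' (variant allele frequency).
    """
    selected = {}
    for sample, candidates in filtered.items():
        references = confirmed.get(sample, [])
        if len(references) < min_confirmed:
            continue  # too few confirmed mutations to set the two cutoffs
        kept = [
            cand
            for cand in candidates
            if any(
                cand["reads"] > ref["reads"] and cand["vaf"] > ref["vaf"]
                for ref in references
            )
        ]
        if kept:
            selected[sample] = kept
    return selected
```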

For the prostate cancer article [15], we downloaded sequencing data (SRA037395) from NCBI SRA and processed the data with the pipeline described above to detect variants. To select high-confidence filtered cancer-associated somatic mutations represented in dbSNP, we focused only on the detection of homozygous mutations. There were two reasons for this. First, we did not have any reliable cutoff values, such as the number of reads supporting the variant allele and the variant allele frequency, from validated mutations. Second, we did not use the same variant calling approach as the article. Thus, we selected only homozygous mutations whose variant allele frequency was higher than 95%. Moreover, we manually inspected the identified homozygous mutations with the Integrative Genomics Viewer (IGV) browser [16]. Finally, we contacted the authors to ask for confirmation of the identified homozygous mutations, and they confirmed them.
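A minimal sketch of the homozygous-mutation filter, assuming the variant allele frequency is computed as variant-supporting reads divided by the sum of reference and variant reads at the position. The keys `ref_reads` and `alt_reads` and the function names are hypothetical.

```python
def variant_allele_frequency(ref_reads, alt_reads):
    """Fraction of reads supporting the variant allele (0.0 if uncovered)."""
    total = ref_reads + alt_reads
    return alt_reads / total if total else 0.0


def homozygous_calls(variants, min_vaf=0.95):
    """Keep variants whose allele frequency is strictly above min_vaf."""
    return [
        v
        for v in variants
        if variant_allele_frequency(v["ref_reads"], v["alt_reads"]) > min_vaf
    ]
```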

Evaluation of clinical significance of the cancer-associated somatic mutations represented in dbSNP in TP53

We obtained patient survival information from Supplementary Table 1 of the article concerned [17]. For the 48 patients, the table provided information such as survival (in months) after the diagnosis, first hormone therapy, and first chemotherapy. We first searched for high-confidence non-silent somatic mutations and high-level copy number alterations in TP53. According to TP53 status, we classified the patients into those with the cancer-associated somatic mutations represented in dbSNP; those with other variants, such as non-silent point mutations (excluding the cancer-associated somatic mutations represented in dbSNP), frameshift indels, and structural variations (high-level amplifications or deletions); and those without variants (wild-type). The patient (WA10) having both the cancer-associated somatic mutation represented in dbSNP and a high-level deletion was excluded from this analysis. Patients with cancer-associated somatic mutations represented in dbSNP or other variants in TP53 did not show a significant prognostic difference in survival after the diagnosis or after first chemotherapy.
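The grouping and the survival curves behind this analysis can be illustrated with a short sketch. It shows the patient grouping rule and a plain Kaplan-Meier product-limit estimator; the log-rank comparison itself is not implemented here. The function names and group labels are hypothetical.

```python
def tp53_group(has_dbsnp_somatic, has_other_variant):
    """Assign a patient to one of the three TP53 groups, or None if excluded."""
    if has_dbsnp_somatic and has_other_variant:
        return None  # excluded, as was done for patient WA10
    if has_dbsnp_somatic:
        return "dbSNP-represented"
    if has_other_variant:
        return "other variant"
    return "wild-type"


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up in months; events: 1 if death observed, 0 if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

Patients censored at a given time are counted as at risk for deaths at that same time, which is the usual convention for this estimator.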

Supplementary Notes

Investigation of cancer-associated somatic mutations represented in 1000 Genomes Project database

dbSNP135, which we used in this study, includes the 1000 Genomes Project Pilot 1, 2, 3 and Phase 1 data, which are the most recent to be released. Therefore, we searched for mutations reported in the 1000 Genomes Project data among the cancer-associated somatic mutations represented in the dbSNP database (n = 257). We found that 9 of the 257 mutations were reported there, and almost half (n = 4) of the 9 mutations were predicted to have a functional consequence by at least three of the four methods. Four of the 9 mutations were common germline SNPs with a MAF of at least 1%. Two of them, rs1801516 (MAF = 7.9%) in ATM and rs59912467 (MAF = 1.2%) in STK11, were reported as a melanoma susceptibility locus by GWAS and as related to a cancer-prone syndrome in the OMIM database, respectively. The other two common SNPs might be passenger mutations, or their association with cancer might not have been revealed yet. In addition, 4 of the remaining 5 SNPs with a MAF of less than 1% were flagged as clinically associated (Supplementary Table 7). For example, rs2893457 generates one of the six well-known hot-spot codons in TP53. In addition, rs1801166 in APC and rs59912467 in STK11 are related to multiple colorectal adenomas and a cancer-prone disorder, respectively, according to the OMIM database. Furthermore, we do not exclude the possibility that some of the overlapping SNPs reported in the 1000 Genomes Project data were erroneously entered into the COSMIC database.

Description of the SNP filtering pipeline

Our proposed comprehensive SNP filtering approach was implemented in a web-based tool called CSTAR (Cancer genome Sequencing Tool to Acquire Reliable somatic point mutations; http://cstar-ncc.org), which takes non-silent point mutations in SNP databases as input (VCF, MAF or tab-delimited text file). The pipeline compares the input SNP list to a knowledgebase comprising the SNPs that overlap between the dbSNP and COSMIC databases. Point mutations not present in SNP databases are immediately forwarded as candidate mutations. For mutations present in dbSNP, the program references functional consequences predicted by the in-silico prediction tools SIFT, PolyPhen, MutationAssessor and PhyloP, clinical associations flagged by dbSNP, and disease susceptibility information from the GWAS catalog (http://www.genome.gov/gwastudies/). The first filtering module allows users to create customized cancer-associated variant lists by selecting ① the required minimum number of tumor samples supporting each mutation in COSMIC, ② the number of mutations occurring in the gene, and ③ the required number of tools predicting damage. The second parameter in particular is designed to aid in identifying cancer driver genes by rescuing mutations that are either clustered in “hot spots” or scattered along the entire gene. The second module then rescues clinically associated or disease susceptibility SNPs. Finally, the rescued SNP list from the two modules is provided as output. Supplementary Figure 4 shows the workflow of the proposed comprehensive SNP filtering approach.
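The two-module logic described above might look roughly like the following sketch. It is not the CSTAR implementation. The data layout, the function name, and the default thresholds are hypothetical; the sketch assumes that all three criteria of the first module must be met together, that the per-gene count refers to input mutations found in the knowledgebase, and that clinical and GWAS flags are stored alongside each knowledgebase entry.

```python
from collections import Counter


def cstar_sketch(mutations, kb, min_samples=5, min_gene_mutations=1, min_tools=1):
    """Return (novel candidates, rescued dbSNP mutations).

    mutations: list of dicts with 'gene', 'chrom', 'pos', 'in_dbsnp'.
    kb: dict of (chrom, pos) -> dict with 'samples', 'tools_damaging',
        'clinical', 'gwas' for SNPs shared by dbSNP and COSMIC.
    """
    in_kb = [m for m in mutations if (m["chrom"], m["pos"]) in kb]
    per_gene = Counter(m["gene"] for m in in_kb)

    novel, module1, module2 = [], [], []
    for m in mutations:
        if not m["in_dbsnp"]:
            novel.append(m)  # forwarded directly as a candidate
            continue
        entry = kb.get((m["chrom"], m["pos"]))
        if entry is None:
            continue  # in dbSNP but not in the overlap: stays filtered
        passes_module1 = (
            entry["samples"] >= min_samples
            and per_gene[m["gene"]] >= min_gene_mutations
            and entry["tools_damaging"] >= min_tools
        )
        if passes_module1:
            module1.append(m)
        elif entry["clinical"] or entry["gwas"]:
            module2.append(m)  # clinically associated or susceptibility SNP
    return novel, module1 + module2
```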

Supplementary Figures

Supplementary Figure 1. The number of overlapped positions supported by at least 1, 5 and 10 tumor samples

Supplementary Figure 2. Mutually exclusive alteration pattern between PIK3CA and TP53 (P = 0.03). Tumor samples with or without mutations are labeled in red or blue, respectively. For PIK3CA and TP53, newly identified tumor samples with filtered mutations are marked with asterisks. P values were calculated by a two-tailed Fisher exact test.
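For readers unfamiliar with the test named in the caption, a two-tailed Fisher exact test on a 2x2 table can be written with the standard library alone. This is a generic sketch, not the authors' code: the function name is hypothetical, and the two-tailed P value is taken as the sum of the probabilities of all tables that are no more likely than the observed one. In a mutual-exclusivity setting, the four cells would count tumor samples by mutation status of the two genes.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-9)
    )
    return min(1.0, p)
```

For the table [[8, 2], [1, 5]], the tables no more likely than the observed one carry a total probability of 400/11440, so the function returns about 0.035.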

Supplementary Figure 3. For TP53, Kaplan-Meier survival curves for tumor samples with cancer-associated somatic mutations represented in dbSNP or other variants versus wild-type by log-rank test. The tumor sample (WA10) having both the cancer-associated somatic mutation represented in dbSNP and other variant was excluded from survival analysis.

Supplementary Figure 4. Workflow of the proposed comprehensive SNP filtering approach

Supplementary Tables

Supplementary Table 1. List of the compiled cancer genomics articles

Journal Year Category Title Evidence sentences

Nature 2010 Filtering A comprehensive catalogue of somatic mutations from a human cancer genome

In order to allow for any under-called positions in the germline, no observations of that allele were permitted in the germ line, although one call was permitted if the depth was ≥30×. Substitutions corresponding to known SNP positions (dbSNP 129) were excluded. Substitutions were annotated using Ensembl version 52.

Nature 2010 Filtering A small-cell lung cancer genome with complex signatures of tobacco exposure

We used the optimal thresholds defined in point 5 of the power calculations above (based on a mutation prevalence of 8 per Mb, as estimated from capillary sequence data in COSMIC) to determine whether there was sufficient evidence for calling a somatic substitution or not at each base in this preliminary list. Resulting tumour-specific substitutions were further filtered to remove (1) those residing in regions of loss of heterozygosity (LOH) in the normal cell line; (2) those potentially due to misalignment in segmental duplications and near sequence gaps; (3) those corresponding to polymorphic positions in dbSNP; (4) those potentially due to misalignment or miscalls as they are adjacent to SNPs or within 5 bp of insertions and deletions; and (5) those where all supporting reads contained the putative variant in the first or last 5 bp of the read (to reduce effects of misalignment across indels). Substitutions were annotated using Ensembl version 52.

Nature 2010 Filtering Genome remodelling in a basal-like breast cancer metastasis and xenograft.

We again followed the same procedure as described in Mardis et al(1). Predicted SNVs and Indels were compared to dbSNP 129. For SNVs, we require a position match for determining concordance between the variant and dbSNP 129. In addition, we compared (by position) predicted SNVs with SNPs found in the CEU and YRI trios as determined from the 1,000 Genomes project.

Nature 2010 Filtering The mutation spectrum revealed by paired genome sequences from a lung cancer patient

This suggests that excluding SNVs that are only partially called in the normal would have increased the overall validation rate to 78% without a large impact on sensitivity. Further, excluding such loci that are only partially called in the normal would yield only 8,732 tumor-specific SNVs that are also described in dbSNP (i.e. likely false negative calls in the normal genome assembly).

Nature Genetics 2011 Filtering Frequent somatic mutations in MAP3K5 and MAP3K9 in metastatic melanoma identified by exome sequencing

The pileup file of all variations detected in each sample was first compared to all variations annotated in dbSNP132 along with data from the 1000 Genomes Project. After this analysis, all newly identified variations were fully annotated.

Nature Genetics 2011 Filtering Exome sequencing identifies GRIN2A as frequently mutated in melanoma

To eliminate common germline mutations from consideration, alterations observed in dbSNP130 or in the 1000 Genomes Project 11_2010 data release project were removed.

Nature Genetics 2011 Filtering Exome sequencing identifies somatic mutations of DNA methyltransferase gene DNMT3A in acute monocytic leukemia

We used an in-house software system to identify somatic mutations by comparing variants identified in bone marrow exome data set against dbSNP and germline variants present in peripheral blood control samples.

Nature Genetics 2011 Filtering Frequent mutations of chromatin remodeling genes in transitional cell carcinoma of the bladder

To eliminate any previously described germline variants, we cross-referenced potential somatic mutations against the dbSNP130 and SNP datasets of Han Chinese in Beijing (CHB) and Japanese in Tokyo (JPT) from the three pilot studies in the 1000 Genomes Project.

Nature Genetics 2011 Filtering Analysis of the coding genome of diffuse large B-cell lymphoma

For the tumor samples, only 'high confidence' variants (that is, variants supported by at least one read in one direction and two non-duplicate reads in the opposite direction) were retained, according to the GS Reference Mapper Software algorithm. For the normal samples, a less stringent criterion was applied in that all variants detected in at least one read were considered to be present in the sample. Candidate somatic (that is, tumor-specific) variants were then obtained by removing known population polymorphisms present in the NCBI dbSNP database (Build 132) as well as variants present in the corresponding paired normal DNA.

Nature Genetics 2011 Filtering Exome sequencing identifies frequent mutation of ARID1A in molecular subtypes of gastric cancer

Candidate somatic (that is, tumor-specific) variants were then obtained by removing known population polymorphisms present in the NCBI dbSNP database (Build 132) as well as variants present in the corresponding paired normal DNA.

Nature Genetics 2011 Filtering Frequent mutations of genes encoding ubiquitin-mediated proteolysis pathway components in clear cell renal cell carcinoma

In order to eliminate any previously described germline variants, the somatic mutations were cross-referenced against the dbSNP (version 130) and SNP data sets of Han Chinese in Beijing (CHB) and Japanese in Tokyo (JPT) from the three pilot studies in the 1000 genomes project (http://www.1000genomes.org). Any mutations present in above data sets were filtered out and the remaining mutations were subjected to subsequent analyses.

Nature 2011 Filtering Frequent mutation of histone-modifying genes in non-Hodgkin lymphoma

Any SNV near gapped alignments or exactly overlapping sites assessed as being polymorphisms (SNPs) were disregarded, including variants matching a position in dbSNP or the sequenced personal genomes of Venter58, Watson59 or the anonymous Asian60 and Yoruban61 individuals.

Nature 2011 Filtering Frequent pathway mutations of splicing machinery in myelodysplasia

Synonymous variants, polymorphisms registered in the dbSNP131 and 1000 genome database, and variants on the intron region except splicing sites were filtered.

PNAS 2011 Filtering Whole-exome sequencing of neoplastic cysts of the pancreas reveals recurrent mutations in components of ubiquitin-dependent pathways

Duplicate tags were removed, and a mismatched base was identified as a mutation only when (i) it was identified by more than five distinct tags, (ii) the number of distinct tags containing a particular mismatched base was at least 20% of the total distinct tags, (iii) it was not present in >0.1% of the tags in the matched normal sample, and (iv) it was not present in SNP databases (dbSNP Build 134 Release, http://www.ncbi.nlm.nih.gov/projects/SNP/ and http://browser.1000genomes.org/index.html).

PNAS 2011 Filtering Exome sequencing identifies a spectrum of mutation frequencies in advanced and lethal prostate cancers

A majority of the variants identified by exome sequencing were present within dbSNP. After removing from consideration all variants that were observed in the pilot dataset of the 1000 Genomes Project (11, 12) as well as any variants present in any of ~2,000 additional exomes sequenced at the University of Washington, the number of variants remaining in 20/23 samples was reduced to ~350.

Nature Genetics 2012 Filtering Integrated analysis of somatic mutations and focal copy-number changes identifies key genes and pathways in hepatocellular carcinoma

Variants were filtered for their coding localization, annotation in dbSNP131 or 1000 genomes, somatic and functionally impairment.

Nature Genetics 2012 Filtering Whole-genome sequencing of liver cancers identifies etiological influences on mutation patterns and recurrent mutations in chromatin regulators

If a base with consensus quality lower than 20 occurs within 3bp on either side of the target SNV, we discarded the SNVs. After SNV calling in the tumor samples, candidate SNVs were filtered based on the lymphocyte sequence of the same patient; (1) candidate SNV alleles with a frequency ≥ 0.03 after removing reads with base quality < 15, and mapping quality < 20, (2) depth of coverage in lymphocyte ≤ 5, (3) depth of coverage in lymphocyte ≤ 10 and candidate SNV allele was represented in the dbSNP database v131 (http://www.ncbi.nlm.nih.gov/projects/SNP/).

Nature 2012 Filtering Sequencing of neuroblastoma identifies chromothripsis and defects in neuritogenesis genes

Variants, which are reported in dbSNP130, that were found in any of the normal blood samples or that were found within the public genomes from Complete Genomics were removed from the data set.

Nature 2012 Filtering Driver mutations in histone H3.3 and chromatin remodelling genes in paediatric glioblastoma

A variant called in a tumour was considered to be a candidate somatic mutation if the matched normal sample had at least 10 reads covering this position and had zero variant reads, and the variant was not reported in dbSNP131 or the 1000 Genomes data set (October 2011).

PNAS 2011 Others Loss-of-function mutations in Notch receptors in cutaneous and lung squamous cell carcinoma

Tumor samples without matched normal samples: To eliminate common germline polymorphisms from consideration, variants that had the same position as variants present in pilot data from the 1,000 Genomes Project or in ∼2,000 exomes corresponding to normal (nontumor, nonxenografted) tissues sequenced at the University of Washington were removed from consideration. Tumor samples with matched normal samples: All mutations known in dbSNP were subtracted unless present in COSMIC.

Nature Genetics 2011 Partial filtering Exome sequencing of gastric adenocarcinoma identifies recurrent somatic mutations in cell adhesion and chromatin remodeling genes

To identify somatic mutations, we excluded from our analysis all germline variants found in the dbSNP131 or 1000 Genomes Project (4th August 2010 release) databases and then subtracted the sequence variants of the normal exomes from the tumor exomes. Any sequence variants found in COSMIC v47, a database of cancer somatic mutations, were retained.

Nature Genetics 2012 Partial filtering Exome sequencing of liver fluke-associated cholangiocarcinoma

We compared our variants against the common polymorphisms present in dbSNP131 and in the 1000 Genomes Project databases, in order to discard any common SNPs. Several cancer somatic mutations are also present in dbSNP, and we retained any common variants also found to be present in COSMIC v47.

Nature 2012 Partial filtering The genetic basis of early T-cell precursor acute lymphoblastic leukaemia

High-confidence germline variants that were not found in dbSNP were retained as novel variants. In addition, variants in dbSNP that were also present in OMIM or COSMIC were retained as these variants are likely to be of biologic importance.

Nature 2012 Partial filtering Novel mutations target distinct subgroups of medulloblastoma

Since only tumor samples were sequenced, known germline variations in dbSNP (excluding validated mutations in COSMIC, OMIMSNP and ClinicalVar), NHLBI Exome Sequencing Project (http://evs.gs.washington.edu/EVS/ downloaded on 11.21.2011) and germline variations identified by PCGP were removed.

Nature Genetics 2011 Unknown High-resolution characterization of a hepatocellular carcinoma genome

Nature Genetics 2011 Unknown Inactivating mutations of the chromatin remodeling gene ARID2 in hepatocellular carcinoma

Nature Genetics 2011 Unknown Genomic sequencing of colorectal adenocarcinomas identifies a recurrent VTI1A-TCF7L2 fusion

Nature Genetics 2011 Unknown Recurrent mutations in the U2AF1 splicing factor in myelodysplastic syndromes

Nature Genetics 2011 Unknown Somatic histone H3 alterations in pediatric diffuse intrinsic pontine gliomas and non-brainstem glioblastomas

Nature Genetics 2011 Unknown Exome sequencing identifies recurrent mutations of the splicing factor SF3B1 gene in chronic lymphocytic leukemia

Nature 2011 Unknown A novel recurrent mutation in MITF predisposes to familial and sporadic melanoma

Nature 2011 Unknown Initial genome sequencing and analysis of multiple myeloma

Nature 2011 Unknown The genomic complexity of primary human prostate cancer

Nature 2011 Unknown Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia

Genome Research 2011 Unknown Whole-exome sequencing of human pancreatic cancers and characterization of genomic instability caused by MLH1 haploinsufficiency and complete deficiency

Nature Genetics 2012 Unknown Exome sequencing identifies recurrent somatic MAP2K1 and MAP2K2 mutations in melanoma

Nature Genetics 2012 Unknown Somatic histone H3 alterations in pediatric diffuse intrinsic pontine gliomas and non-brainstem glioblastomas

Nature Genetics 2012 Unknown Exome sequencing identifies recurrent SPOP, FOXA1 and MED12 mutations in prostate cancer

Nature 2012 Unknown A novel retinoblastoma therapy from genomic and epigenetic analyses

Nature 2012 Unknown Clonal evolution in relapsed acute myeloid leukaemia revealed by whole-genome sequencing

Nature 2012 Unknown Exome sequencing identifies frequent mutation of the SWI/SNF complex gene PBRM1 in renal carcinoma

Nature 2012 Unknown Melanoma genome sequencing reveals frequent PREX2 mutations

Nature 2012 Unknown Sequence analysis of mutations and translocations across breast cancer subtypes

Nature 2012 Unknown Whole-genome analysis informs breast cancer response to aromatase inhibition

Nature 2012 Unknown The landscape of cancer genes and mutational processes in breast cancer

Nature 2012 Unknown Clonal selection drives genetic divergence of metastatic medulloblastoma

Genome Research 2012 Unknown Whole genome sequencing of matched primary and metastatic acral melanomas

PNAS 2012 Unknown Discovery and prioritization of somatic mutations in diffuse large B-cell lymphoma (DLBCL) by whole-exome sequencing

Nature 2012 Unknown The clonal and mutational evolution spectrum of primary triple-negative breast cancers

Supplementary Table 2. List of the cancer-associated somatic mutations represented in dbSNP

Gene Chromosome & Position (Hg19) Number of supporting tumor samples rs ID

JAK2 9:5073770-5073770 29268 rs77375493

TP53 17:7578407-7578407 25 rs138729528

KRAS 12:25398284-25398284 14685 rs121913529

PTPN11 12:112888198-112888198 25 rs121918453

BRAF 7:140453136-140453136 13572 rs113488022

CDKN2A 9:21971036-21971036 24 rs121913381

KRAS 12:25398285-25398285 4770 rs121913530

TP53 17:7579358-7579358 24 rs11540654

KRAS 12:25398281-25398281 3067 rs112445441

PDGFRA 4:55141036-55141036 24 rs121908586

IDH1 2:209113112-209113112 2307 rs121913500

EGFR 7:55249005-55249005 23 rs121913465

PIK3CA 3:178952085-178952085 1459 rs121913279

PTEN 10:89711900-89711900 23 rs121913294

EGFR 7:55259515-55259515 1423 rs121434568

APC 5:112175207-112175207 23 rs121913462

FGFR3 4:1803568-1803568 1200 rs121913483

EGFR 7:55221822-55221822 23 rs149840192

NRAS 1:115256529-115256529 952 rs11554290

BRAF 7:140453137-140453137 22 rs121913378

PIK3CA 3:178936091-178936091 793 rs104886003

TP53 17:7577139-7577139 22 rs55832599

TP53 17:7578406-7578406 765 rs28934578

CDKN2A 9:21971017-21971017 21 rs121913386

KIT 4:55599321-55599321 759 rs121913507

HRAS 11:534285-534285 21 rs104894226

IDH1 2:209113113-209113113 631 rs121913499

TP53 17:7577509-7577509 21 rs121912652

TP53 17:7577538-7577538 597 rs11540652

APC 5:112175390-112175390 20 rs121913328

TP53 17:7577120-7577120 592 rs28934576

TP53 17:7577084-7577084 20 rs121912667

NRAS 1:115256530-115256530 555 rs121913254

NF2 22:30067836-30067836 20 rs74315499

PIK3CA 3:178936082-178936082 492 rs121913273

APC 5:112175426-112175426 19 rs121913326

CTNNB1 3:41266124-41266124 489 rs121913412

PTEN 10:89720852-89720852 19 rs121909231

TP53 17:7577539-7577539 461 rs121912651

PIK3CA 3:178936093-178936093 19 rs121913275

NRAS 1:115258747-115258747 432 rs121913237

TSHR 14:81610299-81610299 19 rs28937584

TP53 17:7577121-7577121 427 rs121913343

HRAS 11:533873-533873 18 rs121913496

CTNNB1 3:41266137-41266137 395 rs121913409

BRAF 7:140453145-140453145 18 rs121913366

TP53 17:7577094-7577094 394 rs28934574

VHL 3:10183725-10183725 18 rs5030826

FGFR3 4:1806099-1806099 388 rs121913485

PIK3CA 3:178952007-178952007 18 rs121913288

TP53 17:7577548-7577548 338 rs28934575

FGFR2 10:123279677-123279677 18 rs79184941

CTNNB1 3:41266113-41266113 309 rs121913403

KIT 4:55594258-55594258 17 rs121913523

TP53 17:7577534-7577534 308 rs28934571

VHL 3:10183797-10183797 17 rs5030807

HRAS 11:534288-534288 300 rs104894230

BRAF 7:140481411-140481411 17 rs121913351

KRAS 12:25398282-25398282 299 rs121913535

CDKN2A 9:21971096-21971096 16 rs121913384

CTNNB1 3:41266101-41266101 279 rs121913400

CDKN2A 9:21971153-21971153 16 rs121913383

IDH2 15:90631934-90631934 277 rs121913502

CDKN2A 9:21971108-21971108 16 rs11552822

RET 10:43617416-43617416 247 rs74799832

TP53 17:7577511-7577511 16 rs28934577

NRAS 1:115258744-115258744 244 rs121434596

KRAS 12:25398262-25398262 15 rs121913538

FGFR3 4:1803564-1803564 235 rs121913482

PIK3CA 3:178936074-178936074 15 rs121913285

GNAS 20:57484420-57484420 230 rs11554273

EGFR 7:55233043-55233043 15 rs139236063

TP53 17:7578190-7578190 224 rs121912666

VHL 3:10183772-10183772 15 rs104893829

FLT3 13:28592642-28592642 223 rs121913488

TP53 17:7578518-7578518 14 rs28934875

CTNNB1 3:41266097-41266097 222 rs28931588

CSF1R 5:149433645-149433645 13 rs1801271

HRAS 11:533874-533874 214 rs121913233

STK11 19:1207021-1207021 13 rs121913324

NRAS 1:115258748-115258748 214 rs121913250

TP53 17:7578532-7578532 13 rs28934873

MPL 1:43815009-43815009 199 rs121913615

APC 5:112151261-112151261 12 rs137854568

DNMT3A 2:25457242-25457242 184 rs147001633

FGFR3 4:1806119-1806119 12 rs28931614

CTNNB1 3:41266136-41266136 163 rs121913407

NF2 22:30057302-30057302 12 rs74315496

PDGFRA 4:55152093-55152093 158 rs121908585

BRAF 7:140481417-140481417 11 rs121913348

PIK3CA 3:178936092-178936092 154 rs121913274

VHL 3:10191488-10191488 11 rs5030818

TP53 17:7578461-7578461 151 rs121912654

SRC 20:36031762-36031762 11 rs121913314

TP53 17:7577547-7577547 150 rs121912656

KIT 4:55599333-55599333 11 rs121913682

KRAS 12:25380275-25380275 142 rs17851045

TP53 17:7574012-7574012 11 rs17882252

GNAQ 9:80409488-80409488 142 rs121913492

TSHR 14:81610289-81610289 10 rs121908877

APC 5:112175639-112175639 136 rs121913332

STK11 19:1221319-1221319 10 rs121913322

CTNNB1 3:41266104-41266104 134 rs28931589

CDKN2A 9:21971177-21971177 10 rs121913382

PTEN 10:89692904-89692904 128 rs121909224

VHL 3:10191480-10191480 10 rs121913346

IDH2 15:90631838-90631838 127 rs121913503

GNAS 20:57484596-57484596 10 rs121913494

KRAS 12:25380276-25380276 118 rs121913240

APC 5:112164616-112164616 10 rs137854574

TP53 17:7577556-7577556 118 rs121912655

KRAS 12:25398279-25398279 10 rs104894365

TP53 17:7577085-7577085 113 rs112431538

VHL 3:10188200-10188200 10 rs5030811

TP53 17:7577022-7577022 110 rs121913344

TSHR 14:81610258-81610258 10 rs121908859

EGFR 7:55249071-55249071 107 rs121434569

MPL 1:43814979-43814979 10 rs121913614

FGFR3 4:1806089-1806089 107 rs121913479

SMO 7:128850341-128850341 10 rs121918347

CTNNB1 3:41266098-41266098 98 rs121913396

ATM 11:108175462-108175462 10 rs1801516

TP53 17:7578442-7578442 98 rs148924904

PTPN11 12:112888202-112888202 10 rs121918462

TP53 17:7577124-7577124 92 rs121912657

WT1 11:32413578-32413578 9 rs121907909

NRAS 1:115256528-115256528 92 rs121913255

PTEN 10:89717615-89717615 9 rs121909227

HRAS 11:534289-534289 91 rs104894229

BRAF 7:140453146-140453146 9 rs121913369

AKT1 14:105246551-105246551 91 rs121434592

TSHR 14:81606172-81606172 9 rs121908878

NRAS 1:115258745-115258745 90 rs121434595

TP53 17:7574017-7574017 9 rs121912664

KIT 4:55599320-55599320 88 rs121913506

WT1 11:32417910-32417910 9 rs142937387

PTPN11 12:112888210-112888210 85 rs121918464

KRAS 12:25380283-25380283 9 rs121913528

PIK3CA 3:178936094-178936094 84 rs121913286

ERBB2 17:37880220-37880220 9 rs121913470

TP53 17:7578479-7578479 83 rs28934874

ABL1 9:133748290-133748290 9 rs121913451

GNAS 20:57484421-57484421 79 rs121913495

EGFR 7:55259485-55259485 9 rs148934350

KIT 4:55593610-55593610 79 rs121913517

RB1 13:48941648-48941648 9 rs121913300

CDKN2A 9:21971120-21971120 77 rs121913388

CDKN2A 9:21971116-21971116 9 rs11552823

CTNNB1 3:41266112-41266112 77 rs121913228

APC 5:112164586-112164586 8 rs137854573

TP53 17:7577559-7577559 77 rs28934573

STK11 19:1220415-1220415 8 rs121913323

CTNNB1 3:41266125-41266125 76 rs121913413

RET 10:43609948-43609948 8 rs75076352

KRAS 12:25378562-25378562 71 rs121913527

IDH1 2:209108317-209108317 8 rs34218846

TP53 17:7577106-7577106 69 rs17849781

FGFR2 10:123258034-123258034 8 rs121913476

PTEN 10:89717672-89717672 67 rs121909219

VHL 3:10183764-10183764 8 rs5030804

CTNNB1 3:41266103-41266103 67 rs121913399

APC 5:112128143-112128143 8 rs62619935

PTEN 10:89692905-89692905 65 rs121909229

TP53 17:7577526-7577526 8 rs121912653

CDKN2A 9:21971186-21971186 65 rs121913387

BRAF 7:140453132-140453132 8 rs121913365

HRAS 11:534286-534286 63 rs104894228

MET 7:116423414-116423414 8 rs121913246

KIT 4:55593613-55593613 58 rs121913521

TP53 17:7578401-7578401 8 rs147002414

KIT 4:55593661-55593661 57 rs121913513

WT1 11:32413560-32413560 8 rs28941778

TP53 17:7577099-7577099 55 rs121912660

VHL 3:10191506-10191506 7 rs5030820

EGFR 7:55259524-55259524 54 rs121913444

DNMT3A 2:25466800-25466800 7 rs144689354

HRAS 11:533875-533875 54 rs28933406

TSHR 14:81610105-81610105 7 rs149978216

PIK3CA 3:178952074-178952074 54 rs121913283

VHL 3:10183794-10183794 7 rs119103277

FGFR3 4:1807889-1807889 51 rs78311289

PTPN11 12:112926884-112926884 7 rs121918458

FGFR3 4:1806092-1806092 49 rs121913484

APC 5:112175240-112175240 7 rs1801166

TP53 17:7577550-7577550 48 rs28934572

VHL 3:10188245-10188245 7 rs104893830

KIT 4:55593603-55593603 47 rs121913235

KRAS 12:25398255-25398255 7 rs121913236

EGFR 7:55241707-55241707 46 rs28929495

GNAS 20:57484597-57484597 7 rs137854533

FGFR3 4:1808331-1808331 44 rs121913480

RB1 13:48953760-48953760 7 rs121913302

FLT3 13:28592641-28592641 43 rs121909646

MET 7:116411990-116411990 7 rs56391007

PTPN11 12:112888211-112888211 43 rs121918465

CBL 11:119148991-119148991 6 rs192712314

ALK 2:29432664-29432664 41 rs113994087

NF2 22:30070880-30070880 6 rs74315504

FGFR3 4:1807890-1807890 41 rs121913105

VHL 3:10183739-10183739 6 rs5030802

APC 5:112175423-112175423 38 rs121913329

FKBP9 7:33014327-33014327 6 rs2953555

BRAF 7:140453134-140453134 37 rs121913364

KIT 4:55561764-55561764 6 rs121913505

PTPN11 12:112888199-112888199 36 rs121918454

VHL 3:10183785-10183785 6 rs5030828

FBXW7 4:153247289-153247289 36 rs149680468

JAK3 19:17948009-17948009 6 rs121913504

KIT 4:55599340-55599340 34 rs121913514

STK11 19:1220487-1220487 6 rs121913315

BRAF 7:140453154-140453154 34 rs121913338

WT1 11:32413566-32413566 6 rs121907900

PIK3CA 3:178952084-178952084 33 rs121913281

TP53 17:7578380-7578380 6 rs72661117

PIK3CA 3:178916876-178916876 33 rs121913287

VHL 3:10191503-10191503 6 rs104893825

KRAS 12:25380277-25380277 33 rs121913238

BRAF 7:140453193-140453193 6 rs121913370

BRAF 7:140481402-140481402 33 rs121913355

ERBB2 17:37881000-37881000 6 rs121913471

APC 5:112173917-112173917 32 rs121913333

KRAS 12:25380282-25380282 5 rs104886029

APC 5:112175576-112175576 32 rs121913330

PTEN 10:89692911-89692911 5 rs121909241

FGFR3 4:1806153-1806153 32 rs28931615

RET 10:43609949-43609949 5 rs75996173

KIT 4:55593464-55593464 31 rs3822214

WT1 11:32413565-32413565 5 rs121907903

PIK3CA 3:178952090-178952090 31 rs121913277

CDC73 1:193094272-193094272 5 rs121434265

PTEN 10:89711899-89711899 29 rs121913293

RB1 13:48955550-48955550 5 rs121913304

PIK3CA 3:178921553-178921553 29 rs121913284

VHL 3:10191555-10191555 5 rs5030823

APC 5:112174631-112174631 29 rs121913331

BRAF 7:140453150-140453150 5 rs121913341

CDKN2A 9:21971111-21971111 28 rs121913385

TRRAP 7:98509802-98509802 5 rs147405090

CDKN2A 9:21971028-21971028 28 rs121913389

BRAF 7:140481403-140481403 5 rs121913357

EGFR 7:55241708-55241708 28 rs121913428

KIT 4:55595519-55595519 5 rs121913516

PIK3CA 3:178927980-178927980 28 rs121913272

ZDHHC11 5:833915-833915 5 rs62332110

PTPN11 12:112888166-112888166 27 rs121918461

RB1 13:48955538-48955538 5 rs121913303

KIT 4:55599348-55599348 27 rs121913524

RB1 13:48942685-48942685 5 rs121913301

APC 5:112175303-112175303 27 rs121913327

NF1 17:29576111-29576111 5 rs137854560

NF2 22:30032794-30032794 26 rs121434259

BRAF 7:140453149-140453149 5 rs121913361

STK11 19:1223125-1223125 26 rs59912467

CDKN2A 9:21971053-21971053 5 rs137854598

TSHR 14:81609760-81609760 26 rs121908864

APC 5:112162891-112162891 5 rs137854580

KIT 4:55594221-55594221 25 rs121913512

Supplementary Table 3. Functional consequence of the cancer-associated somatic mutations represented in dbSNP

Gene, rsID, Chromosome & Position (Hg19), Number of supporting tumor samples, Ref, Var, SIFT Prediction, SIFT Score, PolyPhen2 Prediction, PolyPhen2 Score, MutationAssessor Prediction, MutationAssessor Score, Phylop Score

ABL1 rs121913451 9:133748290-133748290 9 C G DAMAGING 0.05 possibly damaging 0.95 low 0.86 0.85

AKT1 rs121434592 14:105246551-105246551 91 C T DAMAGING 0.01 probably damaging 1.00 high 3.85 2.36

ALK rs113994087 2:29432664-29432664 41 C T Not scored N/A probably damaging 1.00 medium 3.32 2.56

APC rs121913326 5:112175426-112175426 19 G T Not scored N/A nonsense Nonsense 2.94

APC rs121913327 5:112175303-112175303 27 C T Not scored N/A nonsense Nonsense 2.86

APC rs121913328 5:112175390-112175390 20 C T Not scored N/A nonsense Nonsense 1.57

APC rs121913329 5:112175423-112175423 38 C T Not scored N/A nonsense Nonsense 2.94

APC rs121913330 5:112175576-112175576 32 C T Not scored N/A nonsense Nonsense 2.83

APC rs121913331 5:112174631-112174631 29 C T Nonsense N/A nonsense Nonsense 1.39

APC rs121913332 5:112175639-112175639 136 C T Not scored N/A nonsense Nonsense 1.51

APC rs121913333 5:112173917-112173917 32 C T Nonsense N/A nonsense Nonsense 0.78

APC rs121913462 5:112175207-112175207 2 G A Not scored N/A benign 0.01 low 1.04 2.86

APC rs121913462 5:112175207-112175207 21 G T Not scored N/A benign 0.01 Nonsense 2.86

APC rs137854568 5:112151261-112151261 12 C T Nonsense N/A nonsense Nonsense 2.52

APC rs137854573 5:112164586-112164586 8 C T Nonsense N/A nonsense Nonsense -0.03

APC rs137854574 5:112164616-112164616 10 C T Nonsense N/A nonsense Nonsense -0.03

APC rs137854580 5:112162891-112162891 5 C T Nonsense N/A nonsense Nonsense 1.46

APC rs1801166 5:112175240-112175240 4 G C Not scored N/A benign 0.00 neutral 0.55 1.53

APC rs1801166 5:112175240-112175240 3 G T Not scored N/A benign 0.00 Nonsense 1.53

APC rs62619935 5:112128143-112128143 8 C T Nonsense N/A nonsense Nonsense 1.28

ATM rs1801516 11:108175462-108175462 10 G A TOLERATED 0.23 benign 0.04 medium 2.16 2.75

BRAF rs113488022 7:140453136-140453136 10 A C DAMAGING 0 possibly damaging 0.82 high 3.96 2.16

BRAF rs113488022 7:140453136-140453136 13550 A T DAMAGING 0 possibly damaging 0.82 medium 2.28 2.16

BRAF rs113488022 7:140453136-140453136 12 A G TOLERATED 0.32 possibly damaging 0.82 medium 2.58 2.16

BRAF rs121913338 7:140453154-140453154 31 T C DAMAGING 0 probably damaging 1.00 high 4.29 2.16

BRAF rs121913338 7:140453154-140453154 3 T A DAMAGING 0 probably damaging 1.00 high 4.64 2.16

BRAF rs121913341 7:140453150-140453150 5 A C DAMAGING 0 probably damaging 1.00 high 3.94 2.16

BRAF rs121913348 7:140481417-140481417 6 C T DAMAGING 0 probably damaging 1.00 high 4.49 2.62

BRAF rs121913348 7:140481417-140481417 5 C A DAMAGING 0 probably damaging 1.00 high 4.49 2.62

BRAF rs121913351 7:140481411-140481411 12 C A DAMAGING 0 probably damaging 1.00 high 4.63 2.62

BRAF rs121913351 7:140481411-140481411 4 C T DAMAGING 0 probably damaging 1.00 high 4.63 2.62

BRAF rs121913351 7:140481411-140481411 1 C G DAMAGING 0 probably damaging 1.00 high 4.63 2.62

BRAF rs121913355 7:140481402-140481402 5 C T DAMAGING 0 probably damaging 1.00 high 4.23 2.62

BRAF rs121913355 7:140481402-140481402 11 C A DAMAGING 0 probably damaging 1.00 high 4.58 2.62

BRAF rs121913355 7:140481402-140481402 17 C G DAMAGING 0 probably damaging 1.00 medium 3.16 2.62

BRAF rs121913357 7:140481403-140481403 1 C G DAMAGING 0 probably damaging 1.00 high 4.58 2.62

BRAF rs121913357 7:140481403-140481403 4 C T DAMAGING 0 probably damaging 1.00 high 4.58 2.62

BRAF rs121913361 7:140453149-140453149 5 C G DAMAGING 0 probably damaging 1.00 high 4.64 2.65

BRAF rs121913364 7:140453134-140453134 37 T C DAMAGING 0 possibly damaging 0.78 medium 2.20 2.16

BRAF rs121913365 7:140453132-140453132 6 T A DAMAGING 0 possibly damaging 0.92 medium 3.44 0.94

BRAF rs121913365 7:140453132-140453132 2 T G DAMAGING 0 possibly damaging 0.92 medium 3.44 0.94

BRAF rs121913366 7:140453145-140453145 11 A C DAMAGING 0 probably damaging 1.00 high 4.42 2.16

BRAF rs121913366 7:140453145-140453145 7 A T DAMAGING 0 probably damaging 1.00 high 4.42 2.16

BRAF rs121913369 7:140453146-140453146 9 G C DAMAGING 0 possibly damaging 0.93 medium 2.15 0.30

BRAF rs121913370 7:140453193-140453193 6 T C DAMAGING 0.04 probably damaging 1.00 medium 3.35 2.16

BRAF rs121913378 7:140453137-140453137 2 C A DAMAGING 0.01 benign 0.04 low 1.30 2.65

BRAF rs121913378 7:140453137-140453137 20 C T DAMAGING 0 benign 0.04 medium 2.33 2.65

CBL rs192712314 11:119148991-119148991 6 G A DAMAGING 0.05 probably damaging 1.00 medium 3.32 2.74

CDC73 rs121434265 1:193094272-193094272 2 C A Nonsense N/A nonsense Nonsense 1.37

CDC73 rs121434265 1:193094272-193094272 3 C G Nonsense N/A nonsense Nonsense 1.37

CDKN2A rs11552822 9:21971108-21971108 4 C T TOLERATED 0.31 probably damaging 1.00 medium 2.05 2.81

CDKN2A rs11552822 9:21971108-21971108 2 C G TOLERATED 0.54 probably damaging 1.00 medium 2.05 2.81

CDKN2A rs11552822 9:21971108-21971108 10 C A TOLERATED 1 probably damaging 1.00 medium 2.05 2.81

CDKN2A rs11552823 9:21971116-21971116 2 G T DAMAGING 0 probably damaging 1.00 synonymous in Uniprot 2.75

CDKN2A rs11552823 9:21971116-21971116 7 G A DAMAGING 0 probably damaging 1.00 synonymous in Uniprot 2.75

CDKN2A rs121913381 9:21971036-21971036 12 C A DAMAGING 0 probably damaging 1.00 medium 1.94 2.81

CDKN2A rs121913381 9:21971036-21971036 7 C G DAMAGING 0 probably damaging 1.00 medium 1.94 2.81

CDKN2A rs121913381 9:21971036-21971036 5 C T TOLERATED 0.19 probably damaging 1.00 medium 1.94 2.81

CDKN2A rs121913382 9:21971177-21971177 10 C A Nonsense N/A benign 0.09 low 1.39 -0.39

CDKN2A rs121913383 9:21971153-21971153 1 C T DAMAGING 0.02 probably damaging 0.97 low 1.04 0.79

CDKN2A rs121913383 9:21971153-21971153 15 C A Nonsense N/A probably damaging 0.97 low 1.04 0.79

CDKN2A rs121913384 9:21971096-21971096 3 C T TOLERATED 0.36 possibly damaging 0.64 medium 2.05 2.81

CDKN2A rs121913384 9:21971096-21971096 13 C A Nonsense N/A possibly damaging 0.64 medium 2.05 2.81

CDKN2A rs121913385 9:21971111-21971111 1 G T DAMAGING 0.02 possibly damaging 0.80 low 1.85 2.75

CDKN2A rs121913385 9:21971111-21971111 27 G A DAMAGING 0.02 possibly damaging 0.80 low 1.85 2.75

CDKN2A rs121913386 9:21971017-21971017 21 G A DAMAGING 0 probably damaging 1.00 Nonsense 2.81

CDKN2A rs121913387 9:21971186-21971186 65 G A Nonsense N/A possibly damaging 0.82 low 1.70 0.73

CDKN2A rs121913388 9:21971120-21971120 77 G A Nonsense N/A probably damaging 0.99 medium 2.05 1.46

CDKN2A rs121913389 9:21971028-21971028 1 C G TOLERATED 0.15 probably damaging 0.98 medium 2.00 2.81

CDKN2A rs121913389 9:21971028-21971028 27 C T Nonsense N/A probably damaging 0.98 medium 2.00 2.81

CDKN2A rs137854598 9:21971053-21971053 2 G T DAMAGING 0 probably damaging 1.00 synonymous in Uniprot 2.75

CDKN2A rs137854598 9:21971053-21971053 3 G A DAMAGING 0 probably damaging 1.00 synonymous in Uniprot 2.75

CSF1R rs1801271 5:149433645-149433645 13 T C DAMAGING 0 probably damaging 1.00 neutral 0.70 1.98

CTNNB1 rs121913228 3:41266112-41266112 1 T A DAMAGING 0 probably damaging 1.00 medium 2.57 2.16

CTNNB1 rs121913228 3:41266112-41266112 59 T G DAMAGING 0 probably damaging 1.00 medium 3.27 2.16

CTNNB1 rs121913228 3:41266112-41266112 17 T C DAMAGING 0 probably damaging 1.00 medium 3.27 2.16

CTNNB1 rs121913396 3:41266098-41266098 55 A G DAMAGING 0 probably damaging 1.00 medium 3.29 2.16

CTNNB1 rs121913396 3:41266098-41266098 28 A T DAMAGING 0 probably damaging 1.00 medium 3.29 2.16

CTNNB1 rs121913396 3:41266098-41266098 15 A C DAMAGING 0 probably damaging 1.00 medium 3.29 2.16

CTNNB1 rs121913399 3:41266103-41266103 55 G A DAMAGING 0 probably damaging 1.00 medium 3.29 2.71

CTNNB1 rs121913399 3:41266103-41266103 12 G C DAMAGING 0 probably damaging 1.00 medium 3.29 2.71

CTNNB1 rs121913400 3:41266101-41266101 145 C G DAMAGING 0 probably damaging 1.00 medium 3.29 2.71

CTNNB1 rs121913400 3:41266101-41266101 81 C T DAMAGING 0 probably damaging 1.00 medium 3.29 2.71

CTNNB1 rs121913400 3:41266101-41266101 53 C A DAMAGING 0 probably damaging 1.00 medium 3.29 2.71

CTNNB1 rs121913403 3:41266113-41266113 156 C T DAMAGING 0 probably damaging 1.00 medium 3.27 2.71

CTNNB1 rs121913403 3:41266113-41266113 126 C G DAMAGING 0 probably damaging 1.00 medium 3.27 2.71

CTNNB1 rs121913403 3:41266113-41266113 27 C A DAMAGING 0 probably damaging 1.00 medium 3.27 2.71

CTNNB1 rs121913407 3:41266136-41266136 1 T A DAMAGING 0.03 benign 0.13 medium 2.41 2.25

CTNNB1 rs121913407 3:41266136-41266136 10 T G DAMAGING 0 benign 0.13 medium 3.22 2.25

CTNNB1 rs121913407 3:41266136-41266136 152 T C DAMAGING 0 benign 0.13 medium 3.22 2.25

CTNNB1 rs121913409 3:41266137-41266137 15 C A DAMAGING 0 probably damaging 1.00 medium 3.22 2.80

CTNNB1 rs121913409 3:41266137-41266137 16 C G DAMAGING 0 probably damaging 1.00 medium 3.22 2.80

CTNNB1 rs121913409 3:41266137-41266137 364 C T DAMAGING 0 probably damaging 1.00 medium 3.22 2.80

CTNNB1 rs121913412 3:41266124-41266124 3 A T TOLERATED 0.16 possibly damaging 0.49 medium 2.69 2.25

CTNNB1 rs121913412 3:41266124-41266124 5 A C DAMAGING 0 possibly damaging 0.49 medium 3.24 2.25

CTNNB1 rs121913412 3:41266124-41266124 481 A G DAMAGING 0 possibly damaging 0.49 medium 3.24 2.25

CTNNB1 rs121913413 3:41266125-41266125 2 C G TOLERATED 0.16 possibly damaging 0.49 medium 2.69 2.80

CTNNB1 rs121913413 3:41266125-41266125 69 C T DAMAGING 0 possibly damaging 0.49 medium 3.24 2.80

CTNNB1 rs121913413 3:41266125-41266125 5 C A DAMAGING 0.04 possibly damaging 0.49 medium 3.24 2.80

CTNNB1 rs28931588 3:41266097-41266097 115 G T DAMAGING 0 probably damaging 1.00 medium 3.29 2.71

CTNNB1 rs28931588 3:41266097-41266097 38 G C DAMAGING 0 probably damaging 1.00 medium 3.29 2.71

CTNNB1 rs28931588 3:41266097-41266097 69 G A DAMAGING 0 probably damaging 1.00 medium 3.29 2.71

CTNNB1 rs28931589 3:41266104-41266104 65 G A DAMAGING 0 probably damaging 1.00 medium 3.29 2.71

CTNNB1 rs28931589 3:41266104-41266104 69 G T DAMAGING 0 probably damaging 1.00 medium 3.29 2.71

DNMT3A rs144689354 2:25466800-25466800 7 G A DAMAGING 0 probably damaging 1.00 high 3.61 1.26

DNMT3A rs147001633 2:25457242-25457242 184 C T DAMAGING 0.03 possibly damaging 0.65 medium 2.83 2.69

EGFR rs121434568 7:55259515-55259515 1422 T G DAMAGING 0 probably damaging 1.00 high 4.01 2.15

EGFR rs121434568 7:55259515-55259515 1 T A DAMAGING 0 probably damaging 1.00 high 4.01 2.15

EGFR rs121434569 7:55249071-55249071 107 C T DAMAGING 0 probably damaging 1.00 low 1.74 2.80

EGFR rs121913428 7:55241708-55241708 26 G C DAMAGING 0 probably damaging 1.00 high 4.06 2.54

EGFR rs121913428 7:55241708-55241708 2 G A DAMAGING 0 probably damaging 1.00 high 4.06 2.54

EGFR rs121913444 7:55259524-55259524 6 T G DAMAGING 0 probably damaging 1.00 high 3.54 2.22

EGFR rs121913444 7:55259524-55259524 48 T A DAMAGING 0 probably damaging 1.00 medium 2.85 2.22

EGFR rs121913465 7:55249005-55249005 1 G A DAMAGING 0 probably damaging 1.00 medium 2.14 2.73

EGFR rs121913465 7:55249005-55249005 22 G T DAMAGING 0.01 probably damaging 1.00 medium 3.24 2.73

EGFR rs139236063 7:55233043-55233043 15 G T DAMAGING 0.01 probably damaging 1.00 medium 3.15 2.77

EGFR rs148934350 7:55259485-55259485 9 C T DAMAGING 0 probably damaging 1.00 high 3.54 2.75

EGFR rs149840192 7:55221822-55221822 3 C A DAMAGING 0 probably damaging 1.00 medium 1.99 2.82

EGFR rs149840192 7:55221822-55221822 20 C T DAMAGING 0 probably damaging 1.00 medium 3.38 2.82

EGFR rs28929495 7:55241707-55241707 26 G A DAMAGING 0 probably damaging 1.00 high 4.06 2.75

EGFR rs28929495 7:55241707-55241707 20 G T DAMAGING 0 probably damaging 1.00 high 4.06 2.75

ERBB2 rs121913470 17:37880220-37880220 9 T C DAMAGING 0 probably damaging 1.00 high 3.94 1.92

ERBB2 rs121913471 17:37881000-37881000 5 G T TOLERATED 0.47 possibly damaging 0.81 low 1.12 2.42

ERBB2 rs121913471 17:37881000-37881000 1 G A TOLERATED 0.32 possibly damaging 0.81 low 1.86 2.42

FBXW7 rs149680468 4:153247289-153247289 2 G C DAMAGING 0 probably damaging 1.00 high 3.55 1.56

FBXW7 rs149680468 4:153247289-153247289 34 G A DAMAGING 0 probably damaging 1.00 high 3.55 1.56

FGFR2 rs121913476 10:123258034-123258034 2 A T DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 medium 2.89 0.24

FGFR2 rs121913476 10:123258034-123258034 6 A C DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 medium 2.89 0.24

FGFR2 rs79184941 10:123279677-123279677 18 G C DAMAGING *Warning! Low confidence. 0.04 probably damaging 0.99 high 3.89 2.74

FGFR3 rs121913105 4:1807890-1807890 5 A C DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.25 0.55

FGFR3 rs121913105 4:1807890-1807890 36 A T DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.45 0.55

FGFR3 rs121913479 4:1806089-1806089 107 G T DAMAGING *Warning! Low confidence. 0.01 benign 0.01 medium 2.54 0.38

FGFR3 rs121913480 4:1808331-1808331 44 G T DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.75 2.16

FGFR3 rs121913482 4:1803564-1803564 235 C T DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 3.81 1.89

FGFR3 rs121913483 4:1803568-1803568 1200 C G DAMAGING *Warning! Low confidence. 0.01 probably damaging 1.00 high 3.54 1.89

FGFR3 rs121913484 4:1806092-1806092 49 A T DAMAGING *Warning! Low confidence. 0 probably damaging 0.96 medium 2.57 0.20

FGFR3 rs121913485 4:1806099-1806099 388 A G DAMAGING *Warning! Low confidence. 0.01 probably damaging 0.99 medium 3.06 0.67

FGFR3 rs28931614 4:1806119-1806119 12 G A DAMAGING *Warning! Low confidence. 0.03 probably damaging 0.96 medium 2.28 1.02

FGFR3 rs28931615 4:1806153-1806153 32 C A TOLERATED 0.06 possibly damaging 0.63 medium 2.26 0.90

FGFR3 rs78311289 4:1807889-1807889 5 A C DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 3.90 1.69

FGFR3 rs78311289 4:1807889-1807889 46 A G DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.25 1.69

FKBP9 rs2953555 7:33014327-33014327 6 G A DAMAGING 0.01 probably damaging 1.00 high 3.81 2.45

FLT3 rs121909646 13:28592641-28592641 43 T A DAMAGING 0 probably damaging 1.00 high 3.93 2.25

FLT3 rs121913488 13:28592642-28592642 6 C T DAMAGING 0 probably damaging 0.96 medium 2.90 2.79

FLT3 rs121913488 13:28592642-28592642 188 C A DAMAGING 0 probably damaging 0.96 medium 3.24 2.79

FLT3 rs121913488 13:28592642-28592642 29 C G DAMAGING 0 probably damaging 0.96 medium 3.39 2.79

GNAQ rs121913492 9:80409488-80409488 1 T C DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.65 2.14

GNAQ rs121913492 9:80409488-80409488 78 T A DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.65 2.14

GNAQ rs121913492 9:80409488-80409488 63 T G DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.65 2.14

GNAS rs11554273 20:57484420-57484420 5 C A DAMAGING 0 probably damaging 1.00 high 4.37 2.59

GNAS rs11554273 20:57484420-57484420 225 C T DAMAGING 0 probably damaging 1.00 high 4.37 2.59

GNAS rs121913494 20:57484596-57484596 10 A T DAMAGING 0 probably damaging 1.00 high 4.34 0.98

GNAS rs121913495 20:57484421-57484421 1 G T DAMAGING 0 probably damaging 1.00 high 4.37 2.59

GNAS rs121913495 20:57484421-57484421 78 G A DAMAGING 0 probably damaging 1.00 high 4.37 2.59

GNAS rs137854533 20:57484597-57484597 7 G T DAMAGING 0 probably damaging 1.00 high 4.00 0.84

HRAS rs104894226 11:534285-534285 10 C T DAMAGING *Warning! Low confidence. 0 benign 0.29 high 4.17 1.98

HRAS rs104894226 11:534285-534285 11 C A DAMAGING *Warning! Low confidence. 0 benign 0.29 high 4.17 1.98

HRAS rs104894228 11:534286-534286 6 C A DAMAGING *Warning! Low confidence. 0 possibly damaging 0.50 high 4.17 1.98

HRAS rs104894228 11:534286-534286 57 C G DAMAGING *Warning! Low confidence. 0 possibly damaging 0.50 high 4.17 1.98

HRAS rs104894229 11:534289-534289 12 C G DAMAGING *Warning! Low confidence. 0.02 possibly damaging 0.53 medium 3.16 1.98

HRAS rs104894229 11:534289-534289 23 C A DAMAGING *Warning! Low confidence. 0.01 possibly damaging 0.53 medium 3.36 1.98

HRAS rs104894229 11:534289-534289 56 C T DAMAGING *Warning! Low confidence. 0.01 possibly damaging 0.53 medium 3.36 1.98

HRAS rs104894230 11:534288-534288 8 C G DAMAGING *Warning! Low confidence. 0 possibly damaging 0.86 high 4.05 1.98

HRAS rs104894230 11:534288-534288 41 C T DAMAGING *Warning! Low confidence. 0 possibly damaging 0.86 high 4.05 1.98

HRAS rs104894230 11:534288-534288 251 C A DAMAGING *Warning! Low confidence. 0.01 possibly damaging 0.86 medium 3.36 1.98

HRAS rs121913233 11:533874-533874 103 T A DAMAGING *Warning! Low confidence. 0.01 benign 0.01 high 4.71 1.66

HRAS rs121913233 11:533874-533874 111 T C DAMAGING *Warning! Low confidence. 0.02 benign 0.01 high 4.71 1.66

HRAS rs121913496 11:533873-533873 6 C G DAMAGING *Warning! Low confidence. 0.01 benign 0.03 high 4.01 -0.80

HRAS rs121913496 11:533873-533873 12 C A DAMAGING *Warning! Low confidence. 0.01 benign 0.03 high 4.01 -0.80

HRAS rs28933406 11:533875-533875 54 G T DAMAGING *Warning! Low confidence. 0.03 benign 0.01 high 4.36 2.05

IDH1 rs121913499 2:209113113-209113113 111 G C DAMAGING *Warning! Low confidence. 0 probably damaging 0.98 high 4.62 2.62

IDH1 rs121913499 2:209113113-209113113 423 G A DAMAGING *Warning! Low confidence. 0 probably damaging 0.98 high 4.62 2.62

IDH1 rs121913499 2:209113113-209113113 97 G T DAMAGING *Warning! Low confidence. 0 probably damaging 0.98 high 4.62 2.62

IDH1 rs121913500 2:209113112-209113112 2251 C T DAMAGING *Warning! Low confidence. 0 benign 0.03 high 3.92 1.34

IDH1 rs121913500 2:209113112-209113112 56 C A DAMAGING *Warning! Low confidence. 0 benign 0.03 high 4.62 1.34

IDH1 rs34218846 2:209108317-209108317 8 C T TOLERATED 0.13 probably damaging 0.99 medium 2.69 2.76

IDH2 rs121913502 15:90631934-90631934 269 C T DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.41 1.39

IDH2 rs121913502 15:90631934-90631934 8 C A DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.41 1.39

IDH2 rs121913503 15:90631838-90631838 106 C T DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.41 0.83

IDH2 rs121913503 15:90631838-90631838 21 C A DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.41 0.83

JAK2 rs77375493 9:5073770-5073770 29268 G T DAMAGING 0 probably damaging 1.00 medium 2.55 2.58

JAK3 rs121913504 19:17948009-17948009 6 G A DAMAGING 0 possibly damaging 0.95 medium 2.96 1.05

KIT rs121913235 4:55593603-55593603 12 T A DAMAGING 0 probably damaging 1.00 high 3.58 2.32

KIT rs121913235 4:55593603-55593603 14 T C DAMAGING 0 probably damaging 1.00 high 3.58 2.32

KIT rs121913235 4:55593603-55593603 21 T G DAMAGING 0 probably damaging 1.00 high 3.58 2.32

KIT rs121913505 4:55561764-55561764 6 G A TOLERATED 0.1 possibly damaging 0.84 medium 2.32 -0.14

KIT rs121913506 4:55599320-55599320 46 G T DAMAGING 0 probably damaging 1.00 high 4.16 2.63

KIT rs121913506 4:55599320-55599320 42 G C DAMAGING 0.01 probably damaging 1.00 medium 3.47 2.63

KIT rs121913507 4:55599321-55599321 1 A C DAMAGING 0 probably damaging 0.99 high 4.16 2.09

KIT rs121913507 4:55599321-55599321 758 A T DAMAGING 0 probably damaging 0.99 high 4.16 2.09

KIT rs121913512 4:55594221-55594221 25 A G DAMAGING 0 probably damaging 1.00 low 1.76 2.27

KIT rs121913513 4:55593661-55593661 57 T C DAMAGING 0 benign 0.26 high 3.59 2.32

KIT rs121913514 4:55599340-55599340 15 T A DAMAGING 0 probably damaging 1.00 medium 3.46 0.88

KIT rs121913514 4:55599340-55599340 15 T G DAMAGING 0 probably damaging 1.00 medium 3.46 0.88

KIT rs121913514 4:55599340-55599340 4 T T Not scored N/A probably damaging 1.00 synonymous in Uniprot 0.88

KIT rs121913516 4:55595519-55595519 5 C T DAMAGING 0 probably damaging 1.00 medium 2.48 2.78

KIT rs121913517 4:55593610-55593610 12 T G DAMAGING 0 probably damaging 0.99 medium 3.30 2.32

KIT rs121913517 4:55593610-55593610 52 T A DAMAGING 0 probably damaging 0.99 medium 3.30 2.32

KIT rs121913517 4:55593610-55593610 15 T C DAMAGING 0 probably damaging 0.99 medium 3.30 2.32

KIT rs121913521 4:55593613-55593613 50 T A DAMAGING 0 probably damaging 1.00 medium 2.95 2.32

KIT rs121913521 4:55593613-55593613 8 T G DAMAGING 0 probably damaging 1.00 medium 2.95 2.32

KIT rs121913523 4:55594258-55594258 17 T C DAMAGING 0.02 benign 0.32 medium 3.36 2.27

KIT rs121913524 4:55599348-55599348 27 T C DAMAGING 0.03 probably damaging 1.00 medium 2.30 2.03

KIT rs121913682 4:55599333-55599333 3 A T DAMAGING 0 probably damaging 1.00 high 3.60 2.03

KIT rs121913682 4:55599333-55599333 7 A G DAMAGING 0 probably damaging 1.00 medium 2.84 2.03

KIT rs121913682 4:55599333-55599333 1 A C DAMAGING 0 probably damaging 1.00 medium 3.25 2.03

KIT rs3822214 4:55593464-55593464 31 A C TOLERATED 0.6 benign 0.01 low 1.12 0.45

KRAS rs104886029 12:25380282-25380282 2 G C DAMAGING *Warning! Low confidence. 0 possibly damaging 0.47 high 4.29 2.89

KRAS rs104886029 12:25380282-25380282 3 G T DAMAGING *Warning! Low confidence. 0 possibly damaging 0.47 high 4.29 2.89

KRAS rs104894365 12:25398279-25398279 10 C T DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 3.78 2.67

KRAS rs112445441 12:25398281-25398281 3018 C T DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.08 2.67

KRAS rs112445441 12:25398281-25398281 22 C A DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.08 2.67

KRAS rs112445441 12:25398281-25398281 27 C G DAMAGING *Warning! Low confidence. 0.02 probably damaging 1.00 medium 3.18 2.67

KRAS rs121913236 12:25398255-25398255 7 G T DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.32 2.67

KRAS rs121913238 12:25380277-25380277 23 G T DAMAGING *Warning! Low confidence. 0.03 possibly damaging 0.56 high 3.54 2.89

KRAS rs121913238 12:25380277-25380277 10 G C DAMAGING *Warning! Low confidence. 0.01 possibly damaging 0.56 high 4.23 2.89

KRAS rs121913240 12:25380276-25380276 12 T G DAMAGING *Warning! Low confidence. 0 benign 0.03 high 4.23 2.30

KRAS rs121913240 12:25380276-25380276 57 T A DAMAGING *Warning! Low confidence. 0.01 benign 0.03 high 4.23 2.30

KRAS rs121913240 12:25380276-25380276 49 T C DAMAGING *Warning! Low confidence. 0.02 benign 0.03 high 4.23 2.30

KRAS rs121913527 12:25378562-25378562 68 C T DAMAGING *Warning! Low confidence. 0.01 probably damaging 1.00 high 3.63 2.72

KRAS rs121913527 12:25378562-25378562 3 C G DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.32 2.72

KRAS rs121913528 12:25380283-25380283 9 C T DAMAGING *Warning! Low confidence. 0.01 possibly damaging 0.94 medium 3.19 1.50

KRAS rs121913529 12:25398284-25398284 1273 C G DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 3.90 2.67

KRAS rs121913529 12:25398284-25398284 7999 C T DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 3.90 2.67

KRAS rs121913529 12:25398284-25398284 5413 C A DAMAGING *Warning! Low confidence. 0.01 probably damaging 1.00 high 3.90 2.67

KRAS rs121913530 12:25398285-25398285 745 C G DAMAGING *Warning! Low confidence. 0.02 probably damaging 1.00 medium 2.40 2.67

KRAS rs121913530 12:25398285-25398285 2809 C A DAMAGING *Warning! Low confidence. 0.01 probably damaging 1.00 medium 3.00 2.67

KRAS rs121913530 12:25398285-25398285 1216 C T DAMAGING *Warning! Low confidence. 0.01 probably damaging 1.00 medium 3.21 2.67

KRAS rs121913535 12:25398282-25398282 199 C A DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.08 2.67

KRAS rs121913535 12:25398282-25398282 42 C G DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.08 2.67

KRAS rs121913535 12:25398282-25398282 58 C T DAMAGING *Warning! Low confidence. 0.01 probably damaging 1.00 medium 3.18 2.67

KRAS rs121913538 12:25398262-25398262 11 C A DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.25 2.67

KRAS rs121913538 12:25398262-25398262 4 C G DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.25 2.67

KRAS rs17851045 12:25380275-25380275 54 T A DAMAGING *Warning! Low confidence. 0.01 benign 0.09 high 4.23 2.30

KRAS rs17851045 12:25380275-25380275 88 T G DAMAGING *Warning! Low confidence. 0.01 benign 0.09 high 4.23 2.30

MET rs121913246 7:116423414-116423414 8 A G DAMAGING 0 probably damaging 1.00 high 3.81 2.15

MET rs56391007 7:116411990-116411990 7 C T DAMAGING 0.01 probably damaging 1.00 medium 3.19 2.80

MPL rs121913614 1:43814979-43814979 10 G A TOLERATED 0.26 benign 0.00 low 1.10 0.31

MPL rs121913615 1:43815009-43815009 199 G T TOLERATED 0.49 benign 0.01 low 1.10 2.20

NF1 rs137854560 17:29576111-29576111 5 C T Nonsense N/A nonsense Nonsense 1.55

NF2 rs121434259 22:30032794-30032794 26 C T Not scored N/A nonsense Nonsense 2.87

NF2 rs74315496 22:30057302-30057302 12 C T Not scored N/A nonsense Nonsense 0.74

NF2 rs74315499 22:30067836-30067836 20 C T Not scored N/A nonsense Nonsense 2.68

NF2 rs74315504 22:30070880-30070880 6 C T Not scored N/A nonsense Nonsense 2.64

NRAS rs11554290 1:115256529-115256529 22 T G DAMAGING *Warning! Low confidence. 0 benign 0.24 high 4.42 2.09

NRAS rs11554290 1:115256529-115256529 157 T A DAMAGING *Warning! Low confidence. 0.01 benign 0.24 high 4.42 2.09

NRAS rs11554290 1:115256529-115256529 773 T C DAMAGING *Warning! Low confidence. 0.02 benign 0.24 high 4.42 2.09

NRAS rs121434595 1:115258745-115258745 22 C A DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.32 2.89

NRAS rs121434595 1:115258745-115258745 68 C G DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.32 2.89

NRAS rs121434596 1:115258744-115258744 12 C G DAMAGING *Warning! Low confidence. 0.02 probably damaging 1.00 high 3.52 2.89

NRAS rs121434596 1:115258744-115258744 48 C A DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.32 2.89

NRAS rs121434596 1:115258744-115258744 184 C T DAMAGING *Warning! Low confidence. 0.03 probably damaging 1.00 high 4.32 2.89

NRAS rs121913237 1:115258747-115258747 37 C G DAMAGING *Warning! Low confidence. 0 possibly damaging 0.61 high 4.15 2.89

NRAS rs121913237 1:115258747-115258747 347 C T DAMAGING *Warning! Low confidence. 0 possibly damaging 0.61 high 4.15 2.89

NRAS rs121913237 1:115258747-115258747 48 C A DAMAGING *Warning! Low confidence. 0.01 possibly damaging 0.61 medium 3.46 2.89

NRAS rs121913250 1:115258748-115258748 18 C G DAMAGING *Warning! Low confidence. 0.02 possibly damaging 0.61 medium 2.66 2.89

NRAS rs121913250 1:115258748-115258748 73 C A DAMAGING *Warning! Low confidence. 0.01 possibly damaging 0.61 medium 3.06 2.89

NRAS rs121913250 1:115258748-115258748 123 C T DAMAGING *Warning! Low confidence. 0.04 possibly damaging 0.61 medium 3.46 2.89

NRAS rs121913254 1:115256530-115256530 546 G T DAMAGING *Warning! Low confidence. 0.03 possibly damaging 0.92 high 4.07 2.62

NRAS rs121913254 1:115256530-115256530 9 G C DAMAGING *Warning! Low confidence. 0 possibly damaging 0.92 high 4.42 2.62

NRAS rs121913255 1:115256528-115256528 58 T A DAMAGING *Warning! Low confidence. 0.01 benign 0.29 high 4.42 2.09

NRAS rs121913255 1:115256528-115256528 34 T G DAMAGING *Warning! Low confidence. 0.01 benign 0.29 high 4.42 2.09

PDGFRA rs121908585 4:55152093-55152093 158 A T DAMAGING *Warning! Low confidence. 0 probably damaging 1.00 high 4.00 2.20

PDGFRA rs121908586 4:55141036-55141036 24 T A DAMAGING 0 probably damaging 1.00 medium 3.24 1.01

PIK3CA rs104886003 3:178936091-178936091 777 G A TOLERATED 0.25 probably damaging 0.96 medium 2.11 2.72

PIK3CA rs104886003 3:178936091-178936091 16 G C TOLERATED 0.12 probably damaging 0.96 medium 2.14 2.72

PIK3CA rs121913272 3:178927980-178927980 28 T C TOLERATED 0.34 probably damaging 1.00 medium 2.91 2.06

PIK3CA rs121913273 3:178936082-178936082 6 G C TOLERATED 0.51 probably damaging 1.00 low 1.78 2.72

PIK3CA rs121913273 3:178936082-178936082 486 G A TOLERATED 0.71 probably damaging 1.00 low 1.80 2.72

PIK3CA rs121913274 3:178936092-178936092 86 A C TOLERATED 0.22 probably damaging 1.00 low 1.78 2.17

PIK3CA rs121913274 3:178936092-178936092 68 A G TOLERATED 0.15 probably damaging 1.00 medium 2.11 2.17

PIK3CA rs121913275 3:178936093-178936093 8 G T TOLERATED 0.29 probably damaging 1.00 low 1.68 0.23

PIK3CA rs121913275 3:178936093-178936093 11 G C TOLERATED 0.29 probably damaging 1.00 low 1.68 0.23

PIK3CA rs121913277 3:178952090-178952090 22 G C TOLERATED 0.15 probably damaging 0.96 low 1.85 2.89

PIK3CA rs121913277 3:178952090-178952090 9 G A TOLERATED 0.5 probably damaging 0.96 medium 2.00 2.89

PIK3CA rs121913279 3:178952085-178952085 1290 A G TOLERATED 0.16 possibly damaging 0.64 low 0.94 2.31

PIK3CA rs121913279 3:178952085-178952085 169 A T TOLERATED 1 possibly damaging 0.64 neutral -0.18 2.31

PIK3CA rs121913281 3:178952084-178952084 33 C T TOLERATED 0.2 benign 0.00 neutral -0.49 2.89

PIK3CA rs121913283 3:178952074-178952074 12 G A TOLERATED 1 benign 0.14 neutral -0.73 2.89

PIK3CA rs121913283 3:178952074-178952074 2 G C TOLERATED 1 benign 0.14 neutral -0.73 2.89

PIK3CA rs121913283 3:178952074-178952074 40 G T TOLERATED 1 benign 0.14 neutral -0.73 2.89

PIK3CA rs121913284 3:178921553-178921553 29 T A TOLERATED 0.09 probably damaging 1.00 medium 3.07 0.42

PIK3CA rs121913285 3:178936074-178936074 15 C G DAMAGING 0.04 probably damaging 0.99 medium 3.21 2.72

PIK3CA rs121913286 3:178936094-178936094 74 C A TOLERATED 0.11 possibly damaging 0.71 medium 2.30 2.72

PIK3CA rs121913286 3:178936094-178936094 10 C G TOLERATED 1 possibly damaging 0.71 neutral 0.35 2.72

PIK3CA rs121913287 3:178916876-178916876 33 G A Not scored N/A probably damaging 1.00 medium 2.52 2.53

PIK3CA rs121913288 3:178952007-178952007 18 A G DAMAGING 0 probably damaging 1.00 high 3.65 2.31

PTEN rs121909219 10:89717672-89717672 67 C T Not scored N/A nonsense Nonsense 0.48

PTEN rs121909224 10:89692904-89692904 74 C G Not scored N/A probably damaging 1.00 high 4.43 2.41

PTEN rs121909224 10:89692904-89692904 54 C T Not scored N/A probably damaging 1.00 Nonsense 2.41

PTEN rs121909227 10:89717615-89717615 9 C T Not scored N/A nonsense Nonsense 2.68

PTEN rs121909229 10:89692905-89692905 5 G C Not scored N/A probably damaging 1.00 high 3.88 2.41

PTEN rs121909229 10:89692905-89692905 51 G A Not scored N/A probably damaging 1.00 high 4.09 2.41

PTEN rs121909229 10:89692905-89692905 9 G T Not scored N/A probably damaging 1.00 high 4.09 2.41

PTEN rs121909231 10:89720852-89720852 19 C T Not scored N/A nonsense Nonsense 2.52

PTEN rs121909241 10:89692911-89692911 1 G T Not scored N/A probably damaging 1.00 high 3.84 2.41

PTEN rs121909241 10:89692911-89692911 4 G A Not scored N/A probably damaging 1.00 high 4.39 2.41

PTEN rs121913293 10:89711899-89711899 29 C T Not scored N/A probably damaging 1.00 high 4.29 2.68

PTEN rs121913294 10:89711900-89711900 23 G A Not scored N/A probably damaging 1.00 medium 3.49 2.68

PTPN11 rs121918453 12:112888198-112888198 25 G A DAMAGING 0.05 probably damaging 1.00 medium 1.96 1.49

PTPN11 rs121918454 12:112888199-112888199 3 C A DAMAGING 0.03 probably damaging 1.00 medium 1.97 1.49

PTPN11 rs121918454 12:112888199-112888199 33 C T DAMAGING 0 probably damaging 1.00 medium 2.91 1.49

PTPN11 rs121918458 12:112926884-112926884 6 T C TOLERATED 0.12 possibly damaging 0.81 low 1.13 2.02

PTPN11 rs121918458 12:112926884-112926884 1 T A DAMAGING 0.03 possibly damaging 0.81 medium 3.21 2.02

PTPN11 rs121918461 12:112888166-112888166 2 A G DAMAGING 0.03 possibly damaging 0.91 low 0.96 2.21

PTPN11 rs121918461 12:112888166-112888166 25 A T DAMAGING 0 possibly damaging 0.91 medium 2.36 2.21

PTPN11 rs121918462 12:112888202-112888202 10 C T DAMAGING 0 probably damaging 1.00 high 3.86 2.78

PTPN11 rs121918464 12:112888210-112888210 12 G C DAMAGING 0 probably damaging 1.00 medium 2.14 2.78

PTPN11 rs121918464 12:112888210-112888210 73 G A DAMAGING 0 probably damaging 1.00 medium 2.78 2.78

PTPN11 rs121918465 12:112888211-112888211 7 A T DAMAGING 0 probably damaging 0.99 high 4.14 2.21

PTPN11 rs121918465 12:112888211-112888211 29 A G DAMAGING 0 probably damaging 0.99 high 4.14 2.21

PTPN11 rs121918465 12:112888211-112888211 7 A C DAMAGING 0 probably damaging 0.99 medium 2.55 2.21

RB1 rs121913300 13:48941648-48941648 9 C T Nonsense N/A nonsense Nonsense 2.79

RB1 rs121913301 13:48942685-48942685 5 C T Nonsense N/A nonsense Nonsense 1.45

RB1 rs121913302 13:48953760-48953760 7 C T Nonsense N/A nonsense Nonsense 2.64

RB1 rs121913303 13:48955538-48955538 5 C T Nonsense N/A nonsense Nonsense -0.15

RB1 rs121913304 13:48955550-48955550 5 C T Nonsense N/A nonsense Nonsense 2.49

RET rs74799832 10:43617416-43617416 247 T C DAMAGING 0 probably damaging 1.00 medium 3.33 2.19

RET rs75076352 10:43609948-43609948 8 T C DAMAGING 0 probably damaging 1.00 medium 2.52 1.87

RET rs75996173 10:43609949-43609949 5 G A DAMAGING 0 probably damaging 1.00 medium 2.52 1.08

SMO rs121918347 7:128850341-128850341 10 G T DAMAGING 0.01 probably damaging 1.00 high 3.99 2.72

SRC rs121913314 20:36031762-36031762 11 C T Nonsense N/A nonsense Nonsense 2.59

STK11 rs121913315 19:1220487-1220487 6 G T DAMAGING 0 probably damaging 1.00 high 4.40 2.58

STK11 rs121913322 19:1221319-1221319 10 C T TOLERATED 0.34 benign 0.03 neutral 0.30 2.13

STK11 rs121913323 19:1220415-1220415 8 C T Nonsense N/A nonsense Nonsense 2.64

STK11 rs121913324 19:1207021-1207021 13 C T Nonsense N/A nonsense Nonsense 1.73

STK11 rs59912467 19:1223125-1223125 26 C G TOLERATED 0.56 benign 0.06 medium 2.38 -2.10

TP53 rs112431538 17:7577085-7577085 97 C T DAMAGING 0 probably damaging 1.00 medium 3.33 2.54

TP53 rs112431538 17:7577085-7577085 16 C A Nonsense N/A probably damaging 1.00 Nonsense 2.54

TP53 rs11540652 17:7577538-7577538 11 C G DAMAGING 0 probably damaging 1.00 high 4.02 1.26

TP53 rs11540652 17:7577538-7577538 62 C A DAMAGING 0 probably damaging 1.00 high 4.02 1.26

TP53 rs11540652 17:7577538-7577538 524 C T DAMAGING 0.01 probably damaging 1.00 medium 3.47 1.26

TP53 rs11540654 17:7579358-7579358 24 C A DAMAGING 0.02 benign 0.37 medium 2.53 -0.22

TP53 rs121912651 17:7577539-7577539 450 G A DAMAGING 0 possibly damaging 0.94 high 4.02 0.63

TP53 rs121912651 17:7577539-7577539 11 G C DAMAGING 0 possibly damaging 0.94 high 4.02 0.63

TP53 rs121912652 17:7577509-7577509 8 C G DAMAGING 0 probably damaging 1.00 high 3.66 1.26

TP53 rs121912652 17:7577509-7577509 13 C A Nonsense N/A probably damaging 1.00 Nonsense 1.26

TP53 rs121912653 17:7577526-7577526 8 A G DAMAGING 0 probably damaging 1.00 high 3.66 2.03

TP53 rs121912654 17:7578461-7578461 10 C T TOLERATED 0.77 probably damaging 1.00 low 1.62 0.33

TP53 rs121912654 17:7578461-7578461 141 C A DAMAGING 0.01 probably damaging 1.00 medium 3.48 0.33

TP53 rs121912655 17:7577556-7577556 64 C A DAMAGING 0 probably damaging 1.00 high 4.04 2.53

TP53 rs121912655 17:7577556-7577556 16 C G DAMAGING 0 probably damaging 1.00 high 4.04 2.53

TP53 rs121912655 17:7577556-7577556 38 C T DAMAGING 0 probably damaging 1.00 high 4.04 2.53

TP53 rs121912656 17:7577547-7577547 51 C A DAMAGING 0 probably damaging 1.00 high 3.92 2.53

TP53 rs121912656 17:7577547-7577547 98 C T DAMAGING 0 probably damaging 1.00 high 3.92 2.53

TP53 rs121912656 17:7577547-7577547 1 C G DAMAGING 0 probably damaging 1.00 high 3.92 2.53

TP53 rs121912657 17:7577124-7577124 70 C T DAMAGING 0 probably damaging 1.00 high 3.80 2.67

TP53 rs121912657 17:7577124-7577124 22 C A DAMAGING 0.01 probably damaging 1.00 medium 2.83 2.67

TP53 rs121912660 17:7577099-7577099 43 C T DAMAGING 0 probably damaging 1.00 high 3.97 2.64

TP53 rs121912660 17:7577099-7577099 12 C A DAMAGING 0 probably damaging 1.00 high 3.97 2.64

TP53 rs121912664 17:7574017-7574017 6 C A DAMAGING 0.02 probably damaging 0.96 medium 2.80 0.65

TP53 rs121912664 17:7574017-7574017 3 C T DAMAGING 0.03 probably damaging 0.96 medium 2.80 0.65

TP53 rs121912666 17:7578190-7578190 9 T G DAMAGING 0 probably damaging 1.00 high 3.75 2.10

TP53 rs121912666 17:7578190-7578190 215 T C DAMAGING 0 probably damaging 1.00 medium 3.40 2.10

TP53 rs121912667 17:7577084-7577084 14 T A DAMAGING 0 probably damaging 1.00 high 3.67 2.06

TP53 rs121912667 17:7577084-7577084 4 T C DAMAGING 0 probably damaging 1.00 high 3.67 2.06

TP53 rs121912667 17:7577084-7577084 2 T G DAMAGING 0 probably damaging 1.00 medium 3.12 2.06

TP53 rs121913343 17:7577121-7577121 9 G C DAMAGING 0 probably damaging 1.00 high 3.81 1.26

TP53 rs121913343 17:7577121-7577121 418 G A DAMAGING 0 probably damaging 1.00 high 3.81 1.26

TP53 rs121913344 17:7577022-7577022 110 G A Nonsense N/A nonsense Nonsense 0.72

TP53 rs138729528 17:7578407-7578407 11 G C DAMAGING 0 probably damaging 1.00 high 3.98 0.75

TP53 rs138729528 17:7578407-7578407 14 G A DAMAGING 0 probably damaging 1.00 medium 3.43 0.75

TP53 rs147002414 17:7578401-7578401 8 G A DAMAGING 0 probably damaging 1.00 medium 3.46 2.80

TP53 rs148924904 17:7578442-7578442 98 T C DAMAGING 0 probably damaging 1.00 medium 2.80 0.40

TP53 rs17849781 17:7577106-7577106 51 G A DAMAGING 0 probably damaging 1.00 high 3.95 2.67

TP53 rs17849781 17:7577106-7577106 18 G C DAMAGING 0.01 probably damaging 1.00 high 3.95 2.67

TP53 rs17882252 17:7574012-7574012 11 C A Nonsense N/A nonsense Nonsense 1.16

TP53 rs28934571 17:7577534-7577534 288 C A DAMAGING 0.01 probably damaging 1.00 high 3.96 0.21

TP53 rs28934571 17:7577534-7577534 20 C G DAMAGING 0.01 probably damaging 1.00 high 3.96 0.21

TP53 rs28934572 17:7577550-7577550 34 C T DAMAGING 0 probably damaging 1.00 high 3.57 2.53

TP53 rs28934572 17:7577550-7577550 14 C A DAMAGING 0 probably damaging 1.00 high 3.92 2.53

TP53 rs28934573 17:7577559-7577559 69 G A DAMAGING 0 probably damaging 1.00 high 4.03 1.26

TP53 rs28934573 17:7577559-7577559 8 G T DAMAGING 0 probably damaging 1.00 high 4.03 1.26

TP53 rs28934574 17:7577094-7577094 26 G C DAMAGING 0 possibly damaging 0.86 high 3.80 0.14

TP53 rs28934574 17:7577094-7577094 368 G A DAMAGING 0 possibly damaging 0.86 medium 3.45 0.14

TP53 rs28934575 17:7577548-7577548 10 C G DAMAGING 0 probably damaging 1.00 high 3.58 2.53

TP53 rs28934575 17:7577548-7577548 47 C A DAMAGING 0 probably damaging 1.00 high 3.92 2.53

TP53 rs28934575 17:7577548-7577548 281 C T DAMAGING 0 probably damaging 1.00 medium 3.12 2.53

TP53 rs28934576 17:7577120-7577120 82 C A DAMAGING 0 probably damaging 1.00 high 3.81 2.53

TP53 rs28934576 17:7577120-7577120 26 C G DAMAGING 0 probably damaging 1.00 high 3.81 2.53

TP53 rs28934576 17:7577120-7577120 484 C T DAMAGING 0.01 probably damaging 1.00 medium 3.11 2.53

TP53 rs28934577 17:7577511-7577511 8 A G DAMAGING 0 probably damaging 1.00 high 4.00 0.84

TP53 rs28934577 17:7577511-7577511 8 A T DAMAGING 0 probably damaging 1.00 high 4.00 0.84

TP53 rs28934578 17:7578406-7578406 746 C T DAMAGING 0 probably damaging 0.99 high 3.63 2.66

TP53 rs28934578 17:7578406-7578406 19 C A DAMAGING 0 probably damaging 0.99 high 3.98 2.66

TP53 rs28934873 17:7578532-7578532 12 A T DAMAGING 0 benign 0.01 low 1.10 2.17

TP53 rs28934873 17:7578532-7578532 1 A C DAMAGING 0 benign 0.01 low 1.10 2.17

TP53 rs28934874 17:7578479-7578479 61 G A DAMAGING 0 possibly damaging 0.88 medium 2.92 2.80

TP53 rs28934874 17:7578479-7578479 8 G C DAMAGING 0 possibly damaging 0.88 medium 3.26 2.80

TP53 rs28934874 17:7578479-7578479 14 G T DAMAGING 0 possibly damaging 0.88 medium 3.46 2.80

TP53 rs28934875 17:7578518-7578518 14 C G DAMAGING 0 probably damaging 1.00 high 3.67 2.66

TP53 rs55832599 17:7577139-7577139 22 G A DAMAGING 0 probably damaging 0.98 high 3.87 1.33

TP53 rs72661117 17:7578380-7578380 6 C G DAMAGING 0.01 possibly damaging 0.95 medium 3.11 2.73

TRRAP rs147405090 7:98509802-98509802 5 C T DAMAGING 0 probably damaging 1.00 medium 2.73 2.57

TSHR rs121908859 14:81610258-81610258 10 A G DAMAGING 0 probably damaging 1.00 medium 3.36 2.00

TSHR rs121908864 14:81609760-81609760 26 T C DAMAGING 0 probably damaging 1.00 high 3.70 2.17

TSHR rs121908877 14:81610289-81610289 5 G T DAMAGING 0 probably damaging 0.98 medium 3.27 2.51

TSHR rs121908877 14:81610289-81610289 5 G C DAMAGING 0 probably damaging 0.98 medium 3.27 2.51

TSHR rs121908878 14:81606172-81606172 3 G C DAMAGING 0.01 probably damaging 1.00 high 3.69 2.51

TSHR rs121908878 14:81606172-81606172 2 G T DAMAGING 0.02 probably damaging 1.00 high 3.69 2.51

TSHR rs121908878 14:81606172-81606172 4 G A DAMAGING 0.02 probably damaging 1.00 high 3.69 2.51

TSHR rs149978216 14:81610105-81610105 7 T C DAMAGING 0 probably damaging 1.00 medium 2.70 2.11

TSHR rs28937584 14:81610299-81610299 8 G C DAMAGING 0 probably damaging 1.00 high 3.99 2.51

TSHR rs28937584 14:81610299-81610299 11 G T DAMAGING 0 probably damaging 1.00 high 3.99 2.51

VHL rs104893825 3:10191503-10191503 3 G A TOLERATED 0.51 probably damaging 0.99 low 0.89 1.29

VHL rs104893825 3:10191503-10191503 3 G T DAMAGING 0 probably damaging 0.99 medium 2.18 1.29

VHL rs104893829 3:10183772-10183772 1 C A TOLERATED 0.39 possibly damaging 0.95 low 1.61 2.48

VHL rs104893829 3:10183772-10183772 14 C T TOLERATED 0.4 possibly damaging 0.95 medium 1.96 2.48

VHL rs104893830 3:10188245-10188245 7 G C DAMAGING 0.01 probably damaging 0.98 medium 2.90 2.53

VHL rs119103277 3:10183794-10183794 2 G C DAMAGING 0 probably damaging 1.00 medium 3.09 2.35

VHL rs119103277 3:10183794-10183794 1 G T DAMAGING 0.04 probably damaging 1.00 medium 3.09 2.35

VHL rs119103277 3:10183794-10183794 4 G A Nonsense N/A probably damaging 1.00 Nonsense 2.35

VHL rs121913346 3:10191480-10191480 1 T G DAMAGING 0 probably damaging 1.00 medium 2.89 0.94

VHL rs121913346 3:10191480-10191480 3 T C DAMAGING 0 probably damaging 1.00 medium 2.89 0.94

VHL rs121913346 3:10191480-10191480 6 T A DAMAGING 0 probably damaging 1.00 medium 2.89 0.94

VHL rs5030802 3:10183739-10183739 6 G T Nonsense N/A nonsense Nonsense 1.28

VHL rs5030804 3:10183764-10183764 3 A T DAMAGING 0 probably damaging 0.99 medium 3.07 2.01

VHL rs5030804 3:10183764-10183764 3 A G DAMAGING 0.03 probably damaging 0.99 medium 3.07 2.01

VHL rs5030804 3:10183764-10183764 2 A C DAMAGING 0.04 probably damaging 0.99 medium 3.07 2.01

VHL rs5030807 3:10183797-10183797 3 T G DAMAGING 0 probably damaging 1.00 medium 2.43 1.88

VHL rs5030807 3:10183797-10183797 10 T A DAMAGING 0 probably damaging 1.00 medium 2.43 1.88

VHL rs5030807 3:10183797-10183797 4 T C DAMAGING 0.01 probably damaging 1.00 medium 2.43 1.88

VHL rs5030811 3:10188200-10188200 5 C A DAMAGING 0.01 probably damaging 1.00 medium 3.09 2.53

VHL rs5030811 3:10188200-10188200 1 C G DAMAGING 0.01 probably damaging 1.00 medium 3.09 2.53

VHL rs5030811 3:10188200-10188200 4 C T DAMAGING 0.02 probably damaging 1.00 medium 3.09 2.53

VHL rs5030818 3:10191488-10191488 11 C T Nonsense N/A nonsense Nonsense 1.39

VHL rs5030820 3:10191506-10191506 6 C T DAMAGING 0.01 probably damaging 1.00 medium 2.63 1.34

VHL rs5030820 3:10191506-10191506 1 C G TOLERATED 0.06 probably damaging 1.00 medium 2.63 1.34

VHL rs5030823 3:10191555-10191555 5 C A Nonsense N/A nonsense Nonsense 2.74

VHL rs5030826 3:10183725-10183725 9 C T DAMAGING 0 possibly damaging 0.82 medium 2.98 1.35

VHL rs5030826 3:10183725-10183725 2 C G DAMAGING 0 possibly damaging 0.82 medium 2.98 1.35

VHL rs5030826 3:10183725-10183725 7 C A Nonsense N/A possibly damaging 0.82 Nonsense 1.35

VHL rs5030828 3:10183785-10183785 6 T C TOLERATED 0.27 probably damaging 1.00 medium 2.19 0.81

WT1 rs121907900 11:32413566-32413566 2 G C DAMAGING 0 probably damaging 1.00 medium 2.87 1.53

WT1 rs121907900 11:32413566-32413566 4 G A DAMAGING 0.01 probably damaging 1.00 medium 2.87 1.53

WT1 rs121907903 11:32413565-32413565 5 C T TOLERATED 0.29 probably damaging 1.00 medium 2.87 2.87

WT1 rs121907909 11:32413578-32413578 9 G A Nonsense N/A nonsense Nonsense 2.87

WT1 rs142937387 11:32417910-32417910 9 G T Nonsense N/A nonsense Nonsense 2.94

WT1 rs28941778 11:32413560-32413560 8 C T DAMAGING 0.02 probably damaging 1.00 medium 2.87 2.87

ZDHHC11 rs62332110 5:833915-833915 5 G T TOLERATED 0.27 benign 0.00 neutral 0.69 -2.35

Supplementary Table 4. High-confidence filtered cancer-associated somatic mutations represented in dbSNP shown in two example articles

Bladder cancer dataset

Columns: Gene; rs ID; Number of tumor samples; Sample ID; Chr; Position (Hg18); Ref; Var; Normal (# reads supporting ref, # reads supporting var, Frequency); Tumor (# reads supporting ref, # reads supporting var, Frequency); p-value (Fisher exact test)

BRAF rs121913361 5 B54 chr7 140099618 C G 48 0 0% 5 6 54.55% 1.03E-05

BRAF rs121913338 34 B98 chr7 140099623 T C 45 0 0% 23 21 47.73% 1.54E-08

BRAF rs121913355 33 B88 chr7 140127871 C G 72 0 0% 39 19 32.76% 3.14E-08

CTNNB1 rs121913407 163 B88 chr3 41241140 T C 50 0 0% 20 19 48.72% 6.08E-09

HRAS rs121913233 214 B68 chr11 523874 T A 65 0 0% 9 61 87.14% 4.06E-29

HRAS rs28933406 54 B80-8 chr11 523875 G T 94 0 0% 39 43 52.44% 1.75E-18

NRAS rs11554290 952 B89-12 chr1 115058052 T A 109 1 0.91% 42 40 48.78% 3.74E-17

PIK3CA rs104886003 793 B54 chr3 180418785 G A 28 0 0% 20 5 20% 0.018514

PIK3CA rs121913273 492 B59 chr3 180418776 G A 95 0 0% 20 8 28.57% 3.02E-06

PIK3CA rs121913279 1459 B78 chr3 180434779 A G 63 0 0% 57 102 64.15% 4.34E-22

PIK3CA rs104886003 793 B90 chr3 180418785 G A 36 0 0% 47 32 40.51% 4.67E-07

PIK3CA rs121913273 492 B25 chr3 180418776 G A 54 0 0% 27 13 32.50% 3.99E-06

PIK3CA rs121913273 492 B98 chr3 180418776 G A 61 0 0% 23 8 25.81% 8.48E-05

PIK3CA rs104886003 793 B98 chr3 180418785 G C 63 0 0% 22 10 31.25% 6.38E-06

PIK3CA rs121913273 492 B85-0 chr3 180418776 G A 38 0 0% 25 30 54.55% 1.40E-09

PIK3CA rs121913287 33 B45 chr3 180399570 G A 76 0 0% 37 22 37.29% 8.36E-10

PIK3CA rs104886003 793 B80-8 chr3 180418785 G A 36 0 0% 34 22 39.29% 2.32E-06

PIK3CA rs104886003 793 B65 chr3 180418785 G A 47 0 0% 28 34 54.84% 1.77E-11

PIK3CA rs104886003 793 B18 chr3 180418785 G A 90 0 0% 8 4 33.33% 1.16E-04

PIK3CA rs104886003 793 B22 chr3 180418785 G A 35 0 0% 22 20 47.62% 3.49E-07

TP53 rs28934578 765 B86 chr17 7519131 C T 24 0 0% 12 45 78.95% 5.43E-12

TP53 rs121913344 110 B96 chr17 7517747 G A 98 0 0% 7 66 90.41% 7.73E-40

TP53 rs11540652 597 B34 chr17 7518263 C T 39 0 0% 11 14 56% 9.31E-08

TP53 rs28934578 765 B103 chr17 7519131 C T 35 0 0% 12 23 65.71% 4.66E-10

Prostate cancer dataset (Homozygous mutation)

Columns: Gene; rs ID; Number of tumor samples; Sample ID; Chr; Position (Hg18); Ref; Var; Tumor (# reads supporting ref, # reads supporting var, Frequency)

TP53 rs28934575 351 LuCaP58 chr17 7518273 C T 0 126 100%

TP53 rs28934576 605 LuCaP93 chr17 7517845 C T 0 94 100%

TP53 rs28934576 605 LuCaP145.2 chr17 7517845 C T 0 82 100%
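
The frequency and p-value columns of the bladder cancer dataset can be recomputed from the four read counts in each row. The following is a minimal sketch, not the authors' pipeline: the function names `variant_frequency` and `fisher_one_sided` are hypothetical, and the use of a one-sided Fisher exact test is an assumption that appears consistent with the BRAF rs121913361 (B54) row.

```python
from math import comb


def variant_frequency(ref_reads: int, var_reads: int) -> float:
    """Fraction of reads at a position that support the variant allele."""
    total = ref_reads + var_reads
    return var_reads / total if total else 0.0


def fisher_one_sided(normal_ref: int, normal_var: int,
                     tumor_ref: int, tumor_var: int) -> float:
    """One-sided Fisher exact p-value (assumed test direction).

    Probability, with all margins fixed, that at least `tumor_var`
    of the variant-supporting reads fall in the tumor sample.
    """
    tumor_total = tumor_ref + tumor_var
    normal_total = normal_ref + normal_var
    var_total = normal_var + tumor_var
    denom = comb(tumor_total + normal_total, var_total)
    upper = min(var_total, tumor_total)
    tail = sum(comb(tumor_total, k) * comb(normal_total, var_total - k)
               for k in range(tumor_var, upper + 1))
    return tail / denom


# BRAF rs121913361, sample B54: normal 48 ref / 0 var, tumor 5 ref / 6 var
print(f"{variant_frequency(5, 6):.2%}")        # 54.55%
print(f"{fisher_one_sided(48, 0, 5, 6):.2e}")  # 1.03e-05
```

Only the standard library is used here so that the hypergeometric sum behind the test stays visible; other rows of the table have not been re-derived in this sketch.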

Supplementary Table 5. Analysis of mutually exclusive alteration patterns

Newly identified tumor samples with high-confidence filtered cancer-associated somatic mutations represented in dbSNP

TP53: B103, B34, B86, B96

PIK3CA: B18, B22, B25, B45, B54, B59, B65, B78, B80-8, B85-0, B90, B98

Reported tumor samples with mutations in the original study

TP53: B101, B104-0, B14, B16, B23, B36, B37, B59-3, B60, B61, B63, B66, B71, B74, B77, B8, B81, B84, B89-12, B9

Contingency table of TP53 (rows) against PIK3CA (columns), P=0.034

PIK3CA Mutant; PIK3CA WT

TP53 Mutant 0 24

TP53 WT 12 61
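
The P value reported for this contingency table can be checked with a two-sided Fisher exact test. The sketch below assumes that this is the test used, which is not stated in the table itself; the function name `fisher_two_sided` is hypothetical. It sums the hypergeometric probabilities of all tables that are no more likely than the observed one.

```python
from math import comb


def fisher_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1 = a + b          # e.g. TP53-mutant samples
    col1 = a + c          # e.g. PIK3CA-mutant samples
    n = a + b + c + d
    denom = comb(n, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability of k samples in the top-left cell
        return comb(row1, k) * comb(n - row1, col1 - k) / denom

    lo = max(0, col1 - (n - row1))
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties
    return sum(p for p in (prob(k) for k in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# TP53 mutant: 0 PIK3CA mutant, 24 PIK3CA WT; TP53 WT: 12 and 61
p = fisher_two_sided(0, 24, 12, 61)
print(f"{p:.3f}")  # 0.034
```

With no sample carrying both mutations, the observed table sits at the lower extreme of the distribution, and the two-sided value also includes the small upper-tail probabilities.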

Supplementary Table 6. List of patients with cancer-associated somatic mutations represented in dbSNP and other variants in TP53

Patients with cancer-associated somatic mutations represented in dbSNP

Columns: ID; Genomic change (hg18); AA change; Survival (in months) from first hormone therapy; rs ID

WA15 g.chr17:7518259C>A p.R249S 7 rs28934571

WA3 g.chr17:7518263C>A p.R248L 14 rs11540652

WA30 g.chr17:7519131C>T p.R175H 41 rs28934578

WA40 g.chr17:7517846G>A p.R273C 45 rs121913343

WA50 g.chr17:7518259C>A p.R249S 110 rs28934571

WA53 g.chr17:7518915T>C p.Y220C 54 rs121912666

WA54 g.chr17:7517846G>A p.R273C 66 rs121913343

Patients with other variants

Columns: ID; Genomic change (hg18); AA change; Survival (in months) from first hormone therapy

WA11 g.chr17:7514729delG p.F340fs 17

WA18 g.chr17:7518948delCT p.R208fs 72

WA12 g.chr17:7519233C>T p.C141Y 60

WA14 g.chr17:7520093A>C p.Y107D 96

WA28 g.chr17:7519200G>A p.P152L 179

WA37 g.chr17:7517842A>C p.V274G 39

WA43-44 g.chr17:7518936C>A p.R213L 60

WA49 g.chr17:7518299T>C p.Y236C 81

WA31 g.chr17:7520258G>A p.Q52* 88

WA35 g.chr17:7518982C>A p.E198* 109

WA55 g.chr17:7520300G>A p.Q38* 50

WA10 High-level deletion 24

WA13 High-level deletion 132

WA22 High-level deletion 30

WA24 High-level deletion 22

WA26 High-level deletion 96

WA48 High-level deletion 105

WA57 High-level deletion 77

WA59 High-level deletion 94

Supplementary Table 7. Cancer-associated somatic mutations represented in 1000 Genomes Project

rsID MAF Gene Evidence of pathogenic SNPs

rs142937387 0.001 WT1

rs28934576 0.001 TP53 Flagged as Clinically-associated

rs121913322 0.002 STK11 Flagged as Clinically-associated

rs1801166 0.004 APC Flagged as Clinically-associated

rs56391007 0.005 MET Flagged as Clinically-associated

rs59912467 0.0128 STK11 Flagged as Clinically-associated

rs34218846 0.041 IDH1

rs3822214 0.064 KIT

rs1801516 0.079 ATM Significant GWAS locus

References

1. Sherry, S.T. et al. Nucleic Acids Res 29, 308-311 (2001).

2. Abecasis, G.R. et al. Nature 491, 56-65 (2012).

3. Forbes, S.A. et al. Nucleic Acids Res 39, D945-950 (2011).

4. Hamosh, A., Scott, A.F., Amberger, J.S., Bocchini, C.A. & McKusick, V.A. Nucleic Acids Res 33, D514-517 (2005).

5. Dreszer, T.R. et al. Nucleic Acids Res 40, D918-923 (2012).

6. Kumar, P., Henikoff, S. & Ng, P.C. Nat Protoc 4, 1073-1081 (2009).

7. Adzhubei, I.A. et al. Nat Methods 7, 248-249 (2010).

8. Reva, B., Antipin, Y. & Sander, C. Nucleic Acids Res 39, e118 (2011).

9. Pollard, K.S., Hubisz, M.J., Rosenbloom, K.R. & Siepel, A. Genome Res 20, 110-121 (2010).

10. Gui, Y. et al. Nat Genet 43, 875-878 (2011).

11. Leinonen, R., Sugawara, H. & Shumway, M. Nucleic Acids Res 39, D19-21 (2011).

12. Li, H. & Durbin, R. Bioinformatics 25, 1754-1760 (2009).

13. DePristo, M.A. et al. Nat Genet 43, 491-498 (2011).

14. Koboldt, D.C. et al. Bioinformatics 25, 2283-2285 (2009).

15. Kumar, A. et al. Proc Natl Acad Sci U S A 108, 17087-17092 (2011).

16. Robinson, J.T. et al. Nat Biotechnol 29, 24-26 (2011).

17. Grasso, C.S. et al. Nature 487, 239-243 (2012).
