SUPPLEMENTARY INFORMATION FOR
THE GENOMIC LANDSCAPE OF HYPODIPLOID ACUTE LYMPHOBLASTIC LEUKEMIA
Linda Holmfeldt1,30, Lei Wei1,30, Ernesto Diaz-Flores2, Michael Walsh3, Jinghui Zhang4, Li Ding5,6, Debbie Payne-Turner1, Michelle Churchman1, Anna Andersson1,7, Shann-Ching Chen1, Kelly McCastlain1, Jared Becksfort4, Jing Ma1, Gang Wu4, Samir N. Patel1,29, Susan L. Heatley1,29, Letha A. Phillips1, Guangchun Song1, John Easton8, Matthew Parker4, Xiang Chen4, Michael Rusch4, Kristy Boggs8, Bhavin Vadodaria8, Erin Hedlund4, Christina Drenberg9, Sharyn Baker9,
Deqing Pei10, Cheng Cheng10, Robert Huether4, Charles Lu5, Robert S. Fulton5,6, Lucinda L.
Fulton5,6, Yashodhan Tabib5, David J. Dooling5,6, Kerri Ochoa5, Mark Minden11, Ian D. Lewis12, L. Bik To12, Paula Marlton13, Andrew W. Roberts14, Gordana Raca15, Wendy Stock15, Geoffrey Neale16, Hans G. Drexler17, Ross A. Dickins18, David W. Ellison1, Sheila A. Shurtleff1, Ching-Hon Pui3, Raul C. Ribeiro3, Meenakshi Devidas19, Andrew J. Carroll20, Nyla A. Heerema21, Brent Wood22, Michael J. Borowitz23, Julie M. Gastier-Foster24,25,26, Susana C. Raimondi1, Elaine R. Mardis4,5,27, Richard K. Wilson4,5,27, James R. Downing1, Stephen P. Hunger28, Mignon L. Loh2, and Charles G. Mullighan1
1Pathology, St. Jude Children’s Research Hospital, Memphis, Tennessee, USA
2Department of Pediatrics, University of California School of Medicine, San Francisco, California, USA
3Department of Oncology, St. Jude Children’s Research Hospital, Memphis, Tennessee, USA
4Department of Computational Biology and Bioinformatics, St. Jude Children’s Research Hospital, Memphis, Tennessee, USA
5The Genome Institute at Washington University, St Louis, Missouri, USA
6Department of Genetics, Washington University School of Medicine, St Louis, Missouri, USA
7Department of Clinical Genetics, Lund University Hospital, Lund, Sweden
8Pediatric Cancer Genome Project, St. Jude Children’s Research Hospital, Memphis, Tennessee, USA
9Department of Pharmaceutical Sciences, St. Jude Children’s Research Hospital, Memphis, Tennessee, USA
10Department of Biostatistics, St. Jude Children’s Research Hospital, Memphis, Tennessee, USA
11Princess Margaret Hospital/University Health Network, University of Toronto, Ontario, Canada
12Division of Haematology, Institute of Medical and Veterinary Science, Adelaide, South Australia, Australia
13Oncology/Haematology Unit, Princess Alexandra Hospital, Woolloongabba, Queensland, Australia
14Department of Clinical Haematology and Bone Marrow Transplant, Royal Melbourne Hospital, Melbourne, Victoria, Australia
15Hematology/Oncology, University of Chicago Medicine, Chicago, Illinois, USA
16The Hartwell Center for Bioinformatics and Biotechnology, St. Jude Children’s Research Hospital, Memphis, Tennessee, USA
17Department of Human and Animal Cell Cultures, Deutsche Sammlung von Mikroorganismen und Zellkulturen, Braunschweig, Germany
18Molecular Medicine Division, Walter & Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia
19Department of Biostatistics, College of Medicine, University of Florida, Gainesville, Florida, USA
20Department of Genetics, University of Alabama at Birmingham, Birmingham, Alabama, USA
Nature Genetics: doi: 10.1038/ng.2532
21Department of Pathology, College of Medicine, Ohio State University Comprehensive Cancer Center, Columbus, Ohio, USA
22Department of Laboratory Medicine, Seattle Children’s Hospital, Seattle, Washington, USA
23Division of Hematologic Pathology, Johns Hopkins Hospital, Baltimore, Maryland, USA
24Department of Pathology and Laboratory Medicine, Nationwide Children’s Hospital, Columbus, Ohio, USA
25Department of Pathology, Ohio State University, Columbus, Ohio, USA
26Department of Pediatrics, Ohio State University, Columbus, Ohio, USA
27Siteman Cancer Center, Washington University, St Louis, Missouri, USA
28Section of Pediatric Hematology/Oncology/Bone Marrow Transplantation and Center for Cancer and Blood Disorders, University of Colorado Denver School of Medicine, Children’s Hospital Colorado, Aurora, Colorado, USA
29Present addresses: Weill Cornell Medical College, Cornell University, New York, New York, USA (S.N.P.) and Human Immunology, Centre for Cancer Biology, SA Pathology, Adelaide, South Australia, Australia (S.L.H.)
30These authors contributed equally to this work
TABLE OF CONTENTS
SUPPLEMENTARY NOTE
Next generation sequencing of hypodiploid ALL
Kindred harboring inherited TP53 mutation in the light of Li-Fraumeni syndrome
Germline variants in hypodiploid ALL
PAX5 and CDKN2A/B alterations in hypodiploid ALL
Low frequency of JAK mutations in hypodiploid ALL
Additional copy number alterations identified in hypodiploid ALL
Additional histone modifier genes mutated in next-generation sequenced hypodiploid ALL
Two potential open reading frames in the NALM-16 NF1 transcript
Verification of genetic alterations in xenografted primary hypodiploid ALL cells
Minimal Residual Disease status and PAG1 alterations are associated with poor outcome
SUPPLEMENTARY TABLES
Supplementary Table 1: Pediatric hypodiploid ALL cohort.
Supplementary Table 2: Adult ALL cohort.
Supplementary Table 3: Whole genome sequencing coverage data.
Supplementary Table 4: Whole exome sequencing coverage data.
Supplementary Table 5: Validation frequency of next-generation sequencing data.
Supplementary Table 6: Number of sequence mutations, copy number alterations and structural variants identified by next-generation sequencing.
Supplementary Table 7: Mutations identified by next-generation sequencing.
Supplementary Table 8: Structural variations identified by whole genome sequencing.
Supplementary Table 9: Genes resequenced.
Supplementary Table 10: Regions of copy number alterations and copy-neutral loss-of-heterozygosity in hypodiploid ALL.
Supplementary Table 11: Mutations identified by Sanger sequencing in the hypodiploid ALL cohort.
Supplementary Table 12: Copy number alterations and mutations.
Supplementary Table 13: Association between aneuploidy and lesions.
Supplementary Table 14: Differential expression analysis – NH versus masked NH.
Supplementary Table 15: Differential expression analysis – LH versus masked LH.
Supplementary Table 16: TP53 mutations in adult ALL.
Supplementary Table 17: Copy number alterations and mutations in hypodiploid ALL vs non-hypodiploid ALL.
Supplementary Table 18: TP53 mutations in pediatric hypodiploid ALL.
Supplementary Table 19: IKZF1, IKZF2 and RB1 deletions in adult ALL.
Supplementary Table 20: Alterations targeting histone modifiers in next-generation sequenced hypodiploid ALL.
Supplementary Table 21: Differential expression analysis – NH versus LH.
Supplementary Table 22: Gene set enrichment analysis (GSEA) – NH versus LH.
Supplementary Table 23: Ex vivo drug study of PI3K/mTOR and MEK inhibitors on hypodiploid ALL cells.
Supplementary Table 24: Sequences of shRNAs.
Supplementary Table 25: Single nucleotide variations identified by mRNA seq of NALM-16.
Supplementary Table 26: Primer sequences used for targeted gene resequencing and NF1 deletion mapping.
Supplementary Table 27: Murine lymphoid precursor cells used for gene expression profiling.
Supplementary Table 28: Antibodies used for biochemical studies.
Supplementary Table 29: Association between aneuploidy and event free survival (EFS).
Supplementary Table 30: Association between aneuploidy and cumulative incidence of any relapse.
Supplementary Table 31: Association between aneuploidy and minimal residual disease.
Supplementary Table 32: Association between copy number alterations/mutations and event free survival (EFS).
Supplementary Table 33: Association between copy number alterations/mutations and cumulative incidence (CIN) of any relapse.
Supplementary Table 34: Multivariable analysis of copy number alterations/mutations, clinical features and association with cumulative incidence of any relapse.
SUPPLEMENTARY FIGURES
Supplementary Figure 1: Coverage plots for next-generation sequenced hypodiploid ALL cases.
Supplementary Figure 2: Circos plots of whole genome sequenced hypodiploid ALL.
Supplementary Figure 3: Mutation spectrum of next-generation sequenced hypodiploid ALL.
Supplementary Figure 4: Protein domain and alteration plots for targets of sequence mutations in hypodiploid ALL.
Supplementary Figure 5: Mapping of NF1 deletions.
Supplementary Figure 6: Immunoblot analysis of NF1.
Supplementary Figure 7: Validation of mutations in NRAS and PTPN11 in non-tumor samples in near haploid ALL.
Supplementary Figure 8: PAG1 deletions correlate with PAG1 expression levels.
Supplementary Figure 9: Mutant p53 fails to stimulate p21 in hypodiploid ALL.
Supplementary Figure 10: IKZF1 and IKZF2 deletions in adult ALL.
Supplementary Figure 11: Expression of Ikzf1, Ikzf2 and Ikzf3 during murine lymphoid development.
Supplementary Figure 12: CD19 levels and degree of antigen receptor rearrangements in hypodiploid ALL.
Supplementary Figure 13: RB1 alterations in pediatric hypodiploid ALL and adult ALL.
Supplementary Figure 14: Tumor suppressor gene pathway alterations in hypodiploid ALL.
Supplementary Figure 15: Deletions and sequence mutations in genes encoding histones and histone modifiers.
Supplementary Figure 16: GEP restricted to probes on chromosomes showing identical patterns of aneuploidy.
Supplementary Figure 17: Flow cytometric analysis of signaling pathways in hypodiploid ALL.
Supplementary Figure 18: Ikzf2 and Ikzf3 knockdown efficiency assessed by immunoblot analysis.
Supplementary Figure 19: Flow cytometric analysis of signaling pathways in hematopoietic cell lines.
Supplementary Figure 20: The importance of optimal normalization of SNP microarray data.
Supplementary Figure 21: Immunohistochemistry and FACS analyses of tissue from mice xenografted with human primary hypodiploid ALL cells.
Supplementary Figure 22: Copy number analysis of primary hypodiploid ALL samples versus xenografted leukemic samples.
SUPPLEMENTARY REFERENCES
SUPPLEMENTARY NOTE
Next generation sequencing of hypodiploid ALL
We identified 988 putative somatic single nucleotide variations (SNVs) and insertion/deletion
mutations (Indels) and 164 structural variations (SVs) in the whole genome- and exome-sequenced
cases. We selected 822 SNVs and Indels (excluding some synonymous and UTR
mutations) and 137 SVs (excluding some low quality SVs from exome sequencing) for
experimental validation. We successfully determined the status of 779 SNVs and Indels and 112
SVs, of which 646 SNVs and Indels and 96 SVs were confirmed as somatic, giving validation rates
of 83% and 86%, respectively (Supplementary Tables 5-8 and Supplementary Figs. 2-3). In addition,
germline variant analysis initially identified 289 SNVs and 544 Indels, which
after filtering left 58 germline variants predicted to be deleterious (Supplementary Note and
Supplementary Table 7).
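The validation rates quoted above can be reproduced from the reported counts. The following is a minimal arithmetic sketch, not part of the study's analysis pipeline; the function name `validation_rate` is hypothetical, and the choice of denominator (variants with a successfully determined status, rather than all variants selected for validation) is an assumption that happens to reproduce the reported 83% and 86%.

```python
def validation_rate(confirmed_somatic: int, status_determined: int) -> float:
    """Percentage of variants with a determined status that were confirmed somatic.

    Assumption: the denominator is the number of variants whose status was
    successfully determined, not the number selected for validation.
    """
    if status_determined <= 0:
        raise ValueError("status_determined must be positive")
    return 100.0 * confirmed_somatic / status_determined


# Counts as reported in the text.
snv_indel_rate = validation_rate(confirmed_somatic=646, status_determined=779)
sv_rate = validation_rate(confirmed_somatic=96, status_determined=112)

print(f"SNVs and Indels: {snv_indel_rate:.1f}%")
print(f"SVs: {sv_rate:.1f}%")
```

Rounded to the nearest whole percent, these values match the rates stated in the text. Using the number of variants selected for validation (822 and 137) as the denominator would give lower figures, which is why the status-determined counts are assumed here.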
Kindred harboring inherited TP53 mutation in the light of Li-Fraumeni syndrome
A 9-year-old boy diagnosed with relapsed low hypodiploid ALL was treated at St. Jude. At remission,
the patient underwent NK cell therapy and conditioning with clofarabine 40 mg/m², etoposide 100
mg/m² and cyclophosphamide 400 mg/m² in preparation for a matched unrelated donor transplant. However,
prior to transplant the patient developed massive capillary leak and multi-organ failure, and died.
The family history was notable for his biological father dying from glioblastoma multiforme
at the age of 31 and the paternal grandfather dying from a malignancy of unknown type. Given
the family history, TP53 testing was performed. Testing of both the patient and his father
identified a frame-shift g.13886delG mutation that was heterozygous in a skin biopsy
from the patient and homozygous in the tumor samples from both the boy and his father
(p.Gly302fs; Fig. 4c-e). This mutation has been previously reported (http://www-p53.iarc.fr). The
father’s tumor also harbored a heterozygous deleterious mutation of IDH1. In addition,
immunohistochemistry of the father’s tumor was consistent with
p53 and IDH mutations (Fig. 4e).
Li-Fraumeni syndrome (LFS) has at least three different definitions. The classic
definition requires a proband with a sarcoma before the age of 45 years, a first-degree relative with
any cancer before 45 years of age, and a first- or second-degree relative with any cancer before
age 45 years or a sarcoma at any age.1 In 2009, the Chompret criteria for LFS in the context of
a positive gene test for a TP53 mutation were reported.2-4 Other definitions of LFS were put forth
by Birch and Eeles.5,6 Birch included any childhood cancer coupled with other criteria, whereas
Eeles’ description adjusted for age and degree of relatedness with LFS-related
malignancies. A germline mutation in TP53 confirms the diagnosis of LFS or Li-Fraumeni-like (LFL) syndrome.
Although patients with leukemia and LFS/LFL syndrome have been reported, there is less
agreement that this type of cancer should be included in the definition of LFS/LFL syndrome.7-12
Here we present two generations with a deleterious TP53 mutation leading to a
hematologic malignancy in a child and a solid tumor in his father. This case is notable because it
supports leukemia, in particular low hypodiploid ALL, as a diagnostic cancer for LFS. Further,
this patient responded with significant toxicity to clofarabine, which may warrant further study of,
and caution with, this agent in patients with p53 mutations.
Current screening guidelines for individuals with TP53 mutations are controversial given
the cost of screening and uncertainty over whether preventative steps can be taken. However, Villani et al. have
generated biochemical and imaging guidelines, which have shown some evidence of improving
outcome.13
Germline variants in hypodiploid ALL
In addition to the mutations in TP53, NRAS and PTPN11, 49 deleterious mutations were
identified in matched remission DNA by next-generation sequencing, and are thus likely inherited
(Supplementary Table 7). Among these, a frame-shift mutation was identified in SH2B3,
encoding LNK, a negative regulator of cytokine signaling (Ref. 14) that is also mutated in non-hypodiploid
high-risk ALL.15 One case harbored a frame-shift mutation in XRCC1, which encodes a protein
involved in DNA single-strand break repair.16 Another case harbored a nonsense mutation in
TP53INP1, a p53 target gene that is expressed upon high levels of reactive oxygen species and
that encodes a protein with antioxidant functions (Ref. 17). Splice region mutations were identified in the
cancer-associated genes FANCA (Ref. 18), MLL3 (Refs. 19,20) and ROS1 (Ref. 21). The majority of these mutations
were heterozygous in remission cells and homozygous in the tumors due to aneuploidy of the
respective chromosomes (Supplementary Table 7).
PAX5 and CDKN2A/B alterations in hypodiploid ALL
Some of the recurrent alterations identified in hypodiploid ALL have been reported previously in
ALL/high-risk ALL (Refs. 22,23). One of these was focal deletion of CDKN2A/CDKN2B at 9p21 (encoding
INK4/ARF), which was more common in near diploid ALL (77.3%) than in near haploid (22.1%) and
low hypodiploid (23.5%) ALL cases (Table 1 and Supplementary
Table 12). Deletions of these genes are common in ALL, occurring in about one third of B-ALL
(Supplementary Table 17) and 72% of T-ALL.22,23 The B-lymphoid transcription factor gene
PAX5 (9p13) was altered in 59.1% of near diploid cases (by focal deletion (18.2%), amplification
(4.5%), broad deletions terminating in the gene (13.6%) or sequence mutation (27.3%)) but was less
frequently altered in near haploid (7.4%) and low hypodiploid (5.9%) ALL (Table 1 and Supplementary
Table 12). PAX5 is frequently altered in non-hypodiploid ALL (more than 30% of B-ALL; Refs. 22,23). In
addition to the focal deletions and sequence mutations, mono-allelic loss of the PAX5 and
CDKN2A/CDKN2B genes due to deletion of 9p was observed in 31.8% and 9.1% of near diploid
cases, respectively, and due to whole chromosome 9 loss or copy-neutral LOH of chromosome 9 in
cases with fewer than 44 chromosomes (PAX5 78.9%, and CDKN2A/CDKN2B 62.5%, of the 104
near haploid and low hypodiploid cases).
Low frequency of JAK mutations in hypodiploid ALL
The Janus kinase genes JAK1 and JAK2 are mutated in ALL (Refs. 23,24), and alterations were identified
in 2 near diploid ALL cases, but not in near haploid or low hypodiploid ALL (JAK1 p.Val658Phe
and p.Lys847Glu in SJHYPO101-D and SJHYPO104-D, respectively; Supplementary Table 12
and Supplementary Fig. 4). SJHYPO101 also harbored a deletion in the pseudoautosomal
region (PAR1) on the sex chromosomes X and Y that results in P2RY8-CRLF2 fusion and
overexpression of CRLF2 (cytokine receptor-like factor 2, or thymic stromal lymphopoietin
receptor).25,26 JAK1 p.Val658Phe is a homologue of the transforming JAK2 p.Val617Phe
substitution, and is normally found together with CRLF2 rearrangements.26 JAK and CRLF2
alterations have been shown to occur together in 7% of B-progenitor ALL cases, and are
associated with Down syndrome ALL.26 Two additional near diploid and two near haploid cases
harbored a deletion in this region. All three near diploid cases and one of the near haploid cases had
evidence of a P2RY8-CRLF2 fusion based on PCR of cDNA from these cases (Supplementary
Table 12 and data not shown).
Additional copy number alterations identified in hypodiploid ALL
Additional targets of DNA copy number alteration identified in at least 2 cases by SNP 6.0
microarray analysis of the entire pediatric hypodiploid ALL cohort included ANKRD11, ARID1B,
ARPP21, C20orf194, CUL5, DMD, EPHA7, FAM53B, GAB2, PDS5B/APRIN, RASA2 and
SMAD2 (Supplementary Tables 10 and 12).
ANKRD11 (ankyrin repeat domain 11) was partially deleted (deletion of exon 2, or a deletion
confined to intron 2) in 2 low hypodiploid cases and one near haploid case. This gene is a member of an
ankyrin repeat-containing co-factor family that interacts with p160 nuclear receptor
co-activators, and inhibits ligand-dependent transcriptional activation.27 ANKRD11 is also a p53
co-activator (Ref. 28). No focal copy number alterations involving this gene were identified in a previous
study of childhood ALL.29
A deletion directly upstream (in two near haploid cases) and a deletion of the first 3
exons (in one near diploid case) of ARID1B (AT rich interactive domain 1B) were identified in
the hypodiploid ALL cohort. ARID1B encodes a protein involved in transcriptional activation and
repression of a number of genes by chromatin remodeling. ARID1B and the closely related
protein p270/ARID1A are non-catalytic subunits of the mammalian SWI/SNF complex, in which
p270/ARID1A is required for differentiation-associated cell cycle arrest and is a tumor
suppressor, while ARID1B has been shown to be dispensable for this function.30,31
Intragenic deletions of ARPP21 (cAMP-regulated phosphoprotein, 21kDa) were
identified in 3 near diploid cases and one near haploid case, and alterations of this gene have
been identified previously (Supplementary Table 17 and Refs. 29,32). ARPP21 is a
cAMP-regulated phosphoprotein, also known as Regulator of Calmodulin Signaling (RCS), that has a
central role in integration of signals in medium spiny neurons (Ref. 33).
A deletion of intron 1 of C20orf194 was identified in two near haploid cases. The specific
function of the protein encoded by this gene is unknown.
CUL5, also known as Cullin-5, encodes an E3 ubiquitin ligase that interacts with
members of the Hsp90 chaperone complex, and polyubiquitinates the Hsp90 client ErbB2.34
This gene was focally deleted in two near haploid cases, but was not found to be targeted in a
previous large-scale genome-wide study of childhood ALL (Ref. 29; Supplementary Table 17).
FAM53B (Family with sequence similarity 53, member B) is the human homologue of the
Medaka and Zebrafish gene simplet (smp). In fish, simplet has been shown to stimulate cell
proliferation and tissue regeneration (Refs. 35,36), while no function has been described in mammals.
Focal deletion of FAM53B was identified in two near haploid cases.
DMD (Dystrophin) is one of the largest genes found in the human genome, and
mutations in DMD are responsible for Duchenne (DMD) and Becker (BMD) muscular
dystrophies. This gene was altered in 2 low hypodiploid cases (one intragenic amplification and
one deletion of the first 11 exons) and 2 near diploid cases (one case with deletion of the 5’ end and the 3’
end, and one case with LOH and amplification of the entire gene). It was also deleted in 4.3% of
cases in a study of B-cell progenitor ALL.29 The expression levels of DMD are predictive of
overall survival in B-cell chronic lymphocytic leukemia (e.g. Ref. 37), but the encoded protein
has not been implicated in tumorigenesis. No sequence mutations of DMD were identified in a
large-scale resequencing project of high-risk, non-hypodiploid B-ALL.23
Two near haploid cases harbored focal deletions of the EPHA7 (EPH receptor A7) gene,
which belongs to the Ephrin receptor subfamily of the protein-tyrosine kinase family. EPHA7 is a
direct target of different MLL fusion gene products in acute leukemia, such as MLL-AF4 and
MLL-AF9. The resulting up-regulated EPHA7 levels are accompanied by an increase in
phosphorylated ERK.38 The deletions of this gene identified in the hypodiploid ALL cohort,
however, are homozygous and lead to a complete loss of a functional gene product.
Deletions targeting intron 1 of GAB2 (GRB2-associated binding protein 2) were found in
two near haploid cases and one low hypodiploid case. GAB2 is a member of the
GRB2-associated binding protein gene family, is an activator of phosphatidylinositol-3 kinase (Ref. 39), and
has been shown to be a mediator for the pathogenic effects of Ptpn11 mutations in mice.40
PDS5B/APRIN (PDS5, regulator of cohesion maintenance, homolog B) was deleted in
two near haploid cases. The protein encoded by PDS5B/APRIN is cohesion-associated and is
involved in accurate chromosome segregation during mitosis. Deletion and down-regulation of
this gene have been reported in a variety of cancers and cancer cell lines.41-46
RASA2 (RAS p21 protein activator 2), also known as GAP1m, is a member of the GAP1
family of GTPase-activating proteins.47,48 RASA2 has a perinuclear localization and binds
inositol 1,3,4,5-tetrakisphosphate (IP4), a compound suggested to function as a second
messenger.49 Focal deletion of this gene was identified in two near haploid cases, both of which
harbored an alteration in the Ras signaling pathway (NF1 deletion and NRAS mutation,
respectively; Supplementary Table 12).
Two low hypodiploid ALL cases harbored an amplification of the first exon of SMAD2.
The protein encoded by this gene is a mediator of TGF-β signaling and a transcriptional
modulator, ultimately controlling apoptosis, cell proliferation and differentiation. SMAD2 (MAD
homolog 2) has previously been shown to be inactivated in cancer by missense, nonsense and
frame-shift mutations, focal deletions and by loss of the entire chromosomal region.50-52
Additional histone modifier genes mutated in next-generation sequenced hypodiploid ALL
Details for all the mutations below are presented in Supplementary Tables 7-8 and 20.
Histone writers
EHMT2 (euchromatic histone-lysine N-methyltransferase 2) encodes a protein
methyltransferase that mediates silencing of specific genes during endotoxin shock via
dimethylation of H3K9.53,54 One near haploid case harbored a somatic missense alteration
(p.Glu883Gln) that is predicted to be deleterious.
PRDM1 encodes a repressor of IFNB1 (Interferon-β) expression via direct binding to the
IFNB1 promoter55 and by assembling silent chromatin over the IFNB1 promoter when in
complex with EHMT2.56 A somatic 3’-UTR mutation was identified in one near haploid case.
The protein encoded by MLL2 (myeloid/lymphoid or mixed-lineage leukemia 2) is a
histone methyltransferase that methylates H3K4.57,58 One near haploid case harbored a somatic
missense alteration (p.Val4642Ile) in the MLL2 gene.
One near haploid case harbored a WHSC1 p.Glu1099Lys substitution that is predicted
to be deleterious. WHSC1, also known as MMSET, encodes a histone methyltransferase that
methylates H3K36, and altered WHSC1 expression affects cell growth, adhesion and
access to chromatin.59,60
UBR4 (Ubiquitin protein ligase E3 component n-recognin 4) encodes an E3 ubiquitin
ligase, and its family member UBR2 has been shown to be an H2A ubiquitin ligase.61,62 One low
hypodiploid case harbored a somatic mutation in the exon 19 splice region and one low
hypodiploid tumor/normal pair had a p.Arg1349His substitution.
The gene encoding the histone methyltransferase Nuclear receptor binding SET
domain protein 1 (NSD1; Ref. 60) harbored a 3’-UTR mutation in one near haploid case.
Histone erasers
USP7 (ubiquitin specific peptidase 7) encodes a histone deubiquitylating enzyme, which has a
wide variety of targets that includes Histone H2B.63 Overexpression of USP7 has been linked to
prostate, bladder, colon, liver and lung cancer.64,65 One low hypodiploid case harbored a USP7
missense substitution (p.Ala381Thr) that is predicted to be deleterious.
Lysine (K)-specific demethylase 1A (encoded by KDM1A) is a nuclear protein that is a
component of several histone deacetylase complexes, but usually silences genes by functioning
as a histone demethylase of H3K4.66,67 Inhibition of KDM1A activity has been proposed as a
therapeutic strategy in cancer.68 A somatic nonsense mutation was identified at codon 417 of
this gene in one near haploid case.
USP22 is a gene encoding a member of a TFTC/STAGA histone acetyltransferase
complex that mediates a histone H2A and H2B deubiquitinase activity.69 There is evidence for
USP22 being an oncogene, with down-regulation of this gene being associated with a reduction
of cyclin D2 levels, while high expression in colorectal carcinoma was associated with higher
levels of, among others, c-Myc and pAKT in primary tumor tissue.70 A focal deletion in this gene
was identified in one low hypodiploid ALL case.
HDAC2 encodes one of five proteins known to deacetylate H3K56 (the other four are
SIRT1, SIRT2, SIRT3 and HDAC1).71 Histone deacetylase inhibitors are in advanced
clinical development as cancer therapeutic agents.72-74 One near haploid case harbored a
somatic HDAC2 missense substitution (p.Ser118Pro) that is predicted to be deleterious.
Histone readers
BRDT (Bromodomain, testis-specific) has the ability to recognize acetylated lysines.60 A BRDT
missense substitution with a predicted deleterious effect (p.Arg532Gln) was identified in one low
hypodiploid case.
SFMBT2 belongs to the Scm family of Polycomb transcriptional repressor genes. The
encoded protein has four MBT (Malignant Brain Tumor) domains that are known to have tumor
suppressor activity.75,76 A focal deletion of this gene was identified in
one near haploid case.
Histone binders
ASXL3 belongs to the ASXL family, which encodes orthologs of the Drosophila Additional sex
combs (Asx) gene, an enhancer of both the polycomb and trithorax group genes.77 A
predicted deleterious p.Thr1243Ala substitution was identified in a low hypodiploid case.
Binders of histone writers
NFYC encodes one subunit of the trimeric complex NF-Y, which is a highly conserved
transcription factor that binds with high specificity and affinity to CCAAT motifs in various
promoter regions. NF-Y recruits ASH2L, a subunit of MLL, which is a complex that methylates
Lysine 4 on histone 3 (H3K4).78 One near haploid case harbored a p.Pro240Lys substitution
with a predicted deleterious effect.
Binders of histone erasers
PHF12 (PHD zinc finger transcription factor) encodes a protein that with HDAC1 forms a protein
complex involved in transcription regulation. The ability of PHF12 to interact with chromatin is
necessary for the complex to bind upstream of the promoter of the regulated genes. Inactivation
of this protein complex promotes the progression of RNAP II within transcribed regions and thus
increased transcription.79 One low hypodiploid case harbored a p.Glu986Ala substitution.
The chromodomain helicase DNA binding (CHD) genes CHD3 and CHD4 encode
proteins that are part of the Mi2–nucleosome remodeling and deacetylase (Mi2-NuRD) complex.
The Mi2-NuRD complex couples histone deacetylation and chromatin remodeling, mediating
repressive functions that affect transcriptional regulation, replication, DNA repair and
determination of cell fate.80-82 One CHD3 splice region mutation and one CHD4 missense
alteration (p.Asn1131Ile) that is predicted to be deleterious were identified in near haploid
cases.
Histone DNA modifiers
Cell division control protein 6, encoded by CDC6, is essential for the initiation of DNA replication
by being responsible for the loading of mini-chromosome maintenance (MCM) proteins onto
replication origins.83 A near haploid case harbored a predicted deleterious substitution in CDC6
(p.Glu402Gln).
TET1 and TET3 belong to the Ten-eleven translocation gene family. The encoded
proteins are members of a DNA hydroxylase family that possess enzymatic activity toward 5-
methylcytosine (5mC). They can convert 5mC into 5-hydroxymethylcytosine, which may
influence maintenance of DNA methylation.83,84 A TET1 3’-UTR mutation was identified in one
near haploid case and a TET3 missense substitution (p.Gly795Asp) that is predicted to be
deleterious in a low hypodiploid case.
The protein encoded by MBD5 contains a methyl-binding domain that is required for
localization to chromatin, and may contribute to the formation or function of heterochromatin.85
An MBD5 p.Ser1097Ile substitution with a predicted deleterious effect was identified in a low
hypodiploid case.
Chromatin remodeling
ARID1A belongs to the SWI/SNF family, the members of which have ATPase and helicase
activities. The ARID1A protein is part of the large ATP-dependent chromatin remodeling
complex SNF/SWI that is required for activation of transcription of genes normally repressed by
chromatin.86,87 One near haploid case harbored a p.Pro1384Ser substitution in the ARID1A
gene. This mutation was predicted to be deleterious. As stated above, deletions directly
upstream of the gene family member ARID1B were identified in two near haploid cases, and one
near diploid case harbored a deletion in that gene (Supplementary Tables 10 and 12).
Histone genes
The HIST1H2BK gene is located in a histone cluster on chromosome 6p21.33, and encodes a
member of the histone H2B family. A Gly14Ser substitution with a predicted deleterious effect
was identified in one near haploid case. In addition, as mentioned in the main text, recurrent
deletions in a histone cluster at 6p22 were identified in 19.1% of near haploid cases (Table 1,
and Supplementary Tables 12 and 20).
Two potential open reading frames in the NALM-16 NF1 transcript
To define the consequences of the exon 15-35 deletion of NF1, we performed transcriptome
sequencing (RNA-seq) of the near haploid NALM-16 ALL cell line88, which harbors the same
intragenic NF1 deletion that results in splicing of exons 14 to 36 (Supplementary Fig. 5). This
identified a 9.18 kb long NF1 transcript with two potential open reading frames (ORFs). Mutant
ORF1 is translated from the canonical translational start site and encodes a truncated protein
with a premature stop in exon 36, downstream of the deletion. Mutant ORF2 encodes a C-
terminal fragment of Neurofibromin (amino acids 1792-2818, NP_000258.1), translated from a
start codon located in exon 37 downstream of the deletion. Immunoblot analysis on NALM-16
and THP-1 extracts using antibodies specific for the NF1 N-terminus (sc-68) and NF1 C-
terminus (sc-67) detected the full length NF1 protein in THP-1 (an acute monocytic leukemia
cell line lacking an NF1 alteration and with high NF1 mRNA levels as assessed by gene
expression profiling; Ref 89 and data not shown) but not in NALM-16. Further, these antibodies
failed to detect the putative ORF1 and ORF2 products in the NALM-16 extract, indicating that no NF1
protein is produced in cells harboring the exon 15-35 deletion (Supplementary Fig. 6).
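The notion of a "potential open reading frame" used above can be illustrated with a minimal forward-strand ORF scan. This is only a sketch on a toy sequence: the function name find_orfs and the sequence are hypothetical, and it is not the procedure used to annotate the NALM-16 NF1 transcript.

```python
# Minimal forward-strand ORF scan (illustrative sketch, not the authors' method).
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq):
    """Return sorted (start, end) pairs of ATG...stop ORFs, 0-based, end exclusive."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG of the ORF currently open
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                orfs.append((start, pos + 3))
                start = None
    return sorted(orfs)

# Toy transcript (hypothetical) containing two short ORFs in different frames.
toy = "CCATGAAATAGGATGCCCTAA"
orfs = find_orfs(toy)
```

On the toy sequence this reports one ORF per start codon, each ending at the first in-frame stop; a real analysis would also need to consider minimum ORF length and translation initiation context.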
Verification of genetic alterations in xenografted primary hypodiploid ALL cells
The panel of hypodiploid ALL xenografts established showed a remarkable consistency in the
tempo of engraftment between transplant replicates (2-3 mice transplanted per primary tumor).
To confirm the presence of human leukemia in the xenografted mice, an immunohistochemical
analysis was performed on a subset of xenografted tumors (4 tumors, all from different primary
cases), identifying leukemic cells positive for human CD45 in tissues including spleen,
meninges and sternal marrow (Supplementary Fig. 21). Further, DNA extracted from bone
marrow of xenografted mice was analyzed for copy number alterations by Affymetrix SNP 6.0
microarrays (7 tumors from 3 different primary cases). In six xenografts derived from two
primary tumors, the patterns of aneuploidy and focal DNA copy number alterations were
identical to those identified in the primary tumors (Supplementary Fig. 22a-b). A xenograft
derived from a third primary tumor acquired three regions of amplification, two of which were
focal (at chromosomes 2p24.2 and 9p13.3), and one of which was approximately 10Mb in size
(19p13.3-p13.12) (Supplementary Fig. 22c). Each of these three regions was located on
aneuploid chromosomes, and the amplifications resulted in the acquisition of copy-neutral LOH
in the regions of copy number gain (data not shown). This tumor also acquired a focal deletion
of AUTS2 at 7q11.22, leading to complete loss of this gene as the other chromosome was
already lost in the primary tumor. Importantly, the focal deletion of NF1 exons 15-35 present in
the primary tumor was also present in the xenograft (Supplementary Fig. 22c).
Minimal Residual Disease status and PAG1 alterations are associated with poor outcome
Associations between karyotype, genetic lesions, clinical features and outcome were analyzed.
Both near haploid and low hypodiploid ALL are associated with poor outcome, with no
significant difference between these subgroups (Supplementary Table 29). In accordance with
prior studies9, near haploid and low hypodiploid ALL had a higher incidence of relapse
compared to near diploid ALL (Supplementary Table 30). As expected, near haploid and low
hypodiploid ALL exhibited a high frequency of positive minimal residual disease (MRD ≥0.01%;
40% and 42.9%, respectively) at the end of induction (day 29) compared to near diploid ALL
(6.7%; Supplementary Table 31).
In univariate analysis, the degree of aneuploidy, MRD status and IKZF2 and PAG1
alterations were associated with poor outcome, and PAX5 alterations with favorable outcome
(Supplementary Tables 30-33). MRD status and PAG1 alterations were associated with poor
outcome also in multivariable analyses (Supplementary Table 34).
SUPPLEMENTARY TABLES
Supplementary Table 1: Pediatric hypodiploid ALL cohort.
See Excel Table: “Table_S1_Pediatric_hypodiploid_ALL_cohort.xlsx”
D, diagnosis; G, germline (remission material); R, relapse; WGS, whole genome sequencing; WES, whole exome sequencing; GEP, gene expression profiling; COG, Children’s Oncology Group.
Supplementary Table 2: Adult ALL cohort.
See Excel Table: “Table_S2_Adult_ALL_cohort.xlsx”
D, diagnosis; R, relapse; PH, BCR-ABL1 positive; H50, >50 chromosomes; H47, >47 chromosomes; IMVS, Institute of Medical and Veterinary Science, Adelaide, Australia; PAH, Princess Alexandra Hospital, Woolloongabba, Australia; RMH, Royal Melbourne Hospital, Parkville, Australia; UHN, University Health Network, Toronto, Canada; CALGB, The Cancer and Leukemia Group B.
Supplementary Table 3: Whole genome sequencing coverage data.
Average haploid coverage: 44.9 fold (±1.6) for tumor samples and 34.4 fold (±1.3) for normal samples. D, diagnosis; G, germline (remission material)
Columns: Patient; G / D; Nucleotides Sequenced; % Reads Mapped; Genome Coverage; Haploid Coverage; Exon Coverage; % Genomic bases covered; % Exonic bases covered; % Coding bases covered; % SNP Detection
SJHYPO046 G 93,509,941,800 95.23% 27.3 27.86 24.6 97 89 88 99.15
SJHYPO056 G 98,926,517,800 94.66% 28.3 28.42 27 98 91 91 99.36
SJHYPO021 G 106,058,755,000 92.57% 28.8 29.84 25.6 98 94 94 99.37
SJHYPO013 G 103,903,194,200 91.15% 29 29.37 26.3 98 92 90 99.33
SJHYPO055 G 108,917,276,800 95.36% 29.4 29.94 30 98 97 97 99.54
SJHYPO044 G 107,498,752,200 95.04% 29.7 30.5 28.1 98 92 92 99.06
SJHYPO052 G 110,026,458,600 95.91% 29.9 30.34 28.8 98 93 93 99.43
SJHYPO052 D 150,981,391,000 95.89% 30.6 44.02 28.3 97 89 87 99.38
SJHYPO042 G 114,125,730,000 94.82% 30.9 31.47 30 99 97 97 99.52
SJHYPO022 G 118,293,797,200 92.15% 31.3 32.86 27.4 98 94 94 99.01
SJHYPO026 G 111,262,869,400 94.67% 31.7 31.73 31.3 99 99 99 99.54
SJHYPO051 G 109,674,698,800 95.41% 31.9 32.53 29.2 98 90 88 99.12
SJHYPO029 G 120,422,313,200 93.55% 32.2 32.68 33 99 98 98 99.54
SJHYPO040 G 112,724,007,600 93.28% 32.6 33.07 31.3 99 98 98 99.35
SJHYPO029 D 126,566,434,400 92.46% 33.5 33.75 34.3 99 98 99 99.15
SJHYPO020 G 124,887,001,600 94.70% 34.1 35.45 31.7 99 96 96 99.47
SJHYPO056 D 125,840,368,000 95.03% 34.3 34.59 33.7 98 97 97 99.44
SJHYPO123 G 130,329,067,406 95.24% 35.9 36.61 31.9 98 88 86 98.71
SJHYPO051 D 130,004,435,400 95.54% 36 36.78 32.1 98 89 87 98.54
SJHYPO119 D 129,244,870,800 95.04% 36.3 36.83 32.7 98 90 88 98.66
SJHYPO046 D 132,861,794,200 94.98% 37.2 38.4 33.2 97 90 89 99.31
SJHYPO004 G 129,531,260,200 95.57% 37.8 39.14 34.6 99 96 96 99.55
SJHYPO121 G 135,246,155,200 93.67% 39.3 40.69 36.3 99 97 97 99.56
SJHYPO013 D 150,742,264,200 94.95% 39.8 40.5 37.6 99 97 97 99.56
SJHYPO044 D 144,971,006,200 94.68% 41.3 42.3 40.6 99 98 99 99.09
SJHYPO120 G 160,109,976,800 95.83% 41.3 41.83 37 98 91 89 99.25
SJHYPO123 D 152,626,710,348 93.73% 41.5 42.73 36.7 98 90 88 98.98
SJHYPO002 D 142,284,502,000 95.58% 41.6 42.93 36.3 99 94 93 99.46
SJHYPO002 G 144,479,906,000 95.39% 42.5 43.9 37.4 99 94 93 99.38
SJHYPO120 D 151,205,890,800 95.77% 42.6 43.14 37 98 88 86 98.69
SJHYPO006 G 148,819,701,800 95.76% 43.7 44.89 39.3 99 95 95 99.48
SJHYPO004 D 170,082,231,800 94.49% 43.9 44.52 43.6 99 98 99 99.57
SJHYPO006 D 149,759,844,600 95.87% 44 45.72 39.4 98 94 94 99.31
SJHYPO119 G 164,845,521,200 96.44% 44.3 45.34 38.5 98 90 89 99.26
SJHYPO055 D 185,146,504,000 95.09% 47.3 48.53 45.2 98 98 98 99.61
SJHYPO021 D 194,894,104,400 91.72% 47.7 50 43.3 99 97 97 97.43
SJHYPO022 D 193,078,257,200 91.66% 47.9 51.31 41.7 98 96 96 98.96
SJHYPO042 D 193,520,383,400 93.47% 49.9 51.85 47.9 99 99 99 99.45
SJHYPO040 D 203,845,340,000 91.56% 51.6 53.62 47.5 98 97 97 99.4
SJHYPO020 D 216,676,659,400 94.41% 52.6 58.32 48.2 99 98 98 99.43
SJHYPO026 D 216,234,398,400 94.70% 56 57.23 56.9 99 99 99 99.01
SJHYPO121 D 215,636,739,800 94.61% 60.7 63.44 55.8 99 98 98 99.49
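Because the rows above are whitespace-delimited, they can be parsed into records for summaries such as a per-group mean of haploid coverage. The following is a minimal sketch using three rows copied from the table; the field names and the parse_row helper are hypothetical, and the mean computed here covers only these three rows, not the full cohort.

```python
from statistics import mean

# Three rows copied verbatim from Supplementary Table 3.
ROWS = """\
SJHYPO046 G 93,509,941,800 95.23% 27.3 27.86 24.6 97 89 88 99.15
SJHYPO052 D 150,981,391,000 95.89% 30.6 44.02 28.3 97 89 87 99.38
SJHYPO029 D 126,566,434,400 92.46% 33.5 33.75 34.3 99 98 99 99.15
"""

def parse_row(line):
    """Split one table row into a dict (field names are illustrative)."""
    f = line.split()
    return {
        "patient": f[0],
        "sample_type": f[1],                       # G = germline, D = diagnosis
        "nucleotides": int(f[2].replace(",", "")),
        "pct_reads_mapped": float(f[3].rstrip("%")),
        "genome_coverage": float(f[4]),
        "haploid_coverage": float(f[5]),
        "exon_coverage": float(f[6]),
    }

records = [parse_row(line) for line in ROWS.splitlines()]
tumor_cov = [r["haploid_coverage"] for r in records if r["sample_type"] == "D"]
tumor_mean = mean(tumor_cov)  # mean over the two diagnosis rows in this subset
```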
Supplementary Table 4: Whole exome sequencing coverage data. D, diagnosis; G, germline (remission material); R, relapse
Columns: Case; G / D / R; Nucleotides Sequenced; % Reads Mapped; Duplication Rate; % Covered Coding Bases ≥ 10x; % Covered Coding Bases ≥ 20x; % Covered Coding Bases ≥ 30x
SJHYPO001 D 36,203,643,314 98.2% 0.18 98.0 96.3 94.6
SJHYPO001 G 23,676,803,494 89.8% 0.096 93.4 89.2 85.4
SJHYPO005 D 18,219,143,560 98.6% 0.15 97.0 94.3 91.9
SJHYPO005 G 19,460,765,244 98.4% 0.13 96.9 94.2 91.8
SJHYPO009 D 17,764,641,338 98.5% 0.25 96.7 93.7 90.7
SJHYPO009 G 14,040,901,830 98.0% 0.12 96.4 93.1 90.1
SJHYPO009 R 8,842,150,848 99.1% 0.23 93.3 87.1 78.0
SJHYPO012 D 14,348,121,812 96.9% 0.49 89.9 82.4 74.5
SJHYPO012 G 15,199,382,132 95.2% 0.36 92.1 87.1 81.9
SJHYPO014 D 23,025,245,932 98.2% 0.13 97.6 95.5 93.6
SJHYPO014 G 20,245,193,662 98.7% 0.15 97.6 95.4 93.1
SJHYPO016 D 14,199,789,458 96.5% 0.23 91.9 86.7 81.5
SJHYPO016 G 10,731,147,788 95.1% 0.19 94.6 90.2 85.8
SJHYPO019 D 7,772,516,004 98.6% 0.27 91.7 84.0 73.4
SJHYPO019 G 9,554,828,260 98.8% 0.2 94.2 90.2 84.7
SJHYPO024 D 14,791,275,674 99.1% 0.18 95.3 93.1 90.6
SJHYPO024 G 20,608,688,016 99.2% 0.24 95.7 94.1 92.4
SJHYPO032 D 16,899,506,042 98.0% 0.094 93.7 89.6 86.0
SJHYPO032 G 14,253,139,392 98.4% 0.15 93.7 89.3 85.1
SJHYPO036 D 25,996,903,080 99.1% 0.21 96.2 95.1 94.0
SJHYPO036 G 16,179,316,452 99.2% 0.18 95.7 94.1 92.1
SJHYPO037 D 11,345,339,292 98.1% 0.56 91.5 83.0 71.0
SJHYPO037 R 11,900,972,410 99.0% 0.5 93.2 87.5 79.1
SJHYPO039 D 11,917,553,580 90.1% 0.11 89.1 81.8 74.9
SJHYPO039 G 12,215,196,742 98.2% 0.12 92.8 87.8 82.6
SJHYPO041 D 5,202,479,296 98.7% 0.33 87.0 70.6 50.2
SJHYPO041 G 8,122,168,106 99.1% 0.23 93.9 88.8 81.1
SJHYPO045 D 10,228,524,600 86.0% 0.048 82.8 78.6 74.2
SJHYPO045 G 12,959,649,308 77.4% 0.054 82.7 78.6 74.6
SJHYPO047 D 11,252,109,626 99.0% 0.21 94.8 91.7 87.5
SJHYPO047 G 14,875,868,628 99.2% 0.27 95.2 92.9 89.9
SJHYPO052 R 9,029,261,630 98.7% 0.26 93.2 87.1 78.4
SJHYPO116 D 5,890,611,060 95.7% 0.041 90.9 80.3 65.9
SJHYPO116 G 4,953,854,700 96.9% 0.042 90.7 81.0 67.7
SJHYPO117 D 16,385,912,760 99.2% 0.48 94.6 91.2 86.3
SJHYPO117 R 9,911,419,870 98.5% 0.53 89.6 78.6 64.3
SJHYPO124 D 10,739,280,914 99.2% 0.48 92.8 85.2 73.6
SJHYPO124 G 4,325,139,968 98.9% 0.31 84.3 63.8 40.8
SJHYPO125 D 15,139,696,586 96.4% 0.05 91.9 86.5 81.4
SJHYPO125 G 14,135,196,844 95.9% 0.088 91.5 86.3 81.3
SJHYPO126 D 14,091,114,990 99.2% 0.33 95.1 92.2 87.6
SJHYPO126 G 8,951,109,244 98.4% 0.42 91.4 82.2 69.5
Supplementary Table 5: Validation frequency of next-generation sequencing data.
Total number of mutations/alterations per next generation sequencing (NGS) technique is indicated as well as validation percentages. WES, whole exome sequencing; WGS, whole genome sequencing; SNV, single nucleotide variation; Indel, insertion/deletion mutation.
Columns: NGS Technique; Mutation type; Somatic; Somatic %; Non-tumor; Non-tumor %; Wild-type; Wild-type %
WES SNV/Indel 229 72.9 33 10.5 52 16.6
WES SV 1 50 0 0 1 50
WGS SNV/Indel 417 89.7 33 7.1 15 3.2
WGS SV 95 86.4 10 9.1 5 4.5
WES & WGS All 742 83.3 76 8.5 73 8.2
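The percentages in this table follow from the counts: each percentage is that outcome's share of all variants tested for the given technique and mutation type. A minimal sketch that reproduces them from the counts above (the dictionary layout and the percentages helper are illustrative only):

```python
# Counts per (technique, mutation type): (somatic, non-tumor, wild-type),
# copied from Supplementary Table 5.
counts = {
    ("WES", "SNV/Indel"): (229, 33, 52),
    ("WES", "SV"): (1, 0, 1),
    ("WGS", "SNV/Indel"): (417, 33, 15),
    ("WGS", "SV"): (95, 10, 5),
}

def percentages(somatic, non_tumor, wild_type):
    """Share of each validation outcome among all tested variants, in percent."""
    total = somatic + non_tumor + wild_type
    return tuple(round(100 * n / total, 1) for n in (somatic, non_tumor, wild_type))

# Column-wise sums give the combined "WES & WGS, All" row.
combined = tuple(sum(col) for col in zip(*counts.values()))
combined_pct = percentages(*combined)
```

The combined row sums to 742 somatic, 76 non-tumor and 73 wild-type calls, matching the last line of the table.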
Supplementary Table 6: Number of sequence mutations, copy number alterations and structural variants identified by next-generation sequencing.
Tier1: Coding synonymous, nonsynonymous, splice site, and non-coding RNA variants; Tier2: Conserved variants; Tier3: Variants in non-repeat masked regions; Tier4: Remaining SNVs. SNV, single nucleotide variation; Indel, insertion/deletion mutation; CDS, Coding DNA sequence; HQ, high quality; SV, structural variation; CNA, copy number alteration; N, number; Mb, megabases; AA, amino acid; UTR, untranslated leader region; Amp, amplification; Del, deletion. * Including numbers based on whole chromosome gain and loss.
Columns: Sample; SNVs Tier1 AA Change; SNVs Tier1 UTRs; SNVs Tier1 Silent; Indels CDS; HQ SNVs Tier2; HQ SNVs Tier3; HQ SNVs Tier4; SVs; CNA (N) * Amp; CNA (Mb) * Amp; CNA (N) * Del; CNA (Mb) * Del
SJHYPO002-D 2 1 5 2 32 222 1708 2 0 0 29 2455
SJHYPO006-D 11 6 10 0 50 522 2050 5 0 0 40 2725
SJHYPO020-D 4 3 0 1 7 149 1762 3 0 0 25 2219
SJHYPO021-D 7 4 1 1 33 261 1672 4 0 0 29 2757
SJHYPO029-D 5 6 0 2 25 260 2070 3 17 555 18 0.8
SJHYPO040-D 7 13 2 1 40 336 2100 4 2 0.007 30 2725
SJHYPO042-D 6 2 1 2 29 297 2135 2 13 632 11 0.2
SJHYPO044-D 49 28 11 5 198 2011 3608 12 36 548 15 1
SJHYPO046-D 35 42 12 2 242 2557 4497 2 0 0 20 2600
SJHYPO056-D 5 5 0 4 9 150 1792 2 0 0 34 2726
SJHYPO123-D 15 16 5 1 96 927 2092 14 1 0.002 37 2505
SJHYPO004-D 5 1 1 0 27 243 2308 3 1 0.0001 21 1397
SJHYPO013-D 7 4 0 1 27 323 1675 2 1 0.1 20 1866
SJHYPO022-D 14 6 2 1 38 443 1342 7 0 0 23 1292
SJHYPO026-D 5 6 4 1 19 194 1094 4 0 0 15 1427
SJHYPO051-D 6 6 4 0 41 330 1022 9 3 0.002 30 1565
SJHYPO052-D 8 3 1 0 42 315 1554 3 1 0.01 22 1415
SJHYPO055-D 12 0 1 3 47 405 1680 6 28 1720 9 56
SJHYPO119-D 4 7 0 0 33 274 1117 5 1 0.009 20 1397
SJHYPO120-D 11 0 1 1 37 398 1256 22 15 81 32 980
Median 7 6 1 1 33 315 1708 4 1 22
Mean 10.9 8 3.1 1.4 53.6 530.9 1926.7 5.7 6 24
Range 2-49 0-42 0-12 0-5 7-242 149-2557 1022-4497 2-22 0-36 9-40
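The summary rows can be checked directly against the per-sample values. As a small sketch, the first data column (Tier1 SNVs causing an amino acid change) is copied from the 20 rows above and summarized with the Python standard library; the variable names are illustrative.

```python
from statistics import mean, median

# Tier1 amino-acid-changing SNV counts for the 20 samples, in table order.
tier1_aa_change = [2, 11, 4, 7, 5, 7, 6, 49, 35, 5,
                   15, 5, 7, 14, 5, 6, 8, 12, 4, 11]

summary = {
    "median": median(tier1_aa_change),
    "mean": round(mean(tier1_aa_change), 1),
    "range": (min(tier1_aa_change), max(tier1_aa_change)),
}
```

This gives a median of 7, a mean of 10.9 and a range of 2-49, in agreement with the Median, Mean and Range rows.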
Supplementary Table 7: Mutations identified by next-generation sequencing.
See Excel Table: “Table_S7_NGS_SNVs_Indels.xlsx”
Somatic and putative germline deleterious single nucleotide variations (SNVs) and insertion/deletion mutations (Indels) identified by whole genome- and whole exome sequencing of a subset of the hypodiploid ALL cohort. Putative germline variants are highlighted in yellow. D, diagnosis; R, relapse.
Column definition is listed below:
A. GeneName: HUGO gene symbol
B. VarType: SNV, single nucleotide variation; Indel, insertion/deletion mutation
C. Sample: Hypodiploid ALL sample ID
D. Chr: chromosome
E. Position: chromosome position in hg19 coordinates
F. Class: classification based on amino acid change pattern. ‘exon’, mutation in non-coding RNA genes; ‘splice_region’, mutation not directly affecting the canonical splice sites but located within 10bp of the canonical splice sites.
G. AAChange: predicted amino acid change for the mutation
H. ProteinGI: NCBI protein GI number
I. mRNA_acc: RefSeq accession number
J. Mut Reads Diagnosis: number of NGS reads containing mutant allele (diagnosis)
K. Total Reads Diagnosis: number of NGS reads covering the site (diagnosis)
L. Mut Reads Relapse: number of NGS reads containing mutant allele (relapse)
M. Total Reads Relapse: number of NGS reads covering the site (relapse)
N. Mut Reads Normal: number of NGS reads containing mutant allele (normal)
O. Total Reads Normal: number of NGS reads covering the site (normal)
P. Reference Allele: the allele represented in the reference human genome. Reference allele is marked as ‘–’ for an insertion.
Q. Non-reference Allele: the mutated allele
R. Flanking: 20bp [reference allele/mutant allele] 20bp
S. Status: somatic or germline mutation (germline referring to non-tumor cells)
T. SIFTResult: ‘deleterious status’ assigned by SIFT
U. pph2result: ‘deleterious status’ assigned by polyPHEN2
Supplementary Table 8: Structural variations identified by whole genome sequencing.
See Excel Table: “Table_S8_WGS_SVs.xlsx”
Somatic structural variations identified by whole genome sequencing of a subset of the hypodiploid ALL cohort. CDS, coding DNA sequence.
Column definition is listed below:
A. Sample: Hypodiploid ALL sample ID
B. ChrA: Chromosome for breakpoint A
C. PosA: Position of breakpoint A
D. OrientationA:
   + Region to the left of PosA is included in mutant genotype
   - Region to the right of PosA is included in mutant genotype
E. ChrB: Chromosome for breakpoint B
F. PosB: Position of breakpoint B
G. OrientationB:
   + Region to the right of PosB is included in mutant genotype
   - Region to the left of PosB is included in mutant genotype
H. Type: INS, insertion; DEL, deletion; INV, inversion; ITX, intrachromosomal translocation; CTX, interchromosomal translocation
I. Usage:
   GENIC: Both endpoints were in genes: checked for fusion
   HALF_INTERGENIC: One endpoint was in a gene: checked for truncation
   INTERGENIC / INTRONIC: Neither endpoint was in a gene or both were in the same intron of a gene; no gene fusion or truncation
   INVERTED_REPEAT: Both endpoints were in the same gene, but in opposite orientations: checked for truncation
J. Gene: Fusion or truncated gene that would result from structural variation
K. Chromosomes: Chromosomes involved in the rearrangement
L. Tx: Number of predicted fusion transcripts
M. Valid CDS: Number of predicted fusion transcripts with an annotated CDS start and stop
N. In-Frame CDS: Number of “Valid CDS” transcripts with a CDS length divisible by three.
O. Mod. In-Frame CDS: Number of “In-Frame CDS” transcripts that are not identical to an existing annotated transcript.
P. mutA: Number of reads supporting the structural variation at breakpoint A
Q. mutB: Number of reads supporting the structural variation at breakpoint B
R. Validation Status:
   Valid: The SV has been experimentally validated
   Putative: The SV has yet to be validated
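The "In-Frame CDS" column is defined by a simple divisibility rule on CDS length. A minimal sketch of that rule, assuming a list of predicted CDS lengths in nucleotides (the function names and the example lengths are hypothetical, not taken from the workbook):

```python
def is_in_frame(cds_length):
    """A CDS is counted as in-frame when its length is a positive multiple of three."""
    return cds_length > 0 and cds_length % 3 == 0

def count_in_frame(cds_lengths):
    """Number of 'Valid CDS' transcripts that would be counted under 'In-Frame CDS'."""
    return sum(1 for n in cds_lengths if is_in_frame(n))

# Hypothetical CDS lengths for three predicted fusion transcripts.
example_lengths = [300, 301, 999]
n_in_frame = count_in_frame(example_lengths)
```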
Supplementary Table 9: Genes resequenced.
CDS, coding DNA sequence.
Gene Genbank Accession Number Exons sequenced
CBL NM_005188.2 8-9
CRLF2 NM_022148.2 6
ETV6 NM_001987.4 CDS
FLT3 NM_004119.2 14, 20 and coding region of 24
IKZF1 NM_006060.3 CDS
IKZF2 NM_016260.2 CDS
IKZF3 NM_012481.3 CDS
JAK1 NM_002227.2 13-18
JAK2 NM_004972.3 13-24 and coding region of 25
KIF2B NM_032559.4 CDS
KRAS NM_033360.2 2-3
MAPK1 NM_138957.2 CDS
NF1 NM_000267.2 CDS
NRAS NM_002524.3 2-3
PAG1 NM_018440.3 CDS
PAX5 NM_016734.1 CDS
PTPN11 NM_002834.3 3, 4 and 13
RB1 NM_000321.2 CDS
TP53 NM_000546.4 CDS
Supplementary Table 10: Regions of copy number alterations and copy-neutral loss-of-heterozygosity in hypodiploid ALL.
See Excel Table: “Table_S10_SNP_data.xlsx”
The table lists all regions of copy number alterations (CNAs) identified by manual curation of circular binary segmentation data for the hypodiploid ALL cohort, and copy-neutral loss-of-heterozygosity (CN LOH) identified by dChip using the Hidden Markov Model algorithm. CNAs smaller than 8 SNP and/or copy number probes have been filtered. Final lesion listings exclude DNA gains and losses arising from antigen receptor gene rearrangements at 2p11.2 (IGK@), 7p14.1 (TRG@), 7q34 (TRB@), 14q11.2 (TRA@), 14q32.33 (IGH@) and 22q11.22 (IGL@). The workbook contains 5 sheets, with the near haploid, masked near haploid, low hypodiploid, masked low hypodiploid and near diploid cases in different sheets. Del, deletion; Homo, homozygous; Hemi, hemizygous; Subpop, subpopulation; R, relapse. “Duplicated genome” indicates lesions and whole chromosomal events affected by reduplication of the hypodiploid genomic complement in the masked near haploid and low hypodiploid ALL cases. Masked hypodiploid cases here refer to cases with either a pure doubled hypodiploid clone or cases harboring a doubled clone constituting at least 30%. The analyses were initially performed in human assembly hg18, on which the Affymetrix SNP 6.0 microarray is based, and then mapped to hg19 using a method described previously90. Both hg18 and hg19 coordinates are included in the table.
A. ID: Sample
B. Comment: lesion type
C. Chrom: Chromosome
D. Cytoband: Sublocation on chromosome
E. loc.start_hg18: Chromosomal start position of lesion based on Human genome build 18
F. loc.end_hg18: Chromosomal end position of lesion based on Human genome build 18
G. loc.start_hg19: Chromosomal start position of lesion based on Human genome build 19
H. loc.end_hg19: Chromosomal end position of lesion based on Human genome build 19
I. LiftOverStatus: One of the following: “complete” (all bases in the original hg18 segment are successfully remapped to hg19); “partial” (not all, but >50% of the bases are remapped, with the ratio of remapping); “suspicious” (>50% of the bases are remapped, but did not pass subsequent QA); “failed” (>50% of the bases cannot be remapped).
J. num.mark: Number of probes included in the segment
K. seg.mean: log2 ([normalized tumor signal]/[normalized normal signal])
L. seg.observedCN: Absolute copy number
M. seg.size (kb): Size of segment in kilobases
N. total # of gene in the segment: Number of genes included in segment
O. first 10 genes in segment: Lists the names of the first 10 genes in the segment
P. total # of miRNA in the segment: Number of miRNAs included in segment
Q. first 10 miRNAs in segment: Lists the names of the first 10 miRNAs in the segment
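The four LiftOverStatus categories defined above can be expressed as a small decision function. This is a sketch under stated assumptions, not the remapping method of reference 90: the function name and arguments are hypothetical, a segment that is fully remapped but fails QA is assumed to be "suspicious", and a segment with exactly 50% of bases remapped (a case the definitions leave open) is assumed to be "failed".

```python
def liftover_status(total_bases, remapped_bases, passed_qa=True):
    """Classify an hg18 -> hg19 remapping outcome using the table's categories."""
    if total_bases <= 0 or not 0 <= remapped_bases <= total_bases:
        raise ValueError("invalid base counts")
    # Assumption: exactly 50% remapped is treated as failed.
    if remapped_bases * 2 <= total_bases:
        return "failed"
    # More than 50% remapped, but the segment did not pass subsequent QA.
    if not passed_qa:
        return "suspicious"
    if remapped_bases == total_bases:
        return "complete"
    return "partial"

# Hypothetical segments of 1,000 bases each.
statuses = [
    liftover_status(1000, 1000),
    liftover_status(1000, 800),
    liftover_status(1000, 800, passed_qa=False),
    liftover_status(1000, 400),
]
```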
Supplementary Table 11: Mutations identified by Sanger sequencing in the hypodiploid ALL cohort.
See Excel Table: “Table_S11_Sanger_mutations.xlsx”
Somatic and putative germline deleterious single nucleotide variations (SNVs) and insertion/deletion mutations (Indels) identified by Sanger sequencing of the entire hypodiploid ALL cohort.
Column definition is listed below:
A. Gene Name: HUGO gene symbol
B. VarType: SNV, single nucleotide variation; Indel, insertion/deletion mutation
C. Sample: Hypodiploid ALL sample ID
D. Chr: chromosome
E. Position: chromosome position in hg19 coordinates
F. Class: classification based on amino acid change pattern. ‘exon’, mutation in non-coding RNA genes; ‘splice_region’, mutation not directly affecting the canonical splice sites but located within 10bp of the canonical splice sites.
G. AAChange: predicted amino acid change for the mutation
H. ProteinGI: NCBI protein GI number
I. mRNA_acc: RefSeq accession number
J. Mutant peak intensity (%): Mutant peak size of total peaks
K. Reference Allele: the allele represented in the reference human genome. Reference allele is marked as ‘–’ for an insertion.
L. Non-reference Allele: the mutated allele
M. Flanking: 10bp [reference allele/mutant allele] 10bp
N. Status: somatic or germline mutation (germline referring to non-tumor cells)
O. SIFTResult: ‘deleterious status’ assigned by SIFT
P. SIFTScore
Q. pph2result: ‘deleterious status’ assigned by polyPHEN2
R. pph2score
Supplementary Table 12: Copy number alterations and mutations.
See Excel Table: “Table_S12_Specific_lesion_information.xlsx”
Unless otherwise stated, genes were not sequenced. Paired or unpaired indicate whether a matched normal DNA sample was available. Masked hypodiploid cases here refer to cases with either a pure doubled hypodiploid clone or cases harboring a doubled clone constituting at least 30%. NH, near haploid (24-31 chromosomes); mNH, masked near haploid (2x24-31 chromosomes); LH, low hypodiploid (32-39 chromosomes); mLH, masked low hypodiploid (2x32-39 chromosomes); ND, near diploid (44-45 chromosomes); CNA, copy number alteration; Seq mut, sequence mutation; Del, deletion; Amp, amplification; Homo, homozygous; Het, heterozygous; e, exon; i, intron; LOH, loss-of-heterozygosity; US, upstream; DS, downstream; † Concomitant deletion of the other corresponding chromosomal copy, giving rise to a bi-allelic mutational event; § Mutation present in matched remission sample; §§ Directly upstream of gene in question; ¥ Deleted as part of rearrangement of the immunoglobulin lambda light chain locus at 22q11.22.
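The chromosome-number ranges in this legend can be written as a small lookup. The sketch below is illustrative only: the function name and the doubled flag are hypothetical, and whether a case carries a doubled clone is taken as an input here, since in the study that call rests on the genomic data (a doubled clone constituting at least 30%) rather than on the chromosome count alone.

```python
def classify_ploidy(chromosomes, doubled=False):
    """Map a modal chromosome number to the subtype labels used in the legend."""
    # For a doubled (masked) clone, classify by the undoubled complement.
    base = chromosomes / 2 if doubled else chromosomes
    if 24 <= base <= 31:
        label = "NH"
    elif 32 <= base <= 39:
        label = "LH"
    elif 44 <= base <= 45 and not doubled:
        label = "ND"
    else:
        return "unclassified"
    return "m" + label if doubled else label

# Hypothetical modal chromosome numbers.
labels = [
    classify_ploidy(27),                # near haploid
    classify_ploidy(36),                # low hypodiploid
    classify_ploidy(45),                # near diploid
    classify_ploidy(54, doubled=True),  # masked near haploid (2 x 27)
    classify_ploidy(72, doubled=True),  # masked low hypodiploid (2 x 36)
]
```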
Supplementary Table 13: Association between aneuploidy and lesions.
P values were calculated by an Exact Chi-Square test (2x5) and values <0.05 are highlighted, as well as the subgroup(s) that are associated with a high frequency of the lesion in question. Signaling indicates genes involved in RTK- and/or Ras signaling. Masked hypodiploid cases here refer to cases with either a pure doubled hypodiploid clone or cases harboring a doubled clone constituting at least 30%. CNA, copy number alteration; Seq. mut., sequence mutation; pwy, pathway; NH, near haploid; mNH, masked near haploid; LH, low hypodiploid; mLH, masked low hypodiploid; ND, near diploid.
Columns: Gene; Hypodiploid subtype; n=; Normal (%); Deleted or mutated (%); Two-sided P values; Bonferroni step down adjusted P values
CDKN2A/B, CNA NH 50 76 24 1.57x10^-6 2.51x10^-5
mNH 18 83.3 16.7
LH 26 88.5 11.5
mLH 8 37.5 62.5
ND 22 22.7 77.3
Histone cluster, 6p22, CNA NH 50 82 18 0.206 1
mNH 18 77.8 22.2
LH 26 96.2 3.8
mLH 8 100 0
ND 22 90.9 9.1
JAK1, Seq. mut. NH 50 100 0 0.097 1
mNH 18 100 0
LH 26 100 0
mLH 8 100 0
ND 22 90.9 9.1
FLT3 (Signaling), Seq. mut. NH 50 90 10 0.218 1
mNH 18 94.4 5.6
LH 26 100 0
mLH 8 100 0
ND 22 100 0
KRAS (Signaling), Seq. mut. NH 50 96 4 0.373 1
mNH 18 100 0
LH 26 100 0
mLH 8 100 0
ND 22 90.9 9.1
NF1 (Signaling), CNA and seq. mut. NH 50 56 44 2.77x10^-4 3.6x10^-3
mNH 18 55.6 44.4
LH 26 92.3 7.7
mLH 8 87.5 12.5
ND 22 95.5 4.5
NRAS (Signaling), Seq. mut. NH 50 86 14 0.181 1
mNH 18 83.3 16.7
LH 26 100 0
mLH 8 100 0
ND 22 81.8 18.2
PTPN11 (Signaling), Seq. mut. NH 50 98 2 0.242 1
mNH 18 100 0
LH 26 100 0
mLH 8 100 0
ND 22 90.9 9.1
Signaling combined NH 50 30 70 3.29x10^-8 6.25x10^-7
mNH 18 27.8 72.2
LH 26 96.2 3.8
mLH 8 87.5 12.5
ND 22 68.2 31.8
PAG1, CNA and seq. mut. NH 50 88 12 0.294 1
mNH 18 94.4 5.6
LH 26 96.2 3.8
mLH 8 100 0
ND 22 100 0
RB1, CNA and seq. mut. NH 50 90 10 1.28x10^-4 1.79x10^-3
mNH 18 94.4 5.6
LH 26 57.7 42.3
mLH 8 62.5 37.5
ND 22 100 0
TP53, Seq. mut. NH 50 98 2 4.65x10^-19 1.02x10^-17
mNH 18 94.4 5.6
LH 26 3.8 96.2
mLH 8 25 75
ND 22 95.5 4.5
IKZF1 (B-pwy), CNA and seq. mut. NH 50 98 2 0.656 1
mNH 18 94.4 5.6
LH 26 96.2 3.8
mLH 8 100 0
ND 22 90.9 9.1
IKZF2 (B-pwy), CNA and seq. mut. NH 50 100 0 8.69x10^-12 1.83x10^-10
mNH 18 94.4 5.6
LH 26 38.5 61.5
mLH 8 75 25
ND 22 100 0
IKZF3 (B-pwy), CNA and seq. mut. NH 50 86 14 0.135 1
mNH 18 88.9 11.1
LH 26 100 0
mLH 8 87.5 12.5
ND 22 100 0
PAX5 (B-pwy), CNA and seq. mut. NH 50 90 10 1.25x10^-7 2.12x10^-6
mNH 18 100 0
LH 26 92.3 7.7
mLH 8 100 0
ND 22 40.9 59.1
EBF1 (B-pwy), CNA NH 50 100 0 0.322 1
mNH 18 100 0
LH 26 100 0
mLH 8 100 0
ND 22 95.5 4.5
VPREB1 (B-pwy), CNA NH 50 98 2 0.044 0.485
mNH 18 88.9 11.1
LH 26 92.3 7.7
mLH 8 100 0
ND 22 77.3 22.7
B-pathway combined NH 50 76 24 4.63x10^-5 6.95x10^-4
mNH 18 66.7 33.3
LH 26 26.9 73.1
mLH 8 62.5 37.5
ND 22 27.3 72.7
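The adjusted P values in this table come from a Bonferroni step-down procedure, commonly known as Holm's method: P values are sorted in ascending order, the k-th smallest of m is multiplied by (m - k + 1), and the results are made non-decreasing and capped at 1. The sketch below is a generic pure-Python implementation on made-up P values; it is not the software used for the table, and the number of tests entering the adjustment is not stated in this legend.

```python
def holm_step_down(p_values):
    """Return Holm (Bonferroni step-down) adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest P value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Hypothetical raw P values for three tests.
raw = [0.01, 0.04, 0.03]
adj = holm_step_down(raw)
```

For the hypothetical input, the adjusted values are 0.03, 0.06 and 0.06. As a side observation, the smallest P value in the table (TP53) times 22 approximately equals its adjusted value, which would be consistent with 22 tests, though this is an inference rather than something stated in the legend.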
Supplementary Table 14: Differential expression analysis – NH versus masked NH.
See Excel Table: “Table_S14_NH_vs_mNH_Limma.xlsx”
Differential expression analysis performed by limma with estimation of false discovery rate (FDR) at 0.05 between near haploid (NH) and masked near haploid (mNH) cases. No statistically significant differences were identified between those two hypodiploid subgroups. Masked hypodiploid cases here refer to cases with either a pure doubled hypodiploid clone or cases harboring a doubled clone constituting at least 30%.
Supplementary Table 15: Differential expression analysis – LH versus masked LH.
See Excel Table: “Table_S15_LH_vs_mLH_Limma.xlsx”
Differential expression analysis performed by limma with estimation of false discovery rate (FDR) at 0.05 between low hypodiploid (LH) and masked low hypodiploid (mLH) cases. No statistically significant differences were identified between those two hypodiploid subgroups. Masked hypodiploid cases here refer to cases with either a pure doubled hypodiploid clone or cases harboring a doubled clone constituting at least 30%.
Supplementary Table 16: TP53 mutations in adult ALL.
LH, low hypodiploid; H47, >47 chromosomes; NH, near haploid; D, diagnosis; R, relapse; Homo, homozygous; Het, heterozygous; LFS, Li-Fraumeni Syndrome.
Seq ID ALL subgroup Status (D/R) TP53 mutation (NM_000546.4) Homo or het Mutation status LFS associated or sporadic Predicted effect Domain Dominant negative
ADT003 LH D R248Q Homo Somatic LFS and sporadic DNA contact DNA binding Yes
ADT017 LH D V173M Homo Somatic LFS and sporadic Deleterious DNA binding Yes
ADT018 LH D V173M Homo No germline LFS and sporadic Deleterious DNA binding Yes
ADT027 LH D Y220H Homo No germline Sporadic Deleterious DNA binding No
ADT028 LH D R249T Homo Somatic Sporadic Deleterious DNA binding N/A
ADT040 T-ALL D and R R175H Homo No germline LFS and sporadic Conformational DNA binding Yes
ADT044 N/A R L265P Homo No germline LFS and sporadic Deleterious DNA binding Yes
ADT044 N/A R R267Q Homo No germline LFS and sporadic Deleterious DNA binding N/A
ADT074 H47 R R213* Homo No germline LFS and sporadic Deleterious DNA binding No
ADT076 Other R R282fs Het No germline Sporadic Truncating
ADT084 Other R L145P Het Somatic Sporadic Deleterious DNA binding N/A
ADT084 Other R G187_E6splice_region Homo Somatic
ADT085 LH D Y220C Homo Somatic LFS and sporadic Deleterious DNA binding Yes
ADT121 Other D GinsR282 Het No germline
ADT122 Other D P219L Het No germline Sporadic Deleterious DNA binding No
ADT122 Other D R273C Het No germline LFS and sporadic Deleterious DNA binding Yes
Adult_Hypo2 LH D R273H Homo No germline LFS and sporadic DNA contact DNA binding Yes
Adult_Hypo3 LH D R249S Het No germline Sporadic Conformational DNA binding Yes
Adult_Hypo4 LH D R273fs Homo No germline Sporadic Truncating
Adult_Hypo5 NH D R290fs Het No germline Sporadic Truncating
Adult_Hypo6 LH D G187_E6splice_region Homo No germline Sporadic
Supplementary Table 17: Copy number alterations and mutations in hypodiploid ALL vs non-hypodiploid ALL.
Abnormalities are deletions unless otherwise indicated. The St. Jude (SJ) cohort was studied in Ref29 and consisted of 258 childhood ALL cases divided into high hyperdiploid (H50; >50 chromosomes; n=44), TCF3-PBX1 (n=17), ETV6-RUNX1 (n=50), MLL-rearranged (n=24), BCR-ABL1 (PH; n=21), hypodiploid (Hypo; mainly near diploid cases with a dicentric chromosome; n=10), and other (n=92). The hypodiploid ALL cohort is divided into the near haploid (NH), low hypodiploid (LH) and near diploid (ND) subgroups. § Genes that were sequenced in the hypodiploid ALL cohort but not in the SJ cohort (genes that are targeted only by sequence mutations, not by copy number alterations, and that were not sequenced in the SJ cohort are shaded in gray); † Copy number alteration (CNA); * Sequence mutation; ** B cell pathway lesions include deletions or sequence mutations involving BLNK, EBF1, IKZF1, IKZF2, IKZF3, LEF1, PAX5, RAG1/2, and TCF3. VPREB1 may be considered part of the B cell pathway but is located in the immunoglobulin lambda light chain locus at 22q11.22 and is commonly deleted upon rearrangement of this locus. The biologic significance of VPREB1 deletions in B-ALL is therefore unclear, and the frequency of B cell pathway lesions is shown both excluding and including VPREB1 alterations. iAmp21, intrachromosomal amplification of chromosome 21; pwy, pathway.
Lesion Location SJ cohort (n, %) H50 (n, %) TCF3-PBX1 (n, %) ETV6-RUNX1 (n, %) MLL (n, %) PH (n, %) Hypo (n, %) Other (n, %) NH (n, %) LH (n, %) ND (n, %)
n= 258 44 17 50 24 21 10 92 68 34 22
JAK1 * § 1p32.3-p31.3 0 0 0 0 2 9.1
PDE4B 1p31.2 2 0.8 0 0 0 0 2 4.0 0 0 0 0 0 0 0 0 1 1.5 0 0 0 0
NRAS * § 1p13.1 10 14.7 0 0 4 18.2
ADAR 1q22 2 0.8 0 0 0 0 0 0 0 0 0 0 0 0 2 2.2 0 0 0 0 0 0
LOC440742 1q44 2 0.8 0 0 0 0 0 0 0 0 0 0 0 0 2 2.2 0 0 0 0 0 0
1q gain 1q23.3-1qtel 30 11.6 13 29.5 16 94.1 0 0 0 0 0 0 0 0 1 1.1 0 0 0 0 0 0
IKZF2 † or * § 2q34 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1.5 18 52.9 0 0
ARPP-21 3p22.3 8 3.1 1 2.3 0 0 2 4.0 0 0 1 4.8 2 20 2 2.2 1 1.5 0 0 3 13.6
FHIT 3p14.2 12 4.7 0 0 0 0 6 12.0 0 0 2 9.5 1 10 3 3.3 0 0 0 0 0 0
FLNB 3p14.3 7 2.7 1 2.3 0 0 1 2.0 0 0 1 4.8 1 10 3 3.3 0 0 0 0 1 4.5
BTLA/CD200 3q13.2 16 6.2 0 0 0 0 8 16.0 0 0 5 23.8 1 10 2 2.2 2 2.9 0 0 0 0
RASA2 3q22-q23 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2.9 0 0 0 0
MBNL1 3q25.1 9 3.5 2 4.5 0 0 3 6.0 0 0 2 9.5 1 10 1 1.1 0 0 0 0 0 0
TBL1XR1 3q26.32 15 5.8 1 2.3 0 0 8 16.0 1 4.2 1 4.8 0 0 4 4.3 0 0 2 5.9 0 0
IL1RAP 3q28 3 1.2 0 0 0 0 1 2.0 0 0 1 4.8 1 10 0 0 0 0 0 0 0 0
ARHGAP24 4q21.23 2 0.8 0 0 0 0 0 0 0 0 0 0 1 10 1 1.1 0 0 0 0 0 0
NR3C2 4q31.23 10 3.9 0 0 0 0 6 12.0 0 0 0 0 1 10 3 3.3 0 0 0 0 1 4.5
LEF1 4q25 5 1.9 0 0 0 0 2 4.0 0 0 0 0 1 10 2 2.2 0 0 0 0 0 0
FBXW7 4q31.3 5 1.9 0 0 0 0 1 2.0 0 0 1 4.8 1 10 2 2.2 1 1.5 0 0 0 0
EBF1 5q33.3 12 4.7 1 2.3 0 0 5 10.0 0 0 3 14.3 1 10 2 2.2 0 0 0 0 1 4.5
Histone cluster 6p22.2 21 8.1 1 2.3 0 0 3 6.0 0 0 3 14.3 3 30 11 12.0 6 8.8 1 2.9 2 9.1
GRIK2 6q16 11 4.3 1 2.3 1 5.9 7 14.0 0 0 0 0 0 0 2 2.2 0 0 0 0 0 0
EPHA7 6q16.1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2.9 0 0 0 0
ARMC2/SESN1 6q21 13 5.0 0 0 0 0 8 16.0 0 0 0 0 0 0 5 5.4 0 0 0 0 0 0
ARID1B 6q25.1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2.9 0 0 1 4.5
LOC389437 6q25.3 7 2.7 0 0 0 0 4 8.0 0 0 0 0 1 10 2 2.2 0 0 0 0 0 0
IKZF1 † or * 7p13 48 18.6 4 9.1 0 0 0 0 1 4.2 16 76.2 5 50 22 24 3 4.4 1 2.9 2 9.1
CDK6 7q21.2 8 3.1 1 2.3 0 0 0 0 0 0 2 9.5 3 30 2 2.2 0 0 1 2.9 1 4.5
MSRA 8p23 6 2.3 0 0 0 0 2 4.0 0 0 1 4.8 2 20 1 1.1 0 0 0 0 0 0
TOX 8q12.1 11 4.3 0 0 0 0 5 10.0 0 0 1 4.8 0 0 5 5.4 0 0 0 0 0 0
PAG1 † or * § 8q21.13 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8 11.8 1 2.9 0 0
CCDC26 8q24.21 5 1.9 1 2.3 0 0 0 0 0 0 0 0 0 0 4 4.3 0 0 0 0 0 0
JAK2 * § 9p24 1 1.5 0 0 0 0
CDKN2A/B 9p21.3 87 33.7 9 20.5 6 35.3 15 30.0 4 16.7 11 52.4 10 100 32 35 15 22.1 8 23.5 17 77.3
PAX5 † or * 9p13.2 83 32.2 4 9.1 8 47.1 17 34.0 5 20.8 11 52.4 10 100 28 30 5 7.4 2 5.9 13 59.1
ABL1 9q34.13 5 1.9 0 0 0 0 0 0 0 0 4 19.0 1 10 0 0 0 0 0 0 0 0
ADARB2 10p15.2 1 0.4 0 0 0 0 0 0 0 0 1 4.8 0 0 0 0 0 0 0 0 0 0
COPEB/KLF6 10p15 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
BLNK 10q24.1 3 1.2 0 0 0 0 2 4.0 0 0 0 0 0 0 1 1.1 0 0 1 2.9 0 0
ADD3 10q25.2 14 5.4 1 2.3 0 0 4 8.0 0 0 5 23.8 0 0 4 4.3 0 0 0 0 1 4.5
FAM53B 10q26.13 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2.9 0 0 0 0
RAG1/2 11p12 15 5.8 0 0 0 0 8 16.0 1 4.2 0 0 0 0 6 6.5 1 1.5 0 0 0 0
NUP160/PTPRJ 11p11.2 1 0.4 0 0 0 0 0 0 0 0 0 0 0 0 1 1.1 0 0 0 0 0 0
GAB2 11q14.1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2.9 1 2.9 0 0
ATM 11q22.3 7 2.7 0 0 0 0 2 4.0 0 0 1 4.8 0 0 4 4.3 0 0 0 0 0 0
CUL5 11q22.3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2.9 0 0 0 0
ETV6 † or * § 12p13.2 63 24.4 5 11.4 0 0 34 68.0 2 8.3 2 9.5 2 20 18 20 5 7.4 0 0 2 9.1
KRAS * § 12p12.1 2 2.9 0 0 2 9.1
BTG1 12q21.33 18 7.0 0 0 0 0 7 14.0 0 0 4 19.0 1 10 6 6.5 0 0 0 0 1 4.5
PTPN11 * § 12q24 1 1.5 0 0 2 9.1
FLT3 * § 13q12 6 8.8 0 0 0 0
ZMYM5 13q12.11 5 1.9 1 2.3 0 0 2 4.0 0 0 0 0 0 0 2 2.2 1 1.5 0 0 0 0
PDS5B/APRIN 13q12.3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2.9 0 0 0 0
ELF1 13q14.11 12 4.7 2 4.5 2 11.8 4 8.0 1 4.2 0 0 1 10 2 2.2 1 1.5 0 0 1 4.5
SERP2/TSC22D1 13q14 15 5.8 2 4.5 2 11.8 4 8.0 1 4.2 2 9.5 1 10 3 3.3 0 0 0 0 0 0
RB1 † or * § 13q14.2 15 5.8 3 6.8 2 11.8 2 4.0 2 8.3 4 19.0 0 0 2 2.2 6 8.8 14 41.2 0 0
DLEU2/7/mir15/-16a 13q14 16 6.2 5 11.4 2 11.8 3 6.0 3 12.5 1 4.8 0 0 2 2.2 0 0 0 0 0 0
ATP10A 15q12 5 1.9 0 0 0 0 1 2.0 0 0 1 4.8 1 10 2 2.2 0 0 0 0 0 0
SPRED1 (5’) 15q14 6 2.3 0 0 0 0 0 0 0 0 1 4.8 1 10 4 4.3 1 1.5 0 0 0 0
LTK 15q15.1 6 2.3 0 0 0 0 3 6.0 0 0 0 0 1 10 2 2.2 0 0 0 0 0 0
ANKRD11 16q24.3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 5.9 0 0
TP53 * § 17p13.1 2 2.9 31 91.2 1 4.5
NF1 † or * § 17q11.2 8 3.1 1 2.3 0 0 2 4.0 0 0 0 0 1 10 4 4.3 30 44.1 3 8.8 1 4.5
IKZF3 † or * § 17q21.1 3 1.2 0 0 0 0 0 0 0 0 0 0 2 20 1 1.1 9 13.2 1 2.9 0 0
SMAD2 18q21.1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 5.9 0 0
TCF3 19p13.3 17 6.6 1 2.3 16 94.1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
C20orf194 20p13 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2.9 0 0 0 0
C20orf94 20p12.2 20 7.8 2 4.5 0 0 7 14.0 0 0 7 33.3 0 0 4 4.3 1 1.5 0 0 0 0
ERG 21q22 14 5.4 0 0 0 0 0 0 0 0 0 0 0 0 14 15 1 1.5 0 0 0 0
iAmp21 21, varies 11 4.3 0 0 0 0 5 10.0 0 0 0 0 0 0 6 6.5 0 0 0 0 1 4.5
VPREB1 22q11.22 80 31.0 7 15.9 1 5.9 35 70.0 1 4.2 7 33.3 3 30 26 28 3 4.4 2 5.9 5 22.7
IL3RA Xp22.33 18 7.0 1 2.3 0 0 6 12.0 0 0 0 0 1 10 10 11 0 0 0 0 0 0
CRLF2 § Xp22.3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2.9 0 0 2 9.1
DMD Xp21.1 11 4.3 1 2.3 0 0 4 8.0 0 0 0 0 0 0 6 6.5 0 0 2 5.9 2 9.1
B pwy ** 137 53.1 11 25.0 17 100 27 54.0 6 25.0 16 76.2 10 100 50 54 16 23.5 22 64.7 14 63.6
B pwy with VPREB1 169 65.5 16 36.4 17 100 42 84.0 6 25.0 16 76.2 10 100 62 67 18 26.5 23 67.6 16 72.7
Supplementary Table 18: TP53 mutations in pediatric hypodiploid ALL.
Codons R248, R273 and R282 are hot-spots in LFS. D, diagnosis; R, relapse; Homo, homozygous; Het, heterozygous; LFS, Li-Fraumeni Syndrome; LH, low hypodiploid; NH, near haploid.
Seq ID Subgroup TP53 mutation (NM_000546.4) Homo or het Status LFS associated or sporadic Predicted effect Domain Dominant negative
SJHYPO003-D LH R213* Homo Non-tumor LFS and sporadic Deleterious DNA-binding
SJHYPO004-D LH D49fs Homo Non-tumor Sporadic Truncating Proline-rich
SJHYPO005-D LH G245S Homo Non-tumor LFS and sporadic Deleterious DNA-binding
SJHYPO009-D & R LH R306* Homo Somatic LFS and sporadic Truncating None
SJHYPO012-D LH R248Q Homo Non-tumor LFS and sporadic DNA contact DNA-binding Yes
SJHYPO013-D LH R282fs Homo Somatic Sporadic Truncating DNA-binding
SJHYPO014-D LH C176S Het Somatic Sporadic Deleterious DNA-binding Yes
SJHYPO022-D LH R306fs Het Somatic Sporadic Truncating None
SJHYPO025-D LH F113fs Homo Non-tumor Sporadic Truncating DNA-binding
SJHYPO026-D LH T284>KRRSEETT Het Somatic DNA-binding
SJHYPO027-D LH R306* Homo Non-tumor LFS and sporadic Truncating None
SJHYPO029-D LH L137fs Het Somatic Sporadic Truncating DNA-binding
SJHYPO048-D LH R248W Homo Non-tumor LFS and sporadic DNA contact DNA-binding Yes
SJHYPO051-D LH F109fs Homo Non-tumor Sporadic Truncating DNA-binding
SJHYPO052-D LH exon1-exon7 splicing N/A N/A N/A N/A
SJHYPO053-D LH PinsR282 Homo Somatic DNA-binding
SJHYPO055-D LH T125R Homo Somatic LFS and sporadic Deleterious DNA-binding
SJHYPO061-D LH R174_C176>R Homo Somatic DNA-binding
SJHYPO062-D NH A88fs Homo Somatic In COSMIC at aa Truncating Proline-rich
SJHYPO063-D LH R280K Homo Non-tumor LFS and sporadic DNA binding DNA-binding Yes
SJHYPO064-D LH R282fs Homo Somatic In COSMIC at aa Truncating DNA-binding
SJHYPO068-D LH R248W Het Somatic LFS and sporadic DNA contact DNA-binding Yes
SJHYPO074-D LH R273C Homo Somatic LFS and sporadic DNA contact DNA-binding
SJHYPO077-D LH Y163N Homo Somatic Sporadic Deleterious DNA-binding Yes
SJHYPO078-D LH R306* Homo Somatic LFS and sporadic Truncating None
SJHYPO079-D LH L130H Homo Non-tumor Sporadic Deleterious DNA-binding
SJHYPO080-D LH R273H Homo Somatic LFS and sporadic DNA contact DNA-binding Yes
SJHYPO083-D LH GinsR282 Homo Somatic DNA-binding
SJHYPO084-D LH Y220C Homo Somatic LFS and sporadic Deleterious DNA-binding Yes
SJHYPO093-D LH R248W Het Somatic LFS and sporadic DNA contact DNA-binding Yes
SJHYPO096-D ND R282>RLA Homo Somatic DNA-binding
SJHYPO119-D LH R273H Homo Non-tumor LFS and sporadic DNA contact DNA-binding Yes
SJHYPO120-D LH R280S Homo Non-tumor Sporadic DNA binding DNA-binding Yes
SJHYPO126-D LH I162fs Homo Non-tumor Sporadic Truncating DNA-binding
Supplementary Table 19: IKZF1, IKZF2 and RB1 deletions in adult ALL.
Number (N) and percentage of cases with deletion in the respective gene are indicated. PH, BCR-ABL1 positive; H47, >47 chromosomes; H50, >50 chromosomes; MLL, mixed lineage leukemia; LH, low hypodiploid.
ALL subtype IKZF1 deletion (N) IKZF1 deletion (%) IKZF2 deletion (N) IKZF2 deletion (%) RB1 deletion (N) RB1 deletion (%)
Low hypodiploid (N=11) 0 0 3 27.3 2 18.2
Near haploid (N=1) 0 0 1 100 1 100
H50 (N=4) 0 0 0 0 0 0
MLL (N=3) 0 0 0 0 0 0
PH (N=31) 17 54.8 0 0 4 12.9
ERG (N=4) 1 25 0 0 2 50
H47 (N=2) 0 0 0 0 0 0
Other (N=40) 7 17.5 0 0 3 7.5
Bi-phenotypic (N=1) 0 0 0 0 0 0
N/A (N=4) 0 0 0 0 0 0
T-ALL (N=16) 0 0 0 0 0 0
Total non-LH (N=106) 25 23.6 1 0.9 10 9.4
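The contrast in this table between low hypodiploid and all other adult cases can be examined with a 2×2 test. The sketch below is an illustration under stated assumptions: the source reports no statistical test for this table, and Fisher's exact test is chosen here only as a common option for small counts. The function name is hypothetical; the counts come from the IKZF2 columns above (3 of 11 LH cases versus 1 of 106 non-LH cases).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of observing k in the top-left cell
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over all tables that are no more likely than the observed one
    return sum(
        p for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# IKZF2: LH 3 deleted / 8 not deleted; non-LH 1 deleted / 105 not deleted
p_ikzf2 = fisher_exact_two_sided(3, 8, 1, 105)
```

The same call can be repeated with the RB1 or IKZF1 counts; the result is only a rough guide given the small LH group.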
Supplementary Table 20: Alterations targeting histone modifiers in next-generation sequenced hypodiploid ALL.
SNV, single nucleotide variant; Indel, insertion/deletion mutation; CNV, copy number variation; DEL, deletion; D, diagnosis; R, relapse; G, remission; NH, near haploid; LH, low hypodiploid.
Sample Gene Mutation type Mutation class Status Histone modifier type Hypodiploid ALL subgroup
SJHYPO002-D CREBBP, Ref91 Indel K389_M395>K Somatic Histone writer NH
SJHYPO006-D CREBBP Focal CNV DEL Somatic Histone writer NH
SJHYPO032-D CREBBP SNV R1169C Somatic Histone writer NH
SJHYPO036-D CREBBP Indel P1279_E21splice Somatic Histone writer NH
SJHYPO037-R CREBBP SNV R1446C Appearing at R Histone writer NH
SJHYPO040-D CREBBP Indel P1279_E21splice Somatic Histone writer NH
SJHYPO056-D CREBBP Focal CNV DEL Somatic Histone writer NH
SJHYPO117-D & R CREBBP SNV Q1500P In D and R, no G Histone writer NH
SJHYPO001-D EHMT2, Ref67 SNV E883Q Somatic Histone writer NH
SJHYPO006-D PRDM1, Ref67 SNV E7_UTR_3 Somatic Histone writer NH
SJHYPO029-D MLL2, Ref57 SNV V4642I Somatic Histone writer NH
SJHYPO032-D WHSC1, Ref60 SNV E1099K Somatic Histone writer NH
SJHYPO044-D EZH2, Ref92 SNV N675K Somatic Histone writer NH
SJHYPO117-R EZH2 SNV R684H Appearing at R Histone writer NH
SJHYPO117-R EZH2 SNV G159R Appearing at R Histone writer NH
SJHYPO055-D UBR4, Ref61,62 SNV Q879_E19splice_region Somatic Histone writer LH
SJHYPO126-D UBR4 SNV R1349H Present in G Histone writer LH
SJHYPO056-D NSD1, Ref60 SNV E23_UTR_3 Somatic Histone writer NH
SJHYPO013-D USP7, Ref63 SNV A381T Somatic Histone eraser LH
SJHYPO032-D KDM1A, Ref67 SNV Q417* Somatic Histone eraser NH
SJHYPO119-D USP22, Ref69 Fusion DEL Somatic Histone eraser LH
SJHYPO124-D HDAC2, Ref71 SNV S118P Somatic Histone eraser NH
SJHYPO022-D BRDT, Ref60 SNV R532Q Somatic Histone reader LH
SJHYPO123-D SFMBT2, Ref60 Focal CNV DEL Somatic Histone reader NH
SJHYPO026-D ASXL3 SNV T1243A Somatic Histone binder LH
SJHYPO040-D NFYC SNV P240L Somatic Binds histone writer NH
SJHYPO012-D PHF12, Ref79 SNV E986A Somatic Binds histone eraser LH
SJHYPO041-D CHD3, Ref80 SNV D93_E2splice_region Somatic Binds histone eraser NH
SJHYPO044-D CHD4, Ref80 SNV N1113I Somatic Binds histone eraser NH
SJHYPO039-D CDC6, Ref83 SNV E402Q Somatic Histone DNA modifier NH
SJHYPO046-D TET1, Ref83 SNV E12_UTR_3 Somatic Histone DNA modifier NH
SJHYPO120-D MBD5, Ref85 SNV S1097I Somatic Histone DNA modifier LH
SJHYPO125-D TET3, Ref83 SNV G795D Somatic Histone DNA modifier LH
SJHYPO046-D ARID1A, Ref87 SNV P1384S Somatic Histone reorder chromatin NH
SJHYPO006-D HIST1H2BK SNV G14S Somatic Histone NH
Supplementary Table 21: Differential expression analysis – NH versus LH.
See Excel Table: “Table_S21_NH_vs_LH_Limma.xlsx”
Differential expression analysis was performed with limma at a false discovery rate (FDR) threshold of 0.05, comparing near haploid (NH) and low hypodiploid (LH) cases. More than 15,000 probesets showed differential expression between these two hypodiploid subgroups.
Supplementary Table 22: Gene set enrichment analysis (GSEA) – NH versus LH.
See Excel Table: “Table_S22_NH_vs_LH_GSEA.xlsx”
GSEA comparing near haploid (NH) and low hypodiploid (LH) ALL identified 671 significant gene sets at an FDR cutoff of 0.25.
Supplementary Table 23: Ex vivo drug study of PI3K/mTOR and MEK inhibitors on hypodiploid ALL cells.
IC50 values for the respective samples and drugs are shown. The concentration range tested for each drug is indicated. “>” indicates that an IC50 was not reached at the highest concentration tested. NT, not tested.
Hypodiploid subgroup Generic ID Bez235 (0.03125-1 µM) GDC-0941 (0.02-5 µM) Mek162 (0.03125-1 µM) PD0325901 (0.1024-10 µM)
near haploid SJHYPO037-X1 0.095 0.067 0.209 NT
near haploid SJHYPO037-X2 0.096 0.041 0.244 NT
near haploid SJHYPO054-X2 NT NT NT >10
near haploid SJHYPO054-X3 0.033 0.027 0.514 >10
near haploid SJHYPO123-X1 NT 0.079 >1 >10
near haploid SJHYPO123-X2 0.075 0.093 >1 >10
near haploid SJHYPO123-X3 0.075 NT NT NT
low hypodiploid SJHYPO077-X1 NT 0.027 NT <0.1024
low hypodiploid SJHYPO120-X1 0.109 0.501 >1 0.637
low hypodiploid SJHYPO120-X3 0.072 0.499 >1 0.155
near haploid NALM-16 0.261 0.673 >1 >10
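Because several cells in this table are bounds rather than point estimates, any downstream handling has to keep the qualifier with the number. The following is a minimal sketch of one way to parse the cell values; the function name and the returned tuple layout are hypothetical, and treating "<" as an upper bound is an interpretation, since the legend defines only ">" and NT.

```python
def parse_ic50(cell):
    """Parse one table cell into (qualifier, value), or None for 'NT'.

    qualifier is '=' for a measured IC50, '>' when the IC50 was not
    reached at the highest concentration, and '<' when the cell is
    written as a bound below the stated value.
    """
    text = cell.strip()
    if text == "NT":
        return None  # not tested
    if text[0] in "<>":
        return (text[0], float(text[1:]))
    return ("=", float(text))


# Cells from the PD0325901 column, top to bottom
column = ["NT", "NT", ">10", ">10", ">10", ">10", "NT", "<0.1024", "0.637", "0.155", ">10"]
parsed = [parse_ic50(c) for c in column]
# Keep only values that were actually measured within the tested range
measured = [v for item in parsed if item and item[0] == "=" for v in [item[1]]]
```

Keeping censored entries separate avoids averaging ">10" as if it were 10.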
Supplementary Table 24: Sequences of shRNAs.
shRNA ID Sequences of the guide strands of the shRNA
Ikzf2-4422 ATGGCACAAGATACAGAAAAA
Ikzf2-8315 TAGGTCAGGTTTAAATCAATA
Ikzf3-449 TTCGATGAAAGTGAAAGATGA
Ikzf3-1586 CTCCATCAAAGTGATCAACAA
Luc-1309 (Firefly luciferase) CCCGCCTGAAGTCTCTGATTAA
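Since the table lists guide strands, the matching target-site sequence on the sense strand is the reverse complement of each entry. A minimal sketch follows; the function name is hypothetical, and it assumes the sequences are written 5' to 3' as DNA.

```python
# Watson-Crick complements for DNA bases
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5' to 3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


guides = {
    "Ikzf2-4422": "ATGGCACAAGATACAGAAAAA",
    "Ikzf2-8315": "TAGGTCAGGTTTAAATCAATA",
    "Ikzf3-449": "TTCGATGAAAGTGAAAGATGA",
    "Ikzf3-1586": "CTCCATCAAAGTGATCAACAA",
}
# Sense-strand target sites derived from the guide strands
targets = {name: reverse_complement(seq) for name, seq in guides.items()}
```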
Supplementary Table 25: Single nucleotide variations identified by mRNA seq of NALM-16.
See Excel Table: “Table_S25_NALM-16_mRNA-seq_variations.xlsx”
RNA-seq (transcriptome sequencing) was carried out for the near haploid ALL cell line NALM-16. Single nucleotide variations are listed. Assessment of the functional impact of missense variations, calculated by PolyPhen and SIFT, is included, as well as comparisons with a local database of sequence variations obtained from whole genome sequencing of tumor and normal DNA from 254 children (the SJCRH – Washington University Pediatric Cancer Genome Project (PCGP), Ref93). Column definitions are listed below:
A: GeneName: HUGO gene symbol
B: Chr: Chromosome
C: HG19_Pos: Chromosome position in hg19 coordinates
D: Class: Classification based on amino acid change pattern. Exon refers to variations in non-coding RNA genes.
E: AAChange: Predicted amino acid change for the variation
F: ProteinGI: NCBI protein GI number
G: mRNA_acc: RefSeq accession number
H: ReferenceAllele: The allele represented in the reference human genome. The reference allele is marked as – for an insertion.
I: MutantAllele: Mutant allele
J: Flanking: 20bp[reference allele/mutant allele]20bp
K: Freq: Frequency of reads with the variation
L: SIFTResult: Deleterious status assigned by SIFT
M: SIFTScore: SIFT score
N: pph2result: Deleterious status assigned by PolyPhen2
O: pph2score: PolyPhen2 score
P: PCGP+-2: Variation identified not at the specific site, but within 2bp, in at least two whole genome sequenced samples from the PCGP.
Q: COSMIC_OMIM_VALID_CLINIC+-2: Variation identified not at the specific site, but within 2bp, in COSMIC or OMIM.
R: COSMIC_OMIM_VALID_CLINIC_pmid: PubMed ID
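The Flanking column packs three pieces of information into one string. As a hedged illustration, the sketch below splits such a field into its flanks and alleles. The exact spelling used in the Excel file is an assumption based on the description "20bp[reference allele/mutant allele]20bp"; the function name and the example strings are hypothetical and shortened for readability.

```python
import re

# left flank, [ref/mut], right flank; "-" marks the empty allele of an indel
FLANK_RE = re.compile(
    r"^([ACGTN]*)\[([ACGTN-]*)/([ACGTN-]*)\]([ACGTN]*)$", re.IGNORECASE
)


def parse_flanking(field):
    """Split a Flanking field into left flank, alleles and right flank."""
    match = FLANK_RE.match(field.strip())
    if match is None:
        raise ValueError(f"unrecognized Flanking field: {field!r}")
    left, ref, mut, right = match.groups()
    return {"left": left, "ref": ref, "mut": mut, "right": right}


# Made-up examples, not rows from the table
snv = parse_flanking("ACGTTGCA[A/G]TTGACCGT")
insertion = parse_flanking("ACGT[-/T]GGCA")
```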
Supplementary Table 26: Primer sequences used for targeted gene resequencing and NF1 deletion mapping.
See Excel Table: “Table_S26_Primers.xlsx”
Supplementary Table 27: Murine lymphoid precursor cells used for gene expression profiling.
Bone marrow harvested from wild-type C57BL/6 mice was flow sorted based on the surface marker scheme below.
Differentiation stage Explanation Surface markers used for flow cytometric cell sorting
CLP Common lymphoid precursor Sca1low, Lin-, IL7Rα+, cKITlow
Hardy Fraction A pre-pro-B B220+, CD43+, CD24-, BP-1-
Hardy Fraction B Pro-B B220+, CD43+, CD24+, BP-1-
Hardy Fraction C Pre-B early B220+, CD43+, CD24+, BP-1+
Hardy Fraction D Pre-B late B220+, CD43-, IgM-, IgD-
Hardy Fraction E Immature B cells B220+, CD43-, IgM+, IgD-
Hardy Fraction F Mature B cells B220-bright, CD43-, IgM+, IgD+
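The marker scheme for Hardy fractions A to F amounts to a small decision table, which the following minimal sketch encodes. It is an illustration only: the function name and the "+"/"-" encoding are hypothetical, the bright B220 staining of fraction F is simplified to "+", and the CLP gate is left out because it relies on graded (low) staining.

```python
# With CD43+: (CD24, BP-1) distinguishes fractions A-C
EARLY = {("-", "-"): "A", ("+", "-"): "B", ("+", "+"): "C"}
# With CD43-: (IgM, IgD) distinguishes fractions D-F
LATE = {("-", "-"): "D", ("+", "-"): "E", ("+", "+"): "F"}


def hardy_fraction(markers):
    """Map a dict of marker calls ('+' or '-') to a Hardy fraction letter.

    Returns None for marker combinations that the scheme does not list.
    """
    if markers.get("B220") != "+":
        return None
    if markers.get("CD43") == "+":
        return EARLY.get((markers.get("CD24"), markers.get("BP-1")))
    if markers.get("CD43") == "-":
        return LATE.get((markers.get("IgM"), markers.get("IgD")))
    return None


pro_b = hardy_fraction({"B220": "+", "CD43": "+", "CD24": "+", "BP-1": "-"})
immature = hardy_fraction({"B220": "+", "CD43": "-", "IgM": "+", "IgD": "-"})
```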
Supplementary Table 28: Antibodies used for biochemical studies.
Antibody Catalogue number Company Purpose
pERK 9101 Cell Signaling Technology Primary antibody - Flow analysis and immunoblotting
pAKT 4060 Cell Signaling Technology Primary antibody - Flow analysis
pS6 2211 Cell Signaling Technology Primary antibody - Flow analysis
pStat5 9351 Cell Signaling Technology Primary antibody - Flow analysis
pTyr 9411 Cell Signaling Technology Primary antibody - Flow analysis
IgG control 5415 Cell Signaling Technology Primary antibody - Flow analysis
β-Actin 4967 Cell Signaling Technology Primary antibody - Flow analysis
pmTOR 44-1125G Life Technologies Primary antibody - Flow analysis
p4EBP1 44-1170G Life Technologies Primary antibody - Flow analysis
Bcl-2 1017-1 Epitomics Primary antibody - Flow analysis
Bcl-xl 1018-1 Epitomics Primary antibody - Flow analysis
Mcl-1 1239-1 Epitomics Primary antibody - Flow analysis
p53 1047-1 Epitomics Primary antibody - Flow analysis
p53 2524 Cell Signaling Technology Primary antibody - Immunoblotting
Mouse anti-Ras 05-516 Millipore Primary antibody - Flow analysis
Polyclonal goat anti-rabbit Immunoglobulin/HRP P0448 DakoCytomation Primary antibody - Flow analysis
Sheep anti mouse IgG HRP NA931V GE Healthcare (UK) Primary antibody - Flow analysis
anti-rabbit IgG-FITC 711-096-152 Jackson Immunoresearch Secondary antibody
anti mouse IgG-APC 115-096-146 Jackson Immunoresearch Secondary antibody
anti rabbit IgG-Alexa 647 A-21246 Life Technologies Secondary antibody
phospho-STAT5 (Tyr 694)-Alexa 647 612599 BD Biosciences Directly conjugated primary antibody - Flow analysis
pBTK (Tyr 551)-Alexa 647 558134 BD Biosciences Directly conjugated primary antibody - Flow analysis
CD45 557833 BD Pharmingen Directly conjugated primary antibody - Flow analysis
CD19 557921 BD Pharmingen Directly conjugated primary antibody - Flow analysis
CD16/CD32 553142 BD Biosciences Used to block Fc receptors
sc-68, Neurofibromin (N) sc-68 Santa Cruz Biotechnology, Inc. Immunoblotting; N-terminal epitope of NF1
sc-67, Neurofibromin (D) sc-67 Santa Cruz Biotechnology, Inc. Immunoblotting; C-terminal epitope of NF1
Aiolos sc-101982 Santa Cruz Biotechnology, Inc. Immunoblotting
Helios sc-9866 Santa Cruz Biotechnology, Inc. Immunoblotting
Supplementary Table 29: Association between aneuploidy and event free survival (EFS).
Estimated EFS in percent, with standard error in parentheses. Information is missing for 17 hypodiploid ALL cases. Masked hypodiploid cases are defined here as cases with either a pure doubled hypodiploid clone or a doubled clone constituting at least 30%.
P values were calculated by a log-rank test. NH, near haploid; LH, low hypodiploid.
Masked and non-masked cases separated
Hypodiploid subgroup n= Year 1 Year 2 Year 5 P value
Near haploid 43 79.6 (6.5) 57.1 (8.2) 54.2 (13.0) 0.11
Masked near haploid 17 87.1 (8.7) 77.4 (13.0) 77.4 (26.0)
Low hypodiploid 19 68.4 (10.3) 52.1 (11.4) 41.7 (18.4)
Masked low hypodiploid 7 66.7 (17.2) 66.7 (27.2) No Data
Near diploid 21 90.5 (6.2) 90.5 (6.2) 80.7 (8.9)
Masked and non-masked cases combined
Hypodiploid subgroup n= Year 1 Year 2 Year 5 P value
NH and masked NH 60 81.7 (5.3) 61.8 (7.2) 59.3 (12.6) 0.035
LH and masked LH 26 68.5 (9.1) 54.9 (11.1) 44.0 (19.0)
Near diploid 21 90.5 (6.2) 90.5 (6.2) 80.7 (8.9)
Comparison between near haploid and low hypodiploid ALL, excluding near diploid ALL
Hypodiploid subgroup n= Year 1 Year 2 Year 5 P value
NH and masked NH 60 81.7 (5.3) 61.8 (7.2) 59.3 (12.6) 0.29
LH and masked LH 26 68.5 (9.1) 54.9 (11.1) 44.0 (19.0)
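For readers who want to see how an estimate with a standard error at a given year can arise, here is a minimal sketch. It assumes the Kaplan-Meier estimator with Greenwood's formula, a common choice for event free survival; the source does not state which estimator was used, and the function name and toy data are hypothetical.

```python
from math import sqrt


def kaplan_meier(times, events):
    """Kaplan-Meier survival with Greenwood standard errors.

    times  : follow-up time for each case
    events : 1 if an event occurred at that time, 0 if censored
    Returns a list of (time, survival, standard_error) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    greenwood = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all cases sharing the same time point
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            if at_risk > n_events:
                greenwood += n_events / (at_risk * (at_risk - n_events))
            # When survival reaches 0 the standard error is reported as 0
            curve.append((t, survival, survival * sqrt(greenwood)))
        at_risk -= n_leaving
    return curve


# Toy data in years: events at 1 and 3, censoring at 2 and 4
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 0])
```

The value reported for a given year would be the last survival estimate at or before that time.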
Supplementary Table 30: Association between aneuploidy and cumulative incidence of any relapse.
Cumulative incidence (CIN) of any relapse in percent, with standard error in parentheses. Information is missing for 17 hypodiploid ALL cases. Masked hypodiploid cases are defined here as cases with either a pure doubled hypodiploid clone or a doubled clone constituting at least 30%. P values were calculated by Gray's test. NH, near haploid; LH, low hypodiploid.
Masked and non-masked cases separated
Hypodiploid subgroup n= Year 1 Year 2 Year 5 P value
Near haploid 43 20.4 (6.5) 37.3 (8.1) 40.2 (8.3) 0.10
Masked near haploid 17 12.9 (8.9) 20.2 (10.9) 20.2 (10.9)
Low hypodiploid 19 31.6 (11.0) 47.9 (12.0) 47.9 (12.0)
Masked low hypodiploid 7 16.7 (17.0) 16.7 (17.0) No Data
Near diploid 21 5.0 (5.0) 5.0 (5.0) 15.3 (8.4)
Masked and non-masked cases combined
Hypodiploid subgroup n= Year 1 Year 2 Year 5 P value
NH and masked NH 60 18.3 (5.3) 33.7 (6.9) 36.3 (7.1) 0.056
LH and masked LH 26 27.5 (9.1) 41.0 (10.5) 41.0 (10.5)
Near diploid 21 5.0 (5.0) 5.0 (5.0) 15.3 (8.4)
Comparison between near haploid and low hypodiploid ALL, excluding near diploid ALL
Hypodiploid subgroup n= Year 1 Year 2 Year 5 P value
NH and masked NH 60 18.3 (5.3) 33.7 (6.9) 36.3 (7.1) 0.49
LH and masked LH 26 27.5 (9.1) 41.0 (10.5) 41.0 (10.5)
Supplementary Table 31: Association between aneuploidy and minimal residual disease.
Percentage of cases with negative (Neg) and positive (Pos) minimal residual disease (MRD) in each subgroup. Information is missing for 33 hypodiploid ALL cases. Masked hypodiploid cases are defined here as cases with either a pure doubled hypodiploid clone or a doubled clone constituting at least 30%. NH, near haploid; LH, low hypodiploid. P values were calculated by an exact chi-square test.
Masked and non-masked cases separated (P = 0.16)
Hypodiploid subgroup n= MRD Neg (< 0.01%) MRD Pos (≥0.01%)
Near haploid 41 56.1% 43.9%
Masked near haploid 14 71.4% 28.6%
Low hypodiploid 14 50% 50%
Masked low hypodiploid 7 71.4% 28.6%
Near diploid 16 87.5% 12.5%
Masked and non-masked cases combined (P = 0.098)
Hypodiploid subgroup n= MRD Neg (< 0.01%) MRD Pos (≥0.01%)
NH and masked NH 55 60% 40%
LH and masked LH 21 57.1% 42.9%
Near diploid 16 87.5% 12.5%
Comparison between near haploid and low hypodiploid ALL, excluding near diploid ALL (P = 0.82)
Hypodiploid subgroup n= MRD Neg (< 0.01%) MRD Pos (≥0.01%)
NH and masked NH 55 60% 40%
LH and masked LH 21 57.1% 42.9%
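The underlying case counts can be recovered from the group size and the rounded percentage, which is useful when re-running a test on the table. A minimal sketch, with a hypothetical function name:

```python
def cases_from_percent(n, percent):
    """Recover a whole-number case count from a group size and a rounded percentage."""
    return round(n * percent / 100)


# (subgroup, n, % MRD negative) for the separated subgroups
rows = [
    ("Near haploid", 41, 56.1),
    ("Masked near haploid", 14, 71.4),
    ("Low hypodiploid", 14, 50.0),
    ("Masked low hypodiploid", 7, 71.4),
    ("Near diploid", 16, 87.5),
]
# Map each subgroup to (MRD negative, MRD positive) case counts
counts = {
    name: (cases_from_percent(n, neg), n - cases_from_percent(n, neg))
    for name, n, neg in rows
}
```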
Supplementary Table 32: Association between copy number alterations/mutations and event free survival (EFS).
Estimated EFS in %, with standard error (SE) in parentheses. P values were calculated by a log-rank test. "Signaling" indicates that the gene is involved in RTK or Ras signaling.
Factors n= Year 1 Year 2 Year 5 P value
CDKN2A/B
Normal 70 81.6 (4.8) 64.0 (6.5) 61.8 (10.6) 0.81
Altered 37 77.9 (7.0) 71.6 (8.0) 59.9 (10.1)
Histone cluster (6p22)
Normal 93 78.3 (4.5) 64.8 (5.7) 57.6 (8.0) 0.35
Altered 14 92.9 (6.6) 78.6 (10.5) 78.6 (16.3)
FLT3 (Signaling)
Normal 102 79.4 (4.2) 65.2 (5.3) 58.9 (7.6) 0.16
Altered 5 100 (0.0) 100 (0.0) 100 (0.0)
KRAS (Signaling)
Normal 105 80.8 (4.0) 67.0 (5.1) 60.7 (7.6) 0.54
Altered 2 50.0 (25.0) 50.0 (25.0) 50.0 (25.0)
NF1 (Signaling)
Normal 76 81.1 (4.6) 67.5 (5.9) 59.3 (8.7) 0.92
Altered 31 77.6 (8.2) 64.3 (9.9) 64.3 (13.6)
NRAS (Signaling)
Normal 94 78.8 (4.4) 66.4 (5.4) 59.6 (7.9) 0.45
Altered 13 90.9 (8.3) 68.2 (14.5) 68.2 (19.2)
PTPN11 (Signaling)
Normal 104 80.6 (4.0) 67.7 (5.1) 61.4 (7.5) 0.13
Altered 3 66.7 (22.2) 33.3 (19.2) 33.3 (27.2)
Signaling combined
Normal 57 80.5 (5.2) 68.9 (6.6) 58.8 (9.4) 0.98
Altered 50 79.9 (6.1) 63.8 (7.8) 63.8 (11.6)
IKZF1 (B pathway)
Normal 104 79.6 (4.1) 66.7 (5.2) 60.3 (7.8) 0.77
Altered 3 100 (0.0) 66.7 (22.2) 66.7 (22.2)
IKZF2 (B pathway)
Normal 94 84.1 (4.0) 68.4 (5.4) 63.9 (7.5) 0.043
Altered 13 53.8 (12.9) 53.8 (13.8) 26.9 (23.0)
IKZF3 (B pathway)
Normal 97 78.1 (4.4) 63.4 (5.4) 58.3 (7.4) 0.12
Altered 10 100 (0.0) 100 (0.0) 83.3 (34.0)
PAX5 (B pathway)
Normal 90 77.4 (4.6) 62.0 (5.9) 53.9 (9.2) 0.014
Altered 17 94.1 (5.5) 88.2 (7.6) 88.2 (9.1)
VPREB1 (B pathway)
Normal 97 78.2 (4.4) 65.6 (5.4) 58.7 (8.0) 0.30
Altered 10 100 (0.0) 77.8 (13.9) 77.8 (16.4)
B pathway combined
Normal 61 76.4 (5.8) 57.7 (7.1) 52.7 (9.7) 0.076
Altered 46 84.7 (5.3) 77.4 (6.7) 69.9 (10.6)
RB1
Normal 89 81.1 (4.3) 68.0 (5.4) 61.1 (7.6) 0.62
Altered 18 77.4 (10.6) 60.3 (14.4) 60.3 (26.9)
TP53
Normal 83 83.1 (4.3) 68.4 (5.7) 63.3 (7.8) 0.31
Altered 24 70.8 (9.0) 61.1 (11.0) 45.8 (19.5)
PAG1
Normal 100 81.0 (4.1) 69.8 (5.2) 63.3 (7.5) 0.034
Altered 7 71.4 (15.6) 28.6 (13.9) No Data
Supplementary Table 33: Association between copy number alterations/mutations and cumulative incidence (CIN) of any relapse.
CIN of any relapse in %, with SE in parentheses. P values were calculated by Gray's test. "Signaling" indicates that the gene is involved in RTK or Ras signaling.
Factors n= Year 1 Year 2 Year 5 P value
CDKN2A/B
Normal 70 18.4 (4.8) 34.0 (6.2) 36.2 (6.4) 0.26
Altered 36 17.1 (6.5) 20.5 (7.1) 27.7 (8.2)
Histone cluster (6p22)
Normal 92 19.7 (4.3) 30.6 (5.2) 35.5 (5.6) 0.55
Altered 14 7.1 (7.1) 21.4 (11.4) 21.4 (11.4)
FLT3 (Signaling)
Normal 101 18.7 (4.0) 30.7 (4.9) 34.9 (5.2) 0.16
Altered 5 0 0 0
KRAS (Signaling)
Normal 105 18.2 (3.9) 29.7 (4.8) 33.8 (5.1) 0.47
Altered 1 0 0 0
NF1 (Signaling)
Normal 75 16.4 (4.4) 28.5 (5.5) 34.1 (5.9) 0.81
Altered 31 22.4 (8.3) 31.6 (9.6) 31.6 (9.6)
NRAS (Signaling)
Normal 93 19.2 (4.2) 29.2 (5.0) 33.7 (5.3) 0.64
Altered 13 9.1 (9.1) 31.8 (16.4) 31.8 (16.4)
PTPN11 (Signaling)
Normal 104 18.4 (3.9) 28.9 (4.8) 33.1 (5.1) 0.67
Altered 2 0 50.0 (50.0) 50.0 (50.0)
Signaling combined
Normal 57 17.7 (5.1) 27.2 (6.1) 34.1 (6.7) 0.93
Altered 49 18.4 (6.0) 32.3 (7.6) 32.3 (7.6)
IKZF1 (B pathway)
Normal 103 18.5 (4.0) 29.2 (4.8) 33.5 (5.1) 0.90
Altered 3 0 33.3 (33.3) 33.3 (33.3)
IKZF2 (B pathway)
Normal 93 15.0 (3.9) 28.2 (5.1) 32.7 (5.4) 0.30
Altered 13 38.5 (14.2) 38.5 (14.2) 38.5 (14.2)
IKZF3 (B pathway)
Normal 96 20.0 (4.2) 32.3 (5.1) 35.3 (5.3) 0.15
Altered 10 0 0 16.7 (16.7)
PAX5 (B pathway)
Normal 90 21.4 (4.5) 33.9 (5.4) 39.2 (5.8) 0.0086
Altered 16 0 6.3 (6.3) 6.3 (6.3)
VPREB1 (B pathway)
Normal 96 19.8 (4.2) 30.0 (5.0) 34.6 (5.4) 0.42
Altered 10 0 22.2 (14.8) 22.2 (14.8)
B pathway combined
Normal 61 23.6 (5.8) 38.1 (6.9) 43.1 (7.2) 0.033
Altered 45 11.2 (4.8) 18.6 (6.0) 21.5 (6.5)
RB1
Normal 88 18.0 (4.2) 28.5 (5.1) 33.3 (5.5) 0.81
Altered 18 16.7 (9.1) 33.8 (13.8) 33.8 (13.8)
TP53
Normal 82 15.9 (4.2) 27.8 (5.4) 32.9 (5.8) 0.59
Altered 24 25.0 (9.1) 34.7 (10.4) 34.7 (10.4)
PAG1
Normal 99 17.1 (3.9) 25.9 (4.7) 30.3 (5.1) 0.023
Altered 7 28.6 (18.6) 71.4 (20.0) No Data
Supplementary Table 34: Multivariable analysis of copy number alterations/mutations, clinical features and association with cumulative incidence of any relapse.
Fine & Gray’s modeling of CIN of any relapse identified PAG1 alteration as the only gene alteration independently associated with poor outcome. WBC, white blood cell count; MRD, minimal residual disease. HR, hazard ratio; CI, confidence interval.
Factors HR HR low 95% CI HR high 95% CI P
PAX5 altered vs normal 0.138 0.02 1.06 0.057
PAG1 altered vs normal 3.412 1.03 11.3 0.044
WBC <100 vs ≥100 0.576 0.20 1.62 0.30
MRD positive (≥0.01%) vs negative (<0.01%) 3.863 1.54 9.67 0.0039
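The hazard ratios, confidence limits and P values in this table are linked on the log scale, and that link can be used as a consistency check. The sketch below assumes Wald-type 95% confidence intervals (symmetric around log HR) and a two-sided normal test; the source does not state how the intervals were computed, and the function name is hypothetical. Because the limits are rounded, the recovered P values are approximate, especially for limits printed with one significant digit.

```python
from math import erfc, log, sqrt


def wald_check(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover the standard error of log(HR), the z statistic and a two-sided P value."""
    se = (log(ci_high) - log(ci_low)) / (2 * z_crit)
    z = log(hr) / se
    p = erfc(abs(z) / sqrt(2))  # two-sided tail area of the standard normal
    return se, z, p


# Values from the PAG1 and MRD rows of the table
se_pag1, z_pag1, p_pag1 = wald_check(3.412, 1.03, 11.3)
se_mrd, z_mrd, p_mrd = wald_check(3.863, 1.54, 9.67)
```

Both recovered values agree with the tabulated P values (0.044 and 0.0039) to within rounding.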
SUPPLEMENTARY FIGURES
Supplementary Figure 1: Coverage plots for next-generation sequenced hypodiploid ALL cases.
a-c, Different colors represent different fold coverage, as indicated. SJHYPO052-D and –G underwent whole genome sequencing, while exome sequencing was performed for SJHYPO052-R, explaining the gap seen around SJHYPO052 in panel c.
Supplementary Figure 2: Circos plots of whole genome sequenced hypodiploid ALL.
Circos (Ref94) plots depicting structural genetic variants, including DNA copy number alterations, intra- and inter-chromosomal translocations, and sequence alterations. Loss-of-heterozygosity, orange; amplification, red; deletion, blue. Sequence mutations in RefSeq genes: non-silent single nucleotide variants, brown; insertions/deletions, red. Genes at structural variant breakpoints: genes involved in in-frame fusions, pink; others, blue.
Supplementary Figure 3: Mutation spectrum of next-generation sequenced hypodiploid ALL.
Next-generation sequenced cases are depicted from left to right. Colored bars represent the number of specific lesions identified in each case, as indicated. WGS, whole genome sequencing; WES, whole exome sequencing; SNV, single nucleotide variation; aa, amino acid; CDS, coding DNA sequence; UTR, untranslated region; HQ, high quality; CNV, copy number variation; Amp, amplification; Del, deletion; Mb, megabases.
Supplementary Figure 4: Protein domain and alteration plots for targets of sequence mutations in hypodiploid ALL.
Supplementary Figure 5: Mapping of NF1 deletions. a, Top: Schematic of the NF1 gene with direction indicated by arrows and exons by vertical lines. The intragenic deletions and amplification of NF1 are depicted for each hypodiploid ALL case harboring a copy number alteration in this gene. Heterozygous deletions are shown as solid lines and homozygous deletions as dotted lines. The majority of focal deletions are accompanied by loss of the entire other chromosomal copy, leading to a bi-allelic loss. The intragenic NF1 amplification gives rise to copy-neutral loss-of-heterozygosity in SJHYPO120. b, Electropherogram showing the fusion point at the genomic level for one case. A 3 base pair (bp) insertion of non-consensus bases is present at the junction of the intron 14 to intron 35 fusion. Together with the presence of partially conserved heptamer recombination signal sequences (RSS) immediately internal to the genomic breakpoints, this is suggestive of a RAG-mediated recombination event. c, Transcriptome sequencing (mRNA seq) data from the hypodiploid ALL cell line NALM-16. Read depth (red) and GC content (blue) for all NF1 exons are shown, with no coverage of exons 15-35. d, Schematic of the full length NF1 gene (top), NF1 gene with homozygous deletion from intron 14 to intron 35 (middle), and two putative open reading frames (ORF1 and ORF2) present in the NALM-16 NF1 transcript. Putative ORF1 is translated from the canonical NF1 start site, and has a premature stop codon in exon 36 downstream of the deletion. Putative ORF2 has an alternative start site in exon 37, downstream of the deleted region. The deleted region is depicted by a gray box, and ORFs as white boxes. e, NALM-16 NF1 RNA seq raw data presented in the Bambino viewer (ref. 95). The corresponding paired reads spanning the breakpoint between exon 14 (upper) and exon 36 (lower) are shown in the top and bottom panels, respectively. Reads 1-5 mark entries in the top and bottom panels that are the same read, spanning the splice site.
Supplementary Figure 6: Immunoblot analysis of NF1. a-b, Immunoblot analysis on the cell lines NALM-16 and THP-1 using the antibody sc-68 (a), raised against the N-terminus of NF1, and sc-67 (b) raised against the C-terminus of NF1. The size of full length NF1 is 250kDa, and the predicted sizes of putative NF1 mutant ORFs in NALM-16 are 62kDa (mutant ORF1, N-terminal part of wild-type NF1) and 115kDa (mutant ORF2, C-terminal part of wild-type NF1), respectively. Full length, wild-type NF1 is not present in NALM-16, but detected in the control THP-1. Only a nonspecific band, present also in THP-1, is seen at the size of mutant putative ORF1 (a), and no band is detected for mutant putative ORF2 (b), indicating that the NF1 deletion leads to loss of NF1 protein production.
Supplementary Figure 7: Validation of mutations in NRAS and PTPN11 in non-tumor samples in near haploid ALL.
a and b, Electropherograms of forward and reverse DNA sequences covering the NRAS p.Gly12Ser substitution in SJHYPO020 (a) and PTPN11 p.Gly503Arg in SJHYPO036 (b). The only available non-tumor DNA was obtained from hematopoietic cells from the respective patient, and it is thus not known if the mutations were inherited or acquired in the hematopoietic compartment prior to the development of leukemia. The respective mutated codons are shown in upper case letters. a, SJHYPO020-D (P1) is the sorted tumor population (CD45-dim, CD19 positive (+)); SJHYPO020-G is a remission bone marrow sample from this patient; P2 is the sorted CD45 positive, CD7 positive fraction from a bone marrow sample taken at diagnosis; P3 is the sorted CD45 positive, CD7 negative fraction from the same diagnosis sample. b, SJHYPO036-D and –G represent samples taken at diagnosis and remission, respectively. c, Fluorescence activated cell sorting (FACS) plots showing the gating for the cell sorting of an SJHYPO020 bone marrow sample taken at diagnosis.
Supplementary Figure 8: PAG1 deletions correlate with PAG1 expression levels. a, Heatmap showing SNP microarray data for the area covering PAG1 on chromosome 8q21.12. A focal deletion in PAG1 was first detected in the relapse sample for SJHYPO056, while this deletion was not detected at diagnosis (D vs R*). b, Top: Schematic of the PAG1 gene with direction indicated by arrows and exons by vertical lines. WGL log2 ratio copy number data visualized in the UCSC web browser (http://genome.ucsc.edu/). The pink vertical lines indicate probe intensities; lines below the respective black zero lines correspond to a loss of genetic material. Double-headed arrows indicate the extent of the deletions. Note the lack of focal deletion in SJHYPO056-D but deletion at relapse. c, Relative gene expression levels of the PAG1 transcript in hypodiploid ALL samples either wild-type (WT) for PAG1 or harboring a PAG1 deletion (Del) as indicated, as assessed by three PAG1-specific probe sets from Affymetrix GeneChip HT HG-U133+ PM microarrays.
Supplementary Figure 9: Mutant p53 fails to stimulate p21 in hypodiploid ALL. a, Flow cytometry analysis of p53 (left panel) and p21 (right panel) levels in cells from hypodiploid ALL xenograft SJHYPO120-X, harboring a p.Arg280Ser p53 substitution, and the cell line Reh (TEL-AML1 ALL harboring wild-type p53). Cells were treated with increasing concentrations of etoposide to activate p53, as indicated. p53 levels were already high in the p53 mutant cells, and etoposide treatment did not lead to increased p21 levels in these cells, while stimulation was seen in Reh. b, Histological examination of sternum stained for p53 (left panel) and p21 (right panel) from mice xenografted with primary hypodiploid ALL tumor cells either mutant (upper) or wild-type (lower) for TP53 as indicated. Scale bar corresponds to 50 microns.
Supplementary Figure 10: IKZF1 and IKZF2 deletions in adult ALL. a-b, SNP 6.0 microarray heatmaps showing focal deletions of IKZF1 (a) and IKZF2 (b) in the adult ALL cohort. Blue indicates DNA loss. PH+, Philadelphia chromosome (BCR-ABL1) positive ALL; LH, low hypodiploid; NH, near haploid.
Supplementary Figure 11: Expression of Ikzf1, Ikzf2 and Ikzf3 during murine lymphoid development. a-c, Gene expression levels of the Ikaros family genes Ikzf1 (a), Ikzf2 (b) and Ikzf3 (c) in murine cells flow sorted into Hardy Fractions as assessed by 2-3 probesets per gene from Affymetrix GeneChip MG-430 2.0 microarrays. One-way analysis of variance was performed to test for significant differences between the groups. CLP, common lymphoid precursor; Hardy Fraction A, pre-pro-B; B, proB; C, preB early; D, preB late; E, immature B; F, mature B cells.
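The one-way analysis of variance mentioned in this legend can be illustrated with a minimal sketch. The expression values and the helper name `one_way_anova_f` below are hypothetical and chosen only to show how the F statistic is assembled from between-group and within-group variation; they are not data from the study. The P value lookup from the F distribution is omitted.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)       # number of groups (e.g. Hardy fractions)
    n = len(all_values)   # total number of observations
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical log2 expression values for one probeset in three fractions.
hypothetical_expression = {
    "Fraction A": [7.1, 7.4, 7.2],
    "Fraction B": [8.0, 8.3, 8.1],
    "Fraction C": [9.2, 9.0, 9.5],
}

f_stat, df_b, df_w = one_way_anova_f(list(hypothetical_expression.values()))
print(f"F = {f_stat:.2f} on {df_b} and {df_w} degrees of freedom")
```

A large F statistic indicates that differences between the fraction means are large relative to the scatter within each fraction.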
Supplementary Figure 12: CD19 levels and degree of antigen receptor rearrangements in hypodiploid ALL.
a, Comparison of CD19 expression level on near haploid and low hypodiploid ALL tumor cells. All near haploid (N=47) and low hypodiploid (N=19) cases with available CD19 expression data from flow cytometry studies are compared. There is a significant association between the level of CD19 expression and hypodiploid ALL subgroup. b, Percentage of cases with a rearrangement in the antigen receptor loci at 2p11.2 (IGK@), 7p14.1 (TRG@), 7q34 (TRB@), 14q11.2 (TRA@), 14q32.33 (IGH@) and/or 22q11.22 (IGL@). c, Heatmap showing SNP 6.0 microarray data for the area covering TRG@ (the T cell receptor gamma locus) on chromosome 7p14.1. Light blue of the entire region shown indicates either 7p loss (in near diploid cases) or whole chromosome 7 loss. Focal deletions (light or dark blue) indicate a rearrangement at the TRG@ locus. AgR, antigen receptor rearrangement; LH, low hypodiploid. d, Gene set enrichment analysis demonstrates enrichment for Hardy Panel fraction B (pro-B cell stage) in low hypodiploid ALL compared with near haploid ALL. The gene set HARDYWTB_500UP includes the top 500 probesets upregulated in Hardy Panel B compared with the other Hardy Panel fractions, identified using limma.
Supplementary Figure 13: RB1 alterations in pediatric hypodiploid ALL and adult ALL. a, Protein domain plot of RB1 with alterations identified in pediatric hypodiploid ALL. b-c, SNP 6.0 microarray heatmaps showing focal deletions of RB1 in pediatric hypodiploid ALL (b) and the adult ALL cohort (c). A case with a simultaneous RB1 deletion and sequence mutation is indicated by a Y in b. Blue indicates DNA loss. mNH, masked near haploid; LH, low hypodiploid; NH, near haploid; PH+, Philadelphia chromosome (BCR-ABL1) positive ALL.
Supplementary Figure 14: Tumor suppressor gene pathway alterations in hypodiploid ALL. CDKN2A/CDKN2B are tumor suppressor genes functioning upstream of TP53 and RB1. The Total percentages indicate cases with both CDKN2A/B deletions and either TP53 (left) or RB1 (right) alterations for each hypodiploid subgroup. Genes in boxes have been subjected to targeted resequencing, while genes in ovals have not been sequenced. Genes shaded in gray do not harbor any known alterations.
Supplementary Figure 15: Deletions and sequence mutations in genes encoding histones and histone modifiers. a, SNP 6.0 microarray heatmap showing focal deletions in the histone cluster at chromosome 6p22 in hypodiploid ALL. The minimal region of deletion involves the genes HIST1H2BE, HIST1H4D, HIST1H3D, HIST1H2AD and HIST1H2BF. Blue indicates DNA loss. Masked hypodiploid cases here refer to cases with either a pure doubled hypodiploid clone or a doubled clone constituting at least 30%. mNH, masked near haploid; LH, low hypodiploid; ND, near diploid. b, Protein domain and alteration plot of CREBBP.
Supplementary Figure 16: GEP restricted to probes on chromosomes showing identical patterns of aneuploidy.
a-b, Unsupervised principal component analysis (PCA) of gene expression data from all hypodiploid ALL cases with available high quality RNA (N=94). Near haploid/masked near haploid, low hypodiploid/masked low hypodiploid and near diploid cases form three distinct clusters by PCA also when restricting the analysis to commonly aneuploid chromosomes (a) or only chromosome 21 (b), which always retains both the maternal and paternal chromosomal copies.
Supplementary Figure 17: Flow cytometric analysis of signaling pathways in hypodiploid ALL. Spleen cells from mice transplanted with primary human hypodiploid ALL samples or the hypodiploid ALL cell line NALM-16 (non-transplanted) were analyzed for the presence of the indicated proteins. Healthy donor denotes control cells from a peripheral blood sample of an individual without cancer.
Supplementary Figure 18: Ikzf2 and Ikzf3 knockdown efficiency assessed by immunoblot analysis. a-b, Immunoblot analysis on the murine cell lines Ba/F3 (a) and Arf-/- pre-B (b) using the antibody sc-9866 detecting Helios (a), and sc-101982 detecting Aiolos (b). Luc-1309 indicates a control shRNA specific for Firefly luciferase mRNA. The cells expressing shRNAs Ikzf2-4422, Ikzf2-8315, Ikzf3-449 and Ikzf3-1586 were used for downstream analyses. Knockdown did not influence cell viability, cell cycle distribution or proliferation (data not shown).
Supplementary Figure 19: Flow cytometric analysis of signaling pathways in hematopoietic cell lines.
Flow cytometric analyses detecting levels of pERK (a) and pS6 (b) in murine cell lines after knockdown of Ikzf2 (in Ba/F3, left panel) and Ikzf3 (in Arf-/- pre-B cells, right panel) and stimulation with PMA (50nM, 15 minutes). Two independent shRNAs per gene were employed. Luc-1309 indicates a control shRNA specific for Firefly luciferase mRNA.
Supplementary Figure 20: The importance of optimal normalization of SNP microarray data.
a-c, SNP 6.0 microarray data for 10 hypodiploid ALL diagnosis (D) and matched remission (G) samples are presented from left to right in each of the three panels. Chromosomes 1-22, X and Y are shown from top to bottom. a, Normalization performed using quantile normalization in dChip, a median centering approach that borrows information across arrays; the hypodiploid genomes are erroneously normalized. b, The same data set after reference normalization (ref. 96), in which only chromosomes known or predicted to be diploid are used as reference chromosomes to guide normalization of the entire array, and in which each array is normalized independently of other samples. HYPO053, HYPO055 and HYPO084 contain substantial proportions of a doubled clone and are normalized as masked low hypodiploid cases. a-b, Red indicates gain of genetic material, and blue indicates loss. c, Loss-of-heterozygosity (indicated in dark blue) visualization for the same samples in dChip. Each tumor sample was directly compared to its matched remission sample.
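The contrast between panels a and b can be illustrated with a toy sketch. It is a deliberate simplification and not the dChip or reference normalization implementation: each chromosome is reduced to a single log2 ratio, the array-wide offset and the retained chromosomes other than chromosome 21 are hypothetical, and the function names are invented for this example.

```python
import math
from statistics import median


def log2_ratio(copies, normal_copies=2):
    """Expected log2 copy number ratio relative to a diploid reference."""
    return math.log2(copies / normal_copies)


# Toy near haploid genome: one copy of most autosomes, two copies of a few.
# Chromosome 21 retains both copies; 14 and 18 are hypothetical choices.
copies = {str(c): 1 for c in range(1, 23)}
for chrom in ("14", "18", "21"):
    copies[chrom] = 2

ARRAY_OFFSET = 0.3  # hypothetical array-wide intensity offset
raw = {c: log2_ratio(n) + ARRAY_OFFSET for c, n in copies.items()}


def median_center(signal):
    """Center on the genome-wide median (assumes most of the genome is diploid)."""
    m = median(signal.values())
    return {c: v - m for c, v in signal.items()}


def reference_normalize(signal, reference_chroms):
    """Center on the median of chromosomes known or predicted to be diploid."""
    m = median(signal[c] for c in reference_chroms)
    return {c: v - m for c, v in signal.items()}


centered = median_center(raw)
referenced = reference_normalize(raw, ["21"])

for chrom in ("1", "21"):
    print(
        f"chr{chrom}: true={log2_ratio(copies[chrom]):+.1f} "
        f"median-centered={centered[chrom]:+.1f} "
        f"reference-normalized={referenced[chrom]:+.1f}"
    )
```

Because most chromosomes are lost in this toy genome, median centering places the monosomic chromosomes at zero and makes the retained disomic chromosomes appear gained, whereas centering on a known diploid chromosome preserves the true pattern of loss.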
Supplementary Figure 21: Immunohistochemistry and FACS analyses of tissue from mice xenografted with human primary hypodiploid ALL cells.
a-f, Selected tissues stained for human CD45 (left) and with hematoxylin and eosin (HE) stain (right) in each panel. Scale bar corresponds to 50 microns. Spleen, meninges and sternal marrow from SJHYPO072-X2 (a-c) and SJHYPO120-X1 (d-f) are shown. g-h, FACS analysis of bone marrow from SJHYPO072-X2 (g) and -X3 (h). Cells are stained with DAPI, for mouse CD45 (PE-Cy7 conjugated antibody), and human CD45 (FITC conjugated), CD19 (APC) and CD3 (PE).
Supplementary Figure 22: Copy number analysis of primary hypodiploid ALL samples versus xenografted leukemic samples.
a-c, Copy number analysis comparing related hypodiploid ALL diagnosis (D) and xenograft (X1-3) samples. The focal deletions that were present at diagnosis were retained in the xenograft clones. a, The masked near haploid case SJHYPO072-D harbored three homozygous focal deletions (in the Histone cluster at 6p22.1, PAG1 and IKZF3). The other corresponding chromosomal copy of chromosomes 6, 8 and 17 was lost prior to reduplication of the near haploid genome, giving rise to copy neutral loss-of-heterozygosity of these chromosomes, with focal homozygous deletions of the respective gene. b, SJHYPO039-D is a near haploid case with the only focal deletion present in ETV6. c, The near haploid case SJHYPO123-D harbored a focal deletion only in NF1, with gain of 4 lesions in the xenograft. Blue indicates loss of genetic material.
SUPPLEMENTARY REFERENCES
1. Li, F.P. et al. A cancer family syndrome in twenty-four kindreds. Cancer research 48, 5358-62 (1988).
2. Tinat, J. et al. 2009 version of the Chompret criteria for Li Fraumeni syndrome. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 27, e108-9; author reply e110 (2009).
3. Chompret, A. et al. P53 germline mutations in childhood cancers and cancer risk for carrier individuals. Br J Cancer 82, 1932-7 (2000).
4. Chompret, A. The Li-Fraumeni syndrome. Biochimie 84, 75-82 (2002).
5. Birch, J.M. et al. Prevalence and diversity of constitutional mutations in the p53 gene among 21 Li-Fraumeni families. Cancer research 54, 1298-304 (1994).
6. Eeles, R.A. Germline mutations in the TP53 gene. Cancer Surv 25, 101-24 (1995).
7. Chompret, A. et al. Sensitivity and predictive value of criteria for p53 germline mutation screening. J Med Genet 38, 43-7 (2001).
8. Nichols, K.E., Malkin, D., Garber, J.E., Fraumeni, J.F., Jr. & Li, F.P. Germ-line p53 mutations predispose to a wide spectrum of early-onset cancers. Cancer Epidemiol Biomarkers Prev 10, 83-7 (2001).
9. Olivier, M. et al. Li-Fraumeni and related syndromes: correlation between tumor type, family structure, and TP53 genotype. Cancer research 63, 6643-50 (2003).
10. Varley, J.M. Germline TP53 mutations and Li-Fraumeni syndrome. Human mutation 21, 313-20 (2003).
11. Wong, P. et al. Prevalence of early onset colorectal cancer in 397 patients with classic Li-Fraumeni syndrome. Gastroenterology 130, 73-9 (2006).
12. Gonzalez, K.D. et al. Beyond Li Fraumeni Syndrome: clinical characteristics of families with p53 germline mutations. Journal of clinical oncology : official journal of the American Society of Clinical Oncology 27, 1250-6 (2009).
13. Villani, A. et al. Biochemical and imaging surveillance in germline TP53 mutation carriers with Li-Fraumeni syndrome: a prospective observational study. Lancet Oncol 12, 559-67 (2011).
14. Tong, W., Zhang, J. & Lodish, H.F. Lnk inhibits erythropoiesis and Epo-dependent JAK2 activation and downstream signaling pathways. Blood 105, 4604-12 (2005).
15. Roberts, K.G. et al. Genetic alterations activating kinase and cytokine receptor signaling in high-risk acute lymphoblastic leukemia. Cancer Cell 22, 153-66 (2012).
16. Campalans, A. et al. XRCC1 interactions with multiple DNA glycosylases: a model for its recruitment to base excision repair. DNA Repair (Amst) 4, 826-35 (2005).
17. Cano, C.E. et al. Tumor protein 53-induced nuclear protein 1 is a major mediator of p53 antioxidant function. Cancer Res 69, 219-26 (2009).
18. Soulier, J. Fanconi anemia. Hematology Am Soc Hematol Educ Program 2011, 492-7 (2011).
19. Ellis, M.J. et al. Whole-genome analysis informs breast cancer response to aromatase inhibition. Nature 486, 353-60 (2012).
20. Fujimoto, A. et al. Whole-genome sequencing of liver cancers identifies etiological influences on mutation patterns and recurrent mutations in chromatin regulators. Nat Genet 44, 760-4 (2012).
21. Zimmermann, S. & Peters, S. Going beyond EGFR. Ann Oncol 23 Suppl 10, x197-x203 (2012).
22. Mullighan, C.G. et al. Genome-wide analysis of genetic alterations in acute lymphoblastic leukaemia. Nature 446, 758-64 (2007).
23. Zhang, J. et al. Key pathways are frequently mutated in high-risk childhood acute lymphoblastic leukemia: a report from the Children's Oncology Group. Blood 118, 3080-7 (2011).
24. Mullighan, C.G. et al. JAK mutations in high-risk childhood acute lymphoblastic leukemia. Proc Natl Acad Sci U S A 106, 9414-8 (2009).
25. Harvey, R.C. et al. Rearrangement of CRLF2 is associated with mutation of JAK kinases, alteration of IKZF1, Hispanic/Latino ethnicity, and a poor outcome in pediatric B-progenitor acute lymphoblastic leukemia. Blood 115, 5312-21 (2010).
26. Mullighan, C.G. et al. Rearrangement of CRLF2 in B-progenitor- and Down syndrome-associated acute lymphoblastic leukemia. Nat Genet 41, 1243-6 (2009).
27. Zhang, A. et al. Identification of a novel family of ankyrin repeats containing cofactors for p160 nuclear receptor coactivators. J Biol Chem 279, 33799-805 (2004).
28. Neilsen, P.M. et al. Identification of ANKRD11 as a p53 coactivator. J Cell Sci 121, 3541-52 (2008).
29. Mullighan, C.G. et al. Deletion of IKZF1 and prognosis in acute lymphoblastic leukemia. N Engl J Med 360, 470-80 (2009).
30. Nagl, N.G., Jr. et al. The p270 (ARID1A/SMARCF1) subunit of mammalian SWI/SNF-related complexes is essential for normal cell cycle arrest. Cancer Res 65, 9236-44 (2005).
31. Nagl, N.G., Jr., Zweitzig, D.R., Thimmapaya, B., Beck, G.R., Jr. & Moran, E. The c-myc gene is a direct target of mammalian SWI/SNF-related complexes during differentiation-associated cell cycle arrest. Cancer Res 66, 1289-93 (2006).
32. Mullighan, C.G. et al. Genomic analysis of the clonal origins of relapsed acute lymphoblastic leukemia. Science 322, 1377-80 (2008).
33. Rakhilin, S.V. et al. A network of control mediated by regulator of calcium/calmodulin-dependent signaling. Science 306, 698-701 (2004).
34. Ehrlich, E.S. et al. Regulation of Hsp90 client proteins by a Cullin5-RING E3 ubiquitin ligase. Proc Natl Acad Sci U S A 106, 20330-5 (2009).
35. Kizil, C., Otto, G.W., Geisler, R., Nusslein-Volhard, C. & Antos, C.L. Simplet controls cell proliferation and gene transcription during zebrafish caudal fin regeneration. Dev Biol 325, 329-40 (2009).
36. Thermes, V. et al. Medaka simplet (FAM53B) belongs to a family of novel vertebrate genes controlling cell proliferation. Development 133, 1881-90 (2006).
37. Nikitin, E.A. et al. Expression level of lipoprotein lipase and dystrophin genes predict survival in B-cell chronic lymphocytic leukemia. Leuk Lymphoma 48, 912-22 (2007).
38. Nakanishi, H., Nakamura, T., Canaani, E. & Croce, C.M. ALL1 fusion proteins induce deregulation of EphA7 and ERK phosphorylation in human acute leukemias. Proc Natl Acad Sci U S A 104, 14442-7 (2007).
39. Nishida, K. et al. Gab-family adapter proteins act downstream of cytokine and growth factor receptors and T- and B-cell antigen receptors. Blood 93, 1809-16 (1999).
40. Xu, D. et al. A germline gain-of-function mutation in Ptpn11 (Shp-2) phosphatase induces myeloproliferative disease by aberrant activation of hematopoietic stem cells. Blood 116, 3611-21 (2010).
41. Geck, P., Sonnenschein, C. & Soto, A.M. The D13S171 marker, misannotated to BRCA2, links the AS3 gene to various cancers. Am J Hum Genet 69, 461-3 (2001).
42. Harada, H. et al. Polymorphism and allelic loss at the AS3 locus on 13q12-13 in esophageal squamous cell carcinoma. Int J Oncol 18, 1003-7 (2001).
43. Murthy, S., Agoulnik, I.U. & Weigel, N.L. Androgen receptor signaling and vitamin D receptor action in prostate cancer cells. Prostate 64, 362-72 (2005).
44. Reis, E.M. et al. Large-scale transcriptome analyses reveal new genetic marker candidates of head, neck, and thyroid cancer. Cancer Res 65, 1693-9 (2005).
45. Seo, M.J. et al. New approaches to pathogenic gene function discovery with human squamous cell cervical carcinoma by gene ontology. Gynecol Oncol 96, 621-9 (2005).
46. Zhang, Y. et al. Correlation of genomic and expression alterations of AS3 with esophageal squamous cell carcinoma. J Genet Genomics 35, 267-71 (2008).
47. Maekawa, M. et al. A novel mammalian Ras GTPase-activating protein which has phospholipid-binding and Btk homology regions. Mol Cell Biol 14, 6879-85 (1994).
48. Maekawa, M., Nakamura, S. & Hattori, S. Purification of a novel ras GTPase-activating protein from rat brain. J Biol Chem 268, 22948-52 (1993).
49. Lockyer, P.J. et al. Distinct subcellular localisations of the putative inositol 1,3,4,5-tetrakisphosphate receptors GAP1IP4BP and GAP1m result from the GAP1IP4BP PH domain directing plasma membrane targeting. Curr Biol 7, 1007-10 (1997).
50. Eppert, K. et al. MADR2 maps to 18q21 and encodes a TGFbeta-regulated MAD-related protein that is functionally mutated in colorectal carcinoma. Cell 86, 543-52 (1996).
51. Riggins, G.J. et al. Mad-related genes in the human. Nat Genet 13, 347-9 (1996).
52. Uchida, K. et al. Somatic in vivo alterations of the JV18-1 gene at 18q21 in human lung cancers. Cancer Res 56, 5583-5 (1996).
53. Tachibana, M. et al. Histone methyltransferases G9a and GLP form heteromeric complexes and are both crucial for methylation of euchromatin at H3-K9. Genes Dev 19, 815-26 (2005).
54. Chen, X., El Gazzar, M., Yoza, B.K. & McCall, C.E. The NF-kappaB factor RelB and histone H3 lysine methyltransferase G9a directly interact to generate epigenetic silencing in endotoxin tolerance. J Biol Chem 284, 27857-65 (2009).
55. Keller, A.D. & Maniatis, T. Identification and characterization of a novel repressor of beta-interferon gene expression. Genes Dev 5, 868-79 (1991).
56. Gyory, I., Wu, J., Fejer, G., Seto, E. & Wright, K.L. PRDI-BF1 recruits the histone H3 methyltransferase G9a in transcriptional silencing. Nat Immunol 5, 299-308 (2004).
57. Kouzarides, T. Chromatin modifications and their function. Cell 128, 693-705 (2007).
58. FitzGerald, K.T. & Diaz, M.O. MLL2: A new mammalian member of the trx/MLL family of genes. Genomics 59, 187-92 (1999).
59. Martinez-Garcia, E. et al. The MMSET histone methyl transferase switches global histone methylation and alters gene expression in t(4;14) multiple myeloma cells. Blood 117, 211-20 (2011).
60. Yun, M., Wu, J., Workman, J.L. & Li, B. Readers of histone modifications. Cell Res 21, 564-78 (2011).
61. An, J.Y. et al. UBR2 mediates transcriptional silencing during spermatogenesis via histone ubiquitination. Proc Natl Acad Sci U S A 107, 1912-7 (2010).
62. Tasaki, T. et al. A family of mammalian E3 ubiquitin ligases that contain the UBR box motif and recognize N-degrons. Mol Cell Biol 25, 7120-36 (2005).
63. van der Knaap, J.A. et al. GMP synthetase stimulates histone H2B deubiquitylation by the epigenetic silencer USP7. Mol Cell 17, 695-707 (2005).
64. Song, M.S. et al. The deubiquitinylation and localization of PTEN are regulated by a HAUSP-PML network. Nature 455, 813-7 (2008).
65. Hussain, S., Zhang, Y. & Galardy, P.J. DUBs and cancer: the role of deubiquitinating enzymes as oncogenes, non-oncogenes and tumor suppressors. Cell cycle 8, 1688-97 (2009).
66. Shi, Y. et al. Histone demethylation mediated by the nuclear amine oxidase homolog LSD1. Cell 119, 941-53 (2004).
67. Arrowsmith, C.H., Bountra, C., Fish, P.V., Lee, K. & Schapira, M. Epigenetic protein families: a new frontier for drug discovery. Nat Rev Drug Discov 11, 384-400 (2012).
68. Lynch, J.T., Harris, W.J. & Somervaille, T.C. LSD1 inhibition: a therapeutic strategy in cancer? Expert Opin Ther Targets (2012).
69. Zhao, Y. et al. A TFTC/STAGA module mediates histone H2A and H2B deubiquitination, coactivates nuclear receptors, and counteracts heterochromatin silencing. Mol Cell 29, 92-101 (2008).
70. Liu, Y.L. et al. USP22 acts as an oncogene by the activation of BMI-1-mediated INK4a/ARF pathway and Akt pathway. Cell Biochem Biophys 62, 229-35 (2012).
71. Fullgrabe, J., Kavanagh, E. & Joseph, B. Histone onco-modifications. Oncogene 30, 3391-403 (2011).
72. Cea, M. et al. New Insights into the Treatment of Multiple Myeloma with Histone Deacetylase Inhibitors. Curr Pharm Des (2012).
73. Diyabalanage, H.V., Granda, M.L. & Hooker, J.M. Combination Therapy: Histone Deacetylase Inhibitors and Platinum-based Chemotherapeutics for Cancer. Cancer letters (2012).
74. Johnstone, R.W. Histone-deacetylase inhibitors: novel drugs for the treatment of cancer. Nat Rev Drug Discov 1, 287-99 (2002).
75. Bonasio, R., Lecona, E. & Reinberg, D. MBT domain proteins in development and disease. Semin Cell Dev Biol 21, 221-30 (2010).
76. Chetcuti, A., Adams, L.J., Mitchell, P.B. & Schofield, P.R. Altered gene expression in mice treated with the mood stabilizer sodium valproate. Int J Neuropsychopharmacol 9, 267-76 (2006).
77. Aravind, L. & Iyer, L.M. The HARE-HTH and associated domains: novel modules in the coordination of epigenetic DNA and protein modifications. Cell cycle 11, 119-31 (2012).
78. Fossati, A., Dolfini, D., Donati, G. & Mantovani, R. NF-Y recruits Ash2L to impart H3K4 trimethylation on CCAAT promoters. PLoS One 6, e17220 (2011).
79. Jelinic, P., Pellegrino, J. & David, G. A novel mammalian complex containing Sin3B mitigates histone acetylation and RNA polymerase II progression within transcribed loci. Mol Cell Biol 31, 54-62 (2011).
80. Schuettengruber, B., Martinez, A.M., Iovino, N. & Cavalli, G. Trithorax group proteins: switching genes on and keeping them active. Nat Rev Mol Cell Biol 12, 799-814 (2011).
81. Bouazoune, K. & Brehm, A. ATP-dependent chromatin remodeling complexes in Drosophila. Chromosome Res 14, 433-49 (2006).
82. Unhavaithaya, Y. et al. MEP-1 and a homolog of the NURD complex component Mi-2 act together to maintain germline-soma distinctions in C. elegans. Cell 111, 991-1002 (2002).
83. Alabert, C. & Groth, A. Chromatin replication and epigenome maintenance. Nat Rev Mol Cell Biol 13, 153-67 (2012).
84. Wu, H. & Zhang, Y. Mechanisms and functions of Tet protein-mediated 5-methylcytosine oxidation. Genes Dev 25, 2436-52 (2011).
85. Laget, S. et al. The human proteins MBD5 and MBD6 associate with heterochromatin but they do not bind methylated DNA. PLoS One 5, e11982 (2010).
86. Nie, Z. et al. A specificity and targeting subunit of a human SWI/SNF family-related chromatin-remodeling complex. Mol Cell Biol 20, 8879-88 (2000).
87. Lans, H., Marteijn, J.A. & Vermeulen, W. ATP-dependent chromatin remodeling in the DNA-damage response. Epigenetics Chromatin 5, 4 (2012).
88. Kohno, S., Minowada, J. & Sandberg, A.A. Chromosome evolution of near-haploid clones in an established human acute lymphoblastic leukemia cell line (NALM-16). J Natl Cancer Inst 64, 485-93 (1980).
89. Tsuchiya, S. et al. Establishment and characterization of a human acute monocytic leukemia cell line (THP-1). Int J Cancer 26, 171-6 (1980).
90. Robinson, G. et al. Novel mutations target distinct subgroups of medulloblastoma. Nature 488, 43-8 (2012).
91. Goodman, R.H. & Smolik, S. CBP/p300 in cell growth, transformation, and development. Genes Dev 14, 1553-77 (2000).
92. Margueron, R. & Reinberg, D. The Polycomb complex PRC2 and its mark in life. Nature 469, 343-9 (2011).
93. Downing, J.R. et al. The Pediatric Cancer Genome Project. Nat Genet 44, 619-22 (2012).
94. Krzywinski, M. et al. Circos: an information aesthetic for comparative genomics. Genome research 19, 1639-45 (2009).
95. Edmonson, M.N. et al. Bambino: a variant detector and alignment viewer for next-generation sequencing data in the SAM/BAM format. Bioinformatics 27, 865-6 (2011).
96. Pounds, S. et al. Reference alignment of SNP microarray signals for copy number analysis of tumors. Bioinformatics 25, 315-21 (2009).