Long Neural Genes Harbor Recurrent DNA Break Clusters in Neural Stem/Progenitor Cells

Article

Highlights

• 27 Recurrent DSB clusters (RDCs) are identified in neural stem/progenitor cells
• All RDCs are within genes, most of which are long, transcribed, and late replicating
• Most RDC genes are involved in synapse function and/or neural cell adhesion
• A nucleotide-resolution view of replication stress-associated fragile sites is provided

Authors

Pei-Chi Wei, Amelia N. Chang, Jennifer Kao, Zhou Du, Robin M. Meyers, Frederick W. Alt, Bjoern Schwer

Correspondence

[email protected] (F.W.A.), [email protected] (B.S.)

In Brief

Neural stem and progenitor cells undergo massive genomic alterations in a very restricted set of genes involved in synapse function and neural cell adhesion, processes that are likely to govern the special behavior of brain cells. Many of these genes have also been implicated in mental disorders.

Accession Numbers

GSE74356

Wei et al., 2016, Cell 164, 644–655, February 11, 2016 © 2016 Elsevier Inc. http://dx.doi.org/10.1016/j.cell.2015.12.039


Pei-Chi Wei,1,2,3,4 Amelia N. Chang,1,2,3,4 Jennifer Kao,1,2,3 Zhou Du,1,2,3 Robin M. Meyers,1,2,3 Frederick W. Alt,1,2,3,* and Bjoern Schwer1,2,3,*

1 Program in Cellular and Molecular Medicine, Boston Children’s Hospital, Howard Hughes Medical Institute, Boston, MA 02115, USA
2 Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
3 Department of Pediatrics, Harvard Medical School, Boston, MA 02115, USA
4 Co-first author
* Correspondence: [email protected] (F.W.A.), [email protected] (B.S.)

SUMMARY

Repair of DNA double-strand breaks (DSBs) by non-homologous end joining is critical for neural development, and brain cells frequently contain somatic genomic variations that might involve DSB intermediates. We now use an unbiased, high-throughput approach to identify genomic regions harboring recurrent DSBs in primary neural stem/progenitor cells (NSPCs). We identify 27 recurrent DSB clusters (RDCs), and remarkably, all occur within gene bodies. Most of these NSPC RDCs were detected only upon mild, aphidicolin-induced replication stress, providing a nucleotide-resolution view of replication-associated genomic fragile sites. The vast majority of RDCs occur in long, transcribed, and late-replicating genes. Moreover, almost 90% of identified RDC-containing genes are involved in synapse function and/or neural cell adhesion, with a substantial fraction also implicated in tumor suppression and/or mental disorders. Our characterization of NSPC RDCs reveals a basis of gene fragility and suggests potential impacts of DNA breaks on neurodevelopment and neural functions.

INTRODUCTION

Evolutionarily conserved DNA double-strand break (DSB) repair pathways are required for maintenance of genome stability in mammalian cells (Lieber, 2010). Classical non-homologous end joining (C-NHEJ) is a critical somatic cell DSB repair pathway that is not dependent on sequence homology and that functions throughout the cell cycle (Alt et al., 2013). Evolutionarily conserved core C-NHEJ proteins include XRCC4 and DNA Ligase 4 (Lig4), which form an end ligation complex (Alt et al., 2013; Boboila et al., 2012). C-NHEJ to a degree relies on DSB detection by the Ataxia telangiectasia-mutated (ATM) DNA damage response protein (Alt et al., 2013). Deficiency for C-NHEJ factors, or ATM and its downstream factors, leads to persistence of DSBs and their more frequent joining to other DSBs to generate chromosomal rearrangements, including translocations, deletions, inversions, and amplifications (Alt et al., 2013; Gapud and Sleckman, 2011). In the absence of C-NHEJ, such chromosomal rearrangements employ an alternative end-joining (A-EJ) pathway (Boboila et al., 2012).

C-NHEJ DSB repair is required for both immune and nervous system development (Gao et al., 1998). Inactivation of Xrcc4 or Lig4 in the mouse germline blocks lymphocyte development, owing to the requirement for C-NHEJ to join antigen receptor variable region gene segments during V(D)J recombination (Alt et al., 2013). Xrcc4 or Lig4 inactivation also severely impairs neural development, leading to widespread apoptotic death of early post-mitotic neurons and associated late embryonic lethality (Barnes et al., 1998; Gao et al., 1998; Frank et al., 2000). Neuronal loss and embryonic lethality in C-NHEJ-deficient mice are rescued by p53 deficiency, indicating that both result from a p53-dependent checkpoint response to unrepaired DSBs (Frank et al., 2000; Gao et al., 2000). However, V(D)J recombination and, correspondingly, B cell development are not rescued in C-NHEJ/p53 double-deficient mice, which routinely develop lethal progenitor B cell lymphomas with clonal translocations and amplifications involving fusion of V(D)J recombination-associated DSBs in the immunoglobulin heavy-chain (IgH) and c-myc oncogene loci via A-EJ (Difilippantonio et al., 2002; Hu et al., 2015; Zhu et al., 2002). Notably, C-NHEJ/p53 double-deficient mice also develop medulloblastomas (MBs) in situ (Lee and McKinnon, 2002; Zhu et al., 2002). Moreover, neural stem/progenitor cell (NSPC)-specific inactivation of Xrcc4 in p53-deficient mice leads to MBs that harbor recurrent clonal translocations, amplifications, and deletions (Yan et al., 2006).

Brain cells frequently contain somatic genomic variations, including deletions, and rearrangements, which in some cases are linked to retrotransposition (Erwin et al., 2014; McConnell et al., 2013; Poduri et al., 2013). In this regard, single-cell sequencing of human frontal cortex neurons revealed that up to 41% had at least one megabase-scale de novo copy number variation (CNV), most of which were deletions (McConnell et al., 2013). Due to technical limitations of such analyses, the actual frequency of these CNVs might be even higher (Erwin et al., 2014). Such somatic changes have been speculated to generate neuronal diversity and result in greater variance of cellular and organismal phenotypes (Erwin et al., 2014; Muotri and Gage, 2006). In theory, genomic aberrations in NSPCs might be transmitted to daughter cells and, thereby, contribute to genomic mosaicism in individual neurons or glial cells, where they could influence aspects of normal or abnormal brain function (Poduri et al., 2013). A better understanding of the potential impacts of such genomic alterations in neural cells awaits elucidation of the underlying mechanisms (Erwin et al., 2014; Poduri et al., 2013).

[Figure 1 panels: (A) N-myc locus on Chr12 (exons E1–E3; Cen to Tel) with the Chr12-sgRNA-1 target site; (B) Circos plot of chromosomes 1–19 with Npas3 and Lsamp marked; log scale 5, 50, 500, 5,000.]

Figure 1. Elucidation of DSBs in Xrcc4−/−p53−/− NSPCs

(A) Illustration shows N-myc locus and sgRNA target site (vertical black arrowhead) and location and orientation of HTGTS primer (green arrowhead). Cen, centromere; Tel, telomere; E, exon.

(B) Circos plot of the mouse genome divided into individual chromosomes showing the genome-wide HTGTS junction pattern of Chr12-sgRNA-1-mediated bait DSBs in Xrcc4−/−p53−/− NSPCs binned into 2.5-Mb regions (black bars). Bar height indicates the number of translocations per bin on a log scale. 20,000 junctions from four independent experiments are plotted. Red line indicates recurrent translocations between Chr12 bait DSBs (red arrowhead) and an RDC within Lsamp on Chr16; an RDC within Npas3 on Chr12 is denoted by the green line. Blue star denotes translocations to sgRNA OT site.

See also Figure S1.

We have developed an unbiased high-throughput, genome-wide, translocation sequencing (HTGTS) approach to map, at nucleotide resolution, genome-wide DSBs based on their ability to translocate to endogenous or ectopic bait DSBs at a specific chromosomal location (Chiarle et al., 2011; Dong et al., 2015; Frock et al., 2015; Hu et al., 2015). HTGTS and a related approach revealed that off-target (OT) activities of lymphocyte-specific antigen receptor gene diversification enzymes generate recurrent DSBs or DSB clusters across the genome of B lineage cells (Chiarle et al., 2011; Hu et al., 2015; Klein et al., 2011; Meng et al., 2014; Zhang et al., 2012). For both mouse and human cells, recurrent DSBs or classes of DSBs are evident in genome-wide translocation landscapes, regardless of chromosomal location. The ability of such clusters of DSBs across the genome to be revealed by HTGTS results from cellular heterogeneity in 3D genome organization (Alt et al., 2013; Frock et al., 2015; Zhang et al., 2012), a phenomenon that allows recurrent DSBs to be reliably identified by HTGTS baits on a different chromosome (Frock et al., 2015). In the absence of recurrent DSBs, proximity causes DSBs in cis along a given chromosome to preferentially join (Dong et al., 2015; Frock et al., 2015; Zhang et al., 2012). Within a cis chromosome, translocation frequency is further enhanced between sequences within topological domains or loops due to increased interaction or other processes (Alt et al., 2013; Hu et al., 2015; Zhang et al., 2012). Together, these properties of chromosomal translocations allow the use of HTGTS as a remarkably sensitive DSB detection method.

We now apply an enhanced linear amplification-mediated HTGTS approach (Frock et al., 2015) to map DSBs in NSPCs. These studies reveal a large set of recurrently broken genes and suggest potential mechanisms underlying their origin.

RESULTS

High-Throughput Mapping of DSBs and Translocations in NSPCs

For initial studies, we performed HTGTS on NSPCs isolated from mice deficient for XRCC4 and p53 (Xrcc4−/−p53−/− mice), since, based on our prior studies, we expected this background to be a rich source of NSPC DSBs (Gao et al., 1998; Yan et al., 2006). We used a Cas9:single-guide RNA (sgRNA) approach to generate an initial HTGTS bait DSB as we described for other studies (Dong et al., 2015; Frock et al., 2015; Hu et al., 2015). Specifically, we designed an sgRNA (Chr12-sgRNA-1) that targets a Cas9:sgRNA-generated bait DSB to an intergenic region ~52 kb telomeric of N-myc on chromosome (Chr) 12 (Figure 1A, top). The Chr12-sgRNA-1 was introduced into cultured NSPCs, which were then maintained for 4.5 days and harvested for HTGTS. We used a primer that allowed us to identify endogenous prey DSBs genome-wide that joined to centromeric broken ends of a Chr12-sgRNA-1-generated bait DSB (Figure 1A, top). In four separate experiments, we identified 32,144 independent HTGTS junctions. We visualized overall junction patterns along each individual chromosome via modified Circos plots (Frock et al., 2015) of the mouse genome separated into 2.5-Mb bins. These studies revealed that 61.4% (19,734) of HTGTS junctions mapped within 500 kb of the Chr12-sgRNA-1 target site, with the majority of these not representing translocations but rather representing rejoining of a given bait DSB following resection (Figure 1A; Figures S1A and S1B; Table S1; Chiarle et al., 2011; Frock et al., 2015).
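The binning and the break-site proportion described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy junction list, and the bait coordinate are hypothetical and serve only to show how junctions might be assigned to 2.5-Mb bins and how the fraction within 500 kb of a bait site could be computed. The last two lines recompute the proportions quoted in the text from the reported counts.

```python
from collections import Counter

BIN_SIZE = 2_500_000   # 2.5-Mb bins, as used for the Circos plots
NEAR_BAIT = 500_000    # 500-kb window around the bait site

def bin_junctions(junctions, bin_size=BIN_SIZE):
    """Count junctions per (chromosome, bin index)."""
    return Counter((chrom, pos // bin_size) for chrom, pos in junctions)

def fraction_near_bait(junctions, bait_chrom, bait_pos, window=NEAR_BAIT):
    """Fraction of all junctions lying within `window` bp of the bait site."""
    near = sum(1 for chrom, pos in junctions
               if chrom == bait_chrom and abs(pos - bait_pos) <= window)
    return near / len(junctions)

# Toy junctions and bait position (hypothetical coordinates, not data from the paper).
toy = [("chr12", 12_900_000), ("chr12", 13_100_000), ("chr12", 54_000_000),
       ("chr16", 41_000_000), ("chr16", 41_700_000)]
bins = bin_junctions(toy)
near = fraction_near_bait(toy, "chr12", 13_000_000)

# Proportions recomputed from the counts reported in the text.
pct_break_site = round(100 * 19_734 / 32_144, 1)   # junctions within 500 kb of the bait
pct_genome_wide = round(100 * 9_966 / 32_144)      # junctions distributed genome-wide
```

Integer division by the bin size is the simplest way to assign a coordinate to a fixed-width bin; any real analysis would also need chromosome lengths and the filtering steps described in the Supplemental Experimental Procedures.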

After excluding the break site resections, a substantial fraction (~8%) of the remaining Chr12 junctions involved prey DSBs spread over Chr12 (Figure 1A), a phenomenon resulting from joining of bait DSBs to widespread low-level DSBs in cis due to 3D spatial proximity (Alt et al., 2013; Zhang et al., 2012). We estimated that the frequency of prey DSBs participating in such break site chromosome translocations in XRCC4-deficient NSPCs is, at a minimum, about eight per cell (Table S2). Indeed, the actual DSB frequency likely is much higher since most DSBs are rejoined locally and do not translocate (Alt et al., 2013). Beyond the break site junctions, the remainder of the 9,966 (31%) HTGTS junctions were distributed broadly throughout the genome (Table S1; Figure 1B; for convenience, bins with less than five junctions are not illustrated on Circos plots but examples are shown in Figure S1C).

We used the spatial clustering approach for the identification of chromatin immunoprecipitation (ChIP)-enriched regions (SICER) algorithm (Zang et al., 2009) to perform an unbiased assay of the HTGTS library data with the goal of identifying significantly enriched junction clusters across the XRCC4-deficient NSPC genome (see the Supplemental Experimental Procedures). This analysis revealed three recurrent translocation clusters; notably, two of these clusters were located specifically within the limbic system-associated membrane protein (Lsamp) gene on Chr16 and the neuronal PAS domain protein 3 (Npas3) gene on Chr12, while the other represented a Chr12-sgRNA-1 OT site on Chr12 (Figure 1). As the prey DSBs participating in recurrent translocations to Lsamp and Npas3 were spread broadly across these long genes (see below), we refer to them as recurrent DSB clusters (RDCs). Finally, we also found the same three enriched junction clusters by an independent custom model-based analysis of ChIP sequencing (ChIP-seq) (MACS)-based pipeline (see the Supplemental Experimental Procedures).
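To make the idea of a significantly enriched junction cluster concrete, the following Python sketch applies a deliberately simplified window test. It is not SICER or MACS, both of which add steps (such as merging neighboring windows into islands and estimating background) that are omitted here. The sketch assumes a uniform Poisson background across equally sized windows; the function names, the threshold `alpha`, and the toy counts are all hypothetical.

```python
import math

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed as 1 minus the lower tail."""
    lower = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - lower

def enriched_windows(counts, genome_windows, alpha=1e-3):
    """Return (window, count, p) for windows exceeding a uniform expectation.

    counts: dict mapping a window id to its junction count (empty windows omitted)
    genome_windows: total number of windows in the (toy) genome
    """
    total = sum(counts.values())
    lam = total / genome_windows   # expected junctions per window if uniform
    hits = []
    for window, k in sorted(counts.items()):
        p = poisson_sf(k, lam)
        if p < alpha:
            hits.append((window, k, p))
    return hits

# Toy counts over a toy genome of 24 windows (hypothetical, not data from the paper).
toy_counts = {("chr16", 410): 12, ("chr12", 540): 9, ("chr3", 77): 1, ("chr5", 200): 2}
hits = enriched_windows(toy_counts, genome_windows=24)
enriched_ids = [window for window, _, _ in hits]
```

With an expectation of one junction per window, only the two windows holding 9 and 12 junctions pass the threshold, which mirrors the intuition behind calling a small number of clusters against a broad low-level background.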

The Lsamp and Npas3 Genes Are Prone to DSBs and Translocations in NSPCs

To elucidate potential underlying mechanisms, we examined HTGTS junctions between Chr12-sgRNA-1 bait DSBs and prey DSBs across the 2.2-Mb-long Lsamp gene in Xrcc4−/−p53−/− NSPCs. By convention, prey HTGTS junctions are denoted + if the prey is read from the junction in a centromere-to-telomere direction and – if in the opposite direction (Figure 2A, top; Chiarle et al., 2011). Lsamp translocations occurred at similar levels to prey DSBs in both the plus (+) and minus (–) orientations, indicating that Chr12-sgRNA-1 bait DSBs can join to either end of a prey DSB (Figure 2A), similar to what is found for translocation of bait DSBs to prey DSBs genome-wide in B cells (Chiarle et al., 2011). Translocation junctions were distributed broadly across Lsamp, but were most enriched over an ~600-kb internal region (Figure 2A). About 0.5% (51/9,966) of total inter-chromosomal translocations involved Lsamp (Table S1). To independently confirm accumulation of recurrent DSBs in Lsamp in Xrcc4−/−p53−/− NSPCs, we used a Cas9:sgRNA (Chr16-sgRNA-1) to introduce bait DSBs in an intergenic region ~8 Mb upstream of Lsamp (Figure 2B). We found that Chr16-sgRNA-1 bait junctions were again substantially enriched across Lsamp in both + and – orientations, with Lsamp translocations occurring at a level of about 2% (151/7,965) of total inter-chromosomal translocations (Table S1), consistent with anticipated proximity effects (Alt et al., 2013; Zhang et al., 2012). For comparison, when normalized as described above for widespread DSBs, we estimated that 60% of NSPCs have one Lsamp DSB that translocates to a bait DSB (Table S2); again, the number of Lsamp DSBs could be much higher, because we only included in our estimate the small fraction of total DSBs that translocated (see Discussion for details).
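The two Lsamp proportions quoted above, and the size of the proximity effect they imply, can be recomputed from the reported counts. The short Python sketch below does only that arithmetic; the helper name `percent` and the variable names are illustrative.

```python
def percent(part, total):
    """Express `part` as a percentage of `total`."""
    return 100.0 * part / total

# Counts as reported in the text (Table S1).
far_bait = percent(51, 9_966)     # Chr12-sgRNA-1 bait, on a different chromosome from Lsamp
near_bait = percent(151, 7_965)   # Chr16-sgRNA-1 bait, ~8 Mb upstream of Lsamp
fold = near_bait / far_bait       # relative enrichment with the Lsamp-proximal bait

print(f"{far_bait:.2f}% vs {near_bait:.2f}% ({fold:.1f}-fold)")
```

The ratio of roughly 3.7 is a plain quotient of the two reported percentages and is offered only as a reading aid; the paper itself states the values as "about 0.5%" and "about 2%".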

To further assess potential mechanisms of Lsamp translocations, we employed I-SceI-mediated bait DSBs within c-Myc (c-Myc25xI-SceI) on Chr15 (Chiarle et al., 2011) for HTGTS analyses of ATM-deficient (ATM−/−) NSPCs. These studies revealed overall translocation patterns, including the presence of an Lsamp RDC, that were generally similar to those observed for Xrcc4−/−p53−/− NSPCs (Figure 2C; data not shown). Because we had previously generated HTGTS libraries from the same c-Myc25xI-SceI bait DSBs in B cells (Meng et al., 2014), we could directly compare HTGTS translocation junctions along Chr16 in B cells versus those in NSPCs (Table S1). In this regard, HTGTS libraries from primary IgH class switch recombination (CSR)-stimulated B lymphocytes did not reveal any junction enrichment in Lsamp (Figure 2D). On the other hand, activated B cell HTGTS libraries exhibited two HTGTS junction peaks in Chr16 not present in NSPC libraries (Figure 2D; compare with Figure 2C). One B cell peak (purple star) contained junctions spread broadly over the Igl light-chain locus and the other (green star) contained two focal peaks of junctions in Bcl-6 and in a transcribed region near Lpp. Notably, the latter two are known OTs of activation-induced cytidine deaminase (AID), the B cell enzyme that induces DSB formation for IgH CSR (Meng et al., 2014). For comparison, in size-matched HTGTS libraries (normalized to 7,000 inter-chromosomal junctions), activated B cell libraries contained 12 junctions targeted to a site of convergent transcription downstream of the transcription start site (TSS) of Bcl-6 (the strongest Chr16 AID OT gene; Meng et al., 2014), while NSPC libraries contained over 40 junctions spread across the body of Lsamp. By performing global run-on sequencing analyses (GRO-seq; Core et al., 2008), we found active transcription over the entire Lsamp gene in Xrcc4−/−p53−/− and ATM−/− NSPCs (lower panels in Figures 2B and 2C). In contrast, examination of GRO-seq analyses of activated B cells (Meng et al., 2014) revealed that Lsamp is not detectably transcribed (Figure 2D, lower panel).

We also examined the Npas3 RDC in detail in Xrcc4−/−p53−/− NSPCs (Figure 3). Similar to junctions identified in Lsamp, junctions were detected in both orientations across Npas3 when cloned from the Chr12-sgRNA-1 bait DSB site located 40 Mb centromeric of the gene (Figure 3A). These intra-chromosomal junctions to the 823-kb Npas3 gene occurred at a frequency that corresponded to about 1% of all inter-chromosomal HTGTS junctions (Table S1). Junction enrichment in Npas3 again was further enhanced when a different sgRNA (Chr12-sgRNA-2) was used to move the bait DSB approximately 6 Mb telomeric to Npas3 (Figure 3B), with intra-chromosomal translocations to Npas3 DSBs occurring at a level corresponding to almost 3% of inter-chromosomal translocations captured (Table S1). GRO-seq analyses of Xrcc4−/−p53−/− NSPCs indicated active transcription over the entire Npas3 gene (Figure 3C).

HTGTS studies with the c-Myc25xI-SceI bait DSBs revealed the Lsamp RDC in both wild-type (WT) and ATM-deficient NSPCs, while the Chr12-sgRNA-1 revealed the Lsamp RDC in Xrcc4−/−p53−/−, but not WT NSPCs (Table S1). None of the bait DSBs used revealed Npas3 RDCs in WT HTGTS libraries, and only the Chr12-sgRNAs revealed the Npas3 RDC in the Xrcc4−/−p53−/− NSPCs (Figures 1 and 3; data not shown). We suspect that the differential recovery of these two RDCs may be related to the frequency at which the different bait and prey DSBs are induced or persist in the different genotypes (Dong et al., 2015), as both the Lsamp and Npas3 RDCs were readily apparent in HTGTS studies employing bait DSBs on Chr12, 15, and 16, respectively, in Xrcc4−/−p53−/− NSPCs under conditions in which these prey DSBs are further enhanced; and Lsamp and Npas3 also were detected under such conditions by bait DSBs on Chr15 or Chr12 in WT NSPCs (Figures 4, 5, and 6; see below).

[Figure 2 panels A–D: translocation outcome diagrams (deletion, inversion, centric, dicentric); plots of junction number versus Chr16 position (Mb) for Xrcc4−/−p53−/− and ATM−/− NSPCs and for B cells; enlarged views of chr16:37,086,230-45,531,141 with RefGene (Lsamp), HTGTS (+/–), and GRO-seq tracks.]

Figure 2. Identification and Characterization of Lsamp RDC

(A) Translocation cluster between Chr12-sgRNA-1-mediated bait DSBs and prey DSBs on Chr16 in Xrcc4−/−p53−/− NSPCs. (Top) Diagram shows translocation outcomes (see text for details). Green arrowhead denotes HTGTS primer. (Middle) Graph of Chr16 prey junctions (normalized to 7,070 inter-chromosomal junctions from four independent experiments) is shown. Junctions in centromere-to-telomere orientation (+) are in blue and junctions in telomere-to-centromere orientation (–) are in red. Bin size, 1 Mb. (Bottom) Enlarged view of region around Lsamp shows HTGTS junctions (related to panel above as indicated by dashed lines; genomic coordinates are below). Junction enrichment within Lsamp (highlighted in yellow) was significant (p = 3.33 × 10⁻⁷; see HTGTS Junction Enrichment Analysis in the Supplemental Experimental Procedures).

(B) (Top) Illustration shows intra-chromosomal translocations formed between Lsamp-proximal Chr16-sgRNA-1-mediated bait DSBs and prey DSB cluster (highlighted in yellow). (Middle) Prey junctions captured by Lsamp-proximal bait DSBs over a 16-Mb Chr16 region, combined from three independent Xrcc4−/−p53−/− experiments, are shown. Bin size, 100 kb. Details as in (A). (Bottom) Enlarged view of region around Lsamp shows HTGTS junctions (related to panel above with dashed lines with genomic coordinates indicated at the bottom). RefGene and GRO-seq data are shown (ordinate indicates normalized GRO-seq counts; reads are shown in plus [blue] and minus [red] orientations). Junction enrichment within Lsamp was highly significant (p = 1.54 × 10⁻¹³), as described in (A). 5,917 junctions (945 intra-chromosomal translocations on Chr16 more than 10 kb from the bait DSB site and 4,972 inter-chromosomal translocations) are plotted.

(C) (Top) Illustration shows translocation outcomes between c-Myc25xI-SceI cassette (yellow box) bait DSBs and prey DSBs on Chr16 with details as in (A). (Middle) Chr16 prey junctions from four independent experiments in ATM−/−ROSAI-SceI-GRc-Myc25xI-SceI NSPCs, with Lsamp RDC in yellow, are shown. A purple rectangle and star indicate region corresponding to Igl, and a green rectangle and star indicate region corresponding to Bcl-6 and Lpp. (Bottom) Enlarged view of indicated RDC-containing region is shown, as described for (B). RefGene and GRO-seq reads from ATM−/−ROSAI-SceI-GRc-Myc25xI-SceI NSPCs are shown as for (B). 7,070 inter-chromosomal junctions are plotted. Junctions within Lsamp were significantly enriched (p = 5.43 × 10⁻⁶), as described in (A).

(D) HTGTS analysis of activated ATM−/−ROSAI-SceI-GRc-Myc25xI-SceI B cells and GRO-seq analyses of activated B cells (Meng et al., 2014) are displayed as described for (B).

See also Table S1.

[Figure 3 panels A–C: translocation outcome diagrams (deletion, inversion, excision circle) for Chr12-sgRNA-1 and Chr12-sgRNA-2 baits; plots of junction number versus Chr12 position (Mb) in Xrcc4−/−p53−/− NSPCs; enlarged views of chr12:53,347,664-56,175,162 with HTGTS (+/–), GRO-seq, and RefGene (Npas3) tracks.]

Figure 3. Identification of Recurrent DSB Cluster in Npas3

(A) (Top) Illustration shows intra-chromosomal translocation outcomes between Chr12-sgRNA-1-mediated bait DSBs and Chr12 prey DSBs in Xrcc4−/−p53−/− NSPCs. (Bottom) Prey junctions are identified from Chr12-sgRNA-1 bait DSBs over a 40-Mb Chr12 region containing the Npas3 RDC. Data are combined from four independent experiments. Bin size, 500 kb. 13,455 junctions (3,489 junctions located more than 10 kb from either side of the bait DSB and 9,966 inter-chromosomal junctions) are plotted. Junction enrichment within Npas3 was highly significant (p = 2.63 × 10⁻¹⁵; see HTGTS Junction Enrichment Analysis in the Supplemental Experimental Procedures). Other details are as in Figure 2A.

(B) (Top) Illustration shows intra-chromosomal translocation outcomes between Chr12-sgRNA-2 bait DSBs and Chr12 prey DSBs, presented as in (A). (Bottom) Prey junctions are identified from Chr12-sgRNA-2 bait DSBs over a 40-Mb Chr12 region containing the Npas3 RDC. Data combined from three independent experiments are presented as in (A). Bin size, 500 kb. 5,471 total junctions (1,366 Chr12 junctions located more than 10 kb from either side of the bait DSB and 4,105 inter-chromosomal junctions) are plotted. Junction enrichment within Npas3 region was significant (p = 2.03 × 10⁻¹⁴), as described in (A).

(C) GRO-seq and RefGene information (bottom) are shown as described for Figure 2B.

See also Table S1.

Elucidation of Replication Stress-Induced DSBs and Translocations in NSPCs

Given that NSPCs undergo extensive cell division both in vivo and in vitro (McKinnon, 2013), we investigated potential effects of DNA replication stress on DSB generation. Treatment with low doses of aphidicolin (APH), a DNA polymerase inhibitor, induces replication stress and, thus, has been widely used for common fragile site (CFS) analyses (Durkin and Glover, 2007; Glover et al., 1984). To identify genomic regions subject to DNA replication stress-associated DSBs, we treated Xrcc4−/−p53−/− NSPCs with either APH or vehicle control (DMSO) and performed HTGTS with bait DSBs generated, respectively, on either Chr12 (Chr12-sgRNA-1), Chr16 (Chr16-sgRNA-2), or Chr15 (Chr15-Myc-sgRNA). For each of the three bait DSBs, we performed at least three independent HTGTS experiments on control- or APH-treated cells. These experiments all were analyzed separately to confirm reproducibility, and then pooled, normalized to the same number of total junctions, and plotted in modified Circos plots to facilitate comparison of APH-induced RDCs found in the different bait libraries (Figure 4; Figure S2).
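Normalizing pooled libraries to the same number of total junctions can be done by random down-sampling, which is what the Figure 4 legend describes. The Python sketch below shows one way this might look. Taking the smallest library as the common size and fixing a random seed are assumptions made here for reproducibility; the function name and the toy libraries are hypothetical.

```python
import random

def downsample_to_common_size(libraries, seed=0):
    """Randomly down-sample each pooled library to the size of the smallest one.

    libraries: dict mapping a condition name to a list of junctions
    seed: fixed seed so the illustration is reproducible (an assumption, not from the paper)
    """
    rng = random.Random(seed)
    target = min(len(junctions) for junctions in libraries.values())
    return {name: rng.sample(junctions, target)
            for name, junctions in libraries.items()}

# Toy pooled libraries of unequal size (hypothetical, not data from the paper).
pooled = {
    "DMSO": [("chr1", i) for i in range(50)],
    "APH": [("chr1", i) for i in range(80)],
}
normalized = downsample_to_common_size(pooled)
```

Sampling without replacement keeps every retained junction unique, so per-bin counts in the normalized libraries remain directly comparable between the DMSO and APH conditions.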

For the unbiased identification of junction enrichment

across the genome in APH-treated versus control samples,

we again employed SICER, which also is a method of choice

for comparing two identical samples with or without a specific

treatment (Zang et al., 2009; Figure S2; see the Supplemental

Experimental Procedures). This analysis revealed 282, 156,

and 294 candidate replication stress-induced RDCs, respec-

tively, in HTGTS libraries generated from Chr12, Chr15, and

Chr16 bait DSBs. For further analysis, we only considered

RDCs that showed a significantly higher translocation density

in libraries from APH-treated versus vehicle control-treated

cells (p < 0.05, one-tailed t test; see the Supplemental Experi-

mental Procedures). This criterion reduced the number of clus-

ter candidates that were significantly enriched across all bio-

logical replicates to 69, 158, and 133 in Chr15-Myc-sgRNA-,

Chr12-sgRNA-1-, and Chr16-sgRNA-2-based libraries, respec-

tively (Table S3). While many of these might be bona fide repli-

cation stress-induced RDCs, for more detailed analyses we

only considered APH-induced RDCs that were independently

detected by at least two HTGTS bait DSB locations on different

chromosomes (Figure S2A). Based on this stringent criterion,

26 of the 360 candidate replication stress-induced RDCs

were identified from at least two bait DSB locations (Figure 4);

strikingly, all of these, like the majority of all candidate RDCs,

were in gene bodies (Figure 5; Figures S2, S3, and S4).

Notably, we verified these 26 RDCs with the MACS-based,

custom pipeline mentioned above (Table S4). Translocation

junctions within these RDCs occurred similarly in + and – orien-

tations, again indicating that the bait DSB end could join to one

or the other end of a given prey DSB within the RDC (Figure S3).

Six of the 26 RDC-containing genes (RDC genes) were

Page 7: Long Neural Genes Harbor Recurrent DNA Break Clusters in Neural ...

1

23

4

5

6

7

8910

11

12

13

14

15

16

1718

19

1

2

3

4

5

6

7

8910

11

12

13

14

15

16

1718

19

1

2

3

4

5

6

7

8910

11

12

13

14

15

16

1718

19

1

2

3

4

5

6

7

8910

11

12

13

14

15

16

1718

19

50

5,000500

550

5,000500

5

APHA DMSO

B

C

Chr

15-M

yc-s

gRN

AC

hr12

-sgR

NA

-1C

hr16

-sgR

NA

-2

APHDMSO

APHDMSO

1

2

3

4

5

6

7

8910

11

12

13

14

15

16

1718

19

50

5,000500

5

50

5,000500

5

50

5,000500

550

5,000500

5

1

2

3

4

5

6

7

8910

11

12

13

14

15

16

1718

19

Figure 4. Genome-wide Identification of Replication Stress-Induced

RDCs in NSPCs

(A) Circos plot showing HTGTS junctions from Cas9:sgRNA-mediated bait

DSBs on Chr15 (Chr15-Myc-sgRNA) in DMSO- (left) or APH-treated (right)

Xrcc4�/�p53�/� NSPCs. Junctions from three independent experiments per

condition were combined and randomly down-sampled so that identical

numbers of junctions for each condition (n = 17,701 junctions) could be shown

in each plot.

(B) HTGTS junctions from bait DSBs on Chr12 (Chr12-sgRNA-1) are shown,

as in (A).

(C) HTGTS junctions identified in three (DMSO, left) or four (APH, right) ex-

periments from Chr16-sgRNA-2-mediated bait DSBs; other details as in (A).

For all panels, the bait DSB site (red arrowhead) and sgRNA OT sites (blue

stars) are denoted. Lines in the middle of the plot connect the break site to the

SICER-identified replication stress-induced RDCs that were identified for that

particular break site. Red lines indicate six RDCs detected by bait DSBs on all

three tested chromosomes. Blue lines in each plot indicate RDCs detected by

bait DSBs on two of the three tested break sites, which numbered five for the

Chr15-Myc-sgRNA break site (A), 19 for theChr12-sgRNA-1 break site (B), and

16 for theChr16-sgRNA-2 break site (C). Red stars indicate locations of Lsamp

and Npas3.

See also Figure S3.
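The down-sampling step mentioned in (A), in which junctions pooled across experiments are randomly reduced so that both conditions display the same number of junctions, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names, the tuple representation of a junction, and the fixed random seed are assumptions introduced here.

```python
import random


def downsample_junctions(libraries, n_target, seed=0):
    """Pool junctions from several libraries and draw n_target without replacement.

    libraries: list of lists; each inner list holds junctions as
               (chromosome, position, strand) tuples (hypothetical format).
    """
    pooled = [junction for lib in libraries for junction in lib]
    if n_target > len(pooled):
        raise ValueError("n_target exceeds the number of pooled junctions")
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.sample(pooled, n_target)


def equalize(conditions, seed=0):
    """Down-sample every condition to the size of the smallest pooled condition.

    conditions: dict mapping a condition name (e.g. "DMSO", "APH") to its
                list of libraries.
    """
    n_min = min(sum(len(lib) for lib in libs) for libs in conditions.values())
    return {name: downsample_junctions(libs, n_min, seed)
            for name, libs in conditions.items()}
```

Under this scheme the displayed junction count equals the size of the smaller pooled condition, so differences between the plots reflect junction distribution rather than library depth.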

[Figure 5 graphic: (A) diagram of shared APH-induced RDCs (n = 6) among the Chr12, Chr15, and Chr16 bait sites in Xrcc4−/−p53−/− NSPCs; (B–F) HTGTS junction tracks by bait chromosome, with (+) or without (−) APH, scale bar 1 Mb, for Ctnna2 (chr6:74,829,631-79,931,661), Cdh13 (chr8:118,805,655-123,849,348), Csmd3 (chr15:45,410,184-50,625,535), Nrxn3 (chr12:88,030,948-93,575,373), and Cadm2 (chr16:64,653,666-69,623,153).]

Figure 5. Characterization of Replication Stress-Induced RDCs in

XRCC4/p53-Deficient NSPCs

(A) APH-induced RDCs in Xrcc4−/−p53−/− NSPCs identified from bait DSBs

located on three different chromosomes. Six APH-induced inter-chromosomal

translocation clusters were detected by all three HTGTS strategies; the Ctnna2

(B) and Cdh13 (C) RDCs are shown and the other four are shown in Figure S4A.

(B and C) HTGTS junctions in either DMSO- or APH-treated libraries prepared

from the indicated bait DSBs. Genomic regions corresponding to RDCs are

highlighted in yellow. RefGene tracks are shown. Libraries were normalized as

described in Figure 4.

(D–F) APH-induced RDCs in Xrcc4−/−p53−/− NSPCs in Csmd3 (D), Nrxn3 (E),

and Cadm2 (F) identified from bait DSBs located on two different chromosomes.

The panels are organized as for (A)–(C). All panels show 2 Mb on either

side of the indicated RDC. See Figure S4 for additional examples of

proximity-facilitated RDC identification.

detected by bait DSBs located on three different chromosomes

(Figures 5A–5C; Figure S4A). Finally, as expected based on

proximity effects (Alt et al., 2013), we found higher junction

densities in replication stress-induced RDCs that were on the

same chromosome as the bait DSBs that detected them (Fig-

ures 5D–5F; Figure S4C).

We performed an identical set of assays for replication stress-

induced RDCs in WT NSPCs, except that we only employed

HTGTS bait DSBs from Chr15 or Chr12. Although WT NSPC

HTGTS experiments yielded somewhat lower total junction

numbers than Xrcc4−/−p53−/− NSPC experiments, they revealed

13 of the 26 RDCs detected in Xrcc4−/−p53−/− NSPCs

(Figure 6; Figures S5A–S5F). In addition, Lsamp appeared in

[Figure 6 graphic: (A) diagram of shared APH-induced RDCs (n = 6) between the Chr12 and Chr15 bait sites in wild-type NSPCs; (B–F) HTGTS junction tracks by bait chromosome, with (+) or without (−) APH, scale bar 1 Mb, for Lsamp (chr16:37,086,230-45,531,141), Nrxn1 (chr17:88,430,984-93,494,142), Ctnna2 (chr6:74,829,631-79,931,661), Csmd3 (chr15:45,410,184-50,625,535), and Nrxn3 (chr12:88,030,948-93,575,373).]

Figure 6. Replication Stress-Induced RDCs

in Repair-Proficient NSPCs

(A) Detection of RDCs on a different chromosome

from the bait DSBs on Chr15 or Chr12 is shown.

(B–D) Three are shown, including Lsamp (B), Nrxn1

(C), and Ctnna2 (D); others are shown in Figure S5.

Libraries were normalized as described in Figure 4

(Chr15 bait libraries, 14,525 junctions; Chr12

bait libraries, 10,088 junctions). Details are as in

Figure 5.

(E and F) Detection of RDCs in Csmd3 (E) or Nrxn3

(F) from two bait DSBs, of which one lies on the

RDC-containing chromosome, is shown. Libraries

were normalized as described above. Other details

are as in Figure 5.

See also Figure S5.

WT cells as a replication stress-induced RDC. In total, six of the

14 WT RDCs (including Lsamp) were detected from both bait

DSBs (Figures 6A–6D; Figure S5B). These studies show that

replication stress-associated RDCs form in both WT and

C-NHEJ (XRCC4)-deficient cells. As in repair-deficient NSPCs,

location of the replication stress-induced RDC on the break

site chromosome in WT NSPCs resulted in higher junction den-

sities (Figures 6E and 6F).

Analysis of translocation junctions between bait DSBs and replication stress-mediated RDCs revealed, strikingly, that ~60% of junctions in WT NSPCs were microhomology (MH) mediated, while more than 90% of junctions in Xrcc4−/−p53−/− NSPCs were MH mediated (Figure S5G; Table S5). Genome-wide translocation junctions showed a similar shift in MH usage between WT and Xrcc4−/−p53−/− NSPCs (Figure S5G). Together, these findings show that both the C-NHEJ DSB repair pathway and A-EJ pathways (which are biased toward longer MH usage) can mediate translocations of replication stress-associated DSBs and translocations to DSBs genome-wide in NSPCs.
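The distinction between direct and MH-mediated junctions drawn above can be illustrated with a short sketch. It assumes that each junction has already been annotated with the number of base pairs shared by bait and prey at the junction (0 for a direct join); that annotation, the function name, and the toy values are assumptions introduced here and do not reproduce the junction-signature analysis described in the Supplemental Experimental Procedures.

```python
def mh_fraction(mh_lengths, min_mh=1):
    """Fraction of junctions whose microhomology length is at least min_mh bp."""
    if not mh_lengths:
        raise ValueError("no junctions supplied")
    mh_mediated = sum(1 for n in mh_lengths if n >= min_mh)
    return mh_mediated / len(mh_lengths)


# Toy, made-up microhomology lengths (bp) for two hypothetical libraries.
wt_like = [0, 0, 1, 2, 0, 1, 3, 0, 1, 2]
deficient_like = [1, 2, 3, 4, 2, 1, 5, 2, 0, 3]

print(mh_fraction(wt_like))         # share of MH-mediated joins, toy WT-like set
print(mh_fraction(deficient_like))  # share of MH-mediated joins, toy deficient-like set
```

Comparing this fraction, or the full distribution of MH lengths, between genotypes is one simple way to express the shift toward MH usage reported for C-NHEJ-deficient cells.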

Replication Stress-Associated DSBs and Translocations Target Long, Actively Transcribed, Neural Genes

All 27 (including Lsamp in WT NSPCs) replication stress-induced

RDCs identified by HTGTS and our unbiased, genome-wide

enrichment analysis were located within genes (Figures 5 and

6; Figures S4 and S5), with all but one clearly being actively tran-

scribed, albeit on average at slightly lower levels than other

active genes in NSPCs (Figures 7A and 7B). Strikingly, detailed

analysis of these RDC genes revealed that 15 of 27 (55.6%)

are involved in neural cell adhesion and

22 of 27 (81.5%) have roles in synapto-

genesis and synaptic function (Figure 7C;

Table S6). Moreover, the vast majority of

these genes have been linked to neural

disorders in mice and/or in humans (Table

S6). We note, however, that expression of

some of these genes is not restricted to

neural cells. For example, Lsamp is ex-

pressed in fibroblasts where it is also frag-

ile (Le Tallec et al., 2011); and Wwox,

Pard3b, Oxr1, and Nfia are all expressed

in B cells (Meng et al., 2014), with Wwox also being fragile in lymphocytes

(Le Tallec et al., 2013). Likewise, Dcc is expressed in

most normal tissues and is deleted in colon cancer (Fearon

et al., 1990; see also Discussion).

With the exception of Ptn, all genes harboring replication stress-induced RDCs in NSPCs were longer than 100 kb, which is significantly above the average gene length in the mouse genome (Figure 7D). To test whether these long genes incur more translocations and, thus, form RDCs simply because of their larger target size, we computationally sampled and concatenated randomly selected, active genes of average size (15–25 kb) from HTGTS libraries into regions of ~1 Mb, and we compared size-normalized junction density in these regions to that of the 27 RDC genes (Figures 7E and 7F). Even when normalized by size, the large genes harboring RDCs in NSPCs showed higher junction density than predicted by size alone (Figures 7E and 7F; Figures S6A and S6B). Moreover, the large genes harboring RDCs represented only a small fraction (1.5%) of the 1,761 actively transcribed NSPC genes larger than 100 kb, which further indicates that the observed accumulation of DSBs in these genes in response to replication stress is not just due to size per se. These findings indicate that this subset of long genes in NSPCs is disproportionately susceptible to DSB-induced genomic instability.
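The size-normalization control described above can be sketched as follows. The sketch assumes that each control gene is available as a (length in bp, junction count) pair; this input format, the function names, and the seed are illustrative assumptions, and the code is not the authors' implementation.

```python
import random


def junctions_per_mb(n_junctions, length_bp):
    """Size-normalized junction density (junctions per megabase)."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return n_junctions / (length_bp / 1_000_000)


def concatenated_control(genes, n_genes, seed=0):
    """Randomly pick n_genes control genes and treat them as one concatenated region.

    genes: list of (length_bp, n_junctions) pairs for active, average-size genes
           (hypothetical input format).
    Returns (total_length_bp, density in junctions per Mb).
    """
    rng = random.Random(seed)
    picked = rng.sample(genes, n_genes)
    total_length = sum(length for length, _ in picked)
    total_junctions = sum(count for _, count in picked)
    return total_length, junctions_per_mb(total_junctions, total_length)
```

Fifty genes of 15–25 kb concatenate to roughly 1 Mb, which matches the control region size given in the text. The density of each RDC gene, computed with `junctions_per_mb` from its own junction count and length, can then be compared directly with the control densities.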

To gain further insight into potential underlying mechanisms,

we investigated the replication timing of the 27 identified RDC

genes in NSPCs by examining existing murine neural progenitor

replication timing data (Hiratani et al., 2008; Pope et al., 2014).

Whereas a few of these genes show relatively neutral or early

replication timing (Npas3, Nfia, Wwox, and Ptn), the majority

[Figure 7 graphic: (A) GRO-seq density, log2[RPKM], per RDC gene (active vs. inactive); (B) GRO-seq density, log2[RPKM], all genes vs. RDC genes; (C) Venn diagram of NSPC RDC genes (27), synaptogenesis/synapse function (22), and neural cell adhesion (15), with region counts 2, 13, and 9; (D) gene length, log2[kb], all genes vs. RDC genes; (E and F) translocation density for control groups R1–R6 and individual RDC genes; (G) replication timing ratio, log2(Early/Late), per RDC gene.]

Figure 7. Replication Stress-Induced RDCs

in Long, Actively Transcribed, Neural Genes

(A) Transcriptional activity (GRO-seq) of the iden-

tified 27 genes containing replication stress-

induced RDCs. Transcriptional activity cutoff value

(reads per kilobase of transcript per million map-

ped reads [RPKM] = 0.05) is indicated by dashed

red line.

(B) Transcription rate of all active (RPKM ≥ 0.05)

NSPC genes (black) and active replication stress-

induced RDC genes (green). Whiskers show minimum

and maximum values; bottom and top edges

of boxplots correspond to 25th and 75th percentiles,

respectively; horizontal lines indicate the

median (**p < 0.005, Kolmogorov-Smirnov [K-S]

test).

(C) Venn diagram of the indicated molecular

functions among the 27 identified RDC genes

(yellow circle). 22 of 27 genes (81.5%, light green

circle) have roles in synaptogenesis and synapse

function; 15 of the 27 genes (55.6%, purple circle)

have roles in neural cell adhesion, with the majority

(13 of 15 genes, 86.7%) also having roles in syn-

aptogenesis and synapse function. See Table S6

for a detailed description.

(D) Gene length comparison of all active NSPC

genes (black) and NSPC RDC genes (green). Box-

and-whisker plots show the binary logarithm of

kilobase gene length; graph details are as in (A)

(****p < 0.0001, K-S test).

(E) Five groups (R1–R5) of 50 actively transcribed

15- to 25-kb genes each were randomly selected

from three independent Xrcc4−/−p53−/− Chr12-sgRNA-1

bait DSB libraries and junction numbers

within the concatenated regions determined (gray

bars). Junction numbers within the indicated inter-

chromosomal RDCs were determined in the same

libraries (blue bars). Translocation density is indi-

cated as junctions per megabase.

(F) Translocation densities of concatenated average size (15- to 25-kb) active genes on Chr12 (R6, n = 62, gray bar) or intra-chromosomal Chr12 RDCs (blue bars).

Data represent mean and SEM of libraries from three independent Chr12-sgRNA-1 bait DSB experiments.

(G) Replication timing analysis of RDC genes (see the Experimental Procedures for details). Average and SEM are shown.

See also Figures S6 and S7 and Table S6.

replicate late (Figure 7G; Figure S6C). Notably, the 27 RDCs on

average replicate significantly later in NSPCs than other genes

larger than 100 kb (P.-C.W., A.N.C., J.K., Z.D., R.M.M., F.W.A.,

and B.S., unpublished data). Because the 27 genes we identified

as being prone to genomic instability in NSPCs are highly

conserved between mouse and man, we also examined existing

replication timing data of their human orthologs in neural progen-

itors (Ryba et al., 2010; Figure S6D). Nearly 90% of these human

orthologs showed conserved replication timing with their mouse

counterparts (Figure S6D), suggesting that the majority of genes

we identified as sensitive to replication stress-induced genomic

instability in murine NSPCs could potentially be prone to replica-

tion stress-induced fragility in humans.

DISCUSSION

Detection of Recurrent Classes of DSBs in NSPCs

Development of NSPCs into post-mitotic neurons in vivo is

dependent on the repair of DSBs by C-NHEJ (Gao et al.,

1998), suggesting critical roles for DSBs and/or their repair in

neural cells. We have now employed HTGTS to identify tens of

thousands of endogenous DSBs across the genomes of

XRCC4/p53-deficient and WT NSPCs, based on their transloca-

tion to bait DSBs on several different chromosomes. Our findings

reveal multiple different sources of recurrent DSBs in NSPCs, of

which a large fraction corresponds to general classes of DSBs

observed in other cell types (e.g., Chiarle et al., 2011; Frock

et al., 2015; see below). Beyond these, our unbiased approach

revealed 27 clear RDC sites in NSPCs, as they were recurrently

detected from HTGTS bait DSBs located on different chromo-

somes. Strikingly, all 27 RDCs occurred in gene bodies. More-

over, they mainly occurred in large genes encoding proteins

involved in neural development or function, with a significant

subset having been implicated as rearranged in neural and other

cancers. Based on detection from a single HTGTS bait site, we

identified 333 additional, likely lower level, RDC candidates. As

spatial proximity of bait and prey DSBs on the same chromo-

some clearly enhances detection of replication stress-induced

RDCs in NSPCs (Figures 5 and 6; Figures S4C and S5D), HTGTS

with additional bait DSB locations may eventually allow confir-

mation of many of these additional apparent RDCs. Due to the

high sensitivity of HTGTS as a DSB identification approach, we

expect that, with appropriate means of delivering bait DSBs,

our approach could be extended to other neural lineage cells,

including mature neurons.

Potential Sources of General Classes of Endogenous DSBs in NSPCs

In XRCC4/p53-deficient NSPCs, a large proportion of bait DSB

junctions involve re-joining of the two bait DSB ends subsequent

to resection (e.g., Figure S1), similar to what occurs in other cell

types (Chiarle et al., 2011; Frock et al., 2015). Beyond the imme-

diate break site, junctions were enriched along each tested

XRCC4/p53-deficient NSPC break site chromosome (i.e.,

Chr12, 15, and 16) relative to other chromosomes, consistent

with spatial proximity influencing preferential joining of bait

DSBs to the subset of widespread, low-level chromosomal

DSBs that occur in cis (Frock et al., 2015; Zhang et al., 2012).

Previously, this phenomenon was most prominently observed

in cells harboring widespread DSBs generated by ionizing radia-

tion or by non-specific activities of certain nucleases (Frock

et al., 2015; Zhang et al., 2012). While we have not elucidated

the source of widespread low-level DSBs in NSPCs, such

DSBs might arise from various endogenous sources, including

replicative, transcriptional, or oxidative stress (e.g., Aguilera

and García-Muse, 2013; Erwin et al., 2014; Ju et al., 2006; Kim

and Jinks-Robertson, 2012; Madabhushi et al., 2015). In this re-

gard, ATM deficiency, which increases oxidative stress (Paull,

2015), led to the greatest levels of this class of DSBs in NSPCs

(Table S1). Notably, low-level widespread DSBs and overall

RDC DSBs appear to similarly contribute as major DSB sources

detectable in NSPCs. Finally, DSBs captured by HTGTS baits

also are enriched near the TSSs of active genes in NSPCs; but

they are not frequent enough to be considered recurrent in any

given gene (e.g., they occur at negligible frequency in RDC

gene TSSs compared to the frequency of DSBs across the

gene body [Schwer et al., 2016]).

Mechanisms Promoting Replication Stress-Induced Genomic Instability of Neural Genes in NSPCs

Of the 27 genes harboring robust RDCs in NSPCs, 25 were

evident only in response to APH-induced replication stress;

moreover, APH treatment increased the DSB frequency in the

two genes, Npas3 and Lsamp, that had RDCs in the absence

of treatment (Figures 4, 5, and 6; Figures S4C and S5B). APH

is well known to induce CFS instability (Durkin and Glover,

2007). Consistent with characteristics often associated with

CFSs, most replication stress-induced RDCs in NSPCs are

within actively transcribed, large, and late-replicating genes (Fig-

ure 7). Thus, as proposed for CFSs, these characteristics, and

potentially others, may contribute to the DSBs that generate

NSPC RDCs by increasing the frequency of collisions between

transcription and replication factors and/or mitotic entry with

incomplete replication (Gao and Smith, 2014; Helmrich et al.,

2011; Le Tallec et al., 2014). In this regard, Lsamp is the largest,

actively transcribed NSPC gene and it replicates late, potentially

predisposing it to frequent DSBs and RDC formation in the

absence of APH treatment. The mechanism(s) of Npas3 fragility

may be distinct, as this gene has neutral to early replication

timing. In this context, we also identified an RDC in Ptn, which

is not an exceptionally large gene (95.7 kb), replicates early,

and is highly transcribed relative to surrounding regions, reminis-

cent of the early replication fragile sites (ERFSs) identified in B

lymphocytes (Barlow et al., 2013). Notably, DSBs in ERFSs

also have been linked to collisions between transcription and

replication, but ERFSs are not induced by APH treatment

(Barlow et al., 2013).

Mapping of suspected CFSs has been achieved

mostly through experimental approaches involving cytogenetic

studies of metaphase chromosomes from a limited number of

cells (Durkin and Glover, 2007). Thus, the majority of CFSs

have been characterized at low resolution (Savelyeva and

Brueckner, 2014). In the mouse, only eight CFSs have been

molecularly mapped and only in lymphocytes (Helmrich et al.,

2006); one of these (Wwox, FRA8E1; Krummel et al., 2002)

was identified as an RDC in our study of NSPCs. The ortholo-

gous human gene (WWOX) also is located within a CFS

(FRA16D; Krummel et al., 2002). In human cells, only nine

CFSs have been fine-mapped to a resolution of about 150 kb,

although others have been implicated at lower resolution

(several megabases), mostly in transformed cell lines (Savelyeva

and Brueckner, 2014). Remarkably, of these implicated human

CFSs, six span genes (Bosco et al., 2010; Le Tallec et al.,

2011, 2013) that correspond to RDCs that we identified at

high resolution in NSPCs (Pard3b, Fgf12, Prkg1, Gpc6, Lsamp,

and Sdk1; Table S6). Thus, HTGTS elucidates CFSs, and other

types of genomic fragility, at nucleotide resolution. Such resolu-

tion is critical for understanding underlying mechanisms. For

example, based on the analysis of large numbers of HTGTS

junctions, we found that both RDC translocation junctions

and genome-wide translocation junctions in XRCC4-deficient

NSPCs have a markedly increased frequency and extent of

MH usage as compared to their counterparts in WT NSPCs (Fig-

ure S5G). Thus, in contrast to earlier conclusions based on more

limited approaches studying mouse embryonic stem cells (Arlt

et al., 2012), our studies indicate that both C-NHEJ and A-EJ

pathways can mediate the various types of translocations we

observed in NSPCs.

RDC Genes in NSPCs Are Implicated in Neural Processes, Neural Disorders, and Cancer

The great majority (24 of 27) of RDC genes in NSPCs have roles

in neural cell adhesion and/or regulation of synapse formation

and function (Figure 7; also see Table S6). These include the cad-

herin-associated proteins Ctnna2 and Ctnnd2; cadherin Cdh13;

synaptic cell adhesion molecule Cadm2; neural cell adhesion

molecules Bai3, Csmd1, Csmd3, Dcc, Lsamp, Mdga2, Magi2,

Ntm, and Sdk1; excitatory neurotransmitter receptor Grik2;

and two members of the neurexin family of synaptic cell surface

proteins (Nrxn1 and Nrxn3; see Table S6). In addition, nearly all

NSPC RDC-containing genes have been linked, in mice, hu-

mans, or both, to neurodevelopmental and neuropsychiatric

disorders, including autism spectrum disorder (44%; 12/27),

schizophrenia (37%; 10/27), bipolar disorder (29.6%; 8/27),

and intellectual disability (22.2%; 6/27) (Table S6). In the above

contexts, recurrent DSB-mediated genomic alterations in

NSPC RDC genes might generate neuronal diversity and,

thereby, affect neural physiology and/or predispose to neurode-

velopmental disorders.
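The gene counts quoted in this section and in the Figure 7C legend are mutually consistent, which a short inclusion-exclusion check makes explicit. The counts are taken from the text; the variable names and the helper `pct` are illustrative only.

```python
TOTAL_RDC_GENES = 27
adhesion = 15   # genes with roles in neural cell adhesion
synapse = 22    # genes with roles in synaptogenesis / synapse function
both = 13       # genes in both categories (Figure 7C legend)

# Inclusion-exclusion: genes in at least one of the two categories.
either = adhesion + synapse - both
neither = TOTAL_RDC_GENES - either


def pct(k, n=TOTAL_RDC_GENES):
    """Percentage of the RDC genes, rounded to one decimal place."""
    return round(100 * k / n, 1)


# Disorder associations as counted in the text (out of 27 genes).
disorders = {
    "autism spectrum disorder": 12,
    "schizophrenia": 10,
    "bipolar disorder": 8,
    "intellectual disability": 6,
}
percentages = {name: pct(k) for name, k in disorders.items()}

print(either, neither)
print(percentages)
```

The union of the two functional categories comes to 24 genes, in agreement with the "24 of 27" stated at the start of this section.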

It is perhaps notable that the human orthologs of nine of the

RDCs identified in our study are found in relatively focal (5.8-

to 15.4-Mb) CNVs detected by single-cell sequencing of human

frontal cortex neurons (McConnell et al., 2013; Figure S7). While

the relevance of this finding awaits further studies, it is tempting

to speculate that the human orthologs of RDCs that we defined in

NSPCs may give rise to at least some of these neuronal CNVs. In

this regard, NSPCs harboring RDCs may be positively selected,

and/or DSBs leading to RDC formation may occur at high fre-

quency. Consistent with the latter possibility, we estimate that,

when considered in aggregate, 12 DSBs per cell translocate to

the 27 RDCs in XRCC4/p53-deficient NSPCs (Table S2). How-

ever, the actual DSB frequency in these cells is likely much

higher. In this regard, we have used the XRCC4/p53-deficient

NSPCs to enhance the ability to find recurrent endogenous

DSB clusters via HTGTS. Thus, while XRCC4 deficiency has no

known impact on DSB generation, it enhances DSB persistence,

thereby enhancing translocation and facilitating detection by

HTGTS (Alt et al., 2013). Notably, however, even in XRCC4-defi-

cient NSPCs, most DSBs are still joined locally near the break

site by A-EJ, resulting in our HTGTS results estimating only the

minimal DSB frequency in any given RDC (e.g., Table S2; data

not shown). Finally, our finding of RDCs in WT NSPCs, where

an even greater fraction of DSBs were joined locally by

C-NHEJ (Table S2; data not shown), emphasizes that actual

DSB frequency in RDC genes is much greater than minimal

numbers revealed by HTGTS.

Given that HTGTS does not reveal the precise frequency of

DSBs at a given RDC, we compared the approximate frequency

of spontaneous translocations to Lsamp in NSPCs to those

occurring to Bcl-6 in activated B cells, in which Bcl-6 is a major

AID OT. This comparison is possible because we have done

HTGTS on both NSPCs and on activated B cells from the

same c-Myc bait DSBs in the same ATM-deficient background

(Figure 2). We found that translocations to Lsamp in NSPCs

occurred five times more frequently than translocations to Bcl-

6 in B cells (Table S2). As Bcl-6 translocations occur at about

3% the level of translocations to an IgH CSR region that breaks

in at least 40%–50% of activated B cells over a 4-day activation

period (which is the same period over which we assayed

NSPCs), this comparison suggests that DSBs occur frequently

in Lsamp and, by extension, in other RDCs in the context of repli-

cation stress. An intriguing, unanswered question raised by our

current findings is how the bulk of RDC DSBs are repaired

locally, in particular, whether they might frequently join to other

DSBs within the same RDC. In this context, most of the 27

RDC genes fall within a single replication domain (Figure S6),

which very often appears to correspond to topologically associ-

ating domains (TADs) (Pope et al., 2014). The frequent joining of

recurrent DSBs within a given TAD or chromosomal loop domain

is exploited by lymphoid cells to promote frequent joining of

DSBs within antigen receptor loci (Zarrin et al., 2007; Alt et al.,

2013; Dong et al., 2015; Hu et al., 2015) and also may contribute

to recurrent deletions found in certain cancers (Alt et al., 2013;

Hu et al., 2015). In analogy to our recent HTGTS studies in which

endogenous IgH switch region breaks were used as bait DSBs

(Dong et al., 2015), we could begin to address such questions

by using RDC regions with the highest DSB density as endoge-

nous baits.

We have found previously that DSB repair by C-NHEJ sup-

presses development of MBs with recurrent deletions, translo-

cations, and amplification of N-myc and other genes (Yan

et al., 2006). Notably, Cdh13, an NSPC RDC gene, frequently

has been found to have copy number loss in human group III

MBs (Northcott et al., 2012), as well as in other cancers, including

ovarian, lung, liver, and breast cancers (see Table S6). In addi-

tion, NRXN3 amplification in double minutes has been detected

in human MBs (Rausch et al., 2012). Several preliminary candi-

date RDCs lie within the centromeric portion of Chr12 where

mouse N-myc is located. In this regard, RDC gene fragility in

NSPCs might be relevant to the speculation that frequent gener-

ation of endogenous DSBs during normal neuroblast differentia-

tion contributes to N-myc amplification in human neuroblas-

tomas (Kohl et al., 1983). Indeed, numerous NSPC RDC genes

are frequently deleted, rearranged, or amplified in various human

cancers (Table S6). Thus, LSAMP is among the most frequently

deleted genes in human cancers and NPAS3 is deleted in high-

grade astrocytomas and glioblastomas (see Table S6). Likewise,

three RDCs are recurrently deleted and rearranged (CADM2), re-

arranged and amplified (CSMD3), or involved in inter-chromo-

somal gene fusions (DGKB) in prostate cancer (see Table S6).

These latter observations may well reflect fragility of some

NSPC RDC genes in other tissues and cell types in which they

are expressed. HTGTS analyses of additional cell types for

spontaneous or replication stress-induced RDCs could test

this hypothesis and also identify RDCs specific to those other

cell types.

EXPERIMENTAL PROCEDURES

NSPC Culture and DSB Induction

NSPCs from frontal brains of postnatal day (P)8–14 mice were prepared and

cultured as described in the Supplemental Experimental Procedures. All

related animal work was performed under protocol 14-10-2790R approved

by the Institutional Animal Care and Use Committee of Boston Children’s Hos-

pital. Bait DSB induction was achieved either via a Cas9:sgRNA approach

(Frock et al., 2015) or via a triamcinolone acetate (TA)-inducible I-SceI

approach (Chiarle et al., 2011). Replication stress was induced by treatment

with APH (Sigma) for 96 hr. See the Supplemental Experimental Procedures

for details.

GRO-Seq

GRO-seq libraries were prepared as previously described (Meng et al., 2014)

from 5–8 × 10^6 NSPC nuclei. Three biological replicates per genotype

(ATM−/−R26I-SceI-GRc-Myc25xI-SceI or Xrcc4−/−p53−/−) were performed.

GRO-seq data were aligned to mouse genome build mm9/NCBI37 by Bowtie2

and non-redundant, uniquely mapped sequence reads were retained. De novo

transcripts were identified and gene expression levels were estimated as pre-

viously described (Meng et al., 2014).
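For orientation, the RPKM measure used to call a gene transcriptionally active (cutoff 0.05 in the Figure 7 legend) follows the standard definition, reads per kilobase of transcript per million mapped reads. The sketch below applies that textbook formula; how reads were counted per transcript in the actual analysis follows Meng et al. (2014) and may differ, and the function names are assumptions introduced here.

```python
ACTIVITY_CUTOFF = 0.05  # RPKM cutoff quoted in the Figure 7 legend


def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def is_active(read_count, transcript_length_bp, total_mapped_reads,
              cutoff=ACTIVITY_CUTOFF):
    """True when the gene's RPKM meets or exceeds the activity cutoff."""
    return rpkm(read_count, transcript_length_bp, total_mapped_reads) >= cutoff
```

Because RPKM divides by transcript length, a very long gene needs proportionally more reads than a short gene to reach the same value, which is worth keeping in mind when comparing long RDC genes with average-size genes.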

HTGTS and Related Bioinformatic Analyses

Emulsion-PCR-mediated HTGTS and linear amplification-mediated (LAM)-

HTGTS were performed and analyzed as described previously (Chiarle et al.,

2011; Frock et al., 2015). Primers used and junction yield per experiment, as

well as descriptions of bioinformatic methods used for HTGTS junction ana-

lyses, RDC identification, repair junction signature analysis (e.g., direct versus

MH mediated), and Cas9:sgRNA OT site identification, are given in Table S7

and the Supplemental Experimental Procedures.

Replication Timing Analysis

Custom Python scripts were used to calculate median replication timing ratios

of genomic regions based on Repli-chip data (Weddington et al., 2008). Repli-

cation timing datasets analyzed were mouse NPC 46C, TT2, and D3 (Hiratani

et al., 2008) and two replicates of human NPC BG01 (Ryba et al., 2010). Repli-

cation timing ratios were displayed by Integrative Genomics Viewer (IGV, Rob-

inson et al., 2011).
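The median replication timing ratio of a genomic region can be computed along the following lines. This is a minimal sketch rather than the custom scripts referred to above: it assumes Repli-chip probe values are already loaded as (chromosome, position, log2 early/late ratio) tuples, and the function names and the sign-based label are illustrative.

```python
from statistics import median


def median_timing_ratio(probes, chrom, start, end):
    """Median log2(early/late) ratio of probes within [start, end] on chrom.

    probes: iterable of (chromosome, position, log2_ratio) tuples
            (hypothetical in-memory format for Repli-chip probe values).
    Returns None when no probe lies in the region.
    """
    values = [ratio for c, pos, ratio in probes
              if c == chrom and start <= pos <= end]
    return median(values) if values else None


def timing_label(ratio):
    """Sign-based label: a positive log2(early/late) ratio means earlier replication."""
    if ratio is None:
        return "no data"
    if ratio > 0:
        return "early"
    if ratio < 0:
        return "late"
    return "neutral"
```

Using the median rather than the mean keeps a few outlying probes from dominating the summary for a long gene; any threshold for calling a region "neutral" more loosely than an exact zero would be an additional assumption.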

ACCESSION NUMBERS

The accession number for the sequencing data reported in this paper is GEO:

GSE74356.

SUPPLEMENTAL INFORMATION

Supplemental Information includes Supplemental Experimental Procedures,

seven figures, and seven tables and can be found with this article online at

http://dx.doi.org/10.1016/j.cell.2015.12.039.

AUTHOR CONTRIBUTIONS

F.W.A. and B.S. conceived of and planned the study. B.S., P.-C.W., A.N.C.,

Z.D., R.M.M., and F.W.A. designed experiments. P.-C.W., A.N.C., J.K., and

B.S. performed research. B.S., P.-C.W., A.N.C., J.K., Z.D., R.M.M., and

F.W.A. analyzed and interpreted data. B.S., P.-C.W., and F.W.A. designed fig-

ures and wrote the manuscript. Other authors helped polish the manuscript.

ACKNOWLEDGMENTS

We thank Drs. R. Axel, C. Boboila, and members of the F.W.A. laboratory for

helpful comments and stimulating discussions; Drs. C. Guo, M. Gostissa,

and J. Hu for experimental advice; and Drs. Y. Zhang, L. Shen, and F.-L.

Meng for DNA sequencing assistance. This work in the F.W.A. lab was sup-

ported by the Porter Anderson Fund from Boston Children’s Hospital and

the Howard Hughes Medical Institute. B.S. is a Martin D. Abeloff Scholar of

The V Foundation for Cancer Research and is supported by National Institute

on Aging (NIA)/NIH grant K01AG043630. P.W. is supported by a National Can-

cer Center postdoctoral fellowship.

Received: October 22, 2015

Revised: November 23, 2015

Accepted: December 21, 2015

Published: February 11, 2016

REFERENCES

Aguilera, A., and García-Muse, T. (2013). Causes of genome instability. Annu.

Rev. Genet. 47, 1–32.

Alt, F.W., Zhang, Y., Meng, F.L., Guo, C., and Schwer, B. (2013). Mechanisms

of programmed DNA lesions and genomic instability in the immune system.

Cell 152, 417–429.

Arlt, M.F., Rajendran, S., Birkeland, S.R., Wilson, T.E., and Glover, T.W. (2012).

De novo CNV formation in mouse embryonic stem cells occurs in the absence

of Xrcc4-dependent nonhomologous end joining. PLoS Genet. 8, e1002981.

Barlow, J.H., Faryabi, R.B., Callen, E., Wong, N., Malhowski, A., Chen, H.T.,

Gutierrez-Cruz, G., Sun, H.W., McKinnon, P., Wright, G., et al. (2013). Identifi-

cation of early replicating fragile sites that contribute to genome instability. Cell

152, 620–632.

Barnes, D.E., Stamp, G., Rosewell, I., Denzel, A., and Lindahl, T. (1998). Tar-

geted disruption of the gene encoding DNA ligase IV leads to lethality in embry-

onic mice. Curr. Biol. 8, 1395–1398.

Boboila, C., Alt, F.W., and Schwer, B. (2012). Classical and alternative end-

joining pathways for repair of lymphocyte-specific and general DNA double-

strand breaks. Adv. Immunol. 116, 1–49.

Bosco, N., Pelliccia, F., and Rocchi, A. (2010). Characterization of FRA7B, a

human common fragile site mapped at the 7p chromosome terminal region.

Cancer Genet. Cytogenet. 202, 47–52.

Chiarle, R., Zhang, Y., Frock, R.L., Lewis, S.M., Molinie, B., Ho, Y.J., Myers,

D.R., Choi, V.W., Compagno, M., Malkin, D.J., et al. (2011). Genome-wide

translocation sequencing reveals mechanisms of chromosome breaks and re-

arrangements in B cells. Cell 147, 107–119.

Core, L.J., Waterfall, J.J., and Lis, J.T. (2008). Nascent RNA sequencing re-

veals widespread pausing and divergent initiation at human promoters. Sci-

ence 322, 1845–1848.

Difilippantonio, M.J., Petersen, S., Chen, H.T., Johnson, R., Jasin, M., Kanaar,

R., Ried, T., and Nussenzweig, A. (2002). Evidence for replicative repair of DNA

double-strand breaks leading to oncogenic translocation and gene amplifica-

tion. J. Exp. Med. 196, 469–480.

Dong, J., Panchakshari, R.A., Zhang, T., Zhang, Y., Hu, J., Volpi, S.A., Meyers,

R.M., Ho, Y.J., Du, Z., Robbiani, D.F., et al. (2015). Orientation-specific joining

of AID-initiated DNA breaks promotes antibody class switching. Nature 525,

134–139.

Durkin, S.G., and Glover, T.W. (2007). Chromosome fragile sites. Annu. Rev.

Genet. 41, 169–192.

Erwin, J.A., Marchetto, M.C., and Gage, F.H. (2014). Mobile DNA elements in

the generation of diversity and complexity in the brain. Nat. Rev. Neurosci. 15,

497–506.

Fearon, E.R., Cho, K.R., Nigro, J.M., Kern, S.E., Simons, J.W., Ruppert, J.M.,

Hamilton, S.R., Preisinger, A.C., Thomas, G., Kinzler, K.W., et al. (1990). Iden-

tification of a chromosome 18q gene that is altered in colorectal cancers. Sci-

ence 247, 49–56.

Frank, K.M., Sharpless, N.E., Gao, Y., Sekiguchi, J.M., Ferguson, D.O., Zhu,

C., Manis, J.P., Horner, J., DePinho, R.A., and Alt, F.W. (2000). DNA ligase

IV deficiency in mice leads to defective neurogenesis and embryonic lethality

via the p53 pathway. Mol. Cell 5, 993–1002.

Frock, R.L., Hu, J., Meyers, R.M., Ho, Y.J., Kii, E., and Alt, F.W. (2015).

Genome-wide detection of DNA double-stranded breaks induced by engi-

neered nucleases. Nat. Biotechnol. 33, 179–186.

Gao, G., and Smith, D.I. (2014). Very large common fragile site genes and their

potential role in cancer development. Cell. Mol. Life Sci. 71, 4601–4615.

Gao, Y., Sun, Y., Frank, K.M., Dikkes, P., Fujiwara, Y., Seidl, K.J., Sekiguchi,

J.M., Rathbun, G.A., Swat, W., Wang, J., et al. (1998). A critical role for DNA

end-joining proteins in both lymphogenesis and neurogenesis. Cell 95,

891–902.

Gao, Y., Ferguson, D.O., Xie, W., Manis, J.P., Sekiguchi, J., Frank, K.M.,

Chaudhuri, J., Horner, J., DePinho, R.A., and Alt, F.W. (2000). Interplay of

p53 and DNA-repair protein XRCC4 in tumorigenesis, genomic stability and

development. Nature 404, 897–900.

Gapud, E.J., and Sleckman, B.P. (2011). Unique and redundant functions of

ATM and DNA-PKcs during V(D)J recombination. Cell Cycle 10, 1928–1935.

Glover, T.W., Berger, C., Coyle, J., and Echo, B. (1984). DNA polymerase alpha

inhibition by aphidicolin induces gaps and breaks at common fragile sites in

human chromosomes. Hum. Genet. 67, 136–142.

Helmrich, A., Stout-Weider, K., Hermann, K., Schrock, E., and Heiden, T.

(2006). Common fragile sites are conserved features of human and mouse

chromosomes and relate to large active genes. Genome Res. 16, 1222–1230.

Helmrich, A., Ballarino, M., and Tora, L. (2011). Collisions between replication

and transcription complexes cause common fragile site instability at the

longest human genes. Mol. Cell 44, 966–977.

Hiratani, I., Ryba, T., Itoh, M., Yokochi, T., Schwaiger, M., Chang, C.W., Lyou,

Y., Townes, T.M., Schubeler, D., and Gilbert, D.M. (2008). Global reorganiza-

tion of replication domains during embryonic stem cell differentiation. PLoS

Biol. 6, e245.

Page 13: Long Neural Genes Harbor Recurrent DNA Break Clusters in Neural ...

Hu, J., Zhang, Y., Zhao, L., Frock, R.L., Du, Z., Meyers, R.M., Meng, F.L., Schatz, D.G., and Alt, F.W. (2015). Chromosomal Loop Domains Direct the Recombination of Antigen Receptor Genes. Cell 163, 947–959.

Ju, B.G., Lunyak, V.V., Perissi, V., Garcia-Bassets, I., Rose, D.W., Glass, C.K., and Rosenfeld, M.G. (2006). A topoisomerase IIbeta-mediated dsDNA break required for regulated transcription. Science 312, 1798–1802.

Kim, N., and Jinks-Robertson, S. (2012). Transcription as a source of genome instability. Nat. Rev. Genet. 13, 204–214.

Klein, I.A., Resch, W., Jankovic, M., Oliveira, T., Yamane, A., Nakahashi, H., Di Virgilio, M., Bothmer, A., Nussenzweig, A., Robbiani, D.F., et al. (2011). Translocation-capture sequencing reveals the extent and nature of chromosomal rearrangements in B lymphocytes. Cell 147, 95–106.

Kohl, N.E., Kanda, N., Schreck, R.R., Bruns, G., Latt, S.A., Gilbert, F., and Alt, F.W. (1983). Transposition and amplification of oncogene-related sequences in human neuroblastomas. Cell 35, 359–367.

Krummel, K.A., Denison, S.R., Calhoun, E., Phillips, L.A., and Smith, D.I. (2002). The common fragile site FRA16D and its associated gene WWOX are highly conserved in the mouse at Fra8E1. Genes Chromosomes Cancer 34, 154–167.

Le Tallec, B., Dutrillaux, B., Lachages, A.M., Millot, G.A., Brison, O., and Debatisse, M. (2011). Molecular profiling of common fragile sites in human fibroblasts. Nat. Struct. Mol. Biol. 18, 1421–1423.

Le Tallec, B., Millot, G.A., Blin, M.E., Brison, O., Dutrillaux, B., and Debatisse, M. (2013). Common fragile site profiling in epithelial and erythroid cells reveals that most recurrent cancer deletions lie in fragile sites hosting large genes. Cell Rep. 4, 420–428.

Le Tallec, B., Koundrioukoff, S., Wilhelm, T., Letessier, A., Brison, O., and Debatisse, M. (2014). Updating the mechanisms of common fragile site instability: how to reconcile the different views? Cell. Mol. Life Sci. 71, 4489–4494.

Lee, Y., and McKinnon, P.J. (2002). DNA ligase IV suppresses medulloblastoma formation. Cancer Res. 62, 6395–6399.

Lieber, M.R. (2010). The mechanism of double-strand DNA break repair by the nonhomologous DNA end-joining pathway. Annu. Rev. Biochem. 79, 181–211.

Madabhushi, R., Gao, F., Pfenning, A.R., Pan, L., Yamakawa, S., Seo, J., Rueda, R., Phan, T.X., Yamakawa, H., Pao, P.C., et al. (2015). Activity-Induced DNA Breaks Govern the Expression of Neuronal Early-Response Genes. Cell 161, 1592–1605.

McConnell, M.J., Lindberg, M.R., Brennand, K.J., Piper, J.C., Voet, T., Cowing-Zitron, C., Shumilina, S., Lasken, R.S., Vermeesch, J.R., Hall, I.M., and Gage, F.H. (2013). Mosaic copy number variation in human neurons. Science 342, 632–637.

McKinnon, P.J. (2013). Maintaining genome stability in the nervous system. Nat. Neurosci. 16, 1523–1529.

Meng, F.L., Du, Z., Federation, A., Hu, J., Wang, Q., Kieffer-Kwon, K.R., Meyers, R.M., Amor, C., Wasserman, C.R., Neuberg, D., et al. (2014). Convergent transcription at intragenic super-enhancers targets AID-initiated genomic instability. Cell 159, 1538–1548.

Muotri, A.R., and Gage, F.H. (2006). Generation of neuronal variability and complexity. Nature 441, 1087–1093.

Northcott, P.A., Shih, D.J., Peacock, J., Garzia, L., Morrissy, A.S., Zichner, T., Stutz, A.M., Korshunov, A., Reimand, J., Schumacher, S.E., et al. (2012). Subgroup-specific structural variation across 1,000 medulloblastoma genomes. Nature 488, 49–56.

Paull, T.T. (2015). Mechanisms of ATM Activation. Annu. Rev. Biochem. 84, 711–738.

Poduri, A., Evrony, G.D., Cai, X., and Walsh, C.A. (2013). Somatic mutation, genomic variation, and neurological disease. Science 341, 1237758.

Pope, B.D., Ryba, T., Dileep, V., Yue, F., Wu, W., Denas, O., Vera, D.L., Wang, Y., Hansen, R.S., Canfield, T.K., et al. (2014). Topologically associating domains are stable units of replication-timing regulation. Nature 515, 402–405.

Rausch, T., Jones, D.T., Zapatka, M., Stutz, A.M., Zichner, T., Weischenfeldt, J., Jager, N., Remke, M., Shih, D., Northcott, P.A., et al. (2012). Genome sequencing of pediatric medulloblastoma links catastrophic DNA rearrangements with TP53 mutations. Cell 148, 59–71.

Robinson, J.T., Thorvaldsdottir, H., Winckler, W., Guttman, M., Lander, E.S., Getz, G., and Mesirov, J.P. (2011). Integrative genomics viewer. Nat. Biotechnol. 29, 24–26.

Ryba, T., Hiratani, I., Lu, J., Itoh, M., Kulik, M., Zhang, J., Schulz, T.C., Robins, A.J., Dalton, S., and Gilbert, D.M. (2010). Evolutionarily conserved replication timing profiles predict long-range chromatin interactions and distinguish closely related cell types. Genome Res. 20, 761–770.

Savelyeva, L., and Brueckner, L.M. (2014). Molecular characterization of common fragile sites as a strategy to discover cancer susceptibility genes. Cell. Mol. Life Sci. 71, 4561–4575.

Schwer, B., Wei, P., Chang, A.N., Kao, J., Du, Z., Meyers, R.M., and Alt, F.W. (2016). Transcription-associated processes cause DNA double-strand breaks and translocations in neural stem/progenitor cells. Proc. Natl. Acad. Sci. USA. Published online February 12, 2016. http://dx.doi.org/10.1073/pnas.1525564113.

Weddington, N., Stuy, A., Hiratani, I., Ryba, T., Yokochi, T., and Gilbert, D.M. (2008). ReplicationDomain: a visualization tool and comparative database for genome-wide replication timing data. BMC Bioinformatics 9, 530.

Yan, C.T., Kaushal, D., Murphy, M., Zhang, Y., Datta, A., Chen, C., Monroe, B., Mostoslavsky, G., Coakley, K., Gao, Y., et al. (2006). XRCC4 suppresses medulloblastomas with recurrent translocations in p53-deficient mice. Proc. Natl. Acad. Sci. USA 103, 7378–7383.

Zang, C., Schones, D.E., Zeng, C., Cui, K., Zhao, K., and Peng, W. (2009). A clustering approach for identification of enriched domains from histone modification ChIP-Seq data. Bioinformatics 25, 1952–1958.

Zarrin, A.A., Del Vecchio, C., Tseng, E., Gleason, M., Zarin, P., Tian, M., and Alt, F.W. (2007). Antibody class switching mediated by yeast endonuclease-generated DNA breaks. Science 315, 377–381.

Zhang, Y., McCord, R.P., Ho, Y.J., Lajoie, B.R., Hildebrand, D.G., Simon, A.C., Becker, M.S., Alt, F.W., and Dekker, J. (2012). Spatial organization of the mouse genome and its role in recurrent chromosomal translocations. Cell 148, 908–921.

Zhu, C., Mills, K.D., Ferguson, D.O., Lee, C., Manis, J., Fleming, J., Gao, Y., Morton, C.C., and Alt, F.W. (2002). Unrepaired DNA breaks in p53-deficient cells lead to oncogenic gene amplification subsequent to translocations. Cell 109, 811–821.

Supplemental Figures

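[Figure S1 image, panels A–C. (A) Schematic of bait break-site joining outcomes by quadrant: excision circles, upstream inversion, resection, deletion, downstream inversion, and dicentric junctions, in + and – orientation. (B) Junction distribution around the Chr12-sgRNA-1 bait break site (Chr12:12,990,861-13,010,661; scale bar, 2 kb); quadrant percentages shown are 5.9, 16.7, 23.7, and 53.7%. (C) Dot plots of junction number versus chromosomal position (Cen to Tel) for Chr12 and Chr16; Npas3, Lsamp, and an off-target (OT) site are marked.]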

Figure S1. Analysis of Genome-wide DSBs in NSPCs, Related to Figure 1

(A) Illustration of bait break-site DSB joining outcomes in each quadrant. Blue arrowhead indicates location of bait break site (dashed gray line); green arrowhead, HTGTS primer.

(B) Distribution of HTGTS junctions around the bait break site in Chr12-sgRNA-1 libraries (normalized to 20,000 total junctions) from Xrcc4−/−p53−/− NSPCs; 11,584 junctions mapped within 10 kb of the bait break site. Relative percentages of types of joining outcomes per quadrant (as illustrated in A) are indicated. Junctions in centromere-to-telomere orientation (+) are in blue, and junctions in telomere-to-centromere orientation (–) are in red.

(C) Representative dot plots showing DSB distribution across the indicated chromosomes separated into 1-Mb bins. Orientation of junctions is indicated by plus and minus symbols. Panels show a representative Chr12-sgRNA-1 break-site chromosome and Chr16 from the same experiment in Xrcc4−/−p53−/− NSPCs. A dashed line indicates the bait break site; green arrowhead denotes HTGTS primer. Library size was normalized as described in (B). Cas9:sgRNA off-target (OT) and RDCs are highlighted by blue rectangles.
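The 1-Mb binning by orientation behind dot plots such as those in (C) can be sketched as follows. This is a minimal illustration and not the HTGTS pipeline used in the study: the function names, the (chromosome, position, orientation) tuple layout, and the example coordinates are all assumptions, and simple proportional scaling stands in for whatever normalization to 20,000 junctions was actually applied.

```python
from collections import Counter

BIN_SIZE = 1_000_000  # 1-Mb bins, as in panel (C)


def bin_junctions(junctions, bin_size=BIN_SIZE):
    """Count junctions per (chromosome, bin index, orientation).

    `junctions` is an iterable of (chrom, position, orientation) tuples,
    with orientation "+" (centromere-to-telomere) or "-"
    (telomere-to-centromere). This layout is hypothetical.
    """
    counts = Counter()
    for chrom, pos, orientation in junctions:
        counts[(chrom, pos // bin_size, orientation)] += 1
    return counts


def scale_to_library_size(counts, target_total=20_000):
    """Scale bin counts proportionally so the library sums to target_total."""
    total = sum(counts.values())
    if total == 0:
        return {}
    factor = target_total / total
    return {key: n * factor for key, n in counts.items()}


# Made-up junctions for illustration only
example = [
    ("chr12", 54_100_000, "+"),
    ("chr12", 54_900_000, "-"),
    ("chr12", 54_950_000, "+"),
    ("chr16", 40_500_000, "+"),
]
binned = bin_junctions(example)
scaled = scale_to_library_size(binned)
```

Keeping orientation in the bin key allows + and – junctions to be plotted as separate tracks, as in the figure.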

[Figure S2 image, panels A–G. Content of panels A and B is reproduced below; panels C–G are bar charts.]

(A) Flow chart: Identification of Replication Stress-induced RDCs

HTGTS libraries: remove junctions within 10 kb of bait break-site

Compare APH-treated and DMSO-treated libraries: identification of APH-induced translocation clusters by SICER

Filter 1: Determine the biological reproducibility of translocation enrichment across libraries

Filter 2: Identify genomic regions containing reproducible translocation clusters identified from at least two bait DSB locations

Filter 3: Test for junction enrichment over surrounding local junction density

RDCs

(B) Table of RDC regions

Name       Chrom.  Start        End          Associated gene
Region 01  chr1    61,920,000   62,579,999   Pard3b
Region 02  chr10   48,810,000   49,409,999   Grik2
Region 03  chr12   54,060,000   55,169,999   Npas3
Region 04  chr12   67,560,000   68,309,999   Mdga2
Region 05  chr12   90,030,000   91,469,999   Nrxn3
Region 06  chr14   117,420,000  118,169,999  Gpc6
Region 07  chr15   30,090,001   30,959,999   Ctnnd2
Region 08  chr15   41,280,000   41,819,999   Oxr1
Region 09  chr15   47,280,000   48,779,999   Csmd3
Region 10  chr16   5,550,000    7,709,999    Rbfox1
Region 11  chr16   66,480,000   67,619,999   Cadm2
Region 12  chr17   90,540,001   91,229,999   Nrxn1
Region 13  chr19   30,810,001   31,529,999   Prkg1
Region 14  chr4    97,440,000   97,859,999   Nfia
Region 15  chr5    19,050,000   19,559,999   Magi2
Region 16  chr5    141,900,001  142,349,999  Sdk1
Region 17  chr6    36,540,000   36,959,999   Ptn
Region 18  chr6    77,190,001   77,759,999   Ctnna2
Region 19  chr8    16,260,001   16,829,999   Csmd1
Region 20  chr8    117,090,000  117,599,999  Wwox
Region 21  chr8    120,750,000  121,559,999  Cdh13
Region 22  chr9    28,860,000   29,939,999   Ntm
Region 23  chr18   71,850,001   72,359,999   Dcc
Region 24  chr1    25,550,001   25,829,999   Bai3
Region 25  chr16   28,530,001   28,649,999   Fgf12
Region 26  chr12   38,580,001   39,449,999   Dgkb
Region 27  chr16   40,200,000   41,849,999   Lsamp

[Panels C–G: bar charts of translocation density in APH- versus DMSO-treated libraries for the Chr15-Myc-sgRNA, Chr12-sgRNA-1, and Chr16-sgRNA-2 baits. (C) Off-target sites labeled chr11 OT, chr17 OT, chr12 OT, and chr15 OT. (D) Lsamp. (E–G) Bai3, Pard3b, Grik2, Dgkb, Npas3, Mdga2, Nrxn3, Gpc6, Ctnnd2, Oxr1, Csmd3, Rbfox1, Fgf12, Cadm2, Nrxn1, Dcc, Prkg1, Nfia, Magi2, Sdk1, Ptn, Ctnna2, Csmd1, Wwox, Cdh13, and Ntm; APH, n = 3 or 4 libraries; DMSO, n = 3 libraries.]

Figure S2. Identification of Replication Stress-Induced RDCs, Related to Experimental Procedures

(A) Flow chart illustrating the identification of RDCs. See Supplemental Experimental Procedures for additional details.

(B) Overview of SICER-identified RDC regions; genomic coordinates and names of associated genes are listed.

(C) Translocation densities of Cas9:sgRNA off-target sites in APH- or DMSO-treated Xrcc4−/−p53−/− NSPCs transfected with either Chr15-Myc-sgRNA, Chr12-sgRNA-1, or Chr16-sgRNA-2. Translocation density was calculated within ± 1 kb of the off-target DSB site and expressed per 1,000 junctions, per Mb of off-target region, in each library. Data represent mean and SEM; *p < 0.05, **p < 0.01 (unpaired one-tailed t test).

(D) Translocation densities of Lsamp in APH- or DMSO-treated Xrcc4−/−p53−/− NSPCs transfected with Chr15-Myc-sgRNA, Chr12-sgRNA-1, or Chr16-sgRNA-2. Translocation densities were calculated per 1,000 junctions per Mb in each library. Data represent mean and SEM; *p < 0.05, **p < 0.01 (unpaired one-tailed t test).

(E–G) Translocation densities of replication stress-induced RDC-genes in Xrcc4−/−p53−/− NSPCs. HTGTS junction density within each gene in individual libraries from Chr15-Myc-sgRNA (E), Chr12-sgRNA-1 (F), or Chr16-sgRNA-2 (G) bait DSBs was determined. Number of libraries analyzed for each condition is listed. Genes located on the bait-site chromosome are boxed. Data represent mean and SEM; *p < 0.05, **p < 0.01 (unpaired one-tailed t test).
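As a worked illustration of the units used in this legend (junctions per 1,000 library junctions, per Mb of region), translocation density and its mean and SEM across replicate libraries might be computed as below. This is a sketch under stated assumptions: the function names and all counts are hypothetical, and the t test reported in the legend is not reproduced.

```python
from math import sqrt
from statistics import mean, stdev


def translocation_density(n_in_region, region_bp, n_library_total):
    """Junctions in a region per 1,000 library junctions per Mb of region."""
    region_mb = region_bp / 1_000_000
    return (n_in_region / n_library_total) * 1_000 / region_mb


def mean_and_sem(values):
    """Mean and standard error of the mean across replicate libraries."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical replicate libraries: (junctions in region, total junctions)
region_bp = 2_000_000
aph_libraries = [(40, 20_000), (80, 20_000), (120, 20_000)]
aph_density = [
    translocation_density(n, region_bp, total) for n, total in aph_libraries
]
aph_mean, aph_sem = mean_and_sem(aph_density)
```

Dividing by both library size and region length is what makes densities comparable between libraries of different depth and between genes of different length.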

[Figure S3 image, panels A–D: junction tracks in + and – orientation for the Chr15, Chr12, and Chr16 baits; ordinate, junctions per bin; scale bars, 1 Mb (400 kb for Ptn). Genomic windows listed under the panels: Ptn, chr6:35,912,186-37,514,837; Ctnna2, chr6:74,829,631-79,931,661; Pard3b, chr1:59,683,398-64,690,858; Cdh13, chr8:118,805,655-123,849,348; Nfia, chr4:95,246,634-99,787,567; Ntm, chr9:26,801,549-31,772,714; Csmd1, chr8:13,890,545-19,537,385; Gpc6, chr14:115,322,519-120,380,751; Magi2, chr5:16,730,864-22,212,609; Nrxn1, chr17:88,430,984-93,494,142; Grik2, chr10:46,817,269-51,510,560; Prkg1, chr19:28,636,977-33,841,523; Sdk1, chr5:139,715,488-144,691,745; Wwox, chr8:114,961,552-119,878,612; Nrxn3, chr12:88,030,948-93,575,373; Cadm2, chr16:64,653,666-69,623,153; Rbfox1, chr16:3,882,886-9,414,573; Mdga2, chr12:65,565,046-70,325,536; Ctnnd2, chr15:28,100,348-32,961,098; Dgkb, chr12:37,000,000-41,000,000; Fgf12, chr16:26,156,670-30,755,329; Csmd3, chr15:45,410,184-50,625,535; Npas3, chr12:52,347,664-57,175,162; Oxr1, chr15:39,277,028-43,694,593; Dcc, chr18:69,984,471-73,501,886; Bai3, chr1:23,122,321-27,888,552.]

Figure S3. Replication Stress-Induced RDCs Form Translocations in + and – Orientation in XRCC4/p53-Deficient NSPCs, Related to Figure 4

Individual panels show APH-induced translocations in + and – orientation.

(A) Clusters found with all three bait DSBs (Chr15, Chr15-Myc-sgRNA; Chr12, Chr12-sgRNA-1; Chr16, Chr16-sgRNA-2).

(B) Clusters found from Chr12-sgRNA-1 and Chr16-sgRNA-2 bait DSBs.

(C) Clusters found from Chr15-Myc-sgRNA and Chr16-sgRNA-2 bait DSBs.

(D) Clusters found from Chr12-sgRNA-1 and Chr15-Myc-sgRNA bait DSBs.

Clusters are highlighted in yellow and genomic coordinates are listed under each panel. The ordinate shows junctions per bin (40-kb bins for Ptn; 100-kb bins for all others).

[Figure S4 image, panels A–C: HTGTS junction tracks in + and – orientation for the indicated baits (Chr15, Chr12, Chr16), with genomic coordinates under each panel; scale bars, 1 Mb (400 kb for Ptn). Genes shown: Pard3b, Nfia, Magi2, Ntm, Gpc6, Csmd1, Wwox, Sdk1, Prkg1, Grik2, Dcc, Ptn, Rbfox1, Ctnnd2, Oxr1, Npas3, Mdga2, Bai3, Fgf12, Dgkb, and Nrxn1.]

Figure S4. Replication Stress-Induced RDCs Identified from Two or Three Different Bait DSBs in XRCC4/p53-Deficient NSPCs, Related to Figure 5

HTGTS junctions in DMSO- or APH-treated libraries prepared from the indicated bait DSBs (Chr15, Chr15-Myc-sgRNA; Chr12, Chr12-sgRNA-1; Chr16, Chr16-sgRNA-2) are shown.

(A) Additional RDCs identified from all three bait DSBs.

(B and C) RDCs identified from two bait DSBs; (C) spatial proximity enhances RDC detection.

Genomic regions corresponding to SICER-identified DSB clusters (see the Supplemental Experimental Procedures and Figure S2A) are highlighted in yellow; RefGene tracks are shown for reference. To allow for direct comparison, junction numbers plotted were normalized between panels, as described for Figure 4.

[Figure S5 image, panels A–G. (A) Circos plots (chromosomes 1–19) for DMSO- and APH-treated Chr15-Myc-sgRNA libraries; scale 5, 50, 500, 5,000. (B–D) HTGTS junction tracks in + and – orientation for the Chr15 and Chr12 baits; genes shown: Mdga2, Ctnnd2, Csmd1, Magi2, Wwox, Grik2, Dgkb, Npas3, and Dcc; scale bars, 1 Mb. (E and F) Bar charts of translocation density, APH versus DMSO (E, n = 4 per condition; F, n = 5 per condition), for Grik2, Dgkb, Npas3, Mdga2, Nrxn3, Ctnnd2, Csmd3, Lsamp, Nrxn1, Dcc, Magi2, Ctnna2, Csmd1, and Wwox. (G) Percentage of junctions versus microhomology (Direct, 1–10 bp) for inter-chromosomal translocations, intra-chromosomal translocations, and RDC-gene translocations from the Chr15-Myc-sgRNA and Chr12-sgRNA-1 baits; wild type versus Xrcc4−/−p53−/−.]

Figure S5. Replication Stress-Induced RDCs Identified from Two Different Bait DSBs in Wild-Type NSPCs, Related to Figure 6

(A) Circos plots of genome-wide HTGTS junctions from DMSO- (left) or APH-treated (right) Chr15-Myc-sgRNA-expressing wild-type NSPCs. Red arrowhead denotes bait DSB site. Red stars indicate Npas3 (Chr12) and Lsamp (Chr16) RDCs. Lines indicate locations of APH-induced RDCs: the 6 red lines indicate RDCs detected by both the Chr12 (Chr12-sgRNA-1) and the Chr15 (Chr15-Myc-sgRNA) baits; the 6 blue lines indicate RDCs detected only by the Chr15-Myc-sgRNA bait. Plots are normalized to 13,911 junctions per condition.

(B–D) HTGTS junctions in DMSO- or APH-treated libraries prepared from the indicated bait DSBs in wild-type NSPCs, as described for Figure S4; (B) additional RDCs identified from both bait DSBs; (C) RDCs identified from one bait DSB; (D) spatial proximity enhances RDC detection in wild-type NSPCs. Genomic regions corresponding to SICER-identified DSB clusters are in yellow. Junction numbers plotted were normalized between panels to allow direct comparison.

(E and F) Translocation densities of replication stress-induced RDC-genes identified in wild-type NSPCs; junction density within each gene in the indicated number of individual experiments from Chr12-sgRNA-1 (E) or Chr15-Myc-sgRNA (F) bait DSBs is shown as in Figure S2D. Genes located on the bait-site chromosome are boxed by dotted lines. Data represent mean and SEM; *p < 0.05, **p < 0.01 (unpaired one-tailed t test).

(G) Repair junction profiles of the indicated classes of HTGTS junctions prepared from either Chr15-Myc-sgRNA or Chr12-sgRNA-1 bait DSBs in wild-type (black line) or Xrcc4−/−p53−/− (red line) NSPCs. Joins with 0 (direct joins) to 10 bp of junctional MH were identified and plotted as a percentage of total junctions without insertions. The number of libraries (n) examined per genotype is indicated. Total junction numbers analyzed: for Chr15-Myc-sgRNA experiments, (i) inter-chromosomal: 7,596 wild-type and 5,775 Xrcc4−/−p53−/− junctions; (ii) intra-chromosomal: 719 wild-type and 943 Xrcc4−/−p53−/− junctions; (iii) RDC genes: 406 wild-type and 825 Xrcc4−/−p53−/− junctions. For Chr12-sgRNA-1 experiments, (i) inter-chromosomal: 2,982 wild-type and 7,871 Xrcc4−/−p53−/− junctions; (ii) intra-chromosomal: 310 wild-type and 1,880 Xrcc4−/−p53−/− junctions; (iii) RDC genes: 416 wild-type and 1,312 Xrcc4−/−p53−/− junctions. Data represent mean ± SEM; ****p < 0.0001; ***p < 0.001; **p < 0.01; *p < 0.05 (unpaired two-tailed t test).
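The repair junction profile in (G) is, in essence, a histogram of microhomology (MH) lengths expressed as a percentage of junctions without insertions. A minimal sketch follows; the function name and the (MH length, insertion length) input layout are hypothetical, and it assumes that junctions with more than 10 bp of MH remain in the denominator, which the legend does not specify.

```python
from collections import Counter


def microhomology_profile(junctions, max_mh=10):
    """Percentage of junctions with 0..max_mh bp of junctional microhomology.

    `junctions` is a list of (mh_bp, insertion_bp) pairs (hypothetical layout).
    Junctions with insertions are excluded; 0 bp of MH is a direct join.
    Assumption: junctions with more than max_mh bp of MH stay in the
    denominator but are not reported.
    """
    no_insertion = [mh for mh, ins in junctions if ins == 0]
    total = len(no_insertion)
    if total == 0:
        return {mh: 0.0 for mh in range(max_mh + 1)}
    counts = Counter(no_insertion)
    return {mh: 100.0 * counts.get(mh, 0) / total for mh in range(max_mh + 1)}


# Made-up junctions: (MH length in bp, insertion length in bp)
example = [(0, 0), (0, 0), (1, 0), (2, 0), (0, 3), (4, 0)]
profile = microhomology_profile(example)
```

In this toy input, the one junction carrying a 3-bp insertion is dropped before percentages are taken, so the five remaining junctions define the denominator.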

[Figure S6 image, panels A–D. (A) Bar chart of translocation density for random gene groups R1–R5 and for Bai3, Pard3b, Grik2, Dgkb, Npas3, Mdga2, Nrxn3, Gpc6, Ctnnd2, Oxr1, Csmd3, Nrxn1, Dcc, Prkg1, Nfia, Magi2, Sdk1, Ptn, Ctnna2, Csmd1, Wwox, Cdh13, and Ntm. (B) Bar chart of translocation density for R7, Rbfox1, Fgf12, Lsamp, and Cadm2. (C) Replication timing tracks (early to late) for the following windows: Ptn, chr6:36,116,111-37,471,108; Wwox, chr8:116,226,199-118,613,965; Gpc6, chr14:116,728,235-118,997,390; Nfia, chr4:96,749,372-98,159,768; Npas3, chr12:53,662,517-55,860,308; Pard3b, chr1:60,848,438-63,422,604; Ntm, chr9:28,162,692-30,411,571; Ctnnd2, chr15:29,284,587-31,736,278; Cdh13, chr8:120,133,618-122,521,384; Sdk1, chr5:140,460,335-143,225,071; Lsamp, chr16:37,115,623-44,296,431; Nrxn3, chr12:87,177,699-94,326,868; Magi2, chr5:17,973,604-20,738,340; Csmd1, chr8:12,572,541-19,723,608; Rbfox1, chr16:3,506,028-10,634,165; Mdga2, chr12:66,658,207-69,055,998; Cadm2, chr16:65,973,815-68,237,883; Oxr1, chr15:40,172,839-42,648,684; Prkg1, chr19:30,127,419-32,351,081; Ctnna2, chr6:76,144,911-78,454,906; Csmd3, chr15:46,759,367-49,111,058; Nrxn1, chr17:89,741,375-92,148,641; Grik2, chr10:47,939,425-50,295,551; Bai3, chr1:24,124,320-26,886,552; Dgkb, chr12:37,607,291-40,359,997; Dcc, chr18:70,413,285-73,510,723; Fgf12, chr16:27,158,669-29,753,329. (D) Replication timing ratio, log2(Early/Late), on a scale of −2.0 to 2.0 for the 27 RDC genes.]

Figure S6. DNA Replication Stress-Induced RDCs and Replication Timing, Related to Figure 7

(A) Five groups of 50 transcribed 15–25 kb genes were randomly selected (R1–R5, as described in Figure 7E), and junction numbers within the concatenated regions were calculated from three independent Chr16 bait DSB libraries (gray bars). Similarly, junction numbers within the indicated inter-chromosomal RDCs were determined. Translocation density is indicated as junctions per Mb.

(B) Translocation densities of concatenated average-size (15–25 kb) active genes on Chr16 (R7, n = 50) and intra-chromosomal Chr16 RDCs. Data represent mean and SEM from four independent experiments.

(C) Panels showing the replication timing ratio (log2[early/late]) of the indicated genes and surrounding genomic regions in three sets of murine neural progenitor Repli-chip data (Hiratani et al., 2008). RefGene tracks (red) are shown for reference.

(D) Replication timing of orthologs of the identified RDC genes in human neural progenitors. Average median replication timing ratios from two Repli-chip datasets (Ryba et al., 2010) are shown.

[Figure S7 image: ten genome-browser panels, each spanning 17 Mb, showing CNVs together with the human orthologs of the following RDC genes: Csmd1 (chr8), Csmd3 (chr8), Ctnna2 (chr2), Fgf12 (chr3), Magi2 (chr7), Mdga2 (chr14), Npas3 (chr14), Nrxn1 (chr2), and Nrxn3 (chr14; two panels).]

Figure S7. Human RDC Orthologs and Neuronal Copy Number Variations, Related to Figure 7

Human orthologs of 9 murine NSPC RDC-genes overlap with 10 of the 133 CNVs smaller than 20 Mb identified by single-cell sequencing of 110 human postmortem frontal cortex neurons from three individuals (McConnell et al., 2013). CNVs are indicated in red (deletions) or green (duplications); human RDC orthologs are in blue. 17 Mb of genomic sequence of the indicated CNV-containing chromosome are shown in each panel. The ten CNVs ranged in size from 5.8 to 15.4 Mb.
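The overlap reported here amounts to an interval-intersection test between gene coordinates and CNV coordinates on the same chromosome. The sketch below illustrates that test only; the function names, data layout, and every coordinate in the example are made up and are not the coordinates of any real gene or CNV from the study.

```python
def intervals_overlap(a_start, a_end, b_start, b_end):
    """True if closed intervals [a_start, a_end] and [b_start, b_end] intersect."""
    return a_start <= b_end and b_start <= a_end


def genes_overlapping_cnvs(genes, cnvs, max_cnv_bp=20_000_000):
    """List (gene, cnv_type, cnv_start, cnv_end) for every gene/CNV overlap.

    `genes` maps a gene name to (chrom, start, end); `cnvs` is a list of
    (chrom, start, end, cnv_type) tuples. Both layouts are hypothetical.
    CNVs of max_cnv_bp or larger are skipped ("smaller than 20 Mb").
    """
    hits = []
    for gene, (g_chrom, g_start, g_end) in genes.items():
        for c_chrom, c_start, c_end, c_type in cnvs:
            if c_end - c_start + 1 >= max_cnv_bp:
                continue
            if g_chrom == c_chrom and intervals_overlap(
                g_start, g_end, c_start, c_end
            ):
                hits.append((gene, c_type, c_start, c_end))
    return hits


# Made-up coordinates for illustration only
genes = {
    "GENE_A": ("chr2", 50_000_000, 51_200_000),
    "GENE_B": ("chr8", 3_000_000, 5_000_000),
}
cnvs = [
    ("chr2", 45_000_000, 52_000_000, "deletion"),      # 7 Mb, overlaps GENE_A
    ("chr8", 10_000_000, 16_000_000, "duplication"),   # no overlap with GENE_B
    ("chr8", 1_000_000, 30_000_000, "deletion"),       # 29 Mb, filtered out
]
hits = genes_overlapping_cnvs(genes, cnvs)
```

The size filter is applied before the overlap test, so a very large CNV that happens to span a gene does not count as a hit.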


Cell, Volume 164

Supplemental Information

Long Neural Genes Harbor Recurrent DNA Break

Clusters in Neural Stem/Progenitor Cells

Pei-Chi Wei, Amelia N. Chang, Jennifer Kao, Zhou Du, Robin M. Meyers, Frederick W. Alt, and Bjoern Schwer


SUPPLEMENTAL EXPERIMENTAL PROCEDURES

NSPC Isolation and Culture

NSPCs were prepared and cultured as described (Brewer and Torricelli, 2007). Passage 0 dissociated DIV (days-in-vitro) 0 cells were plated in ultra-low attachment 6-well plates (Corning) at a density of 4 × 10⁵ cells per mL. On DIV4.5, cultures were dissociated into single cells, followed by nucleofection 2 h later. To induce bait DSB generation via GR-I-SceI, 10 µM triamcinolone acetonide (TA, Sigma) was added on DIV5.5. Cells were collected for GRO-seq or HTGTS on DIV9. Replication stress was induced by addition of 0.5 µM aphidicolin (APH, Sigma) for 72 h; cells were then fed with fresh medium, which reduced the APH concentration to 0.25 µM, and incubated for another 24 h before collection on DIV9.

Cas9:sgRNA-mediated DSB Induction

To induce Cas9:sgRNA-mediated DSBs, 5 × 10⁶ dissociated DIV5 NSPCs were nucleofected with 5 µg of Cas9:sgRNA expression vector using the Mouse Neural Stem Cell Nucleofector reagent (VPG-1004, Lonza), according to the manufacturer's instructions. Cas9:sgRNA expression vectors were constructed by ligating annealed oligonucleotides (see Table S7 for details) into BbsI-digested pSpCas9(BB) (Addgene plasmid 42230; Cong et al., 2013).

HTGTS

Libraries (fragment size 500–1,000 bp) were purified and sequenced (Illumina MiSeq). FASTQ output files were de-multiplexed, and unique reads aligned to genome build mm9/NCBI37 by Bowtie2 (Langmead and Salzberg, 2012) were processed through a custom HTGTS pipeline (Frock et al., 2015). See Table S7 for details on junction yield per experiment.

HTGTS Junction Enrichment Analysis

Unbiased, genome-wide identification of RDCs was performed by SICER (Zang et al., 2009) analysis of individual HTGTS libraries (excluding junctions within 5 Mb of the bait break-site) from untreated cells with the following parameters: SICER-rb.sh; species, mm9; redundancy threshold, 5; window, 30,000 bp; fragment size, 1; effective genome fraction, 0.74; gap size, 90,000 bp; E value, 0.1. The E-score cutoff was 50 for the break-site chromosome and 20 for all other chromosomes. SICER clusters had to be present in at least two biological replicate libraries to be considered RDCs.

Identification of APH-induced DSB clusters was performed by SICER analysis of HTGTS data sets from control (DMSO) or treated (APH) cells using the following settings: SICER.sh; species, mm9; redundancy threshold, 5; window, 30,000 bp; fragment size, 1; effective genome fraction, 0.74; gap size, 90,000 bp; FDR, 0.01. Only clusters with a ≥1.5-fold increase in junction density in individual libraries from APH-treated cells (P < 0.05, one-tailed unpaired t-test) were further considered. Among the shared clusters identified from different bait DSBs, only high-confidence clusters that showed ≥1.5-fold increased translocation density over surrounding genomic areas of identical size and had been sampled ≥7 times were further considered.

For custom MACS-based, unbiased, genome-wide verification of significantly enriched junction clusters, HTGTS junctions were binned into 2.5-Mb regions, and Poisson lambda values (λ; $\lambda = n_{\text{junctions}} / \text{region size}_{\text{Mb}}$) were computed for each bin (λr) and three surrounding regions: the whole genome without the break-site chromosome (λ1), and the region extended by 1.5× (λ2) or 2.5× (λ3) on either side of the bin center. P-values for enrichment of λr against the maximum value among λ1, λ2, and λ3 were determined from the Poisson distribution; P < 0.05 was considered significant.
Significance of junction enrichment in replication stress-induced clusters was assessed as above, but instead of 2.5-Mb bins, genomic coordinates of SICER clusters and surrounding regions were used to compute λ values.
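As an illustration of the Poisson test described above, the following Python sketch computes an enrichment P-value for a single bin. It assumes that the P-value is the upper tail P(X ≥ observed count) under an expected count of max(λ1, λ2, λ3) × bin size; the function names and the example counts are hypothetical and are not taken from the authors' pipeline.

```python
import math


def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), summing the tail directly."""
    if k <= 0:
        return 1.0
    if lam <= 0:
        return 0.0
    total = 0.0
    i = k
    while i < k + 100000:  # hard cap as a safety net
        # log-space pmf avoids overflow in the factorial
        term = math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
        total += term
        # past the mode the terms shrink; stop once they no longer matter
        if i > lam and term <= 1e-16 * total:
            break
        i += 1
    return min(total, 1.0)


def junction_enrichment_p(n_bin, bin_size_mb, backgrounds):
    """P-value for a bin given background regions.

    backgrounds: iterable of (n_junctions, region_size_mb) pairs,
    e.g. the regions used for lambda_1, lambda_2 and lambda_3.
    """
    lam_max = max(n / size for n, size in backgrounds)  # junctions per Mb
    expected = lam_max * bin_size_mb
    return poisson_upper_tail(n_bin, expected)


if __name__ == "__main__":
    # Hypothetical counts: 20 junctions in a 2.5-Mb bin.
    backgrounds = [(2000, 2500.0), (9, 7.5), (14, 12.5)]
    p = junction_enrichment_p(20, 2.5, backgrounds)
    print(f"P = {p:.3g}; significant at 0.05: {p < 0.05}")
```

Summing the tail directly, rather than computing one minus the cumulative probability, keeps very small P-values from rounding to zero.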

Identification of Recurrent Translocations to Cas9:sgRNA Off-target Sites

Translocations between Cas9:sgRNA on- and off-target DSBs were identified as described (Frock et al., 2015) by MACS2 (Zhang et al., 2008; see also http://github.com/taoliu/MACS) with the following settings: --keep-dup all --nomodel --extsize 2000 --llocal 10000000. Hotspots ≥100 kb from the bait DSB break-site with an FDR-adjusted P-value threshold of 1 × 10⁻⁹ were considered translocations between Cas9:sgRNA on- and off-target DSBs if they shared >30% sequence with the on-target sgRNA-binding site and formed focal translocations in plus and minus orientation in more than one biological replicate library.
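The hotspot criteria above can be sketched in Python as follows. The text does not specify how shared sequence was computed; this sketch assumes position-wise identity between equal-length sequences, and all function and parameter names are hypothetical. The example sequences are the Chr15-Myc-sgRNA on-target site and its first off-target site listed in Table S7.

```python
def fraction_shared(on_target, candidate):
    """Position-wise identity between two equal-length sequences (assumed metric)."""
    if len(on_target) != len(candidate):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(on_target.upper(), candidate.upper()))
    return matches / len(on_target)


def is_off_target_candidate(on_target, candidate, distance_from_bait_bp,
                            adjusted_p, n_replicates_with_focal_junctions):
    """Apply the four criteria stated in the text to one MACS2 hotspot."""
    return (
        distance_from_bait_bp >= 100_000              # >=100 kb from bait break-site
        and adjusted_p <= 1e-9                        # FDR-adjusted P-value threshold
        and fraction_shared(on_target, candidate) > 0.30
        and n_replicates_with_focal_junctions > 1     # plus and minus orientation,
    )                                                 # in more than one replicate


if __name__ == "__main__":
    on_target = "GCCCTATTTCATCTGCGACG"
    off_target = "GCCCTATTTCACCTGCAACA"
    print(fraction_shared(on_target, off_target))  # 17 of 20 positions match
```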

Estimation of Translocation Rate and DSB Frequency

To estimate translocation frequency, we first determined the yield of unique junctions per amount of HTGTS library input DNA (assuming ~6 pg DNA per diploid mouse cell; 2.73 × 10⁹ bp × 2 (diploid) × 660 Da (average MW per bp) × 1.67 × 10⁻¹² pg per Da) by calculating the junction recovery rate:

$$\text{junction recovery rate (1 junction per } x \text{ genomes)} = \frac{\text{input DNA (pg)}}{6\ \text{pg} \times n_{\text{unique junctions}}}$$

Translocation rate (i.e., translocation number in a given genomic region per cell) was then calculated, factoring in the approximate fraction of cells with bait DSBs (0.5 for NSPCs; 0.8 for activated B cells):

$$\text{translocation rate} = \frac{n_{\text{junctions in region of interest}} \times \text{recovery rate}}{n_{\text{cells with bait DSBs}}}$$

Multiplying the translocation rate by 100 yielded the percentage of cells containing translocations within a given region. Frequencies of widespread DSBs were calculated based on the observed translocation rate on the cis (break-site) chromosome, excluding 500 kb on each side of the bait break-site. Frequencies of DSBs within Lsamp or Npas3, or within RDCs, were calculated based on the observed translocation rate of each of these categories on the break-site chromosome and multiplied by the total number of chromosomes (40) to derive an estimate of DSB number per cell. For direct comparison of DSB frequencies of Lsamp in NSPCs and Bcl-6 in B cells (both of which are located on Chr16 while the bait break-site is located on Chr15), translocation rates were directly equated to number of DSBs per cell.
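The calculation above can be reproduced with a short Python sketch. It assumes that the number of cells with bait DSBs equals the bait-DSB fraction multiplied by the number of diploid genomes in the input DNA (input pg / 6 pg); the function names are hypothetical. The example uses the ATM-/- NSPC values for widespread DSBs listed in Table S2 (160 µg input, 16,476 unique junctions, 2,947 junctions on Chr15, bait-DSB fraction 0.5).

```python
PG_PER_DIPLOID_CELL = 6.0   # approximate DNA mass per diploid mouse cell (pg)
CHROMOSOMES_PER_CELL = 40   # total chromosome number used in the text


def diploid_genome_mass_pg(haploid_bp=2.73e9, mw_per_bp=660.0, pg_per_dalton=1.67e-12):
    """Mass of one diploid genome in pg, as derived in the text."""
    return haploid_bp * 2 * mw_per_bp * pg_per_dalton


def junction_recovery_rate(input_dna_pg, n_unique_junctions):
    """Genomes of input DNA per recovered unique junction."""
    return input_dna_pg / (PG_PER_DIPLOID_CELL * n_unique_junctions)


def translocation_rate(n_region_junctions, input_dna_pg, n_unique_junctions, bait_fraction):
    """Translocations in a region per cell carrying a bait DSB."""
    n_cells = input_dna_pg / PG_PER_DIPLOID_CELL
    n_cells_with_bait = bait_fraction * n_cells   # assumption, see lead-in
    recovery = junction_recovery_rate(input_dna_pg, n_unique_junctions)
    return n_region_junctions * recovery / n_cells_with_bait


def dsbs_per_cell(rate):
    """Scale a cis-chromosome rate to all chromosomes."""
    return rate * CHROMOSOMES_PER_CELL


if __name__ == "__main__":
    rate = translocation_rate(2947, 160e6, 16476, 0.5)  # 160 ug = 160e6 pg
    print(f"{100 * rate:.1f}% of cells, {dsbs_per_cell(rate):.1f} DSBs per cell")
    # prints "35.8% of cells, 14.3 DSBs per cell"
```

Under this assumption the input DNA amount cancels, so the rate reduces to the number of junctions in the region divided by the product of the unique junction count and the bait-DSB fraction.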

DSB Repair Junction Signature Analysis

Junctional repair signatures in HTGTS libraries were analyzed at the nucleotide level by calculating the difference between the end coordinate of the bait alignment and the start coordinate of the prey alignment. In this calculation, a value of 0 corresponds to a "direct" junction, whereas negative values represent short nucleotide homologies or "microhomologies" (MHs); positive values indicate junctional nucleotide insertions.
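A minimal Python sketch of this classification is given below. It assumes read-based coordinates in which the bait end is the position immediately after the last bait-aligned base and the prey start is the position of the first prey-aligned base, so that a difference of 0 means the two alignments abut; the actual coordinate convention of the HTGTS pipeline is not stated here, and the function names are hypothetical.

```python
def classify_junction(bait_end, prey_start):
    """Return (signature, length in nt) for one junction."""
    offset = prey_start - bait_end
    if offset == 0:
        return ("direct", 0)
    if offset < 0:
        # bait and prey alignments overlap on the read: shared bases
        return ("microhomology", -offset)
    # unaligned bases between bait and prey: inserted nucleotides
    return ("insertion", offset)


def signature_percentages(junctions):
    """Percentage of junctions per signature; junctions is a list of (bait_end, prey_start)."""
    counts = {"direct": 0, "microhomology": 0, "insertion": 0}
    for bait_end, prey_start in junctions:
        counts[classify_junction(bait_end, prey_start)[0]] += 1
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: 100.0 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    example = [(100, 100), (100, 97), (100, 98), (100, 102)]  # hypothetical reads
    print(signature_percentages(example))
```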

SUPPLEMENTAL TABLES

Table S1. Relative HTGTS Junction Distribution, Related to Figures 1, 2, and 3.

Relative distribution (%)

| Bait DSB | Genotype | Break-site chromosome: ±500 kb around break-site | Break-site chromosome: >500 kb of break-site | Inter-chromosomal junctions | Npas3 | Lsamp |
|---|---|---|---|---|---|---|
| Chr12-sgRNA-1 | Xrcc4-/-p53-/- | 61.39 | 7.60 | 31.00 | 1.13 # | 0.42 # |
| Chr12-sgRNA-1 | Wild type | 49.83 | 4.94 | 45.24 | 0.21 | 0.16 |
| Chr12-sgRNA-2 | Xrcc4-/-p53-/- | 58.86 | 8.41 | 32.73 | 2.66 # | 0.22 # |
| Chr16-sgRNA-1 | Xrcc4-/-p53-/- | 52.36 | 7.40 | 40.24 | 0.28 # | 1.90 # |
| c-Myc-25×I-SceI | ATM-/- | 39.21 | 17.89 | 42.90 | 0.06 | 0.59 # |
| c-Myc-25×I-SceI | Wild type | 61.66 | 6.16 | 32.18 | 0.07 | 0.99 # |
| c-Myc-25×I-SceI | ATM-/- iABC* | 20.50 | 24.79 | 54.71 | 0.03 | 0.03 |

*Analysis of published ATM-/- B cell HTGTS data (Meng et al., 2014). #Significant junction enrichment.

Table S2. Translocation and DSB Frequency Estimation, Related to Figures 1-6.

Translocation and DSB frequencies of the indicated classes of prey DSBs are shown. See Supplemental Experimental Procedures for details. *Junction number (in parentheses) within stated region; ¶Average junction number per RDC located on the break-site chromosome.

1. DSB rates in Lsamp in NSPCs and DSBs in the AID off-target gene Bcl-6 in activated B cells based on numbers of translocations between I-SceI-mediated bait DSBs on Chr15 to prey DSBs (Lsamp or Bcl-6) on Chr16.

| Genotype (Cell type) | Bait DSB | Input (µg) | Unique Junctions | Junction Number * | Translocation (% of cells) | DSBs per cell |
|---|---|---|---|---|---|---|
| ATM-/- (NSPC) | c-Myc 25×I-SceI | 160 | 16,476 | Lsamp (42) | 0.5 | 0.005 |
| ATM-/- (B cell§) | c-Myc 25×I-SceI | 80 | 42,751 | Bcl-6 (23) | 0.1 | 0.001 |

§ In vitro activated B-cell HTGTS data from Meng et al., 2014.

2. Rates of widespread DSBs across the genome based on numbers of translocations between bait DSBs and prey DSBs on the cis chromosome (excluding junctions within 500 kb of bait break site).

| Genotype (NSPCs) | Bait DSB | Input (µg) | Unique Junctions | Junction Number * | Translocation (% of cells) | DSBs per cell |
|---|---|---|---|---|---|---|
| ATM-/- | c-Myc 25×I-SceI | 160 | 16,476 | 2,947 (Chr15) | 35.8 | 14.3 |
| Xrcc4-/-p53-/- | Chr12-sgRNA-1 | 100 | 20,000 | 1,586 (Chr12) | 15.9 | 6.3 |
| Wild type | Chr12-sgRNA-1 | 320 | 19,674 | 971 (Chr12) | 9.9 | 3.9 |

3. Rates of DSBs within Lsamp or Npas3 in the absence of induced replication stress, based on numbers of translocations between bait DSBs and prey DSBs within Lsamp or Npas3 on the cis chromosome.

| Genotype (NSPCs) | Bait DSB | Input (µg) | Unique Junctions | Junction Number * | Translocation (% of cells) | DSBs per cell |
|---|---|---|---|---|---|---|
| Xrcc4-/-p53-/- | Chr12-sgRNA-2 | 120 | 12,593 | 109 (Npas3) | 1.7 | 0.7 |
| Xrcc4-/-p53-/- | Chr16-sgRNA-1 | 120 | 19,798 | 151 (Lsamp) | 1.5 | 0.6 |

4. Rates of RDC-gene-associated DSBs based on numbers of translocations between bait DSBs and RDCs on the cis chromosome.

| Genotype (NSPCs) | Bait DSB | Input (µg) | Unique Junctions | Junction Number ¶ | Translocation (% of cells) | DSBs per cell |
|---|---|---|---|---|---|---|
| Xrcc4-/-p53-/- | Chr12-sgRNA-1 | 75 | 40,755 | 177 (Chr12) | 23.5 (27 RDCs) | 9.4 |
| Wild type | Chr12-sgRNA-1 | 70 | 11,304 | 47 (Chr12) | 11.6 (14 RDCs) | 4.7 |
| Xrcc4-/-p53-/- | Chr15-Myc-sgRNA | 75 | 23,959 | 158 (Chr15) | 35.6 (27 RDCs) | 14.2 |
| Wild type | Chr15-Myc-sgRNA | 70 | 13,911 | 57 (Chr15) | 11.5 (14 RDCs) | 4.6 |

Table S3. Supplied as a separate Excel file.

Table S4. HTGTS Junction Enrichment within Identified Replication Stress-sensitive Genes, Related to Figures 4, 5, S3, and S4.

MACS-based HTGTS junction enrichment analysis against background λ in APH-treated Xrcc4-/-p53-/- samples as described in Supplemental Experimental Procedures.

| Chr | Start | End | RefSeq ID | Gene | P value (Chr15 bait DSB) | P value (Chr12 bait DSB) | P value (Chr16 bait DSB) |
|---|---|---|---|---|---|---|---|
| chr1 | 25,124,320 | 25,886,552 | NM_175642 | Bai3 | 3.63 × 10⁻² | 1.77 × 10⁻⁵ | 2.84 × 10⁻³ |
| chr1 | 61,685,398 | 62,688,858 | NM_001081050 | Pard3b | 2.85 × 10⁻⁴ | 1.55 × 10⁻⁷ | 1.56 × 10⁻³ |
| chr10 | 48,819,269 | 49,508,560 | NM_001111268 | Grik2 | 2.15 × 10⁻² | 2.42 × 10⁻⁷ | 1.01 × 10⁻³ |
| chr12 | 38,607,291 | 39,359,997 | NM_178681 | Dgkb | 1.38 × 10⁻² | 4.28 × 10⁻¹⁵ | 8.87 × 10⁻³ |
| chr12 | 54,349,664 | 55,173,162 | NM_013780 | Npas3 | 6.62 × 10⁻⁶ | 5.77 × 10⁻⁵¹ | 1.57 × 10⁻⁵ |
| chr12 | 67,567,046 | 68,323,536 | NM_001193266 | Mdga2 | 1.38 × 10⁻² | 6.43 × 10⁻²⁴ | 9.00 × 10⁻⁵ |
| chr12 | 90,032,948 | 91,573,373 | NM_001198587 | Nrxn3 | 8.47 × 10⁻² | 3.6 × 10⁻¹⁴ | 7.21 × 10⁻³ |
| chr14 | 117,324,519 | 118,378,751 | NM_001079844 | Gpc6 | 3.72 × 10⁻³ | 6.21 × 10⁻⁶ | 1.01 × 10⁻³ |
| chr15 | 30,102,348 | 30,959,098 | NM_008729 | Ctnnd2 | 4.62 × 10⁻¹⁵ | 8.95 × 10⁻⁵ | 1.75 × 10⁻⁴ |
| chr15 | 41,279,028 | 41,692,593 | NM_001130166 | Oxr1 | 9.78 × 10⁻⁶ | 6.51 × 10⁻⁴ | 1.47 × 10⁻³ |
| chr15 | 47,412,184 | 48,623,535 | NM_001081391 | Csmd3 | 1.96 × 10⁻⁵¹ | 8.99 × 10⁻⁸ | 2.05 × 10⁻⁴ |
| chr16 | 5,884,886 | 7,412,573 | NM_021477 | Rbfox1 | 5.75 × 10⁻² | 6.02 × 10⁻⁴ | 1.95 × 10⁻¹⁹ |
| chr16 | 28,158,669 | 28,753,329 | NM_010199 | Fgf12 | 2.28 × 10⁻² | 4.44 × 10⁻³ | 1.29 × 10⁻¹⁰ |
| chr16 | 66,655,666 | 67,621,153 | NM_001145977 | Cadm2 | 1.56 × 10⁻³ | 6.24 × 10⁻⁶ | 2.00 × 10⁻²⁸ |
| chr17 | 90,432,984 | 91,492,142 | NM_020252 | Nrxn1 | 3.23 × 10⁻⁴ | 6.73 × 10⁻⁹ | 4.53 × 10⁻⁴ |
| chr18 | 71,413,285 | 72,510,723 | NM_007831 | Dcc | 9.59 × 10⁻⁴ | 3.10 × 10⁻⁷ | 4.39 × 10⁻⁴ |
| chr19 | 30,638,977 | 31,839,523 | NM_001013833 | Prkg1 | 8.47 × 10⁻² | 2.12 × 10⁻⁵ | 4.63 × 10⁻⁴ |
| chr4 | 97,248,634 | 97,785,567 | NM_001122952 | Nfia | 6.98 × 10⁻⁴ | 6.92 × 10⁻⁴ | 2.77 × 10⁻⁴ |
| chr5 | 18,732,864 | 20,210,609 | NM_001170746 | Magi2 | 1.75 × 10⁻⁴ | 5.48 × 10⁻⁵ | 2.25 × 10⁻³ |
| chr5 | 141,717,488 | 142,689,745 | NM_177879 | Sdk1 | 7.58 × 10⁻³ | 3.39 × 10⁻⁶ | 2.26 × 10⁻⁵ |
| chr6 | 36,665,663 | 36,761,361 | NM_008973 | Ptn | 1.01 × 10⁻² | 1.14 × 10⁻⁵ | 7.38 × 10⁻³ |
| chr6 | 76,831,631 | 77,929,661 | NM_001109764 | Ctnna2 | 1.01 × 10⁻³ | 4.54 × 10⁻⁷ | 8.25 × 10⁻⁵ |
| chr8 | 15,892,545 | 17,535,385 | NM_053171 | Csmd1 | 7.47 × 10⁻⁴ | 1.04 × 10⁻⁴ | 4.61 × 10⁻⁴ |
| chr8 | 116,963,552 | 117,876,612 | NM_019573 | Wwox | 7.94 × 10⁻³ | 9.03 × 10⁻⁵ | 1.18 × 10⁻² |
| chr8 | 120,807,655 | 121,847,348 | NM_019707 | Cdh13 | 8.98 × 10⁻⁴ | 6.08 × 10⁻⁹ | 2.04 × 10⁻⁴ |
| chr9 | 28,803,549 | 29,770,714 | NM_172290 | Ntm | 1.47 × 10⁻³ | 1.80 × 10⁻⁵ | 2.39 × 10⁻³ |

Table S5. Translocation Junction Signatures of Replication Stress-induced RDC-genes. Junctions from three to five independent experiments per bait DSB location and genotype were analyzed; total junction number analyzed per condition is listed in parentheses. Data represent mean and S.E.M. MH, microhomology.

Chr12-sgRNA-1 bait DSBs

| % of junctions | Wild type (n=416) | Xrcc4-/-p53-/- (n=1,312) | P-value (two-tailed unpaired t test) |
|---|---|---|---|
| Direct | 42.8 ± 1.7 | 6.6 ± 1.1 | 4.2 × 10⁻⁴ |
| MH (1-10 bp) | 57.2 ± 1.7 | 93.4 ± 1.1 | 4.2 × 10⁻⁴ |

Chr15-Myc-sgRNA bait DSBs

| % of junctions | Wild type (n=406) | Xrcc4-/-p53-/- (n=825) | P-value (two-tailed unpaired t test) |
|---|---|---|---|
| Direct | 37.3 ± 3.1 | 7.7 ± 1.2 | 1.6 × 10⁻⁵ |
| MH (1-10 bp) | 62.7 ± 3.1 | 92.3 ± 1.2 | 1.5 × 10⁻⁵ |

Table S6. Supplied as a separate Excel file.

Table S7. Detailed Information on HTGTS Libraries, Related to Figures 1-6 and Experimental Procedures.

1. sgRNA- and HTGTS-related oligonucleotide sequences. Bio, biotinylation. #Nucleotides used as linker sequences for cloning into BbsI-digested pSpCas9(BB) (Addgene plasmid 42230; Cong et al., 2013) are underlined. *For details on LAM-HTGTS adapter sequences (I5, I7, P5, P7) see Frock et al., 2015.

sgRNA-related oligonucleotides

| Name | Sequence (5' > 3')# | sgRNA target coordinates (NCBI37/mm9) |
|---|---|---|
| Chr12-sgRNA-1 A | CACC ATTCCGCCAACCCTCGAGAT | Chr12:13,000,844-13,000,863 |
| Chr12-sgRNA-1 B | AAAC ATCTCGAGGGTTGGCGGAAT | |
| Chr12-sgRNA-2 A | CACC GCTGTCACTAGGAACGTTATC | Chr12:61,485,370-61,485,390 |
| Chr12-sgRNA-2 B | AAAC GATAACGTTCCTAGTGACAGC | |
| Chr15-Myc-sgRNA A | CACC GCCCTATTTCATCTGCGACG | Chr15:61,819,136-61,819,155 |
| Chr15-Myc-sgRNA B | AAAC CGTCGCAGATGAAATAGGGC | |
| Chr16-sgRNA-1 A | CACC GCTCCAACCCTTAGCCCATC | Chr16:31,462,937-31,462,956 |
| Chr16-sgRNA-1 B | AAAC GATGGGCTAAGGGTTGGAGC | |
| Chr16-sgRNA-2 A | CACC GATACGGCAAAGGACTAGTT | Chr16:39,382,741-39,382,760 |
| Chr16-sgRNA-2 B | AAAC AACTAGTCCTTTGCCGTATC | |
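In the oligonucleotide pairs listed above, each B oligonucleotide is the reverse complement of the guide portion of its A partner, preceded by a four-nucleotide linker. The Python sketch below reproduces that pattern; the function names are hypothetical, and the default linkers (CACC and AAAC) are simply the leading four nucleotides shown in the listed sequences.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def cloning_oligos(guide, top_linker="CACC", bottom_linker="AAAC"):
    """Return the (A, B) oligonucleotide pair for one guide sequence."""
    oligo_a = top_linker + guide
    oligo_b = bottom_linker + reverse_complement(guide)
    return oligo_a, oligo_b


if __name__ == "__main__":
    a, b = cloning_oligos("ATTCCGCCAACCCTCGAGAT")  # Chr12-sgRNA-1 guide
    print(a)  # CACCATTCCGCCAACCCTCGAGAT
    print(b)  # AAACATCTCGAGGGTTGGCGGAAT
```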

LAM-HTGTS oligonucleotides

| Name | Sequence (5' > 3') |
|---|---|
| Bio-Chr12-sgRNA-1 | Bio/CAGGTGCCAAGTTCTACCAACAAGC |
| Bio-Chr12-sgRNA-2 | Bio/CTGCTTGACATTTCAGCTATCTAAT |
| Bio-Chr15-Myc-sgRNA | Bio/CGAGCGTCACTGATAGTAGGGAGT |
| Bio-Chr16-sgRNA-1 | Bio/AGGTACTACTGAGAGCTACCTC |
| Bio-Chr16-sgRNA-2 | Bio/CTATGGAGTGACTGAAGCTAAATT |

Oligonucleotides for nested-PCR (without Illumina I5 5'-adapter sequence*)

| Name | Sequence (5' > 3') |
|---|---|
| PreCasChr12-sgRNA-1 | CCTCTAAGATAAAAACTGGAAGTAGTT |
| PreCasChr12-sgRNA-2 | GCAAACTGAAAGAGCACCTGTGAG |
| PreCasChr15-Myc-sgRNA | GCACCAACCAGAGCTGGATAACTCT |
| PreCasChr16-sgRNA-1 | GTTCCTAGCCGTGTGAATTGAGG |
| PreCasChr16-sgRNA-2 | GATAGTCGGGGAACGTTGGGATGC |

2. sgRNA off-target sites identified via HTGTS. Nucleotides conserved between on- and off-target loci are in red. PAM is underlined.

| sgRNA | On-target | Off-target (identified by HTGTS) | Off-target locus |
|---|---|---|---|
| Chr15-Myc-sgRNA | GCCCTATTTCATCTGCGACG AGG | GCCCTATTTCACCTGCAACA GGG | Chr11:78,738,291-78,738,310 |
| | | ACCCTTAAGCACCTGCGACA AGG | Chr17:25,707,243-25,707,262 |
| Chr12-sgRNA-1 | ATTCCGCCAACCCTCGAGAT AGG | CCCATCCCATCCCATCCCGA GGG | Chr12:112,278,690-112,278,709 |
| Chr16-sgRNA-1 | GCTCCAACCCTTAGCCCATC AGG | GAAGTTACAGTTCGCCTGAT GGG | Chr2:92,090,109-92,090,128 |
| Chr16-sgRNA-2 | CTGTGATAGTCGGGGAACGT TGG | AAGGAAAGACTGAGCAACAC TGG | Chr8:69,971,278-69,971,297 |
| | | AGGGACTAGTAATACAGCAA AGG | Chr15:94,741,284-94,741,303 |

3. Summary of HTGTS junctions per experiment. NSPC genotypes, bait DSB site, name of experiment, corresponding junction number, and related Figures are listed.

ATM-/- R26 GR-I-SceI, c-Myc25xI-SceI (Figure 2C): Exp-A (6,667); Exp-B (4,712); Exp-C (2,182); Exp-D (2,915)

Xrcc4-/-p53-/-, Chr12-sgRNA-1 (Figures 1B, 2A, 3A, S1B, S1C, and S5G): Exp-A (7,098); Exp-B (4,628); Exp-C (10,754); Exp-D (9,664)

Xrcc4+/+p53-/-, Chr12-sgRNA-1 (Figure S5G): Exp-A (5,962); Exp-B (7,443); Exp-C (7,345); Exp-D (11,147)

Wild type, Chr12-sgRNA-1 (Figure S5G): Exp-A (4,579); Exp-B (2,382); Exp-C (1,828); Exp-D (2,883); Exp-E (2,353); Exp-F (2,216); Exp-G (1,828)

Xrcc4-/-p53-/-, Chr15-Myc-sgRNA (Figure S5G): Exp-A (7,812); Exp-B (4,593); Exp-C (4,596); Exp-D (4,930)

Xrcc4+/+p53-/-, Chr15-Myc-sgRNA (Figure S5G): Exp-A (5,176); Exp-B (8,227); Exp-C (8,367); Exp-D (8,809)

Wild type, Chr15-Myc-sgRNA (Figure S5G): Exp-A (9,095); Exp-B (3,338); Exp-C (2,152)

Xrcc4-/-p53-/-, Chr12-sgRNA-2 (Figure 3B): Exp-A (4,266); Exp-B (4,417); Exp-C (3,910)

Xrcc4-/-p53-/-, Chr16-sgRNA-1 (Figure 2B): Exp-A (7,220); Exp-B (6,836); Exp-C (5,742)

Xrcc4-/-p53-/-, Chr15-Myc-sgRNA (Figures 4, 5, S2-5): DMSO, Exp-A (7,533); APH, Exp-A (8,509); DMSO, Exp-B (7,733); APH, Exp-B (7,467); DMSO, Exp-C (6,497); APH, Exp-C (7,983)

Xrcc4-/-p53-/-, Chr12-sgRNA-1 (Figures 4, 5, 7, S2-5): DMSO, Exp-A (9,853); APH, Exp-A (10,813); DMSO, Exp-B (8,543); APH, Exp-B (17,370); DMSO, Exp-C (9,122); APH, Exp-C (12,572)

Xrcc4-/-p53-/-, Chr16-sgRNA-2 (Figures 4, 5, S2-6): DMSO, Exp-A (6,202); APH, Exp-A (4,438); DMSO, Exp-B (5,799); APH, Exp-B (3,787); DMSO, Exp-C (5,678); APH, Exp-C (4,495); APH, Exp-D (6,114)

Wild type, Chr15-Myc-sgRNA (Figures 6 and S5): DMSO, Exp-A (3,450); APH, Exp-A (4,276); DMSO, Exp-B (3,406); APH, Exp-B (3,486); DMSO, Exp-C (2,005); APH, Exp-C (1,979); DMSO, Exp-D (2,833); APH, Exp-D (1,476); DMSO, Exp-E (2,826); APH, Exp-E (2,694)

Wild type, Chr12-sgRNA-1 (Figures 6 and S5): DMSO, Exp-A (4,941); APH, Exp-A (5,576); DMSO, Exp-B (1,717); APH, Exp-B (2,080); DMSO, Exp-C (2,046); APH, Exp-C (1,875); DMSO, Exp-D (1,384); APH, Exp-D (1,773)

SUPPLEMENTAL REFERENCES

Abdel-Salam, G., Thoenes, M., Afifi, H.H., Korber, F., Swan, D., and Bolz, H.J. (2014). The supposed tumor suppressor gene WWOX is mutated in an early lethal microcephaly syndrome with epilepsy, growth retardation and retinal degeneration. Orphanet J Rare Dis 9, 12.

Abe, K., Chisaka, O., Van Roy, F., and Takeichi, M. (2004). Stability of dendritic spines and synaptic contacts is controlled by alpha N-catenin. Nat Neurosci 7, 357-363.

Allen, N.J., Bennett, M.L., Foo, L.C., Wang, G.X., Chakraborty, C., Smith, S.J., and Barres, B.A. (2012). Astrocyte glypicans 4 and 6 promote formation of excitatory synapses via GluA1 AMPA receptors. Nature 486, 410-414.

Amet, L.E., Lauri, S.E., Hienola, A., Croll, S.D., Lu, Y., Levorse, J.M., Prabhakaran, B., Taira, T., Rauvala, H., and Vogt, T.F. (2001). Enhanced hippocampal long-term potentiation in mice lacking heparin-binding growth-associated molecule. Mol Cell Neurosci 17, 1014-1024.

Anney, R., Klei, L., Pinto, D., Almeida, J., Bacchelli, E., Baird, G., Bolshakova, N., Bolte, S., Bolton, P.F., Bourgeron, T., et al. (2012). Individual common variants exert weak effects on the risk for autism spectrum disorders. Hum Mol Genet 21, 4781-4792.

Arikkath, J., Peng, I.F., Ng, Y.G., Israely, I., Liu, X., Ullian, E.M., and Reichardt, L.F. (2009). Delta-catenin regulates spine and synapse morphogenesis and function in hippocampal neurons during development. J Neurosci 29, 5435-5442.

Autism Genome Project, C., Szatmari, P., Paterson, A.D., Zwaigenbaum, L., Roberts, W., Brian, J., Liu, X.Q., Vincent, J.B., Skaug, J.L., Thompson, A.P., et al. (2007). Mapping autism risk loci using genetic linkage and chromosomal rearrangements. Nat Genet 39, 319-328.

Auweter, S.D., Fasan, R., Reymond, L., Underwood, J.G., Black, D.L., Pitsch, S., and Allain, F.H. (2006). Molecular basis of RNA recognition by the human alternative splicing factor Fox-1. EMBO J 25, 163-173.

Bednarek, A.K., Laflin, K.J., Daniel, R.L., Liao, Q., Hawkins, K.A., and Aldaz, C.M. (2000). WWOX, a novel WW domain-containing protein mapping to human chromosome 16q23.3-24.1, a region frequently affected in breast cancer. Cancer Res 60, 2140-2145.

Belcaro, C., Dipresa, S., Morini, G., Pecile, V., Skabar, A., and Fabretto, A. (2015). CTNND2 deletion and intellectual disability. Gene 565, 146-149.

Berger, M.F., Lawrence, M.S., Demichelis, F., Drier, Y., Cibulskis, K., Sivachenko, A.Y., Sboner, A., Esgueva, R., Pflueger, D., Sougnez, C., et al. (2011). The genomic complexity of primary human prostate cancer. Nature 470, 214-220.

Bhalla, K., Phillips, H.A., Crawford, J., McKenzie, O.L., Mulley, J.C., Eyre, H., Gardner, A.E., Kremmidiotis, G., and Callen, D.F. (2004). The de novo chromosome 16 translocations of two patients with abnormal phenotypes (mental retardation and epilepsy) disrupt the A2BP1 gene. J Hum Genet 49, 308-311.

Bignell, G.R., Greenman, C.D., Davies, H., Butler, A.P., Edkins, S., Andrews, J.M., Buck, G., Chen, L., Beare, D., Latimer, C., et al. (2010). Signatures of mutation and selection in the cancer genome. Nature 463, 893-898.

Bolliger, M.F., Martinelli, D.C., and Sudhof, T.C. (2011). The cell-adhesion G protein-coupled receptor BAI3 is a high-affinity receptor for C1q-like proteins. Proc Natl Acad Sci U S A 108, 2534-2539.

Borglum, A.D., Demontis, D., Grove, J., Pallesen, J., Hollegaard, M.V., Pedersen, C.B., Hedemand, A., Mattheisen, M., investigators, G., Uitterlinden, A., et al. (2014). Genome-wide study of association and interaction with maternal cytomegalovirus infection suggests new schizophrenia loci. Mol Psychiatry 19, 325-333.

Brewer, G.J., and Torricelli, J.R. (2007). Isolation and culture of adult neurons and neurospheres. Nat Protoc 2, 1490-1498.

Brunskill, E.W., Ehrman, L.A., Williams, M.T., Klanke, J., Hammer, D., Schaefer, T.L., Sah, R., Dorn, G.W., 2nd, Potter, S.S., and Vorhees, C.V. (2005). Abnormal neurodevelopment, neurosignaling and behaviour in Npas3-deficient mice. Eur J Neurosci 22, 1265-1276.

Brunskill, E.W., Witte, D.P., Shreiner, A.B., and Potter, S.S. (1999). Characterization of npas3, a novel basic helix-loop-helix PAS gene expressed in the developing mouse nervous system. Mech Dev 88, 237-241.

Bucan, M., Abrahams, B.S., Wang, K., Glessner, J.T., Herman, E.I., Sonnenblick, L.I., Alvarez Retuerto, A.I., Imielinski, M., Hadley, D., Bradfield, J.P., et al. (2009). Genome-wide analyses of exonic copy number variants in a family-based study point to novel autism susceptibility genes. PLoS Genet 5, e1000536.

Casey, J.P., Magalhaes, T., Conroy, J.M., Regan, R., Shah, N., Anney, R., Shields, D.C., Abrahams, B.S., Almeida, J., Bacchelli, E., et al. (2012). A novel approach of homozygous haplotype sharing identifies candidate genes in autism spectrum disorder. Hum Genet 131, 565-579.

Catania, E.H., Pimenta, A., and Levitt, P. (2008). Genetic deletion of Lsamp causes exaggerated behavioral activation in novel environments. Behav Brain Res 188, 380-390.

Ching, M.S., Shen, Y., Tan, W.H., Jeste, S.S., Morrow, E.M., Chen, X., Mukaddes, N.M., Yoo, S.Y., Hanson, E., Hundley, R., et al. (2010). Deletions of NRXN1 (neurexin-1) predispose to a wide spectrum of developmental disorders. Am J Med Genet B Neuropsychiatr Genet 153B, 937-947.

Christoforou, A., Espeseth, T., Davies, G., Fernandes, C.P., Giddaluru, S., Mattheisen, M., Tenesa, A., Harris, S.E., Liewald, D.C., Payton, A., et al. (2014). GWAS-based pathway analysis differentiates between fluid and crystallized intelligence. Genes Brain Behav 13, 663-674.

Chu, T.T., and Liu, Y. (2010). An integrated genomic analysis of gene-function correlation on schizophrenia susceptibility genes. J Hum Genet 55, 285-292.

Cong, L., Ran, F.A., Cox, D., Lin, S., Barretto, R., Habib, N., Hsu, P.D., Wu, X., Jiang, W., Marraffini, L.A., et al. (2013). Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819-823.

Contractor, A., Swanson, G., and Heinemann, S.F. (2001). Kainate receptors are involved in short- and long-term plasticity at mossy fiber synapses in the hippocampus. Neuron 29, 209-216.

Cukier, H.N., Dueker, N.D., Slifer, S.H., Lee, J.M., Whitehead, P.L., Lalanne, E., Leyva, N., Konidari, I., Gentry, R.C., Hulme, W.F., et al. (2014). Exome sequencing of extended families with autism reveals genes shared across neurodevelopmental and neuropsychiatric disorders. Molecular autism 5, 1.

Dabell, M.P., Rosenfeld, J.A., Bader, P., Escobar, L.F., El-Khechen, D., Vallee, S.E., Dinulos, M.B., Curry, C., Fisher, J., Tervo, R., et al. (2013). Investigation of NRXN1 deletions: clinical and molecular characterization. Am J Med Genet A 161A, 717-731.

Danielson, E., Zhang, N., Metallo, J., Kaleka, K., Shin, S.M., Gerges, N., and Lee, S.H. (2012). S-SCAM/MAGI-2 is an essential synaptic scaffolding molecule for the GluA2-containing maintenance pool of AMPA receptors. J Neurosci 32, 6967-6980.

Davis, L.K., Maltman, N., Mosconi, M.W., Macmillan, C., Schmitt, L., Moore, K., Francis, S.M., Jacob, S., Sweeney, J.A., and Cook, E.H. (2012). Rare inherited A2BP1 deletion in a proband with autism and developmental hemiparesis. Am J Med Genet A 158A, 1654-1661.

Demyanenko, G.P., Halberstadt, A.I., Pryzwansky, K.B., Werner, C., Hofmann, F., and Maness, P.F. (2005). Abnormal neocortical development in mice lacking cGMP-dependent protein kinase I. Brain Res Dev Brain Res 160, 1-8.

Deneen, B., Ho, R., Lukaszewicz, A., Hochstim, C.J., Gronostajski, R.M., and Anderson, D.J. (2006). The transcription factor NFIA controls the onset of gliogenesis in the developing spinal cord. Neuron 52, 953-968.

DeRosse, P., Lencz, T., Burdick, K.E., Siris, S.G., Kane, J.M., and Malhotra, A.K. (2008). The genetics of symptom-based phenotypes: toward a molecular classification of schizophrenia. Schizophr Bull 34, 1047-1053.

Docampo, E., Ribases, M., Gratacos, M., Bruguera, E., Cabezas, C., Sanchez-Mora, C., Nieva, G., Puente, D., Argimon-Pallas, J.M., Casas, M., et al. (2012). Association of neurexin 3 polymorphisms with smoking behavior. Genes Brain Behav 11, 704-711.

Donohoe, G., Walters, J., Hargreaves, A., Rose, E.J., Morris, D.W., Fahey, C., Bellini, S., Cummins, E., Giegling, I., Hartmann, A.M., et al. (2013). Neuropsychological effects of the CSMD1 genome-wide associated schizophrenia risk variant rs10503253. Genes Brain Behav 12, 203-209.

Elliott, N.A., and Volkert, M.R. (2004). Stress induction and mitochondrial localization of Oxr1 proteins in yeast and humans. Mol Cell Biol 24, 3180-3187.

Erbel-Sieler, C., Dudley, C., Zhou, Y., Wu, X., Estill, S.J., Han, T., Diaz-Arrastia, R., Brunskill, E.W., Potter, S.S., and McKnight, S.L. (2004). Behavioral and regulatory abnormalities in mice deficient in the NPAS1 and NPAS3 transcription factors. Proc Natl Acad Sci U S A 101, 13648-13653.

Etherton, M.R., Blaiss, C.A., Powell, C.M., and Sudhof, T.C. (2009). Mouse neurexin-1alpha deletion causes correlated electrophysiological and behavioral changes consistent with cognitive impairments. Proc Natl Acad Sci U S A 106, 17998-18003.

Ferreira, M.A., O'Donovan, M.C., Meng, Y.A., Jones, I.R., Ruderfer, D.M., Jones, L., Fan, J., Kirov, G., Perlis, R.H., Green, E.K., et al. (2008). Collaborative genome-wide association analysis supports a role for ANK3 and CACNA1C in bipolar disorder. Nat Genet 40, 1056-1058.

Finci, L., Zhang, Y., Meijers, R., and Wang, J.H. (2015). Signaling mechanism of the netrin-1 receptor DCC in axon guidance. Prog Biophys Mol Biol 118, 153-160.

Finger, J.H., Bronson, R.T., Harris, B., Johnson, K., Przyborski, S.A., and Ackerman, S.L. (2002). The netrin 1 receptors Unc5h3 and Dcc are necessary at multiple choice points for the guidance of corticospinal tract axons. J Neurosci 22, 10346-10356.

Floris, C., Rassu, S., Boccone, L., Gasperini, D., Cao, A., and Crisponi, L. (2008). Two patients with balanced translocations and autistic disorder: CSMD3 as a candidate gene for autism found in their common 8q23 breakpoint area. Eur J Hum Genet 16, 696-704.

Frei, J.A., Andermatt, I., Gesemann, M., and Stoeckli, E.T. (2014). The SynCAM synaptic cell adhesion molecules are involved in sensory axon pathfinding by regulating axon-axon contacts. J Cell Sci 127, 5288-5302.

Gao, L., Macara, I.G., and Joberty, G. (2002). Multiple splice variants of Par3 and of a novel related gene, Par3L, produce proteins with different binding properties. Gene 294, 99-107.

Gehman, L.T., Stoilov, P., Maguire, J., Damianov, A., Lin, C.H., Shiue, L., Ares, M., Jr., Mody, I., and Black, D.L. (2011). The splicing regulator Rbfox1 (A2BP1) controls neuronal excitation in the mammalian brain. Nat Genet 43, 706-711.

Gil, O.D., Zanazzi, G., Struyk, A.F., and Salzer, J.L. (1998). Neurotrimin mediates bifunctional effects on neurite outgrowth via homophilic and heterophilic interactions. J Neurosci 18, 9312-9325.

Gil, O.D., Zhang, L., Chen, S., Ren, Y.Q., Pimenta, A., Zanazzi, G., Hillman, D., Levitt, P., and Salzer, J.L. (2002). Complementary expression and heterophilic interactions between IgLON family members neurotrimin and LAMP. J Neurobiol 51, 190-204.

Gimelli, S., Leoni, M., Di Rocco, M., Caridi, G., Porta, S., Cuoco, C., Gimelli, G., and Tassano, E. (2013). A rare 3q13.31 microdeletion including GAP43 and LSAMP genes. Mol Cytogenet 6, 52.

Glancy, M., Barnicoat, A., Vijeratnam, R., de Souza, S., Gilmore, J., Huang, S., Maloney, V.K., Thomas, N.S., Bunyan, D.J., Jackson, A., et al. (2009). Transmitted duplication of 8p23.1-8p23.2 associated with speech delay, autism and learning difficulties. Eur J Hum Genet 17, 37-43.

Glessner, J.T., Wang, K., Cai, G., Korvatska, O., Kim, C.E., Wood, S., Zhang, H., Estes, A., Brune, C.W., Bradfield, J.P., et al. (2009). Autism genome-wide copy number variation reveals ubiquitin and neuronal genes. Nature 459, 569-573.

Goldfarb, M., Schoorlemmer, J., Williams, A., Diwakar, S., Wang, Q., Huang, X., Giza, J., Tchetchik, D., Kelley, K., Vega, A., et al. (2007). Fibroblast growth factor homologous factors control neuronal excitability through modulation of voltage-gated sodium channels. Neuron 55, 449-463.

Gombash, S.E., Lipton, J.W., Collier, T.J., Madhavan, L., Steece-Collier, K., Cole-Strauss, A., Terpstra, B.T., Spieles-Engemann, A.L., Daley, B.F., Wohlgenant, S.L., et al. (2012). Striatal pleiotrophin overexpression provides functional and morphological neuroprotection in the 6-hydroxydopamine model. Mol Ther 20, 544-554.

Goto, K., and Kondo, H. (1993). Molecular cloning and expression of a 90-kDa diacylglycerol kinase that predominantly localizes in neurons. Proc Natl Acad Sci U S A 90, 7598-7602.

Griswold, A.J., Dueker, N.D., Van Booven, D., Rantus, J.A., Jaworski, J.M., Slifer, S.H., Schmidt, M.A., Hulme, W., Konidari, I., Whitehead, P.L., et al. (2015). Targeted massively parallel sequencing of autism spectrum disorder-associated genes in a case control cohort reveals rare loss-of-function risk variants. Molecular autism 6, 43.

Griswold, A.J., Ma, D., Cukier, H.N., Nations, L.D., Schmidt, M.A., Chung, R.H., Jaworski, J.M., Salyakina, D., Konidari, I., Whitehead, P.L., et al. (2012). Evaluation of copy number variations reveals novel candidate genes in autism spectrum disorder-associated pathways. Hum Mol Genet 21, 3513-3523.

Hansell, N.K., Halford, G.S., Andrews, G., Shum, D.H., Harris, S.E., Davies, G., Franic, S., Christoforou, A., Zietsch, B., Painter, J., et al. (2015). Genetic basis of a cognitive complexity metric. PloS one 10, e0123886.

Hashimoto, T., Maekawa, S., and Miyata, S. (2009). IgLON cell adhesion molecules regulate synaptogenesis in hippocampal neurons. Cell Biochem Funct 27, 496-498.

Heck, A., Pfister, H., Czamara, D., Muller-Myhsok, B., Putz, B., Lucae, S., Hennings, J., and Ising, M. (2011). Evidence for associations between MDGA2 polymorphisms and harm avoidance: replication and extension of a genome-wide association finding. Psychiatr Genet 21, 257-260.

Heinla, I., Leidmaa, E., Kongi, K., Pennert, A., Innos, J., Nurk, K., Tekko, T., Singh, K., Vanaveski, T., Reimets, R., et al. (2015). Gene expression patterns and environmental enrichment-induced effects in the hippocampi of mice suggest importance of Lsamp in plasticity. Front Neurosci 9, 205.

Herradon, G., and Perez-Garcia, C. (2014). Targeting midkine and pleiotrophin signalling pathways in addiction and neurodegenerative disorders: recent progress and perspectives. Br J Pharmacol 171, 837-848.

Hirao, K., Hata, Y., Ide, N., Takeuchi, M., Irie, M., Yao, I., Deguchi, M., Toyoda, A., Sudhof, T.C., and Takai, Y. (1998). A novel multiple PDZ domain-containing molecule interacting with N-methyl-D-aspartate receptors and neuronal cell adhesion proteins. J Biol Chem 273, 21105-21110.

Hishimoto, A., Liu, Q.R., Drgon, T., Pletnikova, O., Walther, D., Zhu, X.G., Troncoso, J.C., and Uhl, G.R. (2007). Neurexin 3 polymorphisms are associated with alcohol dependence and altered expression of specific isoforms. Hum Mol Genet 16, 2880-2891.

Holt, R., Barnby, G., Maestrini, E., Bacchelli, E., Brocklebank, D., Sousa, I., Mulder, E.J., Kantojarvi, K., Jarvela, I., Klauck, S.M., et al. (2010). Linkage and candidate gene studies of autism spectrum disorders in European populations. Eur J Hum Genet 18, 1013-1019.

Horn, K.E., Glasgow, S.D., Gobert, D., Bull, S.J., Luk, T., Girgis, J., Tremblay, M.E., McEachern, D., Bouchard, J.F., Haber, M., et al. (2013). DCC expression by neurons regulates synaptic plasticity in the adult brain. Cell reports 3, 173-185.

Hozumi, Y., Watanabe, M., Otani, K., and Goto, K. (2009). Diacylglycerol kinase beta promotes dendritic outgrowth and spine maturation in developing hippocampal neurons. BMC Neurosci 10, 99.

Huang, J., Perlis, R.H., Lee, P.H., Rush, A.J., Fava, M., Sachs, G.S., Lieberman, J., Hamilton, S.P., Sullivan, P., Sklar, P., et al. (2010). Cross-disorder genomewide analysis of schizophrenia, bipolar disorder, and depression. Am J Psychiatry 167, 1254-1263.

Ibrahim-Verbaas, C.A., Bressler, J., Debette, S., Schuur, M., Smith, A.V., Bis, J.C., Davies, G., Trompet, S., Smith, J.A., Wolf, C., et al. (2015). GWAS for executive function and processing speed suggests involvement of the CADM2 gene. Mol Psychiatry.

Innos, J., Philips, M.A., Leidmaa, E., Heinla, I., Raud, S., Reemann, P., Plaas, M., Nurk, K., Kurrikoff, K., Matto, V., et al. (2011). Lower anxiety and a decrease in agonistic behaviour in Lsamp-deficient mice. Behav Brain Res 217, 21-31.

Israely, I., Costa, R.M., Xie, C.W., Silva, A.J., Kosik, K.S., and Liu, X. (2004). Deletion of the neuron-specific protein delta-catenin leads to severe cognitive and synaptic dysfunction. Curr Biol 14, 1657-1663.

Joset, P., Wacker, A., Babey, R., Ingold, E.A., Andermatt, I., Stoeckli, E.T., and Gesemann, M. (2011). Rostral growth of commissural axons requires the cell adhesion molecule MDGA2. Neural Dev 6, 22.

Kadota, M., Yang, H.H., Gomez, B., Sato, M., Clifford, R.J., Meerzaman, D., Dunn, B.K., Wakefield, L.M., and Lee, M.P. (2010). Delineating genetic alterations for tumor progression in the MCF10A series of breast cancer cell lines. PloS one 5, e9201.

Kakefuda, K., Oyagi, A., Ishisaka, M., Tsuruma, K., Shimazawa, M., Yokota, K., Shirai, Y., Horie, K., Saito, N., Takeda, J., et al. (2010). Diacylglycerol kinase beta knockout mice exhibit lithium-sensitive behavioral abnormalities. PloS one 5, e13447.

Kamnasaran, D., Muir, W.J., Ferguson-Smith, M.A., and Cox, D.W. (2003). Disruption of the neuronal PAS3 gene in a family affected with schizophrenia. J Med Genet 40, 325-332.

Kang, P., Lee, H.K., Glasgow, S.M., Finley, M., Donti, T., Gaber, Z.B., Graham, B.H., Foster, A.E., Novitch, B.G., Gronostajski, R.M., et al. (2012). Sox9 and NFIA coordinate a transcriptional regulatory cascade during the initiation of gliogenesis. Neuron 74, 79-94.

Karlsson, R., Graae, L., Lekman, M., Wang, D., Favis, R., Axelsson, T., Galter, D., Belin, A.C., and Paddock, S. (2012). MAGI1 copy number variation in bipolar affective disorder and schizophrenia. Biol Psychiatry 71, 922-930.

Kawakami, M., Staub, J., Cliby, W., Hartmann, L., Smith, D.I., and Shridhar, V. (1999). Involvement of H-cadherin (CDH13) on 16q in the region of frequent deletion in ovarian cancer. Int J Oncol 15, 715-720.

Kee, H.J., Ahn, K.Y., Choi, K.C., Won Song, J., Heo, T., Jung, S., Kim, J.K., Bae, C.S., and Kim, K.K. (2004). Expression of brain-specific angiogenesis inhibitor 3 (BAI3) in normal brain and implications for BAI3 in ischemia-induced brain angiogenesis and malignant glioma. FEBS Lett 569, 307-316.

Keller, F., Rimvall, K., Barbe, M.F., and Levitt, P. (1989). A membrane glycoprotein associated with the limbic system mediates the formation of the septo-hippocampal pathway in vitro. Neuron 3, 551-561.

Kim, H.G., Kishikawa, S., Higgins, A.W., Seong, I.S., Donovan, D.J., Shen, Y., Lally, E., Weiss, L.A., Najm, J., Kutsche, K., et al. (2008). Disruption of neurexin 1 associated with autism spectrum disorder. Am J Hum Genet 82, 199-207.

Kim, S.A., Kim, J.H., Park, M., Cho, I.H., and Yoo, H.J. (2007). Family-based association study between GRIK2 polymorphisms and autism spectrum disorders in the Korean trios. Neurosci Res 58, 332-335.

Kirov, G., Rujescu, D., Ingason, A., Collier, D.A., O'Donovan, M.C., and Owen, M.J. (2009). Neurexin 1 (NRXN1) deletions in schizophrenia. Schizophr Bull 35, 851-854.

Kleppisch, T., Wolfsgruber, W., Feil, S., Allmann, R., Wotjak, C.T., Goebbels, S., Nave, K.A., Hofmann, F., and Feil, R. (2003). Hippocampal cGMP-dependent protein kinase I supports an age- and protein synthesis-dependent component of long-term potentiation but is not essential for spatial reference and contextual memory. J Neurosci 23, 6005-6012.

Koide, T., Banno, M., Aleksic, B., Yamashita, S., Kikuchi, T., Kohmura, K., Adachi, Y., Kawano, N., Kushima, I., Nakamura, Y., et al. (2012). Common variants in MAGI2 gene are associated with increased risk for cognitive impairment in schizophrenic patients. PloS one 7, e36836.

Koido, K., Janno, S., Traks, T., Parksepp, M., Ljubajev, U., Veiksaar, P., Must, A., Shlik, J., Vasar, V., and Vasar, E. (2014). Associations between polymorphisms of LSAMP gene and schizophrenia. Psychiatry Res 215, 797-798.

Koido, K., Traks, T., Balotsev, R., Eller, T., Must, A., Koks, S., Maron, E., Toru, I., Shlik, J., Vasar, V., et al. (2012). Associations between LSAMP gene polymorphisms and major depressive disorder and panic disorder. Transl Psychiatry 2, e152.

Koiliari, E., Roussos, P., Pasparakis, E., Lencz, T., Malhotra, A., Siever, L.J., Giakoumaki, S.G., and Bitsios, P. (2014). The CSMD1 genome-wide associated schizophrenia risk variant rs10503253 affects general cognitive ability and executive function in healthy males. Schizophr Res 154, 42-47.

Krellman, J.W., Ruiz, H.H., Marciano, V.A., Mondrow, B., and Croll, S.D. (2014). Behavioral and neuroanatomical abnormalities in pleiotrophin knockout mice. PloS one 9, e100597.

Kremmidiotis, G., Baker, E., Crawford, J., Eyre, H.J., Nahmias, J., and Callen, D.F. (1998). Localization of human cadherin genes to chromosome regions exhibiting cancer-related loss of heterozygosity. Genomics 49, 467-471.

Krummel, K.A., Roberts, L.R., Kawakami, M., Glover, T.W., and Smith, D.I. (2000). The characterization of the common fragile site FRA16D and its involvement in multiple myeloma translocations. Genomics 69, 37-46.

Lachman, H.M., Fann, C.S., Bartzis, M., Evgrafov, O.V., Rosenthal, R.N., Nunes, E.V., Miner, C., Santana, M., Gaffney, J., Riddick, A., et al. (2007). Genomewide suggestive linkage of opioid dependence to chromosome 14q. Hum Mol Genet 16, 1327-1334.

Langmead, B., and Salzberg, S.L. (2012). Fast gapped-read alignment with Bowtie 2. Nature methods 9, 357-359.

Lanore, F., Labrousse, V.F., Szabo, Z., Normand, E., Blanchet, C., and Mulle, C. (2012). Deficits in morphofunctional maturation of hippocampal mossy fiber synapses in a mouse model of intellectual disability. J Neurosci 32, 17882-17893.

Lanoue, V., Usardi, A., Sigoillot, S.M., Talleur, M., Iyer, K., Mariani, J., Isope, P., Vodjdani, G., Heintz, N., and Selimi, F. (2013). The adhesion-GPCR BAI3, a gene linked to psychiatric disorders, regulates dendrite morphogenesis in neurons. Mol Psychiatry 18, 943-950.

Lauri, S.E., Rauvala, H., Kaila, K., and Taira, T. (1998). Effect of heparin-binding growth-associated molecule (HB-GAM) on synaptic transmission and early LTP in rat hippocampal slices. Eur J Neurosci 10, 188-194.

Lee, H.J., Woo, H.G., Greenwood, T.A., Kripke, D.F., and Kelsoe, J.R. (2013a). A genome-wide association study of seasonal pattern mania identifies NF1A as a possible susceptibility gene for bipolar disorder. J Affect Disord 145, 200-207.

Lee, K., Kim, Y., Lee, S.J., Qiang, Y., Lee, D., Lee, H.W., Kim, H., Je, H.S., Sudhof, T.C., and Ko, J. (2013b). MDGAs interact selectively with neuroligin-2 but not other neuroligins to regulate inhibitory synapse development. Proc Natl Acad Sci U S A 110, 336-341.

Lesch, K.P., Timmesfeld, N., Renner, T.J., Halperin, R., Roser, C., Nguyen, T.T., Craig, D.W., Romanos, J., Heine, M., Meyer, J., et al. (2008). Molecular genetics of adult ADHD: converging evidence from genome-wide association and extended pedigree linkage studies. J Neural Transm 115, 1573-1585.

Li, Y.S., Milner, P.G., Chauhan, A.K., Watson, M.A., Hoffman, R.M., Kodner, C.M., Milbrandt, J., and Deuel, T.F. (1990). Cloning and expression of a developmentally regulated protein that induces mitogenic and neurite outgrowth activity. Science 250, 1690-1694.

Lips, E.S., Cornelisse, L.N., Toonen, R.F., Min, J.L., Hultman, C.M., International Schizophrenia Consortium, Holmans, P.A., O'Donovan, M.C., Purcell, S.M., Smit, A.B., et al. (2012). Functional gene group analysis identifies synaptic gene groups as risk factor for schizophrenia. Mol Psychiatry 17, 996-1006.

Litwack, E.D., Babey, R., Buser, R., Gesemann, M., and O'Leary, D.D. (2004). Identification and characterization of two novel brain-derived immunoglobulin superfamily members with a unique structural organization. Mol Cell Neurosci 25, 263-274.

Liu, Q.R., Drgon, T., Johnson, C., Walther, D., Hess, J., and Uhl, G.R. (2006). Addiction molecular genetics: 639,401 SNP whole genome association identifies many "cell adhesion" genes. Am J Med Genet B Neuropsychiatr Genet 141B, 918-925.

Lovci, M.T., Ghanem, D., Marr, H., Arnold, J., Gee, S., Parra, M., Liang, T.Y., Stark, T.J., Gehman, L.T., Hoon, S., et al. (2013). Rbfox proteins regulate alternative mRNA splicing through evolutionarily conserved RNA bridges. Nat Struct Mol Biol 20, 1434-1442.

Lu, W., Quintero-Rivera, F., Fan, Y., Alkuraya, F.S., Donovan, D.J., Xi, Q., Turbe-Doan, A., Li, Q.G., Campbell, C.G., Shanske, A.L., et al. (2007). NFIA haploinsufficiency is associated with a CNS malformation syndrome and urinary tract defects. PLoS Genet 3, e80.

Macintyre, G., Alford, T., Xiong, L., Rouleau, G.A., Tibbo, P.G., and Cox, D.W. (2010). Association of NPAS3 exonic variation with schizophrenia. Schizophr Res 120, 143-149.

Maher, C.A., Kumar-Sinha, C., Cao, X., Kalyana-Sundaram, S., Han, B., Jing, X., Sam, L., Barrette, T., Palanisamy, N., and Chinnaiyan, A.M. (2009). Transcriptome sequencing to detect gene fusions in cancer. Nature 458, 97-101.

Mallaret, M., Synofzik, M., Lee, J., Sagum, C.A., Mahajnah, M., Sharkia, R., Drouot, N., Renaud, M., Klein, F.A., Anheim, M., et al. (2014). The tumour suppressor gene WWOX is mutated in autosomal recessive cerebellar ataxia with epilepsy and mental retardation. Brain 137, 411-419.

Mann, F., Zhukareva, V., Pimenta, A., Levitt, P., and Bolz, J. (1998). Membrane-associated molecules guide limbic and nonlimbic thalamocortical projections. J Neurosci 18, 9409-9419.

Marchionini, D.M., Lehrmann, E., Chu, Y., He, B., Sortwell, C.E., Becker, K.G., Freed, W.J., Kordower, J.H., and Collier, T.J. (2007). Role of heparin binding growth factors in nigrostriatal dopamine system development and Parkinson's disease. Brain Res 1147, 77-88.

Marshall, C.R., Young, E.J., Pani, A.M., Freckmann, M.L., Lacassie, Y., Howald, C., Fitzgerald, K.K., Peippo, M., Morris, C.A., Shane, K., et al. (2008). Infantile spasms is associated with deletion of the MAGI2 gene on chromosome 7q11.23-q21.11. Am J Hum Genet 83, 106-111.

Martin, C.L., Duvall, J.A., Ilkin, Y., Simon, J.S., Arreaza, M.G., Wilkes, K., Alvarez-Retuerto, A., Whichello, A., Powell, C.M., Rao, K., et al. (2007). Cytogenetic and molecular characterization of A2BP1/FOX1 as a candidate gene for autism. Am J Med Genet B Neuropsychiatr Genet 144B, 869-876.

Matter, C., Pribadi, M., Liu, X., and Trachtenberg, J.T. (2009). Delta-catenin is required for the maintenance of neural structure and function in mature cortex in vivo. Neuron 64, 320-327.

Mattheisen, M., Samuels, J.F., Wang, Y., Greenberg, B.D., Fyer, A.J., McCracken, J.T., Geller, D.A., Murphy, D.L., Knowles, J.A., Grados, M.A., et al. (2015). Genome-wide association study in obsessive-compulsive disorder: results from the OCGAS. Mol Psychiatry 20, 337-344.

McCarthy, M.J., Nievergelt, C.M., Kelsoe, J.R., and Welsh, D.K. (2012). A survey of genomic studies supports association of circadian clock genes with bipolar disorder spectrum illnesses and lithium response. PloS one 7, e32091.

Medina, M., Marinescu, R.C., Overhauser, J., and Kosik, K.S. (2000). Hemizygosity of delta-catenin (CTNND2) is associated with severe mental retardation in cri-du-chat syndrome. Genomics 63, 157-164.

Micheau, J., Vimeney, A., Normand, E., Mulle, C., and Riedel, G. (2014). Impaired hippocampus-dependent spatial flexibility and sociability represent autism-like phenotypes in GluK2 mice. Hippocampus 24, 1059-1069.

Mikhail, F.M., Lose, E.J., Robin, N.H., Descartes, M.D., Rutledge, K.D., Rutledge, S.L., Korf, B.R., and Carroll, A.J. (2011). Clinically relevant single gene or intragenic deletions encompassing critical neurodevelopmental genes in patients with developmental delay, mental retardation, and/or autism spectrum disorders. Am J Med Genet A 155A, 2386-2396.

Motazacker, M.M., Rost, B.R., Hucho, T., Garshasbi, M., Kahrizi, K., Ullmann, R., Abedini, S.S., Nieh, S.E., Amini, S.H., Goswami, C., et al. (2007). A defect in the ionotropic glutamate receptor 6 gene (GRIK2) is associated with autosomal recessive mental retardation. Am J Hum Genet 81, 792-798.

Must, A., Tasa, G., Lang, A., Vasar, E., Koks, S., Maron, E., and Vali, M. (2008). Association of limbic system-associated membrane protein (LSAMP) to male completed suicide. BMC Med Genet 9, 34.

Nakanishi, K., Tokita, Y., Aono, S., Ida, M., Matsui, F., Higashi, Y., and Oohira, A. (2010). Neuroglycan C, a brain-specific chondroitin sulfate proteoglycan, interacts with pleiotrophin, a heparin-binding growth factor. Neurochem Res 35, 1131-1137.

Niederkofler, V., Baeriswyl, T., Ott, R., and Stoeckli, E.T. (2010). Nectin-like molecules/SynCAMs are required for post-crossing commissural axon guidance. Development 137, 427-435.

Nivard, M.G., Mbarek, H., Hottenga, J.J., Smit, J.H., Jansen, R., Penninx, B.W., Middeldorp, C.M., and Boomsma, D.I. (2014). Further confirmation of the association between anxiety and CTNND2: replication in humans. Genes Brain Behav 13, 195-201.

Noor, A., Lionel, A.C., Cohen-Woods, S., Moghimi, N., Rucker, J., Fennell, A., Thiruvahindrapuram, B., Kaufman, L., Degagne, B., Wei, J., et al. (2014). Copy number variant study of bipolar disorder in Canadian and UK populations implicates synaptic genes. Am J Med Genet B Neuropsychiatr Genet 165B, 303-313.

Nurnberger, J.I., Jr., Koller, D.L., Jung, J., Edenberg, H.J., Foroud, T., Guella, I., Vawter, M.P., Kelsoe, J.R., and Psychiatric Genomics Consortium Bipolar Group (2014). Identification of pathways for bipolar disorder: a meta-analysis. JAMA Psychiatry 71, 657-664.

Oliver, P.L., Finelli, M.J., Edwards, B., Bitoun, E., Butts, D.L., Becker, E.B., Cheeseman, M.T., Davies, B., and Davies, K.E. (2011). Oxr1 is essential for protection against oxidative stress-induced neurodegeneration. PLoS Genet 7, e1002338.

Paige, A.J., Taylor, K.J., Stewart, A., Sgouros, J.G., Gabra, H., Sellar, G.C., Smyth, J.F., Porteous, D.J., and Watson, J.E. (2000). A 700-kb physical map of a region of 16q23.2 homozygously deleted in multiple cancers and spanning the common fragile site FRA16D. Cancer Res 60, 1690-1697.

Pan, Y., Wang, K.S., and Aragam, N. (2011). NTM and NR3C2 polymorphisms influencing intelligence: family-based association studies. Prog Neuropsychopharmacol Biol Psychiatry 35, 154-160.

Park, C., Falls, W., Finger, J.H., Longo-Guess, C.M., and Ackerman, S.L. (2002). Deletion in Catna2, encoding alpha N-catenin, causes cerebellar and hippocampal lamination defects and impaired startle modulation. Nat Genet 31, 279-284.

Paul, C., Schoberl, F., Weinmeister, P., Micale, V., Wotjak, C.T., Hofmann, F., and Kleppisch, T. (2008). Signaling through cGMP-dependent protein kinase I in the amygdala is critical for auditory-cued fear memory and long-term potentiation. J Neurosci 28, 14202-14212.

Pettem, K.L., Yokomaku, D., Takahashi, H., Ge, Y., and Craig, A.M. (2013). Interaction between autism-linked MDGAs and neuroligins suppresses inhibitory synapse development. J Cell Biol 200, 321-336.

Pickard, B.S., Christoforou, A., Thomson, P.A., Fawkes, A., Evans, K.L., Morris, S.W., Porteous, D.J., Blackwood, D.H., and Muir, W.J. (2009). Interacting haplotypes at the NPAS3 locus alter risk of schizophrenia and bipolar disorder. Mol Psychiatry 14, 874-884.

Pickard, B.S., Malloy, M.P., Porteous, D.J., Blackwood, D.H., and Muir, W.J. (2005). Disruption of a brain transcription factor, NPAS3, is associated with schizophrenia and learning disability. Am J Med Genet B Neuropsychiatr Genet 136B, 26-32.

Pickard, B.S., Pieper, A.A., Porteous, D.J., Blackwood, D.H., and Muir, W.J. (2006). The NPAS3 gene--emerging evidence for a role in psychiatric illness. Ann Med 38, 439-448.

Pieper, A.A., Wu, X., Han, T.W., Estill, S.J., Dang, Q., Wu, L.C., Reece-Fincanon, S., Dudley, C.A., Richardson, J.A., Brat, D.J., et al. (2005). The neuronal PAS domain protein 3 transcription factor controls FGF-mediated adult hippocampal neurogenesis in mice. Proc Natl Acad Sci U S A 102, 14052-14057.

Pimenta, A.F., Fischer, I., and Levitt, P. (1996). cDNA cloning and structural analysis of the human limbic-system-associated membrane protein (LAMP). Gene 170, 189-195.

Pimenta, A.F., Zhukareva, V., Barbe, M.F., Reinoso, B.S., Grimley, C., Henzel, W., Fischer, I., and Levitt, P. (1995). The limbic system-associated membrane protein is an Ig superfamily member that mediates selective neuronal growth and axon targeting. Neuron 15, 287-297.

Piper, M., Barry, G., Hawkins, J., Mason, S., Lindwall, C., Little, E., Sarkar, A., Smith, A.G., Moldrich, R.X., Boyle, G.M., et al. (2010). NFIA controls telencephalic progenitor cell differentiation through repression of the Notch effector Hes1. J Neurosci 30, 9127-9139.

Qin, H., Samuels, J.F., Wang, Y., Zhu, Y., Grados, M.A., Riddle, M.A., Greenberg, B.D., Knowles, J.A., Fyer, A.J., McCracken, J.T., et al. (2015). Whole-genome association analysis of treatment response in obsessive-compulsive disorder. Mol Psychiatry.

Qiu, S., Champagne, D.L., Peters, M., Catania, E.H., Weeber, E.J., Levitt, P., and Pimenta, A.F. (2010). Loss of limbic system-associated membrane protein leads to reduced hippocampal mineralocorticoid receptor expression, impaired synaptic plasticity, and spatial memory deficit. Biol Psychiatry 68, 197-204.

Rabaneda, L.G., Robles-Lanuza, E., Nieto-Gonzalez, J.L., and Scholl, F.G. (2014). Neurexin dysfunction in adult neurons results in autistic-like behavior in mice. Cell reports 8, 338-346.

Raulo, E., Chernousov, M.A., Carey, D.J., Nolo, R., and Rauvala, H. (1994). Isolation of a neuronal cell surface receptor of heparin binding growth-associated molecule (HB-GAM). Identification as N-syndecan (syndecan-3). J Biol Chem 269, 12999-13004.

Redies, C., Hertel, N., and Hubner, C.A. (2012). Cadherins and neuropsychiatric disorders. Brain Res 1470, 130-144.

Reissner, C., Klose, M., Fairless, R., and Missler, M. (2008). Mutational analysis of the neurexin/neuroligin complex reveals essential and regulatory components. Proc Natl Acad Sci U S A 105, 15124-15129.

Reynolds, L.M., Gifuni, A.J., McCrea, E.T., Shizgal, P., and Flores, C. (2015). dcc haploinsufficiency results in blunted sensitivity to cocaine enhancement of reward seeking. Behav Brain Res.

Riener, M.O., Nikolopoulos, E., Herr, A., Wild, P.J., Hausmann, M., Wiech, T., Orlowska-Volk, M., Lassmann, S., Walch, A., and Werner, M. (2008). Microarray comparative genomic hybridization analysis of tubular breast carcinoma shows recurrent loss of the CDH13 locus on 16q. Human pathology 39, 1621-1629.

Riou, P., Saffroy, R., Comoy, J., Gross-Goupil, M., Thiery, J.P., Emile, J.F., Azoulay, D., Piatier-Tonneau, D., Lemoine, A., and Debuire, B. (2002). Investigation in liver tissues and cell lines of the transcription of 13 genes mapping to the 16q24 region that are frequently deleted in hepatocellular carcinoma. Clinical cancer research : an official journal of the American Association for Cancer Research 8, 3178-3186.

Rivero, O., Sich, S., Popp, S., Schmitt, A., Franke, B., and Lesch, K.P. (2013). Impact of the ADHD-susceptibility gene CDH13 on development and function of brain networks. Eur Neuropsychopharmacol 23, 492-507.

Rose, E.J., Morris, D.W., Hargreaves, A., Fahey, C., Greene, C., Garavan, H., Gill, M., Corvin, A., and Donohoe, G. (2013). Neural effects of the CSMD1 genome-wide associated schizophrenia risk variant rs10503253. Am J Med Genet B Neuropsychiatr Genet 162B, 530-537.

Rujescu, D., Ingason, A., Cichon, S., Pietilainen, O.P., Barnes, M.R., Toulopoulou, T., Picchioni, M., Vassos, E., Ettinger, U., Bramon, E., et al. (2009). Disruption of the neurexin 1 gene is associated with schizophrenia. Hum Mol Genet 18, 988-996.

Sanz, R., Ferraro, G.B., and Fournier, A.E. (2015). IgLON cell adhesion molecules are shed from the cell surface of cortical neurons to promote neuronal growth. J Biol Chem 290, 4330-4342.

Sato, M., Mori, Y., Sakurada, A., Fujimura, S., and Horii, A. (1998). The H-cadherin (CDH13) gene is inactivated in human lung cancer. Hum Genet 103, 96-101.

Schaaf, C.P., Boone, P.M., Sampath, S., Williams, C., Bader, P.I., Mueller, J.M., Shchelochkov, O.A., Brown, C.W., Crawford, H.P., Phalen, J.A., et al. (2012). Phenotypic spectrum and genotype-phenotype correlations of NRXN1 exon deletions. Eur J Hum Genet 20, 1240-1247.

Schoorlemmer, J., and Goldfarb, M. (2001). Fibroblast growth factor homologous factors are intracellular signaling proteins. Curr Biol 11, 793-797.

Sebat, J., Lakshmi, B., Malhotra, D., Troge, J., Lese-Martin, C., Walsh, T., Yamrom, B., Yoon, S., Krasnitz, A., Kendall, J., et al. (2007). Strong association of de novo copy number mutations with autism. Science 316, 445-449.

Shirai, Y., Kouzuki, T., Kakefuda, K., Moriguchi, S., Oyagi, A., Horie, K., Morita, S.Y., Shimazawa, M., Fukunaga, K., Takeda, J., et al. (2010). Essential role of neuron-enriched diacylglycerol kinase (DGK), DGKbeta in neurite spine formation, contributing to cognitive function. PloS one 5, e11602.

Shu, T., Butz, K.G., Plachez, C., Gronostajski, R.M., and Richards, L.J. (2003). Abnormal development of forebrain midline glia and commissural projections in Nfia knock-out mice. J Neurosci 23, 203-212.

Smallwood, P.M., Munoz-Sanjuan, I., Tong, P., Macke, J.P., Hendry, S.H., Gilbert, D.J., Copeland, N.G., Jenkins, N.A., and Nathans, J. (1996). Fibroblast growth factor (FGF) homologous factors: new members of the FGF family implicated in nervous system development. Proc Natl Acad Sci U S A 93, 9850-9857.

Srour, M., Riviere, J.B., Pham, J.M., Dube, M.P., Girard, S., Morin, S., Dion, P.A., Asselin, G., Rochefort, D., Hince, P., et al. (2010). Mutations in DCC cause congenital mirror movements. Science 328, 592.

Steen, V.M., Nepal, C., Ersland, K.M., Holdhus, R., Naevdal, M., Ratvik, S.M., Skrede, S., and Havik, B. (2013). Neuropsychological deficits in mice depleted of the schizophrenia susceptibility gene CSMD1. PloS one 8, e79501.

Struyk, A.F., Canoll, P.D., Wolfgang, M.J., Rosen, C.L., D'Eustachio, P., and Salzer, J.L. (1995). Cloning of neurotrimin defines a new subfamily of differentially expressed neural cell adhesion molecules. J Neurosci 15, 2141-2156.

Sudhof, T.C. (2008). Neuroligins and neurexins link synaptic function to cognitive disease. Nature 455, 903-911.

Suzuki, H., Katayama, K., Takenaka, M., Amakasu, K., Saito, K., and Suzuki, K. (2009). A spontaneous mutation of the Wwox gene and audiogenic seizures in rats with lethal dwarfism and epilepsy. Genes Brain Behav 8, 650-660.

Terracciano, A., Esko, T., Sutin, A.R., de Moor, M.H., Meirelles, O., Zhu, G., Tanaka, T., Giegling, I., Nutile, T., Realo, A., et al. (2011). Meta-analysis of genome-wide association studies identifies common variants in CTNNA2 associated with excitement-seeking. Transl Psychiatry 1, e49.

Turner, C.A., Watson, S.J., and Akil, H. (2012). The fibroblast growth factor family: neuromodulation of affective behavior. Neuron 76, 160-174.

Turner, T.N., Sharma, K., Oh, E.C., Liu, Y.P., Collins, R.L., Sosa, M.X., Auer, D.R., Brand, H., Sanders, S.J., Moreno-De-Luca, D., et al. (2015). Loss of delta-catenin function in severe autism. Nature 520, 51-56.

Uemura, M., and Takeichi, M. (2006). Alpha N-catenin deficiency causes defects in axon migration and nuclear organization in restricted regions of the mouse brain. Dev Dyn 235, 2559-2566.

Underwood, J.G., Boutz, P.L., Dougherty, J.D., Stoilov, P., and Black, D.L. (2005). Homologues of the Caenorhabditis elegans Fox-1 protein are neuronal splicing regulators in mammals. Mol Cell Biol 25, 10005-10016.

Ushkaryov, Y.A., Petrenko, A.G., Geppert, M., and Sudhof, T.C. (1992). Neurexins: synaptic cell surface proteins related to the alpha-latrotoxin receptor and laminin. Science 257, 50-56.

Vaags, A.K., Lionel, A.C., Sato, D., Goodenberger, M., Stein, Q.P., Curran, S., Ogilvie, C., Ahn, J.W., Drmic, I., Senman, L., et al. (2012). Rare deletions at the neurexin 3 locus in autism spectrum disorder. Am J Hum Genet 90, 133-141.

van den Oord, E.J., Kuo, P.H., Hartmann, A.M., Webb, B.T., Moller, H.J., Hettema, J.M., Giegling, I., Bukszar, J., and Rujescu, D. (2008). Genomewide association analysis followed by a replication study implicates a novel candidate gene for neuroticism. Arch Gen Psychiatry 65, 1062-1071.

Volkert, M.R., Elliott, N.A., and Housman, D.E. (2000). Functional genomics reveals a family of eukaryotic oxidation protection genes. Proc Natl Acad Sci U S A 97, 14530-14535.

Vrijenhoek, T., Buizer-Voskamp, J.E., van der Stelt, I., Strengman, E., Genetic Risk and Outcome in Psychosis (GROUP) Consortium, Sabatti, C., Geurts van Kessel, A., Brunner, H.G., Ophoff, R.A., et al. (2008). Recurrent CNVs disrupt three candidate genes in schizophrenia patients. Am J Hum Genet 83, 504-510.

Walsh, T., McClellan, J.M., McCarthy, S.E., Addington, A.M., Pierce, S.B., Cooper, G.M., Nord, A.S., Kusenda, M., Malhotra, D., Bhandari, A., et al. (2008). Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia. Science 320, 539-543.

Yamagata, M., and Sanes, J.R. (2008). Dscam and Sidekick proteins direct lamina-specific synaptic connections in vertebrate retina. Nature 451, 465-469.

Yamagata, M., Weiner, J.A., and Sanes, J.R. (2002). Sidekicks: synaptic adhesion molecules that promote lamina-specific connectivity in the retina. Cell 110, 649-660.

Yuan, L., Seong, E., Beuscher, J.L., and Arikkath, J. (2015). delta-Catenin Regulates Spine Architecture via Cadherin and PDZ-dependent Interactions. J Biol Chem 290, 10947-10957.

Zacco, A., Cooper, V., Chantler, P.D., Fisher-Hyland, S., Horton, H.L., and Levitt, P. (1990). Isolation, biochemical characterization and ultrastructural analysis of the limbic system-associated membrane protein (LAMP), a protein expressed by neurons comprising functional neural circuits. J Neurosci 10, 73-90.

Zahir, F.R., Baross, A., Delaney, A.D., Eydoux, P., Fernandes, N.D., Pugh, T., Marra, M.A., and Friedman, J.M. (2008). A patient with vertebral, cognitive and behavioural abnormalities and a de novo deletion of NRXN1alpha. J Med Genet 45, 239-243.

Zhao, Z., Wang, Z., Gu, Y., Feil, R., Hofmann, F., and Ma, L. (2009). Regulate axon branching by the cyclic GMP pathway via inhibition of glycogen synthase kinase 3 in dorsal root ganglion sensory neurons. J Neurosci 29, 1350-1360.