Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene...

46
1 Supplementary Materials Revealing global regulatory perturbations across human cancers Hani Goodarzi # , Olivier Elemento # , and Saeed Tavazoie * Department of Molecular Biology & Lewis-Sigler Institute for Integrative Genomics Princeton University, Princeton, NJ 08544.

Transcript of Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene...

Page 1: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

1

Supplementary Materials

Revealing global regulatory perturbations across human cancers

Hani Goodarzi#, Olivier Elemento#, and Saeed Tavazoie*

Department of Molecular Biology & Lewis-Sigler Institute for Integrative Genomics

Princeton University, Princeton, NJ 08544.

Page 2: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

2

Supplemental Procedures In what follows, we provide a detailed description of the methods used in this study. The iPAGE software along with a minitutorial and other results from this study (both experimental and computational) are available online at http://tavazoielab.princeton.edu/iPAGE/. Our approach, described here, involves the discovery of the informative cis-regulatory elements and cellular pathways from gene expression datasets; a subsequent analysis recovers the pathways that are likely regulated by the identified putative binding sites. A schematic of the FIRE/iPAGE framework is presented in Figure 1 and Figure S12. Pre-processing of input datasets All cancer microarray datasets used in this study where downloaded from GEO (http://www.ncbi.nlm.nih.gov/projects/geo/). Each cancer versus normal dataset was converted into continuous or discrete gene expression profiles, as follows. In the continuous case (i.e., urinary bladder cancer), each gene was associated with a continuous expression value based on the following equation: (1) , where p is a p-value calculated by performing a Student’s t-test between the cancer samples and the normal controls. s is the sign of the difference between the average values in these two sets. Thus, v indicates the extent to which a gene is up-regulated or down-regulated in the cancer state with maximal and minimal values of 1 and -1 respectively. In the discrete case, genes were first clustered into ~ groups (N is the total number of genes) based on their expression values in the normal and tumor samples, using the k-means unsupervised clustering approach. Then the clusters whose average expressions did not differ between the normal and cancer samples (nominal p-value from t-test > 0.05, where the t-test is performed on the expression profiles in each cluster) were combined into a single background cluster. Subsequently, each gene was associated with the cluster index of the cluster to which it belongs. FIRE: De novo discovery of informative regulatory elements FIRE was used with default settings, as described in Elemento et al, 2007. iPAGE: A detailed explanation of the algorithm Expression profile An expression profile is defined across N genes, where each gene is associated with a unique expression measure. Expression measures, discrete or continuous, can be obtained from a variety of gene-level measurements or analyses. For example, cluster indices from a partitioning process or the ranks obtained from sorting are discrete measures; whereas, results from a single microarray or any continuous-type statistic (e.g., p-values) are continuous values. In this study we have demonstrated this unifying capacity of iPAGE; e.g., in the bladder carcinoma we have used a continuous statistic derived from Student’s t-test while in the BL vs DLBCL case we employed discrete indices obtained from clustering of gene expression values across all the samples. From here forward, we refer to these lists of input values as expression profiles. Schematized continuous and discrete expression profiles are shown in Figure S13.

Page 3: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

3

Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology annotations). For each pathway, the pathway profile is defined as a binary vector with N elements, one for each gene. In this profile, “1” indicates that the gene belongs to the pathway and “0” indicates that it does not. A schematized pathway profile is shown in Figure S13. Quantizing continuous expression profiles Although the concept of mutual information is defined for both discrete and continuous random variables, in practice, continuous data are discretized before calculating the mutual information (MI) values. Our quantization procedure is based on the maximum entropy principle (so as to make the least assumptions about the underlying data distribution), and involves using equally populated “expression bins”. Thus, the discretization step only requires a single parameter, i.e., the number of genes in each bin. In the default iPAGE settings, the number of bins (Ne) is determined by: (2) where Nm is the number of bins in the pathway profile (here Nm=2). Although determining Ne values from Eqn. (2) allows a reliable calculation of mutual information (Slonim et al., 2005), other values can also be explored by the user. In this study, we used the continuous mode in one of the datasets and variations in the number of bins did not significantly change the results. Indeed, when we ran FIRE and iPAGE on the bladder carcinoma dataset with various numbers of bins (10, 50, 100 and 250), the identified seeds (k-mers) largely overlapped, with hypergeometric p-values always less than 1e-53 (down to 1e-281 in some comparisons). We made the same observation for the number of iPAGE-identified pathways, with hypergeometric p-values always less than 1e-20 (down to 1e-83). Calculating the mutual information values Given a pathway profile and an expression profile with Ne bins (or clusters), we create a table C of dimensions 2× Ne, in which C(1, j) represents the number of genes that are contained in the jth expression bin and are also present in the given pathway. C(2, j), on the other hand, contains the number of genes that are in the jth expression bin but are not assigned to the pathway. Given this table, we calculate the empirical mutual information as follows:

(3) ,

where , and .

Randomization-based statistical testing To assess the statistical significance of the calculated MI values, we use a non-parametric randomization-based statistical test. Given I as the real MI value and keeping the pathway profile unaltered, the expression profile is shuffled 10,000 times and the corresponding MI values Irandom are calculated. A pathway is accepted only if I is larger than (1-max_p) of the Irandom values (max_p is set to 0.005 by default). This corresponds to a p-value < 0.005. In iPAGE, pathways are first sorted by information (from informative to non-informative). Starting from the most informative pathway, the statistical test described above is applied to each pathway, and pathways that pass the test are returned (provided they also pass the conditional information test described below). When k contiguous pathways in the sorted list do not pass the test, the procedure is stopped (k is set to 20 by default). Removing redundantly informative pathways

Page 4: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

4

Due to the hierarchical and nested nature of pathway annotations (e.g. Gene Ontology), many pathways display some level of redundancy, i.e., two pathways may be represented by very similar sets of genes (e.g. GO:0006511, ubiquitin dependent protein catabolic process and GO:0019941, modification dependent protein catabolic process). To discover representative pathways and remove redundant ones, we require that each returned pathway be highly informative about the expression profile, but also bring a significant amount of new information compared to all other significantly informative pathways as calculated by conditional mutual information (Cover and Thomas, 2006). To achieve this, we require that each candidate pathway fulfills

(4)

for all already accepted pathways, i.e., all pathways that have already passed the statistical and conditional information tests. An identical criterion was used in FIRE (Elemento et al., 2007). In iPAGE, r is set to 5 by default and only the pathways satisfying the above equation are presented in the graphical output; however, the list of all significant pathways is also created and stored as a text file. Pathway over- and under-representation Informative pathways are generally over-represented or under-represented in certain expression clusters/bins. To quantify the level of over- and under-representation, the hypergeometric distribution is used to calculate two distinct p-values:

(5) for over-representation

and

(6) for under-representation,

where x equals the number of genes in the given expression bin/cluster which are also assigned to the given pathway. m is the number of genes assigned to the pathway, n is the number of genes in the expression bin and N is the total number of genes. If pover<punder, we consider the pathway to be over-represented in the expression bin/cluster; otherwise, it is under-represented. iPAGE graphical output The over- and under-representation p-values described in the previous section are used to draw a heatmap, i.e., a graphical representation of pathway over- and under-representation across all expression bins/clusters. In this heatmap, the rows represent the significantly informative pathways and the columns are the expression bins/clusters. Colors indicate over- or under-representation levels. The red color-map indicates (in log10) the over-representation p-values; whereas, the blue color-map shows under-representation. Additional iPAGE output files In addition to the graphical heatmap, iPAGE generates files containing the actual log(p-values) for over- and under-representations, and the list of removed redundant pathways.

Page 5: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

5

False Discovery Rate (FDR) In order to measure the FDR of our method, we randomly shuffled the gene labels of the gene expression profile and counted the number of pathways discovered compared to the non-shuffled data. We have tabulated the results in Table S1, for the BL vs DLBCL and bladder cancer expression datasets (continuous and clustered profiles). iPAGE command line The basic command line syntax for iPAGE is : perl page.pl --expfile=<inp> --species=<sp> --exptype=<type> where <inp> indicates the input expression profile (a two-column tab-delimited text file with gene names in the first column and expression measures in the second), <sp> indicates the species, and <type> indicates whether the expression profile is discrete (e.g., cluster indices) or continuous (e.g., expression values obtained from a single microarray experiment). We have prepackaged pathway annotations for many species, ranging from bacteria to human. For example, the following command line will run iPAGE on a continuous E. coli expression profile : perl page.pl --expfile=./TEST/continuous.exp --species=human_go --exptype=continuous iPAGE creates an expfile_PAGE directory where the results are saved to (./TEST/continuous.exp_PAGE in this case). Pathway-Regulatory Interaction Map Generator (PRMG) Motif definition As described in (Elemento et al., 2007), regulatory elements (motifs) are defined as regular expressions and can only consist of the following characters: A, C, G, T, [AC], [AG], [AT], [CG], [CT], [GT], [ACG], [ACT], [AGT], [CGT], and N (equivalent to [ACGT]). Motif profile We look for motifs both in 5’ upstream (DNA motifs) and in 3’UTR sequences (RNA motifs). Given a motif, the motif profile is defined as a binary vector with N elements, where for each gene, “+1” indicates the presence and “0” indicates the absence of the motif in the corresponding promoter (or 3’UTR). “1” indicates that at least one match of the regular expression is present in the sequences (see Figure S14). For 5’ sequences both strands are searched; whereas, in 3’ UTR sequences only the transcribed strand is considered. We used a generic definition of active motif profile (Elemento et al., 2007) to build the pathway-regulatory map; i.e., we only count the motif occurrences that are in expression cluster/bins in which the motif is over-represented. This approach filters out motif occurrences that are unlikely to be functional. Pathway Profiles For each pathway, the pathway profile is defined the same as in iPAGE. Creating pathway-regulatory interaction maps

Page 6: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

6

In the first step, we calculate the mutual information between the motif profile and pathway profile for each pair of motifs and pathways. We then assess the significance of these associations through 1,000 random shuffles of the motif profile and recalculating the MI values. By default, a category is accepted only if the real MI is larger than 995 of the random values. The associations that pass this test are deemed significant and their under- or over-representation p-values are calculated using equations (5) and (6). Graphical output We build a matrix with motifs as columns and pathways as rows where the non-zero elements represent the –log10(p-value) in case of over-representations and log10(p-value) otherwise. This matrix is then visualized as a blue-red heatmap with red and blue elements representing positive and negative associations respectively. A schematic representation of this method is shown in Figure S14. PRMG command line The PRMG script (prmg.pl) is part of the iPAGE package and is located in the PAGEvx.x directory; however, it also relies on FIRE outputs to run (FIRE is available at http://tavazoielab.princeton.edu/FIRE/): export FIREDIR=/path/to/FIRE export PAGEDIR=/path/to/iPAGE perl prmg.pl --expfile=<inp> --species=<sp> where <inp> indicates the input expression profile, <sp> indicates the species. The script does not work on expfile itself, but uses it to locate iPAGE and FIRE summary files (in expfile_PAGE and expfile_FIRE directories). For example, the following command line will run PRMG on a continuous expression profile: perl prmg.pl --expfile=./TEST/continuous.exp --species=human_go The results are written to a motif_cat.cdt file and the graphical representations are created in the motif_cat.eps and motif_cat.pdf files. In order to run this program you also need to install the cluster 3.0 perl binder at http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/software.htm and define a global variable termed USRDIR pointing to the directory where this binder is installed: export USRDIR=/path/to/cluster3 Experimental validation of the discovered regulatory associations Transfection of siRNAs targeting Elk1 and NFYA transcription factors ON-TargetplusTM (Dharcomon) set of siRNAs for each TF were transfected into MDA-MB-231 cells (growin in D10F medium) using LipofectamineTM 2000 (invitrogen). 72 hours after transfection, RNA samples were extracted from the cells (mirVana™ miRNA Isolation Kit) and were subjected to cDNA synthesis (SuperScript® III RTS First-Strand cDNA Synthesis Kit from Invitrogen). mRNA knock-down in each sample was verified using SYBRE Green qPCR reactions (Universal ProbeLibrary Assay Design Center, Roche Applied Science). For each TF, we selected two of the successfully knocked-down transfections and extracted their total RNA along with mock-transfected cells as controls. We then differentially labeled the RNA samples with Cy3 and Cy5 dyes and hybridized them to Agilent human gene expression arrays (4×44k). The genes with significant discordant changes between the two biological replicates were filtered out and for the rest, the Cy3/Cy5 values were

Page 7: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

7

averaged and combined into a single dataset as log of ratios. The expression profiles are available at http://tavazoielab.princeton.edu/iPAGE/. Transfection of decoy and scrambled oligonucleotide sequences For the validation experiments, we chose two of the genes implicated by FIRE to have a version of AAAA[ATG]TT (NM_000337 and NM_001024660). For each gene, we then synthesized a 19bp sequence containing the AAAA[ATG]TT motif. These sequences were also randomly shuffled to create scrambled sequences as controls. The resulting sequences were synthesized as double stranded oligonucleotides: Decoy1: caattGAAATTTTGagcaa, Scrambled1: gtTtATAcAcTaaaGaTGa, Decoy2: gctggAAAAAATTTaagac, Scrambled2: aagATTgctAgAAgAaATc. We then transfected these oligonucleotides into MDA-MB-231 cells grown in D10F medium at a concentration of 1 µM (TransIT®-Express Transfection Reagent). 72 hours post-transfection, we extracted RNA and differentially labeled the samples with Cy3 or Cy5 dyes. The samples were then hybridized to Agilent human gene expression arrays (4×44k). The Cy3/Cy5 ratios from the two sets were then averaged, filtered and combined in a single dataset as log of ratios. In this step, we filtered out ~2000 genes that showed significantly discordant expression level changes in the two biological replicates. The expression profile is available at http://tavazoielab.princeton.edu/iPAGE/.

Page 8: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

8

Supplemental Results iPAGE and Gene-set Enrichment Analysis As mentioned in the main text, a variety of powerful approaches have been developed for gene-set enrichment analysis (e.g., Sinha et al., 2008; Subramanian et al., 2005). These computational methods rely on different statistical tests to assess the non-random distribution of pathway-membership across the expression values. For examples, Sinha et al. (2008) use the hypergeometric distribution to calculate the overlap between gene-expression clusters and pathway-memberships. Subramanian et al. (2005), on the other hand, use a rank-based approach to discover non-random patterns. These approaches are well-defined, powerful and tested; however, in a setting where a large number of statistical tests are performed, sensitivity is reduced mainly because of multiple-testing corrections. In iPAGE, we employ mutual information to tackle this problem. Although iPAGE is only one component in our broader approach, it provides major advantages over the previous methodologies:

1. In comparison with the hypergeometric distribution, using a mutual information-based method notably decreases the number of statistical tests necessary (by the number of clusters/bins), resulting in a higher sensitivity at the same false discovery rate.

2. Mutual information also enables iPAGE to analyze both discrete and continuous inputs making it a universal approach for analyzing any type of data without the need for any upstream analysis. This is particularly important in this study where we analyze different whole-genome datasets.

3. iPAGE is also capable of discovering non-monotonic associations, a category that is masked in rank-based methods. In these associations, different components of a pathway may show opposite expression patterns relative to each other. For example, a number of metabolic pathways include both biosynthetic and catabolic genes and an increase in the final product typically requires the simultaneous down-regulation of the catabolic genes and up-regulation of the biosynthetic genes.

4. In iPAGE, we use conditional information to substantially reduce the redundancy, which in turn results in a concise and manageable graphical output.

5. We have been extra careful in limiting the computational expenses and our package is relatively fast. Also, we have attempted to make iPAGE as user-friendly as possible. In addition to the complete command-line version, we have developed Graphical User Interfaces for both Mac OS X and windows users for rapid and every-day use of the application by non-experts.

The Regulatory Network of Bladder Carcinoma The general method for meta-analysis of bladder cancer vs normal dataset used in the main text (Dyrskjot et al., 2004) reveals the most prominent signatures of a cancer state: in this case, a faster cell cycle and a repressed immune response through the regulatory effects of E2F and SEF1/E47 transcription factors respectively. We also identified a range of significant cis-regulatory elements, including a putative 3’UTR element, NUNGNUGU (seed UAGAUGU/TAGATGT) (Figure 2B, main text). Our approach also reveals that several of these motifs co-occur in promoters or 3’UTRs of the same genes (Figure S1A), thus suggesting possible cooperations between the regulatory factors that bind them. In order to provide additional evidence for these predicted cooperations, we used the approach described in Pilpel et al. (2001) to compare the extent to which two of the novel motifs (one DNA and one RNA) cooperate with the Elk-1 motif in co-regulating their target genes. First, we selected 1000 random pairs of genes and used Pearson correlation to calculate the correlation coefficient (R) between the expression levels of each pair across the bladder carcinoma dataset (Dyrskjot et al., 2004). We then repeated the same procedure, this time on the set of genes harboring an

Page 9: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

9

Elk1 motif in their upstream sequence. As it is shown in Figure S1B, the distribution of the resulting R-values from Elk1 target genes is shifted to the right compared to the random background values (p-value <1e-14). The genes harboring the DNA motif [ACG]ACGT[CT][CGT][AG][CGT] (seed ACGTCGG), show a distribution similar to that of Elk1 (p-value <1e-15) and focusing on the genes that harbor both of the motifs results in an even larger shift towards higher R-values (p-value <1e-15). The p-values reported in each case have been calculated using Mann-Whitney test, comparing each distribution to that of the background. Repeating this procedure for [ACU]U[ACU]G[ACG]UGU (seed UAGAUGU/TAGATGT), resulted in comparable distributions (Figure S1B). These observations further highlight the biological relevance of the discovered novel motifs and provide additional support for the predicted motif interactions. While the continuous analysis of bladder carcinoma signatures apparently captures a global picture of deregulations, it does not take into account intra-cancer variations and may hide pathways that are deregulated in different subsets of samples. In order to increase our sensitivity in capturing a broader range of pathway deregulations, we first clustered the genes based on their expression across normal and tumor samples and then combined the clusters with low average differences between normal and tumor into a single background cluster. The informative pathways that iPAGE discovered from this discrete input data were generally similar to the pathways obtained when analyzing the continuous profile, as described in the main text. However, iPAGE also identified NF-κB, PI3-K and Rho signaling pathways as globally up-regulated in the tumor samples (Figure S15). Genes involved in “Negative regulation of apoptosis” also show a substantial increase in their expression compared to normal samples (Figure S15). The identification of additional pathways without an increase in FDR (Table S1) suggests that using a cluster-based approach, which takes into account the intra-cancer variability, may increase sensitivity and reveal additional pathways. Interestingly, the benefits of using this approach become even more apparent when we search for cis-regulatory elements. We identified 128 significantly informative motifs, a summarized version of which is presented in Figure S16. The sequence motifs discovered in the continuous analysis are included among the new set of motifs (e.g. E2F and SEF-1); however, this analysis reveals many new known and novel regulatory elements including p53, Elk-1, Sp1, NF-Y and CRE-BP1 (Figure S16). Among the known regulatory elements, Elk-1, Sp1, NF-Y and E2F show a significant association with “DNA replication” compared to E2F alone in the continuous method. E47 and SEF-1 are associated with the “immune response” pathways (Figure S17). In addition to these, we also captured the role of STAT3 transcription factor in cell cycle regulation. The signal transducers and activators of transcription (STATs) are a family of proteins with key roles in cellular differentiation and proliferation. Among these, STAT3 is known to be involved in coordinating G1-S transition (Fukada et al., 1998). In the pathway-regulatory interaction map (Figure S17), up-regulation in the “protein folding” genes is associated with HSF (Heat Shock Factor). In higher eukaryotes, HSF binds the heat shock element (HSE) located in the promoters of heat shock genes to activate their transcription as part of the unfolded protein response (Amin et al., 1988). In addition to its apparent role in protein folding, HSF is also believed to participate in general immune response (Singh and Aballay, 2006) potentially explaining its association with this pathway in our analysis (Figure S17). Besides the abovementioned transcription factors, we should also highlight the role of SRF in regulating the “cell junction” pathway. Assembly and disassembly of cell-cell junctions comprise a number of key events during physiological and pathological processes. The role of serum receptor

Page 10: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

10

factor (SRF) as a potential regulator of this process has been previously established by in vivo observation of the deficiencies in mutant backgrounds (Busche et al., 2008). As highlighted in our results (Figure S17), deregulation in the activity of this transcription factor may be crucial for tumor metastasis (i.e. benign to malignant transition). A substantial fraction of cis-regulatory elements discovered in this study correspond to the binding sites of known transcription factors. The expression of these transcription factors can be viewed in the context of this dataset or other independent whole-genome datasets. This would enable us to compare the presence or absence of a given transcription factor to the expression of its putative downstream genes. Such analysis can act as a validation step for both the discovered motifs and the functions associated with them through the “pathway-regulatory interaction map”. For example, here, we have focused on two transcription factors from the bladder carcinoma dataset: Elk1 and TFDP1 (a protein that binds E2F and promotes its DNA binding affinity; Crosby et al., 2007). Elk1, which was associated with mitosis, RNA splicing and ribosome biogenesis in our study (Figure 2C, main text), shows a significant anti-correlation with these genes, suggesting that it functions as a repressor (Figure S2). Similarly, TFDP1, which is a partner in E2F complexes, is significantly correlated with the expression of the genes in the E2F associated functions, namely DNA replication, microtubule biogenesis and mitotic cell cycle (Figure S3). We have also included the same analysis for AhR and its predicted association with Ub-dependent protein catabolic process (Figure S4). In addition to GO pathways, we also used the MSigDB c2 gene-sets to re-analyze the bladder cancer dataset in its continuous format (as presented in the main text). The iPAGE results are presented in Figure S18. Up-regulation of many cancer-related genes and cell-cycle pathways along with a down-regulation of immune response pathways (e.g. cytokine pathway, IL and TNF mediated inflammatory responses) match the results presented based on the GO pathways in the main text. We have also included the pathway-regulatory interaction map obtained from these gene-sets (Figure S19). Comparative analysis of cancer sub-types (BL vs. DLBCL) Due to space limitations, for this analysis, we only included a limited number of deregulated pathways in the figures of the main paper. Here, however, we have presented the iPAGE and FIRE outputs with the complete list of the deregulated pathways and informative sequence motifs (Figure S5A and B). Some of the matrices presented here and in the next section are too large to be legibly fit in a page; however, the figures are vector graphics and can be enlarged when viewed electronically. Similar to the Elk1 and TFDP1 analysis for the bladder carcinoma dataset, in Figure S6, we have included the expression of NF-Y (as the average expression of NFYB and NFYC genes) and SP1 in comparison with the expression of their associated downstream pathways. NF-Y, associated with cell cycle and chromatin biogenesis, is highly correlated with the genes in these pathways (p-value < 1e-30). Sp1 shows a similar correlation with cell cycle and secretory pathways (p-value < 1e-12). Building a regulatory map of cancer deregulation Next, we studied regulatory perturbations across many cancer types to capture both globally deregulated and more cancer-specific pathways. We compiled a compendium of 46 cancer versus normal gene expression microarray datasets (see Table S2). We then processed the samples and used iPAGE to build a cancer pathway map (Figure S7 and Figure 5 in the main text). This systematic iPAGE analysis of cancer datasets allowed us to compare the corresponding cancers based on their pathway-level perturbations. In order to do so, we used hierarchical clustering to cluster the cancers based on their informative pathways (listed in Figure 5, main text). The clustering results and the correlation matrix are shown in Figure S20. The hierarchical clustering tree in Figure S20 shows that independent cancer datasets of similar types are also similar in terms of their informative (and

Page 11: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

11

therefore deregulated) pathways. For example, in this figure, all non-small cell lung datasets are grouped together in one cluster (Beer et al., 2002; Bhattacharjee et al., 2001; Stearman et al., 2005) distinguishing them from the single small cell carcinoma (SMCL) dataset (Bhattacharjee et al., 2001). Among all the cancers, two samples in particular show a highly negative correlation with others: a melanoma dataset (Hoek et al., 2006) and a chronic lymphocytic leukemia (CML) dataset (Alizadeh et al., 2000). This anti-correlation largely results from a lower expression of cell cycle related genes in both of these tumors (Figure 5, main text). As Alizadeh et al. (2000) noted, CML is a slowly progressing disease with a low proliferation rate. Similarly, despite being highly metastatic, a cohort of melanoma samples in this dataset has been shown to have a low proliferation rate (Hoek et al., 2006) thus explaining the lower expression of mitotic genes in this cancer. We also used FIRE on our compendium to build a cancer regulatory map. In essence, sequence motifs whose associated genes show significant deregulation in the tumor samples are identified and compiled to form this regulatory map (Figure S8). Apart from their independent occurrences in multiple datasets, most of these motifs also have high network-level conservation scores (Figure S9). Figure S9 also includes the cis-regulatory elements identified in the bladder carcinoma and lymphoma datasets. Subsequently, using an information-theoretical approach, we associated the discovered cis-regulatory elements with the deregulated pathways to build a cancer pathway-regulatory interaction map (Figure S10). In Table S3, we list a number of novel and significant associations from this map, representing an unknown regulatory protein (or miRNA) potentially regulating the associated pathway through recognition of the corresponding sequence motif. In some cases, we have predicted novel associations for known transcription factors (or miRNAs). In Figure S11, we have used the NCI-60 gene expression panel to test some of these novel associations. For example, we have predicted Lmo2-complex as a direct regulator of PI3K signaling and cell migration. The members of these pathways that harbor an Lmo2 binding site are highly correlated with the expression of this transcription factor (Figure S11). We observed similar correlations between TCF3 (E47 complex) and the inflammatory response pathway and miR-203 and protein degradation pathway (Figure S11). Here, we have also included the cancer pathway map built from MSigDB c2 gene-sets in Figure S21. The procedure is the same as above, with the exception that we have used MDigDB c2 gene-sets instead of GO terms (biological processes). The discovered motifs are informative of gene expression patterns across independent datasets We used the human gene-expression atlas (Su et al., 2004) and NCI-60 gene expression panel (Ross et al., 2000) to further test the functional relevance of our discovered elements. We clustered the genes in these two datasets into 70 clusters using the k-means approach, and then used FIRE in non-discovery mode (a mode in which the motif discovery is skipped and all the steps are performed on a set of input motifs) to test whether our 104 identified motifs show a significant non-random pattern across these datasets. As shown in Figure S22 and Figure S23, 74 of these motifs have significant mutual information values with the expression clusters further strengthening the evidence for the role of these elements in regulating gene expression. FIRE analysis of experimentally tested associations As shown in Figure S24, we used FIRE to also analyze our gene expression profiles from both decoy vs. scrambled dataset and TF knock-down datasets. In the AAAA[ATG]TT dataset, in addition to ATA[AT][GT][CT]T[AT] (which resembles the reverse complement of AAAA[ATG]TT), we also discovered Elk4 and another novel motif (Figure S24B). The observed deregulation in the Elk4 downstream genes explains the up-regulation in the cell cycle genes, as this TF is a known modulator

Page 12: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

12

of mitosis. Similarly, we observed an up-regulation in the genes harboring the CCAAT motif in the NFYA knock-down dataset (Figure S24C).

Page 13: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

13

Legends Figure S1. Measuring co-regulation in the downstream genes of novel motifs. (A) The regulatory interaction matrix for the bladder carcinoma cis-regulatory elements. (B) 1000 gene pairs are selected from bladder carcinoma dataset (Dyrskjot et al., 2004) and the distribution of their Pearson correlation coefficients is plotted. The background set includes all the genes in the dataset; whereas ‘Elk1’ is limited to the genes harboring Elk1 motifs in their upstream sequences. Similarly, TAGATGT plot represents the genes harboring [ACU]U[ACU]G[ACG]UGU (a novel 3’ UTR element). We have also included this distribution for the simultaneous occurrence of these two motifs. Similarly, the distribution of R-values are shown for Elk1 and ACGTCGG (a upstream element [ACG]ACGT[CT][CGT][AG][CGT]) and their simultaneous presence. Figure S2. Elk1 expression across the bladder carcinoma samples and its comparison with target genes within the associated pathways. The expression of each pathway is calculated as the average normalized expression of the genes listed. A regression test is then used to calculate the correlation coefficients and their associated p-values. Figure S3. TFDP1 (E2F associated protein) expression and its comparison with E2F associated pathways across the bladder carcinoma samples. Figure S4. The correlation of transcription factor AhR with Ub-dependent protein catabolic process. Figure S5. The complete iPAGE (A) and FIRE (B) outputs for the BL vs DLBCL dataset. While Figure 2A and B contain a summarized version of these results, here we have included the complete outputs. Figure S6. The normalized expression of NF-Y and Sp1 across the BL vs. DLBCL samples along with the expression of their target genes. Figure S7. The complete cancer pathway map without redundancy removal. Figure S8. Cancer regulatory map. The level of significance by which the genes harboring a given putative cis-regulatory element are up or down regulated is depicted here. This matrix is formatted to include only the known motifs and those that are significantly associated with more than 3 cancers. Figure S9. Network-level conservation scores. This figure shows our discovered motifs and their network-level conservation scores with respect to the chicken genome (Elemento and Tavazoie, 2005). Values range from 0 to 1, with 1 being most conserved. Figure S10. The complete cancer pathway-regulatory interaction map. Figure S11. The expression level of Lmo2, TCF3 and miR-203 modules across the NCI-60 panel. Figure S12. Conceptual schematic of our computational framework. We start from a gene expression dataset and use iPAGE and FIRE to discover the deregulated pathways and cis-regulatory elements that are informative of gene expression patterns. We then use pathway-regulatory interaction map (PRM) analysis to functionally annotate the identified motifs through associating them with their potential downstream pathways. Figure S13. iPAGE schematic. Two exemplary expression profiles are shown: discrete (e.g. cluster indices from co-expression clustering) and continuous (e.g. log of fold change in expression level in the tumor sample compared to normal). Mutual information is then used to assess the level by which given pathway profiles are informative of these expression profiles. Figure S14. Pathway-Regulatory Interaction Map Generator. The discovered informative pathways (from iPAGE) and putative cis-regulatory elements (from FIRE) are combined in this approach to associate regulatory proteins with their target pathways.

Page 14: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

14

Figure S15. Bladder carcinoma versus normal samples (discrete mode). The informative pathways discovered by iPAGE that are deregulated in tumor samples compared to normal controls. The top panel shows the normalized average expression of each cluster in Bladder carcinoma and normal samples. Figure S16. Discovering putative cis-regulatory elements driving bladder cancer deregulation. We used FIRE to identify putative DNA/RNA motifs whose downstream/upstream genes show significant deregulation in the tumor samples compared to normal controls. The top panel shows the normalized average expression of each cluster in Bladder carcinoma and normal samples. Figure S17. The pathway-regulatory interaction map of bladder cancer. The discovered cis-regulatory elements (Figure S16) are positively or negatively associated with the significantly informative pathways (Figure S15). Figure S18. iPAGE output for the bladder cancer dataset (continuous) using MSigDB c2 gene-sets. Figure S19. The pathway-regulatory interaction map of bladder cancer using the MSigDB gene sets. The gene-sets discovered in Figure S18 are associated with cis-regulatory elements in Figure 2A (main text). Figure S20. Correlation matrix calculated from our cancer pathway map. All the cancer samples are clustered based on their deregulations across different pathways. Each element in this matrix represents a pair-wise correlation value. On the right, the tree representing the hierarchical clustering is presented. Figure S21. The complete cancer pathway map for MSigDB c2 gene-sets (without redundancy removal). Figure S22. The discovered cis-regulatory elements in the cancer regulatory map are informative of gene expression clusters in the NCI-60 dataset. Figure S23. The discovered cis-regulatory elements are informative of gene expression clusters in the human gene expression atlas. Figure S24. FIRE analysis of experimentally tested associations. (A) The motifs that are most informative of the decoy AAAA[ATG]TT vs scrambled microarray experiment. (B) Knocking down Elk1 results in the upregulation of genes harboring Sp1, Elk1 and MEF-2 binding sites. (C) Knocking down NFYA results in upregulation of genes harboring the NF-Y binding site. Table S1. The number of pathways discovered by iPAGE in 3 datasets studied in the main text in comparison with the number of pathways obtained from the same but randomly shuffled datasets. Table S2. The list, tissue and references of the cancer gene expression studies used to compile our initial dataset. Table S3. A list of predictions based on the associations in the Cancer Pathway-Regulatory Interaction Map.

Page 15: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

15

Supplemental Figures

-0.75 -0.55 -0.35 -0.15 0.05 0.25 0.45 0.65 0.85

050

100

150

R

Frequency

-0.75 -0.55 -0.35 -0.15 0.05 0.25 0.45 0.65 0.85

BackgroundElk1 (p < 1e-14)TAGATGT (p < 1e-15)Both (p < 1e-15)

050

100

150

R

(B)

012 bits

12C 3G 4CA 5TC

6A 7A 8TA 9

GCA

5'

012 bits

12

GC

3C 4G 5G 6TA

7A 8G 9

5'

012 bits

1

TGC

2C 3TA 4C 5G 6C 7TGA

8A 9

5'

012 bits

1

UCA

2U 3UCA 4G 5GCA 6U 7G 8U 9

3'UT

R

012 bits

12C 3G 4T 5TCA

6

GA

7

CA

8A 9

5'

012 bits

1

GCA

2A 3C 4G 5T

6TC

7

TGC

8GA

9

TGC

5'

012 bits

1

GCA

2A 3GA 4TA

5A 6C 7G 8CA

9

GCA

5'

012 bits

12C 3G 4TC 5A 6G 7T 8TC

9

5'

012 bits

1

GCA

2C 3G 4TA 5

GC

6A 7C 8G 9

5'

012 bits

12C 3G 4C 5G

6T 7T 8A 9

5'

012 bits

12C 3A 4T 5C

6TA

7GC

8A 9

5'

012 bits

1

TCA

2

CA

3T 4C 5T 6GC

7C 8A 9TGC5'

012 bits

1

TCA

2TC

3C 4C 5C 6C

7C 8A 9

5'

012 bits

12A 3T 4G 5A

6

GC

7A 8G 9TCA5'

012 bits

12A 3T 4TA 5T 6G 7C 8A

9

5'

012 bits

12A 3T 4C 5T

6

GA

7T 8G 9

5'

0

1

2

bits

1 2

C

3

G

4

CA

5

TC

6

A

7

A

8

TA

9

GCA

5'

0

1

2

bits

1 2

GC

3

C

4

G

5

G

6

TA

7

A

8

G

9

5'

0

1

2

bits

1

TGC

2

C

3

TA

4

C

5

G

6

C

7

TGA

8

A

9

5'

0

1

2

bits

1

UCA

2

U

3

UCA

4

G

5

GCA

6

U

7

G

8

U

9

3'UTR

0

1

2

bits

1 2

C

3

G

4

T

5

TCA

6

GA

7

CA

8

A

9

5'

0

1

2

bits

1

GCA

2

A

3

C

4

G

5

T

6

TC

7

TGC

8

GA

9

TGC

5'

0

1

2

bits

1

GCA

2

A

3

GA

4

TA

5

A

6

C

7

G

8

CA

9

GCA

5'

0

1

2

bits

1 2

C

3

G

4

TC

5

A

6

G

7

T

8

TC

9

5'

0

1

2

bits

1

GCA

2

C

3

G

4

TA

5

GC

6

A

7

C

8

G

9

5'

0

1

2

bits

1 2

C

3

G

4

C

5

G

6

T

7

T

8

A

9

5'

0

1

2

bits

1 2

C

3

A

4

T

5

C

6

TA

7

GC

8

A

9

5'

0

1

2

bits

1

TCA

2

CA

3

T

4

C

5

T

6

GC

7

C

8

A

9

TGC

5'

0

1

2

bits

1

TCA

2

TC

3

C

4

C

5

C

6

C

7

C

8

A

9

5'

0

1

2

bits

1 2

A

3

T

4

G

5

A

6

GC

7

A

8

G

9

TCA

5'

0

1

2

bits

1 2

A3

T4

TA

5

T6

G7

C8

A9

5'

0

1

2

bits

1 2

A3

T4

C5

T6

GA

7T

8G

9

5'

0.01(Pos)

0.0

(Neg)0.01

0.0

(A)

BackgroundElk1 (p < 1e-14)ACGTCGG (p < 1e-15)Both (p < 1e-15)

E2F

Elk-1

AhR

-

O2

-

-

-

-

-

Tal-1b_E47

-

-

Skn-1

Oct-1

SEF-1

E2F

Elk-

1

AhR

- O2 - - - - - Tal-

1b_E

47

- - Skn-

1

Oct-

1

SEF-

1

Figure S1

Page 16: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

16

Elk1

CA CA CA CA CA CA Normal

CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA Normal

CA Normal

CA CA CA CA CA CA CA Normal

Normal

Normal

Normal

Normal

Normal

Mitotic cell cycle (GO:0000278) R = -0.747 (p = 7.389e-11 )

NM_000075NM_000430NM_001005NM_001008

NM_001008895NM_001015885

NM_001256NM_001376NM_001787NM_002358NM_003318NM_003590NM_003591NM_004219NM_004354NM_004359NM_004701NM_004856NM_005192NM_005862NM_006185NM_006221NM_006461NM_006806NM_007234NM_007317NM_013366NM_014865NM_015113NM_015282NM_016237NM_018131NM_019063NM_022346NM_022662NM_024039NM_031966

RNA splicing (GO:0008380) R = -0.860 (p = 1.426e-18 )

NM_001003NM_001014972

NM_001017NM_001031684

NM_002370NM_002486NM_002715NM_003090NM_003092NM_003094NM_003594NM_004175NM_004593NM_004596NM_004597NM_004697NM_004941NM_005089NM_005105NM_005146NM_005626NM_005839NM_005850NM_006347NM_006372NM_006469NM_006711NM_006842NM_006857NM_006924NM_006938NM_007362NM_012322NM_012469NM_014003NM_014829NM_016200NM_016312NM_021145NM_022157NM_031287

Ribosome biogenesis (GO:0042254) R = -0.860 (p = 1.527e-18 )

NM_001001NM_001003

NM_001024226NM_001659NM_001660NM_004477NM_004663NM_006331NM_006627NM_006824NM_007043NM_012341NM_013285NM_013393NM_014953NM_015169NM_016037NM_016101NM_016312NM_017647NM_018321NM_018428NM_018648NM_018983NM_025065NM_033416

1

-1

Figure S2

Page 17: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

17

TFDP1

Normal

Normal

Normal

Normal

Normal

Normal

CA CA CA CA CA CA CA CA CA CA Normal

CA Normal

Normal

CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA

DNA replication (GO:0006260) R = 0.757 (p = 2.279e-11 )

NM_000251NM_000534NM_001034NM_001254NM_002129NM_002592NM_002692NM_002915NM_002946NM_003504NM_004537NM_005157NM_005914NM_005915NM_007195NM_016937NM_021173NM_030755NM_030928

Microtubule biogenesis (GO::0000226) R = 0.717 (p = 1.310e-09 )

NM_000381NM_001070NM_001376NM_003981NM_005445NM_005886NM_006101NM_015282NM_016359

Mitotic cell cycle (GO:0000278) R = 0.677 (p = 3.354e-08 )

NM_001070NM_001237NM_001254NM_001376NM_001786NM_002497NM_003591NM_003592NM_003981NM_004354NM_005157NM_005192NM_005445NM_005862NM_005886NM_006101NM_006221NM_006400NM_006403NM_007074NM_014865NM_014881NM_015282NM_016359NM_018492NM_022346NM_024039NM_031966

1

-1

Figure S3

Page 18: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

18

AhR

Normal

Normal

Normal

Normal

Normal

CA CA CA Normal

Normal

CA CA CA CA CA CA CA Normal

CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA Normal

CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA CA

protein catabolic process (Neg) (GO:0006511) R = -0.474 (p = 8.289e-04 )

NM_002793NM_003470NM_004205NM_006677NM_020903

protein catabolic process (Pos) (GO:0006511) R = 0.580 (p = 1.123e-05 )

NM_001001NM_002787NM_002794NM_003341NM_003363NM_005067NM_005154NM_006590NM_007051NM_015902NM_016041

1

-1

Figure S4

Page 19: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

19

BLDLBCL

2

-297 12 31 8 2 24 39 41 27 13 57 18 10

560 58 63 99 72 81 32 10

777 71 10

276 15 7 10 23 10

335 38 10

479 61 33 26 48 42 14 10

140 44 5 73 29 95 62 55 75 0 91 25 45 30 47 82 80 56 53 34 84 43 85 10

666 93 78 28 96 3 4 86 22 92 69 37 89 65 50 67 10

016 9 88 10

864 87 94 20 11 54 52 98 1 46 90 74 68 21 10

949 36 70 59 83 19 17 6 51

mitotic cell cycle, GO:0000278microtubule-based movement, GO:0007018mRNA metabolic process, GO:0016071nucleoplasm, GO:0005654protein folding, GO:0006457ribosome biogenesis and assembly, GO:0042254amino acid and derivative metabolic process, GO:0006519helicase activity, GO:0004386ribosome, GO:0005840translation regulator activity, GO:0045182electron carrier activity, GO:0009055ubiquitin-dependent protein catabolic process, GO:0006511envelope, GO:0031975SAM-dependent methyltransferase activity, GO:0008757tRNA processing, GO:0008033response to DNA damage stimulus, GO:0006974reproduction, GO:0000003tissue development, GO:0009888cell-cell adhesion, GO:0016337muscle contraction, GO:0006936visual perception, GO:0007601cofactor metabolic process, GO:0051186carbohydrate binding, GO:0030246serine-type peptidase activity, GO:0008236endopeptidase inhibitor activity, GO:0004866innate immune response, GO:0045087heme binding, GO:0020037steroid metabolic process, GO:0008202lipid catabolic process, GO:0016042chromosomal part, GO:0044427transmembrane receptor protein kinase activity, GO:0019199transferase activity, transferring methyl groups, GO:0016765cell migration, GO:0016477anion transporter activity, GO:0008509endoplasmic reticulum part, GO:0044432actin filament-based process, GO:0030029transmission of nerve impulse, GO:0019226proteinaceous extracellular matrix, GO:0005578alkali metal ion binding, GO:0031420secretion, GO:0046903inflammatory response, GO:0006954copper ion binding, GO:0005507lymphocyte activation, GO:0046649GTPase regulator activity, GO:0030695SH3/SH2 adaptor activity, GO:0005070phospholipid binding, GO:0005543hematopoietin/IFN-class cytokine receptor activity, GO:0004896UDP-glycosyltransferase activity, GO:0008194antigen processing and presentation, GO:0019882vacuole, GO:0005773protein kinase cascade, GO:0007243

5

Over-representation

-5

Under-representation

(A)

Figure S5A

Page 20: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

20

BLDLBCL2

-2

C41

C58

C72

C107

C102

C15

C23

C35

C38

C79

C33

C26

C40

C29

C95

C62

C75

C25

C47

C82

C80

C56

C53

C34

C84

C43

C106

C93

C78

C28

C86

C92

C69

C37

C65

C94

C52

C98

C1 C46

C90

C74

C68

C109

C49

C70

C59

C83

C17

C6 optimized motif

location

MI (bits)

z-score

robustness

position bias

orientation bias

conservation index

seed

motif name

0

1

2

bits

1 2

C

3

G

4

TCA

5

T

6

TA

7

G

8

TA

9

TCA

5’ 0.018 12.9 10/10 - - 0.33 CGCTTGA -

0

1

2

bits

1 2

T

3

GCA

4

T

5

TA

6

C

7

G

8

C

9

TGC

5’ 0.017 11.5 10/10 - - 0.74 TCTTCGC -

0

1

2

bits

1 2

U

3 4

G

5

U

6

A

7

U

8

G

9

3’UTR 0.015 8.9 10/10 - 0.94 UUGUAUG -

0

1

2

bits

1 2

GA

3

A

4

A

5

A

6

C

7

G

8

C

9

5’ 0.015 8.7 9/10 Y - 0.53 GAAACGC -

0

1

2

bits

1 2

T

3

T

4

A

5

TGA

6

C

7

G

8

GCA

9

5’ 0.014 7.8 9/10 - - 0.87 TTAGCGC -

0

1

2

bits

1

TGA

2

A

3

C

4

TG

5

A

6

C

7

TGA

8

CA

9

TGC

5’ 0.022 16.8 10/10 - - 0.89 ACTACAA -

0

1

2

bits

1

GA

2

C

3

GC

4

A

5

A

6

TA

7

C

8

GA

9

5’ 0.018 13.1 10/10 Y - 1.00 CCAATCA NF-Y

0

1

2

bits

1

TGA

2

TA

3

C

4

G

5

GC

6

C

7

TA

8

A

9

5’ 0.016 10.6 10/10 - - 0.74 ACGCCTA -

0

1

2

bits

1 2

A

3

T

4

A

5

A

6

T

7

C

8

TGC

9

CA 5’ 0.013 7.1 6/10 - - 0.61 ATAATCC Bcd

0

1

2

bits

1

TCA

2

A

3

A

4

C

5

TA

6

TC

7

C

8

G

9

5’ 0.013 7.2 7/10 - 0.80 AACTCCG -

0

1

2

bits

1 2

T

3

TA

4

T

5

CA

6

C

7

G

8

CA

9

5’ 0.021 15.9 10/10 - - 0.88 TTTCCGC -

0

1

2

bits

1

TGA

2

TA

3

C

4

G

5

T

6

GA

7

TG

8

TCA

9

GCA

5’ 0.018 13.2 10/10 - - 0.70 ACGTGGC XBP-1

0

1

2

bits

1 2

A

3

A

4

U

5

A

6

UG

7

G

8

U

9

3’UTR 0.018 12.1 10/10 - 0.94 AAUAUGU -

0

1

2

bits

1

UCA

2

CA

3

A

4

G

5

U

6

G

7

UA

8

A

9

UCA

3’UTR 0.015 9.1 10/10 - 0.92 AAGUGUA -

0

1

2

bits

1

TGA

2

TCA

3

TA

4

T

5

G

6

C

7

G

8

C

9

5’ 0.024 19.7 10/10 Y - 0.99 CATGCGC -

0

1

2

bits

1

TGA

2

A

3

C

4

G

5

TG

6

A

7

CA

8

GC

9

5’ 0.018 13.0 10/10 - - 0.14 ACGGAAG -

0

1

2

bits

1

GCA

2

G

3

GC

4

T

5

A

6

C

7

G

8

C

9

5’ 0.014 7.2 6/10 - 0.45 GCTACGC -

0

1

2

bits

1

TGA

2

GCA

3

TGA

4

C

5

GC

6

G

7

T

8

A

9

5’ 0.022 17.0 10/10 - - 0.72 CGCCGTA -

0

1

2

bits

1 2

TA

3

TG

4

C

5

G

6

T

7

C

8

G

9

TCA

5’ 0.014 8.1 8/10 - 0.09 AGCGTCG -

0

1

2

bits

1

T

2 3

TGA

4

TC

5

TG

6

C

7

G

8

A

9

GCA

5’ 0.024 19.7 10/10 - - 0.84 CTCGCGA -

0

1

2

bits

1

UGA

2

G

3

U

4

U

5

UG

6

UG

7

A

8

UC

9

3’UTR 0.022 17.2 10/10 - 0.99 GUUUUAU -

0

1

2

bits

1 2

C

3

G

4

TC

5

A

6

A

7

TC

8

G

9

5’ 0.015 9.0 9/10 - 0.83 CGCAACG -

0

1

2

bits

1

TCA

2

A

3

C

4

G

5

C

6

TG

7

TA

8

C

9

5’ 0.015 8.8 9/10 - 0.64 ACGCTTC Hairy

0

1

2

bits

1 2

A

3

GC

4

T

5

A

6

A

7

C

8

T

9

TA 5’ 0.014 7.6 9/10 - - 0.18 AGTAACT MYB.Ph3

0

1

2

bits

1 2

C

3

C

4

C

5

G

6

C

7

C

8

C

9

5’ 0.038 35.4 10/10 - - 1.00 CCCGCCC Sp1

0

1

2

bits

1

GC

2

U

3

GC

4

G

5

A

6 7

C

8

CA

9

3’UTR 0.018 12.6 10/10 Y 0.92 UGGAGCC -

0

1

2

bits

1 2

A

3

CA

4

TC

5

C

6

G

7

T

8

CA

9

GC 5’ 0.015 9.5 8/10 - - 0.29 ACCCGTA -

0

1

2

bits

1

TGC

2

CA

3

G

4

A

5

TG

6

G

7

CA

8

T

9

G 5’ 0.014 8.8 9/10 - - 0.49 AGAGGCT -

0

1

2

bits

1

UGC

2

GC

3

GC

4

A

5

C

6

UGC

7

UG

8

U

9

C 3’UTR 0.014 7.8 6/10 - 0.89 CCACCUU miR-301

0

1

2

bits

1

GCA

2

C

3

C

4

U

5

C

6

UG

7

G

8

U

9

UGC

3’UTR 0.014 7.6 7/10 - 0.60 CCUCUGU -

0

1

2

bits

1

UGC

2

UA

3

C

4

U

5

GA

6

C

7

C

8

U

9

GCA

3’UTR 0.014 7.4 7/10 - 0.92 UCUGCCU hsa-let-7d

0

1

2

bits

1

TGA

2

CA

3

A

4

G

5

TA

6

TA

7

C

8

G

9

5’ 0.016 9.9 9/10 - - 0.87 AAGAACG -

0

1

2

bits

1 2

C

3

G

4

A

5

A

6

A

7

A

8 9

5’ 0.016 10.4 10/10 - 0.65 CGAAAAA E2F

0

1

2

bits

1 2

GCA

3

GA

4

A

5

C

6

G

7

T

8

T

9

5’ 0.015 9.5 10/10 - - 0.85 AAACGTT -

0

1

2

bits

1

UCA

2

UGC

3

UG

4

A

5

U

6

U

7

UG

8

G

9

UCA

3’UTR 0.022 17.1 10/10 - 0.93 UUAUUUG -

0

1

2

bits

1 2

A

3

U

4

G

5

A

6

UA

7

G

8

UGA

9

UA 3’UTR 0.018 12.3 10/10 - 0.99 AUGAAGA -

0

1

2

bits

1 2

C

3

C

4

G

5

G

6

A

7

A

8

GA

9

5’ 0.055 53.6 10/10 Y - 1.00 CCGGAAG ELK4

0

1

2

bits

1 2

A

3

T

4

GA

5

TGA

6

C

7

G

8

TGA

9

TCA

5’ 0.026 22.2 10/10 - - 1.00 ATGGCGG -

0

1

2

bits

1

GCA

2

TG

3

C

4

A

5

C

6

G

7

A

8 9

GA 5’ 0.014 7.7 8/10 - 0.80 GCACGAC -

0

1

2

bits

1 2

T

3

A

4

C

5

GC

6

C

7

G

8

A

9

5’ 0.013 6.5 7/10 - 0.70 TACCCGA -

0

1

2

bits

1

UGA

2

G

3

U

4

A

5

UG

6

A

7

UG

8

U

9

3’UTR 0.018 12.1 10/10 - 0.99 GUAUAUU -

0

1

2

bits

1

UA

2

C

3

UGA

4

UG

5

A

6

UC

7

U

8

G

9

3’UTR 0.015 9.7 10/10 - 0.91 CUGACUG -

0

1

2

bits

1 2

A

3

GA

4

U

5

A

6

C

7

U

8

A

9

UCA

3’UTR 0.013 6.9 7/10 - 0.83 AAUACUA -

0

1

2

bits

1

TGA

2

CA

3

G

4

TCA

5

TA

6

C

7

T

8

A

9

5’ 0.015 8.8 9/10 - - 0.94 AGAACTA -

0

1

2

bits

1 2

C

3

G

4

TCA

5

TA

6

G

7

T

8

A

9

5’ 0.014 8.2 8/10 - - 0.26 CGCAGTA -

0

1

2

bits

1

GC

2

GC

3

U

4

GCA

5

U

6

U

7

C

8

GC

9

3’UTR 0.015 9.1 9/10 - 0.79 CUGUUCC -

0

1

2

bits

1

GCA

2

C

3

GC

4

G

5

TCA

6

TC

7

T

8

A

9

TCA

5’ 0.018 12.6 10/10 - - 0.95 CGGCCTA -

0

1

2

bits

1

TGC

2

C

3

G

4

T

5

T

6

GA

7

C

8

TC

9

5’ 0.016 10.6 10/10 - - 0.95 CGTTGCC -

0

1

2

bits

1 2

A

3

TC

4

TC

5

T

6

C

7

A

8

C

9

TA 5’ 0.016 10.6 10/10 - - 0.26 ATTTCAC -

0

1

2

bits

1

TA

2

TA

3

A

4

G

5

T

6

T

7

G

8

TG

9

TCA

5’ 0.016 9.9 10/10 - - 0.40 AAGTTGT -

0

1

2

bits

1 2

A

3

A

4

T

5

T

6

G

7

A

8

TG

9

5’ 0.015 9.1 8/10 - - 0.93 AATTGAT -

0

1

2

bits

1 2

T3

T

4

A

5

T

6

G

7

C

8

CA

9

TGA

5’ 0.014 8.4 9/10 - 0.85 TTATGCA -

0

1

2bits

1

TCA

2T

3C

4TA

5

A6

T7

G8

A9

5’ 0.014 7.5 8/10 - - 0.41 TCAATGA -

0

1

2

bits

1 2

C3

U4

C5

UA

6

C7

C8

U9

GCA

3’UTR 0.016 10.4 9/10 - 0.70 CUCUCCU -

0

1

2

bits

1

GCA

2

A

3

G

4

A

5

GA

6

TA

7

C

8

A

9

TG 5’ 0.016 10.7 10/10 - - 0.60 AGAATCA -

0

1

2

bits

1 2

T

3

T

4

A

5

T

6

A

7

C

8

GCA

9

5’ 0.016 10.3 10/10 - 0.98 TTATACA TATA

0

1

2

bits

1 2

T

3

G

4

TCA

5

C

6

CA

7

T

8

GC

9

A 5’ 0.016 10.4 10/10 - 0.93 TGACATC v-ErbA

0

1

2

bits

1 2

T

3

G

4

A

5 6

T

7

C

8

A

9

5’ 0.016 10.0 9/10 Y - 0.40 TGACTCA AP-1

0

1

2

bits

1

TGC

2

A

3

G

4

TA

5

T

6

T

7

C

8

A

9

5’ 0.013 7.6 7/10 - - 0.56 AGTTTCA -

0

1

2

bits

1

GCA

2

C

3

TA

4

T

5

C

6

C

7

A

8

TG

9

TGC

5’ 0.013 7.5 7/10 - - 0.24 CATCCAG -

0

1

2

bits

1

UA

2

A

3

U

4 5

G

6

GC

7

GA

8

U

9

3’UTR 0.018 12.0 10/10 - 0.96 AUUGCAU -

0

1

2

bits

1 2

T

3

A

4

GC

5

TCA

6

G

7

T

8

A

9

TGA

5’ 0.015 9.1 10/10 - - 0.90 TACAGTA -

0

1

2

bits

1

UCA

2

U

3

UA

4

UGA

5

U

6

G

7

A

8

G

9

UA 3’UTR 0.015 8.8 9/10 - 0.62 UUUUGAG -

0

1

2

bits

1 2

A

3

C

4

T

5

T

6

A

7

A

8

TC

9

5’ 0.013 7.3 6/10 - - 0.84 ACTTAAT Ftz

0

1

2

bits

1 2

TA

3

A

4

G

5

G

6

C

7

A

8

A

9

TGA

5’ 0.013 6.4 8/10 - 0.39 AAGGCAA -

0

1

2

bits

1 2

A

3

U

4

G

5

G

6

UA

7

U

8

UCA

9

UC 3’UTR 0.016 9.9 10/10 - 0.86 AUGGUUU -

0

1

2

bits

1

TCA

2

A

3

A

4

T

5

C

6

TA

7

G

8

T

9

5’ 0.015 9.6 10/10 - - 0.92 AATCAGT Gfi

0

1

2

bits

1 2

CA

3

TGA

4

C

5

T

6

T

7

G

8

A

9

TA 5’ 0.015 9.7 9/10 - 0.94 AACTTGA -

0

1

2

bits

1 2

C

3

A

4

G

5

G

6

CA

7

A

8

A

9

TGA

5’ 0.013 7.5 9/10 - - 0.43 CAGGAAA -

0

1

2

bits

1 2

A

3

G

4

A

5

T

6

A

7

A

8

TGC

9

TG 5’ 0.013 7.1 6/10 - 0.26 AGATAAG GATA-1

0

1

2

bits

1

TA

2

TC

3

TC

4

T

5

A

6

A

7

GA

8

C

9

TCA

5’ 0.021 16.0 10/10 - - 0.99 TTTAAAC -

0

1

2

bits

1 2

A

3

A

4

G

5

G

6

A

7

A

8

A

9

5’ 0.021 15.6 10/10 - - 0.55 AAGGAAA NF-AT

0

1

2

bits

1

TGA

2

TCA

3

A

4

C

5

T

6

T

7

TCA

8

G

9

TCA

5’ 0.018 12.7 10/10 - - 0.97 AACTTTG -

0

1

2

bits

1

TA

2

A

3

G

4

A

5

TGA

6

A

7

G

8

A

9

5’ 0.017 11.8 10/10 - - 0.90 AGAGAGA -

0

1

2

bits

1 2

T

3

A

4

A

5

C

6

TGA

7

G

8

GC

9

TA 5’ 0.017 10.8 10/10 - 0.46 TAACTGC v-Myb

0

1

2

bits

1 2

T

3

A

4

C

5

TG

6

GC

7

A

8

A

9

5’ 0.016 10.5 10/10 - - 0.73 TACTCAA -

0

1

2

bits

1 2

T

3

G

4

A

5

A

6

A

7

C

8

A

9

5’ 0.015 8.6 9/10 - - 0.88 TGAAACA -

0

1

2

bits

1

GCA

2

A

3

T

4

A

5

T

6 7

G

8

GC

9

5’ 0.014 8.3 9/10 - 0.93 ATATAGC -

0

1

2

bits

1

TCA

2

A

3

T

4

C

5

T

6

TGC

7

C

8

A

9

5’ 0.014 8.0 9/10 - 0.71 ATCTTCA -

0

1

2

bits

1

TA

2

A

3

T

4

TG

5

G

6

T

7

TG

8

C

9

TCA

5’ 0.013 7.4 8/10 - 0.78 ATTGTTC -

0

1

2

bits

1 2

A

3

A

4 5

T

6

A

7

C

8

C

9

GCA

5’ 0.013 6.6 7/10 - 0.74 AAATACC -

0

1

2

bits

1

UGA

2

UGA

3

CA

4

C

5

G

6

C

7

C

8

U

9

GCA

3’UTR 0.016 9.7 10/10 - 0.28 GCCGCCU -

0

1

2

bits

1

GC

2

T

3

C

4

A

5

C

6

TC

7

C

8

TC

9

TGA

5’ 0.013 7.3 8/10 - - 0.74 TCACCCC -

0

1

2

bits

1 2

C

3

C

4

C

5

C

6

T

7

G

8

A

9

5’ 0.013 6.4 7/10 - - 0.16 CCCCTGA STRE

0

1

2

bits

1

GC

2

A

3

G

4

UA

5

C

6

C

7

U

8

GCA

9

3’UTR 0.015 9.5 10/10 - 0.52 AGUCCUG -

0

1

2

bits

1

TGA

2

C

3

TGA

4

A

5

T

6

GA

7

C

8

A

9

5’ 0.014 8.2 7/10 - 0.94 CAATGCA -

0

1

2

bits

1 2

A

3

T

4

A

5

T

6

C

7

A

8

A

9

5’ 0.014 7.2 6/10 - 0.85 ATATCAA -

0

1

2

bits

1 2

GA

3

C

4

G

5

T

6

C

7

A

8

C

9

TGC

5’ 0.018 12.9 10/10 - - 1.00 GCGTCAC CREB

0

1

2

bits

1

CA

2

A

3

A

4

CA

5

GC

6

G

7

C

8

UC

9

C 3’UTR 0.025 24.5 10/10 Y 0.83 AAAGGCU -

0

1

2

bits

1

A

2

CA

3

A

4

TGC

5

G

6

T

7

A

8

GA

9

TCA

5’ 0.017 11.2 10/10 - - 0.89 AAGGTAA -

0

1

2

bits

1

UGA

2

UA

3

A

4

UA

5

G

6

UA

7

G

8

C

9

UA 3’UTR 0.015 8.9 10/10 Y 0.95 AAAGAGC -

0

1

2

bits

1

TGA

2

GA

3

A

4

A

5

T

6

A

7

G

8

C

9

5’ 0.013 6.8 6/10 - 0.92 AAATAGC MEF-2

0

1

2

bits

1 2

A

3

TC

4

T

5

G

6

C

7

TGA

8

A

9

TGA

5’ 0.017 11.3 10/10 - - 0.60 ATTGCAA -

0

1

2

bits

1 2

C

3

T

4

T

5

T

6

A

7

TA

8

C

9

TGA

5’ 0.015 9.6 10/10 - - 0.96 CTTTAAC Dof2

0

1

2

bits

1

TGA

2

C

3

A

4

A

5

A

6

TGC

7

A

8

G

9

5’ 0.015 9.9 9/10 Y 0.18 CAAAGAG -

0

1

2

bits

1 2

A

3

T

4

T

5

T

6

GA

7

TC

8

C

9

GA 5’ 0.015 9.0 10/10 - 0.95 ATTTGCC -

0

1

2

bits

1

TCA

2

A

3

G

4

TA

5

C

6

A

7

A

8

A

9

5’ 0.014 8.4 8/10 - 0.27 AGACAAA Broad-complex_1

0

1

2

bits

1 2

C

3

A

4

A

5

A

6

G

7

A

8

A

9

5’ 0.014 8.1 7/10 - - 0.82 CAAAGAA STE11

0

1

2

bits

1 2

T

3

C

4

T

5

TA

6

T

7

C

8

A

9

TGC

5’ 0.014 8.1 9/10 - - 0.16 TCTATCA -

0

1

2

bits

1 2

C

3

T

4

A

5

A

6

T

7

TC

8

GCA

9

TC 5’ 0.014 8.1 8/10 - - 0.86 CTAATCA -

0

1

2

bits

1

TCA

2

A

3

T

4

T

5

A

6

GA

7

T

8

C

9

TCA

5’ 0.014 7.6 9/10 - - 0.98 ATTAATC Dfd

0

1

2

bits

1 2

U

3

GC

4

GCA

5

U

6

U

7

C

8

GC

9

UGC

3’UTR 0.016 10.3 10/10 - 0.86 UCCUUCC -

0

1

2

bits

1 2

C

3

G

4

TA

5

C

6

A

7

A

8

CA

9

5’ 0.015 8.5 10/10 Y 0.95 CGACAAC Tax_CREB

0

1

2

bits

1

GCA

2

C

3

C

4

U

5

U

6

C

7

UG

8

UCA

9

C 3’UTR 0.014 7.0 6/10 - 0.37 CCUUCUC -

0

1

2

bits

1 2

A

3

G

4

TGC

5

T

6

C

7

T

8

G

9

TGA

5’ 0.013 7.3 7/10 - - 0.33 AGCTCTG -

0

1

2

bits

1

UGA

2

A

3

UCA

4

U

5

GC

6

G

7

A

8

CA

9

UGA

3’UTR 0.018 12.8 10/10 - 0.91 AAUGGAA -

0

1

2

bits

1 2

C

3

T

4

G

5

T

6

G

7

TC

8

CA

9

C 5’ 0.018 12.5 10/10 - - 0.30 CTGTGCC -

0

1

2

bits

1

GCA

2

C

3

A

4

G

5

G

6

T

7

G

8 9

G 5’ 0.016 10.0 10/10 - - 0.84 CAGGTGA Snail

0

1

2

bits

1 2

U

3

G

4

CA

5

A

6

C

7

C

8

GC

9

GCA

3’UTR 0.014 7.5 8/10 - 0.29 UGCACCC -

0

1

2

bits

1 2

G

3

A

4

GA

5

C

6

C

7

C

8

G

9

3’UTR 0.013 6.3 6/10 - 0.19 GAGCCCG -

0

1

2

bits

1

TCA

2

C

3

A

4

G

5

CA

6

TA

7

C

8

A

9

GC 5’ 0.015 9.0 9/10 - 0.50 CAGCTCA -

0

1

2

bits

1

TCA

2

C

3

A

4

T

5

T

6

A

7

GCA

8

C

9

TC 5’ 0.014 7.4 7/10 - - 0.08 CATTAAC Ubx

0

1

2

bits

1 2

A

3

T

4

A

5

TCA

6

A

7

C

8

C

9

TCA

5’ 0.014 7.9 8/10 - 0.86 ATATACC -

0

1

2

bits

1 2

TG

3

A

4

A

5

A

6

A

7

G

8

C

9

5’ 0.015 9.4 9/10 - 0.24 TAAAAGC Dof2

0

1

2

bits

1 2

T

3

C

4

C

5

C

6

C

7

C

8

A

9

5’ 0.017 11.8 10/10 - 0.87 TCCCCCA ZNF42_1-4

0

1

2

bits

1

TCA

2

C

3

A

4

TG

5

CA

6

T

7

G

8

G

9

GA 5’ 0.015 9.3 10/10 - 0.47 CAGATGG -

0

1

2

bits

1

GCA

2

A

3

T

4

GC

5

A

6

GA

7

C

8

T

9

TGC

5’ 0.014 7.6 8/10 - 0.88 ATGAACT -

0

1

2

bits

1

GCA

2

C

3

A

4

GC

5

A

6

C

7

C

8

C

9

TCA

5’ 0.018 13.1 10/10 - - 0.92 CACACCC -

0

1

2

bits

1 2

A

3

G

4

C

5

C

6

T

7

G

8

G

9

5’ 0.018 13.1 10/10 - - 0.96 AGCCTGG -

0

1

2

bits

1 2

C

3

A

4

GC

5

A

6

C

7

A

8

G

9

GCA

5’ 0.017 11.7 10/10 - - 0.31 CACACAG -

0

1

2

bits

1

GC

2

A

3

G

4

G

5

C

6

CA

7

GC

8

T

9

TG 5’ 0.015 9.8 10/10 - 0.67 AGGCCCT -

0

1

2

bits

1

TGC

2

C

3

TCA

4

GC

5

A

6

T

7

G

8

C

9

CA 5’ 0.015 9.5 10/10 - 0.96 CCCATGC -

0

1

2

bits

1

TCA

2

T

3

GC

4

T

5

G

6

C

7

T

8

C

9

TCA

5’ 0.015 9.1 10/10 - 0.93 TCTGCTC -

0

1

2

bits

1

GC

2

C

3

CA

4

A

5

UA

6

GC

7

C

8

U

9

3’UTR 0.014 8.1 7/10 - 0.79 CCAUCCU -

0

1

2

bits

1 2

T

3

TG

4

CA

5

A

6

C

7

C

8

C

9

TC 5’ 0.013 7.6 6/10 Y 0.22 TTAACCC -

0

1

2

bits

1 2

A

3

A

4

TC

5

G

6

T

7

G

8

CA

9

TCA

5’ 0.014 8.8 8/10 - - 0.50 AATGTGA -

0

1

2

bits

1 2

UA

3

C

4

GC

5

U

6

C

7

GC

8

A

9

GCA

3’UTR 0.014 8.1 9/10 - 0.56 UCCUCCA -

10

Over-representation

-10

Under-representation

(B)

Figure S5B

Page 21: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

21

NF-Y

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

BL DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LBL BL DL

BCL

DLBC

LDL

BCL

DLBC

LBL DL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

BL DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LBL DL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LBL DL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

BL DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

BL DLBC

LBL DL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LBL DL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LBL BL DL

BCL

BL DLBC

LBL DL

BCL

DLBC

LDL

BCL

DLBC

LBL BL BL DL

BCL

DLBC

LBL BL DL

BCL

DLBC

LDL

BCL

BL BL BL BL BL DLBC

LDL

BCL

BL DLBC

LBL DL

BCL

BL DLBC

LDL

BCL

BL DLBC

LDL

BCL

BL BL DLBC

LDL

BCL

BL DLBC

LDL

BCL

BL BL BL BL

Chromatin GO:0000785 R = 0.698 (p = 7.240e-34 )

NM_000489NM_001003681

NM_001013NM_002105NM_002128NM_002129NM_002892NM_003173NM_004242NM_004965NM_005320NM_005342NM_006339NM_012117NM_012330NM_016343NM_017520NM_021018NM_024670NM_032188

Mitotic cell cycle (GO:0000278) R = 0.809 (p = 3.404e-56 )

NM_001013NM_001070NM_001254NM_001761NM_001786NM_001789NM_001790NM_002497NM_003318NM_003590NM_003592NM_003600NM_003981NM_004060NM_004336NM_004354NM_004856NM_005030NM_005192NM_005862NM_005886NM_006185NM_006559NM_006600NM_007019NM_007174NM_007317NM_014865NM_014881NM_016237NM_016343NM_016426NM_018131NM_018492NM_024039

1

-1

Sp1

DLBC

LDL

BCL

DLBC

LDL

BCL

BL DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

BL DLBC

LDL

BCL

DLBC

LBL DL

BCL

DLBC

LDL

BCL

DLBC

LBL DL

BCL

BL DLBC

LDL

BCL

DLBC

LBL DL

BCL

DLBC

LBL DL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LBL DL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

BL BL DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

BL DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

BL DLBC

LBL DL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

BL BL DLBC

LDL

BCL

BL DLBC

LDL

BCL

BL DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LBL DL

BCL

DLBC

LDL

BCL

DLBC

LBL DL

BCL

BL DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

BL DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

BL DLBC

LDL

BCL

BL DLBC

LDL

BCL

DLBC

LBL DL

BCL

BL DLBC

LBL DL

BCL

BL DLBC

LBL DL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

DLBC

LDL

BCL

BL DLBC

LDL

BCL

BL DLBC

LDL

BCL

BL DLBC

LDL

BCL

DLBC

LDL

BCL

BL DLBC

LBL DL

BCL

DLBC

LDL

BCL

DLBC

LBL BL BL DL

BCL

Mitotic cell cycle (GO:0000278) R = 0.459 (p = 5.207e-12 )

NM_000430NM_001237NM_001790NM_001798NM_002497NM_003405NM_004724NM_006185NM_006265NM_006403NM_007174NM_007317NM_012325NM_015113NM_018131NM_019063

Secretory pathway (GO:0045045) R = 0.498 (p = 2.677e-14 )

NM_001024226NM_003139NM_003144NM_004283NM_005570NM_005697NM_006555NM_016322NM_018362

Figure S6

Page 22: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

22

PPC

Pro

stat

e D

hana

seka

ran

et a

lM

E M

elan

oma

Hoe

k et

al

HSC

C H

ead-

Nec

k C

rom

er e

t al

B-C

LL L

euke

mia

Has

linge

r et a

lC

LL L

ymph

oma

Aliz

adeh

et a

lB

PH P

rost

ate

Dha

nase

kara

n et

al

GB

M B

rain

Lia

ng e

t al

HSC

C H

ead-

Nec

k C

hung

et a

lR

CC

C R

enal

Boe

r et a

lC

A B

ladd

er D

yrsk

jot e

t al

GC

T S

emin

oma

Kor

kola

et a

lA

D O

varia

n W

elsh

et a

lG

L B

rain

Ric

kman

et a

lC

A C

olon

Gra

uden

s et

al

AO

Bra

in B

rede

l et a

lG

L B

rain

Bre

del e

t al

OD

Bra

in B

rede

l et a

lO

DG

L B

rain

Sun

et a

lA

C B

rain

Sun

et a

lG

LB B

rain

Sun

et a

lR

CC

C R

enal

Len

burg

et a

lC

A R

enal

Hig

gins

et a

lM

M M

yelo

ma

Zha

n et

al

CO

ID L

ung

Bha

ttach

arje

e et

al

CA

Bre

ast S

orlie

et a

lD

LBC

L Ly

mph

oma

Aliz

adeh

et a

lPD

C P

ancr

eas

Ishi

kaw

a et

al

FL L

ymph

oma

Aliz

adeh

et a

lA

D P

ancr

eas

Logs

don

et a

lIL

C B

reas

t Rad

vany

i et a

lID

C B

reas

t Rad

vany

i et a

lM

PC P

rost

ate

Dha

nase

kara

n et

al

SMC

L Lu

ng B

hatta

char

jee

et a

lM

L M

elan

oma

Tal

anto

v et

al

CA

Bre

ast R

icha

rdso

n et

al

MU

C O

varia

n H

endr

ix e

t al

CC

C O

varia

n H

endr

ix e

t al

SRS

Ova

rian

Hen

drix

et a

lE

ND

Ova

rian

Hen

drix

et a

lM

CA

Bre

ast R

adva

nyi e

t al

MPM

Mes

othe

liom

a G

ordo

n et

al

TU

Pro

stat

e La

poin

te e

t al

AD

Lun

g St

earm

an e

t al

AD

Lun

g B

hatta

char

jee

et a

lA

D L

ung

Bee

r et a

lSQ

Lun

g B

hatta

char

jee

et a

l

caspase regulator activityregulation of cytokine biosynthetic processGTPase activitysnRNA metabolic processcortical cytoskeleton organization and biogenesistranscription initiation from RNA polymerase II promotermicrotubule organizing centertranscriptional activator activityprotein targeting to peroxisomeRNA polymerase II transcription factor activitymonocarboxylic acid transportepidermal growth factor receptor signaling pathwayenergy reserve metabolic processNAD(P)+-protein-arginine ADP-ribosyltransferase activitytransferase activity, transferring acyl groupsacyltransferase activitymagnesium ion bindingsignal peptide processingglycoprotein metabolic processendoplasmic reticulum lumenendoplasmic reticulum partendoplasmic reticulum membranenuclear envelope-endoplasmic reticulum networkRho protein signal transductionnucleoside kinase activitynuclease activitymRNA metabolic processmRNA processingpigment biosynthetic processubiquitin-specific protease activitysmall nuclear ribonucleoprotein complexunfolded protein bindingprotein foldingmetallopeptidase activityregulation of cell proliferationATPase activityhelicase activityATP-dependent helicase activitynucleotidyltransferase activitycytoplasm organization and biogenesistranslation regulator activitytranslation factor activity, nucleic acid bindingtranslational initiationtranslation initiation factor activityregulation of transcription from RNA polymerase II promoterribosomemacromolecule catabolic processcellular macromolecule catabolic processubiquitin-dependent protein catabolic processproteasome complex (sensu Eukaryota)steroid metabolic processiron ion bindingprocollagen-proline dioxygenase activityER-Golgi intermediate compartmentvesicle-mediated transportsecretory pathwaysecretionpoly(A) bindingER to Golgi vesicle-mediated transportGolgi vesicle transportoxidoreductase activityantioxidant activity2 iron, 2 sulfur cluster bindingendomembrane systemchromatin modificationnucleocytoplasmic transportnucleoplasmintracellular protein transporttRNA metabolic processligase activityubiquitin-protein ligase activityregulation of small GTPase mediated signal transductioncarbon-oxygen lyase activitysmall GTPase regulator activityfatty acid metabolic processelectron carrier activityNADH dehydrogenase activityoxidoreductase activity, acting on NADH or NADPH, quinone or similar compoundelectron transportoxidative phosphorylationhydrogen ion transporter activitymonovalent inorganic cation transporter activitycofactor metabolic processcoenzyme metabolic processmitochondrial membraneorganelle envelopenucleotide metabolic processcofactor biosynthetic processribonucleotide biosynthetic processpurine nucleotide metabolic processcAMP biosynthetic processoxidoreductase activity, acting on single donors (O2)nucleotide catabolic processnucleotide transporter activitynucleotidase activitytranscription corepressor activityresponse to unfolded proteinphospholipid metabolic processmembrane lipid metabolic processtransferase activity, transferring alkyl or aryl groupssexual reproductioncentral nervous system developmentcell developmentneuron morphogenesis during differentiationvoltage-gated potassium channel activityneurotransmitter bindingneurotransmitter receptor activityion channel activityalpha-type channel activitycation channel activitytransmission of nerve impulsesynaptic transmissionpotassium ion transportmonovalent inorganic cation transportmetal ion transportalkali metal ion bindingamine receptor activityphosphoinositide bindingGPI anchor bindingvitamin transportcytoplasmic vesiclevesicletransferring glycosyl groupspositive regulation of angiogenesisDNA (cytosine-5-)-methyltransferase activitycell-cell adhesionhormone activityolfactory receptor activitysensory perception of chemical stimulustumor necrosis factor receptor bindingsolute:sodium symporter activitypositive regulation of transcription, DNA-dependenthistamine receptor activityendonuclease activityisomerase activitydisulfide oxidoreductase activityprotein disulfide isomerase activitycell junctionmicrosomeoxidoreductase activity, acting on paired donorsserine-type endopeptidase activityserine-type peptidase activityserine-type endopeptidase inhibitor activityprotease inhibitor activityendopeptidase inhibitor activitylipid transportenzyme inhibitor activitytissue developmentcornified envelopeectoderm developmentepidermis developmenthormone metabolic processintermediate filamentcarbon-carbon lyase activityamino acid metabolic processnitrogen compound metabolic processamine metabolic processamino acid and derivative metabolic processcell-matrix adhesionactin filament bundle formationmethyltransferase activityS-adenosylmethionine-dependent methyltransferase activityalcohol catabolic processglycolysisglucose metabolic processhexose metabolic processcellular carbohydrate metabolic processenergy derivation by oxidation of organic compoundsalcohol metabolic processnegative regulation of progression through cell cyclephosphoinositide-mediated signalingDNA replicationDNA repairresponse to DNA damage stimuluscytoskeleton organization and biogenesismicrotubule-based movementmicrotubule cytoskeletonmitosismitotic cell cyclechromosomechromosome organization and biogenesischromatinnucleosomechromatin assemblyestablishment and/or maintenance of chromatin architectureDNA packagingapoptotic nuclear changeslyase activityredox signal responsesubstrate-bound cell migration, cell attachment to substratecell redox homeostasisRNA methyltransferase activitypyridoxal kinase activitypyrimidine nucleotide biosynthetic processcell agingprotein binding, bridgingregulation of signal transductioncarboxylic ester hydrolase activityprotein-tyrosine kinase activityphosphoprotein phosphatase activityregulation of hydrolase activityregulation of GTPase activityN-acetylneuraminate metabolic processendocytosisactin cytoskeletoncytoskeletal protein bindingactin bindingprotein homooligomerizationanti-apoptosisnegative regulation of programmed cell deathnegative regulation of apoptosissodium ion transportsugar bindingnucleotide receptor activity, G-protein coupledprotein dimerization activityoxygen and reactive oxygen species metabolic processantigen processing and presentationphospholipid bindinghumoral immune responselytic vacuolelysosomecytokine activityinflammatory responseresponse to woundingantigen processing and presentation of endogenous peptide antigensterol biosynthetic processpositive regulation of cell proliferationcell activationresponse to virusinterleukin bindinghematopoietin/interferon-class (D200-domain) cytokine receptor activityregulation of I-kappaB kinase/NF-kappaB cascadeprotein kinase cascadeI-kappaB kinase/NF-kappaB cascadephospholipase A2 inhibitor activitysmall GTPase mediated signal transductioncopper ion bindingcadmium ion bindingsymporter activityelectrochemical potential-driven transporter activitycell motilityion homeostasisGTPase activator activitysecond-messenger-mediated signalingcirculationmuscle contractionGTPase regulator activityvisual perceptionangiogenesislipoprotein bindinginduction of apoptosisWnt receptor signaling pathwayinsulin-like growth factor bindingcell morphogenesissteroid hormone receptor activitycaveolavascular endothelial growth factor receptor activitytransmembrane receptor protein tyrosine kinase activityplatelet-derived growth factor receptor activityenzyme linked receptor protein signaling pathwaywound healingprocollagen-lysine 5-dioxygenase activityphospholipid-translocating ATPase activityanion transportproteinaceous extracellular matrixskeletal developmentreceptor signaling protein tyrosine kinase activitycell migrationmuscle development

4.5Pos

0

Neg-4.5

Figure S7

Page 23: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

23

ILC

Brea

st Ra

dvan

yi et

alBP

H Pr

ostat

e Dha

nase

kara

n et a

lOD

Bra

in B

rede

l et a

lAD

Lun

g Bha

ttach

arjee

et al

SRS O

varia

n Hen

drix

et al

MM

Mye

loma Z

han e

t al

AD L

ung B

eer e

t al

CLL

Lym

phom

a Aliz

adeh

et al

CCC

Ovar

ian H

endr

ix et

alCA

Bre

ast R

ichar

dson

et al

CA B

ladde

r Dyr

skjot

et al

END

Ovar

ian H

endr

ix et

alGL

B Br

ain Su

n et a

lCA

Bre

ast S

orlie

et al

HSCC

Hea

d-Ne

ck C

rom

er et

alID

C Br

east

Radv

anyi

et al

MCA

Bre

ast R

adva

nyi e

t al

AC B

rain

Sun e

t al

GL B

rain

Rick

man

et al

SMCL

Lun

g Bha

ttach

arjee

et al

COID

Lun

g Bha

ttach

arjee

et al

SQ L

ung B

hatta

char

jee et

alAO

Bra

in B

rede

l et a

lOD

GL B

rain

Sun e

t al

AD Pa

ncre

as L

ogsd

on et

alPD

C Pa

ncre

as Is

hika

wa et

alB-

CLL

Leuk

emia

Hasli

nger

et al

AD L

ung S

tearm

an et

alM

PM M

esot

helio

ma G

ordo

n et a

lGL

Bra

in B

rede

l et a

lGC

T Se

min

oma K

orko

la et

alM

E M

elano

ma H

oek e

t al

HSCC

Hea

d-Ne

ck C

hung

et al

ML

Mela

nom

a Tala

ntov

et al

PPC

Pros

tate D

hana

seka

ran e

t al

TU Pr

ostat

e Lap

ointe

et al

DLBC

L Ly

mph

oma A

lizad

eh et

alAD

Ova

rian W

elsh e

t al

MPC

Pros

tate D

hana

seka

ran e

t al

MUC

Ova

rian H

endr

ix et

alCA

Colo

n Gra

uden

s et a

lCA

Ren

al Hi

ggin

s et a

l

0

1

2

bits

1

TGA

2

C

3

G

4T 5

G

6T 7TA

8

GC

9

TGC 5’ bZIP911

0

1

2

bits

1 2

A

3

A

4

C

5

T

6

G

7

GC

8

G

9

TA 5’ c-Myb

0

1

2

bits

1 2

T

3

GC

4

T

5

GA

6

C

7

G

8

TA

9 5’ O20

1

2

bits

1

C

2

C

3

C

4

C

5

C

6

GC

7

TCA

8CA

9

TCA 5’ ADR1

0

1

2

bits

1

TCA

2

TG

3

T

4

T

5

GCA

6

A

7

G

8

A

9

GCA 3’ UTR -

0

1

2

bits

1

A

2

A

3

TCA

4

G

5

G

6

C

7

TGC

8

C

9

T 3’ UTR -0

1

2

bits

1

TCA

2

G

3

G

4

C

5TC

6

C

7T 8TA

9TC 3’ UTR -

0

1

2

bits

1

TC

2

GA

3

A

4

T

5

T

6

GC

7

G

8

C

9

TC 5’ NF-Y

0

1

2

bits

1

TCA

2

GC

3

A

4

G

5

T

6

GA

7

TA

8

T

9 3’ UTR -0

1

2

bits

1

GCA

2

C

3

G

4TA

5 6T 7A 8

C

9

GC 5’ -

0

1

2

bits

1 2

GCA

3

TG

4

C

5

C

6

TA

7

G

8

A

9 3’ UTR -0

1

2

bits

1 2

C

3

T

4

GA

5

TGA

6

C

7

TC

8

T

9

TGC 3’ UTR -

0

1

2

bits

1

TGA

2T 3TA

4 5T 6

C

7

G

8

C

9 5’ E2F0

1

2

bits

1 2

A

3

A

4

A

5

A

6

TG

A

7

T

8

T

9 5’ -0

1

2

bits

1 2

T

3

T

4

A

5

A

6

A

7

A

8

TA

9

TCA 5’ SBF-1

0

1

2

bits

1

GC

2TA

3T 4

G

5 6

GA

7

G

8

G

9

TGC 3’ UTR -

0

1

2

bits

1

GC

2

C

3

C

4

C

5

T

6

C

7

C

8

C

9 5’ -0

1

2

bits

1 2

A

3 4

TGA

5

A

6

T

7

A

8

T

9

TA 5’ -

0

1

2

bits

1 2T 3A 4A 5T 6A 7

GA

8TA

9 5’ HNF-10

1

2

bits

1

TGC

2

C

3

C

4

T

5

G

6

C

7

C

8

C

9

TCA 5’ -

0

1

2

bits

1

GCA

2

T

3

TG

4

T

5

A

6

A

7

CA

8

A

9 5’ MEF-20

1

2

bits

1 2

C

3

C

4

C

5

C

6A 7

G

8

G

9 5’ MIG10

1

2

bits

1 2

CA

3

A

4

CA

5

A

6

T

7

A

8

A

9

TCA 5’ HFH-1

0

1

2

bits

1

TGA

2

C

3

G

4

T

5

TC

6

A

7 8

TG

9 5’ CRE-BP10

1

2

bits

1TA

2

G

3A 4

TCA

5A 6TA

7T 8

G

9

TCA 3’ UTR -

0

1

2

bits

1

TGC

2

A

3

TA

4

A

5

A

6

A

7

A

8

A

9 5’ -0

1

2

bits

1

TCA

2

GC

3

T

4

C

5

T

6

C

7

TG

8

G

9

GCA 3’ UTR -

0

1

2

bits

1

TCA

2

TCA

3TG

4TG

5

C

6

GC

7

TGA

8T 9TA 3’ UTR -

0

1

2

bits

1 2

T

3

TA

4

GA

5

TGA

6

C

7

TGA

8

C

9

G 5’ -0

1

2

bits

1

TG

2

C

3

G

4

T

5

TA

6

TGA

7

TC

8

CA

9 5’ -0

1

2

bits

1 2T 3TG

4T 5T 6A 7

C

8T 9 3’ UTR -0

1

2

bits

1 2

A

3

GC

4

TA

5

TC

6

A

7

T

8

G

9

GCA 5’ p53

0

1

2

bits

1 2

A

3

G

4

C

5

C

6

C

7

C

8

CA

9

TCA 5’ -

0

1

2

bits

1

TGC

2

A

3 4

C

5

GC

6

CA

7

C

8

G

9

TCA 3’ UTR -

0

1

2

bits

1

TGA

2

TCA

3

C

4

G

5

CA

6

GCA

7

T

8

CA

9

GA 5’ -

0

1

2

bits

1

GCA

2

C

3

TGA

4

T

5

TC

6

A

7

C

8

G

9 5’ -0

1

2

bits

1 2

A

3

A

4

TC

5

TGA

6

CA

7

C

8

G

9 5’ -0

1

2

bits

1

TGC

2

T

3

A

4

C

5

G

6

TGA

7

CA

8

TGA

9 5’ -0

1

2

bits

1

TG

2

CA

3

TA

4

TA

5

GA

6

C

7

G

8

C

9

TGA 5’ -

0

1

2

bits

1

GCA

2

C

3

G

4

T

5

TCA

6

G

7

GA

8

C

9 5’ Egr-30

1

2

bits

1 2

A

3

C

4

G

5

TA

6

C

7

GA

8 9 5’ v-Jun0

1

2

bits

1 2

A

3

A

4

TGA

5

T

6

G

7

A

8

A

9

TA 5’ ISRE

0

1

2

bits

1 2

TG

3

T

4

T

5

T

6

T

7

T

8 9

TGA 3’ UTR -

0

1

2

bits

1 2

T

3

CA

4

C

5

C

6

G

7

C

8

C

9 5’ -0

1

2

bits

1 2

GC

3

C

4

G

5

G

6

A

7

A

8

GCA

9

TC 5’ Elk-1

0

1

2

bits

1

GCA

2

C

3

C

4

G

5

GA

6

A

7

TCA

8

GA

9

TGA 5’ c-Ets-1-p54

0

1

2

bits

1

CA

2

T

3

C

4

T

5

G

6

C

7

CA

8

A

9

GC 5’ bZIP910

0

1

2

bits

1 2

A

3

C

4

TA

5

T

6

GC

7

C

8

T

9

TGC 5’ -

0

1

2

bits

1

TGC

2

TA

3

A

4

T

5

C

6

G

7

TC

8

TGC

9

GCA 5’ -

0

1

2

bits

1 2

T3

G4

A5

TCA

6 7

C

8

A

9

TA 5’ -

0

1

2

bits

1 2

GA

3

T4

T5

A6

A7

T8

C9

TCA 5’ -

0

1

2

bits

1

TA

2

A

3

TGC

4

G

5

GA

6

T7

TC

8

A9

TGA 3’ UTR miR-487a

0

1

2

bits

1 2

CA

3

G

4

GA

5

A

6

A

7

T

8

A

9 5’ HSF0

1

2

bits

1 2

C

3

TG

4

G

5

C

6

G

7

TCA

8

A

9 5’ -0

1

2

bits

1

TGA

2

T

3

T

4

G

5

T

6

TGA

7

T

8

G

9

TGA 3’ UTR miR-377

0

1

2

bits

1

GC

2

T

3

TG

4

TA

5

C

6

G

7

T

8 9 5’ CRE-BP10

1

2

bits

1 2

A

3

T

4

T

5

TGA

6

C

7

A

8

GC

9

TCA 3’ UTR miR-301

0

1

2

bits

1

TCA

2

T

3

TG

4

TC

5

T

6

TA

7

A

8

T

9 3’ UTR -0

1

2

bits

1

TGC

2 3

A

4

A

5

T

6

A

7

A

8

A

9 3’ UTR -0

1

2

bits

1

TCA

2

A

3

TG

4

T

5

T

6

TGA

7

T

8

CA

9 3’ UTR -0

1

2

bits

1

TGC

2

A

3

C

4

A

5

GCA

6

A

7

G

8

A

9

GCA 5’ -

0

1

2

bits

1 2

G

3

C

4

C

5

G

6

A

7

GA

8

C

9 5’ -0

1

2

bits

1

GCA

2

C

3

A

4

GC

5

TC

6

T

7

G

8

G

9

GC 5’ Lmo2_complex

0

1

2

bits

1

GCA

2

T

3

G

4

GCA

5

C

6

TA

7

C

8

A

9

TGC 5’ -

0

1

2

bits

1 2

A

3

C

4

TC

5

G

6

TA

7

A

8

GA

9 3’ UTR miR-5820

1

2

bits

1

TGC

2

GCA

3

T

4

CA

5

G

6

C

7

G

8

A

9

TGA 5’ -

0

1

2

bits

1

TGC

2

TGA

3 4

TG

5

C

6

G

7

A

8

C

9

TCA 5’ -

0

1

2

bits

1

TA

2

G

3

TC

4

GA

5

GA

6

A

7

TC

8

TC

9

TA 3’ UTR -

0

1

2

bits

1 2

T

3

TA

4

TGA

5

G

6

TC

7

G

8

TG

9

TCA 3’ UTR -

0

1

2

bits

1

GCA

2

TA

3

TA

4

A

5

C

6

G

7

C

8

GC

9

TGA 5’ -

0

1

2

bits

1 2 3

C

4

G

5

A

6

A

7

CA

8

A

9 5’ -0

1

2

bits

1 2

CA

3

G

4

A

5

C

6

G

7

TC

8

T

9

TGC 5’ -

0

1

2

bits

1 2

TGC

3

GA

4

A

5

T

6

C

7

GA

8

A

9

GCA 5’ CDP_CR3+HD

0

1

2

bits

1

GCA

2

T

3

G

4

C

5

TG

6

TC

7

GC

8

G

9

TGC 3’ UTR -

0

1

2

bits

1 2

C

3

GA

4

TC

5

C

6

TA

7

C

8

G

9

CA 5’ -

0

1

2

bits

1

GCA

2

T

3

TA

4

GCA

5

C

6

G

7

CA

8

A

9

TGA 5’ -

0

1

2

bits

1 2

C

3

T

4

G

5

C

6

G

7

C

8

C

9

TGC 5’ HEN1

0

1

2

bits

1 2

C

3

GC

4

G

5

A

6

TA

7

TA

8

A

9 5’ -0

1

2

bits

1

TG

2

TC

3

G

4

TA

5

C

6

G

7

CA

8

A

9 5’ -0

1

2

bits

1

TGA

2

TA

3

TA

4

TCA

5

C

6

G

7

T

8

T

9 5’ -0

1

2

bits

1

TGC

2

GCA

3

TGA

4

C

5

G

6

TA

7

A

8

CA

9

TGC 5’ -

0

1

2

bits

1

TA

2

T

3

TGC

4

A

5

TG

6

G

7

T

8 9

TGA 3’ UTR -

0

1

2

bits

1

GCA

2

TCA

3

GA

4

T

5

TCA

6

G

7

GA

8

T

9

TG 3’ UTR -

0

1

2

bits

1 2

TCA

3

TG

4

T

5

CA

6

C

7

TG

8

T

9

TGA 3’ UTR -

0

1

2

bits

1

TGA

2

C

3

C

4

TG

5

A

6

G

7

T

8

TCA

9 5’ PPARalpha-RXR-alpha0

1

2

bits

1

TCA

2

TA

3

G

4

TA

5

G

6

TA

7

A

8

A

9

CA 5’ -

0

1

2

bits

1

GCA

2

C

3

A

4

G

5

G

6

T

7

G

8

TCA

9 5’ E470

1

2

bits

1 2

T

3

TGA

4

TA

5

G

6

TGA

7

A

8

C

9

TCA 3’ UTR -

0

1

2

bits

1

TGC

2

TC

3

G

4 5

T

6

T

7

T

8

G

9 3’ UTR -0

1

2

bits

1 2

G

3

TGC

4

TC

5

T

6

TG

7

T

8

A

9 3’ UTR -0

1

2

bits

1 2

C

3

C

4

C

5

G

6

C

7

C

8

C

9 5’ Sp10

1

2

bits

1

TGA

2

A

3

G

4

TC

5

TG

6

TGC

7

A

8

TA

9 3’ UTR -0

1

2

bits

1

TCA

2

C

3

TC

4

T

5

G

6

CA

7

C

8

A

9

TGC 5’ -

0

1

2

bits

1 2

TG

3

T

4

A

5

TC

6

C

7

TC

8

TG

9

TGA 3’ UTR -

0

1

2

bits

1

TGA

2

TGC

3

A

4

TGA

5

CA

6

G

7

TC

8

T

9

TGA 3’ UTR -

0

1

2

bits

1

TGA

2

TC

3

C

4

G

5

TC

6

TGA

7

A

8

A

9

GA 5’ -

0

1

2

bits

1

TCA

2

A

3

T

4

C

5

C

6

TC

7

C

8

T

9

TGC 5’ NF-kappaB-p50

0

1

2

bits

1

TGA

2

G

3

TG

4

TGC

5

C

6

TGA

7

C

8

TGC

9

GC 3’ UTR -

0

1

2

bits

1 2

A

3

C

4

GC

5

T

6

T

7

A

8

C

9

CA 5’ HLF

0

1

2

bits

1

TCA

2

C

3

C

4

C

5

TCA

6

A

7 8

GCA

9

TC 3’ UTR -

0

1

2

bits

1

GA

2

C

3

TA

4

T

5

C

6

C

7

GA

8

C

9

TGA 5’ -

0

1

2

bits

1 2

C

3

A

4

C

5

T

6

C

7

G

8

A

9 5’ -

5Pos

0

Neg-5

Figure S8

Page 24: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

24

Figure S9

Page 25: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

25

012 bits1

GCA

2

UCA

3

GA

4U 5UCA 6G 7GA 8U 9UG 3’ -

012 bits1

UCA

2

GC

3A 4G 5U

6

GA

7

UA

8U 93’-

012 bits12U 3UA 4

UGA

5G 6UC 7G 8UG 9

UCA3’-

012 bits12C 3U 4GA

5

UGA

6C 7UC 8U 9UGC 3’-

012 bits1

UGC

2A 34C

5

GC

6

CA

7C 8G 9UCA3’-

012 bits1

UGA

2

UGC

3A 4UGA 5

CA

6G 7UC 8U 9UGA 3’-

012 bits1

UCA

2

UG

3U 4U 5GCA

6A 7G 8A

9

GCA3’-

012 bits1A 2A 3UCA

4G 5G 6C

7

UGC

8C 9U3’-

012 bits1

UCA

2G 3G 4C

5

UC

6C 7U 8UA

9

UC3’-

012 bits12U 3UG 4U 5U 6A

7C 8U 93’-

012 bits12

UCA

3

UG

4U 5CA 6C 7UG 8U 9UGA 3’-

012 bits1

UGC

23A 4A 5U

6A 7A 8A

9

3’-

012 bits1

UA

2U 3UGC 4A 5UG 6G 7U 89

UGA3’-

012 bits1

UA

2G 3UC 4

GA

5

GA

6A 7UC 8

UC

9

UA3’-

012 bits12U 3UGA 4

UA

5G 6UGA 7A 8C 9UCA 3’ -

012 bits12A 3C 4UC 5G 6UA 7A 8GA 9

3’miR-582

012 bits12G 3UGC 4

UC

5U 6UG 7U 8A 93’-

012 bits12A 3U 4U

5

UGA

6C 7A 8GC

9

UCA3’miR-301_miR-203

012 bits12

UG

3U 4A 5UC

6C 7UC 8

UG

9

UGA3’-

012 bits1

UCA

2U 3UG 4

UC

5U 6UA 7A 8U 93’ -

012 bits1

UCA

2A 3UG 4U 5U 6UGA

7U 8CA 9

3’-

012 bits1

UGA

2A 3G 4UC 5

UG

6

UGC

7A 8UA 9

3’-

012 bits12

UG

3U 4U 5U

6U 7U 89

UGA3’-

012 bits1

UA

2A 3UGC 4G 5GA 6U 7UC 8A 9UGA 3’miR-487a

012 bits1

UA

2G 3A 4UCA

5A 6UA 7U 8G 9UCA3’-

012 bits1

UCA

2

UCA

3

UG

4

UG

5C 6GC 7

UGA

8U 9UA 3’-

012 bits1

UGA

2U 3U 4G

5U 6UGA 7U 8G 9UGA3’miR-377

012 bits1

UGC

2

UC

3G 45U

6U 7U 8G

9

3’-

012 bits1

GCA

2U 3G 4C

5

UG

6

UC

7

GC

8G 9UGC 3’-

012 bits1

UCA

2C 3C 4C

5

UCA

6A 78GCA

9

UC3’-

012 bits1

UGA

2G 3UG 4

UGC

5C 6UGA 7C 8UGC 9

GC3’-

012 bits1

GC

2

UA

3U 4G 56

GA

7G 8G 9UGC3’-

012 bits12

GCA

3

UG

4C 5C 6UA

7G 8A 93’-

012 bits1

UCA

2

GC

3U 4C 5U

6C 7UG 8G 9GCA 3’-

012 bits12C 3A 4C

5T 6C 7G

8A 95’-

012 bits1

TCA

2A 3T 4C

5C 6TC 7C 8T 9TGC 5’NF-kappaB-p50

012 bits12A 3GC 4TA

5TC

6A 7T 8G 9GCA5’p53

012 bits12A 3C 4TA

5T 6GC 7C 8T 9TGC 5’-

012 bits1

GCA

2C 3A 4GC

5TC

6T 7G 8G

9

GC5’Lmo2_complex

012 bits1

GCA

2T 3G 4GCA 5C 6TA 7C 8A 9TGC 5’-

012 bits1

CA

2T 3C 4T

5G 6C 7CA

8A 9GC 5’bZIP910

012 bits1

TCA

2C 3TC 4T 5G 6CA

7C 8A 9TGC 5’-

012 bits12A 3A 4C 5T

6G 7GC 8G 9TA 5’c-Myb

012 bits1

GCA

2C 3A 4G

5G 6T 7G

8

TCA

9

5’E47

012 bits12C 3C 4C

5C 6A 7G

8G 95’-

012 bits1C 2C 3C

4C 5C 6GC

7

TCA

8CA

9

TCA5’-

012 bits1

GC

2C 3C 4C

5T 6C 7C

8C 95’-

012 bits1

TGC

2C 3C 4T

5G 6C 7C

8C 9TCA 5’-

012 bits12A 3G 4C 5C

6C 7C 8CA

9

TCA5’-

012 bits12

CA

3G 4GA 5A 6A 7T

8A 95’HSF

012 bits12

CA

3A 4CA 5A 6T 7A

8A 9TCA 5’HFH-1

012 bits12T 3A 4A 5T

6A 7GA 8TA

9

5’HN

F-1

012 bits1

GCA

2T 3TG 4T 5A 6A

7

CA

8A 95’M

EF-2

012 bits12A 3A 4A

5A 6TGA 7T 8T 95’-

012 bits12

TGC

3

GA

4A 5T 6C 7GA

8A 9GCA 5’CDP_CR3+HD

012 bits12T 3T 4A

5A 6A 7A

8

TA

9

TCA5’SBF-1

012 bits12T 3G 4A

5

TCA

67C 8A 9TA 5

’-012 bits

12A 3A 4TGA 5T 6G 7A 8A

9TA5’ISRE

012 bits1

TGC

2A 3C 4A

5

GCA

6A 7G 8A

9

GCA5’-

012 bits1

TCA

2TA

3G 4TA 5G 6TA 7A 8A 9CA 5’-

012 bits1

TGC

2A 3TA 4A 5A 6A 7A

8A 95’-

012 bits12A 34TGA

5A 6T 7A

8T 9TA 5’-

012 bits12

GA

3T 4T 5A

6A 7T 8C

9

TCA5’-

012 bits12G 3C 4C

5G 6A 7GA

8C 95’-

012 bits12C 3T 4G

5C 6G 7C

8C 9TGC 5’HEN1

012 bits1

GA

2C 3TA 4T 5C 6C

7

GA

8C 9TGA 5’-

012 bits12A 3C 4GC

5T 6T 7A

8C 9CA 5’HLF

012 bits12

CA

3G 4A 5C

6G 7TC 8T 9TGC 5’-

012 bits1

TGA

2C 3G 4T

5G 6T 7TA

8

GC

9

TGC5’bZIP911

012 bits1

TGA

2C 3C 4TG

5A 6G 7T

8

TCA

9

5’PPARalpha-RXR-alpha

012 bits12C 3GA 4TC

5C 6TA 7C 8G 9CA5’-

012 bits1

TGA

2

TCA

3C 4G 5CA

6

GCA

7T 8CA 9

GA5’-

012 bits1

TGC

2

TGA

34TG

5C 6G 7A

8C 9TCA 5’-

012 bits1TG

2C 3G 4T 5TA

6

TGA

7TC

8CA

9

5’-

012 bits1

GCA

2C 3G 4T

5

TCA

6G 7GA 8C 95’Egr-3

012 bits1

GCA

2C 3C 4G

5

GA

6A 7TCA 8

GA

9

TGA5’c-Ets-1-p54

012 bits12T 3CA 4C 5C 6G

7C 8C 9

5’-

012 bits12C 3TG 4G 5C 6G

7

TCA

8A 95’-

012 bits12C 3C 4C

5G 6C 7C

8C 95’Sp1

012 bits1

GCA

2TA

3TA

4A 5C 6G 7C

8

GC

9

TGA5’-

012 bits1

TGC

2

GCA

3T 4CA 5G 6C 7G

8A 9TGA 5’-

012 bits1

TGA

2TC

3C 4G 5TC

6

TGA

7A 8A 9GA 5’-

012 bits12T 3GC 4T 5GA 6C 7G 8TA

9

5’O2

012 bits1

TGC

2TA

3A 4T 5C6G 7TC 8

TGC

9

GCA5’-

012 bits1TC

2

GA

3A 4T 5T6

GC7G 8C 9TC

5’NF-Y

012 bits12A 3A 4TC 5

TGA

6

CA

7C 8G 95’-

012 bits12T 3TA 4

GA

5

TGA

6C 7TGA 8C 9G5’-

012 bits1

TGA

2T 3TA 45T 6C 7G

8C 95’E2F

012 bits1

GCA

2C 3G 4TA

56T 7A 8C 9GC

5’-

012 bits1

TGC

2

GCA

3

TGA

4C 5G 6TA

7A 8CA 9

TGC5’-

012 bits12

GC

3C 4G 5G

6A 7A 8GCA

9TC5’Elk-1

012 bits1TG

2TC

3G 4TA 5C 6G 7CA

8A 95’-

012 bits1

TGA

2C 3G 4T

5

TC

6A 78TG

9

5’CRE-BP1

012 bits1

GC

2T 3TG 4TA

5C 6G 7T

89

5’CRE-BP1

012 bits12A 3C 4G 5TA

6C 7GA 89

5’v-Jun

012 bits1

TGA

2

TA

3

TA

4

TCA

5C 6G 7T

8T 95’-

012 bits1

TGC

2T 3A 4C

5G 6TGA 7CA

8

TGA

9

5’-

012 bits1

GCA

2C 3TGA 4T 5TC 6A 7C 8G 9

5’-

012 bits1

TG

2

CA

3

TA

4

TA

5

GA

6C 7G 8C

9

TGA5’-

012 bits12C 3GC 4G 5A 6TA

7TA

8A 95’-

012 bits1

GCA

2T 3TA 4

GCA

5C 6G 7CA

8A 9TGA 5’-

012 bits123C 4G 5A 6A

7CA

8A 95’-

sodium ion transportvoltage-gated potassium channel activitypotassium ion transportcell agingRho protein signal transductionnegative regulation of progression through cell cycleantioxidant activityNADH dehydrogenase activityenergy derivation by oxidation of organic compoundsglycolysisalcohol catabolic processfatty acid metabolic processcofactor biosynthetic processamino acid metabolic processserine-type endopeptidase activitycell-cell adhesioncopper ion bindingmicrotubule organizing centernucleotide metabolic processtumor necrosis factor receptor bindingER to Golgi vesicle-mediated transportphosphoinositide-mediated signalingmitotic cell cycleDNA replicationDNA repairchromatin assemblychromosome organization and biogenesisATP-dependent helicase activityubiquitin-dependent protein catabolic processmRNA processingnucleotidyltransferase activitymethyltransferase activityprotein foldingchromatin modificationsterol biosynthetic processresponse to virusregulation of cell proliferationphospholipase A2 inhibitor activitypositive regulation of cell proliferationnegative regulation of apoptosiscell motilityantigen processing and presentationI-kappaB kinase/NF-kappaB cascadehematopoietin/IFN (D200-d) cytokine receptor activityinflammatory responselysosomecell migrationinduction of apoptosisangiogenesisactin cytoskeletonactin bindingWnt receptor signaling pathwayplatelet-derived growth factor receptor activitytransmembrane receptor protein tyrosine kinase activityvascular endothelial growth factor receptor activityinsulin-like growth factor bindinganion transportproteinaceous extracellular matrix

3(Pos)

0

(Neg)-3

Figure S10

Page 26: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

26

LMO2

LC:HOP_62

CNS:SF_295

BR:T47D

CNS:U251

ME:MDA_N

LC:NCI_H322M

CNS:SF_539

LC:NCI_H522

CNS:SF_268

CNS:SNB_19

ME:SK_MEL_28

RE:RXF_393

ME:UACC_257

ME:MDA_MB_435

BR:MDA_MB_231

CO:COLO205

LC:EKVX

ME:MALME_3M

LC:NCI_H226

RE:SN12C

PR:DU_145

CO:HT29

ME:M14

OV:OVCAR_8

CNS:SNB_75

OV:OVCAR_5

RE:ACHN

LE:SR

RE:UO_31

BR:BT_549

CO:HCT_15

OV:IGROV1

PR:PC_3

ME:SK_MEL_2

LC:A549

OV:NCI_ADR_RES

BR:HS578T

LC:HOP_92

RE:A498

ME:LOXIMVI

ME:SK_MEL_5

LC:NCI_H460

RE:786_0

CO:HCT_116

CO:KM12

LE:RPMI_8226

RE:CAKI_1

ME:UACC_62

OV:OVCAR_4

RE:TK_10

OV:SK_OV_3

OV:OVCAR_3

CO:HCC_2998

BR:MCF7

PI3K signaling (GO:0048015) R = 0.760 (p = 2.325e-12 )

NM_000707NM_000739

NM_001033952NM_001039583

NM_001058NM_002565NM_004080NM_004573NM_004720NM_019888NM_176875

cell migration (GO:0016477) R = 0.833 (p = 2.357e-17 )

NM_000193NM_001007NM_002226NM_002547NM_002673NM_003019NM_003212NM_003285NM_003728NM_006181NM_014495NM_022034NM_022093

TCF3

ME:SK_MEL_28

ME:MALME_3M

CO:COLO205

OV:OVCAR_4

ME:UACC_257

ME:SK_MEL_5

ME:MDA_N

ME:MDA_MB_435

ME:SK_MEL_2

BR:T47D

CNS:SF_539

LC:HOP_92

LC:NCI_H322M

LC:A549

LC:NCI_H226

ME:UACC_62

LC:HOP_62

RE:UO_31

BR:MDA_MB_231

CNS:U251

ME:M14

CNS:SNB_75

LC:NCI_H460

RE:786_0

CNS:SF_295

ME:LOXIMVI

RE:RXF_393

OV:OVCAR_8

LC:EKVX

LE:RPMI_8226

RE:SN12C

CNS:SF_268

OV:OVCAR_3

OV:NCI_ADR_RES

RE:TK_10

RE:ACHN

OV:OVCAR_5

LE:SR

PR:PC_3

RE:CAKI_1

CNS:SNB_19

OV:IGROV1

CO:HCT_116

CO:HT29

RE:A498

BR:MCF7

PR:DU_145

BR:BT_549

CO:SW_620

BR:HS578T

LE:HL_60

LC:NCI_H522

CO:HCC_2998

OV:SK_OV_3

CO:HCT_15

CO:KM12

LE:CCRF_CEM

LE:MOLT_4

LE:K_562

inflammatory response (GO:0006954) R = 0.477 (p = 2.082e-04 )

NM_000211NM_000660NM_000674NM_000677NM_001039NM_001058NM_001643NM_002349NM_004590NM_004946NM_016946

mir-203

CNS:SF_268

CNS:SF_295

CNS:SF_539

LE:SR

ME:LOXIMVI

ME:M14

ME:SK_MEL_2

ME:UACC_257

ME:MDA_MB_435

ME:MDA_N

LC:A549

LC:HOP_62

LC:HOP_92

RE:786_0

RE:ACHN

RE:SN12C

PR:DU_145

LE:K_562

PR:PC_3

CNS:SNB_75

OV:IGROV1

ME:MALME_3M

CNS:SNB_19

LC:NCI_H522

LE:HL_60

CNS:U251

BR:BT_549

BR:MDA_MB_231

BR:HS578T

ME:SK_MEL_5

LE:MOLT_4

LE:CCRF_CEM

RE:UO_31

RE:CAKI_1

LC:NCI_H226

LC:NCI_H460

ME:SK_MEL_28

ME:UACC_62

RE:TK_10

RE:RXF_393

RE:A498

OV:SK_OV_3

LE:RPMI_8226

CO:SW_620

OV:OVCAR_8

OV:OVCAR_5

OV:OVCAR_3

CO:KM12

CO:HCT_116

LC:EKVX

CO:COLO205

OV:NCI_ADR_RES

OV:OVCAR_4

CO:HT29

LC:NCI_H322M

CO:HCT_15

CO:HCC_2998

BR:T47D

BR:MCF7

protein degradation (GO:0030163) R = -0.429 (p = 1.208e-03 )

NM_001017415NM_002792NM_002803NM_003336NM_004651NM_007051

1

-1

Figure S11

Page 27: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

27

gene-expression dataset

ab initio motif discovery (FIRE)

pathway analysis (iPAGE)

regulatory interaction map

pathway-regulatory interaction map

1.00

-1.00

ribosome biogenesis and assembly, GO:0042254ER to Golgi vesicle-mediated transport, GO:0006888tRNA aminoacylation, GO:0043039DNA replication, GO:0006260RNA splicing, GO:0008380mitotic cell cycle, GO:0000278ubiquitin-dependent protein catabolic process, GO:0006511microtubule biogenesis, GO:0000226oxidative phosphorylation, GO:0006119positive regulation of apoptosis, GO:0043065cAMP-mediated signaling, GO:0019933humoral immune response, GO:0006959potassium ion transport, GO:0006813homophilic cell adhesion, GO:0007156sodium ion transport, GO:0006814lymphocyte activation, GO:0046649

3

Over-

repres

entat

ion

-3

Unde

r-rep

resen

tation

1-1

optimized motif

location

MI (bits)

z-score

position bias

orientation bias

motif name

0

1

2

bits

1 2

C

3

G

4

CA

5

TC

6

A

7

A

8

TA

9

GCA

5’ 0.005 11.7 - E2F

0

1

2

bits

1 2

GC

3

C

4

G

5

G

6

TA

7

A

8

G

9

5’ 0.026 78.1 Y - Elk-1

0

1

2

bits

1

TGC

2

C

3

TA

4

C

5

G

6

C

7

TGA

8

A

9

5’ 0.010 26.1 - AhR

0

1

2

bits

1

UCA

2

U

3

UCA

4

G

5

GCA

6

U

7

G

8

U

9

3’UTR 0.004 8.4 Y -

0

1

2

bits

1 2

C

3

G

4

T

5

TCA

6

GA

7

CA

8

A

9

5’ 0.008 21.4 Y - O2

0

1

2

bits

1

GCA

2

A

3

C

4

G

5

T

6

TC

7

TGC

8

GA

9

TGC

5’ 0.007 18.5 - -

0

1

2

bits

1

GCA

2

A

3

GA

4

TA

5

A

6

C

7

G

8

CA

9

GCA

5’ 0.005 11.9 Y - -

0

1

2

bits

1 2

C

3

G

4

TC

5

A

6

G

7

T

8

TC

9

5’ 0.008 22.4 - -

0

1

2

bits

1

GCA

2

C

3

G

4

TA

5

GC

6

A

7

C

8

G

9

5’ 0.009 21.4 - - -

0

1

2

bits

1 2

C

3

G

4

C

5

G

6

T

7

T

8

A

9

5’ 0.004 8.8 - - -

0

1

2

bits

1 2

C

3

A

4

T

5

C

6

TA

7

GC

8

A

9

5’ 0.005 14.3 Y - Tal-1b_E47

0

1

2

bits

1

TCA

2

CA

3

T

4

C

5

T

6

GC

7

C

8

A

9

TGC

5’ 0.005 13.7 Y - -

0

1

2

bits

1

TCA

2

TC

3

C

4

C

5

C

6

C

7

C

8

A

9

5’ 0.005 10.6 Y - -

0

1

2

bits

1 2

A

3

T

4

G

5

A

6

GC

7

A

8

G

9

TCA

5’ 0.005 13.1 - Skn-1

0

1

2

bits

1 2

A

3

T

4

TA

5

T

6

G

7

C

8

A

9

5’ 0.004 10.5 - - Oct-1

0

1

2

bits

1 2

A

3

T

4

C

5

T

6

GA

7

T

8

G

9

5’ 0.004 8.8 - - SEF-1

10

Over-

repres

entat

ion

-10

Unde

r-rep

resen

tation

012 bits

12A 3T 4C 5T

6

GA

7T 8G 9

SEF-1

012 bits

12C 3A 4T

5C 6TA 7

GC

8A 9Tal1b-E47

012 bits

1

TCA

2

CA

3T 4C 5T

6

GC

7C 8A 9TGC-

012 bits

12A 3T 4G

5A 6GC 7A 8G 9TCASkn-1

012 bits

1

TCA

2TC

3C 4C 5C

6C 7C 8A 9

-012 bits

12C 3G 4T

5

TCA

6

GA

7

CA

8A 9O2

012 bits

1

GCA

2A 3C 4G 5T

6

TC

7

TGC

8

GA

9

TGC-

012 bits

1

TGC

2C 3TA 4C 5G 6C

7

TGA

8A 9AhR

012 bits

12

GC

3C 4G 5G

6

TA

7A 8G 9

Elk-1

012 bits

1

GCA

2C 3G 4TA

5

GC

6A 7C 8G

9

-012 bits

1

TGA

2C 3GA 4

TCA

5

TA

6C 7G 8A

9

-012 bits

12C 3G 4C

5G 6T 7T 8A

9

-012 bits

12C 3G 4TC

5A 6G 7T 8TC

9

-012 bits

1

GCA

2A 3GA 4

TA

5A 6C 7G 8CA

9

GCA-

012 bits

12C 3G 4CA

5

TC

6A 7A 8TA 9

GCAE2F

potassium ion transport GO:0006813sodium ion transport GO:0006814lymphocyte activation GO:0046649humoral immune response GO:0006959homophilic cell adhesion GO:0007156positive regulation of apoptosis GO:0043065oxidative phosphorylation GO:0006119microtubule biogenesis GO:0000226mitotic cell cycle GO:0000278DNA replication GO:0006260ribosome biogenesis and assembly GO:0042254Ub-dependent protein catabolic process GO:0006511RNA splicing GO:0008380tRNA aminoacylation GO:0043039ER to Golgi vesicle-mediated transport GO:0006888

3Pos

0

Neg-3

012 bits

12C 3G 4CA 5TC

6A 7A 8TA 9

GCA

5'

012 bits

12

GC

3C 4G 5G 6TA

7A 8G 9

5'

012 bits

1

TGC

2C 3TA 4C 5G 6C 7TGA

8A 9

5'

012 bits

1

UCA

2U 3UCA 4G 5GCA 6U 7G 8U 9

3'UT

R

012 bits

12C 3G 4T 5TCA6

GA

7

CA

8A 95'

012 bits

1

GCA

2A 3C 4G 5T6TC

7

TGC

8GA

9

TGC

5'

012 bits

1

GCA

2A 3GA 4TA

5A 6C 7G 8CA

9

GCA

5'

012 bits

12C 3G 4TC 5A 6G 7T 8TC

9

5'

012 bits

1

GCA

2C 3G 4TA 5

GC

6A 7C 8G 9

5'

012 bits

12C 3G 4C 5G

6T 7T 8A 9

5'

012 bits

12C 3A 4T 5C

6TA

7GC

8A 9

5'

012 bits

1

TCA

2

CA

3T 4C 5T 6GC

7C 8A 9TGC5'

012 bits

1

TCA

2TC

3C 4C 5C 6C

7C 8A 9

5'

012 bits

12A 3T 4G 5A

6

GC

7A 8G 9TCA5'

012 bits

12A 3T 4TA 5T 6G 7C 8A

9

5'

012 bits

12A 3T 4C 5T

6

GA

7T 8G 9

5'

0

1

2

bits

1 2

C

3

G

4

CA

5

TC

6

A

7

A

8

TA

9

GCA

5'

0

1

2

bits

1 2

GC

3

C

4

G

5

G

6

TA

7

A

8

G

9

5'

0

1

2

bits

1

TGC

2

C

3

TA

4

C

5

G

6

C

7

TGA

8

A

9

5'

0

1

2

bits

1

UCA

2

U

3

UCA

4

G

5

GCA

6

U

7

G

8

U

9

3'UTR

0

1

2

bits

1 2

C

3

G

4

T

5

TCA

6

GA

7

CA

8

A

9

5'

0

1

2

bits

1

GCA

2

A

3

C

4

G

5

T

6

TC

7

TGC

8

GA

9

TGC

5'

0

1

2

bits

1

GCA

2

A

3

GA

4

TA

5

A

6

C

7

G

8

CA

9

GCA

5'

0

1

2

bits

1 2

C

3

G

4

TC

5

A

6

G

7

T

8

TC

9

5'

0

1

2

bits

1

GCA

2

C

3

G

4

TA

5

GC

6

A

7

C

8

G

9

5'

0

1

2

bits

1 2

C

3

G

4

C

5

G

6

T

7

T

8

A

9

5'

0

1

2

bits

1 2

C3

A4

T

5

C

6

TA

7

GC

8

A

9

5'

0

1

2bits

1

TCA

2CA

3T

4

C5

T6

GC

7

C8

A9

TGC

5'

0

1

2

bits

1

TCA

2

TC

3

C4

C5

C6

C7

C8

A9

5'

0

1

2

bits

1 2

A

3

T

4

G

5

A

6

GC

7

A

8

G

9

TCA

5'

0

1

2

bits

1 2

A

3

T

4

TA

5

T

6

G

7

C

8

A

9

5'

0

1

2

bits

1 2

A

3

T

4

C

5

T

6

GA

7

T

8

G

9

5'

0.01(Pos)

0.0

(Neg)0.01

0.0

Figure S12

Page 28: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

28

Gen

es

+4.2 +3.9 +2.5 +2.0 +1.1 +0.5 -0.3 -1.4 -1.9 -2.2 -2.7 -3.5

0 1 1 0 1 1 1 1 0 0 0 0 D

eple

tion

Enr

ichm

ent

Continuous Expression

Values

0 0 0 0 1 1 1 1 2 2 2 2

0 1 1 0 1 1 1 1 0 0 0 0 D

eple

tion

Enr

ichm

ent

Discerete Expression

Values Figure S13

Page 29: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

29

Gen

es

1 1 1 1 0 0 0 0

1 1 1 1 0 0 0 0

Sign

ifica

nt M

I Enr

ichm

ent 1

1 1 1 0 0 0 0

0 0 0 0 1 1 1 1

Sign

ifica

nt M

I Dep

letio

n

1 1 1 1 0 0 0 0

0 1 1 0 1 0 1 0

Insig

nific

ant M

I

0

1

2

bits

1

GCA

2

A

3

C

4

G

5

T

6

TC

7

TGC

8

GA

9

TGC

Figure S14

Page 30: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

30

Bladder CarcinomaNormal

2

-2

70 58 49 67 74 18 38 54 80 53 14 19 31 75 12 59 5 3 62 22 28 81 7 1 15 43 2 48 21 66 23 32 8 26 45 29 42 77 44 82 79 33 73 71 50 0 16 51 39 11 6 60 69 47 78 57 56 46 55 25 34 52 30 35 41 64 68 10 37 63 27 20 72 36 65 17 9 24 13 4 40 61 76

vesicle-mediated transport, GO:0016192protein folding, GO:0006457electron transport, GO:0006118proteasome complex (sensu Eukaryota), GO:0000502cofactor metabolic process, GO:0051186mRNA processing, GO:0006397I-kappaB kinase/NF-kappaB cascade, GO:0007249fatty acid metabolic process, GO:0006631mitotic cell cycle, GO:0000278cytoskeleton organization and biogenesis, GO:0007010phosphoinositide-mediated signaling, GO:0048015cell-cell adhesion, GO:0016337Rho protein signal transduction, GO:0007266negative regulation of apoptosis, GO:0043066chromatin assembly, GO:0031497cell junction, GO:0030054DNA replication, GO:0006260DNA repair, GO:0006281response to wounding, GO:0009611proteinaceous extracellular matrix, GO:0005578humoral immune response, GO:0006959potassium ion transport, GO:0006813synaptic transmission, GO:0007268sodium ion transport, GO:0006814

3

Over-

repres

entat

ion

-3

Unde

r-rep

resen

tation

Figure S15

Page 31: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

31

Bladde

r Carc

inoma

Norm

al2

-2

C70

C58

C49

C67

C74

C18

C38

C54

C80

C53

C14

C19

C31

C75

C12

C59

C5 C3 C62

C22

C28

C81

C7 C1 C15

C43

C2 C48

C21

C66

C23

C32

C8 C26

C45

C29

C42

C77

C44

C82

C79

C33

C73

C71

C50

C0 C16

C51

C39

C11

C6 C60

C69

C47

C78

C57

C56

C46

C55

C25

C34

C52

C30

C35

C41

C64

C68

C10

C37

C63

C27

C20

C72

C36

C65

C17

C9 C24

C13

C4 C40

C61

C76

optimized motif

location

MI (bits)

z-score

robustness

position bias

orientation bias

conservation index

seed

motif name

0

1

2

bits

1 2

C

3

C

4

G

5

G

6

A

7

A

8

GA

9

5’ 0.029 28.2 10/10 Y - 1.00 CCGGAAG Elk-1

0

1

2

bits

1 2

C

3

C

4

C

5

G

6

C

7

C

8

C

9

5’ 0.028 26.9 10/10 - - 1.00 CCCGCCC Sp1

0

1

2

bits

1 2

C

3

A

4

T

5

GA

6

TGC

7

C

8

G

9

GC

5’ 0.013 8.1 6/10 - 0.97 CATGGCG p53

0

1

2

bits

1

TGA

2

C

3

G

4

T

5

TC

6

A

7 8

TG

9

5’ 0.018 14.9 10/10 - - 0.99 CGTCACG CRE-BP1

0

1

2

bits

1

TA

2

C

3

T

4

TCA

5

GA

6

C

7

G

8

A

9

5’ 0.014 9.9 10/10 - - 1.00 CTCGCGA -

0

1

2

bits

1 2

G

3

U

4

U

5 6

UC

7

GC

8

U

9

3’UTR 0.014 10.2 10/10 - 0.91 GUUAUGU -

0

1

2

bits

1

TGC

2

C

3

G

4

T

5

A

6

C

7

GCA

8 9

5’ 0.012 7.6 8/10 - 0.75 CGTACCC -

0

1

2

bits

1

TCA

2

A

3

A

4

T

5

GCA

6

G

7

C

8

G

9

GA 5’ 0.012 7.1 5/10 - 0.90 AATGGCG -

0

1

2

bits

1 2

A

3

C

4

TG

5

A

6

C

7

A

8 9

5’ 0.011 7.1 6/10 - 0.97 ACTACAT -

0

1

2

bits

1

GCA

2

TC

3

A

4

T

5

GC

6

C

7

G

8

C

9

5’ 0.014 10.5 10/10 - 1.00 CATGCGC -

0

1

2

bits

1 2

U

3

UGA

4

UA

5

G

6

UGA

7

A

8

C

9

UCA

3’UTR 0.017 13.6 10/10 - 0.99 UUUGUAC -

0

1

2

bits

1 2

A

3

GA

4

T

5

T

6 7

C

8

G

9

5’ 0.013 9.0 9/10 - 0.80 AGTTCCG STAT3

0

1

2

bits

1

TC

2

A

3

A

4

C

5

TC

6

TGA

7

C

8

G

9

5’ 0.012 7.5 7/10 - 0.73 AACCTCG MEF-2

0

1

2

bits

1

TCA

2

C

3

G

4

T

5

A

6

GA

7

TCA

8

CA

9

5’ 0.014 10.2 10/10 - 0.93 CGTAGTC Freac-2

0

1

2

bits

1

TGA

2

C3

G4

TC

5

A6

A7

TG

8

GC

9

TGC

5’ 0.014 9.9 8/10 - 0.94 CGCAAGC HLF

0

1

2

bits

1 2 3

C4

G5

A6

A7

CA

8

A9

5’ 0.013 9.8 9/10 - 0.96 CCGAAAA E2F

0

1

2

bits

1 2

GCA

3

U4

G5

U6

A7

GC

8

U9

3’UTR 0.015 11.1 10/10 - 0.97 AUGUACU -

0

1

2

bits

1 2

C

3

GC

4

A

5

A

6

T

7

CA

8

GA

9

GCA

5’ 0.011 6.9 5/10 Y - 1.00 CCAATCA NF-Y

0

1

2

bits

1

GC

2

A

3

C

4

C

5

TA

6

G

7

G

8

G

9

TGA

5’ 0.014 10.4 10/10 - 0.94 ACCTGGG NRSF

0

1

2

bits

1

GCA

2

C

3

A

4

TC

5

C

6

C

7

C

8

A

9

GCA

5’ 0.015 11.1 10/10 - 0.97 CACCCCA SREBP-1

0

1

2

bits

1 2

A

3

G

4

C

5

A

6

G

7

GA

8

G

9

GCA

5’ 0.013 8.9 8/10 - - 0.88 AGCAGGG HEN1

0

1

2

bits

1

TGC

2

CA

3

C

4

C

5

T

6

G

7

T

8

C

9

TGC

5’ 0.013 8.7 8/10 - 0.67 CCCTGTC Sn

0

1

2

bits

1 2

T

3

G

4

TG

5

GA

6

C

7

A

8

C

9

CA 5’ 0.013 8.7 8/10 - 0.90 TGGACAC bZIP911

0

1

2

bits

1

GCA

2

G

3

C

4

UG

5

GC

6

A

7

GC

8

A

9

GCA

3’UTR 0.013 8.8 10/10 - 0.82 GCUGAGA -

0

1

2

bits

1

GCA

2

UA

3

UA

4

C

5

GC

6

GCA

7

A

8

GC

9

GCA

3’UTR 0.013 8.7 9/10 - 0.61 UUCCCAC -

0

1

2

bits

1 2

C

3

T

4

G

5

CA

6

C

7

C

8

A

9

5’ 0.011 7.4 8/10 - - 0.95 CTGCCCA SRF

0

1

2

bits

1

TA

2

T

3

A

4

A

5

C

6 7

T

8

C

9

TGA

5’ 0.012 7.4 6/10 - - 0.71 TAACTTC CRE-BP1_c-Jun

0

1

2

bits

1 2

A

3

GC

4

G

5

GA

6

A

7

T

8

A

9

5’ 0.012 8.1 6/10 - - 0.23 AGGAATA SEF-1

0

1

2

bits

1 2

C

3

A

4

A

5

G

6

C

7

TA

8

A

9

5’ 0.011 6.5 5/10 - 0.06 CAAGCAA MIF-1

0

1

2

bits

1

TGA

2

C

3

TC

4

T

5

T

6

G

7

CA

8

A

9

TGC

5’ 0.011 6.4 6/10 - - 0.32 CCTTGCA HNF-4

0

1

2

bits

1

GC

2

GCA

3

GC

4

G

5

A

6 7

U

8

C

9

3’UTR 0.015 10.5 10/10 - 0.76 GGGACUC -

0

1

2

bits

1 2

GC

3

UC

4

A

5

A

6 7

U

8

G

9

3’UTR 0.013 9.7 10/10 - 0.94 GUAAAUG -

0

1

2

bits

1 2

U

3

UGA

4

UCA

5

G

6

C

7

UG

8

UCA

9

UCA

3’UTR 0.013 9.5 8/10 - 0.97 UUUGCUU -

0

1

2

bits

1

TGA

2

GA

3

T

4

T

5

C

6

A

7

T

8

G

9

TCA

5’ 0.013 8.2 7/10 - - 0.99 ATTCATG HSF

0

1

2

bits

1

TC

2

T

3

A

4

A

5

GC

6

A

7

C

8

A

9

5’ 0.012 7.0 7/10 - - 0.79 TAAGACA -

0

1

2

bits

1

TA

2

TGA

3

G

4

CA

5

GC

6

A

7

G

8

A

9

TA 5’ 0.012 8.0 6/10 - 0.68 AGAGAGA E47

0

1

2

bits

1

TCA

2

T

3

TA

4

C

5

T

6

A

7

G

8

TCA

9

GA 5’ 0.012 7.4 5/10 - 0.79 TACTAGA HSF

10

noitatneserper-revO

-10

noitatne serper-rednU

Figure S16

Page 32: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

32

012 bits

12C 3A 4A

5G 6C 7TA

8A 9MIF-1

012 bits

1TA

2T 3A 4A

5C 67T

8C 9TGACRE-BP1_c-Jun

012 bits

1

TCA

2T 3TA 4C 5T 6A

7G 8TCA 9

GA-

012 bits

1

TGA

2C 3TC 4T 5T 6G

7

CA

8A 9TGC COUP-TF___HNF-4

012 bits

1

TGA

2

GA

3T 4T 5C

6A 7T 8G

9

TCA

HSF

012 bits

1TA

2

TGA

3G 4CA 5

GC

6A 7G 8A

9TATh1_E47

012 bits

12A 3G 4C

5A 6G 7GA

8G 9GCAHEN1

012 bits

1

GCA

2C 3A 4TC

5C 6C 7C

8A 9GCA SREBP-1

012 bits

12T 3G 4TG

5

GA

6C 7A 8C

9

CAbZIP911

012 bits

1

TGC

2

CA

3C 4C 5T

6G 7T 8C

9

TGC

Sn

012 bits

12C 3T 4G

5

CA

6C 7C 8A

9

SRF

012 bits

1

GC

2A 3C 4C

5TA

6G 7G 8G

9

TGA

NRSF

012 bits

1TA

2C 3T 4TCA

5

GA

6C 7G 8A

9

-

012 bits

12A 3GC 4G 5GA 6A 7T 8A

9

SEF-1

012 bits

1

TCA

2A 3A 4T

5

GCA

6G 7C 8G

9

GA-

012 bits

12C 3GC 4A 5A 6T

7

CA

8

GA

9

GCA

NF-Y

012 bits

1

TCA

2C 3G 4T

5A 6GA 7

TCA

8

CA

9

Freac-2

012 bits

12A 3C 4TG

5A 6C 7A

89

-

012 bits

123C 4G 5A

6A 7CA 8A 9E2F

012 bits

1

GCA

2TC

3A 4T 5GC

6C 7G 8C

9

-

012 bits

12C 3C 4C

5G 6C 7C

8C 9

Sp1

012 bits

1

TGA

2C 3G 4TC

5A 6A 7TG

8

GC

9

TGC

HLF

012 bits

12A 3GA 4T 5T 67C 8G 9

STAT3

012 bits

12C 3A 4T

5

GA

6

TGC

7C 8G 9GCp53

012 bits

1

TGA

2C 3G 4T

5

TC

6A 78TG

9

CRE-BP1

012 bits

1

TGC

2C 3G 4T

5A 6C 7GCA

89

-

012 bits

12C 3C 4G

5G 6A 7A

8

GA

9

Elk-1

cofactor metabolic process GO:0051186vesicle-mediated transport GO:0016192cytoskeleton organization and biogenesis GO:0007010mitotic cell cycle GO:0000278chromatin assembly GO:0031497protein folding GO:0006457DNA repair GO:0006281proteasome complex (sensu Eukaryota) GO:0000502DNA replication GO:0006260mRNA processing GO:0006397fatty acid metabolic process GO:0006631cell junction GO:0030054potassium ion transport GO:0006813sodium ion transport GO:0006814proteinaceous extracellular matrix GO:0005578response to wounding GO:0009611cell-cell adhesion GO:0016337humoral immune response GO:0006959

3Pos

0

Neg-3

Figure S17

Page 33: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

33

1.00

-1.00

PENG_LEUCINE_DNMOREAUX_TACI_HI_VS_LOW_DNMRNA_PROCESSING_REACTOMESASAKI_TCELL_LYMPHOMA_VS_CD4_UPTRANSLATION_FACTORSSTEMCELL_COMMON_UPCMV_24HRS_UPTARTE_PCHSC_INTERMEDIATEPROGENITORS_SHAREDSMITH_HTERT_UPHCC_SURVIVAL_GOOD_VS_POOR_DNTSA_HEPATOMA_UPMETASTASIS_ADENOCARC_UPHYPOXIA_RCC_UPBRCA_PROGNOSIS_NEGHYPOXIA_NORMAL_UPCROMER_HYPOPHARYNGEAL_MET_VS_NON_DNAGED_MOUSE_HYPOTH_DNWERNER_FIBRO_DNET743_SARCOMA_72HRS_UPWANG_MLL_CBP_VS_GMP_DNET743_SARCOMA_48HRS_DNTNFALPHA_ALL_UPUVB_SCC_DNFALT_BCLL_DNUVB_NHEK3_C0MOOTHA_VOXPHOSAGUIRRE_PANCREAS_CHR12NOUZOVA_CPG_H4_UPLI_FETAL_VS_WT_KIDNEY_UPAGUIRRE_PANCREAS_CHR8DNA_DAMAGE_SIGNALINGUBIQUITIN_MEDIATED_PROTEOLYSISCARIES_PULP_UPCHANG_SERUM_RESPONSE_DNALCALAY_AML_NPMC_UPBETA_ALANINE_METABOLISMMEF2DPATHWAYCELL_ADHESIONKANG_TERT_DNGH_EXOGENOUS_MIDDLE_UPGH_EXOGENOUS_LATE_UPCYTOKINEPATHWAYGPCRDB_CLASS_A_RHODOPSIN_LIKEMOREAUX_TACI_HI_IN_BMPCLEE_ACOX1_DNIGF_VS_PDGF_UPRORIE_ES_PNET_DN

4

Over-representation

-4

Under-representation

Figure S18

Page 34: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

34

012 bits

12C 3G 4C

5G 6T 7T

8A 9StuAp

012 bits

12C 3A 4T

5C 6TA 7

GC

8A 9Tal-1beta_E47

012 bits

1

TCA

2TC

3C 4C 5C

6C 7C 8A

9

ADR1

012 bits

12A 3T 4TA

5T 6G 7C

8A 9Oct-1

012 bits

12A 3T 4G

5A 6GC 7A 8G 9TCASkn-1

012 bits

1

TCA

2

CA

3T 4C 5T

6

GC

7C 8A 9TGC-

012 bits

12A 3T 4C

5T 6GA 7T 8G 9

SEF-1

012 bits

1

TCA

2T 3TCA 4G 5GCA 6T 7G 8T

9

-

012 bits

1

GCA

2C 3G 4TA

5

GC

6A 7C 8G

9

Tax_CREB

012 bits

12C 3G 4TC

5A 6G 7T

8TC

9

-

012 bits

1

TGC

2C 3TA 4C 5G 6C

7

TGA

8A 9AhR

012 bits

12C 3G 4T

5

TCA

6

GA

7

CA

8A 9O2

012 bits

1

GCA

2A 3GA 4TA

5A 6C 7G

8

CA

9

GCA

-

012 bits

12

GC

3C 4G 5G

6TA

7A 8G 9

NRF-2

012 bits

1

GCA

2A 3C 4G

5T 6TC 7

TGC

8

GA

9

TGC

PHO4

012 bits

12C 3G 4CA

5

TC

6A 7A 8TA

9

GCA

E2F

DNA_DAMAGE_SIGNALING

UVB_SCC_DN

SASAKI_TCELL_LYMPHOMA_VS_CD4_UP

SMITH_HTERT_UP

BRCA_PROGNOSIS_NEG

AGED_MOUSE_HYPOTH_DN

CMV_24HRS_UP

MOOTHA_VOXPHOS

ET743_SARCOMA_72HRS_UP

ET743_SARCOMA_48HRS_DN

AGUIRRE_PANCREAS_CHR8

UVB_NHEK3_C0

HYPOXIA_NORMAL_UP

HCC_SURVIVAL_GOOD_VS_POOR_DN

HSC_INTERMEDIATEPROGENITORS_SHARED

WERNER_FIBRO_DN

TRANSLATION_FACTORS

MRNA_PROCESSING_REACTOME

PENG_LEUCINE_DN

STEMCELL_COMMON_UP

NOUZOVA_CPG_H4_UP

KANG_TERT_DN

GH_EXOGENOUS_LATE_UP

CYTOKINEPATHWAY

LI_FETAL_VS_WT_KIDNEY_UP

CELL_ADHESION

CARIES_PULP_UP

IGF_VS_PDGF_UP

LEE_ACOX1_DN

GPCRDB_CLASS_A_RHODOPSIN_LIKE

3Pos

0

Neg-3

Figure S19

Page 35: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

35

CO

ID L

ung

Bha

ttach

arje

e et

al

MM

Mye

lom

a Z

han

et a

lSQ

Lun

g B

hatta

char

jee

et a

lA

D L

ung

Bee

r et a

lA

D L

ung

Stea

rman

et a

lA

D L

ung

Bha

ttach

arje

e et

al

PDC

Pan

crea

s Is

hika

wa

et a

lC

A B

reas

t Sor

lie e

t al

CA

Col

on G

raud

ens

et a

lC

A B

ladd

er D

yrsk

jot e

t al

MC

A B

reas

t Rad

vany

i et a

lSM

CL

Lung

Bha

ttach

arje

e et

al

MPC

Pro

stat

e D

hana

seka

ran

et a

lC

CC

Ova

rian

Hen

drix

et a

lC

A B

reas

t Ric

hard

son

et a

lM

UC

Ova

rian

Hen

drix

et a

lM

L M

elan

oma

Tal

anto

v et

al

FL L

ymph

oma

Aliz

adeh

et a

lT

U P

rost

ate

Lapo

inte

et a

lSR

S O

varia

n H

endr

ix e

t al

EN

D O

varia

n H

endr

ix e

t al

AD

Pan

crea

s Lo

gsdo

n et

al

MPM

Mes

othe

liom

a G

ordo

n et

al

DLB

CL

Lym

phom

a A

lizad

eh e

t al

ILC

Bre

ast R

adva

nyi e

t al

IDC

Bre

ast R

adva

nyi e

t al

OD

Bra

in B

rede

l et a

lO

DG

L B

rain

Sun

et a

lA

C B

rain

Sun

et a

lG

LB B

rain

Sun

et a

lG

L B

rain

Bre

del e

t al

GB

M B

rain

Lia

ng e

t al

GL

Bra

in R

ickm

an e

t al

AO

Bra

in B

rede

l et a

lPP

C P

rost

ate

Dha

nase

kara

n et

al

B-C

LL L

euke

mia

Has

linge

r et a

lR

CC

C R

enal

Len

burg

et a

lB

PH P

rost

ate

Dha

nase

kara

n et

al

ME

Mel

anom

a H

oek

et a

lC

LL L

ymph

oma

Aliz

adeh

et a

lC

A R

enal

Hig

gins

et a

lR

CC

C R

enal

Boe

r et a

lH

SCC

Hea

d-N

eck

Cro

mer

et a

lH

SCC

Hea

d-N

eck

Chu

ng e

t al

AD

Ova

rian

Wel

sh e

t al

GC

T S

emin

oma

Kor

kola

et a

l

COID Lung Bhattacharjee et alMM Myeloma Zhan et alSQ Lung Bhattacharjee et alAD Lung Beer et alAD Lung Stearman et alAD Lung Bhattacharjee et alPDC Pancreas Ishikawa et alCA Breast Sorlie et alCA Colon Graudens et alCA Bladder Dyrskjot et alMCA Breast Radvanyi et alSMCL Lung Bhattacharjee et alMPC Prostate Dhanasekaran et alCCC Ovarian Hendrix et alCA Breast Richardson et alMUC Ovarian Hendrix et alML Melanoma Talantov et alFL Lymphoma Alizadeh et alTU Prostate Lapointe et alSRS Ovarian Hendrix et alEND Ovarian Hendrix et alAD Pancreas Logsdon et alMPM Mesothelioma Gordon et alDLBCL Lymphoma Alizadeh et alILC Breast Radvanyi et alIDC Breast Radvanyi et alOD Brain Bredel et alODGL Brain Sun et alAC Brain Sun et alGLB Brain Sun et alGL Brain Bredel et alGBM Brain Liang et alGL Brain Rickman et alAO Brain Bredel et alPPC Prostate Dhanasekaran et alB-CLL Leukemia Haslinger et alRCCC Renal Lenburg et alBPH Prostate Dhanasekaran et alME Melanoma Hoek et alCLL Lymphoma Alizadeh et alCA Renal Higgins et alRCCC Renal Boer et alHSCC Head-Neck Cromer et alHSCC Head-Neck Chung et alAD Ovarian Welsh et alGCT Seminoma Korkola et al

1Pos

0

Neg-1

Figure S20

Page 36: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

36

BPH

Pro

stat

e D

hana

seka

ran

et a

lPP

C P

rost

ate

Dha

nase

kara

n et

al

HSC

C H

ead-

Nec

k C

rom

er e

t al

B-C

LL L

euke

mia

Has

linge

r et a

lH

SCC

Hea

d-N

eck

Chu

ng e

t al

ME

Mel

anom

a H

oek

et a

lC

LL L

ymph

oma

Aliz

adeh

et a

lR

CC

C R

enal

Len

burg

et a

lR

CC

C R

enal

Boe

r et a

lG

BM

Bra

in L

iang

et a

lC

A R

enal

Hig

gins

et a

lA

O B

rain

Bre

del e

t al

GC

T S

emin

oma

Kor

kola

et a

lA

D O

varia

n W

elsh

et a

lA

D P

ancr

eas

Logs

don

et a

lM

M M

yelo

ma

Zha

n et

al

SQ L

ung

Bha

ttach

arje

e et

al

CO

ID L

ung

Bha

ttach

arje

e et

al

PDC

Pan

crea

s Is

hika

wa

et a

lFL

Lym

phom

a A

lizad

eh e

t al

MPC

Pro

stat

e D

hana

seka

ran

et a

lD

LBC

L Ly

mph

oma

Aliz

adeh

et a

lC

A B

reas

t Sor

lie e

t al

MC

A B

reas

t Rad

vany

i et a

lIL

C B

reas

t Rad

vany

i et a

lID

C B

reas

t Rad

vany

i et a

lE

ND

Ova

rian

Hen

drix

et a

lC

CC

Ova

rian

Hen

drix

et a

lSR

S O

varia

n H

endr

ix e

t al

MU

C O

varia

n H

endr

ix e

t al

ML

Mel

anom

a T

alan

tov

et a

lC

A B

reas

t Ric

hard

son

et a

lT

U P

rost

ate

Lapo

inte

et a

lM

PM M

esot

helio

ma

Gor

don

et a

lA

D L

ung

Bee

r et a

lSM

CL

Lung

Bha

ttach

arje

e et

al

AD

Lun

g St

earm

an e

t al

AD

Lun

g B

hatta

char

jee

et a

lG

L B

rain

Ric

kman

et a

lC

A B

ladd

er D

yrsk

jot e

t al

CA

Col

on G

raud

ens

et a

lO

D B

rain

Bre

del e

t al

GL

Bra

in B

rede

l et a

lG

LB B

rain

Sun

et a

lO

DG

L B

rain

Sun

et a

lA

C B

rain

Sun

et a

l

LEE_MYC_TGFA_DNYAO_P4_KO_VS_WT_UPMAMMARY_DEV_UPADIP_HUMAN_UPNADLER_OBESITY_DNLE_MYELIN_DNADIP_DIFF_CLUSTER3RECKPATHWAYGH_HYPOPHYSECTOMY_RAT_DNADIPOGENESIS_HMSC_CLASS3_UPHDACI_COLON_CLUSTER9PARP_KO_UPSMITH_HTERT_DNZHANG_EFT_EWSFLI1_UPCMV_HCMV_TIMECOURSE_24HRS_DNFERNANDEZ_MYC_TARGETSTSP1PATHWAYCIS_XPC_DNADIP_VS_FIBRO_UPADIP_VS_PREADIP_UPAGED_MOUSE_NEOCORTEX_UPH2O2_CSBDIFF_C1AGED_MOUSE_MUSCLE_DNNUCLEOTIDE_SUGARS_METABOLISMNICOTINATE_AND_NICOTINAMIDE_METABOLISMPHENYLALANINE_TYROSINE_AND_TRYPTOPHAN_BIOSYNTHESISBRENTANI_ANGIOGENESISST_FAS_SIGNALING_PATHWAYTSA_HEPATOMA_CANCER_DNBLEO_MOUSE_LYMPH_LOW_24HRS_DNIGF_VS_PDGF_DNMKK6EE_UPKENNY_WNT_DNPASSERINI_IMMUNEDAC_FIBRO_UPH2O2_CSBRESCUED_UPFLECHNER_KIDNEY_TRANSPLANT_WELL_PBL_UPBRG1_SW13_UPBUTANOATE_METABOLISMCIS_RESIST_LUNG_DNTNFALPHA_ADIP_DNUVB_NHEK1_C4ATP_SYNTHESISFLAGELLAR_ASSEMBLYETCPATHWAYADIP_DIFF_UPINSULIN_SIGNALINGHDACI_COLON_CUR2HRS_UPZHAN_MM_CD138_HP_VS_RESTZHAN_MULTIPLE_MYELOMA_VS_NORMAL_UPAGED_RHESUS_DNPARK_MSCS_LIN2ZHAN_MMPC_SIM_BC_AND_MMIDX_TSA_UP_CLUSTER2VEGFPATHWAYDORSAM_HOXA9_UPYAMA_RECURRENT_HCC_UPAPOPTOSISHDACI_COLON_CURSUL_UPDEATHPATHWAYABBUD_LIF_UPRADMACHER_AMLNORMALKARYTYPE_SIGHIVNEFPATHWAYTCRAPATHWAYTHELPERPATHWAYPENG_GLUCOSE_UPNKTPATHWAYIFNALPHA_RESIST_DNNFKBPATHWAYTNFR2PATHWAYGSK3PATHWAYAPOPTOSIS_GENMAPPYANG_OSTECLASTS_SIGFALT_BCLL_DNSHIPP_FL_VS_DLBCL_UPPTDINSPATHWAYHOGERKORP_ANTI_CD44_DNST_G_ALPHA_S_PATHWAYHOGERKORP_CD44_TEMPORALLY_DNHDACI_COLON_SUL12HRS_DNBRACX_UPHDACI_COLON_CUR_DNRABPATHWAYHDACI_COLON_BUT12HRS_UPSPRYPATHWAYWERNER_FIBRO_DNOLDWERNER_FIBRO_DNOLD_FIBRO_DNAGUIRRE_PANCREAS_CHR22MAPKPATHWAYP27PATHWAYGH_GHRHR_KO_6HRS_UPRADIATION_SENSITIVITYHOFFMANN_BIVSBII_LGBIICMV_HCMV_TIMECOURSE_18HRS_UPLEE_ACOX1_DNCMV_HCMV_TIMECOURSE_16HRS_UPARGININE_AND_PROLINE_METABOLISMPHOSPHATIDYLINOSITOL_SIGNALING_SYSTEMMANALO_HYPOXIA_UPSMITH_HTERT_UPBRCA1_OVEREXP_PROSTATE_UPPEPTIDE_GPCRSGPCRS_CLASS_A_RHODOPSIN_LIKEGPCRDB_CLASS_A_RHODOPSIN_LIKEALCALAY_AML_NPMC_UPZHAN_MULTIPLE_MYELOMA_SUBCLASSES_DIFFHDACI_COLON_CUR48HRS_UPPYRIMIDINE_METABOLISMSANSOM_APC_LOSS5_UPIL6_SCAR_FIBRO_UPIGF1_NIH3T3_UPCELL_ADHESIONFETAL_LIVER_VS_ADULT_LIVER_GNF2GH_EXOGENOUS_MIDDLE_UPGPCRDB_OTHERMONOAMINE_GPCRSMETASTASIS_ADENOCARC_DNCMV_HCMV_TIMECOURSE_6HRS_DNDRUG_RESISTANCE_AND_METABOLISMTCYTOTOXICPATHWAYMONOCYTEPATHWAYLYMPHOCYTEPATHWAYGOLDRATH_CYTOLYTICBLYMPHOCYTEPATHWAYPROSTAGLANDIN_SYNTHESIS_REGULATIONNEUTROPHILPATHWAYIDX_TSA_DN_CLUSTER3MOREAUX_TACI_HI_IN_BMPCSTRESS_TPA_SPECIFIC_UPCOCAINE_BRAIN_4WKS_UPCREB_BRAIN_8WKS_DNFATTY_ACID_METABOLISMTRYPTOPHAN_METABOLISMHDACI_COLON_SUL_UPUVC_HIGH_D3_DNSTOSSI_ER_UPDSRNA_UPFLECHNER_KIDNEY_TRANSPLANT_WELL_PBL_DNROSS_CBF_MYHGNATENKO_PLATELET_UPGNATENKO_PLATELETFALT_BCLL_IG_MUTATED_VS_WT_UPPLATELET_EXPRESSEDUVC_TTD-XPCS_COMMON_DNWELCSH_BRCA_UPYAGI_AML_PROGNOSISLEE_TCELLS1_UPLEE_TCELLS8_UPLEE_TCELLS10_UPNAB_LUNG_DNUVC_HIGH_D4_DNSIG_CD40PATHWAYMAPUVC_XPCS_ALL_UPUVC_XPCS_8HR_UPDAVIES_MGUS_MMJISON_SICKLE_CELLHDACI_COLON_CUR24HRS_UPZHAN_MULTIPLE_MYELOMA_VS_NORMAL_DNEGFPATHWAYBRENTANI_TRANSPORT_OF_VESICLESPDGFPATHWAYPASSERINI_APOPTOSISBYSTROM_IL5_DNSIG_IL4RECEPTOR_IN_B_LYPHOCYTESIGF1PATHWAYTENEDINI_MEGAKARYOCYTIC_GENESTPOPATHWAYZHAN_MMPC_SIMALDAVIES_NIDX_TSA_UP_CLUSTER1HYPOXIA_REG_UPAS3_FIBRO_DNTNFALPHA_4HRS_UPHUMAN_CD34_ENRICHED_TRANSCRIPTION_FACTORSWALKER_MM_SNP_DIFFCMV_UV-CMV_COMMON_HCMV_6HRS_DNUEDA_MOUSE_LIVERJNK_UPLIAN_MYELOID_DIFF_TFROSS_CBF_LEUKEMIAUVB_NHEK1_C2UVB_NHEK2_DNCMV_HCMV_TIMECOURSE_20HRS_DNMYOD_NIH3T3_DNNGUYEN_KERATO_DNZHAN_MM_MOLECULAR_CLASSI_UPZHAN_MM_CD138_MF_VS_RESTUVC_HIGH_D2_DNINFLAMMATORY_RESPONSE_PATHWAYHDACI_COLON_SUL16HRS_UPDAC_CGI_BLADDER_UPMIDDLEAGE_UPCMV_HCMV_TIMECOURSE_8HRS_DNSARCOMAS_LIPOSARCOMA_DNTGFBETA_C4_UPSLRPPATHWAYCOLLER_MYC_DNTPA_SKIN_UPDSRNA_DNCELL_GROWTH_AND_OR_MAINTENANCETSA_PANC50_UPSHEPARD_NEG_REG_OF_CELL_PROLIFERATIONDORSEY_DOXYCYCLINE_UPCARIES_PULP_DNKANG_TERT_UPCMV-UV_HCMV_6HRS_DNLIN_WNT_UPLEE_MYC_E2F1_UPHYPOPHYSECTOMY_RAT_DNPARK_MSCS_BOTHINTEGRINPATHWAYGUO_HEX_DNBREASTCA_THREE_CLASSESCALRES_MOUSE_NEOCORTEX_DNINNEREAR_UPCALCINEURIN_NF_AT_SIGNALINGST_INTEGRIN_SIGNALING_PATHWAYCCR5PATHWAYGOLUB_ALL_VS_AML_UPCROONQUIST_IL6_RAS_UPZHAN_TONSIL_BONEMARROWRIBAVIRIN_RSV_DNUVB_NHEK3_C8FLECHNER_KIDNEY_TRANSPLANT_REJECTION_PBL_UPINTEGRIN_MEDIATED_CELL_ADHESION_KEGGSTRESS_ARSENIC_SPECIFIC_UPG2PATHWAYTFF2_KO_UPCDC25PATHWAYSA_G2_AND_M_PHASESET743_RESIST_UPSRCRPTPPATHWAYCELLCYCLEPATHWAYPTC1PATHWAYPOMEROY_MD_TREATMENT_GOOD_VS_POOR_DNIFN_BETA_GLIOMA_DNROTH_HTERT_UPSMITH_HCV_INDUCED_HCC_UPCELL_MOTILITYBECKER_TAMOXIFEN_RESISTANT_UPIFN_ALL_UPNF90_UPIFNALPHA_NL_HCC_UPGRANDVAUX_IFN_NOT_IRF3_UPCMV_HCMV_TIMECOURSE_14HRS_UPIL17PATHWAYBECKER_IFN_INDUCIBLE_SUBSET_1CTLPATHWAYCDC42RACPATHWAYCMV_HCMV_6HRS_UPBRENTANI_DNA_METHYLATION_AND_MODIFICATIONKLEIN_PEL_UPHOFMANN_MANTEL_LYMPHOMA_VS_LYMPH_NODES_UPCORDERO_KRAS_KD_VS_CONTROL_DNBRCA2_BRCA1_UPRIBAVIRIN_RSV_UPP21_MIDDLE_DNADIP_DIFF_CLUSTER1CORDERO_KRAS_KD_VS_CONTROL_UPCALRES_RHESUS_UPTGFBETA_EARLY_UPHIF1_TARGETSJECHLINGER_EMT_UPCROONQUIST_IL6_STROMA_UPBRCA_BRCA1_NEGVEGF_MMMEC_ALL_UPVEGF_MMMEC_6HRS_UPEMT_UPTGFBETA_ALL_UPAGEING_KIDNEY_SPECIFIC_UPCMV_ALL_DNCMV_24HRS_DNIL1_CORNEA_UPBRENTANI_CELL_ADHESIONKANG_TERT_DNSANA_TNFA_ENDOTHELIAL_UPLINDSTEDT_DEND_8H_VS_48H_UPHPV31_DNIFNALPHA_HCC_UPAGED_MOUSE_CEREBELLUM_UPDAC_BLADDER_UPSANA_IFNG_ENDOTHELIAL_UPIFN_GAMMA_UPCMV_HCMV_TIMECOURSE_12HRS_UPBENNETT_SLE_UPIFNALPHA_NL_UPCMV_8HRS_UPIFN_ALPHA_UPDER_IFNG_UPIFN_BETA_UPIFN_ANY_UPDER_IFNA_UPTAKEDA_NUP8_HOXA9_3D_UPTAKEDA_NUP8_HOXA9_10D_UPTAKEDA_NUP8_HOXA9_8D_UPIFNA_UV-CMV_COMMON_HCMV_6HRS_UPIFNA_HCMV_6HRS_UPTAKEDA_NUP8_HOXA9_16D_UPDER_IFNB_UPRADAEVA_IFNA_UPLEE_MYC_E2F1_DNPASSERINI_GROWTHLEE_E2F1_DNGLYCEROLIPID_METABOLISMSTEMCELL_COMMON_DNCMV_HCMV_TIMECOURSE_48HRS_DNELONGINA_KO_UPAPPEL_IMATINIB_UPYU_CMYC_DNKLEIN_PEL_DNMYC_TARGETSZHAN_MMPC_EARLYVSZHAN_TONSIL_PCBCLEE_DENA_DNLEE_TCELLS4_UPIDX_TSA_DN_CLUSTER5AS3_FIBRO_C3AS3_FIBRO_UPMYOD_BRG1_UPHUMAN_TISSUE_LIVERSTRIATED_MUSCLE_CONTRACTIONMYOD_NIH3T3_UPIGF_VS_PDGF_UPRUTELLA_HEPATGFSNDCS_UPLAL_KO_6MO_UPLAL_KO_3MO_UPBRENTANI_TRANSCRIPTION_FACTORSUVB_NHEK1_UPCMV_ALL_UPCELL_ADHESION_MOLECULE_ACTIVITYUV-CMV_UNIQUE_HCMV_6HRS_UPCMV-UV_HCMV_6HRS_UPJECHLINGER_EMT_DNEMT_DNHTERT_UPIRS_KO_ADIP_DNPPARAPATHWAYUVC_LOW_ALL_DNNOUZOVA_CPG_H4_UPTPA_RESIST_EARLY_DNAGUIRRE_PANCREAS_CHR6IRS1_KO_ADIP_DNDORSAM_HOXA9_DNHDACI_COLON_BUT48HRS_UPUEDA_MOUSE_SCNADIP_VS_PREADIP_DNTZD_ADIP_DNADIP_VS_FIBRO_DNADIPOGENESIS_HMSC_CLASS8_DNSCHRAETS_MLL_UPFRASOR_ER_DNAD12_24HRS_DNBRG1_ALAB_UPPASSERINI_ADHESIONGILDEA_BLADDER_UPVEGF_MMMEC_3HRS_UPDAC_IFN_BLADDER_UPHTERT_DNPASSERINI_EMPOD1_KO_MOST_UPADIP_HUMAN_DNFETAL_LIVER_ENRICHED_TRANSCRIPTION_FACTORSTGFBETA_LATE_UPMMS_HUMAN_LYMPH_HIGH_24HRS_DNKENNY_WNT_UPXU_ATRA_PLUSNSC_DNJAIN_NEMO_DIFFHDACI_COLON_SUL24HRS_UPRORIE_ES_PNET_UPIDX_TSA_DN_CLUSTER4GRANDVAUX_IRF3_UPHEMATOP_STEM_ALL_UPBECKER_TAMOXIFEN_RESISTANT_DNMATRIX_METALLOPROTEINASESAGUIRRE_PANCREAS_CHR1FERRANDO_MLL_T_ALL_UPTOLLPATHWAYCELL_SURFACE_RECEPTOR_LINKED_SIGNAL_TRANSDUCTIONNAKAJIMA_MCSMBP_MASTBRENTANI_IMMUNE_FUNCTIONHADDAD_CD45CD7_PLUS_VS_MINUS_UPHADDAD_HSC_CD7_UPMENSE_HYPOXIA_UPPARK_RARALPHA_UPCMV_HCMV_6HRS_DNTAVOR_CEBP_UPHALMOS_CEBP_UPCHAUHAN_2ME2LEE_CIP_UPHALMOS_CEBP_DNCROMER_HYPOPHARYNGEAL_MET_VS_NON_UPLEE_DENA_UPIRITANI_ADPROX_DNIGLESIAS_E2FMINUS_UPIDX_TSA_DN_CLUSTER2IRITANI_ADPROX_VASCGN_CAMP_GRANULOSA_DNIFN_BETA_GLIOMA_UPHDACI_COLON_BUT_UPFERRARI_4HPR_UPTNFA_NFKB_DEP_UPGALE_FLT3ANDAPL_DNZHAN_MMPC_SIMLIAN_MYELOID_DIFF_RECEPTORSTSA_HEPATOMA_UPROSS_CBFLH_GRANULOSA_DNVEGF_HUVEC_30MIN_UPFSH_GRANULOSA_DNTPA_SENS_LATE_UPADDYA_K562_HEMIN_TREATMENTLVAD_HEARTFAILURE_DNUVB_NHEK3_C3ERM_KO_SERTOLI_DNET743_SARCOMA_6HRS_UPFRASOR_ER_UPPASSERINI_INFLAMMATIONBRCA1_MES_UPDNMT1_KO_UPKANNAN_P53_UPAD12_32HRS_DNWANG_HOXA9_VS_MEIS1_DNLIZUKA_L1_GR_G1BUT_TSA_UPTGFBETA_C5_UPSARCOMAS_SYNOVIAL_DNLIZUKA_L0_SM_L1ST_TUMOR_NECROSIS_FACTOR_PATHWAYGOLUB_ALL_VS_AML_DNTAKEDA_NUP8_HOXA9_3D_DNMETASTASIS_ADENOCARC_UPTAKEDA_NUP8_HOXA9_6H_DNMCALPAINPATHWAYXPB_TTD-CS_DNLEI_HOXC8_DNIL1RPATHWAYOXSTRESS_RPETHREE_DNNI2_MOUSE_DNWANG_HOXA9_VS_MEIS1_UPEXTRINSICPATHWAYIDX_TSA_DN_CLUSTER1AD12_48HRS_DNHYPOXIA_FIBRO_UPHDACI_COLON_SUL48HRS_UPET743_SARCOMA_72HRS_UPSERUM_FIBROBLAST_CORE_DNTPA_SENS_MIDDLE_UPET743_HELA_UPLEE_ACOX1_UPCHANG_SERUM_RESPONSE_DNSANA_TNFA_ENDOTHELIAL_DNLVAD_HEARTFAILURE_UPLI_FETAL_VS_WT_KIDNEY_UPLINDSTEDT_DEND_UPHYPOXIA_REVIEWAD12_ANY_DNSHEPARD_CRASH_AND_BURN_MUT_VS_WT_UPUVB_NHEK3_C1TNFALPHA_30MIN_UPTNFALPHA_ALL_UPADIP_DIFF_CLUSTER2NI2_MOUSE_UPGALINDO_ACT_UPKRETZSCHMAR_IL6_DIFFBROCKE_IL6LU_IL4BCELLGERY_CEBP_TARGETSKNUDSEN_PMNS_UPDIAB_NEPH_UPLINDSTEDT_DEND_8H_VS_48H_DNZUCCHI_EPITHELIAL_DNHINATA_NFKB_UPHOHENKIRK_MONOCYTE_DEND_UPROS_MOUSE_AORTA_DNBLEO_HUMAN_LYMPH_HIGH_24HRS_UPBASSO_HCL_DIFFROSS_MLL_FUSIONPARK_RARALPHA_MODNAKAJIMA_MCS_UPNADLER_OBESITY_UPHEARTFAILURE_VENTRICLE_DNTAKEDA_NUP8_HOXA9_10D_DNVERHAAK_AML_NPM1_MUT_VS_WT_UPNEMETH_TNF_UPHOHENKIRK_MONOCYTE_DEND_DNLINDSTEDT_DEND_DNWIELAND_HEPATITIS_B_INDUCEDBASSO_GERMINAL_CENTER_CD40_UPFLECHNER_KIDNEY_TRANSPLANT_REJECTION_UPCARIES_PULP_HIGH_UPCARIES_PULP_UPROSS_AML1_ETOIL6PATHWAYNTHIPATHWAYG_PROTEIN_SIGNALINGTYPE_III_SECRETION_SYSTEMPHOTOSYNTHESISHIPPOCAMPUS_DEVELOPMENT_POSTNATALCALCIUM_REGULATION_IN_CARDIAC_CELLSCREB_BRAIN_8WKS_UPGPCRPATHWAYPYK2PATHWAYPGC1APATHWAYDFOSB_BRAIN_2WKS_UPBCRPATHWAYBIOPEPTIDESPATHWAYNDKDYNAMINPATHWAYCCR3PATHWAYNOS1PATHWAYAGED_MOUSE_HYPOTH_DNCREB_BRAIN_2WKS_UPFMLPPATHWAYFCER1PATHWAYAT1RPATHWAYALZHEIMERS_INCIPIENT_DNAGEING_BRAIN_DNTCELL_ANERGIC_UPVIPPATHWAYCMV_HCMV_TIMECOURSE_20HRS_UPSMOOTH_MUSCLE_CONTRACTIONPENG_RAPAMYCIN_UPAGED_MOUSE_CORTEX_DNLEE_MYC_TGFA_UPFSH_GRANULOSA_UPLH_GRANULOSA_UPROSS_PML_RARTPA_SENS_EARLY_UPET743_SARCOMA_UPINOS_ALL_DNFALT_BCLL_IG_MUTATED_VS_WT_DNIDX_TSA_UP_CLUSTER6HYPOXIA_NORMAL_UPCALRES_RHESUS_DNST_DIFFERENTIATION_PATHWAY_IN_PC12_CELLSCOCAINE_BRAIN_5D_UPIDX_TSA_UP_CLUSTER5OXIDATIVE_PHOSPHORYLATIONMOOTHA_VOXPHOSELECTRON_TRANSPORT_CHAINKREBPATHWAYBRG1_ALAB_DNZHAN_MM_CD138_CD2_VS_RESTCHIARETTI_ZAP70_DIFFZHAN_MMPC_PCLAMB_CYCLIN_D3_GLOCUSABRAHAM_MM_VS_AL_DNCELL_CYCLE_CHECKPOINTCASPASEPATHWAYYEN_MYC_WTALANINE_AND_ASPARTATE_METABOLISMHUMAN_TISSUE_KIDNEYINOSITOL_METABOLISMLIZUKA_G1_SM_G2COLLER_MYC_UPOXSTRESS_RPETWO_DNSTRESS_GENOTOXIC_SPECIFIC_UPOXSTRESS_RPE_H2O2HNE_DNCALRES_MOUSE_UPHADDAD_CD45CD7_PLUS_VS_MINUS_DNHADDAD_HSC_CD7_DNSCHUMACHER_MYC_DNBRCA1_OVEREXP_PROSTATE_DNHDACI_COLON_BUT16HRS_DNCIRCADIAN_EXERCISEUVB_NHEK3_C5WILLERT_WNT_NCCIT_ALL_UPIDX_TSA_UP_CLUSTER4UVC_TTD_ALL_UPUVC_TTD_4HR_UPGH_AUTOCRINE_DNP21_P53_MIDDLE_DNHSC_INTERMEDIATEPROGENITORS_SHAREDVANTVEER_BREAST_OUTCOME_GOOD_VS_POOR_UPBRCA_PROGNOSIS_POSKREBS_TCA_CYCLECITRATE_CYCLE_TCA_CYCLETCAZHAN_MM_CD1_VS_CD2_DNHCC_SURVIVAL_GOOD_VS_POOR_UPGLYCINE_SERINE_AND_THREONINE_METABOLISMVALINE_LEUCINE_AND_ISOLEUCINE_DEGRADATIONPROPANOATE_METABOLISMAGEING_LYMPH_DNGLYOXYLATE_AND_DICARBOXYLATE_METABOLISMFOLATE_BIOSYNTHESISAGEING_KIDNEY_SPECIFIC_DNAGEING_KIDNEY_DNPOMEROY_DESMOPLASIC_VS_CLASSIC_MD_DNLYSINE_DEGRADATIONHUMAN_TISSUE_TESTISHDACI_COLON_TSA_DNTESTIS_EXPRESSED_GENESUVB_NHEK1_C6MOREAUX_TACI_HI_VS_LOW_DNSTEMCELL_COMMON_UPGALE_FLT3ANDAPL_UPRNA_TRANSCRIPTION_REACTOMEGOLDRATH_HPGH_GHRHR_KO_24HRS_DNUVB_SCC_DNFALT_BCLL_UPROTH_HTERT_DIFFBRCA1KO_MEF_DNWANG_MLL_CBP_VS_GMP_DNHYPOXIA_RCC_UPAGED_MOUSE_HYPOTH_UPLEE_MYC_UPIRITANI_ADPROX_UPCHEN_HOXA5_TARGETS_DNMMS_MOUSE_LYMPH_HIGH_4HRS_UPBYSTRYKH_HSC_BRAIN_CIS_GLOCUSVHL_RCC_UPHDACI_COLON_BUT48HRS_DNBRENTANI_SIGNALINGHDACI_COLON_BUT24HRS_DNPENG_LEUCINE_UPTGF_BETA_SIGNALING_PATHWAYRORIE_ES_PNET_DNO6BG_RESIST_MEDULLOBLASTOMA_DNHG_PROGERIA_DNMRNA_PROCESSINGUVB_NHEK3_C7MRNA_PROCESSING_REACTOMEMRNA_SPLICINGELONGINA_KO_DNTARTE_PCAGUIRRE_PANCREAS_CHR19ASTON_OLIGODENDROGLIA_MYELINATION_SUBSETASTON_DEPRESSION_DNSHIPP_FL_VS_DLBCL_DNBLEO_MOUSE_LYMPH_HIGH_24HRS_DNSCHUMACHER_MYC_UPBRCA1_OVEREXP_DNMENSSEN_MYC_UPPENG_RAPAMYCIN_DNPROTEASOMECMV_24HRS_UPBASSO_REGULATORY_HUBSPENG_LEUCINE_DNESR_FIBROBLAST_UPET743_RESIST_DNCROMER_HYPOPHARYNGEAL_MET_VS_NON_DNPURINE_METABOLISMTGFBETA_C2_UPZHAN_MM_CD138_CD1_VS_RESTHDACI_COLON_SUL24HRS_DNBCRABL_HL60_CDNA_DNPROSTAGLANDIN_AND_LEUKOTRIENE_METABOLISMHDACI_COLON_CLUSTER5BILE_ACID_BIOSYNTHESISUVB_NHEK4_6HRS_UPTSADAC_HYPOMETH_OVCA_UPHDACI_COLON_SUL48HRS_DNPARK_HSC_VS_MPP_DNGENOTOXINS_ALL_24HRS_REGBCRABL_HL60_CDNA_UPHASLINGER_B_CLL_12PARK_MSCS_DIFFIRS1_KO_ADIP_UPCROONQUIST_RAS_STROMA_DNIL6_FIBRO_UPAMINOACYL_TRNA_BIOSYNTHESISTRNA_SYNTHETASESLEE_MYC_DNTSA_HEPATOMA_CANCER_UPUVB_SCC_UPHDACI_COLON_CUR_UPBASSO_GERMINAL_CENTER_CD40_DNBREAST_CANCER_ESTROGEN_SIGNALINGEGF_HDMEC_UPVEGF_MMMEC_12HRS_UPZUCCHI_EPITHELIAL_UPELECTRON_TRANSPORTER_ACTIVITYGLUTATHIONE_METABOLISMBLEO_HUMAN_LYMPH_HIGH_4HRS_UPUVB_NHEK3_C6HOFMANN_MDS_CD34_LOW_AND_HIGH_RISKHOUSTIS_ROSFSH_OVARY_MCV152_UPGLYCEROPHOSPHOLIPID_METABOLISMAGUIRRE_PANCREAS_CHR7HYPOXIA_RCC_NOVHL_UPNAB_LUNG_UPUBIQUITIN_MEDIATED_PROTEOLYSISZELLER_MYC_UPHISTIDINE_METABOLISMNELSON_ANDROGEN_UPCHESLER_BRAIN_HIGHEST_VARIANCE_GENESUVC_HIGH_D5_DNCHESLER_BRAIN_CIS_GENESBCNU_GLIOMA_NOMGMT_48HRS_DNPENTOSE_PHOSPHATE_PATHWAYRESISTANCE_XENOGRAFTS_UPHOFMANN_MDS_CD34_LOW_RISKPENG_GLUCOSE_DNLEE_E2F1_UPAGUIRRE_PANCREAS_CHR12HSC_STHSC_ADULTP21_EARLY_DNBREAST_DUCTAL_CARCINOMA_GENESRAY_P210_DIFFAGUIRRE_PANCREAS_CHR17ELECTRON_TRANSPORTBCNU_GLIOMA_MGMT_48HRS_DNDNA_DAMAGE_SIGNALINGSANSOM_APC_LOSS4_UPHSC_INTERMEDIATEPROGENITORS_ADULTHOX_GENESGENOTOXINS_ALL_4HRS_REGMIDDLEAGE_DNGLYCOLYSIS_AND_GLUCONEOGENESISGLYCOLYSISGLUCONEOGENESISTCRPATHWAYTPA_RESIST_MIDDLE_DNPEART_HISTONE_UPHEMATOPOESIS_RELATED_TRANSCRIPTION_FACTORSHDACI_COLON_BUT12HRS_DNSANA_IFNG_ENDOTHELIAL_DNP21_P53_EARLY_DNCANTHARIDIN_DNMOREAUX_TACI_HI_IN_PPC_UPREN_E2F1_TARGETSHESS_HOXAANMEIS1_DNHESS_HOXAANMEIS1_UPRUIZ_TENASCIN_TARGETSHEARTFAILURE_ATRIA_DNADIP_DIFF_CLUSTER4INOS_ALL_UPAGUIRRE_PANCREAS_CHR8BRCA_BRCA1_POSFERRANDO_MLL_T_ALL_DNCANCER_NEOPLASTIC_META_UPCHANG_SERUM_RESPONSE_UPSERUM_FIBROBLAST_CORE_UPPEART_HISTONE_DNBHATTACHARYA_ESC_UPBREASTCA_TWO_CLASSESCIS_XPC_UPG1_TO_S_CELL_CYCLE_REACTOMEDNA_REPLICATION_REACTOMEVERNELL_PRB_CLSTR1P21_ANY_DNCANCER_UNDIFFERENTIATED_META_UPMANALO_HYPOXIA_DNOLDAGE_DNLI_FETAL_VS_WT_KIDNEY_DNADIP_DIFF_CLUSTER5IRITANI_ADPROX_LYMPHCMV_IE86_UPYU_CMYC_UPBRENTANI_CELL_CYCLEVANTVEER_BREAST_OUTCOME_GOOD_VS_POOR_DNBRCA_PROGNOSIS_NEGSHEPARD_BMYB_MORPHOLINO_DNP21_P53_ANY_DNCROONQUIST_IL6_RAS_DNGREENBAUM_E2A_UPSHEPARD_GENES_COMMON_BW_CB_MOSHEPARD_CRASH_AND_BURN_MUT_VS_WT_DNCELL_CYCLECELL_CYCLE_KEGGPRMT5_KD_UPSASAKI_ATL_UPSASAKI_TCELL_LYMPHOMA_VS_CD4_UPGOLDRATH_CELLCYCLELE_MYELIN_UPALCALAY_AML_NPMC_DNDOX_RESIST_GASTRIC_UPSERUM_FIBROBLAST_CELLCYCLECROONQUIST_IL6_STARVE_UPIDX_TSA_UP_CLUSTER3ZHAN_MM_CD138_PR_VS_RESTLEE_TCELLS3_UPUVC_TTD_8HR_DNBRCA1_OVEREXP_UPROME_INSULIN_2F_UPUVB_NHEK2_UPPROTEASOMEPATHWAYPROTEASOME_DEGRADATIONET743_SARCOMA_24HRS_DNUVB_NHEK3_C0FSH_OVARY_MCV152_DNMARTINELLI_IFNS_DIFFTARTE_BCELLLIAN_MYELOID_DIFF_GRANULESTEFFEN_AML_PML_PLZF_TRGTMUNSHI_MM_VS_PCS_UPMUNSHI_MM_UPET743_SARCOMA_48HRS_DNHCC_SURVIVAL_GOOD_VS_POOR_DNTSA_CD4_UPST_B_CELL_ANTIGEN_RECEPTORLEE_CIP_DNTRANSLATION_FACTORSHIPPOCAMPUS_DEVELOPMENT_PRENATALFLOTHO_CASP8AP2_MRD_DIFFNING_COPD_DNNING_COPD_UPUVB_NHEK1_C1HYPOPHYSECTOMY_RAT_UPPOMEROY_DESMOPLASIC_VS_CLASSIC_MD_UPRIBOSOMAL_PROTEINS

4.5Pos

0

Neg-4.5

Figure S21

Page 37: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

37

C4

C6

C9

C11

C13

C14

C16

C20

C21

C25

C26

C27

C29

C30

C32

C37

C38

C39

C42

C47

C49

C53

C56

C59

C61

C62

C63

C65

C68 optimized motif

location

MI (bits)

z-score

position bias

orientation bias

seed

motif name

0

1

2

bits

1 2

GC

3

C

4

G

5

G

6

A

7

A

8

GCA

9

TC 5' 0.022 27.4 Y - Elk-1

0

1

2

bits

1 2

C

3

C

4

C

5

G

6

C

7

C

8

C

9

5' 0.018 21.1 Y - Sp1

0

1

2

bits

1

GCA

2

C

3

C

4

G

5

GA

6

A

7

TCA

8

GA

9

TGA

5' 0.016 16.1 Y - c-Ets-1_p54_

0

1

2

bits

1 2

A

3

C

4

G

5

TA

6

C

7

GA

8 9

5' 0.013 12.0 Y - v-Jun

0

1

2

bits

1

TGC

2

TGA

3 4

TG

5

C

6

G

7

A

8

C

9

TCA

5' 0.012 14.8 Y - -

0

1

2

bits

1 2

T

3

CA

4

C

5

C

6

G

7

C

8

C

9

5' 0.012 10.5 Y - -

0

1

2

bits

1

GCA

2

C

3

G

4

T

5

TCA

6

G

7

GA

8

C

9

5' 0.011 11.6 Y - Egr-3

0

1

2

bits

1 2

A

3

A

4

A

5

A

6

TGA

7

T

8

T

9

5' 0.019 20.0 Y - -

0

1

2

bits

1

GCA

2

T

3

TG

4

T

5

A

6

A

7

CA

8

A

9

5' 0.016 20.6 Y - MEF-2

0

1

2

bits

1 2

A

3 4

TGA

5

A

6

T

7

A

8

T

9

TA 5' 0.015 18.6 Y - -

0

1

2

bits

1 2

CA

3

A

4

CA

5

A

6

T

7

A

8

A

9

TCA

5' 0.013 12.6 Y - HFH-1

0

1

2

bits

1 2

T

3

A

4

A

5

T

6

A

7

GA

8

TA

9

5' 0.013 15.0 Y - HNF-1

0

1

2

bits

1

TCA

2

TA

3

G

4

TA

5

G

6

TA

7

A

8

A

9

CA 5' 0.009 7.4 Y - -

0

1

2

bits

1 2

A

3

C

4

TA

5

T

6

GC

7

C

8

T

9

TGC

5' 0.007 5.3 Y Elk-1

0

1

2

bits

1 2

C

3

TG

4

G

5

C

6

G

7

TCA

8

A

9

5' 0.016 18.4 Y - -

0

1

2

bits

1

TGC

2

GCA

3

TGA

4

C

5

G

6

TA

7

A

8

CA

9

TGC

5' 0.013 15.7 Y - -

0

1

2

bits

1 2

A

3

A

4

TC

5

TGA

6

CA

7

C

8

G

9

5' 0.013 14.7 Y - -

0

1

2

bits

1

TG

2

C

3

G

4

T

5

TA

6

TGA

7

TC

8

CA

9

5' 0.013 13.5 Y - -

0

1

2

bits

1

TGC

2

T

3

A

4

C

5

G

6

TGA

7

CA

8

TGA

9

5' 0.012 11.0 Y - -

0

1

2

bits

1 2

C

3

C

4

C

5

C

6

A

7

G

8

G

9

5' 0.014 15.3 Y - MIG1

0

1

2

bits

1

GC

2

C

3

C

4

C

5

T

6

C

7

C

8

C

9

5' 0.010 8.0 Y - Sp1

0

1

2

bits

1

TGC

2

C

3

C

4

T

5

G

6

C

7

C

8

C

9

TCA

5' 0.009 7.1 Y - Sp1

0

1

2

bits

1 2

A

3

G

4

C

5

C

6

C

7

C

8

CA

9

TCA

5' 0.008 6.3 Y -

0

1

2

bits

1 2

T

3

T

4

A

5

A

6

A

7

A

8

TA

9

TCA

5' 0.012 12.5 Y - SBF-1

0

1

2

bits

1 2

A3

A4

TGA

5

T6

G

7

A

8

A

9

TA 5' 0.010 8.3 Y - ISRE

0

1

2bits

1

TGC

2

A3

TA

4A

5A

6

A7

A8

A9

5' 0.012 10.1 Y - Hb

0

1

2

bits

1

C

2

C3

C4

C5

C6

GC

7

TCA

8

CA

9

TCA

5' 0.012 11.6 Y - ADR1

0

1

2

bits

1

TGA

2

C

3

G

4

T

5

TC

6

A

7 8

TG

9

5' 0.011 10.1 Y - CRE-BP1

0

1

2

bits

1

TGC

2

GCA

3

T

4

CA

5

G

6

C

7

G

8

A

9

TGA

5' 0.010 9.2 Y - -

0

1

2

bits

1

CA

2

GC

3

A

4

G

5 6

GA

7

A

8 9

3'UTR 0.007 5.7 Y -

0

1

2

bits

1

TGA

2

TCA

3

C

4

G

5

CA

6

GCA

7

T

8

CA

9

GA 5' 0.011 12.0 Y - -

0

1

2

bits

1

GCA

2

TA

3

TA

4

A

5

C

6

G

7

C

8

GC

9

TGA

5' 0.010 9.4 Y - -

0

1

2

bits

1 2

C

3

GA

4

TC

5

C

6

TA

7

C

8

G

9

CA 5' 0.009 8.2 Y - -

0

1

2

bits

1

TGA

2

TA

3

TA

4

TCA

5

C

6

G

7

T

8

T

9

5' 0.011 9.2 Y - -

0

1

2

bits

1 2

GA

3

T

4

T

5

A

6

A

7

T

8

C

9

TCA

5' 0.009 6.8 Y - Dfd

0

1

2

bits

1 2

G

3

C

4

C

5

G

6

A

7

GA

8

C

9

5' 0.011 9.8 Y - -

0

1

2

bits

1 2

T

3

GC

4

T

5

GA

6

C

7

G

8

TA

9

5' 0.007 3.7 Y O2

0

1

2

bits

1

TG

2

CA

3

TA

4

TA

5

GA

6

C

7

G

8

C

9

TGA

5' 0.010 8.7 Y - -

0

1

2

bits

1

GA

2

GC

3

A

4

GA

5

CA

6

G

7

C

8 9

GA

3'UTR 0.009 7.5 Y -

0

1

2

bits

1 2 3

A

4

GA

5

G

6

C

7

G

8

G

9

CA

3'UTR 0.009 8.5 Y -

0

1

2

bits

1

GC

2

C

3

G

4 5 6 7 8

G

9

3'UTR 0.007 5.4 Y miR-518d/miR-

0

1

2

bits

1

GC

2

T

3

TG

4

TA

5

C

6

G

7

T

8 9

5' 0.009 9.8 Y - CRE-BP1

0

1

2

bits

1

TG

2

TC

3

G

4

TA

5

C

6

G

7

CA

8

A

9

5' 0.009 8.6 Y -

0

1

2

bits

1

GCA

2

T

3

TA

4

GCA

5

C

6

G

7

CA

8

A

9

TGA

5' 0.009 7.5 Y -

0

1

2

bits

1 2

CA

3

G

4

GA

5

A

6

A

7

T

8

A

9

5' 0.008 6.6 Y - HSF

0

1

2

bits

1

GCA

2

C

3

A

4

G

5

G

6

T

7

G

8

TCA

9

5' 0.008 6.3 Y - E47

0

1

2

bits

1 2

T

3

TA

4

GA

5

TGA

6

C

7

TGA

8

C

9

G 5' 0.008 6.6 Y -

0

1

2

bits

1 2 3

C

4

G

5

A

6

A

7

CA

8

A

9

5' 0.007 6.0 Y E2F

0

1

2

bits

1

CA

2

G

3 4 5

GCA

6

A

7

G

8

A

9

GCA

3'UTR 0.008 6.8 Y -

0

1

2

bits

1

GA

2 3 4

G

5 6

GA

7 8

G

9

GA

3'UTR 0.008 8.0 Y miR-377/miR-4

0

1

2

bits

1 2

C

3

GC

4

G

5

A

6

TA

7

TA

8

A

9

5' 0.008 5.5 Y -

0

1

2

bits

1

GCA

2

C

3

TGA

4

T

5

TC

6

A

7

C

8

G

9

5' 0.008 5.3 Y - -

0

1

2

bits

1 2

T

3

G

4

A

5

TCA

6 7

C

8

A

9

TA 5' 0.007 5.9 Y - AP-1

0

1

2

bits

1

TGA

2

TC

3

C

4

G

5

TC

6

TGA

7

A

8

A

9

GA 5' 0.007 4.8 Y -

0

1

2

bits

1 2

C

3

T

4

G

5

C

6

G

7

C

8

C

9

TGC

5' 0.007 4.8 Y - HEN1

0

1

2

bits

1

TGA

2

T

3

TA

4 5

T

6

C

7

G

8

C

9

5' 0.007 5.0 Y E2F

0

1

2

bits

1

CA

2

T

3

C

4

T

5

G

6

C

7

CA

8

A

9

GC 5' 0.006 3.3 Y - bZIP910

0

1

2

bits

1

TCA

2

C

3

TC

4

T

5

G

6

CA

7

C

8

A

9

TGC

5' 0.006 3.0 Y - -

0

1

2

bits

1

TCA

2

A

3

T

4

C

5

C

6

TC

7

C

8

T

9

TGC

5' 0.006 3.6 Y NF-kappaB__p5

0

1

2

bits

1

TGC

2

A

3

C

4

A

5

GCA

6

A

7

G

8

A

9

GCA

5' 0.006 3.7 Y - Mat1-Mc

0

1

2

bits

1

GCA

2

T

3

G

4

GCA

5

C

6

TA

7

C

8

A

9

TGC

5' 0.006 3.8 Y GCN4

0

1

2

bits

1

GCA

2

C

3

G

4

TA

5 6

T

7

A

8

C

9

GC 5' 0.006 3.3 Y bZIP911

0

1

2

bits

1 2

A

3

GC

4

TA

5

TC

6

A

7

T

8

G

9

GCA

5' 0.006 3.7 Y -

0

1

2

bits

1

TGC

2

TA

3

A

4

T

5

C

6

G

7

TC

8

TGC

9

GCA

5' 0.006 3.0 Y - -

0

1

2

bits

1 2

CA

3

G

4

A

5

C

6

G

7

TC

8

T

9

TGC

5' 0.006 3.7 Y - -

0

1

2

bits

1

GCA

2

C

3

A

4

GC

5

TC

6

T

7

G

8

G

9

GC 5' 0.006 2.8 Y Lmo2_complex

0

1

2

bits

1 2

TGC

3

GA

4

A

5

T

6

C

7

GA

8

A

9

GCA

5' 0.005 2.5 Y - CDP_CR3+HD

10

Over-representation

-10

Under-representation

Figure S22

Page 38: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

38

C2

C4

C5

C6

C7

C9

C11

C13

C15

C16

C18

C19

C20

C21

C23

C24

C25

C28

C29

C30

C32

C33

C34

C35

C36

C37

C38

C39

C45

C46

C47

C48

C49

C56

C59

C60

C64

C66

C67 optimized motif

location

MI (bits)

z-score

position bias

orientation bias

seed

motif name

0

1

2

bits

1 2

GC

3

C

4

G

5

G

6

A

7

A

8

GCA

9

TC 5' 0.035 62.0 Y - Elk-1

0

1

2

bits

1 2

C

3

C

4

C

5

G

6

C

7

C

8

C

9

5' 0.029 53.4 Y - Sp1

0

1

2

bits

1

GCA

2

C

3

C

4

G

5

GA

6

A

7

TCA

8

GA

9

TGA

5' 0.023 37.5 Y - c-Ets-1_p54_

0

1

2

bits

1

TGC

2

GCA

3

TGA

4

C

5

G

6

TA

7

A

8

CA

9

TGC

5' 0.018 29.2 Y - -

0

1

2

bits

1 2

T

3

CA

4

C

5

C

6

G

7

C

8

C

9

5' 0.018 31.0 Y - -

0

1

2

bits

1 2

A

3

C

4

G

5

TA

6

C

7

GA

8 9

5' 0.017 30.1 Y - v-Jun

0

1

2

bits

1 2

C

3

TG

4

G

5

C

6

G

7

TCA

8

A

9

5' 0.017 25.2 Y - -

0

1

2

bits

1

TGA

2

TCA

3

C

4

G

5

CA

6

GCA

7

T

8

CA

9

GA 5' 0.017 28.2 Y - -

0

1

2

bits

1

TGC

2

TGA

3 4

TG

5

C

6

G

7

A

8

C

9

TCA

5' 0.015 22.1 Y - -

0

1

2

bits

1 2

A

3

A

4

TC

5

TGA

6

CA

7

C

8

G

9

5' 0.013 24.0 Y - -

0

1

2

bits

1

TGC

2

T

3

A

4

C

5

G

6

TGA

7

CA

8

TGA

9

5' 0.013 18.2 Y - -

0

1

2

bits

1

TG

2

C

3

G

4

T

5

TA

6

TGA

7

TC

8

CA

9

5' 0.012 19.9 Y - -

0

1

2

bits

1

GC

2

T

3

TG

4

TA

5

C

6

G

7

T

8 9

5' 0.012 19.1 Y - CRE-BP1

0

1

2

bits

1 2

C

3

GA

4

TC

5

C

6

TA

7

C

8

G

9

CA 5' 0.010 11.5 Y - -

0

1

2

bits

1

TGA

2

TA

3

TA

4

TCA

5

C

6

G

7

T

8

T

9

5' 0.010 12.6 Y - -

0

1

2

bits

1 2

T

3

TA

4

GA

5

TGA

6

C

7

TGA

8

C

9

G 5' 0.010 12.5 Y - -

0

1

2

bits

1

TGA

2

C

3

G

4

T

5

TC

6

A

7 8

TG

9

5' 0.014 27.2 Y - CRE-BP1

0

1

2

bits

1

TG

2

CA

3

TA

4

TA

5

GA

6

C

7

G

8

C

9

TGA

5' 0.013 21.5 Y - -

0

1

2

bits

1

GCA

2

C

3

TGA

4

T

5

TC

6

A

7

C

8

G

9

5' 0.011 14.2 Y - -

0

1

2

bits

1 2 3

C

4

G

5

A

6

A

7

CA

8

A

9

5' 0.011 14.1 Y - E2F

0

1

2

bits

1

TGA

2

T

3

TA

4 5

T

6

C

7

G

8

C

9

5' 0.010 12.7 Y - E2F

0

1

2

bits

1 2

T

3

GC

4

T

5

GA

6

C

7

G

8

TA

9

5' 0.007 7.2 Y - O2

0

1

2

bits

1

GCA

2

C

3

G

4

T

5

TCA

6

G

7

GA

8

C

9

5' 0.012 15.0 Y - Egr-3

0

1

2

bits

1

TGC

2

GCA

3

T

4

CA

5

G

6

C

7

G

8

A

9

TGA

5' 0.010 15.1 Y - -

0

1

2

bits

1 2

G

3

C

4

C

5

G

6

A

7

GA

8

C

9

5' 0.009 12.3 Y - -

0

1

2

bits

1 2

C3

T4

G5

C

6

G

7

C

8

C

9

TGC

5' 0.006 6.2 Y - HEN1

0

1

2bits

1 2

TGC

3GA

4A

5T

6

C7

GA

8

A9

GCA

5' 0.004 2.8 Y - CDP_CR3+HD

0

1

2

bits

1 2

A3 4

TGA

5

A6

T7

A8

T9

TA 5' 0.009 15.1 Y - -

0

1

2

bits

1

GCA

2

C

3

A

4

G

5

G

6

T

7

G

8

TCA

9

5' 0.008 10.1 Y - E47

0

1

2

bits

1

GCA

2

T

3

TG

4

T

5

A

6

A

7

CA

8

A

9

5' 0.008 11.2 Y - MEF-2

0

1

2

bits

1 2

CA

3

A

4

CA

5

A

6

T

7

A

8

A

9

TCA

5' 0.008 6.9 Y - HFH-1

0

1

2

bits

1 2

A

3

GC

4

TA

5

TC

6

A

7

T

8

G

9

GCA

5' 0.007 9.7 Y - -

0

1

2

bits

1 2

T

3

A

4

A

5

T

6

A

7

GA

8

TA

9

5' 0.007 7.3 Y - HNF-1

0

1

2

bits

1

GCA

2

T

3

G

4

GCA

5

C

6

TA

7

C

8

A

9

TGC

5' 0.006 7.3 Y - GCN4

0

1

2

bits

1

TGC

2

A

3

TA

4

A

5

A

6

A

7

A

8

A

9

5' 0.006 6.5 Y - Hb

0

1

2

bits

1 2

A

3

A

4

TGA

5

T

6

G

7

A

8

A

9

TA 5' 0.006 6.5 Y ISRE

0

1

2

bits

1

TCA

2

C

3

TC

4

T

5

G

6

CA

7

C

8

A

9

TGC

5' 0.006 6.4 Y - -

0

1

2

bits

1 2

T

3

G

4

A

5

TCA

6 7

C

8

A

9

TA 5' 0.005 4.0 Y - AP-1

0

1

2

bits

1 2

GA

3

T

4

T

5

A

6

A

7

T

8

C

9

TCA

5' 0.004 3.1 Y Dfd

0

1

2

bits

1 2

C

3

GC

4

G

5

A

6

TA

7

TA

8

A

9

5' 0.009 14.9 Y - -

0

1

2

bits

1

TGA

2

TC

3

C

4

G

5

TC

6

TGA

7

A

8

A

9

GA 5' 0.008 8.6 Y - -

0

1

2

bits

1

GCA

2

T

3

TA

4

GCA

5

C

6

G

7

CA

8

A

9

TGA

5' 0.009 10.5 Y - -

0

1

2

bits

1

GCA

2

TA

3

TA

4

A

5

C

6

G

7

C

8

GC

9

TGA

5' 0.008 8.3 Y - -

0

1

2

bits

1 2

A

3

A

4

A

5

A

6

TGA

7

T

8

T

9

5' 0.008 8.9 Y - -

0

1

2

bits

1

GC

2

C

3

C

4

C

5

T

6

C

7

C

8

C

9

5' 0.008 8.9 Y - Sp1

0

1

2

bits

1

TC

2

GA

3

A

4

T

5

T

6

GC

7

G

8

C

9

TC 5' 0.007 7.3 Y - NF-Y

0

1

2

bits

1 2

C

3

C

4

C

5

C

6

A

7

G

8

G

9

5' 0.008 9.0 Y - MIG1

0

1

2

bits

1

TGC

2

C

3

C

4

T

5

G

6

C

7

C

8

C

9

TCA

5' 0.007 7.9 Y - Sp1

0

1

2

bits

1

C

2

C

3

C

4

C

5

C

6

GC

7

TCA

8

CA

9

TCA

5' 0.007 9.3 Y - ADR1

0

1

2

bits

1 2

A

3

G

4

C

5

C

6

C

7

C

8

CA

9

TCA

5' 0.007 6.3 Y - -

0

1

2

bits

1

TGC

2

TA

3

A

4

T

5

C

6

G

7

TC

8

TGC

9

GCA

5' 0.007 7.3 Y - -

0

1

2

bits

1

TG

2

TC

3

G

4

TA

5

C

6

G

7

CA

8

A

9

5' 0.006 6.6 Y -

0

1

2

bits

1

GCA

2

C

3

A

4

GC

5

TC

6

T

7

G

8

G

9

GC 5' 0.006 5.9 Y - Lmo2_complex

0

1

2

bits

1 2

T

3

T

4

A

5

A

6

A

7

A

8

TA

9

TCA

5' 0.006 6.1 Y - SBF-1

0

1

2

bits

1 2 3

A

4

GA

5

G

6

C

7

G

8

G

9

CA

3'UTR 0.006 5.0 Y -

0

1

2

bits

1

TGA

2

C

3

C

4

TG

5

A

6

G

7

T

8

TCA

9

5' 0.006 4.7 Y - PPARalpha_RXR

0

1

2

bits

1

A

2 3

GC

4

A

5

G

6

G

7 8 9

GA

3'UTR 0.006 7.0 Y -

0

1

2

bits

1

GCA

2

CA

3

GA

4 5

CA

6

G

7

GA

8 9

G 3'UTR 0.005 5.6 Y -

0

1

2

bits

1 2

CA

3

G

4

A

5

C

6

G

7

TC

8

T

9

TGC

5' 0.006 6.7 Y - -

0

1

2

bits

1 2

G

3 4

A

5

C

6

C

7

C

8

G

9

GA

3'UTR 0.006 6.1 Y -

0

1

2

bits

1 2

CA

3

G

4

GA

5

A

6

A

7

T

8

A

9

5' 0.005 5.0 Y HSF

0

1

2

bits

1 2

A

3 4 5

GA

6

C

7

A

8

GC

9

CA

3'UTR 0.005 6.0 Y miR-203

0

1

2

bits

1 2

A

3 4 5

GA

6

C

7

A

8

GC

9

CA

3'UTR 0.005 6.0 Y miR-203

0

1

2

bits

1

TCA

2

TA

3

G

4

TA

5

G

6

TA

7

A

8

A

9

CA 5' 0.005 4.2 Y -

0

1

2

bits

1

CA

2

T

3

C

4

T

5

G

6

C

7

CA

8

A

9

GC 5' 0.005 3.7 Y bZIP910

0

1

2

bits

1

TCA

2

A

3

T

4

C

5

C

6

TC

7

C

8

T

9

TGC

5' 0.005 3.8 Y NF-kappaB__p5

0

1

2

bits

1

CA

2

G

3 4 5

GCA

6

A

7

G

8

A

9

GCA

3'UTR 0.005 4.4 Y -

0

1

2

bits

1

A

2

A

3

GC

4

G

5

GA

6 7

C

8

A

9

GA

3'UTR 0.005 3.6 Y miR-487a

0

1

2

bits

1 2

A

3

C

4

TA

5

T

6

GC

7

C

8

T

9

TGC

5' 0.004 2.8 Y - Elk-1

0

1

2

bits

1 2

A

3

C

4

GC

5

T

6

T

7

A

8

C

9

CA 5' 0.004 2.4 Y - HLF

0

1

2

bits

1

GA

2

C

3

TA

4

T

5

C

6

C

7

GA

8

C

9

TGA

5' 0.004 1.9 Y - GCR1

10

Over-representation

-10

Under-representation

Figure S23

Page 39: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

39

1.836

-0.382

location

MI (bits)

z-score

robustness

position bias

orientation bias

conservation index

seed

motif name

0

1

2

bits

1 2

C

3

C

4

G

5

G

6

A

7

TA

8

TGA

9

5’ 0.007 45.2 10/10 Y - 1.00 CCGGAAG ELK4

0

1

2

bits

1 2

A

3

T

4

A

5

TA

6

TG

7

TC

8

T

9

TA 5’ 0.004 18.3 10/10 Y 1.00 ATATGCT -

0

1

2

bits

1

GCA

2

C

3

TC

4

A

5

A

6

A

7

C

8

G

9

TG 5’ 0.002 10.1 10/10 Y - 0.01 CCAAACG -

10

-10

0.653

-0.772

location

MI (bits)

z-score

robustness

position bias

orientation bias

conservation index

seed

motif name

0

1

2

bits

1 2

C

3

C

4

G

5

C6

C7

C8

C9

TGC

5’ 0.009 48.3 10/10 - - 1.00 CCGCCCC Sp1

0

1

2

bits

1 2 3

A

4

C

5

T

6

T

7

C

8

C

9

GA 5’ 0.002 9.8 7/10 - - 0.95 CACTTCC Elk-1

0

1

2

bits

1

TCA

2

A

3

CA

4

TA

5

A

6

T

7

A

8

A

9

5’ 0.006 30.7 10/10 - - 1.00 AAAATAA MEF-2

0.523

-0.847

location

MI (bits)

z-score

robustness

position bias

orientation bias

conservation index

seed

motif name

0

1

2

bits

1

GA

2

GA

3

C

4

C

5

A

6

A

7

T

8

GC

9

GA 5’ 0.002 6.1 7/10 Y 1.00 NF-Y

...AAAAATT...

Elk1 knock-down

NFYA knock-down

10

-10

10

-10

(A)

(B)

(C)

Figure S24

Page 40: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

40

Supplemental Table

Table S1

Real Data Random Data I Random Data II BL vs DLBCL 525 1 1 Bladder Carcinoma (cont) 224 4 5 Bladder Carcinoma (disc) 248 1 1

Page 41: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

41

Table S2

Tissue Sample Name Sample Bladder CA Bladder Dyrskjot et al Carcinoma (Dyrskjot et al., 2004)

Brain

GBM Brain Liang et al OD Brain Bredel et al GL Brain Bredel et al AO Brain Bredel et al GL Brain Rickman et al ODGL Brain Sun et al AC Brain Sun et al GLB Brain Sun et al

Glioblastoma Multiforme (Liang et al., 2005) Oligodendroglioma (Bredel et al., 2005) Glioblastoma (Bredel et al., 2005) Anaplastic Oligoastrocytoma (Bredel et al., 2005) Glioma (Rickman et al., 2001) Oligodendroglioma (Sun et al., 2006) Astrocytoma (Sun et al., 2006) Glioblastoma (Sun et al., 2006)

CA Breast Sorlie et al Carcinoma (Sorlie et al., 2001) CA Breast Richardson et al Carcinoma (Richardson et al., 2006)

Breast MCA Breast Radvanyi et al ILC Breast Radvanyi et al IDC Breast Radvanyi et al

Metastatic Breast Carcinoma (Radvanyi et al., 2005) Invasive Lobular Carcinoma (Radvanyi et al., 2005) Invasive Ductal Carcinoma (Radvanyi et al., 2005)

Colon CA Colon Graudens et al Carcinoma (Graudens et al., 2006) HSCC Head-Neck Cromer et al Head-Neck Squamous Cell Carcinoma(Cromer et al., 2004) Head-neck HSCC Head-Neck Chung et al Head-Neck Squamous Cell Carcinoma (Chung et al., 2004)

Leukemia B-CLL Leukemia Haslinger et al Chronic Lymphocytic Leukemia (Haslinger et al., 2004) AD Lung Beer et al Adenocarcinoma (Beer et al., 2002) AD Lung Bhattacharjee et al COID Lung Bhattacharjee et al SQ Lung Bhattacharjee et al SMCL Lung Bhattacharjee et al

Adenocarcinoma (Bhattacharjee et al., 2001) Carcinoid (Bhattacharjee et al., 2001) Squamous Cell Lung Carcinoma (Bhattacharjee et al., 2001) Small Cell Lung Cancer (Bhattacharjee et al., 2001)

Lung

AD Lung Stearman et al Adenocarcinoma (Stearman et al., 2005)

Lymphoma FL Lymphoma Alizadeh et al DLBCL Lymphoma Alizadeh et al CLL Lymphoma Alizadeh et al

Follicular Lymphoma (Alizadeh et al., 2000) Diffuse Large B-Cell Lymphoma (Alizadeh et al., 2000) Chronic Lymphocytic Leukemia (Alizadeh et al., 2000)

Melanoma ML Melanoma Talantov et al ME Melanoma Hoek et al

Cutaneous melanoma (Hoek et al., 2006) Melanoma (Talantov et al., 2005)

Mesothelioma MPM Mesothelioma Gordon et al Malignant Mesothelioma (Gordon et al., 2005) Myeloma MM Myeloma Zhan et al Multiple Myeloma (Zhan et al., 2002)

AD Ovarian Welsh et al Adenocarcinoma (Welsh et al., 2001)

Ovarian CCC Ovarian Hendrix et al MUC Ovarian Hendrix et al SRS Ovarian Hendrix et al END Ovarian Hendrix et al

Clear Cell Carcinoma (Hendrix et al., 2006) Mucinous Adenocarcinoma (Hendrix et al., 2006) Serous Adenocarcinoma (Hendrix et al., 2006) Endometrioid Adenocarcinoma (Hendrix et al., 2006)

PDC Pancreas Ishikawa et al Pancreatic Ductal Carcinoma (Ishikawa et al., 2005) Pancreas AD Pancreas Logsdon et al Adenocarcinoma (Logsdon et al., 2003) MPC Prostate Dhanasekaran et al PPC Prostate Dhanasekaran et al BPH Prostate Dhanasekaran et al

Metastatic Prostate Cancer (Dhanasekaran et al., 2001) Primary Prostate Cancer (Dhanasekaran et al., 2001) Benign Prostatic Hyperplasia (Dhanasekaran et al., 2001)

Prostate

TU Prostate Lapointe et al Primary Tumor (Lapointe et al., 2004) CA Renal Higgins et al Carcinoma (Higgins et al., 2003) RCCC Renal Boer et al Clear Cell Renal Cell Carcinoma (Boer et al., 2001) Renal RCCC Renal Lenburg et al Clear Cell Renal Cell Carcinoma (Lenburg et al., 2003)

Seminoma GCT Seminoma Korkola et al Germ Cell Tumor (Korkola et al., 2006)

Page 42: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

42

Table S3

ecnacifingiSsfitoMsmret OG

chromatin assembly0

1

2

bits

1

A

2

A

3

UCA

4

G

5

G

6

C

7

UGC

8

C

9

U 7.68-e1 < p’3

DNA packaging0

1

2

bits

1

UCA

2

G

3

G

4

C

5

UC

6

C

7

U

8

UA

9

UC 5.23-e1 < p’3

chromosome organization and biogenesis0

1

2

bits

1 2

A

3

A

4

TC

5

TGA

6

CA

7

C

8

G

9

1.11-e1 < p’5

ribonucleoprotein complex0

1

2

bits

1

GCA

2

T

3

TA

4

GCA

5

C

6

G

7

CA

8

A

9

TGA 7.8-e1 < p’5

DNA packaging0

1

2

bits

1

TG

2

CA

3

TA

4

TA

5

GA

6

C

7

G

8

C

9

TGA 4.8-e1 < p’5

DNA repair0

1

2

bits

1

TGC

2

T

3

A

4

C

5

G

6

TGA

7

CA

8

TGA

9

7.7-e1 < p’5

mRNA processing0

1

2

bits

1 2

UG

3

U

4

U

5

U

6

U

7

U

8 9

UGA 7-e1 < p’3

cell-cell adhesion0

1

2

bits

1 2

C

3

UCA

4

C

5

U

6

G

7

U

8

G

9

UGC 9.6-e1 < p’3

cell-cell adhesion0

1

2

bits

1 2

A

3

G

4

A

5

UGC

6

C

7

C

8

GA

9

GCA 7.6-e1 < p’3

ubiquitin-protein ligase activity0

1

2

bits

1

UG

2

UA

3

A

4

A

5

UC

6

G

7

C

8

UCA

9

UA 4.6-e1 < p’3

protein-tyrosine kinase activity0

1

2

bits

1

UGA

2

G

3

UG

4

UGC

5

C

6

UGA

7

C

8

UGC

9

GC 3.6-e1 < p’3

Golgi vesicle transport0

1

2

bits

1

GCA

2

TA

3

TA

4

A5

C6

G7

C

8

GC

9

TGA 3.6-e1 < p’5

cytoskeletal protein binding0

1

2

bits

1

UCA

2

UCA

3

C4

C5

GC

6GC

7 8

CA

9

GC 2.6-e1 < p’3

GPI anchor binding0

1

2

bits

1 2

C

3

UGC

4

GC

5

C6

UCA

7

C8

C9

GCA 2.6-e1 < p’3

DNA repair0

1

2

bits

1 2

G

3

UGC

4

UC

5

U

6

UG

7

U

8

A

9

6-e1 < p’3

humoral immune response0

1

2

bits

1

TGA

2

GA

3

T

4

T

5

C

6

A

7

T

8

G

9

TCA 6-e1 < p’5

phosphoinositide-mediated signaling0

1

2

bits

1

UCA

2

CA

3

G

4

GA

5

U

6

C

7

C

8

UC

9

GCA 6-e1 < p’3

mitotic cell cycle0

1

2

bits

1 2

C

3

TG

4

G

5

C

6

G

7

TCA

8

A

9

9.5-e1 < p’5

mitosis0

1

2

bits

1

TG

2

C

3

G

4

T

5

TA

6

TGA

7

TC

8

CA

9

5.5-e1 < p’5

response to wounding0

1

2

bits

1 2

A

3

C

4

TA

5

T

6

GC

7

C

8

T

9

TGC 4.5-e1 < p’5

response to wounding0

1

2

bits

1

GCA

2

T

3

G

4

GCA

5

C

6

TA

7

C

8

A

9

TGC 3.5-e1 < p’5

cell-cell adhesion0

1

2

bits

1 2

A

3 4

TGA

5

A

6

T

7

A

8

T

9

TA 2.5-e1 < p’5

mRNA metabolic process0

1

2

bits

1 2

A

3

T

4

C

5

G

6

C

7

G

8

CA

9

TGA 1.5-e1 < p’5

small GTPase mediated signal transduction0

1

2

bits

1

TGA

2

TCA

3

C

4

G

5

CA

6

GCA

7

T

8

CA

9

GA 5-e1 < p’5

Wnt receptor signaling pathway0

1

2

bits

1 2

G

3

C

4

C

5

G

6

A

7

GA

8

C

9

8.4-e1 < p’5

Page 43: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

43

References Alizadeh, A.A., Eisen, M.B., Davis, R.E., Ma, C., Lossos, I.S., Rosenwald, A., Boldrick, J.C., Sabet, H., Tran, T., Yu, X., et al. (2000). Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403, 503-511.

Amin, J., Ananthan, J., and Voellmy, R. (1988). Key features of heat shock regulatory elements. Mol Cell Biol 8, 3761-3769.

Beer, D.G., Kardia, S.L., Huang, C.C., Giordano, T.J., Levin, A.M., Misek, D.E., Lin, L., Chen, G., Gharib, T.G., Thomas, D.G., et al. (2002). Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat Med 8, 816-824.

Bhattacharjee, A., Richards, W.G., Staunton, J., Li, C., Monti, S., Vasa, P., Ladd, C., Beheshti, J., Bueno, R., Gillette, M., et al. (2001). Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc Natl Acad Sci U S A 98, 13790-13795.

Boer, J.M., Huber, W.K., Sultmann, H., Wilmer, F., von Heydebreck, A., Haas, S., Korn, B., Gunawan, B., Vente, A., Fuzesi, L., et al. (2001). Identification and classification of differentially expressed genes in renal cell carcinoma by expression profiling on a global human 31,500-element cDNA array. Genome Res 11, 1861-1870.

Bredel, M., Bredel, C., Juric, D., Harsh, G.R., Vogel, H., Recht, L.D., and Sikic, B.I. (2005). Functional network analysis reveals extended gliomagenesis pathway maps and three novel MYC-interacting genes in human gliomas. Cancer Res 65, 8679-8689.

Busche, S., Descot, A., Julien, S., Genth, H., and Posern, G. (2008). Epithelial cell-cell contacts regulate SRF-mediated transcription via Rac-actin-MAL signalling. J Cell Sci 121, 1025-1035.

Chung, C.H., Parker, J.S., Karaca, G., Wu, J., Funkhouser, W.K., Moore, D., Butterfoss, D., Xiang, D., Zanation, A., Yin, X., et al. (2004). Molecular classification of head and neck squamous cell carcinomas using patterns of gene expression. Cancer Cell 5, 489-500.

Cover, T., and Thomas, J. (2006). Elements of Information Theory, Second Edition edn (Hoboken, NJ, Wiley-Interscience).

Cromer, A., Carles, A., Millon, R., Ganguli, G., Chalmel, F., Lemaire, F., Young, J., Dembele, D., Thibault, C., Muller, D., et al. (2004). Identification of genes associated with tumorigenesis and metastatic potential of hypopharyngeal cancer by microarray analysis. Oncogene 23, 2484-2498.

Crosby, M.E., Jacobberger, J., Gupta, D., Macklis, R.M., and Almasan, A. (2007). E2F4 regulates a stable G2 arrest response to genotoxic stress in prostate carcinoma. Oncogene 26, 1897-1909.

Dhanasekaran, S.M., Barrette, T.R., Ghosh, D., Shah, R., Varambally, S., Kurachi, K., Pienta, K.J., Rubin, M.A., and Chinnaiyan, A.M. (2001). Delineation of prognostic biomarkers in prostate cancer. Nature 412, 822-826.

Dyrskjot, L., Kruhoffer, M., Thykjaer, T., Marcussen, N., Jensen, J.L., Moller, K., and Orntoft, T.F. (2004). Gene expression in the urinary bladder: a common carcinoma in situ gene expression signature exists disregarding histopathological classification. Cancer Res 64, 4040-4048.

Page 44: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

44

Elemento, O., Slonim, N., and Tavazoie, S. (2007). A universal framework for regulatory element discovery across all genomes and data types. Mol Cell 28, 337-350.

Elemento, O., and Tavazoie, S. (2005). Fast and systematic genome-wide discovery of conserved regulatory elements using a non-alignment based approach. Genome Biol 6, R18.

Fukada, T., Ohtani, T., Yoshida, Y., Shirogane, T., Nishida, K., Nakajima, K., Hibi, M., and Hirano, T. (1998). STAT3 orchestrates contradictory signals in cytokine-induced G1 to S cell-cycle transition. Embo J 17, 6670-6677.

Gordon, G.J., Rockwell, G.N., Jensen, R.V., Rheinwald, J.G., Glickman, J.N., Aronson, J.P., Pottorf, B.J., Nitz, M.D., Richards, W.G., Sugarbaker, D.J., et al. (2005). Identification of novel candidate oncogenes and tumor suppressors in malignant pleural mesothelioma using large-scale transcriptional profiling. Am J Pathol 166, 1827-1840.

Graudens, E., Boulanger, V., Mollard, C., Mariage-Samson, R., Barlet, X., Gremy, G., Couillault, C., Lajemi, M., Piatier-Tonneau, D., Zaborski, P., et al. (2006). Deciphering cellular states of innate tumor drug responses. Genome Biol 7, R19.

Haslinger, C., Schweifer, N., Stilgenbauer, S., Dohner, H., Lichter, P., Kraut, N., Stratowa, C., and Abseher, R. (2004). Microarray gene expression profiling of B-cell chronic lymphocytic leukemia subgroups defined by genomic aberrations and VH mutation status. J Clin Oncol 22, 3937-3949.

Hendrix, N.D., Wu, R., Kuick, R., Schwartz, D.R., Fearon, E.R., and Cho, K.R. (2006). Fibroblast growth factor 9 has oncogenic activity and is a downstream target of Wnt signaling in ovarian endometrioid adenocarcinomas. Cancer Res 66, 1354-1362.

Higgins, J.P., Shinghal, R., Gill, H., Reese, J.H., Terris, M., Cohen, R.J., Fero, M., Pollack, J.R., van de Rijn, M., and Brooks, J.D. (2003). Gene expression patterns in renal cell carcinoma assessed by complementary DNA microarray. Am J Pathol 162, 925-932.

Hoek, K.S., Schlegel, N.C., Brafford, P., Sucker, A., Ugurel, S., Kumar, R., Weber, B.L., Nathanson, K.L., Phillips, D.J., Herlyn, M., et al. (2006). Metastatic potential of melanomas defined by specific gene expression profiles with no BRAF signature. Pigment Cell Res 19, 290-302.

Ishikawa, M., Yoshida, K., Yamashita, Y., Ota, J., Takada, S., Kisanuki, H., Koinuma, K., Choi, Y.L., Kaneda, R., Iwao, T., et al. (2005). Experimental trial for diagnosis of pancreatic ductal carcinoma based on gene expression profiles of pancreatic ductal cells. Cancer Sci 96, 387-393.

Korkola, J.E., Houldsworth, J., Chadalavada, R.S., Olshen, A.B., Dobrzynski, D., Reuter, V.E., Bosl, G.J., and Chaganti, R.S. (2006). Down-regulation of stem cell genes, including those in a 200-kb gene cluster at 12p13.31, is associated with in vivo differentiation of human male germ cell tumors. Cancer Res 66, 820-827.

Lapointe, J., Li, C., Higgins, J.P., van de Rijn, M., Bair, E., Montgomery, K., Ferrari, M., Egevad, L., Rayford, W., Bergerheim, U., et al. (2004). Gene expression profiling identifies clinically relevant subtypes of prostate cancer. Proc Natl Acad Sci U S A 101, 811-816.

Lenburg, M.E., Liou, L.S., Gerry, N.P., Frampton, G.M., Cohen, H.T., and Christman, M.F. (2003). Previously unidentified changes in renal cell carcinoma gene expression identified by parametric

Page 45: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

45

analysis of microarray data. BMC Cancer 3, 31.

Liang, Y., Diehn, M., Watson, N., Bollen, A.W., Aldape, K.D., Nicholas, M.K., Lamborn, K.R., Berger, M.S., Botstein, D., Brown, P.O., et al. (2005). Gene expression profiling reveals molecularly and clinically distinct subtypes of glioblastoma multiforme. Proc Natl Acad Sci U S A 102, 5814-5819.

Logsdon, C.D., Simeone, D.M., Binkley, C., Arumugam, T., Greenson, J.K., Giordano, T.J., Misek, D.E., Kuick, R., and Hanash, S. (2003). Molecular profiling of pancreatic adenocarcinoma and chronic pancreatitis identifies multiple genes differentially regulated in pancreatic cancer. Cancer Res 63, 2649-2657.

Pilpel, Y., Sudarsanam, P., and Church, G.M. (2001). Identifying regulatory networks by combinatorial analysis of promoter elements. Nat Genet 29, 153-159.

Radvanyi, L., Singh-Sandhu, D., Gallichan, S., Lovitt, C., Pedyczak, A., Mallo, G., Gish, K., Kwok, K., Hanna, W., Zubovits, J., et al. (2005). The gene associated with trichorhinophalangeal syndrome in humans is overexpressed in breast cancer. Proc Natl Acad Sci U S A 102, 11005-11010.

Richardson, A.L., Wang, Z.C., De Nicolo, A., Lu, X., Brown, M., Miron, A., Liao, X., Iglehart, J.D., Livingston, D.M., and Ganesan, S. (2006). X chromosomal abnormalities in basal-like human breast cancer. Cancer Cell 9, 121-132.

Rickman, D.S., Bobek, M.P., Misek, D.E., Kuick, R., Blaivas, M., Kurnit, D.M., Taylor, J., and Hanash, S.M. (2001). Distinctive molecular profiles of high-grade and low-grade gliomas based on oligonucleotide microarray analysis. Cancer Res 61, 6885-6891.

Ross, D.T., Scherf, U., Eisen, M.B., Perou, C.M., Rees, C., Spellman, P., Iyer, V., Jeffrey, S.S., Van de Rijn, M., Waltham, M., et al. (2000). Systematic variation in gene expression patterns in human cancer cell lines. Nat Genet 24, 227-235.

Singh, V., and Aballay, A. (2006). Heat-shock transcription factor (HSF)-1 pathway required for Caenorhabditis elegans immunity. Proc Natl Acad Sci U S A 103, 13092-13097.

Sinha, S., Adler, A.S., Field, Y., Chang, H.Y., and Segal, E. (2008). Systematic functional characterization of cis-regulatory motifs in human core promoters. Genome Res 18, 477-488.

Slonim, N., Atwal, G.S., Tkacik, G., and Bialek, W. (2005). Information-based clustering. Proc Natl Acad Sci U S A 102, 18297-18302.

Sorlie, T., Perou, C.M., Tibshirani, R., Aas, T., Geisler, S., Johnsen, H., Hastie, T., Eisen, M.B., van de Rijn, M., Jeffrey, S.S., et al. (2001). Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A 98, 10869-10874.

Stearman, R.S., Dwyer-Nield, L., Zerbe, L., Blaine, S.A., Chan, Z., Bunn, P.A., Jr., Johnson, G.L., Hirsch, F.R., Merrick, D.T., Franklin, W.A., et al. (2005). Analysis of orthologous gene expression between human pulmonary adenocarcinoma and a carcinogen-induced murine model. Am J Pathol 167, 1763-1775.

Su, A.I., Wiltshire, T., Batalov, S., Lapp, H., Ching, K.A., Block, D., Zhang, J., Soden, R., Hayakawa, M., Kreiman, G., et al. (2004). A gene atlas of the mouse and human protein-encoding transcriptomes.

Page 46: Goodarzi etal Suppl Mat final - Columbia University · 2009-11-04 · 3 Pathway Profile Each gene can be associated with a subset of M known pathways (e.g. from the Gene Ontology

46

Proc Natl Acad Sci U S A 101, 6062-6067.

Subramanian, A., Tamayo, P., Mootha, V.K., Mukherjee, S., Ebert, B.L., Gillette, M.A., Paulovich, A., Pomeroy, S.L., Golub, T.R., Lander, E.S., et al. (2005). Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102, 15545-15550.

Sun, L., Hui, A.M., Su, Q., Vortmeyer, A., Kotliarov, Y., Pastorino, S., Passaniti, A., Menon, J., Walling, J., Bailey, R., et al. (2006). Neuronal and glioma-derived stem cell factor induces angiogenesis within the brain. Cancer Cell 9, 287-300.

Talantov, D., Mazumder, A., Yu, J.X., Briggs, T., Jiang, Y., Backus, J., Atkins, D., and Wang, Y. (2005). Novel genes associated with malignant melanoma but not benign melanocytic lesions. Clin Cancer Res 11, 7234-7242.

Welsh, J.B., Zarrinkar, P.P., Sapinoso, L.M., Kern, S.G., Behling, C.A., Monk, B.J., Lockhart, D.J., Burger, R.A., and Hampton, G.M. (2001). Analysis of gene expression profiles in normal and neoplastic ovarian tissue samples identifies candidate molecular markers of epithelial ovarian cancer. Proc Natl Acad Sci U S A 98, 1176-1181.

Zhan, F., Hardin, J., Kordsmeier, B., Bumm, K., Zheng, M., Tian, E., Sanderson, R., Yang, Y., Wilson, C., Zangari, M., et al. (2002). Global gene expression profiling of multiple myeloma, monoclonal gammopathy of undetermined significance, and normal bone marrow plasma cells. Blood 99, 1745-1757.