The Applications of Microarrays and Artificial Intelligence for Diagnosis, Prognosis and
Selection of Therapeutic Targets
Oncogenomics Section
Javed Khan M.D.
February 2005
•5-year-old male referred to NCI for a second opinion
•Injury to right groin/inguinal region while playing
•Rapidly evolving mass with initial resolution
•Diagnosis: hematoma
•Treatment: observation
•Several weeks later the mass enlarged
•Suspected malignancy
•Biopsy performed
Case History
Lymphoma?
Despite the availability of immunohistochemistry, cytogenetics, and molecular techniques,
incorrect diagnoses are still made in some cases
Alveolar Rhabdomyosarcoma
Lymphoma
Rhabdomyosarcoma
Ewing’s
Neuroblastoma
Small Round Blue Cell Tumors (SRBCT)
Diagnostic Dilemmas
Rhabdomyosarcoma non-Hodgkin's Lymphoma
Origin Muscle Lymphoid
Treatment Chemotherapy Chemotherapy
Lumbar Puncture
Intra-thecal drugs
Yes Rarely
Yes No
Accurate Diagnosis is Essential for the Treatment of Small Round Blue Cell Tumors
Prognosis 20-60% survival 50-90% survival
Surgery
Radiation
Evolution of Translational Applications of Microarrays
Schena et al.
1. Cancer Diagnosis using Artificial Neural Networks (ANN)
2. Prognosis Prediction using ANNs
3. Array-CGH Investigation of Genomic Imbalances & Characterization of Tumor Progression Models
Translational Applications of Microarrays Outline
Hypothesis
• Cancers belonging to a given diagnosis have diagnosis-specific gene expression profiles
Applications of Microarrays for Tumor Diagnosis
Or
• Intrinsic genomic instability of tumors leads to extensive random fluctuations in global gene expression such that no two cancers have similar profiles.
Multidimensional Scaling (MDS)
•Alveolar Rhabdomyosarcoma (ARMS cell lines)
•1238 element cDNA microarray
•2-Class problem comparing ARMS vs. “others”
•First report to demonstrate that cancers of a given diagnosis have similar gene expression profiles
•Utilized visualization (MDS) and clustering tools
Do Cancers Exhibit Diagnostic Specific Gene Expression Profiles?
Khan et al., Cancer Research , 58, 5009-5013, 1998
Using unsupervised clustering methods we demonstrated: Cancers (ARMS) belonging to a
specific type have similar gene expression profiles, raising the possibility of applying
these profiles to diagnostics.
Can gene expression profiles be used to reliably diagnose cancers belonging to multiple classes?
Unknown n=25
Non-SRBCT n=5
Khan, Wei, Ringnér et al. Nat Med. 7: 673-9, 2001
Lymphoma (n=8)
Rhabdomyosarcoma (n=20)
Ewing’s (n=23)
Neuroblastoma (n=12)
6567 element cDNA Microarray
Unsupervised analysis by Principal Component Analysis (PCA) showed no diagnosis-specific clustering
RMS
NB
EWS
BL
Why Artificial Neural Networks (ANNs)?
•Supervised
•Pattern recognition algorithms
•Modeled on the human neuron/brain
•Learning from prior experience by error minimization
APPLICATIONS
•Defense
•Voice and handwriting recognition
•Fingerprint Recognition
•Diagnosis of Arrhythmias
•Diagnosis of Myocardial Infarcts
•Interpreting Mammograms, Radiographs/MRI
•Input = any type of data, e.g. gene expression
•Output = any given number (1)
ANN Training & Validation of 63 SRBCT samples
3750 Trained Models
Output: (0-1)
Ideal output, e.g. for EWS:
EWS RMS BL NB
1   0   0  0
25 Unknown Test Samples
Gene Minimization
Increasing Number of Ranked Genes in Order
Identified minimal top 96 (1.5%) that
perfectly classified all 4 SRBCT classes
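The gene-minimization step above can be sketched as a search over increasing prefixes of the ranked gene list. This is a minimal sketch, not the authors' procedure: `minimal_top_genes`, `count_errors`, the tested sizes, and the toy data are all hypothetical, and the real error count would come from retraining the ANNs on each candidate subset.

```python
def minimal_top_genes(ranked_genes, count_errors, sizes):
    """Return the smallest tested top-N prefix of ranked_genes that
    classifies every sample correctly, or None if no tested size does."""
    for n in sorted(sizes):
        subset = ranked_genes[:n]
        if count_errors(subset) == 0:
            return subset
    return None


# Hypothetical toy data: 12 ranked "genes"; we pretend classification
# becomes perfect once g0..g3 are all included.
ranked = [f"g{i}" for i in range(12)]
needed = {"g0", "g1", "g2", "g3"}


def toy_errors(subset):
    # Stand-in for "number of misclassified samples" after retraining.
    return len(needed - set(subset))


best = minimal_top_genes(ranked, toy_errors, sizes=[1, 2, 4, 8, 12])
print(len(best))  # 4

# The 1.5% on the slide is simply 96 of the 6567 array elements.
print(round(96 / 6567 * 100, 1))  # 1.5
```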
How well do these top 96 genes perform?
EWS NB RMS BL
Hierarchical Clustering using Top 96 Ranked Clones Resulted in “Perfect” Clustering
Identified 41 genes not previously reported
to be expressed in SRBCTs
Cancer Diagnosis? 25 Unknown Test Samples
Diagnostic Classification
•Highest Output Determines Classification
•Euclidean Distance from Ideal
•Construct Probability Distribution of Distances
•Calculate 95th Percentile
•Diagnosis Confirmed if Distance is within 95th Percentile
Distance from Ideal
Ideal Distance=0
Ideal output:
EWS RMS BL NB
1   0   0  0
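The classification steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the per-class reference distances are hypothetical, and the distances are assumed to come from samples of known diagnosis. The percentile uses plain linear interpolation, which may differ from the method used in the original work.

```python
import math

CLASSES = ["EWS", "RMS", "BL", "NB"]


def ideal_output(label):
    """One-hot ideal output, e.g. EWS -> [1, 0, 0, 0]."""
    return [1.0 if c == label else 0.0 for c in CLASSES]


def distance_from_ideal(output, label):
    """Euclidean distance between an ANN output vector and the ideal vector."""
    return math.dist(output, ideal_output(label))


def percentile(values, q):
    """q-th percentile (0-100) by linear interpolation between sorted values."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def classify(output, reference_distances):
    """Highest output picks the class; the diagnosis is confirmed only if the
    distance from ideal is within that class's 95th-percentile cutoff."""
    label = CLASSES[max(range(len(CLASSES)), key=lambda i: output[i])]
    dist = distance_from_ideal(output, label)
    cutoff = percentile(reference_distances[label], 95)
    return label, dist, dist <= cutoff


# Hypothetical reference distances per class (illustrative numbers only).
reference = {c: [0.05, 0.10, 0.12, 0.20, 0.30] for c in CLASSES}

confident = classify([0.9, 0.1, 0.0, 0.05], reference)
ambiguous = classify([0.5, 0.4, 0.3, 0.2], reference)
print(confident[0], confident[2])  # EWS True
print(ambiguous[0], ambiguous[2])  # EWS False
```

A sample whose highest output still lies far from the ideal vector is left unconfirmed rather than forced into a class, which is how a non-SRBCT test sample can be rejected.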
25 Unknown Test Samples
Cancer  Sensitivity (%)  Specificity (%)
EWS     93               100
BL      100              100
NB      100              100
RMS     96               100
ANN Diagnostic Classification
The expression profile of 96 genes can predict the diagnosis of SRBCT using ANNs
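For reference, per-class sensitivity and specificity of the kind reported above can be computed as in this minimal sketch. The labels below are hypothetical toy data, not the study's samples, and `None` is used here (as an assumption) to mark a sample whose diagnosis was not confirmed.

```python
def sensitivity_specificity(true_labels, predicted_labels, target):
    """Return (sensitivity %, specificity %) for one diagnostic class."""
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == target:
            if predicted == target:
                tp += 1  # correctly diagnosed as target
            else:
                fn += 1  # target sample missed or left unconfirmed
        else:
            if predicted == target:
                fp += 1  # other sample wrongly called target
            else:
                tn += 1  # other sample correctly not called target
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical toy labels; None = diagnosis not confirmed.
truth = ["EWS", "EWS", "EWS", "RMS", "BL", "NB"]
calls = ["EWS", "EWS", None, "RMS", "BL", "NB"]

sens, spec = sensitivity_specificity(truth, calls, "EWS")
print(round(sens, 1), spec)  # 66.7 100.0
```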
Cancer Prognosis
•The expression profiles of cancers contain “prognostic information” at presentation, prior to treatment.
•Computer algorithms can utilize this information to predict the outcome of individual patients with no prior knowledge
Hypothesis
Converse Hypothesis
Presentation Remission Relapse
Therapy
•Only a fraction of the original pretreatment tumor mass contains the “resistant clone” and this cannot be detected by whole tumor microarray experimentation.
Cancer Prognosis
Neuroblastoma
Incidence:
• 1 per 100,000 in children < 15 yrs in US
• The most common solid tumor for children <1 yr
• 7-10% of cancers of childhood
Survival:
• 75% under 1 year of age
• < 30% of children over 1 year old with advanced disease despite aggressive therapy
Known Prognostic Factors
• Age, Stage, Shimada Histology, MYCN amplification, Ploidy
• Other genes, such as TRKA, TRKB, hTERT, BCL2, FYN, CD44 and caspases
Neuroblastoma Prognosis
Age Stage
MYCN amplification Ploidy
Shimada Histology Risk
John Maris, Current Opinion in Pediatrics 2005, 17:7–13
Children’s Oncology Group (COG) Neuroblastoma Risk Stratification
•Low: 90% survival
•Intermediate: 70-90%
•High: 10-30%
Wei JS, Greer BT et al. Cancer Research Oct 1, 2004; 64(19)
42k cDNA Microarray
7 biological repeats
Survival Probability Curves of 49 Patients
Can ANNs predict survival status of each individual patient?
ANN Experimental Design using all clones
49 NB Samples
30 Alive (>3 yrs)
19 Deceased
Gene Expression Profiling
42578 Clones
25933 Unigenes
PCA & Train ANN
Output: 0=Alive 1=Dead
Leave-one-out(Test)
ANN prediction (88%) of 49 patients using 38k High Quality Clones in a leave-one-out strategy
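The leave-one-out strategy can be sketched as below: each sample is held out in turn, a model is trained on the rest, and the held-out sample is predicted. This is a minimal sketch under loud assumptions: a nearest-centroid classifier stands in for the PCA + ANN models of the study, and the two-feature toy data are hypothetical.

```python
import math


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def nearest_centroid_predict(train_x, train_y, sample):
    """Stand-in classifier (NOT an ANN): predict the label whose class
    centroid is closest to the sample."""
    centroids = {
        label: centroid([x for x, y in zip(train_x, train_y) if y == label])
        for label in set(train_y)
    }
    return min(centroids, key=lambda label: math.dist(sample, centroids[label]))


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, test on the held-out one."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Hypothetical toy data: output coding follows the slides (0=Alive, 1=Dead).
xs = [[0.1, 0.2], [0.0, 0.3], [0.2, 0.1], [0.9, 1.0], [1.1, 0.8], [1.0, 1.2]]
ys = [0, 0, 0, 1, 1, 1]
print(leave_one_out_accuracy(xs, ys))  # 1.0
```

Because the held-out sample never contributes to training, the resulting accuracy is an estimate of performance on unseen patients.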
0=Alive 1=Dead
Alive Dead
Output 0=Alive 1=Dead
ANNs (using all 38k clones) were able to predict outcome without prior knowledge of known risk factors
Can we predict the outcomes with a small number of genes?
TEST: 21 NB samples
(16 Alive, 5 Deceased)
Clone Minimization
(19 Genes)
PCA of NB tumor samples using 19 genes
Alive Dead
How well do these top 19 genes perform for predicting the outcome?
ANNs were retrained on the 35 NB training samples using the expression profiles of the 19
genes, and outcome was predicted on the 21 test samples
ANN prediction (98%) using 19 top ranked genes
St4-A-NB14, St2-NA-NB18
Output 0=Alive 1=Dead
Alive Dead
Survival probability using ANN-ranked top 19 genes
Performance of ANN-19 genes (Train and Test)
Which patients will benefit the most from the 19-gene prediction models?
•Ultra High-Risk
•Good High-Risk
Survival probability of high-risk patients using the top 19 genes
All high-risk Without MYCN-A
Patients with COG high-risk disease will benefit most from 19-gene prediction models
The expression of the 24 predictor clones (19 genes)
No single gene performs better than the 19 genes
Details of 19 predictor genes
•12 known genes, 7 ESTs; 2 previously described (MYCN & CD44)
•8 out of the 12 known genes are neural specific
•DLK1: human homologue of the Drosophila Delta gene
-expressed by developing neuroblasts
-activates Notch signaling pathway, inhibits neuronal differentiation
•ARC, MYCN, and SLIT3 are also involved in neural development
-higher expression in the poor-outcome tumors suggests a more aggressive, less differentiated phenotype, reminiscent of proliferating and migrating neural crest progenitors
•SLIT3, a neuronal axon repellent gene, was high; ROBO2, one of its receptors, was low in poor-risk tumors, suggesting these NB cells secrete a substrate to repel connecting axons and potentially prevent differentiation
•Three secreted proteins (DLK1, SLIT3, and PRSS3) in poor-risk tumors
Summary
1. Gene expression profiles contain prognostic information in the pre-treatment diagnostic samples for patients with NB
2. Identified 19 prognostic specific genes that performed the best
3. Ability to further partition current COG high-risk patients into ultra-high-risk and survivor groups
Future Direction
1. Develop a multiplex PCR based prognosis prediction assay / serum markers test
2. Validate in a larger cohort of patients: national trials
3. Biological studies of the role of selected genes in the tumorigenic process
4. Isotope-coded affinity tags (ICAT) analysis of ultra-high risk tumor samples
5. New treatment trial for ultra-high risk patients
Comparative Genomic Hybridization (CGH)
Known Genomic Alterations that Correlate with Poor Prognosis in NB
•MYCN amplification: 20-25%
•1p deletion: 30-36%
•17q gain: 70-80%
•11q deletion: 44%
Objectives
• Perform a systematic survey of genomic copy
number alterations in Neuroblastoma
• Identify genomic alterations that correlate with
stage and MYCN amplifications
• Infer a model for tumor progression
Chen QR, Bilke S et al.
Preparation of genomic DNA
DNA labeling
Hybridization
Image analysis
[Figure: CGH workflow. Harvest DNA from cancer cells, Klenow labeling, then hybridization to a metaphase spread, a BAC array, or a DNA array]
CGH
Platform        Resolution  Fold Sensitivity
cDNA            Gene        3-5
BAC             <1 MB       2
Metaphase CGH   5-10 MB     2
Neuroblastoma Sample and Dataset
•42k clone cDNA microarray
•12 neuroblastoma cell lines (MYCN-A & NA)
•73 tumor samples
● 20 Stage 1 tumor samples
● 53 Stage 4 tumors
 - 15 Stage 4S
 - 20 Stage 4 without MYCN amplification (4-)
 - 18 Stage 4 with MYCN amplification (4+)
Sensitivity of cDNA A-CGH to Detect Single Copy Number Changes
0.5 (Expected) equal to 0.9 (Observed)
Control      X chr  Expected ratio
46XY         1      0.5
46XX (Ref)   2      1.0
47XXX        3      1.5
48XXXX       4      2.0
49XXXXX      5      2.5
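The expected ratios in the control series above follow from dividing the number of X chromosomes in each control by the two copies in the 46XX reference. A minimal sketch of that arithmetic; the helper names are hypothetical, and the karyotype strings are parsed simply by counting the letter X:

```python
def count_x(karyotype):
    """Count X chromosomes in a karyotype string such as '47XXX'."""
    return karyotype.count("X")


def expected_x_ratio(karyotype, reference="46XX"):
    """Expected X-chromosome copy-number ratio of a control vs. the reference."""
    return count_x(karyotype) / count_x(reference)


for control in ["46XY", "46XX", "47XXX", "48XXXX", "49XXXXX"]:
    print(control, count_x(control), expected_x_ratio(control))
# 46XY 1 0.5
# 46XX 2 1.0
# 47XXX 3 1.5
# 48XXXX 4 2.0
# 49XXXXX 5 2.5
```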
Increased Sensitivity for Detecting Single Copy Changes using Topological t-statistics
Local genomic sequence mapping information + t-statistics
Sample Ratio Data vs. Self-Self Ratio Data
Bilke S, Chen QR, et al. Bioinformatics. 2004 Nov 11
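The idea of combining genomic position with a t-statistic can be illustrated with a sliding-window comparison of sample ratios against self-self control ratios. This is a simplified stand-in, not the published topological t-statistic: the Welch t-test, the window size, the assumption that ratios are log2-transformed and sorted by genomic position, and the toy numbers are all assumptions made for illustration.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t-statistic for two samples with possibly unequal variances."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / standard_error


def windowed_t(sample, self_self, window=5):
    """For each clone (in genomic order), compare the sample ratios in a
    window of neighbouring clones with the self-self ratios of the same clones."""
    half = window // 2
    scores = []
    for i in range(len(sample)):
        lo, hi = max(0, i - half), min(len(sample), i + half + 1)
        scores.append(welch_t(sample[lo:hi], self_self[lo:hi]))
    return scores


# Hypothetical log2 ratios for 10 neighbouring clones: the last five carry
# a small shift, as might be seen for a low-level gain.
self_self = [0.02, -0.03, 0.01, -0.01, 0.03, -0.02, 0.00, 0.02, -0.01, 0.01]
sample = [0.01, -0.02, 0.03, 0.00, -0.01, 0.31, 0.28, 0.33, 0.29, 0.30]

scores = windowed_t(sample, self_self)
print(abs(scores[1]) < 2, scores[8] > 5)  # True True
```

Pooling neighbouring clones is what gives the extra sensitivity: a shift too small to call on any single clone becomes significant when consistently present across a genomic window.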
Topological t-statistics of NB data set
Chen QR, Bilke S et al. BMC Genomics 2004, 5:70
1p36
17
Staging of Neuroblastoma
Stage 1: Localized tumor with complete gross excision
Stage 2: Localized tumor with incomplete gross excision
Stage 3: Unresectable unilateral tumor infiltrating across the midline
Stage 4: Distant Spread +/- MYCN Amplification
Stage 4S: Age <1 yr. Localized primary with metastasis to skin, liver, and/or bone marrow (<10% infiltration). Survival >90%
Rajagopalan et al. Nature Reviews 2003, Vol. 9. 695-701
Least Aggressive Most Aggressive
Stage 4S? (>90% survival)
Stage 1 Stage 2 Stage 3 Stage 4 Stage 4-MYCN-A
Tumor Progression Models
3 Principles for building models:
• All the stages arise from a common ancestor
• All changes within a parent genotype must be present in the
daughter (the inheritance signature). The daughter will
acquire additional genomic changes.
• Unobserved intermediate genotypes are possible but the
model with the smallest number of genotypes is utilized.
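The inheritance principle above amounts to a subset test on sets of genomic changes: a genotype can be a parent only if all of its changes also appear in the daughter. A minimal sketch, with hypothetical function names and placeholder alteration labels that are not taken from the data:

```python
def relationship(a, b):
    """Decide how two observed genotypes (sets of genomic changes) can be
    related under the inheritance signature."""
    a, b = set(a), set(b)
    if a <= b:
        return "a -> b"  # every change in a is inherited by b
    if b <= a:
        return "b -> a"
    # Each has changes the other lacks, so neither can be the parent:
    # they must branch from a (possibly unobserved) common ancestor.
    return "common ancestor"


def shared_changes(*genotypes):
    """Changes present in all genotypes: the candidate ancestral genotype."""
    return set.intersection(*(set(g) for g in genotypes))


# Placeholder alterations for three hypothetical genotypes.
g1 = {"gain_A"}
g2 = {"gain_A", "loss_B"}
g3 = {"gain_A", "loss_C"}

print(relationship(g1, g2))            # a -> b
print(relationship(g2, g3))            # common ancestor
print(sorted(shared_changes(g2, g3)))  # ['gain_A']
```

When two stages each carry private changes, a linear parent-to-daughter ordering between them is ruled out and a branching model is required.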
4+1- 4-
Tumor Progression Model from Array-CGH Data
All Possible Progression Models for 3 Stages
Stage 1
Stage 4+ Stage 4-
• “Linear Progression” is incompatible with our data
• Tumor type is determined early when the set of genomic changes is acquired
• Unobserved stages may be Neuroblastoma in situ (Beckwith, Perrin 1963)
[Figure: Genomic progression model for all 4 stages of neuroblastoma (Stage 1, Stage 4S, Stage 4-, Stage 4+). Regions labeled in the figure: 17q21-25; 2p25-24, 1p36; 7p15-14, 7p11-q11, 7q21-23; 3p21, 4p16, 11q13, 11q23, 14q11; 11q14, 11q21-25; 11q15, 11q12-13; 2q23, 2q35, 11p11; 17p11-q11]
Summary
• cDNA based array-CGH is an effective tool for genome-wide high-resolution DNA copy number measurements
• Possible to detect low copy number changes by using topological t-statistics
• Identified genomic alterations specific to stage and MYCN amplification
• Characteristic pattern of genomic imbalances allows the identification of a model of tumor progression for neuroblastoma
• These regions may harbor prognostic markers, tumor suppressor genes or oncogenes and potential drug targets.
Acknowledgements
Oncogenomic Section, Pediatric Oncology Branch, National Cancer Institute
• Jun Wei
• Braden Greer
• Sven Bilke
• Qingrong Chen
• Craig Whiteford
• Nicola Cenacchi
• Alexei Krasnoselsky
• Chang-Gue Son
GCRC, Germany
• Frank Westermann
• Frank Berthold
• Manfred Schwab
Cooperative Human Tissue Network/COG
• John Maris
The Children’s Hospital, Westmead, Australia
• Daniel Catchpoole
NHGRI
• Sean Davis
NCI
• Seth Steinberg
NICHD Brain and Tissue Bank
http://home.ccr.cancer.gov/oncology/oncogenomics/
•150 Normal Samples
•19 Different Organs
•30 Individuals
•18,927 Unique Genes
•Tool for identifying “cancer specific targets”
•Immune therapy, molecular targeted therapy
•Tool for identifying co-regulated genes
Genome Research March 2005
“... I seem to have been only like a boy playing on the seashore, and diverting myself in now and then finding a smoother pebble or a prettier shell than ordinary, whilst the great ocean of truth lay all undiscovered before me.” Isaac Newton