Page 1: Applications of machine learning in Computational Biologypeople.csail.mit.edu/dsontag/courses/ml13/slides/lecture... · 2013-12-14 · Applications of Machine Learning in Computational

Applications of Machine Learning in Computational Biology

Narges Razavian

New York University

Slides thanks to James Galagan (Broad Institute), Su-In Lee (Univ. of Washington), Rainer Breitling (Univ. of Glasgow), Christopher M. Bishop (ECCV 2004)

Central Dogma of Biology

Examples of Challenges involved

Slide Credit: Manolis Kellis

Application: Decoding Sequences and Motif Discovery

Motif Discovery

GCGTCTGACGGCGCACCGTTCGCGCTGCCGGCACCCCGGGCTCCATAATGAAAATCATGT

TCAGTAAGCTACACTCTGCATATCGGGCTACCAACGAAATGGAGTATCGGTCATGATCTT

GCCAGCCGTGCCTAAAAGCTTGGCCGCAGGGCCGAGTATAATTGGTCGCGGTCGCCTCGA

AGTTAGCTTATGCAATGCAGGAGGTGGGGCAAAGTTCAGGCGGATCGGCCGATGGCGGGC

GTAGGTGAAGGAGACAGCGGAGGCGTGGAGCGTGATGACATTGGCATGGTGGCCGCTTCC

CCCGTCGCGTCTCGGGTAAATGGCAAGGTAGACGCTGACGTCGTCGGTCGATTTGCCACC

TGCTGCCGTGCCCTGGGCATCGCGGTTTACCAGCGTAAACGTCCGCCGGACCTGGCTGCC

GCCCGGTCTGGTTTCGCCGCGCTGACCCGCGTCGCCCATGACCAGTGCGACGCCTGGACC

GGGCTGGCCGCTGCCGGCGACCAGTCCATCGGGGTGCTGGAAGCCGCCTCGCGCACGGCG

ACCACGGCTGGTGTGTTGCAGCGGCAGGTGGAACTGGCCGATAACGCCTTGGGCTTCCTG

TACGACACCGGGCTGTACCTGCGTTTTCGTGCCACCGGACCTGACGATTTCCACCTCGCG

TATGCCGCTGCGTTGGCTTCGACGGGCGGGCCGGAGGAGTTTGCCAAGGCCAATCACGTG

GTGTCCGGTATCACCGAGCGCCGCGCCGGCTGGCGTGCCGCCCGTTGGCTCGCCGTGGTC

ATCAACTACCGCGCCGAGCGCTGGTCGGATGTCGTGAAGCTGCTCACTCCGATGGTTAAT

GATCCCGACCTCGACGAGGCCTTTTCGCACGCGGCCAAGATCACCCTGGGCACCGCACTG

GCCCGACTGGGCATGTTTGCCCCGGCGCTGTCTTATCTGGAGGAACCCGACGGTCCTGTC

GCGGTCGCTGCTGTCGACGGTGCACTGGCCAAAGCGCTGGTGCTGCGCGCGCATGTGGAT

ATGGAGTCGGCCAGCGAAGTGCTGCAGGACTTGTATGCGGCTCACCCCGAAAACGAACAG

GTCGAGCAGGCGCTGTCGGATACCAGCTTCGGGATCGTCACCACCACAGCCGGGCGGATC

GAGGCCCGCACCGATCCGTGGGATCCGGCGACCGAGCCCGGCGCGGAGGATTTCGTCGAT

CCCGCGGCCCACGAACGCAAGGCCGCGCTGCTGCACGAGGCCGAACTCCAACTCGCCGAG

Sequence Annotation

Gene

Promoter Motif

A Generative Model

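[Figure: two-state model with states Background and Island; transition probabilities shown on the arrows: 0.15, 0.25, 0.75, 0.85]

Background emission probabilities: A: 0.25, T: 0.25, G: 0.25, C: 0.25

Island emission probabilities: A: 0.15, T: 0.13, G: 0.30, C: 0.42

TAAGAATTGTGTCACACACATAAAAACCCTAAGTTAGAGGATTGAGATTGGCA
GACGATTGTTCGTGATAATAAACAAGGGGGGCATAGATCAGGCTCATATTGGC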

A Generative Model (cont.)

[Figure: trellis with one hidden label, P or B, at each position of the observed sequence]

S: C A A A T G C G

P(S|P): A: 0.42, T: 0.30, G: 0.13, C: 0.15

P(S|B): A: 0.25, T: 0.25, G: 0.25, C: 0.25

P(Li+1|Li):

        Bi+1    Pi+1
Bi      0.85    0.15
Pi      0.25    0.75
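
As a minimal sketch of how this generative model assigns a probability to a sequence together with one particular labelling, the following Python multiplies the start, transition and emission terms along the chain. The transition and emission values are copied from the slide above; the uniform start distribution and the two example labellings are assumptions made only for illustration, since the slide does not give them.

```python
import math

# Transition probabilities P(L_{i+1} | L_i), copied from the slide.
TRANS = {"B": {"B": 0.85, "P": 0.15}, "P": {"B": 0.25, "P": 0.75}}
# Emission probabilities P(S_i | L_i), copied from the slide.
EMIT = {
    "P": {"A": 0.42, "T": 0.30, "G": 0.13, "C": 0.15},
    "B": {"A": 0.25, "T": 0.25, "G": 0.25, "C": 0.25},
}
# Assumption: the slide gives no start distribution, so a uniform one is used.
START = {"B": 0.5, "P": 0.5}


def joint_log_prob(seq, labels):
    """Return log P(S, L) for one sequence S and one labelling L."""
    if len(seq) != len(labels):
        raise ValueError("sequence and labels must have the same length")
    logp = math.log(START[labels[0]]) + math.log(EMIT[labels[0]][seq[0]])
    for i in range(1, len(seq)):
        logp += math.log(TRANS[labels[i - 1]][labels[i]])
        logp += math.log(EMIT[labels[i]][seq[i]])
    return logp


S = "CAAATGCG"  # the sequence shown on the slide
all_background = joint_log_prob(S, "BBBBBBBB")   # hypothetical labelling
promoter_middle = joint_log_prob(S, "BPPPPBBB")  # hypothetical labelling
```

Comparing such scores across labellings is the idea behind decoding: the labelling with the highest joint probability is the preferred annotation.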

Fundamental HMM Operations

Decoding
• Given an HMM and sequence S
• Find a corresponding sequence of labels, L
• Computational biology: annotate pathogenicity islands on a new sequence

Evaluation
• Given an HMM and sequence S
• Find P(S|HMM)
• Computational biology: score a particular sequence (not as useful for this model; will come back to this later)

Training
• Given an HMM without parameters and a set of sequences S
• Find the transition and emission probabilities that maximize P(S | params, HMM)
• Computational biology: learn a model for sequences composed of background DNA and pathogenicity islands
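
The decoding and evaluation operations can be sketched with the Viterbi and forward recursions. This is an illustrative Python sketch, not code from the lecture: it reuses the two-state P/B parameters from the generative-model slide and assumes a uniform start distribution, which the slides do not specify. Training (for example with Baum-Welch) is left out to keep the sketch short.

```python
import math

STATES = ("B", "P")
TRANS = {"B": {"B": 0.85, "P": 0.15}, "P": {"B": 0.25, "P": 0.75}}
EMIT = {
    "P": {"A": 0.42, "T": 0.30, "G": 0.13, "C": 0.15},
    "B": {"A": 0.25, "T": 0.25, "G": 0.25, "C": 0.25},
}
START = {"B": 0.5, "P": 0.5}  # assumption: not given on the slides


def viterbi(seq):
    """Decoding: return the most probable label sequence L for S."""
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for ch in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda r: score[r] + math.log(TRANS[r][s]))
            pointers[s] = prev
            new_score[s] = (score[prev] + math.log(TRANS[prev][s])
                            + math.log(EMIT[s][ch]))
        back.append(pointers)
        score = new_score
    # Trace back from the best final state.
    path = [max(STATES, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return "".join(reversed(path))


def forward(seq):
    """Evaluation: return P(S | HMM), summed over all label sequences."""
    alpha = {s: START[s] * EMIT[s][seq[0]] for s in STATES}
    for ch in seq[1:]:
        alpha = {s: EMIT[s][ch] * sum(alpha[r] * TRANS[r][s] for r in STATES)
                 for s in STATES}
    return sum(alpha.values())


decoded = viterbi("CAAATGCG")
likelihood = forward("CAAATGCG")
```

Both recursions cost time linear in the sequence length, whereas enumerating every labelling grows exponentially.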

Application: Modeling Protein Families

Modeling Protein Families

• Given amino acid sequences from a protein family, how can we find other members?
– Can search databases with each known member – not sensitive
– More information is contained in full set

• The HMM Profile Approach
– Learn the statistical features of protein family
– Model these features with an HMM
– Search for new members by scoring with HMM

UBE2D2 FPTDYPFKPPKVAFTTRIYHPNINSN-GSICLDILR-------------SQWSPALTISK

UBE2D3 FPTDYPFKPPKVAFTTRIYHPNINSN-GSICLDILR-------------SQWSPALTISK

BAA91697 FPTDYPFKPPKVAFTTKIYHPNINSN-GSICLDILR-------------SQWSPALTVSK

UBE2D1 FPTDYPFKPPKIAFTTKIYHPNINSN-GSICLDILR-------------SQWSPALTVSK

UBE2E1 FTPEYPFKPPKVTFRTRIYHCNINSQ-GVICLDILK-------------DNWSPALTISK

UBCH9 FSSDYPFKPPKVTFRTRIYHCNINSQ-GVICLDILK-------------DNWSPALTISK

UBE2N LPEEYPMAAPKVRFMTKIYHPNVDKL-GRICLDILK-------------DKWSPALQIRT

AAF67016 IPERYPFEPPQIRFLTPIYHPNIDSA-GRICLDVLKLP---------PKGAWRPSLNIAT

UBCH10 FPSGYPYNAPTVKFLTPCYHPNVDTQ-GNICLDILK-------------EKWSALYDVRT

CDC34 FPIDYPYSPPAFRFLTKMWHPNIYET-GDVCISILHPPVDDPQSGELPSERWNPTQNVRT

BAA91156 FPIDYPYSPPTFRFLTKMWHPNIYEN-GDVCISILHPPVDDPQSGELPSERWNPTQNVRT

UBE2G1 FPKDYPLRPPKMKFITEIWHPNVDKN-GDVCISILHEPGEDKYGYEKPEERWLPIHTVET

UBE2B FSEEYPNKPPTVRFLSKMFHPNVYAD-GSICLDILQN-------------RWSPTYDVSS

UBE2I FKDDYPSSPPKCKFEPPLFHPNVYPS-GTVCLSILEED-----------KDWRPAITIKQ

E2EPF5 LGKDFPASPPKGYFLTKIFHPNVGAN-GEICVNVLKR-------------DWTAELGIRH

UBE2L1 FPAEYPFKPPKITFKTKIYHPNIDEK-GQVCLPVISA------------ENWKPATKTDQ

UBE2L6 FPPEYPFKPPMIKFTTKIYHPNVDEN-GQICLPIISS------------ENWKPCTKTCQ

UBE2H LPDKYPFKSPSIGFMNKIFHPNIDEASGTVCLDVIN-------------QTWTALYDLTN

UBC12 VGQGYPHDPPKVKCETMVYHPNIDLE-GNVCLNILR-------------EDWKPVLTINS

Human Ubiquitin Conjugating Enzymes
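
To illustrate how the statistical features of a family can be learned from an alignment like this one, here is a deliberately simplified Python sketch. It estimates per-column residue frequencies from the first six columns of seven of the sequences above and scores a candidate by log-odds. It models match positions only, with no insert or delete states, so it is a position-specific scoring matrix rather than a full profile HMM. The add-one pseudocount, the uniform background and the test sequences are assumptions for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

# First six alignment columns of UBE2D2, UBE2D3, BAA91697, UBE2D1,
# UBE2E1, UBCH9 and UBE2N, copied from the alignment above.
ALIGNMENT = ["FPTDYP", "FPTDYP", "FPTDYP", "FPTDYP",
             "FTPEYP", "FSSDYP", "LPEEYP"]


def match_emissions(alignment, pseudocount=1.0):
    """Estimate one emission distribution per alignment column.

    The pseudocount (an assumption) keeps unseen residues from getting
    probability zero.
    """
    columns = []
    for col in zip(*alignment):
        counts = Counter(col)
        total = len(col) + pseudocount * len(AMINO_ACIDS)
        columns.append({a: (counts[a] + pseudocount) / total
                        for a in AMINO_ACIDS})
    return columns


def log_odds_score(seq, columns, background=1.0 / len(AMINO_ACIDS)):
    """Sum over positions of log(P(residue | column) / P(residue | background))."""
    return sum(math.log(col[a] / background) for a, col in zip(seq, columns))


PROFILE = match_emissions(ALIGNMENT)
family_like = log_odds_score("FPTDYP", PROFILE)  # resembles the family
unrelated = log_odds_score("WWWWWW", PROFILE)    # hypothetical non-member
```

A positive score means the candidate fits the family columns better than the background. A full profile HMM adds insert and delete states so that sequences of different lengths can be scored.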

Profile HMM

[Figure: profile HMM with Start and End states, match states M1, ..., Mj, ..., MN, insert states I, I1, ..., Ij, ..., IN and delete states D1, ..., Dj, ..., DN. Match and insert states carry emission distributions over the amino acid alphabet. The example aligns E2EPF5, UBE2L1, UBE2L6 and UBE2H to the model.]

Using Profile HMMs

Decoding
• Find the sequence of labels, L, that maximizes P(L|S, HMM)
• Computational biology: align a new sequence to a protein family

Evaluation
• Find P(S|HMM)
• Computational biology: score a sequence for membership in the family

Training
• Find the transition and emission probabilities that maximize P(S | params, HMM)
• Computational biology: discover and model family structure

Application: Modeling Protein Dynamics

Background

• Proteins: molecular machines composed of a sequence of amino acid subunits

Background:

• Protein functional analysis pipeline

Crystallize to Get X-Ray Snapshot → Molecular Dynamics Simulations → Learn Probabilistic Model → Analyze and Predict

Image: H. Khanlou et al., “Durable Efficacy and Continued Safety of Ibalizumab in Treatment-Experienced Patients”, Infectious Diseases Society of America (IDSA), October 2011

Modeling Protein Tertiary Structure

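10 second Reminder! Probability Theory

• Sum rule: $p(x) = \sum_y p(x, y)$

• Product rule: $p(x, y) = p(y \mid x)\, p(x)$

• From these we have Bayes’ theorem: $p(y \mid x) = \frac{p(x \mid y)\, p(y)}{p(x)}$, with normalization $p(x) = \sum_y p(x \mid y)\, p(y)$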
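
A small numeric check can make the three rules concrete. The joint table below is hypothetical (the slides give no numbers). The sketch computes a conditional directly from the joint and again through Bayes' theorem with the normalizer expanded by the sum rule.

```python
# Hypothetical joint distribution p(x, y) over two binary variables.
JOINT = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.4}


def marginal_x(x):
    """Sum rule: p(x) = sum over y of p(x, y)."""
    return sum(p for (xi, _), p in JOINT.items() if xi == x)


def marginal_y(y):
    """Sum rule applied to the other variable."""
    return sum(p for (_, yi), p in JOINT.items() if yi == y)


def cond_y_given_x(y, x):
    """Product rule rearranged: p(y | x) = p(x, y) / p(x)."""
    return JOINT[(x, y)] / marginal_x(x)


def cond_x_given_y(x, y):
    """Product rule rearranged: p(x | y) = p(x, y) / p(y)."""
    return JOINT[(x, y)] / marginal_y(y)


def bayes_y_given_x(y, x):
    """Bayes' theorem, with the normalizer p(x) expanded by the sum rule."""
    numerator = cond_x_given_y(x, y) * marginal_y(y)
    evidence = sum(cond_x_given_y(x, yi) * marginal_y(yi) for yi in (0, 1))
    return numerator / evidence


direct = cond_y_given_x(1, 1)
via_bayes = bayes_y_given_x(1, 1)
```

Both routes give the same conditional probability, as they must.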

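10 second Reminder (cont.)! Decomposition

• Consider an arbitrary joint distribution $p(x_1, \ldots, x_K)$

• By successive application of the product rule: $p(x_1, \ldots, x_K) = p(x_K \mid x_1, \ldots, x_{K-1}) \cdots p(x_2 \mid x_1)\, p(x_1)$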

Directed Acyclic Graphs

• Joint distribution $p(x_1, \ldots, x_K) = \prod_i p(x_i \mid \mathrm{pa}_i)$, where $\mathrm{pa}_i$ denotes the parents of i

• No directed cycles

Undirected Graphs

• Provided $p(x) > 0$, the joint distribution is a product of non-negative functions over the cliques of the graph, $p(x) = \frac{1}{Z} \prod_C \psi_C(x_C)$, where $\psi_C(x_C)$ are the clique potentials and Z is a normalization constant

Undirected Graphical Models

• Pairwise Undirected graphical models (single and bivariate potentials only)

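Markov Random Field as a Factor Graph

$$P(X) = \frac{1}{Z} \prod_{i=1}^{n} f_i(X_i) \prod_{(i,j) \in E} f_{ij}(X_i, X_j)$$

$$Z = \int \prod_{i=1}^{n} f_i(X_i) \prod_{(i,j) \in E} f_{ij}(X_i, X_j)\, dX_1 \ldots dX_n$$

[Figure: factor graph over nodes X1, ..., Xn with single-variable factors f1, ..., fn and pairwise factors f12, f13, f34, f4n-1, f5n-1, f5n]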

Question:

• Each potential has some parameters. How do we estimate them from training data?

– Could do gradient ascent on the likelihood of the data (if we knew Z)

– Often an iterative process

• How do we compute Z?

– Belief propagation (next slides)

Message Passing

• Example

• Find marginal for a particular node
– for M-state nodes, the cost of naive summation is exponential in the length of the chain
– but we can exploit the graphical structure (conditional independences)

Message Passing

• Joint distribution

• Exchange sums and products

Message Passing

• Express as product of messages

• Recursive evaluation of messages

• Find Z by normalizing
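
The message-passing steps can be sketched for a short chain of discrete nodes. In this illustrative Python, the pairwise potentials are hypothetical numbers and the chain has no single-node factors. `chain_marginal` evaluates the forward and backward messages recursively, multiplies them at the chosen node and normalizes to obtain both the marginal and Z. `brute_force_marginal` sums over all joint states for comparison.

```python
from itertools import product

# Hypothetical pairwise potentials for a 4-node chain, M = 2 states per node.
# PSI[i][a][b] is the potential between node i (state a) and node i + 1 (state b).
PSI = [
    [[2.0, 1.0], [1.0, 3.0]],
    [[1.0, 2.0], [2.0, 1.0]],
    [[3.0, 1.0], [1.0, 1.0]],
]
M = 2
N = len(PSI) + 1


def chain_marginal(node):
    """Return (marginal of `node`, Z) using forward and backward messages."""
    # alpha[k][b]: everything to the left of node k summed out.
    alpha = [[1.0] * M]
    for psi in PSI:
        prev = alpha[-1]
        alpha.append([sum(prev[a] * psi[a][b] for a in range(M))
                      for b in range(M)])
    # beta[k][a]: everything to the right of node k summed out.
    beta = [[1.0] * M]
    for psi in reversed(PSI):
        nxt = beta[0]
        beta.insert(0, [sum(psi[a][b] * nxt[b] for b in range(M))
                        for a in range(M)])
    unnorm = [alpha[node][s] * beta[node][s] for s in range(M)]
    z = sum(unnorm)  # Z is found by normalizing the product of messages
    return [u / z for u in unnorm], z


def brute_force_marginal(node):
    """Same quantities by summing over all M**N joint states."""
    unnorm = [0.0] * M
    for states in product(range(M), repeat=N):
        weight = 1.0
        for i, psi in enumerate(PSI):
            weight *= psi[states[i]][states[i + 1]]
        unnorm[states[node]] += weight
    z = sum(unnorm)
    return [u / z for u in unnorm], z
```

The message-passing version does work proportional to N·M² for a chain of N nodes, while the brute-force version visits all M^N joint states.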

Belief Propagation

• Extension to general tree-structured graphs

• At each node:
– form product of incoming messages and local evidence
– marginalize to give outgoing message
– one message in each direction across every link

• No convergence guaranteed if there are loops!

Inference and Learning

• Data set

• Likelihood function (independent observations)

• Maximize (log) likelihood

Modeling Protein Tertiary Structure

• Optimize Pseudo-likelihood of training data, to estimate parameters

Application: Microarray Gene Expression Analysis

The dramatic consequences of gene regulation in biology

Same genome, different tissues

• Different physiology
• Different proteome
• Different expression pattern

Anise swallowtail, Papilio zelicaon

cDNA microarray schema

From Duggan et al. Nature Genetics 21, 10 – 14 (1999)

color code for relative expression

Hierarchical clustering

• Combine most similar genes into agglomerative clusters, build tree of genes

• Do the same procedure along the second dimension to cluster samples

• Display as a heatmap
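
The clustering procedure above can be sketched in plain Python. The expression matrix is hypothetical, and Euclidean distance with average linkage is one common choice rather than something the slides specify. The same routine is applied to the transposed matrix to cluster samples. Drawing the heatmap is omitted.

```python
import math

# Hypothetical expression matrix: one row per gene, one column per sample.
EXPRESSION = {
    "geneA": [1.0, 1.1, 5.0, 5.2],
    "geneB": [0.9, 1.0, 5.1, 5.0],
    "geneC": [4.0, 4.2, 0.5, 0.4],
    "geneD": [4.3, 4.0, 0.6, 0.5],
}


def euclidean(u, v):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_linkage(c1, c2, data):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(data[a], data[b]) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def agglomerate(data):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [(name,) for name in sorted(data)]
    merges = []
    while len(clusters) > 1:
        pairs = [(average_linkage(a, b, data), a, b)
                 for i, a in enumerate(clusters) for b in clusters[i + 1:]]
        dist, a, b = min(pairs)
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, dist))
    return merges


def transpose(data):
    """Swap genes and samples so the same routine clusters samples."""
    n = len(next(iter(data.values())))
    return {f"sample{j + 1}": [row[j] for row in data.values()]
            for j in range(n)}


GENE_TREE = agglomerate(EXPRESSION)
SAMPLE_TREE = agglomerate(transpose(EXPRESSION))
```

Each entry of the merge history records the two clusters joined and their distance, which is the information a dendrogram draws.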

Hierarchical clustering results

Chi et al., PNAS | September 16, 2003 | vol. 100 | no. 19 | 10623-10628 “Endothelial cell diversity revealed by global expression profiling”

Personalized cancer treatment

[Figure: drug sensitivity test (160 drugs) for ~100 patients at UWMC, alongside RNA levels of 30,000 genes (g1, ..., g30,000) in cancer cells]

• 30,000 features! (feature selection)

• Prior knowledge on drugs’ targets

• Publicly available RNA level data, >3000 patients

• Transfer learning, feature reconstruction

Other applications

• Predicting phenotype (symptoms) given:

– DNA sequence (…ACGTAGCTAGCTAGCTAGCTGATGCTAGCTACGTGCT…)
– RNA levels of genes
– Protein levels of genes
– Epigenetics (methylation)
– A few histologic features

• Predictive models can be:

– Generative (e.g. Bayesian network)
– Discriminative (e.g. regression, SVM, KNN)

Much more exciting research to come!