Analysis of Gene Expression at the Single-Cell Level

Guo-Cheng Yuan, Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Harvard School of Public Health. Bioconductor, July 31st, 2014.


Page 1: Analysis of Gene Expression at the Single-Cell Level

Analysis of Gene Expression at the Single-Cell Level

Guo-Cheng Yuan

Department of Biostatistics and Computational Biology

Dana-Farber Cancer Institute

Harvard School of Public Health

Bioconductor, July 31st, 2014

Page 2: Analysis of Gene Expression at the Single-Cell Level

bioconductor

Page 3: Analysis of Gene Expression at the Single-Cell Level

Methods to sequence the DNA and RNA of single cells are poised to transform many areas of biology and medicine.

--- Nature Methods

Page 4: Analysis of Gene Expression at the Single-Cell Level
Page 5: Analysis of Gene Expression at the Single-Cell Level

• “Recent technical advances have enabled RNA sequencing (RNA-seq) in single cells. Exploratory studies have already led to insights into the dynamics of differentiation, cellular responses to stimulation and the stochastic nature of transcription. We are entering an era of single-cell transcriptomics that holds promise to substantially impact biology and medicine.” (R. Sandberg, 2014. Nature Methods)

Page 6: Analysis of Gene Expression at the Single-Cell Level
Page 7: Analysis of Gene Expression at the Single-Cell Level

[Figure labels: Cell-type A, Cell-type B, Cell-type C, Cell-type D, Cell-type E, Cell-type F; Cell Division]

Page 8: Analysis of Gene Expression at the Single-Cell Level

R. Sandberg, 2014. Nature Methods

Page 9: Analysis of Gene Expression at the Single-Cell Level

Challenges in single-cell data analysis

• Characterize and distinguish technical and biological variability.

• Identify new and meaningful cell clusters.

• Identify the lineage relationship between different cell clusters.

• Characterize the dynamic process during cell-state transitions.

• Elucidate the transition of regulatory networks.

• Distinguish stochastic from real variation.

Page 10: Analysis of Gene Expression at the Single-Cell Level
Page 11: Analysis of Gene Expression at the Single-Cell Level

[Figure labels: CMP, GMP, MEP, CLP, MEP]

Guoji Guo, Eugenio Marco

Page 12: Analysis of Gene Expression at the Single-Cell Level

SPADE: a density-normalized, spanning tree model

Qiu et al. 2011, Nat Biotech, p. 886

Down-sample

Clustering, spanning tree

Visualization
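The three stages listed on this slide can be sketched in a few lines. This is a minimal illustration, not the SPADE implementation: the neighbour radius, the target density, the rule "keep with probability min(1, target / density)" and the centroid-linkage merge are all assumptions chosen for brevity, and every function name is hypothetical.

```python
import math
import random

def local_density(points, radius):
    """Count neighbours within `radius` of each point (the point itself included)."""
    return [sum(1 for q in points if math.dist(p, q) <= radius) for p in points]

def density_downsample(points, radius, target_density, rng):
    """Keep each cell with probability min(1, target_density / local density)."""
    dens = local_density(points, radius)
    return [p for p, d in zip(points, dens)
            if rng.random() < min(1.0, target_density / d)]

def centroid(cluster):
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))

def agglomerate(points, n_clusters):
    """Naive agglomerative clustering: repeatedly merge the two closest centroids."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        cents = [centroid(c) for c in clusters]
        i, j = min(
            ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))),
            key=lambda ab: math.dist(cents[ab[0]], cents[ab[1]]),
        )
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return clusters

def minimum_spanning_tree(nodes):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j) edges."""
    in_tree, edges = {0}, []
    while len(in_tree) < len(nodes):
        i, j = min(
            ((a, b) for a in in_tree for b in range(len(nodes)) if b not in in_tree),
            key=lambda ab: math.dist(nodes[ab[0]], nodes[ab[1]]),
        )
        in_tree.add(j)
        edges.append((i, j))
    return edges

# Toy data (made up): one abundant population and one rare population.
rng = random.Random(0)
dense = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(200)]
rare = [(rng.gauss(5, 0.3), rng.gauss(5, 0.3)) for _ in range(10)]
sampled = density_downsample(dense + rare, radius=0.5, target_density=5, rng=rng)
clusters = agglomerate(sampled, n_clusters=3)
tree = minimum_spanning_tree([centroid(c) for c in clusters])
```

Down-sampling by density is what keeps the rare population from being swamped by the abundant one before clustering; the tree is then drawn over cluster centroids rather than over individual cells.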

Page 13: Analysis of Gene Expression at the Single-Cell Level
Page 14: Analysis of Gene Expression at the Single-Cell Level
Page 15: Analysis of Gene Expression at the Single-Cell Level

Gene     Log2(CMP1/CMP2)
CD55     7.87
ICAM4    3.98
CD274    3.32
MPL      3.19
TEK      2.83
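To read these values on a linear scale, a log2 ratio of v corresponds to a 2^v-fold difference. The short sketch below only converts the numbers listed on the slide; the helper names are hypothetical, and how the underlying expression levels were normalized is not stated in the transcript.

```python
import math

# Log2(CMP1/CMP2) values as listed on the slide.
log2_ratios = {"CD55": 7.87, "ICAM4": 3.98, "CD274": 3.32, "MPL": 3.19, "TEK": 2.83}

def log2_ratio(level_a, level_b):
    """Log2 ratio of two positive expression levels given on a linear scale."""
    return math.log2(level_a / level_b)

def fold_change(log2_value):
    """Convert a log2 ratio back to a linear fold change."""
    return 2.0 ** log2_value

for gene, value in log2_ratios.items():
    print(f"{gene}: about {fold_change(value):.0f}-fold higher in CMP1")
```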

Page 16: Analysis of Gene Expression at the Single-Cell Level

Cancer Stem Cells

• Each cancer contains a highly heterogeneous cell population.

• Clonal evolution contributes to cancer heterogeneity.

• Cancer cells are hierarchically organized and maintained by cancer stem cells.

• How are the leukemia stem cells related to normal blood cell lineage? How do they differ?

Page 17: Analysis of Gene Expression at the Single-Cell Level

Single cell analysis of the mouse MLL-AF9 acute myeloid leukemia cells

Compilation of mouse cell surface antigens (Lai et al., 1998; eBioscience website)

Primer design for 300 multiplexed PCR (collaboration with Helen Skaletsky)

Microfluidic high-throughput real-time PCR (96.96 Array)

Guoji Guo, Assieh Saadatpour

Page 18: Analysis of Gene Expression at the Single-Cell Level

t-SNE analysis identifies similarities between cell-types

• t-SNE is a nonlinear dimension-reduction method that can identify patterns undetectable by PCA.

• t-SNE minimizes the Kullback-Leibler divergence between the distributions of pairwise similarities in the original space and in the low-dimensional map.

• Leukemia cells are more similar to GMPs than to HSCs.

• Leukemia cells are highly heterogeneous.
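The quantity t-SNE minimizes can be written down directly. The sketch below evaluates it for a given map, with one simplification that is an assumption of this example: a single global Gaussian bandwidth `sigma` replaces the per-point, perplexity-calibrated bandwidths of the real algorithm. It computes the cost only and performs no optimization; function names are hypothetical.

```python
import math

def pairwise_sq_dists(points):
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]

def joint_probabilities(sq_dists, kernel):
    """Normalise kernel(d2) over all ordered pairs i != j so the entries sum to 1."""
    n = len(sq_dists)
    w = [[kernel(sq_dists[i][j]) if i != j else 0.0 for j in range(n)] for i in range(n)]
    total = sum(map(sum, w))
    return [[v / total for v in row] for row in w]

def tsne_cost(high_dim, low_dim, sigma=1.0):
    """KL(P || Q): Gaussian similarities in data space, Student-t similarities in the map."""
    p = joint_probabilities(pairwise_sq_dists(high_dim),
                            lambda d2: math.exp(-d2 / (2 * sigma ** 2)))
    q = joint_probabilities(pairwise_sq_dists(low_dim),
                            lambda d2: 1.0 / (1.0 + d2))
    return sum(pij * math.log(pij / qij)
               for prow, qrow in zip(p, q)
               for pij, qij in zip(prow, qrow) if pij > 0)

# Toy data (made up): two tight pairs of cells in three dimensions.
cells = [(0, 0, 0), (0.1, 0, 0), (5, 5, 5), (5.1, 5, 5)]
faithful_map = [(0.0,), (0.1,), (5.0,), (5.1,)]   # keeps each pair together
scrambled_map = [(0.0,), (5.0,), (0.1,), (5.1,)]  # splits each pair apart
print(tsne_cost(cells, faithful_map), tsne_cost(cells, scrambled_map))
```

A map that keeps neighbouring cells together yields a lower cost than one that separates them, which is why cells with similar expression profiles end up close in the plot.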

Page 19: Analysis of Gene Expression at the Single-Cell Level

Mapping leukemia cells to normal hematopoietic cell hierarchy

• Use 33 common genes to map cell hierarchy.

• Mapping identifies two subtypes of leukemia cells.

• These cells are similar but not identical to their corresponding normal lineages.
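The transcript does not say how the mapping onto the normal hierarchy was computed. One simple way to map a cell onto reference cell types using a shared gene panel, shown purely as an assumed stand-in, is to assign it to the reference type whose mean profile correlates best with it. Gene names, profiles and function names below are hypothetical.

```python
import math

def pearson(u, v):
    """Pearson correlation of two equal-length, non-constant vectors."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

def map_to_reference(cell, reference_profiles, common_genes):
    """Return (best matching reference type, correlation with every type)."""
    x = [cell[g] for g in common_genes]
    scores = {name: pearson(x, [profile[g] for g in common_genes])
              for name, profile in reference_profiles.items()}
    return max(scores, key=scores.get), scores

# Hypothetical gene panel and mean profiles, for illustration only.
common_genes = ["gene1", "gene2", "gene3", "gene4"]
reference_profiles = {
    "HSC": {"gene1": 9.0, "gene2": 1.0, "gene3": 2.0, "gene4": 8.0},
    "GMP": {"gene1": 2.0, "gene2": 8.0, "gene3": 7.0, "gene4": 1.0},
}
leukemia_cell = {"gene1": 3.0, "gene2": 7.0, "gene3": 9.0, "gene4": 2.0}
best_type, scores = map_to_reference(leukemia_cell, reference_profiles, common_genes)
```

A high but imperfect correlation with the best match is the kind of pattern the last bullet describes: similar to a normal lineage, but not identical to it.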

Page 20: Analysis of Gene Expression at the Single-Cell Level

Coexpression networks are different among subtypes

[Figure panels: All Leukemia, Leukemia 1, Leukemia 2, GMP]

Page 21: Analysis of Gene Expression at the Single-Cell Level

Surani and Tischler, Nature 2012

Guo et al. Dev Cell 2010

Page 22: Analysis of Gene Expression at the Single-Cell Level

Dynamic clustering

[Figure panels: T = 1, T = 2, T = 3, T = 4]

Eugenio Marco, Bobby Karp, Lorenzo Trippa, Guoji Guo

Maximizing the penalized log-likelihood.

[Equation garbled in extraction; surviving fragments: log P(x | c), a(c), index c, exponent 2]

Page 23: Analysis of Gene Expression at the Single-Cell Level

Identifying bifurcation points and directions

More than 80% of the variance increase during bifurcation is attributed to a single (bifurcation) direction.

[Figure labels: ICM, TE, EPI, PE]
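The share of the variance increase that falls along one direction can be computed as follows. This sketch assumes the bifurcation direction is the unit vector joining the centroids of the two daughter clusters, and it uses population variances; both choices, like the function names, are assumptions of this example and not taken from the slides.

```python
import math

def mean_vec(cells):
    return [sum(coords) / len(cells) for coords in zip(*cells)]

def variance_along(cells, direction):
    """Variance of the cells after projection onto a unit vector."""
    mu = mean_vec(cells)
    proj = [sum((x - m) * d for x, m, d in zip(cell, mu, direction)) for cell in cells]
    return sum(p * p for p in proj) / len(cells)

def total_variance(cells):
    """Sum of per-gene variances (trace of the covariance matrix)."""
    mu = mean_vec(cells)
    return sum(sum((x - m) ** 2 for x, m in zip(cell, mu)) for cell in cells) / len(cells)

def bifurcation_variance_share(before, after_a, after_b):
    """Fraction of the total variance increase that lies along the bifurcation direction."""
    diff = [a - b for a, b in zip(mean_vec(after_a), mean_vec(after_b))]
    norm = math.sqrt(sum(d * d for d in diff))
    direction = [d / norm for d in diff]
    after = after_a + after_b
    gain_along = variance_along(after, direction) - variance_along(before, direction)
    gain_total = total_variance(after) - total_variance(before)
    return gain_along / gain_total

# Toy data (made up): a compact parent population splitting along the first axis.
before = [(0.1, 0.1), (-0.1, -0.1), (0.1, -0.1), (-0.1, 0.1)]
after_a = [(2.1, 0.1), (1.9, -0.1)]
after_b = [(-2.1, 0.1), (-1.9, -0.1)]
share = bifurcation_variance_share(before, after_a, after_b)
```

In this toy case the whole increase lies along the splitting axis, so the share is 1; on real data a share above 0.8 would correspond to the statement on the slide.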

Page 24: Analysis of Gene Expression at the Single-Cell Level

Modeling dynamics by bifurcation analysis

$U(x) = \frac{x^4}{4} + \frac{a x^2}{2} + b x$

$dx = -U'(x)\,dt$

I) $4a^3 + 27b^2 > 0$ [plot of U(x)]

II) $4a^3 + 27b^2 < 0$ [plot of U(x)]
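The two regimes can be checked numerically. The sketch assumes the quartic potential U(x) = x^4/4 + a x^2/2 + b x, so that stationary points satisfy x^3 + a x + b = 0 and the sign of 4a^3 + 27b^2 decides between one and three real roots. The grid search with bisection and all function names are illustrative choices.

```python
def potential(x, a, b):
    return x ** 4 / 4 + a * x ** 2 / 2 + b * x

def force(x, a, b):
    """Deterministic drift -U'(x)."""
    return -(x ** 3 + a * x + b)

def discriminant(a, b):
    return 4 * a ** 3 + 27 * b ** 2

def n_stable_states(a, b):
    """Two wells when the discriminant is negative, one otherwise."""
    return 2 if discriminant(a, b) < 0 else 1

def stable_states(a, b, lo=-5.0, hi=5.0, n_grid=1000):
    """Locate stable fixed points: the drift changes from positive to non-positive."""
    xs = [lo + (hi - lo) * k / n_grid for k in range(n_grid + 1)]
    found = []
    for left, right in zip(xs, xs[1:]):
        if force(left, a, b) > 0 >= force(right, a, b):
            for _ in range(60):  # bisection refinement
                mid = 0.5 * (left + right)
                if force(mid, a, b) > 0:
                    left = mid
                else:
                    right = mid
            found.append(0.5 * (left + right))
    return found

single_well = stable_states(a=1.0, b=0.0)   # discriminant > 0
double_well = stable_states(a=-1.0, b=0.0)  # discriminant < 0
```

With a = 1, b = 0 the only stable state is at x = 0; with a = -1, b = 0 the stable states sit at x = -1 and x = 1, separated by an unstable point at 0.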

Page 25: Analysis of Gene Expression at the Single-Cell Level

Modeling dynamics by bifurcation analysis

$dx = -U'(x)\,dt + \sigma\,dW(t)$

I) $4a^3 + 27b^2 > 0$ [plot of U(x)]

II) $4a^3 + 27b^2 < 0$ [plot of U(x)]

Page 26: Analysis of Gene Expression at the Single-Cell Level

Noise level σ has a large impact on lineage biases

σ = 1, σ = 0.5, σ = 2
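The effect of noise on lineage choice can be explored with an Euler-Maruyama simulation of the stochastic dynamics dx = -U'(x) dt + σ dW(t). The sketch assumes the quartic potential U(x) = x^4/4 + a x^2/2 + b x; the parameter values a = -1 and b = 0.1, the step size, the starting point x = 0 and the rule "final x > 0 means lineage 1" are all illustrative assumptions.

```python
import math
import random

def simulate_fate(a, b, sigma, rng, x0=0.0, dt=0.01, n_steps=1000):
    """Euler-Maruyama integration of dx = -U'(x) dt + sigma dW; returns the final x."""
    x = x0
    for _ in range(n_steps):
        drift = -(x ** 3 + a * x + b)
        x += drift * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return x

def lineage_bias(a, b, sigma, n_cells=200, n_steps=1000, seed=0):
    """Fraction of simulated cells that end on the positive side (x > 0)."""
    rng = random.Random(seed)
    finals = [simulate_fate(a, b, sigma, rng, n_steps=n_steps) for _ in range(n_cells)]
    return sum(x > 0 for x in finals) / n_cells

# Noise levels taken from the slide; a and b are made-up values for a tilted double well.
for sigma in (0.5, 1.0, 2.0):
    print(sigma, lineage_bias(a=-1.0, b=0.1, sigma=sigma))
```

With b > 0 the drift at x = 0 points toward the negative well, so at low noise nearly every cell follows it; raising σ lets more cells cross to the other side, which changes the lineage proportions.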

Page 27: Analysis of Gene Expression at the Single-Cell Level

Lineage bias due to perturbation of TF activity

Predicted lineage bias due to a 2-fold decrease of TF level

[Figure panels: Control, Perturbation; each shows U(x)]

Page 28: Analysis of Gene Expression at the Single-Cell Level

Experimental validation using Nanog mutant

[Figure labels: Nanog, PE, EPI]

Page 29: Analysis of Gene Expression at the Single-Cell Level

How do we infer dynamics without temporal information?

Page 30: Analysis of Gene Expression at the Single-Cell Level

Characterization of early bipotential progeny of Lgr5+ intestinal stem cells

Tae-Hee Kim, Assieh Saadatpour

Crosnier 2006. Nature Review

Page 31: Analysis of Gene Expression at the Single-Cell Level

Principal Curve Analysis Reconstructs Temporal Information

t-SNE plot indicates two distinct clusters, linked by a small number of transitional cells

Page 32: Analysis of Gene Expression at the Single-Cell Level

Principal Curve Analysis Reconstructs Temporal Information

t-SNE plot indicates two distinct clusters, linked by a small number of transitional cells

Principal curve analysis captures the overall trend of cell-state transition

Page 33: Analysis of Gene Expression at the Single-Cell Level

Inferred dynamic gene expression profile

[Figure: expression of the profiled genes (labels include Olfm4, Rnf43, Gapdh, Lgr5, Ascl2, Mki67 and Neurog3) in columns labeled 1 to 10]

Use the principal curve coordinate as a proxy for temporal evolution.
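Once a curve through the cells is available, the coordinate of a cell is the arc length at its closest point on the curve. The sketch below assumes the fitted principal curve is supplied as a polyline (an ordered list of points) and does not fit the curve itself; the function names and the toy coordinates are hypothetical.

```python
import math

def project_on_segment(p, a, b):
    """Return (distance from p to segment ab, fraction t in [0, 1] along ab)."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    t = 0.0 if denom == 0 else max(0.0, min(1.0, sum(u * v for u, v in zip(ap, ab)) / denom))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return math.dist(p, closest), t

def curve_coordinate(p, curve):
    """Arc length along the polyline `curve` at the point closest to p."""
    best_dist, best_s, travelled = math.inf, 0.0, 0.0
    for a, b in zip(curve, curve[1:]):
        seg_len = math.dist(a, b)
        d, t = project_on_segment(p, a, b)
        if d < best_dist:
            best_dist, best_s = d, travelled + t * seg_len
        travelled += seg_len
    return best_s

def order_cells(cells, curve):
    """Indices of the cells sorted by their coordinate along the curve."""
    return sorted(range(len(cells)), key=lambda i: curve_coordinate(cells[i], curve))

# Toy data (made up): an L-shaped curve and three cells near it.
curve = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
cells = [(1.1, 0.9), (0.2, 0.1), (0.9, -0.1)]
ordering = order_cells(cells, curve)
```

Sorting cells by this coordinate gives an ordering that can stand in for time, so a gene's expression can be plotted against it as if it were a time course.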

Page 34: Analysis of Gene Expression at the Single-Cell Level

Conclusions

• Single-cell genomics is a powerful technology for understanding cellular heterogeneity and hierarchy.

• Single-cell gene expression data analysis presents many new methodological challenges.

• It is a great time to develop algorithms and software for single-cell data analysis.

Page 35: Analysis of Gene Expression at the Single-Cell Level

Acknowledgement

Eugenio Marco

Assieh Saadatpour

Bobby Karp

Lorenzo Trippa

Paul Robson

Stuart Orkin

Guoji Guo

Ramesh Shivdasani

Tae-Hee Kim

Funding from NIH, HSCI