Flow Cytometry – Critical Assessment of
Population Identification Methods
Richard H. Scheuermann, Ph.D.
Director of Informatics
J. Craig Venter Institute
Single Cell Phenotypes
• Different cell types play different
physiological roles in the body
• Cell identity and function
(phenotype) is dictated by the subset
of genes/proteins expressed
• Alterations in the normal expressed
“parts list” can give rise to disease
Bruce Wetzel & Harry Schaefer, National Cancer Institute
http://en.wikipedia.org/wiki/Image:SEM_blood_cells.jpg
Flow Cytometry (FCM)
• a.k.a. Fluorescence Activated Cell Sorting (FACS™)
• Method:
Stain cell population with fluorescent reagents that bind to specific molecules, e.g.
fluorescein-conjugated anti-CD40 antibodies
Measure fluorescence properties of each cell using flow cytometer
• Direct and indirect measurement of single cell characteristics, e.g. cell size,
membrane protein expression, secreted protein expression, cell cycle state,
DNA ploidy, signal transduction activation
• Research uses: study normal and abnormal cell activation, differentiation and
function; biomarker discovery
• Clinical uses: diagnosis and
monitoring of leukemia,
lymphoma and
myeloproliferative disorders
How a Flow Cytometer Works
[Figure: fluidic system, optic system, electronics]
Traditional Flow Cytometry Analysis
• Subjective
• Time-consuming
• Doesn’t handle overlapping distributions well
• Sensitive to slight differences in fluorescence intensity distributions between samples
• Requires at least one 2D plot that clearly segregates the populations in question
Goal - group together cells with similar
characteristics
Traditional approach - manual gating 2D
at a time
FCM can measure many parameters simultaneously, e.g., BD LSR-II can produce data for up to 19 parameters for every cell in a given sample
FCM instrumentation & reagents
CyTOF Mass Cytometry
Automated Cell Population Identification
FLOCK
FLOCK is a density-based algorithmic method for the identification of unique cell populations in multi-dimensional FCM data
2D example
Divide with hyper-grids
Find dense hyper-regions
Merge neighboring dense hyper-regions
Clustering based on region centers
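The four steps above can be sketched in a few lines of Python. This is a minimal illustration of grid-based density clustering, not the FLOCK implementation: the function name `grid_density_cluster` and the parameters `bins` and `min_count` are hypothetical, and FLOCK's own binning, density thresholds and merging rules may differ. Note that the neighbor search below enumerates 3^d offsets, so it is only practical for a small number of dimensions.

```python
from collections import defaultdict, deque
from itertools import product
import math

def grid_density_cluster(points, bins=10, min_count=3):
    """Illustrative grid/density clustering; returns (labels, centers)."""
    dims = len(points[0])
    lo = [min(p[d] for p in points) for d in range(dims)]
    hi = [max(p[d] for p in points) for d in range(dims)]

    def cell_of(p):
        idx = []
        for d in range(dims):
            span = (hi[d] - lo[d]) or 1.0
            i = int((p[d] - lo[d]) / span * bins)
            idx.append(min(i, bins - 1))  # clamp the maximum into the last bin
        return tuple(idx)

    # 1) Divide the space with a hyper-grid and collect events per cell.
    cells = defaultdict(list)
    for n, p in enumerate(points):
        cells[cell_of(p)].append(n)

    # 2) Find dense hyper-regions (cells holding at least min_count events).
    dense = {c for c, members in cells.items() if len(members) >= min_count}

    # 3) Merge neighboring dense cells with a breadth-first flood fill.
    offsets = [o for o in product((-1, 0, 1), repeat=dims) if any(o)]
    region = {}
    n_regions = 0
    for start in sorted(dense):
        if start in region:
            continue
        region[start] = n_regions
        queue = deque([start])
        while queue:
            c = queue.popleft()
            for o in offsets:
                nb = tuple(ci + oi for ci, oi in zip(c, o))
                if nb in dense and nb not in region:
                    region[nb] = n_regions
                    queue.append(nb)
        n_regions += 1

    if n_regions == 0:
        return [-1] * len(points), []

    # 4) Region centers: mean of the events inside each merged region.
    sums = [[0.0] * dims for _ in range(n_regions)]
    counts = [0] * n_regions
    for c, r in region.items():
        for n in cells[c]:
            counts[r] += 1
            for d in range(dims):
                sums[r][d] += points[n][d]
    centers = [tuple(s / counts[r] for s in sums[r]) for r in range(n_regions)]

    # 5) Assign every event, including those in sparse cells, to the nearest center.
    labels = [min(range(n_regions), key=lambda r: math.dist(p, centers[r]))
              for p in points]
    return labels, centers
```

Events that fall in sparse cells are still labeled, because the final step assigns each event to its nearest region center rather than discarding it.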
[Figure: B cell populations; population labels: N1-3, UM1-2, UM3-4, PB, GSM, GNSM, DNM; axis labels: CD27, IgD, B220, CD24, CD38, IgG, A]
17 B Cell Populations in Blood
Proportion Change of PB/Plasma Cells
FlowCAP Challenge
• The goal of FlowCAP is to advance the development of computational methods for
the identification of cell populations of interest in flow cytometry data by providing the
means to objectively test these methods, initially by comparison to manual gating
analysis by experts using common datasets.
• FlowCAP Challenges
1) Design specific computational challenges
2) Collect de-identified datasets and distribute to algorithm development community
3) Collect challenge results from algorithm development community
4) Assess results in comparison with some gold standard or defined performance metric
• http://flowcap.flowsite.org
Flow Cytometry: Critical
Assessment of Population
Identification Methods
(FlowCAP)
FlowCAP Challenges - Summary
• FlowCAP-I: Cell population identification
i) completely automated, ii) manually tuned, iii) predefined population
numbers, iv) trained using manual gates
• FlowCAP-II: Sample classification/FCM biomarkers
i) HIV exposed but uninfected, ii) AML, iii) T cell responses following
HIV vaccination
• FlowCAP-III: Collection of challenges
i) rare cell population identification (EQAPOL), ii) survival
biomarkers, iii) sample classification (HVTN ICS), iv) manual gating
comparison (HIP-C lyoplate)
• FlowCAP-IV: Clinical outcome correlates
Time to AIDS progression from PBMCs stimulated with HIV antigens
FlowCAP-I Datasets
• Graft versus Host Disease (GvHD) - samples for finding cellular signatures to predict or correlate with early detection of GvHD.
• Diffuse Large B-cell Lymphoma (DLBCL) – lymph node biopsies from treated patients with histologically-confirmed DLBCL.
• Hematopoietic Stem Cell Transplant (HSCT) – samples derived from mouse hematopoietic stem cell transplant experiments.
• Symptomatic West Nile Virus (WNV) – PBMC from patients with symptomatic West Nile virus infection stimulated in vitro with WNV peptide pools.
• Normal Donors (ND) – differences in the response of PBMC to various stimuli for a set of healthy donors, including both cell surface and intracellular markers.
Dataset   # Samples   # Events   # Colors   # Markers (incl. FSC/SSC)   Provider
GvHD      12          14,000     4          6                           BCCRC & TreeStar
DLBCL     30          5,000      3          5                           BCCRC
HSCT      30          10,000     4          6                           BCCRC
WNV       13          100,000    6          8                           McMaster
ND        30          17,000     10         11                          Amgen
F-measure
precision = tp/(tp + fp)
recall = tp/(tp + fn)
F-measure = 2 × precision × recall/(precision + recall)
[Figure: manual gating result vs. algorithmic result]
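As a worked illustration of the precision and recall definitions above, the following sketch computes the F-measure (the harmonic mean of precision and recall) for one algorithmic population against one manual gate, with each population given as a set of event indices. The aggregate in `overall_f_measure` (a size-weighted best match per manual population) is an assumption about how per-population scores might be combined; the slides do not spell this out. Function names are hypothetical.

```python
def f_measure(manual, predicted):
    """F-measure of one predicted population vs. one manual gate (sets of event ids)."""
    tp = len(manual & predicted)   # events in both
    fp = len(predicted - manual)   # events only in the algorithmic result
    fn = len(manual - predicted)   # events only in the manual gate
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def overall_f_measure(manual_pops, predicted_pops):
    """Assumed aggregate: best-matching F per manual population, weighted by its size."""
    total = sum(len(m) for m in manual_pops)
    return sum(
        len(m) / total * max((f_measure(m, p) for p in predicted_pops), default=0.0)
        for m in manual_pops
    )

# Example: a manual gate of 4 events and a cluster sharing 2 of them.
score = f_measure({1, 2, 3, 4}, {3, 4, 5})  # precision 2/3, recall 1/2
```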
Comparison with Manual Gating
FlowCAP-I Results
Challenge 1 – Completely Automated
FlowCAP-II
• Sample classification challenges, with sample labels provided
for half of samples up front for training purposes
Challenge 1: HIV exposed in utero and uninfected vs not exposed –
find cell populations that can discriminate between the two groups
using blood samples taken 6 months after birth and stimulated through
TLRs
Challenge 2: AML versus healthy – blood and bone marrow with 6
different marker combinations
Challenge 3: Identify antigen stimulation groups post HIV vaccine –
Gag and Env antigen stimulated T cells ex vivo using blood samples
collected ~ 10 months after vaccination
FlowCAP-II Results
• Challenge 1
Too difficult?
Can we really expect to
predict in utero exposure
to HIV without infection?
• Challenge 2
Too easy?
Many methods showed
perfect classification
accuracy
But…
Manual Gating “Gold Standard”
Misclassification
• With one exception,
misclassification was
randomly distributed
• Many methods
misclassified the
same sample (#340)
as AML
FlowCAP Summary
• Several automated algorithms for population identification are able to
closely match manual gating results with good performance
• Both model-based and non-model-based approaches performed well, but
different methods produced better results on different datasets
• Merging results from multiple algorithms provided further improvements
• Excellent performance of methods to identify cell-based biomarkers for
sample classification
• Manuscript published in Nature Methods
Project Applications
• Respiratory Pathogen Research Center (RPRC)
T cell responses during severe RSV infection
Biomarkers of poor respiratory function in premature infants
• La Jolla Institute for Allergy and Immunology Human
Immune Profiling Center (LJI HIPC)
T cell responses during latent and active Mtb infection and
vaccination
T cell responses during mild and severe Dengue virus infections
• UCSD Center for Advanced Laboratory Medicine (CALM)
Diagnosis and prognosis of CLL and AML
Computational Analysis Pipeline
Data transformation: FCSTrans, logicle
Filtering: DAFi
Alignment: GaussNorm
Cell population identification: FLOCK-cm
Mapping: FlowMap-FR
Comparative analysis: Wilcoxon rank-sum (RS) test
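The comparative-analysis step can be illustrated with a small, dependency-free Wilcoxon rank-sum test, for example to compare the per-sample proportion of one cell population between two groups. This is a sketch under stated assumptions: it uses the normal approximation without a tie correction to the variance, the function name `rank_sum_test` is hypothetical, and the input values shown are made up for illustration. A statistics library would normally be preferred in practice.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test; returns (U, z, p) for group x vs. group y."""
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    n1, n2 = len(x), len(y)

    # Sum the ranks of group x, giving tied values their average rank.
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    u = rank_sum_x - n1 * (n1 + 1) / 2           # Mann-Whitney U for group x
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))         # two-sided normal-approximation p-value
    return u, z, p

# Hypothetical population proportions (%) in two sample groups.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [6.0, 7.0, 8.0, 9.0, 10.0]
u, z, p = rank_sum_test(group_a, group_b)
```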
[Figure: BV605_CCD6 intensity distributions for each filtered sample file (*_filt.fcs), shown in two panels: original and normalized]
Diagnostic Application: Chronic Lymphocytic
Leukemia (CLL)
Leukemia staining panels
2016
CLL challenges/opportunities
• Difficult to cleanly
separate CLL due to
overlapping expression,
especially in MRD
• Prognostic significance
of different T cell subsets
• Significance of CLL
subpopulations
• Increased complexity of
results using 10 color
panel
[Figure labels: 4 color, 10 color]
CLL study design
• CLL (11), no CLL (5) and MRD (4) samples from CALM
• Two 10-color CLL staining panels
• Initial filter on singlets, viable cells, and lymphocytes using DAFi
• Then filter on CLL (CD5+CD19+), T cell (CD5+CD19-), and B cell (CD5-CD19+) subsets using DAFi
• Cell population identification using FLOCK
• Panel #1 and CLL analysis only
• Results:
Improved marker-based definition of CLL
Better monitoring of MRD
CLL subtype identification
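The subset definitions in the study design (CD5+CD19+, CD5+CD19-, CD5-CD19+) can be illustrated with plain threshold gating. This is not DAFi, which is a separate method; the sketch simply swaps in fixed cutoffs to show how the three marker-based subsets partition the events. The cutoff values, the event dictionary layout and the function names are all hypothetical.

```python
def classify_event(cd5, cd19, cd5_cut, cd19_cut):
    """Label one event by CD5/CD19 positivity relative to fixed cutoffs."""
    cd5_pos = cd5 >= cd5_cut
    cd19_pos = cd19 >= cd19_cut
    if cd5_pos and cd19_pos:
        return "CLL"      # CD5+CD19+
    if cd5_pos:
        return "T cell"   # CD5+CD19-
    if cd19_pos:
        return "B cell"   # CD5-CD19+
    return "other"        # CD5-CD19-

def split_subsets(events, cd5_cut=1000.0, cd19_cut=1000.0):
    """Group events (dicts with 'CD5' and 'CD19' intensities) into the subsets."""
    subsets = {"CLL": [], "T cell": [], "B cell": [], "other": []}
    for ev in events:
        label = classify_event(ev["CD5"], ev["CD19"], cd5_cut, cd19_cut)
        subsets[label].append(ev)
    return subsets

# Made-up intensities for four events.
events = [
    {"CD5": 2000.0, "CD19": 3000.0},
    {"CD5": 2500.0, "CD19": 10.0},
    {"CD5": 5.0, "CD19": 4000.0},
    {"CD5": 1.0, "CD19": 2.0},
]
subsets = split_subsets(events)
```

Each subset could then be passed separately to a population identification step, mirroring the filter-then-cluster order in the study design.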
CALM SOP
DAFi filtering results
[Figure: DAFi filters and composite FLOCK results for one sample (label extracted as "Sample #5 8"); axes: CD19 vs. CD5; gated subsets: B cells, T cells, CLL cells]
CLL results
• Identified 45 distinct cell populations using Panel #1
and initial CLL filter
• 10 of these are probably false positives due to
presence in normal samples => refine CLL definition
to CD5+CD19+CD45+CD10-CD79bint/-
• Of the remaining 35
28 are significantly different between CLL and normal
7 appear to be specific to one CLL case; these may
represent distinct CLL subtypes
CLL subset examples
[Figure: column labels: 5br, 5di, 5di, 5di/br, 5di/br; sample numbers: 31, 50, 11, 60, 74 and 3, 7, 66, 78, 76; row labels: Normal, CLL]
Improved CLL definition: CD10-CD79bint/-
[Figure: Normal vs. CLL]
DAFi Filtering: Original vs. New
66-normal 54-CLL 13-MRD
CLL samples with improved CLL definition
[Figure: samples 31, 11, 50, 60, 74; axis labels: CD10, CD5, CD79b, CD19]
Normal samples with improved CLL definition
[Figure: samples 3, 66, 7, 78, 76; axis labels: CD10, CD5, CD79b, CD19]
Minimal residual disease
13-MRD 23-MRD 44-MRD 74-CLL 66-normal
MRD samples with improved CLL definition
13 23 33 44
X: CD5; Y: CD19
X: CD10; Y: CD79b
Summary
• New computational and statistical methods (e.g. FLOCK, SWIFT, FLAME, flowMeans, OpenCyto) are becoming part of routine FCM data analysis and are replacing manual gating, especially for high dimensional data
• But caveat emptor: some methods are better than others
Don’t rely on summary statistics alone; look at the results
For cell population identification methods:
– Each population should show a unimodal distribution for all evaluated markers
– Marker expression patterns should show natural distributions
– Beware of over-partitioning
• FlowCAP challenges are providing objective means to judge the quality of the analysis results
• Application for improved diagnosis of CLL
Improved definition: CD19+CD5+ => CD19+CD5+CD10-CD79bdim
– Improved diagnostic accuracy
– Improved monitoring of MRD
Subtype classification
– Prognostic significance?
Acknowledgments
J. Craig Venter Institute
Yu “Max” Qian
Alex Lee
Hyunsoo Kim
Rick Stanton
Joyce Hsiao
Katie O’Nell
University of California, San Diego
Jack Bui
Broad Institute of MIT and Harvard
Jill Mesirov
Southern Methodist University
Monnie McGee
Mengya Liu
La Jolla Institute for Allergy and Immunology
Alex Sette
Bjoern Peters
Cecilia Arlehamn
University of Rochester
Ignacio Sanz
David Topham
Texas Advanced Computing Center
Weijia Xu
San Diego Supercomputer Center
Robert Sinkovits
Ilkay Altintas
Jianwu Wang
Shweta Purawat
FlowCAP Organizing Committee
Nima Aghaeepour, Stanford
Ryan Brinkman, BCCA
Greg Finak, FHCRC
Raphael Gottardo, FHCRC
Tim Mosmann, URMC
Richard H. Scheuermann, JCVI
Supported by NIH N01AI40076, R01EB008400,
HHSN272201200005C, U19AI118626