Subgroup Identification for Personalized (Stratified) Medicine

Page 1:

Subgroup Identification for Personalized (Stratified) Medicine

Lei Shen

Singapore, July 13, 2017

Page 2:

Outline

1. Confirming subgroup
• What if (we think) we know the subgroup?

2. Learning about subgroup
• How to (try to) find subgroups?

3. Learn-and-Confirm

Page 3:

Tailored Therapies

♦ Working Definition: a treatment that is shown to be more effective on average in one subgroup of patients than its complementary subgroup

♦ Need to identify & establish during drug development: complementary subgroups with differential treatment effects based on measurable characteristics of the patients prior to treatment (biomarkers)

Page 4:

Subgroup for Tailored Therapy

[Diagram: the entire population is split on marker g1.]

M+ (g1 = 1), subgroup of interest, group size 50%
• TRT response: -1.17
• SOC response: -0.09
• Treatment effect: -1.08

M− (g1 = 0)
• TRT response: -0.33
• SOC response: -0.20
• Treatment effect: -0.13

Page 5:

Example of a “Perfect” Subgroup

[Figure: bar chart of HbA1c reduction (%) for All Patients, M+ (50%), and M− (50%, no effect); axis ticks at 0, 0.75, and 1.5.]

“Perfect” = Efficacy is entirely attributed to a known subpopulation

Impact on drug development:
• All-comer design: 110 subjects
• M+ subpop only: 30 subjects
(α=0.05, power=90%, SD=1.2)
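The sample sizes quoted on this slide can be approximated with the standard two-arm formula for comparing means. The sketch below is not the author's calculation: it assumes a two-sided test, equal allocation, a normal approximation, and treatment effects of 1.5 in M+ and 0.75 in all-comers (values read off the axis ticks, so they are an assumption). The function name `per_arm_n` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def per_arm_n(delta, sd, alpha=0.05, power=0.90):
    """Per-arm sample size for a two-arm comparison of means (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)


# Assumed effects: 1.5 in M+; diluted to 0.75 when half the patients (M-) have no effect.
n_all = 2 * per_arm_n(delta=0.75, sd=1.2)    # total across both arms, all-comer design
n_mplus = 2 * per_arm_n(delta=1.5, sd=1.2)   # total across both arms, M+ only

print(n_all, n_mplus)
```

Under these assumptions the totals come out close to the slide's 110 and 30; the small difference may reflect rounding or a t-distribution correction. Halving the effect roughly quadruples the required sample size, which is the point of the comparison.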

Pages 6-10: [image-only slides; no text captured]

Page 11:

“Perfect” Subgroup: Key Points

♦ A mixture distribution can be hidden in plain sight

♦ Even a perfect subgroup would not tell us much about whether an individual patient will respond to treatment

• Positive Predictive Value = 66%, i.e. 1/3 of M+ patients will be non-responders

• Negative Predictive Value = 80%, i.e. 1/5 of M− patients will be responders

Page 12:

Confirming a Subgroup

♦ If we think we know the subgroup, prospective confirmation is (typically) required for regulatory approval.

♦ What testing strategy to use?

It depends.

Page 13:

Which Type of Subgroup?

[Figure: four panels, each plotting Response against Marker status (− / +) for Treatment and No treatment.]

Page 14:

How To Spend Your α

Suppose Marker(+) represents 50% of the population.

[Figure: Response against Marker status (− / +) for Treatment and No treatment; bars for All Patients (p=.01), M+ (p<.0001), and M− (no effect), with effect sizes labelled E and 2E. Note: Width of bar denotes relative sample size.]

Serial gate-keeping works
1. Test all-comers at α=0.05
2. If significant, test M+ at α=0.05

Most appropriate label
Indicated for sub-group M+ only

Page 15:

How To Spend Your α

Suppose Marker(+) represents 25% of the population.

[Figure: Response against Marker status (− / +) for Treatment and No treatment; bars for All Patients (p=.35), M+ (P=.001), and M− (no effect), with effect sizes labelled E and 2E. Note: Width of bar denotes relative sample size.]

Serial gatekeeping doesn’t work

Split works
• Test all-comers at α = 0.04
• Test sub-group M+ at α = 0.01

Most appropriate label
Indicated for sub-group M+ only

Page 16:

How To Spend Your α

Suppose Marker(+) represents 50% of the population.

[Figure: Response against Marker status (− / +) for Treatment and No treatment; bars with effect sizes labelled E and 2E and p-values P=.01, P=.001, and P=.20. Note: Width of bar denotes relative sample size.]

Both testing strategies work.

In real life, almost always better to have flexibility.

Most appropriate label
• Indicated for sub-group M+ only?
• Indicated for all patients, but works better in sub-group M+?
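The two strategies compared on the three preceding slides can be written as simple decision rules. This is a minimal sketch, not code from the talk: the function names are hypothetical, "p<.0001" is represented by its upper bound 0.0001, and for the third scenario it is assumed that P=.01 belongs to all-comers and P=.001 to M+.

```python
def serial_gatekeeping(p_all, p_sub, alpha=0.05):
    """Test all-comers first; test M+ at the same level only if all-comers is significant."""
    reject_all = p_all <= alpha
    reject_sub = reject_all and p_sub <= alpha
    return {"all": reject_all, "M+": reject_sub}


def split_alpha(p_all, p_sub, alpha_all=0.04, alpha_sub=0.01):
    """Test both hypotheses regardless of each other, each at its own share of alpha."""
    return {"all": p_all <= alpha_all, "M+": p_sub <= alpha_sub}


# (p_all, p_sub) per scenario; the assignment in the third scenario is an assumption.
scenarios = {
    "M+ is 50%, no effect in M-": (0.01, 0.0001),
    "M+ is 25%, no effect in M-": (0.35, 0.001),
    "M+ is 50%, some effect in M-": (0.01, 0.001),
}

for name, (p_all, p_sub) in scenarios.items():
    print(name, serial_gatekeeping(p_all, p_sub), split_alpha(p_all, p_sub))
```

In the second scenario the gatekeeper never opens because the all-comer test fails, so the strong M+ signal cannot be claimed; the split keeps a path to M+ at the cost of a stricter threshold for each test.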

Page 17:

Flexible Testing Scheme

Pre-specify:
♦ How to split α (“initial α-allocation”)
♦ How to move α (“α-propagation”)

Bretz et al. (2011), R package gMCP

How would it work if we had:
• p1=0.024
• p2=0.045
• p3=0.02
• p4=0.002
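The graph used in the talk is not captured in this transcript, so the following is only a sketch of the general graphical procedure of Bretz et al. applied to the four p-values above. The initial weights and the transition matrix `G` are hypothetical (α split equally between H1 and H2, with H1 → H3, H2 → H4, H3 → H2, H4 → H1), and the function name is an assumption; the weight and edge updates follow the published algorithm.

```python
def graphical_test(p, weights, G, alpha=0.05):
    """Sequentially rejective graphical procedure; returns rejected indices in order."""
    w = list(weights)
    G = [row[:] for row in G]
    active = set(range(len(p)))
    rejected = []
    while True:
        hits = [j for j in sorted(active) if p[j] <= w[j] * alpha]
        if not hits:
            return rejected
        j = hits[0]
        active.remove(j)
        rejected.append(j)
        # alpha-propagation: pass the weight of H_j along its outgoing edges
        new_w = {l: w[l] + w[j] * G[j][l] for l in active}
        # rewire the remaining graph so that paths through H_j are preserved
        new_G = {}
        for l in active:
            for k in active:
                denom = 1.0 - G[l][j] * G[j][l]
                if l == k or denom <= 0:
                    new_G[(l, k)] = 0.0
                else:
                    new_G[(l, k)] = (G[l][k] + G[l][j] * G[j][k]) / denom
        for l in active:
            w[l] = new_w[l]
            for k in active:
                G[l][k] = new_G[(l, k)]
        w[j] = 0.0


p_values = [0.024, 0.045, 0.02, 0.002]   # p1..p4 from the slide
initial_weights = [0.5, 0.5, 0.0, 0.0]   # hypothetical initial alpha-allocation
transitions = [                          # hypothetical alpha-propagation rules
    [0, 0, 1, 0],  # H1 -> H3
    [0, 0, 0, 1],  # H2 -> H4
    [0, 1, 0, 0],  # H3 -> H2
    [1, 0, 0, 0],  # H4 -> H1
]

order = graphical_test(p_values, initial_weights, transitions)
print(["H%d" % (j + 1) for j in order])
```

With this hypothetical graph, H2 (p2 = 0.045) fails at its initial level of 0.025 but is rejected once α has been passed on from H1 and H3. A fixed split could not recover it, which illustrates the value of pre-specified propagation.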

Pages 18-22: Example [image-only slides; no text captured]

Page 23:

Multi-population Tailoring Trials

Confirmatory testing for both the overall population and subgroup(s)

♦ Example 1: 4 tests
• Overall population, high dose vs. SOC
• Overall population, low dose vs. SOC
• Subgroup, high dose vs. SOC
• Subgroup, low dose vs. SOC

♦ Example 2: 4 tests
• Overall population
• Subgroup A
• Subgroup B
• Subgroup A∩B

Page 24:

Regulatory Decision-making Criteria

Millen et al. (2012) proposed two criteria for regulatory decision making:

1. Influence Condition: To enable overall population labeling, the beneficial effect of treatment must not be limited to only the predefined subpopulation

2. Interaction Condition: To support enhanced labeling for the predefined subpopulation, the treatment effect therein should be appreciably greater than that in the complementary subpopulation

Page 25:

Outline

1. Confirming subgroup
• What if (we think) we know the subgroup?

2. Learning about subgroup
• How to (try to) find subgroups?

3. Learn-and-Confirm

Page 26:

Development of Tailored Therapy

IDEAL - Prospective

Discovery
• Understand disease
• Understand how to intervene in the disease pathway
• Create a molecule
• Identify predictive biomarker

Development
• Tox testing
• Formulation development
• Assay development for the predictive biomarker
• Prototype a diagnostic device

Phase 1
• Normal volunteers
• Safety and tolerance
• PK/PD
• Biomarker assessment
• Diagnostic assessment

Phase 2
• Disease patients
• Dose response
• Verify efficacy and safety
• Confirm biomarker
• Confirm diagnostic

Phase 3
• Disease patients
• Replicate trials
• Confirm efficacy and safety based on validated biomarker and diagnostic

Joint Regulatory Submissions: Therapeutic & Diagnostic

Manufacture diagnostic on a commercial scale

Page 27:

Development of Tailored Therapy

REALITY

Discovery
• Understand disease
• Understand how to intervene in the disease pathway
• Create a molecule
• No known biomarkers – many possibilities

Development
• Tox testing
• Formulation development
• Assay development for Research Use Only
• Multiplex assays

Phase 1
• Normal volunteers
• Safety and tolerance
• PK/PD
• Biomarker assessment in normals?

Phase 2
• Disease patients
• Dose response
• Verify efficacy and safety
• Search for possible subgroups.

Phase 3
• Disease patients
• Replicate trials
• Confirm efficacy and safety
• Test subgroups found in Ph 2?
• Search for subgroups.

Regulatory Submission of Therapeutic & Bridge(?) Diagnostic

Manufacture diagnostic on a commercial scale

Page 29:

How much do we know, and when?

♦ Concurrent–Prospective
• We learn something new; revise ongoing plans
• The Concurrent piece
– Ongoing BLINDED Studies
• The Prospective piece
– Create revised Statistical Analysis Plan to incorporate new information

♦ Retrospective–Prospective
• We learn something new and review past data
• The Retrospective piece
– Reviewing past data
– Stored samples
• The Prospective piece
– Pre-specified hypothesis and analysis plan
– Then assay samples for markers of interest

Page 30:

(Less) Prospective Options

[Diagram 1, Enrichment Design: Register; Test Marker; Result; Marker Present: Randomize to Treatment A or Treatment B; Marker Absent: Stop.]

[Diagram 2: Test Marker; Result; Marker Present; Marker Absent; Randomize; Treatment A; Treatment B.]

[Diagram 3: Randomize; Treatment A; Treatment B.]

Page 31:

(Less) Prospective Options

[Diagram: Phase 3 with Study 3.1, Study 3.2, and Study 3.3; boxes labelled Results and SAP.]

Page 32:

Modified Adaptive Signature Design

Learn & Confirm paradigm within a single trial.

Evaluating Effect in all 1050 patients

2nd Stratified randomization for biomarker evaluation (stratified for prognostic factors and treatment group)
• Marker Exploratory Set: 350 Patients
• Marker Confirmatory Set: 700 Patients

Page 33:

Modified Adaptive Signature Design

[Diagram: samples are assayed in each set ("ASSAY SAMPLES"), leading to "ACTION".]

• Marker Exploratory Set: 350 Patients
• Marker Confirmatory Set: 750 Patients

• Identify Predictive Marker for Drug Effect
• Confirm Drug Effect in Marker-positive patients
• Confirm Predictive nature of marker

Page 34:

Retrospective-Prospective

Single-Arm Studies Support the Hypothesis for KRAS as a Biomarker for EGFr Inhibitors

References | Treatment (panitumumab or cetuximab) | No of patients (WT:MT) | Objective Response N (%), Mutant | Objective Response N (%), Wild
A. Liévre, et al. (AACR Proceedings, 2007) | cmab ± CT | 76 (49:27) | 0 (0) | 24 (49)
S. Benvenuti, et al. (Cancer Res, 2007) | pmab or cmab or cmab + CT | 48 (32:16) | 1 (6) | 10 (31)
W. De Roock, et al. (ASCO Proceedings, 2007) | cmab or cmab + irinotecan | 113 (67:46) | 0 (0) | 27 (40)
D. Finocchiaro, et al. (ASCO Proceedings, 2007) | cmab ± CT | 81 (49:32) | 2 (6) | 13 (26)
F. Di Fiore, et al. (Br J Cancer, 2007) | cmab + CT | 59 (43:16) | 0 (0) | 12 (28)
S. Khambata-Ford, et al. (J Clin Oncol, 2007) | cmab | 80 (50:30) | 0 (0) | 5 (10)

Page 35:

Subgroup Identification

Objective: Identify subgroup A that

• is defined by (a small number of) biomarker Xs

• consists of patients for whom the treatment effect is large (as defined in counterfactual models, Foster, Taylor, Ruberg 2011)

• can be declared with sufficient confidence to warrant further investment

Page 36:

Framework of Methods

Handle “Treatment” (“Predictive” biomarkers)

Select Subgroups

Multiplicity (Type 1 error, bias)

Page 38:

Framework of Methods

Handle “Treatment” (“Predictive” biomarkers)
• Transformation using random forest

Select Subgroups
• Trees, maximize purity

Multiplicity (Type 1 error, bias)
• Permutation

Page 40:

Framework of Methods

Handle “Treatment” (“Predictive” biomarkers)
• Transformation using random forest
• Incorporate T in summary statistic

Select Subgroups
• Trees, maximize purity
• Trees, maximize χ2 statistic

Multiplicity (Type 1 error, bias)
• Permutation
• Bootstrap

Page 41:

A Few More Methods

♦ Su et al. (2008) Interaction trees with censored survival data. International Journal of Biostatistics.

♦ (SIDES) Lipkovich et al. (2011) Subgroup identification based on differential effect search. Statistics in Medicine.

♦ Nguyen, Gu, Shen (2013) Two-step adaptive elastic net with random data splits. Midwest Biopharmaceutical Statistics Workshop.

♦ (QUINT) Dusseldorp, Van Mechelen (2013) Qualitative interaction trees: a tool to identify qualitative treatment-subgroup interactions. Statistics in Medicine.

♦ (TSDT) Shen, Ding, Battioui (2015) A Framework of Statistical Methods for Identification of Subgroups with Differential Treatment Effects in Randomized Trials. Applied Statistics in Biomedicine and Clinical Trials Design, Springer.

Page 42:

Framework of Methods

Handle “Treatment” (“Predictive” biomarkers)
1- Transformation using random forest
2- Incorporate T in summary statistic
3- Directly contrast 2 arms
4- One arm first, then the other
5- Model Marker × Trt

Select Subgroups
1- Trees, maximize purity
2- Trees, maximize χ2 statistic
3- Test regression coefficients

Multiplicity (Type 1 error, bias)
1- Permutation
2- Bootstrap
3- Subsampling
4- Cross-validation

Additional Features
• Optimize tuning parameters
• Condition on known prognostic markers
• Variable importance
• Bias correction
• Missing data

Page 43:

“Method Generator”

5 × 3 × 4 = 60 “methods”

What about “1-3-3”?
• Virtual twins
• … followed by penalized regression
• … with subsampling to control type I error

Tang (2016) The VG (Virtual Twins and GUIDE Method) for Subgroup Identification, MBSW presentation
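The count of 60 “methods” is the Cartesian product of the three option lists from the framework slide. A short sketch makes the numbering explicit; the option names are copied from that slide, while the variable names and the "i-j-k" key format are illustrative assumptions.

```python
from itertools import product

handle_treatment = [
    "Transformation using random forest",
    "Incorporate T in summary statistic",
    "Directly contrast 2 arms",
    "One arm first, then the other",
    "Model Marker x Trt",
]
select_subgroups = [
    "Trees, maximize purity",
    "Trees, maximize chi-square statistic",
    "Test regression coefficients",
]
multiplicity = [
    "Permutation",
    "Bootstrap",
    "Subsampling",
    "Cross-validation",
]

# Key each combination by its 1-based option numbers, e.g. "1-3-3".
methods = {
    f"{i}-{j}-{k}": (t, s, m)
    for (i, t), (j, s), (k, m) in product(
        enumerate(handle_treatment, 1),
        enumerate(select_subgroups, 1),
        enumerate(multiplicity, 1),
    )
}

print(len(methods))
print(methods["1-3-3"])
```

The entry "1-3-3" pairs the random-forest transformation with a test of regression coefficients and subsampling, which lines up with the three bullets describing the virtual-twins variant.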

Page 44:

So which method should we use?

♦ As statisticians, we enjoy developing new methods

♦ As drug developers, we are agnostic about the choice of method

♦ No single method is best in ALL applications

♦ Hence, the key is analysis optimization

Page 45:

Analysis Optimization

Three components:
1. Data generation (consistency)
2. Analysis methods (openness)
3. Performance measures (consistency)

Data Generation
• Web interface
• Standard datasets

BSID
• Open methods
• Standard output

Performance Measurement
• Web interface
• Standard summary

Page 46:

Data Generation – a Survey

Attribute | SIDES (2011) [1] | SIDES (2014) [2] | VT [3] | GUIDE [4] | QUINT [5] | IT [6]
n | 900 | 300, 900 | 400 - 2000 | 100 | 200 - 1000 | 300, 450
p | 5 - 20 | 20 - 100 | 15 - 30 | 100 | 5 - 20 | 4
response type | continuous | continuous | binary | binary | continuous | TTE
predictor type | binary | binary | continuous | categorical | continuous | ordinal, categorical
predictor correlation | 0, 0.3 | 0, 0.2 | 0, 0.7 | 0 | 0, 0.2 | 0
treatment assignment | 1:1 | 1:1 | ? | ~1:1 | ~1:1 | ?
# predictive markers | 0 - 3 | 2 | 0, 2 | 0, 2 | 1 - 3 | 0, 2
predictive effect(s) | higher order | higher order | higher order | N/A, simple, higher order | simple, higher order | simple
predictive M+ group size (% of n) | 15% - 20% | 50% | N/A, ~25%, ~50% | N/A, ~36% | ~16% - ~50% | N/A, ~25%, ?
# prognostic markers | 0 | 0 | 3 | 0 - 4 | 1 - 3 | 0, 2
prognostic effect(s) | N/A | N/A | simple, higher order | N/A, simple, higher order | simple, higher order | simple

Data-generating models named on the slide (column alignment not recoverable): “contribution model”; logit model (w/o and with subject-specific effects); linear model (on probability scale); “tree model”; exponential model

Page 47:

Performance Metrics – a Survey

Methods surveyed: SIDES (2011) [1], SIDES (2014) [2], VT [3], GUIDE [4], QUINT [5], IT [6]

Metrics reported across these methods:
• Selection rate
• Complete match rate
• Partial match rate
• Confirmation rate
• Treatment effect fraction
• Pr(complete match)
• Pr(partial match)
• Pr(selecting a subset)
• Pr(selecting a superset)
• Treatment effect fraction (updated def.)
• Finding correct X’s
• Closeness of the identified subgroup to the true A
• Closeness of the size of the identified subgroup to the size of the true A
• Properties of an estimator with respect to its target (symbols lost in extraction)
• Power
• Pr(selection at 1st or 2nd level splits of trees)
• Accuracy
• Pr(nontrivial tree)
• (RP1a) Pr(type I errors)
• (RP1b) Pr(type II errors)
• (RP2) Rec. of tree complexity
• (RP3) Rec. of splitting vars and split points
• (RP4) Rec. of assignments of observations to partition classes
• Frequencies of the final tree sizes
• Bias assessment via likelihood ratio and logrank tests
• Frequency of (predictor) “hits”

The metrics are grouped into three levels:
• Marker Level (testing)
• Subgroup Level (estimation)
• Subj. Level (prediction)

Page 48:

Performance Metrics

♦ Variable level (“Testing”)
• Important for knowledge
• # and % of predictors: truth vs. identified

♦ Subgroup level (“Estimation”)
• Important for next study
• Treatment effect in the identified subgroup
• Quantified impact: time, cost

♦ Patient level (“Prediction”)
• Important for clinical practice
• How well are patients classified?

Page 49:

Conditional Performance Metrics

Truth (but x1 very hard to find), 1000 simulations:
• M+ (x1 = 1): group size 50%, treatment effect 10
• M− (x1 = 0): group size 50%, treatment effect 0
• x2 = 1 and x2 = 0: group size 50% each

BSID Method A
• 900/1000: Null
• 100/1000: x1 = 1
• Unconditional: Size 0.95, Effect 5.5
• Conditional: Size 0.5, Effect 10

BSID Method B
• 900/1000: Null
• 50/1000: x1 = 1
• 50/1000: x2 = 1
• Unconditional: Size 0.95, Effect 5.25
• Conditional: Size 0.5, Effect 7.5
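The unconditional and conditional figures for the two methods can be reproduced with a few lines of arithmetic. The sketch below rests on two assumptions that are not stated in the transcript but do recover the slide's numbers: a "Null" result means the whole population is taken forward (size 1.0, average effect 5), and x2 is unrelated to x1, so the x2 = 1 subgroup has an average effect of 5. The function and field layout are hypothetical.

```python
def summarize(outcomes):
    """outcomes: list of (n_simulations, subgroup_size, true_effect, subgroup_found)."""
    total = sum(n for n, _, _, _ in outcomes)
    found = [(n, s, e) for n, s, e, hit in outcomes if hit]
    n_found = sum(n for n, _, _ in found)
    return {
        # averaged over all simulations, including those that found nothing
        "unconditional_size": sum(n * s for n, s, _, _ in outcomes) / total,
        "unconditional_effect": sum(n * e for n, _, e, _ in outcomes) / total,
        # averaged only over simulations that identified a subgroup
        "conditional_size": sum(n * s for n, s, _ in found) / n_found,
        "conditional_effect": sum(n * e for n, _, e in found) / n_found,
    }


method_a = summarize([
    (900, 1.0, 5.0, False),   # Null: no subgroup found (assumed: treat everyone)
    (100, 0.5, 10.0, True),   # x1 = 1, the true M+ subgroup
])
method_b = summarize([
    (900, 1.0, 5.0, False),
    (50, 0.5, 10.0, True),    # x1 = 1
    (50, 0.5, 5.0, True),     # x2 = 1 (assumed: half of these patients are M+)
])

print(method_a)
print(method_b)
```

Both methods have the same unconditional size, and their unconditional effects are close (5.5 versus 5.25). The conditional effects (10 versus 7.5) separate them more clearly, because only those expose that Method B returns the wrong subgroup half of the time it returns anything.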

Page 50:

The “Right” Subgroup?

[Figure: Response against a continuous Marker for Treatment and No treatment.]

What is the optimal cut-off?
What does ‘optimal’ mean?

Page 51:

The “Right” Subgroup?

[Diagrams: two candidate definitions of the subgroup of interest within the entire population.]

Split on g1 only
• M+ (g1 = 1): group size 50%; TRT response -1.17; SOC response -0.09; treatment effect -1.08

Split on g1 and g2
• M+: group size 25%; TRT response -1.39; SOC response -0.19; treatment effect -1.20

M− (g1 = 0): TRT response -0.33; SOC response -0.20; treatment effect -0.13

Page 52:

Frontier Plot

[Figure: axes labelled Number Needed to Treat and Number of Patients; annotation: Clinically Meaningful.]

Page 53:

Outline

1. Confirming subgroup
• What if (we think) we know the subgroup?

2. Learning about subgroup
• How to (try to) find subgroups?

A Bayesian interlude

3. Learn-and-Confirm

Page 54:

Bayesian Subgroup ID

♦ Effectively utilize prior knowledge
♦ Results from Bayesian analyses more interpretable
• Directly answer key questions of interest
♦ Bayesian subgroup identification
• Define subgroups and corresponding statistical models
• Specify (partition) prior probabilities
• Posterior analysis

Page 55:

Bayesian Subgroup ID: Biomarkers

m factors X1, X2, ..., Xm

For simplicity, consider dichotomous X’s (0 or 1)

Subgroups defined by specification of values of the X’s

Start by considering single-factor subgroups:
e.g. S = {all individuals with X11 = 1}

X’s may be prognostic, predictive, or neither

Page 56:

Bayesian Subgroup ID: Models

Response of an individual is

Y = B_k + T_j + error
B_k = baseline (prognostic) model involving factor k
T_j = treatment (predictive) model involving factor j

♦ Either B_k or T_j or both could be absent
♦ The models will have unknown parameters
♦ There are typically MANY possible models

Page 57:

Bayesian Subgroup ID: Prior

Interpretable prior inputs:
♦ o_i, the effect odds of X_i to X_1, defined as the prior relative odds that X_i has an effect compared to X_1. (Default: o_i = 1)
♦ Null control: specify p_0 and q_0, the prior probabilities that an individual has no treatment (predictive) effect and no baseline (prognostic) effect, respectively. (Default: p_0 = q_0 = 0.5)
♦ r_i is the ratio of the prior probability of the overall treatment model to the sum of the prior probabilities of the treatment models with i factor splits. (Default: r_i = 1)

These inputs determine the prior probability P(M) of a model M.

Objective prior distributions are also specified for the parameters of each model (e.g., treatment effect size)

Page 58:

Bayesian Subgroup ID: Posterior

Let M_i be the set of all models under which there is a predictive effect for individual i (i.e. a specification of the values of all of the factors).

Individual treatment effect probability is given by

P_i = \sum_{M_l \in M_i} P(M_l \mid \text{data})

Subgroup treatment effect probability is then given by the average of the P_i for all the individuals in the specified subgroup.

Page 59:

Bayesian Subgroup ID: Example

♦ 32 biomarkers
♦ Constant prior
♦ Pr(any predictive biomarker) = 0.3
♦ No. of models (≤1 predictive & ≤1 prognostic biomarkers): 3234
♦ Total prior probability of 1 allocated among all models

Page 60:

Bayesian Subgroup ID: Results

g2snp05 | Prior Prob | Posterior Prob
Model 1 | 0.00156 | 0.02844
Model 2 | 0.00156 | 0.15224
Model 3 | 0.00005 | 0.02968
Total | | 0.21

g2snp06 | Prior Prob | Posterior Prob
Model 4 | 0.00156 | 0.01079
Model 5 | 0.00156 | 0.05479
Total | | 0.07
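The Total column appears to be the sum of the posterior probabilities of the models that involve each marker, in the same way that the individual treatment effect probability sums posterior model probabilities. A minimal check of that reading, using only the numbers in the table (the dictionary layout is an illustrative assumption):

```python
# Posterior model probabilities from the table, grouped by the marker each model involves.
posterior = {
    "g2snp05": [0.02844, 0.15224, 0.02968],  # Models 1-3
    "g2snp06": [0.01079, 0.05479],           # Models 4-5
}

# Marker-level posterior probability: sum over the models involving that marker.
marker_prob = {marker: sum(probs) for marker, probs in posterior.items()}

for marker, prob in marker_prob.items():
    print(marker, round(prob, 2))
```

The sums round to 0.21 and 0.07, matching the totals shown. Each listed model starts from a prior of at most 0.00156, so the data have shifted a substantial share of posterior mass onto g2snp05.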

Page 61:

Bayesian Subgroup ID: Results

Constant prior: posterior prob = 0.21 (0.28)

Informative prior (Tier 1 markers 3 times as plausible as Tier 2 ones): posterior prob = 0.27 (0.36)

Page 62:

Bayesian Subgroup ID: Results

Page 63:

Outline

1. Confirming subgroup
• What if (we think) we know the subgroup?

2. Learning about subgroup
• How to (try to) find subgroups?

3. Learn-and-Confirm

Page 64:

A Simulated Example

Available data
♦ Ph2a study with 240 patients (180 vs. 60)
♦ Ph2b study with 270 patients (180 vs. 90)

Biomarkers
♦ 100 candidate genotypic markers
• G1-G30: more plausible
• g31-g100: less plausible
♦ Predictive biomarkers: G9, G20, G25, g100

Page 65:

A Simulated Example: Analysis #1

Traditional analysis:

1. Analyze one marker at a time
   • Perception: simple

2. Analyze all markers equally
   • Perception: unbiased

3. Analyze both studies the same way
   • Perception: consistent


A Simulated Example: Analysis #1

Analysis conclusion: No confident finding due to lack of consistency

Ph2a Study
Marker   P-value
G9*      <0.0005
g98*     0.006
g100     0.007
G20*     0.024
G12*     0.028
G2*      0.029
G25      0.029
g58*     0.035
G10*     0.046
g90*     0.047
G6*      0.048

Ph2b Study
Marker   P-value
G25      <0.0005
g100     0.003
G5*      0.013
g83*     0.014
g55*     0.018
g95*     0.028
g49*     0.035
g84*     0.039
G23*     0.044
G19*     0.044

The unstarred markers (G25, g100) are the only ones that appear in both lists.


A Simulated Example: Analysis #2

Still use simple analysis (one marker at a time; all markers equally)

But consider a learn-and-confirm paradigm
1. Analyze study #1
   • Nominal α=0.05 with no multiplicity adjustment
2. Analyze study #2, for those markers that passed step #1
   • Bonferroni with overall α=0.05


A Simulated Example: Analysis #2

Step #1: 11 markers

Step #2: p-value < 0.05/11:
♦ G25
♦ g100

Result: identified 2 of the 4 predictive markers
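The two-step rule of Analysis #2 can be sketched in a few lines of Python. This is only an illustration using the p-values listed on the slides, not the code behind the simulation. Three things are assumptions: the function name `learn_and_confirm` is made up, "<0.0005" is entered as its upper bound 0.0005, and markers not listed for Ph2b are treated as not significant.

```python
ALPHA = 0.05

# P-values as listed on the slides (only markers with p < 0.05 are shown there).
# "<0.0005" is entered as its upper bound 0.0005 (assumption).
ph2a = {"G9": 0.0005, "g98": 0.006, "g100": 0.007, "G20": 0.024, "G12": 0.028,
        "G2": 0.029, "G25": 0.029, "g58": 0.035, "G10": 0.046, "g90": 0.047,
        "G6": 0.048}
ph2b = {"G25": 0.0005, "g100": 0.003, "G5": 0.013, "g83": 0.014, "g55": 0.018,
        "g95": 0.028, "g49": 0.035, "g84": 0.039, "G23": 0.044, "G19": 0.044}


def learn_and_confirm(learn_p, confirm_p, alpha=ALPHA):
    """Step 1: unadjusted screen. Step 2: Bonferroni over the markers that passed."""
    passed = [m for m, p in learn_p.items() if p < alpha]
    if not passed:
        return passed, None, []
    threshold = alpha / len(passed)
    # Markers absent from confirm_p are treated as not significant (assumption).
    confirmed = [m for m in passed if confirm_p.get(m, 1.0) < threshold]
    return passed, threshold, confirmed


passed, threshold, confirmed = learn_and_confirm(ph2a, ph2b)
print(len(passed), threshold, confirmed)
```

With these inputs, 11 markers pass step #1, the step #2 threshold is 0.05/11, and G25 and g100 are confirmed, in line with the slide.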


A Simulated Example: Analysis #3

1. Analyze study #1 using TSDT
   • Tier 1 analysis: {G1 … G30}
     – G9 with strong confidence
     – G25 with moderate confidence
     – some with mild confidence
   • Tier 2 analysis: for all 100 markers
     – some with moderate confidence

2. Analyze study #2 using TSDT
   • Tier 1 analysis: G9
     – G9 strongly confirmed
   • Tier 2 analysis: {G9, G25}
     – G25 confirmed
   • Tier 3 analysis: 21 markers
     – g100 identified


TSDT


[Diagram: the Study Data are split many times (e.g. 500) into Sampled Dataset #1 … #500 and Leftout Dataset #1 … #500.]

♦ Sampled datasets + “relevant” subgroups as defined by the user: Subgroups
  • “Internal” consistency: subgroup often found?
♦ Leftout datasets + subgroups: Effect
  • “External” consistency: similar subgroup effect?
♦ Output: strength of findings; honest estimates
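The resampling idea in the diagram can be illustrated with a toy Python sketch. This is not the TSDT implementation: the data are simulated, the subgroup finder is a deliberately simple stand-in, and all names (`simulate`, `find_subgroup`, `resample`) are hypothetical. It only shows how repeated sampled/left-out splits give a selection frequency ("internal" consistency) and left-out effect estimates ("external" consistency).

```python
import random
from collections import Counter
from statistics import mean

N_MARKERS = 5  # toy setting: binary markers 0..4, marker 0 is predictive


def simulate(n=800, seed=0):
    """Toy trial: 1:1 randomization; treatment helps only when marker 0 == 1."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        markers = [rng.randint(0, 1) for _ in range(N_MARKERS)]
        trt = rng.randint(0, 1)
        y = rng.gauss(0.0, 1.0) - 1.0 * trt * markers[0]  # lower is better
        rows.append((markers, trt, y))
    return rows


def effect(rows):
    """Treatment effect: mean(treated) - mean(control); None if an arm is empty."""
    treated = [y for _, trt, y in rows if trt == 1]
    control = [y for _, trt, y in rows if trt == 0]
    if not treated or not control:
        return None
    return mean(treated) - mean(control)


def find_subgroup(rows):
    """Stand-in finder: the marker whose M+ group has the most negative effect."""
    best, best_eff = None, float("inf")
    for j in range(N_MARKERS):
        eff = effect([r for r in rows if r[0][j] == 1])
        if eff is not None and eff < best_eff:
            best, best_eff = j, eff
    return best


def resample(rows, n_rep=200, frac=0.5, seed=1):
    """Repeat: find a subgroup in the sampled half, estimate its effect in the rest."""
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    k = int(frac * len(rows))
    found, leftout_eff = Counter(), {}
    for _ in range(n_rep):
        rng.shuffle(idx)
        sampled = [rows[i] for i in idx[:k]]
        leftout = [rows[i] for i in idx[k:]]
        j = find_subgroup(sampled)
        if j is None:
            continue
        found[j] += 1
        eff = effect([r for r in leftout if r[0][j] == 1])
        if eff is not None:
            leftout_eff.setdefault(j, []).append(eff)
    internal = {j: c / n_rep for j, c in found.items()}         # how often found
    external = {j: mean(v) for j, v in leftout_eff.items()}     # left-out effect
    return internal, external


internal, external = resample(simulate())
print(internal, external)
```

Because the left-out patients play no part in choosing the subgroup, the averaged left-out effect is not inflated by the selection step, which is the sense in which the diagram speaks of "honest estimates".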


A Real Example

♦ Two phase 3 clinical trials
  • Study #1 = “learn”
  • Study #2 = “confirm”

♦ 800 patients in each trial

♦ Affymetrix HTA2 gene expression array data
  • ~70,000 transcript clusters on HTA2 array
  • Measured at baseline (prior to treatment)

♦ Prior knowledge: ranked list of markers


Optimized Learn-and-confirm

♦ Perform simulations to select (and pre-specify) the optimal analysis approach for a given application

♦ Simulation set-up: tailor to the situation
  • Relevant metrics (marker-, subgroup-, patient-level)

♦ Consider the ENTIRE analysis
  • Learn-and-confirm, how to utilize prior knowledge
  • Analysis method(s)
  • Multiplicity control


A Real Example: Simulation

♦ Clinically relevant treatment effects
♦ Variability based on historical data
♦ Scenarios:
  • With and without prognostic marker
  • Subgroups (predictive markers):
    – Single-marker
    – Two-marker: several types
♦ Aspects of evaluation:
  • Analysis methods
  • Levels of multiplicity control
  • Number of candidate markers


A Real Example: Simulation

  6 (scenarios)
× 10 (analysis approaches)
× 16 (study #1 multiplicity control)
× 5 (study #2 multiplicity control)
× 100 (datasets)
× 5 (# candidate biomarkers)
= 2,400,000 analyses

Each includes 100 sub-samples × 100 permutations
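The size of the simulation grid can be checked directly. The short Python sketch below multiplies the factors from the slide. The last line assumes that sub-samples and permutations are crossed, giving 10,000 resampled fits per analysis; that is one reading of the slide, not something it states.

```python
from math import prod

# Factors of the simulation grid, as listed on the slide.
factors = {
    "scenarios": 6,
    "analysis approaches": 10,
    "study #1 multiplicity control": 16,
    "study #2 multiplicity control": 5,
    "datasets": 100,
    "# candidate biomarkers": 5,
}

n_analyses = prod(factors.values())

# Assumption: sub-samples and permutations are crossed within each analysis.
fits_per_analysis = 100 * 100
total_fits = n_analyses * fits_per_analysis

print(f"{n_analyses:,} analyses, {total_fits:,} resampled fits")
```

Under that assumption the total is on the order of 10^10 fits, which is why the analysis approach is chosen and pre-specified by simulation beforehand.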


A Real Example: Approach #1

Study     Analysis method   Multiplicity control
Study 1   Single-marker     Unadjusted
Study 2   Single-marker     Bonferroni

• Aggressive “learn” stage
• Simple analysis for both stages


A Real Example: Approach #2

Study     Analysis method   Multiplicity control
Study 1   TSDT              Resampling
Study 2   Single-marker     Bonferroni

• Advanced method for “learn” stage
• Simple analysis for “confirm” stage


A Real Example: Approach #3

Study     Analysis method   Multiplicity control
Study 1   TSDT              Resampling
Study 2   TSDT              Resampling

• Advanced method for both stages


Error control: learn (panels 0.1, 0.15, 0.2) & confirm (x-axis)
Approach #1 (top) vs. #2 (bottom); Scenario: 1 predictive marker


Error control: learn (panels 0.1, 0.15, 0.2) & confirm (x-axis)
Approach #1 (top) vs. #2 (bottom); Scenario: 2 predictive markers


# of Markers Included in the Analysis
Approach #1 vs. #3; Scenario: 1 predictive marker


# of Markers Included in the Analysis
Approach #1 vs. #3; Scenario: 2 predictive markers

[Figure inset: “And” subgroup, with axes Biomarker 1 and Biomarker 2]


Real Example Summary

♦ Multiplicity control in Study #2 (“confirm”) has a greater impact than that for Study #1

♦ Advanced method (e.g. TSDT) performs better than simple analysis

♦ With 1 predictive marker, decent power to correctly identify it, if:
  • TSDT is used in the first study
  OR
  • Single-marker analyses are performed for both studies but the number of markers is ≤100


Conclusions

♦ Utilize subgroup identification methods and optimize for each application;

♦ Consider pre-specified learn-and-confirm;

♦ When appropriate, run multi-population confirmatory trials with flexible testing scheme


Acknowledgement

♦ Chakib Battioui, Brian Denton, Xuemin Gu, Rick Higgs, Michael Man, Eric Nantz, Steve Ruberg, Hollins Showalter

♦ Jim Berger (Duke), Ying Ding (U Pitt), Jared Foster (NIH), Wei-Yin Loh (UW-Madison), Jeremy Taylor (U Mich), Xiaojing Wang (U Conn), Richard Zink (SAS)
