Proprietary and confidential. Do not distribute.

Handling Interim and Immature Data in a Clinical Trials Setting

Paul Stutzman

Axio Research

22-Oct-2015

Introduction

Outline

Motivation

Definition

Preparation

Potential pitfalls

Reassess

Tools and processes

Concluding remarks

Motivation

Given:

Variety of statistical projects

Early involvement

Maintain quality/accuracy

Minimize cost/effort

Timely deliverables

Definition

Context

Preparatory programming - standards based

Raw → SDTM → ADaM

Medical monitoring

Interim analyses

Data Monitoring Committee (DMC) / Data Safety Monitoring Board (DSMB)

Definition

What is interim and incomplete data?

Interim:

Before database lock

Often iterative – multiple snapshots over time

Unclean

Incomplete:

Missing records (subjects, visits, parameters…)

Missing values

Definition

Dirty data

Pick any 2:

Good: Accurate and complete

Fast: Timely

Cheap: Cost and effort

Preparation

What can we know ahead of time?

Protocol

Annotated CRFs

SAP

DMC/DSMB Charter

Look at the actual data!

Preparation

What can we know ahead of time?

Visit structure

The protocol is your friend.

However…

“Patients randomized to both study arms will have 21-day chemotherapy cycles until disease progression, unacceptable toxicity, withdrawal of consent, or protocol specified parameters to stop treatment.”

Preparation

What can we know ahead of time?

All permissible values

Annotated CRF

Data dictionary

Standards constraints

Raw → Analysis datasets → TLFs

Raw → SDTM → ADaM → TLFs (metadata is critical)

Preparation

Use defensive programming/coding techniques

Your program should not fail

Handle all values (anticipated or not)

Unanticipated values should be flagged

Code for all variables – even if insufficient data

Use the SAS log!

Potential pitfalls

Initial considerations

All missing

Insufficient data: no records meet the criteria yet

For example: No deaths

In DS?

In AE?

Separate deaths dataset?

Will they need to be reconciled?

Should be noted in the SAS log
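A minimal sketch of the "no deaths yet" case, assuming deaths are identified by DSDECOD = 'DEATH' in DS. The dataset contents, the macro variable name ndeaths, and the message wording are illustrative assumptions, not taken from the presentation. The program runs cleanly when no records qualify and says so in the log.

```sas
/* Hypothetical DS snapshot with no deaths reported yet. */
data ds;
  length USUBJID $12 DSDECOD $40;
  infile datalines dlm=',';
  input USUBJID $ DSDECOD $;
  datalines;
101-001,COMPLETED
101-002,ADVERSE EVENT
;
run;

/* Keep death records only; the result may legitimately be empty. */
data deaths;
  set ds;
  where DSDECOD = 'DEATH';
run;

/* Count the records and report in the log either way. */
data _null_;
  if 0 then set deaths nobs=n;   /* NOBS= is available without reading a row */
  call symputx('ndeaths', n);    /* hypothetical macro variable for later steps */
  if n = 0 then
    put "##" "# Note: no deaths in DS yet, death summaries will be empty";
  else
    put "##" "# Note: " n "death record(s) found in DS";
  stop;
run;
```

Downstream code can branch on &ndeaths instead of failing on an empty dataset.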

Potential pitfalls

Initial considerations

Not all values are present. Unsure of:

What actual values will be (categorical)

The relationship between variables

xxCAT, xxSCAT

xxTEST, xxTESTCD

Numeric coded categorical variables

Visits

When in doubt, note in the SAS log

Potential pitfalls

Initial considerations

Invalid/incorrect data

Out of range

Mismatched values in related variables

Invalid categorical values

Should be noted in the SAS log

Reassess as you go

Hey, I got a new data transfer!

New or missing datasets

Changes in the number of records in each dataset

New or missing variables

Variable attributes have changed

New values in categorical variables

New ranges (or outliers) in continuous variables

Changes in related variables (xxTEST vs. xxTESTCD)

Reassess as you go

Hey, I got a new data transfer!

Scrutinize lab values and units

Visits

Sort orders no longer provide uniqueness

What does any of this mean for my metadata?

Tools and processes

Defensive programming/coding defined

Typical definition (www.drdobbs.com)

“Defensive programming is a practice where you anticipate failures in your code, then add supporting code to detect, isolate, and in some cases, recover from the anticipated failure.”

In this context, however…

Writing programs that handle all input data (both anticipated and unanticipated) gracefully, and notify the user/programmer when data or processing issues arise.

Tools and processes

Defensive programming/coding: Notify!

Look for all possible known values

Make unanticipated values obvious

Use the SAS log to highlight unexpected values

Use the SAS log to identify variables that can’t be calculated yet

Tools and processes

Defensive programming/coding techniques

For unanticipated values: flag and notify

Final ELSE on IF/ELSE IF constructs

OTHERWISE on SELECT statements

ELSE on SQL CASE statements

PROC FORMAT OTHER

Map unanticipated values to something obvious

Use the SAS log to highlight unexpected values
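Two of the techniques listed above can be sketched as follows: PROC FORMAT with OTHER, and a final ELSE on an IF/ELSE IF chain. The dataset, the format name $sevfmt, the derived variables, and the choice of the special missing value .U are illustrative assumptions.

```sas
/* Hypothetical input: AESEV is expected to be MILD, MODERATE, or SEVERE. */
data ae;
  length USUBJID $12 AESEV $10;
  input USUBJID $ AESEV $;
  datalines;
101-001 MILD
101-002 SEVERE
101-003 Unknown
;
run;

/* PROC FORMAT OTHER: anything not listed maps to an obvious label. */
proc format;
  value $sevfmt
    'MILD'     = 'Mild'
    'MODERATE' = 'Moderate'
    'SEVERE'   = 'Severe'
    other      = '*** UNEXPECTED ***';
run;

data ae2;
  set ae;
  length SEVLBL $20;
  SEVLBL = put(AESEV, $sevfmt.);

  /* Final ELSE: flag the value and notify in the log. */
  if AESEV = 'MILD' then AESEVN = 1;
  else if AESEV = 'MODERATE' then AESEVN = 2;
  else if AESEV = 'SEVERE' then AESEVN = 3;
  else do;
    AESEVN = .U;  /* special missing value, distinguishable from plain missing */
    put "!!" "! War" "ning: unexpected AESEV for " USUBJID= AESEV=;
  end;
run;
```

The same pattern applies to OTHERWISE on a SELECT statement and ELSE on a SQL CASE expression: the fallback branch assigns a conspicuous value and writes a message instead of silently passing the record through.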

Tools and processes

Defensive programming/coding SAS log messages

Standardize and differentiate from SAS’ messages

!!! Warning:

### Note:

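Make the code invisible to log searches: split the literal so the source line echoed in the log does not match the message text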

put "!!" "! War" "ning: message text";

put "##" "# Note: message text";

Tools and processes

Misc: %FindDups macro

%finddups(InDS=ds1 ds2 ds3, SortOrd=USUBJID);

data combined;
  merge ds1 ds2 ds3;
  by USUBJID;
run;

### Note: No duplicates found in ds1 with SortOrd=USUBJID

!!! Warning: Duplicates in ds2: USUBJID=111-123-0001

!!! Warning: Duplicates in ds2: USUBJID=111-345-0001

### Note: No duplicates found in ds3 with SortOrd=USUBJID

%finddups(InDS=ADLB, SortOrd=USUBJID PARAMCD AVISITN ADT);
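The presentation does not show the source of %FindDups. The macro below, %finddups_sketch, is a hypothetical stand-in that reproduces the same style of log messages under simple assumptions: every dataset in InDS contains all SortOrd variables, and a scratch dataset named _fd_sorted may be overwritten in WORK. The global macro variable finddups_n is also an assumption, added so later code can react to the result.

```sas
%macro finddups_sketch(InDS=, SortOrd=);
  %local i ds lastvar;
  /* The last key in SortOrd defines the finest BY group. */
  %let lastvar = %scan(&SortOrd, -1, %str( ));
  %let i = 1;
  %do %while (%length(%scan(&InDS, &i, %str( ))) > 0);
    %let ds = %scan(&InDS, &i, %str( ));

    proc sort data=&ds out=_fd_sorted;
      by &SortOrd;
    run;

    data _null_;
      set _fd_sorted end=eof;
      by &SortOrd;
      /* A key is duplicated when its BY group holds more than one record. */
      if first.&lastvar and not last.&lastvar then do;
        ndup + 1;
        put "!!" "! War" "ning: Duplicates in &ds: " (&SortOrd) (=);
      end;
      if eof then do;
        call symputx('finddups_n', ndup, 'G');
        if ndup = 0 then
          put "##" "# Note: No duplicates found in &ds with SortOrd=&SortOrd";
      end;
    run;

    %let i = %eval(&i + 1);
  %end;
%mend finddups_sketch;

/* Hypothetical test data: one subject appears twice. */
data ds2;
  input USUBJID :$12.;
  datalines;
111-123-0001
111-123-0001
111-200-0001
;
run;

%finddups_sketch(InDS=ds2, SortOrd=USUBJID);
```

Running a check like this before a MERGE catches many-to-many joins early. Note that this sketch prints nothing for a dataset with zero records, a case a production macro would need to report as well.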

Tools and processes

Process an iterative data transfer

Archive old data – but keep available

Load new data

Compare

Tools and processes

Process an iterative data transfer

Comparison criteria

New or missing datasets

Changes in the number of records in each dataset

New or missing variables

Variables with different attributes

New values in categorical variables

New ranges (or outliers) in continuous variables

Changes in related variables (xxTEST vs. xxTESTCD)

Visits!!!

Tools and processes

Data transfer comparison macro

Compare current data to previous transfer

Datasets: presence, record counts

Variables: presence, attributes

Values (optional): categorical, continuous

Output: spreadsheet (ODS ExcelXP tagset)

%mLibComp(Folder=SDTM);

%mLibComp(Folder=SDTM, Values=_all_);

%mLibComp(Folder=SDTM, Values=DM EX, MaxCats=25);
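The internals of %mLibComp are not shown in the presentation. As a rough sketch of the dataset-level part of such a comparison (presence and record counts), the code below queries DICTIONARY.TABLES for two libraries. The librefs PREV and CURR, the demo datasets, and the status labels are assumptions for illustration, and the DLCREATEDIR option is used only to make the example self-contained.

```sas
/* Create two scratch libraries under WORK for the demonstration. */
options dlcreatedir;
libname prev "%sysfunc(pathname(work))/prev";
libname curr "%sysfunc(pathname(work))/curr";

/* Hypothetical transfers: DM grew, EX is new in the current transfer. */
data prev.dm; do i = 1 to 10; output; end; run;
data curr.dm; do i = 1 to 12; output; end; run;
data curr.ex; do i = 1 to 5;  output; end; run;

proc sql;
  create table dscomp as
  select coalesce(p.memname, c.memname) as memname length=32,
         p.nobs as prev_nobs,
         c.nobs as curr_nobs,
         case
           when p.memname is null then 'NEW'
           when c.memname is null then 'MISSING'
           when p.nobs ne c.nobs  then 'COUNT CHANGED'
           else 'SAME'
         end as status length=13
  from (select memname, nobs from dictionary.tables
        where libname = 'PREV' and memtype = 'DATA') as p
       full join
       (select memname, nobs from dictionary.tables
        where libname = 'CURR' and memtype = 'DATA') as c
       on p.memname = c.memname
  order by 1;
quit;

/* Surface every difference in the log. */
data _null_;
  set dscomp;
  if status ne 'SAME' then
    put "!!" "! War" "ning: " memname= status= prev_nobs= curr_nobs=;
run;
```

A similar join on DICTIONARY.COLUMNS (name, type, length, label, format) would cover the variable-level comparison.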

Tools and processes

Datasets comparison spreadsheet

Tools and processes

Variables and attributes spreadsheet

Tools and processes

Values comparison (categorical) spreadsheet

Tools and processes

Values comparison (continuous) spreadsheet

Tools and processes

Examine the relationships

PROC FREQ is your friend

VISITNUM vs. VISIT (across all domains)

xxTESTCD vs. xxTEST (what about units?)

PARAMN vs. PARAMCD vs. PARAM
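A short sketch of checking one such relationship with PROC FREQ, here VISITNUM against VISIT. The dataset VS and its values are made up for illustration, as is the output dataset name visitmap. The cross-tabulation lists every combination present, and a follow-up step warns when one VISITNUM is paired with more than one VISIT.

```sas
/* Hypothetical data: VISITNUM 2 appears with two spellings of VISIT. */
data vs;
  length VISIT $20;
  infile datalines dlm=',';
  input VISITNUM VISIT $;
  datalines;
1,SCREENING
2,CYCLE 1 DAY 1
2,CYCLE 1 DAY 1
2,Cycle 1 Day 1
3,CYCLE 2 DAY 1
;
run;

/* List every VISITNUM and VISIT combination for review. */
proc freq data=vs;
  tables VISITNUM * VISIT / list missing nopercent nocum;
run;

/* Save the combinations and flag any one-to-many mapping in the log. */
proc freq data=vs noprint;
  tables VISITNUM * VISIT / missing out=visitmap(drop=percent);
run;

data _null_;
  set visitmap;
  by VISITNUM;
  if not (first.VISITNUM and last.VISITNUM) then
    put "!!" "! War" "ning: VISITNUM maps to more than one VISIT: " VISITNUM= VISIT=;
run;
```

The same two steps work for xxTESTCD against xxTEST (adding the units variable to the TABLES request) and for PARAMN, PARAMCD, and PARAM.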

Conclusions

Understand potential impact

Prepare

Reassess

Tools and processes
