Effective Use of the RETAIN Statement in Programming Clinical Trial
A leading global CRO 1
Mingxia Chen, Biostatistician, Beijing, China
Introduction
• The RETAIN statement "causes a variable that is created by an INPUT or assignment statement to retain its value from one iteration of the DATA step to the next" (SAS Help documentation).
• By default, without a RETAIN statement, SAS sets variables created by INPUT or assignment statements to missing at the start of each DATA step iteration.
• The RETAIN statement is therefore very useful for data manipulation across observations.
• Code that uses RETAIN is often simpler and more flexible than the alternatives, and it can speed up a program.
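The difference between the default behavior and RETAIN can be seen in a minimal sketch. The dataset DEMO and the variables X, RUNTOT and NORETAIN are hypothetical names used only for illustration.

```sas
/* Hypothetical demo data: three observations of a numeric variable X */
DATA DEMO;
  INPUT X;
  DATALINES;
1
2
3
;
RUN;

DATA TOTALS;
  SET DEMO;
  RETAIN RUNTOT 0;          /* RUNTOT starts at 0 and keeps its value across iterations */
  RUNTOT = RUNTOT + X;      /* running total: 1, 3, 6 */
  NORETAIN = NORETAIN + X;  /* NORETAIN is reset to missing at each iteration, */
                            /* so missing + X stays missing on every row      */
RUN;
```

RUNTOT accumulates because it is retained; NORETAIN never does, because SAS sets it back to missing before each observation is read.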
Example 1
Concatenate character values from multiple records of the same variable
Original Dataset (SUPPEX)
USUBJID   IDVAR  IDVARVAL  QNAM    QLABEL                             QVAL
100-5001  EXSEQ  1         EXLOT1  Lot Number 1                       583918
100-5001  EXSEQ  1         EXLOT2  Lot Number 2                       590118
100-5001  EXSEQ  1         EXLOT3  Lot Number 3                       598303
100-5001  EXSEQ  2         IIREAS  Reason for injection interruption  Adverse Event
100-5001  EXSEQ  2         IISPEC  Specify for interruption           ALLERGIC REACTION, SKIN RASH, DYSPNEA
100-5002  EXSEQ  2         EXLOT1  Lot Number 1                       540327
100-5002  EXSEQ  2         EXLOT2  Lot Number 2                       558160
……

Result Dataset (RSLTDS)
USUBJID   EXSEQ  EXLOT                   REASON
100-5001  1      583918, 590118, 598303
100-5001  2                              Adverse Event: ALLERGIC REACTION, SKIN RASH, DYSPNEA
100-5002  2      540327, 558160
……
Original Dataset (SUPPEX)
USUBJID   IDVAR  IDVARVAL  QNAM    QLABEL                             QVAL
100-5001  EXSEQ  2         IIREAS  Reason for injection interruption  Adverse Event
100-5001  EXSEQ  2         IISPEC  Specify for interruption           ALLERGIC REACTION, SKIN RASH, DYSPNEA
100-5001  EXSEQ  1         EXLOT1  Lot Number 1                       583918
100-5001  EXSEQ  1         EXLOT2  Lot Number 2                       590118
100-5001  EXSEQ  1         EXLOT3  Lot Number 3                       598303
100-5002  EXSEQ  2         EXLOT1  Lot Number 1                       540327
100-5002  EXSEQ  2         EXLOT2  Lot Number 2                       558160
……

Result Dataset (RSLTDS)
USUBJID   EXSEQ  EXLOT                   REASON
100-5001  1      583918, 590118, 598303
100-5002  2      540327, 558160
……
SAS Code
PROC SORT DATA=SUPPEX;
  BY USUBJID IDVAR IDVARVAL QNAM;
RUN;

DATA RSLTDS(KEEP=USUBJID EXSEQ EXLOT REASON);
  SET SUPPEX;
  BY USUBJID IDVAR IDVARVAL QNAM;
  LENGTH EXLOT REASON $200;
  /* Prevent EXLOT and REASON from being reset to missing at each iteration */
  RETAIN EXLOT REASON;
  /* Assign the initial value (missing) to both variables at the start of each EXSEQ */
  IF FIRST.IDVARVAL THEN DO;
    EXLOT = "";
    REASON = "";
  END;
  /* Concatenate the character values of QVAL from multiple records */
  IF INDEX(QNAM, "EXLOT") > 0 THEN EXLOT = CATX(", ", EXLOT, QVAL);
  IF QNAM IN ("IIREAS", "IISPEC") THEN REASON = CATX(": ", REASON, QVAL);
  /* Keep only the last observation per subject per EXSEQ */
  IF LAST.IDVARVAL;
  EXSEQ = INPUT(IDVARVAL, BEST.);
RUN;
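To try the Example 1 step, a small SUPPEX can be built from the rows shown on the Original Dataset slide. This is only a sketch for testing: the pipe delimiter and the variable lengths are assumptions chosen to fit these sample values, not part of the original material.

```sas
/* Sample SUPPEX rows copied from the Original Dataset slide.         */
/* The lengths below are assumptions chosen to fit these values.      */
DATA SUPPEX;
  INFILE DATALINES DLM='|' TRUNCOVER;
  LENGTH USUBJID $8 IDVAR $8 IDVARVAL $8 QNAM $8 QLABEL $40 QVAL $200;
  INPUT USUBJID IDVAR IDVARVAL QNAM QLABEL QVAL;
  DATALINES;
100-5001|EXSEQ|1|EXLOT1|Lot Number 1|583918
100-5001|EXSEQ|1|EXLOT2|Lot Number 2|590118
100-5001|EXSEQ|1|EXLOT3|Lot Number 3|598303
100-5001|EXSEQ|2|IIREAS|Reason for injection interruption|Adverse Event
100-5001|EXSEQ|2|IISPEC|Specify for interruption|ALLERGIC REACTION, SKIN RASH, DYSPNEA
100-5002|EXSEQ|2|EXLOT1|Lot Number 1|540327
100-5002|EXSEQ|2|EXLOT2|Lot Number 2|558160
;
RUN;
```

Running the PROC SORT and DATA step from the SAS Code slide against this input should give one row per subject and EXSEQ, with the lot numbers joined by commas in EXLOT and the interruption reason and its specification joined by a colon in REASON.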
Example 2
Select the previous minimum sum of longest diameters in oncology clinical trials
Original Dataset (TR)
USUBJID   VISITNUM  VISIT               TRTESTCD  TRTEST                   TRSTRESN  TRSTRESU
100-5001  -1        PRE-TREATMENT       SUMLDIAM  SUM OF LONGEST DIAMETER  74        mm
100-5002  -1        PRE-TREATMENT       SUMLDIAM  SUM OF LONGEST DIAMETER  94        mm
100-5002  110       TUMOR ASSESSMENT 1  SUMLDIAM  SUM OF LONGEST DIAMETER  84        mm
100-5002  120       TUMOR ASSESSMENT 2  SUMLDIAM  SUM OF LONGEST DIAMETER  92        mm
100-5002  130       TUMOR ASSESSMENT 3  SUMLDIAM  SUM OF LONGEST DIAMETER  106       mm
100-5005  -1        PRE-TREATMENT       SUMLDIAM  SUM OF LONGEST DIAMETER  96        mm
100-5005  110       TUMOR ASSESSMENT 1  SUMLDIAM  SUM OF LONGEST DIAMETER  100       mm
100-5005  130       TUMOR ASSESSMENT 3  SUMLDIAM  SUM OF LONGEST DIAMETER  95        mm
101-5002  -1        PRE-TREATMENT       SUMLDIAM  SUM OF LONGEST DIAMETER  188       mm
101-5002  110       TUMOR ASSESSMENT 1  SUMLDIAM  SUM OF LONGEST DIAMETER  65        mm
101-5002  120       TUMOR ASSESSMENT 2  SUMLDIAM  SUM OF LONGEST DIAMETER  66        mm
101-5002  130       TUMOR ASSESSMENT 3  SUMLDIAM  SUM OF LONGEST DIAMETER  66        mm
101-5002  140       TUMOR ASSESSMENT 4  SUMLDIAM  SUM OF LONGEST DIAMETER  63        mm
101-5002  150       TUMOR ASSESSMENT 5  SUMLDIAM  SUM OF LONGEST DIAMETER  61        mm
……
Result Dataset (PREMINSL)
USUBJID   VISITNUM  VISIT               TRTESTCD  TRTEST                   TRSTRESN  TRSTRESU  PREMINSL
100-5001  -1        PRE-TREATMENT       SUMLDIAM  SUM OF LONGEST DIAMETER  74        mm        .
100-5002  -1        PRE-TREATMENT       SUMLDIAM  SUM OF LONGEST DIAMETER  94        mm        .
100-5002  110       TUMOR ASSESSMENT 1  SUMLDIAM  SUM OF LONGEST DIAMETER  84        mm        94
100-5002  120       TUMOR ASSESSMENT 2  SUMLDIAM  SUM OF LONGEST DIAMETER  92        mm        84
100-5002  130       TUMOR ASSESSMENT 3  SUMLDIAM  SUM OF LONGEST DIAMETER  106       mm        84
100-5005  -1        PRE-TREATMENT       SUMLDIAM  SUM OF LONGEST DIAMETER  96        mm        .
100-5005  110       TUMOR ASSESSMENT 1  SUMLDIAM  SUM OF LONGEST DIAMETER  100       mm        96
100-5005  130       TUMOR ASSESSMENT 3  SUMLDIAM  SUM OF LONGEST DIAMETER  95        mm        96
101-5002  -1        PRE-TREATMENT       SUMLDIAM  SUM OF LONGEST DIAMETER  188       mm        .
101-5002  110       TUMOR ASSESSMENT 1  SUMLDIAM  SUM OF LONGEST DIAMETER  65        mm        188
101-5002  120       TUMOR ASSESSMENT 2  SUMLDIAM  SUM OF LONGEST DIAMETER  66        mm        65
101-5002  130       TUMOR ASSESSMENT 3  SUMLDIAM  SUM OF LONGEST DIAMETER  66        mm        65
101-5002  140       TUMOR ASSESSMENT 4  SUMLDIAM  SUM OF LONGEST DIAMETER  63        mm        65
101-5002  150       TUMOR ASSESSMENT 5  SUMLDIAM  SUM OF LONGEST DIAMETER  61        mm        63
……
SAS Code
PROC SORT DATA=TR;
  BY USUBJID TRTESTCD VISITNUM;
RUN;

DATA PREMINSL(DROP=LASTSL);
  SET TR;
  BY USUBJID TRTESTCD VISITNUM;
  /* Prevent PREMINSL from being reset to missing at each iteration */
  RETAIN PREMINSL;
  /* The LAG function returns TRSTRESN from the previous observation */
  LASTSL = LAG(TRSTRESN);
  /* First visit of each subject: set LASTSL and PREMINSL to missing. */
  /* Otherwise compare PREMINSL with LASTSL and take the smaller one. */
  IF FIRST.TRTESTCD THEN DO;
    LASTSL = .;
    PREMINSL = .;
  END;
  ELSE PREMINSL = MIN(PREMINSL, LASTSL);
RUN;
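The same previous minimum can also be derived with RETAIN alone, without LAG. The sketch below is a variant and not the code from the slide; the running-minimum variable RUNMIN and the output name PREMINSL2 are hypothetical, and TR is assumed to be sorted by USUBJID, TRTESTCD and VISITNUM already.

```sas
/* Variant sketch: previous minimum using RETAIN only (no LAG).   */
/* RUNMIN and PREMINSL2 are hypothetical names for illustration.  */
DATA PREMINSL2(DROP=RUNMIN);
  SET TR;
  BY USUBJID TRTESTCD VISITNUM;
  RETAIN RUNMIN;                     /* minimum over the visits seen so far */
  IF FIRST.TRTESTCD THEN RUNMIN = .; /* start fresh for each subject and test */
  PREMINSL = RUNMIN;                 /* minimum over earlier visits only */
  RUNMIN = MIN(RUNMIN, TRSTRESN);    /* include the current visit for the next row */
RUN;
```

Because PREMINSL is assigned before RUNMIN is updated, each row sees only the values from earlier visits, which is the same ordering that LAG provides in the slide's version.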
Example 3
Count the number of observations (example AE table summary)
AE Summary Table
Analysis Dataset (ADAE)
TRTAN USUBJID AESEQ AETERM TEAEFL TESAEFL RTEAEFL
A 100-5001 1 CHEILITIS
A 100-5001 2 CONSTIPATION Y
A 100-5001 3 CONSTIPATION Y
A 100-5001 4 CONSTIPATION Y
A 100-5001 5 DIARRHEA Y
A 100-5004 1 BACK PAIN (THORACAL) Y Y
B 100-5002 1 ANEMIA Y
B 100-5003 1 FEVER WITH CHILLS Y
B 100-5003 2 NAUSEA Y
B 100-5003 3 PNEUMONIA Y Y
B 100-5005 1 ASYMPTOMATIC PULMONAL EMBOLE
B 100-5005 2 LOSS OF APPETITE Y Y
B 100-5005 3 NAUSEA Y
B 100-5005 4 NEUTROPENIA Y
B 100-5005 5 NEUTROPENIA Y
B 100-5005 6 PAIN LEFT ILIAC REGION (OS ILIUM)
B 100-5005 7 PAIN RIGHT CALF Y Y
B 100-5005 8 WEAKNESS Y Y
SQL Procedure
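The SQL code shown on this slide is not preserved in the transcript. As a rough point of comparison only, a PROC SQL sketch for a single category might look like the following. It assumes TEAEFL='Y' marks the events to count, and the output name AECNT_SQL is hypothetical.

```sas
/* Sketch only: subject and event counts for one AE category.  */
/* Assumes TEAEFL='Y' flags the events of interest;            */
/* AECNT_SQL is a hypothetical output dataset name.            */
PROC SQL;
  CREATE TABLE AECNT_SQL AS
  SELECT TRTAN,
         COUNT(DISTINCT USUBJID) AS SUBJCNT,  /* number of subjects with an event */
         COUNT(*)                AS EVECNT    /* number of events */
  FROM ADAE
  WHERE TEAEFL = 'Y'
  GROUP BY TRTAN;
QUIT;
```

A query like this handles one flag at a time, so covering several categories would need one query per flag or a combined query.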
Result Dataset (AECNT)
SAS Code
PROC SORT DATA=ADAE;
  BY TRTAN USUBJID AESEQ;
RUN;

DATA AECNT(KEEP=TRTAN COL1 COLU1 SUBJCNT EVECNT);
  SET ADAE;
  BY TRTAN USUBJID AESEQ;
  ARRAY AEFL{3} $ TEAEFL RTEAEFL TESAEFL;  /* AE category flags */
  ARRAY SUBJAE{3} SUBJAE1-SUBJAE3;         /* AEs per subject, by category */
  ARRAY SCNT{3} SCNT1-SCNT3;               /* subjects per treatment group */
  ARRAY AECNT{3} AECNT1-AECNT3;            /* AEs per treatment group */
  /* Assign the initial value (0) to the 6 group-level counters and retain them. */
  /* SUBJAE1-SUBJAE3 are listed as well (an addition to the slide's code) so the */
  /* per-subject counts are explicitly carried across a subject's records.      */
  RETAIN SCNT1-SCNT3 AECNT1-AECNT3 SUBJAE1-SUBJAE3 0;
  /* Reset the counters to 0 for each treatment group */
  IF FIRST.TRTAN THEN DO I=1 TO HBOUND(SCNT);
    SCNT{I} = 0;
    AECNT{I} = 0;
  END;
  /* Assign the initial value (0) of each AE category for each subject */
  IF FIRST.USUBJID THEN DO I=1 TO HBOUND(SUBJAE);
    SUBJAE{I} = 0;
  END;
  /* Count the number of AEs per subject for each AE category */
  DO I=1 TO HBOUND(SUBJAE);
    IF AEFL{I} = 'Y' THEN SUBJAE{I} + 1;
  END;
  /* Count the number of subjects (SCNT{I}) and the number of AEs (AECNT{I}) per AE category */
  IF LAST.USUBJID THEN DO I=1 TO HBOUND(SCNT);
    IF SUBJAE{I} >= 1 THEN SCNT{I} = SCNT{I} + 1;
    AECNT{I} = AECNT{I} + SUBJAE{I};
  END;
  /* Output the number of subjects and the number of events for each category. */
  /* COL1F. is a user-defined format that is not shown on the slide.           */
  IF LAST.TRTAN THEN DO I=1 TO HBOUND(SCNT);
    COL1 = I;
    COLU1 = PUT(COL1, COL1F.);
    SUBJCNT = SCNT{I};
    EVECNT = AECNT{I};
    OUTPUT;
  END;
RUN;
Conclusion
The RETAIN statement carries values over from one observation to the next, which makes it very useful for manipulating data across observations.
Q&A
Thanks