USING SURVIVAL ANALYSIS TO ANALYZE DEGREE COMPLETION
Janice Love
University of California, Los Angeles
Office of Academic Planning & Budget
CAIR 2014
AGENDA
Survival Analysis History & Background
Overview
Survival Analysis example using SPSS
Results of Survival Analysis
SURVIVAL ANALYSIS BACKGROUND
Definition
• A statistical method for studying the time to an event. The term “survival” suggests that the event of interest is death, but the technique is useful for other types of events.
Alternative terminology
• Event analysis, Time series analysis, Time-to-event analysis
• Survival analysis – studies involving time to death (biomedical sciences)
• Reliability theory / Reliability analysis (engineering)
• Duration analysis / Duration modeling (economics)
• Event history analysis (sociology)
Uses
• Clinical trials
• Cohort studies
Example of Survival Probability Graph (source: http://wpfau.blogspot.com/2011/08/safe-withdrawal-rates-and-life.html)
Example of Survival Probability Graph (source: http://www.statcan.gc.ca/daily-quotidien/000216/dq000216b-eng.htm)
SURVIVAL ANALYSIS HISTORY
• Origins unknown; the approach has been around for a few hundred years
• Techniques developed in medical / biological sciences
• World War II – military vehicles (reliability and failure time analysis)
• The Kaplan-Meier estimator was introduced with the publication of "Nonparametric Estimation from Incomplete Observations" – E. L. Kaplan and Paul Meier, 1958
• Cited 34,000 times as of 2011
http://articles.chicagotribune.com/2011-08-18/news/ct-met-meier-obit-20110818_1_clinical-trials-research-experimental-treatment
SURVIVAL ANALYSIS - OVERVIEW
• A set of statistical methods where the outcome variable is the time until the occurrence of an event of interest
• Follows a cohort over a specified time period with a focus on an event
• Useful when the rate of occurrence of the event varies over time
• Differs from other statistical methods: handles censored data (the withdrawal of individuals from the study)
Censored observations:
• Individuals who have not experienced “the event” by the end of the study
• Right censoring
o Study participant can’t be located
o or lives beyond the end of the study
o or drops out before the study is completed
o or is still enrolled
• An observation with incomplete information
• Don’t have to handle these individuals as “missing”
• Do have to follow rules with respect to censored data
o # of censored should be small relative to non-censored
o Censored and non-censored populations should be similar (Kaplan-Meier)
SURVIVAL ANALYSIS - CENSORING
Outcome data

Censored   Event   Total
2          3       5

[Figure: enrollment timelines for Students 1 to 5; x-axis: Time in Terms (1 to 12). Annotations: "Dropped out after 5 terms"; "'Survived' - still enrolled at the end of the study period"]

            terms enrolled   Graduation_status
Student 1   5                0
Student 2   9                1
Student 3   14               0
Student 4   7                1
Student 5   8                1
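As a minimal sketch, the five-student outcome table can be encoded as (time, status) pairs and tallied. The dictionary layout and the function name `outcome_counts` are illustrative assumptions, not part of the SPSS workflow shown in these slides.

```python
# Toy data copied from the slide: (terms enrolled, graduation status).
# Status 1 = graduated (event), 0 = censored (no graduation observed).
students = {
    "Student 1": (5, 0),
    "Student 2": (9, 1),
    "Student 3": (14, 0),
    "Student 4": (7, 1),
    "Student 5": (8, 1),
}


def outcome_counts(records):
    """Count censored observations, events, and total records."""
    events = sum(status for _, status in records.values())
    total = len(records)
    return {"censored": total - events, "event": events, "total": total}


counts = outcome_counts(students)
print(counts)
```

The tally agrees with the slide's outcome summary of 2 censored, 3 events, 5 total.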
Consequences of mishandling or ignoring censored data:
• Ignoring censored records completely, or arbitrarily assigning event dates, introduces bias into the results
• Including the censored data produces less bias (Newell & Hyun, 2011)
Example
• Student cohort, N = 50, event of interest = graduation
• Still enrolled at the end of the study, N = 6
• No longer enrolled but did not graduate, N = 4
Options:
• Code all 10 as missing
• Code 4 as missing, 6 as graduated as of study end
Consequences:
• Mean time to degree is over- or understated
• Risk of selection bias
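To make the two miscoding options concrete, here is a small sketch that applies them to the five-student toy table from the censoring slide. The outcome labels ("dropped", "enrolled") are assumptions added for illustration; the slide only records a 0/1 graduation status.

```python
from statistics import mean

# (terms enrolled, outcome) for the five-student toy table in this deck.
# The "dropped" and "enrolled" labels are assumed for illustration.
records = [
    (5, "dropped"),
    (9, "graduated"),
    (14, "enrolled"),
    (7, "graduated"),
    (8, "graduated"),
]

# Option 1: code every censored student as missing.
option_1 = mean(t for t, outcome in records if outcome == "graduated")

# Option 2: drop-outs are missing; still-enrolled students are coded as
# graduating at the end of the study.
option_2 = mean(t for t, outcome in records if outcome != "dropped")

print(option_1, option_2)
```

The two codings give different mean times to degree (8 terms versus 9.5 terms) from the same records, which is the over- or understatement the slide warns about.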
SURVIVAL ANALYSIS – HANDLING CENSORED DATA
Two methods produce the cumulative probability of survival on which the survival graph is based:
1. SPSS Life Table: in each time period, the effective size of the cohort is reduced by ½ of the censored group
2. Kaplan-Meier Survival Table: the survival probability estimate for each time period, except the first, is a compound conditional probability
SURVIVAL ANALYSIS - OVERVIEW
Data required for analysis:
• Clearly defined terminal event (death, onset of illness, recovery from illness, marriage, birth, mechanical failure, success, job loss, employment, graduation)
• Event status (1 = event occurred, 0 = event did not occur)
• Time variable = time measured from the entry of a subject into the study until the defined event, in months, terms, days, years, or seconds
• Covariates: to determine whether different groups have different survival times
o Gender, age, ethnicity, GPA, treatment, intervention
o Regression models
SURVIVAL ANALYSIS – SPSS DATA LAYOUT
Basic student data
• Time variable – terms enrolled
• Event status – graduation status

            terms_enrolled   graduate_status   gender   1st_term_gpa
Student 1   5                0                 1        3.4
Student 2   9                1                 0        4.0
Student 3   14               0                 1        2.9
Student 4   7                1                 1        3.9
Student 5   8                1                 0        3.1
Group into categories
Censored indicator
Binary or dummy variables
Cohort Description
• Undergraduates, one division
• Fall 2006 and Fall 2007 entering freshmen, N = 884
• Respondents to 2008 UCUES* survey
• Freshmen admits (transfers excluded)
• 1st term GPA >= 3.0
• Censored = 10, or 1.1%
• Explanatory variables available: gender, URM status, domestic-foreign status, Pell Grant recipient status, hours worked (survey), double/triple major
* UCUES = University of California Undergraduate Experience Survey
SURVIVAL ANALYSIS – LIFE TABLE PRODUCED BY SPSS
The life table is the primary output of the survival analysis procedure.
• Intervals = terms; the count is from the admit term
• Count of still-enrolled students at the start of each term
• # withdrawing during interval = censored
• # exposed to risk = # entering interval minus ½ censored
• # terminal events = # graduated
• Proportion terminating = # terminal events ÷ # exposed to risk. Example, Term 10: 38 ÷ 829.5 = .05
• Proportion surviving = 1 – proportion terminating
• Probability density = estimated probability of graduating in the interval
• Hazard rate = instantaneous failure rate: % chance of graduating given not having graduated at the start of the interval
• Cumul. surviving = cumulative % of those surviving at the end of the interval. Example, Term 10: (829.5 – 38) ÷ 884 = 0.90
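The life-table columns can be sketched in a few lines of Python. This is an illustration of the actuarial arithmetic described on the slide, not SPSS's own code; the function names are hypothetical. The Term 10 inputs of 830 entering and 1 censored are not stated on this slide: they are inferred by summing the female and male rows of the gender life table later in the deck (545 + 285 entering), which gives the slide's 829.5 exposed to risk.

```python
def life_table_row(entering, censored, events):
    """Return (exposed, proportion terminating, proportion surviving)."""
    exposed = entering - censored / 2   # censored cases count as half-exposed
    prop_terminating = events / exposed
    prop_surviving = 1 - prop_terminating
    return exposed, prop_terminating, prop_surviving


def cumulative_surviving(rows):
    """Multiply interval survival proportions to get cumulative survival.

    rows is a list of (entering, censored, events) tuples, one per interval.
    """
    cumulative, out = 1.0, []
    for entering, censored, events in rows:
        _, _, p = life_table_row(entering, censored, events)
        cumulative *= p
        out.append(cumulative)
    return out


# Term 10: entering and censored counts are inferred (see lead-in);
# the 38 graduations match the slide.
exposed, q, p = life_table_row(entering=830, censored=1, events=38)
print(exposed, round(q, 2), round(p, 2))
```

`cumulative_surviving` uses the standard actuarial product of interval survival proportions; it needs every interval from the admit term onward, which this slide does not list in full.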
SURVIVAL FUNCTION GRAPH PRODUCED BY SPSS
The proportion of the cohort that has survived (is still enrolled) at any term
There is a 90% probability of surviving to the end of the 10th term.
Surviving = remaining enrolled!
Each step of the curve represents an event
ONE MINUS SURVIVAL FUNCTION
There is a 10% probability of not surviving to the end of the 10th term.
Not surviving = graduating!!
SURVIVAL ANALYSIS: SPSS, WITH COVARIATE
FACTOR = GENDER
SPSS menu path: Analyze > Survival > Life Tables

SURVIVAL TABLE=Terms_enrolled BY Gender(1 2)
  /INTERVAL=THRU 15 BY 1
  /STATUS=graduated(1)
  /PRINT=TABLE
  /PLOTS (SURVIVAL OMS)=Terms_enrolled BY Gender.
SURVIVAL ANALYSIS – SPSS, LIFE TABLE BY GENDER
Median Survival Time = time at which 50% of the original cohort has not survived (has graduated)
Hazard Rate = Instantaneous failure rate. % chance of graduating given not having graduated at start of interval
SURVIVAL ANALYSIS: HAZARD RATIO
Hazard Ratio = ratio of the hazard rates.
At the 12th term, hazard ratio = 1.63 / 1.41 = 1.16; females are 16% more likely to graduate in the 12th term than males
At the 13th term, hazard ratio = .41 / .62 = .66; females are 34% less likely to graduate in the 13th term than males
Life Table - Hazard Rate Column
First-order Controls: Gender

Gender = Female
Interval Start Time   Number Entering Interval   Number of Terminal Events   Hazard Rate
0                     586                        0                           .00
1                     586                        0                           .00
2                     586                        0                           .00
3                     586                        0                           .00
4                     585                        0                           .00
5                     584                        0                           .00
6                     584                        0                           .00
7                     583                        0                           .00
8                     583                        0                           .00
9                     583                        38                          .07
10                    545                        22                          .04
11                    523                        73                          .15
12                    450                        404                         1.63
13                    46                         15                          .41
14                    28                         11                          .49
15                    17                         17                          .00

Gender = Male
Interval Start Time   Number Entering Interval   Number of Terminal Events   Hazard Rate
0                     298                        0                           .00
1                     298                        0                           .00
2                     298                        0                           .00
3                     298                        0                           .00
4                     298                        0                           .00
5                     298                        0                           .00
6                     298                        1                           .00
7                     296                        0                           .00
8                     296                        1                           .00
9                     295                        10                          .03
10                    285                        16                          .06
11                    268                        46                          .19
12                    222                        183                         1.41
13                    38                         18                          .62
14                    20                         6                           .36
15                    13                         13                          .00
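The hazard rates and hazard ratios quoted for terms 12 and 13 can be checked with a short sketch. It assumes the standard actuarial hazard estimate, 2q / (h(1 + p)), with q the proportion terminating, p the proportion surviving, and h the interval width; this reproduces the slide's values but is not confirmed here as SPSS's exact computation. The censored counts are not printed in the table: they are inferred from the drop in "Number Entering Interval" between consecutive rows. The function name is hypothetical.

```python
def actuarial_hazard(entering, censored, events, width=1.0):
    """Actuarial hazard estimate for one interval: 2q / (width * (1 + p))."""
    exposed = entering - censored / 2
    q = events / exposed        # proportion terminating
    p = 1 - q                   # proportion surviving
    return 2 * q / (width * (1 + p))


# (entering, censored, events); censored counts are inferred from the
# difference between consecutive "Number Entering Interval" values.
female = {12: (450, 0, 404), 13: (46, 3, 15)}
male = {12: (222, 1, 183), 13: (38, 0, 18)}

for term in (12, 13):
    h_f = round(actuarial_hazard(*female[term]), 2)
    h_m = round(actuarial_hazard(*male[term]), 2)
    print(term, h_f, h_m, round(h_f / h_m, 2))
```

Dividing the rounded female rate by the rounded male rate mirrors how the slide forms the ratio from the printed table values.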
SURVIVAL FUNCTIONS - SPSS
FACTOR = GENDER
Survival Pattern: SPSS will produce a different colored line for each of the factor’s values
SURVIVAL ANALYSIS: KAPLAN-MEIER METHOD
Assumptions
• Censored individual – a student who has not experienced the event (graduated) by the end of the study, e.g. they are no longer enrolled
• Check for differences between censored and non-censored groups
• Cohorts should behave similarly – groups entering at different times should be similar
• Avoid “selection bias” in data
SURVIVAL FUNCTIONS – SPSS, KAPLAN-MEIER
FACTOR = GENDER

KM Terms_enrolled BY Gender
  /STATUS=graduated(1)
  /PRINT TABLE MEAN
  /PLOT SURVIVAL
  /TEST LOGRANK BRESLOW TARONE
  /COMPARE OVERALL POOLED.
KAPLAN-MEIER SURVIVAL TABLE
This is an example of the survival table produced by the Kaplan-Meier procedure.
Kaplan-Meier Survival Probability Estimate calculation example:
Proportion surviving an interval = # remaining ÷ # at risk = [# at start of interval – (# censored + # of events)] ÷ [# at start of interval – # censored]
Interval 4: Cumulative Proportion Surviving = [46 – (2 + 1)] ÷ (46 – 2) = 43 ÷ 44 = 0.977
Interval 5: Proportion Surviving = [43 – (2 + 2)] ÷ (43 – 2) = 39 ÷ 41 = 0.951; Cumulative Proportion Surviving = (39 ÷ 41) x (43 ÷ 44) = 0.930
Kaplan-Meier Survival Table: The survival probability estimate for each time period, except the first, is a compound conditional probability
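The compounding of conditional probabilities can be sketched as follows. The code follows the convention used in the worked example (censored cases are removed from the risk set before the proportion is taken) and assumes the cumulative survival before interval 4 is 1.0, which the slide's arithmetic implies but does not state. The function name is hypothetical.

```python
def cumulative_survival(intervals):
    """Compound the conditional survival proportions interval by interval.

    Each interval is (at_start, censored, events). Following the slide's
    worked example, censored cases are removed from the risk set first.
    Returns a list of (conditional, cumulative) pairs.
    """
    cumulative = 1.0
    table = []
    for at_start, censored, events in intervals:
        at_risk = at_start - censored
        remaining = at_risk - events
        conditional = remaining / at_risk
        cumulative *= conditional
        table.append((conditional, cumulative))
    return table


# Intervals 4 and 5 from the worked example; earlier intervals are assumed
# to have a cumulative survival of 1.0.
table = cumulative_survival([(46, 2, 1), (43, 2, 2)])
for conditional, cumulative in table:
    print(round(conditional, 3), round(cumulative, 3))
```

Each cumulative value is the previous cumulative value times the current interval's conditional proportion, which is what "compound conditional probability" means here.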
KAPLAN-MEIER OUTPUT
Log Rank weights all graduations equally
Breslow gives more weight to earlier graduations
Tarone-Ware is a mixture of the two
Kaplan-Meier Results – Gender
Null Hypothesis: Female Curve = Male Curve
Curves not significantly different at p < .05
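The three tests differ only in how each event time is weighted. Below is a minimal two-group sketch in plain Python, assuming the usual weights: 1 for log rank, the number at risk for Breslow, and its square root for Tarone-Ware. The data are hypothetical and the function is an illustration, not the SPSS KM procedure.

```python
import math


def weighted_logrank(times, events, groups, weight="logrank"):
    """Two-group weighted log-rank test; returns (chi_square, p_value).

    times: observed durations; events: 1 = event, 0 = censored;
    groups: 0 or 1. weight is "logrank", "breslow", or "tarone".
    """
    numerator = 0.0
    variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e == 1}):
        # Risk sets just before time t.
        n = sum(1 for ti in times if ti >= t)
        n1 = sum(1 for ti, g in zip(times, groups) if ti >= t and g == 1)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e == 1 and g == 1)
        w = {"logrank": 1.0, "breslow": float(n), "tarone": math.sqrt(n)}[weight]
        numerator += w * (d1 - d * n1 / n)   # observed minus expected
        if n > 1:
            variance += w * w * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi_square = numerator ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Hypothetical terms-to-graduation data for two small groups.
times = [9, 11, 12, 12, 13, 10, 12, 12, 13, 14]
events = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]
groups = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
for name in ("logrank", "breslow", "tarone"):
    print(name, weighted_logrank(times, events, groups, weight=name))
```

Because the number at risk shrinks over time, the Breslow weights emphasize early graduations, which matches the slide's description.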
COX REGRESSION (PROPORTIONAL HAZARDS)
• Measures influence of explanatory variables
• Most used survival analysis method
• Only time-independent variables are appropriate
• Assumption: hazards are proportional
COX REGRESSION, CHECKING PROPORTIONAL HAZARDS ASSUMPTION
Repeat for each factor!
SPSS menu path: Analyze > Survival > Cox Regression
COX REGRESSION: USE LOG MINUS LOG FUNCTION TO CHECK PROPORTIONAL HAZARDS ASSUMPTION
Do not use Cox Regression if the log-minus-log curves cross; crossing curves mean the hazards are not proportional.
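The log-minus-log transform is ln(−ln S(t)). A rough sketch of the crossing check follows, using hypothetical survival estimates at shared time points (each strictly between 0 and 1). The slides rely on visual inspection of the SPSS plot; this programmatic check is only an illustration, and the function names are assumptions.

```python
import math


def log_minus_log(survival):
    """Transform survival probabilities S(t) into ln(-ln S(t))."""
    return [math.log(-math.log(s)) for s in survival]


def curves_cross(survival_a, survival_b):
    """True if the two log-minus-log curves change order at any time point."""
    diffs = [a - b for a, b in zip(log_minus_log(survival_a),
                                   log_minus_log(survival_b))]
    signs = {d > 0 for d in diffs if d != 0}
    return len(signs) > 1


# Hypothetical survival estimates at the same four time points.
group_a = [0.95, 0.80, 0.50, 0.20]
group_b = [0.90, 0.70, 0.40, 0.10]
group_c = [0.90, 0.85, 0.45, 0.25]
print(curves_cross(group_a, group_b))  # a stays above b throughout
print(curves_cross(group_a, group_c))  # the order flips between points
```

Roughly parallel curves, with a near-constant vertical gap, are what proportional hazards would look like on this scale.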
COX REGRESSION MODEL – EXAMPLE, GENDER
SPSS menu path: Analyze > Survival > Cox Regression (move gender to the Covariates box)
Interpretation of SPSS Cox Regression Results:
• The reference category is female because I made that choice for this model
• The difference between the female and male survival curves is not statistically significant at p < 0.05
Exp(B) = Hazard ratio: Female vs. Male. The null hypothesis is that this ratio = 1.
Hazard Ratio = e^B = e^(-0.04) = 0.961
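The Exp(B) arithmetic can be reproduced directly. This small sketch uses the coefficient B = -0.04 reported on the slide; the percent-change line is an added convenience for reading the ratio, not something shown in the SPSS output.

```python
import math

b = -0.04                      # coefficient B reported on the slide
hazard_ratio = math.exp(b)     # Exp(B)
percent_change = (hazard_ratio - 1) * 100

print(round(hazard_ratio, 3), round(percent_change, 1))
```

A ratio this close to 1 is consistent with the non-significant result reported for gender.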
COX REGRESSION MODEL RESULTS: PELL GRANT RECIPIENTS VS. NON-PELL GRANT RECIPIENT
Tip: To edit the default chart, click on the chart until the “Chart Editor” opens
Per Kaplan-Meier estimation, the Pell Grant recipients' curve differs from the non-Pell Grant recipients' curve; the difference is highly significant at p < .001
Pell Grant Recipients
1. Work more hours than non-Pell Grant Recipients
2. Pell Grant Recipients with similar GPAs to non-Pell Grant Recipients have attempted 10 more units
SUMMARY
Survival Analysis provides the following:
• Handles both censored data and a time variable
• Life table
• Graphical representation of trends
• Kaplan-Meier survival function estimator
• Survival comparison between 2 or more groups
• Regression models – relationships between variables and survival times
• A p value that indicates whether the difference between curves is significant
Descriptive power of survival analysis: Terms Enrolled by 1st Term GPA – using a survival graph (K-M) to display data
At end of 12th term:
• ~ 34% probability of continued enrollment
• ~ 9% probability of continued enrollment
Contact Info: [email protected]
Thank you!
REFERENCES
Dunn, S. (2002). Kaplan-Meier Survival Probability Estimates. Retrieved from http://vassarstats.net/survival.html
Harris, S. (2009). Additional Regression Techniques, October 2009. Retrieved from http://www.edshare.soton.ac.uk/id/document/9437
Newell, J. & Hyun, S. (2011). Survival Probabilities With and Without the Use of Censored Failure Times. Retrieved from https://www.uscupstate.edu/uploadedFiles/Academics/Undergraduate_Research/Reseach_Journal/2011_007_ARTICLE_NEWELL_HYUN.pdf
Singh, R. & Mukhopadhyay, K. (2011). Survival analysis in clinical trials: Basics and must know areas. Retrieved from http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3227332/t
Wiorkowski, J., Moses, A., & Redlinger, L. (2014). The Use of Survival Analysis to Compare Student Cohort Data. Presented at the 2014 Conference of the Association of Institutional Research.