STATGRAPHICS.COM
Distribution Fitting for Arbitrarily Censored Data
Dr. Neil Polhemus
Censored Data
• Censored data arise when the value of an
observation is only partially known.
• For example, in a medical trial the survival time of
a patient may only be known to be greater than some
value because the patient left the study.
• Or a measurement taken to study groundwater
contamination may only be known to be less than
some detection or quantitation limit.
Types of Censored Data
• Right-censored – known to be greater than some value, e.g. > 45
• Left-censored – known to be less than some value, e.g. < 7
• Interval-censored – known to be between two values, e.g. [6,10]
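One way to hold all three types in a single structure is to store every observation as a (lower, upper) pair. The sketch below is illustrative only: the `Censored` class and its `kind` property are hypothetical names, not part of Statgraphics or of any R package mentioned in this webinar, and the use of minus infinity as the lower bound for left-censored values is an assumption (for concentrations, 0 or a lower threshold would be a natural alternative).

```python
import math
from typing import NamedTuple


class Censored(NamedTuple):
    """Hypothetical container: the true value lies between lower and upper."""
    lower: float
    upper: float

    @property
    def kind(self) -> str:
        # Equal bounds mean the value was observed exactly.
        if self.lower == self.upper:
            return "exact"
        # No finite upper bound: only known to exceed `lower`.
        if math.isinf(self.upper):
            return "right-censored"
        # No finite lower bound: only known to be below `upper`.
        if math.isinf(self.lower):
            return "left-censored"
        return "interval-censored"


right = Censored(45.0, math.inf)     # > 45
left = Censored(-math.inf, 7.0)      # < 7
interval = Censored(6.0, 10.0)       # [6,10]
exact = Censored(3.2, 3.2)           # fully observed value

for obs in (right, left, interval, exact):
    print(obs, obs.kind)
```

Treating exact, left-, and right-censored values as special cases of an interval is what makes "arbitrarily censored" data manageable with one likelihood.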
Example #1: Arsenic Concentrations
• Arsenic concentrations (µg/L) in an urban stream,
Manoa Stream at Kanewai Field, on Oahu, Hawaii.
(Tomlinson, 2003)
0.5 0.5 0.5 0.6 0.7 0.7 <0.9 0.9
<1.0 <1.0 <1.0 <1.0 1.5 1.7 <2.0 <2.0
<2.0 <2.0 <2.0 <2.0 <2.0 <2.0 2.8 3.2
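As a rough illustration of how such data might be prepared for analysis, the following sketch splits each reported concentration into a numeric value and a censoring flag. The `parse` helper and the variable names are hypothetical; the data are the 24 values listed above, with `<` marking a value below a detection limit.

```python
# Arsenic concentrations as listed on the slide; "<" marks a nondetect.
raw = """0.5 0.5 0.5 0.6 0.7 0.7 <0.9 0.9
<1.0 <1.0 <1.0 <1.0 1.5 1.7 <2.0 <2.0
<2.0 <2.0 <2.0 <2.0 <2.0 <2.0 2.8 3.2""".split()


def parse(token):
    """Return (value, is_left_censored) for one reported concentration."""
    censored = token.startswith("<")
    return float(token.lstrip("<")), censored


observations = [parse(tok) for tok in raw]
n_total = len(observations)
n_censored = sum(1 for _, is_censored in observations if is_censored)
# Distinct detection limits that appear among the censored values.
detection_limits = sorted({v for v, is_censored in observations if is_censored})

print(n_total, n_censored, detection_limits)
```

More than half of the observations are nondetects, and they occur at three different limits, which is why simple substitution of the limit is a poor option for this data set.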
Example #2: Breast Cancer Study
• Days between treatment and breast retraction –
Finkelstein and Wolfe (1985)
Statgraphics Procedures
Option #1: Describe – Distribution Fitting – Fitting
Censored Data.
• Handles left-censored and right-censored data.
• Includes goodness-of-fit tests.
Option #2: R Interface – Distribution Fitting
(Arbitrarily Censored Data).
• Also handles interval-censored data.
• Does not include goodness-of-fit tests.
Fitting the Arsenic Data
Note: the Value column may be created using Edit – Replace Censored Values.
Probability Plots
Probability Plots
[Figure: Lognormal Probability Plot. x-axis: Value (0.1 to 10); y-axis: cumulative percent (0.1 to 99.9).]
Distribution Fitting (Censored Data)
Analysis Options
Tables and Graphs
Analysis Summary
Goodness-of-Fit Tests
Type I censoring – items are removed from the test at prespecified times.
Type II censoring – the test is stopped after a prespecified number of failures.
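The difference between the two schemes can be sketched on a small set of lifetimes. Everything here is an assumption made for illustration: the lifetimes are made up, the function names are hypothetical, and Type I censoring is simplified to a single common cutoff time even though items may in general have different removal times.

```python
def type1_censor(lifetimes, cutoff):
    """Type I: observation of every item stops at a prespecified time.

    Returns (recorded_time, failure_observed) pairs.
    """
    return [(min(t, cutoff), t <= cutoff) for t in lifetimes]


def type2_censor(lifetimes, n_failures):
    """Type II: the test stops at the time of the n_failures-th failure.

    Assumes distinct lifetimes, so exactly n_failures failures are observed.
    """
    stop = sorted(lifetimes)[n_failures - 1]
    return [(min(t, stop), t <= stop) for t in lifetimes]


# Hypothetical true lifetimes (not from the webinar).
lifetimes = [12.0, 30.0, 7.0, 55.0, 21.0, 90.0]

print(type1_censor(lifetimes, 25.0))   # censoring time fixed, failure count random
print(type2_censor(lifetimes, 4))      # failure count fixed, stopping time random
```

Under Type I censoring the number of observed failures is random, while under Type II censoring the stopping time is random.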
Comparison of Alternative Distributions
Goodness-of-Fit Tests
Histogram with Fits
[Figure: Histogram for Value with fitted loglogistic and lognormal distributions. x-axis: Value (0 to 4); y-axis: frequency (0 to 8).]
Quantile Plot
[Figure: Quantile Plot with fitted loglogistic and lognormal distributions. x-axis: Value (0 to 4); y-axis: cumulative probability (0 to 1).]
Critical Values
Quantile-Quantile Plot
[Figure: Quantile-Quantile Plot for the loglogistic and lognormal distributions. x-axis: distribution (0 to 4); y-axis: Value (0 to 4).]
Nonparametric Estimates
• Statgraphics contains a procedure under Describe
– Life Data – Life Tables (Times) which will
estimate a nonparametric survival function for
right-censored data.
• Helsel (2012) shows how such a procedure may
be used to analyze left-censored data by “flipping”
the data values.
Data Input Dialog Box
Kaplan-Meier Estimate
[Figure: Estimated Survival Function. x-axis: 5-Value (0 to 5); y-axis: survival probability (0 to 1).]
Results
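• Subtract each estimated percentile of the flipped data from 5 to return to the original scale. Flipping reverses the order, so the upper quartile of 5-Value gives the lower quartile of Value, and vice versa.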
• Q1 = 0.5, Median = 0.7, Q3 = 0.9
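The flipping approach can be sketched in a few lines of plain Python. This is a minimal illustration, not the Statgraphics Life Tables procedure: `kaplan_meier` and `flipped_quantile` are hypothetical helpers, and the quantile convention (the first time at which the survival estimate drops to or below the requested level) is an assumption. The flipping constant of 5 matches the 5-Value axis of the plot above.

```python
from collections import Counter

# Arsenic data from Example #1; "<" marks a left-censored value.
raw = ("0.5 0.5 0.5 0.6 0.7 0.7 <0.9 0.9 "
       "<1.0 <1.0 <1.0 <1.0 1.5 1.7 <2.0 <2.0 "
       "<2.0 <2.0 <2.0 <2.0 <2.0 <2.0 2.8 3.2").split()

FLIP = 5.0  # flipping constant; must exceed the largest observation


def kaplan_meier(times, observed):
    """Product-limit estimate for right-censored data.

    Returns [(t, S(t))] at each distinct time with at least one event.
    """
    events = Counter(t for t, seen in zip(times, observed) if seen)
    at_risk = len(times)
    surv, curve = 1.0, []
    for t in sorted(set(times)):
        d = events.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        # Events and censored items at t both leave the risk set.
        at_risk -= sum(1 for u in times if u == t)
    return curve


def flipped_quantile(curve, q, flip):
    """Quantile q of the original data from the KM curve of the flipped data."""
    for t, s in curve:
        if s <= q:
            return round(flip - t, 6)
    return None  # the curve never drops to q


values = [float(tok.lstrip("<")) for tok in raw]
observed = [not tok.startswith("<") for tok in raw]
# Flipping turns "value < limit" into "flipped value > 5 - limit",
# i.e. left-censored data become right-censored.
flipped = [round(FLIP - v, 6) for v in values]

curve = kaplan_meier(flipped, observed)
q1, median, q3 = (flipped_quantile(curve, q, FLIP) for q in (0.25, 0.5, 0.75))
print(q1, median, q3)  # 0.5 0.7 0.9
```

Under these conventions the sketch returns the same quartiles as the slide.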
Fitting the Breast Cancer Data
• Uses the “interval” and “Icens” packages written
for R.
Data Input Dialog Boxes
Data Input Dialog Boxes
Analysis Options
Options
• Assumed distribution – the distribution that will be fit to the data.
• Lower threshold – for distributions such as the lognormal, the minimum possible value.
• Apply Efron bias correction – if the smallest value is left-censored, sets the KMT (Kaplan-Meier-Turnbull) nonparametric CDF at that observation to 0 for purposes of calculating the mean and standard deviation. Otherwise, the CDF is assumed to decay linearly to 0 at the lower threshold.
• Confidence level for intervals – the confidence level used to create confidence limits for distribution parameters and other quantities.
• Number of bootstrap subsamples – the number of subsamples used when estimating confidence limits for the distribution parameters and other quantities.
Analysis Summary
Plotting the Data
[Figure: Scatterplot. x-axis: Row (0 to 50); y-axis: days (0 to 100).]
Distribution Fitting
Parameter estimates are obtained by maximizing the likelihood function.
[Equation and term definitions shown on the original slide were not recovered.]
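A standard form of the likelihood for arbitrarily censored data is given here as an assumption about what the slide showed; the notation (density f, CDF F, parameter vector θ, bounds l_i and u_i) is not taken from the original.

```latex
L(\theta) = \prod_{i=1}^{n} L_i(\theta),
\qquad
L_i(\theta) =
\begin{cases}
f(x_i;\theta) & \text{if observation } i \text{ is known exactly at } x_i,\\[2pt]
F(u_i;\theta) & \text{if left-censored (known to be less than } u_i\text{)},\\[2pt]
1 - F(l_i;\theta) & \text{if right-censored (known to be greater than } l_i\text{)},\\[2pt]
F(u_i;\theta) - F(l_i;\theta) & \text{if interval-censored (known to lie in } [l_i, u_i]\text{)}.
\end{cases}
```

Each observation contributes the probability (or density) of what was actually learned about it, so censored values add information without being replaced by a made-up number.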
Example
Plot of Fitted Distribution
[Figure: Density of the fitted lognormal distribution (Mean=100.514, Std. Dev.=214.7). x-axis: X (0 to 300); y-axis: density (0 to 18, ×0.001).]
Cumulative Distribution Function
[Figure: Cumulative Distribution for the fitted lognormal distribution (Mean=100.514, Std. Dev.=214.7). x-axis: X (0 to 300); y-axis: CDF (0 to 1).]
Survival Function
[Figure: Survival Function for the fitted lognormal distribution (Mean=100.514, Std. Dev.=214.7). x-axis: X (0 to 300); y-axis: Survivor Function (0 to 1).]
Nonparametric Estimates
• A nonparametric estimate of the survival function
may be obtained without assuming any particular
distributional form.
• Estimated using the methods of Kaplan, Meier and
Turnbull.
• Can also calculate confidence limits for the
nonparametric survival function or CDF.
Example
Example
[Figure: Survival Function, KMT estimate with 95% KMT limits. x-axis: X (0 to 100); y-axis: Survivor Function (0 to 1).]
Example
[Figure: Survival Function, fitted lognormal distribution (Mean=100.514, Std. Dev.=214.7) overlaid on the KMT estimate with 95% KMT limits. x-axis: X (0 to 100); y-axis: Survivor Function (0 to 1).]
Box-and-Whisker Plot
• Based on 1st, 25th, 50th, 75th and 99th percentiles.
[Figure: Box-and-Whisker Plot for KMT Estimate. x-axis: days (0 to 50).]
Quantile-Quantile Plot
[Figure: Quantile-Quantile Plot. x-axis: KMT quantile (0 to 120); y-axis: Fitted quantile (0 to 120).]
References
• StatFolios and data files are at: www.statgraphics.com/webinars
• Finkelstein, D.M. and Wolfe, R.A. (1985). “A semiparametric model for regression analysis of interval-censored failure time data.” Biometrics 41, 731-740.
• Gentleman, R. and Vandal, A. (2018). Icens: NPMLE for Censored and Truncated Data. R package version 1.54.0.
• Helsel, D.R. (2005). Nondetects and Data Analysis: Statistics for Censored Environmental Data. Wiley, New York.
• Helsel, D.R. (2012). Statistics for Censored Environmental Data using Minitab and R, second edition. Wiley, Hoboken, N.J.
• Lee, E.T. and Wang, J.W. (2003). Statistical Methods for Survival Data Analysis, 3rd edition. Wiley, New York.
• R Package “interval” - https://cran.r-project.org/web/packages/interval/interval.pdf
• Tomlinson, M. S. (2003). “Effects of ground-water/surface-water interactions and land use on water quality.” Written communication (draft USGS report).
• Turnbull, B.W. (1976). “The empirical distribution function with arbitrarily grouped, censored and truncated data.” Journal of the Royal Statistical Society, Series B 38(3), 290-295.