A statistical analysis of the failure time distribution for 12Cr12Mo14V steel tubes in the presence...


Page 1: A statistical analysis of the failure time distribution for 12Cr12Mo14V steel tubes in the presence of outliers

ELSEVIER

Int. J. Pres. Ves. & Piping 60 (1994) 193-207 © 1994 Elsevier Science Limited

Printed in Northern Ireland. All rights reserved 0308-0161/94/$07.00

A statistical analysis of the failure time distribution for ½Cr½Mo¼V steel tubes in the presence of outliers

M. Evans Department of Materials Engineering, University College Swansea, Swansea, UK, SA2 8PP

(Received 8 October 1993; accepted 13 October 1993)

The placing of confidence bounds around any failure time prediction requires knowledge of the failure time distribution, and the existence of outlier observations can result in the incorrect identification of the true underlying distribution. This paper outlines two tests which can be used to pick out an influential data point from within a generalised log gamma failure time distribution. A method of bounding the influence of outliers on the estimates made for the parameters of a generalised gamma distribution (using maximum likelihood) is also presented.

These outlier tests, when applied to three data sets on failure times for ½Cr½Mo¼V steel tubes (BSC data), concluded that outlier observations did indeed exist in such data. When no attempt was made to isolate such outliers the most supported failure time distribution was the log normal. However, bounding the influence of such outliers resulted in a generalised gamma distribution (with a high k value) being the best fitting failure time distribution. Ignoring outliers led to exaggerated changes in the estimated parameters of the distribution.

1 INTRODUCTION

This paper aims to extend the work recently carried out by Evans.¹ In that paper, a number of statistical tests aimed at identifying the nature of the creep failure time distribution for ½Cr½Mo¼V steel tubes were presented. A detailed analysis of three data sets (each obtained under unchanging test conditions) revealed that the log normal distribution best represented the recorded failure times. However, a wide variety of three-parameter gamma distributions were also accepted by the data, and in two of the three data sets, so too was the Weibull distribution. This uncertainty about the exact distributional form was shown to have a significant effect on the size of confidence limits placed around the lower quantiles of failure time.

All the distributions found to be acceptable by the data were then subjected to a simple interpolative test. This involved plotting the reliability rates predicted by the distribution against actual reliability rates, and then measuring the largest absolute deviation between the two. A significant difference implied some inadequacy of the distribution function in describing the data.

This paper is aimed at extending the number of interpolative tests available. Influential data points (or outliers) can produce drastic changes in parameter estimates and may make any resulting failure time prediction highly misleading and inaccurate. This is particularly true for the relatively small samples available on times to creep failure under a single stress-temperature combination. It is therefore all the more important to be able to detect and minimise the harmful effects of outliers in creep failure data.

The present paper deals with this aspect of interpolation and proceeds along the following lines. Section 2 outlines the model used by Evans¹ and Section 3 summarises his results.

Section 4 develops two tests for detecting outlier points contained within a generalised (three-parameter) log gamma distribution. Each of these outlier test statistics follows a distribution of its own. The percentiles of these distributions are required to make the outlier tests operational and so Section 5 derives such values using Monte Carlo techniques. Section 6 suggests ways of estimating the failure time distribution in the presence of outliers, and modifies the distribution tests given in Evans¹ so as to cope with data sets containing outliers. Then, in Section 7, the outlier tests are applied to the three data sets used by Evans and the effects of bounded influence estimation on these data are given. Conclusions are presented in Section 8.

2 THE MODEL

The stochastic nature of creep failure data can be modelled using an equation of the form:

$$Y = U + e \qquad (1a)$$

where Y is the log of failure time (t); and e is a stochastic error process which has some probability density function (pdf). Greater conformity can often be obtained by writing eqn (1a) in terms of a standardised error, W. This is done by dividing the errors through by some constant b:

$$Y = U + bW \qquad (1b)$$

The parameter U is taken to be some deterministic function of testing conditions:

$$U = g(T, S) \qquad (1c)$$

where S is stress; and T is temperature. Unfortunately, creep theory is unable to deliver the exact functional form of U, nor is it able to say anything about the characteristics of the error distribution. Not knowing the functional form of eqn (1c) can be dealt with by looking at recorded failure times at a single stress-temperature combination. In turn, the data can be used to identify the type of error distribution by specifying a very general density function and then initiating data based tests for specific forms contained within it. One such flexible pdf, following Stacy,² is the generalised log gamma distribution:

$$f(W) = \frac{1}{\Gamma[k]} \exp[kW - \exp W] \qquad (2)$$

Here, the shape of the standardised error distribution depends directly on the value of k. As the model stands, W tends to a normal distribution as k tends to infinity, but unfortunately the mean of W also tends to infinity at the same time. To avoid this shortcoming the following transformation was suggested by Prentice:³

$$W_1 = \sqrt{k}\,[W - \ln(k)] \qquad (3a)$$

This allows eqns (1b) and (2) to be rewritten as:

$$Y = \mu + \sigma W_1 \qquad (3b)$$

with:

$$f(W_1) = \frac{k^{k-0.5}}{\Gamma[k]} \exp\left[\sqrt{k}\,W_1 - k \exp\left(\frac{W_1}{\sqrt{k}}\right)\right] \qquad (3c)$$

where μ = U + b ln(k); and σ = b/√k. From eqns (3b) and (3c) it follows that the probability density function for Y is given by:

$$f(Y) = \frac{k^{k-0.5}}{\sigma\,\Gamma[k]} \exp\left[\sqrt{k}\,\frac{Y - \mu}{\sigma} - k \exp\left(\frac{Y - \mu}{\sigma\sqrt{k}}\right)\right] \qquad (3d)$$

Now when k = 1, W₁ follows a standard extreme value distribution and Y follows an extreme value distribution, with failure times (t) being Weibull distributed. When k = ∞, W₁ follows a standard normal distribution and Y a normal distribution, so that t is log normal. If k = σ = 1, then Y follows a log exponential distribution, whilst if σ√k = 1, Y follows a two-parameter log gamma distribution and t a two-parameter gamma distribution.
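The two limiting cases named above can be checked numerically. The following is a minimal Python sketch, not part of the original analysis: the function name `log_gamma_pdf` and the use of `math.lgamma` for ln Γ[k] are illustrative assumptions. It evaluates the density of eqn (3c) at W₁ = 0 for k = 1 (extreme value case) and for a large k (near-normal case).

```python
import math

def log_gamma_pdf(w1, k):
    """Density of the standardised error W1 from eqn (3c), for finite k > 0."""
    root_k = math.sqrt(k)
    # Work on the log scale so that large k does not overflow Gamma[k].
    log_f = ((k - 0.5) * math.log(k) - math.lgamma(k)
             + root_k * w1 - k * math.exp(w1 / root_k))
    return math.exp(log_f)

# k = 1: eqn (3c) reduces to exp(w - exp(w)), the standard extreme value density.
extreme_value_at_zero = log_gamma_pdf(0.0, 1.0)

# Large k (1e4 is an arbitrary illustrative choice): the density should be
# close to the standard normal value 1/sqrt(2*pi) at w = 0.
near_normal_at_zero = log_gamma_pdf(0.0, 1.0e4)
standard_normal_at_zero = 1.0 / math.sqrt(2.0 * math.pi)
```

Working with ln Γ[k] rather than Γ[k] is a design choice of the sketch; it keeps the evaluation stable for the high k values discussed later in the paper.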

3 PAST STUDIES

Evans¹ recently obtained estimates for μ and σ, together with confidence limits for times to failure (given a stated probability of failure), at three stress and temperature combinations for ½Cr½Mo¼V steel tubes. The data were generated at BSC⁴ laboratories. Such estimates were obtained for a number of k values supported by the data. For all three data sets, k = ∞ was best supported by the data (implying failure times are log normally distributed), but the existence of other significant values for k substantially increased the confidence limits placed around times to failure. These conclusions were obtained without any formal search for influential data points and it may be the case that they are sensitive to the existence of any outliers. It is to the issues of detection and 'removal' of influential data points that this paper now turns.

4 DETECTING OUTLIERS

A recorded failure time is defined as an outlier when it is far removed from the rest of the observations and is usually generated by some unusual factor. Outliers are a problem because if used in the estimation procedure, substantial changes can occur in the coefficients of the regression equation or likelihood function. This in turn will induce substantial changes in any predictions that are made of times to failure.

Yet the detection of an outlier is not straightforward. Traditionally, the errors in eqn (1a) are assumed to be normally distributed, so that W₁, constructed using either least squares or maximum likelihood estimates of μ and σ, follows a standard normal distribution. In such a situation, any value for W₁ outside the range -1.96 to +1.96 can be considered an outlier error and so the Y value corresponding to that error is also an outlier amongst all other Y values.

This approach to detecting outliers is both fundamentally weak and wholly inappropriate when dealing with creep data. The procedure is weak because if an outlier does exist it is used along with the rest of the data to estimate μ and σ, making it less likely that the error in question, computed using such estimates, will appear as an outlier. Further, the possibly non-normal nature of creep failure time data means that critical values for deciding if an extreme failure time is indeed inconsistent with the rest of the data are unavailable.

In a manner similar to that followed by Belsley et al.,⁵ the following two statistics will be used to detect outliers.

(i) The standardised residual (OSR and MSR)

The standardised error, W₁, forms a very useful starting point simply because it is a scaled residual. Thus, when k = ∞, W₁ follows a standard normal distribution with mean zero and variance 1. Alternatively, when k = 1, W₁ follows a standard extreme value distribution with a mean equal to the negative of Euler's constant (-0.5772) and a variance equal to π²/6. More generally, W₁ follows a standard log gamma distribution (given by eqn (3c)) with a mean of √k[Ψ(k) - ln(k)] and variance equal to kΨ'(k). Ψ(k) and Ψ'(k) are the digamma and trigamma functions defined in Section 6.2 below.

To prevent the existence of any outliers reducing the absolute size of W₁ by distorting estimates of μ and σ, the standardised errors associated with each reading on log failure time (Yᵢ) are constructed using the formula:

$$SR_i = \frac{Y_i - \mu(i)}{\sigma(i)} \qquad (4a)$$

μ(i) denotes the value for μ estimated using all values of Y except the ith reading; σ(i) has exactly the same meaning. This can be constructed using either ordinary least squares estimates of μ and σ (OSR), or using maximum likelihood estimates (MSR). If a hat is used to indicate least squares estimates and a tilde maximum likelihood estimates, then:

$$OSR_i = \frac{Y_i - \hat{\mu}(i)}{\hat{\sigma}(i)} \qquad (4b)$$

and:

$$MSR_i = \frac{Y_i - \tilde{\mu}(i)}{\tilde{\sigma}(i)} \qquad (4c)$$
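The leave-one-out construction of eqn (4a) can be sketched in a few lines of Python. This is an illustration under a stated assumption: it covers only the large-k least squares case, where μ(i) and σ(i) reduce to the sample mean and standard deviation of the remaining readings. The function name and the data are hypothetical.

```python
import statistics

def leave_one_out_sr(y):
    """Standardised residuals of eqn (4a), taking mu(i) and sigma(i) as the
    mean and standard deviation of all readings except the ith
    (assumption: the k = infinity, least squares case)."""
    residuals = []
    for i, y_i in enumerate(y):
        others = y[:i] + y[i + 1:]            # drop the ith reading
        mu_i = statistics.mean(others)
        sigma_i = statistics.stdev(others)    # N - 1 divisor on the reduced sample
        residuals.append((y_i - mu_i) / sigma_i)
    return residuals

# Hypothetical log failure times; the last reading is deliberately extreme.
log_times = [8.1, 8.4, 8.6, 8.7, 8.9, 9.0, 9.2, 11.5]
sr = leave_one_out_sr(log_times)
symmetric_sr = leave_one_out_sr([0.0, 1.0, 2.0, 3.0, 4.0])
```

Because the extreme reading is excluded when its own μ(i) and σ(i) are computed, it cannot inflate the scale against which it is judged, which is the point made in the text.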

(ii) Standardised fit (ODFITS and MDFITS)

Influential observations can also be detected by measuring any change in some prediction of failure time that results from dropping a particular failure time observation. The failure time prediction chosen is the mean or expected log failure time, E[Y], so that the following scaled statistic can be used to detect outliers:

$$DFITS_i = \frac{E[Y] - E[Y(i)]}{\sqrt{\operatorname{var} E[Y(i)]}} \qquad (5a)$$

Again E[Y(i)] is the mean log failure time calculated by excluding the ith failure time reading; var E[Y(i)] is simply the calculated variance in mean log failure time when the ith reading on Y is omitted from the calculation. DFITSᵢ can be constructed using a least squares estimate of E[Y] and var E[Y(i)] (ODFITS), or maximum likelihood estimates (MDFITS). ODFITSᵢ is found by substituting into eqn (5a):

$$E[\hat{Y}] = \frac{\sum Y_i}{N} \qquad (5b)$$

and:

$$\operatorname{var} E[\hat{Y}] = \frac{\sum [Y_i - E[\hat{Y}]]^2}{N^2 - N} \qquad (5c)$$

N is the number of recorded failure times in the sample.

MDFITSᵢ is derived by inserting into eqn (5a) the following:

$$E[\tilde{Y}] = \tilde{\mu} + \tilde{\sigma}\sqrt{k}\,[\Psi(k) - \ln(k)] \qquad (5d)$$

and:

$$\operatorname{var} E[\tilde{Y}] = \operatorname{var}[\tilde{\mu}] + D^2 \operatorname{var}[\tilde{\sigma}] + 2D \operatorname{cov}[\tilde{\mu}, \tilde{\sigma}] \qquad (5e)$$

where D = √k[Ψ(k) - ln(k)]; var refers to the variance of the parameter contained within the square brackets and cov[μ̃, σ̃] is the covariance between the maximum likelihood estimates for μ and σ. These variances and covariances come from the inverse of the negative of the H (or Hessian) matrix discussed in Section 6.3 below.
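For the least squares version, eqns (5a) to (5c) translate directly into code. The sketch below is illustrative only: the function name `odfits` and the data are hypothetical, and eqn (5c) is applied to the reduced sample (all readings except the ith), which is one reading of "var E[Y(i)]".

```python
import math
import statistics

def odfits(y):
    """Least squares DFITS of eqn (5a), built from eqns (5b) and (5c)."""
    mean_all = statistics.mean(y)                    # E[Y], eqn (5b)
    values = []
    for i in range(len(y)):
        others = y[:i] + y[i + 1:]                   # sample without reading i
        m = len(others)
        mean_i = statistics.mean(others)             # E[Y(i)]
        ss = sum((v - mean_i) ** 2 for v in others)
        var_mean_i = ss / (m * m - m)                # eqn (5c) on the reduced sample
        values.append((mean_all - mean_i) / math.sqrt(var_mean_i))
    return values

# Hypothetical log failure times; the last reading is deliberately extreme.
log_times = [8.1, 8.4, 8.6, 8.7, 8.9, 9.0, 9.2, 11.5]
dfits = odfits(log_times)
```

A positive value indicates that the reading pulls the mean log failure time upwards (a longer than average life), and a negative value the reverse.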

5 MONTE CARLO ANALYSIS OF CRITICAL VALUES

Very large SRᵢ and DFITSᵢ values identify influential observations, but to proceed further a working distinction between very large and just large is required. An influential observation is said to exist when either the SRᵢ or DFITSᵢ statistic corresponding to that observation has a less than 5% chance of occurring. If there is only a 2.5% chance of SRᵢ exceeding $SR^*_{0.975}$ or being less than $SR^*_{0.025}$ then any observation with an SR statistic outside the range $SR^*_{0.025}$ to $SR^*_{0.975}$ is an outlier. The same goes for any observation outside the range $DFITS^*_{0.025}$ to $DFITS^*_{0.975}$. SR* and DFITS* are termed critical values, and their size will, for a given percentile, depend on sample size (N), k (type of distribution) and method of estimation (least squares or maximum likelihood). When both k and N are large, such critical values are obtained from a standard normal table, e.g. $SR^*_{0.025}$ = -1.96 and $SR^*_{0.975}$ = +1.96. Whilst SR* is not so dependent on sample size, DFITS* is, and Belsley et al.⁵ suggest the following size adjusted cut-off rule when k is large:

DFITS* = ±1.96 × …

However, for non-normal distributions (small k) and small samples, such critical values are not valid. For small k, critical values for MDFITS are likely to be smaller than for ODFITS given the relative inefficiency of least squares methods at low k values. In an (as yet) unpublished paper, Evans⁶ has used Monte Carlo techniques to take into account the interacting effects of k and N on the distributions for DFITS and SR. All the critical values shown in the following results sections come from this Monte Carlo study, a summary of which is shown in Table 1.

The following points can be drawn from the table:

(1) As expected, skewness increases quite substantially as k falls in value.

(2) Critical values for DFITS are highly dependent on N, whilst this dependency is marginal for SR (except in very small samples).

(3) Detectable differences between OSR* and MSR* and between MDFITS* and ODFITS* at the 2.5 and 97.5 percentiles are only observed for k values less than about 10 and for N values less than 50.

Table 1. Critical values for SR and DFITS statistics at the 97.5% and 2.5% levels

k      N      OSR*              MSR*
1      15     1.61 to -4.44     1.58 to -4.32
1      50     1.39 to -3.88     1.34 to -3.92
1      100    1.37 to -3.74     1.35 to -3.80
1      200    1.36 to -3.64     1.29 to -3.68
10     15     2.01 to -2.78     1.98 to -2.79
10     50     1.80 to -2.37     1.74 to -2.41
10     100    1.73 to -2.38     1.70 to -2.34
10     200    1.68 to -2.36     1.72 to -2.36
∞      15     2.22 to -2.21     2.31 to -2.28
∞      50     1.99 to -2.07     2.02 to -2.06
∞      100    1.98 to -2.03     2.02 to -1.98
∞      200    1.98 to -1.99     2.01 to -2.01

k      N      ODFITS*           MDFITS*
1      15     0.42 to -0.75     0.39 to -0.67
1      50     0.22 to -0.36     0.20 to -0.33
1      100    0.15 to -0.25     0.14 to -0.23
1      500    0.11 to -0.17     0.10 to -0.16
10     15     0.58 to -0.71     0.52 to -0.63
10     50     0.27 to -0.30     0.26 to -0.31
10     100    0.18 to -0.21     0.18 to -0.21
10     500    0.13 to -0.15     0.13 to -0.15
∞      15     0.56 to -0.55     0.58 to -0.57
∞      50     0.28 to -0.29     0.28 to -0.28
∞      100    0.20 to -0.20     0.20 to -0.20
∞      500    0.14 to -0.14     0.14 to -0.14
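The Monte Carlo study behind Table 1 is unpublished, so its exact design is not known here. As a hedged illustration of the general idea only, the Python sketch below simulates the leave-one-out standardised residual for standard normal errors (the k = ∞, least squares case) and reads off its 2.5% and 97.5% points. The function name, the number of replications and the seed are all assumptions.

```python
import math
import random

def simulate_sr_critical_values(n, replications=20000, seed=1):
    """Monte Carlo 2.5% and 97.5% points of the leave-one-out SR statistic
    for standard normal errors (assumed k = infinity case)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(replications):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y_i, others = sample[0], sample[1:]      # readings are exchangeable
        m = len(others)
        mean_i = sum(others) / m
        var_i = sum((v - mean_i) ** 2 for v in others) / (m - 1)
        draws.append((y_i - mean_i) / math.sqrt(var_i))
    draws.sort()
    lower = draws[int(0.025 * replications)]     # empirical 2.5% point
    upper = draws[int(0.975 * replications)]     # empirical 97.5% point
    return lower, upper

lower_15, upper_15 = simulate_sr_critical_values(15)
```

The simulated limits for N = 15 can be compared with the OSR* entries for k = ∞ in Table 1. Extending the sketch to finite k would require drawing W₁ from the log gamma density of eqn (3c) instead of `rng.gauss`.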

6 DEALING WITH OUTLIERS

6.1 Bounded influence estimation

An obvious approach to handling outliers is simply to delete all observations with significant SRᵢ or DFITSᵢ statistics and then re-estimate the parameters of the distribution. This is referred to below as the zero weighting option. A better approach is to minimise the influence of such outliers. Instead of giving all outliers a zero weighting, a more attractive suggestion is to weight outliers in inverse proportion to the strength of their influence. In this context an ideal weight is based on DFITSᵢ because this measures the extent to which a particular observation influences the expected (mean) log failure time. Such a weighting procedure is known as bounded influence estimation, and along lines similar to Welsch,⁷ the following weighting scheme is developed:

$$g_i = 1 \quad \text{if } DFITS^*_{0.025} \le DFITS_i \le DFITS^*_{0.975} \qquad (6a)$$

$$g_i = \frac{DFITS^*_{0.025}}{DFITS_i} \quad \text{if } DFITS_i \le DFITS^*_{0.025} \qquad (6b)$$

$$g_i = \frac{DFITS^*_{0.975}}{DFITS_i} \quad \text{if } DFITS_i \ge DFITS^*_{0.975} \qquad (6c)$$

Depending on whether least squares or maximum likelihood methods are used, ODFITSᵢ or MDFITSᵢ are substituted into eqns (6) for DFITSᵢ.
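The weighting scheme of eqns (6a) to (6c) can be written as a single function. This is a minimal Python sketch with a hypothetical function name; it assumes the lower critical value is negative and the upper one positive, as in Table 1, so that every weight lies between 0 and 1.

```python
def bounded_influence_weight(dfits_i, lower_crit, upper_crit):
    """Weight g_i from eqns (6a)-(6c); assumes lower_crit < 0 < upper_crit."""
    if dfits_i < lower_crit:
        return lower_crit / dfits_i      # eqn (6b): shrinks towards 0 as |DFITS| grows
    if dfits_i > upper_crit:
        return upper_crit / dfits_i      # eqn (6c)
    return 1.0                           # eqn (6a): observation is not down-weighted

# MDFITS* critical values for k = infinity and N = 50, read from Table 1.
LOWER, UPPER = -0.28, 0.28
# Hypothetical DFITS values: one inside the range, one above it, one below it.
weights = [bounded_influence_weight(d, LOWER, UPPER) for d in (0.10, 0.56, -0.70)]
```

An observation whose DFITS is twice the critical value therefore receives half weight, rather than the zero weight it would get under outright deletion.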

6.2 Bounded influence and least squares

The bounded influence estimates for μ and σ using least squares procedures are now obtained by minimising:

$$\sum g_i (Y_i - \mu')^2 \qquad (7)$$

where:

$$\mu' = \mu + \sigma\sqrt{k}\,[\Psi(k) - \ln(k)]$$

When there are no outliers, gᵢ takes on a value of 1 for all i readings. This is referred to below as the unweighted case and is a straightforward application of least squares techniques. In this unweighted case the least squares formulae for μ and σ are:

$$\hat{\mu} = \bar{Y} - \{\hat{\sigma}\sqrt{k}\,[\Psi(k) - \ln(k)]\} \qquad (8a)$$

$$\hat{\sigma}^2 = \frac{\sum [Y_i - \bar{Y}]^2}{(N - 1) \times \operatorname{var}[W_1]} \qquad (8b)$$

Ȳ is the arithmetic mean log failure time. Ψ(k) is the digamma function which is related to k in the following way:

$$\Psi(k) \approx \ln(k) - \frac{1}{2k} - \frac{1}{12k^2} + \frac{1}{120k^4} - \frac{1}{252k^6} + \ldots \qquad (8c)$$

The variance of W₁, var[W₁], is defined in terms of the trigamma function, Ψ'(k), and k alone:

$$\operatorname{var}[W_1] = \Psi'(k) \times k \approx 1 + \frac{1}{2k} + \frac{1}{6k^2} - \frac{1}{30k^4} + \ldots \qquad (8d)$$

In eqn (8a), the terms in the curly bracket simply give σ̂ times the mean of the error process, W₁; the sign follows from eqn (3b), since E[Y] = μ + σE[W₁] (compare eqn (5d)). Notice that when k becomes very large, the mean of W₁ becomes very small and var[W₁] approximates unity so that σ̂ is simply the standard error of the regression and μ̂ the mean log failure time. This of course corresponds to the very familiar linear regression with normally distributed errors (k = ∞ yields a normal error distribution for Y). For an assumed k value, μ̂ and σ̂ are fully defined from the data. However, Cox and Hinkley⁸ have shown that, unfortunately, such least squares estimators are quite inefficient for k values less than about 5.
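The series of eqns (8c) and (8d) are easy to evaluate directly, which makes the large-k behaviour described above concrete. The Python sketch below is illustrative: the function names are hypothetical and the series are truncated at the terms printed in the text, so the values for small k are approximations only.

```python
import math

def digamma_series(k):
    """Asymptotic series of eqn (8c), truncated after the 1/k^6 term."""
    return (math.log(k) - 1 / (2 * k) - 1 / (12 * k ** 2)
            + 1 / (120 * k ** 4) - 1 / (252 * k ** 6))

def var_w1_series(k):
    """Asymptotic series of eqn (8d) for var[W1] = k * trigamma(k)."""
    return 1 + 1 / (2 * k) + 1 / (6 * k ** 2) - 1 / (30 * k ** 4)

def mean_w1(k):
    """Mean of W1: sqrt(k) * [digamma(k) - ln(k)]."""
    return math.sqrt(k) * (digamma_series(k) - math.log(k))

# Tabulate the mean and variance of W1 for a few illustrative k values.
for k in (1, 10, 1000):
    print(k, round(mean_w1(k), 4), round(var_w1_series(k), 4))
```

Even at k = 1 the truncated series stay close to the exact values quoted in Section 4 (-0.5772 and π²/6), and by k = 1000 the mean is near zero and the variance near one, in line with the normal limit.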

6.3 Bounded influence and maximum likelihood

Maximum likelihood estimators are often used because of the inefficiency of least squares estimates. For a given k, μ and σ are chosen so as to maximise the joint probability of observing a given set of Y values. Prentice³ shows that this corresponds to maximising the following log likelihood function (when none of the observations are censored):

$$\ln(L; \tilde{\mu}, \tilde{\sigma}, \bar{k}) = (k - 0.5) \times \ln(k) - \ln(\Gamma[k]) - \ln(\sigma) + \sqrt{k}\,W_1 - k \exp\left(\frac{W_1}{\sqrt{k}}\right) \qquad (9a)$$

Eqn (9a) gives the contribution of a single reading; the sample log likelihood is the sum of these contributions over all readings.

In turn, the bounded influence estimates for μ and σ using maximum likelihood techniques are obtained by maximising the following weighted log likelihood function:

$$\ln(L; \tilde{\mu}, \tilde{\sigma}, \bar{k})_g = (k - 0.5) \times \ln(k) - \ln(\Gamma[k]) - \ln(\sigma) + \sqrt{k}\,(gW_1) - k \exp\left(\frac{gW_1}{\sqrt{k}}\right) \qquad (9b)$$

where g is the weighting variable whose construction was described above.

The Newton-Raphson numerical procedure can be used to maximise these likelihood functions for a predetermined value of k. Details of this procedure can be found in Greene,⁹ but basically it involves iterating the following equation to convergence:

$$B_1 = B_0 - H(B_0)^{-1} U(B_0)$$

B₀ is a vector containing initial guesses of the values for μ and σ; B₁ is a vector of improved estimates; U(B₀) is a vector of first partial derivatives of ln(L) with respect to μ and σ, computed using the initial guesses; whilst H(B₀) is a matrix of second and cross-partial derivatives (cross partials are located off the diagonal of the matrix). Derivative formulas are to be found in Lawless.¹⁰ Once B₁ has been calculated it replaces B₀ and this iterative replacement continues until there is little change in successive estimates of the parameters. The inverse of the negative of the H matrix (usually referred to as the Hessian) obtained on the final iteration is particularly useful because it gives asymptotic estimates of the variances of the parameters μ and σ. Thus, element (1,1) of this inverse gives the variance for μ, element (2,2) the variance for σ and element (1,2) the covariance between μ and σ.

The unweighted maximum likelihood estimates of μ and σ, from which the SRᵢ and DFITSᵢ statistics are derived, are obtained by setting gᵢ equal to 1 for all i readings in eqn (9b). gᵢ values are then obtained from the constructed SR and DFITS statistics, allowing bounded influence maximum likelihood estimates of μ and σ to be derived from a Newton-Raphson maximisation of eqn (9b).
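The iteration described above can be sketched for the two parameters μ and σ with k held fixed. The Python code below is an illustration, not the TSP routine used in the paper: the function names, the data, the starting values and the stopping rule are assumptions, and the derivatives were obtained here by differentiating eqn (9b) term by term rather than taken from Lawless.

```python
import math

def score_and_hessian(y, g, k, mu, sigma):
    """Gradient U and Hessian H of the weighted log likelihood (9b),
    summed over readings, with respect to (mu, sigma) for fixed k."""
    a = math.sqrt(k)
    u_mu = u_sig = h_mm = h_ms = h_ss = 0.0
    for y_i, g_i in zip(y, g):
        w = g_i * (y_i - mu) / sigma              # weighted standardised error g*W1
        e = math.exp(w / a)
        u_mu += -g_i * a * (1 - e) / sigma
        u_sig += (-1 - a * w * (1 - e)) / sigma
        h_mm += -e * g_i ** 2 / sigma ** 2
        h_ms += g_i * (a * (1 - e) - e * w) / sigma ** 2
        h_ss += (1 - e * w ** 2 + 2 * a * w * (1 - e)) / sigma ** 2
    return (u_mu, u_sig), ((h_mm, h_ms), (h_ms, h_ss))

def newton_raphson(y, g, k, mu, sigma, tol=1e-10, max_iter=50):
    """Iterate B1 = B0 - H(B0)^-1 U(B0) until the step is negligible."""
    for _ in range(max_iter):
        (u1, u2), ((h11, h12), (_, h22)) = score_and_hessian(y, g, k, mu, sigma)
        det = h11 * h22 - h12 * h12
        step_mu = (h22 * u1 - h12 * u2) / det     # first element of H^-1 U
        step_sig = (h11 * u2 - h12 * u1) / det    # second element of H^-1 U
        mu, sigma = mu - step_mu, sigma - step_sig
        if abs(step_mu) + abs(step_sig) < tol:
            break
    return mu, sigma

log_times = [8.1, 8.4, 8.6, 8.7, 8.9, 9.0, 9.2, 9.5]   # hypothetical data
weights = [1.0] * len(log_times)                        # unweighted case, eqn (9a)
mu_hat, sigma_hat = newton_raphson(log_times, weights, 100.0, 8.8, 0.45)
```

Passing weights from eqns (6a) to (6c) in place of the vector of ones gives the bounded influence estimates. The sketch has no safeguard against a step that makes σ negative, so sensible starting values (for example the least squares estimates) are assumed.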

6.4 Testing amongst alternative distributions

Within the maximum likelihood framework set out above, a powerful distribution test emerges. Normality can be tested against any alternative distribution contained within the generalised gamma distribution. For example, log normal against Weibull, log normal against exponential or log normal against gamma or generalised gamma. Because k and σ determine the form of the distribution, the test is one of simply accepting or rejecting a given k, or k and σ, combination.

The procedure then is to fix the value for k in advance and then estimate σ and μ so as to maximise the log likelihood function given by eqns (9a) or (9b), depending on whether any outliers exist (eqns (9a) and (9b) are equivalent when no outliers exist because gᵢ = 1 for all i). Let L(μ̃, σ̃, k̄) stand for the maximised value of the unweighted likelihood function and L(μ̃, σ̃, k̄)g stand for the maximised value of the bounded influence likelihood function. The tilde indicates the parameters that are estimated, whilst the bar indicates values which are fixed prior to estimation. This process is repeated for a whole range of k values so as to obtain maximised likelihoods corresponding to a number of different distributions. Let L max(k*) stand for the largest of all these recorded likelihoods in the unweighted case, and L max(k*)g for the largest likelihood in the bounded influence case. These obviously correspond to a particular k value, k*. Large sample statistical theory then shows that the statistics:

$$LR = -2 \ln\left[\frac{L(\tilde{\mu}, \tilde{\sigma}, \bar{k})}{L\max(k^*)}\right] \qquad (10a)$$

$$LR_g = -2 \ln\left[\frac{L(\tilde{\mu}, \tilde{\sigma}, \bar{k})_g}{L\max(k^*)_g}\right] \qquad (10b)$$

are asymptotically distributed as a chi square variable with one degree of freedom. A 95% confidence interval for k is then made up of all those k values which result in LR (or LRg in the presence of outliers) being less than $\chi^2_{1,0.05}$ = 3.841, or equivalently which result in R max(k) = L(μ̃, σ̃, k̄)/L max(k*) exceeding exp(-3.841/2) ≈ 0.147. (Under bounded influence estimation this rule becomes R max(k)g = L(μ̃, σ̃, k̄)g/L max(k*)g exceeding 0.147.) R max(k) defines the maximised relative likelihood function. Distributions corresponding to k values outside of this 95% range are then said to be inconsistent with the presented data. Thus, if k = 1 lies outside this 95% range t does not follow a Weibull distribution. By fixing σ alone or k and σ in advance of maximising the log likelihood, tests for a two-parameter gamma and exponential distribution can be formed in the same way.
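The acceptance rule for k can be illustrated with a short Python sketch. It takes the relative likelihood as the ratio of the likelihood at a fixed k to the largest likelihood over all k, so that it lies between 0 and 1, and works on the log scale. The function name and the profile of maximised log likelihoods are hypothetical and are not the BSC results.

```python
import math

CHI2_1_95 = 3.841   # chi square critical value, one degree of freedom, 5% level

def likelihood_ratio_set(log_lik_by_k):
    """Return the k values whose LR statistic falls below the 5% critical
    value, together with their relative likelihoods."""
    log_lik_max = max(log_lik_by_k.values())
    accepted = {}
    for k, log_lik in log_lik_by_k.items():
        lr = -2.0 * (log_lik - log_lik_max)          # LR statistic on the log scale
        relative = math.exp(log_lik - log_lik_max)   # relative likelihood in (0, 1]
        if lr < CHI2_1_95:
            accepted[k] = relative
    return accepted

# Hypothetical maximised log likelihoods for a grid of k values.
profile = {1: -25.0, 4: -22.5, 8: -21.6, 100: -20.9, 1000: -20.8}
accepted = likelihood_ratio_set(profile)
threshold = math.exp(-CHI2_1_95 / 2.0)               # the 0.147 cut-off in the text
```

The two forms of the rule are the same test: LR below 3.841 is equivalent to a relative likelihood above exp(-3.841/2), which is where the figure of 0.147 comes from.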

An additional test for correct distributional specification is constructed using reliability rates. Reliability, defined as the probability of surviving a specified time, is modelled as:

$$\hat{R}[t] = Q(k,\; k \exp[(Y - (U + b\ln(k)))/b]) \qquad (11a)$$

when failure times follow a generalised gamma distribution. Such reliability rates can be obtained using both the unweighted and bounded influence estimates for U and b. Q is simply the incomplete gamma function, tabulated values of which are again given in Abramowitz and Stegun.¹¹ Such modelled reliability rates can then be compared to actual rates measured as:

$$R[t] = 1 - [(i - 0.5)/N] \qquad (11b)$$

where N is the number of recorded failures; and i is a rank ranging from 1 to N when failure times are ordered from lowest to highest. Stephens¹² has tabulated critical values for the largest absolute deviation between modelled and actual reliability rates: D = max |R̂[t] - R[t]|. Any D value in excess of such critical values is then taken as evidence that failure times do not follow a generalised gamma distribution with those parameter values used to construct R̂[t].

6.5 Predicting times to failure

If W₁ follows the generalised log gamma distribution given by eqn (3c) then a maximum likelihood estimate (mid-point prediction) of the pth quantile for Y is given by:

$$\hat{Y}_p = \tilde{\mu} + W_{k,p}\,\tilde{\sigma} \qquad (12a)$$

where:

$$W_{k,p} = \ldots \qquad (12b)$$

$\chi^2_{2k,p}$ is the pth percent critical value obtained from the chi square distribution with 2k degrees of freedom. Such an equation is applicable when all the uncensored data are recorded at one stress and one temperature. Approximate 95% confidence intervals for each quantile can be obtained by hypothesising a value for Yp and then estimating the log likelihood using that hypothesised value. If Q₀ is the hypothesised quantile value, then eqn (12a) can be rewritten as:

$$Q_0 = \tilde{\mu} + W_{k,p}\,\tilde{\sigma}$$

This allows us to express μ and W₁ as a function of Q₀:

$$\tilde{\mu} = Q_0 - W_{k,p}\,\tilde{\sigma} \qquad (13a)$$

$$W_1 = \frac{(Y - Q_0)}{\tilde{\sigma}} + W_{k,p} \qquad (13b)$$

With W₁ defined in this way, the likelihood function can be maximised for various imputed values for Q₀. Let L(Q̄₀, σ̃, k̄) be the maximum of the likelihood when estimated at a given value for Q₀ (Q₀ = Q̄₀) and k (k = k̄). In turn, let L(Q̄₀, σ̃, k̄)g be the maximum of the likelihood when weights are used to tackle outliers and when estimated at a given value for Q₀ (Q₀ = Q̄₀) and k (k = k̄). Obviously when Q₀ is equal to the mid-point prediction (Q₀ = Ŷp), the likelihood function will take on its greatest value, i.e. it will be equal to L max(k*) or L max(k*)g under bounded influence estimation. The 95% confidence interval for Yp in the unweighted case is then made up of all those values for Q₀ which result in R max(Q₀) exceeding 0.147, where:

$$R\max(Q_0) = \frac{L(\bar{Q}_0, \tilde{\sigma}, \bar{k})}{L\max(k^*)} \qquad (14a)$$

Under bounded influence estimation the 95% confidence interval is given by:

$$R\max(Q_0)_g = \frac{L(\bar{Q}_0, \tilde{\sigma}, \bar{k})_g}{L\max(k^*)_g} \qquad (14b)$$

In this way, predictions, with levels of confidence, can be obtained for Y at various quantiles for both weighted and unweighted situations. Of most interest is the lower limit on the confidence interval for Y₀.₁ and Y₀.₀₁. The latter gives a lower bound for the log of time at which only 1% of all items on test will fail. Put differently, it is the log time at which there is only a 1% chance of failure. There is in turn only a 5% chance of this log time being wrong. The former is for a 10% failure rate.
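The reliability comparison of Section 6.4 (eqn (11b) and the statistic D) can be sketched in Python. Eqn (11a) needs the incomplete gamma function for finite k, which the standard library does not provide, so this illustration assumes the log normal limit (k = ∞), where modelled reliability is one minus the normal distribution function. The function names and the three-point data set are hypothetical.

```python
import math

def modelled_reliability_lognormal(y, mu, sigma):
    """Survival probability at log time y, assuming the k = infinity
    (log normal) limit rather than the general form of eqn (11a)."""
    z = (y - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))     # 1 - Phi(z)

def max_reliability_deviation(log_times, mu, sigma):
    """D = max |modelled - actual|, with actual rates from eqn (11b)."""
    ordered = sorted(log_times)
    n = len(ordered)
    d = 0.0
    for rank, y in enumerate(ordered, start=1):
        actual = 1.0 - (rank - 0.5) / n            # eqn (11b)
        modelled = modelled_reliability_lognormal(y, mu, sigma)
        d = max(d, abs(modelled - actual))
    return d

# Hypothetical standardised log failure times with mu = 0 and sigma = 1.
d_stat = max_reliability_deviation([-1.0, 0.0, 1.0], mu=0.0, sigma=1.0)
```

The resulting D would then be compared with the tabulated critical values of Stephens; those values are not reproduced here.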

The program used to carry out unweighted and bounded influence maximum likelihood estima- tion was TSP (version 4.2b).


7 RESULTS

In Evans, ~ estimates were made for /~ and o in three data sets. Figure 1 contains all the readings made in each of the three data sets. The first set was a sample of 30 failure times recorded at a temperature of 550°C and a stress of 173 MPa. The unweighted likelihood function was maxi- mised when k =0% but the 95% confidence interval included values for k down to 8. Figures 2 and 3 show, for the first time, maximum likelihood estimates of SR~ and DFITSi when k = oo and when k = 8. Negative values imply short times to failure, whilst positive values imply longer than average times to failure. (The least squares estimates were very similar in value and so are not shown.) Both sets of statistics tell the same story. Namely, that for the most supported distribution (log normal) the 24th observation is an outlier, and for the least supported distribu- tion (log gamma with k = 8) the third observation may also be an outlier. This extra outlier results from the distribution becoming skewed towards lower failure times as k decreases.

Figure 4 shows the relative maximised likelihood for the unweighted and bounded influence situations, where the weights come from the application of eqns (6) to the data in Fig. 3. Under bounded influence estimation the log normal distribution is no longer most supported by the data. Instead, the best supported distribution is the log gamma distribution with k = 1000. In turn, the lower 95% confidence limit on k falls from 8 to around 4, with k values between 4 and 1000 being more likely to be correct, and k values more than 1000 being less likely to be correct.
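The way a relative maximised likelihood translates into a 95% interval for k can be sketched as follows. The sketch assumes the conventional likelihood ratio calibration, in which a value of k is retained when its relative likelihood is at least exp(-χ²/2), with χ² the 95% critical value on one degree of freedom. That calibration, the function names and the log likelihood values in the example are assumptions for illustration and are not taken from the data sets discussed here.

```python
import math
from statistics import NormalDist


def relative_likelihood(loglik_k: float, loglik_max: float) -> float:
    """Ratio of the likelihood maximised at a fixed k to the overall maximum."""
    return math.exp(loglik_k - loglik_max)


def likelihood_cutoff(level: float = 0.95) -> float:
    """Relative likelihood below which k falls outside the interval.

    The chi square critical value on one degree of freedom equals the
    square of the corresponding two-sided standard normal quantile.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.exp(-0.5 * z * z)


def k_is_supported(loglik_k: float, loglik_max: float, level: float = 0.95) -> bool:
    """True when the fixed-k fit lies inside the likelihood-based interval."""
    return relative_likelihood(loglik_k, loglik_max) >= likelihood_cutoff(level)


# Hypothetical maximised log likelihoods, for illustration only.
print(round(likelihood_cutoff(), 4))
print(k_is_supported(-10.0, -9.0))  # relative likelihood about 0.37
print(k_is_supported(-12.0, -9.0))  # relative likelihood about 0.05
```

Under this assumption the cut-off is roughly 0.147, so any k whose relative maximised likelihood lies above that level would fall inside the 95% interval.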

Table 2 shows, for three values of k, unweighted maximum likelihood estimates for μ and σ, together with bounded influence estimates and estimates obtained by zero weighting the outliers. The effect of bounded influence estimation is to decrease the estimates of μ and σ, and more noticeably to increase the standard error of μ shown in brackets in Table 2. As expected, zero weighting results in larger parameter changes.
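The effect of the three weighting schemes can be sketched for the log normal limit (k = ∞), where maximising a weighted log likelihood gives closed-form estimates: a weighted mean for μ and a weighted root mean square deviation for σ. The weighted likelihood form, the function name, the log failure times and the weight of 0.3 are all hypothetical choices made for illustration; the weights used in the paper come from eqns (6), which are not reproduced in this sketch.

```python
import math
from typing import Sequence, Tuple


def weighted_normal_mle(y: Sequence[float], w: Sequence[float]) -> Tuple[float, float]:
    """Maximise sum_i w_i * log N(y_i; mu, sigma) and return (mu, sigma)."""
    if len(y) != len(w):
        raise ValueError("y and w must have the same length")
    total = sum(w)
    if total <= 0.0:
        raise ValueError("weights must sum to a positive value")
    mu = sum(wi * yi for wi, yi in zip(w, y)) / total
    variance = sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, y)) / total
    return mu, math.sqrt(variance)


# Hypothetical log failure times; the last value is an unusually long life.
log_times = [8.1, 8.2, 8.3, 8.4, 8.5, 9.6]

unweighted = weighted_normal_mle(log_times, [1.0] * 6)
bounded = weighted_normal_mle(log_times, [1.0, 1.0, 1.0, 1.0, 1.0, 0.3])
zero_weighted = weighted_normal_mle(log_times, [1.0, 1.0, 1.0, 1.0, 1.0, 0.0])

for label, (mu, sigma) in [("unweighted", unweighted),
                           ("bounded influence", bounded),
                           ("zero weighting", zero_weighted)]:
    print(f"{label:18s} mu = {mu:.3f}  sigma = {sigma:.3f}")
```

In this toy example, downweighting the long-lived observation lowers both estimates, and setting its weight to zero lowers them further, which is the same ordering as reported in Table 2.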

Figure 5 plots the modelled reliability given by eqn (11a) against the actual reliability given by eqn (11b), using both the unweighted and bounded influence maximum likelihood estimates of μ and σ. In each case the k value used corresponded to that value which maximised R max(k) and R max(k)_g, respectively. Given that a perfect model corresponds to the 45° line, it follows that a marked improvement in performance comes from bounded influence estimation. D (the maximum absolute deviation between modelled

Fig. 1. BSC high temperature creep data. (Readings plotted against stress (MPa) for tests at 550°C, 575°C and 600°C.)


Fig. 2. Standardised residuals (maximum likelihood). Stress = 173 MPa, temperature = 550°C. (SR_i against observation number, 1 to 30, for k = ∞ and k = 8.)

Fig. 3. Standardised fit (maximum likelihood). Stress = 173 MPa, temperature = 550°C. (DFITS_i against observation number, 1 to 30, for k = ∞ and k = 8.)


Fig. 4. Maximised relative likelihood for various k. Stress = 173 MPa, temperature = 550°C. (Unweighted and bounded influence curves plotted against k.)

and actual reliability) falls from 0.15 in the unweighted case with k = ∞, to 0.10 under bounded influence estimation with k = 1000.
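A statistic of this kind can be sketched for the log normal limit as follows. The sketch assumes that modelled reliability is one minus the fitted normal distribution function of log failure time, and that actual reliability is an empirical estimate based on the plotting position (i - 0.5)/n for the ith ordered observation. Both choices, the function name and the sample values are assumptions made for illustration, since eqns (11a) and (11b) are not reproduced in this sketch.

```python
from statistics import NormalDist
from typing import Sequence


def max_reliability_deviation(log_times: Sequence[float], mu: float, sigma: float) -> float:
    """Return D, the largest absolute gap between modelled and actual reliability."""
    if not log_times:
        raise ValueError("at least one observation is required")
    fitted = NormalDist(mu, sigma)
    ordered = sorted(log_times)
    n = len(ordered)
    largest = 0.0
    for i, y in enumerate(ordered, start=1):
        modelled = 1.0 - fitted.cdf(y)   # reliability implied by the fitted model
        actual = 1.0 - (i - 0.5) / n     # empirical reliability (assumed plotting position)
        largest = max(largest, abs(modelled - actual))
    return largest


# Hypothetical log failure times and parameter values, for illustration only.
sample = [7.9, 8.1, 8.3, 8.5, 8.7]
d_stat = max_reliability_deviation(sample, 8.3, 0.3)
print(f"D = {d_stat:.3f}")
```

A smaller D means the plotted points lie closer to the 45° line, so comparing D across fits gives a single-number summary of the improvement visible in a probability plot.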

Figure 6 shows the 95% confidence interval for the 1% quantile prediction of log failure time using unweighted and bounded influence estimation. All the k values shown are accepted by the data and very little difference exists between the two estimation methods. The lowest limit (LL*) in the unweighted cases was 6.79 log hours, whilst under bounded influence procedures this figure, (LL*)_g, was slightly higher at 6.85 log hours. The upper limit (UL*) was very similar for the unweighted and bounded influence scenarios: 7.65 log hours versus 7.69 log hours, respectively.

The second data set was a sample of 44 recorded failure times at a temperature of 600°C and a stress of 94 MPa. k = ∞ again maximised the unweighted log likelihood function and k = 1 lay at the lower bound of a 95% confidence interval for k. Figure 7 plots the MDFITS_i statistics for both these distributions. For the normal distribution only the 12th observation showed up as an outlier. The skew to the left, introduced by estimating when k = 1, removes such an outlier but at the expense of introducing two new outliers at the top (right) end of the distribution.

Figure 8 shows the maximised relative likelihood for the unweighted and bounded log

Table 2. Comparison of weighting schemes for data obtained at 550°C and 173 MPa

           Unweighted            Bounded influence     Zero weighting
           μ         σ           μ         σ           μ         σ
k = 8      8.380     0.412       8.347     0.36        8.330     0.336
           (84.78)   (7.31)      (124.5)   (7.9)       (132.5)   (7.74)
k = 100    8.330     0.389       8.303     0.348       8.289     0.328
           (96.03)   (7.73)      (129.3)   (7.80)      (136.1)   (7.66)
k = ∞      8.309     0.381       8.288     0.345       8.273     0.324
           (100.7)   (7.83)      (130.4)   (7.75)      (137.4)   (7.62)


Fig. 5. Generalised log gamma probability plot (k = 1000 and k = ∞). Stress = 173 MPa, temperature = 550°C. (Horizontal axis: actual reliability; series: unweighted (k = ∞) and bounded influence (k = 1000).)

likelihood functions. Both have a maximum at k = ∞ and in both cases 1 defines the lower bound of the 95% confidence interval for k. However, over the intervening k values R max(k)_g is always smaller than R max(k).

Table 3 shows unweighted maximum likelihood estimates for μ and σ, together with bounded influence estimates and estimates obtained by zero weighting any outliers. There is very little difference between the unweighted and bounded influence estimates of μ and σ. However, the zero weighting of outliers does lead to noticeable changes when k = 1, i.e. reductions in μ and σ, and increases in the standard errors of each estimate.

The third data set contained 20 failure times at

Fig. 6. 95% confidence intervals for the 1% quantile. Stress = 173 MPa, temperature = 550°C with sample = 30. (Upper limit, lower limit, point estimate and failure time shown for unweighted and bounded influence estimation at several k values.)


Fig. 7. Standardised fit (maximum likelihood). Stress = 94 MPa, temperature = 600°C. (DFITS_i against observation number, 1 to 44, for k = ∞ and k = 1.)

Fig. 8. Maximised relative likelihood for various k. Stress = 94 MPa, temperature = 600°C. (Unweighted and bounded influence curves plotted against k.)


Table 3. Comparison of weighting schemes for data obtained at 600°C and 94 MPa

           Unweighted            Bounded influence     Zero weighting
           μ         σ           μ         σ           μ         σ
k = 1      8.80      0.447       8.795     0.445       8.744     0.417
           (120.9)   (5.70)      (123.4)   (8.68)      (128.3)   (8.43)
k = 10     8.634     0.480       8.634     0.481       8.634     0.481
           (118.2)   (6.55)      (118.46)  (9.32)      (118.5)   (9.32)
k = ∞      8.558     0.483       8.561     0.480       8.583     0.462
           (117.5)   (7.12)      (118.2)   (9.38)      (121.9)   (9.28)


a temperature of 575°C and a stress of 126 MPa. k = ∞ again maximised the unweighted log likelihood function and k = 1 lay at the lower bound of a 95% confidence interval for k. Figure 9 plots the MDFITS_i statistics for both these models, with two outliers being identified at both ends of the symmetric normal distribution (observations 1 and 3).

Finally, Fig. 10 plots the two maximised relative likelihood functions. When failure times are unweighted the log normal distribution is most supported by the data whilst the extreme value distribution is least supported. This conclusion changes quite substantially when the existing outliers are bounded. In this situation k = 500 is most supported by the data, and k = 2 the least supported generalised log gamma distribution.

Table 4 shows unweighted maximum likelihood estimates for μ and σ, together with bounded influence estimates and estimates obtained by zero weighting any outliers. The only noticeable difference between bounded influence estimation and the unweighted case is for σ at

Fig. 9. Standardised fit (maximum likelihood). Stress = 126 MPa, temperature = 575°C. (DFITS_i against observation number, 1 to 20, for k = ∞ and k = 1.)


Fig. 10. Maximised relative likelihood for various k. Stress = 126 MPa, temperature = 575°C. (Unweighted and bounded influence curves plotted against k.)

high k values. Zero weighting of outliers exaggerates this effect.

8 CONCLUSIONS

BSC high temperature failure time data are readily accessible and are frequently used in empirical studies of creep behaviour. It has been shown above that these data contain a number of outlier observations, which future researchers should be aware of when drawing conclusions from the data.

When bounded influence estimation was used to isolate the influential data points, it became clear that the log normal distribution was no longer the distribution that best fitted the failure time data. Instead, the generalised gamma distribution with high k values became a better description of the data. Bounded influence estimation also resulted in a better fit to the data, as shown by improved estimates of reliability rates, although the effect of the outliers on lower quantile predictions of failure time was minimal. Finally, bounded influence estimation was shown to lead to less abrupt changes in parameter estimates than zero weighting.

Table 4. Comparison of weighting schemes for data obtained at 575°C and 126 MPa

           Unweighted            Bounded influence     Zero weighting
           μ         σ           μ         σ           μ         σ
k = 1      8.546     0.521       8.546     0.519       8.549     0.519
           (56.38)   (4.96)      (69.33)   (6.31)      (69.32)   (6.30)
k = 8      8.378     0.515       8.347     0.475       8.305     0.427
           (67.75)   (6.03)      (76.91)   (6.29)      (84.05)   (6.06)
k = ∞      8.289     0.506       8.275     0.478       8.283     0.384
           (72.62)   (6.24)      (76.63)   (6.33)      (91.52)   (6.0)


REFERENCES

1. Evans, M., Statistical properties of the failure time distribution for ½Cr½Mo¼V steel tubes. In Proc. Int. Conf. on Advances in Materials and Processing Technologies, Vol 2, Dublin, Ireland, Aug. 1993, pp. 863-73.

2. Stacy, E. W., A generalisation of the gamma distribution. Ann. Math. Stat., 33 (1962) 1187-92.

3. Prentice, R. L., A log gamma model and its maximum likelihood estimation. Biometrika, 61 (1974) 539-44.

4. BSCC High Temperature Data. IISI Westerham Press, Kent, UK, 1973.

5. Belsley, D. A., Kuh, E. & Welsch, R. E., Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. John Wiley, New York, 1980.

6. Evans, M., A Monte Carlo analysis of outlier test statistic distributions in the presence of non normality. Unpublished paper, Department of Materials Engineering, University College Swansea, UK, 1993.

7. Welsch, R. E., Regression sensitivity analysis and bounded influence estimation. In Evaluation of Econometric Models, ed. J. Kmenta & J. B. Ramsey, Academic Press, New York, 1980.

8. Cox, D. R. & Hinkley, D. V., A note on the efficiency of least squares estimates. J. Royal Statistical Soc., B 30 (1968) 284-9.

9. Greene, W. H., Econometric Analysis (2nd edn), Macmillan, New York, 1993.

10. Lawless, J. F., Statistical Models and Methods for Lifetime Data. John Wiley, New York, 1982.

11. Abramowitz, M. & Stegun, I. A., Handbook of Mathematical Functions. US Government Printing Office, Washington DC, 1965.

12. Stephens, M. A., EDF statistics for goodness of fit and some comparisons. J. Amer. Statistical Assoc., 69 (1974) 730-37.