CHANGE DETECTION IN AUTOCORRELATED · residuals of the machining process with w = 0.553 subject to...

CHANGE DETECTION

IN AUTOCORRELATED PROCESSES

Jiangbin Yang

A t hesis submit ted in conformity wi th the requirements for the degree of Doctor of Philosophy

Graduate Department of Mechanical and Industrial Engineering University of Toronto

@Copyright by Jiangbin Yang 1999

National Library Bibliothèque nationale du Canada

Acquisitions and Acquisitions et Bibliographie Services services bibliographiques 395 Wellington Street 395. nre Wellington ûttawaON K 1 A W OnawaON K1AON4 Canade canada

The author bas granted a non- exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or seil copies of this thesis in rnicrofom, paper or electronic formats.

The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts fiom it may be printed or othenvise reproduced without the author's permission.

L'auteur a accordé une licence non exclusive permettant à la Bibliothèque nationale du Canada de reproduire, prêter, distribuer ou vendre des copies de cette thèse sous la forme de microfiche/nlm, de reproduction sur papier ou sur fomt électronique.

L'auteur conserve la propriété du droit d'auteur qui protège cette thèse. Ni la thèse ni des extraits substantiels de celle-ci ne doivent être imprimés ou autrement reproduits sans son autorisation.

Change Detection in Autocorrelated Processes

Jiangbin Yang, Ph.D.

Department of Mechanical and Industrial Engineering

University of Toronto, 1999

Abstract

The problem of change detection is about quick detection ~f a change in a dynamic system or

process at a low rate of false alarm by sequentially observing the system or process. It has important

applications in quality control, signal processing and other areas. This thesis studies the problem in

the context of autocorrelated processes.

First, we systematically investigate the properties of process residuals (one-step ahead forecast

errors) for change detection. We show t hat process resid uals are statistically sufficient for the pro blem

of change detection, and change detection can be done by using process residuals. We show that

process residuals are mutually uncorrelated with zero means when there is no change to the process,

that is, when the process is in-control. We develop a general procedure for specifically deriving

the forms of process residuals. Using the procedure, we derive the forms of residuals of general

autoregressive integrated moving average (ARIMA) processes and state space models, and obtain

some specific properties of the residuals under several situations. Under the Gaussian assumption,

for an ARIMA process or a steady-state state space mode1 subject to a change in process mean

level, the residuals are i.i.d. with zero means before the occurrence of change. After the occurrence

of change, the residuals are still mutudy independent with the same variance as before, but with

tirne-varying and generally

process mean level, we find

nonzero means. For an autocorrelated process subject to a change in

that the properties of residuals are independent of the feedback control

applied to the process.

We then concentrate on detection of a change iii proceçs mean level in an autocorrelated process

by using the process residuals. Cumulative surn (CUSUM), exponentially weighted moving average

(EWMA) and Shewhart control chart procedures are applied to the residuals. For computation of the

average run lengths ( ARLs) of the cont rol chart procedures applied to the residuals whose means are

time-varying after change, we derive an explicit formula for Shewhart, establish integral equations for

CUSUM and EWMA, and develop efficient numerical procedures for solving the integral equations.

Under the ARL criterion, we numerically study the performance of the control chart procedures

applied to the residuals of au tocorrelated processes under several situations.

We study the likelihood ratio (LR) testing procedure applied to the process residuals. We extend

the classical LR testing procedure by replacing its constant threshold value with a time-varying

threshold sequence. We propose a combined CUSUM and Shewhart control chart procedure to

approximate the extended LR testing procedure. We develop numericai procedures based on integral

equations for cornputation of the ARLs of these change detection procedures, and numerically study

t heir performance.

We study the problern of optimal sequential testing on process mean levels. The problern is a

special situation of change detection, an extension of the Wald's sequential probability ratio testing

(SPRT) problern, and has wide application in signal processing and other areas. We formulate the

problern as an optimal stopping problem, derive the optimal stopping rule, obtain sorne important

properties of the optimal stopping boundaries, develop a numerical procedure for computation of the

optimal stop ping boundaries, and present a numerical example.

To rny father, Jin Yang

who is so wam-hearted and diligent

and

to whom 1 feel deeply indebted

Acknowledgment s

1 thank rny supervisor, Prof. Viliam Makis, who has offered me valuable guidance and assistance

during the course of my Ph.D. study.

1 thank Mr. Youhua Chen for his friendly suggestion made to me in Spring, 1994, which inspired

me to pursue my Ph.D. in the engineering field.

I thank the members of my departmental oral examination cornmittee, Prof. Viliam Makis

(Supervisor and Chair), Prof. Andrew K. S. Jardine, Prof. Su bbarayan Pasupat hy (Department of

Electrical and Computer Engineering), Prof. Morton J. M. Posner, and Prof. Jim G. C. Templeton,

for their valuable comments and feedback on the manuscript of my thesis.

1 thank the external appraiser, Prof. Hoang Pham of the Department of Industrial Engineering,

Rutgers University, br his valuable appraisal on my thesis.

1 acknowledge the Connaught Scholarships offered to me in 1994,1995 and 1996 by the University

of Toronto. I acknowledge the Ontario Graduate Scholarships offered to me in 1996, 1997 and

1998. 1 acknowledge the continuous financial support since January, 1995 from the Condition Based

Maintenance (CBM) Consortium a t the University of Toronto led by the principal investigators Prof.

Andrew K. S. Jardine and Prof. Viliam Makis. I acknowledge the financial support from Prof. Viliam

Makis' NSERC fund.

I thank Ms. Jayne Beardsmore, an efficient administrator of the CBM Consortium, who has been

very helpful and kind. 1 thank Mr. Oscar del Rio, a knowledgeable Computing Adrninistrator of our

department, who h a s offered me much prompt and effective help. 1 thank Ms. Brenda Fung and the

graduate office who have efficiently helped me organize the final stage of my Ph.D. study.

I thank my wife, Hongmei Cao, who has accompanied me for many days and nights of hard

work, who has understood and appreciated me, who has shared many things with me, and who has

supported and encouraged me.

Contents

Abstract

Acknowledgments

1 Introduction

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1 The Problem

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 Two Production Processes

. . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.1 A Machining Process

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.22 A Cutting Process

. . . . . . . . . . . . . . . . . . . . 1.3 Problem Formulation of Change Detection

2 Residuals for Mode1 Transformation 10

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.1 Introduction 11

. . . . . . . . . . . . . . . . . . 2.2 Sufficiency of Residuals for Change Detection 13

. . . . . . . . . . . . . . . . . . . . . . . 2.3 Second-Order Properties of Residuals 15

. . . . . . . . . . . . . . . . . . . . 2.4 Procedure for Deriving Forms of Residuals 17

. . . . . . . . . . . . . . . . . . . 2.5 Properties of Residuals of ARIMA Processes 18

. . . . . . . . . . . . . . . . . . 2.5.1 Processes Subject to Additive Changes 18

2.5.2 Processes Subject to Nonadditive Changes . . . . . . . . . . . . . . . .

2.6 Properties of Residuals of State Space Models . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . 2.6.1 Residuals of a General Model

. . . . . . . . . . . . . . . . . . . . 2.6.2 Residuals of the Machining Process

2.6.3 Residuais of the Steady-State Machining Process . . . . . . . . . . . .

2.7 Independence of Residuals of Feedback Control . . . . . . . . . . . . . . . . .

2.8 Summary and the Mode1 Transformation . . . . . . . . . . . . . . . . . . . . .

3 Performance of Control Charts Applied to Residuals

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.1 Introduction

3.2 Integral Equations for Computation of the ARLs . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . 3.2.1 Shewhart Control Chart Procedure

. . . . . . . . . . . . . . . . . . . . . 3.2.2 CUSUM Control Chart Procedure

3.3.3 EWMA Control Chart Procedure . . . . . . . . . . . . . . . . . . . . .

3.3 Numerical Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

3.3.1 Results on the Machining Process Subject to a Step Shift . . . . . . . .

3.3.2 Results on the Machining Process Subject to a Drift Rate Change . . .

3.3.3 Results on the Cutting Process . . . . . . . . . . . . . . . . . . . . . .

3.3.4 Summary of the Performance of Control Charts . . . . . . . . . . . . .

4 Sequential Testing on Process Mean Levels 56

4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

4.2 Formulation as an Optimal Stopping Problem . . . . . . . . . . . . . . . 60

. . . . . . . . . . . . . . . . . . . . . 4.3 Derivation of the Optimal Stopping Rule 63

. . . . . . . . . . . . . . . . . 4.4 Properties of the Optimal S topping Boundaries 71

. . . . . . . . . . . . . . . . 4.5 Computation of the Optimal Stopping Boundaries 76

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.6 NumericalResults 78

5 Likelihood Ratio Test ing Procedures

Introduction . . . * . . . . . . . . . . . . . . . . . . . . . . . O

The Structure of LR Testing Procedures . . . . . . . . . . . .

Extended LR Testing Procedures . . . . . . . . . . . . . . . .

A Combined CUSUM and Shewhart Control Chart Procedure

Performance Analysis . . . . . . . . . . . . . . . . . . . . . . .

5.5.1 The LR Testing Procedure Applied to AR(1) Processes

. . . . . 5.5.2 The Extended LR Procedure Applied to I MA(1. 1) Processes

. . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.5.3 The CCS Procedure

. . . . . . . . . . . . . . . . . . . . . . Summary of the Performance Analyses

6 Conclusion and Future Research

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.1 Contributions 101

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.2 Future Research 103

Bibliography

vii

List of Tables

3.1 Optimum ARLs and parameters of the CUSUM, EWMA and Shewhart control

chart procedures applied to the residuals of the machining process with w =

0.553 subject to an additive step shift of size A in its state equation when their

in-control ARLs are 100. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46


chart procedures applied to the residuals of the machining process with w =

0.553 subject to a drift rate change of c' in its state equation when their in-

control ARLs are 300. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49


chart procedures applied to the residuals of the AR(2) cutting process subject

to an observational additive step shift of size A when their in-control ARLs are

100. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . * 53

4.1 The optimal stopping boundaries when b ( i ) = 0.9' for i 2 0. . . . . . . . . . . 78

5.1 ARLs of the LR testing procedure and the CUSUM and EWMA control chart

procedures applied to the residuals of an AR(1) process with autoregressive

coefficient al = 0.45 subject to an observational additive step shift of size A

when their in-control ARLs axe 100. ARLs are optimized at A = 1 in the upper

panel, and at A = 3 in the lower panel. . . . . . . . . . . . . . . . . . . . . . 90

5.2 ARLs of the extended LR (XLR) testing procedure in comparison with the

classical LR testing procedure and the CUSUM and EWMA control chart pro-

cedures applied to the residuals of an IMA(1,l) process with rnoving average

coefficient bl = 0.1 subject to an observational additive step shift of size A

when their in-control ARLs are 100. ARLs are optimized at A = 1 in the

. . . . . . . . . . . . . . . . . . upper panel, and at A = 3 in the lower panel. 94

5.3 ARLs of a CCS procedure in comparison with the LR testing procedure and the

CUSUM control chart procedure applied to the residuals of an AR(1) process

wi th autoregressive coefficient al = 0.45 subject to a n observational additive

. . . . . . . . . . . . step shift of size A when their in-control ARLs are 100. 97

List of Figures

3.1 Optimum ARLs versus sizes of change (upper panel) and the ARL curves

with the chart parameters optimizing the ARLs a t A = 2 (lower paael) of

the CUSUM, EWMA and Shewhart control chart procedures applied to the

residuals of the machining process with w = 0.553 subject to an additive step

shift of size A in its state equation, when the fixed in-control ARL is 100. . . 48


with the chart parameters optimizing the ARLs at d = 2 (lower panel) of the

CUSUM, EWMA and Shewhart control chart procedures applied to the resid-

uals of the machining process with w = 0.553 subject to a drift rate change of

d in its state equation, when the fixed in-control ARL is 300. . . . . . . . . . 51


with the chart parameters optimizing the ARLs at A = 2 (lower panel) of

the CUSUM, EWMA and Shewhart control chart procedures applied to the

residuals of the AR(2) cut ting process subject to an observational additive step

shift of size A, when the fixed in-control ARL is 100. . . . . . . . . . . . . . . 54

4.1 A typical illustration of g,(A) and the related functions and quantities. . . . . 70

4.2 The optimal stopping boundaxies of the example. . . . . . . . . . . . . . . . . 78

5.1 ARLs of the LR testing procedure and the CUSUM and EWMA control chart

procedures applied to the residuals of an AR(1) process with autoregressive

coefficient al = 0.45 subject to an observational additive step shift of size A.

For the fixed in-control ARL, the chosen chart parameters optimize the ARL

at A = 1 in the upper panel and at A = 3 in the lower panel. . . . . . . . . . 91

5.2 ARLs of the extended LR (XLR) testing procedure in comparison with the

classical LR testing procedure and the CUSUM and EWMA control chart pro-

cedures applied to the residuals of an I MA(1,l) process with moving average

coefficient bl = 0.1 subject to an observational additive step shift of size A.

For t h e fixed in-control ARL, the chosen chart parameters optimize the ARL

at A = 1 in the upper panel and at A = 3 in the lower panel. . . . . . . . . . 95

5.3 ARLs of a CCS detection procedure in comparison with the classicai LR testing

procedure and the CUSUM and EWMA control chart procedures applied to the

residuals of an AR(1) process with autoregressive coefficient al = 0.45 subject

to an observational additive step shift of size A. . . . . . . . . . . . . . . . . 98

Chapter 1

Introduction

1.1 The Problem

The problem of change detection is about quick detection of a change in a dynamic system

or process at a low rate of false alarm by sequentially observing the system or process. The

instant of change is unknown, and randomness is usudly involved in the system. The problem

arises naturally in the areas of quality control, signal processing, automatic control and othen.

Specific examples of applications, such as statistical process control, navigation system moni-

toring, seisrnic data processing, signal segmentation, and vibration monitoring of mechanical

systems, can be found in Basseville and Nikiforov (1993).

Autocorrelation may exist in the process or the sequential observations of the system.

In quality control, more than 70% of the studied processes subject to change detection are

autocorrelated, according to Alwan and Roberts (1995). Because of more complexity, more

effort is involved in designing change detection procedures for autocorrelated processes than

for independent and identically distributed (i.i.d.) processes, and neglect of the presence of

autocorrelation may seriously reduce the effectiveness of the designed detection procedures.

The impact of autocorrelation on the performance of control charts directly applied to the

output of an autocorrelated process has been reported in the literature, for example, by Alwan

(1992). Therefore, care must be taken to deal with autocorrelation properly.

In production and other areas, an autocorrelated process may be controlled in order to

compensate for expected deviations of the process from its taxget value. This will reduce the

systematic variability of the process and improve the process capability in quality control.

As a result , bot h the quality and productivity levels will be increased. A controlled process

should be rnonitored in order to detect changes to the process. A special-cause change can

disturb the process and invalidate the process mode1 that has been used to develop the control

policy. If it is not detected, the control policy will become ineffective. The complementary

roles of process control and process monitoring have been discussed by MacGregor (1988)

and Box and Krarner (1992). Integration of process control and process monitoring has been

proposed by Tucker, Faltin and Vander Wiel (1993) and Montgomery, Keats, Runger and

Messina (1994). Usually, for a controlled autocorrelated process, autocorrelation still exists

in the closed-loop process output.

Since Wald (1947) founded sequential analysis and Page (1954) proposed the CUSUM

procedure, substantial research has been done for detection of step shifts in process mean

levels in i.i.d. processes and their continuous analogue - drift rate changes in Wiener processes.

Works including Shiryaev (1961), Shiryaev (1963), Shiryaev (1978), Lorden (1971), Lai (1995),

and Siegmund and Venkatraman (1995) have made contributions to the related problems. The

optimality of the likelihood ratio CUSUM procedure under a well-defined rninimax criterion

for detection of a step shift in process mean level in an i.i.d. process was first proven by

Moustakides (1986) by using optimal stopping theory, and later by Ritov (1990) by using

saddle point decision theory. Similar optimality results have also been obtained by Beibel

(1996) and Yakir (1997). The situation where an i.i.d. process is subject to a step shift in

process mean level has been referred to as the i.i.d. case in the literature, because under this

situation, the process is i.i.d. both before and after change. Moustakides (1998) extended

the optimality condition, that is, the process is i.i.d. both before and after change, of the

likelihood ratio CUSUM procedure to a more general condition, where only the iikelihood

ratio sequence is required to be i.i.d. both before and after change. However, as stated by

Moustakides, the generalized condition is very restrictive, and the extension for autocorrelated

processes is very limited.

Since 19709, change detection in autocorrelated processes has received growing attention.

Kligene and Telknis (1983) surveyed and analyzed the works published before 1980. Bas-

seville (1988) surveyed and described the then state-of-theart statisticd methods for detect-

ing changes in signals and systems. Basseville and Nikiforov (1993) presented the existing

met hodology comprehensively, and included the relevant references. Lai (l995), wit h discus-

sions, surveyed a large variety of sequential change detection procedures widely scattered in

quality control, fault detection and signal processing literature. Since change detection in au-

tocorrelated processes is more complicated than in the i.i.d. case, there are more chalienging

problems. And less attention has been drawn to change detection in controlled processes.

This t hesis studies some problems on change detect ion in autocorrelated processes. Specif-

ically, this thesis systematically investigates the properties of process residuals and by using

residuals, achieve a mode1 transformation of change detection in autocorrelated processes ei-

ther controlled or not, establish a set of procedures for accurately and precisely computing

the ARLs of control charts and studying their performance, study a situation of change detec-

tion in autocorrelated processes - optimal sequential testing on process mean levels, analyze

and extend a widely used change detection procedure - the LR testing procedure, propose

a combined CUSUM and Shewhart scheme for change detection in autocorrelated processes,

and study the performance of these change detection procedures.

In the following sections, we will introduce two production processes that have motivated

this work, and fomulate the problem of change detection in a general framework.

1.2 Two Production Processes

1.2.1 A Machining Process

An application that rnotivated this work was a machining process introduced in Crowder

(1992), which is described as follows. A cornputer-controlled bar lathe forms cylindrical metal

pieces from a metal bar in a turning operation by cutting away excess metal. The process

is subject to tool Wear and stochastic shocks resulting from small differences in the ban,

environmental condition changes, etc.. The outside diameters of metal cylinders are measured

by an automatic gauging system w hich contri butes moderate measurement errors.

The process can be represented by a controlled random walk model with linear drift and

measurement errors as follows. For t = 1,2, *,

where Ot is the outside diameter value of t he part at time t, is its measurement, cl - N(O, 0:)

is the measurement error, ut - N ( 0 , 0:) is the stochastic shock, c is the linear drift rate, which

may well approximate the effect of tool Wear during a time period, and ut-1 is the amount

of feedback control deterrnined by {K, i t - 1). We assume that O. - N(&, q i ) is the

initial setting, {a,} and {ut) are two independent sequences of i.i.d. random variables. Vander

Wiel (1996 b) studied such a machining process with target outside diameter value of 15010

units (1 unit = 0.0001 inch), and obtained the following maximum likelihood estimates of the

process parameters:

The machining process modeled by ( 1.1) is representative of nonstationaxy production

processes subject to tool Wear and rneasurement errors. Its optimal control has been studied

by several authors. Crowder (1992) studied optimal control of the process when no drift was

present, Jensen and Vardernan (1993) developed an optimal control policy for the process with

additional adjustment errors, and Vander Wiel(1996 b) found an optimal control policy under

the assurnption that the lathe can automatically compensate for the deterministic tool W e a r

wit h negligible cost. The process mode1 belong to the class of state-space models, which have

been extensively examined in the process control literature such as Astrom and Wittenmark

(1997).

1.2.2 A Cutting Process

A second application that further stimulated this work was a metal cutting process studied by

Allen and Huang (1994). During rough turning operations of a tool, the vertical cutting force

experienced by the tool was rneasured by two strain gauges mounted on the top and bottom

faces of t h e tool. The force measurements were digitized by a analogue-to-digital converter,

and input to a computer numerical controller. The controller then manipulated the feed rate

in order to maintain the cutting force at a desired level.

The process was modeled by an autoregressive process of order 2 (AR(2) process) under

feedback control as follows.

where & represents the deviation of the vertical cutting force from its target level at time t ,

ut-1 is the control input (feed rate) that is deterrnined by {k;., i 5 t - 11, and et is the noise

input. It's assumed that { e t ) is a sequence of i.i.d. normal random variables with mean O

6

and variance u:. Allen and Huang (1994) obtained the following least square estimates of the

process paramet ers:

al = 0.45, a2 = 0.2,

b l = O . l , bz=0 .04 ,

and derived a so-called general minimum variance control policy.

The mode1 ( 1.3) is a special case of ARIMA processes under feedback control. With their

powerful capability of modeling, ARIMA processes have been widely applied in the areas of

automatic control, quality control and others, which can be seen in the literature such as Box,

Jenkins and Reinsel (1994).

In the turning operations, the cutting conditions such as depth of cutting and cutting

speed are usually changeable. These changes may cause the cutting force to largely deviate

from its target. Special-cause changes occur to the machining process in 5 1.2.1 as well, which

rnay result from abnormal status of the hydraulic system or from changes in the cutting tool

conditions. Two approaches can be employed to deal with the changes, which are adaptive

control and change detection. By employing change detection, we can detect and eliminate

the root cause of change if possible. Or, we can detect the change, and then adjust the control

policy when it is necessary. Thus, we can leave the system more stable.

1.3 Problem Formulation of Change Detection

The change detection problern can be formulated in a general statistical inference framework.

Let {x) be a stochastic process having a parametric model !P that is subject to a change at

an unknown t ime point r. The probability distribution functions of {x) belong to {Fq(,), r E

O), where R = {1,2, , oo). r is called the change point, and we want to have it detected.

The parameter value of \Ii may be a vector, and rnay be a function of time t. V(r) denotes

how the process model parameter changes. If r = 00, then no change occurs to the process,

and ik = &. If r < oo, then the change occurs to the process,

.s>, for t < r,

$i for t 3 r.

That is, the process model parameter jumps to q!q from at t = r , and then remains tl>l.

When iU = the process is said to be in-control, and when = di7 the process is said to

We assume that we start observing the process at time t = 1. Let yt be the observation

of Y f , and = (yi , , yt) be the observations obtained up to t 2 1. Let ( 0 - , y-,, y,) denote

the initial condition of the process prior to t = 1, assumed to be given. Let Y' = (- , y,-1, y,).

Based on y: (or equivalently y'), detection of the change point r is statistical inference on

r , and the detection or inference problem is the following sequential testing problem. At time

t = 1,3, O , test the hypotheses

based on the observations y: (or equivalently y'). We call these two hypotheses the change

point hypotheses. Obse r~ t ion is continued and the test is camied out until a decision in favor

8

of Hl is made for the fint time. When Hi is decided at stage t , occurrence of change is

alarmed, and the observation and test are stopped.

For t = 1,2, *, let Ft be the minimum O-dgebra induced by {K, i = 1,2, *#., t). Then,

the change detection procedure or equivalently, the sequential testing procedure is a stopping

nile (or a stoppiiig time) N 2 1 with respect to {Ft, t = 1,2, . 8 . ) .

Chapter 2

Residuals for Mode1 Transformation

2.1 Introduction

When autocorrelation became prominent in the 19709, concern was raised about the perfor-

mance of control charts directly applied to autocorrelated processes. As observed and studied

by Johnson and Bagshaw (1974), Bagshaw and Johnson (1975), Vasilopoulos and Stamboulis

(1978), Alwan (1992) and Wardell, Moskowitz and Plante (1992), control charts directly a p

plied to autocorrelated processes are seriously misleading. They either result in too many

false alarms or c m fail to alarm when a change really occurs.

To circumvent autocorrelation, residuals have been used for change detection in autocor-

related processes, since residuals are mutually independent under certain circumstances. An

early work on monitoring residuals of autocorrelated processes was Berthouex, Hunter and

Pallesen (1978). Since it was formally proposed by Alwan and Roberts (1988), applying con-

trol charts to residuals for change detection in autocorrelated processes has been a popular

practice in the quality control area. In some special cases, some properties of residuals have

appeared or been studied in the literature. Harris and Ross (1991) studied the properties of

residuals of an integrated moving average process of order (1,l) (IMA(1,l) process) and an

autoregressive process of order 1 (AR(1) process) subject to step shifts in the process mean

levels. Wardell, Moskowitz and Plante (1992) studied how residuals of an autoregessive mov-

ing average process of order (1,l) respond to a step shift in process mean level by assuming

the process is deterministic. Wardell, Moskowitz and Plante (1994) derived the forms of resid-

uals of an autoregressive moving average process of order (p, q) process subject to a change

in process mean level by using Ztransform. Basseville and Nikiforov (1993) studied some

properties of residuals, and utilized the convenience of residuals in designing change detection

procedures when the residuals were mu tudy independent.

Although residuals have been used for change detection in practice and some properties of

residuals have been around in the literature, the usage of residuals has been ad hoc, properties

of residuals have not been systematically investigated and especidy, the sufficiency of resid-

uals for change detection has been overlooked. In this chapter, we are going to systematically

investigate the properties of residuals for change detection, show the sufficiency of residuals for

change detection, develop a general procedure for derivation of forms of residuals of autocor-

related processes, and show the independence of feedback control of residuals. The sufficiency

of residuals will provide us a theoretical background for the use of residuals, knowledge of the

properties of residuals will enable us to design better change detection procedures and monitor

residuals more effectively, and the independence of residuals of feedback control will turn out

to be a convenient bridge for the combination of process control and change detection.

For the stochastic process {K) defined in 3 1.3, let

for t = 1,2, - -. We cal1 g(y t - l ) the forecast function. Obviously, g(yt- ' ) is independent of

the unknown value of r. The process residual at t = 1,2, - is defined as

{el, ez, - -) forms a stochastic process. We will also use et denote its realization. That is,

Let ei = (el, - - , et) . For convenience, we also let et denote e:.

2.2 Sufficiency of Residuals for Change Detection

We first study the sufficiency of process residuals for change detection. We have noted that

the initial condition is always given. For t = 1,2,--,

define a one-to-one transformation G from the observations y: to the residuals e:, that is,

e l = G(y:). e: is a statistic. For any given ei, there is one and only one yf = G-'(e:). That

is, the conditional distribution of y: given e: is

1, if yf = G-'(e:),

0, otherwise,

which is independent of the unknown parameter r. Therefore, by the definition of sufficiency,

at any time t 2 1, e: = (el , ---, et) is a sufficient statistic on {Fe(,), r E O), or equivaleotly

a sufficient statistic on the unknown parameter r. That is, al1 the information about the

unknown parameter r contained in y: is also fully preserved in e:. Hence, the residuals are

sufficient for the change detection problem.

One can design change detection procedures based on the observations directly. However,

due to the one-to-one transformation G-', any change detection procedure based on the

observations con be transformed to a change detection procedure based on the residuals, and

vice versa. And a detection procedure based on the residuds will have exactly the same

detection propert ies as its counterpart based on the observations.

Because residuals are suficient, and equivalent to observations for a change detection prob-

13

lem, when residuals have nicer properties than observations, we can concentrate on residuals

in designing and searching for effective change detection procedures. As we will see, residuals

can have much nicer properties, such as mutual independence, than observations, which will

provide us great convenience in designing and analyzing change detection procedures.

We note that the sufficiency of residuals always holds for the change detection problem,

no matter what the process mode1 is, if a change occurs, or how the change occurs. More on

the sufficiency of residuals for change detection will be discussed in 5 2.3 and 5 2.5.2.

2.3 Second-Order Properties of Residuals

We now study the properties of residuals further. Because of the definition of residuals ( 2. l),

&=,(et lyt-')

= &=,(Y, - g(yt-l) 1 y'-')

= Er=,(Y,Iyt-') - g(yt-')

= o. Then, by using total probability,

for any t 3 1. Therefore, when there is no change, the residuals are mutually uncorrelated,

and their means are zero. For a Gaussian process, the mutual uncorrelatedness implies that

the residuals are mutually independent when there is no change.

One may now be interested in looking at the distributions of residuds. We consider this in

the case of continuous variables. Let f*(,) be the probability density function corresponding

to FQ(,). That is, Kt has a joint density function f*,,)(yf). The Jacobian of G,

Therefore, the joint density function of e: = G(yf) is

f;<r>(e:) = f*(r,(G-l(4))7

and

That is, the likelihood of the residuals e: and the likelihood of the corresponding observations

y: are equal. This from another side confirms the sufficiency of residuals for change detection.

If the residuals are mutually independent, then

f d y : IyO)

= I L I f ~ ( r ) (yi Igi- l )

= II:=, f$crl(ei>v

and therefore, a LR testing procedure based on residuals will have much simpler form than

its counterpart based on observations.

2.4 Procedure for Deriving Forms of

In order to analyze the properties of the residuals { e t ) in detail,

they behave when a change @ ( r ) occurs to the process {K), we

Residuals

and especially to leam how

need to derive the forms of

{et). Two things must be kept in mind when we are deriving the forms. The f is t thing is

t hat the forecast function g(y t - ' ) used to compute the residuals is always defined under the

assumption that (x) is in control, that is, r = m, because we don't know the occurrence of

the change until it's detected, and before the detection, we make forecasts by assuming that

there is no change. The second thing is that {K) is subject to the change ck(r).

The forms of residuals may have to be specifically derived for a particular type of process

{ Y I ) . However, t here is a general procedure to follow, which is the following.

(i) Derive the forecast function g ( y t - ' ) , that is, the one step ahead forecast by assuming

r = 00.

(ii) Find the conditional distribution of the process output for given y'-' by taking into

account the effect of Q (r).

(iii) Find the relationship between the conditional expectation of Y, in (ii) and g(yt- ' ) in (i).

(iv) Subtract g(y t - ' ) from k; having the conditional distribution in (ii) by wing the rela-

t ionship of (iii). Then, the forms of residuals et = x - g(yt-' ) are ready.

2.5 Properties of Residuals of ARIMA Processes

Now, we can derive the forms of residuals of an ARIMA process.

Before going on, we need to distinguish between two kinds of changes in an autocorrelated

process, namely, additive changes and nonadditive changes. An additive change occurs in

the process mean level, and has an additive effect on the mean values of process observations.

Nonadditive changes are mode1 parameter changes, including changes in spectral characteristic

or system dynamics.

2.5.1 Processes Subject to Additive Changes

An ARIMA process of order (p, 1, q) ( A RZMA(p, 1, q) process) under feedback control subject

to a n additive change can be represented as

where by using time series notations as those in Box, Jenkins and Reinsel (1994), B is the

backward shift operator applied to the time index t, a ( B ) = 1 + a i B + = a + a,BP is the

stationary autoregressive operator, P(B) = 1 + blB + *O- - + 4 Bq is the invertible moving

average operator. 1 is a nomegative integer. The process is stationary when 1 = 0, and

nonstationary when I 2 1. { e t ) is the innovation sequence that is a sequence of white noise

input with mean zero and variance O:. u+i is the net effect of the feedback control that is

determined by y'-', and ut-1 z O represents that there is no control applied. r is the unknown

change point. A(-) is a detenninistic function, A(i) O for i < O and A(i) may be nonzero

for i 2 0. ( 2.2) can also be equivalently represented as

where

We cal1 A ( t -r) an observational additive change, and d(t -r) an innovational additive change.

In r e d world, these two types of additive changes may corne from different backgrounds.

However , t hey are mat hemat ically equivalent .

For the AR1 M A ( p , 1, q) process subject to the additive change A(t - r), we can derive the

foms of residuals by following the generd procedure in 5 2.4.

( 9

(ii) Given

(iii) The relationship between g( and E(G 1 y'-' ) is obvious.

(iv)

for t < r, et = (2.3)

- 1.1 + et, a(B)

for t > r.

Two resdts can be observed for the ARIMA process subject to the additive change. The

first is that the residuals { e t ) are the inverse filtering of the observations {k;) by removing

the term ut,

et = 4 B ) P - N B )

- ut-1)-

The second is that the residuals form an i.i.d. process subject to an additive change,

where 6(t - r) is the inverse filtering of A(t - r ) and 6(i) = O for i < O. Therefore, detection

of an additive change in an ARIMA process is equivalent to detection of an additive change

in an i.i.d. process.

2.5.2 Processes Subject to Nonadditive Changes

For an A R M A process, a mode1 parameter change from (a, ,û, 1, oc) to (a', P', l', oz) at change

point r , with the process mean value unchanged, is a nonadditive change to the ARIMA

process. Under such a change, we have the following development for the residuals.

(il

a ( B ) ( l - B)' a ( B ) ( l - B)' g(yt-l) = [l -

P ( W Iyt + P(B)

U t - 1

(ii) Given yt-',

a ( ~ ) ( i - ~ ) l I N + @ ( B I utml + et, for t < r, Y t =

where = et for t < r, and {ci, t 2 r ) is a sequence of white noise input with mean zero

and standard deviation O:.

(iii) The relationship between g(yt-l) and E ( x 1 yt-l) is obvious as well.

(iv)

Therefore, before the change, the residuals are an i.i.d. process. After the change, the residuals

either exhibit autocorrelation or have a changed variance. The possible autocorrelation arnong

residuals will not provide us the convenience for constructing change detection procedures as

the mutual independence does in the situation of additive changes. However, being autocor-

related or not does not afKect the sufficiency, and equivalence to observations, of residuals,

because the sufficiency holds for change detection problems in general.

Detecting an additive change in a normal i.i.d. process was defined as a basic problem in

Basseville and Nikiforov (1993). They established the sufficiency of likelihood ratios for change

detection. For change detection in AMMA processes subject to innovational additive changes,

they recognized the sufficiency of residuals because the likelihood ratios were functions of the

residuals. However, they did not recognize the prevalent sufficiency of residuals for change

detection in general. For detection of a nonadditive change in an ARIMA process, they

concluded wrongly that residuals were no longer sufficient for change detection just because

they could no longer be mutually independent.

One probably could be puzzled by the sufficiency of residuals for detection of an additive

change in a nonstationary ARIMA process, for example, an observational additive step change

in an I M A ( 1 , I ) process. Because of the differencing operation (1 - B)', 1 2 1 in ( 2.3) for

nonstationary A R M A processes, the residual mean will go to zero exponentially fast after

an obser~tional additive step change has occurred to the ARIMA process. This means that

the effect of the observational step change on the residuals will vanish. So, how could the

residuals be sufficient? The puzzle will not exist if one can realize that for a nonstationary

ARIMA process, the process observations wander around anywhere because of the integrating

operation (1 - B)-' in ( 2.2). Therefore, the effect of the observational step change on the

process observations themselves will be buried into the wandering, that is, will vanish as

well. The puzzle cas also be resolved by the fact that any change detection procedure based

on observations (or feedback control actions) c a n be transformed to a procedure based on

residuds with the sarne detection properties preserved, and vice versa.

2.6 Properties of Residuals of State Space Models

2.6.1 Residuals of a General Mode1

Consider a general state space model subject to an additive change A(t-r) in its measurement

equation, which is described as follows. For t = 1,2, --,

= HtOt + at + A(t - r),

Qt = @te,-i + ut-, + ut,

where at time t, et is the state of the dynamic system, Ot is the transition matrix, Ht is the

state observation transformation matrix, I; is the measurement of the transformed et, et is the

measurement error, ut is the noise input, ut-' is the amount of feedback control determined

by { e t ) and {ut) are independent normal white noise processes, et -* N(0, C,) and

ut .- N(0, C , ) . r 2 1 is the change point, A(*) is a deterministic function and A(i) = O for

i < O. O. - N(&, b) is initialized. Although it is not necessary to explicitly specify their

Following the general procedure in 3 2.4 and using Kalman filtering, we can derive the

forms of residuals of the model ( 2.4).

(9

where

and

(ii)

Y J y t - l - N( Ht [ @ t ~ ( O t - l ]y t - ' ) + ut-l] + A(t - r ) ,

H ~ [ O , C O V ( B ~ - ~ I Y ~ - ~ ) @ : + %]H: + cc).

(iii)

By mathematical induction, for t 2 r ,

where n::, r 1.

(iv) For t < r ,

For t 2 r ,

where C& 0.

By an approach similar to that in 5 2.3, we can prove that {et , t = 1,2,-} are mutudy

uncorrelated. Therefore, {et, t = 1,2, - - a) are m u t u d y independent because of the Gaussian

property.

Similady, we can derive the forms of residuals of a state space model subject to an additive

change in its state equation. We will illustrate this with the machining process in the next

section.

2.6.2 Residuals of the Machining Process

We have noted that the model of the machining process is a special case of state space models.

We now consider the machining process subject to an additive change A(t - r) in the state

equation, which is represented as follows. For t = 1 ,2 , - - O ,

yt = @t + et,

[Ot - A(t - r ) ] = [et-~ - A(t - 1 - r ) ] + ut-1 + c + ut,

or equivalent Iy,

Yt = Ot + et, (2.6)

et = et-1 + ut-i + c + ut + [A(t - P) - A(t - 1 - 7 - ) ] ,

where r 2 1 is a change point, A(*) is a deterministic function and A(i) i O for i < 0.

The change A(t - r ) represents an additive change to at in ( 2.5). Looking at ( S.6), we see

the differencing of A(t - r ) with respect to t, A(t - r ) -A(t - 1 - r ) , can also represent additive

changes in the drift rate c and in the mean level of the stochastic shocks {vt). Therefore, we

use A(t - r ) to represent the total effect of additive changes to &, which may result from

changes in tool conditions, met al bar quality and environmental conditions and others.

The function A(-) can represent various types of additive changes.

A(;) ii A for i > O

represents a step shift.

A(i) = (i + 1)c'for i 2 O

represent a drift rate change from c to c + d per part or an external ramping shift.

A(0) = L

A(i ) = O for i # O

represents a impulse shift.

We have mentioned that O. - N(&, qi) is initialized. For t = 1,2, -, let

By Kalman filtering,

Let

and 2

wt = qi-i + 02 q:-i + 0; + a,"

Then, following the general procedure in 5 2.4 and using Kalman fdtering, we can derive the

forms of residuals of the process represented by ( 2.6).

( ii)

(iii)

For t < r ,

E(@tlvt) = Er=,(O,lyt).

By mathematical induction, for t 2 r,

( iv) For t < r ,

For t 2 r ,

where n;=, 1.

Because of the one-to-one transformation G from y: to e:, by an approach sirnilar to that

in 5 2.3, we can prove that {et , t = l ,2 , -) is a sequence of mutudy uncorrelated random

variables. By the Gaussian property, (e t , t = 1,2, *) is a sequence of mutudy independent

normal random variables.

Therefore, for the machining process, its residuds form a sequence of mutually independent

normal random variables with wiances independence of the change. The mean values of

residuds are constant zero before the change point r and start to change after r .

Since sf is independent of the change A(t - r), the residuals { e t ) depend on A(t - r ) only

t hrough t heir mean values

t-r i

We now examine how E(et) responds to some specific types of additive changes. For a step

shift of size A, A(i) = A for i 2 0,

for t < r,

( An:Z(l - ~ i - ~ ) , for t 3 r.

For a drift rate change, A(i) = ( i + l ) d for i 2 0,

for t < r,

- W ) for t 2 r.

2.6.3 Residuals of the Steady-State Machining Process

One can see from ( 2.9) and ( 2.8) that the filtering rate wt and the residual variance s: depend

on t only through t$ defined by ( 2.7), which in turn is fully determined by q& the variance

of the initial process setting eO. For any initial value q& i4:) converges to

which is called the steady state variance. When q: is stabilized at q2, we Say that the process

is in steady state.

It is interesting to investigate the effect of qg on the convergence of ($1. It is because of

( 2.7) that ($1 converges very fast to q2 for any value of q& Numerical results have shown

the convergence of {q:) for the machining process with the parameters given by ( 1.2) in two

extreme cases: (i) qi = O, that is, the initial setting of the process rnean is error-free and (ii)

q; is large, say 102. In both cases, (q:} stabilizes at q2 after two iterations. Furthemore, if

q,2 = g2, then q: = q2 for al1 t. Therefore, we assume that the process is in steady state by

assuming g,2 = q2.

When the process is in steady state,

and the filtering rate

where rsw = oE/o: is the signal-tenoise ratio. It is easy to see that w(*) is a increasing

function of rsN. That is, the higher the signal-to-noise ratio, the larger the filtering rate.

From ( 2.10), for t 2 r ,

t-r

E(eJ = C(1 - w ) ' [ ~ ( t - i - r ) - A(t - i - 1 - r ) ] . i=O

When the change is a step shift of size A,

That is, the residual mean jumps to h at r and then decreases exponentially back to zero.

When the change is a drift rate change from c to (c + d) per part,

That is, the residual mean will converge exponentidy to a non-zero level of d/w.

For the machining process with parameters in ( 1.2),

q2 = 2.i'8g2, s2 = 5.608~ and w = 0.5528.

In sumrnary, for the machining process in steady state, its residuals form a sequence of

mutually independent normal random variables with a constant variance s2. Let

for k < 0,

~ , " = ~ ( i - w)'[A(k - i) - A(k - i - l)], for k 2 O.

Then the residuals form an i.i.d. normal process subject to an additive change 6(t - r).

2.7 Independence of Residuals of Feedback Control

An objective of qudity control is to reduce process variability and improve process capability.

For autocorrelated processes, by applying automatic process control, we can reduce systematic

variation. By applying change detection, we can possibly detect and remove special causes in

order to improve the processes. As we have mentioned in 5 1.1, automatic process control and

change detection (process monitoring) are complementary rather than cornpetitive tools, and

their combination for better quality control has been advocated in the literature.

Examining the results and developments so far, we see that the forms of residuals of the

ARIMA processes and state space models subject to additive changes are independent of the

applied feedback control. This may at first appear surprising. However, the reason behind

this is simple. That is, ut-1 is fully determined by Y'-', and the conditioning operation in

computation of the residuals cancels out the effect of ut-1 on et. Actually, the form of residuals

of a process is independent of any deterministic component of the process, for example, the

deterministic drift rate c in the machining process.

However, we must note that the forms of residuals of a process may depend on the applied

feedback control when the process is subject to a nonadditive change. For example, when the

autoregressive or moving average operator of an ARIMA process changes, feedback control

will impose some effect on the means of residuals.

The independence of residuals of feedback control under situations of additive changes

serves as a bridge for the combination of automatic process control and change detection in

quality control. Under different optimization criteria and control situations, different feedback

control policies can be obtained. When adjustment cost is minor, simple minimum variance

control can be applied. When adjustment is disruptive or time consuming, adjustment cost

has to be taken into account. Under some situations, such as that described in Vander Wiel

(1996 b), tool Wear can be automaticdy corrected by the lathe with negligible cost while in

others it cannot. However, no matter what specific form of feedback control is used, residuals

behave the same in response to additive changes. Therefore, an independent change detection

procedure based on residuals can be developed and used regardless of the specific form of

feedback control applied.

One may wonder why the feedback control cannot rernove the eflect of additive changes

in the residuals. The reason is that the effect of al1 feedback control is incorporated into the

forecast and the forecast g(yt-') is done under the assumption that the process is in-control.

Consider an example. Suppose a step shift of size A occurs at r and an additional amount of

(-A) of control is applied to the process at (r - 1). Then the effect of the step shift on process

output can be compensated because 8, = 0,4 + (u,-l - A) + C+ ur +A = El,-* + u,-1 +c+ v,.

Hence, Y, lyr-l - N(E(O,- l l y r - ' ) + u,-l + c, s:). However, at ( r - l), @, and Y , is forecast as

E ( @ , - ~ lyr-') + (u,-1 -A) +c. Therefore, e, = Y , - [E(8 , -1 lyr-') + (u,4 -A) +cl - N ( A , s:).

That is, t h e feedback can not remove the effect of the change in the residuals. This from

another side confirms that the properties of residuals of a process subject to an additive

change are independent of the applied feedback control.

2.8 Surnmary and the Mode1 Tkansformation

Residuals are always sufficient and equivalent to process observations for change detection.

Any change detection procedure based on observations can be transformed to a change de-

tection procedure based on residuals with the same detection properties such as optimality

preserved, and vice versa. When there is no change, the residuais axe mutually uncorrelated

with zero rneans, and furthermore mutudly independent if the process is Gaussian. For an

ARIMA process or a steady-state space model subject to an additive change, the residuals

are mutuaily independent. Actually, the residual process is an i.i.d. process subject to an

additive change that is the inverse filtering of the original additive change.

Because of the properties of residuals, detection of an additive change in an ARIMA

process or a steady-state state space model, possibly under feedback control, is transformed

to detection of an additive change in an i.i.d. process with detection properties preserved.

This model transformation will offer us much convenience for designing and analyzing change

detection procedures based on residuals.

In the rest of this work, we will focus on the residual process

which is an i.i.d. process with zero rneans and variance 02 subject to an additive change

6(t - P), where r = 1,2, , oo is the change point- E(e t ) = d ( t - r ) . b(i) = O for i < 0,

6 ( i ) for i 2 O is the mean of residual afler change. The formulation of the chônge detection

problem is now as foliows. At t = 1,2, - O , test the hypotheses

based on ei. The correspondhg stopping tirne N is now with respect to {Ft, t = l,2, - O),

where Ft is the minimum O-algebra induced by {ei, i = 1,2, , t ).

Chapter 3

Performance of Control Charts

Applied to Residuals

Introduction

From now on, we will study change detection procedures. In this chapter, we study the

CUSUM, EWMA and Shewhart control chart procedures applied to the residud process

{ e l , t = 1,2, ---) defined in 5 2.8.

Although applying control charts to residuals has been a popular practice for change

detection, especially, in the quality control area, a systematic investigation on the performance

of the control chart procedures applied to residuals of autocorrelated processes has not been

available in the literature. The lack of research in this area was due to two reasons.

Reason one, lack of a systematic investigation on the properties of residuals in the

previous literature.

r Reason two, the tirne-varying feature of the mean of residual after change, which made

the computation of performance measures not so easy as in the i.i.d. case.

By using a discrete-state Markov chah approach, Vander Wiel (1996 a) studied the perfor-

mance of two-sided CUSUM, EWMA and Shewhart control charts applied to the residuals

in a special case, that is, for an IllfA(1,1) process subject to an observational additive step

shift.

In this chapter, for a general pattern of the mean of residual after change, that is, a

geoeral 49, we will derive an explicit formula for computation of the ARLs of the Shewhart

control chart procedure, establish integral equations for the CUSUM and EWMA control

chart procedures, and develop numerical procedures of Gaussian quadrature for solving the

integral equations. Because of the timevarying feature of the mean of residual after change,

the integral equations and the numerical procedures based on them have not been done by

other authors. Under the ARL criterion, we will then study the performance of the control

chart procedures applied to the residuals under three typical situations, where the mean of

residual after change is decreasing to zero, decreasing at fist and then stabilized at a nonzero

level, and gradually increasing and stabilized at a nonzero level. By the ARL criterion, we

mean that for a change detection procedure N, for a fixed value of its in-control ARL that is

the smaller its out-of-control ARL that is

the faster the procedure N for detection of the change 4).

For computation of the ARLs, both the integral equation approach and the Markov chain

approach can be used. We are using the integral equation approach here, because it is more

efficient and can offer us more accurate and precise solution than the Markov chah approach.

Before we examine the advantage of the integal equation approach, we look at the difierence

between the two approaches, which exists in the following aspects.

0 First, in the case of a continuous state variable, by using the integral equation approach,

we establish an integral equation of ARL function on the base of the continuous state

variable directly. By using the Markov chain approach, however, one has to discretize

the continuous state variable and then establish transition probability matrices on the

base of the discret izat ion intervals.

0 Second, when solving the ARL values, by using the Markov chah approach, one a p

proximates the ARL values at each discretization interval by a constant (zer~degree

polynomial). That is, a continuous ARL function is approximated by a discontinuous

37

step function. By using the integral equation approach, however, we fit the integrand

(the continuous ARL function) on each quadrature interval by a polynomial of higher

degree when solving the i n t e g d equation by using quadrature methods. That is, the

continuous ARL function is fitted by a smooth spline curve.

a Third, by using Gaussian quadrature to solve an integral equation, we can choose the

location of the quadrature abscissas in an optimal way to achieve the closest fitness

of t h e ARL function by a spline curve. However, there is no such an optimal way for

discretization in the Markov chain approach.

For an N-point quadrature rule, basically we solve an N-dimensional system of linear

equations in the integral equation approach. In an N-discretization interval Markov chain a p

proach, one needs to carry out N-dimensional rnatrix operations. Therefore, when the number

of quadrature points is about the same as the number of discretization intervals, the computa-

tional complexities of the two approaches are about the same, the integral equation approach,

however, can offer much more accurate and precise solutions than the Markov chain approach.

Usually, only a small number of quadrature points are needed to achieve desired computa-

tional accuracy and precision. However, the same degree of accuracy and precision may not be

achieved by a large number of discretization intervals in the Morkov chain approach. Merely

increasing the number of discretization intervals cannot increase the accuracy and precision

of final solutions, because in the Markov chain approach, (i) the continuous ARL function

is always approximated by a discontinuous step function, which is a vital shortcoming; (ii)

increasing the number of discretization i n t e r d s will increase the computational complexity

and intermediate computational errors dramatically. More discussions on quadrature methods

and solution of integral equations can be found in Press, Teukolsky, Vetterling and Flannery

38

(1992).

When both the integral equation approach and the Markov chah appoach can be used,

the former is superior to the later in terms of computational efficiency, accuracy and precision.

However, in some cases (for example, when the state miab le is more than two dimensional),

it may be hard to solve the integral equations efficiently, while the discrete-state Markov chain

approximation can be still applied. The less efficiency and accuracy makes the Markov chain

approach more flexible and versatile. Lucas and Crosier (1982) computed the ARLs of CUSUM

charts in the i.i.d. case by using the two approaches. They discussed and their numerical

results demonstrated the relative advantages and disadvantages of the two approaches.

3.2 Integral Equations for Computation of the ARLs

Under the ARL cri terion, for a change detect ion procedure N, since we are only interested in

Er,, (N) and Er=* (N), we can simply assume that

and concentrate on computation of KZI(N) . By substituting constant O for 6(t - 1), t =

1,2, - -, Er,,(N) is just a special case of (N).

Two other assurnptions made in this chapter are

a et has a normal distribution;

6(t - 1) converges as t increases, and the limiting value is denoted by &O).

Let fN( I1 ,O1) be the normal density function with mean p and variance 02, and O(-) be the

standard normal distri bution function.

3.2.1 Shewhart Control Chart Procedure

The Shewhart control chart procedure with control limit C applied to {el, ez, 0 ) is a stopping

time,

Ns = inf(t 3 1 : (etl 2 Co).

For i 1 O, define ARLs(i) the ARL of Ns applied to {e i+l , ei+l, 0 ) . Then,

B y condit ioning,

which leads to the following formula.

3.2.2 CUSUM Control Chart Procedure

Let

&(t) = max{O, (et - k) + S H ( ~ - l)],

&(t ) = max{O, (-et - k) + &(t - 1)).

&(O) = O and &(O) = O are initidized.' Then, stopping times

and

N; = inf{t 2 1 : SL(t) 2 h )

are the higher-side and lower-side CUSUM control chart procedures applied to (el, el , - O ) ,

respectively. k > O is the CUSUM reference value, and h > O is the CUSUM threshold value.

We note that

{SH(t), t = 1>2, - = }

and

(S&(t),t = i , 2 , - a * }

form two non-homogeneous cont inous-state Markov chains.

For i 2 O, we define ARL$(s, i ) and ARL& i) the ARLs of the upper-side and lower-

side CUSUM control chart procedure applied to {e;+l, e;+?, 9 ) given that SH( i ) = s and

SL(i) = s, respectively. Then, for &(O) = O and &(O) = 0,

By condit ioning and the Markov property,

Using 4.) instead of 6(9 in ( 3.1), we obtain the equation for ARL$(s, i).

Since d( t - 1) + @O) when t + 00, from ( 3.1),

and

which is a Fredholm equation of the second kind that can be solved numerically by using

standard procedures such as Gaussian quadrature.

For an N-point Gaussian quadrat use appiied to [O, hl, let qi, q2, - - , q~ be the abscissas

and wl , w2, - * O , WN the corresponding weights. Then, adapting the idea of the interpolatory

formula observed by Nystrom, which is described in Press, Teukolsky, Vetterling and Flannery

(1992), we can accurately approximate ( 3.1) by

Usually, 6 ( t ) converges exponentially fast. For a sufficiently large T, Say 100, we can

initialize ARLZ(S, T) to be A R L ~ ( S , oo) and then compute A R L ~ ( S , i) recursively from i =

T - 1 to i = O by using ( 3.3). For the initialization and recursions, only the function values

at s = 0, qi , qz, , q~ need to be evaluated.

In summary, to compute ARLg(0, O), we carry out the following numerical procedure.

a Solve the integral equation ( 3.2) for ARLE(., a).

a For a sufficiently large integer T 2 O, initialize ARLg(0, T) to be A R L ~ ( . , w).

O For i = T - 1, T - 2, , O , recursively obtain ARL& i) from A R L & ~ + 1) by using

the interpolatory formula ( 3.3).

a Obtain Er,i ( N g ) = A R L ~ ( o , O), the final value of our interest.

3.2.3 EWMA Control Chart Procedure

For a chosen coefficient X E (0,1], let

is initidized, and

is assumed. For C > 0, the stopping time

is the EWMA control chart procedure applied to (el, e2, 0 ) . X is the EWMA coefficient, and

C is the EWMA control limit.

is a non-homogeneous continous-state Markov chain.

For i 3 O, define ARLE(r, i) the ARL of applied to {ei+,, ei+*, * ) given that Zi = z.

Then, for Zo = 0,

Er=1 ( N E ) = ARLE(O, O).

By conditioning and the Markov property,

ARLE (r, i)

Then, a similar numerical procedure as in § 3.2.2 can be developed to obtain

3.3 Numerical Analysis

In this section, we will numerically investigate the performance of the CUSUM, EWMA and

Shewhart control chart procedures applied to the residuals. O bviously, for different types

of & ( O ) , the mean of residual after change, a change detection procedure may show different

performance. Therefore, performance of the control chart procedures should be investigated

for various types of &). To compare the performance of different control chart procedures

under the ARL criterion, their in-control ARLs will be fixed at the same value.

Both the CUSUM and EWMA control chart procedures have two design pararneters.

CUSUM has its reference value k and threshold value h. EWMA has its coefficient X and

control limit C. When their in-control ARLs are fixed, the degree of freedom of choice of their

design parameters will be reduced from 2 to 1. That is, once one of the two parameters is

chosen, the other will be determined.

For a fixed in-control ARL value, at a particular size of change, we minirnize the ARLs

of the CUSUM and EWMA control chart procedures. This can be done by optimizing one

of their design parameters. We choose to optimize the reference value k for CUSUM and the

coefficient X for EWMA.

In the rest of this section, we will present numerical results in three typicd cases of if(-).

In each case, we will first present data in a table, where the CUSUM reference values k and

EWMA coefficients A optimal for diflerent sizes of change are given, together with the ARL

values for each set of optirnal design pararneters. The optimum ARLs of CUSUM and EWMA

at each considered size of change are supencripted with * in the tables. We will then plot the

optimum ARLs versus sizes of change, and the ARL curves for chosen design parameters. In

this section, we will assume that

0 the standard deviation of residuai is a = 1.

3.3.1 Results on the Machining Process Subject to a Step Shift

Consider the steady-state machining process subject to an additive step shift of size A in its

state equation, that is, A(i) = A for i 2 O. Then the mean of residual ofter change is

for i 2 0, where w = 0.553.

Table 3.1: Optimum ARLs and parameters of the CUSUM, EWMA and Shewhart control

chart procedures applied to the residuds of the machining process with w = 0.553 subject to

an additive step shift of size A in its state equation when their in-control ARLs are 100.

r

L

From Table 3.1 and Figure 3.1, we have the foilowing observations on the performance of

the CUSUM, EWMA and Shewhart control chart procedures when the meôn of residud after

change is ( 3.4).

1. Roughly, EWMA performs the best in this case.

46

2

53.9'

35.0"

35.4

40.1

69.6

A O

100

100

100

100

100

CUSUM

EWMA

2.5

33.5'

14.6'

14.9

18.3

50.1

k = 0.9

X = 0.35

X = 0.45

X=0.70 -

Shewhart

1.5

73.5'

62.0'

62.2

66.1

84.9

3

17.2"

4.9'

5.0

6.4

30.7

0.5

96.0'

96.6'

96.6

97.0

98.8

1

87.8'

84.5'

84.6

86.4

94.3

3.5

7.4'

2.0

1.9'

2.2

15.8

4

3.0'

1.3

1.2.

1.2

6.9

4.5

1.5'

1.1

1.1

O *

2.9

5

1.1*

1.0

1.0

1.0"

1.5 b.

2. Only when the ~ i z e of change is very small, CUSUM performs better than EWMA.

3. Shewhart performs the worst for any considered size of change.

A: Size of change

I I I 1 1

I I 1 I 1

CUSUM: k d . 9 '.

\

80

3 5 60- E 3

-E C, 40 0"

20

O

- O 1 2 3 4 5 6 7

k Size of change

. . ' . ' .X ".

t ' S . 'Q. - . . . - -. X: CUSUM - - . ". . . . - O. . .

;t '. . +: EWMA -

- O: Shewhart - +, . ..,

- - +. , ' x * .. =o..

I 1 ;r :*...m,,,,,a-,-,-a ---- c& -*--

Figure 3.1: Optimum ARLs versus sizes of change (upper panel) and the ARL curves with

the chat parameters optimizing the ARLs at A = 2 (lower panel) of the CUSUM, EWMA

O 1 7

2 - -

3 -

4 5 6 7

and Shewhart control chart procedures applied to the residuals of the machining process with

w = 0.553 subject to an additive step shift of size A in its state equation, when the fixed

in-control ARL is 100.

3.3.2 Results on the Machining Process Subject to a Drift Rate

Change

Consider the steady-state machining process subject to a drift rate change of c' in its state

equation. Then the mean of residual after change is

for i 2 0, where w = 0.553.


chart procedures applied to the residuals of the machining process with w = 0.553 subject to

a drift rate change of d in its state equation when their in-control ARLs are 300.

From Table 3.2 and Figure 3.2, Ive have the follorving observations on the performance of

the CUSUM, EWMA and Shewhart control chart procedures when the mean of residud after

49

change is ( 3.5).

1. Optimized CUS UM control chart procedures produce the smallest optimum ARL values.

2. The CUSUM control chart procedure optimizing the ARL at a considered size of change

produces a lower ARL curve than the EWMA optimizing the ARL at the same size of

change, and also a lower curve than Shewhart.

3. Shewhôrt performs almost as well as CUSUM for large size of changes.

4. Shewhart performs better than EWMA for medium and large size of changes.

5. EWMA does not perform as well as CUSUM.

6. EWMA performs better than Shewhart for small size of changes.

O: Shewhart

+: EWMA

CUSUM: k=1,4

EWMA: L0.57

Shewhart

6: Arnount of drift rate change


the chart parameters optimizing the ARLs at c' = 2 (lower panel) of the CUSUM, EWMA

and Shewhart control chart procedures applied to the residuals of the machining process with

w = 0.553 subject to a drift rate change of d in its state equation, when the fixed in-control

ARL is 300.

3.3.3 Results on the Cutting Process

Consider the AR(2) cutting process with parameters in ( 1.4) subject to an observational

additive step shift of size A, that is, A(i) E A for i 2 0. Then the mean of residual after

change is

a(i) = (1 - 0.458 - O . ~ B ~ ) A ( ~ )

A, for i=O,

0.55A for i = 1,

0.35A for i > 1.

From Table 3.3 and Figure 3.3, we have the following observations on the performance of

the CUSUM, EWMA and Shewhart control chart procedures when the mean of residual after

change is ( 3.6).

1. CUSUM and EWMA perform better than Shewhart, especially for srnail and medium

size of changes.

2. CUSUM performs better than EWMA for smaller size of changes.

3. EWMA performs better than CUSUM for larger size of changes.

chart procedures applied to the residuals of the AR(2) cutting process subject to an observa-

- .. A O 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5

k = O . 1 100 39.5' 20.9 13.0 8.9 6.5 4.9 3.8 3.1 2.6 2.2

CUSUM

u

tional additive step shift of size A when their in-control ARLs are 100.

I

-

-

b

EWMA -

-

-

S hewhart 40.9 23.3 12.0 5.7


'O.

x: CUSUM

+: EWMA

O: Shewhart

. *%,, ' 2 .' ,.

1 ' . * . . - 4. : 2 , , ,a,, , , ,a,, , , ,a,, , , -a,, . , , m -, , - , a.. , , . - - - - -

3 4 5 6 7 A: Size of change

- CUSUM: k=OS

- - . EWMA: k0.2

. - . - Shewhatt

1 2 3 4 5 A: Size of change


the chart parameters optirnizing the ARLs at A = 2 (lower panel) of the CUSUM, EWMA

and Shewhart control chart procedures applied to the residuds of the AR(2) cutting process

subject to an observational additive step shift of size A, when the fixed in-control ARL is 100.

3.3.4 Summary of the Performance of Control Charts

Under the ARL Criterion, we have studied the performance of the CUSUM, EWMA and

Shewhart control chart procedures under three different situations, where the mean of residud

after change represented by 6(i) for i 2 0 decreases to O rapidly ( 3.4), decreases at first and

then stabilizes at a nonzero level ( 3.6), and gradudly increases to a constant level ( 3.5).

From the analyses, we have the following results.

1. When 6(i) for i 2 O decreases to O rapidly, EWMA roughly performs the best among

the three control chart procedures.

2. Otherwise, CUSUM roughly performs the best.

3. Generally, Shewhart performs well for detection of very lôrge size of changes.

Chapter 4

Sequent ial Testing on Process Mean

Levels

4.1 Introduction

Consider a situation for the residud process { e t , t = 1,2,. -}. For a fixed time point ro 2 1,

either the additive change occurs at ro, or no change occurs a t d l . That is, the change point

r E {ro, oo). We want to discriminate r = r o from r = oci by sequentiaily observing el until a

decision based on the observations can be made. Therefore, we sequentially test Ho: r = m

versus Hi : r = ro. Before ro, et behaves the same under both Ho and H l , and starting at

ro, et behaves differently under Ho and Hi. Specifically, starting at ro, E(et) is time-varying

according to 6(9 under Hi, and E(et) is constant zero under Ho. Therefore, starting at ro,

we actually sequentially test

Ho: E(ei) = 0, for i = 1,2,-, versus

: E ( e i ) ( i - ) , fori=1,2,-,

based on the observations et. We cal1 these two hypotheses the mean level hypotheses. Since

{a, m) c {1,2, - - -, oc), the mean level hypot hesis testing is a special situation of the general

change detection problem formulated in 5 2.8 and originally in 5 1.3.

Anot her situation t hat leads to formulation of the mean level hypot hesis testing is when it

is suspected that an additive change has taken place before observation is started. If we want

to clarify this suspicious situation, we will have to test it. If the suspicion is true, then E(e t )

is tirne-varying according to 6(*) starting at the beginning of the observation. Therefore, we

will need to test the mean level hypotheses.

The problem of sequential testing on process mean levels also stands on its own. It has

already been a problem widely accepted and studied in the area of signal detection. Basically,

in the circumstances of signal detection, we use the mean level hypothesis test ing formulation

to test

57

absence or presence

of a known signal in a noise process. The problem has been studied by many authors. Liptser

and Shiryaev (1978) in their 517.6 studied sequential testing of two hypotheses that observa-

tions follow a noise process or the observations follow the noise process containing a known

signal additively. Cochlar and Vrana (1978) studied sequential testing of two hypotheses that

observations have a joint probability density function, Say, f (zîlO), or the observations have

a different joint probability density function, Say, f ( x y l l ) . They also particularly studied se-

quential testing on presence or absence of a known signal in a Gaussian noise process. Liu and

Blostein (1992) studied sequential testing of two hypotheses that independent observations

have a joint probability density function ni=, foi(~i), or they have a joint probability density

function nf=, f l i ( x i ) . They also discussed the applications of this sequential testing problem

in communications. Recently, Kailath and Poor (1998) has comprehensively reviewed the

related topics of t his sequent ial test ing problem.

Other applications of the mean level hypothesis testing problem may arise in system iden-

tification. Consider that the system status or process mean level is unknown to the observer,

but can only be in two possible states, Say, normal or abnomal. Regard the normal state as

an in-control state. Then, the means of residuals are al1 zero if the system or process is in the

normal state, and are time-varying following a pattern if in the abnormal state. Therefore,

the unknown system status or process mean level forms a set of mean level hypotheses. By

sequentially testing the mean level hypotheses, we can have the unknown state identified.

We note that the classical Wald's SPRT problem, where b( i ) for i 2 O is a constant, is a

special case of the mean level hypothesis testing problem. SPRT has been applied in different

areas. Since the time-varying feature is more prevalent than the time-invariant feature, the

meaa level hypothesis testing is more applicable.

Our objective here is to derive an optimal sequential test for the mean level hypotheses.

When d(i - 1) is a constant for i = 1,2, - - -, the optimal sequentid test is the SPRT proposed

by Wald (1945), and its optimality was proved by Wald and Wolfowitz (1948), and later by

Lehmann (1959), Chow and Robbins (1963) and others. For 6(9 taking a general time-varying

pattern, only a few works on the optimal sequential tests exist in the literature. Cochlar and

Vrana (1978) derived a Bayes optimum sequential test minimizing the expected sample size

over the a posteriori probability distribution of the two hypotheses for given type4 and type-

II errors. Liu and Blostein (1992) derived the same Bayes optimum sequential test, and aiso

obtained some properties of the optimal stopping boundaries. Liptser and Shiryaev (1978)

derived an optimal sequential test under a different criterion (see the (17.115) in their Theorem

17.8).

In this chapter, we will denve the optimal sequential test that minimizes the expected

sample size when the process is in the out-of-control state for given upper bounds of type4

and type-II errors. We will first formulate the optimal sequential testing problem as an optimal

stopping problem, then derive the optimal stopping rule, study some important properties of

the optimal stopping boundaries, and develop an algorithm for computation of the optimal

stopping boundaries, and present an example.

In this chapter, we assume that

O et has a normal distribution with standard deviation a = 1.

4.2 Formulation as an Optimal Stopping Problem

For n = 0,1,2, * * O , let 3, be the minimum 0-dgebra induced by {et, t = 1,2, ff , n), where

specially, {et , t = 1, .-- -, 0) = 0, 6 = (0, R). Let e l be a

S = (r, d) consists of a stopping time r 2 O with respect

nul1 vector. A sequentid test

to (Fn,n = 0,1,2,***) and a

terminal decision rule d that is a function

O, to accept Ho, d(e;) =

( 1, to accept H ~ ,

given r = n and e;. r = O represents no sarnpling a t dl . A fixed sample size test is a special

case of sequential tests, for which r r m, the fixed sample size. The type4 and type41 errors

of a sequential test S are

a = P(d = ilHo is true) and

p = P(d = OIHl is true),

respectively. When it is necessary, we use T(S), a(S) and P(S) denote the stopping time,

type-1 and type-II errors associated with a sequential test S, respectively.

Our objective here is to find an optimal sequential test minimizing

subject to the constraints a(S) 5 a0 and P(S) 5 Bo, where O < cro, Po < i are specified values.

That is, we are interested in finding an optimal sequential test, satisfying the prescribed upper

bounds of decision errors, which will stop sampling most quickly on average when Hl is true.

For convenience, let El ( 0 ) denote E(-1 Hi).

This objective leads us to consider the following auxiliary problem: for g, cl > O, hding

the optimal sequentid test S* attaining

inf S ~ { a f l sequential tests)

{QQ(S) + CIP(S) + E i ( r ( W -

q-, and cl cm be interpreted a s the loss incurred by fdsely rejecting Ho and HI, respectively.

The sufficiency of the auxiliary problem for Our objective ( 4.1) results from the fact that,

for any other sequentiai test S', if a ( S r ) 5 a(S') and @(Y) < - ,û(S'), then we must have

( ( ) ) ( ( S ) ) . Otherwise, it contradicts ( 4.2). Therefore, if we have a(S*) = cro

and P(S') = Bo by properly choosing q, and CI, then S' will be the optimal sequentid test

we want. Sirnilar auxiliary problems were first used by Wald and Wolfowitz (1948) to prove

the optimality of Wald's SPRT, and subsequently used and analyzed by some other authors

in optimality proofs.

For convenience, for t = 1,2, - - O , let

Let f o , t ( * ) and f 1 J - ) denote the density functions of e t under Ho and H l , respectively. Then

and

Let

and

which are the likelihood functions. For convenience, denote Lo(ey) = 1 and Li (e:) = 1.

For any stopping time 7, the optimal terminal decision d e d* rninimizing %a + cl@, and

therefore minimizing Qa + cl/3 + El ( T ) , is when r = n,

0 (to accept HO), if coLo(eî) 2 s L ~ ( e ; ) , 8(eC) =

This is because of the following inequali ty :

Therefore, to find the optimal sequential test of ( 4.2), we will find an optimal stopping time

T' minimizing

The optimal stopping time r', together with the optimal terminal decision rule d' ( 4.3), will

render an optimal sequential test S* = (T*, d.) of our objective ( 4.1).

In the next section, we derive the form of the optimal stopping time T*.

4.3 Derivation of the Optimal Stopping Rule

Define the stopping loss at stage n = 0,1,-• as

Denote A, the Iikelihood ratio at stage n, which is

Let z, and A, be the realizations of Zn and A,, respectively. By the definition of Zn and ( 4 4 ,

the optimal stopping time T' is the optimal stopping time with respect to { (Z,&),n =

0,1, - 1 , which satisfies

El (Zr.) = inf El (2,).

For K = 0,1, and n = 0,1, , K , by backward recursion, define

with

pF c m be interpreted as the minimum expected loss at stage n for a finite stopping time

r 5 K, which means that we must stop sarnpling and make a decision before or at stage h'.

By the definition, for any n 5 K, ,OF 2 O, and by backward induction, BE+' 5 ,OF. Therefore,

for n = O, 1, - -,

exists. Since in

Zn = n + min{~An,ci) ,

min{qAn, cl) is bounded between O and cl for al1 n = 0 , 1 , - - O , and {n) is an increasing

sequence, by Theorem 2 of Chow and Robbins (1963), the optimal stopping time r* with

respect to {(Zn, F,,)) exists and

For A 2 0, let

h(X) = min{~X, cl).

Then, for n = 0,1, -,

2, = n + h(Xn),

and

Hence, by the definit ion, for given n, K and the deteministic sequence { p i ) , ,8: is a function

of A, only, and we may denote it by @(A,).

For n = 0 ,1 , - -0 ,K + 1, define

By the definitions, for n = O, 1, # # * , K ,

where

fN(o,i)(-) is the standard normal density function. For n = 0,1, O , because lim~,, exists,

gn(An) = $ i w f f ( ~ n ) (4.8)

exists. Since

0 5 $(An) 5 h(Xn) I CI,

by the Lebesgue theorem of dominated convergence,

gn (An) = min{h(Xn), Gn+i (An) + 11,

where

By the definitions and ( 45 ) , the optimal stopping time is

The functions gn(X) and Gn(X) will provide us convenience for studying the structure of

t'. We first derive some important properties of them, which Xe the following proposition.

Proposition 1 For any n = 0, 1 , - -O,

( i ) both g,(A) and & ( A ) are concave in A 2 O;

(ii) g,(O) = O , gn(X) = cl for suficiently k r g e A, and gn(A) is increasing in A 2 0;

(ii;) Gn(0) = O , Gn(w) = cl, and &(A) is increasing in X 2 O.

Proof of Proposition 1:

(i) By the definitions, for any integer K 2 0, g g ( ~ ) = h(X) is obviously concave in A 2 O. For

any n = 0,1 , - , K , if c(A) is concave, then by the definition of concavity and ( 4 4 , it can

be verified that G ~ ( A ) is concave in h 2 O. Hence, by ( 4.6) and backward induction, $(A)

65

and GF(X) are concave in X 2 O. Then, for al1 n - 0,1, -, by the definition of concavity and

( 4.8), g,(X) is concave in X _> O, so is therefore G,(X) by ( 4.10).

(ii) gn(0) = h(0) = O is obvious.

If cl 1, then by ( 4.9), for A 2 ci/% and n = 0,1,--,

Suppose ci > 1. By ( 4.9), for al1 n = 0,1, and A > cl/@,

Let [cl] denote the integer part of cl. For any n = 0,1, -, consider m = n + [ci] + 1 and

6 E (O, 1).

9 m ( W 2 1,

for X 2 cl /co. Hence, by ( 4-10), there exists v,-1 2 cl/%, such that for X 2 v,-l,

and by ( 4.9),

€ gmmi(X) 2 min{cl,2 - -1. 2"-n

Once again, by ( 4.10), there exists v,-2 2 c1/cO, such that for X 2 ~ ~ - 2 ,

and by ( 4.9),

Thus, by backward recursion, there exists v,, 2 cl/co, such that for X > v,,

hl+l = min{cl, [cl] + 2 - C z)

That is, for A 2 v,, g,(X) = cl.

For any n = 0,1, : ., since gn(0) = O and gn(A) - cl for A 2 un, by its concavity, gn (A)

must be increasing in A > O. Therefore, (ii) is proved.

( i i i ) follows ( i i ) by ( 4.10). Proposition 1 is proved.

Now, we can establish the structure of the optimal stopping time T*.

Theorem 1 The optimal stopping time has the f o m

{A,, n = 0,1, 0 ) and {B,, n = 0,1, ':) are Iwo detenninistic sequences. I f cl > 1,

then for n = O , 1, -,

and A, = c l / ~ if and only if B,, = cl/co. If O < ci < 1 , then for any n = O , l , - g ,

An = Bn r c 1 / g *

Proof of Theorem 1:

From ( 4.11) and ( 4.9), for n = 0,1, O:'-, we need to compare h(A) with Gn+I(X) + 1 for X 2 0.

Suppose that cl > 1. By Proposition 1,

From the proof of Proposition 1 (ii), t here exists v, 2 cl/%, such t hat for X 2 un,

Gn+l(X) + 1 > h(A) = Cl.

Therefore, if there exists An E (O, cl /QI, such that

Gn+l(An) + 1 =-CO& = h(An), (4.12)

then, by the properties of G,+l(X) in Proposition 1 (i) and (iii), firstly A, > l / ~ , secondly

such an A, is unique, and thirdly there uniquely exists B, E cl/^, oo) such that

Gn+l(Bn) + 1 = c l = h(Bn)*

If A, = cl /Q, then B, = cl/%, and vice versa. Furthemore, for A E (A,, Bn),

and for XÉ(A,, B,),

When 44n = B n , (A, , B n ) = 0.

If t here does not exist such an A, E (0, cl /QI tbat ( 4.12) holds, that is, for dl X E (0, cl/co],

then by the increasing property of Gn+1 (A), for al1 A 2 0,

In this case, (cl/co, Q/Q) = 0, we let A, = Bn = cl/% for convenience.

Suppose O < cl 5 1 instead. By Proposition 1 (iii), for all A 2 O,

Hence, the optimal stopping tirne is T* O in this case. We then let A, = Bn = cl/% for

convenience as well.

Therefore, under ail situations, for n = 0,1, - - =,

and the optimal stopping time is

Theorem 1 is proved.

Illustrated in Figure 4.1 are the relationships among A,, Bn, the functions h(*) , gn(*) and

G,(-), and the decision parameters co and cl, as well as the properties of the functions shown

in Proposition 1.

kVe cal1 {A,, B,} the optimal stopping boundaries at stage n = 0,I , - - * , and (A, , Bn) the

sampling continuation interval of the optimal sequential test. If (A,, B,) = 0, sampling must

be stopped before or at stage n. In the case of cl 5 1, that is, the loss incurred by falsely

rejecting Hl is not greater than the cost of taking a sarnple, r* O, t hat is, it is optimal not

to observe at all. When sampling is stopped at stage n, the optimal terminal decision d e by

1 (to accept Hl), if A, 5 A,, d ' ( e ; ) =

( O (to accept Ho), if X, 2 Bn.

In case of A, = A, = Bn7 decisions to accept Hl or Ho c m be randomized in any way. For

simplicity, we let d*(er ) = 1 in this case. In summary, the optimal sequentid test is

I stop sampling and accept Hi at stage n, if A, 5 A,,

S' = continue sômpling at stage n, if An E (A,, &),

stop sampling and accept Ho at stage n, if A, 2 B,.

Figure 4.1: A typical illustration of g,(X) and the related functions and quantities.

4.4 Properties of the Optimal Stopping Boundaries

The optimal stopping boundaries {An, Bn, n = 0,1, --) are determined by 9, CI and (p i , i =

1,2, - -1. Generally, A,, and Bn Vary as n changes. In t his section, we study the properties

of the optimal stopping boundaries, especially, how {A,) and {B.) vary when {pt) exhibits

certain patterns.

If for to 2 1, { p t , t = to, to+l, - - * ) is a constant sequence, then {g,,(-), n = to-1, to, 9 ) is a

constant sequence of functions. Therefore, {A,, n = to - 1, toT - - e) and {B,,n = to-l,to,-0.)

are two constant sequences. When to = 1, this is the special situation of Wald's SPRT.

If {pt, t = 1,2, - --) is periodically varying in t, that is, there exists a positive integer I

such t hat

for any t = 1,2, -, then both {An, n = 0,1, - -) and {B,, n = 0,1, -) are periodically

varying in n with the same period 1, that is,

and

&+I = B n

for any n = 0,1, ---. This is because for any n 5 by the definitions,

and t hen

for any n = 0,1, ---.

The following proposition shows that when {p i ) converges to 0, the optimal sequential

test is truncated. That is, there exists M, such that A, = B, = ci /Q for a l l n 2 M.

Proposition 2 Assume

The optimal sequential test is truncated. That is, there exàsts an integer M 1 0, such that

Proof of Proposition 2:

I t will be sufficient to show that under the condition of the proposition, there exists an integer

hl, such that for any K 2 n 3 M + 1 and X 2 0,

and

IG:(A) - h ( ~ ) l c 1.

For any e E (0,1), thereexists 6, > O , such that

For any positive integer K , by the definition, for X 2 0,

Since lim+, pi = O, there exists m,, such that for any K 2 m,, 1x1 < 4, aod X E (0, a), eo

is bounded. Therefore, for A' 2 m,, 1x1 < cf,, and A E (O, %),

€ Ih(Aexp{-pK(x + y))) - h(A)l < co- = e.

Co

For K 2 me and 1x1 < de,

and

pK € exp{-pK(x + -)) 3 1 - -. 2 C l

and t herefore,

The non-equality of ( 4.17) is because of that h ( - ) is an increasing function and h(X) = cl for

X 3 cl/%. Since h(0) = O, from ( 4.16) and ( 4.17), for any K _> m,, 1x1 < 6,, and A 2 0,

Thus, for any K 2 m, and A 2 0,

Hence, by ( 4.6),

If A? - 1 2 m,, then by a similar development,

Then by backward induction, for any K 2 n 2 M + 1 and X 2 0, ( 4.13) and ( 4.14) hold,

and by ( 4.6),

g:;<w = w* Therefore, by ( 4.8), for any n 3 M and X 2 0,

and by ( 4-11),

This means that the optimal sequential test is truncated. That is, sampling must be stopped

no later than M. Proposition 2 is proved.

We cal1 such an M in Proposition 2 a tnincation point. ( 4.15) indicates a way for h d i n g

a tnincation point when Lmi+, = 0. ( 4.15) also indicates that limi,, pi = O is oot a

necessuy condition for the optimal sequential test to be tmncated. In fact, if Ki+,lpil

is sufficiently small but not zero, then the optimal sequential test can be still truncated.

Proposition 2 substantiaily strengthens the result of Liu and Blostein (1992), where it was

shown that the optimal stopping boundaries were truncated when pt m O for t beyond a

certain stage M.

4.5 Computation of the Optimal Stopping Boundaries

Generally, to compute An and B,, we need to compute gn(*) and Gn+i(*). To compute g&)

and G,+l ( O ) , we can first compute ~ ( 0 ) and G,K++, (-) for a sufficjently large K > n. By the

convergence properties, gn ( 9 ) and G,+l ( 0 ) can be well approximated by g;L(-) and Gs, (O).

Once g,(-) has been computed, g,-1 (O), gn-P(=), * * * can be computed backward recursively. In

t his way, -4. and B, can be found for any n = 0,1, *fip.

Major effort is required in the computation of Gn(*) functions, that is, the integration

in ( 4.7). The integration can be done by using Gaussian quadrature eventually. However,

caution must be taken to avoid computational singularity in the integal kernels, as well a s in

the integral intervals. By using variable transformation, the kernel singularity can be removed,

and an expression of CF(=) suitable for computation can be obtained as follows.

where log(-) is the natural logarithm function. Variable transformation was first made to

utilize the properties of gf(-) in ( 4.18). Then, the variable was transforrned back to avoid

the nearly kernel singularities in ( 4.19).

An efficient cornputer algorithm for computation of the optimal stopping boundaries by

using t h e procedure described in the beginning of this section and the expression of G ~ ( A )

of ( 4.20) has been developed. In the computer aigorithm, we have used some programming

techniques to avoid forward recursive calling of G:(*) in ( 4.20). which can cause exponentially

increasing memory consumption and slow down computation. The computer algorithm has

been successfu~ly implemented in Turbo C++. In the following section, we will present an

example of the computed optimal stopping boundaries.

4.6 Numerical Result s

Consider

pt = d(t - 1) = 0.9~-', for t 2 1.

Using decision parameters Q = 10 and cl = 10, we have computed the optimal stopping

boundaries that are presented in Table 4.1 and are plotted in Figure 4.2.

Table 4.1: The optimal stopping boundaries when 6( i ) = 0.9' for i 2 0.

Stop & Accept Ho

Continuation Region . . . . . . . - . . - . . . . . . . . . . . . . . . . . . . . . . . . . . . - C

pping Boundary A,,

Stop & Accept H,

Figure 4.2: The optimal stopping boundaries of the example.

A distinct feature of the optimal stopping boundaries of the example is shown in Figure 4.2.

When = 6(t - 1) decreases to O for t 2 1, the continuation region, where the sampling

of the sequential test is continued, is bounded. That is, the optimal stopping boundaries are

truncated, and the test must be stopped before or at the truncation point, which is 12 in this

example.

Chapter 5

Likelihood Ratio Testing Procedures

5.1 Introduction

For change detection in autocorrelated processes, there have been two sets of tools. The first

set is the control chart procedures. We have studied the performance of the CUSUM, EWMA

and Shewhart control chart procedures for change detection in autocorrelated processes. Now,

we examine their alternatives, the second set of tools of LR testing procedures.

A classical LR testing procedure signals occurrence of a change when the maximum of

the likelihood ratio over unknown parameten, such as the change point, exceeds a constant

threshold value. Basseville and Nikiforov (1993) described a unified likelihood ratio framework

for design and performance analysis of change detection procedures, and studied the LR testing

procedure in details. Because of the fine intuition behind them and their ability to deal

with cornplex situations, LR testing procedures have been widely used for change detection

in autocorrelated processes. When the magnitude of change is unknown, the LR testing

procedure is a generalized likelihood ratio testing procedure proposed by Willsky and Jones

(1976) and studied recently by Lai (1995) and Siegmund and Venkatraman (1995). When the

change point is the only unknown parameter, the LR testing procedure is a likelihood ratio

CUSUM procedure by using residuals, and it is optimal under a minimax criterion in the i.i.d.

case. However, the optimality rnay not hold in general, and the performance of the LR testing

procedure in comparison with the control chart procedures is unknown.

In this chapter, we will study the structure of the LR testing procedure, extend the LR

testing procedure, propose a combined CUSUM and Shewhart control chart procedure, derive

integral equations for computation of their ARLs in some cases, and study their performance.

5.2 The Structure of LR Testing Procedures

Consider the change detection problem on the residual process {et , t = 1,2, - * 0 ) . Assume

et has a continuous distribution, and let f,(*) be the probability density function of et if

E ( e t ) = p. Because of the mutual independence, the joint density function of e: is

under Ho, and

under Hi. The likelihood ratio is

and the LR testing procedure is a stopping time

t

NLR = inf{t 2 1 : max C log f ~ ( i - r ) (ei) 1 9 % i=r fo (ei) 1 h l ,

where h is a constant threshold value.

As we have discussed in the last part of 5 2.3, the joint densities of ei ( 5.1) under Ho and

( 5.2) under Hi are equal to the joint densities of qt under Ho and Hl, respectively. Therefore,

the likelihood ratio ( 5.3) based on the residuals e: is equal to the likelihood ratio based on

the observations y:, and the LR testing procedure ( 5.4) based on the residuals has the same

detection properties as that based on y:. However, because of the mutual independence, the

LR testing procedure based on residuals has a simpler expression than its counterpart based

on observations for an autocorrelated process.

The stopping time NLR can be rewritten in the fouowing form.

w here

NiR = inf{n 2 1 : xi=+=-' log f ~ ( i - r ) (ei 10 (ei) 2 h)

= infin 2 1 : log f ' ( i ) ( C r + i )

/ ~ f e r + i ) 2 h ) ,

for r = 1,2, - - O . NiR is the stopping rule of a one-side SPRT on the mean level hypotheses

presented in 5 4.1, and NLR, i = 2,3, , is the same stopping nile appbed to (e,, e,+~, -1 .

( 5.5) suggests that in the LR testing procedure, we actually activate a common stopping rule

at each possible change point r = 1,2, -, and whenever one of the activated tests signals that

E(e i ) has started to change, the whole LR testing procedure NtR is stopped, and Hl : r 5 t

in the change point hypotheses is accepted.

Lorden (1971) pointed out the relationship ( 5.5) between the likelihood ratio CUSUM

procedure and the one-side SPRT. By using it, he obtained the asymptotic optimality of the

likelihood ratio CUSUM procedure in the i.i.d. case. The relationship ( 5.5) has also been

used for design and analysis of change detection procedures by Basseville and Nikiforov (1993)

and Lai (1995).

5.3 Extended LR Testing Procedures

In the i.i.d. case, that is, when 6( i ) is a constant for i 2 O, SPRT is optimal for testing the

change profile hypotheses, and the likelihood ratio CUSUM procedure ( 5.4) is optimal for

change detection - testing the change point hypotheses. In both SPRT and the c1assica.l LR

testing procedure ( 5.4), their threshold values are a constant. When 6(i ) is generally tirne-

varying for i 2 0, we have shown in Chapter 4 that the optimal sequentid test on the mean

level hypotheses is a likelihood ratio test with time-varying thresholds. These let us propose

the following extended LR testing procedure by replacing the constant threshold in ( 5.4) with

time-vârying thresholds.

t

NELR = infit 2 1 : max [x log f ~ ( i - r ) (ei) i<r<t i=r

f d e i ) - ht -r ] 2 O),

where {hi, i = 0,1, - - -} is a time-varying sequence.

The classical LR testing procedure NLR can be rewritten ôs

t

NLR = inf { t 2 1 : max [ç log f~(i-r) (ei) 1 9 9 i=r

-hl 2 O}. f o ( 4

For a &), let { N E L ~ ) denote the class of al1 extended LR testing procedures. Clearly,

Therefore, the best change detection procedure from the class {NELR) may improve the

performance of the classicd LR testing procedure NLR.

To find the optimal threshold sequence (hi, i = 0,1, - - 9 ) for the extended LR testing

procedure ( 5.6) needs some effort. However, we can let

where {Ai, i = 1,2, - - *) is the lower threshold sequence of the optimal sequential likelihood

ratio test T* on the mean level hypotheses studied in Chapter 4, which takes at least one

observation.

By doing the value assignment ( 5.7), we actuaily make

where NkLR is the one-side optimal sequential test r' but applied to e,, e,+l, - 0 0 , and fiEtR

is the extended LR testing procedure with the threshold sequence {- log Ai+i, i = 0,1, - *),

t

&LR = inf ( t 2 1 : max [C log fqi-r) (ei) + 1% &,+il > O).

l ~ r ~ t i=, fo(ei)

We expect that the extended LR testing procedure NELR of ( 5.8) performs better than the

classical LR testing procedure NLR of ( 5.5).

In Chapter 4, we have studied the structure of ru and obtained some properties of {Ai, i =

0,1, - -). When 6(i) converges to zero for i = 0,1, - -*, for example, when {et) are the residuals

of an IMA(1,l) process subject to an observational additive step shift, the optimal sequential

test r' on the mean level hypotheses is truncated and takes at most M observations. Therefore,

the extended LR testing procedure NELR is

which turns out to be a window-limited procedure.

When {et} are the residuals of an AR(p) process subject to an observational additive step

shift, 6(i) varies at the first p time points i = O, 1, --- , p- 1, and is a constant for i = p, p+l, -- -. In this case, we have shown thôt Ai is a constant for i = p,p + 1, - - O - Therefore, in NI, the

threshold hi is a constant for i = p - 1, p, p + 1, --O, and varies only at the first p - 1 time

points i = 0,1,--,p -2.

85

5.4 A Combined CUSUM and Shewhart Control Chart

Procedure

To approximate the best change detection procedure from the class { N E ~ R ) , we propose the

following combined CUSUM and Shewhart (CCS) control chart procedure.

where

is the usual higher-side CUSUM statistic, k > O is the CUSUM reference value, h > O is the

CUSUM threshold value, C > O is the Shewhart control limit. Note that we assume

Otherwise, the CUSUM part will be dominated by the Shewhart part.

CCS procedures first appeared in Westgard, Groth, Aronsson and de Verdier (1977), and

were studied by Lucas (1 982) for detection of step shifts of unknown size in the i.i.d. case. She-

whart chart was capable of detecting changes of large size, CUSUM was superior in detecting

changes of small size, and their combination combined the strength of both.

Our intention here is to approximate the time-varying t hreshold sequence of the extended

LR testing procedure by adding the Shewhart control limit to the threshold of CUSUM.

In cornparison with the LR or extended LR testing procedures, the structure of the CCS

procedure is relatively simple. The CCS procedure is non-parametric, that is, its composition

(not performance) is independent of process mode1 parameters- For a general if(-), we can

establish integral equations to solve the ARLs of the CCS procedure.

86

5.5 Performance Analysis

In this section, under the ARL criterion, we will investigate the performance of three change

detection procedures: the LR testing procedure applied to the residuals of an AR(1) process

for detection of an observational additive step shift, the extended LR testing procedure applied

to the residuals of an IMA(1 , I ) process for detection of an observational additive step shift,

and the CCS procedure. For these procedures, because of the Markov chain nature of their

testing statistics, we can manage to establish integral equations for their ARLs, and solve

them accurately and precisely. The involved Markov chains will be non-homogeneous.

Singulority, primarily the discontinuity of the integral kernels, will be present in the inte-

gral equations of the three procedures. This imposes much difficulty in solving the integral

equations, care and effort are needed to handle the singularity in order to design valid and

efficient computer algori t hms. Gaussian quadrature for solving standard Fredholm equations

of the second kind cannot be applied due to the singularity. A technique we will use to

overcome the singularity is to construct quadrature on a unifom grid by using cubic spline

approximation of the unknown function. Once the singuiarity is handled, similar recursive

computational procedures can be developed to solve the ARL values as we did for CUSUM

and EWMA in Chapter 3.

In the following subsections, we will describe the three change detection procedures, derive

the integral equations, and present numerical results. Al1 the computation has been success-

fully implemented in Turbo Cf+. As in 5 3.2, for performance andysis of a change detection

procedure N, we will be only interested in E,,(N) and E r Z I ( N ) , and we can assume that

We will assume that

0 et has a normal distribution with standard deviation u = 1.

5.5.1 The LR Testing Procedure Applied to AR(1) Processes

For an AR(1) process with the autoregressive parameter al subject to an observational additive

step shift, the mean of residud after change is

6(i) = { i = O, (5.10)

( 1 - a ) i > 1 ,

and the threshold in ( 5.7) is a constant for all i 2 O. Hence, the extended LR testing procedure

NELR in ( 5 .8 ) reduces to the usual LR testing procedure with a constant threshold.

Let A. > O be the size of an observational additive step shift we intend to detect, po = A.

and pl = (1 - a l ) A o . For t 2 1, let

is a non-homogeneous Markov chain. Let h > O be a constant threshold value. Let So = 0,

NLR = NELR = inf(t 2 1 : St 2 h ) .

For i 2 0, define AR@, i) the ARL of the detection procedure hR applied to {eN17 e;+l, --)

given that Si = S. Obviously, s < h. Then, for So = 0,

Er,i (NLR) = ARL(0, O ) .

For s < h, we c m derive an integral equation suitable for c

ment ation as follows.

ARL(s, i)

ation and computer imple-

w here f N ( p , 0 2 ) ( - ) is the N (pl 02) density function, and is the indicator function. Clearly, the

integral kernel is discontinuous, and therefore, singular. Besides, there is another singularity,

the infinite integral range. To solve this integral equation, special techniques and major effort

more than those for solving the integral equations in 5 3.2 are required.

procedures applied to the residuals of an AR(1) process with autoregressive coefficient al =

F

0.45 subject to an observational additive step shift of size A when their in-control ARLs are

100. ARLs are optimized a t A = 1 in the upper panel, and at A = 3 in the lower panel.

Table 5.1: ARLs of the LR testing procedure and the CUSUM and EWMA control chart

For the fixed in-control ARL of 100 and al = 0.45, Table 5.1 presents the ARLs, Er=i(N),

of the LR procedure in cornparison to those of CUSUM and EWMA, and Figure 5.1 shows

the ARL curves.

Figure 5.1 shows that when the mean of residuai after change is ( 5.10), the LR testing

procedure performs roughly the sarne as the CUSUM control chart procedure, and both of

them perform better than the EWMA control chart procedure.

-I

A O

100

100

100

100

100

100

LR

CUSUM

EWMA

LR

CUSUM

EWMA

(h=2.72)

k = 0.28

X=O.OS

(h=2.92)

k = 1.0

À=0.63

0.5

29.04

28.47

35.04

40.89

40.93

62.80

1

12.66

12.62

13.68

17.45

18.12

26.53

1.5

7.07

7.31

7.47

7.98

8.67

11.03

2

4.46

4.90

4.77

4.05

4.50

4.83

2.5

3.01

3.59

3.34

2.35

2.57

2.45

3

2.13

2.78

2.55

1.58

1.68

1.59

3.5

1.59

2.24

2.15

1.23

1.27

1.26

4

1.28

1.84

1.94

1.08

1.09

1.11

4.5

1.12

1.53

1.79

1.02

1.03

1.04 -

-- . - . - LR: (h=2.9234) 60- a CUSUM: k=1 .O (h=1.532)

- V

A - - - - EWMA: k0.63 (C=2.W)

2 40- -

20 - -

O I I 1 - 1 D I b

1 I 1 I I

- . - . - . - LR: (h=2.7217)

CUSUM: b0.28 (h=4.156) - - - - - EWMA: M . 0 8 (C=2.067)

-

-

- O I 1 1 1 1 - . - - , . .

O 0.5 1 1.5 2 2.5 3 3.5 4 4.5 A: Size of change

I

Figure 5.1: ARLs of the LR testing procedure and the CUSUM and EWMA control chart

O 0.5 1 1.5 2 2.5 3 3.5 4 4.5 A: Size of change

procedures applied to the residuals of an AR(1) process with autoregressive coefficient al =

0.45 subject to an observational additive step shift of size A. For the fixed in-control ARL,

the chosen chart parameters optimize the ARL at A = 1 in the upper panel and at A = 3 in

the lower panel.

5.5.2 The Extended LR Procedure Applied to IMA(1,l) Processes

For an I MA(1,l) process subject to an observahional additive step shift, the rnean of residual

after change is decaying exponentially fast:

for i 2 O. We wiil study the situations where IPII is small so that the optimal sequential test

r' on the mean level hypotheses is truncated at two observations. Thmefore, the window size

of N E L R in ( 5.9) is M = 2 and integral equations for computation of ARLs c m be established

and solved.

Let A. > O be the size of an observational additive step shift we intend to detect, po = A.

and pl = Piho. Let ho and hl > O be two threshold values.

is a non-homogeneous Markov chain.

For i > O, define ARL(e, i) the ARL of the extended LR procedure fiELR applied to

{ei+i, ei+2, - - a ) given t hat e; = e. For eo = 0,

We can derive the following integral equation.

Po PO Cl1 = lP(~o(ei+i - j-)) 1 hot or po(e - -) + pi(eà+i - 5 2 hi) 2

for e c 2 + 7. The same kinds of singularity as in ( 5.11) are present in ( 5.13).

For the fixed in-control ARL of 100 and bl = 0.1, Table 5.2 presents the ARLs of the

extended LR detection procedure in cornparison with the classical LR detection procedure

and the CUSUM and EWMA control chart procedures and Figure 5.2 shows the ARL curves.

Figure 5.2 shows that when the mean of residual &ter change is ( 5.12), the extended LR

procedure improves the performance of both CUSUM and the classical LR detection procedure

slightly. For detection of this rapidly decreasing change, the EWMA control chart procedure

offers t h e best performance.

ho = 2.4 XLR

(hl = 1.77)

EWMA 1 X = 0.55

5 ho = 4.2

XLR

CUSUM 1 k = 2.0

Table 5.2: ARLs of the extended LR (XLR) testing procedure in cornparison with the classicd

LR testing procedure and the CUSUM and EWMA control chart procedures applied to the

residuals of an I M A ( 1 , l ) process with moving average coefficient bl = 0.1 subject to an

observational additive step shift of size A when their in-control ARLs are 100. ARLs are

optimized at A = 1 in the upper panel, and at A = 3 in the lower panel.

. . . . . . . . . . XLR: b2.4 (h=l.7742)

80 - .-.-.-. LR: (h=1.9294) - CUSUM: k=2.0 (h4.327) MMA: M.55 (C=2.544) -

-

20 - \ - \

O 0.5 1 1.5 2 2.5 3 3.5 4 A: Size of change

f 00 - ....... . . . XLR: hw.2 (h=2.2928)

80 - . - . - . - . LR: (h=2.7634)

CUSUM: k=2.0 (h=0.329) A 60- a EWMA: M.495 (C=2.533)

- Y

2 40- -

20 -

O - - - - O 0.5 1 1.5 2 2.5 3 3.5 4

A: Size of change

Figure 5.2: ARLs of the extended LR (XLR) testing procedure in cornparison with the classical

LR testing procedure and the CUSUM and EWMA control chart procedures applied to the

residuals of an I M A(1 , l ) process with moving average coefficient bl = 0.1 subject to an

observational additive step shift of size A. For the fixed in-control ARL, the chosen chart

parameters optimize the ARL at A = 1 in the upper panel and at A = 3 in the lower panel.

5.5.3 The CCS Procedure

For a general &),

is a non-homogeneous Maskov c h a h

For i > O, define ARL(s, i) the ARL of the CCS procedure applied to {ei+l, e;+2, -) given

that S H ( i ) = S. For SH(0) = O,

For O < s < h, we can derive the following integral equations for computation.

= lP(lei+l( 3 Cor s+(e i+ l - k ) 2 h )

+ARL(O, i + 1)[9(-b(i) + min{C, k - s)) - a(-&) - C)]

+ARL(O, i + l)[@(-d(i) + k - s) - 9(-b( i ) - C)]

Therefore,

and

for the purpose of initialization. Singularity of discontinuous integral kernels also appears in

the above integrd equations.

CCS

CUSUM

Table 5.3: ARLs of a CCS procedure in cornparison with the LR testing procedure and the

CUSUM control chart procedure applied to the residuds of an AR(I) process with autore-

gressive coefficient a, = 0.15 subject to an observational additive step shift of size A when

their in-control ARLs are 100.

A recursive computational procedure sirnilar to that for CUSUM chart in 5 3.2.2 has been

developed. As an example, we compute the ARL values of the CCS procedure and compare its

performance with the CUSUM and the LR testing procedme when the mean of residual after .

change is ( 5.10). For the fixed in-control ARL of 100 and al = 0.45, Table 5.3 presents the

computed ARL values, E,,&V), and Figure 5.3 shows the ARL curves. The CCS procedure

performs roughly the same as CUSUM and the LR procedure, and it performs a little better

97

than CUSUM for detection of larger size of changes.

. . . . . . . . . . CCS: Cd.8, b0.28 ( h d . 170)

CUSUM: k=0.28 (h4.156)

. - . - . W . LR: (hs2.7217)

- d

- - - . - . - b - . - . - . - . - . d C .

O t I I 1 I I

O 0.5 1 4 f .5 2 2-5 3 3.5 A: size of change

Figure 5.3: ARLs of a CCS detection procedure in cornparison with the classical LR testing

procedure and the CUSUM and EWMA control chart procedures applied to the residuals of

an AR(1) process wi th autoregressive coefficient ai = 0.45 subject to an observational additive

step shift of size 4.

5.6 Summary of the Performance Analyses

Combining the performance analyses in Chapter 3 and this chapter, we have the following re-

sults on the performance of the control chart procedures, the classical and extended LR testing

procedures, and the CCS procedure for detection of an additive change in ôn autocorrelated

process .

1. The classical and extended LR testing procedures and the CCS procedure roughly per-

form as well as CUSUM. The LR testing procedure shows slightly better performance

than CUSUM for detection of a large size of observational additive step shift in an

AR(1) process. The extended LR testing procedure can improve the performance of the

classical one.

2. When the mean of residual after change decreases to O rapidly, EWMA roughly performs

the best among al1 the considered change detection procedures.

3. Otherwise, CUSUM roughly performs the best.

4. Generally, Shewhart performs well for detection of very large size of changes.

Chapter 6

Conclusion and Future Research

6.1 Contribut ions

This thesis has studied several issues on change detection in autocorrelated processes. The

contributions and major findings of this thesis are as follows.

We have systematically investigated the properties of process residuals for change detec-

tion. We have shown that process residuals are sufficient, and equivalent to process observa-

tions, for the problem of change detection. Therefore, change detection c m be done by using

process residuals. We have shown that process residuals are mutually uncorrelated with zero

means when the process is in-control. We have developed a general procedure for specifically

deriving the forms of process residuals. Using the procedure, we have derived the forms of

residuals of general A RIMA processes and state space models, and studied the properties

of the residuals under situations of additive changes, nonadditive changes, steady state, and

nonsteady state. Under the Gaussian assumption, for an ARMA process or a steady-state

state space model subject to an additive change, the residuals are i.i.d. with zero means before

the occurrence of change, and are still mutually independent with the same variance as before

but with time-varying means after the occurrence of change. For an ARIMA process or a

steady-state state model subject to an additive change, we have found that the properties of

residuals are independent of the feedback control applied to the process. The independence

of feedback control of process residuals makes the combination of process control and process

monitoring (change detection) easier, because change detection procedures based on residuals

can be designed independently of the feedback control policies.

Because of the properties of proce3s residuals, detection of an additive change in an ARIMA

process or a steady-state state space model is transformed to detection of an additive change in

the residual process with detection properties preserved. We have studied the performance of

the CUSUM, EWMA and Shewhart control chart procedures applied to the residuals under the

ARL criterion. For computation of the ARLs of the control chart procedures, we have derived

an explicit formula for Shewhart, established integral equations for CUSUM and EWMA, and

developed numerical procedures for solving the integral equations. We have investigated three

typical situations to study the performance of the control chart procedures applied to the

residuals.

We have also studied the LR testing procedure applied to residuals for detection of an

additive change in an autocorrelated process. We have extended the classical LR testing

procedure by replacing its constant threshold with a time-varying threshold sequence. We have

proposed a combined CUSUM and Shewhart procedure for approximation of the extended LR

testing procedure. We have developed numerical procedures based on integral equations for

computation of the ARLs of these detection procedures, and nurnerically investigated their

performance in sorne cases.

We have found that the LR testing procedures perform roughly as well as the CUSUM

control chart procedure. The extended LR testing procedure can improve the performance of

the classical one. When the mean of residual after change decreases rapidly to zero, the EWMA

control chart procedure roughly performs the best among a.U considered change detection

procedures. Otherwise, CUSUM roughly performs the best. Shewhart performs well only for

detection of very large size of changes.

We have studied the problem of optimal sequential testing on process mean levels. The.

problem is a special situation of change detection, is a generalization of Wald's SPRT problem,

and has wide applications in signal processing and other areas. We have formulated the

problem as an optimal stopping problem, derîved the optimal stopping d e , obtained some

important properties of the optimal stopping boundaries, which are two sequence of time-

varying threshold values, developed an algorithm for computation of the optima stopping

boundaries, and presented a numerical example. When the rnean of residual after change

decreases to zero (or a sufficiently small value), the optimal stopping boundaries are truncated.

That is, to be cost-efficient, the sampling and the test must be stopped before or at the

truncation point. When the mean of residual after change is stabilized at a constant level, the

optimal stopping boundaries are also stabilized. When the mean of residual after change is

periodically varying, the optimal stopping boundaries are also periodically varying with the

same period.

Future Research

Future research in the area of change detection in autocorrelated processes c m be done in the

following direct ions.

a Detect ion of nonaddit ive changes.

Change detection in multivariate processes.

a Combination of detection and estimation of the change point in the change detection

problem.

Bibliography

[l] Allen, R., and Huang, K. (1994), "Self-tuning Control of Cutting Force for Rough Tuming

Operat ionsn and the references, Proceedings of the Institution of Mechanical Engineers,

Part B: Journal of Engineering Manufacture, Vo1.208, pp.157-165.

[2] Alwan, L. C. (1992), "Effects of Autocorrelation on Control Chart Performance", Corn-

munications in Statistics - Theory and Methods, 21(4), pp.1025-1049.

[3] Alwan, L. C., and Roberts, H. V. (1988), "TirneSeries Modeling for Statistical Process

Control", Journal of Business and Economic Statistics, Vo1.6, No. 1, pp.87-95.

[4] Alwan, L. C., and Roberts, H. V. (1995), "The Problem of Misplaced Control Limitsn,

Applied Statistics, 44, No.3, pp.269-278.

[5] Astrom, K. J ., and Wittenmark, B. (1 997), Cornputer-Controlled Systems: Theory and

Design, 3rd Ed., Upper Saddle River, New Jersey: Prentice Hall.

[6] Bagshaw, M., and Johnson, R. A. (1975), "The Effect of Serial Correlation on the Per-

formance of CUSUM Tests IIn, Technometrics, Vol. 17, No.1, pp.73-80.

[7] Basseville, M. (1988), &Detecting Changes in Signals and Systems - A Survey", Auto-

matica, Vo1.24, No.3, pp.309-326.

[B] Bassevitle, M., and Nikiforov, 1. V. (1993), Detection of Abrupt Changes: Theory and

Application, Englewood ClifXs, New Jersey: Prentice Hd.

(91 Beibel, M. (1996), "A Note on Ritov's Bayes Approach to the Minimax Property of the

CUSUM Procedure", The Annals of Statistics, Vo1.24, No.4, pp.1804-1812.

(101 Bert houex, P. M., Hunter, W. G., and Pallesen, L. (l978), UMoni toring Sewage Treatment

Plants: Some Quality Control Aspects", Journal of Quality Technology, Vol.iO, No.4,

pp. 139-149.

[ll] Box, G . E. P., Jenkins, G. M., and Reinsel, G. C. (1994), Tirne Sen'es Anolysis: Fore-

casting and Contml, 3rd Ed., Englewood Cliffs, New Jersey: Prentice Hall.

[12] Box, G., and Kramer, T. (1992), uStatistical Process Monitoring and Feedback Adjust-

ment - A Discussionn, Technornetrics, Vo1.34, No.3, pp.251-267.

[13] Chow, Y. S., and Robbins, H. (1963), "On Optimal Stopping Rules", 2. Wahrschein-

fichkeitstheorie, Vo1.2, No.1, pp.33-49.

[14] Cochlar, J., and Vrana, 1. (1978), "On the Optimal Sequential Test of Two Hypotheses

for Statistically Dependent Observationn, Kybernetika, Vol. 14, No. 1, pp.57-69.

[15] Crowder, S. V. (l992), UAn SPC Mode1 for Short Production Runs: Minimizing Expected

Cost", Technornetrics, Vo1.34, No.1, pp.64-73.

[16] Harris, T. J., and Ross, W. H. (1991), 'Statistical Process Control Procedures for Cor-

related Observationsn, The Canadian Journal of Chernical Engineering, Vo1.69, Feb.,

pp.4&57.

[17] Jensen, K. L., and Vardeman, S. B. (1993), "Optimal Adjustment in the Presence of De-

terministic Process Drift and Randorn Adjustment Error", Technometrics, Vo1.35, No.4,

pp.376-389.

[18] Johnson, R. A., and Bagshaw, M. (1974), "The Effect of Serial Correlation on the Per-

formance of CUSUM Tests", Technornetrics, Vo1.16, pp.103-112.

[19] Kailath, T., and Poor, H. V. (1998), "Detection of Stochastic Processes", IEEE Trans-

action on Information Theory, h1.44, No.6, pp.2230-2259.

[20] Kligene, N., and Telksnis, L. (1983), "Methods of Detecting Instants of Change of Ran-

dom Process Properties", Automation and Remote Control, Vo1.44, pp.1241-1283.

r2 11 Lai, T. L. (1995), "Sequential Changepoint Detection in Quality Control and Dynamical

Systemsn with Discussions, Journal of the Royal Statistical Society, Series B, 57, No.4,

pp.613-658.

[2%] Lehmann, E. L. (1959), Testing Statistical Hypotheses, John Wiley and Sons.

[23] Liptser, R. S., and Shiryaev, A. N. (1978), Statistics of Randorn Processes II: Applications,

New York, Heidelberg and Berlin: Springer-Verlag.

[24] Liu, Y ., and Blostein, S. D. (1992), "Optimality of Sequential Probability Ratio Test for

Nonstationary Observations", IEEE Transactions on In/ormation Theory, Vo1.38, No.1,

[25] Lorden, G. (1971), "Procedures for Reacting to a Change in Distribution", The Annals

of Mathematical Statistics, Vo1.42, No.6, pp.1897-1908.

[26] Lucas, J. M. (1982), "Combined Shewhart-CUSUM Quality Control Schemes" , Journal

of Quality Technology, Vol. 14, No.2, pp.51-59.

[27] Lucas, J. M., and Crosier, R. B. (1982), "Fast Initiai Response for CUSUM Quality Con-

trol Schemes: Cive Your CUSUM a Head Start", Technomettics, Vo1.24, No.3, pp.190-

205.

[28] MacGregor, J. F. (1988), "On-Line Statistical Process Control", Chernical Engineering

Progress, Oct., pp.21-31.

[29] Montgomery, D. C., Keats, J. B., Runger, G. C., and Messina, W., S. (1994), "Inte-

grating Statistical Process Control and Engineering Process Control" , Journal of Quality

Technology, Vo1.26, No.2, pp.79-87.

[a01 Moustakides, G. V. (1986), "Optimal Stopping Times for Detecting Changes in Distri-

butions", The Annals of Statistics, Vo1.14, No.4, pp. 1379-1387.

[31] Moustakides, G. V. (1998), "Quickest Detection of Abrupt Changes for a Class of Random

Processes", IEEE T~*unsacf ions on In formu tion Theory, Vo1.44, No.5, pp. 1965-1968.

[32] Page, E. S. (WM), "Continuous Inspection Scheme" , Biometrika, 41, pp.100-115.

[33] Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P. (1992), Numen'cul

Recipes in C: The Ar t of Scientific Computing, 2nd edition, Cambridge University Press.

[34] Ritov, Y. (1990), "Decision Theoretic Optimality of the CUSUM Proceduren, The Annals

of Statistics, Vo1.18, N0.3, pp.14641469.

[35] Shiryaev, A. N. (1961), "The Problem of the Most Rapid Detection of a Disturbance in

a Stationary Process", Soviet Mathematics, Vo1.2, No.3, pp.795-799.

107

[36] Shiryaev, A. N. (1963), "On Optimal Methods in Quickest Detection Problems", Theory

of Probability and its Applications, Vo1.8, No. 1, pp.22-46.

[37] Shiryaev, A. N. (1978), Optimal Stopping Rules, New York, Heidelberg and Berlin:

Springer-Verlag.

[38] Siegmund, D., and Venkatraman, E. S. (1995), "Using t h e Generalized Likelihood Ratio

Statistic for Sequential Detection of a Change-Pointn, The Annals of Statistics, Vo1.23,

No. 1, pp.255271.

[39] Tucker, W. T., Faltin, F. W., and Vander Wiel, S. A. (1993), "Algorithmic Statistical

Process Control: An Elaboration", Technometn'cs, Vo1.35, No.4, pp.363-375.

[40] Vander Wiel, S. A. (1996 a), "Monitoring Processes That Wander Using Integrated Mov-

ing Average Modelsn, Technornehics, Vo1.38, No.2, pp.139-151.

[41] Vander Wiel, S. A. (1996 b), "Optimal Discrete Adjustments for Short Production Rusn,

Probability in the Engineering and Infornational Sciences, 10, pp.119-151.

[42] Vasilopoulos, A. V., and Stamboulis, A. P. (1978), "Modification of Control Chart Limits

in the Presence of Data Correlation", Journal of Quality Technology, Vol.10, No.1, pp.20-

30.

[43] Wald, A. (1945), 'Sequential Tests of S tatistical Hypotheses", Annais of Mathematical

Statistics, Vo1.16, pp.117-186.

[44] Wald, A. (1947), Sequential Analysis, New York: John Willey and Sons.

[45] Wald, A., and Wolfowitz, J. (1948), "Optimal Character of the Sequential Probability

Ratio Testn, Annals of Mathematical Statistics, V01.19, pp.326-339.

108

[46] Wardell, D. G., Moskowitz, H., and Plante, R. D. (1992), "Control Charts in Presenceof

Data Correlation", Management Science, Vo1.38, No.8, Aug., pp. 1084-1 105.

[47] Wardell, D. G., Moskowitz, H., and Plante, R. D. (1994), "Run-Length Distribution of

Special-Cause Control Charts for Correlated Processes", Technometrics, Vo1.36, No.1,

pp.3-17.

[48] Westgard, G. O., Groth, T., Aronsson, T., and de Verdier C.-H. (1977), "Combined

S hewhart-CUSUM Control Chart for Improved Quality Control in Clinical Chemistry" ,

Clinical Chemistry, Vo1.23, No.10, pp.1881-1887.

[49] Willsky, A. S., and Jones, H. L. (1976), "A Generalized Likelihood Ratio Approach to the

Detection and Estimation of Jumps in Linear Systemsn, IEEE Transaction on Automatic

Control, Vo1.21, No.1, pp.108-112.

[50] Yakir, B. (1997), "A Note on Optimal Detection of a Change in Distributionsn, The

Annals of StatWics, Vo1.25, No.5, pp.2117-2126.

CHANGE DETECTION IN AUTOCORRELATED · residuals of the machining process with w = 0.553 subject to...

Documents

Transcript of CHANGE DETECTION IN AUTOCORRELATED · residuals of the machining process with w = 0.553 subject to...